Enhancing few-shot image classification with cosine transformer

File
hieu3.pdf (4.823Mb)
Date
2023-07-21
Author
Nguyen, Quang Huy
Nguyen, Q. Cuong
Le, D. Dung
Pham, H. Hieu
Abstract
This paper addresses the few-shot image classification problem, in which unlabeled query samples must be classified given only a small number of labeled support samples. One major challenge of few-shot learning is the wide variety in the visual appearance of objects, which prevents the support samples from representing an object comprehensively. This can produce a significant difference between support and query samples and therefore undermine the performance of few-shot algorithms. In this paper, we tackle the problem by proposing the Few-shot Cosine Transformer (FS-CT), which effectively obtains the relational map between supports and queries for few-shot tasks. FS-CT consists of two parts: a learnable prototypical embedding network that obtains categorical representations from support samples with hard cases, and a transformer encoder that effectively obtains the relational map between support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer module significantly and therefore improves FS-CT accuracy by 5% to over 20% compared to the default scaled dot-product mechanism. Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS for 1-shot and 5-shot learning tasks across backbones and few-shot configurations. We also developed a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical application. Our FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be applied to a wide range of applications, such as healthcare, medical, and security surveillance. The official implementation code of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer.
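
The contrast the abstract draws between Cosine Attention and the default scaled dot-product mechanism can be illustrated with a minimal sketch. The abstract does not give the formula, so the version below is an assumption: the attention weight between a query and a key is taken to be their cosine similarity, with no softmax. The function names (`cosine_attention_weights`, `scaled_dot_product_weights`, `attend`) and the `eps` parameter are hypothetical and are not taken from the official repository, which remains the authoritative reference.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_attention_weights(queries, keys, eps=1e-8):
    """Assumed form of cosine attention: weight = cos(query, key).

    Each weight lies in [-1, 1] and does not depend on vector magnitude.
    """
    weights = []
    for q in queries:
        q_norm = math.sqrt(dot(q, q))
        row = [dot(q, k) / (q_norm * math.sqrt(dot(k, k)) + eps) for k in keys]
        weights.append(row)
    return weights


def scaled_dot_product_weights(queries, keys):
    """Standard baseline: softmax(q . k / sqrt(d)) over the keys."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def attend(weights, values):
    """Combine value vectors using one row of attention weights per query."""
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]


# Toy data: two class prototypes act as keys and values,
# and one query feature vector is compared against them.
prototypes = [[1.0, 2.0], [2.0, -1.0]]
query = [[1.0, 2.0]]
scaled_query = [[10.0, 20.0]]  # same direction, ten times the magnitude

cos_w = cosine_attention_weights(query, prototypes)
cos_w_scaled = cosine_attention_weights(scaled_query, prototypes)
sdp_w = scaled_dot_product_weights(query, prototypes)
sdp_w_scaled = scaled_dot_product_weights(scaled_query, prototypes)
attended = attend(cos_w, prototypes)
```

In this toy setting the cosine weights are identical for `query` and `scaled_query`, while the softmax weights shift further toward the first prototype as the query grows in magnitude. That bounded, scale-independent behavior is one plausible reading of "more robust and stable"; the sketch does not reproduce the reported accuracy gains.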
URI
https://vinspace.edu.vn/handle/VIN/573