  • The College of Engineering and Computer Science
  • Pham Huy Hieu, PhD.

Enhancing few-shot image classification with cosine transformer

View/Open
hieu3.pdf (4.823Mb)
Publication date
2023-07-21
Authors
Nguyen, Quang Huy
Nguyen, Q. Cuong
Le, D. Dung
Pham, H. Hieu
Abstract
This paper addresses the few-shot image classification problem, in which unlabeled query samples must be classified given only a small number of labeled support samples. One major challenge in few-shot learning is the wide variety of visual appearances an object can take, which prevents the support samples from representing that object comprehensively. This can result in a significant difference between support and query samples, undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing the Few-shot Cosine Transformer (FS-CT), in which the relational map between supports and queries is effectively obtained for few-shot tasks. FS-CT consists of two parts: a learnable prototypical embedding network that obtains categorical representations from support samples with hard cases, and a transformer encoder that effectively obtains the relational map between support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer module significantly and thereby improves FS-CT accuracy by 5% to over 20% compared to the default scaled dot-product mechanism. Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS on 1-shot and 5-shot learning tasks across backbones and few-shot configurations. We also developed a custom few-shot dataset for yoga pose recognition to demonstrate the potential of our algorithm for practical applications. FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be used in a wide range of applications, such as healthcare, medicine, and security surveillance. The official implementation of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer.
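The abstract contrasts Cosine Attention with the default scaled dot-product mechanism but does not give the formula. The following is a minimal NumPy sketch of the general idea, not the authors' implementation: attention scores are taken as the cosine similarity between query and key vectors instead of a softmax over scaled dot products. The function names, the tensor shapes, the `eps` constant, and the choice to use the cosine scores directly without a softmax are all assumptions made for illustration; the official repository linked above is the authoritative reference.

```python
import numpy as np


def scaled_dot_product_scores(q, k):
    """Baseline attention weights: softmax(q k^T / sqrt(d)) over the keys."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


def cosine_scores(q, k, eps=1e-8):
    """Illustrative cosine attention scores (assumed form, no softmax).

    Each entry is the cosine similarity between one query row and one key
    row, so every score lies in [-1, 1] regardless of vector magnitude.
    """
    q_unit = q / (np.linalg.norm(q, axis=-1, keepdims=True) + eps)
    k_unit = k / (np.linalg.norm(k, axis=-1, keepdims=True) + eps)
    return q_unit @ k_unit.T


def attend(scores, v):
    """Combine value vectors using the given attention scores."""
    return scores @ v


# Hypothetical 5-way task: 5 class prototypes attend over 15 query features.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(5, 64))
query_feats = rng.normal(size=(15, 64))

cos = cosine_scores(prototypes, query_feats)
sdp = scaled_dot_product_scores(prototypes, query_feats)
relation_map = attend(cos, query_feats)  # shape (5, 64)

# Rescaling the inputs leaves the cosine scores unchanged,
# while the scaled dot-product weights shift with the magnitude.
cos_rescaled = cosine_scores(10.0 * prototypes, query_feats)
sdp_rescaled = scaled_dot_product_scores(10.0 * prototypes, query_feats)
```

In this sketch the cosine scores are bounded and insensitive to feature magnitude, which is one plausible reading of the "more robust and stable" behavior the abstract describes. The accuracy gains reported in the abstract come from the paper's experiments, not from this toy example.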
Identifier
https://vinspace.edu.vn/handle/VIN/573
Collections
  • Pham Huy Hieu, PhD. [36]
