Browsing by Author "Osher, Stanley J."
Improving transformers with probabilistic attention keys
Le, Duy Dung; Tran, Viet Anh; Nguyen, M. Tan; Nguyen, Tam; Nguyen, Duy Khuong; Baraniuk, Richard G.; Ho, Nhat; Osher, Stanley J. (2022)
Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that ...
Improving transformers with probabilistic attention keys
Nguyen, Tam; Nguyen, M. Tan; Le, D. Dung; Nguyen, Khuong Duy; Tran, Viet Anh; Baraniuk, Richard G.; Osher, Stanley J.; Ho, Nhat (2022-06-13)
Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that ...