dc.description.abstract | In many healthcare applications, accurately identifying pills from images captured under varying conditions has become increasingly crucial. Despite numerous attempts to employ deep learning methods for pill recognition, the high similarity in pill appearances often leads to misrecognition, posing a significant challenge.
To address this issue, we introduce PIKA, a novel approach that utilizes external knowledge to enhance pill recognition accuracy. Our focus is on a practical scenario termed contextual pill recognition, which aims to identify pills in images of a patient’s pill intake.
First, we develop a method to model implicit associations between pills using an external data source, in this case prescriptions. Second, we propose a walk-based graph embedding model that maps the graph into a vector space, extracting condensed relational features of the pills. Finally, we present a comprehensive framework that integrates both image-based visual features and graph-based relational features for pill identification.
Within this framework, the visual representation of each pill is mapped to the graph embedding space, allowing us to apply attention mechanisms over the graph representation. This results in a semantically rich context vector that aids in the final classification. To our knowledge, this is the first study to leverage external prescription data to establish associations between medications and improve classification accuracy.
The architecture of PIKA is lightweight and flexible, making it compatible with various recognition backbones. Experimental results demonstrate that, by utilizing the external knowledge graph, PIKA significantly enhances recognition accuracy, improving F1 score by 4.8% to 34.1% over baseline methods. | en_US |
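
The abstract names two ideas that a small example may make more concrete: deriving pill associations from prescriptions, and attending over graph-side pill representations with a projected visual feature. The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes edges are weighted by raw co-occurrence counts and that attention is plain softmax dot-product attention. The function names, pill names, and embedding values are all hypothetical, and the embeddings are hand-written placeholders rather than the output of a walk-based embedding model.

```python
from collections import Counter
from itertools import combinations
import math


def build_cooccurrence_graph(prescriptions):
    """Count how often each unordered pill pair shares a prescription.

    Assumption: edge weight = raw co-occurrence count. The abstract does not
    specify the weighting scheme.
    """
    edges = Counter()
    for pills in prescriptions:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(pills)), 2):
            edges[(a, b)] += 1
    return edges


def attention_context(query, node_embeddings):
    """Softmax dot-product attention of one query vector over node embeddings.

    `query` stands in for a visual feature already projected into the graph
    embedding space (the projection itself is not shown here).
    Returns the per-node attention weights and the weighted-sum context vector.
    """
    names = list(node_embeddings)
    scores = [
        sum(q * k for q, k in zip(query, node_embeddings[name]))
        for name in names
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * node_embeddings[name][i] for w, name in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), context


# Hypothetical prescriptions, each a list of pill names.
prescriptions = [
    ["aspirin", "metformin"],
    ["aspirin", "metformin", "atorvastatin"],
    ["atorvastatin", "aspirin"],
]
graph = build_cooccurrence_graph(prescriptions)

# Placeholder 2-d embeddings; a real system would learn these from the graph.
embeddings = {"aspirin": [1.0, 0.0], "metformin": [0.0, 1.0]}
weights, context = attention_context([1.0, 0.0], embeddings)
```

In this sketch the context vector is the piece that would be combined with the visual feature before the final classifier. How that combination is done is left open here because the abstract does not describe it.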