Robust Educational Dialogue Act Classifiers with Low-Resource and Imbalanced Datasets

Lin, Jionghao; Tan, Wei; Nguyen, Ngoc Dang; Lang, David; Du, Lan; Buntine, Wray; Beare, Richard; Chen, Guanliang; Gašević, Dragan

dc.contributor.author	Lin, Jionghao
dc.contributor.author	Tan, Wei
dc.contributor.author	Nguyen, Ngoc Dang
dc.contributor.author	Lang, David
dc.contributor.author	Du, Lan
dc.contributor.author	Buntine, Wray
dc.contributor.author	Beare, Richard
dc.contributor.author	Chen, Guanliang
dc.contributor.author	Gašević, Dragan
dc.date.accessioned	2024-10-24T16:01:15Z
dc.date.available	2024-10-24T16:01:15Z
dc.date.issued	2023-04-15
dc.identifier.uri	https://vinspace.edu.vn/handle/VIN/376
dc.description.abstract	Dialogue acts (DAs) represent the conversational actions of tutors or students during tutoring dialogues. Automating the identification of DAs in tutoring dialogues is important for the design of dialogue-based intelligent tutoring systems. Many prior studies employ machine learning models to classify DAs in tutoring dialogues and invest much effort in optimizing classification accuracy with limited amounts of training data (i.e., the low-resource data scenario). However, beyond classification accuracy, the robustness of the classifier is also important, as it reflects the classifier's capability to learn patterns from different class distributions. We note that many prior studies on classifying educational DAs employ cross-entropy (CE) loss to optimize DA classifiers on low-resource data with an imbalanced DA distribution. The DA classifiers in these studies tend to prioritize accuracy on the majority class at the expense of the minority class, and thus might not be robust to data with imbalanced ratios of different DA classes. To improve the robustness of classifiers on imbalanced class distributions, we propose to optimize the DA classifier by maximizing the area under the ROC curve (AUC) score (i.e., AUC maximization). Through extensive experiments, our study provides evidence that (i) by maximizing AUC during training, the DA classifier achieves significant performance improvement over the CE approach under low-resource data, and (ii) AUC maximization approaches can improve the robustness of the DA classifier under different class imbalance ratios.	en_US
dc.language.iso	en	en_US
dc.subject	educational dialogue act classification	en_US
dc.subject	model robustness	en_US
dc.subject	low-resource data	en_US
dc.subject	imbalanced data	en_US
dc.subject	large language models	en_US
dc.title	Robust Educational Dialogue Act Classifiers with Low-Resource and Imbalanced Datasets	en_US
dc.type	Article	en_US
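
To make the abstract's contrast between accuracy-oriented training and AUC maximization concrete, the following is a minimal sketch of what an AUC-style objective measures. It is not the authors' implementation: the squared-hinge pairwise surrogate, the function names `pairwise_auc` and `pairwise_auc_surrogate`, the `margin` parameter, and the toy scores are all assumptions chosen for illustration.

```python
import numpy as np


def pairwise_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]  # scores of minority-class (positive) examples
    neg = scores[labels == 0]  # scores of majority-class (negative) examples
    diff = pos[:, None] - neg[None, :]  # all positive-minus-negative gaps
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def pairwise_auc_surrogate(scores, labels, margin=1.0):
    """Hypothetical differentiable stand-in for 1 - AUC (squared hinge).

    Each positive/negative pair is penalized when the positive score does
    not exceed the negative score by at least `margin`.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))


# Toy imbalanced batch: one minority-class example, three majority-class ones.
labels = [1, 0, 0, 0]
good_scores = [0.9, 0.8, 0.3, 0.1]  # the positive outranks every negative
bad_scores = [0.2, 0.8, 0.3, 0.1]   # the positive outranks only one negative

auc_good = pairwise_auc(good_scores, labels)
auc_bad = pairwise_auc(bad_scores, labels)
loss_good = pairwise_auc_surrogate(good_scores, labels)
loss_bad = pairwise_auc_surrogate(bad_scores, labels)
```

Because the objective is defined over positive/negative pairs rather than individual examples, the single minority-class example takes part in every term, which is one intuition for why such objectives are less dominated by the majority class than a per-example loss.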

Files in this item

Name: Robust Educational Dialogue Act ...
Size: 661.0Kb
Format: PDF

This item appears in the following Collection(s)

Wray Buntine, PhD. [13]
College of Engineering and Computer Science Director, Computer Science program