Does informativeness matter? Active learning for educational dialogue act classification

Tan, Wei; Lin, Jionghao; Lang, David; Chen, Guanliang; Gasevic, Dragan; Du, Lan; Buntine, Wray

Date

2023-04-12

Abstract

Dialogue Acts (DAs) can be used to explain what expert tutors do and what students know during the tutoring process. Most empirical studies use random sampling to obtain sentence samples for manual DA annotation, and the annotated samples are then used to train DA classifiers. However, these studies have paid little attention to sample informativeness, which reflects how much information the selected samples carry and thus how well a classifier can learn patterns from them. Notably, informativeness varies among samples, and a classifier may need only a small number of low-informativeness samples to learn their patterns. Because random sampling ignores sample informativeness, it can spend human labelling effort on samples that contribute little to training the classifiers. As an alternative, researchers suggest using the statistical sampling methods of Active Learning (AL) to identify informative samples for training. However, the use of AL methods in educational DA classification tasks is under-explored. In this paper, we first examine the informativeness of annotated sentence samples. We then investigate how AL methods can select informative samples to support DA classifiers during the AL sampling process. The results reveal that most annotated sentences in the training dataset have low informativeness, and that the DA classifier can easily capture the patterns of these sentences. We also demonstrate how AL methods can reduce the cost of manual annotation in the AL sampling process.
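
The abstract does not name the specific AL methods used, so the following is only a minimal sketch of one common way to score informativeness, assuming an uncertainty-based criterion (predictive entropy). The function names, the three DA classes, and the probability values are hypothetical and are not taken from the paper.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_informative(pool_probs, k):
    """Return indices of the k pool samples with the highest entropy.

    Assumption: higher entropy means the classifier is less certain,
    so annotating that sample is treated as more informative.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabelled sentences
# over three DA classes (illustrative values only).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low informativeness
    [0.34, 0.33, 0.33],  # near-uniform prediction, high informativeness
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# Indices of the two sentences that would be sent for manual annotation.
selected = select_most_informative(pool_probs, k=2)
print(selected)
```

Under this assumed criterion, sentences the classifier already predicts confidently are skipped, which is one way an AL loop could spend annotation effort on fewer, more informative samples than random sampling would.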

URI

https://vinspace.edu.vn/handle/VIN/276

Collections

Wray Buntine, PhD. [13]