Publisher link: https://www.scitepress.org/Papers/2023/116060/116060.pdf
Publisher DOI: 10.5220/0011606000003393
Title: Towards low-budget real-time active learning for text classification via proxy-based data selection
Language: English
Authors: Andersen, Jakob Smedegaard
Zukunft, Olaf
Editors: Rocha, Ana Paula
Steels, Luc
Herik, Jaap
Keywords: Text Classification; Active Learning; Cost-Sensitive Learning
Publication date: 2023
Publisher: ScitePress
Part of series: Proceedings of the 15th International Conference on Agents and Artificial Intelligence
Volume: 3: ICAART
Start page: 25
End page: 33
Conference: International Conference on Agents and Artificial Intelligence 2023
Abstract:
Training data is typically the bottleneck of supervised machine learning applications, which rely heavily on cost-intensive human annotations. Active Learning proposes an interactive framework to spend human effort efficiently in the training data generation process. However, re-training state-of-the-art text classifiers is highly computationally intensive, leading to long training cycles that cause annoying interruptions for humans in the loop. To enhance the applicability of Active Learning, we investigate low-budget real-time Active Learning via Proxy-based data selection in the domain of text classification. We aim to enable fast interactive cycles with minimal labelling effort while exploiting the performance of state-of-the-art text classifiers. Our results show that Proxy-based Active Learning can increase the F1-score of a lightweight classifier by up to ~19% compared to a traditional budget Active Learning approach. Our novel Proxy-based Active Learning approach can be carried out time-efficiently, requiring less than 1 second for each learning iteration.
URI: http://hdl.handle.net/20.500.12738/14988
ISBN: 978-989-758-623-1
ISSN: 2184-433X
Review status: This version has undergone a peer-review process (Peer Review)
Institution: Fakultät Technik und Informatik
Department Informatik
Document type: Conference publication
Appears in collections: Publications without full text
This resource was published under the following copyright terms: Creative Commons license