Are We Really Making Much Progress in Text Classification? A Comparative Review - Summary
arXiv URL: https://arxiv.org/abs/2204.03954v4
Authors: Lukas Galke, Andor Diera, Bao Xin Lin, Bhakti Khera, Tim Meuser, Tushar Singhal, Fabian Karl, Ansgar Scherp
Summary:
This paper reviews and compares methods for single-label and multi-label text classification, categorizing them into bag-of-words, sequence-based, graph-based, and hierarchical methods. The findings reveal that pre-trained language models outperform all recently proposed graph-based and hierarchy-based methods, and that standard machine learning methods, such as a multilayer perceptron on a bag-of-words, sometimes perform better than these specialized methods as well. The study questions the contributions made by graph-based methods and suggests that future work should thoroughly test against strong bag-of-words baselines and state-of-the-art pre-trained language models.
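To make the recommended comparison concrete, below is a minimal sketch of such a bag-of-words baseline using scikit-learn (listed under technologies at the end of this summary). The dataset choice, vectorizer settings, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the bag-of-words baseline the paper argues new methods
# should be compared against. Dataset and hyperparameters are illustrative
# only, not the authors' exact setup.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

# Bag-of-words representation: each document becomes a sparse TF-IDF vector.
vectorizer = TfidfVectorizer(max_features=50_000)
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

# Logistic regression on the bag-of-words, one of the simple baselines
# the paper highlights as surprisingly competitive.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train.target)
print("Test accuracy:", accuracy_score(test.target, clf.predict(X_test)))
```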
Key Insights & Learnings:
- Pre-trained language models outperform all recently proposed graph-based and hierarchy-based methods.
- Standard machine learning methods like a multilayer perceptron on a bag-of-words sometimes perform better than graph-based and hierarchy-based methods.
- Graph-based methods offer little benefit while requiring more memory and longer training time.
- Simple methods such as MLPs and logistic regression have largely been overlooked as strong baselines for newly developed methods (a PyTorch sketch of such an MLP closes this summary).
- Sequence-based Transformers represent the state of the art in single-label and multi-label text classification, as illustrated by the fine-tuning sketch below.
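As a concrete illustration of the last insight, here is a hedged sketch of fine-tuning a pre-trained Transformer for single-label classification. It uses the Hugging Face transformers library, which is not named in this summary; the model name, label count, and toy documents are placeholder assumptions rather than the paper's setup.

```python
# Hedged sketch: fine-tuning a pre-trained Transformer for single-label text
# classification. Uses the Hugging Face `transformers` library (an assumption;
# it is not named in this summary). Model name, label count, and the toy
# documents below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=20
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["a toy document about space", "a toy document about medicine"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([0, 1])

# One gradient step; a real run iterates over a DataLoader for several epochs.
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
```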
Terms Mentioned: text classification, single-label, multi-label, bag-of-words, sequence-based, graph-based, hierarchical methods, pre-trained language models, multilayer perceptron, Convolutional Neural Networks, Long Short-Term Memory, Transformer, self-attention, synthetic co-occurrence graph, Graph Neural Networks, inductive, transductive, logistic regression, label embeddings, global hierarchy, local hierarchy
Technologies / Libraries Mentioned: PyTorch, scikit-learn, TextGCN, HeteGCN, HBGL
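Since PyTorch is listed above and the insights highlight an MLP on a bag-of-words as a strong baseline, here is a minimal PyTorch sketch of such a model. The layer sizes, dropout rate, and random toy batch are illustrative assumptions, not the paper's configuration.

```python
# Minimal PyTorch sketch of an MLP over a bag-of-words vector, the kind of
# simple baseline the review finds competitive. Layer sizes, dropout, and the
# toy batch are illustrative assumptions.
import torch
import torch.nn as nn

class BowMLP(nn.Module):
    def __init__(self, vocab_size: int, num_classes: int, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            # Outputs raw logits; for multi-label classification, pair these
            # with BCEWithLogitsLoss instead of CrossEntropyLoss.
            nn.Linear(hidden, num_classes),
        )

    def forward(self, bow: torch.Tensor) -> torch.Tensor:
        # bow: (batch, vocab_size) tensor of token counts or TF-IDF weights.
        return self.net(bow)

# Toy usage for the single-label case.
model = BowMLP(vocab_size=50_000, num_classes=20)
logits = model(torch.rand(8, 50_000))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 20, (8,)))
loss.backward()
```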