Are We Really Making Much Progress in Text Classification? A Comparative Review - Summary

arXiv URL: https://arxiv.org/abs/2204.03954v4

Authors: Lukas Galke, Andor Diera, Bao Xin Lin, Bhakti Khera, Tim Meuser, Tushar Singhal, Fabian Karl, Ansgar Scherp

Summary:

This paper reviews and compares methods for single-label and multi-label text classification, categorizing them into bag-of-words, sequence-based, graph-based, and hierarchical methods. The findings reveal that all recently proposed graph-based and hierarchy-based methods fail to outperform pre-trained language models, and sometimes perform even worse than standard machine learning methods such as a multilayer perceptron on a bag-of-words. The study questions the contributions claimed for graph-based methods and recommends that future work thoroughly test against strong bag-of-words baselines and state-of-the-art pre-trained language models.
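
A minimal sketch of such a bag-of-words baseline, using scikit-learn (one of the libraries the paper's experiments build on); the toy corpus, the TF-IDF features, and the 1024-unit hidden layer are illustrative assumptions rather than the authors' exact configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Toy labeled corpus (assumption; any single-label dataset works here).
train_texts = [
    "stocks rallied after the earnings report",
    "the team won the championship game",
    "new gene therapy shows promising results",
    "central bank raises interest rates again",
]
train_labels = ["business", "sports", "science", "business"]

# Logistic regression on a TF-IDF bag-of-words.
logreg = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
logreg.fit(train_texts, train_labels)

# A wide one-hidden-layer MLP on the same bag-of-words representation.
mlp = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(1024,), max_iter=100),
)
mlp.fit(train_texts, train_labels)

print(logreg.predict(["quarterly profits beat expectations"]))
print(mlp.predict(["quarterly profits beat expectations"]))
```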

Key Insights & Learnings:

  • Pre-trained language models outperform all recently proposed graph-based and hierarchy-based methods.
  • Standard machine learning methods like a multilayer perceptron on a bag-of-words sometimes perform better than graph-based and hierarchy-based methods.
  • Graph-based methods yield little benefit while requiring substantially more memory and training time.
  • Simple methods such as MLPs and logistic regression have largely been overlooked, despite being strong and serious competitors to newly developed methods.
  • Sequence-based Transformers represent the state of the art in single-label and multi-label text classification (a fine-tuning sketch follows this list).

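To make that last point concrete, here is a minimal sketch of fine-tuning a pre-trained Transformer for single-label classification with PyTorch and Hugging Face Transformers; the model choice (distilbert-base-uncased), the learning rate, and the toy data are assumptions for illustration, not the paper's experimental setup:

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical model choice; the paper evaluates pre-trained models such as BERT.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy single-label data: 0 = business, 1 = science (assumption).
texts = [
    "stocks rallied after the earnings report",
    "new gene therapy shows promising results",
]
labels = torch.tensor([0, 1])

optimizer = AdamW(model.parameters(), lr=2e-5)  # assumed learning rate
model.train()
for _ in range(3):  # a few toy gradient steps
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Classify a new document with the fine-tuned model.
model.eval()
with torch.no_grad():
    inputs = tokenizer(["interest rates were raised again"], return_tensors="pt")
    prediction = model(**inputs).logits.argmax(dim=-1)
print(prediction)
```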

Terms Mentioned: text classification, single-label, multi-label, bag-of-words, sequence-based, graph-based, hierarchical methods, pre-trained language models, multilayer perceptron, Convolutional Neural Networks, Long Short-Term Memory, Transformer, self-attention, synthetic co-occurrence graph, Graph Neural Networks, inductive, transductive, logistic regression, label embeddings, global hierarchy, local hierarchy

Technologies / Libraries Mentioned: PyTorch, scikit-learn, TextGCN, HeteGCN, HBGL