MiLMo: Minority Multilingual Pre-trained Language Model - Summary
Arxiv URL: https://arxiv.org/abs/2212.01779v2
Authors: Junjie Deng, Hanru Shi, Xinhe Yu, Wugedele Bao, Yuan Sun, Xiaobing Zhao
Summary:
The paper presents a multilingual pre-trained language model named MiLMo that performs better on minority language tasks, covering Mongolian, Tibetan, Uyghur, Kazakh and Korean. The authors also construct a minority multilingual text classification dataset named MiTC, and train a word2vec model for each language to provide an optimal scheme for downstream task research on minority languages.
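Since the summary itself contains no code, the following is a minimal sketch of the kind of downstream fine-tuning described: a multilingual encoder fine-tuned for text classification on minority-language data. The checkpoint name `xlm-roberta-base` is a stand-in (MiLMo's public checkpoint name is not given here), and the CSV file names and label count are illustrative placeholders.

```python
# Hypothetical sketch: fine-tuning a multilingual encoder for minority-language
# text classification. "xlm-roberta-base" stands in for a MiLMo checkpoint;
# file paths and NUM_LABELS are illustrative placeholders, not the real MiTC setup.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "xlm-roberta-base"   # assumption: stand-in for MiLMo
NUM_LABELS = 10                   # assumption: number of MiTC classes

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS)

# Assumed layout: CSV files with "text" and "label" columns for one language.
dataset = load_dataset("csv", data_files={"train": "mitc_train.csv",
                                          "test": "mitc_test.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="milmo-mitc", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```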
Key Insights & Learnings:
- MiLMo is a multilingual pre-trained language model that performs better than existing multilingual pre-trained models on minority language tasks.
- MiTC is a minority multilingual text classification dataset constructed to address the scarcity of minority language datasets.
- MiLMo outperforms the word2vec representation on the downstream text classification task (a sketch of such a word2vec baseline follows this list).
- The paper provides an optimal scheme for downstream task research on minority languages.
- Existing multilingual pre-trained models do not work well on minority languages, which seriously hinders the informatization of minority languages.
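For contrast, here is a minimal sketch of a word2vec-style baseline of the kind the paper compares against: static embeddings trained per language, averaged into document vectors, and fed to a linear classifier. The toy corpus, labels, and hyperparameters are illustrative placeholders, not the paper's actual configuration.

```python
# Hypothetical sketch of a word2vec baseline for text classification:
# per-language static embeddings -> averaged document vectors -> linear classifier.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Assumed toy corpus: tokenised documents of one minority language, with labels.
train_docs = [["token1", "token2", "token3"], ["token2", "token4"]]
train_labels = [0, 1]

# Train a word2vec model for this language (the paper trains one per language).
w2v = Word2Vec(sentences=train_docs, vector_size=100, window=5,
               min_count=1, workers=4)

def doc_vector(tokens, model):
    """Average the word vectors of in-vocabulary tokens; zeros if none found."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

X_train = np.vstack([doc_vector(d, w2v) for d in train_docs])
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print(clf.predict(X_train))
```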
Terms Mentioned: pre-trained language models, multilingual pre-trained language models, minority languages, MiLMo, MiTC, word2vec model, downstream task
Technologies / Libraries Mentioned: BERT, ELMo, Transformer, GPT, ALBERT, SpanBERT, RoBERTa, XLM, XLM-R, mBERT