We have summarized the latest trends across different areas of research, including natural language processing, conversational AI, computer vision, and reinforcement learning, to help you stay well prepared for 2021.
For each area, we also recommend key research papers that we feel are representative of the latest advances.
Natural Language Processing
In 2020, large pre-trained language models, especially transformers, continued to dominate NLP research advances. We’re likely to see more interesting research ideas this year on improving the transformer architecture and the efficiency of its training. At the same time, we can be confident that top tech companies will continue to rely on model size as the primary driver of language model performance, with GPT-4 or something similar likely to be introduced in 2021.
Accelerating the training of large language models. Although transformers show outstanding performance on downstream NLP tasks, pre-training or fine-tuning new transformer-based language models takes too much time and energy. Some interesting ideas for accelerating transformer training were presented last year, and this topic is likely to stay hot in 2021.
Detecting and eliminating biases and toxicity. GPT-3 has shown remarkable, even human-like, results, especially in language generation tasks. Its output, however, often includes toxic and biased remarks. Detecting and removing toxicity and bias from the output of language models is likely to be one of the main challenges for the NLP research community in the coming years.
Applying language models in a multilingual setting. Research shows that pre-trained multilingual models can generalize across languages without any explicit cross-lingual supervision. NLP researchers do not yet have a clear understanding of how this works, which makes the behavior of language models in a multilingual setting an important focus for future research.
Exploring effective strategies for data augmentation. Compared to images, generating diverse yet semantically invariant perturbations of text is much more challenging. Still, there are several data augmentation strategies that work in NLP but are not commonly used. As recently shown, these strategies fail to consistently improve the performance of pre-trained transformers, suggesting that data augmentation offers no additional benefit beyond that of pre-training. However, it may be helpful in settings where pre-training shows limitations (e.g., negation, malformed input).
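To make the idea of text perturbation concrete, here is a minimal Python sketch of two simple token-level augmentations: random swap and random deletion. It is an illustration only, not the method of any particular paper mentioned here; the function names (`random_swap`, `random_deletion`, `augment`) and the default parameters are assumptions chosen for this example.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_variants=4, p_delete=0.1, n_swaps=1, seed=0):
    """Produce n_variants perturbed copies of a whitespace-tokenized sentence.

    Even-numbered variants use random swap, odd-numbered ones random deletion.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    variants = []
    for k in range(n_variants):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, n_swaps, rng)
        else:
            new_tokens = random_deletion(tokens, p_delete, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was not good at all"):
        print(variant)
```

Note that nothing in this sketch protects meaning: deleting or moving a word such as "not" can flip the label of a sentence, which is exactly why semantically invariant perturbations are harder to produce for text than for images.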
Conversational AI
2020 was a landmark year for open-domain chatbots. Google’s Meena and Facebook’s Blender, both introduced that year, achieve close to human-level performance. The creators of these state-of-the-art conversational agents proposed new approaches to improving conversation quality in terms of the sensibleness and specificity of responses, the agent’s empathy, and the consistency of its personality.
Also, given that Meena is based on a model with 2.6 billion parameters and Blender uses a Transformer-based model with up to 9.4 billion parameters, we can conclude that model size is one of the main factors behind the performance of these chatbots. Most companies, however, cannot afford to train and deploy chatbots of that scale, and instead look for ‘smarter’ approaches to improving dialog agent performance.
Building transformer-based conversational agents. Recent research shows that the transformer architecture can be applied successfully to open-domain dialog systems (e.g., TransferTransfo by HuggingFace, GPT-3 by OpenAI). It is likely that transformer-based models will continue to improve the performance of conversational agents.
Addressing data scarcity in task-oriented dialog systems. Because collecting data for goal-oriented dialog agents is typically very costly, one promising avenue for future research is to develop new methods for overcoming data scarcity. Improving sample efficiency in policy learning and introducing methods for zero-shot domain adaptation are other related research directions.
Developing evaluation metrics for open-domain chatbots. Evaluating open-domain dialog agents remains a very challenging problem for the NLP research community. Unlike task-oriented dialogs, chit-chat conversations do not have an explicit goal, and there are many possible correct responses in each dialog turn.
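The difficulty can be illustrated with a word-overlap score. The Python sketch below computes a unigram F1 between a chatbot response and a reference response, and a variant that takes the best score over several references. It is a simplified illustration under the assumption of lowercase whitespace tokenization; the helper names `unigram_f1` and `best_f1` are made up for this example and do not come from any specific toolkit.

```python
from collections import Counter


def unigram_f1(candidate, reference):
    """Unigram F1 between two strings, using lowercase whitespace tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared word at most as often
    # as it appears in both strings.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_f1(candidate, references):
    """Score a candidate against several references and keep the best match."""
    return max((unigram_f1(candidate, r) for r in references), default=0.0)


if __name__ == "__main__":
    # Hypothetical dialog turn: "What did you do this weekend?"
    response = "Mostly stayed home and read a book"
    single_reference = ["I went hiking with my friends"]
    more_references = single_reference + ["I stayed home and read"]
    print(best_f1(response, single_reference))
    print(best_f1(response, more_references))
```

A perfectly reasonable response shares no words with the single reference and so scores zero; adding a second acceptable reference changes the score substantially. Since the set of valid replies in chit-chat is effectively open-ended, overlap with a fixed reference set is a weak signal, which is one reason better metrics are still being sought.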