Latest developments in the world of Natural Language Processing: A comparison of different language models

Justus Tschötsch

Natural language processing (NLP) is a rapidly evolving sub-field of artificial intelligence. With constant new developments and breakthroughs, language models can already understand and generate human-like language with impressive accuracy. To help you keep track and catch up, we will compare different language models and look at the latest advancements, opportunities, and challenges of natural language processing.

What is NLP and what are language models?

Natural language processing concentrates on programming computers so that they can understand and interpret human language in the form of text, speech, or images. If you want to read an in-depth explanation of NLP before diving into this article, you can find one in our blog entry “What is NLP?”.

Language models are an implementation of natural language processing. Building on the definition above: language models use machine learning to predict the most likely next word in a sentence. Trained on large datasets of text, language models learn the patterns and relationships between words and phrases in a language. They can be used for a variety of tasks such as text production, speech recognition, machine translation, and optical character recognition.
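The core idea of next-word prediction can be illustrated with a toy bigram model - a deliberately minimal sketch that just counts word pairs. Real language models use neural networks rather than raw counts, but the objective is the same:

```python
from collections import Counter, defaultdict

# Count how often each word follows each other word in a tiny corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most likely next word after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" - it follows "the" most often here
```

A neural language model replaces the count table with learned parameters, which lets it generalize to word sequences it has never seen verbatim.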

Overview and comparison of existing language models

Language models have revolutionized the field of NLP. This section will provide a brief overview of the purpose, development, applications, and advantages of some models. We will delve into models like BERT, GPT, and Luminous.

GPT-3

Released in May 2020 by OpenAI, GPT-3 (short for “Generative Pre-trained Transformer 3”), and especially its fine-tuned variant “ChatGPT”, impressed the world. GPT-3’s main use is to generate extensive and human-like text responses from only a small amount of input text. The quality and accuracy of the model’s output is so high that it is very difficult to ascertain whether a text was written by a human or not. It can also accomplish other natural language processing tasks, ranging from text summarization to generating programming code.


GPT-3 was developed using the Transformer architecture, the “T” in “GPT”. The Transformer is a neural network architecture that applies self-attention mechanisms to process and relate input sequences. GPT-3 was trained on around 45 TB of text data from various sources such as Wikipedia, books, and web text, at an estimated training cost of a staggering $4.6 million. More precisely, these datasets were used to train the model:

  • Common Crawl: Petabytes of data collected over eight years of crawling the web: raw web page data, metadata extracts, and text extracts with light filtering

  • WebText2: Text of all web pages that were linked to from Reddit posts with more than 3 upvotes

  • Books1 and Books2: two internet-based book corpora

  • Wikipedia: English-language pages of the encyclopedia
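The self-attention mechanism mentioned above can be sketched in a few lines of NumPy. This is a simplified single-head version without learned weight matrices - real Transformers use multiple attention heads and trained projections for the queries, keys, and values:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Relate every position in a sequence to every other position.

    Q, K, V: arrays of shape (sequence_length, d) holding the query,
    key, and value vectors for each token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax: each row sums to 1
    return weights @ V                                # weighted mix of value vectors

# Toy example: a sequence of 3 tokens with 4-dimensional representations.
x = np.random.rand(3, 4)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4): one updated vector per token
```

Because the attention weights form a probability distribution over all positions, every output vector is a context-dependent blend of the whole sequence - this is what lets the model relate distant words to each other.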

Before GPT-3, OpenAI developed smaller predecessor models (GPT-1, GPT-2). Each generation was trained with increasing amounts of data and a more complex architecture, resulting in significant performance improvements from model to model. The training process itself used unsupervised learning, meaning that the language model was trained to predict the next word of a given sentence.


GPT-3 can be used for a wide range of natural language processing tasks, such as:

  • Content creation: GPT-3’s text production is able to cover a wide range of topics. Since the text output is almost indistinguishable from human-made text, it is commonly used for content creation, journalism, and copywriting. 

  • Language translation: GPT-3 has the capability to translate text from one language to another, which is used for website translations and communication tools.

  • Virtual assistants and chatbots: Because of its ability to understand natural language, GPT-3 is used to create intelligent chatbots that can respond to users in a conversational manner.

  • Sentiment analysis: Due to its unsupervised learning from vast text data, GPT-3 is able to analyze the sentiment of a text. This makes it useful for applications like brand monitoring, social media analysis, and stock trading.

  • Question-answering: By understanding the context and meaning of a question, GPT-3 is able to provide precise responses.

  • Personalization: The model is able to create recommendations as well as personalized content based on the user’s behavior and preferences.

  • Medical diagnosis: GPT-3 can assist with diagnosis and treatment recommendations by analyzing medical data.


GPT-3 can generate large amounts of text in a relatively short time, which makes the production of text-based content efficient, fast, and easy. In addition, the model’s accuracy is very high, thanks to the massive amount of text data GPT-3 was trained on. Furthermore, OpenAI continues to refine its models: as newer versions are trained on more data, they improve over time and stay up to date with current language usage patterns.

BERT

BERT is the abbreviation for “Bidirectional Encoder Representations from Transformers”, a model released by Google in October 2018. As a pre-trained language model that can be fine-tuned for specific tasks, it enhances the performance of many natural language processing applications. It is a powerful and flexible tool for NLP practitioners, allowing state-of-the-art performance even with limited training time and data.


The BERT model is also built on the Transformer architecture. Its key innovation is the bidirectional training approach, which allows the model to take the whole context into account when making predictions - in contrast to previous NLP models, which only considered the words preceding the target word. BERT was trained on a massive amount of text data, including the entire English Wikipedia as well as the BookCorpus dataset, which alone contains more than 800 million words. The language model was pre-trained on two tasks simultaneously:

Next sentence prediction (NSP)

The task here is to determine whether two sentences are consecutive. During training, the second sentence actually follows the first half of the time; the other half of the time, the second sentence is randomly selected from the training corpus. This enables the model to predict whether two sentences are consecutive or not. The purpose of this task is to help the model understand the relationship between sentences and the context they appear in.

Masked language modeling (MLM)

Here, a certain percentage of tokens in each sentence is selected randomly and replaced with a special “MASK” token. The model is then trained to predict the word behind the “MASK” token from the surrounding context. This task trains BERT to understand the relationships between words in a sentence as well as capture the meaning of the sentence as a whole. You can find a good visualization below.
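The data preparation for both pre-training tasks can be sketched in plain Python. This is a minimal illustration: the original BERT recipe is slightly more involved (of the 15% selected tokens, some are replaced by random words or left unchanged rather than masked):

```python
import random

random.seed(0)  # for reproducibility of this sketch

def make_nsp_pair(sentences, i):
    """Build one next-sentence-prediction example from sentence i."""
    if random.random() < 0.5:  # half the time: the real next sentence
        return sentences[i], sentences[i + 1], True
    # otherwise: a randomly chosen sentence from the corpus
    return sentences[i], random.choice(sentences), False

def mask_tokens(tokens, mask_rate=0.15):
    """Replace a fraction of tokens with [MASK]; return masked tokens and answers."""
    masked, answers = [], {}
    for pos, token in enumerate(tokens):
        if random.random() < mask_rate:
            answers[pos] = token          # the model must recover these words
            masked.append("[MASK]")
        else:
            masked.append(token)
    return masked, answers

sentences = ["the cat sat", "it purred loudly", "the dog barked"]
first, second, is_next = make_nsp_pair(sentences, 0)
masked, answers = mask_tokens("the cat sat on the mat".split())
```

During pre-training, BERT receives such pairs and masked sequences and is optimized on both objectives at once: classifying `is_next` and filling in the words stored in `answers`.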

BERT can be fine-tuned for a variety of natural language processing tasks, such as:

  • Language translation: The BERT model has been used as a pre-trained model for machine translation. To improve the precision and quality of the translation, the model can be fine-tuned to a specific language pair. 

  • Google search: As BERT is a language model developed by Google, it is used to improve the relevance and accuracy of Google search results. 

  • Text classification: The language model can classify texts into different categories, e.g. spam or not spam. 

  • Text prediction: BERT’s MLM pre-training can be leveraged for text prediction, and its fine-tuning options make it easy to adapt for this task. 

  • Question answering: BERT improved the accuracy of question-answering systems because its training allows it to extract a precise answer from a given context. If you want to read more about that, we have also created a blog entry about question answering with BERT.

  • Sentiment analysis: The language model is able to classify text sentiments in the categories of positive, neutral, and negative. 


The possibility of fine-tuning BERT for a wide range of tasks makes it a really powerful and flexible tool, and its pre-training on a large corpus makes it easy to apply to well-defined tasks. Another advantage of this language model is its frequent updates, allowing outstanding accuracy. Furthermore, BERT is widely available: pre-trained models exist for more than 100 languages, which makes it a viable option for non-English projects.

Luminous

Launched in April 2022 by the German AI company Aleph Alpha, Luminous is the umbrella term for a family of large language models. In the following, I’m referring to “Luminous-supreme”, the most capable model of the family. Its purpose is to help make natural language processing tasks more efficient and advanced, especially in the area of conversational AI. Its primary use is the processing and production of text while mimicking human-like responses. Even though Luminous is less popular than the GPT family and BERT, it is the first language model from Europe that can compete with the world’s leading models.


The Luminous language model was trained in five different languages: Spanish, English, German, Italian, and French. Unfortunately, since Aleph Alpha is a private company, deeper insights into the development are not publicly available.  

Typical applications of Luminous include:

  • Virtual assistants: Luminous can be used for the creation of virtual assistants, which are able to understand and respond to user queries. This provides a personalized and efficient experience for the user. 

  • Sentiment analysis: Like other language models, Luminous can be used for sentiment analysis. It is capable of analyzing the emotion and tone of a text to understand people’s feelings about a certain topic. 

  • Translation: Luminous can be used for machine translation, making the automatic translation of text possible.

  • Chatbots: By powering chatbots, it can also be used for the automation of customer service processes and other conversations.


In contrast to GPT-3 and BERT, Luminous has the ability to work with images. It can also be fine-tuned for various natural language processing tasks, making it a versatile tool. Another advantage is its speed: according to Aleph Alpha, it is twice as fast as competitor models while maintaining the same level of accuracy, as you can see in the graphic below. Furthermore, Luminous is designed to understand and generate natural language with human-like responses, which makes it highly accurate. Lastly, the Luminous model ensures that user data is kept safe and secure. Its compliance with data protection regulations like the CCPA and GDPR makes it a viable, reliable, and trustworthy option, well-suited for organizations that handle sensitive data. 

GPT-4

Last but definitely not least is the successor of the famous GPT-3 language model: GPT-4. Launched on March 14, 2023, the model’s purpose remains the same as its predecessor’s, while performance and capability have improved significantly. Its development reportedly cost around $100 million, and it is, as of this writing, the most advanced language model from OpenAI.


Unfortunately, there is not much to say about the development of GPT-4, since OpenAI didn’t provide detailed information on it. According to the technical report that accompanied the launch of GPT-4, “the competitive landscape and the safety implications of large-scale models” were the reasons why technical details like model size, architecture, and the hardware used during training were excluded. The report does state that the model was pre-trained to predict the next token on a giant dataset and then fine-tuned using reinforcement learning with both AI and human feedback.

GPT-4 can be applied to a variety of tasks, such as:

  • Financial transaction fraud: By analyzing financial data, GPT-4 is able to identify patterns and detect fraudulent activity.

  • Content creation: From blog posts to books - GPT-4 can create high-quality content which is almost identical to human-written text.

  • Providing programming code: GPT-4 can provide coding assistance with more accurate suggestions and is able to write large chunks of code. 

  • Advanced virtual assistants & chatbots: Like GPT-3, GPT-4 is able to understand natural language. Thanks to its improved performance and speed, it can be used to create more sophisticated AI-powered chatbots that respond to user queries more accurately. These can be used in industries such as customer service and education.

  • Image recognition: Because of its ability to recognize images, GPT-4 has the capability to enhance visual accessibility. 

  • Research: GPT-4 can be used to review large volumes of academic literature, making it useful for researchers to quickly summarize trends and key findings. 


Unlike GPT-3, GPT-4 can work with both text and images. Thanks to its larger scale (OpenAI has not disclosed the exact parameter count), it has an enhanced ability to understand and generate natural language. GPT-4 provides an advanced understanding of context, syntax, and semantics, enabling it to give more coherent and appropriate responses. The larger scale also leads to higher fidelity and fewer errors. 

New developments and the future of NLP and language models: an outlook

The evolution of language models and NLP is rapid and shows no signs of stopping, with significant improvements and new developments on the horizon. In the following, I’m going to provide an outlook on the NLP market situation as well as the opportunities and challenges NLP will face in the coming years. 

The situation of the NLP market

The NLP market will continue to grow at a compound annual growth rate (CAGR) of around 25%, according to a MarketsandMarkets report. The report states that the NLP market, valued at approximately $16 billion in 2022, is expected to reach around $50 billion by 2027. This impressive CAGR is mainly due to three factors: 
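The quoted figures are internally consistent, which a quick calculation with the report’s rounded numbers confirms:

```python
# Implied compound annual growth rate from $16B (2022) to $50B (2027).
start, end, years = 16, 50, 5
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # ≈ 25.6%, i.e. "around 25%"

# Forward check: $16B compounded at 25% per year for 5 years.
projected = start * 1.25 ** years
print(f"${projected:.1f}B")  # ≈ $48.8B, close to the projected $50B
```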

Progress in machine learning technologies 

The model size of NLP systems is constantly increasing because AI chip makers are designing processors that can handle more parameters. The more powerful the chips, the more human-like the interactions NLP models can carry out. In 2018, for example, models like ELMo had roughly 100 million parameters. Today’s models go far beyond that: one of the largest, Megatron-Turing NLG, has 530 billion parameters. 

Improved availability and quality of data

Another factor improving the ability of NLP systems is the exponentially growing availability of data, which you can see in the graphic. To use that data properly, labeling tools help enhance the quality of training data. The improvement of text annotation tools and audio annotation tools is another reason for the growth of the NLP market.

Customer expectations

Accenture research found that 75% of all CEOs want to change how they manage customer relationships to keep up with changing consumer needs. Since customers expect fast interactions with brands, implementing NLP models will be a necessity for managing customer relationships.

The opportunities of NLP

As we saw above, natural language processing is a rapidly evolving and growing field, with more potential to be unleashed in the coming years. Below, I will outline three key opportunities and trends of NLP:

Sentiment analysis

There is a lot of potential regarding the improvement of sentiment analysis using NLP. Businesses could gain very valuable insights into customer sentiment by analyzing online content such as social media. They can then use that information to improve their services and products.

Machine translation

While machine translation is already widely used in different applications such as DeepL or Google Translate, there is still immense room for improvement. With NLP technology advancing in the future, we can expect to see even more accurate and differentiated translations.

Conversational AI

Conversational AI is the NLP application with the most potential. There are already virtual assistants like Siri or Alexa that can understand what we say, but not necessarily what we mean. For the future of NLP, we can expect that to change, resulting in more advanced and personalized conversational systems that also understand the meaning behind natural language - and thus give more accurate and relevant responses. 

The challenges NLP is facing and will face in the future

While the opportunities of NLP are impressive and point to a bright future, the challenges that come with them need to be considered as well. Alongside this massive development, NLP faces major challenges:

Data privacy

As mentioned above, natural language processing systems need large amounts of data for proper training, which is key to improving these models. However, the collection and use of this data raises genuine and justified privacy concerns. The focus on data privacy therefore needs to increase, both by introducing new regulations and by developing practices that protect users’ data.

Explainability

The increasing complexity of NLP models makes it harder to understand how they arrive at their decisions. Developers need to focus on creating models that are interpretable, so that the path by which a model reaches a certain decision becomes clear.

Bias

Furthermore, one of natural language processing’s great challenges is the bias of its models. If models behave unfairly or are biased toward certain demographics or groups, this can lead to discriminatory outcomes. To prevent that, NLP developers need to focus on creating fair and unbiased models.

Conclusion and opinion

As the comparison shows, the different language models are all basically suited for the same applications; the real differences lie in performance, efficiency, accuracy, and quality of responses. Newer models like GPT-4 perform far better than older ones, thanks to new technologies that improve training quality and allow significantly more data to be used for training. While earlier models contributed greatly to the development of language models, newer models promise even more advanced comprehension and generation abilities, illustrating the rapid progress of the natural language processing (NLP) field. 

Nevertheless, all these breakthroughs also highlight the crucial challenges we face: ethical considerations, data privacy, and the explainability of language models are topics that always need to be carefully considered and respected. The huge positive potential of AI, NLP, and language models is difficult to assess - as is the potential for negative impact.