How do chatbots like ChatGPT get their answers? Look under the hood

Chatbots are taking over the world. They impress with their versatility and well-crafted sentences, but also surprise with their quirky mistakes. How do ChatGPT, Google Bard, and their counterparts arrive at their answers?

Lawrence Verhagen

It seems to happen effortlessly. Just a few seconds after the (very complicated) question, programs like Bing Chat (from Microsoft), Bard (from Google), and ChatGPT (from OpenAI) deliver their answers. The question: “Write a short dialogue (seven questions and seven answers) in the style of Plato on whether artificial intelligence can have consciousness.” The Greek philosopher (427-347 BC) distinguished himself stylistically from his predecessors by writing dialogues in which he staged a philosophical conversation with an opponent, usually with his teacher Socrates in the leading role.

All the chatbots complete the task successfully in terms of content: they capture the essence of both the Socratic dialogue and the contemporary debate on AI and consciousness in readable, thought-provoking sentences.

Part of a conversation via Google Bard

Socrates: “What would it take to prove that artificial intelligence can have consciousness?”

Glaucon: “I’m not sure. Maybe if it passes the Turing test?”

Socrates: “The Turing test is not a perfect measure of consciousness. It can be fooled by machines that are not really conscious.”

Glaucon: “I know, but it’s the best we have.”

Socrates: “Maybe, but I think we should be careful with anthropomorphism. Just because an AI can do something that humans can do does not mean that the AI is conscious the way we are.”

At the same time, they make strange mistakes, mistakes an elementary school student would never make. Seven questions and answers was the assignment, but this doesn’t always work out: Bard dodges the count a bit, while Bing already calls it quits after five short exchanges. The following dialogue is even stranger. Bing apologizes and tries again several times, but simply cannot get the count right.

Bing apologizes

Bing: “What do you think of this dialogue?”

Editors: “Okay, but how many Q&A?”

Bing: “Seven. Is there anything else I can help you with?”

Editors: “Are you sure? Count again.”

Bing: “Sorry for the confusion. There are six.”

Editors: “Six? Really?”

Bing: “Sorry for the mistake. There are seven.”

Editors: “Seven again? We agreed it wasn’t seven, right?”

Bing: “Sorry for the confusion. You’re right: there are six.”

It is a pattern you see again and again: great performances go hand in hand with hard-to-understand blunders. In this context the term stochastic parrot is often used: the so-called large language models behind these chatbots have no understanding of the world, but do nothing more than predict words and imitate the myriad texts on which they have been trained.


Unlike search engines, ChatGPT (Generative Pretrained Transformer) and its kind are not designed to provide correct answers, but to produce natural language by recognizing patterns in text. If the content happens to be correct as well, which fortunately it usually is, that is a welcome bonus. This leads to real feats: the latest generation of chatbots improves computer code, summarizes complex scientific studies and can discuss every conceivable topic. And they keep getting better. The previous version of ChatGPT failed the US bar exam; its successor (for enthusiasts: GPT-4) scored above average.

The rapid progress has the world of artificial intelligence divided. Are chatbots perhaps more than stochastic parrots, as some experts argue? Jelle Zuidema, assistant professor of natural language processing at the University of Amsterdam, is not too happy with the parrot metaphor: “It suggests that these models essentially copy and paste, while we also see in smaller models that they can actually create new combinations of existing elements.”

According to Zuidema, the evidence is “overwhelming” that GPT-3 (the language model behind ChatGPT) and other LLMs (large language models) have also learned many abstract patterns of story and dialogue structure. In other words, the parrot metaphor grossly underestimates the capabilities of modern language models.

At the same time, Zuidema cautions against overestimation based on these skills: “You have to be careful with claims about human cognitive abilities.” Humans are quick to label computers “intelligent,” but these AI systems work very differently from our biological brain. ChatGPT often arrives at its impressive output through statistical correlations, with the result that even GPT-4 can still produce complete nonsense.

How is that possible? How do modern chatbots arrive at their answers? At its core, the GPT (Generative Pretrained Transformer) model behind ChatGPT is trained to recognize patterns in language and use them to make predictions. Think of the predictive text input on smartphone keyboards: based on the previous words, the program predicts the next one. After “I” follows “have,” then “a” and “question.” But if you keep tapping the suggested word every time, you end up with very strange sentences.
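To make that keyboard analogy concrete, here is a minimal, illustrative Python sketch of a so-called bigram model that always suggests the most frequent follow-up word. The tiny corpus and the function are made up for this example; this is not how GPT itself works, only the simplest form of the same idea.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the text a keyboard app learns from.
corpus = "i have a question . i have a dream . i have a cat ."

# Count which word follows which (a bigram model).
counts = defaultdict(Counter)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    counts[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, like a keyboard suggestion."""
    followers = counts[word]
    return followers.most_common(1)[0][0] if followers else None

# Always clicking the top suggestion quickly produces a repetitive, odd text.
word, sentence = "i", ["i"]
for _ in range(5):
    word = predict_next(word)
    sentence.append(word)
print(" ".join(sentence))  # e.g. "i have a question . i"
```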


ChatGPT is much smarter: it generates not only grammatically correct sentences and paragraphs, but also texts that take the context and the specific prompt into account. As a result, they read as if they were written by a human being.

Look under the hood in five steps.

Step 1. Collect

ChatGPT is trained on an enormous amount of digital text. Think of articles from major news sites, e-books, legal texts, forums and Wikipedia. In this way the base model learns the structure and use of language. Unfortunately, companies like OpenAI are not very open about the training data used for their language models. GPT-4, too, is a black box. A trade secret, says maker OpenAI, to the frustration of academics.
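As an illustration only, a hypothetical Python sketch of what assembling such a text collection could look like; the folder name and the filtering rule are invented, and real training pipelines are vastly larger and include deduplication and quality filtering.

```python
from pathlib import Path

# Hypothetical folder with scraped articles, e-books and forum posts as .txt files.
data_dir = Path("training_data")

corpus = []
for path in data_dir.glob("**/*.txt"):
    text = path.read_text(encoding="utf-8", errors="ignore")
    if len(text.split()) > 50:   # skip near-empty pages and navigation fragments
        corpus.append(text)

print(f"Collected {len(corpus)} documents, "
      f"{sum(len(t.split()) for t in corpus):,} words in total")
```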


Step 2. Convert

The next step: converting all that text into so-called “tokens,” small building blocks. These can be whole words, but also punctuation marks or suffixes such as the “-er” in “walker.” Each token is assigned a unique series of numbers that the neural network can then compute with.
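A toy sketch, with a hand-made vocabulary, of what such tokenization amounts to; real systems use learned subword vocabularies of tens of thousands of tokens, but the principle of turning text into numbers is the same.

```python
import re

# A toy vocabulary mapping each token (word, punctuation mark or suffix) to a number.
vocab = {"the": 0, "walk": 1, "##er": 2, "is": 3, "fast": 4, ".": 5}

def tokenize(text):
    """Split text into known tokens and return their numbers."""
    tokens = []
    for word in re.findall(r"[a-z]+|\.", text.lower()):
        if word in vocab:
            tokens.append(vocab[word])
        elif word.endswith("er") and word[:-2] in vocab:
            tokens.append(vocab[word[:-2]])   # stem, e.g. "walk"
            tokens.append(vocab["##er"])      # suffix, e.g. "-er"
    return tokens

print(tokenize("The walker is fast."))  # [0, 1, 2, 3, 4, 5]
```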

You can think of such a series of numbers as coordinates in a multidimensional space: each token has its own place there, close to tokens with a similar meaning. These are the so-called word embeddings. “Large” (or rather, the series of numbers associated with it), for example, sits close to “big,” “huge” and “massive.” And “Biden” sits close to “president,” “politician” and “Trump.”

Words that are often used in similar contexts thus end up with embeddings close to one another. This means that the model learns, purely from how words are used, that certain words have similar meanings. It also picks up other kinds of relationships between words: for example, that the relationship between “king” and “queen” resembles the relationship between “man” and “woman,” as the sketch below illustrates.
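A small Python illustration with made-up, hand-picked vectors; real embeddings have hundreds of dimensions and are learned from data, but the idea of measuring closeness and relationships is the same.

```python
import numpy as np

# Made-up 3-dimensional embeddings, chosen by hand purely for illustration.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.8, 0.9]),
    "man":   np.array([0.1, 0.2, 0.1]),
    "woman": np.array([0.1, 0.2, 0.9]),
}

def cosine(a, b):
    """Similarity between two vectors: 1.0 means 'pointing the same way'."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words used in similar contexts end up close together...
print(cosine(embeddings["king"], embeddings["queen"]))   # fairly high

# ...and relationships line up: king - man + woman lands near queen.
result = embeddings["king"] - embeddings["man"] + embeddings["woman"]
print(cosine(result, embeddings["queen"]))                # close to 1.0
```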


Step 3. Train

Now the language model tests itself, as it were: it takes a piece of text and removes words from it. Then it predicts which words should logically follow and compares that prediction with the original text. Depending on how good or bad the prediction is, the algorithm adjusts the weights in the language model (the more likely a word is to follow, the higher its weight). This is how the system learns.

GPT-3 has 175 billion of these weights, and therefore needs hundreds of billions of words to learn from, plus a supercomputer to carry out this self-test hundreds of billions of times. Those 175 billion weights are organized in a structure called the transformer (the T in GPT). Transformers, provided they are large enough and have seen enough data, turn out to be very good at identifying the most important parts of a sentence in order to “grasp” its meaning.
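To give a feel for that self-test, here is a deliberately tiny sketch: one word of context, four possible next words, and a handful of weights that get nudged after every wrong prediction. Real training does this same kind of adjustment, but over billions of weights and with a far more complex network; the vocabulary and numbers here are invented.

```python
import numpy as np

# Tiny vocabulary and one training pair: after "have", the next word should be "a".
vocab = ["i", "have", "a", "question"]
context, target = vocab.index("have"), vocab.index("a")

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 4))   # the "weights": real GPT-3 has 175 billion

def predict(word_id):
    """Turn raw scores for the next word into probabilities (softmax)."""
    scores = W[word_id]
    e = np.exp(scores - scores.max())
    return e / e.sum()

for step in range(100):
    probs = predict(context)
    # Compare prediction with the word that actually followed, and nudge the
    # weights so the correct word becomes a little more likely next time.
    grad = probs.copy()
    grad[target] -= 1.0
    W[context] -= 0.5 * grad

print(predict(context).round(2))  # probability of "a" after "have" is now close to 1
```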

Finally, OpenAI adds an extra layer of training to the whole process by having humans look over the model’s shoulder. This is the modern assembly line: is the answer correct, unclear, or perhaps complete nonsense? Human feedback improves the system.


Step 4. Predict

During training the model learns to recognize patterns in language and uses those patterns to predict what is likely to follow a given piece of text. There is no large database the system consults to look up every word. No, the language model actually acts as a kind of supercalculator that approximates words mathematically.
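A minimal illustration of that supercalculator idea: the candidate words and scores below are made up, but they show how raw numbers are turned into a probability for every possible next word, without any database lookup.

```python
import numpy as np

# Made-up scores such a network might compute for four candidate next words
# after the prompt "I have a". In reality the numbers come out of a huge
# chain of multiplications and additions over the model's weights.
candidates = ["question", "dream", "cat", "the"]
scores = np.array([2.0, 1.2, 0.8, -4.0])

# Softmax turns the scores into probabilities that sum to 1.
probs = np.exp(scores) / np.exp(scores).sum()
for word, p in zip(candidates, probs):
    print(f"{word:<9} {p:.1%}")   # "the" is grammatically unlikely and gets ~0%
```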


Step 5. Respond

If a user then asks a question or gives an instruction, ChatGPT uses all of its training to generate the best possible answer. The model predicts which words or sentences are most likely to follow the given instruction and generates a response accordingly. This is not simply a matter of always picking the highest probability. If, for example, you add to your instruction that ChatGPT may be more creative with its answer, that it has to rhyme, or that it should write in the style of a certain thinker, you force the model toward different word predictions. In a scary story, for example, the next word may be “scenario” one time and “mansion of mirrors” another. But a word like “printers” will not follow.
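A sketch of how such a creativity knob can work, using what is commonly called temperature sampling; the candidate words and their scores are invented for this example.

```python
import numpy as np

rng = np.random.default_rng()

# Made-up scores for the next word in a scary-story prompt.
candidates = ["scenario", "mansion", "shadow", "printer"]
scores = np.array([2.0, 1.6, 1.2, -6.0])

def sample_next(temperature):
    """Higher temperature flattens the distribution, making 'creative' picks likelier."""
    probs = np.exp(scores / temperature)
    probs /= probs.sum()
    return rng.choice(candidates, p=probs)

print([sample_next(0.2) for _ in range(5)])  # mostly "scenario"
print([sample_next(1.5) for _ in range(5)])  # mixes in "mansion" and "shadow"
# "printer" keeps a probability of virtually zero at any reasonable temperature.
```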

This process is very fast, but it happens word for word: the system is fed the newly chosen word each time and responds to it. The result is a text that has never been written before and whose exact source cannot be pinpointed. It often goes well, and sometimes the model completely misses the mark in terms of content. But experts agree on one thing: chatbots will only become more convincing.
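Finally, a toy illustration of that word-for-word loop, with a stand-in “model” that only knows a few fixed transitions; GPT does the same feeding-back, but each prediction is a full pass through billions of weights and takes the whole context into account.

```python
# Toy stand-in for the neural network: a fixed table of "next word" choices.
transitions = {
    "once": "upon", "upon": "a", "a": "dark", "dark": "castle",
    "castle": "appeared", "appeared": "<end>",
}

def next_word(context):
    """Predict the next word from the last word of the context (toy model)."""
    return transitions.get(context[-1], "<end>")

def generate(prompt, max_new_words=10):
    words = prompt.split()
    for _ in range(max_new_words):
        word = next_word(words)      # one prediction per word...
        if word == "<end>":
            break
        words.append(word)           # ...then feed the result back in
    return " ".join(words)

print(generate("once upon"))  # "once upon a dark castle appeared"
```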

