How Does ChatGPT Actually Do Its Job?

Let’s explore how the super cool AI chatbot, ChatGPT, actually works. If you’re curious about its clever AI tricks, keep reading!


Google, Wolfram Alpha, and ChatGPT work through a text box and give you written answers. Google shows web pages, Wolfram Alpha helps with math, and ChatGPT understands what you want and can even write stories or code.

Google is good at finding things in its vast index, Wolfram Alpha is good at math, and ChatGPT is good at explaining things in natural language, though its training data only runs up to 2021.

I also used ChatGPT to help with this information by asking it questions. Some of its answers are included here.

The primary stages of ChatGPT's operation

Let’s use Google as a comparison. When you ask Google to find something, it doesn’t search the entire internet on the spot. Instead, it looks in its own database for web pages that match your request. Google has two main steps: first, it crawls the web ahead of time to build an index, and then it searches that index when you type a query.
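The two-step pattern is easy to see in code. This is a toy sketch, not how Google actually works: the documents and query below are made-up examples, and a real index covers billions of pages.

```python
# Toy sketch of the two-step search pattern: (1) index documents
# ahead of time, (2) answer queries from the index.

def build_index(docs):
    """Map each word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query word."""
    results = None
    for word in query.lower().split():
        hits = index.get(word, set())
        results = hits if results is None else results & hits
    return results or set()

docs = {
    1: "how neural networks learn",
    2: "cooking pasta at home",
    3: "how transformers process language",
}
index = build_index(docs)             # step 1: gather data in advance
print(search(index, "how language"))  # step 2: answer the query -> {3}
```

The expensive work happens in step 1, long before you ever type anything, which is exactly the split ChatGPT makes between pre-training and inference.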

ChatGPT works in a similar way. The first step, collecting data, is called pre-training. The second step, when it responds to you, is called inference. The cool thing about generative AI like ChatGPT is that pre-training has become much better and bigger, thanks to new technology and cloud computing. This is why it’s become so popular.

How pre-training the AI works

In the world of AI, there are two main ways to train models: supervised and unsupervised. Up until the latest generation of AI like ChatGPT, most projects used the supervised approach.

Supervised training is like teaching with a textbook. The AI is trained using a dataset where each input has a matching output. For example, if you want to teach an AI about customer service, you’d show it questions like, “How can I reset my password?” and the corresponding answers from a customer service rep.

In supervised training, the AI learns to match inputs to outputs. It’s good for tasks like sorting or making predictions, but it has limits: humans have to anticipate every possible question and write every answer, which is slow and caps the model’s knowledge at whatever its trainers thought to include.
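Here is a minimal sketch of that idea. The Q&A pairs are invented examples; a real supervised system would use a trained classifier rather than this crude word-overlap matching, but the limitation is the same: it can only answer questions close to ones a human wrote down.

```python
# Supervised training in miniature: every input comes paired with a
# human-written output. (The pairs below are made-up examples.)

training_pairs = {
    "how can i reset my password": "Click 'Forgot password' on the login page.",
    "how do i cancel my subscription": "Go to Settings > Billing > Cancel.",
    "where is my invoice": "Invoices are under Settings > Billing > History.",
}

def answer(question):
    """Pick the stored answer whose question shares the most words."""
    words = set(question.lower().replace("?", "").split())
    best = max(training_pairs, key=lambda q: len(words & set(q.split())))
    return training_pairs[best]

print(answer("How can I reset my password?"))
# -> Click 'Forgot password' on the login page.
```

Ask it anything outside those three topics and it has nothing sensible to say, which is why this approach doesn’t scale to open-ended conversation.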

Now, ChatGPT does things differently. It uses unsupervised training, which is a game changer. In unsupervised training, the AI learns from data without specific matching inputs and outputs. It figures out the patterns and structure in the data without a specific job in mind. This is great for understanding language and generating meaningful text in conversations.

This is how ChatGPT seems to know so much. Developers don’t need to know all the answers in advance. They simply feed vast amounts of text into ChatGPT’s pre-training process, a technique called transformer-based language modeling, and the model learns from it. This is what makes ChatGPT so powerful.
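To make “learning from raw text with no labels” concrete, here is a toy bigram language model. It is a drastic simplification of what ChatGPT does (a real model predicts tokens with a huge neural network, not a lookup table), and the tiny corpus is invented, but it shows the key point: the data itself supplies the training signal.

```python
# Unsupervised language modelling in miniature: no labelled answers,
# just raw text. The model counts which word tends to follow which,
# then uses those counts to generate new text.
import random
from collections import defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

# "Training": learn next-word patterns from the data itself.
bigrams = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    bigrams[current].append(nxt)

def generate(start, length=6, seed=0):
    """Continue a sentence by sampling likely next words."""
    random.seed(seed)
    words = [start]
    for _ in range(length):
        followers = bigrams.get(words[-1])
        if not followers:
            break
        words.append(random.choice(followers))
    return " ".join(words)

print(generate("the"))
```

Every sentence it produces is stitched together from patterns it found on its own; nobody wrote out input–output pairs. Scale the corpus up to much of the internet and the lookup table up to billions of learned parameters, and you have the intuition behind ChatGPT’s pre-training.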

Transformer architecture

The transformer architecture is a type of neural network used for handling natural language data. Think of it like a team of players passing a hockey puck to score a goal, with each player having a specific role. Similarly, a neural network processes information through interconnected nodes, mimicking how our brains work.

The transformer processes word sequences using “self-attention.” This means it looks at all the words in a sequence to understand the context and word relationships, much like how a reader might refer back to previous sentences in a book for context.

The transformer consists of layers, mainly the self-attention and feedforward layers. The self-attention layer evaluates word importance, while the feedforward layer makes data transformations. These layers help the transformer learn word relationships.
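The self-attention calculation itself is surprisingly compact. Below is a stripped-down sketch: the learned query/key/value projections of a real transformer are omitted to keep the arithmetic visible, and the two-dimensional “embeddings” are made-up toy values.

```python
# Simplified self-attention: each word's new representation is a
# weighted mix of every word in the sequence, with weights derived
# from similarity scores (scaled dot products).
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    dim = len(vectors[0])
    output = []
    for query in vectors:
        # score the query word against every word in the sequence
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)  # weights sum to 1
        # mix all word vectors according to the attention weights
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(dim)]
        output.append(mixed)
    return output

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy word embeddings
for row in self_attention(tokens):
    print([round(x, 2) for x in row])
```

Each output row still describes one word, but now blended with context from all the others, which is the “looking back at the whole sentence” behaviour described above.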

During training, the transformer takes input (like a sentence) and predicts an output. It learns by adjusting its predictions based on how close they are to the real output. This helps it understand word context and relationships, making it great for tasks like language translation.

However, these models can produce biased or harmful content because they learn from the data they’re trained on. Companies try to add safeguards, but it’s tricky because different people have different views, making universal chatbot design challenging given society’s complexity.

Now, let’s dive into ChatGPT’s data input and its interaction with users and natural language.

ChatGPT's training datasets

ChatGPT was trained on a massive dataset. It’s based on the GPT-3 architecture, but there’s a difference between the free version and ChatGPT Plus, which costs £20/month: the free version uses the GPT-3 model, while ChatGPT Plus subscribers get access to the more extensive GPT-4.

Now, let’s break down GPT: it’s generative (creates stuff), pre-trained (uses lots of data), and uses the transformer architecture (weighs text to understand context).

GPT-3 was trained on WebText2, a massive 45-terabyte library of text data. To put that in perspective, text is far more compact than pictures or video, so 45 terabytes represents an enormous amount of written material.

This huge dataset helped ChatGPT learn language patterns and relationships on an unprecedented scale, making it great at giving relevant responses.

ChatGPT, though based on GPT-3, was fine-tuned on different data for conversation. OpenAI made a dataset called Persona-Chat, with 160,000 dialogues, each with unique personalities. This helped ChatGPT give personalised responses.

Besides Persona-Chat, ChatGPT learned from other conversational datasets like movie scripts, tech support chats, and daily life conversations.

In addition, ChatGPT read a lot of internet text, like websites and books, to understand language’s structure and patterns in general.

Despite being built on GPT-3, ChatGPT is reported to be a distinct, smaller model, with 1.5 billion parameters compared with GPT-3’s 175 billion.

Overall, ChatGPT learned from conversation data to generate natural responses. It used unsupervised training, meaning it processed lots of data and found patterns by itself.

But understanding and answering questions come in the inference phase, which involves natural language processing and dialog management.

What about human involvement in pre-training?

While unsupervised pre-training plays a big role in ChatGPT’s development, there’s evidence to suggest human assistance was involved in preparing it for public use.

TIME Magazine reported that human “data labelers” in Kenya were paid between £1.32 and £2 per hour to review and flag disturbing and explicit internet content for ChatGPT’s training.

Additionally, an AI newsletter called Martechpost mentioned that ChatGPT used a process called Reinforcement Learning from Human Feedback (RLHF) during its training. In this process, human trainers acted as both users and AI assistants to fine-tune the initial model using supervised learning.

ChatGPT itself clarified that it underwent pre-training with a combination of unsupervised and supervised learning techniques, like language modeling and sequence prediction, using vast amounts of internet text data. After pre-training, reinforcement learning with human feedback was applied to improve performance in specific tasks or domains, with humans providing rewards or penalties for feedback.
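The reward-and-penalty idea behind RLHF can be sketched in a few lines. This is only an illustration of the principle: a real system trains a separate reward model from human rankings and optimises a large neural network, whereas this toy “policy” just shifts probability toward replies humans rated highly. The replies and ratings are invented.

```python
# Toy sketch of reinforcement learning from human feedback: human
# ratings act as rewards that reshape which reply the model prefers.

replies = ["Sure, here's how to do that safely.",
           "Figure it out yourself."]
weights = [1.0, 1.0]            # start with no preference
human_reward = [+1.0, -1.0]     # feedback: first reply preferred

lr = 0.5
for _ in range(10):             # repeated rounds of feedback
    for i, r in enumerate(human_reward):
        weights[i] = max(0.01, weights[i] + lr * r)

probs = [w / sum(weights) for w in weights]
print(probs)  # nearly all probability on the helpful reply
```

After a few rounds of feedback, the penalised reply is almost never chosen, which is the behaviour-shaping effect the human trainers were after.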

It appears that initial pre-training was non-supervised, allowing ChatGPT to learn from a massive amount of data. However, when developing dialog responses and filtering out inappropriate content, human assistance may have been involved.

We reached out to OpenAI for clarification but hadn’t received a response at the time of writing. If OpenAI provides more information, this article will be updated accordingly.

Natural language processing

Natural Language Processing (NLP) is all about making computers understand, interpret, and talk like humans. In today’s data-driven world, NLP is super important for businesses.

NLP can be used for lots of things, like figuring out if people are happy or sad, making chatbots, listening to spoken words, and translating languages. Businesses use NLP to save time, offer better customer service, and learn from what people say online.

But NLP isn’t a walk in the park. Human language can be tricky and confusing. NLP needs to learn from loads of data to understand the little details of how we talk. It also has to keep learning as language changes.

How does it work? Well, NLP takes big chunks of text, like sentences, and breaks them into smaller pieces (tokens). Then it analyses those pieces and how they connect to make sense of things, using statistical models and machine learning.
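The “break text into pieces” step looks something like this. It’s a deliberately crude sketch: real NLP systems use far more sophisticated tokenisers and run their statistics over millions of documents, and the example sentence is made up.

```python
# Tokenisation plus a simple count, a stand-in for the statistical
# analysis real NLP pipelines run at much larger scale.
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

sentence = "Customers love fast replies, and fast replies build trust."
tokens = tokenize(sentence)
counts = Counter(tokens)

print(tokens[:4])      # ['customers', 'love', 'fast', 'replies']
print(counts["fast"])  # 2
```

From counts and co-occurrences like these, larger systems work out which words tend to appear together and in what roles, which is the raw material for understanding meaning.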

So, NLP is like the language guru of the computer world, helping businesses talk and understand their customers better.

Dialogue management

You might have noticed that ChatGPT is pretty good at asking follow-up questions to understand you better and give personalised responses. It’s like having a real conversation with a computer!

This magic happens because of algorithms and machine learning. ChatGPT uses them to grasp what’s going on in the conversation and keep it going smoothly over many exchanges with you.
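A toy dialogue manager shows the core trick: keep state between turns so later replies can use earlier facts. The rule-based replies below are invented for illustration; ChatGPT does this with learned models and a long context window, not hand-written rules.

```python
# Minimal illustration of dialogue state: the manager remembers
# details from earlier turns and reuses them later.

class DialogueManager:
    def __init__(self):
        self.history = []   # everything said so far
        self.facts = {}     # extracted details, e.g. the user's name

    def respond(self, user_text):
        self.history.append(user_text)
        if user_text.lower().startswith("my name is "):
            self.facts["name"] = user_text[11:].strip(". ")
            return f"Nice to meet you, {self.facts['name']}!"
        if "my name" in user_text.lower() and "name" in self.facts:
            return f"You told me your name is {self.facts['name']}."
        return "Tell me more."

dm = DialogueManager()
print(dm.respond("My name is Priya."))         # Nice to meet you, Priya!
print(dm.respond("Do you remember my name?"))  # You told me your name is Priya.
```

Without that carried-over state, every exchange would start from scratch, and the conversation would feel exactly like talking to a robot.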

This is a big deal in the world of natural language processing because it makes computer programs feel more like talking to a person and less like talking to a robot. It builds trust and keeps you engaged, which is great for both you and the folks using the program.

But here’s the thing: While it’s awesome for building trust, it’s also a bit tricky because it’s a way AI can influence people. So, there’s a balance to strike between building trust and being cautious about manipulation.

A look inside the hardware that runs ChatGPT

Microsoft has put out a video that dives into how Azure, their cloud platform, plays a crucial role in running all the computing and storage needed for ChatGPT. It’s a pretty interesting watch, shedding light on both the technology behind Azure and the hardware that makes ChatGPT work.
