📖 AI Glossary

50 essential terms explained simply, with concrete examples. Looking for a word? Use the search below.

50 terms

📚Fundamentals(6)

IA (Intelligence Artificielle)

📚Fundamentals

Field of computer science creating systems that perform tasks normally requiring human intelligence: image recognition, language understanding, decision making.

Example : ChatGPT, face recognition on your phone, Netflix recommendations.
Read more :cest-quoi-un-llm

LLM (Large Language Model)

📚Fundamentals

AI model trained on massive amounts of text to understand and generate natural language. ChatGPT, Claude and Gemini are LLMs.

Example : When you ask ChatGPT a question, the LLM predicts the next words based on its billions of parameters.

AGI (Artificial General Intelligence)

📚Fundamentals

Hypothetical AI matching or surpassing humans on ALL cognitive tasks, not just specific ones. Does not exist yet in 2026.

Example : GPT-5 is great but not AGI: it can write essays but cannot truly understand real-world physics.

Machine Learning (ML)

📚Fundamentals

Subfield of AI where machines learn from data without being explicitly programmed for each task.

Example : A spam filter learns what spam looks like from thousands of examples, rather than fixed rules.

Deep Learning

📚Fundamentals

Type of machine learning using deep neural networks (many layers). Behind recent AI breakthroughs.

Example : LLMs, image recognition and machine translation all rely on deep learning.

IA Générative

📚Fundamentals

AI capable of creating new content: text, images, video, code, audio. Includes ChatGPT, Midjourney, Sora.

Example : Asking Midjourney to generate a unicorn-in-space image = generative AI.

🤖Models(8)

GPT (Generative Pre-trained Transformer)

🤖Models

Family of AI models from OpenAI. GPT-3.5 popularized ChatGPT in 2022, GPT-5 is the 2026 standard.

Claude

🤖Models

AI model family from Anthropic, known for quality writing, excellent code and enhanced safety.

Gemini

🤖Models

AI model family from Google, natively multimodal, with ultra-long context (up to 1M tokens).

Modèle open-source

🤖Models

Model whose weights are public and downloadable. Llama (Meta), Mistral, DeepSeek are open-source.

Read more :open-vs-proprio

Multimodal

🤖Models

Model handling multiple data types: text, image, audio, video. GPT-5 and Gemini are multimodal.

Context window (fenêtre de contexte)

🤖Models

Maximum amount of text an LLM can "see" at once. Measured in tokens. GPT-5 = 400K, Claude = 500K, Gemini = 1M.

Example : With 1M tokens, Gemini can analyze an entire book, or several.

Token

🤖Models

Basic unit of an LLM. One token ≈ 4 characters in English, 3 in French. "Hello" = 1 token.

Example : GPT-5 bills per token. 1M input tokens = ~750K English words.

Paramètres (parameters)

🤖Models

Internal variables learned by a model. More parameters = more expressive (but also more expensive).

Example : GPT-3 = 175 billion parameters. GPT-5 = ~1 trillion (estimated).

🧠Training(9)

Training (entraînement)

🧠Training

Phase where a model "learns" by analyzing billions of examples. Very expensive (millions of dollars for big models).

Example : GPT-4 reportedly cost ~$100M to train.

Fine-tuning

🧠Training

Re-training an existing model on your own data to specialize it for your domain.

Example : Fine-tuning GPT-5 on your customer emails so it writes in your style.

RLHF (Reinforcement Learning from Human Feedback)

🧠Training

Training technique where humans rate model responses to align it with their preferences.

Example : RLHF transformed GPT-3 (sometimes rude) into ChatGPT (polite and helpful).

Embeddings

🧠Training

Numerical representation of text as a vector. Enables measuring similarity between texts.

Example : Embeddings of "cat" and "dog" are close. Those of "cat" and "car" are far. Foundation of RAG.

Vector database

🧠Training

Database specialized in storing and searching embeddings. Pinecone, Qdrant, pgvector.

Transformer

🧠Training

Neural architecture invented by Google in 2017. Foundation of all modern LLMs (GPT, Claude, Gemini).

Attention mechanism

🧠Training

Core mechanism of transformers: lets the model "look at" all words in context simultaneously.

Pre-training

🧠Training

First training phase where the model learns language basics on terabytes of text.

Inference

🧠Training

Phase where the trained model is used to produce responses. Much cheaper than training.

🎯Usage(10)

Few-shot learning

🎯Usage

Technique of providing a few examples in the prompt to guide the LLM on the expected format.

Example : "Here are 3 examples of emails I liked: [...]. Now write a similar one for [...]"

Chain-of-thought (CoT)

🎯Usage

Prompting technique asking the LLM to "reason step by step" before answering. Greatly improves results on complex problems.

Example : Instead of "What is 23 × 17?", try "Calculate 23 × 17 step by step."

System prompt

🎯Usage

Hidden instruction given to the LLM before the conversation to set its role, tone and limits.

Example : "You are Claude, made by Anthropic. Be helpful and precise." is an example of system prompt.

RAG (Retrieval-Augmented Generation)

🎯Usage

Technique combining LLM + external knowledge base. The LLM "retrieves" relevant info before answering.

Example : A company chatbot answering with internal docs uses RAG.
Read more :rag

Agent IA

🎯Usage

AI capable of executing actions autonomously (browsing, sending emails, making purchases), not just answering questions.

MCP (Model Context Protocol)

🎯Usage

Standard protocol created by Anthropic in 2024 to let LLMs connect to external tools (DBs, APIs, files).

Example : Like USB for computers: MCP standardizes how LLMs talk to tools.

Function calling

🎯Usage

Ability of an LLM to call external functions (send email, query DB) as needed.

Temperature

🎯Usage

Parameter controlling an LLM's creativity. 0 = deterministic and factual, 1 = creative and variable.

Example : For code generation: temperature 0. For story writing: temperature 0.8.

🛡️Security(8)

Hallucination

🛡️Security

When an LLM confidently makes up false information. Fundamental flaw not yet eliminated.

Example : Ask GPT-5 "Cite a 1980 Camus book" → it invents a plausible title (Camus died in 1960).

Prompt injection

🛡️Security

Attack where a user or document "hijacks" the LLM by giving it hidden instructions. #1 OWASP LLM flaw.

Jailbreak

🛡️Security

Technique to make an LLM say things normally forbidden (dangerous instructions, inappropriate content).

Example : The "DAN" jailbreak (Do Anything Now) made GPT believe it was another rule-free AI.

Red-teaming

🛡️Security

Practice of methodically attacking your own AI system to find flaws before attackers do.

Shadow AI

🛡️Security

Unauthorized use of personal AI tools (personal ChatGPT) with corporate data. Major enterprise risk.

Deepfake

🛡️Security

AI-generated synthetic video, image or audio, imitating a real person very realistically.

AI Act (UE)

🛡️Security

European AI regulation, effective since 2024. Classifies AI by risk level and imposes obligations.

RGPD & IA

🛡️Security

GDPR applies to data used by AI. You're responsible for client data sent to ChatGPT.

⚙️Tech(9)

API (Application Programming Interface)

⚙️Tech

Interface allowing a program to use an AI service. OpenAI's API lets you integrate GPT into your apps.

GPU (Graphics Processing Unit)

⚙️Tech

Graphics processor. Massively used for AI as it does many parallel calculations. Nvidia dominates the market.

TPU (Tensor Processing Unit)

⚙️Tech

AI-specialized chip designed by Google. Alternative to Nvidia GPUs for training.

Quantization

⚙️Tech

Technique to reduce a model's size (and run it on weaker machines) by sacrificing some precision.

Example : 4-bit quantization lets Llama 70B run on a MacBook Pro M3.

Self-hosted

⚙️Tech

Running an AI model on your own servers (instead of cloud API). More control, more complexity.

Ollama

⚙️Tech

Popular tool to easily run open-source LLMs (Llama, Mistral, Gemma) on your computer.

Benchmark

⚙️Tech

Standardized test to measure model performance. Examples: MMLU (knowledge), SWE-Bench (code), HumanEval.

Hugging Face

⚙️Tech

Reference platform for sharing open-source AI models. The "GitHub of AI".

Mixture of Experts (MoE)

⚙️Tech

Architecture where the model contains several "experts" and only activates relevant ones per query. More efficient.

Example : Mixtral 8x7B uses MoE: 8 experts of 7B parameters, only 2 active at a time.