AI Glossary - Three Potato Productions

Essential terms for understanding the world of Artificial Intelligence.

A

Activation Function: A mathematical function that introduces non-linearity into a neural network, helping it learn complex patterns. It determines whether a neuron should be "fired" or activated, and to what extent.
Agent Frameworks: Software libraries or platforms (e.g., LangChain, AutoGen, CrewAI) that simplify the development of AI agents by providing pre-built components for planning, tool use, and memory management.
Agentic Behavior: The ability of an AI system to take proactive, goal-oriented actions in an environment with minimal human supervision. This goes beyond simple reactive responses to prompts. For example, an agentic system could autonomously book flights and reserve hotels based on a high-level user request like "plan my business trip to Tokyo."
AI Infrastructure: The hardware and software components required to develop, train, and deploy AI models. This includes specialized processors (like GPUs and TPUs), servers, data storage systems, and cloud-based platforms.
AI Proof of Concept (PoC): A small-scale project designed to test and demonstrate the feasibility and potential value of an AI solution before committing to a full-scale development effort.
AI Scalability: The ability of an AI system to handle increasing amounts of data, user traffic, and computational complexity without a significant decrease in performance or efficiency.
AI Tokenization: The process of breaking down a continuous sequence of text into smaller, meaningful units called tokens. These tokens are often words, subwords, or characters, which are then used as input for a model.
Arithmetic Intensity: A metric that measures the ratio of a program's arithmetic operations to the amount of data it moves to and from memory, commonly expressed as floating-point operations per byte. In machine learning, it indicates whether a workload is limited by compute or by memory bandwidth: high-intensity operations such as large matrix multiplications tend to be compute-bound, while low-intensity operations such as elementwise additions tend to be memory-bound.
Attention Mechanism: A technique that allows a neural network to focus on the most relevant parts of the input data when making a prediction. It is a core component of transformer models.
Auto-Regressive Models: A class of models that predict the next item in a sequence based on the items that have come before it. These models are fundamental to how large language models generate text.
Automated Machine Learning (AutoML): The process of automating the end-to-end pipeline of applying machine learning, from data preparation and feature engineering to model selection, training, and deployment.
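To make the Arithmetic Intensity entry in this section concrete, here is a minimal sketch that compares two operations. It assumes memory traffic is counted in bytes, that values are 4-byte floats, and that each array is read or written exactly once; the function names are hypothetical and chosen only for illustration.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Return arithmetic operations per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def matmul_intensity(m: int, k: int, n: int, bytes_per_element: int = 4) -> float:
    """Idealized intensity of an (m x k) @ (k x n) matrix multiply.

    Assumption: each matrix is read or written exactly once.
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return arithmetic_intensity(flops, bytes_moved)


def vector_add_intensity(n: int, bytes_per_element: int = 4) -> float:
    """Idealized intensity of elementwise c = a + b on length-n vectors."""
    flops = n  # one add per element
    bytes_moved = bytes_per_element * 3 * n  # read a, read b, write c
    return arithmetic_intensity(flops, bytes_moved)


if __name__ == "__main__":
    print(f"vector add:    {vector_add_intensity(1_000_000):.3f} FLOPs/byte")
    print(f"1024^3 matmul: {matmul_intensity(1024, 1024, 1024):.1f} FLOPs/byte")
```

Under these assumptions the matrix multiply performs far more arithmetic per byte moved than the vector addition, which is why the two behave so differently on the same hardware.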

B

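Backpropagation: The primary algorithm used to train neural networks. It applies the chain rule to calculate the gradient of the loss function with respect to each of the model's parameters, working backward from the output layer. An optimizer such as gradient descent then uses these gradients to adjust the weights and reduce the loss.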
Baseline Models: A simple, often non-AI, model or heuristic that provides a minimum performance target. It serves as a benchmark against which the performance of more complex and sophisticated models is measured.
Bias in AI: An algorithmic or systemic error in an AI model that results in unfair or skewed outcomes, often reflecting and amplifying biases present in the training data. This can lead to discrimination against certain groups. See also: Bias Mitigation
Bias Mitigation: A set of techniques and practices aimed at identifying, measuring, and reducing bias in AI systems. This includes methods like re-weighting training data or adjusting model outputs.
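One way to picture the re-weighting method mentioned under Bias Mitigation is inverse-frequency weighting, sketched below. This is only one of several possible schemes, and the function name and group labels are hypothetical.

```python
from collections import Counter


def group_balancing_weights(groups):
    """Return one weight per example so every group has equal total weight.

    weight = n_examples / (n_groups * group_count), so the weights sum to
    n_examples and under-represented groups receive larger weights.
    """
    counts = Counter(groups)
    n_examples = len(groups)
    n_groups = len(counts)
    return [n_examples / (n_groups * counts[g]) for g in groups]


if __name__ == "__main__":
    groups = ["a", "a", "a", "b"]  # hypothetical group label per training example
    print(group_balancing_weights(groups))
```

The resulting weights would typically be passed to a training routine that accepts per-example weights, so that each group contributes equally to the loss.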

C

Chain-of-Thought Prompting: A prompting technique that encourages an LLM to "think out loud" by breaking down a complex problem into a series of intermediate, logical steps before providing a final answer. This often leads to more accurate and reliable results.
CI/CD (Continuous Integration/Continuous Deployment) for Machine Learning: The application of Continuous Integration and Continuous Deployment principles to the machine learning lifecycle. It automates the testing, building, and deployment of ML models into production.
Classification Threshold: A specific probability value that a classification model uses to decide whether an instance belongs to a positive or negative class. For example, a threshold of $0.5$ means any prediction with a probability greater than $0.5$ is classified as positive.
Computer Vision: A field of AI that enables computers and systems to derive meaningful information from digital images, videos, and other visual inputs. It is the science of teaching computers to see.
Concept Drift: A phenomenon where the relationship between a model's input data and its target variable changes over time, causing a machine learning model's predictions to become less accurate. It differs from data drift, in which the distribution of the inputs themselves changes. This is common in real-world data streams. See also: Drift Monitoring
Context Window: The maximum amount of text, measured in tokens, that a large language model can consider at once to generate a response. The size of the context window is a key limitation for LLMs.
Continuous Validation: The ongoing process of monitoring and evaluating a deployed machine learning model's performance to ensure it remains accurate and relevant in a changing environment. This is closely related to drift monitoring.
Convolutional Neural Network (CNN): A class of deep neural networks, most commonly applied to analyzing visual imagery. They are designed to process pixel data and are highly effective for image recognition and classification.
Cross-Validation: A statistical technique for evaluating a model's performance by partitioning the dataset into multiple subsets. The model is trained on a portion of the data and validated on the remaining subset, and this process is repeated for robust evaluation.
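The partitioning described under Cross-Validation can be sketched as a k-fold split. This is a minimal illustration that works on index positions only and does not shuffle the data, which a real evaluation would usually do first; the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


if __name__ == "__main__":
    for train, validation in k_fold_indices(10, 3):
        print("train:", train, "validation:", validation)
```

Each sample appears in exactly one validation fold, so averaging a metric over the folds uses every data point for validation once.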

D

Data Augmentation: A set of techniques used to artificially increase the amount of data by creating new data from existing data. This helps to improve the robustness and generalization of a model.
Data Flywheel: A self-reinforcing cycle where a product's usage generates more data, which in turn improves the product, leading to more usage, creating a virtuous cycle of growth and improvement.
Data Ingestion: The process of importing, transferring, and loading raw data from various sources into a centralized storage system or database for subsequent processing and analysis.
Data Pipeline Automation: The automation of the sequence of steps for moving and transforming data from its source to its destination. This ensures data is clean, consistent, and ready for model training.
Deep Learning Pipelines: A series of automated steps for processing data and training deep learning models. These pipelines typically include data preparation, model training, hyperparameter tuning, and evaluation.
Diffusion Models: A class of generative AI models that create high-quality images, audio, or other data by learning to progressively reverse a process of adding noise. They have become a leading approach for image generation.
Drift Monitoring: The practice of continuously tracking and analyzing changes in the statistical properties of a model's input data or target variable to detect data drift or concept drift. This is a key part of continuous validation.
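As a rough illustration of Drift Monitoring, the sketch below compares the mean of a recent window of one numeric feature against a reference window. This is a deliberately crude heuristic, not a standard drift test; the function names and the threshold of 3.0 are assumptions chosen for the example.

```python
import statistics


def mean_shift_score(reference, current):
    """Shift of the current mean from the reference mean, in reference standard deviations."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    if ref_std == 0:
        raise ValueError("reference window has zero variance")
    return abs(statistics.fmean(current) - ref_mean) / ref_std


def has_drifted(reference, current, threshold=3.0):
    """Flag drift when the mean shift exceeds the (assumed) threshold."""
    return mean_shift_score(reference, current) > threshold


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical training-time values
    print(has_drifted(reference, [3.0, 3.0, 3.0]))
    print(has_drifted(reference, [10.0, 11.0, 12.0]))
```

Production systems generally track full distributions rather than a single summary statistic, but the structure is the same: keep a reference, compare incoming data against it, and alert on large differences.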

E

Enterprise Data Science: The application of data science principles and techniques within a business context to solve complex problems, improve operational efficiency, and drive strategic decision-making.
Excessive Agency: A risk in AI systems where the model takes actions or makes decisions that go beyond its intended scope or without proper human oversight, often leading to unintended negative consequences. For example, an AI agent could unexpectedly spend a large sum of money or delete critical files based on a simple, ambiguous command.
Explainable AI (XAI): A set of techniques and tools that help make the decisions and predictions of complex AI models more transparent and understandable to humans, building trust and allowing for auditing.

F

False Positive Rate: In a classification model, the proportion of all negative instances that were incorrectly classified as positive. It is calculated as $\frac{FalsePositives}{FalsePositives + TrueNegatives}$.
Feature Engineering: The process of using domain knowledge to select, transform, and create relevant variables (features) from raw data. This is a critical step for improving a model's performance.
Feature Store: A centralized repository for managing, versioning, and serving machine learning features. It ensures consistency and reusability of features across different models and teams.
Feature Vector: A numerical representation of an object or data point. It is an ordered list of numerical features that a machine learning model uses as input.
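Few-shot Learning: A machine learning approach where a model learns a new task from only a very small number of examples. With large language models, the examples are often supplied directly in the prompt rather than through additional training (see In-Context Learning). This is a key capability of advanced foundation models.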
Fine-Tuning LLMs: The process of further training a pre-trained large language model on a smaller, specific dataset to adapt its behavior or knowledge for a particular task, domain, or style.
Foundation Models: Very large models trained on vast amounts of unannotated data that can be adapted to a wide range of downstream tasks. Examples include GPT-3, PaLM, and DALL-E 2.
Frontier Model: The most advanced and capable large-scale AI models at any given time. These models typically push the boundaries of what is possible in terms of performance and general capabilities.
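The False Positive Rate formula in this section can be computed directly from labels and predictions. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative); the function name is hypothetical.

```python
def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN) for binary labels where 1 is positive and 0 is negative."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no negative instances in y_true")
    return fp / (fp + tn)


if __name__ == "__main__":
    y_true = [0, 0, 0, 0, 1, 1]
    y_pred = [1, 0, 0, 0, 1, 0]
    # One of the four negatives is predicted positive.
    print(false_positive_rate(y_true, y_pred))
```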

G

Gen AI App: A software application that uses generative AI models to create new and unique content, such as text, images, audio, or code, based on user prompts.
Generative Agents: Autonomous systems that use generative models to perceive and react to their environment, plan actions, and generate human-like behaviors and interactions. See also: Agent Frameworks
Generative AI: A broad category of AI that focuses on creating new and original content, including text, images, music, and code. It is a major subfield of AI and the technology behind models like ChatGPT and Midjourney.
Generative Adversarial Network (GAN): A type of generative model that trains two competing neural networks, a generator and a discriminator, against each other. The generator learns to create new, synthetic data that the discriminator cannot easily distinguish from real data.
GPU (Graphics Processing Unit) for Machine Learning: A specialized processor that excels at parallel computation. Its ability to perform many calculations simultaneously makes it ideal for the intensive training of deep learning models.
Gradient Descent: An iterative optimization algorithm used to minimize a function, typically a model's error or loss. It works by moving in the direction of the steepest decrease of the function.
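The update rule behind Gradient Descent can be shown on a one-dimensional function. This is a minimal sketch, assuming a fixed learning rate and a known gradient; the example function f(x) = (x - 3)^2 and all names are chosen only for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # move in the direction of steepest decrease
    return x


def example_gradient(x):
    """Gradient of f(x) = (x - 3)^2, whose minimum is at x = 3."""
    return 2 * (x - 3)


if __name__ == "__main__":
    print(gradient_descent(example_gradient, x0=0.0))
```

With this learning rate each step shrinks the distance to the minimum by a constant factor; a learning rate that is too large would instead overshoot and diverge.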

H

Hallucination Mitigation: Techniques used to reduce or eliminate LLM hallucinations. Methods include grounding responses in external data, providing clear constraints in the prompt, and using retrieval-augmented generation (RAG).
Holdout Dataset: A portion of the data that is not used during the training or validation phases of model development. It is "held out" and used for the final, unbiased evaluation of the model's performance.
Human in the Loop: A machine learning approach where human judgment is incorporated into the training, validation, or deployment process. This can improve a model's performance, safety, and reliability.
Hyperpersonalization in AI: The use of AI to deliver highly customized and individualized experiences, content, or recommendations to each user. It goes beyond standard personalization by leveraging real-time data and context.
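Hyperparameter: A configuration variable external to the model that is set before training, either manually or by an automated search, to guide the learning process. Unlike a model's parameters (e.g., weights), it is not learned from the data. Examples include the learning rate and the number of layers.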

I

Image Processing Framework: A software library or toolkit that provides a collection of functions and algorithms for manipulating, analyzing, and transforming digital images. In an ML context, these are used for tasks like preparing image datasets for computer vision models.
In-Context Learning: The ability of a large language model to learn from the examples provided within a single prompt, without any further training or fine-tuning.
Inductive Bias: The set of assumptions that a learning algorithm makes to generalize from training data to new, unseen data. It is a necessary component for a model to be able to make predictions.

K

Knowledge Distillation: A technique where a small, efficient "student" model is trained to mimic the behavior of a larger, more complex "teacher" model. This allows for faster and more cost-effective inference in production.
Kubeflow Pipelines: A platform for building and deploying portable, scalable machine learning workflows on Kubernetes. It simplifies the process of creating and managing complex ML pipelines.
Kubernetes for MLOps: The use of Kubernetes, a container orchestration system, to manage and automate the deployment, scaling, and management of the various components of a machine learning production environment.

L

Large Language Models (LLMs): A type of AI model, typically based on the transformer architecture, that is trained on a massive amount of text data. LLMs can understand and generate human-like text, answer questions, and perform many other language-based tasks.
Latent Space: A compressed, typically lower-dimensional representation of data learned by a model, in which similar data points are located close to each other. Generative models learn to navigate this space to create new data.
LLM Agents: AI systems that use LLMs as their core reasoning and planning engine. They can autonomously perform complex tasks by interacting with tools, databases, and other systems. See also: Agent Frameworks
LLM as a Judge: The practice of using a large language model to evaluate the quality of another model's output or to rank multiple responses. This is an emerging research technique used to automate evaluation.
LLM Customization: The process of modifying or adapting a pre-trained LLM to better suit a specific task or domain. This can be done through techniques like fine-tuning, prompt engineering, or retrieval-augmented generation.
LLM Embeddings: Numerical representations of text that capture the semantic meaning of words, phrases, or entire documents. LLMs use these embeddings to understand relationships and context.
LLM Grounding: The process of connecting an LLM's responses to external, verified knowledge sources (e.g., databases, websites) to improve factual accuracy and reduce hallucinations.
LLM Hallucinations: When a large language model generates false, nonsensical, or factually incorrect information that is not supported by its training data or the provided context. This can be a significant source of misinformation.
LLM Monitoring: The continuous process of tracking the performance, cost, and behavior of a deployed large language model to detect issues like performance degradation, bias, or excessive costs.
LLM Optimization: Techniques used to improve the efficiency, speed, and cost-effectiveness of LLMs, such as quantization, model pruning, or using smaller, more specialized models.
LLM Orchestration: The process of managing the flow of data and control between multiple components in an LLM-powered application. This involves chaining together different models, tools, and data sources to perform a complex task.
LLM Temperature: A sampling parameter, set at inference time, that controls the randomness of an LLM's output. A higher temperature results in more diverse and unpredictable responses, while a lower temperature produces more focused and repeatable ones.
LLM Tracing: The process of recording and visualizing the step-by-step execution of an LLM-powered application for debugging, performance analysis, and understanding its behavior.
LLMOps: A set of practices for operationalizing and managing the entire lifecycle of large language models, from development and experimentation to deployment, monitoring, and governance in a production environment.
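The effect described under LLM Temperature is commonly implemented by dividing the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below assumes that approach; the logits are made-up numbers and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution so that less likely tokens are sampled more often.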

M

Machine Learning Infrastructure: The underlying hardware, software, and services required to support machine learning workflows, including data storage, compute resources, and model deployment platforms.
Machine Learning Lifecycle: The complete process of a machine learning project, from initial business understanding and data collection to model development, deployment, and ongoing monitoring.
Mixture of Experts (MoE): A neural network architecture that uses multiple "expert" sub-networks. A gating network learns to route each input to the most appropriate expert or a combination of experts, allowing the model to be more efficient and scalable.
ML Pipeline: A sequence of automated steps for building and deploying a machine learning model, typically including data ingestion, data preprocessing, model training, and model evaluation.
ML Pipeline Tools: Software tools that help in designing, building, and managing ML pipelines, such as Apache Airflow, Kubeflow, or cloud-based services like Vertex AI Pipelines.
ML Stack: The complete collection of technologies, including frameworks, libraries, and tools, used to develop and deploy a machine learning application. It encompasses everything from data storage to model serving.
MLOps Governance: The set of policies, procedures, and controls that ensure the ethical, secure, and compliant use of machine learning models within an organization. It focuses on risk management and responsible AI.
Model Accuracy in Machine Learning: A common metric for evaluating classification models, defined as the proportion of total correct predictions out of the total number of predictions. Calculated as $\frac{TruePositives + TrueNegatives}{TotalPredictions}$.
Model Behavior: The way a machine learning model responds to different inputs and conditions. This includes its predictions, its biases, and its typical output patterns, which are crucial to monitor in production.
Model Deployment: The process of making a trained machine learning model available for use in a production environment, typically by serving it through an API or integrating it into an application.
Model Evaluation: The process of assessing a model's performance on a dataset to determine its effectiveness. This involves using various metrics like accuracy, precision, recall, and F1-score to measure its quality.
Model Management: The practice of tracking, versioning, and organizing machine learning models throughout their lifecycle. This ensures reproducibility and proper governance of ML assets.
Model Monitoring: The ongoing process of observing a deployed model's performance and behavior in real-time to detect issues such as performance degradation, data drift, or bias.
Model Retraining: The process of re-training a deployed model with new or updated data. This is necessary to adapt to concept drift and maintain the model's performance over time.
Model Serving: The infrastructure and process for making a model's predictions available to a user or application, often via a web service or API that can handle real-time requests.
Model Serving Pipeline: The automated workflow for deploying and serving models, often including steps for model registration, containerization, and the creation of a scalable API endpoint.
Model Training: The process of teaching a machine learning model to recognize patterns in data by feeding it a dataset and allowing it to learn and optimize its parameters.
Model Tuning: The process of optimizing a model's hyperparameters (parameters not learned from data, like the learning rate or number of layers) to improve its performance on a specific task.
Multimodal Models: AI models that can process and generate information across multiple data types, such as text, images, and audio. LLMs like GPT-4o and Gemini are examples of multimodal models.
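The metrics named under Model Evaluation (accuracy, precision, recall, and F1-score) can all be derived from the same four confusion-matrix counts. The sketch below assumes binary labels encoded as 1 and 0, and returns 0.0 when a ratio is undefined, which is a convention chosen for this example rather than a universal rule.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 1 and p == 0:
            fn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(evaluate(y_true, y_pred))
```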

N

Natural Language Processing (NLP): A field of AI focused on enabling computers to understand, interpret, and generate human language in a way that is both meaningful and useful.
Neural Network: A computational model inspired by the structure of the human brain. It consists of layers of interconnected nodes (neurons) that process information and learn to recognize patterns in data.
Noise in ML: Random error or irrelevant, non-informative data in a dataset. Noise can make it difficult for a model to learn the true underlying patterns and can lead to overfitting.

O

On-Premise AI Platform: An AI platform that is hosted and managed within a company's own data center, rather than being deployed on a third-party cloud service.
Open Source Model: A machine learning model whose code, weights, and architecture are publicly available for use, modification, and distribution. This encourages collaboration and innovation.
Operationalizing ML: The process of moving a machine learning model from a research or development phase into a production environment where it can be used to generate predictions or insights. The set of practices that supports this is commonly known as MLOps.
Overfitting: A common problem where a model learns the training data too well, including its noise and outliers. An overfit model performs poorly on new, unseen data because it fails to generalize the patterns.

P

Parameter: A value in a machine learning model that is learned directly from the training data. The model's parameters (e.g., weights and biases in a neural network) determine its predictions.
Prompt Chaining: A technique where the output of one prompt is used as the input for a subsequent prompt, creating a chain of instructions for an LLM to solve a complex, multi-step problem.
Prompt Engineering: The art and science of crafting effective inputs (prompts) for large language models to guide their behavior, specify constraints, and elicit desired responses.
Prompt Management: The process of storing, versioning, and organizing the prompts used to interact with LLMs. This is essential for maintaining consistency, improving collaboration, and ensuring reproducibility.

R

Random Forest: A type of ensemble machine learning algorithm that constructs a multitude of decision trees during training. It combines the outputs of these trees to produce a more robust and accurate prediction.
Real Time ML: The use of machine learning models to make predictions or decisions on data as it is generated, with minimal delay. This is crucial for applications like fraud detection and recommendation systems.
Reasoning Engine: The component of an AI system that uses logic, inference, and learned patterns to derive conclusions from a set of facts and rules. It allows the AI to perform complex problem-solving.
Recall (a.k.a. True Positive Rate): In a classification model, a metric that measures the proportion of actual positive instances that were correctly identified by the model. It is calculated as $\frac{TruePositives}{TruePositives + FalseNegatives}$.
Regression: A type of machine learning task where the model predicts a continuous numerical value, such as a price, temperature, or sales forecast.
Reinforcement Learning: A type of machine learning where an agent learns to make decisions by taking actions in an environment to maximize a cumulative reward. It's the AI equivalent of learning by trial and error.
Retrieval-Augmented Generation (RAG): An architectural pattern for LLMs that retrieves information from an external knowledge base to ground its responses in a specific set of facts. This helps reduce hallucinations and enables models to access up-to-date information.
Risk Management: The process of identifying, assessing, and mitigating potential risks associated with the development and deployment of AI systems, including security, ethical, and reputational risks.
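RLHF (Reinforcement Learning from Human Feedback): A technique used to align large language models with human preferences. Human rankings of model outputs are typically used to train a reward model, and the LLM is then optimized with reinforcement learning to produce outputs that the reward model scores highly.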

S

Self Reflection in LLMs: The ability of an LLM to analyze its own outputs and internal state to improve its performance or correct its mistakes. This is an advanced reasoning capability that can enhance model accuracy.
Supervised Learning: A type of machine learning where the model learns from a labeled dataset. The model's task is to predict the correct output for a given input, with the correct answers provided in the training data.
Synthetic Data: Artificially generated data that is not collected from real-world events. It is often used to train models when real data is scarce, sensitive, or requires specific characteristics.

T

Token Limit: The maximum number of tokens that a large language model can process in a single request, including both the user prompt and the model's response. This is a practical constraint for managing LLM cost and performance.
Transformer Model: A neural network architecture that relies on the attention mechanism to process sequential data, such as text. It has become the standard for large language models and other generative AI applications.
Transfer Learning: A machine learning technique where a model trained on one task is reused as the starting point for a model on a different but related task. This significantly reduces training time and data requirements.

U

Underfitting: A modeling error where a model is too simple to capture the underlying structure of the data. This results in poor performance on both training data and new, unseen data.
Unsupervised Machine Learning: A type of machine learning where the model is trained on unlabeled data. The goal is to discover hidden patterns, relationships, or structures within the data without any predefined target variables.

V

Vector: A fundamental mathematical object used in machine learning to represent a data point. It is an ordered list of numbers that captures the features of an item.
Vector Database: A specialized database designed to store, manage, and search for high-dimensional vectors. They are essential for applications that use embeddings to find semantically similar items, such as in RAG systems.
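The core operation of a Vector Database, finding the stored vectors most similar to a query, can be sketched with a brute-force search. This example assumes cosine similarity as the measure and uses tiny made-up two-dimensional vectors. Real systems typically rely on approximate nearest-neighbor indexes rather than scanning every entry, and all names here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


if __name__ == "__main__":
    # Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
    store = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "car": [0.0, 1.0]}
    print(top_k([1.0, 0.05], store, k=2))
```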

W

What is Prompt Injection?: A security vulnerability in an AI system where an attacker uses a specially crafted prompt to manipulate a large language model's behavior or bypass its safety guardrails, often to get it to reveal confidential information or perform unintended actions. See also: Prompt Engineering, LLM Monitoring

X

XAI: An abbreviation for Explainable AI. See the full definition under "E".

Y

YOLO (You Only Look Once): A popular and highly efficient deep learning algorithm for object detection in images and videos. It can identify multiple objects and their locations in a single pass.

Z

Zero-shot Learning: The ability of a machine learning model, typically a large language model, to perform a task it was not explicitly trained on, using only instructions from the prompt without any examples.
Z-Score Normalization: A data preprocessing technique that transforms data to have a mean of $0$ and a standard deviation of $1$. This is a common step for preparing data for machine learning models, as it can improve their performance.
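Z-Score Normalization is usually written as $z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ is the standard deviation of the feature. The sketch below assumes the population standard deviation is used (some tools use the sample standard deviation instead); the function name is hypothetical.

```python
import statistics


def z_score_normalize(values):
    """Rescale values to have mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError("cannot normalize a constant feature")
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    print(z_score_normalize([2, 4, 4, 4, 5, 5, 7, 9]))
```

In practice the mean and standard deviation are computed on the training data only and then reused to transform validation and production data, so that no information leaks from the data being evaluated.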