Key Concepts

Glossary of terms and concepts used across the handbook

This glossary defines terms you'll encounter in this handbook and our daily work. Bookmark this page.

Tip: Use Ctrl+F / Cmd+F to search for terms.

Development Basics

API (Application Programming Interface)

A way for programs to communicate. When our app needs data from DataForSEO, it calls their API.

Backend

The server side of an application. Handles data, logic, and communication with databases. Our backends use NextJS API routes or FastAPI.

Frontend

The user-facing part of an application. What people see and interact with. We build frontends with NextJS + Tailwind + Shadcn.

Full-stack

Working on both frontend and backend. Most of our team works full-stack.

Repository (Repo)

A project's folder tracked by Git. Contains all code, history, and configuration.

Environment Variables

Configuration values stored outside code (like API keys). Set differently for development vs production.
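As a minimal sketch, this is how a Python backend might read configuration from the environment. The variable names `APP_ENV` and `DATAFORSEO_API_KEY` are hypothetical and chosen only for illustration; use whatever names your project defines.

```python
import os


def load_config(env=None):
    """Read settings from environment variables, with a default for local work."""
    # Accept a plain dict so the function is easy to test; default to the real environment.
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names, not taken from our actual projects.
        "app_env": env.get("APP_ENV", "development"),
        "api_key": env.get("DATAFORSEO_API_KEY"),  # None when the variable is unset
    }


config = load_config({"APP_ENV": "production", "DATAFORSEO_API_KEY": "secret"})
```

Keeping the lookup in one function means a missing key shows up in one place instead of scattered across the codebase.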

Deployment

Making an application available on the internet. We deploy to Railway.

CI/CD (Continuous Integration/Continuous Deployment)

Automated testing and deployment when code is pushed. We use GitHub Actions.

Web Technologies

SSR (Server-Side Rendering)

Generating HTML on the server before sending it to the browser. NextJS does this by default.

SSG (Static Site Generation)

Pre-building pages at build time. Faster but less dynamic.

CSR (Client-Side Rendering)

Building pages in the browser with JavaScript. More interactive but slower initial load.

REST API

A pattern for designing APIs using HTTP methods (GET, POST, PUT, DELETE).

Webhook

A URL that receives automated notifications from external services.

CORS (Cross-Origin Resource Sharing)

Browser-enforced security feature that controls which websites can call your API from client-side scripts.

AI & Machine Learning

LLM (Large Language Model)

AI models trained on text data. Claude and GPT are LLMs.

Prompt

The text instruction you give to an LLM. Better prompts = better results.

Token

A chunk of text (roughly 4 characters of English on average). LLMs process text as tokens, and providers bill by token count.
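The 4-characters rule of thumb can be turned into a quick budget check. This is only a rough sketch: `estimate_tokens` is a hypothetical helper, and real tokenizers give different counts depending on the model and the language of the text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate based on character count alone."""
    # Real tokenizers split on learned sub-word units, so treat this as a ballpark.
    return math.ceil(len(text) / chars_per_token)


approx = estimate_tokens("Keyword clustering groups related search terms.")
```

For billing or context-limit decisions, use the provider's own tokenizer or token-count endpoint rather than this estimate.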

Context Window

Maximum number of tokens an LLM can process at once, counting both input and output. Claude's context window is ~200K tokens.

Embedding

A numerical representation of text as a vector. Used for similarity search and clustering.

Vector

A list of numbers representing something (like text meaning). Embeddings are vectors.

Dimensionality

How many numbers are in a vector. OpenAI's text-embedding-3-small embeddings have 1536 dimensions.

Cosine Similarity

A measure of how similar two vectors are. Used to find related content.
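Cosine similarity is the dot product of two vectors divided by the product of their lengths, giving a value from -1 (opposite) to 1 (same direction). A minimal pure-Python sketch, using short toy vectors rather than real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
unrelated = cosine_similarity([1, 0], [0, 1])
```

Because the result depends only on direction, two texts of very different lengths can still score as highly similar.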

RAG (Retrieval Augmented Generation)

Giving an LLM relevant context before it generates a response. Improves accuracy.
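The retrieve-then-generate flow can be sketched without any model at all. The example below is a toy under stated assumptions: it ranks documents by shared words instead of embeddings, the function names `retrieve` and `build_prompt` are hypothetical, and the final LLM call is left out.

```python
import re


def _words(text):
    """Lowercased set of words, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Toy retrieval: rank documents by how many words they share with the question."""
    q_words = _words(question)
    ranked = sorted(documents, key=lambda d: len(q_words & _words(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents, top_k=2):
    """Put the retrieved context in front of the question before calling a model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "Railway hosts our deployed applications.",
    "HDBSCAN groups keywords into clusters.",
    "Supabase provides our PostgreSQL database.",
]
prompt = build_prompt("Where are applications deployed?", docs, top_k=1)
```

A production setup would typically swap the word-overlap scoring for embedding similarity, but the shape (retrieve, assemble context, then generate) stays the same.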

Chunking

Breaking long documents into smaller pieces for processing.
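A minimal sketch of character-based chunking with overlap, so that a sentence cut at a boundary still appears whole in one of the neighbouring chunks. The function name and default sizes are assumptions for illustration; real pipelines often split on sentences or tokens instead.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into chunks of up to chunk_size characters, sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```

Here each chunk starts with the last character of the previous one, which is the overlap at work.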

Hallucination

When an LLM generates false information confidently.

Fine-tuning

Training an existing model on custom data. We rarely do this.

Agent

An AI system that can take actions, not just generate text. Uses tools and makes decisions.

Tool Calling (Function Calling)

When an LLM decides to use a tool (like search) during generation.
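On the application side, tool calling usually comes down to parsing a structured request from the model and dispatching it to a real function. The sketch below assumes a simple JSON shape (`name` plus `arguments`) that is made up for illustration and is not any specific provider's format; `count_words` is a hypothetical tool.

```python
import json


def count_words(text: str) -> int:
    """Hypothetical tool the model is allowed to call."""
    return len(text.split())


# Registry of tools the application exposes to the model.
TOOLS = {"count_words": count_words}


def run_tool_call(tool_call_json: str):
    """Parse a tool request and run the matching function with its arguments."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return tool(**call["arguments"])


# Stand-in for what a model might emit when it decides to use a tool.
model_output = '{"name": "count_words", "arguments": {"text": "best running shoes"}}'
result = run_tool_call(model_output)
```

The tool's return value is normally sent back to the model so it can continue generating with that result in context.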

MCP (Model Context Protocol)

A standard for connecting LLMs to external tools and data sources.

Machine Learning Specific

HDBSCAN

Clustering algorithm we use for keyword grouping. Handles noise well.

Clustering

Grouping similar items together automatically.

Outlier

A data point that doesn't fit any cluster. In keyword clustering, these are standalone terms.

Dimensionality Reduction

Converting high-dimensional vectors to fewer dimensions. We reduce 1536D embeddings to ~50D for clustering.

UMAP

Algorithm for dimensionality reduction. Preserves local structure well.

PCA (Principal Component Analysis)

Simpler, linear dimensionality reduction. Faster than UMAP but captures only linear structure, so it can miss the local relationships UMAP preserves.

Pipeline

A sequence of processing steps. Our ML pipelines: preprocess → embed → reduce → cluster → label.

Batch Processing

Processing many items at once instead of one-by-one. More efficient for API calls.
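A minimal sketch of splitting work into fixed-size batches, for example before sending keywords to an API that accepts many items per request. The helper name `make_batches` is hypothetical.

```python
def make_batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


keywords = ["seo tools", "keyword research", "serp tracker", "backlink checker", "site audit"]
batches = list(make_batches(keywords, batch_size=2))
```

Five keywords with a batch size of 2 become three requests instead of five.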

Agent Architecture

Streaming

Sending response chunks as they're generated, not waiting for completion.

SSE (Server-Sent Events)

Technology for streaming data one way, from server to browser, over a regular HTTP connection. Simpler than WebSockets.
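On the wire, an SSE stream is plain text: each event is one or more `data:` lines, optionally preceded by an `event:` line, and terminated by a blank line. A small sketch of a formatter (the helper name `format_sse` is an assumption, not part of any framework):

```python
def format_sse(data, event=None):
    """Format one Server-Sent Events message as text ready to write to the response."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as several data: lines within the same event.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


message = format_sse("Hello", event="token")
```

The server sends these messages with the `text/event-stream` content type, and the browser reads them with the built-in `EventSource` API.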

WebSocket

Two-way real-time communication between browser and server.

Memory (Agent)

Storing conversation history so agents remember previous interactions.
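The simplest form of agent memory is a bounded list of past messages that gets sent along with each new request. A minimal sketch, assuming a hypothetical `ConversationMemory` class and a role/content message shape:

```python
from collections import deque


class ConversationMemory:
    """Keep only the most recent messages so the history fits the context window."""

    def __init__(self, max_messages=20):
        # deque with maxlen drops the oldest message automatically when full.
        self._messages = deque(maxlen=max_messages)

    def add(self, role, content):
        self._messages.append({"role": role, "content": content})

    def history(self):
        return list(self._messages)


memory = ConversationMemory(max_messages=2)
memory.add("user", "Cluster these keywords.")
memory.add("assistant", "Done. Found 3 clusters.")
memory.add("user", "Label the first cluster.")
```

With a limit of 2, the first message has already been dropped by the time the third is added. Longer-lived memory usually means persisting or summarizing history instead of just truncating it.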

Orchestration

Coordinating multiple AI components or agents to complete a task.

Chain

A sequence of LLM calls where the output of one becomes the input to the next.
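The chaining idea in a few lines, with plain functions standing in for LLM calls. `summarize` and `translate` are hypothetical stubs that only wrap their input, so the data flow is easy to see.

```python
def run_chain(steps, initial_input):
    """Feed the output of each step into the next one and return the final result."""
    value = initial_input
    for step in steps:
        value = step(value)
    return value


# Stubs standing in for real model calls.
def summarize(text):
    return f"summary({text})"


def translate(text):
    return f"translation({text})"


result = run_chain([summarize, translate], "article")
```

The nesting in the result mirrors the order of the calls: the article is summarized first, then the summary is translated.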

Multi-Agent

Multiple specialized agents working together on a task.

Database & Data

PostgreSQL

Relational database we use via Supabase. Stores structured data.

Schema

The structure of a database: tables, columns, relationships.

Query

A request for data from a database. Written in SQL.

Index

Database optimization that speeds up queries on specific columns.

Row-Level Security (RLS)

Database feature restricting which rows users can access. Supabase supports this.

Migration

Versioned changes to database schema. Tracks database evolution.

CRUD

Create, Read, Update, Delete. Basic database operations.
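The four operations map directly onto SQL statements. The sketch below uses Python's built-in SQLite driver so it runs anywhere; it is a stand-in for our PostgreSQL setup, and the `projects` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Create
cursor = conn.execute("INSERT INTO projects (name) VALUES (?)", ("Keyword tool",))
project_id = cursor.lastrowid

# Read
created_name = conn.execute(
    "SELECT name FROM projects WHERE id = ?", (project_id,)
).fetchone()[0]

# Update
conn.execute("UPDATE projects SET name = ? WHERE id = ?", ("SEO dashboard", project_id))
updated_name = conn.execute(
    "SELECT name FROM projects WHERE id = ?", (project_id,)
).fetchone()[0]

# Delete
conn.execute("DELETE FROM projects WHERE id = ?", (project_id,))
remaining = conn.execute("SELECT COUNT(*) FROM projects").fetchone()[0]
```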

Denormalization

Duplicating data to speed up queries. Trade-off: faster reads, harder updates.

Authentication & Security

Authentication (AuthN)

Verifying WHO someone is. "Are you really user@example.com?"

Authorization (AuthZ)

Determining WHAT someone can do. "Can this user delete projects?"

Session

Server-side record that a user is logged in.

JWT (JSON Web Token)

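Signed, encoded token containing user info (claims). The server verifies the signature instead of looking up stored state, which makes it a stateless alternative to sessions.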
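A JWT is three base64url-encoded parts joined by dots: header, payload, and signature. The sketch below signs and verifies an HS256 token with the standard library to show the structure. It is illustrative only: it skips expiry checks and does not validate the header's algorithm, so use a maintained JWT library in real code. The function names are assumptions.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that base64url encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> dict:
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(_b64url_decode(signature), expected):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(signing_input.split(".")[1]))


token = sign_jwt({"sub": "user-1"}, b"demo-secret")
claims = verify_jwt(token, b"demo-secret")
```

Note that the payload is only encoded, not encrypted: anyone holding the token can read the claims, so never put secrets in it.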

OAuth

Authorization protocol behind "Login with Google/GitHub" functionality. Lets users grant an app access without sharing their password.

OWASP

Organization that publishes security best practices and common vulnerabilities.

SQL Injection

Attack where malicious SQL is inserted through user input.
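A small demonstration of the attack and the standard defense, using an in-memory SQLite database as a stand-in for PostgreSQL. The `users` table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, is_admin INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("a@example.com", 0), ("b@example.com", 1)],
)

# Input crafted so the WHERE clause is always true.
malicious = "nobody@example.com' OR '1'='1"

# UNSAFE: user input is pasted into the SQL text and changes the query itself.
unsafe_sql = f"SELECT email FROM users WHERE email = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# SAFE: a parameterized query treats the input strictly as a value.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()
```

The unsafe query returns every user, while the parameterized one returns nothing because no user has that literal email address.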

XSS (Cross-Site Scripting)

Attack where malicious scripts run in users' browsers.
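The core defense is to escape user-supplied text before placing it in HTML, so the browser displays it instead of executing it. A minimal sketch with Python's standard library (`render_comment` is a hypothetical helper; frameworks like React escape text by default):

```python
import html


def render_comment(user_input: str) -> str:
    """Escape user text so any markup in it is shown as text, not run as code."""
    return f"<p>{html.escape(user_input)}</p>"


rendered = render_comment('<script>alert("x")</script>')
```

After escaping, the angle brackets and quotes arrive as HTML entities, so the browser never sees a script tag.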

CSRF (Cross-Site Request Forgery)

Attack that tricks users into making unwanted requests.

Rate Limiting

Restricting how many API calls a user can make in a time period.
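One simple way to implement this is a fixed-window counter: allow up to N calls per user in each time window and reject the rest. The sketch below keeps counts in memory and takes an injectable clock for testing; the class name and parameters are assumptions, and a multi-server deployment would need shared storage such as a database or cache.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `max_calls` per user within each window of `window_seconds`."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._windows = {}  # user_id -> (window_start, call_count)

    def allow(self, user_id):
        now = self._clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window expired, start a new one
        if count >= self.max_calls:
            self._windows[user_id] = (start, count)
            return False
        self._windows[user_id] = (start, count + 1)
        return True


# Fake clock so the example does not need to sleep.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(max_calls=2, window_seconds=60, clock=lambda: fake_now[0])
first, second, third = limiter.allow("u1"), limiter.allow("u1"), limiter.allow("u1")
```

The third call inside the same window is rejected; an API would typically answer it with HTTP status 429 (Too Many Requests).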

SEO Domain

SERP (Search Engine Results Page)

The page Google shows after a search.

AI Overview

Google's AI-generated answer at the top of some search results.

Keyword

A search term people type into Google.

Keyword Clustering

Grouping related keywords by search intent or topic.

Search Intent

What the user wants when searching: informational, transactional, navigational.

Backlink

A link from another website to yours. Important for SEO.

Domain Authority

Third-party metric (popularized by Moz) predicting how well a site will rank in search results. Higher = better.

Infrastructure

PaaS (Platform as a Service)

Hosting that handles servers for you. Railway is a PaaS.

CDN (Content Delivery Network)

Servers worldwide that cache and serve your static files. Cloudflare provides this.

DNS (Domain Name System)

Translates domain names (example.com) to IP addresses.

SSL/TLS

Encryption for HTTPS connections. Railway and Cloudflare handle this.

Container

Packaged application with all dependencies. Docker creates containers.

Serverless

Running code without managing servers. Functions that scale automatically.

Project Management

PRD (Product Requirements Document)

Document describing what to build and why.

MVP (Minimum Viable Product)

Simplest version that delivers value. Ship this first.

Sprint

Fixed time period for development work. We often work in 1-2 week cycles.

Standup

Brief daily meeting to share progress and blockers.

Pair Programming

Two people coding together. One types, one reviews.

Code Review

Reviewing someone else's code before merging.

Technical Debt

Shortcuts taken now that will need fixing later.

Scope Creep

Gradual expansion of project requirements beyond the original plan.

Quick Lookup by Category

When Discussing AI Agents

RAG, MCP, Streaming, SSE, Memory, Orchestration, Tool Calling, Context Window

When Discussing ML

Embedding, Clustering, HDBSCAN, UMAP, Dimensionality, Pipeline, Batch Processing

When Discussing Security

Authentication, Authorization, Session, OWASP, SQL Injection, XSS, Rate Limiting

When Discussing Architecture

API, Backend, Frontend, SSR, REST, Webhook, Environment Variables

When Discussing Deployment

CI/CD, Railway, Cloudflare, DNS, Container, PaaS


Can't find a term? Ask the team or add it here.