Key Concepts
Glossary of terms and concepts used across the handbook
This glossary defines terms you'll encounter in this handbook and our daily work. Bookmark this page.
Tip: Use Ctrl+F / Cmd+F to search for terms.
Development Basics
API (Application Programming Interface)
A way for programs to communicate. When our app needs data from DataForSEO, it calls their API.
Backend
The server side of an application. It handles data, business logic, and communication with databases. Our backends use NextJS API routes or FastAPI.
Frontend
The user-facing part of an application. What people see and interact with. We build frontends with NextJS + Tailwind + Shadcn.
Full-stack
Working on both frontend and backend. Most of our team works full-stack.
Repository (Repo)
A project's folder tracked by Git. Contains all code, history, and configuration.
Environment Variables
Configuration values stored outside code (like API keys). Set differently for development vs production.
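As a minimal sketch of reading configuration from the environment: the helper below fails loudly when a required value is missing and allows a default for optional ones. The function name get_setting and the variable names EXAMPLE_API_KEY and EXAMPLE_LOG_LEVEL are made up for this illustration and are not names used in our projects.

```python
import os
from typing import Optional


def get_setting(name: str, default: Optional[str] = None) -> str:
    """Read a configuration value from the environment.

    Raises RuntimeError when the variable is unset and no default is given,
    so a missing key is caught at startup instead of deep inside a request.
    """
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Normally the value is set outside the code (shell, .env file, hosting dashboard).
# It is set here only so the example runs on its own.
os.environ["EXAMPLE_API_KEY"] = "dev-key-123"

api_key = get_setting("EXAMPLE_API_KEY")                   # required value
log_level = get_setting("EXAMPLE_LOG_LEVEL", default="info")  # optional value
```

Because the code only reads names, the same build can run in development and production with different values.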
Deployment
Making an application available on the internet. We deploy to Railway.
CI/CD (Continuous Integration/Continuous Deployment)
Automated testing and deployment when code is pushed. We use GitHub Actions.
Web Technologies
SSR (Server-Side Rendering)
Generating HTML on the server before sending it to the browser. NextJS pre-renders pages by default and renders them per request when a page needs request-time data.
SSG (Static Site Generation)
Pre-building pages as static HTML at build time. Faster to serve, but content only changes when the site is rebuilt.
CSR (Client-Side Rendering)
Building pages in the browser with JavaScript. More interactive but slower initial load.
REST API
A pattern for designing APIs around resources, using HTTP methods (GET, POST, PUT, DELETE) to read, create, update, and delete them.
Webhook
A URL that receives automated notifications from external services.
CORS (Cross-Origin Resource Sharing)
Browser security mechanism that controls which other websites (origins) may call your API from client-side JavaScript.
AI & Machine Learning
LLM (Large Language Model)
AI models trained on text data. Claude and GPT are LLMs.
Prompt
The text instruction you give to an LLM. Better prompts = better results.
Token
A chunk of text (roughly 4 characters of English on average). LLMs process text as tokens, and providers bill by token count.
Context Window
The maximum number of tokens an LLM can process in one request, counting both the input and the generated output. Claude's context window is about 200K tokens.
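To make tokens and context windows concrete, here is a rough sketch that estimates a token count from the "about 4 characters per token" rule of thumb and checks it against a limit. This is a heuristic, not a real tokenizer, and the function names are hypothetical; use the provider's own token counter when the number matters.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the Token entry, not an exact value


def estimate_tokens(text: str) -> int:
    """Estimate how many tokens a piece of text will use."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, context_window: int, reserved_for_output: int = 0) -> bool:
    """Check whether the text likely fits, leaving room for the model's reply."""
    return estimate_tokens(text) + reserved_for_output <= context_window


prompt = "Group these keywords by search intent."
estimated = estimate_tokens(prompt)
ok = fits_in_context(prompt, context_window=200_000, reserved_for_output=4_000)
```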
Embedding
A numerical representation of text as a vector. Used for similarity search and clustering.
Vector
A list of numbers representing something (like text meaning). Embeddings are vectors.
Dimensionality
How many numbers a vector contains. Some OpenAI embedding models, such as text-embedding-3-small, output 1536 dimensions.
Cosine Similarity
A measure of how similar two vectors are, based on the angle between them: 1 means same direction, 0 means unrelated, -1 means opposite. Used to find related content.
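Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal pure-Python sketch follows; in practice a numerical library would do this faster on real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # close to 1.0
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])    # 0.0
```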
RAG (Retrieval Augmented Generation)
Giving an LLM relevant context before it generates a response. Improves accuracy.
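The RAG idea can be sketched in two steps: retrieve the most relevant documents, then place them in the prompt. The toy version below ranks documents by shared words purely so it runs without any services; real systems typically retrieve with embeddings and a vector search. The function names and prompt wording are illustrative assumptions.

```python
from typing import List


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: List[str], k: int = 2) -> str:
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "railway hosts our apps",
    "hdbscan clusters keywords",
    "supabase stores data",
]
prompt = build_prompt("where do we host apps", docs, k=1)
```

The resulting prompt string is what would be sent to the LLM.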
Chunking
Breaking long documents into smaller pieces for processing.
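A minimal sketch of chunking by character count, with a small overlap so that a sentence cut at a boundary still appears whole in one of the chunks. The sizes are arbitrary example values; splitting on sentences or tokens is often a better fit for real documents.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```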
Hallucination
When an LLM generates false information confidently.
Fine-tuning
Training an existing model on custom data. We rarely do this.
Agent
An AI system that can take actions, not just generate text. Uses tools and makes decisions.
Tool Calling (Function Calling)
When an LLM decides to use a tool (like search) during generation.
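On the application side, tool calling usually means receiving a structured request from the model and running the matching function. The sketch below assumes a JSON shape with "name" and "arguments" keys; that shape and the get_word_count tool are hypothetical, and each provider defines its own format.

```python
import json
from typing import Any, Callable, Dict


def get_word_count(text: str) -> int:
    """Example tool: count the words in a piece of text."""
    return len(text.split())


# Registry of tools the model is allowed to call, keyed by name.
TOOLS: Dict[str, Callable[..., Any]] = {"get_word_count": get_word_count}


def execute_tool_call(raw_call: str) -> Any:
    """Parse a tool call emitted by the model and run the requested tool."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


result = execute_tool_call(
    '{"name": "get_word_count", "arguments": {"text": "keyword clustering tool"}}'
)
```

The result is then sent back to the model so it can continue generating.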
MCP (Model Context Protocol)
A standard for connecting LLMs to external tools and data sources.
Machine Learning Specific
HDBSCAN
Clustering algorithm we use for keyword grouping. Handles noise well.
Clustering
Grouping similar items together automatically.
Outlier
A data point that doesn't fit any cluster. In keyword clustering, these are standalone terms.
Dimensionality Reduction
Converting high-dimensional vectors to fewer dimensions. We reduce 1536D embeddings to ~50D for clustering.
UMAP
Algorithm for dimensionality reduction. Preserves local structure well.
PCA (Principal Component Analysis)
A simpler, linear method of dimensionality reduction. Faster than UMAP, but it captures only linear structure in the data.
Pipeline
A sequence of processing steps. Our ML pipelines: preprocess → embed → reduce → cluster → label.
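A pipeline can be sketched as a list of functions where each step receives the previous step's output. The two steps below are toy stand-ins for a preprocess stage, not our actual pipeline code, and the names are illustrative.

```python
from functools import reduce
from typing import Any, Callable, List, Sequence

Step = Callable[[Any], Any]


def run_pipeline(data: Any, steps: Sequence[Step]) -> Any:
    """Feed data through each step in order and return the final result."""
    return reduce(lambda current, step: step(current), steps, data)


def normalize(keywords: List[str]) -> List[str]:
    """Trim whitespace and lowercase every keyword."""
    return [keyword.strip().lower() for keyword in keywords]


def deduplicate(keywords: List[str]) -> List[str]:
    """Drop repeated keywords while keeping first-seen order."""
    return list(dict.fromkeys(keywords))


cleaned = run_pipeline(
    [" SEO Tools", "seo tools", "Backlinks "],
    steps=[normalize, deduplicate],
)
```

Keeping each step as its own function makes it easy to test steps separately or swap one out.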
Batch Processing
Processing many items at once instead of one-by-one. More efficient for API calls.
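A minimal sketch of splitting a list into fixed-size batches, so that one API request can carry many items. The helper name and batch size are example choices.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


keywords = ["seo", "serp", "backlink", "keyword", "intent"]
batches = list(batched(keywords, batch_size=2))
# Each batch would be sent as one request instead of one request per keyword.
```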
Agent Architecture
Streaming
Sending response chunks as they're generated, not waiting for completion.
SSE (Server-Sent Events)
Technology for streaming data from server to browser. Simpler than WebSockets.
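On the wire, an SSE stream is plain text: each event is one or more "data:" lines followed by a blank line. The sketch below formats response chunks that way using a generator, which also illustrates streaming (chunks are produced one at a time instead of all at once). The function name is hypothetical, and a real endpoint would send this with the text/event-stream content type.

```python
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Format each chunk as one Server-Sent Event."""
    for chunk in chunks:
        # A payload containing newlines needs one "data:" field per line.
        data_lines = "".join(f"data: {line}\n" for line in chunk.split("\n"))
        # The extra newline (blank line) marks the end of the event.
        yield data_lines + "\n"


events = list(sse_events(["Hel", "lo"]))
```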
WebSocket
Two-way real-time communication between browser and server.
Memory (Agent)
Storing conversation history so agents remember previous interactions.
Orchestration
Coordinating multiple AI components or agents to complete a task.
Chain
A sequence of LLM calls where the output of one becomes the input to the next.
Multi-Agent
Multiple specialized agents working together on a task.
Database & Data
PostgreSQL
Relational database we use via Supabase. Stores structured data.
Schema
The structure of a database: tables, columns, relationships.
Query
A request for data from a database. Written in SQL.
Index
Database optimization that speeds up queries on specific columns.
Row-Level Security (RLS)
Database feature restricting which rows users can access. Supabase supports this.
Migration
Versioned changes to database schema. Tracks database evolution.
CRUD
Create, Read, Update, Delete. Basic database operations.
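The four operations map directly onto SQL statements. The sketch below uses Python's built-in sqlite3 with an in-memory database only so it runs anywhere; it is a stand-in for PostgreSQL, and the projects table is an invented example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")

# Create: INSERT a new row.
cursor = conn.execute("INSERT INTO projects (name) VALUES (?)", ("Keyword tool",))
project_id = cursor.lastrowid

# Read: SELECT the row back.
name_before = conn.execute(
    "SELECT name FROM projects WHERE id = ?", (project_id,)
).fetchone()[0]

# Update: change a column on the existing row.
conn.execute(
    "UPDATE projects SET name = ? WHERE id = ?",
    ("Keyword clustering tool", project_id),
)
name_after = conn.execute(
    "SELECT name FROM projects WHERE id = ?", (project_id,)
).fetchone()[0]

# Delete: remove the row.
conn.execute("DELETE FROM projects WHERE id = ?", (project_id,))
remaining = conn.execute("SELECT COUNT(*) FROM projects").fetchone()[0]
```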
Denormalization
Duplicating data to speed up queries. Trade-off: faster reads, harder updates.
Authentication & Security
Authentication (AuthN)
Verifying WHO someone is. "Are you really user@example.com?"
Authorization (AuthZ)
Determining WHAT someone can do. "Can this user delete projects?"
Session
Server-side record that a user is logged in.
JWT (JSON Web Token)
Signed, encoded token containing user info (claims). The signature lets the server detect tampering. A stateless alternative to server-side sessions.
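A JWT has three base64url-encoded parts separated by dots: header, payload, and signature. The sketch below builds and verifies an HS256 token with only the standard library to show the structure. It is for illustration: it does not check expiry or the header's algorithm field, the function names are hypothetical, and production code should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload: dict, secret: bytes) -> str:
    """Create header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> dict:
    """Return the payload if the signature matches, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("token must have three parts")
    header_b64, payload_b64, signature_b64 = parts
    signing_input = f"{header_b64}.{payload_b64}"
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(_b64url_decode(signature_b64), expected):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload_b64))


token = sign_jwt({"sub": "user-1"}, secret=b"example-secret")
claims = verify_jwt(token, secret=b"example-secret")
```

Note that the payload is only encoded, not encrypted: anyone holding the token can read it, so it should not carry secrets.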
OAuth
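Authorization protocol that lets users grant an app limited access to their account on another service. It underlies "Login with Google/GitHub" functionality.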
OWASP
Organization that publishes security best practices and common vulnerabilities.
SQL Injection
Attack where malicious SQL is inserted through user input.
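The sketch below shows the difference between building SQL by string formatting and using a parameterized query. It uses the built-in sqlite3 module with an in-memory database as a stand-in; the table and the input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("a@example.com", "admin"), ("b@example.com", "member")],
)

# Input crafted by an attacker instead of a real email address.
malicious = "nobody@example.com' OR '1'='1"

# UNSAFE: the input becomes part of the SQL text, so the OR clause
# is executed and every row matches.
unsafe_sql = f"SELECT email FROM users WHERE email = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# SAFE: the placeholder sends the input as data, never as SQL,
# so it is compared literally and matches nothing.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()
```

The unsafe query returns both users, while the parameterized query returns none.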
XSS (Cross-Site Scripting)
Attack where malicious scripts run in users' browsers.
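The usual defense is to escape user-supplied text before placing it in HTML, so the browser displays it instead of running it. A minimal sketch with the standard library follows; render_comment is a hypothetical helper, and frontend frameworks generally escape rendered text for you.

```python
import html


def render_comment(comment: str) -> str:
    """Wrap a user comment in a paragraph, escaping HTML special characters."""
    return f"<p>{html.escape(comment)}</p>"


rendered = render_comment("<script>alert(1)</script>")
# The angle brackets become &lt; and &gt;, so no script tag reaches the browser.
```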
CSRF (Cross-Site Request Forgery)
Attack that tricks users into making unwanted requests.
Rate Limiting
Restricting how many API calls a user can make in a time period.
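One simple way to implement this is a fixed-window counter: count each user's calls within the current time window and reject calls over the limit. The class below is a single-process sketch with invented names; the current time is passed in to keep it easy to test, and a deployed service would usually keep the counters in a shared store.

```python
from typing import Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most max_calls per user in each window of window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        # user_id -> (window index, calls made in that window)
        self._state: Dict[str, Tuple[int, int]] = {}

    def allow(self, user_id: str, now: float) -> bool:
        """Record a call at time `now` (in seconds) and report if it is allowed."""
        window = int(now // self.window_seconds)
        last_window, count = self._state.get(user_id, (window, 0))
        if last_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.max_calls:
            self._state[user_id] = (window, count)
            return False
        self._state[user_id] = (window, count + 1)
        return True


limiter = FixedWindowRateLimiter(max_calls=2, window_seconds=60)
results = [limiter.allow("user-1", now=t) for t in (0, 1, 2, 61)]
```

Here the third call falls in the same 60-second window and is rejected, and the call at 61 seconds starts a new window.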
SEO Domain
SERP (Search Engine Results Page)
The page Google shows after a search.
AI Overview
Google's AI-generated answer at the top of some search results.
Keyword
A search term people type into Google.
Keyword Clustering
Grouping related keywords by search intent or topic.
Search Intent
What the user wants when searching: informational, transactional, navigational.
Backlink
A link from another website to yours. Important for SEO.
Domain Authority
Third-party metric (originally from Moz) that predicts how well a site will rank. Higher = better. It is not a Google ranking factor.
Infrastructure
PaaS (Platform as a Service)
Hosting that handles servers for you. Railway is a PaaS.
CDN (Content Delivery Network)
Servers worldwide that cache and serve your static files. Cloudflare provides this.
DNS (Domain Name System)
Translates domain names (example.com) to IP addresses.
SSL/TLS
Encryption for HTTPS connections. Railway and Cloudflare handle this.
Container
Packaged application with all dependencies. Docker creates containers.
Serverless
Running code without managing servers. Functions that scale automatically.
Project Management
PRD (Product Requirements Document)
Document describing what to build and why.
MVP (Minimum Viable Product)
Simplest version that delivers value. Ship this first.
Sprint
Fixed time period for development work. We often work in 1-2 week cycles.
Standup
Brief daily meeting to share progress and blockers.
Pair Programming
Two people coding together. One types, one reviews.
Code Review
Reviewing someone else's code before merging.
Technical Debt
Shortcuts taken now that will need fixing later.
Scope Creep
Gradual expansion of project requirements beyond original plan.
Quick Lookup by Category
When Discussing AI Agents
RAG, MCP, Streaming, SSE, Memory, Orchestration, Tool Calling, Context Window
When Discussing ML
Embedding, Clustering, HDBSCAN, UMAP, Dimensionality, Pipeline, Batch Processing
When Discussing Security
Authentication, Authorization, Session, OWASP, SQL Injection, XSS, Rate Limiting
When Discussing Architecture
API, Backend, Frontend, SSR, REST, Webhook, Environment Variables
When Discussing Deployment
CI/CD, Railway, Cloudflare, DNS, Container, PaaS
Can't find a term? Ask the team or add it here.