How We Work with External APIs
Patterns for integrating with external services reliably and maintainably
External services are outside your control. They go down, change behavior, throttle you, and return unexpected data. This guide covers how to build integrations that handle reality gracefully.
The Core Mindset
When integrating with external APIs, assume:
"This service will fail, change, or behave unexpectedly. How do I protect my application and my users?"
The goal isn't to prevent failures — it's to handle them gracefully.
Principles
1. Isolate External Dependencies
Keep external integrations separate from your business logic.
┌─────────────────────────────────────────────────────────────┐
│  YOUR APPLICATION                                           │
│  Routes, components, business logic                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  API CLIENT                                                 │
│  Wrapper around external service                            │
│  - Authentication                                           │
│  - Request formatting                                       │
│  - Response parsing                                         │
│  - Error handling                                           │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  EXTERNAL SERVICE                                           │
│  Third-party API                                            │
└─────────────────────────────────────────────────────────────┘
The mistake: Calling external APIs directly from business logic. The API's quirks, authentication, and error handling leak everywhere.
The principle: Create a client layer. Your business logic talks to the client; the client talks to the external service. Changes to the external API affect only the client.
The benefit:
- Swap providers by changing one file
- Test business logic with mocked clients
- Handle API-specific complexity in one place
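A minimal sketch of the client layer, assuming a hypothetical weather provider called "Acme". The names `AcmeWeatherClient`, `Weather`, the `/v1/current` path, and the `temp_c` field are all invented for illustration. The transport is injected so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Weather:
    """Domain model owned by our application (hypothetical)."""
    city: str
    temperature_c: float


class WeatherClient(Protocol):
    """The only interface business logic is allowed to depend on."""

    def current(self, city: str) -> Weather: ...


class AcmeWeatherClient:
    """Wraps the hypothetical Acme API: auth, request format, parsing."""

    def __init__(self, api_key: str, transport: Callable[[str, dict], dict]):
        self._api_key = api_key
        self._transport = transport  # injected so tests never touch the network

    def current(self, city: str) -> Weather:
        raw = self._transport("/v1/current", {"q": city, "key": self._api_key})
        # Provider-specific field names stop at this line.
        return Weather(city=city, temperature_c=float(raw["temp_c"]))


def should_bring_jacket(client: WeatherClient, city: str) -> bool:
    """Business logic: knows nothing about Acme, HTTP, or API keys."""
    return client.current(city).temperature_c < 12.0
```

Swapping providers would then mean writing another class that satisfies `WeatherClient`; `should_bring_jacket` would not change.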
2. Transform to Your Domain
Don't let external data shapes dictate your application's structure.
The mistake: Using the external API's field names and structures throughout your code. When they rename user_id to userId, you update 50 files.
The principle: Transform external data into your domain models immediately. Your application speaks your language, not theirs.
The pattern:
- Receive external response
- Validate it matches expected shape
- Transform to your domain model
- Use only your model in application code
Why it matters: External APIs are designed for their needs, not yours. Don't inherit their decisions.
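The four steps of the pattern can be sketched as a single boundary function. This assumes a hypothetical payload with `user_id` (or `userId`), `email_address`, and an optional `name`; the `User` model and `InvalidResponse` error are illustrative names, not part of any real API.

```python
from dataclasses import dataclass


class InvalidResponse(Exception):
    """Raised when an external payload does not match the expected shape."""


@dataclass(frozen=True)
class User:
    """Our domain model: our field names, our types."""
    id: str
    email: str
    display_name: str


def parse_user(payload: object) -> User:
    # Step 2: validate the shape before touching any field.
    if not isinstance(payload, dict):
        raise InvalidResponse("expected a JSON object")
    # A provider-side rename (user_id -> userId) is absorbed here, in one place.
    user_id = payload.get("user_id", payload.get("userId"))
    email = payload.get("email_address")
    if user_id is None or not isinstance(email, str):
        raise InvalidResponse(f"missing or malformed fields: {sorted(payload)}")
    # Step 3: transform to the domain model.
    email = email.lower()
    name = payload.get("name") or email.split("@")[0]
    return User(id=str(user_id), email=email, display_name=name)
```

Application code would then only ever see `User`, never the provider's dictionary.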
3. Expect and Handle Failure
Every external call can fail. Plan for it.
The mistake: Treating external calls like local function calls. One API hiccup crashes your entire request.
The principle: Every external call needs:
- A timeout (what if they never respond?)
- Error handling (what if they return an error?)
- A fallback (what do we do when they're down?)
Error categories:
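- Retryable: Network issues, rate limits (429), server errors (5xx). Try again with backoff.
- Not retryable: Bad request, auth failure, not found. Retrying the same request won't help; fix the request or credentials first.
- Permanent: The resource was deleted or access was revoked. There is nothing to fix on your side; handle it gracefully.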
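One way to encode these categories is a small classifier keyed on HTTP status. This is a sketch under assumptions: `ErrorKind` and `classify` are hypothetical names, `None` stands for a network failure or timeout, and treating 410 (Gone) as the "permanent" signal is a choice made for this example. Real providers may signal permanence differently.

```python
from enum import Enum
from typing import Optional


class ErrorKind(Enum):
    RETRYABLE = "retryable"
    NOT_RETRYABLE = "not_retryable"
    PERMANENT = "permanent"


def classify(status: Optional[int]) -> ErrorKind:
    """Map an HTTP status (None = network failure or timeout) to a category."""
    if status is None or status == 429 or 500 <= status <= 599:
        return ErrorKind.RETRYABLE
    if status == 410:
        # Assumption for this sketch: 410 Gone means the resource will not return.
        return ErrorKind.PERMANENT
    # Remaining client errors (400, 401, 403, 404, ...) need a fix before resending.
    return ErrorKind.NOT_RETRYABLE
```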
4. Store What You Can't Recreate
Raw responses are valuable. Parsed data can be regenerated; raw data cannot.
The mistake: Only storing the fields you need today. Six months later, you need a field you threw away.
The principle: Store the raw API response alongside parsed data. You can always reparse; you can't unfetch.
The benefit:
- Debug issues by examining exactly what the API returned
- Extract new fields later without re-fetching
- Audit trail for compliance
- Reprocess data if parsing logic changes
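A minimal sketch of storing raw and parsed data side by side, using SQLite from the standard library. The `invoices` table, its columns, and the payload fields (`id`, `amount_cents`) are hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    # amount_cents is the parsed field we need today;
    # raw_json keeps exactly what the API sent.
    db.execute(
        """CREATE TABLE IF NOT EXISTS invoices (
               id TEXT PRIMARY KEY,
               amount_cents INTEGER NOT NULL,
               raw_json TEXT NOT NULL,
               fetched_at TEXT NOT NULL
           )"""
    )
    return db


def save_invoice(db: sqlite3.Connection, raw_body: str) -> None:
    data = json.loads(raw_body)
    fetched_at = datetime.now(timezone.utc).isoformat()
    db.execute(
        "INSERT OR REPLACE INTO invoices VALUES (?, ?, ?, ?)",
        (data["id"], data["amount_cents"], raw_body, fetched_at),
    )
    db.commit()


def reparse_field(db: sqlite3.Connection, invoice_id: str, field: str):
    """Extract a field we did not parse originally, without re-fetching."""
    row = db.execute(
        "SELECT raw_json FROM invoices WHERE id = ?", (invoice_id,)
    ).fetchone()
    return json.loads(row[0]).get(field) if row else None
```

If a `currency` field turns out to matter later, `reparse_field(db, "inv_1", "currency")` can recover it from the stored body.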
5. Respect Rate Limits
External APIs have limits. Hitting them hurts everyone.
The mistake: Firing off 10,000 requests at once. Getting blocked. Your feature breaks for hours.
The principle:
- Know the limits before you code
- Build throttling into your client
- Process in batches with delays
- Handle 429 (rate limited) gracefully
The reality: Rate limits exist because the provider can't handle unlimited load. If you hit them, you're the problem.
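Client-side throttling and batching could look roughly like this. `Throttle` and `batches` are hypothetical helpers; the clock and sleep functions are injectable so the behavior can be tested without real delays. The sketch is single-threaded and spaces calls evenly, which is only one of several possible throttling strategies.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class Throttle:
    """Spaces calls at least 1 / max_per_second seconds apart."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

A client would call `throttle.wait()` immediately before each request, and iterate `batches(ids, 100)` rather than sending everything at once.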
Decision Framework
"How should I authenticate?"
API Key in header when:
- The provider requires it
- You're making server-side calls only
- Simple, stateless authentication is enough
OAuth when:
- Acting on behalf of a user
- The provider uses OAuth
- You need scoped permissions
Signed requests when:
- Security requirements are high
- You receive webhooks from external services
- The provider requires request signatures
The principle: Never expose secrets to the client. API keys stay on the server.
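For the webhook case, signature verification often looks like the sketch below. It assumes the provider sends a hex-encoded HMAC-SHA256 of the raw request body; the actual algorithm, encoding, and header name vary by provider, so check their documentation. `sign` and `verify_webhook` are hypothetical helper names.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True only if the signature matches the body we received."""
    expected = sign(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.encode("utf-8")
    )
```

Verification should run against the raw bytes of the request, before any JSON parsing, since re-serializing can change the body.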
"Should I retry on failure?"
Yes, retry when:
- Network timeout or connection error
- 429 (rate limited), honoring the Retry-After header when the provider sends one
- 5xx server errors (their problem, might be temporary)
No, don't retry when:
- 400 (bad request) — your input is wrong
- 401/403 (auth error) — retrying won't help
- 404 (not found) — the resource doesn't exist
How to retry:
- Exponential backoff: wait 1s, then 2s, then 4s
- Add jitter: randomize slightly to prevent thundering herd
- Limit attempts: 3-5 retries max, then fail
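The retry rules above can be sketched as one helper. `ApiError`, `is_retryable`, and `call_with_retry` are hypothetical names; the 25% jitter factor is an arbitrary choice for illustration. Sleep and the random source are injectable so the logic can be tested without waiting.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ApiError(Exception):
    def __init__(self, status: Optional[int], retry_after: Optional[float] = None):
        super().__init__(f"API error (status={status})")
        self.status = status            # None means network failure or timeout
        self.retry_after = retry_after  # seconds, if the provider sent Retry-After


def is_retryable(err: ApiError) -> bool:
    return err.status is None or err.status == 429 or 500 <= err.status <= 599


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(err):
                raise
            delay = base_delay * (2 ** attempt)   # 1s, 2s, 4s, ...
            delay += delay * 0.25 * rng()         # add up to 25% jitter
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)
            sleep(delay)
    raise AssertionError("unreachable: the loop returns or raises")
```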
"Should I cache this response?"
Cache when:
- Data doesn't change often
- Requests are expensive or slow
- Rate limits are tight
- Stale data is acceptable
Don't cache when:
- Data must be real-time
- Each request is unique
- Caching adds more complexity than it saves
Cache invalidation strategies:
- Time-based: Expires after N minutes
- Event-based: Invalidate when something changes
- Manual: Provide a way to force refresh
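Time-based expiry and manual refresh can be combined in a small in-memory cache. This is a sketch only: `TTLCache` is a hypothetical class, it is not thread-safe, and it never evicts old keys. Event-based invalidation maps to calling `invalidate` from whatever handler observes the change.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Tuple, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, V]] = {}

    def get_or_fetch(
        self, key: Hashable, fetch: Callable[[], V], force_refresh: bool = False
    ) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        fresh = entry is not None and now - entry[0] < self._ttl
        if entry is not None and fresh and not force_refresh:
            return entry[1]
        value = fetch()  # only hit the API on a miss, expiry, or forced refresh
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Event-based invalidation: drop the entry when the data changes."""
        self._entries.pop(key, None)
```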
"How should I handle partial failures?"
In batch operations, some items may succeed and others fail.
Fail the whole batch when:
- All-or-nothing semantics are required
- Partial success leaves inconsistent state
- You can easily retry the whole thing
Process what you can when:
- Items are independent
- Partial results have value
- Retrying the whole batch is expensive
The pattern: Track success and failure per item. Return which items succeeded, which failed, and why. Let the caller decide what to do.
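Per-item tracking might be sketched like this for the "process what you can" case. `BatchResult` and `process_batch` are hypothetical names, and catching every `Exception` is a simplification; a real client would likely catch only its own error types.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class BatchResult:
    succeeded: Dict[str, object] = field(default_factory=dict)  # id -> result
    failed: Dict[str, str] = field(default_factory=dict)        # id -> reason

    @property
    def ok(self) -> bool:
        return not self.failed


def process_batch(ids: Iterable[str], handler: Callable[[str], object]) -> BatchResult:
    """Run handler on each item independently; one failure does not stop the rest."""
    result = BatchResult()
    for item_id in ids:
        try:
            result.succeeded[item_id] = handler(item_id)
        except Exception as exc:
            result.failed[item_id] = f"{type(exc).__name__}: {exc}"
    return result
```

The caller can then retry only `result.failed`, surface the reasons to the user, or treat any failure as fatal.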
"What should I do when the API is down?"
Graceful degradation options:
- Use cached data: "Here's what we knew as of 5 minutes ago"
- Skip the feature: Hide or disable what relies on the API
- Queue for later: Accept the request, process when the API recovers
- Fallback provider: Switch to a backup service
The principle: Decide during development what happens when the API fails. Don't figure it out during an outage.
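As one illustration of deciding the outage behavior up front, the sketch below falls back to cached data and tells the caller where the value came from. `Degradable` and `fetch_with_fallback` are hypothetical names, and catching every `Exception` is a simplification for the example.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Degradable(Generic[T]):
    value: Optional[T]
    source: str  # "live", "stale", or "unavailable"


def fetch_with_fallback(
    fetch_live: Callable[[], T],
    read_cache: Callable[[], Optional[T]],
) -> Degradable:
    try:
        return Degradable(fetch_live(), "live")
    except Exception:
        cached = read_cache()
        if cached is not None:
            # Caller can label this, e.g. "as of 5 minutes ago".
            return Degradable(cached, "stale")
        # Nothing to show: caller hides or disables the feature.
        return Degradable(None, "unavailable")
```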
Common Mistakes
No Timeout
Waiting forever for a response that never comes.
Signs: Requests hang indefinitely. Thread pools exhaust. Everything slows down.
The fix: Set explicit timeouts. Default to something reasonable (10-30 seconds). Adjust based on expected response times.
Ignoring Rate Limits
Blasting requests until you're blocked.
Signs: 429 errors. API access revoked. Features mysteriously stop working.
The fix: Read the docs. Implement rate limiting. Batch and delay requests. Handle 429 with backoff.
Trusting External Data
Assuming the API returns what you expect.
Signs: Type errors, null reference exceptions, "undefined is not a function" in production.
The fix: Validate responses before using them. Use schemas. Handle missing or unexpected fields gracefully.
Leaky Abstractions
External API details spreading through your codebase.
Signs: Business logic handles API-specific error codes. Domain models mirror external schemas. Provider name appears in 20 files.
The fix: Wrap the API. Transform data at the boundary. Throw your own errors.
Testing Against Production
Using the real API in tests.
Signs: Tests fail randomly (API was down). Tests are slow (real network calls). Tests cost money (metered API).
The fix: Mock the API client in tests. Test the client itself separately, possibly with recorded responses.
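A sketch of testing business logic against a mocked client, using `unittest.mock` from the standard library. The `greeting_for` function and the client's `get_display_name` method are hypothetical.

```python
from unittest.mock import Mock


def greeting_for(client, user_id: str) -> str:
    """Business logic under test: degrades to a generic greeting on timeout."""
    try:
        name = client.get_display_name(user_id)
    except TimeoutError:
        return "Hello!"
    return f"Hello, {name}!"


def test_greeting_uses_name():
    client = Mock()
    client.get_display_name.return_value = "Ada"
    assert greeting_for(client, "u1") == "Hello, Ada!"
    client.get_display_name.assert_called_once_with("u1")


def test_greeting_survives_outage():
    client = Mock()
    # Simulate the external service timing out, with no network involved.
    client.get_display_name.side_effect = TimeoutError("upstream timed out")
    assert greeting_for(client, "u1") == "Hello!"
```

Both tests run instantly and deterministically because no request ever leaves the process.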
Missing Audit Trail
No record of what you sent or received.
Signs: "What did the API return?" "I don't know, it was two weeks ago." Debugging is guesswork.
The fix: Store raw requests and responses. Log at the API client layer. Include timestamps and request IDs.
How to Evaluate Your Integration
Your integration is working if:
- The API client is isolated (one file/module)
- External data is transformed to domain models
- All calls have timeouts
- Errors are handled and classified
- Rate limits are respected
- Raw responses are stored for debugging
Your integration needs work if:
- API details appear in business logic
- Some calls have no timeout
- Rate limit errors happen frequently
- You can't reproduce issues (no logs)
- Tests call the real API
- One API outage takes down your whole app
Building Resilient Integrations
The Circuit Breaker Pattern
If an API keeps failing, stop calling it temporarily.
How it works:
- Track recent failures
- If failures exceed threshold, "open" the circuit
- While open, fail fast (don't even try)
- After a cool-down period, allow one test request (often called the "half-open" state)
- If it succeeds, "close" the circuit
- If it fails, stay open and restart the cool-down
Why it matters: Prevents wasting resources on a broken service. Gives the service time to recover.
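The steps above can be sketched as a small class. `CircuitBreaker` and `CircuitOpenError` are hypothetical names and the default threshold and cool-down are arbitrary. The sketch is single-threaded: under concurrency, more than one probe request could slip through after the cool-down unless a lock is added.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised instead of calling the API while the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("failing fast: circuit is open")
            # Cool-down elapsed: fall through and let this request probe.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                # Open the circuit, or stay open after a failed probe.
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```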
Health Checks
Monitor external dependencies actively.
What to check:
- Can you reach the API?
- Is authentication working?
- Are response times acceptable?
- Is the API returning valid data?
When to check:
- On application startup
- Periodically in the background
- Before critical operations
Timeouts and Deadlines
Connection timeout: How long to wait for a connection to establish. Usually 5-10 seconds.
Read timeout: How long to wait for a response after connecting. Depends on the operation.
Overall deadline: Maximum time for the entire operation, including retries.
The principle: Be aggressive with timeouts. It's better to fail fast and retry than to hang forever.
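An overall deadline can be enforced by shrinking each attempt's timeout to whatever budget remains. `Deadline` and `DeadlineExceeded` are hypothetical names in this sketch; the clock is injectable for testing.

```python
import time
from typing import Callable


class DeadlineExceeded(Exception):
    """The whole operation, retries included, ran out of time."""


class Deadline:
    def __init__(self, total_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._expires_at = clock() + total_seconds

    def remaining(self) -> float:
        return max(0.0, self._expires_at - self._clock())

    def timeout_for_attempt(self, per_attempt: float) -> float:
        """Timeout to use for the next call: never more than the budget left."""
        left = self.remaining()
        if left <= 0:
            raise DeadlineExceeded("overall deadline exceeded")
        return min(per_attempt, left)
```

The returned value would be passed as the read timeout to whichever HTTP library is in use. With the `requests` library, for example, connection and read timeouts can be given together as `timeout=(connect, read)`.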
Progress and Feedback for Long Operations
When processing many items:
- Track progress: Know how many are done, how many remain
- Report status: Let users see what's happening
- Allow cancellation: Let users stop if needed
- Handle interruption: Don't lose completed work if interrupted
The pattern:
- Process in batches
- Update progress after each batch
- Store progress so you can resume
- Show estimated time remaining
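A resumable batch loop might be sketched as follows. `run_resumable` is a hypothetical helper; `load_done` and `save_done` stand in for wherever progress is persisted (a database row, a file), and time-remaining estimation is left to the `on_progress` callback.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def run_resumable(
    items: Sequence[T],
    process_batch: Callable[[List[T]], None],
    load_done: Callable[[], int],
    save_done: Callable[[int], None],
    batch_size: int = 100,
    on_progress: Optional[Callable[[int, int], None]] = None,
    should_cancel: Optional[Callable[[], bool]] = None,
) -> int:
    """Process items in batches, persisting progress after each batch.

    Returns the number of items completed so far.
    """
    done = load_done()  # resume from the last stored position
    total = len(items)
    while done < total:
        if should_cancel is not None and should_cancel():
            break  # completed batches are already saved
        batch = list(items[done:done + batch_size])
        process_batch(batch)
        done += len(batch)
        save_done(done)  # store progress so an interruption loses at most one batch
        if on_progress is not None:
            on_progress(done, total)
    return done
```

If `process_batch` raises partway through, a second call with the same `load_done` picks up at the first unfinished batch. Items in the interrupted batch may be processed twice, so the handler should be idempotent.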