ZAHLEN

API User Guide

Chapter 11 - Error Handling

HTTP status codes | Validation errors | Authentication failures

Audience

Merchants, developers, and integration engineers who build resilient client applications against the Zahlen API.

Version 1.0 | Source baseline: zahlen_deploy_0616A.tar.gz | June 2026

Commercial developer experience | Tenant-safe operations | Explainable retry intelligence

‌Chapter 11 - Error Handling

Learning objectives

By the end of this chapter, you should be able to distinguish transport, authentication, authorization, validation, throttling, conflict, and server errors; decide whether a request is safe to retry; and preserve the identifiers needed for support and audit.

A dependable API client does not treat every non-success response the same way. A malformed JSON request needs a code change. A revoked API key needs credential remediation. A rate-limit response needs controlled backoff. A transient server failure may be safe to retry only when the operation has stable idempotency semantics.

The first rule is simple: classify the failure before deciding what to do next. Blind retries can create duplicate work, trigger retry storms, consume quotas, or obscure the original cause.

Payment schedule boundary

HTTP request retries are not payment retries. Client error recovery must never create authorization attempts outside Zahlen's fixed Day 1, Day 2, Day 6, and Day 16 payment schedule.

‌Error-handling goals
- Give developers enough detail to correct a request without exposing secrets or internal implementation data.
- Preserve request IDs, event IDs, batch IDs, decision IDs, outcome IDs, and upload job IDs for traceability.
- Retry only transient failures and only with bounded backoff, jitter, and stable idempotency.
- Fail closed when authentication or tenant ownership cannot be resolved.
- Make errors observable through logs, metrics, alerts, and audit records.

‌HTTP status codes

The table below describes the status classes a Zahlen client should be prepared to handle. Exact response bodies may vary by route, but the client behavior should remain consistent.

Status	Meaning	Typical cause	Recommended client response
200 / 201	Request succeeded	Resource read, created, or accepted.	Parse the body and persist returned identifiers.
400	Malformed or business-invalid request	Invalid JSON, incompatible values, or a business rule failure.	Correct the request. Do not blindly retry.
401	Missing or invalid authentication	Missing X-API-Key, malformed key, revoked key, or wrong environment.	Stop the request flow and repair credentials.
403	Authenticated but not permitted	Plan, role, capability, or endpoint policy denies access.	Check authorization and contract settings.
404	Resource not visible or not found	Wrong identifier, wrong tenant, deleted resource, or wrong environment.	Verify tenant-scoped identifiers and hostname.
409	Conflict or idempotency mismatch	Same idempotency key used with a different logical request or state conflict.	Compare the original request and key; do not generate another payment attempt.
422	Schema validation failure	Missing required field, wrong type, invalid bounds, or forbidden extra field.	Read field-level errors and fix serialization.
429	Rate or quota enforcement	Short-window rate limit or longer-window quota exhausted.	Honor Retry-After when present; back off with jitter.
500	Unexpected server failure	Unhandled server condition.	Retry cautiously with idempotency; alert if repeated.
503	Service or dependency unavailable	Maintenance, dependency outage, worker issue, or overload.	Use bounded backoff; stop after a configured limit.

Do not assume every 4xx is permanent

Some 401, 403, or 404 responses result from using the wrong environment or tenant-scoped identifier. Diagnose the context before changing application logic.
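The classify-first rule and the status table above can be sketched as a small lookup. This is a minimal illustration, not part of the Zahlen API: the names FailureClass and classify_status are hypothetical, and the mapping simply mirrors the table. Any status the table does not list is treated as unknown and is not retried automatically.

```python
from enum import Enum


class FailureClass(Enum):
    """Hypothetical client-side categories; not defined by the Zahlen API."""
    SUCCESS = "success"
    FIX_REQUEST = "fix_request"          # 400, 422: correct the payload
    FIX_CREDENTIALS = "fix_credentials"  # 401: repair or rotate the key
    FIX_PERMISSIONS = "fix_permissions"  # 403: check plan, role, contract
    CHECK_CONTEXT = "check_context"      # 404: verify IDs, tenant, hostname
    RECONCILE = "reconcile"              # 409: compare with original request
    BACK_OFF = "back_off"                # 429: honor Retry-After
    RETRY_BOUNDED = "retry_bounded"      # 500, 503: bounded, idempotent retry
    UNKNOWN = "unknown"                  # unlisted status: no automatic retry


# Mirrors the status table in this chapter.
_STATUS_TO_CLASS = {
    400: FailureClass.FIX_REQUEST,
    401: FailureClass.FIX_CREDENTIALS,
    403: FailureClass.FIX_PERMISSIONS,
    404: FailureClass.CHECK_CONTEXT,
    409: FailureClass.RECONCILE,
    422: FailureClass.FIX_REQUEST,
    429: FailureClass.BACK_OFF,
    500: FailureClass.RETRY_BOUNDED,
    503: FailureClass.RETRY_BOUNDED,
}


def classify_status(status: int) -> FailureClass:
    """Map an HTTP status code to the action class the client should take."""
    if 200 <= status < 300:
        return FailureClass.SUCCESS
    return _STATUS_TO_CLASS.get(status, FailureClass.UNKNOWN)
```

Keeping the mapping in one place makes the per-status policy reviewable and easy to contract-test.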

‌Reading an error response
The deployed API includes an ApiErrorResponse model with top-level error and meta objects. Clients should preserve the entire safe response for diagnostics, while avoiding logs that expose credentials or prohibited payment data.
{
  "error": {
    "code": "EXAMPLE_CODE",
    "message": "Human-readable explanation",
    "details": {"field": "example"}
  },
  "meta": {
    "request_id": "req_example",
    "time": "2026-06-16T15:00:00Z"
  }
}

Illustrative structure

The sample above demonstrates a practical parsing pattern. Treat the actual response contract returned by the route as the source of truth and tolerate additional documented metadata.
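A tolerant parser for the illustrative structure might look like the sketch below. It assumes only the error and meta fields shown in the sample; ParsedError and parse_error_body are hypothetical names. It ignores extra fields, and a non-JSON body (for example, an HTML page from a proxy) yields an empty result rather than an exception.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParsedError:
    """Hypothetical container for the fields shown in the sample response."""
    code: Optional[str] = None
    message: Optional[str] = None
    details: dict = field(default_factory=dict)
    request_id: Optional[str] = None
    time: Optional[str] = None


def parse_error_body(raw: str) -> ParsedError:
    """Parse the illustrative error/meta shape without raising on odd input."""
    try:
        doc = json.loads(raw)
    except (TypeError, ValueError):
        return ParsedError()  # not JSON at all
    if not isinstance(doc, dict):
        return ParsedError()  # JSON, but not an object
    error = doc.get("error") if isinstance(doc.get("error"), dict) else {}
    meta = doc.get("meta") if isinstance(doc.get("meta"), dict) else {}
    details = error.get("details")
    return ParsedError(
        code=error.get("code"),
        message=error.get("message"),
        details=details if isinstance(details, dict) else {},
        request_id=meta.get("request_id"),
        time=meta.get("time"),
    )
```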

‌What to capture

Value	Why it matters
HTTP method and path	Identifies the failing operation without logging secrets.
Status code	Primary classification for client behavior.
Safe error code and message	Supports remediation and alert grouping.
Server request ID	Connects merchant logs to Zahlen audit and support records.
Client correlation IDs	Links event, batch, decision, outcome, and job flows.
Attempt count and elapsed time	Shows retry behavior and retry-storm risk.
Environment and service version	Helps identify wrong-host and contract-version problems.
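As a rough sketch, the values in the table above can be gathered into one structured record per failure. The function name and field names are hypothetical. The example deliberately drops the query string, on the assumption that it could carry sensitive values.

```python
from urllib.parse import urlsplit


def build_failure_record(method, url, status, *, error_code=None,
                         request_id=None, correlation_ids=None,
                         attempt=1, elapsed_s=0.0,
                         environment="unknown", service_version="unknown"):
    """Collect the diagnostic values worth keeping for one failed call."""
    parts = urlsplit(url)
    return {
        "method": method.upper(),
        "host": parts.hostname,
        "path": parts.path,  # query string is dropped on purpose
        "status": status,
        "error_code": error_code,
        "request_id": request_id,
        "correlation_ids": dict(correlation_ids or {}),
        "attempt": attempt,
        "elapsed_s": elapsed_s,
        "environment": environment,
        "service_version": service_version,
    }
```

A record like this can be emitted as one structured log line and reused later when assembling a support request.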

‌What not to log

- The complete X-API-Key or webhook verification secret.
- Full primary account numbers, CVV values, passwords, or raw bank credentials.
- Unredacted Authorization headers or secret-manager values.
- Arbitrary metadata without a reviewed allowlist.
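One possible way to enforce the list above is to redact known secret headers and pass metadata through an explicit allowlist before logging. The allowlist entries below are placeholders, not real Zahlen metadata keys. A real deployment would also cover its webhook-secret header, whose name is not given in this chapter.

```python
# Header names compared in lowercase; HTTP header names are case-insensitive.
SENSITIVE_HEADERS = {"x-api-key", "authorization"}

# Hypothetical reviewed allowlist; replace with keys approved by your policy.
METADATA_ALLOWLIST = {"order_ref", "region"}


def redact_headers(headers):
    """Return a copy of headers that is safe to log."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def filter_metadata(metadata, allowlist=METADATA_ALLOWLIST):
    """Keep only metadata keys that appear on the reviewed allowlist."""
    return {key: value for key, value in metadata.items() if key in allowlist}
```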

‌Validation errors

Key Zahlen request models use strict validation and forbid unknown top-level properties. This prevents misspelled or unsupported fields from being silently ignored. A validation failure is a client defect or contract mismatch, not a transient network event.

‌Common causes

Cause	Example	Correction
Missing required field	Payment event without event_id.	Add the required field before sending.
Wrong data type	attempt_number sent as an object.	Serialize the documented integer type.
Out-of-range value	attempt_number below its minimum.	Validate bounds in the client model.
Empty collection	Payment-events request with no events.	Send at least one event.
Oversized collection	More than 10,000 payment events or more than 500 legacy batch decisions.	Split into valid batches and respect quotas.
Forbidden extra field	Misspelled property such as eventid.	Correct the field name; do not expect it to be ignored.
Invalid URL or string length	Webhook callback URL outside schema constraints.	Validate before submission.

‌Example: invalid payment event

{
  "events": [{
    "eventid": "evt_0001",
    "attempt_number": 0
  }]
}

This request has two problems: eventid is not the required event_id field, and attempt_number violates the payment-event minimum of 1. Because unknown fields are forbidden, the misspelled field is not silently accepted.
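A local check can catch both problems before the request leaves the client. The sketch below is an assumption-heavy illustration. It knows only the two fields named in this chapter, treats both as required, and assumes event_id is a string. The real payment-event schema almost certainly has more fields, so ALLOWED_EVENT_FIELDS must be replaced with the documented set. The function name is hypothetical.

```python
MAX_EVENTS = 10_000  # batch ceiling quoted in the "Common causes" table

# ASSUMPTION: only the fields mentioned in this chapter; the real schema
# defines more, and this set must be replaced with the documented one.
ALLOWED_EVENT_FIELDS = {"event_id", "attempt_number"}


def validate_payment_events(payload):
    """Return a list of problems; an empty list means locally valid."""
    problems = []
    for name in sorted(set(payload) - {"events"}):
        problems.append(f"{name}: unknown top-level field")
    events = payload.get("events")
    if not isinstance(events, list) or not events:
        problems.append("events: must be a non-empty list")
        return problems
    if len(events) > MAX_EVENTS:
        problems.append(f"events: more than {MAX_EVENTS} items")
    for i, event in enumerate(events):
        if not isinstance(event, dict):
            problems.append(f"events[{i}]: must be an object")
            continue
        for name in sorted(set(event) - ALLOWED_EVENT_FIELDS):
            problems.append(f"events[{i}].{name}: unknown field")
        event_id = event.get("event_id")
        if not isinstance(event_id, str) or not event_id:
            problems.append(f"events[{i}].event_id: required non-empty string")
        number = event.get("attempt_number")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(number, bool) or not isinstance(number, int):
            problems.append(f"events[{i}].attempt_number: must be an integer")
        elif number < 1:
            problems.append(f"events[{i}].attempt_number: must be >= 1")
    return problems
```

Run against the invalid example above, the function reports the unknown eventid field, the missing event_id, and the out-of-range attempt_number.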

‌Client-side validation pattern

- Build request objects from strict typed models.
- Validate required fields, types, bounds, and collection sizes before network transmission.
- Serialize once and contract-test the exact JSON shape.
- On HTTP 422, map returned field errors to developer-visible diagnostics.
- Do not retry until the payload is corrected.

Authentication failures

- Confirm the X-API-Key header is present and spelled exactly.
- Confirm the secret was loaded from the intended secret manager or environment variable.
- Confirm the base URL belongs to the same environment as the key.
- Confirm the key is active and has not been revoked or expired.
- Check whether whitespace, quotes, line breaks, or proxy configuration altered the header.
- Use the key identifier or safe fingerprint to review activity without exposing the secret.
- Rotate immediately if compromise is suspected.
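The header checks in the list above lend themselves to a pre-flight test. The sketch below is illustrative only: the function names are hypothetical, and the fingerprint (a truncated SHA-256 digest) is an assumed scheme, not a documented Zahlen key identifier. If Zahlen exposes its own key identifier, prefer that.

```python
import hashlib


def check_api_key_header(headers):
    """Return problems found with the X-API-Key header; never echo the key."""
    matches = [v for k, v in headers.items() if k.lower() == "x-api-key"]
    if not matches:
        return ["X-API-Key header is missing"]
    key = matches[0]
    problems = []
    if not key:
        problems.append("key is empty")
    if key != key.strip():
        problems.append("key has leading or trailing whitespace")
    if "\r" in key or "\n" in key:
        problems.append("key contains a line break")
    if key.startswith(('"', "'")) or key.endswith(('"', "'")):
        problems.append("key is wrapped in quotes")
    return problems


def safe_fingerprint(key):
    """ASSUMED scheme: short SHA-256 prefix for correlating log entries."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
```

A truncated digest is only reasonable for long, random keys; it should not be applied to short or guessable secrets.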

Fail closed

When a key cannot be resolved to a valid tenant context, the request must be denied. Never fall back to a

default production tenant.

‌401 versus 403

Response	Interpretation	Example response action
401 Unauthorized	The caller is not successfully authenticated.	Repair or rotate the credential.
403 Forbidden	The caller is authenticated but lacks permission for the route or capability.	Check plan, role, endpoint authorization, and contract.

‌Security response to suspected compromise

- Revoke the affected key and create a replacement through an approved administrative workflow.
- Deploy the replacement to all intended services and verify authenticated traffic.
- Review audit and activity records for unexpected endpoints, tenants, IPs, or volumes.
- Do not paste the key into support messages, tickets, screenshots, or command history.

‌Resource, conflict, and ownership errors

‌HTTP 404

A resource can be absent, deleted, located in another environment, or intentionally invisible because it belongs to another tenant. A tenant-scoped 404 should not cause a client to remove ownership filters or try identifiers from another merchant.

- Verify the exact event_id, batch_id, subscription_id, decision_id, outcome_id, or job_id.
- Verify the base URL and environment.
- Verify that the same authenticated tenant created or owns the resource.
- Check whether the resource is eventually available after asynchronous processing, but poll at a controlled interval.
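Polling at a controlled interval can be sketched as follows. The fetch argument is a caller-supplied callable that performs the read and returns the HTTP status code. The interval and poll count are arbitrary placeholders, not Zahlen recommendations.

```python
import time


def poll_until_visible(fetch, *, interval_s=2.0, max_polls=5, sleep=time.sleep):
    """Call fetch() until it stops returning 404 or the poll budget is spent.

    Returns the last status seen, so a persistent 404 is surfaced to the
    caller instead of being retried forever.
    """
    status = None
    for poll in range(max_polls):
        status = fetch()
        if status != 404:
            return status
        if poll < max_polls - 1:
            sleep(interval_s)  # fixed, bounded wait between polls
    return status
```

The sleep parameter exists so tests can replace the real wait with a recorder.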

‌HTTP 409

A conflict commonly indicates an idempotency mismatch or incompatible resource state. The safe response is to compare the request with the original operation, not to generate a fresh key automatically.

Idempotency-Key: merchant-order-8842-attempt-2

Idempotency rule

Use one stable idempotency key for one logical operation. Reusing the key with changed request data can correctly produce a conflict. Generating a new key to bypass that conflict can create duplicate decisions or duplicate outcome records.

‌Conflict checklist

- Locate the original request body and idempotency key.
- Compare the current body byte-for-byte or field-for-field with the original logical operation.
- Read the existing resource if the API exposes it.
- Continue from the known durable result rather than creating a duplicate.
- Escalate if the original result cannot be reconciled safely.
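The comparison step in the checklist above could be implemented along these lines. It assumes the client stored the original request body alongside its idempotency key. The helper names are hypothetical, and the canonical form (sorted keys, compact separators) is a client-side convention, not how Zahlen itself compares requests.

```python
import hashlib
import json

_MISSING = object()  # distinguishes "absent" from an explicit null value


def canonical_digest(body):
    """Hash a JSON-serializable body independent of key order and spacing."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_fields(original, current):
    """List top-level fields whose values differ between two bodies."""
    names = set(original) | set(current)
    return sorted(
        name for name in names
        if original.get(name, _MISSING) != current.get(name, _MISSING)
    )
```

Matching digests suggest the 409 stems from resource state rather than a changed payload. A non-empty diff points to the fields that changed under the same key.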

‌Safe retry strategy

Automatic retry behavior should be explicit for each operation. Retry budgets must be bounded by maximum attempts, maximum elapsed time, and circuit-breaker rules.

Operation / response	Automatic retry?	Required safeguards
GET resource + transient 5xx/503	Usually	Exponential backoff, jitter, maximum attempts.
POST retry decision + transient failure	Yes, carefully	Reuse the same Idempotency-Key and identical logical request.
POST retry outcome + transient failure	Yes, carefully	Reuse stable identifiers and idempotency where supported.
POST payment-event batch + uncertain result	Only with safeguards	Use stable event IDs; first attempt to read the resulting resource or reconcile ingestion.
HTTP 400 or 422	No	Fix the request before resubmission.
HTTP 401	No	Repair authentication or rotate the key.
HTTP 403	No	Resolve authorization or plan restrictions.
HTTP 404	Usually no	Verify identifier, ownership, environment, and asynchronous timing.
HTTP 409	No blind retry	Reconcile the original idempotent operation.
HTTP 429	Yes, later	Honor Retry-After; use bounded backoff and jitter.
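Under the HTTP specification, Retry-After can carry either a number of seconds or an HTTP date, so a client that accepts only digits would ignore the date form. The sketch below handles both. The function name and the 300-second cap are illustrative assumptions, not Zahlen requirements.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, *, now=None, cap_s=300.0):
    """Convert a Retry-After header value to a non-negative delay in seconds.

    Returns None when the header is absent or cannot be parsed, so the
    caller can fall back to its own backoff schedule.
    """
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        return min(float(value), cap_s)  # delta-seconds form
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return min(max((when - now).total_seconds(), 0.0), cap_s)
```

The cap keeps a single unexpected header value from stalling a worker for an unbounded time.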

‌Exponential backoff with jitter

delay = min(max_delay, base_delay * (2 ** retry_number))

delay = delay * random.uniform(0.75, 1.25)

- Use a maximum number of attempts and maximum total elapsed time.
- Share throttling state across workers when many instances use the same tenant quota.
- Stop retrying when a circuit breaker detects a sustained outage.
- Preserve the same idempotency key for the same logical POST operation.
- Never let network retry loops schedule additional payment attempts.
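The formula and the first rule above combine into a bounded delay schedule. This sketch uses the same formula and jitter range. The default values and the generator name are assumptions. It budgets sleep time only, so a production client would also count time spent inside requests.

```python
import random


def backoff_delays(*, base_delay=0.5, max_delay=30.0, max_attempts=4,
                   max_elapsed=60.0, rng=random.uniform):
    """Yield the sleep before each retry until either budget is spent."""
    elapsed = 0.0
    # max_attempts counts the initial call, leaving max_attempts - 1 retries.
    for retry_number in range(max_attempts - 1):
        delay = min(max_delay, base_delay * (2 ** retry_number))
        delay *= rng(0.75, 1.25)  # jitter spreads out synchronized clients
        if elapsed + delay > max_elapsed:
            return  # elapsed-time budget would be exceeded
        elapsed += delay
        yield delay
```

A caller would iterate over the generator, sleep for each yielded delay, and resend the same logical request with the same idempotency key.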

‌Implementation examples
‌Python error mapper

import random
import time

import requests

# Statuses the client may retry automatically; everything else fails fast.
RETRYABLE = {429, 500, 503}


def request_with_policy(method, url, *, headers, json=None, max_attempts=4):
    for attempt in range(max_attempts):
        response = requests.request(
            method, url, headers=headers, json=json, timeout=20
        )
        if response.ok:
            return response

        request_id = response.headers.get("X-Request-ID")
        if response.status_code not in RETRYABLE:
            raise RuntimeError(
                f"Zahlen HTTP {response.status_code}; request_id={request_id}; "
                f"body={response.text[:1000]}"
            )

        # Retry budget exhausted: surface the last retryable failure.
        if attempt == max_attempts - 1:
            response.raise_for_status()

        retry_after = response.headers.get("Retry-After")
        if retry_after and retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = min(30.0, 0.5 * (2 ** attempt))
            delay *= random.uniform(0.75, 1.25)
        time.sleep(delay)

    raise RuntimeError("unreachable")

Production note

Use structured logs and redact response content according to your data policy. The example truncates the body but does not replace a proper allowlist-based logging design.

‌JavaScript status handling

// Hypothetical wrapper name; await and return need an async function context.
async function callZahlen(url, options) {
  const response = await fetch(url, options);
  const requestId = response.headers.get("x-request-id");

  if (response.ok) {
    return await response.json();
  }

  const body = await response.text();
  switch (response.status) {
    case 401:
      throw new Error(`Authentication failed; request_id=${requestId}`);
    case 403:
      throw new Error(`Not permitted; request_id=${requestId}`);
    case 422:
      throw new Error(`Validation failed; request_id=${requestId}; ${body}`);
    case 429:
      throw new Error(`Throttled; retry-after=${response.headers.get("retry-after")}`);
    default:
      throw new Error(`Zahlen HTTP ${response.status}; request_id=${requestId}`);
  }
}

‌Monitoring and alerting

Error handling is incomplete until failures are observable. Aggregate by tenant, route, status class, error code, key identifier, and deployment version while protecting sensitive information.

Signal	What it may indicate	Suggested response
401 rate	Revoked, expired, missing, or misconfigured keys.	Check recent deployments and key activity.
403 rate	Plan or authorization mismatch.	Review capability and endpoint policy.
422 rate	Client release or schema drift.	Inspect field errors and contract tests.
429 rate	Capacity pressure, quota exhaustion, or request loop.	Throttle clients and review usage.
5xx / 503 rate	Runtime or dependency degradation.	Open incident and use circuit breakers.
Idempotent replay rate	Expected retry behavior or unstable client network.	Verify that replays are returning the same durable result.
Outcome-reporting lag	Broken recovery learning loop.	Check retry-outcome clients and queues.

‌Alert design

- Alert on sustained rates or error-budget impact, not every individual validation error.
- Use separate alerts for authentication spikes, throttling, server failures, and outcome-reporting lag.
- Include safe correlation identifiers and links to internal dashboards, never complete secrets.
- Suppress duplicate alerts during a known incident while preserving metrics and logs.
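To illustrate alerting on sustained rates, the sketch below fires only when the error ratio across a full window of recent calls reaches a threshold. The class name, window size, and threshold are arbitrary assumptions. Most teams would express the same rule in their metrics platform rather than in client code.

```python
from collections import deque


class SustainedRateAlert:
    """Fire only when the error ratio stays high across a full window."""

    def __init__(self, window=100, threshold=0.2):
        self.window = window
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # oldest samples fall off

    def record(self, is_error):
        """Record one call outcome; return True if the alert should fire."""
        self.samples.append(bool(is_error))
        if len(self.samples) < self.window:
            return False  # not enough data to call the rate sustained
        return sum(self.samples) / self.window >= self.threshold
```

One instance per signal (401, 429, 5xx) keeps the alerts separate, as the list above recommends.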

Support package

When escalating to Zahlen support, provide environment, UTC time window, method, path, status, safe error code, request ID, and relevant durable identifiers. Do not provide the full API key or prohibited payment data.

‌Troubleshooting playbooks

‌Validation failures after a client deployment

- Compare the new serialized JSON with the previous known-good payload.
- Check renamed, missing, nullable, and forbidden fields.
- Validate collection sizes and numeric bounds.
- Run contract tests against the current discovery schema.
- Roll back or correct the client; do not add blind retry logic.

Authentication failures across all routes

- Check base URL and environment.
- Check secret injection and header construction.
- Check key status, rotation timing, and revocation records.
- Verify the system clock and proxy/header forwarding behavior where relevant.
- Rotate if compromise or accidental disclosure is possible.

Repeated 429 responses

- Stop immediate retries and honor Retry-After when provided.
- Measure traffic by service, route, and key identifier.
- Look for a retry loop or duplicated worker deployment.
- Check plan assignment, quota configuration, and current usage.
- Increase capacity or quota only after explaining the traffic pattern.

Repeated 5xx or 503 responses

- Enable bounded backoff and a circuit breaker.
- Preserve idempotency keys and request correlation.
- Check Zahlen health and version endpoints when reachable.
- Stop before exceeding the retry budget.
- Escalate with a safe support package and UTC timestamps.
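A circuit breaker for the 5xx playbook above can be as small as the sketch below. It is a generic illustration with assumed thresholds, not a Zahlen-provided component. It opens after a run of consecutive failures, blocks calls during a cooldown, then allows requests again until the next success or failure is recorded.

```python
import time


class CircuitBreaker:
    """Minimal consecutive-failure breaker with a cooldown period."""

    def __init__(self, failure_threshold=5, cooldown_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        """Return True when a call may be sent."""
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open, or restart the cooldown
```

The client calls allow_request() before each attempt and records the outcome afterward. A failed trial request restarts the cooldown.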

Final rule

A successful recovery client is conservative: it validates before sending, authenticates securely, retries only when safe, preserves durable identifiers, and never converts transport uncertainty into an extra payment attempt.

‌Production readiness checklist
- Every API call has a documented status-handling policy.
- Validation is performed locally with strict typed request models.
- Unknown fields fail tests before reaching production.
- API keys are never logged and 401 handling stops automatic retries.
- 409 handling reconciles the original idempotent operation.
- 429 handling honors Retry-After and uses bounded jittered backoff.
- POST retries reuse the same stable idempotency key where supported.