API USER GUIDE
Listing runs • Run status • Run detail retrieval
For merchants, developers, and integration engineers
Source baseline: zahlen_deploy_0616A.tar.gz | Version 1.0 | June 2026
Administrative visibility • Tenant-safe processing • Durable operational traceability
Commercial workflow context
Payment Event → Retry Decision → Retry Outcome → Investigation Run → Reporting
Learning objectives |
By the end of this chapter, you should be able to list tenant-scoped investigation runs, read processing status, retrieve run details, and connect each run to the wider Zahlen evidence pipeline. |
Investigation runs are durable administrative records that track how uploaded or API-ingested payment evidence moves through validation, normalization, analysis, and downstream population. They provide the operational bridge between a merchant submission and the reporting, Recovery Intelligence, issuer monitoring, and governance views that follow.
Where investigation runs fit
Stage | Primary identifier | What it tells you |
Payment-event ingestion | payment_event_batch_id / batch_id | Which merchant evidence was accepted for processing. |
Background processing | upload_job_id / job_id | Which durable processing job owns the work. |
Investigation run | run or job resource | Whether processing completed and what downstream evidence was produced. |
Reporting and monitoring | durable record IDs | How the completed evidence appears in Recovery Truth, issuer health, timelines, and reports. |
Administrative boundary |
Investigation-run routes are under /v1/admin/investigation-runs. Do not assume that a merchant X-API-Key grants access. These routes require an approved administrative context, such as an authenticated operator session or enterprise administrative authorization. |
The merchant-facing API returns an upload_job_id during payment-event ingestion. Store that identifier even when your application does not have direct access to the administrative routes. It gives support and operations teams a stable correlation point.
Confirmed route family
Method | Path | Purpose |
GET | /v1/admin/investigation-runs | List investigation runs visible to the authenticated tenant context. |
GET | /v1/admin/investigation-runs/{job_id} | Retrieve detailed information for one run. |
GET | /v1/admin/investigation-runs/{job_id}/status | Retrieve the current processing status for one run. |
GET | /v1/admin/investigation-runs/readiness | Evaluate whether the investigation-run subsystem is ready and properly connected. |
Authentication model
Administrative API access is governed separately from the merchant API. The exact credential or session mechanism depends on the deployment and enterprise contract. The important rule is that tenant ownership must come from authenticated context, not from a tenant_id supplied in the URL, query string, or request body.
Use the administrative base URL and authentication method supplied by the Zahlen administrator.
Never add tenant_id to a request merely to make an empty result return data.
Treat a 401 response as an authentication problem and a 403 response as an authorization or role problem.
Treat an empty list as a possible valid tenant-scoped result until runtime and population health are checked.
Fail closed |
If the platform cannot resolve the authenticated tenant or administrative identity, access should be denied. A production system must not fall back to a default tenant. |
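A client can reinforce the tenant-ownership rule by refusing to build any administrative request that carries a tenant override. The sketch below is illustrative only: the helper name build_admin_url and the FORBIDDEN_KEYS set are assumptions introduced here, not part of the Zahlen API, and the guide does not confirm which query parameters the routes accept.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: parameter names a client might be tempted to add to widen scope.
FORBIDDEN_KEYS = {"tenant_id"}


def build_admin_url(base_url: str, path: str, params: Optional[dict] = None) -> str:
    """Build an administrative URL, refusing any client-supplied tenant override."""
    params = params or {}
    offending = FORBIDDEN_KEYS & {str(key).lower() for key in params}
    if offending:
        # Fail closed: tenant scope must come from the authenticated context.
        raise ValueError(f"refusing to send tenant override: {sorted(offending)}")
    url = base_url.rstrip("/") + path
    return f"{url}?{urlencode(params)}" if params else url
```

The server remains the authority on tenant scope. A guard like this only keeps a well-meaning client from attempting the shortcut.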
Listing investigation runs
Use the list route to discover recent runs available to the current administrative tenant. A list response commonly acts as an operational index: it helps operators find a job by date, source, status, or upload identifier before opening the detail resource.
curl -sS 'https://api.example.com/v1/admin/investigation-runs' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer ADMIN_TOKEN_REPLACE_ME' | python -m json.tool
The header above is illustrative. Use the administrative authentication contract supplied by your deployment; do not substitute X-API-Key unless that route is explicitly configured to accept it.
What to capture from a list response
Field category | Examples | Client use |
Identity | job_id, upload_job_id, run identifier | Open status and detail resources; correlate with ingestion responses. |
Ownership | tenant or merchant context | Confirm the record belongs to the authenticated scope. |
Lifecycle | status, created_at, started_at, completed_at | Sort recent work and identify stale or unfinished runs. |
Volume | total rows, valid rows, invalid rows, error count | Determine whether the run processed the expected evidence. |
Source | upload, API ingestion, source label | Trace the run back to the originating integration path. |
List-processing pattern
Request the list using the approved administrative context.
Filter or sort locally only after confirming the server already enforced tenant scope.
Select the run using a durable job identifier, not only a human-readable timestamp.
Open the status resource for active work and the detail resource for completed or failed work.
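The selection steps above can be sketched as two small helpers that operate on an already-fetched list. The field names job_id, upload_job_id, and status follow the examples in this chapter, but the exact response shape is an assumption, as are the helper names and the terminal status strings.

```python
from typing import Optional

# Assumption: terminal status strings, matching the polling example in this chapter.
TERMINAL = {"COMPLETED", "FAILED"}


def find_run(runs: list, job_id: str) -> Optional[dict]:
    """Return the run whose job_id or upload_job_id matches, else None."""
    for run in runs:
        if job_id in (run.get("job_id"), run.get("upload_job_id")):
            return run
    return None


def next_resource_path(run: dict) -> str:
    """Pick the detail resource for terminal work, the status resource otherwise."""
    base = f"/v1/admin/investigation-runs/{run['job_id']}"
    status = str(run.get("status", "")).upper()
    return base if status in TERMINAL else f"{base}/status"
```

Matching on a durable identifier rather than a timestamp keeps the lookup stable when several runs were created close together.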
Reading run status
The status route answers a narrow operational question: what state is this job in now? It is appropriate for polling while a run is still processing and for detecting terminal completion or failure.
curl -sS \
'https://api.example.com/v1/admin/investigation-runs/ZN-2026-06-16-0001/status' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer ADMIN_TOKEN_REPLACE_ME' | python -m json.tool
Status categories
Category | Meaning | Recommended behavior |
Queued or pending | Accepted but not actively processing. | Poll at a controlled interval; watch queue age. |
Running or processing | Work is active. | Continue polling with increasing intervals; do not submit a duplicate job. |
Completed | Primary run processing reached a terminal success state. | Retrieve detail and confirm downstream population. |
Failed | The run ended without successful completion. | Read error detail, preserve identifiers, and alert operations. |
Unknown or unavailable | The resource is not visible, missing, or status cannot be resolved. | Check tenant context, identifier, authorization, and runtime health. |
Completed is not the final diagnostic step |
A run can report completion while a downstream bridge or composition layer remains empty. After completion, verify Recovery Truth, radar, issuer health, monitoring events, timelines, cohort memory, and classification persistence when those features are in scope. |
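A client can map raw status values onto the categories in the table above before deciding what to do next. This is a minimal sketch: the raw strings (QUEUED, PENDING, RUNNING, PROCESSING, COMPLETED, FAILED) are assumptions derived from the category names, so confirm the actual vocabulary returned by your deployment.

```python
from enum import Enum


class StatusCategory(Enum):
    PENDING = "queued or pending"
    ACTIVE = "running or processing"
    COMPLETED = "completed"
    FAILED = "failed"
    UNKNOWN = "unknown or unavailable"


# Assumption: raw status strings; the real values may differ per deployment.
_RAW_TO_CATEGORY = {
    "QUEUED": StatusCategory.PENDING,
    "PENDING": StatusCategory.PENDING,
    "RUNNING": StatusCategory.ACTIVE,
    "PROCESSING": StatusCategory.ACTIVE,
    "COMPLETED": StatusCategory.COMPLETED,
    "FAILED": StatusCategory.FAILED,
}


def classify_status(raw) -> StatusCategory:
    """Normalize a raw status value; anything unrecognized becomes UNKNOWN."""
    return _RAW_TO_CATEGORY.get(str(raw or "").strip().upper(), StatusCategory.UNKNOWN)


def should_keep_polling(category: StatusCategory) -> bool:
    """Only queued or active work justifies another poll."""
    return category in {StatusCategory.PENDING, StatusCategory.ACTIVE}
```

Treating unrecognized values as UNKNOWN rather than as active work stops the client from polling indefinitely on a status it does not understand.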
Polling guidance
Start with a moderate interval rather than polling continuously.
Increase the interval for long-running jobs.
Stop polling when a terminal state is reached.
Apply a maximum polling duration and alert when it is exceeded.
Log job_id, request correlation, timestamps, and the final status.
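The interval and duration guidance above can be expressed as a generator of sleep durations. The default values (5-second start, doubling, 60-second cap, 15-minute budget) mirror the client example later in this chapter; the function name polling_delays is hypothetical, and the budget counts sleep time only, not request latency.

```python
def polling_delays(initial: float = 5.0, factor: float = 2.0,
                   cap: float = 60.0, max_total: float = 900.0):
    """Yield growing sleep durations until the total sleep budget would be exceeded."""
    elapsed = 0.0
    delay = initial
    while elapsed + delay <= max_total:
        yield delay
        elapsed += delay
        # Grow geometrically, but never beyond the cap.
        delay = min(cap, delay * factor)
```

A caller would sleep for each yielded value between status requests, stop early on a terminal state, and raise an alert if the generator is exhausted first.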
Retrieving run details
The detail route provides the richer record needed for troubleshooting, reporting correlation, and governance review. Use it after a run completes, fails, or appears inconsistent with the downstream dashboards.
curl -sS \
'https://api.example.com/v1/admin/investigation-runs/ZN-2026-06-16-0001' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer ADMIN_TOKEN_REPLACE_ME' | python -m json.tool
Detail categories to inspect
Category | Questions to answer |
Identity | Does the job ID match the upload_job_id captured during ingestion? |
Tenant and merchant scope | Is the record visible only within the authenticated ownership boundary? |
Source evidence | Was the run created from the expected API batch or uploaded evidence? |
Row accounting | Do total, valid, invalid, and error counts reconcile? |
Lifecycle timing | When was the run created, started, and completed? Is the duration plausible? |
Errors and warnings | Are failures actionable, repeatable, and tied to specific evidence? |
Downstream outputs | Which durable stores and monitoring layers were populated? |
Reconciliation rule
Do not evaluate one count in isolation. Reconcile the evidence chain: submitted rows → accepted rows → valid rows → invalid rows → persisted records → downstream monitoring artifacts. A discrepancy may be valid, but it should be explainable.
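As a rough illustration of the rule above, the following sketch checks a set of counts against one another and returns the discrepancies that need an explanation. The field names and the three arithmetic relationships are assumptions made for this example; the actual detail payload and its invariants may differ, and a reported discrepancy is a prompt for investigation, not proof of a fault.

```python
# Assumption: hypothetical count names; map them to the real detail fields.
REQUIRED = ("submitted_rows", "accepted_rows", "valid_rows",
            "invalid_rows", "persisted_records")


def reconcile(counts: dict) -> list:
    """Return discrepancies to explain; an empty list means the counts line up."""
    missing = [name for name in REQUIRED if counts.get(name) is None]
    if missing:
        return [f"missing count: {name}" for name in missing]
    issues = []
    if counts["accepted_rows"] > counts["submitted_rows"]:
        issues.append("accepted_rows exceeds submitted_rows")
    if counts["valid_rows"] + counts["invalid_rows"] != counts["accepted_rows"]:
        issues.append("valid_rows + invalid_rows does not equal accepted_rows")
    if counts["persisted_records"] != counts["valid_rows"]:
        issues.append("persisted_records differs from valid_rows")
    return issues
```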
Preserve identifiers |
Store upload_job_id from the merchant ingestion response and the administrative job identifier returned by the run APIs. These IDs are the most reliable way to connect a customer support case to durable processing evidence. |
End-to-end investigation-run workflow
Step | Developer or operator action | Expected evidence |
1 | Submit payment events through /v1/payment-events or /v1/payment-events/batch. | batch ID and upload_job_id |
2 | Store the returned identifiers with merchant-side event records. | durable client correlation |
3 | List administrative investigation runs when authorized. | tenant-scoped run index |
4 | Poll the selected run status at a controlled interval. | current lifecycle state |
5 | Retrieve run detail after terminal completion or failure. | row accounting, timestamps, errors, outputs |
6 | Verify downstream population and reporting. | Recovery Truth, monitoring, classification, reports |
7 | Escalate anomalies using IDs, timestamps, and evidence. | auditable incident or support record |
Relationship to the fixed retry schedule
Investigation runs analyze evidence generated by the merchant payment process. They do not change Zahlen’s canonical retry schedule. Payment attempts remain governed by the fixed sequence: Day 1, Day 2, Day 6, and Day 16. Administrative polling or rerunning a report must never create an additional payment attempt.
Operation | May repeat automatically? | Payment effect |
GET run list | Yes, with reasonable polling limits. | None |
GET run status | Yes, with controlled intervals. | None |
GET run detail | Yes. | None |
Resubmit payment evidence | Only with documented replay safeguards. | May create duplicate processing evidence if IDs are not stable. |
Execute payment retry | Only on the fixed Day 1, Day 2, Day 6, Day 16 schedule. | Creates a real authorization attempt. |
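A guard of the following kind can help keep administrative tooling from triggering a payment attempt off schedule. It is a sketch under one stated assumption: that "Day N" falls N - 1 calendar days after Day 1. The guide does not define the anchor, so confirm it against the retry-schedule documentation before relying on the date arithmetic.

```python
from datetime import date, timedelta

# Fixed sequence stated in this chapter.
RETRY_DAYS = (1, 2, 6, 16)


def is_scheduled_retry_day(day_number: int) -> bool:
    """True only for day numbers in the fixed retry sequence."""
    return day_number in RETRY_DAYS


def retry_dates(day_one: date) -> list:
    """Calendar dates for the sequence, assuming Day N is N - 1 days after Day 1."""
    return [day_one + timedelta(days=n - 1) for n in RETRY_DAYS]
```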
Error handling
HTTP status | Likely meaning | Recommended response |
200 | Request succeeded. | Parse the resource and persist identifiers or status. |
401 | Administrative authentication is missing or invalid. | Verify the approved credential and environment. |
403 | The identity is authenticated but not allowed to access the route. | Check role, enterprise entitlement, or endpoint policy. |
404 | The run is absent or not visible in the authenticated tenant scope. | Verify job ID and tenant context; do not bypass ownership filters. |
422 | A supplied parameter failed validation. | Correct the request before retrying. |
429 | Administrative rate limit or quota enforcement. | Back off and honor Retry-After when present. |
500/503 | Runtime or dependency failure. | Retry GET operations with bounded backoff; alert if sustained. |
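The table above can be encoded as a lookup so that every client path handles a given status code the same way. The action labels are hypothetical names chosen for this sketch; only the status codes and their meanings come from the table.

```python
# Assumption: action labels are illustrative, not defined by the platform.
_ACTIONS = {
    200: "parse",
    401: "fix_authentication",
    403: "check_authorization",
    404: "verify_job_id_and_tenant_context",
    422: "correct_request",
    429: "back_off",
    500: "retry_with_bounded_backoff",
    503: "retry_with_bounded_backoff",
}

# Codes for which repeating the same GET is reasonable after a delay.
RETRYABLE = {429, 500, 503}


def next_action(status_code: int) -> str:
    """Map an HTTP status code to a client action; unlisted codes are escalated."""
    return _ACTIONS.get(status_code, "escalate")


def is_retryable(status_code: int) -> bool:
    return status_code in RETRYABLE
```

Note that 401, 403, 404, and 422 are deliberately not retryable: repeating the same request will not change the outcome until the credential, role, identifier, or parameter is corrected.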
Empty list troubleshooting
Confirm the administrative identity and current environment.
Confirm the expected ingestion request returned an upload_job_id.
Check the investigation-run readiness route.
Check worker, supervisor, queue, and last-cycle health.
Verify tenant resolution before considering any data repair.
Use backfill only as a controlled remediation after the missing bridge is identified.
Do not bypass tenant isolation |
An empty list can be the correct response for the authenticated tenant. Never remove tenant filters or substitute a default production tenant merely to make data appear. |
Example client polling pattern
The following Python example illustrates a bounded status poll. Replace the illustrative bearer token with the administrative authentication contract for your deployment.
import os
import time

import requests

BASE_URL = os.environ["ZAHLEN_BASE_URL"]
ADMIN_TOKEN = os.environ["ZAHLEN_ADMIN_TOKEN"]
JOB_ID = "ZN-2026-06-16-0001"

headers = {
    "Accept": "application/json",
    "Authorization": f"Bearer {ADMIN_TOKEN}",
}

interval_seconds = 5
max_interval_seconds = 60
deadline = time.monotonic() + 15 * 60  # stop polling after 15 minutes

while time.monotonic() < deadline:
    response = requests.get(
        f"{BASE_URL}/v1/admin/investigation-runs/{JOB_ID}/status",
        headers=headers,
        timeout=20,
    )
    response.raise_for_status()
    payload = response.json()
    status = str(payload.get("status", "")).upper()
    if status in {"COMPLETED", "FAILED"}:
        print(payload)
        break
    time.sleep(interval_seconds)
    # Double the interval after each poll, up to the cap.
    interval_seconds = min(max_interval_seconds, interval_seconds * 2)
else:
    # Runs only when the deadline passes without a terminal status.
    raise TimeoutError(f"Investigation run {JOB_ID} did not finish in time")
Production improvements
Add jitter so multiple clients do not poll in lockstep.
Handle 429 using Retry-After when present.
Capture request IDs and response timestamps in logs.
Use a circuit breaker for sustained 5xx failures.
Retrieve the detail resource after COMPLETED or FAILED.
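Two of the improvements above, jitter and Retry-After handling, can be sketched with the standard library alone. The helper names are hypothetical. This version parses only the delay-in-seconds form of Retry-After and ignores the HTTP-date form, and it expects a plain mapping with the exact key "Retry-After"; both are simplifications.

```python
import random
from typing import Callable, Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return Retry-After as seconds, or None if absent or not a whole number."""
    value = str(headers.get("Retry-After", "")).strip()
    return float(value) if value.isdigit() else None


def next_delay(base: float, headers: Mapping[str, str], cap: float = 60.0,
               rng: Callable[[], float] = random.random) -> float:
    """Honor a server hint when present; otherwise apply jitter to the capped base."""
    hinted = retry_after_seconds(headers)
    if hinted is not None:
        return hinted
    # Half the delay is fixed and half is random, so clients spread out
    # without ever polling faster than half the intended interval.
    return min(cap, base) * (0.5 + 0.5 * rng())
```

In the polling loop, the result of next_delay would replace the fixed time.sleep argument, with the response headers passed in after a 429.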
Production readiness checklist
Administrative access is explicitly approved and separated from merchant API-key access.
Environment base URLs and credentials are not shared across development, staging, and production.
upload_job_id is stored for every accepted payment-event ingestion request.
Run listing is tenant-scoped and an empty list is handled as a valid possible result.
Status polling is bounded, uses increasing intervals, and stops on terminal states.
Completed runs are followed by downstream population verification when dashboards are expected to contain data.
Failed runs preserve error evidence, identifiers, and timestamps for support or incident review.
No administrative poll, retry, or remediation creates a payment attempt outside Day 1, Day 2, Day 6, and Day 16.
Backfill is used only after identifying the missing population bridge.
Logs do not expose secrets or prohibited cardholder data.
Chapter summary |
Investigation runs make background processing observable. Use the list route to find tenant-scoped work, the status route to monitor lifecycle state, and the detail route to reconcile evidence, errors, and downstream outputs. Preserve upload_job_id, respect the administrative authorization boundary, and treat completion as the start of downstream verification—not as proof that every reporting layer is populated. |
Key terms
Term | Meaning |
upload_job_id | Identifier returned during ingestion that correlates merchant evidence with background processing. |
Investigation run | Durable administrative record of processing and downstream population. |
Terminal state | A lifecycle state such as COMPLETED or FAILED that ends active polling. |
Readiness | Operational evaluation of whether investigation-run services and dependencies are available. |
Population bridge | The service path that converts completed evidence into Recovery Truth and monitoring artifacts. |