Zahlen Documentation
6.1 — CSV Schemas

Phase 6 — API & Integration Documentation

This chapter explains the CSV schema patterns used by Zahlen for payment recovery observability, issuer-health signal generation, replay-safe ingestion, and operator investigation workflows.

Chapter Purpose

CSV ingestion is the most accessible integration path for organizations beginning with Zahlen. It allows payment teams to upload transaction or retry-event data without first building a real-time API or event-stream integration.

This chapter defines how CSV schemas should be structured, how fields should be interpreted, how processor-specific columns should be mapped into canonical Zahlen concepts, and how operators should troubleshoot ingestion issues.

The purpose of the CSV schema is not merely to load rows but to convert payment history into reliable operational evidence. Each column should help Zahlen understand issuer behavior, recovery timing, retry outcomes, response-code meaning, and replay-safe event lineage.

Operator Perspective

A clean CSV schema helps Zahlen understand what happened, when it happened, which issuer was involved, what response was returned, and whether the payment eventually recovered. If the schema is unclear, the analysis may still run, but its results will be harder to interpret operationally.

What is a CSV Schema?

A CSV schema is the agreed structure of a comma-separated values file. It defines which columns appear in the file, what each column means, and how Zahlen should interpret the values inside those columns.

In Zahlen, a CSV schema is also an operational contract. The schema tells the platform how to transform raw payment rows into issuer-health events, recovery signals, telemetry records, investigation artifacts, and replayable operational evidence.

A schema is considered strong when field names are clear, timestamps are consistent, issuer identity is present, payment outcome fields are interpretable, and recovery lifecycle fields can be mapped into deterministic retry windows.

A schema is considered weak when important fields are missing, response codes are ambiguous, issuer identity cannot be determined, timestamps are inconsistent, or the same concept appears under multiple conflicting column names.

Why This Matters

The CSV schema determines the quality of the evidence. Better schema quality produces more trustworthy issuer intelligence, clearer operator investigations, and stronger replay consistency.

Recommended Minimal CSV Schema

The recommended minimal CSV schema contains the fields required to identify the payment event, interpret the issuer context, classify the authorization response, and determine whether recovery occurred.

The minimal schema should be used when a team wants to run basic issuer diagnostics, recovery analysis, and response-code evaluation. More advanced schemas may include richer customer lifecycle, retry, telemetry, processor, or settlement context.

Canonical Field	Definition	Why It Matters
event_id	A unique identifier for the payment or retry event.	The event identifier helps prevent duplicate interpretation and supports replay-safe lineage.
event_at	The timestamp when the payment or authorization event occurred.	The event timestamp allows Zahlen to order events and reconstruct recovery timelines.
merchant_id	The internal merchant or tenant identifier, when available.	The merchant identifier supports tenant isolation and merchant-specific investigation.
issuer_bin	The issuer identification prefix associated with the payment card.	The issuer BIN helps Zahlen group payment behavior by issuing institution or issuer cohort.
issuer_country	The country associated with the issuer or card-issuing environment.	Issuer country supports regional analysis and cross-country degradation detection.
card_brand	The card network brand, such as visa, mastercard, amex, or discover.	Card brand allows operators to detect behavior differences across payment networks.
response_code	The canonical authorization or processor response code.	Response code is central to decline analysis, recovery interpretation, and issuer behavior modeling.
authorization_status	The normalized outcome of the authorization attempt, such as approved, declined, recovered, or failed.	Authorization status explains whether the attempt succeeded or failed in operational terms.
retry_day	The deterministic retry lifecycle day, such as Day 1, Day 2, Day 6, or Day 16, recorded as a bare number (for example, 1 or 2) in the examples in this chapter.	Retry day allows Zahlen to build recovery curves from stable retry windows.
recovered	A boolean or normalized value indicating whether the payment recovered.	Recovery outcome is required for recovery-rate calculation and cohort analysis.
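
As a minimal sketch of how the fields above could be checked before upload, the following Python snippet reports which minimal-schema columns are absent from a CSV header. The function name missing_minimal_fields and the sample data are illustrative assumptions, not part of Zahlen, and the snippet assumes the header already uses canonical names.

```python
import csv
import io

# Canonical minimal columns, copied from the table above.
MINIMAL_FIELDS = [
    "event_id", "event_at", "merchant_id", "issuer_bin", "issuer_country",
    "card_brand", "response_code", "authorization_status", "retry_day",
    "recovered",
]

def missing_minimal_fields(csv_text: str) -> list[str]:
    """Return the minimal-schema columns that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip().lower() for name in header}
    return [field for field in MINIMAL_FIELDS if field not in present]

# Hypothetical sample file with only five of the ten columns.
sample = (
    "event_id,event_at,issuer_bin,response_code,recovered\n"
    "evt_001,2026-05-27T14:10:11Z,414720,51,false\n"
)
print(missing_minimal_fields(sample))
```

Because merchant_id is defined as present only when available, a team may reasonably choose to treat its absence as acceptable rather than as a gap.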

Canonical Field Mappings

Canonical field mappings are the rules that translate source CSV columns into the field names used internally by Zahlen.

This matters because different processors, billing systems, and internal exports often use different labels for the same concept. One system may call the response code processor_code. Another may call it decline_code. Another may call it payment_response_code. Zahlen should map those fields into the canonical response_code concept whenever possible.

A canonical field is the preferred platform-level name for an operational concept. Canonical names reduce ambiguity and allow downstream services, dashboards, telemetry reports, and replay workflows to interpret data consistently.

Zahlen Canonical Field	Common Source Variants	Operational Meaning
response_code	response_code, canonical_response_code, decline_code, processor_code, payment_response_code, paymentech_code	The normalized code used to interpret authorization or decline behavior.
issuer_bin	issuer_bin, bin, card_bin, issuing_bin, bank_bin	The issuer identity prefix used for issuer-level grouping.
issuer_country	issuer_country, country, issuing_country, card_country	The country context used for regional analysis.
card_brand	card_brand, brand, scheme, network, card_network	The card network used to compare behavior across brands.
event_at	event_at, created_at, transaction_at, authorization_at, timestamp	The time associated with the payment or authorization event.
recovered	recovered, success, is_recovered, payment_recovered, recovery_status	The field that indicates whether payment recovery occurred.
retry_day	retry_day, attempt_day, billing_day, recovery_day, lifecycle_day	The relative day in the deterministic retry lifecycle.
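
The mapping table above can be sketched as a simple alias lookup. The snippet below is an illustration under stated assumptions, not Zahlen's actual mapping logic: the function name map_header, the case-insensitive matching, and the rule that an exact canonical name wins over an alias are all choices made for this example.

```python
# Alias lists copied from the mapping table above.
ALIASES = {
    "response_code": ["response_code", "canonical_response_code", "decline_code",
                      "processor_code", "payment_response_code", "paymentech_code"],
    "issuer_bin": ["issuer_bin", "bin", "card_bin", "issuing_bin", "bank_bin"],
    "issuer_country": ["issuer_country", "country", "issuing_country", "card_country"],
    "card_brand": ["card_brand", "brand", "scheme", "network", "card_network"],
    "event_at": ["event_at", "created_at", "transaction_at", "authorization_at",
                 "timestamp"],
    "recovered": ["recovered", "success", "is_recovered", "payment_recovered",
                  "recovery_status"],
    "retry_day": ["retry_day", "attempt_day", "billing_day", "recovery_day",
                  "lifecycle_day"],
}

# Reverse index: alias -> canonical name.
LOOKUP = {alias: canon for canon, aliases in ALIASES.items() for alias in aliases}

def map_header(header: list[str]) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Return (canonical -> chosen source column, canonical -> competing columns)."""
    mapping: dict[str, str] = {}
    conflicts: dict[str, list[str]] = {}
    for column in header:
        key = column.strip().lower()
        canon = LOOKUP.get(key)
        if canon is None:
            continue  # unmapped columns are left for the caller to handle
        if canon not in mapping:
            mapping[canon] = column
        elif key == canon:
            # Assumption: an exact canonical name outranks an earlier alias.
            conflicts.setdefault(canon, []).append(mapping[canon])
            mapping[canon] = column
        else:
            conflicts.setdefault(canon, []).append(column)
    return mapping, conflicts

mapping, conflicts = map_header(["txn_id", "Decline_Code", "bin", "scheme", "created_at"])
print(mapping)
print(conflicts)
```

Reporting competing columns separately, instead of silently picking one, keeps the case where the same concept appears under several column names visible to the operator.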

Important Convention

The canonical field name is response_code. Older or processor-specific labels may still appear in source files, but documentation, operator surfaces, APIs, and future integrations should treat response_code as the primary field.

Response Code Conventions

The response_code field is one of the most important fields in a Zahlen CSV file because it allows the platform to classify issuer behavior, decline patterns, recovery likelihood, and operational instability.

A response code is a compact value returned by an authorization or payment-processing system. It may indicate approval, insufficient funds, expired card, suspected fraud, issuer unavailable, invalid account, or another processor-defined condition.

Within Zahlen, response codes should be preserved as strings whenever possible. This avoids accidentally converting codes such as 05 into 5, or treating alphanumeric processor codes as numeric values. Preserving the original code improves traceability and reduces interpretation errors.

The response code should then be mapped into operational meaning. A code may be retryable, terminal, ambiguous, issuer-related, fraud-related, customer-action-related, or processor-specific. The exact meaning may vary by processor, so the first responsibility of the CSV schema is to preserve the code accurately.

Response Code Principle	Definition	Operator Interpretation
Preserve original value	Keep the response code as supplied by the source system.	Operators can trace findings back to source evidence without losing precision.
Normalize into response_code	Map variant field names into the canonical response_code field.	Dashboards and reports can interpret codes consistently.
Avoid processor lock-in	Do not make one processor’s terminology the primary platform concept.	Zahlen remains generic and extensible across processors.
Classify carefully	Do not assume every decline code has the same meaning across systems.	Operators should confirm code semantics when investigating important findings.
Retain legacy aliases when needed	Older fields may be accepted as compatibility inputs.	Backward compatibility helps ingestion without making legacy names canonical.
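
The string-preservation principle can be illustrated with Python's standard csv module, which returns every value as text. The classification table in this sketch is hypothetical and exists only to show the lookup pattern; actual code meanings vary by processor and should be confirmed against the source system. The sample codes are example values.

```python
import csv
import io

raw = "event_id,response_code\nevt_001,05\nevt_002,51\nevt_003,N7\n"

# csv.DictReader yields every value as text, so leading zeros survive.
rows = list(csv.DictReader(io.StringIO(raw)))
codes = [row["response_code"] for row in rows]
print(codes)  # ['05', '51', 'N7']

# Numeric coercion is the failure mode to avoid: the leading zero is lost.
print(str(int("05")))  # 5

def classify(code: str, table: dict[str, str]) -> str:
    """Look up an operational class; unknown codes stay unclassified."""
    return table.get(code, "unclassified")

# Hypothetical table for illustration only; real meanings vary by processor.
example_table = {"51": "retryable", "05": "ambiguous"}
print(classify("05", example_table))  # ambiguous
print(classify("5", example_table))   # unclassified
```

The last two lines show why the original value matters: once 05 has been coerced to 5, the lookup no longer finds it.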

Issuer Identity Fields

Issuer identity fields allow Zahlen to group payment behavior by the financial institution or issuer cohort involved in the authorization decision.

The issuer_bin field is usually the most useful issuer identity field in a CSV file. It identifies the issuing institution or issuer range associated with the payment card. While a BIN is not always a perfect representation of a complete issuer entity, it provides a practical grouping key for issuer-level analysis.

The issuer_country field adds geographic context. It helps operators determine whether instability is isolated to a country, appearing across countries, or concentrated in a specific regional payment environment.

The card_brand field identifies the payment network brand. This allows Zahlen to evaluate whether behavior is network-specific or consistent across brands.

Operator Perspective

Issuer identity fields answer the question: which issuer environment produced this payment behavior? Without issuer identity, Zahlen can still count outcomes, but it cannot provide strong issuer cognition.

Retry Lifecycle Fields

Retry lifecycle fields describe where the payment event belongs in the deterministic recovery sequence.

The retry_day field is the most direct lifecycle field because it maps an event into the relative retry schedule. In Zahlen’s canonical retry philosophy, the expected retry windows are Day 1, Day 2, Day 6, and Day 16, with suspension after 16 days unless a justified exception exists.

The retry attempt number may also be useful, but retry attempt and retry day are not identical. Attempt number describes sequence position. Retry day describes lifecycle timing. Zahlen prioritizes lifecycle timing because recovery curves depend on comparing equivalent retry windows.

A billing cohort date or failed_billing_at timestamp can also help establish the starting point for the retry lifecycle. This is useful when the CSV does not explicitly include retry_day, but includes enough timestamps to infer relative lifecycle position.

Lifecycle Field	Definition	Why It Matters
retry_day	The relative day in the deterministic retry lifecycle.	Required for clear recovery curve interpretation.
retry_attempt	The sequence number of the retry attempt.	Useful for ordering attempts but, by itself, less informative than lifecycle day.
failed_billing_at	The timestamp of the initial failed payment or billing event.	Allows retry windows to be inferred when retry_day is missing.
billing_cohort	The group of payments that entered recovery around the same lifecycle point.	Supports cohort recovery analysis.
suspension_at	The timestamp or expected date when the account reached suspension.	Helps evaluate whether recovery occurred before the lifecycle endpoint.
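
When retry_day is missing, it may be possible to infer it from failed_billing_at and event_at. The sketch below assumes that the retry day equals the number of whole days elapsed since the initial failure, which this chapter does not state explicitly; the function names are also illustrative.

```python
from datetime import datetime

# Canonical retry windows named in this chapter.
CANONICAL_WINDOWS = {1, 2, 6, 16}

def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # Older Python versions do not accept 'Z' in fromisoformat, so normalize it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def infer_retry_day(failed_billing_at: str, event_at: str) -> int:
    """Whole days elapsed between the initial failure and this event.

    Assumption: Day N means N whole days after failed_billing_at.
    """
    delta = parse_utc(event_at) - parse_utc(failed_billing_at)
    return delta.days

def is_canonical_window(day: int) -> bool:
    """True when the inferred day matches one of the expected retry windows."""
    return day in CANONICAL_WINDOWS

day = infer_retry_day("2026-05-27T00:00:00Z", "2026-05-28T00:00:00Z")
print(day, is_canonical_window(day))  # 1 True
```

Events whose inferred day falls outside the canonical windows are worth flagging for review rather than discarding, since the source schedule may include justified exceptions.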

Recovery Outcome Fields

Recovery outcome fields tell Zahlen whether a retry or payment event ultimately recovered value.

The recovered field should be expressed consistently. It may use true or false, yes or no, 1 or 0, recovered or not_recovered, or another normalized convention. The most important requirement is consistency. Mixed values make recovery analysis harder.
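
One way to enforce a single convention is to normalize the recovered column before upload. The following sketch covers the conventions listed above; the function name normalize_recovered and the choice to return None for unrecognized values are assumptions made for this example.

```python
from typing import Optional

# Conventions listed in this section, compared case-insensitively.
TRUE_VALUES = {"true", "yes", "1", "recovered"}
FALSE_VALUES = {"false", "no", "0", "not_recovered"}

def normalize_recovered(value: object) -> Optional[bool]:
    """Map a source value to True or False, or None when it is unrecognized."""
    token = str(value).strip().lower()
    if token in TRUE_VALUES:
        return True
    if token in FALSE_VALUES:
        return False
    return None  # leave unknown values for operator review instead of guessing

print([normalize_recovered(v) for v in ["TRUE", "no", 1, "not_recovered", "maybe"]])
# [True, False, True, False, None]
```

Returning None for unknown values, instead of defaulting to False, avoids understating recovery rates when a source system uses a convention that has not yet been mapped.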

The authorization_status field provides additional operational meaning. It can distinguish between approved, declined, failed, pending, reversed, settled, or recovered states depending on the source system.

The settlement_status field may be useful when the organization wants to distinguish authorization success from final settlement success. Authorization success means the transaction was approved. Settlement success means the funds completed the settlement lifecycle. These are related but not identical concepts.

Why This Matters

Recovery analysis depends on knowing whether a payment actually recovered. If recovery outcome fields are missing or inconsistent, Zahlen may detect issuer response-code patterns but may not be able to calculate trustworthy recovery curves.

Supported Schema Examples

The examples below show practical CSV structures that can be used as starting points. These examples are intentionally generic and processor-neutral. Processor-specific exports may be accepted through canonical mapping, but documentation should avoid making any one processor’s field names the primary model.

event_id	event_at	issuer_bin	issuer_country	card_brand	response_code	retry_day	recovered
evt_001	2026-05-27T14:10:11Z	414720	US	visa	51	1	false
evt_002	2026-05-28T14:10:11Z	414720	US	visa	00	2	true
evt_003	2026-05-27T14:10:11Z	515462	US	mastercard	91	1	false

This minimal schema supports issuer grouping, response-code interpretation, retry-window analysis, and recovery outcome measurement. It is suitable for first-pass diagnostics and operator training.
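
To illustrate retry-window analysis on the three sample rows above, the sketch below computes a recovery rate for each retry_day. The formula used here, recovered events divided by all events in the same retry window, is an assumption for illustration; this chapter does not define how Zahlen calculates recovery curves.

```python
from collections import defaultdict

# The three sample rows above, reduced to the fields this sketch needs.
rows = [
    {"event_id": "evt_001", "retry_day": "1", "recovered": "false"},
    {"event_id": "evt_002", "retry_day": "2", "recovered": "true"},
    {"event_id": "evt_003", "retry_day": "1", "recovered": "false"},
]

def recovery_curve(rows: list[dict[str, str]]) -> dict[int, float]:
    """Return recovered / total events for each retry day, ordered by day."""
    attempts: dict[int, int] = defaultdict(int)
    recoveries: dict[int, int] = defaultdict(int)
    for row in rows:
        day = int(row["retry_day"])
        attempts[day] += 1
        if row["recovered"].strip().lower() == "true":
            recoveries[day] += 1
    return {day: recoveries[day] / attempts[day] for day in sorted(attempts)}

print(recovery_curve(rows))  # {1: 0.0, 2: 1.0}
```

With only three rows the rates are not meaningful in themselves; the point is that retry_day and recovered together are enough to group outcomes by retry window.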

event_id	customer_id	failed_billing_at	event_at	issuer_bin	card_brand	response_code	authorization_status	settlement_status	recovered
evt_101	cust_001	2026-05-27T00:00:00Z	2026-05-28T00:00:00Z	546616	mastercard	51	declined	not_settled	false
evt_102	cust_001	2026-05-27T00:00:00Z	2026-05-29T00:00:00Z	546616	mastercard	00	approved	settled	true

This richer schema supports customer-lifecycle reconstruction, retry-window inference, authorization versus settlement distinction, and replay-safe investigation when customer-level identifiers are permitted within the tenant boundary.

Replay-Safe CSV Ingestion

Replay-safe CSV ingestion means that uploaded CSV data is preserved and interpreted in a way that allows the analysis to be reconstructed later.

Replay-safe ingestion requires stable event identifiers, consistent timestamps, preserved source values, canonical field mappings, clear retry lifecycle context, and durable output artifacts. Without these elements, the platform may still generate a report, but later replay or audit review may be weaker.

The goal of replay-safe ingestion is to ensure that an operator can later answer several questions. Which file was uploaded? Which rows were processed? Which fields were mapped? Which response codes were observed? Which issuer cohorts were analyzed? Which findings were generated? Which telemetry signals supported the conclusion?

Replay safety also protects governance integrity. If a finding leads to an investigation or operational recommendation, the platform should be able to reconstruct the evidence path from uploaded CSV row to issuer-health event, alert, dashboard entry, and operator action.

Replay-Safe Requirement	Definition	Why It Matters
Stable event identity	Each row should be uniquely identifiable.	Prevents duplicates and supports evidence lineage.
Consistent timestamps	Event times should use a consistent format and timezone convention.	Supports deterministic ordering and replay reconstruction.
Preserved source values	Original response codes and source fields should not be destructively altered.	Allows operators to trace findings back to source evidence.
Canonical mappings	Variant field names should map into platform-level field names.	Allows downstream services to interpret data consistently.
Durable artifacts	Summary, findings, alerts, records, and telemetry outputs should remain available after ingestion.	Supports investigation, audit, and governance review.
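
Some of these requirements can be sketched in a few lines. The example below records a file digest so the uploaded file can be identified later, lists the columns that were seen, and skips rows whose event_id has already appeared. The manifest structure and the function name build_manifest are hypothetical and do not describe Zahlen's internal format.

```python
import csv
import hashlib
import io

def build_manifest(csv_bytes: bytes) -> dict:
    """Summarize an upload so the same evidence can be identified again later."""
    digest = hashlib.sha256(csv_bytes).hexdigest()  # answers "which file?"
    reader = csv.DictReader(io.StringIO(csv_bytes.decode("utf-8")))
    seen: set[str] = set()
    duplicates: list[tuple[int, str]] = []
    accepted = 0
    for row_no, row in enumerate(reader, start=1):  # data rows, header excluded
        event_id = row.get("event_id") or ""
        if event_id in seen:
            duplicates.append((row_no, event_id))  # keep lineage of skipped rows
            continue
        seen.add(event_id)
        accepted += 1
    return {
        "file_sha256": digest,
        "columns": reader.fieldnames,
        "rows_accepted": accepted,
        "duplicates": duplicates,
    }

data = b"event_id,response_code\nevt_001,51\nevt_002,00\nevt_001,51\n"
manifest = build_manifest(data)
print(manifest["rows_accepted"], manifest["duplicates"])  # 2 [(3, 'evt_001')]
```

Re-running the function on the same bytes produces the same manifest, which is the property a later replay or audit relies on.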

Validation Expectations

CSV validation is the process of checking whether the uploaded file contains the fields and values required for meaningful analysis.

Validation should not be understood only as a technical pass-fail check. It is also an evidence-quality control. A file may be syntactically valid but operationally weak if it lacks issuer identity, response-code data, retry timing, or recovery outcome fields.

The strongest validation model classifies issues by severity. A blocking issue prevents meaningful ingestion. A warning issue allows ingestion but weakens analysis; the table below marks the more serious warnings as high warnings. An informational issue gives operators context without preventing use.

Validation Issue	Severity	Meaning
Missing file or unreadable CSV	Blocking	The file cannot be processed.
Missing response_code or equivalent field	Blocking or high warning	Decline and issuer behavior analysis will be severely limited.
Missing issuer_bin	High warning	Issuer-level grouping may be unavailable or weaker.
Missing retry_day and lifecycle timestamps	High warning	Recovery curve analysis may be limited.
Inconsistent timestamp formats	Warning	Replay ordering and timeline analysis may be less reliable.
Mixed recovered values	Warning	Recovery-rate calculations may require normalization.
Unexpected extra columns	Informational	Extra fields may be preserved or ignored depending on configuration.
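
The severity table above can be approximated with a small validator. This is a sketch under several assumptions: headers are already canonical, a missing response_code is treated as blocking (the table allows either blocking or high warning), timestamp consistency is judged by comparing digit patterns, and more than two distinct recovered values is taken as a sign of mixed conventions. The function name and severity labels are illustrative.

```python
import csv
import io
import re

KNOWN_COLUMNS = {
    "event_id", "event_at", "merchant_id", "issuer_bin", "issuer_country",
    "card_brand", "response_code", "authorization_status", "retry_day",
    "recovered", "failed_billing_at",
}

def validate_csv(csv_text: str) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for the issues found in the file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames
    if not header:
        return [("blocking", "missing file or unreadable CSV")]
    cols = set(header)
    rows = list(reader)
    issues: list[tuple[str, str]] = []
    if "response_code" not in cols:
        issues.append(("blocking", "missing response_code"))
    if "issuer_bin" not in cols:
        issues.append(("high_warning", "missing issuer_bin"))
    if "retry_day" not in cols and "failed_billing_at" not in cols:
        issues.append(("high_warning", "missing retry_day and lifecycle timestamps"))
    if "event_at" in cols:
        # Replace every digit with 9 so values in the same format look identical.
        shapes = {re.sub(r"\d", "9", row["event_at"] or "") for row in rows}
        if len(shapes) > 1:
            issues.append(("warning", "inconsistent timestamp formats"))
    if "recovered" in cols:
        tokens = {(row["recovered"] or "").strip().lower() for row in rows}
        if len(tokens) > 2:
            issues.append(("warning", "mixed recovered values"))
    extras = sorted(cols - KNOWN_COLUMNS)
    if extras:
        issues.append(("informational", "unexpected extra columns: " + ", ".join(extras)))
    return issues

sample = (
    "event_id,event_at,recovered,notes\n"
    "evt_1,2026-05-27T14:10:11Z,true,a\n"
    "evt_2,27/05/2026 14:10,1,b\n"
    "evt_3,2026-05-28T14:10:11Z,no,c\n"
)
for severity, message in validate_csv(sample):
    print(severity, "-", message)
```

Returning every issue, rather than stopping at the first, lets an operator judge evidence quality as a whole instead of fixing problems one upload at a time.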

Ingestion Troubleshooting

Ingestion troubleshooting is the process of resolving schema, formatting, mapping, and evidence-quality issues that prevent Zahlen from interpreting the CSV correctly.

Operators should begin by confirming that the CSV is readable, that the header row is present, and that the key analytical fields are included. If the file uploads successfully but produces weak or empty findings, the upload itself is rarely the problem. The usual cause is missing issuer identity, missing response-code information, missing recovery outcomes, insufficient row volume, or unclear retry lifecycle context.

Symptom	Likely Cause	Recommended Fix
Upload fails	File is not a valid CSV or is malformed.	Re-export as UTF-8 CSV and confirm the header row is present.
No issuer findings appear	issuer_bin or equivalent issuer identity is missing.	Add issuer_bin, bin, card_bin, or another mappable issuer field.
Response-code report is empty	response_code cannot be identified.	Map decline_code, processor_code, or payment_response_code into response_code.
Recovery rates are zero or blank	recovered or success field is missing or inconsistent.	Normalize recovery outcome values before upload.
Retry curve cannot be interpreted	retry_day or lifecycle timestamp is missing.	Add retry_day or provide failed_billing_at and event_at timestamps.
Truth fields show NONE or NOT_RUN	Live truth enrichment was not available or not executed.	Treat the run as telemetry-only until truth enrichment is configured.

Truth and Telemetry Fields

Truth and telemetry fields may appear in advanced ingestion outputs or reports. These fields help operators understand whether the uploaded data was linked to external truth sources, internal evidence sources, or telemetry enrichment.

Truth data refers to validated external or internal reference evidence used to confirm or enrich an observed payment behavior signal. If truth matching is not configured or no matching evidence is found, truth-related fields may show NONE, zero, or NOT_RUN values.

Telemetry data refers to operational evidence generated by the platform while processing, analyzing, enriching, or reporting on events. Telemetry helps operators understand evidence quality, enrichment status, and processing behavior.

Field	Definition	Operator Interpretation
truth_matches_found	The number of matching truth records found for the analyzed signal.	Zero means no truth evidence was matched for that signal.
truth_matched_by	The method or key used to match truth evidence.	NONE means no match method was applied or no match was found.
truth_confidence_band	The confidence band assigned to matched truth evidence.	NONE means no truth confidence was available.
external_status	The status of external enrichment or external validation.	NOT_RUN means the external process was not executed for that run.
telemetry_event_count	The number of telemetry events associated with a signal or run.	Higher counts may indicate more processing evidence, but not necessarily higher truth confidence.

Operator Note

If truth fields show NONE and external_status shows NOT_RUN, the CSV analysis may still be valid as a telemetry-supported run. It simply means live or external truth enrichment was not available for that evidence window.
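
The interpretation in the note above can be expressed as a small helper. The field names come from the table in this section, but the return labels (telemetry_only, truth_unmatched, truth_enriched) and the dictionary-shaped report are assumptions made for this sketch, not Zahlen output values.

```python
# Values this section describes as meaning "no truth evidence".
EMPTY_VALUES = {None, 0, "0", "", "NONE", "NOT_RUN"}

def evidence_mode(report: dict) -> str:
    """Classify a run by whether truth enrichment contributed evidence."""
    no_truth = (
        report.get("truth_matches_found") in EMPTY_VALUES
        and report.get("truth_matched_by") in EMPTY_VALUES
        and report.get("truth_confidence_band") in EMPTY_VALUES
    )
    if no_truth and report.get("external_status") == "NOT_RUN":
        return "telemetry_only"   # enrichment was never executed for this run
    if no_truth:
        return "truth_unmatched"  # enrichment ran, but nothing matched
    return "truth_enriched"

# Hypothetical report fragment for a run without external enrichment.
run = {
    "truth_matches_found": 0,
    "truth_matched_by": "NONE",
    "truth_confidence_band": "NONE",
    "external_status": "NOT_RUN",
    "telemetry_event_count": 42,
}
print(evidence_mode(run))  # telemetry_only
```

Note that telemetry_event_count plays no part in the decision, consistent with the table: a higher telemetry count does not imply higher truth confidence.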

Schema Governance

Schema governance is the discipline of keeping CSV field definitions stable, documented, and compatible over time.

This matters because Zahlen is designed for deterministic analysis and replay-safe operational reasoning. If field meanings change without documentation, historical comparisons may become unreliable. If processor-specific field names become embedded as platform concepts, the architecture may become harder to generalize.

The recommended schema governance model is to keep canonical field names stable, accept compatible aliases when necessary, preserve source values, and document any transformation from source field to canonical field.

Schema governance also supports operator trust. When an operator sees response_code, issuer_bin, retry_day, or recovered, the operator should know exactly what those fields mean and how they are used in analysis.

Recommended CSV Preparation Checklist

Before uploading a CSV file, operators should confirm that the file supports the intended analysis.

The file should include issuer identity if the goal is issuer intelligence. It should include response codes if the goal is decline analysis. It should include retry lifecycle fields if the goal is recovery curve analysis. It should include recovery outcomes if the goal is recovery-rate calculation. It should include consistent timestamps if the goal is replay-safe investigation.

Operators should also preserve original source values and avoid manually overwriting processor response codes, timestamps, or identifiers. If normalization is needed, it is better to add canonical columns while preserving source columns where possible.
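
Adding a canonical column while keeping the source column can be done with the standard csv module. The sketch below is one possible approach; the function name add_canonical_column is illustrative, and the function refuses to run when the canonical column already exists so that no existing values are overwritten.

```python
import csv
import io

def add_canonical_column(csv_text: str, source_col: str, canonical_col: str) -> str:
    """Append canonical_col as a copy of source_col, leaving source_col intact."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if source_col not in fieldnames:
        raise KeyError(f"source column not found: {source_col}")
    if canonical_col in fieldnames:
        raise ValueError(f"column already exists: {canonical_col}")
    fieldnames.append(canonical_col)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[canonical_col] = row[source_col]  # values stay as text, e.g. "05"
        writer.writerow(row)
    return out.getvalue()

source = "event_id,decline_code\nevt_001,05\n"
print(add_canonical_column(source, "decline_code", "response_code"))
```

The output keeps decline_code exactly as exported and adds response_code beside it, so a finding can still be traced back to the original source value.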

Checklist Item	Question to Ask	Why It Matters
Readable CSV	Can the file be opened and parsed as CSV?	Basic ingestion requires a valid file structure.
Header row	Does the first row clearly name each column?	Zahlen needs headers to map source fields to canonical fields.
Issuer identity	Can the issuer be identified?	Issuer intelligence requires issuer grouping.
Response code	Is a response_code or mappable equivalent present?	Decline and recovery behavior depend on response-code interpretation.
Retry lifecycle	Can retry timing be identified?	Recovery curves require lifecycle context.
Recovery outcome	Can recovered payments be distinguished from unrecovered payments?	Recovery rates require outcome fields.
Timestamps	Are timestamps consistent and ordered?	Replay-safe analysis depends on event ordering.

Chapter Summary

CSV schemas are the foundation of Zahlen’s most accessible integration path. A strong schema allows uploaded payment data to become issuer-health evidence, recovery intelligence, telemetry context, and replay-safe investigation material.

Canonical field mappings allow processor-specific exports to be interpreted through stable Zahlen concepts. The response_code convention protects the architecture from processor lock-in. Issuer identity fields support issuer cognition. Retry lifecycle fields support recovery curves. Recovery outcome fields support recovery-rate analysis. Replay-safe ingestion preserves the evidence path needed for auditability and governance review.

A well-structured CSV file therefore does more than start an analysis job. It creates the operational evidence base that Zahlen uses to understand issuer behavior.