Reliability

What the SDK retries, what it doesn't, deadline math, idempotency semantics, and the full error class table.

The OHM SDK ships Stripe-grade reliability primitives out of the box: typed errors, exponential backoff, idempotent retries, a total-deadline budget, and a queue helper for offline mobile use. This page is the single source of truth for what we guarantee.

Retry policy

The SDK automatically retries failed requests when all three are true:

The HTTP status is retriable: 408 (request timeout), 429 (rate limited), or 5xx (server-side).
The AbortSignal hasn't fired (caller didn't cancel).
The attempt count is below maxRetries (default 2 → up to 3 total tries).

Transport-layer errors (DNS, TCP, TLS, dropped sockets) carry no HTTP status but are retried as well, subject to the same cancellation and attempt-count conditions. They surface as OHMNetworkError only after every retry is exhausted.

attempt 0 → fails → wait [0..250 ms] → attempt 1
                          ↓
                  fails → wait [0..500 ms] → attempt 2
                                  ↓
                          fails → throw the typed OHM*Error

Backoff is full-jitter exponential, capped at 8 s: Math.floor(Math.random() × min(8000, 250 × 2^attempt)).
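
The formula above can be sketched as a small function. This illustrates the documented formula and is not the SDK's internal source; `backoffDelayMs` and its injectable `random` parameter are hypothetical names introduced here so the jitter can be pinned in a test.

```typescript
// Sketch of the documented full-jitter backoff; not the SDK's actual source.
// `backoffDelayMs` and its `random` parameter are hypothetical names.
const BASE_DELAY_MS = 250;
const MAX_DELAY_MS = 8_000;

function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // The jitter ceiling doubles per attempt (250, 500, 1000, ...) and is capped at 8 s.
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling).
  return Math.floor(random() * ceiling);
}

// Print the delay for the first few attempts with the random source pinned near 1,
// which shows the ceiling each attempt can reach.
const almostOne = () => 0.999999;
console.log([0, 1, 2, 5, 6].map((attempt) => backoffDelayMs(attempt, almostOne)));
```

Because the delay is drawn from the whole range rather than added on top of a fixed base, clients that fail at the same moment tend to spread their retries out instead of retrying in lockstep.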

Deadline budget

Without totalTimeoutMs, the worst-case wall-clock latency is bounded only by the per-attempt limits: (maxRetries + 1) × timeoutMs + Σbackoff. With it, the SDK guarantees the call fails as soon as the budget is exhausted, even mid-retry.

const ohm = new OHM({
  apiKey: process.env.OHM_API_KEY!,
  timeoutMs: 30_000,        // per-attempt
  totalTimeoutMs: 60_000,   // total wall-clock across all retries
  maxRetries: 2,
});

If a retry sleep would push past the deadline, the SDK throws OHMTimeoutError before sleeping, so it never wastes budget on an attempt that is guaranteed to fail.
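
The pre-sleep check can be pictured as a pure comparison against the deadline. This is a sketch under assumptions: `canSleepBeforeRetry` is a hypothetical helper, and whether the SDK treats a sleep that ends exactly on the deadline as acceptable is not stated on this page (the sketch rejects it).

```typescript
// Sketch only: how a deadline check before a retry sleep might look.
// `canSleepBeforeRetry` is a hypothetical helper, not an SDK export.
function canSleepBeforeRetry(
  startedAtMs: number,
  nowMs: number,
  totalTimeoutMs: number,
  sleepMs: number,
): boolean {
  const deadlineMs = startedAtMs + totalTimeoutMs;
  // If the sleep alone reaches or passes the deadline, no attempt can follow it.
  return nowMs + sleepMs < deadlineMs;
}

// 59.9 s of a 60 s budget already spent: a 200 ms sleep would overshoot.
console.log(canSleepBeforeRetry(0, 59_900, 60_000, 200)); // false
// 30 s spent: the same sleep fits.
console.log(canSleepBeforeRetry(0, 30_000, 60_000, 200)); // true
```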

Set totalTimeoutMs on every production client unless you know your upstream is bounded. Without it, a flapping LLM provider could keep a mobile request open for 3+ minutes.
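
To make the deadline math concrete, the worst case without totalTimeoutMs can be worked out from the formula in this section. `worstCaseLatencyMs` is a hypothetical helper written for this example; it assumes every attempt runs to its full per-attempt timeout and every backoff lands at its jitter ceiling.

```typescript
// Worked example of (maxRetries + 1) × timeoutMs + Σ backoff.
// `worstCaseLatencyMs` is a hypothetical helper, not an SDK export.
function worstCaseLatencyMs(timeoutMs: number, maxRetries: number): number {
  // Every attempt, including the first, may run to its per-attempt timeout.
  let total = (maxRetries + 1) * timeoutMs;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Upper bound of the sleep after this attempt (jitter ceiling, capped at 8 s).
    total += Math.min(8_000, 250 * 2 ** attempt);
  }
  return total;
}

// timeoutMs = 30 s, maxRetries = 2: 3 × 30 000 + 250 + 500
console.log(worstCaseLatencyMs(30_000, 2)); // 90750
```

With totalTimeoutMs: 60_000 on the same client, the call would instead fail no later than the 60 s mark.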

Idempotency

Every unsafe HTTP method (POST / PATCH / PUT / DELETE) gets an Idempotency-Key header automatically. The server short-circuits same-key retries within 24 h to the cached response — eliminating duplicate-write bugs from mobile retries.

// Auto — the SDK generates a UUID v4 if the caller doesn't supply one.
await ohm.extract({ apiSlug: "opd", text });

// Caller-supplied — recommended for "this represents visit X" semantics.
await ohm.extract({
  apiSlug: "opd",
  text,
  idempotencyKey: `visit_${visitId}_${transcribeRunId}`,
});

// Explicit opt-out — every retry hits the server fresh.
await ohm.extract({ apiSlug: "opd", text, idempotencyKey: null });

// Disable for the whole client.
const ohm = new OHM({ apiKey, disableAutoIdempotency: true });
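
The four cases above can be summarized as one resolution function. This is a sketch of the described behavior, not the SDK's source: `resolveIdempotencyKey` is a hypothetical name, and it assumes that a caller-supplied key is still sent when disableAutoIdempotency is set (the option name suggests only auto-generation is turned off).

```typescript
import { randomUUID } from "node:crypto";

// Sketch of the key-resolution rules described above; names are hypothetical.
const UNSAFE_METHODS = ["POST", "PATCH", "PUT", "DELETE"];

function resolveIdempotencyKey(
  method: string,
  callerKey: string | null | undefined,
  disableAutoIdempotency = false,
): string | undefined {
  // Safe methods (GET, HEAD, ...) carry no key.
  if (UNSAFE_METHODS.indexOf(method.toUpperCase()) === -1) return undefined;
  // Explicit per-call opt-out.
  if (callerKey === null) return undefined;
  // A caller-supplied key always wins (assumed to hold even with the client-wide flag).
  if (typeof callerKey === "string") return callerKey;
  // Client-wide opt-out of auto-generation.
  if (disableAutoIdempotency) return undefined;
  // Default: auto-generated UUID v4.
  return randomUUID();
}

console.log(resolveIdempotencyKey("POST", "visit_42_run_7")); // visit_42_run_7
console.log(resolveIdempotencyKey("POST", null));             // undefined
```

The key has to be generated once per logical call and reused across its retries. A fresh key per attempt would defeat the server-side cache.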

Error classes — pattern-match by class

The SDK maps every wire-level failure to one of these classes. Each carries code, status, requestId, plus the raw responseHeaders and responseBody for support-ticket debugging.

Class	HTTP	When	Retried?	Notes
`OHMAuthError`	401 / 403	Key bad / expired / revoked	❌	Rotate the key.
`OHMValidationError`	400 / 422	Body fails Zod / schema	❌	`.fields[]` lists failing JSON-Schema paths.
`OHMRateLimitError`	429	Per-key per-minute cap	✅	`.retryAfterSec` from `Retry-After` header.
`OHMQuotaExceededError`	402 / 429-quota	Org-wide quota / billing	❌	`.resetAt`, `.quotaKind`. Distinct from rate limit.
`OHMNotFoundError`	404	Slug / job id gone	❌	`.availableSlugs[]` powers a slug picker.
`OHMTimeoutError`	408 / 504 / client-side	Deadline exceeded	❌	Distinct from `OHMAbortError`.
`OHMNetworkError`	0	DNS / TCP / TLS / dropped	✅ (then thrown)	Pair with `OhmQueue` on RN.
`OHMAbortError`	0	Caller `AbortSignal` fired	❌	User navigated away — not an error.
`OHMServerError`	5xx	Generic upstream failure	✅	Catch-all.
`OHMConfigError`	0	SDK init wrong	❌	Bad apiKey shape / RN live-key in bundle.

import {
  OHMRateLimitError,
  OHMValidationError,
  OHMNetworkError,
  OHMAbortError,
} from "@ohm_studio/sdk";

try {
  await ohm.extract({ apiSlug, text });
} catch (e) {
  if (e instanceof OHMRateLimitError)  return wait(e.retryAfterSec!);
  if (e instanceof OHMValidationError) return highlight(e.fields ?? []);
  if (e instanceof OHMNetworkError)    return queue.enqueue(...);   // RN
  if (e instanceof OHMAbortError)      return;                       // user cancelled
  throw e;                                                            // re-raise
}

Stable error codes (`OHM_ERROR_CODES`)

The class hierarchy may evolve; the code strings never change. Safe to store in customer logs / analytics.

import { OHM_ERROR_CODES, OHMError } from "@ohm_studio/sdk";

try { /* ... */ }
catch (e) {
  if (e instanceof OHMError) {
    if (e.code === OHM_ERROR_CODES.RATE_LIMITED) { /* ... */ }
  }
}

Per-call overrides

For a single known-slow call, override the settings per call instead of constructing a second client.

const slow = ohm.withOverrides({
  timeoutMs: 5 * 60_000,
  totalTimeoutMs: 6 * 60_000,
  maxRetries: 1,
});
await slow.audio.extract({ apiSlug, file: hourLongAudio });

withOverrides returns a thin clone — same auth, same baseUrl, same hooks — that you can use just like the original client.

Lifecycle hooks

Tap into individual phases without writing your own retry wrapper.

const ohm = new OHM({
  apiKey,
  hooks: {
    onRequest:  ({ method, url, attempt }) => log.info("→", method, url, { attempt }),
    onResponse: ({ status, latencyMs, requestId }) =>
      log.info("←", status, latencyMs + "ms", requestId),
    onError:    ({ error, attempt, willRetry }) =>
      log.warn(error.name, { attempt, willRetry }),
  },
});

All hooks are fire-and-forget — exceptions inside them are caught and logged; they never affect the request's success/failure.
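
One way to picture the fire-and-forget guarantee is a dispatcher that swallows both synchronous throws and asynchronous rejections. This is a sketch of the idea, not the SDK's implementation; `runHook` and the `Hook` type are hypothetical names.

```typescript
// Sketch of fire-and-forget hook dispatch; `runHook` is a hypothetical name,
// not the SDK's internal implementation.
type Hook<T> = (event: T) => void | Promise<void>;

function runHook<T>(hook: Hook<T> | undefined, event: T): void {
  if (!hook) return;
  try {
    // Promise.resolve covers async hooks: a rejection is logged, never rethrown.
    Promise.resolve(hook(event)).catch((err) => console.warn("hook failed:", err));
  } catch (err) {
    // Synchronous throw from the hook body.
    console.warn("hook failed:", err);
  }
}

// A throwing hook is logged and does not propagate to the caller.
runHook<{ status: number }>(() => { throw new Error("boom"); }, { status: 200 });
console.log("request continues");
```

Note that the dispatcher does not await the hook, so a slow async hook would not add latency to the request in this model.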

What the SDK does NOT do

Honest list. The SDK does not provide any of the following, so build them yourself if you need them:

No persistent retry queue between processes. If your Node process dies mid-retry, the request is lost. Use OhmQueue on RN (AsyncStorage-backed) or webhook callbacks for long jobs.
No automatic circuit breaker. If our API is down, every call still goes through its full retry cycle and throws OHMServerError; implement your own breaker if you need to fail fast.
No request signing beyond Authorization: Bearer. For HMAC-signed callbacks (async-job webhooks), see webhook-receiver.
No client-side rate limiter. You can fire 1 000 calls/sec; the server will reject most. Wrap with p-limit or similar if you need bounded concurrency.
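
Since the SDK ships no circuit breaker, a minimal one can be wrapped around its calls. The sketch below is an assumption-laden illustration: `CircuitBreaker`, its thresholds, and the error message are hypothetical, and it counts every thrown error as a failure, whereas a real breaker would probably count only OHMServerError and OHMNetworkError.

```typescript
// Minimal circuit-breaker sketch to wrap SDK calls. `CircuitBreaker` and its
// defaults are hypothetical; tune them for your own traffic.
class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 5, cooldownMs = 30_000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, useful in tests
  }

  async run<T>(call: () => Promise<T>): Promise<T> {
    if (this.openedAtMs !== null && this.now() - this.openedAtMs < this.cooldownMs) {
      // Open: fail fast without touching the network.
      throw new Error("circuit open: skipping call");
    }
    // Closed, or cooldown elapsed (half-open): let the call through.
    try {
      const result = await call();
      this.failures = 0;       // any success closes the circuit
      this.openedAtMs = null;
      return result;
    } catch (err) {
      this.failures += 1;
      // Trip (or re-trip after a failed half-open trial) once the threshold is reached.
      if (this.failures >= this.failureThreshold) this.openedAtMs = this.now();
      throw err;
    }
  }
}

const breaker = new CircuitBreaker();
// Usage with an existing client:
// await breaker.run(() => ohm.extract({ apiSlug: "opd", text }));
```

The half-open state here admits every call that arrives after the cooldown rather than a single probe, which keeps the sketch short. Restrict it to one probe if a thundering herd on recovery is a concern.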