Changelog

All notable changes to OHM Studio's APIs, SDKs, and docs are listed here. We follow Semantic Versioning — breaking changes bump the major version, additive features bump the minor, fixes bump the patch.

The SDK packages live on npm:

Package	Short name	Latest	Install
`@ohm_studio/sdk`	OHM SDK	0.12.0	`npm install @ohm_studio/sdk`
`@ohm_studio/sdk-react-native`	OHM RN SDK	0.12.0	`npm install @ohm_studio/sdk-react-native`
`@ohm_studio/sdk-core`	core	0.10.0	(transitive — pulled in by the wrappers)
`@ohm_studio/cli`	Studio CLI	0.1.0	`npm install -D @ohm_studio/cli`

v0.12.0 — Hospital-deployed architecture

Release date: 2026-05-18 Affects: @ohm_studio/sdk@0.12.0, @ohm_studio/sdk-react-native@0.12.0, @ohm_studio/sdk-core@0.10.0.

Not a breaking change for working SDK code, but the operational model has changed and you should update your config:

What changed at the platform level

OHM now ships as two separately-deployed images:

Hospital orchestrator — runs at each hospital (api.<hospital>.example). Holds PHI, runs auth, audit, queue.
OHM Engine — runs only at OHM (api.ohm-engine.in). Holds the STT/LLM vendors, prompts, extraction logic, full reference data.

When you call the SDK, requests go to the hospital orchestrator (just like before). The hospital then forwards AI work to the engine internally. Customers never see the engine.

What changes for SDK users

The baseUrl you pass should be the hospital's API URL, not a single global SaaS endpoint. Each hospital has its own:

const ohm = new OHM({
  apiKey: process.env.OHM_API_KEY!,
  baseUrl: process.env.OHM_API_URL!,         // e.g. https://api.kauvery.example
});

https://api.ohm.doctor continues to work as the default — it's OHM's own demo hospital deployment. But for any production integration, set baseUrl to the actual hospital you're connecting to. Ask the hospital admin if unsure.

Response shapes

Every endpoint that returned tokensUsed / inputTokens / outputTokens now also surfaces these in a consistent usage block alongside the result. Existing fields stay for backward compatibility; new code should read the usage block.

Vendor-neutral errors

All upstream failures are now mapped to vendor-neutral codes: ENGINE_VENDOR_UNAVAILABLE, ENGINE_RATE_LIMITED, ENGINE_TIMEOUT. Upstream provider names no longer appear in any error path, so your error-handling code should not match on provider-name strings.
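
A minimal sketch of branching on the vendor-neutral codes. It assumes the thrown error exposes a string `code` property carrying one of the three codes; that property name, the `EngineCode` type, and the hint wording are illustrative assumptions, not the documented SDK surface.

```typescript
// Vendor-neutral engine codes named in this release note.
const ENGINE_CODES = [
  "ENGINE_VENDOR_UNAVAILABLE",
  "ENGINE_RATE_LIMITED",
  "ENGINE_TIMEOUT",
] as const;
type EngineCode = (typeof ENGINE_CODES)[number];

// Assumption: the error carries its code on a `code` property.
function engineCodeOf(err: unknown): EngineCode | undefined {
  if (typeof err !== "object" || err === null) return undefined;
  const code = (err as { code?: unknown }).code;
  return ENGINE_CODES.find((c) => c === code);
}

// Hypothetical user-facing hint per code; the wording is illustrative only.
function hintFor(err: unknown): string {
  switch (engineCodeOf(err)) {
    case "ENGINE_RATE_LIMITED":
      return "Too many requests. Retry after a short pause.";
    case "ENGINE_TIMEOUT":
      return "The engine took too long. Retry the request.";
    case "ENGINE_VENDOR_UNAVAILABLE":
      return "The engine is temporarily unavailable.";
    default:
      return "Unexpected error.";
  }
}
```

Anything that is not one of the three codes falls through to the default branch, so a stray provider name never drives behaviour.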

v0.11.1 — Full validation pass

Release date: 2026-05-18 Affects: @ohm_studio/sdk@0.11.1, @ohm_studio/sdk-react-native@0.11.1, @ohm_studio/sdk-core@0.9.1. Pair with the matching server release.

Plugs the gaps that v0.11.0 left open. Three classes of fixes:

Hard LLM timeout on every Studio surface

v0.11.0 added a 240-sec timeout to extract. v0.11.1 brings the same protection to every other LLM-using Studio service so no customer-facing endpoint can hang:

Service	Endpoint(s)	Ceiling
`StudioExtractService`	`/extract`, `/audio/extract`, streaming, async jobs, playground	240s
`StudioInsightsService`	`/insights`, playground insights tab	240s
`StudioSummarizeService`	`/summarize`	120s
`StudioAiAssistService`	`/ai-assist` (Studio UI prompt drafting)	120s

A hung upstream connection now surfaces a clean OHMServerError after the ceiling expires; the customer retries instead of waiting.

`chunked` / `chunkCount` propagated through ALL audio paths

In v0.11.0 these flags only appeared on the sync transcribe / extract responses. v0.11.1 adds them to:

StreamChunk (type: "transcript") — streaming consumers see the flag the instant transcription completes.
JobDetail — async-job pollers see it in the final terminal-state response.
Sync audio extract response — was missing the field even though transcribe had it; now consistent.

// Streaming
for await (const chunk of ohm.audio.extractStream({ apiSlug, file })) {
  if (chunk.type === "transcript" && chunk.chunked) {
    toast.warn(`Processed in ${chunk.chunkCount} chunks.`);
  }
}

// Async jobs
const result = await ohm.audio.jobs.poll(jobId);
if (result.chunked) {
  toast.warn(`Processed in ${result.chunkCount} chunks.`);
}

Schema migration

StudioExtractionJob gains two nullable columns:

chunked    Boolean? @default(false)
chunkCount Int?

Purely additive — prisma db push --skip-generate --accept-data-loss=false (which runs automatically on api container boot) applies it without locking the table.

Migration

npm install @ohm_studio/sdk@^0.11.1
# or:
npm install @ohm_studio/sdk-react-native@^0.11.1

No code changes required.

v0.11.0 — Long-audio chunking, file-size pre-flight, LLM hard-timeout

Release date: 2026-05-18 Affects: @ohm_studio/sdk@0.11.0, @ohm_studio/sdk-react-native@0.11.0, @ohm_studio/sdk-core@0.9.0. Pair with the matching server release.

Hardening sweep so hour-plus consultations and overweight files don't silently fail. Every change is backward-compatible — drop the new SDK in and the new behavior is automatic.

Server: long-audio chunking on the Studio extract path

The STT provider has a documented per-file ceiling around 1 hour. Until this release, anything longer was silently truncated at minute 60: the last 30 minutes of a 90-min consultation just disappeared from the transcript with no error and no warning.

Now, the server ffprobes the upload, and if it's longer than 55 min, splits into ≤55-min chunks using ffmpeg -c copy (stream-copy, no re-encode, ~instant), submits each chunk separately, and merges the transcripts. Same path covers /api/studio/v1/audio/transcribe and every /api/studio/v1/audio/extract/:apiSlug* endpoint.

The response surfaces a chunked: true + chunkCount: N pair so SDK consumers can show a tiny warning ("This recording was processed in N chunks — a sentence spanning a chunk boundary may be missing 2–3 words"). Otherwise everything looks the same to the caller.
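
The boundary arithmetic behind the split can be sketched as follows. This only shows how ≤55-min windows and the two response flags could be derived from a duration; the real server works through ffprobe and ffmpeg, and the names `planChunks` and `chunkFlags` are hypothetical.

```typescript
interface ChunkPlan {
  startSec: number;
  endSec: number;
}

// Split a recording into consecutive windows of at most maxChunkSec seconds.
// A recording at or under the limit yields a single window.
function planChunks(durationSec: number, maxChunkSec = 55 * 60): ChunkPlan[] {
  if (durationSec <= 0 || maxChunkSec <= 0) return [];
  const chunks: ChunkPlan[] = [];
  for (let start = 0; start < durationSec; start += maxChunkSec) {
    chunks.push({ startSec: start, endSec: Math.min(start + maxChunkSec, durationSec) });
  }
  return chunks;
}

// Derive the response flags from the plan.
// Assumption: an unsplit recording reports chunked: false with chunkCount 1.
function chunkFlags(durationSec: number): { chunked: boolean; chunkCount: number } {
  const count = planChunks(durationSec).length;
  return { chunked: count > 1, chunkCount: count };
}
```

A 90-minute recording yields two windows (0 to 55 min, then 55 to 90 min), which is where a sentence spanning the boundary could lose a few words.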

Server: 240-sec hard timeout on the LLM extraction

A hung upstream connection used to freeze the entire extract response indefinitely. Now wrapped in Promise.race against a 240-sec timer — plenty of room for p99 latency on ~50k-char transcripts; anything longer is genuinely broken and the caller gets a clean error to retry.
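
A generic sketch of the Promise.race pattern described above. It is not the server's code; `withTimeout` and `TimeoutError` are hypothetical names, and the 240-sec figure would simply be passed as `ms`.

```typescript
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Race `work` against a timer; whichever settles first wins.
// The timer is cleared afterwards so it cannot keep the process alive.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

Note that Promise.race only stops waiting; it does not cancel the underlying work. Cancelling the upstream call itself would need something like an AbortSignal passed to that call.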

SDK: `chunked` + `chunkCount` on transcribe / extract results

const { transcript, chunked, chunkCount } = await ohm.audio.transcribe({ file });

if (chunked) {
  toast.warn(
    `Long recording — processed in ${chunkCount} chunks. ` +
    `Sentences across chunk boundaries may be missing 2–3 words.`
  );
}

AudioExtractResult extends AudioTranscribeResult, so these fields also appear on ohm.audio.extract({...}) responses.

SDK: pre-upload file-size guard (500 MB hard cap)

A multi-GB mis-attached file (someone dropped a recorded lecture or a video) used to crawl across the wire before failing server-side. The SDK now reads the file's size / byteLength BEFORE constructing the multipart body, and throws OHMValidationError synchronously if it exceeds 500 MB (~2 hours of 16 kHz mono WAV — far above any realistic clinical encounter).

try {
  await ohm.audio.extract({ apiSlug: "opd", file: hugeFile });
} catch (err) {
  if (err instanceof OHMValidationError) {
    // "Audio file too large (812.3 MB). Maximum is 500 MB."
  }
}

Applies to: audio.transcribe, audio.extract, audio.extractStream, audio.jobs.create. The cap is a generous safety net — if your clinical use case actually needs files larger than 500 MB, open an issue and we'll discuss.

Migration

npm install @ohm_studio/sdk@^0.11.0
# or:
npm install @ohm_studio/sdk-react-native@^0.11.0

No code changes required. Optional: surface a chunk-boundary warning when result.chunked === true.

v0.10.0 — Total-deadline + auto-idempotency + bulk + warmUp + hooks

Release date: 2026-05-11 Affects: @ohm_studio/sdk@0.10.0, @ohm_studio/sdk-react-native@0.10.0, @ohm_studio/sdk-core@0.8.0. No server changes.

A 15-point reliability + performance sweep. Every customer-visible addition is backward-compatible: drop the new SDK in without code changes and you inherit the wins automatically.

Reliability — P0

`totalTimeoutMs` — bounded worst-case latency

Without it, a chatty upstream + 3 retries could keep a request open for 3 × timeoutMs + Σbackoff (~3 minutes with defaults). With it, the SDK throws OHMTimeoutError as soon as the budget is exhausted — even mid-retry, even mid-sleep.

const ohm = new OHM({
  apiKey,
  timeoutMs: 30_000,        // per-attempt
  totalTimeoutMs: 60_000,   // total wall-clock — NEW
  maxRetries: 2,
});
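
To make the deadline math concrete, here is a small sketch that computes the worst case with and without a total budget. The attempt count (maxRetries + 1) and the backoff values are assumptions for illustration; the SDK's actual backoff schedule is not stated here.

```typescript
// Worst-case wall-clock time when every attempt runs to its per-attempt timeout.
// Assumption: attempts = maxRetries + 1, with one backoff sleep before each retry.
function worstCaseMs(
  timeoutMs: number,
  maxRetries: number,
  backoffMs: number[],
  totalTimeoutMs?: number,
): number {
  const attempts = maxRetries + 1;
  const sleeps = backoffMs.slice(0, maxRetries).reduce((sum, ms) => sum + ms, 0);
  const unbounded = attempts * timeoutMs + sleeps;
  // With a total budget, the request can never outlive that budget.
  return totalTimeoutMs === undefined ? unbounded : Math.min(unbounded, totalTimeoutMs);
}
```

With the config above and hypothetical backoffs of 1 s and 2 s, the unbounded worst case is 93 s; adding totalTimeoutMs: 60_000 caps it at 60 s.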

Auto `Idempotency-Key` on every unsafe method

POST / PATCH / PUT / DELETE now get an auto-generated UUID v4 in Idempotency-Key when the caller doesn't supply one. Eliminates duplicate-write bugs from mobile retries — the server short-circuits same-key calls within 24 h to the cached response.

await ohm.extract({ apiSlug, text });              // auto-keyed
await ohm.extract({ apiSlug, text, idempotencyKey: "visit_42" }); // explicit
await ohm.extract({ apiSlug, text, idempotencyKey: null });        // opt-out
new OHM({ apiKey, disableAutoIdempotency: true });                 // disable globally

`withOverrides({ ... })` — per-call timeout / retry tuning

const slow = ohm.withOverrides({ timeoutMs: 5 * 60_000, maxRetries: 1 });
await slow.audio.extract({ apiSlug, file: hourLongAudio });

`OHMError.responseHeaders` + `responseBody`

Every server-originated error now carries the raw HTTP headers and body of the failed response. Debug "the server returned 502 but what cache header was on it?" tickets without a second round-trip.

catch (e) {
  if (e instanceof OHMError) {
    console.log(e.responseHeaders);   // { "cf-cache-status": "MISS", ... }
    console.log(e.responseBody);      // server's error envelope
  }
}

Speed — P1

`ohm.warmUp()` — drops cold-start latency by ~300 ms

const ohm = new OHM({ apiKey });
void ohm.warmUp();        // fire-and-forget at app boot
// ... first real call now ~150 ms instead of ~500 ms

Automatic `keepalive: true` on small JSON POSTs

extract, summarize, insights (anything ≤ 60 KB body) now pass keepalive: true to fetch. Saves ~30 ms on every call after the first by reusing the TCP socket. Multipart audio uploads skip this because browsers cap keepalive bodies at 64 KB. No opt-in, no code change.

`enableHttp2()` — opt-in HTTP/2 multiplexing on Node

import { enableHttp2 } from "@ohm_studio/sdk/http2";
enableHttp2();   // call once at process start

Saves 50–100 ms on parallel calls. Node-only (browsers + RN already use the platform's H2 stack). Silently no-ops elsewhere.

`streamBufferMs` reserved option

Forward-compatible knob for delta-streaming (when we ship transcript.delta chunks). No behavior change today; once delta chunks ship, set streamBufferMs: 50 to coalesce them.

Developer experience — P2

Lifecycle hooks `{ onRequest, onResponse, onError }`

Cleaner than the old onUsage for non-trivial observability — you tap into individual phases instead of getting one combined event.

const ohm = new OHM({
  apiKey,
  hooks: {
    onRequest:  ({ method, url, attempt }) => log.info("→", method, url),
    onResponse: ({ status, latencyMs, requestId }) =>
      log.info("←", status, latencyMs + "ms", requestId),
    onError:    ({ error, attempt, willRetry }) =>
      log.warn(error.name, { attempt, willRetry }),
  },
});

All hooks are fire-and-forget: exceptions are caught and never affect the request. onUsage continues to work for backward compatibility.

`User-Agent` with runtime info (Node only)

User-Agent: ohm-sdk/0.8.0 (node/22.16; darwin x64) — helps your server logs identify which client / Node / OS is misbehaving without asking the customer. Browsers + RN skip the header (forbidden).

`ohm.extractBulk([...])` — batched concurrent extract

const results = await ohm.extractBulk(
  transcripts.map((t) => ({ apiSlug: "opd-clinic", text: t })),
  { concurrency: 8, onProgress: (done, total) => console.log(`${done}/${total}`) },
);
const ok = results.filter((r) => r.ok);
const err = results.filter((r) => !r.ok);   // partial failures don't fail the batch

`jobs.poll` exponential backoff

Polling interval now grows 1.5× per attempt, capped at maxIntervalMs (default 30 s). Protects the worker from a chatty client when a job stays PROCESSING for 10+ minutes. Same first-poll latency; smarter after.
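
The growth rule can be sketched as a pure function. The 1.5× factor and 30 s cap come from the text above; the function name and the 3 s starting interval in the usage note are illustrative assumptions.

```typescript
// Build the first `count` polling intervals: each is `factor` times the previous,
// never exceeding maxIntervalMs.
function pollIntervals(
  initialMs: number,
  count: number,
  factor = 1.5,
  maxIntervalMs = 30_000,
): number[] {
  const out: number[] = [];
  let current = Math.min(initialMs, maxIntervalMs);
  for (let i = 0; i < count; i++) {
    out.push(current);
    current = Math.min(current * factor, maxIntervalMs);
  }
  return out;
}
```

Starting from 3 s, the intervals run 3 s, 4.5 s, 6.75 s and so on, reaching the 30 s cap on the seventh poll and staying there.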

Documentation

NEW /sdk/reliability — retry policy, deadline math, idempotency semantics, full error class table, what we DON'T retry.
NEW /sdk/performance — per-endpoint p50/p95, sync vs streaming vs async decision matrix, warmUp() pattern, enableHttp2() hint, bulk extraction, performance anti-patterns.
/versions compatibility matrix expanded with the v0.10 row.

Migration from 0.9 → 0.10

Zero breaking changes. Every new field is optional. To pick up the wins:

// Before
const ohm = new OHM({ apiKey, timeoutMs: 60_000 });

// After — three new lines, zero behavior loss
const ohm = new OHM({
  apiKey,
  timeoutMs: 60_000,
  totalTimeoutMs: 120_000,   // NEW — bounded worst-case
});
void ohm.warmUp();           // NEW — drops first-call latency
// enableHttp2() if you're on Node and fanning out parallel calls

v0.9.0 — Granular error classes + zero-config defaults + async-job probe

Release date: 2026-05-11 Affects: @ohm_studio/sdk@0.9.0, @ohm_studio/sdk-react-native@0.9.0, @ohm_studio/sdk-core@0.7.0. Server: new probe + comprehensive config-reference doc.

Reliability + DX hardening. Backwards-compatible.

Four new error classes

Customers can now pattern-match HTTP failure modes precisely:

import {
  OHMTimeoutError,        // 408 / 504 / client-side timeout
  OHMNetworkError,        // DNS / TCP / TLS / dropped connection
  OHMNotFoundError,       // 404 — slug not published, job purged, …
  OHMQuotaExceededError,  // 402 / 429-with-quota — distinct from rate limit
} from "@ohm_studio/sdk";

OHMNotFoundError carries availableSlugs[] when the server can offer alternatives (powers a customer-side picker without a second round-trip).

OHMQuotaExceededError carries resetAt (ISO-8601) + quotaKind ("tokens" | "audio_seconds" | "calls" | "storage") so customers can show "you'll be able to extract again at HH:MM" messaging or trigger upgrade-plan modals.

OHMTimeoutError distinguishes deadline-exceeded from OHMAbortError (user cancellation). OHMNetworkError is the canonical "you're offline → queue it locally" signal — pair with OhmQueue on RN.

Stable error-code constants exported as OHM_ERROR_CODES:

const codes = OHM_ERROR_CODES;
//   { ABORTED: "aborted", AUTH_ERROR: "auth_error", ... }

The class hierarchy may evolve; the codes don't. Use them for log analytics, alerting rules, customer-side error analytics.

New "Configuration" docs page

Single source of truth at docs.ohm.doctor/configuration listing every server env var, every per-API toggle in Studio, every SDK option, and every default. The TL;DR: hospitals don't have to configure anything to start. The page is the menu when they want to customise.

Async-job end-to-end probe

apps/api/scripts/probes/async-jobs.ts — verifies the v0.8.0 async-extraction pipeline end-to-end:

Enqueue + idempotency replay (same key returns same jobId)
Worker claim + processing + terminal state
Cancel from QUEUED → CANCELLED
Webhook delivery to a local mock receiver with HMAC verification, delivery-id presence, and payload-shape check

Run via pnpm exec tsx scripts/probes/async-jobs.ts. Use as a pre-deploy gate.

Upgrade

npm install @ohm_studio/sdk@0.9.0
npm install @ohm_studio/sdk-react-native@0.9.0

Existing 0.8.0 callers continue to work unchanged. Errors that previously surfaced as OHMServerError may now surface as the more specific subclass — they're all still OHMError so a generic catch still works.

v0.8.0 — Async-extraction jobs (long recordings, webhook callbacks)

Release date: 2026-05-10 Affects: @ohm_studio/sdk@0.8.0, @ohm_studio/sdk-react-native@0.8.0, @ohm_studio/sdk-core@0.6.0. Server: new studio_extraction_jobs table + worker.

The synchronous extract surface (HTTP POST, hold connection open until done) breaks down for audio over ~30 minutes — proxies kill the connection, mobile apps can't keep it alive when backgrounded. v0.8.0 adds an async-job pattern hospitals expect for long-recording workflows: submit, poll OR webhook, done.

New SDK surface — `ohm.audio.jobs.{create,get,cancel,poll}`

// Submit (~100ms — just an upload)
const { jobId } = await ohm.audio.jobs.create({
  apiSlug: "long-consult",
  file: bigBlob,
  webhookUrl: "https://your-backend/ohm-callback",   // optional
  patientHash: sha256(`abha:${patient.abhaId}`),
  recordedById: currentUser.id,
});

// Poll (client-side)
const result = await ohm.audio.jobs.poll(jobId, {
  intervalMs: 3000,
  maxWaitMs: 30 * 60_000,
  onProgress: (j) => setProgress(j.workerProgress),
});

// OR convenience: submit + poll in one call
const result = await ohm.audio.extractAsync({ apiSlug, file, ... });

What's signed and retried

Webhooks fire on every terminal state (extraction.job.completed or extraction.job.failed). HMAC-SHA256 signed with a per-job secret. Stripe-style retry schedule on 4xx/5xx: 5min → 30min → 2h → 5h → 10h → 24h → 24h, ~3 days total. Dead-letter after 7 attempts.

Headers on every webhook delivery:

X-OHM-Event — event name
X-OHM-Delivery-Id — UUID v4 unique per attempt (idempotency on your side)
X-OHM-Signature: sha256=<hex> — HMAC of the JSON body
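
A receiver-side verification sketch using Node's built-in crypto module. It assumes the HMAC is computed over the exact raw request body with the per-job secret as key, and that the header value is literally `sha256=` followed by lowercase hex; check the Async extraction reference for the authoritative details. `signBody` and `verifySignature` are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the header value for a body, in the `sha256=<hex>` form shown above.
function signBody(rawBody: string, secret: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Verify an X-OHM-Signature header against the raw request body.
// Assumption: the HMAC covers the exact bytes received, so verify before JSON.parse.
function verifySignature(rawBody: string, header: string, secret: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "utf8");
  const received = Buffer.from(header, "utf8");
  // timingSafeEqual throws on length mismatch, so check length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Because deliveries are retried, a receiver would typically also record X-OHM-Delivery-Id and skip ids it has already processed.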

When to use which

Audio length	Mode
0–10 min	sync (`ohm.audio.extract`)
10–30 min	streaming (`ohm.audio.extractStream`)
30 min – 1 hr	async polling
> 1 hr or mobile background	async webhook
Bulk replay (10 000 historical)	async webhook + idempotency keys
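
The table above can be expressed as a small decision helper. This is a sketch only: `chooseMode` is a hypothetical name, the bulk-replay row is left out, and sending a boundary value (exactly 10, 30, or 60 minutes) to the shorter bucket is an assumption the table does not settle.

```typescript
type ExtractMode = "sync" | "streaming" | "async-polling" | "async-webhook";

// Map recording length (minutes) to the mode suggested by the table above.
// Assumption: a boundary value falls into the shorter bucket (10 min -> sync).
function chooseMode(minutes: number, mobileBackground = false): ExtractMode {
  if (mobileBackground || minutes > 60) return "async-webhook";
  if (minutes > 30) return "async-polling";
  if (minutes > 10) return "streaming";
  return "sync";
}
```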

Server: in-process worker

DB-backed queue with FOR UPDATE SKIP LOCKED claim — multi-instance safe. Run any number of API pods on the same DB; each picks up a different job. Worker disabled per-pod via STUDIO_JOBS_WORKER_DISABLED env if you want read-only replicas.

Migration applied non-destructively

CREATE TYPE "StudioJobStatus" AS ENUM ('QUEUED','PROCESSING','COMPLETED','FAILED','CANCELLED');
CREATE TABLE "studio_extraction_jobs" (...);
-- 4 indexes including org/patientHash/createdAt for audit search.

Existing StudioInvocation rows untouched. Async-job completions mirror to StudioInvocation so per-API Logs panels include async traffic in their token / audio-second roll-ups.

Upgrade

npm install @ohm_studio/sdk@0.8.0
npm install @ohm_studio/sdk-react-native@0.8.0

Existing 0.7.0 callers continue to work unchanged. New methods are opt-in; sync extract is unchanged.

See Async extraction for the full reference, including a webhook receiver template and sync/streaming/async comparison matrix.

v0.7.0 — Hospital-readiness pack: audit, idempotency, PHI, alerts, offline queue

Release date: 2026-05-10 Affects: @ohm_studio/sdk@0.7.0, @ohm_studio/sdk-react-native@0.7.0, @ohm_studio/sdk-core@0.5.0. SDK-core minor bump because of new exports.

Backwards-compatible — every new field / method is opt-in. Existing 0.6.0 callers continue to compile and behave identically.

New methods

ohm.apis.get(slug) — full schema detail for one API (description, publishedSchema, publishedSystemPrompt, publishedInputs). Use for dynamic playground UIs / runtime validation.
ohm.invocations.searchByPatient({ patientHash, sinceDays?, limit? }) — patient-level audit search. Returns metadata-only invocation rows (timing, tokens, recordedById). Transcripts and extracted JSON are never returned via this surface.

New audit fields on every method

patientHash, recordedById, idempotencyKey accepted on:

ohm.extract, ohm.audio.transcribe, ohm.audio.extract, ohm.audio.extractStream, ohm.insights

Idempotency-Key is sent as an HTTP header automatically (Stripe / Twilio convention). Same key in same org returns cached response for 24 h — protects mobile retries from duplicate chart entries.

New helpers

restoreTokens(data, phiTokenMap) — restore PHI when an API has Studio's "Redact PHI before extraction" toggle on. The server returns phiTokenMap; this helper deep-clones data swapping tokens like [PATIENT_1] back to original strings.
OhmQueue (RN only) — offline queue. Persists failed extractions to AsyncStorage; replays on flush(). Bring-your-own storage adapter via makeAsyncStorageAdapter(AsyncStorage).
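
To illustrate what token restoration involves, here is a hypothetical re-implementation of the idea behind restoreTokens. It is not the SDK's code; the `Json` type and the function name `restoreTokensSketch` are assumptions made for this sketch.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Return a deep copy of `data` with every token (e.g. "[PATIENT_1]") found
// inside string values swapped for its original text. The input is not mutated.
function restoreTokensSketch(data: Json, phiTokenMap: Record<string, string>): Json {
  if (typeof data === "string") {
    let out = data;
    for (const [token, original] of Object.entries(phiTokenMap)) {
      out = out.split(token).join(original); // literal replace, no regex escaping needed
    }
    return out;
  }
  if (Array.isArray(data)) return data.map((item) => restoreTokensSketch(item, phiTokenMap));
  if (data !== null && typeof data === "object") {
    const copy: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(data)) {
      copy[key] = restoreTokensSketch(value, phiTokenMap);
    }
    return copy;
  }
  return data; // numbers, booleans, and null pass through unchanged
}
```

In application code, prefer the exported restoreTokens helper; the sketch only shows why a deep clone is needed so the redacted payload stays intact.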

Server-side improvements (no SDK change required)

PHI redaction — opt-in per-API toggle in Studio Settings. Server scrubs patient names (after honorifics), ABHA / Aadhaar / phone / MRN / UHID / IPD identifiers from the transcript before the LLM call.
Critical-value alerts — _alerts array emitted on every extraction with vitals: SpO₂ < 90, BP ≥ 180/120, HR < 40 or > 150, Temp ≥ 104°F, Pain ≥ 8/10, etc. Centralised threshold logic.
List-type recovery — when the LLM drops items from chained drug / diagnosis / lab dictation, regex recovery picks up the missed entries against built-in dictionaries.
Auto-retry on transient LLM failures — 2 retries with exponential backoff. Permanent shape errors bypass retry.
FHIR R4 mappers for doctor-note (Composition with OPConsultRecord profile) and nurse-shift (Composition + Procedure bundle) — ABDM-ready, kept inert until ABHA gateway integration.
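
The critical-value thresholds listed above can be sketched as a single check function. The `Vitals` field names and the alert identifiers are hypothetical, the "etc." thresholds are not covered, and reading "BP ≥ 180/120" as "systolic ≥ 180 or diastolic ≥ 120" is an assumption.

```typescript
interface Vitals {
  spo2?: number;      // %
  systolic?: number;  // mmHg
  diastolic?: number; // mmHg
  hr?: number;        // beats per minute
  tempF?: number;     // degrees Fahrenheit
  pain?: number;      // 0-10 scale
}

// Thresholds copied from the list above. Treating "BP >= 180/120" as
// "systolic >= 180 OR diastolic >= 120" is an assumption of this sketch.
function criticalAlerts(v: Vitals): string[] {
  const alerts: string[] = [];
  if (v.spo2 !== undefined && v.spo2 < 90) alerts.push("spo2_low");
  if (
    (v.systolic !== undefined && v.systolic >= 180) ||
    (v.diastolic !== undefined && v.diastolic >= 120)
  ) {
    alerts.push("bp_high");
  }
  if (v.hr !== undefined && (v.hr < 40 || v.hr > 150)) alerts.push("hr_out_of_range");
  if (v.tempF !== undefined && v.tempF >= 104) alerts.push("temp_high");
  if (v.pain !== undefined && v.pain >= 8) alerts.push("pain_severe");
  return alerts;
}
```

Missing vitals never raise an alert, which keeps a sparse dictation from producing false positives.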

Migrations applied non-destructively

-- v0.6 (audit fields)
ALTER TABLE studio_invocations
  ADD COLUMN idempotencyKey VARCHAR(128),
  ADD COLUMN patientHash VARCHAR(128),
  ADD COLUMN recordedById TEXT,
  ALTER COLUMN apiId DROP NOT NULL;

-- v0.7 (PHI redaction toggle)
ALTER TABLE studio_apis
  ADD COLUMN "redactPHI" BOOLEAN NOT NULL DEFAULT false;

Both additive — existing rows unchanged.

Upgrade

npm install @ohm_studio/sdk@0.7.0
npm install @ohm_studio/sdk-react-native@0.7.0

Server 2026-05-10 — Extraction reliability: messy clinical dictation

Release date: 2026-05-10 Affects: API server only (apps/api). No SDK update required.

Real-world clinical dictation rarely follows the textbook "label number, label number, label number" cadence. Speakers slip in connector words ("temperature is 99", "saturation was 88"), chain mentions across non-schema vitals ("saturation 88, temperature 99, weight 105"), or state partial blood pressure ("BP 120" with no diastolic). The extraction layer would intermittently drop fields in these patterns.

What changed: added a deterministic regex-based recovery pass (recoverFlatVitals) that runs on every Studio extraction with a flat numeric-vitals schema. For each vital the LLM dropped, we re-scan the transcript for the label + number with any connector word (is, was, of, at, =, comma, dash) and inject the value. The LLM's judgment still wins for fields it actually emitted — recovery only fills gaps.
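
A rough sketch of the gap-filling idea. It is not recoverFlatVitals itself: the function names are hypothetical, and it assumes each schema key is also the spoken label, which a real implementation would handle with aliases.

```typescript
// Connector words the release note lists between a label and its number.
const CONNECTORS = "(?:is|was|of|at|=|,|-)";

// Find "<label> [connector] <number>" in the transcript, case-insensitively.
function findVital(transcript: string, label: string): number | undefined {
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\s*${CONNECTORS}?\\s*(\\d+(?:\\.\\d+)?)`, "i");
  const match = transcript.match(pattern);
  return match ? Number(match[1]) : undefined;
}

// Fill only the fields the model left out; emitted values are never overwritten.
function fillGaps(
  extracted: Record<string, number | undefined>,
  transcript: string,
  labels: string[],
): Record<string, number | undefined> {
  const out = { ...extracted };
  for (const label of labels) {
    if (out[label] === undefined) out[label] = findVital(transcript, label);
  }
  return out;
}
```

The recovery pass only touches fields that are still undefined, mirroring the rule that the LLM's own output wins.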

Pain-score gets a smarter reconciler:

Explicit number stated → use it.
"No pain" / "pain free" → 0.
LLM emitted 0 but the speaker never said "pain" / "NRS" / "score" → drop the field. Catches the textbook default-normal hallucination.
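
The three rules above can be sketched as one function. The regexes are illustrative guesses at how such matching might look, not the server's patterns, and `reconcilePain` is a hypothetical name.

```typescript
// Reconcile the model's pain score against what the speaker actually said.
// Returns undefined when the field should be dropped.
function reconcilePain(transcript: string, llmValue: number | undefined): number | undefined {
  const text = transcript.toLowerCase();

  // Rule 1: an explicit number after "pain" / "NRS" wins.
  const explicit = text.match(
    /\b(?:pain|nrs)(?:\s+score)?\s*(?:is|was|of|at|=|,|-)?\s*(\d{1,2})\b/,
  );
  if (explicit) {
    const n = Number(explicit[1]);
    if (n <= 10) return n;
  }

  // Rule 2: "no pain" / "pain free" means 0.
  if (/\bno pain\b|\bpain[- ]free\b/.test(text)) return 0;

  // Rule 3: a 0 with no mention of pain, NRS, or score is a default-normal guess.
  const mentioned = /\b(?:pain|nrs|score)\b/.test(text);
  if (llmValue === 0 && !mentioned) return undefined;

  return llmValue;
}
```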

What customers see: higher field-completion rate on natural-language dictations, especially the "BP 120" partial-systolic case and the "temperature is 99" connector-word case. No code change required in your app: same response shape, just more populated.

Also tightened the inpatient-vitals schema's system prompt + per-field helpText to match. This is visible only to customers who repaste the updated schema from examples/hospital-integration/. The server-side recovery runs regardless of which prompt version you've published.

Studio 2026-05-10 — Builder JSON tab auto-classifies full schemas

Release date: 2026-05-10 Affects: studio.ohm.doctor only. No SDK or API change.

Customers pasting a full Studio schema JSON (the *.studio.json files shipped in the public examples) into the Builder → JSON tab used to hit "Top level must be an array of sections" — the textarea only accepted a bare sections array. Now the tab accepts both:

A bare sections array (canonical Builder shape — unchanged).
A full schema object — sections populates the Builder, plus systemPrompt lands in the Prompt tab, inputs lands in the Inputs tab, and any insightsSchema / insightsPrompt / insightsEnabled lands in the Insights tab. One paste fills every tab.

Any field absent from the pasted JSON is left alone — partial pastes don't wipe existing state. Toast confirms what was applied: "Imported full schema · 1 section + prompt + 2 inputs".
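
The accept-both behaviour can be sketched as a classifier over the pasted text. This is not the Builder's code; `classifyPaste`, `ImportResult`, and the tab labels in `applied` are hypothetical.

```typescript
interface ImportResult {
  sections?: unknown[];
  applied: string[]; // which tabs received content
}

// Accept either a bare sections array or a full schema object.
// Keys absent from the paste are skipped, so existing tab state is left alone.
function classifyPaste(text: string): ImportResult {
  const parsed: unknown = JSON.parse(text);
  if (Array.isArray(parsed)) return { sections: parsed, applied: ["sections"] };
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Expected a sections array or a full schema object");
  }
  const schema = parsed as Record<string, unknown>;
  const applied: string[] = [];
  const sections: unknown = schema["sections"];
  if (Array.isArray(sections)) applied.push("sections");
  if (typeof schema["systemPrompt"] === "string") applied.push("prompt");
  if (Array.isArray(schema["inputs"])) applied.push("inputs");
  if (["insightsSchema", "insightsPrompt", "insightsEnabled"].some((k) => k in schema)) {
    applied.push("insights");
  }
  return Array.isArray(sections) ? { sections, applied } : { applied };
}
```

The `applied` list is what a confirmation toast could be built from.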

v0.6.0 — Cancellation, upload progress, slug discovery, CLI codegen

Release date: 2026-05-10 Affects: @ohm_studio/sdk@0.6.0, @ohm_studio/sdk-react-native@0.6.0, @ohm_studio/sdk-core@0.4.0. New: @ohm_studio/cli@0.1.0.

Five additive items, zero breaking changes. Existing call sites continue to compile and behave identically; new options are opt-in.

`signal: AbortSignal` on every method

Every SDK method now accepts a signal?: AbortSignal:

const controller = new AbortController();

const promise = ohm.extract({
  apiSlug: "opd-clinic",
  text: transcript,
  signal: controller.signal,
});

// User clicked "Cancel"
controller.abort();

Aborts surface as a typed OHMAbortError (code: "aborted", status: 0) so you can distinguish cancellation from genuine errors. The caller-supplied signal is bridged into the SDK's internal timeout controller — either source trips the same fetch abort, and an already-aborted signal short-circuits before any work starts.
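
A sketch of how a caller signal and a timeout can be bridged into one signal. This illustrates the general technique rather than the SDK's internals; `bridgeSignal` is a hypothetical name.

```typescript
// Combine a caller-supplied signal with an internal timeout into one signal.
// Returns the bridged signal plus a cleanup function to call when the request settles.
function bridgeSignal(
  timeoutMs: number,
  callerSignal?: AbortSignal,
): { signal: AbortSignal; cleanup: () => void } {
  const controller = new AbortController();
  const onAbort = () => controller.abort();

  // An already-aborted caller signal short-circuits before any timer starts.
  if (callerSignal?.aborted) {
    controller.abort();
    return { signal: controller.signal, cleanup: () => {} };
  }

  const timer = setTimeout(onAbort, timeoutMs);
  callerSignal?.addEventListener("abort", onAbort, { once: true });

  const cleanup = () => {
    clearTimeout(timer);
    callerSignal?.removeEventListener("abort", onAbort);
  };
  return { signal: controller.signal, cleanup };
}
```

The bridged signal is what would be handed to fetch. Newer runtimes also provide AbortSignal.any and AbortSignal.timeout, which cover the same ground where available.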

React hooks auto-abort on unmount

useOhmExtract, useOhmAudioExtract, useOhmSummarize, and useRecorder all attach an internal AbortController and call abort() on unmount and on the next mutateAsync(...) call. A user navigating away mid-extract never debits a half-finished call. Each hook also exposes a cancel() method for explicit Cancel-button flows.

`onProgress` for audio uploads

audio.transcribe and audio.extract accept an onProgress callback that receives { loaded, total, percent } while the file is uploading:

await ohm.audio.extract({
  apiSlug: "opd-clinic",
  file: audioBlob,
  onProgress: (e) => setUploadPct(e.percent),
});

When onProgress is set, the SDK routes through XMLHttpRequest (which exposes upload progress on both browser and React Native); when it isn't, the existing fetch path stays unchanged with zero overhead. No-op for callers who don't pass the callback.

`ohm.apis.list()` for slug discovery

Enumerate the published Studio APIs the credential can see, without hard-coding slugs:

const apis = await ohm.apis.list();              // PUBLISHED only
const drafts = await ohm.apis.list({ status: "DRAFT" });

Returns ApiSummary[] — { slug, name, status, version, updatedAt }. Powered by the new GET /api/studio/v1/apis endpoint, which accepts either an API key (returns project-scoped APIs) or a Studio user JWT (returns organisation-wide APIs). Useful for typeahead pickers, codegen pipelines, and admin dashboards.

`@ohm_studio/cli` — codegen for typed `data`

New companion package. Generates TypeScript interfaces from your published Studio API schemas, so ohm.extract<MyApiData>(...) is fully typed against the schema you designed in Studio:

npm install -D @ohm_studio/cli
export OHM_API_KEY=ohms_live_xxx

ohm-studio pull opd-clinic              # → ./ohm-types/opd-clinic.ts
ohm-studio pull-all --out src/ohm       # every published API
ohm-studio list                         # what's published?

Pre-build hook to keep types fresh:

{
  "scripts": {
    "predev": "ohm-studio pull-all --out src/ohm",
    "prebuild": "ohm-studio pull-all --out src/ohm"
  }
}

Covers every Studio field type: text, textarea, rich-text, date, number, boolean, choice (typed enums), multi-choice, vitals-block, diagnosis-list, medication-list, allergy-list, investigation-list, referral-list, procedure-list, code-list, repeater (nested item types).

Server: `vitals-block` extraction reliability fix

The internal extraction schema for vitals-block was reshaped so the clinical engine reliably emits every vital the speaker mentioned — previously some multi-field readings dropped HR / RR / BP under load. The new shape is post-processed back to the canonical vitals.bp.{systolic, diastolic} form before any consumer sees the data, so the 15+ existing call sites (doctor app, Visit feature, FHIR mappers) continue to read vitals.bp.systolic unchanged. Server-side only — no client code change required.

The Studio extraction stack was also upgraded to a higher-tier clinical-grade model. Verified extraction quality on the same Indian-English dictation that previously dropped fields:

Probe	Before	After
`vitals-block` (7 vitals)	4/9	9/9
Vitals (flat 7-field)	7/7	7/7
Doctor note (content + plan)	2/2	2/2
Nurse-shift (SOAP+timeline)	passes	passes

Upgrade

npm install @ohm_studio/sdk@0.6.0
npm install @ohm_studio/sdk-react-native@0.6.0

No code changes required. To opt in to the new features, see the sections above and the API reference.

Server 2026-05-10 — LLM stack migrated to Vercel AI SDK doc-canonical pattern

Release date: 2026-05-10 Affects: API server only (apps/api). No SDK update required.

Internal cleanup — same HTTP request/response shapes for every customer endpoint (/extract, /audio/extract, /audio/transcribe, /summarize, /insights). Customers using @ohm_studio/sdk@0.5.3 get this for free.

What changed: the API server's structured-extraction path was reworked to follow the canonical structured-output pattern recommended by the underlying inference SDK we use end-to-end:

const result = await generateText({
  model: wrapLanguageModel({ model, middleware: extractJsonMiddleware() }),
  output: Output.object({ schema, name, description }),
  ...
});

The new middleware handles models that wrap JSON in markdown code fences. Eliminates the "Structured extraction failed after 3 attempts: No object generated" errors we'd see intermittently.
System content is now routed through the dedicated option instead of interleaved into messages[] — improves prompt-injection isolation.
Removed ~150 lines of manual workaround code: 3-attempt retry loop, temperature sweep, mode juggling, regex JSON extraction, manual fence stripping, fallback parsers, and a dead generateStructured method that had zero callers.

What customers see: higher first-attempt success rate, fewer 502 retries, identical response shapes. No code change required in your app.

v0.5.3 — README polish (translate-mode note)

Release date: 2026-05-10

Patch release — README only, no behavioural change.

The npm package READMEs now call out that audio.transcribe and audio.extract always return an English transcript regardless of the spoken language — the server runs OHM's STT layer in translate mode, so a Tamil / Hindi / Telugu / Bengali / code-mixed consult comes back as clean English text. This was already the runtime behaviour shipped in v0.5.2; v0.5.3 just surfaces it on the npm package landing page.

Same code, drop-in upgrade.

v0.5.2 — End-to-end English transcript pipeline · `foundation@v3`

Release date: 2026-05-10

Server-side hardening — no SDK API changes, drop-in upgrade. End-to-end extraction now scores 100% across English, Hindi, Tamil, and Telugu OPD recordings on our internal benchmark.

`audio.transcribe` returns English regardless of source language

Studio's /api/studio/v1/audio/transcribe (and the audio path inside audio.extract) now runs OHM's STT layer in translate mode. The transcript comes back in English no matter what the speaker spoke — Tamil, Hindi, Telugu, Bengali, or any code-mixed combination. This matches the Visit/doctor app pipeline; downstream extraction prompts only ever see clean English clinical text.

Cost: ~150 ms extra on already-English audio (the translate layer runs as a no-op).

`foundation@v3` — English-first, speaker-neutral

The OHM Clinical Foundation Block was rewritten to match the new pipeline assumption:

English-first — dropped the multi-script interpretation rules (Devanagari / Tamil / Telugu / Bengali / Gujarati examples) that are now handled at the STT layer. The prompt is leaner; the LLM has fewer conflicting instructions.
Speaker-neutral — replaces every "the doctor" with "clinician / nurse / resident / student / patient / family member". Anyone in the room can speak; clinical facts get extracted regardless of role.
Fahrenheit window rule — most clinicians say "temp 99" or "temperature 101.2" without saying "Fahrenheit". The prompt now codifies: 90–110 = °F (convert), 33–43 = °C (use as-is), 43–90 = ambiguous (omit).
Past-history vs active comorbidity — explicit routing: if the patient is on a current chronic medication for a named condition, the condition is an active diagnosis. "Known diabetic on Metformin" → both Diabetes (active) and Metformin (continuing).
Named-investigation extraction — when a test is named (CBC, ECG, CK-MB, cardiac troponin, lipid profile, 2D Echo, CT brain, sputum culture, HbA1c …), it goes into the investigation list as a separate entry, even if the result is dictated inline.
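
The Fahrenheit window rule in the list above is a prompt instruction, but the same decision can be sketched in code. This is a minimal illustration, assuming inclusive boundaries on both windows and rounding to one decimal place; the function name and return shape are hypothetical, not part of the SDK.

```typescript
// Sketch of the Fahrenheit window rule. The function name, return shape, and
// inclusive boundary handling are assumptions, not the production prompt.
type TempReading = { celsius: number; sourceUnit: "F" | "C" };

function normaliseSpokenTemp(value: number): TempReading | undefined {
  if (!Number.isFinite(value)) return undefined;
  if (value >= 90 && value <= 110) {
    // Fahrenheit window: convert to Celsius, rounded to one decimal place.
    const celsius = Math.round((((value - 32) * 5) / 9) * 10) / 10;
    return { celsius, sourceUnit: "F" };
  }
  if (value >= 33 && value <= 43) {
    // Celsius window: use the value as dictated.
    return { celsius: value, sourceUnit: "C" };
  }
  // Anything else (including the 43 to 90 gap) is ambiguous: omit, do not guess.
  return undefined;
}
```

Under these assumptions, "temp 99" becomes 37.2 °C, "temperature 38.5" stays as 38.5 °C, and a value such as 60 is dropped.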

Already-published Studio APIs are unaffected — they continue to use the prompt snapshot they were published with. Republish to pick up v3.

`hardenVitals()` post-processor

Some LLMs drop temp when forced to do °F→°C math under tight schema constraints, or hallucinate a height value by copying the weight number. A small post-processor in extract.service now:

recovers temp from the transcript if the LLM dropped it (regex around "temperature / temp / fever" + the same Fahrenheit-window conversion the prompt specifies),
deletes hallucinated height when no anchor word ("height / cm / metres / feet / inches / tall") appears in the transcript.
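
The two steps above could look roughly like the following. The real hardenVitals() in extract.service is not shown in this changelog, so the regexes, the Vitals shape, and the function names here are assumptions made for illustration only.

```typescript
// Illustrative sketch only; not the production post-processor.
interface Vitals {
  temp?: number;   // degrees Celsius
  height?: number; // centimetres
}

// Assumed patterns: a temperature keyword followed closely by a number,
// and the height anchor words quoted in the changelog.
const TEMP_PATTERN = /\b(?:temperature|temp|fever)\b[^0-9]{0,20}(\d{2,3}(?:\.\d+)?)/i;
const HEIGHT_ANCHOR = /\b(?:height|cm|metres|feet|inches|tall)\b/i;

function toCelsius(value: number): number | undefined {
  if (value >= 90 && value <= 110) return Math.round((((value - 32) * 5) / 9) * 10) / 10;
  if (value >= 33 && value <= 43) return value;
  return undefined; // ambiguous window: leave temp unset
}

function hardenVitalsSketch(vitals: Vitals, transcript: string): Vitals {
  const out: Vitals = { ...vitals };
  if (out.temp === undefined) {
    // Recover a dropped temperature from the transcript text.
    const match = TEMP_PATTERN.exec(transcript);
    const recovered = match ? toCelsius(Number(match[1])) : undefined;
    if (recovered !== undefined) out.temp = recovered;
  }
  if (out.height !== undefined && !HEIGHT_ANCHOR.test(transcript)) {
    // No anchor word anywhere in the transcript: treat the height as hallucinated.
    delete out.height;
  }
  return out;
}
```

Note that this sketch uses word boundaries, so an unspaced token such as "170cm" would not count as an anchor; a production version would need to decide how to treat that case.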

Tightened the Zod schema for the vitals-block field type:

temp description now leads with the conversion instruction.
height lower bound moved from 25 cm to 50 cm — eliminates overlap with Fahrenheit values (94–108) so out-of-band routing is impossible at the schema level too.

Extraction quality (internal benchmark)

End-to-end test: 5-minute synthesised consults in 4 languages, run through synthesised TTS → OHM's translate-mode STT → production StudioExtractService:

Script	Language	Score
OPD fever / dengue	English	100%
OPD cough / pneumonia	Hindi	100%
OPD migraine	Tamil	100%
OPD STEMI / chest pain	Telugu	100%

Vitals (5/5), diagnoses, medications, investigations, and negation handling all hit ceiling on every script.

v0.5.1 — Repository URL & metadata polish

Release date: 2026-05-10

Patch release — metadata only, no behavioural change.

repository.url on every SDK package now points at the public examples repo at github.com/open-holistic-medicine/ohm-sdk. The npmjs.com "Repository" link on each package page now resolves cleanly.
The public examples repo holds clone-and-run versions of all four examples (Node CLI, Next.js server action, Expo mic recorder, bare React Native) wired against the published SDKs.

No code changes — safe drop-in upgrade.

v0.5.0 — Speaker mode (doctor / doctor + patient)

Release date: 2026-05-10

The Studio Playground audio tab and audio.transcribe / audio.extract endpoints now accept exactly two speaker modes — doctor (default, single-speaker dictation) or doctor_patient (two-speaker conversation).

`SpeakerMode` end to end

New SpeakerMode type and SPEAKER_MODES constant exported from @ohm_studio/sdk and @ohm_studio/sdk-react-native. Use the constant to render a picker:

import { SPEAKER_MODES, type SpeakerMode } from "@ohm_studio/sdk";

<select onChange={(e) => setMode(e.target.value as SpeakerMode)}>
  {SPEAKER_MODES.map((m) => (
    <option key={m.code} value={m.code}>{m.label}</option>
  ))}
</select>

useRecorder({ apiSlug, speakerMode: "doctor_patient" }) and the imperative ohm.audio.extract({ ..., speakerMode }) / ohm.audio.transcribe({ ..., speakerMode }) thread the mode to the server.
Studio Playground gained a card-button picker above the language dropdown — exactly two intentional choices.

Server-side language code mapping

ohm.audio.transcribe({ language }) now accepts every customer-facing form — "auto", ISO short codes ("en", "hi", "ta", ...), and the provider-shaped xx-IN codes. The server normalises before calling STT and returns a clear 400 for anything outside the 23 supported languages.
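
A rough sketch of what such a normalisation step might look like follows. The changelog does not list the 23 supported languages or the server's internal form, so the five-entry set, the choice of ISO short codes as output, and the error class are all assumptions.

```typescript
// Sketch only: an illustrative subset of languages, not the real list of 23.
const SUPPORTED_LANGUAGES = new Set(["en", "hi", "ta", "te", "bn"]);

// Hypothetical error type carrying the HTTP status the server would return.
class UnsupportedLanguageError extends Error {
  readonly status = 400;
}

function normaliseLanguage(input: string): string {
  const raw = input.trim().toLowerCase();
  if (raw === "auto") return "auto";
  // Accept an ISO short code ("hi") or the provider-shaped form ("hi-IN").
  const code = /^([a-z]{2,3})(?:-in)?$/.exec(raw)?.[1];
  if (code === undefined || !SUPPORTED_LANGUAGES.has(code)) {
    throw new UnsupportedLanguageError(`Unsupported language: ${input}`);
  }
  return code;
}
```

With this sketch, "hi", "HI", and "hi-IN" all normalise to "hi", while an unknown code raises an error that a server could map to a 400 response.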

Studio Playground UX

Audio file upload (in addition to mic record). 100 MB cap.
Language dropdown (auto-detect default; English label + native script for each entry).
Audio source badge (mic vs upload, with filename and size).
The paste-a-test-mode-key footer is gone: every project now gets an auto-minted default Playground key on creation, so users no longer manage keys to use the Playground.

Default Playground key (auto-minted)

Organization projects now ship with a Playground (default) key marked isPlaygroundDefault. Test-mode only (ohms_test_*).
Cannot be revoked from the UI — the SDK service refuses with a 400 and a clear message.
Can be rotated via POST /api/studio/v1/projects/:id/playground-key/rotate (or "Rotate" button in the Keys page).

v0.4.0 — WebM duration fix · IndexedDB recovery · BareRecorder · 85 tests

Release date: 2026-05-10

Web (`@ohm_studio/sdk@0.4.0`)

WebM duration metadata patch — Chrome's MediaRecorder produces WebM files with broken duration in the EBML header (<audio> shows Infinity duration, seek breaks, some upload pipelines reject the file). The Recorder now lazy-loads fix-webm-duration on stop() and patches the header with the actual recorded duration before returning the Blob. Non-WebM blobs and patcher failures fall through unchanged.
Crash-safe IndexedDB persistence — opt in via useRecorder({ persist: true }) or use saveRecording / getRecording / listRecordings / removeRecording / clearRecordings directly. Stores the Blob to IndexedDB before extraction so a tab crash, browser close, or network drop mid-upload doesn't lose the consult. New usePendingRecordings() React hook surfaces unsent recordings on mount for one-click retry. Backed by idb-keyval (~600 B).
useNetworkStatus() hook — live online/offline state from navigator.onLine + window events. Gate uploads on flaky hospital wifi without writing the listener boilerplate.
Studio Playground dogfooding — Studio's own Playground now uses Recorder from this SDK with VU meter, 8s silence auto-stop, 10-min cap, and wake-lock. Same code path customers ship.
Next.js example refreshed — examples/nextjs-server-action now uses the modern Recorder API.

React Native (`@ohm_studio/sdk-react-native@0.4.0`)

BareRecorder — first-class adapter for react-native-audio-recorder-player (bare RN, no Expo). Same lifecycle, state machine, error codes, level metering, silence auto-stop, and max-duration cap as ExpoRecorder. Configures the native module with a clinical preset (16 kHz mono AAC at 32 kbps, Android VOICE_RECOGNITION source).
expo-audio (Expo SDK 54+) — documented as a first-class path: drive useAudioRecorder directly and pass { uri, name, type } to ohm.audio.extract. Code snippet on the React Native page.
Expo example uses useRecorder — examples/expo-mic-recorder now uses the hook with auto-extract.

Studio app reliability

Tab error boundaries — every tab in the API builder (Builder, Prompt, Inputs, Insights, Playground, API call, Logs, Versions, Settings) is wrapped in react-error-boundary. A render-time crash in one tab shows a "Reset tab" panel; the rest of the page, including unsaved drafts, keeps working.

Test coverage

85 tests across both SDK packages, covering the following scenarios:

Browser SDK (60 tests)	React Native (25 tests)
Codec cascade — webm/opus, mp4/aac, ogg/opus, fallbacks	ExpoRecorder permission flow + iOS audio session
All 7 error codes (Permission/NoMic/Busy/OverConstrained/Lost/NotSupported/Unknown)	Pause/resume + NotSupported on older SDKs
State machine (transitions, double-start, pause-while-idle, etc.)	Silence auto-stop + speech recovery
Stop returns Blob with correct mime; tracks released	Max-duration cap
Cancel cleanup; idle no-op	dB→linear math (peak/mid/floor); NaN guard
Device-lost mid-record fires `onDeviceLost` + auto-cancel	Keep-awake activate/deactivate
Level metering + silence auto-stop + reset	BareRecorder start/stop returns RNFile
Max-duration cap	BareRecorder pause/resume + NotSupported when missing
Wake lock acquired on demand; survives denial	Constructor accepts class or instance
Duration tracking excludes paused time	RecorderError shape (code/name/cause)
Timeslice chunks emit periodically
`useRecorder` lifecycle + auto-extract via `apiSlug`
IndexedDB persist round-trip + custom id + ordering + delete + clear
`useNetworkStatus` flips on online/offline events
`usePendingRecordings` loading + non-empty + empty paths

Run with pnpm --filter @ohm_studio/sdk test and pnpm --filter @ohm_studio/sdk-react-native test.

v0.3.0 — Recorder upgrade + `useRecorder()` hook

Release date: 2026-05-10 Affects: @ohm_studio/sdk and @ohm_studio/sdk-react-native.

The browser Recorder is now production-grade for clinical use. Drop-in compatible — existing new Recorder().start() / .stop() code continues to work.

Browser compatibility

Codec cascade — MediaRecorder.isTypeSupported() picks the best of audio/webm;codecs=opus → audio/mp4;codecs=mp4a.40.2 → audio/ogg;codecs=opus → audio/mp4 → audio/webm. Fixes recording on iOS Safari and older Firefox builds where WebM/Opus isn't available.
Clinical defaults — by default getUserMedia is requested with 16 kHz mono, echoCancellation, noiseSuppression, autoGainControl. Pass clinicalDefaults: false to opt out.
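
The codec cascade described above amounts to picking the first supported entry from an ordered list. Here is a minimal sketch; pickMimeType is a hypothetical name, and the support check is injected so the logic can run outside a browser. In a browser you would pass (t) => MediaRecorder.isTypeSupported(t).

```typescript
// Order copied from the cascade described in this changelog entry.
const CODEC_CASCADE = [
  "audio/webm;codecs=opus",
  "audio/mp4;codecs=mp4a.40.2",
  "audio/ogg;codecs=opus",
  "audio/mp4",
  "audio/webm",
] as const;

// Returns the first supported MIME type, or undefined when none match
// (a caller could then let the browser choose its own default).
function pickMimeType(isSupported: (mimeType: string) => boolean): string | undefined {
  return CODEC_CASCADE.find((candidate) => isSupported(candidate));
}
```

A browser that only reports MP4 support would therefore land on audio/mp4;codecs=mp4a.40.2 under this sketch.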

New features

Pause / resume — rec.pause() / rec.resume(); duration tracking excludes paused time.
VU level metering — onLevel: (rms) => … emits linear RMS 0–1 via an AudioContext AnalyserNode. Wire it to a meter UI.
Silence auto-stop — silenceAutoStop: { ms, threshold } auto-stops after sustained silence.
Wake Lock — wakeLock: true keeps tablets/phones awake during long consults.
Tab-hidden pause — pauseOnHidden: true pauses when the tab loses focus.
Streaming chunks — timesliceMs + onChunk for chunked uploads.
Hard duration cap — maxDurationMs auto-stops after N ms.
Permissions preflight — Recorder.probePermission() returns "granted" | "denied" | "prompt" | "unknown" without prompting.
Microphone enumeration — Recorder.listMicrophones() + deviceId option for picker UI.
Mic disconnect detection — onDeviceLost fires when the track ends mid-record.
Typed errors — RecorderError with code: "PermissionDenied" | "NoMicrophone" | "MicrophoneBusy" | "OverConstrained" | "DeviceLost" | "NotSupported" | "InvalidState".
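
To make the level metering and silence auto-stop items above concrete, here is a small sketch of the underlying arithmetic. It is not the SDK's implementation: the class name, the default threshold of 0.01, and the explicit timestamp argument are assumptions chosen so the logic is easy to test.

```typescript
// Linear RMS of a block of PCM samples in the range -1..1, clamped to 0..1.
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  return Math.min(1, Math.sqrt(sumSquares / samples.length));
}

// Tracks how long the level has stayed below a threshold.
class SilenceDetector {
  private silentSince: number | null = null;
  private readonly ms: number;
  private readonly threshold: number;

  constructor(ms: number, threshold = 0.01) {
    this.ms = ms;
    this.threshold = threshold;
  }

  // Feed one level reading; returns true once silence has lasted `ms`.
  update(level: number, nowMs: number): boolean {
    if (level >= this.threshold) {
      this.silentSince = null; // speech resets the timer
      return false;
    }
    if (this.silentSince === null) this.silentSince = nowMs;
    return nowMs - this.silentSince >= this.ms;
  }
}
```

A recorder could call update() from its level callback and stop when it returns true.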

`useRecorder()` React hook

A new one-call hook in @ohm_studio/sdk/react:

import { useRecorder } from "@ohm_studio/sdk/react";

const r = useRecorder({
  apiSlug: "visit-extract",          // optional — auto-extracts on stop
  silenceAutoStop: { ms: 6000 },
  maxDurationMs: 15 * 60_000,
  wakeLock: true,
});
// r.start, r.stop, r.pause, r.resume,
// r.state, r.level, r.durationSec, r.transcript, r.data

Full reference: Browser Recorder.

React Native parity

@ohm_studio/sdk-react-native@0.3.0 got the same upgrade:

Clinical recording preset by default — 16 kHz mono AAC at 32 kbps, replaces the old HIGH_QUALITY preset (which was overkill for clinical speech and produced 3-4× larger files).
iOS audio session — automatically configured so recording works in silent mode (playsInSilentModeIOS: true, allowsRecordingIOS: true).
Pause / resume — await rec.pause() / await rec.resume() on Expo SDK 50+.
dB → linear level metering — Expo emits metering in dBFS; SDK linearises to 0–1 for the onLevel / hook r.level field.
Silence auto-stop — silenceAutoStop: { ms, thresholdDb }.
Hard duration cap — maxDurationMs.
Optional keep-awake hooks — pass { activate, deactivate } to wire expo-keep-awake without us depending on it.
useRecorder() hook — same shape as web (r.start, r.stop, r.level, r.durationSec, r.transcript, r.data with auto-extract).
Typed RecorderError with codes PermissionDenied | NoMicrophone | InvalidState | Interrupted | NotSupported | Unknown.
probePermission() — non-prompting permission check.
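
The dB to linear conversion mentioned in the list above can be illustrated with the standard amplitude formula 10^(dB/20). The SDK's exact mapping is not documented in this changelog, so the -60 dB floor and the function name below are assumptions.

```typescript
// Sketch: convert a dBFS metering value to a linear 0..1 level.
function dbfsToLinear(db: number, floorDb = -60): number {
  if (Number.isNaN(db)) return 0;   // guard against missing metering values
  if (db <= floorDb) return 0;      // at or below the floor counts as silence
  if (db >= 0) return 1;            // full scale or above clamps to 1
  return Math.pow(10, db / 20);     // standard amplitude ratio
}
```

For example, -20 dBFS maps to roughly 0.1 and -6 dBFS to roughly 0.5 with this formula.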

Drop-in compatible — existing new ExpoRecorder(Audio).start()/.stop() code keeps working; the new options are all opt-in.

v0.2.0 — Streaming, Recorder, mock mode

Release date: 2026-05-09

Streaming

ohm.audio.extract.stream(...) — returns an AsyncIterable of StreamChunk events so UI can render the transcript first, then update fields when the extraction LLM call completes. Backed by Server-Sent Events; the SDK parses the event stream automatically.
New backend endpoint: POST /api/studio/v1/audio/extract/:slug/stream
Chunks: { type: "transcript", transcript, language? }, { type: "data", data, apiSlug }, { type: "done", latencyMs }, { type: "error", message, code? }.
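
The chunk shapes above map naturally onto a TypeScript discriminated union. The sketch below shows one way a UI might fold chunks into view state. The field names come from the list above, but the payload types (string, unknown), the ConsultView shape, and both helper functions are assumptions rather than SDK exports.

```typescript
type StreamChunk =
  | { type: "transcript"; transcript: string; language?: string }
  | { type: "data"; data: unknown; apiSlug: string }
  | { type: "done"; latencyMs: number }
  | { type: "error"; message: string; code?: string };

// Hypothetical view state for a consult screen.
interface ConsultView {
  transcript?: string;
  data?: unknown;
  finished: boolean;
  error?: string;
}

// Pure reducer: returns the next view state for one incoming chunk.
function applyChunk(view: ConsultView, chunk: StreamChunk): ConsultView {
  switch (chunk.type) {
    case "transcript":
      return { ...view, transcript: chunk.transcript };
    case "data":
      return { ...view, data: chunk.data };
    case "done":
      return { ...view, finished: true };
    case "error":
      return { ...view, error: chunk.message, finished: true };
  }
}

// Drains any AsyncIterable of chunks, notifying the caller after each one.
async function consume(
  stream: AsyncIterable<StreamChunk>,
  onUpdate: (view: ConsultView) => void,
): Promise<ConsultView> {
  let view: ConsultView = { finished: false };
  for await (const chunk of stream) {
    view = applyChunk(view, chunk);
    onUpdate(view);
  }
  return view;
}
```

Keeping the reducer pure means the transcript can render as soon as the first chunk arrives, and the structured fields fill in when the data chunk lands.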

Recorder utilities

Recorder from @ohm_studio/sdk — browser MediaRecorder wrapper. await rec.start() / await rec.stop() returns a Blob ready for ohm.audio.extract. isRecordingSupported() for SSR-safe detection.
ExpoRecorder from @ohm_studio/sdk-react-native — thin Expo-AV adapter. Pass the Audio namespace from expo-av; we don't take a hard dep so bare-RN customers can stay with react-native-audio-recorder-player.

Mock mode

new OHM({ mock: true }) returns deterministic canned data for every method without hitting the network. Override per method via mockResponses: { extract, transcribe, audioExtract, summarize, insights }. Streaming variant emits the same canned chunks. Useful for unit tests, Storybook, and local preview builds.

Examples

New examples/ folder with three runnable samples:

node-cli/ — CLI summarizer using ohm.summarize.
nextjs-server-action/ — browser uploader + Next.js 'use server' action calling ohm.audio.extract. Live key stays on the server.
expo-mic-recorder/ — Expo app using ExpoRecorder + ohm.audio.extract.

Tooling

Removed the ${NODE_AUTH_TOKEN} reference from the committed .npmrc to silence pnpm's "Failed to replace env in config" warning on every install. Token is now passed to publish via --//registry…:_authToken per-invocation.

v0.1.0 — Initial public release

Release date: 2026-05-09

Studio platform

Multi-tenant developer platform at studio.ohm.doctor
- Projects, APIs, keys, versions, invocations, usage, audit log
- Org-scoped tenant isolation for projects / APIs / keys / invocations
- Default landing page is a Dashboard with 24-hour activity KPIs
API Builder
- 16 field types — including 7 medical primitives (vitals, diagnoses, medications, allergies, investigations, referrals, procedures)
- Visual builder + JSON view, two-way sync
- Drag-and-drop section + field reordering, inline edit, delete
- Field palette with categories (Basic, Clinical primitives)
Prompt tab
- Editable system prompt with the OHM Clinical Foundation Block (vital sanity, negation, code-mix, narrative formatting) prepended by default
- Per-API opt-out with required reason (audit-logged)
Inputs tab — declare HTTP-time variables your callers pass
Insights tab — toggle the second-pass insights extraction with its own schema + prompt
Playground tab — text or browser-mic audio, JWT-authed, runs the draft spec without debiting customer keys
API call tab — copy-paste cURL, JS / Node, React hook, React Native, Python (cURL fallback) snippets
Logs tab — invocations table (status, latency, tokens, error) plus 24-hour and 30-day stat cards
Versions tab — list of published snapshots with current marker
Settings tab — rate limit, payload retention, foundation opt-out, archive
Keys page — mint test-mode + live-mode keys, one-time plaintext reveal, last-used timestamp & IP, revoke
AI assistant — six modes (chat / suggest fields / improve prompt / find edge cases / generate test transcript / diagnose extraction). Cost borne by OHM, never debited to customer quota.

Public extraction API (under `/api/studio/v1`)

POST /audio/transcribe — multipart audio → transcript
POST /extract/:apiSlug — text → structured JSON
POST /audio/extract/:apiSlug — audio → transcript → structured JSON in one call
POST /summarize — text → summary (4 styles: patient / handover / executive / progress-note)
POST /insights/:apiSlug — transcript → specialty insights
API-key auth, bcrypt-hashed at rest, per-key rate limit
Bundle-key safeguard: live keys blocked in React Native unless acknowledgeBundledKey: true is passed
Sanitised vendor-neutral error messages — provider names never leak into HTTP responses or SDK error stacks

SDKs

@ohm_studio/sdk (OHM SDK) — works in browser, Node 18+, Next.js (server actions, route handlers, edge runtime)
- Methods: extract, summarize, insights, audio.transcribe, audio.extract
- React hooks via @ohm_studio/sdk/react: useOhmExtract, useOhmAudioExtract, useOhmSummarize, <OhmProvider>
@ohm_studio/sdk-react-native (OHM RN SDK) — RN-shaped multipart uploads ({ uri, name, type }), native fetch, hooks subentry
Shared @ohm_studio/sdk-core — http client, errors, retry, types
Bundle size: @ohm_studio/sdk ~6 KB packed, RN ~5 KB packed
All three packages built dual ESM+CJS+.d.ts, dist-only

Docs

New site at docs.ohm.doctor (Fumadocs / Next.js / MDX)
Pages: Quickstart, Your first audio API, Authentication, Templates & schemas, Field types, Prompts & Foundation, JS SDK, React hooks, React Native SDK, API reference, RN key handling, Compliance, Cookbook (triage form on a tablet), Changelog
Static icon map (12 lucide icons) — sidebar tree icons render correctly under Turbopack ESM bundling

Roadmap

These ship in subsequent minor versions:

0.2 — streaming primitive (ohm.audio.extract.stream(...) returning AsyncIterable for live transcript + field updates)
0.2 — Recorder utility (web MediaRecorder + Expo / bare RN wrappers) so customers don't have to set up their own
0.2 — new OHM({ mock: true }) returns canned responses for tests
0.3 — Webhook callbacks for async extractions (schema reserved in v0.1)
0.3 — Custom domain mapping per project
0.4 — Marketplace / template sharing across orgs
0.4 — Native Python SDK
