Async extraction (jobs)
Submit audio, get a jobId, poll or webhook callback. For long recordings and mobile-background workflows.
For audio over 30 minutes, mobile workflows where the app may be backgrounded mid-call, or any time you can't keep an HTTP connection open until extraction completes — use the async-job pattern.
The lifecycle:
- Submit: POST audio to /audio/extract/:slug/jobs. Server returns immediately with { jobId, status: "QUEUED" }.
- Process: A worker picks up the job, transcribes with batch STT, runs structured extraction, persists the result.
- Notify: Either you poll GET /jobs/:id until status hits a terminal state, or the server fires a webhook to your backend. Both can be used together.
When to use which mode
| Audio length | Mode |
|---|---|
| 0–10 min, real-time UX | Sync — ohm.audio.extract({ file }) |
| 10–30 min, want progress | Streaming — ohm.audio.extractStream({ file }) |
| 30 min – 1 hr | Async polling — jobs.create + jobs.poll |
| > 1 hr, or mobile background at any length | Async webhook — jobs.create({ webhookUrl }) |
| Bulk replay (10 000 historical files) | Async webhook with idempotency keys |
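The table can be read as a small decision function. The sketch below is one way to encode it, not part of the SDK: the names chooseMode and ModeHints are hypothetical, and treating each boundary (10, 30, 60 minutes) as belonging to the lower band is an assumption, since the table does not say which side a boundary falls on.

```typescript
// Hypothetical helper, not an SDK export: maps the table rows to a mode name.
type Mode = "sync" | "streaming" | "async-poll" | "async-webhook";

interface ModeHints {
  durationMinutes: number;
  mobileBackground?: boolean; // the app may be backgrounded mid-call
  bulkReplay?: boolean;       // large historical backfill
}

function chooseMode(hints: ModeHints): Mode {
  // Webhook rows: > 1 hr, mobile background, or bulk replay.
  if (hints.bulkReplay || hints.mobileBackground || hints.durationMinutes > 60) {
    return "async-webhook";
  }
  if (hints.durationMinutes > 30) return "async-poll"; // 30 min to 1 hr
  if (hints.durationMinutes > 10) return "streaming";  // 10 to 30 min
  return "sync";                                       // 0 to 10 min
}

console.log(chooseMode({ durationMinutes: 45 })); // "async-poll"
```

A short recording from a mobile client still routes to the webhook mode here, because backgrounding can drop the connection regardless of audio length.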
Quick start — submit + poll
import { OHM } from "@ohm_studio/sdk";

const ohm = new OHM(process.env.OHM_API_KEY!);

// Submit
const { jobId } = await ohm.audio.jobs.create({
  apiSlug: "long-consult",
  file: bigBlob, // 30 MB / 25-min audio
});

// Poll: onProgress fires on every poll for UI updates
const result = await ohm.audio.jobs.poll(jobId, {
  intervalMs: 3000,
  maxWaitMs: 30 * 60_000,
  onProgress: (j) => console.log(`${j.status} · ${j.workerProgress}%`),
});

if (result.status === "COMPLETED") {
  console.log(result.resultTranscript);
  console.log(result.resultData);
} else {
  console.error(result.errorMessage);
}
Quick start — submit + webhook
The server POSTs a signed payload to webhookUrl when the job
completes. If your endpoint responds with a 4xx or 5xx, delivery is
retried on a Stripe-style schedule (5min → 30min → 2h → 5h → 10h
→ 24h → 24h, ~3 days total).
const { jobId } = await ohm.audio.jobs.create({
  apiSlug: "long-consult",
  file: bigBlob,
  webhookUrl: "https://your-backend.example.com/ohm-callback",
  patientHash: sha256(`abha:${patient.abhaId}`),
  recordedById: currentUser.id,
});
// Show "Processing…" in the UI; the webhook does the rest.
Your webhook receiver:
import express from "express";
import { createHmac } from "node:crypto";

const app = express();

// The handler is async because steps 3 and 4 await.
app.post("/ohm-callback", express.json(), async (req, res) => {
  // 1. Acknowledge fast: return 200 BEFORE doing any work.
  //    The retry schedule is Stripe-style (5min → 30min → 2h → 5h →
  //    10h → 24h → 24h, ~3 days total) on 4xx/5xx, but you don't want to
  //    sit in a tight retry loop with the server for any reason.
  res.status(200).end();

  // 2. Verify the signature. JOB_WEBHOOK_SECRET is the secret for this job.
  //    Note: this re-serializes the parsed body, so it only matches if the
  //    result is byte-identical to what the server signed.
  const expected = createHmac("sha256", JOB_WEBHOOK_SECRET)
    .update(JSON.stringify(req.body))
    .digest("hex");
  if (`sha256=${expected}` !== req.headers["x-ohm-signature"]) {
    console.warn("invalid signature; dropping");
    return;
  }

  // 3. Idempotent processing: same delivery ID may arrive twice.
  const deliveryId = req.headers["x-ohm-delivery-id"];
  if (await alreadyProcessed(deliveryId)) return;

  // 4. Persist + push to mobile (e.g. via Expo push, FCM, or your
  //    websocket fan-out).
  const { event, jobId, data, transcript } = req.body;
  if (event === "extraction.job.completed") {
    await emrSavedExtraction(jobId, data, transcript);
  }
});
The webhookSecret for HMAC verification is per-job: it is generated
fresh on every jobs.create and persisted server-side. To get it
back into your verifier, store it on your side when you create the
job (the SDK doesn't currently surface it; we'll add it in 0.7.1).
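Because the secret is per-job, a verifier has to look it up by jobId before checking the signature. The following is a minimal sketch under stated assumptions, not the SDK's own verifier: secretsByJob, rememberSecret, sign, and verifyWebhook are hypothetical names, the in-memory Map stands in for whatever store you use, and it assumes the server signs the raw request body and sends "sha256=" followed by the hex digest in X-OHM-Signature.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical in-memory store: jobId -> webhook secret saved at create time.
const secretsByJob = new Map<string, string>();

function rememberSecret(jobId: string, secret: string): void {
  secretsByJob.set(jobId, secret);
}

// Builds the header value in the "sha256=<hex>" shape (assumed format).
function sign(rawBody: string, secret: string): string {
  const digest = createHmac("sha256", secret).update(rawBody).digest("hex");
  return `sha256=${digest}`;
}

// Returns true only if the header matches the HMAC of the raw body.
function verifyWebhook(jobId: string, rawBody: string, header: string | undefined): boolean {
  const secret = secretsByJob.get(jobId);
  if (!secret || !header) return false;
  const expected = Buffer.from(sign(rawBody, secret));
  const received = Buffer.from(header);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verifying against the raw bytes avoids depending on JSON re-serialization producing the same string the server signed, and the constant-time comparison avoids leaking how many leading characters of a forged signature were correct.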
One-line "fire and wait" — extractAsync
When you don't want to manage create + poll separately:
const result = await ohm.audio.extractAsync({
  apiSlug: "long-consult",
  file: bigBlob,
  intervalMs: 3000,
  maxWaitMs: 30 * 60_000,
  onProgress: (j) => setProgress(j.workerProgress),
});
Identical to jobs.create followed by jobs.poll. Useful for
scripts, build pipelines, and Node-CLI flows.
Manual polling — jobs.get
If you don't want jobs.poll's built-in loop (for example, you're
polling on a UI tick or storing the jobId between sessions and need
to resume), call jobs.get yourself:
const job = await ohm.audio.jobs.get(jobId);
if (job.status === "COMPLETED") {
  console.log(job.resultTranscript, job.resultData);
} else if (job.status === "FAILED") {
  console.error(job.errorCode, job.errorMessage);
} else {
  // QUEUED | PROCESSING: show job.workerProgress (0–100) and check again
}
The returned JobDetail carries:
| Field | When populated |
|---|---|
| status | always: QUEUED · PROCESSING · COMPLETED · FAILED · CANCELLED |
| workerProgress | during PROCESSING (0–100) |
| audioSeconds · totalTokens | once transcribe + extract land |
| resultTranscript · resultData · resultLanguage | only on COMPLETED |
| errorCode · errorMessage | only on FAILED |
| webhookAttempts (array) | each delivery attempt + signature dead-letter trail |
Polling rule of thumb: every 2 s during PROCESSING is plenty, since the worker phases (transcribe → extract) usually take about 4 s and 6 s.
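For the resume-between-sessions case, a hand-rolled loop around jobs.get might look like the sketch below. It is an illustration rather than the SDK's jobs.poll: pollUntilTerminal and JobSnapshot are hypothetical names, the fetcher is injected so the loop can be exercised without a network, and the defaults (2 s interval, 15 min cap) simply echo the figures quoted in this page.

```typescript
type JobStatus = "QUEUED" | "PROCESSING" | "COMPLETED" | "FAILED" | "CANCELLED";

// Minimal shape this loop needs; the real JobDetail carries more fields.
interface JobSnapshot {
  status: JobStatus;
  workerProgress?: number;
}

const TERMINAL: ReadonlySet<JobStatus> = new Set<JobStatus>(["COMPLETED", "FAILED", "CANCELLED"]);

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: polls an injected fetcher until a terminal status or timeout.
async function pollUntilTerminal<J extends JobSnapshot>(
  getJob: (jobId: string) => Promise<J>,
  jobId: string,
  opts: { intervalMs?: number; maxWaitMs?: number } = {},
): Promise<J> {
  const intervalMs = opts.intervalMs ?? 2000;
  const maxWaitMs = opts.maxWaitMs ?? 15 * 60_000;
  const deadline = Date.now() + maxWaitMs;
  for (;;) {
    const job = await getJob(jobId);
    if (TERMINAL.has(job.status)) return job;
    // Give up rather than sleeping past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`job ${jobId} still ${job.status} after ${maxWaitMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

With the SDK client from the quick start, a call would presumably look like pollUntilTerminal((id) => ohm.audio.jobs.get(id), savedJobId), where savedJobId is whatever you persisted between sessions.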
Cancellation
Cancellation is best-effort. QUEUED jobs cancel immediately. A PROCESSING job that is already past a phase checkpoint (transcribe done / extract done) may still run to completion, so read the status on the returned record to see the actual final state.
const cancelled = await ohm.audio.jobs.cancel(jobId);
console.log(cancelled.status); // "CANCELLED" or actual final state
What the SDK guarantees
| Behaviour | Guarantee |
|---|---|
| Idempotency on create | Same Idempotency-Key returns the existing job; never duplicates |
| Worker progress | workerProgress 0-100 reflects transcribe → extract phase progress |
| Webhook retry | Stripe-style schedule on 4xx/5xx, dead-letter after 7 attempts |
| Webhook signing | HMAC-SHA256 with X-OHM-Signature header |
| Webhook idempotency | X-OHM-Delivery-Id UUID v4 unique per delivery attempt |
| Audit fields | patientHash + recordedById stored on the job + every invocation row |
| Multi-instance worker | DB-backed FOR UPDATE SKIP LOCKED claim — N pods safe |
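To see what the retry guarantee means in wall-clock terms, the schedule quoted in the webhook quick start (5min → 30min → 2h → 5h → 10h → 24h → 24h) can be summed. This is plain arithmetic, not SDK code; it assumes each delay is measured from the previous attempt, which the page does not state explicitly.

```typescript
// Delays between attempts, in minutes: 5min, 30min, 2h, 5h, 10h, 24h, 24h.
const RETRY_DELAYS_MIN: readonly number[] = [5, 30, 120, 300, 600, 1440, 1440];

// Cumulative minutes after the first failed delivery at which each retry fires.
function retryOffsetsMinutes(delays: readonly number[]): number[] {
  let elapsed = 0;
  return delays.map((d) => (elapsed += d));
}

const offsets = retryOffsetsMinutes(RETRY_DELAYS_MIN);
const totalDays = offsets[offsets.length - 1] / (60 * 24);

console.log(offsets);              // [5, 35, 155, 455, 1055, 2495, 3935]
console.log(totalDays.toFixed(2)); // "2.73"
```

Under that assumption the last retry lands roughly 2.7 days after the first failure, which matches the "~3 days total" figure and suggests keeping any receiver-side dedupe records for at least that long.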
Limits
- Single audio file ≤ 500 MB (SDK rejects synchronously with OHMValidationError; server cap configurable via STUDIO_MAX_AUDIO_BYTES).
- Any duration is supported. Recordings longer than 55 min are split server-side, transcribed in parallel, and merged. The final JobDetail has chunked: true + chunkCount so your UI can surface a chunk-boundary warning.
- Max poll wait: 15 min default (override via maxWaitMs).
- Webhook delivery: 30s timeout per attempt. After 7 retries the delivery is dead-lettered (visible in the job's webhookAttempts log via jobs.get).
- Audio retained on server only until terminal state. Once the job completes or fails, the upload is deleted.
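A client can check the size and duration limits before uploading, to fail early and to decide whether to prepare a chunk-boundary notice in the UI. The sketch below is hypothetical (preflightAudio is not an SDK function), reads "500 MB" as 500 × 1024 × 1024 bytes, which is an assumption, and only predicts chunking from the 55-minute figure; the authoritative signal remains chunked on the final JobDetail.

```typescript
// Assumption: "500 MB" interpreted as mebibytes; adjust if the server counts 10^6.
const MAX_AUDIO_BYTES = 500 * 1024 * 1024;
const CHUNK_THRESHOLD_MIN = 55;

interface Preflight {
  ok: boolean;
  reason?: string;
  expectChunked: boolean; // a prediction only; trust JobDetail.chunked afterwards
}

// Hypothetical client-side check mirroring the limits listed above.
function preflightAudio(sizeBytes: number, durationMinutes: number): Preflight {
  const expectChunked = durationMinutes > CHUNK_THRESHOLD_MIN;
  if (sizeBytes > MAX_AUDIO_BYTES) {
    return {
      ok: false,
      reason: `file is ${sizeBytes} bytes; limit is ${MAX_AUDIO_BYTES}`,
      expectChunked,
    };
  }
  return { ok: true, expectChunked };
}

console.log(preflightAudio(30 * 1024 * 1024, 25)); // { ok: true, expectChunked: false }
```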
Compared to sync extraction
| Feature | Sync | Streaming | Async |
|---|---|---|---|
| HTTP connection time | full duration | full duration | ~100ms (just submit) |
| Best for audio length | 0–10 min | 10–30 min | 30 min – hours |
| Survives network blips | ❌ | ❌ | ✅ |
| Mobile background safe | ❌ | ❌ | ✅ |
| Webhook callback | n/a | n/a | ✅ |
| Idempotency-Key retry-safe | ✅ | ✅ | ✅ |
| patientHash + recordedById | ✅ | ✅ | ✅ |
Async jobs cost the same per call as sync — same LLM tokens, same STT seconds. The advantage is reliability under long recordings, not pricing.