OHM Studio

Async extraction (jobs)

Submit audio, get a jobId, then poll or receive a webhook callback. For long recordings and mobile-background workflows.


For audio over 30 minutes, mobile workflows where the app may be backgrounded mid-call, or any time you can't keep an HTTP connection open until extraction completes — use the async-job pattern.

The lifecycle:

  1. Submit — POST audio to /audio/extract/:slug/jobs. Server returns immediately with { jobId, status: "QUEUED" }.
  2. Process — A worker picks up the job, transcribes with batch STT, runs structured extraction, persists the result.
  3. Notify — Either you poll GET /jobs/:id until status hits a terminal state, or the server fires a webhook to your backend. Both can be used together.
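The "terminal state" in step 3 can be made explicit with a small type. This is a minimal sketch, not part of the SDK: the status names come from the JobDetail field list on this page, while the choice of which statuses count as terminal (COMPLETED, FAILED, CANCELLED) and the helper name isTerminal are assumptions.

```typescript
// Status values as listed for JobDetail.status on this page.
type JobStatus = "QUEUED" | "PROCESSING" | "COMPLETED" | "FAILED" | "CANCELLED";

// Assumption: these three statuses are the terminal ones, i.e. the job
// will not change state again once it reaches any of them.
const TERMINAL_STATUSES: ReadonlySet<JobStatus> = new Set<JobStatus>([
  "COMPLETED",
  "FAILED",
  "CANCELLED",
]);

// Hypothetical helper: true once polling can stop.
function isTerminal(status: JobStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}
```

A polling loop would keep going while isTerminal(job.status) is false and stop as soon as it returns true.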

When to use which mode

| Audio length / scenario | Mode | SDK call / notes |
| --- | --- | --- |
| 0–10 min, real-time UX | Sync | ohm.audio.extract({ file }) |
| 10–30 min, want progress | Streaming | ohm.audio.extractStream({ file }) |
| 30 min – 1 hr | Async polling | jobs.create + jobs.poll |
| > 1 hr, mobile background, hour-long recordings | Async webhook | jobs.create({ webhookUrl }) |
| Bulk replay (10 000 historical files) | Async webhook | with idempotency keys |
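The table above can be read as a simple decision rule. The sketch below is one way to encode it, under stated assumptions: chooseMode, Workload and the Mode labels are hypothetical names, and treating each upper bound (10, 30, 60 minutes) as inclusive is a guess, since the table does not say which side the boundaries fall on.

```typescript
type Mode = "sync" | "streaming" | "async-polling" | "async-webhook";

interface Workload {
  durationMinutes: number;
  mobileBackground?: boolean; // app may be backgrounded mid-call
  bulkReplay?: boolean;       // replaying a large historical backlog
}

// Hypothetical helper mirroring the rows of the table, top to bottom.
function chooseMode(w: Workload): Mode {
  // Scenario rows win over duration rows.
  if (w.bulkReplay || w.mobileBackground) return "async-webhook";
  if (w.durationMinutes <= 10) return "sync";
  if (w.durationMinutes <= 30) return "streaming";
  if (w.durationMinutes <= 60) return "async-polling";
  return "async-webhook";
}
```

For example, chooseMode({ durationMinutes: 45 }) selects async polling, while a 5-minute recording captured on a backgrounded mobile app is routed to the webhook mode.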

Quick start — submit + poll

import { OHM } from "@ohm_studio/sdk";

const ohm = new OHM(process.env.OHM_API_KEY!);

// Submit
const { jobId } = await ohm.audio.jobs.create({
  apiSlug: "long-consult",
  file: bigBlob,                // 30 MB / 25-min audio
});

// Poll — onProgress fires on every poll for UI updates
const result = await ohm.audio.jobs.poll(jobId, {
  intervalMs: 3000,
  maxWaitMs: 30 * 60_000,
  onProgress: (j) => console.log(`${j.status} · ${j.workerProgress}%`),
});

if (result.status === "COMPLETED") {
  console.log(result.resultTranscript);
  console.log(result.resultData);
} else {
  console.error(result.errorMessage);
}

Quick start — submit + webhook

The server POSTs a signed payload to webhookUrl when the job completes. If your endpoint answers with a 4xx or 5xx, delivery is retried on a Stripe-style schedule: 5min → 30min → 2h → 5h → 10h → 24h → 24h, ~3 days total.

const { jobId } = await ohm.audio.jobs.create({
  apiSlug: "long-consult",
  file: bigBlob,
  webhookUrl: "https://your-backend.example.com/ohm-callback",
  patientHash: sha256(`abha:${patient.abhaId}`),
  recordedById: currentUser.id,
});

// Show "Processing…" in the UI; the webhook does the rest.
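The snippet above calls sha256(...) without defining it. A minimal sketch of such a helper, using Node's built-in crypto module, could look like this. The helper name matches the call above, but hex output is an assumption: this page does not state which encoding patientHash expects.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded SHA-256 digest of a UTF-8 string.
function sha256(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Same shape as the call in the example above; "12-3456-7890-1234" is a
// made-up identifier used only for illustration.
const examplePatientHash = sha256(`abha:${"12-3456-7890-1234"}`);
```

The result is always a 64-character lowercase hex string, so it can be stored in a fixed-width column.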

Your webhook receiver:

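import express from "express";
import { createHmac, timingSafeEqual } from "node:crypto";

const app = express();

// JOB_WEBHOOK_SECRET, alreadyProcessed and emrSavedExtraction are your own
// code. The handler is async because steps 3 and 4 await those helpers.
app.post("/ohm-callback", express.json(), async (req, res) => {
  // 1. Acknowledge fast: return 200 BEFORE doing any work.
  //    The retry schedule is Stripe-style (5min → 30min → 2h → 5h →
  //    10h → 24h → 24h, ~3 days total) on 4xx/5xx, but you don't want to
  //    sit in a tight retry loop with the server for any reason.
  res.status(200).end();

  try {
    // 2. Verify the signature.
    const expected = createHmac("sha256", JOB_WEBHOOK_SECRET)
      .update(JSON.stringify(req.body))
      .digest("hex");
    const wanted = Buffer.from(`sha256=${expected}`);
    const received = Buffer.from(String(req.headers["x-ohm-signature"] ?? ""));
    // timingSafeEqual throws on unequal lengths, so compare lengths first.
    if (wanted.length !== received.length || !timingSafeEqual(wanted, received)) {
      console.warn("invalid signature; dropping");
      return;
    }

    // 3. Idempotent processing: the same delivery ID may arrive twice.
    const deliveryId = req.headers["x-ohm-delivery-id"];
    if (await alreadyProcessed(deliveryId)) return;

    // 4. Persist + push to mobile (e.g. via Expo push, FCM, or your
    //    websocket fan-out).
    const { event, jobId, data, transcript } = req.body;
    if (event === "extraction.job.completed") {
      await emrSavedExtraction(jobId, data, transcript);
    }
  } catch (err) {
    // The 200 has already been sent, so log failures instead of throwing.
    console.error("ohm-callback processing failed", err);
  }
});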

The webhookSecret used for HMAC verification is per-job: it is generated fresh on every jobs.create and persisted server-side. Your verifier needs that secret, so store it on your side when you create the job. The SDK doesn't currently surface it; we'll add it in 0.7.1.
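Because the secret is per-job, the verifier has to look it up by jobId before checking the signature. The sketch below shows one way to structure that as pure functions, so it can be tested without a running server. It assumes the signature is a hex HMAC-SHA256 prefixed with sha256= (as the receiver above compares) and that it is computed over the exact body string you pass in. Whether the server signs the raw bytes or a re-serialised JSON body is not stated on this page. secretsByJob, signBody and verifySignature are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical per-job secret store: fill it when you create the job.
const secretsByJob = new Map<string, string>();

// Compute the header value we expect for a given body and secret.
function signBody(rawBody: string, secret: string): string {
  const digest = createHmac("sha256", secret).update(rawBody).digest("hex");
  return `sha256=${digest}`;
}

// Constant-time comparison of the expected and received header values.
function verifySignature(
  rawBody: string,
  jobId: string,
  header: string | undefined,
): boolean {
  const secret = secretsByJob.get(jobId);
  if (!secret || !header) return false;
  const expected = Buffer.from(signBody(rawBody, secret));
  const received = Buffer.from(header);
  // timingSafeEqual throws on unequal lengths, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

An unknown jobId is treated the same as a bad signature, which keeps the receiver from processing callbacks for jobs it never created.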

One-line "fire and wait" — extractAsync

When you don't want to manage create + poll separately:

const result = await ohm.audio.extractAsync({
  apiSlug: "long-consult",
  file: bigBlob,
  intervalMs: 3000,
  maxWaitMs: 30 * 60_000,
  onProgress: (j) => setProgress(j.workerProgress),
});

Identical to jobs.create followed by jobs.poll — useful for scripts, build pipelines, and Node-CLI flows.

Manual polling — jobs.get

If you don't want jobs.poll's built-in loop (for example, you're polling on a UI tick or storing the jobId between sessions and need to resume), call jobs.get yourself:

const job = await ohm.audio.jobs.get(jobId);

if (job.status === "COMPLETED") {
  console.log(job.resultTranscript, job.resultData);
} else if (job.status === "FAILED") {
  console.error(job.errorCode, job.errorMessage);
} else {
  // QUEUED | PROCESSING — show job.workerProgress (0–100) and check again
}

The returned JobDetail carries:

| Field | When populated |
| --- | --- |
| status | always: QUEUED · PROCESSING · COMPLETED · FAILED · CANCELLED |
| workerProgress | during PROCESSING (0–100) |
| audioSeconds · totalTokens | once transcribe + extract land |
| resultTranscript · resultData · resultLanguage | only on COMPLETED |
| errorCode · errorMessage | only on FAILED |
| webhookAttempts (array) | each delivery attempt + signature dead-letter trail |

Rule of thumb for polling: every 2 s during PROCESSING is plenty, since the worker phases (transcribe → extract) usually take 4 s + 6 s.
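If you resume polling from a stored jobId, a small loop around jobs.get is enough. The following is a sketch under assumptions rather than the SDK's own implementation: pollUntilTerminal is a hypothetical helper, the job getter is injected so the loop does not depend on any SDK types, and the sleep and now options exist only so the loop can be exercised with a fake clock.

```typescript
type JobStatus = "QUEUED" | "PROCESSING" | "COMPLETED" | "FAILED" | "CANCELLED";

interface JobSnapshot {
  status: JobStatus;
  workerProgress?: number;
}

interface PollOptions {
  intervalMs: number;
  maxWaitMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  now?: () => number;                    // injectable for tests
}

// Calls getJob until the status is no longer QUEUED or PROCESSING,
// or rejects once another interval would overshoot maxWaitMs.
async function pollUntilTerminal<J extends JobSnapshot>(
  getJob: () => Promise<J>,
  opts: PollOptions,
): Promise<J> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = opts.now ?? (() => Date.now());
  const deadline = now() + opts.maxWaitMs;
  for (;;) {
    const job = await getJob();
    if (job.status !== "QUEUED" && job.status !== "PROCESSING") return job;
    if (now() + opts.intervalMs > deadline) {
      throw new Error(`job still ${job.status} after ${opts.maxWaitMs} ms`);
    }
    await sleep(opts.intervalMs);
  }
}
```

With the SDK, the getter would be something like () => ohm.audio.jobs.get(jobId), with intervalMs set to 2000 to match the rule of thumb above.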

Cancellation

Cancellation is best-effort. QUEUED jobs cancel immediately. PROCESSING jobs may still run to completion if they have already passed a phase checkpoint (transcribe done / extract done). The returned record holds the final state.

const cancelled = await ohm.audio.jobs.cancel(jobId);
console.log(cancelled.status);  // "CANCELLED" or actual final state

What the SDK guarantees

| Behaviour | Guarantee |
| --- | --- |
| Idempotency on create | Same Idempotency-Key returns the existing job; never duplicates |
| Worker progress | workerProgress 0-100 reflects transcribe → extract phase progress |
| Webhook retry | Stripe-style schedule on 4xx/5xx, dead-letter after 7 attempts |
| Webhook signing | HMAC-SHA256 with X-OHM-Signature header |
| Webhook idempotency | X-OHM-Delivery-Id UUID v4 unique per delivery attempt |
| Audit fields | patientHash + recordedById stored on the job + every invocation row |
| Multi-instance worker | DB-backed FOR UPDATE SKIP LOCKED claim; N pods safe |
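The idempotency guarantee only helps if a retried submission carries the same key as the first attempt. For bulk replays, one option is to derive the key from the inputs instead of generating a random one. The sketch below illustrates that idea only: idempotencyKeyFor is a hypothetical helper, and how the key is handed to jobs.create (option name or header) is not shown on this page, so that part is deliberately left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: a deterministic key from the API slug and the audio
// bytes, so re-submitting the same file to the same slug reuses the key.
function idempotencyKeyFor(apiSlug: string, audio: Uint8Array): string {
  return createHash("sha256")
    .update(apiSlug, "utf8")
    .update("\0") // separator so slug and audio bytes cannot run together
    .update(audio)
    .digest("hex");
}
```

A content-derived key means a crashed replay script can simply be restarted from the beginning: files that were already submitted map back to their existing jobs.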

Limits

  • Single audio file ≤ 500 MB (SDK rejects synchronously with OHMValidationError; server cap configurable via STUDIO_MAX_AUDIO_BYTES).
  • Any duration is supported. Recordings longer than 55 min are split server-side, transcribed in parallel, and merged. The final JobDetail has chunked: true + chunkCount so your UI can surface a chunk-boundary warning.
  • Max poll wait: 15 min default (override via maxWaitMs).
  • Webhook delivery: 30s timeout per attempt. After 7 retries the delivery is dead-lettered (visible in the job's webhookAttempts log via jobs.get).
  • Audio retained on server only until terminal state. Once the job completes or fails, the upload is deleted.
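As a quick check on the webhook retry schedule quoted earlier (5min → 30min → 2h → 5h → 10h → 24h → 24h), the arithmetic below adds up the delays. It assumes each delay is measured from the previous attempt and ignores the 30 s per-attempt timeout. The constant and function names are illustrative only.

```typescript
// Delays between delivery attempts, in minutes, as listed on this page.
const RETRY_DELAYS_MIN: readonly number[] = [5, 30, 120, 300, 600, 1440, 1440];

// Minutes after the first failed delivery at which each retry fires.
function retryOffsetsMinutes(delays: readonly number[]): number[] {
  let elapsed = 0;
  return delays.map((delay) => (elapsed += delay));
}

const offsets = retryOffsetsMinutes(RETRY_DELAYS_MIN);
// offsets = [5, 35, 155, 455, 1055, 2495, 3935]

const totalMinutes = RETRY_DELAYS_MIN.reduce((sum, delay) => sum + delay, 0);
const totalDays = totalMinutes / 60 / 24;
// 3935 minutes ≈ 65.6 hours ≈ 2.7 days, consistent with "~3 days total".
```

If your backend is down for less than that window, the completion callback should still arrive. After that, the dead-lettered attempts are visible in webhookAttempts via jobs.get.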

Compared to sync extraction

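| Feature | Sync | Streaming | Async |
| --- | --- | --- | --- |
| HTTP connection time | full duration | full duration | ~100ms (just submit) |
| Best for audio length | 0–10 min | 10–30 min | 30 min – hours |
| Survives network blips | | | yes |
| Mobile background safe | | | yes |
| Webhook callback | n/a | n/a | yes |
| Idempotency-Key retry-safe | | | yes |
| patientHash + recordedById | | | yes |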

Async jobs cost the same per call as sync — same LLM tokens, same STT seconds. The advantage is reliability under long recordings, not pricing.