OHM Studio

Your first audio API

Record voice in the browser → structured JSON in two seconds.

Build a working voice-to-JSON pipeline you can drop into a Next.js or React Native app.

What we're building

A medical receptionist app that lets a doctor record a one-minute consult, and immediately surfaces the structured clinical note (vitals, diagnoses, medications) — ready to print or push into the hospital's EMR.
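To make the target concrete, here is a minimal sketch of how an app might type and sanity-check the note before rendering it. The field names `vitals`, `diagnoses`, and `medications` are assumptions taken from the description above, not the published schema of any Studio template; the real shape is defined by the API you publish in Studio.

```typescript
// Hypothetical shape: top-level field names are assumed from the prose above,
// and inner values stay `unknown` because the real schema is defined in Studio.
interface ClinicalNote {
  vitals: Record<string, unknown>;
  diagnoses: unknown[];
  medications: unknown[];
}

// Narrow an untyped API payload to ClinicalNote before rendering or printing it.
function isClinicalNote(value: unknown): value is ClinicalNote {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.vitals === "object" &&
    v.vitals !== null &&
    !Array.isArray(v.vitals) &&
    Array.isArray(v.diagnoses) &&
    Array.isArray(v.medications)
  );
}
```

A guard like this keeps a partially extracted or malformed payload from reaching the print or EMR step.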

Set up the API in Studio

  1. + New API → Clone OPD Prescription
  2. Customise: drop presentIllness if you don't need narrative HPI; add a patientId input under the Inputs tab.
  3. Prompt tab: keep the OHM Clinical Foundation Block enforced (default).
  4. Publish. Note the slug — say triage-quick.
  5. Mint an ohms_test_* key from the Keys page.
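Because step 5 mints a test key for use in the browser, it can help to fail fast if some other key ends up in the client bundle by mistake. The sketch below relies only on the `ohms_test_` prefix mentioned above; `assertTestKey` is a hypothetical helper, not part of the SDK.

```typescript
const TEST_KEY_PREFIX = "ohms_test_";

// Hypothetical helper (not an SDK function): returns the key unchanged if it
// looks like a test key, and throws otherwise so a misconfigured build fails early.
function assertTestKey(key: string | undefined): string {
  if (!key) {
    throw new Error("OHM key is missing");
  }
  if (!key.startsWith(TEST_KEY_PREFIX)) {
    // Never echo the key itself into logs or error messages.
    throw new Error(`Expected a key starting with "${TEST_KEY_PREFIX}"`);
  }
  return key;
}
```

In the client setup, `apiKey: assertTestKey(import.meta.env.VITE_OHM_TEST_KEY)` could stand in for the non-null assertion.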

Build the recorder

src/main.tsx
import { OHM } from "@ohm_studio/sdk";
import { OhmProvider } from "@ohm_studio/sdk/react";
import { Recorder } from "./components/Recorder";

const ohm = new OHM({
  apiKey: import.meta.env.VITE_OHM_TEST_KEY!,
  baseUrl: "https://api.ohm.doctor",
});

export default () => (
  <OhmProvider client={ohm}>
    <Recorder />
  </OhmProvider>
);

src/components/Recorder.tsx
import { useRecorder } from "@ohm_studio/sdk/react";

export function Recorder() {
  const r = useRecorder({
    apiSlug: "triage-quick",
    speakerMode: "doctor",          // single-speaker dictation
    extractLanguage: "auto",
    silenceAutoStop: { ms: 6000 },
    maxDurationMs: 10 * 60_000,
    wakeLock: true,
  });

  return (
    <div>
      <button
        onClick={r.isRecording ? r.stop : r.start}
        disabled={r.extracting}
      >
        {r.extracting
          ? "Extracting…"
          : r.isRecording
            ? `Stop · ${r.durationSec.toFixed(0)}s`
            : "Record consult"}
      </button>
      {r.transcript && <p>{r.transcript}</p>}
      {r.data && <pre>{JSON.stringify(r.data, null, 2)}</pre>}
      {r.error && <p style={{ color: "red" }}>{r.error.message}</p>}
    </div>
  );
}
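
The `silenceAutoStop: { ms: 6000 }` option above presumably ends the recording after six seconds without speech. How the SDK detects silence is not documented on this page; one common approach, sketched here as an assumption, is to track the RMS level of each audio frame and stop once it has stayed below a threshold for the configured window. `createSilenceDetector`, the 0.01 threshold, and the frame timing are all illustrative.

```typescript
// Root-mean-square level of one frame of PCM samples in the range [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Illustrative detector: push() returns true once the signal has stayed
// below `threshold` for at least `windowMs` of consecutive frames.
function createSilenceDetector(windowMs: number, threshold = 0.01) {
  let quietMs = 0;
  return {
    push(samples: Float32Array, frameMs: number): boolean {
      if (rms(samples) < threshold) {
        quietMs += frameMs;
      } else {
        quietMs = 0; // any loud frame resets the countdown
      }
      return quietMs >= windowMs;
    },
  };
}
```

In a browser, the frames would typically come from a Web Audio `AnalyserNode` or an `AudioWorklet`; the threshold usually needs tuning per microphone and room.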

The useRecorder hook handles the codec cascade, microphone permission, VU metering, silence auto-stop, and the audio.extract call for you, so the component above only has to render state. See Browser Recorder for the full option list.
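
The "codec cascade" most likely means trying recording formats in order of preference until the browser accepts one. The SDK's actual list and order are not given here, so the following is a sketch under that assumption. It uses the standard `MediaRecorder.isTypeSupported` check when available; `pickMimeType`, `browserSupports`, and the candidate list are illustrative.

```typescript
// Assumed preference order; the SDK's real list may differ.
const CANDIDATE_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Return the first candidate the support check accepts, or undefined if none match.
// The check is injectable so the function can be exercised outside a browser.
function pickMimeType(
  candidates: readonly string[],
  isSupported: (type: string) => boolean,
): string | undefined {
  return candidates.find((type) => isSupported(type));
}

// In a browser, delegate to MediaRecorder; elsewhere, report nothing as supported.
function browserSupports(type: string): boolean {
  const recorderClass = (
    globalThis as unknown as {
      MediaRecorder?: { isTypeSupported(t: string): boolean };
    }
  ).MediaRecorder;
  return recorderClass ? recorderClass.isTypeSupported(type) : false;
}
```

A recorder would call `pickMimeType(CANDIDATE_TYPES, browserSupports)` once and pass the result as the `mimeType` option to `new MediaRecorder(stream, …)`, falling back to the browser default when it returns `undefined`.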

With Next.js, keep your live key on the server, never in the browser, and call the API from a server action:

app/actions.ts
"use server";
import { OHM } from "@ohm_studio/sdk";

const ohm = new OHM({ apiKey: process.env.OHM_API_KEY! });

export async function extractAction(formData: FormData) {
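  const file = formData.get("audio");
  // formData.get() returns File | string | null, so reject anything that is not a file.
  if (!(file instanceof Blob)) {
    throw new Error("Expected an audio file in the 'audio' field");
  }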
  const result = await ohm.audio.extract({
    apiSlug: "triage-quick",
    file,
  });
  return result;
}
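
On the client, the recorded audio has to reach `extractAction` as a `FormData` entry named `audio`, since that is the field the action reads. A minimal sketch of that packaging step follows; `toAudioForm` and the default file name `consult.webm` are hypothetical.

```typescript
// Wrap a recorded Blob in FormData under the "audio" field that extractAction reads.
function toAudioForm(recording: Blob, fileName = "consult.webm"): FormData {
  const form = new FormData();
  form.append("audio", recording, fileName);
  return form;
}
```

A client component could then call `await extractAction(toAudioForm(blob))` once recording stops.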

In React Native, record with expo-av and upload the file from its URI:

components/Recorder.tsx
import { Audio } from "expo-av";
import { OHM } from "@ohm_studio/sdk-react-native";

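// Hypothetical env var name; load your ohms_test_* key however your app is configured.
const TEST_KEY = process.env.EXPO_PUBLIC_OHM_TEST_KEY!;

const ohm = new OHM({
  apiKey: TEST_KEY,                    // ohms_test_*
  acknowledgeBundledKey: true,         // dev-only override
});

export async function record() {
  const { granted } = await Audio.requestPermissionsAsync();
  if (!granted) throw new Error("Microphone permission denied");
  // iOS rejects recording unless the audio session is switched into record mode first.
  await Audio.setAudioModeAsync({
    allowsRecordingIOS: true,
    playsInSilentModeIOS: true,
  });
  const { recording } = await Audio.Recording.createAsync(
    Audio.RecordingOptionsPresets.HIGH_QUALITY,
  );
  // …pause, stop, get URI…
  await recording.stopAndUnloadAsync();
  const uri = recording.getURI();
  if (!uri) throw new Error("Recording produced no file");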

  const result = await ohm.audio.extract({
    apiSlug: "triage-quick",
    file: { uri, name: "rec.m4a", type: "audio/mp4" },
  });
  console.log(result.data);
}

Production: never ship a live key in your React Native bundle, because anything bundled into the app can be extracted from it. See API key handling for the proxy pattern.
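
The proxy pattern itself is described under API key handling. As a rough illustration of the client half only, the app can send the recording to your own backend, which holds the live key and makes the `audio.extract` call. The endpoint URL, the `audio` field name, and `buildProxyUpload` are all placeholders, not part of the SDK.

```typescript
// React Native's FormData accepts a { uri, name, type } descriptor for file uploads.
interface RNFile {
  uri: string;
  name: string;
  type: string;
}

// Build the request for a hypothetical backend endpoint that holds the live key.
// No OHM key appears anywhere in this client code.
function buildProxyUpload(uri: string, endpoint: string) {
  const file: RNFile = { uri, name: "rec.m4a", type: "audio/mp4" };
  const body = new FormData();
  // The cast is needed because DOM typings only know string | Blob here.
  body.append("audio", file as unknown as Blob);
  return { url: endpoint, init: { method: "POST", body } };
}
```

The app would then run `const { url, init } = buildProxyUpload(uri, yourEndpoint)` followed by `await fetch(url, init)`, and the backend would authenticate the user before forwarding the audio.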

What's next