Medicine has relied on professional dictation for decades. Then the tools moved to the cloud, sending patient audio to third-party servers and adding subscription pricing on top. Local speech models now make it practical to reverse that. Here's a realistic look at dictating clinical notes on a Mac with nothing leaving the device.
Why local processing matters in medicine
- PHI exposure: a recording of you describing a patient is sensitive by definition. If it's processed on-device, there's no transcription vendor, no data agreement to negotiate with one, and no third-party server holding the audio.
- No per-minute fees: medical dictation services have historically charged by volume. Local models add no per-minute cost once you own the hardware.
- Works anywhere: clinic basement, home office, airplane — no connectivity required.
The obligatory caveat: nothing here is compliance advice. Local-only processing removes the third-party-processor question, but your documentation workflow, storage, and EHR remain your responsibility — loop in your compliance officer.
The setup
Sotto runs Whisper and NVIDIA Parakeet models entirely on Apple Silicon's Neural Engine. A practical configuration for clinical use:
- Model: Parakeet v2 for English clinical dictation — highest recall, very fast. Whisper Large V3 Turbo if you dictate in other languages.
- Custom vocabulary: load your dictionary with drug names, procedures, abbreviations, and colleague names. This is the single biggest accuracy lever for specialty terms.
- Always-on rules: enable smart punctuation and filler-word removal so notes come out clean with minimal manual editing.
- Push-to-talk: hold a hotkey while speaking, release to insert — directly into your EHR's text field, a note template, or a document.
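To make the custom-vocabulary and filler-word settings above more concrete, here is a minimal Python sketch of what such cleanup rules might do to a raw transcript. It is an illustration only: Sotto's internals are not described in this article, and the filler list, the dictionary entries, and the function names (`remove_fillers`, `apply_vocabulary`, `clean_note`) are all hypothetical. A real tool may well bias the speech model itself rather than patch the text afterwards.

```python
import re

# Hypothetical filler list for illustration; not Sotto's actual rule set.
FILLERS = ("um", "uh", "er", "you know")

# Hypothetical dictionary entries: likely mis-transcription -> intended term.
VOCABULARY = {
    "met formin": "metformin",
    "a fib": "AFib",
}


def remove_fillers(text: str) -> str:
    """Drop standalone filler words, along with a comma that directly follows one."""
    alternatives = "|".join(re.escape(f) for f in FILLERS)
    pattern = rf"\b(?:{alternatives})\b,?\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    # Restore sentence-initial capitalization if a leading filler was removed.
    return cleaned[:1].upper() + cleaned[1:]


def apply_vocabulary(text: str, vocabulary: dict[str, str]) -> str:
    """Replace whole-phrase mis-transcriptions with the intended specialty term."""
    for heard, intended in vocabulary.items():
        # A function replacement avoids backslash interpretation in `intended`.
        text = re.sub(
            rf"\b{re.escape(heard)}\b",
            lambda _match: intended,
            text,
            flags=re.IGNORECASE,
        )
    return text


def clean_note(text: str) -> str:
    """Apply filler removal first, then vocabulary corrections."""
    return apply_vocabulary(remove_fillers(text), VOCABULARY)


raw = "Um, patient reports uh palpitations, history of a fib, on met formin."
print(clean_note(raw))
```

The word-boundary anchors keep the rules from firing inside longer words, so "er" does not touch "reports". A filler at the very end of a sentence can still leave stray punctuation, which is one reason a quick read-through of the note remains worthwhile.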
Three workflows that work
1. Between patients: press the hotkey and dictate the encounter summary straight into whatever field your EHR or notes app has focused. At ~150 spoken words per minute, a SOAP note takes a fraction of the typing time.
2. Batch at end of day: record voice memos throughout the day (phone or recorder), then drag the audio files into Sotto and transcribe them all locally. Every recording stays searchable, and you can re-transcribe any of them with a bigger model if needed.
3. Correspondence: referral letters and patient communication via dictation plus a "professional tone" rule — speak casually, paste polished text.
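The time claim in the first workflow can be checked with a quick back-of-the-envelope calculation. The 150 words per minute figure comes from the text above; the typing speed and the note length are assumptions chosen for illustration, so substitute your own numbers. The sketch ignores time spent reviewing and correcting the transcript.

```python
SPEAKING_WPM = 150  # speaking rate quoted in the article
TYPING_WPM = 40     # assumed typing speed; substitute your own
NOTE_WORDS = 300    # assumed length of a SOAP note


def minutes_needed(words: int, words_per_minute: float) -> float:
    """Time in minutes to produce `words` at a given rate."""
    return words / words_per_minute


dictation_minutes = minutes_needed(NOTE_WORDS, SPEAKING_WPM)
typing_minutes = minutes_needed(NOTE_WORDS, TYPING_WPM)

print(f"Dictating: {dictation_minutes:.1f} min")
print(f"Typing:    {typing_minutes:.1f} min")
print(f"Saved:     {typing_minutes - dictation_minutes:.1f} min per note")
```

Under these assumptions a note takes about 2 minutes to dictate versus 7.5 minutes to type. Faster typists will see a smaller gap.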
What about dedicated medical dictation suites?
Tools like Dragon Medical One are deeply integrated with EHRs, include medical language models out of the box, and are sold (at significant subscription cost) with enterprise agreements. If your hospital provides one, use it. The local-Mac approach shines for private practices, telehealth from a home office, researchers, and anyone whose institution doesn't provide tooling — at $49 once instead of four figures a year.
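For a rough sense of the price gap, the sketch below compares the $49 one-time price with a subscription. The article only says "four figures a year", so the $1,000 figure is an assumption (the smallest four-figure amount), not a quoted price for any product. Hardware cost and the value of EHR integration are left out.

```python
ONE_TIME_PRICE = 49           # one-time price stated in the article (USD)
SUBSCRIPTION_PER_YEAR = 1000  # assumed: lowest "four figures a year" amount (USD)

# Days of subscription spending that equal the one-time price.
break_even_days = ONE_TIME_PRICE / SUBSCRIPTION_PER_YEAR * 365

for years in (1, 3, 5):
    subscription_total = SUBSCRIPTION_PER_YEAR * years
    print(f"{years} yr: one-time ${ONE_TIME_PRICE} vs subscription ${subscription_total}")

print(f"Break-even after about {break_even_days:.0f} days")
```

With these numbers the one-time purchase matches the subscription spend in under three weeks; a higher subscription price shortens that further.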