How to Transcribe Audio to Text Offline on a Mac (No Cloud, No Uploads)

Three ways to transcribe audio files to text completely offline on macOS: whisper.cpp on the command line, free GUI apps, and Sotto's drag-and-drop import with re-transcription.

June 12, 2026 · 7 min read

You don't need to upload a confidential meeting recording to some startup's server to get a transcript. Apple Silicon Macs are fast enough to run state-of-the-art speech models locally. Here are three ways to do it, from nerdiest to easiest.

Why offline matters

  • Privacy: interviews, medical notes, legal calls, and voice memos never leave your machine.
  • Cost: cloud APIs charge per minute, forever. Local models are free to run.
  • Reliability: works on a plane, works when the API is down, works in 2030.

Option 1: whisper.cpp (free, command line)

If you're comfortable in a terminal, whisper.cpp is the classic route: install it with Homebrew, download a model, and run it against your file. It's free and scriptable, but you handle audio conversion, model management, and output formatting yourself — and there's no UI for fixing or searching transcripts afterwards.
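
For the terminal route, the conversion and transcription steps can be wrapped in a short script. What follows is a minimal sketch, not an official recipe. It assumes ffmpeg and whisper.cpp are already installed (for example via Homebrew), that the whisper.cpp binary on your PATH is called whisper-cli (older builds used other names, so adjust the `binary` argument if needed), and that you have downloaded a ggml model file yourself. The file names in the comments are placeholders.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_command(src: Path, wav: Path) -> List[str]:
    """Build the ffmpeg call that converts the input to 16 kHz mono 16-bit WAV."""
    return [
        "ffmpeg", "-y", "-i", str(src),
        "-ar", "16000",       # 16 kHz sample rate
        "-ac", "1",           # mono
        "-c:a", "pcm_s16le",  # 16-bit PCM
        str(wav),
    ]


def whisper_command(wav: Path, model: Path, out_base: Path,
                    binary: str = "whisper-cli") -> List[str]:
    """Build the whisper.cpp call; -otxt with -of writes <out_base>.txt."""
    return [binary, "-m", str(model), "-f", str(wav), "-otxt", "-of", str(out_base)]


def transcribe(src: Path, model: Path, binary: str = "whisper-cli") -> Path:
    """Convert, transcribe, and return the path of the resulting .txt file."""
    for tool in ("ffmpeg", binary):
        if shutil.which(tool) is None:
            raise FileNotFoundError(f"{tool} not found on PATH")
    wav = src.with_name(src.stem + "-16k.wav")  # temporary converted copy
    out_base = src.with_suffix("")              # e.g. memo.m4a -> memo
    subprocess.run(ffmpeg_command(src, wav), check=True)
    subprocess.run(whisper_command(wav, model, out_base, binary), check=True)
    return out_base.with_name(out_base.name + ".txt")


# Example call (placeholder paths):
# transcribe(Path("memo.m4a"), Path("models/ggml-base.en.bin"))
```

Keeping the command builders separate from the code that runs them makes it easy to print and inspect the exact ffmpeg and whisper.cpp invocations before pointing the script at a folder of recordings.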

Option 2: free GUI apps

Aiko (free) and MacWhisper's free tier both transcribe files on-device with a proper interface. Great for occasional use. The limits show up at higher volume: smaller model selections, fewer cleanup tools, and no connection to a dictation workflow.

Option 3: drag-and-drop in Sotto

Sotto is dictation-first, but it also imports audio files: drag an .mp3, .m4a, .wav, or .webm into the app (or press Cmd+Shift+I) and it transcribes using whichever model you've selected — all locally.

A few things make it pleasant for file work:

  • Re-transcribe anytime: ran a voice memo through the Tiny model and it botched the names? One click re-runs it through Large V3 Turbo or Parakeet.
  • Custom vocabulary: add your product names, jargon, and acronyms so the model gets them right.
  • History + search: every transcript is saved locally and full-text searchable.
  • Cleanup rules: auto-remove filler words and fix punctuation on the way out.
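
To make the cleanup idea concrete, here is a rough sketch of what a filler-word rule could look like if you were post-processing transcripts yourself (for instance, the output of whisper.cpp). It is not Sotto's implementation. The filler list and the punctuation fixes are assumptions chosen for illustration.

```python
import re

# Hypothetical filler list; extend it to match how you actually speak.
FILLERS = ("um", "uh", "erm")

# Match a filler as a whole word, plus the commas that usually surround it.
_FILLER_RE = re.compile(
    r"(?:,\s*)?\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)


def clean_transcript(text: str) -> str:
    """Strip filler words, tidy spacing, and capitalise sentence starts."""
    text = _FILLER_RE.sub("", text)
    text = re.sub(r"\s+([,.!?])", r"\1", text)   # no space before punctuation
    text = re.sub(r"\s{2,}", " ", text).strip()  # collapse repeated whitespace
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


print(clean_transcript("Um, so the launch is, uh, on Friday . we ship uh monday"))
# prints: So the launch is on Friday. We ship monday
```

A word-boundary match keeps words like "umbrella" intact, but a rule this simple will still mangle things like "uh-huh", which is why a review pass over the result is worth keeping.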

Which local model should you use?

Model                        Size        Best for
Whisper Tiny / Base          66–105 MB   Quick notes, drafts
Whisper Large V3 Turbo       ~954 MB     Best all-round quality
Parakeet v2 (English)        2.6 GB      Highest English accuracy, very fast
Parakeet v3 (Multilingual)   2.7 GB      Best multilingual accuracy

Curious how the two model families differ? We compared Whisper vs Parakeet in depth.
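
If you prefer a rule of thumb to a table, the same information can be read as a small lookup: pick the best model that fits your disk budget and your language. The sketch below is one possible reading of the table, not an official recommendation engine. The quality ordering, the rounding of sizes to megabytes, and the assumption that the Whisper models handle non-English audio are all simplifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    size_mb: int        # approximate download size, upper bound from the table
    english_only: bool


# Assumed preference order: most accurate first, smallest last.
MODELS = [
    Model("Parakeet v2 (English)", 2600, True),
    Model("Parakeet v3 (Multilingual)", 2700, False),
    Model("Whisper Large V3 Turbo", 954, False),
    Model("Whisper Tiny / Base", 105, False),
]


def pick_model(english_audio: bool, max_size_mb: int) -> Optional[Model]:
    """Return the first model in preference order that fits the constraints."""
    for model in MODELS:
        if model.english_only and not english_audio:
            continue  # skip the English-only model for other languages
        if model.size_mb <= max_size_mb:
            return model
    return None  # nothing fits the disk budget


choice = pick_model(english_audio=True, max_size_mb=1000)
print(choice.name if choice else "no model fits")
# prints: Whisper Large V3 Turbo
```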

Step by step (the easy way)

  1. Install Sotto and download a model (Large V3 Turbo is the sweet spot).
  2. Drag your audio file onto the app, or press Cmd+Shift+I.
  3. Wait while the Neural Engine chews through the audio, much faster than real time.
  4. Copy the transcript, or search it later from History.

That's it. No account, no upload, no per-minute billing. Sotto is $49 once and the transcription stays on your Mac.

About Kitze

Creator of Sotto and indie developer building tools for productivity. Passionate about local AI and privacy-first software.
