Speech-to-text accuracy has improved dramatically, but it's not magic. Your environment, equipment, and technique all affect results. Here are 10 proven tips to get the best transcription accuracy.
1. Use a Quality Microphone
Your Mac's built-in mic works, but dedicated microphones make a significant difference:
- AirPods/AirPods Pro: Excellent for casual use, good noise isolation
- USB Headset: Consistent positioning, noise cancellation
- Lapel Mic: Great for hands-free, close to mouth
- Desktop Condenser: Studio quality but picks up room noise
Pro tip: AirPods Pro with noise cancellation active work surprisingly well for dictation.
2. Optimize Your Environment
Background noise is the enemy of accuracy:
- Close windows to reduce street noise
- Turn off fans, AC, or noisy appliances nearby
- Use a room with soft surfaces (carpet, curtains) to reduce echo
- Face away from noise sources
3. Speak Clearly, Not Slowly
Common mistake: speaking unnaturally slowly. Modern AI models are trained on natural speech, so speak at your normal pace, but:
- Enunciate clearly (don't mumble)
- Maintain consistent volume
- Avoid trailing off at the end of sentences
- Pause briefly between distinct thoughts
4. Use a Custom Dictionary
Technical terms, names, and jargon trip up speech recognition. Apps like Sotto let you add custom vocabulary:
- Company and product names
- Technical acronyms (API, SDK, JWT)
- Framework names (Next.js, SwiftUI)
- Colleague names
- Industry-specific terminology
A well-trained dictionary can improve accuracy by 20-30% for specialized content.
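To make the idea concrete, here is a minimal sketch of one way a custom dictionary can work: a post-processing pass that swaps commonly misheard forms for the preferred spelling. This is an illustration only, not how Sotto or any particular app implements it; the `CUSTOM_DICTIONARY` entries and the `apply_dictionary` helper are hypothetical.

```python
import re

# Hypothetical entries: the form the recognizer tends to produce -> the spelling you want.
CUSTOM_DICTIONARY = {
    "next js": "Next.js",
    "swift ui": "SwiftUI",
    "sdk": "SDK",
    "jwt": "JWT",
}


def apply_dictionary(text, dictionary):
    """Replace each misheard phrase with its preferred spelling.

    Matching is case-insensitive and limited to whole words.
    """
    # Handle longer phrases first so multi-word entries are not split by shorter ones.
    for heard in sorted(dictionary, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(heard) + r"\b", re.IGNORECASE)
        preferred = dictionary[heard]
        # A function replacement keeps the preferred spelling literal.
        text = pattern.sub(lambda _match, p=preferred: p, text)
    return text


print(apply_dictionary("we built it with next js and swift UI", CUSTOM_DICTIONARY))
```

A plain find-and-replace like this only fixes errors after the fact; a dictionary built into the recognizer can also influence what gets recognized in the first place.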
5. Choose the Right Model Size
Whisper, OpenAI's open-source speech recognition model, comes in several sizes. Bigger isn't always better:
- Tiny: Fast, good for quick notes
- Base: Good balance of speed and accuracy
- Small: Better accuracy, slight delay
- Medium: High accuracy, more processing time
- Large: Best accuracy, requires more RAM
For real-time dictation, Tiny or Base often provide the best experience.
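The trade-off can be sketched as a small selection rule: take the largest model that fits in memory, but cap the choice at Base when you need real-time response. The memory figures below are rough values based on those published in the open-source Whisper README and should be treated as approximate; the `pick_model` helper is hypothetical.

```python
# Approximate memory needs in GB (rough guides only, not guarantees).
MODEL_MEMORY_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def pick_model(available_gb, real_time=False):
    """Return the largest model that fits in memory.

    For real-time dictation, only Tiny and Base are considered.
    """
    candidates = MODEL_MEMORY_GB[:2] if real_time else MODEL_MEMORY_GB
    fitting = [name for name, needed in candidates if needed <= available_gb]
    if not fitting:
        raise ValueError("not enough memory for any model")
    # The list is ordered from smallest to largest, so the last fit is the biggest.
    return fitting[-1]


print(pick_model(8))                   # batch transcription with 8 GB to spare
print(pick_model(16, real_time=True))  # live dictation
```

If you use the open-source `openai-whisper` Python package, the chosen name can be passed to `whisper.load_model`. Other apps expose the same choice through their settings.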
6. Know Your Punctuation Commands
Speaking punctuation gets you cleaner output:
- "period" → .
- "comma" → ,
- "question mark" → ?
- "exclamation point" → !
- "new line" → line break
- "new paragraph" → double line break
- "colon" → :
- "semicolon" → ;
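For the curious, the mapping above can be sketched as a simple text substitution. This is a simplified illustration with a hypothetical `apply_punctuation` helper. Real dictation engines also use context to decide whether "period" is a command or an ordinary word, and they capitalize the next sentence, which this sketch does not.

```python
import re

# Spoken command -> symbol, taken from the list above.
PUNCTUATION_COMMANDS = {
    "period": ".",
    "comma": ",",
    "question mark": "?",
    "exclamation point": "!",
    "new line": "\n",
    "new paragraph": "\n\n",
    "colon": ":",
    "semicolon": ";",
}


def apply_punctuation(text):
    """Replace spoken punctuation commands with their symbols."""
    for spoken in sorted(PUNCTUATION_COMMANDS, key=len, reverse=True):
        symbol = PUNCTUATION_COMMANDS[spoken]
        # Swallow the spaces around the command so the symbol attaches to the previous word.
        pattern = re.compile(r"[ ]*\b" + re.escape(spoken) + r"\b[ ]*", re.IGNORECASE)
        # Follow the symbol with a space unless it is a line break.
        replacement = symbol if symbol.startswith("\n") else symbol + " "
        text = pattern.sub(lambda _match, r=replacement: r, text)
    return text.rstrip(" ")


print(apply_punctuation("hello comma how are you question mark new line fine period"))
```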
7. Warm Up Your Voice
First thing in the morning, your voice may be unclear. Before important dictation:
- Drink water (hydrated vocal cords perform better)
- Do some light speaking to warm up
- Clear your throat if needed
8. Position Your Microphone Correctly
Microphone placement matters:
- Distance: 2-6 inches from mouth for most mics
- Angle: Slightly off-axis to reduce plosives (p, b sounds)
- Consistency: Keep the same position throughout
9. Use Cloud Models for Difficult Content
Local models are great for privacy and speed, but cloud APIs (OpenAI, Groq) can handle tricky situations better:
- Heavy accents
- Multiple speakers
- Technical/medical terminology
- Background noise
Apps like Sotto let you switch between local and cloud on the fly.
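One way to think about combining the two is a fallback rule: try the local model first and send the audio to the cloud only when the local result looks shaky. The sketch below assumes each transcriber returns a text and a confidence score between 0 and 1. The `transcribe_with_fallback` function, the stub transcribers, and the 0.85 threshold are all hypothetical and do not describe any specific app or API.

```python
from typing import Callable, Tuple

# A transcriber takes raw audio bytes and returns (text, confidence in 0..1).
Transcriber = Callable[[bytes], Tuple[str, float]]


def transcribe_with_fallback(
    audio: bytes,
    local: Transcriber,
    cloud: Transcriber,
    min_confidence: float = 0.85,
) -> Tuple[str, str]:
    """Return (text, source), using the cloud only when local confidence is low."""
    text, confidence = local(audio)
    if confidence >= min_confidence:
        return text, "local"
    # Local result is uncertain: retry with the cloud transcriber.
    cloud_text, _ = cloud(audio)
    return cloud_text, "cloud"


# Stub transcribers standing in for real models, for demonstration only.
def confident_local(audio: bytes) -> Tuple[str, float]:
    return "schedule the meeting for friday", 0.96


def unsure_local(audio: bytes) -> Tuple[str, float]:
    return "the patient has new monia", 0.60


def stub_cloud(audio: bytes) -> Tuple[str, float]:
    return "the patient has pneumonia", 0.97


print(transcribe_with_fallback(b"", confident_local, stub_cloud))
print(transcribe_with_fallback(b"", unsure_local, stub_cloud))
```

Keeping the cloud as a fallback rather than the default preserves the privacy and speed of local models for the easy cases.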
10. Review and Correct Consistently
No transcription is 100% accurate. Build a habit of quick review:
- Glance at output after each dictation
- Fix errors immediately so that correcting becomes muscle memory
- Add commonly misrecognized words to your dictionary
- Note patterns: if a word is always wrong, add it to your dictionary
Accuracy Expectations
With good conditions and technique, expect:
- 95-98% accuracy for clear speech, common words
- 90-95% accuracy for technical content with trained dictionary
- 85-90% accuracy for difficult conditions or accents
Even at 95%, you'll spend less time correcting than you would typing from scratch.
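If you want to measure your own accuracy, a common approach is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the transcript into the correct text, divided by the number of words in the correct text. The sketch below assumes that "accuracy" here means 1 minus WER, which is one reasonable reading of the percentages above; the function names are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two texts, divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def accuracy(reference, hypothesis):
    """Accuracy as 1 - WER (an assumption about how accuracy is defined)."""
    return 1.0 - word_error_rate(reference, hypothesis)


spoken = "the quick brown fox jumps over the lazy dog today"
transcribed = "the quick brown box jumps over the lazy dog today"
print(f"{accuracy(spoken, transcribed):.0%}")
```

Under this definition, 95% accuracy means roughly 5 errors per 100 words, or about 25 corrections in a 500-word dictation.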
Get Better Accuracy with Sotto
Sotto includes a custom dictionary, multiple model sizes, and optional cloud fallback for maximum accuracy. It costs $29 as a one-time purchase and covers 3 Macs.