macOS · menu bar · v0.1

Dictation
that thinks.

Press ⌘⌥M anywhere on macOS. Speak naturally — even mid-sentence corrections. Watch the messy raw transcript get rewritten live into something you'd actually send. Then it pastes into wherever you were typing.

Download for macOS · See how it works

On-device Whisper · Local or BYOK LLM · No telemetry

Listening · ⌘⌥M to stop

Raw

colin thanks for setting up the meeting i'm not gonna make it for the full 15 minutes probably can make it for like 10 minutes actually no 5 minutes hope that's okay catch you later

Cleaned · pasting to Mail.app

Hi Colin,

Thanks for setting up the meeting. I can only make 5 minutes of the 15 — hope that's okay.

Catch you later,

email · subject: Re: tomorrow

Workflow

Four moments,
one motion.

01 Press · ⌘⌥M

Trigger from anywhere

Speakit lives in your menu bar. Hit the global hotkey while you're focused in any text field — Mail, Slack, your terminal, a code comment. The floating panel appears without stealing focus.

02 Capture · WhisperKit

Transcribe on-device

Your voice is streamed into Whisper running entirely on your Mac — no audio leaves the machine. Partial words refine into confirmed segments as you speak. You see it happen in real time.

03 Format · LLM

Clean up with intent

When you stop, an LLM rewrites the raw transcript: resolves mid-sentence corrections ("actually no, 5 minutes"), strips filler, picks the right register for where you're typing. Local Ollama or cloud — your choice.

04 Paste · CGEvent

Land in the right place

Speakit remembers where you were typing. The cleaned text drops in at your cursor with a synthetic ⌘V. Your clipboard is preserved and restored. You never lose context.

Designed details

Made for people
who actually speak.

Context aware

Knows the room you're
talking to.

Speakit reads the frontmost app and (when allowed) the focused field to pick a register. Mail gets greetings and paragraphs. Slack gets short and casual. Terminal gets a bare command. Xcode keeps your identifiers intact.

email · chat · code · terminal · notes · generic

Resolves rewrites

"Actually, no, 5 minutes."

The LLM drops abandoned drafts when you correct yourself mid-sentence. You don't have to speak in finished thoughts. You can think out loud.

Bring your own brain

Five providers, one settings pane.

Ollama · local · default

Anthropic · Claude Haiku 4.5

OpenAI · GPT-4o-mini

Google · Gemini 2.5 Flash

z.ai · GLM 4.6

Paste-back

Goes back to where you were.

Speakit tracks the last app you were typing in, even when you click the menu bar to start. When you stop, your clipboard is saved, the cleaned text drops in at your cursor, and your clipboard is restored two beats later.

previous-app → activate → CGEvent ⌘V → restore clipboard
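
The paste-back order described above could be sketched roughly like this. It is a platform-neutral Python illustration, not Speakit's actual code: `Pasteboard`, `paste_back`, and the injected `activate` and `send_paste` callbacks are hypothetical stand-ins for the system pasteboard, app activation, and the synthetic ⌘V.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pasteboard:
    """In-memory stand-in for the system pasteboard (hypothetical)."""
    contents: Optional[str] = None


def paste_back(
    board: Pasteboard,
    cleaned_text: str,
    activate: Callable[[], None],
    send_paste: Callable[[], None],
) -> None:
    """Save the clipboard, paste the cleaned text, then restore the clipboard."""
    saved = board.contents          # 1. remember what the user had copied
    activate()                      # 2. bring the previous app back to the front
    board.contents = cleaned_text   # 3. stage the cleaned text for pasting
    try:
        send_paste()                # 4. stand-in for the synthetic Cmd+V
    finally:
        board.contents = saved      # 5. put the user's clipboard back
```

A real implementation would presumably delay step 5 briefly so the target app has time to read the pasteboard; the sketch restores immediately to keep the ordering easy to follow.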

Live updating

Watch your words sharpen.

Whisper streams partial words that snap into confirmed segments as you finish phrases. When you stop, the LLM streams its rewrite token-by-token into the same panel. Nothing about Speakit feels like waiting.
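
One way to picture the partial-to-confirmed behaviour is a transcript that keeps a list of confirmed segments plus a single volatile tail. The sketch below is an assumption about the shape of that state, written in Python for illustration; `LiveTranscript` and its methods are hypothetical names, not WhisperKit or Speakit APIs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LiveTranscript:
    """Confirmed segments plus one still-changing partial tail (hypothetical)."""
    confirmed: List[str] = field(default_factory=list)
    partial: str = ""

    def update_partial(self, text: str) -> None:
        # The recogniser may revise the tail as more audio arrives.
        self.partial = text

    def confirm(self, text: str) -> None:
        # A finished phrase becomes permanent and the tail resets.
        self.confirmed.append(text)
        self.partial = ""

    def render(self) -> str:
        # What the panel would show: confirmed text followed by the tail.
        parts = self.confirmed + ([self.partial] if self.partial else [])
        return " ".join(parts)
```

Keeping the tail separate means a revised guess only ever replaces the unconfirmed part, so text the user has already read does not jump around.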

One key, everywhere

Global from any app.

Registered through Carbon at the lowest reliable level. Works in browsers, Electron apps, terminals, fullscreen software, and your own builds. No focus shuffle.

⌘⌥M

Privacy

Your voice can stay yours.

Fully offline path

Pair on-device Whisper with a local Ollama model and, once the Whisper model has downloaded on first launch, Speakit makes no outbound network requests. Your voice is captured, transcribed, cleaned, and pasted without a packet leaving your Mac.

mic → WhisperKit (CoreML)

raw → Ollama (qwen2.5:3b)

clean → pasteboard → ⌘V

// no egress
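
For a sense of what the local formatting step might look like on the wire, here is a minimal Python sketch that posts a transcript to Ollama's local HTTP API. The endpoint and the `model`, `prompt`, and `stream` fields are Ollama's own; the prompt wording and the `build_request` and `format_locally` helpers are assumptions made for illustration, not Speakit's actual prompt or code.

```python
import json
import urllib.request

# Ollama's default local endpoint; requests never leave the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(raw_transcript: str, context: str,
                  model: str = "qwen2.5:3b") -> urllib.request.Request:
    """Build the POST request for a one-shot (non-streaming) rewrite."""
    # Hypothetical prompt wording, for illustration only.
    prompt = (
        f"Rewrite this dictated text for a {context} context. "
        "Resolve self-corrections, remove filler, and return only the final text.\n\n"
        f"{raw_transcript}"
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_locally(raw_transcript: str, context: str) -> str:
    """Send the request to a running Ollama instance and return its reply."""
    with urllib.request.urlopen(build_request(raw_transcript, context)) as resp:
        return json.loads(resp.read())["response"]
```

The sketch sets `stream` to false for brevity; a panel that shows the rewrite token by token would use the streaming mode instead.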

BYOK cloud path

Prefer a cloud model for sharper formatting? Drop your own Anthropic, OpenAI, Gemini, or z.ai key into Settings. Your key lives in macOS Keychain. Requests go straight from your Mac to your provider — Speakit has no backend.

API key → Keychain

raw → your provider

clean → pasteboard → ⌘V

// nothing routed through us

No middleman

Speakit has no servers and no accounts. No analytics SDK, no crash reporting, no remote config. Your voice never leaves your Mac unless you've chosen a cloud formatter — and even then it goes straight to your provider, on your key.

Ready to install

Get Speakit.

Free during the beta. Signed and notarized for macOS 13+. Drop it into Applications, grant mic and accessibility access in the onboarding window, and you're dictating in under a minute.

Download · macOS

0 ms · telemetry sent

16 kHz · on-device PCM

5 · LLM providers

Questions answered

The honest small print.

Does my audio leave my Mac?

By default, no. Whisper runs on-device via WhisperKit (CoreML), and the default formatter is Ollama running locally. If you switch the formatter to Anthropic, OpenAI, Gemini, or z.ai in Settings, the raw transcript text (not the audio) is sent to your chosen provider using your own API key. Speakit has no backend of its own.

Which Whisper model is used?

We default to openai_whisper-base.en — ~140 MB, English-only, fast enough for live updates on Apple Silicon. The model downloads once on first launch into ~/Documents/huggingface/models. Swapping to base, small, or large variants will be a Settings option in a later release.

What's the difference between local Ollama and cloud LLMs?

Quality vs latency vs privacy. qwen2.5:3b on a recent Mac formats short utterances in a second or two with zero network. Claude Haiku 4.5 or GPT-4o-mini handles complex rewrites better, especially long emails and intricate self-corrections, but costs a fraction of a cent per request and needs a network round-trip. Pick the one whose tradeoffs you like.

Will it work with my favorite app?

If you can type into it, you can dictate into it. Speakit pastes via a system-level ⌘V keystroke, so it works in native apps, Electron apps, browser inputs, terminals, fullscreen software — anywhere ⌘V works. Email composers, Slack, VS Code, Cursor, Terminal/iTerm/Warp/Ghostty, Apple Notes, Obsidian, Bear, Linear, Discord, and more are all explicitly recognised for context-aware formatting.

How does context detection work?

Two layers. First, a bundle-ID lookup table maps known apps to a writing context (email / chat / code / terminal / notes / generic). Second, when Accessibility permission is granted, Speakit reads the focused field's placeholder/label so it can tell, say, an email Subject from the body. That hint goes into the LLM prompt.
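
The two layers could be sketched along these lines. This is a Python illustration under stated assumptions: the table entries are examples chosen for the sketch (the real table is not published), and `detect_context` and `prompt_hint` are hypothetical helper names.

```python
from typing import Optional

# Layer 1: bundle ID -> writing context. Illustrative entries only.
BUNDLE_CONTEXTS = {
    "com.apple.mail": "email",
    "com.tinyspeck.slackmacgap": "chat",
    "com.apple.dt.Xcode": "code",
    "com.apple.Terminal": "terminal",
    "com.apple.Notes": "notes",
}


def detect_context(bundle_id: str) -> str:
    """Look up the frontmost app; unknown apps fall back to 'generic'."""
    return BUNDLE_CONTEXTS.get(bundle_id, "generic")


def prompt_hint(bundle_id: str, field_label: Optional[str] = None) -> str:
    """Layer 2: add the focused field's label when Accessibility provides one."""
    hint = f"context: {detect_context(bundle_id)}"
    if field_label:
        hint += f"; field: {field_label}"
    return hint
```

With these example entries, dictating into Mail's Subject field would produce the hint `context: email; field: Subject`, while an unrecognised app without Accessibility access yields `context: generic`.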

What does the onboarding window do?

Walks you through the two macOS permissions Speakit needs — microphone and accessibility — once. Live status badges turn green the instant you toggle each one on in System Settings, so you're not guessing. Re-open it any time from the menu bar → Setup…. There's a 'Permissions stuck? Reset and relaunch' link for the rare case macOS has a stale entry from an earlier signature.

Is it really free?

Free during the beta. No account, no telemetry, no email signup. Pricing only comes in if cloud features land that need a backend (sync, team workspaces) — the core local dictation will always be free.

What about Windows / Linux?

Not yet. Speakit leans hard on macOS-specific APIs — Carbon hotkeys, NSWorkspace, accessibility, NSPanel — so a port wouldn't be a recompile. If there's enough interest, a Windows version is possible. The transcription and formatting layers are already largely portable.
