tapWhisper — Alibaba Qwen 3 Formatter

Specifications

Size 397 MB (0.6B) to 2.5 GB (4B)

Architecture GGUF LLM (Q4_K_M)

Latency Fast (80-120 tok/s on Apple Silicon)

Function Formatting & grammar cleanup

Developer / Creator

Alibaba Group / llama.cpp community

License

Apache-2.0 (Qwen 3 GGUF); Apple platform terms (built-in cleanup)

Download Source

Verified Repository Source

Hugging Face Hub (via tapWhisper downloader)

Unsloth Qwen 3 GGUF repositories

Exact runtime artifacts

Model Overview

Qwen 3 is a family of lightweight, high-performance language models (0.6B to 4B parameters) in GGUF format used for local text formatting. In tapWhisper, selecting STT + LLM formatting starts a persistent, localhost-only llama.cpp server. Qwen formats and cleans up raw speech output: adding punctuation, correcting grammar, and formatting code on-device.

Available Model Variants

Model Name	File Size	RAM Usage	Format/Quant	Languages	Description
Apple Built-in Cleanup	0 MB	0 MB	System API	English	Built-in local text cleanup for basic grammar and spacing correction.
Small (Qwen 3 0.6B) ⭐	378 MB	650 MB	Q4_K_M (GGUF)	Multilingual	Default recommended formatter. Lightning-fast grammar, punctuation, and coding layout.
Medium (Qwen 3 1.7B)	1.03 GB	1.5 GB	Q4_K_M (GGUF)	Multilingual	Enhanced local language parsing. Handles structural text reorganization.
Large (Qwen 3 4B)	2.33 GB	3.2 GB	Q4_K_M (GGUF)	Multilingual	Highest-accuracy offline text formatter. Requires a powerful Mac (8GB+ RAM).

Back to tapWhisper