tapWhisper — OpenAI Whisper GGML

Specifications

Size: 31 MB (Very Small Q5) to 2.88 GB (Very Big)

Architecture: Transformer encoder-decoder

Latency: 1-3 s for an average dictation

Languages: 99+

Developer / Creator

OpenAI (original weights), GGML / whisper.cpp community (quantized files)

License

MIT

Download Source

Verified Repository Source

Hugging Face Hub (via tapWhisper downloader)

ggerganov/whisper.cpp

Exact runtime artifacts

Model Overview

Whisper is OpenAI's general-purpose speech recognition model. In tapWhisper, Whisper models run offline through whisper.cpp (GGML format) with full Metal GPU acceleration on Apple Silicon. Users can download models of different sizes, from Very Small to Very Big, in Settings. Whisper offers strong multilingual accuracy across 99+ languages and supports custom vocabulary prompting.
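To make custom vocabulary prompting concrete, here is a minimal sketch of how a vocabulary list could be turned into an initial prompt for the whisper.cpp command-line tool. How tapWhisper wires this up internally is not documented here, so the helper names (`build_vocab_prompt`, `build_whisper_command`), the binary name `whisper-cli`, and the file paths are assumptions for illustration only. The sketch only builds the argument list and does not run anything.

```python
import shlex
from typing import List, Sequence


def build_vocab_prompt(terms: Sequence[str]) -> str:
    """Join custom vocabulary terms into one initial-prompt string."""
    seen = set()
    unique = []
    for term in terms:
        cleaned = term.strip()
        # Skip empty entries and case-insensitive duplicates, keeping order.
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            unique.append(cleaned)
    return ", ".join(unique)


def build_whisper_command(model_path: str, audio_path: str,
                          terms: Sequence[str], language: str = "en") -> List[str]:
    """Build an argv list for the whisper.cpp CLI.

    The binary name "whisper-cli" is an assumption; older builds of
    whisper.cpp ship the same tool under a different name.
    """
    command = ["whisper-cli", "-m", model_path, "-f", audio_path, "-l", language]
    prompt = build_vocab_prompt(terms)
    if prompt:
        # The initial prompt biases decoding toward the listed spellings.
        command += ["--prompt", prompt]
    return command


if __name__ == "__main__":
    # Hypothetical paths and vocabulary, for illustration only.
    argv = build_whisper_command("models/ggml-base.bin", "dictation.wav",
                                 ["tapWhisper", "GGML", "Metal"])
    print(shlex.join(argv))
```

The prompt is a hint rather than a hard constraint: it nudges the decoder toward the listed spellings but does not guarantee them.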

Available Model Variants

Model Name	File Size	RAM Usage	Format/Quant	Languages	Description
Whisper Very Small	74 MB	180 MB	Float16 (Full)	Multilingual	Fastest transcription speed, lower accuracy. Ideal for fast test queries.
Whisper Very Small Q5	31 MB	110 MB	Q5_1 (Quantized)	Multilingual	Smallest quantized Whisper option. Ultra-low storage requirement.
Whisper Small	141 MB	300 MB	Float16 (Full)	Multilingual	Balanced base model with decent accuracy for simple everyday sentences.
Whisper Small Q5	57 MB	180 MB	Q5_1 (Quantized)	Multilingual	Quantized Whisper base model. Optimized memory and storage usage.
Whisper Medium ⭐	547 MB	900 MB	Q5_0 (Quantized)	Multilingual	Best speed-to-quality ratio. Recommended as the default offline model.
Whisper Very Small (English)	74 MB	180 MB	Float16 (Full)	English	Fastest English-only dictation model. Low resource consumption.
Whisper Very Small Q5 (English)	31 MB	110 MB	Q5_1 (Quantized)	English	Quantized English-only tiny model. Extremely lightweight.
Whisper Small (English)	141 MB	300 MB	Float16 (Full)	English	English-only base model for everyday dictation.
Whisper Small Q5 (English)	57 MB	180 MB	Q5_1 (Quantized)	English	Quantized English-only base model. High efficiency.
Whisper Standard	465 MB	850 MB	Float16 (Full)	Multilingual	Standard model. Offers solid recognition accuracy for multiple languages.
Whisper Standard Q5	181 MB	450 MB	Q5_1 (Quantized)	Multilingual	Quantized Whisper small model. Excellent balance of size and fidelity.
Whisper Standard (English)	465 MB	850 MB	Float16 (Full)	English	Standard English-only model. Ideal for clean English speech dictation.
Whisper Standard Q5 (English)	181 MB	450 MB	Q5_1 (Quantized)	English	Quantized English-only standard model. High memory efficiency.
Whisper Large (legacy)	1.43 GB	2.2 GB	Float16 (Full)	Multilingual	Older large model with broad language coverage. High accuracy, heavy footprint.
Whisper Medium HQ	1.51 GB	2.3 GB	Float16 (Full)	Multilingual	High-quality medium model (Turbo architecture). Outstanding accuracy.
Whisper Very Big	2.88 GB	4.2 GB	Float16 (Full)	Multilingual	Maximum general accuracy. Heavy download and slower processing.
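As a rough illustration of how the table above can guide a choice, the sketch below picks the largest variant whose listed RAM usage fits a given memory budget. The `Variant` type and `pick_variant` helper are hypothetical and not part of tapWhisper. Treating a larger RAM footprint as a proxy for higher accuracy is a simplifying assumption, as is converting GB to MB at 1 GB = 1024 MB.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    name: str
    file_mb: float
    ram_mb: float
    english_only: bool


def gb(value: float) -> float:
    """Convert GB to MB (assumes 1 GB = 1024 MB)."""
    return value * 1024


# File size and RAM usage copied from the table above.
VARIANTS: List[Variant] = [
    Variant("Whisper Very Small", 74, 180, False),
    Variant("Whisper Very Small Q5", 31, 110, False),
    Variant("Whisper Small", 141, 300, False),
    Variant("Whisper Small Q5", 57, 180, False),
    Variant("Whisper Medium", 547, 900, False),
    Variant("Whisper Very Small (English)", 74, 180, True),
    Variant("Whisper Very Small Q5 (English)", 31, 110, True),
    Variant("Whisper Small (English)", 141, 300, True),
    Variant("Whisper Small Q5 (English)", 57, 180, True),
    Variant("Whisper Standard", 465, 850, False),
    Variant("Whisper Standard Q5", 181, 450, False),
    Variant("Whisper Standard (English)", 465, 850, True),
    Variant("Whisper Standard Q5 (English)", 181, 450, True),
    Variant("Whisper Large (legacy)", gb(1.43), gb(2.2), False),
    Variant("Whisper Medium HQ", gb(1.51), gb(2.3), False),
    Variant("Whisper Very Big", gb(2.88), gb(4.2), False),
]


def pick_variant(ram_budget_mb: float,
                 english_only: bool = False) -> Optional[Variant]:
    """Return the heaviest variant that fits the RAM budget, or None.

    Heuristic only: a larger RAM footprint is used as a stand-in for
    accuracy, with file size as the tie-breaker.
    """
    candidates = [
        v for v in VARIANTS
        if v.english_only == english_only and v.ram_mb <= ram_budget_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: (v.ram_mb, v.file_mb))


if __name__ == "__main__":
    choice = pick_variant(1000)
    print(choice.name if choice else "no variant fits")
```

With a budget of about 1 GB of RAM, this heuristic lands on Whisper Medium, which matches the table's recommended default.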
