Kokoro is a fast, lightweight (82M-parameter) multilingual TTS model with 54 built-in voice presets. It delivers high-quality speech synthesis with minimal resource usage, making it well suited to quick generation tasks on Apple Silicon.
Command-line usage:

    # Basic generation (American English)
    mlx_audio.tts.generate \
        --model mlx-community/Kokoro-82M-bf16 \
        --text "Hello, world!" \
        --lang_code a

    # Choose a voice and adjust speed
    mlx_audio.tts.generate \
        --model mlx-community/Kokoro-82M-bf16 \
        --text "Welcome to MLX-Audio!" \
        --voice af_heart \
        --speed 1.2 \
        --lang_code a

    # Play audio immediately
    mlx_audio.tts.generate \
        --model mlx-community/Kokoro-82M-bf16 \
        --text "Hello!" \
        --play \
        --lang_code a

Python usage:

    from mlx_audio.tts.utils import load_model

    model = load_model("mlx-community/Kokoro-82M-bf16")

    for result in model.generate(
        text="Welcome to MLX-Audio!",
        voice="af_heart",  # American female
        speed=1.0,
        lang_code="a",  # American English
    ):
        audio = result.audio  # mx.array waveform
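
The Python loop above yields waveform chunks but does not show how to keep them. The following is a minimal sketch of one way to write such chunks to a WAV file using only the standard library. `write_wav` is a hypothetical helper, not part of MLX-Audio. It assumes the samples are floats in the range [-1, 1] and that the output is mono at 24 kHz; check the sample rate your model actually reports before relying on that default.

```python
import array
import sys
import wave


def write_wav(path, chunks, sample_rate=24000):
    """Concatenate float waveform chunks and write a 16-bit mono WAV file.

    `chunks` is any iterable of float sequences with values in [-1, 1].
    The 24000 Hz default is an assumption, not a value read from the model.
    Returns the number of samples written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for chunk in chunks:
        for sample in chunk:
            # Clip out-of-range values so they cannot overflow int16.
            clipped = max(-1.0, min(1.0, float(sample)))
            pcm.append(int(round(clipped * 32767)))

    # WAV stores samples little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Hypothetical usage with the generation loop (assumes each result.audio
# is a 1-D mx.array, converted to a plain list with .tolist()):
#
#   chunks = [result.audio.tolist() for result in model.generate(...)]
#   write_wav("kokoro.wav", chunks)
```

Collecting all chunks before writing keeps the sketch short. For long texts, appending each chunk to an open `wave` writer as it arrives would avoid holding the full waveform in memory.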