InternVL2_5-2B is a multimodal model that combines image and text understanding for tasks such as generating product descriptions and visual question answering.
Zonos-v0.1 is a high-fidelity, real-time text-to-speech (TTS) model offered in 1.6B-parameter Transformer and Hybrid architectures. It supports multiple languages and adjustable voice settings for natural-sounding expression.