translation, speech synthesis, paralinguistic understanding, and general audio understanding. Advance research on speech... applications. Explore representation alignment and fusion mechanisms between audio/speech and other modalities in large multimodal...
understanding or practical experience in one or more of the following areas: Multimodal models (e.g., vision-language models, audio...&D team, you will help develop large-scale models with multimodal perception, autonomous learning, and reasoning abilities...