Sesame, the startup behind the viral virtual assistant Maya, releases its base AI model
techcrunch.com
AI companySesamehas released the base model that powers Maya, theimpressively realistic voice assistant.The model, which is 1 billion parameters in size (parameters referring to individual components of the model), is under an Apache 2.0 license, meaning it can be used commercially with few restrictions. Called CSM-1B, the model generates RVQ audio codes from text and audio inputs, according to Sesames description on the AI dev platform Hugging Face.RVQ refers to residual vector quantization, a technique for encoding audio into discrete tokens called codes. RVQ is used in a number of recent AI audio technologies, including Googles SoundStream and Metas Encodec.CSM-1B uses a model from Metas Llama family as its backbone paired with an audio decoder component. A fine-tuned variant of CSM powers Maya, Sesame says.The model open-sourced here is a base generation model, Sesame writes in CSM-1Bs Hugging Face and GitHub repositories. It is capable of producing a variety of voices, but it has not been fine-tuned on any specific voice [] The model has some capacity for non-English languages due to data contamination in the training data, but it likely wont do well.Its unclear what data Sesame used to train CSM-1B. The company didnt say. Its worth noting the model has no real safeguards to speak of. Sesame has an honor system and merely urges developers and users not to use the model to mimic a persons voice without their consent, create misleading content like fake news, or engage in harmful or malicious activities.I tried the demo on Hugging Face, and cloning my voice took less than a minute. From there, it was easy to generate speech to my hearts desire, including on controversial topics like the election and Russian propaganda.Consumer Reports recently warned that many popular AI-powered voice cloning tools on the market dont have meaningful safeguards to prevent fraud or abuse.Sesame, co-founded by Oculus co-creator Brendan Iribe, went viral in late February for its assistant tech, which comes close to clearing uncanny valley territory. Maya and Sesames other assistant, Miles, take breaths and speak with disfluencies, and can be interrupted while speaking, much like OpenAIs Voice Mode. Sesame has raised an undisclosed amount of capital from Andreessen Horowitz, Spark Capital, and Matrix Partners. In addition to building voice assistant tech, the company says its prototyping AI glasses designed to be worn all day thatll be equipped with its custom models.
0 Commentaires ·0 Parts ·40 Vue