Character AI Launches AvatarFX Model That Generates Consistent Videos from Images

Character AI, a California-based AI platform, has introduced its first video generation model, named AvatarFX, which can convert images into 2D and 3D animated videos. The company claims that videos generated by AvatarFX will maintain temporal consistency, ensuring that elements like facial expressions and hand and body movements remain smooth and coherent across frames. The videos will also incorporate speech, powered by Character AI's native text-to-speech (TTS) models. AvatarFX is expected to be released in the coming months, with paid subscribers gaining early access to the tool.

AvatarFX marks a significant expansion for Character AI, which has until now focused primarily on text- and image-based models. With this new model, the company ventures into AI-generated video, allowing users to create animated characters that can move and speak. However, unlike most video generation models, AvatarFX will not generate realistic human characters. Instead, it focuses on 2D and 3D cartoon characters, as well as non-human faces. The goal is to give users a tool for more creative and controlled video generation.

A key feature of AvatarFX is its emphasis on temporal consistency: the generated videos are meant to preserve continuity of movement, with facial expressions and hand and body gestures remaining fluid between frames. The company asserts that the model will significantly reduce glitches and inconsistencies, such as extra limbs or distorted facial expressions, which often occur in AI-generated video. While these claims sound promising, the true capabilities of AvatarFX can only be confirmed once the model is officially released.

One important distinction of AvatarFX is that it will not generate videos from text prompts; instead, the model accepts images as its sole input. Character AI believes this approach gives users better control over the video generation process, ensuring that the resulting videos are closer to the user's vision. The inclusion of speech, powered by the company's TTS models, adds another layer of realism to the animated content, making it more engaging and dynamic. The move signals Character AI's push to change how we create and interact with AI-generated videos, opening new possibilities for animation and storytelling.