Picsart’s artificial intelligence research team (PAIR) has built a new generative model that creates entirely new video content from text descriptions alone.
The technology, a form of text-to-video generative artificial intelligence (AI), was announced on Twitter and released as an open-source demonstration on GitHub and Hugging Face. The team behind it has also published a research paper describing the methodology.
“Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets. In this paper, we introduce a new task of zero-shot text-to-video generation and propose a low-cost approach (without any training or optimization) by leveraging the power of existing text-to-image synthesis methods (e.g., Stable Diffusion), making them suitable for the video domain,” the researchers explain.
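The core idea in the quote is that a text-to-image model can be repurposed for video without retraining by reusing a single initial latent across frames and shifting it to simulate motion, so the scene stays globally consistent. The toy NumPy sketch below illustrates only that latent-reuse step; it is not the team's code, and the function names, array sizes, and integer-shift "motion" are illustrative assumptions, not details from the paper.

```python
import numpy as np

def warp_latent(latent, dx, dy):
    # Toy stand-in for motion-dynamics enrichment: translate the
    # latent grid to simulate global scene/camera motion.
    return np.roll(latent, shift=(dy, dx), axis=(0, 1))

def make_frame_latents(rng, n_frames, size=64, step=2):
    # One shared base latent keeps the scene globally consistent;
    # each frame's latent is a progressively shifted copy of it,
    # rather than an independent sample per frame.
    base = rng.standard_normal((size, size))
    return [warp_latent(base, k * step, 0) for k in range(n_frames)]

rng = np.random.default_rng(0)
latents = make_frame_latents(rng, n_frames=8)
# In the real method, each of these latents would be denoised by a
# frozen text-to-image model (e.g., Stable Diffusion) into one frame.
```

In the actual system each per-frame latent would be fed through a frozen text-to-image diffusion model conditioned on the prompt; because every latent is derived from the same base sample, consecutive frames depict the same scene rather than eight unrelated images.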