Cosmos-Predict2 is NVIDIA’s next-generation physical world foundation model, designed for high-quality visual generation and prediction tasks in physical AI scenarios. The model emphasizes physical accuracy, environmental interactivity, and detail reproduction, enabling realistic simulation of complex physical phenomena and dynamic scenes.

Cosmos-Predict2 supports several generation methods, including Text-to-Image (Text2Image) and Video-to-World (Video2World), and is used in industrial simulation, autonomous driving, urban planning, scientific research, and other fields. It is intended as a foundational tool for integrating intelligent vision with the physical world.

GitHub: Cosmos-predict2
Hugging Face: Cosmos-Predict2

This guide walks you through Video2World generation in ComfyUI. For text-to-image generation, see the following guide:

Cosmos Predict2 Text to Image

Using Cosmos-Predict2 for text-to-image generation