Fei-Fei Li's Lab Takes Its First Step in Spatial Intelligence: Generating a 3D World from a Single Image

The lab's latest release is an AI system that generates a 3D world from a single image. Users can "step into" the picture and explore the world inside it in 3D.

Unlike most current generative AI tools that only produce 2D content (such as images or videos), 3D generation offers greater control and consistency. This capability will redefine how movies, games, simulators, and other forms of digital reality are presented.

Camera effects

Once a scene is generated, the system renders it in real time through a virtual camera, which makes a range of photographic effects possible.

Shallow depth of field simulation

Focus on objects at a specific distance to create a blurred background effect:

Close-up / Long shot
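
As a rough illustration of why objects away from the focus distance appear blurred, the textbook thin-lens "circle of confusion" formula can be sketched in Python. This is a generic optics model, not a description of how the system's renderer works; the function name and the sample lens values are assumptions chosen for illustration.

```python
def circle_of_confusion(focal_length_mm: float, f_number: float,
                        focus_dist_mm: float, subject_dist_mm: float) -> float:
    """Blur-spot diameter (mm) on the sensor under the thin-lens model.

    All names and values here are illustrative assumptions.
    """
    if focus_dist_mm <= focal_length_mm:
        raise ValueError("focus distance must exceed the focal length")
    # Aperture diameter follows from the f-number: A = f / N.
    aperture_mm = focal_length_mm / f_number
    # Magnification of the plane that is in focus.
    magnification = focal_length_mm / (focus_dist_mm - focal_length_mm)
    # Blur grows with the relative distance from the focus plane.
    defocus = abs(subject_dist_mm - focus_dist_mm) / subject_dist_mm
    return aperture_mm * magnification * defocus


if __name__ == "__main__":
    # Hypothetical 50 mm lens at f/2, focused on a subject 1 m away.
    for dist_mm in (1000.0, 2000.0, 4000.0):
        blur = circle_of_confusion(50.0, 2.0, 1000.0, dist_mm)
        print(f"object at {dist_mm / 1000:.0f} m -> blur diameter {blur:.3f} mm")
```

An object exactly at the focus distance gets a blur diameter of zero. The blur grows as the object moves away from that distance or as the aperture widens (smaller f-number), which is the blurred-background look described above.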

Dolly zoom simulation

Adjust the camera position and the field of view at the same time, so that the subject stays the same size while the background appears to stretch or compress:

Wide field of view / Narrow field of view
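
The link between camera position and field of view in a dolly zoom can be sketched with basic pinhole-camera geometry: to keep a subject of width w filling the frame, the camera distance must be d = w / (2 tan(fov / 2)). The Python sketch below is a minimal illustration under that assumption; the function names and sample values are hypothetical and are not taken from the system itself.

```python
import math


def distance_for_fov(subject_width: float, fov_deg: float) -> float:
    """Camera distance at which the subject exactly spans the horizontal field of view."""
    half_angle = math.radians(fov_deg) / 2.0
    return subject_width / (2.0 * math.tan(half_angle))


def dolly_zoom_path(subject_width: float, start_fov_deg: float,
                    end_fov_deg: float, steps: int) -> list[tuple[float, float]]:
    """Return (fov_deg, distance) pairs that keep the subject's framing constant."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        # Interpolate the field of view linearly, then solve for the distance.
        fov = start_fov_deg + t * (end_fov_deg - start_fov_deg)
        path.append((fov, distance_for_fov(subject_width, fov)))
    return path


if __name__ == "__main__":
    # Hypothetical subject 2 units wide, narrowing from a 90 to a 30 degree view.
    for fov, dist in dolly_zoom_path(2.0, 90.0, 30.0, 4):
        print(f"fov {fov:5.1f} deg -> camera distance {dist:.2f}")
```

As the field of view narrows, the camera has to move farther back to keep the subject the same size in frame. The background changes scale while the subject does not, which produces the characteristic effect.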


3D effects: the world beyond pixels

Generating a 3D scene has significant advantages over generating 2D pixels directly:

  • Once the generated world is complete, its structure does not change with the perspective.
  • Users can freely move within the generated scene, observing details or exploring hidden areas.
  • The world follows 3D physical rules, presenting a solid sense of depth.

Depth

Color / Depth

Scene

  • Sonar
  • Spotlight
  • Ripples

The scene can also be animated:

  • Breeze
  • Waves
  • Color fluctuations

Step into classic artworks

3D generation lets you experience classic art in new ways. Below are worlds generated from works by Van Gogh, Hopper, Seurat, and Kandinsky. The AI fills in the parts of each scene that the original painting does not show.


Creative workflow

3D world generation combines naturally with other AI tools, giving creators new means of expression. For example, an image generated from text can then be expanded into a 3D world. Different image generation models give the resulting scene distinct styles.

Below are examples of scenes generated by different models from the same text prompt:

A vibrant cartoon-style teenager's bedroom with a bed covered in colorful blankets, a cluttered desk with a computer, posters on the walls, and scattered sports gear. A guitar leans against the wall, and a cozy, patterned rug is in the center. Light from a window adds a warm, youthful vibe to the room.

About World Labs

World Labs is a company dedicated to developing AI with "spatial intelligence". We focus on building large world models (LWMs) that take AI from two-dimensional pixels to full 3D worlds, both virtual and real, and give it spatial intelligence approaching that of humans. The company was founded by leading figures in AI, Dr. Fei-Fei Li, Justin Johnson, Christoph Lassner, and Ben Mildenhall, and our team brings together top talent in computer vision and graphics.

World Labs aims to usher in a new era of spatially intelligent AI. Spatial intelligence is a core part of human intelligence: it enables us to understand and interact with the world around us. From building sandcastles to designing skyscrapers, creation and reasoning alike rely on this ability. Language intelligence drove the early revolution in generative AI; we are now moving to a higher dimension by giving AI spatial intelligence, so that it can understand objects, places, and interactions in the 3D world.