AI to 3D with Claude, starting from a single image 🦴
- Francesca Fini

- 13 hours ago
- 2 min read
One of the most important things in my AI-generated performances is spatial coherence. Not just making a beautiful image, but building a reusable environment that can hold multiple shots, and the action within them, without collapsing in perspective, proportion, or logic.
There are many commercial tools now addressing this issue head-on (which proves my point), especially in workflows based on Gaussian splats.
But for me, the most reliable solution is still a more grounded one: rebuilding the space in 3D. So AI to 3D.
Recently, I found a way to make that process much faster by combining Claude, Python, and Blender.
I started from a single view of the AI space.
From that image, I generated the other three views using the LUMA Uni-1 model, which, in my experience, understands space, spatial coordinates, and perspective much better than NanoBanana.
I’ll share that workflow soon, especially because it works very well for building equirectangular scenes for VR or Gaussian splatting applications.
I then created a custom Claude script that takes those four views and generates Python code to reconstruct the architecture inside Blender.
What Claude produces is not just a visual interpretation of the space, but a parametric architectural blueprint in code. That means I can keep refining the environment simply by talking to Claude: adjusting dimensions, proportions, relationships, and details before stepping in manually.
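To make the idea of a "parametric architectural blueprint in code" concrete, here is a minimal sketch of what such generated code can look like. This is an illustration, not the author's actual script: the `Room` class, its field names, and the wall layout are all assumptions. The key property is that geometry is derived from named parameters, so a conversational change ("make the room two metres wider") is a one-number edit that regenerates everything.

```python
# Hypothetical example of a parametric blueprint: dimensions live in named
# parameters, and the wall geometry is derived from them.
from dataclasses import dataclass


@dataclass
class Room:
    width: float            # metres, along X
    depth: float            # metres, along Y
    height: float           # metres, along Z
    wall_thickness: float = 0.2

    def wall_blocks(self):
        """Return (location, scale) pairs for four walls. Inside Blender,
        each pair would feed a cube primitive, e.g.
        bpy.ops.mesh.primitive_cube_add(location=location) plus a scale."""
        w, d, h, t = self.width, self.depth, self.height, self.wall_thickness
        z = h / 2  # walls sit on the floor, so their centre is at half height
        return [
            ((0, d / 2, z), (w, t, h)),    # back wall
            ((0, -d / 2, z), (w, t, h)),   # front wall
            ((-w / 2, 0, z), (t, d, h)),   # left wall
            ((w / 2, 0, z), (t, d, h)),    # right wall
        ]


room = Room(width=8.0, depth=6.0, height=3.5)
for location, scale in room.wall_blocks():
    print(location, scale)
```

Refining the space through conversation then amounts to Claude rewriting these parameters and derivations rather than repainting pixels, which is why the result stays consistent across shots.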
The result is not a pixel-perfect copy of the original images, but it is an extremely strong starting point: a fully parametric model, already structured and lit, ready to be refined.
From there, I texture the model manually and render the space properly to create a consistent, spatially coherent environment for all shots.
I also place mannequin placeholders for the characters, which I later replace with final characters rendered through any image-editing model (NanoBanana Pro, Uni-1, Seedream), using reference datasets to maintain character consistency.
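A placeholder mannequin can itself be parametric. The sketch below is a hypothetical example (the function name, proportions, and box layout are my assumptions, not the author's setup): a crude figure built from stacked boxes whose sizes all derive from one height parameter, so it can be dropped at any mark in the Blender scene and scaled per character.

```python
# Illustrative mannequin placeholder: three stacked boxes (legs, torso,
# head) sized from a single height parameter and placed at a floor mark.
def mannequin_blocks(x, y, height=1.8):
    """Return (name, location, scale) triples. In Blender, each triple
    would become a cube primitive parented to one empty at (x, y, 0)."""
    legs = height * 0.50   # rough human proportions
    torso = height * 0.35
    head = height * 0.15
    return [
        ("legs",  (x, y, legs / 2),                (0.35, 0.25, legs)),
        ("torso", (x, y, legs + torso / 2),        (0.45, 0.25, torso)),
        ("head",  (x, y, legs + torso + head / 2), (0.18, 0.20, head)),
    ]


for part in mannequin_blocks(2.0, -1.5):
    print(part)
```

Because the placeholder's silhouette and screen position survive the render, the later image-model pass has a stable anchor to paint the final character over.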
For this video experiment, I worked with the raw geometry of the space, still very basic, with simple lighting and no texturing, and used Dreamina AI + Seedance 2 for character animation.
So the workflow becomes:
AI-generated concept → parametric 3D reconstruction → manual texturing → placeholder staging → Character re-rendering from placeholders → Animation
Each step informs the next. It creates a bridge between generative AI and actual spatial construction, which, for now, is still the most effective way I’ve found to achieve cinematic consistency.
AI can generate a scene.
But if you want that scene to behave like a real place across multiple shots, 3D still matters.


