🌀Spatial Reasoners¶

A Python package for spatial reasoning over continuous variables with generative denoisers.

Try it out with pip install spatialreasoners, and check our example project.

Overview¶

🌀Spatial Reasoners is a Python package for spatial reasoning over continuous variables with generative denoising models. Denoising generative models have become the de facto standard for image generation because they sample effectively from complex, high-dimensional distributions. More recently, they have also been explored for reasoning over multiple continuous variables.

Our package provides a comprehensive framework to facilitate research in this area, offering easy-to-use interfaces to control:

  • Variable Mapping: Seamlessly map to variables from arbitrary data domains.
  • Generative Model Paradigms: Flexibly work with a wide range of denoising formulations.
  • Samplers & Inference Strategies: Implement and experiment with diverse samplers and inference techniques.
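As a rough illustration of what an inference strategy over multiple variables can look like, the sketch below builds two noise-level schedules: one where all variables are denoised together, and one where they are denoised one after another. The function names and the schedule representation (a list of per-variable noise levels for each step) are assumptions made for this example and are not the package's API.

```python
def parallel_schedule(num_variables, num_steps):
    """All variables share one noise level that decreases from 1 to 0."""
    return [[1.0 - step / num_steps] * num_variables for step in range(num_steps + 1)]


def sequential_schedule(num_variables, steps_per_variable):
    """Variables are denoised one after another, each going from 1 to 0."""
    levels = [1.0] * num_variables
    schedule = [list(levels)]  # start with every variable at full noise
    for index in range(num_variables):
        for step in range(1, steps_per_variable + 1):
            levels[index] = 1.0 - step / steps_per_variable
            schedule.append(list(levels))
    return schedule


# Each row is one sampling step, each column the noise level of one variable.
for row in sequential_schedule(num_variables=2, steps_per_variable=2):
    print(row)
```

A sampler would then call the denoiser once per row, passing the per-variable noise levels of that step.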

🌀Spatial Reasoners is a generalization of Spatial Reasoning Models (SRMs) to new domains, packaged as a reusable library for the research community.

Key Features¶

  • 🚀 One-line Training: Get started with minimal setup using sensible defaults
  • 🔧 Flexible Configuration: Powerful config system with automatic merging of local and embedded configurations
  • 📦 Modular Architecture: Extensible design with pluggable components for datasets, models, and training strategies
  • 🔬 Research-Ready: Built-in benchmarks, evaluation protocols, and example projects
  • ⚡ Production-Ready: Lightning-based training infrastructure with distributed training support
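To illustrate what merging local and embedded configurations means in practice, here is a minimal sketch of a recursive merge in which local values override embedded defaults. It is a generic illustration only: the function merge_configs and all config keys are hypothetical, and the package's own config system may behave differently.

```python
from copy import deepcopy


def merge_configs(defaults: dict, overrides: dict) -> dict:
    """Recursively merge `overrides` into a copy of `defaults`."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Otherwise the local value replaces the default.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical embedded defaults and local overrides (keys are illustrative).
embedded = {"denoiser": {"name": "unet", "hidden_dim": 128}, "trainer": {"max_steps": 1000}}
local = {"denoiser": {"name": "dit"}, "trainer": {"max_steps": 5000}}

config = merge_configs(embedded, local)
print(config)
```

Keys that the local config does not mention (here hidden_dim) keep their embedded default, which is what lets a project override only the settings it cares about.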

Architecture Overview¶

Spatial Reasoners is built with modularity and extensibility in mind:

spatialreasoners/
├── api/                  # High-level API
├── dataset/              # Data loading and processing
├── denoising_model/      # Model implementations
│   ├── denoiser/         # Denoiser architectures (UNet, DiT, MAR, etc.)
│   ├── flow/             # Flow variants (rectified, cosine, etc.)
│   └── tokenizer/        # Tokenizers of variables for the denoiser
├── training/             # Training infrastructure
├── variable_mapper/      # Variable mapping logic
├── benchmark/            # Evaluation framework
└── configs/              # Embedded default configs

Research Applications¶

Spatial Reasoners generalize diffusion models by allowing different noise levels within a single sample. Directions similar to SRMs have been explored by, for example, MAR, Rolling Diffusion, Diffusion Forcing, and the concurrent xAR, and Spatial Reasoners lets you build similar setups. For some architectures (such as UNet, DiT, xAR's variant of DiT, or History Guided Diffusion's U-ViT-pose), you can just specify the denoiser config and start training directly.
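To make the idea of different noise levels within one sample concrete, here is a minimal sketch in plain Python. It is not the package's API: the function name noise_sample, the toy data, and the choice of a rectified-flow style linear interpolation x_t = (1 - t) * x + t * eps are assumptions made for illustration.

```python
import random


def noise_sample(variables, noise_levels, rng):
    """Noise each variable with its own level t in [0, 1].

    Uses linear interpolation between data and Gaussian noise,
    x_t = (1 - t) * x + t * eps (an assumed, rectified-flow style choice).
    """
    if len(variables) != len(noise_levels):
        raise ValueError("need exactly one noise level per variable")
    noised = []
    for values, t in zip(variables, noise_levels):
        noised.append([(1.0 - t) * v + t * rng.gauss(0.0, 1.0) for v in values])
    return noised


rng = random.Random(0)
# Four variables (for example image patches or video frames), two values each.
sample = [[0.5, -0.5], [1.0, 0.0], [0.2, 0.8], [-1.0, 1.0]]

# Standard diffusion: one noise level shared by every variable.
uniform = noise_sample(sample, [0.7, 0.7, 0.7, 0.7], rng)

# Per-variable levels: variable 0 stays clean (t=0), variable 3 is pure noise (t=1).
mixed = noise_sample(sample, [0.0, 0.3, 0.6, 1.0], rng)
```

With per-variable levels, a denoiser can condition on the clean variables while generating the noisy ones, which is the kind of setup shared by the methods listed above.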

In some domains, you can get started even faster thanks to already implemented Variable Mappers and some evaluations. This applies to tasks such as:

  • Sudoku generation: our MNIST Sudoku dataset
  • Image generation: prepared dataset implementations for ImageNet, CIFAR10, CelebA, SRM's Counting Stars, and many others
  • Video generation: a variable is a single frame, as in Diffusion Forcing
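The tasks above differ mainly in how raw data is split into variables. As a hedged sketch of that idea, the following plain-Python functions split a 2D grid into non-overlapping patches (one variable per patch) and reassemble it. The names to_variables and from_variables are hypothetical and do not correspond to the package's Variable Mapper interface.

```python
def to_variables(image, patch):
    """Split an H x W grid (list of rows) into non-overlapping patch x patch variables."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    variables = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            variables.append([row[left:left + patch] for row in image[top:top + patch]])
    return variables


def from_variables(variables, height, width, patch):
    """Inverse of to_variables: reassemble patches into an H x W grid."""
    image = [[None] * width for _ in range(height)]
    per_row = width // patch  # number of patches in each row of patches
    for index, block in enumerate(variables):
        top, left = (index // per_row) * patch, (index % per_row) * patch
        for dy, block_row in enumerate(block):
            image[top + dy][left:left + patch] = block_row
    return image


image = [[r * 4 + c for c in range(4)] for r in range(4)]
variables = to_variables(image, patch=2)
restored = from_variables(variables, 4, 4, patch=2)
```

For video, the same pattern would treat each frame as one variable instead of each patch.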

We also highly encourage you to take Spatial Reasoners to completely new domains. Our example project shows how to train new models in your domain!

Citation¶

If you use Spatial Reasoners in your research, please cite:

@inproceedings{pogodzinski25spatialreasoners,
  title={Spatial Reasoners for Continuous Variables in Any Domain},
  author={Bart Pogodzinski and Christopher Wewer and Bernt Schiele and Jan Eric Lenssen},
  booktitle={Championing Open-source DEvelopment in ML Workshop @ ICML25},
  year={2025},
  url={https://arxiv.org/abs/2507.10768},
}

@inproceedings{wewer25srm,
  title     = {Spatial Reasoning with Denoising Models},
  author    = {Wewer, Christopher and Pogodzinski, Bartlomiej and Schiele, Bernt and Lenssen, Jan Eric},
  booktitle = {International Conference on Machine Learning ({ICML})},
  year      = {2025},
}

Support & Community¶