Learning real-world dynamics from visual observations is crucial across graphics, robotics, and embodied AI. A common strategy is to calibrate simulators by estimating physical parameters, yet accuracy is ultimately bounded by the underlying physical models, which usually assume materials are homogeneous and isotropic. Even when this is reasonable, real-world objects typically exhibit mild anisotropy and heterogeneity. Once the near-isotropic backbone is calibrated, these residual effects become the key bottleneck for further closing the real-to-sim gap. Black-box neural dynamics, on the other hand, discard strong physical priors and suffer from poor data efficiency and overfitting.
We propose MoSA, a motion-constrained stress adaptation framework that targets these residual effects directly. MoSA keeps an isotropic constitutive law as a physics prior and learns a structured residual stress operator that progressively adapts stresses via microplane-constrained redistribution in a physics-informed cascaded network. We further impose motion constraints by supervising temporal and spatial derivatives of the deformation field with cues from dynamic 3D Gaussian Splatting reconstruction. On synthetic and real-world multi-view captures, MoSA achieves superior accuracy, generalization, and robustness, while learning physically meaningful residual anisotropy. Deployed in a sim-to-real robot manipulation setting, the better dynamics translate directly into a 26-percentage-point gain in success rate over an isotropic-prior baseline (68/100 vs. 42/100).
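To make the "isotropic prior plus residual stress" idea concrete, here is a minimal NumPy sketch. It is not the MoSA operator: the small-strain Hooke's law prior, the function names, the plane normals, and the per-direction `gains` (standing in for what a network might predict) are all assumptions chosen for illustration. It only shows how a direction-wise residual can be added on top of an isotropic stress so that zero gains recover the prior exactly.

```python
import numpy as np

def isotropic_stress(strain, young=1.0e5, poisson=0.3):
    """Small-strain isotropic linear elasticity (Hooke's law) in 3D."""
    lam = young * poisson / ((1 + poisson) * (1 - 2 * poisson))
    mu = young / (2 * (1 + poisson))
    return lam * np.trace(strain) * np.eye(3) + 2 * mu * strain

def adapted_stress(strain, normals, gains, **material):
    """Isotropic prior plus a hypothetical direction-wise residual.

    normals: (K, 3) unit vectors of sampled planes (illustrative stand-in
             for microplane directions).
    gains:   (K,) per-direction scalars; all zeros recover the prior.
    """
    sigma_iso = isotropic_stress(strain, **material)
    # Normal stress of the prior on each plane: n^T sigma n.
    normal_stress = np.einsum("ki,ij,kj->k", normals, sigma_iso, normals)
    # Rescale each plane's normal stress and project back to a symmetric tensor.
    residual = np.einsum(
        "k,ki,kj->ij", gains * normal_stress, normals, normals
    ) / len(normals)
    return sigma_iso + residual

# Uniaxial strain along x, with planes aligned to the coordinate axes.
strain = np.diag([0.01, 0.0, 0.0])
normals = np.eye(3)
stiffer_along_x = adapted_stress(strain, normals, np.array([0.3, 0.0, 0.0]))
```

Because the residual is built from outer products of the plane normals, the adapted stress stays symmetric, and the anisotropy it introduces is confined to the sampled directions.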
To check that MoSA captures physically meaningful residual effects rather than overfitting, we probe two complementary aspects of the learned model: (i) directional anisotropy of the stress-adaptation operator, and (ii) spatial heterogeneity of the local material field.
Directional anisotropy. We probe the learned operator on a sample with known anisotropic ground truth (GT). (a) Mechanism check. The normalized directional Young's modulus $E_{\text{norm}}(\theta)$ and the normalized Jacobian response $\lVert J\rVert_{\text{norm}}(\theta)$ trace the same anisotropy pattern, showing that the operator's stress-redistribution mechanism is consistent with the directional stiffness it produces. (b) Ground-truth check. The isotropic prior (dashed) collapses to a perfect circle and misses the directional dependence entirely, while a black-box neural fit overshoots and hallucinates spurious anisotropy. MoSA closely tracks the ground truth, confirming that the residual stress operator captures physically meaningful anisotropy rather than overfitting.
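For intuition on why an isotropic prior traces a circle in such a polar plot, the sketch below evaluates the standard off-axis Young's modulus of a 2D orthotropic sheet in plane stress, $1/E(\theta) = \cos^4\theta/E_1 + \sin^4\theta/E_2 + (1/G_{12} - 2\nu_{12}/E_1)\sin^2\theta\cos^2\theta$. The material constants are made up, normalizing by the maximum is an assumption about how $E_{\text{norm}}$ is defined, and the code does not involve the learned operator.

```python
import numpy as np

def directional_modulus(theta, e1, e2, g12, nu12):
    """Off-axis Young's modulus of an orthotropic sheet (plane stress)."""
    c2, s2 = np.cos(theta) ** 2, np.sin(theta) ** 2
    inv_e = c2**2 / e1 + s2**2 / e2 + (1.0 / g12 - 2.0 * nu12 / e1) * s2 * c2
    return 1.0 / inv_e

theta = np.linspace(0.0, 2.0 * np.pi, 361)  # one sample per degree

# Isotropic case: E1 == E2 and G = E / (2 (1 + nu)), so E(theta) is constant.
nu = 0.3
iso = directional_modulus(theta, 1.0, 1.0, 1.0 / (2.0 * (1.0 + nu)), nu)

# Illustrative anisotropic case: twice as stiff along theta = 0.
aniso = directional_modulus(theta, 2.0, 1.0, 0.5, nu)

# Assumed normalization: divide by the maximum over all directions.
iso_norm = iso / iso.max()
aniso_norm = aniso / aniso.max()
```

Plotted in polar coordinates, `iso_norm` is the unit circle, while `aniso_norm` is elongated along the stiff axis and drops to 0.5 at 90 degrees.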
Spatial heterogeneity. The learned field $\eta(\mathbf{x})$ locally modulates the global material parameter and produces smooth, object-dependent stiffness patterns — confirming that the continuous field captures structured material variation rather than fitting unstructured residual noise.
Real-to-sim rollouts on our 12-camera real-world dataset.
Chick1 — elastoplastic squash, residual scaling along the head-to-tail axis.
Chick2 — held-out initial pose; physics-consistent rollout.
Gorilla — plastic squash with residual stress redistribution.
Mandarin — elastic anisotropy along the equatorial direction.
Peanut — elongated body with stiffness gradient between the lobes.
Rabbit — elastoplastic deformation under free-fall impact.
Rainbowball — heterogeneous stiffness across colored sectors.
Per-scene and mean PSNR / SSIM on our real-world multi-view dataset (7 objects). SSIM is scaled by 100.
| Method | Chick1 PSNR↑ | Chick1 SSIM↑ | Gorilla PSNR↑ | Gorilla SSIM↑ | Mandarin PSNR↑ | Mandarin SSIM↑ | Chick2 PSNR↑ | Chick2 SSIM↑ | Peanut PSNR↑ | Peanut SSIM↑ | Rabbit PSNR↑ | Rabbit SSIM↑ | Rainbowball PSNR↑ | Rainbowball SSIM↑ | Mean PSNR↑ | Mean SSIM↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DEL | 28.32 | 91.6 | 29.11 | 90.4 | 31.92 | 92.4 | 28.59 | 90.9 | 28.38 | 91.3 | 28.57 | 91.4 | 30.95 | 91.7 | 29.41 | 91.4 |
| Vid2Sim | 26.71 | 90.9 | 29.02 | 90.0 | 25.85 | 89.8 | 25.70 | 89.5 | 30.69 | 94.6 | 26.85 | 90.3 | 31.71 | 91.7 | 28.08 | 91.0 |
| NeuMA | 30.73 | 92.4 | 29.78 | 91.1 | 31.85 | 92.4 | 28.92 | 91.0 | 30.70 | 91.8 | 28.30 | 92.3 | 30.74 | 91.1 | 30.00 | 91.7 |
| GIC | 30.93 | 92.5 | 29.75 | 91.0 | 29.54 | 91.6 | 28.17 | 90.9 | 32.88 | 91.3 | 28.08 | 91.3 | 30.78 | 91.7 | 30.02 | 91.5 |
| MoSA (Ours) | 32.05 | 92.7 | 30.19 | 91.9 | 32.83 | 92.7 | 30.17 | 91.6 | 33.01 | 92.0 | 30.35 | 92.1 | 32.06 | 92.7 | 31.35 | 92.3 |
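For reference, PSNR is conventionally computed from the mean squared error between a rendered frame and the captured frame. The helper below is a minimal sketch of that standard definition; it assumes pixel values in [0, 1] (`max_val=1.0`) and says nothing about how frames or views are averaged in the table above, which the source does not specify.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val**2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB.
target_img = np.zeros((4, 4, 3))
pred_img = np.full((4, 4, 3), 0.1)
score = psnr(pred_img, target_img)
```

Since PSNR is logarithmic in the error, a gain of about 1 dB corresponds to roughly a 20% reduction in MSE.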
Per-scene and mean CD / EMD on the PAC-NeRF dataset (7 scenes). Both metrics scaled by 100.
| Method | Torus CD↓ | Torus EMD↓ | Cat CD↓ | Cat EMD↓ | Playdoh CD↓ | Playdoh EMD↓ | Droplet CD↓ | Droplet EMD↓ | Cream CD↓ | Cream EMD↓ | Bird CD↓ | Bird EMD↓ | Letter CD↓ | Letter EMD↓ | Mean CD↓ | Mean EMD↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PAC-NeRF | 21.8 | 11.6 | 9.8 | 14.4 | 18.6 | 5.6 | 10.4 | 3.2 | 20.5 | 12.7 | 19.3 | 21.1 | 12.8 | 8.5 | 16.2 | 11.0 |
| DEL | 21.7 | 10.7 | 7.9 | 12.8 | 12.2 | 2.5 | 9.8 | 1.7 | 19.8 | 9.8 | 17.8 | 20.2 | 12.6 | 7.2 | 14.5 | 9.3 |
| GIC | 20.2 | 9.9 | 7.6 | 12.6 | 12.3 | 2.5 | 10.2 | 1.9 | 19.5 | 10.1 | 16.5 | 19.5 | 10.3 | 7.5 | 13.8 | 9.1 |
| MoSA (Ours) | 20.1 | 9.8 | 7.3 | 12.4 | 11.4 | 2.3 | 9.6 | 1.5 | 19.4 | 9.5 | 16.3 | 19.2 | 10.1 | 6.6 | 13.5 | 8.8 |
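The Chamfer distance (CD) compares a simulated point set with a reference one through nearest-neighbor distances. Several conventions exist (squared or unsquared distances, summed or averaged directions), and the exact one used for the table above is not stated here, so the sketch below picks one common variant as an assumption: squared Euclidean distances, averaged per direction and then summed.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, D) and b (M, D).

    Assumed variant: mean squared nearest-neighbor distance in each
    direction, summed over both directions.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances via broadcasting, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Two points shifted by 0.5 along z: each nearest squared distance is 0.25.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
shifted = points + np.array([0.0, 0.0, 0.5])
cd = chamfer_distance(points, shifted)
```

The dense pairwise matrix costs O(NM) memory, which is fine for a few thousand particles; larger clouds usually call for a k-d tree or batching.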
We learn object dynamics from video, train a manipulation policy in the learned simulator, and transfer it zero-shot to a real robot. More accurate real-to-sim dynamics translate directly into more reliable sim-to-real policy execution.
Side-by-side rollouts of the Isotropic Physical Model Baseline and MoSA (Ours) on three tasks: Cup, Rabbit, and Towel.
Cup — grasp and lift a deformable cup.
Rabbit — place an elastic rabbit onto a target box.
Towel — fold and hang a flexible towel.
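Task success over 100 trials, MoSA vs. the isotropic-prior baseline.
| Task | MoSA (Ours) | Isotropic prior | Gain |
|---|---|---|---|
| Place a deformable rabbit onto a white box | 68 / 100 | 42 / 100 | +26 percentage points |
| Hang a deformable object on a peg tower | 82 / 100 | 55 / 100 | +27 percentage points |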
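The gains above are differences between success counts, so they are best read as percentage points rather than relative improvements. The short sketch below spells out both readings, taking "68 / 100" to mean 68 successes in 100 trials; the function name is illustrative.

```python
def success_gain(successes, baseline_successes, trials=100):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = 100.0 * (successes - baseline_successes) / trials
    relative = 100.0 * (successes - baseline_successes) / baseline_successes
    return points, relative

rabbit_points, rabbit_relative = success_gain(68, 42)  # 26 points, about 61.9% relative
peg_points, peg_relative = success_gain(82, 55)        # 27 points, about 49.1% relative
```

Reporting the absolute figure alongside the raw counts avoids ambiguity, since the relative improvement over the baseline is considerably larger than the point difference.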
@inproceedings{wang2026mosa,
title = {{MoSA}: Motion-constrained Stress Adaptation for Mitigating Real-to-Sim Gap
in Continuum Dynamics via Learning Residual Anisotropy},
author = {Wang, Jiaxu and He, Junhao and Coauthors and Advisor},
booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
year = {2026}
}