- [2026/03/01] Released new model weights and training code.
- [2026/01/26] Accepted to ICLR 2026.
Despite significant progress, 3D avatar reconstruction still faces challenges such as high time complexity, sensitivity to data quality, and low data utilization. We propose FastAvatar, a feedforward 3D avatar framework that flexibly leverages diverse daily recordings (e.g., a single image, multi-view observations, or monocular video) to reconstruct a high-quality 3D Gaussian Splatting (3DGS) model within seconds, using a single unified model. The core of FastAvatar is a Large Gaussian Reconstruction Transformer (LGRT) with three key designs: first, a 3DGS transformer that aggregates multi-frame cues while injecting an initial 3D prompt to predict the corresponding registered canonical 3DGS representations; second, multi-granular guidance encoding (camera pose, expression coefficients, head pose) that mitigates animation-induced misalignment for variable-length inputs; third, incremental Gaussian aggregation via landmark tracking and sliced fusion losses. Together, these designs enable incremental reconstruction, i.e., quality improves as more observations are provided, rather than discarding input data as previous works do. This yields a quality-speed-tunable paradigm for highly usable avatar modeling. Extensive experiments show that FastAvatar achieves higher quality and highly competitive speed compared to existing methods. Code and models are available at the project repository.
FastAvatar is a feedforward framework designed to reconstruct a high-quality, animatable 3D Gaussian Splatting (3DGS) avatar from an unordered, variable-length set of observations such as a single selfie, monocular video frames, or multi-view captures. The model consumes RGB observations together with camera parameters, expression coefficients, and head pose, then outputs a canonical 3DGS avatar that can be animated from arbitrary viewpoints.
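To make the input contract above concrete, here is a minimal sketch of how an unordered, variable-length observation set could be represented. It is not taken from the FastAvatar codebase: the class names `Observation` and `ObservationSet`, every field name, and the assumed sizes (a flattened 4x4 camera matrix, a 6-value head pose) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass(frozen=True)
class Observation:
    """One RGB frame with its guidance signals (hypothetical layout)."""
    rgb_path: str                 # path to the RGB frame
    camera_pose: Sequence[float]  # ASSUMPTION: flattened 4x4 matrix, 16 values
    expression: Sequence[float]   # expression coefficients; size depends on the face model
    head_pose: Sequence[float]    # ASSUMPTION: 3 rotation + 3 translation values

    def __post_init__(self):
        # Reject malformed frames early, before they reach any model code.
        if len(self.camera_pose) != 16:
            raise ValueError("camera_pose must hold 16 values")
        if len(self.head_pose) != 6:
            raise ValueError("head_pose must hold 6 values")


@dataclass
class ObservationSet:
    """Unordered, variable-length collection of observations of one subject."""
    observations: List[Observation] = field(default_factory=list)

    def add(self, obs: Observation) -> None:
        # All frames must share one expression parameterization.
        if self.observations:
            expected = len(self.observations[0].expression)
            if len(obs.expression) != expected:
                raise ValueError("expression dimension mismatch")
        self.observations.append(obs)

    def __len__(self) -> int:
        return len(self.observations)

    def is_valid_input(self) -> bool:
        # A single image is already a valid input; more frames are optional.
        return len(self.observations) >= 1


# Usage: a single selfie and a short video clip use the same container.
IDENTITY = [1.0, 0, 0, 0, 0, 1.0, 0, 0, 0, 0, 1.0, 0, 0, 0, 0, 1.0]

selfie = ObservationSet()
selfie.add(Observation("selfie.png", IDENTITY, [0.0] * 50, [0.0] * 6))

clip = ObservationSet()
for i in range(8):
    clip.add(Observation(f"frame_{i:03d}.png", IDENTITY, [0.0] * 50, [0.0] * 6))
```

Using one container for both cases mirrors the single-unified-model idea: the number of frames changes, while the per-frame fields stay the same.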
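The pipeline is centered on the Large Gaussian Reconstruction Transformer (LGRT), which combines a 3DGS transformer for multi-frame aggregation, multi-granular guidance encoding (camera pose, expression coefficients, head pose), and incremental Gaussian aggregation.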
@inproceedings{wu2026fastavatar,
title={FastAvatar: Towards Unified and Fast 3D Avatar Reconstruction with Large Gaussian Reconstruction Transformers},
author={Yue Wu and Xuanhong Chen and Yufan Wu and Wen Li and Yuxi Lu and Kairui Feng},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=P7zBSCs4Xt}
}