LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion

Authors: Biao Zhang, Peter Wonka

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The main autoencoding experiment is trained on Objaverse (Deitke et al., 2023). Models are zero-centered and normalized into the unit sphere. [...] We use Chamfer distance and F-score as the metrics. The results are shown in Table 4. [...] To further prove the generalization ability of LaGeM-Objaverse, we also test the autoencoding on various datasets, including Thingi10K (Zhou & Jacobson, 2016), ABO (Collins et al., 2022), EGAD (Morrison et al., 2020), GSO (Downs et al., 2022), Pix3D (Sun et al., 2018), and FAUST (Bogo et al., 2014)."
Researcher Affiliation | Academia | "Biao Zhang, Peter Wonka, KAUST, Saudi Arabia"
Pseudocode | Yes | "A.1 VOLUME POINTS SAMPLING. We sample volume points uniformly in the bounding sphere.
N_vol = 250000
vol_points = np.random.randn(N_vol, 3)
vol_points = vol_points / np.linalg.norm(vol_points, axis=1)[:, None] * [...]" (quoted snippet is cut off in the source)
Open Source Code | Yes | "We release our model trained on a 600k geometry dataset."
Open Datasets | Yes | "The main autoencoding experiment is trained on Objaverse (Deitke et al., 2023). [...] To further prove the generalization ability of LaGeM-Objaverse, we also test the autoencoding on various datasets, including Thingi10K (Zhou & Jacobson, 2016), ABO (Collins et al., 2022), EGAD (Morrison et al., 2020), GSO (Downs et al., 2022), Pix3D (Sun et al., 2018), and FAUST (Bogo et al., 2014)."
Dataset Splits | Yes | "We also apply the method to ShapeNet, where the train split is taken from (Zhang et al., 2022). [...] Table 5: Generalization on Various Datasets. [...] ShapeNet (Chang et al., 2015)-test: 2k, Yes, 3.25, 2.33, -0.92, 97.41, 99.49, 2.08"
Hardware Specification | Yes | "For ShapeNet, the denoising networks of the 3 levels have 12 self-attention blocks with 768 channels. We trained the model for around 200 hours with 4 A100 GPUs. [...] The model is trained on 16 A100 GPUs for around 100 hours."
Software Dependencies | No | "The proposed regularization (see Table 2) is implemented with layer normalization (PyTorch code)."
Experiment Setup | Yes | "The three levels of latents are 128 x 64, 512 x 32, and 2048 x 16 (where 64, 32, and 16 are channels of the latents). Some other hyperparameters of the network can also be found in Table 3. [...] For ShapeNet, the denoising networks of the 3 levels have 12 self-attention blocks with 768 channels. [...] The loss is binary cross entropy as in previous work (Zhang et al., 2022)."
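The evidence above quotes Chamfer distance and F-score as the autoencoding metrics. A minimal brute-force NumPy sketch of both metrics follows; the function names, the squared-distance convention, and the F-score threshold are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    For each point, take the squared distance to its nearest neighbor in
    the other set, and sum the two directional averages.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def f_score(a, b, threshold=0.01):
    """F-score at a distance threshold: harmonic mean of precision
    (fraction of a within threshold of b) and recall (the reverse)."""
    d = np.sqrt(np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1))
    precision = (d.min(axis=1) < threshold).mean()
    recall = (d.min(axis=0) < threshold).mean()
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The O(N*M) pairwise matrix is fine for a few thousand points; a KD-tree nearest-neighbor query would replace it at larger scales.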
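The pseudocode row quotes the paper's volume-point sampling, but the snippet is cut off after the normalization step. A complete, runnable sketch of uniform sampling in the unit ball is below; the cube-root radius factor is a reconstruction of the standard technique, not a verbatim continuation of the paper's code:

```python
import numpy as np

N_vol = 250000

# Draw directions uniformly on the unit sphere by normalizing Gaussians.
vol_points = np.random.randn(N_vol, 3)
vol_points = vol_points / np.linalg.norm(vol_points, axis=1)[:, None]

# Scale each radius by U^(1/3) so points are uniform over the *volume*
# of the unit ball: the cube root compensates for the r^2 growth of
# shell area. (Reconstructed step; the quoted snippet ends before it.)
vol_points = vol_points * np.random.rand(N_vol, 1) ** (1.0 / 3.0)
```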
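The Software Dependencies row notes that the proposed regularization is implemented with layer normalization in PyTorch. As a dependency-light illustration, this NumPy sketch reproduces what a default non-affine PyTorch `nn.LayerNorm` computes over the channel axis, applied to a hypothetical latent with the paper's coarsest shape (128 tokens x 64 channels):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Layer normalization over the last (channel) axis: subtract the
    mean and divide by the biased standard deviation, as in PyTorch's
    nn.LayerNorm without learnable affine parameters."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # biased (ddof=0), as in PyTorch
    return (x - mean) / np.sqrt(var + eps)

# Hypothetical latent matching the coarsest level (128 x 64); after
# normalization every token is zero-mean with unit variance, which is
# what makes the latents well-scaled for the downstream diffusion model.
latents = np.random.randn(128, 64)
normed = layer_norm(latents)
```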
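The Experiment Setup row states the training loss is binary cross entropy, as in prior occupancy-based work. A numerically stable sketch of BCE on raw occupancy logits (the function name and calling convention are assumptions; this mirrors the standard BCE-with-logits formulation):

```python
import numpy as np

def occupancy_bce(logits, occupancy):
    """Binary cross entropy between predicted occupancy logits and 0/1
    ground-truth labels, averaged over all query points.

    Uses the stable identity
        BCE(x, y) = max(x, 0) - x*y + log(1 + exp(-|x|)),
    which avoids overflow for large-magnitude logits.
    """
    return np.mean(np.maximum(logits, 0.0)
                   - logits * occupancy
                   + np.log1p(np.exp(-np.abs(logits))))
```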