LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
Authors: Biao Zhang, Peter Wonka
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The main autoencoding experiment is trained on Objaverse (Deitke et al., 2023). Models are zero-centered and normalized into the unit sphere. [...] We use Chamfer distance and F-score as the metrics. The results are shown in Table 4. [...] To further prove the generalization ability of LaGeM-Objaverse, we also test the autoencoding on various datasets, including Thingi10k (Zhou & Jacobson, 2016), ABO (Collins et al., 2022), EGAD (Morrison et al., 2020), GSO (Downs et al., 2022), Pix3D (Sun et al., 2018) and FAUST (Bogo et al., 2014). |
| Researcher Affiliation | Academia | Biao Zhang, Peter Wonka KAUST, Saudi Arabia EMAIL |
| Pseudocode | Yes | A.1 VOLUME POINTS SAMPLING. We sample volume points uniformly in the bounding sphere. `N_vol = 250000; vol_points = np.random.randn(N_vol, 3); vol_points = vol_points / np.linalg.norm(vol_points, axis=1)[:, None] * [...]` |
| Open Source Code | Yes | We release our model trained on a 600k geometry dataset. |
| Open Datasets | Yes | The main autoencoding experiment is trained on Objaverse (Deitke et al., 2023). [...] To further prove the generalization ability of LaGeM-Objaverse, we also test the autoencoding on various datasets, including Thingi10k (Zhou & Jacobson, 2016), ABO (Collins et al., 2022), EGAD (Morrison et al., 2020), GSO (Downs et al., 2022), Pix3D (Sun et al., 2018) and FAUST (Bogo et al., 2014). |
| Dataset Splits | Yes | We also apply the method to ShapeNet, where the train split is taken from (Zhang et al., 2022). [...] Table 5: Generalization on Various Datasets. [...] ShapeNet (Chang et al., 2015)-test 2k Yes 3.25 2.33 -0.92 97.41 99.49 2.08 |
| Hardware Specification | Yes | For ShapeNet, the denoising networks of the 3 levels have 12 self-attention blocks with 768 channels. We trained the model for around 200 hours with 4 A100 GPUs. [...] The model is trained on 16 A100 GPUs for around 100 hours. |
| Software Dependencies | No | The proposed regularization (see Table 2) is implemented with layer normalization (PyTorch code). |
| Experiment Setup | Yes | The three levels of latents are 128 x 64, 512 x 32, and 2048 x 16 (where 64, 32, and 16 are channels of the latents). Some other hyperparameters of the network can also be found in Table 3. [...] For ShapeNet, the denoising networks of the 3 levels have 12 self-attention blocks with 768 channels. [...] The loss is binary cross entropy as in previous work (Zhang et al., 2022). |
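The pseudocode excerpt for volume-point sampling (Appendix A.1) is truncated after the sphere projection. A minimal sketch of the idea it describes: draw Gaussian directions, normalize to the unit sphere, then scale radii so points are uniform in the ball. The radial scaling by `U^(1/3)` is an assumption about the truncated part; the quoted snippet only confirms the first two steps.

```python
import numpy as np

# Sketch of uniform sampling in the bounding (unit) sphere.
# Steps 1-2 follow the quoted snippet; step 3 (radius scaling) is assumed.
N_vol = 250000
rng = np.random.default_rng(0)

# 1) Gaussian directions; 2) project onto the unit sphere surface.
vol_points = rng.standard_normal((N_vol, 3))
vol_points = vol_points / np.linalg.norm(vol_points, axis=1)[:, None]

# 3) Assumed: scale radii by U^(1/3) so the density is uniform in volume
# (the CDF of the radius of a uniform ball is r^3).
vol_points = vol_points * rng.random((N_vol, 1)) ** (1 / 3)
```

For a uniform distribution in the unit ball, the mean radius is 3/4, which gives a quick sanity check on the sampler.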