VisRec: A Semi-Supervised Approach to Visibility Data Reconstruction in Radio Astronomy

Authors: Ruoqi Wang, Haitao Wang, Qiong Luo, Feng Wang, Hejun Wu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation results show that VisRec is applicable to various models, and outperforms all baseline methods in terms of reconstruction quality, robustness, and generalizability.
Researcher Affiliation | Academia | 1 The Hong Kong University of Science and Technology (Guangzhou); 2 School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China; 3 The Hong Kong University of Science and Technology; 4 Guangzhou University. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Main process of VisRec.
Open Source Code | Yes | Code: https://github.com/RapidsAtHKUST/VisRec
Open Datasets | Yes | For reference images, we use the Galaxy10 DECals dataset (Henry 2021), which contains 17,736 images of various galaxies. The dataset is from the DESI Legacy Imaging Surveys (Dey et al. 2019), which merges data from the Beijing-Arizona Sky Survey (BASS) (Zou et al. 2017), the DECam Legacy Survey (DECaLS) (Blum et al. 2016), and the Mayall z-band Legacy Survey (Silva et al. 2016). Using these images as a reference, we employ the eht-imaging toolkit (Chael et al. 2019, 2018) to produce visibility data represented by {u_s, v_s, V(u_s, v_s)}.
Dataset Splits | Yes | For each dataset, we randomly select 1,024 examples for testing, and the remaining samples are used for training. More details regarding the datasets are reported in the appendix. To simulate a label-scarce setting, we randomly split a small number of examples from the training set as the labeled dataset, and the remaining samples are all used as unlabeled data. Maintaining a constant total training data size |D| = 16,692, we vary the size of the labeled subset |D^(s)| from 64 to 2k (k = 1024) and use the remaining data as the unlabeled subset.
Hardware Specification | Yes | Platform: We conduct all experiments on a server with two AMD EPYC 7763 CPUs, 512GB main memory, and eight Nvidia RTX 4090 GPUs, each with 24GB device memory. The server is equipped with two 2TB NVMe SSDs and two 16TB SATA hard disks.
Software Dependencies | Yes | Our model is implemented in PyTorch 1.8.1 (Paszke et al. 2019).
Experiment Setup | Yes | The total loss L_total is a weighted sum of these two losses, where λ is a hyperparameter that balances the two components. The model parameters θ are updated to minimize L_total. ... We evaluate the impact of consistency loss weight λ in VisRec on the reconstruction of visibilities with different noise corruption levels. The results are shown in Figure 8.
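The split procedure quoted in the Dataset Splits row (a random held-out test set of 1,024 examples, then a small labeled subset carved from the training pool with the rest used as unlabeled data) can be sketched as below. This is a minimal illustration, not the paper's actual code; the function and parameter names are assumptions, and the totals here follow the Galaxy10 DECals count of 17,736 images rather than the |D| = 16,692 training size reported in the paper.

```python
import random

def split_dataset(n_examples, n_test=1024, n_labeled=64, seed=0):
    """Randomly partition example indices into test, labeled, and
    unlabeled subsets, mirroring the label-scarce setup described above."""
    rng = random.Random(seed)
    idx = list(range(n_examples))
    rng.shuffle(idx)
    test = idx[:n_test]                       # held-out evaluation set
    labeled = idx[n_test:n_test + n_labeled]  # small supervised subset D^(s)
    unlabeled = idx[n_test + n_labeled:]      # remaining unlabeled pool
    return test, labeled, unlabeled

test, labeled, unlabeled = split_dataset(17736, n_labeled=64)
```

Varying `n_labeled` from 64 to 2048 (2k with k = 1024) while keeping `n_examples` fixed reproduces the paper's constant-total-data protocol.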
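The loss described in the Experiment Setup row, a weighted sum of a supervised loss and a consistency loss balanced by λ, reduces to a one-line combination. The sketch below is illustrative only; the function and argument names are assumptions and do not come from VisRec's released code.

```python
def total_loss(supervised_loss, consistency_loss, lam):
    """L_total = L_supervised + lambda * L_consistency, where `lam` (λ)
    balances the supervised and consistency components."""
    return supervised_loss + lam * consistency_loss
```

In training, this scalar would be minimized with respect to the model parameters θ; sweeping `lam` corresponds to the λ-sensitivity study referenced in Figure 8.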