Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery
Authors: Pratinav Seth, Michelle Lin, Brefo Dwamena Yaw, Jade Boutot, Mary Kang, David Rolnick
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate baseline algorithms for well detection and segmentation, showing the promise of computer vision approaches but also significant room for improvement. We evaluate a wide range of state-of-the-art deep learning algorithms, showing promising performance but emphasizing the challenging nature of the task. Table 4. Results for the binary segmentation task for a variety of models evaluated over the test set. |
| Researcher Affiliation | Academia | 1Mila Quebec AI Institute, Montreal, Canada 2Department of Data Science & Computer Application, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India 3Université de Montréal, Montreal, Canada 4McGill University, Montreal, Canada. Correspondence to: Pratinav Seth <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Clustering Algorithm for Dataset Splitting |
| Open Source Code | Yes | Our Benchmarking Code is available at: https://github.com/RolnickLab/Alberta_Wells_Dataset. |
| Open Datasets | Yes | We introduce the Alberta Wells Dataset, containing information on over 200k abandoned, suspended, and active onshore oil and gas wells with high-resolution satellite imagery. The Alberta Wells Dataset is available at: https://zenodo.org/records/13743323. |
| Dataset Splits | Yes | Table 2 (statistics of instances and wells across the dataset): Train — 167,436 total patches (83,718 well/non-well patches), with 46,342 abandoned, 47,595 suspended, and 100,294 active wells; Validation — 9,463 total patches (4,731 well/non-well patches), with 3,166 abandoned, 2,671 suspended, and 2,406 active wells; Test — 11,789 total patches (5,894 well/non-well patches), with 4,024 abandoned, 3,609 suspended, and 3,340 active wells. To create a well-distributed dataset that represents various geographical regions and offers a diverse dataset for evaluation, we developed a splitting algorithm (Algorithm 1) which focuses on balancing regions, not individual examples, ensuring that both the training and test sets reflect a diverse range of regions from Alberta's varied landscape. |
| Hardware Specification | No | We also thank Mila and the Digital Research Alliance of Canada for provision of computing facilities, and NVIDIA Corporation for additional computational resources. This statement does not provide specific hardware details such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | We acquired multispectral satellite imagery data from Planet Labs, which comprises four bands (RGBN) with a 3-meter-per-pixel resolution obtained through their proprietary API. This data was processed using quality-controlled and cleaned well data to generate segmentation and object detection annotations. The annotations were created using custom Python code, leveraging libraries like Shapely, GeoPandas, and Rasterio, and were validated through visualization using folium and matplotlib. The paper mentions software libraries like Shapely, GeoPandas, Rasterio, folium, and matplotlib but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We train all CNN-based models using a ResNet50 backbone, a batch size of 128, and the BCELogits loss function. To fine-tune the model, a cosine annealing scheduler (Loshchilov & Hutter, 2017) is used, which adjusts the learning rate smoothly in a cyclical manner by gradually decreasing it. ... For transformer-based models, while both Segformer and UperNet use a Dice loss function and a polynomial learning rate scheduler, Segformer utilizes a mit-b0-ade (Xie et al., 2021) backbone with a batch size of 128, while UperNet uses ConvNeXt (small and base) and Swin Transformer backbones with a batch size of 64. All models are optimized using AdamW for 50 epochs. ... All object detection models are trained using a ResNet50 backbone, except for SSDLite, which is trained with a MobileNet backbone. The batch size is set to 256 for Faster R-CNN and FCOS and 512 for RetinaNet and SSDLite. We used a cosine annealing scheduler (Loshchilov & Hutter, 2017) and trained all models for 120 epochs. DETR similarly uses a ResNet50 backbone but has a batch size set to 64. All models are optimized using the AdamW optimizer. We augment images by randomly resizing images to 256×256, ensuring all bounding boxes remain intact for object detection. We then apply horizontal and vertical flipping with a probability of 0.25 each, followed by normalization using channel-wise mean and standard deviation calculated from the training split of the dataset. |
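The cosine annealing scheduler cited in the experiment setup decreases the learning rate along a half-cosine curve from the base value toward (near) zero over the run. A minimal sketch in pure Python, assuming the 50-epoch CNN schedule quoted above; the base learning rate of 1e-3 is an assumption, as the excerpt does not quote one:

```python
import math

def cosine_annealing_lr(t, t_max, lr_max, lr_min=0.0):
    """Cosine-annealed learning rate at epoch t (Loshchilov & Hutter, 2017)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / t_max))

# Example over the 50-epoch CNN schedule from the setup above.
EPOCHS, BASE_LR = 50, 1e-3  # BASE_LR is an assumption, not from the paper
schedule = [cosine_annealing_lr(t, EPOCHS, BASE_LR) for t in range(EPOCHS + 1)]
# schedule[0] == BASE_LR; schedule[EPOCHS] decays to ~0, with the steepest
# decrease around the midpoint of training.
```

In a framework such as PyTorch this corresponds to stepping a built-in cosine annealing scheduler once per epoch after the optimizer update.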