Sensor-Invariant Tactile Representation
Authors: Harsh Gupta, Yuchen Mo, Shengmiao Jin, Wenzhen Yuan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate our method's effectiveness across various tactile sensing applications, facilitating data and model transferability for future advancements in the field. We evaluate the generalizability of our method across various downstream tasks using multiple real-world GelSight sensors. Our experimental results demonstrate that SITR outperforms baseline models and other related tactile representations in different downstream tasks, showcasing robust transferability and effectiveness. We conduct an ablation study to investigate the impact of the number and type of calibration images on the performance of SITR. |
| Researcher Affiliation | Academia | Harsh Gupta, Yuchen Mo, Shengmiao Jin, Wenzhen Yuan, University of Illinois Urbana-Champaign |
| Pseudocode | No | The paper describes the methodology and architecture in detail but does not present any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | We construct a large-scale synthetic dataset that spans a wide range of tactile sensor configurations, providing tactile signals of contact geometries along with their corresponding normal maps. We collect real-world datasets for training and evaluating downstream tasks across different baselines. |
| Dataset Splits | No | For each downstream task, we freeze the SITR encoder and only train the downstream task-specific decoder on a single sensor. We evaluate this model using the rest of the sensors in the set. The paper describes a strategy for splitting data across sensors for evaluation but does not provide specific training/validation/test splits for the data used to train the task-specific decoders or the synthetic dataset. |
| Hardware Specification | No | Figure 13 shows the modified Ender-3 Pro 3D printer. We mount indentors and collect the pose estimation dataset for multiple sensors. While a 3D printer is mentioned for data collection, no specific hardware (like GPUs or CPUs) used for training or inference of the models is specified. |
| Software Dependencies | No | We use Physics-based Rendering (PBR) (Pharr et al., 2023) to simulate GelSight sensors (Agarwal et al., 2021) and implement the algorithm in Blender. The paper mentions 'Blender' for simulation but does not specify a version number. No other software dependencies with specific version numbers are provided. |
| Experiment Setup | Yes | In summary, the total loss for SITR is defined as L = λ_normal · L_normal + λ_SCL · L_SCL, where λ_normal and λ_SCL are loss weighting hyperparameters. Refer to Section A.1 for more implementation details. We set both loss weighting hyperparameters λ_normal and λ_SCL to 1. For SITR, we choose a temperature of 0.07 for its strong performance in the classification task. Downstream Task Decoders: ... Classification Decoders: We use Cross Entropy Loss for this task. Pose Estimation Decoders: We use MSE loss for this task. |
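The reported loss setup (λ_normal = λ_SCL = 1, contrastive temperature 0.07) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: `supcon_loss` is a standard supervised contrastive loss in the style of Khosla et al. (2020), and the normal-map term is assumed to be a mean-squared error, which the excerpt does not confirm. All function and parameter names here are hypothetical.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.07):
    """Supervised contrastive loss over L2-normalized embeddings.

    Assumption: a standard SupCon formulation; the paper's exact
    L_SCL definition is in its Section A.1, not quoted above.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature          # pairwise similarity logits
    n = len(labels)
    mask = ~np.eye(n, dtype=bool)                # exclude self-comparisons
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits) * mask
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & mask
    # mean log-probability over positives for each anchor, then negate
    per_anchor = (log_prob * pos).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return -per_anchor.mean()

def total_loss(pred_normals, gt_normals, features, labels,
               lam_normal=1.0, lam_scl=1.0, temperature=0.07):
    """Total SITR-style loss: L = λ_normal·L_normal + λ_SCL·L_SCL."""
    l_normal = np.mean((pred_normals - gt_normals) ** 2)  # assumed MSE term
    l_scl = supcon_loss(features, labels, temperature)
    return lam_normal * l_normal + lam_scl * l_scl
```

With both weights set to 1, as reported, the two terms simply add; the low temperature (0.07) sharpens the contrastive softmax, which the paper credits for strong classification performance.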