Learning Cross-Domain Representations for Transferable Drug Perturbations on Single-Cell Transcriptional Responses
Authors: Hui Liu, Shikai Jin
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive evaluations of our model on multiple datasets, including single-cell transcriptional responses to drugs and single and combinatorial genetic perturbations. The experimental results show that XTransferCDR achieved better performance than current state-of-the-art methods, showcasing its potential to advance phenotypic drug discovery. |
| Researcher Affiliation | Academia | College of Computer and Information Engineering, Nanjing Tech University, Nanjing, 211816, China |
| Pseudocode | No | The paper describes the framework, equations, and components but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code: https://github.com/hliulab/XTransferCDR |
| Open Datasets | Yes | We initially evaluated the model's performance on the single-cell chemical response dataset from the sci-Plex project (Srivatsan et al. 2020). The sci-Plex3 dataset contains the single-cell transcriptional responses of three human cancer cell lines (MCF7, K562, and A549) exposed to 188 different drugs. Moreover, we have noticed that the sci-Plex project has released a new dataset, sci-Plex4, thereby we extended our evaluation to this dataset. We further evaluated our method on two single-cell datasets established by genetic perturbation assays (Replogle et al. 2021). To further validate the effectiveness of learned transferable perturbations, we carried out systematic evaluation on another dataset that was generated through CRISPR-based knockout (deactivation) of multiple genes, aimed at observing the consequent alterations in single-cell phenotypes (Norman et al. 2019). |
| Dataset Splits | Yes | For model evaluation, the expression profiles induced by these nine drugs were held out as the test set (n=3,071), while the remaining data were used to create the paired samples for training (n=101,190) and validation set (n=8,499) with a 4:1 ratio. Following the drug-level data partitioning strategy, the sci-Plex4 dataset was randomly divided into a training set (n=7,104), validation set (n=718), and test set (n=733). The K562 dataset was divided into a training set (n=64,249), a validation set (n=2,234), and a test set (n=2,233). The RPE-1 dataset was divided into a training set (n=72,200), a validation set (n=2,045), and a test set (n=2,044). |
| Hardware Specification | Yes | All experiments were conducted on a CentOS Linux 8.2.2004 (Core) system, equipped with a GeForce RTX 4090 GPU and 128GB memory. |
| Software Dependencies | No | The paper mentions "CentOS Linux 8.2.2004 (Core) system" as the operating environment but does not specify any programming languages, libraries, or frameworks with version numbers that are critical software dependencies for reproducing the experiments. |
| Experiment Setup | Yes | They consist of four feed-forward layers with sizes of 1024, 512, 256, and 128, respectively. Each feed-forward layer is followed by a batch normalization layer, and a dropout layer with the dropout probability set to 0.2. The learning rate is set to 2e-4, and the bottleneck dimension between the encoder and decoder is set to 128. The model was trained for 60 epochs. |
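The architecture quoted in the Experiment Setup row is specific enough to sketch. Below is a minimal, hypothetical PyTorch reconstruction of one encoder: four feed-forward layers (1024, 512, 256, 128), each followed by batch normalization and dropout (p=0.2), with the learning rate 2e-4 from the paper. The input dimension (5000 genes), the ReLU activation, and the class name are illustrative assumptions not stated in the quoted text; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Sketch of the paper's encoder: Linear -> BatchNorm -> ReLU -> Dropout, x4."""

    def __init__(self, input_dim: int = 5000,
                 hidden_dims=(1024, 512, 256, 128),
                 dropout: float = 0.2):
        super().__init__()
        layers = []
        prev = input_dim
        for dim in hidden_dims:
            layers += [
                nn.Linear(prev, dim),
                nn.BatchNorm1d(dim),   # batch normalization after each layer
                nn.ReLU(),             # activation is an assumption
                nn.Dropout(dropout),   # dropout probability 0.2
            ]
            prev = dim
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


encoder = Encoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=2e-4)  # learning rate from the paper

# A batch of 8 (hypothetical) expression profiles -> 128-d bottleneck embeddings.
z = encoder(torch.randn(8, 5000))
print(z.shape)
```

The final layer size of 128 matches the stated bottleneck dimension between encoder and decoder; a mirrored decoder (128, 256, 512, 1024) would map embeddings back to expression space.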