Extended Deep Submodular Functions
Authors: Seyed Mohammad Hosseini, Arash Jamshidi, Seyed Mahdi Noormousavi, Mahdi Jafari Siavoshani, Naeimeh Omidvar
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that EDSFs exhibit significantly lower empirical generalization error in representing and learning coverage and cut functions compared to existing baselines, such as DSFs, Deep Sets, and Set Transformers. |
| Researcher Affiliation | Academia | Seyed Mohammad Hosseini* (Department of Computer Engineering, Sharif University of Technology); Arash Jamshidi* (Department of Computer Engineering, Sharif University of Technology); Seyed Mahdi Noormousavi (Department of Computer Engineering, Sharif University of Technology); Mahdi Jafari Siavoshani (Department of Computer Engineering, Sharif University of Technology); Naeimeh Omidvar (Tehran Institute for Advanced Studies (TEIAS), Khatam University, Tehran, Iran) |
| Pseudocode | Yes | Algorithm 1 (Gradient Ascent). Input: valuation functions v1, v2, ..., vn; set of items S (\|S\| = s); learning rate η. Initialize a = (0)_ij and project each column of a onto the probability simplex. Repeat: compute the gradient g of the SW function at the point a; update a ← a + η·g; project each column of a onto the probability simplex; until convergence. For each item i in S, select the user assigned to item i by sampling from the i-th column of a (the corresponding distribution for item i). |
| Open Source Code | Yes | All of the code associated with the experiments is available at https://github.com/semohosseini/comb-auction |
| Open Datasets | No | To create a coverage function for our experiments, we define the universe size, the number of items (subsets), and the probability that each element of the universe independently belongs to each subset. The weights in the coverage function are fixed at 1. For each experiment, we generate a dataset, allocating 80% for training and 20% for testing. |
| Dataset Splits | Yes | For each experiment we generate a dataset, allocating 80% for training and 20% for testing. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running its experiments, such as GPU or CPU models. It only refers to 'neural networks' and 'models' being trained. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'L1-loss function' but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. |
| Experiment Setup | Yes | The learning setup for all experiments in this section is the same. The optimizer used is Adam with a learning rate of 0.01. Each model is trained on a dataset of 1024 samples for 10,000 epochs. The employed cost function is the L1 loss. Additionally, the weights of all EDSF and DSF neural networks are initialized from a Gaussian distribution with mean 0 and variance 0.01. |
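The Algorithm 1 pseudocode quoted in the table can be sketched in NumPy as projected gradient ascent over column-stochastic assignment matrices. This is a sketch, not the authors' implementation: the `sw_grad` callable standing in for the gradient of the social-welfare (SW) function is an assumed interface, and the simplex projection is the standard sort-based routine.

```python
import numpy as np

def project_columns_to_simplex(a):
    """Project each column of a (n_users x n_items) onto the probability simplex."""
    n, m = a.shape
    out = np.empty_like(a, dtype=float)
    for j in range(m):
        v = a[:, j]
        u = np.sort(v)[::-1]                      # sort column in descending order
        css = np.cumsum(u)
        # Largest index rho with u[rho] - (css[rho] - 1) / (rho + 1) > 0
        rho = np.nonzero(u * np.arange(1, n + 1) > css - 1)[0][-1]
        theta = (css[rho] - 1) / (rho + 1.0)
        out[:, j] = np.maximum(v - theta, 0.0)    # shift and clip at zero
    return out

def gradient_ascent(sw_grad, n_users, n_items, eta=0.1, steps=500, seed=0):
    """Sketch of Algorithm 1: projected gradient ascent on the SW objective.

    sw_grad(a) -> gradient of the SW function at a (hypothetical callable).
    Returns one sampled user assignment per item.
    """
    # Initialize a = 0 and project (projection of zeros gives uniform columns).
    a = project_columns_to_simplex(np.zeros((n_users, n_items)))
    for _ in range(steps):                        # "repeat ... until convergence"
        a = project_columns_to_simplex(a + eta * sw_grad(a))
    rng = np.random.default_rng(seed)
    # Sample the assigned user for each item from its column distribution.
    return [rng.choice(n_users, p=a[:, j] / a[:, j].sum()) for j in range(n_items)]
```

With a linear SW surrogate (constant gradient), each column converges to the vertex of the simplex at the column's best user, so the final sampling step becomes deterministic.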
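The synthetic coverage-function data described in the Open Datasets row can be reproduced along these lines. This is an illustrative sketch assuming the setup quoted above (independent Bernoulli membership, unit element weights, 80/20 split); the function name and sampling of query subsets are our own choices, not taken from the paper.

```python
import numpy as np

def make_coverage_dataset(universe_size, n_items, p_member, n_samples, seed=0):
    """Generate a synthetic coverage-function dataset (sketch).

    membership[i, u] = True iff universe element u belongs to item (subset) i,
    drawn independently with probability p_member. All element weights are 1,
    so f(S) is simply the number of covered universe elements.
    """
    rng = np.random.default_rng(seed)
    membership = rng.random((n_items, universe_size)) < p_member
    # Each sample is a random subset of items, encoded as a boolean mask.
    X = rng.integers(0, 2, size=(n_samples, n_items)).astype(bool)
    # Label: size of the union of the selected items' subsets.
    y = np.array([membership[s].any(axis=0).sum() for s in X], dtype=float)
    split = int(0.8 * n_samples)  # 80% train / 20% test
    return (X[:split], y[:split]), (X[split:], y[split:])
```

An empty item selection covers nothing (label 0), and no label can exceed the universe size, which gives quick sanity checks on the generator.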
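The training configuration in the Experiment Setup row (Adam with learning rate 0.01, L1 loss, Gaussian N(0, 0.01) weight initialization) maps directly onto a standard loop. The sketch below assumes PyTorch, which the paper does not confirm; note that variance 0.01 corresponds to a standard deviation of 0.1 in `normal_`.

```python
import torch

def train_model(model, X, y, epochs=10_000, lr=0.01, init_std=0.1):
    """Training loop matching the reported setup (sketch): Adam(lr=0.01),
    L1 loss, full-batch training, weights ~ N(0, 0.01), i.e. std = 0.1."""
    for p in model.parameters():
        torch.nn.init.normal_(p, mean=0.0, std=init_std)  # variance 0.01
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.L1Loss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return model
```

The paper trains on 1024 samples for 10,000 epochs; whether updates are full-batch or mini-batch is not stated, so full-batch is an assumption here.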