| (f,Gamma)-Divergences: Interpolating between f-Divergences and Integral Probability Metrics | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| A Bregman Learning Framework for Sparse Neural Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Closer Look at Embedding Propagation for Manifold Smoothing | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| A Computationally Efficient Framework for Vector Representation of Persistence Diagrams | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| A Distribution Free Conditional Independence Test with Applications to Causal Discovery | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Forward Approach for Sufficient Dimension Reduction in Binary Classification | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Kernel Two-Sample Test for Functional Data | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Momentumized, Adaptive, Dual Averaged Gradient Method | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| A Nonconvex Framework for Structured Dynamic Covariance Recovery | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| A Perturbation-Based Kernel Approximation Framework | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Primer for Neural Arithmetic Logic Modules | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Random Matrix Perspective on Random Tensors | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| A Statistical Approach for Optimal Topic Model Identification | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | 3 |
| A Stochastic Bundle Method for Interpolation | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| A Unified Statistical Learning Model for Rankings and Scores with Application to Grant Panel Review | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 4 |
| A Unifying Framework for Variance-Reduced Algorithms for Findings Zeroes of Monotone operators | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| A Wasserstein Distance Approach for Concentration of Empirical Risk Estimates | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| A Worst Case Analysis of Calibrated Label Ranking Multi-label Classification Method | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| A spectral-based analysis of the separation between two-layer neural networks and linear methods | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| A universally consistent learning rule with a universally monotone error | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| ALMA: Alternating Minimization Algorithm for Clustering Mixture Multilayer Network | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Accelerating Adaptive Cubic Regularization of Newton's Method via Random Sampling | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Active Learning for Nonlinear System Identification with Guarantees | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Active Structure Learning of Bayesian Networks in an Observational Setting | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Adaptive Greedy Algorithm for Moderately Large Dimensions in Kernel Conditional Density Estimation | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 3 |
| Additive Nonlinear Quantile Regression in Ultra-high Dimension | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Advantage of Deep Neural Networks for Estimating Functions with Singularity on Hypersurfaces | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Adversarial Classification: Necessary Conditions and Geometric Flows | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Adversarial Robustness Guarantees for Gaussian Processes | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| All You Need is a Good Functional Prior for Bayesian Deep Learning | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| An Efficient Sampling Algorithm for Non-smooth Composite Potentials | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| An Error Analysis of Generative Adversarial Networks for Learning Distributions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| An Improper Estimator with Optimal Excess Risk in Misspecified Density Estimation and Logistic Regression | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| An Optimization-centric View on Bayes' Rule: Reviewing and Generalizing Variational Inference | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Analytically Tractable Hidden-States Inference in Bayesian Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Approximate Bayesian Computation via Classification | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Approximate Information State for Approximate Planning and Reinforcement Learning in Partially Observed Systems | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Approximation and Optimization Theory for Linear Continuous-Time Recurrent Neural Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Are All Layers Created Equal? | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Asymptotic Network Independence and Step-Size for a Distributed Subgradient Method | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Asymptotic Study of Stochastic Adaptive Algorithms in Non-convex Landscape | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Attraction-Repulsion Spectrum in Neighbor Embeddings | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Batch Normalization Preconditioning for Neural Network Training | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Bayesian Covariate-Dependent Gaussian Graphical Models with Varying Structure | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Bayesian Pseudo Posterior Mechanism under Asymptotic Differential Privacy | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Bayesian subset selection and variable importance for interpretable prediction and classification | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 4 |
| Behavior Priors for Efficient Reinforcement Learning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Beyond Sub-Gaussian Noises: Sharp Concentration Analysis for Stochastic Gradient Descent | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Bounding the Error of Discretized Langevin Algorithms for Non-Strongly Log-Concave Targets | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| CD-split and HPD-split: Efficient Conformal Regions in High Dimensions | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Cascaded Diffusion Models for High Fidelity Image Generation | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Cauchy–Schwarz Regularized Autoencoder | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Causal Aggregation: Estimation and Inference of Causal Effects by Constraint-Based Data Fusion | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Causal Classification: Treatment Effect Estimation vs. Outcome Prediction | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 3 |
| Change point localization in dependent dynamic nonparametric random dot product graphs | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Clustering with Semidefinite Programming and Fixed Point Iteration | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Communication-Constrained Distributed Quantile Regression with Optimal Statistical Guarantees | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| Community detection in sparse latent space models | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Conditions and Assumptions for Constraint-based Causal Structure Learning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Constraint Reasoning Embedded Structured Prediction | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Contraction rates for sparse variational approximations in Gaussian process regression | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Convergence Guarantees for the Good-Turing Estimator | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Convergence Rates for Gaussian Mixtures of Experts | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| D-GCCA: Decomposition-based Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Darts: User-Friendly Modern Machine Learning for Time Series | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Data-Derived Weak Universal Consistency | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| De-Sequentialized Monte Carlo: a parallel-in-time particle smoother | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | 4 |
| Debiased Distributed Learning for Sparse Partial Linear Models in High Dimensions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Decimated Framelet System on Graphs and Fast G-Framelet Transforms | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Deep Learning in Target Space | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Deep Limits and a Cut-Off Phenomenon for Neural Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Deepchecks: A Library for Testing and Validating Machine Learning Models and Data | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Dependent randomized rounding for clustering and partition systems with knapsack constraints | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Depth separation beyond radial functions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Detecting Latent Communities in Network Formation Models | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Distributed Bayesian Varying Coefficient Modeling Using a Gaussian Process Prior | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Distributed Bootstrap for Simultaneous Inference Under High Dimensionality | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Distributed Learning of Finite Gaussian Mixtures | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Double Spike Dirichlet Priors for Structured Weighting | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| EV-GAN: Simulation of extreme events with ReLU neural networks | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 4 |
| Early Stopping for Iterative Regularization with General Loss Functions | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | 1 |
| Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Efficient Inference for Dynamic Flexible Interactions of Neural Populations | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Efficient Least Squares for Estimating Total Effects under Linearity and Causal Sufficiency | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Efficient MCMC Sampling with Dimension-Free Convergence Rate using ADMM-type Splitting | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 3 |
| EiGLasso for Scalable Sparse Kronecker-Sum Inverse Covariance Estimation | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Empirical Risk Minimization under Random Censorship | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Estimating Causal Effects under Network Interference with Bayesian Generalized Propensity Scores | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Estimating Density Models with Truncation Boundaries using Score Matching | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Estimation and inference on high-dimensional individualized treatment rule in observational data using split-and-pooled de-correlated score | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 4 |
| Evolutionary Variational Optimization of Generative Models | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Exact Partitioning of High-order Models with a Novel Convex Tensor Cone Relaxation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Exact simulation of diffusion first exit times: algorithm acceleration | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | 4 |
| Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Expected Regret and Pseudo-Regret are Equivalent When the Optimal Arm is Unique | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Explicit Convergence Rates of Greedy and Random Quasi-Newton Methods | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Exploiting locality in high-dimensional Factorial hidden Markov models | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Extensions to the Proximal Distance Method of Constrained Optimization | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 6 |
| Fairness-Aware PAC Learning from Corrupted Data | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Fast Stagewise Sparse Factor Regression | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Fast and Robust Rank Aggregation against Model Misspecification | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Faster Randomized Interior Point Methods for Tall/Wide Linear Programs | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Foolish Crowds Support Benign Overfitting | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| FuDGE: A Method to Estimate a Functional Differential Graph in a High-Dimensional Setting | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Fully General Online Imitation Learning | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Functional Linear Regression with Mixed Predictors | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 4 |
| Fundamental Limits and Tradeoffs in Invariant Representation Learning | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| Gauss-Legendre Features for Gaussian Process Regression | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Gaussian Process Boosting | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Gaussian Process Parameter Estimation Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Gaussian process regression: Optimality, robustness, and relationship with kernel ridge regression | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Generalized Ambiguity Decomposition for Ranking Ensemble Learning | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Generalized Resubstitution for Classification Error Estimation | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Generalized Sparse Additive Models | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Getting Better from Worse: Augmented Bagging and A Cautionary Tale of Variable Importance | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Globally Injective ReLU Networks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Graph Partitioning and Sparse Matrix Ordering using Reinforcement Learning and Graph Neural Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Hamilton-Jacobi equations on graphs with applications to semi-supervised learning and data depth | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Handling Hard Affine SDP Shape Constraints in RKHSs | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| IALE: Imitating Active Learner Ensembles | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Implicit Differentiation for Fast Hyperparameter Selection in Non-Smooth Convex Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Improved Classification Rates for Localized SVMs | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Improved Generalization Bounds for Adversarially Robust Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Improving Bayesian Network Structure Learning in the Presence of Measurement Error | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Information-Theoretic Characterization of the Generalization Error for Iterative Semi-Supervised Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Information-theoretic Classification Accuracy: A Criterion that Guides Data-driven Combination of Ambiguous Outcome Labels in Multi-class Classification | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Inherent Tradeoffs in Learning Fair Representations | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Innovations Autoencoder and its Application in One-class Anomalous Sequence Detection | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Integral Autoencoder Network for Discretization-Invariant Learning | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Interlocking Backpropagation: Improving depthwise model-parallelism | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Interpolating Predictors in High-Dimensional Factor Regression | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| InterpretDL: Explaining Deep Models in PaddlePaddle | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Interpretable Classification of Categorical Time Series Using the Spectral Envelope and Optimal Scalings | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Interval-censored Hawkes processes | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Intrinsic Dimension Estimation Using Wasserstein Distance | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | 4 |
| Joint Continuous and Discrete Model Selection via Submodularity | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 4 |
| Joint Estimation and Inference for Data Integration Problems based on Multiple Multi-layered Gaussian Graphical Models | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Joint Inference of Multiple Graphs from Matrix Polynomials | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| JsonGrinder.jl: automated differentiable neural architecture for embedding arbitrary JSON data | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Jump Gaussian Process Model for Estimating Piecewise Continuous Regression Functions | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Kernel Autocovariance Operators of Stationary Processes: Estimation and Convergence | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process Regression with Matérn Correlations | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Kernel Partial Correlation Coefficient --- a Measure of Conditional Dependence | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| KoPA: Automated Kronecker Product Approximation | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Learning Green's functions associated with time-dependent partial differential equations | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Learning Operators with Coupled Attention | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning Temporal Evolution of Spatial Dependence with Generalized Spatiotemporal Gaussian Process Models | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning from Noisy Pairwise Similarity and Unlabeled Data | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Learning linear non-Gaussian directed acyclic graph with diverging number of nodes | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning to Optimize: A Primer and A Benchmark | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| LinCDE: Conditional Density Estimation via Lindsey's Method | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Linearization and Identification of Multiple-Attractor Dynamical Systems through Laplacian Eigenmaps | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | 5 |
| Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Low-rank Tensor Learning with Nonconvex Overlapped Nuclear Norm Regularization | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| MALTS: Matching After Learning to Stretch | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Machine Learning on Graphs: A Model and Comprehensive Taxonomy | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Manifold Coordinates with Physical Meaning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Mappings for Marginal Probabilities with Applications to Models in Statistical Physics | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Matrix Completion with Covariate Information and Informative Missingness | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Maximum sampled conditional likelihood for informative subsampling | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Metrics of Calibration for Probabilistic Predictions | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Minimax Mixing Time of the Metropolis-Adjusted Langevin Algorithm for Log-Concave Sampling | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Minimax optimal approaches to the label shift problem in non-parametric settings | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Model Averaging Is Asymptotically Better Than Model Selection For Prediction | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| More Powerful Conditional Selective Inference for Generalized Lasso by Parametric Programming | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Multi-Agent Multi-Armed Bandits with Limited Communication | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Multi-Task Dynamical Systems | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Multiple Testing in Nonparametric Hidden Markov Models: An Empirical Bayes Approach | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Multiple-Splitting Projection Test for High-Dimensional Mean Vectors | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Multivariate Boosted Trees and Applications to Forecasting and Control | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| MurTree: Optimal Decision Trees via Dynamic Programming and Search | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Mutual Information Constraints for Monte-Carlo Objectives to Prevent Posterior Collapse Especially in Language Modelling | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Near Optimality of Finite Memory Feedback Policies in Partially Observed Markov Decision Processes | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Network Regression with Graph Laplacians | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Neural Estimation of Statistical Divergences | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| New Insights for the Multivariate Square-Root Lasso | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| No Weighted-Regret Learning in Adversarial Bandits with Delays | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Non-asymptotic Properties of Individualized Treatment Rules from Sequentially Rule-Adaptive Trials | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Nonconvex Matrix Completion with Linearly Parameterized Factors | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Nonparametric Neighborhood Selection in Graphical Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Nonparametric Principal Subspace Regression | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| Nonparametric adaptive control and prediction: theory and randomized algorithms | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Nonstochastic Bandits with Composite Anonymous Feedback | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Novel Min-Max Reformulations of Linear Inverse Problems | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Nystrom Regularization for Time Series Forecasting | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| OMLT: Optimization & Machine Learning Toolkit | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| OVERT: An Algorithm for Safety Verification of Neural Network Control Policies for Nonlinear Systems | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| On Acceleration for Convex Composite Minimization with Noise-Corrupted Gradients and Approximate Proximal Mapping | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| On Biased Stochastic Gradient Estimation | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | 5 |
| On Generalizations of Some Distance Based Classifiers for HDLSS Data | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| On Instrumental Variable Regression for Deep Offline Policy Evaluation | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| On Low-rank Trace Regression under General Sampling Distribution | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| On Mixup Regularization | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| On Regularized Square-root Regression Problems: Distributionally Robust Interpretation and Fast Computations | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| On the Complexity of Approximating Multimarginal Optimal Transport | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | 4 |
| On the Convergence Rates of Policy Gradient Methods | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| On the Efficiency of Entropic Regularized Algorithms for Optimal Transport | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| On the Robustness to Misspecification of α-posteriors and Their Variational Approximations | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Online Mirror Descent and Dual Averaging: Keeping Pace in the Dynamic Case | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Online Nonnegative CP-dictionary Learning for Markovian Data | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Optimal Transport for Stationary Markov Chains via Policy Iteration | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Optimality and Stability in Non-Convex Smooth Games | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Oracle Complexity in Nonsmooth Nonconvex Optimization | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Overparameterization of Deep ResNet: Zero Loss and Mean-field Analysis | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| PAC Guarantees and Effective Algorithms for Detecting Novel Categories | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| PECOS: Prediction for Enormous and Correlated Output Spaces | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Pathfinder: Parallel quasi-Newton variational inference | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach | ❌ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | 4 |
| Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Posterior Asymptotics for Boosted Hierarchical Dirichlet Process Mixtures | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Power Iteration for Tensor PCA | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Prior Adaptive Semi-supervised Learning with Application to EHR Phenotyping | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Project and Forget: Solving Large-Scale Metric Constrained Problems | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Projected Robust PCA with Application to Smooth Image Recovery | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Projected Statistical Methods for Distributional Data on the Real Line with the Wasserstein Metric | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Projection-free Distributed Online Learning with Sublinear Communication Complexity | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Provable Tensor-Train Format Tensor Completion by Riemannian Optimization | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Quantile regression with ReLU Networks: Estimators and minimax rates | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Recovering shared structure from multiple networks with unknown edge distributions | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Recovery and Generalization in Over-Realized Dictionary Learning | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Regularized K-means Through Hard-Thresholding | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Representation Learning for Maximization of MI, Nonlinear ICA and Nonlinear Subspaces with Robust Density Ratio Estimation | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 4 |
| Rethinking Nonlinear Instrumental Variable Models through Prediction Validity | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Reverse-mode differentiation in arbitrary tensor network format: with application to supervised learning | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | N/A | 3 |
| Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Robust Distributed Accelerated Stochastic Gradient Methods for Multi-Agent Networks | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Robust and scalable manifold learning via landmark diffusion for long-term medical signal processing | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| SGD with Coordinate Sampling: Theory and Practice | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Sampling Permutations for Shapley Value Estimation | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Scalable Gaussian-process regression and variable selection using Vecchia approximations | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Scalable and Efficient Hypothesis Testing with Random Forests | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Scaling Laws from the Data Manifold Dimension | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | 4 |
| Scaling-Translation-Equivariant Networks with Decomposed Convolutional Filters | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Score Matched Neural Exponential Families for Likelihood-Free Inference | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | 4 |
| Self-Healing Robust Neural Networks via Closed-Loop Control | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Semiparametric Inference For Causal Effects In Graphical Models With Hidden Variables | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 2 |
| Signature Moments to Characterize Laws of Stochastic Processes | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Smooth Robust Tensor Completion for Background/Foreground Separation with Missing Pixels: Novel Algorithm with Convergence Guarantee | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Solving L1-regularized SVMs and Related Linear Programs: Revisiting the Effectiveness of Column and Constraint Generation | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 6 |
| Sparse Additive Gaussian Process Regression | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Sparse Continuous Distributions and Fenchel-Young Losses | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Spatial Multivariate Trees for Big Data Bayesian Regression | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Stable Classification | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 5 |
| Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Statistical Optimality and Computational Efficiency of Nystrom Kernel PCA | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Statistical Optimality and Stability of Tangent Transform Algorithms in Logit Models | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Statistical Rates of Convergence for Functional Partially Linear Support Vector Machines for Classification | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Stochastic DCA with Variance Reduction and Applications in Machine Learning | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Stochastic Zeroth-Order Optimization under Nonstationarity and Nonconvexity | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Stochastic subgradient for composite convex optimization with functional constraints | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Structural Agnostic Modeling: Adversarial Learning of Causal Graphs | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 6 |
| Structure Learning for Directed Trees | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Structure-adaptive Manifold Estimation | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Sufficient reductions in regression with mixed predictors | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Sum of Ranked Range Loss for Supervised Learning | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 7 |
| Supervised Dimensionality Reduction and Visualization using Centroid-Encoder | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| TFPnP: Tuning-free Plug-and-Play Proximal Algorithms with Applications to Inverse Imaging Problems | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Testing Whether a Learning Procedure is Calibrated | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| The AIM and EM Algorithms for Learning from Coarse Data | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| The Correlation-assisted Missing Data Estimator | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| The EM Algorithm is Adaptively-Optimal for Unbalanced Symmetric Gaussian Mixtures | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| The Geometry of Uniqueness, Sparsity and Clustering in Penalized Estimation | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| The Importance of Being Correlated: Implications of Dependence in Joint Spectral Inference across Multiple Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| The Separation Capacity of Random Neural Networks |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| The Two-Sided Game of Googol |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| The Weighted Generalised Covariance Measure |
✅ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
5 |
| Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning |
✅ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
1 |
| Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data |
❌ |
❌ |
✅ |
❌ |
❌ |
❌ |
✅ |
2 |
| Three rates of convergence or separation via U-statistics in a dependent framework |
❌ |
✅ |
❌ |
❌ |
❌ |
❌ |
✅ |
2 |
| Tianshou: A Highly Modularized Deep Reinforcement Learning Library |
❌ |
✅ |
✅ |
❌ |
❌ |
❌ |
❌ |
2 |
| Toolbox for Multimodal Learn (scikit-multimodallearn) |
❌ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
4 |
| Topologically penalized regression on manifolds |
❌ |
✅ |
✅ |
✅ |
✅ |
❌ |
❌ |
4 |
| Total Stability of SVMs and Localized SVMs |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Toward Understanding Convolutional Neural Networks from Volterra Convolution Perspective |
❌ |
✅ |
✅ |
❌ |
❌ |
❌ |
❌ |
2 |
| Towards An Efficient Approach for the Nonconvex lp Ball Projection: Algorithm and Analysis |
✅ |
✅ |
❌ |
❌ |
✅ |
❌ |
✅ |
4 |
| Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration |
✅ |
❌ |
✅ |
✅ |
❌ |
❌ |
✅ |
4 |
| Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent |
❌ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
4 |
| Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models |
✅ |
❌ |
❌ |
❌ |
❌ |
❌ |
✅ |
2 |
| Transfer Learning in Information Criteria-based Feature Selection |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
7 |
| Tree-Based Models for Correlated Data |
✅ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
5 |
| Tree-Values: Selective Inference for Regression Trees |
✅ |
✅ |
✅ |
✅ |
❌ |
✅ |
✅ |
6 |
| Tree-based Node Aggregation in Sparse Graphical Models |
✅ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
5 |
| Truncated Emphatic Temporal Difference Methods for Prediction and Control |
✅ |
✅ |
✅ |
❌ |
❌ |
❌ |
✅ |
4 |
| Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions |
✅ |
❌ |
✅ |
❌ |
❌ |
❌ |
✅ |
3 |
| Two-mode Networks: Inference with as Many Parameters as Actors and Differential Privacy |
✅ |
❌ |
✅ |
❌ |
✅ |
❌ |
✅ |
4 |
| Unbiased estimators for random design regression |
✅ |
❌ |
✅ |
❌ |
❌ |
❌ |
✅ |
3 |
| Under-bagging Nearest Neighbors for Imbalanced Classification |
✅ |
❌ |
✅ |
✅ |
✅ |
❌ |
✅ |
5 |
| Underspecification Presents Challenges for Credibility in Modern Machine Learning |
❌ |
❌ |
✅ |
✅ |
❌ |
✅ |
❌ |
3 |
| Uniform deconvolution for Poisson Point Processes |
❌ |
✅ |
✅ |
❌ |
✅ |
❌ |
✅ |
4 |
| Universal Approximation Theorems for Differentiable Geometric Deep Learning |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Universal Approximation in Dropout Neural Networks |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Universal Approximation of Functions on Sets |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Unlabeled Data Help in Graph-Based Semi-Supervised Learning: A Bayesian Nonparametrics Perspective |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Using Active Queries to Infer Symmetric Node Functions of Graph Dynamical Systems |
✅ |
✅ |
✅ |
❌ |
✅ |
✅ |
✅ |
6 |
| Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features |
❌ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
6 |
| Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization |
✅ |
❌ |
✅ |
❌ |
❌ |
❌ |
✅ |
3 |
| Variational Inference in high-dimensional linear regression |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| Vector-Valued Least-Squares Regression under Output Regularity Assumptions |
✅ |
❌ |
✅ |
✅ |
❌ |
❌ |
✅ |
4 |
| WarpDrive: Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU |
❌ |
✅ |
❌ |
❌ |
✅ |
❌ |
❌ |
2 |
| Weakly Supervised Disentangled Generative Causal Representation Learning |
✅ |
✅ |
✅ |
✅ |
✅ |
❌ |
✅ |
6 |
| When Hardness of Approximation Meets Hardness of Learning |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
0 |
| When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint |
✅ |
❌ |
❌ |
❌ |
❌ |
❌ |
❌ |
1 |
| XAI Beyond Classification: Interpretable Neural Clustering |
❌ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
6 |
| abess: A Fast Best-Subset Selection Library in Python and R |
❌ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
6 |
| d3rlpy: An Offline Deep Reinforcement Learning Library |
❌ |
✅ |
✅ |
✅ |
❌ |
❌ |
✅ |
4 |
| ktrain: A Low-Code Library for Augmented Machine Learning |
❌ |
✅ |
✅ |
❌ |
❌ |
❌ |
✅ |
3 |
| solo-learn: A Library of Self-supervised Methods for Visual Representation Learning |
❌ |
✅ |
✅ |
❌ |
❌ |
❌ |
✅ |
3 |
| tntorch: Tensor Network Learning with PyTorch |
❌ |
✅ |
❌ |
❌ |
✅ |
✅ |
❌ |
3 |