Path Development Network with Finite-dimensional Lie Group
Authors: Hang Lou, Siran Li, Hao Ni
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our layer demonstrates its strength in irregular time-series modelling. Empirical results on a range of datasets show that the development layer consistently and significantly outperforms signature features in accuracy and dimensionality. The compact hybrid model (stacking a one-layer LSTM with the development layer) achieves state-of-the-art performance against various RNN and continuous-time series models. Our layer also enhances the performance of modelling dynamics constrained to Lie groups. |
| Researcher Affiliation | Academia | Hang Lou, Department of Mathematics, University College London; Siran Li, Department of Mathematics, Shanghai Jiao Tong University; Hao Ni, Department of Mathematics, University College London |
| Pseudocode | Yes | Algorithm 1 (Forward Pass of the Path Development Layer). 1: Input: $\theta \in \mathfrak{g}^d \subset \mathbb{R}^{m \times m \times d}$ (model parameters), $x = (x_0, \dots, x_N) \in \mathbb{R}^{d \times (N+1)}$ (input time series), $m \in \mathbb{N}$ (order of the matrix Lie algebra); $(d, N)$ are the feature and time dimensions of $x$, respectively. 2: $z_0 \leftarrow \mathrm{Id}_m$. 3: for $n \in \{1, \dots, N\}$ do 4: $z_n \leftarrow z_{n-1} \exp(M_\theta(x_n - x_{n-1}))$ 5: end for 6: Output: $z = (z_0, \dots, z_N) \in G^{N+1} \subset \mathbb{R}^{m \times m \times (N+1)}$ (sequential output) or $z_N \in G \subset \mathbb{R}^{m \times m}$ (static output). Algorithm 2 (Backward Pass of the Path Development Layer). 1: Input: $x = (x_0, \dots, x_N) \in \mathbb{R}^{(N+1) \times d}$ (input time series), $z = (z_0, \dots, z_N) \in G^{N+1}$ (output series from the forward pass), $\theta = (\theta_1, \dots, \theta_d) \in \mathfrak{g}^d \subset \mathbb{R}^{m \times m \times d}$ (model parameters), $\eta \in \mathbb{R}$ (learning rate), $\psi: G^{N+1} \to \mathbb{R}$ (loss function). |
| Open Source Code | Yes | Code is available at https://github.com/PDevNet/DevNet.git. |
| Open Datasets | Yes | This is demonstrated using the following datasets: (1) Character Trajectories (Bagnall et al. (2018)); (2) the Speech Commands dataset (Warden (2018)); (3) sequential MNIST (Le et al. (2015)), permuted sequential MNIST (Le et al. (2015)), and sequential CIFAR-10 (Chang et al. (2017)). |
| Dataset Splits | Yes | We followed the data-generating procedure in Kidger et al. (2020) and took a 70%/15%/15% train/validation/test split; the batch size was 128 for every model. We followed the approach in Kidger et al. (2020): we combined the train/test split of the original dataset and then took a 70%/15%/15% train/validation/test split. The simulated input/output trajectories were split into train/validation/test with ratio 80%/10%/10%. All models were trained with learning rate 0.003 and exponential decay rate 0.998; the batch size was 64 for every model. We followed Kipf et al. (2018) to simulate 2-dimensional trajectories of five charged, interacting particles. The particles carry positive and negative charges, sampled with uniform probability, and interact according to their relative locations and charges. We simulated 1000 training trajectories, 500 validation trajectories and 1000 test trajectories, each with 5000 time steps. |
| Hardware Specification | Yes | All experiments were run on five Quadro RTX 8000 GPUs. |
| Software Dependencies | Yes | We ran all models with PyTorch 1.9.1 (Paszke et al. (2019)) and performed hyperparameter tuning with Wandb (Biewald (2020)). |
| Experiment Setup | Yes | Optimisers. All experiments used the Adam optimiser (Kingma & Ba (2014)). We followed the data-generating procedure in Kidger et al. (2020) and took a 70%/15%/15% train/validation/test split; the batch size was 128 for every model. Dev(SO), LSTM+Dev(SO) and the signature models used a constant learning rate of 0.001 throughout; the LSTM was trained with learning rate 0.001 and exponential decay rate 0.997. Training terminated if the validation accuracy stopped improving for 50 epochs, with a maximum of 150 training epochs. The batch size was 32 for every model. Dev(SO), LSTM+Dev(SO) and the signature model used a constant learning rate of 0.001 throughout training; the LSTM was trained with a learning rate of 0.003 and an exponential decay rate of 0.997. Training terminated if the validation accuracy stopped improving for 50 epochs, with a maximum of 150 epochs. The batch size was 128 for both sequential MNIST and CIFAR-10. LSTM+Dev(SO) was trained with an initial learning rate of 0.002 and an exponential decay rate of 0.997; training terminated if the validation accuracy stopped improving for 50 epochs, with a maximum of 200 epochs. All models were trained with learning rate 0.003 and exponential decay rate 0.998; the batch size was 64 for every model. Training terminated if the validation MSE stopped decreasing for 100 epochs, with a maximum of 300 epochs. All models were trained with a learning rate of 0.001 and a 0.997 exponential decay rate; the batch size was 128 for every model. Training terminated if the validation MSE stopped decreasing for 50 epochs, with a maximum of 200 epochs. |
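The forward pass of Algorithm 1 can be sketched in a few lines. This is not the authors' PyTorch implementation from the linked repository; it is a minimal NumPy/SciPy illustration of the recursion $z_n = z_{n-1}\exp(M_\theta(x_n - x_{n-1}))$, where $M_\theta$ maps an increment $\Delta x \in \mathbb{R}^d$ linearly to the Lie algebra via $M_\theta(\Delta x) = \sum_i \theta_i \,\Delta x_i$. The function name `development_forward` is our own.

```python
import numpy as np
from scipy.linalg import expm  # matrix exponential

def development_forward(theta, x):
    """Sketch of Algorithm 1 (forward pass of the path development layer).

    theta : (d, m, m) array -- one Lie-algebra matrix per input channel
    x     : (N+1, d) array  -- input time series
    Returns z : (N+1, m, m) array of group elements, with z[0] = Id_m.
    """
    d, m, _ = theta.shape
    n_steps = x.shape[0]
    z = np.empty((n_steps, m, m))
    z[0] = np.eye(m)                               # z_0 <- Id_m
    for n in range(1, n_steps):
        dx = x[n] - x[n - 1]                       # increment in R^d
        M = np.einsum('i,imn->mn', dx, theta)      # M_theta(dx) in the Lie algebra
        z[n] = z[n - 1] @ expm(M)                  # z_n <- z_{n-1} exp(M)
    return z

# Example with the Lie algebra so(3): skew-symmetric generators
# keep every z[n] in SO(3), matching the Dev(SO) models in the paper.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 3))
theta = A - np.transpose(A, (0, 2, 1))             # project onto so(3)
x = rng.standard_normal((10, 2))
z = development_forward(theta, x)
print(np.allclose(z[-1] @ z[-1].T, np.eye(3), atol=1e-8))  # -> True (orthogonal)
```

Because the exponential of a skew-symmetric matrix is orthogonal and the group is closed under multiplication, every output stays on SO(3) by construction, which is the dimensionality advantage over signature features noted above.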
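The training schedule described in the Experiment Setup row (exponentially decayed learning rate plus patience-based early stopping on a validation metric) can be sketched framework-agnostically. The callbacks `step` and `val_metric` below are hypothetical stand-ins for one training epoch and a validation evaluation; the defaults mirror one of the reported configurations (initial LR 0.003, decay 0.998).

```python
def train_with_early_stopping(step, val_metric, lr0=0.003, decay=0.998,
                              patience=50, max_epochs=300):
    """Sketch of the paper's reported schedule (not the authors' code).

    step(lr)     -- hypothetical callback: run one training epoch at rate lr
    val_metric() -- hypothetical callback: return validation loss (lower is better)
    Stops when the metric has not improved for `patience` consecutive epochs.
    Returns (epochs_run, best_metric).
    """
    best = float('inf')
    epochs_since_best = 0
    lr = lr0
    for epoch in range(max_epochs):
        step(lr)
        m = val_metric()
        if m < best - 1e-12:
            best, epochs_since_best = m, 0          # new best: reset patience
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:       # early stopping trigger
                break
        lr *= decay                                 # exponential LR decay
    return epoch + 1, best

# Toy run: the metric improves for 10 epochs, then plateaus,
# so training stops after 10 + patience = 60 epochs.
vals = iter([10.0 - i for i in range(10)] + [1.0] * 300)
epochs_run, best = train_with_early_stopping(lambda lr: None, lambda: next(vals))
print(epochs_run, best)  # -> 60 1.0
```

In a PyTorch setup this corresponds to pairing `torch.optim.Adam` with `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=decay)` and checking the validation metric once per epoch.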