Signature Moments to Characterize Laws of Stochastic Processes
Authors: Ilya Chevyrev, Harald Oberhauser
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our theoretical results to the problem of two-sample hypothesis testing for stochastic processes. A two-sample test for stochastic processes X = (X_t)_{t ∈ [0,T]}, Y = (Y_t)_{t ∈ [0,T]} tests the null hypothesis H0 : P_X = P_Y against the alternative H1 : P_X ≠ P_Y, where P_X := P ∘ X⁻¹ and P_Y := P ∘ Y⁻¹ are the laws of X and Y. Our results about characteristicness allow us to use the framework of Gretton et al. (2009, 2012) for this problem. The data we use is a popular archive of multi-variate time series data that contains heterogeneous datasets from various domains such as speech, handwriting, motion, etc. We benchmark against classical tests for multi-variate data and MMD tests using classical kernels. |
| Researcher Affiliation | Academia | Ilya Chevyrev EMAIL School of Mathematics University of Edinburgh Edinburgh, EH9 3FD, United Kingdom Harald Oberhauser EMAIL Mathematical Institute University of Oxford Oxford, OX2 6GG, United Kingdom |
| Pseudocode | No | The paper describes algorithms but does not include any pseudocode or algorithm blocks directly within its content. It refers to 'Efficient recursive algorithms for inner products of signatures have been developed, see (Kiraly and Oberhauser, 2019), and we can extend this to the robust signature Φ.' |
| Open Source Code | Yes | As base for our signature MMD computations we use Csaba Tóth's KSig library [13] that provides efficient GPU implementations of signature kernels in Python. As base for the general kernel MMD two-sample permutation test, we built on code from the DS3 summer school course "Representing and comparing probabilities with kernels" provided by H. Strathmann and D. Sutherland [14], but we rewrote much of it to make it compatible with the KSig package. [13] https://github.com/tgcsaba/KSig [14] https://github.com/karlnapf/ds3_kernel_testing |
| Open Datasets | Yes | As data sources we use the multi-variate time series datasets from UAE & UCR, see (Bagnall et al., 2018) for a detailed description. |
| Dataset Splits | Yes | For each time-series dataset we define µ (resp. ν) as follows: under the null H0, a sample from µ (resp. ν) is generated by choosing uniformly at random a time series W = (W(t_1), ..., W(t_{L0})) from this dataset and doubling the sequence length by concatenation; that is, X = (W(t_1), ..., W(t_{L0}), W(t_1), ..., W(t_{L0})). Under the alternative H1, a sample from µ is generated as above, but a sample from ν is generated by concatenating in reverse, Y = (W(t_1), ..., W(t_{L0}), W(t_{L0}), ..., W(t_1)). In our experiments, we varied the length L0 ∈ {10, 100, 200} by truncating after the first L0 entries of the time series in the original dataset. Further, we restricted to the case of balanced samples, m = n, and varied the number of samples m ∈ {30, 70, 200}. |
| Hardware Specification | Yes | All experiments were run on a machine with 36 cores (2 x 18 Core with HT), 2.6/3.9 GHz, 1536GB RAM. |
| Software Dependencies | No | As base for our signature MMD computations we use Csaba Tóth's KSig library... to find the non-negative root of the polynomial P(λ) from Proposition 33 we used an optimization of Brent's method (Brent, 1973, Ch. 3-4) implemented as optimize.brentq in the SciPy package (Jones et al., 2001). The paper mentions software packages like KSig and SciPy but does not provide specific version numbers for them. |
| Experiment Setup | Yes | For each test statistic we used the permutation test with significance level 5% and repeated the following for each dataset 20 times: (i) generate samples X, Y under the null H0 and record the number of false rejections of the null H0 (Type I error); (ii) generate samples X, Y under the alternative H1 and record the number of false acceptances of the null H0 (Type II error). The power of the test (the probability of rejecting the null H0 when H1 is true) then determines which test performs best; the closer to 1 the better. ... For the signature MMDs, the linear kernel does better than non-linear kernels. This is in stark contrast to supervised time-series classification, where non-linear signature kernels strongly outperform the linear signature kernel as in (Toth and Oberhauser, 2020). We believe the reason is that the additional parameter selection for the non-linear kernel introduces additional variance... All our tables show the results when the approach in (Fukumizu et al., 2009) is applied for the kernel parameter choice. ... The parameter selection resulted in a value for M that was less than or equal to 5 for all signature MMDs. Furthermore, for the signature MMDs built on top of a non-linear kernel κ, the value of M was typically even lower, often equal to 2. ... The other parameters of the signature MMD, a and C, determine the tensor normalization (Example 4). For simplicity we kept both fixed (a = 1, C = 10³)... The parameter for the length scale σ varied over several orders of magnitude from dataset to dataset, for both signature MMDs and the other MMDs. |
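The sample construction under H0/H1 and the MMD permutation-test protocol quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a plain RBF kernel on the flattened series instead of the signature kernels from the KSig library, a toy random-walk dataset in place of the UEA & UCR archive, and all function names (`make_sample`, `mmd2`, `permutation_test`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(dataset, m, L0, reverse=False):
    """Draw m series, truncate to the first L0 entries, and double the
    length by concatenation; under H1 one sample concatenates the
    reversed series instead (as in the paper's H0/H1 construction)."""
    idx = rng.integers(0, len(dataset), size=m)
    out = []
    for i in idx:
        w = dataset[i][:L0]                    # truncate to length L0
        tail = w[::-1] if reverse else w       # reverse only under H1
        out.append(np.concatenate([w, tail]))  # series of length 2*L0
    return np.stack(out)

def rbf_gram(A, B, sigma):
    # Gram matrix of the RBF kernel between row-sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(X, Y, sigma):
    """Biased squared-MMD estimate between two samples."""
    return (rbf_gram(X, X, sigma).mean()
            + rbf_gram(Y, Y, sigma).mean()
            - 2 * rbf_gram(X, Y, sigma).mean())

def permutation_test(X, Y, sigma, n_perm=200, alpha=0.05):
    """Permutation test at significance level alpha: resplit the pooled
    sample to build the null distribution of the MMD statistic."""
    stat = mmd2(X, Y, sigma)
    Z, m = np.vstack([X, Y]), len(X)
    null = []
    for _ in range(n_perm):
        p = rng.permutation(len(Z))
        null.append(mmd2(Z[p[:m]], Z[p[m:]], sigma))
    p_value = (1 + sum(s >= stat for s in null)) / (1 + n_perm)
    return p_value < alpha  # True = reject H0

# Toy dataset: 500 one-dimensional random walks of length 20, so that
# reversing a series changes the law of the concatenated sample.
dataset = [np.cumsum(rng.normal(size=20)) for _ in range(500)]
X = make_sample(dataset, m=30, L0=10)                    # sample from mu
Y = make_sample(dataset, m=30, L0=10, reverse=True)      # nu under H1
reject = permutation_test(X, Y, sigma=1.0)
```

In the paper's protocol this test is repeated 20 times per dataset to estimate the Type I error (runs under H0) and the test's power (runs under H1), with the kernel length scale chosen per dataset rather than fixed as here.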