Semi-parametric Learning of Structured Temporal Point Processes
Authors: Ganggang Xu, Ming Wang, Jiangze Bian, Hui Huang, Timothy R. Burch, Sandro C. Andrade, Jingfei Zhang, Yongtao Guan
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Asymptotic properties of the proposed estimators are investigated, and the effectiveness of our procedures is illustrated through a simulation study and an application to a stock trading dataset. Section 5 demonstrates the efficacy of the proposed methods through a simulation study. Section 6 applies the proposed method to the stock trading dataset. |
| Researcher Affiliation | Academia | Ganggang Xu, Ming Wang: Department of Management Science, University of Miami, Coral Gables, FL 33146, USA; Jiangze Bian: School of Banking and Finance, University of International Business and Economics, Beijing, 100871, P. R. China; Hui Huang: School of Mathematics, Sun Yat-Sen University, Guangzhou, 510275, P. R. China; Timothy R. Burch, Sandro C. Andrade: Department of Finance, University of Miami, Coral Gables, FL 33146, USA; Jingfei Zhang, Yongtao Guan: Department of Management Science, University of Miami, Coral Gables, FL 33146, USA |
| Pseudocode | No | The paper describes mathematical models, estimation procedures, and theoretical properties. It details steps like calculating marginal intensity functions, estimating covariance functions, and predicting principal component scores through equations and descriptive text, but it does not include a clearly labeled pseudocode or algorithm block with structured steps. |
| Open Source Code | No | The paper states: "All simulation runs were carried out in the software R on a cluster of 100 Linux machines with a total of 100 CPU cores, with each core running at approximately 2 GFLOPS." However, it does not explicitly provide a link to the source code for the methodology described in the paper, nor does it state that the code is available in supplementary materials or upon request. |
| Open Datasets | No | As an example, we analyze a dataset that draws from stock trading transactions (recorded at the second level) by more than 300,000 Chinese trading accounts over approximately three years. The paper describes a specific private stock trading dataset from a brokerage house, but no public access information (link, DOI, repository, or citation to a publicly available version) is provided. |
| Dataset Splits | Yes | The cross-validation score can then be defined as ... The optimal bandwidth $\hat{h}_x$ is then chosen by minimizing $\mathrm{CV}_X(h)$. Following the same procedure, we can define the cross-validation score $\mathrm{CV}_Y(h)$ using data aggregated over all accounts, $N^Y_1, \ldots, N^Y_m$, as defined in Section 2.5 and choose the optimal $\hat{h}_y$ accordingly. In our analysis, we have removed accounts with fewer than 30 or more than 2,000 total transactions, which results in retaining approximately 47% of the accounts in the broader dataset, or equivalently, a total of 157,203 accounts. |
| Hardware Specification | No | All simulation runs were carried out in the software R on a cluster of 100 Linux machines with a total of 100 CPU cores, with each core running at approximately 2 GFLOPS. |
| Software Dependencies | No | All simulation runs were carried out in the software R on a cluster of 100 Linux machines with a total of 100 CPU cores, with each core running at approximately 2 GFLOPS. The paper mentions the use of 'R' software but does not specify a version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | For the univariate log-Gaussian process, the data are simulated from the univariate point processes model presented in Section 2.1 using the following intensity model: $\lambda_{ij}(t) = \lambda_0(t)\exp\left[\sum_{k=1}^{p_X}\xi^X_{ik}\phi^X_k(t) + \sum_{k=1}^{p_Y}\xi^Y_{jk}\phi^Y_k(t) + \sum_{k=1}^{p_Z}\xi^Z_{ijk}\phi^Z_k(t)\right]$, $t \in [0, 1]$, for $i = 1, \ldots, n$, $j = 1, \ldots, m$. We set $p_X = p_Y = p_Z = 2$, $\lambda_0(t) = 0.3\cos(2\pi t) + 1$ and simulate principal component scores $\xi^X_{ik}$, $\xi^Y_{jk}$ and $\xi^Z_{ijk}$ as independent normal random variables with means 0 and variances equal to the eigenvalues $\eta^X_k$, $\eta^Y_k$ and $\eta^Z_k$, $k = 1, 2$, respectively. ... The Epanechnikov kernel function is used in all simulation studies and summary statistics based on $B = 500$ simulation runs are calculated. The bandwidths $\hat{h}_x$, $\hat{h}_y$, and $\hat{h}_z$ are selected by the data-driven procedure proposed in Section 2.7. |
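
Since the paper provides no code, the following is a minimal Python sketch of how the quoted simulation setup could be reproduced. It is not the authors' R implementation. The baseline $\lambda_0(t) = 0.3\cos(2\pi t) + 1$, the two components per effect, and the Epanechnikov kernel come from the quoted text. Everything else is an assumption made for illustration: the eigenfunctions (a Fourier pair shared by all three effects), the eigenvalues, the sample sizes `n` and `m`, the fixed bandwidth `h`, and the use of thinning to draw events. The kernel estimate applies no edge correction.

```python
import numpy as np

def baseline(t):
    # lambda_0(t) = 0.3 cos(2 pi t) + 1, as quoted in the experiment setup.
    return 0.3 * np.cos(2 * np.pi * t) + 1.0

def eigenfunctions(t):
    # ASSUMPTION: an orthonormal Fourier pair on [0, 1]; the summary elides
    # the eigenfunctions actually used, and X, Y, Z share this pair here.
    t = np.asarray(t, dtype=float)
    return np.stack([np.sqrt(2) * np.sin(2 * np.pi * t),
                     np.sqrt(2) * np.cos(2 * np.pi * t)])

def intensity(t, score):
    # lambda_ij(t) = lambda_0(t) * exp(sum_k score_k * phi_k(t)).
    return baseline(t) * np.exp(score @ eigenfunctions(t))

def simulate_events(rng, n, m, eta_x, eta_y, eta_z):
    """Draw event times on [0, 1] for every pair (i, j) by thinning."""
    # Scores are independent normals with variances equal to the eigenvalues.
    xi_x = rng.normal(0.0, np.sqrt(eta_x), size=(n, 2))
    xi_y = rng.normal(0.0, np.sqrt(eta_y), size=(m, 2))
    xi_z = rng.normal(0.0, np.sqrt(eta_z), size=(n, m, 2))
    grid = np.linspace(0.0, 1.0, 2001)
    events = {}
    for i in range(n):
        for j in range(m):
            # Shared eigenfunctions let the three effects add on the score scale.
            score = xi_x[i] + xi_y[j] + xi_z[i, j]
            # Upper bound on the intensity, with a small safety margin.
            lam_max = 1.05 * intensity(grid, score).max()
            n_cand = rng.poisson(lam_max)
            cand = rng.uniform(0.0, 1.0, size=n_cand)
            # Keep each candidate with probability lambda(t) / lam_max.
            keep = rng.uniform(size=n_cand) * lam_max < intensity(cand, score)
            events[(i, j)] = np.sort(cand[keep])
    return events

def epanechnikov(u):
    # K(u) = 0.75 (1 - u^2) on [-1, 1], zero elsewhere.
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kernel_intensity(t, event_times, h):
    # Kernel-smoothed event counts at the points t (no edge correction).
    t = np.asarray(t, dtype=float)
    return epanechnikov((t[:, None] - event_times[None, :]) / h).sum(axis=1) / h

rng = np.random.default_rng(0)
# Hypothetical sizes and eigenvalues, chosen only for this illustration.
events = simulate_events(rng, n=20, m=10,
                         eta_x=(0.4, 0.2), eta_y=(0.3, 0.1), eta_z=(0.2, 0.1))
pooled = np.concatenate(list(events.values()))
t_grid = np.linspace(0.0, 1.0, 101)
# Average estimated intensity per process, using a fixed bandwidth h = 0.1.
mean_hat = kernel_intensity(t_grid, pooled, h=0.1) / len(events)
```

In the paper the bandwidths are selected by the cross-validation procedure of Section 2.7 rather than fixed in advance; the constant `h=0.1` above only keeps the sketch short.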