HC-Search: A Learning Framework for Search-based Structured Prediction

Authors: J.R. Doppa, A. Fern, P. Tadepalli

JAIR 2014

Reproducibility assessment. Each variable below is listed with its result and the supporting excerpt or justification from the paper.
Research Type: Experimental
"Experiments on several benchmark domains show that our approach significantly outperforms several state-of-the-art methods." "Section 4 presents our experimental results followed by an engineering methodology for applying our framework to new problems in Section 5."
Researcher Affiliation: Academia
Janardhan Rao Doppa (EMAIL), Alan Fern (EMAIL), and Prasad Tadepalli (EMAIL), School of EECS, Oregon State University, Corvallis, OR 97331-5501, USA.
Pseudocode: Yes
Algorithm 1, Heuristic Function Learning via Exact Imitation. Input: D = training examples, (I, S) = search space definition, L = loss function, A = rank-based search procedure, τmax = search time bound. Output: H, the heuristic function.
Algorithm 2, Cost Function Learning via Cross Validation. Input: D = training examples, So = search space definition, L = loss function, A = search procedure, τmax = search time bound. Output: C, the cost function.
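The exact-imitation recipe summarized above can be sketched in code. This is a minimal toy reconstruction, not the authors' implementation: it assumes a greedy rank-based search that extends a partial output one step at a time, and substitutes a simple perceptron-style rank update for the paper's rank learner. The helper signatures (`successors`, `features`, `loss`) are illustrative assumptions.

```python
def learn_heuristic_exact_imitation(examples, successors, features, loss,
                                    dim, tau_max, iters=5):
    """Toy sketch of heuristic learning via exact imitation: walk the
    loss-minimizing path through each example's search space and, at every
    step, apply a perceptron-style rank update so that the best successor
    outscores each of its siblings."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [0.0] * dim
    for _ in range(iters):
        for x, y_star in examples:
            state = ()  # assumed empty initial structured output
            for _ in range(tau_max):
                succ = successors(state, x)
                if not succ:
                    break
                best = min(succ, key=lambda s: loss(s, y_star))
                for other in succ:
                    if other == best:
                        continue
                    fb, fo = features(best, x), features(other, x)
                    if dot(w, fo) >= dot(w, fb):  # ranking constraint violated
                        w = [wi + bi - oi for wi, bi, oi in zip(w, fb, fo)]
                state = best  # follow the target (loss-minimizing) path
    return w
```

At test time the learned `w` drives the same greedy search, scoring each successor with the dot product of `w` and its feature vector.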
Open Source Code: No
The paper links to external code for one baseline (Cascades) via a URL, but contains no code-release statement and no repository link for the HC-Search method itself.
Open Datasets: Yes
"Handwriting Recognition (HW). ... This dataset contains roughly 6600 examples divided into 10 folds (Taskar et al., 2003)."
"NETtalk Stress. ... There are 1000 training words and 1000 test words in the standard dataset."
"Scene labeling. This data set contains 700 images of outdoor scenes (Vogel & Schiele, 2007)."
Dataset Splits: Yes
"Handwriting Recognition (HW). ... divided into 10 folds (Taskar et al., 2003). We consider two different variants of this task as in the work of Hal Daumé III, Langford, and Marcu (2009). In the HW-Small version, we use one fold for training and the remaining 9 folds for testing, and vice-versa in HW-Large."
"NETtalk Stress. This is a text-to-speech mapping problem, where the task is to assign one of the 5 stress labels to each letter of a word. There are 1000 training words and 1000 test words in the standard dataset."
"Scene labeling. This data set contains 700 images of outdoor scenes (Vogel & Schiele, 2007). ... Training was performed with 600 images, and the remaining 100 images were used for testing."
Hardware Specification: No
The paper does not report hardware details (GPU or CPU models, or other machine specifications) for its experiments; it discusses computational time constraints but not the hardware used to meet them.
Software Dependencies: No
"In our specific implementation we employed the online Passive-Aggressive (PA) algorithm (Crammer, Dekel, Keshet, Shalev-Shwartz, & Singer, 2006) as our base learner." "We can use any off-the-shelf rank-learning algorithm (e.g., Perceptron, SVM-Rank) as our base learner..." The paper names the algorithms used (Passive-Aggressive, Perceptron, SVM-Rank) and refers to a third-party tool (Cascades), but gives no version numbers for the software libraries, programming languages, or solvers behind its own implementation.
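The Passive-Aggressive base learner quoted above makes margin-based updates on ranking pairs. A minimal pure-Python sketch of one such update follows (PA-I style without an aggressiveness cap; the function name and interface are illustrative, not from the paper or the Crammer et al. reference implementation):

```python
def pa_rank_update(w, phi_better, phi_worse, margin=1.0):
    """One Passive-Aggressive update on a ranking pair: move w just far
    enough that phi_better outscores phi_worse by `margin`. The step size
    is the hinge loss divided by the squared norm of the feature
    difference, so a correctly ranked pair leaves w unchanged."""
    delta = [b - o for b, o in zip(phi_better, phi_worse)]
    score = sum(wi * di for wi, di in zip(w, delta))
    hinge = max(0.0, margin - score)
    if hinge == 0.0:
        return w  # pair already ranked correctly with the required margin
    step = hinge / sum(di * di for di in delta)
    return [wi + step * di for wi, di in zip(w, delta)]
```

After the update, the corrected pair satisfies the margin exactly, which is the "aggressive" half of the algorithm's name; the "passive" half is the zero update on already-satisfied pairs.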
Experiment Setup: Yes
"During training and testing we set the search time bound τ to be 25 search steps for all domains except for scene labeling, which has a much larger search space and uses τ = 150. Training was conducted for 50 iterations in all of our experiments. For all domains, we learn linear heuristic and cost functions over second order features unless otherwise noted. We measure error with Hamming loss unless otherwise noted."
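For a reimplementation, the reported setup consolidates into a single configuration structure. The key names and lookup helper below are illustrative assumptions; the values come directly from the quoted text:

```python
# Assumed consolidation of the quoted experiment setup; key names are
# illustrative, values are taken from the paper's text.
EXPERIMENT_SETUP = {
    "search_time_bound": {"default": 25, "scene_labeling": 150},  # tau, in search steps
    "training_iterations": 50,
    "heuristic_and_cost": "linear over second-order features",
    "loss": "Hamming",
}

def time_bound(domain):
    """Look up tau for a domain, falling back to the 25-step default."""
    bounds = EXPERIMENT_SETUP["search_time_bound"]
    return bounds.get(domain, bounds["default"])
```

A lookup with a per-domain override keeps the scene-labeling exception (τ = 150) out of the code paths shared by the other benchmarks.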