MADCAT Chinese pilot training set

Bibliographic Details
Imprint:[Philadelphia, PA] : Linguistic Data Consortium, c2014.
Description:5 DVD-ROMs ; 4 3/4 in.
Language:English, Chinese
Format: DVD Video
URL for this record:http://pi.lib.uchicago.edu/1001/cat/bib/10084208
Varying Form of Title:Multilingual automatic document classification analysis and translation Chinese pilot training set
Other authors / contributors:Linguistic Data Consortium.
ISBN:1585636800
9781585636808
Notes:Title from disc label.
"LDC2014T13".
Authors: Zhiyi Song, David Lee, Stephen Grimes, Dave Doermann, Stephanie Strassel.
Data type: Text, stillImage.
Data source: Newswire, newsgroups, weblogs.
System requirements: DVD-ROM drive; web browser; software to read files in the XML format.
English and Chinese.
Summary:"MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Chinese Pilot Training Set contains all training data created by the Linguistic Data Consortium (LDC) to support a Chinese pilot collection in the DARPA MADCAT Program. The data in this release consists of handwritten Chinese documents, scanned at high resolution and annotated for the physical coordinates of each line and token. Digital transcripts and English translations of each document are also provided, with the various content and annotation layers integrated in a single MADCAT XML output." --LDC online catalogue.