Full-text
| URL: http://personal.cityu.edu.hk/~ctckit/papers/viacompress.pdf |
Abstract
This paper presents a learning-via-compression approach to the unsupervised acquisition of word forms with no a priori knowledge. Following the basic ideas in Solomonoff’s theory of inductive inference and Rissanen’s MDL framework, the learning is formulated as a process of inferring regularities, in the form of string patterns (i.e., words), from a given set of data. A segmentation algorithm is designed to segment each input utterance into the sequence of word candidates that gives an optimal sum of description length gain (DLG). The learning model has a lexical refinement module that uses this algorithm to derive finer-grained word candidates recursively until no further compression effect is available. Experimental results on an infant-directed speech corpus show that this approach reaches state-of-the-art performance in terms of precision and recall of both words and word boundaries.
BibTex
@incollection{kit05lexicalLearning,
author={Chunyu Kit},
title={Unsupervised Lexical Learning As Inductive Inference via Compression},
year={2005},
month={July},
editor={James W. Minett and William S.-Y. Wang},
publisher={City University of Hong Kong Press},
booktitle={Language Acquisition, Change and Emergence: Essays in Evolutionary Linguistics},
url={http://www.isrl.uiuc.edu/~amag/langev/paper/kit05lexicalLearning.html}
}
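The segmentation step described in the abstract (choosing, for each utterance, the sequence of word candidates with the best total gain) can be illustrated with a small dynamic-programming sketch. This is not the paper's implementation: the function `segment`, the `max_len` cap, and the toy scoring function `toy_gain` are hypothetical names introduced here, and the toy scores merely stand in for a real description length gain computed from a corpus.

```python
import math
from typing import Callable, List, Tuple


def segment(utterance: str,
            gain: Callable[[str], float],
            max_len: int = 8) -> Tuple[List[str], float]:
    """Split `utterance` into chunks maximizing the sum of gain(chunk).

    `gain` is a placeholder for a DLG-style score; `max_len` is an
    assumed cap on candidate length, not something the abstract states.
    """
    n = len(utterance)
    best = [-math.inf] * (n + 1)  # best[i]: best total gain for utterance[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last chunk
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + gain(utterance[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end to recover the chunks.
    words: List[str] = []
    end = n
    while end > 0:
        start = back[end]
        words.append(utterance[start:end])
        end = start
    words.reverse()
    return words, best[n]


# Hypothetical scores for illustration only (not values from the paper).
TOY_SCORES = {"the": 3.0, "dog": 3.0, "do": 1.0, "thed": 2.0}


def toy_gain(chunk: str) -> float:
    # Known chunks get their table score, single characters are neutral,
    # and unknown multi-character chunks are penalized.
    if chunk in TOY_SCORES:
        return TOY_SCORES[chunk]
    return 0.0 if len(chunk) == 1 else -1.0


print(segment("thedog", toy_gain))
```

With these toy scores, the unspaced input `thedog` is split into `the` and `dog`, because that pair has a higher total score than any competing split such as `thed`, `o`, `g`. The recursive lexical refinement mentioned in the abstract is not modeled in this sketch.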
| Last update: 2/2/08 |