Thesis
Authors: Xu, Ruifeng (2006)
Collocation is a lexical phenomenon in which two or more words are habitually combined as a conventional way of saying things. Collocation information is essential to many natural language processing tasks, such as word sense disambiguation, machine translation, and information extraction. Most current work on collocation extraction is statistically based and achieves limited precision and recall because it cannot reliably distinguish word co-occurrences, which are statistically significant, from true collocations, which are in habitual use and are thus either syntactically or semantically significant.
The objective of this study is to investigate methods to improve the performance of Chinese collocation extraction algorithms. Different types of collocations are identified. C...