Enriching a Thai Lexical Database with Selectional Preferences
Canasai Kruengkrai, Thatsanee Charoenporn, Virach Sornlertlamvanich, and Hitoshi Isahara.
Abstract
We propose a statistical, corpus-based approach for acquiring the selectional preferences of verbs. By parsing text corpora, we obtain examples of context nouns that are taken to be the selectional preferences of a given verb. The approach generalizes these initial noun classes to the most appropriate levels of a semantic hierarchy. We present an iterative generalization algorithm that combines agglomerative merging with a model selection technique, the Bayesian Information Criterion (BIC). In our experiments, we treat the Web as a large corpus, and we also propose approaches for extracting examples from it. Preliminary experimental results are given to show the feasibility and effectiveness of our approach.
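To make the idea of BIC-guided generalization concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy hierarchy, made-up counts, and a simple model in which each class on the current cut shares its probability mass uniformly among its member nouns. The score used is the standard BIC, -2 ln L + k ln N, with k taken as the number of classes minus one. All names (`bic`, `generalize`, the hierarchy, and the counts) are hypothetical.

```python
import math

def bic(classes, counts):
    """BIC of a cut; each class is a tuple of nouns sharing its mass uniformly (assumed model)."""
    total = sum(counts.values())
    log_lik = 0.0
    for members in classes:
        class_count = sum(counts.get(noun, 0) for noun in members)
        if class_count == 0:
            continue  # unseen classes contribute nothing to the likelihood
        p_noun = class_count / total / len(members)
        log_lik += class_count * math.log(p_noun)
    k = len(classes) - 1  # free parameters of a multinomial over the classes
    return -2.0 * log_lik + k * math.log(total)

def generalize(hierarchy, counts, leaves):
    """Greedily merge sibling classes into their parent while the BIC score decreases."""
    cut = {leaf: (leaf,) for leaf in leaves}  # start from the most specific classes
    best = bic(cut.values(), counts)
    while True:
        best_candidate = None
        for parent, children in hierarchy.items():
            if not all(child in cut for child in children):
                continue  # only merge when every child is on the current cut
            candidate = {n: m for n, m in cut.items() if n not in children}
            candidate[parent] = tuple(x for child in children for x in cut[child])
            score = bic(candidate.values(), counts)
            if score < best:
                best, best_candidate = score, candidate
        if best_candidate is None:
            return cut  # no merge improves the score
        cut = best_candidate

# Toy hierarchy and hypothetical counts of object nouns observed with one verb.
hierarchy = {
    "fruit": ["apple", "mango"],
    "tool": ["hammer", "knife"],
    "entity": ["fruit", "tool"],
}
leaves = ["apple", "mango", "hammer", "knife"]
even = {"apple": 10, "mango": 10, "hammer": 0, "knife": 0}
skewed = {"apple": 19, "mango": 1, "hammer": 0, "knife": 0}

print(sorted(generalize(hierarchy, even, leaves)))    # ['fruit', 'tool']
print(sorted(generalize(hierarchy, skewed, leaves)))  # ['apple', 'mango', 'tool']
```

With evenly distributed counts the sketch generalizes "apple" and "mango" to "fruit", because the merged class loses no likelihood and saves a parameter. With skewed counts it keeps the two nouns separate, because merging them would cost more in likelihood than the penalty term recovers.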