Acquiring Selectional Preferences in a Thai Lexical Database

Canasai Kruengkrai, Thatsanee Charoenporn, Virach Sornlertlamvanich, and Hitoshi Isahara

Abstract

In this paper, we consider the problem of enriching a Thai lexical database by extending its semantic information with selectional preferences. We propose a novel approach for acquiring the selectional preferences of verbs, motivated by the tree cut model. For model selection, we apply the Bayesian Information Criterion (BIC). Given a semantic hierarchy, our goal is to generalize initial noun classes to the most plausible levels of that hierarchy. We present an iterative generalization algorithm that performs agglomerative merging on the semantic hierarchy in a bottom-up manner, using the BIC to measure the improvement of the model both locally and globally. In our experiments, we treat the Web as a large corpus, and we also propose approaches for extracting examples from it. Preliminary experimental results are given to show the feasibility and effectiveness of our approach.
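To make the idea of BIC-guided bottom-up generalization concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a tree cut style model in which the probability of a class is spread uniformly over the leaf nouns it covers, and it scores each candidate cut with a single global BIC rather than the combined local and global measure described above. The hierarchy, the noun counts, and all function names (`leaves_under`, `bic`, `generalize`) are hypothetical.

```python
import math


def leaves_under(node, children):
    """Return the leaf nouns covered by a node of the hierarchy."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    covered = []
    for kid in kids:
        covered.extend(leaves_under(kid, children))
    return covered


def bic(cut, children, freq):
    """BIC of a cut: -2 * log-likelihood + (free parameters) * log(N).

    Assumption: each class in the cut shares its probability mass
    uniformly among the leaf nouns it covers.
    """
    n_total = sum(freq.values())
    log_lik = 0.0
    for cls in cut:
        members = leaves_under(cls, children)
        f_cls = sum(freq.get(m, 0) for m in members)
        if f_cls == 0:
            continue  # unseen classes contribute nothing to the likelihood
        p_noun = (f_cls / n_total) / len(members)
        log_lik += f_cls * math.log(p_noun)
    n_params = len(cut) - 1  # class probabilities sum to one
    return -2.0 * log_lik + n_params * math.log(n_total)


def generalize(children, freq, root):
    """Start from the leaves and merge siblings into their parent
    whenever the merge lowers the BIC of the whole cut."""
    cut = set(leaves_under(root, children))
    improved = True
    while improved:
        improved = False
        best = bic(cut, children, freq)
        for parent, kids in children.items():
            if all(kid in cut for kid in kids):
                candidate = (cut - set(kids)) | {parent}
                score = bic(candidate, children, freq)
                if score < best:
                    cut, best, improved = candidate, score, True
    return cut


# Hypothetical toy hierarchy and object counts for a single verb.
children = {
    "ENTITY": ["ANIMAL", "ARTIFACT"],
    "ANIMAL": ["dog", "cat", "bird"],
    "ARTIFACT": ["car", "table"],
}
freq = {"dog": 10, "cat": 9, "bird": 11, "car": 0, "table": 0}

cut = generalize(children, freq, "ENTITY")
print(sorted(cut))
```

In this toy setting the three animal nouns have similar counts, so merging them into ANIMAL costs little likelihood while saving parameters, whereas merging everything into ENTITY would spread probability over nouns that never occur with the verb and is rejected.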


