
Please use this identifier to cite or link to this item: https://hdl.handle.net/10119/3907

Title: The Word Clustering Method for Lexical Knowledge Acquisition from Domain-Specific Documents
Authors: SAITO, Takahiro; WATANABE, Isamu; MATSUI, Kunio; TERADA, Akira; SAITO, Takashi
Keywords: clustering; graph; similarity-measure; mutual-substitutability
Issue Date: Nov-2005
Publisher: JAIST Press
Abstract: In this paper, we introduce a new similarity measure between words and a graph-based word clustering method that uses it. The similarity measure quantifies the “mutual substitutability” of two words. The clustering method consists of two steps: the first builds connected graphs from pairs of terms whose similarity is high, and the second divides those connected graphs by estimating the density of their edges. We report the results of experiments comparing our method with existing techniques, in which we attempted to acquire lexical knowledge from aviation incident reports. We show that our similarity measure is more suitable for this purpose than the widely used cosine measure, and that our clustering method creates more meaningful clusters than the widely used k-means clustering method.
Description: The original publication is available at JAIST Press http://www.jaist.ac.jp/library/jaist-press/index.html
IFSR 2005 : Proceedings of the First World Congress of the International Federation for Systems Research : The New Roles of Systems Sciences For a Knowledge-based Society : Nov. 14-17, 2005, Kobe, Japan
Symposium 5, Session 2 : Data/Text Mining from Large Databases Text Mining
Language: ENG
URI: https://hdl.handle.net/10119/3907
ISBN: 4-903092-02-X
Appears in Collections: IFSR 2005

Files in This Item:

File: 20078.pdf
Size: 140Kb
Format: Adobe PDF

All items in DSpace are protected by copyright, with all rights reserved.

 

