Please use the following identifier to cite this item:
http://hdl.handle.net/10119/7832
Title: | Automatic Extraction of the Fine Category of Person Named Entities from Text Corpora |
Authors: | NGUYEN, Tri-Thanh; SHIMAZU, Akira |
Keywords: | fine person categories extraction named entities pattern extraction algorithm |
Issue Date: | 2007-10-01 |
Publisher: | The Institute of Electronics, Information and Communication Engineers (IEICE) |
Journal Title: | IEICE TRANSACTIONS on Information and Systems |
Volume: | E90-D |
Issue: | 10 |
Start Page: | 1542 |
End Page: | 1549 |
DOI: | 10.1093/ietisy/e90-d.10.1542 |
Abstract: | Named entities play an important role in many Natural Language Processing applications. Currently, most named entity recognition systems rely on a small set of general named entity (NE) types. Though some efforts have been made to expand the hierarchy of NE types, the number of NE types is still fixed. In real applications, such as question answering or semantic search systems, users may be interested in more diverse, specific NE types. This paper proposes a method to extract categories of person named entities from text documents. Based on the Dual Iterative Pattern Relation Extraction method, we develop a more suitable model for solving our problem, and explore the generation of different pattern types. A method for checking whether a category is valid is proposed to improve the performance, and experiments on the Wall Street Journal corpus give promising results. |
Rights: | Copyright (C)2007 IEICE. Tri-Thanh Nguyen, Akira Shimazu, IEICE TRANSACTIONS on Information and Systems, E90-D(10), 2007, 1542-1549. http://www.ieice.org/jpn/trans_online/ |
URI: | http://hdl.handle.net/10119/7832 |
Type: | publisher |
Appears in Collections: | b10-1. Journal Articles |
Files in This Item:
File | Description | Size | Format |
A11970.pdf | | 430Kb | Adobe PDF |
All items stored in this system are protected by copyright.