Title Saityno informacijos išrinkimo metodų tyrimas
Translation of Title Evaluation of information extraction methods from World Wide Web.
Authors Gleixner, Jurgis
Pages 54
Keywords [eng] information extraction ; structure based methods ; precision
Abstract [eng] The amount of information on the internet is growing rapidly, and finding the information one needs has become difficult and time-consuming. Few websites let users filter information in complex ways. One solution to this problem is an information extraction system, which collects information from websites and transforms it into a more flexible form (XML, CSV, or a database) to which complex filters and data manipulations can be applied. In this work we analyze methods for extracting information from websites automatically, in a simple and interactive way. The work focuses on information extraction systems based on structural patterns. We introduce such a system and compare its functionality with that of similar systems. Precision is one of the most important attributes of such systems, so we analyze ways to increase it.
Type Master thesis
Language Lithuanian
Publication date 2011