Method and system for extracting information from unstructured text using symbolic machine learning

Inactive Publication Date: 2006-01-12
IBM CORP

AI Technical Summary

Benefits of technology

[0030] Thus, the present invention provides an improved method for relational learning in which a non-specialist can intui

Problems solved by technology

Extracting relational information from text is an important and unsolved problem in the area of Unstructured Information Management.
Manual approaches are very costly to develop, since they require experts in computational linguistics or related disciplines to develop formal grammars or special purpose programs.
Non-specialists cannot customize manual systems for new d


Embodiment Construction

[0044] Referring now to the drawings, and more particularly to FIGS. 1-12, exemplary embodiments of the present invention will now be described.

[0045] Machine learning approaches have the advantage that they require only labeled examples of the information sought. Much recent work on relational learning has been statistical. One such approach that reflects the state of the art for statistical methods is “Kernel Methods for Relation Extraction” by D. Zelenko, C. Aone, and A. Richardella, in which the system learns a function measuring similarity between shallow parses of examples. Statistical methods, in particular, need a large amount of labeled training data before anything useful can be done, which is a major problem for such approaches.

[0046] Work in another vein has concerned various attempts to accomplish relational learning by using heuristics to learn finite state recognizers or regular expressions, as exemplified by “Learning Information Extraction Rules for Se...

Abstract

A method (and structure) of extracting information from text includes parsing an input sample of text to form a parse tree and using user inputs to define a machine-labeled learning pattern from the parse tree.
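The two steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, the hand-built parse tree (no real parser is invoked), the role names, and the `pattern_from_selection` helper are all assumptions made for illustration. The sketch shows a parse tree for one sentence and a pattern derived from the nodes a user has marked, listed in left-to-right order.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A parse-tree node (hypothetical structure, for illustration only)."""
    label: str                                   # category or token, e.g. "NP"
    children: List["Node"] = field(default_factory=list)


def pattern_from_selection(
    root: Node, selection: List[Tuple[Node, str]]
) -> List[Tuple[str, str]]:
    """Return (label, role) pairs for user-marked nodes in pre-order,
    which is left-to-right text order for non-nested nodes."""
    roles = {id(node): role for node, role in selection}
    pattern: List[Tuple[str, str]] = []

    def walk(node: Node) -> None:
        if id(node) in roles:
            pattern.append((node.label, roles[id(node)]))
        for child in node.children:
            walk(child)

    walk(root)
    return pattern


# Hand-built parse tree for the sample "IBM acquired Lotus".
subject = Node("NP", [Node("IBM")])
verb = Node("V", [Node("acquired")])
obj = Node("NP", [Node("Lotus")])
tree = Node("S", [subject, Node("VP", [verb, obj])])

# The user marks three nodes; the order of marking does not matter.
pattern = pattern_from_selection(
    tree, [(obj, "target"), (subject, "buyer"), (verb, "relation")]
)
print(pattern)
```

Because the pattern is read off the tree rather than off the raw string, the user only has to point at constituents; the ordering comes from the parse.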

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present Application is related to U.S. Provisional Patent Application No. 60/586,877, filed on Jul. 12, 2004, to Johnson et al., entitled “System and Method for Extracting Information from Unstructured Text Using Symbolic Machine Learning”, having IBM Docket YOR920040239US1, assigned to the present assignee, and incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention generally relates to extracting information from text. More specifically, in a relational learning system, a pattern learner module receives a small number of learning samples defined by user interactions in relational pattern templates format wherein elements are defined in a precedence relation and in an inclusion relation, and calculates a minimal most specific generalization (MMSG) for these samples so that information matching the generalized template can then be extracted from unseen text.

[0004] 2...
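The idea of generalizing a few sample templates and then matching unseen text can be sketched as follows. This is not the MMSG computation of the invention, whose details do not appear in this excerpt; it is a simplified anti-unification over small trees, swapped in for illustration. The tuple encoding `(label, children)`, the `WILDCARD` symbol, and the functions `generalize` and `matches` are all assumptions. Nesting stands in for the inclusion relation and child order for the precedence relation.

```python
from typing import Tuple

# A template is (label, children), where children is a tuple of templates.
Template = Tuple[str, tuple]
WILDCARD = "*"  # hypothetical symbol meaning "any label"


def generalize(a: Template, b: Template) -> Template:
    """Keep what two samples share; replace differing labels by a wildcard.
    If the child counts differ, child constraints are dropped entirely."""
    label = a[0] if a[0] == b[0] else WILDCARD
    if len(a[1]) != len(b[1]):
        return (label, ())
    return (label, tuple(generalize(x, y) for x, y in zip(a[1], b[1])))


def matches(template: Template, tree: Template) -> bool:
    """True if the tree satisfies every constraint in the template."""
    if template[0] != WILDCARD and template[0] != tree[0]:
        return False
    if not template[1]:          # no child constraints to check
        return True
    if len(template[1]) != len(tree[1]):
        return False
    return all(matches(t, c) for t, c in zip(template[1], tree[1]))


def sentence(subj: str, verb: str, obj: str) -> Template:
    """Build a tiny subject-verb-object tree (illustrative shape only)."""
    return ("S", (("NP", ((subj, ()),)),
                  ("V", ((verb, ()),)),
                  ("NP", ((obj, ()),))))


# Two learning samples generalize to "NP acquired NP".
general = generalize(sentence("IBM", "acquired", "Lotus"),
                     sentence("Oracle", "acquired", "Sun"))

print(matches(general, sentence("Google", "acquired", "YouTube")))
print(matches(general, sentence("Google", "sued", "YouTube")))
```

Two samples are enough here to turn the company names into wildcards while keeping the verb fixed, which mirrors the stated goal of learning from a small number of samples.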

Application Information

IPC(8): G06F17/21, G06F40/00
CPC: G06F17/2705, G06F40/205
Inventor: JOHNSON, DAVID E.; OLES, FRANK J.
Owner: IBM CORP