Detecting relationships in unstructured text

A data-mining technique for detecting relationships described in unstructured text. It addresses the laborious task of manually detecting such relationships in the large corpus of documents on the Web, with the stated effect of being “noise free”.

Inactive Publication Date: 2007-03-22
IBM CORP

AI Technical Summary

Problems solved by technology

Manually detecting relationships between entities in the large corpus of documents on the Web is laborious.
The challenge is both in identifying entities in a document and in detecting the particular relationship, if any, between two entities.



Embodiment Construction

[0024] The embodiments of the invention and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments of the invention. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments of the invention may be practiced and to further enable those of skill in the art to practice the embodiments of the invention. Accordingly, the examples should not be construed as limiting the scope of the invention.

[0025] As mentioned above, there is a need for a system and a computer-implemented method for automatically and accurately detecting relationships (e.g., a partner...



Abstract

Disclosed are embodiments of a system and a method for detecting relationships described in unstructured text-based electronic documents. The system and method incorporate the use of an input file that contains one or more text patterns that represent particular relationships. The text patterns each include regular text expressions that describe the particular relationship and slots for the location of each entity in that relationship. Document(s) are selected by a user and scanned by a proper noun tagger that identifies and tags every occurrence of proper names within the document(s). Then, a pattern matcher scans the document(s) to match text patterns. If a text pattern is matched within a document, a relationship detector extracts all pairs of proper names found in the slots for each matched text pattern. The output from the relationship detector includes the names of each entity in the relationship, the type of relationship, the identity of the document, and the location within the document of the sentence describing the relationship.
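
The pipeline described in the abstract (proper noun tagger, pattern matcher, relationship detector) can be illustrated with a minimal Python sketch. This is not the patented implementation. Everything named here is an assumption made for illustration: the `{E1}`/`{E2}` slot syntax, the `<PN>` tag format, the two example patterns, and the capitalized-word heuristic standing in for a real proper noun tagger.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for the "input file" of text patterns: one regular
# expression template per relationship type, with {E1} and {E2} marking the
# slots where the two entities are expected (slot syntax is an assumption).
PATTERNS = {
    "acquisition": "{E1} (?:has )?acquired {E2}",
    "partnership": "{E1} (?:has )?partnered with {E2}",
}

# Naive proper-name heuristic (an assumption): a run of capitalized words.
# It will also tag ordinary capitalized sentence-initial words.
PROPER_NAME = re.compile(r"\b[A-Z][A-Za-z0-9&-]*(?: [A-Z][A-Za-z0-9&-]*)*")


@dataclass(frozen=True)
class Relationship:
    entity1: str
    entity2: str
    kind: str            # type of relationship
    doc_id: str          # identity of the document
    sentence_index: int  # location of the sentence in the document


def tag_proper_names(sentence: str) -> str:
    """Wrap every candidate proper name in <PN>...</PN> tags."""
    return PROPER_NAME.sub(lambda m: f"<PN>{m.group(0)}</PN>", sentence)


def compile_pattern(template: str) -> "re.Pattern[str]":
    """Turn a slot template into a regex that only accepts tagged names in slots."""
    slot = "<PN>(?P<{name}>[^<]+)</PN>"
    regex = template.replace("{E1}", slot.format(name="e1"))
    regex = regex.replace("{E2}", slot.format(name="e2"))
    return re.compile(regex)


def detect_relationships(doc_id: str, text: str) -> list:
    """Tag names, match patterns, and extract entity pairs sentence by sentence."""
    compiled = {kind: compile_pattern(t) for kind, t in PATTERNS.items()}
    # Crude sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    found = []
    for index, sentence in enumerate(sentences):
        tagged = tag_proper_names(sentence)
        for kind, regex in compiled.items():
            for match in regex.finditer(tagged):
                found.append(
                    Relationship(match["e1"], match["e2"], kind, doc_id, index)
                )
    return found


sample = "Acme Corp has acquired Widget Works. Globex partnered with Initech last year."
for relationship in detect_relationships("doc-1", sample):
    print(relationship)
```

Requiring tagged names in the slots, rather than arbitrary text, keeps the pattern from firing on sentences where the words around "acquired" are not entities. A production tagger would need to be considerably more robust than the capitalization heuristic used here.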

Description

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The invention generally relates to the field of data mining and, more particularly, to a system and a computer-implemented method of detecting relationships by creating input files of text patterns for each type of relationship, identifying a specific text pattern within a text-based document, tagging proper names in the text-based document, and extracting those proper names located within the specific text pattern so as to identify the two entities in the relationship.

[0003] 2. Description of the Related Art

[0004] Recently, there has been a rapid growth of on-line discussion groups and news websites on the World Wide Web (WWW). Detecting relationships between entities (e.g., buyer / seller, employee / employer, partnerships, parent / subsidiaries, etc.) discussed on those websites could prove to be a valuable resource (e.g., to a company investigating a rival company's business dealings, to a company or individual inve...


Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F7/00
CPC: G06F17/30731, G06F17/278, G06F16/36, G06F40/295
Inventor: NOVAK, JASMINE
Owner: IBM CORP