Text-based key personal name extraction method and system

The method belongs to the field of information extraction. It addresses the problem that existing techniques do not effectively consider the roles and functions of users and persons in an event and therefore cannot extract the key persons, and it improves the accuracy and effectiveness of person extraction.

Inactive Publication Date: 2017-05-24
INST OF COMPUTING TECH CHINESE ACAD OF SCI
5 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

[0005] Investigation shows that existing techniques focus mainly on capturing Weibo users and their social circles. They do not effectively consider the roles and functions that users and persons play in an event, so they cannot extract the key persons in the event.


Examples


Embodiment Construction

[0042] The technical solution of the present invention is described in detail below with reference to the embodiments.

[0043] On a microblog platform, a single event generates a large number of microblog messages. The present invention extracts, from these messages, the names of the key persons involved in the event. The invention can also be applied to other kinds of text to extract the names of key persons from them.

[0044] Generally speaking, a person whose name appears frequently in an event is likely to be a protagonist of that event. The number of times each personal name appears across all microblogs related to the event is used as that name's weight. The more often a name appears, the more likely that person is a protagonist of the event.
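The frequency weighting in paragraph [0044] can be illustrated with a minimal Python sketch. It assumes the personal names in each microblog have already been identified; the function name `name_frequency_weights` and the sample names are hypothetical and do not come from the patent.

```python
from collections import Counter


def name_frequency_weights(messages_names):
    """Count how often each personal name occurs across all messages.

    messages_names: one list of already-extracted personal names per
    microblog message (the extraction step itself is assumed done).
    """
    weights = Counter()
    for names in messages_names:
        # Every occurrence counts, including repeats inside one message.
        weights.update(names)
    return weights


# Hypothetical names found in three microblogs about one event.
event_messages = [
    ["Zhang San", "Li Si"],
    ["Zhang San"],
    ["Wang Wu", "Zhang San"],
]
weights = name_frequency_weights(event_messages)
top_name, top_count = weights.most_common(1)[0]
print(top_name, top_count)
```

In this sketch the raw count is the weight, so the most frequent name is the most likely protagonist.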

[0045] Figures 1 and 2 are flowcharts of the text-based key personal name extraction method of the present invention. The m...


Abstract

The invention discloses a text-based key personal name extraction method and system. The method comprises the following steps: 1, performing word segmentation on a target text and extracting the target words whose part of speech is personal name; 2, counting the occurrence frequency of each target word in the target text and setting the weight of the target word according to that frequency; 3, adjusting the weight of the target word according to the probability, recorded in a prior probability dictionary of ambiguous personal names, that the target word occurs as a personal name; and 4, selecting the target words with large weights as key personal names. The method enables the extraction of persons related to specific events, of key personal names in text, and of important spreading users, event development node users, public pointing users and information source users, and it improves the accuracy and validity of person extraction.
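The four steps of the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the input is assumed to be already segmented and part-of-speech tagged, the tag `"nr"` is assumed to mark personal names, the adjustment in step 3 is assumed to be a simple multiplication by the prior probability (the abstract does not give the formula), and names missing from the dictionary are assumed to have probability 1.0. The function name, the `top_k` parameter and the sample data are hypothetical.

```python
from collections import Counter

PERSON_TAG = "nr"  # assumed part-of-speech tag for personal names


def extract_key_names(tagged_tokens, prior, top_k=2):
    """Rank personal names by prior-adjusted frequency.

    tagged_tokens: (word, pos_tag) pairs from a segmenter (assumed input).
    prior: dict mapping an ambiguous name to the probability that the
           word really is a personal name (hypothetical dictionary).
    """
    # Step 1: keep only the words tagged as personal names.
    names = [word for word, pos in tagged_tokens if pos == PERSON_TAG]
    # Step 2: the occurrence frequency is the initial weight.
    freq = Counter(names)
    # Step 3 (assumed rule): scale each weight by the prior probability;
    # names absent from the dictionary are treated as unambiguous.
    weights = {name: count * prior.get(name, 1.0)
               for name, count in freq.items()}
    # Step 4: return the names with the largest weights.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# "高峰" can be a personal name or the common noun "peak", so it is
# given a low prior here; all values are made up for illustration.
tokens = [("高峰", PERSON_TAG)] * 3 + [("李娜", PERSON_TAG)] * 2 + [("出席", "v")]
ambiguous_prior = {"高峰": 0.2}
result = extract_key_names(tokens, ambiguous_prior)
print(result)
```

With these made-up values, the ambiguous word drops below the less frequent but unambiguous name, which is the effect step 3 is meant to achieve.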

Description

Technical field

[0001] The invention belongs to the technical field of information extraction, and in particular relates to a text-based method and system for extracting key personal names.

Background

[0002] With the rapid development of Web 2.0 technology, ordinary users have become the main producers of content on the Internet. UGC (User Generated Content) is characterized by timely response and fast dissemination. As a typical representative of UGC, the Weibo platform, with its low entry barrier, large data volume, free and timely sharing, and diverse forms, has become an important source of events and a venue for online public opinion, generating a large number of Weibo messages every day. The conditions for event analysis based on the microblog platform are therefore already in place, but the accuracy and comprehensiveness of person extraction largely determine the accuracy and comprehensiveness of event analysis. The present invention is based on microblo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
Inventor: 曹娟, 张勇东, 张俊强, 李锦涛
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI