Entity attribute information extraction method and device based on syntactic dependency
The technology concerns entity attributes and attribute information, and is applied in special data processing applications, instruments, electrical digital data processing, and related fields. It addresses the misalignment of attributes in existing information extraction methods, with the stated effects of reducing workload and improving efficiency and accuracy.
Examples
Embodiment 1
[0055] Referring to Figures 1-2, the method for extracting entity attribute information based on the syntactic dependency path is explained in detail, taking the text "Deng Chao, born in Nanchang, Jiangxi Province in 1979, and was admitted to the Performance Department of the Central Academy of Drama in 1998." as an example:
[0056] Step 1: According to the keyword request entered by the user, obtain the text to be extracted from the Internet using existing crawler software, and preprocess it to obtain the text entity to be extracted;
[0057] Step 1.1: Denote the text to be extracted, "Deng Chao, born in Nanchang, Jiangxi Province in 1979, and was admitted to the Performance Department of the Central Academy of Drama in 1998.", as I. Use the HanLP open source tool to segment I into words, and denote the resulting word set as W;
[0058] Step 1.2: Use the HanLP open source tool to perform part-of-speech tagging and named e...
Embodiment 2
[0081] The method for extracting entity-related information based on the syntactic dependency path is now described in detail, taking the text "Yuan Hong, graduated from the Shanghai Theater Academy, and is Hu Ge's classmate and friend." as an example:
[0082] Step 1: Preprocess the text to be extracted to obtain the text entity to be extracted;
[0083] Step 1.1: Denote the text to be extracted, "Yuan Hong, graduated from the Shanghai Theater Academy, and is a classmate of Hu Ge.", as I. Use the Stanford open source NLP tool to segment I into words, and denote the resulting word set as W. The word set is shown in Figure 3, where NN denotes a common noun, PU denotes punctuation, VV denotes a verb, NR denotes a proper noun, VC denotes the copula ("to be"), and DEG denotes the associative particle;
[0084] Step 1.2: Use the Stanford open source NLP tool to perform part-of-speech tagging and named entity recognition on the word set. The obtained word ...
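The tagged word set from steps 1.1 and 1.2 can be pictured as a list of (word, tag) pairs. The following is a minimal sketch only: the tokens and tags are written by hand for illustration and are not the Stanford tool's actual output or the contents of Figure 3, and the rule of treating NN and NR words other than the entity as candidate attributes is an assumption, since the source does not spell out its part-of-speech rule here. The names TAGGED_WORDS and candidate_attributes are hypothetical.

```python
# Hand-written (word, tag) pairs for illustration only; NOT real tool output.
# Tags follow the legend in step 1.1: NR proper noun, NN common noun,
# VV verb, VC copula, DEG associative particle, PU punctuation.
TAGGED_WORDS = [
    ("Yuan Hong", "NR"),
    (",", "PU"),
    ("graduated", "VV"),
    ("Shanghai Theater Academy", "NR"),
    (",", "PU"),
    ("is", "VC"),
    ("Hu Ge", "NR"),
    ("'s", "DEG"),
    ("classmate", "NN"),
    (".", "PU"),
]


def candidate_attributes(tagged_words, entity, noun_tags=("NN", "NR")):
    """Return words whose tag is in noun_tags, excluding the entity itself.

    Assumption: nouns other than the entity are treated as candidate
    attribute values; the patent's actual selection rule may differ.
    """
    return [word for word, tag in tagged_words
            if tag in noun_tags and word != entity]


candidates = candidate_attributes(TAGGED_WORDS, "Yuan Hong")
```

Under these assumptions, candidates holds the nouns that a later step could try to link to the entity "Yuan Hong".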
Embodiment 3
[0109] Referring to Figure 5, the present invention also discloses a device for extracting entity-related information based on a syntactic dependency path, including:
[0110] The preprocessing module, which obtains the text to be extracted from the Internet using existing crawler software according to the keyword request input by the user, and preprocesses the text to obtain the text entity to be extracted;
[0111] The path calculation module, which establishes an undirected weighted graph between words according to the syntactic dependency and part-of-speech relationships of the text to be extracted, and obtains the candidate attribute information of the text entity to be extracted according to the part-of-speech relationships; it then searches the undirected weighted graph for the shortest path between the text entity to be extracted and the words of the candidate attribute information, and the words on the shortest path form a set of associated infor...
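The shortest-path search performed by the path calculation module can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes Dijkstra's algorithm (the source does not name a search algorithm), and the edges and weights are invented for the example because the source does not give the weighting scheme. The names build_graph and shortest_path are hypothetical.

```python
import heapq


def build_graph(edges):
    """Build an undirected weighted graph as a nested adjacency dict."""
    graph = {}
    for u, v, weight in edges:
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (cost, path) or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(heap, (new_dist, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical dependency edges and weights for the Embodiment 1 sentence.
graph = build_graph([
    ("Deng Chao", "born", 1.0),
    ("born", "Nanchang", 1.0),
    ("born", "1979", 1.0),
    ("born", "admitted", 2.0),
    ("admitted", "Central Academy of Drama", 1.0),
])
cost, path = shortest_path(graph, "Deng Chao", "Nanchang")
```

With these invented weights, the words on the returned path ("Deng Chao", "born", "Nanchang") play the role of the associated word set that links the entity to one candidate attribute.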