Document information extraction and mapping method and system
A technique for extracting information from documents and representing it as a graph, applicable to text database clustering and classification, unstructured text data retrieval, semantic analysis, and similar tasks. It addresses problems such as the inability to prune misrecognized entity words, and it offers reduced computing resource consumption, strong interpretability, and improved efficiency.
Examples
Embodiment 1
[0062] To meet these requirements, the present invention extracts entity and relationship attributes from the itemized data of requirement documents and imports them into a graph database for storage and visualization. Building on underlying natural language understanding and natural language processing techniques, and using the open source natural language processing platform LTP, it analyzes the word formation features of Chinese requirement documents at the lexical, syntactic, and semantic levels, formulates the corresponding information extraction rules, maintains those rules with the Drools engine, extracts the entity and relationship attributes in the requirement documents, and renders them as a graph to form a requirement knowledge graph.
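As a rough illustration of the final graphing step, the sketch below collects extracted (entity, relationship, entity) triples into a small in-memory graph. The text does not name the graph database or its import format, so `Triple`, `RequirementGraph`, and `build_graph` are hypothetical names, and the in-memory structure only stands in for a real graph store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One extracted fact: head entity, relationship, tail entity (hypothetical shape)."""
    head: str
    relation: str
    tail: str


@dataclass
class RequirementGraph:
    """In-memory stand-in for the graph database (an assumption, not the source's store)."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add(self, triple: Triple) -> None:
        # Sets make repeated imports of the same fact idempotent.
        self.nodes.update((triple.head, triple.tail))
        self.edges.add((triple.head, triple.relation, triple.tail))

    def neighbors(self, entity: str) -> list:
        # Outgoing (relation, tail) pairs for one entity, sorted for stable output.
        return sorted((r, t) for h, r, t in self.edges if h == entity)


def build_graph(triples) -> RequirementGraph:
    graph = RequirementGraph()
    for triple in triples:
        graph.add(triple)
    return graph


example_triples = [
    Triple("system", "provides", "login function"),
    Triple("system", "provides", "login function"),  # duplicate on purpose
    Triple("login function", "requires", "password"),
]
graph = build_graph(example_triples)
print(graph.neighbors("system"))
```

Storing nodes and edges as sets means a duplicate triple adds nothing, which is the behavior one would usually want when the same requirement item is imported twice.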
[0063] The method provided by the present invention for extracting and graphing requirements management document information based on syntactic and semantic rules includes the following steps:
[0064]...
Embodiment 2
[0093] Embodiment 2 is a preferred example of Embodiment 1.
[0094] The method for extracting entities of requirements management documents based on syntactic and semantic rules according to the present invention includes:
[0095] baseNP: simple, non-nested noun phrases, a concept first proposed for English by Church in 1988. Chinese non-nested noun phrases differ from English ones. The formal description of the Chinese baseNP (basic entity noun) falls into 4 categories:
[0096] 1. baseNP → baseNP + baseNP
[0097] 2. baseNP → baseNP + noun / gerund
[0098] 3. baseNP → definite attributive + baseNP
[0099] 4. baseNP → definite attributive + noun / gerund
[0100] The definite attributives include: adjective | distinguishing word | adverb | verb | noun | locative word | English word | numeral | quantifier.
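The four categories can be sketched as a tag-level check. The sketch assumes that the base case is a definite attributive followed by a noun or gerund, so that a baseNP is a run of attributive and noun/gerund tokens that opens with an attributive and closes with a noun or gerund. The tag names and the functions `is_base_np` and `find_base_nps` are illustrative and are not the tag set of any particular tool.

```python
# Illustrative tag names (assumptions), mirroring the attributive list above.
ATTRIBUTIVE = {
    "adj", "distinguishing", "adverb", "verb", "noun",
    "locative", "english", "numeral", "quantifier",
}
HEAD = {"noun", "gerund"}  # tokens that may close a baseNP


def is_base_np(tags) -> bool:
    """True if the tag sequence is derivable from the four baseNP rules."""
    return (
        len(tags) >= 2
        and tags[0] in ATTRIBUTIVE
        and tags[-1] in HEAD
        and all(t in ATTRIBUTIVE or t in HEAD for t in tags)
    )


def find_base_nps(tokens) -> list:
    """Greedy chunker over (word, tag) pairs; returns each baseNP as a word list."""
    chunks, run = [], []
    # The sentinel pair flushes the last run at the end of the sentence.
    for word, tag in list(tokens) + [("", None)]:
        if tag in ATTRIBUTIVE or tag in HEAD:
            run.append((word, tag))
            continue
        # Trim tokens that cannot open or close a baseNP.
        while run and run[0][1] not in ATTRIBUTIVE:
            run.pop(0)
        while run and run[-1][1] not in HEAD:
            run.pop()
        if len(run) >= 2:
            chunks.append([w for w, _ in run])
        run = []
    return chunks


sentence = [
    ("new", "adj"), ("user", "noun"), ("management", "gerund"),
    ("module", "noun"), ("of", "other"), ("system", "noun"),
]
print(find_base_nps(sentence))  # [['new', 'user', 'management', 'module']]
```

Because verbs count as attributives, a purely tag-based match of this kind would also absorb a main predicate that sits between two nouns, so tags alone are not enough to delimit entities reliably.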
[0101] The word formation features of the requirement document are obtained from the word features and the dependency syntax tree, and rules are formulated to extract entities by pattern matching. This process is actually the process of traversing all ...
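One way such a pattern-matching rule over a dependency tree could look is sketched below. The labels `ATT`, `SBV`, `VOB`, and `HED` and the tags `n` and `v` follow the names LTP uses for dependency relations and parts of speech; the rule itself (a noun that is not itself a modifier, together with everything attached to it through `ATT` arcs, forms one entity) is an illustrative assumption and not the rule set of the invention. `Token` and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # part-of-speech tag, e.g. "n" or "v"
    head: int  # idx of the governing token, 0 for the root
    rel: str   # dependency label on the arc to the head


def collect_modifiers(tokens, head_idx: int) -> set:
    """Indices of head_idx plus every token reachable from it through ATT arcs."""
    members = {head_idx}
    changed = True
    while changed:
        changed = False
        for t in tokens:
            if t.rel == "ATT" and t.head in members and t.idx not in members:
                members.add(t.idx)
                changed = True
    return members


def extract_entities(tokens, sep: str = " ") -> list:
    """Apply the assumed rule to every noun that is not itself an ATT modifier."""
    by_idx = {t.idx: t for t in tokens}
    entities = []
    for t in tokens:
        if t.pos == "n" and t.rel != "ATT":
            ids = sorted(collect_modifiers(tokens, t.idx))
            entities.append(sep.join(by_idx[i].word for i in ids))
    return entities


# English glosses of a hand-built parse; a real parse would come from the parser.
parsed = [
    Token(1, "system", "n", 2, "SBV"),
    Token(2, "provide", "v", 0, "HED"),
    Token(3, "user", "n", 5, "ATT"),
    Token(4, "login", "v", 5, "ATT"),
    Token(5, "function", "n", 2, "VOB"),
]
print(extract_entities(parsed))  # ['system', 'user login function']
```

For Chinese text, `sep=""` would join the words without spaces. Following `ATT` arcs keeps the predicate `provide` out of both entities, which a match on part-of-speech tags alone would not guarantee.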