
Knowledge graph construction method and system for multi-source Chinese financial announcement document

A knowledge graph technology for finance, applied to knowledge extraction and knowledge graph construction. It addresses the lack of knowledge graphs in the Chinese financial field and aims to reduce labor cost while improving accuracy and efficiency.

Pending Publication Date: 2021-10-29
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0009] The present invention addresses the lack of knowledge graphs for the Chinese financial field in the prior art by providing a method and system for constructing knowledge graphs from multi-source Chinese financial announcement documents.



Embodiment Construction

[0038] The present invention is further described below with reference to the accompanying drawings.

[0039] The knowledge graph construction method for multi-source Chinese financial announcement documents of the present invention comprises the following steps:

[0040] Step 1: Build a relatively complete document structure tree to capture the document structure: the general title, first-level titles, second-level titles, and so on, each with its corresponding text block.
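As an illustration of Step 1, the following is a minimal sketch of one way a document structure tree could be assembled, assuming the headings have already been detected and assigned a level (0 for the general title, 1 for first-level titles, and so on). The `Node` class, the `build_tree` function, and the `(level, title, text)` input format are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One title in the document together with its text block (hypothetical layout)."""
    title: str
    level: int  # 0 = general title, 1 = first-level title, 2 = second-level title, ...
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def build_tree(sections: List[Tuple[int, str, str]]) -> Node:
    """Build a tree from (level, title, text) tuples given in document order.

    Assumes the first tuple is the general title (level 0) and every later
    tuple has level >= 1.
    """
    level, title, text = sections[0]
    if level != 0:
        raise ValueError("the first section must be the general title (level 0)")
    root = Node(title, 0, text)
    stack = [root]  # path from the root to the most recently added node
    for level, title, text in sections[1:]:
        if level < 1:
            raise ValueError("only the first section may have level 0")
        # Climb back up until the top of the stack is a shallower title.
        while stack[-1].level >= level:
            stack.pop()
        node = Node(title, level, text)
        stack[-1].children.append(node)
        stack.append(node)
    return root


# Usage with made-up headings:
sections = [
    (0, "Announcement on External Guarantee", ""),
    (1, "I. Overview", "overview text"),
    (2, "1.1 Parties", "parties text"),
    (1, "II. Terms", "terms text"),
]
root = build_tree(sections)
```

The stack keeps the path from the root to the last inserted node, so each new title is attached to the nearest preceding title with a smaller level.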

[0041] Step 2: Locate the effective text block by fuzzy matching against the annotated content, and extract the title that corresponds to that block.
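The patent text shown here does not say which fuzzy matching algorithm is used in Step 2. As a rough sketch only, the idea can be illustrated with `difflib.SequenceMatcher` from the Python standard library, scoring each text block by the fraction of the annotation's characters it can match. The functions `coverage` and `locate_block` and the threshold of 0.8 are assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def coverage(annotation: str, block_text: str) -> float:
    """Fraction of the annotation's characters matched somewhere in the block."""
    if not annotation:
        return 0.0
    matcher = SequenceMatcher(None, annotation, block_text, autojunk=False)
    matched = sum(m.size for m in matcher.get_matching_blocks())
    return matched / len(annotation)


def locate_block(
    annotation: str,
    blocks: List[Tuple[str, str]],
    threshold: float = 0.8,  # hypothetical cut-off, not from the patent
) -> Tuple[Optional[str], float]:
    """Return (title, score) of the best-matching block, or (None, score) if below threshold."""
    best_title, best_score = None, 0.0
    for title, text in blocks:
        score = coverage(annotation, text)
        if score > best_score:
            best_title, best_score = title, score
    if best_score < threshold:
        return None, best_score
    return best_title, best_score


# Usage with made-up blocks; the annotation contains a typo ("millon").
blocks = [
    ("Guarantee overview",
     "The company provides a guarantee of 50 million yuan for its subsidiary."),
    ("Board opinion",
     "The board of directors approved the proposal unanimously."),
]
title, score = locate_block("guarantee of 50 millon yuan", blocks)
```

Scoring by coverage of the annotation, rather than by overall similarity, keeps a short annotation from being penalized for sitting inside a long block.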

[0042] Step 3: Pad short effective titles and truncate long ones so that every title has the preset length. In this example, because titles are relatively short, the preset length is made as large as practical so that no information is lost. The titles are then encoded with Chinese BERT word vectors. In this example, ...
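The length unification in Step 3 can be sketched as follows, assuming character-level tokens and the conventional BERT padding token `[PAD]`. The function name `unify_length` and the length used in the usage lines are hypothetical. The BERT encoding itself is not shown, since it would require a pretrained Chinese BERT model.

```python
from typing import List


def unify_length(title: str, max_len: int, pad_token: str = "[PAD]") -> List[str]:
    """Split a title into characters, then truncate or pad to exactly max_len tokens."""
    chars = list(title)[:max_len]  # truncate titles that are too long
    return chars + [pad_token] * (max_len - len(chars))  # pad titles that are too short


# Usage: a short title is padded, a long one is cut.
padded = unify_length("重大事项", 6)
cut = unify_length("关于对外担保的公告", 4)
```

Every output has the same number of tokens, which is what a fixed-size model input requires.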


Abstract

The knowledge graph construction method for multi-source Chinese financial announcement documents comprises the following steps: structuring the hierarchical relationship of the chapters of each document to construct a relatively complete document structure tree; labeling all title data; unifying the length of each title to a preset number of characters and applying character-level word embedding with BERT to obtain the corresponding vector representation; dividing the processed data set into a training set and a test set, and training a title classification model; classifying the document titles with the title classification model; masking the complex effective knowledge in the effective text blocks; constructing a semantic model with a mask, the multi-source similar generalization mask Bi-LSTM semantic model (M-MST model), and training the M-MST model to obtain a knowledge extraction model; obtaining entity relationship triples from the knowledge extraction model in combination with an external knowledge base; and constructing a knowledge graph of multi-source financial announcement documents that supports incremental updating or expansion. The invention further discloses a system for implementing this knowledge graph construction method.
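The last step of the abstract, building a graph from entity relationship triples and updating it incrementally, can be illustrated with a small in-memory sketch. The `TripleGraph` class, its methods, and the sample triples are hypothetical. The abstract does not specify a storage backend, and a real system would likely use a graph database.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


class TripleGraph:
    """A minimal in-memory knowledge graph that accepts incremental batches of triples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.by_head: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def update(self, new_triples: Iterable[Triple]) -> int:
        """Merge a batch of triples and return how many were not already present."""
        added = 0
        for head, relation, tail in new_triples:
            triple = (head, relation, tail)
            if triple not in self.triples:
                self.triples.add(triple)
                self.by_head[head].add((relation, tail))
                added += 1
        return added

    def neighbors(self, head: str) -> List[Tuple[str, str]]:
        """Return the (relation, tail) pairs for a head entity, sorted for stable output."""
        return sorted(self.by_head.get(head, ()))


# Usage with made-up triples from two announcement batches:
graph = TripleGraph()
first = graph.update([("Company A", "guarantees", "Company B")])
second = graph.update([
    ("Company A", "guarantees", "Company B"),  # duplicate, ignored
    ("Company A", "issues", "Bond X"),
])
```

Because duplicates are skipped, re-processing an announcement that was already ingested leaves the graph unchanged.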

Description

Technical field

[0001] The invention relates to a method and system for knowledge extraction and knowledge graph construction, in particular for extracting complex entities in the financial field and constructing a knowledge graph for that field.

[0002] The present invention relates to the fields of natural language processing, knowledge graphs, deep learning, etc., and specifically to modeling based on deep learning.

Background technique

[0003] Listed companies, the mainstay of China's economic development, and the innovative small and medium-sized private enterprises that support economic growth face, and will for some time continue to face, various challenges. For production not to stagnate and development not to slow, provisions must come first. The capital market is the granary from which listed companies replenish their "blood". At the beginning of this year, the new refinancing regulations issued by the Chin...


Application Information

IPC(8): G06F16/36, G06F16/35, G06F40/295, G06F40/30
CPC: G06F16/367, G06F16/35, G06F40/30, G06F40/295
Inventors: 高楠, 杜宇轩, 陈国鑫, 陈磊, 杨博威
Owner: ZHEJIANG UNIV OF TECH