Method and system for mining and searching unstructured text data in financial field

An unstructured text data technology, applied in the fields of data processing and finance, that addresses the problems of relational networks being sparse and unreadable, not being fully utilized, and not expressing results intuitively, so that the data gain more intuitive readability and utilization value.

Pending Publication Date: 2018-11-20
成都量子矩阵科技有限公司 +1

AI Technical Summary

Problems solved by technology

However, deficiencies remain: the association network cannot be used to construct graphs for mining deep information and implicit association information.
[0004] Traditional natural language processing obtains only shallow information; it cannot extract the deep information in the text or the implicit associations between entities.
A relational network constructed from large-scale data is a complex network that is sparse and hard to read. It cannot be used directly, nor can it intuitively represent the mining results. It supports only basic query and search, so the association network cannot be fully utilized.

Examples

Embodiment 1

[0079] Those skilled in the art can implement the present invention as a method for mining unstructured text data in the field of finance and economics. In this embodiment, the following steps are performed:

[0080] S1, collecting data: crawl data in the designated financial field from the Internet;

[0081] S2, cleaning data: remove the CSS fields or paragraph tags that were not removed during crawling, then store the data in the database;

[0082] S3, preprocessing data: read the data stored in the database in step S2, perform word segmentation and named entity recognition on the sentences in the text of the obtained data, and store the processed information in the database;

[0083] S4, mining association relationships: mine the association relationships between named entities;

[0084] S5, building an association graph: use the mined association relationships to construct an association graph, using named entities as vert...
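Steps S2, S4 and S5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the listing does not say how associations are mined, so sentence-level co-occurrence counting is assumed here, and weighted edges are likewise an assumption. The function names `clean`, `mine_associations` and `build_graph` are hypothetical, and step S3 (word segmentation and named entity recognition) is assumed to have already produced one list of entities per sentence.

```python
import re
from collections import Counter
from itertools import combinations

# Leftover <style> blocks and generic tags such as <p> or </div>.
STYLE_RE = re.compile(r"<style.*?</style>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def clean(raw: str) -> str:
    """S2: strip leftover style blocks and tags, then collapse whitespace."""
    text = STYLE_RE.sub(" ", raw)
    text = TAG_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def mine_associations(sentence_entities):
    """S4 (assumed method): count sentence-level co-occurrence of entity pairs."""
    counts = Counter()
    for entities in sentence_entities:
        # sorted(set(...)) gives each unordered pair one canonical key.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


def build_graph(counts):
    """S5: entities become vertices; each mined association becomes an edge.

    The edge weight (co-occurrence count) is an assumption of this sketch.
    """
    graph = {}
    for (a, b), weight in counts.items():
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph
```

For example, `build_graph(mine_associations([["A", "B"], ["B", "A", "C"]]))` yields an undirected adjacency mapping in which the edge between `A` and `B` carries weight 2.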

Embodiment 2

[0090] Those skilled in the art can implement the present invention as a method for mining unstructured text data in the field of finance and economics. In this embodiment, on the basis of Embodiment 1, a six-degree association network is constructed in step S5 as follows:

[0091] Given a center point, generate a six-degree association network centered on it. First initialize a central node set and add the center node to it, then initialize a candidate node set. Search the association network and add the nodes directly connected to the center point to the candidate node set as first-degree nodes. Merge the central node set and the candidate node set into a new central node set, then find the nodes in the association network that are connected to nodes in the central node set but are not yet in it, and add them to the candidate node set. Continue in this way until a six-degree network is generated, or until all nodes are already in th...
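The expansion described above amounts to a breadth-first search limited to six rounds. The following Python sketch illustrates it under stated assumptions: the association network is given as an adjacency mapping from each node to its neighbors, and the function name `six_degree_network` and the `max_degree` parameter are hypothetical.

```python
def six_degree_network(graph, center, max_degree=6):
    """Return {node: degree} for every node within max_degree hops of center.

    graph maps each node to an iterable of its neighbors (assumed layout).
    """
    central = {center}          # the central node set
    degree_of = {center: 0}
    for degree in range(1, max_degree + 1):
        # Candidate set: neighbors of central nodes not yet in the central set.
        candidates = set()
        for node in central:
            for neighbor in graph.get(node, ()):
                if neighbor not in central:
                    candidates.add(neighbor)
        if not candidates:      # every reachable node is already included
            break
        for node in candidates:
            degree_of[node] = degree
        central |= candidates   # merge candidates into the new central set
    return degree_of
```

Each round rescans the whole central set, which mirrors the description; tracking only the most recently added nodes would avoid the repeated scans on large networks.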

Embodiment 3

[0094] Those skilled in the art can implement the present invention as a mining system for unstructured text data in the field of finance and economics. In this embodiment, the system includes a data acquisition module, a data cleaning module, a data preprocessing module, an association mining module, an association map construction module, and a complex network analysis module;

[0095] The data acquisition module is used to crawl data in the designated financial field from the Internet;

[0096] The data cleaning module is used to remove CSS fields or paragraph tags that were not removed during crawling, and then store the data in the database;

[0097] The data preprocessing module is used to read the data stored in the database, perform word segmentation and named entity recognition on the sentences in the text of the acquired data, and store the processed information in the database;

[0098] The association mining module is used to use the preprocessed...
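One way to picture how such modules fit together is as an ordered chain in which each module consumes the previous module's output. The Python sketch below shows only that wiring; the `MiningSystem` class and its methods are hypothetical, and the listing does not describe the internals of each module or how they exchange data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

Stage = Callable[[Any], Any]


@dataclass
class MiningSystem:
    """Runs named modules in order, passing each output to the next module."""

    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "MiningSystem":
        """Register a module under a name and return self for chaining."""
        self.stages.append((name, stage))
        return self

    def run(self, data: Any) -> Any:
        """Feed data through every registered module in registration order."""
        for _name, stage in self.stages:
            data = stage(data)
        return data
```

A system would be assembled by registering one callable per module, for example `MiningSystem().add("data_cleaning", clean_fn).add("data_preprocessing", preprocess_fn)`, where `clean_fn` and `preprocess_fn` are placeholders for real implementations.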

Abstract

The invention discloses a method and system for mining and searching unstructured text data in the financial field. It provides a scheme for named entity recognition on unstructured text data, mining of association relationships between entities, and construction and utilization of an association relationship network in the financial field. The system comprises a data acquisition module, a data cleaning module, a data preprocessing module, an association mining module, an association map construction module and a complex network analysis module. The invention can complete basic data analysis and information extraction, construct an economic map from the mined information, and use the economic map to mine deep information and hidden associations, so that the data have more intuitive readability and utilization value.

Description

Technical field

[0001] The present invention relates to the technical field of data processing, and more specifically, to a mining and searching method and system for unstructured text data in the field of finance and economics.

Background art

[0002] In the current big data era, the explosive growth of data and information related to economic activities has produced a huge amount of textual data in the economic field, and manual analysis of such volumes of text has become impossible.

[0003] The granted Chinese patent with publication number CN103955542B discloses a method and system for mining fully weighted positive and negative association patterns between text words. It uses a Chinese text preprocessing module to build a text database and a feature word item library; the feature word frequent item set and negative item set mining implementation module mines the fully weighted feature word candidate it...

Application Information

IPC(8): G06F17/30
Inventor: 周焕来, 尹凯, 赵宏森, 罗钰敏, 刘丹
Owner: 成都量子矩阵科技有限公司