Text classification method for research report text

A text classification method in the field of machine learning that addresses the low accuracy of paragraph extraction and classification in research reports, improving text analysis capability while achieving high analysis efficiency and accuracy.

Inactive Publication Date: 2020-01-21
创新奇智(南京)科技有限公司

AI Technical Summary

Problems solved by technology

[0002] At present, mature natural language processing technology can identify the entities in research reports and can classify the reports themselves, for example as individual stock research reports, industry research reports, or futures research reports. However, if each paragraph of a research report must be classified, for example a stock research report includes core viewpoints, objective exp...

Embodiment Construction

[0017] The present invention will be described in further detail below in conjunction with the examples.

[0018] As shown in Figure 1, the present invention discloses a text classification method for research report text. The process is as follows:

[0019] a. Collect a certain number of research reports, and mark the paragraphs of the collected research reports to form samples;

[0020] b. Send the marked samples to the machine learning framework for training, so as to obtain a comprehensive training model;

[0021] c. Finally, perform content extraction and text noise reduction on the original research report files to be identified, and use the comprehensive training model to extract and classify the research report content.
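
Steps a to c can be illustrated with a minimal sketch. It is not the patented implementation: a small bag-of-words naive Bayes classifier stands in for the neural network models, and the sample paragraphs, label names, the page-number noise rule, and all function names (`denoise`, `NaiveBayes`, `classify_report`) are assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict

# Step a: hypothetical hand-labeled paragraphs. The label names follow the
# paragraph categories mentioned in the background section.
LABELED = [
    ("We maintain our buy rating and raise the target price.", "core_viewpoint"),
    ("Our core view is that the buy rating remains justified.", "core_viewpoint"),
    ("We forecast net profit of 500 million next year.", "profit_forecast"),
    ("Earnings per share are forecast to grow next year.", "profit_forecast"),
    ("Risk warning: raw material costs may rise sharply.", "risk_warning"),
    ("Key risk factors include weaker demand and policy risk.", "risk_warning"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def denoise(raw):
    """Split extracted text into paragraphs and drop obvious noise."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", raw):
        text = " ".join(block.split())  # collapse line breaks and extra spaces
        # Assumed noise rule: skip empty blocks and page-number residue.
        if not text or re.fullmatch(r"(page\s*)?\d+", text, flags=re.IGNORECASE):
            continue
        paragraphs.append(text)
    return paragraphs


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (stand-in model)."""

    def fit(self, samples):
        # Step b: train on the labeled samples.
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()
        for text, label in samples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify_report(raw, model):
    """Step c: denoise the raw text, then label each remaining paragraph."""
    return [(p, model.predict(p)) for p in denoise(raw)]


RAW_REPORT = (
    "Page 1\n\n"
    "We keep the buy   rating\non this stock.\n\n\n"
    "2\n\n"
    "Profit is forecast to grow next year.\n\n"
    "Policy risk and demand risk remain.\n"
)

model = NaiveBayes().fit(LABELED)
for paragraph, label in classify_report(RAW_REPORT, model):
    print(f"{label}: {paragraph}")
```

In practice the stand-in classifier would be replaced by the trained models of step b, and the noise rules would depend on how the original report files are extracted.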

[0022] In step a, the collected research report paragraphs are labeled manually; the labeling workload ranges from several thousand to tens of thousands of paragraphs.

[0023] In b, the comprehensive training model includes several neural network training models, and the neur...
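
The text above is cut off before it says how the individual models are combined, so the following is only a sketch of one common option, majority voting, and not necessarily the rule used in the invention. The keyword-based `keyword_model` functions are hypothetical stand-ins for trained neural network models.

```python
from collections import Counter


def majority_vote(models, paragraph):
    """Return the label predicted by the largest number of models."""
    votes = Counter(model(paragraph) for model in models)
    # With equal counts, the label that was encountered first is returned.
    return votes.most_common(1)[0][0]


def keyword_model(keyword, label, fallback="objective_exposition"):
    """Hypothetical stand-in for a trained model: a single keyword rule."""
    def predict(paragraph):
        return label if keyword in paragraph.lower() else fallback
    return predict


models = [
    keyword_model("risk", "risk_warning"),
    keyword_model("uncertain", "risk_warning"),
    keyword_model("forecast", "profit_forecast"),
]

print(majority_vote(models, "Policy risk makes the forecast uncertain."))
```

Other combination rules, such as averaging the class probabilities of each model, would fit the same interface.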

Abstract

The invention relates to a text classification method for research report texts, which comprises the following steps: first, collecting a certain number of research reports and labeling the collected research report paragraphs to form samples; next, sending the labeled samples to a machine learning framework for training to obtain a comprehensive training model; and finally, performing content extraction and text noise reduction on the original research report file to be identified, and using the comprehensive training model to extract and classify the research report content. The method effectively improves the accuracy of extracting and classifying research report paragraphs and improves the text analysis capability for research reports.

Description

Technical field

[0001] This patent application belongs to the field of machine learning technology and, more specifically, relates to a text classification method for research report texts.

Background technique

[0002] At present, mature natural language processing technology can identify the entities in research reports and can classify the reports themselves, for example as individual stock research reports, industry research reports, or futures research reports. However, if each paragraph of a research report must be classified, for example a stock research report includes core viewpoints, objective expositions, profit forecasts, and risk warnings, then existing text classification technology clearly cannot meet the need.

[0003] At the same time, current deep learning models mainly include TextCnn, LSTM, FastText, and other models. These models are all deep learning models based on neural networks. They are good at single text classi...

Application Information

IPC(8): G06F16/35, G06F40/205, G06N3/08
CPC: G06F16/35, G06N3/08
Inventor: 张发恩, 戴辉辉, 龚才春
Owner: 创新奇智(南京)科技有限公司