Financial news public opinion early warning classification method and system based on electra deep neural network

By preprocessing and incrementally training financial news using the Electra deep neural network model, extracting key sentences and classifying public opinion, the problem of low accuracy in financial public opinion classification in existing technologies is solved, achieving faster and more accurate public opinion identification and risk warning.

CN116151989B · Active · Publication Date: 2026-06-19 · IND BANK CO +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
IND BANK CO
Filing Date
2022-12-15
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively capture the complex linguistic features in financial news, resulting in low accuracy in classifying financial public opinion and an inability to promptly assess financial public opinion within the vast amount of information available online.

Method used

The Electra deep neural network model is used for preprocessing and incremental training of financial news data. Negative text is identified through fully connected layers and discriminators, topic sentences are extracted and similarity is calculated, and public opinion is classified by combining a Softmax classifier.

Benefits of technology

It achieves more accurate classification of financial news and public opinion, enabling rapid identification of potential financial risks and improving risk management capabilities.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a method and system for financial news public opinion early warning and classification based on Electra deep neural networks. The method includes: collecting and labeling financial news data, preprocessing it, and inputting it into an Electra pre-trained model for incremental training and weight updates to obtain an updated Electra deep neural network model; obtaining text representations through this model to determine whether the corresponding financial news data text is negative; if so, extracting the main idea sentences from the negative financial news data text, and then extracting the public opinion category features from each main idea sentence using the Electra deep neural network model; if not, not issuing an early warning; inputting the public opinion category features into a classifier for classification to obtain the final public opinion classification of the financial news, thereby implementing the early warning instruction. This invention achieves feature extraction and classification of unstructured text data in a specific domain, thus solving the problem of financial news public opinion early warning and classification.

Description

Technical Field

[0001] This invention relates to the field of data processing technology, specifically to a financial news public opinion early warning classification method and system based on an Electra deep neural network.

Background Technology

[0002] Since its inception, online public opinion monitoring technology has demonstrated immense power, with its reach expanding ever wider, moving from the purely social sphere into the more specialized realm of financial investment. Due to the development of information and communication technologies and the internet, the impact of internet financial information on the financial market has become increasingly undeniable. This information is crucial to the development and stability of the entire financial industry.

[0003] Financial public opinion refers to the state of information dissemination and interaction among various entities releasing financial information on a specific topic through various channels, based on their respective viewpoints. The emergence, development, and evolution of financial-related public opinion have a significant impact on the financial industry and even the macroeconomy, which is why it deserves special attention. However, the diverse layers and redundant information in current online financial texts hinder the classification and detection of financial news and public opinion.

[0004] Natural Language Processing (NLP) is an important area within computer science and artificial intelligence. Before the advent of deep learning, traditional NLP methods, such as rule-based methods, probabilistic modeling, and linear classifiers, were widely used in tasks such as spam filtering and sentiment classification. These classic methods relied on statistical features of sequences and simple linguistic features, and a major drawback was their inability to capture complex linguistic features. With the advent of deep learning models such as BERT, NLP techniques have become highly effective at capturing the linguistic features of text, making such models a leading approach for a wide range of text processing tasks.

[0005] Patent document CN114398480A discloses a method and device for detecting the segmentation of financial public opinion based on key information extraction. The steps include: preprocessing financial text data and a set of financial public opinion tag descriptions; encoding financial text fragments and financial public opinion tag sentences to obtain fragment representations; performing similarity representation on financial text fragments and the set of financial public opinion tag descriptions, and then performing classification training to obtain a key information sentence extraction and classification auxiliary model; using the key information sentence extraction and classification auxiliary model to calculate the importance of financial text fragments to the set of financial public opinion tag descriptions, and selecting tag key sentences; constructing a combination of tag key sentences to input into the financial text, and performing segmented public opinion classification.

[0006] However, patent document CN114398480A uses a recurrent neural network for encoding, which cannot fully capture the semantic information implied in the text context, resulting in a low accuracy rate for subsequent tasks. Furthermore, this document segments financial public opinion based on key information extraction, without addressing whether financial news contains financial public opinion, thus failing to directly assess the vast amount of information available on the internet.

Summary of the Invention

[0007] To address the shortcomings of existing technologies, the purpose of this invention is to provide a financial news and public opinion early warning classification method and system based on Electra.

[0008] A financial news sentiment early warning classification method based on Electra deep neural network provided by the present invention includes:

[0009] Step S1: Collect and label financial news data;

[0010] Step S2: Preprocess the labeled financial news data and input it into the Electra pre-trained model for incremental training and updating the model weights to obtain the updated Electra deep neural network model;

[0011] Step S3: Obtain the text representation through the Electra deep neural network model, and then determine whether the corresponding financial news data text is negative text. If yes, trigger step S4; otherwise, do not issue a warning.

[0012] Step S4: Extract the main idea sentences from the negative financial news data text, and then extract the public opinion type features of each main idea sentence using the Electra deep neural network model;

[0013] Step S5: Input the public opinion type features into the classifier for classification to obtain the final public opinion classification of financial news, thereby realizing the early warning instruction.

[0014] Preferably, the labeling includes labeling the collected financial news data according to preset category tags;

[0015] The category labels include abnormal financial business behavior, poor management, abnormal corporate operations, major corporate changes, significant negative information about borrowers other than business operations, and illegal behavior by borrowers.

[0016] Preferably, the preprocessing includes data cleaning of the financial news data to remove invalid characters, while limiting the maximum length of the input text to 512 bytes. The portion exceeding the maximum length is input in slices and then sequentially input into the backend word segmenter to obtain text segmentation.

[0017] Preferably, step S3 includes: inputting the text encoding corresponding to the financial news data to be classified into the fully connected layer of the Electra deep neural network model, and performing binary classification on the tensor of the encoding through a discriminator to output whether the current financial news data is a negative classification.

[0018] Preferably, step S4 includes:

[0019] Step S4.1: Extract each sentence from the identified negative financial news data text as input text, and obtain the corresponding sentence body through named entity recognition;

[0020] Step S4.2: Calculate the similarity between the main body and its corresponding text title. The calculation formula is as follows:

[0021]

[0022] Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body;

[0023] Step S4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.

[0024] A financial news sentiment early warning and classification system based on an Electra deep neural network, provided by the present invention, includes:

[0025] Module M1: Collects and annotates financial news data;

[0026] Module M2: Preprocesses the labeled financial news data and inputs it into the Electra pre-trained model for incremental training and updating the model weights to obtain the updated Electra deep neural network model;

[0027] Module M3: Obtains text representation through the Electra deep neural network model, and then determines whether the corresponding financial news data text is negative text. If so, it triggers module M4; otherwise, it does not issue a warning.

[0028] Module M4: Extracts the main idea sentences from negative financial news data text, and then uses the Electra deep neural network model to extract the public opinion type features of each main idea sentence;

[0029] Module M5: Inputs the characteristics of the public opinion type into the classifier for classification, obtains the final public opinion classification of financial news, and then realizes the early warning command.

[0030] Preferably, the labeling includes labeling the collected financial news data according to preset category tags;

[0031] The category labels include abnormal financial business behavior, poor management, abnormal corporate operations, major corporate changes, significant negative information about borrowers other than business operations, and illegal behavior by borrowers.

[0032] Preferably, the preprocessing includes data cleaning of the financial news data to remove invalid characters, while limiting the maximum length of the input text to 512 bytes. The portion exceeding the maximum length is input in slices and then sequentially input into the backend word segmenter to obtain text segmentation.

[0033] Preferably, module M3 includes: inputting the text encoding corresponding to the financial news data to be classified into the fully connected layer of the Electra deep neural network model, and performing binary classification on the tensor of the encoding through a discriminator to output whether the current financial news data is a negative classification.

[0034] Preferably, module M4 includes:

[0035] Module M4.1: Extracts each sentence from the identified negative financial news data text as input text, and obtains the corresponding sentence body through named entity recognition;

[0036] Module M4.2: Calculates the similarity between the main body and its corresponding text title using the following formula:

[0037]

[0038] Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body;

[0039] Module M4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.

[0040] Compared with the prior art, the present invention has the following beneficial effects:

[0041] 1. This invention uses Electra from the BERT series of deep learning as a pre-trained model, then adds a financial dataset for fine-tuning to form an encoder, and then adds a fully connected layer classifier to achieve feature extraction and classification of unstructured text data in a specific domain, thereby solving the problem of financial news public opinion early warning classification.

[0042] 2. This invention uses Electra as the backbone network, which addresses the mismatch caused by BERT's masking (the [MASK] token is seen during pre-training but not in downstream use), making its performance significantly better than BERT. Furthermore, this invention can obtain financial public opinion information more quickly and accurately and provide more accurate public opinion classification.

[0043] 3. This invention discovers financial public opinion on the Internet and issues timely and accurate early warning information to help financial practitioners identify potential financial risks, thereby improving the risk management capabilities of the financial system.

Attached Figure Description

[0044] Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:

[0045] Figure 1 is a schematic diagram of the workflow of the present invention.

Detailed Implementation

[0046] The present invention will now be described in detail with reference to specific embodiments. These embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make several changes and improvements without departing from the concept of the present invention. These all fall within the protection scope of the present invention.

[0047] This invention applies Natural Language Processing (NLP) models from deep learning to classify financial news and provide early warnings. The implementation mainly includes natural language text data preprocessing, text vector encoding, fully connected layer text classification, the establishment of a news and public opinion early warning system, and iterative version updates of the system.

[0048] The public opinion classification model of this invention adopts a two-stage model. In the first stage, the Electra model is fine-tuned to extract features of news public opinion, and then a Sigmoid discriminator is used to determine whether the overall public opinion sentiment of the news is negative. In the second stage, negative news is segmented into sentences. The segmented text is then processed by the Electra deep network model to extract the public opinion category features of each sentence, and finally classified using a Softmax classifier to output the final public opinion classification of the news. Furthermore, this invention uses the Electra model as the backbone network because, compared to BERT, Electra uses a new pre-training task, replaced token detection, which addresses the mismatch caused by masking in BERT, making its performance significantly superior to BERT.
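The two-stage control flow described above can be sketched as follows. This is only an illustrative outline, not the patented implementation: the function name `classify_news` and the four injected callables are hypothetical placeholders standing in for the fine-tuned Electra components, and returning one label per selected sentence is an assumption.

```python
from typing import Callable, List, Optional, Sequence


def classify_news(
    title: str,
    body: str,
    is_negative: Callable[[str], bool],
    split_sentences: Callable[[str], List[str]],
    pick_topic_sentences: Callable[[str, Sequence[str]], List[str]],
    classify_sentence: Callable[[str], str],
) -> Optional[List[str]]:
    """Run the two-stage flow; return None when no warning is issued."""
    # Stage 1: binary negative / non-negative decision on the whole text.
    if not is_negative(body):
        return None  # non-negative news: no early warning
    # Stage 2: split into sentences, keep the main idea sentence(s),
    # then assign a public opinion category to each of them.
    sentences = split_sentences(body)
    topics = pick_topic_sentences(title, sentences)
    return [classify_sentence(sentence) for sentence in topics]
```

Passing the models in as callables keeps the sketch independent of any particular framework; in practice each callable would wrap the corresponding trained model.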

[0049] Example 1

[0050] According to the present invention, a financial news sentiment early warning classification method based on an Electra deep neural network is provided. As shown in Figure 1, it includes:

[0051] Step S1: Collect and label financial news data. Labeling involves tagging the collected financial news data according to preset category labels. These category labels include abnormal financial business behavior, poor management, abnormal corporate operations, significant corporate changes, major negative information regarding borrowers (excluding business-related matters), and illegal activities by borrowers.

[0052] Specifically, the collection of financial news data includes two main sources: a small amount of financial news text already collected by financial professionals and a large amount of financial news text obtained from publicly available online channels. For the small amount of financial news text already collected by financial professionals, it is first divided into negative and non-negative categories. Then, negative financial news is segmented into sentences, the main idea sentences are extracted, and then it is divided into six categories. For the large amount of financial news text obtained from publicly available online channels, interfering information such as advertisements and images is first removed. Then, based on the set category labels, this semi-structured data is labeled and classified. The labeled news is divided into training, validation, and test sets.
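The final division into training, validation, and test sets could look like the sketch below. The source does not state the split ratios or how shuffling is done, so the 80/10/10 proportions, the fixed seed, and the name `split_dataset` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T],
    train: float = 0.8,  # assumed ratio, not given in the source
    val: float = 0.1,    # assumed ratio, not given in the source
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle labeled news items and split them into train/val/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder becomes the test set
    )
```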

[0053] Step S2: Preprocess the labeled financial news data and input it into the Electra pre-trained model for incremental training and weight updates to obtain the updated Electra deep neural network model. The preprocessing includes data cleaning of the financial news data, removing invalid characters, and limiting the maximum text length to 512 bytes. For portions exceeding this maximum length, slices are input sequentially into the backend word segmenter to obtain text segments.
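A minimal sketch of the cleaning and slicing described in this step is given below. Which characters count as "invalid" is not specified in the source, so treating Unicode control and format characters as invalid is an assumption, as are the helper names `clean_text` and `slice_by_bytes`; the 512-byte limit is taken from the text and measured here in UTF-8.

```python
import re
import unicodedata
from typing import List

MAX_BYTES = 512  # maximum input length stated in the description


def clean_text(text: str) -> str:
    """Normalise the text and drop characters assumed to be invalid."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text)  # collapse runs of whitespace first
    # Assumption: Unicode category "C*" (control, format, ...) is invalid.
    text = "".join(ch for ch in text if unicodedata.category(ch)[0] != "C")
    return text.strip()


def slice_by_bytes(text: str, max_bytes: int = MAX_BYTES) -> List[str]:
    """Cut text into consecutive slices of at most max_bytes UTF-8 bytes."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if current and size + n > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)  # characters are never split across slices
        size += n
    if current:
        chunks.append("".join(current))
    return chunks
```

Each slice would then be passed, in order, to the word segmenter (tokenizer) of the model.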

[0054] The pre-trained Electra model already possesses text embeddings based on general language knowledge. Fine-tuning is then performed to enable it to learn better text embeddings (encodings) for a specific domain. This fine-tuning involves incrementally training the model and updating its weights. Specifically, it involves using Electra's entire architecture and initial weights, feeding labeled training data into the Electra model for incremental training. For example, setting the optimizer to AdamW, the learning rate to 0.0004, the loss function to cross-entropy, the batch size to 128, and shuffling the dataset in each training round with a warm-up approach for six rounds. This training updates the model weights, resulting in more suitable text embeddings (encodings).
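The example hyperparameters above can be collected into a small configuration object, shown here with one possible warm-up schedule. The source names a warm-up approach but gives no details, so the linear ramp, the `warmup_fraction` value, the constant rate after warm-up, and the names `FineTuneConfig` and `warmup_lr` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    optimizer: str = "AdamW"
    learning_rate: float = 4e-4    # 0.0004, as in the example
    loss: str = "cross_entropy"
    batch_size: int = 128
    epochs: int = 6                # six training rounds
    shuffle_each_epoch: bool = True
    warmup_fraction: float = 0.1   # assumption: not stated in the source


def warmup_lr(step: int, total_steps: int, cfg: FineTuneConfig) -> float:
    """Learning rate at a 0-based step: linear warm-up, then constant."""
    warmup_steps = max(1, int(total_steps * cfg.warmup_fraction))
    if step < warmup_steps:
        return cfg.learning_rate * (step + 1) / warmup_steps
    return cfg.learning_rate
```

For instance, with 100 total steps the rate ramps up over the first 10 steps and stays at 0.0004 afterwards.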

[0055] Step S3: Obtain the text representation through the Electra deep neural network model, and then determine whether the corresponding financial news data text is negative text. If yes, trigger step S4; otherwise, no warning is issued. Specifically, the text encoding corresponding to the financial news data to be classified is input into the fully connected layer of the Electra deep neural network model, and the tensor of the encoding is binary classified by a discriminator to output whether the current financial news data is negative. The discriminator includes a Sigmoid discriminator.
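The binary decision in this step amounts to a fully connected layer followed by a Sigmoid. The sketch below shows that computation on a plain list of floats standing in for the Electra text encoding; the 0.5 decision threshold and the names `sigmoid` and `is_negative` are assumptions, and the weights would in practice come from training.

```python
import math
from typing import Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def is_negative(
    encoding: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
    threshold: float = 0.5,  # assumed cut-off, not given in the source
) -> Tuple[bool, float]:
    """Apply one linear layer plus Sigmoid; return (decision, probability)."""
    logit = sum(w * x for w, x in zip(weights, encoding)) + bias
    prob = sigmoid(logit)
    return prob >= threshold, prob
```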

[0056] Step S4: Extract the main idea sentences from the negative financial news data text, and then use the Electra deep neural network model to extract the sentiment type features of each main idea sentence. Step S4 includes:

[0057] Step S4.1: Extract each sentence from the identified negative financial news data text as input text, and obtain the corresponding sentence body through named entity recognition; the named entity recognition models include Bi-LSTM and CRF models.

[0058] Step S4.2: Calculate the similarity between the main body and its corresponding text title. The calculation formula is as follows:

[0059]

[0060] Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body;

[0061] Step S4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.
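Steps S4.2 and S4.3 can be illustrated as below. The similarity formula itself is not reproduced in this text, so cosine similarity between the two mean word vectors is used purely as an assumed stand-in; the function names are hypothetical, and the word vectors are plain lists of floats.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of word vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity (assumed measure); 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_topic_sentence(
    title_vectors: Sequence[Vector],
    body_vectors_per_sentence: Sequence[Sequence[Vector]],
) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring sentence and all scores."""
    b = mean_vector(title_vectors)  # corresponds to B
    scores = [cosine(b, mean_vector(v)) for v in body_vectors_per_sentence]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Here each entry of `body_vectors_per_sentence` holds the word vectors of the body found by named entity recognition in one sentence.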

[0062] Step S5: Input the aforementioned public opinion category features into a classifier for classification to obtain the final public opinion classification of financial news, thereby realizing early warning instructions. Specifically, the main idea sentence is encoded and then input into a fully connected layer. A Softmax classifier is used to perform multi-classification on the encoded tensor, outputting the business classification of negative sentences.
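The multi-class step can be sketched as a Softmax over six scores, one per category label listed in step S1. The logits here are plain floats standing in for the output of the fully connected layer; the name `classify_opinion` and the choice of the highest-probability label as the result are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# The six category labels listed in the description.
CATEGORIES = [
    "abnormal financial business behavior",
    "poor management",
    "abnormal corporate operations",
    "major corporate changes",
    "significant negative information about borrowers other than business operations",
    "illegal behavior by borrowers",
]


def softmax(logits: Sequence[float]) -> List[float]:
    """Softmax with max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_opinion(logits: Sequence[float]) -> Tuple[str, List[float]]:
    """Map six logits to the most probable category and all probabilities."""
    if len(logits) != len(CATEGORIES):
        raise ValueError("expected one logit per category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs
```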

[0063] Example 2

[0064] The present invention also provides a financial news public opinion early warning classification system based on Electra deep neural network. Those skilled in the art can implement the financial news public opinion early warning classification system based on Electra deep neural network by executing the steps of the method. That is, the method of financial news public opinion early warning classification based on Electra deep neural network can be understood as a preferred embodiment of the financial news public opinion early warning classification system based on Electra deep neural network.

[0065] A financial news sentiment early warning and classification system based on an Electra deep neural network, provided by the present invention, includes:

[0066] Module M1: Collects and labels financial news data. The labeling includes tagging the collected financial news data according to preset category labels. These category labels include abnormal financial business behavior, poor management, abnormal corporate operations, significant corporate changes, major negative information about borrowers (excluding business-related matters), and illegal activities by borrowers.

[0067] Module M2: Preprocesses the labeled financial news data and inputs it into the Electra pre-trained model for incremental training and weight updates, resulting in an updated Electra deep neural network model. The preprocessing includes data cleaning of the financial news data, removing invalid characters, and limiting the maximum text input length to 512 bytes. Data exceeding this maximum length is input in segments, which are then sequentially input into the backend word segmenter to obtain text segments.

[0068] Module M3: Obtains text representations through the Electra deep neural network model, then determines whether the corresponding financial news data text is negative. If not, no warning is issued; if so, the main idea sentences in the negative financial news data text are extracted, and the sentiment category features in each main idea sentence are extracted through the Electra deep neural network model. Module M3 includes: inputting the text encoding corresponding to the financial news data to be classified into the fully connected layer of the Electra deep neural network model, and performing binary classification on the tensor of the encoding through a discriminator to output whether the current financial news data is classified as negative.

[0069] Module M4: Inputs the characteristics of the public opinion types into a classifier for classification, obtains the final public opinion classification of financial news, and then implements early warning commands. Module M4 includes:

[0070] Module M4.1: Extracts each sentence from the identified negative financial news data text as input text, and obtains the corresponding sentence body through named entity recognition.

[0071] Module M4.2: Calculates the similarity between the main body and its corresponding text title using the following formula:

[0072]

[0073] Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body.

[0074] Module M4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.

[0075] Those skilled in the art will understand that, in addition to implementing the system, apparatus, and their modules provided by this invention in purely computer-readable program code, the same program can be implemented in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers by logically programming the method steps. Therefore, the system, apparatus, and their modules provided by this invention can be considered a hardware component, and the modules included therein for implementing various programs can also be considered structures within the hardware component; alternatively, modules for implementing various functions can be considered both software programs implementing the method and structures within the hardware component.

[0076] Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the specific embodiments described above, and those skilled in the art can make various changes or modifications within the scope of the claims, which do not affect the essence of the present invention. Unless otherwise specified, the embodiments and features described in this application can be arbitrarily combined with each other.

Claims

1. A financial news public opinion early warning classification method based on an Electra deep neural network, characterized in that it includes:
Step S1: Collect and label financial news data;
Step S2: Preprocess the labeled financial news data and input it into the Electra pre-trained model for incremental training and updating the model weights to obtain the updated Electra deep neural network model;
Step S3: Obtain the text representation through the Electra deep neural network model, and then determine whether the corresponding financial news data text is negative text. If yes, trigger step S4; otherwise, do not issue a warning;
Step S4: Extract the main idea sentences from the negative financial news data text, and then extract the public opinion type features of each main idea sentence using the Electra deep neural network model;
Step S5: Input the public opinion type features into the classifier for classification to obtain the final public opinion classification of financial news, thereby realizing the early warning instruction;
wherein step S3 includes: inputting the text encoding corresponding to the financial news data to be classified into the fully connected layer of the Electra deep neural network model, and performing binary classification on the encoded tensor through a discriminator to output whether the current financial news data is a negative classification.

2. The financial news public opinion early warning classification method based on the Electra deep neural network according to claim 1, characterized in that the labeling includes labeling the collected financial news data according to preset category tags; the category labels include abnormal financial business behavior, poor management, abnormal corporate operations, major corporate changes, significant negative information about borrowers other than business operations, and illegal behavior by borrowers.

3. The financial news public opinion early warning classification method based on the Electra deep neural network according to claim 1, characterized in that the preprocessing includes cleaning the financial news data, removing invalid characters, and limiting the maximum text length input to 512 bytes. For the portion exceeding the maximum length, the data is input in segments and then sequentially input into the backend word segmenter to obtain text segments.

4. The financial news public opinion early warning classification method based on the Electra deep neural network according to claim 1, characterized in that step S4 includes:
Step S4.1: Extract each sentence from the identified negative financial news data text as input text, and obtain the corresponding sentence body through named entity recognition;
Step S4.2: Calculate the similarity between the main body and its corresponding text title. The calculation formula is as follows: Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body;
Step S4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.

5. A financial news public opinion early warning and classification system based on an Electra deep neural network, characterized in that it includes:
Module M1: Collects and annotates financial news data;
Module M2: Preprocesses the labeled financial news data and inputs it into the Electra pre-trained model for incremental training and updating the model weights to obtain the updated Electra deep neural network model;
Module M3: Obtains text representation through the Electra deep neural network model, and then determines whether the corresponding financial news data text is negative text. If so, it triggers module M4; otherwise, it does not issue a warning;
Module M4: Extracts the main idea sentences from negative financial news data text, and then uses the Electra deep neural network model to extract the public opinion type features of each main idea sentence;
Module M5: Inputs the characteristics of the public opinion type into the classifier for classification, obtains the final public opinion classification of financial news, and then realizes the early warning instruction;
wherein module M3 includes: inputting the text encoding corresponding to the financial news data to be classified into the fully connected layer of the Electra deep neural network model, and performing binary classification on the encoded tensor through a discriminator to output whether the current financial news data is a negative classification.

6. The financial news public opinion early warning and classification system based on the Electra deep neural network according to claim 5, characterized in that the labeling includes labeling the collected financial news data according to preset category tags; the category labels include abnormal financial business behavior, poor management, abnormal corporate operations, major corporate changes, significant negative information about borrowers other than business operations, and illegal behavior by borrowers.

7. The financial news public opinion early warning and classification system based on the Electra deep neural network according to claim 5, characterized in that the preprocessing includes cleaning the financial news data, removing invalid characters, and limiting the maximum text length input to 512 bytes. For the portion exceeding the maximum length, the data is input in segments and then sequentially input into the backend word segmenter to obtain text segments.

8. The financial news public opinion early warning and classification system based on the Electra deep neural network according to claim 5, characterized in that module M4 includes:
Module M4.1: Extracts each sentence from the identified negative financial news data text as input text, and obtains the corresponding sentence body through named entity recognition;
Module M4.2: Calculates the similarity between the main body and its corresponding text title using the following formula: Where B represents the mean word vector of the phrase in the text title, J_i represents the mean word vector of the main word in the i-th sentence of the current text, and S_i represents the similarity between the body of the i-th sentence in the current text and the text title corresponding to that body;
Module M4.3: Select the corresponding sentence with the highest similarity score as the topic sentence.

Citation Information

Patent Citations

  • Financial public opinion subdivision aspect detection method and device based on key information extraction

    CN114398480A

  • Financial same-industry public opinion analysis method and system based on deep learning algorithm

    CN111639183A

  • Financial news emotion analysis method and device, computer equipment and storage medium

    CN112380346A