Government affair text classification and hotspot problem mining method and system based on machine learning

A text classification and machine learning technology, applied to hot issue mining and government affairs text classification. It addresses the inability of existing word encodings to represent semantic information, insufficient model performance, and the lack of objectivity in classification results, with the aim of improving clustering quality, clustering efficiency, and accuracy.

Pending Publication Date: 2020-11-27
SHANDONG NORMAL UNIV

AI Technical Summary

Problems solved by technology

Current text classification methods usually encode words with a general-purpose dictionary. This approach ignores the context of the text and treats each word as independent, so it cannot represent semantic information and model performance suffers. In addition, clustering algorithms usually require the number of categories to be specified subjectively, so the resulting classification lacks objectivity.

Examples

Embodiment 1

[0038] This embodiment discloses a machine learning-based method for classifying government affairs texts, as shown in Figure 1, including:

[0039] S1: Obtain multiple pieces of training government affairs text data and the corresponding label data, and construct a dictionary from the training texts that maps each word in the training data to its code;
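
The dictionary construction in S1 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names `build_dictionary` and `encode`, the reserved `<pad>` and `<unk>` entries, and the frequency-based ordering of codes are all assumptions made for the example.

```python
from collections import Counter

def build_dictionary(tokenized_texts, min_count=1):
    """Map each word in the training corpus to an integer code.

    Codes 0 and 1 are reserved (an assumption of this sketch) for padding
    and for words that never appeared in the training texts.
    """
    counts = Counter(word for tokens in tokenized_texts for word in tokens)
    dictionary = {"<pad>": 0, "<unk>": 1}
    # Most frequent words get the smallest codes; ties are broken alphabetically.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            dictionary[word] = len(dictionary)
    return dictionary

def encode(tokens, dictionary):
    """Replace each word by its code, falling back to the <unk> code."""
    return [dictionary.get(word, dictionary["<unk>"]) for word in tokens]

# Toy, already-segmented training texts (invented for illustration).
training_texts = [
    ["road", "repair", "delayed"],
    ["school", "bus", "route"],
    ["road", "noise", "at", "night"],
]
dictionary = build_dictionary(training_texts)
print(encode(["road", "bus", "flooding"], dictionary))  # prints [2, 4, 1]
```

Because the dictionary is built from the government affairs corpus itself, every code corresponds to a word that actually occurs in the domain; unseen words such as "flooding" fall back to the `<unk>` code.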

[0040] The government affairs text data records the message number, message user, message subject, message time, and message details of each message user. The training data also carry a first-level label, whereas the test data do not. From the content of the government affairs text document, the user's message details are extracted and then processed with operations such as data preprocessing, word segmentation, and stop word removal. There are 7 categories of labels in the data, namely urban and rural construction, environmental protection, transportation, education and sports,...
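
The cleaning, segmentation, and stop word removal steps described above might look like the sketch below. It is an illustration under stated assumptions: the `preprocess` function and the `STOP_WORDS` set are hypothetical, and whitespace splitting stands in for word segmentation, whereas Chinese message text would need a dedicated segmenter.

```python
import re

# Hypothetical stop word list; a real one would be loaded from a file.
STOP_WORDS = {"the", "a", "is", "to", "of", "and", "in", "on"}

def preprocess(message_details, stop_words=STOP_WORDS):
    """Clean one message and return its tokens without stop words."""
    text = message_details.lower()
    # Data cleaning: replace anything that is not a word character or whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # Word segmentation: whitespace splitting stands in for a real segmenter.
    tokens = text.split()
    # Stop word removal.
    return [t for t in tokens if t not in stop_words]

print(preprocess("The street lamp on Main Road is broken, please repair it!"))
# prints ['street', 'lamp', 'main', 'road', 'broken', 'please', 'repair', 'it']
```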

Embodiment 2

[0068] This embodiment provides a machine learning-based government affairs text clustering method, as shown in Figure 3, including:

[0069] S1: Obtain the message data set for clustering, and classify the data according to the classification method described in Embodiment 1;

[0070] The government affairs text data records the message number, message user, message subject, message time, message details, objections, and likes of each message user. From the content of the government affairs text file, the user's message details are extracted and then processed with data preprocessing, word segmentation, stop word removal, and other operations.
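
The abstract states that, after classification, the questions in each class are clustered and the number of clusters is derived from similarity rather than fixed in advance. The sketch below shows one way such a flow could work; it is not taken from the patent. Single-pass threshold clustering over cosine similarity is a swapped-in technique, and the function names and the 0.5 threshold are assumptions.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_by_similarity(token_lists, threshold=0.5):
    """Single-pass clustering: the number of clusters is not fixed in advance."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, tokens in enumerate(token_lists):
        vec = Counter(tokens)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            # Nothing is similar enough, so this message starts a new cluster.
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # summed counts give the centroid direction
            best["members"].append(i)
    return [c["members"] for c in clusters]

def cluster_within_classes(messages, predicted_classes, threshold=0.5):
    """Group messages by predicted class, then cluster inside each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(predicted_classes):
        by_class[label].append(idx)
    result = {}
    for label, indices in by_class.items():
        groups = cluster_by_similarity([messages[i] for i in indices], threshold)
        result[label] = [[indices[j] for j in g] for g in groups]
    return result

# Toy, already-segmented messages and class predictions (invented for illustration).
messages = [
    ["bus", "route", "12", "late"],
    ["bus", "route", "12", "cancelled"],
    ["river", "pollution", "smell"],
    ["traffic", "light", "broken"],
]
predicted = ["transportation", "transportation", "environmental protection", "transportation"]
print(cluster_within_classes(messages, predicted))
# prints {'transportation': [[0, 1], [3]], 'environmental protection': [[2]]}
```

Restricting clustering to one class at a time keeps unrelated topics apart, and the threshold, rather than a preset cluster count, decides how many groups emerge.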

[0071] In this example, the message information of 4326 message users forms the initial data set, which is stored in CSV format, as shown in Table 2.

[0072] Table 2: Popularity evaluation information for user messages

[0073]

[0074] After the message data set for clustering is obtain...

Embodiment 3

[0137] The purpose of this embodiment is to provide an electronic device.

[0138] An electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the government affairs text classification method of Embodiment 1 or the government affairs text hot issue mining method of Embodiment 2.

Abstract

The invention discloses a machine learning-based method and system for government affairs text classification and hot issue mining. The method comprises the following steps: obtaining multiple pieces of training government affairs text data and the corresponding labels, and constructing a coding dictionary; obtaining vector representations of the training text data based on the coding dictionary; encoding the label data to obtain a vector representation of each label; and training a government affairs text classification model with a machine learning model from the vector representations of the text data and the corresponding labels, the trained model then being used to classify government affairs texts. Because the dictionary is constructed from the government affairs texts themselves, and text coding and vector representation are based on that dictionary, the classification accuracy for government affairs texts can be improved. On the basis of the classification, the questions in each class are clustered, and the number of clusters is calculated from similarity, which can further improve the clustering of government affairs texts.
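
The steps listed in the abstract (vector representation of texts from the dictionary, vector representation of labels, model training) can be sketched as below. This is a minimal illustration under loud assumptions: bag-of-words count vectors, one-hot labels, and softmax regression trained by batch gradient descent are stand-ins chosen for brevity, not the model described in the patent, and all names and toy data are invented.

```python
import math

def text_to_vector(tokens, dictionary):
    """Bag-of-words count vector indexed by the dictionary codes."""
    vec = [0.0] * len(dictionary)
    for word in tokens:
        if word in dictionary:
            vec[dictionary[word]] += 1.0
    return vec

def one_hot(label, labels):
    """Vector representation of a label: 1 at the label's index, 0 elsewhere."""
    return [1.0 if label == l else 0.0 for l in labels]

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def train(X, Y, epochs=200, lr=0.5):
    """Fit softmax regression with plain batch gradient descent."""
    n_classes, n_features = len(Y[0]), len(X[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in range(n_classes)]
        for x, y in zip(X, Y):
            p = predict_proba(weights, x)
            for k in range(n_classes):
                err = p[k] - y[k]  # gradient of cross-entropy w.r.t. the class score
                for j, xj in enumerate(x):
                    grads[k][j] += err * xj
        for k in range(n_classes):
            for j in range(n_features):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights

# Toy dictionary, labels, and texts, invented for illustration only.
dictionary = {"road": 0, "bus": 1, "traffic": 2, "river": 3, "pollution": 4, "smoke": 5}
labels = ["transportation", "environmental protection"]
train_texts = [["road", "traffic"], ["bus", "road"], ["river", "pollution"], ["smoke", "pollution"]]
train_labels = ["transportation", "transportation",
                "environmental protection", "environmental protection"]

X = [text_to_vector(t, dictionary) for t in train_texts]
Y = [one_hot(l, labels) for l in train_labels]
weights = train(X, Y)

def classify(tokens):
    probs = predict_proba(weights, text_to_vector(tokens, dictionary))
    return labels[probs.index(max(probs))]

print(classify(["bus", "traffic"]))  # prints transportation
print(classify(["river", "smoke"]))  # prints environmental protection
```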

Description

Technical field

[0001] The present disclosure relates to the technical field of text data mining, and in particular to a machine learning-based government affairs text classification and hot issue mining method and system.

Background

[0002] The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.

[0003] With the development of network technology, people can obtain the latest information and express their thoughts or suggestions at any time through online platforms such as Weibo, WeChat, the mayor's mailbox, and the Sunshine Hotline. This broadens the channels through which people can report problems, and departments can keep track of public sentiment at any time so as to provide better services. However, the continuous increase in the amount of text data related to various social conditions and public opinions has brought great challenges to the work of relevant departments that mainl...

Application Information

IPC(8): G06F16/35, G06N3/04
CPC: G06F16/353, G06N3/048, G06N3/045
Inventor: 王红李威张慧庄鲁贺韩书杨杰杨雪王正军李刚刘鹏
Owner: SHANDONG NORMAL UNIV