Government affair text classification and hotspot problem mining method and system based on machine learning
A text classification and machine learning technology, applied in the field of hot issue mining and government affairs text classification. It addresses problems such as the inability to represent semantic information, insufficient model performance, and a lack of objectivity in classification results, so as to improve classification accuracy, the clustering effect, and clustering efficiency and accuracy.
Examples
Embodiment 1
[0038] This embodiment discloses a machine learning-based method for classifying government affairs texts, as shown in Figure 1, including:
[0039] S1: Obtain multiple pieces of training government affairs text data and corresponding label data, and construct a dictionary of training government affairs texts, which includes each word and corresponding code in the training government affairs text data;
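The dictionary in step S1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `build_dictionary` and `encode`, the reserved `<PAD>` and `<UNK>` codes, and the frequency-based ordering are all hypothetical, since the source says only that each word has a corresponding code. The toy English tokens stand in for already-segmented government affairs text.

```python
from collections import Counter

def build_dictionary(tokenized_texts, min_count=1):
    """Map each word in the training texts to an integer code.

    Codes 0 and 1 are reserved for padding and unknown words. This is an
    assumption; the source only states that each word has a code.
    """
    counts = Counter(word for text in tokenized_texts for word in text)
    dictionary = {"<PAD>": 0, "<UNK>": 1}
    # More frequent words get smaller codes; ties are broken alphabetically
    # so that the mapping is deterministic.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            dictionary[word] = len(dictionary)
    return dictionary

def encode(tokens, dictionary):
    """Replace each token with its code, using the <UNK> code for unseen words."""
    return [dictionary.get(token, dictionary["<UNK>"]) for token in tokens]

# Toy, made-up training texts that have already been segmented into words.
texts = [["road", "repair", "delay"], ["school", "bus", "delay"]]
dictionary = build_dictionary(texts)
print(encode(["bus", "delay", "noise"], dictionary))  # prints [3, 2, 1]
```

Reserving a code for unknown words lets texts outside the training set be encoded with the same dictionary.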
[0040] The government affairs text data records the message number, message user, message subject, message time, and message details for each message. The training data also carries a first-level label, whereas the test data does not. The user's message details are extracted from the government affairs text document and then undergo data preprocessing, word segmentation, stop word removal, and similar operations. The data contains 7 label categories, namely urban and rural construction, environmental protection, transportation, education and sports,...
Embodiment 2
[0068] This embodiment provides a machine learning-based government affairs text clustering method, as shown in Figure 3, including:
[0069] S1: Obtain the message data set for clustering, and classify the data according to the classification method described in Embodiment 1;
[0070] The government affairs text data records the message number, message user, message subject, message time, message details, objections, and likes for each message; the user's message details are extracted from the government affairs text file and then undergo data preprocessing, word segmentation, stop word removal, and other operations.
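The extraction and preprocessing described above might look like the following minimal sketch. Everything specific here is an assumption for illustration: the column names, the single sample row, the stop-word list, and the `preprocess` helper are hypothetical, and whitespace splitting stands in for real word segmentation (for Chinese text, a dedicated segmenter such as the jieba library would take its place).

```python
import csv
import io
import re

# Hypothetical stop-word list; a real system would load a much larger one.
STOP_WORDS = {"the", "a", "is", "are", "on", "of", "and", "to"}

def preprocess(details, stop_words=STOP_WORDS):
    """Clean one message, split it into words, and drop stop words."""
    text = details.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    tokens = text.split()                 # placeholder for word segmentation
    return [token for token in tokens if token not in stop_words]

# Made-up sample row; the column names are assumptions, not the patent's schema.
SAMPLE_CSV = """message_id,user,subject,time,details,objections,likes
1,U001,Street lights,2019-05-01,"The street lights on Main Road are broken.",0,3
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
tokens = [preprocess(row["details"]) for row in rows]
print(tokens[0])  # prints ['street', 'lights', 'main', 'road', 'broken']
```

Only the message details column is tokenized here; the other fields are kept in `rows` unchanged.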
[0071] In this example, the message information of 4326 message users forms the initial data set, which is stored in CSV format, as shown in Table 2.
[0072] Table 2 Evaluation information table of popularity of user messages
[0073]
[0074] After the message data set for clustering is obtain...
Embodiment 3
[0137] The purpose of this embodiment is to provide an electronic device.
[0138] An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the method for classifying government affairs texts of Embodiment 1 or the method for mining hot issues in government affairs texts of Embodiment 2.


