
Text processing method and device, text classification method and device, equipment and storage medium

A text processing and text classification technology, applied in the Internet field, that addresses the problems of high manual maintenance cost and low text classification accuracy, and achieves improved model performance, recall, and precision, with high accuracy.

Pending Publication Date: 2022-03-15
BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
Cites: 0. Cited by: 1.

AI Technical Summary

Problems solved by technology

[0004] The present disclosure provides a text processing method, a text classification method, an apparatus, a device, and a storage medium, so as to at least solve the problems in the related art that, when long text is used to train a language model, manual maintenance costs are high or the performance of the trained language model is not ideal, and that the accuracy of text classification is consequently low.



Detailed Description of Embodiments

[0093] To enable persons of ordinary skill in the art to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings.

[0094] It should be noted that the terms "first" and "second" in the specification and claims of the present disclosure and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the disclosure described herein can be practiced in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consi...


Abstract

The invention relates to a text processing method and device, a text classification method and device, equipment, and a storage medium. The text processing method comprises: obtaining a to-be-processed text; when the length of the to-be-processed text is greater than a preset length, extracting a first sub-text of the preset length from the to-be-processed text; and, when a second sub-text contains preset characters, splicing the preset characters with a plurality of characters from the first sub-text to obtain a first target spliced text of the preset length, where the second sub-text is the part of the to-be-processed text other than the first sub-text. The method addresses the problem that the word count of a long text does not meet the requirements of the language model. A first target spliced text that contains both key characters representing the core content of the text and the subject name to be monitored can be extracted from the long text to train the model, which improves the performance of the model. The language model obtained by training therefore classifies text with higher accuracy.
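The steps in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The function name `build_target_text` and its parameters are hypothetical. The abstract does not say where the first sub-text is taken from or how the splice is ordered, so the sketch assumes the first sub-text is the leading `preset_length` characters and that the preset characters are placed in front of it, with trailing characters of the first sub-text dropped to keep the total length fixed.

```python
def build_target_text(text: str, preset_length: int, preset_chars: str) -> str:
    """Return a text of at most preset_length characters for model input.

    Assumptions (not stated in the source): the first sub-text is the
    leading slice, and the preset characters are prepended to it.
    """
    # Short texts already satisfy the length requirement.
    if len(text) <= preset_length:
        return text

    # First sub-text: preset_length characters taken from the text.
    first_sub = text[:preset_length]
    # Second sub-text: everything in the text outside the first sub-text.
    second_sub = text[preset_length:]

    # If the preset characters do not occur in the second sub-text (or they
    # would not fit), fall back to the plain first sub-text.
    if preset_chars not in second_sub or len(preset_chars) >= preset_length:
        return first_sub

    # Splice the preset characters with as many characters of the first
    # sub-text as fit, so the result has exactly preset_length characters.
    keep = preset_length - len(preset_chars)
    return preset_chars + first_sub[:keep]
```

For example, with a hypothetical subject name passed as `preset_chars`, a long article whose only mention of that name falls after the cut-off still yields a fixed-length training text that contains the name.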

Description

Technical field

[0001] The present disclosure relates to the technical field of the Internet, and in particular to a text processing method, a text classification method, an apparatus, a device, and a storage medium.

Background

[0002] Because the Internet is open and communicative, network public opinion monitoring requires network public opinion analysis reports. Public opinion analysis platforms generally collect comments, articles, news, and similar texts from the Internet and then classify them. Most texts on the network are long texts with a large number of words, and current machine learning algorithms are limited by machine memory and hardware configuration, so the entire content of a long text cannot be used to train a classification model. Therefore, when long text is input to the language model for training and classification, it is often necessary to prepr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/38, G06N20/00, G06F40/289
CPC: G06F16/35, G06F16/38, G06N20/00, G06F40/289
Inventor: 刘凡, 高旭宁, 张皓天, 温瀚翔, 张紫钰
Owner: BEIJING DAJIA INTERNET INFORMATION TECH CO LTD