Text processing method and device, model training method and device, equipment and storage medium

A text processing technology applied to neural learning methods, biological neural network models, unstructured text data retrieval, and related fields, achieving the effect of refining the processing granularity.

Pending Publication Date: 2022-07-01
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0003] In the prior art, for long text data whose description is obscure and whose semantics are difficult to understand, it is often only possible to roughly identify whether the long text data as a whole is of a sensitive type; the specific location of the sensitive content within the long text data cannot be identified. As a result, the classification granularity of long text data is relatively coarse, which in turn makes the processing granularity of long text data relatively coarse.

Embodiment Construction

[0049] To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.

[0050] The terms used in the embodiments of the present invention are only for the purpose of describing specific embodiments and are not intended to limit the present invention. The singular forms "a," "said," and "the" as used in the embodiments of the present invention and the appended claims are intended to include the plural fo...

Abstract

The embodiments of the invention provide a text processing method and device, a model training method and device, equipment, and a storage medium. The method first obtains a to-be-detected text containing a plurality of statements and extracts the statement feature vector corresponding to each statement. The type of each statement is determined from its statement feature vector, and the type of the whole to-be-detected text is then determined from the types of its statements. Classification is thus performed at two levels: the method determines whether the text as a whole involves sensitive content and, at the same time, which specific statements in the text involve it, which refines the classification granularity of the text. In practical applications, both the statement-level and the document-level classification results can be output to the user, so that the user can process either the whole text or individual statements according to the result at the corresponding level, which also refines the processing granularity of long text data.
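
The two-level flow described in the abstract can be illustrated with a minimal sketch. It is not the patented model: the keyword list, the count-based feature vector, the threshold classifier, and the "any sensitive statement makes the document sensitive" aggregation rule are all assumptions introduced here for illustration, and every function name is hypothetical. The source states only that statement types are derived from statement feature vectors and that the document type is derived from the statement types.

```python
import re
from typing import List, Tuple

# Hypothetical keyword list standing in for a trained model; not from the source.
SENSITIVE_TERMS = ("gambling", "violence")


def split_statements(text: str) -> List[str]:
    """Naively split a text into statements on sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def statement_features(statement: str) -> List[float]:
    """Toy feature vector: one occurrence count per term in SENSITIVE_TERMS."""
    lowered = statement.lower()
    return [float(lowered.count(term)) for term in SENSITIVE_TERMS]


def classify_statement(features: List[float]) -> str:
    """Statement-level type, decided from the feature vector alone."""
    return "sensitive" if sum(features) > 0 else "normal"


def classify_document(statement_labels: List[str]) -> str:
    """Document-level type, decided from the statement-level types.

    Assumed rule: one sensitive statement makes the whole text sensitive.
    """
    return "sensitive" if "sensitive" in statement_labels else "normal"


def process_text(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the document label plus a (statement, label) pair per statement."""
    statements = split_statements(text)
    labels = [classify_statement(statement_features(s)) for s in statements]
    return classify_document(labels), list(zip(statements, labels))


if __name__ == "__main__":
    doc_label, per_statement = process_text(
        "The weather is nice. This site promotes gambling. See you soon!"
    )
    print(doc_label)
    for statement, label in per_statement:
        print(f"{label}: {statement}")
```

Returning both levels from `process_text` mirrors the abstract's point that a user can act on the whole text or only on the flagged statements.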

Description

Technical field

[0001] The present invention relates to the field of artificial intelligence, and in particular to text processing methods, model training methods, devices, equipment and storage media.

Background

[0002] The massive text data on the Internet often contains sensitive content such as pornography, violence, gambling, and drugs. To limit the widespread dissemination of such sensitive content, it needs to be accurately identified and dealt with accordingly.

[0003] In the prior art, for long text data whose description is obscure and whose semantics are difficult to understand, it is often only possible to roughly identify whether the long text data as a whole is of a sensitive type; the specific location of the sensitive content within the long text data cannot be identified. As a result, the classification granularity of long text data is relatively coarse, which in turn makes the processing granularity of long text data relatively coarse.

SUMMARY OF...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06N3/04, G06N3/08
CPC: G06F16/35, G06N3/08, G06N3/044, G06N3/045
Inventor: 宋凯嵩, 孙常龙, 康杨杨, 刘晓钟, 林君
Owner: ALIBABA GRP HLDG LTD