Long text processing method and device, equipment and storage medium

A long text processing method, applied in the fields of electrical digital data processing, natural language data processing, instruments, etc.

Pending Publication Date: 2022-05-13
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD

AI Technical Summary

Problems solved by technology

Current Transformer-based pre-training models limit the length of text that can be input into the model at one time. For example, for a 12-layer or 24-layer Transformer structure, the length of text processed at a time does not exceed 512 words. Therefore, for NLP tasks that need to handle long text, a length of 512 cannot meet the demand.

Embodiment Construction

[0049] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments of the present invention fall within the protection scope of the present invention.

[0050] To enable the processing of long text, the embodiments of the present invention provide a long text processing method, device, equipment, and storage medium.

[0051] A long text processing method provided by an embodiment of the present invention comprises:

[0052] Obtaining the long text to be processed and dividing the long text into multiple text blocks, where the length of each text block is determined according to the preset text ...
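
The splitting step in [0052] can be illustrated with a minimal sketch. It assumes the text has already been tokenized into a list, and it uses a block length of 512 only because that is the single-pass limit cited above; the actual rule for choosing the block length is truncated in the source. The names `split_into_blocks` and `max_len` are hypothetical and do not come from the patent.

```python
from typing import List


def split_into_blocks(tokens: List[str], max_len: int = 512) -> List[List[str]]:
    """Split a token sequence into consecutive blocks of at most max_len tokens.

    max_len stands in for the length limit of the preset text processing
    model (an assumption; the source does not give the exact rule).
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


# Toy input: 1200 placeholder tokens standing in for a long text.
tokens = [f"w{i}" for i in range(1200)]
blocks = split_into_blocks(tokens)
block_lengths = [len(block) for block in blocks]  # two full blocks and a remainder
```

Fixed-size, non-overlapping blocks are the simplest choice; splitting on sentence boundaries or using overlapping windows would be alternatives that the excerpt neither confirms nor rules out.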

Abstract

The embodiment of the invention provides a long text processing method, device, equipment, and storage medium. The method comprises: obtaining a long text to be processed and segmenting the long text into a plurality of text blocks, where the length of each text block is determined according to a preset text processing model; for each text block, encoding the text block to obtain a feature vector corresponding to the text block; obtaining a candidate feature vector, where the candidate feature vector is obtained by encoding a target text; calculating a weight value corresponding to each text block based on the feature vector corresponding to each text block and the candidate feature vector; and obtaining a feature vector of the long text based on the feature vector and the weight value corresponding to each text block, and taking the feature vector of the long text as the processing result of the long text. According to the embodiment of the invention, long text can be processed.
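
The weighting and aggregation steps can be sketched as follows. The abstract says only that each block's weight is computed from the block's feature vector and the candidate feature vector; the dot-product score followed by a softmax used here is an assumption, not the patent's stated formula. The block vectors are toy values standing in for encoder output, and `block_weights` and `aggregate` are hypothetical names.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def block_weights(block_vecs: List[Vector], candidate: Vector) -> List[float]:
    # Assumption: score each block by its dot product with the candidate
    # vector, then normalize the scores so the weights sum to 1.
    return softmax([dot(v, candidate) for v in block_vecs])


def aggregate(block_vecs: List[Vector], weights: List[float]) -> Vector:
    """Weighted sum of block vectors, giving one vector for the long text."""
    dim = len(block_vecs[0])
    return [sum(w * v[d] for w, v in zip(weights, block_vecs)) for d in range(dim)]


# Toy feature vectors for three text blocks and one candidate (target) text.
block_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidate = [1.0, 0.0]
weights = block_weights(block_vecs, candidate)
long_text_vec = aggregate(block_vecs, weights)
```

With these toy values, the blocks that align with the candidate vector receive larger weights, so the aggregated vector leans toward the content most relevant to the target text.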

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, in particular to a long text processing method, device, equipment, and storage medium.

Background

[0002] With the development of artificial intelligence, natural language processing (NLP) technology has been widely used in many scenarios, such as sentiment analysis, text similarity calculation, comment opinion extraction, text classification, and lexical analysis. All of these scenarios require text processing. At present, most text processing models are trained or adjusted on the basis of an already trained pre-training model. Specifically, the parameters of the pre-training model are fine-tuned on downstream tasks to obtain text processing models for different scenarios, and these models are then used to process the text. The downstream tasks can be, for example, te...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/279, G06F40/253, G06F40/30
CPC: G06F40/279, G06F40/253, G06F40/30
Inventor: 白金国, 李长亮, 李小龙
Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD