
Text processing method and device, storage medium and electronic device

A text processing technology, applied in the field of text processing, that addresses the difficulty of accurately reflecting natural language features and improves the vectorized representation of text.

Active Publication Date: 2020-01-17
BEIJING DA MI TECH CO LTD
Cites: 9; Cited by: 13

AI Technical Summary

Problems solved by technology

Calculating text similarity with an end-to-end model requires labeling a large number of text pairs. Because the understanding and labeling of semantics vary from person to person, such labels struggle to accurately reflect the characteristics of natural language.



Detailed Description of Embodiments

[0041] The present invention is described below on the basis of examples, but it is not limited to these examples. The following detailed description sets forth certain specific details; those skilled in the art can fully understand the invention even without them. To avoid obscuring the essence of the invention, well-known methods, procedures, processes, components, and circuits are not described in detail.

[0042] Additionally, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.

[0043] Unless the context clearly requires otherwise, words such as "comprising" and "including" throughout the application documents should be interpreted in an inclusive sense rather than an exclusive or exhaustive sense; that is, as meaning "including but not limited to".

[0044] In the descr...


Abstract

The invention discloses a text processing method and device, a storage medium, and an electronic device. The method comprises: obtaining a first word sequence corresponding to a first text and a second word sequence corresponding to a second text; determining a first word vector set and a first word weight set corresponding to the first word sequence, and a second word vector set and a second word weight set corresponding to the second word sequence; and calculating the similarity between the first text and the second text based on the first word vector set, the first word weight set, the second word vector set, and the second word weight set. In this way, the weight of each word in the corpus can be obtained adaptively for different scenarios. The word weights are then used to compose the text vector, so that the contribution of each word to the text's semantics is fully reflected and the vectorized representation of the text is improved.
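The abstract does not say how the word vectors and word weights are combined or which similarity measure is used. As a minimal sketch, assuming a weighted average of word vectors followed by cosine similarity (both are assumptions, not the patented method), the data flow could look like this. The names `text_vector`, `cosine`, and `text_similarity` are hypothetical, and the weights are supplied by hand rather than learned adaptively.

```python
import math

def text_vector(words, vectors, weights):
    """Weighted average of word vectors (assumed combination rule)."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in words:
        if word not in vectors:
            continue  # skip out-of-vocabulary words
        w = weights.get(word, 1.0)  # default weight 1.0 is an assumption
        for i, x in enumerate(vectors[word]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def text_similarity(words1, words2, vectors, weights):
    """Similarity of two word sequences from their weighted text vectors."""
    v1 = text_vector(words1, vectors, weights)
    v2 = text_vector(words2, vectors, weights)
    return cosine(v1, v2)
```

With toy two-dimensional vectors, a higher weight on a word pulls the text vector toward that word, so a text containing a heavily weighted shared word scores closer to another text built from that word.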

Description

Technical field

[0001] The present invention relates to the technical field of text processing, and in particular to a text processing method and device, a storage medium, and an electronic device.

Background

[0002] With the in-depth research and productization of natural language processing technology, text similarity calculation has been widely used in many scenarios, such as information retrieval, intelligent question answering, multi-round dialogue, and recommendation systems.

[0003] At present, text vectorization methods fall into two categories: those based on statistical models and those based on deep learning. The statistical methods are mainly the bag-of-words model, in which each feature can be constructed with the TF-IDF (term frequency–inverse document frequency) algorithm or BM25 (an algorithm for evaluating the relevance between search terms and documents). M...
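To make the TF-IDF weighting named in paragraph [0003] concrete, here is a minimal sketch of one common variant: relative term frequency multiplied by the natural log of the inverse document frequency. The function name `tf_idf` and the exact formula are assumptions; the source names the algorithm but does not give a formula.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

Under this variant, a term that appears in every document gets a score of zero. This shows the limitation of purely statistical weights: they depend only on counts, not on the semantics of the scenario.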


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06K9/62, G06F40/30
CPC: G06F16/35, G06F18/2411
Inventor: 王鹏, 王永会, 孙海龙
Owner: BEIJING DA MI TECH CO LTD