Text representation method and device

A text representation method and device, applied in the computer field, that address the problems of ignored inter-word information, high ambiguity, and inaccurate representation of text information. The method enriches the information carried by the text representation and improves its accuracy.

Pending Publication Date: 2020-08-25
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] In the related art, text representation methods mainly take a character or a word directly as the smallest unit (the meta-unit), convert it into a vector representation, and then use a network to obtain the vector representation of the sentence text as a whole. However, taking a single character or a single word as the unit ignores the information between characters and words, and a single character or word taken alone can be highly ambiguous, so the text information cannot be represented accurately.


Examples


Embodiment approach

[0104] The first embodiment: obtaining the text vector representation of the text to be processed according to the fusion vector representation of each word segment specifically includes:

[0105] Obtaining the user portrait feature information of the user corresponding to the text to be processed; and obtaining the text vector representation of the text to be processed according to the fusion vector representation of each word segment and the user portrait feature information.

[0106] For example, short texts in the video field are often related to the corresponding video content, and different users have different attitudes towards the same video, such as like or dislike. Therefore, when the short text is represented as a vector, the user portrait information can be integrated into the representation of the short text. The user portrait feature information is, for example, age, occupation, hobby, gender, etc., which is not limited in this embodiment ...
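As a minimal sketch of how paragraphs [0105] and [0106] could be realized, the Python below pools the fused word-segment vectors and appends an encoded user portrait. The source does not say how the two are combined; the mean pooling, the concatenation, the age scaling, and the names `GENDERS`, `encode_user`, and `text_vector_with_user` are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]

# Hypothetical category list for one user portrait feature (assumption).
GENDERS = ["female", "male", "other"]


def encode_user(age: int, gender: str) -> Vector:
    """Encode a toy user portrait: age scaled to [0, 1] plus one-hot gender."""
    one_hot = [1.0 if gender == g else 0.0 for g in GENDERS]
    return [min(age, 100) / 100.0] + one_hot


def text_vector_with_user(fused: List[Vector], user: Vector) -> Vector:
    """Mean-pool the fused word-segment vectors, then append the user vector.

    Concatenation is an assumption; the source only states that the text
    vector is obtained from both kinds of information.
    """
    pooled = [sum(col) / len(fused) for col in zip(*fused)]
    return pooled + user


# Two fused word-segment vectors for a short text, plus one user portrait.
fused_segments = [[1.0, 3.0], [3.0, 5.0]]
user_vector = encode_user(age=25, gender="male")
print(text_vector_with_user(fused_segments, user_vector))
```

In a trained model the user features would more likely pass through learned embeddings; the fixed encoding here only keeps the example self-contained.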


Abstract

The invention relates to the technical field of computers, and in particular to a text representation method and device. The method comprises the steps of: obtaining the character vector representation of each character in a to-be-processed text; obtaining the original word vector representation of each segmented word in the to-be-processed text; fusing the character vector representation of each character with the original word vector representation of the corresponding segmented word to obtain a fused vector representation of each segmented word; and obtaining the text vector representation of the to-be-processed text according to the fused vector representation of each segmented word. Because character and word information are fused, the text representation carries richer information and its accuracy is improved.
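The four steps in the abstract can be sketched in Python as follows. The abstract does not specify the fusion or pooling operators, so this sketch assumes that a word's character vectors are averaged and concatenated with its word vector, and that the text vector is the mean of the fused vectors. The toy embedding tables and the names `fuse_word` and `text_vector` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise average of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fuse_word(word: str, char_emb: Dict[str, Vector],
              word_emb: Dict[str, Vector]) -> Vector:
    """Fuse a segmented word's vector with the vectors of its characters.

    Assumption: average the character vectors and concatenate the result
    onto the original word vector.
    """
    char_part = mean([char_emb[c] for c in word])
    return word_emb[word] + char_part


def text_vector(segments: List[str], char_emb: Dict[str, Vector],
                word_emb: Dict[str, Vector]) -> Vector:
    """Text vector as the mean of the fused vectors (pooling is an assumption)."""
    return mean([fuse_word(w, char_emb, word_emb) for w in segments])


# Toy lookup tables standing in for trained embeddings (hypothetical values).
char_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
word_emb = {"ab": [2.0, 2.0], "c": [4.0, 0.0]}

# The text "abc" segmented into the words "ab" and "c".
print(text_vector(["ab", "c"], char_emb, word_emb))
```

Concatenation keeps the character and word information in separate dimensions; a learned gate or attention layer would be an equally plausible reading of "fusing".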

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular, to a text representation method and apparatus.

Background

[0002] A text representation method is a method for vectorizing text. Representing text as a vector containing semantic information is helpful for applications such as classification, retrieval, and recommendation, so representing text accurately is essential.

[0003] In the related art, text representation methods mainly take a character or a word as the smallest unit (the meta-unit), convert it into a vector representation, and then use a network to obtain the vector representation of the entire sentence text. Taking a single character or a single word as the meta-unit ignores the information between characters and words, and a single character or word taken alone can be ambiguous, so the text information cannot be represented accurately.

Summary of the invention

[0004] Embodiments of the present ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F40/30
CPC: G06F16/31, G06F40/30
Inventor: 李伟康
Owner: TENCENT TECH (SHENZHEN) CO LTD