Text correlation determination method and apparatus

A text relevance determination technique, applied in the field of Internet applications, that addresses the long comparison time caused by the large number of related texts, reducing the amount of calculation and increasing the speed of the relevance judgment.

Active Publication Date: 2017-03-01
BEIJING QIYI CENTURY SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

To remove noise texts with low relevance from the results, the prior art usually vectorizes the target text to be processed and compares its vector with the vectors of the related texts in the target field, obtaining the relevance between the target text and the target field. Because the number of related texts is large, comparing them one by one takes a long time.
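The one-by-one comparison described above can be sketched as follows. This is only an illustration under assumptions: the source does not name a similarity measure, so cosine similarity is assumed here, and the names `cosine` and `brute_force_relevance` as well as the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def brute_force_relevance(target_vec, field_vecs):
    """Compare the target with every related-text vector (assumed baseline).

    Returns the highest similarity found and the number of comparisons made,
    which grows linearly with the number of related texts.
    """
    best = max((cosine(target_vec, vec) for vec in field_vecs), default=0.0)
    return best, len(field_vecs)

# Toy vectors standing in for vectorized related texts (hypothetical data).
field_vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
score, comparisons = brute_force_relevance([1.0, 0.0, 0.0], field_vecs)
```

Every new target text costs one similarity computation per related text, which is the cost the invention aims to reduce.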


Embodiment Construction

[0050] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0051] In order to solve the problems in the prior art, the embodiments of the present invention provide a method and device for determining text relevance, which will be described in detail below.

[0052] It should be noted that, according to the target field, a large number of texts related to the target field can be obtained, that is, text samples of the target field. The text vector corresponding to each text in the obtained text samp...

Abstract

Embodiments of the invention disclose a text correlation determination method and apparatus. Text vectors corresponding to texts in an obtained text sample for a target field are pre-clustered, and a centroid vector of each category is calculated. The method comprises the steps of obtaining a text vector corresponding to a to-be-processed target text; calculating a degree of correlation between the to-be-processed target text and the centroid of each category according to the text vector corresponding to the to-be-processed target text and the centroid vector of each category; and determining a correlation between the to-be-processed target text and the target field according to the degree of correlation. By applying the text correlation determination method and apparatus provided by the embodiments of the invention, the speed of judging the correlation between the target text and the target field is increased.
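A minimal sketch of the flow in the abstract, under stated assumptions: the source does not specify the clustering algorithm, the similarity measure, or the final decision rule, so a simple k-means-style loop, cosine similarity, and a "best centroid score against a threshold" rule are assumed here. All function names, the threshold value, and the toy vectors are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cluster_centroids(sample_vecs, k, iterations=10, seed=0):
    """Offline step: pre-cluster the field sample and return k centroid vectors.

    Assumed k-means-style loop: assign each vector to the centroid with the
    highest cosine similarity, then recompute each centroid as the mean.
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(sample_vecs, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for vec in sample_vecs:
            nearest = max(range(k), key=lambda c: cosine(vec, centroids[c]))
            groups[nearest].append(vec)
        # Keep the previous centroid if a category received no vectors.
        centroids = [mean_vector(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def field_relevance(target_vec, centroids):
    """Online step: degree of correlation with each centroid; keep the best."""
    return max(cosine(target_vec, c) for c in centroids)

def is_relevant(target_vec, centroids, threshold=0.5):
    """Assumed decision rule: relevant if the best centroid score meets the threshold."""
    return field_relevance(target_vec, centroids) >= threshold

# Toy text vectors for a target field (hypothetical data).
sample_vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.95, 0.05, 0.0],
               [0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.05, 0.95, 0.0]]
centroids = cluster_centroids(sample_vecs, k=2)
```

With the centroids computed ahead of time, each target text is compared against k centroid vectors instead of every text in the sample, which is where the reduction in calculation comes from.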

Description

Technical field

[0001] The present invention relates to the field of Internet application technology, in particular to a method and device for determining text relevance.

Background technique

[0002] With the continuous development of Web technology, the era of big data has arrived, and machine learning based on big data has been applied in many fields such as medical care, education, transportation, and entertainment. Text is the most common data type; it consists of several words and usually comes from emails, short messages, Weibo, forum posts, etc. on the Internet. Determining the relevance of a target text to a target field is a common text data processing task.

[0003] Take keyword-based text capture as an example. If you search for film reviews related to the movie named "Left Ear", you may get both "I went to the theater to watch 'Left Ear' at the weekend, it was very beautiful" and irrelevant text such as "My left ear is uncomfortable, I need to see an ear doctor". Therefore, t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/3347, G06F16/35
Inventor: 沈一, 鲍昕平, 蔡龙军
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD