
Method and device for determining similarity between inquiry sentence and webpage, terminal and server

A technique for determining the similarity between a query sentence and a web page, applied in the field of data processing. It addresses the low web page recall rate, the large differences between the web page sets returned for semantically similar queries, and the resulting poor user search experience, thereby improving web page recall and the search experience.

Active Publication Date: 2015-02-04
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
Cites: 6; Cited by: 21

AI Technical Summary

Problems solved by technology

However, in the prior art, after the query sentence input by the user is obtained, no further processing is performed on it; instead, a hard matching method is used to directly calculate the correlation between the query sentence and each web page. As a result, on the one hand, the search engine's recall rate for web pages related to the query sentence is low; on the other hand, the web page sets that the search engine returns for query sentences with different wording but similar semantics differ considerably, and the user search experience is poor.
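The limitation described above can be illustrated with a small sketch. The text does not say which hard matching method the prior art uses, so the word-overlap (Jaccard) score below is only an assumed stand-in, and the function name and example sentences are hypothetical.

```python
def hard_match_score(query: str, title: str) -> float:
    """Assumed stand-in for hard matching: Jaccard overlap of the word sets."""
    query_words = set(query.lower().split())
    title_words = set(title.lower().split())
    if not query_words or not title_words:
        return 0.0
    return len(query_words & title_words) / len(query_words | title_words)


title = "how to repair a flat bicycle tire"

# Two queries with similar meaning but different wording.
literal_query = "repair flat bicycle tire"
paraphrased_query = "fix punctured bike wheel"

print(hard_match_score(literal_query, title))      # shares four words with the title
print(hard_match_score(paraphrased_query, title))  # shares no words with the title
```

Under this assumption the paraphrased query shares no words with the title and scores zero, even though a user would consider the page relevant. That is the recall problem the passage describes.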



Examples


Embodiment 1

[0047] Figure 1 is a schematic flowchart of a method for determining the similarity between a query sentence and a web page provided by Embodiment 1 of the present invention. This embodiment is applicable to calculating the similarity between the query sentence and a web page after the query sentence input by the user is obtained, so that the search engine can determine, based on the similarity, whether the web page can be used as a candidate web page in the query result, or rank the candidate web pages for the query sentence based on the similarity.

[0048] The method can be performed by a device for determining the similarity between a query sentence and a web page. The device can be the search engine itself, which provides web page search services to users, or a third-party server that calculates the similarity between query sentences and web pages for the search engine. Referring to Figure 1, the method provided in this embodiment specifically ...
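As a rough illustration of the two uses of the similarity named in paragraph [0047] (deciding whether a page is a candidate, and ranking the candidates), the sketch below filters pages by a threshold and sorts the rest. The function names, the threshold value and the placeholder similarity are all assumptions made for illustration; the text does not specify them.

```python
from typing import Callable, List, Tuple


def select_and_rank(
    query: str,
    topic_sentences: List[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.2,  # hypothetical cut-off, not taken from the source
) -> List[Tuple[str, float]]:
    """Keep pages whose similarity reaches the threshold, best first."""
    scored = [(topic, similarity(query, topic)) for topic in topic_sentences]
    kept = [(topic, score) for topic, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def toy_similarity(query: str, topic: str) -> float:
    """Placeholder score: fraction of query words that occur in the topic sentence."""
    query_words = query.lower().split()
    topic_words = set(topic.lower().split())
    if not query_words:
        return 0.0
    return sum(word in topic_words for word in query_words) / len(query_words)


pages = ["bicycle tire repair guide", "cheap flights to Paris", "tire pressure chart"]
for topic, score in select_and_rank("repair bicycle tire", pages, toy_similarity):
    print(f"{score:.2f}  {topic}")
```

The similarity function is passed in as a parameter so that any scoring method, including one based on translated candidate sentences, could be substituted without changing the selection and ranking logic.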

Embodiment 2

[0059] Figure 2 is a schematic flowchart of a method for determining the similarity between a query sentence and a web page provided by Embodiment 2 of the present invention. On the basis of Embodiment 1 above, this embodiment further adds the operation of "creating a phrase translation model". Referring to Figure 2, the method provided in this embodiment specifically includes the following operations:

[0060] Operation 210: Determine a corpus of translation bilingual pairs, in which the source-language sentence of each pair is a query sentence and the target-language sentence is a web page topic sentence.

[0061] Operation 220: Train on the corpus of translation bilingual pairs to create a phrase translation model. The input of the phrase translation model is a query sentence, and its output includes at least one candidate sentence with semantics similar to the input.

[0062] Operation 230, using the pre-created phrase transl...
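Operations 210 and 220 can be sketched in a heavily simplified form. The visible text does not describe the training procedure, so everything below is an assumption: the phrase pairs are taken as already aligned (a real phrase-based model would first extract them from the query/topic-sentence corpus), each "phrase" is a single word, translation probabilities are relative frequencies, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict, List, Tuple

PhraseTable = Dict[str, Dict[str, float]]


def estimate_phrase_table(aligned_pairs: List[Tuple[str, str]]) -> PhraseTable:
    """Estimate P(target | source) by relative frequency over aligned pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    table: PhraseTable = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        table[source] = {t: c / total for t, c in target_counts.items()}
    return table


def candidate_sentences(query: str, table: PhraseTable, top_k: int = 3) -> List[Tuple[str, float]]:
    """Translate word by word; unknown words are copied with probability 1."""
    options = [list(table.get(word, {word: 1.0}).items()) for word in query.split()]
    candidates = []
    for combination in product(*options):
        sentence = " ".join(target for target, _ in combination)
        probability = 1.0
        for _, p in combination:
            probability *= p
        candidates.append((sentence, probability))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:top_k]


# Hypothetical aligned pairs: query-side word -> topic-sentence-side word.
pairs = [("fix", "repair"), ("fix", "repair"), ("fix", "fix"),
         ("bike", "bicycle"), ("bike", "bike")]
table = estimate_phrase_table(pairs)
for sentence, probability in candidate_sentences("fix bike tire", table):
    print(f"{probability:.3f}  {sentence}")
```

Each candidate carries a translation probability, which is the quantity Embodiment 3 later combines with the similarity between the candidate and the topic sentence.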

Embodiment 3

[0076] Figure 3 is a schematic flowchart of a method for determining the similarity between a query sentence and a web page provided by Embodiment 3 of the present invention. On the basis of the above embodiments, this embodiment further refines the operation of "determining the similarity between the target query sentence and the web page topic sentence" into "determining the similarity between the target query sentence and the web page topic sentence according to the translation probability of the candidate sentence and the similarity between the candidate sentence and the web page topic sentence". Referring to Figure 3, the method provided in this embodiment specifically includes the following operations:

[0077] Operation 310: Translate the target query sentence into at least one candidate sentence with similar semantics through the pre-created phrase translation model.

[0078] Operation 320. Determine the similarity between the target que...
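One plausible reading of paragraph [0076] is a probability-weighted average of the candidate-to-topic similarities. The exact combination is not shown in the visible text, so the formula, the Jaccard sentence similarity and all names below are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Assumed sentence similarity: Jaccard overlap of the word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def query_topic_similarity(
    candidates: List[Tuple[str, float]],  # (candidate sentence, translation probability)
    topic_sentence: str,
    sentence_similarity: Callable[[str, str], float] = jaccard,
) -> float:
    """Average the candidate/topic similarities, weighted by translation probability."""
    total_probability = sum(p for _, p in candidates)
    if total_probability == 0:
        return 0.0
    weighted = sum(p * sentence_similarity(c, topic_sentence) for c, p in candidates)
    return weighted / total_probability


# Hypothetical candidates for the query "fix bike tire".
candidates = [("repair bicycle tire", 0.6), ("fix bike tire", 0.4)]
print(query_topic_similarity(candidates, "repair bicycle tire"))
```

Dividing by the total probability keeps the score in the range of the underlying sentence similarity even when only the top few candidates are retained. Taking the maximum over candidates instead of the weighted average would be another defensible choice.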


Abstract

The embodiment of the invention discloses a method and a device for determining the similarity between an inquiry sentence and a webpage, a terminal and a server. The method comprises the following steps: translating a target inquiry sentence into at least one candidate sentence having similar semantics through a pre-created phrase translation model; determining the similarity between the target inquiry sentence and a webpage topic sentence according to the similarity between the at least one candidate sentence and the webpage topic sentence, wherein the webpage topic sentence is a webpage title or a sentence for describing major webpage content obtained by resolving webpage content based on a set algorithm. By adopting the technical scheme provided by the embodiment, the webpage recall rate of any inquiry sentence by a search engine can be increased, the search engine can return webpage sets with small differences specific to inquiry sentences having different representation forms but similar semantics, and the user satisfaction of an inquiry result is improved.

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data processing, and in particular to a method, a device, a terminal and a server for determining the similarity between a query sentence and a web page.

Background

[0002] At present, when search engines provide users with search services, they usually first obtain the instruction entered by the user in the search bar, generate a query sentence based on the instruction, then calculate the correlation between the query sentence and a large number of web pages, and finally present the links to the web pages with high correlation to the user as the query result for the user to click and view.

[0003] Since the ranking of web pages based on correlation calculation directly determines the quality of the search engine and of the user experience, how to accurately and efficiently calculate the correlation between query statements and web page...


Application Information

IPC(8): G06F17/30
CPC: G06F16/3332, G06F16/95
Inventor: 呉先超
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD