Creation method and device for web page scoring model

The technology relates to web pages and scoring models in the computer field. It addresses problems such as insufficiently learned web page features, features that play too small a role in the model, and unreasonable web page scoring models, with the effect of improving ranking accuracy and the search experience.

Active Publication Date: 2015-02-18
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, because of the limitations of the training sample set, some features of web pages may be learned insufficiently during training. The resulting web page scoring model is unreasonable, which greatly reduces the accuracy of the sorting results.
For example, a traditional GBRank model does not sufficiently learn the omission feature of web pages, so this feature plays too small a role in the model.
When such a model sorts a group of webpages under a given query word, it can easily rank a webpage with a smaller omission feature value and poorer relevance ahead of a webpage with a larger omission feature value and better relevance, which seriously affects the user's search experience.
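
The ranking error described above can be made concrete by counting misordered pairs. The following is a minimal sketch, not part of the patent: the function name `misordered_pairs` and the toy scores are hypothetical, and it simply lists the pairs in which a page annotated as more relevant does not receive a higher model score.

```python
from typing import List, Sequence, Tuple


def misordered_pairs(
    model_scores: Sequence[float],
    annotation_scores: Sequence[int],
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where page i is annotated as more relevant
    than page j, yet the model does not score page i higher than page j."""
    n = len(model_scores)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if annotation_scores[i] > annotation_scores[j]
        and model_scores[i] <= model_scores[j]
    ]


# Toy data (assumed): three pages under one query word.
annotation_scores = [3, 1, 2]      # human relevance labels
model_scores = [0.4, 0.9, 0.6]     # scores from a poorly trained model

bad_pairs = misordered_pairs(model_scores, annotation_scores)
```

Every entry in `bad_pairs` is a pair the search product would present in the wrong order; a better scoring model produces fewer such pairs.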


Examples


Embodiment 1

[0023] Figure 1A is a schematic flowchart of a method for creating a webpage scoring model provided by Embodiment 1 of the present invention. The webpage scoring model created in this embodiment can score multiple webpages under any input query; the search product can then sort the webpages according to their scores and present the link information of the sorted webpages to the user. Figure 1B shows an application scenario, webpage ranking, for the method provided in Embodiment 1. Referring to Figure 1B, the basic flow of webpage sorting can be divided into two processes: offline training and online prediction. The input of offline training is a webpage training sample set, from which the learning system produces a webpage scoring model. The learning system uses the method provided in this embodiment. A method for creating a webpa...
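
The online prediction half of this flow (score each candidate webpage, sort by score, return the links) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `score_page`, its fixed linear weights, and the example URLs and feature values are hypothetical stand-ins for a trained webpage scoring model and real candidate pages.

```python
from typing import Callable, List, Sequence, Tuple

Page = Tuple[str, Sequence[float]]  # (link, feature vector)


def score_page(features: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained webpage scoring model.

    A real model (for example GBRank) would be produced by offline training;
    here a fixed linear scorer with assumed weights is used instead.
    """
    weights = [0.7, 0.3]  # assumed, for illustration only
    return sum(w * x for w, x in zip(weights, features))


def rank_pages(
    pages: List[Page],
    scorer: Callable[[Sequence[float]], float] = score_page,
) -> List[str]:
    """Score every candidate page for one query and return links, best first."""
    scored = [(scorer(features), link) for link, features in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [link for _, link in scored]


# Candidate webpages already retrieved for one query (assumed data).
candidates = [
    ("http://example.com/a", [0.2, 0.9]),
    ("http://example.com/b", [0.8, 0.4]),
    ("http://example.com/c", [0.5, 0.5]),
]
ranked_links = rank_pages(candidates)
```

Passing the scorer as a parameter mirrors the split in the text: offline training produces the model, and online prediction only applies it.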

Embodiment 2

[0042] Figure 2 is a schematic flowchart of a method for creating a webpage scoring model provided by Embodiment 2 of the present invention. This embodiment optimizes operation 120 on the basis of Embodiment 1 above. Referring to Figure 2, the method provided in this embodiment specifically includes the following operations:

[0043] Operation 210. Acquire a webpage training sample set, wherein the webpage training sample set includes feature vectors and annotation scores of a plurality of sample webpages under each query word in at least one preset query word.

[0044] Operation 220. Obtain an original loss function derived from the annotation scores of each sample webpage in the webpage training sample set.

[0045] In this embodiment, the original loss function is generated in advance according to the annotation scores of each sample webpage in the webpage training sample set, and then according to the machine learning algorithm, the characteristic...
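
As an illustration of Operations 210 and 220, the sketch below represents a training sample set and computes a pairwise loss from the annotation scores. The excerpt does not give the form of the original loss function, so the squared hinge loss over preference pairs used here (the form commonly associated with GBRank) is an assumption, as are the names `SampleWebpage`, `original_loss`, and `margin`, and the toy data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class SampleWebpage:
    feature_vector: List[float]  # one value per webpage feature
    annotation_score: int        # human-assigned relevance label


# Maps each preset query word to its annotated sample webpages.
TrainingSampleSet = Dict[str, List[SampleWebpage]]


def original_loss(
    training_set: TrainingSampleSet,
    scorer: Callable[[Sequence[float]], float],
    margin: float = 1.0,
) -> float:
    """Assumed pairwise squared hinge loss.

    For every pair under the same query word in which one page has a higher
    annotation score, penalize the scorer unless it separates the two pages
    by at least `margin`.
    """
    total = 0.0
    for samples in training_set.values():
        for better in samples:
            for worse in samples:
                if better.annotation_score > worse.annotation_score:
                    gap = scorer(better.feature_vector) - scorer(worse.feature_vector)
                    total += 0.5 * max(0.0, margin - gap) ** 2
    return total


training_set: TrainingSampleSet = {
    "query one": [SampleWebpage([2.0, 0.0], 3), SampleWebpage([0.0, 1.0], 1)],
    "query two": [SampleWebpage([1.0, 1.0], 2), SampleWebpage([1.0, 0.0], 0)],
}


def first_feature(features: Sequence[float]) -> float:
    """Toy scorer that looks only at the first feature."""
    return features[0]


loss_first = original_loss(training_set, first_feature)  # cannot separate "query two"
loss_sum = original_loss(training_set, sum)              # separates both pairs
```

The toy scorer that ignores the second feature is penalized on the pair it cannot separate, while the scorer that uses both features incurs no loss on this data.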

Embodiment 3

[0084] Figure 3 is a schematic flowchart of a method for creating a webpage scoring model provided by Embodiment 3 of the present invention. On the basis of Embodiments 1 and 2 above, this embodiment further adds the operations of updating the action coefficient in the target loss function and creating a new webpage scoring model. Referring to Figure 3, the method provided in this embodiment specifically includes the following operations:

[0085] Operation 310. Acquire a webpage training sample set, wherein the webpage training sample set includes feature vectors and annotation scores of a plurality of sample webpages under each query word in at least one preset query word.

[0086] Operation 320. Obtain an original loss function derived from the annotation scores of each sample webpage in the webpage training sample set.

[0087] Operation 330. Determine the decision factor in the original loss function used to measure the degree of difference between ...


Abstract

The embodiments of the invention disclose a method and a device for creating a web page scoring model. The method comprises: acquiring a web page training sample set, wherein the set comprises feature vectors and annotation scores of a plurality of sample web pages under each of at least one preset query word; generating a target loss function according to the annotation score of each sample web page in the training sample set and at least one pre-determined web page feature to be adjusted; and creating the web page scoring model according to the generated target loss function and the feature vector of each sample web page in the training sample set. The technical scheme provided by the embodiments can improve the accuracy of web page ranking results and the user's search experience.
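
One way a loss function could take both the annotation scores and a pre-determined web page feature into account is sketched below. This is purely illustrative: the abstract does not state the formula, so the margin adjustment, the function name `target_loss`, and the parameters `feature_index`, `coefficient`, and `margin` are all assumptions, not the patent's method.

```python
from typing import Callable, List, Sequence, Tuple

# Each pair is (features of the better-annotated page, features of the worse one).
PreferencePair = Tuple[Sequence[float], Sequence[float]]


def target_loss(
    pairs: List[PreferencePair],
    scorer: Callable[[Sequence[float]], float],
    feature_index: int,
    coefficient: float,
    margin: float = 1.0,
) -> float:
    """Assumed pairwise squared hinge loss with a feature-dependent margin.

    When the better-annotated page also has the larger value of the chosen
    feature, the required score gap grows by `coefficient` times the feature
    difference, so a scorer that ignores the feature is penalized more.
    """
    total = 0.0
    for better, worse in pairs:
        feature_gap = better[feature_index] - worse[feature_index]
        pair_margin = margin + coefficient * max(0.0, feature_gap)
        score_gap = scorer(better) - scorer(worse)
        total += 0.5 * max(0.0, pair_margin - score_gap) ** 2
    return total


def ignores_second_feature(features: Sequence[float]) -> float:
    """Toy scorer that never looks at feature 1."""
    return features[0]


# One preference pair (assumed data); the pages differ only in feature 1.
pairs: List[PreferencePair] = [([1.0, 0.75], [1.0, 0.25])]

plain = target_loss(pairs, ignores_second_feature, feature_index=1, coefficient=0.0)
boosted = target_loss(pairs, ignores_second_feature, feature_index=1, coefficient=2.0)
```

With `coefficient=0.0` the function reduces to an ordinary pairwise hinge loss; raising the coefficient increases the penalty on a scorer that neglects the chosen feature, which pushes training to give that feature more weight.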

Description

Technical field

[0001] The embodiments of the present invention relate to the field of computer technology, and in particular to a method and device for creating a web page scoring model.

Background

[0002] At present, after a search product receives the query word entered by the user, it first determines multiple related web pages to be returned based on the query word, then sorts these related web pages, and finally combines the link information of all the sorted web pages into a list, which is presented to the user as search results. Whether the ranking of web pages is accurate plays a vital role in the accuracy of search results and in user satisfaction with search. The more relevant a web page is to the query, the higher it should rank.

[0003] At present, search products mostly pre-create a webpage scoring model, such as the GBRank (Gradient Boosting Rank) model, and then score all webpages currently determ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 杨燕
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD