Training data generation method and device and searching method and device

A training data generation technology applied in the field of search engines. It addresses problems such as the poor results of the single-document method, the high computational complexity of the document list method, and training data construction methods that are unsuitable for neural network models, so as to improve the accuracy of search results.

Active Publication Date: 2017-01-18
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

Of the three LTR methods, the single-document (Pointwise) method ignores the relative order between documents and therefore performs poorly, while the document list (Listwise) method has relatively high computational complexity during training, and its training data is difficult to label. In practical applications, the document comparison (Pairwise) method is therefore usually selected, which requires knowing the relative order between documents. Because a large amount of labeled data is required, manual labeling is not practical. Neural network models are now widely used in many fields in industry, including LTR, but LTR differs from earlier neural network model learning in its methods and goals, so the existing ways of constructing training data are not suitable for neural network models.




Embodiment Construction

[0027] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0028] A training data generation method and device, and a search method and device based on a neural network model, according to embodiments of the present invention are described below with reference to the accompanying drawings.

[0029] At present, LTR methods generally fall into three types: the single-document method (Pointwise), the document comparison method (Pairwise), and the document list method (Listwise). The single-document method operates on a single document. After converting the document into a feature vector, it mainly converts ...
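The difference between the three LTR method types can be illustrated by the shape of the training examples each one consumes. The following is a minimal sketch, not the patent's implementation: the function names, the integer relevance grades, and the toy document ids are all assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical toy representation: (document id, relevance grade) for one query.
Graded = Tuple[str, int]


def pointwise_examples(query: str, docs: List[Graded]) -> List[Tuple[str, str, int]]:
    """Pointwise: every document is an independent (query, doc, grade) example."""
    return [(query, doc, grade) for doc, grade in docs]


def pairwise_examples(query: str, docs: List[Graded]) -> List[Tuple[str, str, str]]:
    """Pairwise: each example is (query, preferred doc, less preferred doc)."""
    pairs = []
    for (doc_a, grade_a), (doc_b, grade_b) in combinations(docs, 2):
        if grade_a > grade_b:
            pairs.append((query, doc_a, doc_b))
        elif grade_b > grade_a:
            pairs.append((query, doc_b, doc_a))
        # Documents with equal grades carry no order information and are skipped.
    return pairs


def listwise_example(query: str, docs: List[Graded]) -> Tuple[str, List[str]]:
    """Listwise: one example is the whole ranked list for the query."""
    ranked = sorted(docs, key=lambda d: d[1], reverse=True)
    return (query, [doc for doc, _ in ranked])


docs = [("d1", 2), ("d2", 0), ("d3", 1)]
print(pointwise_examples("q", docs))
print(pairwise_examples("q", docs))
print(listwise_example("q", docs))
```

The pairwise form only needs the relative order of two documents, which is why the relative order between documents must be known when that method is chosen.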



Abstract

The invention discloses a training data generation method and device, and a searching method and device based on a neural network model. The training data generation method comprises the following steps: obtaining historical search data of a user, wherein the historical search data comprises historical search terms and the historical search results corresponding to those terms; obtaining the user's historical search click behaviors, and classifying the historical search results corresponding to the historical search terms according to those behaviors to generate labels for the historical search results; and generating training data from the labels, the historical search terms, and the historical search results according to a preset strategy. The method requires no manual labeling and is therefore automated. Because the training data is generated from the labels of the historical search results, it is better suited to the neural network model, so searches carried out through the model return more accurate results and search accuracy improves.
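The three steps in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method claimed in the patent: the `SearchRecord` type, the two label values `"clicked"` and `"not_clicked"`, and the pairing strategy (each clicked result paired with each non-clicked result of the same query) are hypothetical, since the abstract does not specify the label set or the preset strategy.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class SearchRecord:
    """Hypothetical container for one historical search."""
    query: str                  # historical search term
    results: Tuple[str, ...]    # historical search results shown for the query
    clicked: FrozenSet[str]     # results the user clicked


def label_results(record: SearchRecord) -> Dict[str, str]:
    """Step 2: classify each result by click behavior (assumed two-class labels)."""
    return {
        result: "clicked" if result in record.clicked else "not_clicked"
        for result in record.results
    }


def generate_training_data(records: List[SearchRecord]) -> List[Tuple[str, str, str]]:
    """Step 3: build (query, preferred result, other result) triples.

    Assumed strategy: a clicked result is preferred over a non-clicked one
    shown for the same query.
    """
    data = []
    for record in records:
        labels = label_results(record)
        positives = [r for r in record.results if labels[r] == "clicked"]
        negatives = [r for r in record.results if labels[r] == "not_clicked"]
        for pos in positives:
            for neg in negatives:
                data.append((record.query, pos, neg))
    return data


# Step 1: historical search data (toy values).
history = [SearchRecord("q", ("a", "b", "c"), frozenset({"b"}))]
print(generate_training_data(history))
```

No human annotator appears anywhere in this pipeline; the labels come entirely from logged click behavior, which is the automation the abstract describes.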

Description

Technical field

[0001] The invention relates to the technical field of search engines, in particular to a training data generation method and device, and a search method and device based on a neural network model.

Background

[0002] In information retrieval, LTR (Learning to Rank) is an important ranking method. After the search engine recalls many relevant webpages from the webpage library, it needs to rank these webpage documents and present them to users. In this process, LTR plays a key role. LTR is supervised learning, so the acquisition of training data is particularly critical. At present, LTR methods generally fall into three types: the single-document method (Pointwise), the document comparison method (Pairwise), and the document list method (Listwise). However, among these three methods, the single-document method ignores the relative order between documents and therefore performs poorly, while the document list method will have relatively high computatio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N5/02, G06N3/02, G06F17/27
CPC: G06F40/284, G06N3/02, G06N5/022
Inventor: 姜迪, 石磊, 廖梦, 陈泽裕, 连荣忠
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD