System And Method For Generating Training Data For Function Approximation Of An Unknown Process Such As A Search Engine Ranking Algorithm

A search engine ranking algorithm and training data technology, applied in the field of machine learning algorithms, can solve the problems of inability to respond to user queries at any time, inconvenient use, and inability to meet user requirements.

Inactive Publication Date: 2010-03-04
CONDUCTOR INC

AI Technical Summary

Benefits of technology

[0012] Another embodiment of the invention is a method for generating training data for a machine learning system. The method comprises sending at least one input to a system effective to perform a process; and receiving at a first processor at least a first and a second output from the system in response to the input, the first output having a first rank, the second output having a second rank, the first and second rank being based on the input. The method further comprises assigning at the first processor a first label to the first output based on the first rank; assigning at the first processor a second label to the second output based on the second rank; and forwarding the first output, second output, first label and second label to a machine learning processor.
[0013] Yet another embodiment of the invention is a system for generating training data for a machine learning system. The system comprises a first processor effective to send at least one keyword to a search engine. The first processor is further effective to: receive at least a first and a second page from the search engine in response to the keyword, the first page having a first rank, the second page having a second rank, the first and second

Problems solved by technology

Searching and indexing these pages to produce useful results in response to user queries is constantly a challenge.
However, most of these systems use manual human judgment and historical knowledge about search engines.
Consequently, most prior art solutions are inaccurate, time consuming, and require expensive human capital.
Moreover, these solutions are available only for specific search engines and are not immune to changes in search or ranking algorithms used by known search engines nor do they have the ability to adapt to new search engines.




Embodiment Construction


[0020] Various embodiments of the invention are described hereinafter with reference to the figures. Elements of like structure or function are represented with like reference numerals throughout the figures. The figures are only intended to facilitate the description of the invention and are not intended as a limitation on the scope of the invention. In addition, an aspect described in conjunction with a particular embodiment of the invention is not necessarily limited to that embodiment and can be practiced in conjunction with any other embodiments of the invention.

[0021] When applying a ranking function, search engines receive as input: 1) at least one keyword and 2) a plurality of web pages in a result set produced based on the keyword(s). With those inputs, the search engine produces as an output a ranking score for each web page. The inventors recognized this phenomenon and produced a system and algorithm to reverse engineer the function performed by search engines to produce that output. Stated an...
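The input and output shape described in paragraph [0021] can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `toy_score` is a trivial keyword-frequency stand-in, not the function any real search engine uses, and `rank_pages` is a hypothetical helper that orders a result set by that score.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a ranking function maps (keyword, page text) to a score.
RankingFunction = Callable[[str, str], float]


def toy_score(keyword: str, page: str) -> float:
    """Illustrative stand-in only: fraction of page words equal to the keyword."""
    words = page.lower().split()
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def rank_pages(
    keyword: str,
    pages: List[str],
    score: RankingFunction = toy_score,
) -> List[Tuple[str, float]]:
    """Score every page in the result set and return them best-first."""
    scored = [(page, score(keyword, page)) for page in pages]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A function approximation approach would treat `score` as unknown and try to learn a substitute from observed rankings; the toy scorer only makes the data flow concrete.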


Abstract

A system and method for generating training data for a machine learning system. A training data generator server sends at least one keyword to a search engine. The training data generator server receives at least a first and a second page from the search engine in response to the keyword, the first page having a first rank, the second page having a second rank, the first and second rank being based on the keyword. The training data generator server assigns a first label to the first page based on the first rank, and assigns a second label to the second page based on the second rank. The first page, second page, first label and second label are forwarded to a machine learning server.
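The flow in the abstract (send a keyword, receive ranked pages, label each page from its rank, forward the labeled pages) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `search` callable stands in for a real search engine client, `fake_search` returns made-up URLs, and the cutoff-based `label_from_rank` rule is a hypothetical choice, since the text above says only that each label is based on the rank.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class LabeledPage:
    keyword: str
    page: str
    rank: int   # 1-based position in the result set
    label: int  # derived from rank; the rule used is an assumption


def label_from_rank(rank: int, cutoff: int = 10) -> int:
    """Hypothetical labeling rule: 1 for ranks within the cutoff, else 0."""
    return 1 if rank <= cutoff else 0


def generate_training_data(
    keywords: Sequence[str],
    search: Callable[[str], List[str]],
    label_fn: Callable[[int], int] = label_from_rank,
) -> List[LabeledPage]:
    """Query each keyword and label every returned page by its rank."""
    examples: List[LabeledPage] = []
    for keyword in keywords:
        # enumerate(..., start=1) turns result order into a 1-based rank.
        for rank, page in enumerate(search(keyword), start=1):
            examples.append(LabeledPage(keyword, page, rank, label_fn(rank)))
    return examples


def fake_search(keyword: str) -> List[str]:
    """Stand-in for a search engine: returns three made-up result URLs."""
    return [f"https://example.com/{keyword}/{i}" for i in range(1, 4)]
```

Calling `generate_training_data(["shoes"], fake_search)` yields three `LabeledPage` records that could then be handed to whatever component trains the model; a graded or pairwise labeling rule could be swapped in through `label_fn`.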

Description

[0001] This application claims priority to U.S. Patent application Ser. No. 61/093,586 entitled “Techniques for Automated Search Rank Function, Approximation, Rank Improvement Recommendations and Predictions”, filed Sep. 2, 2008, the entirety of which is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] This disclosure relates to machine learning algorithms and, more particularly, to generation of training data for machine learning algorithms.

[0004] 2. Description of the Related Art

[0005] Referring to FIG. 1, the World Wide Web (“WWW”) is a distributed database including literally billions of pages accessible through the Internet. Searching and indexing these pages to produce useful results in response to user queries is constantly a challenge. A search engine is typically used to search the WWW.

[0006] A typical prior art search engine 20 is shown in FIG. 1. Pages from the Internet or other source 22 are accessed through the use of a crawl...


Application Information

IPC(8): G06F17/30
CPC: G06F17/30864, G06F16/951
Inventor: KULKARNI, PARASHURAM
Owner: CONDUCTOR INC