Segmenting postings list reader

Inactive Publication Date: 2011-02-17
GLOBALSPEC

AI Technical Summary

Benefits of technology

[0038] FIG. 8 is a sequence diagram for one example of a method of reading a posting list comprising two segmen

Problems solved by technology

A large inverted index may not fit into a computer's main memory, requiring secondary storage, typically disk storage, to help store the posting file, lexicon, or both.
Each separate access to disk may

Examples

[0040] Posting list lengths in a search index follow Zipf's law, which states that, given a corpus of natural language documents, the frequency of any word is inversely proportional to its rank in the frequency table.
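As a rough numeric illustration of the inverse proportionality stated above, the following minimal sketch computes idealized frequencies by rank. The function name and the starting frequency of 1200 are assumptions chosen for illustration; real corpora follow the law only approximately.

```python
def zipf_expected_frequencies(top_frequency, num_ranks):
    """Idealized Zipf's law: frequency at rank r is top_frequency / r.

    top_frequency is the (assumed) frequency of the rank-1 term.
    """
    return [top_frequency / rank for rank in range(1, num_ranks + 1)]


# With an assumed rank-1 frequency of 1200, ranks 1..4 give
# [1200.0, 600.0, 400.0, 300.0]: the frequency halves at rank 2,
# falls to a third at rank 3, and so on.
frequencies = zipf_expected_frequencies(1200, 4)
```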

[0041] FIG. 1 shows, for an index built from a natural language corpus, a graph 100 of term rank 102 versus document frequency 104, where document frequency is the number of distinct documents in which the term occurs. Document frequency can also be thought of as posting list length. The graph shows that most terms have very short posting lists, and only relatively few posting lists are long.
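The relationship between document frequency and posting list length can be seen in a toy inverted index. The sketch below is illustrative only: the four-document corpus, the whitespace tokenizer, and the function names are assumptions, not part of the described system.

```python
from collections import defaultdict


def build_posting_lists(documents):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(list)
    for doc_id, text in enumerate(documents):
        # set() ensures a document is counted once per term.
        for term in set(text.lower().split()):
            postings[term].append(doc_id)
    return dict(postings)


def rank_by_document_frequency(postings):
    """Return (term, document_frequency) pairs, longest posting list first."""
    pairs = ((term, len(doc_ids)) for term, doc_ids in postings.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical corpus, for illustration only.
corpus = ["the cat sat", "the dog sat", "the bird flew", "a cat flew"]
postings = build_posting_lists(corpus)
ranked = rank_by_document_frequency(postings)
# The document frequency of a term equals len(postings[term]).
```

Even in this tiny corpus, one term ("the") has the longest posting list while several terms occur in a single document, which is the shape of the curve described for FIG. 1.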

[0042] Because queries submitted to a search system are themselves small natural language documents, they too adhere to Zipf's law. It follows that the relatively few long posting lists in a search index are also the most frequently accessed during query processing. An efficient read strategy for long posting lists can help a search system deliver fast query run times. It is convenient th...

Abstract

A size of a posting list is determined as part of searching an inverted index. The posting list is segmented for reading into a plurality of segments based on the size. For example, the segmenting may be performed if the size is larger than a predetermined size. Finally, each of the plurality of segments is read into memory.
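The steps in the abstract (determine the size, segment when the size exceeds a predetermined size, read each segment into memory) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the fixed-size segmentation policy, the 64 KiB defaults, and the names `plan_segments` and `read_posting_list` are all hypothetical, and the posting list is assumed to be a contiguous byte range in a seekable file.

```python
import io

# Hypothetical defaults; the source does not state concrete values.
DEFAULT_THRESHOLD = 64 * 1024
DEFAULT_SEGMENT_SIZE = 64 * 1024


def plan_segments(offset, size, threshold=DEFAULT_THRESHOLD,
                  segment_size=DEFAULT_SEGMENT_SIZE):
    """Return (offset, length) pairs covering the posting list's bytes.

    Lists no larger than the threshold are read as a single piece;
    larger lists are split into fixed-size segments (an assumed policy).
    """
    if size <= 0:
        return []
    if size <= threshold:
        return [(offset, size)]
    segments = []
    position, end = offset, offset + size
    while position < end:
        length = min(segment_size, end - position)
        segments.append((position, length))
        position += length
    return segments


def read_posting_list(posting_file, offset, size, **options):
    """Read every planned segment into memory and return the joined bytes."""
    chunks = []
    for segment_offset, segment_length in plan_segments(offset, size, **options):
        posting_file.seek(segment_offset)
        chunks.append(posting_file.read(segment_length))
    return b"".join(chunks)


# Usage with an in-memory stand-in for a posting file.
data = bytes(range(256)) * 4
posting_file = io.BytesIO(data)
plan = plan_segments(100, 250, threshold=100, segment_size=100)
payload = read_posting_list(posting_file, 100, 250,
                            threshold=100, segment_size=100)
```

Here `plan` splits the 250-byte list into three segments of 100, 100, and 50 bytes, and `payload` equals the same byte range read in one piece. Keeping the planning step separate from the reading step makes it easy to swap in a different segmentation policy.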

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority under 35 U.S.C. §119 to the following U.S. Provisional Applications, which are herein incorporated by reference in their entirety:

[0002] Provisional Patent Application Ser. No. 61/233,411, by Flatland et al., entitled “ESTIMATION OF POSTINGS LIST LENGTH IN A SEARCH SYSTEM USING AN APPROXIMATION TABLE,” filed on Aug. 12, 2009;

[0003] Provisional Patent Application No. 61/233,420, by Flatland et al., entitled “EFFICIENT BUFFERED READING WITH A PLUG IN FOR INPUT BUFFER SIZE DETERMINATION,” filed on Aug. 12, 2009; and

[0004] Provisional Patent Application Ser. No. 61/233,427, by Flatland et al., entitled “SEGMENTING POSTINGS LIST READER,” filed on Aug. 12, 2009.

[0005] This application contains subject matter which is related to the subject matter of the following applications, each of which is assigned to the same assignee as this application and filed on the same day as this application. Each of the below listed ap...

Application Information

IPC(8): G06F17/30
CPC: G06F17/30622, G06F16/319
Inventors: FLATLAND, STEINAR; DALTON, JEFF J.
Owner: GLOBALSPEC