Tree structure based indexing method

A tree-structure-based indexing technique, applied in the field of data processing, that addresses problems such as the unsatisfactory dynamic behavior of existing indexes and achieves improved retrieval efficiency with reduced index space.

Active Publication Date: 2016-03-23
SICHUAN CINGHOO TECH

AI Technical Summary

Problems solved by technology

Although the Pat array greatly compresses the overhead in the creation process, because of the storage method of



Embodiment Construction

[0028] The technical solutions of the present invention will be clearly and completely described below in conjunction with the accompanying drawings of the present invention. Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0029] First, some terms used in the present invention are introduced:

[0030] (1) Source document library: The source document library refers to the collection of original webpage files captured by web crawlers fr...


Abstract

The invention proposes a tree-structure-based indexing method for processing Chinese webpage data in a Chinese search engine. The method comprises two steps. Step S100, preprocessing the webpage data: (1) extracting the text information from each webpage, generating a corresponding text, and numbering that text; (2) generating a webpage index file; (3) removing the punctuation marks from the text so that it becomes a set of short character strings. Step S200, establishing the webpage data index file. The method builds the index for the webpage data using a binary inter-correlative subsequent tree model and weighs the advantages and disadvantages of word indexing and phrase indexing, so that retrieval efficiency is improved while index space is reduced.
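The preprocessing in Step S100 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `TextExtractor`, `extract_text`, `preprocess`, and `SEPARATORS` are hypothetical, the "webpage index file" is approximated by an in-memory mapping from text number to URL, and the set of punctuation marks is an illustrative guess. Step S200 (building the tree index) is not shown because the abstract does not describe its details.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


# Assumed separator set: whitespace plus common ASCII and Chinese punctuation.
SEPARATORS = re.compile(r"[\s,.;:!?'()\[\]<>,。;:!?、“”‘’()《》【】]+")


def extract_text(html):
    """Step S100 (1): pull the text information out of one webpage."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def preprocess(pages):
    """pages is a list of (url, html) pairs.

    Returns (page_index, short_strings):
      page_index    maps text number -> URL (stand-in for the webpage index file)
      short_strings maps text number -> list of punctuation-free short strings
    """
    page_index = {}
    short_strings = {}
    for number, (url, html) in enumerate(pages, start=1):
        text = extract_text(html)                      # (1) extract and number
        page_index[number] = url                       # (2) webpage index entry
        pieces = SEPARATORS.split(text)                # (3) cut at punctuation
        short_strings[number] = [p for p in pieces if p]
    return page_index, short_strings
```

Cutting the text at punctuation means no indexed string spans a sentence or clause boundary, which keeps the strings handed to the index-building step short.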

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to an indexing method based on a tree structure.

Background art

[0002] With the rapid development of the Internet, the exponential growth of information, and the diversity of data forms, it is difficult for people to quickly find the part that meets their needs within massive amounts of information. The emergence of the full-text database has greatly improved this situation. A full-text database, also known as a text database, is a system for managing massive amounts of text. Like a traditional database, it performs two functions, storage and retrieval: specifically, the storage of text data and the retrieval of arbitrary character strings. The string used as the search condition can be a constant string, or a set of strings with common characteristics expressed by a regular expression (or by other means, such as a distance limit).

[0003] At prese...


Application Information

IPC (8): G06F17/30
CPC: G06F16/322, G06F16/35, G06F16/951
Inventors: 陈虹宇, 罗阳, 苗宁
Owner SICHUAN CINGHOO TECH