An Indexing Method Based on Tree Structure

This tree-structure indexing technique, applied in the field of data processing, addresses the poor dynamic-update behavior of existing index structures and improves retrieval efficiency while reducing index space.

Active Publication Date: 2019-03-26
四川神虎科技有限公司

AI Technical Summary

Problems solved by technology

Although the PAT array greatly reduces the overhead of index creation, it is stored as a contiguous array, so creating and merging indexes requires moving a large amount of data, and its support for dynamic updates is unsatisfactory.
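To make the array-storage problem concrete, here is a minimal sketch of a PAT (suffix) array in Python. It is a naive illustration, not the construction used in the patent or in any production system, and the function names `build_pat_array` and `occurrences` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_pat_array(text: str) -> list[int]:
    """Return suffix start positions sorted by the suffix they begin.

    Naive construction for illustration only; real systems use faster
    algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def occurrences(text: str, pat: list[int], query: str) -> list[int]:
    """Find every position where `query` occurs, using binary search."""
    # Truncating each sorted suffix to len(query) keeps the list sorted,
    # so the matching suffixes form one contiguous run in the array.
    keys = [text[i:i + len(query)] for i in pat]
    lo = bisect_left(keys, query)
    hi = bisect_right(keys, query)
    return sorted(pat[lo:hi])


text = "banana"
pat = build_pat_array(text)
print(pat)                            # suffix positions in sorted order
print(occurrences(text, pat, "ana"))  # positions where "ana" occurs
```

Because the positions live in one contiguous sorted array, a new suffix that sorts into the middle forces every later entry to shift, and merging two such arrays rewrites both. This is the data movement the summary refers to.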




Detailed Description of the Embodiments

[0028] The technical solution of the present invention will be clearly and completely described below in conjunction with the accompanying drawings of the present invention. The exemplary embodiments will be described in detail here, and examples thereof are shown in the accompanying drawings. When the following description refers to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with the present invention. On the contrary, they are only examples of devices and methods consistent with some aspects of the present invention as detailed in the appended claims.

[0029] First, some terms used in the present invention are introduced as follows:

[0030] (1) Source document library: The source document library refers to a collection of original webpage files crawled from ...


Abstract

The invention provides an indexing method based on a tree structure, used to process Chinese webpage data in a Chinese search engine. The method comprises: Step S100, preprocessing the webpage data by (1) extracting the text information from each webpage, generating a corresponding text, and numbering the text; (2) generating a webpage index file; and (3) removing the punctuation marks from the text so that it becomes a set of short character strings; and Step S200, building an index file for the webpage data. The method indexes the webpage data with a binary inter-correlative subsequent tree model and weighs the advantages and disadvantages of word indexing and phrase indexing, so that retrieval efficiency improves while index space shrinks.
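The numbering and punctuation-removal parts of Step S100 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the punctuation set, the function names `split_into_short_strings` and `preprocess`, and the choice to number texts from 1 are all hypothetical, and the sketch assumes the text has already been extracted from the HTML. Generating the webpage index file (sub-step 2) and building the tree index (Step S200) are not shown.

```python
import re

# Assumption: a small set of ASCII and common Chinese punctuation marks.
# The abstract does not say which marks the method removes.
PUNCTUATION = ",.!?;:()[]<>\"'" + ",。!?;:、()《》“”‘’"
_SEPARATORS = re.compile("[" + re.escape(PUNCTUATION) + r"\s]+")


def split_into_short_strings(text: str) -> list[str]:
    """Cut the text at punctuation and whitespace, dropping empty pieces."""
    return [piece for piece in _SEPARATORS.split(text) if piece]


def preprocess(page_texts: list[str]) -> dict[int, list[str]]:
    """Number each extracted text (from 1) and map it to its short strings."""
    return {
        doc_id: split_into_short_strings(text)
        for doc_id, text in enumerate(page_texts, start=1)
    }


pages = ["今天天气很好,我们去公园。", "Hello, world!"]
for doc_id, strings in preprocess(pages).items():
    print(doc_id, strings)
```

Splitting at punctuation means no indexed string spans a sentence or clause boundary, which keeps the strings passed to the index builder short.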

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to an indexing method based on a tree structure.

Background

[0002] With the rapid development of the Internet, the exponential growth of information, and the diversity of data forms, it is difficult for people to quickly find what they need in massive amounts of information. The emergence of full-text databases has greatly improved this situation. A full-text database, also known as a text database, is a system for managing massive amounts of text. It still performs the two major functions of a traditional database, storage and retrieval: specifically, the storage of text data and the retrieval of arbitrary strings. The string used as the search condition can be a constant string, or a set of strings with common characteristics expressed by a regular expression (or by other means, such as a distance limit). ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/901, G06F16/951, G06F16/903
CPC: G06F16/322, G06F16/35, G06F16/951
Inventor: 陈虹宇, 罗阳, 苗宁
Owner 四川神虎科技有限公司