Method for creating real information index, and full-text retrieval system based on cloud platform

An index-establishment method applied in the field of data processing, intended to address the low efficiency of data retrieval.

Inactive Publication Date: 2016-08-10
掌沃云科技(北京)有限公司

AI Technical Summary

Problems solved by technology

[0004] The main purpose of the present invention is to provide a method for establishing a real information index and a full-text retrieval system based on a cloud platform, so as to solve the problem of low data retrieval efficiency in the prior art.

Examples

Detailed Description of the Embodiments

[0042] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments. It should be pointed out that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other.

[0043] In a first aspect, Embodiment 1 of the present invention provides a method for establishing a real information index, which mainly describes the process of building an index over a real information database. Referring to Figure 1, the method may include the following steps S1 to S6.

[0044] Step S1: Segment the text in the real information database into words to obtain a word bank.

[0045] A word is the smallest meaningful unit of language that can be used independently. English words are naturally delimited by spaces, whereas Chinese uses the character as its basic writing unit and has no explicit boundary marker between words. Therefore, Chinese word analysis is the f...
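The text available here does not say which segmentation algorithm step S1 uses. As a minimal sketch, assuming a dictionary-based forward maximum matching segmenter (a common baseline, swapped in purely for illustration), the word bank could be built as follows. The names `forward_max_match`, `build_word_bank`, and the toy dictionary are hypothetical and do not come from the patent.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word.

    ASSUMPTION: forward maximum matching is an illustrative stand-in; the
    patent text shown here does not name its segmentation algorithm.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def build_word_bank(texts, dictionary):
    """Collect the distinct words found across all texts (the 'word bank')."""
    bank = set()
    for text in texts:
        bank.update(forward_max_match(text, dictionary))
    return bank


# Hypothetical toy dictionary and input, for illustration only.
toy_dictionary = {"云平台", "全文", "检索", "系统"}
segments = forward_max_match("云平台全文检索系统", toy_dictionary)
word_bank = build_word_bank(["云平台全文检索系统", "云平台的系统"], toy_dictionary)
```

Forward maximum matching needs no training data, which keeps the sketch short, but it resolves ambiguous boundaries purely by length; a production system would more likely use a statistical segmenter.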

Abstract

The invention discloses a method for creating a real information index and a full-text retrieval system based on a cloud platform. The method comprises the following steps: performing word segmentation on the text in a real information database to obtain a word bank; taking one word from the word bank as a first word; taking N-1 words from the word bank, excluding the first word, to form a word group together with the first word; calculating the overall correlation distance K between every two words in the word group, so as to obtain the K values (a formula shown in the specification), where

\[
K = \lambda_1 K_{\text{different texts}} + \lambda_2 K_{\text{same texts}} \left[ \lambda_3 K_{\text{different paragraphs}} + \lambda_4 K_{\text{same paragraphs}} \left( \lambda_5 K_{\text{different sentences}} + \lambda_6 K_{\text{same sentences}} \right) \right]
\]

calculating the overall distance coefficient P of the word group (by a formula shown in the specification); repeating the selection of N-1 words M times, so that M values of P are obtained; finding the N-1 words for which P is minimal, which together with the first word form a first related word group; determining the relationships among the words in the first related word group; and creating the real information index according to those relationships. With the disclosed method, relevant data can be found more efficiently.
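The nested formula for K and the minimum-P selection in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the patent's implementation: the formula for P is not reproduced in this record, so the sketch assumes P is the mean of the pairwise K values, and the six component distances are assumed to be given as plain numbers. The names `overall_k`, `group_p`, `best_group`, and the sample distances are hypothetical.

```python
from itertools import combinations


def overall_k(c, lam):
    """Nested combination of six component distances, as in the abstract.

    c and lam are 6-element sequences ordered as: different texts, same
    texts, different paragraphs, same paragraphs, different sentences,
    same sentences.
    """
    sentence_term = lam[4] * c[4] + lam[5] * c[5]
    paragraph_term = lam[2] * c[2] + lam[3] * c[3] * sentence_term
    return lam[0] * c[0] + lam[1] * c[1] * paragraph_term


def group_p(words, pair_k):
    """Overall distance coefficient of a word group.

    ASSUMPTION: the record omits the formula for P; the mean of all
    pairwise K values is used here only as a placeholder.
    """
    ks = [pair_k(a, b) for a, b in combinations(words, 2)]
    return sum(ks) / len(ks)


def best_group(first_word, candidates, pair_k):
    """Among M candidate draws of N-1 words, return the one minimising P."""
    return min(candidates, key=lambda rest: group_p((first_word, *rest), pair_k))


# Hypothetical pairwise K values, for illustration only.
sample_k = {
    frozenset(("index", "search")): 1.0,
    frozenset(("index", "cloud")): 3.0,
    frozenset(("search", "cloud")): 2.0,
    frozenset(("index", "weather")): 9.0,
    frozenset(("search", "weather")): 8.0,
    frozenset(("cloud", "weather")): 7.0,
}


def sample_pair_k(a, b):
    return sample_k[frozenset((a, b))]


draws = [("search", "cloud"), ("search", "weather"), ("cloud", "weather")]
related = best_group("index", draws, sample_pair_k)
```

Here N = 3 and M = 3: each draw supplies N-1 = 2 words, and the draw with the smallest P becomes the first related word group together with the first word.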

Description

Technical field

[0001] The present invention relates to the technical field of data processing, and in particular to a method for establishing a real information index and a full-text retrieval system based on a cloud platform.

Background art

[0002] With the development of the Internet, we have entered the big data era of data explosion. More and more data affect every aspect of people's lives, and people need to store, retrieve, and analyze many types of data by category. However, retrieving and analyzing data requires traversing it, which is inefficient.

[0003] No effective solution has yet been proposed for the problem of low data retrieval efficiency in the prior art.

Summary of the invention

[0004] The main purpose of the present invention is to provide a method for establishing a real information index and a full-text retrieval system based on a cloud platform, so as to solve the problem of low data retrieval efficiency in the prior ...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/3344, G06F40/289
Inventors: 李唳天, 马雄鹰
Owner 掌沃云科技(北京)有限公司