Fast indexing method and system based on position top-k keyword in sliding window

Sliding-window and top-k techniques, applied in the computer field, address problems such as the insufficient update rate of existing systems, and improve query speed and overall performance.

Active Publication Date: 2018-01-26
SHENZHEN UNIV

AI Technical Summary

Problems solved by technology

However, neither system


Embodiment Construction

[0066] The present invention is now described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams that illustrate only the basic structure of the present invention, so they show only the configurations related to it.

[0067] 1. Problem definition

[0068] Let D be a two-dimensional Euclidean space, W be a sliding window, and S be the set of geo-textual objects located in D that fall within W. Each geo-textual object is expressed as o = (pos, text), where pos is a location point in D and text is the text content. An LkTQ q is a tuple (loc, k), where loc is the query location point and k is the user-specified number of result keywords. The query returns the k keywords with the highest position-aware word frequency scores among the objects in W.
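The data model and query in [0068] can be sketched as a brute-force baseline. This is only an illustration under stated assumptions, not the indexing method of the invention: the names `GeoText`, `LkTQ`, `SlidingWindow` and `brute_force_lktq` are hypothetical, the window is assumed to be time-based, and the weighting function `inverse_distance_weight` is a placeholder stand-in, since the position-aware score itself is defined separately.

```python
import math
from collections import Counter, deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class GeoText:
    """A geo-textual object o = (pos, text)."""
    pos: Point
    text: str
    timestamp: float  # assumption: arrival time, used only for window expiry


@dataclass(frozen=True)
class LkTQ:
    """A query q = (loc, k)."""
    loc: Point
    k: int


class SlidingWindow:
    """Assumed time-based window keeping objects newer than `span` time units."""

    def __init__(self, span: float) -> None:
        self.span = span
        self.items: Deque[GeoText] = deque()

    def insert(self, o: GeoText) -> None:
        self.items.append(o)
        # Expire objects that have fallen out of the window.
        while self.items and self.items[0].timestamp <= o.timestamp - self.span:
            self.items.popleft()


def inverse_distance_weight(loc: Point, pos: Point) -> float:
    """Placeholder proximity weight (an assumption, not the patent's score)."""
    return 1.0 / (1.0 + math.dist(loc, pos))


def brute_force_lktq(
    window: SlidingWindow,
    q: LkTQ,
    weight: Callable[[Point, Point], float] = inverse_distance_weight,
) -> List[str]:
    """Scan every object in the window and return the k best-scoring words."""
    scores: Counter = Counter()
    for o in window.items:
        w = weight(q.loc, o.pos)
        for term in o.text.lower().split():
            scores[term] += w
    return [term for term, _ in scores.most_common(q.k)]
```

A full scan like this touches every object in W for every query, which is the cost an index over the window is meant to avoid.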

[0069] The position-aware word frequency score of a word t in a sliding window W is d...


Abstract

The invention discloses a fast indexing method and system for position-based top-k keyword queries in a sliding window. The method comprises creating a data indexing model and performing a query. The data indexing model is created as follows: determining the geographical range covered by a quadtree and a node splitting rule; accepting a data stream and inserting the data into a node; splitting any node that satisfies the splitting rule and inserting the data into the resulting nodes to generate a complete quadtree; storing inverted indexes in the leaf nodes; storing, in each non-leaf node, the MG aggregated summaries of its child nodes; and adjusting the structure of the quadtree. The query comprises: initializing a result set; performing a pruning operation to obtain a candidate result set; and selecting the word with the maximum value in a priority queue, traversing from the root node until the exact score of that word is computed at the leaf nodes, putting the exact score into the queue, and repeating the operation until the first k words of the priority queue no longer change. With this indexing method and system, cost is effectively reduced, query speed is improved, the search space is effectively pruned according to word frequency and location proximity, and geo-textual data streams with a high arrival rate can be processed.
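The summaries stored in non-leaf nodes can be illustrated with a small sketch, assuming that "MG" refers to the Misra-Gries frequent-items summary and that a parent's summary is obtained by merging those of its children. The class name `MGSummary`, its `capacity` parameter and the method names are hypothetical and do not come from the patent text.

```python
from typing import Dict, Iterable


class MGSummary:
    """Misra-Gries style summary holding at most `capacity` word counters.

    Stored counts never exceed the true counts; words that are dropped
    are treated as having an estimated count of 0.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counters: Dict[str, int] = {}

    def add(self, term: str, count: int = 1) -> None:
        """Record `count` occurrences of `term`."""
        self.counters[term] = self.counters.get(term, 0) + count
        self._shrink()

    def _shrink(self) -> None:
        """Reduce to at most `capacity` counters.

        Subtract the (capacity + 1)-th largest count from every counter
        and drop the counters that become non-positive.
        """
        if len(self.counters) <= self.capacity:
            return
        cut = sorted(self.counters.values(), reverse=True)[self.capacity]
        self.counters = {t: c - cut for t, c in self.counters.items() if c > cut}

    def estimate(self, term: str) -> int:
        """Lower-bound estimate of how often `term` was added."""
        return self.counters.get(term, 0)

    @classmethod
    def merge(cls, summaries: Iterable["MGSummary"], capacity: int) -> "MGSummary":
        """Combine child summaries into one parent summary."""
        merged = cls(capacity)
        for s in summaries:
            for term, count in s.counters.items():
                merged.counters[term] = merged.counters.get(term, 0) + count
        merged._shrink()
        return merged
```

Under these assumptions, a parent node keeps a bounded amount of state regardless of how many distinct words its subtree contains, at the price of undercounting. A query that relies on such summaries therefore still needs to descend to the leaf nodes to obtain exact scores for its candidate words.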

Description

Technical field

[0001] The invention belongs to the field of computers and relates to an indexing method, in particular a fast indexing method suited to position-based top-k keyword queries under a sliding window. The invention also relates to a fast indexing system for position-based top-k keyword queries under a sliding window.

Background

[0002] With the proliferation of social media, cloud storage and location-based services, the number of messages containing both text and geographic information (for example, geotagged tweets) has soared. Such messages, which can be modeled as geo-textual data streams, often provide first-hand information about local events of different types and sizes, including regional news stories, urban disasters, local business promotions, and hot topics of public concern in a city.

[0003] The data streams of location-based social media have the following properties: (1) Bursty n...


Application Information

IPC(8): G06F17/30
CPC: G06F16/29, G06F16/31
Inventor: 毛睿, 李荣华, 陆敏华, 王毅, 罗秋明, 商烁, 刘刚
Owner SHENZHEN UNIV