Distributed framework-based log data storing and retrieving method

A distributed-architecture data storage technology, applied in the field of mobile communications. It addresses problems such as slow data retrieval, and offers high availability and easy expansion while meeting storage requirements.

Active Publication Date: 2015-12-09
WUHAN HONGXIN TECH SERVICE CO LTD
AI Technical Summary

Problems solved by technology

[0009] The present invention provides a method for constructing an efficient distributed data storage and retrieval system based on HBase and Solr. It addresses both the high-performance storage of massive user log data and the slow retrieval of data under multi-dimensional, multi-keyword conditions. It comprises a method for storing massive log data and a method for retrieving log data, both implemented with HBase and Solr.

Embodiment Construction

[0029] One aspect of the present invention is a method for storing and indexing users' online log data in the HBase and Solr distributed systems. It includes: storing the users' online log data in HBase in the form of entries and establishing a unique identifier; preprocessing the log data by dimension splitting and dimension word segmentation; establishing the index relationship between query dimensions, word-segmentation keywords, and data identifiers; splitting the log data into dimensions; applying word segmentation to the dimensions; and building the index from the dimensions and the word-segmentation keywords.
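
The storage and indexing steps in [0029] can be illustrated with a minimal, hedged sketch. It does not use HBase or Solr: a plain dict stands in for the HBase table and an in-memory mapping stands in for the Solr index. The field names, the `|` delimiter, the UUID-based identifier (one per entry), and the naive tokenizer are all assumptions made for illustration, not details taken from the patent.

```python
import re
import uuid
from collections import defaultdict

# Hypothetical in-memory stand-ins (assumptions, not the patent's components):
# entry_store plays the role of an HBase table (row key -> log entry),
# inverted_index plays the role of a Solr index.
entry_store = {}
inverted_index = defaultdict(set)  # (dimension, keyword) -> set of row keys


def split_dimensions(raw_line, field_names, sep="|"):
    """Dimension splitting: cut one raw log line into named dimensions.

    The delimiter and the field names are illustrative assumptions.
    """
    values = raw_line.rstrip("\n").split(sep)
    return dict(zip(field_names, values))


def segment(value):
    """Naive word segmentation: lowercase runs of letters and digits."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", value.lower()) if tok]


def store_and_index(raw_line, field_names):
    """Store one log entry under a unique identifier and index its keywords."""
    row_key = uuid.uuid4().hex              # assumed identifier scheme
    dims = split_dimensions(raw_line, field_names)
    entry_store[row_key] = dims             # "store the entry"
    for dim, value in dims.items():
        for keyword in segment(value):
            # Record the relation (dimension, keyword) -> data identifier.
            inverted_index[(dim, keyword)].add(row_key)
    return row_key


if __name__ == "__main__":
    fields = ["user_id", "url", "time"]     # hypothetical dimensions
    key = store_and_index(
        "13800000000|http://news.example.com/sports|2015-01-01 10:00:00",
        fields,
    )
    print(key in inverted_index[("url", "sports")])
```

In a real deployment the dict write would be replaced by a write to HBase and the index update by a document submitted to Solr; the sketch only shows how the unique identifier ties the two together.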

[0030] During preprocessing of the user's online log data, the log data preprocessing device reads the data from the user's online log data file. The log data file can be in any of the following formats: plain text (txt) format with conventional...

Abstract

The invention provides a method for storing and retrieving log data based on a distributed framework. It aims to solve the problem of high-performance storage of massive user internet log data and the problem of slow data retrieval under multi-dimension, multi-keyword conditions. In one aspect, the invention provides a method for storing user internet log data, which mainly uses HBase and Solr for distributed data storage and index construction. This comprises storing the user internet log data in HBase in entry form and building a unique identifier; splitting the log data into dimensions and applying word segmentation to those dimensions; building an index relationship between the query dimensions, the word-segmentation keywords, and the data identifiers; and building the index from the dimensions and the word-segmentation keywords. In another aspect, the invention provides a method for retrieving the user internet log data, which mainly comprises a method for organizing conditions and accessing data under multi-dimension, multi-keyword conditions.
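
As a rough illustration of retrieval under multi-dimension, multi-keyword conditions, the sketch below looks up each (dimension, keyword) pair in an index, intersects the resulting identifier sets, and then fetches the matching entries by identifier. The sample data, the `search` function, and the choice of AND semantics across all conditions are assumptions for illustration; the abstract does not state how conditions are combined, and plain Python dicts stand in for Solr and HBase.

```python
# Hypothetical sample data: entry_store stands in for an HBase table and
# inverted_index stands in for a Solr index (both are assumptions).
entry_store = {
    "r1": {"user_id": "u1", "url": "news.example.com/sports"},
    "r2": {"user_id": "u1", "url": "shop.example.com/cart"},
    "r3": {"user_id": "u2", "url": "news.example.com/finance"},
}
inverted_index = {
    ("user_id", "u1"): {"r1", "r2"},
    ("user_id", "u2"): {"r3"},
    ("url", "news"): {"r1", "r3"},
    ("url", "shop"): {"r2"},
    ("url", "sports"): {"r1"},
    ("url", "cart"): {"r2"},
    ("url", "finance"): {"r3"},
}


def search(conditions):
    """Return entries matching every (dimension, keyword) condition.

    conditions maps a dimension name to a list of keywords, for example
    {"user_id": ["u1"], "url": ["news"]}. AND semantics are assumed.
    """
    id_sets = [
        inverted_index.get((dim, keyword), set())
        for dim, keywords in conditions.items()
        for keyword in keywords
    ]
    if not id_sets:
        return []
    matched = set.intersection(*id_sets)    # identifiers satisfying all conditions
    # Fetch full entries by identifier, in a stable order.
    return [entry_store[row_key] for row_key in sorted(matched)]


if __name__ == "__main__":
    print(search({"user_id": ["u1"], "url": ["news"]}))
```

The two-step shape (resolve identifiers in the index first, then read entries by key) is the design point the sketch is meant to show: the index answers the multi-keyword question, and the entry store only serves key lookups.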

Description

Technical field

[0001] The present invention relates to the field of mobile communication, in particular to a method for storing and retrieving the online log data of massive numbers of users on 2G, 3G and 4G data networks in a mobile communication network, and more particularly to a method for storing and retrieving log data based on a distributed architecture.

Background technique

[0002] Hadoop is a distributed system infrastructure developed by the Apache Foundation. The core of the Hadoop framework consists of HDFS and MapReduce: HDFS provides storage for massive data, and MapReduce provides computation over massive data.

[0003] HBase is a NoSQL database system built on the Hadoop distributed system. It is a highly reliable, high-performance, column-oriented, scalable distributed storage system.

[0004] Solr is a high-performance full-text search server based on Lucene. It has been extended to provide a richer query language than Lucene. At the same time, it is co...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/1815, G06F16/182, G06F16/27, G06F16/9535
Inventor: 杨定义, 蔡剑峰, 陈亮, 李磊, 肖伟民, 余道敏
Owner: WUHAN HONGXIN TECH SERVICE CO LTD