
Storage method and device for labeling system data

A label system data storage technology, applied in the field of data processing, that addresses poor batch read and write performance and poor data throughput, and achieves convenient version management, reduced data delay, and reduced cost.

Active Publication Date: 2020-10-09
HUAWEI MACHINERY

AI Technical Summary

Problems solved by technology

[0005] The present application provides a storage method and device for label system data. They address the problem that existing label systems store label data in a NoSQL database, which results in poor data throughput and poor batch read and write performance.



Embodiment Construction

[0056] The embodiments of the present application provide a storage method and device for label system data. They address the problem that, in the existing big data ecosystem, calculation results are usually stored in a NoSQL database, which stores data by key and therefore has poor data throughput and poor batch read and write performance. The embodiments apply to computers, servers, computer clusters, and the like. An exemplary deployment on a computer cluster is a Hadoop system, which usually includes components such as HDFS, YARN, and Spark. Hadoop cloud services based on container technology or virtualization technology can also be used directly on the computer cluster.
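The contrast between keyed storage and column storage can be illustrated with a minimal Python sketch. It is not taken from the application: the entity keys, the label names, and the helper `to_column_family` are hypothetical, and it assumes that the mark column holds the entity identifiers.

```python
# Row-oriented layout: one record per entity key, as in a key-value store.
# All keys and label names below are made up for illustration.
rows = {
    "user_1": {"age_band": "18-24", "city": "Shenzhen"},
    "user_2": {"age_band": "25-34", "city": "Beijing"},
    "user_3": {"age_band": "18-24", "city": "Shanghai"},
}


def to_column_family(rows, labels):
    """Pivot keyed rows into one mark column plus one list per label column."""
    keys = sorted(rows)
    # Assumption: the mark column stores the entity identifiers.
    family = {"mark": keys}
    for label in labels:
        family[label] = [rows[key].get(label) for key in keys]
    return family


family = to_column_family(rows, ["age_band", "city"])

# A batch read of one label touches a single list instead of every row.
age_bands = family["age_band"]
```

In the column layout, reading or replacing one label for all entities involves a single column, which is the kind of batch access that keyed row storage handles poorly.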

[0057] The method for storing label system data provided by the present application will be described in detail below using specific embodiments.

[0058] Figure 1 is a schematic flowchart of the stor...


Abstract

The invention provides a method and device for storing label system data. In the method, when a label computation task is completed, a column family file stored in column format in a working directory is obtained; the column family file includes one mark column and one or more label columns. According to the column family name of the column family file, the file is moved into the archive subdirectory of the archive directory that corresponds to that column family name. Because the column family file is stored in column format, the throughput of batch data reads and writes increases and data is easy to update. Because the result of the label computation task is stored directly in HDFS, data movement is reduced, data delay is minimized, and cost is lowered. Storing multiple versions of the column family file makes versions easy to manage; the versions of a column family file are ordered chronologically.
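The archiving step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: it uses a local directory as a stand-in for HDFS, takes the column family name from the file name, and names each archived version with a sortable timestamp string. The function names, the `.col` extension, and the directory layout are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_column_family_file(work_file, archive_dir, version):
    """Move a finished column family file into archive_dir/<column family name>/."""
    # Assumption: the column family name is the file name without its extension.
    family = work_file.stem
    target_dir = archive_dir / family
    target_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: each version is stored under a sortable timestamp string.
    target = target_dir / f"{version}{work_file.suffix}"
    shutil.move(str(work_file), str(target))
    return target


def list_versions(archive_dir, family):
    """Return the archived versions of one column family, oldest first."""
    return sorted(p.stem for p in (archive_dir / family).iterdir() if p.is_file())


# Usage: two task runs each leave a file in the working directory.
root = Path(tempfile.mkdtemp())
work, archive = root / "work", root / "archive"
work.mkdir()
for version in ("20200102T000000", "20200101T000000"):
    finished = work / "demographics.col"  # hypothetical column family file
    finished.write_bytes(b"placeholder")
    archive_column_family_file(finished, archive, version)

versions = list_versions(archive, "demographics")
```

Keeping one subdirectory per column family and one file per version means that a new computation result is published with a single move, and older versions remain available in time order.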

Description

Technical field

[0001] The present application relates to the field of data processing, and in particular to a method and device for storing label system data.

Background technique

[0002] In the era of big data, more and more data is stored in the form of label systems in order to facilitate data analysis and information mining. A user label system is a common application: user labels can be used to analyze and profile users accurately and quickly.

[0003] Data storage and analysis in a label system are mostly implemented with the technical components of the big data ecosystem (Hadoop). For example, data is usually calculated in batches with a calculation engine such as Spark or MapReduce, and the calculation results are then stored in a non-relational (NoSQL) database, such as a key-value database. In the storage process, the entity identifier is usually used as the primary key of the row, and the label is stored in the NoSQL database in th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/22, G06F16/27
CPC: G06F16/221, G06F16/27
Inventor: 郝铸
Owner: HUAWEI MACHINERY