
Self-service data labeling platform and self-service data labeling method based on big data technology

The invention applies big data technology and data labeling technology to a self-service data labeling platform. It addresses the inability of existing platforms to label arbitrary business objects and their low data processing capacity, with the effect of improving data processing capacity, lowering the technical threshold, and improving flexibility.

Pending Publication Date: 2020-04-03
广州云徙科技有限公司

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the defects of existing data labeling platforms, namely low data processing capability, the inability to label arbitrary business objects, and high technical requirements, and to provide a self-service data labeling platform based on big data technology. The platform allows the calculation process of labels to be defined entirely visually on the interface, and uses big data technologies such as Spark, Hive and HBase to calculate the label data required by complex business needs over PB-level data, further providing a data basis for user grouping and user labels.



Examples


Embodiment 1

[0037] Embodiment 1 of the present invention proposes a self-service data labeling platform based on big data technology. The business system is accessed by many users and takes multiple product forms, such as websites, apps and mini programs. Each product module and each product end generates a large amount of business data and behavioral data. The self-service data labeling platform includes:

[0038] The metadata definition unit is configured to: divide the data of a marking object into attribute data, business data and behavior data through the definition of metadata, and define the relationship between each kind of data and the marking object;

[0039] The label definition unit is configured to: define labels for the data, including fact labels, model labels and subjective labels, wherein the fact labels and model labels are generated by defining calculation rules on the interface;

[0040] The data acquisition unit is configured to: collect the data of the business system ...
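The metadata and label definitions described above can be pictured as a small data model. The following is a minimal sketch only, not the patent's implementation: the class names, field names and the example tables (`member_profile`, `orders`, `page_views`) are all hypothetical, and treating subjective labels as the only kind that needs no calculation rule is an assumption drawn from the statement that fact and model labels are generated from rules.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DataKind(Enum):
    """The three data categories named in the metadata definition unit."""
    ATTRIBUTE = "attribute"
    BUSINESS = "business"
    BEHAVIOR = "behavior"


class LabelKind(Enum):
    """The three label types named in the label definition unit."""
    FACT = "fact"
    MODEL = "model"
    SUBJECTIVE = "subjective"


@dataclass(frozen=True)
class DataSource:
    """One table of data and its relationship to the marking object (hypothetical shape)."""
    table: str
    kind: DataKind
    join_key: str  # column linking each row back to the marking object


@dataclass
class MarkingObject:
    """A business object to be labeled, for example a member or a store."""
    name: str
    id_column: str
    sources: List[DataSource] = field(default_factory=list)

    def sources_of(self, kind: DataKind) -> List[DataSource]:
        """Return the registered data sources of one category."""
        return [s for s in self.sources if s.kind is kind]


@dataclass(frozen=True)
class LabelDefinition:
    """A label that can be attached to a marking object."""
    name: str
    kind: LabelKind
    rule: Optional[str] = None  # calculation rule text; assumed unused for subjective labels

    def __post_init__(self) -> None:
        # Assumption: fact and model labels must carry a calculation rule.
        needs_rule = self.kind in (LabelKind.FACT, LabelKind.MODEL)
        if needs_rule and not self.rule:
            raise ValueError(f"{self.kind.value} label {self.name!r} needs a calculation rule")


# Hypothetical example: a "member" object with one source per data category.
member = MarkingObject("member", "member_id", [
    DataSource("member_profile", DataKind.ATTRIBUTE, "member_id"),
    DataSource("orders", DataKind.BUSINESS, "member_id"),
    DataSource("page_views", DataKind.BEHAVIOR, "member_id"),
])
```

Keeping the object-to-data relationship in metadata rather than in code is what would let such a platform label arbitrary business objects: a new object type only needs new `MarkingObject` and `DataSource` entries.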

Embodiment 2

[0048] Embodiment 2 of the present invention proposes a self-service data labeling method based on big data technology, comprising the following steps:

[0049] The metadata definition step is to divide the data of a marking object into attribute data, business data and behavior data through the definition of metadata, and to define the relationship between each kind of data and the marking object;

[0050] The label definition step is to define labels for the data, including fact labels, model labels and subjective labels; wherein, the fact labels and model labels are generated by defining calculation rules on the interface;

[0051] In the data collection step, the data of the business system, including object attribute data, behavior data, associated data, etc., is collected through DataX and imported into the Hive data warehouse;

[0052] In the basic label calculation step, the label configuration rules are read and calculation logic, including SparkSQL and Spark RDD code, is generated automatically,...
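To illustrate how calculation logic might be generated from a label configuration rule, the sketch below renders a rule as a SQL string. It is an assumption-laden illustration rather than the patent's generator: the `FactLabelRule` fields, the allowed aggregates and operators, and the `orders` table are hypothetical, and the output is plain SQL text of the kind that could be passed to SparkSQL, not Spark RDD code.

```python
from dataclasses import dataclass

ALLOWED_AGGREGATES = {"COUNT", "SUM", "MAX", "MIN", "AVG"}
ALLOWED_OPERATORS = {">", ">=", "<", "<=", "=", "<>"}


@dataclass(frozen=True)
class FactLabelRule:
    """Hypothetical label configuration, as it might be saved from the interface."""
    label: str        # label name, e.g. "high_value"
    table: str        # table holding the source data
    id_column: str    # column identifying the marking object
    aggregate: str    # one of ALLOWED_AGGREGATES
    column: str       # column to aggregate
    operator: str     # one of ALLOWED_OPERATORS
    threshold: float  # value the aggregate is compared against


def _identifier(name: str) -> str:
    """Accept only plain identifiers, because names are spliced into the SQL text."""
    if not name.isidentifier():
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def to_sql(rule: FactLabelRule) -> str:
    """Render a rule as a SELECT returning the ids of objects that earn the label."""
    aggregate = rule.aggregate.upper()
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {rule.aggregate!r}")
    if rule.operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {rule.operator!r}")
    id_col = _identifier(rule.id_column)
    return (
        f"SELECT {id_col}, '{_identifier(rule.label)}' AS label "
        f"FROM {_identifier(rule.table)} "
        f"GROUP BY {id_col} "
        f"HAVING {aggregate}({_identifier(rule.column)}) {rule.operator} {float(rule.threshold)}"
    )


# Hypothetical rule: members whose total order amount is at least 1000.
rule = FactLabelRule("high_value", "orders", "member_id", "SUM", "amount", ">=", 1000)
sql = to_sql(rule)
```

Restricting aggregates, operators and identifiers to known-safe values is one way to let non-technical users define rules on an interface without exposing the warehouse to arbitrary SQL.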



Abstract

The invention discloses a self-service data labeling platform and a self-service data labeling method based on big data technology. The method comprises: a metadata definition step of dividing the data of a marking object into attribute data, business data and behavior data; a label definition step of defining labels for the data; a data acquisition step of acquiring data of the business system and importing the data into a data warehouse; a basic label calculation step; a combined label calculation step of calculating a combined label and writing the combined label into HBase; and an object grouping calculation step of automatically generating calculation code according to the configuration, calculating a crowd selection result in real time, and writing the result into HBase. According to the method, the calculation process of a label is defined on the interface in a completely visual mode, the label data required by complex business needs is calculated over PB-level data through big data technology, and a data basis is further provided for user grouping and user labels.
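The combined label and object grouping steps can be thought of as set operations over the results of basic labels. The sketch below is a simplified illustration under stated assumptions: the nested-tuple condition format, the operator names (`and`, `or`, `and_not`) and the label names are hypothetical, and a plain dictionary stands in for the HBase table that the abstract says results are written to.

```python
from typing import Dict, Set, Tuple, Union

# A condition is a basic label name or a nested (operator, left, right) tuple.
Condition = Union[str, Tuple[str, "Condition", "Condition"]]


def evaluate(condition: Condition, ids_by_label: Dict[str, Set[str]]) -> Set[str]:
    """Return the object ids that satisfy a combined-label condition."""
    if isinstance(condition, str):
        # Basic label: look up its already computed id set (empty if unknown).
        return set(ids_by_label.get(condition, set()))
    operator, left, right = condition
    left_ids = evaluate(left, ids_by_label)
    right_ids = evaluate(right, ids_by_label)
    if operator == "and":
        return left_ids & right_ids
    if operator == "or":
        return left_ids | right_ids
    if operator == "and_not":  # in left but not in right
        return left_ids - right_ids
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical basic label results, keyed by label name.
basic = {
    "high_value": {"m1", "m2", "m3"},
    "active_30d": {"m2", "m3", "m4"},
    "churn_risk": {"m3"},
}

# Combined label: high-value and recently active members who are not a churn risk.
combined = ("and_not", ("and", "high_value", "active_30d"), "churn_risk")

store: Dict[str, Set[str]] = {}  # stand-in for an HBase table keyed by label name
store["loyal_high_value"] = evaluate(combined, basic)
```

Because a group is just another condition over existing labels, the same evaluator could serve both combined labels and crowd selection; at PB scale the set operations would be expressed in generated Spark code rather than in-memory Python sets.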

Description

Technical field

[0001] The invention relates to the technical field of big data processing, in particular to a self-service data labeling platform based on big data technology.

Background technique

[0002] Big data marketing is a marketing method applied to the Internet advertising industry based on a large amount of data from various social platforms and relying on big data technology. The core of big data marketing is to allow online advertisements to be delivered to the right people at the right time, through the right carrier, and in the right way. Big data marketing is derived from the Internet industry and acts on the Internet industry. Relying on the big data collection of multiple platforms, as well as the analysis and prediction capabilities of big data technology, it can make advertising more accurate and effective, and bring a higher return on investment to brands or enterprises.

[0003] Accomplishing precision marketing emphasizes personalization, and it is n...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2457, G06F16/248, G06Q30/02, G06F16/25
CPC: G06Q30/0201, G06F16/2457, G06F16/248, G06F16/254
Inventors: 李元佳, 陈新伟, 郭逸重
Owner: 广州云徙科技有限公司