Self-service data labeling platform and self-service data labeling method based on big data technology
The invention applies big data technology and data labeling technology to a self-service data labeling platform. It addresses the inability of existing systems to label arbitrary business objects and their low data processing capacity, with the effect of improving data processing capacity, lowering the technical threshold, and improving flexibility.
Examples
Embodiment 1
[0037] Embodiment 1 of the present invention proposes a self-service data labeling platform based on big data technology. The business system is accessed by many users and takes multiple product forms, such as websites, apps, and mini programs. Each product module and each client generates a large amount of business data and behavioral data. The self-service data labeling platform includes:
[0038] The metadata definition unit is configured to: use metadata definitions to classify the data of a labeling object into attribute data, business data and behavior data, and to define the relationship between each type of data and the labeling object;
[0039] The label definition unit is configured to: define labels for the data, including fact labels, model labels and subjective labels, wherein the fact labels and model labels are generated by defining calculation rules in the interface;
[0040] The data acquisition unit is configured to: collect the data of the business system ...
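The metadata definition and label definition units described above can be pictured as plain data structures. The following is a minimal sketch only, not the platform's actual schema: every class and field name (`DataKind`, `LabeledObject`, `DataSource`, `LabelDefinition`, `join_key`, `rule`) is a hypothetical name chosen for illustration, and the sketch assumes that subjective labels are assigned by hand and therefore carry no calculation rule.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DataKind(Enum):
    """The three data categories named in the metadata definition unit."""
    ATTRIBUTE = "attribute"
    BUSINESS = "business"
    BEHAVIOR = "behavior"


class LabelKind(Enum):
    """The three label types named in the label definition unit."""
    FACT = "fact"
    MODEL = "model"
    SUBJECTIVE = "subjective"


@dataclass
class DataSource:
    """One table of data and how it relates to the labeling object (hypothetical)."""
    name: str
    kind: DataKind
    join_key: str  # column that links each row back to the labeling object


@dataclass
class LabeledObject:
    """A business object to be labeled, e.g. a user or a product (hypothetical)."""
    name: str
    id_column: str
    sources: List[DataSource] = field(default_factory=list)

    def sources_of(self, kind: DataKind) -> List[DataSource]:
        """Return the registered sources that belong to one data category."""
        return [s for s in self.sources if s.kind is kind]


@dataclass
class LabelDefinition:
    """A label plus the calculation rule that produces it, if any."""
    name: str
    kind: LabelKind
    rule: Optional[str] = None  # assumption: subjective labels have no rule

    def __post_init__(self) -> None:
        # Fact and model labels are rule-driven, so a rule is mandatory for them.
        if self.kind in (LabelKind.FACT, LabelKind.MODEL) and not self.rule:
            raise ValueError(f"{self.kind.value} label '{self.name}' needs a rule")


user = LabeledObject("user", "user_id", [
    DataSource("user_profile", DataKind.ATTRIBUTE, "user_id"),
    DataSource("orders", DataKind.BUSINESS, "user_id"),
    DataSource("click_events", DataKind.BEHAVIOR, "user_id"),
])
frequent_buyer = LabelDefinition("frequent_buyer", LabelKind.FACT,
                                 rule="count(order_id) >= 5")
```

Keeping the relationship (`join_key`) on each data source rather than on the object is one possible design; it lets a new object type be registered without changing any code, which is in the spirit of labeling arbitrary business objects.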
Embodiment 2
[0048] Embodiment 2 of the present invention proposes a self-service data labeling method based on big data technology, comprising the following steps:
[0049] The metadata definition step uses metadata definitions to classify the data of a labeling object into attribute data, business data and behavior data, and to define the relationship between each type of data and the labeling object;
[0050] The label definition step defines labels for the data, including fact labels, model labels and subjective labels, wherein the fact labels and model labels are generated by defining calculation rules in the interface;
[0051] The data collection step collects the data of the business system through DataX, including object attribute data, behavior data, associated data, etc., and imports it into the Hive data warehouse;
[0052] The basic label calculation step reads the label configuration rules and automatically generates calculation logic, including SparkSQL and Spark RDD code, etc.,...
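The basic label calculation step turns a label configuration rule into calculation logic such as SparkSQL. The sketch below illustrates one way that translation could look; it is not the patented generator. The rule format (`LabelRule` and its fields), the function name `rule_to_sql`, and the example table `dw.orders` are all assumptions made for illustration, and only a single "aggregate compared with a threshold" rule shape is handled.

```python
import re
from dataclasses import dataclass

# Whitelists keep user-entered rule fields from being pasted into SQL unchecked.
_AGGREGATES = {"count": "COUNT", "sum": "SUM", "max": "MAX", "min": "MIN", "avg": "AVG"}
_OPERATORS = {">", ">=", "<", "<=", "=", "!="}
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


@dataclass(frozen=True)
class LabelRule:
    """Hypothetical label configuration rule, as it might be saved from the interface."""
    label: str        # label name to attach to matching objects
    table: str        # Hive table holding the collected data
    id_column: str    # column identifying the labeling object
    aggregate: str    # one of the keys of _AGGREGATES
    column: str       # column the aggregate is computed over
    operator: str     # one of _OPERATORS
    threshold: float  # value the aggregate is compared against


def rule_to_sql(rule: LabelRule) -> str:
    """Render a rule as a SQL string that selects the objects earning the label."""
    for name in (rule.label, rule.table, rule.id_column, rule.column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    if rule.aggregate not in _AGGREGATES:
        raise ValueError(f"unsupported aggregate: {rule.aggregate!r}")
    if rule.operator not in _OPERATORS:
        raise ValueError(f"unsupported operator: {rule.operator!r}")
    if not isinstance(rule.threshold, (int, float)):
        raise ValueError("threshold must be numeric")
    agg = _AGGREGATES[rule.aggregate]
    return (
        f"SELECT {rule.id_column}, '{rule.label}' AS label "
        f"FROM {rule.table} "
        f"GROUP BY {rule.id_column} "
        f"HAVING {agg}({rule.column}) {rule.operator} {rule.threshold}"
    )


rule = LabelRule("frequent_buyer", "dw.orders", "user_id", "count", "order_id", ">=", 5)
sql = rule_to_sql(rule)
```

In a PySpark session the generated string could then be handed to `spark.sql(sql)` to compute the label; that execution step is left out here so the sketch runs without a Spark installation.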