User data preprocessing system for link prediction relation recommendation
A data preprocessing and link prediction technology, applied to database management systems, relational databases, electrical digital data processing, and related fields. It addresses problems such as improper selection of field matching algorithms, poor versatility of data preprocessing systems, and difficulty of expansion.
Examples
Embodiment 1
[0047] The user builds a data mining case and submits a task. After the system receives the task case and either receives the data file or connects to the database location, it decomposes the task and obtains the sequence of subtasks. The discretization recommendation engine passes the file to the data analysis agent, which analyzes it to obtain the data file information for the task. After this information is submitted to the discretization recommendation agent, the intelligent control agent checks the discretization database data and performs rough set analysis to obtain the corresponding discretization decision rules.
[0048] Further, in the above technical solution, the link prediction relationship recommendation layer obtains the edge set of the weighted relationship network graph and divides it into a training set and a test set. Link prediction is performed on the training set to obtain a prediction result, and based on preset indicators, a preset index value is obtained...
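The split-and-score procedure in [0048] can be illustrated with a minimal sketch. The source does not name the similarity score or the evaluation indicator, so both are assumptions here: the score is a weighted common-neighbors sum, and the indicator is AUC (the chance that a held-out edge outscores a non-edge). All function names are hypothetical.

```python
import random


def split_edges(edges, test_fraction=0.2, seed=0):
    """Divide a list of (u, v, weight) edges into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = sorted(edges)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def build_adjacency(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) edges."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def weighted_common_neighbors(adj, u, v):
    """Assumed score: sum of edge weights to every neighbor shared by u and v."""
    common = set(adj.get(u, {})) & set(adj.get(v, {}))
    return sum(adj[u][z] + adj[v][z] for z in common)


def auc(train_adj, test_edges, non_edges):
    """Assumed indicator: fraction of (test edge, non-edge) pairs where the
    test edge scores higher; ties count as 0.5."""
    if not test_edges or not non_edges:
        raise ValueError("need at least one test edge and one non-edge")
    total = 0.0
    for u, v, _ in test_edges:
        s_pos = weighted_common_neighbors(train_adj, u, v)
        for x, y in non_edges:
            s_neg = weighted_common_neighbors(train_adj, x, y)
            if s_pos > s_neg:
                total += 1.0
            elif s_pos == s_neg:
                total += 0.5
    return total / (len(test_edges) * len(non_edges))
```

Scores are computed only from the training adjacency, so the test edges stay unseen until evaluation. Other similarity scores or indicators could be substituted without changing the overall flow.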
Embodiment 2
[0061] After the system parses the data configuration file, the data extraction module extracts data from the external data file, and the data transfer module stores the data as text. Each record carries a uniquely identifying ID, and its values are stored in the order of the fields in the configuration file, so that restoring the data only requires parsing the fields in the configuration file to identify which field each value in the record corresponds to. After extraction is complete, the extracted data is stored in the HDFS file system in the specified format for use by the subsequent preprocessing module. The data preprocessing module is divided into three parts: the first is data integrity, consistency, and validity detection; the second is similar duplicate data detection; the third is abnormal data detection. For data integrity, consistency, and validity detection, after obtaining the data to be processed from the HDFS file system, first de...
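The field-ordered text storage described in [0061] can be sketched as follows. The configuration format (one field name per line) and the tab delimiter are assumptions made for illustration; the source specifies neither, and all function names are hypothetical. The sketch writes to plain strings rather than to HDFS.

```python
FIELD_DELIMITER = "\t"  # assumed delimiter; not specified in the source


def parse_field_order(config_text):
    """Read field names from a hypothetical config listing one field per line."""
    return [line.strip() for line in config_text.splitlines() if line.strip()]


def record_to_line(record_id, record, fields):
    """Serialize a record as text: the ID first, then values in config field order."""
    values = [str(record.get(name, "")) for name in fields]
    return FIELD_DELIMITER.join([str(record_id)] + values)


def line_to_record(line, fields):
    """Restore a record by pairing each stored value with its config field."""
    parts = line.rstrip("\n").split(FIELD_DELIMITER)
    record_id, values = parts[0], parts[1:]
    if len(values) != len(fields):
        raise ValueError("value count does not match the configured fields")
    return record_id, dict(zip(fields, values))


# Usage: round-trip one record through its text form.
fields = parse_field_order("name\nemail\ncity\n")
line = record_to_line("u001", {"name": "Ann", "email": "ann@example.com", "city": "Oslo"}, fields)
restored_id, restored = line_to_record(line, fields)
```

Because field names are never written into the stored text, the configuration file is the only source of the mapping from position to field. Values that themselves contain the delimiter would need escaping, which this sketch does not handle.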
Embodiment 3
[0069] After the data is processed by the data preprocessing module, its quality is improved. The task of the data storage module is to import this data into the specified database in batches. The storage medium and storage method are determined by the characteristics of each business type: data related to business logic is stored in an RDBMS, which enforces strong relational constraints, while large-scale data and data with weak structural requirements use distributed storage in an HBase database. The data storage module must therefore support storing data in both the RDBMS database and the HBase database.
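The storage choice in [0069] can be expressed as a small routing rule. This is a sketch under stated assumptions: the `Dataset` attributes, the row-count threshold for "large-scale", the default for data matching neither rule, and the batch size are all hypothetical, and no real database client is called.

```python
from dataclasses import dataclass

LARGE_SCALE_ROWS = 10_000_000  # hypothetical threshold for "large-scale" data


@dataclass
class Dataset:
    name: str
    business_logic_related: bool  # needs strong relational constraints
    strongly_structured: bool
    row_count: int


def choose_store(ds):
    """Return "rdbms" or "hbase" following the rule stated in [0069]."""
    if ds.business_logic_related:
        return "rdbms"
    if not ds.strongly_structured or ds.row_count >= LARGE_SCALE_ROWS:
        return "hbase"
    return "rdbms"  # assumption: default for data that matches neither rule


def batches(rows, batch_size=1000):
    """Yield rows in fixed-size batches for batch import (size is an assumption)."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]
```

Keeping the routing decision separate from the import code lets both storage back ends sit behind the same batch loop.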