A framework design method for a machine learning modeling platform based on the R language

A machine learning and design-method technology, applied in machine learning, instruments, calculations, etc. It addresses the problem that R operators cannot be computed in a distributed manner, and achieves efficient secondary sharing of models, enhanced support, and cluster load balancing.

Active Publication Date: 2020-11-10
成都优易数据有限公司

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide an R language-based machine learning modeling platform architecture design method that solves the technical problem that R operators cannot be computed in a distributed manner.

Examples

Specific Embodiment 1

[0035] A method for designing a machine learning modeling platform architecture based on R language, comprising the following steps:

[0036] Step 1: The user adds various types of R operators to the modeling platform using the inherent packaging format of R operators, and classifies them by function. The modeling platform is a WEB application modeling platform; it sets up a classification directory according to the classification results and manages and displays that directory visually. The user freely drags n R operators from the classification directory into the workflow editing area and connects them according to a given logical relationship, which completes the construction of the machine learning operator. Data flows through the n R operators from the first operator to the nth operator.

[0037] Step 2: Write a Shell script file for each R operator to receive the configuration parameters of the R operator in the modeling platform, complete th...
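Step 1 above can be pictured with a minimal sketch, written here in Python for illustration only. The `Operator` class, the category names, and the three toy operators are hypothetical and do not come from the patent; the `run` callable merely stands in for invoking a real R operator. The sketch shows the two ideas in Step 1: grouping operators into a classification directory by function, and passing data from the first operator to the nth.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Operator:
    """Hypothetical stand-in for an R operator registered on the platform."""
    name: str
    category: str
    run: Callable[[list], list]  # placeholder for calling the real R operator


def build_catalog(operators: List[Operator]) -> Dict[str, List[str]]:
    """Group operator names by function (the classification directory)."""
    catalog: Dict[str, List[str]] = defaultdict(list)
    for op in operators:
        catalog[op.category].append(op.name)
    return dict(catalog)


def run_workflow(workflow: List[Operator], data: list) -> list:
    """Pass the data from the first operator to the nth, in order."""
    for op in workflow:
        data = op.run(data)
    return data


# Three toy operators (assumed for illustration, not taken from the patent).
operators = [
    Operator("drop_missing", "preprocessing",
             lambda rows: [r for r in rows if r is not None]),
    Operator("scale", "preprocessing",
             lambda rows: [r / 10 for r in rows]),
    Operator("mean_model", "modeling",
             lambda rows: [sum(rows) / len(rows)]),
]

catalog = build_catalog(operators)
result = run_workflow(operators, [10, None, 30])
```

A linear list is enough here because the embodiment describes a single chain of operators; a workflow with branches would need a graph structure instead.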

Specific Embodiment 2

[0041] Step 1: The user adds various types of R operators to the WEB application modeling platform using the inherent packaging format of R operators, and classifies them by function. The WEB application modeling platform sets up a classification directory according to the classification results and manages and displays that directory visually. The user freely drags three R operators from the classification directory (operator A, operator B, and operator C) into the workflow editing area and connects them according to a given logical relationship, which completes the construction of the machine learning operator. Data flows from operator A to operator B and then to operator C.

[0042] Step 2: Use the Oozie component to dynamically assign the three operators to different Hadoop cluster computing nodes. Operator A is assigned to computing node A, operator B to computing node B, and operator C to computing node C.

[0043] Step 3...
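The assignment in Step 2 can be illustrated with a small Python sketch. This is not Oozie's API and not necessarily how the patented platform chooses nodes: the function `assign_operators` and its least-loaded rule are assumptions made only to show how spreading operators across nodes yields the one-operator-per-node layout described above.

```python
from typing import Dict, List


def assign_operators(operators: List[str],
                     node_load: Dict[str, int]) -> Dict[str, str]:
    """Assign each operator to the currently least-loaded node.

    Hypothetical rule for illustration; ties are broken by node name.
    """
    load = dict(node_load)  # copy so the caller's mapping is not modified
    assignment: Dict[str, str] = {}
    for op in operators:
        # sorted() fixes the tie-break order; min() keeps the first minimum.
        node = min(sorted(load), key=lambda n: load[n])
        assignment[op] = node
        load[node] += 1
    return assignment


assignment = assign_operators(
    ["operator A", "operator B", "operator C"],
    {"node A": 0, "node B": 0, "node C": 0},
)
```

With three idle nodes, each operator lands on its own node, matching the embodiment; with uneven starting loads the same rule steers new operators toward the idle nodes.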

Abstract

The invention discloses a machine learning modeling platform architecture design method based on the R language. A visual machine learning operator based on the R language is built. The R operators in the machine learning operator are allocated to different Hadoop cluster computing nodes by an Oozie component. The Hadoop cluster computing nodes call data managed by an HDFS component and perform the calculation according to the logical relationship of the machine learning operator to obtain its final result. The method realizes distributed computing of the visual machine learning operator based on the R language. The modeling platform has rich machine learning operators based on the R language and an efficient and flexible programming system. The R operators are adaptively scheduled to different Hadoop cluster computing nodes by the Oozie process control component. Cluster load balancing and multi-user, high-capacity concurrent modeling computation are realized.

Description

Technical field

[0001] The invention belongs to the field of big data analysis and processing, and in particular relates to an R language-based machine learning modeling platform architecture design method for distributed computing of machine learning operators.

Background technique

[0002] The big data analysis and processing platform is based on a distributed computing architecture and machine learning operators, and is used to solve data mining modeling problems at large data scale. In actual use of the platform, however, small-scale input data and modeling requirements turn out to be the main form of usage. The distributed processing architecture has no obvious efficiency advantage when processing small input data, and it suffers from an obvious data interaction delay; at the same time, limited by the number of machine learning operators that currently support distributed computing, the modeling capability of the platform unde...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06N20/00
CPC: G06N20/00
Inventor: 竹登虎勇萌哲
Owner: 成都优易数据有限公司