Method and system for integrated machine learning convenient for data analysis personnel to use

A data analysis and machine learning technology, applied in the field of machine learning, that addresses problems such as a cumbersome modeling process, data cleaning code that is difficult to reuse, and models that diverge from business expectations, thereby reducing technical cost and lowering the technical threshold for modeling.

Inactive Publication Date: 2018-08-03
北京至信普林科技有限公司
Cites: 0; cited by: 45

AI Technical Summary

Problems solved by technology

Data cleaning code is difficult to reuse across projects.
The modeling process is cumbersome, in particular the time-consuming "modeling-evaluation-parameter tuning-evaluation" cycle.
The model fusion process is complex and tedious.
Traditional machine learning developers spend only 20% of their time understanding the business and 80% of their effort on data cleaning, model tuning, and other modeling tasks, so the final model often differs greatly from business expectations.




Embodiment Construction

[0065] Specific embodiments of the present invention are described in detail below with reference to the drawings. These embodiments are illustrative only and do not limit the scope or implementation principles of the invention. The scope of protection remains defined by the claims, including obvious variations or modifications made on that basis.

[0066] The invention is built on the Spark framework as its base development platform and reads and processes structured, semi-structured, and unstructured data. The Spark platform processes the data and outputs the results as data files, DataFrames, and Word reports.
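As a rough illustration of handling the three kinds of input named in [0066], the sketch below normalizes each into a list of records. It deliberately uses only the Python standard library as a stand-in for Spark, and the function names (`read_structured`, `read_semi_structured`, `read_unstructured`) and sample inputs are hypothetical, not taken from the patent.

```python
import csv
import io
import json


def read_structured(text):
    """Parse CSV text (structured data) into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def read_semi_structured(text):
    """Parse JSON Lines text (semi-structured data) into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def read_unstructured(text):
    """Wrap free text (unstructured data) as one record per non-empty line."""
    return [{"text": line.strip()} for line in text.splitlines() if line.strip()]


# Hypothetical sample inputs, one per data kind.
csv_rows = read_structured("id,amount\n1,10.5\n2,7.0\n")
json_rows = read_semi_structured('{"id": 3, "tags": ["a", "b"]}\n{"id": 4}\n')
text_rows = read_unstructured("first note\n\nsecond note\n")
```

Bringing every source into the same list-of-records shape is what lets later stages treat the data uniformly; in the described system that role would presumably be played by a Spark DataFrame.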

[0067] As shown in Figure 1, the system comprises four major functional areas: data processing, feature processing, model processing, and natural language processing, including data exploration, data...


Abstract

The invention relates to the technical field of machine learning, and in particular to a method and a system for integrated machine learning that are convenient for data analysis personnel to use. The method comprises the following steps: (1) data exploration; (2) data cleaning; (3) feature extraction; (4) feature selection; (5) sampling; (6) model training; (7) model optimization; (8) model combination; (9) model interpretability; (10) natural language processing. The system comprises a data processing module, a feature processing module, a model processing module, and a natural language processing module. The method and the system provide a unified algorithm modeling process for machine learning engineers, students, teachers, and machine learning enthusiasts, so that they can complete the modeling process with 20% of their effort and devote the remaining 80% to understanding the business and applying the model, and thereby understand the business in depth and better meet business personnel's requirements for models.
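The idea of a unified modeling process made of ordered steps can be sketched as a small pipeline that runs named stages in sequence. This is only a minimal illustration under stated assumptions: the `Pipeline` class, the three toy stages, and the sample rows are hypothetical, and only three of the ten listed steps are represented.

```python
from typing import Callable, List, Tuple

Stage = Callable[[dict], dict]


class Pipeline:
    """Runs named stages in the order they were added (hypothetical helper)."""

    def __init__(self) -> None:
        self.stages: List[Tuple[str, Stage]] = []

    def add(self, name: str, fn: Stage) -> "Pipeline":
        self.stages.append((name, fn))
        return self

    def run(self, state: dict) -> dict:
        for name, fn in self.stages:
            state = fn(state)
            # Record which stages have run, in order.
            state.setdefault("log", []).append(name)
        return state


def clean(state: dict) -> dict:
    """Data cleaning: drop rows whose target value is missing."""
    rows = [r for r in state["rows"] if r["y"] is not None]
    return {**state, "rows": rows}


def extract_features(state: dict) -> dict:
    """Feature extraction: add a toy squared feature."""
    rows = [{**r, "x2": r["x"] ** 2} for r in state["rows"]]
    return {**state, "rows": rows}


def train(state: dict) -> dict:
    """Model training: a toy 'model' that predicts the mean target."""
    ys = [r["y"] for r in state["rows"]]
    return {**state, "model": sum(ys) / len(ys)}


result = (
    Pipeline()
    .add("data cleaning", clean)
    .add("feature extraction", extract_features)
    .add("model training", train)
    .run({"rows": [{"x": 1, "y": 2.0}, {"x": 2, "y": None}, {"x": 3, "y": 4.0}]})
)
```

Because every stage takes and returns the same state dictionary, stages written for one project can be reused in another, which is the reuse problem the abstract sets out to address.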

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to an integrated machine learning method and system that are convenient for data analysts to use.

Background

[0002] Big data modeling has two industry pain points. First, data cleaning is heavy, mechanical work done under time pressure, and data cleaning code is difficult to reuse across projects. Second, the modeling process is cumbersome: the "modeling-evaluation-parameter tuning-evaluation" cycle is time-consuming, and the model fusion process is complex and tedious. Traditional machine learning developers spend only 20% of their time understanding the business and 80% of their effort on data cleaning, model tuning, and other modeling tasks, so the final model often differs considerably from business expectations. There is an urgent need to launch a product in the mark...


Application Information

Patent type & authority: Applications (China)
IPC(8): G06F17/30, G06K9/62, G06N99/00
CPC: G06F16/215, G06F16/313, G06N20/00, G06F18/214
Inventor: 李雪鹏, 翟昶, 于上上, 冯博, 毛智愚
Owner: 北京至信普林科技有限公司