Big data platform and algorithm model-based calculation method and system

A big data platform and algorithm model technology, applied in the computer field, that solves the problem that data processed by a big data platform cannot be used directly for algorithm model calculation, and that increases data processing speed.

Active Publication Date: 2016-11-09
CTRIP COMP TECH SHANGHAI

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is to provide a calculation method based on the big data platform and the algorithm model ...

Examples

Embodiment

[0045] A calculation method based on a big data platform and an algorithm model, as shown in Figures 1 and 2, includes:

[0046] Step 101: save the data in the Hive data warehouse of the big data platform. Specifically, source data is imported from a target database, which may be a multi-dimensional database of real transactions. Because the volume of source data is very large and its partition formats differ, the import also includes a step of processing the source data, which may include extracting, cleaning, splitting, re-partitioning, aggregating, counting, and calculating over the source data using HQL. The processed source data is recorded as the first data and stored in the HDFS file system 01, and an ordered task queue is formed. The HDFS file system 01 is connected to the Hadoop cluster 03.
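The processing described in Step 101 can be pictured as an ordered queue of HQL statements. The following Java sketch is only an illustration under stated assumptions: the table names (orders_raw, orders_clean, first_data), the columns, and the dt partition key are hypothetical and do not come from the patent, and the runner only records task names instead of submitting statements to Hive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Illustrative only: table, column, and partition names are hypothetical. */
class Step101Sketch {

    /** One unit of work: a label plus the HQL statement it would run. */
    static final class HqlTask {
        final String name;
        final String hql;

        HqlTask(String name, String hql) {
            this.name = name;
            this.hql = hql;
        }
    }

    /** Builds an ordered (first-in, first-out) queue of HQL tasks for one partition date. */
    static Queue<HqlTask> buildTaskQueue(String partitionDate) {
        Queue<HqlTask> queue = new ArrayDeque<>();
        // Clean and re-partition: drop rows without a user id, write to a dated partition.
        queue.add(new HqlTask("clean",
            "INSERT OVERWRITE TABLE orders_clean PARTITION (dt='" + partitionDate + "') "
          + "SELECT user_id, amount FROM orders_raw "
          + "WHERE dt='" + partitionDate + "' AND user_id IS NOT NULL"));
        // Aggregate and count: per-user order count and total amount (the "first data").
        queue.add(new HqlTask("aggregate",
            "INSERT OVERWRITE TABLE first_data PARTITION (dt='" + partitionDate + "') "
          + "SELECT user_id, COUNT(*) AS order_cnt, SUM(amount) AS total_amount "
          + "FROM orders_clean WHERE dt='" + partitionDate + "' GROUP BY user_id"));
        return queue;
    }

    /** Drains the queue in insertion order; a real runner would submit each statement to Hive. */
    static List<String> drain(Queue<HqlTask> queue) {
        List<String> executed = new ArrayList<>();
        while (!queue.isEmpty()) {
            HqlTask task = queue.poll();
            executed.add(task.name);
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(drain(buildTaskQueue("2016-01-01")));
    }
}
```

A first-in, first-out queue is used here because the aggregation reads the table that the cleaning step writes, so the order of execution matters.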

[0047] Step 102, put the script file of the algorithm...

Abstract

The invention discloses a calculation method and system based on a big data platform and an algorithm model. The calculation method comprises the following steps: storing data in a Hive data warehouse of a big data platform; placing a script file of an algorithm model on a server, wherein the script file is written in the R language; starting Rserve on the server and enabling remote access; and connecting the Hive data warehouse with Rserve, running the algorithm model on Rserve over the data stored in the Hive data warehouse, and importing the calculation result into the Hive data warehouse. The method and system overcome the shortcoming of the prior art that data processed by the big data platform cannot be used directly for algorithm model calculation. A communication channel between the Hive data warehouse and Rserve is established in Java, so that data processing and model calculation are joined seamlessly and the data processing speed is improved.
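The data flow in the abstract (read from Hive, compute on Rserve, write back to Hive, with Java in between) can be sketched as follows. This is a minimal illustration, not the patented implementation: the HiveGateway and RSession interfaces, the table and column names, and the script path are all hypothetical, and in-memory fakes stand in for the Hive and Rserve connections so that the sketch runs without a cluster.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: every interface and name here is hypothetical, not from the patent. */
class HiveRservePipelineSketch {

    /** Stand-in for a Hive connection (for example, one opened over JDBC). */
    interface HiveGateway {
        double[] readColumn(String table, String column);
        void writeColumn(String table, String column, double[] values);
    }

    /** Stand-in for a remote Rserve session that runs the model script. */
    interface RSession {
        double[] runModel(String scriptPath, double[] input);
    }

    /** Reads input from Hive, runs the R model remotely, and writes the result back to Hive. */
    static double[] run(HiveGateway hive, RSession r, String scriptPath) {
        double[] input = hive.readColumn("first_data", "total_amount");
        double[] result = r.runModel(scriptPath, input);
        hive.writeColumn("model_result", "score", result);
        return result;
    }

    /** In-memory fake so the sketch runs without a cluster. */
    static final class FakeHive implements HiveGateway {
        final List<String> writes = new ArrayList<>();

        public double[] readColumn(String table, String column) {
            return new double[] {1.0, 2.0, 3.0};
        }

        public void writeColumn(String table, String column, double[] values) {
            writes.add(table + "." + column + ":" + values.length);
        }
    }

    public static void main(String[] args) {
        FakeHive hive = new FakeHive();
        // Fake "model": doubles each input value, standing in for an R script.
        RSession r = (script, input) -> {
            double[] out = new double[input.length];
            for (int i = 0; i < input.length; i++) {
                out[i] = input[i] * 2;
            }
            return out;
        };
        double[] result = run(hive, r, "model.R");
        System.out.println(result.length + " rows written: " + hive.writes);
    }
}
```

Keeping both ends behind small interfaces means the orchestration in run can be exercised locally; a real deployment would presumably back them with a Hive JDBC connection and an Rserve client.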

Description

Technical field

[0001] The invention belongs to the field of computers, and in particular relates to a calculation method based on a big data platform and an algorithm model.

Background

[0002] With the development of the Internet, the amount of user data keeps growing, and the data is diverse and generated in real time, so counting and analyzing it has become highly valuable. The widely used Hadoop technology is very effective for storing and processing big data. However, in different application scenarios, data analysis requires more specialized algorithms and models for calculation. Only by combining the two can practical needs be met.

[0003] In today's Internet companies, big data processing and algorithm models are often handled independently. Because the two fields draw on different professional backgrounds, they use different software tools: data processing often uses Hive (a data warehouse tool based on Hadoop), HBase (a distributed, column-oriented ...

Application Information

IPC(8): G06F9/30, G06F17/30
CPC: G06F9/30007, G06F16/254
Inventors: 张露瑶, 陈榕, 李腾龙
Owner CTRIP COMP TECH SHANGHAI