Cloud computing platform and scheduling and data analysis method and system thereof

A data analysis system and cloud computing platform technology, applied in the network field. It addresses low data reading and parsing efficiency, the lack of distributed data capture, and the inability to handle data with high real-time processing requirements. Its effects are lower data interaction pressure, guaranteed load balance, and improved application processing performance.

Pending Publication Date: 2020-09-25
中联云港数据科技股份有限公司

AI Technical Summary

Problems solved by technology

[0003] The analysis above shows the following problems and defects in the existing technology: existing cloud computing platforms do not implement distributed data capture and run only on a single machine or a simple homogeneous cluster, so data reading and analysis are inefficient. In addition, existing cloud computing platforms cannot handle data with high-throughput and high real-time requirements.



Examples


Embodiment 1

[0087] The cloud computing platform and its scheduling and data analysis methods provided by the embodiments of the present invention are shown in Figure 1. As a preferred embodiment, as shown in Figure 4, the method for dividing and clustering the collected website datasets and/or webpage datasets includes:

[0088] S201, using the fuzzy C-means clustering algorithm to divide the collected data set into subcategories, and defining a cluster center for each subcategory.

[0089] S202, using particle swarm optimization to find an optimal clustering center.
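Step S201 can be illustrated with a minimal fuzzy C-means sketch. It is not the patent's implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random membership initialization, and the stopping tolerance are all assumptions chosen for illustration, and the data points are plain coordinate tuples rather than website or webpage features.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative fuzzy C-means: returns (cluster centers, membership matrix).

    All parameter defaults are assumptions; the source does not specify them.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([x / total for x in row])

    centers = []
    for _ in range(n_iter):
        # Each center is the mean of all points weighted by membership**m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / wsum
                 for d in range(dim)]
            )

        # Memberships are recomputed from relative distances to the centers.
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            for j in range(n_clusters):
                new = 1.0 / sum(
                    (dists[j] / dk) ** (2.0 / (m - 1.0)) for dk in dists
                )
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new
        if shift < tol:
            break
    return centers, u
```

Each row of the returned membership matrix sums to 1, so a point can belong partly to several subcategories; assigning it to the column with the largest membership gives a hard partition if one is needed.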

[0090] The method for finding the optimal cluster center by particle swarm optimization, as provided by the embodiment of the present invention, is as follows:

[0091] Let the category set of the data division be $C = \{c_1, c_2, \ldots, c_l\}$ and the corresponding set of cluster centers be $V = \{v_1, v_2, \ldots, v_l\}$; then the fitness function of particle s...
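The general shape of a particle swarm search over cluster centers can be sketched as follows. The fitness function in the source text is cut off, so this sketch substitutes an assumed one: the sum of squared distances from each point to its nearest center (lower is better). The names `sse` and `pso_centers`, the inertia weight, and the acceleration coefficients are likewise hypothetical and not taken from the patent.

```python
import random


def sse(centers, points):
    """Assumed fitness (not the patent's): squared distance to nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )


def pso_centers(points, n_clusters, n_particles=20, n_iter=100,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Illustrative PSO: each particle encodes one full set of cluster centers."""
    rng = random.Random(seed)
    dim = len(points[0])
    size = n_clusters * dim

    def decode(x):
        return [x[k * dim:(k + 1) * dim] for k in range(n_clusters)]

    # Start every particle at randomly chosen data points, with zero velocity.
    pos = []
    for _ in range(n_particles):
        x = []
        for p in rng.sample(points, n_clusters):
            x.extend(p)
        pos.append(x)
    vel = [[0.0] * size for _ in range(n_particles)]

    best_pos = [x[:] for x in pos]
    best_fit = [sse(decode(x), points) for x in pos]
    g = min(range(n_particles), key=best_fit.__getitem__)
    gbest, gfit = best_pos[g][:], best_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Standard update: inertia + pull to personal and global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = sse(decode(pos[i]), points)
            if fit < best_fit[i]:
                best_fit[i], best_pos[i] = fit, pos[i][:]
                if fit < gfit:
                    gfit, gbest = fit, pos[i][:]
    return decode(gbest), gfit
```

In a combined pipeline, the centers produced by fuzzy C-means could seed some of the particles instead of random data points; that coupling is a design choice not described in the visible text.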

Embodiment 2

[0095] The cloud computing platform and its scheduling and data analysis methods provided by the embodiments of the present invention are shown in Figure 1. As a preferred embodiment, as shown in Figure 5, the method for scheduling and processing website datasets and/or webpage datasets includes:

[0096] S301. Receive a website data set and/or a web page data set to be processed and, based on a read command, output the stored data set according to the state of each buffer area in the first double buffer.

[0097] S302. Use the Sqoop program to extract data from the database into Hadoop, and use SparkSQL to read the extracted data for calculation.

[0098] S303. Format and preprocess the calculated data set and, based on the read command, output the stored formatted and preprocessed data set according to the state of each buffer area in the second double buffer.

[0099] S304. Perform ...
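The double-buffer behavior in S301 and S303 can be sketched as a small state machine. This is an assumption-laden illustration: the source only says that output depends on "the state of each buffer area", so the two states (`EMPTY`, `FULL`), the class name `DoubleBuffer`, and the first-in-first-out read order are hypothetical choices.

```python
from enum import Enum


class State(Enum):
    EMPTY = "empty"
    FULL = "full"


class DoubleBuffer:
    """Illustrative two-area buffer; state names and ordering are assumptions."""

    def __init__(self):
        self._areas = [None, None]
        self._states = [State.EMPTY, State.EMPTY]
        self._next_read = 0  # index of the area the next read command checks

    def write(self, dataset):
        """Store a data set in an empty area; return False if both are full."""
        for offset in range(2):
            i = (self._next_read + offset) % 2
            if self._states[i] is State.EMPTY:
                self._areas[i] = dataset
                self._states[i] = State.FULL
                return True
        return False

    def read(self):
        """Handle a read command: output the oldest stored data set, or None."""
        i = self._next_read
        if self._states[i] is not State.FULL:
            return None
        dataset, self._areas[i] = self._areas[i], None
        self._states[i] = State.EMPTY
        self._next_read = 1 - i
        return dataset
```

While one area is being read, the other can accept the next incoming data set, which is the usual reason for using two areas instead of one.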

Embodiment 3

[0102] The cloud computing platform and its scheduling and data analysis methods provided by the embodiments of the present invention are shown in Figure 1. As a preferred embodiment, the data set security detection method provided by the embodiment of the present invention includes:

[0103] (1) Through the security detection module, use the security detection program to receive the clustered website data set and/or web page data set; scan and identify sensitive data in the acquired data set; analyze the data set and extract its source address and service type identifier;

[0104] (2) Obtaining a TCP connection record corresponding to the service type identifier of the data set;

[0105] (3) extracting the TCP connection state corresponding to the source address according to the obtained TCP connection record;

[0106] (4) Judging whether the TCP connection state is normal, if so, then judging that the data set is a saf...
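Steps (2) through (4) amount to a lookup followed by a state check, which can be sketched as below. The record layout (`TcpRecord`), the mapping from service type identifier to records, and the choice of `ESTABLISHED` as the only "normal" state are assumptions for illustration; the source does not define what counts as normal, and its handling of the abnormal case is cut off, so the sketch only reports whether the state is normal.

```python
from dataclasses import dataclass

# Assumption: only ESTABLISHED counts as a normal connection state.
NORMAL_STATES = {"ESTABLISHED"}


@dataclass(frozen=True)
class TcpRecord:
    """Hypothetical TCP connection record."""
    source_address: str
    state: str


def connection_state(records_by_service, service_type, source_address):
    """Steps (2)-(3): find the records for the service type, then the state
    of the connection that matches the source address (None if absent)."""
    for record in records_by_service.get(service_type, []):
        if record.source_address == source_address:
            return record.state
    return None


def is_connection_normal(records_by_service, service_type, source_address):
    """Step (4): judge whether the extracted TCP connection state is normal."""
    state = connection_state(records_by_service, service_type, source_address)
    return state in NORMAL_STATES
```

For example, with `records = {"web": [TcpRecord("10.0.0.5", "ESTABLISHED")]}`, calling `is_connection_normal(records, "web", "10.0.0.5")` reports a normal connection, while an unknown source address or service type does not.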


Abstract

The invention belongs to the technical field of networks and discloses a cloud computing platform and a scheduling and data analysis method and system thereof. The system comprises a data acquisition module, a data clustering module, a security detection module, a data transmission module, a cloud computing platform, a data scheduling module, a data analysis module, a cloud storage module and a display module. The data acquisition module performs webpage capture and/or website capture and supports capture of whole-network data, which gives the system very broad applicability, reduces maintenance and operation costs, and improves the reliability of effective data capture. The security detection module performs security detection on the acquired data set, which ensures the security of the cloud computing platform. The data storage mode and processing mode of the cloud computing center are improved, which improves the reliability of data storage in the cloud computing platform, ensures load balance between cloud nodes, and effectively improves the application processing performance of the cloud computing platform.

Description

Technical field

[0001] The invention belongs to the field of network technology, and in particular relates to a cloud computing platform and its scheduling and data analysis method and system.

Background

[0002] At present, cloud computing is an emerging business model and the product of the integration and development of technologies such as distributed computing, parallel computing, grid computing, virtualization, and load balancing. Cloud computing services depend mainly on cloud data centers. As cloud computing technology develops, the requirements placed on cloud data centers are becoming more and more complex. A cloud data center consists mainly of a very large number of servers and network devices. These network devices and servers are highly heterogeneous, and users have complex needs and expect high-quality services and more reasonable dynamic resource management. Therefore, the cloud data center is subject to higher requirem...


Application Information

IPC(8): G06F16/951, G06K9/62, H04L29/06, H04L29/08
CPC: G06F16/951, H04L67/10, H04L69/163, H04L67/60, G06F18/2321
Inventor: 周康, 董岩, 闫强, 石凯, 武铁军
Owner: 中联云港数据科技股份有限公司