Cluster server fault test method and a related device in a machine learning system

Applies cluster server and machine learning technology to the field of cluster server failure testing, and addresses problems such as existing test methods covering only a single application scenario.

Active Publication Date: 2019-05-24
SHENZHEN INTELLIFUSION TECHNOLOGIES CO LTD
7 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0005] However, existing testing methods can only run a hardware failure test, driven by a hardware failure test script, on a single server, so they cover only a single application scenario.

Embodiment Construction

[0033] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0034] Each embodiment is described in detail below.

[0035] The terms "comprising" and "having" and any variations thereof in the description and claims of the present invention and the drawings are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or optionally further includes Fo...

Abstract

The invention discloses a cluster server fault test method and a related device in a machine learning system. The method comprises the following steps: receiving a fault test task sent by a fault generation server, the fault test task carrying a software fault test script; issuing a test request carrying the software fault test script to M servers in the cluster server, where M is a positive integer; receiving M test responses sent by the M servers, the M test responses carrying M pieces of software fault test data obtained by the M servers running the software fault test script, with the M servers in one-to-one correspondence with the M test responses; and verifying the M pieces of software fault test data to obtain M software fault test results. Implementing the embodiments of the invention helps broaden the range of application scenarios.
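
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name in it (FaultTestTask, ServerResponse, dispatch, verify, run_fault_test) is hypothetical, servers are modeled as in-process callables instead of networked machines, and the verification rule is supplied by the caller because the source does not say how test data is verified.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names here are hypothetical; the source does not define an API.

@dataclass(frozen=True)
class FaultTestTask:
    script: str  # software fault test script carried by the task


@dataclass(frozen=True)
class ServerResponse:
    server_id: str
    test_data: str  # data obtained by running the script on that server


# Assumption: a "server" is modeled as a callable that runs a script
# and returns its raw test data. A real cluster would use network calls.
Server = Callable[[str], str]


def dispatch(task: FaultTestTask, servers: Dict[str, Server]) -> List[ServerResponse]:
    """Issue the script to each of the M servers; collect one response per server."""
    return [ServerResponse(sid, run(task.script)) for sid, run in servers.items()]


def verify(responses: List[ServerResponse],
           is_valid: Callable[[str], bool]) -> Dict[str, bool]:
    """Verify each piece of test data, yielding one result per server."""
    return {r.server_id: is_valid(r.test_data) for r in responses}


def run_fault_test(task: FaultTestTask,
                   servers: Dict[str, Server],
                   is_valid: Callable[[str], bool]) -> Dict[str, bool]:
    """Receive a task, fan it out to M servers, and verify the M responses."""
    if not servers:
        raise ValueError("M must be a positive integer")
    return verify(dispatch(task, servers), is_valid)


# Usage with two simulated servers and a caller-defined verification rule.
servers = {
    "node-1": lambda script: "recovered",
    "node-2": lambda script: "hung",
}
results = run_fault_test(
    FaultTestTask(script="example-script"),
    servers,
    is_valid=lambda data: data == "recovered",
)
```

Keying the results by server identifier keeps the one-to-one correspondence between servers and test responses explicit, so M servers always yield M results.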

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a cluster server failure testing method and related devices in a machine learning system.

Background art

[0002] After decades of development, machine learning has finally come into wide use as storage capacity and computing power have grown. Model training in machine learning requires processing a large amount of data to obtain a suitable model. Although the computing power of the GPU is several orders of magnitude greater than that of the CPU, it is still not enough for the computing needs of machine learning. These needs are therefore often met by deploying Docker services on cluster servers for multi-machine, multi-card training. Docker is an open source container engine and a lightweight virtualization technology with little performance loss and easy packaging, so it is more and more widely used in machine le...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, H04L12/26
Inventor: 郑海刚, 吕旭涛, 王孝宇
Owner: SHENZHEN INTELLIFUSION TECHNOLOGIES CO LTD