Multi-model reasoning service deployment method and device based on k8s cluster

A multi-model deployment technology, applied in the field of cloud computing, that addresses the problems of complex operation and missing support for cluster elastic scaling, and achieves simple deployment and operation.

Active Publication Date: 2021-01-15
INSPUR SUZHOU INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

Currently, multiple models are mainly deployed by running a service that supports multi-model loading, such as TensorFlow Serving, Triton Serving, or AWS Multi-Model Serving. However, these are traditional services: they do not support elastic scaling in a cluster, and they are complex to operate.


Examples


Embodiment Construction

[0038] It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used only to distinguish two entities or parameters that share a name but are not the same. "First" and "second" are merely for convenience of expression and should not be construed as limiting the embodiments of the present invention; this note is not repeated in the subsequent embodiments.

[0039] In one embodiment, referring to Figure 1, the present invention provides a k8s cluster-based multi-model inference service deployment method, which specifically includes the following steps:

[0040] S100. Deploy a scheduling service in the smallest scheduling unit of the k8s cluster, and configure memory, computing resources, and scheduling policies for the scheduling service; wherein, the smallest scheduling unit is a pod;

[0041] S200. Deploy multiple model reasoning services according to the memory of...
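Steps S100 and S200 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the `SCHEDULING_POLICY` environment variable, the image name, and the first-fit selection rule are all assumptions made for illustration. The sketch only builds a plain dictionary in the shape of a Kubernetes Pod manifest for the scheduling service, then picks the model inference services that fit within the memory configured for that service.

```python
from typing import Dict, List


def build_scheduler_pod(name: str, image: str, memory_mib: int,
                        cpu_millicores: int, policy: str) -> Dict:
    """Build a Pod manifest (as a dict) for a hypothetical scheduling service.

    Memory and CPU are set as both requests and limits; the scheduling
    policy is passed through an assumed environment variable.
    """
    resources = {"memory": f"{memory_mib}Mi", "cpu": f"{cpu_millicores}m"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [{
                "name": "scheduler",
                "image": image,
                # Hypothetical variable name, not taken from the patent.
                "env": [{"name": "SCHEDULING_POLICY", "value": policy}],
                "resources": {"requests": dict(resources),
                              "limits": dict(resources)},
            }]
        },
    }


def select_models_for_memory(memory_mib: int,
                             model_sizes_mib: Dict[str, int]) -> List[str]:
    """Pick models, in the given order, that fit in the scheduler's memory.

    First-fit is an assumed rule; the source only says that models are
    deployed "according to the memory" of the scheduling service.
    """
    chosen: List[str] = []
    remaining = memory_mib
    for model, size in model_sizes_mib.items():
        if size <= remaining:
            chosen.append(model)
            remaining -= size
    return chosen


pod = build_scheduler_pod("mm-scheduler", "example.invalid/scheduler:latest",
                          memory_mib=4096, cpu_millicores=2000,
                          policy="round_robin")
models = select_models_for_memory(1000, {"a": 600, "b": 500, "c": 300})
```

With a 1000 MiB budget, the selection keeps "a" and "c" and skips "b", because "b" no longer fits once "a" has been placed.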


Abstract

The invention discloses a multi-model inference service deployment method and device based on a k8s cluster. The method comprises the following steps: deploying a scheduling service in a minimum scheduling unit of the k8s cluster, and configuring memory, computing resources, and a scheduling policy for the scheduling service; deploying a plurality of model inference services according to the memory of the scheduling service, and configuring each model inference service to use the computing resources of the scheduling service and to be associated with the scheduling service; and having the scheduling service call the plurality of model inference services according to the scheduling policy in order to process inference tasks. With this scheme, multiple model inference services can share one minimum scheduling unit, the multi-model inference service can scale elastically with the service load, and the deployment operation is relatively simple.
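The dispatch step in the abstract, where the scheduling service calls the model inference services according to the scheduling policy, can be sketched as follows. This is an illustrative sketch only: the class name `SchedulingService`, the two policy names, and the task layout (`model` and `payload` keys) are assumptions, since the source does not specify which policies are supported.

```python
import itertools
from typing import Any, Callable, Dict, Iterator, Optional

# A model inference service is modelled here as a plain callable.
ModelService = Callable[[Any], Any]


class SchedulingService:
    """Hypothetical scheduler that fronts several model services in one pod."""

    def __init__(self, policy: str = "by_name") -> None:
        self.policy = policy
        self.services: Dict[str, ModelService] = {}
        self._cycle: Optional[Iterator[str]] = None

    def register(self, name: str, service: ModelService) -> None:
        """Associate a model inference service with this scheduler."""
        self.services[name] = service
        # Restart the round-robin order whenever the service set changes.
        self._cycle = itertools.cycle(list(self.services))

    def dispatch(self, task: Dict[str, Any]) -> Any:
        """Route one inference task according to the configured policy."""
        if not self.services or self._cycle is None:
            raise RuntimeError("no model inference services registered")
        if self.policy == "round_robin":
            name = next(self._cycle)
        else:  # "by_name": the task names the model it wants
            name = task["model"]
        return self.services[name](task["payload"])


by_name = SchedulingService("by_name")
by_name.register("upper", lambda text: text.upper())
by_name.register("length", lambda text: len(text))
result = by_name.dispatch({"model": "length", "payload": "abc"})
```

Keeping the policy inside the scheduler means callers address a single endpoint, which matches the idea of several model services sharing one scheduling unit.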

Description

Technical field

[0001] The invention belongs to the field of cloud computing, and in particular relates to a k8s cluster-based multi-model inference service deployment method, device, computer equipment, and storage medium.

Background technique

[0002] As machine learning methods are used more and more widely in actual production, the number of models that need to be deployed in a production system is also increasing. For example, machine learning applications that provide a personalized experience often require many models to be trained: a news classification service may train custom models based on each news category, and a recommendation model may be trained on each user's usage history to personalize its recommendations. The main reason for training so many separate models is to protect each user's model and data privacy.

[0003] In the K8S cluster, the number of POD resources is limited (by default, each Node can start 110 POD instances), and by default, in a clust...
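The pod limit mentioned in [0003] can be turned into a small worked calculation. The sketch below is an illustration under stated assumptions, not part of the patent: the function names are hypothetical, the figure of 110 pods per node is the default quoted in the text, and pods used by system components are ignored.

```python
import math

# Default pods-per-node figure quoted in the text above.
DEFAULT_MAX_PODS_PER_NODE = 110


def pods_needed(num_models: int, models_per_pod: int) -> int:
    """Number of pods required when each pod hosts up to models_per_pod models."""
    if num_models < 0 or models_per_pod < 1:
        raise ValueError("num_models must be >= 0 and models_per_pod >= 1")
    return math.ceil(num_models / models_per_pod)


def nodes_needed(num_models: int, models_per_pod: int,
                 max_pods_per_node: int = DEFAULT_MAX_PODS_PER_NODE) -> int:
    """Number of nodes required given the per-node pod limit."""
    return math.ceil(pods_needed(num_models, models_per_pod) / max_pods_per_node)


one_model_per_pod = nodes_needed(1000, 1)    # 1000 pods spread over nodes
ten_models_per_pod = nodes_needed(1000, 10)  # 100 pods fit on one node
```

Under these assumptions, 1000 models at one model per pod need 10 nodes just to satisfy the pod limit, while sharing each pod among 10 models brings that down to a single node.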


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/455, G06N5/04
CPC: G06F9/45558, G06N5/04, G06F2009/45595, G06F2009/45583, G06F2009/4557
Inventor: 陈清山
Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD