Multi-model reasoning service deployment method and device based on k8s cluster

A multi-model deployment technology, applied in the field of cloud computing, that addresses problems such as complex operation and the lack of support for elastic scaling in clusters, achieving simple deployment and operation.

Active Publication Date: 2021-01-15

INSPUR SUZHOU INTELLIGENT TECH CO LTD

4 Cites, 3 Cited by


AI Technical Summary

Problems solved by technology

Currently, multiple models are mainly deployed by running services that support multi-model loading, such as TensorFlow Serving, Triton Serving, and AWS Multi-Model Serving. However, these are traditional services: they do not support elastic scaling in clusters, and they are complex to operate.


Embodiment Construction

[0038] It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two non-identical entities or parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0039] In one embodiment, as shown in Figure 1, the present invention provides a k8s cluster-based multi-model reasoning service deployment method, which specifically includes the following steps:

[0040] S100. Deploy a scheduling service in the smallest scheduling unit of the k8s cluster, and configure memory, computing resources, and a scheduling policy for the scheduling service, where the smallest scheduling unit is a pod;

[0041] S200. Deploy multiple model reasoning services according to the memory of...
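Step S100 can be pictured as a pod manifest that gives the scheduling service a memory and CPU allocation and records its scheduling policy. The following is a minimal sketch under stated assumptions, not the patent's implementation: the pod name, the image, the resource sizes, and the `SCHEDULING_POLICY` environment variable are all hypothetical, and the source does not say how the policy is actually passed to the service. Only the manifest field names (`apiVersion`, `kind`, `spec.containers[].resources`) follow the standard Kubernetes Pod schema.

```python
import json


def scheduling_pod_manifest(name, image, memory, cpu, policy):
    """Build a Pod manifest (as a plain dict) for a scheduling service.

    Requests and limits are set to the same values so the pod has a fixed
    memory and CPU budget for the model services that would share it.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": "scheduler",
                    "image": image,
                    "resources": {
                        "requests": {"memory": memory, "cpu": cpu},
                        "limits": {"memory": memory, "cpu": cpu},
                    },
                    # Hypothetical: the policy is passed as an env variable.
                    "env": [{"name": "SCHEDULING_POLICY", "value": policy}],
                }
            ]
        },
    }


# All values below are illustrative placeholders.
manifest = scheduling_pod_manifest(
    name="model-scheduler",
    image="example.invalid/model-scheduler:latest",
    memory="8Gi",
    cpu="4",
    policy="round_robin",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied to a cluster with the usual tooling. In practice a Deployment wrapping this pod template would be the more likely target for elastic scaling, but that goes beyond what this excerpt states.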


Abstract

The invention discloses a multi-model reasoning service deployment method and device based on a k8s cluster. The method comprises the following steps: deploying a scheduling service in a minimum scheduling unit of a k8s cluster, and configuring memory, computing resources, and a scheduling strategy for the scheduling service; deploying a plurality of model inference services according to the memory of the scheduling service, and configuring each model inference service to use the computing resources of the scheduling service and to be associated with the scheduling service; and having the scheduling service call the plurality of model inference services according to the scheduling strategy so as to process inference tasks. With this scheme, multiple model inference services can share one minimum scheduling unit, the multi-model inference service can scale elastically with the service load, and deployment is relatively simple.
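The flow described in the abstract can be sketched in a few lines of Python. This is an illustrative model only, under several assumptions: the class names `SchedulingService` and `ModelService`, the memory accounting in megabytes, and the `route_by_name` policy are all hypothetical, since the abstract names neither a concrete scheduling strategy nor an interface between the services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ModelService:
    name: str
    memory_mb: int                      # memory the loaded model needs
    predict: Callable[[object], object]


def route_by_name(task: dict, services: Dict[str, ModelService]) -> ModelService:
    """Hypothetical scheduling policy: pick the service the task names."""
    return services[task["model"]]


@dataclass
class SchedulingService:
    memory_mb: int                      # memory configured for the pod
    policy: Callable[[dict, Dict[str, ModelService]], ModelService]
    services: Dict[str, ModelService] = field(default_factory=dict)

    def used_memory_mb(self) -> int:
        return sum(s.memory_mb for s in self.services.values())

    def deploy(self, service: ModelService) -> bool:
        """Associate a model service only if it fits in the remaining memory."""
        if self.used_memory_mb() + service.memory_mb > self.memory_mb:
            return False
        self.services[service.name] = service
        return True

    def infer(self, task: dict):
        """Let the policy choose a model service, then run the task on it."""
        return self.policy(task, self.services).predict(task["input"])


sched = SchedulingService(memory_mb=1024, policy=route_by_name)
deployed = [
    sched.deploy(ModelService("upper", 400, str.upper)),
    sched.deploy(ModelService("length", 400, len)),
    sched.deploy(ModelService("big", 400, lambda x: x)),  # exceeds the budget
]
print(deployed)
print(sched.infer({"model": "upper", "input": "abc"}))
```

Keeping the policy as a plain callable means a different strategy (for example, load-based selection) could be swapped in without changing the deployment or memory-accounting logic.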

Description

Technical field

[0001] The invention belongs to the field of cloud computing, and in particular relates to a k8s cluster-based multi-model reasoning service deployment method, device, computer equipment and storage medium.

Background technique

[0002] As machine learning methods are used more and more widely in actual production, the number of models that need to be deployed in production systems is also increasing. For example, machine learning applications that provide a personalized experience often require many models to be trained: a news classification service may train a custom model for each news category, and a recommendation service may train a model on each user's usage history to personalize its recommendations. The main reason for training so many separate models is to protect each user's model and data privacy.

[0003] In the K8S cluster, the number of pod resources is limited (by default, each Node can start 110 pod instances), and by default, in a clust...
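To see why the per-node pod limit matters for multi-model deployment, the arithmetic can be worked through directly. This is a rough sketch under simplifying assumptions: it ignores pods consumed by system components and ignores memory and CPU limits, and the node count and the number of models per pod are made-up example values.

```python
DEFAULT_MAX_PODS_PER_NODE = 110  # default per-node pod limit cited in [0003]


def max_models(nodes: int, models_per_pod: int = 1,
               pods_per_node: int = DEFAULT_MAX_PODS_PER_NODE) -> int:
    """Upper bound on deployable models when each pod hosts models_per_pod."""
    return nodes * pods_per_node * models_per_pod


one_per_pod = max_models(nodes=3)                # one model per pod
shared = max_models(nodes=3, models_per_pod=8)   # eight models share a pod
print(one_per_pod, shared)  # 330 2640
```

With one model per pod, the pod limit caps the number of models directly; letting several model services share a pod raises that ceiling by the sharing factor, subject to the pod's memory and compute budget.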


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F9/455, G06N5/04

CPC: G06F9/45558, G06N5/04, G06F2009/45595, G06F2009/45583, G06F2009/4557

Inventor: 陈清山

Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
