Method and device for deploying multi-model inference service based on k8s cluster
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SUZHOU METABRAIN INTELLIGENT TECH CO LTD
- Publication Date
- 2022-07-08
Abstract
Description
Technical field
[0001] The invention belongs to the field of cloud computing, and in particular relates to a method, device, computer equipment and storage medium for deploying a multi-model inference service based on a k8s cluster.
Background art
[0002] As machine learning methods become more widely used in actual production, the number of models that need to be deployed in production systems is also increasing. For example, a machine learning application that provides a personalized experience often requires training many models: a news classification service may train a custom model for each news category, and a recommendation service may train a model on each user's usage history to personalize its recommendations. The main reason for training so many separate models is to protect the privacy of users' models and data.
[0003] In a k8s cluster, the number of Pods is limited (by default, each Node can run at most 110 Pod instances). By default, in a cluster of 100 ...