Method for constructing online machine learning project and machine learning system

A machine learning project technology, applied in the field of machine learning, which can solve problems such as large differences in data set specifications, the inability to analyze data and share experiments, and the inability to save an experiment's running state and data, thereby saving storage space, improving ease of access, and improving readability

Pending Publication Date: 2020-11-06
MEGVII BEIJINGTECH CO LTD +1
4 Cites 2 Cited by

AI-Extracted Technical Summary

Problems solved by technology

The first aspect is poor flexibility: because a single machine is used, it is difficult to change computing resources easily. The second aspect is that data flexibility is limited and the barrier to use is high, and the specifications of data sets vary greatly; frequent uploading and downloading of data sets brings a huge time...

Method used

As shown in Figure 4, in some embodiments, the service is the model training service, and S102 comprises: according to an image file, setting up a container in the user service cluster for running an online programmable notebook; mounting, based on the NFS protocol, the data set of the first project to which the service belongs into the container; and providing the code data of the first project. S103 comprises: receiving a request from the user to access a target data set, wherein the target data set is part or all of the data set of the first project. S104 comprises: providing the user with the target data set, together with data within a predetermined range adjacent to the storage location of the target data set on the network storage server, for the user to download. The embodiment of the present application uses NFS to mount the data set into the container of the built operating environment. Compared with the existing approach of directly downloading the data set for model training, this mounting approach clearly saves storage space for the user and better meets users' needs for machine learning or deep learning model training.
In the third step, loading user data in the traditional scheme generally takes the form of downloading. Considering that data sets are generally large (for example, anywhere from tens of MB to hundreds of GB, or even larger) and generally do not change much, the embodiment of this application helps users automatically mount the data set via NFS (for example, mounting the data set to a specified folder of the deep learning system or platform), which saves the cost of pre-downloading. From the user's perspective, the data set storage behaves like a local disk mounted on their own physi...

Abstract

An embodiment of the invention provides a method for constructing an online machine learning project and a machine learning system. The method for constructing the online machine learning project comprises the following steps: receiving a service request from a user; allocating resources of a user service cluster to a service according to the service request, and constructing an online operating environment, wherein the online operating environment is used for executing a model development service or a model training service of machine learning; receiving an operation from the user in the online operating environment; and providing operation information to the user based on the operation, wherein the operation information comprises an operation result or an operation state of the model development service or the model training service. According to the method of the invention, the user writes code locally to realize model development, or locally uses cloud resources to realize model training; the code development running result and the model training result executed in the cloud are returned to the user's browser, so that the user's efficiency in carrying out machine learning projects is improved.

Application Domain

Machine learning; Transmission

Technology Topic

Machine learning; Engineering +5

Image

  • Method for constructing online machine learning project and machine learning system

Examples

  • Experimental programs (2)

Example

[0070] Example 1
[0071] As shown in Figure 3, when the service is the model development service, S102 includes: setting up, according to an image file, a container in the user service cluster for running an online programmable notebook; S103 includes: receiving, over a network, the code data the user enters into the programmable notebook, and running the code data; S104 includes: feeding the running result back to the user.
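The container setup in S102 could be realized on a container orchestrator such as Kubernetes. The following is a minimal sketch that only builds a Kubernetes-style pod specification as a plain dictionary; the image name, port, and resource figures are illustrative assumptions, not taken from the patent:

```python
def build_notebook_pod_spec(user_id, image, cpu="2", memory="4Gi"):
    """Build a Kubernetes-style pod spec (plain dict) that launches an
    online programmable notebook container from the given image file."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"notebook-{user_id}", "labels": {"user": user_id}},
        "spec": {
            "containers": [{
                "name": "notebook",
                "image": image,
                "ports": [{"containerPort": 8888}],  # conventional Jupyter port
                "resources": {"limits": {"cpu": cpu, "memory": memory}},
            }]
        },
    }

spec = build_notebook_pod_spec("alice", "registry.example.com/notebook:latest")
print(spec["metadata"]["name"])  # notebook-alice
```

In a real deployment this dict would be submitted to the cluster API; here it only shows the shape of the per-user container created for the notebook.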
[0072] In order to enhance the convenience of the user's access to code, as shown in Figure 3, the method of constructing an online machine learning project further includes, after S103 or while the input operation of S103 is being performed: S105, saving the code data in real time; or saving the code data by listening for the user's save behavior or new-version release behavior. The embodiment of the present application can save the code data of the user's programmable notebook; compared with existing programmable notebooks, a programmable notebook with this storage function significantly improves the user's access to the code.
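S105 could be implemented with a simple event-driven saver that persists code either periodically (real-time saving) or on explicit save/release events. A minimal sketch, with hypothetical callback names and a dict standing in for the storage backend:

```python
import time

class NotebookSaver:
    """Persist notebook code either periodically or on explicit user events."""

    def __init__(self, store, interval=5.0):
        self.store = store          # any dict-like storage backend
        self.interval = interval    # seconds between real-time saves
        self._last_save = 0.0

    def on_edit(self, notebook_id, code):
        """Called on every edit batch: save at most once per interval."""
        now = time.monotonic()
        if now - self._last_save >= self.interval:
            self.store[notebook_id] = code
            self._last_save = now

    def on_save(self, notebook_id, code):
        """Called when the user explicitly saves or releases a new version."""
        self.store[notebook_id] = code
        self._last_save = time.monotonic()

store = {}
saver = NotebookSaver(store, interval=0.0)
saver.on_edit("nb-1", "print('hello')")
print(store["nb-1"])  # print('hello')
```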
[0073] In order to further enhance the readability of the user's stored code, as shown in Figure 3, in some embodiments the method of constructing an online machine learning project further comprises: S106, receiving a request from the user to access the code data, and providing the user with the code data converted from JSON format to HTML format. The embodiment of the present application can convert the code data of the programmable notebook from JSON format to HTML format; since the HTML format is more readable, the readability of the code the user accesses is improved.
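The JSON-to-HTML conversion of S106 resembles what Jupyter's nbconvert tool does in practice. A stdlib-only sketch that renders just the code cells of an .ipynb-style JSON document as HTML (the cell structure assumed here follows the Jupyter notebook format):

```python
import html
import json

def notebook_json_to_html(nb_json: str) -> str:
    """Render the code cells of a Jupyter-style notebook JSON string as HTML."""
    nb = json.loads(nb_json)
    parts = ["<html><body>"]
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            code = "".join(cell.get("source", []))
            parts.append(f"<pre><code>{html.escape(code)}</code></pre>")
    parts.append("</body></html>")
    return "\n".join(parts)

nb = json.dumps({"cells": [{"cell_type": "code",
                            "source": ["x = 1 + 1\n", "print(x)"]}]})
print(notebook_json_to_html(nb))
```

A production system would more likely delegate to nbconvert's HTML exporter, which also renders markdown cells and outputs; the sketch only shows why HTML is the more readable delivery format.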

Example

[0074] Example 2
[0075] As shown in Figure 4, in some embodiments the service is a model training service, and S102 includes: setting up, according to an image file, a container in the user service cluster for running an online programmable notebook; mounting, based on the NFS protocol, the data set of the first project to which the service belongs; and providing the code data of the first project. S103 includes: receiving a request from the user to access a target data set, wherein the target data set is part or all of the data set of the first project. S104 includes: providing the user with the target data set and data within a predetermined range adjacent to the storage location of the target data set, for the user. The present application embodiment uses NFS to mount the data set into the container of the built operating environment; compared with existing model training that needs to download the data set, this mounting approach significantly saves the user's storage space and better meets the user's machine learning or deep learning model training needs.
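For a Kubernetes-hosted container, the NFS mount in S102 could be expressed as an NFS volume plus a matching volume mount in the pod spec. A hedged sketch; the server address, export path, and mount point are illustrative assumptions:

```python
def nfs_volume_entries(project_id, nfs_server, export_path, mount_path="/data"):
    """Return the (volume, volumeMount) pair that mounts a project's data
    set from an NFS server into the training container, so the data set
    is read in place rather than downloaded into the container."""
    volume = {
        "name": f"dataset-{project_id}",
        "nfs": {"server": nfs_server, "path": export_path, "readOnly": True},
    }
    volume_mount = {"name": f"dataset-{project_id}", "mountPath": mount_path}
    return volume, volume_mount

vol, mnt = nfs_volume_entries("proj1", "10.0.0.5", "/exports/proj1")
print(mnt["mountPath"])  # /data
```

Because the volume is `readOnly`, many training containers can share one copy of the data set on the NFS server, which is the storage saving the paragraph above describes.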
[0076] In order to enhance the convenience of the user's model retrieval, as shown in Figure 4, in some embodiments S103 includes: receiving a model retrieval request from the user over a network; S104 includes: obtaining a target model from multiple stored models based on the model retrieval request, and loading the code corresponding to the target model into the programmable notebook running in the container. The present application embodiment also adds a model retrieval function, so that a deep learning model can be retrieved and loaded while developing in the programmable notebook started in the container. Compared with related deep learning platforms, the present application can retrieve the desired deep learning model directly in the operating environment without exiting the platform, significantly improving the efficiency with which the user acquires and loads deep learning models.
[0077] In order to improve the accuracy of model retrieval, in some embodiments, obtaining the target model from the stored plurality of models in S104 and loading the code corresponding to the target model into the programmable notebook in the container includes: determining a set of candidate models according to the weights of the plurality of models; and receiving the user's selection of the target model from the candidate model set. The weight is related to the matching type of each of the plurality of models, where the matching types include the query word matching the label, the query word matching the title, and the query word matching the introduction. The present application embodiment determines the candidate models and the target model using the matching characteristics between the query word and the model. Compared with existing model retrieval algorithms, some embodiments of the present application consider that the meanings represented by a model's label, title and introduction differ: a query word matching the label gives the highest probability that the model is the one the user wants, followed by a title match, and then an introduction match. Distinguishing the query word's matching type can therefore clearly improve the accuracy of retrieval.
[0078] In order to improve the accuracy of model retrieval, in still other embodiments, the weight is also related to the correction coefficient assigned to the respective matching type, wherein the correction coefficient when the query word matches the label is greater than the correction coefficient when the query word matches the title, and the correction coefficient when the query word matches the title is greater than the correction coefficient when the query word matches the introduction. The present application embodiment assigns different correction coefficients to label matching, title matching and introduction matching; since these three matching types clearly differ in how accurately they describe the model, this can improve the accuracy of retrieving the model the user needs.
[0079] In order to improve the accuracy of model retrieval, in some embodiments, the weight is obtained by weighting, according to the matching type, the model download score and the referencing-project score of each of the plurality of models, wherein the model download score is determined according to the download count of the first model and the total download count of the plurality of models, and the referencing-project score is determined based on the total score of the projects that use the second model; the first model and the second model are any of the plurality of models. In order to further improve the accuracy of model retrieval, some embodiments use the reference and download counts to update the model's weight: the larger the weight, the larger the probability that the model is retrieved. Since how frequently a model is referenced or downloaded reflects its quality, some embodiments of this application also take the reference and download counts into account when calculating model weights, correcting the deviation of a target model determined purely by query-word matching, which can further improve retrieval accuracy.
[0080] For example, the calculation formula of the model weight W is as follows:
[0081] W = [count(q in L) × 10 + count(q in T) × 6 + count(q in D) × 4] / 10 + modelDownload + idxProject
[0082] Here, L represents the label of the model, T represents the title of the model, D represents the introduction of the model, q represents the query word, modelDownload is the model download score, and idxProject is the referencing-project score. The model download score modelDownload = (downloads of the model / total downloads across the whole site) × total number of models. The referencing-project score idxProject = the total score of the projects that use the model (for example, project scores are 1, 2 and 3, the default being 1; official projects and models whose collection count is in the top 20% of the whole site score 3; models whose collection count is in the top 50% of the whole site score 2). count(A in B) represents the number of times A occurs in B, capped at a maximum (for example, 5).
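The weight formula above can be implemented directly. A sketch following the patent's coefficients (10, 6, 4, cap of 5); the case-insensitive substring matching and the example inputs are assumptions for illustration:

```python
def model_weight(query, label, title, intro, downloads, site_downloads,
                 n_models, project_scores, cap=5):
    """Compute the retrieval weight W for one model, following the formula:
    capped query-word match counts against the label, title and introduction
    (coefficients 10, 6, 4, divided by 10), plus the model download score
    and the referencing-project score."""
    def count(q, text):
        # count(A in B), capped at `cap` occurrences
        return min(text.lower().count(q.lower()), cap)

    match_score = (count(query, label) * 10
                   + count(query, title) * 6
                   + count(query, intro) * 4) / 10
    model_download = downloads / site_downloads * n_models
    idx_project = sum(project_scores)   # one score (1, 2 or 3) per project
    return match_score + model_download + idx_project

w = model_weight("resnet", "resnet image-classification", "ResNet-50",
                 "a residual network", downloads=100, site_downloads=1000,
                 n_models=20, project_scores=[1, 3])
print(round(w, 2))  # 7.6 = 16/10 (matches) + 2.0 (downloads) + 4 (projects)
```

Ranking the stored models by this weight and keeping the top entries yields the candidate model set described in paragraph [0077].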
[0083] It should be noted that certain features of the above two examples (i.e., Example 1 and Example 2) may be recombined to form new technical solutions. For example, when executing a model development task, some embodiments of the present disclosure may also provide the model retrieval function, for the user to retrieve and download models; likewise, when executing a model development task, the data set may be loaded from the network storage server into the container running the programmable notebook.
[0084] In order to improve the utilization of the user service cluster's resources and save tenants' costs, in some embodiments the method of constructing an online machine learning project further includes: monitoring resource usage to determine when to close the operating environment.
[0085] In order to enhance the security of the services running in the user service cluster (for example, to prevent access by unauthorized users), before executing S103 the method of constructing an online machine learning project further comprises: confirming that the user's identity is authenticated; and directing the traffic to a specified IP address and port, wherein the specified IP address is the address of the operating environment, and the port is the port on which the model development service or model training service is started in the operating environment. Further, in order to further enhance the security of the user service cluster, the method may also comprise: assigning a token to the service (in some examples, the token may also be given a limited validity period); and directing the traffic to the specified IP address and port only after the token is confirmed. The embodiment of the present application thus determines, according to the token, whether to direct traffic to the application running in the operating environment within the user service cluster. Since all user services in a cluster are on one network plane, being able to forward traffic to a specific user service would also mean being able to send traffic to services one has no right to operate; to overcome this problem, such embodiments use the token to solve the security issue and enhance the security of user programs running in the container cluster.
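The token check before forwarding could be as simple as an HMAC over the service identity, verified by the component that routes traffic. A minimal sketch; the secret, function names, and token format are illustrative assumptions, not the patent's scheme:

```python
import hashlib
import hmac

SECRET = b"cluster-secret"  # per-cluster signing key (illustrative)

def issue_token(service_id: str) -> str:
    """Assign a token to a service: an HMAC-SHA256 over the service id."""
    return hmac.new(SECRET, service_id.encode(), hashlib.sha256).hexdigest()

def may_forward(service_id: str, token: str) -> bool:
    """Only direct traffic to the service's IP:port if the token checks out."""
    expected = issue_token(service_id)
    return hmac.compare_digest(expected, token)

tok = issue_token("svc-42")
print(may_forward("svc-42", tok))      # True
print(may_forward("svc-other", tok))   # False
```

A validity period, as the paragraph suggests, would be added by signing an expiry timestamp along with the service id and rejecting expired tokens.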
[0086] In some embodiments, monitoring resource usage to determine when to close the operating environment comprises: confirming that the user's browser has not reported a heartbeat within a first preset time period; or determining that the user has not performed a computing task based on the allocated resources within a second preset time period. Some embodiments of the present application infer, by monitoring the browser's heartbeat, whether the user is using the rented resources, or determine whether the rented resources are actually being used, so as to judge whether resources are being wasted. This improves the accuracy of resource usage monitoring and avoids affecting users by closing the environment while a user's task is still running. For example, when the service is a model development service, determining that the user has not performed a computing task based on the resources within the second preset time period comprises: confirming that the programmable notebook started in the container has received no input or output operation within the second preset time period. When the service is a model training service, it comprises: determining, according to the load condition of resources including the processor, that the user has not performed a training task based on the resources within the second preset time period.
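The two closing conditions above can be sketched as one predicate that checks both the heartbeat gap and compute idleness. All thresholds below are illustrative assumptions; the patent deliberately leaves the two preset periods unspecified:

```python
import time

def should_close(last_heartbeat, last_activity, cpu_load,
                 heartbeat_timeout=60.0, idle_timeout=1800.0,
                 cpu_idle_threshold=0.05, now=None):
    """Decide whether to close the operating environment: either the
    browser heartbeat has been silent for the first preset period, or
    no compute activity (notebook I/O for development, CPU load for
    training) has been seen for the second preset period."""
    now = time.monotonic() if now is None else now
    no_heartbeat = (now - last_heartbeat) > heartbeat_timeout
    idle = (now - last_activity) > idle_timeout and cpu_load < cpu_idle_threshold
    return no_heartbeat or idle

# Heartbeat 10 s ago, activity 5 s ago, busy CPU: keep the environment.
print(should_close(90.0, 95.0, 0.8, now=100.0))    # False
# No heartbeat for 200 s: close it.
print(should_close(-100.0, 95.0, 0.8, now=100.0))  # True
```

Requiring both a long activity gap and a low CPU load for the idle branch reflects the paragraph's concern: never close an environment whose training task is still consuming the processor.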
[0087] Incidentally, the present application does not limit the specific durations of the first preset time period and the second preset time period; those skilled in the art can set them flexibly depending on the project. The terms "first preset time period" and "second preset time period" are used merely for convenience of description: in at least one example the duration of the first preset time period may equal that of the second preset time period, while in other embodiments the two durations may differ.
[0088] When a model development task runs in the operating environment, some embodiments of the present application can determine whether the resources are in use by detecting whether the user inputs code to execute, improving the timeliness and accuracy of discovering idle, unused resources at model development task runtime. When a model training task runs in the operating environment, some embodiments can confirm whether the resources are idle by detecting the processor load, improving the timeliness and accuracy of discovering idle, unused resources at model training task runtime.
[0089] The structure of the machine learning system 100, which runs on a server or a server cluster, is illustrated below with reference to Figure 5.
[0090] As shown in Figure 5, the machine learning system 100 comprises: a mirror warehouse 110, configured to store the image files from which containers are booted; an access server 120, configured to receive a service request submitted by a user through a browser; and a resource-sharing server cluster 130 (i.e., the user service cluster), configured to: in response to a service request from the access server, allocate resources to build an online operating environment for executing the service, so that the user can run an interactive model development task, or load and run a model training task, in the operating environment; wherein the operating environment depends on the boot image file.
[0091] The following examples set forth exemplary functions of the access server 120.
[0092] In order to enhance the security of the machine learning system 100, in some embodiments the access server 120 is further configured to authenticate the user and to direct data flow to the online operating environment only after authentication passes. In particular, in some embodiments, before executing S103 of Figure 2, or before starting the container: the access server confirms that the user's identity is verified, and sends the authenticated user's traffic to the specified IP address (e.g., the IP address of the started container) and port (e.g., the port of the application running in the container), wherein the specified IP address is the address of the operating environment, and the port is the port on which the model development service or model training service is started in the operating environment.
[0093] To determine whether traffic may be directed to the application running in the operating environment, and to further enhance the security of the user service cluster, in some embodiments the access server 120 is further configured to allocate (e.g., match and create) a token for the service; the user service cluster then directs the traffic to the specified IP address and port after confirming the token. For example, the token has a set validity period, so the token in the embodiment of the present application has a certain timeliness.
[0094] In order to enhance the convenience of the user's access to code, in some embodiments the operating environment comprises at least one container running in the user service cluster; the model development task is configured to run a programmable notebook in the at least one container, and is further configured to receive and store the code the user inputs to the programmable notebook. For example, the access server 120 is further configured to store the code in real time, or to store it in response to the user's save behavior. That is, the access server 120 stores in real time the code data the user inputs to the programmable notebook, runs the code in the container, and stores the result. In other embodiments, such as the machine learning system shown in Figure 6, a cloud server 150 may further be included, the cloud server 150 being configured to store the code in real time or in response to the user's save behavior.
[0095] To facilitate reading the code data stored during model development, in some embodiments the access server 120 is further configured to respond to a first user's request to browse code by converting the first user's code from JSON format into HTML format.
[0096] In order to enhance the user's retrieval and loading of models during model development or model training, in some embodiments the access server 120 is further configured to respond to a second user's model retrieval request by providing the second user with a model search and download service. For example, the resource-sharing server cluster running the interactive model development task or the model training task is further configured to: receive the second user's query keywords, and provide the second user with a list including a plurality of candidate models; and receive the target model the second user selects from the list, embedding the code of the target model into the programmable notebook of the corresponding interactive model development task or model training task.


