Artificial intelligence training method, system and device for massive small files and medium

An artificial intelligence training technique for massive small files. It addresses uncontrollable eviction granularity and poor data swap-in/swap-out performance, with the stated effects of avoiding overfitting and improving bandwidth utilization.

Active Publication Date: 2021-03-09

SUZHOU LANGCHAO INTELLIGENT TECH CO LTD

Cites: 4; Cited by: 1

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

Although data is pulled from remote central storage into a local cache, reading it still requires multiple RPC calls, so training performance on massive small files is worse than with a direct local disk cache. The granularity at which data is evicted is also uncontrollable: when cache space is insufficient, every cache read and write triggers data swapping in and out, and swap performance is poor.

Method used


Embodiment Construction

[0020] In order to make the object, technical solution and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0021] It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or parameters that share a name but are not the same. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0022] Based on the above purpose, the first aspect of the embodiments of the present invention proposes an embodiment of an artificial intelligence training method for massive small files. Figure 1 shows a schematic diagram of an embodim...

Abstract

The invention discloses an artificial intelligence training method, system and device for massive small files, and a storage medium. The method comprises the steps of: in response to the start of an artificial intelligence training task, obtaining a data set from a remote center and combining the small files in the data set into data blocks according to the block structure definition; in response to training starting or an epoch update, generating a training task data set list based on synchronous shuffle mechanisms applied both between data blocks and within data blocks; obtaining file list information of the data blocks according to the training task data set list; and obtaining file data according to the file list information of the data blocks, caching the file data locally at a granularity of one or more data blocks, and performing artificial intelligence task training. The method addresses the low I/O bandwidth utilization that occurs when data is read from massive small files during training, relieves the mismatch between the I/O read rate and the GPU compute rate, increases the utilization of computing resources, and accelerates the whole training process on massive small files.
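
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `pack_into_blocks` and `epoch_file_list`, the class `BlockCache`, the fixed files-per-block layout, the LRU eviction policy, and the reading of "synchronous shuffle" as "every worker derives the same order from a shared seed and the epoch number" are all assumptions made for illustration.

```python
import random
from collections import OrderedDict


def pack_into_blocks(file_names, files_per_block):
    """Group small files into blocks (hypothetical layout: a fixed number of
    files per block; the last block may be shorter)."""
    return [file_names[i:i + files_per_block]
            for i in range(0, len(file_names), files_per_block)]


def epoch_file_list(blocks, seed, epoch):
    """Two-level shuffle: permute the block order, then the files inside each
    block. Assumption: workers share `seed`, so the same (seed, epoch) pair
    yields the same list everywhere."""
    rng = random.Random(seed * 1_000_003 + epoch)
    block_order = list(range(len(blocks)))
    rng.shuffle(block_order)
    result = []
    for block_id in block_order:
        files = list(blocks[block_id])
        rng.shuffle(files)
        # Files of one block stay contiguous, so a block is read in one pass.
        result.extend((block_id, name) for name in files)
    return result


class BlockCache:
    """Local cache that admits and evicts whole blocks, never single files.
    LRU is an assumed policy; the source only states block granularity."""

    def __init__(self, capacity_blocks, fetch_block):
        self.capacity = capacity_blocks
        self.fetch_block = fetch_block   # callable: block_id -> {name: bytes}
        self.fetches = 0                 # number of remote block fetches
        self._blocks = OrderedDict()

    def read(self, block_id, file_name):
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)      # mark as recently used
        else:
            self._blocks[block_id] = self.fetch_block(block_id)
            self.fetches += 1
            if len(self._blocks) > self.capacity:
                self._blocks.popitem(last=False)    # evict the oldest block
        return self._blocks[block_id][file_name]
```

Because the per-epoch list keeps the files of a block adjacent, iterating over `epoch_file_list(...)` and calling `BlockCache.read` for each entry fetches every block once per epoch, even with a cache that holds a single block. The sample order still changes each epoch at both the block level and the file level.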

Description

Technical field

[0001] The present invention relates to the field of AI training and, more specifically, to a method, system, computer device and readable medium for artificial intelligence training on massive small files.

Background art

[0002] AI (artificial intelligence) training on large-scale sets of massive small files usually has the following characteristics:
1. Large-scale data sets are usually placed in external storage media (systems), such as nfs, beegfs, cloud, etc.;
2. In a traditional file system, the metadata (including access time, permissions, modification time, etc.) of a large number of small files usually resides on disk. To read a file, the on-disk metadata must first be loaded into memory, the location of the file on disk is then obtained from that metadata, and finally the file's stored data is read from disk, so the overall performance of reading the file is poor;
3. The reading of s...

Claims

Application Information

IPC(8): G06K9/62, G06N20/20

CPC: G06N20/20, G06F18/214

Inventor: 刘慧兴

Owner: SUZHOU LANGCHAO INTELLIGENT TECH CO LTD
