Data preprocessing method, device, and storage medium
A data preprocessing technology, applied in fields such as electrical digital data processing, program control design, and instruments, that addresses the lack of standardization in preprocessing steps and reduces waiting.
Examples
Embodiment 1
[0020] As shown in Figure 1, a data preprocessing method provided by an embodiment of the present invention includes:
[0021] S101. Monitor the path where the original data is located.
[0022] Specifically, the original data may be artificial intelligence training data that requires preprocessing, or big data that requires data cleaning. The data files may be in various formats, including but not limited to images, text, and tables.
[0023] S102. When unprocessed raw data is detected, execute the preprocessing script or program corresponding to each step, in the execution order preset for the steps in the configuration file.
[0024] The preprocessing scripts or programs may be implemented in the same or different programming languages. Each one preprocesses the data under the data input path and saves the preprocessing result to the data output path.
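Steps S101 and S102 can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the step dictionaries, the `processed` set used to remember handled files, and the helper names (`find_unprocessed`, `run_pipeline`, `poll_once`, `marker_step`) are all assumptions made for the example. Each step is launched as a separate process, so the scripts could in principle be written in different languages.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def find_unprocessed(input_dir, processed):
    """Return files in input_dir whose names are not in the processed set."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.name not in processed
    )


def run_pipeline(steps, input_dir, output_dir):
    """Run each step's command in ascending 'order'; return the executed names.

    The input and output paths are appended to every command. Execution
    stops at the first step whose process exits with a non-zero code.
    """
    executed = []
    for step in sorted(steps, key=lambda s: s["order"]):
        cmd = list(step["command"]) + [str(input_dir), str(output_dir)]
        if subprocess.run(cmd).returncode != 0:
            break
        executed.append(step["name"])
    return executed


def poll_once(steps, input_dir, output_dir, processed):
    """One monitoring pass: run the pipeline only if new raw data exists."""
    new_files = find_unprocessed(input_dir, processed)
    if not new_files:
        return []
    executed = run_pipeline(steps, input_dir, output_dir)
    if len(executed) == len(steps):
        # Mark files as handled only when every step succeeded.
        processed.update(p.name for p in new_files)
    return executed


def marker_step(name, order):
    """Hypothetical stand-in step: writes '<name>.done' into the output path."""
    code = f"import sys, pathlib; pathlib.Path(sys.argv[2], '{name}.done').touch()"
    return {"name": name, "order": order, "command": [sys.executable, "-c", code]}


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        input_dir, output_dir = Path(tmp, "in"), Path(tmp, "out")
        input_dir.mkdir()
        output_dir.mkdir()
        (input_dir / "sample.txt").write_text("raw")
        steps = [marker_step("normalize", 2), marker_step("clean", 1)]
        processed = set()
        first = poll_once(steps, input_dir, output_dir, processed)
        second = poll_once(steps, input_dir, output_dir, processed)
        outputs = sorted(p.name for p in output_dir.iterdir())
        return first, second, outputs
```

In a long-running service, `poll_once` would typically be called in a loop with a sleep between passes; a file-system notification mechanism could replace polling, but the source does not say which approach is used.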
[0025] Specifically, in the data preprocessing proces...
Embodiment 2
[0031] As shown in Figure 3, a data preprocessing method provided by an embodiment of the present invention includes:
[0032] S301. Predefine a configuration file for data preprocessing.
[0033] The preprocessing configuration file is defined according to the actual application scenario. The configuration file predefines all data preprocessing steps and their execution order, the data input path, the data output path, the entry script, and the subtask scripts or programs corresponding to each step. The entry script defines the execution order of the subtask scripts or programs. In essence, the entry script calls each script or program in the step and, based on the return value, decides whether to exit abnormally or continue to the next step. The subtask scripts or programs are written according to preset rules to implement the function of the step. There are no requirements for the number of subtask scripts o...
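The configuration file and entry-script behaviour described above might look something like the sketch below. The JSON layout, the field names (`input_path`, `output_path`, `steps`, `order`, `subtasks`), the paths, and the step names are all hypothetical, since the source does not specify a file format. To keep the example self-contained, each subtask is an inline Python snippet standing in for a real script path, and the entry logic is assumed to run once per step.

```python
import json
import subprocess
import sys

# Hypothetical configuration; a real file would reference script paths.
CONFIG_TEXT = """
{
  "input_path": "/data/raw",
  "output_path": "/data/clean",
  "steps": [
    {"name": "validate", "order": 2, "subtasks": ["raise SystemExit(3)"]},
    {"name": "convert",  "order": 1, "subtasks": ["pass", "pass"]},
    {"name": "export",   "order": 3, "subtasks": ["pass"]}
  ]
}
"""


def load_config(text):
    """Parse the configuration and sort its steps by execution order."""
    config = json.loads(text)
    config["steps"].sort(key=lambda step: step["order"])
    return config


def run_entry(step):
    """Entry logic for one step: call each subtask and check its return value.

    Returns 0 if every subtask succeeds, otherwise the first non-zero
    return code (the abnormal exit case).
    """
    for snippet in step["subtasks"]:
        code = subprocess.run([sys.executable, "-c", snippet]).returncode
        if code != 0:
            return code
    return 0


def run_all(config):
    """Run steps in order; return (name, return code) pairs.

    Processing stops after the first step that reports a failure.
    """
    results = []
    for step in config["steps"]:
        code = run_entry(step)
        results.append((step["name"], code))
        if code != 0:
            break
    return results
```

With this sample configuration, `run_all(load_config(CONFIG_TEXT))` runs `convert` first, stops at `validate` because its subtask exits with code 3, and never reaches `export`.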
Embodiment 3
[0049] As shown in Figure 4, this embodiment of the present invention is described by taking face recognition as an example.
[0050] Take the face image preprocessing required for face recognition as an example. One source of face images is celebrity faces collected from the Internet. The collected faces belonging to the same person are stored under the same path. The photos under that path can be used for face model training only after four preprocessing steps: face detection, face positioning, face calibration, and face deduplication. Each of these four steps has a corresponding processing script or program, but all of them are currently run manually. The present invention can be applied according to the following steps to automate them.
[0051] In step S401, the preprocessing steps for the face recognition data are predefined as four steps: face detection, face positioning, face calibration, and face deduplication.
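As a rough illustration of step S401, the four predefined steps could be written down as ordered configuration entries and expanded into an execution plan. The script names (`detect.py`, `locate.py`, `calibrate.py`, `dedup.py`), the directory names, and the choice to plan one run per person directory are assumptions for this sketch; the source names only the four steps and states that faces of the same person share a path. The sketch only plans the work and does not run any face processing.

```python
import tempfile
from pathlib import Path

# Hypothetical script names; listed out of order on purpose to show sorting.
FACE_STEPS = [
    {"order": 3, "name": "face_calibration", "script": "calibrate.py"},
    {"order": 1, "name": "face_detection", "script": "detect.py"},
    {"order": 4, "name": "face_deduplication", "script": "dedup.py"},
    {"order": 2, "name": "face_positioning", "script": "locate.py"},
]


def plan_commands(raw_root, steps):
    """Return (person, step name, script) tuples for every person directory.

    Steps appear in ascending 'order' within each person's block.
    """
    ordered = sorted(steps, key=lambda step: step["order"])
    people = sorted(p.name for p in Path(raw_root).iterdir() if p.is_dir())
    return [(person, s["name"], s["script"]) for person in people for s in ordered]


def demo():
    with tempfile.TemporaryDirectory() as root:
        # One directory per person, as described in paragraph [0050].
        for person in ("person_b", "person_a"):
            Path(root, person).mkdir()
        return plan_commands(root, FACE_STEPS)
```

Calling `demo()` yields eight planned entries: the four steps in their configured order for `person_a`, followed by the same four for `person_b`.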
[0052] Step S402, respectively predefine the subtasks of each step, and write t...