Data cleaning method
A data cleaning method for two-dimensional data tables, applied in the field of data cleaning. It addresses problems such as yearbook data cleaning work that cannot be completed and OCR recognition methods that cannot take effect, and achieves a high degree of automation, low labor cost, and good results.
Examples
Embodiment 1
[0053] As shown in Figure 1, this embodiment provides a data cleaning method comprising the following steps:
[0054] S1. Determine area coordinates corresponding to preset areas in the two-dimensional data table, where the preset areas include the area where row headers are located and the area where column headers are located.
[0055] This step specifically includes: determining the areas corresponding to different fill colors in the two-dimensional data table, and determining the area coordinates corresponding to the preset areas according to the coordinate value characteristics of the area coordinates of those fill-color areas; or determining the area coordinates respectively corresponding to the preset areas in the two-dimensional data table according to a preset correspondence between different fill colors and different preset areas.
[0056] For example, when performing data cleaning...
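The second alternative in step S1 (a preset correspondence between fill colors and preset areas) can be illustrated with a minimal sketch. It is not the patent's implementation: the table is assumed to be available as a grid of fill-color names, and the names color_bounding_boxes, locate_preset_areas, "yellow", and "green" are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (first_row, first_col, last_row, last_col), zero-based and inclusive
Box = Tuple[int, int, int, int]


def color_bounding_boxes(fills: List[List[Optional[str]]]) -> Dict[str, Box]:
    """Return the bounding box of the cells that share each fill color."""
    boxes: Dict[str, Box] = {}
    for r, row in enumerate(fills):
        for c, color in enumerate(row):
            if color is None:
                continue  # unfilled cell, assumed here to be a data cell
            if color in boxes:
                r0, c0, r1, c1 = boxes[color]
                boxes[color] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
            else:
                boxes[color] = (r, c, r, c)
    return boxes


def locate_preset_areas(
    fills: List[List[Optional[str]]], color_to_area: Dict[str, str]
) -> Dict[str, Box]:
    """Map each preset area to its coordinates via a preset color mapping."""
    boxes = color_bounding_boxes(fills)
    return {
        area: boxes[color]
        for color, area in color_to_area.items()
        if color in boxes
    }


# Hypothetical 4x4 table: yellow marks the column header, green the row header.
Y, G = "yellow", "green"
fills = [
    [None, Y, Y, Y],
    [G, None, None, None],
    [G, None, None, None],
    [G, None, None, None],
]
areas = locate_preset_areas(fills, {Y: "column_header", G: "row_header"})
```

In this sketch each area is reduced to a single bounding box, which assumes that cells of one fill color form one rectangular region.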
Embodiment 2
[0070] As shown in Figure 2, this embodiment provides another data cleaning method, which differs from Embodiment 1 in that the preset areas in step S1 also include the area where the table title of the two-dimensional data table is located, and the following steps are further included after step S36:
[0071] S37. Determine the last column of the area where the column header is located according to the area coordinates corresponding to that area;
[0072] S38. Write the header attributes into the cells jointly identified by the preset columns after the last column and the first preset row among the rows occupied by the area where the table title is located, wherein the number of preset columns is the same as the number of header attributes;
[0073] S39. According to the semantics of the words constituting the table title obtained through analysis, write the corresponding words in the column of the table header attribute to which the corre...
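Steps S37 and S38 can be sketched as follows, under stated assumptions. The table is modeled as a sparse mapping from (row, column) to cell text, the "first preset row" is assumed to be selectable by an offset from the top of the title area, and the attribute names "year" and "region" are hypothetical placeholders. Step S39 is not covered, since the source text for it is truncated.

```python
from typing import Dict, List, Tuple

# (first_row, first_col, last_row, last_col), zero-based and inclusive
Box = Tuple[int, int, int, int]
# Sparse table model: (row, col) -> cell text
Grid = Dict[Tuple[int, int], str]


def last_header_column(header_box: Box) -> int:
    """S37: the last column of the header area, read from its coordinates."""
    return header_box[3]


def write_header_attributes(
    grid: Grid,
    header_box: Box,
    title_box: Box,
    attributes: List[str],
    preset_row_offset: int = 0,  # assumption: which title row counts as "preset"
) -> List[int]:
    """S38: write one attribute per preset column after the last header column."""
    target_row = title_box[0] + preset_row_offset
    if target_row > title_box[2]:
        raise ValueError("preset row lies outside the table title area")
    first_new_col = last_header_column(header_box) + 1
    new_cols = []
    for offset, attribute in enumerate(attributes):
        col = first_new_col + offset
        grid[(target_row, col)] = attribute
        new_cols.append(col)
    # The number of preset columns equals the number of header attributes.
    return new_cols


# Hypothetical layout: title in row 0, header area in row 1, columns 1 to 3.
grid: Grid = {(0, 0): "Table title"}
new_cols = write_header_attributes(
    grid,
    header_box=(1, 1, 1, 3),
    title_box=(0, 0, 0, 3),
    attributes=["year", "region"],
)
```

Returning the list of new column indices keeps them available for a later step that fills in values under each attribute.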