Patent data cleaning method and system based on AdaBoost algorithm
A patent data cleaning technology, applied in the field of data processing, that addresses problems such as the inability to process data sets effectively, performance that cannot meet requirements, and degraded data quality, and that achieves the effects of avoiding overfitting, a low error rate, and a high accuracy rate.
Embodiment Construction
[0037] To explain the technical content, structural features, objectives, and effects of the technical solution in detail, a description is given below in conjunction with specific embodiments and the accompanying drawings.
[0038] Referring to Figure 1, a flow chart of a preferred embodiment of the present invention, a patent data cleaning method based on the AdaBoost algorithm includes the following steps:
[0039] S1. Collect patent data from the patent database, and store the collected patent data in the database to be cleaned.
[0040] In this embodiment, the Derwent database is used as the basic data source, and the field of data collection is the iron and steel industry. Search terms, IPC classification numbers, Derwent manual codes, and the like are compiled as the basic search means, a search strategy is formulated, and patent data related to the iron and steel industry are extracted. A total of about 270,000 pieces of patent data are retrieved for data index...
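Step S1 can be illustrated with a minimal sketch, which is not the implementation of the invention. It assumes the retrieved records have already been exported as simple dictionaries and uses an in-memory SQLite table as a stand-in for the database to be cleaned. The table name `to_clean`, the field names `pub_no`, `title`, and `ipc`, the helper `stage_records`, and the sample records are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical sample records standing in for search results exported
# from a patent database. Publication numbers and titles are made up;
# the IPC values are given at subclass level only, for illustration.
raw_records = [
    {"pub_no": "EXAMPLE-1", "title": "Blast furnace slag treatment", "ipc": "C21B"},
    {"pub_no": "EXAMPLE-2", "title": "Steel strip rolling method", "ipc": "B21B"},
    # A repeated export of the first record: raw data is stored as-is,
    # since removing such problems is the job of the later cleaning steps.
    {"pub_no": "EXAMPLE-1", "title": "Blast furnace slag treatment", "ipc": "C21B"},
]


def stage_records(conn, records):
    """Insert collected records into the (assumed) to-be-cleaned table.

    Returns the total number of rows staged so far.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS to_clean ("
        "id INTEGER PRIMARY KEY, pub_no TEXT, title TEXT, ipc TEXT)"
    )
    # Named placeholders are filled from each record dictionary.
    conn.executemany(
        "INSERT INTO to_clean (pub_no, title, ipc) "
        "VALUES (:pub_no, :title, :ipc)",
        records,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM to_clean").fetchone()[0]


conn = sqlite3.connect(":memory:")
staged = stage_records(conn, raw_records)
print(f"{staged} records staged for cleaning")
```

In this sketch the records are stored without any filtering, so duplicates and other defects remain visible to whatever cleaning procedure runs afterwards.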