Configurable data analysis method and computer readable storage medium
A configurable data parsing technique, applied in the field of Internet data crawling. It addresses three problems of existing approaches: they cannot parse JSON-format web pages, they adapt poorly to differences between web pages, and they cannot adapt to different encapsulation modes. The technique improves parsing efficiency and flexibility, reduces the amount of data to be parsed, and makes the results easier for machines to recognize.
Examples
Example Embodiment
[0038] Example one
[0039] Referring to Figure 1, a configurable data parsing method includes the following steps. Create a new parsing configuration page and, on that page, configure the URL of the target web page to be crawled, the parsing type, the parsing attributes, and the name of the logical table used to save the parsing results; submit the configuration when it is complete. The logical table is generally a two-dimensional table, such as an Excel sheet. The parsing attributes include a parsing area and row positioning information, or row positioning information only. Create a new field configuration page and, on that page, configure the field name of each field in the logical table. Each field name corresponds to a data object to be extracted: for example, the field name reg_code corresponds to the number string of the organization code, and the field name law_person corresponds to the name of the legal...
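The two configuration pages described above can be pictured as two small records. The following is a minimal sketch, not the patent's implementation: the class names, attribute names, the example URL, the selector strings, and the table name company_info are all hypothetical; only the field names reg_code and law_person come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsingConfig:
    """Hypothetical record for what is entered on the parsing configuration page."""
    url: str                            # URL of the target web page to be crawled
    parsing_type: str                   # assumed values such as "html" or "json"
    row_locator: str                    # row positioning information
    table_name: str                     # logical table that stores the results
    parsing_area: Optional[str] = None  # optional: the parsing area may be omitted


@dataclass
class FieldConfig:
    """Hypothetical record for what is entered on the field configuration page."""
    table_name: str
    field_names: List[str]              # one name per data object to be extracted


def belongs_together(parsing: ParsingConfig, fields: FieldConfig) -> bool:
    """Check that both configurations refer to the same logical table."""
    return parsing.table_name == fields.table_name and len(fields.field_names) > 0


# Illustrative values only; none of them are taken from the patent.
parsing_config = ParsingConfig(
    url="https://example.com/companies",
    parsing_type="html",
    row_locator="tr",
    table_name="company_info",
    parsing_area="table#result",
)
field_config = FieldConfig(
    table_name="company_info",
    field_names=["reg_code", "law_person"],
)
```

Making parsing_area optional mirrors the statement that the parsing attributes may consist of row positioning information alone.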
Example Embodiment
[0061] Example two
[0062] This embodiment differs from the first embodiment in that the field configuration further includes configuring a field identifier. When data objects are extracted, each data object to be extracted is matched by its field identifier and then mapped into the logical table.
[0063] The data parsing method performs the following steps: on the parsing configuration page, configure the URL of the target web page to be crawled, the parsing type, the parsing attributes, and the name of the logical table for saving the parsing results, and submit the configuration when it is complete; the parsing attributes include a parsing area and row positioning information, or row positioning information only; as shown in Figure 14, configure the field name and field identifier of each field in the logical table on the field configuration page; create a blank logical table according to the logical table name, and write each fiel...
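Matching by field identifier can be illustrated with a JSON payload. This is a sketch under stated assumptions: the text does not say what form a field identifier takes, so the code assumes it is the JSON key under which a data object appears. The identifiers regCode and lawPerson, the function name, and the sample record are invented for illustration.

```python
import json

# Hypothetical field configuration: field name -> field identifier.
# The identifiers are assumed to be JSON keys; the patent text does not specify this.
FIELD_IDENTIFIERS = {
    "reg_code": "regCode",
    "law_person": "lawPerson",
}


def extract_by_identifier(record, field_identifiers):
    """Map one parsed record onto the logical table's field names.

    Each value is looked up by its field identifier, so the order of keys
    in the source record does not matter and unconfigured keys are ignored.
    A missing identifier yields None for that field.
    """
    return {
        field_name: record.get(identifier)
        for field_name, identifier in field_identifiers.items()
    }


# Made-up sample payload, with keys in a different order than the fields.
SAMPLE_JSON = '[{"lawPerson": "Zhang San", "regCode": "12345678-9", "extra": "ignored"}]'

rows = [
    extract_by_identifier(record, FIELD_IDENTIFIERS)
    for record in json.loads(SAMPLE_JSON)
]
```

Because the lookup is keyed by identifier rather than by position, the source page can reorder or add values without breaking the mapping into the logical table.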
Example Embodiment
[0068] Example three
[0069] A computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the following steps are performed: on the parsing configuration page, configure the URL of the target web page to be crawled, the parsing type, the parsing attributes, and the name of the logical table for storing the parsing results, and submit the configuration when it is complete; the parsing attributes include a parsing area and row positioning information, or row positioning information only; configure the field name of each field in the logical table on the field configuration page, each field name corresponding to a data object to be extracted; create a blank logical table according to the logical table name and write each field name into it, with the fields ordered consistently with the order in which the data objects are extracted during parsing; capture the target web page corresponding to the UR...
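The step of creating a blank logical table whose field order matches the extraction order can be sketched as follows. This is an illustration only: it assumes an HTML page in which each tr element is a row and each td element is one data object, represents the logical table as a plain dictionary, and uses made-up sample values. The names RowExtractor, create_blank_table, and fill_table are hypothetical.

```python
from html.parser import HTMLParser


class RowExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def create_blank_table(field_names):
    """Create an empty logical table whose header is the configured field names."""
    return {"header": list(field_names), "rows": []}


def fill_table(table, html):
    """Map each extracted row onto the header by position."""
    extractor = RowExtractor()
    extractor.feed(html)
    width = len(table["header"])
    for cells in extractor.rows:
        if len(cells) == width:  # skip rows that do not match the field count
            table["rows"].append(dict(zip(table["header"], cells)))
    return table


# Made-up sample page; the values are illustrative only.
SAMPLE_HTML = """
<table>
  <tr><td>12345678-9</td><td>Zhang San</td></tr>
  <tr><td>98765432-1</td><td>Li Si</td></tr>
</table>
"""

table = fill_table(create_blank_table(["reg_code", "law_person"]), SAMPLE_HTML)
```

Positional mapping only works because the header order and the extraction order agree, which is why the text requires the two to be consistent.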