A method and device for automatically obtaining enterprise multi-level classification training data
The technology concerns multi-level classification training data and is applied in natural language data processing, text database querying, electronic digital data processing, and related fields. It addresses problems such as low work efficiency, reduced efficiency and accuracy of enterprise industry classification, and the inability to meet practical application needs, and it achieves the effect of improving accuracy.
Examples
Embodiment 1
[0055] As shown in Figure 1, the embodiment of the present invention provides a method for automatically obtaining enterprise multi-level classification training data, including:
[0056] S101. Obtain industry information, product name information and enterprise description text;
[0057] S102. Generate an industry hierarchy system according to the industry information;
[0058] S103. Cluster the product name information and associate the clusters with the industry hierarchy system to obtain a multi-level industry keyword list;
[0059] S104. According to the keywords in the enterprise description text that match the multi-level industry keyword list, mark the corresponding industry classification for the enterprise and obtain its industry labels at each level;
[0060] S105. Form training data from the enterprise description text and the enterprise's industry labels at each level.
[0061] Optionally, in step S101, the industry information, product name informa...
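Steps S102 through S105 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the toy industry pairs, the product-to-industry mapping, the two-level hierarchy, the shared-token grouping used as a stand-in for the clustering step, and all function names. The text above does not specify the clustering algorithm or the matching rule, so this is one plausible reading rather than the patented implementation.

```python
from collections import defaultdict

# Hypothetical toy inputs standing in for S101; real data sources are not specified.
INDUSTRY_INFO = [
    ("Manufacturing", "Electronics"),
    ("Manufacturing", "Machinery"),
    ("Services", "Software"),
]
# Assumed mapping: product name -> level-2 industry it was listed under.
PRODUCTS = {
    "lithium battery": "Electronics",
    "battery charger": "Electronics",
    "cnc lathe": "Machinery",
    "erp software": "Software",
    "cloud erp platform": "Software",
}


def build_hierarchy(pairs):
    """S102: map each level-2 industry to its level-1 parent."""
    return {child: parent for parent, child in pairs}


def cluster_products(names):
    """S103, part 1: greedily group product names that share a token.

    This is a simple stand-in for whatever clustering method is actually used.
    """
    clusters = []
    for name in sorted(names):
        tokens = set(name.split())
        for cluster in clusters:
            if tokens & cluster["tokens"]:
                cluster["names"].append(name)
                cluster["tokens"] |= tokens
                break
        else:
            clusters.append({"names": [name], "tokens": set(tokens)})
    return clusters


def build_keyword_list(clusters, product_industry, hierarchy):
    """S103, part 2: attach each cluster's tokens to a (level-1, level-2) path."""
    keywords = defaultdict(set)
    for cluster in clusters:
        for name in cluster["names"]:
            level2 = product_industry[name]
            keywords[(hierarchy[level2], level2)] |= cluster["tokens"]
    return dict(keywords)


def label_enterprise(description, keywords):
    """S104: choose the industry path with the most keyword hits, or None."""
    tokens = description.lower().split()
    scores = {path: sum(t in kws for t in tokens) for path, kws in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def build_training_data(descriptions, keywords):
    """S105: pair each labelled description with its per-level labels."""
    rows = []
    for text in descriptions:
        path = label_enterprise(text, keywords)
        if path is not None:  # unmatched descriptions are skipped
            rows.append({"text": text, "level1": path[0], "level2": path[1]})
    return rows
```

In this sketch, a description with no keyword hit is simply dropped from the training data; whether the actual method discards, defers, or manually reviews such enterprises is not stated in the excerpt.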
Embodiment 2
[0105] An embodiment of the present invention provides a method for multi-level classification of an enterprise, including:
[0106] Use the training data obtained by the method described in Embodiment 1 to train a classification algorithm and obtain an enterprise classification model;
[0107] Input the enterprise description text into the enterprise classification model to obtain the multi-level industry classification of the enterprise.
[0108] Specifically, after the training data is obtained using the method described in Embodiment 1, the BiLSTM classification algorithm is selected and trained on that data to obtain a reliable enterprise classification model.
[0109] The enterprise description text includes the enterprise's publicly available products, business, business scope, patent data, and so on. Optionally, the enterprise description text is preprocessed, including feature selection, word segmentation, stop word removal, length fillin...
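The preprocessing steps named above can be sketched as follows. This is a hedged illustration only: whitespace splitting stands in for word segmentation (Chinese text would need a dedicated segmenter), the stop word list is hypothetical, and the truncated "length fillin..." step is assumed to mean padding or truncating each sequence to a fixed length so it can be fed to a sequence model such as BiLSTM. Feature selection is omitted.

```python
STOP_WORDS = {"the", "and", "of", "a", "in"}  # hypothetical stop word list
PAD, UNK = "<pad>", "<unk>"  # assumed special tokens


def tokenize(text):
    """Word segmentation stand-in: lowercase and split on whitespace."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocab(token_lists):
    """Assign an integer id to every token, reserving 0 and 1 for PAD and UNK."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, truncate to max_len, then pad with the PAD id."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def preprocess(texts, max_len=6):
    """Run segmentation, stop word removal, and fixed-length encoding."""
    token_lists = [remove_stop_words(tokenize(t)) for t in texts]
    vocab = build_vocab(token_lists)
    return [encode(tokens, vocab, max_len) for tokens in token_lists], vocab
```

Each description becomes an equal-length list of integer ids, which is the usual input shape for an embedding layer placed in front of a recurrent classifier.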
Embodiment 3
[0113] As shown in Figure 2, the present invention also includes a functional module architecture that corresponds exactly to the aforementioned method flow. That is, the embodiment of the present invention also provides a device for automatically obtaining enterprise multi-level classification training data, including:
[0114] An information acquisition module 201, configured to acquire industry information, product name information and enterprise description text;
[0115] An industry level generation module 202, configured to generate an industry hierarchy system according to the industry information;
[0116] A keyword list acquisition module 203, configured to cluster the product name information and associate it with the industry hierarchy system to obtain a multi-level industry keyword list;
[0117] An industry label acquisition module 204, configured to mark the corresponding industry classification for the enterprise according to the keywords matching the mult...
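The module architecture can be pictured as a small pipeline object whose fields correspond to modules 201 through 204. The sketch below is an assumption-laden illustration: the class name `TrainingDataDevice`, its field names, and the toy lambdas are all hypothetical, and the final pairing of text with labels mirrors step S105 of the method because the corresponding module is not visible in this excerpt.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TrainingDataDevice:
    """Hypothetical wiring of the modules; each field is a pluggable callable."""

    acquire: Callable[[], tuple]                 # information acquisition module 201
    generate_levels: Callable[[Any], Any]        # industry level generation module 202
    acquire_keywords: Callable[[Any, Any], Any]  # keyword list acquisition module 203
    acquire_labels: Callable[[str, Any], Any]    # industry label acquisition module 204

    def run(self):
        industries, products, descriptions = self.acquire()
        hierarchy = self.generate_levels(industries)
        keywords = self.acquire_keywords(products, hierarchy)
        # Pair each description with its labels, mirroring step S105.
        return [(text, self.acquire_labels(text, keywords)) for text in descriptions]


# Toy wiring with assumed data, for illustration only.
device = TrainingDataDevice(
    acquire=lambda: ([("Services", "Software")], {"erp": "Software"}, ["erp vendor"]),
    generate_levels=lambda industries: {child: parent for parent, child in industries},
    acquire_keywords=lambda products, hierarchy: {
        keyword: (hierarchy[industry], industry) for keyword, industry in products.items()
    },
    acquire_labels=lambda text, keywords: next(
        (path for keyword, path in keywords.items() if keyword in text.split()), None
    ),
)
training_rows = device.run()
```

Keeping each module behind a plain callable means any one stage, for example the clustering inside module 203, can be swapped without touching the others.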


