Enterprise name keyword extraction method
An enterprise name keyword extraction method, applied in the field of data processing, that addresses the heavy manual effort and the difficulty of extracting enterprise name keywords, and achieves high coverage.
Active Publication Date: 2018-03-02
中检美亚(厦门)科技有限公司
AI Technical Summary
Problems solved by technology
Because enterprise names are complex and diverse, extracting enterprise name keywords with data processing techniques is difficult.
At present, enterprise name keyword data can only be screened and supplemented manually, so obtaining a large volume of keyword data with high coverage requires a great deal of manual labor in practice.
Embodiment
[0036] Referring to Figure 1, the invention discloses a method for extracting enterprise name keywords, comprising the following steps:
[0037] S1. Build a basic hot word library related to enterprise names, and tag each hot word in the library to define its tag category. The basic hot word library is built as follows:
[0038] S11. Prepare enterprise name data in advance. In this embodiment, the enterprise name data is collected by a web crawler, and the enterprise name data contains more than 40 million enterprise names.
[0039] S12. Perform Chinese word segmentation on the enterprise name data. The segmentation may be carried out with the IKAnalyzer word segmenter, the Ansj word segmenter, or the Stanford word segmenter; other word segmenters can of course also be adopted, the present invention does not...
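The remaining sub-steps of S1 are cut off in this text, so the following is only a rough sketch of one plausible way to turn segmented names into a hot word library: count how often each segmented word occurs and keep the frequent ones. The function name `build_hot_words`, the `min_count` threshold, and the sample names are all assumptions made for illustration, not details taken from the patent. The input is assumed to be already segmented, for example by one of the segmenters named in S12.

```python
from collections import Counter


def build_hot_words(segmented_names, min_count=2):
    """Return the set of segmented words that occur at least min_count times.

    Assumption: "hot" means "frequent across the corpus"; the source text
    does not confirm this criterion.
    """
    counts = Counter(tok for tokens in segmented_names for tok in tokens)
    return {tok for tok, n in counts.items() if n >= min_count}


# Hypothetical, already-segmented enterprise names (illustrative only).
sample = [
    ["厦门", "美亚", "信息", "科技", "有限公司"],  # Xiamen / Meiya / Information / Technology / Co., Ltd.
    ["厦门", "北辰", "文化", "传播", "有限公司"],  # Xiamen / Beichen / Culture / Communication / Co., Ltd.
    ["福州", "山川", "科技", "有限公司"],          # Fuzhou / Shanchuan / Technology / Co., Ltd.
]

hot_words = build_hot_words(sample)
```

With a corpus of tens of millions of names, as described in S11, the threshold would need to be far higher than in this toy sample; tagging each hot word with a category (S1) is not covered here.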
Example 1
[0060] 1. In step S2, the user inputs "Xiamen Meiya Shangding Information Technology Co., Ltd.", and the word segmentation result is:
[0066] 4. In step S5, the blank operation process is as follows:
[0067]
[0068] The final result is: Meiya Shangding.
[0069] 5. In step S6, it is determined that the length of "Meiya Shangding" is greater than 2,...
Example 2
[0071] 1. The user inputs "Xiamen Beichen Shanchuan Culture Communication Co., Ltd.", and steps S2-S6 are executed. Every word in the company name is replaced with a blank, so the result is "", and step S7 is executed.
[0072] 2. The execution process of step S7 is:
[0073]
Abstract
The invention discloses an enterprise name keyword extraction method. The method comprises the following steps: establishing a basic hot word library related to enterprise names; performing Chinese word segmentation on the enterprise name input by a user, and outputting a word segmentation result; declaring a new array arrs_a and traversing the word segmentation result, and, whenever a segmented word matches a hot word in the basic hot word library during the traversal, adding that segmented word to the array arrs_a; sorting the array arrs_a by the word lengths and then the positions of the segmented words; and traversing the sorted array arrs_a, performing a blank-replacement operation on the enterprise name in sequence for each segmented word in the array, and taking the word that finally remains as the enterprise name keyword. The enterprise name keyword can thus be extracted quickly from the enterprise name, so that enterprise name keyword data with large volume and high coverage can be obtained conveniently.
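The steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: segmentation is assumed to have been done already, the sort is assumed to be longest word first with ties broken by earlier position, and the Chinese strings are a back-translation of the name used in Example 1. The function name `extract_keyword` and the sample hot word set are hypothetical.

```python
def extract_keyword(name, tokens, hot_words):
    """Blank out every hot word found in the name and return what remains."""
    # arrs_a: segmented words that match a hot word in the library.
    arrs_a = [t for t in tokens if t in hot_words]
    # Assumed ordering: longer words first, then earlier position in the name.
    arrs_a.sort(key=lambda w: (-len(w), name.find(w)))
    result = name
    for word in arrs_a:
        # "Blank replacement": remove each matched word from the name.
        result = result.replace(word, "")
    return result


# Hypothetical hot word library and an assumed segmentation result.
hot_words = {"厦门", "信息", "科技", "有限公司"}
name = "厦门美亚商鼎信息科技有限公司"  # Xiamen Meiya Shangding Information Technology Co., Ltd.
tokens = ["厦门", "美亚", "商鼎", "信息", "科技", "有限公司"]

keyword = extract_keyword(name, tokens, hot_words)  # "美亚商鼎" (Meiya Shangding)
```

If every segmented word is a hot word, the function returns an empty string, which corresponds to the situation in Example 2 where the method falls through to step S7.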
Description
Technical field
[0001] The invention relates to the technical field of data processing, in particular to a method for extracting enterprise name keywords.
Background
[0002] The enterprise name keyword is the most important part of an enterprise name and is also a core data asset of the enterprise. It plays an important role in processing enterprise data. If the keyword can be extracted quickly from a collected enterprise name, it can be provided to third-party systems for other purposes, including but not limited to search engines, crawlers, and public opinion analysis.
[0003] An enterprise name usually consists of four elements: administrative division, trade name, industry, and organizational form, among which the trade name is the core part of the enterprise name keyword. Due to the complexity and diversity of enterprise names, it is more dif...