A Term Extraction Method Based on Definition and Relationship
A term extraction technique based on definitions and relationships, applied in the field of text mining, addresses problems such as weak recognition of low-frequency terms, poor extraction of long terms, and omissions, thereby improving recognition ability.
Embodiment Construction
[0062] The embodiments will be described in detail below in conjunction with the accompanying drawings.
[0063] Figure 1 is the overall flowchart of the term extraction method proposed by the present invention, which specifically includes the following steps:
[0064] Step (1): text preprocessing, word segmentation, and word frequency statistics
[0065] Resources presented as HTML web pages are the most widely available and the easiest to obtain, so the present invention takes HTML text as the input data of the method. Because HTML resources are not plain text, the preprocessing stage requires data cleaning.
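The cleaning and counting in Step (1) can be illustrated with a minimal Python sketch. It is not the invention's implementation: the function names `clean_html` and `word_frequencies` are hypothetical, the regular expressions are simplified assumptions, and a plain regex tokenizer stands in for whatever word segmentation the method actually uses. The sketch strips all markup and does not reproduce any specific handling of particular page elements.

```python
import html
import re
from collections import Counter

# Illustrative patterns only (assumptions, not taken from the source).
_SCRIPT_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
_TAG = re.compile(r"<[^>]+>")
_TOKEN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def clean_html(raw: str) -> str:
    """Reduce an HTML string to plain text (hypothetical helper)."""
    text = _SCRIPT_STYLE.sub(" ", raw)   # drop script/style blocks with their content
    text = _TAG.sub(" ", text)           # drop every remaining tag
    text = html.unescape(text)           # decode entities such as &amp;
    return re.sub(r"\s+", " ", text).strip()


def word_frequencies(text: str) -> Counter:
    """Count lower-cased tokens; a stand-in for real word segmentation."""
    return Counter(t.lower() for t in _TOKEN.findall(text))


page = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<p>Text mining extracts terms.</p>"
    "<p>Terms matter &amp; text matters.</p>"
    "</body></html>"
)
plain = clean_html(page)
freq = word_frequencies(plain)
print(plain)
print(freq.most_common(2))
```

Regular expressions are sufficient for a rough sketch like this, but they are fragile on malformed markup; a real pipeline might prefer a proper HTML parser, and text in languages written without spaces would need a dedicated segmenter in place of `_TOKEN`.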
[0066] Pictures in a web page appear in the text only as links that carry no semantic information, and tables are difficult to process because their format varies, so the present invention uses regular expressions to label with
[0074] The content in the tag is a...