General text information extraction method and system
A general text information extraction method and system, applicable to unstructured text data retrieval, special-purpose data processing applications, natural language data processing, and similar fields. It addresses the problems that existing extraction approaches are not generally applicable, that their extraction accuracy is difficult to predict, and that they are strongly limited in scope. The effects achieved are avoiding investment in manual labeling, expanding the scope of application, and reducing dependence.
Embodiment Construction
[0045] As shown in Figure 3, the general text information extraction method of the present invention comprises the following steps:
[0046] Step 1. Write a limited number of regular expressions and apply them to the original corpus to extract a text corpus and a field corpus;
[0047] Step 2. Cut a limited proportion of the extracted text corpus out as the training text corpus, and take the field corpus corresponding to the training text corpus as the training field corpus;
[0048] Step 3. Import the training text corpus, the training field corpus, and a limited number of fields preceding and following each training field into an automatic pattern induction method to construct the extraction model; the automatic pattern induction method is the CRF (conditional random field) algorithm;
[0049] Step a. Use the corpus remaining from Step 2 as the verification corpus, run the extraction model on the verification corpus, and judge the accuracy of ...
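The data preparation in Steps 1 to 3 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the sample records, the three regular expressions, the 50% training ratio, the window size of one, and the function names extract, split_corpus and window_features. The "front and rear limited number" of fields is read here as the neighbouring extracted fields, which the source text does not state unambiguously. The sketch stops at building the feature rows that would be handed to a CRF trainer; the training call itself is left out because the patent does not name an implementation.

```python
import random
import re

# Hypothetical raw records; the patent does not specify the corpus format.
RAW_CORPUS = [
    "Name: Alice Chen; Phone: 555-0101; City: Boston",
    "Name: Bob Li; Phone: 555-0102; City: Denver",
    "Name: Carol Wu; Phone: 555-0103; City: Austin",
    "Name: Dan Xu; Phone: 555-0104; City: Miami",
]

# Step 1: a small, fixed set of hand-written regular expressions (illustrative).
FIELD_PATTERNS = {
    "name": re.compile(r"Name:\s*([^;]+)"),
    "phone": re.compile(r"Phone:\s*([\d-]+)"),
    "city": re.compile(r"City:\s*([^;]+)"),
}


def extract(record):
    """Return (text, fields); fields is a list of (label, value) in text order."""
    found = []
    for label, pattern in FIELD_PATTERNS.items():
        match = pattern.search(record)
        if match:
            found.append((match.start(1), label, match.group(1).strip()))
    found.sort()  # order fields by their position in the text
    return record, [(label, value) for _, label, value in found]


def split_corpus(pairs, train_ratio, seed=0):
    """Step 2: cut a limited proportion for training; the rest is kept for verification."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def window_features(fields, window=1):
    """Step 3 input: each field plus up to `window` fields before and after it."""
    rows = []
    for i, (label, value) in enumerate(fields):
        feats = {"value": value}
        for k in range(1, window + 1):
            if i - k >= 0:
                feats[f"prev{k}"] = fields[i - k][1]
            if i + k < len(fields):
                feats[f"next{k}"] = fields[i + k][1]
        rows.append((feats, label))
    return rows


pairs = [extract(record) for record in RAW_CORPUS]
train, verify = split_corpus(pairs, train_ratio=0.5)
# One list of (features, label) rows per training record, ready for a sequence labeller.
training_rows = [window_features(fields) for _, fields in train]
```

The held-out verify list plays the role of the verification corpus in Step a. A sequence-labelling library would consume training_rows as per-record feature and label sequences; which library to use is left open here.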