State transition and neural network-based Chinese chunk parsing method
A Chinese chunk parsing method that combines state transitions with a neural network, applied in natural language data processing, special data processing applications, and related fields. It addresses problems such as the inability to make full use of chunk-level and long-distance information features.
Examples
Embodiment 1
[0176] The model parameters in this embodiment are trained according to the method given in the supplementary explanation of the model parameter training method in the specification. Training uses 9978 sentences from 728 files of the Chinese Penn Treebank (CTB) 4.0. The file numbers run from chtb_001.fid to chtb_899.ptb; the numbers are not consecutive, so there are only 728 files.
[0177] This embodiment applies the Chinese chunk parsing method of the present invention, based on state transitions and a neural network, to a single sentence. The complete process is as follows:
[0178] Step 1-1: define the Chinese chunk types. Twelve types are defined on the basis of CTB 4.0: ADJP, ADVP, CLP, DNP, DP, DVP, LCP, LST, NP, PP, QP, and VP. For their specific meanings, see step 1-1 in the specification.
[0179] Step 1-2, determine the...
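The twelve chunk types named in step 1-1 can be represented as a small enumeration with a label table. The following is only a minimal C++17 sketch, not the patent's implementation: the names `ChunkType`, `kChunkLabels`, `parseChunkType`, and `chunkLabel` are hypothetical and do not come from the source, which lists only the label strings.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>

// The 12 chunk types listed in step 1-1 (label strings as given in the text).
enum class ChunkType { ADJP, ADVP, CLP, DNP, DP, DVP, LCP, LST, NP, PP, QP, VP };

// Label strings, in the same order as the enumerators above.
constexpr std::array<std::string_view, 12> kChunkLabels = {
    "ADJP", "ADVP", "CLP", "DNP", "DP", "DVP",
    "LCP",  "LST",  "NP",  "PP",  "QP", "VP"};

// Hypothetical helper: map a label string to its ChunkType.
// Returns std::nullopt if the label is not one of the 12 types.
inline std::optional<ChunkType> parseChunkType(std::string_view label) {
    for (std::size_t i = 0; i < kChunkLabels.size(); ++i) {
        if (kChunkLabels[i] == label) {
            return static_cast<ChunkType>(i);
        }
    }
    return std::nullopt;
}

// Hypothetical helper: map a ChunkType back to its label string.
constexpr std::string_view chunkLabel(ChunkType t) {
    return kChunkLabels[static_cast<std::size_t>(t)];
}
```

Keeping the labels in one table, ordered like the enumerators, means the string-to-type and type-to-string mappings cannot drift apart; an unknown label is reported through an empty `std::optional` rather than a sentinel value.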
Embodiment 2
[0205] All algorithms used in the present invention are implemented in C++. The machine used in the experiments of this embodiment has an Intel(R) Core(TM) i7-5930K processor with a clock frequency of 3.50 GHz and 64 GB of memory. The model parameters are trained, according to the method given in the supplementary explanation of the model parameter training method in the specification, on 9978 sentences from 728 files of the Chinese Penn Treebank (CTB) 4.0 (file numbers from chtb_001.fid to chtb_899.ptb; the numbers are not consecutive, so there are only 728 files). The test data consist of 5290 sentences from 110 files (file numbers from chtb_900.fid to chtb_1078.ptb; the numbers are not consecutive, so there are only 110 files), on which chunk parsing is performed. The results are shown in Table 7:
[0206] Ta...
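The training/test division described in this embodiment depends only on the number embedded in each CTB file name. As a hedged illustration, the C++17 sketch below assigns a file to a split from its name. The names `Split`, `ctbFileNumber`, and `splitFor` are hypothetical; the number ranges (001 to 899 for training, 900 to 1078 for testing) are taken from the text, and the sketch assumes every file name starts with `chtb_` followed by decimal digits.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

enum class Split { Train, Test, Unused };

// Extract the numeric ID from a name such as "chtb_001.fid" or "chtb_1078.ptb".
// Returns std::nullopt if the name does not start with "chtb_" followed by
// between one and six digits (assumed naming scheme).
inline std::optional<int> ctbFileNumber(const std::string& name) {
    const std::string prefix = "chtb_";
    if (name.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }
    int number = 0;
    std::size_t digits = 0;
    for (std::size_t i = prefix.size();
         i < name.size() && std::isdigit(static_cast<unsigned char>(name[i]));
         ++i) {
        if (++digits > 6) {
            return std::nullopt;  // implausibly long ID; avoids integer overflow
        }
        number = number * 10 + (name[i] - '0');
    }
    if (digits == 0) {
        return std::nullopt;
    }
    return number;
}

// Assign a file to a split using the number ranges stated in the text.
inline Split splitFor(const std::string& name) {
    const std::optional<int> n = ctbFileNumber(name);
    if (!n) {
        return Split::Unused;
    }
    if (*n >= 1 && *n <= 899) {
        return Split::Train;
    }
    if (*n >= 900 && *n <= 1078) {
        return Split::Test;
    }
    return Split::Unused;
}
```

Because the file numbers are not consecutive, a range check of this kind only classifies files that actually exist in the corpus directory; it does not by itself reproduce the file counts given above.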