Uyghur sentence boundary recognition method
A Uyghur sentence boundary recognition technique, applied in fields such as special-purpose data processing applications, instruments, and electronic digital data processing. It addresses problems such as the large impact of sentence boundary errors on analysis accuracy and the fact that existing models cannot be applied directly to the Uyghur sentence boundary recognition task, and it achieves improved accuracy, high processing capacity, and robustness.
Embodiment Construction
[0015] A Uyghur sentence boundary recognition method that: (1) proposes recognition rules for unambiguous punctuation marks in Uyghur sentence recognition; (2) proposes a Uyghur paragraph classification algorithm that reduces the size of the statistical space and improves efficiency; (3) uses statistics to build a feature space for Uyghur sentence boundary recognition, so that ambiguous punctuation marks in Uyghur sentences are identified efficiently; and (4) achieves high-performance Uyghur sentence boundary recognition on undifferentiated corpora.
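To illustrate what a statistical feature space for an ambiguous punctuation mark might look like, the following is a minimal sketch. The source does not list the features actually used, so the function name `candidate_features` and every feature in it (neighbouring tokens, digit context, position) are assumptions chosen only for illustration.

```python
def candidate_features(text: str, i: int) -> dict:
    """Build a feature dictionary for the punctuation mark at text[i].

    All features here are hypothetical; the source does not specify them.
    """
    left_tokens = text[:i].split()
    right_tokens = text[i + 1:].split()
    prev_tok = left_tokens[-1] if left_tokens else ""
    next_tok = right_tokens[0] if right_tokens else ""
    return {
        "mark": text[i],                                  # the candidate mark itself
        "prev_token": prev_tok,                           # token ending just before the mark
        "next_token": next_tok,                           # token starting after the mark
        "prev_is_digit": prev_tok.isdigit(),              # e.g. the "3" in "3.5"
        "next_starts_digit": next_tok[:1].isdigit(),      # e.g. the "5" in "3.5"
        "followed_by_space": text[i + 1:i + 2].isspace(),
        "at_end": i == len(text) - 1,                     # mark is the last character
    }


if __name__ == "__main__":
    sample = "cc 3.5 dd."  # Latin placeholder text, not real Uyghur
    print(candidate_features(sample, 4))
    print(candidate_features(sample, 9))
```

Feature dictionaries of this kind could be passed to a maximum entropy classifier (equivalent to multinomial logistic regression) that decides, for each candidate mark, whether it ends a sentence.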
[0016] As shown in Figure 1, the process and functional modules involved in the present invention are: a paragraph classification rule base, a test corpus, a paragraph classifier, a sentence boundary recognition rule base, a training corpus, and a maximum entropy model module. The main process includes: first, with the support of the rule base, the Uyghur text is divided into unambiguous paragraphs and ambiguous paragraphs thr...
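The first stage of this flow, routing paragraphs by rule, can be sketched as follows. This is a minimal illustration, not the patented rule base: the two punctuation sets, and the names `classify_paragraph`, `split_unambiguous`, and `route`, are assumptions, since the source does not state which marks count as unambiguous.

```python
# Assumption: these marks always end a sentence (not stated in the source).
UNAMBIGUOUS_MARKS = {"؟", "!"}
# Assumption: these marks may or may not end a sentence
# (for example in decimals or abbreviations).
AMBIGUOUS_MARKS = {".", ":"}


def classify_paragraph(paragraph: str) -> str:
    """Label a paragraph 'ambiguous' if it contains any ambiguous mark."""
    if any(ch in AMBIGUOUS_MARKS for ch in paragraph):
        return "ambiguous"
    return "unambiguous"


def split_unambiguous(paragraph: str) -> list:
    """Split a paragraph after every unambiguous sentence-final mark."""
    sentences, current = [], []
    for ch in paragraph:
        current.append(ch)
        if ch in UNAMBIGUOUS_MARKS:
            sentence = "".join(current).strip()
            if sentence:
                sentences.append(sentence)
            current = []
    tail = "".join(current).strip()
    if tail:  # keep trailing text that has no final mark
        sentences.append(tail)
    return sentences


def route(paragraphs: list) -> tuple:
    """Split unambiguous paragraphs by rule; defer the rest to a statistical model."""
    done, pending = [], []
    for p in paragraphs:
        if classify_paragraph(p) == "ambiguous":
            pending.append(p)
        else:
            done.extend(split_unambiguous(p))
    return done, pending


if __name__ == "__main__":
    # Latin placeholder text, not real Uyghur
    sentences, deferred = route(["aa! bb؟", "cc 3.5 dd."])
    print(sentences, deferred)
```

Only the paragraphs returned in the second list would need the statistical model, which is one way to read the claim that paragraph classification reduces the size of the statistical space.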