Text sequence labeling algorithm using overlapping splitting rule
A text sequence labeling technology, applied in computing, instrumentation, electrical digital data processing, and related fields. It addresses problems such as long processing time, large models, and low computational space efficiency, with the effect of improving processing efficiency, applicability, and model prediction.
Examples
Embodiment 1
[0052] Overlap splitting: Assuming that the maximum sentence length is 10 and the length of the overlapping part is 3, the following sentence can be divided into several short sentences.
[0053] Example sentence 1: One of the most important tasks is the water safety and convenience of residents, see Table 2.
[0054] Table 2. Example sentence 1: case demonstration of overlapping splitting
[0055]
Original tokens: That middle one item very Heavy want of work do At once yes live in civil of use water install Complete 。
Tokens after overlapping split (overlapping tokens repeated): That middle one item very Heavy want of work do of work do At once yes live in civil of use water of use water install Complete 。
[0056] After splitting, the sentence becomes the four clauses above, each of which satisfies the model's maximum sequence length, which can solve t...
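The overlapping split described in this embodiment can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `overlap_split`, the fixed stride of `max_len - overlap`, and allowing the final segment to be shorter than `max_len` are all choices made here for the example. The defaults mirror the values given in [0052] (maximum length 10, overlap 3).

```python
def overlap_split(tokens, max_len=10, overlap=3):
    """Split a token list into segments of at most max_len tokens.

    Consecutive segments share `overlap` tokens. The name, the fixed
    stride, and the shorter final segment are assumptions of this sketch.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    stride = max_len - overlap  # how far each new segment advances
    segments = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this segment already reaches the end of the input
        start += stride
    return segments


if __name__ == "__main__":
    # Hypothetical input: integers stand in for characters or tokens.
    for segment in overlap_split(list(range(12)), max_len=5, overlap=2):
        print(segment)
```

With the hypothetical call above, each segment begins with the last two tokens of the previous one, so every token near a cut point is seen with context on both sides in at least one segment.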
Embodiment 2
[0060] Description: An entity (or word) contains another entity (or word), that is, there is a containment relationship.
[0061] If entities (or words) in the overlapping parts of both sentences reach the truncation boundary (B, E, S label), they are merged directly and the longer entity (or word) is kept. This applies to the three tasks of word segmentation, part-of-speech tagging, and named entity recognition. (1) In the named entity recognition results of Example 2, "Guiyang City Big Data Center" covers "Big Data Center", so the longer entity "Guiyang City Big Data Center" is kept; see Table 3.
[0062] Table 3. Example 2: case demonstration of overlapping splitting
[0063]
Token      Overlap 1        Overlap 2
expensive  O
State      O
exist      O
expensive  B-Organization   O
Positive   I-Organization   O
city       I-Organization   O
Big        I-Organization   B-Organization
nu...
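The "keep the longer entity" rule for contained entities can be sketched as follows. This is a hedged illustration rather than the patent's procedure: the helper names `decode_spans` and `keep_longest`, the `(start, end, type)` span representation with an exclusive end, and the short `ORG` label are assumptions introduced here. The sketch decodes B-/I-/E-/S-/O tags into spans, then drops any span that lies entirely inside another.

```python
def decode_spans(tags, offset=0):
    """Convert B-/I-/E-/S-/O tags to (start, end, type) spans, end exclusive.

    `offset` is the position of the segment's first token in the full
    sentence. Entities without a closing E- tag are dropped (an assumption).
    """
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            spans.append((offset + i, offset + i + 1, label))
            start = None
        elif prefix == "B":
            start, etype = i, label
        elif prefix == "E" and start is not None and label == etype:
            spans.append((offset + start, offset + i + 1, label))
            start = None
        elif prefix == "I" and start is not None and label == etype:
            continue  # still inside the current entity
        else:
            start = None  # "O" or an inconsistent tag ends any open entity
    return spans


def keep_longest(spans_a, spans_b):
    """Union two span lists, dropping spans contained in a longer span."""
    candidates = sorted(set(spans_a) | set(spans_b),
                        key=lambda s: (s[0], -(s[1] - s[0])))
    kept = []
    for span in candidates:
        # Containment is checked on positions only; types are not compared.
        if any(k[0] <= span[0] and span[1] <= k[1] for k in kept):
            continue
        kept.append(span)
    return kept


if __name__ == "__main__":
    # Hypothetical predictions for the same overlapping tokens.
    first = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "I-ORG", "I-ORG", "E-ORG"]
    second = ["O", "O", "O", "B-ORG", "I-ORG", "I-ORG", "E-ORG"]
    print(keep_longest(decode_spans(first, 3), decode_spans(second, 3)))
```

Working on spans instead of raw tags keeps the containment test a simple interval comparison; the merged spans can be converted back to tags afterwards if the downstream task needs them.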
Embodiment 3
[0069] If only one of the overlapping parts of the two sentences has an entity (or word) that reaches the truncation boundary (B, E, S label), that entity (or word) is removed before merging.
[0070] (1) In the named entity recognition results of Example 4, the characters "政" and "市" are the first and last characters of the two overlapping parts, respectively. One overlapping part has an entity at the boundary and the other does not, so the complete entity "government procurement network" is ignored and the results are then merged.
[0071] Table 5. Example 4: case demonstration of overlapping splitting
[0072]
Token       Overlap 1        Overlap 2
expensive   B-Organization
State       I-Organization
Province    I-Organization
politics    I-Organization   B-Organization
government  I-Organization   I-Organization
Pick        I-Organization   I-Organization
purchase    I-Organization   I-Organization
network     E-Organization   E-Organizatio...