Word segmentation method and device, electronic equipment and storage medium
A word segmentation method, with an associated device, electronic equipment and storage medium, applied in the fields of electrical digital data processing, instruments, and computing. It addresses the low accuracy and low efficiency of existing word segmentation results.
Embodiment 1
[0060] Figure 1 is a schematic diagram of a word segmentation process provided by an embodiment of the present invention. The process includes the following steps:
[0061] S101: Input word segmentation data into a pre-saved baseline word segmentation model, and determine a preliminary word segmentation result of the word segmentation data based on the baseline word segmentation model.
[0062] The word segmentation method provided by the embodiment of the present invention is applied to an electronic device in which a baseline word segmentation model is pre-stored; the baseline word segmentation model is an existing word segmentation model.
[0063] The electronic device obtains the word segmentation data to be segmented. The data may be input by the user, or may be collected by the electronic device from other devices through a collection interface.
[0064] After the word segmentation data is obtained by the electronic device, th...
Embodiment 2
[0078] On the basis of the above embodiments, in the embodiments of the present invention, before merging the at least two segmentation units according to the preset merging rules, the method further includes:
[0079] The segmentation result is input into a pre-trained tagger, and the tagger outputs an annotation sequence of the segmentation result, wherein the annotation sequence includes a word tag for each of the at least two segmentation units;
[0080] According to the preset merging rule, merging the at least two segmentation units includes:
[0081] Merge each of the segmentation units according to the word tags of each of the segmentation units and a preset merging rule.
[0082] When the electronic device merges at least two segmentation units according to the label information corresponding to each segmentation unit, the electronic device can first determine that each segmentation unit corresponds to...
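The tagging step in [0079] can be pictured as pairing each segmentation unit with exactly one word tag. The following is a minimal sketch only: the `Tagger` interface, the `LookupTagger` stand-in, the `annotate` helper and the tag names `"B"`, `"M"`, `"E"` and `"S"` are all hypothetical and do not come from the patent, which describes a pre-trained tagger without specifying its interface or label set.

```python
from typing import Dict, List, Protocol, Tuple


class Tagger(Protocol):
    """Hypothetical interface: one word tag per segmentation unit."""

    def tag(self, units: List[str]) -> List[str]: ...


class LookupTagger:
    """Stand-in for a pre-trained tagger; it only looks tags up in a table."""

    def __init__(self, table: Dict[str, str], default: str = "S") -> None:
        self.table = table
        self.default = default  # assumed tag for units missing from the table

    def tag(self, units: List[str]) -> List[str]:
        return [self.table.get(unit, self.default) for unit in units]


def annotate(units: List[str], tagger: Tagger) -> List[Tuple[str, str]]:
    """Return the annotation sequence as (unit, tag) pairs."""
    tags = tagger.tag(units)
    if len(tags) != len(units):
        raise ValueError("tagger must return exactly one tag per unit")
    return list(zip(units, tags))


# Assumed tags: "B" = word start, "M" = inside a word, "E" = word end.
tagger = LookupTagger({"北": "B", "京": "M", "大学": "E"})
sequence = annotate(["北", "京", "大学", "在"], tagger)
```

Keeping the units and tags aligned by position lets a later merging step read both in order.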
Embodiment 3
[0088] On the basis of the above embodiments, in the embodiment of the present invention, merging each segmentation unit according to its word tag and the preset merging rule includes:
[0089] Sequentially read each of the segmentation units and its word tag, and merge in the following manner until all of the segmentation units have been merged:
[0090] If there is a first segmentation unit whose word tag is the word-start label, search for the adjacent second segmentation unit whose word tag is the word-end label, and determine the third segmentation unit located between the first segmentation unit and the second segmentation unit in the annotation sequence; then, following the order in the annotation sequence, combine the first segmentation unit, the third segmentation unit and the second segmentation unit into one complete word;
[0091] If there i...
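The start-to-end merging rule in [0090] can be sketched as a single pass over the annotation sequence. This is an illustration under stated assumptions, not the patent's implementation: the tag names `"B"` and `"E"` are hypothetical, units are joined without a separator (as for Chinese text), "adjacent" is read as the closest following word-end unit, and any unit not covered by [0090] is passed through unchanged because the remaining rules are truncated in the source.

```python
from typing import List

# Hypothetical tag names; the source only speaks of a "word start"
# label and a "word end" label.
WORD_START = "B"
WORD_END = "E"


def merge_units(units: List[str], tags: List[str]) -> List[str]:
    """Merge each run from a word-start unit to the next word-end unit."""
    if len(units) != len(tags):
        raise ValueError("units and tags must have the same length")
    merged: List[str] = []
    i = 0
    while i < len(units):
        if tags[i] == WORD_START:
            # Find the closest following unit tagged as word end.
            end = next(
                (j for j in range(i + 1, len(units)) if tags[j] == WORD_END),
                None,
            )
            if end is not None:
                # First unit, the units in between, and the end unit,
                # joined in annotation-sequence order.
                merged.append("".join(units[i:end + 1]))
                i = end + 1
                continue
        # Assumption: every other case is left as its own unit.
        merged.append(units[i])
        i += 1
    return merged


words = merge_units(["北", "京", "大学", "在", "海淀"], ["B", "M", "E", "S", "S"])
# words == ["北京大学", "在", "海淀"]
```

A word-start unit with no later word-end unit is left unmerged in this sketch; how the method treats that case is not recoverable from the truncated text.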