Multi-granularity fusion word segmentation method and device, equipment and storage medium

A word segmentation method using multi-granularity technology, applied in neural learning methods, character and pattern recognition, instruments, etc. It addresses problems such as a poor reading experience, unmet scenario-specific needs, and obstacles to understanding, with the effects of meeting custom needs, improving recall rate, and improving reading efficiency.

Pending Publication Date: 2021-10-15
上海艾爵斯信息科技有限公司

AI Technical Summary

Problems solved by technology

[0002] In the prior art, word segmentation methods are mainly oriented to general data sets and cannot meet the needs of a specific scenario, for example, a legal scenario.
The legal field contains a large number of legal entities, and a general word segmentation model cannot segment them accurately. For example, "Criminal Law of the People's Republic of China" is mistakenly split into two words: "People's Republic of China" and "Criminal Law".
Such segmentation results create obstacles to understanding and reading legal text and make for a poor experience.

Embodiment Construction

[0040] To enable those skilled in the art to better understand the solution of the present application, the technical solution in the embodiments of the application is described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the scope of protection of this application.

[0041] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0042] This application proposes a multi-granularity fusion ...

Abstract

The invention discloses a multi-granularity fusion word segmentation method and device, equipment and a storage medium. The multi-granularity fusion word segmentation method comprises the following steps: establishing a coarse-grained legal word segmentation corpus set and a fine-grained legal word segmentation corpus set; training a legal word segmentation model on the coarse-grained and fine-grained legal word segmentation corpus sets; and inputting a to-be-recognized text into the trained legal word segmentation model to obtain coarse-grained and fine-grained word segmentation results respectively. By adopting multi-granularity word segmentation, the method meets the word segmentation requirements of legal scenarios. Word segmentation assists the understanding of text, so the method can improve text reading efficiency and surface phrases specific to the legal field.
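The idea of returning two segmentations of the same text can be illustrated with a minimal sketch. Note the assumptions: the abstract describes a trained legal word segmentation model, whereas the code below substitutes a simple dictionary-based forward maximum matching procedure, and the two toy lexicons (`FINE_LEXICON`, `COARSE_LEXICON`) and the function names are hypothetical stand-ins for the coarse-grained and fine-grained corpus sets. It only shows the shape of the output, not the method of the application.

```python
from typing import Dict, List, Set


def forward_max_match(text: str, lexicon: Set[str]) -> List[str]:
    """Greedy longest-match segmentation; unknown characters become single tokens."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicons (assumption): the coarse one keeps the full
# statute name as a single entry, the fine one only has its parts.
FINE_LEXICON = {"中华人民共和国", "刑法", "根据", "规定"}
COARSE_LEXICON = FINE_LEXICON | {"中华人民共和国刑法"}


def segment_multi_granularity(text: str) -> Dict[str, List[str]]:
    """Return both granularities for the same input text."""
    return {
        "coarse": forward_max_match(text, COARSE_LEXICON),
        "fine": forward_max_match(text, FINE_LEXICON),
    }


result = segment_multi_granularity("根据中华人民共和国刑法规定")
print(result["coarse"])
print(result["fine"])
```

In this sketch the coarse-grained result keeps "中华人民共和国刑法" (Criminal Law of the People's Republic of China) as one legal entity, while the fine-grained result splits it into "中华人民共和国" and "刑法", so a downstream reader or search index can use whichever granularity suits the task.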

Description

Technical field

[0001] The present application relates to the technical field of word segmentation processing, and in particular to a multi-granularity fusion word segmentation method, device, equipment and storage medium.

Background

[0002] In the prior art, word segmentation methods are mainly oriented to general data sets and cannot meet the requirements of a specific scenario, for example, the requirements of legal scenarios. There are a large number of legal entities in the legal field, and a general word segmentation model cannot accurately segment such legal entities. For example, "Criminal Law of the People's Republic of China" is mistakenly divided into two words: "People's Republic of China" and "Criminal Law". Such segmentation results create obstacles to understanding and reading legal text and make for a poor experience.

Contents of the invention

[0003] The main purpose of the present application is to provide a multi-granularity f...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/284, G06N3/04, G06N3/08, G06F18/214, G06F18/25
Inventor: 顾敏, 杜向阳, 徐芳
Owner: 上海艾爵斯信息科技有限公司