Word segmentation method, device, system and equipment

A word segmentation method in the field of word segmentation technology. It addresses the problem that satisfactory word segmentation results cannot be obtained, and refines the segmentation results to make them more accurate.

Pending Publication Date: 2022-02-18
东方财富信息股份有限公司

AI Technical Summary

Problems solved by technology

[0006] The purpose of this application is to provide a word segmentation technical solution that solves the following technical problem: existing word segmentation tools cannot produce satisfactory results when segmenting text that contains special words from a specific field.

Embodiment Construction

[0040] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0041] In a typical configuration of the present application, the equipment, device, system and/or related trusted parties may include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

[0042] Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0043] Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random acce...

Abstract

The invention provides a word segmentation technical solution comprising the following steps: segmenting an obtained text to produce a first word segmentation result; traversing the first word segmentation result and testing each segmented word against a first preset condition, and, if the current word meets that condition, marking it as a trigger word; checking whether the other words within a preset range before and after the trigger word meet a second preset condition, and, if so, re-segmenting the words within that range according to a preset rule; and continuing the traversal until the last word of the first word segmentation result, thereby obtaining a second word segmentation result. In this way, the result produced by an existing word segmentation tool can be further optimized with a preset strategy that can be flexibly adjusted, yielding a word segmentation result that is semantically more accurate.
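The traversal described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name refine_segmentation, its parameters, the window size, and the percent-sign example (trigger word "%", neighbor condition "preceding word is a number", rule "merge the number with the sign") are all hypothetical. The first word segmentation result is assumed to arrive as a plain list of strings from some existing tool.

```python
from typing import Callable, List


def refine_segmentation(
    tokens: List[str],
    is_trigger: Callable[[str], bool],
    neighbors_ok: Callable[[List[str], List[str]], bool],
    resegment: Callable[[List[str], str, List[str]], List[str]],
    window: int = 1,
) -> List[str]:
    """Post-process a first segmentation result (all names are hypothetical).

    is_trigger   -- stands in for the "first preset condition"
    neighbors_ok -- stands in for the "second preset condition"
    resegment    -- stands in for the "preset rule"
    window       -- stands in for the "preset range" before and after a trigger
    """
    result = list(tokens)  # work on a copy; the input is left untouched
    i = 0
    while i < len(result):
        if is_trigger(result[i]):
            lo = max(0, i - window)
            hi = min(len(result), i + window + 1)
            before, after = result[lo:i], result[i + 1:hi]
            if neighbors_ok(before, after):
                replacement = resegment(before, result[i], after)
                result[lo:hi] = replacement
                # Resume right after the rewritten span.
                i = lo + len(replacement)
                continue
        i += 1
    return result


# Hypothetical example conditions and rule.
def is_percent_sign(token: str) -> bool:
    return token == "%"


def previous_is_number(before: List[str], after: List[str]) -> bool:
    return bool(before) and before[-1].replace(".", "", 1).isdigit()


def merge_number_and_sign(before: List[str], trigger: str, after: List[str]) -> List[str]:
    # Glue the number directly before the trigger onto the trigger itself.
    return before[:-1] + [before[-1] + trigger] + after


first_result = ["rose", "3.5", "%", "today", "%"]
second_result = refine_segmentation(
    first_result, is_percent_sign, previous_is_number, merge_number_and_sign
)
print(second_result)
```

Passing the two conditions and the rule as callables mirrors the abstract's point that the preset strategy can be adjusted without changing the traversal itself. In the example, the first "%" is merged with the preceding "3.5", while the trailing "%" is left alone because its neighbor is not a number.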

Description

Technical field

[0001] The present application relates to the technical field of computer data processing, and in particular to a word segmentation technology.

Background

[0002] With the spread of digital infrastructure such as the Internet, humanity has entered an era of information explosion, and more and more data needs to be processed. However, much of the natural language text, images, and video on the network is unstructured data, and natural language text accounts for the largest share. Analyzing and using this natural language text requires NLP (Natural Language Processing) technology. NLP acts as a translator, or communication bridge, between computer language and human language, making human-computer interaction possible.

[0003] As a basic content in NLP, word segmentation is the process of recombining continuous word sequenc...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289
CPC: G06F40/289
Inventor: 梁浩晨
Owner: 东方财富信息股份有限公司