Unambiguous Japanese name list building method and name identification method and device

A recognition method for Japanese names, applied in the field of text recognition. It addresses the problems of missed Japanese names, low name coverage, and the many limitations of existing methods.

Status: Inactive
Publication Date: 2015-03-18
FUJITSU LTD

AI Technical Summary

Problems solved by technology

However, this method has many limitations. First, it depends heavily on the discovery of Japanese surnames. For surnames such as "Lin" and "Yin" that exist in both China and Japan, or words such as "Shuxia" and "Datong" that are highly ambiguous in Chinese texts, ambiguous Japanese surnames will produce wrong recognition results. Second, personal-name suffixes often do not appear after Japanese personal names, so the coverage of suffix-based recognition is not very high. In addition, because some charact



Embodiment Construction

[0023] Exemplary embodiments of the present invention will be described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that in developing any such practical embodiment, many implementation-specific decisions must be made in order to achieve the developer's specific goals, such as meeting system-related and business-related constraints, and these constraints may vary from implementation to implementation. Moreover, it should also be understood that such development work, while potentially complex and time-consuming, would nevertheless be a routine undertaking for those skilled in the art having the benefit of this disclosure.

[0024] It should also be noted that, to avoid obscuring the present invention with unnecessary details, only the device structure and/or processing steps closely related to the ...


Abstract

The invention discloses a method for building an unambiguous Japanese name list, together with a name identification method and device. The list building method comprises the following steps: performing name separation on a Japanese common name list by using a Japanese surname list to obtain a Japanese name list; dividing training corpora carrying Japanese-name-related marks into a Japanese-name-related word set and a set of other words; combining the Japanese-name-related word set, the Japanese common name list, the Japanese surname list, and the Japanese name list obtained by name separation into a total set of Japanese-name-related words; and, for each word in the total set, judging whether it is an unambiguous Japanese-name-related word, so as to establish an unambiguous Japanese-name-related word list. Performing word segmentation and name role marking with this list can increase the overall accuracy of Chinese word segmentation and improve both the overall name role marking performance and the final name identification result.
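The list-building steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. Several things here are assumptions: the function names `split_given_names` and `build_unambiguous_list`, the corpus representation as `(word, is_name_related)` pairs, the longest-surname-prefix rule used for name separation, and above all the ambiguity test (a word counts as unambiguous when it never occurs in the set of other words), since the abstract does not state the actual criterion.

```python
from collections import Counter


def split_given_names(full_names, surnames):
    """Name separation: strip the longest matching surname prefix.

    Assumption: a full name is written as surname followed by given name.
    """
    given = set()
    by_length = sorted(surnames, key=len, reverse=True)
    for name in full_names:
        for surname in by_length:
            if name.startswith(surname) and len(name) > len(surname):
                given.add(name[len(surname):])
                break
    return given


def build_unambiguous_list(full_names, surnames, tagged_corpus):
    """Build the unambiguous name-related word list.

    tagged_corpus is a list of (word, is_name_related) pairs, a
    hypothetical stand-in for the marked training corpus.
    """
    # Step 1: derive given names from the common name list.
    given_names = split_given_names(full_names, surnames)

    # Step 2: divide corpus words into name-related words and other words.
    name_words = {word for word, flag in tagged_corpus if flag}
    other_counts = Counter(word for word, flag in tagged_corpus if not flag)

    # Step 3: merge everything into the total name-related word set.
    total = name_words | set(full_names) | set(surnames) | given_names

    # Step 4 (assumed criterion): keep words never seen as ordinary words.
    return {word for word in total if other_counts[word] == 0}


full_names = ["田中太郎", "林一郎"]
surnames = ["田中", "林"]
corpus = [("田中", True), ("太郎", True), ("林", False), ("森林", False)]
print(sorted(build_unambiguous_list(full_names, surnames, corpus)))
```

In this toy corpus "林" also occurs as an ordinary word, so it is excluded from the list, while "田中" is kept. A real system would more likely use a frequency ratio or threshold than a strict zero count.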

Description

Technical field

[0001] The invention relates to the field of text recognition, in particular to a method and device for recognizing Japanese names.

Background

[0002] With the development of communication technology, exchanges among countries have become increasingly extensive, and information dissemination has become more convenient and faster. As a result, Chinese texts contain a large number of foreign named entities such as institution names, person names, and place names. These named entities do not necessarily exist in traditional dictionaries and are therefore out-of-vocabulary (OOV) words, which creates difficulties for many natural language processing applications based on Chinese word segmentation. Effective identification of these named entities can improve applications such as network text classification, entity association network construction, and topic detection and tracking. ...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/3344
Inventors: 宋双永, 孟遥, 郑仲光, 于浩
Owner: FUJITSU LTD