Domain dictionary creation

A technology for dictionaries and topic words, applied in the field of domain dictionary creation. It addresses the difficulty of identifying and detecting new words, and it improves data processing performance.

Inactive Publication Date: 2010-09-15
GOOGLE LLC
Cites: 0, Cited by: 11

AI Technical Summary

Problems solved by technology

New words in text cannot be easily identified, since new words are compound sequences of characters or existing words.
This makes new word detection a difficult task for these languages.

Detailed Description of Embodiments

[0028] Figure 1A is a block diagram of an example device 100 that may be used to implement an input method editor (IME). The device 100 may, for example, be implemented in a computer device such as a personal computer, a web server, or a telecommunications switch, or in another electronic device such as a mobile phone, mobile communication device, personal digital assistant (PDA), game box, or the like.

[0029] The example device 100 includes a processing device 102, a first data storage unit 104, a second data storage unit 106, an input device 108, an output device 110, and a network interface 112. Data communication between components 102, 104, 106, 108, 110, and 112 may be established and controlled using a bus system 114 including, for example, a data bus and a motherboard. Other example system architectures can also be used.

[0030] Processing device 102 may, for example, include one or more microprocessors. The first data storage unit 104 may, for example, incl...

Abstract

Methods, systems, and apparatus, including computer program products, to identify topic words in a document corpus that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on the document corpus and the topic document corpus is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document corpus and the topic document corpus. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
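The comparison described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented method: the abstract does not give the divergence formula, so the pointwise Kullback-Leibler-style score used here is an assumption, as is taking the reference value to be the mean score of a few already-known topic words. All function names, the `reference_words` list, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def count_words(documents):
    """Count whitespace-separated tokens across a list of documents."""
    return Counter(token for doc in documents for token in doc.split())


def divergence(word, topic_counts, corpus_counts):
    """Score a word as P_topic(w) * log(P_topic(w) / P_corpus(w)).

    ASSUMPTION: this formula is chosen for illustration only; the source
    text does not specify how the divergence value is computed.
    """
    p_topic = topic_counts[word] / sum(topic_counts.values())
    p_corpus = corpus_counts[word] / sum(corpus_counts.values())
    if p_topic == 0.0 or p_corpus == 0.0:
        return 0.0  # a word absent from the topic documents gets no score
    return p_topic * math.log(p_topic / p_corpus)


def reference_divergence(reference_words, topic_counts, corpus_counts):
    """ASSUMPTION: the reference value is the mean score of known topic words."""
    values = [divergence(w, topic_counts, corpus_counts) for w in reference_words]
    return sum(values) / len(values)


def is_topic_word(candidate, reference_value, topic_counts, corpus_counts):
    """A candidate is a topic word if its divergence exceeds the reference."""
    return divergence(candidate, topic_counts, corpus_counts) > reference_value


# Toy data: the document corpus includes the topic documents.
topic_documents = ["goal striker goal keeper", "striker scores goal"]
other_documents = ["the market fell", "the keeper of the market"]

topic_counts = count_words(topic_documents)
corpus_counts = count_words(topic_documents + other_documents)
reference_value = reference_divergence(["keeper"], topic_counts, corpus_counts)

for candidate in ("striker", "the"):
    accepted = is_topic_word(candidate, reference_value, topic_counts, corpus_counts)
    print(candidate, accepted)
```

With this toy data, a word concentrated in the topic documents scores above the reference, while a word that appears only outside them scores zero and is rejected. A real corpus would need proper tokenization (in particular for languages without word delimiters) and smoothing of the probability estimates.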

Description

[0001] Cross-Reference to Related Applications

[0002] This application claims priority to US Application Nos. 11/844,067 and 11/844,153, filed August 23, 2007. The disclosure of the aforementioned applications is considered part of (and is incorporated by reference in) the disclosure of the present application.

Technical Field

[0003] The present disclosure relates to dictionaries for natural language processing applications such as machine translation, segmentation of non-Roman language words, speech recognition, and input method editors.

Background

[0004] Increasingly advanced natural language processing techniques are used in data processing systems such as speech processing systems, handwriting / optical character recognition systems, and automatic translation systems, and for spelling / grammar checking in word processing systems. These natural language processing techniques may include automatically updating dictionaries for natural language applicat...

Application Information

IPC(8): G06F17/27, G06F17/30
CPC: G06F17/30684, G06F17/30737, G06F17/30616, G06F17/2735, G06F17/30734, G06F16/313, G06F16/3344, G06F16/367, G06F16/374, G06F40/242
Inventor: 吴军, 唐溪柳, 洪锋, 王咏刚, 杨波, 张蕾
Owner: GOOGLE LLC