Method and system for generating input-method word frequency base based on internet information

A technique for generating an input method word frequency database, applied in the fields of input method systems and input method word frequency database generation. It addresses problems such as incomplete coverage, slow updates, and word frequencies that do not reflect Internet activity, thereby improving hit rate, input speed, and efficiency.
CN1936893A · Active · Publication Date: 2007-03-28 · BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD
Publication Date: 2007-03-28

Abstract

The method includes the following steps: using web crawler technology to obtain web pages from the Internet; performing word segmentation on the web page information; and computing word frequency statistics for each vocabulary entry and saving the results to form an Internet word frequency database. By using publicly available, continuously changing Internet information as the source of word frequency statistics, the invention can produce up-to-date, optimized word frequency information. The system word frequency database of an input method system is then updated from this information through any convenient means, so that it stays consistent with the information on the Internet. The invention raises the hit rate of the user's first-choice word, and thereby input speed and efficiency.
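The segmentation and counting steps of the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the tiny `LEXICON`, the use of forward maximum matching as the segmentation technique (the abstract does not name one), the function names, and the sample `pages` list, which stands in for text already extracted from crawled pages.

```python
import json
from collections import Counter

# Hypothetical mini-lexicon; a real system would load the input method's full word list.
LEXICON = {"互联网", "输入法", "词频", "信息"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text):
    """Forward maximum matching: at each position, take the longest lexicon word.

    A character is emitted on its own when no longer lexicon word matches there.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in LEXICON:
                tokens.append(piece)
                i += size
                break
    return tokens


def build_frequency_base(pages):
    """Count how often each lexicon entry occurs across all page texts."""
    counts = Counter()
    for page in pages:
        counts.update(tok for tok in segment(page) if tok in LEXICON)
    return counts


def serialize(counts):
    """Serialize the counts as JSON so they can be stored or shipped to clients."""
    return json.dumps(counts, ensure_ascii=False, sort_keys=True)


# Stand-in for crawler output: plain text already extracted from fetched pages.
pages = ["互联网信息", "输入法词频信息"]
frequency_base = build_frequency_base(pages)
```

Re-running `build_frequency_base` on freshly crawled pages and replacing the stored result is one simple way to keep such a database current; how the patent's system actually stores and distributes updates is not shown in this excerpt.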

Description

Technical Field

[0001] The present invention relates to the field of Internet information processing, and in particular to a method and system for generating an input method word frequency database using Internet information as the source of word frequency statistics, and to an input method system.

Background Art

[0002] Current input method systems (including Chinese, Japanese, Korean, etc.) all rely on their lexicons, and on the word frequencies stored in those lexicons, to rank candidate words for the user during input. The ranking of candidate words is an important indicator of the hit rate of the user's preferred word. The hit rate of the preferred word refers to how often, after the user enters a given key sequence, the character or word ranked first is the one the user most needs. Of course, taking the Chinese input method as an example, technically speaking, the input method system itself cannot know which word ...
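The relationship between word frequency, candidate ranking, and the first-choice hit rate described in [0002] can be sketched in Python as follows. The function names, the frequency values, and the candidate list for the pinyin key sequence "shijie" are all made up for illustration and do not come from the patent.

```python
def rank_candidates(candidates, word_freq):
    """Order candidates by descending frequency; ties keep their original order."""
    return sorted(candidates, key=lambda w: word_freq.get(w, 0), reverse=True)


def first_choice_hit_rate(events, word_freq):
    """Fraction of input events where the top-ranked candidate is what the user chose.

    Each event is a (candidates, chosen_word) pair with a non-empty candidate list.
    """
    if not events:
        return 0.0
    hits = sum(1 for cands, chosen in events
               if rank_candidates(cands, word_freq)[0] == chosen)
    return hits / len(events)


# Made-up frequencies and candidates for the pinyin key sequence "shijie".
word_freq = {"世界": 50, "视界": 5}
candidates = ["视界", "世界", "市届"]
ranked = rank_candidates(candidates, word_freq)

# Two simulated input events: the user wanted "世界" once and "视界" once.
events = [(candidates, "世界"), (candidates, "视界")]
rate = first_choice_hit_rate(events, word_freq)
```

In this toy setup the more frequent word is ranked first, so only the event where the user wanted that word counts as a hit. Frequencies that better match what users actually type push this rate up, which is the motivation stated in the abstract.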

Claims
