Method for automatically extracting a bilingual translation dictionary from the Internet

A technique for automatically extracting translation dictionaries, applied in special data processing applications, instruments, electronic digital data processing, and related fields, that achieves a short update cycle and a small workload and overcomes performance bottlenecks.
CN101833571B · Inactive · Publication Date: 2011-12-28 · TSINGHUA UNIV +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
TSINGHUA UNIV
Publication Date
2011-12-28
Estimated Expiration
Not applicable · inactive patent

Abstract

The invention discloses a method for automatically extracting a bilingual translation dictionary from the Internet. The method comprises the following steps: extracting bracket bilingual words and right-structured bilingual words from Chinese and foreign bilingual web pages; intercepting the extracted bracket bilingual words to obtain exactly translated bracket bilingual words; carrying out root combination on the right-structured bilingual words and the exactly translated bracket bilingual words; for a given Chinese word, searching for corresponding translations among the right-structured bilingual words and, if any are found, ignoring the translations from the bracket bilingual words, or otherwise searching for the corresponding translations among the bracket bilingual words; and processing all foreign-language words in the same way to obtain the final bilingual translation dictionary. The invention can construct the bilingual translation dictionary quickly, effectively, and automatically from the word frequency of the bilingual words, without relying on any external resources.
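
The priority lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular expression, the function names (extract_bracket_pairs, build_index, lookup), and the lowercasing of foreign words are all assumptions made for illustration. The sketch covers only bracket extraction, frequency counting, and the rule that right-structured translations take precedence over bracket translations. The interception and root combination steps are omitted, and the right-structured pairs are assumed to be supplied already counted, because the abstract does not say how they are extracted.

```python
import re
from collections import Counter, defaultdict

# Hypothetical pattern: a run of Chinese characters immediately followed by a
# parenthesised run of Latin letters, e.g. "机器翻译(machine translation)".
BRACKET_PAIR = re.compile(r"([\u4e00-\u9fff]+)[(（]([A-Za-z][A-Za-z \-]*)[)）]")


def extract_bracket_pairs(text):
    """Count (chinese, foreign) candidates written as 'Chinese(foreign)'."""
    counts = Counter()
    for zh, foreign in BRACKET_PAIR.findall(text):
        # Lowercasing is an assumption so that case variants are counted together.
        counts[(zh, foreign.strip().lower())] += 1
    return counts


def build_index(pair_counts):
    """Group pair frequencies by Chinese word: zh -> Counter of translations."""
    index = defaultdict(Counter)
    for (zh, foreign), n in pair_counts.items():
        index[zh][foreign] += n
    return index


def lookup(zh, structured_index, bracket_index):
    """Return translations by descending frequency.

    Right-structured translations take priority; bracket translations are
    consulted only when the right-structured index has no entry for zh.
    """
    source = structured_index if zh in structured_index else bracket_index
    return [word for word, _ in source.get(zh, Counter()).most_common()]
```

With this sketch, a word found in the right-structured index never returns its bracket translations, which mirrors the "ignore the bracket translations if a right-structured one is found" rule in the abstract.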

Description

Technical field

[0001] The invention relates to the technical field of statistical natural language processing, in particular to a method for automatically extracting bilingual translation dictionaries from the Internet.

Background technique

[0002] In both scientific research and daily life, people are heavily exposed to and dependent on foreign languages. Traditional translation dictionaries are mainly compiled and edited by hand, so they take a long time to produce, are updated slowly, and have low coverage. Existing Internet-based methods for generating translation dictionaries rely on a variety of natural language processing and machine learning techniques. These methods can become performance bottlenecks when processing large-scale data, and they depend on pre-established resources.

[0003] The bilingual translation dictionary we constructed comes from the Internet. In addition to traditional vocabulary, it can also cover current popular vocabula...

Claims
