Method and apparatus for confirming text stream character set

A technology concerning character sets and text streams, applied in digital data processing, special data processing applications, instruments, etc. It addresses the problem that character set conversion cannot be performed when the character set of the source text stream is unknown, and it achieves accurate results.

Status: Inactive. Publication date: 2007-09-12
北京立通无限科技有限公司
0 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

When a text stream is received, the character set of the source text stream is sometimes unknown, so character set conversion cannot be performed.

Examples

Embodiment Construction

[0056] In the following, the method for determining the text stream character set provided by the present invention will be further specifically described in conjunction with Embodiment 1. Figure 1 is a flowchart of the method.

[0057] In step 101, a collection of character sets is preset, and the character sets in the collection are sorted by encoding range: character sets with a small encoding range come first, and character sets with a large encoding range come last.

[0058] In step 102, the first character set in the sequence is set as the current character set, that is, the character set with the smallest encoding range is set as the current character set.

[0059] In step 103, the current character set is specified as the source character set of the text stream and Unicode as the destination character set, and the libiconv function library is called to convert the text stream from the current char...
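Steps 101 to 103 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: Python's built-in `bytes.decode` stands in for the libiconv conversion to Unicode, and the `RANGE_RANK` table, `order_by_range`, and `try_convert` are hypothetical names. The source does not say how the encoding range is measured, so the ranks here are assigned by hand.

```python
# Hypothetical ranks: a smaller number means a narrower encoding range.
# This is an assumption; the source does not define how the range is measured.
RANGE_RANK = {"gb18030": 3, "ascii": 0, "gbk": 2, "gb2312": 1}


def order_by_range(charsets):
    """Step 101: sort the candidate character sets, narrowest range first."""
    return sorted(charsets, key=lambda name: RANGE_RANK[name])


def try_convert(data: bytes, charset: str):
    """Step 103: try to convert the stream to Unicode.

    Returns the decoded text on success and None on failure.
    bytes.decode is a stand-in for the libiconv call named in the text.
    """
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


candidates = order_by_range(RANGE_RANK)  # step 101
current = candidates[0]                  # step 102: narrowest set goes first
result = try_convert(b"plain text", current)  # step 103
```

Placing the narrowest character set first means that a stream which is valid in several sets is attributed to the most restrictive one that accepts it.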

Abstract

The invention provides a method for determining the character set of a text stream, comprising: presetting a collection of character sets and sorting the character sets in the collection according to their encoding ranges; setting the first character set in the sequence as the current character set; converting the text stream using the current character set as the source character set; if the conversion succeeds, taking the current character set as the correct source character set of the text stream; if the conversion fails, judging whether the current character set is the last character set in the collection and, if it is not, setting the character set that follows it as the new current character set and repeating the conversion step. The invention also provides a device for determining the character set of a text stream. With the method and device provided by the invention, the source character set of a received text stream can be determined quickly, and character sets that are easily confused with one another can be distinguished, which avoids misidentification.
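The loop described in the abstract can be sketched as below. It is a sketch under stated assumptions rather than the patented implementation: Python's built-in decoders replace the conversion library, the function name `detect_charset` is hypothetical, and returning `None` once the last character set has failed is an assumption, since the abstract does not say what happens in that case.

```python
def detect_charset(data: bytes, ordered_charsets):
    """Return the first character set, in the given order, that converts cleanly.

    ordered_charsets is assumed to be sorted already, narrowest encoding
    range first. Returns None if every candidate fails (an assumption).
    """
    if not ordered_charsets:
        return None
    index = 0  # the first character set becomes the current one
    while True:
        current = ordered_charsets[index]
        try:
            data.decode(current)  # stand-in for the conversion to Unicode
            return current        # success: current is the source character set
        except UnicodeDecodeError:
            pass                  # failure: fall through to the "last one?" check
        if index == len(ordered_charsets) - 1:
            return None           # the last character set also failed
        index += 1                # the next character set becomes current


# Candidate order is an illustrative assumption, narrowest range first.
ORDER = ["ascii", "gb2312", "gbk", "gb18030"]
print(detect_charset("朱镕基".encode("gbk"), ORDER))
```

The example stream contains the character 镕, which GBK encodes but GB2312 does not, so the narrower sets are rejected before GBK is accepted. Real libiconv converters may accept or reject byte sequences differently from Python's codecs.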

Description

Technical field

[0001] The invention relates to character detection technology, in particular to a method and device for determining the character set of a text stream.

Background

[0002] With the development of computer networks and communication technologies, people use the Internet and related electronic services more and more, and information is exchanged through these services with increasing frequency between people in different places who use different natural languages.

[0003] However, users of different languages in different places process and transmit information on their computer devices using different character sets, as stipulated by different countries or regions. Multiple character sets are also used within a single locale. For example, in China, the character sets used include EUC-CN, HZ, GBK, GB18030, EUC-TW, BIG5, CP950, BIG5-HKSCS, ISO-2022-CN, ISO-2022-CN-EXT, etc.; in other regions, the character sets used include ASCII, ISO-8859, KOI8-R, KOI8-U, KOI8-R...

Application Information

Patent type & authority: Applications (China)
IPC(8): G06F17/22, G06F17/28
Inventors: 蒋光泽, 葛兵, 徐鲁博, 王黎晓, 张跃华
Owner: 北京立通无限科技有限公司