442 results about "Character encoding" patented technology

Character encoding is used to represent a repertoire of characters by some kind of encoding system. Depending on the abstraction level and context, corresponding code points and the resulting code space may be regarded as bit patterns, octets, natural numbers, electrical pulses, etc. A character encoding is used in computation, data storage, and transmission of textual data. "Character set", "character map", "codeset" and "code page" are related, but not identical, terms.
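
As a small illustration of the difference between code points (natural numbers) and the octets an encoding produces for them, the following Python sketch encodes one string three ways. The sample string and the choice of encodings are arbitrary and serve only as an example.

```python
# Sample text, chosen only for illustration: U+00E9 (e with acute) and U+20AC (euro sign).
text = "\u00e9\u20ac"

# Code points are natural numbers, independent of any byte representation.
code_points = [ord(ch) for ch in text]

# The same code points become different octet sequences under different encodings.
utf8_bytes = text.encode("utf-8")        # variable width: 2 octets, then 3 octets
utf16_bytes = text.encode("utf-16-be")   # 2 octets per character here
latin9_bytes = text.encode("iso8859-15") # single-byte code page that includes the euro sign

print(code_points, utf8_bytes.hex(), utf16_bytes.hex(), latin9_bytes.hex())
```

Decoding the octets with the wrong encoding yields different characters or an error, which is why the encoding has to be known or declared alongside the data.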

Schema-based dynamic parse/build engine for parsing multi-format messages

A parse / build engine that can handle multi-format financial messages. The engine converts the different format messages into a common format, and the common format message is then processed by the business service application. A parser examines the message and determines an appropriate schema for the particular format of message received. The schema is a data structure in a schema registry that includes a grammar structure for the received format as well as pointers to handlers for converting the different fields of the message into the internal message format using the grammar structure (the “grammar” can include field sequence, field type, length, character encoding, optional and required fields, etc.). The handlers are individually compiled. As formats change, new formats or changes to old formats can be dynamically added to the parse / build engine by loading new schema and handlers.
Owner:VISA USA INC (US)
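
The schema-registry idea described in this abstract can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the patented implementation: the two-byte format identifier, the fixed-width field layout, and the names `SCHEMA_REGISTRY`, `register_schema` and `parse` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A field spec is (name, length in bytes, character encoding, handler).
# This layout is an assumption made for the sketch.
FieldSpec = Tuple[str, int, str, Callable[[str], object]]

# Hypothetical schema registry: format identifier -> ordered field specs.
SCHEMA_REGISTRY: Dict[bytes, List[FieldSpec]] = {}


def register_schema(format_id: bytes, fields: List[FieldSpec]) -> None:
    """Add or replace a schema at run time, without touching the parser."""
    SCHEMA_REGISTRY[format_id] = fields


def parse(message: bytes) -> Dict[str, object]:
    """Select a schema from the first two bytes, then convert each field."""
    format_id, body = message[:2], message[2:]
    fields = SCHEMA_REGISTRY[format_id]
    result: Dict[str, object] = {}
    offset = 0
    for name, length, encoding, handler in fields:
        raw = body[offset:offset + length]
        # Decode with the field's own encoding, then hand off to its handler.
        result[name] = handler(raw.decode(encoding))
        offset += length
    return result


# Two formats that carry the same fields in different character encodings.
register_schema(b"A1", [("amount", 6, "ascii", int), ("currency", 3, "ascii", str)])
register_schema(b"E1", [("amount", 6, "cp500", int), ("currency", 3, "cp500", str)])

ascii_msg = b"A1" + b"000125USD"
ebcdic_msg = b"E1" + "000125USD".encode("cp500")
print(parse(ascii_msg), parse(ebcdic_msg))
```

Both messages end up in the same internal dictionary form, and supporting a new format only requires another `register_schema` call.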

Method and system for deep neural translation based on character encoding

The invention provides a method and a system for deep neural translation based on character encoding. A combined neural network model is established by using an RNN to cover the whole translation process, and translation tasks are directly completed from the perspective of an encoder-decoder framework. The method comprises the following steps: A, word vector generation: performing word segmentation on character-level input data through neural network modeling and generating a word vector; B, language model generation: establishing grammar rules by utilizing the characteristic of memory of the recurrent neural network in time; C, word alignment model generation: obtaining the probability of translating multiple words in a source language statement into target language words; D, output: translating an inputted source language into a target language; E, translation model combination: establishing a deep neural translation model (RNN-embed) based on character encoding in combination with neural network models in the four steps and accelerating model training by using CPU parallel computation.
Owner:HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL

Systems and methods of directory entry encodings

In general, the invention relates to supporting multiple different character encodings in the same file system. In one embodiment, a method is provided for filename lookup that supports multiple character encodings. The method comprises storing filename data in a first character encoding into an indexed data structure. The method further comprises receiving filename data in a second encoding. The method also comprises looking up filename data in the indexed data structure using the second encoding.
Owner:EMC IP HLDG CO LLC
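
One simple way to look up a filename supplied in a different encoding from the one stored is to transcode the request before probing the index. The Python sketch below shows that idea only; the abstract does not say this is the mechanism used, and the `DirectoryIndex` class and its methods are hypothetical names.

```python
class DirectoryIndex:
    """Toy index that stores filenames as bytes in one fixed encoding."""

    def __init__(self, stored_encoding: str):
        self.stored_encoding = stored_encoding
        self._entries = {}  # encoded filename bytes -> inode number

    def add(self, name_bytes: bytes, inode: int) -> None:
        self._entries[name_bytes] = inode

    def lookup(self, name_bytes: bytes, encoding: str):
        """Look up a name supplied in `encoding`, which may differ from the stored one."""
        try:
            # Transcode the request into the index's encoding before probing.
            key = name_bytes.decode(encoding).encode(self.stored_encoding)
        except (UnicodeDecodeError, UnicodeEncodeError):
            return None  # the name cannot be represented, so it cannot match
        return self._entries.get(key)


index = DirectoryIndex("utf-8")
index.add("caf\u00e9.txt".encode("utf-8"), 42)

# A client using Latin-1 sends different bytes for the same name.
latin1_request = "caf\u00e9.txt".encode("latin-1")
print(index.lookup(latin1_request, "latin-1"))
```

A real file system would also have to consider Unicode normalization and names that are not valid in the declared encoding; the sketch ignores both.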

Data structure for creating, scoping, and converting to unicode data from single byte character sets, double byte character sets, or mixed character sets comprising both single byte and double byte character sets

A data structure for specifying the types of constants whose character values are to be converted to Unicode; for specifying which code page or pages are used for specifying the character encodings used in the source program for writing the character strings to be converted to Unicode; and that can be used to perform conversions from SBCS, mixed SBCS / DBCS, and pure DBCS character strings to Unicode. A syntax suitable for specifying character data conversion from SBCS, mixed SBCS / DBCS, and pure DBCS representation to Unicode utilizes an extension to the conventional constant subtype notation. In converting the nominal value data to Unicode, currently relevant SBCS and DBCS code pages are used, as specified by three levels or scopes derived from either global options, from local AOPTIONS statement specifications, or from constant-specific modifiers. Global code page specifications apply to the entire source program. These global specifications allow a programmer to declare the source-program code page or code pages just once. These specifications then apply to all constants containing a request for conversion to Unicode. Local code page specifications apply to all subsequent source-program statements. These local specifications allow the programmer to create groups of statements containing Unicode conversion requests, all of which use the same code page or code pages for their source-character encodings. Code page specifications that apply to individual constants allow a detailed level of control over the source data encodings to be used for Unicode conversion. The conversion of source data to Unicode may be implemented inherently to the translator (assembler, compiler, or interpreter) wherein it recognizes and parses the complete syntax of the statement in which the constant or constants is specified, and performs the requested conversion. 
Alternatively, an external function may be invoked by a variety of source language syntaxes which parses as little or as much of the source statement as its implementation provides, and returns the converted value for inclusion in the generated machine language of the object program. Alternatively, the conversion may be provided by the translator's macro instruction definition facility.
Owner:IBM CORP
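
The three scopes of code page specification (global, local, constant-specific) can be illustrated with a small resolution function. The sketch below assumes that the most specific scope wins, which the abstract implies but does not state outright; the function names and the use of Python codec names in place of code page identifiers are illustrative assumptions.

```python
def resolve_code_page(constant_modifier=None, local_option=None, global_option=None):
    """Return the code page from the most specific scope that specifies one.

    Assumed precedence: constant-specific modifier, then local option,
    then global option.
    """
    for candidate in (constant_modifier, local_option, global_option):
        if candidate is not None:
            return candidate
    raise ValueError("no code page specified in any scope")


def to_unicode(raw: bytes, constant_modifier=None, local_option=None, global_option=None) -> str:
    """Convert a constant's bytes to Unicode using the resolved code page."""
    code_page = resolve_code_page(constant_modifier, local_option, global_option)
    return raw.decode(code_page)


# The global setting applies unless a narrower scope overrides it.
from_global = to_unicode(b"\xc1\xc2", global_option="cp037")
from_constant = to_unicode(b"AB", constant_modifier="ascii", global_option="cp037")
print(from_global, from_constant)
```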

Complementary Character Encoding for Preventing Input Injection in Web Applications

Method to prevent the effect of web application injection attacks, such as SQL injection and cross-site scripting (XSS), which are major threats to the security of the Internet. Method using complementary character coding, a new approach to character level dynamic tainting, which allows efficient and precise taint propagation across the boundaries of server components, and also between servers and clients over HTTP. In this approach, each character has two encodings, which can be used to distinguish trusted and untrusted data. Small modifications to the lexical analyzers in components such as the application code interpreter, the database management system, and (optionally) the web browser allow them to become complement aware components, capable of using this alternative character coding scheme to enforce security policies aimed at preventing injection attacks, while continuing to function normally in other respects. This approach overcomes some weaknesses of previous dynamic tainting approaches by offering a precise protection against persistent cross-site scripting attacks, as taint information is maintained when data is passed to a database and later retrieved by the application program. The technique is effective on a group of vulnerable benchmarks and has low overhead.
Owner:POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY

Movement and attitude controlled mobile station control

A mobile station embodiment (400) is provided with a reflection detector (401) which may provide supplemental inputs along with keys (402) such that a character encoding, such as, e.g. ASCII, is selected on the basis of the reflection detector (401) alone, or in combination with keys (402) either pushed down or released. A movable target or pendulum (405) may provide an ability to sense the near space along a direction that the reflection detector is sensitive to. Signals may be transmitted from the reflection detector (401) and pass across a void or other great distance before being reflected, if at all. If such signals are reflected toward the reflection detector (401), and the signals have not been overly attenuated, the reflection detector (401) may provide a ‘reflect’ signal to any on-board processor of the mobile station (400). The pendulum (405) may be influenced by wind, gravity or acceleration (405) to operate as a reflector to cooperate with the reflection detector (401) and generate a ‘reflect’ signal.
Owner:NOKIA TECHNOLOGIES OY

Data transformation and exchange

A data transformation and exchange server receives an input data stream from one or more application servers and / or computing devices. The data stream includes a plurality of input records and each input record can be in a different input protocol and / or character encoding. The transformation and exchange server determines each input record in the input data stream based on one or more boundary points and determines a template from a plurality of templates based on the input record. The transformation and exchange server transforms the input record into an output record based on the template and communicates the output record via an output data stream.
Owner:VORRO

Information hiding method taking text information as carrier

The invention discloses an information hiding method taking text information as a carrier, which is used for providing a hidden channel for secret information in the process of data transmission. The method comprises the following steps: defining a reserved code set according to the characteristics of a character coding standard; defining a base-L code according to the size L of the reserved code set; defining a bijective function f from the base-L code to the reserved code set and its inverse function f^-1; defining a method of embedding reserved code character strings in text information; embedding the secret information into a text information carrier to obtain a secret object; and extracting the secret information from the secret object. The information hiding technique takes text information as the carrier and has strong versatility, large hiding capacity, good transparency and strong security.
Owner:张浩
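
The steps listed in this abstract (a reserved code set, an L-ary code, a bijection f and its inverse, embedding and extraction) can be sketched as follows. The choice of four zero-width Unicode characters as the reserved code set and the placement of the hidden string after the first cover character are assumptions made for illustration; the abstract does not name either.

```python
# Assumed reserved code set: four zero-width characters (so L = 4).
RESERVED = ["\u200b", "\u200c", "\u200d", "\u2060"]
L = len(RESERVED)

F = dict(enumerate(RESERVED))                   # f: base-L digit -> reserved character
F_INV = {ch: digit for digit, ch in F.items()}  # inverse of f

# Number of base-L digits needed to represent one byte.
DIGITS_PER_BYTE = 1
while L ** DIGITS_PER_BYTE < 256:
    DIGITS_PER_BYTE += 1


def embed(cover: str, secret: bytes) -> str:
    """Hide `secret` in `cover` as a run of reserved characters."""
    digits = []
    for byte in secret:
        chunk = []
        for _ in range(DIGITS_PER_BYTE):
            byte, digit = divmod(byte, L)
            chunk.append(digit)
        digits.extend(reversed(chunk))  # most significant digit first
    hidden = "".join(F[d] for d in digits)
    # Assumed embedding position: right after the first cover character.
    return cover[:1] + hidden + cover[1:]


def extract(stego: str) -> bytes:
    """Recover the secret by reading back the reserved characters."""
    digits = [F_INV[ch] for ch in stego if ch in F_INV]
    out = bytearray()
    for i in range(0, len(digits), DIGITS_PER_BYTE):
        value = 0
        for d in digits[i:i + DIGITS_PER_BYTE]:
            value = value * L + d
        out.append(value)
    return bytes(out)


stego = embed("Hello world", b"key")
print(extract(stego))
```

The sketch assumes the cover text does not already contain any of the reserved characters.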

Screen video encoding and decoding method based on progressive character block compression and encoding and decoding device

The invention discloses a screen video encoding and decoding method based on progressive character block compression and an encoding and decoding device thereof. The screen video encoding and decoding method comprises the following steps: obtaining a frame in a screen video, and dividing the frame into M*N macro blocks; dividing each macro block into character blocks or image blocks according to the category; dividing each character block into a plurality of character compression code streams according to a main color and a non-main color; encoding the main color by an index table, and progressively encoding escape colors of the non-main color based on a bit plane; dividing each image block into a plurality of image encoding quality layers by using a progressive image encoding algorithm based on wavelet transform; transmitting the code streams to a receiving terminal according to different quality grades; and decoding and displaying the code stream of each quality layer by the receiving terminal. According to the screen video encoding and decoding method disclosed by the invention, in view of the limitation and defects of traditional character encoders, the progressive character encoding technology of a plurality of quality layers is realized, no support of special equipment is needed, and the screen video encoding and decoding method is applicable to all occasions needing to compress screen images containing characters.
Owner:XIAN WANXIANG ELECTRONICS TECH CO LTD

Method and system for inputting and displaying character

A method for inputting and displaying characters includes loading a character repertoire picture, the character coding set corresponding to the picture, and the grid information of the picture; decoding and converting the character repertoire picture into device-dependent bitmap information that is stored linearly and contiguously; and writing the device-dependent bitmap information and the character coding set into the pixel information item and the character information item of a terminal character repertoire, respectively, to generate the character repertoire and the character style corresponding to it.
Owner:HUAWEI TECH CO LTD

Text content digital watermark encryption and protection method and device

Active · CN108090329A · To achieve the purpose of original data protection · The display effect is not affected · Digital data protection · Program/content distribution protection · Array data structure · Original data
The invention discloses a text content digital watermark encryption and protection method and device. The method comprises: S1, eliminating later repeated characters in an original text to generate an array A composed of characters without repetition, wherein the serial number of every character in the array A follows the order of appearance in the original text; S2, converting the characters in the array A one by one into character codes A(n), and encrypting the character codes A(n) one by one to acquire encrypted new character codes B(n); S3, converting the new character codes B(n) to acquire character codes C(n), and mapping the characters in the array A to the character codes C(n) so that the character codes C(n) are displayed with the glyphs of the characters in the array A. According to the text content digital watermark encryption and protection method, by encrypting and re-encoding the character codes of the original text and enabling the encrypted character codes to carry watermark information, the display effect is not affected, and the text content is encrypted and carries unforgeable watermark information, so that the aim of protecting the original data of copyrighted content is achieved.
Owner:上海海笛数字出版科技有限公司

Character pitch encoding-based dual-watermark embedded text watermarking method

The invention relates to a character pitch encoding-based dual-watermark embedded text watermarking method, comprising the following steps of: converting watermark information M and a key D, which are required to be hidden, into a binary sequence, circularly executing modulus to get an encrypted binary sequence, adding error-correcting codes for the to-be-embedded watermark information, and finally embedding dual watermarks and synchronization information of the watermarks by encoding the attribute of a text object without page modification and encoding a character pitch. With the method, the text content cannot be changed, the watermark information is hidden well, and the method has the characteristics of high robustness and high capacity. By extracting the embedded watermark information, transmission and modification of a document can be controlled effectively, and whether the document is modified can be judged, and finally, the aims of digital copyright protection, data integrity authentication and safe covert communication for the text document are achieved.
Owner:UNIV OF SHANGHAI FOR SCI & TECH

Chinese word segmentation algorithm based on reverse maximum matching

The invention discloses a Chinese word segmentation algorithm based on reverse maximum matching, which comprises the following steps: initializing three objects in a memory; inputting the contents of a text which needs word segmentation; splitting characters in the text into different types according to character codes; directly adding characters which are not Chinese characters to word segmentation results according to the character codes after the text is segmented into short sentences; splitting the short sentences into character sets according to a character string matching and decision-making mechanism; matching the character sets with character sets in a word segmentation dictionary based on the reverse maximum matching algorithm; storing matched character sets into a word segmentation result set; combining consecutive unmatched characters; and adding the consecutive unmatched characters to the word segmentation results to complete word segmentation. A quick word segmentation algorithm based on dictionaries is provided, and the dictionary loading efficiency and the word segmentation efficiency are greatly improved while word segmentation accuracy is ensured.
Owner:BEIJING JINHER SOFTWARE
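
The core matching step, reverse maximum matching against a dictionary, can be sketched in a few lines of Python. This covers only that step, not the patent's character-type splitting, dictionary loading, or merging of consecutive unmatched characters; the toy dictionary and the `max_len` parameter are assumptions for the example.

```python
def reverse_maximum_match(sentence: str, dictionary: set, max_len: int = 4) -> list:
    """Segment `sentence` by scanning from its end and taking the longest dictionary word."""
    words = []
    end = len(sentence)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shorter ones.
        for size in range(min(max_len, end), 0, -1):
            candidate = sentence[end - size:end]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected from right to left
    return words


toy_dictionary = {"研究", "研究生", "生命", "起源"}
segments = reverse_maximum_match("研究生命的起源", toy_dictionary)
print(segments)
```

Scanning from the end yields 研究 / 生命 / 的 / 起源 for this sentence, whereas a forward longest-match scan with the same dictionary would first take 研究生 and leave 命 stranded.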

Method and device for displaying fonts in terminal device

Inactive · CN103902513A · Solve the problem that the page display is not supported · Improve display efficiency · Special data processing applications · Computer graphics (images) · Algorithm
The invention provides a method and device for displaying fonts in a terminal device. The method includes the steps of obtaining character codes of to-be-displayed characters in the terminal device; obtaining vector diagram information of the fonts corresponding to the obtained character codes from a font resource library of a server, wherein the font resource library is used for storing the character codes and the vector diagram information of the fonts corresponding to the character codes; drawing the obtained vector diagram information of the fonts in a page of the terminal device to display the fonts of the to-be-displayed characters. By means of the method and device, the problem that the system fonts do not support page displaying in the terminal device can be effectively solved, and the displaying efficiency of the non-system fonts in the terminal device can be further improved.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Color character encoding method and decoding method

The invention discloses a color character encoding method and a decoding method which solve the problems of an anti-counterfeiting code technology that the security is low and the appearance effect of a product is affected. The color character encoding method comprises the following steps that (1) N different colors are selected; (2) an N system encoding library is set up and the base of the encoding library is matched with the selected colors; (3) source information is input and converted into an M system code; (4) the M system code is converted into an N system code to obtain corresponding data; (5) the data is substituted through the colors which are matched with the base of the N system encoding library, lined and combined into a color character; and (6) the color character is output. The color character decoding method comprises the following steps that (1) the color character is identified and substituted into the M system code by terminal equipment; (2) the M system code is converted into a Unicode or an ASCII code; and (3) the terminal equipment converts the Unicode or ASCII code into source information and outputs the same. According to the color character encoding method and the decoding method, an anti-counterfeiting code manufacturing method is novel and the security is high.
Owner:曾芝渝 +3
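
The base-conversion steps of the encoding and decoding methods can be sketched as below. The five-colour palette (N = 5), the use of UTF-8 bytes as the intermediate code, and the sentinel byte that preserves leading zero bytes are all assumptions made for the sketch, not details taken from the patent.

```python
# Assumed palette: position in the list is the base-N digit value (N = 5).
COLORS = ["red", "green", "blue", "yellow", "black"]


def encode(text: str) -> list:
    """Convert text to a sequence of colours via a base-N number."""
    # A leading sentinel byte keeps leading zero bytes from being lost.
    value = int.from_bytes(b"\x01" + text.encode("utf-8"), "big")
    digits = []
    while value:
        value, digit = divmod(value, len(COLORS))
        digits.append(digit)
    return [COLORS[d] for d in reversed(digits)]


def decode(colors: list) -> str:
    """Convert a colour sequence back to the original text."""
    value = 0
    for color in colors:
        value = value * len(COLORS) + COLORS.index(color)
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")  # drop the sentinel byte


color_string = encode("A1")
print(color_string, decode(color_string))
```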

Method for extracting data of webpage table

Active · CN102254009A · Simplify the extraction method · Increase flexibility · Special data processing applications · Character encoding · World Wide Web Consortium
The invention provides a method for extracting data of a webpage table. The method comprises the following steps: 10, reading a webpage source code, analyzing the webpage source code into a Document object of W3C (World Wide Web Consortium) and acquiring any two keywords in the webpage table; 20, performing depth-first traversal on all nodes in the Document object and acquiring two nodes to which the two keywords belong respectively; 30, acquiring a common father node with unique attribute of the two nodes, and acquiring the positioning condition of the webpage table by utilizing the unique attribute; and 40, filtering the webpage source code by utilizing the data positioning condition of the webpage table, and extracting a webpage table with the same effect as webpage display. In the method, according to the any two keywords to be extracted in the webpage table and the required table row / column values, the table with the same effect as the original webpage display can be accurately and quickly extracted from the webpage which changes in real time, data of designated rows / columns is acquired, and the flexibility and accuracy of data extraction are improved.
Owner:FUJIAN STAR NET COMM

Method of, system for, and computer program product for scoping the conversion of unicode data from single byte character sets, double byte character sets, or mixed character sets comprising both single byte and double byte character sets

Provided are a method, system and program for translating a source character string in a first character encoding into a target character string in a second character encoding. A plurality of specifications are maintained. Each specification has one of a plurality of scopes identifying at least one code page providing a mapping for source character strings in the first character encoding. The scopes specify different portions of the program to which the code page identified by the specification applies. The source character string for which translation is requested in the program is processed and a determination is made of one specification having one scope that is applicable to the processed source character string. The code page identified by the determined specification is used to translate the processed source character string in the first character encoding into the target character string in the second character encoding.
Owner:IBM CORP

Image processing device and program product

When an image group of character images such as a word consists of character images, which are character code data candidates as their character recognition certainties are higher than a prescribed value, and character images, which are character image data candidates as their character recognition certainties are lower than a prescribed value, a computer 10 with a capability of functioning as an image processing device has a character output format judgment unit 33 that makes a judgment to cut out all character images within said word without converting them to character code data to form character image data.
Owner:MINOLTA CO LTD

Source editing, internationalization, advanced configuration wizard, and summary page selection for information automation systems

A source manager includes an editor program that can be used to edit an existing source record via a graphical user interface (GUI). Test Action and Test Source functions allow a user to test enter a query and to test a source expeditiously. A conversion tool converts existing sources to the design and format to reconcile data scattered among the source engine data and source partition record. For handling internationalization issues, aspects of the invention include persistently storing the source's encoding type during the configuration process, and then using that encoding type later during the deep harvest phase. According to another aspect of the invention a solution for selecting a summary passage for a particular source is provided. Other aspects of the invention include solutions for character encoding, “Next Links” recognition and “Next Results” handling.
Owner:BRIGHTPLANET CORP

Webpage text extraction method based on maximum text density

The invention relates to a webpage text extraction method based on the maximum text density. The method includes the following steps of (1) preprocessing a webpage, processing character codes and standardizing the webpage, (2) analyzing the webpage into a DOM tree and extracting tag text blocks in the webpage according to specific tags, (3) calculating the maximum text density, and (4) extracting texts, carrying out sequencing according to calculated text densities after all the tag text blocks are processed, and selecting a tag with the maximum text density, wherein the tag and content of a nested sub-tag serve as a text block and the text is obtained after the tag is eliminated. The webpage text extraction method based on the maximum text density is low in algorithm complexity, has universality and has a good effect on webpages with complex structures.
Owner:TONGJI UNIV
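
Steps (2) to (4) can be sketched with Python's standard `html.parser` module. The abstract does not define "text density", so the sketch assumes it is the number of text characters in a block divided by the number of tags in that block; the set of block tags and the names `BlockCollector` and `extract_main_text` are also assumptions. Well-nested HTML is assumed.

```python
from html.parser import HTMLParser


class BlockCollector(HTMLParser):
    """Collect (density, text) for each block-level tag in a page."""

    BLOCK_TAGS = {"div", "td", "article", "section"}  # assumed tag set

    def __init__(self):
        super().__init__()
        self.stack = []   # open blocks, each as [text_parts, tag_count]
        self.blocks = []  # finished blocks as (density, text)

    def handle_starttag(self, tag, attrs):
        for block in self.stack:
            block[1] += 1  # every tag counts against all enclosing blocks
        if tag in self.BLOCK_TAGS:
            self.stack.append([[], 1])  # the block's own tag counts once

    def handle_data(self, data):
        for block in self.stack:
            block[0].append(data)

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self.stack:
            parts, tag_count = self.stack.pop()
            text = " ".join("".join(parts).split())  # tags removed, whitespace collapsed
            self.blocks.append((len(text) / tag_count, text))


def extract_main_text(html: str) -> str:
    """Return the text of the block with the highest assumed text density."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    if not collector.blocks:
        return ""
    return max(collector.blocks)[1]


page = (
    '<div><div><a href="#">Home</a><a href="#">News</a><a href="#">About</a></div>'
    "<div><p>Character encodings map characters to octets.</p></div></div>"
)
print(extract_main_text(page))
```

Link-heavy navigation blocks score low because they contain many tags and little text, while the article block scores high.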

Portable data entry device

The invention is a portable data entry device that allows a user to record hand written material on paper using standard writing tools while simultaneously recording an electronic copy of the written material for subsequent processing and storage on suitable data processing and storage devices. The inventive device includes data processing software which performs an initial analysis of the electronically recorded hand writing and encodes non-ambiguous characters into a compressed character encoding format and stores the encoded data along with the remaining ambiguous character data in local storage memory. The encoded non-ambiguous character data is subsequently transmitted along with the remaining ambiguous character data to a base data processing device for further character recognition data processing.
Owner:KARBONSTREAM CORP

Method and system for processing script data

The invention discloses a method and system for processing script data, belonging to the script technology field. Traditional script data is large in volume, and the logic for using it is complex, slow and inefficient. In the method and system of the invention, a direct mapping between the character encoding and the character pattern index is first established and recorded in a mapping table of character encodings and character pattern indexes, and the redundant data in the script data is deleted; when the script is used, the character pattern index of a character is first obtained by querying the mapping table, and then the character pattern description data of the character is obtained from the script data according to the character pattern index. After the script data is processed by the method and system of the invention, the volume of the script data is reduced and the efficiency of using the script data is improved. The invention is especially suitable for files that integrate character information with script data into a whole, or for being added into file reading software.
Owner:NEW FOUNDER HLDG DEV LLC +2

Intelligent television input method and system thereof

The invention discloses an intelligent television input method and a system thereof. The input method comprises the following steps: starting the state of the input method and displaying a virtual keyboard with different editing areas; outputting a character type corresponding to one editing area selected by a user; receiving an operation instruction sent by the user through virtual keys in the editing area and inputting character codes corresponding to the virtual keys according to the operation instruction. According to the intelligent television input method and the system, the selected letters or phonetic alphabets are displayed around the corresponding keys by taking keyboard layouts in a plurality of editing areas, so that the user does not need to switch the editing interfaces repeatedly on the virtual keyboard during inputting different types of characters and also does not need to select by pressing the keys for a plurality of times or moving a focus point repeatedly during inputting the letters or the Chinese characters; the times for pressing the keys are reduced; the operation steps are simplified; the characters in a television can be more conveniently inputted.
Owner:KONKA GROUP

Apparatus and method of mobile communication terminal character conversion

The invention discloses a character conversion device of a mobile communication terminal, and comprises a data recognition unit which judges whether the code of a received first character exists in a local character base; a character code corresponding table which records the code of the first character and the code of a corresponding second character in the local character base; and a character conversion unit which converts the received first character into the corresponding second character in the local character base according to the character code corresponding table. Besides, the invention discloses a character conversion method of the mobile communication terminal, which first judges whether the code of the received first character exists in the local character base, and searches the code of the second character in the local character base which is corresponding to the first character according to the set character code corresponding table, if not. The invention reduces the occupation of memory space by the character base resource, has a small calculating amount, is suitable for mobile terminal devices limited by resources and cost and can realize the compatibility of characters with only a set of character base in the mobile terminal device.
Owner:SHANGHAI SIMCOM LTD

Machine readable 2D symbology printable on demand

An optically readable two dimensional symbology employed to encode a string of characters belonging to a source string alphabet. The two dimensional symbol can comprise an ordered plurality of printable elements. The printable elements can be arranged in a rectangular array. The printable elements can be positioned on a grid diagonal to the longitudinal axis of the rectangular array, and sequenced according to a pre-defined pattern. Each character of the character string can be encoded into a sequence of printable elements using an encoding scheme comprising at least one code set. A code set can include a plurality of bit sequences. Each bit sequence can correspond to a group of one or more characters of the source string alphabet. Each bit sequence can comprise one or more binary digits. The first binary digit can be encoded by a printable element printed in a printable element position, and a second binary digit can be encoded by a vacant printable element position.
Owner:HAND HELD PRODS