78 results about "Organization Name" patented technology

A non-unique textual identifier for the organization. [BRIDG]

Meta-content analysis and annotation of email and other electronic documents

Meta-content analysis and annotation are performed on the body of email documents and other electronic documents to create a displayable index of the instances of meta-content, sorted and annotated by type. In addition, the electronic document is enhanced by providing links from the semantic foci to external documents containing related information. An electronic document adapted for delivery to one or more recipients, the electronic document including a header and a body, is processed by: performing meta-content extraction of semantic foci within said header and said body, the semantic foci comprising a plurality of types of information including one or more of email addresses, URLs, dates, currency values, organization names, names of people, names of places, and phone numbers; creating a meta-content index of the document based upon said extracted semantic foci; arranging the meta-content index according to said plurality of types; combining said meta-content index with said header and said body to provide an enhanced document; and sending said enhanced document to said one or more recipients via a communication network. The process includes converting the electronic mail document to a markup language format, wherein said meta-content index comprises one or more objects expressed in said markup language adapted for presentation with the body in said enhanced document.
Owner:SAP AMERICA
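
The extraction and indexing steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the regular expressions, the `PATTERNS` table, and the function name `build_meta_index` are assumptions made for illustration, and only three of the listed types (email addresses, URLs, ISO-style dates) are covered.

```python
import re

# Hypothetical, deliberately simple patterns. Recognizing organization names,
# people, and places would need far more than regular expressions.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "url": re.compile(r"https?://[^\s<>\"]+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

def build_meta_index(header, body):
    """Extract semantic foci from header and body and group them by type."""
    index = {}
    for text in (header, body):
        for kind, pattern in PATTERNS.items():
            for match in pattern.findall(text):
                bucket = index.setdefault(kind, [])
                if match not in bucket:  # keep each focus once per type
                    bucket.append(match)
    # Arrange the index by type name, with the entries of each type sorted.
    return {kind: sorted(values) for kind, values in sorted(index.items())}
```

Combining the index with the original header and body, and converting the result to a markup format, would follow as separate steps that this sketch leaves out.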

Text joins for data cleansing and integration in a relational database management system

An organization's data records are often noisy because of transcription errors, incomplete information, and lack of standard formats for textual data. A fundamental task during data cleansing and integration is matching strings, perhaps across multiple relations, that refer to the same entity (e.g., organization name or address). Furthermore, it is desirable to perform this matching within an RDBMS, which is where the data is likely to reside. In this paper, we adapt the widely used and established cosine similarity metric from the information retrieval field to the relational database context in order to identify potential string matches across relations. We then use this similarity metric to characterize this key aspect of data cleansing and integration as a join between relations on textual attributes, where the similarity of matches exceeds a specified threshold. Computing an exact answer to the text join can be expensive. For query processing efficiency, we propose an approximate, sampling-based approach to the join problem that can be easily and efficiently executed in a standard, unmodified RDBMS. Therefore, the present invention includes a system for string matching across multiple relations in a relational database management system comprising generating a set of strings from a set of characters, decomposing each string into a subset of tokens, establishing at least two relations within the strings, establishing a similarity threshold for the relations, sampling the at least two relations, correlating the relations for the similarity threshold and returning all of the tokens which meet the criteria of the similarity threshold.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO +1
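
To make the idea of a thresholded text join concrete, here is a small Python sketch. It computes the exact join in memory, not the sampling-based approximation or the in-RDBMS execution described above. The choice of character q-grams as tokens, the tf-idf weighting formula, and all function names are assumptions for illustration.

```python
import math
from collections import Counter

def qgrams(s, q=3):
    """Decompose a string into overlapping lowercase character q-grams."""
    pad = "#" * (q - 1)
    padded = pad + s.lower() + pad
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]

def tfidf_vectors(strings, q=3):
    """Return one L2-normalized tf-idf vector (a dict) per input string."""
    token_lists = [qgrams(s, q) for s in strings]
    df = Counter(tok for toks in token_lists for tok in set(toks))
    n = len(strings)
    vectors = []
    for toks in token_lists:
        weights = {t: c * math.log(1 + n / df[t]) for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

def text_join(r1, r2, threshold):
    """Return (a, b, similarity) for pairs whose cosine similarity exceeds threshold."""
    vectors = tfidf_vectors(list(r1) + list(r2))
    v1, v2 = vectors[:len(r1)], vectors[len(r1):]
    result = []
    for a, va in zip(r1, v1):
        for b, vb in zip(r2, v2):
            sim = sum(w * vb.get(t, 0.0) for t, w in va.items())
            if sim > threshold:
                result.append((a, b, sim))
    return result
```

The nested loop compares every pair of strings, which is the cost the abstract's sampling-based approach is meant to avoid.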

Virtual domain name system using the user's preferred language for the internet

This invention describes a system that allows a user to enter domain names in any language of the user's preference by automatically converting them into the corresponding real domain names in English that comply with the Domain Name System. The system incorporates two conversion methods. The first method is to convert the coded portions of a domain name such as organization code and country code. In this method, each coded portion in English is pre-assigned an equivalent word or code in the user's preferred language, and the equivalent word or code entered in the user's preferred language is converted into the corresponding real coded portion in English. The second method is to convert the remaining portions of a domain name such as organization name and server computer name. In this method, the user enters each portion in the user's preferred language, written as the corresponding real portion in English would be transliterated into that language in accordance with the standard pronunciation of English words or letters in the user's preferred language. Then, the letters of the portion entered in the user's preferred language are converted into English letters by matching the phonemes of the portion entered in the user's preferred language with English phonemes that have the same or proximate sounds and transcribing the English phonemes into the corresponding English letters. The conversion system of the present invention can be implemented automatically at the user's computer without having to change the Domain Name System.
Owner:DUALNAME
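
The first conversion method (a pre-assigned table for coded portions) can be sketched in a few lines of Python. The table contents below are hypothetical and use Korean purely as an example language. The second, phoneme-based method is not shown.

```python
# Hypothetical pre-assigned equivalents: preferred-language word -> real coded label.
CODE_TABLE = {"회사": "co", "한국": "kr"}

def convert_coded_labels(domain, table=CODE_TABLE):
    """Replace coded labels found in the table; leave every other label unchanged."""
    return ".".join(table.get(label, label) for label in domain.split("."))
```

Labels that are not in the table, such as an organization name, pass through unchanged here. In the described system they would go through the transliteration step instead.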

Chinese business card OCR (optical character recognition) data correction system utilizing massive associated information of knowledge base

Inactive; CN103927352A; Improve accuracy; Adapt to the ever-increasing demand for information; Character and pattern recognition; Special data processing applications; Incremental maintenance; Business card
The invention provides a Chinese business card OCR (optical character recognition) data correction system utilizing massive associated information of a knowledge base. The Chinese business card OCR data correction system comprises an image collection module, an image standardized processing module, a block extracting module, an OCR module, a knowledge base module, a data correction module, a gain maintaining module and a result displaying module. The system is characterized in that to-be-corrected data are labeled by subjecting recognition results of the OCR module to structured-information processing; address and organization name associated information is corrected by utilizing the massive associated information of the knowledge base module and combining a series of techniques such as Chinese word segmentation, importance weighting based on the knowledge base, similarity comparison based on texts and images, and information integration to improve accuracy; corrected OCR results are output and displayed. In addition, the gain maintaining module of the system maintains the knowledge base in a semiautomatic manner to meet the needs of a continually growing quantity of information.
Owner:JIANGSU WEISHI TECH
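
As a rough illustration of text-similarity correction against a knowledge base, the sketch below uses Python's standard `difflib` module. It covers only the text comparison step. The knowledge base contents, the cutoff value, and the function name are assumptions, and the word segmentation, importance weighting, and image comparison from the abstract are not modeled.

```python
import difflib

# Hypothetical knowledge base of known organization names.
KNOWN_ORGANIZATIONS = ["Acme Technology Co", "Apex Trading Ltd"]

def correct_organization(ocr_text, known=KNOWN_ORGANIZATIONS, cutoff=0.8):
    """Return the closest known name if it is similar enough, else the OCR text."""
    matches = difflib.get_close_matches(ocr_text, known, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text
```

For example, `correct_organization("Acme Techno1ogy Co")` maps the misrecognized digit back to the known name, while text with no close match is returned unchanged.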

Method and device for translating Chinese organization name into English with the aid of network knowledge

The invention relates to a method and device for translating a Chinese organization name into English. The method comprises the following steps: dividing the Chinese organization name to be translated into four language chunks by using a word-based conditional random field model, and carrying out word segmentation on the four language chunks; selecting a plurality of phrases with certain information and translation confidence for statistical translation to obtain translations of the phrases of the organization name, and forming a bilingual query together with the Chinese organization name to be translated; searching the bilingual query with a search engine to obtain snippets of a plurality of Chinese-English mixed webpages; extracting the English in the snippets and, with the aid of an asymmetric Chinese-English alignment technique, selecting from the English sentences the segment that has the highest matching rate with the Chinese organization name; and determining an optimal segment as the translation of the Chinese organization name by calculating the occurrence frequency of each segment. The method overcomes the defect that a statistical translation model is prone to structure, word order, and phrase selection errors when translating Chinese organization names, and improves the translation precision for Chinese organization names by 35.26 percent.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI

User search string organization name recognition method based on semantic feature model

The invention belongs to the field of natural language processing, and particularly relates to a method for recognizing organization names in user search strings based on a semantic feature model. The method comprises a model establishment stage and a recognition stage. In the model establishment stage, a training corpus conforming to the distribution of user search strings is built from an existing annotated long-text corpus; the corpus retains the traditional word segmentation and part-of-speech tagging features and additionally includes semantic environment features, namely context features within the search string and cohesion-related features; a conditional random field model is established from the composite semantic features and adopted as the organization name recognition model. In the recognition stage, the semantic environment features corresponding to the user search string are calculated to obtain a tag sequence for the search string, the tag subsequence conforming to an organization name is extracted, and the organization name in the user search string is obtained. By adopting the method, both the accuracy and the recall rate of recognizing organization names in user search strings can be improved.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Device and method for identifying organization name by word segmentation program

The invention relates to the technical field of network data communication and discloses a device and a method for identifying an organization name by a word segmentation program. The device comprises a storage module, a word segmentation module, an identification module and an output module, wherein the storage module is adapted to store data; the word segmentation module is adapted to segment the sentence to be identified using an entry dictionary in order to obtain the entries in the sentence; the identification module is adapted to extract, from the entries obtained by word segmentation, those found in a part-of-speech dictionary with a part of speech relevant to the preset organization name, to splice the extracted entries according to preset connection rules for the relevant parts of speech, to take each spliced entry as a candidate organization name and add it to a candidate set, and to select from the candidate set an entry satisfying the preset output conditions for organization names; and the output module is adapted to take the selected entry as the organization name and output it. The device and the method provided by the invention solve the problem of extracting organization names from text and achieve the beneficial effect of extracting them automatically.
Owner:BEIJING QIHOO TECH CO LTD +1
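
The pipeline of segmentation, part-of-speech lookup, splicing, and selection can be sketched as follows. Everything concrete here is an assumption for illustration: the abstract does not name a segmentation algorithm (forward maximum matching is used as a stand-in), and the dictionaries, the single connection rule, and the length-based output condition are hypothetical.

```python
# Hypothetical dictionaries and rules, for illustration only.
ENTRY_DICT = {"我", "在", "北京", "大学", "读书"}
POS_DICT = {"北京": "place", "大学": "org_suffix"}
CONNECTION_RULES = {("place", "org_suffix")}

def segment(sentence, entries, max_len=4):
    """Forward maximum matching: take the longest dictionary entry at each position."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in entries:  # single characters always pass
                words.append(piece)
                i += size
                break
    return words

def find_org_names(sentence):
    """Splice adjacent entries whose parts of speech match a connection rule."""
    words = segment(sentence, ENTRY_DICT)
    candidates = []
    for left, right in zip(words, words[1:]):
        if (POS_DICT.get(left), POS_DICT.get(right)) in CONNECTION_RULES:
            candidates.append(left + right)
    # Assumed output condition: keep candidates of at least three characters.
    return [c for c in candidates if len(c) >= 3]
```

With these toy dictionaries, the sentence 我在北京大学读书 is segmented into five entries, and the place entry followed by the organization suffix is spliced into the candidate 北京大学.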

Incident location extraction method oriented to Chinese news texts

The invention provides an incident location extraction method oriented to Chinese news texts. According to the method, word segmentation is first conducted on the Chinese news text T with the ICTCLAS Chinese word segmentation tool, and words whose part of speech is organization name, location noun, or place name are selected to form a candidate incident location set; for each word in the candidate incident location set, a three-dimensional feature vector including the context feature, the position feature and the topology feature is established; finally, using the established three-dimensional feature vectors, a Random Forest classifier is adopted to conduct binary classification of all the words in the candidate incident location set into incident locations and non-incident locations, and thus extraction of the incident locations is achieved. According to the method, multiple types of features in the news texts can be utilized comprehensively: the context features, the position features and the topology features are extracted to form the feature vectors, and the Random Forest classifier is applied to the organization names, location nouns, and place names obtained from the segmented words so as to recognize the incident locations; the places where news events occur can thus be further recognized on the basis of place name identification.
Owner:XIAN JIAOTONG UNIV CITY COLLEGE

Information processing method and device and method and device for standardizing organization names

The invention discloses an information processing method and device and a method and device for standardizing organization names. The information processing method comprises the steps of: dividing organization names, wherein the organization names are divided into multiple levels of sub organization names according to the semantic characteristics of the organization names; analyzing subordinate relations, wherein the subordinate relations between the multiple levels of sub organization names are analyzed so as to obtain the internal organization structure of the organizations related to the organization names; analyzing equal relations, wherein the equal relations between the organization names are analyzed by utilizing public information sources; and storing organization names, wherein the organization names, the internal organization structure, and the equal relations are stored in association with one another so as to establish a knowledge base. According to the information processing method and device and the method and device for standardizing organization names, the organization names can be standardized more efficiently and accurately, and therefore unified management and rapid retrieval of documents are facilitated.
Owner:FUJITSU LTD
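
A toy Python sketch of the divide, relate, and store steps follows. The suffix list, the alias table standing in for equal relations derived from public sources, and all function names are hypothetical. Splitting on suffix words is a crude stand-in for the semantic division the abstract describes.

```python
# Hypothetical words that close one organizational level.
LEVEL_SUFFIXES = {"Corporation", "Division", "Laboratory", "University", "Department"}

def divide(name):
    """Split an organization name into levels, cutting after each suffix word."""
    levels, current = [], []
    for token in name.split():
        current.append(token)
        if token in LEVEL_SUFFIXES:
            levels.append(" ".join(current))
            current = []
    if current:
        levels.append(" ".join(current))
    return levels

def build_knowledge_base(names, aliases):
    """Store child-to-parent paths and alias-to-canonical-name mappings."""
    kb = {"parent_of": {}, "canonical": dict(aliases)}
    for name in names:
        levels = divide(name)
        paths = [" ".join(levels[:i + 1]) for i in range(len(levels))]
        for parent, child in zip(paths, paths[1:]):
            kb["parent_of"][child] = parent
    return kb

def standardize(name, kb):
    """Map a known alias to its canonical name; return other names unchanged."""
    return kb["canonical"].get(name, name)
```

Keying the subordinate relation on the full path keeps same-named units in different organizations, such as two separate "Research Division" entries, apart.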