1224 results about "Word group" patented technology

Control system and method for controlling household appliances through voice

The invention relates to a control system and method for controlling household appliances through voice. The system comprises a master control module, an audio input device, an audio playing device and a household appliance master control unit. The output end of the audio input device is connected with the input end of the master control module. The input end of the audio playing device is connected with the output end of the master control module. The household appliance master control unit is in bidirectional connection with the communication port of the master control module. According to the system and method, the household appliances can be controlled through a voice command word group containing two keywords: the first half of the word group is the first keyword, a household appliance awakening word, and the second half is the second keyword, a household appliance operating command word. For example, in the word group "intelligent air conditioner starting", the first keyword, "intelligent air conditioner", awakens the air conditioner control unit, and the second keyword, "starting", instructs the air conditioner control unit to perform the starting action. The system and method can help a user to quickly control household appliances through voice in a user-friendly way.
Owner:IFLYTEK CO LTD
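
The two-keyword command structure described above can be illustrated with a minimal Python sketch. It assumes speech has already been transcribed to text; the vocabularies `WAKE_WORDS` and `OPERATIONS`, the appliance and action identifiers, and the function name `parse_command` are hypothetical and are not taken from the patent.

```python
from typing import Optional, Tuple

# Hypothetical vocabularies: wake words select an appliance, operation words select an action.
WAKE_WORDS = {
    "intelligent air conditioner": "air_conditioner",
    "intelligent lamp": "lamp",
}
OPERATIONS = {"starting": "power_on", "stopping": "power_off"}


def parse_command(phrase: str) -> Optional[Tuple[str, str]]:
    """Split a transcribed phrase into (appliance, action); return None if it does not match."""
    text = " ".join(phrase.lower().split())  # normalize case and whitespace
    for wake, appliance in WAKE_WORDS.items():
        # The first keyword must open the phrase and be followed by a second keyword.
        if text.startswith(wake + " "):
            rest = text[len(wake):].strip()
            action = OPERATIONS.get(rest)
            if action is not None:
                return appliance, action
    return None


print(parse_command("Intelligent air conditioner starting"))
```

A phrase that lacks either keyword is rejected, which mirrors the idea that the wake word and the operating command word must arrive together in one word group.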

Code, system, and method for generating concepts

Disclosed are a computer-readable code, system and method for generating candidate novel concepts in one or more selected fields. The system operates to generate strings of terms composed of combinations of word and, optionally, word-group terms that are descriptive of concept elements in such field(s), and uses a genetic algorithm to find one or more high-fitness strings, based on the application of a fitness metric which quantifies, e.g., the number of occurrences of pairs of terms in texts in a selected library of texts. The highest-score string or strings are then applied in a database search to identify one or more pairs of primary and secondary texts whose terms overlap with those of a high-fitness string.
Owner:WORD DATA

Information management and retrieval

A method and apparatus are provided for extracting key terms from a data set. The method includes identifying a first set of word groups, each of one or more words, that occur more than once in the data set, and removing from this first set a second set of word groups that are sub-strings of longer word groups in the first set. The remaining word groups are key terms. Each word group is weighted according to its frequency of occurrence within the data set. The weighting of any word group may be increased by the frequency of any sub-string of words occurring in the second set and then dividing each weighting by the number of words in the word group. This weighting process operates to determine the order of occurrence of the word groups. Prefixes and suffixes are also removed from each word in the data set. This produces a neutral form of each word so that the weighting values are prefix and suffix independent.
Owner:BRITISH TELECOMM PLC
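
The extraction and weighting steps above can be sketched in Python as follows. This is one reading of the abstract, not the patented implementation: the cap `max_len` on word-group length, whitespace tokenization, and the exact way the sub-string bonus is summed are assumptions, and prefix/suffix stripping is omitted.

```python
from collections import Counter
from typing import Dict, List, Tuple

Group = Tuple[str, ...]


def count_groups(words: List[str], max_len: int) -> Counter:
    """Count every contiguous word group of 1..max_len words."""
    counts: Counter = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def is_substring(short: Group, long: Group) -> bool:
    """True if `short` occurs contiguously inside the strictly longer group `long`."""
    if len(short) >= len(long):
        return False
    return any(long[i:i + len(short)] == short
               for i in range(len(long) - len(short) + 1))


def key_terms(text: str, max_len: int = 3) -> Dict[str, float]:
    words = text.lower().split()
    # First set: word groups that occur more than once.
    first = {g: c for g, c in count_groups(words, max_len).items() if c > 1}
    # Second set: members of the first set contained in a longer member.
    second = {g for g in first if any(is_substring(g, h) for h in first)}
    weights: Dict[str, float] = {}
    for group, freq in first.items():
        if group in second:
            continue
        # Assumed weighting: own frequency plus frequencies of removed sub-strings,
        # divided by the number of words in the group.
        bonus = sum(first[s] for s in second if is_substring(s, group))
        weights[" ".join(group)] = (freq + bonus) / len(group)
    return weights


print(key_terms("word group extraction finds a word group and another word group"))
```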

Method and apparatus for identifying documents relevant to a search query in a medical information resource

A computerized system and method for providing information for use in medical care. Documents in a medical information resource may have several associated sections, such as title, headings, text, keyword and document type sections. Display of search results resulting from a user's query may be determined based on at least one document section in which the search engine identifies at least one search term. The search engine may generate a set of search terms for identifying documents relevant to a user's query, at least in part, by using a search term synonym resource that includes a plurality of search terms arranged in groups of associated synonyms. Synonyms in an associated group may be arranged in a hierarchical structure such that each synonym in the associated group has a parent, sibling or child relationship with each other synonym in the associated group.
Owner:UPTODATE

Code, method, and system for manipulating texts

Disclosed are a computer-readable code, system and method for combining texts to form novel combinations of texts related to a desired target concept, where the concept is represented in the form of a natural-language text or a list of descriptive word and / or word-group terms. The system operates to find primary and secondary groups of texts having highest term match scores with a first and second subset of terms in the concept, respectively. It then generates pairs of texts containing a text from each of the primary and secondary groups of database texts, and selects for presentation to the user, those pairs of texts having highest overlap scores as determined from one or more of (i) term overlap, (ii) term coverage, (iii) feature-specific cross-correlation, (iv) attribute-specific correlation, and (v) citation score of one or both texts in the pair.
Owner:WORD DATA

Named entities recognition method based on bidirectional LSTM and CRF

The invention discloses a named entities recognition method based on bidirectional LSTM and CRF. The method improves and optimizes the traditional named entity recognition algorithms of the prior art. It comprises the following steps: (1) preprocessing a text and extracting its phrase information and character information; (2) coding the text character information by means of a bidirectional LSTM neural network to convert the text character information into character vectors; (3) using the GloVe model to code the text phrase information into word vectors; (4) combining the character vectors and the word vectors into a context information vector and feeding the context information vector into the bidirectional LSTM neural network; and (5) decoding the output of the bidirectional LSTM with a linear-chain conditional random field to obtain the annotated entities of the text. The invention uses a deep neural network to extract text features and decodes the textual features with the conditional random field, so that the text feature information can be effectively extracted and good results can be achieved in entity recognition tasks across different languages.
Owner:南京安链数据科技有限公司

Text representation and method

A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.
Owner:WORD DATA
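
As a rough illustration of the selectivity value described above, the Python sketch below compares how often a term appears in texts of one field with how often it appears in texts of another field. The use of document frequency as the frequency measure, the `floor` that avoids division by zero, the `min_selectivity` cut-off standing in for "non-generic", and all function names are assumptions made for this example; word groups and inverse document frequency are left out.

```python
from typing import Dict, List


def doc_frequency(term: str, library: List[str]) -> float:
    """Fraction of texts in the library that contain the term."""
    hits = sum(1 for text in library if term in text.lower().split())
    return hits / len(library)


def selectivity(term: str, field_library: List[str], other_library: List[str],
                floor: float = 1e-6) -> float:
    """Frequency in the target field relative to frequency in another field."""
    return doc_frequency(term, field_library) / max(doc_frequency(term, other_library), floor)


def document_vector(doc: str, field_library: List[str], other_library: List[str],
                    min_selectivity: float = 1.25) -> Dict[str, float]:
    """Represent a document as {term: coefficient}, keeping only field-selective terms."""
    vector: Dict[str, float] = {}
    for term in set(doc.lower().split()):
        value = selectivity(term, field_library, other_library)
        if value >= min_selectivity:  # generic words score near 1 and are dropped
            vector[term] = value
    return vector


field = ["the stent expands the artery", "the catheter guides the stent"]
other = ["the engine drives the wheel", "the stent is rare here"]
print(document_vector("the stent", field, other))
```

In this toy data, "the" appears equally often in both libraries and is dropped, while "stent" is twice as frequent in the field library and is kept.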

Apparatus and method for forming compound words

A method and device for generating meaningful compound words is provided. A user interface (120, 212) is configured to receive data input corresponding to one or more compound words. A processor (206) is configured to identify word combinations of shorter words that may be combined to form a portion or all of the one or more compound words. A display (118, 208) is configured to show the word combinations in a priority based on one or more criteria, such as distinguishing word combinations having different quantities of shorter words.
Owner:GOOGLE TECH HLDG LLC
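
The idea of identifying shorter words that combine into a compound word, and ordering the combinations by a criterion such as the number of shorter words, can be sketched in Python. The lexicon, the function names, and the choice to list combinations with fewer parts first are assumptions for illustration only.

```python
from typing import List, Set


def splits(word: str, lexicon: Set[str]) -> List[List[str]]:
    """Return every way to write `word` as a concatenation of lexicon entries."""
    if not word:
        return [[]]  # one way to split the empty string: no parts
    results: List[List[str]] = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            for tail in splits(word[i:], lexicon):
                results.append([head] + tail)
    return results


def ranked_splits(word: str, lexicon: Set[str]) -> List[List[str]]:
    """Order combinations by number of shorter words, then alphabetically."""
    return sorted(splits(word, lexicon), key=lambda parts: (len(parts), parts))


lexicon = {"note", "book", "notebook", "case", "bookcase"}
for parts in ranked_splits("notebookcase", lexicon):
    print(" + ".join(parts))
```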

Aligning chunk translations for language learners

Inactive | US20110097693A1 | Easily input data | Simple data format | Electrical appliances | Teaching apparatus | Text editing | Document preparation
A method and apparatus to align and edit chunks of text and translation. Language learners compare segments of text and translation. Both text and translation are segmented into word groups or “chunks” and related to each other. The related chunks are aligned to facilitate their comparison. For a reader, unfamiliar chunks can be related to more familiar chunks. Constant alignment of text and translation chunks occurs in many variable outputs, including bifocal formats and directly editable alignments. Thus, human edits and improvements input into the system can inform improving machine chunk translation. Both text and translation are editable within one single document, manageable in a wide variety of text editing environments, including common Textarea Input fields. Resulting chunk translations are easily printed on paper and / or displayed electronically. Language learners using the system may include humans and machines. Productions of aligned texts are customized for individual language learners.
Owner:CRAWFORD RICHARD HENRY DANA

System and method for heterogeneous information mining and visual analysis

The invention relates to the field of heterogeneous information retrieval, in particular to an intelligent retrieval and analyzing method based on domain ontology and information mining and a visual analyzing system comprising the method. The system mainly comprises a field data acquisition subsystem, a corpus resource processing subsystem, an information mining subsystem and a visual analyzing subsystem. The field data acquisition subsystem acquires data by network capturing and local uploading, the corpus resource processing subsystem pre-processes field-related data, the information mining subsystem analyzes and mines related information in the corpus, and the visual analyzing subsystem dynamically displays, counts and analyzes retrieval results. The system makes full use of the concepts in a domain ontology base and the relations between them, so that user requirements can be correctly understood and the hierarchical structural information of a given field can be clustered automatically. This supports user queries by key words, phrases and simple sentences and optimizes retrieval results. Relevant concepts and extension concepts can be found by ontological reasoning to support a graphic preview of the meaning of each item in the query results, and retrieval performance for professional-field information can be remarkably improved to realize dynamic information display.
Owner:BEIJING ZHONGJIKEHAI TECH & DEV

Apparatus and method for word translation information output processing

When it accepts an input sentence, the present apparatus divides the input sentence into substrings through morpheme analysis and obtains a candidate word group for translation of the substrings from a machine translation dictionary. It then obtains information on occurrence of each candidate word in the candidate word group within a bilingual example sentence database and calculates their priorities based on the occurrence information. Then, it grants priority as translation to each of the candidate words to generate a prioritized candidate word group and sorts the candidate words in descending order of priority for output.
Owner:FUJITSU LTD
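
The prioritization step above can be sketched in Python under simplifying assumptions: whitespace tokenization stands in for morpheme analysis, the bilingual example database is a list of `(source, target)` sentence pairs, and priority is simply the number of example pairs in which the source word and the candidate co-occur. The function name `prioritize` and the sample data are hypothetical.

```python
from typing import List, Tuple


def prioritize(source_word: str, candidates: List[str],
               examples: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Rank candidate translations by co-occurrence count in bilingual example pairs."""
    scored: List[Tuple[str, int]] = []
    for candidate in candidates:
        hits = sum(
            1 for source, target in examples
            if source_word in source.split() and candidate in target.split()
        )
        scored.append((candidate, hits))
    # Descending order of priority; ties keep the dictionary order (sort is stable).
    scored.sort(key=lambda pair: -pair[1])
    return scored


examples = [
    ("the bank is closed", "la banque est fermée"),
    ("river bank", "la rive"),
    ("my bank account", "mon compte en banque"),
]
print(prioritize("bank", ["rive", "banque"], examples))
```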

Intelligent word input method and input method system and updating method thereof

The invention discloses an intelligent word-group input method for an input method system. The method comprises: obtaining combination information between at least two basic words from a preset Internet corpus, where the combination information includes the combination relationship and the adjacent co-occurrence frequency of the at least two words; generating a multi-table according to the combination information; receiving the encoded string input by the user and segmenting it; and looking up the combination information in the multi-table according to the segmented strings, then extracting the words with corresponding relations in the combination information as candidate words. The invention can improve the first-candidate hit rate when the user inputs words, phrases, short sentences or long sentences, and avoids ineffective repeated calculation, thereby improving input efficiency.
Owner:BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

Code, system and method for representing a natural-language text in a form suitable for text manipulation

A computer method, system and code, for representing a natural-language document in a vector form suitable for text manipulation operations are disclosed. The method involves determining (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term.
Owner:WORD DATA

Chinese network review emotion classification method based on integrated study frame

The invention discloses a Chinese network review emotion classification method based on an integrated study frame. According to the method, a part-of-speech combination mode, an order-preserving sub-matrix mode and a frequent word sequence mode are adopted as input characteristics, in the level of characteristics, factors of the influence of Chinese word order information, interval phrase characteristics and the sentence length are considered, and the characteristic vector sparsity problem is solved through semantic similarities; the problem that many review text characteristics exist is solved, the inter-base-classifier independence is guaranteed, and the classification performance of base classifiers is improved as much as possible; a base classifier algorithm constructed based on product attributes is adopted to comprehensively review emotion information of each attribute in a text, and then the sentence-level emotional tendency of reviews is judged, so that a final classification result is more accurate. The Chinese network review emotion classification method based on the integrated study frame is applicable to e-commerce network review emotion classification in various fields, can make a potential consumer know evaluation information of a commodity before purchase and can also make a merchant better sufficiently know the consumer's opinion, and therefore the service quality is improved.
Owner:NANJING SILICON INTELLIGENCE TECH CO LTD

Method and system for text sentiment analysis and processing

The invention relates to a method and system for text sentiment analysis and processing. The method comprises the steps that word segmentation is conducted on a text; word vector training is conducted on segmented words of the text, so a binary file is obtained; sentiment characteristic word groups are extracted from the binary file, and syntax characteristic information and sentiment characteristic information are acquired from the word groups; characteristic integration is conducted on the syntax characteristic information and the sentiment characteristic information, so text characteristics containing syntaxes and sentiment information are obtained; the word vectors and the sentiment characteristic information in the binary file are integrated, so word vectors containing the sentiment information are obtained; the word vectors are extracted, so semantic characteristics containing the sentiment information are obtained; and the text characteristics containing the syntax and sentiment information are integrated with the semantic characteristics containing the sentiment information, so grammar information, semantic information, syntax information and sentiment information of the text can be obtained. According to the invention, the problem in the prior art, that the extracted characteristics cannot contain the semantic information, syntax information and sentiment information at the same time, can be solved; and results obtained are highly accurate.
Owner:GUILIN UNIV OF ELECTRONIC TECH

Requirement defining method, method for developing software, and method for changing requirement word, and newly defining method

The output items to be finally obtained by the computer software under development are determined, these items are prescribed by data generation equations that use interim data items, and all the interim data items are in turn prescribed by separate data generation equations. This is continued until all new data items are prescribed by input data; the resulting prescriptions of the data items, data generation equations, data generation execution conditions and the like constitute the requirement definition. Furthermore, the requirement definition obtained in this manner is applied to a method of automatically finding a processing order of data items based on the definition, or automatically establishing data in a correct order to automatically develop a program, and accordingly the software to be finally obtained is automatically developed. To change a requirement word group completed as a requirement definition, the other words that define the generation method of the word concerned before and after the change, the other words whose generation method is defined by the word concerned, and the definitions of those words are extracted and presented, and the necessity of changing them is studied. If "necessary", the change is performed and the process continues. If "not necessary", the propagation of the influence stops there, and the change of the word concerned ends. This solves the problem that the range of influence of a change cannot be bounded in conventional methods.
Owner:CATENA SA

Method and System for Automatic Management of Reputation of Translators

The present invention provides a method that includes receiving a result word set in a target language representing a translation of a test word set in a source language. When the result word set is not in a set of acceptable translations, the method includes measuring a minimum number of edits to transform the result word set into a transform word set. The transform word set is in the set of acceptable translations. A system is provided that includes a receiver to receive a result word set and a counter to measure a minimum number of edits to transform the result word set into a transform word set. A method is provided that includes automatically determining a translation ability of a human translator based on a test result. The method also includes adjusting the translation ability of the human translator based on historical data of translations performed by the human translator.
Owner:SDL INK
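
Measuring the minimum number of edits needed to turn a result word set into an acceptable translation can be illustrated with a word-level edit distance. The sketch below assumes that an edit is the insertion, deletion or substitution of one word; the abstract does not specify the edit operations, and the function names are hypothetical.

```python
from typing import List


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, word_a in enumerate(a, 1):
        cur = [i]
        for j, word_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                         # delete word_a
                cur[j - 1] + 1,                      # insert word_b
                prev[j - 1] + (word_a != word_b),    # substitute or match
            ))
        prev = cur
    return prev[-1]


def min_edits(result: str, acceptable: List[str]) -> int:
    """Smallest number of word edits that turns `result` into any acceptable translation."""
    tokens = result.split()
    return min(edit_distance(tokens, target.split()) for target in acceptable)


acceptable = ["the cat sat on the mat", "a cat sat on the mat"]
print(min_edits("the cat sat on mat", acceptable))
```

A result that is already in the acceptable set scores zero, so the same function covers both branches of the described test.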

Training method of human action recognition and recognition method

The invention provides a training method of human action recognition, comprising the following steps: extracting space-time interest points from a video file; quantizing all the space-time interest points to corresponding video words according to the feature descriptors contained by the space-time interest points and generating a statistical histogram for the video words; obtaining other video words in the space-time neighborhood of the video words according to the space-time context information in the space-time neighborhood of the video words and forming space-time video phrases by the video words and one of other video words which meets space-time constraint; clustering the space-time contexts in the space-time neighborhood of the video words to obtain context words and forming space-time video word groups by the video words and the context words; selecting the representative space-time video phrase from the space-time video phrases and selecting the representative space-time video word group from the space-time video word groups; and training a classifier by utilizing the result after one or more features in the video words, the representative space-time video phrase and the representative space-time video word group are fused.
Owner:INST OF COMPUTING TECH CHINESE ACAD OF SCI

Method for searching for characters displayed in screen and based on mobile terminal and mobile terminal

The invention provides a method for searching for characters displayed in a screen and based on a mobile terminal. The method includes setting a word fetching tool, wherein the window rank of the word fetching tool is higher than that of an application program of the mobile terminal; utilizing the word fetching tool to intercept picture information on the screen according to the gesture of a user when detecting a triggering command of the user; conducting image-to-text identification operation on the picture information to obtain a plurality of characters, conducting word segmentation on the plurality of characters to obtain a plurality of word groups; acquiring a key word list according to the position of the characters in the word groups on the screen and the position of the word fetching tool on the screen during the picture information intercepting and displaying the key word list; receiving a searching word of the user, conducting searching according to key words in the key word list and displaying a searching result for the user. The method improves user experience, increases the page view of the searched pages, and has quickness, high efficiency and usability. The mobile terminal is further disclosed.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Browsing based Chinese input method

A platform to implement a Chinese input method consists of the following components: (1) a Keypad; (2) a cascade Multi-Window; (3) a Sentence Editing Buffer; (4) an Attribute Viewing Window; (5) a Text Accumulation Window; and (6) a two-level phrase set refining control window. The keypads are designed based on the lexical structures of the Zhu-Yin and Pin-Yin phonetic systems. Efficient mouse operations have been designed to enter phonetic symbol strings. The Multi-Window can present a great many candidate words and phrases. It allows a user to browse its multiple pages without mouse clicking. A two-phase sentence generation procedure relieves users from the burden of sentence segmentation and creates the possibility of harvesting system-supplied longer generalized phrases.
Owner:DU MIN WEN +1

Automatic question and answer method and apparatus, and storage medium

Embodiments of the invention disclose an automatic question and answer method and apparatus, and a storage medium. The method comprises the steps of: forming multiple question and answer pairs based on social data in a social platform, wherein the question and answer pairs comprise questions and corresponding answers; establishing reverse indexes of the questions and corresponding word groups; obtaining a retrieval question; determining, according to a question word group of the retrieval question and the reverse indexes, similar questions similar to the retrieval question; obtaining, according to the similar questions and the question and answer pairs, candidate answers of the retrieval question, thereby obtaining a candidate answer set of the retrieval question; and selecting a target answer of the retrieval question from the candidate answer set. According to the scheme, the answer matched with the retrieval question can be output, so that the accuracy and quality of the answers output by a chat robot system are improved.
Owner:TENCENT TECH (SHENZHEN) CO LTD
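
The reverse (inverted) index step above can be sketched in Python. This is a simplified illustration: words are obtained by whitespace splitting, similarity is the number of shared words, and the answer of the single best-matching question is returned. The function names and sample question and answer pairs are hypothetical, and the selection of a target answer from a larger candidate set is not modeled.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Optional, Set, Tuple

QAPair = Tuple[str, str]


def build_index(qa_pairs: List[QAPair]) -> DefaultDict[str, Set[int]]:
    """Map each word to the ids of the stored questions that contain it."""
    index: DefaultDict[str, Set[int]] = defaultdict(set)
    for qid, (question, _answer) in enumerate(qa_pairs):
        for word in set(question.lower().split()):
            index[word].add(qid)
    return index


def answer(query: str, qa_pairs: List[QAPair],
           index: DefaultDict[str, Set[int]]) -> Optional[str]:
    """Return the answer of the stored question sharing the most words with the query."""
    votes: Counter = Counter()
    for word in set(query.lower().split()):
        for qid in index.get(word, ()):  # .get avoids creating empty index entries
            votes[qid] += 1
    if not votes:
        return None
    # Highest overlap wins; ties go to the earliest stored question.
    best = max(votes, key=lambda qid: (votes[qid], -qid))
    return qa_pairs[best][1]


pairs = [
    ("how do i reset my password", "Use the reset link."),
    ("how do i change my avatar", "Open profile settings."),
]
index = build_index(pairs)
print(answer("reset password please", pairs, index))
```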

Augmented-word language model

A language model comprising a plurality of augmented-word n-grams and probabilities corresponding to such n-grams. Each n-gram is comprised of a sequence of augmented words. Each augmented word is comprised of the orthographic representation of the word together with a tag representing lexical information regarding the word, such as syntactic or semantic information. Also disclosed are a method of building such a language model, a method of automatically recognizing speech using the language model and a speech recognition system that employs the language model.
Owner:MICROSOFT TECH LICENSING LLC
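
An augmented-word n-gram model can be illustrated with bigrams over `(word, tag)` pairs. The Python sketch below uses plain maximum-likelihood estimates with no smoothing, part-of-speech tags as the lexical information, and a tiny hand-tagged corpus; all of these are assumptions for illustration rather than details from the patent.

```python
from collections import Counter
from typing import List, Tuple

AugWord = Tuple[str, str]  # (orthographic form, tag)


def train_bigrams(tagged_sentences: List[List[AugWord]]) -> Tuple[Counter, Counter]:
    """Count augmented-word bigrams and the contexts they start from."""
    bigrams: Counter = Counter()
    contexts: Counter = Counter()
    for sentence in tagged_sentences:
        for prev, cur in zip(sentence, sentence[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def prob(prev: AugWord, cur: AugWord, bigrams: Counter, contexts: Counter) -> float:
    """Maximum-likelihood estimate of P(cur | prev) over augmented words."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / contexts[prev]


corpus = [
    [("I", "PRON"), ("record", "VERB"), ("music", "NOUN")],
    [("the", "DET"), ("record", "NOUN"), ("broke", "VERB")],
    [("I", "PRON"), ("record", "VERB"), ("video", "NOUN")],
]
bigrams, contexts = train_bigrams(corpus)
print(prob(("record", "VERB"), ("music", "NOUN"), bigrams, contexts))
print(prob(("record", "NOUN"), ("broke", "VERB"), bigrams, contexts))
```

Because the tag is part of each token, "record" as a verb and "record" as a noun get separate statistics, which is the point of augmenting the orthographic form.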

Information search method and system based on interactive document clustering

The invention provides an information search method and system based on interactive document clustering. The method comprises the following steps: a document set is horizontally partitioned and preprocessed; word frequency statistics are computed, and high-frequency words constitute a characteristic word set; a vector space representation of the documents is generated, the distances between the documents are calculated, and a similarity matrix is generated; a Laplacian matrix is generated, the number of clusters and a representation matrix are determined according to the intervals between eigenvalues of the Laplacian matrix, secondary clustering is conducted, and initial distance results are obtained; users interact with the initial distance results, new characteristic words are mined through chi-square statistics, the vector space is reconstructed, and the clustering process is repeated; finally, clustering results are shown to the users, so that the users obtain different categories of search results. The method and system adopt a semi-supervised learning approach in which the users intervene to cluster and analyze the documents.
Owner:PEKING UNIV

Method for realizing identity discrimination of operating users through recognizing keyboard/mouse input habits of operating users

The invention belongs to the field of computers, and discloses a method for realizing the identity discrimination of operating users through recognizing the keyboard / mouse input habits of the operating users. The method comprises the following steps: defining the input habits such as keystroke pressures, pressing intervals, loosening intervals, associated intervals and combined intervals; when carrying out single-key mapping input through a keyboard, inputting word groups through a keyboard, inputting function keys through a keyboard, inputting combination keys through a keyboard, and carrying out mouse keypad entry and mouse pulley key input, and recording related inputs so as to form a quantitative data storage-recognition data model; and because a difference with a certain rule exists between the recognition data models formed by operating the keyboard and the mouse by each user, when the users carry out input operation, recognizing different users through comparing the input information with the recognition data models. By using the method disclosed by the invention, the effective identity discrimination on the operating users can be realized through recognizing the keyboard / mouse input habits of the operating user; and through carrying out quantization processing on the slight differences of input habits, the recognition level as that of fingerprint recognition can be achieved.
Owner:DATCENT TECH

Crowdsourcing-based novel question answering system

Active | CN104615755A | Q&A is fast and accurate | Special data processing applications | Data feed | Data source
The invention provides a crowdsourcing-based novel question answering system comprising a question answering module, an intelligent answering module, a question answering analysis module, a mediating mode decision module, a data source query module and an optimizing module. The question answering analysis module acquires key word groups according to acquired question answering data; the mediating mode decision module generates a mediating mode according to the key word groups and a preset attribute candidate set by mapping; the data source query module generates data source query statements according to the mediating mode and retrieves entity data from multiple data sources; the optimizing module packages crowdsourcing data into a crowdsourcing task and transmits the crowdsourcing task to the intelligent answering module; the optimizing module generates intelligent answering data according to the crowdsourcing feedback data fed back by the intelligent answering module. The system addresses two technical problems: automatic question answering systems that are limited to semantic analysis are technically difficult and costly, and community question answering systems cannot respond in a timely manner. As a result, questions posed by users are answered quickly and accurately.
Owner:BEIHANG UNIV

Clustering method and system aiming at massive similar short texts

Inactive | CN102184256A | Solve the duplicate detection problem | Special data processing applications | Repeat analysis | Text detection
The invention relates to a clustering method and system aiming at massive similar short texts, belonging to research on repeated short text detection in the field of information technology. Because of the particular features of short texts, the results obtained by applying traditional repeated text analysis methods to them are not satisfactory. By adopting a repeated analysis method based on the main content of the short text and combining related word groups, the invention can detect not only completely repeated texts but also texts with extremely high similarity. The method and system have high processing speed and high efficiency and can better process massive data. With this method, redundant short texts can be removed, the system processing scale can be greatly decreased, and hot short texts can be found to a certain extent. Therefore, the method and system are helpful for finding social hotspots.
Owner:BEIJING UNIV OF POSTS & TELECOMM

Audio stream manipulation for an in-vehicle infotainment system

Systems and methods pertaining to an audio stream manipulation system for manipulating an audio stream for an in-vehicle infotainment system are disclosed. A particular embodiment includes: receiving an audio stream via a subsystem of a vehicle; scanning the audio stream, by use of a data processor, to extract keywords, keyword phrases, or acoustic properties; using the extracted keywords, keyword phrases, or acoustic properties to classify audio segments of the audio stream as content segments, advertising (ad) segments, or functional segments; substituting, by use of the data processor, at least one audio segment of the audio stream with a new audio segment to generate a modified audio stream in real time; and causing the modified audio stream to be rendered for a user.
Owner:LENNY INSURANCE LTD
