37 results about "Document summarization" patented technology

Automated healthcare information composition and query enhancement

Certain embodiments of the present invention provide systems and methods for information composition and query enhancement. Certain embodiments provide an information composition and query enhancement system. The system includes a query generation and enhancement engine generating and conducting a query of one or more data sources based on user input and a data context to produce query results. The system also includes an information composition engine assembling the query results to provide a bundle of documents meaningful to the particular user. The system further includes a document summarization engine clustering and summarizing the bundle of documents to provide a content summary in addition to the bundle of documents for output in a presentation to a user.
Owner:GENERAL ELECTRIC CO

System and method for document collection, grouping and summarization

A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided which includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group documents into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided which generate summaries for specific classes of multiple document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
Owner:THE TRUSTEES OF COLUMBIA UNIV IN THE CITY OF NEW YORK

Method and system for simultaneously abstracting document summarization and key words

The invention relates to a method that extracts the abstract and keywords of a document at the same time, belonging to the field of language and text processing. Existing methods treat abstract extraction and keyword extraction as two unrelated tasks and process them separately, even though the two tasks are of the same nature; this method exploits that shared nature and completes both extractions at once. The method uses a graph-based learning model that combines the relationships between sentences, between sentences and words, and between words in the document to evaluate the importance of the sentences and words accurately, and finally takes the important sentences and words as the abstract and keywords of the document. The method extracts the abstract and keywords simultaneously and achieves better extraction quality for both; it can be widely applied in fields such as text information processing and text mining.
Owner:PEKING UNIV +2

Document summarization

Systems, methods, and other embodiments associated with automatically summarizing a document are described. One method embodiment includes computing term scores for members of a set of terms in a document to be summarized and computing sentence scores for sentences in a set of sentences in the document. The method embodiment also includes computing a set of entries for a term-sentence matrix that relates terms to sentences. The method embodiment also includes computing a dominant topic for the document and simultaneously ranking the set of terms and the set of sentences based on the dominant topic. The method embodiment provides a summarization item(s) selected from the set of terms and / or the set of sentences.
Owner:ORACLE INT CORP
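
The idea of relating terms to sentences through a term-sentence matrix and ranking both at once can be illustrated with a small sketch. The abstract does not say how the dominant topic is computed, so the code below assumes one common approach: mutual reinforcement (power iteration), where a sentence scores highly if it contains high-scoring terms and vice versa. The function name `rank_terms_and_sentences` and the simple lowercase tokenizer are illustrative assumptions, not part of the patent.

```python
import re
from collections import Counter


def rank_terms_and_sentences(sentences, iterations=50):
    """Jointly score terms and sentences by mutual reinforcement.

    Assumption: the "dominant topic" is approximated by the leading
    singular vectors of the term-sentence count matrix, found by
    power iteration. This is a sketch, not the patented method.
    """
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    terms = sorted({t for toks in tokenized for t in toks})
    # Sparse term-sentence matrix: counts[j][term] = frequency in sentence j.
    counts = [Counter(toks) for toks in tokenized]

    term_score = {t: 1.0 for t in terms}
    sent_score = [1.0] * len(sentences)
    for _ in range(iterations):
        # A sentence is important if it contains important terms.
        sent_score = [
            sum(c * term_score[t] for t, c in cnt.items()) for cnt in counts
        ]
        norm = sum(x * x for x in sent_score) ** 0.5 or 1.0
        sent_score = [x / norm for x in sent_score]

        # A term is important if it appears in important sentences.
        new_term = {t: 0.0 for t in terms}
        for cnt, s in zip(counts, sent_score):
            for t, c in cnt.items():
                new_term[t] += c * s
        norm = sum(x * x for x in new_term.values()) ** 0.5 or 1.0
        term_score = {t: x / norm for t, x in new_term.items()}
    return term_score, sent_score


if __name__ == "__main__":
    docs = ["cats chase mice", "cats eat fish", "dogs bark"]
    term_scores, sentence_scores = rank_terms_and_sentences(docs)
    print(max(term_scores, key=term_scores.get))
    print(sentence_scores)
```

Summary items could then be taken from the top of either ranking, which mirrors the abstract's "set of terms and/or set of sentences".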

Topic model based document keyword extraction method and system

The invention discloses a topic model based document keyword extraction method and system. The document keyword extraction method comprises the following steps of document information preprocessing, document structure graph construction, document topic distribution extraction, word weight extraction and keyword generation. The document keyword extraction system comprises the following modules: a document information preprocessing module, a document structure graph construction module, a document topic distribution extraction module, a word weight extraction module and a keyword generation module. According to the method and system, extracted keywords are more reasonable and related to a topic of a document more closely; and partial deficiencies in the keyword extraction field at present are overcome, a better document summarization effect is achieved, and a user can conveniently and quickly know an abstract of the document.
Owner:SOUTH CHINA UNIV OF TECH

Method for summarization of threads in electronic mail

The present invention discloses a pre-processing summarization technique that makes use of knowledge specific to the electronic mail domain to pre-process an electronic mail message so that commercially-available document summarization software can subsequently generate a more useful summary from the message. The summarization technique removes extraneous headers, quoted text, forward information, and electronic signatures, leaving more useful text to be summarized. If an enclosing electronic mail thread exists, the summarization technique uses the electronic mail message's ancestors to provide additional context for summarizing the electronic mail message. The disclosed system can be used with IBM Lotus Notes and Domino infrastructure, along with existing single-document summarizer software, to generate a summary of the discourse activity in an electronic mail thread dynamically. The summary may be further augmented to list any names, dates, and names of companies that are present in the electronic mail message being summarized.
Owner:META PLATFORMS INC
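
The pre-processing step described above (removing headers, quoted text, forward information, and signatures before handing the text to a summarizer) might look roughly like the following sketch. The regular expressions, the use of a "-- " line as the signature delimiter, and the names `clean_email_body` and `with_thread_context` are assumptions made for illustration; the abstract does not specify the actual rules.

```python
import re

# Assumed header fields; real mail clients emit many more.
HEADER_RE = re.compile(r"^(From|To|Cc|Bcc|Subject|Date|Sent):", re.IGNORECASE)
# Assumed forward/reply banners such as "---- Forwarded message ----".
FORWARD_RE = re.compile(r"^-+\s*(forwarded|original) message.*$", re.IGNORECASE)


def clean_email_body(raw):
    """Strip headers, quoted lines, forward banners, and the signature."""
    kept = []
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped == "--":  # conventional signature delimiter ("-- ")
            break
        if stripped.startswith(">"):  # quoted text from earlier messages
            continue
        if HEADER_RE.match(stripped) or FORWARD_RE.match(stripped):
            continue
        kept.append(stripped)
    return "\n".join(line for line in kept if line)


def with_thread_context(message, ancestors):
    """Prepend cleaned ancestor messages as context for the summarizer."""
    parts = [clean_email_body(m) for m in list(ancestors) + [message]]
    return "\n\n".join(p for p in parts if p)


if __name__ == "__main__":
    raw = (
        "From: a@example.com\nSubject: Budget\n\n"
        "Please review the budget.\n> old quoted text\n"
        "---- Forwarded message ----\nThanks for the update.\n"
        "-- \nAlice\nACME Corp"
    )
    print(clean_email_body(raw))
```

A rule this simple would also drop body lines that happen to begin with a header-like prefix such as "Date:", so a fuller implementation would need to separate the header block from the body first.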

Method and device for generating document summarization

Active | CN104503958A | Reduce build time | Improve the efficiency of generating summaries | Special data processing applications | Document summarization | Generation process
The invention provides a method and a device for generating a document summary. The method comprises the following steps: obtaining a document; processing the document using preset features to obtain summary candidate sentences, wherein the preset features comprise one or more of keywords, numbers, sentences within a preset range of a title contained in the document, and subtitles; compressing the summary candidate sentences; and post-processing the compressed candidate sentences to generate the document summary. The summary generated by the method and device of the embodiment is concise and accurate and contains no redundant information; the generation process is simple and requires no manual intervention, so the time for generating the document summary can be greatly shortened and the efficiency of summary generation is improved.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Method and apparatus for summarization of threads in electronic mail

The present invention discloses a pre-processing summarization technique that makes use of knowledge specific to the electronic mail domain to pre-process an electronic mail message so that commercially-available document summarization software can subsequently generate a more useful summary from the message. The summarization technique removes extraneous headers, quoted text, forward information, and electronic signatures, leaving more useful text to be summarized. If an enclosing electronic mail thread exists, the summarization technique uses the electronic mail message's ancestors to provide additional context for summarizing the electronic mail message. The disclosed system can be used with IBM Lotus Notes and Domino infrastructure, along with existing single-document summarizer software, to generate a summary of the discourse activity in an electronic mail thread dynamically. The summary may be further augmented to list any names, dates, and names of companies that are present in the electronic mail message being summarized.
Owner:META PLATFORMS INC

Method and apparatus for highlighting diverse aspects in a document

The disclosure generally relates to document summarization. Given a document, summarization can be defined as picking k sentences from the original document D such that the constructed summary exhibits two key properties: coverage and orthogonality. In one embodiment of the disclosure, the two requirements are captured in a combinatorial formulation of the problem and presented as an algorithm.
Owner:IBM CORP
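
Picking k sentences for coverage and orthogonality can be illustrated with a greedy sketch. The abstract does not give the combinatorial formulation, so the code below assumes a simple stand-in: each step picks the sentence that adds the most words not yet covered, which rewards coverage and, because overlapping words add nothing, discourages redundancy. The name `pick_k_sentences` and the word-set representation are illustrative assumptions.

```python
import re


def pick_k_sentences(sentences, k):
    """Greedily choose up to k sentences that cover the most new words.

    Assumption: coverage = number of distinct words covered, and
    orthogonality is encouraged by counting only words that are new.
    """
    word_sets = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    covered, chosen = set(), []
    for _ in range(min(k, len(sentences))):
        best, best_gain = None, -1
        for i, words in enumerate(word_sets):
            if i in chosen:
                continue
            gain = len(words - covered)  # words this sentence would add
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        covered |= word_sets[best]
    # Return the picks in their original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat sat",
        "dogs bark loudly at night",
    ]
    print(pick_k_sentences(doc, 2))
```

In the sample document the second sentence is never chosen for k = 2 because every word in it is already covered by the first.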

Document summarization based on topicality and specificity

Topicality scores are determined for a number of phrasal expressions in documents. Phrasal expressions may be noun phrases, with or without corresponding prepositional phrases, subject-verb pairs, and verb-object pairs. The documents describe some topic or multiple topics. Techniques can be used to determine how the phrasal expression compares with the topic or topics being described in the documents. Specificities are determined for the phrasal expressions. Techniques may be used to determine whether phrasal expressions are more or less specific than other phrasal expressions. An order is determined for the phrasal expressions by using the topicality scores and the specificities. The order may be represented as a phrasal expression tree, for example. The phrasal expression tree may be displayed to a user, and the user can navigate through the phrasal expression tree, and therefore through the one or more documents.
Owner:NUANCE COMM INC

Document summarization based on topicality and specificity

Topicality scores are determined for a number of phrasal expressions in documents. Phrasal expressions may be noun phrases, with or without corresponding prepositional phrases, subject-verb pairs, and verb-object pairs. The documents describe some topic or multiple topics. Techniques can be used to determine how the phrasal expression compares with the topic or topics being described in the documents. Specificities are determined for the phrasal expressions. Techniques may be used to determine whether phrasal expressions are more or less specific than other phrasal expressions. An order is determined for the phrasal expressions by using the topicality scores and the specificities. The order may be represented as a phrasal expression tree, for example. The phrasal expression tree may be displayed to a user, and the user can navigate through the phrasal expression tree, and therefore through the one or more documents.
Owner:MICROSOFT TECH LICENSING LLC

System for summarization of threads in electronic mail

The present invention discloses a pre-processing summarization technique that makes use of knowledge specific to the electronic mail domain to pre-process an electronic mail message so that commercially-available document summarization software can subsequently generate a more useful summary from the message. The summarization technique removes extraneous headers, quoted text, forward information, and electronic signatures, leaving more useful text to be summarized. If an enclosing electronic mail thread exists, the summarization technique uses the electronic mail message's ancestors to provide additional context for summarizing the electronic mail message. The disclosed system can be used with IBM Lotus Notes and Domino infrastructure, along with existing single-document summarizer software, to generate a summary of the discourse activity in an electronic mail thread dynamically. The summary may be further augmented to list any names, dates, and names of companies that are present in the electronic mail message being summarized.
Owner:META PLATFORMS INC

Encrypt data searching method based on 5g communication standard

The invention provides an encrypted data search method based on the 5G communication standard that reduces the cost of communication between a user client and a server. The method comprises: encrypting a forward index on a data provider client to obtain a forward index file, and encrypting the document plaintext line by line to obtain a ciphertext set; creating, on the server, an inverted index table in ciphertext form from the forward index file; receiving on the server a query trapdoor uploaded by the user, the query trapdoor comprising a keyword trapdoor and a document number decryption key; judging whether the inverted index table includes an index entry consistent with the keyword trapdoor; and, if so, decrypting the document number ciphertext with the document number decryption key, sorting the documents in the ciphertext set in order of decreasing correlation between the keyword and the document, and, using the keyword position information, returning to the user client the document summarization ciphertexts of the k documents with the highest keyword correlation together with the corresponding document number ciphertexts. The method is suitable for the technical field of information security.
Owner:UNIV OF SCI & TECH BEIJING
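
The lookup flow in this abstract (match a keyword trapdoor against an inverted index, then return the top-k documents by relevance) can be sketched in a heavily simplified form. The code below assumes the trapdoor is an HMAC-SHA256 of the keyword under a shared secret key and that relevance is plain term frequency; document numbers are left unencrypted for brevity, whereas the described method keeps them as ciphertext. The names `trapdoor`, `build_index`, and `search` are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def trapdoor(key, keyword):
    """Keyed hash of a keyword; the server never sees the plaintext word."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).hexdigest()


def build_index(key, docs):
    """Build {trapdoor: {doc_id: term frequency}} from {doc_id: text}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            t = trapdoor(key, word)
            index[t][doc_id] = index[t].get(doc_id, 0) + 1
    return index


def search(index, query_trapdoor, k):
    """Return the ids of the k documents most relevant to the trapdoor."""
    postings = index.get(query_trapdoor, {})
    # Sort by decreasing term frequency, breaking ties by document id.
    ranked = sorted(postings.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked[:k]]


if __name__ == "__main__":
    secret = b"demo-key"  # illustration only; never hard-code real keys
    corpus = {
        "d1": "summary of summary methods",
        "d2": "summary only",
        "d3": "unrelated text",
    }
    idx = build_index(secret, corpus)
    print(search(idx, trapdoor(secret, "summary"), 2))
```

Because only the top k matches are returned rather than every matching document, the response sent back to the client stays small, which is the communication saving the abstract refers to.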

Database script management method and device, computer equipment and storage medium

The invention discloses a database script management method and device, computer equipment and a storage medium. The method comprises the steps of obtaining a current DB script sent by a service system, detecting a script name of the current DB script, and extracting a version number, a service code, a database instance and a script type corresponding to the current DB script; maintaining the current DB script based on the service code, and determining a target DB script and a target execution sequence; based on the target execution sequence, executing the target DB script in a target database corresponding to the database instance, and obtaining a script execution result; and if the script type is the DDL script, obtaining a target interface document corresponding to the DDL script, performing associated document summarization based on the target interface document, obtaining a summarized interface document, processing the target DB script based on the summarized interface document and the script execution result, and obtaining a version sealing processing result. According to the method, the target DB script management and test cost can be saved, the interface document is synchronously updated, and the accuracy of the execution process is guaranteed.
Owner:ONE CONNECT SMART TECH CO LTD SHENZHEN

Multiple-document summarization using document clustering

The invention relates to multiple-document summarization using document clustering. Systems and methods are disclosed for summarizing multiple documents by generating a model of the documents as a mixture of document clusters, each document in turn having a mixture of sentences, wherein the model simultaneously represents summarization information and document cluster structure; and determining a loss function for evaluating the model and optimizing the model.
Owner:NEC LAB AMERICA

System, method, and user interface for a search engine based on multi-document summarization

A method for searching multiple documents on a computer system includes steps for sending a query to a system core where the query is passed to a search component for searching the documents. The system core in turn receives results from the search component indicating related documents to the query and passes to a summarization component a specified number of the results. The summarization component processes related documents corresponding to the specified number of results and removes duplicate results to produce a multi-document summary. The system core receives the summary from the summarization component. The multi-document summary is received from the system core.
Owner:SOUBBOTIN DMITRI

Guarding system for emergency management work

The invention discloses a guarding system for emergency management work. The guarding system comprises an information processing module, a guarding communication module, an intelligent scheduling module, an official document review module, a data library module, a guarding big data module, a disposal evaluation module, a process simulation module, an instant communication module and a system management module. The information processing module outputs the information abstract to the official document summarization module for quickly editing daily duty logs. The information processing module outputs the event information to the on-duty communication module for information association. The information processing module and the on-duty communication module output the processing timeliness to the processing evaluation module for evaluating the timeliness of event processing. According to the invention, a rapid and efficient information management system is constructed, intra-system integration of various event information is realized, the system automatically associates related information, the event information is displayed in an event time stream form, an event development process is taken as a main line, the event information is connected in series, and the availability of the information is improved.
Owner:中电科新型智慧城市研究院有限公司

An Evolutionary Summary Generation Method for Internet News Events

The invention relates to an evolutionary summarization generation method for internet news events. The evolutionary summarization generation method includes the steps: inputting a related news document set; representing documents as topic feature vectors by an LDA (latent Dirichlet allocation) topic model; clustering the documents represented as the topic feature vectors; calculating local scores of the documents in each topic; calculating global scores of the documents in each topic; calculating final scores of the documents in each topic; extracting document titles with high scores from each topic to serve as a summary according to time sequence; outputting the summary. The dimensions of the topic feature vectors are first preset values, and each cluster represents one topic. According to the evolutionary summarization generation method for the internet news events, the extracted summary has dynamic evolvability and is coherent and strong in readability, and experimental results indicate that the system is greatly improved in terms of redundancy, coherence and dynamic evolvability as compared with a traditional multi-document summarization system.
Owner:SUZHOU UNIV

Method and device for generating document summaries

Active | CN104503958B | Reduce build time | Improve the efficiency of generating summaries | Special data processing applications | Generation process | Document summarization
The invention provides a method and a device for generating a document summary. The method comprises the following steps: obtaining a document; processing the document using preset features to obtain summary candidate sentences, wherein the preset features comprise one or more of keywords, numbers, sentences within a preset range of a title contained in the document, and subtitles; compressing the summary candidate sentences; and post-processing the compressed candidate sentences to generate the document summary. The summary generated by the method and device of the embodiment is concise and accurate and contains no redundant information; the generation process is simple and requires no manual intervention, so the time for generating the document summary can be greatly shortened and the efficiency of summary generation is improved.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

A method for automatic extraction of document summaries based on word vectors

Provided is an automatic document summarization extraction method based on term vectors. The method includes the steps that S1, a deep neural network model is used for training linguistic data to obtain term vector representation of feature terms; S2, a sentence graph model is constructed; S3, the weights of sentences are calculated; S4, a maximum marginal relevance algorithm is used for generating a summarization. According to the method, a linguistic data set is collected and preprocessed to obtain a training feature linguistic data set, the deep neural network model is used for training the constructed training feature linguistic data set to obtain the term vectors of the feature terms, a candidate document set and a candidate sentence set are obtained from the linguistic data set through preset search terms, the semantic similarity between the sentences is obtained according to the term vectors of the feature terms, and then the semantic relation between every two sentences is obtained. The problem that in a traditional calculation method based on term co-occurrence, calculation errors are caused under the condition that semantic meaning is identical but terms are different is avoided, and therefore the accuracy of similarity calculation and the performance of the summarization are improved.
Owner:DALIAN UNIV OF TECH
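
Step S4, maximum marginal relevance (MMR) selection, can be sketched as follows. The code assumes that sentence vectors (for example, averaged term vectors from S1) and sentence weights (from S3) have already been computed; how the patent derives them is not shown here. The trade-off parameter `lam` and the function names `cosine` and `mmr_select` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mmr_select(vectors, weights, k, lam=0.7):
    """Pick up to k sentence indices by maximum marginal relevance.

    Each step maximizes lam * weight - (1 - lam) * (highest similarity
    to any already selected sentence), trading relevance for novelty.
    """
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:

        def mmr(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * weights[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    sentence_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    sentence_weights = [1.0, 0.9, 0.5]
    print(mmr_select(sentence_vectors, sentence_weights, k=2, lam=0.5))
```

With `lam=1.0` the selection reduces to a plain ranking by sentence weight; lower values increasingly favor sentences that differ from those already chosen.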