122 results about "Document level" patented technology

Document level, sometimes also referred to as code-behind, describes a customization that consists of a single assembly associated with a single workbook, document, or template. The code resides in an assembly that is then linked to the particular Office file.

Determining ad targeting information and/or ad creative information using past search queries

Inactive | US20050222901A1 | Web data indexing | Unstructured textual data retrieval | Domain level | Paper document

Ad information, such as ad targeting keywords and/or ad creative content, may be determined using aggregated selected document-to-query information associations. For example, popular terms and/or phrases also associated with a selected document may be used as ad targeting keywords and/or ad creative content for an ad having the document as a landing page. Query information may be tracked on a per document level, a per domain level, etc. The determined ad information may be used to automatically populate an ad record, or may be provided to an advertiser as suggested or recommended ad information.
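As a rough illustration of the per-document aggregation described in this abstract, the sketch below counts query terms that led to a given landing page and returns the most frequent ones as candidate targeting keywords. The log format, the function name `suggest_keywords`, and the simple whitespace tokenization are assumptions made for illustration; the patent does not specify them.

```python
from collections import Counter

def suggest_keywords(click_log, landing_page, top_n=3):
    """Return the most frequent query terms associated with one document.

    click_log is a hypothetical list of (query, selected_document) pairs.
    """
    counts = Counter()
    for query, document in click_log:
        if document == landing_page:
            # Aggregate query terms at the per-document level.
            counts.update(query.lower().split())
    return [term for term, _ in counts.most_common(top_n)]

# Hypothetical query log: which document was selected for which query.
click_log = [
    ("cheap red shoes", "a.example/shoes"),
    ("red shoes sale", "a.example/shoes"),
    ("blue hats", "a.example/hats"),
    ("red sneakers", "a.example/shoes"),
]
keywords = suggest_keywords(click_log, "a.example/shoes", top_n=2)
```

Tracking at a per-domain level would only change the grouping key, for example by comparing the host part of the selected document instead of the full address.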

Owner:GOOGLE LLC

Method and system for semantic search and retrieval of electronic documents

Inactive | US20060235843A1 | Reduce inclusions | Improve ranking | Web data indexing | Special data processing applications | Electronic document | Pattern matching

A system and method for semantic search of electronic documents stored on computer readable media, providing a search result in response to a query. The system includes a corpus including a plurality of electronic documents that are domain tagged at a document level and analyzed based on the tags to identify word usage patterns. An index of word usage patterns is provided that indexes the plurality of documents in the corpus according to their word usage patterns. The system also includes a query pre-processing module that receives a query from a user, and analyzes the query to determine probable word usage patterns in the query. The system further includes a processor that uses the index to identify documents having word usage patterns that match the probable word usage patterns in the query as candidate electronic documents, and retrieves the candidate electronic documents.

Owner:TEXTDIGGER

Web based data collaboration tool

Inactive | US20060117247A1 | Digital data protection | Office automation | Data collaboration | Configfs

A web based data collaboration tool includes a dynamic international collaborative environment in which system partners, including customers, technology partners and suppliers, can exchange information between one another in a truly collaborative environment. The web based data collaboration tool includes unique “fine grain” security at both the document and sub-document level. This allows one source document to be shared between the system partners, including partners from different companies and those located in different countries, based upon an individual document / sub-document security profile. Further, the web based data collaboration tool includes a secure “Sandbox” for peer-to-peer sharing of sensitive documents and electronically incorporates a business area export representative (BAER) approval process that includes the required retention of International Traffic in Arms Regulations (ITAR) documents making the web based data collaboration tool fully ITAR compliant.

Owner:UNITED TECH CORP

Comment text emotion classification model training and emotion classification method and device and equipment

Active | CN108363753A | Achieving Context Semantic Robust Awareness | Realize semantic expression | Semantic analysis | Special data processing applications | Classification methods | Network model

The invention discloses a comment text emotion classification model training and emotion classification method and device and equipment and belongs to the field of text emotion classification in natural language processing. Model training comprises the steps that a comment text and associated subject and object information are acquired; a comment subject and object attention mechanism is fused based on a first-layer Bi-LSTM network to extract sentence-level feature representation; the comment subject and object attention mechanism is fused based on a second-layer Bi-LSTM network to extract document-level feature representation; and a hyperbolic tangent non-linear mapping function is adopted to map document-level features to an emotion category space, softmax classification is adopted to train parameters in a model, and an optimal text emotion classification model is obtained. According to the method, the hierarchical bidirectional Bi-LSTM network model and the attention mechanism are adopted, context semantic robust perception and semantic expression of the text can be realized, the robustness of text emotion classification can be remarkably improved, and the correct rate of classification is increased.

Owner:NANJING UNIV OF POSTS & TELECOMM

Data document generator

Active | US20030014447A1 | Natural language data processing | Special data processing applications | Original data | Data file

A data management system for generating customized versions of data documents. Initially the document is stored in the form of raw data, which is subsequently parsed into an internal representation of the document. In one embodiment, raw data is stored in XML form and is parsed by an XML parser. Upon the initial request for a customized version of the document, a sequence of transforms is applied to the internal representation and to subsequently transformed documents in order to create hierarchical, customized document levels. In one embodiment, transforms are implemented as XSL stylesheets, although Java classes may also be employed. The document versions are written to cache, and subsequent requests for existing versions of the document are referred to cache. In the event that any document dependencies change, a cached version will be denoted invalid, and subsequent requests will result in the re-generation of a customized version. The data management system is implemented in the form of a document manager and a database that includes a document table and a transform table. The document manager reads raw documents from a raw-document database and reads transforms from a transform database. Requested customized documents are written to cache.
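The transform-and-cache behaviour in this abstract can be sketched roughly as follows. This is a minimal illustration only: the class name `DocumentManager`, the in-memory dictionaries standing in for the raw-document and transform databases, and the use of plain Python callables in place of XSL stylesheets are all assumptions, not details from the patent.

```python
class DocumentManager:
    """Applies a sequence of transforms to a raw document and caches results."""

    def __init__(self, raw_documents, transforms):
        self.raw_documents = raw_documents  # hypothetical store: doc_id -> raw text
        self.transforms = transforms        # hypothetical store: name -> callable
        self.cache = {}

    def get(self, doc_id, transform_names):
        key = (doc_id, tuple(transform_names))
        if key not in self.cache:
            # First request for this version: apply each transform in order.
            document = self.raw_documents[doc_id]
            for name in transform_names:
                document = self.transforms[name](document)
            self.cache[key] = document
        return self.cache[key]

    def invalidate(self, doc_id):
        # A dependency changed: drop every cached version of this document.
        for key in [k for k in self.cache if k[0] == doc_id]:
            del self.cache[key]


manager = DocumentManager(
    raw_documents={"price-list": "widget: 10"},
    transforms={"upper": str.upper, "banner": lambda d: "== " + d + " =="},
)
customized = manager.get("price-list", ["upper", "banner"])
```

After `manager.invalidate("price-list")`, the next `get` call regenerates the customized version from the current raw data instead of returning the stale cached copy.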

Owner:VERSATA SOFTWARE INC

Secure and granular index for information retrieval

Inactive | US20070255698A1 | Digital data information retrieval | Digital data processing details | Internet privacy | Paper document

A method and apparatus for a secure and granular index for information is described herein. According to one embodiment of the invention a computer-implemented method is described including evaluating a user query against a set of documents having sub-document level security control, determining a security access for said user, and providing a result for the user query based on the security access for the user and the sub-document level security control.

Owner:SAWTEETH

Secure search of private documents in an enterprise content management system

Inactive | US20090106271A1 | Efficient use of | Efficient retrieval | Text database querying | Special data processing applications | Document analysis | Enterprise content management

An enterprise content management system such as an electronic contract system manages a large number of secure documents for many organizations. The search of these private documents for different organizational users with role-based access control is a challenging task. A content-based extensible mark-up language (XML)-annotated secure-index search mechanism is provided that provides an effective search and retrieval of private documents with document-level security. The search mechanism includes a document analysis framework for text analysis and annotation, a search indexer to build and incorporate document access control information directly into a search index, an XML-based search engine, and a compound query generation technique to join user role and organization information into search query. By incorporating document access information directly into the search index and combining user information in the search query, search and retrieval of private contract documents can be achieved very effectively and securely with high performance.
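The idea of storing access control information in the search index and joining user role and organization into the query can be sketched as below. This is an assumption-laden toy: the `Posting` record, the dictionary-based document format, and the single-term `search` function are hypothetical stand-ins, and the XML annotation and text analysis stages from the abstract are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Posting:
    doc_id: str
    organization: str
    allowed_roles: frozenset

def build_index(documents):
    """Build an inverted index whose postings carry access control data."""
    index = defaultdict(list)
    for doc in documents:
        posting = Posting(doc["id"], doc["organization"], frozenset(doc["roles"]))
        for term in set(doc["text"].lower().split()):
            index[term].append(posting)
    return index

def search(index, term, role, organization):
    """Compound query: the term must match AND the user must have access."""
    return sorted(
        p.doc_id
        for p in index.get(term.lower(), [])
        if p.organization == organization and role in p.allowed_roles
    )

documents = [
    {"id": "c1", "organization": "acme", "roles": ["legal"], "text": "supply contract"},
    {"id": "c2", "organization": "acme", "roles": ["legal", "sales"], "text": "sales contract"},
    {"id": "c3", "organization": "globex", "roles": ["legal"], "text": "lease contract"},
]
index = build_index(documents)
visible = search(index, "contract", role="sales", organization="acme")
```

Because the access check is part of the index lookup, documents the user cannot see never enter the result set, so no separate post-filtering pass is needed.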

Owner:IBM CORP

Document processing apparatus and a method for controlling a document processing apparatus

Inactive | US20080024802A1 | Easy to monitor | Inhibit output | Digitally marking record carriers | Digital computer details | Document preparation | Documentation

The present invention enables page-level or document-level print setup of an XPS document via a user interface. The print ticket of a page of interest is obtained by merging a job-level print ticket 1804, to which the page of interest belongs, a document-level print ticket 1805, to which the page of interest belongs, and a page-level print ticket 1806 of the page of interest. The obtained individual page print tickets are compared with the job-level print ticket and if there are differences, it is determined that this particular page has exception settings and the exception settings are saved and displayed.
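The merge-and-compare step in this abstract can be illustrated with a small sketch. Print tickets are modelled here as flat dictionaries of setting names to values, and a more specific level is assumed to override a less specific one; both the data model and the function names are illustrative assumptions rather than the patented implementation.

```python
def effective_page_ticket(job_ticket, document_ticket, page_ticket):
    """Merge job-, document-, and page-level tickets for one page.

    Assumption: page settings override document settings, which override
    job settings.
    """
    merged = dict(job_ticket)
    merged.update(document_ticket)
    merged.update(page_ticket)
    return merged

def exception_settings(job_ticket, page_effective_ticket):
    """Return the settings where this page differs from the job-level ticket."""
    return {
        name: value
        for name, value in page_effective_ticket.items()
        if job_ticket.get(name) != value
    }

job_ticket = {"duplex": "on", "color": "mono", "paper": "A4"}
document_ticket = {"color": "color"}
page_ticket = {"paper": "A3"}

effective = effective_page_ticket(job_ticket, document_ticket, page_ticket)
exceptions = exception_settings(job_ticket, effective)
```

A page whose `exceptions` dictionary is non-empty is one that would be flagged as having exception settings to save and display.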

Owner:CANON KK

Composition scoring method based on attention mechanism

Active | CN107133211A | Semantic analysis | Neural learning methods | Pattern recognition | Score method

The invention provides a composition scoring method based on an attention mechanism; the method comprises: using a neural network attention frame of word-sentence-document trilayer structure in a composition scoring system, using artificially extracted features to perform fusing in sentence and document levels of the frame, and setting attention weights of the sentence and document levels. The influences of factors, such as local features and global features of a language, completeness of sentences, accuracy of word usage, diversity of expressions, coherency of sentences and digressing or zero digressing from the subject, upon a scoring task are comprehensively considered herein, and composition scoring effect is maximally improved.

Owner:RENMIN UNIVERSITY OF CHINA +2

Knowledge discovery from citation networks

Active | US8630975B1 | Improve document organization | Efficient search | Digital data processing details | Probabilistic networks | Generative process | Documentation

In a corpus of scientific articles such as a digital library, documents are connected by citations and one document plays two different roles in the corpus: document itself and a citation of other documents. A Bernoulli Process Topic (BPT) model is provided which models the corpus at two levels: document level and citation level. In the BPT model, each document has two different representations in the latent topic space associated with its roles. Moreover, the multi-level hierarchical structure of the citation network is captured by a generative process involving a Bernoulli process. The distribution parameters of the BPT model are estimated by a variational approximation approach.

Owner:THE RES FOUND OF STATE UNIV OF NEW YORK

Event trigger word extraction method based on document level attention mechanism

Active | CN108829801A | Easy to identify | Realize identification | Neural architectures | Special data processing applications | Attention model | Algorithm

The invention relates to an event trigger word extraction method, in particular to an event trigger word extraction method based on a document level attention mechanism, comprising the following steps: (1) preprocessing training corpus; (2) performing word vector training by using PubMed database corpus; (3) constructing a distributed representation way of a sample; (4) constructing a characteristic representation way based on BiLSTM-Attention; (5) adopting CRF learning, and acquiring an optimal sequence labeling result of the current document sequence; and (6) extracting event trigger words. The method provided by the invention has the following advantages: firstly, a BIO tag labeling way is adopted, and recognition including multi-word trigger word recognition is realized; secondly, a corresponding simple word and characteristic distributed representation way is constructed for a trigger word recognition task; and thirdly, a BiLSTM-Attention model is proposed, a distributed representation structure specific to the currently input document level information is realized by introducing an Attention mechanism, and trigger word recognition effect is improved.

Owner:DALIAN UNIV OF TECH

Document level indexes for efficient processing in multiple tiers of a computer system

Active | US20070016604A1 | Digital data processing details | Semi-structured data indexing | Database server | Paper document

To improve the performance of XML operations performed on an XML document by a client tier, the client generates an index that indexes the nodes of the XML document. The index may be generated, for example, by and during parsing of the XML document. The index contains similar structures to those maintained by a database server to perform XML operations on collections of XML documents. In lieu of parsing the XML document to generate an index, the client may generate indexes based on data retrieved from the indexes at the database server.

Owner:ORACLE INT CORP

Attention dual-layer LSTM-based long text emotional tendency analysis method

Inactive | CN108446275A | Improve the accuracy of sentiment classification | Avoiding the pitfalls of RNNs | Semantic analysis | Character and pattern recognition | Semantics | Documentation

The invention relates to an attention dual-layer LSTM-based long text emotional tendency analysis method, belongs to the field of natural language processing and machine learning, and mainly aims to solve the problem of difficulty in accurately judging an emotional tendency of a full text due to long comment length of the long text, discrete distribution of positive and negative emotional features and different emotional semantic contribution degrees of sentences. The method comprises the steps of firstly learning sentence-level emotional vector representation by utilizing LSTM; secondly coding semantic relationships between emotional semantics of all the sentences in a document and the sentences by adopting bidirectional LSTM, and based on an attention mechanism, performing weight allocation on the sentences with different emotional semantic contribution degrees; and finally, weighting the sentence-level emotional vector representation to obtain document-level emotional vector representation of the long text, and through a Softmax layer, obtaining the emotional tendency of the long text. An experiment is performed in Yelp2015 and IMDb film comment corpora; and a result shows that a relatively good classification effect can be achieved, so that the emotional classification correctness is further improved.
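The final weighting step, where sentence-level vectors are combined into a document-level vector by attention weights, can be sketched in plain Python as below. The LSTM encoders are not reproduced: the sentence vectors are taken as given, and the `context` vector used to score each sentence is a hypothetical stand-in for learned attention parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def document_vector(sentence_vectors, context):
    """Weight sentence vectors by attention and sum them into one vector."""
    # Score each sentence by its dot product with the context vector.
    scores = [sum(a * b for a, b in zip(vec, context)) for vec in sentence_vectors]
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    doc = [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vectors))
        for i in range(dim)
    ]
    return doc, weights

# Hypothetical sentence-level vectors for a three-sentence review.
sentences = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
doc, weights = document_vector(sentences, context=[1.0, 0.0])
```

Sentences that align more strongly with the context vector receive larger weights, so they contribute more to the document-level representation that a classifier layer would consume.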

Owner:BEIJING INSTITUTE OF TECHNOLOGY

Data caching system and data caching method

Active | CN103488581A | Easy to control | Implement automatic updates | Memory addressing/allocation/relocation | Special data processing applications | Operational system | Data store

The invention provides a data caching system and a data caching method, wherein the data caching system comprises a cache dividing unit and a storage unit; the cache dividing unit is used for querying the type of the data to be stored, setting a corresponding logo for each type of data and respectively dividing a corresponding caching area for each type of data; the storage unit is used for storing the data in each caching area into the memory of an operating system and storing the metadata, logo and/or the attribution information, which correspond to the data respectively in each caching area, into the memory of a virtual machine. By the technical scheme disclosed by the invention, a KEY set and the actual data can be stored separately when caching with a large memory, the version control on caching data can be realized, audio monitors with a document level can be automatically updated, and the monitoring strength can be improved.

Owner:YONYOU NETWORK TECH

Artificial intelligence-based multi-label classification method and system of multi-level text

Active | CN108073677A | Fit closely | Improve scalability | Character and pattern recognition | Special data processing applications | Extensibility | Multi-label classification

The invention relates to an artificial intelligence-based multi-label classification method and system of multi-level text. The method includes: 1) utilizing a neural network to construct a multi-label classification model of the multi-level text, and obtaining text class prediction results of training text according to the model; 2) carrying out learning on parameters of the multi-label classification model of the multi-level text according to existing text class labeling information in the training text and the text class prediction results, which are of the training text and are obtained in step 1), to obtain a multi-label classification model of the multi-level text with determined parameters; and 3) utilizing the multi-label classification model of the multi-level text with the determined parameters to classify to-be-classified text. The method infers labels of the formed text simply through the document-level labeling information, and can be well applied to scenes where labels of formed text are difficult to collect; compared with traditional multi-instance learning (MIL) methods, the method of the invention introduces minimal assumptions, and can better fit actual data; and the method of the invention has good scalability.

Owner:INST OF INFORMATION ENG CHINESE ACAD OF SCI

Document-level sentiment analysis method based on specific domain sentiment words

Active | CN108804417A | Make up for the lack of domain specific words | Versatility | Semantic analysis | Character and pattern recognition | Data set | Algorithm

The invention provides a document-level sentiment analysis method based on specific domain sentiment words. The method is implemented by the following steps of collecting a document data set, training a set of prototype words by using a Skip-gram word vector model to obtain a word vector corresponding to each prototype word, recombining the word vectors by utilizing an attention mechanism, and capturing a relation between non-continuous words in the word vectors; synthesizing the words and sentences by using an asymmetric convolutional neural network and a bidirectional gate recurrent neural network based on the attention mechanism respectively, thereby forming document vector characteristics; generating sentiment eigenvectors by utilizing a domain sentiment dictionary of the Skip-gram word vector model; and finally, combining the document vector characteristics and the sentiment eigenvectors by utilizing a linear combination layer to form document characteristics beneficial to document classification. The sentiment analysis is widely applied to the product analysis, the commodity recommendation, the stock price trend prediction and the like; and the method provided by the invention can accurately and efficiently carry out sentiment analysis on documents, and has great commercial values.

Owner:SHANDONG UNIV OF SCI & TECH

Methods and systems for providing automated actions on recognized text strings in a computer-generated document

Inactive | US20050182617A1 | Light protection screens | Character and pattern recognition | User input | Paper document

Methods and systems provide for automatically performing actions on or in association with text or data strings that are recognized as belonging to certain semantic categories. Text entered by a user is passed to a recognizer application. If a given text or data string is recognized as belonging to a given semantic category, the recognizer application passes data corresponding to the recognized string back to a host application. In response to recognized text or data, a pointer to the object model of the host application may be passed to the recognizer application to allow the recognizer application to perform any function of the host application in response to the recognized string. Alternatively, after the recognizer application passes data corresponding to the recognized string back to the host application, the host application may fire an application level or document level event for causing an action component to perform desired actions on recognized strings. Alternatively, after a string is recognized by the recognizer application, the recognizer application may set a property associated with a desired action to be performed on or in association with the recognized string. The host application may call an action component identified by the property for automatically performing the desired action on or in association with the recognized string.

Owner:MICROSOFT TECH LICENSING LLC

System and method for multithreaded text indexing for next generation multi-core architectures

Inactive | US20110252033A1 | Cutting synchronization | Digital data information retrieval | Digital data processing details | Array data structure | Documentation

A system and method for indexing documents in a data storage system includes generating a single document hash table in storage memory for a single document using an index construction in a multithreaded and scalable configuration wherein multiple threads are each assigned work to reduce synchronization between threads. The single document hash table includes partitioning the single document and indexing strings of partitioned portions of the single document to create a minor hash table for each document sub-part; generating a document level hash table from the minor hash tables; updating a stream level hash table for the strings which maps every string to a global identifier; and generating a term reordered array from the document level hash table.
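A loose sketch of the staged indexing in this abstract follows: the document is partitioned, each part is indexed by its own worker into a minor table, the minor tables are merged into a document-level table, and a stream-level table assigns global identifiers. Using `Counter` objects as hash tables, term counts as the table payload, and a list of (global id, count) pairs sorted by id as the "term reordered array" are all interpretive assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def index_part(tokens):
    """Build the minor hash table (term -> count) for one document sub-part."""
    return Counter(tokens)

def index_document(tokens, parts=4):
    """Index sub-parts independently, then merge into a document-level table."""
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    # Each worker gets its own chunk, so no locking is needed while counting.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        minor_tables = list(pool.map(index_part, chunks))
    document_table = Counter()
    for table in minor_tables:
        document_table.update(table)
    return document_table

def update_stream_table(stream_table, document_table):
    """Map every new string to the next free global identifier."""
    for term in document_table:
        stream_table.setdefault(term, len(stream_table))

def term_reordered_array(stream_table, document_table):
    """List (global id, count) pairs ordered by global identifier."""
    return sorted((stream_table[term], count) for term, count in document_table.items())

stream_table = {}
document_table = index_document("a b a c b a".split(), parts=2)
update_stream_table(stream_table, document_table)
reordered = term_reordered_array(stream_table, document_table)
```

The only shared structure is the stream-level table, which is updated once per document after the per-part work has finished; that is one simple way to keep synchronization between threads low.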

Owner:IBM CORP

Secure and granular index for information retrieval

Inactive | US7874013B2 | Digital data information retrieval | Digital data processing details | Internet privacy | Documentation

A method and apparatus for a secure and granular index for information is described herein. According to one embodiment of the invention a computer-implemented method is described including evaluating a user query against a set of documents having sub-document level security control, determining a security access for said user, and providing a result for the user query based on the security access for the user and the sub-document level security control.

Owner:SAWTEETH

Document processing apparatus and a method for controlling a document processing apparatus

Inactive | US7916332B2 | Easy to monitor | Inhibit output | Digitally marking record carriers | Multiple digital computer combinations | Document preparation | Documentation

The present invention enables page-level or document-level print setup of an XPS document via a user interface. The print ticket of a page of interest is obtained by merging a job-level print ticket 1804, to which the page of interest belongs, a document-level print ticket 1805, to which the page of interest belongs, and a page-level print ticket 1806 of the page of interest. The obtained individual page print tickets are compared with the job-level print ticket and if there are differences, it is determined that this particular page has exception settings and the exception settings are saved and displayed.

Owner:CANON KK

Cryptograph index structure based on blocking organization and management method thereof

Inactive | CN101655858A | Efficient security structure | Efficient security maintenance mechanism | Digital data protection | Special data processing applications | Internal memory | Timestamp

The invention discloses a cryptograph index structure based on a blocking organization and a management method thereof. For the blocking cryptograph index structure, a combination-based index establishing mode is first adopted to establish a plain text index, and the plain text index is then blocked and encrypted uniformly. A maintenance mechanism based on the cryptograph index is divided into the addition, the deletion and the modification of a document in the index. The addition of a document is divided into two cases, batch addition and small-scale addition; in batch addition, a temporary index is established on disk, and in small-scale addition, an internal memory index is established. In the deletion of a document, a deletion mark is first made on the document to be deleted, and marked documents are deleted together at a suitable time. In the modification of the index, the original document is first deleted, and the new document is then added anew. In the key management strategy, stratified management is carried out on the index encryption key, and the update of the key is realized by a timestamp mechanism. In the index-based access control strategy, access control information is integrated into the index, and access control at document level granularity is realized.

Owner:HUAZHONG UNIV OF SCI & TECH

Methods for automatically determining workflow for print jobs

Active | US20100195140A1 | Simple method | Optimize workflow | Digital computer details | Visual presentation | Document preparation | Documentation

A method that automatically determines a workflow for a print job via analysis of PDF documents is disclosed. A job synopsis comprising hashed information of a historical print job's document content, along with PDF document-level and object-level metadata and information about workflow nodes and parameters, can be stored in a database. The job synopsis of a new incoming print job can then be compared to the database of historical print job synopses. If the new print job matches a historical print job within a certain predefined similarity limit, then workflow and parameter information associated with the historical job can be utilized to augment an initial workflow for the incoming print job.

Owner:XEROX CORP

Method for searching and sequencing personalized web pages based on user retention time analysis

Inactive | CN102231165A | Capture interest in reading | Predict Potential Attraction | Special data processing applications | Personalization | Web browser

The invention discloses a method for searching and sequencing personalized web pages based on user retention time analysis. The method comprises the following steps of: firstly, obtaining document-level user retention time through a custom web browser; accordingly, predicting concept word-level user retention time; then, according to the predicted concept word-level user retention time, further predicting personalized reading interests of a user to each web page in any web page searching result; and finally, according to the personalized reading interests of the user, generating a personalized web page searching result facing to the user. In the method disclosed by the invention, by using an artificial intelligent related technology and methods for searching web pages, processing texts and the like, reading interests of users to different concepts can be estimated; therefore, personal reading habits and requirements are considered in the process for searching and sequencing the web pages; and the sequencing of the web page searching results is closer to the user personalized prediction result, therefore, better network search and browser support are provided for users.

Owner:ZHEJIANG UNIV

Knowledge discovery from citation networks

Active | US20140188780A1 | Improve fit | Improve document organization | Probabilistic networks | Office automation | Generative process | Subject matter

In a corpus of scientific articles such as a digital library, documents are connected by citations, and each document plays two different roles in the corpus: the document itself and a citation of other documents. A Bernoulli Process Topic (BPT) model is provided which models the corpus at two levels: the document level and the citation level. In the BPT model, each document has two different representations in the latent topic space, one associated with each of its roles. Moreover, the multi-level hierarchical structure of the citation network is captured by a generative process involving a Bernoulli process. The distribution parameters of the BPT model are estimated by a variational approximation approach.

Owner:THE RES FOUND OF STATE UNIV OF NEW YORK

Methods and systems for providing automated actions on recognized text strings in a computer-generated document

Inactive | CN1658188A | Light protection screens | Character and pattern recognition | User input | Application software

Methods and systems provide for automatically performing actions on or in association with text or data strings that are recognized as belonging to certain semantic categories. Text entered by a user is passed to a recognizer application. If a given text or data string is recognized as belonging to a given semantic category, the recognizer application passes data corresponding to the recognized string back to a host application. In response to recognized text or data, a pointer to the object model of the host application may be passed to the recognizer application to allow the recognizer application to perform any function of the host application in response to the recognized string. Alternatively, after the recognizer application passes data corresponding to the recognized string back to the host application, the host application may fire an application level or document level event for causing an action component to perform desired actions on recognized strings. Alternatively, after a string is recognized by the recognizer application, the recognizer application may set a property associated with a desired action to be performed on or in association with the recognized string. The host application may call an action component identified by the property for automatically performing the desired action on or in association with the recognized string.

Owner:MICROSOFT CORP

Method and system for generating large coded data set of text from textual documents using high resolution labeling

Inactive | US20170270096A1 | High resolution labeling | High score | Semantic analysis | Machine learning | Data set | Documentation

A method and a system for generating a coded dataset of sentences with high resolution labeling are provided herein. The method may include: obtaining a plurality of textual documents that are pre-classified, on a whole-document level, into topics; training one or more mixed-membership model unsupervised algorithms, implemented by a computer processor, based on said topics, to yield a distribution of sub topics for each of the textual documents; and applying a transformation, implemented by a computer processor, to said distribution of sub topics for each of the textual documents, to yield a topic tagging score for said sub topics on a text-portion level.

Owner:YISSUM RES DEV CO OF THE HEBREW UNIVERSITY OF JERUSALEM LTD

Method, system and device for enhancing business information security

Active | US20160217276A1 | Improve information security | Prevent leakage | Digital data protection | Transmission | Electronic document | Security Measure

The present invention provides a method for creating an electronic document file, comprising: monitoring creation of and changes to an electronic document file; receiving a policy file including document level set-up information and a security policy; searching the text data retrieved from the electronic document file for words associated with business information; computing an exposure score of the electronic document file based on the number of times words associated with business information are found and on the document level set-up information; assigning a document level to the electronic document file based on the exposure score; and inserting a watermark into the text of the electronic document file to be displayed on the client device, based on the user's personal information received from the server. Accordingly, leakage of electronic document files that include business information can be prevented by providing pre-security and post-security measures stronger than conventional measures.
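The scoring and level-assignment steps can be sketched in a few lines. The abstract gives no formula, so the weighted keyword count, the threshold table, and the names `exposure_score` and `assign_level` are all assumptions standing in for the "document level set-up information" in the policy file.

```python
import re
from collections import Counter


def exposure_score(text, keyword_weights):
    """Weighted count of business-related keywords found in the text.

    keyword_weights: hypothetical policy data mapping lowercase single-word
    keywords to weights.
    """
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(words[keyword] * weight for keyword, weight in keyword_weights.items())


def assign_level(score, thresholds):
    """Return the level with the highest minimum score not exceeding `score`.

    thresholds: list of (min_score, level_name) pairs from the policy file.
    Returns None when the score is below every minimum.
    """
    level = None
    for min_score, name in sorted(thresholds):
        if score >= min_score:
            level = name
    return level
```

The resulting level would then decide which security policy, and which watermark, applies to the file.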

Owner:MARKANY

Twitter viewpoint classification-oriented sentiment-enriched word embedding learning method

Inactive | CN106980650A | Neural architectures | Special data processing applications | Viewpoints | Data set

The invention provides a sentiment-enriched word embedding learning method for Twitter viewpoint classification, and relates to the technical field of computers. Word-level n-gram and polarity information are modeled at the same time as the sentiment polarity information of the tweet (document) level, so that word-level sentiment information is integrated, and the word-level representations are fed through a convolution to serve as the input of the tweet level. When the learned word embeddings are applied to a Twitter viewpoint polarity classification task, experimental results on a standard data set show that the method provided by the invention is superior to existing similar methods.

Owner:PINGDINGSHAN UNIVERSITY

Document-specific gazetteers for named entity recognition

Active | US9836453B2 | Relational databases | Character and pattern recognition | Named-entity recognition | Documentation

A method for entity recognition employs document-level entity tags which correspond to mentions appearing in the document, without specifying their locations. A named entity recognition model is trained on features extracted from text samples tagged with document-level entity tags. A text document to be labeled is received, the text document being tagged with at least one document-level entity tag. A document-specific gazetteer is generated, based on the at least one document-level entity tag. The gazetteer includes a set of entries, one entry for each of a set of entity names. For a text sequence of the document, features for tokens of the text sequence are extracted. The features include document-specific features for tokens matching at least a part of the entity name of one of the gazetteer entries. Entity labels are predicted for the tokens in the text sequence with the named entity recognition model, based on the extracted features.
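A minimal sketch of the gazetteer and feature-extraction steps follows. It assumes whitespace tokenization and token-level partial matching against entity names; the feature names and the functions `build_gazetteer` and `token_features` are hypothetical, and the downstream named entity recognition model is left out.

```python
def build_gazetteer(doc_tags):
    """Build a document-specific gazetteer from document-level entity tags.

    doc_tags: list of (entity_name, entity_type) pairs, with no locations.
    Returns a map from each lowercase name token to the entity types it appears in.
    """
    gazetteer = {}
    for name, entity_type in doc_tags:
        for token in name.lower().split():
            gazetteer.setdefault(token, set()).add(entity_type)
    return gazetteer


def token_features(tokens, gazetteer):
    """Extract per-token features, adding document-specific gazetteer features.

    A token gets a gazetteer feature when it matches at least part of an
    entity name tagged on the document.
    """
    features = []
    for token in tokens:
        feats = {"lower": token.lower(), "is_title": token.istitle()}
        for entity_type in sorted(gazetteer.get(token.lower(), ())):
            feats[f"in_doc_gazetteer_{entity_type}"] = True
        features.append(feats)
    return features
```

The feature dictionaries produced here are the kind of input a sequence labeler (for example a CRF) would consume to predict entity labels per token.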

Owner:CONDUENT BUSINESS SERVICES LLC

Scope-based xps filter driver configuration to perform dynamic operations

Inactive | US20080278741A1 | Improve performance | Digital output to print units | Management unit | Operating system

A filter pipeline framework is provided for a printer driver including a plurality of filters. The framework includes a logical page filter configured to perform operations on logical pages within a first thread; a document level filter configured to perform document level operations within a second thread; a job level and physical page filter configured to perform job level operations and physical page operations within a third thread; and a command managing unit configured to generate and manage commands compiled in command lists for the filters. Print ticket processing is performed at the beginning of the filter pipeline, in the logical page filter. Further, each filter runs simultaneously within its own thread to perform a specific scope of operations defined for that filter.
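The thread-per-filter arrangement can be illustrated with a generic sketch. It is not the XPS filter pipeline API; it only shows three stages, each running in its own thread and connected by queues, and the names `run_filter` and `run_pipeline` are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the job stream


def run_filter(transform, inbox, outbox):
    """Run one filter in its own thread: read, transform, pass downstream."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next filter
            break
        outbox.put(transform(item))


def run_pipeline(items, transforms):
    """Chain the filters with queues, one thread per filter, preserving order."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=run_filter, args=(t, queues[i], queues[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for thread in threads:
        thread.join()
    return results
```

In the framework described, the three transforms would correspond to the logical page filter, the document level filter, and the job level and physical page filter, with the first stage also handling print ticket processing.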

Owner:CANON KK
