239 results about "Document recognition" patented technology

Intelligent document recognition is a technology intended to automate the way businesses handle document processing. An intelligent document recognition system analyzes the content of each document it receives and looks for keywords that match its database of business terms.

Query-less searching

Inactive | US20060212415A1 | Web data indexing | Special data processing applications | The Internet | Document recognition

Some embodiments of the invention provide a method for identifying relevant documents. The method receives a set of reference documents. The method analyzes the received set of reference documents. Based on this analysis, the method then identifies one or more documents that are potentially relevant to the discussion in one or more reference documents. In some embodiments, the method identifies the relevant documents by examining candidate documents that are on a computer or are accessible by a computer through a computer network (e.g., a local area network, a wide area network, or a network of networks, such as the Internet). In these embodiments, the method uses its analysis of the reference document set to determine whether the discussion (i.e., content) of the candidate document is relevant to the topics discussed in one or more of the reference documents. If so, the method of some embodiments identifies the candidate document as a potentially relevant document (i.e., as a document that is potentially relevant or related to the reference document set).
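The abstract does not say how the reference set is analyzed. As a rough sketch only, the following assumes a bag-of-words profile built from the reference documents and a cosine-similarity cutoff; the function names and the `threshold` value are illustrative and not taken from the patent.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercased word counts; a crude stand-in for real document analysis."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def potentially_relevant(reference_docs, candidate_docs, threshold=0.2):
    """Return candidates whose content resembles the reference set.

    No query is supplied: the reference documents themselves are the input.
    The threshold is an assumed tuning parameter.
    """
    profile = Counter()
    for doc in reference_docs:
        profile.update(term_counts(doc))
    return [c for c in candidate_docs if cosine(profile, term_counts(c)) >= threshold]
```

A real system would likely weight terms (for example by inverse document frequency) rather than use raw counts.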

Owner:CALIFORNIA INST OF TECH +1

Method and system for semantic search and retrieval of electronic documents

Inactive | US20060235843A1 | Reduce inclusions | Improve ranking | Web data indexing | Special data processing applications | Electronic document | Pattern matching

A system and method for semantic search for electronic documents stored on a computer readable medium, and providing a search result in response to a query. The system includes a corpus including a plurality of electronic documents that are domain tagged at a document level and analyzed based on the tags to identify word usage patterns. An index of word usage patterns is provided that indexes the plurality of documents in the corpus according to their word usage patterns. The system also includes a query pre-processing module that receives a query from a user, and analyzes the query to determine probable word usage patterns in the query. The system further includes a processor that uses the index to identify documents having word usage patterns that match the probable word usage patterns in the query as candidate electronic documents, and retrieves the candidate electronic documents.

Owner:TEXTDIGGER

Collaborative email

Inactive | US20050033811A1 | Finance | Natural language data processing | Document recognition | Paper document

Methods, systems, and products are disclosed for writing collaborative email documents, including establishing a collaborative email document on an administrator's computer; identifying one or more collaborators who are authorized to view and edit the document; providing to the collaborators copies of the document for viewing and editing, wherein the collaborators' copies reside on collaborators' computers; creating revisions in at least one copy of the document; recording the revisions; and updating the copies of the document on collaborators' computers with the revisions. Embodiments typically include identifying editable portions of the email document, including specifying that only certain collaborators are authorized to view and edit one or more portions of the document. In many embodiments, revisions are streamed through a server so that there is no ‘master copy’ of a collaborative document, on a server or elsewhere, against which revisions are recognized.

Owner:PAYPAL INC

Information processor, document management system, and processing method and program of information processor

Inactive | US20090235155A1 | Easily level of importance | Operation efficiency can be improved | Natural language data processing | Special data processing applications | Document recognition | Information processor

A client terminal acquires from a server terminal one or more items of document information, each of which includes at least a thumbnail image and document identification information for identifying the document data corresponding to the thumbnail image, and which includes first annotation data and/or second annotation data associated with the document identification information. If first annotation data is included in the acquired document information, the client terminal displays a thumbnail image with which the first annotation data is combined, as a list with thumbnail view, on a display unit. If the second annotation data is included in specified document data, the client terminal individually displays the specified document data with which the second annotation data is combined on a display unit.

Owner:CANON KK

Methods and systems for analyzing XML documents

Inactive | US20060161559A1 | Digital data processing details | Natural language data processing | Analytic model | XPath

Methods and systems for analyzing XML documents. The system scans an XML document, identifies different dimensions that span the XML document and detects scoping relationships amongst them. The system uses the dimensional information to create a logical hierarchical scoped dimension analysis model, maps the logical XML tree to this model, and then implements the analytical method over the logical model. The logical model allows both structural features and numeric/non-numeric data to be used for analysis. The analytical method allows users to query irregular structural properties of the XML documents using the XPath navigational API.

Owner:IBM CORP

Assigning document identification tags

Active | US8136025B1 | Web data indexing | Digital data processing details | Document recognition | Document preparation

Document identification tags are assigned to documents to be added to a collection of documents. Based on query-independent information about a new document, a document identification tag is assigned to the new document. The document identification tag so assigned is used in the indexing of the new document. When a list of document identification tags is produced by an index in response to a query, the list is approximately ordered with respect to a measure of query-independent relevance. In some embodiments, the measure of query-independent relevance is related to the connectivity matrix of the World Wide Web. In other embodiments, the measure is related to the recency of crawling. In still other embodiments, the measure is a mixture of these two. The provided systems and methods allow for real-time indexing of documents as they are crawled from a collection of documents.
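One way to obtain tags that are only approximately ordered by relevance, while still assigning them as documents arrive, is to split the tag space into score tiers. The sketch below assumes such a tiering scheme; the class name, the tier floors, and the tier size are hypothetical and are not described in the abstract.

```python
class TieredDocIdAssigner:
    """Assigns integer tags so that lower tags roughly mean higher relevance.

    tier_floors: descending score floors, e.g. [0.8, 0.5, 0.0] (assumed values).
    Each tier owns a contiguous block of tag values, so tags can be handed out
    one at a time without knowing the scores of documents crawled later.
    """

    def __init__(self, tier_floors, tier_size=1_000_000):
        self.tier_floors = tier_floors
        self.tier_size = tier_size
        self.next_offset = [0] * len(tier_floors)

    def assign(self, score):
        # Pick the first tier whose floor the score reaches; fall back to the last tier.
        for tier, floor in enumerate(self.tier_floors):
            if score >= floor:
                break
        else:
            tier = len(self.tier_floors) - 1
        offset = self.next_offset[tier]
        if offset >= self.tier_size:
            raise OverflowError("tier is full")
        self.next_offset[tier] += 1
        return tier * self.tier_size + offset
```

Sorting a result list by tag then yields tier order, which is ordered across tiers but not within a tier, matching the "approximately ordered" wording.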

Owner:GOOGLE LLC

System, method, and computer program product for identifying multi-page documents in hypertext collections

Inactive | US20050071310A1 | Precise definition | Better quality tool | Digital data information retrieval | Digital data processing details | WebID | Entry point

A system, method, and computer program product for identifying compound documents as a coherent body of hyperlinked material on a single topic as created by an author or collaborating authors, analyzing the content and structure of the compound documents and related hyperlinks, and responsively selecting a preferred entry point at which to begin processing such documents. The body of material may include the internet, an intranet, or other digital library that typically has content distributed over several separate pages or URLs, sometimes in a hierarchical directory structure. The processing may include creating at least one taxonomy, as well as searching or indexing the compound documents. The identification and analysis schemes include an observation of a number of heuristics run on component documents in the compound documents.

Owner:IBM CORP

Web page authoring apparatus, web page authoring method, and program

Inactive | US20060161841A1 | Natural language data processing | Special data processing applications | Computer graphics (images) | Document recognition

The present invention improves application of a style to a view object when a document for a Web page to be edited is edited on a browser-type edit screen. First, a view object is detected from a managed document. Then, a direct style directly described in the managed document and an indirect style identified only by referring to an external document are collected. A browser-type edit screen is generated in which the direct and indirect styles are applied to each view object. The content of the managed document is synchronized with the edited content on the browser-type edit screen based on the editing operations on the browser-type edit screen.

Owner:IBM CORP

System and method for using text analytics to identify a set of related documents from a source document

Active | US20070112748A1 | Natural language data processing | Special data processing applications | Systems analysis | Document recognition

A system and method for processing a document to generate a set of related documents. A system is provided that includes a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.

Owner:IBM CORP

A self-learning system and methods for automatic document recognition, authentication, and information extraction

Active | US20170132866A1 | Convenient time | Requires minimization | Paper-money testing devices | Character and pattern recognition | NODAL | Feature vector

A computerized system for classifying and authenticating documents is provided. The Classification process involves the creation of a Unique Pair Feature Vector which provides the best discrimination information for each pair of Document Classes at every node in a Pairwise Comparison Nodal Network. The Nodal Network has a plurality of nodes, each node corresponding to the best discrimination information between two potential document classes. By performing a pairwise comparison of the potential documents using this nodal network, the document is classified. After classification, the document can be authenticated for validity.
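The pairwise idea can be illustrated with a toy example. The sketch below assumes each document class is summarized by a centroid feature vector, that each node keeps only the single feature that best separates its two classes, and that classes are eliminated one comparison at a time. All three choices are simplifications made for illustration and are not the patent's Unique Pair Feature Vector construction.

```python
from itertools import combinations


def build_pair_nodes(class_centroids):
    """For every pair of classes, store the index of the most discriminating feature."""
    nodes = {}
    for a, b in combinations(sorted(class_centroids), 2):
        ca, cb = class_centroids[a], class_centroids[b]
        nodes[(a, b)] = max(range(len(ca)), key=lambda i: abs(ca[i] - cb[i]))
    return nodes


def classify(features, class_centroids, nodes):
    """Run pairwise comparisons, keeping the winner of each node."""
    candidates = sorted(class_centroids)
    winner = candidates[0]
    for challenger in candidates[1:]:
        a, b = sorted((winner, challenger))
        i = nodes[(a, b)]  # feature chosen for this specific pair
        dist_a = abs(features[i] - class_centroids[a][i])
        dist_b = abs(features[i] - class_centroids[b][i])
        winner = a if dist_a <= dist_b else b
    return winner
```

Authentication of the classified document, the step that follows in the abstract, is outside the scope of this sketch.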

Owner:META PLATFORMS INC

Method and apparatus for sharing viewable content with conference participants through automated identification of content to be shared

Active | US20160094593A1 | Office automation | Transmission | Web service | Display device

A computer implemented method and apparatus for sharing the contents of a presentation in a web conference through automated identification of documents for selective sharing with web conferences comprises launching a web conference in which an application or an entire desktop view of a conference presenter is caused to be rendered as content viewable on the displays of all conference participants. The method detects that one or more documents are being accessed by application(s) executed concurrently with the desktop sharing application and identifies documents having a supported format as including viewable content available for rendering to the display of the presenter. Automatically, or after presenter confirmation, the content is uploaded to a web server and converted into a format that can be distributed to and cached at the respective participant computers.

Owner:ADOBE SYST INC

System and method for performing electronic information retrieval using keywords

Active | US7370034B2 | Data processing applications | Digital data processing details | A domain | Document preparation

Output documents similar to an input document are identified. A query is formulated using a list of best keywords from the input document to search for a first set of output documents. The list of best keywords is defined with a maximum number of keywords less than the total number of keywords in the list of best keywords that are identified as belonging to a domain specific dictionary of words and as having no measurable linguistic frequency. Lists of keywords are identified for each output document in the first set of documents. A second set of similar documents is determined using a measure of similarity that is computed between keywords identified in the input document and each output document in the first set of documents.

Owner:XEROX CORP

Network system for directing the transmission of facsimiles

Inactive | US6906817B1 | Digital computer details | Character and pattern recognition | Paper document | Document preparation

A general document recognition system is described which is intended to be used in connection with an electronic document transmission function used on a computer network. The general document recognition system is set up to recognize any number of document types created by application programs in the network and is also set up with rules as to how to extract destination data from each document type. The extracted data from each document can be the actual intended destination, such as a facsimile telephone number, or can be the identity of the intended recipient individual. If a recipient, rather than a destination, is extracted from the document, the general document recognition system can query a previously designated external database to recover the destination information for that recipient. An LDAP database is the preferred external database for this function.
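The recognize, extract, and look-up flow described here can be sketched as follows. The rule table, the regular expressions, and the sample directory entry are all invented for illustration, and a plain dictionary stands in for the LDAP database so the example stays self-contained.

```python
import re

# (document type, pattern that recognizes the type, pattern that extracts the destination field)
# These rules are hypothetical examples, not rules from the patent.
RULES = [
    ("invoice", re.compile(r"^INVOICE", re.M), re.compile(r"^Fax:\s*([+\d][\d -]+)$", re.M)),
    ("memo", re.compile(r"^MEMO", re.M), re.compile(r"^To:\s*(.+)$", re.M)),
]

# Stand-in for the external directory (an LDAP database in the abstract); made-up entry.
DIRECTORY = {"Alice Martin": "+33 1 55 55 01 01"}


def resolve_destination(text, directory=DIRECTORY):
    """Return (document type, fax number), or None where a part cannot be determined."""
    for doc_type, recognize, extract in RULES:
        if not recognize.search(text):
            continue
        match = extract.search(text)
        if not match:
            return doc_type, None
        value = match.group(1).strip()
        if re.fullmatch(r"[+\d][\d -]+", value):
            return doc_type, value  # the document carried the destination itself
        return doc_type, directory.get(value)  # the document named a recipient
    return None, None
```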

Owner:ESKER SA

Insurance document imaging and processing system

Inactive | US20090265385A1 | Digital data information retrieval | Finance | Document recognition | Document preparation

According to some embodiments, an insurance document is received at a document conversion system. The received document may be converted to a document image, and document identification data may be assigned to the document image. The assigned document identification data may be automatically matched to (and/or associated with) insurance information. The document image, the insurance information, and/or the document identification data may then be provided for review. Subsequent to review, an insurance claim may be processed in accordance with the document image, the insurance information, and/or the document identification data.

Owner:HARTFORD FIRE INSURANCE

Systems, methods and computer program products for labeled forms processing

Inactive | US20060044605A1 | Character recognition | Digital output to print units | Document preparation | Form processing

A system, method, and computer product for processing paper documents for electronic storage and retrieval where a label containing a document identification code is generated and is affixed to a paper document. The paper document is then converted to a digital format and transmitted to a central processing center. The digital document is separated into two or more individual pages and may be presented to a user with a viewer program. Through the viewer program the user may then identify a portion of the label to image and convert the imaged portion of the label to textual data relating to the document and its contents. The textual data may then be used in archiving the documents in an archiving program. The data also may be retrieved from a stored database location and verified with information entered in a particular field by a user.

Owner:U S SECURITY ASSOCS

Authoritative document identification

Inactive | US20060149800A1 | Web data indexing | Special data processing applications | Document Identifier | Document recognition

A system determines documents that are associated with a location, identifies a group of signals associated with each of the documents, and determines authoritativeness of the documents for the location based on the signals.

Owner:GOOGLE LLC

Print control mechanism based on printing environment

Active | US20100020355A1 | Overcome disadvantages | Digital data processing details | Unauthorized memory use protection | Data control | Image formation

An image forming apparatus implements a print restriction depending on the environment of the image forming apparatus, such as who is or is not near the image forming apparatus. The image forming apparatus communicates with a short-range wireless terminal for authenticating print data with reference to access right information in which document identifying information identifying the print data is associated with wireless terminal identifying information identifying the short-range wireless terminal. The image forming apparatus includes an acquiring unit for acquiring the wireless terminal identifying information from the short-range wireless terminal; a determining unit for determining whether the printing of the print data should be permitted or not based on the wireless terminal identifying information acquired by the acquiring unit and the access right information; and a control unit for controlling the printing of the print data depending on a result of the determination made by the determining unit.

Owner:RICOH KK

Near-duplicate document detection for web crawling

Active | US8140505B1 | Web data indexing | Digital data processing details | Computer hardware | Document recognition

A system generates a hash value for a fetched document and compares the hash value with a set of stored hash values to identify ones of the stored hash values with a sequence of bit positions, less than all of the bit positions, that match a corresponding sequence of bit positions of the hash value. The system also determines whether any of the identified hash values are substantially similar to the hash value and identifies the fetched document as a near-duplicate of another document when one of the identified hash values is substantially similar to the hash value.
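The two-stage comparison can be sketched with small integers standing in for document hashes. The sketch assumes 16-bit hashes split into four 4-bit blocks and treats "substantially similar" as a Hamming distance of at most 3; those sizes, the class name, and the distance cutoff are illustrative assumptions, and how the hash itself is computed is left out.

```python
from collections import defaultdict

BITS = 16                      # assumed hash width for the example
BLOCKS = 4                     # number of bit-position sequences that are indexed
BLOCK_BITS = BITS // BLOCKS


def blocks(h):
    """Split a hash into (block index, block value) keys."""
    mask = (1 << BLOCK_BITS) - 1
    return [(i, (h >> (i * BLOCK_BITS)) & mask) for i in range(BLOCKS)]


class NearDuplicateIndex:
    def __init__(self, max_distance=3):
        # With 4 blocks, two hashes that differ in at most 3 bits must agree on
        # at least one whole block, so the block lookup cannot miss a near-duplicate.
        self.max_distance = max_distance
        self.table = defaultdict(set)

    def add(self, h):
        for key in blocks(h):
            self.table[key].add(h)

    def find(self, h):
        """Stage 1: stored hashes sharing a block. Stage 2: full Hamming check."""
        candidates = set()
        for key in blocks(h):
            candidates |= self.table.get(key, set())
        return sorted(c for c in candidates if bin(c ^ h).count("1") <= self.max_distance)
```

The block lookup keeps the expensive full comparison limited to a small candidate set instead of every stored hash.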

Owner:GOOGLE LLC

Systems and methods to automatically classify electronic documents using extracted image and text features and using a machine learning subsystem

Inactive | US20090116736A1 | Improve abilities | Character and pattern recognition | Electronic document | Feature set

A document analysis system that automatically classifies documents by recognizing distinctive features in each document comprises a document acquisition system, a document recognition training system, a document classification system, a document recognition system, and a job organization system. The document acquisition system receives jobs, wherein each job contains at least one electronic document. The document feature recognition system automatically extracts image and text features from each received document. The document classification system automatically classifies recognized electronic documents by finding the best match between the extracted features of each document and the feature sets associated with each category of document. The document recognition training system automatically trains the feature set for each corresponding category of documents, wherein the training system uses extracted features of unrecognized documents to automatically modify the feature set for a document category. The job organization system automatically organizes each job according to the document categories it contains.

Owner:GRUNTWORX

Preserving user applied markings made to a hardcopy original document

Active | US20110249299A1 | Readily apparent | Image enhancement | Visual presentation | Computer graphics (images) | Document recognition

What is disclosed is a novel system and method for identifying and removing print defects from an original document such that user markings applied to the hardcopy original can be more readily identified and extracted. In one embodiment, an image of an original document and a marked document are received. The original document was printed using a print device which caused a print defect in the hardcopy print. Methods for identifying the print defect in the difference image are provided herein. The identified print defect is removed from the difference image. The difference image retains the user-applied markings once the print defects have been identified and removed. The user markings can then be provided to a storage device for subsequent retrieval and added into the image of the original document to generate an image of a new marked document containing the user markings without the defect. Various embodiments are disclosed.

Owner:XEROX CORP

Systems and methods for providing data-driven document suggestions

Inactive | US20130290347A1 | Digital data processing details | Special data processing applications | Document recognition | Data-driven

Systems and methods are disclosed for providing at least one document suggestion from a computer system using at least one information source, the method comprising storing in the information source a plurality of associations, each of which includes a numeric coefficient that corresponds to at least one action of a user and at least one document; receiving a triggering action related to the at least one action of the user; comparing the numeric coefficients stored in the information source with a suggestion threshold based on the triggering action; and for each numeric coefficient that exceeds the suggestion threshold, identifying the corresponding at least one document as a suggested document.
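As a minimal illustration of the threshold comparison, the sketch below assumes the information source is a dictionary keyed by (action, document) pairs and that each triggering action may carry its own threshold. The data layout, the function name, and the default threshold are assumptions.

```python
def suggest(associations, triggering_action, thresholds, default_threshold=0.5):
    """Return documents whose coefficient for the triggering action exceeds the threshold.

    associations: {(action, document): numeric coefficient}
    thresholds:   {action: suggestion threshold} (assumed per-action lookup)
    """
    threshold = thresholds.get(triggering_action, default_threshold)
    return [
        document
        for (action, document), coefficient in associations.items()
        if action == triggering_action and coefficient > threshold
    ]
```

For example, with associations `{("open_crm", "pricing.xlsx"): 0.9, ("open_crm", "notes.txt"): 0.3}` and a threshold of 0.6 for `"open_crm"`, only `pricing.xlsx` would be suggested.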

Owner:APPSENSE

Dynamically and customizably managing data in compliance with privacy and security standards

Inactive | US20050192830A1 | Preserving professional confidence | Avoid identification | Computer security arrangements | Payment architecture | Documentation procedure | Recordset

Systems and methods for managing data in compliance with privacy, security and/or retention standards in business industries. A dynamic and customizable archival and retrieval system allows for information and documentation to be placed and made available in the system. The document type and identifying information for that document type are described. Definitions are established for the documents being managed, the data identifying the documents, and the retention policies for the documents. The documents are associated with the identifying data for a particular set of records. A single point of entry is provided for external and/or internal requests, and/or a single point of exit is provided for transmissions of information, wherein the transmissions to requestors include information that is individually approved. Moreover, digital authorizations and consents for retrieval from external data sources may be utilized.

Owner:VERISMA SYST

Efficient passage retrieval using document metadata

Inactive | US20120078926A1 | Efficient retrieval | Improve efficiency | Digital data processing details | Knowledge representation | Query analysis | Document preparation

A system, method and computer program product for efficiently retrieving relevant passages to questions based on a corpus of data. A processor device receives an input query and performs a query analysis to obtain searchable query terms. The processor then matches metadata associated with one or more documents against the query terms. The document metadata includes one or more of: a title of the documents, one or more user tags or clouds. The processor device then maps the matched document metadata to the corresponding one or more documents, identifies the corresponding matched documents to form a subcorpus of documents, and conducts a search in the subcorpus using the searchable query terms to obtain one or more passages relevant to the input query from the identified documents.
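The two-step narrowing can be sketched as follows, assuming each document is a dictionary with `title`, `tags`, and `passages` fields and that matching is plain case-insensitive word overlap. The field names and the matching rule are assumptions made for the example.

```python
def retrieve_passages(query_terms, documents):
    """Match metadata first, then search passages only inside the matched subcorpus."""
    terms = {t.lower() for t in query_terms}

    # Step 1: keep documents whose title words or tags overlap the query terms.
    subcorpus = []
    for doc in documents:
        metadata = {w.lower() for w in doc["title"].split()} | {t.lower() for t in doc["tags"]}
        if terms & metadata:
            subcorpus.append(doc)

    # Step 2: search passages, but only in the (usually much smaller) subcorpus.
    hits = []
    for doc in subcorpus:
        for passage in doc["passages"]:
            if terms & set(passage.lower().split()):
                hits.append((doc["title"], passage))
    return hits
```

The efficiency gain comes from step 1: passages in documents whose metadata does not match are never scanned, even if they happen to contain a query term.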

Owner:IBM CORP

E-dictionary search apparatus and method for document in which Korean characters and Chinese characters are mixed

Inactive | US20110188756A1 | Enhanced search function | Efficient execution | Natural language translation | Character and pattern recognition | Search words | Chinese characters

A method for providing a correct e-dictionary search result for a document recognition result includes performing character recognition of a document in which Korean characters (Hangul) and Chinese characters are mixed and displaying a recognition result. When the user selects a character string to be searched from the recognition result, the method determines whether the selected character string corresponds to Hangul or Chinese characters, detects a Hangul word or a Chinese word included in the selected character string, and outputs an e-dictionary search result corresponding to the detected Hangul or Chinese word. Accordingly, the user can use an e-dictionary function without directly inputting a search word and obtain a correct e-dictionary search result for a document in which Hangul and Chinese characters are mixed.
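The script-detection step can be approximated with Unicode ranges. The sketch below assumes that Hangul means the Hangul Syllables block (U+AC00 to U+D7A3) and that Chinese characters mean the CJK Unified Ideographs block (U+4E00 to U+9FFF); the function names and the dictionary layout are hypothetical, and character recognition itself is assumed to have already produced the text.

```python
import re

HANGUL = re.compile(r"[\uac00-\ud7a3]+")    # Hangul Syllables block
CHINESE = re.compile(r"[\u4e00-\u9fff]+")   # CJK Unified Ideographs block


def detect_words(selected):
    """Return (script, word) pairs found in the selected string, in order of appearance."""
    found = [(m.start(), "hangul", m.group()) for m in HANGUL.finditer(selected)]
    found += [(m.start(), "chinese", m.group()) for m in CHINESE.finditer(selected)]
    return [(script, word) for _, script, word in sorted(found)]


def lookup(selected, dictionaries):
    """Look up each detected word in the dictionary for its script.

    dictionaries: {"hangul": {...}, "chinese": {...}}; missing entries yield None.
    """
    return [(word, dictionaries[script].get(word)) for script, word in detect_words(selected)]
```

A fuller implementation would also cover Hangul Jamo and the CJK extension blocks, which this sketch ignores.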

Owner:SAMSUNG ELECTRONICS CO LTD

Document searching apparatus, document searching method, and computer-readable recording medium

Inactive | US20090183115A1 | Solve problems | Metadata text retrieval | Digital data processing details | Document recognition | Document preparation

A document searching apparatus includes an element-correspondence storing unit that stores therein a page-correspondence managing table in which document data is associated with each page making up the document data, a searching unit that searches the page-correspondence managing table for pages satisfying a search criterion, a document identifying unit that identifies document data associated with the retrieved pages, a collating unit that groups the retrieved pages according to the identified document data, and a display processing unit that displays the pages grouped by document data.
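The search-then-collate flow can be sketched in a few lines, assuming the page-correspondence managing table is a list of records with `doc_id`, `page_no`, and `text` fields and that the search criterion is passed in as a function. The record layout and names are illustrative.

```python
from collections import defaultdict


def search_grouped(page_table, criterion):
    """Find pages satisfying the criterion and group them by their parent document.

    page_table: iterable of {"doc_id": ..., "page_no": ..., "text": ...} records.
    Returns {doc_id: [page_no, ...]} with pages in table order.
    """
    grouped = defaultdict(list)
    for page in page_table:
        if criterion(page):
            grouped[page["doc_id"]].append(page["page_no"])
    return dict(grouped)
```

A display layer could then render one group per document, which is the role of the display processing unit in the abstract.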

Owner:RICOH KK

Method and device for making documents secure using unique imprint derived from unique marking variations

Active | US8345315B2 | Stability is not easy | High risk | Image analysis | Paper-money testing devices | Documentation procedure | Document recognition

The document identification method includes: a step of generating an image; a step of marking a plurality of documents to form the image on each document with unique variations on each document, the majority of the images formed on the documents presenting a physical anti-copy characteristic satisfying a predefined criterion such that the characteristic of the majority of the copies that can be produced based on these images do not satisfy the predefined criterion; a step of characterizing the variations to form a unique imprint of the mark formed, for each document; and a step of memorizing the unique imprint.

Owner:ADVANCED TRACK TRACE

Intelligently driven visual interface on mobile devices and tablets based on implicit and explicit user actions

Active | US20180329990A1 | Improve conversion rate | Character and pattern recognition | Special data processing applications | Tablet computer | Documentation

A method for identifying a desired document includes forming K clusters of documents and, for each cluster: determining, for each respective document of the cluster, a sum of distances between (i) the respective document and (ii) each of the other documents of the cluster; and identifying as the medoid document of the cluster the document having the smallest such sum among all documents of the cluster. The method also includes selecting M representative documents for each cluster; identifying, for dynamic display toward the user, K groupings of documents, each of which identifies the selected M representative documents of a corresponding cluster; and, in response to user selection of one of the K groupings, identifying for dynamic display toward the user P documents of the cluster that corresponds to the selected grouping.
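The medoid step described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: documents are stood in for by 2-D points, the distance is Euclidean, and the rule for choosing the M representatives (the M members nearest the medoid) is a guess, since the abstract does not say how they are selected. The function names are hypothetical.

```python
def euclidean(a, b):
    """Assumed distance between two documents represented as coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def medoid(cluster, distance):
    """Return the member whose summed distance to all members is smallest.

    The distance from a member to itself is zero, so including it in the
    sum gives the same result as summing over the other members only.
    """
    return min(cluster, key=lambda doc: sum(distance(doc, other) for other in cluster))


def representatives(cluster, distance, m):
    """Assumed selection rule: the M members closest to the cluster medoid."""
    center = medoid(cluster, distance)
    return sorted(cluster, key=lambda doc: distance(doc, center))[:m]


# K = 2 toy clusters (cluster formation itself is outside this sketch).
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)],
    [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0)],
]

medoids = [medoid(c, euclidean) for c in clusters]
groupings = [representatives(c, euclidean, 2) for c in clusters]
print(medoids)
print(groupings)
```

Unlike a centroid, a medoid is always an actual member of the cluster, which matters here because the representative shown to the user has to be a real document.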

Owner:EVOLV TECH SOLUTIONS INC

Method for generating a graph lattice from a corpus of one or more data graphs

Active | US20120069024A1 | Efficiently build | Efficient use of | Drawing from basic elements | Document recognition | Graphics

A document recognition system and method in which images are represented as a collection of primitive features whose spatial relations are represented as a graph. Useful subsets of all the possible subgraphs representing different portions of images are represented over a corpus of many images. The data structure is a lattice of subgraphs, and algorithms are provided to build and use the graph lattice efficiently and effectively.

Owner:PALO ALTO RES CENT INC

Information processing device, information processing system, information processing method, program, and storage medium

Inactive | US20080244378A1 | Information obtained | Reduce the burden on | Character and pattern recognition | Office automation | Information processing | Feature extraction

An information processing device includes: a feature extracting section that extracts, as format information, a format feature of a process-target document from image data of that document, on which fill-in spaces for plural items are printed; a document recognizing section that compares the format information of the process-target document with registered format information stored in a storage device, the registered format information describing format features of registered documents, and specifies the registered document that corresponds to the process-target document; a data acquiring section that converts characters in the image data of the process-target document into text data; and a distributing section that groups the image data and text data of the characters written in the fill-in spaces of the items into plural groups according to a separation rule set for the registered document, and transmits the different groups to different external devices. In this way, information that must be protected, such as personal information, can be processed while preventing any operator who handles it from obtaining the information as a whole.
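The recognize-then-distribute idea in this abstract can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the format feature is reduced to a tuple of item names, the separation rule is a plain item-to-group mapping, and `REGISTERED`, `recognize`, and `distribute` are hypothetical names. Feature extraction from image data and character recognition are not modeled.

```python
# Hypothetical registry: each registered document has a format feature
# and a separation rule assigning every item to a destination group.
REGISTERED = {
    "application_form": {
        "format": ("name", "address", "phone", "account"),
        "separation_rule": {"name": "A", "phone": "A", "address": "B", "account": "B"},
    },
    "survey_form": {
        "format": ("age", "answer"),
        "separation_rule": {"age": "A", "answer": "B"},
    },
}


def recognize(format_info):
    """Document recognizing section: find the registered document with this format."""
    for doc_type, entry in REGISTERED.items():
        if entry["format"] == format_info:
            return doc_type
    return None


def distribute(doc_type, fields):
    """Distributing section: split the filled-in items into groups per the rule."""
    rule = REGISTERED[doc_type]["separation_rule"]
    groups = {}
    for item, text in fields.items():
        groups.setdefault(rule[item], {})[item] = text
    return groups


# Text already converted from the fill-in spaces (made-up sample values).
fields = {"name": "T. Yamada", "address": "1-2-3 Example St", "phone": "555-0100", "account": "0001234"}

doc_type = recognize(tuple(fields))
groups = distribute(doc_type, fields)
for destination, items in groups.items():
    print(destination, items)  # each group would go to a different external device
```

Because each destination receives only its own group, no single recipient sees the name together with the address and account number, which is the protection the abstract describes.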

Owner:SHARP KK

A method and a terminal for creating paper document structured data based on a deep learning model

Active | CN109800761A | Improve efficiency | Improve accuracy | Character and pattern recognition | Neural architectures | Document structuring | Document recognition

The invention relates to a method and a terminal for creating paper document structured data based on a deep learning model. The method comprises the following steps: presetting a document training sample set, wherein each sample in the training sample set comprises a paper document OCR recognition result and a labeled document corresponding to that OCR recognition result, and the labeled document records position information and category information of each key field in the OCR recognition result; training a preset first deep learning model using the training sample set to obtain a second deep learning model; having the second deep learning model analyze a first paper document OCR recognition result to obtain position information and category information of each key field in it; and creating a structured document corresponding to the first paper document OCR recognition result according to the position information and category information of each key field. The accuracy of converting the OCR result of a paper document into a structured document is improved.
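The final step of this abstract, turning predicted key-field positions and categories into a structured document, can be sketched in Python as follows. The sketch does not include any deep learning model: `key_fields` simply stands in for the model's output. The box format `(x0, y0, x1, y1)`, the rule that a token belongs to a field when its center falls inside the field region, and all names are assumptions made for illustration.

```python
def center(box):
    """Center point of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(region, point):
    """True if the point lies inside the (x0, y0, x1, y1) region."""
    x0, y0, x1, y1 = region
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def build_structured_document(ocr_tokens, key_fields):
    """Map each predicted field category to the OCR text found in its region."""
    structured = {}
    for category, region in key_fields:
        inside = [(box, text) for text, box in ocr_tokens if contains(region, center(box))]
        inside.sort(key=lambda item: (item[0][1], item[0][0]))  # top to bottom, then left to right
        structured[category] = " ".join(text for _, text in inside)
    return structured


# Made-up OCR recognition result: (text, bounding box) pairs.
ocr_tokens = [
    ("Invoice", (10, 10, 60, 20)),
    ("No.", (65, 10, 85, 20)),
    ("A-1024", (90, 10, 140, 20)),
    ("Total", (10, 40, 45, 50)),
    ("358.00", (90, 40, 140, 50)),
]

# Stand-in for the model output: (category, position) of each key field.
key_fields = [
    ("invoice_number", (88, 8, 142, 22)),
    ("total_amount", (88, 38, 142, 52)),
]

structured = build_structured_document(ocr_tokens, key_fields)
print(structured)
```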

Owner:厦门商集网络科技有限责任公司
