system
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-09
- Publication Date
- 2026-06-19
AI Technical Summary
The increasing number of electronic documents handled in a disorderly manner leads to difficulties in quickly finding specific information, resulting in inefficient operations and human errors due to the absence of character information management and search capabilities.
The system uses optical character recognition to extract character information, natural language processing to analyze and organize it, indexing to improve search efficiency, and query generation to retrieve and manage electronic documents efficiently.
The system enables rapid and accurate management of electronic documents. Structuring and organizing the information shortens the time required to find specific information and improves operational efficiency.
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In modern operations, while the number of contract documents handled as electronic documents is increasing, there is a problem that these are stored in a disorderly manner, and it is difficult to search for important information due to the absence of character information. In particular, it is difficult to quickly find specific information from a plurality of documents, and a large amount of working time is required. Such a situation hinders efficient operation and becomes a factor in operation delays and human errors. The present invention aims to solve such problems and improve the management and search of electronic documents.
Means for Solving the Problems
[0005] This invention provides a means for automatically extracting character information from electronic documents using optical character recognition technology. Furthermore, it includes means for analyzing the extracted information using natural language processing technology, registering the identified information in a database, and generating a link to the original document. It also has query generation means for searching the database based on user prompts, and enables efficient information retrieval and referencing by quickly presenting the search results. Moreover, it provides a system that improves search efficiency by indexing electronic documents and related information, and organizes electronic documents based on the extracted information, thereby improving business efficiency.
[0006] "Electronic documents" refer to document data created and stored on computers and electronic devices.
[0007] "Optical character recognition means" refers to a device or software that uses technology to automatically read characters from image data and convert them into electronic character information.
[0008] "Character information" refers to information such as letters, numbers, and symbols that make up the text within a document.
[0009] "Natural language processing means" refers to technologies, devices, or software that enable computers to understand and analyze human language.
[0010] "Information management means" refers to a system or method for efficiently organizing, storing, and making accessible collected information.
[0011] "Query generation means" refers to technology that generates search commands to access a database and retrieve information based on user requests and conditions.
[0012] "Indexing means" refers to the techniques and procedures for creating indexes necessary for efficiently retrieving information.
[0013] "Data organization methods" refer to techniques for efficiently classifying, structuring, and preparing collected data for use.
Brief Description of the Drawings
[0014]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2, when an emotion engine is combined.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2, when an emotion engine is combined.
Embodiments for Carrying Out the Invention
[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0016] First, the terms used in the following description will be explained.
[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.
[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0020] In the following embodiments, a communication interface (I / F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), or Bluetooth (registered trademark).
[0021] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."
[0022] [First Embodiment]
[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0035] The system of the present invention involves three parties: a server, a terminal, and a user. The specific operation of each is described below.
[0036] First, the user uploads contracts and other related electronic documents through the business system interface. The uploaded documents are sent from the terminal to the server. The server receives these documents and efficiently extracts character information from the electronic documents using optical character recognition (OCR).
[0037] The server then analyzes the extracted text information using natural language processing to identify important information such as the contract number, contractor name, contract date, and contract amount. This information is organized using information management tools and registered in the contract database. At the same time, a link to the original electronic document is generated and stored in the database.
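The field identification described in [0037] can be illustrated with a minimal sketch. The source does not say how the natural language processing locates each field, so the regular expressions, the label wording ("Contract No.", "Contractor", and so on), and the function name `extract_contract_fields` are all assumptions that stand in for a real NLP model.

```python
import re

# Hypothetical label patterns. Real contracts vary widely, and the source
# does not specify how fields are located; regexes stand in for NLP here.
FIELD_PATTERNS = {
    "contract_number": re.compile(r"Contract\s+No\.?\s*:\s*(\S+)", re.IGNORECASE),
    "contractor_name": re.compile(r"Contractor\s*:\s*(.+)", re.IGNORECASE),
    "contract_date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "contract_amount": re.compile(r"Amount\s*:\s*([\d,]+)", re.IGNORECASE),
}


def extract_contract_fields(ocr_text: str) -> dict:
    """Return one value per field, or None when a field is not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1).strip() if match else None
    return fields
```

Returning `None` for a missing field, rather than raising, lets a later registration step decide whether an incomplete record is acceptable.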
[0038] Users can then search for necessary contract information through this system. When a user enters a concise, plainly worded search prompt into the terminal, the terminal uses a query generation mechanism to generate a query to search the database based on the user's input. The server quickly searches the database in response to the query and collects the relevant information.
[0039] Search results are presented to the user via their device. These results include links to relevant documents and summaries of necessary contractual information. This allows users to quickly access the information they need and proceed with their work efficiently.
[0040] In this system, the server uses indexing mechanisms to structurally organize information and improve the speed of search result retrieval. Furthermore, data organization mechanisms organize documents based on the extracted data, thereby improving operational efficiency.
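As one way to picture the indexing mechanism in [0040], the sketch below builds a contract table with indexes on the columns a search is likely to filter by. The choice of SQLite, the table layout, the column names, and the sample rows are assumptions for illustration; the source names no database engine at this point.

```python
import sqlite3


def build_contract_store() -> sqlite3.Connection:
    """Create an in-memory contract table with indexes on searched columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE contracts (
            contract_number TEXT PRIMARY KEY,
            contractor_name TEXT NOT NULL,
            contract_date   TEXT NOT NULL,  -- ISO 8601, so text order = date order
            contract_amount INTEGER,
            document_link   TEXT NOT NULL
        )
        """
    )
    # Indexes let date-range and name lookups avoid a full table scan.
    conn.execute("CREATE INDEX idx_contracts_date ON contracts (contract_date)")
    conn.execute("CREATE INDEX idx_contracts_name ON contracts (contractor_name)")
    return conn


conn = build_contract_store()
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?, ?, ?)",
    [
        ("C-001", "Acme Corp", "2024-05-01", 1200000, "/docs/C-001.pdf"),
        ("C-002", "Globex", "2023-01-15", 800000, "/docs/C-002.pdf"),
    ],
)
# Date-range search that can be served by idx_contracts_date.
rows = conn.execute(
    "SELECT contract_number FROM contracts"
    " WHERE contract_date >= ? ORDER BY contract_date",
    ("2024-01-01",),
).fetchall()
```

Storing dates as ISO 8601 text keeps lexical and chronological order identical, which is what makes a plain index on `contract_date` usable for range queries.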
[0041] For example, if a user enters a prompt such as "search for contracts concluded in the past year," the terminal will receive this prompt and generate an appropriate query. The server will process the query, retrieve the relevant information from the database, and return the search results to the user. In this way, the present invention provides a means to significantly streamline the management of electronic documents.
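The prompt-to-query step in [0041] might look like the following sketch. The source leaves the query generation mechanism open (a generative model may be involved), so this rule-based translator, the function name `build_search_query`, and the `contracts` table with a `contract_date` column are assumptions. The current date is passed in explicitly so the result is reproducible.

```python
import re
from datetime import date, timedelta


def build_search_query(prompt: str, today: date) -> tuple[str, list[str]]:
    """Translate a small set of prompt phrasings into parameterized SQL."""
    text = prompt.lower()
    sql = "SELECT contract_number, document_link FROM contracts"
    params: list[str] = []
    if re.search(r"\bpast year\b", text):
        # "Past year" is read here as the last 365 days (an assumption).
        start = today - timedelta(days=365)
        sql += " WHERE contract_date >= ?"
        params.append(start.isoformat())
    elif re.search(r"\bthis year\b", text):
        sql += " WHERE contract_date >= ?"
        params.append(date(today.year, 1, 1).isoformat())
    return sql + " ORDER BY contract_date DESC", params
```

Keeping the date as a bound parameter instead of formatting it into the SQL string means user-supplied text never reaches the query itself.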
[0042] The following describes the processing flow.
[0043] Step 1:
[0044] Users upload PDF files of contracts from their terminals to the server using the web interface of the business system.
[0045] Step 2:
[0046] The server receives the uploaded PDF file, performs a security check, and then saves the file to temporary storage.
[0047] Step 3:
[0048] The server activates the optical character recognition (OCR) system and extracts character information from the saved PDF file. During this process, image-based characters are converted into text data.
[0049] Step 4:
[0050] The server uses natural language processing to analyze important contract information from the text extracted by OCR processing, identifying, for example, the name of the contractor, the contract date, and the contract amount.
[0051] Step 5:
[0052] The server organizes the identified information and registers it in the contract database. Furthermore, it generates a reference link to the original PDF file and saves this link in the database as well.
[0053] Step 6:
[0054] The user uses the business system's search function to input conditions or keywords into the terminal to search for specific contract information.
[0055] Step 7:
[0056] The terminal activates a query generation mechanism based on user input and constructs a search query targeting the database.
[0057] Step 8:
[0058] The server receives the generated query, quickly searches the database, and retrieves relevant contract information and links.
[0059] Step 9:
[0060] The terminal receives search results from the server and displays the information to the user in a list format. This allows the user to quickly access the necessary contracts and information.
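Step 2 above mentions a security check followed by a save to temporary storage, without saying what the check consists of. The sketch below assumes two basic checks, a size limit and the `%PDF-` file header; the limit value and the function name `save_uploaded_pdf` are hypothetical, and a production system would likely add malware scanning and similar measures.

```python
import tempfile
from pathlib import Path

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # hypothetical size limit


def save_uploaded_pdf(data: bytes, temp_dir: str) -> Path:
    """Run basic checks on an upload, then write it to temporary storage."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds size limit")
    # PDF files begin with the marker "%PDF-" followed by a version number.
    if not data.startswith(b"%PDF-"):
        raise ValueError("file does not start with a PDF header")
    with tempfile.NamedTemporaryFile(
        dir=temp_dir, suffix=".pdf", delete=False
    ) as handle:
        handle.write(data)
        return Path(handle.name)
```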
[0061] (Example 1)
[0062] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0063] In today's business environment, managing electronic information requires considerable time and effort. In particular, quickly and accurately extracting, managing, and searching for necessary information from important documents such as contracts is difficult. Furthermore, there is a demand for improved information retrieval speed. Solving these challenges is essential to improving operational efficiency.
[0064] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0065] In this invention, the server includes recognition means for extracting character information from electronic information, language processing means for analyzing the extracted character information and identifying specific information, and management means for registering the analyzed information in a storage device and generating a reference to the original electronic information. This enables rapid and accurate management of electronic information and efficient retrieval of necessary information.
[0066] "Recognition means" refers to a device or program that has the function of automatically extracting textual information from electronic information.
[0067] "Language processing means" refers to a device or program that analyzes extracted character information and performs processing to identify specific information.
[0068] "Management means" refers to a device or program that has the function of systematically registering the analyzed information and generating a reference to the original electronic information.
[0069] A "storage device" is a physical or virtual device used to permanently or temporarily store information.
[0070] A "search query" is a set of conditions or words used to find specific information within a set of information.
[0071] "Information presentation means" refers to a device or program that has the function of presenting search results or necessary information to the user visually or audibly.
[0072] "Indexing" is the process of structuring and organizing information to improve the speed and efficiency of searching.
[0073] To implement this invention, a system is constructed in which a server, terminal, and user work together.
[0074] First, the user uploads electronic documents such as contracts from their terminal through the business system interface. The terminal then transfers these documents to the server using a secure communication protocol.
[0075] The server extracts text information from received electronic documents using optical character recognition (OCR) technology. Specifically, known OCR software such as ABBYY FineReader may be used. The extracted text information is then analyzed using natural language processing (NLP) technology. Here, NLP libraries such as spaCy or NLTK may be used. Through this analysis, information such as the contract number, contractor name, and contract date is automatically identified.
[0076] The analyzed information is registered in a storage device for information management, and a link to the original electronic document is generated. This link facilitates the referencing and tracking of the information. Furthermore, the information is indexed, improving search efficiency.
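The registration and link generation in [0076] can be sketched as follows. The link format, the `LINK_PREFIX` value, the table layout, and the function names are assumptions; the source only states that a link to the original document is generated and stored together with the analyzed information.

```python
import sqlite3
import uuid

# Hypothetical link prefix; the source does not specify the link format.
LINK_PREFIX = "https://docs.example.invalid/contracts"


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE contract_info ("
        " document_id TEXT PRIMARY KEY,"
        " contract_number TEXT,"
        " contractor_name TEXT,"
        " contract_date TEXT,"
        " document_link TEXT NOT NULL)"
    )


def register_contract(conn: sqlite3.Connection, fields: dict) -> str:
    """Store the analyzed fields with a link back to the original document."""
    document_id = uuid.uuid4().hex
    link = f"{LINK_PREFIX}/{document_id}"
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO contract_info VALUES (?, ?, ?, ?, ?)",
            (
                document_id,
                fields.get("contract_number"),
                fields.get("contractor_name"),
                fields.get("contract_date"),
                link,
            ),
        )
    return link
```

Writing the fields and the link in a single insert keeps the record and its reference consistent: either both are stored or neither is.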
[0077] Users can quickly find registered information using the system's search function. For example, by entering a prompt such as "Search for contracts concluded this year," relevant contract information can be quickly retrieved. The terminal generates an appropriate search query based on this prompt, and a database search is performed on the server.
[0078] This embodiment of the invention allows users to efficiently manage and retrieve electronic information. The generative AI model assists in generating prompt sentences to enable flexible information retrieval tailored to the user's needs.
[0079] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0080] Step 1:
[0081] The user uses the business system interface to select and upload electronic documents, such as contracts, from their terminal. This operation imports the documents into the terminal's system. Once the import is complete, the terminal sends the electronic document data it received as input to the server.
[0082] Step 2:
[0083] The server receives electronic documents sent from terminals and applies optical character recognition (OCR) technology. This process extracts text information from the electronic documents. The OCR software outputs image-based characters as machine-readable text data.
[0084] Step 3:
[0085] The server analyzes the text data extracted by OCR using natural language processing (NLP). Based on the input text information, it identifies items such as contract number, contractor name, contract date, and contract amount, and outputs them as structured data. By using an NLP library, the server performs the specific processing of automatically extracting the necessary information from the document.
[0086] Step 4:
[0087] The server registers the analyzed structured data in the information management system and generates a link to the original electronic document. This process stores the identified information in the database, and the link is added as part of the output, making it easier to access the information.
[0088] Step 5:
[0089] The user enters a prompt into the terminal to use the system to search for the necessary information. For example, they might enter the prompt "Search for contracts concluded in the past year" and submit it.
[0090] Step 6:
[0091] The terminal parses user input prompts and generates appropriate database search queries. Search conditions are generated based on the entered prompt text, and the results are output as queries for searching the database.
[0092] Step 7:
[0093] The server searches the database using the generated query. The process then retrieves relevant information from the database, and the search results, including the corresponding contract information and links, are aggregated.
[0094] Step 8:
[0095] The terminal displays search results sent from the server to the user. This allows the user to quickly obtain the necessary information and improve work efficiency. The displayed information includes links to relevant documents and summaries.
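The server-side half of the flow above (Steps 2 to 4: OCR, analysis, registration) can be pictured as a chain of three stages. In the sketch below each stage is passed in as a callable, so any OCR engine or NLP library can be plugged in; the names `ingest_document` and `IngestResult` and the stage signatures are assumptions, not part of the source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IngestResult:
    fields: dict  # structured data identified by the analysis stage
    link: str     # reference to the original electronic document


def ingest_document(
    document: bytes,
    run_ocr: Callable[[bytes], str],
    analyze_text: Callable[[str], dict],
    register: Callable[[dict, bytes], str],
) -> IngestResult:
    """Chain OCR, text analysis, and registration for one uploaded document."""
    text = run_ocr(document)               # Step 2: image-based characters to text
    fields = analyze_text(text)            # Step 3: text to structured data
    link = register(fields, document)      # Step 4: store data and get a link
    return IngestResult(fields=fields, link=link)
```

For a quick check, stub stages are enough, for example `ingest_document(b"scan", lambda d: "Contractor: Acme", lambda t: {"contractor_name": t.split(": ")[1]}, lambda f, d: "/docs/1")`.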
[0096] (Application Example 1)
[0097] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0098] Conventional electronic document management systems often require manual extraction and management of important information from documents such as contracts, resulting in time-consuming and labor-intensive processes. Furthermore, verifying and searching transaction records is cumbersome, hindering efficient business operations. Therefore, there is a need for a system that efficiently manages electronic contracts and allows users to easily access contract information and transaction history.
[0099] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0100] In this invention, the server includes recognition means for extracting character data from an electronic document, processing means for analyzing the extracted character data and identifying specific information, information control means for registering the analyzed information in a data storage and generating a reference to the original electronic document, and communication means for transmitting the information to a communication terminal so that users can easily check contract information and transaction records on their terminal devices. This automates the management of electronic contracts and enables users to efficiently access the information they need.
[0101] "Electronic documents" refer to documents created in digital format, such as contracts and transaction records.
[0102] "Character data" refers to individual characters and sets of characters extracted from electronic documents.
[0103] "Recognition means" refers to technologies and devices for extracting character data from electronic documents.
[0104] "Processing means" refers to software or algorithms used to analyze extracted character data and identify specific information.
[0105] A "data storage" refers to a database or storage system used to store analyzed information.
[0106] "Information control means" refers to a system that has the function of registering information in a data storage and generating a reference to the original electronic document.
[0107] "Communication means" refers to communication technologies and infrastructure that transmit information to users' terminal devices, enabling them to verify contract information and transaction records.
[0108] "Terminal devices" refer to devices that users use to access information, such as smartphones and computers.
[0109] To realize this invention, an efficient management system for electronic documents will be constructed. The server will use optical character recognition (OCR) technology to recognize electronic documents such as electronic contracts and transaction records. Specifically, it will use the Google® Cloud Vision API to extract text data and AWS® Lambda to run natural language processing (NLP) analysis on the extracted text. The analyzed data will be registered in AWS RDS (Relational Database Service) and stored along with a link to the original document.
[0110] For communication devices used by users, such as smartphones, an application is installed on the device to allow for quick access to contract information and transaction records. This application communicates with a server via API Gateway, generates queries, and presents the search results to the user as a graphical user interface (GUI). As a concrete example, a user can immediately retrieve relevant information by entering a prompt such as "Show me all transaction records from last year."
[0111] By using this system, users can automate the processing of contracts, access necessary information more quickly, and significantly improve operational efficiency.
[0112] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0113] Step 1:
[0114] The user uploads an electronic document using their device. Upon uploading, the document is sent to the cloud, and OCR (Optical Character Recognition) is activated. The input is an electronic document, and the output is text data.
[0115] Step 2:
[0116] The server uses OCR technology to extract character data from electronic documents. Here, the Google Cloud Vision API is used to identify characters from image data and obtain them as text data. The input is an electronic document, and the output is the extracted character data.
[0117] Step 3:
[0118] The server uses AWS Lambda to perform natural language processing on the extracted text data. Here, it identifies information such as the contract number, contractor name, and contract date. The input is text data, and the output is the parsed specific contract information.
[0119] Step 4:
[0120] The server registers the analyzed contract information in AWS RDS and generates a link to the original electronic document. This link is stored in the database and can be accessed as needed. The input is the analyzed contract information, and the output is the registered database record and the link information.
[0121] Step 5:
[0122] The user requests a search based on a prompt entered into the terminal. The terminal generates a search query from the prompt and sends it to the server. The input is the prompt, and the output is the generated search query.
[0123] Step 6:
[0124] The server searches the database within AWS RDS based on the received search query and collects relevant contract information. It then uses the indexed data to perform efficient searches. The input is the search query, and the output is the search results data.
[0125] Step 7:
[0126] The server sends search results to the terminal, which displays the results to the user via a GUI. The user can then review contract information and related links and perform necessary tasks. The input is the search results data, and the output is the information displayed to the user.
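Step 7 above passes search results from the server to the terminal for display. The source does not define the wire format, so the JSON payload shape, the key names, and both function names in this sketch are assumptions.

```python
import json


def encode_search_results(rows: list[dict]) -> str:
    """Server side: wrap result rows in a JSON payload."""
    payload = {"count": len(rows), "results": rows}
    return json.dumps(payload, ensure_ascii=False)


def decode_search_results(body: str) -> list[str]:
    """Terminal side: turn the payload into display lines for a list view."""
    payload = json.loads(body)
    return [
        f'{r["contract_number"]} | {r["contractor_name"]} | {r["document_link"]}'
        for r in payload["results"]
    ]
```

`ensure_ascii=False` keeps non-ASCII contractor names readable in the payload instead of escaping them.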
[0127] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0128] The present invention is a system that incorporates an emotion engine that analyzes user emotions and applies them to information retrieval and presentation. The system mainly consists of a server, a terminal, and the user.
[0129] First, the user uploads the contract and related documents as electronic data. The server extracts text information from the electronic documents, such as PDFs, using optical character recognition (OCR). This information is then analyzed by natural language processing to identify specific contract information. This identified information is registered in a database, and a link to the original document is also generated.
[0130] A distinctive feature of this system is the introduction of an emotion engine. When a user inputs data or interacts with the system, the server analyzes the user's emotions through the emotion engine based on the data and behavioral patterns obtained from the terminal. Based on this, the way search results are presented and the priority of information are dynamically adjusted, enabling the provision of information that is tailored to the user.
[0131] For example, if a user indicates an emotion requiring urgent attention, the server can change its priorities to quickly display relevant search results. Similarly, if the emotion engine determines that the user is relaxed, it can provide more detailed information than usual.
[0132] Furthermore, the emotion engine also suggests changes to the information presentation style and design on the device to ensure users can continue to use the interface without stress. This feature significantly improves the user experience and leads to smoother workflow.
[0133] For example, when a user enters "I urgently need the latest contract information," the emotion engine senses a high level of urgency. This prompts the server to quickly provide information, and relevant, important contract information is promptly displayed on the terminal. In this way, the present invention realizes an advanced information management system that responds to user needs through emotion estimation.
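The urgency example in [0133] can be sketched with a simple keyword check. The source attributes this judgment to an emotion engine built on the emotion identification model 59 and gives no internals, so the keyword list, the two urgency levels, and the presentation settings below are assumptions used only to show how an estimated emotion could change what is displayed.

```python
# Hypothetical cue words; a trained emotion model would replace this list.
URGENCY_TERMS = ("urgent", "immediately", "asap", "right away")


def estimate_urgency(prompt: str) -> str:
    """Return "high" when the prompt contains an urgency cue, else "normal"."""
    text = prompt.lower()
    return "high" if any(term in text for term in URGENCY_TERMS) else "normal"


def presentation_plan(urgency: str) -> dict:
    """Choose how many results to show and how much detail to include."""
    if urgency == "high":
        # Fewer, newest-first results with short summaries for fast reading.
        return {"max_results": 5, "detail": "summary"}
    # A relaxed user can be given more results and fuller detail.
    return {"max_results": 20, "detail": "full"}
```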
[0134] The following describes the processing flow.
[0135] Step 1:
[0136] Users upload PDF contract files from their terminals to the server via the business system interface. During this process, user actions and input data are used for initial analysis by the emotion engine.
[0137] Step 2:
[0138] The server receives the uploaded PDF file, activates the optical character recognition (OCR) system, and extracts character information from the electronic document. This character information is temporarily stored for later natural language processing.
[0139] Step 3:
[0140] The server uses natural language processing to analyze the text information extracted by OCR. This analysis identifies important information such as the contractor's name, contract date, and contract amount, and then initiates a process to register this information in a database.
[0141] Step 4:
[0142] The server uses information management tools to organize the analyzed information, generate links to the original electronic documents, and store them in a database. This process allows for quick access to the information when searching later.
[0143] Step 5:
[0144] The terminal activates the emotion engine and estimates the user's emotional state from the user's input speed and input content. Based on this data, it determines whether the user's current psychological state is high-stress or low-stress.
[0145] Step 6:
[0146] The user enters criteria on the interface to search for specific contract information. As the user enters the information, the system monitors the user's keyboard typing patterns and mouse movements.
[0147] Step 7:
[0148] The terminal uses a query generation mechanism to construct database queries based on the user's emotional state and search criteria. The priority of these queries is dynamically adjusted according to the user's emotions.
[0149] Step 8:
[0150] The server processes queries and searches the database based on the analysis results of the emotion engine. It efficiently retrieves search results and prioritizes information as needed.
[0151] Step 9:
[0152] The terminal receives search results from the server and quickly displays information tailored to the user's analyzed emotional state. Because the information is presented in a way that suits the user's state, the user can view and use it in the manner best suited to them.
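Steps 5 through 9 above can be illustrated with a minimal sketch. The source does not specify how the emotion engine scores its inputs, so the keyword set, the typing-speed threshold, the `SearchResult` fields, and the function names below are all assumptions introduced for illustration, not the disclosed implementation.

```python
from dataclasses import dataclass

# Hypothetical urgency keywords and typing-speed threshold; the source does
# not state how the emotion engine scores input content or input speed.
URGENT_WORDS = {"urgent", "urgently", "immediately", "asap"}
FAST_TYPING_CPS = 6.0  # characters per second, an assumed cutoff


@dataclass
class SearchResult:
    title: str
    importance: int  # higher means more important (assumed field)
    summary: str
    detail: str


def estimate_stress(text: str, chars_per_second: float) -> str:
    """Return "high" or "low" from input content and input speed."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    if words & URGENT_WORDS or chars_per_second >= FAST_TYPING_CPS:
        return "high"
    return "low"


def present(results: list[SearchResult], stress: str) -> list[str]:
    """Order results by importance; shorten the display under high stress."""
    ordered = sorted(results, key=lambda r: r.importance, reverse=True)
    if stress == "high":
        # Urgent user: top two items only, summaries only.
        return [f"{r.title}: {r.summary}" for r in ordered[:2]]
    # Relaxed user: every item, with full detail.
    return [f"{r.title}: {r.detail}" for r in ordered]


results = [
    SearchResult("Contract A", 1, "renewal due", "Renewal due 2025-03-31."),
    SearchResult("Contract B", 3, "signed last week", "Signed 2024-12-02."),
    SearchResult("Contract C", 2, "under review", "Legal review pending."),
]
stress = estimate_stress("I urgently need the latest contract information", 3.5)
lines = present(results, stress)
```

Here the word "urgently" alone is enough to select the high-stress branch, so only the two most important contracts are listed in summary form. A real emotion engine would presumably combine more signals than this two-rule check.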
[0153] (Example 2)
[0154] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0155] In the modern era, while information is rapidly digitizing, there is a lack of systems that can quickly and efficiently extract necessary information from electronic documents and provide it appropriately according to the user's situation and emotions. Furthermore, because there are no technological means to improve the user experience by presenting information while taking user emotions into consideration, problems such as stress and information overload are occurring.
[0156] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0157] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in an information recording unit and generating connection information to the original electronic document, and emotion analysis means for analyzing the user's emotions and adjusting the method of presenting information based on the results. This enables the user to always receive information in a form that suits their emotional state, thereby improving the accuracy of information acquisition and the user experience.
[0158] An "electronic document" is a file containing text data and image data stored in a format that can be read by computers and electronic devices.
[0159] "Textual information" refers to information expressed in text format, including data such as strings of characters and numbers.
[0160] "Optical character recognition means" refers to a device or program that reads characters from image data and converts them into text data.
[0161] "Natural language processing" refers to the technology used to analyze and understand natural language used by humans using computers.
[0162] "Information management means" refers to a mechanism or system for efficiently organizing, storing, retrieving, and managing information.
[0163] "Emotion analysis means" refers to a device or program that uses analytical techniques to estimate a user's emotions and psychological state from data.
[0164] The "information recording unit" is a data storage area that stores acquired information and arranges it in a way that allows access as needed.
[0165] A "search query generation means" is a function or process that automatically or manually generates inquiries for information retrieval.
[0166] "Information presentation means" refers to a method or device for presenting acquired information to the user in an easily understandable manner.
[0167] "Operation data collection means" refers to a mechanism or device that collects data related to the user's operation history and usage status.
[0168] "Systematization means" refers to a mechanism or method for structuring information and organizing it in an easily accessible format.
[0169] An "information classification and organization means" is a method or apparatus for classifying and organizing acquired information according to specific criteria.
[0170] "Information presentation adjustment means" refers to a function or process that dynamically adjusts the method of presenting information based on sentiment analysis and usage conditions.
[0171] This invention uses a system consisting of a user, a terminal, and a server to perform information analysis on electronic documents and provide information tailored to the user's emotions.
[0172] The user uploads contracts and related documents as electronic files to the terminal. The terminal then sends these electronic documents to the server.
[0173] The server plays a central role in processing received electronic documents. First, the server extracts text information from the document using optical character recognition (OCR) technology. This process uses common OCR tools such as Tesseract OCR. Next, the extracted text information is analyzed using natural language processing (NLP) technology. NLP processing uses libraries such as spaCy and NLTK to identify contract information and important text data.
[0174] The server registers the identified information in a database and generates a link to the original document. This database management makes it possible to quickly retrieve the necessary information.
[0175] Furthermore, the server incorporates an emotion analysis engine. This engine evaluates emotional data from the user's inputs and actions on the terminal. The server utilizes a generative AI model, built with a machine learning framework such as TensorFlow (registered trademark) or PyTorch, to analyze the user's emotional state (e.g., urgency, relaxation) and dynamically adjusts the way information is presented based on the obtained emotional information.
[0176] The terminal is responsible for retrieving information sent from the server and displaying it in a format suitable for the user. For example, if the user needs urgent information, relevant information will be displayed quickly and prioritized on the terminal. Conversely, if the user is relaxed, more detailed and comprehensive information will be provided. The terminal can also adjust the screen design and information presentation style according to the user's emotional state.
[0177] For example, if a user enters "I urgently need information about the latest contracts," the server performs sentiment analysis on this input and detects a high level of urgency. This prompts the server to quickly search its database and begin the process of displaying relevant and important contract information on the user's device. Examples of prompts for the AI model in this scenario include "Consider the user's urgency and prioritize displaying important information."
[0178] This invention allows users to efficiently obtain necessary information without experiencing stress. By adapting information delivery to the user's emotional state, it becomes possible to provide a more comfortable user experience.
[0179] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0180] Step 1:
[0181] The user uploads contracts and related documents as electronic data to the terminal. The input data is in PDF or image format. The terminal sends these files to the server. The server temporarily stores the received documents to pass them on to the next processing step. At this stage, the electronic documents are received as input and are ready for OCR processing.
[0182] Step 2:
[0183] The server performs optical character recognition (OCR) on the received electronic document. The input is the document file sent in step 1. Using software such as Tesseract OCR, the server extracts character information from the image data and converts it into text format. This OCR process generates string data from the document as output.
[0184] Step 3:
[0185] The server analyzes the character information extracted by OCR using natural language processing (NLP) techniques. The input data is the string data obtained in the previous step. The server uses spaCy or NLTK to identify contract-related information in the document (e.g., contractor name, date, conditions, etc.) and organize it as structured data. The output of this analysis is the identified information, ready for registration in the database.
[0186] Step 4:
[0187] The server registers the analyzed information in the database. The input is the structured data identified in step 3. Based on this information, the server also generates a link to the original document and saves it in the database. In this step, the registered data is generated as output and becomes accessible for subsequent search queries.
[0188] Step 5:
[0189] The terminal records user input and actions, collecting data for emotion analysis. The input consists of the user's operation history and actual interactions. The terminal sends this data to the server, where it serves as the basis for emotion analysis. At this stage, user behavior data is obtained as output.
[0190] Step 6:
[0191] The server analyzes data sent from the terminal using an emotion analysis engine to estimate the user's emotional state. The input is the user's behavioral data collected in step 5. Using a generative AI model built with a framework such as TensorFlow or PyTorch, the server identifies the user's emotions (such as the degree of urgency or relaxation) and generates analysis results. These results are output as emotion data used to adjust the information presented.
[0192] Step 7:
[0193] The server adjusts how information is presented based on the sentiment analysis results. The input is the sentiment data obtained in step 6. The server prioritizes and modifies the display method to provide information optimized for the user. For example, if the user indicates urgency, relevant information is prioritized and presented quickly. The adjusted information is then presented to the user as output.
[0194] Step 8:
[0195] The terminal receives information provided by the server and displays it to the user in the way best suited to the user. The input is the information adjusted in step 7. The terminal dynamically changes the screen layout and presentation style according to the user's emotions. For example, if the user is determined to be relaxed, detailed information is displayed. In this step, the information provided to the user is finalized as the final output.
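As an illustration of step 3 above, the following sketch turns OCR output into structured contract data. The source names spaCy and NLTK for this step; plain regular expressions are substituted here so that the sketch runs without external models. The field labels ("Contractor:", "Date:", "Amount:"), the sample text, and the function name are assumptions about the document layout, not part of the disclosure.

```python
import re

# Hypothetical patterns for a simple labeled contract layout. Regular
# expressions stand in for the spaCy/NLTK analysis named in the source.
PATTERNS = {
    "contractor_name": re.compile(r"Contractor:\s*(.+)"),
    "contract_date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    "contract_amount": re.compile(r"Amount:\s*([\d,]+)"),
}


def extract_contract_fields(ocr_text: str) -> dict:
    """Turn OCR output (a plain string) into structured contract data."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        # Missing fields are recorded as None rather than guessed.
        fields[name] = match.group(1).strip() if match else None
    if fields["contract_amount"] is not None:
        # Normalize "1,200,000" to an integer for later comparison.
        fields["contract_amount"] = int(fields["contract_amount"].replace(",", ""))
    return fields


ocr_text = """SERVICE AGREEMENT
Contractor: Example Trading Co.
Date: 2024-12-09
Amount: 1,200,000 yen
"""
record = extract_contract_fields(ocr_text)
```

The resulting dictionary corresponds to the structured data that step 4 registers in the database. Real contracts rarely carry such regular labels, which is presumably why the source relies on NLP libraries rather than fixed patterns.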
[0196] (Application Example 2)
[0197] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0198] Conventional electronic document management systems have been unable to dynamically present information in response to the user's emotional state, limiting the improvement of the user experience. In particular, in the field of electronic payments, optimizing information based on the user's emotions is required, but current systems are not adequately capable of doing so.
[0199] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0200] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in a database and generating a link to the original electronic document, and emotion analysis means for analyzing the user's emotional state and dynamically adjusting the method of presenting information. This enables optimal presentation of information according to the user's emotional state and smooth information provision in electronic payments.
[0201] An "optical character recognition means" is a device that extracts character information from electronic documents.
[0202] "Natural language processing means" refers to technology that analyzes extracted textual information and identifies information that aligns with the user's intent.
[0203] "Information management means" refers to a technological system that registers analyzed information in a database and generates links to facilitate access to the original electronic document.
[0204] An "emotion analysis means" is a technological system that analyzes the user's emotional state from their input and actions, and dynamically adjusts the information presented based on that analysis.
[0205] A "query generation mechanism" is a system that automatically creates appropriate database queries based on user input.
[0206] "Information presentation means" refers to technologies that display search results to users visually or by other means, and that display them in a way that is appropriate to the user's emotional state.
[0207] An "indexing method" is a technique for organizing electronic documents and related information so that they can be efficiently searched.
[0208] A "data organization method" is a technical system that arranges extracted information according to certain rules and formats, making it available for use.
[0209] "Payment support methods" are technologies that optimize payment information and options according to the user's emotions, thereby supporting a smooth payment process.
[0210] The system that realizes this invention will be built as a smartphone application to support electronic payments. At the core of this system is a server with an integrated emotion analysis engine that analyzes user input and behavior and provides information according to the user's emotional state.
[0211] The server first receives data sent by the user through the application. This data includes purchase information as electronic documents and user interaction logs. Using optical character recognition and natural language processing, the server extracts useful character information from this data and identifies the desired information. The analyzed information is registered in a database, and a link to the original document is also generated as needed.
[0212] Furthermore, the server uses a generative AI model, an emotion analysis engine, to recognize the user's emotional state in real time. This engine analyzes emotions from the user's input patterns and past data, and can dynamically adjust the priority and method of information presentation. For example, if the user indicates an urgent need, it can immediately present concise information and simplified payment options.
[0213] On the terminal, information is presented using search results and payment options. The interface design and amount of information are appropriately adjusted according to the user's emotional state, ensuring a stress-free experience. For example, in a situation where a user is about to miss a train and needs to quickly purchase a ticket, the simplest payment method is presented, allowing for a swift completion of the transaction.
[0214] An example of a prompt for a generative AI model is, "How can we offer the best payment method when the user is in a hurry?" This prompt serves as a guide to improve the output of the emotion analysis engine and helps to enhance the user experience.
[0215] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0216] Step 1:
[0217] The user accesses an electronic payment application using their smartphone and selects the items they wish to purchase. The input includes the item selection and purchase intent. The terminal prepares to send this information to the server. The selected items are sent to the server as output.
[0218] Step 2:
[0219] The server analyzes the received user data using optical character recognition and natural language processing. The input includes purchase intent and product information sent by the user. The server extracts useful character information from this data and identifies specific purchase information. The identified information is generated as output and registered in the database.
[0220] Step 3:
[0221] The server uses a generative AI model, the emotion analysis engine, to analyze the user's emotional state in real time. Inputs include the user's behavioral patterns and past interaction data. Based on this data, the server determines whether the user is in an urgent situation or relaxed. The output is an evaluation of the emotional state.
[0222] Step 4:
[0223] The server generates the optimal payment option through information presentation methods based on the results of sentiment analysis. The input is the evaluation result of the emotional state. Based on the evaluation, the server dynamically adjusts the priority and method of information presentation. In the case of an emotion indicating urgency, an intuitive and rapid payment procedure is selected and presented as the output.
[0224] Step 5:
[0225] The terminal displays optimized payment information and options sent from the server to the user. The input consists of payment options and information generated by the server. The terminal displays this information in a user-friendly format and prompts the user to take action. The output is a user-friendly presentation of information.
[0226] Step 6:
[0227] The user reviews the presented payment options and completes the final purchase confirmation process. The input is the information displayed on the terminal. The user's actions initiate the final payment process, and the output is the completion of the purchase.
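Step 4 of this flow, in which the server selects payment options according to the evaluated emotional state, might look like the following minimal sketch. The `PaymentOption` type, the `steps` metric (number of user actions needed to pay), the emotion labels, and the sample options are assumptions for illustration; the source does not define how options are ranked.

```python
from dataclasses import dataclass


@dataclass
class PaymentOption:
    name: str
    steps: int  # user actions needed to complete payment (assumed metric)


def choose_options(options: list[PaymentOption], emotion: str) -> list[PaymentOption]:
    """Return the options to display for the evaluated emotional state."""
    by_speed = sorted(options, key=lambda o: o.steps)
    if emotion == "urgent":
        # Urgent user: offer only the quickest procedure.
        return by_speed[:1]
    # Relaxed user: show every option, quickest first.
    return by_speed


options = [
    PaymentOption("bank transfer", 5),
    PaymentOption("saved card, one tap", 1),
    PaymentOption("QR code", 3),
]
urgent_view = choose_options(options, "urgent")
relaxed_view = choose_options(options, "relaxed")
```

In the train-ticket scenario described earlier, the urgent branch would reduce the screen to the single fastest method, while a relaxed user would still see all three.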
[0228] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0229] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.
[0230] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0231] [Second Embodiment]
[0232] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0233] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0234] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0235] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0236] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0237] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0238] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0239] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0240] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0241] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0242] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0243] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0244] The system of the present invention consists of three components: a server, a terminal, and a user. The specific operation of each component is described below.
[0245] First, the user uploads contracts and other related electronic documents through the business system interface. The uploaded documents are sent from the terminal to the server. The server receives these documents and efficiently extracts character information from the electronic documents using optical character recognition (OCR).
[0246] The server then analyzes the extracted text information using natural language processing to identify important information such as the contract number, contractor name, contract date, and contract amount. This information is organized using information management tools and registered in the contract database. At the same time, a link to the original electronic document is generated and stored in the database.
[0247] Users can then search for necessary contract information through this system. When a user enters a concise, plainly worded search prompt into the terminal, the terminal uses a query generation mechanism to generate a query to search the database based on the user's input. The server quickly searches the database in response to the query and collects the relevant information.
[0248] Search results are presented to the user via the terminal. These results include links to relevant documents and summaries of necessary contractual information. This allows users to quickly access the information they need and proceed with their work efficiently.
[0249] In this system, the server uses indexing mechanisms to structurally organize information and improve the speed of search result retrieval. Furthermore, data organization mechanisms organize documents based on the extracted data, thereby improving operational efficiency.
[0250] For example, if a user enters a prompt such as "search for contracts concluded in the past year," the terminal will receive this prompt and generate an appropriate query. The server will process the query, retrieve the relevant information from the database, and return the search results to the user. In this way, the present invention provides a means to significantly streamline the management of electronic documents.
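The query generation described above can be sketched as a mapping from a prompt to a parameterized SQL query. The table name `contracts`, the column `contract_date`, the two recognized phrases, and the function name are hypothetical; the source does not describe the schema or the rules the query generation mechanism applies.

```python
from datetime import date, timedelta


def generate_query(prompt: str, today: date) -> tuple[str, tuple]:
    """Map a search prompt to a parameterized SQL query (hypothetical schema)."""
    text = prompt.lower()
    if "past year" in text:
        start = today - timedelta(days=365)
    elif "this year" in text:
        start = date(today.year, 1, 1)
    else:
        # No recognized date condition: return all contracts, newest first.
        return "SELECT * FROM contracts ORDER BY contract_date DESC", ()
    sql = (
        "SELECT * FROM contracts "
        "WHERE contract_date BETWEEN ? AND ? "
        "ORDER BY contract_date DESC"
    )
    # Values are passed as parameters rather than formatted into the SQL text.
    return sql, (start.isoformat(), today.isoformat())


sql, params = generate_query(
    "search for contracts concluded in the past year", date(2024, 12, 9)
)
```

Passing the dates as parameters instead of interpolating them into the string keeps user-derived text out of the SQL itself, which seems a reasonable default for a mechanism that builds queries from free-form prompts.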
[0251] The following describes the processing flow.
[0252] Step 1:
[0253] Users upload PDF files of contracts from their terminals to the server using the web interface of the business system.
[0254] Step 2:
[0255] The server receives the uploaded PDF file, performs a security check, and then saves the file to temporary storage.
[0256] Step 3:
[0257] The server activates the optical character recognition (OCR) system and extracts character information from the saved PDF file. During this process, image-based characters are converted into text data.
[0258] Step 4:
[0259] The server uses natural language processing to analyze important contract information from the text extracted by OCR processing, identifying, for example, the name of the contractor, the contract date, and the contract amount.
[0260] Step 5:
[0261] The server organizes the identified information and registers it in the contract database. Furthermore, it generates a reference link to the original PDF file and saves this link in the database as well.
[0262] Step 6:
[0263] The user uses the business system's search function to input conditions or keywords into the terminal to search for specific contract information.
[0264] Step 7:
[0265] The terminal activates a query generation mechanism based on user input and constructs a search query targeting the database.
[0266] Step 8:
[0267] The server receives the generated query, quickly searches the database, and retrieves relevant contract information and links.
[0268] Step 9:
[0269] The terminal receives search results from the server and displays the information to the user in a list format. This allows the user to quickly access the necessary contracts and information.
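Steps 5 and 8 above (registering identified information with a reference link, then searching it) can be sketched with Python's standard `sqlite3` module. The table layout, the `file://` link format, the upload paths, and the function names are assumptions for illustration; the source does not specify the database product or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    """
    CREATE TABLE contracts (
        id INTEGER PRIMARY KEY,
        contractor_name TEXT,
        contract_date TEXT,       -- ISO 8601 date string
        contract_amount INTEGER,
        source_link TEXT          -- reference link to the original PDF
    )
    """
)
# An index on the date column is one way to speed up date-range searches.
conn.execute("CREATE INDEX idx_contract_date ON contracts (contract_date)")


def register(fields: dict, pdf_path: str) -> int:
    """Store identified fields together with a link to the source file."""
    cur = conn.execute(
        "INSERT INTO contracts "
        "(contractor_name, contract_date, contract_amount, source_link) "
        "VALUES (?, ?, ?, ?)",
        (
            fields["contractor_name"],
            fields["contract_date"],
            fields["contract_amount"],
            f"file://{pdf_path}",
        ),
    )
    conn.commit()
    return cur.lastrowid


def search_by_date(start: str, end: str) -> list:
    """Return (contractor name, link) pairs for contracts in the date range."""
    cur = conn.execute(
        "SELECT contractor_name, source_link FROM contracts "
        "WHERE contract_date BETWEEN ? AND ? ORDER BY contract_date DESC",
        (start, end),
    )
    return cur.fetchall()


register(
    {"contractor_name": "Example Trading Co.", "contract_date": "2024-12-09",
     "contract_amount": 1200000},
    "/uploads/contract_001.pdf",
)
register(
    {"contractor_name": "Sample Ltd.", "contract_date": "2022-05-01",
     "contract_amount": 300000},
    "/uploads/contract_002.pdf",
)
rows = search_by_date("2024-01-01", "2024-12-31")
```

Each returned row pairs the contract information with the link to its original file, which matches the list-format display described in step 9.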
[0270] (Example 1)
[0271] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0272] In today's business environment, managing electronic information requires considerable time and effort. In particular, quickly and accurately extracting, managing, and searching for necessary information from important documents such as contracts is difficult. Furthermore, there is a demand for improved information retrieval speed. Solving these challenges is essential to improving operational efficiency.
[0273] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0274] In this invention, the server includes recognition means for extracting character information from electronic information, language processing means for analyzing the extracted character information and identifying specific information, and management means for registering the analyzed information in a storage device and generating a reference to the original electronic information. This enables rapid and accurate management of electronic information and efficient retrieval of necessary information.
[0275] "Recognition means" refers to a device or program that has the function of automatically extracting textual information from electronic information.
[0276] The "language processing means" is a device or program that analyzes the extracted character information and performs processing for identifying specific information.
[0277] The "management means" is a device or program having a function to systematically register the analyzed information and generate a reference to the original electronic information.
[0278] The "storage device" is a physical or virtual device for permanently or temporarily holding information.
[0279] The "search query" is a set of conditions or phrases used to search for specific information from an information collection.
[0280] The "information presentation means" is a device or program having a function to visually or aurally present search results or necessary information to a user.
[0281] "Indexing" is a process for structuring and organizing information to improve the speed and efficiency of searching.
[0282] To implement this invention, a system in which a server, a terminal, and a user operate in cooperation is constructed.
[0283] First, the user uploads an electronic document such as a contract from the terminal through the interface of the business system. The terminal transfers these documents to the server using a secure communication protocol.
[0284] The server extracts character information from the received electronic document using optical character recognition (OCR) technology. Specifically, it is conceivable to use known OCR software such as ABBYY FineReader. The extracted character information is then analyzed using natural language processing (NLP) technology. Here, NLP libraries such as spaCy and NLTK may be utilized. Through this analysis, information regarding the contract number, the name of the contractor, and the contract date is automatically identified.
[0285] The analyzed information is registered in a storage device for information management, and a link to the original electronic document is generated. This link is for facilitating the reference and tracking of information. Also, the information is indexed, improving the efficiency of searching.
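The registration, link generation, and indexing described above may be sketched, for example, with an in-memory SQLite database. The table name `contracts`, its columns, the `/documents/` link format, and the function names are hypothetical and are chosen only to make the sketch runnable; the disclosure does not prescribe a schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical contracts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE contracts (
               contract_number TEXT PRIMARY KEY,
               contractor_name TEXT NOT NULL,
               contract_date   TEXT NOT NULL,
               source_link     TEXT NOT NULL
           )"""
    )
    # Index the date column so that date-range searches avoid a full scan.
    conn.execute("CREATE INDEX idx_contract_date ON contracts (contract_date)")
    return conn


def register(conn: sqlite3.Connection, record: dict[str, str], file_name: str) -> str:
    """Store the analyzed record and return a link to the original document."""
    link = f"/documents/{file_name}"  # hypothetical link format
    conn.execute(
        "INSERT INTO contracts VALUES (?, ?, ?, ?)",
        (
            record["contract_number"],
            record["contractor_name"],
            record["contract_date"],  # ISO 8601 date string
            link,
        ),
    )
    conn.commit()
    return link


conn = open_store()
link = register(
    conn,
    {
        "contract_number": "C-2024-001",
        "contractor_name": "Example Corp",
        "contract_date": "2024-12-09",
    },
    "c-2024-001.pdf",
)
```

Storing dates as ISO 8601 strings keeps lexical order equal to chronological order, so the index can serve range conditions directly.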
[0286] The user can quickly find the registered information by using the search function of the system. As a specific example, by inputting a prompt such as "Search for contracts concluded this year", relevant contract information can be quickly obtained. The terminal generates an appropriate search query based on this prompt, and a database search is performed on the server.
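The generation of a search query from a prompt may be sketched as follows. The sketch uses simple keyword matching as a stand-in for the generative AI model; the function name `prompt_to_query`, the supported phrases, and the table and column names in the SQL text are assumptions for illustration.

```python
from datetime import date


def prompt_to_query(prompt: str, today: date) -> tuple[str, tuple[str, str]]:
    """Translate a supported prompt into a parameterized date-range query."""
    text = prompt.lower()
    if "this year" in text:
        year = today.year
    elif "last year" in text:
        year = today.year - 1
    else:
        raise ValueError(f"unsupported prompt: {prompt!r}")
    start, end = date(year, 1, 1), date(year, 12, 31)
    # Hypothetical schema: a contracts table with an ISO 8601 contract_date column.
    sql = (
        "SELECT contract_number, contractor_name, source_link "
        "FROM contracts WHERE contract_date BETWEEN ? AND ? "
        "ORDER BY contract_date"
    )
    return sql, (start.isoformat(), end.isoformat())


sql, params = prompt_to_query(
    "Search for contracts concluded this year", date(2024, 12, 9)
)
```

Returning the SQL text together with separate parameters, rather than interpolating values into the string, keeps user-derived input out of the query text.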
[0287] With this aspect of the invention, the user can efficiently manage and search for electronic information. The generative AI model assists in generating prompt sentences for realizing flexible information search according to the user's needs.
[0288] The flow of the specific process in Example 1 will be described using Figure 11.
[0289] Step 1:
[0290] The user uses the interface of the business system to select an electronic document such as a contract from the terminal and upload it. By this operation, the document is taken into the system of the terminal. When the capture is completed, the terminal sends the electronic document data received as input to the server.
[0291] Step 2:
[0292] The server receives the electronic document sent from the terminal and applies optical character recognition (OCR) technology. By this process, text information is extracted from the electronic document. Through the operation of the OCR software, the characters in image format are output as machine-readable text data.
[0293] Step 3:
[0294] The server analyzes the text data extracted by OCR using natural language processing (NLP). Based on the input text information, it identifies items such as contract number, contractor name, contract date, and contract amount, and outputs them as structured data. By using an NLP library, the server performs the specific processing of automatically extracting the necessary information from the document.
[0295] Step 4:
[0296] The server registers the analyzed structured data in the information management system and generates a link to the original electronic document. This process stores the identified information in the database, and the link is added as part of the output, making it easier to access the information.
[0297] Step 5:
[0298] The user enters a prompt into the terminal to search the system for the necessary information. For example, the user might enter the prompt "Search for contracts concluded in the past year" and submit it.
[0299] Step 6:
[0300] The terminal parses user input prompts and generates appropriate database search queries. Search conditions are generated based on the entered prompt text, and the results are output as queries for searching the database.
[0301] Step 7:
[0302] The server searches the database using the generated query. The process then retrieves relevant information from the database, and the search results, including the corresponding contract information and links, are aggregated.
[0303] Step 8:
[0304] The terminal presents the search results sent from the server to the user. As a result, the user can quickly obtain the necessary information and improve the efficiency of their work. The presented information includes links to relevant documents and summaries.
[0305] (Application Example 1)
[0306] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".
[0307] In conventional electronic document management systems, the extraction and management of important information such as contracts are often done manually, which is time-consuming and labor-intensive. There are also problems such as the confirmation and search of transaction records being cumbersome and it being difficult to perform efficient business operations. Therefore, there is a need to provide a system that can efficiently manage electronic contracts and allows users to easily check contract information and transaction histories.
[0308] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0309] In this invention, the server includes a recognition means for extracting character data from an electronic document, a processing means for analyzing the extracted character data to identify specific information, an information control means for registering the analyzed information in a data repository and generating a reference to the original electronic document, and a communication means for transmitting the information to a communication terminal so that the user can easily check contract information and transaction records on the terminal device. This automates the management of electronic contracts and enables users to efficiently access the necessary information.
[0310] "Electronic document" refers to a document created in digital form such as a contract or a transaction record.
[0311] "Character data" refers to the individual characters extracted from an electronic document and to collections of those characters.
[0312] "Recognition means" refers to technologies and devices for extracting character data from electronic documents.
[0313] "Processing means" refers to software or algorithms used to analyze extracted character data and identify specific information.
[0314] A "data repository" refers to a database or storage system used to store analyzed information.
[0315] "Information control means" refers to a system that has the function of registering information in the data repository and generating a reference to the original electronic document.
[0316] "Communication means" refers to communication technologies and infrastructure that transmit information to users' terminal devices, enabling them to verify contract information and transaction records.
[0317] "Terminal devices" refer to devices that users use to access information, such as smartphones and computers.
[0318] To realize this invention, an efficient management system for electronic documents will be constructed. The server will use optical character recognition (OCR) technology to recognize electronic documents such as electronic contracts and transaction records. Specifically, it will use the Google Cloud Vision API to extract text data and AWS Lambda to analyze the information for natural language processing (NLP). The analyzed data will be registered in AWS RDS (relational database service) and stored along with a link to the original document.
[0319] For communication devices used by users, such as smartphones, an application is installed on the device to allow for quick access to contract information and transaction records. This application communicates with a server via API Gateway, generates queries, and presents the search results to the user as a graphical user interface (GUI). As a concrete example, a user can immediately retrieve relevant information by entering a prompt such as "Show me all transaction records from last year."
[0320] By using this system, users can automate the processing of contracts, access necessary information more quickly, and significantly improve operational efficiency.
[0321] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0322] Step 1:
[0323] The user uploads an electronic document using their device. Upon uploading, the document is sent to the cloud, and OCR (Optical Character Recognition) is activated. The input is an electronic document, and the output is text data.
[0324] Step 2:
[0325] The server uses OCR technology to extract character data from electronic documents. Here, the Google Cloud Vision API is used to identify characters from image data and obtain them as text data. The input is an electronic document, and the output is the extracted character data.
[0326] Step 3:
[0327] The server uses AWS Lambda to perform natural language processing on the extracted text data. Here, it identifies information such as the contract number, contract holder name, and contract date. The input is text data, and the output is the parsed specific contract information.
[0328] Step 4:
[0329] The server registers the analyzed contract information in AWS RDS and generates a link to the original electronic document. This link is stored in the database and can be accessed as needed. The input is the analyzed contract information, and the output is the database and link information.
[0330] Step 5:
[0331] The user requests a search based on a prompt entered into the terminal. The terminal generates a search query from the prompt and sends it to the server. The input is the prompt, and the output is the generated search query.
[0332] Step 6:
[0333] The server searches the database within AWS RDS based on the received search query, using the indexed data to search efficiently, and collects the relevant contract information. The input is the search query, and the output is the search results data.
[0334] Step 7:
[0335] The server sends search results to the terminal, which displays the results to the user via a GUI. The user can then review contract information and related links and perform necessary tasks. The input is the search results data, and the output is the information displayed to the user.
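The input and output of Steps 2 to 4 above may be sketched as a small typed pipeline. The OCR and analysis stages are passed in as functions so that the sketch does not depend on any particular cloud service; the class names, the function name `run_ingest`, the `documents/` link format, and the stand-in stage functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContractInfo:
    contract_number: str
    contract_holder: str
    contract_date: str


@dataclass
class StoredRecord:
    info: ContractInfo
    link: str  # reference to the original electronic document


def run_ingest(
    document: bytes,
    file_name: str,
    ocr: Callable[[bytes], str],
    analyze: Callable[[str], ContractInfo],
    repository: list[StoredRecord],
) -> StoredRecord:
    """Run Steps 2-4: document -> text -> contract info -> stored record."""
    text = ocr(document)  # Step 2: electronic document in, text data out
    info = analyze(text)  # Step 3: text data in, contract information out
    record = StoredRecord(info, link=f"documents/{file_name}")  # Step 4
    repository.append(record)
    return record


# Stand-in stages for demonstration only; a real system would call an OCR
# service and an NLP component here.
def fake_ocr(document: bytes) -> str:
    return document.decode("utf-8")


def fake_analyze(text: str) -> ContractInfo:
    number, holder, day = text.split(",")
    return ContractInfo(number, holder, day)


repository: list[StoredRecord] = []
record = run_ingest(
    b"C-2024-001,Example Corp,2024-12-09",
    "c-2024-001.pdf",
    fake_ocr,
    fake_analyze,
    repository,
)
```

Injecting the stages as callables is a design choice of this sketch; it allows each step to be replaced or tested independently.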
[0336] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0337] The present invention is a system that incorporates an emotion engine that analyzes user emotions and applies them to information retrieval and presentation. The system mainly consists of a server, a terminal, and the user.
[0338] First, the user uploads the contract and related documents as electronic data. The server extracts text information from the electronic documents, such as PDFs, using optical character recognition (OCR). This information is then analyzed by natural language processing to identify specific contract information. This identified information is registered in a database, and a link to the original document is also generated.
[0339] A distinctive feature of this system is the introduction of an emotion engine. When a user inputs data or interacts with the system, the server analyzes the user's emotions through the emotion engine based on the data and behavioral patterns obtained from the terminal. Based on this, the way search results are presented and the priority of information are dynamically adjusted, enabling the provision of information that is tailored to the user.
[0340] For example, if a user indicates an emotion requiring urgent attention, the server can change its priorities to quickly display relevant search results. Similarly, if the emotion engine determines that the user is relaxed, it can provide more detailed information than usual.
[0341] Furthermore, the emotion engine also suggests changes to the information presentation style and design on the device to ensure users can continue to use the interface without stress. This feature significantly improves the user experience and leads to smoother workflow.
[0342] For example, when a user enters "I urgently need the latest contract information," the emotion engine senses a high level of urgency. This prompts the server to quickly provide information, and relevant, important contract information is promptly displayed on the terminal. In this way, the present invention realizes an advanced information management system that responds to user needs through emotion estimation.
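The urgency-dependent prioritization in the example above may be sketched as follows. Keyword matching is used here as a simple stand-in for the emotion engine; the cue words, the function names, the result fields, and the limit of three results are assumptions for illustration.

```python
# Hypothetical cue words; an actual emotion engine would use a trained model.
URGENCY_CUES = ("urgent", "immediately", "asap", "right away")


def is_urgent(message: str) -> bool:
    """Return True if the message contains any urgency cue."""
    text = message.lower()
    return any(cue in text for cue in URGENCY_CUES)


def prioritize(results: list[dict], urgent: bool, limit: int = 3) -> list[dict]:
    """Sort newest first; when urgent, keep only the first few results."""
    ordered = sorted(results, key=lambda r: r["contract_date"], reverse=True)
    return ordered[:limit] if urgent else ordered


# Hypothetical search results (ISO 8601 dates sort chronologically as strings)
results = [
    {"contract_number": "C-1", "contract_date": "2023-05-01"},
    {"contract_number": "C-2", "contract_date": "2024-11-30"},
    {"contract_number": "C-3", "contract_date": "2024-01-15"},
    {"contract_number": "C-4", "contract_date": "2022-08-20"},
]
urgent = is_urgent("I urgently need the latest contract information")
shown = prioritize(results, urgent)
```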
[0343] The following describes the processing flow.
[0344] Step 1:
[0345] Users upload PDF contract files from their terminals to the server via the business system interface. During this process, user actions and input data are used for initial analysis by the emotion engine.
[0346] Step 2:
[0347] The server receives the uploaded PDF file, activates the optical character recognition (OCR) system, and extracts character information from the electronic document. This character information is temporarily stored for later natural language processing.
[0348] Step 3:
[0349] The server uses natural language processing to analyze the text information extracted by OCR. This analysis identifies important information such as the contractor's name, contract date, and contract amount, and then initiates a process to register this information in a database.
[0350] Step 4:
[0351] The server uses information management tools to organize the analyzed information, generate links to the original electronic documents, and store them in a database. This process allows for quick access to the information when searching later.
[0352] Step 5:
[0353] The device activates an emotion engine and determines the user's emotional state based on their input speed and content. Based on this data, it analyzes whether the user's current psychological state is high-stress or low-stress.
[0354] Step 6:
[0355] The user enters criteria on the interface to search for specific contract information. As the user enters the information, the system monitors the user's keyboard typing patterns and mouse movements.
[0356] Step 7:
[0357] The terminal uses a query generation mechanism to construct database queries based on the user's emotional state and search criteria. The priority of these queries is dynamically adjusted according to the user's emotions.
[0358] Step 8:
[0359] The server processes queries and searches the database based on the analysis results of the emotion engine. It efficiently retrieves search results and prioritizes information as needed.
[0360] Step 9:
[0361] The device receives search results from the server and quickly displays information tailored to the user's analyzed emotional state. Because the information is presented in a way that suits the user's state, the user can view and use it in the manner best suited to that state.
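The determination in Step 5 of whether the user is in a high-stress or low-stress state from input speed and content may be sketched as follows. The 0.15-second threshold, the cue words, and the function name `classify_stress` are arbitrary assumptions for illustration; the disclosure does not specify how the emotion engine makes this determination.

```python
from statistics import mean

# Hypothetical cue words indicating pressure in the input content.
STRESS_CUES = ("urgent", "hurry", "asap")


def classify_stress(
    key_times: list[float], text: str, fast_interval: float = 0.15
) -> str:
    """Classify the user's state from key-press timestamps (seconds) and text."""
    intervals = [later - earlier for earlier, later in zip(key_times, key_times[1:])]
    # Rapid typing is treated as one possible sign of stress (an assumption).
    fast_typing = bool(intervals) and mean(intervals) < fast_interval
    stressed_wording = any(cue in text.lower() for cue in STRESS_CUES)
    return "high-stress" if fast_typing or stressed_wording else "low-stress"
```

For instance, key presses 0.1 seconds apart fall below the assumed threshold and are classified as "high-stress", while presses 0.5 seconds apart with neutral wording are classified as "low-stress".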
[0362] (Example 2)
[0363] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0364] In the modern era, while information is rapidly digitizing, there is a lack of systems that can quickly and efficiently extract necessary information from electronic documents and provide it appropriately according to the user's situation and emotions. Furthermore, because there are no technological means to improve the user experience by presenting information while taking user emotions into consideration, problems such as stress and information overload are occurring.
[0365] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0366] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in an information recording unit and generating connection information to the original electronic document, and emotion analysis means for analyzing the user's emotions and adjusting the method of presenting information based on the results. This enables the user to always receive information in a form that suits their emotional state, thereby improving the accuracy of information acquisition and the user experience.
[0367] An "electronic document" is a file containing text data and image data stored in a format that can be read by computers and electronic devices.
[0368] "Character information" refers to information expressed in text format, including data such as strings of characters and numbers.
[0369] "Optical character recognition means" refers to a device or program that reads characters from image data and converts them into text data.
[0370] "Natural language processing means" refers to technology that uses a computer to analyze and understand the natural language used by humans.
[0371] "Information management means" refers to a mechanism or system for efficiently organizing, storing, retrieving, and managing information.
[0372] "Emotion analysis means" refers to an analytical method that uses techniques to estimate a user's emotions and psychological state from data.
[0373] The "information recording unit" is a data storage area that stores acquired information and arranges it in a way that allows access as needed.
[0374] A "search query generation means" is a function or process that automatically or manually generates inquiries for information retrieval.
[0375] "Information presentation means" refers to a method or device for presenting acquired information to the user in an easily understandable manner.
[0376] "Operation data collection means" refers to a mechanism or device that collects data related to the user's operation history and usage status.
[0377] "Systematization means" refers to a mechanism or method for structuring information and organizing it in an easily accessible format.
[0378] An "information classification and organization means" is a method or apparatus for classifying and organizing acquired information according to specific criteria.
[0379] "Information presentation adjustment means" refers to a function or process that dynamically adjusts the method of presenting information based on sentiment analysis and usage conditions.
[0380] This invention uses a system consisting of a user, a terminal, and a server to perform information analysis on electronic documents and provide information tailored to the user's emotions.
[0381] The user uploads contracts and related documents as electronic files to their device. The device then sends these electronic documents to the server.
[0382] The server plays a central role in processing received electronic documents. First, the server extracts text information from the document using optical character recognition (OCR) technology. This process uses common OCR tools such as Tesseract OCR. Next, the extracted text information is analyzed using natural language processing (NLP) technology. NLP processing uses libraries such as spaCy and NLTK to identify contract information and important text data.
[0383] The server registers the identified information in a database and generates a link to the original document. This database management makes it possible to quickly retrieve the necessary information.
[0384] Furthermore, the server incorporates an emotion analysis engine. This engine evaluates emotional data from the user's inputs and actions on the device. The server utilizes generative AI models, built for example with machine learning frameworks such as TensorFlow and PyTorch, to analyze the user's emotional state (e.g., urgency, relaxation) and dynamically adjusts how information is presented based on the obtained emotional information.
[0385] The terminal is responsible for retrieving information sent from the server and displaying it in a format suitable for the user. For example, if the user needs urgent information, relevant information will be displayed quickly and prioritized on the terminal. Conversely, if the user is relaxed, more detailed and comprehensive information will be provided. The terminal can also adjust the screen design and information presentation style according to the user's emotional state.
[0386] For example, if a user enters "I urgently need information about the latest contracts," the server performs sentiment analysis on this input and detects a high level of urgency. This prompts the server to quickly search its database and begin the process of displaying relevant and important contract information on the user's device. Examples of prompts for the AI model in this scenario include "Consider the user's urgency and prioritize displaying important information."
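The way an instruction such as the one quoted above might be combined with the user's input into a prompt for the AI model may be sketched as follows. Only the urgency instruction is taken from the text; the instruction for a relaxed user, the default instruction, the prompt layout, and the function name `build_prompt` are hypothetical. No model is called in this sketch.

```python
INSTRUCTIONS = {
    # Instruction sentence quoted in the description above.
    "urgent": (
        "Consider the user's urgency and prioritize displaying "
        "important information."
    ),
    # Hypothetical wording for a relaxed user (not from the disclosure).
    "relaxed": "The user is relaxed; provide detailed and comprehensive information.",
}
DEFAULT_INSTRUCTION = "Present the requested information."  # hypothetical


def build_prompt(user_input: str, emotion: str) -> str:
    """Prefix the user's request with an emotion-dependent instruction."""
    instruction = INSTRUCTIONS.get(emotion, DEFAULT_INSTRUCTION)
    return f"{instruction}\nUser request: {user_input}"


prompt = build_prompt(
    "I urgently need information about the latest contracts", "urgent"
)
```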
[0387] This invention allows users to efficiently obtain necessary information without experiencing stress. By adapting information delivery to the user's emotional state, it becomes possible to provide a more comfortable user experience.
[0388] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0389] Step 1:
[0390] The user uploads contracts and related documents as electronic data to the terminal. The input data is in PDF or image format. The terminal sends these files to the server. The server temporarily stores the received documents to pass them on to the next processing step. At this stage, the electronic documents are received as input and are ready for OCR processing.
[0391] Step 2:
[0392] The server performs optical character recognition (OCR) on the received electronic document. The input is the document file sent in step 1. Using software such as Tesseract OCR, the server extracts character information from the image data and converts it into text format. This OCR process generates string data from the document as output.
[0393] Step 3:
[0394] The server analyzes the character information extracted by OCR using natural language processing (NLP) techniques. The input data is the string data obtained in the previous step. The server uses spaCy or NLTK to identify contract-related information in the document (e.g., contractor name, date, and conditions) and organize it as structured data. This analysis identifies specific information, preparing it for registration in the database as output.
[0395] Step 4:
[0396] The server registers the analyzed information in the database. The input is the structured data identified in step 3. Based on this information, the server also generates a link to the original document and saves it in the database. In this step, the registered data is generated as output and becomes accessible for subsequent search queries.
[0397] Step 5:
[0398] The device records user input and actions, collecting sentiment data. Input consists of the user's operation history and actual interactions. The device sends this data to a server, preparing it for use as the basis for sentiment analysis. At this stage, user behavior data is obtained as output.
[0399] Step 6:
[0400] The server analyzes data sent from the terminal using an emotion analysis engine to estimate the user's emotional state. The input is the user's behavioral data collected in step 5. Using generative AI models, built for example with machine learning frameworks such as TensorFlow and PyTorch, the server identifies the user's emotions (such as the degree of urgency or relaxation) and generates analysis results. These results are output as emotion data used to adjust the information presented.
[0401] Step 7:
[0402] The server adjusts how information is presented based on the sentiment analysis results. The input is the sentiment data obtained in step 6. The server prioritizes and modifies the display method to provide information optimized for the user. For example, if the user indicates urgency, relevant information is prioritized and presented quickly. The adjusted information is then presented to the user as output.
[0403] Step 8:
[0404] The terminal receives the information provided by the server and displays it to the user in the most suitable way. The input is the information adjusted in step 7. The terminal dynamically changes the screen layout and presentation style according to the user's emotions. For example, if the terminal determines that the user is relaxed, detailed information will be displayed. In this step, the information provided to the user is finalized as the final output.
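The adjustment of layout and level of detail in Steps 7 and 8 may be sketched as a lookup from the estimated emotion to a presentation plan. The plan fields, the concrete limits, the layout names, and the item fields are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresentationPlan:
    max_items: int
    show_details: bool
    layout: str


# Hypothetical plans; the disclosure does not specify concrete values.
PLANS = {
    "urgent": PresentationPlan(max_items=3, show_details=False, layout="compact"),
    "relaxed": PresentationPlan(max_items=20, show_details=True, layout="detailed"),
}
DEFAULT_PLAN = PresentationPlan(max_items=10, show_details=False, layout="standard")


def plan_for(emotion: str) -> PresentationPlan:
    """Select a presentation plan for the estimated emotion."""
    return PLANS.get(emotion, DEFAULT_PLAN)


def apply_plan(items: list[dict], plan: PresentationPlan) -> list[dict]:
    """Trim the list and, unless details are requested, keep title and link only."""
    shown = items[: plan.max_items]
    if plan.show_details:
        return shown
    return [{"title": item["title"], "link": item["link"]} for item in shown]


# Hypothetical items to present
items = [
    {"title": f"Contract {n}", "link": f"documents/{n}.pdf", "summary": "..."}
    for n in range(1, 6)
]
urgent_view = apply_plan(items, plan_for("urgent"))
relaxed_view = apply_plan(items, plan_for("relaxed"))
```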
[0405] (Application Example 2)
[0406] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0407] Conventional electronic document management systems have been unable to dynamically present information in response to the user's emotional state, limiting the improvement of the user experience. In particular, in the field of electronic payments, optimizing information based on the user's emotions is required, but current systems are not adequately capable of doing so.
[0408] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0409] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in a database and generating a link to the original electronic document, and emotion analysis means for analyzing the user's emotional state and dynamically adjusting the method of presenting information. This enables optimal presentation of information according to the user's emotional state and smooth information provision in electronic payments.
[0410] An "optical character recognition means" is a device that extracts character information from electronic documents.
[0411] "Natural language processing means" refers to technology that analyzes extracted textual information and identifies information that aligns with the user's intent.
[0412] "Information management means" refers to a technological system that registers analyzed information in a database and generates links to facilitate access to the original electronic document.
[0413] An "emotion analysis means" is a technological system that analyzes the user's emotional state from their input and actions, and dynamically adjusts the information presented based on that information.
[0414] A "query generation mechanism" is a system that automatically creates appropriate database queries based on user input.
[0415] "Information presentation means" refers to technologies that display search results to users visually or by other means, and that display them in a way that is appropriate to the user's emotional state.
[0416] An "indexing method" is a technique for organizing electronic documents and related information so that they can be efficiently searched.
[0417] A "data organization method" is a technical system that arranges extracted information according to certain rules and formats, making it available for use.
[0418] "Payment support methods" are technologies that optimize payment information and options according to the user's emotions, thereby supporting a smooth payment process.
[0419] The system that realizes this invention will be built as a smartphone application to support electronic payments. At the core of this system is a server with an integrated emotion analysis engine that analyzes user input and behavior and provides information according to the user's emotional state.
[0420] The server first receives data sent by the user through the application. This data includes purchase information as electronic documents and user interaction logs. Using optical character recognition and natural language processing, the server extracts useful character information from this data and identifies the desired information. The analyzed information is registered in a database, and a link to the original document is also generated as needed.
[0421] Furthermore, the server uses a generative AI model, an emotion analysis engine, to recognize the user's emotional state in real time. This engine analyzes emotions from the user's input patterns and past data, and can dynamically adjust the priority and method of information presentation. For example, if the user indicates an urgent need, it can immediately present concise information and simplified payment options.
[0422] On the terminal, information is presented using search results and payment options. The interface design and amount of information are appropriately adjusted according to the user's emotional state, ensuring a stress-free experience. For example, in a situation where a user is about to miss a train and needs to quickly purchase a ticket, the simplest payment method is presented, allowing for a swift completion of the transaction.
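The selection of the simplest payment method for a user in a hurry may be sketched as follows. Measuring simplicity by a count of user actions, as well as the class name, the function name, and the sample options, are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PaymentOption:
    name: str
    steps: int  # hypothetical number of user actions needed to complete payment


def select_options(options: list[PaymentOption], urgent: bool) -> list[PaymentOption]:
    """Order options from simplest to most involved; when urgent, show only the simplest."""
    ordered = sorted(options, key=lambda option: option.steps)
    return ordered[:1] if urgent else ordered


# Hypothetical payment options
options = [
    PaymentOption("bank transfer", steps=5),
    PaymentOption("one-tap wallet", steps=1),
    PaymentOption("credit card", steps=3),
]
urgent_choice = select_options(options, urgent=True)
all_choices = select_options(options, urgent=False)
```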
[0423] An example of a prompt for a generative AI model is, "How can we offer the best payment method when the user is in a hurry?" This prompt serves as a guide to improve the output of the sentiment analysis engine and helps to enhance the user experience.
[0424] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0425] Step 1:
[0426] The user accesses an electronic payment application using their smartphone and selects the items they wish to purchase. Input includes the item selection and purchase intent. The device prepares to send this information to the server. The selected items are sent to the server as output.
[0427] Step 2:
[0428] The server analyzes the received user data using optical character recognition and natural language processing. The input includes purchase intent and product information sent by the user. The server extracts useful character information from this data and identifies specific purchase information. The identified information is generated as output and registered in the database.
[0429] Step 3:
[0430] The server uses a generative AI model, the emotion analysis engine, to analyze the user's emotional state in real time. Inputs include the user's behavioral patterns and past interaction data. Based on this data, the server determines whether the user is in an urgent situation or relaxed. The output is an evaluation of the emotional state.
[0431] Step 4:
[0432] The server generates the optimal payment option through information presentation methods based on the results of sentiment analysis. The input is the evaluation result of the emotional state. Based on the evaluation, the server dynamically adjusts the priority and method of information presentation. In the case of an emotion indicating urgency, an intuitive and rapid payment procedure is selected and presented as the output.
[0433] Step 5:
[0434] The terminal displays optimized payment information and options sent from the server to the user. The input consists of payment options and information generated by the server. The terminal displays this information in a user-friendly format and prompts the user to take action. The output is a user-friendly presentation of information.
[0435] Step 6:
[0436] The user reviews the presented payment options and completes the final purchase confirmation process. The input is the information displayed on the terminal. The user's actions initiate the final payment process, and the output is the completion of the purchase.
[0437] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0438] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, is input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0439] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0440] [Third Embodiment]
[0441] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0442] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0443] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0444] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0445] The microphone 238 receives instructions from the user 20 by capturing the voice of the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0446] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0447] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0448] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0449] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0450] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0451] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0452] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0453] The system of the present invention consists of three components: a server, a terminal, and a user. The specific operation of each component is described below.
[0454] First, the user uploads contracts and other related electronic documents through the business system interface. The uploaded documents are sent from the terminal to the server. The server receives these documents and efficiently extracts character information from the electronic documents using optical character recognition (OCR).
[0455] The server then analyzes the extracted text information using natural language processing to identify important information such as the contract number, contractor name, contract date, and contract amount. This information is organized using information management tools and registered in the contract database. At the same time, a link to the original electronic document is generated and stored in the database.
[0456] Users can then search for necessary contract information through this system. When a user enters a concise, plainly worded search prompt into the terminal, the terminal uses a query generation mechanism to generate a query to search the database based on the user's input. The server quickly searches the database in response to the query and collects the relevant information.
[0457] Search results are presented to the user via their device. These results include links to relevant documents and summaries of necessary contractual information. This allows users to quickly access the information they need and proceed with their work efficiently.
[0458] In this system, the server uses indexing mechanisms to structurally organize information and improve the speed of search result retrieval. Furthermore, data organization mechanisms organize documents based on the extracted data, thereby improving operational efficiency.
[0459] For example, if a user enters a prompt such as "search for contracts concluded in the past year," the terminal will receive this prompt and generate an appropriate query. The server will process the query, retrieve the relevant information from the database, and return the search results to the user. In this way, the present invention provides a means to significantly streamline the management of electronic documents.
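The prompt-to-query step described above can be sketched as follows. This is a minimal illustration, not the disclosed query generation mechanism: the `contracts` table, its column names, and the phrase rules in `build_query` are assumptions made for the example, and an actual system might instead have the data generation model 58 produce the query.

```python
from datetime import date, timedelta


def build_query(prompt: str, today: date) -> tuple[str, tuple[str, ...]]:
    """Turn a search prompt into a parameterized SQL query.

    The table and column names are hypothetical. Only two date phrases
    are recognized; anything else falls back to a contractor-name match.
    """
    base = "SELECT contract_number, contractor, contract_date, link FROM contracts"
    text = prompt.lower()
    if "past year" in text:
        # "Past year" is read here as the last 365 days (an assumption).
        start = today - timedelta(days=365)
        return base + " WHERE contract_date >= ?", (start.isoformat(),)
    if "this year" in text:
        start = date(today.year, 1, 1)
        return base + " WHERE contract_date >= ?", (start.isoformat(),)
    # Fallback: treat the whole prompt as a contractor-name fragment.
    return base + " WHERE contractor LIKE ?", (f"%{prompt.strip()}%",)


sql, params = build_query(
    "Search for contracts concluded in the past year", date(2023, 6, 15)
)
```

Passing the date as a bound parameter rather than formatting it into the SQL string keeps user input out of the query text.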
[0460] The following describes the processing flow.
[0461] Step 1:
[0462] Users upload PDF files of contracts from their terminals to the server using the web interface of the business system.
[0463] Step 2:
[0464] The server receives the uploaded PDF file, performs a security check, and then saves the file to temporary storage.
[0465] Step 3:
[0466] The server activates the optical character recognition (OCR) system and extracts character information from the saved PDF file. During this process, image-based characters are converted into text data.
[0467] Step 4:
[0468] The server uses natural language processing to analyze important contract information from the text extracted by OCR processing, identifying, for example, the name of the contractor, the contract date, and the contract amount.
[0469] Step 5:
[0470] The server organizes the identified information and registers it in the contract database. Furthermore, it generates a reference link to the original PDF file and saves this link in the database as well.
[0471] Step 6:
[0472] The user uses the business system's search function to input conditions or keywords into the terminal to search for specific contract information.
[0473] Step 7:
[0474] The terminal activates a query generation mechanism based on user input and constructs a search query targeting the database.
[0475] Step 8:
[0476] The server receives the generated query, quickly searches the database, and retrieves relevant contract information and links.
[0477] Step 9:
[0478] The terminal receives search results from the server and displays the information to the user in a list format. This allows the user to quickly access the necessary contracts and information.
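Step 4 of the flow above, identifying contract fields in the OCR output, might look like the following sketch. Regular expressions stand in for the natural language processing, and the field labels ("Contractor:", "Contract Date:", "Amount:") and the `ContractFields` type are assumptions about the document layout, not part of the disclosure.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContractFields:
    contractor: Optional[str]
    contract_date: Optional[str]  # ISO 8601 date string
    amount: Optional[int]


def extract_fields(ocr_text: str) -> ContractFields:
    """Pull labeled fields out of OCR text; a missing field becomes None."""
    contractor = re.search(r"Contractor:\s*(.+)", ocr_text)
    contract_date = re.search(r"Contract Date:\s*(\d{4}-\d{2}-\d{2})", ocr_text)
    amount = re.search(r"Amount:\s*(?:JPY\s*)?([\d,]+)", ocr_text)
    return ContractFields(
        contractor=contractor.group(1).strip() if contractor else None,
        contract_date=contract_date.group(1) if contract_date else None,
        # Strip thousands separators before converting to an integer.
        amount=int(amount.group(1).replace(",", "")) if amount else None,
    )


sample = """SERVICE AGREEMENT
Contractor: Example Trading Co.
Contract Date: 2024-04-01
Amount: JPY 1,200,000
"""
fields = extract_fields(sample)
```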
[0479] (Example 1)
[0480] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0481] In today's business environment, managing electronic information requires considerable time and effort. In particular, quickly and accurately extracting, managing, and searching for necessary information from important documents such as contracts is difficult. Furthermore, there is a demand for improved information retrieval speed. Solving these challenges is essential to improving operational efficiency.
[0482] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0483] In this invention, the server includes recognition means for extracting character information from electronic information, language processing means for analyzing the extracted character information and identifying specific information, and management means for registering the analyzed information in a storage device and generating a reference to the original electronic information. This enables rapid and accurate management of electronic information and efficient retrieval of necessary information.
[0484] "Recognition means" refers to a device or program that has the function of automatically extracting textual information from electronic information.
[0485] "Language processing means" refers to a device or program that analyzes extracted character information and performs processing to identify specific information.
[0486] "Management means" refers to a device or program that has the function of systematically registering the analyzed information and generating a reference to the original electronic information.
[0487] A "storage device" is a physical or virtual device used to permanently or temporarily store information.
[0488] A "search query" is a set of conditions or words used to find specific information within a set of information.
[0489] "Information presentation means" refers to a device or program that has the function of presenting search results or necessary information to the user visually or audibly.
[0490] "Indexing" is the process of structuring and organizing information to improve the speed and efficiency of searching.
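One common form of the indexing defined above is an inverted index, which maps each word to the documents that contain it. The sketch below is illustrative only: the disclosure does not state which index structure is used, and `build_index` and `search` are hypothetical helper names.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each token to the set of document IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the IDs of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {
    "c1": "Lease agreement with Example Trading",
    "c2": "Supply agreement with Sample Logistics",
}
idx = build_index(docs)
```

A lookup touches only the entries for the query words, so search time does not grow with the full text of every stored document.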
[0491] To implement this invention, a system is constructed in which a server, terminal, and user work together.
[0492] First, the user uploads electronic documents such as contracts from their terminal through the business system interface. The terminal then transfers these documents to the server using a secure communication protocol.
[0493] The server extracts text information from received electronic documents using optical character recognition (OCR) technology. Specifically, known OCR software such as ABBYY FineReader may be used. The extracted text information is then analyzed using natural language processing (NLP) technology. Here, NLP libraries such as spaCy or NLTK may be used. Through this analysis, information such as the contract number, contractor name, and contract date is automatically identified.
[0494] The analyzed information is registered in a storage device for information management, and a link to the original electronic document is generated. This link facilitates the referencing and tracking of the information. Furthermore, the information is indexed, improving search efficiency.
[0495] Users can quickly find registered information using the system's search function. For example, by entering a prompt such as "Search for contracts concluded this year," relevant contract information can be quickly retrieved. The terminal generates an appropriate search query based on this prompt, and a database search is performed on the server.
[0496] This embodiment of the invention allows users to efficiently manage and retrieve electronic information. The generative AI model assists in generating prompt sentences to enable flexible information retrieval tailored to the user's needs.
[0497] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0498] Step 1:
[0499] The user uses the business system interface to select and upload electronic documents, such as contracts, from their terminal. This operation imports the documents into the terminal's system. Once the import is complete, the terminal sends the electronic document data it received as input to the server.
[0500] Step 2:
[0501] The server receives electronic documents sent from terminals and applies optical character recognition (OCR) technology. This process extracts text information from the electronic documents. The OCR software outputs image-based characters as machine-readable text data.
[0502] Step 3:
[0503] The server analyzes the text data extracted by OCR using natural language processing (NLP). Based on the input text information, it identifies items such as contract number, contractor name, contract date, and contract amount, and outputs them as structured data. By using an NLP library, the server performs the specific processing of automatically extracting the necessary information from the document.
[0504] Step 4:
[0505] The server registers the analyzed structured data in the information management system and generates a link to the original electronic document. This process stores the identified information in the database, and the link is added as part of the output, making it easier to access the information.
[0506] Step 5:
[0507] The user enters a prompt into the terminal to search the system for the necessary information. For example, the user might enter the prompt "Search for contracts concluded in the past year" and submit it.
[0508] Step 6:
[0509] The terminal parses user input prompts and generates appropriate database search queries. Search conditions are generated based on the entered prompt text, and the results are output as queries for searching the database.
[0510] Step 7:
[0511] The server searches the database using the generated query. The process then retrieves relevant information from the database, and the search results, including the corresponding contract information and links, are aggregated.
[0512] Step 8:
[0513] The terminal displays search results sent from the server to the user. This allows the user to quickly obtain the necessary information and improve work efficiency. The displayed information includes links to relevant documents and summaries.
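Steps 4 and 7 above, registering structured data with a link to the original document and then searching it, can be sketched with Python's built-in `sqlite3` module. The schema, the index name, and the `file://` link are assumptions chosen for the example; the disclosure does not specify a database product or table layout here.

```python
import sqlite3


def open_contract_db() -> sqlite3.Connection:
    """Create an in-memory contract table with an index on the date column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE contracts (
               contract_number TEXT PRIMARY KEY,
               contractor      TEXT NOT NULL,
               contract_date   TEXT NOT NULL,  -- ISO 8601, so text order is date order
               link            TEXT NOT NULL   -- reference to the original document
           )"""
    )
    conn.execute("CREATE INDEX idx_contract_date ON contracts (contract_date)")
    return conn


def register_contract(conn: sqlite3.Connection, record: dict, link: str) -> None:
    """Store one analyzed contract together with its document link."""
    conn.execute(
        "INSERT INTO contracts VALUES (?, ?, ?, ?)",
        (record["contract_number"], record["contractor"], record["contract_date"], link),
    )
    conn.commit()


def contracts_since(conn: sqlite3.Connection, start_date: str) -> list[tuple]:
    """Return (contract_number, link) pairs dated on or after start_date."""
    cursor = conn.execute(
        "SELECT contract_number, link FROM contracts "
        "WHERE contract_date >= ? ORDER BY contract_date",
        (start_date,),
    )
    return cursor.fetchall()


conn = open_contract_db()
register_contract(
    conn,
    {"contract_number": "C-001", "contractor": "Example Trading Co.", "contract_date": "2023-02-01"},
    "file:///contracts/C-001.pdf",
)
register_contract(
    conn,
    {"contract_number": "C-002", "contractor": "Sample Logistics", "contract_date": "2024-05-10"},
    "file:///contracts/C-002.pdf",
)
recent = contracts_since(conn, "2024-01-01")
```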
[0514] (Application Example 1)
[0515] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0516] Conventional electronic document management systems often require manual extraction and management of important information such as contracts, resulting in time-consuming and labor-intensive processes. Furthermore, verifying and searching transaction records is cumbersome, hindering efficient business operations. Therefore, there is a need for a system that efficiently manages electronic contracts and allows users to easily access contract information and transaction history.
[0517] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0518] In this invention, the server includes recognition means for extracting character data from an electronic document, processing means for analyzing the extracted character data and identifying specific information, information control means for registering the analyzed information in a data storage and generating a reference to the original electronic document, and communication means for transmitting the information to a communication terminal so that users can easily check contract information and transaction records on their terminal devices. This automates the management of electronic contracts and enables users to efficiently access the information they need.
[0519] "Electronic documents" refer to documents created in digital format, such as contracts and transaction records.
[0520] "Character data" refers to individual characters and sets of characters extracted from electronic documents.
[0521] "Recognition means" refers to technologies and devices for extracting character data from electronic documents.
[0522] "Processing means" refers to software or algorithms used to analyze extracted character data and identify specific information.
[0523] "Data storage" refers to a database or storage system used to store analyzed information.
[0524] "Information control means" refers to a system that has the function of registering information in a data storage and generating a reference to the original electronic document.
[0525] "Communication means" refers to communication technologies and infrastructure that transmit information to users' terminal devices, enabling them to verify contract information and transaction records.
[0526] "Terminal devices" refer to devices that users use to access information, such as smartphones and computers.
[0527] To realize this invention, an efficient management system for electronic documents is constructed. The server uses optical character recognition (OCR) technology to recognize electronic documents such as electronic contracts and transaction records. Specifically, it uses the Google Cloud Vision API to extract text data and AWS Lambda to run natural language processing (NLP) on the extracted text. The analyzed data is registered in AWS RDS (Relational Database Service) and stored along with a link to the original document.
[0528] For communication devices used by users, such as smartphones, an application is installed on the device to allow for quick access to contract information and transaction records. This application communicates with a server via API Gateway, generates queries, and presents the search results to the user as a graphical user interface (GUI). As a concrete example, a user can immediately retrieve relevant information by entering a prompt such as "Show me all transaction records from last year."
[0529] By using this system, users can automate the processing of contracts, access necessary information more quickly, and significantly improve operational efficiency.
[0530] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0531] Step 1:
[0532] The user uploads an electronic document using their device. Upon uploading, the document is sent to the cloud, and OCR (Optical Character Recognition) is activated. The input is an electronic document, and the output is text data.
[0533] Step 2:
[0534] The server uses OCR technology to extract character data from electronic documents. Here, the Google Cloud Vision API is used to identify characters from image data and obtain them as text data. The input is an electronic document, and the output is the extracted character data.
[0535] Step 3:
[0536] The server uses AWS Lambda to perform natural language processing on the extracted text data. Here, it identifies information such as the contract number, contract holder name, and contract date. The input is text data, and the output is the parsed specific contract information.
[0537] Step 4:
[0538] The server registers the analyzed contract information in AWS RDS and generates a link to the original electronic document. This link is stored in the database and can be accessed as needed. The input is the analyzed contract information, and the output is the database and link information.
[0539] Step 5:
[0540] The user requests a search based on a prompt entered into the terminal. The terminal generates a search query from the prompt and sends it to the server. The input is the prompt, and the output is the generated search query.
[0541] Step 6:
[0542] The server searches the database in AWS RDS based on the received search query, using the indexed data to keep the search efficient, and collects the relevant contract information. The input is the search query, and the output is the search results data.
[0543] Step 7:
[0544] The server sends search results to the terminal, which displays the results to the user via a GUI. The user can then review contract information and related links and perform necessary tasks. The input is the search results data, and the output is the information displayed to the user.
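Step 3 of this flow runs the text analysis in AWS Lambda. A Python Lambda function is an ordinary function that receives an event and a context object, so the step can be sketched and exercised locally as below. The event shape `{"text": ...}`, the field labels, and the response layout are assumptions for illustration, and regular expressions again stand in for the natural language processing.

```python
import json
import re


def lambda_handler(event, context):
    """Parse OCR text carried in a hypothetical event of shape {"text": "..."}."""
    text = event.get("text", "")
    number = re.search(r"Contract No:\s*([A-Z0-9-]+)", text)
    signed = re.search(r"Contract Date:\s*(\d{4}-\d{2}-\d{2})", text)
    body = {
        "contract_number": number.group(1) if number else None,
        "contract_date": signed.group(1) if signed else None,
    }
    # Return a JSON body so a caller can register the fields in the database.
    return {"statusCode": 200, "body": json.dumps(body)}


# Local invocation; the context argument is unused in this sketch.
response = lambda_handler(
    {"text": "Contract No: C-2024-001\nContract Date: 2024-04-01"}, None
)
```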
[0545] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0546] The present invention is a system that incorporates an emotion engine that analyzes user emotions and applies them to information retrieval and presentation. The system mainly consists of a server, a terminal, and the user.
[0547] First, the user uploads the contract and related documents as electronic data. The server extracts text information from the electronic documents, such as PDFs, using optical character recognition (OCR). This information is then analyzed by natural language processing to identify specific contract information. This identified information is registered in a database, and a link to the original document is also generated.
[0548] A distinctive feature of this system is the introduction of an emotion engine. When a user inputs data or interacts with the system, the server analyzes the user's emotions through the emotion engine based on the data and behavioral patterns obtained from the terminal. Based on this, the way search results are presented and the priority of information are dynamically adjusted, enabling the provision of information that is tailored to the user.
[0549] For example, if a user indicates an emotion requiring urgent attention, the server can change its priorities to quickly display relevant search results. Similarly, if the emotion engine determines that the user is relaxed, it can provide more detailed information than usual.
[0550] Furthermore, the emotion engine also suggests changes to the information presentation style and design on the device to ensure users can continue to use the interface without stress. This feature significantly improves the user experience and leads to smoother workflow.
[0551] For example, when a user enters "I urgently need the latest contract information," the emotion engine senses a high level of urgency. This prompts the server to quickly provide information, and relevant, important contract information is promptly displayed on the terminal. In this way, the present invention realizes an advanced information management system that responds to user needs through emotion estimation.
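The urgency example above can be illustrated with a deliberately simple stand-in for the emotion engine. The keyword list, the two urgency levels, and the result limits are all assumptions made for the sketch; the emotion identification model 59 would be expected to use a trained model rather than keyword matching.

```python
# Hypothetical marker phrases; a trained emotion model would replace this list.
URGENT_MARKERS = ("urgent", "asap", "immediately", "right away")


def estimate_urgency(prompt: str) -> str:
    """Return "high" if the prompt contains an urgency marker, else "normal"."""
    text = prompt.lower()
    return "high" if any(marker in text for marker in URGENT_MARKERS) else "normal"


def result_limit(urgency: str) -> int:
    """Show fewer, higher-priority results when urgency is high (assumed policy)."""
    return 3 if urgency == "high" else 20


urgency = estimate_urgency("I urgently need the latest contract information")
limit = result_limit(urgency)
```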
[0552] The following describes the processing flow.
[0553] Step 1:
[0554] Users upload PDF contract files from their terminals to the server via the business system interface. During this process, user actions and input data are used for initial analysis by the emotion engine.
[0555] Step 2:
[0556] The server receives the uploaded PDF file, activates the optical character recognition (OCR) system, and extracts character information from the electronic document. This character information is temporarily stored for later natural language processing.
[0557] Step 3:
[0558] The server uses natural language processing to analyze the text information extracted by OCR. This analysis identifies important information such as the contractor's name, contract date, and contract amount, and then initiates a process to register this information in a database.
[0559] Step 4:
[0560] The server uses information management tools to organize the analyzed information, generate links to the original electronic documents, and store them in a database. This process allows for quick access to the information when searching later.
[0561] Step 5:
[0562] The terminal activates an emotion engine and determines the user's emotional state based on their input speed and content. Based on this data, it analyzes whether the user's current psychological state is high-stress or low-stress.
[0563] Step 6:
[0564] The user enters criteria on the interface to search for specific contract information. As the user enters the information, the system monitors the user's keyboard typing patterns and mouse movements.
[0565] Step 7:
[0566] The terminal uses a query generation mechanism to construct database queries based on the user's emotional state and search criteria. The priority of these queries is dynamically adjusted according to the user's emotions.
[0567] Step 8:
[0568] The server processes queries and searches the database based on the analysis results of the emotion engine. It efficiently retrieves search results and prioritizes information as needed.
[0569] Step 9:
[0570] The terminal receives search results from the server and quickly displays information tailored to the user's analyzed emotional state. Because the information is presented in a way that suits the user's state, the user can view and use it effectively.
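Steps 5 to 8 above, estimating stress from typing patterns and adjusting result priority, could be approximated as in the following sketch. The 150 ms threshold, the two-level stress scale, and the rule of truncating to the top results under high stress are assumptions chosen for illustration, not values from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    relevance: float  # higher means a better match


def stress_level(key_intervals_ms: list[float], fast_ms: float = 150.0) -> str:
    """Classify stress from the mean interval between keystrokes.

    Fast typing is treated as a sign of high stress; the threshold is
    an assumed value, not one taken from the disclosure.
    """
    if not key_intervals_ms:
        return "low"
    mean_interval = sum(key_intervals_ms) / len(key_intervals_ms)
    return "high" if mean_interval < fast_ms else "low"


def prioritize(
    results: list[SearchResult], stress: str, urgent_limit: int = 2
) -> list[SearchResult]:
    """Rank by relevance; under high stress keep only the top few results."""
    ranked = sorted(results, key=lambda r: r.relevance, reverse=True)
    return ranked[:urgent_limit] if stress == "high" else ranked


results = [
    SearchResult("Lease agreement", 0.4),
    SearchResult("Supply agreement", 0.9),
    SearchResult("Service agreement", 0.7),
]
shown = prioritize(results, stress_level([90.0, 110.0, 100.0]))
```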
[0571] (Example 2)
[0572] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0573] In the modern era, while information is rapidly digitizing, there is a lack of systems that can quickly and efficiently extract necessary information from electronic documents and provide it appropriately according to the user's situation and emotions. Furthermore, because there are no technological means to improve the user experience by presenting information while taking user emotions into consideration, problems such as stress and information overload are occurring.
[0574] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0575] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in an information recording unit and generating connection information to the original electronic document, and emotion analysis means for analyzing the user's emotions and adjusting the method of presenting information based on the results. This enables the user to always receive information in a form that suits their emotional state, thereby improving the accuracy of information acquisition and the user experience.
[0576] An "electronic document" is a file containing text data and image data stored in a format that can be read by computers and electronic devices.
[0577] "Textual information" refers to information expressed in text format, including data such as strings of characters and numbers.
[0578] "Optical character recognition means" refers to a device or program that reads characters from image data and converts them into text data.
[0579] "Natural language processing" refers to the technology used to analyze and understand natural language used by humans using computers.
[0580] "Information management means" refers to a mechanism or system for efficiently organizing, storing, retrieving, and managing information.
[0581] "Emotion analysis means" refers to an analytical method that uses techniques to estimate a user's emotions and psychological state from data.
[0582] The "information recording unit" is a data storage area that stores acquired information and arranges it in a way that allows access as needed.
[0583] A "search query generation means" is a function or process that automatically or manually generates inquiries for information retrieval.
[0584] "Information presentation means" refers to a method or device for presenting acquired information to the user in an easily understandable manner.
[0585] "Operation data collection means" refers to a mechanism or device that collects data related to the user's operation history and usage status.
[0586] "Systematization means" refers to a mechanism or method for structuring information and organizing it in an easily accessible format.
[0587] An "information classification and organization means" is a method or apparatus for classifying and organizing acquired information according to specific criteria.
[0588] "Information presentation adjustment means" refers to a function or process that dynamically adjusts the method of presenting information based on sentiment analysis and usage conditions.
[0589] This invention uses a system consisting of a user, a terminal, and a server to perform information analysis on electronic documents and provide information tailored to the user's emotions.
[0590] The user uploads contracts and related documents as electronic files to the terminal. The terminal then sends these electronic documents to the server.
[0591] The server plays a central role in processing received electronic documents. First, the server extracts text information from the document using optical character recognition (OCR) technology. This process uses common OCR tools such as Tesseract OCR. Next, the extracted text information is analyzed using natural language processing (NLP) technology. NLP processing uses libraries such as spaCy and NLTK to identify contract information and important text data.
[0592] The server registers the identified information in a database and generates a link to the original document. This database management makes it possible to quickly retrieve the necessary information.
[0593] Furthermore, the server incorporates an emotion analysis engine. This engine evaluates emotional data from the user's inputs and actions on the terminal. The server utilizes AI models built with frameworks such as TensorFlow or PyTorch to analyze the user's emotional state (e.g., urgency, relaxation) and dynamically adjusts how information is presented based on the obtained emotional information.
[0594] The terminal is responsible for retrieving information sent from the server and displaying it in a format suitable for the user. For example, if the user needs urgent information, relevant information will be displayed quickly and prioritized on the terminal. Conversely, if the user is relaxed, more detailed and comprehensive information will be provided. The terminal can also adjust the screen design and information presentation style according to the user's emotional state.
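The terminal-side adjustment described above, a brief display for an urgent user and a fuller one for a relaxed user, can be sketched as a single rendering function. The record keys and the two emotion labels are hypothetical names introduced for the example.

```python
def render_result(record: dict, emotion: str) -> str:
    """Format one contract record according to the estimated emotion.

    "urgent" yields a single headline line; any other value yields the
    headline followed by amount, summary, and link.
    """
    headline = (
        f"{record['contract_number']} | {record['contractor']} | {record['contract_date']}"
    )
    if emotion == "urgent":
        return headline
    details = [
        f"  Amount: {record['amount']:,}",
        f"  Summary: {record['summary']}",
        f"  Link: {record['link']}",
    ]
    return "\n".join([headline, *details])


record = {
    "contract_number": "C-2024-001",
    "contractor": "Example Trading Co.",
    "contract_date": "2024-04-01",
    "amount": 1200000,
    "summary": "Annual maintenance service agreement.",
    "link": "file:///contracts/C-2024-001.pdf",
}
brief = render_result(record, "urgent")
detailed = render_result(record, "relaxed")
```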
[0595] For example, if a user enters "I urgently need information about the latest contracts," the server performs sentiment analysis on this input and detects a high level of urgency. This prompts the server to quickly search its database and begin the process of displaying relevant and important contract information on the user's device. Examples of prompts for the AI model in this scenario include "Consider the user's urgency and prioritize displaying important information."
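A prompt of the kind quoted above might be assembled as in the following sketch. The instruction for the urgent case is taken from the text; the instruction for the relaxed case, the prompt layout, and the function name `build_model_prompt` are assumptions. The sketch only builds the prompt string and does not call any model.

```python
INSTRUCTIONS = {
    # Instruction quoted in the description for the urgent case.
    "urgent": "Consider the user's urgency and prioritize displaying important information.",
    # Assumed wording for the relaxed case.
    "relaxed": "Provide detailed and comprehensive information.",
}


def build_model_prompt(user_input: str, emotion: str, candidates: list[str]) -> str:
    """Combine the emotion-specific instruction, the request, and candidate results."""
    instruction = INSTRUCTIONS.get(emotion, INSTRUCTIONS["relaxed"])
    lines = [instruction, f"User request: {user_input}", "Candidate contracts:"]
    lines += [f"- {candidate}" for candidate in candidates]
    return "\n".join(lines)


prompt = build_model_prompt(
    "I urgently need information about the latest contracts",
    "urgent",
    ["C-2024-001", "C-2024-002"],
)
```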
[0596] This invention allows users to efficiently obtain necessary information without experiencing stress. By adapting information delivery to the user's emotional state, it becomes possible to provide a more comfortable user experience.
[0597] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0598] Step 1:
[0599] The user uploads contracts and related documents as electronic data to the terminal. The input data is in PDF or image format. The terminal sends these files to the server. The server temporarily stores the received documents to pass them on to the next processing step. At this stage, the electronic documents are received as input and are ready for OCR processing.
[0600] Step 2:
[0601] The server performs optical character recognition (OCR) on the received electronic document. The input is the document file sent in step 1. Using software such as Tesseract OCR, the server extracts character information from the image data and converts it into text format. This OCR process generates string data from the document as output.
[0602] Step 3:
[0603] The server analyzes the character information extracted by OCR using natural language processing (NLP) techniques. The input data is the string data obtained in the previous step. The server uses spaCy or NLTK to identify contract-related information in the document (e.g., contractor name, date, and conditions) and organize it as structured data. This analysis outputs the identified information in a form ready for registration in the database.
[0604] Step 4:
[0605] The server registers the analyzed information in the database. The input is the structured data identified in step 3. Based on this information, the server also generates a link to the original document and saves it in the database. In this step, the registered data is generated as output and becomes accessible for subsequent search queries.
[0606] Step 5:
[0607] The terminal records user input and actions, collecting emotion data. The input consists of the user's operation history and actual interactions. The terminal sends this data to the server, preparing it for use as the basis for emotion analysis. At this stage, user behavior data is obtained as output.
[0608] Step 6:
[0609] The server analyzes data sent from the terminal using an emotion analysis engine to estimate the user's emotional state. The input is the user's behavioral data collected in step 5. Using generative AI models built with frameworks such as TensorFlow or PyTorch, the server identifies the user's emotions (such as the degree of urgency or relaxation) and generates analysis results. These results are output as emotion data used to adjust the information presented.
[0610] Step 7:
[0611] The server adjusts how information is presented based on the emotion analysis results. The input is the emotion data obtained in step 6. The server prioritizes and modifies the display method to provide information optimized for the user. For example, if the user indicates urgency, relevant information is prioritized and presented quickly. The adjusted information is then presented to the user as output.
[0612] Step 8:
[0613] The terminal receives information provided by the server and displays it to the user in the most optimal way. The input is the information adjusted in step 7. The terminal dynamically changes the screen layout and presentation style according to the user's emotions. For example, if the terminal determines that the user is relaxed, detailed information will be displayed. In this step, the information provided to the user is finalized as the final output.
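Steps 7 and 8 above can be sketched as a single function that shapes search results according to the estimated emotional state. This is only an illustrative sketch: the record fields title, link, summary, and importance, the layout labels, the max_items parameter, and the function name adjust_presentation are assumptions that do not appear in the description.

```python
def adjust_presentation(records, emotional_state, max_items=3):
    """Shape search results for display according to the emotional state.

    "urgent": only the most important records, reduced to title and link.
    otherwise: every record with all of its fields.
    """
    ordered = sorted(records, key=lambda r: r["importance"], reverse=True)
    if emotional_state == "urgent":
        items = [
            {"title": r["title"], "link": r["link"]} for r in ordered[:max_items]
        ]
        return {"layout": "compact", "items": items}
    return {"layout": "detailed", "items": ordered}


records = [
    {"title": "NDA", "link": "file:///a.pdf", "summary": "Confidentiality terms", "importance": 1},
    {"title": "Master agreement", "link": "file:///b.pdf", "summary": "Core terms", "importance": 5},
    {"title": "Lease", "link": "file:///c.pdf", "summary": "Office lease", "importance": 3},
    {"title": "Order form", "link": "file:///d.pdf", "summary": "Single order", "importance": 2},
]

urgent_view = adjust_presentation(records, "urgent", max_items=2)
relaxed_view = adjust_presentation(records, "relaxed")
print(urgent_view["layout"], [item["title"] for item in urgent_view["items"]])
print(relaxed_view["layout"], len(relaxed_view["items"]))
```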
[0614] (Application Example 2)
[0615] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0616] Conventional electronic document management systems have been unable to dynamically present information in response to the user's emotional state, limiting the improvement of the user experience. In particular, in the field of electronic payments, optimizing information based on the user's emotions is required, but current systems are not adequately capable of doing so.
[0617] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0618] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in a database and generating a link to the original electronic document, and emotion analysis means for analyzing the user's emotional state and dynamically adjusting the method of presenting information. This enables optimal presentation of information according to the user's emotional state and smooth information provision in electronic payments.
[0619] An "optical character recognition means" is a device that extracts character information from electronic documents.
[0620] "Natural language processing means" refers to technology that analyzes extracted textual information and identifies information that aligns with the user's intent.
[0621] "Information management means" refers to a technological system that registers analyzed information in a database and generates links to facilitate access to the original electronic document.
[0622] "Emotion analysis means" refers to a technological system that analyzes the user's emotional state from their input and actions, and dynamically adjusts the information presented based on the result of that analysis.
[0623] A "query generation mechanism" is a system that automatically creates appropriate database queries based on user input.
[0624] "Information presentation means" refers to technologies that display search results to users visually or by other means, and that display them in a way that is appropriate to the user's emotional state.
[0625] An "indexing method" is a technique for organizing electronic documents and related information so that they can be efficiently searched.
[0626] A "data organization method" is a technical system that arranges extracted information according to certain rules and formats, making it available for use.
[0627] "Payment support methods" are technologies that optimize payment information and options according to the user's emotions, thereby supporting a smooth payment process.
[0628] The system that realizes this invention will be built as a smartphone application to support electronic payments. At the core of this system is a server with an integrated emotion analysis engine that analyzes user input and behavior and provides information according to the user's emotional state.
[0629] The server first receives data sent by the user through the application. This data includes purchase information as electronic documents and user interaction logs. Using optical character recognition and natural language processing, the server extracts useful character information from this data and identifies the desired information. The analyzed information is registered in a database, and a link to the original document is also generated as needed.
[0630] Furthermore, the server uses a generative AI model, an emotion analysis engine, to recognize the user's emotional state in real time. This engine analyzes emotions from the user's input patterns and past data, and can dynamically adjust the priority and method of information presentation. For example, if the user indicates an urgent need, it can immediately present concise information and simplified payment options.
[0631] On the terminal, information is presented using search results and payment options. The interface design and amount of information are appropriately adjusted according to the user's emotional state, ensuring a stress-free experience. For example, in a situation where a user is about to miss a train and needs to quickly purchase a ticket, the simplest payment method is presented, allowing for a swift completion of the transaction.
[0632] An example of a prompt for a generative AI model is, "How can we offer the best payment method when the user is in a hurry?" This prompt serves as a guide to improve the output of the emotion analysis engine and helps to enhance the user experience.
[0633] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0634] Step 1:
[0635] The user accesses an electronic payment application using their smartphone and selects the items they wish to purchase. The input includes the item selection and purchase intent. The terminal prepares to send this information to the server. The selected items are sent to the server as output.
[0636] Step 2:
[0637] The server analyzes the received user data using optical character recognition and natural language processing. The input includes purchase intent and product information sent by the user. The server extracts useful character information from this data and identifies specific purchase information. The identified information is generated as output and registered in the database.
[0638] Step 3:
[0639] The server uses a generative AI model, the emotion analysis engine, to analyze the user's emotional state in real time. Inputs include the user's behavioral patterns and past interaction data. Based on this data, the server determines whether the user is in an urgent situation or relaxed. The output is an evaluation of the emotional state.
[0640] Step 4:
[0641] The server generates the optimal payment option through the information presentation means based on the results of emotion analysis. The input is the evaluation result of the emotional state. Based on the evaluation, the server dynamically adjusts the priority and method of information presentation. In the case of an emotion indicating urgency, an intuitive and rapid payment procedure is selected and presented as the output.
[0642] Step 5:
[0643] The terminal displays optimized payment information and options sent from the server to the user. The input consists of payment options and information generated by the server. The terminal displays this information in a user-friendly format and prompts the user to take action. The output is a user-friendly presentation of information.
[0644] Step 6:
[0645] The user reviews the presented payment options and completes the final purchase confirmation process. The input is the information displayed on the terminal. The user's actions initiate the final payment process, and the output is the completion of the purchase.
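The selection of payment options in steps 4 and 5 above might look like the following sketch. It assumes that each option carries a count of the user actions needed to complete payment, which is a hypothetical metric chosen for illustration; the PaymentOption class, its fields, and select_payment_options are likewise illustrative names rather than part of the described system.

```python
from dataclasses import dataclass


@dataclass
class PaymentOption:
    name: str
    steps: int  # user actions needed to finish payment (hypothetical metric)


def select_payment_options(options, emotional_state):
    """Order options from fastest to slowest; keep only the fastest if urgent."""
    by_speed = sorted(options, key=lambda option: option.steps)
    if emotional_state == "urgent":
        return by_speed[:1]
    return by_speed


options = [
    PaymentOption("Bank transfer", 5),
    PaymentOption("One-tap wallet", 1),
    PaymentOption("Credit card", 3),
]

urgent_choice = select_payment_options(options, "urgent")
relaxed_choice = select_payment_options(options, "relaxed")
print([option.name for option in urgent_choice])
print([option.name for option in relaxed_choice])
```

In the train ticket scenario described earlier, the urgent branch corresponds to presenting only the simplest payment method.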
[0646] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0647] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, and inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0648] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0649] [Fourth Embodiment]
[0650] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0651] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0652] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0653] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0654] The microphone 238 receives instructions from the user 20 by capturing the voice uttered by the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0655] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0656] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0657] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0658] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0659] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0660] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0661] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0662] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0663] The system of the present invention consists of three components: a server, a terminal, and a user. The specific operation of each component is described below.
[0664] First, the user uploads contracts and other related electronic documents through the business system interface. The uploaded documents are sent from the terminal to the server. The server receives these documents and efficiently extracts character information from the electronic documents using optical character recognition (OCR).
[0665] The server then analyzes the extracted text information using natural language processing to identify important information such as the contract number, contractor name, contract date, and contract amount. This information is organized using information management means and registered in the contract database. At the same time, a link to the original electronic document is generated and stored in the database.
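The identification of the contract number, contractor name, contract date, and contract amount can be illustrated with a small sketch. It uses regular expressions from the Python standard library as a simplified stand-in for the natural language processing step, and it assumes that the OCR output contains labeled lines in the form shown in SAMPLE_TEXT. The labels, the sample values, and the name extract_contract_fields are hypothetical.

```python
import re

# Assumed label formats in the OCR output; real contracts will vary.
FIELD_PATTERNS = {
    "contract_number": re.compile(r"Contract No\.?:\s*(\S+)"),
    "contractor_name": re.compile(r"Contractor:\s*(.+)"),
    "contract_date": re.compile(r"Contract Date:\s*(\d{4}-\d{2}-\d{2})"),
    "contract_amount": re.compile(r"Contract Amount:\s*(?:JPY\s*)?(\d[\d,]*)"),
}


def extract_contract_fields(text):
    """Return a dict of identified fields; a missing field maps to None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    if fields["contract_amount"] is not None:
        # Normalize "1,200,000" to the integer 1200000.
        fields["contract_amount"] = int(fields["contract_amount"].replace(",", ""))
    return fields


SAMPLE_TEXT = "\n".join(
    [
        "Contract No.: C-2024-0153",
        "Contractor: Example Trading Co., Ltd.",
        "Contract Date: 2024-04-01",
        "Contract Amount: JPY 1,200,000",
    ]
)

fields = extract_contract_fields(SAMPLE_TEXT)
print(fields)
```

A rule-based extractor like this breaks as soon as the label wording changes, which is one reason the description refers to natural language processing rather than fixed patterns.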
[0666] Users can then search for necessary contract information through this system. When a user enters a concise, plainly worded search prompt into the terminal, the terminal uses a query generation mechanism to generate a query to search the database based on the user's input. The server quickly searches the database in response to the query and collects the relevant information.
[0667] Search results are presented to the user via the terminal. These results include links to relevant documents and summaries of necessary contractual information. This allows users to quickly access the information they need and proceed with their work efficiently.
[0668] In this system, the server uses indexing mechanisms to structurally organize information and improve the speed of search result retrieval. Furthermore, data organization mechanisms organize documents based on the extracted data, thereby improving operational efficiency.
[0669] For example, if a user enters a prompt such as "search for contracts concluded in the past year," the terminal will receive this prompt and generate an appropriate query. The server will process the query, retrieve the relevant information from the database, and return the search results to the user. In this way, the present invention provides a means to significantly streamline the management of electronic documents.
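The query generation in this example can be sketched as a mapping from a prompt to a parameterized SQL query. The sketch assumes a rule-based mapping as a stand-in for the query generation mechanism, a table named contracts with an ISO 8601 contract_date column, and a 365-day window for "past year"; these, and the function name generate_query, are illustrative assumptions.

```python
from datetime import date, timedelta


def generate_query(prompt, today):
    """Translate a search prompt into (sql, params) for a parameterized query."""
    text = prompt.lower()
    if "past year" in text:
        start = today - timedelta(days=365)  # assumed meaning of "past year"
        sql = "SELECT * FROM contracts WHERE contract_date BETWEEN ? AND ?"
        return sql, (start.isoformat(), today.isoformat())
    if "this year" in text:
        start = date(today.year, 1, 1)
        sql = "SELECT * FROM contracts WHERE contract_date BETWEEN ? AND ?"
        return sql, (start.isoformat(), today.isoformat())
    # Fallback: no recognized condition, so return every registered contract.
    return "SELECT * FROM contracts", ()


sql, params = generate_query(
    "search for contracts concluded in the past year", date(2025, 6, 1)
)
print(sql)
print(params)
```

Passing the dates as parameters rather than formatting them into the SQL string keeps user-derived values out of the query text.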
[0670] The following describes the processing flow.
[0671] Step 1:
[0672] Users upload PDF files of contracts from their terminals to the server using the web interface of the business system.
[0673] Step 2:
[0674] The server receives the uploaded PDF file, performs a security check, and then saves the file to temporary storage.
[0675] Step 3:
[0676] The server activates the optical character recognition (OCR) system and extracts character information from the saved PDF file. During this process, image-based characters are converted into text data.
[0677] Step 4:
[0678] The server uses natural language processing to analyze important contract information from the text extracted by OCR processing, identifying, for example, the name of the contractor, the contract date, and the contract amount.
[0679] Step 5:
[0680] The server organizes the identified information and registers it in the contract database. Furthermore, it generates a reference link to the original PDF file and saves this link in the database as well.
[0681] Step 6:
[0682] The user uses the business system's search function to input conditions or keywords into the terminal to search for specific contract information.
[0683] Step 7:
[0684] The terminal activates a query generation mechanism based on user input and constructs a search query targeting the database.
[0685] Step 8:
[0686] The server receives the generated query, quickly searches the database, and retrieves relevant contract information and links.
[0687] Step 9:
[0688] The terminal receives search results from the server and displays the information to the user in a list format. This allows the user to quickly access the necessary contracts and information.
[0689] (Example 1)
[0690] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0691] In today's business environment, managing electronic information requires considerable time and effort. In particular, quickly and accurately extracting, managing, and searching for necessary information from important documents such as contracts is difficult. Furthermore, there is a demand for improved information retrieval speed. Solving these challenges is essential to improving operational efficiency.
[0692] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0693] In this invention, the server includes recognition means for extracting character information from electronic information, language processing means for analyzing the extracted character information and identifying specific information, and management means for registering the analyzed information in a storage device and generating a reference to the original electronic information. This enables rapid and accurate management of electronic information and efficient retrieval of necessary information.
[0694] "Recognition means" refers to a device or program that has the function of automatically extracting textual information from electronic information.
[0695] "Language processing means" refers to a device or program that analyzes extracted character information and performs processing to identify specific information.
[0696] "Management means" refers to a device or program that has the function of systematically registering the analyzed information and generating a reference to the original electronic information.
[0697] A "storage device" is a physical or virtual device used to permanently or temporarily store information.
[0698] A "search query" is a set of conditions or words used to find specific information within a set of information.
[0699] "Information presentation means" refers to a device or program that has the function of presenting search results or necessary information to the user visually or audibly.
[0700] "Indexing" is the process of structuring and organizing information to improve the speed and efficiency of searching.
[0701] To implement this invention, a system is constructed in which a server, terminal, and user work together.
[0702] First, the user uploads electronic documents such as contracts from their terminal through the business system interface. The terminal then transfers these documents to the server using a secure communication protocol.
[0703] The server extracts text information from received electronic documents using optical character recognition (OCR) technology. Specifically, known OCR software such as ABBYY FineReader may be used. The extracted text information is then analyzed using natural language processing (NLP) technology. Here, NLP libraries such as spaCy or NLTK may be used. Through this analysis, information such as the contract number, contractor name, and contract date is automatically identified.
[0704] The analyzed information is registered in a storage device for information management, and a link to the original electronic document is generated. This link facilitates the referencing and tracking of the information. Furthermore, the information is indexed, improving search efficiency.
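One common way to realize the indexing mentioned above is an inverted index that maps each term to the documents containing it. The sketch below is an assumption about how such indexing could work, since the description does not specify the index structure; the tokenizer, the sample documents, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


documents = {
    "doc1": "Lease contract with Example Corp, 2024",
    "doc2": "Supply contract with Sample Ltd, 2023",
    "doc3": "Lease renewal notice",
}
index = build_index(documents)
print(search(index, "lease contract"))
print(search(index, "contract"))
```

With the index built once at registration time, a search only touches the posting sets of the query terms instead of scanning every document.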
[0705] Users can quickly find registered information using the system's search function. For example, by entering a prompt such as "Search for contracts concluded this year," relevant contract information can be quickly retrieved. The terminal generates an appropriate search query based on this prompt, and a database search is performed on the server.
[0706] This embodiment of the invention allows users to efficiently manage and retrieve electronic information. The generative AI model assists in generating prompt sentences to enable flexible information retrieval tailored to the user's needs.
[0707] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0708] Step 1:
[0709] The user uses the business system interface to select and upload electronic documents, such as contracts, from their terminal. This operation imports the documents into the terminal's system. Once the import is complete, the terminal sends the electronic document data it received as input to the server.
[0710] Step 2:
[0711] The server receives electronic documents sent from terminals and applies optical character recognition (OCR) technology. This process extracts text information from the electronic documents. The OCR software outputs image-based characters as machine-readable text data.
[0712] Step 3:
[0713] The server analyzes the text data extracted by OCR using natural language processing (NLP). Based on the input text information, it identifies items such as contract number, contractor name, contract date, and contract amount, and outputs them as structured data. By using an NLP library, the server performs the specific processing of automatically extracting the necessary information from the document.
[0714] Step 4:
[0715] The server registers the analyzed structured data in the information management system and generates a link to the original electronic document. This process stores the identified information in the database, and the link is added as part of the output, making it easier to access the information.
[0716] Step 5:
[0717] The user enters a prompt into the terminal to search for the necessary information. For example, the user might enter the prompt "Search for contracts concluded in the past year."
[0718] Step 6:
[0719] The terminal parses user input prompts and generates appropriate database search queries. Search conditions are generated based on the entered prompt text, and the results are output as queries for searching the database.
[0720] Step 7:
[0721] The server searches the database using the generated query. The process then retrieves relevant information from the database, and the search results, including the corresponding contract information and links, are aggregated.
[0722] Step 8:
[0723] The terminal displays search results sent from the server to the user. This allows the user to quickly obtain the necessary information and improve work efficiency. The displayed information includes links to relevant documents and summaries.
[0724] (Application Example 1)
[0725] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0726] Conventional electronic document management systems often require manual extraction and management of important information such as contracts, resulting in time-consuming and labor-intensive processes. Furthermore, verifying and searching transaction records is cumbersome, hindering efficient business operations. Therefore, there is a need for a system that efficiently manages electronic contracts and allows users to easily access contract information and transaction history.
[0727] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0728] In this invention, the server includes recognition means for extracting character data from an electronic document, processing means for analyzing the extracted character data and identifying specific information, information control means for registering the analyzed information in a data storage and generating a reference to the original electronic document, and communication means for transmitting the information to a communication terminal so that users can easily check contract information and transaction records on their terminal devices. This automates the management of electronic contracts and enables users to efficiently access the information they need.
[0729] "Electronic documents" refer to documents created in digital format, such as contracts and transaction records.
[0730] "Character data" refers to individual characters and sets of characters extracted from electronic documents.
[0731] "Recognition means" refers to technologies and devices for extracting character data from electronic documents.
[0732] "Processing means" refers to software or algorithms used to analyze extracted character data and identify specific information.
[0733] "Data storage" refers to a database or storage system used to store analyzed information.
[0734] "Information control means" refers to a system that has the function of registering information in a data storage and generating a reference to the original electronic document.
[0735] "Communication means" refers to communication technologies and infrastructure that transmit information to users' terminal devices, enabling them to verify contract information and transaction records.
[0736] "Terminal devices" refer to devices that users use to access information, such as smartphones and computers.
[0737] To realize this invention, an efficient management system for electronic documents is constructed. The server uses optical character recognition (OCR) technology to recognize electronic documents such as electronic contracts and transaction records. Specifically, it uses the Google Cloud Vision API to extract text data and runs the natural language processing (NLP) analysis of that text on AWS Lambda. The analyzed data is registered in AWS RDS (Relational Database Service) and stored along with a link to the original document.
[0738] For communication devices used by users, such as smartphones, an application is installed on the device to allow for quick access to contract information and transaction records. This application communicates with a server via API Gateway, generates queries, and presents the search results to the user as a graphical user interface (GUI). As a concrete example, a user can immediately retrieve relevant information by entering a prompt such as "Show me all transaction records from last year."
[0739] By using this system, users can automate the processing of contracts, access necessary information more quickly, and significantly improve operational efficiency.
[0740] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0741] Step 1:
[0742] The user uploads an electronic document using their device. Upon uploading, the document is sent to the cloud, and OCR (Optical Character Recognition) is activated. The input is an electronic document, and the output is text data.
[0743] Step 2:
[0744] The server uses OCR technology to extract character data from electronic documents. Here, the Google Cloud Vision API is used to identify characters from image data and obtain them as text data. The input is an electronic document, and the output is the extracted character data.
[0745] Step 3:
[0746] The server uses AWS Lambda to perform natural language processing on the extracted text data. Here, it identifies information such as the contract number, contractor name, and contract date. The input is text data, and the output is the parsed specific contract information.
[0747] Step 4:
[0748] The server registers the analyzed contract information in AWS RDS and generates a link to the original electronic document. This link is stored in the database and can be accessed as needed. The input is the analyzed contract information, and the output is the database and link information.
[0749] Step 5:
[0750] The user requests a search based on a prompt entered into the terminal. The terminal generates a search query from the prompt and sends it to the server. The input is the prompt, and the output is the generated search query.
[0751] Step 6:
[0752] The server searches the database within AWS RDS based on the received search query and collects relevant contract information, using the indexed data to keep the search efficient. The input is the search query, and the output is the search results data.
[0753] Step 7:
[0754] The server sends search results to the terminal, which displays the results to the user via a GUI. The user can then review contract information and related links and perform necessary tasks. The input is the search results data, and the output is the information displayed to the user.
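Steps 3 to 6 above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation described in this disclosure: regular expressions stand in for the NLP stage, an in-memory SQLite database stands in for AWS RDS, and the field patterns, table layout, and function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical field patterns; a real system would use a proper NLP stage.
FIELD_PATTERNS = {
    "contract_number": re.compile(r"Contract\s+No\.?\s*:?\s*([A-Z0-9-]+)"),
    "holder_name": re.compile(r"Contract\s+Holder\s*:\s*(.+)"),
    "contract_date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_contract_fields(text):
    """Step 3 stand-in: pull contract fields out of OCR text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields


def open_store():
    """Create an in-memory database (stand-in for AWS RDS) with an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contracts ("
        "contract_number TEXT, holder_name TEXT, "
        "contract_date TEXT, document_link TEXT)"
    )
    # The index on the date column supports the search in step 6.
    conn.execute("CREATE INDEX idx_contract_date ON contracts (contract_date)")
    return conn


def register_contract(conn, fields, document_link):
    """Step 4: store the analyzed fields with a link to the original file."""
    conn.execute(
        "INSERT INTO contracts VALUES (?, ?, ?, ?)",
        (
            fields["contract_number"],
            fields["holder_name"],
            fields["contract_date"],
            document_link,
        ),
    )
    conn.commit()


def search_by_year(conn, year):
    """Step 6: return all contracts dated within the given calendar year."""
    start, end = f"{year}-01-01", f"{year + 1}-01-01"
    rows = conn.execute(
        "SELECT contract_number, holder_name, contract_date, document_link "
        "FROM contracts WHERE contract_date >= ? AND contract_date < ? "
        "ORDER BY contract_date",
        (start, end),
    )
    return rows.fetchall()


SAMPLE_TEXT = """Contract No: C-2024-001
Contract Holder: Taro Yamada
Contract Date: 2024-03-01
"""
```

In this sketch, a prompt such as "Show me all transaction records from last year" would be reduced by the terminal to a call like `search_by_year(conn, 2024)`; how the prompt is mapped to that call is left open here.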
[0755] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0756] The present invention is a system that incorporates an emotion engine that analyzes user emotions and applies them to information retrieval and presentation. The system mainly consists of a server, a terminal, and the user.
[0757] First, the user uploads the contract and related documents as electronic data. The server extracts text information from the electronic documents, such as PDFs, using optical character recognition (OCR). This information is then analyzed by natural language processing to identify specific contract information. This identified information is registered in a database, and a link to the original document is also generated.
[0758] A distinctive feature of this system is the introduction of an emotion engine. When a user inputs data or interacts with the system, the server analyzes the user's emotions through the emotion engine based on the data and behavioral patterns obtained from the terminal. Based on this, the way search results are presented and the priority of information are dynamically adjusted, enabling the provision of information that is tailored to the user.
[0759] For example, if a user indicates an emotion requiring urgent attention, the server can change its priorities to quickly display relevant search results. Similarly, if the emotion engine determines that the user is relaxed, it can provide more detailed information than usual.
[0760] Furthermore, the emotion engine also suggests changes to the information presentation style and design on the device to ensure users can continue to use the interface without stress. This feature significantly improves the user experience and leads to smoother workflow.
[0761] For example, when a user enters "I urgently need the latest contract information," the emotion engine senses a high level of urgency. This prompts the server to quickly provide information, and relevant, important contract information is promptly displayed on the terminal. In this way, the present invention realizes an advanced information management system that responds to user needs through emotion estimation.
[0762] The following describes the processing flow.
[0763] Step 1:
[0764] Users upload PDF contract files from their terminals to the server via the business system interface. During this process, user actions and input data are used for initial analysis by the emotion engine.
[0765] Step 2:
[0766] The server receives the uploaded PDF file, activates the optical character recognition (OCR) system, and extracts character information from the electronic document. This character information is temporarily stored for later natural language processing.
[0767] Step 3:
[0768] The server uses natural language processing to analyze the text information extracted by OCR. This analysis identifies important information such as the contractor's name, contract date, and contract amount, and then initiates a process to register this information in a database.
[0769] Step 4:
[0770] The server uses information management tools to organize the analyzed information, generate links to the original electronic documents, and store them in a database. This process allows for quick access to the information when searching later.
[0771] Step 5:
[0772] The device activates an emotion engine and determines the user's emotional state based on their input speed and content. Based on this data, it analyzes whether the user's current psychological state is high-stress or low-stress.
[0773] Step 6:
[0774] The user enters criteria on the interface to search for specific contract information. As the user enters the information, the system monitors the user's keyboard typing patterns and mouse movements.
[0775] Step 7:
[0776] The terminal uses a query generation mechanism to construct database queries based on the user's emotional state and search criteria. The priority of these queries is dynamically adjusted according to the user's emotions.
[0777] Step 8:
[0778] The server processes queries and searches the database based on the analysis results of the emotion engine. It efficiently retrieves search results and prioritizes information as needed.
[0779] Step 9:
[0780] The device receives search results from the server and quickly displays information tailored to the user's analyzed emotional state. By presenting information in a way that suits the user's state, the user can view and use the information in the most optimal manner.
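Steps 5 to 7 of this flow (estimating the user's state from input speed and content, then adjusting the query) can be illustrated as follows. This is a rough sketch under assumptions: the thresholds, the keyword list, the scoring rule, and the `contracts` table layout are hypothetical and do not come from this disclosure, and a simple rule set stands in for the emotion engine.

```python
from dataclasses import dataclass


@dataclass
class InputSignals:
    chars_per_second: float  # typing speed observed on the terminal
    backspace_ratio: float   # fraction of keystrokes that were corrections
    text: str                # what the user typed


# Hypothetical keyword list used as a crude content signal.
URGENT_WORDS = ("urgent", "immediately", "asap")


def estimate_stress(signals):
    """Return 'high' or 'low' using simple, hypothetical thresholds."""
    score = 0
    if signals.chars_per_second > 6.0:
        score += 1
    if signals.backspace_ratio > 0.2:
        score += 1
    if any(word in signals.text.lower() for word in URGENT_WORDS):
        score += 2
    return "high" if score >= 2 else "low"


def build_query(keyword, stress):
    """Build a parameterized query whose result limit depends on stress.

    Under high stress only the few most recent matches are requested;
    otherwise a longer list is returned.
    """
    limit = 3 if stress == "high" else 20
    sql = (
        "SELECT contract_number, holder_name, contract_date, document_link "
        "FROM contracts WHERE holder_name LIKE ? "
        "ORDER BY contract_date DESC LIMIT ?"
    )
    return sql, (f"%{keyword}%", limit)
```

Returning the SQL text together with its parameters keeps the query parameterized, so user input is never concatenated into the statement.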
[0781] (Example 2)
[0782] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0783] In the modern era, while information is rapidly digitizing, there is a lack of systems that can quickly and efficiently extract necessary information from electronic documents and provide it appropriately according to the user's situation and emotions. Furthermore, because there are no technological means to improve the user experience by presenting information while taking user emotions into consideration, problems such as stress and information overload are occurring.
[0784] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0785] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in an information recording unit and generating connection information to the original electronic document, and emotion analysis means for analyzing the user's emotions and adjusting the method of presenting information based on the results. This enables the user to always receive information in a form that suits their emotional state, thereby improving the accuracy of information acquisition and the user experience.
[0786] An "electronic document" is a file containing text data and image data stored in a format that can be read by computers and electronic devices.
[0787] "Character information" refers to information expressed in text format, including data such as strings of characters and numbers.
[0788] "Optical character recognition means" refers to a device or program that reads characters from image data and converts them into text data.
[0789] "Natural language processing means" refers to technology by which a computer analyzes and understands the natural language that humans use.
[0790] "Information management means" refers to a mechanism or system for efficiently organizing, storing, retrieving, and managing information.
[0791] "Emotion analysis means" refers to an analytical method that estimates a user's emotions and psychological state from data.
[0792] The "information recording unit" is a data storage area that stores acquired information and arranges it in a way that allows access as needed.
[0793] A "search query generation means" is a function or process that automatically or manually generates inquiries for information retrieval.
[0794] "Information presentation means" refers to a method or device for presenting acquired information to the user in an easily understandable manner.
[0795] "Operation data collection means" refers to a mechanism or device that collects data related to the user's operation history and usage status.
[0796] "Systematization means" refers to a mechanism or method for structuring information and organizing it in an easily accessible format.
[0797] An "information classification and organization means" is a method or apparatus for classifying and organizing acquired information according to specific criteria.
[0798] "Information presentation adjustment means" refers to a function or process that dynamically adjusts the method of presenting information based on sentiment analysis and usage conditions.
[0799] This invention uses a system consisting of a user, a terminal, and a server to perform information analysis on electronic documents and provide information tailored to the user's emotions.
[0800] The user uploads contracts and related documents as electronic files to their device. The device then sends these electronic documents to the server.
[0801] The server plays a central role in processing received electronic documents. First, the server extracts text information from the document using optical character recognition (OCR) technology. This process uses common OCR tools such as Tesseract OCR. Next, the extracted text information is analyzed using natural language processing (NLP) technology. NLP processing uses libraries such as SpaCy and NLTK to identify contract information and important text data.
[0802] The server registers the identified information in a database and generates a link to the original document. This database management makes it possible to quickly retrieve the necessary information.
[0803] Furthermore, the server incorporates an emotion analysis engine. This engine evaluates emotional data from the user's inputs and actions on the device. The server utilizes generative AI models built with machine learning frameworks such as TensorFlow and PyTorch to analyze the user's emotional state (e.g., urgency, relaxation) and dynamically adjusts how information is presented based on the obtained emotional information.
[0804] The terminal is responsible for retrieving information sent from the server and displaying it in a format suitable for the user. For example, if the user needs urgent information, relevant information will be displayed quickly and prioritized on the terminal. Conversely, if the user is relaxed, more detailed and comprehensive information will be provided. The terminal can also adjust the screen design and information presentation style according to the user's emotional state.
[0805] For example, if a user enters "I urgently need information about the latest contracts," the server performs sentiment analysis on this input and detects a high level of urgency. This prompts the server to quickly search its database and begin the process of displaying relevant and important contract information on the user's device. Examples of prompts for the AI model in this scenario include "Consider the user's urgency and prioritize displaying important information."
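As one way to picture how such a prompt could be assembled, the short sketch below combines the user's input with an estimated urgency label. It is an assumption-laden illustration: the function name, the label values, and the instruction used for the non-urgent case are hypothetical; only the urgent instruction text is taken from the example above.

```python
def build_prompt(user_input, urgency):
    """Assemble a prompt for the AI model from the input and urgency label."""
    if urgency == "high":
        # Instruction text quoted from the example in this description.
        instruction = (
            "Consider the user's urgency and prioritize displaying "
            "important information."
        )
    else:
        # Hypothetical instruction for the non-urgent case.
        instruction = "Provide detailed and comprehensive information."
    return (
        f"{instruction}\n"
        f"User input: {user_input}\n"
        f"Estimated urgency: {urgency}"
    )
```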
[0806] This invention allows users to efficiently obtain necessary information without experiencing stress. By adapting information delivery to the user's emotional state, it becomes possible to provide a more comfortable user experience.
[0807] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0808] Step 1:
[0809] The user uploads contracts and related documents as electronic data to the terminal. The input data is in PDF or image format. The terminal sends these files to the server. The server temporarily stores the received documents to pass them on to the next processing step. At this stage, the electronic documents are received as input and are ready for OCR processing.
[0810] Step 2:
[0811] The server performs optical character recognition (OCR) on the received electronic document. The input is the document file sent in step 1. Using software such as Tesseract OCR, the server extracts character information from the image data and converts it into text format. This OCR process generates string data from the document as output.
[0812] Step 3:
[0813] The server analyzes the character information extracted by OCR using natural language processing (NLP) techniques. The input data is the string data obtained in the previous step. The server uses SpaCy or NLTK to identify contract-related information in the document (e.g., contractor name, date, conditions, etc.) and organize it as structured data. This analysis identifies specific information, preparing it for registration in the database as output.
[0814] Step 4:
[0815] The server registers the analyzed information in the database. The input is the structured data identified in step 3. Based on this information, the server also generates a link to the original document and saves it in the database. In this step, the registered data is generated as output and becomes accessible for subsequent search queries.
[0816] Step 5:
[0817] The device records user input and actions, collecting sentiment data. Input consists of the user's operation history and actual interactions. The device sends this data to a server, preparing it for use as the basis for sentiment analysis. At this stage, user behavior data is obtained as output.
[0818] Step 6:
[0819] The server analyzes data sent from the terminal using an emotion analysis engine to estimate the user's emotional state. The input is the user's behavioral data collected in step 5. Using generative AI models built with machine learning frameworks such as TensorFlow and PyTorch, the server identifies the user's emotions (such as the degree of urgency or relaxation) and generates analysis results. These results are output as emotion data used to adjust the information presented.
[0820] Step 7:
[0821] The server adjusts how information is presented based on the sentiment analysis results. The input is the sentiment data obtained in step 6. The server prioritizes and modifies the display method to provide information optimized for the user. For example, if the user indicates urgency, relevant information is prioritized and presented quickly. The adjusted information is then presented to the user as output.
[0822] Step 8:
[0823] The terminal receives information provided by the server and displays it to the user in the most optimal way. The input is the information adjusted in step 7. The terminal dynamically changes the screen layout and presentation style according to the user's emotions. For example, if the terminal determines that the user is relaxed, detailed information will be displayed. In this step, the information provided to the user is finalized as the final output.
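Steps 7 and 8 can be pictured as a mapping from the estimated emotion to presentation settings, which the terminal then applies when rendering. The following is a minimal sketch, assuming hypothetical emotion labels ("urgent", "relaxed"), hypothetical settings, and a hypothetical result format with `title`, `link`, and `summary` keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presentation:
    max_items: int      # how many results to show
    detail_level: str   # "summary" or "full"
    layout: str         # name of the screen layout to use


def adjust_presentation(emotion):
    """Step 7: choose presentation settings from the emotion label."""
    if emotion == "urgent":
        return Presentation(max_items=3, detail_level="summary", layout="compact")
    if emotion == "relaxed":
        return Presentation(max_items=20, detail_level="full", layout="detailed")
    return Presentation(max_items=10, detail_level="summary", layout="standard")


def render(results, emotion):
    """Step 8: format results for display according to the settings."""
    settings = adjust_presentation(emotion)
    lines = []
    for result in results[: settings.max_items]:
        line = f"{result['title']} ({result['link']})"
        if settings.detail_level == "full":
            line += f": {result['summary']}"
        lines.append(line)
    return lines
```

Keeping the settings in one immutable object separates the server-side decision (step 7) from the terminal-side rendering (step 8).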
[0824] (Application Example 2)
[0825] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0826] Conventional electronic document management systems have been unable to dynamically present information in response to the user's emotional state, limiting the improvement of the user experience. In particular, in the field of electronic payments, optimizing information based on the user's emotions is required, but current systems are not adequately capable of doing so.
[0827] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0828] In this invention, the server includes optical character recognition means for extracting character information from an electronic document, natural language processing means for analyzing the extracted character information and identifying desired information, information management means for registering the analyzed information in a database and generating a link to the original electronic document, and emotion analysis means for analyzing the user's emotional state and dynamically adjusting the method of presenting information. This enables optimal presentation of information according to the user's emotional state and smooth information provision in electronic payments.
[0829] An "optical character recognition means" is a device that extracts character information from electronic documents.
[0830] "Natural language processing means" refers to technology that analyzes extracted textual information and identifies information that aligns with the user's intent.
[0831] "Information management means" refers to a technological system that registers analyzed information in a database and generates links to facilitate access to the original electronic document.
[0832] "Emotion analysis means" is a technological system that analyzes the user's emotional state from their input and actions, and dynamically adjusts the information presented based on that analysis.
[0833] A "query generation mechanism" is a system that automatically creates appropriate database queries based on user input.
[0834] "Information presentation means" refers to technologies that display search results to users visually or by other means, and that display them in a way that is appropriate to the user's emotional state.
[0835] An "indexing method" is a technique for organizing electronic documents and related information so that they can be efficiently searched.
[0836] A "data organization method" is a technical system that arranges extracted information according to certain rules and formats, making it available for use.
[0837] "Payment support methods" are technologies that optimize payment information and options according to the user's emotions, thereby supporting a smooth payment process.
[0838] The system that realizes this invention will be built as a smartphone application to support electronic payments. At the core of this system is a server with an integrated emotion analysis engine that analyzes user input and behavior and provides information according to the user's emotional state.
[0839] The server first receives data sent by the user through the application. This data includes purchase information as electronic documents and user interaction logs. Using optical character recognition and natural language processing, the server extracts useful character information from this data and identifies the desired information. The analyzed information is registered in a database, and a link to the original document is also generated as needed.
[0840] Furthermore, the server uses an emotion analysis engine built on a generative AI model to recognize the user's emotional state in real time. This engine analyzes emotions from the user's input patterns and past data, and can dynamically adjust the priority and method of information presentation. For example, if the user indicates an urgent need, it can immediately present concise information and simplified payment options.
[0841] On the terminal, information is presented using search results and payment options. The interface design and amount of information are appropriately adjusted according to the user's emotional state, ensuring a stress-free experience. For example, in a situation where a user is about to miss a train and needs to quickly purchase a ticket, the simplest payment method is presented, allowing for a swift completion of the transaction.
[0842] An example of a prompt for a generative AI model is, "How can we offer the best payment method when the user is in a hurry?" This prompt serves as a guide to improve the output of the sentiment analysis engine and helps to enhance the user experience.
[0843] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0844] Step 1:
[0845] The user accesses an electronic payment application using their smartphone and selects the items they wish to purchase. Input includes the item selection and purchase intent. The device prepares to send this information to the server. The selected items are sent to the server as output.
[0846] Step 2:
[0847] The server analyzes the received user data using optical character recognition and natural language processing. The input includes purchase intent and product information sent by the user. The server extracts useful character information from this data and identifies specific purchase information. The identified information is generated as output and registered in the database.
[0848] Step 3:
[0849] The server uses the emotion analysis engine, which is built on a generative AI model, to analyze the user's emotional state in real time. Inputs include the user's behavioral patterns and past interaction data. Based on this data, the server determines whether the user is in an urgent situation or relaxed. The output is an evaluation of the emotional state.
[0850] Step 4:
[0851] The server generates the optimal payment option through information presentation methods based on the results of sentiment analysis. The input is the evaluation result of the emotional state. Based on the evaluation, the server dynamically adjusts the priority and method of information presentation. In the case of an emotion indicating urgency, an intuitive and rapid payment procedure is selected and presented as the output.
[0852] Step 5:
[0853] The terminal displays optimized payment information and options sent from the server to the user. The input consists of payment options and information generated by the server. The terminal displays this information in a user-friendly format and prompts the user to take action. The output is a user-friendly presentation of information.
[0854] Step 6:
[0855] The user reviews the presented payment options and completes the final purchase confirmation process. The input is the information displayed on the terminal. The user's actions initiate the final payment process, and the output is the completion of the purchase.
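The selection in step 4 can be sketched as ranking payment options by how quickly they can be completed and narrowing the list when the user is in a hurry. This is an illustration only: the `steps` metric, the option names, and the "urgent" label are hypothetical assumptions, not part of this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentOption:
    name: str
    steps: int  # hypothetical metric: user actions needed to finish payment


def select_payment_options(options, emotion):
    """Return the options to present, simplest first.

    For an urgent user only the single simplest option is returned;
    otherwise every option is returned in order of simplicity.
    """
    ranked = sorted(options, key=lambda option: option.steps)
    if emotion == "urgent":
        return ranked[:1]
    return ranked
```

In the train ticket example, a user judged to be in a hurry would be shown only the option with the fewest steps.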
[0856] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0857] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0858] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0859] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0860] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0861] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0862] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
[0863] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
[0864] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0865] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
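The property that emotions located close together on the map have similar values can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the ring and angle coordinates assigned to each emotion, the Gaussian weighting, and the idea of deriving target values this way are hypothetical and are not taken from the emotion map 400 or from the training procedure of this disclosure.

```python
import math

# Hypothetical map positions: (ring index from the center, angle in degrees).
# 0 degrees is the 3 o'clock direction; angles increase counterclockwise,
# so 90 is the upper side ("pleasure") and 270 the lower side ("displeasure").
EMOTION_POSITIONS = {
    "calm": (2, 0.0),
    "reassured": (2, 15.0),
    "confident": (2, 30.0),
    "pleasure": (1, 90.0),
    "displeasure": (1, 270.0),
}


def to_xy(ring, angle_deg):
    """Convert a (ring, angle) map position to Cartesian coordinates."""
    angle = math.radians(angle_deg)
    return ring * math.cos(angle), ring * math.sin(angle)


def soft_targets(label, sigma=1.0):
    """Spread a single emotion label over the map with a Gaussian kernel.

    Emotions near the labeled one receive values close to its value, and
    distant emotions receive small values. The values sum to 1.
    """
    label_x, label_y = to_xy(*EMOTION_POSITIONS[label])
    weights = {}
    for name, position in EMOTION_POSITIONS.items():
        x, y = to_xy(*position)
        squared_distance = (x - label_x) ** 2 + (y - label_y) ** 2
        weights[name] = math.exp(-squared_distance / (2 * sigma ** 2))
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def dominant_emotion(values):
    """Pick the emotion with the largest value."""
    return max(values, key=values.get)
```

With these assumed positions, a "calm" label gives "reassured" and "confident" values near that of "calm", while "displeasure" stays low, which mirrors the kind of similarity described for Figure 10.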
[0866] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0867] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0868] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.
[0869] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0870] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0871] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0872] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0873] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0874] Furthermore, more specifically, an electrical circuit that combines circuit elements such as semiconductor devices can be used as the hardware structure of these various processors. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps may be deleted, new steps may be added, or the processing order may be rearranged without departing from the gist of this disclosure.
[0875] The descriptions and illustrations presented above are detailed explanations of the parts related to the technology of this disclosure and are merely examples of that technology. For example, the above descriptions of the structure, function, operation, and effect are descriptions of examples of the structure, function, operation, and effect of the parts related to the technology of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements may be added, or elements may be replaced in the descriptions and illustrations presented above, without departing from the gist of the technology of this disclosure. Furthermore, to avoid confusion and facilitate understanding of the parts related to the technology of this disclosure, explanations of common technical knowledge that requires no special explanation to enable implementation of the technology have been omitted from the descriptions and illustrations presented above.
[0876] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0877] The following is further disclosed regarding the embodiments described above.
[0878] (Claim 1)
[0879] An optical character recognition means for extracting character information from electronic documents;
[0880] a natural language processing means for analyzing the extracted character information and identifying desired information; and
[0881] an information management means for registering the analyzed information in a database and generating a link to the original electronic document.
[0882] A system comprising the above means.
[0883] (Claim 2)
[0884] A query generation means for searching the database based on user input and presenting relevant information; and
[0885] an information presentation means for displaying search results to the user.
[0886] The system according to claim 1, further comprising the above means.
[0887] (Claim 3)
[0888] An indexing means for improving search efficiency by indexing electronic documents and related information; and
[0889] a data organization means for organizing electronic documents based on the extracted information.
[0890] The system according to claim 1, further comprising the above means.
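The arrangement in Claims 1 to 3 above can be illustrated with a minimal Python sketch. It is not the disclosed implementation, only one possible reading under stated assumptions: the optical character recognition step is a stub (`recognize_text`) that treats its input as already-decoded text, the natural language processing step is a regular expression standing in for a real analyzer, and every name used here (the `contracts` table, `register_document`, `search`, the sample file links) is hypothetical.

```python
import re
import sqlite3


def recognize_text(document_bytes: bytes) -> str:
    """Stand-in for the OCR means (assumption): input already holds UTF-8 text.

    A real system would call an OCR engine here.
    """
    return document_bytes.decode("utf-8")


def analyze_text(text: str) -> dict:
    """Stand-in for the NLP means (assumption): extract a party name and a date."""
    party = re.search(r"Party:\s*(.+)", text)
    date = re.search(r"\d{4}-\d{2}-\d{2}", text)
    return {
        "party": party.group(1).strip() if party else None,
        "date": date.group(0) if date else None,
    }


def open_store() -> sqlite3.Connection:
    """Create an in-memory database plus an index on the searched column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contracts (id INTEGER PRIMARY KEY, party TEXT, "
        "date TEXT, body TEXT, link TEXT)"
    )
    # Indexing step: lets equality lookups on party avoid a full table scan.
    conn.execute("CREATE INDEX idx_contracts_party ON contracts (party)")
    return conn


def register_document(conn: sqlite3.Connection, document_bytes: bytes, link: str) -> int:
    """Run recognition and analysis, then store the result with a link to the original."""
    text = recognize_text(document_bytes)
    info = analyze_text(text)
    cur = conn.execute(
        "INSERT INTO contracts (party, date, body, link) VALUES (?, ?, ?, ?)",
        (info["party"], info["date"], text, link),
    )
    conn.commit()
    return cur.lastrowid


def search(conn: sqlite3.Connection, user_input: str) -> list:
    """Query generation: turn user input into a parameterized SQL query."""
    rows = conn.execute(
        "SELECT party, date, link FROM contracts WHERE party = ? ORDER BY date",
        (user_input.strip(),),
    )
    return rows.fetchall()


conn = open_store()
register_document(conn, b"Party: Example Corp\nSigned 2024-01-15", "file:///docs/a.pdf")
register_document(conn, b"Party: Sample LLC\nSigned 2023-11-02", "file:///docs/b.pdf")
results = search(conn, " Example Corp ")
```

Each returned row carries the stored link, so a presentation layer can point the user back to the original electronic document. The exact-match lookup is chosen only so that the query can use the index; a real system would likely need fuzzier matching.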
[0891] "Example 1"
[0892] (Claim 1)
[0893] A recognition means for extracting character information from electronic information;
[0894] a language processing means for analyzing the extracted character information and identifying specific information;
[0895] a management means for registering the analyzed information in a storage device and generating a reference to the original electronic information; and
[0896] a means for generating search queries based on user input.
[0897] A system comprising the above means.
[0898] (Claim 2)
[0899] An information presentation means for searching the storage device and presenting related information; and
[0900] a means for collecting information in response to the generated search query.
[0901] The system according to claim 1, further comprising the above means.
[0902] (Claim 3)
[0903] A means for improving search speed by indexing information; and
[0904] a means for organizing information based on the extracted information.
[0905] The system according to claim 1, further comprising the above means.
[0906] "Application Example 1"
[0907] (Claim 1)
[0908] A recognition means for extracting character data from electronic documents;
[0909] a processing means for analyzing the extracted character data and identifying specific information;
[0910] an information control means for registering the analyzed information in a data storage and generating a reference to the original electronic document; and
[0911] a communication means for transmitting information to a communication terminal so that users can easily check contract information and transaction records on their terminal device.
[0912] A system comprising the above means.
[0913] (Claim 2)
[0914] A query generation means for searching the data storage based on user input and displaying related information; and
[0915] an information provision means for outputting search results to the communication terminal.
[0916] The system according to claim 1, further comprising the above means.
[0917] (Claim 3)
[0918] A means for indexing electronic documents and related information to improve search efficiency; and
[0919] a means for organizing electronic documents based on the extracted data.
[0920] The system according to claim 1, further comprising the above means.
[0921] "Example 2 of combining an emotion engine"
[0922] (Claim 1)
[0923] An optical character recognition means for extracting character information from electronic documents;
[0924] a natural language processing means for analyzing the extracted character information and identifying desired information;
[0925] an information management means for registering the analyzed information in an information recording unit and generating connection information to the original electronic document; and
[0926] a means for analyzing user emotions and adjusting the way information is presented based on the results.
[0927] A system comprising the above means.
[0928] (Claim 2)
[0929] A search query generation means for searching the information recording unit based on user input and presenting relevant information;
[0930] an information presentation means for displaying search results to the user; and
[0931] a means for collecting user operation data and providing it to an emotion analysis means.
[0932] The system according to claim 1, further comprising the above means.
[0933] (Claim 3)
[0934] A systematization means for organizing electronic documents and related information so that they are searchable;
[0935] an information classification and organization means for classifying and organizing electronic documents based on the extracted information; and
[0936] a means for adjusting information presentation, based on the results of emotion analysis, to improve the user experience.
[0937] The system according to claim 1, further comprising the above means.
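The emotion-related elements of Example 2 above (collecting user operation data, analyzing emotion, and adjusting presentation) can be sketched as follows. This is only a toy illustration: the disclosure does not specify how emotion is estimated here, so the `OperationData` fields, the thresholds, the emotion labels, and the presentation rules are all hypothetical stand-ins chosen to make the flow concrete.

```python
from dataclasses import dataclass


@dataclass
class OperationData:
    """Hypothetical user operation data collected during a search session."""
    repeated_queries: int       # times the same query was re-issued
    seconds_per_result: float   # average time spent viewing each result


def estimate_emotion(ops: OperationData) -> str:
    """Toy rule standing in for the emotion analysis means (thresholds are arbitrary)."""
    if ops.repeated_queries >= 3:
        return "frustrated"
    if ops.seconds_per_result < 2.0:
        return "hurried"
    return "neutral"


def adjust_presentation(results: list, emotion: str) -> list:
    """Adjust how search results are presented according to the estimated emotion."""
    if emotion == "frustrated":
        # Narrow to the top hit and point the user at the linked original.
        return [f"{r} (see linked original)" for r in results[:1]]
    if emotion == "hurried":
        # Show a short list only.
        return results[:3]
    return results


hits = ["Contract A", "Contract B", "Contract C", "Contract D"]
emotion = estimate_emotion(OperationData(repeated_queries=4, seconds_per_result=5.0))
shown = adjust_presentation(hits, emotion)
```

In this sketch the emotion estimate only changes how many results are shown and how they are annotated; the search itself is unaffected.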
[0938] "Application Example 2 of combining an emotion engine"
[0939] (Claim 1)
[0940] An optical character recognition means for extracting character information from electronic documents;
[0941] a natural language processing means for analyzing the extracted character information and identifying desired information;
[0942] an information management means for registering the analyzed information in a database and generating a link to the original electronic document; and
[0943] a means for analyzing the user's emotional state and dynamically adjusting the way information is presented.
[0944] A system comprising the above means.
[0945] (Claim 2)
[0946] A query generation means for searching the database based on user input and presenting relevant information; and
[0947] an information presentation means for displaying search results to the user and presenting information according to the user's emotional state.
[0948] The system according to claim 1, further comprising the above means.
[0949] (Claim 3)
[0950] An indexing means for improving search efficiency by indexing electronic documents and related information;
[0951] a data organization means for organizing electronic documents based on the extracted information; and
[0952] a payment support means for optimizing payment information and options according to the user's emotions.
[0953] The system according to claim 1, further comprising the above means.
[Explanation of Symbols]
[0954]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising: an optical character recognition means for extracting character information from electronic documents; a natural language processing means for analyzing the extracted character information and identifying desired information; and an information management means for registering the analyzed information in a database and generating a link to the original electronic document.
2. The system according to claim 1, further comprising: a query generation means for searching the database based on user input and presenting relevant information; and an information presentation means for displaying search results to the user.
3. The system according to claim 1, further comprising: an indexing means for improving search efficiency by indexing electronic documents and related information; and a data organization means for organizing electronic documents based on the extracted information.