system
A system collects and analyzes data to generate multilingual, customizable terminology dictionaries, addressing communication gaps and enhancing knowledge sharing and understanding within organizations.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
AI Technical Summary
Existing systems fail to efficiently address misunderstandings and inefficiencies caused by insufficient understanding of technical terms and abbreviations within organizations, particularly in multilingual environments, leading to communication gaps and ineffective knowledge sharing.
A system that collects data from various sources, analyzes it using contextual analysis, and generates definitions for specialized terminology, translating them into multiple languages and customizing dictionaries for departments or individuals, ensuring real-time updates and optimized communication.
Facilitates effective communication and knowledge sharing across departments and languages by providing real-time, customizable, and multilingual specialized terminology dictionaries, enhancing understanding and productivity.
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance as a response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] There is a need to eliminate misunderstandings and inefficiencies caused by insufficient understanding of technical terms and abbreviations within an organization, to enable new employees and transferees to quickly acquire organizational knowledge, and to narrow the communication gap between departments and different specialized fields. A further challenge is to achieve effective knowledge sharing and quick information access while keeping terminology consistent in a multilingual environment.
Means for Solving the Problems
[0005] This invention uses data collection means to collect data containing specialized terminology from sources such as social media and call records, and extracts important descriptions from that data. The usage of the extracted descriptions is analyzed by contextual analysis means. Based on the analysis results, an AI model automatically generates definitions for the descriptions. The generated definitions are compiled into a specialized terminology dictionary and further translated into multiple languages by translation means. This makes it possible to organize and provide terminology in a form usable by all members of the organization. Furthermore, real-time updates of the terminology dictionary are achieved by update means, and dictionaries optimized for each department or individual can be provided by customization means. This supports efficient communication both within and outside the organization.
[0006] "Data collection means" refers to methods and devices for automatically acquiring information from sources such as social media, email, voice calls, and meeting minutes.
[0007] "Preprocessing means" refers to a device or method for preparing collected data into an analyzable format, and includes processes such as deleting unnecessary information and tokenizing the data.
[0008] "Extraction means" refers to techniques and devices for identifying and selecting important descriptions and terms from data that has been preprocessed using natural language processing techniques.
[0009] "Contextual analysis means" refers to devices and techniques for analyzing how extracted descriptions and terms are used and understanding their meaning, and which perform analysis using a context window.
[0010] "Definition generation means" refers to a device and method for automatically generating definitions to clearly explain the meaning of a description or term based on the results of contextual analysis.
[0011] "Dictionary construction means" refers to a method and apparatus for constructing a specialized terminology dictionary that can be easily used by organizations and individuals, using the generated definitions.
[0012] "Translation means" refers to technologies and devices that automatically translate terms and definitions contained in a dictionary into multiple languages.
[0013] "Update means" refers to technologies and devices for reflecting new information and definitions in a specialized terminology dictionary in real time and keeping it up to date.
[0014] "Customization means" refers to methods and apparatus for optimizing and providing specialized terminology dictionaries for each department or individual, according to the user's needs.
Brief Description of the Drawings
[0015]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2 when combined with an emotion engine.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when combined with an emotion engine.
Mode for Carrying Out the Invention
[0016] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0017] First, the terms used in the following description will be explained.
[0018] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be one arithmetic unit or a combination of a plurality of arithmetic units. The processor may also be one type of arithmetic unit or a combination of a plurality of types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0019] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0020] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs and various parameters. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0021] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0022] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when three or more items are linked by "and/or."
[0023] [First Embodiment]
[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0036] This invention is server-centric and begins with the server collecting data. The server uses APIs to acquire text data from SNS and email servers and uses a speech recognition engine to convert audio data from calls and meetings into text. The collected data is then preprocessed on the server. Unnecessary symbols and spaces are removed, tokenization is performed, and the data is prepared in a format suitable for analysis.
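The cleansing and tokenization described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `preprocess` and the two regular expressions are hypothetical choices made for this example, and casing is preserved on the assumption that abbreviations should stay recognizable.

```python
import re


def preprocess(raw_text: str) -> list[str]:
    """Strip tag-like markup and symbols, then split into tokens."""
    # Assumption: anything shaped like <...> is markup noise.
    text = re.sub(r"<[^>]+>", " ", raw_text)
    # Keep word characters, whitespace and hyphens; blank out everything else.
    text = re.sub(r"[^\w\s-]", " ", text)
    # Splitting on whitespace also collapses repeated spaces.
    return text.split()


tokens = preprocess("<p>The  KPI   review (Q3) is due!</p>")
print(tokens)
```

A production system would more likely rely on a natural language processing library for tokenization, particularly for languages that do not separate words with spaces.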
[0037] Next, the server extracts key terms from the pre-processed data using natural language processing techniques. The terms extracted in this step are then analyzed to determine how they are used in each context. The information obtained through contextual analysis becomes the foundational data used by AI to clarify the meaning of these terms.
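One simple way to picture the contextual analysis is a fixed-size context window around each occurrence of a term. The sketch below is illustrative only; the function name `context_windows` and the default window size are assumptions, and a real system might use sentence boundaries or model embeddings instead.

```python
def context_windows(tokens: list[str], term: str, window: int = 3) -> list[list[str]]:
    """Return the tokens surrounding each occurrence of `term`."""
    contexts = []
    for i, token in enumerate(tokens):
        if token == term:
            start = max(0, i - window)
            # Tokens before the term plus tokens after it, excluding the term itself.
            contexts.append(tokens[start:i] + tokens[i + 1:i + 1 + window])
    return contexts


sample = "the primer binds upstream and the primer anneals at low temperature".split()
windows = context_windows(sample, "primer", window=2)
print(windows)
```

The collected windows can then serve as the "foundational data" that is passed on to definition generation.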
[0038] Through the analysis process, the server automatically generates definitions for each term. These generated definitions help understand the context in which and how technical terms are used. Based on these definitions, the server builds specialized terminology dictionaries for the entire organization, departments, and even individuals.
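The relationship between organization-wide, departmental, and individual dictionaries can be sketched as layered mappings. The layering rule used here (the most specific definition wins) is an assumption for illustration; the text states only that dictionaries are built for each scope. All entries are invented sample data.

```python
from collections import ChainMap

# Hypothetical sample entries for each scope.
organization = {"SLA": "Service level agreement between a provider and a customer."}
department = {
    "SLA": "Support response-time target agreed with a customer.",
    "P1": "Highest-priority incident.",
}
individual = {"P1": "Incident that pages me immediately."}

# Lookups check the individual layer first, then department, then organization.
personal_dictionary = ChainMap(individual, department, organization)

print(personal_dictionary["SLA"])
print(personal_dictionary["P1"])
```

Because `ChainMap` keeps references to the underlying dictionaries, a change to the organization-wide layer becomes visible through every personal view without copying.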
[0039] Users can access these dictionaries through their devices, and the dictionary contents are updated in real time on the server, allowing for immediate reflection of newly extracted terms and their correspondingly changed definitions. The updated dictionaries are translated into multiple languages and can be used by multinational organizations.
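A minimal sketch of how updates might be tracked so that terminals can tell when a dictionary has changed. The class `TermDictionary` and its version counter are hypothetical; the text does not specify how real-time updates are implemented.

```python
from dataclasses import dataclass, field


@dataclass
class TermDictionary:
    entries: dict[str, str] = field(default_factory=dict)
    version: int = 0

    def upsert(self, term: str, definition: str) -> bool:
        """Add or change a definition; bump the version only on a real change."""
        if self.entries.get(term) == definition:
            return False
        self.entries[term] = definition
        self.version += 1
        return True


glossary = TermDictionary()
glossary.upsert("CRISPR", "A gene-editing technique.")
glossary.upsert("CRISPR", "A gene-editing technique.")  # unchanged, version stays the same
glossary.upsert("CRISPR", "A programmable gene-editing technique.")
print(glossary.version)
```

A terminal could compare its cached version number with the server's and fetch the dictionary again only when the two differ.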
[0040] As a concrete example, consider the use of this system in an international research institution. The server collects data from online platforms used by researchers and extracts specialized DNA-related terminology. A dictionary accessible to users (researchers) on their terminals supports more effective communication by providing definitions and contexts for the generated scientific terms. This process is expected to deepen understanding among researchers with different backgrounds and improve team productivity.
[0041] The following describes the processing flow.
[0042] Step 1:
[0043] The server uses APIs and web scraping to retrieve information from social media and email servers. It also employs speech recognition technology to convert audio data from phone calls and meetings into text and collects the resulting text.
[0044] Step 2:
[0045] The server performs preprocessing on the collected data. This preprocessing involves removing unnecessary tags and special characters, and splitting words and phrases through tokenization, preparing the data for analysis.
[0046] Step 3:
[0047] The server extracts important terms and phrases from the pre-processed data using natural language processing techniques. Here, the TF-IDF algorithm and topic modeling are used to identify and select terms based on their frequency and importance.
[0048] Step 4:
[0049] The server analyzes the surrounding context of the extracted terms and performs contextual analysis to understand how they are being used. This allows for a deeper understanding of the context before and after the terms are used.
[0050] Step 5:
[0051] The server uses AI to automatically generate definitions of terms based on the results of contextual analysis. These definitions include specific meanings and relevant usage examples, providing details to aid understanding.
[0052] Step 6:
[0053] The server builds specialized terminology dictionaries tailored to each department and individual needs, based on automatically generated definitions. This allows users to access dictionaries optimized for their specific needs.
[0054] Step 7:
[0055] The server translates the specialized terminology dictionaries into multiple languages, establishing a system that supports intercultural communication. This translation process is automated, enabling the organization to meet its internationalization needs.
[0056] Step 8:
[0057] Users can access the latest dictionary information through their devices and deepen their understanding of terminology as needed. The server continuously updates the dictionary in real time based on user feedback and newly collected data.
[0058] (Example 1)
[0059] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0060] In recent years, there has been a growing need to efficiently collect large amounts of data from diverse sources, analyze it with high accuracy, and rapidly generate definitions of technical terms, providing them as dictionaries in multiple languages. However, due to data diversity and language differences, doing this in real time is difficult. In addition, there is a demand for customizable dictionaries that can meet user needs, and a system is needed to address these challenges.
[0061] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0062] In this invention, the server includes information gathering means, organizing means for organizing the collected information, and extraction means for extracting important terms from the organized information. This makes it possible to efficiently process data obtained from diverse information sources, generate definitions of specialized terms in real time, and provide a glossary that supports multiple languages.
[0063] "Information gathering means" refers to functions and technologies for obtaining data from various information sources. These means include, for example, systems that efficiently acquire text data and audio data using APIs or speech recognition technology.
[0064] "Organizing means" refers to functions that perform data cleansing and tokenization in order to convert collected raw data into an analyzable format. Through these means, unnecessary information is removed from the data, and the data is optimized for analysis.
[0065] "Extraction means" refers to a function that identifies and selects important terms and descriptions from organized data. This means utilizes natural language processing technology and further provides basic information for analysis.
[0066] "Analytical tools" refer to functions that analyze the context and usage of extracted terms to clarify their meaning. These tools deepen our understanding of how terms are used.
[0067] The term "definition generation means" refers to a function that automatically creates specific meanings and definitions of terms based on the results of analysis. This means uses an AI model to generate detailed definitions.
[0068] "Cataloging method" refers to the function of creating a dictionary that aggregates and systematically organizes specialized terminology based on the generated definitions. This catalog promotes a unified understanding throughout the organization.
[0069] "Translation means" refers to the function of converting the constructed glossary into multiple languages in order to make it usable by multinational users. This means incorporates machine translation technology.
[0070] "Update mechanism" refers to a function that updates terms and their definitions in real time based on newly acquired information. This mechanism ensures that users always receive the latest information.
[0071] "Personalization means" refers to a function that customizes the glossary according to the user's needs and requirements, optimizing it for specific conditions. This means provides users with a dictionary that is easier to use.
[0072] This invention is a server-centered system configuration that efficiently collects and analyzes large amounts of data obtained from various information sources, rapidly generates definitions of technical terms, and then constructs and provides them as a glossary. Specific embodiments are shown below.
[0073] Server Functions
[0074] The server uses information gathering methods to collect text data from sources such as SNS platforms and mail servers via APIs. It also uses speech recognition technology to transcribe audio data, for example, from audio conferencing systems. This process utilizes speech recognition engines such as Google® Speech-to-Text and IBM Watson®.
[0075] The server then uses processing tools to prepare the collected data through cleansing and tokenization. This removes unnecessary symbols and whitespace, and converts the data into a parseable format using natural language processing libraries (such as NLTK or spaCy).
[0076] Subsequently, the server uses extraction methods to extract important terms from the prepared data. In this step, useful information is extracted using TF-IDF and word embedding techniques (e.g., Word2Vec, BERT).
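To make the TF-IDF weighting concrete, here is a small pure-Python sketch. It uses the unsmoothed textbook form (term frequency multiplied by log(N / document frequency)); libraries often use smoothed variants, so scores will differ from theirs. The function name `tf_idf` and the sample documents are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(documents: list[list[str]]) -> list[dict[str, float]]:
    """Score each term in each tokenized document with unsmoothed TF-IDF."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    ["the", "plasmid", "vector", "the", "plasmid"],
    ["the", "meeting", "agenda"],
    ["the", "budget", "meeting"],
]
scores = tf_idf(docs)
top_term = max(scores[0], key=scores[0].get)
print(top_term)
```

A word that appears in every document, such as "the", receives a score of zero, while a term concentrated in one document rises to the top as a candidate for the dictionary.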
[0077] The server then uses analytical tools to analyze the context of the extracted terms. This clarifies the meaning of each term, and definitions of the terms are then generated using generative AI models (e.g., OpenAI® GPT, BERT).
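The hand-off from contextual analysis to a generative model can be sketched as prompt construction plus a pluggable model call. Everything named here is hypothetical: `build_definition_prompt`, `generate_definition`, and the stand-in `fake_model` are illustrative, and no particular vendor API is assumed. A real deployment would pass in a function that calls the chosen model.

```python
from typing import Callable


def build_definition_prompt(term: str, contexts: list[str]) -> str:
    """Assemble a prompt asking for a definition grounded in usage excerpts."""
    lines = [
        f"Define the term '{term}' as it is used in the excerpts below.",
        "Give a one-sentence definition and one usage example.",
        "",
        "Excerpts:",
    ]
    lines += [f"- {context}" for context in contexts]
    return "\n".join(lines)


def generate_definition(term: str, contexts: list[str],
                        generate: Callable[[str], str]) -> str:
    """Send the prompt to a caller-supplied model function and tidy the reply."""
    return generate(build_definition_prompt(term, contexts)).strip()


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return " (placeholder definition) "


definition = generate_definition("amplicon", ["the amplicon was sequenced"], fake_model)
print(definition)
```

Passing the model as a callable keeps the pipeline independent of any single provider and makes it easy to test without network access.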
[0078] User Usage Instructions
[0079] Users access a glossary built by the server using their terminals. Users can view the definitions and usage contexts of generated technical terms in real time. Furthermore, the server uses update mechanisms to constantly update the dictionary with the latest information, and multilingual support allows users to access materials in multiple languages.
[0080] Examples of specific cases and prompt statements
[0081] As a concrete example, consider the case where an international research institution uses this system. The server collects data, for example, gene analysis data, from online platforms used by researchers, and extracts specialized DNA-related terminology from it.
[0082] Example of a prompt:
[0083] "Collect and generate definitions for DNA-related terminology used by international research teams on online platforms. Provide examples of how this generated dictionary facilitates communication among researchers with diverse backgrounds."
[0084] This allows users to communicate effectively with researchers from diverse backgrounds without encountering language barriers.
[0085] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0086] Step 1:
[0087] The server acquires data from various sources using information gathering methods. These sources include social networking services (SNS), email servers, and online meeting systems. Specifically, it collects text data using APIs and converts audio data into text using a speech recognition engine. The input is audio or text data, and the output is transcribed text data.
[0088] Step 2:
[0089] The server preprocesses the collected data using pre-processing tools. Specifically, it cleanses the data, removing unnecessary symbols and spaces. It also tokenizes the data using a natural language processing library, preparing it for parsing. The input is the transcribed text data from Step 1, and the output is pre-processed, parsing-ready data.
[0090] Step 3:
[0091] The server extracts important terms from the prepared data using extraction methods. Here, TF-IDF and word embedding techniques are used to identify weighted terms within the document. The input is the prepared data, and the output is a list of important terms.
[0092] Step 4:
[0093] The server analyzes the context of extracted terms using analytical tools. This includes contextual analysis to understand how the terms are used in a sentence. The input is a list of important terms, and the output is contextual information for each term.
[0094] Step 5:
[0095] The server generates term definitions based on the results of contextual analysis using a definition generation mechanism. It utilizes a generative AI model to create specific term definitions. The input is contextual information, and the output is a list of term definitions.
[0096] Step 6:
[0097] The server uses a cataloging mechanism to build a glossary based on the generated definitions. The glossary is created to promote a consistent understanding of terminology within the organization. The input is a list of term definitions, and the output is a glossary usable throughout the organization.
[0098] Step 7:
[0099] The server uses translation tools to translate the constructed glossary into multiple languages. This makes the glossary available in an international environment. The input is a unified glossary, and the output is a multilingual glossary.
[0100] Step 8:
[0101] Users can access the glossary provided by the server using their terminals, customize the dictionary in real time, and check for updates. The update mechanism enables real-time information updates, and the personalization mechanism allows users to build a dictionary tailored to their specific needs. Input consists of user search queries and customization requests, while output consists of the latest term definitions and contextual information.
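Because each step above names its input and output, the flow can be pictured as a chain of stages in which every output feeds the next input. The sketch below shows only that chaining idea; the stage functions are deliberately trivial placeholders (for instance, treating all-uppercase tokens as candidate terms is purely an assumption for illustration), and all names are hypothetical.

```python
from functools import reduce
from typing import Any, Callable

Stage = Callable[[Any], Any]


def run_pipeline(data: Any, stages: list[Stage]) -> Any:
    """Feed the output of each stage into the next one."""
    return reduce(lambda value, stage: stage(value), stages, data)


def tokenize(text: str) -> list[str]:
    return text.split()


def extract_terms(tokens: list[str]) -> list[str]:
    # Placeholder rule: all-uppercase tokens longer than one character.
    return sorted({t for t in tokens if t.isupper() and len(t) > 1})


def define_terms(terms: list[str]) -> dict[str, str]:
    # Placeholder for the definition generation step.
    return {term: f"(definition of {term} pending)" for term in terms}


result = run_pipeline(
    "the PCR run failed QC after PCR setup",
    [tokenize, extract_terms, define_terms],
)
print(result)
```

Keeping each stage as an independent function makes it straightforward to swap in a real extractor, generator, or translator later without changing the surrounding flow.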
[0102] (Application Example 1)
[0103] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0104] In autonomous vehicles, there is a need to provide diverse technical information in a real-time, easily understandable format. Conventional systems have struggled to effectively analyze the vast amounts of data collected by sensors and communications and present it in a way that is easy for passengers to understand. As a result, important information for drivers and passengers is not available in a timely manner, leading to challenges in fully realizing the safety and convenience of the vehicle.
[0105] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0106] In this invention, the server includes a data acquisition device, a preprocessing device for preprocessing the collected data, and an extraction device for extracting important descriptions from the preprocessed data. This makes it possible to efficiently analyze information obtained from sensors and present it to drivers and passengers in an easy-to-understand manner.
[0107] A "data acquisition device" is a hardware or software configuration for automatically acquiring necessary data from an information source.
[0108] A "data preprocessing device" is hardware or software that performs data cleaning and formatting to prepare acquired data into a format suitable for analysis.
[0109] An "extraction device" is an algorithm or program used to identify and extract important information or descriptions from pre-processed data.
[0110] A "contextual analysis device" is a device that analyzes the situation and background in which extracted information is used, in order to deepen the understanding of that information.
[0111] A "definition generator" is a system that automatically defines the meaning and usage of extracted information and terms based on the results of contextual analysis.
[0112] A "dictionary building device" is a device that creates a dictionary by aggregating specialized terminology and related information based on the generated definitions.
[0113] A "translation device" is a system or function that converts a constructed dictionary into multiple languages in real time.
[0114] An "information presentation device" is a device or system that provides analyzed and processed data to the user visually or audibly.
[0115] An "update device" is hardware or software that updates related systems in real time whenever new information or definitions are added, providing users with the latest information.
[0116] A "customization device" is a device that has the function of adjusting the dictionary of specialized terms and the method of presenting information according to the user's needs and requests.
[0117] The system implementing this invention is comprised of multiple hardware and software components. The server acquires data from various sources through a data acquisition device, and the preprocessor cleans the collected data and converts it into an analyzable format. Specifically, APIs are used, and a speech recognition engine is employed to convert speech data into text data.
[0118] Next, the server uses an extraction device to identify important descriptions from the pre-processed data and processes the newly obtained information in real time. A context analysis device further analyzes the context in which these descriptions are used, and based on the results, a definition generator creates definitions for technical terms.
[0119] The generated definitions are aggregated by a dictionary building device and translated into multiple languages by a translation device that enables multilingual support. In particular, in the context of autonomous vehicles, the analysis results are presented to drivers and passengers in an easily understandable format using in-vehicle information display devices. These information display devices provide information via displays and audio systems.
[0120] The update device updates dictionaries and definitions whenever the system receives new data, ensuring that it always provides the latest information. Furthermore, the customization device allows users to adjust the type and format of information displayed according to their own needs.
[0121] For example, if the system acquires data from weather sensors and detects that it is raining on a highway, it can present passengers with important information such as "slippery road surface." By accurately conveying necessary information in real time in this way, it is possible to enhance the safety and comfort of autonomous vehicles.
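The rain-on-highway example can be sketched as a simple rule that maps a sensor reading to a message in the passenger's language. The `WeatherReading` fields, the rain threshold, and the message table are assumptions for illustration; the text does not specify how the detection or the wording is produced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WeatherReading:
    rain_mm_per_hour: float
    road_type: str


# Hypothetical message table keyed by language code.
MESSAGES = {
    "slippery": {"en": "Slippery road surface", "ja": "路面が滑りやすくなっています"},
}


def advisory(reading: WeatherReading, language: str = "en",
             rain_threshold: float = 0.1) -> Optional[str]:
    """Return a slippery-road message when rain is detected on a highway."""
    if reading.road_type == "highway" and reading.rain_mm_per_hour >= rain_threshold:
        texts = MESSAGES["slippery"]
        return texts.get(language, texts["en"])
    return None


print(advisory(WeatherReading(rain_mm_per_hour=2.0, road_type="highway")))
```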
[0122] An example of a prompt is, "Display important driving information in real time based on data acquired from sensors." This prompt is used as an instruction to provide appropriate information using a generative AI model.
[0123] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0124] Step 1:
[0125] The server acquires data from various sources using a data acquisition device. Inputs include sensor data and social network data. It collects the necessary information from this data and produces a raw dataset as output. Specifically, it retrieves data by calling APIs.
[0126] Step 2:
[0127] The server formats the acquired raw data using a preprocessor. The input is the raw data collected in step 1. Whitespace and symbols are removed, and the formatted data is output. Specifically, it performs routine data cleaning tasks.
[0128] Step 3:
[0129] The server converts pre-processed data into meaningful descriptions using an extraction device. The input is pre-processed data. It analyzes the information using natural language processing and outputs it as meaningful tokens. Specifically, it performs text analysis and keyword extraction.
[0130] Step 4:
[0131] The server analyzes the context in which the extracted description is used, employing a context analysis device. The input is the token from step 3. It understands the context of the description and outputs that information. Specifically, it applies a context analysis algorithm.
[0132] Step 5:
[0133] The server uses a definition generator to create definitions of technical terms from the analyzed information. The input is the result of contextual analysis. The generated definitions of technical terms are output. Specifically, it executes an AI-based definition generation process.
[0134] Step 6:
[0135] The server aggregates definitions generated using a dictionary building device and translates them into a multilingual dictionary. The input consists of definitions of technical terms. The dictionary-formatted data is output in multiple languages. Specifically, it calls existing translation APIs.
[0136] Step 7:
[0137] The server uses an information display device to show information to passengers and drivers via the in-vehicle information display system. The input is dictionary data translated into multiple languages. Information is conveyed to passengers and drivers visually or audibly. Specifically, the server presents information in a format suitable for the vehicle's display.
[0138] Step 8:
[0139] The server uses an update device to update the dictionary and information definitions whenever new data is obtained. Inputs include feedback from sensors and new data. The latest information is output and provided to the user. Specifically, it performs periodic data collection and dictionary reconstruction.
[0140] Step 9:
[0141] The user adjusts the type and format of information displayed using a customization device. The input is the user's settings. Customized information is output according to the settings. Specifically, it performs system setting updates and information adjustments.
[0142] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0143] This invention is a system that integrates data collection, analysis, dictionary building, and user sentiment analysis, with each element working in conjunction to achieve effective knowledge management. First, the server collects information from various sources using APIs and speech recognition. Text data obtained from SNS, email, and call records is preprocessed to a format suitable for analysis.
[0144] Next, the server extracts key terms and utilizes natural language processing techniques to understand their context. This contextual information forms the basis for explaining the meaning of the terms. An AI model automatically generates definitions for the terms and builds a specialized glossary based on them. The dictionary is updated in real time and supports multiple languages using translation tools. Throughout this process, the system remains easily accessible to users via their devices.
[0145] Furthermore, the integration of an emotion engine makes it possible to analyze emotion data from user interactions. The server optimizes the dictionary content based on the user's emotions. For example, if a user expresses a negative emotion indicating a lack of understanding of a term's definition, the server responds by adding examples of related terms or deepening the explanation. Emotion data can also be used to evaluate learning progress. This allows for the provision of customized learning modules to users, meeting their individual learning needs.
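As an illustration of the behavior described above, the following minimal sketch adds supplementary examples to a dictionary entry when the estimated emotion is negative. The `Entry` structure, the `"negative"` label, and the `present` function are assumptions made for this sketch; the disclosure does not specify how emotions are encoded.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: str
    definition: str
    examples: list = field(default_factory=list)


def present(entry, emotion, extra_examples):
    """Build the text shown to the user; add extra examples on negative emotion."""
    lines = [f"{entry.term}: {entry.definition}"]
    shown = list(entry.examples)
    if emotion == "negative":  # assumed label produced by the emotion engine
        shown += [e for e in extra_examples if e not in shown]
    lines += [f"  e.g. {e}" for e in shown]
    return "\n".join(lines)
```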
[0146] As a concrete example, consider its use in educational institutions. Users, i.e., students, access a specialized terminology dictionary using a terminal. The server analyzes the students' sentiments and provides supplementary information for items they do not fully understand. This improves students' learning effectiveness and creates an environment where they can acquire knowledge efficiently. The system also facilitates smooth communication across language barriers in multinational classes and global companies.
[0147] The following describes the processing flow.
[0148] Step 1:
[0149] The server collects data from social networking services and email servers via APIs, and transcribes voice calls and meeting recordings using speech recognition technology. The collected data is converted to text format and stored for further analysis.
[0150] Step 2:
[0151] The server performs preprocessing on the collected data. Preprocessing removes unnecessary symbols and HTML tags, standardizes spacing, and tokenizes the data, preparing it for analysis in a clean state.
[0152] Step 3:
[0153] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. It applies algorithms such as TF-IDF and topic modeling to identify highly specific terms.
[0154] Step 4:
[0155] The server analyzes the context containing the extracted term. Using the context window, it grasps the meaning and nuance of the term from the surrounding words and phrases, thereby adding context to the description.
[0156] Step 5:
[0157] The server uses an AI model to generate definitions of terms based on the results of contextual analysis. The generated definitions are organized in a way that allows users to intuitively understand them, including specific examples and usage examples.
[0158] Step 6:
[0159] The system uses an emotion engine to collect emotional data from user interactions. The server analyzes the user's emotional state and determines whether definitions and explanations of terms should be improved.
[0160] Step 7:
[0161] The server optimizes the contents of the technical term dictionary based on user sentiment data, adding relevant information as needed. For example, it might improve the system by providing clearer examples for terms that are difficult to understand.
[0162] Step 8:
[0163] Users access a well-organized dictionary through their device to obtain the necessary information. The server updates this dictionary in real time and applies multilingual translations to support users' smooth access to information.
[0164] Step 9:
[0165] The server evaluates the user's learning progress through an emotion engine and provides customized learning modules tailored to their individual learning pace. This allows users to learn efficiently at their own pace.
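Step 3 of the flow above names TF-IDF as one way to identify highly specific terms. The following is a minimal sketch of a plain TF-IDF score over tokenized documents, written without external libraries. The function names and the unsmoothed `log(N / df)` weighting are choices made for this sketch, not details taken from the disclosure.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        scores.append({t: (c / total) * math.log(n / df[t]) for t, c in tf.items()})
    return scores


def top_terms(docs, k=3):
    """Return the k highest-scoring terms per document (ties broken alphabetically)."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in tf_idf(docs)]
```

A term that appears in every document receives a score of zero, which is why common words drop out and organization-specific terms rise to the top.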
[0166] (Example 2)
[0167] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0168] In today's information society, there is a need to efficiently retrieve necessary information from vast amounts of data and understand the definitions and contexts of specific terms. Furthermore, facilitating smooth communication between different languages and providing optimal information tailored to the user's learning progress are also challenges. To address these issues, a system that operates in real time and responds to the individual needs of users is essential.
[0169] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0170] In this invention, the server includes acquisition means for acquiring data, formatting means, and extraction means for extracting important terms. This makes it possible to efficiently collect necessary information from vast amounts of data and support the definition of appropriate terms and their contextual understanding for the user. Furthermore, through multilingual conversion and sentiment analysis, it is possible to provide a learning environment optimized for individual users and facilitate communication.
[0171] "Acquisition means" refers to the functions and processes used to gather necessary information from information sources.
[0172] "Formatting means" refers to the processes and devices used to convert acquired data into a format that is easy to analyze.
[0173] "Extraction means" refers to the process or function used to select important terms and keywords from formatted data.
[0174] "Context analysis means" refers to the processes and techniques for investigating the situations and backgrounds in which extracted terms are used.
[0175] "Definition creation means" refers to the functions or processes for clarifying the meaning of terms based on the analyzed context.
[0176] "Dictionary construction means" refers to the processes and techniques for systematically collecting created definitions to create a glossary.
[0177] "Conversion means" refers to the functions and technologies for translating a constructed dictionary into different languages.
[0178] "Emotion optimization means" refers to the processes or devices that optimize the presentation of information based on the user's emotional data.
[0179] "Evaluation provision means" refers to a function that evaluates the user's learning status and progress and provides feedback based on that evaluation.
[0180] "Update means" refers to the processes and techniques for adding new information to the dictionary and correcting the definitions of terms that have changed.
[0181] "Personalization means" refers to the functions and technologies that adjust and customize information according to the specific needs and requests of the user.
[0182] This invention is a system that integrates data collection, formatting, and extraction of information necessary for understanding. In this system, a server plays a central role in acquiring data from various sources. APIs and speech recognition technologies can be used for data acquisition. Specifically, APIs for acquiring social media data and speech recognition services for converting speech to text can be used.
[0183] The server formats the collected data and converts it into a parseable format. This includes cleaning and standardizing the format of text data. It then extracts key terms and analyzes the context in which those terms are used.
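The cleaning and standardizing described above might look like the following minimal sketch, which strips HTML tags and symbols, normalizes whitespace and letter case, and splits the result into tokens. The regular expressions and the function names `clean_text` and `tokenize` are assumptions made for illustration; whitespace tokenization in particular would not suit languages written without spaces, such as Japanese.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")        # HTML tags
SYMBOL_RE = re.compile(r"[^\w\s'-]")   # anything except word chars, spaces, ' and -
SPACE_RE = re.compile(r"\s+")          # runs of whitespace


def clean_text(raw):
    """Remove tags and symbols, collapse whitespace, and lowercase the text."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)         # e.g. "&amp;" becomes "&"
    text = SYMBOL_RE.sub(" ", text)
    text = SPACE_RE.sub(" ", text).strip()
    return text.lower()


def tokenize(raw):
    """Split cleaned text on whitespace."""
    return clean_text(raw).split()
```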
[0184] Based on the results of contextual analysis, a generative AI model is used to create definitions for terms. This builds a specialized glossary of terms. The generated dictionary becomes available in multiple languages through a translation mechanism. This provides an environment that is easily accessible to users who speak different languages.
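As a rough sketch of how contextual information could be passed to a generative AI model, the code below builds a definition prompt from a term and the passages in which it was found. The prompt wording, the function names, and the `model` callable (any function from a prompt string to a response string) are hypothetical; no particular AI service or client library is assumed.

```python
def build_definition_prompt(term, contexts, max_contexts=3):
    """Compose a prompt asking for a definition grounded in the given passages."""
    lines = [
        f'Define the term "{term}" as it is used in the passages below.',
        "Give a one-sentence definition and one usage example.",
        "",
    ]
    for i, ctx in enumerate(contexts[:max_contexts], start=1):
        lines.append(f"Passage {i}: {ctx}")
    return "\n".join(lines)


def generate_definitions(contexts_by_term, model):
    """model: callable str -> str that stands in for a generative AI client."""
    return {
        term: model(build_definition_prompt(term, contexts))
        for term, contexts in contexts_by_term.items()
    }
```

Passing the model as a callable keeps the sketch independent of any vendor and makes it easy to test with a stub.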
[0185] Furthermore, the emotion optimization means can analyze user emotion data and optimize the presentation of dictionaries and information. For example, if a user is confused by a particular term, the server can help the user understand it by providing detailed explanations and additional examples.
[0186] As a concrete example, a user who accesses the system to learn a specific technical term enters a prompt. For instance, when given text such as "The server explains how to collect data using an API," the system generates definitions and related information. This allows the user to efficiently acquire the necessary knowledge and apply it in practice.
[0187] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0188] Step 1:
[0189] The server retrieves data from various sources using APIs and speech recognition systems. Inputs include social media posts, emails, and call logs. The server collects this data and prepares it for subsequent formatting procedures. The output is a collection of raw data.
[0190] Step 2:
[0191] The server uses formatting tools to convert the raw data into a parseable format. The input is the raw data collected in step 1. At this stage, unnecessary characters and HTML tags are removed, letter case is normalized, and extra whitespace is removed to generate clean text data. The output is the formatted text data.
[0192] Step 3:
[0193] The server extracts key terms from formatted data. The input is formatted text data. Using natural language processing techniques, part-of-speech tagging is performed to extract key terms such as nouns and verbs from the text. The output is a list of the extracted key terms.
[0194] Step 4:
[0195] The server performs contextual analysis to analyze the context in which terms are used. The input is the list of key terms obtained in step 3, as well as the formatted text data. The server analyzes the context surrounding each term and generates foundational data to understand its meaning. The output is contextual information for the terms.
[0196] Step 5:
[0197] The server uses a generative AI model to create term definitions based on contextual information. The input is the contextual information obtained in step 4. The AI model is given prompt sentences, and as a result, the meanings of the terms are automatically generated. The output is a list of defined terms.
[0198] Step 6:
[0199] The server builds a term dictionary from the created definitions and translates it using a multilingual translation mechanism. The input is the definitions of terms generated in step 5. The definitions are added to the dictionary and translated into multiple languages by the translation mechanism. The output is an updated multilingual term dictionary.
[0200] Step 7:
[0201] The server analyzes the user's emotional data using the emotion optimization means and evaluates learning progress. Inputs include the user's operation history and feedback. Specifically, repeated searches for the same term within a certain time frame are treated as emotional data indicating that the user is having difficulty with that term. The output is a learning plan optimized for the user's emotional state.
[0202] Step 8:
[0203] Users access optimized glossaries and learning modules through their devices. The server uses the personalization means to provide information tailored to the user's needs. The input consists of user queries and prompts. The server provides additional explanations and examples based on the user's context to support efficient learning. The output is the learning content best suited to the user.
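Step 7 above uses repeated searches for the same term within a certain time frame as a signal. A minimal sketch of that check is shown below, using a sliding window per term. The parameters `window_seconds` and `min_repeats` and their default values are hypothetical; the disclosure does not state concrete thresholds.

```python
from collections import defaultdict, deque


def confusing_terms(search_log, window_seconds=600, min_repeats=3):
    """search_log: iterable of (timestamp_seconds, term), sorted by time.

    Returns the set of terms searched at least min_repeats times
    within any span of window_seconds.
    """
    recent = defaultdict(deque)  # term -> timestamps still inside the window
    flagged = set()
    for ts, term in search_log:
        q = recent[term]
        q.append(ts)
        # Drop searches that fell out of the window ending at ts.
        while q and ts - q[0] > window_seconds:
            q.popleft()
        if len(q) >= min_repeats:
            flagged.add(term)
    return flagged
```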
[0204] (Application Example 2)
[0205] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0206] In modern online shopping, users often fall into misunderstandings or make inaccurate decisions because they do not fully understand complex financial terminology. In particular, the sheer volume and specialized nature of financial information make it difficult for users to feel confident in their decision-making. This invention aims to provide a system that assists users in understanding financial terminology and thereby improves their purchasing experience.
[0207] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0208] In this invention, the server includes data collection means, preprocessing means for preprocessing collected information, extraction means for extracting important terms from the preprocessed information, context analysis means for analyzing the context in which the extracted terms are used, explanation generation means for generating explanations of terms based on the results of the context analysis, dictionary construction means for constructing a specialized terminology dictionary based on the generated explanations, translation means for translating the constructed dictionary into multiple languages, and optimization means for analyzing user responses and optimizing explanations of purchase-related terms. This enables users to deepen their understanding of financial terminology in real time and make rational and confident decisions.
[0209] A "data collection means" is an element that has the function of collecting necessary data from multiple information sources.
[0210] A "preprocessing means" is an element that has the function of performing processing to convert the collected data into a format that is easy to analyze.
[0211] An "extraction means" is an element that has the function of selecting important terms from pre-processed data.
[0212] A "context analysis means" is an element that has the function of performing analysis to understand the context in which the extracted terms are used.
[0213] An "explanation generation means" is an element that has the function of generating definitions and explanations of terms based on the results of the context analysis.
[0214] A "dictionary construction means" is an element that has the function of accumulating the definitions of generated terms to form a specialized terminology dictionary.
[0215] A "translation means" is an element that has the function of converting a constructed specialized terminology dictionary into different languages.
[0216] An "optimization means" is an element that has the function of adjusting the explanation of terms based on user feedback so that, when necessary, the explanation becomes as easy to understand as possible.
[0217] This invention is implemented as a system including data collection means, preprocessing means, extraction means, context analysis means, explanation generation means, dictionary construction means, translation means, and optimization means. The server uses APIs to collect data from multiple information sources and preprocesses it. The collected data is converted to text, and important terms are extracted using natural language processing techniques.
[0218] This system is implemented in programming languages such as Python and JavaScript (registered trademark) and utilizes cloud services such as Google Cloud and AWS (registered trademark) for data processing. For natural language processing, it can use the Google Cloud Natural Language API or IBM Watson NLU. These capabilities enable the context analysis means to understand the situation in which each term is used.
[0219] The server generates definitions of terms based on the results of the context analysis. These definitions are stored as a specialized terminology dictionary. To support multiple languages, the dictionary is translated into several languages using translation tools such as the Google Translate API.
[0220] When a user accesses the system through their device, the optimization means adjusts the explanations of purchase-related terms based on the user's reactions and sentiment data. This allows users to effectively acquire the knowledge necessary for their financial transactions.
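The storage and translation of definitions described above can be sketched as follows. To avoid assuming the interface of any real translation service, the translator is passed in as a plain callable `translate(text, lang)`; that callable, the `TermEntry` structure, and the use of language codes as keys are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    term: str
    definition: str  # definition in the source language
    translations: dict = field(default_factory=dict)  # language code -> text


def build_dictionary(definitions, languages, translate):
    """definitions: {term: definition}; translate: callable (text, lang) -> str."""
    dictionary = {}
    for term, definition in definitions.items():
        entry = TermEntry(term, definition)
        for lang in languages:
            entry.translations[lang] = translate(definition, lang)
        dictionary[term] = entry
    return dictionary
```

A real deployment would wrap a translation API behind the `translate` callable; a stub is enough to exercise the structure.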
[0221] For example, if a user expresses concern about the term "credit score," the server will directly provide detailed explanations and related information on the page to enhance the user's understanding. The AI model can also be instructed to generate information using the following prompt: "Please briefly explain the financial term 'XX' related to the product you are currently trying to purchase."
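The prompt quoted above contains a placeholder, 'XX', for the term. A minimal sketch of filling that placeholder is shown below; the names `PROMPT_TEMPLATE` and `build_term_prompt` and the empty-term check are assumptions added for illustration.

```python
# Template text taken from the example prompt, with XX replaced by a named field.
PROMPT_TEMPLATE = (
    "Please briefly explain the financial term '{term}' related to the "
    "product you are currently trying to purchase."
)


def build_term_prompt(term):
    """Insert the term into the template, rejecting empty input."""
    term = term.strip()
    if not term:
        raise ValueError("term must not be empty")
    return PROMPT_TEMPLATE.format(term=term)
```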
[0222] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0223] Step 1:
[0224] The server uses APIs to collect data such as user purchase history and search history from multiple sources. The input is raw data obtained from the APIs, and the output is data prepared for preprocessing. In this step, the server accesses each source and retrieves the data in the required format.
[0225] Step 2:
[0226] The server performs preprocessing to convert the collected data into a parseable format. The input is the raw data collected in step 1, and the output is data in text format. At this stage, the server filters out unnecessary information and formats the data by converting it to text format.
[0227] Step 3:
[0228] The server extracts important terms from pre-processed data using natural language processing techniques. The input is formatted text data, and the output is a list of extracted important terms. In this step, the server identifies and lists relevant keywords and phrases.
[0229] Step 4:
[0230] The server performs contextual analysis to analyze the context in which extracted terms are used. The input is a list of terms, and the output is contextual information. For each term, the server evaluates its usage and adds relevant data to create detailed contextual information.
[0231] Step 5:
[0232] The server generates definitions of terms based on the analyzed contextual information. The input is contextual information, and the output is the term and its definition. At this stage, the server utilizes a generative AI model to automatically generate accurate definitions of terms.
[0233] Step 6:
[0234] The server collects the generated term definitions to build a multilingual specialized terminology dictionary. Input is terms and their definitions, and output is updated dictionary data. The dictionary is automatically updated, and the server uses a translation API to translate the information into multiple languages.
[0235] Step 7:
[0236] When a user accesses the system using a terminal, the server optimizes the terminology explanations based on the user's actions. The input is user interaction data from the terminal, and the output is the optimized terminology explanation. The server analyzes the user's reactions, improves the explanations as needed, and presents them in an easily understandable format.
[0237] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0238] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.
[0239] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0240] [Second Embodiment]
[0241] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0242] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0243] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0244] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0245] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0246] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0247] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0248] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0249] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0250] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0251] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0252] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0253] This invention is server-centric and begins with the server collecting data. The server uses APIs to acquire text data from SNS and email servers and uses a speech recognition engine to convert audio data from calls and meetings into text. The collected data is then preprocessed on the server. Unnecessary symbols and spaces are removed, tokenization is performed, and the data is prepared in a format suitable for analysis.
[0254] Next, the server extracts key terms from the pre-processed data using natural language processing techniques. The terms extracted in this step are then analyzed to determine how they are used in each context. The information obtained through contextual analysis becomes the foundational data used by AI to clarify the meaning of these terms.
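One simple way to look at how a term is used in context is to collect the words around each occurrence, an approach referred to elsewhere in this description as a context window. The sketch below is illustrative only; the function name `context_windows` and the default window `size` are assumptions.

```python
def context_windows(tokens, term, size=3):
    """Return (left, right) token lists around every occurrence of term.

    Each side holds at most `size` tokens and is shorter near the text edges.
    """
    windows = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - size):i]
            right = tokens[i + 1:i + 1 + size]
            windows.append((left, right))
    return windows
```

The collected windows could then serve as the passages handed to the AI model when it generates a definition.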
[0255] Through the analysis process, the server automatically generates definitions for each term. These generated definitions help understand the context in which and how technical terms are used. Based on these definitions, the server builds specialized terminology dictionaries for the entire organization, departments, and even individuals.
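Dictionaries for the whole organization, for departments, and for individuals can be combined at lookup time. The following minimal sketch uses Python's `collections.ChainMap` so that an individual entry takes precedence over a departmental entry, which in turn takes precedence over the organization-wide entry. This precedence order and the function name `dictionary_for` are assumptions; the disclosure does not specify how the three levels are reconciled.

```python
from collections import ChainMap


def dictionary_for(user, department, personal, departmental, organization):
    """Return a layered view: personal > departmental > organization.

    personal and departmental map a key (user or department) to {term: definition};
    organization maps term -> definition directly.
    """
    return ChainMap(
        personal.get(user, {}),
        departmental.get(department, {}),
        organization,
    )
```

For example, "PR" might resolve to "pull request" for an engineering user while remaining "public relations" for everyone else.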
[0256] Users can access these dictionaries through their devices, and the dictionary contents are updated in real time on the server, allowing for immediate reflection of newly extracted terms and their correspondingly changed definitions. The updated dictionaries are translated into multiple languages and can be used by multinational organizations.
[0257] As a concrete example, consider the use of this system in an international research institution. The server collects data from online platforms used by researchers and extracts specialized DNA-related terminology. A dictionary accessible to users (researchers) on their terminals supports more effective communication by providing definitions and contexts for the generated scientific terms. This process is expected to deepen understanding among researchers with different backgrounds and improve team productivity.
[0258] The following describes the processing flow.
[0259] Step 1:
[0260] The server uses APIs and web scraping to retrieve information from social media and email servers. It also uses speech recognition technology to convert audio data from phone calls and meetings into text and collects the resulting text.
[0261] Step 2:
[0262] The server performs preprocessing on the collected data. This preprocessing involves removing unnecessary tags and special characters, and splitting words and phrases through tokenization, preparing the data for analysis.
[0263] Step 3:
[0264] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. Here, the TF-IDF algorithm and topic modeling are used to select terms based on their frequency and importance.
[0265] Step 4:
[0266] The server analyzes the surrounding context of the extracted terms and performs contextual analysis to understand how they are being used. This allows for a deeper understanding of the context before and after the terms are used.
[0267] Step 5:
[0268] The server uses AI to automatically generate definitions of terms based on the results of contextual analysis. These definitions include specific meanings and relevant usage examples, providing details to aid understanding.
[0269] Step 6:
[0270] The server builds specialized terminology dictionaries tailored to each department and individual needs, based on automatically generated definitions. This allows users to access dictionaries optimized for their specific needs.
[0271] Step 7:
[0272] The server translates the specialized terminology dictionaries into multiple languages to support intercultural communication. This translation process is automated, enabling the organization to meet its internationalization needs.
[0273] Step 8:
[0274] Users can access the latest dictionary information through their devices and deepen their understanding of terminology as needed. The server continuously updates the dictionary in real time based on user feedback and newly collected data.
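The continuous update in Step 8 above can be sketched as a merge that records which entries actually changed, so that only those entries need to be re-translated or pushed to terminals. The entry layout (`"definition"`, `"version"`) and the function name `apply_updates` are hypothetical.

```python
def apply_updates(dictionary, new_definitions):
    """Merge {term: definition} into dictionary in place.

    dictionary maps term -> {"definition": str, "version": int}.
    Returns the list of terms that were added or whose definition changed.
    """
    changed = []
    for term, definition in new_definitions.items():
        current = dictionary.get(term)
        if current is None:
            dictionary[term] = {"definition": definition, "version": 1}
            changed.append(term)
        elif current["definition"] != definition:
            current["definition"] = definition
            current["version"] += 1
            changed.append(term)
    return changed
```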
[0275] (Example 1)
[0276] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0277] In recent years, there has been a growing need to efficiently collect large amounts of data from diverse sources, analyze it with high accuracy, and rapidly generate definitions of technical terms, providing them as dictionaries in multiple languages. However, due to data diversity and language differences, doing this in real time is difficult. In addition, there is a demand for customizable dictionaries that can meet user needs, and a system is needed to address these challenges.
[0278] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0279] In this invention, the server includes information gathering means, preparation means for preparing the collected information, and extraction means for extracting important terms from the prepared information. This makes it possible to efficiently process data obtained from diverse information sources, generate definitions of specialized terms in real time, and provide a glossary that supports multiple languages.
[0280] "Information gathering means" refers to functions and technologies for obtaining data from various information sources. These means include, for example, systems that efficiently acquire text data and audio data using APIs or speech recognition technology.
[0281] The "preparation means" refers to the function of performing data cleansing and tokenization in order to convert the collected raw data into an analyzable format. This means removes unnecessary information from the data and optimizes it for analysis.
[0282] The "extraction means" refers to the function of identifying and selecting important terms and descriptions from the prepared data. This means utilizes natural language processing technology and provides basic information for further analysis.
[0283] The "analysis means" refers to the function of analyzing the context and usage situation of the extracted terms and clarifying their meaning. This means deepens the understanding of how the terms are used.
[0284] The "definition generation means" refers to the function of automatically creating specific meanings and definitions of terms based on the results of analysis. This means uses an AI model to generate detailed definitions.
[0285] The "directory construction means" refers to the function of creating a dictionary that aggregates and organizes technical terms based on the generated definitions. This directory promotes unified understanding across the organization.
[0286] The "translation means" refers to the function of translating the constructed term directory into multiple languages so that users in different countries can use it. This means incorporates machine translation technology.
[0287] The "update means" refers to the function of updating terms and their definitions in real time based on newly acquired information. This means always provides users with the latest information.
[0288] The "personalization means" refers to the function of customizing the term directory according to the needs and requirements of users and optimizing it for specific conditions. This means provides users with a more user-friendly dictionary.
[0289] This invention is a server-centered system configuration that efficiently collects and analyzes large amounts of data obtained from various information sources, rapidly generates definitions of technical terms, and then constructs and provides them as a glossary. Specific embodiments are shown below.
[0290] Server Functions
[0291] The server uses the information gathering means to collect text data from sources such as SNS platforms and mail servers via APIs. It also uses speech recognition technology to transcribe audio data, for example, from audio conferencing systems. This process utilizes speech recognition engines such as Google Speech-to-Text and IBM Watson.
[0292] The server then uses the preparation means to prepare the collected data through cleansing and tokenization. This removes unnecessary symbols and whitespace, and converts the data into an analyzable format using natural language processing libraries (such as NLTK or spaCy).
[0293] Subsequently, the server uses the extraction means to extract important terms from the prepared data. In this step, important terms are identified using TF-IDF and word embedding techniques (e.g., Word2Vec, BERT).
[0294] The server then uses the analysis means to analyze the context of the extracted terms and clarify the meaning of each term. Based on the results, the definition generation means generates definitions for the terms using generative AI models (e.g., OpenAI GPT, BERT).
[0295] How Users Use the System
[0296] Users access the glossary built by the server using their terminals. Users can view the generated definitions of technical terms and their usage contexts in real time. Furthermore, the server uses the update means to keep the dictionary up to date, and multilingual support allows users to access materials in multiple languages.
[0297] Examples of specific cases and prompt statements
[0298] As a concrete example, consider the case where an international research institution uses this system. The server collects data, for example, gene analysis data, from online platforms used by researchers, and extracts specialized DNA-related terminology from it.
[0299] Example of a prompt:
[0300] "Collect and generate definitions for DNA-related terminology used by international research teams on online platforms. Provide examples of how this generated dictionary facilitates communication among researchers with diverse backgrounds."
[0301] This allows users to communicate effectively with researchers from diverse backgrounds without encountering language barriers.
[0302] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0303] Step 1:
[0304] The server acquires data from various sources using the information gathering means. These sources include social networking services (SNS), email servers, and online meeting systems. Specifically, it collects text data using APIs and converts audio data into text using a speech recognition engine. The input is audio or text data, and the output is text data, including transcriptions of the audio.
[0305] Step 2:
[0306] The server uses the preparation means to preprocess the collected data. Specifically, it performs data cleaning to remove unnecessary symbols and blanks. It also tokenizes the data using a natural language processing library to prepare it in an analyzable format. The input is raw text data, and the output is preprocessed, analysis-ready data.
[0307] Step 3:
[0308] The server extracts important terms from the preprocessed data using the extraction means. Here, TF-IDF and word embedding techniques are used to weight the terms within each document and identify the important ones. The input is the preprocessed data, and the output is a list of important terms.
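As a purely illustrative sketch, and not the implementation of the disclosed embodiment, the TF-IDF weighting mentioned in Step 3 could be written in Python as follows. The function names `tf_idf` and `top_terms` and the sample documents are hypothetical, the input is assumed to be already tokenized, and the word embedding part is omitted.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document.

    Assumption: plain TF-IDF, i.e. (count / document length) * log(N / df).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Return the k highest-scoring terms of each document (ties broken alphabetically)."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tf_idf(documents)
    ]


# Hypothetical sample data: three short, already tokenized messages.
docs = [
    ["the", "kpi", "review", "is", "due"],
    ["the", "budget", "review", "is", "late"],
    ["the", "kpi", "dashboard", "is", "live"],
]
print(top_terms(docs, k=2))
```

Words that appear in every document (here "the" and "is") receive a score of zero, so terms specific to a document rise to the top of the list.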
[0309] Step 4:
[0310] The server uses the analysis means to analyze the context of the extracted terms. This includes context analysis to understand how the terms are used within the sentences. The input is a list of important terms, and the output is the context information for each term.
[0311] Step 5:
[0312] The server generates definitions of the terms based on the results of the context analysis using the definition generation means. It utilizes a generative AI model to create specific definitions of the terms. The input is the context information, and the output is a list of term definitions.
[0313] Step 6:
[0314] The server constructs a term directory based on the generated definitions using the directory construction means. The directory is created for the purpose of promoting unified understanding of terms within the organization. The input is a list of term definitions, and the output is a term directory available to the entire organization.
[0315] Step 7:
[0316] The server uses translation tools to translate the constructed glossary into multiple languages. This makes the glossary available in an international environment. The input is a unified glossary, and the output is a multilingual glossary.
[0317] Step 8:
[0318] Users access the glossary provided by the server using their terminals, customize the dictionary in real time, and check for updates. The update means keeps the information current in real time, and the personalization means allows users to build a dictionary tailored to their specific needs. The input consists of user search queries and customization requests, and the output consists of the latest term definitions and contextual information.
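The updating and personalization described in Step 8 can be pictured with the following minimal Python sketch. It is an assumption-laden illustration only: the class `TermDictionary`, the `Entry` fields, the version counter, and the use of department names as the personalization key are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: str
    definition: str
    departments: set = field(default_factory=set)  # hypothetical personalization key
    version: int = 1                                # hypothetical update counter


class TermDictionary:
    """In-memory dictionary that supports updates and per-department views."""

    def __init__(self):
        self._entries = {}

    def upsert(self, term, definition, departments=()):
        """Add a term, or update its definition when new data arrives."""
        entry = self._entries.get(term)
        if entry is None:
            entry = Entry(term, definition, set(departments))
            self._entries[term] = entry
        else:
            if definition != entry.definition:
                entry.definition = definition
                entry.version += 1
            entry.departments |= set(departments)
        return entry

    def view_for(self, department):
        """Return only the entries relevant to one department, sorted by term."""
        return {
            term: entry.definition
            for term, entry in sorted(self._entries.items())
            if department in entry.departments
        }


dictionary = TermDictionary()
dictionary.upsert("SLA", "Service level agreement.", ["sales"])
dictionary.upsert("SLA", "Service level agreement with a customer.", ["support"])
print(dictionary.view_for("support"))
```

Keeping a version number per entry is one simple way to let a terminal check whether a definition has changed since it was last viewed.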
[0319] (Application Example 1)
[0320] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0321] In autonomous vehicles, there is a need to provide diverse technical information in a real-time, easily understandable format. Conventional systems have struggled to effectively analyze the vast amounts of data collected by sensors and communications and present it in a way that is easy for passengers to understand. As a result, important information for drivers and passengers is not available in a timely manner, leading to challenges in fully realizing the safety and convenience of the vehicle.
[0322] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0323] In this invention, the server includes a data acquisition device, a preprocessing device for preprocessing the collected data, and an extraction device for extracting important descriptions from the preprocessed data. This makes it possible to efficiently analyze information obtained from sensors and present it to drivers and passengers in an easy-to-understand manner.
[0324] A "data acquisition device" is a hardware or software configuration for automatically acquiring necessary data from an information source.
[0325] A "data preprocessing device" is hardware or software that performs data cleaning and formatting to prepare acquired data into a format suitable for analysis.
[0326] An "extraction device" is an algorithm or program used to identify and extract important information or descriptions from pre-processed data.
[0327] A "contextual analysis device" is a device that analyzes the situation and background in which extracted information is used, in order to deepen the understanding of that information.
[0328] A "definition generator" is a system that automatically defines the meaning and usage of extracted information and terms based on the results of contextual analysis.
[0329] A "dictionary building device" is a device that creates a dictionary by aggregating specialized terminology and related information based on the generated definitions.
[0330] A "translation device" is a system or function that converts a constructed dictionary into multiple languages in real time.
[0331] An "information presentation device" is a device or system that provides analyzed and processed data to the user visually or audibly.
[0332] An "update device" is hardware or software that updates related systems in real time whenever new information or definitions are added, providing users with the latest information.
[0333] A "customization device" is a device that has the function of adjusting the dictionary of specialized terms and the method of presenting information according to the user's needs and requests.
[0334] The system implementing this invention comprises multiple hardware and software components. The server acquires data from various sources through a data acquisition device; specifically, it uses APIs and employs a speech recognition engine to convert speech data into text data. The preprocessing device then cleans the collected data and converts it into an analyzable format.
[0335] Next, the server uses an extraction device to identify important descriptions from the pre-processed data and processes the newly obtained information in real time. A context analysis device further analyzes the context in which these descriptions are used, and based on the results, a definition generator creates definitions for technical terms.
[0336] The generated definitions are aggregated by a dictionary building device and translated into multiple languages by a translation device. In particular, in the context of autonomous vehicles, the analysis results are presented to drivers and passengers in an easily understandable format using in-vehicle information presentation devices, which provide information via displays and audio systems.
[0337] The update device updates dictionaries and definitions whenever the system receives new data, ensuring that it always provides the latest information. Furthermore, the customization device allows users to adjust the type and format of information displayed according to their own needs.
[0338] For example, if the system acquires data from weather sensors and detects that it is raining on a highway, it can present passengers with important information such as "slippery road surface." By accurately conveying necessary information in real time in this way, it is possible to enhance the safety and comfort of autonomous vehicles.
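The rain example above can be illustrated with a small Python sketch. This is only a hedged illustration of how a sensor reading might be mapped to a passenger-facing notice; the function name `driving_notices` and the field names `rain_mm_per_h` and `road_type` are hypothetical and do not appear in the embodiment.

```python
def driving_notices(reading):
    """Map one sensor reading (a dict) to a list of passenger-facing notices.

    Assumption: the reading carries a rainfall rate and a road type;
    both field names are invented for this illustration.
    """
    notices = []
    raining = reading.get("rain_mm_per_h", 0.0) > 0.0
    on_highway = reading.get("road_type") == "highway"
    if raining and on_highway:
        notices.append("slippery road surface")
    return notices


print(driving_notices({"rain_mm_per_h": 2.5, "road_type": "highway"}))
```

The returned notices would then be passed to the translation and presentation stages described in this application example.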
[0339] An example of a prompt is, "Display important driving information in real time based on data acquired from sensors." This prompt is used as an instruction to provide appropriate information using a generative AI model.
[0340] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0341] Step 1:
[0342] The server acquires data from various sources using a data acquisition device. Inputs include sensor data and social network data. The server retrieves the necessary information from these sources by calling APIs, and the output is a raw dataset.
[0343] Step 2:
[0344] The server formats the acquired raw data using a preprocessing device. The input is the raw data collected in step 1. Unnecessary whitespace and symbols are removed, and the formatted data is output. Specifically, it performs routine data cleaning tasks.
[0345] Step 3:
[0346] The server converts pre-processed data into meaningful descriptions using an extraction device. The input is pre-processed data. It analyzes the information using natural language processing and outputs it as meaningful tokens. Specifically, it performs text analysis and keyword extraction.
[0347] Step 4:
[0348] The server analyzes the context in which the extracted description is used, employing a context analysis device. The input is the token from step 3. It understands the context of the description and outputs that information. Specifically, it applies a context analysis algorithm.
[0349] Step 5:
[0350] The server uses a definition generator to create definitions of technical terms from the analyzed information. The input is the result of contextual analysis. The generated definitions of technical terms are output. Specifically, it executes an AI-based definition generation process.
[0351] Step 6:
[0352] The server aggregates the generated definitions into a dictionary using a dictionary building device and translates the dictionary into multiple languages using a translation device. The input consists of definitions of technical terms. The output is dictionary-formatted data in multiple languages. Specifically, it calls existing translation APIs.
[0353] Step 7:
[0354] The server uses an information presentation device to present information to passengers and drivers via the in-vehicle display and audio systems. The input is dictionary data translated into multiple languages. Information is conveyed to passengers and drivers visually or audibly. Specifically, the server presents information in a format suitable for the vehicle's display.
[0355] Step 8:
[0356] The server uses an update device to update the dictionary and information definitions whenever new data is obtained. Inputs include feedback from sensors and new data. The latest information is output and provided to the user. Specifically, it performs periodic data collection and dictionary reconstruction.
[0357] Step 9:
[0358] The user adjusts the type and format of information displayed using a customization device. The input is the user's settings. Customized information is output according to the settings. Specifically, the customization device updates the system settings and adjusts the presented information accordingly.
[0359] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing using the user's emotions.
[0360] This invention is a system that integrates data collection, analysis, dictionary building, and user sentiment analysis, with each element working in conjunction to achieve effective knowledge management. First, the server collects information from various sources using APIs and speech recognition. Text data obtained from SNS, email, and call records is preprocessed into a format suitable for analysis.
[0361] Next, the server extracts key terms and utilizes natural language processing techniques to understand their context. This contextual information forms the basis for explaining the meaning of the terms. An AI model automatically generates definitions for the terms and builds a specialized glossary based on them. The dictionary is updated in real time and supports multiple languages using translation tools. Throughout this process, the system remains easily accessible to users via their devices.
[0362] Furthermore, the integration of an emotion engine makes it possible to analyze emotion data from user interactions. The server optimizes the dictionary content based on the user's emotions. For example, if a user expresses a negative emotion indicating a lack of understanding of a term's definition, the server responds by adding examples of related terms or deepening the explanation. Emotion data can also be used to evaluate learning progress. This allows for the provision of customized learning modules to users, meeting their individual learning needs.
[0363] As a concrete example, let's consider its use in educational institutions. Users, i.e., students, access a specialized terminology dictionary using a terminal. The server analyzes the students' sentiments and provides supplementary information for items they don't fully understand. This improves students' learning effectiveness and creates an environment where they can acquire knowledge efficiently. This system plays a crucial role in facilitating smooth communication across language barriers in multinational classes and global companies.
[0364] The following describes the processing flow.
[0365] Step 1:
[0366] The server collects data from social networking services and email servers via APIs, and transcribes voice calls and meeting recordings using speech recognition technology. The collected data is converted to text format and stored for further analysis.
[0367] Step 2:
[0368] The server performs preprocessing on the collected data. Preprocessing removes unnecessary symbols and HTML tags, standardizes spacing, and tokenizes the data, preparing it for analysis in a clean state.
[0369] Step 3:
[0370] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. It applies algorithms such as TF-IDF and topic modeling to identify highly specific terms.
[0371] Step 4:
[0372] The server analyzes the context in which each extracted term appears. Using a context window, it grasps the meaning and nuance of the term from the surrounding words and phrases, and attaches this context to the term's description.
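A context window of the kind mentioned in Step 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `context_windows`, the window size, and the sample sentence are hypothetical, and the input is assumed to be a list of lowercase tokens.

```python
def context_windows(tokens, term, window=3):
    """Return the tokens to the left and right of each occurrence of `term`.

    Assumption: a fixed-size symmetric window; the size is a free parameter.
    """
    contexts = []
    for i, token in enumerate(tokens):
        if token == term:
            contexts.append({
                "left": tokens[max(0, i - window):i],
                "right": tokens[i + 1:i + 1 + window],
            })
    return contexts


# Hypothetical sample sentence.
tokens = "please update the sla before the sla review on friday".split()
for ctx in context_windows(tokens, "sla", window=2):
    print(ctx)
```

Each returned dictionary holds the surrounding words for one occurrence, which can then be handed to the definition generation step as usage evidence.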
[0373] Step 5:
[0374] The server uses an AI model to generate definitions of terms based on the results of contextual analysis. The generated definitions are organized in a way that allows users to intuitively understand them, including specific examples and usage examples.
[0375] Step 6:
[0376] The system uses an emotion engine to collect emotional data from user interactions. The server analyzes the user's emotional state and determines whether definitions and explanations of terms should be improved.
[0377] Step 7:
[0378] The server optimizes the contents of the technical term dictionary based on user sentiment data, adding relevant information as needed. For example, it may improve the dictionary by providing clearer examples for terms that users find difficult to understand.
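One conceivable way to decide which entries need clearer examples is sketched below in Python. It is an illustration only: the emotion score range of -1 to 1, the threshold, the minimum number of reports, and the function name `terms_needing_examples` are all assumptions and are not specified in the embodiment.

```python
from collections import defaultdict


def terms_needing_examples(feedback, threshold=-0.3, min_reports=2):
    """Return terms whose average emotion score is below `threshold`.

    Assumptions: `feedback` is an iterable of (term, score) pairs, where a
    score near -1 means a negative reaction and near +1 a positive one.
    """
    scores = defaultdict(list)
    for term, score in feedback:
        scores[term].append(score)

    flagged = []
    for term, values in scores.items():
        mean = sum(values) / len(values)
        # Require several reports so that a single reaction does not trigger a rewrite.
        if len(values) >= min_reports and mean < threshold:
            flagged.append(term)
    return sorted(flagged)


# Hypothetical feedback collected by the emotion engine.
feedback = [("SLA", -0.8), ("SLA", -0.4), ("KPI", 0.5), ("KPI", -0.2), ("OKR", -0.9)]
print(terms_needing_examples(feedback))
```

The flagged terms could then be sent back to the definition generation step with a request for additional usage examples.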
[0379] Step 8:
[0380] Users access a well-organized dictionary through their device to obtain the necessary information. The server updates this dictionary in real time and applies multilingual translations to support users' smooth access to information.
[0381] Step 9:
[0382] The server evaluates the user's learning progress through an emotion engine and provides customized learning modules tailored to their individual learning pace. This allows users to learn efficiently at their own pace.
[0383] (Example 2)
[0384] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0385] In today's information society, there is a need to efficiently retrieve necessary information from vast amounts of data and understand the definitions and contexts of specific terms. Furthermore, facilitating smooth communication between different languages and providing optimal information tailored to the user's learning progress are also challenges. To address these issues, a system that operates in real time and responds to the individual needs of users is essential.
[0386] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0387] In this invention, the server includes acquisition means for acquiring data, formatting means, and extraction means for extracting important terms. This makes it possible to efficiently collect necessary information from vast amounts of data and to support users in understanding the definitions of terms and the contexts in which they are used. Furthermore, through multilingual conversion and sentiment analysis, it is possible to provide a learning environment optimized for individual users and facilitate communication.
[0388] "Acquisition means" refers to the functions and processes used to gather necessary information from information sources.
[0389] "Formatting means" refers to the processes and devices used to convert acquired data into a format that is easy to analyze.
[0390] "Extraction means" refers to the process or function used to select important terms and keywords from formatted data.
[0391] "Context analysis means" refers to the processes and techniques for investigating the situations and backgrounds in which extracted terms are used.
[0392] "Definition creation means" refers to the functions or processes for clarifying the meaning of terms based on the analyzed context.
[0393] "Dictionary construction means" refers to the processes and techniques for systematically collecting created definitions to create a glossary.
[0394] "Conversion means" refers to the functions and technologies for translating a constructed dictionary into a different language.
[0395] "Emotion optimization means" refers to the processes or devices that optimize the presentation of information based on the user's emotional data.
[0396] "Evaluation provision means" refers to a function that evaluates the user's learning status and progress and provides feedback based on that evaluation.
[0397] "Update means" refers to the processes and techniques for adding new information to a dictionary and correcting the definitions of terms that have changed.
[0398] "Personalization means" refers to the functions and technologies that adjust and customize information according to the specific needs and requests of the user.
[0399] This invention is a system that integrates data collection, formatting, and extraction of information necessary for understanding. In this system, a server plays a central role in acquiring data from various sources. APIs and speech recognition technologies can be used for data acquisition. Specifically, APIs for acquiring social media data and speech recognition services for converting speech to text can be used.
[0400] The server formats the collected data and converts it into a parseable format. This includes cleaning and standardizing the format of text data. It then extracts key terms and analyzes the context in which those terms are used.
[0401] Based on the results of contextual analysis, a generative AI model is used to create definitions for terms. This builds a specialized glossary of terms. The generated dictionary becomes available in multiple languages through a translation mechanism. This provides an environment that is easily accessible to users who speak different languages.
[0402] Furthermore, the emotion optimization means can analyze user emotion data and optimize the presentation of dictionaries and information. For example, if a user is confused by a particular term, the server can help the user understand it by providing detailed explanations and additional examples.
[0403] As a concrete example, when a user accesses the system to learn a specific technical term, they enter a prompt. For instance, when the user provides text such as "The server explains how to collect data using an API," the system generates definitions and related information. This allows the user to efficiently acquire the necessary knowledge and apply it in practice.
[0404] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0405] Step 1:
[0406] The server retrieves data from various sources using APIs and speech recognition systems. Inputs include social media posts, emails, and call logs. The server collects this data and prepares it for subsequent formatting procedures. The output is a collection of raw data.
[0407] Step 2:
[0408] The server uses formatting tools to convert the raw data into a parseable format. The input is the raw data collected in step 1. At this stage, unnecessary characters and HTML tags are removed from the data, letter case is normalized, and redundant whitespace is removed to generate clean text data. The output is cleaned text data.
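The formatting operations listed in Step 2 can be sketched with the Python standard library as follows. This is a simplified illustration rather than the embodiment's implementation: the function names `clean_text` and `tokenize` are hypothetical, tags are stripped with a simple regular expression instead of a full HTML parser, and tokens are split on whitespace only.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")         # naive HTML tag matcher (assumption)
SYMBOL_RE = re.compile(r"[^\w\s'-]")    # everything except word chars, spaces, ' and -
SPACE_RE = re.compile(r"\s+")


def clean_text(raw):
    """Remove tags and stray symbols, collapse whitespace, and lowercase."""
    text = html.unescape(TAG_RE.sub(" ", raw))
    text = SYMBOL_RE.sub(" ", text)
    text = SPACE_RE.sub(" ", text).strip()
    return text.lower()


def tokenize(raw):
    """Split cleaned text on whitespace (a stand-in for an NLP tokenizer)."""
    return clean_text(raw).split()


print(clean_text("<p>Our  KPI target!!</p>\n"))
print(tokenize("Q3 run-rate: <b>up</b>"))
```

Hyphens and apostrophes are kept so that compound terms such as "run-rate" survive as single tokens; whether that is desirable depends on the terminology being collected.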
[0409] Step 3:
[0410] The server extracts key terms from formatted data. The input is formatted text data. Using natural language processing techniques, part-of-speech tagging is performed to extract key terms such as nouns and verbs from the text. The output is a list of the extracted key terms.
[0411] Step 4:
[0412] The server performs contextual analysis to analyze the context in which terms are used. The input is the list of key terms obtained in step 3, as well as the formatted text data. The server analyzes the context surrounding each term and generates foundational data to understand its meaning. The output is contextual information for the terms.
[0413] Step 5:
[0414] The server uses a generative AI model to create term definitions based on contextual information. The input is the contextual information obtained in step 4. The AI model is given prompt sentences, and as a result, the meanings of the terms are automatically generated. The output is a list of defined terms.
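How prompt sentences might be assembled from the contextual information in Step 5 is sketched below in Python. This is a hedged illustration: the prompt wording, the function names `build_definition_prompt` and `generate_definitions`, and the `generate` callable (a placeholder for whichever generative AI model is used) are all assumptions, and no real model API is called.

```python
def build_definition_prompt(term, context_sentences, max_examples=3):
    """Build a prompt that asks a generative model to define `term`.

    Assumption: the prompt wording is invented for this illustration.
    """
    lines = [
        f'Define the term "{term}" as it is used inside this organization.',
        "Base the definition only on the usage examples below.",
        "Usage examples:",
    ]
    examples = context_sentences[:max_examples]
    lines += [f"{i}. {sentence}" for i, sentence in enumerate(examples, start=1)]
    lines.append("Answer with a one-sentence definition followed by one usage example.")
    return "\n".join(lines)


def generate_definitions(contexts_by_term, generate):
    """Call `generate` (a hypothetical model wrapper) once per term."""
    return {
        term: generate(build_definition_prompt(term, sentences))
        for term, sentences in contexts_by_term.items()
    }


contexts = {"SLA": ["The SLA requires a reply within 4 hours.", "We missed the SLA twice."]}
print(build_definition_prompt("SLA", contexts["SLA"]))
```

Passing the model as a callable keeps the sketch independent of any particular provider and makes the step easy to exercise with a stub.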
[0415] Step 6:
[0416] The server builds a term dictionary based on the created definitions and translates it using a multilingual translation mechanism. The input is the definitions of terms generated in step 5. The definitions are added to the dictionary and translated into multiple languages by the translation mechanism. The output is an updated multilingual term dictionary.
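The structure of the multilingual term dictionary produced in Step 6 might look like the following Python sketch. It is illustrative only: the function name `build_multilingual_dictionary`, the entry layout, and the `translate` callable (a placeholder for a machine translation service) are assumptions, and the stub translator below merely tags the text with the language code.

```python
def build_multilingual_dictionary(definitions, languages, translate):
    """Attach a translation of every definition for every target language.

    Assumption: `translate(text, lang)` wraps some translation service;
    it is injected so that no specific API is implied.
    """
    dictionary = {}
    for term, definition in definitions.items():
        dictionary[term] = {
            "definition": definition,
            "translations": {lang: translate(definition, lang) for lang in languages},
        }
    return dictionary


def fake_translate(text, lang):
    """Stub translator used only for demonstration."""
    return f"[{lang}] {text}"


result = build_multilingual_dictionary(
    {"SLA": "Service level agreement."}, ["ja", "fr"], fake_translate
)
print(result["SLA"]["translations"]["ja"])
```

Storing the source definition next to its translations makes it straightforward to retranslate an entry whenever the definition is updated.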
[0417] Step 7:
[0418] The server analyzes the user's emotional data using the emotion optimization means and evaluates the learning progress. Inputs include the user's operation history and feedback. Specifically, repeated searches for the same term within a certain time frame are treated as emotional data indicating that the user is having difficulty with that term. The output is a learning plan optimized for the user's emotional state.
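The detection of repeated searches within a certain time frame can be sketched in Python with a sliding window. This is a minimal illustration under stated assumptions: the function name `repeated_lookups`, the 600-second window, and the threshold of three searches are hypothetical values, since the embodiment does not specify them.

```python
from collections import defaultdict


def repeated_lookups(events, window_seconds=600, min_count=3):
    """Return terms searched at least `min_count` times within any time window.

    Assumptions: `events` is an iterable of (timestamp_in_seconds, term)
    pairs taken from the user's operation history.
    """
    times_by_term = defaultdict(list)
    for timestamp, term in events:
        times_by_term[term].append(timestamp)

    flagged = set()
    for term, times in times_by_term.items():
        times.sort()
        start = 0
        for end, timestamp in enumerate(times):
            # Shrink the window from the left until it spans at most window_seconds.
            while timestamp - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= min_count:
                flagged.add(term)
                break
    return sorted(flagged)


# Hypothetical search history.
history = [(0, "sla"), (100, "kpi"), (200, "sla"), (500, "sla"), (5000, "kpi"), (9000, "kpi")]
print(repeated_lookups(history))
```

Terms returned by this check could be given priority when the learning plan is assembled, for example by scheduling an extended explanation for them.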
[0419] Step 8:
[0420] Users access optimized glossaries and learning modules through their devices. The server uses personalization methods to provide information tailored to the user's needs. Input consists of user queries and prompts. The server provides additional explanations and examples based on the user's context to support efficient learning. Output is the most suitable learning content for the user.
[0421] (Application Example 2)
[0422] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0423] In modern online shopping, users often misunderstand information or make poorly informed decisions due to a lack of understanding of complex financial terminology. In particular, the sheer volume and specialized nature of financial information make it difficult for users to feel confident in their decision-making. This invention aims to provide a system that assists in understanding financial terminology and thereby improves the user's purchasing experience.
[0424] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0425] In this invention, the server includes data collection means, preprocessing means for preprocessing collected information, extraction means for extracting important terms from the preprocessed information, context analysis means for analyzing the context in which the extracted terms are used, explanation generation means for generating explanations of terms based on the results of the context analysis, dictionary construction means for constructing a specialized terminology dictionary based on the generated explanations, translation means for translating the constructed dictionary into multiple languages, and optimization means for analyzing user responses and optimizing explanations of purchase-related terms. This enables users to deepen their understanding of financial terminology in real time and make rational and confident decisions.
[0426] A "data collection means" is an element that has the function of collecting necessary data from multiple information sources.
[0427] A "preprocessing means" is an element that has the function of performing processing to convert the collected data into a format that is easy to analyze.
[0428] An "extraction means" is an element that has the function of selecting important terms from pre-processed data.
[0429] A "context analysis means" is an element that has the function of performing analysis to understand the context in which the extracted terms are used.
[0430] An "explanation generation means" is an element that has the function of generating definitions and explanations of terms based on the results of the context analysis.
[0431] A "dictionary construction means" is an element that has the function of accumulating the generated definitions of terms to form a specialized terminology dictionary.
[0432] A "translation means" is an element that has the function of translating a constructed specialized terminology dictionary into a different language.
[0433] An "optimization means" is an element that adjusts the explanation of terms based on user feedback so that, when necessary, the explanation becomes as easy to understand as possible.
[0434] This invention is implemented as a system including data collection means, preprocessing means, extraction means, context analysis means, explanation generation means, dictionary construction means, translation means, and optimization means. The server uses APIs to collect data from multiple information sources and preprocesses the collected data. The collected data is formatted into text, and important terms are extracted using natural language processing techniques.
[0435] This system is implemented in programming languages such as Python and JavaScript, and utilizes cloud services such as Google Cloud and AWS for data processing. For natural language processing, it can use the Google Cloud Natural Language API or IBM Watson NLU. These capabilities allow the contextual analysis tool to understand the situation in which each term is used.
[0436] The server generates definitions of terms based on the results of the situation analysis. These definitions are stored as a specialized terminology dictionary. To support multiple languages, the dictionary is translated into several languages using translation tools. This process is efficiently carried out using tools such as the Google Translate API.
[0437] When a user accesses the system through their device, the optimization means adjusts the explanations of purchase-related terms based on the user's reactions and sentiment data. This allows users to effectively acquire the knowledge necessary for their financial transactions.
[0438] For example, if a user expresses concern about the term "credit score," the server provides detailed explanations and related information directly on the page to enhance the user's understanding. The AI model can also be instructed to generate information using a prompt such as the following: "Please briefly explain the financial term 'XX' related to the product you are currently trying to purchase."
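As a non-limiting illustration, the prompt quoted above could be assembled from a template in which the placeholder 'XX' is replaced by the term in question. The following Python sketch assumes hypothetical names (`PROMPT_TEMPLATE`, `build_prompt`) that do not appear in the disclosure, and it does not call any AI model.

```python
# Template text is quoted from the description; 'XX' is replaced by {term}.
PROMPT_TEMPLATE = (
    "Please briefly explain the financial term '{term}' related to the "
    "product you are currently trying to purchase."
)


def build_prompt(term: str) -> str:
    """Insert the term into the prompt template; reject empty input."""
    cleaned = term.strip()
    if not cleaned:
        raise ValueError("term must not be empty")
    return PROMPT_TEMPLATE.format(term=cleaned)


print(build_prompt("credit score"))
```

Keeping the template separate from the term makes it possible to reuse the same instruction for any term about which the user expresses concern.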
[0439] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0440] Step 1:
[0441] The server uses APIs to collect data such as user purchase history and search history from multiple sources. The input is raw data obtained from the APIs, and the output is data prepared for preprocessing. In this step, the server accesses each source and retrieves the data in the required format.
[0442] Step 2:
[0443] The server performs preprocessing to convert the collected data into a parseable format. The input is the raw data collected in step 1, and the output is data in text format. At this stage, the server filters out unnecessary information and formats the data by converting it to text format.
[0444] Step 3:
[0445] The server extracts important terms from pre-processed data using natural language processing techniques. The input is formatted text data, and the output is a list of extracted important terms. In this step, the server identifies and lists relevant keywords and phrases.
[0446] Step 4:
[0447] The server performs contextual analysis to analyze the context in which extracted terms are used. The input is a list of terms, and the output is contextual information. For each term, the server evaluates its usage and adds relevant data to create detailed contextual information.
[0448] Step 5:
[0449] The server generates definitions of terms based on the analyzed contextual information. The input is contextual information, and the output is the term and its definition. At this stage, the server utilizes a generative AI model to automatically generate accurate definitions of terms.
[0450] Step 6:
[0451] The server collects the generated term definitions to build a multilingual specialized terminology dictionary. The input is the terms and their definitions, and the output is updated dictionary data. The dictionary is automatically updated, and the server uses a translation API to translate the information into multiple languages.
[0452] Step 7:
[0453] When a user accesses the system using a terminal, the server optimizes the terminology explanations based on the user's actions. The input is user interaction data from the terminal, and the output is the optimized terminology explanation. The server analyzes the user's reactions, improves the explanations as needed, and presents them in an easily understandable format.
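The way each step's output becomes the next step's input can be sketched as a chain of functions. The following Python sketch covers only Steps 2 to 5 and rests on stated assumptions: the vocabulary `KNOWN_TERMS` stands in for natural language processing, `define_terms` is a placeholder where a generative AI model would be called, and all function names are hypothetical.

```python
# Hypothetical vocabulary standing in for NLP-based term extraction (Step 3).
KNOWN_TERMS = {"credit score", "apr"}


def preprocess(state: dict) -> dict:
    """Step 2: drop empty records and join the rest into plain text."""
    texts = [r["text"].strip() for r in state["raw"] if r.get("text")]
    return {**state, "text": " ".join(texts)}


def extract_terms(state: dict) -> dict:
    """Step 3: list the known terms that occur in the text."""
    lowered = state["text"].lower()
    return {**state, "terms": sorted(t for t in KNOWN_TERMS if t in lowered)}


def add_context(state: dict) -> dict:
    """Step 4: attach the sentences in which each term is used."""
    sentences = [s.strip() for s in state["text"].split(".") if s.strip()]
    context = {
        term: [s for s in sentences if term in s.lower()]
        for term in state["terms"]
    }
    return {**state, "context": context}


def define_terms(state: dict) -> dict:
    """Step 5: placeholder where a generative AI model would be called."""
    definitions = {
        term: f"<definition of '{term}' from {len(uses)} usage(s)>"
        for term, uses in state["context"].items()
    }
    return {**state, "definitions": definitions}


def run_pipeline(raw_records: list[dict]) -> dict:
    """Chain Steps 2 to 5 so that each output feeds the next input."""
    state = {"raw": raw_records}
    for stage in (preprocess, extract_terms, add_context, define_terms):
        state = stage(state)
    return state


result = run_pipeline([
    {"text": "Your credit score affects the APR."},
    {"text": "A higher credit score helps."},
    {"text": ""},
])
print(result["terms"])
```

Because every stage returns a new state that retains the earlier outputs, later steps such as dictionary construction and optimization could be appended to the same chain.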
[0454] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0455] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0456] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0457] [Third Embodiment]
[0458] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0459] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0460] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0461] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0462] The microphone 238 receives instructions from the user 20 by receiving the voice of the user 20. The microphone 238 captures the voice of the user 20, converts it into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0463] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0464] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0465] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0466] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0467] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0468] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0469] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0470] This invention is server-centric and begins with the server collecting data. The server uses APIs to acquire text data from SNS and email servers and uses a speech recognition engine to convert audio data from calls and meetings into text. The collected data is then preprocessed on the server. Unnecessary symbols and spaces are removed, tokenization is performed, and the data is prepared in a format suitable for analysis.
[0471] Next, the server extracts key terms from the pre-processed data using natural language processing techniques. The terms extracted in this step are then analyzed to determine how they are used in each context. The information obtained through contextual analysis becomes the foundational data used by AI to clarify the meaning of these terms.
[0472] Through the analysis process, the server automatically generates definitions for each term. These generated definitions help users understand how, and in what context, technical terms are used. Based on these definitions, the server builds specialized terminology dictionaries for the entire organization, for departments, and even for individuals.
[0473] Users can access these dictionaries through their devices, and the dictionary contents are updated in real time on the server, allowing for immediate reflection of newly extracted terms and their correspondingly changed definitions. The updated dictionaries are translated into multiple languages and can be used by multinational organizations.
[0474] As a concrete example, consider the use of this system in an international research institution. The server collects data from online platforms used by researchers and extracts specialized DNA-related terminology. A dictionary that users (researchers) can access on their terminals supports more effective communication by providing the generated definitions and usage contexts of scientific terms. This process is expected to deepen understanding among researchers with different backgrounds and improve team productivity.
[0475] The following describes the processing flow.
[0476] Step 1:
[0477] The server uses APIs and web scraping to retrieve information from social media and email servers. It also uses speech recognition technology to convert audio data from phone calls and meetings into text and collects the resulting text.
[0478] Step 2:
[0479] The server performs preprocessing on the collected data. This preprocessing involves removing unnecessary tags and special characters, and splitting words and phrases through tokenization, preparing the data for analysis.
[0480] Step 3:
[0481] The server extracts important terms and phrases from the pre-processed data using natural language processing techniques. Here, the TF-IDF algorithm and topic modeling are used to select terms based on their frequency and importance.
[0482] Step 4:
[0483] The server analyzes the text surrounding each extracted term and performs contextual analysis to understand how the term is being used. This gives a deeper understanding of the term based on the text that precedes and follows it.
[0484] Step 5:
[0485] The server uses AI to automatically generate definitions of terms based on the results of contextual analysis. These definitions include specific meanings and relevant usage examples, providing details to aid understanding.
[0486] Step 6:
[0487] The server builds specialized terminology dictionaries tailored to each department and individual needs, based on automatically generated definitions. This allows users to access dictionaries optimized for their specific needs.
[0488] Step 7:
[0489] The server translates the specialized terminology dictionaries into multiple languages, establishing a system that supports intercultural communication. This translation process is automated, enabling the organization to meet its internationalization needs.
[0490] Step 8:
[0491] Users can access the latest dictionary information through their devices and deepen their understanding of terminology as needed. The server continuously updates the dictionary in real time based on user feedback and newly collected data.
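Steps 2 and 3 of the above flow can be illustrated with a small, self-contained sketch. The following Python code is only one possible reading: it removes tags and special characters, tokenizes on whitespace, and scores tokens with a basic TF-IDF formula (term frequency multiplied by log(N / document frequency)). Topic modeling is not shown, and the function names and sample sentences are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def preprocess(raw: str) -> list[str]:
    """Step 2: strip tags and special characters, then tokenize."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    cleaned = re.sub(r"[^a-z0-9\s-]", " ", no_tags.lower())
    return cleaned.split()


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Step 3: score each token per document as tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        scores.append({
            tok: (count / total) * math.log(n_docs / df[tok])
            for tok, count in counts.items()
        })
    return scores


def top_terms(doc_scores: dict[str, float], k: int = 3) -> list[str]:
    """Return the k highest-scoring tokens, ties broken alphabetically."""
    return sorted(doc_scores, key=lambda t: (-doc_scores[t], t))[:k]


docs = [preprocess(d) for d in [
    "<p>The PCR assay amplifies the DNA sample.</p>",
    "The meeting is on Monday.",
    "The PCR run failed; the PCR primer was wrong.",
]]
scores = tf_idf(docs)
print(top_terms(scores[0]))
```

In this sketch a word that appears in every document, such as "the", receives a score of zero, which is how TF-IDF separates common words from candidate technical terms.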
[0492] (Example 1)
[0493] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0494] In recent years, there has been a growing need to efficiently collect large amounts of data from diverse sources, analyze it with high accuracy, and rapidly generate definitions of technical terms, providing them as dictionaries in multiple languages. However, due to data diversity and language differences, doing this in real time is difficult. In addition, there is a demand for customizable dictionaries that can meet user needs, and a system is needed to address these challenges.
[0495] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0496] In this invention, the server includes information gathering means, organizing means for organizing the collected information, and extraction means for extracting important terms from the organized information. This makes it possible to efficiently process data obtained from diverse information sources, generate definitions of specialized terms in real time, and provide a glossary that supports multiple languages.
[0497] "Information gathering means" refers to functions and technologies for obtaining data from various information sources. This means includes, for example, systems that efficiently acquire text data and audio data using APIs or speech recognition technology.
[0498] "Organizing means" refers to a function that performs data cleansing and tokenization in order to convert collected raw data into an analyzable format. This means removes unnecessary information from the data and optimizes it for analysis.
[0499] "Extraction means" refers to a function that identifies and selects important terms and descriptions from the organized data. This means uses natural language processing technology and provides the basic information for subsequent analysis.
[0500] "Analysis means" refers to a function that analyzes the context and usage of extracted terms to clarify their meaning. This means deepens the understanding of how terms are used.
[0501] "Definition generation means" refers to a function that automatically creates specific meanings and definitions of terms based on the results of analysis. This means uses an AI model to generate detailed definitions.
[0502] "Cataloging means" refers to a function that creates a dictionary that aggregates and systematically organizes specialized terminology based on the generated definitions. This dictionary promotes a unified understanding throughout the organization.
[0503] "Translation means" refers to a function that converts the constructed glossary into multiple languages in order to make it usable by multinational users. This means incorporates machine translation technology.
[0504] "Update means" refers to a function that updates terms and their definitions in real time based on newly acquired information. This means ensures that users always receive the latest information.
[0505] "Personalization means" refers to a function that customizes the glossary according to the user's needs and requirements, optimizing it for specific conditions. This means provides users with a dictionary that is easier to use.
[0506] This invention is a server-centered system configuration that efficiently collects and analyzes large amounts of data obtained from various information sources, rapidly generates definitions of technical terms, and then constructs and provides them as a glossary. Specific embodiments are shown below.
[0507] Server Functions
[0508] The server uses the information gathering means to collect text data from sources such as SNS platforms and mail servers via APIs. It also uses speech recognition technology to transcribe audio data, for example, from audio conferencing systems. This process uses speech recognition engines such as Google Speech-to-Text and IBM Watson.
[0509] The server then uses the organizing means to prepare the collected data through cleansing and tokenization. This removes unnecessary symbols and whitespace, and converts the data into an analyzable format using natural language processing libraries (such as NLTK or spaCy).
[0510] Subsequently, the server uses the extraction means to extract important terms from the prepared data. In this step, useful information is extracted using TF-IDF and word embedding techniques (e.g., Word2Vec, BERT).
[0511] The server then uses the analysis means to analyze the context of the extracted terms. This clarifies the meaning of each term, and the definition generation means generates definitions for the terms using generative AI models (e.g., OpenAI GPT, BERT).
[0512] User Usage Instructions
[0513] Users access a glossary built by the server using their terminals. Users can view the definitions and usage contexts of generated technical terms in real time. Furthermore, the server uses update mechanisms to constantly update the dictionary with the latest information, and multilingual support allows users to access materials in multiple languages.
[0514] Examples of specific cases and prompt statements
[0515] As a concrete example, consider the case where an international research institution uses this system. The server collects data, for example, gene analysis data, from online platforms used by researchers, and extracts specialized DNA-related terminology from it.
[0516] Example of a prompt:
[0517] "Collect and generate definitions for DNA-related terminology used by international research teams on online platforms. Provide examples of how this generated dictionary facilitates communication among researchers with diverse backgrounds."
[0518] This allows users to communicate effectively with researchers from diverse backgrounds without encountering language barriers.
[0519] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0520] Step 1:
[0521] The server acquires data from various sources using the information gathering means. These sources include social networking services (SNS), email servers, and online meeting systems. Specifically, it collects text data using APIs and converts audio data into text using a speech recognition engine. The input is audio or text data, and the output is transcribed text data.
[0522] Step 2:
[0523] The server preprocesses the collected data using the organizing means. Specifically, it cleanses the data, removing unnecessary symbols and spaces. It also tokenizes the data using a natural language processing library, preparing it for analysis. The input is the transcribed text data, and the output is pre-processed data ready for analysis.
[0524] Step 3:
[0525] The server extracts important terms from the prepared data using the extraction means. Here, TF-IDF and word embedding techniques are used to weight the terms within each document and identify the important ones. The input is the prepared data, and the output is a list of important terms.
[0526] Step 4:
[0527] The server analyzes the context of the extracted terms using the analysis means. This includes contextual analysis to understand how the terms are used in a sentence. The input is a list of important terms, and the output is contextual information for each term.
[0528] Step 5:
[0529] The server generates term definitions based on the results of contextual analysis using the definition generation means. It uses a generative AI model to create specific term definitions. The input is contextual information, and the output is a list of term definitions.
[0530] Step 6:
[0531] The server uses the cataloging means to build a glossary based on the generated definitions. The glossary is created to promote a consistent understanding of terminology within the organization. The input is a list of term definitions, and the output is a glossary usable throughout the organization.
[0532] Step 7:
[0533] The server uses the translation means to translate the constructed glossary into multiple languages. This makes the glossary available in an international environment. The input is a unified glossary, and the output is a multilingual glossary.
[0534] Step 8:
[0535] Users can access the glossary provided by the server using their terminals, customize the dictionary in real time, and check for updates. The update means enables real-time information updates, and the personalization means allows users to build a dictionary tailored to their specific needs. The input consists of user search queries and customization requests, and the output consists of the latest term definitions and contextual information.
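The interplay of the cataloging, update, and personalization functions in Steps 6 and 8 can be sketched with a small in-memory structure. The following Python code is a simplified illustration under stated assumptions: the class and field names (`Glossary`, `Entry`, `departments`, `version`) are hypothetical, storage is a plain dictionary, and personalization is reduced to filtering by department.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: str
    definition: str
    departments: set[str] = field(default_factory=set)
    version: int = 1


class Glossary:
    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def upsert(self, term: str, definition: str, department: str) -> Entry:
        """Add a term, or update its definition when new data arrives."""
        key = term.lower()
        entry = self._entries.get(key)
        if entry is None:
            entry = Entry(term, definition, {department})
            self._entries[key] = entry
        else:
            if entry.definition != definition:
                entry.definition = definition
                entry.version += 1  # record that the definition changed
            entry.departments.add(department)
        return entry

    def view_for(self, department: str) -> dict[str, str]:
        """Return only the entries relevant to one department."""
        return {
            e.term: e.definition
            for e in self._entries.values()
            if department in e.departments
        }


glossary = Glossary()
glossary.upsert("PCR", "Polymerase chain reaction.", "genomics")
glossary.upsert("pcr", "Polymerase chain reaction, a DNA amplification method.", "lab-ops")
glossary.upsert("SLA", "Service level agreement.", "lab-ops")
print(glossary.view_for("genomics"))
```

Because a term is stored once and shared across views, an updated definition becomes visible in every department's dictionary at the same time.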
[0536] (Application Example 1)
[0537] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0538] In autonomous vehicles, there is a need to provide diverse technical information in a real-time, easily understandable format. Conventional systems have struggled to effectively analyze the vast amounts of data collected by sensors and communications and present it in a way that is easy for passengers to understand. As a result, important information for drivers and passengers is not available in a timely manner, leading to challenges in fully realizing the safety and convenience of the vehicle.
[0539] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0540] In this invention, the server includes a data acquisition device, a preprocessing device for preprocessing the collected data, and an extraction device for extracting important descriptions from the preprocessed data. This makes it possible to efficiently analyze information obtained from sensors and present it to drivers and passengers in an easy-to-understand manner.
[0541] A "data acquisition device" is a hardware or software configuration for automatically acquiring necessary data from an information source.
[0542] A "preprocessing device" is hardware or software that performs data cleaning and formatting to prepare acquired data into a format suitable for analysis.
[0543] An "extraction device" is an algorithm or program used to identify and extract important information or descriptions from pre-processed data.
[0544] A "contextual analysis device" is a device that analyzes the situation and background in which extracted information is used, in order to deepen the understanding of that information.
[0545] A "definition generator" is a system that automatically defines the meaning and usage of extracted information and terms based on the results of contextual analysis.
[0546] A "dictionary building device" is a device that creates a dictionary by aggregating specialized terminology and related information based on the generated definitions.
[0547] A "translation device" is a system or function that converts a constructed dictionary into multiple languages in real time.
[0548] An "information presentation device" is a device or system that provides analyzed and processed data to the user visually or audibly.
[0549] An "update device" is hardware or software that updates related systems in real time whenever new information or definitions are added, providing users with the latest information.
[0550] A "customization device" is a device that has the function of adjusting the dictionary of specialized terms and the method of presenting information according to the user's needs and requests.
[0551] The system implementing this invention comprises multiple hardware and software components. The server acquires data from various sources through the data acquisition device, and the preprocessing device cleans the collected data and converts it into an analyzable format. Specifically, data is acquired via APIs, and a speech recognition engine converts speech data into text data.
[0552] Next, the server uses an extraction device to identify important descriptions from the pre-processed data and processes the newly obtained information in real time. A context analysis device further analyzes the context in which these descriptions are used, and based on the results, a definition generator creates definitions for technical terms.
[0553] The generated definitions are aggregated by the dictionary building device and translated into multiple languages by the translation device, which enables multilingual support. In particular, in the context of autonomous vehicles, the analysis results are presented to drivers and passengers in an easily understandable format using an in-vehicle information presentation device. This information presentation device provides information via displays and audio systems.
[0554] The update device updates dictionaries and definitions whenever the system receives new data, ensuring that it always provides the latest information. Furthermore, the customization device allows users to adjust the type and format of information displayed according to their own needs.
[0555] For example, if the system acquires data from weather sensors and detects that it is raining on a highway, it can present passengers with important information such as "slippery road surface." By accurately conveying necessary information in real time in this way, it is possible to enhance the safety and comfort of autonomous vehicles.
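The weather example above amounts to a rule that maps sensor readings to a message for passengers. The following Python sketch shows one way such a rule could be expressed; the `SensorReading` fields, the road-type labels, and the `RULES` table are assumptions made for illustration and are not specified in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SensorReading:
    raining: bool
    road_type: str  # hypothetical labels such as "highway" or "urban"


# Each rule pairs a condition on the reading with the message to present.
RULES: list[tuple[Callable[[SensorReading], bool], str]] = [
    (lambda r: r.raining and r.road_type == "highway", "Slippery road surface"),
]


def passenger_alerts(reading: SensorReading) -> list[str]:
    """Return every message whose condition holds for this reading."""
    return [message for condition, message in RULES if condition(reading)]


print(passenger_alerts(SensorReading(raining=True, road_type="highway")))
```

Keeping the rules in a table means that further conditions and messages could be added without changing the function that evaluates them.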
[0556] An example of a prompt is, "Display important driving information in real time based on data acquired from sensors." This prompt is used as an instruction to provide appropriate information using a generative AI model.
[0557] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0558] Step 1:
[0559] The server acquires data from various sources using the data acquisition device. Inputs include sensor data and social network data. The server collects the necessary information from these inputs and outputs a raw dataset. Specifically, it retrieves the data by calling APIs.
[0560] Step 2:
[0561] The server formats the acquired raw data using the preprocessing device. The input is the raw data collected in Step 1. Whitespace and symbols are removed, and the formatted data is output. Specifically, the server performs routine data cleaning tasks.
[0562] Step 3:
[0563] The server converts pre-processed data into meaningful descriptions using an extraction device. The input is pre-processed data. It analyzes the information using natural language processing and outputs it as meaningful tokens. Specifically, it performs text analysis and keyword extraction.
[0564] Step 4:
[0565] The server analyzes the context in which the extracted description is used, employing a context analysis device. The input is the token from step 3. It understands the context of the description and outputs that information. Specifically, it applies a context analysis algorithm.
[0566] Step 5:
[0567] The server uses a definition generator to create definitions of technical terms from the analyzed information. The input is the result of contextual analysis. The generated definitions of technical terms are output. Specifically, it executes an AI-based definition generation process.
[0568] Step 6:
[0569] The server aggregates the generated definitions using the dictionary building device and translates them using the translation device to form a multilingual dictionary. The input consists of definitions of technical terms. The output is dictionary-formatted data in multiple languages. Specifically, the server calls existing translation APIs.
[0570] Step 7:
[0571] The server uses the information presentation device to present information to passengers and drivers via the in-vehicle information display system. The input is dictionary data translated into multiple languages. Information is conveyed to passengers and drivers visually or audibly. Specifically, the server presents information in a format suitable for the vehicle's display.
[0572] Step 8:
[0573] The server uses an update device to update the dictionary and information definitions whenever new data is obtained. Inputs include feedback from sensors and new data. The latest information is output and provided to the user. Specifically, it performs periodic data collection and dictionary reconstruction.
[0574] Step 9:
[0575] The user adjusts the type and format of the displayed information using the customization device. The input is the user's settings. Customized information is output according to the settings. Specifically, the server updates the system settings and adjusts the presented information accordingly.
[0576] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing using the user's emotions.
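The dictionary translation in Step 6 of the above flow can be sketched as a loop that passes each definition to a translation function. In the following Python sketch the translator is injected as a parameter, so that any translation API could be wrapped behind it; the signature `(text, target_language) -> str` and the stand-in `fake_translate` are assumptions for illustration, and no real translation service is called.

```python
from typing import Callable

# (text, target_language) -> translated text. A real system would wrap a
# translation API behind this signature, which is assumed for illustration.
Translator = Callable[[str, str], str]


def translate_glossary(
    definitions: dict[str, str],
    languages: list[str],
    translate: Translator,
) -> dict[str, dict[str, str]]:
    """Return, per term, the source definition plus one entry per language."""
    translated: dict[str, dict[str, str]] = {}
    for term, definition in definitions.items():
        entry = {"source": definition}
        for lang in languages:
            entry[lang] = translate(definition, lang)
        translated[term] = entry
    return translated


def fake_translate(text: str, lang: str) -> str:
    """Stand-in translator that only tags the text with the language code."""
    return f"[{lang}] {text}"


multilingual = translate_glossary(
    {"LiDAR": "A laser-based distance sensor."}, ["ja", "de"], fake_translate
)
print(multilingual["LiDAR"]["ja"])
```

Retaining the source definition next to each translation allows the presentation side to fall back to the original text when a language is unavailable.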
[0577] This invention is a system that integrates data collection, analysis, dictionary building, and user sentiment analysis, with each element working in conjunction to achieve effective knowledge management. First, the server collects information from various sources using APIs and speech recognition. Text data obtained from SNS, email, and call records is preprocessed to a format suitable for analysis.
[0578] Next, the server extracts key terms and utilizes natural language processing techniques to understand their context. This contextual information forms the basis for explaining the meaning of the terms. An AI model automatically generates definitions for the terms and builds a specialized glossary based on them. The dictionary is updated in real time and supports multiple languages using translation tools. Throughout this process, the system remains easily accessible to users via their devices.
[0579] Furthermore, the integration of an emotion engine makes it possible to analyze emotion data from user interactions. The server optimizes the dictionary content based on the user's emotions. For example, if a user expresses a negative emotion indicating a lack of understanding of a term's definition, the server responds by adding examples of related terms or deepening the explanation. Emotion data can also be used to evaluate learning progress. This allows for the provision of customized learning modules to users, meeting their individual learning needs.
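The emotion-based adjustment described above can be illustrated as follows. This Python sketch assumes that the emotion engine yields a single score in the range -1.0 (negative) to 1.0 (positive) and that supplementary examples are already available; the score scale, the threshold value, and the names `Explanation` and `optimize` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Explanation:
    term: str
    definition: str
    examples: list[str] = field(default_factory=list)


def optimize(
    explanation: Explanation,
    emotion_score: float,
    extra_examples: list[str],
    threshold: float = -0.3,
) -> Explanation:
    """Add examples only when the emotion score is below the threshold."""
    if emotion_score >= threshold:
        return explanation  # no sign of confusion; leave unchanged
    added = [e for e in extra_examples if e not in explanation.examples]
    return Explanation(
        explanation.term, explanation.definition, explanation.examples + added
    )


base = Explanation(
    "mitosis",
    "Cell division producing two identical cells.",
    ["Skin cells renew by mitosis."],
)
improved = optimize(base, -0.8, ["A fertilized egg grows by mitosis."])
print(improved.examples)
```

Returning a new object rather than modifying the stored explanation keeps the shared dictionary entry unchanged while one user receives the expanded version.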
[0580] As a concrete example, let's consider its use in educational institutions. Users, i.e., students, access a specialized terminology dictionary using a terminal. The server analyzes the students' sentiments and provides supplementary information for items they don't fully understand. This improves students' learning effectiveness and creates an environment where they can acquire knowledge efficiently. This system plays a crucial role in facilitating smooth communication across language barriers in multinational classes and global companies.
[0581] The following describes the processing flow.
[0582] Step 1:
[0583] The server collects data from social networking services and email servers via APIs, and transcribes voice calls and meeting recordings using speech recognition technology. The collected data is converted to text format and stored for further analysis.
[0584] Step 2:
[0585] The server performs preprocessing on the collected data. Preprocessing removes unnecessary symbols and HTML tags, standardizes spacing, and tokenizes the data, preparing it for analysis in a clean state.
[0586] Step 3:
[0587] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. It applies algorithms such as TF-IDF and topic modeling to identify highly specific terms.
[0588] Step 4:
[0589] The server analyzes the context containing each extracted term. Using a context window, it grasps the meaning and nuance of the term from the surrounding words and phrases, and this context is added to the term's description.
[0590] Step 5:
[0591] The server uses an AI model to generate definitions of terms based on the results of contextual analysis. The generated definitions are organized in a way that allows users to intuitively understand them, including specific examples and usage examples.
[0592] Step 6:
[0593] The system uses an emotion engine to collect emotional data from user interactions. The server analyzes the user's emotional state and determines whether definitions and explanations of terms should be improved.
[0594] Step 7:
[0595] The server optimizes the contents of the technical term dictionary based on user sentiment data, adding relevant information as needed. For example, it may improve the dictionary by providing clearer examples for terms that are difficult to understand.
[0596] Step 8:
[0597] Users access a well-organized dictionary through their device to obtain the necessary information. The server updates this dictionary in real time and applies multilingual translations to support users' smooth access to information.
[0598] Step 9:
[0599] The server evaluates the user's learning progress through an emotion engine and provides customized learning modules tailored to their individual learning pace. This allows users to learn efficiently at their own pace.
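As an illustration only, Steps 2 and 3 of the above flow could be sketched in Python as follows. The function names `preprocess`, `tfidf`, and `top_terms` are hypothetical and do not appear in the disclosure. The sketch assumes whitespace-delimited text (a language such as Japanese would need a morphological tokenizer instead) and uses a plain, unsmoothed TF-IDF weighting, which is only one of the algorithms the step mentions.

```python
import html
import math
import re
from collections import Counter

TAG_RE = re.compile(r"<[^>]+>")      # HTML tags such as <p> or </div>
SYMBOL_RE = re.compile(r"[^\w\s]")   # anything that is not a word character or whitespace


def preprocess(raw: str) -> list[str]:
    """Step 2 (sketch): strip tags and symbols, normalize spacing, tokenize."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)        # turn entities such as &amp; back into characters
    text = SYMBOL_RE.sub(" ", text)
    return text.lower().split()       # split() also collapses repeated whitespace


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Step 3 (sketch): score each term of each document by tf * log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        scores.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return scores


def top_terms(doc_scores: dict[str, float], k: int = 3) -> list[str]:
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

A term that occurs in every document receives a score of zero under this weighting, so only terms specific to a subset of the collected data rise to the top of `top_terms`.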
[0600] (Example 2)
[0601] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0602] In today's information society, there is a need to efficiently retrieve necessary information from vast amounts of data and understand the definitions and contexts of specific terms. Furthermore, facilitating smooth communication between different languages and providing optimal information tailored to the user's learning progress are also challenges. To address these issues, a system that operates in real time and responds to the individual needs of users is essential.
[0603] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0604] In this invention, the server includes acquisition means for acquiring data, formatting means for formatting the acquired data, and extraction means for extracting important terms from the formatted data. This makes it possible to efficiently collect necessary information from vast amounts of data and to support users in obtaining appropriate definitions of terms and understanding their context. Furthermore, through multilingual conversion and sentiment analysis, it is possible to provide a learning environment optimized for individual users and facilitate communication.
[0605] "Acquisition means" refers to functions and processes used to gather necessary information from information sources.
[0606] "Formatting means" refers to processes and devices used to convert acquired data into a format that is easy to analyze.
[0607] "Extraction means" refers to processes or functions used to select important terms and keywords from formatted data.
[0608] "Context analysis means" refers to processes and techniques for investigating the situations and backgrounds in which extracted terms are used.
[0609] "Definition creation means" refers to functions or processes for clarifying the meaning of terms based on the analyzed context.
[0610] "Dictionary construction means" refers to processes and techniques for systematically collecting the created definitions into a glossary.
[0611] "Conversion means" refers to functions and technologies for translating a constructed dictionary into different languages.
[0612] "Emotion optimization means" refers to processes or devices that optimize the presentation of information based on the user's emotional data.
[0613] "Evaluation provision means" refers to a function that evaluates the user's learning status and progress and provides feedback based on that evaluation.
[0614] "Update means" refers to processes and techniques for adding new information to the dictionary and correcting the definitions of terms that have changed.
[0615] "Personalization means" refers to functions and technologies that adjust and customize information according to the specific needs and requests of the user.
[0616] This invention is a system that integrates data collection, formatting, and extraction of information necessary for understanding. In this system, a server plays a central role in acquiring data from various sources. APIs and speech recognition technologies can be used for data acquisition. Specifically, APIs for acquiring social media data and speech recognition services for converting speech to text can be used.
[0617] The server formats the collected data and converts it into a parseable format. This includes cleaning and standardizing the format of text data. It then extracts key terms and analyzes the context in which those terms are used.
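The context analysis mentioned here can be illustrated with a fixed-size context window around each occurrence of a term. The following is a minimal sketch under stated assumptions: the function name `context_windows` and the default window size are hypothetical, and the input is assumed to be an already tokenized list of lowercase words.

```python
def context_windows(tokens: list[str], term: str, window: int = 3):
    """Collect the words found up to `window` positions before and after each hit.

    Returns a list of (left_context, right_context) pairs, one per occurrence
    of `term` in `tokens`.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token == term:
            left = tokens[max(0, i - window):i]      # clipped at the start of the text
            right = tokens[i + 1:i + 1 + window]     # clipped at the end of the text
            hits.append((left, right))
    return hits
```

The collected neighbors could then be passed to the definition-generating model as the contextual information for the term; how that hand-off is done is not specified in the disclosure.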
[0618] Based on the results of contextual analysis, a generative AI model is used to create definitions for terms. This builds a specialized glossary of terms. The generated dictionary becomes available in multiple languages through a translation mechanism. This provides an environment that is easily accessible to users who speak different languages.
[0619] Furthermore, emotion optimization techniques can analyze user emotion data and optimize the presentation of dictionaries and information. For example, if a user is confused by a particular term, the server can help the user understand it by providing detailed explanations and additional examples.
[0620] As a concrete example, when a user accesses the system to learn a specific technical term, they enter a prompt. For instance, by providing text such as "The server explains how to collect data using an API," the system generates definitions and related information. This allows the user to efficiently acquire the necessary knowledge and apply it in practice.
[0621] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0622] Step 1:
[0623] The server retrieves data from various sources using APIs and speech recognition systems. Inputs include social media posts, emails, and call logs. The server collects this data and prepares it for subsequent formatting procedures. The output is a collection of raw data.
[0624] Step 2:
[0625] The server uses formatting tools to convert the raw data into a parseable format. The input is the raw data collected in step 1. At this stage, unnecessary characters and HTML tags are removed from the data, letter case is normalized, and extra whitespace is removed to generate clean text data. The output is cleanly formatted text data.
[0626] Step 3:
[0627] The server extracts key terms from formatted data. The input is formatted text data. Using natural language processing techniques, part-of-speech tagging is performed to extract key terms such as nouns and verbs from the text. The output is a list of the extracted key terms.
[0628] Step 4:
[0629] The server performs contextual analysis to analyze the context in which terms are used. The input is the list of key terms obtained in step 3, as well as the formatted text data. The server analyzes the context surrounding each term and generates foundational data to understand its meaning. The output is contextual information for the terms.
[0630] Step 5:
[0631] The server uses a generative AI model to create term definitions based on contextual information. The input is the contextual information obtained in step 4. The AI model is given prompt sentences, and as a result, the meanings of the terms are automatically generated. The output is a list of defined terms.
[0632] Step 6:
[0633] The server builds a term dictionary based on the created definitions and uses a multilingual translation mechanism. The input is the definitions of terms generated in step 5. The definitions are added to the dictionary and translated into multiple languages by the translation mechanism. The output is an updated multilingual term dictionary.
[0634] Step 7:
[0635] The server analyzes the user's emotional data using emotion optimization techniques and evaluates the learning progress. Inputs include the user's operation history and feedback. Specifically, repeated searches for the same term within a certain time frame are treated as emotional data indicating that the user is having difficulty with that term. The output is a learning plan optimized in consideration of the user's emotional state.
[0636] Step 8:
[0637] Users access optimized glossaries and learning modules through their devices. The server uses personalization methods to provide information tailored to the user's needs. Input consists of user queries and prompts. The server provides additional explanations and examples based on the user's context to support efficient learning. Output is the most suitable learning content for the user.
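The signal described in Step 7, repeated searches for the same term within a certain time frame, could be detected roughly as follows. This is a sketch, not the disclosed implementation: the function name `find_struggle_terms`, the 600-second window, and the threshold of three searches are assumptions chosen for illustration, and the search events are assumed to arrive sorted by timestamp.

```python
from collections import defaultdict, deque


def find_struggle_terms(events, window_seconds=600, min_repeats=3):
    """Flag terms searched at least `min_repeats` times within `window_seconds`.

    `events` is an iterable of (timestamp_in_seconds, term) pairs sorted by
    timestamp. Returns the set of flagged terms.
    """
    recent = defaultdict(deque)   # term -> timestamps still inside the window
    flagged = set()
    for timestamp, term in events:
        times = recent[term]
        times.append(timestamp)
        # Drop searches that have fallen out of the sliding window.
        while timestamp - times[0] > window_seconds:
            times.popleft()
        if len(times) >= min_repeats:
            flagged.add(term)
    return flagged
```

Terms returned by such a check could be the ones for which the server supplies additional explanations and examples in Step 8.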
[0638] (Application Example 2)
[0639] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0640] In modern online shopping, users often misunderstand terms or make ill-informed decisions because they do not understand complex financial terminology. In particular, the sheer volume and specialized nature of financial information make it difficult for users to feel confident in their decision-making. This invention aims to improve the user's purchasing experience by providing a system that assists in understanding financial terminology.
[0641] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0642] In this invention, the server includes data collection means, preprocessing means for preprocessing collected information, extraction means for extracting important terms from the preprocessed information, context analysis means for analyzing the context in which the extracted terms are used, explanation generation means for generating explanations of terms based on the results of the context analysis, dictionary construction means for constructing a specialized terminology dictionary based on the generated explanations, translation means for translating the constructed dictionary into multiple languages, and optimization means for analyzing user responses and optimizing explanations of purchase-related terms. This enables users to deepen their understanding of financial terminology in real time and make rational and confident decisions.
[0643] A "data collection means" is an element that has the function of collecting necessary data from multiple information sources.
[0644] A "preprocessing means" is an element that has the function of performing processing to convert the collected data into a format that is easy to analyze.
[0645] An "extraction means" is an element that has the function of selecting important terms from pre-processed data.
[0646] A "context analysis means" is an element that has the function of performing analysis to understand the context in which the extracted terms are used.
[0647] An "explanation generation means" is an element that has the function of generating definitions and explanations of terms based on the results of the context analysis.
[0648] A "dictionary construction means" is an element that has the function of accumulating the generated definitions of terms to form a specialized terminology dictionary.
[0649] A "translation means" is an element that has the function of converting a constructed specialized terminology dictionary into different languages.
[0650] An "optimization means" is an element that adjusts the explanation of terms based on user feedback so that, when necessary, the explanation becomes as easy to understand as possible.
[0651] This invention is implemented as a system including data collection means, preprocessing means, extraction means, context analysis means, explanation generation means, dictionary construction means, translation means, and optimization means. The server uses APIs to collect data from multiple information sources and preprocesses the collected data. The data is converted to text, and important terms are extracted using natural language processing techniques.
[0652] This system is implemented in programming languages such as Python and JavaScript, and utilizes cloud services such as Google Cloud and AWS for data processing. For natural language processing, it can use the Google Cloud Natural Language API or IBM Watson NLU. These capabilities allow the context analysis means to understand the situation in which each term is used.
[0653] The server generates definitions of terms based on the results of the context analysis. These definitions are stored as a specialized terminology dictionary. To support multiple languages, the dictionary is translated into several languages by the translation means. This process can be carried out efficiently using tools such as the Google Translate API.
[0654] When a user accesses the system through their device, the optimization means adjusts the explanations of purchase-related terms based on the user's reactions and sentiment data. This allows users to effectively acquire the knowledge necessary for their financial transactions.
[0655] For example, if a user expresses concern about the term "credit score," the server will directly provide detailed explanations and related information on the page to enhance the user's understanding. The AI model can also be instructed to generate information using the following prompt: "Please briefly explain the financial term 'XX' related to the product you are currently trying to purchase."
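The prompt quoted above contains a placeholder ('XX') for the term. As a hedged illustration, the placeholder could be filled in as follows before the prompt is passed to the AI model. The names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and the call to the model itself is deliberately omitted because the disclosure does not specify it.

```python
# The wording follows the prompt quoted in the text; only the placeholder is new.
PROMPT_TEMPLATE = (
    "Please briefly explain the financial term '{term}' related to the "
    "product you are currently trying to purchase."
)


def build_prompt(term: str) -> str:
    """Insert a term into the prompt template, rejecting empty input."""
    cleaned = " ".join(term.split())   # trim and collapse internal whitespace
    if not cleaned:
        raise ValueError("term must not be empty")
    return PROMPT_TEMPLATE.format(term=cleaned)
```

For instance, `build_prompt("credit score")` yields the quoted prompt with 'XX' replaced by 'credit score'.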
[0656] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0657] Step 1:
[0658] The server uses APIs to collect data such as user purchase history and search history from multiple sources. The input is raw data obtained from the APIs, and the output is data prepared for preprocessing. In this step, the server accesses each source and retrieves the data in the required format.
[0659] Step 2:
[0660] The server performs preprocessing to convert the collected data into a parseable format. The input is the raw data collected in step 1, and the output is data in text format. At this stage, the server filters out unnecessary information and formats the data by converting it to text format.
[0661] Step 3:
[0662] The server extracts important terms from pre-processed data using natural language processing techniques. The input is formatted text data, and the output is a list of extracted important terms. In this step, the server identifies and lists relevant keywords and phrases.
[0663] Step 4:
[0664] The server performs contextual analysis to analyze the context in which extracted terms are used. The input is a list of terms, and the output is contextual information. For each term, the server evaluates its usage and adds relevant data to create detailed contextual information.
[0665] Step 5:
[0666] The server generates definitions of terms based on the analyzed contextual information. The input is contextual information, and the output is the term and its definition. At this stage, the server utilizes a generative AI model to automatically generate accurate definitions of terms.
[0667] Step 6:
[0668] The server collects the generated term definitions to build a multilingual specialized terminology dictionary. Input is terms and their definitions, and output is updated dictionary data. The dictionary is automatically updated, and the server uses a translation API to translate the information into multiple languages.
[0669] Step 7:
[0670] When a user accesses the system using a terminal, the server optimizes the terminology explanations based on the user's actions. The input is user interaction data from the terminal, and the output is the optimized terminology explanation. The server analyzes the user's reactions, improves the explanations as needed, and presents them in an easily understandable format.
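Step 6 of the above flow, building a dictionary and translating it into multiple languages, could be structured as in the sketch below. All names (`Entry`, `Translator`, `build_dictionary`) are hypothetical. The translation back end is passed in as a plain callable so that any translation API could be plugged in; no real translation service is called here, and the stand-in translator in the usage example only tags the text.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumed interface: (text, target_language_code) -> translated text
Translator = Callable[[str, str], str]


@dataclass
class Entry:
    term: str
    definitions: dict[str, str] = field(default_factory=dict)  # language code -> definition


def build_dictionary(
    definitions: dict[str, str],
    source_lang: str,
    target_langs: list[str],
    translate: Translator,
) -> dict[str, Entry]:
    """Store each definition in its source language plus every target language."""
    dictionary = {}
    for term, definition in definitions.items():
        entry = Entry(term=term, definitions={source_lang: definition})
        for lang in target_langs:
            if lang != source_lang:          # the source text needs no translation
                entry.definitions[lang] = translate(definition, lang)
        dictionary[term] = entry
    return dictionary


# Usage with a stand-in translator that merely prefixes the language code.
demo = build_dictionary(
    {"credit score": "A number summarizing a borrower's creditworthiness."},
    source_lang="en",
    target_langs=["en", "ja"],
    translate=lambda text, lang: f"[{lang}] {text}",
)
```

Keeping the translator behind a callable makes it straightforward to swap translation services or to cache translations without changing the dictionary-building logic.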
[0671] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0672] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0673] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0674] [Fourth Embodiment]
[0675] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0676] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0677] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0678] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0679] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0680] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0681] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0682] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0683] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0684] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0685] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0686] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0687] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0688] This invention is server-centric and begins with the server collecting data. The server uses APIs to acquire text data from SNS and email servers and uses a speech recognition engine to convert audio data from calls and meetings into text. The collected data is then preprocessed on the server. Unnecessary symbols and spaces are removed, tokenization is performed, and the data is prepared in a format suitable for analysis.
[0689] Next, the server extracts key terms from the pre-processed data using natural language processing techniques. The terms extracted in this step are then analyzed to determine how they are used in each context. The information obtained through contextual analysis becomes the foundational data used by AI to clarify the meaning of these terms.
[0690] Through the analysis process, the server automatically generates definitions for each term. These generated definitions help understand the context in which and how technical terms are used. Based on these definitions, the server builds specialized terminology dictionaries for the entire organization, departments, and even individuals.
[0691] Users can access these dictionaries through their devices, and the dictionary contents are updated in real time on the server, allowing for immediate reflection of newly extracted terms and their correspondingly changed definitions. The updated dictionaries are translated into multiple languages and can be used by multinational organizations.
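The real-time update described here, in which new terms and changed definitions are reflected immediately, can be sketched as an "upsert" operation on the dictionary. The class name `LiveGlossary`, the version counter, and the case-insensitive keys are assumptions made for this illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    text: str
    version: int   # incremented each time the definition text changes


class LiveGlossary:
    """In-memory glossary whose entries can be added or revised at any time."""

    def __init__(self):
        self._entries = {}

    def upsert(self, term: str, text: str) -> bool:
        """Add a new term or replace a changed definition.

        Returns True if the glossary changed, False if the definition was
        already up to date.
        """
        key = term.casefold()                 # treat "SLA" and "sla" as one term
        current = self._entries.get(key)
        if current is not None and current.text == text:
            return False
        version = 1 if current is None else current.version + 1
        self._entries[key] = Definition(text, version)
        return True

    def lookup(self, term: str):
        """Return the current Definition, or None if the term is unknown."""
        return self._entries.get(term.casefold())
```

The boolean result lets a caller trigger follow-up work, such as retranslating an entry, only when a definition has actually changed.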
[0692] As a concrete example, consider the use of this system in an international research institution. The server collects data from online platforms used by researchers and extracts specialized DNA-related terminology. A dictionary accessible to users (researchers) on their terminals supports more effective communication by providing the generated definitions and usage contexts of these scientific terms. This process is expected to deepen understanding among researchers with different backgrounds and improve team productivity.
[0693] The following describes the processing flow.
[0694] Step 1:
[0695] The server uses APIs and web scraping to retrieve information from social media and email servers. It also employs speech recognition technology to convert audio data from phone calls and meetings into text and collects the resulting text.
[0696] Step 2:
[0697] The server performs preprocessing on the collected data. This preprocessing involves removing unnecessary tags and special characters, and splitting words and phrases through tokenization, preparing the data for analysis.
[0698] Step 3:
[0699] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. Here, the TF-IDF algorithm and topic modeling are used to identify terms based on frequency and importance.
[0700] Step 4:
[0701] The server analyzes the surrounding context of the extracted terms and performs contextual analysis to understand how they are being used. This allows for a deeper understanding of the context before and after the terms are used.
[0702] Step 5:
[0703] The server uses AI to automatically generate definitions of terms based on the results of contextual analysis. These definitions include specific meanings and relevant usage examples, providing details to aid understanding.
[0704] Step 6:
[0705] The server builds specialized terminology dictionaries tailored to each department and individual needs, based on automatically generated definitions. This allows users to access dictionaries optimized for their specific needs.
[0706] Step 7:
[0707] The server translates the specialized terminology dictionaries into multiple languages, establishing a system to support intercultural communication. This translation process is automated, enabling the organization to meet the needs of internationalization.
[0708] Step 8:
[0709] Users can access the latest dictionary information through their devices and deepen their understanding of terminology as needed. The server continuously updates the dictionary in real time based on user feedback and newly collected data.
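The dictionaries of Step 6, which are tailored to the organization, to each department, and to individuals, could be combined at lookup time as layers, with the most specific layer taking priority. The sketch below uses `collections.ChainMap` from the Python standard library. The function name `dictionary_for`, the shape of the `user` record, and the sample entries are hypothetical.

```python
from collections import ChainMap


def dictionary_for(user: dict, org: dict, departments: dict, personal: dict) -> ChainMap:
    """Return a layered view: personal entries override department entries,
    which in turn override organization-wide entries."""
    return ChainMap(
        personal.get(user["id"], {}),
        departments.get(user["department"], {}),
        org,
    )


# Illustrative data only.
org = {
    "pipeline": "A sequence of data processing stages.",
    "kpi": "Key performance indicator.",
}
departments = {"sales": {"pipeline": "The set of deals currently in progress."}}
personal = {"u1": {"kpi": "The metric reviewed in my weekly report."}}

sales_view = dictionary_for({"id": "u1", "department": "sales"}, org, departments, personal)
eng_view = dictionary_for({"id": "u2", "department": "engineering"}, org, departments, personal)
```

Because a `ChainMap` is a view rather than a copy, later updates to the underlying organization or department dictionaries are visible through views that were created earlier.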
[0710] (Example 1)
[0711] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0712] In recent years, there has been a growing need to efficiently collect large amounts of data from diverse sources, analyze it with high accuracy, and rapidly generate definitions of technical terms, providing them as dictionaries in multiple languages. However, due to data diversity and language differences, doing this in real time is difficult. In addition, there is a demand for customizable dictionaries that can meet user needs, and a system is needed to address these challenges.
[0713] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0714] In this invention, the server includes information gathering means, organizing means for organizing the collected information, and extraction means for extracting important terms from the organized information. This makes it possible to efficiently process data obtained from diverse information sources, generate definitions of specialized terms in real time, and provide a glossary that supports multiple languages.
[0715] "Information gathering means" refers to functions and technologies for obtaining data from various information sources. These means include, for example, systems that efficiently acquire text data and audio data using APIs or speech recognition technology.
[0716] "Organizing means" refers to functions that perform data cleansing and tokenization in order to convert collected raw data into an analyzable format. Through these means, unnecessary information is removed from the data, and the data is optimized for analysis.
[0717] "Extraction means" refers to a function that identifies and selects important terms and descriptions from the organized data. This means utilizes natural language processing technology and provides the basic information for subsequent analysis.
[0718] "Analysis means" refers to functions that analyze the context and usage of extracted terms to clarify their meaning. These means deepen the understanding of how terms are used.
[0719] "Definition generation means" refers to a function that automatically creates specific meanings and definitions of terms based on the results of analysis. This means uses an AI model to generate detailed definitions.
[0720] "Cataloging means" refers to the function of creating a dictionary that aggregates and systematically organizes specialized terminology based on the generated definitions. This dictionary promotes a unified understanding throughout the organization.
[0721] "Translation means" refers to the function of converting the constructed glossary into multiple languages in order to make it usable by multinational users. This means incorporates machine translation technology.
[0722] "Update means" refers to a function that updates terms and their definitions in real time based on newly acquired information. This means ensures that users always receive the latest information.
[0723] "Personalization means" refers to a function that customizes the glossary according to the user's needs and requirements, optimizing it for specific conditions. This means provides users with a dictionary that is easier to use.
[0724] This invention is a server-centered system configuration that efficiently collects and analyzes large amounts of data obtained from various information sources, rapidly generates definitions of technical terms, and then constructs and provides them as a glossary. Specific embodiments are shown below.
[0725] Server Functions
[0726] The server uses information gathering methods to collect text data from sources such as SNS platforms and mail servers via APIs. It also uses speech recognition technology to transcribe audio data, for example, from audio conferencing systems. This process utilizes speech recognition engines such as Google Speech-to-Text and IBM Watson.
[0727] The server then uses processing tools to prepare the collected data through cleansing and tokenization. This removes unnecessary symbols and whitespace, and converts the data into a parseable format using natural language processing libraries (such as NLTK or spaCy).
[0728] Subsequently, the server uses extraction methods to extract important terms from the prepared data. In this step, useful information is extracted using TF-IDF and word embedding techniques (e.g., Word2Vec, BERT).
[0729] The server then uses analytical tools to analyze the context of the extracted terms. This clearly determines the meaning of each term and generates definitions for the terms using generative AI models (e.g., OpenAI GPT, BERT).
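The context analysis and definition generation described above could, for example, be prepared as in the following sketch, which gathers the words surrounding each occurrence of a term and assembles them into a prompt. The window width, the prompt wording, and the function names are hypothetical; the call to the generative AI model itself is intentionally omitted.

```python
def context_windows(tokens: list[str], term: str, width: int = 3) -> list[list[str]]:
    """Return the tokens surrounding each occurrence of `term`."""
    windows = []
    for i, tok in enumerate(tokens):
        if tok == term:
            start = max(0, i - width)
            windows.append(tokens[start:i + width + 1])
    return windows


def build_definition_prompt(term: str, windows: list[list[str]]) -> str:
    """Assemble a prompt for a generative model (the wording is hypothetical)."""
    excerpts = "\n".join(f"- {' '.join(w)}" for w in windows)
    return (
        f"Define the term '{term}' as it is used in the excerpts below.\n"
        f"Excerpts:\n{excerpts}\n"
        "Answer in one or two sentences."
    )


tokens = "we ran pcr on the sample before pcr cleanup".split()
windows = context_windows(tokens, "pcr", width=2)
prompt = build_definition_prompt("pcr", windows)
```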
[0730] How Users Use the System
[0731] Users access a glossary built by the server using their terminals. Users can view the definitions and usage contexts of generated technical terms in real time. Furthermore, the server uses update mechanisms to constantly update the dictionary with the latest information, and multilingual support allows users to access materials in multiple languages.
[0732] Examples of specific cases and prompt statements
[0733] As a concrete example, consider the case where an international research institution uses this system. The server collects data, for example, gene analysis data, from online platforms used by researchers, and extracts specialized DNA-related terminology from it.
[0734] Example of a prompt:
[0735] "Collect and generate definitions for DNA-related terminology used by international research teams on online platforms. Provide examples of how this generated dictionary facilitates communication among researchers with diverse backgrounds."
[0736] This allows users to communicate effectively with researchers from diverse backgrounds without encountering language barriers.
[0737] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0738] Step 1:
[0739] The server acquires data from various sources using information gathering methods. These sources include social networking services (SNS), email servers, and online meeting systems. Specifically, it collects text data using APIs and converts audio data into text using a speech recognition engine. The input is audio or text data, and the output is transcribed text data.
[0740] Step 2:
[0741] The server preprocesses the collected data using pre-processing tools. Specifically, it cleanses the data, removing unnecessary symbols and spaces. It also tokenizes the data using a natural language processing library, preparing it for parsing. The input is the transcribed text data from Step 1, and the output is pre-processed, parsing-ready data.
[0742] Step 3:
[0743] The server extracts important terms from the prepared data using extraction methods. Here, TF-IDF and word embedding techniques are used to identify weighted terms within the document. The input is the prepared data, and the output is a list of important terms.
[0744] Step 4:
[0745] The server analyzes the context of extracted terms using analytical tools. This includes contextual analysis to understand how the terms are used in a sentence. The input is a list of important terms, and the output is contextual information for each term.
[0746] Step 5:
[0747] The server generates term definitions based on the results of contextual analysis using a definition generation mechanism. It utilizes a generative AI model to create specific term definitions. The input is contextual information, and the output is a list of term definitions.
[0748] Step 6:
[0749] The server uses a cataloging mechanism to build a glossary based on the generated definitions. The glossary is created to promote a consistent understanding of terminology within the organization. The input is a list of term definitions, and the output is a glossary usable throughout the organization.
[0750] Step 7:
[0751] The server uses translation tools to translate the constructed glossary into multiple languages. This makes the glossary available in an international environment. The input is a unified glossary, and the output is a multilingual glossary.
[0752] Step 8:
[0753] Users can access the glossary provided by the server using their terminals, customize the dictionary in real time, and check for updates. The update mechanism enables real-time information updates, and the personalization mechanism allows users to build a dictionary tailored to their specific needs. Input consists of user search queries and customization requests, while output consists of the latest term definitions and contextual information.
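As a non-limiting sketch of Steps 6 to 8, a glossary that holds definitions per language, can be updated at any time, and can be narrowed to a department might look like the following. The class names `Entry` and `Glossary`, the department tags, and the English fallback are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Entry:
    term: str
    definitions: dict[str, str]                      # language code -> definition
    departments: set[str] = field(default_factory=set)
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Glossary:
    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def upsert(self, term: str, lang: str, definition: str,
               department: Optional[str] = None) -> None:
        """Add or refresh a definition (corresponds to the update mechanism)."""
        entry = self._entries.setdefault(term, Entry(term, {}))
        entry.definitions[lang] = definition
        if department:
            entry.departments.add(department)
        entry.updated_at = datetime.now(timezone.utc)

    def lookup(self, term: str, lang: str, fallback: str = "en") -> Optional[str]:
        """Return the definition in `lang`, else in the fallback language."""
        entry = self._entries.get(term)
        if entry is None:
            return None
        return entry.definitions.get(lang, entry.definitions.get(fallback))

    def view_for(self, department: str) -> list[str]:
        """List the terms tagged for one department (personalization)."""
        return sorted(t for t, e in self._entries.items() if department in e.departments)


glossary = Glossary()
glossary.upsert("SLA", "en", "Service level agreement.", department="ops")
glossary.upsert("SLA", "ja", "サービスレベル合意。")
glossary.upsert("CAC", "en", "Customer acquisition cost.", department="sales")
```

Keeping every language under a single entry is one way to keep a term consistent across languages when its definition is refreshed.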
[0754] (Application Example 1)
[0755] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0756] In autonomous vehicles, there is a need to provide diverse technical information in a real-time, easily understandable format. Conventional systems have struggled to effectively analyze the vast amounts of data collected by sensors and communications and present it in a way that is easy for passengers to understand. As a result, important information for drivers and passengers is not available in a timely manner, leading to challenges in fully realizing the safety and convenience of the vehicle.
[0757] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0758] In this invention, the server includes a data acquisition device, a preprocessing device for preprocessing the collected data, and an extraction device for extracting important descriptions from the preprocessed data. This makes it possible to efficiently analyze information obtained from sensors and present it to drivers and passengers in an easy-to-understand manner.
[0759] A "data acquisition device" is a hardware or software configuration for automatically acquiring necessary data from an information source.
[0760] A "data preprocessing device" is hardware or software that performs data cleaning and formatting to prepare acquired data into a format suitable for analysis.
[0761] An "extraction device" is an algorithm or program used to identify and extract important information or descriptions from pre-processed data.
[0762] A "contextual analysis device" is a device that analyzes the situation and background in which extracted information is used, in order to deepen the understanding of that information.
[0763] A "definition generator" is a system that automatically defines the meaning and usage of extracted information and terms based on the results of contextual analysis.
[0764] A "dictionary building device" is a device that creates a dictionary by aggregating specialized terminology and related information based on the generated definitions.
[0765] A "translation device" is a system or function that converts a constructed dictionary into multiple languages in real time.
[0766] An "information presentation device" is a device or system that provides analyzed and processed data to the user visually or audibly.
[0767] An "update device" is hardware or software that updates related systems in real time whenever new information or definitions are added, providing users with the latest information.
[0768] A "customization device" is a device that has the function of adjusting the dictionary of specialized terms and the method of presenting information according to the user's needs and requests.
[0769] The system implementing this invention comprises multiple hardware and software components. The server acquires data from various sources through a data acquisition device. Specifically, APIs are used to collect text data, and a speech recognition engine is employed to convert speech data into text data. The preprocessor then cleans the collected data and converts it into an analyzable format.
[0770] Next, the server uses an extraction device to identify important descriptions from the pre-processed data and processes the newly obtained information in real time. A context analysis device further analyzes the context in which these descriptions are used, and based on the results, a definition generator creates definitions for technical terms.
[0771] The generated definitions are aggregated by a dictionary building device and translated into multiple languages by a translation device that enables multilingual support. In particular, in the context of autonomous vehicles, the analysis results are presented to drivers and passengers in an easily understandable format using in-vehicle information display devices. These information display devices provide information via displays and audio systems.
[0772] The update device updates dictionaries and definitions whenever the system receives new data, ensuring that it always provides the latest information. Furthermore, the customization device allows users to adjust the type and format of information displayed according to their own needs.
[0773] For example, if the system acquires data from weather sensors and detects that it is raining on a highway, it can present passengers with important information such as "slippery road surface." By accurately conveying necessary information in real time in this way, it is possible to enhance the safety and comfort of autonomous vehicles.
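The rain example above might be sketched as follows. The rainfall threshold, the field names of `WeatherReading`, and the message table are hypothetical values chosen for illustration; the source does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WeatherReading:
    rain_mm_per_h: float      # rainfall intensity reported by the weather sensor
    road_type: str            # e.g. "highway" or "city"


# Hypothetical threshold and messages; real values would come from the system.
RAIN_THRESHOLD_MM_PER_H = 0.5
NOTICES = {
    "en": "Slippery road surface",
    "ja": "路面が滑りやすくなっています",
}


def passenger_notice(reading: WeatherReading, lang: str = "en") -> Optional[str]:
    """Return a notice when rain is detected on a highway, else None."""
    raining = reading.rain_mm_per_h >= RAIN_THRESHOLD_MM_PER_H
    if reading.road_type == "highway" and raining:
        return NOTICES.get(lang, NOTICES["en"])   # fall back to English
    return None
```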
[0774] An example of a prompt is, "Display important driving information in real time based on data acquired from sensors." This prompt is used as an instruction to provide appropriate information using a generative AI model.
[0775] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0776] Step 1:
[0777] The server acquires data from various sources using data collection devices. Inputs include sensor data and social network data. It collects necessary information from this data, producing a raw dataset as output. Specifically, it retrieves data by calling APIs.
[0778] Step 2:
[0779] The server formats the acquired raw data using a preprocessor. The input is the raw data collected in step 1. Whitespace and symbols are removed, and the formatted data is output. Specifically, it performs routine data cleaning tasks.
[0780] Step 3:
[0781] The server converts pre-processed data into meaningful descriptions using an extraction device. The input is pre-processed data. It analyzes the information using natural language processing and outputs it as meaningful tokens. Specifically, it performs text analysis and keyword extraction.
[0782] Step 4:
[0783] The server analyzes the context in which the extracted description is used, employing a context analysis device. The input is the token from step 3. It understands the context of the description and outputs that information. Specifically, it applies a context analysis algorithm.
[0784] Step 5:
[0785] The server uses a definition generator to create definitions of technical terms from the analyzed information. The input is the result of contextual analysis. The generated definitions of technical terms are output. Specifically, it executes an AI-based definition generation process.
[0786] Step 6:
[0787] The server aggregates the generated definitions using a dictionary building device and translates them to produce a multilingual dictionary. The input consists of definitions of technical terms. The dictionary-formatted data is output in multiple languages. Specifically, it calls existing translation APIs.
[0788] Step 7:
[0789] The server uses an information display device to show information to passengers and drivers via the in-vehicle information display system. The input is dictionary data translated into multiple languages. Information is conveyed to passengers and drivers visually or audibly. Specifically, the server presents information in a format suitable for the vehicle's display.
[0790] Step 8:
[0791] The server uses an update device to update the dictionary and information definitions whenever new data is obtained. Inputs include feedback from sensors and new data. The latest information is output and provided to the user. Specifically, it performs periodic data collection and dictionary reconstruction.
[0792] Step 9:
[0793] The user adjusts the type and format of information displayed using a customization device. The input is the user's settings. Customized information is output according to the settings. Specifically, it performs system setting updates and information adjustments.
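As a non-limiting sketch of the customization in Step 9, the user's settings could select which categories of information are presented, in which language, and through which channel. The setting fields, category names, and output prefixes below are assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DisplaySettings:
    categories: set[str] = field(default_factory=lambda: {"safety"})
    channel: str = "display"          # "display" or "audio"
    lang: str = "en"


@dataclass
class InfoItem:
    category: str
    text: dict[str, str]              # language code -> message text


def render(items: list[InfoItem], settings: DisplaySettings) -> list[str]:
    """Keep only the categories the user asked for and format them."""
    prefix = "[SPEAK] " if settings.channel == "audio" else "[SHOW] "
    output = []
    for item in items:
        if item.category not in settings.categories:
            continue
        text = item.text.get(settings.lang, item.text.get("en", ""))
        output.append(prefix + text)
    return output


items = [
    InfoItem("safety", {"en": "Slippery road surface"}),
    InfoItem("route", {"en": "Arrival in 12 minutes"}),
]
```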
[0794] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0795] This invention is a system that integrates data collection, analysis, dictionary building, and user sentiment analysis, with each element working in conjunction to achieve effective knowledge management. First, the server collects information from various sources using APIs and speech recognition. Text data obtained from SNS, email, and call records is preprocessed to a format suitable for analysis.
[0796] Next, the server extracts key terms and utilizes natural language processing techniques to understand their context. This contextual information forms the basis for explaining the meaning of the terms. An AI model automatically generates definitions for the terms and builds a specialized glossary based on them. The dictionary is updated in real time and supports multiple languages using translation tools. Throughout this process, the system remains easily accessible to users via their devices.
[0797] Furthermore, the integration of an emotion engine makes it possible to analyze emotion data from user interactions. The server optimizes the dictionary content based on the user's emotions. For example, if a user expresses a negative emotion indicating a lack of understanding of a term's definition, the server responds by adding examples of related terms or deepening the explanation. Emotion data can also be used to evaluate learning progress. This allows for the provision of customized learning modules to users, meeting their individual learning needs.
[0798] As a concrete example, let's consider its use in educational institutions. Users, i.e., students, access a specialized terminology dictionary using a terminal. The server analyzes the students' sentiments and provides supplementary information for items they don't fully understand. This improves students' learning effectiveness and creates an environment where they can acquire knowledge efficiently. This system plays a crucial role in facilitating smooth communication across language barriers in multinational classes and global companies.
[0799] The following describes the processing flow.
[0800] Step 1:
[0801] The server collects data from social networking services and email servers via APIs, and transcribes voice calls and meeting recordings using speech recognition technology. The collected data is converted to text format and stored for further analysis.
[0802] Step 2:
[0803] The server performs preprocessing on the collected data. Preprocessing removes unnecessary symbols and HTML tags, standardizes spacing, and tokenizes the data, preparing it for analysis in a clean state.
[0804] Step 3:
[0805] The server extracts important terms and phrases from pre-processed data using natural language processing techniques. It applies algorithms such as TF-IDF and topic modeling to identify highly specific terms.
[0806] Step 4:
[0807] The server analyzes the context containing the extracted term. Using a context window, it grasps the meaning and nuance of the term from the surrounding words and phrases, thereby adding context to the description.
[0808] Step 5:
[0809] The server uses an AI model to generate definitions of terms based on the results of contextual analysis. The generated definitions are organized in a way that allows users to intuitively understand them, including specific examples and usage examples.
[0810] Step 6:
[0811] The system uses an emotion engine to collect emotional data from user interactions. The server analyzes the user's emotional state and determines whether definitions and explanations of terms should be improved.
[0812] Step 7:
[0813] The server optimizes the contents of the technical term dictionary based on user sentiment data, adding relevant information as needed. For example, it may add clearer examples to the entries of terms that are difficult to understand.
[0814] Step 8:
[0815] Users access a well-organized dictionary through their device to obtain the necessary information. The server updates this dictionary in real time and applies multilingual translations to support users' smooth access to information.
[0816] Step 9:
[0817] The server evaluates the user's learning progress through an emotion engine and provides customized learning modules tailored to their individual learning pace. This allows users to learn efficiently at their own pace.
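Steps 6 and 7 of the flow above could be sketched, under stated assumptions, as a tally of emotion labels per term that flags the terms whose entries may need clearer examples. The emotion labels, the minimum number of events, and the ratio threshold are all hypothetical; the output of the emotion engine is simply taken as given.

```python
from collections import defaultdict
from typing import Iterable

# Hypothetical labels that the emotion engine is assumed to emit.
NEGATIVE_EMOTIONS = {"confusion", "frustration"}


def terms_needing_examples(
    interactions: Iterable[tuple[str, str]],
    min_events: int = 3,
    threshold: float = 0.5,
) -> list[str]:
    """Flag terms whose share of negative reactions reaches the threshold.

    `interactions` yields (term, emotion_label) pairs.
    """
    totals: dict[str, int] = defaultdict(int)
    negatives: dict[str, int] = defaultdict(int)
    for term, emotion in interactions:
        totals[term] += 1
        if emotion in NEGATIVE_EMOTIONS:
            negatives[term] += 1
    return sorted(
        term for term, n in totals.items()
        if n >= min_events and negatives[term] / n >= threshold
    )


log = [
    ("amortization", "confusion"),
    ("amortization", "confusion"),
    ("amortization", "neutral"),
    ("invoice", "neutral"),
    ("invoice", "neutral"),
    ("invoice", "confusion"),
    ("escrow", "confusion"),
]
flagged = terms_needing_examples(log)
```

The minimum-event guard keeps a single negative reaction from triggering a rewrite of an entry.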
[0818] (Example 2)
[0819] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0820] In today's information society, there is a need to efficiently retrieve necessary information from vast amounts of data and understand the definitions and contexts of specific terms. Furthermore, facilitating smooth communication between different languages and providing optimal information tailored to the user's learning progress are also challenges. To address these issues, a system that operates in real time and responds to the individual needs of users is essential.
[0821] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0822] In this invention, the server includes acquisition means for acquiring data, formatting means, and extraction means for extracting important terms. This makes it possible to efficiently collect necessary information from vast amounts of data and to help users obtain appropriate term definitions and understand the context of those terms. Furthermore, through multilingual conversion and sentiment analysis, it is possible to provide a learning environment optimized for individual users and facilitate communication.
[0823] "Means of acquiring data" refers to the functions and processes used to gather necessary information from information sources.
[0824] "Formatting methods" refer to processes and devices used to convert acquired data into a format that is easy to analyze.
[0825] "Extraction method" refers to the process or function used to select important terms and keywords from formatted data.
[0826] "Contextual analysis methods" refer to processes and techniques for investigating the situations and backgrounds in which extracted terms are used.
[0827] "Means of definition creation" refers to functions or processes for clarifying the meaning of terms based on the analyzed context.
[0828] "Dictionary construction methods" refer to the processes and techniques for systematically collecting created definitions to create a glossary.
[0829] "Conversion means" refers to functions and technologies for translating and converting a constructed dictionary into a different language.
[0830] "Emotional optimization means" refers to processes or devices that optimize the presentation of information based on the user's emotional data.
[0831] "Evaluation provision means" refers to a function that evaluates the user's learning status and progress and provides feedback based on that evaluation.
[0832] "Update methods" refer to the processes and techniques for adding new information to a dictionary and correcting the definitions of terms that have changed.
[0833] "Personalization methods" refer to functions and technologies that adjust and customize information according to the specific needs and requests of the user.
[0834] This invention is a system that integrates data collection, formatting, and extraction of information necessary for understanding. In this system, a server plays a central role in acquiring data from various sources. APIs and speech recognition technologies can be used for data acquisition. Specifically, APIs for acquiring social media data and speech recognition services for converting speech to text can be used.
[0835] The server formats the collected data and converts it into a parseable format. This includes cleaning and standardizing the format of text data. It then extracts key terms and analyzes the context in which those terms are used.
[0836] Based on the results of contextual analysis, a generative AI model is used to create definitions for terms. This builds a specialized glossary of terms. The generated dictionary becomes available in multiple languages through a translation mechanism. This provides an environment that is easily accessible to users who speak different languages.
[0837] Furthermore, emotion optimization techniques can analyze user emotion data and optimize the presentation of dictionaries and information. For example, if a user is confused by a particular term, the server can help the user understand it by providing detailed explanations and additional examples.
[0838] As a concrete example, when a user accesses the system to learn a specific technical term, the user enters a prompt. For instance, when the user enters text such as "The server explains how to collect data using an API," the system generates definitions and related information. This allows the user to efficiently acquire the necessary knowledge and apply it in practice.
[0839] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0840] Step 1:
[0841] The server retrieves data from various sources using APIs and speech recognition systems. Inputs include social media posts, emails, and call logs. The server collects this data and prepares it for subsequent formatting procedures. The output is a collection of raw data.
[0842] Step 2:
[0843] The server uses formatting tools to format the raw data into a parseable format. The input is the raw data collected in step 1. At this stage, unnecessary characters and HTML tags are removed from the data, letter case is normalized, and extra whitespace is removed to generate clean text data. The output is neatly formatted text data.
[0844] Step 3:
[0845] The server extracts key terms from formatted data. The input is formatted text data. Using natural language processing techniques, part-of-speech tagging is performed to extract key terms such as nouns and verbs from the text. The output is a list of the extracted key terms.
[0846] Step 4:
[0847] The server performs contextual analysis to analyze the context in which terms are used. The input is the list of key terms obtained in step 3, as well as the formatted text data. The server analyzes the context surrounding each term and generates foundational data to understand its meaning. The output is contextual information for the terms.
[0848] Step 5:
[0849] The server uses a generative AI model to create term definitions based on contextual information. The input is the contextual information obtained in step 4. The AI model is given prompt sentences, and as a result, the meanings of the terms are automatically generated. The output is a list of defined terms.
[0850] Step 6:
[0851] The server builds a term dictionary based on the created definitions and uses a multilingual translation mechanism. The input is the definitions of terms generated in step 5. The definitions are added to the dictionary and translated into multiple languages by the translation mechanism. The output is an updated multilingual term dictionary.
[0852] Step 7:
[0853] The server analyzes the user's emotional data using emotion optimization techniques and evaluates the learning progress. Inputs include the user's operation history and feedback. Specifically, repeated searches for the same term within a certain time frame are treated as emotional data that indicates where the user is having difficulty. The output is a learning plan optimized in consideration of the user's emotional state.
[0854] Step 8:
[0855] Users access optimized glossaries and learning modules through their devices. The server uses personalization methods to provide information tailored to the user's needs. Input consists of user queries and prompts. The server provides additional explanations and examples based on the user's context to support efficient learning. Output is the most suitable learning content for the user.
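The signal described in Step 7, repeated searches for the same term within a certain time frame, might be detected with a sliding window as in the following sketch. The window length, the repeat count, and the event format are hypothetical parameters chosen for illustration.

```python
from collections import defaultdict, deque
from typing import Iterable


def repeated_search_terms(
    events: Iterable[tuple[float, str]],
    window_s: float = 300.0,
    min_repeats: int = 3,
) -> set[str]:
    """Return terms searched at least `min_repeats` times within `window_s` seconds.

    `events` yields (timestamp_in_seconds, term) pairs in chronological order.
    """
    recent: dict[str, deque] = defaultdict(deque)
    flagged: set[str] = set()
    for ts, term in events:
        timestamps = recent[term]
        timestamps.append(ts)
        # Drop searches that have fallen out of the time window.
        while timestamps and ts - timestamps[0] > window_s:
            timestamps.popleft()
        if len(timestamps) >= min_repeats:
            flagged.add(term)
    return flagged


history = [
    (0.0, "webhook"),
    (60.0, "webhook"),
    (100.0, "latency"),
    (200.0, "webhook"),
    (1000.0, "latency"),
    (1100.0, "latency"),
]
struggling = repeated_search_terms(history)
```

Here "latency" is searched three times overall but never three times within the same five-minute window, so only "webhook" is flagged.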
[0856] (Application Example 2)
[0857] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0858] In modern online shopping, users often misunderstand information or make inaccurate decisions because they do not understand complex financial terminology. In particular, the sheer volume and specialized nature of financial information make it difficult for users to feel confident in their decision-making. This invention aims to improve the user's purchasing experience and provide a system that assists in understanding financial terminology.
[0859] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0860] In this invention, the server includes data collection means, preprocessing means for preprocessing collected information, extraction means for extracting important terms from the preprocessed information, context analysis means for analyzing the context in which the extracted terms are used, explanation generation means for generating explanations of terms based on the results of the context analysis, dictionary construction means for constructing a specialized terminology dictionary based on the generated explanations, translation means for translating the constructed dictionary into multiple languages, and optimization means for analyzing user responses and optimizing explanations of purchase-related terms. This enables users to deepen their understanding of financial terminology in real time and make rational and confident decisions.
[0861] A "data collection means" is an element that has the function of collecting necessary data from multiple information sources.
[0862] A "preprocessing means" is an element that has the function of performing processing to convert the collected data into a format that is easy to analyze.
[0863] An "extraction means" is an element that has the function of selecting important terms from pre-processed data.
[0864] A "situational analysis tool" is an element that has the function of performing analysis to understand the context in which the extracted terms are used.
[0865] An "explanation generation means" is an element that has the function of generating definitions and explanations of terms based on the results of situational analysis.
[0866] A "dictionary construction means" is an element that has the function of accumulating the definitions of generated terms to form a specialized terminology dictionary.
[0867] A "translation tool" is an element that has the function of converting a constructed specialized terminology dictionary into a different language.
[0868] An "optimization tool" is an element that adjusts the explanation of terms based on user feedback so that, when necessary, the explanation is made as easy to understand as possible.
[0869] This invention is implemented as a system including data collection means, preprocessing means, extraction means, situation analysis means, explanation generation means, dictionary construction means, translation means, and optimization means. The server uses an API to collect data from multiple information sources and performs preprocessing. The collected data is formatted into text, and important terms are extracted using natural language processing techniques.
[0870] This system is implemented in programming languages such as Python and JavaScript, and utilizes cloud services such as Google Cloud and AWS for data processing. For natural language processing, it can use the Google Cloud Natural Language API or IBM Watson NLU. These capabilities allow the contextual analysis tool to understand the situation in which each term is used.
[0871] The server generates definitions of terms based on the results of the situation analysis. These definitions are stored as a specialized terminology dictionary. To support multiple languages, the dictionary is translated into several languages using translation tools. This process is efficiently carried out using tools such as the Google Translate API.
[0872] When a user accesses the system through their device, optimization measures adjust the explanations of purchase-related terms based on the user's reactions and sentiment data. This allows users to effectively acquire the knowledge necessary for their financial transactions.
[0873] For example, if a user expresses concern about the term "credit score," the server will directly provide detailed explanations and related information on the page to enhance the user's understanding. The AI model can also be instructed to generate information using the following prompt: "Please briefly explain the financial term 'XX' related to the product you are currently trying to purchase."
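As a minimal sketch, the prompt quoted above could be filled in for a specific term as follows. The template text is taken from the example prompt, with 'XX' replaced by a placeholder; the function name `build_prompt` and the optional product suffix are assumptions added for illustration.

```python
from typing import Optional

PROMPT_TEMPLATE = (
    "Please briefly explain the financial term '{term}' related to the "
    "product you are currently trying to purchase."
)


def build_prompt(term: str, product: Optional[str] = None) -> str:
    """Fill the term into the template; the product suffix is hypothetical."""
    prompt = PROMPT_TEMPLATE.format(term=term)
    if product:
        prompt += f" Product: {product}."
    return prompt


prompt = build_prompt("credit score")
```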
[0874] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0875] Step 1:
[0876] The server uses APIs to collect data such as user purchase history and search history from multiple sources. The input is raw data obtained from the APIs, and the output is data prepared for preprocessing. In this step, the server accesses each source and retrieves the data in the required format.
[0877] Step 2:
[0878] The server performs preprocessing to convert the collected data into a parseable format. The input is the raw data collected in step 1, and the output is data in text format. At this stage, the server filters out unnecessary information and formats the data by converting it to text format.
[0879] Step 3:
[0880] The server extracts important terms from pre-processed data using natural language processing techniques. The input is formatted text data, and the output is a list of extracted important terms. In this step, the server identifies and lists relevant keywords and phrases.
[0881] Step 4:
[0882] The server performs contextual analysis to analyze the context in which extracted terms are used. The input is a list of terms, and the output is contextual information. For each term, the server evaluates its usage and adds relevant data to create detailed contextual information.
[0883] Step 5:
[0884] The server generates definitions of terms based on the analyzed contextual information. The input is contextual information, and the output is the term and its definition. At this stage, the server utilizes a generative AI model to automatically generate accurate definitions of terms.
[0885] Step 6:
[0886] The server collects the generated term definitions to build a multilingual specialized terminology dictionary. Input is terms and their definitions, and output is updated dictionary data. The dictionary is automatically updated, and the server uses a translation API to translate the information into multiple languages.
[0887] Step 7:
[0888] When a user accesses the system using a terminal, the server optimizes the terminology explanations based on the user's actions. The input is user interaction data from the terminal, and the output is the optimized terminology explanation. The server analyzes the user's reactions, improves the explanations as needed, and presents them in an easily understandable format.
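Step 6 of the flow above, building the dictionary and translating it into multiple languages, might be sketched as follows. The translation function is passed in as a parameter so that no particular translation API is assumed; `fake_translate` is only a stand-in that tags the text with the target language and performs no real translation.

```python
from typing import Callable

# (text, target language code) -> translated text
Translator = Callable[[str, str], str]


def build_multilingual(
    definitions: dict[str, str],
    targets: list[str],
    translate: Translator,
    source_lang: str = "en",
) -> dict[str, dict[str, str]]:
    """Return {term: {language code: definition}} for every target language."""
    dictionary: dict[str, dict[str, str]] = {}
    for term, definition in definitions.items():
        by_lang = {source_lang: definition}
        for lang in targets:
            if lang != source_lang:
                by_lang[lang] = translate(definition, lang)
        dictionary[term] = by_lang
    return dictionary


def fake_translate(text: str, lang: str) -> str:
    """Stand-in for a translation service; it does not translate anything."""
    return f"[{lang}] {text}"


definitions = {"credit score": "A number summarizing a borrower's creditworthiness."}
dictionary = build_multilingual(definitions, ["en", "ja", "de"], fake_translate)
```

Injecting the translator keeps the dictionary-building logic testable without network access, and a real translation client could be substituted without changing it.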
[0889] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0890] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
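The pairing of a prompt with inference data described above can be illustrated with a small data structure. This is a hedged sketch only: `InferenceRequest`, `build_definition_request`, and the prompt wording are hypothetical names and text introduced for illustration, and no real model API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceRequest:
    """Hypothetical container for what is passed to the data generation model."""
    prompt: str             # the instructions
    data: str               # the inference data (text in this sketch)
    modality: str = "text"  # could also be "audio" or "image"


def build_definition_request(term: str, sentences: list[str]) -> InferenceRequest:
    """Combine an instruction with usage examples for a single term."""
    prompt = (
        f"Define the term '{term}' in one sentence, "
        "using only the usage examples provided."
    )
    data = "\n".join(f"- {s}" for s in sentences)
    return InferenceRequest(prompt=prompt, data=data)


request = build_definition_request("KPI", ["Each KPI has an owner."])
```

Keeping the instruction and the inference data in separate fields mirrors the distinction the paragraph draws between the prompt and the data on which inference is performed.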
[0891] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0892] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0893] Figure 9 shows an emotion map 400 on which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. More primitive emotions are located closer to the center of the concentric circles, and emotions representing states and actions arising from mental states are located toward the outside. Here, emotion is a concept that includes feelings and mental states. Emotions that are generally generated from reactions occurring in the brain are located on the left side of the concentric circles, and emotions that are generally induced by situational judgment are located on the right side. Emotions that are generally generated from reactions occurring in the brain and also induced by situational judgment are located in the upper and lower parts of the concentric circles. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0894] These emotions are distributed around the 3 o'clock position on the emotion map 400 and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0895] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the closer an emotion is to the outside of the emotion map 400, the more visible it becomes (that is, the more it is expressed in actions).
[0896] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
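The idea in paragraph [0896] that pleasure and discomfort follow from how far balances such as posture and battery level are from their ideal can be sketched numerically. The following is an illustration under assumptions, not a formula from the disclosure: the function name `valence`, the tolerance values, and the linear scoring rule are all hypothetical.

```python
def valence(
    readings: dict[str, float],
    ideals: dict[str, float],
    tolerances: dict[str, float],
) -> float:
    """Return a score in [-1, 1]: positive near the ideal, negative far from it."""
    scores = []
    for name, ideal in ideals.items():
        # Deviation measured in units of the tolerated range for this balance.
        deviation = abs(readings[name] - ideal) / tolerances[name]
        # 1 at the ideal, 0 at one tolerance, -1 at two tolerances or beyond.
        scores.append(1.0 - min(deviation, 2.0))
    return sum(scores) / len(scores)


# Hypothetical balances for a robot: battery level (0 to 1) and tilt in degrees.
ideals = {"battery": 1.0, "tilt_deg": 0.0}
tolerances = {"battery": 0.5, "tilt_deg": 10.0}

at_ideal = valence({"battery": 1.0, "tilt_deg": 0.0}, ideals, tolerances)
far_off = valence({"battery": 0.0, "tilt_deg": 20.0}, ideals, tolerances)
```

With these example values, a robot that is fully charged and upright receives the highest score, and one with an empty battery and a 20-degree tilt receives the lowest.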
[0897] The emotion map defines two emotions that promote learning. One is a negative emotion around the midpoint between "repentance" and "reflection" on the situation side. In other words, it arises when the robot experiences negative feelings such as "I never want to feel this way again" or "I don't want to be scolded again." The other is a positive emotion around "desire" on the response side. In other words, it arises when the robot has positive feelings such as "I want more" or "I want to know more."
[0898] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
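The two ideas in paragraph [0898], choosing an emotion from the emotion values and training so that nearby emotions have similar values, can be sketched as follows. This is an assumption-laden illustration: the map coordinates, the function names, and the use of a distance-weighted squared-difference penalty are hypothetical, and the source does not state how the similarity constraint is actually imposed.

```python
import math

# Hypothetical 2-D positions on the emotion map (not taken from the source).
POSITIONS = {
    "reassured": (1.0, 0.2),
    "calm": (1.1, 0.0),
    "confident": (1.0, -0.2),
    "anxious": (-1.0, -0.8),
}


def identify_emotion(values: dict[str, float]) -> str:
    """Pick the emotion with the highest emotion value."""
    return max(values, key=values.get)


def proximity_penalty(
    values: dict[str, float],
    positions: dict[str, tuple[float, float]] = POSITIONS,
) -> float:
    """Penalize value differences between emotions that lie close together.

    One possible regularizer: each pair's squared value difference is
    weighted by exp(-distance), so nearby pairs count the most.
    """
    names = sorted(values)
    total = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = math.exp(-math.dist(positions[a], positions[b]))
            total += weight * (values[a] - values[b]) ** 2
    return total


# Emotion values such as a trained network might output (made-up numbers).
smooth = {"reassured": 0.8, "calm": 0.9, "confident": 0.8, "anxious": 0.1}
rough = {"reassured": 0.9, "calm": 0.1, "confident": 0.9, "anxious": 0.1}
```

Adding such a penalty to a training loss would favor outputs like `smooth`, in which "reassured," "calm," and "confident" receive similar values, over outputs like `rough`.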
[0899] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0900] In the above embodiment, an example was given in which the specific processing is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and the specific processing may be distributed across multiple computers, including the computer 22. For example, the data generation model 58 may be provided in a device external to the data processing device 12, and the external device may generate data according to the input data.
[0901] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.
[0902] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0903] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0904] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, which are processors with circuit configurations designed specifically to perform specific processing, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), and ASICs (Application Specific Integrated Circuits). Every processor has built-in or connected memory and performs specific processing by using that memory.
[0905] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0906] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0907] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0908] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as doing so does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that are not required to enable implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0909] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0910] The following is further disclosed regarding the embodiments described above.
[0911] (Claim 1)
[0912] Data collection means,
[0913] Preprocessing means for preprocessing collected data,
[0914] An extraction means for extracting important descriptions from preprocessed data,
[0915] A contextual analysis means for analyzing the context in which the extracted descriptions are used,
[0916] A definition generation means that generates a definition of a description based on the results of contextual analysis,
[0917] A dictionary building means for constructing a specialized terminology dictionary based on the generated definitions,
[0918] A system that includes a translation means for translating the constructed dictionary into multiple languages.
[0919] (Claim 2)
[0920] The system according to claim 1, further comprising update means for updating a new description or definition thereof in real time.
[0921] (Claim 3)
[0922] The system according to claim 1, further comprising a customization means for customizing a specialized terminology dictionary according to user requests.
[0923] "Example 1"
[0924] (Claim 1)
[0925] Information gathering means,
[0926] A means of organizing the gathered information,
[0927] An extraction means for extracting important terms from the organized information,
[0928] An analytical means for analyzing the context in which the extracted terms are used,
[0929] A definition generation means that automatically generates definitions of the terms based on the results of the context analysis,
[0930] A cataloging means for constructing a glossary based on the generated definitions,
[0931] A system that includes a translation means for translating the constructed catalog into multiple languages.
[0932] (Claim 2)
[0933] The system according to claim 1, further comprising an update means for updating a new term or its definition in real time.
[0934] (Claim 3)
[0935] The system according to claim 1, further comprising an individualization means for individualizing a glossary according to the user's request.
[0936] "Application Example 1"
[0937] (Claim 1)
[0938] Data collection device,
[0939] A preprocessing device for preprocessing the collected data,
[0940] An extraction device for extracting important descriptions from pre-processed data,
[0941] A contextual analysis device that analyzes the context in which the extracted descriptions are used,
[0942] A definition generator that generates a definition of a description based on the results of contextual analysis,
[0943] A dictionary building device that constructs a specialized terminology dictionary based on the generated definitions,
[0944] A translation device that translates a constructed dictionary into multiple languages,
[0945] A system including an information display device that displays technical information on an in-vehicle information display device.
[0946] (Claim 2)
[0947] The system according to claim 1, further comprising an update device for updating new descriptions or definitions thereof, or information related to vehicle operation, in real time.
[0948] (Claim 3)
[0949] The system according to claim 1, further comprising a customization device for customizing a specialized terminology dictionary according to user requests and selecting a method for displaying vehicle-related information.
[0950] "Example 2 of combining an emotion engine"
[0951] (Claim 1)
[0952] A means of acquiring data,
[0953] A formatting means for formatting the acquired data,
[0954] An extraction means for extracting important terms from the formatted data,
[0955] A contextual analysis means for analyzing the context in which the extracted terms are used,
[0956] A definition creation means that creates definitions of the terms based on the results of the contextual analysis,
[0957] A dictionary construction means for building a glossary based on the created definitions,
[0958] A conversion means for converting the constructed dictionary into multiple languages,
[0959] A means of optimizing information based on user emotions,
[0960] A system including means for evaluating and providing information on a user's learning progress.
[0961] (Claim 2)
[0962] The system according to claim 1, further comprising an update means for updating new terms or their definitions in real time.
[0963] (Claim 3)
[0964] The system according to claim 1, further comprising a personalization means for personalizing a term dictionary according to the user's wishes.
[0965] "Application Example 2 of combining an emotion engine"
[0966] (Claim 1)
[0967] Data collection means,
[0968] Preprocessing means for preprocessing collected information,
[0969] An extraction means for extracting important terms from pre-processed information,
[0970] A contextual analysis means for analyzing the context in which extracted terms are used,
[0971] An explanation generation means that generates explanations of the terms based on the results of the contextual analysis,
[0972] A dictionary building means that constructs a specialized terminology dictionary based on the generated explanations,
[0973] A translation means that translates the constructed dictionary into multiple languages,
[0974] An optimization means that analyzes user responses and optimizes explanations of purchase-related terms,
[0975] A system that includes the above means.
[0976] (Claim 2)
[0977] The system according to claim 1, further comprising means for updating new information or explanations thereof in real time.
[0978] (Claim 3)
[0979] The system according to claim 1, further comprising means for modifying a specialized terminology dictionary according to user requirements.
[Explanation of Symbols]
[0980]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system including: a data collection device; a preprocessing device for preprocessing the collected data; an extraction device for extracting important descriptions from the preprocessed data; a contextual analysis device that analyzes the context in which the extracted descriptions are used; a definition generator that generates a definition of a description based on the results of the contextual analysis; a dictionary building device that constructs a specialized terminology dictionary based on the generated definitions; a translation device that translates the constructed dictionary into multiple languages; and an information display device that displays technical information on an in-vehicle information display device.
2. The system according to claim 1, further comprising an update device for updating new descriptions or definitions thereof, or information related to vehicle operation, in real time.
3. The system according to claim 1, further comprising a customization device for customizing a specialized terminology dictionary according to user requests and selecting a method for displaying vehicle-related information.