system
The system addresses the challenge of translating Japanese expressions by analyzing context and cultural nuances, ensuring accurate and natural translations with supplementary explanations, enhancing cross-cultural communication.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-09
- Publication Date
- 2026-06-19
AI Technical Summary
Conventional translation systems struggle to accurately understand and convey Japanese-specific expressions and cultural nuances, leading to unnatural translations and miscommunication across languages.
A system that analyzes Japanese-specific expressions and contexts, providing natural translations with supplementary explanations about cultural background and nuances, using morphological analysis, natural language processing, and generative AI models to ensure accurate communication.
Enables smooth communication by accurately conveying the intended meaning and cultural context, overcoming the limitations of conventional translation systems.
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In conventional translation systems, it has been difficult to accurately understand Japanese-specific expressions and contexts and appropriately convey them to other languages. For this reason, there is a problem that the translation result becomes unnatural and the intended meaning is not accurately conveyed to the recipient. Furthermore, the transmission of information including nuances due to cultural backgrounds and turns of phrase is also insufficient, hindering communication between different cultures.
Means for Solving the Problems
[0005] According to this invention, a system is proposed that receives input natural language data, analyzes Japanese-specific expressions and context to appropriately extract meaning, and then provides a natural translation into a different language based on the analyzed content, while also generating additional explanations about the relevant cultural background and nuances. This system enables accurate communication of intent across cultures and can solve conventional problems.
[0006] "Input natural language data" refers to text information written in a human language, such as Japanese, that the user provides to the system.
[0007] "Means of receiving" refers to a function or mechanism for receiving input natural language data into the system.
[0008] "Means for analyzing specific expressions" refers to functions or processes for identifying characteristic phrases and contexts within natural language data and understanding their meaning.
[0009] "Contextual translation means" refers to a function or process for converting natural language data into a different language, taking into account the analyzed expression and its surrounding context.
[0010] "Means for generating additional explanations" refers to a function or process that creates supplementary information to make the cultural context and nuances of expression clearer for translated content.
[0011] "Output means" refers to a function or method for presenting the translation results and generated explanations in a format usable by the user.
Brief Description of the Drawings
[0012]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map to which a plurality of emotions are mapped.
[Figure 10] An emotion map to which a plurality of emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.
MODE FOR CARRYING OUT THE INVENTION
[0013] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0014] First, the terms used in the following description will be explained.
[0015] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).
[0016] In the following embodiments, RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0017] In the following embodiments, storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0018] In the following embodiments, a communication I/F (Interface) denoted by a reference numeral is an interface including a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0019] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."
[0020] [First Embodiment]
[0021] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0022] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0023] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0024] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0025] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0026] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0027] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0028] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0029] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0030] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0031] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used by the data processing system 10 in conjunction with the specific processing program 56. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0032] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0033] This invention provides a system for natural and effective translation from Japanese into other languages. The user inputs the Japanese text they wish to translate using a terminal. The terminal sends this input to a server, which analyzes the received text. Through morphological analysis, the server breaks the text down into words and grammatical structures and identifies expressions and important phrases unique to Japanese. Then, considering the surrounding context of the text, it extracts the meaning and intent necessary for translation.
[0034] Based on the analysis results, the server translates Japanese expressions and nuances into a foreign language in a way that is easy to understand. This translation process goes beyond literal translation, taking into account cultural background and nuances of expression. In particular, Japanese expressions such as "yappari" (after all) and "toriaezu" (for now) can be interpreted differently depending on the situation, so additional explanations are generated to appropriately convey the intended meaning.
[0035] The server then sends the generated translation and additional explanations to the user's terminal, which displays the results in the format specified by the user. This system enables smooth communication that transcends language barriers.
[0036] For example, if a user enters Japanese text meaning "It started raining, so I think we should cancel today after all," the server will pay attention to the expression "yappari" ("after all") and interpret it as a reconfirmation of the change of plans. When translating this into English, it will output the result with a supplementary explanation that includes the nuance, such as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)." This ensures that foreign-language recipients can understand the speaker's original intent without misunderstanding.
[0037] The following describes the processing flow.
[0038] Step 1:
[0039] The user enters the Japanese text they want translated into the terminal. The terminal then sends the entered text to the server.
[0040] Step 2:
[0041] The server receives text sent from the terminal. The server preprocesses the received data into a parseable format. Specifically, it performs noise reduction and character normalization (e.g., unifying half-width and full-width characters).
[0042] Step 3:
[0043] The server performs morphological analysis, breaking the text down into words and identifying the part of speech and constituent elements of each. This identifies expressions and keywords unique to the Japanese language.
[0044] Step 4:
[0045] The server interprets the meaning and intent of an expression based on the context and surrounding information of the sentence. Here, data models are used to understand specific phrases and nuances.
[0046] Step 5:
[0047] The server performs natural and appropriate translations into foreign languages based on the analysis results. This process involves not only literal translations but also translations that are context-aware and natural.
[0048] Step 6:
[0049] The server generates supplementary explanations that convey nuances and cultural background that cannot be fully expressed through translation alone. These explanations help the recipient understand the intended meaning more accurately.
[0050] Step 7:
[0051] The server sends the generated translation results and supplementary explanations to the terminal. The terminal displays the received information to the user, allowing the user to confirm the results.
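The preprocessing in Step 2 above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes Unicode NFKC normalization is an acceptable way to unify half-width and full-width characters, and it treats "noise" as control characters and redundant whitespace, which is an assumption not stated in the source. The function name `preprocess` is hypothetical.

```python
import re
import unicodedata

# Assumption: "noise" means control characters other than tab, newline, and
# carriage return. The source does not define noise more precisely.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
_WHITESPACE_RUN = re.compile(r"\s+")


def preprocess(text: str) -> str:
    """Normalize character widths and strip simple noise before analysis."""
    # NFKC folds full-width ASCII to half-width, half-width katakana to
    # full-width katakana, and the ideographic space to a plain space.
    normalized = unicodedata.normalize("NFKC", text)
    normalized = _CONTROL_CHARS.sub("", normalized)
    # Collapse any run of whitespace (including newlines) to one space.
    return _WHITESPACE_RUN.sub(" ", normalized).strip()
```

NFKC also rewrites other compatibility characters (for example, circled digits), so a system that must preserve such characters would need a narrower width-folding table instead.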
[0052] (Example 1)
[0053] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0054] In today's global society, multilingual communication is crucial, but simple translation often fails to accurately convey cultural nuances and intentions. Therefore, there is a need for techniques that enable natural and effective communication between different languages. In particular, languages with unique expressions, such as Japanese, require detailed translations and supplementary explanations based on context and background.
[0055] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0056] In this invention, the server includes means for analyzing received natural language data and breaking it down into words and grammatical structures, means for extracting specific expressions and elements from the analysis results and translating them while considering the context, and means for providing supplementary information to the generated translation based on cultural background and nuances of expression. This enables natural and effective communication between different languages.
[0057] "Natural language data" refers to information expressed in the language that people use on a daily basis.
[0058] "Analysis" is the process of breaking down input data and classifying and organizing it in order to understand its structure and meaning.
[0059] "Deconstructing words and grammatical structures" means dividing a text into individual words and grammatical elements and clarifying the role of each.
[0060] "Extracting unique expressions and elements" means identifying and extracting phrases and important information specific to that language.
[0061] "Translating while considering context" means understanding the meaning from the context in which the text is used and the surrounding sentences, and then appropriately translating it into a different language.
[0062] "Providing supplementary information based on cultural background and nuances of expression" means taking into account different cultural backgrounds and subtle linguistic meanings during translation and adding information to explain them.
[0063] "Formatting and outputting in an output format" means converting the translation results and supplementary information into a format that is easy for the reader to understand, and then displaying or distributing them.
[0064] A "system" is a collection of multiple components or processes that work together to achieve a specific purpose.
[0065] This invention provides a system for natural and effective translation from natural language to another language. The user inputs the Japanese text they wish to translate into a terminal. The terminal converts the input text into an appropriate data structure and sends it to the server.
[0066] The server analyzes the received Japanese text using morphological analysis software. This analysis utilizes a "language analysis tool," which breaks down the text into individual words and extracts its grammatical structure. The server then uses the analysis results to detect Japanese-specific expressions and important phrases. To extract contextual meaning and intent, it uses a natural language processing library called the "language understanding library."
[0067] After analysis, the server uses a generative AI model to perform translations that take into account cultural background and nuances of expression. Specifically, it uses a "translation API" to request supplementary explanations in addition to the appropriately translated result. This enables the transmission of detailed meaning beyond simple translation.
[0068] Finally, the server formats the generated translation and supplementary explanations in the specified format and sends them to the terminal. The user can then check the translation displayed on the terminal, enabling smooth communication between different languages.
[0069] For example, if a user types Japanese text meaning "It's started raining, so I think we should cancel today after all," the server interprets "yappari" ("after all") as a reconfirmation of a change in plans, translates the sentence into English as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)," and sends it to the terminal.
[0070] An example of a prompt to input into the generative AI model is as follows: "Translate the following Japanese sentence into English and add supplementary explanations to convey the nuance: 'It's started raining, so I think I'll just cancel today.'"
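A prompt of this shape could be assembled from a template, as in the sketch below. The template wording follows the example prompt above; the names `PROMPT_TEMPLATE` and `build_prompt` and the `target_language` parameter are hypothetical and introduced only for illustration.

```python
# Wording taken from the example prompt; the target language is made a
# parameter here as an assumption, since the source only shows English.
PROMPT_TEMPLATE = (
    "Translate the following Japanese sentence into {target_language} "
    "and add supplementary explanations to convey the nuance: '{sentence}'"
)


def build_prompt(sentence: str, target_language: str = "English") -> str:
    """Fill the template with the user's sentence and the target language."""
    return PROMPT_TEMPLATE.format(
        target_language=target_language,
        sentence=sentence.strip(),
    )
```

The resulting string would then be passed to whichever generative AI model the server uses; that call is omitted because the source does not specify it at this point.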
[0071] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0072] Step 1:
[0073] The user enters the natural language text they want translated into the terminal. The user types sentences in Japanese or another language into a text field, and the entered text data is temporarily stored on the terminal. For example, if the user enters "It's started raining, so I think I'll cancel today," this sentence is stored on the terminal.
[0074] Step 2:
[0075] The terminal converts the input text into a data structure and sends it to the server. Specifically, it serializes the text into JSON or XML format and sends it to the server as an HTTP request. The input data is structured so that the server can easily parse it. For example, the input sentence is sent to the server in JSON format such as {"text": "It's started raining, so I think I'll cancel today."}.
[0076] Step 3:
[0077] The server analyzes the received natural language data, breaking it down into words and grammatical structures. Using a "morphological analysis tool," the server breaks down the text into smaller parts, identifying each word and its part of speech. From the input text, the server obtains word-level analysis results, such as "ame / noun," "ga / particle," and "furu / verb." This reveals the grammatical structure of the text.
[0078] Step 4:
[0079] The server extracts unique expressions and elements from the analysis results and begins the translation process, taking context into consideration. Based on the analysis results, it identifies Japanese-specific expressions (e.g., "yappari") and considers their corresponding meaning and intent. The server uses a "natural language understanding library" to gain a deeper understanding of the context and builds a prompt in a format that can be input into the generative AI model. The output of this process is a translation candidate appropriate to the context.
[0080] Step 5:
[0081] The server performs translation using a generative AI model. The model receives prompts and generates translation results that reflect the context and nuances. Specifically, it utilizes a language model API to perform translations according to the instructed prompt sentences. The generated output is a translation such as, "It started raining, so I think we should cancel our plans today. ('やっぱり' indicates a reconfirmation of the decision)."
[0082] Step 6:
[0083] The server formats the generated translation results into the specified output format and sends them to the terminal. The formatting process converts the data to formats such as HTML or plain text, making it immediately readable by the user. The server returns the formatted data to the terminal as an HTTP response.
[0084] Step 7:
[0085] The terminal parses the data received from the server and displays it to the user. Specifically, it displays the received translation results in the appropriate widget so that the user can view them. The user can then check the translation results on that screen and use the information as needed.
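The data exchange in Steps 2 and 6 above can be sketched as follows. This is an illustrative outline only: it covers the JSON option for the request and the HTML and plain-text options for the response, and leaves out the HTTP transport. The function names and the layout of the formatted output are assumptions, not taken from the source.

```python
import html
import json


def serialize_request(text: str) -> str:
    """Terminal side: wrap the input text in a JSON object."""
    # ensure_ascii=False keeps Japanese characters readable in the payload.
    return json.dumps({"text": text}, ensure_ascii=False)


def parse_request(body: str) -> str:
    """Server side: extract the text, rejecting malformed payloads."""
    payload = json.loads(body)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        raise ValueError("request must contain a non-empty 'text' string")
    return text


def format_result(translation: str, explanation: str, fmt: str = "plain") -> str:
    """Server side: render the result as plain text or an HTML fragment."""
    if fmt == "plain":
        return f"{translation}\n({explanation})"
    if fmt == "html":
        # Escape both strings so generated text cannot inject markup.
        return (
            f"<p>{html.escape(translation)}</p>"
            f"<p><em>{html.escape(explanation)}</em></p>"
        )
    raise ValueError(f"unsupported format: {fmt}")
```

Escaping the model output before embedding it in HTML is a design choice made here for safety; the source only states that the result is converted to HTML or plain text.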
[0086] (Application Example 1)
[0087] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0088] Conventional translation systems struggle to accurately interpret expressions and contexts unique to Japanese, resulting in a failure to adequately convey cultural nuances in communication between users of different languages. Furthermore, the lack of effective means to display real-time translations and their supplementary explanations across different devices hinders smooth multi-platform communication.
[0089] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0090] In this invention, the server includes means for receiving input natural language data, means for analyzing specific expressions from the received natural language data, and means for translating the natural language into a different language based on the analyzed expressions and context. This enables the provision of supplementary information related to Japanese-specific expressions and cultural background, and further allows the translation results to be displayed in real time on different devices.
[0091] "Input natural language data" refers to the initial language text that the user sends to the server for translation.
[0092] "Means of receiving" means a method or device for properly importing natural language data into a server.
[0093] "Means for analyzing specific expressions" refers to methods for identifying important phrases and expressions contained within natural language data and analyzing their structure.
[0094] "Contextual translation" refers to a method of accurate translation that takes into account the meaning and nuances of the surrounding text.
[0095] "Means for generating additional explanations" refers to methods for automatically creating supplementary information to clarify the cultural context and intent behind the translated content.
[0096] "Means for displaying translations in real time on different devices" refers to methods for instantly displaying translated results on a variety of devices.
[0097] A "generative AI model" refers to an algorithm or platform that uses artificial intelligence technology to process data and generate output corresponding to a specific task.
[0098] The system that implements this application primarily consists of a server and a user's terminal. The server receives natural language data entered by the user on their terminal and analyzes that data. A morphological analysis tool such as MeCab is used for the analysis, which identifies grammatical structures and specific expressions. Based on this information, the server translates the natural language into another language using the Google Cloud Translation API or the OpenAI API.
[0099] The server also generates additional explanations about the translated content. These explanations are created in the backend using a generative AI model and supplement cultural context and linguistic nuances. The generated translations and supplementary information are sent to the user's device and displayed in real time on different devices.
[0100] As a concrete example, if a foreign tourist wants to translate Japanese tourist information into English, they might input a sentence meaning "Mount Fuji is beautiful after all." The server analyzes the expression "yappari" ("after all") and provides a translation that includes its cultural implications. In this case, a translation such as "Mount Fuji is really beautiful, isn't it? ('After all' implies a reconfirmation of the opinion)" would be generated.
[0101] An example of a prompt is as follows.
[0102] For Japanese text meaning "The trip was fun, but it's really reassuring to be back home," the prompt would read:
[0103] "Translate the following text into English, explaining the nuances of 'yappari': 'The trip was fun, but I still feel relieved when I get home.'"
[0104] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0105] Step 1:
[0106] The user enters the Japanese text they wish to have translated on their terminal, and the terminal sends this input to the server. The entered data is natural language text.
[0107] Step 2:
[0108] The server analyzes the received Japanese text using a morphological analysis tool (MeCab). This analysis identifies word boundaries and grammatical structures within the text. This allows for the identification of specific Japanese phrases and expressions.
[0109] Step 3:
[0110] The server uses the Google Cloud Translation API or OpenAI API to perform translations based on the analyzed data. This process incorporates contextual information to obtain translation results that take into account cultural background and linguistic nuances.
[0111] Step 4:
[0112] The server uses a generative AI model to generate additional explanations related to the translation results. The generated explanations supplement the background and original intent of the translation, and also refer to cultural contexts.
[0113] Step 5:
[0114] The server sends the generated translation results and additional explanations to the user's terminal. The terminal displays this information on the screen in real time, allowing the user to view the translation results immediately.
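The step from morphological analysis to identifying Japanese-specific expressions can be sketched as a lookup over analyzer tokens. The sketch below does not call MeCab; it uses hand-written tokens standing in for analyzer output, and the `Token` type, the `NUANCE_NOTES` table, and `find_expressions` are hypothetical names. The note texts are illustrative summaries of the nuances described in this document.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Token(NamedTuple):
    surface: str  # the word as written in the text
    pos: str      # part-of-speech label from the analyzer


# Illustrative notes only; a real system would maintain a much larger table.
NUANCE_NOTES = {
    "やっぱり": "'yappari' (after all): reconfirms an earlier opinion or decision",
    "とりあえず": "'toriaezu' (for now): meaning shifts with the situation",
}


def find_expressions(tokens: Iterable[Token]) -> List[Tuple[str, str]]:
    """Return (surface, note) pairs for tokens listed in NUANCE_NOTES, in order."""
    return [
        (t.surface, NUANCE_NOTES[t.surface])
        for t in tokens
        if t.surface in NUANCE_NOTES
    ]


# Hand-written tokens standing in for analyzer output (an assumption).
tokens = [
    Token("富士山", "noun"),
    Token("は", "particle"),
    Token("やっぱり", "adverb"),
    Token("きれい", "adjective"),
]
hits = find_expressions(tokens)
```

The matched notes could be appended to the prompt or returned alongside the translation as the additional explanation; a dictionary lookup alone does not resolve context, which the document assigns to the generative AI model.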
[0115] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0116] This invention relates to a system for translating from Japanese to other languages, which has a function to adjust the translation content while taking the user's emotions into consideration. The user inputs the Japanese text they wish to have translated using a terminal. The terminal sends the input data to a server, which then receives the data.
[0117] The server first performs morphological analysis, breaking down and analyzing the part of speech and grammatical structure of each word in the input text. From this analysis, it identifies Japanese-specific expressions and phrases and understands the context. Then, it performs translation using natural language processing to generate natural-sounding expressions in the other language. In this translation process, it considers not only literal translation but also nuance.
[0118] Furthermore, this system utilizes an emotion engine. The server analyzes the user's emotions from the input text using the emotion engine and adjusts the nuances of the translation based on those emotions. For example, if it is determined that the user has positive emotions, the translation result will be adjusted to reflect those positive nuances. In addition, supplementary information related to emotions is generated and included in the translation result.
[0119] Finally, the server sends the translation results, which take the user's emotions into consideration, along with any additional information generated as needed, to the terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
[0120] For example, if a user enters text meaning "It's going to be a great day," the server's emotion engine recognizes this text as expressing a positive emotion. Then, when translating it into English, it produces a translation that reflects the emotion, such as "It's going to be an amazing day!" This makes it easier for the user's emotions to be accurately conveyed to the recipient.
[0121] The following describes the processing flow.
[0122] Step 1:
[0123] The user enters the Japanese text they want translated using their terminal. The terminal then sends this input to the server.
[0124] Step 2:
[0125] The server receives text sent from the terminal. The received text is preprocessed to prepare it for parsing. Specifically, noise reduction and character normalization are performed.
[0126] Step 3:
[0127] The server performs morphological analysis, breaking down the words in the input text and analyzing their parts of speech and grammatical structure. This allows it to identify expressions and phrases unique to the Japanese language.
[0128] Step 4:
[0129] The server's emotion engine recognizes the user's emotions from the analyzed text. The emotion engine determines emotions based on emotional keywords and phrases contained in the text.
[0130] Step 5:
[0131] The server adjusts the nuances of the translation based on recognized sentiment information. It performs translations that reflect the context and nuances appropriate to the emotions.
[0132] Step 6:
[0133] The server generates natural-sounding translations into foreign languages based on context and sentiment. It also generates supplementary information about cultural background and nuances of expression, as needed.
[0134] Step 7:
[0135] The server sends the generated translation results and supplementary information to the user's terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
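The keyword-based emotion recognition of Step 4 can be illustrated with a minimal sketch. The keyword lists, the `EmotionResult` type, and the function name `detect_emotion` are hypothetical; the description only states that emotions are determined from emotional keywords and phrases in the text, and an actual emotion engine would likely be far more elaborate.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keyword lists (assumption): the description does not
# enumerate the keywords the emotion engine relies on.
POSITIVE_KEYWORDS = ("楽しい", "嬉しい", "最高", "素晴らしい")
NEGATIVE_KEYWORDS = ("悲しい", "つらい", "最悪", "残念")


@dataclass
class EmotionResult:
    label: str        # "positive", "negative", or "neutral"
    hits: List[str]   # keywords found in the text


def detect_emotion(text: str) -> EmotionResult:
    """Label text by comparing how many distinct positive and
    negative keywords it contains (a stand-in for Step 4)."""
    pos = [k for k in POSITIVE_KEYWORDS if k in text]
    neg = [k for k in NEGATIVE_KEYWORDS if k in text]
    if len(pos) > len(neg):
        return EmotionResult("positive", pos)
    if len(neg) > len(pos):
        return EmotionResult("negative", neg)
    return EmotionResult("neutral", pos + neg)


if __name__ == "__main__":
    print(detect_emotion("今日はとても楽しい一日です"))
```

The resulting label would then be passed to the nuance adjustment of Step 5.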
[0136] (Example 2)
[0137] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0138] Traditional translation systems have the problem of being limited to literal translations and having difficulty considering nuances and emotions between languages. Therefore, there is a need for translation methods that reflect the user's emotions. It is also important to provide appropriate supplementary information based on cultural background.
[0139] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0140] In this invention, the server includes means for receiving input natural language information, means for analyzing the word structure of the received natural language information, means for translating the natural language into a different language based on the analyzed word structure and context, means for analyzing sentiment during the translation process and adjusting the nuances of the translated content, and means for generating supplementary information related to the translated content. This makes it possible to provide natural translation results that reflect the user's emotions and to present sentiment-based supplementary information together.
[0141] "Natural language information" refers to text and audio data expressed in the language that humans use on a daily basis.
[0142] "Word structure" refers to the results of analyzing the part of speech, grammatical role, and conjugation forms of words in natural language.
[0143] "Context" refers to the linguistic and cultural background in which a particular word or phrase is used, as well as the content of the surrounding text.
[0144] "Translation means" refers to a method or device for converting information expressed in one language into information with equivalent meaning in a different language.
[0145] "Methods for analyzing emotions" refer to processing and techniques used to identify the emotional tone and intentions of a writer or speaker from information such as text and audio.
[0146] "Methods for adjusting nuances" refer to methods of making subtle adjustments to wording and expressions in the translated result to reflect the emotional and cultural characteristics of the original language.
[0147] "Means of generating supplementary information" refers to techniques that create information and annotations to aid understanding, in addition to the basic translation result.
[0148] This invention is a system that provides a translation that takes into account the sentiment of the natural language text input by the user. The user inputs the text they wish to have translated using a terminal. The terminal sends the input data to the server.
[0149] The server first uses "morphological analysis software" to perform morphological analysis. This software divides the input natural language information into word units and analyzes the part of speech and grammatical role of each word. Based on the results of this analysis, the server interprets the context and uses a natural language processing engine to translate the text into another language. A "translation engine" may be used for translation, but even then the aim is a natural translation that suits the context rather than a literal one.
[0150] Furthermore, the server uses a "sentiment analysis engine" to analyze the user's emotions from the input text. Based on the analyzed emotional information, it adjusts the nuances of the translated text to reflect the user's emotions. At this time, supplementary information related to the emotions is generated and sent to the terminal along with the translation result.
[0151] The terminal displays the final adjusted translation and supplementary information to the user, allowing them to confirm it. For example, if the user enters Japanese text meaning "It's going to be a great day," the server detects the positive emotion in this text and provides a translation that reflects it: "It's going to be an amazing day!"
[0152] An example of a prompt for a generative AI model is: "Translate the following Japanese text into English, reflecting the sentiment of the text: 'It's going to be a great day.'"
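As a minimal sketch, the prompt quoted above could be assembled from the user's input as follows. The function name `build_sentiment_prompt` and the `target_language` parameter are assumptions introduced for illustration, and the Japanese sentence in the usage line is an assumed source-language input; how the prompt is actually sent to a generative AI model is not specified here.

```python
def build_sentiment_prompt(japanese_text: str, target_language: str = "English") -> str:
    """Build the sentiment-aware translation prompt for a generative AI model.

    The wording follows the example prompt in the description; the
    function itself is a hypothetical helper.
    """
    text = japanese_text.strip()
    if not text:
        raise ValueError("japanese_text must not be empty")
    return (
        f"Translate the following Japanese text into {target_language}, "
        f"reflecting the sentiment of the text: '{text}'"
    )


if __name__ == "__main__":
    print(build_sentiment_prompt("今日は素晴らしい一日になりそうです"))
```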
[0153] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0154] Step 1:
[0155] The user inputs the Japanese text for which translation is desired into the terminal. For example, the user inputs "今日はとても楽しい一日です" ("Today is a very enjoyable day."). The terminal sends this input data to the server. The input is Japanese text, and the output is the data sent to the server.
[0156] Step 2:
[0157] The server performs morphological analysis on the natural language information received from the terminal. Specifically, the server uses morphological analysis software to analyze the Japanese text word by word and identify the part of speech of each word and the grammatical structure. The input is the Japanese text received from the terminal, and the output is the analysis result, namely the part-of-speech information and grammatical structure.
[0158] Step 3:
[0159] The server performs translation using the natural language processing engine based on the analyzed part-of-speech information and grammatical structure. At this stage, it interprets the context of the input text and generates a translation that takes nuance into account rather than a purely literal one. For example, the Japanese sentence "今日はとても楽しい一日です" is translated as "Today is a very enjoyable day." The input is the result of morphological analysis, and the output is the translated English text.
[0160] Step 4:
[0161] The server analyzes the user's sentiment using the sentiment analysis engine in parallel with the translation. Here, it detects a positive sentiment from the expression "とても楽しい". The input is the original Japanese text, and the output is the result of sentiment analysis, which is the information of a positive sentiment.
[0162] Step 5:
[0163] The server adjusts the nuances of the translated text based on the sentiment analysis results. In this case, it adjusts it to emphasize positive emotions more, such as "Today is a wonderfully enjoyable day!". The input is the translated English text and the sentiment analysis results, and the output is the sentiment-adjusted translated text.
[0164] Step 6:
[0165] The server generates supplementary information about emotions and adds it to the translation result. Here, it generates information to supplement positive emotions. The input is the emotion-adjusted translation text, and the output is the final translation result with supplementary information.
[0166] Step 7:
[0167] The server sends the final adjusted translation results and supplementary information to the terminal. The terminal receives this and displays it to the user. The user can then review the translated content and related information. The input is the data sent from the server, and the output is the information displayed to the user.
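The server-side portion of Steps 2 to 7 can be sketched as a small pipeline in which each stage is an injected function, with translation (Step 3) and sentiment analysis (Step 4) running in parallel as described. Every name below (`run_pipeline`, `TranslationResult`, the stage parameters) is hypothetical, and the stages in `demo` are trivial stand-ins rather than real analysis or translation engines.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TranslationResult:
    translation: str   # sentiment-adjusted translation (Step 5 output)
    sentiment: str     # sentiment label (Step 4 output)
    supplement: str    # supplementary information (Step 6 output)


def run_pipeline(
    text: str,
    analyze: Callable[[str], List[str]],
    translate: Callable[[List[str]], str],
    detect_sentiment: Callable[[str], str],
    adjust: Callable[[str, str], str],
    explain: Callable[[str, str], str],
) -> TranslationResult:
    tokens = analyze(text)                                       # Step 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        draft_future = pool.submit(translate, tokens)            # Step 3
        sentiment_future = pool.submit(detect_sentiment, text)   # Step 4
        draft = draft_future.result()
        sentiment = sentiment_future.result()
    adjusted = adjust(draft, sentiment)                          # Step 5
    supplement = explain(adjusted, sentiment)                    # Step 6
    return TranslationResult(adjusted, sentiment, supplement)    # Step 7 payload


def demo() -> TranslationResult:
    # Stand-in stages (assumptions) that reproduce the worked example.
    return run_pipeline(
        "今日はとても楽しい一日です",
        analyze=lambda t: [t],
        translate=lambda tokens: "Today is a very enjoyable day.",
        detect_sentiment=lambda t: "positive" if "楽しい" in t else "neutral",
        adjust=lambda draft, s: (
            "Today is a wonderfully enjoyable day!" if s == "positive" else draft
        ),
        explain=lambda out, s: f"Detected sentiment: {s}",
    )


if __name__ == "__main__":
    print(demo())
```

Passing the stages in as functions keeps the sketch independent of any particular morphological analyzer, translation engine, or sentiment analysis engine.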
[0168] (Application Example 2)
[0169] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0170] In translation between different languages, it is crucial to provide translations that not only perform grammatical conversions but also appropriately reflect the user's emotions. However, conventional systems lack the nuance adjustments necessary to take user emotions into account, sometimes resulting in misrepresentation of the user's intentions and feelings. To address this challenge, there is a need for translation systems that take emotions into account.
[0171] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0172] In this invention, the server includes means for receiving input character data, means for analyzing a specific syntax from the received character data, means for converting the character data into different formats based on the analyzed syntax and context, means for analyzing the user's emotions and adjusting the nuances of the conversion based on those emotions, and means for outputting the adjusted conversion results and explanations. This allows for the provision of translation results that reflect the user's emotions, enabling more accurate communication of intent.
[0173] "Inputted character data" refers to the language information that the user provides to the system for morphological analysis and translation.
[0174] A "means of receiving" refers to an element that has the function of receiving data sent by the user within the system.
[0175] A "means for parsing a specific syntax" refers to an element that breaks down input character data into word and sentence structures and processes them to understand their grammar and meaning.
[0176] "Means of converting to different formats" refers to elements that have the function of appropriately translating or converting analyzed character data into other languages or formats.
[0177] A "means of analyzing emotions" refers to an element that has the function of detecting and analyzing emotional nuances and tones from user input data.
[0178] "Means of adjusting nuance" refer to elements that modify the tone and expression of the translation result based on detected emotions, thereby maintaining the intended emotion.
[0179] "Means for outputting adjusted conversion results and explanations" refers to elements that have the function of presenting the converted content and related supplementary information to the user.
[0180] This invention provides a multilingual translation system that takes user emotions into consideration. The system is implemented via a user terminal such as a smartphone or smart glasses. The user inputs natural language text data into the terminal, and sentiment analysis and translation processing are performed on the server. Upon receiving the text data, the server first performs morphological analysis. Specifically, it uses analysis software running in a Python environment to analyze the parts of speech and syntactic structure of the words and phrases that make up the text.
[0181] The server then uses a natural language processing engine to translate into different languages. During this process, it employs a generative AI model to generate translations that respect the context and syntax of the source text. Furthermore, it analyzes the user's emotions through an emotion engine and performs nuance adjustments to reflect these emotions in the translated text.
[0182] The translation is adjusted according to the emotions detected from the natural language text entered by the user. The adjusted translation, along with additional information based on the emotions, is sent to the terminal and presented to the user. For example, if the user enters text meaning "It's going to be a great day," the system recognizes the positive emotion and provides the translation "It's going to be an amazing day!"
[0183] An example of a prompt is, "Retranslate the product description to reflect the user's emotions in a positive way." This system makes it easy to perform translations that accurately reflect the user's feelings.
[0184] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0185] Step 1:
[0186] The terminal receives natural language text data entered by the user. This input data reflects the user's intent. The terminal then prepares to send this data to the server.
[0187] Step 2:
[0188] The server receives natural language text sent from the terminal. The received text data is input into a morphological analysis system, which analyzes the part of speech and grammatical structure of each word. The output of this analysis is syntactic information and context of the text.
[0189] Step 3:
[0190] The server uses syntactic information obtained from morphological analysis to translate text into different languages using a generative AI model. At this stage, it generates appropriate translation candidates based on the input syntactic information. The output is the translated text data.
[0191] Step 4:
[0192] After the translation process, the server uses an emotion analysis engine to analyze the emotions contained in the user's input text. At this stage, it generates data for nuance adjustment based on the emotion information extracted from the text.
[0193] Step 5:
[0194] The server uses information obtained from sentiment analysis to adjust the nuances of the translated text. Specifically, it supplies prompt sentences to the generative AI model to modify the translation results so that they reflect the user's emotions. The output is the adjusted translated text.
[0195] Step 6:
[0196] The server sends the adjusted translated text and related supplementary information to the terminal. The terminal receives this and displays a translation result that reflects the sentiment to the user. This allows the user to confirm that the translation takes sentiment into account.
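The two-stage approach of Steps 3 to 5, in which a draft translation is revised by a second prompt that carries the detected emotion, might be sketched as below. The tone hints, the `TONE_HINTS` table, and `build_adjustment_prompt` are illustrative assumptions; the description does not give the wording of the adjustment prompt.

```python
# Hypothetical mapping from an emotion label to a tone instruction.
TONE_HINTS = {
    "positive": "Keep the wording upbeat and enthusiastic.",
    "negative": "Keep the wording restrained and sympathetic.",
    "neutral": "Keep the wording plain and factual.",
}


def build_adjustment_prompt(draft_translation: str, emotion: str) -> str:
    """Build a Step 5 prompt asking the model to revise a draft translation.

    Unknown emotion labels fall back to "neutral".
    """
    label = emotion if emotion in TONE_HINTS else "neutral"
    return (
        "Revise the translation below so that it reflects the user's "
        f"emotion ({label}). {TONE_HINTS[label]}\n\n"
        f"Translation: {draft_translation}"
    )


if __name__ == "__main__":
    print(build_adjustment_prompt("It's going to be a great day.", "positive"))
```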
[0197] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0198] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, and inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0199] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0200] [Second Embodiment]
[0201] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0202] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0203] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0204] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0205] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0206] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0207] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0208] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0209] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0210] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0211] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0212] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0213] This invention provides a system for natural and effective translation from Japanese to other languages. The user inputs the Japanese text they wish to translate using a terminal. The terminal sends this input to a server, which analyzes the received text. The server breaks down words and grammatical structures through morphological analysis, identifying expressions and important phrases unique to Japanese. Then, considering the surrounding context of the text, it extracts the meaning and intent necessary for translation.
[0214] Based on the analysis results, the server translates Japanese expressions and nuances into a foreign language in a way that is easy to understand. This translation process goes beyond literal translation, taking into account cultural background and nuances of expression. In particular, Japanese expressions such as "yappari" (after all) and "toriaezu" (for now) can be interpreted differently depending on the situation, so additional explanations are generated to appropriately convey the intended meaning.
[0215] As a result, the server sends the generated translation and additional explanations to the user's terminal and displays the results in the format specified by the user. This system enables smooth communication that transcends language barriers.
[0216] For example, if a user enters the text "It started raining, so I think we should cancel today after all." in Japanese, the server will pay attention to the expression "after all" and interpret it as meaning a reconfirmation of the change of plans. When translating this into English, it will output the result with a supplementary explanation that includes the nuance, such as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)." This ensures that foreign language recipients can understand the speaker's original intent without misunderstanding.
[0217] The following describes the processing flow.
[0218] Step 1:
[0219] The user enters the Japanese text they want translated into the device. The device then sends the entered text to the server.
[0220] Step 2:
[0221] The server receives text sent from the terminal. The server preprocesses the received data into a parseable format. Specifically, it performs noise reduction and character normalization (e.g., unifying half-width and full-width characters).
[0222] Step 3:
[0223] The server performs morphological analysis, breaking down the part of speech and constituent elements of each word in the text. This identifies expressions and keywords unique to the Japanese language.
[0224] Step 4:
[0225] The server interprets the meaning and intent of an expression based on the context and surrounding information of the sentence. Here, data models are used to understand specific phrases and nuances.
[0226] Step 5:
[0227] The server performs natural and appropriate translations into foreign languages based on the analysis results. This process involves not only literal translations but also translations that are context-aware and natural.
[0228] Step 6:
[0229] The server generates supplementary explanations that convey nuances and cultural background that cannot be fully expressed through translation alone. These explanations help the recipient understand the intended meaning more accurately.
[0230] Step 7:
[0231] The server sends the generated translation results and supplementary explanations to the terminal. The terminal displays the received information to the user, allowing the user to confirm the results.
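The preprocessing in Step 2 (noise reduction and unifying half-width and full-width characters) can be approximated with Python's standard library. This is a minimal sketch under the assumption that "noise" means control characters and redundant whitespace; the function name `preprocess` is hypothetical, and Unicode NFKC normalization is one common way, not necessarily the one intended, to unify character widths.

```python
import re
import unicodedata


def preprocess(text: str) -> str:
    """Normalize character widths and strip simple noise before parsing."""
    # NFKC folds full-width ASCII into half-width forms and half-width
    # katakana into full-width forms, and maps the ideographic space
    # to an ordinary space.
    normalized = unicodedata.normalize("NFKC", text)
    # Drop control characters (Unicode category Cc) other than whitespace.
    cleaned = "".join(
        ch for ch in normalized
        if ch.isspace() or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of whitespace into a single space and trim the ends.
    return re.sub(r"\s+", " ", cleaned).strip()


if __name__ == "__main__":
    print(preprocess("ＡＢＣ　１２３"))
```

Note that NFKC also rewrites other compatibility characters (for example, circled digits), which may or may not be desirable for a given analyzer.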
[0232] (Example 1)
[0233] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0234] In today's global society, multilingual communication is crucial, but simple translation often fails to accurately convey cultural nuances and intentions. Therefore, there is a need for techniques that enable natural and effective communication between different languages. In particular, languages with unique expressions, such as Japanese, require detailed translations and supplementary explanations based on context and background.
[0235] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0236] In this invention, the server includes means for analyzing received natural language data and breaking it down into words and grammatical structures, means for extracting specific expressions and elements from the analysis results and translating them while considering the context, and means for providing supplementary information to the generated translation based on cultural background and nuances of expression. This enables natural and effective communication between different languages.
[0237] "Natural language data" refers to information expressed in the language that people use on a daily basis.
[0238] "Analysis" is the process of breaking down input data and classifying and organizing it in order to understand its structure and meaning.
[0239] "Deconstructing words and grammatical structures" means dividing a text into individual words and grammatical elements and clarifying the role of each.
[0240] "Extracting unique expressions and elements" means identifying and extracting phrases and important information specific to that language.
[0241] "Translating while considering context" means understanding the meaning from the context in which the text is used and the surrounding sentences, and then appropriately translating it into a different language.
[0242] "Providing supplementary information based on cultural background and nuances of expression" means taking into account different cultural backgrounds and subtle linguistic meanings during translation and adding information to explain them.
[0243] "Formatting and outputting in an output format" means converting the translation results and supplementary information into a format that is easy for the reader to understand, and then displaying or distributing them.
[0244] A "system" is a collection of multiple components or processes that work together to achieve a specific purpose.
[0245] This invention provides a system for natural and effective translation from natural language to another language. The user inputs the Japanese text they wish to translate into a terminal. The terminal converts the input text into an appropriate data structure and sends it to the server.
[0246] The server analyzes the received Japanese text using morphological analysis software. This analysis utilizes a "language analysis tool," which breaks down the text into individual words and extracts its grammatical structure. The server then uses the analysis results to detect Japanese-specific expressions and important phrases. To extract contextual meaning and intent, it uses a natural language processing library called the "language understanding library."
[0247] After analysis, the server uses a generative AI model to perform translations that take into account cultural background and nuances of expression. Specifically, it uses a "translation API" to obtain an appropriate translation and additionally requests supplementary explanations for that translation. This conveys detailed meaning that goes beyond simple translation.
[0248] Finally, the server formats the generated translation and supplementary explanations in the specified format and sends them to the terminal. The user can then check the translation displayed on the terminal, enabling smooth communication between different languages.
[0249] For example, if a user types "It's started raining, so I think we should cancel today after all," the server interprets "after all" as a confirmation of a change in plans, translates it into English as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)," and sends it to the device.
[0250] An example of a prompt to input into the generative AI model is as follows: "Translate the following Japanese sentence into English and add supplementary explanations to convey the nuance: 'It's started raining, so I think I'll just cancel today.'"
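One way to combine the detection of Japanese-specific expressions with the prompt above is sketched below. The `CONTEXT_DEPENDENT` table contains only the two expressions named in this description ("yappari" and "toriaezu"); the helper names, the extra instruction line, and the Japanese sentence in the usage line (an assumed source-language form of the rain example) are illustrative assumptions.

```python
from typing import List

# Expressions named in the description, with their rough glosses.
CONTEXT_DEPENDENT = {
    "やっぱり": "after all",
    "とりあえず": "for now",
}


def find_expressions(text: str) -> List[str]:
    """Return the context-dependent expressions that occur in the text."""
    return [expr for expr in CONTEXT_DEPENDENT if expr in text]


def build_explained_prompt(text: str) -> str:
    """Build a translation prompt that also requests nuance explanations."""
    prompt = (
        "Translate the following Japanese sentence into English and add "
        f"supplementary explanations to convey the nuance: '{text}'"
    )
    found = find_expressions(text)
    if found:
        glossed = ", ".join(f"{e} ({CONTEXT_DEPENDENT[e]})" for e in found)
        prompt += f"\nPay particular attention to these expressions: {glossed}"
    return prompt


if __name__ == "__main__":
    print(build_explained_prompt("雨が降ってきたので、やっぱり今日は中止にしようと思います"))
```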
[0251] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0252] Step 1:
[0253] The user enters the natural language text they want translated into the device. During this input process, the user uses a text field to enter sentences in Japanese or another language. The entered text data is temporarily stored on the device. For example, if the user enters "It's started raining, so I think I'll cancel today," this sentence will be stored on the device.
[0254] Step 2:
[0255] The terminal converts the input text into a data structure and sends it to the server. Specifically, it serializes the text into JSON or XML format and sends it to the server as an HTTP request. The input data is in a structured format so that the server can easily parse it. For example, the input sentence is sent to the server in JSON format such as {"text": "It's started raining, so I think I'll cancel today."}.
[0256] Step 3:
[0257] The server analyzes the received natural language data, breaking it down into words and grammatical structures. Using a "morphological analysis tool," the server breaks down the text into smaller parts, identifying each word and its part of speech. From the input text, the server obtains word-level analysis results, such as "rain / noun," "ga / particle," and "furu / verb." This reveals the grammatical structure of the text.
[0258] Step 4:
[0259] The server extracts unique expressions and elements from the analysis results and begins the translation process, taking context into consideration. Based on the analysis results, it identifies Japanese-specific expressions (e.g., "yappari") and considers their corresponding meaning and intent. The server uses a "natural language understanding library" to gain a deeper understanding of the context and converts the prompt sentence into a format that can be input into the AI model. The output of this process is an appropriate translation candidate that is relevant to the context.
[0260] Step 5:
[0261] The server performs translation using a generative AI model. The model receives prompts and generates translation results that reflect the context and nuances. Specifically, it utilizes a language model API to perform translations according to the instructed prompt sentences. The generated output is a translation such as, "It started raining, so I think we should cancel our plans today. ('やっぱり' indicates a reconfirmation of the decision)."
[0262] Step 6:
[0263] The server formats the generated translation results into the specified output format and sends them to the terminal. The formatting process converts the data to formats such as HTML or plain text, making it immediately readable by the user. The server returns the formatted data to the terminal as an HTTP response.
[0264] Step 7:
[0265] The device correctly interprets the data received from the server and displays it to the user. Specifically, it displays the received translation results in the appropriate widget, making them viewable by the user. The user can then check the translation results on that screen and utilize the information as needed.
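The data exchange in Steps 2 and 6 (a JSON request body carrying the text, and an HTML-formatted result returned to the terminal) can be sketched with the standard library alone. The field name `text` follows the JSON example in Step 2; the function names and the HTML class names are hypothetical, and the HTTP transport itself is omitted.

```python
import html
import json


def build_request_body(text: str) -> str:
    """Terminal side (Step 2): serialize the input text as JSON."""
    # ensure_ascii=False keeps Japanese characters readable in the body.
    return json.dumps({"text": text}, ensure_ascii=False)


def parse_request_body(body: str) -> str:
    """Server side: extract and validate the text from the request body."""
    payload = json.loads(body)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        raise ValueError("request must contain a non-empty 'text' field")
    return text


def format_result_html(translation: str, explanation: str) -> str:
    """Server side (Step 6): format the result as escaped HTML."""
    return (
        f'<p class="translation">{html.escape(translation)}</p>'
        f'<p class="note">{html.escape(explanation)}</p>'
    )


if __name__ == "__main__":
    body = build_request_body("雨が降ってきた")
    print(body)
    print(parse_request_body(body))
```

Escaping the model output before embedding it in HTML avoids rendering problems if the translation happens to contain markup characters.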
[0266] (Application Example 1)
[0267] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0268] Conventional translation systems struggle to accurately interpret expressions and contexts unique to Japanese, resulting in a failure to adequately convey cultural nuances in communication between users of different languages. Furthermore, the lack of effective means to display real-time translations and their supplementary explanations across different devices hinders smooth multi-platform communication.
[0269] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0270] In this invention, the server includes means for receiving input natural language data, means for analyzing specific expressions from the received natural language data, and means for translating the natural language into a different language based on the analyzed expressions and context. This enables the provision of supplementary information related to Japanese-specific expressions and cultural background, and further allows the translation results to be displayed in real time on different devices.
[0271] "Input natural language data" refers to the initial language text that the user sends to the server for translation.
[0272] "Means of receiving" means a method or device for properly importing natural language data into a server.
[0273] "Means for analyzing specific expressions" refers to methods for identifying important phrases and expressions contained within natural language data and analyzing their structure.
[0274] "Contextual translation" refers to a method of accurate translation that takes into account the meaning and nuances of the surrounding text.
[0275] "Means for generating additional explanations" refers to methods for automatically creating supplementary information to clarify the cultural context and intent behind the translated content.
[0276] "Means for displaying translations in real time on different devices" refers to methods for instantly displaying translated results on a variety of devices.
[0277] A "generative AI model" refers to an algorithm or platform that uses artificial intelligence technology to process data and generate output corresponding to a specific task.
[0278] The system that implements this application primarily consists of a server and a user's terminal. The server receives natural language data entered by the user on their terminal and analyzes that data. Morphological analysis tools such as MeCab are used for the analysis, which identifies grammatical structures and specific expressions. Based on this information, the server translates the natural language into another language using the Google Cloud Translation API or the OpenAI API.
[0279] The server also generates additional explanations about the translated content. These explanations are created in the backend using a generative AI model and supplement cultural context and linguistic nuances. The generated translations and supplementary information are sent to the user's device and displayed in real time on different devices.
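As a rough illustration of how specific expressions might be picked out of the analysis result, the following sketch assumes the morphological analyzer (for example MeCab) has already produced a list of tokens, each with a surface form and a part-of-speech label. The Token type, the glossary, and the function name are hypothetical. The two glossary entries are taken from expressions named in this description, and a real system would need a far larger resource.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    surface: str  # the word as written in the input text
    pos: str      # part-of-speech label reported by the analyzer


# Hypothetical glossary of Japanese-specific expressions (illustrative only).
NUANCE_GLOSSARY = {
    "やっぱり": "after all; reconfirms an earlier opinion or decision",
    "とりあえず": "for now",
}


def find_special_expressions(tokens: Iterable[Token]) -> list[tuple[str, str]]:
    """Return (expression, gloss) pairs for tokens that appear in the glossary."""
    return [
        (token.surface, NUANCE_GLOSSARY[token.surface])
        for token in tokens
        if token.surface in NUANCE_GLOSSARY
    ]
```

The matched expressions and their glosses could be passed on to the translation and explanation steps as context.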
[0280] As a specific example, suppose a foreign tourist wants to translate Japanese tourist information into English and inputs a Japanese sentence meaning "Mount Fuji is still beautiful." The server analyzes the expression rendered here as "still" and provides a translation that conveys its cultural meaning. In this case, a translation like "Fuji is really beautiful, isn't it? ('still' implies a reconfirmation of the opinion)" is generated.
[0281] Examples of prompt sentences are described as follows:
[0282] For the Japanese text meaning "The trip was fun, but still, I feel relieved when I get home.", the prompt is as follows:
[0283] "Translate the following text into English, explaining the nuances of 'still': '旅行は楽しかったけど、やっぱり家に帰るとほっとするね。'"
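A prompt of this shape could be assembled from the input text and the expression identified during analysis. The sketch below is one possible template under that assumption. The function name and parameters are hypothetical, and the wording simply mirrors the example prompt above.

```python
def build_nuance_prompt(text: str, expression: str, target_language: str = "English") -> str:
    """Build a translation prompt that asks the model to explain one expression."""
    return (
        f"Translate the following text into {target_language}, "
        f"explaining the nuances of '{expression}': '{text}'"
    )


# Example: the expression is passed in its original Japanese form.
prompt = build_nuance_prompt(
    "旅行は楽しかったけど、やっぱり家に帰るとほっとするね。", "やっぱり"
)
```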
[0284] The flow of the specific process in Application Example 1 will be explained using FIG. 12.
[0285] Step 1:
[0286] The user inputs the Japanese text they wish to translate on their terminal. This input is sent by the terminal to the server. The input data is natural language text.
[0287] Step 2:
[0288] The server analyzes the received Japanese text using a morphological analysis tool (MeCab). Through this analysis, the word segmentation and grammatical structure of the text are identified. As a result, specific turns of phrase and expressions in Japanese are identified.
[0289] Step 3:
[0290] The server uses the Google Cloud Translation API or OpenAI API to perform translations based on the analyzed data. This process incorporates contextual information to obtain translation results that take into account cultural background and linguistic nuances.
[0291] Step 4:
[0292] The server uses a generative AI model to generate additional explanations related to the translation results. The generated explanations supplement the background and original intent of the translation, and also refer to cultural contexts.
[0293] Step 5:
[0294] The server sends the generated translation results and additional explanations to the user's device. The user's device displays this information on the screen in real time. This allows the user to view the translation results immediately.
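Steps 2 to 5 above can be sketched as a small pipeline. The sketch assumes each stage is supplied as a callable, so that a morphological analyzer, a translation API client, or a generative AI client can be plugged in without changing the flow. All names here (TranslationResult, run_pipeline, and the stand-in callables) are hypothetical, and no real API is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    translation: str
    explanation: str


def run_pipeline(
    text: str,
    analyze: Callable[[str], list[str]],
    translate: Callable[[str, list[str]], str],
    explain: Callable[[str, list[str]], str],
) -> TranslationResult:
    """Run analysis, translation, and explanation in order and bundle the output."""
    expressions = analyze(text)                      # Step 2: identify expressions
    translation = translate(text, expressions)       # Step 3: context-aware translation
    explanation = explain(translation, expressions)  # Step 4: additional explanation
    return TranslationResult(translation, explanation)  # Step 5: payload for the terminal


# Stand-in callables used only to show the data flow.
demo = run_pipeline(
    "富士山はやっぱり綺麗ですね",
    analyze=lambda t: ["やっぱり"] if "やっぱり" in t else [],
    translate=lambda t, exprs: "Fuji is really beautiful, isn't it?",
    explain=lambda tr, exprs: f"Expressions needing explanation: {', '.join(exprs)}",
)
```

Passing the stages as parameters keeps the ordering of the steps separate from the choice of translation backend.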
[0295] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0296] This invention relates to a system for translating from Japanese to other languages, which has a function to adjust the translation content while taking the user's emotions into consideration. The user inputs the Japanese text they wish to have translated using a terminal. The terminal sends the input data to a server, which then receives the data.
[0297] The server first performs morphological analysis, breaking down and analyzing the part of speech and grammatical structure of each word in the input text. From this analysis, it identifies Japanese-specific expressions and phrases and understands the context. Then, it performs translation using natural language processing to generate natural-sounding expressions in the other language. In this translation process, it considers not only literal translation but also nuance.
[0298] Furthermore, this system utilizes an emotion engine. The server analyzes the user's emotions from the input text using the emotion engine and adjusts the nuances of the translation based on those emotions. For example, if it is determined that the user has positive emotions, the translation result will be adjusted to reflect those positive nuances. In addition, supplementary information related to emotions is generated and included in the translation result.
[0299] Finally, the server sends the translation results, which take the user's emotions into consideration, along with any additional information generated as needed, to the terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
[0300] For example, if a user enters the text "It's going to be a great day," the server's emotion engine recognizes this text as expressing a positive emotion. Then, when translating it into English, it produces a translation that reflects the emotion, such as "It's going to be an amazing day!" This helps convey the user's emotions accurately to the recipient.
[0301] The following describes the processing flow.
[0302] Step 1:
[0303] The user enters the Japanese text they want translated using their terminal. The terminal then sends this input to the server.
[0304] Step 2:
[0305] The server receives text sent from the terminal. The received text is preprocessed to prepare it for parsing. Specifically, noise reduction and character normalization are performed.
[0306] Step 3:
[0307] The server performs morphological analysis, splitting the input text into words and analyzing their parts of speech and grammatical structure. This enables the identification of expressions and turns of phrase unique to Japanese.
[0308] Step 4:
[0309] The server's emotion engine recognizes the user's emotion from the analyzed text. The emotion engine determines the emotion based on emotional keywords and phrases contained in the text.
[0310] Step 5:
[0311] Based on the recognized emotion information, the server adjusts the nuance of the translation. A translation that reflects the context and nuance according to the emotion is performed.
[0312] Step 6:
[0313] Based on the context and emotion, the server generates a natural translation into the foreign language. If necessary, supplementary information regarding cultural background and nuances of expression is also generated.
[0314] Step 7:
[0315] The server transmits the generated translation result and supplementary information to the user's terminal. The terminal displays the received information to the user, enabling the user to view the translation content and related information.
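Steps 4 and 5 of this flow describe emotion recognition based on emotional keywords, followed by nuance adjustment. The following is a minimal sketch of that idea, not the emotion engine or emotion identification model 59 itself. The keyword lists are illustrative assumptions, and the adjustment rule (turning a positive sentence into an exclamation) is a deliberately simple stand-in for adjustment by a generative AI model.

```python
# Illustrative keyword lists (assumptions, not taken from this disclosure).
POSITIVE_KEYWORDS = ("楽しい", "嬉しい", "最高", "素晴らしい")
NEGATIVE_KEYWORDS = ("悲しい", "つらい", "最悪", "残念")


def estimate_emotion(text: str) -> str:
    """Classify text as positive, negative, or neutral by counting keywords."""
    positive = sum(text.count(keyword) for keyword in POSITIVE_KEYWORDS)
    negative = sum(text.count(keyword) for keyword in NEGATIVE_KEYWORDS)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


def adjust_nuance(translation: str, emotion: str) -> str:
    """Toy adjustment: make a positive sentence exclamatory, leave others as-is."""
    if emotion == "positive" and translation.endswith("."):
        return translation[:-1] + "!"
    return translation
```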
[0316] (Example 2)
[0317] Next, Example 2 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".
[0318] Conventional translation systems have tended to produce only literal translations and have had difficulty reflecting the nuances and emotions carried between languages. A translation method that reflects the user's emotions has therefore been in demand. It is also important to provide appropriate supplementary information based on cultural background.
[0319] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0320] In this invention, the server includes means for receiving input natural language information, means for analyzing the word structure of the received natural language information, means for translating the natural language into a different language based on the analyzed word structure and context, means for analyzing sentiment during the translation process and adjusting the nuances of the translated content, and means for generating supplementary information related to the translated content. This makes it possible to provide natural translation results that reflect the user's emotions and to present sentiment-based supplementary information together.
[0321] "Natural language information" refers to text and audio data expressed in the language that humans use on a daily basis.
[0322] "Word structure" refers to the results of analyzing the part of speech, grammatical role, and conjugation forms of words in natural language.
[0323] "Context" refers to the linguistic and cultural background in which a particular word or phrase is used, as well as the content of the surrounding text.
[0324] "Translation means" refers to a method or device for converting information expressed in one language into information with equivalent meaning in a different language.
[0325] "Methods for analyzing emotions" refer to processing and techniques used to identify the emotional tone and intentions of a writer or speaker from information such as text and audio.
[0326] "Methods for adjusting nuances" refer to methods of making subtle adjustments to wording and expressions in the translated result to reflect the emotional and cultural characteristics of the original language.
[0327] "Means of generating supplementary information" refers to techniques that create information and annotations to aid understanding, in addition to the basic translation result.
[0328] This invention is a system that provides a translation that takes into account the sentiment of the natural language text input by the user. The user inputs the text they wish to have translated using a terminal. The terminal sends the input data to the server.
[0329] The server first uses "morphological analysis software" to perform morphological analysis. This software divides the input natural language information into word units and analyzes the part of speech and grammatical role of each word. Based on the results of this analysis, it understands the context and uses a natural language processing engine to translate into another language. A "translation engine" may be used for translation, but in this case, it aims for a natural translation that is appropriate to the context, rather than a literal translation.
[0330] Furthermore, the server uses a "sentiment analysis engine" to analyze the user's emotions from the input text. Based on the analyzed emotional information, it adjusts the nuances of the translated text to reflect the user's emotions. At this time, supplementary information related to the emotions is generated and sent to the terminal along with the translation result.
[0331] The terminal displays the final adjusted translation and supplementary information to the user for confirmation. For example, if the user enters text meaning "It's going to be a great day," the server detects the positive emotion in the text and provides a correspondingly adjusted translation: "It's going to be an amazing day!"
[0332] An example of a prompt for a generative AI model is: "Translate the following Japanese text into English, reflecting the sentiment of the text: 'It's going to be a great day.'"
[0333] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0334] Step 1:
[0335] The user inputs the Japanese text for which translation is desired into the terminal. For example, the user inputs "Today is a very enjoyable day." The terminal sends this input data to the server. The input is Japanese text, and the output is the data transmitted to the server.
[0336] Step 2:
[0337] The server performs morphological analysis on the natural language information received from the terminal. Specifically, the server uses morphological analysis software to analyze the Japanese text word by word and identify each word's part of speech and the grammatical structure. The input is the Japanese text received from the terminal, and the output is the analysis result, namely the part-of-speech information and grammatical structure of the words.
[0338] Step 3:
[0339] The server performs translation using the natural language processing engine based on the analyzed part-of-speech information and grammatical structure. At this stage, the context of the input text is understood, and a translation that captures nuance rather than a merely literal rendering is generated. For example, the Japanese sentence entered in Step 1 is translated as "Today is a very enjoyable day." The input is the result of morphological analysis, and the output is the translated English text.
[0340] Step 4:
[0341] The server analyzes the user's sentiment using the sentiment analysis engine in parallel with the translation. Here, a positive sentiment is detected from the expression "very enjoyable". The input is the original Japanese text, and the output is the result of sentiment analysis, which is the information of positive sentiment.
[0342] Step 5:
[0343] The server adjusts the nuances of the translated text based on the sentiment analysis results. In this case, it adjusts it to emphasize positive emotions more, such as "Today is a wonderfully enjoyable day!". The input is the translated English text and the sentiment analysis results, and the output is the sentiment-adjusted translated text.
[0344] Step 6:
[0345] The server generates supplementary information about emotions and adds it to the translation result. Here, it generates information to supplement positive emotions. The input is the emotion-adjusted translation text, and the output is the final translation result with supplementary information.
[0346] Step 7:
[0347] The server sends the final adjusted translation results and supplementary information to the terminal. The terminal receives this and displays it to the user. The user can then review the translated content and related information. The input is the data sent from the server, and the output is the information displayed to the user.
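Because Step 4 runs in parallel with the translation in Step 3, the flow can be sketched with two concurrent tasks whose results are combined in Step 5. The sketch below uses a thread pool from the Python standard library. The function name, the callables, and the shape of the returned dictionary are assumptions for illustration, and the morphological analysis of Step 2 is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_with_sentiment(
    text: str,
    translate: Callable[[str], str],
    analyze_sentiment: Callable[[str], str],
    adjust: Callable[[str, str], str],
) -> dict[str, str]:
    """Run translation and sentiment analysis concurrently, then adjust the result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        translation_future = pool.submit(translate, text)        # Step 3
        sentiment_future = pool.submit(analyze_sentiment, text)  # Step 4
        translation = translation_future.result()
        sentiment = sentiment_future.result()
    adjusted = adjust(translation, sentiment)                    # Step 5
    return {                                                     # Step 6
        "translation": adjusted,
        "sentiment": sentiment,
        "note": f"Detected sentiment: {sentiment}",
    }
```

Threads suit this sketch because both tasks would typically wait on network calls to external engines. Other concurrency models would work equally well.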
[0348] (Application Example 2)
[0349] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0350] In translation between different languages, it is crucial to provide translations that not only perform grammatical conversions but also appropriately reflect the user's emotions. However, conventional systems lack the nuance adjustments necessary to take user emotions into account, sometimes resulting in misrepresentation of the user's intentions and feelings. To address this challenge, there is a need for translation systems that take emotions into account.
[0351] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0352] In this invention, the server includes means for receiving input character data, means for analyzing a specific syntax from the received character data, means for converting the character data into different formats based on the analyzed syntax and context, means for analyzing the user's emotions and adjusting the nuances of the conversion based on those emotions, and means for outputting the adjusted conversion results and explanations. This allows for the provision of translation results that reflect the user's emotions, enabling more accurate communication of intent.
[0353] "Inputted character data" refers to the language information that the user provides to the system for morphological analysis and translation.
[0354] A "means of receiving" refers to an element that has the function of receiving data sent by the user within the system.
[0355] A "means for parsing a specific syntax" refers to an element that breaks down input character data into word and sentence structures and processes them to understand their grammar and meaning.
[0356] "Means of converting to different formats" refers to elements that have the function of appropriately translating or converting analyzed character data into other languages or formats.
[0357] A "means of analyzing emotions" refers to an element that has the function of detecting and analyzing emotional nuances and tones from user input data.
[0358] "Means of adjusting nuance" refer to elements that modify the tone and expression of the translation result based on detected emotions, thereby maintaining the intended emotion.
[0359] "Means for outputting adjusted conversion results and explanations" refers to elements that have the function of presenting the converted content and related supplementary information to the user.
[0360] This invention provides a multilingual translation system that takes user emotions into consideration. The system is implemented via a user terminal such as a smartphone or smart glasses. The user inputs natural language text data into the terminal, and sentiment analysis and translation processing are performed on the server. Upon receiving the text data, the server first performs morphological analysis. Specifically, it uses analysis software running in a Python environment to analyze the parts of speech and syntactic structure of the words and phrases that make up the text.
[0361] The server then uses a natural language processing engine to translate into different languages. During this process, it employs a generative AI model to generate translations that respect both context and syntax. Furthermore, it analyzes the user's emotions through an emotion engine and performs nuance adjustments to reflect these emotions in the translated text.
[0362] Based on the emotions detected from the natural language text entered by the user, the translation is adjusted so that the result conveys those emotions. This translation result, along with additional information based on the emotions, is sent to the terminal and presented to the user. For example, if the user enters text meaning "It's going to be a great day," the system recognizes the positive emotion and provides the translation "It's going to be an amazing day!"
[0363] An example of a prompt is, "Retranslate the product description to reflect the user's emotions in a positive way." This system makes it easy to perform translations that accurately reflect the user's feelings.
[0364] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0365] Step 1:
[0366] The terminal receives natural language text data entered by the user. This input data reflects the user's intent. The terminal then prepares to send this data to the server.
[0367] Step 2:
[0368] The server receives natural language text sent from the terminal. The received text data is input into a morphological analysis system, which analyzes the part of speech and grammatical structure of each word. The output of this analysis is syntactic information and context of the text.
[0369] Step 3:
[0370] The server uses syntactic information obtained from morphological analysis to translate text into different languages using a generative AI model. At this stage, it generates appropriate translation candidates based on the input syntactic information. The output is the translated text data.
[0371] Step 4:
[0372] After the translation process, the server uses an emotion analysis engine to analyze the emotions contained in the user's input text. At this stage, it generates data for nuance adjustment based on the emotion information extracted from the text.
[0373] Step 5:
[0374] The server uses information obtained from sentiment analysis to adjust the nuances of the translated text. Specifically, it uses prompt sentences provided to the generative AI model to modify the translation results so that they reflect the user's emotions. The output is the adjusted translated text.
[0375] Step 6:
[0376] The server sends the adjusted translated text and related supplementary information to the terminal. The terminal receives this and displays a translation result that reflects the sentiment to the user. This allows the user to confirm that the translation takes sentiment into account.
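The data sent to the terminal in Step 6 could be serialized in many ways. As one possibility, the sketch below assumes a JSON payload with four fields. The class name, field names, and the choice of JSON are all assumptions made for illustration and are not specified in this disclosure.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AdjustedTranslation:
    source_text: str         # text entered by the user
    translated_text: str     # emotion-adjusted translation
    emotion: str             # label produced by emotion analysis
    supplementary_info: str  # explanation shown alongside the translation


def to_payload(result: AdjustedTranslation) -> str:
    """Serialize the result for transmission; keep non-ASCII text readable."""
    return json.dumps(asdict(result), ensure_ascii=False)


def from_payload(payload: str) -> AdjustedTranslation:
    """Rebuild the result on the terminal side."""
    return AdjustedTranslation(**json.loads(payload))
```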
[0377] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0378] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions in the prompt and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0379] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0380] [Third Embodiment]
[0381] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0382] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0383] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0384] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0385] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0386] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0387] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0388] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0389] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0390] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0391] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0392] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0393] This invention forms a system for natural and effective translation from Japanese to other languages. The user inputs the Japanese text they wish to translate using a terminal. The terminal sends this input to a server, which analyzes the received text. The server breaks down words and grammatical structures through morphological analysis, identifying expressions and important phrases unique to Japanese. Then, considering the surrounding context of the text, it extracts the meaning and intent necessary for translation.
[0394] Based on the analysis results, the server translates Japanese expressions and nuances into a foreign language in a way that is easy to understand. This translation process goes beyond literal translation, taking into account cultural background and nuances of expression. In particular, Japanese expressions such as "yappari" (after all) and "toriaezu" (for now) can be interpreted differently depending on the situation, so additional explanations are generated to appropriately convey the intended meaning.
[0395] As a result, the server sends the generated translation and additional explanations to the user's terminal and displays the results in the format specified by the user. This system enables smooth communication that transcends language barriers.
[0396] For example, if a user enters the text "It started raining, so I think we should cancel today after all." in Japanese, the server will pay attention to the expression "after all" and interpret it as meaning a reconfirmation of the change of plans. When translating this into English, it will output the result with a supplementary explanation that includes the nuance, such as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)." This ensures that foreign language recipients can understand the speaker's original intent without misunderstanding.
[0397] The following describes the processing flow.
[0398] Step 1:
[0399] The user enters the Japanese text they want translated into the terminal. The terminal then sends the entered text to the server.
[0400] Step 2:
[0401] The server receives text sent from the terminal. The server preprocesses the received data into a parseable format. Specifically, it performs noise reduction and character normalization (e.g., unifying half-width and full-width characters).
[0402] Step 3:
[0403] The server performs morphological analysis, breaking down the part of speech and constituent elements of each word in the text. This identifies expressions and keywords unique to the Japanese language.
[0404] Step 4:
[0405] The server interprets the meaning and intent of an expression based on the context and surrounding information of the sentence. Here, data models are used to understand specific phrases and nuances.
[0406] Step 5:
[0407] The server performs natural and appropriate translations into foreign languages based on the analysis results. This process involves not only literal translations but also translations that are context-aware and natural.
[0408] Step 6:
[0409] The server generates supplementary explanations that convey nuances and cultural background that cannot be fully expressed through translation alone. These explanations help the recipient understand the intended meaning more accurately.
[0410] Step 7:
[0411] The server sends the generated translation results and supplementary explanations to the terminal. The terminal displays the received information to the user, allowing the user to confirm the results.
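The preprocessing in Step 2 of this flow (noise reduction and unification of half-width and full-width characters) can be sketched with the Python standard library. Unicode NFKC normalization is used here as one common way to unify character widths. Treating control characters and repeated whitespace as "noise" is an assumption, since this disclosure does not define the term.

```python
import re
import unicodedata

# Control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def preprocess(text: str) -> str:
    """Normalize character widths, drop control characters, collapse whitespace."""
    # NFKC maps full-width alphanumerics to half-width and
    # half-width katakana to full-width.
    text = unicodedata.normalize("NFKC", text)
    text = _CONTROL_CHARS.sub("", text)
    return re.sub(r"\s+", " ", text).strip()
```

Normalizing before morphological analysis helps ensure that variant spellings of the same word are segmented and looked up consistently.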
[0412] (Example 1)
[0413] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset terminal 314 will be referred to as the "terminal."
[0414] In today's global society, multilingual communication is crucial, but simple translation often fails to accurately convey cultural nuances and intentions. Therefore, there is a need for techniques that enable natural and effective communication between different languages. In particular, languages with unique expressions, such as Japanese, require detailed translations and supplementary explanations based on context and background.
[0415] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0416] In this invention, the server includes means for analyzing received natural language data and breaking it down into words and grammatical structures, means for extracting specific expressions and elements from the analysis results and translating them while considering the context, and means for providing supplementary information to the generated translation based on cultural background and nuances of expression. This enables natural and effective communication between different languages.
[0417] "Natural language data" refers to information expressed in the language that people use on a daily basis.
[0418] "Analysis" is the process of breaking down input data and classifying and organizing it in order to understand its structure and meaning.
[0419] "Deconstructing words and grammatical structures" means dividing a text into individual words and grammatical elements and clarifying the role of each.
[0420] "Extracting unique expressions and elements" means identifying and extracting phrases and important information specific to that language.
[0421] "Translating while considering context" means understanding the meaning from the context in which the text is used and the surrounding sentences, and then appropriately translating it into a different language.
[0422] "Providing supplementary information based on cultural background and nuances of expression" means taking into account different cultural backgrounds and subtle linguistic meanings during translation and adding information to explain them.
[0423] "Formatting and outputting in an output format" means converting the translation results and supplementary information into a format that is easy for the reader to understand, and then displaying or distributing them.
[0424] A "system" is a collection of multiple components or processes that work together to achieve a specific purpose.
[0425] This invention provides a system for natural and effective translation from one natural language into another. The user inputs the Japanese text they wish to translate into the terminal. The terminal converts the input text into an appropriate data structure and sends it to the server.
[0426] The server analyzes the received Japanese text using morphological analysis software. This analysis utilizes a "language analysis tool," which breaks down the text into individual words and extracts its grammatical structure. The server then uses the analysis results to detect Japanese-specific expressions and important phrases. To extract contextual meaning and intent, it uses a natural language processing library called the "language understanding library."
[0427] After analysis, the server uses a generative AI model to perform translations that take into account cultural background and nuances of expression. Specifically, it calls a "translation API" to obtain an appropriate translation and to request supplementary explanations for that translation. This conveys detailed meaning that simple translation alone would miss.
[0428] Finally, the server formats the generated translation and supplementary explanations in the specified format and sends them to the terminal. The user can then check the translation displayed on the terminal, enabling smooth communication between different languages.
[0429] For example, if a user types a Japanese sentence meaning "It's started raining, so I think we should cancel today after all," the server interprets the expression rendered here as "after all" ("yappari") as a confirmation of a change in plans, translates the sentence into English as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)," and sends it to the terminal.
[0430] An example of a prompt to input into the generative AI model is as follows: "Translate the following Japanese sentence into English and add supplementary explanations to convey the nuance: 'It's started raining, so I think I'll just cancel today.'"
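As a minimal sketch of how such a prompt could be assembled, the wording of the example prompt above can be turned into a template. The function name build_translation_prompt and the target_language parameter are assumptions made for illustration. The disclosure does not specify how the prompt is constructed.

```python
def build_translation_prompt(source_text: str, target_language: str = "English") -> str:
    """Fill the example prompt wording with the text to be translated."""
    return (
        f"Translate the following Japanese sentence into {target_language} "
        "and add supplementary explanations to convey the nuance: "
        f"'{source_text}'"
    )


if __name__ == "__main__":
    # The sample sentence is the Japanese text quoted in Example 2.
    print(build_translation_prompt("今日はとても楽しい一日です"))
```

Keeping the instruction text separate from the user's sentence makes it straightforward to change the target language without touching the rest of the flow.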
[0431] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0432] Step 1:
[0433] The user enters the natural language text they want translated into the terminal. The user types sentences in Japanese or another language into a text field, and the entered text data is temporarily stored on the terminal. For example, if the user enters "It's started raining, so I think I'll cancel today," this sentence is stored on the terminal.
[0434] Step 2:
[0435] The terminal converts the input text into a data structure and sends it to the server. Specifically, it serializes the text into JSON or XML format and sends it to the server as an HTTP request. A structured format is used so that the server can easily parse the input. For example, the input sentence is sent to the server in JSON format such as {"text": "It's started raining, so I think I'll cancel today."}.
[0436] Step 3:
[0437] The server analyzes the received natural language data, breaking it down into words and grammatical structures. Using a "morphological analysis tool," the server breaks down the text into smaller parts, identifying each word and its part of speech. From the input text, the server obtains word-level analysis results, such as "ame (rain) / noun," "ga / particle," and "furu (to fall) / verb." This reveals the grammatical structure of the text.
[0438] Step 4:
[0439] The server extracts unique expressions and elements from the analysis results and begins the translation process, taking context into consideration. Based on the analysis results, it identifies Japanese-specific expressions (e.g., "yappari") and considers their meaning and intent. The server uses a "natural language understanding library" to gain a deeper understanding of the context and composes a prompt sentence in a format that can be input into the generative AI model. The output of this process is an appropriate translation candidate that is relevant to the context.
[0440] Step 5:
[0441] The server performs translation using a generative AI model. The model receives prompts and generates translation results that reflect the context and nuances. Specifically, it utilizes a language model API to perform translations according to the instructed prompt sentences. The generated output is a translation such as, "It started raining, so I think we should cancel our plans today. ('やっぱり' indicates a reconfirmation of the decision)."
[0442] Step 6:
[0443] The server formats the generated translation results into the specified output format and sends them to the terminal. The formatting process converts the data to formats such as HTML or plain text, making it immediately readable by the user. The server returns the formatted data to the terminal as an HTTP response.
[0444] Step 7:
[0445] The terminal parses the data received from the server and displays it to the user. Specifically, it displays the received translation results in the appropriate widget so that the user can view them. The user can then check the translation results on that screen and use the information as needed.
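The data handling in Steps 2, 4, and 6 of the flow above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation. The Token class, the SPECIFIC_EXPRESSIONS set, and all function names are hypothetical, and the token list stands in for the output of whatever morphological analysis tool is actually used.

```python
import json
from dataclasses import dataclass


@dataclass
class Token:
    surface: str  # the word as written
    pos: str      # part of speech


# Hypothetical lookup set; the text above names only "yappari" as an example.
SPECIFIC_EXPRESSIONS = {"やっぱり"}


def serialize_request(text: str) -> str:
    """Step 2 (terminal side): wrap the input text in a JSON body."""
    return json.dumps({"text": text}, ensure_ascii=False)


def parse_request(body: str) -> str:
    """Step 2 (server side): read the text field back out of the JSON body."""
    return json.loads(body)["text"]


def find_specific_expressions(tokens: list[Token]) -> list[str]:
    """Step 4: keep the tokens that are flagged as Japanese-specific."""
    return [t.surface for t in tokens if t.surface in SPECIFIC_EXPRESSIONS]


def format_response(translation: str, explanation: str) -> str:
    """Step 6: render the result as plain text for display on the terminal."""
    return f"{translation}\n({explanation})"


if __name__ == "__main__":
    # Illustrative tokens only, in the spirit of the word-level results in Step 3.
    tokens = [Token("雨", "noun"), Token("が", "particle"), Token("やっぱり", "adverb")]
    print(find_specific_expressions(tokens))
```

Using json.dumps rather than hand-built strings avoids quoting problems when the input sentence itself contains apostrophes or quotation marks.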
[0446] (Application Example 1)
[0447] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0448] Conventional translation systems struggle to accurately interpret expressions and contexts unique to Japanese, resulting in a failure to adequately convey cultural nuances in communication between users of different languages. Furthermore, the lack of effective means to display real-time translations and their supplementary explanations across different devices hinders smooth multi-platform communication.
[0449] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0450] In this invention, the server includes means for receiving input natural language data, means for analyzing specific expressions from the received natural language data, and means for translating the natural language into a different language based on the analyzed expressions and context. This enables the provision of supplementary information related to Japanese-specific expressions and cultural background, and further allows the translation results to be displayed in real time on different devices.
[0451] "Input natural language data" refers to the initial language text that the user sends to the server for translation.
[0452] "Means of receiving" means a method or device for properly importing natural language data into a server.
[0453] "Means for analyzing specific expressions" refers to methods for identifying important phrases and expressions contained within natural language data and analyzing their structure.
[0454] "Contextual translation" refers to a method of accurate translation that takes into account the meaning and nuances of the surrounding text.
[0455] "Means for generating additional explanations" refers to methods for automatically creating supplementary information to clarify the cultural context and intent behind the translated content.
[0456] "Means for displaying translations in real time on different devices" refers to methods for instantly displaying translated results on a variety of devices.
[0457] A "generative AI model" refers to an algorithm or platform that uses artificial intelligence technology to process data and generate output corresponding to a specific task.
[0458] The system that implements this application primarily consists of a server and a user's terminal. The server receives natural language data entered by the user on their terminal and analyzes that data. Morphological analysis tools such as MeCab are used for the analysis, which identifies grammatical structures and specific expressions. Based on this information, the server translates the natural language into another language using the Google Cloud Translation API or the OpenAI API.
[0459] The server also generates additional explanations about the translated content. These explanations are created in the backend using a generative AI model and supplement cultural context and linguistic nuances. The generated translations and supplementary information are sent to the user's device and displayed in real time on different devices.
[0460] As a concrete example, if a foreign tourist wants to translate Japanese tourist information into English, they might input a sentence like "Mount Fuji is beautiful after all." The server analyzes the expression "after all" and provides a translation that includes its cultural implications. In this case, a translation such as "Fuji is really beautiful, isn't it? ('After all' implies a reconfirmation of the opinion)" would be generated.
[0461] An example of a prompt sentence is as follows.
[0462] For the Japanese text meaning "The trip was fun, but it's really reassuring to be back home," the prompt is written as follows:
[0463] "Translate the following text into English, explaining the nuances of 'yappari': 'The trip was fun, but I still feel relieved when I get home.'"
[0464] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0465] Step 1:
[0466] The user enters the Japanese text they wish to have translated on their device. This input is then sent to the server by the device. The entered data is in natural language text.
[0467] Step 2:
[0468] The server analyzes the received Japanese text using a morphological analysis tool (MeCab). This analysis identifies word boundaries and grammatical structures within the text. This allows for the identification of specific Japanese phrases and expressions.
[0469] Step 3:
[0470] The server uses the Google Cloud Translation API or OpenAI API to perform translations based on the analyzed data. This process incorporates contextual information to obtain translation results that take into account cultural background and linguistic nuances.
[0471] Step 4:
[0472] The server uses a generative AI model to generate additional explanations related to the translation results. The generated explanations supplement the background and original intent of the translation, and also refer to cultural contexts.
[0473] Step 5:
[0474] The server sends the generated translation results and additional explanations to the user's device. The user's device displays this information on the screen in real time. This allows the user to view the translation results immediately.
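The five steps above can be summarized in a minimal sketch in which each stage is passed in as a function. The names run_pipeline and TranslationResult are hypothetical. The callables are stand-ins for the morphological analysis tool, the translation API, and the generative AI model. No real API calls are shown because the disclosure does not specify them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    translation: str
    explanation: str


def run_pipeline(
    text: str,
    analyze: Callable[[str], list[str]],
    translate: Callable[[str, list[str]], str],
    explain: Callable[[str, str], str],
) -> TranslationResult:
    """Chain analysis, translation, and explanation (Steps 2 to 4)."""
    expressions = analyze(text)                 # Step 2: morphological analysis
    translation = translate(text, expressions)  # Step 3: contextual translation
    explanation = explain(text, translation)    # Step 4: additional explanation
    # Step 5 would send this result to the user's terminal.
    return TranslationResult(translation, explanation)


if __name__ == "__main__":
    # Stub stages used only to show the call shape.
    result = run_pipeline(
        "sample input",
        analyze=lambda t: t.split(),
        translate=lambda t, exprs: t.upper(),
        explain=lambda t, tr: f"{len(tr)} characters translated",
    )
    print(result)
```

Passing the stages as parameters keeps the sketch independent of any particular analyzer or translation service, so each can be swapped without changing the flow.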
[0475] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0476] This invention relates to a system for translating from Japanese to other languages, which has a function to adjust the translation content while taking the user's emotions into consideration. The user inputs the Japanese text they wish to have translated using a terminal. The terminal sends the input data to a server, which then receives the data.
[0477] The server first performs morphological analysis, breaking down and analyzing the part of speech and grammatical structure of each word in the input text. From this analysis, it identifies Japanese-specific expressions and phrases and understands the context. Then, it performs translation using natural language processing to generate natural-sounding expressions in the other language. In this translation process, it considers not only literal translation but also nuance.
[0478] Furthermore, this system utilizes an emotion engine. The server analyzes the user's emotions from the input text using the emotion engine and adjusts the nuances of the translation based on those emotions. For example, if it is determined that the user has positive emotions, the translation result will be adjusted to reflect those positive nuances. In addition, supplementary information related to emotions is generated and included in the translation result.
[0479] Finally, the server sends the translation results, which take the user's emotions into consideration, along with any additional information generated as needed, to the terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
[0480] For example, if a user enters the text "It's going to be a great day," the server's emotion engine recognizes that this text expresses a positive emotion. Then, when translating it into English, it produces a translation that reflects the emotion, such as "It's going to be an amazing day!" This helps convey the user's emotions accurately to the recipient.
[0481] The following describes the processing flow.
[0482] Step 1:
[0483] The user enters the Japanese text they want translated using their device. The device then sends this input to the server.
[0484] Step 2:
[0485] The server receives text sent from the terminal. The received text is preprocessed to prepare it for parsing. Specifically, noise reduction and character normalization are performed.
[0486] Step 3:
[0487] The server performs morphological analysis, breaking down the words in the input text and analyzing their parts of speech and grammatical structure. This allows it to identify expressions and phrases unique to the Japanese language.
[0488] Step 4:
[0489] The server's emotion engine recognizes the user's emotions from the analyzed text. The emotion engine determines emotions based on emotional keywords and phrases contained in the text.
[0490] Step 5:
[0491] The server adjusts the nuances of the translation based on the recognized emotion information. It performs translations that reflect the context and nuances appropriate to the emotions.
[0492] Step 6:
[0493] The server generates natural-sounding translations into foreign languages based on context and emotion. It also generates supplementary information about cultural background and nuances of expression, as needed.
[0494] Step 7:
[0495] The server sends the generated translation results and supplementary information to the user's terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
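The keyword-based emotion determination described in Step 4 of the flow above can be illustrated with a minimal sketch. The keyword lists and the function name estimate_emotion are assumptions made for illustration. The disclosure does not list the keywords or describe the scoring used by the emotion engine.

```python
# Hypothetical keyword lists; a real emotion engine would use a larger
# lexicon or a trained model such as the emotion identification model.
POSITIVE_KEYWORDS = ("楽しい", "嬉しい", "最高")
NEGATIVE_KEYWORDS = ("悲しい", "つらい", "残念")


def estimate_emotion(text: str) -> str:
    """Label text as positive, negative, or neutral by counting keywords."""
    positive_hits = sum(text.count(k) for k in POSITIVE_KEYWORDS)
    negative_hits = sum(text.count(k) for k in NEGATIVE_KEYWORDS)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(estimate_emotion("今日はとても楽しい一日です"))
```

Plain substring counting ignores negation and conjugated forms, which is one reason the morphological analysis of Step 3 would normally feed this stage.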
[0496] (Example 2)
[0497] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0498] Traditional translation systems have the problem of being limited to literal translations and having difficulty considering nuances and emotions between languages. Therefore, there is a need for translation methods that reflect the user's emotions. It is also important to provide appropriate supplementary information based on cultural background.
[0499] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0500] In this invention, the server includes means for receiving input natural language information, means for analyzing the word structure of the received natural language information, means for translating the natural language into a different language based on the analyzed word structure and context, means for analyzing sentiment during the translation process and adjusting the nuances of the translated content, and means for generating supplementary information related to the translated content. This makes it possible to provide natural translation results that reflect the user's emotions and to present sentiment-based supplementary information together.
[0501] "Natural language information" refers to text and audio data expressed in the language that humans use on a daily basis.
[0502] "Word structure" refers to the results of analyzing the part of speech, grammatical role, and conjugation forms of words in natural language.
[0503] "Context" refers to the linguistic and cultural background in which a particular word or phrase is used, as well as the content of the surrounding text.
[0504] "Translation means" refers to a method or device for converting information expressed in one language into information with equivalent meaning in a different language.
[0505] "Methods for analyzing emotions" refer to processing and techniques used to identify the emotional tone and intentions of a writer or speaker from information such as text and audio.
[0506] "Methods for adjusting nuances" refer to methods of making subtle adjustments to wording and expressions in the translated result to reflect the emotional and cultural characteristics of the original language.
[0507] "Means of generating supplementary information" refers to techniques that create information and annotations to aid understanding, in addition to the basic translation result.
[0508] This invention is a system that provides a translation that takes into account the sentiment of the natural language text input by the user. The user inputs the text they wish to have translated using a terminal. The terminal sends the input data to the server.
[0509] The server first uses "morphological analysis software" to perform morphological analysis. This software divides the input natural language information into word units and analyzes the part of speech and grammatical role of each word. Based on the results of this analysis, it understands the context and uses a natural language processing engine to translate into another language. A "translation engine" may be used for translation, but in this case, it aims for a natural translation that is appropriate to the context, rather than a literal translation.
[0510] Furthermore, the server uses a "sentiment analysis engine" to analyze the user's emotions from the input text. Based on the analyzed emotional information, it adjusts the nuances of the translated text to reflect the user's emotions. At this time, supplementary information related to the emotions is generated and sent to the terminal along with the translation result.
[0511] The terminal displays the final adjusted translation and supplementary information to the user, allowing them to confirm it. For example, if the user enters the text "It's going to be an amazing day," the server analyzes the positive emotion in this text and provides a corresponding translation: "It's going to be an amazing day!"
[0512] An example of a prompt for a generative AI model is: "Translate the following Japanese text into English, reflecting the sentiment of the text: 'It's going to be a great day.'"
[0513] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0514] Step 1:
[0515] The user inputs the Japanese text for which translation is desired into the terminal. For example, the user inputs "Today is a very enjoyable day." The terminal sends this input data to the server. The input is Japanese text, and the output is the data sent to the server.
[0516] Step 2:
[0517] The server performs morphological analysis on the natural language information received from the terminal. Specifically, the server uses morphological analysis software to analyze the Japanese text word by word and identify the part of speech and grammatical structure. The input is the Japanese text received from the terminal, and the output is the analysis result, namely the part-of-speech information and grammatical structure of the words.
[0518] Step 3:
[0519] The server performs translation using the natural language processing engine based on the analyzed part-of-speech information and grammatical structure. At this stage, it understands the context of the input text and generates a translation that takes nuances into account rather than just a literal translation. For example, the Japanese sentence "今日はとても楽しい一日です" is translated as "Today is a very enjoyable day." The input is the result of morphological analysis, and the output is the translated English text.
[0520] Step 4:
[0521] The server analyzes the user's sentiment using the sentiment analysis engine in parallel with the translation. Here, it detects a positive sentiment from the expression "とても楽しい". The input is the original Japanese text, and the output is the sentiment analysis result, in this case an indication of positive sentiment.
[0522] Step 5:
[0523] The server adjusts the nuances of the translated text based on the sentiment analysis results. In this case, it adjusts it to emphasize positive emotions more, such as "Today is a wonderfully enjoyable day!". The input is the translated English text and the sentiment analysis results, and the output is the sentiment-adjusted translated text.
[0524] Step 6:
[0525] The server generates supplementary information about emotions and adds it to the translation result. Here, it generates information to supplement positive emotions. The input is the emotion-adjusted translation text, and the output is the final translation result with supplementary information.
[0526] Step 7:
[0527] The server sends the final adjusted translation results and supplementary information to the terminal. The terminal receives this and displays it to the user. The user can then review the translated content and related information. The input is the data sent from the server, and the output is the information displayed to the user.
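Steps 3 to 5 above, in which sentiment analysis runs in parallel with translation and the result is then adjusted, can be sketched as follows. All function names are hypothetical. The translation and sentiment functions are stubs standing in for the natural language processing engine and the sentiment analysis engine, and the adjustment rule is a toy that reproduces only the single example given in the text.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub lookup standing in for the natural language processing engine (Step 3).
_STUB_TRANSLATIONS = {"今日はとても楽しい一日です": "Today is a very enjoyable day."}


def translate(text: str) -> str:
    """Return the stub translation, or the input unchanged if unknown."""
    return _STUB_TRANSLATIONS.get(text, text)


def analyze_sentiment(text: str) -> str:
    """Stub for the sentiment analysis engine (Step 4)."""
    return "positive" if "楽しい" in text else "neutral"


def adjust_nuance(translation: str, sentiment: str) -> str:
    """Step 5: strengthen the wording when the sentiment is positive."""
    if sentiment != "positive":
        return translation
    emphasized = translation.replace(" very ", " wonderfully ")
    if emphasized.endswith("."):
        emphasized = emphasized[:-1] + "!"
    return emphasized


def process(text: str) -> str:
    """Run translation and sentiment analysis concurrently, then adjust."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        translation_future = pool.submit(translate, text)
        sentiment_future = pool.submit(analyze_sentiment, text)
        translation = translation_future.result()
        sentiment = sentiment_future.result()
    return adjust_nuance(translation, sentiment)


if __name__ == "__main__":
    print(process("今日はとても楽しい一日です"))
```

A thread pool is used here only because both stages would typically wait on external services. In the system described, the adjustment itself may be delegated to the generative AI model rather than to a fixed rule.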
[0528] (Application Example 2)
[0529] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0530] In translation between different languages, it is crucial to provide translations that not only perform grammatical conversions but also appropriately reflect the user's emotions. However, conventional systems lack the nuance adjustments necessary to take user emotions into account, sometimes resulting in misrepresentation of the user's intentions and feelings. To address this challenge, there is a need for translation systems that take emotions into account.
[0531] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0532] In this invention, the server includes means for receiving input character data, means for analyzing a specific syntax from the received character data, means for converting the character data into different formats based on the analyzed syntax and context, means for analyzing the user's emotions and adjusting the nuances of the conversion based on those emotions, and means for outputting the adjusted conversion results and explanations. This allows for the provision of translation results that reflect the user's emotions, enabling more accurate communication of intent.
[0533] "Inputted character data" refers to the language information that the user provides to the system for morphological analysis and translation.
[0534] A "means of receiving" refers to an element that has the function of receiving data sent by the user within the system.
[0535] A "means for parsing a specific syntax" refers to an element that breaks down input character data into word and sentence structures and processes them to understand their grammar and meaning.
[0536] "Means of converting to different formats" refers to elements that have the function of appropriately translating or converting analyzed character data into other languages or formats.
[0537] A "means of analyzing emotions" refers to an element that has the function of detecting and analyzing emotional nuances and tones from user input data.
[0538] "Means of adjusting nuance" refer to elements that modify the tone and expression of the translation result based on detected emotions, thereby maintaining the intended emotion.
[0539] "Means for outputting adjusted conversion results and explanations" refers to elements that have the function of presenting the converted content and related supplementary information to the user.
[0540] This invention provides a multilingual translation system that takes user emotions into consideration. The system is implemented via a user terminal such as a smartphone or smart glasses. The user inputs natural language text data into the terminal, and sentiment analysis and translation processing are performed on the server. Upon receiving the text data, the server first performs morphological analysis. Specifically, it uses analysis software running in a Python environment to analyze the parts of speech and syntactic structure of the words and phrases that make up the text.
[0541] The server then uses a natural language processing engine to translate into different languages. During this process, it employs a generative AI model to generate translations that take context and syntax into account. Furthermore, it analyzes the user's emotions through an emotion engine and performs nuance adjustments to reflect these emotions in the translated text.
[0542] Based on the emotions detected from the natural language text entered by the user, the translation is adjusted so that the result conveys those emotions. This translation result, along with additional information based on the emotions, is sent to the terminal and presented to the user. For example, if the user enters "It's going to be an amazing day," the system recognizes this positive emotion and provides the translation "It's going to be an amazing day!"
[0543] An example of a prompt is, "Retranslate the product description to reflect the user's emotions in a positive way." This system makes it easy to perform translations that accurately reflect the user's feelings.
[0544] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0545] Step 1:
[0546] The terminal receives natural language text data entered by the user. This input data reflects the user's intent. The terminal then prepares to send this data to the server.
[0547] Step 2:
[0548] The server receives natural language text sent from the terminal. The received text data is input into a morphological analysis system, which analyzes the part of speech and grammatical structure of each word. The output of this analysis is syntactic information and context of the text.
[0549] Step 3:
[0550] The server uses syntactic information obtained from morphological analysis to translate text into different languages using a generative AI model. At this stage, it generates appropriate translation candidates based on the input syntactic information. The output is the translated text data.
[0551] Step 4:
[0552] After the translation process, the server uses an emotion analysis engine to analyze the emotions contained in the user's input text. At this stage, it generates data for nuance adjustment based on the emotion information extracted from the text.
[0553] Step 5:
[0554] The server uses information obtained from sentiment analysis to adjust the nuances of the translated text. Specifically, it inputs prompt sentences into the generative AI model to modify the translation results so that they reflect the user's emotions. The output is the adjusted translated text.
[0555] Step 6:
[0556] The server sends the adjusted translated text and related supplementary information to the terminal. The terminal receives this and displays a translation result that reflects the sentiment to the user. This allows the user to confirm that the translation takes sentiment into account.
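The nuance adjustment in Step 5 above relies on a prompt given to the generative AI model. A minimal sketch of how such a prompt might be composed from the outputs of Steps 3 and 4 is shown below. The function name build_adjustment_prompt and the prompt wording are assumptions made for illustration and are not taken from the disclosure.

```python
def build_adjustment_prompt(source_text: str, draft_translation: str, emotion: str) -> str:
    """Compose a second-pass prompt asking the model to reflect the emotion."""
    return (
        "Revise the draft translation so that it reflects the user's "
        f"{emotion} emotion, and add a short note explaining the adjustment.\n"
        f"Original text: {source_text}\n"
        f"Draft translation: {draft_translation}"
    )


if __name__ == "__main__":
    print(
        build_adjustment_prompt(
            "今日はとても楽しい一日です",
            "Today is a very enjoyable day.",
            "positive",
        )
    )
```

Including both the original text and the draft translation lets the model check the adjustment against the source rather than against the draft alone.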
[0557] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0558] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0559] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0560] [Fourth Embodiment]
[0561] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0562] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0563] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0564] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0565] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0566] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0567] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0568] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0569] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0570] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0571] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0572] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0573] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0574] This invention forms a system for natural and effective translation from Japanese to other languages. The user inputs the Japanese text they wish to translate using a terminal. The terminal sends this input to a server, which analyzes the received text. The server breaks down words and grammatical structures through morphological analysis, identifying expressions and important phrases unique to Japanese. Then, considering the surrounding context of the text, it extracts the meaning and intent necessary for translation.
[0575] Based on the analysis results, the server translates Japanese expressions and nuances into a foreign language in a way that is easy to understand. This translation process goes beyond literal translation, taking into account cultural background and nuances of expression. In particular, Japanese expressions such as "yappari" (after all) and "toriaezu" (for now) can be interpreted differently depending on the situation, so additional explanations are generated to appropriately convey the intended meaning.
[0576] As a result, the server sends the generated translation and additional explanations to the user's terminal and displays the results in the format specified by the user. This system enables smooth communication that transcends language barriers.
[0577] For example, if a user enters the text "It started raining, so I think we should cancel today after all." in Japanese, the server will pay attention to the expression "after all" and interpret it as meaning a reconfirmation of the change of plans. When translating this into English, it will output the result with a supplementary explanation that includes the nuance, such as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)." This ensures that foreign language recipients can understand the speaker's original intent without misunderstanding.
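As a minimal sketch of how expressions such as "yappari" and "toriaezu" might be flagged for a supplementary explanation, the following Python snippet looks them up in a small glossary. The glossary contents, the function name `find_nuance_expressions`, and the use of plain substring matching in place of morphological analysis are assumptions made for illustration and are not specified in this disclosure.

```python
from typing import List, Tuple

# Hypothetical glossary: expression -> short English gloss.
# The two entries mirror the examples named in the description.
NUANCE_GLOSSES = {
    "やっぱり": "after all",
    "とりあえず": "for now",
}


def find_nuance_expressions(text: str) -> List[Tuple[str, str]]:
    """Return (expression, gloss) pairs found in text, in order of first appearance."""
    hits = [
        (text.index(expression), expression, gloss)
        for expression, gloss in NUANCE_GLOSSES.items()
        if expression in text
    ]
    # Sort by position so the notes follow the order of the sentence.
    return [(expression, gloss) for _, expression, gloss in sorted(hits)]
```

A real system would presumably decide from the context which sense applies; this sketch only reports that an explanation may be needed.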
[0578] The following describes the processing flow.
[0579] Step 1:
[0580] The user enters the Japanese text they want translated into the terminal. The terminal then sends the entered text to the server.
[0581] Step 2:
[0582] The server receives text sent from the terminal. The server preprocesses the received data into a parseable format. Specifically, it performs noise reduction and character normalization (e.g., unifying half-width and full-width characters).
[0583] Step 3:
[0584] The server performs morphological analysis, breaking down the part of speech and constituent elements of each word in the text. This identifies expressions and keywords unique to the Japanese language.
[0585] Step 4:
[0586] The server interprets the meaning and intent of an expression based on the context and surrounding information of the sentence. Here, data models are used to understand specific phrases and nuances.
[0587] Step 5:
[0588] The server performs natural and appropriate translations into foreign languages based on the analysis results. This process involves not only literal translations but also translations that are context-aware and natural.
[0589] Step 6:
[0590] The server generates supplementary explanations that convey nuances and cultural background that cannot be fully expressed through translation alone. These explanations help the recipient understand the intended meaning more accurately.
[0591] Step 7:
[0592] The server sends the generated translation results and supplementary explanations to the terminal. The terminal displays the received information to the user, allowing the user to confirm the results.
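The preprocessing in Step 2 can be sketched as follows, assuming that Unicode NFKC normalization is an acceptable way to unify half-width and full-width characters and that "noise" means control characters and redundant whitespace. Both assumptions, and the function name `preprocess`, are illustrative and do not come from this disclosure.

```python
import re
import unicodedata

# Control characters other than tab, newline, and carriage return
# are treated as noise in this sketch.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def preprocess(text: str) -> str:
    """Normalize character widths and strip simple noise from the input text."""
    # NFKC folds full-width alphanumerics to half-width and
    # half-width katakana to full-width katakana.
    normalized = unicodedata.normalize("NFKC", text)
    # Remove control characters that would confuse later analysis.
    cleaned = _CONTROL_CHARS.sub("", normalized)
    # Collapse runs of whitespace (including the former ideographic space).
    return re.sub(r"\s+", " ", cleaned).strip()
```

Note that NFKC also rewrites other compatibility characters, so a production system might prefer a narrower, hand-written mapping.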
[0593] (Example 1)
[0594] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0595] In today's global society, multilingual communication is crucial, but simple translation often fails to accurately convey cultural nuances and intentions. Therefore, there is a need for techniques that enable natural and effective communication between different languages. In particular, languages with unique expressions, such as Japanese, require detailed translations and supplementary explanations based on context and background.
[0596] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0597] In this invention, the server includes means for analyzing received natural language data and breaking it down into words and grammatical structures, means for extracting specific expressions and elements from the analysis results and translating them while considering the context, and means for providing supplementary information to the generated translation based on cultural background and nuances of expression. This enables natural and effective communication between different languages.
[0598] "Natural language data" refers to information expressed in the language that people use on a daily basis.
[0599] "Analysis" is the process of breaking down input data and classifying and organizing it in order to understand its structure and meaning.
[0600] "Deconstructing words and grammatical structures" means dividing a text into individual words and grammatical elements and clarifying the role of each.
[0601] "Extracting unique expressions and elements" means identifying and extracting phrases and important information specific to that language.
[0602] "Translating while considering context" means understanding the meaning from the context in which the text is used and the surrounding sentences, and then appropriately translating it into a different language.
[0603] "Providing supplementary information based on cultural background and nuances of expression" means taking into account different cultural backgrounds and subtle linguistic meanings during translation and adding information to explain them.
[0604] "Formatting and outputting in an output format" means converting the translation results and supplementary information into a format that is easy for the reader to understand, and then displaying or distributing them.
[0605] A "system" is a collection of multiple components or processes that work together to achieve a specific purpose.
[0606] This invention provides a system for natural and effective translation from natural language to another language. The user inputs the Japanese text they wish to translate into a terminal. The terminal converts the input text into an appropriate data structure and sends it to the server.
[0607] The server analyzes the received Japanese text using morphological analysis software. This analysis utilizes a "language analysis tool," which breaks down the text into individual words and extracts its grammatical structure. The server then uses the analysis results to detect Japanese-specific expressions and important phrases. To extract contextual meaning and intent, it uses a natural language processing library called the "language understanding library."
[0608] After analysis, the server uses a generative AI model to perform translations that take into account cultural background and nuances of expression. Specifically, it uses a "translation API" to obtain an appropriately translated result and to request supplementary explanations for that result. This makes it possible to convey detailed meaning beyond simple translation.
[0609] Finally, the server formats the generated translation and supplementary explanations in the specified format and sends them to the terminal. The user can then check the translation displayed on the terminal, enabling smooth communication between different languages.
[0610] For example, if a user types "It's started raining, so I think we should cancel today after all," the server interprets "after all" as a confirmation of a change in plans, translates it into English as "It started raining, so I think we should cancel our plans today. ('After all' indicates a reconfirmation of the decision)," and sends it to the device.
[0611] An example of a prompt to input into the generative AI model is as follows: "Translate the following Japanese sentence into English and add supplementary explanations to convey the nuance: 'It's started raining, so I think I'll just cancel today.'"
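A prompt of this form could be assembled from a template, as in the hedged sketch below. The template wording follows the example prompt above; the function name `build_translation_prompt` and the `target_language` parameter are hypothetical.

```python
# Template modeled on the example prompt; the placeholders are assumptions.
PROMPT_TEMPLATE = (
    "Translate the following Japanese sentence into {target_language} and add "
    "supplementary explanations to convey the nuance: '{sentence}'"
)


def build_translation_prompt(sentence: str, target_language: str = "English") -> str:
    """Embed the user's sentence in the instruction sent to the generative AI model."""
    sentence = sentence.strip()
    if not sentence:
        raise ValueError("sentence must not be empty")
    return PROMPT_TEMPLATE.format(target_language=target_language, sentence=sentence)
```

Keeping the instruction in one template makes it straightforward to change the target language without touching the calling code.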
[0612] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0613] Step 1:
[0614] The user enters the natural language text they want translated into the device. During this input process, the user uses a text field to enter sentences in Japanese or another language. The entered text data is temporarily stored on the device. For example, if the user enters "It's started raining, so I think I'll cancel today," this sentence will be stored on the device.
[0615] Step 2:
[0616] The terminal converts the input text into a data structure and sends it to the server. Specifically, it serializes the text into JSON or XML format and sends it to the server as an HTTP request. The input data is in a structured format so that the server can easily parse it. For example, the input sentence is sent to the server in JSON format such as {"text": "It's started raining, so I think I'll cancel today."}.
[0617] Step 3:
[0618] The server analyzes the received natural language data, breaking it down into words and grammatical structures. Using a "morphological analysis tool," the server breaks down the text into smaller parts, identifying each word and its part of speech. From the input text, the server obtains word-level analysis results, such as "rain / noun," "ga / particle," and "furu / verb." This reveals the grammatical structure of the text.
[0619] Step 4:
[0620] The server extracts unique expressions and elements from the analysis results and begins the translation process, taking context into consideration. Based on the analysis results, it identifies Japanese-specific expressions (e.g., "yappari") and considers their corresponding meaning and intent. The server uses a "natural language understanding library" to gain a deeper understanding of the context and converts the prompt sentence into a format that can be input into the AI model. The output of this process is an appropriate translation candidate that is relevant to the context.
[0621] Step 5:
[0622] The server performs translation using a generative AI model. The model receives prompts and generates translation results that reflect the context and nuances. Specifically, it utilizes a language model API to perform translations according to the instructed prompt sentences. The generated output is a translation such as, "It started raining, so I think we should cancel our plans today. ('やっぱり' indicates a reconfirmation of the decision)."
[0623] Step 6:
[0624] The server formats the generated translation results into the specified output format and sends them to the terminal. The formatting process converts the data to formats such as HTML or plain text, making it immediately readable by the user. The server returns the formatted data to the terminal as an HTTP response.
[0625] Step 7:
[0626] The device correctly interprets the data received from the server and displays it to the user. Specifically, it displays the received translation results in the appropriate widget, making them viewable by the user. The user can then check the translation results on that screen and utilize the information as needed.
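The data exchange in Steps 2 and 6 might look like the following sketch, which uses only the Python standard library. The field name `text` follows the JSON example in Step 2; the function names, the validation rule, and the exact HTML layout are assumptions for illustration, and the HTTP transport itself is omitted.

```python
import html
import json


def serialize_request(text: str) -> str:
    """Terminal side (Step 2): wrap the input text in a JSON object."""
    # ensure_ascii=False keeps Japanese characters readable in the payload.
    return json.dumps({"text": text}, ensure_ascii=False)


def parse_request(body: str) -> str:
    """Server side: extract the text, rejecting malformed requests."""
    data = json.loads(body)
    text = data.get("text") if isinstance(data, dict) else None
    if not isinstance(text, str) or not text.strip():
        raise ValueError("request must contain a non-empty 'text' field")
    return text


def format_response(translation: str, explanation: str, fmt: str = "text") -> str:
    """Server side (Step 6): format the result as plain text or HTML."""
    if fmt == "html":
        # Escape both parts so that generated text cannot inject markup.
        return (
            f"<p>{html.escape(translation)}</p>"
            f"<p><small>{html.escape(explanation)}</small></p>"
        )
    if fmt == "text":
        return f"{translation}\n({explanation})"
    raise ValueError(f"unsupported format: {fmt}")
```

Escaping the generated text before embedding it in HTML is a design choice of this sketch, since model output should not be trusted as markup.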
[0627] (Application Example 1)
[0628] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0629] Conventional translation systems struggle to accurately interpret expressions and contexts unique to Japanese, resulting in a failure to adequately convey cultural nuances in communication between users of different languages. Furthermore, the lack of effective means to display real-time translations and their supplementary explanations across different devices hinders smooth multi-platform communication.
[0630] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0631] In this invention, the server includes means for receiving input natural language data, means for analyzing specific expressions from the received natural language data, and means for translating the natural language into a different language based on the analyzed expressions and context. This enables the provision of supplementary information related to Japanese-specific expressions and cultural background, and further allows the translation results to be displayed in real time on different devices.
[0632] "Input natural language data" refers to the initial language text that the user sends to the server for translation.
[0633] "Means of receiving" means a method or device for properly importing natural language data into a server.
[0634] "Means for analyzing specific expressions" refers to methods for identifying important phrases and expressions contained within natural language data and analyzing their structure.
[0635] "Contextual translation" refers to a method of accurate translation that takes into account the meaning and nuances of the surrounding text.
[0636] "Means for generating additional explanations" refers to methods for automatically creating supplementary information to clarify the cultural context and intent behind the translated content.
[0637] "Means for displaying translations in real time on different devices" refers to methods for instantly displaying translated results on a variety of devices.
[0638] A "generative AI model" refers to an algorithm or platform that uses artificial intelligence technology to process data and generate output corresponding to a specific task.
[0639] The system that implements this application primarily consists of a server and a user's terminal. The server receives natural language data entered by the user on their terminal and analyzes that data. Morphological analysis tools such as MeCab are used for the analysis, which identifies grammatical structures and specific expressions. Based on this information, the server translates the natural language into another language using the Google Cloud Translation API or the OpenAI API.
[0640] The server also generates additional explanations about the translated content. These explanations are created in the backend using a generative AI model and supplement cultural context and linguistic nuances. The generated translations and supplementary information are sent to the user's device and displayed in real time on different devices.
[0641] As a concrete example, if a foreign tourist wants to translate Japanese tourist information into English, they might input a sentence like "Mount Fuji is beautiful after all." The server analyzes the expression "after all" and provides a translation that includes its cultural implications. In this case, a translation such as "Fuji is really beautiful, isn't it? ('After all' implies a reconfirmation of the opinion)" would be generated.
[0642] An example of a prompt is as follows.
[0643] For the Japanese text meaning "The trip was fun, but it's really reassuring to be back home," the prompt would be:
[0644] "Translate the following text into English, explaining the nuances of 'yappari': 'The trip was fun, but I still feel relieved when I get home.'"
[0645] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0646] Step 1:
[0647] The user enters the Japanese text they wish to have translated on their terminal. The terminal then sends this input to the server. The entered data is natural language text.
[0648] Step 2:
[0649] The server analyzes the received Japanese text using a morphological analysis tool (MeCab). This analysis identifies word boundaries and grammatical structures within the text. This allows for the identification of specific Japanese phrases and expressions.
[0650] Step 3:
[0651] The server uses the Google Cloud Translation API or OpenAI API to perform translations based on the analyzed data. This process incorporates contextual information to obtain translation results that take into account cultural background and linguistic nuances.
[0652] Step 4:
[0653] The server uses a generative AI model to generate additional explanations related to the translation results. The generated explanations supplement the background and original intent of the translation, and also refer to cultural contexts.
[0654] Step 5:
[0655] The server sends the generated translation results and additional explanations to the user's device. The user's device displays this information on the screen in real time. This allows the user to view the translation results immediately.
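The server-side flow of Steps 2 to 5 can be sketched with the analyzer, translator, and explanation generator passed in as callables, so that MeCab or a translation API could be plugged in without changing the flow. All names below are hypothetical, the sample sentence is illustrative, and the stand-in functions return fixed placeholder values instead of calling any real service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Token:
    surface: str         # the word as it appears in the text
    part_of_speech: str  # e.g. "noun", "particle", "adverb"


@dataclass(frozen=True)
class TranslationResult:
    translation: str
    explanation: str


def run_pipeline(
    text: str,
    analyze: Callable[[str], List[Token]],
    translate: Callable[[str, List[Token]], str],
    explain: Callable[[str, str], str],
) -> TranslationResult:
    """Run analysis, translation, and explanation generation in order."""
    tokens = analyze(text)                     # Step 2: morphological analysis
    translation = translate(text, tokens)      # Step 3: context-aware translation
    explanation = explain(text, translation)   # Step 4: additional explanation
    return TranslationResult(translation, explanation)  # Step 5: payload to send


# Stand-ins for illustration only; they do not call MeCab or any API.
def fake_analyze(text: str) -> List[Token]:
    return [Token("やっぱり", "adverb")] if "やっぱり" in text else []


def fake_translate(text: str, tokens: List[Token]) -> str:
    return f"[translated, {len(tokens)} flagged token(s)] {text}"


def fake_explain(text: str, translation: str) -> str:
    return "No supplementary explanation (stub)."
```

Calling `run_pipeline("富士山はやっぱりきれいだ", fake_analyze, fake_translate, fake_explain)` exercises the flow end to end with the stubs.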
[0656] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0657] This invention relates to a system for translating from Japanese to other languages, which has a function to adjust the translation content while taking the user's emotions into consideration. The user inputs the Japanese text they wish to have translated using a terminal. The terminal sends the input data to a server, which then receives the data.
[0658] The server first performs morphological analysis, breaking down and analyzing the part of speech and grammatical structure of each word in the input text. From this analysis, it identifies Japanese-specific expressions and phrases and understands the context. Then, it performs translation using natural language processing to generate natural-sounding expressions in the other language. In this translation process, it considers not only literal translation but also nuance.
[0659] Furthermore, this system utilizes an emotion engine. The server analyzes the user's emotions from the input text using the emotion engine and adjusts the nuances of the translation based on those emotions. For example, if it is determined that the user has positive emotions, the translation result will be adjusted to reflect those positive nuances. In addition, supplementary information related to emotions is generated and included in the translation result.
[0660] Finally, the server sends the translation results, which take the user's emotions into consideration, along with any additional information generated as needed, to the terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
[0661] For example, if a user enters the text "It's going to be a great day," the server's emotion engine recognizes this text as a positive emotion. Then, when translating it into English, it performs a translation that reflects the emotion, such as "It's going to be an amazing day!" This makes it easier for the user's emotions to be accurately conveyed to the recipient.
[0662] The following describes the processing flow.
[0663] Step 1:
[0664] The user enters the Japanese text they want translated using their terminal. The terminal then sends this input to the server.
[0665] Step 2:
[0666] The server receives text sent from the terminal. The received text is preprocessed to prepare it for parsing. Specifically, noise reduction and character normalization are performed.
[0667] Step 3:
[0668] The server performs morphological analysis, breaking down the words in the input text and analyzing their parts of speech and grammatical structure. This allows it to identify expressions and phrases unique to the Japanese language.
[0669] Step 4:
[0670] The server's emotion engine recognizes the user's emotions from the analyzed text. The emotion engine determines emotions based on emotional keywords and phrases contained in the text.
[0671] Step 5:
[0672] The server adjusts the nuances of the translation based on the recognized emotion information. It performs translations that reflect the context and nuances appropriate to the emotions.
[0673] Step 6:
[0674] The server generates natural-sounding translations into foreign languages based on context and sentiment. It also generates supplementary information about cultural background and nuances of expression, as needed.
[0675] Step 7:
[0676] The server sends the generated translation results and supplementary information to the user's terminal. The terminal displays the received information to the user, allowing them to review the translation and related information.
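The keyword-based emotion determination in Step 4 might be approximated as below. This is only a sketch: the keyword lists, the three labels, and the function name `estimate_emotion` are assumptions, and the emotion identification model 59 is not described in enough detail here to reproduce.

```python
# Hypothetical keyword lists; a real emotion engine would be far richer.
EMOTION_KEYWORDS = {
    "positive": ("楽しい", "嬉しい", "最高", "素晴らしい"),
    "negative": ("悲しい", "つらい", "残念", "最悪"),
}


def estimate_emotion(text: str) -> str:
    """Return "positive", "negative", or "neutral" from keyword counts."""
    scores = {
        label: sum(text.count(keyword) for keyword in keywords)
        for label, keywords in EMOTION_KEYWORDS.items()
    }
    # A tie (including no keyword at all) is reported as neutral.
    if scores["positive"] == scores["negative"]:
        return "neutral"
    return "positive" if scores["positive"] > scores["negative"] else "negative"
```

Plain substring counting ignores negation and conjugated forms, which is one reason the description pairs the emotion engine with morphological analysis.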
[0677] (Example 2)
[0678] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0679] Traditional translation systems have the problem of being limited to literal translations and having difficulty considering nuances and emotions between languages. Therefore, there is a need for translation methods that reflect the user's emotions. It is also important to provide appropriate supplementary information based on cultural background.
[0680] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0681] In this invention, the server includes means for receiving input natural language information, means for analyzing the word structure of the received natural language information, means for translating the natural language into a different language based on the analyzed word structure and context, means for analyzing sentiment during the translation process and adjusting the nuances of the translated content, and means for generating supplementary information related to the translated content. This makes it possible to provide natural translation results that reflect the user's emotions and to present sentiment-based supplementary information together.
[0682] "Natural language information" refers to text and audio data expressed in the language that humans use on a daily basis.
[0683] "Word structure" refers to the results of analyzing the part of speech, grammatical role, and conjugation forms of words in natural language.
[0684] "Context" refers to the linguistic and cultural background in which a particular word or phrase is used, as well as the content of the surrounding text.
[0685] "Translation means" refers to a method or device for converting information expressed in one language into information with equivalent meaning in a different language.
[0686] "Methods for analyzing emotions" refer to processing and techniques used to identify the emotional tone and intentions of a writer or speaker from information such as text and audio.
[0687] "Methods for adjusting nuances" refer to methods of making subtle adjustments to wording and expressions in the translated result to reflect the emotional and cultural characteristics of the original language.
[0688] "Means of generating supplementary information" refers to techniques that create information and annotations to aid understanding, in addition to the basic translation result.
[0689] This invention is a system that provides a translation that takes into account the sentiment of the natural language text input by the user. The user inputs the text they wish to have translated using a terminal. The terminal sends the input data to the server.
[0690] The server first uses "morphological analysis software" to perform morphological analysis. This software divides the input natural language information into word units and analyzes the part of speech and grammatical role of each word. Based on the results of this analysis, it understands the context and uses a natural language processing engine to translate into another language. A "translation engine" may be used for translation, but in this case, it aims for a natural translation that is appropriate to the context, rather than a literal translation.
[0691] Furthermore, the server uses a "sentiment analysis engine" to analyze the user's emotions from the input text. Based on the analyzed emotional information, it adjusts the nuances of the translated text to reflect the user's emotions. At this time, supplementary information related to the emotions is generated and sent to the terminal along with the translation result.
[0692] The terminal displays the final adjusted translation and supplementary information to the user, allowing them to confirm it. For example, if the user enters the text "It's going to be an amazing day," the server analyzes the positive emotion in this text and provides a corresponding translation: "It's going to be an amazing day!"
[0693] An example of a prompt for a generative AI model is: "Translate the following Japanese text into English, reflecting the sentiment of the text: 'It's going to be a great day.'"
[0694] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0695] Step 1:
[0696] The user enters the Japanese text they want translated into the terminal. For example, the user enters "Today is a very fun day." The terminal sends this input data to the server. The input is Japanese text, and the output is the data sent to the server.
[0697] Step 2:
[0698] The server performs morphological analysis using natural language information received from the terminal. Specifically, the server uses morphological analysis software to analyze Japanese text word by word, identifying parts of speech and grammatical structures. The input is Japanese text received from the terminal, and the output is the analysis result: part of speech information and grammatical structure of each word.
[0699] Step 3:
[0700] Based on the analyzed part-of-speech information and grammatical structure, the server performs translation using a natural language processing engine. At this stage, it understands the context of the input text and generates a translation that takes nuances into account rather than just a literal translation. For example, the Japanese sentence "今日はとても楽しい一日です" is translated as "Today is a very enjoyable day." The input is the result of morphological analysis, and the output is the translated English text.
[0701] Step 4:
[0702] In parallel with the translation, the server analyzes the user's sentiment using a sentiment analysis engine. Here, it detects a positive sentiment from the expression "とても楽しい". The input is the original Japanese text, and the output is the result of the sentiment analysis, which is the information of a positive sentiment.
[0703] Step 5:
[0704] Based on the result of the sentiment analysis, the server adjusts the nuance of the translated text. In this case, to emphasize the positive sentiment more, it is adjusted like "Today is a wonderfully enjoyable day!". The input is the translated English text and the result of the sentiment analysis, and the output is the sentiment-adjusted translated text.
[0705] Step 6:
[0706] The server generates supplementary information regarding the sentiment and adds it to the translation result. Here, it generates information to supplement the positive sentiment. The input is the sentiment-adjusted translated text, and the output is the final translation result with supplementary information.
[0707] Step 7:
[0708] The server sends the final adjusted translation results and supplementary information to the terminal. The terminal receives this and displays it to the user. The user can then review the translated content and related information. The input is the data sent from the server, and the output is the information displayed to the user.
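Steps 3 to 6 can be sketched as follows, with translation and sentiment analysis submitted to a thread pool so that they run in parallel as Step 4 describes. The function names, the result dictionary, and the toy adjustment rule are all assumptions; the stand-ins return the example values from the steps above instead of calling real engines, and for brevity the translator here takes the raw text instead of the morphological analysis result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def translate_with_sentiment(
    text: str,
    translate: Callable[[str], str],
    analyze_sentiment: Callable[[str], str],
    adjust: Callable[[str, str], str],
) -> Dict[str, str]:
    """Translate and analyze sentiment in parallel, then adjust the nuance."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        draft_future = pool.submit(translate, text)              # Step 3
        sentiment_future = pool.submit(analyze_sentiment, text)  # Step 4
        draft = draft_future.result()
        sentiment = sentiment_future.result()
    adjusted = adjust(draft, sentiment)                          # Step 5
    # Step 6: a simple supplementary note about the detected sentiment.
    supplement = f"The original text expresses a {sentiment} sentiment."
    return {"translation": adjusted, "supplement": supplement}


# Stand-ins for illustration only.
def fake_translate(text: str) -> str:
    return "Today is a very enjoyable day."


def fake_sentiment(text: str) -> str:
    return "positive"


def toy_adjust(draft: str, sentiment: str) -> str:
    """Toy rule: strengthen the wording only for positive sentiment."""
    if sentiment != "positive":
        return draft
    emphasized = draft.replace("very", "wonderfully")
    return emphasized[:-1] + "!" if emphasized.endswith(".") else emphasized
```

With these stand-ins, the input "今日はとても楽しい一日です" yields the adjusted translation "Today is a wonderfully enjoyable day!" used in Step 5.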
[0709] (Application Example 2)
[0710] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0711] In translation between different languages, it is crucial to provide translations that not only perform grammatical conversions but also appropriately reflect the user's emotions. However, conventional systems lack the nuance adjustments necessary to take user emotions into account, sometimes resulting in misrepresentation of the user's intentions and feelings. To address this challenge, there is a need for translation systems that take emotions into account.
[0712] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0713] In this invention, the server includes means for receiving input character data, means for analyzing a specific syntax from the received character data, means for converting the character data into different formats based on the analyzed syntax and context, means for analyzing the user's emotions and adjusting the nuances of the conversion based on those emotions, and means for outputting the adjusted conversion results and explanations. This allows for the provision of translation results that reflect the user's emotions, enabling more accurate communication of intent.
[0714] "Input character data" refers to the language information that the user provides to the system for morphological analysis and translation.
[0715] "Means for receiving" refers to an element that has the function of receiving data sent by the user within the system.
[0716] "Means for analyzing a specific syntax" refers to an element that breaks down the input character data into word and sentence structures and processes them to understand their grammar and meaning.
[0717] "Means for converting into different formats" refers to an element that has the function of appropriately translating or converting the analyzed character data into other languages or formats.
[0718] "Means for analyzing emotions" refers to an element that has the function of detecting and analyzing emotional nuances and tones from user input data.
[0719] "Means for adjusting nuances" refers to an element that modifies the tone and expression of the translation result based on the detected emotions, thereby preserving the intended emotion.
[0720] "Means for outputting the adjusted conversion results and explanations" refers to an element that has the function of presenting the converted content and related supplementary information to the user.
[0721] This invention provides a multilingual translation system that takes user emotions into consideration. The system is implemented via a user terminal such as a smartphone or smart glasses. The user inputs natural language text data into the terminal, and sentiment analysis and translation processing are performed on the server. Upon receiving the text data, the server first performs morphological analysis. Specifically, it uses analysis software running in a Python environment to analyze the parts of speech and syntactic structure of the words and phrases that make up the text.
[0722] The server then uses a natural language processing engine to translate the text into a different language. During this process, it employs a generative AI model to generate translations that are consistent with the context and syntax of the input. Furthermore, it analyzes the user's emotions through an emotion engine and adjusts the nuances of the translated text to reflect these emotions.
[0723] Based on the emotions detected from the natural language text entered by the user, the translation is adjusted appropriately to generate an informative result. This translation result, along with additional information based on the emotions, is sent to the device and presented to the user. For example, if the user enters "It's going to be an amazing day," the system recognizes this positive emotion and provides the translation "It's going to be an amazing day!"
[0724] An example of a prompt is, "Retranslate the product description to reflect the user's emotions in a positive way." This system makes it easy to perform translations that accurately reflect the user's feelings.
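As a rough sketch of how such a prompt could be assembled from the detected emotion, the following helper builds a prompt string for the generative AI model. The function name, parameters, and exact wording are hypothetical and are not taken from the disclosure.

```python
def build_adjustment_prompt(
    translated_text: str,
    emotion: str,
    target_language: str = "English",
) -> str:
    """Build a nuance-adjustment prompt; the wording is an illustrative assumption."""
    return (
        f"Retranslate the following {target_language} text so that it reflects "
        f"the user's {emotion} emotion.\n"
        f"Text: {translated_text}\n"
        "Return only the adjusted translation."
    )


prompt = build_adjustment_prompt("It's going to be an amazing day", "positive")
```

Keeping the emotion as a parameter lets the same template serve both positive and negative inputs.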
[0725] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0726] Step 1:
[0727] The terminal receives natural language text data entered by the user. This input data reflects the user's intent. The terminal then prepares to send this data to the server.
[0728] Step 2:
[0729] The server receives natural language text sent from the terminal. The received text data is input into a morphological analysis system, which analyzes the part of speech and grammatical structure of each word. The output of this analysis is syntactic information and context of the text.
[0730] Step 3:
[0731] The server uses syntactic information obtained from morphological analysis to translate text into different languages using a generative AI model. At this stage, it generates appropriate translation candidates based on the input syntactic information. The output is the translated text data.
[0732] Step 4:
[0733] After the translation process, the server uses an emotion analysis engine to analyze the emotions contained in the user's input text. At this stage, it generates data for nuance adjustment based on the emotion information extracted from the text.
[0734] Step 5:
[0735] The server uses the information obtained from sentiment analysis to adjust the nuances of the translated text. Specifically, it supplies prompt sentences to the generative AI model to modify the translation results so that they reflect the user's emotions. The output is the adjusted translated text.
[0736] Step 6:
[0737] The server sends the adjusted translated text and related supplementary information to the terminal. The terminal receives this and displays a translation result that reflects the sentiment to the user. This allows the user to confirm that the translation takes sentiment into account.
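The six steps above can be pictured as a pipeline whose stages are pluggable components. The sketch below is an assumption about one possible structure: the `Pipeline` class, the callable signatures, and the stub components are hypothetical, and real morphological analysis, translation, and emotion engines would be substituted for the lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, str]  # (surface form, part of speech)


@dataclass
class Pipeline:
    analyze: Callable[[str], List[Token]]         # Step 2: morphological analysis
    translate: Callable[[str, List[Token]], str]  # Step 3: generative AI translation
    detect_emotion: Callable[[str], str]          # Step 4: emotion analysis engine
    adjust: Callable[[str, str], str]             # Step 5: nuance adjustment

    def run(self, text: str) -> Dict[str, str]:
        """Run Steps 2 to 5 and build the payload sent to the terminal in Step 6."""
        tokens = self.analyze(text)
        draft = self.translate(text, tokens)
        emotion = self.detect_emotion(text)
        adjusted = self.adjust(draft, emotion)
        return {
            "translation": adjusted,
            "emotion": emotion,
            "supplement": f"Tone adjusted for a {emotion} emotion.",
        }


# Stub components for illustration only.
pipeline = Pipeline(
    analyze=lambda text: [(text, "unknown")],
    translate=lambda text, tokens: "It is a fun day.",
    detect_emotion=lambda text: "positive",
    adjust=lambda draft, emotion: draft.rstrip(".") + "!" if emotion == "positive" else draft,
)
result = pipeline.run("楽しい日です")
```

Injecting each stage as a callable keeps the order of the steps fixed while leaving the choice of engines open, which matches the way the disclosure describes the stages without naming specific products.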
[0738] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0739] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0740] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0741] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0742] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
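The layout described above can be modeled as points in polar coordinates. The following sketch is an assumption made for illustration: the `MappedEmotion` type, the ring and angle encoding, and the sample placements are hypothetical and do not reproduce the actual positions in Figure 9.

```python
import math
from dataclasses import dataclass

EPS = 1e-9  # tolerance for floating-point comparisons on the axes


@dataclass(frozen=True)
class MappedEmotion:
    name: str
    ring: int         # 0 = center (most primitive); larger values lie further out
    angle_deg: float  # 0 = right (3 o'clock), 90 = top, counter-clockwise

    def xy(self):
        """Convert the polar placement to Cartesian coordinates."""
        a = math.radians(self.angle_deg)
        return (self.ring * math.cos(a), self.ring * math.sin(a))


def valence(e: MappedEmotion) -> str:
    """Upper side is pleasure, lower side is displeasure."""
    _, y = e.xy()
    if y > EPS:
        return "pleasure"
    if y < -EPS:
        return "displeasure"
    return "neutral"


def origin(e: MappedEmotion) -> str:
    """Left side arises from reactions, right side from situational judgment."""
    x, _ = e.xy()
    if x > EPS:
        return "situation"
    if x < -EPS:
        return "reaction"
    return "both"


def distance(a: MappedEmotion, b: MappedEmotion) -> float:
    """Emotions likely to occur together are assumed to lie close together."""
    return math.dist(a.xy(), b.xy())


# Hypothetical placements, not taken from Figure 9.
reassured = MappedEmotion("reassured", 2, 20.0)
anxious = MappedEmotion("anxious", 2, -20.0)
desire = MappedEmotion("desire", 3, 160.0)
```

With this encoding, the ring index stands for how outwardly expressed an emotion is, and the distance function gives a simple measure of which emotions are neighbors on the map.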
[0743] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0744] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion is located, the more visible (expressed in actions) it becomes.
[0745] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
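The relationship between balances and pleasure or discomfort can be illustrated with a simple score. This is only a sketch under stated assumptions: the mean absolute deviation, the `1 / (1 + deviation)` scaling, the threshold, and the balance names are all hypothetical choices, since the disclosure does not give a formula.

```python
from typing import Mapping


def comfort_score(current: Mapping[str, float], ideal: Mapping[str, float]) -> float:
    """Return a score in (0, 1]; 1.0 means every balance is at its ideal value."""
    deviation = sum(abs(current[key] - ideal[key]) for key in ideal) / len(ideal)
    return 1.0 / (1.0 + deviation)


def classify(score: float, threshold: float = 0.8) -> str:
    """Map the score to pleasure or discomfort using an assumed threshold."""
    return "pleasure" if score >= threshold else "discomfort"


# Hypothetical balances for a robot: posture tilt and battery level.
ideal = {"posture": 0.0, "battery": 1.0}
tilted_and_drained = {"posture": 0.5, "battery": 0.2}
state = classify(comfort_score(tilted_and_drained, ideal))
```

As the balances move toward their ideal values the score rises toward 1.0, which mirrors the statement that approaching the ideal results in pleasure.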
[0746] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0747] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
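Two ideas from this paragraph can be sketched in a few lines: picking the user's emotion from the emotion values, and encouraging neighboring emotions to take similar values during training. The function names, the squared-difference penalty, and the sample values are assumptions for illustration; the disclosure does not state how the similarity constraint is enforced.

```python
from typing import Dict, List, Tuple


def dominant_emotion(values: Dict[str, float]) -> str:
    """Return the emotion with the highest emotion value."""
    return max(values, key=values.get)


def neighbor_smoothness_penalty(
    values: Dict[str, float],
    neighbors: List[Tuple[str, str]],
) -> float:
    """Sum of squared differences between emotions that are adjacent on the map.

    Adding a term like this to a training loss is one assumed way to make
    nearby emotions take similar values.
    """
    return sum((values[a] - values[b]) ** 2 for a, b in neighbors)


# Hypothetical network output for one user input.
values = {"reassured": 0.8, "calm": 0.75, "confident": 0.7, "anxious": 0.1}
adjacent = [("reassured", "calm"), ("calm", "confident")]

emotion = dominant_emotion(values)
penalty = neighbor_smoothness_penalty(values, adjacent)
```

A small penalty indicates that the values of adjacent emotions are already close, as in the "reassured," "calm," and "confident" example from Figure 10.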
[0748] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0749] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0750] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.
[0751] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0752] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0753] The following types of processors can be used as hardware resources to perform the specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing the specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which are processors having circuit configurations designed specifically to perform the specific processing. Each of these processors has built-in or connected memory and performs the specific processing by using that memory.
[0754] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0755] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0756] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0757] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements may be added, or elements may be replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0758] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0759] The following is further disclosed regarding the embodiments described above.
[0760] (Claim 1)
[0761] A means for receiving input natural language data,
[0762] A means of analyzing specific expressions from received natural language data,
[0763] A means for translating natural language into a different language based on the analyzed expression and context,
[0764] Means for generating additional explanations related to the translated content,
[0765] A means for outputting the generated translation results and explanations,
[0766] A system that includes this.
[0767] (Claim 2)
[0768] The system according to claim 1, wherein the natural language data is in Japanese and comprises means for analyzing specific expressions.
[0769] (Claim 3)
[0770] The system according to claim 1, further comprising means for providing supplementary information related to the cultural background based on the analyzed expression.
[0771] "Example 1"
[0772] (Claim 1)
[0773] A device that receives input natural language data,
[0774] A means for analyzing received natural language data and decomposing it into words and grammatical structures,
[0775] A method for extracting unique expressions and elements from the analysis results and translating them while considering the context,
[0776] A means of providing supplementary information to the generated translation based on cultural background and nuances of expression,
[0777] A device that formats the generated translation results and supplementary information into an output format and outputs them,
[0778] A system that includes this.
[0779] (Claim 2)
[0780] The system according to claim 1, wherein the natural language data is in a specific language and the system includes a device for analyzing specific expressions and nuances.
[0781] (Claim 3)
[0782] The system according to claim 1, comprising a device that provides supplementary information related to the cultural background based on the aforementioned analysis results.
[0783] "Application Example 1"
[0784] (Claim 1)
[0785] A means for receiving input natural language data,
[0786] A means of analyzing specific expressions from received natural language data,
[0787] A means for translating natural language into a different language based on the analyzed expression and context,
[0788] Means for generating additional explanations related to the translated content,
[0789] A means for outputting the generated translation results and explanations,
[0790] A means for displaying translations in real time on different devices,
[0791] A system that includes this.
[0792] (Claim 2)
[0793] The system according to claim 1, wherein the natural language data is in Japanese, and the system comprises means for analyzing specific expressions and means for providing supplementary information for specific expressions using different devices.
[0794] (Claim 3)
[0795] The system according to claim 1, further comprising means for providing supplementary information related to cultural background based on the analyzed expression, and means for using a generative AI model to generate explanations that assist in display on different devices.
[0796] "Example 2 of combining an emotion engine"
[0797] (Claim 1)
[0798] A means for receiving input natural language information,
[0799] A means for analyzing the word structure of received natural language information,
[0800] A means for translating natural language into a different language based on the structure and context of analyzed words,
[0801] A means of analyzing emotions in the translation process and adjusting the nuances of the translated content,
[0802] Means for generating additional supplementary information related to the translated content,
[0803] A means for outputting the generated translation results and supplementary information,
[0804] A system that includes this.
[0805] (Claim 2)
[0806] The system according to claim 1, wherein the natural language information is a language associated with a specific cultural area, and the system comprises means for analyzing expressions specific to that language.
[0807] (Claim 3)
[0808] The system according to claim 1, further comprising means for providing supplemental information related to emotion based on the structure of the analyzed words and the user's emotions.
[0809] "Application example 2 when combining with an emotional engine"
[0810] (Claim 1)
[0811] A means for receiving input character data,
[0812] A means of parsing a specific syntax from received character data,
[0813] A means for converting character data into different formats based on the analyzed syntax and background,
[0814] Means for generating additional information related to the converted content,
[0815] A means of analyzing the user's emotions and adjusting the nuances of the transformation based on those emotions,
[0816] A means for outputting the adjusted conversion result and explanation,
[0817] A system that includes this.
[0818] (Claim 2)
[0819] The system according to claim 1, wherein the character data is natural language and comprises means for analyzing a specific expression.
[0820] (Claim 3)
[0821] The system according to claim 1, further comprising means for providing diverse translations based on emotion, based on the analyzed syntax.
[Explanation of symbols]
[0822]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising: means for receiving input natural language data; means for analyzing specific expressions from the received natural language data; means for translating the natural language into a different language based on the analyzed expressions and context; means for generating additional explanations related to the translated content; and means for outputting the generated translation results and explanations.
2. The system according to claim 1, wherein the natural language data is in Japanese, and the system comprises means for analyzing specific expressions.
3. The system according to claim 1, further comprising means for providing supplementary information related to the cultural background based on the analyzed expression.