Query answering apparatus and method

By employing multiple embedding models to calculate similarities and remove duplicates, the method addresses language-based performance degradation in query response systems, achieving accurate and efficient response generation.

WO2026135164A1 · PCT designated stage · Publication Date: 2026-06-25 · POSCO HLDG INC

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: POSCO HLDG INC
Filing Date: 2025-12-16
Publication Date: 2026-06-25

Application Information

Patent Timeline

16 Dec 2025

Application

25 Jun 2026

Publication

WO2026135164A1

IPC: G06F16/9532; G06F16/9538; G06F16/951; G06F40/151; G06F18/22; G06N3/045; G06N3/096

AI Tagging

Application Domain

Web data indexing · Biological models

Technology Topics

Theoretical computer science · Degree of similarity

AI Technical Summary

Technical Problem

Existing methods for removing duplicate news articles and generating responses to user queries based on keyword similarity fail when language differences between queries and news articles degrade performance.

Method used

Utilize multiple embedding models with different characteristics to generate first and second vectors for articles and queries, calculating similarities to remove duplicates and generate responses, using cosine and Euclidean distance algorithms to determine optimal articles for input into a Large Language Model (LLM).

Benefits of technology

Provides accurate and efficient removal of duplicate articles and generation of optimal responses by leveraging diverse embedding models, ensuring high similarity and near real-time processing.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure KR2025021866_25062026_PF_FP_ABST

Patent Text Reader

Abstract

A query answering apparatus and method may: receive a query and a plurality of articles; output a plurality of first vectors for each of the plurality of articles from a plurality of embedding models having different characteristics; remove duplicate articles on the basis of a first similarity calculated on the basis of the plurality of first vectors; output a plurality of second vectors for the query from the plurality of embedding models; and generate a response corresponding to the query on the basis of a second similarity calculated on the basis of the plurality of first vectors for each of the plurality of articles and the plurality of second vectors for the query.

Description

Question answering device and method

[0001] The present disclosure relates to a technology for generating a response corresponding to a query.

[0002] Recently, there are web pages that provide services such as automatically recommending news articles based on users' interests or finding and displaying responses relevant to their queries. In particular, active research is being conducted on using language models to provide services that correspond to users' interests or tendencies.

[0003] The constant influx of multinational news contains a significant amount of duplicate articles, necessitating their efficient removal. Furthermore, there is a need to efficiently provide responses corresponding to user queries by utilizing information contained within these news articles.

[0004] Conventionally, keywords have been extracted to remove duplicate news based on keyword similarity and to infer and provide responses corresponding to queries; however, when the language applied to the query and the news article differs, there is a problem of degraded performance in removing duplicate articles or generating responses corresponding to the query.

[0005] The present disclosure aims to provide a technology for outputting a response corresponding to an input query through a plurality of embedding models.

[0006] In one aspect, the present embodiments provide a query response device comprising: an information receiving unit that receives a query and a plurality of articles; an article refining unit that outputs a plurality of first vectors for each of a plurality of articles from a plurality of embedding models having different characteristics and removes duplicate articles based on a first similarity calculated based on the plurality of first vectors; and a response generating unit that outputs a plurality of second vectors for a query from a plurality of embedding models and generates a response corresponding to the query based on a second similarity calculated based on the plurality of first vectors for each of a plurality of articles and the plurality of second vectors for a query.

[0007] In another aspect, the present embodiments provide a query response method comprising the steps of receiving a query and a plurality of articles, outputting a plurality of first vectors for each of a plurality of articles output from a plurality of embedding models having different characteristics, removing duplicate articles based on a first similarity calculated based on the plurality of first vectors, outputting a plurality of second vectors for the query from a plurality of embedding models, and generating a response corresponding to the query based on a second similarity calculated based on the plurality of first vectors for each of the plurality of articles and the plurality of second vectors for the query.

[0008] The present disclosure may provide a technology for generating a response corresponding to a query.

[0009] FIG. 1 is a drawing for explaining the configuration of a device that generates a response corresponding to a query according to one embodiment.

[0010] FIG. 2 is a diagram for schematically explaining LLM, which is a type of artificial intelligence according to one embodiment.

[0011] FIG. 3 is a flowchart illustrating the learning process of an artificial intelligence model including an LLM according to one embodiment.

[0012] FIG. 4 is a flowchart illustrating the process of generating a response to a query input through an LLM according to one embodiment.

[0013] FIG. 5 is a flowchart illustrating the process of removing duplicate articles from news articles collected through a plurality of embedding models according to one embodiment.

[0014] FIG. 6 is a flowchart illustrating the process of generating a response corresponding to an input query through a plurality of embedding models according to one embodiment.

[0015] FIG. 7 is a flowchart for explaining a method for generating a response corresponding to a query according to one embodiment.

[0016] FIG. 8 is a configuration diagram of a computing device including an artificial intelligence model according to one embodiment.

[0017] FIG. 9 is a configuration diagram of a computer system including a client-server that includes an artificial intelligence model according to one embodiment.

[0018] Hereinafter, some embodiments of the present disclosure will be described in detail with reference to the exemplary drawings. In assigning reference numerals to the components of each drawing, the same components may have the same reference numeral as much as possible, even if they are shown in different drawings. Furthermore, in describing the embodiments, if it is determined that a detailed description of related known components or functions may obscure the essence of the technical concept, such detailed description may be omitted. Where terms such as "comprising," "having," or "consisting of" are used in this specification, other parts may be added unless "only" is used. Where a component is expressed in the singular, it may include a plural unless otherwise specified.

[0019] Additionally, terms such as first, second, A, B, (a), (b), etc., may be used to describe the components of the present disclosure. These terms are used merely to distinguish the components from other components, and the nature, order, sequence, or number of the components are not limited by such terms.

[0020] In describing the positional relationship of components, where it is stated that two or more components are "connected," "combined," or "joined," it should be understood that while the two or more components may be directly "connected," "combined," or "joined," they may also be "connected," "combined," or "joined" with other components "intervened." Here, the other components may be included in one or more of the two or more components that are "connected," "combined," or "joined" with one another.

[0021] In describing the temporal flow relationship regarding components, methods of operation, or methods of production, for example, when the temporal or sequential relationship is described using "after," "following," "next," or "before," it may include cases where the relationship is not continuous unless "immediately" or "directly" is used.

[0022] Meanwhile, where numerical values or corresponding information regarding a component (e.g., levels, etc.) are mentioned, even without separate explicit notation, the numerical values or corresponding information may be interpreted as including a range of error that may occur due to various factors (e.g., process factors, internal or external shocks, noise, etc.).

[0023] The embodiments are described in detail below with reference to the drawings.

[0024] FIG. 1 is a drawing for explaining the configuration of a device that generates a response corresponding to a query according to one embodiment.

[0025] Referring to FIG. 1, the question and answer device (100) of the present disclosure includes an information receiving unit (110) that receives a query and a plurality of articles.

[0026] The question answer device (100) of the present disclosure can use a plurality of embedding models having different characteristics to determine and remove duplicate articles among a plurality of collected articles, determine an article corresponding to an input query, and generate a response corresponding to the query through an LLM.

[0027] For example, the aforementioned plurality of articles may be collected from at least one website through a crawling technique based on at least one of a preset field and a preset period, and stored in a preset database.

[0028] The aforementioned preset fields can be configured in various ways as needed, regardless of type or scope. For example, secondary batteries can be configured as a field for article collection, and autonomous vehicles can be configured as another field. The number of fields is not limited to one; two or more fields may be configured.

[0029] The aforementioned preset period may be set in units of days, weeks, or months. However, the aforementioned period is not fixed as a single period and may be set in various ways as needed.

[0030] As described above, the question-and-answer device (100) of the present disclosure can collect multiple articles from websites through crawling. Crawling means visiting at least one web page on the internet through a web crawler, also called a spider or search engine bot, and collecting information contained within the web page.

[0031] Specifically, crawling is performed by setting a starting URL (Uniform Resource Locator) and accessing it through a pre-configured library. If information corresponding to conditions such as a preset field or a preset period exists at that URL, the title or body content of an article is obtained as text. For the text information corresponding to the aforementioned conditions, a portion of the content contained therein can be extracted and stored in a pre-configured database.

[0032] Data collected through crawling may include information on article titles, body text, media outlet names, publication dates, and user reviews. However, this is merely an example, and various other types of information may be collected as needed.

[0033] In addition, the aforementioned pre-configured library may include an HTTP (HyperText Transfer Protocol) library, but is not limited to this, and various libraries may be used as needed.
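The condition check described for crawled pages can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Article` record, the `matches_conditions` helper, the keyword-matching rule, and the sample data are all hypothetical, and the sketch applies both the field and the period condition even though the disclosure requires only at least one of them. No network access is performed; the list stands in for pages a crawler has already fetched.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    """Hypothetical record for one crawled article."""
    title: str
    body: str
    outlet: str
    published: date


def matches_conditions(article, field_keywords, start, end):
    """Return True if the article mentions a preset field and falls in the preset period."""
    text = f"{article.title} {article.body}".lower()
    in_field = any(keyword.lower() in text for keyword in field_keywords)
    in_period = start <= article.published <= end
    return in_field and in_period


# Hypothetical records standing in for pages already fetched by a crawler.
crawled = [
    Article("Secondary battery output rises", "Cell makers expand.", "Outlet A", date(2025, 12, 1)),
    Article("Local football results", "Weekend scores.", "Outlet B", date(2025, 12, 2)),
    Article("Secondary battery recycling", "New plant opens.", "Outlet C", date(2025, 10, 1)),
]

# Keep only articles in the "secondary battery" field published in December 2025.
kept = [
    a for a in crawled
    if matches_conditions(a, ["secondary battery"], date(2025, 12, 1), date(2025, 12, 31))
]
```

Only the first record satisfies both conditions here; the second is outside the field and the third is outside the period.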

[0034] The question and answer device (100) of the present disclosure can receive information about articles collected by crawling and stored in a pre-set database through an information receiving unit (110).

[0035] The question-answering device (100) of the present disclosure can store article information in the pre-configured database in the manner described above, and can store the basic information of each article in a separate RDB (Relational Database) table.

[0036] The aforementioned crawled news information can be periodically stored in a pre-configured database at pre-configured intervals, and data consistency can be ensured through immediate re-execution in the event that an error occurs during operations such as saving to or querying the pre-configured database.
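The "immediate re-execution" on error can be illustrated with a small retry wrapper. This is a sketch under assumptions: the `run_with_retry` helper, the attempt limit, and the simulated failing save operation are hypothetical and are not taken from the disclosure, which does not specify how many times an operation is re-executed.

```python
def run_with_retry(operation, max_attempts=3):
    """Call operation(); if it raises, re-run it immediately, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return operation()
        except Exception as error:  # a real system would catch its database driver's error type
            last_error = error
    raise last_error


# Hypothetical save operation that fails twice before succeeding.
attempts = {"count": 0}
stored = []


def flaky_save():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated database error")
    stored.append("article-1")
    return "saved"


result = run_with_retry(flaky_save)
```

Because the record is appended only on the successful attempt, the simulated database ends up with exactly one copy of the article, which is the consistency property the paragraph is concerned with.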

[0037] The question-answering device (100) of the present disclosure may preprocess the received article information and, while generating a response to a question or analyzing the latest trends through an LLM, ultimately generate a Q&A set including various expected questions and expected answers.

[0038] In addition, the question and answer device (100) of the present disclosure can receive a query entered by a user through the information receiving unit (110) described above.

[0039] A query is data or a command entered by a user to interact with a specific artificial intelligence model, and includes queries requesting specific information or facts, queries asking for opinions, queries requesting creation, queries requesting document translation or summarization, and queries requesting data analysis.

[0040] Rather than inputting the query entered by the user directly into the LLM, the present disclosure proposes a method of determining, through multiple embedding models, the article that best corresponds to the query, and inputting the query together with the determined article into the LLM.

[0041] The question answer device (100) of the present disclosure outputs a plurality of first vectors for each of a plurality of articles from a plurality of embedding models having different characteristics, and includes an article refinement unit (120) that removes duplicate articles based on a first similarity calculated based on the plurality of first vectors.

[0042] To remove the aforementioned duplicate articles, the question-answering device (100) of the present disclosure can embed articles at set intervals and periodically calculate the first similarity, and can also remove from the stored database the embedding vectors corresponding to the duplicate articles that are removed.

[0043] Embedding is a technique that converts each piece of data into a high-dimensional vector containing numbers while preserving the original meaning of the data. For example, the word 'secondary battery' can be converted into a high-dimensional vector containing numbers such as [0.52, -0.04, 0.16, ... 0.27] through embedding, and the word 'car' can be converted into a high-dimensional vector containing numbers such as [0.42, -0.17, 0.18, ... 0.25] through embedding techniques. Furthermore, embedding can convert not only words into vectors but also entire sentences or texts into a single vector. For example, the title of a specific article can be converted into a high-dimensional vector containing numbers such as [0.2, -0.01, 0.36, ... 0.87].

[0044] The question-answering device (100) of the present disclosure can output a plurality of first vectors for each of a plurality of articles from a plurality of embedding models having different characteristics. For example, assuming there are 100 embedding models and 100 articles obtained, the number of first vectors generated can be 10,000. The content of the article input into the embedding model may be the article title, the body of the article, or both.
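The relationship between the number of models, the number of articles, and the number of first vectors can be sketched as follows. The `make_toy_embedder` function is a hypothetical stand-in: it hashes text into a fixed-length vector so that the example runs without any model, and it does not preserve meaning the way a real embedding model does. The model names and article texts are also assumptions.

```python
import hashlib


def make_toy_embedder(seed, dim=4):
    """Build a deterministic toy 'embedding model' (a stand-in for a real model)."""
    def embed(text):
        digest = hashlib.sha256(f"{seed}:{text}".encode("utf-8")).digest()
        # Map the first `dim` bytes of the hash to floats in [-1, 1].
        return [byte / 127.5 - 1.0 for byte in digest[:dim]]
    return embed


# Three hypothetical models with "different characteristics" (here: different seeds).
models = {
    "model_a": make_toy_embedder("a"),
    "model_b": make_toy_embedder("b"),
    "model_c": make_toy_embedder("c"),
}
articles = {
    "art-1": "Secondary battery output rises",
    "art-2": "Autonomous vehicle trial begins",
}

# Every article is passed through every model: one first vector per (article, model) pair.
first_vectors = {
    (article_id, model_name): embed(text)
    for article_id, text in articles.items()
    for model_name, embed in models.items()
}
```

With 3 models and 2 articles the dictionary holds 6 first vectors; with 100 models and 100 articles the same construction yields the 10,000 vectors mentioned above.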

[0045] In addition, as mentioned above, multiple embedding models have different characteristics.

[0046] For example, different characteristics of multiple embedding models may include characteristics in which the performance of the embedding differs depending on the type of language included in the text input to the embedding model.

[0047] For example, assuming there are three embedding models, namely embedding model A, embedding model B, and embedding model C, embedding model A can be set to have the best embedding performance when English is input, embedding model B can be set to have the best embedding performance when Korean is input, and embedding model C can be set to have the best embedding performance when Chinese is input.

[0048] Since the languages of articles collected through crawling may vary, high-dimensional vectors can be output through embedding models optimized for each language. However, because a single article may contain languages from multiple countries, the present disclosure proposes a method to input the collected articles into all embedding models, rather than just a single embedding model, to output multiple high-dimensional vectors in order to derive the most accurate results.

[0049] As another example, a plurality of embedding models may include a plurality of first embedding models that output a first vector and a plurality of second embedding models that output a second vector, wherein the number of the plurality of first embedding models may be set to be equal to the number of the plurality of second embedding models.

[0050] The question answer device (100) of the present disclosure can call a Multi Embedding model every hour when an article is collected, output an embedding vector through a plurality of embedding models, and store the output results in parallel in a pre-set database.

[0051] As another example, the first similarity between articles can be calculated based on two first vectors selected from all of the first vectors output from the multiple embedding models and a preset algorithm.

[0052] In addition, the aforementioned preset algorithm used for calculating the first similarity may include at least one of a cosine similarity judgment algorithm and a Euclidean distance calculation algorithm.

[0053] The question answer device (100) of the present disclosure outputs a plurality of first vectors for each of a plurality of articles through a plurality of embedding models, determines duplicate articles based on a first similarity calculated through the aforementioned algorithm, and can remove one of the articles determined to be duplicate articles.

[0054] The cosine similarity algorithm is a technique that measures the degree of similarity between two data sets using a cosine function by comparing the directionality between two vectors. Similarity can be calculated by Equation 1, where θ is the angle, X and Y are the two vectors to be compared, X · Y is the dot product of the two vectors, and |X| and |Y| are the magnitudes of each vector.

[0055] Equation 1: $$\cos\theta = \frac{X \cdot Y}{|X|\,|Y|}$$

[0056] For example, if X is the vector [2, 1] and Y is the vector [1, 2], the dot product X · Y is 2*1 + 1*2 = 4, and |X| and |Y| are each √(2² + 1²) = √5. Therefore, the first similarity, cos θ, can be calculated as 4 / (√5 · √5) = 0.8.

[0057] Since cos θ takes only values between -1 and 1, the first similarity also takes only values between -1 and 1, and the closer cos θ is to 1, the higher the similarity between the two articles can be determined to be.

[0058] Accordingly, the question-answering device (100) of the present disclosure may remove one of the two articles corresponding to each of the two first vectors when there are two first vectors in which the first similarity calculated for two articles is greater than or equal to a preset first threshold, and all information regarding the first vector corresponding to the removed article may also be removed. The aforementioned preset first threshold is a real number and can be set in various ways as needed.
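The cosine similarity calculation and the threshold comparison can be sketched as follows, using the vectors [2, 1] and [1, 2] from the worked example. The function name and the threshold value of 0.75 are hypothetical; the disclosure only states that the first threshold is a real number set as needed. The sketch assumes non-zero vectors of equal length.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


FIRST_THRESHOLD = 0.75  # hypothetical value for the preset first threshold

similarity = cosine_similarity([2, 1], [1, 2])  # approximately 0.8
is_duplicate = similarity >= FIRST_THRESHOLD
```

With this assumed threshold, the two articles behind these vectors would be treated as duplicates and one of them removed.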

[0059] In addition, the aforementioned Euclidean distance calculation algorithm calculates the distance between two points, and the smaller the calculated distance, the higher the first similarity can be determined to be.

[0060] The Euclidean distance can be calculated based on Equation 2, where d(p,q) is the distance between two points p and q, x1 and y1 are the coordinates of p, and x2 and y2 are the coordinates of q.

[0061] Equation 2: $$d(p, q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

[0062] For example, assuming that p is the vector [2, 1] and q is the vector [1, 2], x1 is 2 and y1 is 1 in p, and x2 is 1 and y2 is 2 in q, so calculating the Euclidean distance based on Equation 2 yields √((2 - 1)² + (1 - 2)²) = √2. This result is compared with a preset second threshold: if there are two first vectors whose distance is less than the preset second threshold, one of the two articles corresponding to each of the two first vectors can be removed, and all information regarding the first vector corresponding to the removed article can also be removed. The aforementioned preset second threshold is a real number and can be set in various ways as needed.
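The Euclidean distance check can be sketched in the same way. The function generalizes the two-coordinate formula to vectors of any equal length; the function name and the threshold value of 1.5 are hypothetical.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


SECOND_THRESHOLD = 1.5  # hypothetical value for the preset second threshold

distance = euclidean_distance([2, 1], [1, 2])  # the square root of 2
is_duplicate = distance < SECOND_THRESHOLD
```

Note the direction of the comparison: for a distance, a smaller value indicates higher similarity, so the duplicate test uses "less than" rather than "greater than or equal to".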

[0063] In other words, the article refinement unit (120) of the present disclosure may consider two articles corresponding to each of two first vectors corresponding to a first similarity determined to belong to a first similarity range as duplicate articles, remove one of the two articles, and remove a plurality of first vectors corresponding to the removed article. The aforementioned first similarity range may be determined in various ways according to the algorithm used to calculate the first similarity and the set threshold.
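One possible way to apply this rule across several embedding models is sketched below. The disclosure does not state how similarities from different models are combined, so this sketch makes two assumptions: vectors are compared only when they come from the same model, and an article is treated as a duplicate if any single model reports a cosine similarity at or above the threshold. The function names, model names, vectors, and the threshold of 0.95 are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def remove_duplicates(vectors_by_article, threshold):
    """Keep the first article of each duplicate group.

    vectors_by_article maps article id -> {model name -> first vector}.
    Dropping an article drops all of its first vectors with it.
    """
    kept = {}
    for article_id, vectors in vectors_by_article.items():
        duplicate = any(
            cosine_similarity(vectors[model], kept_vectors[model]) >= threshold
            for kept_vectors in kept.values()
            for model in vectors
        )
        if not duplicate:
            kept[article_id] = vectors
    return kept


vectors_by_article = {
    "a1": {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]},
    "a2": {"model_a": [0.99, 0.05], "model_b": [0.6, 0.8]},  # near a1 under model_a
    "a3": {"model_a": [0.0, 1.0], "model_b": [1.0, 0.0]},
}

refined = remove_duplicates(vectors_by_article, threshold=0.95)
```

Here article a2 is removed because model_a places it almost in the same direction as a1, while a3 is kept because neither model finds it similar to a1.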

[0064] The question answer device (100) of the present disclosure includes a response generation unit (130) that outputs a plurality of second vectors for a query from a plurality of embedding models and generates a response corresponding to the query based on a second similarity calculated based on a plurality of first vectors for each of a plurality of articles and a plurality of second vectors for a query.

[0065] The present disclosure eliminates duplicate articles by using multiple embedding models that exhibit different performance depending on the type of input language, and also allows the use of multiple embedding models in determining the article corresponding to the input query. Through this, an optimal response to a query input into the LLM can be generated.

[0066] For example, the second similarity between a query and an article can be calculated based on a first vector and a second vector selected from all of the first vectors and second vectors output through the plurality of embedding models, and a preset algorithm.

[0067] As described above, the algorithm used for calculating similarity may include at least one of a cosine similarity judgment algorithm and a Euclidean distance calculation algorithm.

[0068]

[0069] For example, if there are 100 embedding models, 1 query, and 100 articles remaining after duplicate articles are removed, there are 100 second vectors for the query and 10,000 first vectors for the articles, so 1,000,000 second similarities can be calculated.

[0070] The second similarity calculation method can be applied in the same way as the first similarity calculation method described above.

[0071] As another example, the response generation unit (130) of the present disclosure may convert a first vector and a second vector corresponding to a second similarity determined to belong to a second similarity range into original text, and input the converted text into a pre-trained artificial intelligence model to generate the response based on the output result. The aforementioned second similarity range may be determined in various ways according to the algorithm used to calculate the second similarity and the set threshold.

[0072] The question-answering device (100) of the present disclosure may determine that the degree of similarity is high if the calculated second similarity is greater than or equal to a preset third threshold when a cosine similarity judgment algorithm is used for the second similarity calculation, and may determine that the degree of similarity is high if the calculated second similarity is less than a preset fourth threshold when a Euclidean distance calculation algorithm is used for the second similarity calculation. The aforementioned third threshold and fourth threshold are real numbers and can be set in various ways as needed.

[0073] Accordingly, when an article corresponding to the first vector determined to have the highest degree of similarity to the query is identified, the question-answering device (100) of the present disclosure can convert the second vector corresponding to the query and the first vector corresponding to the article back into text, and input the converted text into a pre-trained artificial intelligence model to generate a response to the query. The number of first vectors determined to have the highest degree of similarity to the query is not limited to one and can be set as needed.
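Selecting the most similar article and preparing the LLM input can be sketched as follows. Several points are assumptions rather than details from the disclosure: query and article vectors are compared only within the same model, the best per-model cosine similarity is used as the article's score, the original text is recovered by looking it up by article id rather than by inverting a vector, and the prompt wording is invented. No LLM is called; the sketch stops at building the prompt string.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def best_similarity(query_vectors, article_vectors):
    """Highest same-model cosine similarity between a query and one article."""
    return max(
        cosine_similarity(query_vectors[model], article_vectors[model])
        for model in query_vectors
    )


def select_articles(query_vectors, vectors_by_article, top_k=1):
    """Return the ids of the top_k articles ranked by similarity to the query."""
    ranked = sorted(
        vectors_by_article,
        key=lambda article_id: best_similarity(query_vectors, vectors_by_article[article_id]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query_text, article_texts):
    """Combine the query and the selected article texts into one LLM input string."""
    context = "\n\n".join(article_texts)
    return f"Answer the question using the articles below.\n\nArticles:\n{context}\n\nQuestion: {query_text}"


query_vectors = {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]}
vectors_by_article = {
    "a1": {"model_a": [0.9, 0.1], "model_b": [0.2, 0.9]},
    "a3": {"model_a": [0.0, 1.0], "model_b": [1.0, 0.0]},
}
texts = {"a1": "Battery plant opens.", "a3": "Football results."}

selected = select_articles(query_vectors, vectors_by_article, top_k=1)
prompt = build_prompt("What is new in batteries?", [texts[a] for a in selected])
```

Raising `top_k` corresponds to the statement that more than one article may be passed to the model.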

[0074] In addition, the aforementioned pre-trained artificial intelligence model may include a Large Language Model (LLM) that includes at least one transformer.

[0075] The question-answering device (100) of the present disclosure may, while generating a response to a question or analyzing the latest trends through an LLM, ultimately generate a Q&A set including various expected questions and expected answers.

[0076] The present disclosure has the advantage of being able to provide the optimal response intended by the user: duplicate articles are removed periodically so that near-real-time processing can be secured, the optimal article corresponding to a query is determined through a plurality of embedding models having different characteristics, and that article is input into an LLM.

[0077] Below, the overall process of generating a response corresponding to a query is explained in more detail with reference to a diagram.

[0078] FIG. 2 is a diagram for schematically explaining LLM, which is a type of artificial intelligence according to one embodiment.

[0079] Referring to FIG. 2, artificial intelligence (200) includes machine learning (ML) (210), deep learning (DL) (220), and LLM (240) as sub-concepts.

[0080] Specifically, artificial intelligence (200) is a field that performs repetitive learning in a manner similar to human intelligence and makes judgments based on the results of learning. Artificial intelligence (200) is a broad concept that includes machine learning (210) and deep learning (220), and machine learning (210) is used as a broad concept that includes deep learning (220).

[0081] Machine learning (210) is a field of artificial intelligence (200) that can learn patterns in data and perform decision-making or prediction. It is also a field that develops algorithms and technologies that enable computers to learn based on data, and it is a core technology in various fields such as image processing, image recognition, speech recognition, and internet search, showing excellent performance in prediction and detection.

[0082] The types of machine learning (210) learning methods include supervised learning, unsupervised learning, and reinforcement learning.

[0083] Supervised learning is a learning method in which both the input data and the correct answers for that data are provided to machine learning (210). Unsupervised learning is a learning method in which data is input to machine learning but no correct answers for that data are provided, so machine learning learns patterns in the data on its own.

[0084] In addition, reinforcement learning is a learning method in which an agent interacts with a given environment and is given a certain reward for actions or judgments made by the agent, and learns in a direction that maximizes the aforementioned reward.

[0085] In addition, machine learning (210) includes deep learning (220), which uses a hierarchical structure to learn patterns of large-scale data using an artificial neural network (ANN) to solve complex problems.

[0086] Deep learning (220) includes Convolutional Neural Networks (CNN) used for image and video processing, Recurrent Neural Networks (RNN) used for processing sequentially input data, Long Short-Term Memory (LSTM) used for processing time series data, and Generative Adversarial Networks (GAN) used for data augmentation.

[0087] In addition, natural language processing (NLP) (230) can be described as a field of artificial intelligence that enables a computer to understand, interpret, and generate natural language, which is the language used by humans.

[0088] In particular, the LLM (240), which is one of the artificial intelligence models used by the question-answering device of the present disclosure, is a type of artificial intelligence model trained on a vast amount of text data and operated through prompts, which are the data input to the model. The LLM (240) can generate a consistent response to various prompts. Additionally, the LLM (240) can translate language, generate text that meets conditions, or summarize text.

[0089] The LLM (240) may include a transformer that finds the relationship between words contained in the input text.

[0090] As a learning method of the LLM (240), learning can be performed through at least one of a pretraining method, which learns the patterns and structures of text using a prepared large-scale training dataset, and a fine-tuning method, which labels a part of the large-scale training dataset and performs learning according to the intended use of the LLM (240). Learning can also be performed using a few-shot learning method, which reflects examples in the training data as needed.

[0091] The question answering device of the present disclosure can generate an optimal response by first removing duplicate articles among collected articles using a plurality of embedding models, determining the article that most corresponds to the input query, and inputting it into the LLM, which is the language model described above.

[0092] FIG. 3 is a flowchart illustrating the learning process of an artificial intelligence model including an LLM according to one embodiment.

[0093] Referring to FIG. 3, a general learning process of an artificial intelligence model including an LLM is illustrated. Although there are various learning methods and processes for artificial intelligence models, an LLM is used as an example for convenience of explanation.

[0094] Specifically, training data to be used for training is collected, and preprocessing is performed on the collected training data (S300).

[0095] Training an LLM requires input training data. The present disclosure allows the LLM to be used as a model for efficiently recommending news articles and for calculating a first recommendation score for news articles, so news articles can be used as training data. These articles can be collected through the aforementioned crawling method and obtained from a pre-configured database.

[0096] Once training data is collected, preprocessing can be performed to input it into an LLM model. For example, if news articles are collected as training data, tokenization can be performed to classify the text of the collected articles into tokens, which are the smallest meaningful units; normalization can be performed to convert the text into a standard form (converting to lowercase, converting to numbers, removing special characters); meaningless words can be removed; or the text can be converted into a vector containing numerical values. The aforementioned preprocessing methods are merely examples and can be configured in various ways as needed.
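
The preprocessing steps listed above can be sketched as follows. This is a minimal illustration, assuming a whitespace tokenizer, a tiny stop-word list, ASCII-only normalization, and a bag-of-words count vector; none of these choices is specified in the disclosure, and the names STOP_WORDS, preprocess, and to_count_vector are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real system would use a fuller,
# language-specific list.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}

def preprocess(text: str) -> list[str]:
    """Normalize, tokenize, and drop stop words from raw article text."""
    lowered = text.lower()                           # normalization: lowercase
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)   # normalization: strip special characters
    tokens = cleaned.split()                         # tokenization: whitespace split
    return [t for t in tokens if t not in STOP_WORDS]

def to_count_vector(tokens: list[str], vocabulary: list[str]) -> list[int]:
    """Convert tokens to a numeric vector of per-word counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

tokens = preprocess("The price of Battery cells is rising; battery demand, too!")
vocab = sorted(set(tokens))
vector = to_count_vector(tokens, vocab)
```

Because the regular expression keeps only ASCII letters and digits, this sketch would discard Korean text; a multilingual pipeline would need a different normalization rule.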

[0097] Initialization of the artificial intelligence model to be trained is performed (S310).

[0098] Once the AI model is ready for training, the AI model can be initialized. Initializing the AI model involves initializing the weights or loss function used in the model. Here, initialization does not mean setting the weights to zero, but rather setting them to specific values or to values within a specific range.

[0099] When training is performed, a preset loss function is calculated to determine whether the data output from the artificial intelligence model matches reality (S320).

[0100] Through the aforementioned loss function, the performance of the model used in the training process can be evaluated, and weights can also be adjusted.

[0101] For example, the Mean Squared Error, which expresses the difference between the predicted value and the actual value output for an input value input into an artificial intelligence model, can be used as a loss function in the learning process of deep learning.

[0102] As another example, Cross-Entropy Loss can be used as a loss function in the training process of LLM. This loss function calculates the difference between the probability distribution of potential outputs from the LLM for an input text and the probability distribution of the actual answer.
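
The two loss functions mentioned above can be illustrated with a short sketch. It assumes a one-hot target for the cross-entropy case (a single correct token index), which is a common simplification rather than something the disclosure states; the function names are hypothetical.

```python
import math

def mean_squared_error(predicted: list[float], actual: list[float]) -> float:
    """Average of squared differences between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def cross_entropy(predicted_probs: list[float], target_index: int) -> float:
    """Cross-entropy against a one-hot target: -log of the probability
    the model assigned to the correct token."""
    eps = 1e-12  # guards against log(0)
    return -math.log(max(predicted_probs[target_index], eps))

mse = mean_squared_error([2.0, 4.0], [1.0, 6.0])      # (1 + 4) / 2
ce = cross_entropy([0.1, 0.7, 0.2], target_index=1)   # -ln(0.7)
```

In both cases a smaller value means the model output is closer to the actual answer.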

[0103] When training is completed, verification of the artificial intelligence model is performed to check whether the training was performed correctly (S330).

[0104] Validation of LLM is performed by evaluating the performance of the trained model using validation data separately from the training data. Validation methods that may be used include BLEU (Bilingual Evaluation Understudy) and ROUGE (Recall-Oriented Understudy for Gisting Evaluation).

[0105] In addition, the occurrence of overfitting can be determined during the validation process. Overfitting refers to a situation where the model is overly focused on training data, resulting in a decline in performance when non-training data is input.

[0106] The determination of whether overfitting has occurred can be made by whether the loss incurred during the validation process using validation data increases compared to the loss incurred when training is performed using training data.
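
One possible way to turn this comparison into a check is sketched below. The rule used here (validation loss rising for a number of consecutive epochs while training loss keeps falling) and the patience parameter are assumptions made for illustration; the disclosure does not fix a specific criterion.

```python
def is_overfitting(train_losses: list[float], val_losses: list[float], patience: int = 2) -> bool:
    """Return True when validation loss rose for `patience` consecutive epochs
    while training loss fell over the same epochs."""
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    if len(val_losses) <= patience:
        return False  # not enough history to judge
    recent = range(len(val_losses) - patience, len(val_losses))
    val_rising = all(val_losses[i] > val_losses[i - 1] for i in recent)
    train_falling = all(train_losses[i] < train_losses[i - 1] for i in recent)
    return val_rising and train_falling

# Hypothetical per-epoch losses: training keeps improving, validation turns upward.
train = [1.00, 0.70, 0.50, 0.35, 0.25]
val = [1.05, 0.80, 0.72, 0.78, 0.85]
overfit = is_overfitting(train, val)
```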

[0107] Once the verification of the artificial intelligence model is complete, a suitability evaluation of the model is performed, and if it is determined that it is not suitable, the aforementioned training and verification are repeated (S340).

[0108] Once all training and verification of the artificial intelligence model is completed, the model is deployed after a final evaluation is performed (S350).

[0110] FIG. 4 is a flowchart illustrating the process of generating a response to a query input through an LLM according to one embodiment.

[0111] Referring to FIG. 4, the question answering device of the present disclosure can generate a response corresponding to a query by inputting a query determined from a plurality of embedding models and at least one article information into a pre-trained LLM.

[0112] Specifically, the question answering device of the present disclosure inputs a query determined from a plurality of embedding models and at least one article information as text into the LLM (S400).

[0113] The question answering device of the present disclosure can input into the LLM a query determined from a plurality of embedding models together with at least one of title information, body information, media company name, feedback information, and publication date corresponding to the at least one article information.

[0114] The question answering device of the present disclosure may, as needed, translate the aforementioned query and article information into a specific language, such as English or Korean, before they are input into the LLM, or perform preprocessing to remove unnecessary information.

[0115] When text corresponding to the query and article information is input into the LLM, the text is separated into at least one word, and each word is converted into a high-dimensional vector (S410).

[0116] When text corresponding to a query and article information is input into an LLM, the question answering device of the present disclosure performs tokenization to separate the input text into words, and can convert each tokenized word into an embedding vector, which is a high-dimensional vector of numbers. For example, the word "BATTERY" can be converted into an embedding vector such as [0.1, -0.25, ...].
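
As a minimal sketch of this tokenize-then-embed step, assuming a whitespace tokenizer and a fixed lookup table with three dimensions (real LLMs typically use subword tokenizers and learned tables with far more dimensions), the conversion could look like the following. The table values and the names EMBEDDING_TABLE, tokenize, and embed are hypothetical.

```python
# Toy embedding table (an assumption); real models learn these values during training.
EMBEDDING_TABLE = {
    "battery": [0.1, -0.25, 0.4],
    "price": [0.3, 0.05, -0.2],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback vector for words missing from the table

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return text.lower().split()

def embed(tokens: list[str]) -> list[list[float]]:
    """Look up one embedding vector per token."""
    return [EMBEDDING_TABLE.get(token, UNKNOWN) for token in tokens]

vectors = embed(tokenize("BATTERY price outlook"))
```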

[0117] Accordingly, the question answering device of the present disclosure may input questions and article information as text in accordance with the input data format of the LLM, or may input them as the original embedding vector.

[0118] When each word is converted into an embedding vector which is a high-dimensional vector, it is input into a transformer to determine the association between words (S420).

[0119] When each embedding vector is input into the transformer, words associated with the words included in the input query are selected, and words that can be included in the response are selected based on the input article information to calculate the correct answer probability for each word.

[0120] The probability of being the correct answer is calculated for each word that can be included in the response, and an output vector associated with the word with the highest probability is generated (S430).

[0121] When an output vector is generated, the output vector is converted into text to output the conclusion through LLM (S440).

[0122] The converted text is output from the LLM as a response (S450).

[0123] The process of outputting a conclusion through the aforementioned LLM is merely an example, and may be performed by changing some of the order as necessary.
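
The selection in step S430 can be illustrated with a short sketch. It assumes the transformer has already produced one raw score per candidate word, and it uses a softmax followed by greedy selection, which is one common approach and not necessarily the one used in the disclosure. The candidate words and scores are hypothetical.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_next_word(vocabulary: list[str], scores: list[float]) -> tuple[str, float]:
    """Greedy selection: return the candidate word with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocabulary[best], probs[best]

# Hypothetical candidate words and scores standing in for transformer output.
word, prob = pick_next_word(["rising", "falling", "stable"], [2.0, 0.5, 1.0])
```

Repeating this selection word by word, and appending each chosen word to the input, yields the response text that is output in steps S440 and S450.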

[0124] FIG. 5 is a flowchart illustrating the process of removing duplicate articles from news articles collected through a plurality of embedding models according to one embodiment.

[0125] Referring to FIG. 5, the question-answering device of the present disclosure can calculate the similarity between embedding vectors converted through a plurality of embedding models for collected articles and delete duplicate articles based on the calculated similarity.

[0126] Specifically, the question-answering device of the present disclosure collects specific articles for a preset field or a preset period (S500).

[0127] The aforementioned articles may be collected from at least one website using a crawling technique and stored in the aforementioned pre-configured database. There are no restrictions on the field or period of the collected articles, and they may be collected in various ways depending on the settings.

[0128] When an article is collected, the question-answering device of the present disclosure inputs the article information into a multi-embedding model (S510).

[0129] The article information input into the embedding model may include at least one of the following: article title, body text, media outlet name, feedback information, and publication date. However, this is merely an example, and various inputs may be provided as needed.

[0130] Multiple embedding models included in a multi-embedding model differ in their performance in generating embedding vectors depending on the type of input language. The question-answering device of the present disclosure can input article information into multiple embedding models having different characteristics.

[0131] When article information is input into a multi-embedding model, the question-answering device of the present disclosure calculates an embedding vector from the first embedding model (S520).

[0132] Embedding vectors can convert each data point into a high-dimensional vector containing numbers while preserving the original meaning of the data.

[0133] In this disclosure, for the removal of duplicate articles, the embedding vector output from the embedding model into which article information is input may be referred to as the first vector.

[0134] When an embedding vector is calculated from the first embedding model, the question answering device of the present disclosure calculates an embedding vector from the second embedding model (S530).

[0135] For example, assuming there are 100 embedding models and 100 acquired articles, the number of first vectors generated can be 10,000 (100 × 100).
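
The count follows from producing one first vector per combination of embedding model and article. A minimal sketch is shown below, in which length_model and vowel_model are hypothetical stand-ins for real embedding models and the article texts are invented for illustration.

```python
from typing import Callable

Vector = list[float]
EmbeddingModel = Callable[[str], Vector]

# Hypothetical stand-ins for real embedding models: each maps text to a small vector.
def length_model(text: str) -> Vector:
    return [float(len(text)), float(len(text.split()))]

def vowel_model(text: str) -> Vector:
    return [float(sum(text.lower().count(v) for v in "aeiou")), float(text.count(" "))]

def embed_articles(
    articles: dict[str, str], models: dict[str, EmbeddingModel]
) -> dict[tuple[str, str], Vector]:
    """Return one first vector per (model, article) pair."""
    return {
        (model_id, article_id): model(text)
        for model_id, model in models.items()
        for article_id, text in articles.items()
    }

articles = {"a1": "Battery prices rise", "a2": "Steel demand falls", "a3": "Lithium supply grows"}
models = {"m1": length_model, "m2": vowel_model}
first_vectors = embed_articles(articles, models)  # 2 models x 3 articles
```

With 2 models and 3 articles the dictionary holds 6 vectors; with 100 of each it would hold 10,000.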

[0136] There is no limit to the number of embedding models included in the multi-embedding model. The two embedding models mentioned above are just examples, and the number can be set to various levels as needed.

[0137] When embedding vectors are calculated from all embedding models, the question answering device of the present disclosure stores the calculation results in a pre-set database (S540).

[0138] The question answering device of the present disclosure calculates a first similarity based on the calculated embedding vector (S550).

[0139] As described above, the first similarity can be calculated based on at least one of a cosine similarity determination algorithm and a Euclidean distance calculation algorithm.

[0140] When multiple embedding vectors are calculated for multiple articles, the question answering device of the present disclosure may select any two of the calculated embedding vectors to calculate a first similarity. However, since it is meaningless to calculate the similarity between two embedding vectors that were both calculated from the same article (for example, Article A), the device may be configured not to calculate the similarity in that case.

[0141] When the first similarity is calculated, the question answering device of the present disclosure removes information about any one of the duplicate articles and the embedding vector corresponding to the article (S560).

[0142] When the first similarity is calculated, the question answering device of the present disclosure determines whether two embedding vectors are duplicates based on the type of algorithm used to calculate the first similarity and the threshold set for it. If they are determined to be duplicates, the device removes one of the two articles corresponding to the embedding vectors, and can remove the information about the embedding vectors corresponding to the removed article from the preset database.
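
Steps S550 and S560 can be sketched as follows, using cosine similarity. The sketch makes two assumptions that the disclosure leaves open: vectors are compared only when they come from the same embedding model (vectors from different models are generally not in a comparable space), and two articles are treated as duplicates if any shared model reports a similarity at or above the threshold. The threshold value 0.95 and all names are hypothetical.

```python
import math
from itertools import combinations

def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def remove_duplicates(vectors: dict[str, dict[str, list[float]]], threshold: float = 0.95) -> set[str]:
    """vectors[article_id][model_id] holds an embedding. Returns the article ids to keep."""
    kept = set(vectors)
    # combinations() never pairs an article with itself, matching the rule above.
    for first, second in combinations(sorted(vectors), 2):
        if first not in kept or second not in kept:
            continue
        shared_models = vectors[first].keys() & vectors[second].keys()
        if any(
            cosine_similarity(vectors[first][m], vectors[second][m]) >= threshold
            for m in shared_models
        ):
            kept.discard(second)  # drop one article of the duplicate pair
    return kept

vectors = {
    "a1": {"m1": [1.0, 0.0], "m2": [0.5, 0.5]},
    "a2": {"m1": [0.99, 0.05], "m2": [0.5, 0.45]},  # near-copy of a1
    "a3": {"m1": [0.0, 1.0], "m2": [-0.5, 0.5]},
}
kept = remove_duplicates(vectors)
```

After the call, the stored vectors of the removed article can be deleted as well, for example with `{aid: vectors[aid] for aid in kept}`.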

[0143] FIG. 6 is a flowchart illustrating the process of generating a response corresponding to an input query through a plurality of embedding models according to one embodiment.

[0144] Referring to FIG. 6, the question answering device of the present disclosure can calculate a similarity between an embedding vector converted through a plurality of embedding models for an input query and an embedding vector for an article, determine at least one article corresponding to the query based on the calculated similarity, and generate a response through the LLM.

[0145] Specifically, the question answering device of the present disclosure receives a query (S600).

[0146] There are no restrictions on the format, type, or subject of the input query. When a query is input, it may be stored in a pre-configured database, and the question answering device of the present disclosure may retrieve the query by querying the database.

[0147] When a query is received, the question answering device of the present disclosure inputs the query into a multi-embedding model (S610).

[0148] When a query is input into a multi-embedding model, the question answering device of the present disclosure calculates an embedding vector from the first embedding model (S620).

[0149] When an embedding vector is calculated from the first embedding model, the question answering device of the present disclosure calculates an embedding vector from the second embedding model (S630).

[0150] Embedding vectors can convert a query into a high-dimensional vector containing numbers while preserving the original meaning of the query.

[0151] In this disclosure, an embedding vector output from an embedding model when a query is input may be referred to as a second vector.

[0152] If there is 1 input query and 100 embedding models, the number of second vectors generated can be 100. There is no limit to the number of embedding models included in the multi-embedding model. The two embedding models mentioned above are just examples, and the number can be set to various values as needed.

[0153] The embedding model used to remove duplicate articles and the embedding model used to calculate the second vector from the query may be the same embedding model, or may be an embedding model with different characteristics or performance. However, the present disclosure proposes setting the number of embedding models used to remove duplicate articles and the number of embedding models used to calculate the second vector to be the same.

[0154] When embedding vectors are calculated from all embedding models, the question answering device of the present disclosure can store the calculated results in a pre-set database.

[0155] When embedding vectors are calculated from all embedding models, the question answering device of the present disclosure retrieves embedding vectors for articles by querying a preset database (S640).

[0156] The question answering device of the present disclosure calculates a second similarity based on an embedding vector for a calculated question and an embedding vector for an article (S650).

[0157] The second similarity calculation can be applied in the same way as the first similarity calculation method.

[0158] When a second similarity is calculated, the question answering device of the present disclosure determines an embedding vector for an article corresponding to an embedding vector for a query based on the second similarity (S660).

[0159] The number of embedding vectors for articles corresponding to the embedding vector for the query is not limited to one and can be set to various values as needed.

[0160] When a query and an article corresponding to the query are determined, the question answering device of the present disclosure converts the embedding vectors for the query and the article into text, inputs the text into an LLM, and generates a response (S670).
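
Steps S650 to S670 can be sketched under simplifying assumptions: a single embedding model, cosine similarity as the second similarity, and original article text kept alongside each vector so that "converting back into text" becomes a lookup by article id. The call to the LLM itself is omitted, and the prompt wording, vectors, and names are hypothetical.

```python
import math

def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def top_articles(query_vec: list[float], article_vecs: dict[str, list[float]], k: int = 1) -> list[str]:
    """Rank article ids by similarity to the query vector and return the best k."""
    ranked = sorted(
        article_vecs,
        key=lambda aid: cosine_similarity(query_vec, article_vecs[aid]),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, article_texts: dict[str, str], selected_ids: list[str]) -> str:
    """Assemble the text that would be passed to the LLM (prompt wording is illustrative)."""
    context = "\n".join(f"- {article_texts[aid]}" for aid in selected_ids)
    return f"Answer the question using the articles below.\n{context}\nQuestion: {query}"

article_texts = {"a1": "Battery prices rise", "a2": "Football results", "a3": "Battery output grows"}
article_vecs = {"a1": [0.9, 0.1], "a2": [0.1, 0.9], "a3": [0.7, 0.3]}
selected = top_articles([1.0, 0.0], article_vecs, k=2)
prompt = build_prompt("What is happening to battery prices?", article_texts, selected)
```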

[0161] FIG. 7 is a flowchart for explaining a method for generating a response corresponding to a query according to one embodiment.

[0162] Referring to FIG. 7, the question and answer method of the present disclosure includes an information receiving step of receiving a query and a plurality of articles (S700).

[0163] The question answering device of the present disclosure can use a plurality of embedding models having different characteristics to determine and remove duplicate articles among a plurality of collected articles, determine an article corresponding to an input query, and generate a response corresponding to the query through an LLM.

[0164] For example, the aforementioned multiple articles may be collected from at least one website through a crawling technique based on at least one of a preset field and a preset period and stored in the preset database.

[0165] The aforementioned pre-set fields can be configured in various ways as needed, regardless of type or scope. For example, secondary batteries can be configured as a field required for article collection, and autonomous vehicles can be configured as a single field. Additionally, the aforementioned fields do not need to be configured as only one, and two or more fields may be configured.

[0166] The aforementioned preset period may be set in units of days, weeks, or months. However, the aforementioned period is not fixed as a single period and may be set in various ways as needed.

[0167] As described above, the question-and-answer device of the present disclosure can collect multiple articles from websites through crawling. Crawling means visiting at least one web page on the Internet through a web crawler, also called a spider or search engine bot, and collecting information contained within the web page.

[0168] Specifically, crawling is performed by setting a starting URL (Uniform Resource Locator), and if information corresponding to conditions such as a pre-set field or a pre-set period exists at that URL through a pre-configured library, the title or body content of an article is obtained as text. For the text information corresponding to the aforementioned conditions, a portion of the content contained therein can be extracted and stored in a pre-configured database.

[0169] Data collected through crawling may include information on article titles, body text, media outlet names, publication dates, and user reviews. However, this is merely an example, and various other types of information may be collected as needed.

[0170] In addition, the aforementioned pre-configured library may include the HTTP (HyperText Transfer Protocol) library, but is not limited to this, and various libraries may be used as needed.
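
The condition check applied to crawled pages can be sketched as below. The sketch assumes the pages have already been fetched and parsed into records, so no HTTP request is shown, and it treats a "field" as a list of keywords matched against the title and body. The Article record, the matches function, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Article:
    title: str
    body: str
    outlet: str
    published: date

def matches(article: Article, field_keywords: list[str], start: date, end: date) -> bool:
    """True when the article mentions any field keyword and falls inside the preset period."""
    text = f"{article.title} {article.body}".lower()
    in_field = any(keyword.lower() in text for keyword in field_keywords)
    return in_field and start <= article.published <= end

# Invented records standing in for parsed crawl results.
crawled = [
    Article("Secondary battery output grows", "Cell makers expand.", "Outlet A", date(2025, 3, 2)),
    Article("Football results", "Weekend scores.", "Outlet B", date(2025, 3, 3)),
    Article("Secondary battery recycling", "New plant opens.", "Outlet C", date(2024, 1, 15)),
]
selected = [
    a for a in crawled
    if matches(a, ["secondary battery"], date(2025, 3, 1), date(2025, 3, 31))
]
```

Only the first record passes both conditions: the second is outside the field, and the third is outside the period.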

[0171] The question-and-answer device of the present disclosure can receive information on articles collected by a crawling method and stored in a pre-configured database. In addition, the question-and-answer device of the present disclosure can receive a query entered by a user.

[0172] A query is data or command entered by a user to interact with a specific artificial intelligence model, and includes queries requesting specific information or facts, queries asking for opinions, queries requesting creation, queries requesting document translation or summarization, and queries requesting data analysis.

[0173] The present disclosure proposes a method of determining the article that corresponds most to the query determined through multiple embedding models, rather than directly inputting the query entered by the user into the language model LLM, and inputting the determined query and article into the LLM.

[0174] The question answering method of the present disclosure includes an article refinement step of outputting a plurality of first vectors for each of a plurality of articles from a plurality of embedding models having different characteristics, and removing duplicate articles based on a first similarity calculated based on the plurality of first vectors (S710).

[0175] Embedding is a technique that transforms each data point into a high-dimensional vector containing numbers while preserving the original meaning of the data.

[0176] The question-answering device of the present disclosure can output a plurality of first vectors for each of a plurality of articles from a plurality of embedding models having different characteristics. The content of the article input into the embedding model may be the article title, the body of the article, or both.

[0177] In addition, as mentioned above, multiple embedding models have different characteristics.

[0178] For example, different characteristics of multiple embedding models may include characteristics in which the performance of the embedding differs depending on the type of language included in the text input to the embedding model.

[0179] Since the languages of articles collected through crawling may vary, high-dimensional vectors can be output through embedding models optimized for each language. However, because a single article may contain languages from multiple countries, the present disclosure proposes a method to input the collected articles into all embedding models, rather than just a single embedding model, to output multiple high-dimensional vectors in order to derive the most accurate results.

[0180] As another example, a plurality of embedding models may include a plurality of first embedding models that output a first vector and a plurality of second embedding models that output a second vector, wherein the number of the plurality of first embedding models may be set to be equal to the number of the plurality of second embedding models.

[0181] As another example, the first similarity between multiple articles can be calculated based on two first vectors selected from all of the first vectors output from the multiple embedding models and a preset algorithm.

[0182] In addition, the aforementioned preset algorithm used for calculating the first similarity may include at least one of a cosine similarity judgment algorithm and a Euclidean distance calculation algorithm.

[0183] The question answering device of the present disclosure outputs a plurality of first vectors for each of a plurality of articles through a plurality of embedding models, determines duplicate articles based on a first similarity calculated through the aforementioned algorithm, and can remove one of the articles determined to be duplicate articles.

[0184] The cosine similarity judgment algorithm is a technique that measures the degree of similarity between two data items by comparing the directions of two vectors using the cosine function. Similarity can be calculated by the aforementioned mathematical formula 1, where θ is the angle between the two vectors, X and Y are the two vectors to be compared, X·Y is the dot product of the two vectors, and |X| and |Y| are the magnitudes of each vector.

[0185] Since cos θ, the first similarity, takes only values between -1 and 1, it can be determined that the closer cos θ is to 1, the higher the similarity between the two articles.
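
Using the symbols defined above, cosine similarity is the dot product X·Y divided by the product of the magnitudes |X| and |Y|. A minimal sketch follows; the function name and the choice to reject zero-length vectors (for which the angle is undefined) are assumptions.

```python
import math

def cosine_similarity(x: list[float], y: list[float]) -> float:
    """cos(theta) = (X . Y) / (|X| * |Y|) for two vectors of equal dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors, close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])      # perpendicular vectors
opposite = cosine_similarity([1.0, 0.0], [-2.0, 0.0])       # opposite directions
```

The three calls illustrate the range described above: a value near 1 for vectors pointing the same way, 0 for perpendicular vectors, and -1 for opposite vectors. Vector length does not affect the result.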

[0186] Accordingly, the question-answering device of the present disclosure can remove one of the two articles corresponding to each of the two first vectors when there are two first vectors in which the first similarity calculated for two articles is greater than or equal to a preset first threshold, and all information regarding the first vector corresponding to the removed article can also be removed. The aforementioned preset first threshold is a real number and can be set in various ways as needed.

[0187] In addition, the aforementioned Euclidean distance calculation algorithm calculates the distance between two points, and the smaller the calculated distance, the higher the first similarity is determined to be.

[0188] The Euclidean distance can be calculated based on the aforementioned mathematical formula 2, in which d(p,q) is the distance between two points p and q, x1 and y1 are the coordinates of p, and x2 and y2 are the coordinates of q.

[0189] When the calculated distance is compared with a preset second threshold, if there are two first vectors whose distance is less than the preset second threshold, one of the two articles corresponding to each of the two first vectors can be removed, and all information regarding the first vector corresponding to the removed article can also be removed. The aforementioned preset second threshold is a real number and can be set in various ways as needed.
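
A minimal sketch of the distance calculation and the threshold comparison is given below. It generalizes the two-coordinate description to vectors of any dimension, since embedding vectors are high-dimensional; the function names and threshold values are hypothetical.

```python
import math

def euclidean_distance(p: list[float], q: list[float]) -> float:
    """Square root of the sum of squared coordinate differences between p and q."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_duplicate_by_distance(p: list[float], q: list[float], second_threshold: float) -> bool:
    """Two first vectors count as duplicates when their distance is below the threshold."""
    return euclidean_distance(p, q) < second_threshold

d = euclidean_distance([0.0, 0.0], [3.0, 4.0])  # classic 3-4-5 right triangle
near = is_duplicate_by_distance([1.0, 1.0], [1.0, 1.1], second_threshold=0.5)
far = is_duplicate_by_distance([0.0, 0.0], [3.0, 4.0], second_threshold=0.5)
```

Unlike cosine similarity, Euclidean distance is sensitive to vector length, so a suitable threshold depends on the scale of the embedding model's output.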

[0190] In other words, the question answering device of the present disclosure may treat two articles as duplicates when the first similarity calculated from their respective first vectors is determined to belong to a first similarity range, remove one of the two articles, and remove the plurality of first vectors corresponding to the removed article. The aforementioned first similarity range may be determined in various ways according to the algorithm used to calculate the first similarity and the set threshold.

[0191] The question answering method of the present disclosure includes a response generation step of outputting a plurality of second vectors for a query from a plurality of embedding models and generating a response corresponding to the query based on a second similarity calculated based on a plurality of first vectors for each of a plurality of articles and a plurality of second vectors for a query (S720).

[0192] The present disclosure eliminates duplicate articles by using multiple embedding models that exhibit different performance depending on the type of input language, and also allows the use of multiple embedding models in determining the article corresponding to the input query. Through this, an optimal response to a query input into the LLM can be generated.

[0193] For example, the second similarity between a query and an article can be calculated based on a first vector selected from the plurality of first vectors output through the plurality of embedding models, a second vector selected from the plurality of second vectors, and a preset algorithm.

[0194] As described above, the algorithm used for calculating similarity may include at least one of a cosine similarity judgment algorithm and a Euclidean distance calculation algorithm.

[0195] The second similarity calculation method can be applied in the same way as the first similarity calculation method described above.

[0196] As another example, the question-answering device of the present disclosure may convert a first vector and a second vector corresponding to a second similarity determined to belong to a second similarity range into original text, input the converted text into a pre-trained artificial intelligence model, and generate the response based on the output result. The aforementioned second similarity range may be determined in various ways depending on the algorithm used to calculate the second similarity and the set threshold.

[0197] The question-answering device of the present disclosure may determine that the degree of similarity is high if the calculated second similarity is greater than or equal to a preset third threshold when a cosine similarity determination algorithm is used for the second similarity calculation, and may determine that the degree of similarity is high if the calculated second similarity is less than a preset fourth threshold when a Euclidean distance calculation algorithm is used for the second similarity calculation. The aforementioned third threshold and fourth threshold are real numbers and can be set in various ways as needed.
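
Because the comparison direction differs between the two algorithms, the decision can be wrapped in a single helper, sketched below. The function name, the algorithm labels "cosine" and "euclidean", and the threshold values are hypothetical.

```python
def is_match(value: float, algorithm: str, threshold: float) -> bool:
    """Decide whether a calculated second similarity indicates a high degree of similarity.

    Cosine similarity: higher is more similar, so the value must reach the threshold.
    Euclidean distance: lower is more similar, so the value must stay below the threshold.
    """
    if algorithm == "cosine":
        return value >= threshold
    if algorithm == "euclidean":
        return value < threshold
    raise ValueError(f"unknown algorithm: {algorithm}")

cosine_hit = is_match(0.9, "cosine", threshold=0.8)          # third threshold
euclidean_hit = is_match(0.3, "euclidean", threshold=0.8)    # fourth threshold
euclidean_miss = is_match(0.9, "euclidean", threshold=0.8)
```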

[0198] Accordingly, when an article corresponding to a first vector determined to have the highest degree of similarity to a query is determined, the question-answering device of the present disclosure can convert the second vector corresponding to the query and the first vector corresponding to the article back into text, and input the converted result into a pre-trained artificial intelligence model to generate a response to the query. The first vector determined to have the highest degree of similarity to the aforementioned query is not limited to one, but can be determined in various numbers as needed.

[0199] Additionally, the aforementioned pre-trained artificial intelligence model may include a Large Language Model (LLM) comprising at least one Transformer. Alternatively, the pre-trained artificial intelligence model may refer to various language models or foundation models.

[0200] Through the operation of the aforementioned configurations, the optimal response among the expected responses to the input query can be provided, and the overall performance of the model can also be improved by using multiple embedding models.

[0201] FIG. 8 is a configuration diagram of a computing device including an artificial intelligence model according to one embodiment.

[0202] Referring to FIG. 8, the computing device (800) may include memory (810) and a processor (820), and the memory may include at least one artificial intelligence model (830).

[0203] The memory (810) can store a program for the operation of the processor (820) and can temporarily or permanently store input / output data. The memory (810) may include at least one type of storage medium among RAM, SRAM, ROM, EEPROM, PROM, magnetic memory, magnetic disk, optical disk, hard disk type, multimedia card micro type, flash memory type, card type memory (e.g., SD or XD memory, etc.), volatile memory (e.g., SRAM, DRAM), or non-volatile memory (e.g., NAND Flash).

[0204] In addition, the memory (810) can store various functions and algorithms, and can store various data, applications, software, commands, code, etc.

[0205] The processor (820) can control the overall operation of the question answering device of the present disclosure. The processor (820) can execute one or more programs and may mean a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or a dedicated processor on which methods according to some embodiments of the present disclosure are performed.

[0206] Meanwhile, the computing device (800) of the present disclosure may be a quantum computing device rather than a classical computing device. A quantum computing device performs operations in units of qubits rather than bits. A qubit can be in a superposition of 0 and 1, and M qubits can represent 2^M states simultaneously.

[0207] A quantum computing device can use various types of quantum gates (e.g., Pauli / Rotation / Hadamard / CNOT / SWAP / Toffoli) that receive one or more qubits to perform quantum operations and perform specified operations, and can combine quantum gates to form a quantum circuit with a special function.

[0208] Quantum computing devices can use quantum artificial neural networks (e.g., QCNN, QGRNN) that can perform functions of conventional artificial neural networks (e.g., CNN, RNN) at a faster speed while using fewer parameters.

[0209] Additionally, the memory (810) may store an artificial intelligence model (830) comprising a plurality of embedding models that calculate embedding vectors for input queries and articles of the present disclosure, and an LLM that generates a response to a query. When a task to calculate embedding vectors for input queries or articles or a task to generate a response to a query is requested, the processor (820) may execute the artificial intelligence model (830) stored in the memory (810) to generate embedding vectors for queries or articles and a response to a query, and output the result.

[0210] For example, the processor (820) receives a query and multiple articles, outputs multiple first vectors for each of the multiple articles from multiple embedding models having different characteristics, removes duplicate articles based on a first similarity calculated based on the multiple first vectors, outputs multiple second vectors for the query from multiple embedding models, and can generate a response corresponding to the query based on a second similarity calculated based on the multiple first vectors for each of the multiple articles and the multiple second vectors for the query.

[0211] FIG. 9 is a configuration diagram of a computer system including a client-server that includes an artificial intelligence model according to one embodiment.

[0212] Referring to FIG. 9, a computing system according to one embodiment of the present invention may include a computer device (900) including memory (930) and a processor (940) and a server (910) including memory (950) and a processor (960). The computer device (900) and the server (910) may be connected via a wired or wireless connection through a network (920).

[0213] The network (920) connecting the aforementioned computer device (900) and server (910) may be configured as any of various types of networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), a Value Added Network (VAN), or a mobile radio communication network.

[0214] The memory (930) of the computer device (900) can store article information about collected articles or information about input queries.

[0215] The memory (950) of the server (910) can store an artificial intelligence model (970) including a plurality of embedding models that produce embedding vectors for the aforementioned input query and article, and an LLM that generates a response to the query.

[0216] The processor (940) of the computer device (900) can transmit to the server (910) a request for the generation of an embedding vector for an input query or article stored in memory (930) or a request for the generation of a response to the query.

[0217] In response to the received request, the processor (960) of the server (910) can generate the aforementioned embedding vector or response to the query using the artificial intelligence model (970), which includes a plurality of embedding models that calculate embedding vectors for the input query and article and an LLM that generates a response to the query, and can transmit the result to the computer device (900).
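
The request and response exchange between the computer device (900) and the server (910) can be sketched as follows. This is an in-process illustration only: the JSON message fields (`task`, `text`, `ok`, `result`, `error`), the function names, and the toy models are all assumptions, and a direct function call stands in for the network (920).

```python
import json
from typing import Callable

Embedder = Callable[[str], list[float]]


def make_server(embedders: list[Embedder],
                llm: Callable[[str], str]) -> Callable[[str], str]:
    """Build a request handler playing the role of the server (910)."""
    def handle(raw_request: str) -> str:
        request = json.loads(raw_request)
        if request["task"] == "embed":
            # One embedding vector per embedding model.
            result = [model(request["text"]) for model in embedders]
        elif request["task"] == "respond":
            result = llm(request["text"])
        else:
            return json.dumps({"ok": False, "error": "unknown task"})
        return json.dumps({"ok": True, "result": result})
    return handle


def client_request(send: Callable[[str], str], task: str, text: str):
    """Role of the computer device (900): serialize, send, decode the reply."""
    reply = json.loads(send(json.dumps({"task": task, "text": text})))
    if not reply["ok"]:
        raise ValueError(reply["error"])
    return reply["result"]


# Hypothetical stand-ins for the embedding models and the LLM.
def length_model(text: str) -> list[float]:
    return [float(len(text))]


def vowel_model(text: str) -> list[float]:
    return [float(sum(ch in "aeiou" for ch in text.lower()))]


def echo_llm(prompt: str) -> str:
    return "Answer to: " + prompt


server = make_server([length_model, vowel_model], echo_llm)
vectors = client_request(server, "embed", "steel prices")
reply = client_request(server, "respond", "What happened to steel prices?")
```

Keeping the models on the server side means the computer device only needs to store queries and article information, which matches the division of memory contents described above.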

[0218] The foregoing description is merely an illustrative explanation of the technical concept of the present disclosure, and those skilled in the art to which the present disclosure pertains may make various modifications and variations without departing from the essential characteristics of the technical concept. Furthermore, since these embodiments are intended to explain rather than limit the technical concept, the scope of the technical concept is not limited by these embodiments. The scope of protection of the present disclosure shall be interpreted by the claims below, and all technical concepts within an equivalent scope shall be interpreted as being included within the scope of rights of the present disclosure.

[0219]

[0220] CROSS-REFERENCE TO RELATED APPLICATION

[0221] This patent application claims priority pursuant to Section 119(a) of the U.S. Patent Act (35 USC § 119(a)) to Korean Patent Application No. 10-2024-0191114 filed on December 19, 2024, the entire contents of which are incorporated by reference into this patent application. Additionally, this patent application claims priority in countries other than the United States for the same reasons as above, and the entire contents of that application are likewise incorporated by reference into this patent application.

Claims

1. A question answering device comprising: an information receiving unit that receives a query and a plurality of articles; an article refinement unit that outputs a plurality of first vectors for each of the plurality of articles from a plurality of embedding models having different characteristics, and removes duplicate articles based on a first similarity calculated based on the plurality of first vectors; and a response generation unit that outputs a plurality of second vectors for the query from the plurality of embedding models, and generates a response corresponding to the query based on a second similarity calculated based on the plurality of first vectors for each of the plurality of articles and the plurality of second vectors for the query.

2. The question answering device of claim 1, wherein the plurality of articles are collected from at least one website through a crawling technique based on at least one of a preset field and a preset period, and are stored in a preset database.

3. The question answering device of claim 1, wherein the plurality of embedding models include a plurality of first embedding models that output the first vectors and a plurality of second embedding models that output the second vectors, and the number of the first embedding models is equal to the number of the second embedding models.

4. The question answering device of claim 1, wherein the different characteristics of the plurality of embedding models include embedding performance that differs depending on the type of language contained in the text input to the embedding models.

5. The question answering device of claim 1, wherein the first similarity between the plurality of articles is calculated based on a preset algorithm and two first vectors selected from among all of the first vectors output from the plurality of embedding models.

6. The question answering device of claim 5, wherein the article refinement unit removes one of the news articles corresponding respectively to the two first vectors for which the first similarity is determined to fall within a first similarity range, and removes the plurality of first vectors corresponding to the removed article.

7. The question answering device of claim 5, wherein the second similarity between the query and the article is calculated based on the plurality of first vectors output through the plurality of embedding models, one first vector selected from the total of two vectors, the second vector, and a preset algorithm.

8. The question answering device of claim 7, wherein the preset algorithm includes at least one of a cosine similarity determination algorithm and a Euclidean distance calculation algorithm.

9. The question answering device of claim 1, wherein the response generation unit converts the first vector and the second vector corresponding to the second similarity determined to fall within a second similarity range back into the original text, inputs the converted text into a pre-trained artificial intelligence model, and generates the response based on the output result.

10. The question answering device of claim 9, wherein the pre-trained artificial intelligence model includes a Large Language Model (LLM) comprising at least one transformer.

11. A question answering method comprising: an information receiving step of receiving a query and a plurality of articles; an article refinement step of outputting a plurality of first vectors for each of the plurality of articles from a plurality of embedding models having different characteristics, and removing duplicate articles based on a first similarity calculated based on the plurality of first vectors; and a response generation step of outputting a plurality of second vectors for the query from the plurality of embedding models, and generating a response corresponding to the query based on a second similarity calculated based on the plurality of first vectors for each of the plurality of articles and the plurality of second vectors for the query.

12. The question answering method of claim 11, wherein the plurality of articles are collected from at least one website through a crawling technique based on at least one of a preset field and a preset period, and are stored in a preset database.

13. The question answering method of claim 11, wherein the plurality of embedding models include a plurality of first embedding models that output the first vectors and a plurality of second embedding models that output the second vectors, and the number of the first embedding models is equal to the number of the second embedding models.

14. The question answering method of claim 11, wherein the different characteristics of the plurality of embedding models include embedding performance that differs depending on the type of language contained in the text input to the embedding models.

15. The question answering method of claim 11, wherein the first similarity between the plurality of articles is calculated based on a preset algorithm and two first vectors selected from among all of the first vectors output from the plurality of embedding models.

16. The question answering method of claim 15, wherein the article refinement step removes one of the news articles corresponding respectively to the two first vectors for which the first similarity is determined to fall within a first similarity range, and removes the plurality of first vectors corresponding to the removed article.

17. The question answering method of claim 15, wherein the second similarity between the query and the article is calculated based on the plurality of first vectors output through the plurality of embedding models, one first vector selected from the total of two vectors, the second vector, and a preset algorithm.

18. The question answering method of claim 17, wherein the preset algorithm includes at least one of a cosine similarity determination algorithm and a Euclidean distance calculation algorithm.

19. The question answering method of claim 11, wherein the response generation step converts the first vector and the second vector corresponding to the second similarity determined to fall within a second similarity range back into the original text, inputs the converted text into a pre-trained artificial intelligence model, and generates the response based on the output result.

20. The question answering method of claim 19, wherein the pre-trained artificial intelligence model includes a Large Language Model (LLM) comprising at least one transformer.