Context generation method and apparatus

By introducing sliding window technology and a comprehensive scoring mechanism into the RAG system, and combining token-level F1 scores, Jaccard similarity, and keyword hit evaluation, the problem of low matching degree between answers and original corpora in the RAG system is solved, and the accuracy and relevance of generated answers are improved.

CN122242449APending Publication Date: 2026-06-19BEIJING JIZHI DIGITAL TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING JIZHI DIGITAL TECH CO LTD
Filing Date
2026-03-09
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, the answers generated by RAG systems have a low degree of matching with the original corpus, resulting in insufficient accuracy and relevance, and may introduce information redundancy, reducing user trust.

Method used

By combining keyword hit rewards and comprehensive scores, the selected segments are ensured to be highly consistent with the prediction results and accurately aligned with the original corpus. The sliding window technique is used to extract segments from the top-k similar text blocks, and token-level F1 scores, Jaccard similarity, and keyword hit evaluations are performed to select the optimal segment as the target context data.

Benefits of technology

It significantly improved the accuracy and relevance of answers, optimized the overall performance of the RAG system, and ensured that the generated contextual data was highly relevant to user needs and complete in information.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122242449A_ABST
    Figure CN122242449A_ABST
Patent Text Reader

Abstract

This application discloses a context generation method and apparatus, relating to the fields of artificial intelligence and other technologies. The context generation method includes: acquiring text data input by a user and encoding it to obtain a text vector; calculating the vector similarity between the text vector and knowledge vector data to obtain target data; obtaining initial context data based on the target data and text data; and performing alignment optimization processing based on the initial context data, text data, and target data to obtain target context data.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This application relates to the fields of artificial intelligence and other technologies, and in particular to a context generation method and apparatus. Background Technology

[0002] Retrieval-Augmented Generation (RAG) is an architecture that combines an external knowledge base with a generative language model. This method uses a retrieval mechanism to filter query-related text fragments from a large number of documents, then combines the retrieval context with the original query and inputs it into a language model to generate an accurate answer based on external knowledge, effectively solving the problem of lagging knowledge updates in traditional generative models.

[0003] Related technologies typically use the initial predictions generated by large language models. However, this approach may result in a low degree of matching between the answers and the original corpus, leading to insufficient accuracy and relevance. It may also introduce information redundancy, which can reduce user trust and result in unsatisfactory performance of evaluation metrics. Summary of the Invention

[0004] The embodiments of this application aim to at least partially solve one of the technical problems in the related art. Therefore, the purpose of the embodiments of this application is to provide a context generation method, apparatus, device, and medium that significantly improves the accuracy and relevance of context data.

[0005] This application provides a context generation method, including: acquiring text data input by a user and encoding it to obtain a text vector; calculating the vector similarity between the text vector and knowledge vector data to obtain target data; obtaining initial context data based on the target data and text data; and performing alignment optimization processing based on the initial context data, text data, and target data to obtain target context data.

[0006] For example, alignment optimization processing is performed based on initial context data, text data, and target data to obtain target context data, including: extracting keywords from the text data to obtain keyword data; performing segmentation processing on the target data based on a preset window length and a preset sliding window length to obtain multiple segment data; and evaluating the multiple segment data based on the initial context data and keyword data to obtain target context data. For example, the preset window length includes multiple window lengths; the target data is segmented based on the preset window length and the preset sliding window length to obtain multiple segment data, including: segmenting the target data based on the multiple window lengths to obtain multiple initial segment data; and segmenting the target data according to the preset sliding window length, starting from each initial segment data, to obtain multiple segment data.

[0007] For example, evaluating multiple fragment data based on initial context data and keyword data to obtain target context data includes: performing a first similarity evaluation and a second similarity evaluation on multiple fragment data based on the initial context data to obtain a first evaluation result and a second evaluation result; obtaining initial score data based on the first evaluation result and the second evaluation result; performing keyword evaluation on multiple fragment data based on keyword data to obtain a third evaluation result; obtaining target score data based on the initial score data and the third evaluation result; and obtaining target context data based on the target score data and a preset threshold. For example, obtaining initial score data based on the first evaluation result and the second evaluation result includes: performing a product operation based on the first evaluation result and the weight coefficient to obtain a first operation result; performing a difference operation on the first preset value and the weight coefficient to obtain a difference result, and performing a product operation based on the difference result and the second evaluation result to obtain a second operation result; and performing a summation operation on the first operation result and the second operation result to obtain the initial score data.

[0008] For example, the target score data is obtained based on the initial score data and the third evaluation result, including: performing a product operation based on the third evaluation result and the second preset value to obtain the third operation result; and obtaining the target score data based on the initial score data and the third operation result. For example, a third evaluation result is obtained by evaluating multiple data segments based on keyword data, including: extracting keywords from each data segment; comparing the keywords with the keyword data to obtain keyword hit data for each data segment; and obtaining the third evaluation result based on the keyword hit data. For example, initial context data is obtained based on target data and text data, including: obtaining prompt word data based on target data and text data; obtaining initial context data based on prompt word data; and / or knowledge vector data is obtained by: acquiring knowledge data; dividing the knowledge data into blocks to obtain multiple text blocks; and encoding the multiple text blocks to obtain knowledge vector data.

[0009] For example, obtaining target context data based on target score data and a preset threshold includes: determining target segment data from multiple segment data based on the target score data; when the target score data corresponding to the target segment data is greater than or equal to the preset threshold, using the target segment data as target context data, wherein when the target score data corresponding to the target segment data is less than the preset threshold, using the initial context data as target context data.

[0010] Another embodiment of this application provides a context generation device, which includes: a processing module for acquiring text data input by a user and encoding it to obtain a text vector; a calculation module for calculating the vector similarity between the text vector and knowledge vector data to obtain target data; an acquisition module for obtaining initial context data based on the target data and the text data; and an optimization module for performing alignment optimization processing based on the initial context data, the text data, and the target data to obtain target context data.

[0011] Another embodiment of this application provides an electronic device having a computer program stored thereon, which, when executed by a processor, implements the steps of the method of any of the above embodiments.

[0012] Another embodiment of this application provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the method of any of the above embodiments.

[0013] In the above embodiments, the context generation method includes: acquiring and encoding user-input text data to obtain text vectors; calculating the vector similarity between the text vectors and knowledge vector data to obtain target data; obtaining initial context data based on the target data and text data; and performing alignment optimization processing based on the initial context data, text data, and target data to obtain target context data. By acquiring and encoding user-input text data, high-quality text vectors are generated, achieving accurate representation of user intent. By performing efficient vector similarity calculation between the text vectors and the knowledge vector database, highly relevant target data can be retrieved quickly and accurately, significantly improving the accuracy and recall rate of information matching. Furthermore, an initial context is generated based on the target data and the original text data, and alignment optimization processing is performed by fusing the text data and the target data to finally construct target context data that highly matches user needs, is semantically coherent, and has complete information, improving the accuracy of the target context data. Attached Figure Description

[0014] Figure 1 A flowchart of the context generation method provided for embodiments of this application; Figure 2 Flowchart of another context generation method provided for embodiments of this application; Figure 3 Block diagram of a context generation apparatus provided for another embodiment of this application; Figure 4 A block diagram of an electronic device provided for another embodiment of this application. Detailed Implementation

[0015] The embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain this application, and should not be construed as limiting this application.

[0016] Retrieval-Augmented Generation (RAG) is an architecture that combines an external knowledge base with a generative language model. This method uses a retrieval mechanism to filter query-related text fragments from a large number of documents, then combines the retrieval context with the original query and inputs it into a language model to generate an accurate answer based on external knowledge, effectively solving the problem of lagging knowledge updates in traditional generative models.

[0017] Related technologies typically use the initial predictions generated by large language models. However, this approach may result in a low degree of matching between the answers and the original corpus, leading to insufficient accuracy and relevance. It may also introduce information redundancy, which can reduce user trust and result in unsatisfactory performance of evaluation metrics.

[0018] Therefore, how to more accurately align the generated answer with the original corpus while preserving the model's judgment direction, in order to improve the accuracy of the prediction results and the matching degree of the reference answer, has become an important technical challenge for RAG system optimization.

[0019] In view of this, this application proposes a context generation method that combines a keyword hit reward and a comprehensive score scoring mechanism to ensure that the selected segments are highly consistent with the prediction results and accurately aligned with the original corpus, thereby significantly improving the accuracy and relevance of the answers.

[0020] Figure 1 A flowchart illustrating the context generation method provided for embodiments of this application.

[0021] like Figure 1 As shown, the context generation method 100 provided in this application includes, for example, steps S110-S140.

[0022] Step S110: Obtain the text data input by the user and encode it to obtain a text vector.

[0023] For example, the encoding process is implemented based on an embedding model, such as text2vec, where the text data input by the user is encoded to obtain a text vector.

[0024] Step S120: Calculate the vector similarity between the text vector and the knowledge vector data to obtain the target data.

[0025] For example, knowledge vector data is obtained by processing knowledge data (such as chunking and encoding). Chunking the knowledge data yields multiple text blocks, and encoding the text blocks yields knowledge vector data. The knowledge data includes document data corresponding to an external knowledge base. The knowledge vector data is stored in a vector database (such as Faiss) in the form of vectors. Vector similarity includes cosine similarity. By calculating the cosine similarity between the text vectors and the knowledge vector data, the cosine similarities are sorted, and the top-k text blocks (the text blocks corresponding to the top k cosine similarities (the knowledge data after chunking)) and the corresponding knowledge vector data are selected as the target data.

[0026] Step S130: Based on the target data and text data, obtain the initial context data.

[0027] For example, prompt word data can be obtained based on target data and text data. The prompt word data is then input into a large language model (e.g., Qwen-7B-Chat). The large language model generates prediction results based on the prompt word data, and uses the prediction results as initial context data.

[0028] Step S140: Alignment optimization processing is performed based on the initial context data, text data, and target data to obtain the target context data.

[0029] For example, the top-k text blocks (target data) are segmented based on a preset window length and a preset sliding window length to obtain multiple segment data. These segment data are then evaluated based on the initial context data and the keyword data corresponding to the text data to obtain target score data and target segment data. When the target score data is greater than or equal to a preset threshold, the target segment data is used as the target context data; when the target score data is less than the preset threshold, the initial context data is used as the target context data. Evaluation metrics (such as BLEU (Bilingual Evaluation Understudy) and ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation - Longest Common Subsequence)) are used to measure the consistency between the generated target context data and the reference answer (a manually annotated, generally accepted correct standard answer, usually written by domain experts).

[0030] In the above embodiments, by acquiring and encoding user-input text data, high-quality text vectors are generated, achieving accurate representation of user intent. By efficiently calculating vector similarity between the text vectors and a knowledge vector database, highly relevant target data can be retrieved quickly and accurately, significantly improving the accuracy and recall of information matching. Furthermore, an initial context is generated based on this target data and the original text data. Through alignment optimization by fusing the text data and target data, target context data that highly matches user needs, is semantically coherent, and informationally complete is finally constructed, improving the accuracy of the target context data.

[0031] In one example, initial context data is obtained based on target data and text data, including: obtaining prompt word data based on target data and text data; obtaining initial context data based on prompt word data; and / or knowledge vector data is obtained by: acquiring knowledge data; segmenting the knowledge data into multiple text blocks; and encoding the multiple text blocks to obtain knowledge vector data.

[0032] Specifically, prompt word data can be obtained based on target data and text data. The prompt word data is then input into a large language model (e.g., Qwen-7B-Chat). The large language model generates prediction results based on the prompt word data, and uses the prediction results as initial context data.

[0033] For example, user questions (text data) and "top-k similar text blocks" (target data) are assembled into cue words and fed into a large language model, which generates predictions based on the cue words (initial context data).

[0034] Knowledge vector data is obtained by segmenting an external knowledge base (knowledge data) into multiple text blocks. A fixed-size segmentation strategy can be used, allowing partial overlap between blocks to ensure information integrity and coverage. Each text block is encoded into a vector representation using an embedding model (e.g., text2vec). These vectors are then stored in a vector database (e.g., Faiss), which supports various index types (e.g., IndexFlatL2, IndexFlatIP, IndexIVFFlat) to provide efficient vector similarity queries.

[0035] In one example, alignment optimization is performed based on initial context data, text data, and target data to obtain target context data, including: extracting keywords from the text data to obtain keyword data; truncating the target data into segments based on a preset window length and a preset sliding window length to obtain multiple segments; and evaluating the multiple segments based on the initial context data and keyword data to obtain target context data. Specifically, the preset window length can include multiple window lengths. For example, the preset sliding window length is 1, which means that each time the target data is truncated according to the preset window length, multiple data segments are obtained. The multiple data segments are evaluated based on the initial context data and keyword data to obtain the target context data. The evaluation includes a first evaluation, a second evaluation, and a keyword evaluation.

[0036] For example, when extracting keywords from user questions, the user questions (text data) are segmented by spaces, stop words are removed, and the first 8 words are selected as keyword data.

[0037] In one example, the preset window length includes multiple window lengths; the target data is segmented based on the preset window length and the preset sliding window length to obtain multiple segment data, including: segmenting the target data based on the multiple window lengths to obtain multiple initial segment data; and segmenting the target data according to the preset sliding window length, starting from each initial segment data, to obtain multiple segment data.

[0038] Specifically, the preset window lengths, such as min_len to max_len, increase sequentially (i.e., the window lengths are min_len, min_len+1, ..., max_len, respectively), where min_len represents the minimum window length and max_len represents the maximum window length. The preset sliding window length is, for example, 1. Based on the preset window lengths, the target data is segmented to obtain initial segment data of different window lengths. For each initial segment data, segmentation is performed on the target data according to the preset sliding window length (e.g., sliding one data position on the target data each time), thereby obtaining multiple segment data.

[0039] For example, when using the sliding window technique in top-k similar text blocks, the window length L increases sequentially from min_len to max_len. For each window length L, text segments (target data) are extracted in sequence, specifically from position 0 to L, position 1 to L+1, and position 2 to L+2, ensuring that the data length of the segment obtained each time is the window length L.

[0040] In one example, multiple fragment data are evaluated based on initial context data and keyword data to obtain target context data, including: performing a first similarity evaluation and a second similarity evaluation on multiple fragment data based on the initial context data to obtain a first evaluation result and a second evaluation result; obtaining initial score data based on the first evaluation result and the second evaluation result; performing keyword evaluation on multiple fragment data based on keyword data to obtain a third evaluation result; obtaining target score data based on the initial score data and the third evaluation result; and obtaining target context data based on the target score data and a preset threshold.

[0041] Specifically, the first similarity assessment includes token-level F1 scores, and the second similarity assessment includes Jaccard similarity. The first and second assessment results are obtained by calculating F1 and Jaccard scores on multiple fragment data based on the initial context data. The first and second assessment results are then weighted to obtain the initial score data. The third assessment result is obtained by performing keyword assessment on multiple fragment data based on keyword data. The keyword assessment includes the number of keyword hits. The target score data is obtained based on the initial score data and the third assessment result. The target context data is then selected based on the target score data.

[0042] The Token F1 score ensures that the fragment data is the exact source of the prediction (initial context data) (high precision) and that the fragment does not omit key information (high recall). Jaccard similarity ensures that the core vocabulary of the fragment data and the prediction highly overlaps.

[0043] For example, a user question (text data): "What is the context length of ChatGPT-4?"; the model's initial prediction (initial context data) (P): "The context length of ChatGPT-4 is 128K tokens."; a candidate segment extracted from a text in the retrieved top-k blocks using a sliding window (segment data) (C): "As of June 2024, the context window of the ChatGPT-4 model has expanded to 128,000 tokens." Tokenize multiple data fragments and initial context data (splitting by character / word). The initial context data's token(P) is: {ChatGPT, -, 4, of, context, length, is, 128, K, tokens, 。}. The fragment data's token(C) is: {as of, 2024, year, 6, month, ,, ChatGPT, -, 4, model, of, context, window, has, expanded, to, 128, ,, 000, tokens, 。}.

[0044] Calculate the Token F1 score (the first evaluation result): Intersection = {ChatGPT, -, 4, context, 128, tokens, 。} (7 tokens), Precision (P) = Intersection size / |P| = 7 / 11 ≈ 0.636. The words "的", "length", "是", "K" in the prediction are not found in the snippet. Recall (R) = Intersection size / |C| = 7 / 21 ≈ 0.333. A lot of information such as "截至", "2024", "年" in the snippet is not predicted to be included. F1 score = 2 * (0.636 * 0.333) / (0.636 + 0.333) ≈ 0.436. The F1 score is relatively low (0.436), indicating that the candidate snippet (snippet data) is not the exact original source of the prediction. "长度是K" in the prediction is "窗口已扩展至,000" in the snippet, with a large difference.

[0045] Calculate the Jaccard score (the second evaluation result): Union = all non-repeated Tokens, about 25 in total. Jaccard score = Intersection size / Union size = 7 / 25 = 0.28.

[0046] In one example, based on the first evaluation result and the second evaluation result, initial score data is obtained, including: performing a multiplication operation on the first evaluation result and the weight coefficient to obtain the first operation result; performing a subtraction operation on the first preset value and the weight coefficient to obtain the difference result, and then performing a multiplication operation on the difference result and the second evaluation result to obtain the second operation result; performing an addition operation on the first operation result and the second operation result to obtain the initial score data.

[0047] Specifically, each intercepted snippet (snippet data) is compared with the predicted snippet (initial context data), and scored based on the token-level F1 score (the first evaluation result) and the Jaccard score (the second evaluation result). The scoring formula is as shown in formula (1): Initial score data = alpha * F1 score + (1.0 - alpha) * Jaccard score (1) Where alpha represents the weight coefficient, and the first preset value is 1.0.

[0048] In one example, based on the keyword data, keyword evaluation is performed on multiple snippet data to obtain the third evaluation result, including: extracting the keywords of each snippet data; comparing the keywords with the keyword data to obtain the keyword hit data of each snippet data; based on the keyword hit data, obtaining the third evaluation result.

[0049] Specifically, by evaluating multiple data segments based on keyword data, a third evaluation result can be obtained. Keyword evaluation includes the number of keyword hits. Keywords are extracted from each data segment, and the extracted keywords are compared with the keyword data. The number of keyword data contained in each data segment (keyword hit data) is counted as the third evaluation result.

[0050] In one example, the target score data is obtained based on the initial score data and the third evaluation result, including: performing a product operation based on the third evaluation result and the second preset value to obtain the third operation result; and obtaining the target score data based on the initial score data and the third operation result.

[0051] Specifically, by multiplying the keyword hit data with a second preset value, a third calculation result can be obtained. The second preset value is, for example, 0.05. For instance, the number of times the user's keyword is hit in each data segment is recorded, with each hit increasing the score by 0.05 points. Based on the initial score data and the third calculation result, the target score data can be obtained by summing the results. As shown in formula (2): Target score data = Initial score data + 0.05 × Keyword hit data (2) In one example, target context data is obtained based on target score data and a preset threshold, including: determining target segment data from multiple segment data based on target score data; when the target score data corresponding to the target segment data is greater than or equal to the preset threshold, the target segment data is used as target context data, wherein when the target score data corresponding to the target segment data is less than the preset threshold, the initial context data is used as target context data.

[0052] Specifically, among all segments (multiple segment data) captured by sliding windows of all sizes, the segment with the highest final score (target score data) is selected. If the final score of the segment is greater than or equal to the threshold, the segment data is used as the target segment data, and the original prediction result (initial context data) is replaced by the segment data, with the target segment data serving as the target context data; otherwise, the original prediction result (initial context data) remains unchanged, and the initial context data is used as the target context data.

[0053] Table 1 shows the pseudocode for sliding window segment extraction and evaluation: Table 1

[0054] In the above embodiments, the segment extraction and scoring rules are based on the optimization goal of "corpus alignment". By combining the scoring mechanism of keyword hit reward and target score, it is ensured that the selected segments are highly consistent with the prediction results and can be accurately aligned to the original corpus, thereby significantly improving the accuracy and relevance of the answer.

[0055] Figure 2 Another flowchart of a context generation method provided for an embodiment of this application, such as Figure 2 As shown, the context generation methods include S201-S209.

[0056] S201, Obtain external knowledge base.

[0057] For example, external knowledge bases can include structured documents, such as text files: TXT, Markdown, Word, PDF, etc.; web content: crawled or scraped web pages, blogs, news articles; code repositories: GitHub repositories, API documentation, technical manuals. They can also include semi-structured data, such as tabular data: Excel, CSV files; JSON / XML: configuration files or data exchange formats containing structured information. Furthermore, they can include multimedia content, such as images: image features extracted through multimodal models or text recognized by OCR; and audio / video: converted to text through Automatic Speech Recognition (ASR) before being stored in the database.

[0058] S202, the external knowledge base is segmented to obtain text blocks.

[0059] For example, external knowledge documents in different formats (such as PDF, Word, Excel, web pages, and database records) are loaded and uniformly converted into plain text format. The extracted text is cleaned and standardized to improve the quality of subsequent retrieval. Long documents are divided into smaller chunks suitable for model processing, resulting in multiple chunks.

[0060] S203, Obtaining User Information.

[0061] For example, the user enters the relevant question (text data) that they want to retrieve.

[0062] S204, input the user question and text block into the embedding model.

[0063] For example, embedding models such as OpenAI's text-embedding-ada-002 are widely used general-purpose embedding models in the industry. They convert text into a 1536-dimensional vector, exhibiting balanced performance across various tasks and demonstrating good semantic understanding capabilities. Models in the Sentence Transformers library (such as all-mpnet-base-v2 or all-MiniLM-L6-v2) are a very popular open-source solution. The all-MiniLM-L6-v2 model generates 384-dimensional vectors while maintaining high performance, resulting in higher computational and storage efficiency, making it ideal for resource-constrained or locally deployed scenarios requiring rapid response. Domain-specific / language models, such as the M3E (Moka Massive Mixed Embedding) series, are open-source text embedding models optimized for Chinese scenarios, typically performing better than general-purpose models on Chinese semantic matching tasks. Cohere Embed models provide powerful multilingual embedding capabilities, suitable for handling cross-language semantic retrieval tasks.

[0064] S205, encodes the text block.

[0065] For example, after the segmented text blocks are processed by the embedding model, text block encoding vectors (knowledge vector data) can be obtained. The text block encoding vectors are stored in a vector database (such as Faiss, etc.), which supports multiple index types (such as IndexFlatL2, IndexFlatIP, IndexIVFFlat, etc.) to provide efficient vector similarity queries.

[0066] S206, coded to process user questions.

[0067] For example, the user's input text data (such as a question or a description) is first fed into an embedding model. This model acts like a "semantic understander," compressing the semantic and syntactic information in the text into a fixed-length, high-dimensional numerical vector (e.g., a 768-dimensional or 1536-dimensional array). This vector can be viewed as a coordinate point of the text in a "semantic space." Texts with similar semantics have vectors that are also close to each other in space. The user's question, after encoding, can be obtained as a user question vector encoding (text vector).

[0068] S207, stored in the vector database.

[0069] For example, knowledge vector data is stored in a vector database (such as Faiss) that supports multiple index types (such as IndexFlatL2, IndexFlatIP, IndexIVFFlat, etc.) to provide efficient vector similarity queries.

[0070] S208, retrieves the top-k similar text blocks.

[0071] For example, the text vector representing the user's question is quickly compared for similarity with tens of thousands or even millions of knowledge vectors in a vector database. A common comparison method is to calculate cosine similarity, which focuses on measuring the alignment of two vectors in direction; the closer the value is to 1, the more semantically similar they are. To achieve millisecond-level response times in massive datasets, the vector database does not perform a simple brute-force traversal. Instead, it utilizes efficient indexing algorithms such as HNSW (Hierarchical Navigable Small World) to quickly locate regions adjacent to the query point in the vast vector space. After the comparison is complete, the system sorts all candidate knowledge vectors in descending order based on their similarity scores and strictly follows a preset top-k parameter (e.g., k=3 or 5) to select the top k vectors with the highest similarity. Finally, the system extracts the complete original text blocks corresponding to these winning vectors. These text blocks are knowledge fragments that, after being "understood" and "judged" by the machine, are considered to be semantically most relevant to the user's current question and most likely to contain the answer. The text block is fed to a large language model to generate the final answer, thus completing the intelligent retrieval loop from "user question" to "relevant knowledge".

[0072] S209, input to LLM (Large Language Model).

[0073] For example, the text data corresponding to the text vector and the top-k similar text blocks are fed into the large language model. The large language model generates an initial prediction result (initial context data) based on the prompt words and uses the initial prediction result directly as the target context data.

[0074] The context generation method in the above embodiments may result in a low degree of matching between the answer and the original corpus if the initial prediction result generated by the model is directly used as the target context data. To improve this problem, the context generation method proposed in this application starts a post-processing alignment process after generating the prediction. By retrieving the top-k similar text blocks, the system uses a sliding window technique to extract segments and scores them based on token-level F1 scores, Jaccard similarity, and the hit rate of question keywords. When the score of the best segment exceeds a preset threshold, the system will replace the initial prediction result (initial context data) with this segment (segment data) and use the segment data as the target context data; otherwise, the initial prediction result will be retained and the initial context data will be used as the target context data.

[0075] The context generation method proposed in this application significantly improves the answer quality of the RAG system by introducing a post-processing alignment process and utilizing sliding window technology and a comprehensive scoring mechanism. Specifically, by extracting keywords from user questions and performing fragment truncation and precise scoring on top-k similar text blocks, the system can optimize matching accuracy and relevance. During the scoring process, token-level F1 scores, Jaccard similarity, and keyword hit rewards are combined to ensure that the final selected fragment is highly aligned with the original corpus, and the predicted results are replaced or retained based on the score. This method enhances the semantic relevance and accuracy of the answers and optimizes the overall performance of the system.

[0076] Figure 3 Block diagram of a context generation apparatus provided for another embodiment of this application.

[0077] This specification provides a context generation device 300. Please refer to [link / reference]. Figure 3 The context generation device 300 includes: a processing module 310, a calculation module 320, an acquisition module 330, and an optimization module 340.

[0078] The processing module 310 is used to acquire the text data input by the user and encode it to obtain a text vector.

[0079] The calculation module 320 is used to calculate the vector similarity between the text vector and the knowledge vector data to obtain the target data.

[0080] Module 330 is used to obtain initial context data based on target data and text data.

[0081] The optimization module 340 is used to perform alignment optimization processing based on the initial context data, text data and target data to obtain the target context data.

[0082] For example, the optimization module 340 is also used to perform keyword extraction processing on the text data to obtain keyword data; to perform segmentation processing on the target data based on the preset window length and the preset sliding window length to obtain multiple segment data; and to evaluate the multiple segment data based on the initial context data and the keyword data to obtain target context data. For example, the preset window length includes multiple window lengths; the optimization module 340 is also used to perform segmentation processing on the target data based on the multiple window lengths to obtain multiple initial segment data; taking each initial segment data as the starting point, the target data is segmented according to the preset sliding window length to obtain multiple segment data.

[0083] For example, the optimization module 340 is further configured to perform a first similarity assessment and a second similarity assessment on multiple fragment data based on initial context data to obtain a first assessment result and a second assessment result; obtain initial score data based on the first assessment result and the second assessment result; perform keyword assessment on multiple fragment data based on keyword data to obtain a third assessment result; obtain target score data based on the initial score data and the third assessment result; and obtain target context data based on the target score data and a preset threshold. For example, the optimization module 340 is further configured to perform a product operation based on the first evaluation result and the weight coefficient to obtain a first operation result; perform a difference operation on the first preset value and the weight coefficient to obtain a difference result, and perform a product operation based on the difference result and the second evaluation result to obtain a second operation result; and perform a summation operation on the first operation result and the second operation result to obtain initial score data.

[0084] For example, the optimization module 340 is also used to perform a product operation based on the third evaluation result and the second preset value to obtain a third operation result; and to obtain target score data based on the initial score data and the third operation result. For example, the optimization module 340 is also used to extract keywords for each segment of data; compare the keywords with the keyword data to obtain keyword hit data for each segment of data; and obtain a third evaluation result based on the keyword hit data.

[0085] For example, the obtaining module 330 is further configured to obtain prompt word data based on target data and text data; obtain initial context data based on prompt word data; and / or obtain knowledge vector data by: acquiring knowledge data; performing block processing on the knowledge data to obtain multiple text blocks; and performing encoding processing on the multiple text blocks to obtain knowledge vector data.

[0086] For example, the optimization module 340 is further configured to determine target segment data from multiple segment data based on target score data; when the target score data corresponding to the target segment data is greater than or equal to a preset threshold, the target segment data is used as target context data, wherein when the target score data corresponding to the target segment data is less than the preset threshold, the initial context data is used as target context data.

[0087] Figure 4 A block diagram of an electronic device provided for another embodiment of this application.

[0088] Another embodiment of this application provides an electronic device having a computer program stored thereon, which, when executed by a processor, implements the steps of the method of any of the above embodiments.

[0089] like Figure 4 As shown, for ease of understanding, embodiments of this application illustrate a specific electronic device 400.

[0090] Electronic device 400 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic device 400 may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present disclosure described and / or claimed herein.

[0091] like Figure 4 As shown, the electronic device 400 includes a computing unit 401, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 402 or a computer program loaded from a storage unit 408 into a random access memory (RAM) 403. The RAM 403 may also store various programs and data required for the operation of the electronic device 400. The computing unit 401, ROM 402, and RAM 403 are interconnected via a bus 404. An input / output (I / O) interface 405 is also connected to the bus 404.

[0092] Multiple components in electronic device 400 are connected to input / output (I / O) interface 405. These components include: input unit 406, such as a keyboard or mouse; output unit 407, such as various types of displays or speakers; storage unit 408, such as a hard disk or optical disk; and communication unit 409, such as a network interface card (NIC), modem, or wireless transceiver. Communication unit 409 allows electronic device 400 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0093] The computing unit 401 can be a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 401 performs the various methods described above. For example, in some embodiments, any one or more of the various methods described above can be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 408. In some embodiments, part or all of the computer program can be loaded and / or installed on the electronic device 400 via ROM 402 and / or communication unit 409. When the computer program is loaded into RAM 403 and executed by the computing unit 401, one or more steps of any one or more of the various methods described above can be performed. Alternatively, in other embodiments, the computing unit 401 can be configured to perform any one or more of the various methods described above by any other suitable means (e.g., by means of firmware).

[0094] This application provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the method in any of the above embodiments.

[0095] It should be noted that the logic and / or steps represented in the flowchart or otherwise described herein, for example, can be considered as a sequenced list of executable instructions for implementing logical functions, and can be specifically implemented in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a processor-included system, or other system that can fetch and execute instructions from, an instruction execution system, apparatus, or device). For the purposes of this application, "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: electrical connections (electronic devices) having one or more wires, portable computer disk drives (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable and editable read-only memory (EPROM or flash memory), fiber optic devices, and portable optical disc read-only memory (CDROM). Furthermore, computer-readable media can even be paper or other suitable media on which programs can be printed, because programs can be obtained electronically, for example, by optically scanning the paper or other media, followed by editing, interpreting, or otherwise processing as necessary, and then stored in computer memory.

[0096] It should be understood that various parts of this application can be implemented using hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented using software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.

[0097] In the description of this application, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. In this application, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0098] In the description of this application, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", "circumferential", etc., indicating the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, are only for the convenience of describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of this application.

[0099] Furthermore, the terms "first," "second," etc., used in the embodiments of this application are for descriptive purposes only and should not be construed as indicating or implying relative importance, or implicitly specifying the number of technical features indicated in this embodiment. Therefore, features defined with terms such as "first" and "second" in the embodiments of this application can explicitly or implicitly indicate that the embodiment includes at least one of those features. In the description of this application, the word "multiple" means at least two or more, such as two, three, four, etc., unless otherwise explicitly and specifically defined in the embodiments.

[0100] In this application, unless otherwise explicitly specified or limited in the embodiments, the terms "installation," "connection," "joining," and "fixing" appearing in the embodiments should be interpreted broadly. For example, a connection can be a fixed connection, a detachable connection, or an integral part; it can also be a mechanical connection, an electrical connection, etc. Of course, it can also be a direct connection, or an indirect connection through an intermediate medium, or it can be the internal communication between two components, or the interaction between two components. Those skilled in the art can understand the specific meaning of the above terms in this application based on the specific implementation.

[0101] In this application, unless otherwise expressly specified and limited, "above" or "below" the second feature can mean that the first feature is in direct contact with the second feature, or that the first feature is in indirect contact with the second feature through an intermediate medium. Furthermore, "above," "on top of," and "over" the second feature can mean that the first feature is directly above or diagonally above the second feature, or simply that the first feature is at a higher horizontal level than the second feature. "Below," "below," and "under" the second feature can mean that the first feature is directly below or diagonally below the second feature, or simply that the first feature is at a lower horizontal level than the second feature.

[0102] Although embodiments of this application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting this application. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of this application.

Claims

1. A context generation method, characterized in that, The method includes: The text data input by the user is obtained and encoded to obtain a text vector; The target data is obtained by calculating the vector similarity between the text vector and the knowledge vector data. Based on the target data and the text data, initial context data is obtained; Alignment optimization processing is performed based on the initial context data, the text data, and the target data to obtain the target context data.

2. The method according to claim 1, characterized in that, The alignment optimization process based on the initial context data, the text data, and the target data to obtain the target context data includes: The text data is processed by keyword extraction to obtain keyword data; The target data is segmented based on a preset window length and a preset sliding window length to obtain multiple data segments. The target context data is obtained by evaluating multiple fragment data based on the initial context data and the keyword data.

3. The method according to claim 2, characterized in that, The preset window length includes multiple window lengths; the segmentation process of the target data based on the preset window length and the preset sliding window length yields multiple data segments, including: Based on the multiple window lengths, the target data is segmented to obtain multiple initial segment data; Starting with each initial data segment, the target data is segmented according to a preset sliding window length to obtain multiple data segments.

4. The method according to claim 2, characterized in that, The process of evaluating multiple fragment data based on the initial context data and the keyword data to obtain the target context data includes: Based on the initial context data, a first similarity assessment and a second similarity assessment are performed on multiple fragment data to obtain a first assessment result and a second assessment result; Based on the first evaluation result and the second evaluation result, the initial score data is obtained; Based on the keyword data, keyword evaluation is performed on multiple fragment data to obtain a third evaluation result; Based on the initial score data and the third evaluation result, the target score data is obtained; The target context data is obtained based on the target score data and the preset threshold.

5. The method according to claim 4, characterized in that, The initial score data obtained based on the first evaluation result and the second evaluation result includes: The first calculation result is obtained by multiplying the first evaluation result and the weight coefficient. A difference operation is performed on the first preset value and the weight coefficient to obtain a difference result, and a product operation is performed on the difference result and the second evaluation result to obtain a second calculation result; The initial score data is obtained by summing the first calculation result and the second calculation result.

6. The method according to claim 4, characterized in that, The process of obtaining target score data based on the initial score data and the third evaluation result includes: The third calculation result is obtained by multiplying the third evaluation result and the second preset value. Based on the initial score data and the third calculation result, the target score data is obtained.

7. The method according to claim 4, characterized in that, The third evaluation result obtained by evaluating multiple data segments based on the keyword data includes: Extract keywords from each of the data segments; The keywords are compared with the keyword data to obtain the keyword hit data for each segment of data; Based on the keyword hit data, the third evaluation result is obtained.

8. The method according to claim 1, characterized in that, The step of obtaining initial context data based on the target data and the text data includes: obtaining prompt word data based on the target data and the text data; and obtaining the initial context data based on the prompt word data. and / or The knowledge vector data is obtained in the following way: acquiring knowledge data; dividing the knowledge data into blocks to obtain multiple text blocks; and encoding the multiple text blocks to obtain the knowledge vector data.

9. The method according to claim 4, characterized in that, The process of obtaining the target context data based on the target score data and a preset threshold includes: Target segment data is determined from multiple segment data based on the target score data; When the target score data corresponding to the target segment data is greater than or equal to the preset threshold, the target segment data is used as the target context data. Wherein, when the target score data corresponding to the target segment data is less than the preset threshold, the initial context data is used as the target context data.

10. A context generation apparatus, characterized in that, The device includes: The processing module is used to acquire text data input by the user and encode it to obtain a text vector; The calculation module is used to calculate the vector similarity between the text vector and the knowledge vector data to obtain the target data; The acquisition module is used to obtain initial context data based on the target data and the text data; The optimization module is used to perform alignment optimization processing based on the initial context data, the text data, and the target data to obtain the target context data.