A large model-based data processing system

By acquiring and processing data, and combining large language models with other models for information fusion, the problem of inaccurate dialogue content generated by large language models has been solved, achieving higher accuracy and comprehension capabilities.

CN122309640A · Pending · Publication Date: 2026-06-30 · SHENZHEN TCL HIGH TECH DEVELOPMENT CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHENZHEN TCL HIGH TECH DEVELOPMENT CO LTD
Filing Date
2024-12-30
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

The problem is that the dialogue content generated by existing large language models is not accurate enough.

Method used

By acquiring the data to be processed and the first data, candidate data is determined based on this information, and the first target data is further determined. Information fusion and optimization are performed using a large language model and other processing models to improve the accuracy of the generated target text information.

Benefits of technology

It improves the accuracy of the generated target data information, enhances the ability of large models to understand long content, and ensures that the generated dialogue content is more accurate.

Abstract

This application discloses a data processing system based on a large model, which determines candidate data information based on the data information to be processed and the first data information; and determines the first target data information based on the candidate data information.

Description

Technical Field

[0001] This application relates to the field of computer technology, and more specifically to a data processing system.

Background Technology

[0002] With the rapid adoption of generative artificial intelligence (AI) technology, more and more companies are integrating large language models and diffusion models into their businesses. Among these applications, using large language models in conjunction with intelligent speech recognition to enable dialogue between computer devices and users is one of the most important. However, the dialogue content generated by existing large language models is not accurate enough.

Summary of the Invention

[0003] This application provides a data processing system.

[0004] In a first aspect, this application provides a method comprising:

[0005] Obtain the data to be processed and the first data information;

[0006] Based on the data to be processed and the first data, candidate data information is determined;

[0007] Based on the candidate data information, the first target data information is determined.

[0008] In a second aspect, this application provides a system comprising:

[0009] The information acquisition module is used to acquire the data to be processed and the first data information;

[0010] The first determining module is used to determine candidate data information based on the data to be processed and the first data information;

[0011] The second determining module is used to determine the first target data information based on the candidate data information.

[0012] In a third aspect, this application also provides a computer device, which includes:

[0013] One or more processors;

[0014] Memory; and

[0015] One or more applications, wherein the applications are stored in memory and configured to be executed by a processor to implement the methods of any one of the first aspects.

[0016] In a fourth aspect, embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, the computer program being loaded by a processor to perform the steps of the method in any of the first aspects.

Brief Description of the Drawings

[0017] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is a schematic diagram of a data processing system provided in an embodiment of the present invention;

[0019] Figure 2 is a flowchart of one embodiment of the data processing method provided by the present invention;

[0020] Figure 3 is a flowchart of a specific embodiment of determining candidate data information provided by the present invention;

[0021] Figure 4 is a flowchart of a specific embodiment of determining first target data information provided by the present invention;

[0022] Figure 5 is a schematic block diagram of the data processing system provided in the embodiments of the present invention;

[0023] Figure 6 is a schematic diagram of an embodiment of the computer device provided by the present invention.

Detailed Description

[0024] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0025] In the description of this application, it should be understood that the terms "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, are only for the convenience of describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation on this application. Furthermore, the terms "first," "second," "third," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined with "first," "second," "third," etc., may explicitly or implicitly include one or more features.

[0026] In this application, the term "exemplary" is used to mean "used as an example, illustration, or description." Any embodiment described as "exemplary" in this application is not necessarily to be construed as being more preferred or advantageous than other embodiments. The following description is provided to enable any person skilled in the art to make and use this application. Details are set forth in the following description for purposes of explanation. It should be understood that those skilled in the art will recognize that this application can be made without using these specific details. In other instances, well-known structures and processes are not described in detail to avoid obscuring the description of this application with unnecessary detail. Therefore, this application is not intended to be limited to the embodiments shown, but is consistent with the broadest scope of the principles and features disclosed in this application.

[0027] It should be noted that since the method in this application embodiment is executed in a computer device, the processing objects of each computer device exist in the form of data or information, such as time, which is essentially time information. It is understood that if size, quantity, position, etc. are mentioned in subsequent embodiments, they are all corresponding data that exist so that the computer device can process them. Specific details will not be elaborated here.

[0028] This application provides a data processing method and system. Furthermore, the above-mentioned data processing method and system can be a data processing method and system based on a large model, which will be described in detail below.

[0029] Referring to Figure 1, which is a schematic diagram of a data processing system provided in an embodiment of this application, the data processing system may include a computer device 100 in which the data processing system is integrated, such as the computer device shown in Figure 1.

[0030] In this embodiment, the computer device 100 is mainly used to acquire data information to be processed and first data information; determine candidate data information based on the data information to be processed and the first data information; and determine first target data information based on the candidate data information, which can improve the accuracy of the generated target text information.

[0031] In this embodiment, the computer device 100 can be a standalone server, a server network, or a server cluster. For example, the computer device 100 described in this embodiment includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud server composed of multiple servers. The cloud server is composed of a large number of computers or network servers based on cloud computing.

[0032] It is understood that the computer device 100 used in the embodiments of this application can be a device that includes both receiving and transmitting hardware, that is, a device capable of bidirectional communication over a bidirectional communication link. Such a device may include a cellular or other communication device with a single-line display or a multi-line display, or a cellular or other communication device without a multi-line display. Specifically, the computer device 100 may be a desktop terminal or a mobile terminal, such as a mobile phone, tablet computer, or laptop computer.

[0033] Those skilled in the art will understand that the application environment shown in Figure 1 is merely one application scenario of the solution in this application and does not constitute a limitation on the application scenarios of the solution. Other application environments may include more or fewer computer devices than shown in Figure 1; for example, only one computer device is shown in Figure 1. It is understood that the data processing system may also include one or more other services, which are not limited here.

[0034] In addition, as shown in Figure 1, the data processing system may also include a memory 200 for storing data, such as data information (for example, the data information to be processed and the first target data information) and image information (for example, the first image data information and the second image data information).

[0035] It should be noted that the schematic diagram of the data processing system shown in Figure 1 is merely an example. The data processing system and scenario described in the embodiments of this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided in the embodiments of this application. As those skilled in the art will know, with the evolution of data processing systems and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.

[0036] Figure 2 is a flowchart of an embodiment of the data processing method in this application. As shown in Figure 2, the data processing method may include the following steps S201 to S203, as detailed below:

[0037] S201. Obtain the data information to be processed and the first data information.

[0038] In this embodiment, the data to be processed is dialogue information input by the user and acquired by the computer device. The data to be processed can be voice information or text information; this embodiment does not impose any limitations. For example, when the data to be processed is voice information, the computer device can acquire the information through its own microphone; when the data to be processed is text information, the computer device can receive the data to be processed input by the user through input devices such as a keyboard, mouse, or touchscreen; additionally, the computer device can also acquire the data to be processed from other devices via a network, Bluetooth, etc., and this embodiment does not impose any limitations.

[0039] Furthermore, the data information to be processed is dialogue information associated with the third image data information, and the first data information is descriptive text information corresponding to the third image data information. Image data refers to the set of gray values of each pixel, represented numerically; it can be static image data or dynamic video data. For example, the third image data information is video data that includes a child and an adult, and the data information to be processed is "what is the relationship between the child and the adult".

[0040] Optionally, the computer device can obtain the first data information from other devices via a network, Bluetooth, etc., or the computer device can determine the first data information based on the third image data information; this embodiment is not limited to this. In a specific embodiment, the first data information is determined by the computer device based on the third image data information. The steps for obtaining the first data information specifically include: filtering the third image data information to obtain fourth image data information; inputting the fourth image data information into a first processing model; and outputting the first data information through the first processing model. In this embodiment, the fourth image data information is obtained by filtering the third image data information and then output through the first processing model. This eliminates the need to perform information generation processing on all image information in the third image data information, thereby improving the efficiency of data processing.

[0041] In one specific embodiment, when filtering the third image data information, multiple image information can be uniformly sampled from the third image data information to obtain the fourth image data information, or multiple image information can be randomly sampled from the third image data information to obtain the fourth image data information. This embodiment does not limit the scope.
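
The two filtering strategies named above, uniform sampling and random sampling, can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function names `sample_uniform` and `sample_random`, the bucket-centre rule for uniform picks, and the `seed` parameter are all assumptions introduced here.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_uniform(frames: Sequence[T], k: int) -> List[T]:
    """Pick k frames at evenly spaced positions (hypothetical helper)."""
    n = len(frames)
    if k <= 0 or n == 0:
        return []
    if k >= n:
        return list(frames)
    # Split the sequence into k equal-width buckets and take the frame
    # at the centre of each bucket (an assumed spacing rule).
    return [frames[int((i + 0.5) * n / k)] for i in range(k)]


def sample_random(frames: Sequence[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Pick k distinct frames at random, kept in temporal order (hypothetical helper)."""
    n = len(frames)
    if k <= 0 or n == 0:
        return []
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(n), min(k, n)))
    return [frames[i] for i in indices]
```

Either function would produce the "fourth image data information" from the "third image data information"; only the selected frames are then passed to the first processing model.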

[0042] Furthermore, the first processing model can be built based on the Cap4Video model, or it can be built based on the Contrastive Language–Image Pre-training (CLIP) model, or it can be built based on the Transformer model. This embodiment does not limit the specific model.

[0043] S202. Based on the data to be processed and the first data information, determine the candidate data information.

[0044] In this embodiment, the candidate data information is the answer information corresponding to the data to be processed generated based on the data to be processed and the first data information. In this embodiment, the first data information is obtained, and the candidate data information is determined by combining the first data information and the data to be processed. Data processing can be performed in conjunction with the context of the third image data information to improve the accuracy of the determined first target data information.

[0045] In a specific implementation, as shown in Figure 3, the process of determining candidate data information based on the data to be processed and the first data information in step S202 can include steps S301 to S302, as follows:

[0046] S301. Based on the first data information, determine the second data information.

[0047] In this embodiment, the second data information is fused data information obtained by fusing additional data information retrieved based on the first data information with the first data information. This embodiment locates new data information based on the first data information, and then determines the first target data information by combining the first data information and the new data information. This allows for the gradual location and refinement of information associated with the data to be processed, improving the large model's ability to understand long content, and thus improving the accuracy of the determined first target data information. In a specific embodiment, the step of determining the second data information based on the first data information specifically includes: analyzing and processing the first data information to obtain analysis result information; and performing information retrieval processing based on the analysis result information to obtain the second data information.

[0048] In this embodiment, candidate data information is obtained by processing the data to be processed and the first data information through a second processing model. The second processing model can call various tools to obtain the second data information. The analysis result information can represent the tools that the second processing model needs to call. For example, the second processing model can call tools such as visual question answering, fragment localization, text retrieval, and memory query to obtain the second data information. If the analysis result information is the first result information, it means that the second processing model needs to call the text retrieval tool to obtain the second data information. If the analysis result information is the second result information, it means that the second processing model needs to call the fragment localization tool to obtain the second data information. If the analysis result information is the third result information, it means that the second processing model needs to call the memory query tool to obtain the second data information. If the analysis result information is the fourth result information, it means that the second processing model needs to call the visual question answering tool to obtain the second data information.

[0049] In one specific embodiment, the analysis result information includes first result information, and/or second result information, and/or third result information, and/or fourth result information. The step of performing information retrieval processing based on the analysis result information to obtain second data information specifically includes: if the analysis result information is first result information, performing information retrieval processing on the second target data information based on time information to obtain second data information; and/or, if the analysis result information is second result information, performing information retrieval processing on the second target data information based on text query information to obtain second data information; and/or, if the analysis result information is third result information, performing information retrieval processing on the third target data information based on text query information to obtain second data information; and/or, if the analysis result information is fourth result information, obtaining second data information based on first image data information.
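
The four-way branching on the analysis result information can be read as a dispatch table. The sketch below is one possible arrangement under assumptions: the enum `AnalysisResult`, the `Handler` type, and the function `retrieve_second_data` are hypothetical names, and each handler is assumed to take the first data information as text and return a piece of second data information as text.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, List


class AnalysisResult(Enum):
    FIRST = 1   # retrieve from the second target data by time information
    SECOND = 2  # retrieve from the second target data by text query
    THIRD = 3   # retrieve from the third target data by text query
    FOURTH = 4  # derive from the first image data information


# Assumed handler shape: first data information in, retrieved text out.
Handler = Callable[[str], str]


def retrieve_second_data(
    results: Iterable[AnalysisResult],
    handlers: Dict[AnalysisResult, Handler],
    first_data: str,
) -> List[str]:
    """Run the handler for every result kind present ("and/or" semantics)."""
    pieces: List[str] = []
    for result in results:
        if result not in handlers:
            raise KeyError(f"no handler registered for {result.name}")
        pieces.append(handlers[result](first_data))
    return pieces
```

Because the analysis result may contain several result kinds at once, the function iterates over all of them instead of choosing a single branch.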

[0050] In this embodiment, the time information is the start time step and end time step determined based on the first data information. The second target data information includes descriptive text information corresponding to multiple image segments in the third image data information. When performing information retrieval processing on the second target data information based on the time information, the descriptive text information of the image segments corresponding to the start time step to the end time step can be directly obtained from the second target data information based on the start time step and end time step, and the descriptive text information is determined as the second data information.
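
As a rough illustration of the time-based lookup, the following sketch returns the descriptive text of every image segment that overlaps a requested range of time steps. The `Segment` record, its field names, and the inclusive-range convention are assumptions made for this example; the application does not specify how segments are stored.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    start_step: int  # first time step covered by the segment (assumed inclusive)
    end_step: int    # last time step covered by the segment (assumed inclusive)
    caption: str     # descriptive text information for the segment


def captions_in_range(segments: Sequence[Segment], start_step: int, end_step: int) -> List[str]:
    """Return captions of all segments overlapping [start_step, end_step]."""
    return [
        s.caption
        for s in segments
        if s.end_step >= start_step and s.start_step <= end_step
    ]
```

No similarity computation is needed on this path, which is why the text describes the descriptive text as being obtained directly.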

[0051] Furthermore, the text query information is the query text determined based on the first data information, and the second target data information also includes the image feature information corresponding to each image segment. When performing information retrieval processing on the second target data information based on the text query information, the text feature information corresponding to the text query information can be compared with the image feature information. The descriptive text information of the image segment corresponding to the image feature information most similar to the text feature information can be determined from the image feature information, and the descriptive text information can be determined as the second data information.
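
The comparison between text features and image features can be sketched as a nearest-neighbour search. The application does not name the similarity measure, so cosine similarity is assumed here; the function names and the pairing of each feature vector with its caption are likewise hypothetical.

```python
import math
from typing import Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_caption(
    query_vec: Sequence[float],
    segment_feats: Sequence[Tuple[Sequence[float], str]],
) -> str:
    """Return the caption whose image feature is closest to the query feature."""
    if not segment_feats:
        raise ValueError("no segments to search")
    best = max(segment_feats, key=lambda item: cosine(query_vec, item[0]))
    return best[1]
```

The same search would apply to the object feature information of the third target data information, with object features in place of segment features.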

[0052] Optionally, the third target data information includes object feature information. When performing information retrieval processing on the third target data information based on the text query information, the text feature information corresponding to the text query information can be compared with the object feature information. The descriptive text information of the image segment corresponding to the object feature information most similar to the text feature information can be determined from the object feature information, and the descriptive text information can be determined as the second data information.

[0053] Furthermore, the first image data information is the image information at a certain time point in the third image data information determined based on the first data information. When obtaining the second data information based on the first image data information, a visual question-and-answer tool can be invoked to perform information generation processing on the first image data information to obtain the second data information. Through the visual question-and-answer tool, other data information besides the second target data information and the third target data information can be obtained, further improving the accuracy of the determined first target data information.

[0054] In one specific embodiment, the aforementioned second target data information is obtained based on the following method: inputting the second image data information into a first processing model, and outputting third data information through the first processing model; encoding the second image data information to obtain first feature information; extracting features from the third data information to obtain second feature information; and fusing the third data information, the first feature information, and the second feature information to obtain the second target data information.

[0055] In this embodiment, the first processing model may be built based on the Cap4Video model, the first processing model may be built based on the Contrastive Language–Image Pre-training (CLIP) model, or the first processing model may be built based on the Transformer model. This embodiment does not limit the specific model.

[0056] Furthermore, the first encoder can be used to extract features from the third data information. The first encoder can be built based on the Cap4Video encoder or the DINOv2 encoder; this embodiment is not limited to either. A fourth processing model can also be used to extract features from the third data information; for example, the bge-m3 model can be used.

[0057] In one specific embodiment, the second image data information is an image segment segmented from the third image data information. The second image data information is obtained based on the following method: performing similarity calculation on the third image data information to obtain first similarity information; performing calculation processing on the first similarity information to obtain third feature information; performing feature analysis processing on the third feature information to obtain candidate position information; performing filtering processing on the candidate position information to obtain target position information; and performing segmentation processing on the third image data information based on the target position information to obtain the second image data information.

[0058] In one specific embodiment, in order to reduce the computational load when segmenting the third image data information, the third image data information can be divided into multiple fifth image data information by using a sliding window. Then, the similarity between adjacent image information in each fifth image data information is calculated to obtain the first similarity information of each fifth image data information. Then, the target location information corresponding to each fifth image data information is determined based on the first similarity information of each fifth image data information. The target location information corresponding to multiple fifth image data information is all the segmentation points of the third image data information.
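
A minimal sketch of the windowing step, assuming a fixed window size and stride (neither is given in the text) and a caller-supplied similarity function. The names `sliding_windows` and `adjacent_similarities` are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(frames: Sequence[T], size: int, stride: int) -> List[List[T]]:
    """Split frames into windows of at most `size` items, stepping by `stride`."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows: List[List[T]] = []
    for start in range(0, len(frames), stride):
        windows.append(list(frames[start:start + size]))
        if start + size >= len(frames):
            break  # the last window already reaches the end of the sequence
    return windows


def adjacent_similarities(window: Sequence[T], sim: Callable[[T, T], float]) -> List[float]:
    """Similarity between each pair of neighbouring frames in one window."""
    return [sim(window[i], window[i + 1]) for i in range(len(window) - 1)]
```

Working on one window at a time keeps each similarity computation small, which matches the stated goal of reducing the computational load.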

[0059] In one specific embodiment, the similarity between adjacent image information is defined in terms of the following quantities: $w_{i,j}$ represents the similarity between the i-th image information and the j-th image information; $F_i$ represents the color layout descriptor of the i-th image information; $F_j$ represents the color layout descriptor of the j-th image information; $d(F_i, F_j)$ represents the Euclidean distance between $F_i$ and $F_j$; and $\sigma$ is the normalization coefficient, which controls the range of similarity values. $\sigma$ can be set according to actual needs. Optionally, $\sigma$ can be set to the difference between the norms (modulus lengths) of $F_i$ and $F_j$.

[0060] In one specific embodiment, the step of calculating and processing the first similarity information to obtain the third feature information specifically includes: constructing a feature equation (eigenvalue equation) based on the first similarity information; and solving the feature equation to obtain the third feature information. The feature equation can be expressed as $(D - W)y = \lambda D y$, where $y$ represents the eigenvector, $\lambda$ represents the eigenvalue, $D$ represents a diagonal matrix, and $W$ represents a symmetric matrix with elements $W_{ij} = w_{i,j}$. The diagonal elements of $D$ are determined by the entries of $W$, and $n$ represents the number of similarities in the first similarity information.

[0061] In one specific embodiment, the third feature information includes a first feature vector, a second feature vector, and a third feature vector. The first, second, and third feature vectors are, respectively, the eigenvectors corresponding to the second smallest, third smallest, and fourth smallest eigenvalues obtained by solving the feature equation. By analyzing the sign changes of each feature vector in the third feature information, candidate position information can be determined. By filtering the candidate position information using an adaptive threshold method, the target position information can be obtained.
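
The sign-change analysis can be illustrated as below. This sketch covers only the candidate-detection step and makes two assumptions that the text does not confirm: a candidate position is recorded wherever two neighbouring entries of a vector have strictly opposite signs, and the candidates from the three vectors are combined by union. The adaptive threshold filter is not specified in the text and is therefore not shown.

```python
from typing import Iterable, List, Sequence


def sign_change_positions(vec: Sequence[float]) -> List[int]:
    """Indices i where vec[i - 1] and vec[i] have strictly opposite signs.

    Entries that are exactly zero are not counted as a sign change.
    """
    return [i for i in range(1, len(vec)) if vec[i - 1] * vec[i] < 0]


def candidate_positions(vectors: Iterable[Sequence[float]]) -> List[int]:
    """Union of sign-change positions over all vectors (assumed combination rule)."""
    seen = set()
    for vec in vectors:
        seen.update(sign_change_positions(vec))
    return sorted(seen)
```

Each returned index marks a possible boundary between two image segments, to be kept or discarded by the subsequent filtering step.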

[0062] In one specific embodiment, the aforementioned third target data information is obtained through the following steps: object recognition is performed on the third image data information to obtain target object recognition information; the target object recognition information includes at least one target object information; the fourth feature information corresponding to each target object information is fused to obtain the fifth feature information corresponding to each target object information; the target object recognition information and the fifth feature information are fused to obtain the third target data information.

[0063] In this embodiment, the fourth feature information is the feature information of the object corresponding to each target object information in the third image data information. Fusion processing refers to the process of integrating and optimizing information from multiple data sources to obtain more consistent, accurate and useful information. The object corresponding to each target object information may appear in multiple image information. When performing fusion processing on the fourth feature information corresponding to each target object information, the fourth feature information corresponding to each target object information can be averaged or spliced. This embodiment does not limit this.
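
The two fusion options mentioned above, averaging and splicing (concatenation), can be sketched for plain lists of floats. The function names are hypothetical, and the feature vectors are assumed to have equal length when averaged.

```python
from typing import List, Sequence


def fuse_mean(features: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise average of equally sized feature vectors."""
    if not features:
        raise ValueError("at least one feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def fuse_concat(features: Sequence[Sequence[float]]) -> List[float]:
    """Splice the feature vectors end to end, in the order given."""
    return [x for f in features for x in f]
```

Averaging keeps the fused feature at a fixed dimension regardless of how many frames an object appears in, while concatenation preserves every observation at the cost of a variable length.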

[0064] In one specific embodiment, the step of performing object recognition on the third image data information to obtain target object recognition information specifically includes: acquiring at least one object information to be detected corresponding to the third image data information; determining, for any one of the at least one object information to be detected, whether there is first object information in the first object recognition information that matches the object information to be detected; if there is first object information in the first object recognition information, performing a fusion process on the object information to be detected and the first object information to obtain updated first object recognition information; and / or, if there is no first object information in the first object recognition information, determining second object information based on the object information to be detected, and performing a fusion process on the second object information and the first object recognition information to obtain updated first object recognition information; and determining the updated first object recognition information as target object recognition information.

[0065] In this embodiment, the object information to be detected is the object identifier information corresponding to all objects appearing in the third image data information. The objects can be people, animals, props, etc. in the third image data information. For example, the object information to be detected can be represented as TRACK = {track_1, track_2, ..., track_n}. Considering that the same object in the third image data information may appear multiple times in different image information, resulting in multiple object information for the same object, this embodiment, after obtaining at least one object information to be detected corresponding to the third image data information, further determines whether there is first object information in the first object identification information that matches the object information to be detected, so as to group the multiple object information of the same object into the same group.

[0066] In this embodiment of the application, if the first object identification information contains first object information, it indicates that the first object information and the object information to be detected are object information corresponding to the same object, and the object information to be detected and the first object information are fused together; conversely, if the first object identification information does not contain first object information, it indicates that the first object identification information does not contain object information belonging to the same object as the object information to be detected, and the second object information is determined based on the object information to be detected, and the second object information and the first object identification information are fused together.
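
The merge-or-create logic can be sketched as follows. Tracks are represented as plain strings and groups as lists of strings purely for illustration; the matching predicate `same_object` is a hypothetical stand-in for the index and similarity checks that the application describes separately.

```python
from typing import Callable, List


def assign_track(
    groups: List[List[str]],
    track: str,
    same_object: Callable[[List[str], str], bool],
) -> List[List[str]]:
    """Add `track` to the first matching group, or start a new group for it."""
    for group in groups:
        if same_object(group, track):
            group.append(track)  # fuse with the existing object information
            return groups
    groups.append([track])       # no match: this is a newly seen object
    return groups
```

Calling the function once per track in TRACK yields the updated first object identification information, which is then taken as the target object recognition information.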

[0067] In one specific embodiment, the first object identification information includes at least one third object information, and each third object information includes at least one first target information. The step of determining whether there is first object information in the first object identification information that matches the object information to be detected specifically includes: for any third object information in the first object identification information, obtaining first index information corresponding to the third object information; if the first index information does not match the second index information corresponding to the object information to be detected, determining second similarity information between the object information to be detected and the first target information in the third object information; if the second similarity information satisfies a first condition, determining the third object information as the first object information.

[0068] In this embodiment of the application, the first index information represents the position of the object corresponding to the third object information in the third image data information, and the second index information represents the position of the object corresponding to the object to be detected in the third image data information. If the first index information does not match the second index information corresponding to the object to be detected, it indicates that the third object information and the object to be detected do not share image frames. Then, it is further determined whether the second similarity information satisfies the first condition. If the second similarity information satisfies the first condition, the third object information is determined to be the first object information.

[0069] In one specific embodiment, the second similarity information satisfying the first condition can mean that at least a first number of the second similarity information corresponding to the first target information are greater than the first similarity threshold, and / or that at least a second number of the second similarity information corresponding to the first target information are greater than the second similarity threshold. The first similarity threshold, the second similarity threshold, the first number, and the second number can be set according to actual needs, with the first similarity threshold < the second similarity threshold. For example, the first similarity threshold can be set to 0.5, the second similarity threshold can be set to 0.6, the first number can be set to the total number of first target information in the third object information, and the second number can be set to 1.
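The first condition described above can be sketched as a small check over the similarity values of one third object information. This is a minimal illustration only: the function name, the parameter names, and the `require_both` switch (which selects between the "and" and the "or" reading of "and / or") are assumptions, and the defaults follow the example values given in this paragraph.

```python
from typing import Optional, Sequence


def satisfies_first_condition(
    similarities: Sequence[float],
    first_threshold: float = 0.5,
    second_threshold: float = 0.6,
    first_number: Optional[int] = None,
    second_number: int = 1,
    require_both: bool = True,
) -> bool:
    """Check the first condition for one third object information.

    `similarities` holds one second-similarity value per first target
    information contained in that third object information.
    """
    if not similarities:
        return False
    # Example setting from the text: the first number is the total count.
    if first_number is None:
        first_number = len(similarities)
    above_first = sum(s > first_threshold for s in similarities)
    above_second = sum(s > second_threshold for s in similarities)
    cond_a = above_first >= first_number
    cond_b = above_second >= second_number
    # The text says "and / or"; require_both selects the conjunctive reading.
    return (cond_a and cond_b) if require_both else (cond_a or cond_b)
```

With the example values, a third object information matches when every similarity exceeds 0.5 and at least one exceeds 0.6.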

[0070] In one specific embodiment, the step of determining the second similarity information between the object information to be detected and the first target information in the third object information specifically includes: calculating the similarity between the sixth feature information corresponding to the object information to be detected and the seventh feature information corresponding to the first target information in the third object information to obtain the third similarity information; calculating the similarity between the eighth feature information corresponding to the object information to be detected and the ninth feature information corresponding to the first target information in the third object information to obtain the fourth similarity information; and performing a weighted summation of the third similarity information and the fourth similarity information to obtain the second similarity information.

[0071] Optionally, the process of determining the second similarity information can be expressed as: sim(i,j) = τ5*Cap(i,j) + τ6*DINO(i,j), where sim(i,j) represents the second similarity information between the i-th object information to be detected and the j-th first target information, Cap(i,j) represents the third similarity information between them, calculated from the sixth feature information and the seventh feature information, and DINO(i,j) represents the fourth similarity information between them, calculated from the eighth feature information and the ninth feature information. τ1, τ2, τ3, τ4, τ5, and τ6 are hyperparameters that can be set according to actual needs. In a specific embodiment, τ1>0, τ2∈(0,1), τ3>0, τ4∈(0,1), τ5∈(0,1), and τ6∈(0,1).

[0072] Furthermore, the sixth and seventh feature information are feature information obtained using the second encoder, and the eighth and ninth feature information are feature information obtained using the third encoder. The second encoder and the third encoder are different encoders. For example, the second encoder is a Cap4Video encoder, and the third encoder is a DINOv2 encoder.
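The weighted summation sim(i,j) = τ5*Cap(i,j) + τ6*DINO(i,j) can be illustrated as follows. This is a sketch under stated assumptions: the text here does not specify how Cap(i,j) and DINO(i,j) are computed from the encoder features, so cosine similarity is used purely as a stand-in, the feature vectors are plain lists of floats rather than real Cap4Video or DINOv2 outputs, and the function names and default weights are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; an assumed stand-in for the per-encoder similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def second_similarity(
    sixth_feature: Sequence[float],    # second-encoder feature of the detected object
    seventh_feature: Sequence[float],  # second-encoder feature of the first target information
    eighth_feature: Sequence[float],   # third-encoder feature of the detected object
    ninth_feature: Sequence[float],    # third-encoder feature of the first target information
    tau5: float = 0.5,                 # assumed value in (0, 1)
    tau6: float = 0.5,                 # assumed value in (0, 1)
) -> float:
    """sim = tau5 * Cap + tau6 * DINO, as in the weighted summation above."""
    cap = cosine_similarity(sixth_feature, seventh_feature)   # third similarity
    dino = cosine_similarity(eighth_feature, ninth_feature)   # fourth similarity
    return tau5 * cap + tau6 * dino
```

Using two different encoders means a match has to be supported by both feature spaces to reach a high combined score.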

[0073] S302. Based on the data to be processed and the second data information, determine the candidate data information.

[0074] In some embodiments, the step of determining candidate data information based on the data to be processed and the second data information specifically includes: performing a fusion process on the data to be processed and the second data information to obtain first fused information; inputting the first fused information into a second processing model; and outputting candidate data information through the second processing model.

[0075] In this embodiment of the application, fusion processing refers to the process of integrating and optimizing information from multiple data sources to obtain more consistent, accurate and useful information. By fusion processing the data information to be processed and the second data information, candidate data information can be determined by combining the data information to be processed and the second data information, thereby improving the accuracy of the determined candidate data information.

[0076] Furthermore, the first fusion information serves as prompt data to guide the second processing model in outputting candidate data information. For example, the first fusion information can be represented as: "Given a video that has {N} frames, the frames are decoded at 1fps. Given the following descriptions of the sampled frames in the video:\n{caption}\n#C to denote the sentence is an action done by the camerawearer (the person who recorded the video while wearing a camera on their head).\n#O to denote that the sentence is an action done by someone other than the camerawearer.\n Please answer the following question:\n```\n{question}\n```\n Please think step-by-step and write the best answer index in JSON format {answer_format}. Note that only one answer is returned for the question.", where {caption} represents the second data information, and {question} represents the data information to be processed.
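The fusion step can be pictured as filling the quoted prompt template with the question and the retrieved captions. The sketch below assumes Python's built-in `str.format`; the function name, the `num_frames` parameter, and the default answer format are hypothetical, since the text does not state the contents of {answer_format}.

```python
FENCE = "`" * 3  # the quoted prompt wraps the question in a code fence

PROMPT_TEMPLATE = (
    "Given a video that has {N} frames, the frames are decoded at 1fps. "
    "Given the following descriptions of the sampled frames in the video:\n"
    "{caption}\n"
    "#C to denote the sentence is an action done by the camerawearer "
    "(the person who recorded the video while wearing a camera on their head).\n"
    "#O to denote that the sentence is an action done by someone other than "
    "the camerawearer.\n"
    " Please answer the following question:\n"
    + FENCE + "\n{question}\n" + FENCE + "\n"
    " Please think step-by-step and write the best answer index in JSON format "
    "{answer_format}. Note that only one answer is returned for the question."
)

# Hypothetical placeholder; the real answer format is not given in the text.
DEFAULT_ANSWER_FORMAT = '{"final_answer": "xxx"}'


def build_first_fused_information(question, captions, num_frames,
                                  answer_format=DEFAULT_ANSWER_FORMAT):
    """Fuse the data to be processed (question) with the second data (captions)."""
    return PROMPT_TEMPLATE.format(
        N=num_frames,
        caption="\n".join(captions),
        question=question,
        answer_format=answer_format,
    )
```

The resulting string is what would be passed to the second processing model as prompt data.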

[0077] In some embodiments, the second processing model can be built upon a Large Language Model (LLM), which refers to an artificial neural network model with a very large number of parameters. In the field of artificial intelligence, a large model typically refers to a model with hundreds of millions to trillions of parameters. These models usually need to be trained on large-scale datasets and require significant computational resources for optimization and tuning. Large models are commonly used to solve complex tasks such as natural language processing, computer vision, and speech recognition.

[0078] In this embodiment of the application, the large model can be a large language model such as ChatGPT, BERT, XLNet, the Zhipu model, Claude, the Moonshot AI model, the ChatGLM model, the Qianyitongwen model, the MiniMax model, the Xinghuo model, the Llama model, the 360GPT model, the Qwen model, the Baichuan model, the Yunque model, the vivoLM model, or Wenxin Yiyan; this embodiment of the application does not limit it.

[0079] S203. Based on the candidate data information, determine the first target data information.

[0080] In this embodiment, the first target data information is the final answer information corresponding to the data information to be processed determined based on the candidate data information. For example, the data information to be processed is "What is the relationship between the child and the adult?", and the first target data information is "The adult is the child's parent." This embodiment determines candidate data information based on the data information to be processed and the first data information, and then determines the first target data information based on the candidate data information, which can improve the accuracy of the determined first target data information.

[0081] In some embodiments, as shown in Figure 4, step S203 above, which determines the first target data information based on the candidate data information, may include steps S401 to S402, as follows:

[0082] S401. Based on the candidate data information, determine the fourth data information.

[0083] In this embodiment, the fourth data information is used to characterize the confidence level of the candidate data information. In a specific embodiment, the step of determining the fourth data information based on the candidate data information specifically includes: fusing the candidate data information and the second data information to obtain second fused information; inputting the second fused information into a third processing model; and outputting the fourth data information through the third processing model. This embodiment combines the candidate data information and the second data information to determine the fourth data information, which can improve the accuracy of the obtained fourth data information. Using the third processing model to output the fourth data information can improve the efficiency of determining the fourth data information.

[0084] In this embodiment, the second fused information is prompt data used to guide the third processing model to output the fourth data information. For example, the second fused information can be represented as: "Please assess the confidence level in the decision-making process. The provided information is as follows, {previous_prompt} The decision-making process is as follows, {answer} Criteria for Evaluation: Insufficient Information (Confidence Level: 1): If information is too lacking for a reasonable conclusion. Partial Information (Confidence Level: 2): If information partially supports an informed guess. Sufficient Information (Confidence Level: 3): If information fully supports a well-informed decision. Assessment Focus: Evaluate based on the relevance, completeness, and clarity of the provided information in relation to the decision-making context. Please generate the confidence with..." The JSON format is given by {confidence_format}, where {previous_prompt} represents the second data information, {answer} represents the candidate data information, and {confidence_format} is the format of the fourth data information, for example: {"confidence":"xxx"}.
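Because the third processing model is asked to reply in a JSON format such as {"confidence":"xxx"}, the fourth data information can be recovered by parsing the model output. The following is a minimal sketch, assuming the reply contains one flat JSON object and that the confidence levels are the integers 1 to 3 named in the prompt; the function name `parse_confidence` is hypothetical.

```python
import json
import re
from typing import Optional


def parse_confidence(model_output: str) -> Optional[int]:
    """Extract a confidence level (1, 2 or 3) from the model's reply.

    Returns None when no usable JSON object is found.
    """
    # Take the first {...} span; the reply may contain reasoning text around it.
    match = re.search(r"\{.*?\}", model_output, flags=re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    try:
        level = int(payload.get("confidence"))
    except (TypeError, ValueError):
        return None
    return level if level in (1, 2, 3) else None
```

Returning None for malformed replies lets the caller treat an unparseable answer the same way as an unmet condition.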

[0085] Furthermore, the third processing model can be built based on a large language model (LLM). The third processing model and the second processing model can be built using the same large language model, or they can be built using different large language models. This embodiment does not impose any limitations.

[0086] S402. Based on the fourth data information, determine the first target data information.

[0087] In this embodiment of the application, after obtaining candidate data information, fourth data information is determined based on the candidate data information, and first target data information is determined based on the fourth data information. When the candidate data information does not meet the requirements, the candidate data information can be continuously optimized, thereby improving the accuracy of the determined first target data information.

[0088] In one specific embodiment, the step of determining the first target data information based on the fourth data information specifically includes: if the fourth data information satisfies the second condition, determining the candidate data information as the first target data information; and / or, if the fourth data information does not satisfy the second condition, determining the first target data information based on the second data information.

[0089] In this embodiment, the fourth data information satisfying the second condition can mean that the fourth data information matches the reference information, or that the evaluation value corresponding to the fourth data information is greater than or equal to the evaluation threshold. This embodiment does not limit this. For example, the fourth data information may be 1, 2, or 3. If the fourth data information is 3, it is determined that the fourth data information satisfies the second condition. As another example, if the evaluation value corresponding to the fourth data information is 80% and the evaluation threshold is 80%, it is determined that the fourth data information satisfies the second condition.

[0090] In some embodiments, the step of determining the first target data information based on the second data information specifically includes: determining the second data information as the new first data information, and continuing to perform the step of determining candidate data information based on the data information to be processed and the first data information until the fourth data information meets the second condition or the number of updates of the first data information reaches the number threshold; and determining the candidate data information as the first target data information.
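The iterative procedure described above can be sketched as a loop that stops either when the confidence satisfies the second condition or when the number of updates reaches the threshold. This is an illustration under assumptions: `refine`, `answer`, and `assess` are hypothetical callables standing in for the retrieval step and the second and third processing models, and the second condition is modeled as "confidence is at least `required_confidence`".

```python
from typing import Callable, Tuple


def answer_with_refinement(
    question: str,
    first_data: str,
    refine: Callable[[str], str],        # first data information -> second data information
    answer: Callable[[str, str], str],   # (question, second data) -> candidate data information
    assess: Callable[[str, str], int],   # (candidate, second data) -> fourth data information
    required_confidence: int = 3,        # assumed form of the second condition
    max_updates: int = 3,                # assumed number threshold
) -> Tuple[str, int]:
    """Return (first target data information, last confidence level)."""
    updates = 0
    while True:
        second_data = refine(first_data)
        candidate = answer(question, second_data)
        confidence = assess(candidate, second_data)
        # Stop when the second condition holds or the update budget is spent.
        if confidence >= required_confidence or updates >= max_updates:
            return candidate, confidence
        # Otherwise the second data information becomes the new first data information.
        first_data = second_data
        updates += 1
```

The update cap guarantees termination even if the confidence never reaches the required level, in which case the latest candidate is returned.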

[0091] In summary, the data processing method provided in this implementation scheme obtains the data to be processed and the first data information, determines candidate data information based on the data to be processed and the first data information, and determines the first target data information based on the candidate data information. This scheme improves the accuracy of the obtained first target data information by determining candidate data information based on the data to be processed and the first data information, and then determining the first target data information based on the candidate data information. Furthermore, determining fourth data information based on the candidate data information, and then determining the first target data information based on the fourth data information, allows for continuous optimization of the candidate data information when it does not meet the requirements, thereby improving the accuracy of the determined first target data information. Even further, determining second data information based on the first data information, and determining candidate data information based on the data to be processed and the second data information, allows for the gradual location and refinement of information related to the data to be processed, improving the large model's ability to understand long content, and thus improving the accuracy of the determined first target data information.

[0092] To better implement the data processing method in the embodiments of this application, a data processing system is also provided in the embodiments of this application. As shown in Figure 5, the data processing system 600 includes:

[0093] Information acquisition module 610 is used to acquire data information to be processed and first data information;

[0094] The first determining module 620 is used to determine candidate data information based on the data information to be processed and the first data information;

[0095] The second determining module 630 is used to determine the first target data information based on the candidate data information.

[0096] In this embodiment of the application, by determining candidate data information based on the data to be processed and the first data information, and then determining the first target data information based on the candidate data information, the accuracy of the obtained first target data information can be improved.

[0097] In some embodiments of this application, the first determining module 620 determines candidate data information based on the data information to be processed and the first data information, including:

[0098] Based on the first data information, determine the second data information;

[0099] Based on the data to be processed and the second data information, candidate data information is determined.

[0100] In some embodiments of this application, the first determining module 620 determines second data information based on the first data information, including:

[0101] The first data information is analyzed and processed to obtain the analysis results.

[0102] Information retrieval and processing are performed based on the analysis results to obtain the second data information.

[0103] In some embodiments of this application, the analysis result information includes first result information, and / or, second result information, and / or, third result information, and / or, fourth result information. The first determining module 620 performs information retrieval processing based on the analysis result information to obtain second data information, including:

[0104] If the analysis result is the first result information, then information retrieval processing is performed on the second target data information based on the time information to obtain the second data information; and / or,

[0105] If the analysis result is the second result information, information retrieval processing is performed on the second target data information based on the text query information to obtain the second data information; and / or,

[0106] If the analysis result is the third result information, information retrieval processing is performed on the third target data information based on the text query information to obtain the second data information; and / or,

[0107] If the analysis result is the fourth result information, the second data information is obtained based on the first image data information.
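The four branches above amount to a dispatch on the analysis result. The sketch below is illustrative only: the `AnalysisResult` enum, the `Entry` record, and the helper names are hypothetical, overlap of time ranges and keyword overlap are simple stand-ins for whatever retrieval the system actually performs, and the image branch is represented by a caller-supplied function.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Sequence, Tuple


class AnalysisResult(Enum):
    FIRST = 1   # retrieve from the second target data by time information
    SECOND = 2  # retrieve from the second target data by text query
    THIRD = 3   # retrieve from the third target data by text query
    FOURTH = 4  # derive from the first image data information


@dataclass
class Entry:
    start: float  # seconds
    end: float    # seconds
    text: str


def by_time(entries: Sequence[Entry], time_range: Tuple[float, float]) -> List[str]:
    lo, hi = time_range
    return [e.text for e in entries if e.start <= hi and e.end >= lo]


def by_text(entries: Sequence[Entry], query: str) -> List[str]:
    # Keyword overlap is an assumed stand-in for the real retrieval method.
    words = set(query.lower().split())
    return [e.text for e in entries if words & set(e.text.lower().split())]


def retrieve_second_data(
    result: AnalysisResult,
    second_target: Sequence[Entry],
    third_target: Sequence[Entry],
    time_range: Optional[Tuple[float, float]] = None,
    query: str = "",
    describe_images: Optional[Callable[[], List[str]]] = None,
) -> List[str]:
    if result is AnalysisResult.FIRST:
        if time_range is None:
            raise ValueError("time information is required for the first result")
        return by_time(second_target, time_range)
    if result is AnalysisResult.SECOND:
        return by_text(second_target, query)
    if result is AnalysisResult.THIRD:
        return by_text(third_target, query)
    if describe_images is None:
        raise ValueError("an image-description function is required for the fourth result")
    return describe_images()
```

Keeping the branches behind one function means the caller only needs the analysis result and the matching argument for that branch.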

[0108] In some embodiments of this application, the second target data information is obtained by the first determining module 620 in the following manner:

[0109] The second image data information is input into the first processing model, and the third data information is output through the first processing model.

[0110] The second image data information is encoded to obtain the first feature information;

[0111] Feature extraction is performed on the third data information to obtain the second feature information;

[0112] The third data information, the first feature information, and the second feature information are fused to obtain the second target data information.

[0113] In some embodiments of this application, the second image data information is obtained by the first determining module 620 in the following manner:

[0114] The similarity of the third image data is calculated to obtain the first similarity information;

[0115] The first similarity information is processed to obtain the third feature information;

[0116] The third feature information is processed by feature analysis to obtain candidate position information;

[0117] The candidate location information is filtered to obtain the target location information;

[0118] The third image data is segmented based on the target location information to obtain the second image data.

[0119] In some embodiments of this application, the third target data information is obtained by the first determining module 620 in the following manner:

[0120] Object recognition is performed on the third image data to obtain target object recognition information; the target object recognition information includes at least one target object information.

[0121] The fourth feature information corresponding to each target object information is fused to obtain the fifth feature information corresponding to each target object information; the fourth feature information is the feature information of the object corresponding to each target object information in the third image data information.

[0122] The target object identification information and the fifth feature information are fused to obtain the third target data information.

[0123] In some embodiments of this application, the first determining module 620 performs object recognition on the third image data information to obtain target object recognition information, including:

[0124] Obtain at least one object information to be detected corresponding to the third image data information;

[0125] For any one of the at least one object information to be detected, determine whether there is a first object information in the first object identification information that matches the object information to be detected;

[0126] If the first object identification information contains first object information, the object information to be detected and the first object information are fused to obtain updated first object identification information; and / or,

[0127] If the first object information does not exist in the first object identification information, the second object information is determined based on the object information to be detected, and the second object information and the first object identification information are fused to obtain the updated first object identification information;

[0128] The updated first object identification information is determined as the target object identification information.
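The matching-and-fusion steps above can be sketched as a grouping loop over the detected object information. This is a simplified illustration: `Track`, `ObjectGroup`, and `group_tracks` are hypothetical names, the index information is modeled as a set of frame numbers, a group is skipped when it shares frames with the detected object (one reading of the index check), and the similarity test is passed in as a function rather than implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Set


@dataclass
class Track:
    name: str
    frames: Set[int]  # index information: frames in which the object appears
    label: str        # toy stand-in for appearance features


@dataclass
class ObjectGroup:
    frames: Set[int] = field(default_factory=set)
    members: List[Track] = field(default_factory=list)


def group_tracks(
    tracks: Sequence[Track],
    is_same_object: Callable[[Track, ObjectGroup], bool],
) -> List[ObjectGroup]:
    """Group object information that belongs to the same object."""
    groups: List[ObjectGroup] = []
    for track in tracks:
        target = None
        for group in groups:
            # Objects that share a frame are treated as different objects.
            if group.frames & track.frames:
                continue
            if is_same_object(track, group):
                target = group
                break
        if target is None:
            # No match: start a new entry in the identification information.
            target = ObjectGroup()
            groups.append(target)
        # Fuse the detected object information into the chosen group.
        target.members.append(track)
        target.frames |= track.frames
    return groups
```

The list returned after all detected object information has been processed plays the role of the target object identification information.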

[0129] In some embodiments of this application, the first object identification information includes at least one third object information, each third object information including at least one first target information, and the first determining module 620 determines whether there is first object information in the first object identification information that matches the object information to be detected, including:

[0130] For any third object information in the first object identification information, obtain the first index information corresponding to the third object information; the first index information represents the position of the object corresponding to the third object information in the third image data information.

[0131] If the first index information does not match the second index information corresponding to the object to be detected, the second similarity information between the object to be detected and the first target information in the third object information is determined; the second index information represents the position of the object corresponding to the object to be detected in the third image data information.

[0132] If the second similarity information satisfies the first condition, the third object information is determined to be the first object information.

[0133] In some embodiments of this application, the first determining module 620 determines candidate data information based on the data to be processed and the second data information, including:

[0134] The data to be processed and the second data are fused together to obtain the first fused information;

[0135] The first fused information is input into the second processing model, and the candidate data information is output through the second processing model.

[0136] In some embodiments of this application, the second determining module 630 determines the first target data information based on the candidate data information, including:

[0137] Based on the candidate data information, the fourth data information is determined; the fourth data information is used to characterize the confidence level of the candidate data information.

[0138] Based on the fourth data information, the first target data information is determined.

[0139] In some embodiments of this application, candidate data information is determined based on second data information, which is data information determined based on first data information. The second determining module 630 determines the first target data information based on fourth data information, including:

[0140] If the fourth data information meets the second condition, the candidate data information is determined as the first target data information; and / or,

[0141] If the fourth data information does not meet the second condition, the first target data information is determined based on the second data information.

[0142] In some embodiments of this application, the second determining module 630 determines the first target data information based on the second data information, including:

[0143] The second data information is determined as the new first data information, and the step of determining candidate data information based on the data information to be processed and the first data information continues to be executed until the fourth data information meets the second condition or the number of updates of the first data information reaches the number threshold.

[0144] The candidate data information is selected as the first target data information.

[0145] In some embodiments of this application, candidate data information is determined based on second data information, which is data information determined based on first data information. The second determining module 630 determines fourth data information based on the candidate data information, including:

[0146] The candidate data information and the second data information are fused to obtain the second fused information;

[0147] The second fused information is input into the third processing model, and the fourth data information is output through the third processing model.

[0148] In some embodiments of this application, the first data information is obtained by the second determining module 630 in the following manner:

[0149] The third image data information is filtered to obtain the fourth image data information;

[0150] The fourth image data information is input into the first processing model, and the first data information is output through the first processing model.

[0151] This application also provides a computer device that integrates any of the data processing systems provided in this application. The computer device includes:

[0152] One or more processors;

[0153] Memory; and

[0154] One or more applications, wherein the applications are stored in the memory and configured to be executed by the processor to perform the steps of the data processing method in any of the embodiments described above.

[0155] This application also provides a computer device that integrates any of the data processing systems provided in this application. Figure 6 shows a structural schematic diagram of the computer device involved in the embodiments of this application. Specifically:

[0156] The computer device may include components such as a processor 801 with one or more processing cores, a memory 802 with one or more computer-readable storage media, a power supply 803, and an input unit 804. Those skilled in the art will understand that the computer device structure shown in Figure 6 does not constitute a limitation on the computer device, which may include more or fewer components than shown, or combine certain components, or have different component arrangements. Wherein:

[0157] The processor 801 is the control center of the computer device. It connects various parts of the computer device via various interfaces and lines, and performs various functions and processes data by running or executing software programs and / or modules stored in the memory 802, and by calling data stored in the memory 802, thereby providing overall monitoring of the computer device. Optionally, the processor 801 may include one or more processing cores; optionally, the processor 801 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 801.

[0158] The memory 802 can be used to store software programs and modules. The processor 801 executes various functional applications and data processing by running the software programs and modules stored in the memory 802. The memory 802 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, application programs required for at least one function (such as sound playback function, image playback function, etc.), etc.; the data storage area may store data created according to the use of the computer device, etc. In addition, the memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 802 may also include a memory controller to provide the processor 801 with access to the memory 802.

[0159] The computer device also includes a power supply 803 that supplies power to the various components. Optionally, the power supply 803 can be logically connected to the processor 801 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. The power supply 803 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.

[0160] The computer device may also include an input unit 804, which can be used to receive input digital or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

[0161] Although not shown, the computer device may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor 801 in the computer device loads the executable files corresponding to the processes of one or more application programs into the memory 802 according to the following instructions, and the processor 801 runs the application programs stored in the memory 802 to realize various functions, as follows:

[0162] Obtain the data to be processed and the first data information;

[0163] Based on the data to be processed and the first data information, candidate data information is determined;

[0164] Based on the candidate data information, the first target data information is determined.

[0165] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be performed by instructions, or by instructions controlling related hardware. These instructions can be stored in a computer-readable storage medium and loaded and executed by a processor.

[0166] Therefore, embodiments of this application provide a computer-readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk, etc. A computer program is stored thereon, and the computer program is loaded by a processor to execute the steps in any of the data processing methods provided in embodiments of this application. For example, the computer program loaded by the processor can execute the following steps:

[0167] Obtain the data to be processed and the first data information;

[0168] Based on the data to be processed and the first data information, candidate data information is determined;

[0169] Based on the candidate data information, the first target data information is determined.

[0170] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the detailed descriptions of other embodiments above, which will not be repeated here.

[0171] In practice, each of the above units or structures can be implemented as an independent entity or can be arbitrarily combined to be implemented as the same or several entities. For the specific implementation of each of the above units or structures, please refer to the previous method embodiments, which will not be repeated here.

[0172] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.

[0173] The data processing method and system provided in the embodiments of this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A method, characterized by comprising: obtaining data to be processed and first data information; determining candidate data information based on the data to be processed and the first data information; and determining first target data information based on the candidate data information.

2. The method of claim 1, wherein, The step of determining candidate data information based on the data to be processed and the first data information includes: Based on the first data information, the second data information is determined; Based on the data to be processed and the second data information, candidate data information is determined.

3. The method of claim 2, wherein, The step of determining the second data information based on the first data information includes: The first data information is analyzed and processed to obtain analysis result information; Information retrieval processing is performed based on the analysis result information to obtain the second data information.

4. The method of claim 3, wherein, The analysis result information includes first result information, and / or, second result information, and / or, third result information, and / or, fourth result information; The step of performing information retrieval processing based on the analysis result information to obtain the second data information includes: If the analysis result information is the first result information, information retrieval processing is performed on the second target data information based on the time information to obtain the second data information; and / or, If the analysis result information is the second result information, information retrieval processing is performed on the second target data information based on the text query information to obtain the second data information; and / or, If the analysis result information is the third result information, information retrieval processing is performed on the third target data information based on the text query information to obtain the second data information; and / or, If the analysis result information is the fourth result information, the second data information is obtained based on the first image data information.
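The branching in claim 4 can be read as a dispatch on the type of analysis result. The following is a minimal sketch under stated assumptions: the `AnalysisResult` enum, the `retrieve_second_data` function, the dictionary layout with `time` and `text` keys, and substring matching as the retrieval method are all hypothetical choices for illustration. The claim also allows several result types to apply at once ("and / or"); this sketch handles one at a time.

```python
from enum import Enum, auto
from typing import Any, Dict, List, Tuple

Entry = Dict[str, Any]


class AnalysisResult(Enum):
    FIRST = auto()   # retrieve from second target data by time information
    SECOND = auto()  # retrieve from second target data by text query
    THIRD = auto()   # retrieve from third target data by text query
    FOURTH = auto()  # use the first image data information directly


def retrieve_second_data(
    result: AnalysisResult,
    second_target: List[Entry],
    third_target: List[Entry],
    first_image: Any = None,
    time_range: Tuple[float, float] = (0.0, 0.0),
    text_query: str = "",
) -> List[Any]:
    """Pick a retrieval strategy from the analysis result (assumed layout)."""
    query = text_query.lower()
    if result is AnalysisResult.FIRST:
        start, end = time_range
        return [e for e in second_target if start <= e["time"] <= end]
    if result is AnalysisResult.SECOND:
        return [e for e in second_target if query in e["text"].lower()]
    if result is AnalysisResult.THIRD:
        return [e for e in third_target if query in e["text"].lower()]
    if result is AnalysisResult.FOURTH:
        return [first_image]
    raise ValueError(f"unsupported analysis result: {result!r}")


second_target = [{"time": 5, "text": "A dog runs"}, {"time": 12, "text": "A cat sleeps"}]
third_target = [{"text": "red car parked"}]

by_time = retrieve_second_data(AnalysisResult.FIRST, second_target, third_target, time_range=(0, 10))
by_text = retrieve_second_data(AnalysisResult.SECOND, second_target, third_target, text_query="cat")
by_object = retrieve_second_data(AnalysisResult.THIRD, second_target, third_target, text_query="car")
by_image = retrieve_second_data(AnalysisResult.FOURTH, second_target, third_target, first_image="frame.png")
```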

5. The method of claim 4, wherein, The second target data information is obtained in the following manner: The second image data information is input into the first processing model, and the third data information is output through the first processing model; The second image data information is encoded to obtain the first feature information; Feature extraction is performed on the third data information to obtain the second feature information; The third data information, the first feature information, and the second feature information are fused to obtain the second target data information.

6. The method of claim 5, wherein, The second image data information is obtained in the following manner: Similarity calculation is performed on the third image data information to obtain the first similarity information; The first similarity information is processed to obtain the third feature information; The third feature information is subjected to feature analysis processing to obtain candidate position information; The candidate position information is filtered to obtain target position information; The third image data information is segmented based on the target position information to obtain the second image data information.
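One possible reading of claim 6 is a similarity-based segmentation of a frame sequence. The sketch below is illustrative only and rests on several assumptions that the claim does not state: that the third image data information is a sequence of per-frame feature vectors, that similarity is the cosine similarity of adjacent frames, that the third feature information is a change score (one minus similarity), and that filtering enforces a minimum gap between cut positions. The names `cosine` and `segment_frames` and both thresholds are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_frames(
    frames: Sequence[Sequence[float]],
    change_threshold: float = 0.5,
    min_gap: int = 2,
) -> List[List[Sequence[float]]]:
    # First similarity information: each frame compared with its predecessor.
    sims = [cosine(frames[i - 1], frames[i]) for i in range(1, len(frames))]
    # Third feature information (assumed): a change score per adjacent pair.
    changes = [1.0 - s for s in sims]
    # Candidate positions: frame indices where the change score is high.
    candidates = [i + 1 for i, c in enumerate(changes) if c > change_threshold]
    # Target positions: drop candidates closer than min_gap to the last kept cut.
    targets: List[int] = []
    for pos in candidates:
        if not targets or pos - targets[-1] >= min_gap:
            targets.append(pos)
    # Segmentation: cut the frame sequence at each target position.
    bounds = [0] + targets + [len(frames)]
    return [list(frames[s:e]) for s, e in zip(bounds, bounds[1:])]


frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
segments = segment_frames(frames)
```

With these toy frames the content changes at indices 2 and 4, so the sequence is cut into three segments.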

7. The method of claim 4, wherein, The third target data information is obtained in the following manner: Object recognition is performed on the third image data information to obtain target object recognition information; the target object recognition information includes at least one target object information; The fourth feature information corresponding to each target object information is fused to obtain the fifth feature information corresponding to each target object information; the fourth feature information is the feature information of the object corresponding to each target object information in the third image data information; The target object identification information and the fifth feature information are fused to obtain the third target data information.

8. The method of claim 7, wherein, The process of performing object recognition on the third image data information to obtain target object recognition information includes: Obtain at least one object information to be detected corresponding to the third image data information; For any one of the at least one object information to be detected, determine whether there is first object information in the first object identification information that matches the object information to be detected; If the first object identification information contains the first object information, the object information to be detected and the first object information are fused to obtain updated first object identification information; and / or, If the first object information is not present in the first object identification information, the second object information is determined based on the object information to be detected, and the second object information and the first object identification information are fused to obtain the updated first object identification information. The updated first object identification information is determined as the target object identification information.

9. The method of claim 8, wherein, The first object identification information includes at least one third object information, and each third object information includes at least one first target information; Determining whether there is first object information in the first object identification information that matches the information of the object to be detected includes: For any of the third object information in the first object identification information, obtain the first index information corresponding to the third object information; the first index information represents the position of the object corresponding to the third object information in the third image data information. If the first index information does not match the second index information corresponding to the object to be detected, a second similarity information is determined between the object to be detected and the first target information in the third object information; the second index information represents the position of the object corresponding to the object to be detected in the third image data information. If the second similarity information satisfies the first condition, the third object information is determined to be the first object information.
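The matching and update logic of claims 8 and 9 can be sketched as a small object registry. This is a hedged illustration, not the claimed implementation: the dictionary layout (`index`, `targets`, `feature`), cosine similarity as the second similarity information, a fixed threshold as the first condition, appending as the fusion step, and updating the stored index after a match are all assumptions. The function names are hypothetical.

```python
import math
from typing import Any, Dict, List, Optional, Sequence

TrackedObject = Dict[str, Any]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_match(
    registry: List[TrackedObject], detected: Dict[str, Any], threshold: float = 0.9
) -> Optional[TrackedObject]:
    """Claim 9: compare only against objects recorded at a different position."""
    for obj in registry:
        if obj["index"] == detected["index"]:
            continue  # same position index: treated as a different object
        best = max(cosine(detected["feature"], t) for t in obj["targets"])
        if best >= threshold:  # assumed form of the "first condition"
            return obj
    return None


def update_registry(
    registry: List[TrackedObject], detected: Dict[str, Any], threshold: float = 0.9
) -> List[TrackedObject]:
    """Claim 8: fuse into a matching object, otherwise register a new one."""
    match = find_match(registry, detected, threshold)
    if match is not None:
        match["targets"].append(detected["feature"])
        match["index"] = detected["index"]  # assumption: track latest position
    else:
        registry.append({"index": detected["index"], "targets": [detected["feature"]]})
    return registry


registry: List[TrackedObject] = []
update_registry(registry, {"index": 0, "feature": [1.0, 0.0]})  # new object
update_registry(registry, {"index": 0, "feature": [1.0, 0.0]})  # same index: new object
update_registry(registry, {"index": 1, "feature": [1.0, 0.0]})  # fused into first object
update_registry(registry, {"index": 2, "feature": [0.0, 1.0]})  # dissimilar: new object
```

Skipping objects with the same index reflects the idea that two detections at one position in the image data are distinct objects, so similarity is only computed across positions.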

10. The method of claim 2, wherein, The step of determining candidate data information based on the data to be processed and the second data information includes: The data to be processed and the second data information are fused to obtain the first fused information; The first fused information is input into the second processing model, and the candidate data information is output through the second processing model.

11. The method of claim 1, wherein, The step of determining the first target data information based on the candidate data information includes: Based on the candidate data information, fourth data information is determined; the fourth data information is used to characterize the confidence level of the candidate data information; Based on the fourth data information, the first target data information is determined.

12. The method according to claim 11, characterized in that, The candidate data information is determined based on the second data information, which is the data information determined based on the first data information; The step of determining the first target data information based on the fourth data information includes: If the fourth data information satisfies the second condition, the candidate data information is determined as the first target data information; and / or, If the fourth data information does not satisfy the second condition, the first target data information is determined based on the second data information.

13. The method according to claim 12, characterized in that, The step of determining the first target data information based on the second data information includes: The second data information is determined as the new first data information, and the step of determining candidate data information based on the data information to be processed and the first data information continues to be executed until the fourth data information satisfies the second condition or the number of updates of the first data information reaches the number threshold. The candidate data information is determined as the first target data information.
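The confidence check of claims 11 and 12 and the bounded retry of claim 13 can be sketched as a loop. This is a minimal sketch under stated assumptions: `refine` and its callables `retrieve`, `generate`, and `score` are hypothetical stand-ins for the retrieval step and the second and third processing models, and the second condition is assumed to be a numeric confidence threshold.

```python
from typing import Callable, Tuple


def refine(
    data_to_process: str,
    first_data: str,
    retrieve: Callable[[str], str],
    generate: Callable[[str, str], str],
    score: Callable[[str, str], float],
    min_confidence: float = 0.8,
    max_updates: int = 3,
) -> Tuple[str, int]:
    """Repeat retrieval and generation until confident or the update cap is hit."""
    updates = 0
    while True:
        second_data = retrieve(first_data)                 # second data information
        candidate = generate(data_to_process, second_data)  # candidate data information
        confidence = score(candidate, second_data)          # fourth data information
        # Stop when the assumed second condition holds or the cap is reached.
        if confidence >= min_confidence or updates >= max_updates:
            return candidate, updates
        # Claim 13: the second data information becomes the new first data information.
        first_data = second_data
        updates += 1


# Toy stand-ins (assumptions): context grows by one character per retrieval,
# and confidence grows with context length.
def toy_retrieve(data: str) -> str:
    return data + "+"


def toy_generate(question: str, context: str) -> str:
    return f"{question}|{context}"


def toy_score(candidate: str, context: str) -> float:
    return len(context) / 4


answer, used_updates = refine("q", "a", toy_retrieve, toy_generate, toy_score)
capped = refine("q", "a", toy_retrieve, toy_generate, lambda c, s: 0.0)
```

The cap on updates guarantees termination even when the confidence never satisfies the condition, in which case the last candidate is returned.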

14. The method according to claim 11, characterized in that, The candidate data information is determined based on the second data information, which is the data information determined based on the first data information. The determination of the fourth data information based on the candidate data information includes: The candidate data information and the second data information are fused to obtain the second fused information; The second fused information is input into the third processing model, and the fourth data information is output through the third processing model.

15. The method according to any one of claims 1 to 14, characterized in that, The first data information is obtained in the following manner: The third image data information is filtered to obtain the fourth image data information; The fourth image data information is input into the first processing model, and the first data information is output through the first processing model.

16. A system, characterized by comprising: An information acquisition module, used to acquire the data to be processed and the first data information; A first determining module, used to determine candidate data information based on the data to be processed and the first data information; A second determining module, used to determine the first target data information based on the candidate data information; Further, the first determining module determines candidate data information based on the data to be processed and the first data information, including: Based on the first data information, the second data information is determined; Based on the data to be processed and the second data information, candidate data information is determined; Furthermore, the first determining module determines the second data information based on the first data information, including: The first data information is analyzed and processed to obtain the analysis result information; Based on the analysis result information, information retrieval processing is performed to obtain the second data information; Further, the analysis result information includes first result information, and / or, second result information, and / or, third result information, and / or, fourth result information.
The first determining module performs information retrieval processing based on the analysis result information to obtain second data information, including: If the analysis result information is the first result information, information retrieval processing is performed on the second target data information based on the time information to obtain the second data information; and / or, If the analysis result information is the second result information, information retrieval processing is performed on the second target data information based on the text query information to obtain the second data information; and / or, If the analysis result information is the third result information, information retrieval processing is performed on the third target data information based on the text query information to obtain the second data information; and / or, If the analysis result information is the fourth result information, the second data information is obtained based on the first image data information; Furthermore, the second target data information is obtained by the first determining module in the following manner: The second image data information is input into the first processing model, and the third data information is output through the first processing model. 
The second image data information is encoded to obtain the first feature information; Feature extraction is performed on the third data information to obtain the second feature information; The third data information, the first feature information, and the second feature information are fused together to obtain the second target data information; Furthermore, the second image data information is obtained by the first determining module in the following manner: Similarity calculation is performed on the third image data information to obtain the first similarity information; The first similarity information is processed to obtain the third feature information; The third feature information is subjected to feature analysis processing to obtain candidate position information; The candidate position information is filtered to obtain target position information; Based on the target position information, the third image data information is segmented to obtain the second image data information; Furthermore, the third target data information is obtained by the first determining module in the following manner: Object recognition is performed on the third image data information to obtain target object recognition information; the target object recognition information includes at least one target object information; The fourth feature information corresponding to each target object information is fused to obtain the fifth feature information corresponding to each target object information; the fourth feature information is the feature information of the object corresponding to each target object information in the third image data information.
The target object identification information and the fifth feature information are fused to obtain the third target data information; Furthermore, the first determining module performs object recognition on the third image data information to obtain target object recognition information, including: Obtain at least one object information to be detected corresponding to the third image data information; For any one of the at least one object information to be detected, determine whether there is first object information in the first object identification information that matches the object information to be detected; If the first object identification information contains the first object information, the object information to be detected and the first object information are fused to obtain updated first object identification information; and / or, If the first object information is not present in the first object identification information, the second object information is determined based on the object information to be detected, and the second object information and the first object identification information are fused to obtain the updated first object identification information. 
The updated first object identification information is determined as the target object identification information; Further, the first object identification information includes at least one third object information, each of the third object information including at least one first target information, and the first determining module determines whether there is first object information in the first object identification information that matches the object information to be detected, including: For any of the third object information in the first object identification information, obtain the first index information corresponding to the third object information; the first index information represents the position of the object corresponding to the third object information in the third image data information. If the first index information does not match the second index information corresponding to the object to be detected, a second similarity information is determined between the object to be detected and the first target information in the third object information; the second index information represents the position of the object corresponding to the object to be detected in the third image data information. If the second similarity information satisfies the first condition, the third object information is determined to be the first object information; Further, the first determining module determines candidate data information based on the data information to be processed and the second data information, including: The data to be processed and the second data are fused together to obtain the first fused information; The first fused information is input into the second processing model, and the candidate data information is output through the second processing model. 
Further, the second determining module determines the first target data information based on the candidate data information, including: Based on the candidate data information, a fourth data information is determined; the fourth data information is used to characterize the confidence level of the candidate data information. Based on the fourth data information, the first target data information is determined; Further, the candidate data information is determined based on second data information, which is data information determined based on the first data information. The second determining module determines the first target data information based on the fourth data information, including: If the fourth data information satisfies the second condition, the candidate data information is determined as the first target data information; and / or, If the fourth data information does not meet the second condition, the first target data information is determined based on the second data information; Furthermore, the second determining module determines the first target data information based on the second data information, including: The second data information is determined as the new first data information, and the step of determining candidate data information based on the data information to be processed and the first data information continues to be executed until the fourth data information satisfies the second condition or the number of updates of the first data information reaches the number threshold. The candidate data information is determined as the first target data information; Further, the candidate data information is determined based on second data information, which is data information determined based on the first data information. 
The second determining module determines fourth data information based on the candidate data information, including: The candidate data information and the second data information are fused to obtain the second fused information; The second fused information is input into the third processing model, and the fourth data information is output through the third processing model. Furthermore, the first data information is obtained by the second determining module in the following manner: The third image data information is filtered to obtain the fourth image data information; The fourth image data information is input into the first processing model, and the first data information is output through the first processing model.

17. A computer device, characterized in that, The computer device includes: One or more processors; Memory; and One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1 to 15.

18. A computer-readable storage medium, characterized in that, It contains a computer program that is loaded by a processor to perform the steps of the method according to any one of claims 1 to 15.