Battery material data extraction method based on multi-modal collaboration and context completion
This method uses multimodal collaboration and context completion to address the isolated extraction of multimodal data from battery material literature in existing technologies. It enables collaborative extraction, intelligent completion, and standardized output of battery material data, which improves the efficiency, completeness, and accuracy of data acquisition. It thereby resolves the inability to complete missing information and the lack of a standardized extraction process, and supports the efficient and accurate construction of a battery material performance knowledge base.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- DOCUMENT & INFORMATION CENT OF CHINESE ACAD OF SCI
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, the extraction of multimodal data from battery material literature is isolated, missing information cannot be filled in, and the extraction process lacks standardization, resulting in low extraction efficiency, poor data integrity, and insufficient accuracy.
A battery material data extraction method based on multimodal collaboration and context completion is adopted, including multimodal deconstruction and preprocessing, fine-grained text data extraction, global context construction, dynamic logic completion and confidence scoring, multimodal data fusion and conflict resolution, data verification and standardized output, to form a high-quality battery material performance knowledge base.
It enables collaborative extraction, intelligent completion, and standardized output of multimodal data from battery material literature, improving the efficiency, completeness, and accuracy of data extraction.
Figure CN122241187A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of battery material data extraction technology, specifically to a battery material data extraction method based on multimodal collaboration and context completion.
Background Technology
[0002] In the research and development and data analysis of battery materials, especially electrolytes, scientific literature is the core data source. Traditional manual extraction and organization methods are inefficient, costly, and prone to errors. Although existing text extraction technologies can extract information such as electrolyte formulations and performance parameters from papers in a structured manner, and multimodal technologies can also obtain data from images and tables, these technologies mostly process single-modal information independently, lacking cross-modal collaboration and contextual linkage. This leads to data fragmentation, missing key experimental conditions, low-quality image data easily contaminating text data, and crude fusion strategies causing redundancy and errors. It is difficult to form a highly reliable and structured knowledge base of battery material performance, and it cannot meet the needs of research and development and data applications for efficient, accurate, and standardized data.
[0003] Existing technologies suffer from problems such as isolated extraction of multimodal data from battery material literature, inability to fill in missing information, and lack of standardized extraction processes, resulting in low extraction efficiency, poor data integrity, and insufficient accuracy.
Summary of the Invention
[0004] This application provides a battery material data extraction method based on multimodal collaboration and context completion. It solves the technical problems in the prior art of isolated multimodal data extraction from battery material literature, inability to complete missing information, and lack of a standardized extraction process, which result in low extraction efficiency, poor data integrity, and insufficient accuracy.
[0005] In view of the above problems, this application provides a method for extracting battery material data based on multimodal collaboration and context completion.
[0006] This application provides a method for extracting battery material data based on multimodal collaboration and context completion, the method comprising: performing multimodal deconstruction and preprocessing on original literature on battery materials to obtain a text object dataset and an image object dataset; performing fine-grained text data extraction and global context construction on the text object dataset to obtain a global context feature dictionary; performing data extraction on the image object dataset based on the global context feature dictionary to obtain an image extraction dataset; performing dynamic logical completion and confidence scoring on the image extraction dataset to obtain a completed image dataset and an image data confidence score set; based on the image data confidence score set, performing multimodal data fusion and conflict resolution based on information source priority on the global context feature dictionary and the completed image dataset to obtain a multimodal data fusion result; and performing data verification and standardized output on the multimodal data fusion result to obtain a battery material performance knowledge base.
[0007] One or more technical solutions provided in this application have at least the following technical effects or advantages: The method performs multimodal deconstruction and preprocessing on the original literature on battery materials to obtain a text object dataset and an image object dataset. Fine-grained text data extraction and global context construction are performed on the text object dataset to obtain a global context feature dictionary. Data extraction is performed on the image object dataset based on the global context feature dictionary to obtain an image extraction dataset. Dynamic logical completion and confidence scoring are then performed on the image extraction dataset to obtain a completed image dataset and an image data confidence score set. Multimodal data fusion and conflict resolution based on information source priority are applied to the global context feature dictionary and the completed image dataset to obtain the multimodal data fusion result. Finally, the multimodal data fusion result is verified and standardized for output to obtain a battery material performance knowledge base. This method achieves collaborative extraction, intelligent completion, and standardized output of multimodal data from battery material literature, improving the efficiency, completeness, and accuracy of battery material data extraction.
Attached Figure Description
[0008] Figure 1 is a schematic diagram of the battery material data extraction method based on multimodal collaboration and context completion provided in an embodiment of this application; Figure 2 is a schematic diagram of the process of obtaining the global context feature dictionary in the battery material data extraction method based on multimodal collaboration and context completion provided in an embodiment of this application.
Detailed Implementation
[0009] This application provides a battery material data extraction method based on multimodal collaboration and context completion to address the technical problems in existing technologies, such as isolated extraction of multimodal data from battery material literature, inability to complete missing information, and lack of standardized extraction processes, resulting in low extraction efficiency, poor data integrity, and insufficient accuracy.
[0010] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative effort are within the scope of protection of this application.
[0011] Embodiment: As shown in Figure 1, this application provides a method for extracting battery material data based on multimodal collaboration and context completion, the method comprising: Step S100: Perform multimodal deconstruction and preprocessing on the original literature on battery materials to obtain a text object dataset and an image object dataset.
[0012] Specifically, multimodal deconstruction and preprocessing are performed on the original literature on battery materials. First, a PDF parsing tool is used to deconstruct the original PDF text of the literature, automatically identifying and separating three types of knowledge objects: text paragraphs, tables, and images. The text knowledge objects are then divided into literature chapters such as abstracts, introductions, and experimental sections, and encapsulated in a structured JSON format. Metadata such as chapter, page number, and paragraph number are bound to each text segment to form a standard text dataset. The image knowledge objects are uniformly converted to JPG / PNG format, named according to image number, and a precise association is established with the corresponding text to form a standard image dataset. At the same time, the table knowledge objects are classified. Editable text-type tables are directly converted to HTML format to supplement the standard text dataset as text objects, while scanned image-type tables are saved as image objects to supplement the standard image dataset. After the above table data supplementation, the final complete text object dataset and image object dataset are obtained.
[0013] Step S200: Perform fine-grained text data extraction and global context construction on the text object dataset to obtain a global context feature dictionary.
[0014] Specifically, fine-grained text data extraction and global context construction are carried out on the text object dataset. First, a large language model is guided by a customized prompt word to accurately extract structured data that conforms to a predefined fine-grained data pattern, such as electrolyte information, basic physicochemical properties, and electrochemical cycling performance, from the text, and output the extracted text dataset in JSON format. Then, frequency statistics and mode calculation are performed on the extracted text dataset to identify the dominant experimental conditions in the literature, and to complete the identification and aggregation of key features such as mainstream cathode materials, mainstream anode materials, high-frequency electrolyte names, and typical test temperatures. Finally, this core information is encapsulated into a document-level global context feature dictionary.
[0015] Step S300: Extract data from the image object dataset based on the global context feature dictionary to obtain the image extraction dataset.
[0016] Specifically, data extraction is performed on the image object dataset based on the global context feature dictionary. First, intelligent image-text anchoring is completed for each image through a multi-level matching strategy, retrieving the image captions and adjacent paragraphs in the text that reference the image, and integrating them to form the local context of the image. Then, combining the global context feature dictionary and the local context of the image, enhanced visual cue words are dynamically constructed, consisting of standard data extraction instructions, global context feature hints, and local context text. These enhanced visual cue words are input into a visual understanding model, which combines prior text knowledge to parse the image and achieve semantic alignment, extracting structured data with complete experimental condition annotations. The structured extraction results of all images are then integrated to form the image extraction dataset.
[0017] Step S400: Perform dynamic logical completion and confidence scoring on the image extraction dataset to obtain the completed image dataset and the image data confidence score set.
[0018] Specifically, dynamic logical completion and confidence scoring are performed on the image extraction dataset. First, each data record in the image extraction dataset is traversed, checking the completeness of key fields such as electrolyte name, positive electrode material, negative electrode material, and test temperature. If a key field is empty, relevant information is extracted from the corresponding local context of the image to fill it in. If no corresponding information is found in the local context, the core experimental condition information from the global context feature dictionary is used to backfill the field. All completed fields are marked for traceability. After this pass over all missing fields, a completed image dataset is generated. Simultaneously, a quantitative confidence calculation system based on a base score and dynamic adjustment is constructed, using the formula: $C = \max\bigl(0,\ \min\bigl(1,\ S_{base} + \min(0.02\,N_{valid},\ 0.09) - 0.05\,N_{miss}\bigr)\bigr)$. In this formula, $C$ is the final confidence score of the data record. $S_{base}$ is the base score determined by the source of the data; the base score for image source data is set to 0.6. The term $\min(0.02\,N_{valid},\ 0.09)$ is the dynamic bonus for field completeness, where $N_{valid}$ is the number of non-empty and valid key fields: each valid field adds 0.02 points, up to a maximum bonus of 0.09. The term $0.05\,N_{miss}$ is the dynamic deduction for missing fields, where $N_{miss}$ is the number of missing key fields: each missing field deducts 0.05 points, and the score is not reduced below 0. Finally, the calculated confidence score is rounded to two decimal places and bound to the corresponding data record for storage, forming the image data confidence score set.
[0019] Step S500: Based on the image data confidence score set, perform multimodal data fusion and conflict resolution on the global context feature dictionary and the completed image dataset based on information source priority to obtain the multimodal data fusion result.
[0020] Specifically, based on the image data confidence score set, multimodal data fusion and conflict resolution based on information source priority are carried out on the global context feature dictionary and the completed image dataset. First, relying on the image data confidence score set obtained from the quantitative confidence calculation formula, the core rule of information source priority is established: text data has a higher information source priority than image data. Unique keys are defined for the three types of core data: the unique key for electrolyte information is the standardized electrolyte name; the unique key for basic physicochemical performance is the standardized electrolyte name + test temperature; and the unique key for electrochemical cycle performance is the standardized electrolyte name + positive electrode material + negative electrode material + test temperature + current density. Then, combining the confidence score of each image data record, differentiated fusion and conflict resolution strategies are applied based on information source priority and unique keys. For text and image data that share the same unique key but conflict, the high-priority text data is retained directly. For unique keys not covered by the text data, image data whose confidence score reaches the usable threshold is retained as a valid supplement. For data from the same source with an identical unique key, the record with the higher confidence score or richer information is retained. At the same time, a string distance algorithm is used for fuzzy deduplication of electrolyte formulations: if two formulations are judged to be similar and their sources are text and image respectively, the text data is retained directly to avoid introducing noise. Finally, the text extraction data in the global context feature dictionary and the completed image dataset are integrated through the above strategies to obtain the multimodal data fusion result.
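The fusion rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout, the field names, the `usable_threshold` value of 0.5, and the similarity cutoff are all hypothetical, and `difflib.SequenceMatcher` is used as a stand-in for the unspecified string distance algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical unique-key fields for the three core data types.
KEY_FIELDS = {
    "electrolyte": ("electrolyte_name",),
    "physicochemical": ("electrolyte_name", "test_temperature_c"),
    "cycling": ("electrolyte_name", "cathode", "anode",
                "test_temperature_c", "current_density"),
}

def unique_key(record):
    """Build the unique key for a record from its data type."""
    kind = record["kind"]
    return (kind,) + tuple(record.get(f) for f in KEY_FIELDS[kind])

def richness(record):
    """Count non-empty fields; used to pick the richer same-source record."""
    return sum(1 for v in record.values() if v not in (None, ""))

def fuse(text_records, image_records, usable_threshold=0.5):
    """Text wins on key conflicts; confident image records fill the gaps."""
    fused = {}
    for r in text_records:
        key = unique_key(r)
        candidate = {**r, "source": "text"}
        if key not in fused or richness(candidate) > richness(fused[key]):
            fused[key] = candidate
    for r in image_records:
        key = unique_key(r)
        existing = fused.get(key)
        if existing is not None and existing["source"] == "text":
            continue  # text data has the higher source priority
        if r["confidence"] < usable_threshold:
            continue  # assumed cutoff; the source does not state a value
        if existing is None or r["confidence"] > existing["confidence"]:
            fused[key] = {**r, "source": "image"}
    return list(fused.values())

def similar_formulations(a, b, cutoff=0.9):
    """Fuzzy match of two formulation strings (stand-in similarity measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= cutoff

text = [{"kind": "electrolyte", "electrolyte_name": "E1"}]
image = [
    {"kind": "electrolyte", "electrolyte_name": "E1", "confidence": 0.68},
    {"kind": "electrolyte", "electrolyte_name": "E2", "confidence": 0.58},
    {"kind": "electrolyte", "electrolyte_name": "E2", "confidence": 0.68},
    {"kind": "electrolyte", "electrolyte_name": "E3", "confidence": 0.30},
]
result = fuse(text, image)
```

In this sketch the fuzzy check is a separate helper; a fuller version would apply it across text and image formulations before merging and drop the image-side duplicate.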
[0021] Step S600: Perform data verification and standardization on the multimodal data fusion results to obtain a battery material performance knowledge base.
[0022] Specifically, the multimodal data fusion results undergo data verification and standardized output. First, based on the physical principles of battery materials, hard rules are set for physical rule verification, automatically removing abnormal data that clearly violate common sense, such as coulombic efficiency greater than 100% and ionic conductivity less than or equal to 0, to obtain the verified data fusion results. Then, the verified data undergoes format standardization, completing the unified unit conversion, unifying all temperatures to degrees Celsius, conductivity to mS / cm, etc., while standardizing field names and unifying the number of decimal places retained, with confidence scores retained to two decimal places and performance parameters retained to three decimal places. Finally, the verified and standardized multimodal fusion data is structured and output in both JSON and Excel formats, forming a high-quality battery material performance knowledge base containing complete material formulations, test conditions, performance data, and bound data sources, confidence scores, and completion tags.
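The verification and standardization pass can be illustrated with a short Python sketch. The two hard rules come from the text above; the field names, the `temperature_unit` flag, and the sample records are hypothetical and only serve to make the example runnable.

```python
# Hypothetical field names for the performance parameters.
PERFORMANCE_FIELDS = (
    "coulombic_efficiency_pct",
    "ionic_conductivity_ms_cm",
    "test_temperature",
)

def is_physically_valid(record):
    """Hard rules: coulombic efficiency <= 100 %, ionic conductivity > 0."""
    ce = record.get("coulombic_efficiency_pct")
    if ce is not None and ce > 100:
        return False
    sigma = record.get("ionic_conductivity_ms_cm")
    if sigma is not None and sigma <= 0:
        return False
    return True

def standardize(record):
    """Convert kelvin to degrees Celsius and unify decimal places."""
    out = dict(record)
    if out.get("temperature_unit") == "K":
        out["test_temperature"] = out["test_temperature"] - 273.15
        out["temperature_unit"] = "C"
    for field in PERFORMANCE_FIELDS:
        if out.get(field) is not None:
            out[field] = round(out[field], 3)  # performance: 3 decimals
    if out.get("confidence") is not None:
        out["confidence"] = round(out["confidence"], 2)  # score: 2 decimals
    return out

records = [
    {"coulombic_efficiency_pct": 99.8456, "ionic_conductivity_ms_cm": 8.12345,
     "test_temperature": 298.15, "temperature_unit": "K", "confidence": 0.678},
    {"coulombic_efficiency_pct": 101.2},   # violates the efficiency rule
    {"ionic_conductivity_ms_cm": 0.0},     # violates the conductivity rule
]
# Verify first, then standardize, matching the order described above.
clean = [standardize(r) for r in records if is_physically_valid(r)]
```

The cleaned list could then be serialized with `json.dumps`; the Excel export is omitted here because it needs a third-party library.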
[0023] In one possible implementation, step S100 further includes: Step S110: Use a parsing tool to perform multimodal deconstruction of the original literature on the battery materials to obtain a set of text knowledge objects, a set of image knowledge objects, and a set of table knowledge objects.
[0024] Step S120: Perform text feature preprocessing on the text knowledge object set to obtain a standard text dataset.
[0025] Step S130: Perform image feature preprocessing on the image knowledge object set to obtain a standard image dataset.
[0026] Step S140: Supplement the standard text dataset and the standard image dataset with the table knowledge object set to obtain the text object dataset and the image object dataset.
[0027] Specifically, multimodal deconstruction operations are performed on the original PDF files of battery material literature using PDF parsing tools such as MinerU and Grobid. The document structure parsing algorithm is used to intelligently identify and separate the content of the literature, accurately extracting three independent knowledge objects: text paragraphs, images, and tables. Then, similar knowledge objects are integrated and grouped to form sets of text knowledge objects, image knowledge objects, and table knowledge objects.
[0028] Text feature preprocessing is performed on the text knowledge object set. The text paragraphs are precisely divided into logical paragraphs such as abstract, introduction, experimental section, results and discussion, and conclusion according to the standard chapter division rules of the literature. Then, the divided text segments are structured and encapsulated to generate standardized text data in JSON format. Each text segment is bound with corresponding metadata information such as chapter, page number, and paragraph number to achieve traceable management of text data. Finally, all structured text data are integrated to form a standard text dataset.
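As a rough illustration of the structured encapsulation described above, the following Python sketch binds chapter, page number, and paragraph number to each text segment and serializes the result as JSON. The `TextSegment` class, its field names, and the sample text are assumptions for illustration only.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class TextSegment:
    """One text segment with the metadata needed for traceability."""
    chapter: str     # e.g. "abstract", "experimental"
    page: int
    paragraph: int
    text: str

def build_text_dataset(segments):
    """Serialize the segments as a JSON array, one object per segment."""
    return json.dumps([asdict(s) for s in segments],
                      ensure_ascii=False, indent=2)

# Placeholder segments, not taken from any real paper.
segments = [
    TextSegment("abstract", 1, 1, "Placeholder abstract text."),
    TextSegment("experimental", 3, 2, "Placeholder experimental text."),
]
dataset_json = build_text_dataset(segments)
```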
[0029] Image feature preprocessing is performed on the image knowledge object set. Embedded image resources are extracted by traversing document pages. Vector graphics are directly extracted, and bitmap images are precisely cropped. All images are uniformly converted into RGB format JPG or PNG files. At the same time, the coordinate position of the image in the PDF is identified by the parsing program, and the image number is matched by searching for the "Fig.X" class identifier above and below the image. The images are named Fig_X.png / JPG according to the image number, and an index relationship between the image and the text paragraph is established. Finally, all images that have completed the uniform format, standardized naming and bound index relationship are integrated to form a standard image dataset.
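The figure-number matching and naming step might look like the following Python sketch. The regular expression and the helper name `figure_filename` are assumptions; the text above only states that a "Fig.X"-style identifier near the image is searched for and that files are named Fig_X.

```python
import re

# Matches "Fig. 3", "Fig.3", "Figure 12" and similar identifiers.
FIG_PATTERN = re.compile(r"\bFig(?:ure)?\.?\s*(\d+)", re.IGNORECASE)

def figure_filename(nearby_text, extension="png"):
    """Derive a Fig_X file name from the text found above or below an image."""
    match = FIG_PATTERN.search(nearby_text)
    if match is None:
        return None  # no identifier found; the caller decides how to name it
    return f"Fig_{match.group(1)}.{extension}"
```

For example, `figure_filename("Fig. 3 Cycling performance")` yields `Fig_3.png`, while text without an identifier yields `None`.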
[0030] Based on the table knowledge object set, targeted data supplementation is carried out on the standard text dataset and the standard image dataset. First, all tables in the table knowledge object set are identified and classified. Editable text tables are converted into HTML format to obtain the table text conversion result. Scanned image tables are converted into image formats consistent with the standard image dataset to obtain the table image conversion result. Then, the table text conversion result is added to the standard text dataset as text objects to enrich the text data, and the table image conversion result is added to the standard image dataset as image objects to enrich the image data. After this classification and supplementation, the text object dataset and the image object dataset, each incorporating the table data, are obtained.
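The table classification and routing can be sketched as below. This is an illustrative Python example under assumptions: the `Table` class, its `rows` and `image_bytes` attributes, and the output dictionaries are hypothetical, and a table is treated as editable whenever cell text is available.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional

@dataclass
class Table:
    table_id: str
    rows: Optional[List[List[str]]] = None  # cell text for editable tables
    image_bytes: Optional[bytes] = None     # raster data for scanned tables

def table_to_html(rows):
    """Render cell text as a minimal HTML table."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"

def route_tables(tables, text_dataset, image_dataset):
    """Append editable tables to the text dataset, scanned ones to the images."""
    for t in tables:
        if t.rows is not None:
            text_dataset.append(
                {"type": "table", "id": t.table_id, "html": table_to_html(t.rows)}
            )
        elif t.image_bytes is not None:
            image_dataset.append(
                {"type": "table_image", "id": t.table_id, "data": t.image_bytes}
            )
    return text_dataset, image_dataset

texts, images = route_tables(
    [Table("T1", rows=[["a", "b"]]), Table("T2", image_bytes=b"\x89PNG")],
    [], [],
)
```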
[0031] In one possible implementation, step S140 further includes: Step S141: Perform text-to-table conversion on the table knowledge object set to obtain the table text conversion result.
[0032] Step S142: Perform image-table conversion on the table knowledge object set to obtain the table image conversion result.
[0033] Step S143: Supplement the standard text dataset based on the table text conversion result to obtain the text object dataset.
[0034] Step S144: Supplement the standard image dataset based on the table image conversion result to obtain the image object dataset.
[0035] Specifically, text-table conversion is performed on the table knowledge object set. First, the tables in the table knowledge object set are identified by type, and editable text tables are selected. Then, these text tables are directly converted into HTML format. After the format standardization process is completed, the results are integrated to form the table text conversion result.
[0036] To obtain the table image conversion result, image-table conversion is performed on the table knowledge object set. The table type is first identified for all tables in the table knowledge object set, and the non-editable image tables are selected. Then, the same image processing method used for the standard image dataset is applied to convert these image tables into a standardized image format. After the format conversion of all image tables is completed, they are integrated to obtain the table image conversion result.
[0037] The converted table text is embedded in HTML format into the JSON structured data system of the standard text dataset. This table text data is then integrated with the original text fragments that are divided by document chapters and bound with metadata such as chapter, page number, and paragraph number. This makes the table text data a component of the standard text dataset. At the same time, the embedded table text data is supplemented with corresponding metadata information to achieve context traceability. After integrated processing, a text object dataset that incorporates the table text data is obtained.
[0038] The converted table images are integrated and supplemented according to the processing specifications of the standard image dataset. The converted image-type tables are uniformly saved in RGB JPG / PNG format, matched with their corresponding numbers, and named according to the Fig_X naming rules. At the same time, an index relationship between each table image and the text paragraphs is established so that storage, naming, and association are consistent with the literature images in the standard image dataset. The integrated table images are then incorporated into the standard image dataset, resulting in an image object dataset containing both the original literature images and the image-type tables.
[0039] In one possible implementation, as shown in Figure 2, step S200 further includes: Step S210: Guide the large language model to perform fine-grained text data extraction on the text object dataset by using a customized prompt word, and obtain the extracted text dataset.
[0040] Step S220: Perform frequency statistics on the extracted text dataset to obtain the text parameter frequency statistics results.
[0041] Step S230: Based on the frequency statistics of the text parameters, perform key feature identification and aggregation on the extracted text dataset to generate the global context feature dictionary.
[0042] Specifically, a customized prompt word is constructed based on the business requirements for battery material data extraction. This prompt word explicitly sets the large language model as a professional researcher in the field of battery materials, specifying that it needs to extract three types of structured data from the text object dataset: electrolyte information, basic physicochemical properties, and electrochemical cycle performance. It also strictly limits the extraction fields for each type of data and requires the model to output JSON format results according to the preset data pattern without omitting fields or adding irrelevant information. The text content of the text object dataset is embedded into the customized prompt word and then input into the large language model. This guides the model to perform fine-grained text data extraction from the text object dataset, accurately extracting core information including electrolyte name, original formula, chemical composition list, ionic conductivity, viscosity and test temperature, positive and negative electrode materials, test conditions, and cycle performance indicators. Finally, the structured data output by the model is integrated to form the extracted text dataset.
[0043] A comprehensive frequency statistical analysis is conducted on the structured data in the extracted text dataset. The frequency of occurrence of core experimental parameters and material information, such as test temperature, positive electrode material, negative electrode material, and electrolyte name, is calculated. At the same time, the frequency of each distinct value of each parameter is counted. For example, the number of times each test temperature appears in the text is counted, and the frequently occurring temperature values, mainstream positive and negative electrode material types, and frequently mentioned electrolyte names are identified. Finally, the frequency statistics of all parameters are integrated into a text parameter frequency statistics result that records each value of every core parameter together with its frequency of occurrence. This provides data support for subsequent key feature identification and aggregation.
[0044] Based on the statistical results of text parameter frequencies, and combined with the mode calculation algorithm, key features of the extracted text dataset are identified and aggregated. First, the dominant experimental conditions and core material information in the literature are identified from the statistical results, including the most frequently occurring test temperature, mainstream positive and negative electrode materials, and frequently mentioned electrolyte names. At the same time, supplementary feature information such as typical test temperatures is extracted in batches from the extracted text dataset using regular expressions. Then, all the identified key features are integrated and encapsulated, and aggregated into a document-level global context feature dictionary according to the preset field specifications. This dictionary contains core prior knowledge such as default test temperature, main electrolyte formula, and mainstream positive and negative electrode material types, providing reliable text information support for context injection and missing field completion in subsequent image data extraction.
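The frequency statistics and mode calculation can be illustrated with `collections.Counter`. This is a minimal sketch: the field names, the record layout, and the sample values are hypothetical, and the regular-expression supplement mentioned above is left out.

```python
from collections import Counter

# Hypothetical names for the core parameters tracked at document level.
KEY_FIELDS = ("cathode", "anode", "electrolyte_name", "test_temperature_c")

def build_global_context(records):
    """Pick the most frequent (mode) value of each key field."""
    context = {}
    for field in KEY_FIELDS:
        values = [r[field] for r in records if r.get(field) not in (None, "")]
        if values:
            value, count = Counter(values).most_common(1)[0]
            context[field] = {"value": value, "count": count}
    return context

# Illustrative records standing in for the extracted text dataset.
records = [
    {"cathode": "NMC811", "anode": "Li", "electrolyte_name": "E1",
     "test_temperature_c": 25},
    {"cathode": "NMC811", "anode": "graphite", "electrolyte_name": "E1",
     "test_temperature_c": 25},
    {"cathode": "LFP", "anode": "Li", "electrolyte_name": "E2",
     "test_temperature_c": 60},
]
global_context = build_global_context(records)
```

Keeping the count next to each value lets later steps judge how dominant a default really is before using it to backfill a missing field.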
[0045] In one possible implementation, step S300 further includes: Step S310: Perform image-text anchoring on the image object dataset to obtain multiple local image contexts.
[0046] Step S320: Perform cue word enhancement processing on the local context of the multiple images according to the global context feature dictionary to obtain multiple enhanced visual cue words.
[0047] Step S330: The visual understanding model parses the image object dataset based on the multiple enhanced visual cue words to generate the image extraction dataset.
[0048] Specifically, an intelligent image-text anchoring operation is performed for each image in the image object dataset. A multi-level strategy is adopted: matching a caption in the same block is tried first, then neighboring captions are searched, and finally regular expressions are applied to the captions and adjacent paragraphs. This binds each image file to its exact identifier in the main text of the document. At the same time, the caption text corresponding to each image is automatically retrieved, together with the relevant surrounding text fragments, such as the text before and after the image in the main text. These text fragments are integrated as the local context of the corresponding image. In this way a local context is generated for each image in the image object dataset, and together they form the multiple local image contexts.
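A simplified version of the anchoring step is sketched below in Python. It collapses the multi-level strategy into a single regular-expression pass over the paragraphs, which is an assumption made for brevity; the function name and the returned dictionary layout are likewise hypothetical.

```python
import re

def build_local_context(fig_number, paragraphs):
    """Collect the caption and the paragraphs that reference one figure."""
    # (?!\d) keeps "Fig. 2" from matching inside "Fig. 21".
    ref = re.compile(rf"\bFig(?:ure)?\.?\s*{fig_number}(?!\d)", re.IGNORECASE)
    caption = None
    mentions = []
    for text in paragraphs:
        match = ref.search(text)
        if match is None:
            continue
        if match.start() == 0 and caption is None:
            caption = text        # a paragraph that starts with the identifier
        else:
            mentions.append(text)  # body text that refers to the figure
    return {"figure": fig_number, "caption": caption, "mentions": mentions}

paragraphs = [
    "Fig. 2 Cycling performance of the cells.",
    "As shown in Fig. 2, the capacity fades slowly.",
    "Fig. 21 shows impedance spectra.",
    "Unrelated paragraph.",
]
local_context = build_local_context(2, paragraphs)
```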
[0049] Cue word enhancement is performed on the multiple local image contexts one by one. The global context feature dictionary contains document-level core prior knowledge such as the default test temperature, mainstream positive and negative electrode materials, and the main electrolyte formula. Based on this dictionary, combined with standard battery material data extraction instructions and unified structured output format requirements, a dedicated visual cue word is dynamically constructed for each local image context, following the combination structure "basic extraction instructions + global context features injected with fixed sentence patterns + corresponding local context text of the image". By integrating the global context features with the local context of each image, the content of each original cue word is enhanced, and multiple enhanced visual cue words corresponding one-to-one with the images in the image object dataset are obtained.
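The three-part combination structure can be shown with a small Python sketch. The sentence template, the function name, and the sample inputs are assumptions; the source specifies only the order of the three parts.

```python
def build_enhanced_prompt(base_instruction, global_context, local_context):
    """Combine instruction, fixed-pattern global hints, and local context."""
    hints = "; ".join(f"{key} = {value}" for key, value in global_context.items())
    return (
        f"{base_instruction}\n"
        f"Document-level defaults (use only if the image does not state "
        f"otherwise): {hints}.\n"
        f"Local context for this image:\n{local_context}"
    )

prompt = build_enhanced_prompt(
    "Extract cycling data as JSON.",
    {"test_temperature_c": 25, "cathode": "NMC811"},
    "Fig. 2 Cycling performance.",
)
```

Injecting the defaults through one fixed sentence keeps every image prompt uniform, so differences in model output are more likely to come from the image and its local context than from prompt wording.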
[0050] After matching multiple enhanced visual cues with images in the image object dataset one by one, the data is input into a visual understanding model. This model adopts a Visual Language Model (VLM) architecture, integrating a dual-modal processing structure of visual feature extraction and natural language understanding. First, the visual feature extraction network performs pixel-level analysis of the images, extracting visual feature information such as curves, values, and legends from the battery material performance graphs. Then, the natural language understanding module performs semantic analysis of the global context features, local context text, and data extraction instructions in the enhanced visual cues. Through a cross-modal attention mechanism, the model achieves accurate alignment and fusion reasoning between visual features and text semantics. Based on this algorithm and structure, the model can complete the semantic matching of electrolyte abbreviations and full-text details in the images, identify missing experimental conditions, accurately extract battery material performance data such as cycle count, capacity retention rate, and coulombic efficiency from the images, and output structured results in a preset format. Finally, the analysis and extraction results of all images are integrated and summarized to generate an image extraction dataset with complete experimental condition annotations.
[0051] In one possible implementation, step S400 further includes: Step S410: Perform dynamic logical completion on the image extraction dataset based on the global context feature dictionary to obtain the completed image dataset.
[0052] Step S420: Construct the quantitative confidence calculation formula.
[0053] Step S430: Evaluate the confidence of the completed image dataset according to the quantitative confidence calculation formula, and obtain the image data confidence score set.
[0054] Specifically, the process iterates through each data record in the image extraction dataset, verifying key fields such as electrolyte name, positive electrode material, negative electrode material, and test temperature one by one. If a key field is found to be empty, the relevant information is first retrieved from the local context corresponding to the image to complete it. If the local context does not contain the corresponding information, the core prior knowledge in the global context feature dictionary, such as the default test temperature, the mainstream positive and negative electrode materials, and the main electrolyte formula, is used to fill in the missing field. At the same time, all completed fields are marked for subsequent traceability, and the completed image dataset is finally obtained.
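The two-level fallback described above (local context first, then the global context feature dictionary, with every filled field marked) might be sketched as follows. The field names, the `completed_fields` marker, and the assumption that the local context has already been parsed into a field dictionary are all hypothetical.

```python
# Assumed key field names; the specification lists these fields as examples.
KEY_FIELDS = ("electrolyte_name", "positive_electrode",
              "negative_electrode", "test_temperature")


def is_empty(value) -> bool:
    """A field counts as empty when it is None or an empty string."""
    return value is None or value == ""


def complete_record(record: dict, local_ctx: dict, global_ctx: dict) -> dict:
    """Fill empty key fields from the local context, then the global dictionary."""
    completed = dict(record)
    filled_from = {}  # field name -> source used, kept for traceability
    for field in KEY_FIELDS:
        if not is_empty(completed.get(field)):
            continue
        for source_name, source in (("local_context", local_ctx),
                                    ("global_context", global_ctx)):
            value = source.get(field)
            if not is_empty(value):
                completed[field] = value
                filled_from[field] = source_name
                break
    completed["completed_fields"] = filled_from
    return completed


# Illustrative values only.
record = {"electrolyte_name": "LiPF6-EC/DMC", "positive_electrode": None,
          "negative_electrode": "", "test_temperature": None,
          "capacity_retention": 92.5}
local_ctx = {"positive_electrode": "NCM811"}
global_ctx = {"positive_electrode": "LFP", "negative_electrode": "graphite",
              "test_temperature": 25.0}
completed = complete_record(record, local_ctx, global_ctx)
```

In this example the positive electrode comes from the local context even though the global dictionary also offers a value, which reflects the stated lookup order.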
[0055] A quantitative confidence score calculation formula is constructed based on data source, field completeness, and completion status. The formula is as follows: $\mathrm{Score} = \max\bigl(0,\ S_{\mathrm{base}} + \min(0.02\,N_{\mathrm{valid}},\ 0.09) - 0.05\,N_{\mathrm{missing}}\bigr)$, where $S_{\mathrm{base}}$ is the base score set according to the data source, and the base score for the completed image dataset processed in this step is set to 0.6; $N_{\mathrm{valid}}$ is the number of non-empty, valid key fields in the data record, where each valid field adds 0.02 points and this dynamic scoring item has an upper limit of 0.09; and $N_{\mathrm{missing}}$ is the number of key fields that are still incomplete after the data record has been completed, where each incomplete field reduces the score by 0.05. The final confidence score has a minimum of 0.0, and the overall score range is limited to 0.0 to 1.0, providing a unified and quantitative basis for subsequent data confidence evaluation.
[0056] For each data record in the completed image dataset, a confidence score is calculated individually. Each image-sourced record is initially assigned a base score of 0.6. Then, a dynamic bonus is added based on the number of non-empty and valid key fields in the data record, such as electrolyte name, positive and negative electrode materials, and test temperature. Each valid field adds 0.02 points, with the cumulative bonus not exceeding 0.09. At the same time, a dynamic deduction is applied for key fields that are still incomplete after completion, with each missing field deducting 0.05 points and the minimum score being 0.0. The confidence scores of all data records are limited to the range of 0.0 to 1.0 and rounded to two decimal places. The calculated confidence scores are then bound to their corresponding data records and stored. Finally, the confidence scores of all data records are integrated to form the image data confidence score set.
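The scoring procedure above can be written as a short Python sketch. The constants 0.6, 0.02, 0.09, and 0.05 come from the description; the function name, the field names, and the treatment of `None` or an empty string as "incomplete" are assumptions made for illustration.

```python
# Assumed key field names; the specification lists these fields as examples.
KEY_FIELDS = ("electrolyte_name", "positive_electrode",
              "negative_electrode", "test_temperature")


def confidence_score(record: dict, key_fields=KEY_FIELDS, base: float = 0.6) -> float:
    """Base score + capped bonus for valid fields - deduction for missing fields."""
    n_valid = sum(1 for f in key_fields if record.get(f) not in (None, ""))
    n_missing = len(key_fields) - n_valid
    bonus = min(0.02 * n_valid, 0.09)      # 0.02 per valid field, capped at 0.09
    penalty = 0.05 * n_missing             # 0.05 per field still incomplete
    score = base + bonus - penalty
    # Limit to [0.0, 1.0] and keep two decimal places.
    return round(min(1.0, max(0.0, score)), 2)


# Illustrative records (values are made up).
full = {"electrolyte_name": "E1", "positive_electrode": "NCM811",
        "negative_electrode": "graphite", "test_temperature": 25.0}
partial = {"electrolyte_name": "E1", "positive_electrode": "NCM811",
           "negative_electrode": None, "test_temperature": None}

score_full = confidence_score(full)        # 0.6 + 0.08 - 0.00
score_partial = confidence_score(partial)  # 0.6 + 0.04 - 0.10
```

With four key fields the bonus reaches at most 0.08, so the 0.09 cap only takes effect when five or more key fields are checked.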
[0057] In one possible implementation, step S420 further includes: The quantitative confidence calculation formula is as follows: $\mathrm{Score} = \max\bigl(0,\ S_{\mathrm{base}} + S_{\mathrm{bonus}} - S_{\mathrm{penalty}}\bigr)$, where $\mathrm{Score}$ characterizes the confidence score; $S_{\mathrm{base}}$ characterizes the base score; $S_{\mathrm{bonus}}$ characterizes the dynamic bonus, in which 0.02 points are added to the base score for each non-empty and valid key field appearing in the image data, up to a maximum of 0.09; and $S_{\mathrm{penalty}}$ characterizes the dynamic deduction, in which 0.05 points are deducted for each key field in the image data that remains incomplete, until the score is reduced to 0.
[0058] Specifically, the quantitative confidence calculation formula constructed in this invention is as follows: $\mathrm{Score} = \max\bigl(0,\ S_{\mathrm{base}} + S_{\mathrm{bonus}} - S_{\mathrm{penalty}}\bigr)$, with $S_{\mathrm{bonus}} = \min(0.02\,N_{\mathrm{valid}},\ 0.09)$ and $S_{\mathrm{penalty}} = 0.05\,N_{\mathrm{missing}}$. $\mathrm{Score}$ is the final confidence score, ranging from 0.0 to 1.0. $S_{\mathrm{base}}$ is the base score, set based on the data source, and is uniformly set to 0.6 for the dataset after image extraction and completion. $S_{\mathrm{bonus}}$ is the dynamic bonus item, and $N_{\mathrm{valid}}$ represents the number of non-empty and valid key fields in the image data record. For each key field that meets the requirements, the base score is increased by 0.02, and this dynamic bonus is capped at 0.09 to avoid excessive influence of a single dimension on the confidence score. $S_{\mathrm{penalty}}$ is the dynamic deduction item, and $N_{\mathrm{missing}}$ represents the number of key fields in the image data that are still incomplete after being supplemented by the global context feature dictionary; 0.05 points are deducted for each incomplete key field. If the score after deduction is less than 0.0, the final confidence score is recorded as 0. This formula combines the data source, field completeness, and completion status to quantify the reliability of image data. The calculation results provide the core basis for subsequent multimodal data fusion and conflict resolution.
[0059] In one possible implementation, step S500 further includes: Step S510: Set the source priority, wherein the source priority includes text data source priority being greater than image data source priority.
[0060] Step S520: Define three types of core data unique keys, including the electrolyte information unique key, the basic physicochemical performance unique key, and the electrochemical cycle performance unique key.
[0061] Step S530: Based on the image data confidence score set, perform differentiated data fusion and conflict resolution on the global context feature dictionary and the completed image dataset according to the information source priority and the unique keys of the three types of core data, and generate the multimodal data fusion result.
[0062] Specifically, a clear source priority rule is set: text data extracted from the text object dataset and aggregated into the global context feature dictionary has a higher source priority than image data obtained after image extraction and completion. This rule serves as the core basis for data selection during subsequent multimodal data fusion and conflict resolution, preventing low-confidence image data from overwriting high-confidence text data and ensuring the overall accuracy of the fused data.
[0063] For the three types of core data extracted from battery materials, unique keys are defined based on data characteristics and business logic. These keys serve as the core basis for determining whether data is duplicated and whether it needs to be merged. The unique key for electrolyte information is the standardized electrolyte name, which is the identifier obtained after standardizing the electrolyte name by removing spaces and symbols. The unique key for basic physicochemical performance is the combination of the standardized electrolyte name and the test temperature, thereby distinguishing the basic physicochemical performance data of different electrolytes at different test temperatures. The unique key for electrochemical cycle performance is the combination of the standardized electrolyte name, positive electrode material, negative electrode material, test temperature, and current density. Through multi-dimensional feature combinations, electrochemical cycle performance data under different experimental conditions can be accurately distinguished. The definition of these three types of unique keys enables accurate identification of data from different types of battery materials.
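As a minimal sketch, the three unique keys described above can be represented as tuples in Python. The function and field names are hypothetical, and the regular expression is only one possible reading of "removing spaces and symbols" (it strips every character that is not a letter or digit).

```python
import re


def normalize_electrolyte_name(name: str) -> str:
    """Remove spaces and symbols, keeping only letters and digits."""
    return re.sub(r"[\W_]+", "", name)


def electrolyte_key(rec: dict) -> tuple:
    """Unique key for electrolyte information: standardized name only."""
    return (normalize_electrolyte_name(rec["electrolyte_name"]),)


def physchem_key(rec: dict) -> tuple:
    """Unique key for basic physicochemical performance: name + temperature."""
    return (normalize_electrolyte_name(rec["electrolyte_name"]),
            rec["test_temperature"])


def cycle_key(rec: dict) -> tuple:
    """Unique key for electrochemical cycle performance: five dimensions."""
    return (normalize_electrolyte_name(rec["electrolyte_name"]),
            rec["positive_electrode"], rec["negative_electrode"],
            rec["test_temperature"], rec["current_density"])


# Two spellings of the same electrolyte (illustrative values).
a = {"electrolyte_name": "1 M LiPF6 in EC/DMC (1:1)", "test_temperature": 25.0}
b = {"electrolyte_name": "1M LiPF6 in EC / DMC (1:1)", "test_temperature": 60.0}

same_electrolyte = electrolyte_key(a) == electrolyte_key(b)
same_physchem_record = physchem_key(a) == physchem_key(b)
```

Tuples are hashable, so they can be used directly as dictionary keys when grouping records for duplicate and conflict detection.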
[0064] Using the image data confidence scores as the quantitative basis for data reliability, and strictly following the source priority rule that text data takes precedence over image data, the method uses the unique keys of the three core data types (electrolyte information, basic physicochemical performance, and electrochemical cycle performance) to perform differentiated multimodal data fusion and conflict resolution on the high-confidence text extraction data corresponding to the global context feature dictionary and on the completed image dataset. First, unique key matching determines whether records are duplicated or conflicting. For conflicting text and image records with identical unique keys, the text record is retained directly so that image data does not overwrite or contaminate it. For image records whose unique key has no counterpart in the text data and whose confidence score reaches a usable threshold, the image record is added to the fused data as a supplement. For records from the same source with identical unique keys, the one with the higher confidence score, or the one with the longer and more informative original formulation, is retained. In addition, a string distance algorithm is used for fuzzy deduplication of electrolyte formulations: if two formulations are judged similar and come from heterogeneous sources (one text, one image), the text record is retained directly to prevent noise introduced by image recognition errors. After this differentiated processing, a conflict-free, redundancy-free, and highly reliable multimodal data fusion result is generated.
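The fusion rules above might be sketched in Python as follows. This is an illustration under several assumptions, not the claimed implementation: each record carries a precomputed `key` tuple whose first element is the standardized electrolyte name; both thresholds are hypothetical values; and the standard-library `difflib.SequenceMatcher` similarity ratio stands in for the unspecified string distance algorithm.

```python
from difflib import SequenceMatcher

IMAGE_CONFIDENCE_THRESHOLD = 0.5   # hypothetical "usable" threshold
NAME_SIMILARITY_THRESHOLD = 0.9    # hypothetical fuzzy-match cutoff


def _rank(record: dict) -> tuple:
    """Same-source tie-break: higher confidence, then longer formulation."""
    return (record.get("confidence", 0.0), len(record.get("formulation", "")))


def _best_per_key(records: list) -> dict:
    best = {}
    for rec in records:
        key = rec["key"]
        if key not in best or _rank(rec) > _rank(best[key]):
            best[key] = rec
    return best


def _is_fuzzy_duplicate(key: tuple, text_keys: list) -> bool:
    """True if a text key has the same conditions and a similar electrolyte name."""
    for other in text_keys:
        ratio = SequenceMatcher(None, key[0], other[0]).ratio()
        if key[1:] == other[1:] and ratio >= NAME_SIMILARITY_THRESHOLD:
            return True
    return False


def fuse(text_records: list, image_records: list) -> list:
    fused = _best_per_key(text_records)          # text has source priority
    text_keys = list(fused)
    for key, rec in _best_per_key(image_records).items():
        if key in fused:                          # text/image conflict: keep text
            continue
        if rec.get("confidence", 0.0) < IMAGE_CONFIDENCE_THRESHOLD:
            continue                              # image data not usable
        if _is_fuzzy_duplicate(key, text_keys):   # likely recognition error
            continue
        fused[key] = rec                          # image supplements text
    return list(fused.values())


# Illustrative records (values are made up).
text_records = [{"key": ("1MLiPF6inECDMC", 25), "source": "text",
                 "formulation": "1 M LiPF6 in EC/DMC", "value": 10.2}]
image_records = [
    {"key": ("1MLiPF6inECDMC", 25), "source": "image", "confidence": 0.68, "value": 9.0},
    {"key": ("1MLiPF6inECDMC", 60), "source": "image", "confidence": 0.68, "value": 15.0},
    {"key": ("1MLiPF6inECDNC", 25), "source": "image", "confidence": 0.68, "value": 9.9},
    {"key": ("1MLiTFSIinDOLDME", 25), "source": "image", "confidence": 0.30, "value": 4.0},
]
result = fuse(text_records, image_records)
```

In this example only the 60-degree image record survives alongside the text record: the first image record conflicts with text, the third differs from the text name by one character and is treated as a fuzzy duplicate, and the fourth falls below the assumed threshold.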
[0065] In one possible implementation, step S520 further includes: Step S521: The unique key for electrolyte information includes the standardized electrolyte name.
[0066] Step S522: The unique key for basic physicochemical performance includes the standardized electrolyte name and the test temperature.
[0067] Step S523: The unique key for electrochemical cycle performance includes the standardized electrolyte name, positive electrode material, negative electrode material, test temperature, and current density.
[0068] Specifically, a unique key is defined for core data related to electrolyte information. This unique key is a standardized electrolyte name, which is obtained by standardizing the extracted electrolyte name by removing spaces and symbols. This standardized name serves as the unique matching basis for electrolyte information data, used to determine whether electrolyte information data from different sources points to the same electrolyte and whether there are duplicates or conflicts. This provides a precise single-dimensional matching standard for the subsequent fusion and conflict resolution of multimodal electrolyte information data.
[0069] For core data related to the basic physicochemical properties of battery materials, a unique key is defined. This unique key consists of a standardized electrolyte name and a test temperature. The standardized electrolyte name is obtained by standardizing the electrolyte name after removing spaces and symbols. The test temperature is the actual ambient temperature at which the basic physicochemical property is measured. By combining these two core dimensions, a unique matching basis for basic physicochemical property data is constructed. This can accurately distinguish the basic physicochemical property data of the same electrolyte at different test temperatures, and can also effectively identify the performance data differences of different electrolytes at the same test temperature. This provides a precise and business-logic-aligned two-dimensional matching standard for subsequent duplicate determination, fusion matching, and conflict resolution of textual and graphical data related to basic physicochemical properties.
[0070] A unique key is defined for core data related to the electrochemical cycle performance of battery materials. This unique key is composed of five core dimensions: standardized electrolyte name, positive electrode material, negative electrode material, test temperature, and current density. The standardized electrolyte name is the identifier obtained by standardizing the electrolyte name after removing spaces and symbols. The positive and negative electrode materials are the core electrode materials of the battery system. The test temperature is the ambient temperature of the electrochemical cycle test, and the current density is the charge and discharge current density during the cycle test. By combining these five key experimental parameters, a unique matching basis for electrochemical cycle performance data is constructed. This can accurately distinguish the electrochemical cycle performance data of different electrolytes and different electrode material systems under different test temperatures and current densities, effectively avoiding data duplication and conflict identification errors caused by differences in experimental conditions.
[0071] In one possible implementation, step S600 further includes: Step S610: Verify the multimodal data fusion results using basic physics principles in the field of battery materials, and obtain the verified data fusion results.
[0072] Step S620: Standardize the format of the fusion results after verification to generate the battery material performance knowledge base.
[0073] Specifically, for the multimodal data fusion results after fusion and conflict resolution, a physical common sense verification of the battery materials field is carried out. Based on the physical common sense of the field of battery materials research and development, hard rule thresholds are set, and the core performance parameters in the data are automatically checked for compliance. Abnormal data that obviously violate physical common sense, such as coulombic efficiency greater than 100% and ionic conductivity less than 0, are directly removed. Only valid data that conforms to the physical rules of the field are retained, and finally the data fusion result after verification with abnormal data filtering is obtained.
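The hard-rule check described above can be illustrated with a small Python filter. Only the two rules named in the text are encoded (coulombic efficiency greater than 100% and ionic conductivity less than 0); the field names and the assumption that coulombic efficiency is stored in percent are hypothetical.

```python
# Each rule maps an assumed field name to a predicate that flags a violation.
HARD_RULES = {
    "coulombic_efficiency": lambda v: v > 100.0,  # percent, cannot exceed 100
    "ionic_conductivity": lambda v: v < 0.0,      # cannot be negative
}


def is_physically_plausible(record: dict) -> bool:
    """Return False if any present field violates a hard rule."""
    for field, violates in HARD_RULES.items():
        value = record.get(field)
        if value is not None and violates(value):
            return False
    return True


# Illustrative fused records (values are made up).
fused = [
    {"id": 1, "coulombic_efficiency": 99.7, "ionic_conductivity": 10.2},
    {"id": 2, "coulombic_efficiency": 103.5},
    {"id": 3, "ionic_conductivity": -0.4},
    {"id": 4, "capacity_retention": 92.5},
]
validated = [rec for rec in fused if is_physically_plausible(rec)]
```

Records that do not report a checked parameter, such as record 4 here, pass the filter, since a missing value is not in itself a violation of a physical rule.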
[0074] After the physical common sense verification, the verified data fusion result undergoes comprehensive format standardization. First, the units of all data records are unified to those commonly used in the battery materials field: temperature is converted to degrees Celsius (°C), conductivity to mS/cm, and capacity to mAh/g. At the same time, field names are unified according to standard naming rules, and the number of decimal places is unified, with confidence scores kept to two decimal places and battery performance parameters kept to three decimal places. Then, all standardized data is structurally integrated, so that each data record includes complete material information, electrolyte formulation, experimental test conditions, and performance data, together with the corresponding data source, confidence score, and completion marks. Finally, a highly reliable, structured battery material performance knowledge base is generated that can be used directly for scientific research analysis and model training.
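A minimal Python sketch of the unit and precision standardization follows. The target units and the three-decimal precision come from the description; the set of accepted input units, the `(kind, value, unit)` interface, and the function name are assumptions for illustration.

```python
# Target units named in the description.
TARGET_UNITS = {"temperature": "°C", "conductivity": "mS/cm", "capacity": "mAh/g"}

# Assumed input units and their conversions to the target unit.
CONVERTERS = {
    ("temperature", "°C"): lambda v: v,
    ("temperature", "K"): lambda v: v - 273.15,
    ("conductivity", "mS/cm"): lambda v: v,
    ("conductivity", "S/cm"): lambda v: v * 1000.0,
    ("capacity", "mAh/g"): lambda v: v,
    ("capacity", "Ah/g"): lambda v: v * 1000.0,
}


def standardize_quantity(kind: str, value: float, unit: str) -> tuple:
    """Convert to the target unit and keep three decimal places."""
    unit = unit.replace(" ", "")            # "mS / cm" -> "mS/cm"
    convert = CONVERTERS[(kind, unit)]      # KeyError for unsupported units
    return round(convert(value), 3), TARGET_UNITS[kind]


temperature = standardize_quantity("temperature", 298.15, "K")
conductivity = standardize_quantity("conductivity", 0.0102, "S / cm")
capacity = standardize_quantity("capacity", 165.23456, "mAh / g")
```

Raising an error on an unrecognized unit, rather than passing the value through unchanged, is a design choice of this sketch so that unconverted values cannot silently enter the knowledge base.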
[0075] It should be noted that the order of the embodiments described above is merely for descriptive purposes and does not represent the superiority or inferiority of the embodiments. Furthermore, the above description focuses on specific embodiments of this specification. Additionally, the processes depicted in the accompanying drawings do not necessarily require a specific or sequential order to achieve the desired results. In some implementations, multitasking and parallel processing are possible or may be advantageous.
[0076] The above description is only a preferred embodiment of this application and is not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.
[0077] This specification and accompanying drawings are merely illustrative examples of this application and are intended to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. Clearly, those skilled in the art can make various alterations and modifications to this application without departing from its scope. Therefore, if such modifications and variations fall within the scope of this application and its equivalents, this application intends to include such modifications and variations.
Claims
1. A method for extracting battery material data based on multimodal collaboration and context completion, characterized in that the method includes: performing multimodal deconstruction and preprocessing on original battery material literature to obtain a text object dataset and an image object dataset; performing fine-grained text data extraction and global context construction on the text object dataset to obtain a global context feature dictionary; performing data extraction on the image object dataset based on the global context feature dictionary to obtain an image extraction dataset; performing dynamic logical completion and confidence scoring on the image extraction dataset to obtain a completed image dataset and an image data confidence score set; based on the image data confidence score set, performing multimodal data fusion and conflict resolution based on information source priority on the global context feature dictionary and the completed image dataset to obtain a multimodal data fusion result; and validating the multimodal data fusion result and outputting it in standardized form to obtain a battery material performance knowledge base.
2. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that performing multimodal deconstruction and preprocessing on the original battery material literature to obtain the text object dataset and the image object dataset includes: deconstructing the original battery material literature in a multimodal manner using a parsing tool to obtain a text knowledge object set, an image knowledge object set, and a table knowledge object set; performing text feature preprocessing on the text knowledge object set to obtain a standard text dataset; performing image feature preprocessing on the image knowledge object set to obtain a standard image dataset; and supplementing the standard text dataset and the standard image dataset with data based on the table knowledge object set to obtain the text object dataset and the image object dataset, respectively.
3. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 2, characterized in that supplementing the standard text dataset and the standard image dataset with data based on the table knowledge object set to obtain the text object dataset and the image object dataset, respectively, includes: performing table-to-text conversion on the table knowledge object set to obtain a table text conversion result; performing table-to-image conversion on the table knowledge object set to obtain a table image conversion result; supplementing the standard text dataset based on the table text conversion result to obtain the text object dataset; and supplementing the standard image dataset based on the table image conversion result to obtain the image object dataset.
4. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that, Fine-grained text data extraction and global context construction are performed on the text object dataset to obtain a global context feature dictionary, including: By using customized prompt words, the large language model is guided to perform fine-grained text data extraction on the text object dataset to obtain the extracted text dataset; Frequency statistics are performed on the extracted text dataset to obtain the text parameter frequency statistics results; Based on the frequency statistics of the text parameters, key features are identified and aggregated in the extracted text dataset to generate the global context feature dictionary.
5. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that performing data extraction on the image object dataset based on the global context feature dictionary to obtain the image extraction dataset includes: performing image-text anchoring on the image object dataset to obtain multiple local image contexts; based on the global context feature dictionary, performing prompt word enhancement on the multiple local image contexts to obtain multiple enhanced visual prompt words; and parsing, by the visual understanding model, the image object dataset based on the multiple enhanced visual prompt words to generate the image extraction dataset.
6. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that performing dynamic logical completion and confidence scoring on the image extraction dataset to obtain the completed image dataset and the image data confidence score set includes: performing dynamic logical completion on the image extraction dataset based on the global context feature dictionary to obtain the completed image dataset; constructing a quantitative confidence calculation formula; and evaluating the confidence of the completed image dataset according to the quantitative confidence calculation formula to obtain the image data confidence score set.
7. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that, based on the image data confidence score set, performing multimodal data fusion and conflict resolution based on information source priority on the global context feature dictionary and the completed image dataset to obtain the multimodal data fusion result includes: setting the information source priority, wherein the text data source priority is greater than the image data source priority; defining three types of core data unique keys, namely the unique key for electrolyte information, the unique key for basic physicochemical performance, and the unique key for electrochemical cycle performance; and, based on the image data confidence score set, performing differentiated data fusion and conflict resolution on the global context feature dictionary and the completed image dataset according to the information source priority and the three types of core data unique keys to generate the multimodal data fusion result.
8. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 7, characterized in that the unique key for electrolyte information includes a standardized electrolyte name; the unique key for basic physicochemical performance includes the standardized electrolyte name and the test temperature; and the unique key for electrochemical cycle performance includes the standardized electrolyte name, positive electrode material, negative electrode material, test temperature, and current density.
9. The battery material data extraction method based on multimodal collaboration and context completion as described in claim 1, characterized in that, The multimodal data fusion results are validated and standardized to obtain a battery material performance knowledge base, including: The multimodal data fusion results are verified using basic physics principles in the field of battery materials to obtain the verified data fusion results. The format of the fusion results after verification is standardized to generate the battery material performance knowledge base.