A method of generating a bibliometric analysis report and related products
By automating bibliometric analysis through intelligent agents, the problem of complex manual operation in existing technologies is solved, and bibliometric analysis reports are generated efficiently to support scientific research decision-making and trend prediction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INST OF MEDICAL INFORMATION CHINESE ACAD OF MEDICAL SCI
- Filing Date
- 2026-04-23
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies, bibliometric analysis relies on manual operation, and the tools are numerous and complex, making it difficult for researchers to quickly obtain research progress and trends in the field, resulting in low research efficiency and the risk of missing important information.
By employing an intelligent agent based on the Reflexion framework and combining a bibliometric text generation model, a Transformer model, and a MedBERT model, a one-click generation of bibliometric analysis reports is achieved through automated literature retrieval, preprocessing, bibliometric analysis, and report generation.
It improves the convenience of literature statistics and correlation analysis, enhances research efficiency, and can quickly generate high-quality bibliometric analysis reports to support research decision-making.
Figure CN122309533A_ABST
Description
Technical Field
[0001] This application relates to the field of information processing technology, and in particular to a method for generating bibliometric analysis reports and related products.
Background Technology
[0002] Bibliometric analysis quantitatively analyzes the external characteristics of academic literature, such as publication counts, citations, authors, journals, and keywords, and, combined with visualization technology, supports scientific research evaluation, trend prediction, and decision-making.
[0003] In related technologies, bibliometric analysis still relies on researchers manually analyzing the relevant literature datasets with tools. However, existing bibliometric analysis tools are numerous, their statistical functions are fragmented, their operating procedures are complex, and they take a long time to learn. Enterprises and researchers cannot quickly obtain a bibliometric analysis report for a specific field, which makes it difficult to follow research progress and trends in a timely manner and leads to lagging technological development and insufficient scientific innovation. Faced with a large amount of literature data, users can usually select only a small number of documents for detailed reading, resulting in low research efficiency and a risk of missing important technologies and events.
Summary of the Invention
[0004] To address the aforementioned issues, this application provides a method and related products for generating bibliometric analysis reports, thereby improving the convenience of bibliometric statistics and correlation analysis and enhancing research efficiency.
[0005] The embodiments of this application disclose the following technical solutions:
In a first aspect, embodiments of this application provide a method for generating a bibliometric analysis report, applied to an intelligent agent, the method comprising:
obtaining user needs and, based on the user needs, deriving a retrieval strategy and a demand semantic vector corresponding to the user needs;
determining an execution plan based on the retrieval strategy and the demand semantic vector;
executing the execution plan to obtain a bibliometric analysis report;
wherein the intelligent agent is built on the Reflexion framework and adopts a combination of models, inference planning, memory, and tools. The models include a bibliometric text generation model, a Transformer model, and a MedBERT model. The inference planning includes reasoning to parse user needs, breaking down execution plans, and reflection. The memory includes summarizing and analyzing the process of generating bibliometric analysis reports. The tools include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
[0006] In one possible implementation, the execution plan includes literature retrieval and preprocessing, which includes the following steps: The API tool is invoked to retrieve and download the literature dataset from the corresponding literature database; the literature dataset includes fields such as literature title, author, author affiliation, keywords, abstract, journal, year, language, and subject category. The data of each document in the literature dataset is preprocessed to obtain normalized data. The preprocessing includes error screening and correction, deletion of duplicate data, and completion of missing fields.
[0007] In one possible implementation, the execution plan includes bibliometric analysis, which includes the following steps: Bibliometric analysis tools are used to perform bibliometric analysis on the normalized data and obtain bibliometric analysis result charts; the bibliometric analysis result charts and text prompt templates are used as inputs to the bibliometric text generation model, which then outputs the text content corresponding to the bibliometric analysis result charts. The text content is integrated to obtain a text set.
[0008] In one possible implementation, the execution plan includes report generation and optimization, which includes the following steps: The text set and abstract prompt template are used as inputs to the bibliometric text generation model, which then outputs abstract text. Based on the bibliometric analysis result charts, the text content, and the abstract text, a bibliometric analysis report is generated; based on the evaluation of and recommendations for the bibliometric analysis report, the bibliometric analysis report is dynamically adjusted and optimized.
[0009] In one possible implementation, obtaining user needs and deriving, based on the user needs, a retrieval strategy and a demand semantic vector corresponding to the user needs includes: obtaining the user needs, which include search topics, search scope, and reporting requirements; expanding the search topics using the Medical Subject Headings (MeSH), DrugBank, and the International Classification of Diseases (ICD-10); deriving a search strategy based on the search scope and the expanded search topics; and parsing the user intent using a deep Transformer model, so that the text of the user needs is encoded into a demand semantic vector.
[0010] In one possible implementation, bibliometric analysis includes at least one of the following: research trend analysis, publication and citation comparison analysis, collaboration network analysis, keyword frequency analysis, keyword co-occurrence analysis, country analysis, institution analysis, author analysis, journal analysis, co-cited literature analysis, and highly cited literature analysis.
[0011] In one possible implementation, the training method for the bibliometric text generation model is as follows: Bibliometric analysis data related to the medical and health fields is obtained; bibliometric analysis result charts and the corresponding text content are extracted from the bibliometric analysis data, and initial data pairs are determined from the extracted results, where the bibliometric analysis result charts include bar charts, line charts, relationship diagrams, and data tables; based on the initial data pairs, an active learning strategy is used to obtain the training dataset, in which the proportions of bar charts, line charts, relationship diagrams, and data tables are consistent; the training dataset and text prompt templates are used as input to the initial model, supervised fine-tuning is performed using the multimodal pre-trained model mPLUG-Owl2, and the initial model is fine-tuned using LoRA+ to obtain the bibliometric text generation model.
[0012] In a second aspect, embodiments of this application provide an intelligent agent for generating bibliometric analysis reports, the intelligent agent including: an analysis module, a determination module, and an execution module. The analysis module is configured to obtain user needs and, based on the user needs, to derive a retrieval strategy and a demand semantic vector corresponding to the user needs. The determination module is configured to determine the execution plan based on the retrieval strategy and the demand semantic vector. The execution module is configured to execute the execution plan and obtain a bibliometric analysis report. The intelligent agent is built on the Reflexion framework and adopts a combination of models, inference planning, memory, and tools. The models include a bibliometric text generation model, a Transformer model, and a MedBERT model. The inference planning includes reasoning to parse user needs, breaking down execution plans, and reflection. The memory includes summarizing and analyzing the process of generating bibliometric analysis reports. The tools include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
[0013] In one possible implementation, the execution module is configured to call an interface tool to retrieve and download a literature dataset from the corresponding literature database; wherein the literature dataset includes field information such as literature title, author, author affiliation, keywords, abstract, journal, year, language, and subject category; the literature data in the literature dataset is preprocessed to obtain normalized data; wherein the preprocessing includes error screening and correction of the literature data, deletion of duplicate data, and completion of missing fields.
[0014] In one possible implementation, a bibliometric analysis tool is invoked to perform bibliometric analysis on the standardized data, resulting in a bibliometric analysis result chart. The bibliometric analysis result chart and text prompt template are used as input to a bibliometric text generation model, which outputs the text content corresponding to the bibliometric analysis result chart. The text content is then integrated to obtain a text set.
[0015] In one possible implementation, the text set and abstract prompt template are used as input to the bibliometric text generation model, which outputs the abstract text. Based on the bibliometric analysis results charts, text content, and abstract text, a bibliometric analysis report is obtained. The bibliometric analysis report is then dynamically adjusted and optimized based on the evaluation and suggestions provided.
[0016] In one possible implementation, the analysis module is configured to obtain user requirements, which include the search topic, search scope, and reporting requirements; expand the search topic using the Medical Subject Headings (MeSH), the DrugBank database, and the International Classification of Diseases (ICD-10); derive a search strategy based on the search scope and the expanded search topic; and parse the user intent using a deep Transformer model, encoding the user requirement text into a requirement semantic vector.
[0017] In one possible implementation, bibliometric analysis includes at least one of the following: research trend analysis, publication and citation comparison analysis, collaboration network analysis, keyword frequency analysis, keyword co-occurrence analysis, country analysis, institution analysis, author analysis, journal analysis, co-cited literature analysis, and highly cited literature analysis.
[0018] In one possible implementation, the agent also includes: a model training module; The model training module is configured to acquire bibliometric analysis data related to the medical and health field; extract bibliometric analysis result charts and corresponding text content from the bibliometric analysis data, and determine initial data pairs from the extracted results; the bibliometric analysis result charts include bar charts, line charts, relationship diagrams, and data tables; based on the initial data pairs, an active learning strategy is used to obtain a training dataset; the proportions of bar charts, line charts, relationship diagrams, and data tables in the training dataset are consistent; the training dataset and text prompt templates are used as input to the initial model, and supervised fine-tuning is performed through the multimodal pre-trained model mPLUG-Owl2, and LoRA+ is used to fine-tune the initial model to obtain the bibliometric text generation model.
[0019] To improve the convenience and efficiency of literature statistics and correlation analysis, this application provides a method for generating bibliometric analysis reports. First, user needs are obtained, and a retrieval strategy and the corresponding demand semantic vector are derived based on these needs. Then, an execution plan is determined based on the retrieval strategy and the demand semantic vector. Finally, the execution plan is executed to generate the bibliometric analysis report. The agent is built on the Reflexion framework and employs a combination of models, inference planning, memory, and tools. The models include a bibliometric text generation model, a Transformer model, and a MedBERT model. Inference planning includes reasoning to parse user needs, breaking down execution plans, and reflection. Memory includes summarizing and analyzing the process of generating the bibliometric analysis report. Tools include a literature database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and API interfaces. In this application, the combination of models, inference planning, memory, and tools enables one-click report generation, significantly improving learning and work efficiency.
Brief Description of the Drawings
[0020] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0021] Figure 1 is a flowchart of a method for generating a bibliometric analysis report provided in an embodiment of this application;
Figure 2 is a schematic diagram of an intelligent agent provided in an embodiment of this application;
Figure 3 is a schematic diagram of performing literature retrieval and preprocessing provided in an embodiment of this application;
Figure 4 is a schematic diagram of performing bibliometric analysis provided in an embodiment of this application;
Figure 5 is a schematic diagram of performing report generation and optimization provided in an embodiment of this application;
Figure 6 is a flowchart of a training method for a bibliometric text generation model provided in an embodiment of this application;
Figure 7 is a schematic diagram of an intelligent agent for generating bibliometric analysis reports provided in an embodiment of this application.
Detailed Description of the Embodiments
[0022] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present application, and not all embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present application.
[0023] The terms "first" and "second," etc., used in the specification and claims of this application are used to distinguish different objects, not to describe a specific order of objects. For example, "first operation instruction" and "second operation instruction," etc., are used to distinguish different operation instructions, not to describe a specific order of operation instructions.
[0024] In the embodiments of this application, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design that is described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or design. Specifically, the use of the terms "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.
[0025] In the description of the embodiments of this application, unless otherwise stated, "multiple" means two or more, for example, multiple processing units means two or more processing units, multiple elements means two or more elements, etc.
[0026] Figure 1 is a flowchart of a method for generating a bibliometric analysis report provided in an embodiment of this application.
[0027] As shown in Figure 1, the method for generating a bibliometric analysis report includes the following steps:
S1000: Obtain user needs and, based on the user needs, derive a retrieval strategy and a demand semantic vector corresponding to the user needs.
[0028] The user requirements in this application embodiment include search topics, search scope, and report requirements.
[0029] In one possible implementation, the agent can analyze user needs and expand the search topics by leveraging professional medical data such as the Medical Subject Headings (MeSH), DrugBank, and the International Classification of Diseases 10th Revision (ICD-10), forming a set of subject headings, a set of free terms, and a set of classification codes. This provides a pool of professional terms for the search strategy, which is then combined with the search scope to generate the search strategy. In addition, the agent can also use the Transformer deep learning model to deeply analyze user intent and encode user needs text into a semantic vector of needs.
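The topic expansion and strategy generation described in S1000 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `TOY_VOCABULARY` table stands in for real lookups against MeSH, DrugBank, and ICD-10, and the query syntax produced by `build_search_strategy` is a generic placeholder rather than the syntax of any particular literature database. The semantic-vector encoding step is not shown.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical miniature vocabulary; a real agent would query MeSH, DrugBank and ICD-10.
TOY_VOCABULARY = {
    "type 2 diabetes": {
        "subject_headings": ["Diabetes Mellitus, Type 2"],
        "free_terms": ["T2DM", "non-insulin-dependent diabetes"],
        "classification_codes": ["E11"],
    }
}


@dataclass
class TermPool:
    """The three term sets named in the description."""
    subject_headings: List[str] = field(default_factory=list)
    free_terms: List[str] = field(default_factory=list)
    classification_codes: List[str] = field(default_factory=list)


def expand_topic(topic: str) -> TermPool:
    """Expand one search topic into subject headings, free terms and codes."""
    entry = TOY_VOCABULARY.get(topic.lower(), {})
    return TermPool(
        subject_headings=list(entry.get("subject_headings", [])),
        free_terms=[topic] + list(entry.get("free_terms", [])),
        classification_codes=list(entry.get("classification_codes", [])),
    )


def build_search_strategy(pool: TermPool, year_from: int, year_to: int) -> str:
    """Combine the term pool with a search scope (here: a year range only)."""
    terms = pool.subject_headings + pool.free_terms
    topic_clause = " OR ".join(f'"{t}"' for t in terms)
    return f"({topic_clause}) AND year:[{year_from} TO {year_to}]"
```

For example, `build_search_strategy(expand_topic("Type 2 Diabetes"), 2015, 2024)` joins the subject heading and the free terms with `OR` and appends the year range; the classification codes stay in the pool for later filtering.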
[0030] S2000: Determine the execution plan based on the retrieval strategy and the semantic vector of the demand.
[0031] For example, based on the generated retrieval strategy and demand semantic vector, the intelligent agent formulates a specific execution plan, feeds the execution plan back to the client, and makes real-time and dynamic adjustments based on the client's suggestions. The execution plan includes three parts: literature retrieval and preprocessing, bibliometric analysis, and report generation and optimization. The literature retrieval and preprocessing part needs to cover literature retrieval strategies, data collection sources and fields, data download and storage, and data preprocessing. The bibliometric analysis part needs to cover bibliometric analysis dimensions and methods, required tools, and presentation methods of analysis results. The report generation and optimization part needs to cover report organization structure and report presentation methods.
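One possible way to represent the three-part execution plan and a client-driven adjustment is sketched below. The class names, field names, and the `adjust_dimensions` helper are hypothetical; the application does not prescribe a data structure for the plan.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class RetrievalPlan:
    search_strategy: str
    sources: List[str]   # data collection sources
    fields: List[str]    # fields to download
    preprocessing: List[str] = field(
        default_factory=lambda: ["correct_errors", "deduplicate", "fill_missing"]
    )


@dataclass(frozen=True)
class AnalysisPlan:
    dimensions: List[str]    # e.g. research trend, collaboration network
    tools: List[str]
    presentation: List[str]  # e.g. line chart, data table


@dataclass(frozen=True)
class ReportPlan:
    sections: List[str]
    presentation: str = "text_with_charts"


@dataclass(frozen=True)
class ExecutionPlan:
    retrieval: RetrievalPlan
    analysis: AnalysisPlan
    report: ReportPlan


def adjust_dimensions(plan: ExecutionPlan, add=(), remove=()) -> ExecutionPlan:
    """Return a new plan whose analysis dimensions reflect the client's suggestions."""
    dims = [d for d in plan.analysis.dimensions if d not in remove]
    dims += [d for d in add if d not in dims]
    return replace(plan, analysis=replace(plan.analysis, dimensions=dims))
```

Returning a new plan instead of mutating the old one keeps each revision available, which fits the later step of recording the task history.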
[0032] S3000: Execute the above execution plan to obtain a bibliometric analysis report.
[0033] The intelligent agent is built on the Reflexion framework and adopts a combination of models, inference planning, memory, and tools. As shown in Figure 2, the models in the intelligent agent include a bibliometric text generation model, a Transformer model, and a MedBERT model. The reasoning and planning in the intelligent agent includes reasoning to parse user needs, breaking down execution plans, and reflection. The memory in the intelligent agent includes summarizing and analyzing the process of generating bibliometric analysis reports; it comprises long-term memory, i.e., the lessons learned and experience summarized while generating historical reports, and short-term memory, i.e., the user needs and feedback in the current report generation process. The tools in the intelligent agent include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
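A Reflexion-style agent typically repeats an act, evaluate, reflect cycle and carries the reflections forward as memory. The sketch below shows that control flow only. The function names, the way short-term notes are merged into long-term memory, and the toy callables are all assumptions for illustration, not the structure disclosed in this application.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    task: str,
    act: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    reflect: Callable[[str, str], str],
    long_term_memory: List[str],
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Act, evaluate, and reflect until the output passes or trials run out."""
    short_term: List[str] = []  # reflections from the current task only
    output = ""
    for _ in range(max_trials):
        output = act(task, long_term_memory + short_term)
        passed, feedback = evaluate(output)
        if passed:
            break
        short_term.append(reflect(output, feedback))
    # Assumption: lessons from this task become long-term memory for later tasks.
    long_term_memory.extend(short_term)
    return output, short_term


# Toy stand-ins for the model, the evaluator and the reflection step.
def toy_act(task: str, notes: List[str]) -> str:
    return f"{task} | notes applied: {len(notes)}"


def toy_evaluate(output: str) -> Tuple[bool, str]:
    passed = output.endswith("1")
    return passed, "" if passed else "missing keyword analysis"


def toy_reflect(output: str, feedback: str) -> str:
    return f"next time, address: {feedback}"
```

With the toy callables, the first trial fails, one reflection is stored, and the second trial passes because the actor now sees that note.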
[0034] It should be noted that, in the field of computer science and artificial intelligence, the intelligent agent in the embodiments of this application is an autonomous entity (software or hardware) that can achieve its goals by perceiving the environment, making autonomous decisions and performing actions.
[0035] In this embodiment, by constructing an intelligent agent with a pattern of "model + reasoning planning + memory + tools", one-click report generation can be achieved, greatly improving learning and work efficiency; by integrating and using various medical field dictionaries, ontology, knowledge bases, etc., the capabilities of the intelligent agent can be effectively enhanced, making the report more interpretable.
[0036] The execution plan mentioned in the previous examples includes three parts: literature retrieval and preprocessing, bibliometric analysis, and report generation and optimization. The execution of each part will be described below.
[0037] For literature retrieval and preprocessing, the following steps (S3110-S3120) are performed: S3110: Call the interface tool to retrieve and download the literature dataset from the corresponding literature database; the literature dataset includes field information such as literature title, author, author affiliation, keywords, abstract, journal, year, language and subject category.
[0038] S3120: Preprocess the data of each document in the document dataset to obtain normalized data; the preprocessing includes error screening and correction of the document data, deletion of duplicate data and completion of missing fields.
[0039] For missing fields, in this embodiment of the application, the MedBERT model can be used, together with professional medical thesauri such as MeSH and ICD-10, to extract fields such as literature keywords and subject categories and thereby fill in the missing fields.
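The three preprocessing operations in S3120 can be sketched as follows. This is a simplified illustration under stated assumptions: duplicates are detected by a normalized title, the only error check is a four-digit year, records without a title are dropped, and `fill_missing` is a hypothetical callback standing in for the MedBERT-based field extraction.

```python
import re
from typing import Callable, Dict, List, Optional


def normalize_title(title: str) -> str:
    """Lower-case the title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def preprocess(
    records: List[Dict],
    fill_missing: Optional[Callable[[Dict, str], str]] = None,
) -> List[Dict]:
    seen = set()
    cleaned = []
    for raw in records:
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
        # Error screening: keep the year only if it is exactly four digits.
        year = str(rec.get("year", ""))
        rec["year"] = int(year) if re.fullmatch(r"\d{4}", year) else None
        # Duplicate removal keyed on the normalized title.
        key = normalize_title(rec.get("title") or "")
        if not key or key in seen:
            continue
        seen.add(key)
        # Completion of missing fields via the supplied (hypothetical) extractor.
        if fill_missing is not None:
            for name in ("keywords", "subject_category"):
                if not rec.get(name):
                    rec[name] = fill_missing(rec, name)
        cleaned.append(rec)
    return cleaned
```

In practice the duplicate key would more likely combine a DOI or other identifier with the title; the title-only key keeps the sketch short.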
[0040] For bibliometric analysis, the following steps S3210-S3230 are performed:
S3210: Use the bibliometric analysis tool to perform bibliometric analysis on the normalized data and obtain the bibliometric analysis result charts.
[0041] The bibliometric analysis tools include tools such as CiteSpace, VOSviewer, HistCite, Ucinet, and Bibexcel; the bibliometric analysis includes research trend analysis, publication and citation comparison analysis, collaboration network analysis (national collaboration, institutional collaboration, scholar collaboration), keyword frequency analysis, keyword co-occurrence analysis, country analysis, institution analysis, author analysis, journal analysis, co-cited literature analysis, and highly cited literature analysis; the bibliometric analysis results charts include bar charts, line charts, and data tables.
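Three of the listed analyses (annual publication trend, keyword frequency, and keyword co-occurrence) reduce to simple counting. The sketch below computes them directly with the Python standard library as a stand-in for the named tools; it does not use or imitate the interfaces of CiteSpace, VOSviewer, or the others, and the record layout is assumed.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def keyword_stats(records: List[Dict]) -> Tuple[Counter, Counter]:
    """Return (keyword frequency, keyword-pair co-occurrence) over all records."""
    freq: Counter = Counter()
    cooc: Counter = Counter()
    for rec in records:
        # Deduplicate and sort so each unordered pair has one canonical form.
        kws = sorted({k.strip().lower() for k in rec.get("keywords", []) if k.strip()})
        freq.update(kws)
        cooc.update(combinations(kws, 2))
    return freq, cooc


def publications_per_year(records: List[Dict]) -> Dict[int, int]:
    """Count records per publication year, in ascending year order."""
    counts = Counter(r["year"] for r in records if r.get("year"))
    return dict(sorted(counts.items()))
```

The frequency counter feeds a bar chart or data table, the co-occurrence counter gives the edge weights of a keyword network, and the yearly counts give the points of a trend line.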
[0042] S3220: Use the bibliometric analysis result charts and text prompt templates as input to the bibliometric text generation model, so that the bibliometric text generation model outputs the text content corresponding to the bibliometric analysis result charts.
[0043] For example, based on the obtained bibliometric analysis results charts, a pre-designed prompt template is selected, and the prompt template and the bibliometric analysis results charts are input into the constructed bibliometric text generation model to generate the text content corresponding to the chart results.
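Selecting a pre-designed prompt template by chart type and filling it in could look like the sketch below. The template wording, the slot names, and the `build_text_prompt` helper are invented for illustration; the application does not disclose its templates, and the call to the text generation model itself is omitted.

```python
from string import Template

# Hypothetical templates, one per chart type.
TEXT_PROMPT_TEMPLATES = {
    "line_chart": Template(
        "The attached line chart is titled '$title'. "
        "Describe how $metric changes over $x_axis and name the peak."
    ),
    "bar_chart": Template(
        "The attached bar chart is titled '$title'. "
        "Compare $metric across $x_axis and list the top three."
    ),
}


def build_text_prompt(chart_type: str, **slots: str) -> str:
    """Pick the template for the chart type and fill its slots.

    Raises KeyError if the chart type is unknown or a slot is missing.
    """
    return TEXT_PROMPT_TEMPLATES[chart_type].substitute(**slots)
```

The resulting string would be passed to the model together with the chart image.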
[0044] S3230: Integrate text content to obtain a text set.
[0045] It should be noted that, prior to S3210, this embodiment of the application can also perform data format conversion. The intelligent agent, based on the analysis dimensions (annual statistics and cooperative relationships, etc.) and methods in the execution plan, sequentially determines the analysis tool and its requirements for imported data, identifies necessary fields, performs field mapping for literature data of different formats, realizes the conversion of literature data formats, and obtains the data to be analyzed.
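The field mapping step can be pictured as a per-source lookup table plus a check that the fields required by the chosen analysis dimension are present. In this sketch the source names, the source-side field names, and the required-field lists are placeholders, not the actual export formats of any database or the import requirements of any analysis tool.

```python
from typing import Dict, List

# Hypothetical source-specific field names mapped to one unified schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "source_a": {"TI": "title", "AU": "authors", "PY": "year", "DE": "keywords"},
    "source_b": {
        "ArticleTitle": "title",
        "AuthorList": "authors",
        "PubYear": "year",
        "KeywordList": "keywords",
    },
}

# Hypothetical fields each analysis dimension needs.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "annual_statistics": ["title", "year"],
    "collaboration": ["title", "authors"],
}


def convert(record: Dict, source: str, dimension: str) -> Dict:
    """Map one record to the unified schema and verify the required fields."""
    mapping = FIELD_MAPS[source]
    unified = {mapping[k]: v for k, v in record.items() if k in mapping}
    missing = [f for f in REQUIRED_FIELDS[dimension] if f not in unified]
    if missing:
        raise ValueError(f"missing fields for {dimension}: {missing}")
    return unified
```

Unmapped source fields are dropped silently, while a missing required field stops the conversion so the gap can be handled before analysis.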
[0046] For report generation and optimization, the following steps S3310-S3330 are performed:
S3310: Use the text set and abstract prompt template as input to the bibliometric text generation model, so that the bibliometric text generation model outputs abstract text.
[0047] For example, in this embodiment of the application, a pre-designed summary prompt template and text set can be input into the constructed bibliometric text generation model to generate report summary text.
[0048] S3320: Based on the bibliometric analysis result charts, the text content, and the abstract text, obtain the bibliometric analysis report.
[0049] For example, in this embodiment of the application, a report structure can be organized based on the bibliometric analysis results charts and corresponding text content and abstract text to generate a bibliometric analysis report.
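Organizing the charts, their text, and the abstract into a report can be sketched as plain-text assembly. The section layout, the figure placeholder format, and the `assemble_report` name are assumptions; a real implementation would more likely render to a document format with embedded images.

```python
from typing import List, Tuple


def assemble_report(
    title: str,
    abstract: str,
    sections: List[Tuple[str, str, str]],
) -> str:
    """Build a plain-text report.

    Each section is (heading, chart_path, text_content).
    """
    lines = [title, "", "Abstract", abstract, ""]
    for i, (heading, chart_path, text) in enumerate(sections, start=1):
        lines += [f"{i}. {heading}", f"[Figure {i}: {chart_path}]", text, ""]
    return "\n".join(lines).rstrip() + "\n"
```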
[0050] S3330: Based on the evaluation of and recommendations for the bibliometric analysis report, dynamically adjust and optimize the bibliometric analysis report.
[0051] For example, in this embodiment, the generated report can be fed back to the client, which can then evaluate and provide suggestions. The intelligent agent can then dynamically adjust and optimize the report. Furthermore, the historical process of this task is recorded, including task objectives, user suggestions, process instructions, and adjustment plans, to facilitate the intelligent agent's self-learning and optimization, providing more effective execution plans and reports.
[0052] The bibliometric text generation model was used in the previous embodiments. The training method for this model will be described below. The training method for the bibliometric text generation model includes the following steps S4100-S4400:
S4100: Obtain bibliometric analysis data related to the medical and health fields.
[0053] The bibliometric analysis data includes bibliometric analysis reports and papers in the medical and health fields. This application does not specifically limit the data in this embodiment.
[0054] S4200: Extract bibliometric analysis result charts and corresponding text content from the bibliometric analysis data, and determine the initial data pairs from the extracted results; wherein, the bibliometric analysis result charts include bar charts, line charts, relationship diagrams and data tables.
[0055] For example, in the embodiments of this application, OCR technology can be used to extract the result charts and corresponding text content related to bibliometric analysis, and a portion of the data can be manually and randomly selected from the extracted results to verify the correctness of the extraction, thereby obtaining initial data pairs.
[0056] S4300: Based on the initial data pairs, an active learning strategy is used to obtain the training dataset; wherein, the proportions of bar charts, line charts, relationship graphs and data tables in the training dataset are consistent.
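The application does not state which active learning strategy is used or how the proportions are kept consistent. One common reading is uncertainty sampling with an equal quota per chart type, sketched below. The `uncertainty` score, the chart type labels, and the equal-quota interpretation are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List

CHART_TYPES = ("bar", "line", "relationship", "table")


def select_balanced(candidates: List[Dict], per_type: int) -> List[Dict]:
    """Pick the most uncertain candidates, the same number for every chart type.

    Each candidate is a dict with 'chart_type' and a model 'uncertainty' score.
    """
    by_type = defaultdict(list)
    for cand in candidates:
        by_type[cand["chart_type"]].append(cand)
    selected = []
    for chart_type in CHART_TYPES:
        ranked = sorted(by_type[chart_type], key=lambda c: c["uncertainty"], reverse=True)
        if len(ranked) < per_type:
            raise ValueError(f"not enough '{chart_type}' candidates for the quota")
        selected.extend(ranked[:per_type])
    return selected
```

In an active learning round, the selected chart and text pairs would be sent for manual verification and then added to the training dataset.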
[0057] S4400: The training dataset and text prompt templates are used as input to the initial model. The model is then fine-tuned in a supervised manner using the multimodal pre-trained model mPLUG-Owl2. The initial model is then fine-tuned using LoRA+ to train the bibliometric text generation model.
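For readers unfamiliar with the fine-tuning technique named in S4400, the general form of a low-rank adaptation update is given below. This is background on LoRA and LoRA+ as commonly formulated, not a formula taken from this application; the rank $r$, scaling factor $\alpha$, and learning-rate ratio $\lambda$ are not specified in the source.

```latex
% LoRA: the pre-trained weight W_0 is frozen; only the low-rank factors A and B are trained.
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)

% LoRA+: the two factors are trained with different learning rates.
\eta_B = \lambda\, \eta_A, \qquad \lambda > 1
```

Because only $A$ and $B$ are updated, the number of trainable parameters per adapted layer is $r(d + k)$ instead of $dk$.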
[0058] In addition, in this embodiment of the application, user feedback can be collected and the model parameters can be continuously optimized by combining the text generation task of the bibliometric analysis report.
[0059] In this embodiment, the text generation model is trained based on a multimodal large model, which can better realize deep reasoning and exploratory learning, thereby generating reports efficiently and accurately.
[0060] This application provides an intelligent agent for generating bibliometric analysis reports. The intelligent agent includes: an analysis module 1000, a determination module 2000, and an execution module 3000. The analysis module 1000 is configured to obtain user needs and, based on the user needs, to derive a retrieval strategy and a demand semantic vector corresponding to the user needs. The determination module 2000 is configured to determine the execution plan based on the retrieval strategy and the demand semantic vector. The execution module 3000 is configured to execute the execution plan and obtain a bibliometric analysis report. The intelligent agent is built on the Reflexion framework and adopts a combination of models, inference planning, memory, and tools. The models include a bibliometric text generation model, a Transformer model, and a MedBERT model. The inference planning includes reasoning to parse user needs, breaking down execution plans, and reflection. The memory includes summarizing and analyzing the process of generating bibliometric analysis reports. The tools include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
[0061] In this embodiment of the application, a combination of models, reasoning planning, memory, and tools can be used to generate reports with one click, greatly improving learning and work efficiency.
[0062] In one possible implementation, the execution module is configured to call an interface tool to retrieve and download a literature dataset from the corresponding literature database; wherein the literature dataset includes field information such as literature title, author, author affiliation, keywords, abstract, journal, year, language, and subject category; the literature data in the literature dataset is preprocessed to obtain normalized data; wherein the preprocessing includes error screening and correction of the literature data, deletion of duplicate data, and completion of missing fields.
[0063] In one possible implementation, a bibliometric analysis tool is invoked to perform bibliometric analysis on the standardized data, resulting in a bibliometric analysis result chart. The bibliometric analysis result chart and text prompt template are used as input to a bibliometric text generation model, which outputs the text content corresponding to the bibliometric analysis result chart. The text content is then integrated to obtain a text set.
[0064] In one possible implementation, the text set and abstract prompt template are used as input to the bibliometric text generation model, which outputs the abstract text. Based on the bibliometric analysis results charts, text content, and abstract text, a bibliometric analysis report is obtained. The bibliometric analysis report is then dynamically adjusted and optimized based on the evaluation and suggestions provided.
[0065] In one possible implementation, the analysis module is configured to obtain user requirements, which include the search topic, search scope, and reporting requirements; expand the search topic using the Medical Subject Headings (MeSH), the DrugBank database, and the International Classification of Diseases (ICD-10); derive a search strategy based on the search scope and the expanded search topic; and parse the user intent using a deep Transformer model, encoding the user requirement text into a requirement semantic vector.
[0066] In one possible implementation, bibliometric analysis includes at least one of the following: research trend analysis, publication and citation comparison analysis, collaboration network analysis, keyword frequency analysis, keyword co-occurrence analysis, country analysis, institution analysis, author analysis, journal analysis, co-cited literature analysis, and highly cited literature analysis.
[0067] In one possible implementation, the agent also includes: a model training module; The model training module is configured to acquire bibliometric analysis data related to the medical and health field; extract bibliometric analysis result charts and corresponding text content from the bibliometric analysis data, and determine initial data pairs from the extracted results; the bibliometric analysis result charts include bar charts, line charts, relationship diagrams, and data tables; based on the initial data pairs, an active learning strategy is used to obtain a training dataset; the proportions of bar charts, line charts, relationship diagrams, and data tables in the training dataset are consistent; the training dataset and text prompt templates are used as input to the initial model, and supervised fine-tuning is performed through the multimodal pre-trained model mPLUG-Owl2, and LoRA+ is used to fine-tune the initial model to obtain the bibliometric text generation model.
[0068] It should be noted that the embodiments in this specification are described in a progressive manner, and the same or similar parts of the embodiments may be understood by reference to one another. Each embodiment focuses on its differences from the other embodiments. In particular, because the device and system embodiments are basically similar to the method embodiments, they are described relatively briefly, and the relevant parts may be understood by reference to the description of the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components indicated as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those skilled in the art can understand and implement this without creative effort.
[0069] The above description is merely one specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method of generating a bibliometric analysis report, characterized in that the method is applied to an intelligent agent and includes: obtaining user needs and, based on the user needs, obtaining a retrieval strategy and a demand semantic vector corresponding to the user needs; determining an execution plan based on the retrieval strategy and the demand semantic vector; and executing the execution plan to obtain a bibliometric analysis report; wherein the intelligent agent is built on the Reflexion framework and combines models, inference planning, memory, and tools; the models include a bibliometric text generation model, a Transformer model, and a MedBERT model; the inference planning includes reasoning about and parsing the user needs, breaking down the execution plan, and reflection; the memory includes summarizing and analyzing the process of generating the bibliometric analysis report; and the tools include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
2. The method of claim 1, wherein the execution plan includes literature retrieval and preprocessing, and executing the literature retrieval and preprocessing includes the following steps: invoking the interface tool to retrieve and download a literature dataset from the corresponding literature database, wherein the literature dataset includes field information such as literature title, author, author affiliation, keywords, abstract, journal, year, language, and subject category; and preprocessing the literature data in the literature dataset to obtain normalized data, wherein the preprocessing includes error screening and correction, deletion of duplicate data, and completion of missing fields.
3. The method of claim 2, wherein the execution plan includes bibliometric analysis, and performing the bibliometric analysis includes the following steps: performing bibliometric analysis on the normalized data using a bibliometric analysis tool to obtain bibliometric analysis result charts; using the bibliometric analysis result charts and text prompt templates as inputs to the bibliometric text generation model, so that the bibliometric text generation model outputs the text content corresponding to the bibliometric analysis result charts; and integrating the text content to obtain a text set.
4. The method of claim 3, wherein the execution plan includes report generation and optimization, and executing the report generation and optimization includes the following steps: using the text set and an abstract prompt template as input to the bibliometric text generation model, so that the bibliometric text generation model outputs abstract text; obtaining a bibliometric analysis report based on the bibliometric analysis result charts, the text content, and the abstract text; and dynamically adjusting and optimizing the bibliometric analysis report based on the evaluation and recommendations in the bibliometric analysis report.
5. The method of claim 1, wherein obtaining the user needs and obtaining the retrieval strategy and the demand semantic vector corresponding to the user needs includes: obtaining the user needs, wherein the user needs include a search topic, a search scope, and report requirements; expanding the search topic using the Medical Subject Headings (MeSH), the DrugBank database, and the International Classification of Diseases (ICD-10); deriving the retrieval strategy based on the search scope and the expanded search topic; and parsing the user intent using the deep Transformer model and encoding the text of the user needs into the demand semantic vector.
6. The method of claim 3, wherein the bibliometric analysis includes at least one of the following: research trend analysis, publication and citation comparison analysis, collaboration network analysis, keyword frequency analysis, keyword co-occurrence analysis, country analysis, institution analysis, author analysis, journal analysis, co-cited literature analysis, and highly cited literature analysis.
7. The method according to any one of claims 1 to 6, characterized in that the training method for the bibliometric text generation model is as follows: obtaining bibliometric analysis data related to the medical and health field; extracting bibliometric analysis result charts and the corresponding text content from the bibliometric analysis data, and determining initial data pairs from the extracted results, wherein the bibliometric analysis result charts include bar charts, line charts, relationship diagrams, and data tables; obtaining a training dataset from the initial data pairs using an active learning strategy, wherein the proportions of the bar charts, the line charts, the relationship diagrams, and the data tables in the training dataset are consistent; and using the training dataset and text prompt templates as input to the initial model, performing supervised fine-tuning through the multimodal pre-trained model mPLUG-Owl2, and fine-tuning the initial model using LoRA+ to obtain the bibliometric text generation model.
8. An intelligent agent for generating a bibliometric analysis report, characterized in that the intelligent agent includes an analysis module, a determination module, and an execution module; the analysis module is configured to obtain user needs and, based on the user needs, obtain a retrieval strategy and a demand semantic vector corresponding to the user needs; the determination module is configured to determine an execution plan based on the retrieval strategy and the demand semantic vector; the execution module is configured to execute the execution plan and obtain a bibliometric analysis report; the intelligent agent is built on the Reflexion framework and combines models, inference planning, memory, and tools; the models include a bibliometric text generation model, a Transformer model, and a MedBERT model; the inference planning includes reasoning about and parsing the user needs, breaking down the execution plan, and reflection; the memory includes summarizing and analyzing the process of generating the bibliometric analysis report; and the tools include a bibliometric database, a medical dictionary, a medical ontology, a knowledge base, bibliometric analysis tools, and an API interface.
9. A computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for generating a bibliometric analysis report as described in any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed on a terminal device, cause the terminal device to perform the method for generating a bibliometric analysis report as described in any one of claims 1-7.