A method and system for concentrator log analysis

By combining a client-server architecture and an intelligent agent execution environment with a large language model, the problems of low efficiency and poor accuracy in concentrator log analysis are solved, achieving efficient and accurate log analysis and fault diagnosis, and meeting the needs of large-scale operation and maintenance.

CN122309544A · Pending · Publication Date: 2026-06-30 · QINGDAO ITECHENE TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
QINGDAO ITECHENE TECH CO LTD
Filing Date
2026-03-30
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing methods for concentrator log analysis are inefficient, inaccurate, costly, and lack flexibility, failing to meet the needs of large-scale operation and maintenance and rapid troubleshooting.

Method used

Adopting a client-server architecture, the system achieves automated and intelligent log analysis by combining log file fragmented uploading, protocol type identification and parsing, structured log data storage, intelligent agent execution environment, and large language model.

Benefits of technology

It improves the efficiency and accuracy of log analysis, reduces operation and maintenance costs, meets the needs of rapid troubleshooting and customization, and ensures the system's high throughput and response stability.

✦ Generated by Eureka AI based on patent content.

Figure CN122309544A_ABST

Abstract

This invention relates to the field of smart grid concentrator operation and maintenance, specifically providing a concentrator log analysis method and system to address the problems of low efficiency, poor adaptability, high cost, and insufficient flexibility in existing log analysis methods. To this end, the analysis method of this invention includes: a client uploading a concentrator log file to a server; the server processing the log file to obtain structured log data and storing the structured log data in a database; the client sending a user-submitted question to the server; the server invoking an intelligent agent based on the question: performing a database query and a knowledge base query to obtain the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question, respectively; and, based on the question, the corresponding structured log data, and the historical fault diagnosis experience, obtaining the answer to the question and returning it to the client. This achieves efficient analysis of concentrator logs.

Description

Technical Field

[0001] This invention relates to the field of smart grid concentrator operation and maintenance, and specifically provides a concentrator log analysis method and system.

Background Technology

[0002] In the field of smart grid concentrator operation and maintenance, with the continuous growth of overseas concentrator supply, on-site meter reading failures are frequent. Analysis of communication message logs between the concentrator and the electricity meter has become a core aspect of fault location and troubleshooting, directly impacting operation and maintenance efficiency, fault repair quality, and customer satisfaction. Currently, there are three main technical solutions for concentrator log analysis in the industry, but all have significant shortcomings and cannot meet the needs of large-scale overseas operation and maintenance and rapid troubleshooting.

[0003] The current mainstream solution is manual line-by-line parsing. Operation and maintenance personnel need to export the raw logs from the concentrator and manually complete log retrieval, protocol decoding, statistical analysis, and root cause identification using general text editing and spreadsheet tools. The entire process relies on manual operation and personal experience, which is not only extremely inefficient and time-consuming, with an average of 2 person-days for a single analysis, resulting in a long problem localization cycle, but also prone to omissions and misjudgments due to the large volume of logs and complex protocols. Furthermore, experience is difficult to reuse, and labor costs continue to rise.

[0004] Some enterprises use rule-based log filtering tools that match abnormal logs using preset fixed regular expressions. However, these tools lack the ability to analyze the context of communication links, cannot identify hidden faults across time periods, and have rigid rule bases that are difficult to adapt to the customized needs of concentrator protocols in different countries overseas, resulting in extremely poor scalability. A few attempts have used traditional machine learning classification methods, which can identify known fault types, but have weak generalization ability to unseen anomalies, require a large amount of manually labeled data, have high training costs, and cannot respond to user-defined natural language queries, lacking flexibility.

[0005] In summary, current technologies fail to achieve end-to-end automation of log analysis, exhibiting problems such as low efficiency, poor accuracy, high cost, weak scalability, and insufficient flexibility. They cannot simultaneously address the timeliness and customization needs of large-scale operations and maintenance. Therefore, there is an urgent need for an intelligent solution that can automatically understand communication protocol context, support dynamic problem reasoning, and flexibly adapt to multiple scenarios. This solution would overcome existing technological bottlenecks, improve the efficiency and accuracy of concentrator log analysis, and reduce operational costs.

Summary of the Invention

[0006] The present invention aims to solve the above-mentioned technical problems, namely, the low efficiency, poor adaptability, high cost and lack of flexibility of existing log analysis methods.

[0007] In a first aspect, the present invention provides a concentrator log analysis method, comprising: a client uploading a concentrator log file to a server;

[0008] The server processes the log file to obtain structured log data and stores the structured log data in the database;

[0009] The client sends the user's question to the server;

[0010] The server invokes the intelligent agent based on the question:

[0011] Perform database queries and knowledge base queries to obtain the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question, respectively;

[0012] Based on the question, the corresponding structured log data, and the historical fault diagnosis experience, the answer to the question is obtained and returned to the client.

[0013] In one technical solution of the above method, the server processes the log file to obtain structured log data, including:

[0014] Identify the protocol type of the log file;

[0015] Select the corresponding parsing strategy based on the recognition results;

[0016] The log file is parsed line by line using the parsing strategy to obtain multiple JSON data entries. After each line is parsed, the memory used is reclaimed.

[0017] In one technical solution of the above method, the step of selecting the corresponding parsing strategy based on the recognition result includes:

[0018] If the first byte of the log file is 0x68 and the subsequent bytes conform to the DL/T 698.45 frame structure, then the DLT698Parser class is loaded;

[0019] If the log file matches the IEC 62056 protocol features, then the HDCLParser class is loaded.

[0020] In one technical solution of the above method, the step of parsing the log file line by line to obtain multiple JSON data based on the parsing strategy includes:

[0021] Extract timestamps, message frames, and communication IDs based on the regular expression rules in the parsing strategy;

[0022] The validity of the message frame is verified;

[0023] If the verification passes, feature calculation is performed on all message frames under the same communication ID to obtain derived features;

[0024] The JSON data is obtained based on the derived features.

[0025] In one technical solution of the above method, verifying the frame validity of the message frame includes:

[0026] The message frame is subjected to format verification, address verification, and timing verification.

[0027] In one technical solution of the above method, storing the structured log data into a database includes:

[0028] Store the JSON data in the cache;

[0029] In response to the fulfillment of batch processing trigger conditions, the JSON data already stored in the cache is stored in the database.

[0030] In one technical solution of the above method, the batch processing trigger condition is that the number of JSON data stored in the cache reaches a preset quantity threshold or exceeds the preset duration set in the time sequence verification.

[0031] In one technical solution of the above method, the client uploads the concentrator log file to the server, including:

[0032] If the size of the log file to be uploaded exceeds a preset size threshold, the log file is split into multiple fragments, each fragment having a file ID, fragment number, and total number of fragments.

[0033] In one technical solution of the above method, before the server processes the log file to obtain structured log data, the method further includes:

[0034] In response to the client sending a fragment upload completion request, the server performs an integrity check on all received fragments.

[0035] If the verification passes, the fragments are merged to obtain the log file.

[0036] In one technical solution of the above method, before the server invokes the intelligent agent based on the question, the method further includes:

[0037] The question is segmented into words and entity recognition is performed to obtain a structured query context object.

[0038] In one technical solution of the above method, the execution of database queries and knowledge base queries to obtain structured log data corresponding to the question and historical fault diagnosis experience corresponding to the question, respectively, includes:

[0039] Generate the execution environment for the intelligent agent;

[0040] Load the corresponding workflow template based on the query context;

[0041] Execute the corresponding workflow based on the workflow template to obtain the structured log data corresponding to the problem;

[0042] Based on the query context, query the knowledge base for the N most similar historical fault diagnosis experiences corresponding to the question, where N is a natural number greater than or equal to 1.

[0043] In one technical solution of the above method, the generated intelligent agent execution environment includes:

[0044] Load the user's historical dialogue memory to construct a multi-turn dialogue context;

[0045] The database query, chart generation, and report download functions are encapsulated into independent utility classes for use by large language models;

[0046] Set up a workflow template, wherein the workflow template defines the node execution order and conditional branching logic;

[0047] Construct a reasoning and action framework for the large language model.

[0048] In one technical solution of the above method, the step of obtaining the answer to the question and returning it to the client based on the question, the corresponding structured log data, and the historical fault diagnosis experience includes:

[0049] Based on the question, the corresponding structured log data, and the historical fault diagnosis experience, prompt words are constructed and input into the large language model to obtain the answer to the question.

[0050] In one technical solution of the above method, during the training phase of the large language model, the large language model is fine-tuned using LoRA based on knowledge of the power domain.

[0051] In a second aspect, the present invention provides a concentrator log analysis system, comprising:

[0052] A client and a server, wherein:

[0053] The client uploads the concentrator log file to the server.

[0054] The server processes the log file to obtain structured log data and stores the structured log data in the database;

[0055] The client sends the user's question to the server;

[0056] The server invokes the intelligent agent based on the question:

[0057] Perform database queries and knowledge base queries to obtain the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question, respectively;

[0058] Based on the question, the corresponding structured log data, and the historical fault diagnosis experience, the answer to the question is obtained and returned to the client.

[0059] By adopting the above technical solution, the present invention decouples the two pipelines, log upload and question analysis, so that log file processing and question answering execute independently and share only the database as a unified data source. This architecture avoids mutual interference between the two business processes: heavy operations during log upload, such as log file fragmentation and batched database writes, do not affect the real-time response of question answering. This ensures both high throughput for log data processing and low latency for question answering, improving overall system efficiency and response stability. The fragmented upload mechanism, combined with server-side integrity verification and file merging, guarantees the integrity and stability of large log file uploads and improves the processing efficiency of the upload stage. Protocol type identification with a corresponding parsing strategy, combined with line-by-line parsing and immediate memory reclamation, improves the relevance and accuracy of log parsing while reducing memory consumption, which suits the parsing of large log files. JSON data is obtained through three-layer frame validity verification and derived-feature calculation, ensuring the reliability of the structured log data. Caching JSON data temporarily and writing it to the database in bulk when a batch trigger condition is met reduces the frequency of database interactions and improves storage efficiency, while balancing the timeliness of log data storage against the system's high-throughput processing requirements.
A structured query context is obtained through question segmentation and entity recognition, and an intelligent agent execution environment containing historical dialogue memory, utility classes, and workflow templates is constructed. Combined with a large language model fine-tuned using LoRA, the system accurately retrieves structured log data and historical fault diagnosis experience and outputs targeted answers efficiently, improving the accuracy of question answering and meeting the practical needs of rapid on-site troubleshooting.

Description of the Drawings

[0060] The preferred embodiments of the present invention are described below with reference to the accompanying drawings, in which:

[0061] Figure 1 is a schematic diagram of the concentrator log analysis system of the present invention;

[0062] Figure 2 is a flowchart of the concentrator log analysis method of the present invention;

[0063] Figure 3 is a schematic diagram of the specific process of step S2 in the concentrator log analysis method of the present invention;

[0064] Figure 4 is a schematic diagram of the specific process of step S23 in the concentrator log analysis method of the present invention;

[0065] Figure 5 is a schematic diagram of the specific process of step S4 in the concentrator log analysis method of the present invention;

[0066] Figure 6 is a schematic diagram of the specific process of step S41 in the concentrator log analysis method of the present invention.

Detailed Implementation

[0067] Preferred embodiments of the present invention will now be described with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments are merely illustrative of the technical principles of the present invention and are not intended to limit the scope of protection of the present invention. Those skilled in the art can make adjustments as needed to adapt to specific application scenarios.

[0068] It should be noted that in the description of this invention, terms such as "bottom" that indicate direction or positional relationship are based on the direction or positional relationship shown in the accompanying drawings. This is merely for ease of description and does not indicate or imply that the relevant device or element must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, it should not be construed as a limitation of this invention. Furthermore, ordinal numbers such as "first," "second," and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.

[0069] Furthermore, it should be noted that although the various steps of the control method of the present invention are described in a specific order in the description of the present invention, these orders are not restrictive. Without departing from the basic principles of the present invention, those skilled in the art can perform the steps in different orders.

[0070] First, refer to Figure 1, which is a schematic diagram of the structure of a concentrator log analysis system according to the present invention. The system uses a classic B/S (browser/server) architecture comprising a client and a server. In some embodiments, the client runs a browser through which the user inputs concentrator log files and questions. The client and server communicate via an API port. The server contains an agent and a database. The input log files are stored in the database, and the agent can access the log files in the database to analyze and answer the questions. Specifically, the client uploads the concentrator log files to the server;

[0071] The server processes the log file to obtain structured log data and stores the structured log data in the database;

[0072] The client sends the user's question to the server;

[0073] The server invokes the intelligent agent based on the question:

[0074] Perform database queries and knowledge base queries to obtain the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question, respectively;

[0075] Based on the question, the corresponding structured log data, and the historical fault diagnosis experience, the answer to the question is obtained and returned to the client.

[0076] Another aspect of this application provides a concentrator log analysis method. In the aforementioned analysis system, concentrator logs are analyzed through information transmission and processing between the client and the server. As shown in Figure 2, the method includes steps S1-S4, specifically:

[0077] Step S1: The client uploads the concentrator log file to the server;

[0078] In some embodiments, step S1 first determines the size of the log file. If the size is below a threshold, the file is uploaded normally; if it exceeds the threshold, it is uploaded in chunks. In this embodiment, the threshold is set to 10 MB. Chunked upload splits the log file into multiple chunks, each carrying a file ID, a chunk number, and the total number of chunks. In a preferred embodiment, the front-end JavaScript uses the `Blob.slice()` method to cut the file into 5 MB chunks and sends them chunk by chunk as POST requests in `multipart/form-data` format to the `/api/upload` interface of the Flask API. Each chunk carries three key identifiers:

[0079] file_id (Unique file identifier, generated by UUID v4)

[0080] chunk_id (slice sequence number, starting from 0)

[0081] total_chunks (Total number of chunks)

[0082] After receiving a chunk, the Flask API sanitizes the filename with `werkzeug.utils.secure_filename()` and temporarily stores the chunk in the `/tmp/{file_id}/` directory under the filename `chunk_{chunk_id}`. This design supports resuming interrupted uploads and concurrent uploads; if the upload of a chunk fails, only that chunk needs to be re-uploaded, not the entire file.
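The chunking rule of step S1 can be sketched in Python as follows. This is a minimal illustration, not the front-end JavaScript of the embodiment: `plan_upload` and `chunk_path` are hypothetical helper names, and treating a file of exactly the threshold size as a normal upload is an assumption, since the text only covers the "less than" and "greater than" cases.

```python
import uuid

CHUNK_SIZE = 5 * 1024 * 1024        # 5 MB chunks, as in the preferred embodiment
SIZE_THRESHOLD = 10 * 1024 * 1024   # files above 10 MB are uploaded in chunks


def plan_upload(data: bytes, chunk_size: int = CHUNK_SIZE,
                threshold: int = SIZE_THRESHOLD) -> list:
    """Return the upload parts for one log file (hypothetical helper)."""
    file_id = str(uuid.uuid4())  # UUID v4 file identifier
    if len(data) <= threshold:
        # Assumption: a file at or below the threshold is sent as a single part.
        return [{"file_id": file_id, "chunk_id": 0,
                 "total_chunks": 1, "payload": data}]
    total = (len(data) + chunk_size - 1) // chunk_size  # ceiling division
    return [
        {
            "file_id": file_id,
            "chunk_id": i,              # sequence number starting from 0
            "total_chunks": total,
            "payload": data[i * chunk_size:(i + 1) * chunk_size],
        }
        for i in range(total)
    ]


def chunk_path(file_id: str, chunk_id: int) -> str:
    """Temporary server-side path for one chunk, following the text's layout."""
    return f"/tmp/{file_id}/chunk_{chunk_id}"
```

Because every part repeats `file_id` and `total_chunks`, the server can accept parts in any order and a failed part can be resent on its own.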

[0083] In some embodiments, the web framework can be replaced, i.e., the Flask API can be replaced with FastAPI. The advantage is that Pydantic models provide automatic parameter validation, which improves the robustness of the API interface, and asynchronous view functions (`async def`) support high-concurrency requests (>1000 QPS).

[0084] In other embodiments, the Flask API can also be replaced with Django, which has the advantage that the built-in Django Admin management backend simplifies the development of log file and task management interfaces.

[0085] Step S2: The server processes the log file to obtain structured log data and stores it in the database. Before processing, the server determines how the log file was uploaded. If the log file was uploaded normally, the server receives the log file directly. If the log file was uploaded in fragments, the server must respond to the fragment upload completion request sent by the client. In this embodiment, after all fragments are uploaded, the client browser sends a `/api/upload/complete` request carrying the `file_id` and `md5_checksum` (the MD5 hash value calculated by the front end). The server then performs integrity verification on all received fragments. Specifically, the Flask API reads all fragments, recalculates the MD5 hash, and compares it with the `md5_checksum` value sent by the client. If they match, the verification passes.

[0086] If the verification passes, the fragments are merged to obtain the log file. In one specific embodiment, the `process_log.delay(file_id)` method of the Celery distributed task queue is called to submit the log file parsing task to the Redis broker (using `redis://localhost:6379/0` as the message middleware), and a JSON response `{"task_id": "celery-task-id", "status": "processing"}` is immediately returned to the front end to avoid HTTP request timeouts caused by long processing (gateways typically limit requests to 30 seconds). The Celery Worker traverses the `/tmp/{file_id}/` directory and merges all fragments in `chunk_id` order into a complete log file.
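The integrity check and merge can be illustrated with a short standard-library sketch. It is not the Flask or Celery code of the embodiment; `merge_chunks` is a hypothetical name, and the function assumes the chunks are stored as `chunk_0`, `chunk_1`, ... in a single directory, as described above.

```python
import hashlib
from pathlib import Path


def merge_chunks(chunk_dir: Path, total_chunks: int,
                 md5_checksum: str, out_path: Path) -> bool:
    """Verify the MD5 over all chunks in chunk_id order, then merge them.

    Returns False (and writes nothing) if a chunk is missing or the
    recomputed hash differs from the client-supplied checksum.
    """
    # Build paths numerically so that chunk_10 sorts after chunk_2.
    paths = [chunk_dir / f"chunk_{i}" for i in range(total_chunks)]
    if not all(p.is_file() for p in paths):
        return False

    digest = hashlib.md5()
    for p in paths:
        digest.update(p.read_bytes())
    if digest.hexdigest() != md5_checksum.lower():
        return False

    with out_path.open("wb") as out:
        for p in paths:
            out.write(p.read_bytes())
    return True
```

Iterating over `range(total_chunks)` rather than sorting directory listings avoids the lexicographic ordering problem where `chunk_10` would come before `chunk_2`.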

[0087] In some specific embodiments, after the log file is obtained, as shown in Figure 3, step S2 specifically includes:

[0088] Step S21: Identify the protocol type of the log file; in some embodiments, this specifically involves reading the file header and identifying the communication protocol type by matching the magic number.

[0089] Step S22: Select the corresponding parsing strategy based on the identification result; in some embodiments, specifically if the first byte of the log file is 0x68 and the subsequent bytes conform to the DL/T 698.45 frame structure, then load the DLT698Parser class.

[0090] If the log file matches the IEC 62056 protocol features, then the HDCLParser class is loaded.
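Steps S21 and S22 amount to a dispatch on the leading bytes. The sketch below is illustrative only: the two parser classes are empty placeholders, the DL/T 698.45 check is reduced to the start byte 0x68 and end byte 0x16, and the IEC 62056 check assumes HDLC framing delimited by the 0x7E flag byte, which the text does not specify.

```python
class DLT698Parser:
    """Placeholder for the DL/T 698.45 parsing strategy."""
    protocol = "DL/T 698.45"


class HDCLParser:
    """Placeholder for the IEC 62056 parsing strategy (class name as in the text)."""
    protocol = "IEC 62056"


def looks_like_dlt698(frame: bytes) -> bool:
    # Simplified structural check (assumption): start byte 0x68, end byte 0x16.
    return len(frame) >= 2 and frame[0] == 0x68 and frame[-1] == 0x16


def looks_like_iec62056(frame: bytes) -> bool:
    # Assumption: the frame is HDLC-delimited, i.e. starts and ends with 0x7E.
    return len(frame) >= 2 and frame[0] == 0x7E and frame[-1] == 0x7E


def select_parser(first_frame: bytes):
    """Pick a parsing strategy from the first frame found in the log."""
    if looks_like_dlt698(first_frame):
        return DLT698Parser()
    if looks_like_iec62056(first_frame):
        return HDCLParser()
    raise ValueError("unrecognized protocol type")
```

A real implementation would also validate the length field and checksums before committing to a strategy; this sketch only shows the selection step.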

[0091] Step S23: Based on the parsing strategy, parse the log file line by line to obtain multiple JSON data entries, reclaiming memory after each line is parsed. As shown in Figure 4, in some embodiments this specifically includes:

[0092] Step S231: Extract timestamps, message frames, and communication IDs based on the regular expression rules in the parsing strategy. Specifically, the log is read line by line using a generator, and pre-compiled regular expressions match the key fields:

[0093] Python (the patterns below assume `import re`)

[0094] # Timestamp extraction pattern (supports two formats: 2025-01-01 12:00:00 and 01/01/2025 12:00:00)

[0095] TIMESTAMP_PATTERN = re.compile(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}|\d{2}/\d{2}/\d{4}\s+\d{2}:\d{2}:\d{2})')

[0096] # Message data extraction pattern (matches at least 8 hexadecimal bytes)

[0097] DATAGRAM_PATTERN = re.compile(r'([0-9A-Fa-f]{2}\s){8,}')

[0098] After each line is parsed, the memory occupied is reclaimed. In some embodiments, this is specifically achieved by using the re.finditer() iterator to achieve O(1) memory usage. Regardless of the file size (GB level), it will not cause memory overflow. By using this streaming parsing method, the memory overflow problem caused by parsing large files is fundamentally avoided. At the same time, the log parsing process can be carried out in parallel, adapting to the high throughput log processing requirements of the system.
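The streaming behaviour can be sketched with a generator that holds only the current line in memory. The patterns here are illustrative variants written for this example, not the embodiment's exact expressions, and communication ID extraction is omitted because the text gives no pattern for it; `parse_lines` is a hypothetical name.

```python
import re
from typing import Iterable, Iterator

# ISO-style timestamp, e.g. 2025-01-01 12:00:00
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}")
# At least 8 whitespace-separated hexadecimal bytes, not glued to other text.
FRAME_RE = re.compile(
    r"(?<![0-9A-Za-z])(?:[0-9A-Fa-f]{2}\s+){7,}[0-9A-Fa-f]{2}(?![0-9A-Za-z])"
)


def parse_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per line that has both a timestamp and a frame."""
    for line in lines:
        ts = TIMESTAMP_RE.search(line)
        if ts is None:
            continue
        # Search after the timestamp so its digits are not taken for hex bytes.
        frame = FRAME_RE.search(line, ts.end())
        if frame is None:
            continue
        yield {
            "timestamp": ts.group(0),
            "frame": "".join(frame.group(0).split()),  # drop separators
        }
```

Passing an open file object as `lines` lets Python read the file lazily, so memory use stays flat regardless of file size; each yielded record can be handed to the next stage before the following line is read.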

[0099] In other embodiments, the protocol parsing engine can be replaced, i.e., regular expressions can be replaced with an ANTLR4 parser. The advantage is that a .g4 grammar file can be written for DL/T 698.45 to generate an efficient lexer/parser, improving parsing speed by 3 times with better grammar maintainability than regular expressions.

[0100] In other embodiments, Python can be replaced with Cython, and the protocol parsing core module can be compiled into a C extension, improving CPU-intensive computing performance by 5-10 times.

[0101] Step S232: Verify the frame validity of the message frame; including format verification, address verification, and timing verification, specifically:

[0102] Format validation: Checks if the start-of-frame character (0x68), length field, and checksum (CS) are correct. If the CS check fails, it marks frame_valid=false.

[0103] Address verification: Extract the message address field and match it with the meter_no field in the dcu_comm_tasks table. If they do not match, mark address_mismatch=true.

[0104] Timing verification: Calculate diff_time = response_time - request_time. If diff_time > 30 seconds, it is judged as a timeout.
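The three checks can be sketched together as below. This is a simplified illustration: the length-field check is omitted, and the checksum rule (the byte before the end byte equals the sum of all preceding bytes modulo 256) is an assumption made for the example and may differ from the actual protocol. The function names and the shape of the returned dictionary are hypothetical.

```python
from datetime import datetime

TIMEOUT_SECONDS = 30
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_format(frame: bytes) -> bool:
    """Simplified format check: start byte, end byte, and an assumed checksum."""
    if len(frame) < 3 or frame[0] != 0x68 or frame[-1] != 0x16:
        return False
    # Assumption: CS is the modulo-256 sum of every byte before it.
    return sum(frame[:-2]) % 256 == frame[-2]


def validate(frame: bytes, frame_address: str, expected_meter_no: str,
             request_time: str, response_time: str) -> dict:
    """Run format, address, and timing verification for one exchange."""
    diff = (datetime.strptime(response_time, TIME_FORMAT)
            - datetime.strptime(request_time, TIME_FORMAT)).total_seconds()
    return {
        "frame_valid": check_format(frame),
        "address_mismatch": frame_address != expected_meter_no,
        "diff_time": int(diff),
        "timeout": diff > TIMEOUT_SECONDS,  # strictly more than 30 seconds
    }
```

Returning all flags at once, rather than stopping at the first failure, keeps the diagnostic detail that later analysis may need.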

[0105] Step S233: If the verification passes, perform feature calculations on all message frames under the same communication ID to obtain derived features; the derived features are core analytical fields in the JSON data that reflect communication quality and execution results. In some embodiments, the derived features are specifically as follows:

[0106] comm_times: The total number of frames under the same comm_id

[0107] success_comm_times: The number of frames with a response code of 0x00.

[0108] fail_comm_times: comm_times - success_comm_times

[0109] exec_result: An enumeration of values (0: all failed, 1: all succeeded, 2: partially succeeded).
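The derived features follow directly from the definitions above. In this sketch, each frame is assumed to be a dictionary with a `response_code` key (a hypothetical field name), and an empty frame list is treated as "all failed", which the text does not address.

```python
def derive_features(frames: list) -> dict:
    """Compute the derived features for all frames sharing one comm_id."""
    comm_times = len(frames)
    success = sum(1 for f in frames if f["response_code"] == 0x00)
    fail = comm_times - success

    if success == 0:
        exec_result = 0   # all failed (assumption: also used for no frames)
    elif fail == 0:
        exec_result = 1   # all succeeded
    else:
        exec_result = 2   # partially succeeded

    return {
        "comm_times": comm_times,
        "success_comm_times": success,
        "fail_comm_times": fail,
        "exec_result": exec_result,
    }
```

For example, three frames with response codes 0x00, 0x01, and 0x00 give `comm_times` 3, `success_comm_times` 2, `fail_comm_times` 1, and `exec_result` 2.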

[0110] Step S234: Obtain the JSON data based on the derived features. The original parsed fields and the derived features are standardized and integrated into a single structured log entry, which is then converted into JSON data.

[0111] Step S24: Store the structured log data, i.e., JSON data, into the database. In some embodiments, this specifically includes:

[0112] Step S241: Store the JSON data in the cache. Specifically, the Celery Worker converts each structured log entry into a JSON object. After accumulating 500 entries, it pushes them into the Redis queue in batches using `redis.Redis().lpush('preprocessed_log_queue', json.dumps(batch_array))`. A left-in, right-out (LPUSH+BRPOP) pattern is used to ensure that the data is first-in, first-out. In some preferred embodiments, each JSON object contains the following fields:

[0113] {

[0114] "comm_id": "TX202501010001",

[0115] "point_no": 102,

[0116] "meter_no": "123456789012",

[0117] "request_time": "2025-01-01 02:00:00",

[0118] "response_time": "2025-01-01 02:00:15",

[0119] "diff_time": 15,

[0120] "log_comm": [{"step":1,"direction":"REQ","data":"68...16"}, {...}],

[0121] "exec_result": 1,

[0122] "error_reason": null

[0123] }
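The left-in, right-out queueing of step S241 can be illustrated without a Redis server by using an in-memory stand-in. `InMemoryQueue` and `push_in_batches` are hypothetical names; the stand-in only mimics the push-left and pop-right semantics, and flushing a final partial batch is an assumption, since the text only describes pushing after 500 entries have accumulated.

```python
import json
from collections import deque

BATCH_SIZE = 500


class InMemoryQueue:
    """Stand-in for a Redis list: push on the left, pop on the right."""

    def __init__(self):
        self._items = deque()

    def lpush(self, value: str) -> int:
        self._items.appendleft(value)
        return len(self._items)

    def rpop(self):
        return self._items.pop() if self._items else None


def push_in_batches(entries, queue, batch_size: int = BATCH_SIZE) -> int:
    """Serialize entries in batches and push each batch; return the batch count."""
    batch, pushed = [], 0
    for entry in entries:
        batch.append(entry)
        if len(batch) == batch_size:
            queue.lpush(json.dumps(batch))
            pushed += 1
            batch = []
    if batch:
        # Assumption: leftover entries at end of file are flushed as a short batch.
        queue.lpush(json.dumps(batch))
        pushed += 1
    return pushed
```

Because producers push on the left and the consumer pops from the right, batches leave the queue in the order they were produced, which is the first-in, first-out property the text relies on.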

[0124] In other embodiments, message queues can be used as an alternative, i.e., Redis can be replaced by RabbitMQ. The advantage is that RabbitMQ's exchange and routing key mechanism supports finer-grained task routing, which can distribute parsing tasks and data entry tasks to different queues to achieve resource isolation.

[0125] In other implementations, Redis can be replaced with Apache Kafka, which has the advantage that Kafka's partitions and consumer groups support horizontal scaling, making it suitable for handling ultra-large-scale log streams of >10 GB/hr.

[0126] Step S242: In response to the batch processing trigger condition being met, store the JSON data held in the cache into the database. The batch processing trigger condition is that the number of JSON data entries stored in the cache reaches a preset quantity threshold or the waiting time exceeds the preset duration set in the timing verification. In some embodiments, this is implemented by LogPersister (a batch processing service process). LogPersister runs independently and blocks on the Redis queue using `redis.Redis().brpop('preprocessed_log_queue', timeout=30)`. When the queue length is ≥500 or the 30-second timeout expires, the MySQL stored procedure is called:

[0127] The stored procedure `CALL StoreDCULogsAndTasks(@batch_json, @affected_rows)` internally uses a batch transaction commit (START TRANSACTION...INSERT...COMMIT) to write 500 records at once to the three tables `dcu_comm_logs`, `dcu_comm_tasks`, and `dcu_meter_datas`, improving execution efficiency by more than 10 times compared to inserting records one by one. After writing, `LogPersister` deletes the processed data using the Redis `LTRIM` command and logs it to `process_log`. Once the Celery Worker detects this log, it pushes a completion notification to the client's browser via WebSocket.
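The "count or timeout" trigger can be sketched independently of Redis and MySQL. `BatchTrigger` is a hypothetical helper, not the LogPersister process itself; the clock is injectable so the timeout path can be exercised without waiting, and measuring the wait from the arrival of the oldest pending item is an assumption.

```python
import time


class BatchTrigger:
    """Decide when buffered entries should be written to the database."""

    def __init__(self, max_items: int = 500, max_wait_seconds: float = 30.0,
                 clock=time.monotonic):
        self.max_items = max_items
        self.max_wait_seconds = max_wait_seconds
        self._clock = clock
        self._pending = []
        self._started = None  # arrival time of the oldest pending item

    def add(self, item) -> None:
        if not self._pending:
            self._started = self._clock()
        self._pending.append(item)

    def should_flush(self) -> bool:
        if not self._pending:
            return False
        if len(self._pending) >= self.max_items:
            return True   # quantity threshold reached
        return self._clock() - self._started >= self.max_wait_seconds

    def flush(self) -> list:
        """Return the pending batch and reset the buffer."""
        batch, self._pending = self._pending, []
        self._started = None
        return batch
```

In a persister loop, `flush()` would supply the batch that is passed to the bulk write, so a quiet period still results in a write within the time limit while busy periods are written in full-size batches.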

[0128] In other embodiments, the database can be replaced, i.e., MySQL can be replaced with PostgreSQL. The advantage is that PostgreSQL's JSONB fields combined with GIN indexes can double the performance of querying nested JSON fields, and the COPY command enables ultra-high-speed batch import (>100,000 records/second).

[0129] In other embodiments, MySQL is replaced by TiDB, which has the advantage of being a distributed HTAP database that supports horizontal scaling, meets petabyte-level log storage requirements, and is compatible with the MySQL protocol without requiring code modifications.

[0130] Step S3: The client sends the user's question to the server;

[0131] The question is segmented and entity recognition is performed to obtain a structured query context object. Specifically, in some embodiments, after the user inputs the question on the client-side front end, the client browser sends the raw text to the Flask API's `/api/question` interface via WebSocket or POST request. The preprocessing module uses the Jieba word segmentation engine (loading a custom power industry dictionary containing technical terms such as "load curve," "daily freeze," and "metering point"), combined with regular expression entity recognition, to extract the following structured parameters:

[0132] point_no: Matches the regular expression r'measurement point\s*(\d+)'

[0133] task: Matches keywords such as "load curve" or "daily freeze".

[0134] date: Matches the date format and normalizes it to YYYY-MM-DD. The extracted parameters are finally assembled into a QueryContext object.
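The entity extraction can be sketched with regular expressions alone. This example leaves out Jieba segmentation and works on English text; the `QueryContext` fields mirror the three parameters listed above, while the accepted date separators and the keyword tuple are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

POINT_RE = re.compile(r"measurement point\s*(\d+)", re.IGNORECASE)
# Assumption: dates appear as year, month, day separated by '-', '/' or '.'.
DATE_RE = re.compile(r"(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})")
TASK_KEYWORDS = ("load curve", "daily freeze")


@dataclass
class QueryContext:
    point_no: Optional[int] = None
    task: Optional[str] = None
    date: Optional[str] = None


def build_query_context(question: str) -> QueryContext:
    """Extract point_no, task, and date from a natural-language question."""
    point = POINT_RE.search(question)
    date = DATE_RE.search(question)
    lowered = question.lower()
    task = next((k for k in TASK_KEYWORDS if k in lowered), None)
    normalized = None
    if date:
        # Normalize to YYYY-MM-DD with zero-padded month and day.
        normalized = "{}-{:02d}-{:02d}".format(
            date.group(1), int(date.group(2)), int(date.group(3)))
    return QueryContext(
        point_no=int(point.group(1)) if point else None,
        task=task,
        date=normalized,
    )
```

Fields that cannot be found stay `None`, which lets the downstream workflow distinguish a question about one measurement point from a question about the whole log.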

[0135] Step S4: The server invokes the intelligent agent based on the problem: The specific steps of step S4 are as follows Figure 5 As shown, in some embodiments, specifically:

[0136] Step S41: Perform database and knowledge base queries to obtain the structured log data corresponding to the problem and the historical fault diagnosis experience corresponding to the problem, respectively; the specific steps of step S41 are as follows in some embodiments: Figure 6 As shown, it specifically includes:

[0137] Step S411: Generate the agent execution environment; in some preferred embodiments, generating the execution environment specifically involves:

[0138] Step S4111: Load the user's historical dialogue memory and construct a multi-turn dialogue context; in this embodiment, specifically: the context is constructed with the LangChain AI framework and loaded into the LangGraph workflow. The user's historical dialogue (key: user_session:{user_id}) is read from Redis and a ConversationBufferMemory is constructed, so that the turns of a multi-turn dialogue are associated with one another.
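The history-loading step can be sketched without the Redis and LangChain dependencies. In this sketch, `store` is any object with a `get(key)` method (a plain dictionary stands in for the Redis client), and the JSON encoding of the stored turns, the `role`/`content` field names, and the helper names are assumptions; only the key format `user_session:{user_id}` comes from the text.

```python
import json
from typing import Dict, List


def session_key(user_id: str) -> str:
    # Key format given in the text.
    return f"user_session:{user_id}"


def load_history(store, user_id: str) -> List[Dict[str, str]]:
    """Read prior turns for a user; returns an empty list for a new session."""
    raw = store.get(session_key(user_id))
    if raw is None:
        return []
    if isinstance(raw, bytes):
        # Redis clients commonly return bytes unless response decoding is enabled.
        raw = raw.decode("utf-8")
    # Assumed storage format: a JSON list of {"role": ..., "content": ...} objects.
    return json.loads(raw)


def render_context(history: List[Dict[str, str]], question: str) -> str:
    """Flatten prior turns plus the new question into one multi-turn context string."""
    lines = [f"{turn['role']}: {turn['content']}" for turn in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)
```

Keeping the store behind a `get` call means the same function works against a dictionary in tests and a Redis client in deployment.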

[0139] Step S4112: Encapsulate the database query, chart generation, and report download functions into independent utility classes for use by the large language model; in this embodiment, this specifically involves dynamically loading three utility classes: SQLDatabaseToolkit (automatically converts natural language into SQL), ChartGenerator (encapsulates Matplotlib), and ReportDownloader (generates PDF).

[0140] Step S4113: Set the workflow template, wherein the workflow template defines the node execution order and conditional branching logic; in this embodiment, specifically: select the LangGraph template according to QueryContext.task_type:

[0141] task_type='overall' → Load OverallAnalysisWorkflow (execute full table statistical query)

[0142] task_type='specific' → Load SpecificTaskWorkflow (execute conditional branch query)

[0143] task_type='custom' → Load CustomQuestionWorkflow (perform multi-hop inference)
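The template selection in [0141] to [0143] amounts to a lookup keyed on `task_type`. The sketch below uses plain functions as stand-ins for the three workflow classes; the function bodies, the `WORKFLOW_TEMPLATES` table, and the decision to raise an error on an unknown `task_type` are assumptions, since the text does not describe a fallback.

```python
from typing import Callable, Dict


def overall_analysis_workflow(question: str) -> str:
    # Stand-in for OverallAnalysisWorkflow (full-table statistical query).
    return f"[overall] {question}"


def specific_task_workflow(question: str) -> str:
    # Stand-in for SpecificTaskWorkflow (conditional branch query).
    return f"[specific] {question}"


def custom_question_workflow(question: str) -> str:
    # Stand-in for CustomQuestionWorkflow (multi-hop inference).
    return f"[custom] {question}"


WORKFLOW_TEMPLATES: Dict[str, Callable[[str], str]] = {
    "overall": overall_analysis_workflow,
    "specific": specific_task_workflow,
    "custom": custom_question_workflow,
}


def select_workflow(task_type: str) -> Callable[[str], str]:
    """Return the workflow registered for task_type."""
    try:
        return WORKFLOW_TEMPLATES[task_type]
    except KeyError:
        # Assumed behavior: reject unknown task types explicitly.
        raise ValueError(f"unknown task_type: {task_type!r}") from None
```

A table-driven lookup keeps the mapping in one place, so adding a fourth template would only require a new entry.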

[0144] Step S4114: Construct the reasoning and action framework for the large language model. Specifically, the workflow defines a directed acyclic graph (DAG) in JSON, and condition nodes are evaluated using JavaScript expressions; the agent is registered with the ReAct framework, and the LLM autonomously decides which tools to invoke and in what order.

[0145] Step S412: Load the corresponding workflow template based on the query context; in some embodiments, the query needs to be converted. Taking SpecificTaskWorkflow as an example, the query node first executes SQLDatabaseChain, which automatically converts "Query the load curve task of metering point 102" into:

[0146] SELECT * FROM dcu_comm_tasks WHERE point_no=102 AND task='load curve A' ORDER BY request_time DESC LIMIT 100;

Then, the conditional node evaluates len(query_result) > 0 using the JavaScript expression engine. If the result is true, the workflow proceeds to the analysis node; otherwise, it returns "No relevant record found".
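The query node followed by the conditional node can be sketched as below. Python's built-in `sqlite3` module stands in for MySQL, and the condition is a plain Python check rather than a JavaScript expression. The three-column table layout, the `make_demo_db` helper, the sample rows, and the shape of the analysis result are all hypothetical; only the table name, the column names used in the query, and the "No relevant record found" message come from the text.

```python
import sqlite3
from typing import Any, Dict, List, Tuple, Union

NO_RECORD_MESSAGE = "No relevant record found"


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dcu_comm_tasks (point_no INTEGER, task TEXT, request_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO dcu_comm_tasks VALUES (?, ?, ?)",
        [
            (102, "load curve A", "2026-03-01 00:15:00"),
            (102, "load curve A", "2026-03-02 00:15:00"),
            (103, "daily freeze", "2026-03-02 00:05:00"),
        ],
    )
    return conn


def query_node(conn: sqlite3.Connection, point_no: int, task: str,
               limit: int = 100) -> List[Tuple[Any, ...]]:
    # Parameterized form of the SELECT statement shown in the text.
    sql = (
        "SELECT * FROM dcu_comm_tasks WHERE point_no = ? AND task = ? "
        "ORDER BY request_time DESC LIMIT ?"
    )
    return conn.execute(sql, (point_no, task, limit)).fetchall()


def analysis_node(rows: List[Tuple[Any, ...]]) -> Dict[str, Any]:
    # Placeholder analysis: report the row count and the newest record.
    return {"record_count": len(rows), "latest": rows[0]}


def run_specific_task(conn: sqlite3.Connection, point_no: int,
                      task: str) -> Union[Dict[str, Any], str]:
    rows = query_node(conn, point_no, task)
    # Conditional node: continue to analysis only when the query returned rows.
    if len(rows) > 0:
        return analysis_node(rows)
    return NO_RECORD_MESSAGE
```

Binding `point_no` and `task` as parameters instead of interpolating them into the SQL string is a design choice of this sketch; it keeps values extracted from user text out of the statement itself.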

[0147] Step S413: Execute the corresponding workflow based on the workflow template to obtain the structured log data corresponding to the problem;

[0148] Step S414: Based on the query context, query the knowledge base for the N most similar historical fault diagnosis experiences corresponding to the question, where N is a natural number greater than or equal to 1.

[0149] In some embodiments, specifically: LangGraph takes the database query results (a JSON array) together with the top-5 similar knowledge fragments retrieved by RAG; that is, N is 5 in this embodiment.
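Selecting the N most similar historical cases can be sketched as a ranking over vectors. The text does not state how similarity is computed, so the use of embedding vectors and cosine similarity here is an assumption, as are the `cosine` and `top_n` helper names and the `(text, vector)` case format.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(query_vec: Sequence[float],
          cases: List[Tuple[str, Sequence[float]]],
          n: int = 5) -> List[str]:
    """Return the texts of the n cases most similar to query_vec."""
    if n < 1:
        # The text requires N to be a natural number greater than or equal to 1.
        raise ValueError("n must be at least 1")
    ranked = sorted(cases, key=lambda case: cosine(query_vec, case[1]), reverse=True)
    return [text for text, _ in ranked[:n]]
```

With `n=5` this matches the top-5 setting of the embodiment; a production knowledge base would typically delegate the ranking to a vector index rather than sorting every case.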

[0150] Step S42: Based on the question, the corresponding structured log data, and the historical fault diagnosis experience, the answer to the question is returned to the client.

[0151] In some embodiments, specifically: a prompt is constructed from the question, the corresponding structured log data, and the historical fault diagnosis experience, and is input into the large language model (LLM) to obtain the answer to the question. That is, retrieval-augmented generation (RAG) is combined with LLM inference: LangGraph concatenates the database query results (a JSON array) with the top-5 similar knowledge fragments retrieved by RAG into a prompt, which is then input into the LLM. An example of the prompt structure is as follows:

[0152] [System Prompt: You are a power system log analysis expert]

[0153] [Historical Case: {RAG Search Results}]

[0154] [Current data: {MySQL query results}]

[0155] [User Question: {question}]

[0156] Please analyze the reasons for the failure and provide optimization suggestions, outputting in JSON format.
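Assembling the prompt structure above from its three inputs can be sketched as follows. The bracketed section labels and the closing instruction are taken from the text; the `build_prompt` helper, joining the retrieved cases with newlines, and serializing the query rows with `json.dumps` are assumptions.

```python
import json
from typing import Any, Dict, List

PROMPT_TEMPLATE = (
    "[System Prompt: You are a power system log analysis expert]\n"
    "[Historical Case: {cases}]\n"
    "[Current data: {data}]\n"
    "[User Question: {question}]\n"
    "Please analyze the reasons for the failure and provide optimization "
    "suggestions, outputting in JSON format."
)


def build_prompt(question: str, rows: List[Dict[str, Any]], cases: List[str]) -> str:
    """Fill the template with retrieved cases, query rows, and the user question."""
    return PROMPT_TEMPLATE.format(
        cases="\n".join(cases),
        # ensure_ascii=False keeps non-ASCII log content readable in the prompt.
        data=json.dumps(rows, ensure_ascii=False),
        question=question,
    )
```

Because `str.format` only interprets braces in the template itself, JSON braces inside the substituted values pass through unchanged.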

[0157] LLM outputs structured JSON:

[0158] {

[0159] "root_cause": "Channel conflict caused retry timeout",

[0160] "confidence": 0.92,

[0161] "suggestion": "It is recommended to adjust the meter reading period to 3:00-4:00 AM to avoid peak electricity consumption hours.",

[0162] "chart_type": "timeline",

[0163] "chart_data": {...}

[0164] }
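Before the structured output is used for chart generation, it is reasonable to check that it parses and contains the expected fields. The sketch below is one way to do that; the `parse_diagnosis` helper, the case-insensitive key handling, and the requirement that `confidence` lie between 0 and 1 are assumptions not stated in the text.

```python
import json
from typing import Any, Dict

# Field names taken from the example output, compared case-insensitively.
REQUIRED_KEYS = ("root_cause", "confidence", "suggestion", "chart_type", "chart_data")


def parse_diagnosis(raw: str) -> Dict[str, Any]:
    """Parse the model's JSON reply and verify the expected fields are present."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Normalize key case so that "Suggestion" and "suggestion" are both accepted.
    data = {key.lower(): value for key, value in data.items()}
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    confidence = data["confidence"]
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    # Assumed range: confidence is treated as a probability-like score in [0, 1].
    if not is_number or not 0 <= confidence <= 1:
        raise ValueError("confidence must be a number between 0 and 1")
    return data
```

Rejecting malformed replies at this point lets the caller retry the LLM call instead of passing incomplete data to the chart generator.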

[0165] In a preferred embodiment, the large language model is Tongyi Qianwen-72B, fine-tuned with LoRA using 5,000 power fault cases.

[0166] LangGraph uses the ChartGenerator tool and Matplotlib to generate a communication time series diagram from the chart_data, then converts it to Base64 encoding and embeds it into HTML. The Flask API pushes the final result to the client's browser in chunks via the Server-Sent Events (SSE) protocol. The front end renders the text, charts, and download links step by step, with the total response time controlled within 3-10 seconds.
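The Base64 embedding and chunked SSE delivery described above can be sketched with the standard library alone. Chart rendering with Matplotlib is omitted, so `png_bytes` stands for an already rendered image; the event names `text`, `chart`, and `done`, the payload fields, and the chunk size are hypothetical.

```python
import base64
import json
from typing import Any, Dict, Iterator


def png_to_img_tag(png_bytes: bytes) -> str:
    """Embed PNG bytes in an HTML <img> tag as a Base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}" alt="chart">'


def sse_event(event: str, payload: Dict[str, Any]) -> str:
    # One SSE message: an "event:" line, a "data:" line, then a blank line.
    # json.dumps emits a single line, so the payload never breaks the framing.
    return f"event: {event}\ndata: {json.dumps(payload, ensure_ascii=False)}\n\n"


def stream_answer(text: str, png_bytes: bytes, chunk_size: int = 40) -> Iterator[str]:
    """Yield the answer text in chunks, then the chart, then an end marker."""
    for start in range(0, len(text), chunk_size):
        yield sse_event("text", {"content": text[start:start + chunk_size]})
    yield sse_event("chart", {"html": png_to_img_tag(png_bytes)})
    yield sse_event("done", {})
```

In a Flask view, a generator like this could be returned as `Response(stream_answer(...), mimetype="text/event-stream")`, which lets the front end render text, chart, and links as each event arrives.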

[0167] In other embodiments, the AI framework can be replaced, i.e., LangChain+LangGraph can be replaced with AutoGen. The advantage is that Microsoft AutoGen's multi-agent dialogue mechanism can be used: one agent is responsible for querying the database, and another agent is responsible for calling the LLM. The analysis is completed through dialogue-based collaboration, which is suitable for more complex cross-table join queries.

[0168] In other embodiments, the agent invocation pattern can be replaced, i.e., the ReAct framework can be replaced with the ReWOO framework. ReWOO pre-plans all tool invocation steps before execution, reducing the number of LLM calls and lowering token costs by 30%.

[0169] In other embodiments, a single LLM call can be replaced by chain-of-thought multi-step reasoning. The advantage is that a complex problem is broken into steps, with each step calling the LLM independently, which improves the interpretability of the final answer.

[0170] The technical solution of the present invention has been described above with reference to the preferred embodiments shown in the accompanying drawings. However, it will be readily understood by those skilled in the art that the scope of protection of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions will all fall within the scope of protection of the present invention.

Claims

1. A concentrator log analysis method, characterized in that it comprises: a client uploading a concentrator log file to a server; the server processing the log file to obtain structured log data and storing the structured log data in a database; the client sending a user's question to the server; and the server invoking an intelligent agent based on the question to: perform a database query and a knowledge base query to obtain, respectively, the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question; and, based on the question, the corresponding structured log data, and the historical fault diagnosis experience, obtain an answer to the question and return it to the client.

2. The method according to claim 1, characterized in that the server processing the log file to obtain structured log data comprises: identifying the protocol type of the log file; selecting a corresponding parsing strategy based on the identification result; and parsing the log file line by line using the parsing strategy to obtain multiple JSON data entries, wherein the memory used is reclaimed after each line is parsed.

3. The method according to claim 2, characterized in that selecting the corresponding parsing strategy based on the identification result comprises: if the first byte of the log file is 0x68 and the subsequent bytes conform to the DL/T 698.45 frame structure, loading the DLT698Parser class; and if the log file matches the IEC 62056 protocol features, loading the HDCLParser class.

4. The method according to claim 2, characterized in that parsing the log file line by line using the parsing strategy to obtain multiple JSON data entries comprises: extracting timestamps, message frames, and communication IDs based on the regular expression rules in the parsing strategy; verifying the validity of each message frame; if the verification passes, performing feature calculation on all message frames under the same communication ID to obtain derived features; and obtaining the JSON data based on the derived features.

5. The method according to claim 4, characterized in that verifying the validity of the message frame comprises: performing format verification, address verification, and timing verification on the message frame.

6. The method according to claim 2, characterized in that storing the structured log data in the database comprises: storing the JSON data in a cache; and, in response to a batch processing trigger condition being met, storing the JSON data held in the cache in the database.

7. The method according to claim 6, characterized in that the batch processing trigger condition is that the amount of JSON data stored in the cache reaches a preset threshold, or that the preset duration set in the timing verification is exceeded.

8. The method according to claim 1, characterized in that the client uploading the concentrator log file to the server comprises: if the size of the log file to be uploaded exceeds a preset size threshold, splitting the log file into multiple fragments, each fragment carrying a file ID, a fragment number, and the total number of fragments.

9. The method according to claim 8, characterized in that, before the server processes the log file to obtain structured log data, the method further comprises: in response to the client sending a fragment upload completion request, the server performing an integrity check on all received fragments and, if the check passes, merging the fragments to obtain the log file.

10. The method according to claim 1, characterized in that, before the server invokes the intelligent agent based on the question, the method further comprises: segmenting the question into words and performing entity recognition to obtain a structured query context object.

11. The method according to claim 10, characterized in that performing the database query and the knowledge base query to obtain, respectively, the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question comprises: generating an execution environment for the intelligent agent; loading the corresponding workflow template based on the query context; executing the corresponding workflow based on the workflow template to obtain the structured log data corresponding to the question; and, based on the query context, querying the knowledge base for the N most similar historical fault diagnosis experiences corresponding to the question, where N is a natural number greater than or equal to 1.

12. The method according to claim 11, characterized in that generating the execution environment for the intelligent agent comprises: loading the user's historical dialogue memory to construct a multi-turn dialogue context; encapsulating the database query, chart generation, and report download functions into independent utility classes for use by the large language model; setting a workflow template, wherein the workflow template defines the node execution order and conditional branching logic; and constructing the reasoning and action framework for the large language model.

13. The method according to claim 12, characterized in that obtaining the answer to the question and returning it to the client based on the question, the corresponding structured log data, and the historical fault diagnosis experience comprises: constructing a prompt based on the question, the corresponding structured log data, and the historical fault diagnosis experience, and inputting the prompt into the large language model to obtain the answer to the question.

14. The method according to claim 12 or 13, further comprising: during the training phase of the large language model, fine-tuning the large language model using LoRA based on power-domain knowledge.

15. A concentrator log analysis system, characterized in that it comprises a client and a server, wherein: the client uploads a concentrator log file to the server; the server processes the log file to obtain structured log data and stores the structured log data in a database; the client sends a user's question to the server; and the server invokes an intelligent agent based on the question to: perform a database query and a knowledge base query to obtain, respectively, the structured log data corresponding to the question and the historical fault diagnosis experience corresponding to the question; and, based on the question, the corresponding structured log data, and the historical fault diagnosis experience, obtain an answer to the question and return it to the client.