An all-weather traffic incident identification method based on multi-AI large model joint decision

By using multi-AI large-scale models for joint decision-making, and employing diffusion models and large language models to denoise and analyze traffic images, combined with user feedback for optimization, accurate identification and efficient early warning of traffic incidents around the clock have been achieved, improving the level of intelligence in traffic management and the efficiency of emergency response.

CN120689820B · Active · Publication Date: 2026-06-26 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SOUTH CHINA UNIV OF TECH
Filing Date: 2025-05-15
Publication Date: 2026-06-26

IPC: G06V20/54; G06V20/70; G06V10/30; G06V10/764; G06V10/82; G06V10/94; G06N3/0475; G06N3/08; G06N5/04; G08G1/01; G08G1/04

AI Tagging

Technology Topics

Traffic crash; Linguistic model

Technical Efficacy Phrases

Accurate detection; Accurate early warning

AI Technical Summary

Technical Problem

Existing technologies are insufficient in terms of accuracy, understanding ability, and real-time performance in traffic incident identification, making it difficult to achieve accurate traffic incident identification and early warning in complex scenarios.

Method used

The method employs a multi-AI large model joint decision-making approach, which collects all-weather traffic images through road monitoring cameras, uses a diffusion model for noise reduction, combines a large language model for traffic event analysis, extracts vehicle information, road conditions, personnel information, and weather conditions, determines the severity of accidents, and pushes early warning information in real time. It also incorporates user feedback for adaptive optimization.

Benefits of technology

It enables accurate identification and efficient early warning of traffic incidents under all-weather conditions, improving the intelligence level and emergency response efficiency of traffic management departments.

Abstract

The application provides an all-weather traffic event identification method based on multi-AI large model joint decision, mainly including the following steps: first, all-weather traffic images are collected in real time through road monitoring cameras and preprocessed to improve image clarity; next, semantic analysis of the traffic images is carried out using a large language model, and the results are stored as structured data; the traffic event type is inferred using a locally deployed large language model; the severity of the traffic accident is determined in combination with the large language model; and system maintenance is carried out in combination with the large language model. The AITI-Agent is constructed by integrating multiple large models, which overcomes the limitations of traditional image recognition methods in complex traffic environments and achieves more accurate, real-time all-weather traffic event identification and early warning by integrating different large language models.

Description

Technical Field

[0001] This invention relates to the fields of computer vision and intelligent transportation technology, and in particular to an all-weather traffic event recognition method based on joint decision-making of multiple AI models.

Background Technology

[0002] In the field of intelligent transportation, automatic identification and early warning of traffic incidents is an important research direction for improving road safety and emergency response efficiency. Traditional methods mainly rely on object detection technology, using deep learning models to detect and classify accident images. However, these methods have limited generalization ability to complex scenes and struggle to accurately understand the background and semantic information of accidents from multiple perspectives. Compared to traditional methods, large language models and AI-Agents demonstrate powerful knowledge reasoning and autonomous decision-making capabilities. Specifically, large language models can combine multimodal information to deeply understand the background, related factors, and potential risks of accidents, while AI-Agents possess autonomous learning and real-time interaction capabilities, adjusting identification and early warning strategies according to the dynamic environment, greatly improving the accuracy and intelligence level of event identification.

[0003] Based on currently available technologies, such as the traffic accident recognition method and device based on a multimodal large language model (CN202480000606.7), the method mainly includes: firstly, acquiring multimodal data such as video data, audio data, and sensor data under the driving state of the target vehicle; inputting the acquired multimodal data into an accident recognition model trained based on a multimodal large language model for multimodal information mining; determining whether the target vehicle has been involved in a traffic accident based on the mined multimodal information; and if the target vehicle has been involved in a traffic accident, issuing an alarm message. However, no all-weather traffic event image recognition method based on joint decision-making using multiple AI large language models has been disclosed.

Summary of the Invention

[0004] In view of the limitations of existing technologies, the purpose of this invention is to provide an all-weather traffic incident identification method based on joint decision-making of multiple AI models, so as to solve the shortcomings of existing technologies in terms of incident identification accuracy, understanding ability, and real-time performance, and improve the intelligence level and efficiency of traffic incident early warning.

[0005] The present invention is achieved by at least one of the following technical solutions.

[0006] A method for all-weather traffic incident identification based on joint decision-making of multiple AI models includes the following steps:

[0007] S1. Real-time collection of images of road traffic operation status around the clock using road traffic monitoring camera equipment to provide data support for subsequent analysis;

[0008] S2. Use a diffusion model to denoise the acquired traffic images;

[0009] S3. Use a large language model to analyze the denoised traffic images, extract vehicle information, road conditions, personnel information, weather conditions, whether there is congestion, and whether an accident has occurred, which are required for traffic event analysis, and store them in the database in a structured manner.

[0010] S4. Use locally deployed large language models to reason and judge different traffic events;

[0011] S5. Based on the damage to vehicles, the number of vehicles involved in the accident, and the injuries sustained by personnel, the severity of traffic accidents is determined as minor accidents, general accidents, serious accidents, and major accidents.

[0012] S6. Send real-time warning information to traffic management departments regarding event types and the severity of traffic accidents;

[0013] S7. Through user feedback, analysis of historical early warning effects, and combination with a large language model, the system for traffic incident image recognition and early warning is maintained.

[0014] Furthermore, in step S1, images of various traffic events are collected using surveillance cameras. The image data includes various types of traffic roads at different times and under different weather conditions.

[0015] The categories for different time periods include: morning peak hours (7:00-9:00), midday hours (11:00-13:00), evening peak hours (17:00-19:00), nighttime hours (19:00-6:00), and off-peak hours (6:00-7:00, 9:00-11:00, 13:00-17:00). The categories for different weather conditions include: sunny, cloudy, rainy, foggy, and snowy.

[0016] Traffic road categories include: highways, urban expressways, rural roads, roundabouts, tunnels, bridges, ramps, service areas, and construction sections, to ensure that the system can identify and warn of traffic accidents in different road environments.
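The time-period categories above can be expressed as a small lookup. This is a minimal sketch: the label strings and the function name `classify_period` are hypothetical, and treating each range as half-open (start included, end excluded) is an assumption, since the text does not say which period a boundary such as 9:00 belongs to.

```python
from datetime import time

# Period boundaries copied from the text. Treating each range as
# half-open [start, end) is an assumption made for this sketch.
PERIODS = [
    ("morning_peak", time(7, 0), time(9, 0)),
    ("midday", time(11, 0), time(13, 0)),
    ("evening_peak", time(17, 0), time(19, 0)),
    ("off_peak", time(6, 0), time(7, 0)),
    ("off_peak", time(9, 0), time(11, 0)),
    ("off_peak", time(13, 0), time(17, 0)),
]


def classify_period(t: time) -> str:
    """Return the time-period label for a capture time."""
    for label, start, end in PERIODS:
        if start <= t < end:
            return label
    # Everything else lies in 19:00-6:00, which wraps past midnight.
    return "night"
```

For example, `classify_period(time(8, 30))` falls in the morning peak, while `classify_period(time(23, 0))` falls in the nighttime range.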

[0017] Furthermore, in step S2, the diffusion model performs noise reduction processing on the acquired traffic images, including the following steps:

[0018] S2-1. Using a dataset of traffic accident images containing different weather conditions, time periods, road sections, and perspectives, construct training samples and normalize the original clear image x0.

[0019] S2-2. Based on the forward diffusion process, Gaussian noise is gradually added to the original clear image x0 to simulate different degrees of image distortion. The noise addition process follows the formula below:

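[0020] $$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right)$$

[0021] In the formula, $q(x_t \mid x_0)$ represents the probability distribution of $x_t$ given $x_0$, $x_t$ represents the noisy image at time step $t$, $\bar{\alpha}_t$ represents the cumulative noise proportion coefficient, $I$ is the identity matrix, and $\mathcal{N}$ is a Gaussian distribution.

[0022] S2-3. Train the denoising model to learn, from the noisy image $x_t$, to predict the noise $\epsilon_\theta$ and the denoised clear image $\hat{x}_0$. The prediction process follows the formula:

[0023] $$\hat{x}_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$

[0024] In the formula, $\hat{x}_0$ denotes the original image estimate obtained by reverse denoising, $\bar{\alpha}_t$ is the cumulative noise proportion coefficient, and $\epsilon_\theta(x_t, t)$ represents the predicted noise;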

[0025] S2-4. Based on the reverse denoising process, a clear image x0 is generated by progressively denoising. The denoising calculation at each time step t follows the following formula:

[0026] $$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$

[0027] In the formula, $p_\theta(x_{t-1} \mid x_t)$ represents the probability distribution of generating the previous-time-step image $x_{t-1}$ given the current image $x_t$, $\mu_\theta(x_t, t)$ represents the denoised mean, and $\Sigma_\theta(x_t, t)$ represents the denoised variance;

[0028] S2-5. The mean squared error (MSE) loss function L is used to calculate the error between the predicted noise and the actual noise, and the neural network parameters θ are optimized. The loss function is defined as follows:

[0029] $$L(\theta) = \mathbb{E}_{x_0,\, t,\, \epsilon}\left[\left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2\right]$$

[0030] In the formula, $L(\theta)$ represents the training loss function of the model, which depends on the network parameters $\theta$; $\epsilon$ represents the real Gaussian noise; $\epsilon_\theta(x_t, t)$ represents the noise prediction output by the neural network, which takes $x_t$ and the time step $t$ as input and outputs an estimate of $\epsilon$; and $\mathbb{E}_{x_0, t, \epsilon}$ represents the expectation over the joint distribution of the original image $x_0$, time step $t$, and noise $\epsilon$.
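The training quantities in steps S2-2 to S2-5 can be illustrated with a dependency-free sketch. It assumes the standard parameterization from the denoising diffusion probabilistic model literature (a noisy sample is a weighted sum of the clean image and Gaussian noise, with `alpha_bar[t]` the cumulative product of `1 - beta`). The linear beta schedule, the flat-list image representation, and all function names are assumptions for illustration; no neural network is trained here.

```python
import math
import random


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: draw a noisy x_t from x_0 and return (x_t, noise)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = alpha_bar[t]
    xt = [math.sqrt(a) * p + math.sqrt(1.0 - a) * e for p, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, t, alpha_bar):
    """Invert the forward formula using a predicted noise vector."""
    a = alpha_bar[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_pred)]


def mse_loss(eps, eps_pred):
    """Mean squared error between real and predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps, eps_pred)) / len(eps)
```

If the predicted noise equals the noise that was actually added, `estimate_x0` returns the original pixels up to floating-point error and `mse_loss` is zero, which is the optimum the training objective drives toward.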

[0031] Further, step S3 includes the following steps:

[0032] S3-1. The traffic event information extracted from the image is divided into four categories: traffic accidents, traffic congestion, road construction, and debris spills.

[0033] S3-2. For the vehicle information extracted from the image, extract its features, including the number of vehicles $N_{vehicle}$, the number of vehicles involved in the accident $N_{accidentvehicle}$, and the vehicle types, where:

[0034] $$N_{vehicle},\ N_{accidentvehicle} \in \mathbb{N}$$

[0035] and $N_{accidentvehicle} \le N_{vehicle}$;

[0036] where $\mathbb{N}$ represents the set of non-negative integers;

[0037] The road conditions extracted from the image include the road segment type $T_{road}$, the number of lanes in the road section $L_{total}$, the number of lanes where the accident occurred $L_{accident}$, whether there is traffic congestion within the road section $L_{congestion}$, and whether there are risk factors $D_{road}$ that affect road traffic. The definitions are as follows:

[0038] $T_{road}$ = {highways, urban expressways, rural roads, roundabouts, tunnels, bridges, ramps, service areas, construction zones}

[0039]

[0040] $D_{road}$ = {road collapses, rockfalls, potholes, obstacles, landslides, icing, snow accumulation}

[0041] S3-3. The personnel information extracted from the image includes whether there are pedestrians on the road surface $P_{exist}$, the number of pedestrians $P_{number}$, and the casualties $P_{injured}$. The constraints are defined as follows:

[0042]

[0043] When $P_{exist} = 1$,

[0044]

[0045] and $P_{injured} \le P_{number}$

[0046] S3-4. The weather conditions $F_W$ extracted from the image include daytime $T_{Day}$, nighttime $T_{Night}$, snow $W_{snow}$, rain $W_{rain}$, sunny $W_{sunny}$, and fog $W_{fog}$. The definition is as follows:

[0047] $F_W = \{T_{Day},\ T_{Night},\ W_{snow},\ W_{rain},\ W_{sunny},\ W_{fog}\}$

[0048] S3-5. Whether an accident occurred, $T_{accident}$, is extracted from the image. The definition is as follows:

[0049]

[0050] S3-6. Store traffic incident information, vehicle information, road conditions, personnel information, weather conditions, and whether an accident has occurred in a structured manner in the database.
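The structured record of step S3 and the two constraints stated in the text (accident vehicles do not exceed total vehicles, casualties do not exceed the pedestrian count) can be sketched as a validated data class. The class name, field names, and English category strings are hypothetical; only the constraints themselves come from the text.

```python
from dataclasses import dataclass

ROAD_TYPES = {
    "highway", "urban expressway", "rural road", "roundabout", "tunnel",
    "bridge", "ramp", "service area", "construction zone",
}
WEATHER = {"snow", "rain", "sunny", "fog"}


@dataclass
class TrafficRecord:
    n_vehicle: int            # number of vehicles in the image
    n_accident_vehicle: int   # number of vehicles involved in the accident
    road_type: str            # one of ROAD_TYPES
    p_exist: bool             # pedestrians present on the road surface
    p_number: int             # number of pedestrians
    p_injured: int            # casualties
    daytime: bool             # True for daytime, False for nighttime
    weather: str              # one of WEATHER
    accident: bool            # whether an accident occurred

    def __post_init__(self):
        counts = (self.n_vehicle, self.n_accident_vehicle,
                  self.p_number, self.p_injured)
        if any(c < 0 for c in counts):
            raise ValueError("counts must be non-negative integers")
        if self.n_accident_vehicle > self.n_vehicle:
            raise ValueError("accident vehicles cannot exceed total vehicles")
        if self.p_injured > self.p_number:
            raise ValueError("casualties cannot exceed pedestrian count")
        if self.road_type not in ROAD_TYPES:
            raise ValueError(f"unknown road type: {self.road_type}")
        if self.weather not in WEATHER:
            raise ValueError(f"unknown weather: {self.weather}")
```

Validating at construction time means a malformed model reply is rejected before it reaches the database rather than being stored and discovered later.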

[0051] Furthermore, in step S4, the traffic event type is inferred and determined by combining the data stored in step S3 and utilizing the locally deployed large language model, including:

[0052] Analyze the vehicle collision locations in traffic accidents to determine whether a rear-end collision, side collision, head-on collision, side impact, or rollover accident occurred.

[0053] Analyze the number of vehicles involved in the accident to determine whether it is a single-vehicle accident, a two-vehicle accident, or a multi-vehicle chain collision, and infer the cause of the accident by combining the vehicle's driving direction and the collision sequence.

[0054] For traffic congestion, determine whether traffic congestion has occurred on the road surface; for road construction, determine the number of lanes occupied by road construction; for debris spillage, determine the volume and type of debris, and infer the degree of impact of debris on road traffic; based on the obtained weather condition data, determine whether the traffic accident was caused by extreme weather factors; based on the obtained road condition data, determine whether the traffic accident was caused by abnormal road factors such as road collapse, rockfall, water accumulation, or snow accumulation.
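The vehicle-count rule in step S4 is simple enough to state directly. The sketch below covers only that rule; the function name is hypothetical, and in the described method this judgment is made by the locally deployed large language model rather than by fixed code.

```python
def classify_by_vehicle_count(n_accident_vehicle: int) -> str:
    """Map the number of vehicles involved to the accident category in the text."""
    if n_accident_vehicle < 1:
        raise ValueError("an accident involves at least one vehicle")
    if n_accident_vehicle == 1:
        return "single-vehicle accident"
    if n_accident_vehicle == 2:
        return "two-vehicle accident"
    return "multi-vehicle chain collision"
```

A deterministic rule like this could serve as a cross-check on the model's answer, for instance by flagging replies whose category disagrees with the extracted vehicle count.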

[0055] Furthermore, in step S5, the severity of the traffic accident is determined by combining the large language model, and the accident is classified into minor accidents, general accidents, serious accidents, and major accidents. The severity of the accident follows the formula below:

[0056]

[0057] where,

[0058]

[0059] The accident severity type $T_D$ is defined based on the accident severity value $D$ and follows the formula below:

[0060]

[0061] In the formula, $T_D$ indicates the accident severity type, $D$ indicates the accident severity value, $P_{injured}$ indicates casualties, $L_{total}$ represents the total number of lanes in the image segment, $L_{accident}$ represents the number of lanes affected by the accident in the image segment, $\alpha_i$ represents the impact coefficient of each type of vehicle involved in the accident and is paired with the number of accident vehicles of that vehicle type, $L_{congestion}$ represents the road congestion situation in the image segment, $T_{accident}$ indicates whether a traffic accident has occurred in the road segment of the image, and $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$ are the weight coefficients of each influencing factor.

[0062] Furthermore, in step S7, the accident identification and early warning system is adaptively optimized by combining user feedback and historical early warning effects. The system first collects user feedback data on early warning information, including false alarms and missed alarms, and establishes an early warning dataset to compare the model's early warning results with the actual accident situation. Combining the self-supervised learning ability of the large language model, the system continuously optimizes the model's performance in tasks such as semantic information extraction from traffic accident images, image feature association reasoning, and accident category determination. Through multiple rounds of iterative training, the system enhances its ability to identify and warn of traffic accidents under complex conditions such as different road environments, weather conditions, and lighting conditions.
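The comparison between early warning results and actual accidents in step S7 can be summarized with two rates. This is a minimal sketch: the `Feedback` record, its field names, and the choice of denominators (warnings issued for the false-alarm rate, confirmed incidents for the miss rate) are assumptions, since the text names false alarms and missed alarms but not how they are scored.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Feedback:
    warned: bool      # the system issued a warning for this scene
    confirmed: bool   # the user confirmed a real incident in this scene


def feedback_rates(items: Iterable[Feedback]) -> Dict[str, float]:
    """Compute false-alarm and miss rates from user feedback records."""
    items = list(items)
    warned = sum(1 for f in items if f.warned)
    actual = sum(1 for f in items if f.confirmed)
    false_alarms = sum(1 for f in items if f.warned and not f.confirmed)
    missed = sum(1 for f in items if f.confirmed and not f.warned)
    return {
        "false_alarm_rate": false_alarms / warned if warned else 0.0,
        "miss_rate": missed / actual if actual else 0.0,
    }
```

Tracking these two rates across iterations gives a simple signal of whether a round of optimization improved the warnings or merely traded one error type for the other.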

[0063] Furthermore, the AITI-Agent architecture is adopted, and through task division, the information extracted in step S3, the reasoning and judgment of different traffic events in step S4, the determination of the severity of traffic accidents in step S5, and the maintenance task in step S7 are assigned to the corresponding large language models for processing.
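The task division described above can be pictured as a small router that hands each task to its assigned model. This is only an illustrative sketch: the `AgentRouter` class, the task names, and the use of plain callables in place of real model clients are all hypothetical and do not reflect the AITI-Agent's actual implementation.

```python
from typing import Callable, Dict

# One task per step that the text assigns to a large language model.
TASKS = ("extract", "reason", "severity", "maintain")  # S3, S4, S5, S7


class AgentRouter:
    """Route each task to the handler (model wrapper) registered for it."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def register(self, task: str, handler: Callable[[dict], dict]) -> None:
        if task not in TASKS:
            raise ValueError(f"unknown task: {task}")
        self._handlers[task] = handler

    def run(self, task: str, payload: dict) -> dict:
        if task not in self._handlers:
            raise KeyError(f"no model registered for task: {task}")
        return self._handlers[task](payload)
```

In use, a wrapper around each model would be registered once, for example `router.register("extract", call_cloud_model)`, where `call_cloud_model` is a placeholder for whatever client function the deployment provides.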

[0064] The system for implementing the all-weather traffic incident identification method based on multi-AI large model joint decision-making includes:

[0065] The image data acquisition module collects traffic image data in real time, around the clock, through front-end camera equipment, and transmits the raw image data to the image enhancement module;

[0066] The image enhancement module is used to perform image sharpening preprocessing on the acquired images to improve image quality and recognizability, and then transmits the enhanced image data to the deep semantic feature extraction module.

[0067] The deep semantic feature extraction module is used to perform deep semantic analysis and feature extraction on the acquired traffic images, generate semantic feature information containing vehicles, pedestrians, traffic facilities and traffic events, and store the acquired information in the form of structured data. The semantic feature information is then transmitted to the traffic event intelligent reasoning module.

[0068] The traffic incident intelligent reasoning module is used to intelligently determine whether a traffic incident exists in the current scene and send the identification results to the accident severity determination module;

[0069] The accident severity determination module is used to determine the severity of an incident based on the identified traffic incident type, scope of impact, number of vehicles involved, and the reasoning ability of a large language model, and to provide a reference for subsequent emergency response.

[0070] The accident early warning module provides real-time early warning information to relevant traffic management departments based on the accident assessment results, thereby improving traffic safety response efficiency.

[0071] The system maintenance module is used to monitor the operating status of each module of the system, record logs, update models regularly, and optimize strategies to ensure the stability, accuracy, and scalability of the system.

[0072] The modules mentioned above exchange data and control signals through a high-speed data bus and a unified information interaction interface to ensure the high efficiency and real-time performance of the system.

[0073] The present invention provides a computer device comprising: a memory and a processor, and a computer program stored in the memory. When the computer program is executed on the processor, it implements the aforementioned all-weather traffic event identification method based on joint decision-making of multiple AI models.

[0074] Compared with existing technologies, the beneficial effects of the present invention are as follows:

[0075] This invention, based on an AI-Agent architecture, integrates multiple large language models for collaborative reasoning to intelligently analyze traffic incident images. By reasoning from deep semantic information extracted from images, it extracts data such as accident type and severity in real time, achieving accurate traffic accident detection and early warning. Simultaneously, an adaptive optimization mechanism based on historical data and user feedback is designed to continuously improve the accuracy of event judgment and the effectiveness of early warning. This invention can provide traffic management departments with more intelligent and efficient traffic incident monitoring and dispatch support, improving road safety management and possessing broad application value.

Attached Figure Description

[0076] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the following description is provided with accompanying drawings of the relevant technical solutions in the embodiments of the present invention or the prior art. It should be understood that the accompanying drawings described below are only for the purpose of clearly illustrating some embodiments of the technical solutions of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0077] Figure 1 is a flowchart of the all-weather traffic incident identification method based on joint decision-making of multiple AI models in the embodiment.

[0078] Figure 2 is a schematic diagram of the training of the image enhancement model in the embodiment.

[0079] Figure 3 is a schematic diagram of the structured database for extracting semantic information from traffic event images based on the integration of Gemini 2.0 and GPT-4o in the embodiment.

[0080] Figure 4 is a schematic diagram of the interface for determining the severity of a traffic accident in the embodiment.

[0081] Figure 5 is a schematic diagram of the database and front-end structure in an embodiment of the present invention.

Detailed Implementation

[0082] The present invention will be further described in detail below with reference to the embodiments and accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0083] Example 1

[0084] As shown in Figures 1-5, this embodiment provides an all-weather traffic incident identification method based on joint decision-making of multiple AI models. It integrates different large language models and divides tasks among them by constructing an AITI-Agent (Artificial Intelligent Traffic Incident-Agent) architecture, in which each model performs specific tasks based on its own strengths. Through interaction with a traffic data platform, real-time traffic incident identification and early warning are achieved, improving road safety management efficiency. The method specifically includes the following steps:

[0085] Step 1: Use road traffic monitoring camera equipment to collect real-time images of road traffic conditions around the clock, providing data support for subsequent analysis, including:

[0086] The system utilizes surveillance cameras to collect images of various traffic incidents, covering a wide range of road types and times under different weather conditions.

[0087] As one example, the classification of different time periods includes: morning peak hours (7:00-9:00), midday hours (11:00-13:00), evening peak hours (17:00-19:00), nighttime hours (19:00-6:00), and off-peak hours (6:00-7:00, 9:00-11:00, 13:00-17:00). Classification of different weather conditions includes: sunny, cloudy, rainy, foggy, and snowy. Classification of traffic roads includes: highways, urban expressways, rural roads, roundabouts, tunnels, bridges, ramps, service areas, and construction zones, ensuring the system has the ability to identify and issue warnings for traffic events under different road conditions.

[0088] Step 2: Use a diffusion model to denoise the acquired traffic images and construct an image enhancement module suitable for traffic images.

[0089] As shown in Figure 2, an image enhancement module based on the diffusion model is constructed. The process of denoising the acquired traffic images by training the diffusion model includes the following steps:

[0090] First, training samples are constructed using a traffic event image dataset containing images of different weather conditions, time periods, road sections, and viewpoints. The original clear image x0 is then normalized. Next, Gaussian noise is progressively added to the clear image x0 based on a forward diffusion process to simulate different degrees of image distortion. The noise addition process follows the formula below:

[0091] $$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right)$$

[0092] In the formula, $q(x_t \mid x_0)$ represents the probability distribution of $x_t$ given $x_0$, $x_t$ represents the noisy image at time step $t$, $\bar{\alpha}_t$ represents the cumulative noise proportion coefficient, $I$ is the identity matrix, and $\mathcal{N}$ is a Gaussian distribution.

[0093] Next, the denoising model is trained to learn, from the noisy image $x_t$, to predict the noise $\epsilon_\theta$ and the denoised clear image $\hat{x}_0$. The prediction process follows the formula:

[0094] $$\hat{x}_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$

[0095] In the formula, $\hat{x}_0$ denotes the original image estimate obtained by reverse denoising, $\bar{\alpha}_t$ is the cumulative noise proportion coefficient, and $\epsilon_\theta(x_t, t)$ represents the noise predicted by the neural network (Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models[J]. Advances in neural information processing systems, 2020, 33: 6840-6851.).

[0096] Based on the inverse denoising process, a clear image x0 is generated by progressively denoising. The denoising calculation at each time step t follows the formula below:

[0097] $$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$

[0098] In the formula, $p_\theta(x_{t-1} \mid x_t)$ represents the probability distribution of generating the previous-time-step image $x_{t-1}$ given the current image $x_t$, $\mu_\theta(x_t, t)$ represents the denoised mean, and $\Sigma_\theta(x_t, t)$ represents the denoised variance;

[0099] Finally, the mean squared error (MSE) loss function L is used to calculate the error between the predicted noise and the actual noise, and the neural network parameters θ are optimized. The loss function is defined as follows:

[0100] $$L(\theta) = \mathbb{E}_{x_0,\, t,\, \epsilon}\left[\left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2\right]$$

[0101] In the formula, $L(\theta)$ represents the training loss function of the model, which depends on the network parameters $\theta$; $\epsilon$ represents the real Gaussian noise; $\epsilon_\theta(x_t, t)$ represents the noise prediction output by the neural network, which takes $x_t$ and the time step $t$ as input and outputs an estimate of $\epsilon$; and $\mathbb{E}_{x_0, t, \epsilon}$ represents the expectation over the joint distribution of the original image $x_0$, time step $t$, and noise $\epsilon$;

[0102] After the model is trained according to the above training process, it can perform noise reduction processing on the acquired traffic images, remove environmental noise, increase image details, restore image clarity under complex weather conditions, and improve image quality.
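Once trained, the model is applied through the reverse process, one step at a time. The sketch below shows a single reverse step under two assumptions that the text does not state: the mean is computed with the parameterization from the cited Ho et al. paper, and the variance is fixed to `beta_t` rather than learned. Images are flat lists of floats, and `eps_pred` stands in for the network's noise prediction.

```python
import math
import random


def reverse_step(xt, eps_pred, t, betas, rng):
    """One reverse denoising step x_t -> x_{t-1} with fixed variance beta_t."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    # Cumulative product of (1 - beta) up to and including step t.
    alpha_bar_t = math.prod(1.0 - b for b in betas[: t + 1])
    coef = beta_t / math.sqrt(1.0 - alpha_bar_t)
    mean = [(x - coef * e) / math.sqrt(alpha_t) for x, e in zip(xt, eps_pred)]
    if t == 0:
        return mean  # final step: no noise is added
    sigma = math.sqrt(beta_t)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
```

Denoising a camera frame would repeat this step from the largest time step down to zero, calling the trained network for `eps_pred` at every step.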

[0103] Step 3: Use large language models (such as Gemini 2.0 and GPT-4o) to make joint decisions and perform deep analysis on the obtained high-definition traffic operation images to further extract structured data information such as traffic events, vehicle information, road conditions, personnel information, weather conditions, shooting location, and whether an accident has occurred.

[0104] Traffic incidents are categorized into four types: traffic accidents, traffic congestion, road construction, and debris spills. For vehicle information extracted from images, their physical features are extracted, including the number of vehicles $N_{vehicle}$, the number of vehicles involved in the accident $N_{accidentvehicle}$, and the vehicle types, where:

[0105] $$N_{vehicle},\ N_{accidentvehicle} \in \mathbb{N}$$

[0106] and $N_{accidentvehicle} \le N_{vehicle}$;

[0107] where $\mathbb{N}$ represents the set of non-negative integers.

[0108] The road conditions extracted from the image include the road segment type $T_{road}$, the number of lanes in the road section $L_{total}$, the number of lanes where the accident occurred $L_{accident}$, whether there is traffic congestion within the road section $L_{congestion}$, and whether there are risk factors $D_{road}$ that affect road traffic. The definitions are as follows:

[0109] $T_{road}$ = {highways, urban expressways, rural roads, roundabouts, tunnels, bridges, ramps, service areas, construction zones}

[0110]

[0111] $D_{road}$ = {road collapses, rockfalls, potholes, obstacles, landslides, icing, snow accumulation}

[0112] The personnel information extracted from the image includes whether there are pedestrians on the road surface $P_{exist}$, the number of pedestrians $P_{number}$, and the casualties $P_{injured}$. The constraints are defined as follows:

[0113]

[0114] When $P_{exist} = 1$,

[0115]

[0116] and $P_{injured} \le P_{number}$

[0117] The weather conditions $F_W$ extracted from the image include daytime $T_{Day}$, nighttime $T_{Night}$, snow $W_{snow}$, rain $W_{rain}$, sunny $W_{sunny}$, and fog $W_{fog}$. The definition is as follows:

[0118] $F_W = \{T_{Day},\ T_{Night},\ W_{snow},\ W_{rain},\ W_{sunny},\ W_{fog}\}$

[0119] Whether an accident occurred, $T_{accident}$, is extracted from the image. The definition is as follows:

[0120]

[0121] Finally, information such as traffic incidents, vehicle information, road conditions, personnel information, weather conditions, shooting locations, and whether an accident has occurred are stored in the database in a structured manner. Among them, the structured data of traffic accident semantic information extracted based on Gemini2.0 is shown in Table 1.

[0122] Table 1. Structured data extracted from traffic accident semantic information based on Gemini 2.0

[0123]

[0124] The defined database is built on the SQLAlchemy framework to implement structured storage of traffic scene data. The system connects to a local MySQL database and creates a data table named "Traffic Structured Data". The table contains several fields, including: an auto-incrementing "serial number" to uniquely identify each record; an "image" field to store the traffic image path; descriptions of "vehicle information", "road conditions", "personnel information", and "weather conditions" detected in the image; annotations of the image's shooting time and location; a Boolean field recording whether the scene involves a traffic accident; and an automatically generated record timestamp field. This database supports subsequent structured analysis, querying, and visualization processing of traffic image data, and has good scalability and versatility. The database definition code segment is shown in Table 2.

[0125] Table 2. Database model definition code segment

[0126]
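As a rough illustration of the table layout described in [0124], the sketch below uses Python's standard-library `sqlite3` module as a dependency-free stand-in for the SQLAlchemy and MySQL setup named in the text. The table name `traffic_structured_data` and all column names are hypothetical English renderings of the described fields, not the patent's actual listing.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS traffic_structured_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    image_path TEXT NOT NULL,
    vehicle_info TEXT,
    road_condition TEXT,
    person_info TEXT,
    weather TEXT,
    shooting_time TEXT,
    shooting_location TEXT,
    has_accident INTEGER NOT NULL DEFAULT 0,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""

OPTIONAL_COLUMNS = {
    "vehicle_info", "road_condition", "person_info",
    "weather", "shooting_time", "shooting_location",
}


def open_db(path=":memory:"):
    """Open the database and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_record(conn, image_path, has_accident, **fields):
    """Insert one structured record and return its auto-incremented id."""
    unknown = set(fields) - OPTIONAL_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    # Column names are validated above, so only values need placeholders.
    cols = ["image_path", "has_accident", *fields]
    marks = ", ".join("?" for _ in cols)
    cur = conn.execute(
        f"INSERT INTO traffic_structured_data ({', '.join(cols)}) VALUES ({marks})",
        [image_path, int(has_accident), *fields.values()],
    )
    conn.commit()
    return cur.lastrowid
```

The serial number and the record timestamp are filled in by the database itself, matching the auto-incrementing and automatically generated fields that the text describes.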

[0127] This document describes the process of automatically semantically parsing traffic images using the Gemini 2.0 model and writing the analysis results into a database. The process comprises two main steps: First, a traffic image analysis function is constructed, taking the image as input and sending prompts to the Gemini 2.0 model. The model is instructed to act as a traffic analysis expert, performing structured parsing of the image content from multiple dimensions, including vehicle information, road conditions, pedestrian information, weather conditions, shooting location, and whether an accident has occurred. The model returns results in JSON format, which are then parsed and extracted. Next, the system calls a database writing function to encapsulate the structured information and image path into a single record and store it in the defined "Traffic Structured Data" table. This process achieves a closed-loop operation of automatic image understanding and structured database storage, providing data support for subsequent intelligent retrieval and traffic risk assessment. The code segment for calling Gemini 2.0 to parse images and store them in the database is shown in Table 3.

[0128] Table 3 shows the code segment that calls Gemini 2.0 to parse images and store them in the database.

[0129]

[0130]
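The code of Table 3 is likewise not reproduced in this text. The two steps described in [0127] can be sketched as follows, under stated assumptions: the model call is represented by a caller-supplied function `ask_model` rather than any real Gemini client API, the database write is represented by appending to a list, and the field names and prompt wording are hypothetical.

```python
import json

# Hypothetical field names for the dimensions listed in the text.
REQUIRED_FIELDS = (
    "vehicle_info", "road_conditions", "pedestrian_info",
    "weather_conditions", "shooting_location", "is_accident",
)

# Hypothetical prompt wording; the original prompt is not given in the text.
PROMPT = (
    "You are a traffic analysis expert. Describe the image as a JSON object "
    "with exactly these keys: " + ", ".join(REQUIRED_FIELDS) + "."
)


def parse_model_reply(reply_text):
    """Extract the JSON object from a model reply and check its fields."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply_text[start:end + 1])
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if not isinstance(data["is_accident"], bool):
        raise ValueError("is_accident must be a JSON boolean")
    return data


def analyze_and_store(image_path, ask_model, records):
    """Query the model for one image and append the structured record."""
    reply = ask_model(PROMPT, image_path)   # caller-supplied model client
    record = {"image_path": image_path, **parse_model_reply(reply)}
    records.append(record)                  # stand-in for the database write
    return record
```

Slicing between the first `{` and the last `}` tolerates replies that wrap the JSON in a code fence or surrounding prose, which is a common reason for direct `json.loads` calls on model output to fail.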

[0131] Figure 3 shows the database visualization interface, which has four buttons: "Upload Image," "Extract Information Based on AI Large Language Model," "Save to Database," and "Accident Severity Assessment." Clicking "Upload Image" lets the user upload a traffic image from video surveillance. Clicking "Extract Information Based on AI Large Language Model" then uses the large language model to analyze the uploaded image and save it to the database. In the middle of the interface, the left pane displays the traffic information that the large language model extracted from the uploaded image, and the right pane displays the information that the model output for the specified fields. The table at the bottom shows the structured data for the specified fields.

[0132] Step 4: The traffic event types extracted in step S3 are shown in Table 4. For traffic accidents, the images and corresponding data identified as "an accident has occurred" are passed to the locally deployed llava:34b-v1.6-fp16 large language model, which infers the traffic accident type from the manner of collision. For traffic congestion, traffic flow characteristics are used to determine whether congestion has occurred in the image. For road construction, the cause of the construction is determined in order to infer whether it is temporary or long-term. For spilled material, the volume and type of the material are determined in order to predict whether it may cause danger.

[0133] Table 4 Types of Traffic Incidents

[0134]

[0135] As one example, for the identified accident images, the llava:34b-v1.6-fp16 model infers the traffic accident type based on the vehicle collision method, specifically including:

[0136] 1. Rear-end collision: Determine whether the vehicles involved in the accident collided longitudinally, in particular whether one struck the rear of the vehicle in front.

[0137] 2. Side collision: Analyze whether the accident vehicle was struck on its side, for example through contact with a vehicle in an adjacent lane.

[0138] 3. Oncoming collision: Determine whether the accident occurred in the oncoming lane, including head-on collisions.

[0139] 4. Side impact: Analyze whether the vehicle was damaged by a sideways impact with an obstacle (such as a guardrail or barrier).

[0140] 5. Rollover accident: Check whether the accident vehicle overturned or rolled over, causing damage.

[0141] The llava:34b-v1.6-fp16 model categorizes accidents as single-vehicle accidents, two-vehicle accidents, or multi-vehicle chain collisions according to the number of vehicles involved. It also determines, based on road condition data, whether extreme weather conditions caused the traffic accident, and it assesses whether road collapse, rockfall, flooding, snow accumulation, or other abnormal road conditions caused the accident. The traffic accident types and causes are shown in Table 5.
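In the embodiment these judgments are made by the llava:34b-v1.6-fp16 model. The following sketch only illustrates how the categories named above could be represented and how a free-text label returned by a model could be checked against them. The function names are hypothetical, and treating three or more vehicles as a multi-vehicle chain collision is an assumption, since the text does not state the boundary.

```python
from enum import Enum


class CollisionType(Enum):
    """Collision categories named in the text."""
    REAR_END = "rear-end collision"
    SIDE_COLLISION = "side collision"
    ONCOMING = "oncoming collision"
    SIDE_IMPACT = "side impact"
    ROLLOVER = "rollover accident"


def vehicle_count_category(vehicles_involved):
    """Categorize an accident by the number of vehicles involved."""
    if vehicles_involved < 1:
        raise ValueError("an accident involves at least one vehicle")
    if vehicles_involved == 1:
        return "single-vehicle accident"
    if vehicles_involved == 2:
        return "two-vehicle accident"
    # Assumption: three or more vehicles count as a chain collision.
    return "multi-vehicle chain collision"


def match_collision_type(label):
    """Map a free-text model label onto a known collision type."""
    normalized = label.strip().lower()
    for collision_type in CollisionType:
        if collision_type.value == normalized:
            return collision_type
    raise ValueError("unrecognized collision type: " + label)
```

Rejecting unrecognized labels, instead of storing them as returned, keeps the accident-type column restricted to the five categories listed.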

[0142] Table 5 Types and Causes of Traffic Accidents

[0143]

[0144] Step 5: Use the DeepSeek large model to determine the severity of the accident. Based on the damage to the vehicles, the number of vehicles involved, and the injuries to persons, the accident is classified as a minor accident, general accident, serious accident, or major accident.

[0145] The severity of an accident is determined by the following formula:

[0146]

[0147] Based on the accident information, whether an accident has occurred within the road section, $T_{accident}$, is defined as follows:

[0148]

[0149] Based on the traffic flow information, whether there is congestion within the road section, $L_{congestion}$, is defined as follows:

[0150]

[0151] Based on the type of vehicle involved in the accident, $\alpha_i$ is defined as follows:

[0152]

[0153] Based on the above formula, the accident severity type $T_D$ is defined according to the accident severity value $D$ by the following formula:

[0154]

[0155] In the formulas, $T_D$ denotes the accident severity type; $D$ denotes the accident severity; $P_{injured}$ denotes casualties; $L_{total}$ denotes the total number of lanes in the road section shown in the image; $L_{accident}$ denotes the number of lanes affected by the accident in that road section; $\alpha_i$ denotes the impact coefficient of the $i$-th type of vehicle involved in the accident, and the associated count denotes the number of accident vehicles of that vehicle type; $L_{congestion}$ denotes the road congestion status of the road section; $T_{accident}$ denotes whether a traffic accident has occurred in the road section; and $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$ are the weight coefficients of the influencing factors.

[0156] The weighting coefficients ω1, ω2, ω3, and ω4 are 0.4, 0.2, 0.3, and 0.1, respectively. Based on the data obtained in step 3, the results of the accident severity determination are shown in Table 6.

[0157] Table 6. Results of Accident Severity Assessment

[0158]
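The severity formula and the thresholds that map the severity value to a severity type are not reproduced in this text. The following is therefore only an illustrative sketch of a weighted combination of the listed factors, using the weights 0.4, 0.2, 0.3, and 0.1 given above. The functional form, the assignment of each weight to a factor, the example impact coefficients, and the thresholds are all assumptions and should not be read as the patented formula.

```python
WEIGHTS = (0.4, 0.2, 0.3, 0.1)  # weight coefficients as given in the text

SEVERITY_LABELS = (
    "minor accident", "general accident", "serious accident", "major accident",
)


def severity_score(injured, lanes_total, lanes_affected, vehicles_by_type,
                   impact_coeff, congested, accident_occurred, weights=WEIGHTS):
    """Hypothetical weighted severity value; returns 0.0 if no accident occurred."""
    if not accident_occurred:
        return 0.0
    if lanes_total <= 0:
        raise ValueError("lanes_total must be positive")
    w1, w2, w3, w4 = weights
    lane_ratio = lanes_affected / lanes_total
    # Sum of (impact coefficient x number of accident vehicles) over vehicle types.
    vehicle_term = sum(impact_coeff[kind] * count
                       for kind, count in vehicles_by_type.items())
    congestion_term = 1.0 if congested else 0.0
    return w1 * injured + w2 * lane_ratio + w3 * vehicle_term + w4 * congestion_term


def severity_type(score, thresholds=(1.0, 2.0, 3.0)):
    """Map a severity value to one of four levels using hypothetical thresholds."""
    for limit, label in zip(thresholds, SEVERITY_LABELS):
        if score < limit:
            return label
    return SEVERITY_LABELS[-1]
```

For example, one injured person, two of four lanes affected, two cars with an assumed coefficient of 1.0, and congestion present give 0.4 + 0.1 + 0.6 + 0.1 = 1.2 under these assumptions.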

[0159] Figure 4 shows the accident severity determination interface. Clicking the "Accident Severity Assessment" button in Figure 3 automatically redirects to this interface. The table lists the parameters used to judge the severity of the accident, and the calculated accident severity type is displayed at the bottom of the interface.

[0160] Step 6: Push real-time early warning information on accident type and severity to traffic management departments. By pushing key information such as collected accident images and data to traffic management departments in real time, relevant departments can quickly take response measures and handle emergencies.
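The text does not specify the message format or the transport used for pushing warnings. As a minimal sketch under those assumptions, the following builds a JSON warning message from the accident type, severity, image reference, and location, and hands it to a caller-supplied `send` function. All field names are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_warning(image_path, event_type, severity, location, issued_at=None):
    """Serialize one early warning as a JSON string (hypothetical fields)."""
    if issued_at is None:
        issued_at = datetime.now(timezone.utc)
    message = {
        "image_path": image_path,
        "event_type": event_type,
        "severity": severity,
        "location": location,
        "issued_at": issued_at.isoformat(),
    }
    return json.dumps(message, ensure_ascii=False, sort_keys=True)


def push_warning(payload, send):
    """Hand the payload to a caller-supplied transport and report success."""
    try:
        send(payload)
    except OSError:
        # Network-level failures are reported to the caller instead of raised.
        return False
    return True
```

Keeping the transport as a parameter lets the same message be delivered over whatever channel the traffic management department actually uses.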

[0161] Step 7: Through user feedback and analysis of historical warning effects, combined with large language models, the traffic incident image recognition and warning system is adaptively optimized. The capabilities of each large language model in different aspects such as image semantic information extraction, image information reasoning and judgment are continuously strengthened to improve the accuracy of traffic incident recognition and warning.

[0162] Based on user feedback and historical early warning results, the traffic incident identification and early warning system is adaptively optimized, specifically including the following steps:

[0163] 1. Data Collection and Processing

[0164] The system collects user feedback data on early warning information through an interactive interface. The feedback data includes false alarm information and missed alarm information. At the same time, the system collects historical early warning effect data and establishes an early warning dataset for analyzing the difference between model early warning results and actual accident situations.
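The comparison between early warning results and actual accident situations can be sketched as a simple tally. In this sketch each feedback entry is assumed to be a pair (warning issued, accident actually occurred); this data layout and the two rate definitions are assumptions, since the text does not define them.

```python
from collections import Counter


def summarize_feedback(entries):
    """Tally false and missed alarms from (warned, actual) Boolean pairs."""
    counts = Counter()
    for warned, actual in entries:
        if warned and actual:
            counts["correct_alarm"] += 1
        elif warned and not actual:
            counts["false_alarm"] += 1
        elif actual:
            counts["missed_alarm"] += 1
        else:
            counts["correct_silence"] += 1
    warned_total = counts["correct_alarm"] + counts["false_alarm"]
    actual_total = counts["correct_alarm"] + counts["missed_alarm"]
    return {
        # Share of issued warnings that turned out to be false.
        "false_alarm_rate": counts["false_alarm"] / warned_total if warned_total else 0.0,
        # Share of real accidents for which no warning was issued.
        "missed_alarm_rate": counts["missed_alarm"] / actual_total if actual_total else 0.0,
        "counts": dict(counts),
    }
```

Tracking the two rates separately matters because they pull in opposite directions: lowering the warning threshold reduces missed alarms at the cost of more false alarms.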

[0165] 2. System maintenance and optimization

[0166] The system is iteratively optimized using the code analysis capabilities of the Claude 3 large model, gradually improving its performance under complex environmental conditions. Specific optimization tasks include:

[0167] (1) Optimization of semantic information extraction capability from traffic accident images;

[0168] (2) Optimization of image feature association reasoning ability;

[0169] (3) Optimization of the accuracy of accident category determination.

[0170] 3. Performance Evaluation and Feedback

[0171] The system periodically evaluates the performance of the optimized model and combines the evaluation results with user feedback data to form a closed-loop optimization mechanism, thereby achieving continuous adaptive optimization of the system. Figure 5 summarizes the database and front-end structure and process of this embodiment.

[0172] This embodiment combines one or more large language models for joint decision-making, constructing an intelligent architecture called AITI-Agent. By integrating multiple large language models, it fully leverages the advantages of each model to achieve collaborative operation, thereby improving the performance of the traffic event image recognition and early warning system in complex environments.

[0173] This embodiment implements a system for all-weather traffic incident recognition based on multi-AI large model joint decision-making. The system includes an image data acquisition module, an image enhancement module, a deep semantic feature extraction module, a traffic incident intelligent reasoning module, an accident severity determination module, an accident early warning module, and a system maintenance module. The above modules transmit data and exchange control signals through a high-speed data bus and a unified information interaction interface to ensure the system's high efficiency and real-time operation.

[0174] The image data acquisition module collects traffic image data around the clock and in real time through the front-end camera equipment, and transmits the raw image data to the image enhancement module;

[0175] The image enhancement module is used to perform image sharpening preprocessing on the acquired images to improve image quality and recognizability, and then transmits the enhanced image data to the deep semantic feature extraction module.

[0176] The deep semantic feature extraction module is used to perform deep semantic analysis and feature extraction on the acquired traffic images, generate semantic feature information containing vehicles, pedestrians, traffic facilities and traffic events, and store the acquired information in the form of structured data. The semantic feature information is then transmitted to the traffic event intelligent reasoning module.

[0177] The traffic incident intelligent reasoning module is used to intelligently determine whether a traffic incident exists in the current scene and send the identification results to the accident severity determination module;

[0178] The accident severity determination module is used to determine the severity of an incident based on the identified traffic incident type, scope of impact, number of vehicles involved, and the reasoning ability of a large language model, and to provide a reference for subsequent emergency response.

[0179] Based on the accident assessment results, the accident early warning module provides real-time early warning information to relevant traffic management departments, thereby improving traffic safety response efficiency.

[0180] The system maintenance module is used to monitor the operating status of each module of the system, record logs, update models regularly, and optimize strategies to ensure the stability, accuracy, and scalability of the system.
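The module chain described above, in which each module passes its output to the next, can be sketched as an ordered list of stages that read from and extend a shared context. The stage bodies below are placeholders that return fixed values; they stand in for the actual modules and carry no claim about how those modules are implemented.

```python
def run_pipeline(raw_image, stages):
    """Run the stages in order; each stage reads and extends a shared context."""
    context = {"raw_image": raw_image}
    for name, stage in stages:
        context[name] = stage(context)
    return context


# Placeholder stages, one per module after image acquisition.
STAGES = [
    # Image enhancement: returns the image unchanged in this sketch.
    ("enhanced_image", lambda ctx: ctx["raw_image"]),
    # Deep semantic feature extraction: fixed structured output.
    ("features", lambda ctx: {"vehicles": 2, "is_accident": True}),
    # Traffic incident reasoning.
    ("incident", lambda ctx: "traffic accident" if ctx["features"]["is_accident"] else None),
    # Accident severity determination.
    ("severity", lambda ctx: "minor accident" if ctx["incident"] else None),
    # Accident early warning.
    ("warning", lambda ctx: {"incident": ctx["incident"], "severity": ctx["severity"]}
                if ctx["incident"] else None),
]
```

Calling `run_pipeline("frame.jpg", STAGES)` returns a context whose keys follow the module order, which makes the intermediate result of every module available to the system maintenance module for logging.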

[0181] In summary, this invention adopts the AITI-Agent (Artificial Intelligent Traffic Incident-Agent) architecture, building a collaborative framework composed of multiple AI large language models. Through task division, sub-tasks such as deep semantic feature extraction (step S3), intelligent traffic incident reasoning (step S4), accident severity determination (step S5), and system maintenance (step S7) are assigned to the most suitable large language model for processing. AITI-Agent interacts with the traffic data platform in real time, receiving high-quality traffic images and related data processed by the image enhancement module, coordinating the execution of each sub-task and integrating the processing results, thereby achieving efficient identification and early warning of traffic incidents and improving the efficiency of road safety management.

[0182] Specifically, the system first uses a diffusion model to sharpen the acquired images. It then extracts structured semantic information from the traffic images using large language models (e.g., Gemini 2.0 and GPT-4o), uses a locally deployed large language model (e.g., llava:34b-v1.6-fp16) to determine the type of traffic incident, uses a large language model (e.g., DeepSeek) to determine the severity of traffic accidents, and uses a large language model (e.g., Claude 3) to maintain the traffic incident image recognition and early warning system. Finally, by interacting with a traffic data platform, it achieves real-time traffic accident recognition and early warning, thereby improving the efficiency of road safety management.
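The task division of the AITI-Agent architecture can be sketched as a routing table from sub-tasks to the example models named above. The task keys, the `clients` mapping of model names to callables, and the fallback order are assumptions made for illustration; no real model client API is used.

```python
# Sub-task to example models, following the assignments named in the text.
TASK_MODELS = {
    "semantic_extraction": ["Gemini 2.0", "GPT-4o"],   # step S3
    "incident_reasoning": ["llava:34b-v1.6-fp16"],     # step S4
    "severity_determination": ["DeepSeek"],            # step S5
    "system_maintenance": ["Claude 3"],                # step S7
}


def dispatch(task, payload, clients):
    """Send the payload to the first available model registered for the task."""
    if task not in TASK_MODELS:
        raise KeyError("unknown task: " + task)
    for model_name in TASK_MODELS[task]:
        client = clients.get(model_name)
        if client is not None:
            return model_name, client(payload)
    raise LookupError("no client available for task: " + task)
```

Listing more than one model for a task gives a simple fallback when the preferred model is unavailable, which is one possible way to realize the collaborative operation the embodiment describes.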

[0183] The preferred embodiments of the present invention disclosed above are merely illustrative of the invention. These preferred embodiments do not exhaustively describe all details, nor do they limit the invention to the specific implementations described. Clearly, many modifications and variations can be made based on the content of this specification. This specification selects and specifically describes these embodiments to better explain the principles and practical applications of the invention, enabling those skilled in the art to better understand and utilize the invention.

Claims

1. A method for all-weather traffic incident identification based on joint decision-making of multiple AI models, characterized in that it includes the following steps:

S1. Collecting images of road traffic operating status in real time and around the clock using road traffic monitoring camera equipment, to provide data support for subsequent analysis;

S2. Denoising the acquired traffic images using a diffusion model, including the following steps:

S2-1. Using a traffic accident image dataset covering different weather conditions, time periods, road sections, and viewpoints, constructing training samples and normalizing the original clear images $x_0$;

S2-2. Based on the forward diffusion process, gradually adding Gaussian noise to the original clear image $x_0$ to simulate different degrees of image distortion, the noise addition process following the formula:

$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right)$$

where $q(x_t \mid x_0)$ denotes the probability distribution, $x_t$ denotes the image with added noise at time step $t$, $\bar{\alpha}_t$ denotes the cumulative noise proportion, $I$ is the identity matrix, and $\mathcal{N}$ denotes a Gaussian distribution;

S2-3. Training the denoising model to predict the noise $\epsilon_\theta(x_t, t)$ and the denoised clear image $\hat{x}_0$, the prediction process following the formula:

$$\hat{x}_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$

where $\hat{x}_0$ denotes the estimate of the original image obtained by inverting the noising step, and $\epsilon_\theta(x_t, t)$ denotes the noise;

S2-4. Based on the reverse denoising process, generating a clear image through progressive denoising, the denoising calculation at each time step $t$ following the formula:

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$

where $p_\theta(x_{t-1} \mid x_t)$ denotes the probability distribution of generating the image at the previous time step $x_{t-1}$ given the current image $x_t$, $\mu_\theta(x_t, t)$ denotes the denoised mean, and $\Sigma_\theta(x_t, t)$ denotes the denoised variance;

S2-5. Using the mean squared error (MSE) loss function $L$ to calculate the error between the predicted noise and the actual noise and to optimize the neural network parameters, the loss function being defined as:

$$L(\theta) = \mathbb{E}_{x_0,\,t,\,\epsilon}\left[\left\lVert \epsilon - \epsilon_\theta(x_t, t) \right\rVert^2\right]$$

where $L(\theta)$ is the training loss function of the model, which depends on the network parameters $\theta$; $\epsilon$ denotes the true Gaussian noise; $\epsilon_\theta(x_t, t)$ denotes the noise prediction output by the neural network, whose inputs are $x_t$ and the time step $t$ and whose output is the estimate of $\epsilon$; and $\mathbb{E}_{x_0,\,t,\,\epsilon}$ denotes the expectation over the joint distribution of the original image $x_0$, the time step $t$, and the noise $\epsilon$;

S3. Using a large language model to analyze the denoised traffic images, extracting the vehicle information, road conditions, pedestrian information, weather conditions, congestion status, and accident status required for traffic event analysis, and storing this information in structured form in a database, including the following steps:

S3-1. Dividing the traffic event information extracted from the image into four categories: traffic accidents, traffic congestion, road construction, and debris spills;

S3-2. For the vehicle information extracted from the image, extracting its features, including the number of vehicles, the number of vehicles involved in the accident, and the vehicle types, where each count is a non-negative integer; for the road conditions extracted from the image, extracting the road section type, the number of lanes in the road section, the number of lanes where the accident occurred, whether there is congestion on the road section, and whether there are risk factors affecting road traffic, defined as follows:

S3-3. For the personnel information extracted from the image, determining whether there are pedestrians on the road, the number of pedestrians, and whether there are casualties, subject to constraints defined as follows:

S3-4. For the weather conditions extracted from the image, including daytime, nighttime, snow, rain, sunny, and dense fog, defined as follows:

S3-5. Determining from the image whether an accident has occurred, defined as follows:

S3-6. Storing the traffic incident information, vehicle information, road conditions, personnel information, weather conditions, and whether an accident has occurred in structured form in the database;

S4. Using locally deployed large language models to reason about and judge different traffic events;

S5. Based on the damage to vehicles, the number of vehicles involved in the accident, and the injuries to persons, determining the severity of the traffic accident as a minor accident, general accident, serious accident, or major accident;

S6. Sending real-time warning information on event types and traffic accident severity to traffic management departments;

S7. Maintaining the traffic incident image recognition and early warning system through user feedback and analysis of historical early warning effects, in combination with a large language model.

2. The all-weather traffic incident identification method based on joint decision-making of multiple AI large models according to claim 1, characterized in that, in step S1, images of various traffic events are collected using surveillance cameras, and the image data covers various types of traffic roads at different times and under different weather conditions. The time periods include: morning peak hours (7:00-9:00), midday hours (11:00-13:00), evening peak hours (17:00-19:00), nighttime hours (19:00-6:00), and off-peak hours (6:00-7:00, 9:00-11:00, 13:00-17:00). The weather conditions include: sunny, cloudy, rainy, foggy, and snowy. The traffic road categories include: highways, urban expressways, rural roads, roundabouts, tunnels, bridges, ramps, service areas, and construction sections, so that the system can identify and warn of traffic accidents in different road environments.

3. The all-weather traffic incident identification method based on joint decision-making of multiple AI large models according to claim 1, characterized in that, in step S4, the traffic event type is inferred and determined by combining the data stored in step S3 with the locally deployed large language model, including: analyzing the vehicle collision locations in traffic accidents to determine whether a rear-end collision, side collision, oncoming collision, side impact, or rollover accident occurred; analyzing the number of vehicles involved in the accident to determine whether it is a single-vehicle accident, a two-vehicle accident, or a multi-vehicle chain collision, and inferring the cause of the accident by combining the vehicles' driving directions and the collision sequence; for traffic congestion, determining whether traffic congestion has occurred on the road surface; for road construction, determining the number of lanes occupied by the construction; for debris spillage, determining the volume and type of the debris and inferring the degree of its impact on road traffic; based on the obtained road condition data, determining whether the traffic accident was caused by extreme weather factors; and based on the obtained weather condition data, determining whether the traffic accident was caused by abnormal road factors such as road collapse, rockfall, water accumulation, or snow accumulation.

4. The all-weather traffic incident identification method based on joint decision-making of multiple AI large models according to claim 1, characterized in that, in step S5, the severity of the traffic accident is determined using a large language model, and the accident is classified into four levels: minor accident, general accident, serious accident, and major accident. The accident severity $D$ follows the formula below:

According to the value of the accident severity $D$, the accident severity type $T_D$ is defined by the formula below:

In the formulas, $T_D$ denotes the accident severity type; $D$ denotes the accident severity; $P_{injured}$ denotes the casualty situation; $L_{total}$ denotes the total number of lanes in the road section shown in the image; $L_{accident}$ denotes the number of lanes affected by the accident in that road section; $\alpha_i$ denotes the impact coefficient of the $i$-th type of vehicle involved in the accident, and the associated count denotes the number of accident vehicles of that vehicle type; $L_{congestion}$ denotes the road congestion status of the road section; $T_{accident}$ denotes whether a traffic accident has occurred in the road section; and $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$ denote the weight coefficients of the influencing factors.

5. The all-weather traffic incident identification method based on joint decision-making of multiple AI large models according to claim 1, characterized in that, in step S7, the accident identification and early warning system is adaptively optimized by combining user feedback and historical early warning effects. The system first collects user feedback data on early warning information, including false alarms and missed alarms, and establishes an early warning dataset for comparing the model's early warning results with the actual accident situation. Drawing on the self-supervised learning ability of the large language model, the system continuously optimizes the model's performance in tasks such as semantic information extraction from traffic accident images, image feature association reasoning, and accident category determination. Through multiple rounds of iterative training, the system enhances its ability to identify and warn of traffic accidents under different environmental, weather, and lighting conditions.

6. A method for all-weather traffic incident identification based on joint decision-making of multiple AI large models according to any one of claims 1 to 5, characterized in that the AITI-Agent architecture is adopted, and, through task division, the information extraction in step S3, the reasoning and judgment of different traffic events in step S4, the determination of traffic accident severity in step S5, and the maintenance task in step S7 are assigned to the corresponding large language models for processing.

7. A system for implementing the all-weather traffic incident identification method based on multi-AI large model joint decision-making as described in claim 1, characterized in that it includes:

an image data acquisition module, which collects traffic image data in real time and around the clock through front-end camera equipment and transmits the raw image data to the image enhancement module;

an image enhancement module, which performs image sharpening preprocessing on the acquired images to improve image quality and recognizability and then transmits the enhanced image data to the deep semantic feature extraction module;

a deep semantic feature extraction module, which performs deep semantic analysis and feature extraction on the acquired traffic images, generates semantic feature information covering vehicles, pedestrians, traffic facilities, and traffic events, stores the acquired information as structured data, and transmits the semantic feature information to the traffic incident intelligent reasoning module;

a traffic incident intelligent reasoning module, which intelligently determines whether a traffic incident exists in the current scene and sends the identification results to the accident severity determination module;

an accident severity determination module, which determines the severity of an incident based on the identified traffic incident type, scope of impact, and number of vehicles involved, together with the reasoning ability of a large language model, and provides a reference for subsequent emergency response;

an accident early warning module, which provides real-time early warning information to the relevant traffic management departments based on the accident assessment results, thereby improving traffic safety response efficiency;

a system maintenance module, which monitors the operating status of each module of the system, records logs, regularly updates models, and optimizes strategies to ensure the stability, accuracy, and scalability of the system;

wherein the above modules exchange data and control signals through a high-speed data bus and a unified information interaction interface to ensure the high efficiency and real-time performance of the system.

8. A computer device, characterized in that it includes a memory, a processor, and a computer program stored in the memory, wherein the computer program, when executed on the processor, implements the all-weather traffic event identification method based on joint decision-making of multiple AI large models as described in claim 6.

Citation Information

Patent Citations

CN118525275A
CN118551647A
KR101268899B1
