Incident display system, incident display method, and incident display program
The incident display system objectively analyzes video footage by determining incidents, generating trajectories, and providing text descriptions, addressing the subjectivity in existing video analysis methods and enhancing incident data presentation.
Patent Information
- Authority / Receiving Office: WO
- Patent Type: Applications
- Current Assignee / Owner: NEC CORP
- Filing Date: 2024-12-23
- Publication Date: 2026-07-02
AI Technical Summary
Existing analysis of video footage of incidents at intersections relies on human reviewers, whose subjectivity makes objective and accurate incident analysis difficult.
The incident display system acquires video from cameras, determines incidents using a determination unit, generates trajectories, has a VLM generate text descriptions, registers the incident data, and outputs results to a user terminal upon request, enabling objective analysis.
Provides an objective and systematic approach to analyzing video footage of incidents, allowing for accurate and detailed presentation of incident data to users.
Description
Incident Display System, Incident Display Method, and Incident Display Program
[0001] The present invention relates to an incident display system, an incident display method, and an incident display program.
[0002] Patent Document 1 discloses a technique for tracking a moving object such as a vehicle traveling on a road on video.
[0003] Patent Document 1: International Publication No. 2024/038501
[0004] Incidents such as contact accidents between moving objects occurring at intersections or the like may be analyzed by the photographed video of a camera installed near the intersection. However, when a person analyzes the video, the subjectivity of the analyzing person enters, so it may be difficult to perform an objective analysis. Therefore, the development of a technique for objectively analyzing video has been demanded.
[0005] The present disclosure has been made to solve such problems, and an object thereof is to provide an incident display system, an incident display method, and an incident display program for objectively analyzing video.
[0006] The incident display system according to the present disclosure includes: an acquisition unit that acquires video from at least one camera that photographs a road; a determination unit that determines the occurrence of an incident of a moving object moving on the road based on the acquired video; a trajectory generation unit that cuts out a video of a predetermined length including the time point when the moving object related to the incident appears and the incident occurs from the acquired video, and generates a trajectory of the travel of the moving object related to the incident; a text generation unit that causes a VLM to generate a text explaining the cut-out video; a registration unit that registers incident data in which the cut-out video, the trajectory, and the text are associated with the incident in a storage unit; and an output unit that extracts any one or more of the determination result of the incident, the cut-out video, the trajectory, and the text from the incident data according to a user operation and displays them on a user terminal operable by the user.
[0007] The incident display method relating to this disclosure involves a computer acquiring video footage from at least one camera that photographs a road, determining the occurrence of an incident involving a moving object on the road based on the acquired video footage, extracting a predetermined length of video footage from the acquired video footage that shows the moving object involved in the incident and includes the time when the incident occurred, generating a trajectory of the moving object involved in the incident, having a VLM generate a text describing the extracted video footage, registering incident data in a storage means that links the extracted video footage, the trajectory, and the text to the incident, and, in response to user operation, extracting one or more of the incident determination result, the extracted video footage, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.
[0008] The incident display program relating to this disclosure includes the steps of: acquiring video footage from at least one camera that photographs a road; determining the occurrence of an incident involving a moving object on the road based on the acquired video footage; extracting a predetermined length of video footage from the acquired video footage that shows the moving object involved in the incident and includes the time when the incident occurred, and generating a trajectory of the moving object involved in the incident; causing a VLM to generate a text describing the extracted video footage; registering incident data in a storage means that associates the extracted video footage, the trajectory, and the text with the incident; and, in response to user operation, extracting one or more of the incident determination result, the extracted video footage, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.
[0009] This disclosure provides an incident display system, an incident display method, and an incident display program for objectively analyzing video footage.
[0010] Figure 1 is a block diagram showing the configuration of the incident display system according to Embodiment 1. Figure 2 is a flowchart showing the flow of the incident display method according to Embodiment 1. Figure 3 is a block diagram showing the configuration of the incident display system according to Embodiment 2. Figure 4 is a flowchart showing the flow of the incident display method according to Embodiment 2. Figure 5 is a diagram showing an example of incidents mapped and displayed on a map. Figure 6 is a diagram showing an example of the trajectory of a moving object. Figure 7 is a block diagram illustrating the hardware configuration of a computer.
[0011] Embodiments of the present disclosure will be described in detail below with reference to the drawings. In each drawing, the same or corresponding elements are denoted by the same reference numerals, and redundant explanations will be omitted where necessary for clarity.
[0012] <Embodiment 1> Figure 1 is a block diagram showing the configuration of the incident display system 100. The incident display system 100 includes an acquisition unit 110, a determination unit 120, a trajectory generation unit 130, a text generation unit 140, a registration unit 150, and an output unit 160. The incident display system 100 is connected to a network 600 (not shown in Figure 1). The network 600 may be wired or wireless. A camera 400 and a user terminal 500 (not shown in Figure 1) are connected to the network 600.
[0013] The acquisition unit 110 acquires the video footage captured by the camera 400. If there are multiple cameras 400, the acquisition unit 110 acquires video footage from each camera 400. The camera 400 is a filming device that films the road. There may be one or more cameras 400 connected to the network 600. The video footage captured by the camera 400 shows moving objects on the road. These moving objects may be vehicles such as cars, motorcycles, bicycles, or electric scooters, or they may be people or wild animals.
[0014] The determination unit 120 determines the occurrence of an incident involving moving objects on a road based on the video footage acquired by the acquisition unit 110. Here, an incident includes contact and proximity between moving objects; for example, two vehicles coming into close proximity on a road is an incident. For example, the determination unit 120 determines that an incident has occurred between a first moving object and a second moving object if, at a given point on the road, the time between the date and time when the first moving object passed and the date and time when the second moving object passed is less than or equal to a predetermined value. In this case, the determination unit 120 may also determine that the first and second moving objects came into contact if the two passing dates and times coincide. Alternatively, the determination unit 120 may determine that the first and second moving objects came into close proximity if the time between the two passing dates and times is greater than zero and no more than the predetermined value. The determination unit 120 assigns identification information to each determined incident. The identification information is, for example, a number for identifying the incident.
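The passing-time rule in paragraph [0014] can be illustrated with a minimal Python sketch. The function name `classify_incident`, the string labels, and the two-second threshold used in the usage note are assumptions made for illustration; the disclosure does not specify them.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify_incident(first_pass: datetime, second_pass: datetime,
                      threshold: timedelta) -> Optional[str]:
    """Classify two passes through the same road point.

    Returns "contact" when the passing times coincide, "proximity" when the
    gap is greater than zero and no more than the threshold, and None when
    the gap exceeds the threshold (no incident).
    """
    gap = abs(second_pass - first_pass)
    if gap == timedelta(0):
        return "contact"
    if gap <= threshold:
        return "proximity"
    return None
```

For instance, with an assumed threshold of two seconds, two objects passing the same point one second apart would be labeled as a proximity incident, while a five-second gap would not be treated as an incident at all.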
[0015] The trajectory generation unit 130 extracts a predetermined length of video from the video acquired by the acquisition unit 110 that shows a moving object involved in an incident determined by the determination unit 120, and that includes the time when the incident occurred. For example, the trajectory generation unit 130 may extract video from among the videos acquired from multiple cameras 400 that clearly show the moving object involved in the incident, for a predetermined time before and after the time the incident occurred. Subsequently, the trajectory generation unit 130 generates the trajectory of the moving object involved in the incident determined by the determination unit 120. Specifically, the trajectory generation unit 130 analyzes the extracted video, estimates the position of the moving object involved in the incident, and generates time-series data of the moving object's position as the trajectory of the moving object. When acquiring video from multiple cameras 400, the trajectory generation unit 130 may integrate the upper parts of the multiple videos to generate the trajectory of the moving object. Specifically, the trajectory generation unit 130 may, for example, correct the position of the moving object based on the position and angle of view feature points of each camera 400.
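As a rough sketch of paragraph [0015], the following Python code computes a clip window around the incident time and builds a trajectory as time-series position data. It assumes that per-frame position estimates are already available from some upstream tracker; the `Detection` format, the function names, and the shared road coordinate system are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One per-frame position estimate for a tracked object (hypothetical format)."""
    t: float         # seconds from the start of the source video
    object_id: str   # identifier assigned by an upstream tracker
    x: float         # position in a shared road coordinate system
    y: float


def clip_window(incident_t: float, before: float, after: float,
                video_len: float) -> Tuple[float, float]:
    """Return (start, end) of the clip, clamped to the bounds of the source video."""
    return max(0.0, incident_t - before), min(video_len, incident_t + after)


def build_trajectory(detections: List[Detection], object_id: str,
                     window: Tuple[float, float]) -> List[Tuple[float, float, float]]:
    """Return the object's positions inside the window as time-ordered (t, x, y)."""
    start, end = window
    points = [(d.t, d.x, d.y) for d in detections
              if d.object_id == object_id and start <= d.t <= end]
    return sorted(points)
```

Correcting positions across multiple cameras, as the paragraph mentions, is left out of this sketch; it would have to happen before the detections are expressed in the shared coordinate system.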
[0016] The text generation unit 140 causes the VLM (Vision and Language Model) to generate text describing the video extracted by the trajectory generation unit 130. When video containing an incident is input to the VLM, it generates text about the incident. The generated text includes attributes of the incident. These attributes include, for example, the type of moving object involved in the incident, the cause of the incident, and the specific details of the incident.
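One possible way to obtain the attributes listed in paragraph [0016] is sketched below in Python. The disclosure does not say how the VLM is invoked or prompted, so the VLM is represented here by an injected callable `vlm(clip_path, prompt)`; that callable, the prompt wording, and the JSON reply format are all assumptions for illustration.

```python
import json
from typing import Callable, Dict

# Hypothetical prompt; the source does not specify how the VLM is instructed.
PROMPT = (
    "Describe the traffic incident in this video. Reply as JSON with keys "
    '"object_types", "cause", and "details".'
)

ATTRIBUTE_KEYS = ("object_types", "cause", "details")


def describe_clip(clip_path: str, vlm: Callable[[str, str], str]) -> Dict[str, str]:
    """Ask the injected VLM callable about a clip and return the three attributes."""
    reply = vlm(clip_path, PROMPT)
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Fall back to treating the whole reply as free-text details.
        data = {"details": reply}
    return {key: str(data.get(key, "")) for key in ATTRIBUTE_KEYS}
```

Injecting the model as a callable keeps the sketch independent of any particular VLM service and makes it easy to substitute a stub during testing.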
[0017] The registration unit 150 registers incident data, which is formed by linking the identification information of the incident determined by the determination unit 120 with the video footage and generated trajectory extracted by the trajectory generation unit 130, and the text generated by the text generation unit 140, into a predetermined storage device. The storage device may be a storage device that constitutes the incident display system 100, or it may be an external storage device.
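The incident data of paragraph [0017] can be pictured as a record keyed by the incident's identification information. The following Python sketch uses an in-memory dictionary in place of the storage device; the class names, field names, and the choice to reject duplicate identifiers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Trajectory = List[Tuple[float, float, float]]  # time-ordered (t, x, y) samples


@dataclass
class IncidentData:
    """Clip, trajectories, and text linked to one incident identifier."""
    incident_id: int
    clip_path: str
    trajectories: Dict[str, Trajectory]  # one trajectory per moving object
    text: str


class IncidentStore:
    """In-memory stand-in for the storage device (hypothetical)."""

    def __init__(self) -> None:
        self._records: Dict[int, IncidentData] = {}

    def register(self, record: IncidentData) -> None:
        # Refuse to overwrite, so each identifier maps to exactly one record.
        if record.incident_id in self._records:
            raise ValueError(f"incident {record.incident_id} already registered")
        self._records[record.incident_id] = record

    def get(self, incident_id: int) -> IncidentData:
        return self._records[incident_id]
```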
[0018] The output unit 160 outputs incident data to the user terminal 500 for display in response to user operations. The user terminal 500 is an information processing terminal that can be operated by the user. The user is a person who wishes to obtain information about a given incident, and operates the user terminal 500 to obtain incident data related to that incident. Specifically, the user inputs the attributes of the incident they wish to obtain, i.e., search attributes, into the user terminal 500. Here, the search attributes are, for example, the location where the incident occurred, the date and time of occurrence, and the type of moving object involved in the incident. The user terminal 500 transmits the input search attributes to the incident display system 100. The output unit 160 extracts incident data corresponding to the received search attributes from the storage device and transmits the extraction results to the user terminal 500. The extraction results may include whether or not there is an incident corresponding to the search attributes. The extraction results may also be at least a part of the incident data related to the incident corresponding to the search attributes. In other words, the output unit 160 outputs to the user terminal 500, for display, one or more of the following for incidents that match the search attributes: the determination result, the extracted video, the trajectory, and the text.
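The attribute search in paragraph [0018] amounts to filtering registered incidents by whichever search attributes the user supplied. A minimal Python sketch follows; the `IncidentSummary` fields, the exact-match comparison on location, and the inclusive time range are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class IncidentSummary:
    """Searchable attributes of one registered incident (hypothetical fields)."""
    incident_id: int
    location: str
    occurred_at: datetime
    object_types: FrozenSet[str]


def search(records: Iterable[IncidentSummary],
           location: Optional[str] = None,
           since: Optional[datetime] = None,
           until: Optional[datetime] = None,
           object_type: Optional[str] = None) -> List[int]:
    """Return ids of incidents matching every search attribute that was given."""
    hits = []
    for r in records:
        if location is not None and r.location != location:
            continue
        if since is not None and r.occurred_at < since:
            continue
        if until is not None and r.occurred_at > until:
            continue
        if object_type is not None and object_type not in r.object_types:
            continue
        hits.append(r.incident_id)
    return hits
```

An empty result list corresponds to the case where the extraction result reports that no incident matches the search attributes.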
[0019] Figure 2 is a flowchart illustrating the flow of the incident display method. First, the acquisition unit 110 acquires video footage from at least one camera 400 that photographs the road (step S101). The determination unit 120 determines that an incident has occurred involving a moving object on the road based on the video footage acquired in step S101 (step S102). The trajectory generation unit 130 extracts a predetermined length of video footage from the video footage acquired in step S101 that shows the moving object involved in the incident and includes the time when the incident occurred, and generates the trajectory of the moving object involved in the incident (step S103). The text generation unit 140 causes the VLM to generate text describing the video footage extracted in step S103 (step S104).
[0020] Next, the registration unit 150 associates the extracted video, trajectory, and text with the incident determined in step S102 and registers them in the storage means (step S105). The output unit 160 displays one or more of the incident determination result, extracted video, trajectory, and text in response to user operation (step S106). Thus, in this embodiment, the incident display method performs incident determination and the generation of accompanying text on the system, so it can present the user with objectively analyzed data about the incident.
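The ordering of steps S101 to S106 can be summarized in a short Python sketch. Each unit is represented by an injected callable, and a plain dictionary stands in for the storage means; these names and signatures are hypothetical and only illustrate how the steps chain together.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def process_videos(
    videos: List[Any],
    determine: Callable[[List[Any]], Iterable[int]],
    cut_and_track: Callable[[List[Any], int], Tuple[Any, Any]],
    describe: Callable[[Any], str],
    storage: Dict[int, Dict[str, Any]],
) -> None:
    """Run steps S102 to S105 on videos that were already acquired in S101."""
    for incident_id in determine(videos):                       # S102
        clip, trajectory = cut_and_track(videos, incident_id)   # S103
        text = describe(clip)                                   # S104
        storage[incident_id] = {                                # S105
            "clip": clip,
            "trajectory": trajectory,
            "text": text,
        }


def display(storage: Dict[int, Dict[str, Any]], incident_id: int,
            fields: Iterable[str]) -> Dict[str, Any]:
    """S106: return only the requested parts of one incident's data."""
    record = storage[incident_id]
    return {name: record[name] for name in fields}
```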
[0021] The incident display system 100 includes a processor, memory, and storage device (not shown). The storage device stores a computer program that implements the processing of the incident display method according to this embodiment. The processor loads the computer program from the storage device into the memory and executes the computer program. As a result, the processor realizes the functions of the acquisition unit 110, the determination unit 120, the trajectory generation unit 130, the text generation unit 140, the registration unit 150, and the output unit 160.
[0022] Furthermore, the acquisition unit 110, the determination unit 120, the trajectory generation unit 130, the text generation unit 140, the registration unit 150, and the output unit 160 may each be implemented with dedicated hardware. Also, some or all of the components of each device may be implemented by general-purpose or dedicated circuits, processors, etc., or combinations thereof. These may be configured by a single chip or by multiple chips connected via a bus. Some or all of the components of each device may be implemented by a combination of the above-mentioned circuits, etc., and programs. Also, a CPU (Central Processing Unit), GPU (Graphics Processing Unit), FPGA (field-programmable gate array), etc. can be used as the processor.
[0023] Furthermore, if some or all of the components of the incident display system 100 are implemented by multiple information processing devices or circuits, these devices may be centrally located or distributed. For example, the information processing devices or circuits may be implemented in a form where each is connected via a communication network, such as a client-server system or a cloud computing system. In addition, the functions of the incident display system 100 may be provided in SaaS (Software as a Service) format.
[0024] <Embodiment 2> Embodiment 2 is a specific example of Embodiment 1 described above. Figure 3 is a block diagram showing the configuration of the incident display system 300. In addition to the incident display system 300, Figure 3 also shows a camera 400, a user terminal 500, and a network 600. The incident display system 300 is connected to the camera 400 and the user terminal 500 via the network 600. Descriptions that overlap with Embodiment 1 will be omitted as appropriate.
[0025] The user terminal 500 is an information processing terminal used by the user, such as a personal computer, tablet, or smartphone. The user terminal 500 comprises a communication unit 510, an input unit 520, and a display unit 530. The user terminal 500's hardware configuration includes at least a display device and a computer. The communication unit 510 is an interface for communicating with the outside of the user terminal 500. The input unit 520 is an input device for the user to input information into the user terminal 500. The display unit 530 is a display device that displays incident data and the like acquired from the incident display system 300.
[0026] The incident display system 300 comprises an acquisition unit 320, a determination unit 330, a mapping unit 340, a trajectory generation unit 350, a text generation unit 360, a registration unit 370, an output unit 380, and a storage unit 390. The incident display system 300 is an information processing system that performs incident determination, analysis, and output, and is, for example, a server implemented by a computer.
[0027] The acquisition unit 320 is an example of the acquisition unit 110 shown in Figure 1. The acquisition unit 320 acquires the video captured by the camera 400 and registers it as video data 391 in the storage unit 390. The storage unit 390 is a storage device that stores the video data 391 and incident data 392, etc. The camera 400 can be placed in any position as long as it can capture images of the road. For example, the camera 400 can be placed in a position that can capture images of areas where incidents between moving objects are likely to occur, such as near intersections.
[0028] The determination unit 330 is an example of the determination unit 120 shown in Figure 1. The determination unit 330 determines whether an incident has occurred involving a moving object on the road based on the video footage acquired by the acquisition unit 320. The determination unit 330 determines whether an incident has occurred at each point on the road and assigns identification information to each determined incident.
[0029] The mapping unit 340 maps the location of the incident determined by the determination unit 330 onto a map. Specifically, for incidents to which the determination unit 330 has assigned identification information, the mapping unit 340 associates the identification information with the location where the incident occurred. The map used for mapping may be one that has been pre-registered in the storage unit 390, or it may be obtained from an external database.
[0030] The trajectory generation unit 350 is an example of the trajectory generation unit 130 shown in Figure 1. The text generation unit 360 is an example of the text generation unit 140 shown in Figure 1. The registration unit 370 is an example of the registration unit 150 shown in Figure 1. The registration unit 370 registers, in the storage unit 390, incident data 392 in which the identification information of the incident determined by the determination unit 330 is linked with the location mapped by the mapping unit 340, the video extracted and the trajectory generated by the trajectory generation unit 350, and the text generated by the text generation unit 360.
[0031] The output unit 380 is an example of the output unit 160 shown in Figure 1. The output unit 380 may output the map mapped by the mapping unit 340 to the user terminal 500. In this case, the user may select, on the map displayed on the user terminal 500, a location for which they wish to obtain incident information. When the user selects a location on the map, the user terminal 500 transmits the information of the selected location to the incident display system 300. Based on the received location information, the output unit 380 extracts the incident data 392 associated with that location and transmits at least a portion of the extracted data to the user terminal 500.
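The lookup by selected map location can be sketched as follows in Python. The disclosure only states that incident data associated with the selected location is extracted; representing locations as latitude and longitude pairs and matching them within a search radius using the haversine great-circle distance are assumptions made for this illustration.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in meters between two points on a spherical Earth."""
    radius_m = 6_371_000.0
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(h))


def incidents_near(mapped: Dict[int, LatLon], selected: LatLon,
                   radius_m: float) -> List[int]:
    """Return ids of incidents mapped within radius_m of the selected location."""
    return sorted(i for i, pos in mapped.items()
                  if haversine_m(pos, selected) <= radius_m)
```

Here `mapped` plays the role of the mapping unit's output, associating each incident's identification information with the location where it occurred.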
[0032] Next, with reference to Figure 4, the flow of the incident display method will be explained. First, the acquisition unit 320 acquires video from at least one camera 400 that photographs the road (step S201). The determination unit 330 determines whether an incident has occurred involving a moving object on the road based on the video acquired in step S201 (step S202), and assigns identification information to the incident. The mapping unit 340 maps the identification information of the incident to the location where the incident was determined in step S202 (step S203). The trajectory generation unit 350 extracts a predetermined length of video from the video acquired in step S201 that shows the moving object involved in the incident and includes the time when the incident occurred, and generates the trajectory of the moving object involved in the incident (step S204). The text generation unit 360 causes the VLM to generate text describing the video extracted in step S204 (step S205).
[0033] Next, the registration unit 370 associates the mapped location, extracted video, trajectory, and text with the incident identification information determined in step S202 and registers them in the storage unit 390 (step S206). The output unit 380 displays one or more of the incident determination result, mapped location, extracted video, trajectory, and text in response to user operation (step S207).
[0034] The incident data output by the output unit 380 to the user terminal 500 may be displayed in a different manner depending on the attributes of the incident, in order to improve visibility for the user. Here, the attributes of the incident may be based on, for example, the text generated by the text generation unit 360, and specifically include the type of moving object involved in the incident, the cause of the incident, the date and time the incident occurred, and the frequency of incident occurrences. The display manner that is varied may include color, shape, pattern, etc.
[0035] The incident data output by the output unit 380 to the user terminal 500 may be displayed on a map. Figure 5 shows an example of incident data displayed on a map. In the example shown in Figure 5, vehicle-to-vehicle incidents are indicated by downward-sloping diagonal lines, and vehicle-to-person incidents are indicated by upward-sloping diagonal lines. For vehicle-to-vehicle incidents, the more the marked areas overlap, the higher the frequency of occurrence. In addition, among vehicle-to-vehicle incidents, contacts are indicated by stars. By changing the display manner according to the attributes of the incident in this way, the user can take in multiple incidents at a glance. The output unit 380 may be configured to display detailed information about an incident when the user selects that incident on the map. The detailed information may include the trajectory and text associated with the incident.
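The attribute-dependent display of Figure 5 can be expressed as a small mapping from incident attributes to drawing style. In the Python sketch below, the attribute labels, the style keys, and the use of a layer count to represent overlapping areas are assumptions for illustration; only the pairing of hatch direction and star marker with incident type follows the example in the text.

```python
from typing import Dict


def marker_style(kind: str, is_contact: bool, count: int) -> Dict[str, object]:
    """Choose a map drawing style from incident attributes.

    kind: "vehicle-vehicle" or "vehicle-person" (hypothetical labels).
    is_contact: True when the incident was a contact rather than proximity.
    count: number of incidents at this location, drawn as overlapping layers.
    """
    hatch = {
        "vehicle-vehicle": "\\",  # downward-sloping diagonal lines
        "vehicle-person": "/",    # upward-sloping diagonal lines
    }.get(kind, "")
    return {
        "hatch": hatch,
        "marker": "star" if is_contact else "area",
        "layers": max(1, count),
    }
```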
[0036] When displaying the trajectory as detailed information about an incident, the trajectory may be displayed on a map. Figure 6 shows an example of a trajectory displayed on a map. The circles in Figure 6 indicate the positions of the moving objects involved in the incident before and after the incident. As shown in Figure 6, in the case of an incident involving multiple moving objects, the trajectory of each moving object may be displayed in a different display manner.
[0037] <Example of Hardware Configuration> The following describes how each functional configuration of the incident display system in this disclosure can be realized through a combination of hardware and software, with reference to Figure 7.
[0038] The incident display system in this disclosure can realize the above-described functions using a computer 11 including the hardware configuration shown in Figure 7. The computer 11 may be a portable computer such as a smartphone or tablet terminal, or a stationary computer such as a PC. The computer 11 may be a dedicated computer designed to realize each device, or it may be a general-purpose computer. The computer 11 can realize the desired functions by installing a predetermined program.
[0039] Computer 11 has a bus 21, a processor 30, a memory 40, a storage device 50, an input / output interface 60 (the interface is also referred to as I / F (Interface)), and a network interface 70. The bus 21 is a data transmission path for the processor 30, the memory 40, the storage device 50, the input / output interface 60, and the network interface 70 to transmit and receive data from each other. However, the method of connecting the processor 30 and the like to each other is not limited to bus connection.
[0040] The processor 30 is any of various processors such as a CPU, a GPU, or an FPGA. The memory 40 is a main memory device realized using a RAM (Random Access Memory) or the like.
[0041] The storage device 50 is an auxiliary storage device realized using a hard disk, an SSD, a memory card, or a ROM (Read Only Memory). The storage device 50 stores a program for realizing a desired function. The processor 30 reads this program into the memory 40 and executes it to realize each functional component of each device.
[0042] The input / output interface 60 is an interface for connecting the computer 11 and an input / output device. For example, an input device such as a keyboard and an output device such as a display device are connected to the input / output interface 60.
[0043] The network interface 70 is an interface for connecting the computer 11 to a network.
[0044] The example of the hardware configuration in the present disclosure has been described above. However, the above-described embodiments are not limited thereto. The present disclosure can also realize any process by causing a processor to execute a computer program.
[0045] In the examples described above, the program includes a set of instructions (or software code) that, when loaded into a computer, cause the computer to perform one or more of the functions described in the embodiments. The program may be stored on a non-transitory computer-readable medium or a tangible storage medium. Examples of such media include, but are not limited to, random-access memory (RAM), read-only memory (ROM), flash memory, solid-state drive (SSD) or other memory technologies, CD-ROM, digital versatile disc (DVD), Blu-ray® disc or other optical disc storage, magnetic cassette, magnetic tape, magnetic disk storage or other magnetic storage devices. The program may be transmitted over a transitory computer-readable medium or a communication medium. Examples of transitory computer-readable media or communication media include, but are not limited to, electrical, optical, acoustic, or other forms of propagating signals.
[0046] Although the present disclosure has been described above with reference to embodiments, the present disclosure is not limited to the embodiments described above. Various modifications to the structure and details of the present disclosure can be made as can be understood by those skilled in the art within the scope of the present disclosure. Furthermore, each embodiment can be combined with other embodiments as appropriate.
[0047] Each drawing is merely an example for illustrating one or more embodiments. Each drawing may be associated with one or more other embodiments, rather than being associated with only one specific embodiment. As those skilled in the art will understand, various features or steps described with reference to any one drawing can be combined with features or steps shown in one or more other drawings, for example, to create embodiments not explicitly shown or described. Not all features or steps shown in any one drawing to illustrate an exemplary embodiment are necessarily required, and some features or steps may be omitted. The order of steps described in any of the drawings may be changed as appropriate.
[0048] Some or all of the above embodiments can also be described as follows, but are not limited to the following.
[0049] (Appendix A1) An incident display system comprising: an acquisition unit that acquires video from at least one camera that photographs a road; a determination unit that determines the occurrence of an incident of a moving object moving on the road based on the acquired video; a trajectory generation unit that cuts out, from the acquired video, a video of a predetermined length in which the moving object related to the incident appears and which includes the time point when the incident occurred, and generates a trajectory of the travel of the moving object related to the incident; a text generation unit that causes a VLM to generate a text explaining the cut-out video; a registration unit that registers, in a storage means, incident data in which the cut-out video, the trajectory, and the text are associated with the incident; and an output unit that extracts any one or more of the determination result of the incident, the cut-out video, the trajectory, and the text from the incident data according to a user operation and displays them on a user terminal operable by the user.
[0050] (Appendix A2) The incident display system according to Appendix A1, further comprising a mapping unit that maps the occurrence position of the incident on a map, wherein, when the user selects a position on the map, the output unit displays on the user terminal any one or more of the determination result of the incident, the cut-out video, the trajectory, and the text related to the incident that occurred at the selected position.
[0051] (Appendix A3) The incident display system according to Appendix A1 or A2, further comprising a mapping unit that maps the occurrence position of the incident on a map, wherein the output unit changes a display mode on the map according to the occurrence frequency of the incident.
[0052] (Appendix A4) The incident display system according to any one of Appendices A1 to A3, further comprising a mapping unit that maps the occurrence position of the incident on a map, wherein the output unit changes a display mode on the map according to the attributes of the incident determined based on the text.
[0053] (Appendix B1) An incident display method comprising: a computer acquiring video from at least one camera that photographs a road; determining the occurrence of an incident involving a moving object on the road based on the acquired video; extracting a predetermined length of video from the acquired video that shows the moving object involved in the incident and includes the time when the incident occurred; generating a trajectory of the moving object involved in the incident; causing a VLM to generate a text describing the extracted video; registering incident data in a storage means that links the extracted video, the trajectory, and the text to the incident; and, in response to user operation, extracting one or more of the incident determination result, the extracted video, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.
[0054] (Appendix C1) An incident display program that causes a computer to perform the following steps: acquiring video from at least one camera that films a road; determining the occurrence of an incident involving a moving object on the road based on the acquired video; extracting a predetermined length of video from the acquired video that shows the moving object involved in the incident and includes the time when the incident occurred, and generating a trajectory of the moving object involved in the incident; causing a VLM to generate a text describing the extracted video; registering incident data in a storage means that associates the extracted video, the trajectory, and the text with the incident; and, in response to user operation, extracting one or more of the incident determination result, the extracted video, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.
[0055] Some or all of the elements (e.g., configurations and functions) described in Appendices A2 to A4, which depend on Appendix A1, may also depend on Appendices B1 and C1 in the same manner. Some or all of the elements described in any appendix may be applied to various kinds of hardware, software, recording means for recording software, systems, and methods.
[0056]
100 Incident display system
110 Acquisition unit
120 Judgment unit
130 Trajectory generation unit
140 Text generation unit
150 Registration unit
160 Output unit
300 Incident display system
320 Acquisition unit
330 Judgment unit
340 Mapping unit
350 Trajectory generation unit
360 Text generation unit
370 Registration unit
380 Output unit
390 Storage unit
391 Video data
392 Incident data
400 Camera
500 User terminal
510 Communication unit
520 Input unit
530 Display unit
600 Network
Claims
1. An incident display system comprising: acquisition means for acquiring video footage from at least one camera that films a road; determination means for determining the occurrence of an incident involving a moving object on the road based on the acquired video footage; trajectory generation means for extracting a predetermined length of video footage from the acquired video footage that shows the moving object involved in the incident and includes the time when the incident occurred, and for generating a trajectory of the moving object involved in the incident; text generation means for causing a VLM to generate text describing the extracted video footage; registration means for registering incident data in a storage means, linking the extracted video footage, the trajectory, and the text to the incident; and output means for extracting one or more of the incident determination result, the extracted video footage, the trajectory, and the text from the incident data in response to user operation, and displaying them on a user terminal operable by the user.
2. The incident display system according to claim 1, further comprising mapping means for mapping the location of the incident on a map, wherein when the user selects a location on the map, the output means extracts one or more of the incident determination result, the extracted video, the trajectory, and the text relating to the incident that occurred at the selected location and displays them on the user terminal.
3. The incident display system according to claim 1 or 2, further comprising mapping means for mapping the location of the incident onto a map, wherein the output means changes the display mode on the map according to the frequency of the incident.
4. The incident display system according to claim 1 or 2, further comprising mapping means for mapping the location of the incident onto a map, wherein the output means changes the display mode on the map according to the attributes of the incident determined based on the text.
5. An incident display method comprising: a computer acquiring video footage from at least one camera that photographs a road; determining the occurrence of an incident involving a moving object on the road based on the acquired video footage; extracting a predetermined length of video footage from the acquired video footage that shows the moving object involved in the incident and includes the time when the incident occurred; generating a trajectory of the moving object involved in the incident; causing a VLM to generate text describing the extracted video footage; registering incident data in a storage means, linking the extracted video footage, the trajectory, and the text to the incident; and, in response to user operation, extracting one or more of the incident determination result, the extracted video footage, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.
6. An incident display program that causes a computer to perform the following steps: acquiring video from at least one camera that films a road; determining the occurrence of an incident involving a moving object on the road based on the acquired video; extracting a predetermined length of video from the acquired video that shows the moving object involved in the incident and includes the time when the incident occurred, and generating a trajectory of the moving object involved in the incident; causing a VLM to generate a text describing the extracted video; registering incident data in a storage means, linking the extracted video, the trajectory, and the text to the incident; and, in response to user operation, extracting one or more of the incident determination result, the extracted video, the trajectory, and the text from the incident data and displaying them on a user terminal that the user can operate.