Information processing device, information processing method, and program
The information processing device addresses the challenge of diverse user sensing requirements by receiving and processing natural language intents to generate and analyze sensing requests, effectively detecting and reporting events like obstacles for vehicles.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- TOYOTA JIDOSHA KK
- Filing Date
- 2025-12-15
- Publication Date
- 2026-07-02
AI Technical Summary
Conventional technologies lack the ability to flexibly and accurately respond to diverse user requirements for sensing processing, particularly in scenarios where vehicles need to detect obstacles at future positions and provide timely warnings for dangerous situations.
An information processing device that receives sensing intents in natural language, generates corresponding sensing requests, acquires sensing data, and detects events by analyzing the data to transmit relevant information to the requester, utilizing a control unit, management devices, and wireless communication systems.
Enables flexible and accurate sensing processing that meets a wide range of user requests, including obstacle detection and warning systems for vehicles, by leveraging natural language processing and wireless communication networks.
Abstract
Description
Information Processing Apparatus, Information Processing Method, and Program
[0001] The present disclosure relates to an information processing apparatus, an information processing method, and a program.
[0002] Non-Patent Document 1 below describes 5G wireless sensing technology as a technology for obtaining data from wireless signals that have been affected by an object or environment (for example, by reflection, refraction, or diffraction). It also describes 5G wireless sensing technology as acquiring information regarding the characteristics of the environment and/or objects within the environment (such as shape, size, orientation, speed, position, distance between objects, or relative movement).
[0003] "Study on Integrated Sensing and Communication", [online], updated 2024-06-28, 3GPP (registered trademark) TR 22.837 V19.4.0, Section 4, [searched on November 4, 2024], Internet <URL: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=4044>
[0004] However, the conventional technology does not assume, for example, that a large number of vehicles or the like each issue a sensing request for the position the vehicle will reach after traveling at its own speed for a specific time, in order to detect obstacles at that future position, and that the obtained sensing results are then processed. Nor does it assume a use case in which, when an obstacle that behaves dangerously is present at that future position, the vehicle warns the driver to take avoidance action. Beyond these examples, users' requirements for sensing processing by wireless communication are diverse. An object of one aspect of the disclosed embodiments is to respond flexibly and accurately to the diverse requirements of users for sensing processing.
[0005] In one aspect, the embodiments of the disclosure are exemplified by an information processing device including a control unit. The control unit receives a sensing intent in natural language from a sensing requester, specifying the event to be sensed. The control unit then transmits a sensing request corresponding to the sensing intent to a management device that distributes sensing data acquired by a wireless communication device. The control unit then receives sensing data in response to the sensing request from the management device. Furthermore, the control unit detects the event specified in the sensing intent from the received sensing data. The control unit then transmits the detected event to the requester.
[0006] This information processing device can flexibly and accurately respond to a wide range of user requests for sensing processing.
[0007] Figure 1 is a diagram illustrating the configuration of a network to which the information processing device of the first embodiment is connected.
Figure 2 is a diagram illustrating a vehicle that requests processing from the information processing device of this embodiment.
Figure 3 is a diagram illustrating components that constitute the core network of a fifth-generation mobile communication system.
Figure 4 is a flowchart illustrating event detection processing by the information processing device of the first embodiment.
Figure 5 is a diagram illustrating data for the target location definition dictionary.
Figure 6 is a diagram illustrating data for the sensing target definition dictionary.
Figure 7 is a diagram illustrating data for the event type definition dictionary.
Figure 8 is a flowchart illustrating event detection processing of the second embodiment.
Figure 9 is a flowchart illustrating event detection processing of the third embodiment.
Figure 10 is a flowchart illustrating event detection processing of the third embodiment.
Figure 11 is a flowchart illustrating sensing instruction aggregation processing A of the fourth embodiment.
Figure 12 is a flowchart illustrating sensing instruction aggregation processing B of a modified example.
[0008] Hereinafter, this disclosure will describe the information processing device 6, the information processing method, and the program with reference to the drawings of the embodiment. In this embodiment, the information processing device 6 includes a control unit 60 (see Figure 3). The control unit 60 receives a sensing intent in natural language from a sensing requester, specifying the event to be sensed. The control unit 60 then transmits a sensing request corresponding to the sensing intent to a management device that distributes sensing data acquired by a base station 3A, UE2, etc. (see Figure 1), which is an example of a wireless communication device. The control unit 60 then receives sensing data in response to the sensing request from the management device. Furthermore, the control unit 60 detects the event specified in the sensing intent from the received sensing data. The control unit 60 then transmits the detected event to the requester.
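The five steps performed by the control unit 60 can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class name `ControlUnit`, the method `handle_intent`, and the four injected callables are hypothetical and do not come from the disclosure, which does not specify any programming interface.

```python
class ControlUnit:
    """Hypothetical sketch of the receive / request / receive / detect / transmit flow."""

    def __init__(self, to_request, fetch_sensing_data, detect, notify):
        # All four collaborators are assumptions supplied by the caller:
        # to_request(intent_text) -> sensing request
        # fetch_sensing_data(request) -> sensing data from the management device
        # detect(intent_text, sensing_data) -> detected event, or None
        # notify(requester, event) -> sends the event back to the requester
        self.to_request = to_request
        self.fetch_sensing_data = fetch_sensing_data
        self.detect = detect
        self.notify = notify

    def handle_intent(self, requester, intent_text):
        # 1. The natural-language sensing intent has been received (intent_text).
        # 2. Build the corresponding sensing request.
        request = self.to_request(intent_text)
        # 3. Send it to the management device and receive sensing data.
        sensing_data = self.fetch_sensing_data(request)
        # 4. Detect the event named in the intent.
        event = self.detect(intent_text, sensing_data)
        # 5. Transmit the detected event to the requester, if any.
        if event is not None:
            self.notify(requester, event)
        return event
```

Injecting the collaborators keeps the sketch independent of any particular management device or transport.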
[0009] Here, sensing is exemplified as a technique for obtaining data from radio signals (signals such as reflection, refraction, and diffraction) that have been affected by an object or environment, for example, in a mobile communication system. Furthermore, the radio signal is emitted by wireless communication devices such as base stations 3A and UE2, and is received by the same or other wireless communication devices as a signal that has been affected by reflection, refraction, diffraction, etc., from the object being sensed. The wireless communication device acquires sensing data from the received radio signal.
[0010] In this embodiment, the source of the sensing request is a computer or the like connected to the mobile communication system, such as a UE2 mounted on a mobile device or carried by a person. The management device is, for example, what is called SENSING 11n of the network function (NF) in the core network (5GC) of a fifth-generation mobile communication system (also called a 5G network or 5GNW) (see Figure 3). However, the configuration of this embodiment is not limited to fifth-generation mobile communication systems, but is also applicable to sixth-generation mobile communication systems and later mobile communication systems.
[0011] Furthermore, in this embodiment, an event is, for example, the detection of a sensing target. Alternatively, an event may be the detection of information about the location of the sensing target, or of the type of event, which covers the attributes and the status of the target, such as size, material, movement status, and movement speed. Based on the sensing intent expressed in natural language by the sensing requester, this information processing device generates a corresponding sensing request, transmits the generated sensing request to the management device, detects an event based on the sensing data received as a result, and transmits the detected event to the requester.
[0012] <First Embodiment> Referring to Figures 1 to 4, the information processing device 6, information processing method, and program according to the first embodiment will be described.
[0013] (Example of Application) Figure 1 is a diagram illustrating the configuration of the network N1 to which the information processing device 6 of this embodiment is connected. As shown in Figure 1, in this embodiment, the information processing device 6 communicates with various devices connected to the network N1 and performs processing. In this embodiment, the devices connected to the information processing device 6 and the network N1 are referred to as the information communication system 100. The network N1 is exemplified by a communication network that includes at least one of a mobile communication system such as Long Term Evolution (LTE), a fifth-generation mobile communication system (5G), a sixth-generation mobile communication system (6G), and the Internet.
[0014] For example, the information processing device 6 is connected via the network N1 to user equipment such as UE2-1 to UE2-3 installed in vehicles C1 to C3, etc. UE2-1 to UE2-3, etc., installed in vehicles C1 to C3, are devices that request sensing result data (hereinafter simply referred to as sensing data) obtained through sensing processing. Hereafter, when UE2-k (where k is an integer) is used as a general term, it will simply be referred to as UE2.
[0015] Furthermore, the information processing device 6 is connected to SENSING 11n, which is an example of a management device, via network N1. In addition, SENSING 11n is connected to base stations 3A-1, 3A-2, UE2-4, UE2-5, etc., via network N1. Hereafter, when base stations 3A-k (where k is an integer) are referred to collectively, they will simply be called base station 3A. Base stations 3A and UE2 are examples of wireless communication devices.
[0016] Base stations 3A, UE2, etc., for example, have transmitters and receivers and operate as sensors. That is, these sensors emit electromagnetic waves (i.e., radio signals) from the transmitter and receive reflected waves, refracted waves, diffracted waves, etc. from the sensing target with the receiver to detect the sensing target. SENSING 11n instructs these sensors to detect (F4-1 to F4-4) and receives sensing data (F5-1 to F5-4).
[0017] The sensor may, for example, be a pair of a transmitter and receiver of one base station 3A. Alternatively, the sensor may be a pair of a transmitter of a first base station 3A-1 and a receiver of a second base station 3A-2 other than the first base station 3A-1. Furthermore, the sensor may be a pair of a transmitter and receiver of one UE2. Alternatively, the sensor may be a pair of a transmitter of a first UE2-1 and a receiver of a second UE2-2 other than the first UE2-1. Also, the sensor may be a pair of a transmitter of base station 3A and a receiver of UE2. Furthermore, the sensor may be a pair of a transmitter of UE2 and a receiver of base station 3A. In this embodiment, the signal transmitted by the transmitter to detect the sensing object is called a reference signal.
[0018] Furthermore, in this embodiment, the information processing device 6 may perform processing in cooperation with other computers on the network N1, such as a server 6A. The server 6A may, for example, manage various databases. Examples of databases include a map database, various dictionaries, a rule base, and a base station database that stores location information of base stations 3A. Also, the server 6A is not limited to a single computer, but may consist of multiple computers. In addition, the information processing device 6 and one or more servers 6A may be part of a virtual system called a cloud.
[0019] Figure 2 illustrates vehicles C1 to C3, which are requesting processing from the information processing device 6 of this embodiment. As shown in Figure 2, for example, vehicles C1 and C2 are traveling towards an area (geographical region) including point A1, and it is assumed that they wish to acquire information near point A1. Vehicle C3 is also traveling towards an area including point A2, and it is assumed that it wishes to acquire information near point A2. Here, the information desired by vehicles C1 to C3, etc., is, for example, the presence or absence of obstacles (the presence of bicycles, pedestrians, etc.).
[0020] Figure 2 shows three request sources. However, in reality, it is expected that sensing processing requests to the information processing device 6 will be issued from a very large number of devices. Furthermore, it is expected that the sensing processing requests to the information processing device 6 will vary widely in terms of the target of detection, the detection area (location), the detection conditions, the status of the target, the type of event, etc. In this embodiment, it is illustrated that the information processing device 6 efficiently processes a large number of diverse sensing processing requests.
[0021] In other words, as illustrated in Figure 1, UE2-1 to UE2-3 notify the information processing device 6 of the specifications for sensing processing, for example. In this embodiment, the specifications for sensing processing are called sensing intents (F1-1 to F1-3). A sensing intent includes what to detect (target), where to detect (location, area), and how to detect (presence or absence, whether the target is moving, movement speed, detection accuracy, etc.). Presence or absence, whether the target is moving, movement speed, etc. can also be described as types of events.
[0022] When the information processing device 6 receives a sensing intent, it generates a sensing request (F2 in Figure 1). That is, the information processing device 6 analyzes the sensing intent and generates a sensing request. When analyzing the sensing intent, the information processing device 6 may also aggregate sensing results. In other words, if there is already acquired sensing data that matches the sensing intent, the information processing device 6 can use the acquired sensing data without generating a new sensing request. Furthermore, if sensing processes corresponding to multiple sensing intents can be aggregated into a single sensing request, the information processing device 6 can perform aggregation.
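The two forms of aggregation described here (reusing already acquired sensing data, and merging several sensing intents into one sensing request) might be sketched as follows. This is a hedged illustration only: the function name `aggregate_intents`, the cache layout, and the freshness limit `max_age_s` are assumptions introduced for the example. The disclosure does not state how a match with existing data is decided.

```python
from collections import defaultdict


def aggregate_intents(intents, cache, now, max_age_s=5.0):
    """Split intents into cache hits and grouped new requests.

    intents: list of (requester_id, area) tuples.
    cache: dict mapping area -> (timestamp_s, sensing_data).
    max_age_s: assumed freshness limit; not taken from the disclosure.
    """
    reused = {}                   # requester_id -> already acquired sensing data
    pending = defaultdict(list)   # area -> requesters that can share one request
    for requester_id, area in intents:
        entry = cache.get(area)
        if entry is not None and now - entry[0] <= max_age_s:
            # Matching data already exists, so no new sensing request is needed.
            reused[requester_id] = entry[1]
        else:
            # Intents for the same area are merged into a single request.
            pending[area].append(requester_id)
    return reused, dict(pending)
```

With the situation of Figure 2 (vehicles C1 and C2 heading to point A1, vehicle C3 heading to point A2), one request for A1 would serve both C1 and C2.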
[0023] The information processing device 6 then notifies SENSING 11n of the sensing request (F3). The sensing request can also be called the activation of a sensing task. The sensing request includes, for example, a sensing area (or location), notification conditions, detection accuracy, etc. Here, the sensing area (or location) is information that identifies the location where the sensing process is performed. The notification conditions are information that specifies the conditions under which the sensing results from the sensing process are reported. The notification conditions include the number of notifications and the timing of notifications. If there are multiple notifications, the notification cycle and the length of the notification period (or the start and end times of the notifications) may also be specified. The notifications may also continue until a notification termination message is sent. Furthermore, the notification conditions may be such that notifications are only generated when a specific event occurs. Thus, SENSING 11n can be considered an example of a management device that distributes sensing data acquired by a wireless communication device.
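As a rough illustration of the contents of a sensing request listed above, the following Python sketch models the sensing area, notification conditions, and detection accuracy as data classes. All class names, field names, and the `validate` helper are hypothetical. The disclosure names the concepts but does not define a message format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NotificationConditions:
    # Field names are assumptions; only the concepts come from the text.
    count: Optional[int] = None        # number of notifications; None = until terminated
    period_s: Optional[float] = None   # notification cycle for repeated notifications
    start_time: Optional[str] = None   # start of the notification period
    end_time: Optional[str] = None     # end of the notification period
    only_on_event: bool = False        # notify only when a specific event occurs


@dataclass
class SensingRequest:
    sensing_area: List[str]            # identifiers of the area where sensing is performed
    detection_accuracy: Optional[float] = None
    notification: NotificationConditions = field(default_factory=NotificationConditions)


def validate(req: SensingRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks consistent."""
    problems = []
    if not req.sensing_area:
        problems.append("sensing_area is empty")
    n = req.notification
    if n.count is not None and n.count < 1:
        problems.append("count must be at least 1")
    if n.count is not None and n.count > 1 and n.period_s is None:
        problems.append("period_s is required for repeated notifications")
    return problems
```

Leaving `count` as `None` stands in for the case where notifications continue until a termination message is sent.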
[0024] SENSING 11n instructs each sensor to perform sensing processing in the designated sensing area (or location) according to the sensing request (F4-1 to F4-4), and receives sensing data (F5-1 to F5-4). Then, SENSING 11n notifies the information processing device 6 of the sensing data according to the notification conditions (F6).
[0025] The information processing device 6 receives sensing data and performs filtering (F7). The filtering process is the process of acquiring sensing data that matches the sensing intent from the requester. If sensing data that matches the sensing intent is acquired, it can be said that an event has been detected. Therefore, if the information processing device 6 has acquired sensing data that matches the sensing intent from the requester, it notifies the requester of the acquired data (F8). The notification in F8 can be called a notification of a detected event.
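The filtering step (F7) and the resulting notification (F8) can be sketched as below. The record layout (dictionaries with `area` and `target` keys) and the function names are assumptions made for illustration. The actual format of sensing data is not given in this passage.

```python
def filter_sensing_data(sensing_data, intent):
    """Keep only the records that match the requester's sensing intent (F7)."""
    return [
        record
        for record in sensing_data
        if record["area"] == intent["area"] and record["target"] == intent["target"]
    ]


def detect_and_notify(sensing_data, intent, notify):
    """Notify the requester when matching data exists, i.e. an event is detected (F8)."""
    matches = filter_sensing_data(sensing_data, intent)
    if matches:
        # notify is a caller-supplied callable (hypothetical transport).
        notify(intent["requester"], matches)
    return bool(matches)
```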
[0026] In this embodiment, detecting an event includes, for example, detecting the presence or absence of a sensing object, detecting the location of the sensing object, and detecting the state and type of the sensing object. The location of the sensing object is a geographical location, such as latitude and longitude. The state of the sensing object includes the size, material, whether it is moving or not, and the speed of movement of the sensing object. The size, material, whether it is moving or not, and the speed of movement of the sensing object can also be referred to as the type of sensing or the type of event.
[0027] Here, SENSING 11n can obtain the position of the sensing target by instructing three or more sensing entities (base station 3A or other UE2) to measure the latitude and longitude of the sensing target using a triangulation method. Furthermore, SENSING 11n can obtain the material of the sensing target by instructing the sensing entities (base station 3A or other UE2) to measure the material based on the absorption rate of electromagnetic waves at the sensing target, using the power of the electromagnetic waves radiated to the sensing entity (base station 3A or other UE2) and the power of the received reflected waves.
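As one way to picture position estimation with three sensing entities, the sketch below uses range-based trilateration in a local planar frame. This is a stand-in chosen for illustration, not necessarily the triangulation method the disclosure intends: it assumes each sensing entity can supply a distance to the target, and it works in metres on a flat plane rather than in latitude and longitude. The function name `trilaterate` is hypothetical.

```python
import math


def trilaterate(anchors, distances):
    """Estimate (x, y) from three anchor positions and measured ranges.

    anchors: three (x, y) tuples in a local planar frame, in metres.
    distances: the range from each anchor to the target, in metres.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances
    # Taking pairwise differences of the three circle equations removes the
    # quadratic terms and leaves a 2x2 linear system in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y
```

A real deployment would also need to handle measurement noise, for example with a least-squares fit over more than three entities.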
[0028] (Network Example) Figure 3 illustrates the components (constituent elements) that make up the core network (5GC) of the fifth-generation mobile communication system (also called a 5G network or 5GNW) within the information and communication system 100. In this embodiment, the constituent elements of the 5GC are collectively called Network Function (hereinafter referred to as NF11), and individually they are called, for example, Access and Mobility Management Function (hereinafter referred to as AMF 11b). In Figure 3, each constituent element is given a general code along with an individual code in parentheses.
[0029] However, the types of NF11 are not limited to the following examples shown in Figure 3.
- UPF (User Plane Function) 11a
- AMF (Access and Mobility Management Function) 11b
- SMF (Session Management Function) 11c
- PCF (Policy Control Function) 11d
- NEF (Network Exposure Function) 11e
- NRF (Network Repository Function) 11g
- NSSF (Network Slice Selection Function) 11h
- AUSF (Authentication Server Function) 11i
- UDM (Unified Data Management) 11j
- NWDAF (Network Data Analytics Function) 11k
- SENSING (Sensing Function) 11n
[0030] UPF11a performs routing and forwarding of user packets (user plane packets sent and received by UE2), packet inspection, and QoS processing. UPF11a is connected to DN (Data Network) 5. DN5 is an external data network (such as the Internet) outside of 5GC.
[0031] The AMF11b is the device in the core network that accommodates UE2 located within its serving area. The AMF11b accommodates RAN3 (base stations) and performs subscriber authentication control, UE2 location (mobility) management, and registration area management. The UDM11j is a database (storage device) that provides subscriber information and retrieves, registers, deletes, and modifies the status of UE2.
[0032] SMF11c manages PDU (Protocol Data Unit) sessions and controls UPF11a for the implementation of QoS (Quality of Service) control and policy control. A PDU session is a virtual communication channel for data exchange between UE2 and DN5.
[0033] The PCF11d performs QoS control, policy control, and billing control under the control of the SMF11c. QoS control involves controlling the quality of communication, such as prioritizing packet forwarding. Policy control involves communication control, such as QoS, packet forwarding eligibility, and billing, based on network or subscriber information.
[0034] The NEF 11e acts as an intermediary for communication between external nodes (external devices) such as the AF (Application Function) 12 and nodes (NF) within the control plane. In other words, the NEF 11e functions as a gateway (GW) between the core network and the external network. The AF 12 is, for example, an application server (external server) located outside the core network (e.g., connected to DN5).
[0035] NRF11g stores and manages information about the NFs that make up the core network. In response to an inquiry regarding an NF that the user wishes to use, NRF11g can return a list of multiple candidate NFs to the inquirer.
[0036] NSSF11h has the function of selecting the network slice to be used by the subscriber from among the network slices generated by network slicing. A network slice is a virtual network with specifications tailored to its intended use.
[0037] AUSF11i is a subscriber authentication server that performs subscriber authentication under the control of AMF11b. NWDAF 11k has the function of collecting and analyzing data from each NF11, OAM (Operations, Administration, and Maintenance) terminal, AF12, etc. NWDAF 11k is an NF that provides analytical information related to 5GS.
[0038] UDM11j maintains subscriber-related information and provides subscriber information, as well as retrieving, registering, deleting, and modifying the status of UE2. SENSING 11n performs sensing services, including collecting sensing information from UE2, RAN3 (base station (gNB)), or other nodes, and providing the collected sensing information to UE2 or other external systems (AF12, DN5, etc.). Details of SENSING 11n will be described later.
[0039] AF12 is an NF that processes sensing data and provides application services using the sensing data to UE2 (terminal). For example, AF12 notifies UE2 (terminal) of the sensing results in a specified spatial range acquired by SENSING 11n. Alternatively, an application program executed on UE2 (terminal) may also operate as AF12. The functions of NF11 described above are examples, and each NF11 may have other functions, execute the functions of other NF11s, or multiple NF11s may cooperate to execute a single function.
[0040] These NF11s are defined, for example, in 3GPP® TS23.501. DN5 is an external data network (such as the Internet) outside of 5GC. An information processing device 6 is connected to DN5, for example. The information processing device 6 may be AF12 of 5GC. RAN3 is a radio access network to 5GC. RAN3 is formed, for example, by a base station 3A. Note that the information communication system 100 may not be the entire system shown in Figure 3, but rather a combination of some of the 5GC NF11s illustrated in Figure 3.
[0041] Among the NF11s, the AMF 11b is the device in 5GC that accommodates UEs located within its serving area. The AMF 11b accommodates the RAN3 and performs subscriber authentication control, UE2 position (mobility) management, etc.
[0042] The NWDAF 11k is an NF11 that provides data analysis for the 5G network. The NWDAF 11k notifies other NF11s (such as the AMF 11b, SMF 11c, PCF 11d, etc.) of the data analysis results and supports the dynamic network control and management of these NF11s. Examples of the data analysis results provided by the NWDAF 11k include latency per UE2, the movement path of UE2, location information, movement speed, movement direction, and material. Also, examples of the data analysis results include load level (resource utilization rate of each cell or base station), future load prediction, load distribution per area, and resource block or frequency availability information. Regarding the NWDAF 11k, for example, 3GPP (registered trademark) TS 29.520 or TS 23.288 defines its functions or processes.
[0043] The SENSING 11n performs processes including collection of information from UE2 or other external systems, analysis and processing of the collected information, and provision of the analysis results to other NF11s, UE2, AF12, or other external systems (such as DN5). However, the NWDAF 11k may, instead of the SENSING 11n, analyze and process the sensing results obtained by the SENSING 11n.
[0044] The SENSING 11n, other NF11s, AF12, etc. are formed on a computer according to a computer program. The configurations of the SENSING 11n, other NF11s, AF12, etc. are virtual and may be formed on different computers or on multiple computers. Also, any plurality of the SENSING 11n, other NF11s, AF12, etc. may be formed on the same computer. Also, such a computer may have the same configuration as the information processing device 6 or UE2 connected to the data network DN5.
[0045] These computers have a Central Processing Unit (hereinafter referred to as CPU 61), a main memory device 62, and external devices, and execute information processing and communication processing by a computer program. The CPU 61 is also called a processor. The CPU 61 is not limited to a single processor and may have a multi-processor configuration. Further, the CPU 61 may include a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), etc.
[0046] The CPU 61 executes a computer program developed to be executable in the main memory device 62 and provides the processing of the information processing device 6. The main memory device 62 stores a computer program executed by the CPU 61, data processed by the CPU 61, etc. The CPU 61 and the main memory device 62 are called a control unit 60.
[0047] Examples of the external devices include an external storage device 63, an output device 64, an operation device 65, and a communication device 66. The external storage device 63 is used, for example, as a storage area that supplements the main memory device 62 and stores a computer program executed by the CPU 61, data processed by the CPU 61, etc.
[0048] The output device 64 is, for example, a display device such as a liquid crystal display or an electroluminescence panel. However, the output device 64 may include a device that outputs sound such as a speaker. The operation device 65 may be, for example, a touch panel with a touch sensor superimposed on the display of the output device 64. The communication device 66 accesses, for example, the network N1 provided by the information communication system 100 and communicates with a server 6A etc. (see Figure 1), which is a computer connected to the network.
[0049] (Processing Example) Figure 4 is a flowchart illustrating event detection processing by the information processing device 6 of the first embodiment. This processing starts, for example, when the information processing device 6 receives a request for executing sensing processing from a requester. Here, the requester is, for example, the UE2-1 mounted on the vehicle C1, the UE2-2 carried by a person, etc.
[0050] In this process, the information processing device 6 first receives the sensing intent from the requester (S1). The sensing intent may be described in natural language. However, the sensing intent in natural language may be defined by words in a predetermined order. The predetermined order is, for example, a combination of words indicating location, a word indicating the object to be sensed, the situation of the object, and a word indicating the type of event. That is, the format of such word combinations may be fixed in the Application Programming Interface (API) provided by the information processing device 6 to the requester UE2.
[0051] In this case, the sensing intent is, for example, "Intersection A, obstacle, presence or absence." Here, Intersection A is, for example, the name of a point searchable in a map database. An obstacle is, for example, an object with dimensions exceeding a certain standard value. Presence or absence is an example of the type of situation or event of the target, and is information that specifies whether or not the object to be sensed exists.
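For the fixed word-order format, a parser can be very small. The following sketch assumes, purely for illustration, that the three fields are separated by commas exactly as in the example above. The names `SensingIntent` and `parse_fixed_intent` are hypothetical and are not part of any API described in the disclosure.

```python
from typing import NamedTuple


class SensingIntent(NamedTuple):
    location: str     # e.g. a point name searchable in a map database
    target: str       # the object to be sensed
    event_type: str   # the situation of the target or type of event


def parse_fixed_intent(text: str) -> SensingIntent:
    """Parse 'location, target, event type' given in the predetermined order."""
    # Drop surrounding whitespace and a trailing period, then split on commas.
    parts = [part.strip() for part in text.strip().rstrip(".").split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected three comma-separated fields: location, target, event type")
    return SensingIntent(*parts)
```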
[0052] The requesting UE2 may, for example, execute an application program (hereinafter referred to as "app") for transmitting a sensing intent to the information processing device 6. The app may, for example, present pull-down menus in the user interface it provides for selecting words indicating location, words indicating the sensing target, and words indicating the status or type of the target. The UE2 can then generate a sensing intent based on the combination of multiple words selected by the user in the pull-down menus and transmit it to the information processing device 6.
[0053] However, UE2 may also receive the sensing intent in natural language from the user's text input or speech, and transmit it to the information processing device 6. In the example in Figure 4, the sensing intent is specified in natural language. For example, the sensing intent is "Tell me if there is an obstacle at intersection A." Processing S1 is an example of receiving a sensing intent in natural language from the sensing requester, specifying the event to be sensed.
[0054] When the information processing device 6 receives a sensing intent written in Japanese as natural language, it performs morphological analysis (S2). As a result of morphological analysis, morphemes separated into parts of speech are obtained, for example, "A, intersection, no, obstacle, ga, aru, ka, oshiete, te". The same applies when the natural language is Chinese. When the natural language is Japanese, the information processing device 6 may unify the verbs obtained as a result of morphological analysis into their terminal forms. On the other hand, when the natural language is formed by a sequence of words, such as English, the information processing device 6 may omit morphological analysis and simply process the sequence of words as is.
[0055] Next, the information processing device 6 refers to a thesaurus (S3) and identifies words indicating location, words indicating the sensing target, and words indicating the type of target situation or event from the results of morphological analysis (or a sequence of words). For example, the thesaurus defines words indicating location as place, point, ahead, before, forward, intersection, etc. Therefore, from the word "intersection" included in the sensing intent, the information processing device 6 can identify the location as "A intersection" by combining it with the preceding proper noun "A".
[0056] Furthermore, for example, a thesaurus defines words that indicate an object as "object," "obstacle," "hindrance," etc. Therefore, the information processing device 6 can identify the word indicating the object as "obstacle" from the word "obstacle" included in the sensing intent.
[0057] Furthermore, for example, the thesaurus defines words that indicate a situation, such as situation, type, exist, be, presence, movement, and speed. Therefore, the information processing device 6 can identify from the word "exist" included in the sensing intent that the word indicating the situation (or type of event) of the target is "exist," and can recognize the intent to request the presence or absence of something.
[0058] Furthermore, for example, a thesaurus may define words indicating instructions as "instruct," "teach," "inform," "show," etc. In this case, the information processing device 6 can identify the word indicating an instruction as "teach" from the word "teach" included in the sensing intent, and recognize the intent requested by the requester.
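The thesaurus lookup of S3 can be sketched as a mapping from slot names to synonym sets. The synonym lists below are the ones quoted in the preceding paragraphs. The dictionary layout, the slot names, and the function `classify_tokens` are assumptions for illustration, and the input is taken to be a token list already produced by morphological analysis and glossed in English.

```python
# Synonym sets taken from the examples in the text; the structure is hypothetical.
THESAURUS = {
    "location": {"place", "point", "ahead", "before", "forward", "intersection"},
    "target": {"object", "obstacle", "hindrance"},
    "situation": {"situation", "type", "exist", "be", "presence", "movement", "speed"},
    "instruction": {"instruct", "teach", "inform", "show"},
}


def classify_tokens(tokens):
    """Map each slot to the first token found in that slot's synonym set."""
    slots = {}
    for token in tokens:
        word = token.lower()
        for slot, synonyms in THESAURUS.items():
            if word in synonyms and slot not in slots:
                slots[slot] = word
    return slots
```

For the tokens `["A", "intersection", "obstacle", "exist", "teach"]`, the proper noun "A" matches no synonym set, so combining it with "intersection" is left to a separate rule.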
[0059] The thesaurus can be built, for example, in the main memory device 62 or the external storage device 63. Alternatively, the thesaurus may be built on another computer connected to the network N1 (such as server 6A in Figure 1). The thesaurus is an example of a first database in which the relationship between information included in a sensing intent and information included in a sensing request is defined. The information processing device 6 may also use a rule-based system instead of the thesaurus, or in conjunction with the thesaurus, to identify words indicating location, words indicating the sensing target, and words indicating the situation or type. The rules of the rule-based system are, for example, as follows:
[0060] Rule: "IF the sensing intent includes a place synonym, THEN IF there are proper nouns before or after the place synonym, THEN the combination of the place synonym and the proper noun identifies the sensing area." According to this rule, if the sensing intent in natural language contains a place synonym in the thesaurus, and there are proper nouns before or after it, the sensing area is determined. For example, if morphological analysis reveals that the proper noun "A" and the word "intersection" (a common noun) are included one after the other, the information processing device 6 processes "A intersection" as a term that identifies the sensing area.
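The thesaurus lookup and the rule above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the dictionary layout, the function names `categorize` and `identify_sensing_area`, and the part-of-speech labels are all assumptions, and the token list stands in for the output of a morphological analyzer.

```python
# Hypothetical thesaurus: category -> synonyms (entries taken from the examples in the text).
THESAURUS = {
    "location": {"place", "point", "ahead", "before", "forward", "intersection"},
    "target": {"object", "obstacle", "hindrance"},
    "situation": {"situation", "type", "exist", "be", "presence", "movement", "speed"},
    "instruction": {"instruct", "teach", "inform", "show"},
}


def categorize(word):
    """Return the thesaurus category of a word, or None if it is not listed."""
    for category, synonyms in THESAURUS.items():
        if word.lower() in synonyms:
            return category
    return None


def identify_sensing_area(tokens):
    """Apply the rule: place synonym + adjacent proper noun -> sensing-area term.

    `tokens` is a list of (surface, part_of_speech) pairs; the labels are
    assumed names, not those of any particular morphological analyzer.
    """
    for i, (surface, _pos) in enumerate(tokens):
        if categorize(surface) != "location":
            continue
        # Check the token before, then the token after, for a proper noun.
        if i > 0 and tokens[i - 1][1] == "proper_noun":
            return f"{tokens[i - 1][0]} {surface}"
        if i + 1 < len(tokens) and tokens[i + 1][1] == "proper_noun":
            return f"{surface} {tokens[i + 1][0]}"
    return None


tokens = [("teach", "verb"), ("obstacle", "noun"), ("exist", "verb"),
          ("A", "proper_noun"), ("intersection", "noun")]
area = identify_sensing_area(tokens)
```

Here `area` holds the term "A intersection", which is then used as the search key for the map database.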
[0061] Next, the information processing device 6 sets a sensing area in which sensing processing will be performed, including a location (point, area) identified by a word indicating a location, and requests the execution of sensing processing (S4). More specifically, the information processing device 6 searches the map database based on the term that identifies the sensing area (for example, "A intersection") obtained in the processing of S3, and finds the geographical location (for example, latitude and longitude) of "A intersection".
[0062] Next, the information processing device 6 determines the sensing area. The sensing area may be identified, for example, by cell identification information (hereinafter referred to as cell ID) that encompasses a point or area specified by a word indicating location, or by a unit of tracking area (TA). The communication area provided by the telecommunications business is divided into TAs and identified by a Tracking Area Code (TAC). A TA includes one or more base stations 3A, i.e., cells. Therefore, TAC can be used as location information. The sensing area may include one or more cell IDs or one or more TACs.
[0063] In this embodiment, the information processing device 6 refers to a base station database containing a sequence of base station IDs, cell IDs, latitude, longitude, and cell radii for each frequency in order to determine the sensing area. The base station database may be built on, for example, a computer on the network N1 (server 6A in Figure 1). Alternatively, the information processing device 6 may have the base station database internally. The information in the base station database is provided, for example, by a telecommunications company operating the information and communication system 100. In this way, the information processing device 6 determines a cell or TA that encompasses the geographical location (e.g., latitude and longitude) of the term (e.g., "A intersection") that identifies the sensing area, which was obtained in the processing of S3.
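The cell lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (`Cell`), the sample entries, and the use of the haversine great-circle distance to test whether a point falls within a cell radius are all hypothetical, and the per-frequency radii mentioned in the text are collapsed into a single radius for brevity.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


@dataclass
class Cell:
    base_station_id: str
    cell_id: str
    lat: float
    lon: float
    radius_m: float
    tac: str


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def covering_cells(cells, lat, lon):
    """Return the cells whose radius encompasses the given point."""
    return [c for c in cells if haversine_m(c.lat, c.lon, lat, lon) <= c.radius_m]


# Hypothetical base station database entries.
BASE_STATIONS = [
    Cell("bs-1", "cell-1", 35.0000, 137.0000, 500.0, "tac-1"),
    Cell("bs-2", "cell-2", 35.0500, 137.0000, 500.0, "tac-1"),
]

# Location of the sensing-area term as returned by the map database (assumed values).
hits = covering_cells(BASE_STATIONS, 35.0010, 137.0010)
sensing_area = {
    "cell_ids": [c.cell_id for c in hits],
    "tacs": sorted({c.tac for c in hits}),
}
```

The resulting `sensing_area` may carry cell IDs, TACs, or both, matching the statement that the sensing area may include one or more cell IDs or one or more TACs.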
[0064] Then, the information processing device 6 requests SENSING 11n, via NEF 11e, to execute sensing processing. Here, NEF 11e handles the secure provision of information to the 5GC from the information processing device 6, which runs an external application. The request to execute sensing processing with a set sensing area is an example of a sensing request corresponding to a sensing intent. Processing S3 and S4 are examples of determining a sensing request corresponding to a sensing intent. Processing S4 is also an example of sending a sensing request corresponding to a sensing intent.
[0065] The information processing device 6 may request SENSING 11n to perform sensing processing via a Subscribe message. A Subscribe message is a message that requests the detection of sensing targets to continue. A Subscribe message causes the sensing processing to be repeated, for example, at a specified frequency, for a specified duration or until a cancellation request via an Unsubscribe message is received. However, the information processing device 6 may also request SENSING 11n to perform sensing processing each time via a Request message.
[0066] The information processing device 6 then acquires sensing data via the NEF 11e (S5). The sensing data includes, for example, the latitude and longitude of the location where the object is detected, the size of the detected object, whether or not it is moving, the speed of movement, the direction of movement, and the material. The processing in S5 is an example of receiving sensing data in response to a sensing request from the SENSING 11n, which acts as a management device.
[0067] Next, the information processing device 6 refers to the rule base and filters the sensing data (S6). The rule base defines filtering rules, i.e., conditions for selecting sensing data, in the form of IF THEN ELSE. Filtering rules include, for example, conditions about the source of the sensing request, conditions about the location where the object was detected, and conditions about the status or type of the detected object. More specifically, conditions about the status or type of the object are exemplified by, for example, conditions about the size of the object, conditions about whether the object is moving, conditions about the speed of movement, conditions about the material, etc. The rules in the rule base are, for example, as follows:
[0068] Rule: "IF requester is a 'vehicle', THEN IF detection point is an intended 'location', THEN IF request includes an 'obstacle', THEN IF detection size of detected object > SS mm, THEN obstacle detected." In this rule, the requester for sensing processing is the UE2 mounted on the vehicle. Furthermore, if the detection point of the target is an intended location, the request is for obstacle detection, and the detected object is larger than SS millimeters, the information processing device 6 determines that an obstacle has been detected. Note that the words "vehicle," "location," "obstacle," etc. included in these rules may be replaced with synonyms defined in the thesaurus and the rules may be modified and applied accordingly.
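The nested rule above can be sketched as follows. This is a minimal illustration only: the class names, the bounding-box representation of the intended location, and the numeric threshold passed in place of "SS" are assumptions chosen for the example, not values given in this disclosure.

```python
from dataclasses import dataclass


@dataclass
class SensingIntent:
    requester_type: str  # e.g. "vehicle"
    target: str          # e.g. "obstacle"
    area: tuple          # (min_lat, min_lon, max_lat, max_lon), an assumed representation


@dataclass
class Detection:
    lat: float
    lon: float
    size_mm: float


def in_area(det, area):
    """True if the detection point lies inside the intended location."""
    min_lat, min_lon, max_lat, max_lon = area
    return min_lat <= det.lat <= max_lat and min_lon <= det.lon <= max_lon


def obstacle_detected(intent, det, size_threshold_mm):
    """Mirror the IF/THEN chain of the rule; the threshold plays the role of SS."""
    if intent.requester_type == "vehicle":
        if in_area(det, intent.area):
            if intent.target == "obstacle":
                if det.size_mm > size_threshold_mm:
                    return True
    return False


intent = SensingIntent("vehicle", "obstacle", (34.99, 136.99, 35.01, 137.01))
large = Detection(35.0, 137.0, 800.0)
small = Detection(35.0, 137.0, 20.0)
# 300.0 is an arbitrary illustrative threshold, not a value from the source.
results = [obstacle_detected(intent, d, size_threshold_mm=300.0) for d in (large, small)]
```

Keeping the threshold as a parameter reflects that SS is left unspecified in the rule and could differ per requester or per target.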
[0069] The rule base can be built, for example, in the main memory 12 or the external memory 13. Alternatively, the rule base may be built on another computer connected to the network N1 (such as server 6A in Figure 1). The rule base is an example of a second database in which rules for identifying events corresponding to sensing intents are defined. Furthermore, the thesaurus and the rule base may be managed in a database on the same computer. That is, the information processing device 6 or the computer managing the thesaurus and rule base may replace terms included in the rules of the rule base with synonyms from the thesaurus, refer to the rules, and perform filtering.
[0070] The information processing device 6 then determines whether the sensing data conforms to the sensing intent according to one of the rules in the rule base (S7). The processes in S6 and S7 are examples of detecting an event specified in the sensing intent from the received sensing data. Furthermore, the processes in S6 and S7 are also examples of detecting an event from sensing data according to a rule.
[0071] If the sensing data matches the sensing intent, the information processing device 6 can determine that it has detected an event corresponding to the sensing intent. Therefore, the information processing device 6 notifies the requester of the detected event by transmitting the sensing data (S8). The process in S8 is an example of transmitting the detected event to the requester. On the other hand, if the sensing data does not match the sensing intent, the information processing device 6 proceeds to S9.
[0072] The information processing device 6 then determines whether or not to terminate the process (S9). For example, if the sensing request requests continuous sensing processing (sensing via a Subscribe message), the information processing device 6 returns the process to S5 and processes the next sensing data. On the other hand, if the sensing request requests a single sensing process (sensing via a Request message), the information processing device 6 terminates the process.
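The loop formed by S5 to S9 can be sketched as follows. This is a minimal illustration, assuming that sensing data arrive as an iterable and that the filtering of S6 and S7 is supplied as a predicate; the function name, the dictionary layout of the sensing data, and the threshold in the example are hypothetical.

```python
def run_event_detection(sensing_stream, matches_intent, notify, continuous):
    """Process sensing data one item at a time.

    continuous=True corresponds to a request made via a Subscribe message,
    continuous=False to a single request made via a Request message.
    """
    for sensing_data in sensing_stream:   # S5: acquire sensing data
        if matches_intent(sensing_data):  # S6, S7: filter and judge
            notify(sensing_data)          # S8: notify the requester
        if not continuous:                # S9: terminate after one item
            break


# Illustrative data; the 300 mm threshold is an arbitrary value.
stream = [{"size_mm": 50}, {"size_mm": 400}, {"size_mm": 900}]
notified = []
run_event_detection(stream, lambda d: d["size_mm"] > 300, notified.append, continuous=True)
```

With `continuous=True`, `notified` collects every item that satisfies the predicate; with `continuous=False`, only the first item is examined.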
[0073] (Effects of the Embodiment) As described above, the information processing device 6 receives sensing intent in natural language from the requester, such as UE2-1, and notifies the management device, SENSING 11n, of a sensing request corresponding to the sensing intent. The information processing device 6 then acquires sensing data in response to the sensing request and detects events corresponding to the sensing intent from the acquired sensing data. Furthermore, the information processing device 6 transmits the detected events to the requester. In this way, the information processing device 6 can grasp the intent of the user, such as UE2-1, and perform sensing processing that conforms to the user's intent. In other words, the information processing device 6 can respond flexibly and accurately to the user's requests. Furthermore, the processing performed by SENSING 11n as described above may also be performed by other NF11s such as SMF11c, PCF11d, NEF11e, NRF11g, NSSF11h, AUSF11i, UDM11j, and NWDAF 11k, or it may be performed in cooperation with multiple NFs.
[0074] Furthermore, as described above, the thesaurus can be considered an example of a first database in which the relationship between information contained in sensing intent and information contained in sensing requests corresponding to sensing intent is defined. The information processing device 6 then searches the thesaurus based on the sensing intent and determines the sensing request corresponding to the sensing intent. Therefore, the information processing device 6 can receive the user's sensing intent in natural language and process it appropriately.
[0075] Furthermore, as described above, the rule base can be considered an example of a second database in which rules are defined for identifying events corresponding to sensing intents based on sensing results. Therefore, the information processing device 6 can receive the user's sensing intent in natural language and appropriately detect events from the sensing results according to the rules of the rule base.
[0076] Furthermore, in this embodiment, the sensing intent described in natural language includes information for identifying at least one of the object on which the event is detected, the location where the event is detected, the circumstances of the object, and the type of event. Therefore, the information processing device 6 can detect events that meet the conditions of "what," "where," and "how."
[0077] <Second Embodiment> The event detection process according to the second embodiment will be described with reference to Figures 5 to 8. In the first embodiment described above, the information processing device 6 detected events that matched the sensing intent in natural language using a thesaurus and a rule base. In this embodiment, a more limited sensing intent is processed in the same configuration as the first embodiment (Figures 1 to 3). That is, in this embodiment, the information processing device 6 performs processing assuming that the sensing intent includes "who," "where," "what," and "how."
[0078] The information processing device 6 then refers to the target location definition dictionary to recognize "where" included in the sensing intent. Furthermore, the information processing device 6 refers to the sensing target definition dictionary to recognize "what" included in the sensing intent. In addition, the information processing device 6 refers to the event type definition dictionary to recognize "how" included in the sensing intent.
[0079] These dictionaries, like the dictionaries in the first embodiment, are built on the main memory 12, the external memory 13, or another computer connected to the network N1 (such as server 6A in Figure 1). The configuration of the information processing device 6 other than these dictionaries is the same as in the first embodiment. Therefore, Figures 1 to 3 are directly applicable to the second embodiment. In addition, the information processing device 6 refers to a rule base similar to that in the first embodiment in order to process sensing intent.
[0080] (Data example) As described above, in this embodiment, the sensing intent in natural language includes information corresponding to "who," "where," "what," and "how." The sensing intent is exemplified as follows:
[0081] Sensing intent: "A car wants to detect whether or not an obstacle is present at a distance of 100m." In this example, "who" is "the car," "where" is "at 100m away," "what" is "an obstacle to the car," and "how" is "as whether or not it is present." The format of the information corresponding to "who," "where," "what," and "how" may be fixed in the Application Programming Interface (API) provided to the requesting UE2 by the information processing device 6, as in the first embodiment. Alternatively, the UE2 may, using an application similar to the first embodiment, present pull-down menus for selecting the information corresponding to "who," "where," "what," and "how." The information processing device 6 has a dictionary illustrated in Figures 5 to 7 to generate sensing requests based on such sensing intents.
[0082] Figure 5 illustrates data for a target location definition dictionary corresponding to "where". This data defines information corresponding to "where". Examples of target location definition dictionary data include Ym ahead, Ym forward, Ym back, Ym to the rear, C city D town E block F number G, (latitude Tx, longitude Ny), CR intersection, and RC railroad crossing. Of this data, Ym, C through G, Tx, Ny, CR, RC, etc., may be parameters that can be set by the requesting UE2. Therefore, UE2 may accept input from the user for the values to be set for these parameters via a user interface. Then, UE2 may set the values received from the user for the above parameters and generate a sensing request.
[0083] On the other hand, the information processing device 6 only needs to recognize "where" included in the sensing intent based on words (common nouns) such as "ahead," "forward," "back," "rear," "city," "latitude," "longitude," "intersection," and "railroad crossing," excluding these parameters. Furthermore, when processing sensing intent in Japanese, the information processing device 6 only needs to recognize "where" when these words (common nouns) are accompanied by morphemes indicating location such as "at," "in," or "at." Furthermore, when processing sensing intent in English, the information processing device 6 only needs to recognize "where" when these words (common nouns) are specified together with prepositions indicating location such as "in," "at," "on," "upon," or "over." The same processing applies to Chinese as well.
[0084] Furthermore, the information processing device 6 may define, as a rule in the rule base, that "where" is recognized when morphemes indicating location, such as "at," "in," or "at," are attached to these words (common nouns). Similarly, rules for recognizing "where" may be defined in the rule base for English and Chinese. However, as mentioned above, if the format of the information corresponding to "who," "where," "what," and "how" is fixed in the API, such recognition processing in the information processing device 6 is unnecessary. This is because the requesting UE2-1, etc., notifies the information processing device 6 of "who," "where," "what," and "how" according to the API's provisions.
[0085] Figure 6 illustrates the data in the sensing target definition dictionary corresponding to "what". The data in the sensing target definition dictionary includes common nouns such as object, thing, obstacle, person, intersection, and railroad crossing. When processing sensing intent in Japanese, the information processing device 6 only needs to recognize "what" when these words (common nouns) are accompanied by morphemes indicating an object, such as "を". When processing sensing intent in English, the information processing device 6 may also recognize "what" when these words (common nouns) are the object of a verb. The same applies to Chinese.
[0086] Furthermore, the information processing device 6 may define a rule-based mechanism to recognize "what" when a morpheme indicating an object, such as "を," is attached to these words (common nouns). Similarly, a rule-based mechanism may be defined for recognizing "what" in the case of English and Chinese. However, as mentioned above, if the format of information corresponding to "who," "where," "what," and "how" is fixed by the API, such rule-based recognition processing is unnecessary.
[0087] Figure 7 illustrates the data in the event type definition dictionary corresponding to "how". The event type definition dictionary contains words that define the type of event or the situation of the object. As shown in Figure 7, the data in the event type definition dictionary includes common nouns such as existence, presence or absence, being, action, movement, speed, precision, and accuracy. The information processing device 6 only needs to recognize "how" included in the sensing intent based on these words (common nouns). However, as mentioned above, if the format of the information corresponding to "who," "where," "what," and "how" is fixed in the API, then such recognition processing in the information processing device 6 is unnecessary.
[0088] (Processing Example) Figure 8 is a flowchart illustrating the event detection process of the second embodiment. In this process, the information processing device 6 first receives a sensing intent from the requester (S11). Next, the information processing device 6 identifies "who" made the request from the source information of the sensing intent (S12). For example, the information processing device 6 has a user-defined database that records the correspondence between the user identification information of the UE2 accessing the information processing device 6 and the way the UE2 moves. When the information processing device 6 receives a request from the user of the UE2 to use the information processing it provides, it may issue user identification information to identify the user. The information processing device 6 may then have the user input the way the UE2 moves (e.g., in a vehicle, carried by the user) along with the user identification information, and record the way the UE2 moves in the user-defined database.
[0089] In this way, when the information processing device 6 receives a sensing intent (including user identification information) from the requester, it only needs to identify the method of movement corresponding to the user identification information using the user-defined database. The method of movement is, as described above, for example, in-vehicle or carried by the user. If the method of movement of UE2 is in-vehicle, information identifying the vehicle C1, etc., may be registered in the user-defined database. In this way, the information processing device 6 only needs to recognize the requester based on the user identification information and identify "who" made the request.
[0090] Next, the information processing device 6 identifies "where" from the sensing intent based on the target location definition dictionary and determines the sensing target location. As mentioned above, "where" is, for example, "100 m away". In this case, the information processing device 6 obtains the current position of the vehicle C1, etc. The UE2 of the vehicle C1, etc., can obtain its current position using a Global Positioning System (GPS) or Global Navigation Satellite System (GNSS) and provide it to the information processing device 6. Then, the information processing device 6 can identify the location information (for example, latitude and longitude) corresponding to "where" from the current position of the vehicle C1, etc., and the relative distance information such as "100 m away" included in the sensing intent.
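The conversion from a current position and a relative distance to latitude and longitude can be sketched as follows. This is a minimal illustration under assumptions not stated in this disclosure: the direction of travel (`heading_deg`, measured clockwise from north) is taken to be available from the UE2 together with the position, and the Earth is treated as a sphere with a mean radius.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def offset_position(lat_deg, lon_deg, heading_deg, distance_m):
    """Return the (lat, lon) reached by moving distance_m along heading_deg.

    Uses the standard great-circle destination formula on a sphere.
    """
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(heading_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    lat2 = math.asin(
        math.sin(lat1) * math.cos(delta)
        + math.cos(lat1) * math.sin(delta) * math.cos(theta)
    )
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


# Assumed example: vehicle at (35.0, 137.0) heading due north, "100 m away".
target_lat, target_lon = offset_position(35.0, 137.0, heading_deg=0.0, distance_m=100.0)
```

The resulting coordinates can then be matched against the base station database to select the cell ID or TAC that serves as the sensing area.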
[0091] The information processing device 6 then sets a sensing area that includes the location information corresponding to "where". The method for setting the sensing area is the same as in S4 of the first embodiment. That is, the information processing device 6 refers to a base station database that includes base station ID, cell ID, latitude, longitude, and a sequence of cell radii for each frequency, and obtains a cell ID or TAC that includes the location information corresponding to "where" (for example, latitude and longitude), and sets it as the sensing area. Furthermore, the information processing device 6 identifies the sensing accuracy from the sensing intent based on the event type definition dictionary (S13). However, the sensing accuracy does not have to be included in the sensing intent.
[0092] The information processing device 6 requests SENSING 11n to execute a sensing process specifying the sensing area and, if necessary, the sensing accuracy (S14). The process in S14 is the same as the process in S4 of the first embodiment (Figure 4).
[0093] The information processing device 6 then receives sensing data via the NEF 11e (S15). The sensing data, as in the first embodiment, includes, for example, the latitude and longitude of the location where the object was detected, the size of the detected object, whether or not it was moving, the speed of movement, the direction of movement, the material, etc.
[0094] Next, the information processing device 6 refers to the sensing target definition dictionary based on the sensing intent and identifies "what" (for example, "obstacle"). The information processing device 6 also refers to the event type definition dictionary based on the sensing intent and identifies "how" (for example, "existence or absence"). Then, similar to the first embodiment, the information processing device 6 extracts the target from the sensing result using the rules in the rule base (Example 1) (S16). The process in S16 is the same as the process in S6 of the first embodiment (Figure 4). Also, the processes in S17 to S19 in Figure 8 are the same as the processes in S7 to S9 of the first embodiment (Figure 4).
[0095] However, if, for example, the object corresponding to "what" is detected in an image, a multimodal large language model (LLM) may be applied in the process of S16 to determine "what" (type of object, etc.). An example of a multimodal LLM is a system that combines a Vision Transformer (ViT) with a transformer for language processing.
[0096] ViT decomposes a single image into multiple sub-regions (patches) and places these sub-regions in a space of embedding vectors. ViT processes sub-images similarly to words in a sentence, learning from a large amount of existing images to recognize images. Therefore, by combining ViT with a transformer, it becomes possible to perform processing based on the co-occurrence probability of image patch embedding vectors and word embedding vectors. For example, it can recognize that the co-occurrence probability of an image of an object blocking a road and the word "obstacle" is higher than the co-occurrence probability of an image of a rose and the word "obstacle."
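The decomposition of an image into patches can be sketched as follows. This is a minimal illustration that covers only the splitting and flattening step; the learned projection of each flattened patch into the embedding vector space is omitted, and the function name and the tiny 4 x 4 image are assumptions for the example.

```python
def split_into_patches(image, patch):
    """Split a 2-D image (list of rows) into flattened patch x patch sub-regions.

    Patches are returned in row-major order, like the words of a sentence.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            patches.append(flat)
    return patches


# A 4 x 4 single-channel image whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2)
```

Each entry of `patches` corresponds to one sub-region that a ViT would map to an embedding vector before applying the transformer.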
[0097] The information processing device 6 may cooperate with a multimodal LLM built on a computer connected to the network N1 (for example, server 6A in Figure 1) to determine whether the image received in S15 corresponds to "what". Alternatively, the information processing device 6 may be equipped with a multimodal LLM and determine whether the image received in S15 corresponds to "what".
[0098] (Effects of the Embodiment) As described above, the information processing device 6 can grasp the intentions of the user, such as UE2, and perform sensing processing that is in line with the user's intentions. In other words, the information processing device 6 can respond flexibly and accurately to the user's requests.
[0099] <Third Embodiment> In the first embodiment described above, the information processing device 6 processed sensing intent using a thesaurus and a rule base. In the second embodiment described above, the information processing device 6 processed sensing intent using a target location definition dictionary, a sensing target definition dictionary, an event type definition dictionary, and a rule base. In this embodiment, the information processing device 6 processes sensing intent described in natural language using generative AI, an LLM, a multimodal LLM, or the like. The configuration and processing of this embodiment are the same as in the first embodiment, except that an LLM or the like is used. Therefore, for example, the configurations in Figures 1 to 3 can be applied directly to this embodiment.
[0100] (Language Models) Language models can be exemplified as models of the generation probability of individual sentences. For example, a language model can be described as a model that defines the probability of simultaneous occurrence of words, or morphemes obtained by separating particles and auxiliary verbs, that are included in a sentence (see, for example, Tsuyoshi Okadome, "Fundamentals of Deep Learning Generative AI," 1st edition, Kyoritsu Shuppan, March 30, 2024 (hereinafter, [Okadome])). Another example of a language model is the sequence transformation model. A sequence transformation model can be described as a model that deals with the probability of transforming a sequence X of a certain word (or morpheme) into another sequence Y. In other words, a sequence transformation model is a model of the probability of sequence Y occurring given that sequence X has occurred.
[0101] In these language models, each word (or morpheme) is represented by a single vector in an N-dimensional vector space called an embedding vector. In the embedding vector space, the embedding vectors of two words that are relatively likely to occur simultaneously are positioned at relatively close angles to each other. Therefore, the larger the dot product of the embedding vectors corresponding to two words, the higher the probability of those two words occurring simultaneously. Furthermore, in the embedding vector space, the difference values of the embedding vectors between two pairs of words with similar interrelationships will be approximately the same, and linear operations will hold. For example, in the embedding vectors, France - Paris = Japan - Tokyo holds true, where - is the subtraction sign.
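The two properties described above, the dot product as a measure of relatedness and the approximate equality of difference vectors, can be sketched with hand-made toy vectors. The three-dimensional values below are assumptions constructed purely for illustration; they are not taken from any trained language model.

```python
import math

# Hand-made toy embedding vectors (illustrative values only).
EMBEDDINGS = {
    "Paris":  [0.1, 0.9, 0.0],
    "France": [0.1, 0.9, 1.0],
    "Tokyo":  [0.9, 0.1, 0.0],
    "Japan":  [0.9, 0.1, 1.0],
}


def subtract(a, b):
    """Element-wise difference of two vectors."""
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


france_minus_paris = subtract(EMBEDDINGS["France"], EMBEDDINGS["Paris"])
japan_minus_tokyo = subtract(EMBEDDINGS["Japan"], EMBEDDINGS["Tokyo"])

# The two country-capital pairs share (approximately) the same difference vector.
analogy_holds = all(
    math.isclose(x, y, abs_tol=1e-9)
    for x, y in zip(france_minus_paris, japan_minus_tokyo)
)

# A related pair has a larger dot product than an unrelated pair.
related = dot(EMBEDDINGS["Japan"], EMBEDDINGS["Tokyo"])
unrelated = dot(EMBEDDINGS["Japan"], EMBEDDINGS["Paris"])
```

In a trained model the equality would hold only approximately, which is why the comparison uses a tolerance rather than exact equality.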
[0102] LLM can be described as a model that has been pre-trained on the structure of words (or morphemes) in a large amount of text and fine-tuned for a specific application. LLM makes it possible to predict the probability of word and sentence occurrence, or to determine the likelihood of sentences generated by a computer.
[0103] A multimodal LLM can be described as a large language model capable of processing multiple modalities (formats such as text, images, audio, and video), including information other than text. In other words, a multimodal LLM can accept not only text but also images and audio as input and has the ability to understand and generate that information. For example, by learning data that includes images and their corresponding sentences, a multimodal LLM can learn the relationship between a certain word (or sentence) and images that have a high probability of occurring simultaneously. Therefore, a multimodal LLM can generate the word "obstacle" in response to a photograph of an object blocking a road. In this way, a multimodal LLM integrates information from different modalities to understand meaning and return a response. On the other hand, generative AI can be described as a multimodal LLM primarily specialized in content generation.
[0104] (Processing Example) Figures 9 and 10 are flowcharts illustrating the event detection process of the third embodiment. In this process, processes S21 and S22 are the same as S11 and S12 in the second embodiment. In this embodiment, the sensing intent is assumed to be, for example, "to detect the presence or absence of an obstacle to a car 100m away with an error of 1m or less."
[0105] However, in this embodiment, the information processing device 6 cooperates with a generative AI or multimodal LLM built on a computer connected to the network N1 (for example, server 6A in Figure 1). Alternatively, in this embodiment, the information processing device 6 may have a generative AI or multimodal LLM internally. Therefore, in this embodiment, the information processing device 6 may accept the sensing intent in natural language as a prompt for the generative AI or multimodal LLM.
[0106] In this embodiment, the information processing device 6 decomposes the sensing intent described in natural language into morphemes and converts them into word embedding vectors (S23). A procedure for converting words into word embedding vectors has been reported, for example, as Word2Vec (Mikolov, T. et al. (2013) Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781).
[0107] Next, the information processing device 6 uses the multimodal LLM to extract and determine, in the word embedding vector space of the language model, the words that best correspond to "where" (location) and "how" (accuracy) from the words included in the sensing intent (S24). However, the information processing device 6 may simply compute the dot product between the embedding vector of "where" and the embedding vector of each word included in the sensing intent, and determine the word with the largest value as the word corresponding to "where". In this way, the information processing device 6 determines "100m away" as the word corresponding to "where" from the sensing intent. Then, the information processing device 6 sets the sensing area in the same way as in S14 of Figure 8 of the second embodiment.
[0108] Similarly, the information processing device 6 may simply compute the dot product between the embedding vector of "how" and the embedding vector of each word included in the sensing intent, and determine the word with the largest value to be the word corresponding to "how". Alternatively, the information processing device 6 may simply compute the dot product between the embedding vector of "precision" and the embedding vector of each word included in the sensing intent, and determine the word with the largest value to be the word corresponding to "precision". In this way, the information processing device 6 determines "within an error of 1 m" as the word corresponding to "precision" from the sensing intent.
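The simple dot-product selection described above can be sketched as follows. This is a minimal illustration with hand-made toy vectors; the embedding values, the segmentation of the sensing intent into phrases, and the function name `best_match` are assumptions, and a real system would obtain the vectors from a trained language model.

```python
# Hand-made toy embedding vectors (illustrative values only).
EMBEDDINGS = {
    "where": [1.0, 0.0, 0.0],
    "what": [0.0, 1.0, 0.0],
    "precision": [0.0, 0.0, 1.0],
    "100m away": [0.9, 0.1, 0.0],
    "obstacle": [0.1, 0.9, 0.0],
    "car": [0.3, 0.7, 0.0],
    "within an error of 1 m": [0.1, 0.0, 0.9],
}


def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def best_match(query, candidates):
    """Return the candidate whose embedding has the largest dot product with the query."""
    query_vec = EMBEDDINGS[query]
    return max(candidates, key=lambda word: dot(query_vec, EMBEDDINGS[word]))


# Words (phrases) assumed to have been extracted from the sensing intent.
intent_words = ["car", "100m away", "obstacle", "within an error of 1 m"]

where_word = best_match("where", intent_words)
precision_word = best_match("precision", intent_words)
what_word = best_match("what", intent_words)
```

The same function serves for each query term, so "where", "precision", "what", and "how" differ only in the query vector that is compared against the words of the sensing intent.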
[0109] Then, the information processing device 6 requests SENSING 11n to execute a sensing process specifying the sensing area and sensing accuracy via NEF 11e (S25). The process in S25 is the same as the process in S14 in Figure 8 of the second embodiment.
[0110] The information processing device 6 then acquires sensing data via the NEF 11e (S26). The sensing data, as in the first embodiment, includes, for example, the latitude and longitude of the location where the object was detected, the size of the detected object, whether or not it was moving, the speed of movement, the direction of movement, the material, etc.
[0111] Next, the information processing device 6 extracts and determines the words that best correspond to "what" and "how" (type) from the words included in the sensing intent in the embedding vector space of the multimodal LLM (S27). Here, the information processing device 6 determines "obstacle" as "what". The information processing device 6 also determines "existence or non-existence" as "how" (type of event).
[0112] However, the information processing device 6 may simply perform the dot product of the words included in the sensing intent, "what," with the embedding vector, and select the word with the largest value as the word corresponding to "what." Alternatively, the information processing device 6 may simply perform the dot product of the words included in the sensing intent, "how" (type of event), with the embedding vector, and select the word with the largest value as the word corresponding to "how" (type of event).
[0113] Next, the information processing device 6 recognizes the object corresponding to "what" from the acquired sensing results (e.g., images) using image recognition (a convolutional neural network (CNN), a multimodal LLM, etc.) (S28). In this case, the CNN is assumed to have completed deep learning and to be able to distinguish objects corresponding to "what" from, for example, an image of an object on a road. The multimodal LLM is assumed to have completed pre-training and fine-tuning of image recognition for wireless sensing images. As described above, the information processing device 6 is assumed either to cooperate with the CNN, multimodal LLM, etc., or to be equipped with them. The processing in Figure 9 then continues to Figure 10 via symbol A1.
[0114] Next, when the information processing device 6 repeatedly acquires sensing data through the process in S35, it determines whether there is movement based on the time change between the previous sensing result and the current sensing result, and calculates the movement speed if there is movement (S31). Here, in the process of S31, the information processing device 6 may calculate whether there is movement and the movement speed from the change in the image included in the sensing data. In the process of S31, the information processing device 6 may calculate whether there is movement and the movement speed from the change in the position of the object included in the sensing data (for example, latitude and longitude).
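As a rough sketch of the position-based variant of S31, the presence of movement and the movement speed can be estimated from two successive latitude/longitude fixes. The haversine distance, the `min_distance_m` noise threshold, and the `(timestamp, lat, lon)` tuple layout are assumptions made for this example and are not specified in the disclosure.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula

Fix = Tuple[float, float, float]  # (timestamp in seconds, latitude, longitude)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def movement(prev: Fix, curr: Fix, min_distance_m: float = 0.5) -> Tuple[bool, float]:
    """Compare the previous and current sensing results.

    Returns (is_moving, speed in m/s). Displacements below min_distance_m
    are treated as measurement noise (hypothetical threshold).
    """
    dt = curr[0] - prev[0]
    if dt <= 0:
        raise ValueError("current fix must be later than previous fix")
    distance = haversine_m(prev[1], prev[2], curr[1], curr[2])
    if distance < min_distance_m:
        return False, 0.0
    return True, distance / dt


# Example: the object moved 0.001 degrees of latitude north in 10 seconds.
moving, speed = movement((0.0, 35.0, 137.0), (10.0, 35.001, 137.0))
```

A displacement of 0.001 degrees of latitude is roughly 111 m, so the example yields a speed of about 11 m/s.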
[0115] Next, the information processing device 6 filters the sensing data according to the "what" and "how" of the sensing intent (S32). That is, the information processing device 6 determines whether the size, presence or absence of movement, movement speed, movement direction, material, etc. of the detected object included in the sensing data conform to the "what" and "how" determined by the processing in S27. If the sensing data conforms to the sensing intent as a result of the processing in S32 (YES in S33), the information processing device 6 notifies the requester of the occurrence of the event (S34). On the other hand, if the sensing data does not conform to the sensing intent (NO in S33), the information processing device 6 proceeds to processing in S35.
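The filtering and notification steps S32 to S34 might look like the following sketch. The `Detection` and `IntentFilter` types, their fields, and the two supported "how" values ("presence" and "moving") are hypothetical simplifications; the disclosure leaves the concrete conformance rules open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str        # object type recognized in S28 (assumed field)
    moving: bool      # result of the movement check in S31
    speed_mps: float  # movement speed from S31


@dataclass
class IntentFilter:
    what: str  # word determined as "what" in S27
    how: str   # word determined as "how" (type of event) in S27


def conforms(det: Detection, flt: IntentFilter) -> bool:
    """S32/S33: does one detection match the 'what' and 'how' of the intent?"""
    if det.label != flt.what:
        return False
    if flt.how == "presence":
        return True
    if flt.how == "moving":
        return det.moving
    return False  # unknown event types never match in this sketch


def events_to_notify(detections: List[Detection], flt: IntentFilter) -> List[str]:
    """S34: build one notification per conforming detection.
    Non-conforming detections are skipped (the NO branch leading to S35)."""
    return [f"event: {flt.how} of {flt.what}" for d in detections if conforms(d, flt)]


detections = [Detection("obstacle", False, 0.0), Detection("vehicle", True, 12.0)]
notifications = events_to_notify(detections, IntentFilter("obstacle", "presence"))
```

In this example only the stationary obstacle conforms to the intent, so a single notification is produced.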
[0116] As described above, the sensing intent in this embodiment is a prompt to the generative AI or multimodal LLM. Therefore, it can be said that the information processing device 6 detects events corresponding to the sensing intent based on the prompt using the generative AI or multimodal LLM. Furthermore, it can be said that the information processing device 6 filters and outputs sensing data according to the received prompt.
[0117] The subsequent processing is the same as in the first and second embodiments. For example, if the sensing request requests continuous sensing processing (sensing request via a Subscribe message), the information processing device 6 proceeds to S26 in Figure 9, which is continued by symbol A2, and processes the next sensing data. On the other hand, if the sensing request requests a single sensing process (sensing request via a Request message), the information processing device 6 terminates processing.
[0118] (Effects of the Embodiment) As described above, the information processing device 6 can grasp the intent of the user, such as UE2, using a language model, LLM, or multimodal LLM, and perform sensing processing that matches the user's intent. Furthermore, the information processing device 6 can filter the results of the sensing processing using a language model, LLM, or multimodal LLM to obtain sensing results that match the sensing intent.
[0119] Furthermore, the sensing intent received in natural language serves as a prompt to the generative AI, LLM, or multimodal LLM. Therefore, the information processing device 6 uses the generative AI, LLM, or multimodal LLM to detect events corresponding to the sensing intent based on the prompt. In other words, the information processing device 6 can appropriately filter the sensing data using the generative AI, LLM, or multimodal LLM and acquire data that matches the sensing intent.
[0120] Furthermore, the information processing device 6 filters and outputs the sensing results according to the received prompt. Therefore, the information processing device 6 can convert requests addressed to the generative AI, LLM, or multimodal LLM into requests for SENSING 11n, and appropriately incorporate the generative AI, LLM, or multimodal LLM into the sensing process.
[0121] <Fourth Embodiment> The sensing process according to the fourth embodiment will be described with reference to Figure 11. In the first to third embodiments described above, an example of a process for acquiring sensing data that matches the sensing intent described in natural language was shown. In this embodiment, the information processing device 6 aggregates sensing intents from multiple requesters, efficiently notifies SENSING 11n of the sensing request, suppresses resource consumption, and acquires the sensing result. In this embodiment as well, the configurations in Figures 1 to 3 are applied as they are, just as in the first to third embodiments described above.
[0122] In this embodiment, the information processing device 6 has already notified SENSING 11n of a sensing request and has stored in the database the parameters to be set for the sensing request corresponding to "where," "what," and "how" for the request currently being processed for sensing. Furthermore, the information processing device 6 has also stored in the database the parameters corresponding to "where," "what," and "how" for the sensing data already acquired.
[0123] Figure 11 is a flowchart illustrating sensing instruction aggregation process A of the fourth embodiment. The process in Figure 11 can be incorporated into the processes of S4 and S5 in Figure 4 of the first embodiment, S14 and S15 in Figure 8 of the second embodiment, or S25 and S26 in Figure 9 of the third embodiment. In this process, the information processing device 6 creates a new sensing request corresponding to the sensing intent received from the requester (S41). The process of S41 is the same as the process described in S4 in Figure 4 of the first embodiment, S14 in Figure 8 of the second embodiment, and S25 in Figure 9 of the third embodiment.
[0124] Next, the information processing device 6 determines whether the sensing area of the newly created sensing request is the same as that of an existing request (S42). Suppose, for example, that the information processing device 6 has already sent a request to SENSING 11n and is waiting to receive sensing data. In this case, the information processing device 6 only needs to determine whether the sensing area of the request already sent to SENSING 11n matches the sensing area of the new request. The information processing device 6 also determines whether the sensing area of the available data among the sensing data already acquired matches the sensing area of the new request. In this way, the information processing device 6 determines whether an existing sensing request can substitute for the new sensing request.
[0125] Here, "available" means, for example, that the acquired sensing data is static data with little temporal variation. Static data is, for example, data detecting a building. On the other hand, the information processing device 6 only needs to exclude data detecting moving objects or obstacles from the static data. Note that the determination in S42 is not limited to whether or not the sensing areas are common. The information processing device 6 may also determine whether or not to substitute an existing sensing request for a new sensing request based on whether or not the conditions corresponding to "what," "where," and "how" are common.
[0126] If the determination in S42 is YES, the information processing device 6 determines whether the accuracy of the newly created sensing request is less than or equal to the accuracy of the existing request (S43). That is, it determines whether the accuracy of the existing request satisfies the accuracy of the newly created sensing request.
[0127] If the answer in S43 is YES, the information processing device 6 acquires the sensing results of the existing request (S44). That is, when the existing request is still pending, the information processing device 6 waits for the sensing data corresponding to the existing request, acquires it, and transmits it to the requester. When available sensing data has already been acquired, the information processing device 6 refers to that data and transmits it to the requester. On the other hand, if the answer in S42 or S43 is NO, the information processing device 6 notifies SENSING 11n of a new sensing request and acquires the results (S45).
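The decision flow of S42 to S45 can be sketched as follows. This is an illustration under stated assumptions: sensing areas are compared by a hypothetical `area_id` string, and accuracy is modelled as a maximum tolerated error in metres (`max_error_m`), so that an existing request satisfies a new one when its error bound is equal or tighter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SensingRequest:
    area_id: str        # identifier of the sensing area (assumed representation)
    max_error_m: float  # requested accuracy as a maximum error in metres


def find_reusable(new: SensingRequest,
                  existing: List[SensingRequest]) -> Optional[SensingRequest]:
    """Return an existing request whose results can serve the new request.

    S42: the sensing areas must be the same.
    S43: the existing request must be at least as accurate as the new one.
    Returns None when a new request must be sent to SENSING 11n (S45).
    """
    for req in existing:
        same_area = req.area_id == new.area_id
        accurate_enough = req.max_error_m <= new.max_error_m
        if same_area and accurate_enough:
            return req  # S44: reuse the results of the existing request
    return None


pending = [SensingRequest("crossing-A", 1.0), SensingRequest("crossing-B", 5.0)]
reuse = find_reusable(SensingRequest("crossing-A", 2.0), pending)    # looser need
no_reuse = find_reusable(SensingRequest("crossing-B", 1.0), pending)  # tighter need
```

Here the first new request is served by the pending 1 m request for the same area, while the second requires a new sensing request because the pending request for that area is less accurate.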
[0128] (Effects of the Embodiment) In sensing processing, if it is assumed that sensing data will be continuously received, at least one of the communications for sensing requests and result acquisition will occur frequently. Furthermore, if the amount of result acquisition is large, constraints may arise on at least one of the communication costs and processing resources. The information processing device 6 of this embodiment efficiently performs sensing according to the sensing intention by aggregating the sensing intentions included in multiple sensing requests.
[0129] In other words, in this embodiment, the information processing device 6 determines whether satisfying the sensing conditions included in the existing first sensing request also satisfies the sensing conditions included in the second sensing request that would newly be notified to SENSING 11n. The existing first sensing request is, for example, a sensing request that the information processing device 6 has already sent to SENSING 11n, which acts as a management device. In such a case, the information processing device 6 does not send the second sensing request to the management device, but detects the event from the sensing data for the first sensing request. Therefore, by aggregating the sensing intents included in multiple sensing requests from multiple requesters, sensing according to the sensing intent can be performed efficiently.
[0130] (Modified Example) Figure 12 is a flowchart illustrating a modified example of sensing instruction aggregation process B. In the process shown in Figure 11, the information processing device 6 determined whether the sensing area of a newly created sensing request was the same as that of an existing sensing request. However, the information processing device 6 may also determine, using the same procedure as in Figure 11, whether multiple sensing intentions received from multiple UE2s during the same time period are common. For example, the information processing device 6 may use the same procedure as in Figure 11 to select only one sensing intention with the highest request accuracy (representative sensing intention) from among multiple sensing intentions received from multiple UE2s during the same time period, and generate a sensing request.
[0131] In other words, in the process shown in Figure 12, the information processing device 6 receives sensing intentions from multiple requesters (UE2, etc.) for a predetermined period (S51). The information processing device 6 then groups together sensing intentions that share a common sensing area (S52). The process in S52 is an example of grouping together sensing intentions that share common sensing conditions. Note that the grouping of sensing intentions is not limited to sensing intentions that share a common sensing area. In short, the information processing device 6 can group together multiple sensing intentions that share common conditions corresponding to "what," "where," and "how." Furthermore, the information processing device 6 selects the sensing intention with the highest required accuracy within the group as the representative sensing intention (S53).
[0132] The information processing device 6 then creates a sensing request corresponding to the representative sensing intent (S54). Furthermore, the information processing device 6 creates a sensing request specifying the sensing area and accuracy and transmits it to SENSING 11n (S55). In this way, the information processing device 6 receives sensing data corresponding to the representative sensing intent from SENSING 11n (S56). Furthermore, if the sensing data corresponding to the representative sensing intent matches the sensing intent, the information processing device 6 transmits the occurrence of the event to the requester (S57). In this way, the information processing device 6 can aggregate multiple sensing intents received from multiple UE2s during the same time period using a representative sensing intent. As a result, the information processing device 6 can efficiently utilize wireless resources and acquire sensing data.
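The grouping and representative selection of S52 and S53 can be sketched as below. As assumptions for this example, intents are grouped only by a hypothetical `area_id`, and "highest required accuracy" is interpreted as the smallest tolerated error in metres (`max_error_m`).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SensingIntent:
    requester: str      # e.g. an identifier of a UE (assumed field)
    area_id: str        # sensing area derived from "where" (assumed representation)
    max_error_m: float  # required accuracy as a maximum error in metres


def pick_representatives(intents: List[SensingIntent]) -> Dict[str, SensingIntent]:
    """S52: group intents that share a sensing area.
    S53: within each group, pick the intent with the tightest error bound."""
    groups: Dict[str, List[SensingIntent]] = defaultdict(list)
    for intent in intents:
        groups[intent.area_id].append(intent)
    return {area: min(members, key=lambda i: i.max_error_m)
            for area, members in groups.items()}


received = [
    SensingIntent("ue-1", "crossing-A", 5.0),
    SensingIntent("ue-2", "crossing-A", 1.0),
    SensingIntent("ue-3", "crossing-B", 3.0),
]
representatives = pick_representatives(received)
```

Only one sensing request per group then needs to be sent to SENSING 11n (S54, S55), and the resulting data can be checked against every intent in that group before notifying the requesters (S57).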
[0133] <Other Embodiments> In the first to fourth embodiments and their variations described above, the information processing device 6 selects sensing data that matches the sensing intent by filtering or the like (F7 in Figure 1, S6 to S7 in Figure 4, S17 in Figure 8, etc.). However, this process of selecting sensing data that matches the sensing intent by filtering or the like may be performed on any FN11 included in the core network of a mobile communication system, such as 5GC.
[0134] When FN11 selects sensing data that matches the sensing intent through filtering or other processes, the information processing device 6 or the UE2 installed in the vehicle C1, etc., simply needs to transmit the sensing intent to FN11 and receive sensing data that matches the sensing intent. In this case, the information processing device 6 or the UE2 installed in the vehicle C1, etc., may receive from the user, in natural language, the sensing intent specifying the event to be sensed. The information processing device 6 may also receive the sensing intent in natural language from the UE2 installed in the vehicle C1, etc. The information processing device 6 (or the UE2 installed in the vehicle C1, etc.) then transmits the received natural language to the core network. The core network then detects the event specified in the sensing intent from the sensing data, and the information processing device 6 (or the UE2 installed in the vehicle C1, etc.) receives the event from the core network.
[0135] Furthermore, the embodiments described above are merely examples, and this disclosure may be modified and implemented as appropriate without departing from its essence. Also, the processes and means described in this disclosure can be freely combined and implemented, as long as no technical inconsistencies arise. Moreover, processes described as being performed by a single device may be divided and executed by multiple devices. Conversely, processes described as being performed by different devices may be executed by a single device. In a computer system, the hardware configuration (server configuration) used to implement each function can be flexibly changed.
[0136] The present disclosure can also be realized by supplying a computer program implementing the functions described in the above embodiments to a computer, and having one or more processors in the computer read and execute the program. Such a computer program may be provided to the computer by a non-transitory computer-readable storage medium that can be connected to the computer's system bus, or it may be provided to the computer via a network N1. The non-transitory computer-readable storage medium includes, for example, any type of disk such as a magnetic disk (hard disk drive (HDD), etc.) or an optical disk (CD-ROM, DVD, Blu-ray disc, etc.), as well as read-only memory (ROM), random access memory (RAM), EPROM, EEPROM, a magnetic card, flash memory, or an optical card.
[0137] 2: UE; 3A: Base Station; 6: Information Processing Device; 11: NF; 11k: NWDAF; 11n: SENSING; 60: Control Unit; 61: CPU; 62: Main Memory; 63: External Memory; 66: Communication Device; 100: Information Communication System
Claims
1. An information processing device comprising: a control unit that receives a sensing intent in natural language from a sensing requester, specifying an event to be sensed; transmits a sensing request corresponding to the sensing intent to a management device that distributes sensing data acquired by a wireless communication device; receives the sensing data for the sensing request from the management device; detects the event specified in the sensing intent from the received sensing data; and transmits the detected event to the requester.
2. The information processing apparatus according to claim 1, wherein the control unit refers to a first database in which the relationship between the information included in the sensing intent and the information included in the sensing request is defined, and determines a sensing request corresponding to the sensing intent.
3. The information processing apparatus according to claim 1, wherein the control unit refers to a second database in which rules for identifying the event corresponding to the sensing intention are defined, and detects the event from the sensing data in accordance with the rules.
4. The information processing apparatus according to claim 1, wherein the natural language is a prompt to a generative artificial intelligence (AI) or a multimodal large language model (LLM), and the control unit detects the event corresponding to the sensing intent based on the prompt using the generative AI or the multimodal LLM.
5. The information processing apparatus according to claim 4, wherein the control unit filters and outputs the sensing data in response to the received prompt.
6. The information processing apparatus according to claim 1, wherein the natural language includes information for identifying at least one of the object on which the event is detected, the location on which the event is detected, the circumstances of the object, and the type of the event.
7. The information processing apparatus according to claim 1, wherein, when the sensing conditions included in a first sensing request already transmitted to the management device are satisfied, thereby satisfying the sensing conditions included in a second sensing request newly notified to the management device, the control unit does not transmit the second sensing request to the management device, but detects the event from the sensing data for the first sensing request.
8. The information processing apparatus according to claim 1, wherein the control unit groups together sensing intentions that have common sensing conditions from among a plurality of sensing intentions received from a plurality of requesters during a predetermined period, creates a sensing request corresponding to a representative sensing intention from the plurality of grouped sensing intentions and transmits it to the management device, acquires sensing data corresponding to the representative sensing intention, detects the event, and transmits the detected event to the plurality of requesters.
9. An information processing method comprising: a computer receiving a sensing intent in natural language from a sensing requester, specifying an event to be sensed; transmitting a sensing request corresponding to the sensing intent to a management device that distributes sensing data acquired by a wireless communication device; receiving the sensing data for the sensing request from the management device; detecting the event specified in the sensing intent from the received sensing data; and transmitting the detected event to the requester.
10. The information processing method according to claim 9, wherein the computer refers to a first database in which the relationship between the information contained in the sensing intent and the information contained in the sensing request is defined, and determines a sensing request corresponding to the sensing intent.
11. The information processing method according to claim 9, wherein the computer refers to a second database in which rules for identifying the event corresponding to the sensing intention are defined, and detects the event from the sensing data in accordance with the rules.
12. The information processing method according to claim 9, wherein the natural language is a prompt to a generative AI or a multimodal LLM, and the computer detects the event corresponding to the sensing intent based on the prompt using the generative AI or the multimodal LLM.
13. The information processing method according to claim 12, wherein the computer filters and outputs the sensing data in response to the received prompt.
14. The information processing method according to claim 9, wherein the natural language includes information for identifying at least one of the object on which the event is detected, the location on which the event is detected, the circumstances of the object, and the type of the event.
15. The information processing method according to claim 9, wherein when the sensing conditions included in a first sensing request already transmitted to the management device are satisfied, thereby satisfying the sensing conditions included in a second sensing request newly notified to the management device, the computer does not transmit the second sensing request to the management device, but detects the event from the sensing data for the first sensing request.
16. The information processing method according to claim 9, wherein the computer groups together sensing intentions that have common sensing conditions from among a plurality of sensing intentions received from a plurality of requesters during a predetermined period, creates a sensing request corresponding to a representative sensing intention from the plurality of grouped sensing intentions and transmits it to the management device, acquires sensing data corresponding to the representative sensing intention, detects the event, and transmits the detected event to the plurality of requesters.
17. A program to cause a computer to receive a sensing intent in natural language from a sensing requester, specifying an event to be sensed; transmit a sensing request corresponding to the sensing intent to a management device that distributes sensing data acquired by a wireless communication device; receive the sensing data for the sensing request from the management device; detect the event specified in the sensing intent from the received sensing data; and transmit the detected event to the requester.
18. The program according to claim 17, which causes the computer to refer to a first database in which the relationships between information contained in the sensing intent and information contained in the sensing request are defined, and to determine a sensing request corresponding to the sensing intent.
19. The program according to claim 17, which causes a computer to refer to a second database in which rules for identifying the events corresponding to the sensing intent are defined, and to detect the events from the sensing data in accordance with the rules.
20. An information processing device comprising a control unit that receives a sensing intent in natural language to specify an event to be sensed, transmits the received natural language to a core network, detects the event specified in the sensing intent from the sensing data, and receives the event from the core network.