Search query generation method, system, and non-transitory computer-readable medium
The system enhances data search accuracy by associating unstructured data with structured data through timestamp comparisons and past query patterns, addressing the challenge of ambiguous data sources in digital transformation environments.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- HITACHI LTD
- Filing Date
- 2025-11-27
- Publication Date
- 2026-06-19
AI Technical Summary
Existing data utilization platforms struggle to accurately interpret and retrieve relevant information from diverse and contextually ambiguous data sources, particularly unstructured data, leading to insufficient search accuracy in digital transformation environments.
A system and method for generating search queries that identify relevant systems and data sources by monitoring data exchanges, associating unstructured data with structured data through timestamp comparisons, and utilizing a history of past query patterns to enhance search accuracy, eliminating the need for computer vision models for object detection.
Improves data search accuracy by relating unstructured data to relevant events without requiring complex machine learning models, enabling precise information retrieval across multiple data sources.
Description
Technical Field
[0001] The present disclosure generally relates to data utilization platforms, and more specifically to systems and methods for enhancing data search accuracy in a distributed production environment using context query processing.
Background Art
[0002] Today, many companies are engaged in digital transformation initiatives. Digital transformation involves leveraging digital technologies to improve operational efficiency and create added value. In digital transformation, various systems are integrated and data is shared. For example, digital transformation in factory facilities involves using operational systems such as enterprise resource planning (ERP), product life cycle management (PLM), and manufacturing execution systems (MES) in parallel with video capture systems such as robotic systems including ceiling cameras, arm robots, and autonomous mobile robots (AMRs), as well as worker support systems such as wearable devices. Data from each system is often stored in a relational database (RDB), a time series database, or an object storage system. These systems and databases are generally connected to a data utilization platform.
[0003] For example, Patent Document 1 discloses a technique for sharing and linking ERP information. Specifically, it describes that "the information (address, data definition) of the service in the ERP business server 2050 is registered so that the ERP linking server 2030 can refer to it, and the request from the client terminal 2010 is analyzed and transferred to the appropriate ERP business server."
Prior Art Documents
Patent Documents
[0004]
Patent Document 1
[0005] Users will attempt to utilize this data to improve their operations. The data available for use includes not only structured and semi-structured data stored in relational databases (RDBs) and time-series databases, but also unstructured data such as video data stored in object storage. To respond to user queries, the data utilization platform must be able to search and retrieve relevant information from a wide range of distributed data sources. Queries are often expressed in natural language and are not necessarily uniquely defined as extract, transform, and load (ETL) program code. For example, if a production problem occurs, such as a robotic arm stopping during a manufacturing process, the user might ask, "What caused the robotic arm to stop?" In response, the data utilization platform is intended to provide the user with information about the robot's operation, which may include sensor data and command logs from the MES, as well as ceiling camera images and wearable device camera images captured when the robot stopped.
[0006] In recent years, attempts have been made to extract desired data using artificial intelligence (AI). Various users, such as business owners, system architects, field operators, data analysts, and maintenance engineers, request information tailored to meet their specific needs. By leveraging AI, it is possible to search for the information users need. To improve the accuracy of information retrieval, data utilization platforms must accurately interpret and understand the meaning of data stored across various data sources.
[0007] One existing strategy to improve data retrieval accuracy involves providing domain-specific information, while another calculates relevance scores and ranks data in response to queries. However, these methods primarily focus on estimating relevance between data sources and user queries. In many cases, this estimation alone is insufficient to achieve satisfactory accuracy. This is especially true when the underlying meaning of the data in the source is not well defined, or when the database stores multipurpose data without clear context or distinction. Without a deeper understanding of the contextual relevance of the data, these strategies will not yield accurate results.
Means for Solving the Problem
[0008] A method and system for generating search queries for information retrieval are disclosed. The system includes a module configured to identify systems relevant to a user query, and these relevant systems include unstructured data sources such as video and image data from multiple target systems. The system monitors data exchange between the identified systems, records storage systems, and records specific locations within those storage systems where data relevant to the user query is stored.
[0009] The system further includes modules configured to associate video or image generation systems, or video or image storage systems that do not directly interact with other systems, with systems relevant to user queries. This association is performed by comparing the timestamps of changes in video or images with the execution history of the related systems to determine whether the video or image and the related systems are capturing the same event.
[0010] In the case of system mobility, such as when wearable devices or autonomous mobile robots are involved, the system is configured to identify and retrieve only data related to the event targeted by the user query. This is achieved by capturing the location and timestamp of each system when the target event occurs and extracting data from the relevant systems based on their relationship to the event and the corresponding timestamps.
[0011] In addition, the system includes a module that improves search accuracy and efficiency by utilizing a recorded history of past query events and successful query patterns. This module maintains a history log of successful query patterns and associated systems, allowing users to configure mappings of related systems and data sources to facilitate future queries based on these mappings.
[0012] In some aspects of this disclosure, a method for utilizing data from multiple data sources includes: configuring connections between a data utilization platform and data sources using system information stored in a system information database; monitoring one or more interactions between the data sources and storing the one or more interactions in the system information database; generating relationships between the data sources based on the one or more interactions and storing the relationships in the system information database, wherein at least one of the data sources may include unstructured data, and the relationships define associations between devices and data storage paths; generating, in response to receiving a user query, a regenerated query by adding context information to the user query, wherein the context information may be derived from a context database including information related to management resources, product lifecycle, manufacturing execution, database addresses, data types, or directory names; identifying, based on query matching criteria or the relationships, a subset of the data sources associated with the regenerated query, wherein the regenerated query may include relevant timestamps, robot IDs, or system status information, the system information includes a system name, system ID, IP address, port number, installation location, or related operation, and the data sources include a relational database, time-series database, video data, or image data; and cross-referencing, based on the relationships, unstructured data obtained from the devices with the subset. Cross-referencing unstructured data may eliminate the need to train a computer vision model for object detection in order to interpret the content captured by camera data.
[0013] Some embodiments further include associating data from a wearable device with corresponding system commands based on one or more interactions recorded in a system information database. Interactions may include API requests between data sources, sensor data, system logs, or operational commands via an API gateway or network scanning unit.
[0014] In some embodiments, generating a relationship may be based on a data exchange pattern or unique identifier from system information, the unique identifier including a system ID, port number, IP address, or the temporal alignment of two or more interactions. Attributes of the interaction may include a source address, destination address, execution time, data size, or exchange frequency, IP address, MAC address, database table name, column name, or path within an object storage system.
[0015] Some embodiments further include prompting the user to provide additional system information while the connection between the data utilization platform and the data sources is being configured, or generating ETL code that retrieves data from the subset based on the regenerated query, wherein the ETL code extracts, transforms, and loads the data into a user-accessible format in response to the regenerated query and the subset.
[0016] In some embodiments, a system that utilizes data from multiple data sources may include a system information database configured to store system information and interactions between data sources, and a data utilization platform, the data utilization platform comprising: a system connection configuration unit configured to configure connections between the data utilization platform and data sources using system information stored in the system information database; a data exchange measurement unit configured to monitor interactions through an API gateway or network scanning unit and store the interactions in the system information database; an inter-system relationship configuration unit configured to generate relationships between data sources based on interactions stored in the system information database, wherein at least one of the data sources contains unstructured data and the relationships define relationships between devices and data storage paths; a context addition unit configured to add context information from a context database to user queries and thereby generate regenerated queries; and a data source selector configured to identify a subset of data sources associated with a regenerated query based on matching query criteria or relationships, the data utilization platform configured to associate unstructured data obtained from devices with the subset based on relationships.
[0017] Some embodiments may further include an executable code generator configured to generate ETL code to retrieve data from a subset based on a regenerated query.
[0018] In some embodiments, the interaction may include API requests between data sources, sensor data, system logs, or operational commands, via an API gateway or network scanning unit.
[0019] In some embodiments, associating unstructured data with a subset eliminates the need to train a computer vision model for object detection in order to interpret the content captured by the camera data.
[0020] In some embodiments, the data utilization platform may be further configured to associate data from the wearable device with corresponding system commands based on the interactions recorded in the system information database.
[0021] In some embodiments, the inter-system relationship configuration unit generates relationships based on data exchange patterns or unique identifiers from the system information, and the unique identifiers include system IDs, port numbers, IP addresses, or temporal alignments of two or more of the interactions.
[0022] In some embodiments, the regenerated query may include related timestamps, robot IDs, or system status information, the system information includes system names, system IDs, IP addresses, port numbers, installation locations, or related operations, and the data source may include relational databases, time series databases, video data, or image data.
[0023] In some embodiments, the attributes of the interaction may include source addresses, destination addresses, execution times, data sizes, exchange frequencies, IP addresses, MAC addresses, database table names, column names, or paths within an object storage system.
[0024] In some embodiments, the context database may include information related to management resources, product life cycles, manufacturing execution, database addresses, data types, or directory names.
[0025] In some embodiments, the techniques described herein relate to a non-transitory computer-readable medium storing instructions for performing a process, the instructions comprising: configuring connections between a data utilization platform and data sources using system information stored in a system information database; monitoring one or more interactions between the data sources; storing the one or more interactions in the system information database; generating relationships between the data sources based on the one or more interactions and storing the relationships in the system information database, wherein the data sources include unstructured data and the relationships define associations between devices and data storage paths; generating, in response to receiving a user query, a regenerated query by adding context information to the user query; using a source selector to identify a subset of the data sources associated with the regenerated query based on matching query criteria or the relationships; and cross-referencing unstructured data obtained from a device with the subset based on the relationships.
[0026] Aspects of the present disclosure may include a system comprising: means for configuring connections between a data utilization platform and data sources using system information stored in a system information database; means for monitoring one or more interactions between data sources and storing one or more interactions in the system information database; and means for generating relationships between data sources based on one or more interactions and storing the relationships in the system information database, wherein at least one of the data sources may include unstructured data, and the relationships define associations between devices and data storage paths.
[0027] Aspects of the present disclosure can involve a system that includes means for adding context information to a user query in response to receiving the user query to generate a regenerated query, means for using a source selector to identify a subset within a data source that is associated with the regenerated query based on matching query criteria or relationships and that may include relevant timestamps, robot IDs, or system status information, and means for cross-referencing unstructured data obtained from a device with the subset based on relationships. Cross-referencing the unstructured data may eliminate the need to train a computer vision model for object detection in order to interpret the content captured by camera data.
Advantages of the Invention
[0028] According to the present invention, it is possible to improve data search accuracy.
Brief Description of the Drawings
[0029]
[Figure 1] FIG. 1 shows a system for utilizing data from multiple data sources in a production environment according to various embodiments of the present disclosure.
[Figure 2] FIG. 2 shows an example of a system operating within a production process according to various embodiments of the present disclosure.
[Figure 3] FIG. 3 shows an example of data exchange between the systems shown in FIG. 2 according to various embodiments of the present disclosure.
[Figure 4] FIG. 4 shows state transitions within the data utilization platform shown in FIG. 1 according to various embodiments of the present disclosure.
[Figure 5] FIG. 5 shows an example of system information provided by a user during a configuration process according to various embodiments of the present disclosure.
[Figure 6] FIG. 6 is a flowchart relating to the inter-system relationship configuration status according to various embodiments of the present disclosure.
[Figure 7] FIG. 7 shows the relationships established between systems in various embodiments of the present disclosure.
[Figure 8] FIG. 8 shows a time series of the activity of an arm robot and the corresponding movement of an object detected in a video captured by a camera, according to various embodiments of the present disclosure.
[Figure 9] FIG. 9 shows the relationships between systems that are determined after the data content has been verified, according to various embodiments of the present disclosure.
[Figure 10] FIG. 10 is a flowchart showing an exemplary process for utilizing data from multiple data sources according to various embodiments of the present disclosure.
[Figure 11] FIG. 11 shows examples of computing environments according to various embodiments of the present disclosure.
Modes for Carrying Out the Invention
[0030] The following detailed description provides details of the drawings and exemplary implementations of this application. Reference numbers and descriptions of elements that overlap between drawings are omitted for clarity. Terms used throughout this description are provided as examples and are not intended to be limiting. For example, the use of the term “automatic” may include fully automatic implementations or semi-automatic implementations with user or administrator control over specific aspects of the implementation, depending on the desired implementation for those skilled in the art practicing the implementations of this application. Selections may be made by the user through a user interface or other input means, or through a desired algorithm. Exemplary implementations such as those described herein may be used individually or in combination, and the functionality of the exemplary implementations may be achieved through any means by the desired implementation.
[0031] Figure 1 shows a system that utilizes data from multiple data sources in a production environment according to various embodiments of the present disclosure. As shown, the system 100 comprises a data utilization platform 102, a network 104, and data sources 106. In embodiments, the data utilization platform 102 may comprise a context database 112, a context addition unit 114, a data source selector 116, an executable code generator 118, a system information database 120, a system connection configuration unit 122, a data exchange measurement unit 124, an inter-system relationship configuration unit 126, a network scanning unit 128, and an API gateway 130. The data sources 106 may comprise data systems such as a data storage system (e.g., RDB 132, time-series database 134, object storage 138), a production control system (MES 140, etc.), a robot system 142 (e.g., robotic arm and AMR), a video capture system (e.g., ceiling camera 144), and a worker assistance system (e.g., wearable device 150). It is understood that the scope of the data sources is not limited to these examples and may include multiple cases of each type.
[0032] The system connection configuration unit 122 establishes a connection between the data utilization platform 102 and the data source 106 using user-provided data source information and the system information database 120. The data source information may include the system name, system ID, IP address, port number, related operation, and installation location. Depending on the system configuration, not all of these details may be required, and additional information may be provided as needed.
[0033] During operation, the data exchange measurement unit 124 tracks data exchanges between data sources 106. The data utilization platform 102 may function as an API gateway 130, acting as part of the API endpoints of each system. When data is sent through the API gateway 130, the data exchange measurement unit 124 may record the exchange in the system information database 120. The recorded data may include details such as source and destination addresses, execution time, data size, and exchange frequency, but only some of these metrics may be recorded, or additional or alternative metrics may be included. For example, the data exchange measurement unit 124 may record details such as IP addresses, MAC addresses, database table names, column names, or paths within an object storage system.
[0034] In cases where data transmission bypasses the API gateway 130, the network scanning unit 128 may capture data exchanges within the network 104 and record them in the system information database 120. The captured data may include source and destination addresses, execution time, data size, and exchange frequency. Depending on the system configuration, additional metrics such as IP addresses, MAC addresses, database table names, column names, or paths within the object storage system may also be recorded.
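As a minimal illustrative sketch of the kind of record the data exchange measurement unit 124 or the network scanning unit 128 might store, the following Python example defines one exchange record and derives the exchange frequency per source and destination pair. The class name, field names, system labels, and byte sizes are hypothetical and are not taken from this disclosure; only the recorded attributes (source, destination, execution time, data size, storage location) follow the description above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExchangeRecord:
    """One observed data exchange (all field names are illustrative)."""
    source: str
    destination: str
    executed_at: datetime
    size_bytes: int
    captured_by: str      # "api_gateway" or "network_scan"
    location: str = ""    # table name or object storage path, if known


def exchange_frequency(records):
    """Count how often each (source, destination) pair exchanged data."""
    return Counter((r.source, r.destination) for r in records)


t = datetime(2024, 9, 1, 9, 23, 17)
records = [
    ExchangeRecord("MES", "arm_robot_1", t, 38, "api_gateway"),
    ExchangeRecord("arm_robot_1", "MES", t, 92, "api_gateway"),
    ExchangeRecord("MES", "arm_robot_1", t, 38, "api_gateway"),
    ExchangeRecord("camera_1", "object_storage", t, 1_048_576,
                   "network_scan", location="/CAM-891/0c7e23/2024/09/01/09"),
]
frequency = exchange_frequency(records)
```

Keeping the capturing component in each record is one possible design choice; it allows exchanges seen at the gateway and exchanges seen by network scanning to share a single table in the system information database.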
[0035] The inter-system relationship configuration unit 126 may establish relationships between data sources 106 based on information stored in the system information database 120 and record these relationships in the database. The relationships may include mappings such as specific device names and table names where instructions are stored, or camera device IDs and paths in object storage where corresponding images are stored.
[0036] The context addition unit 114 may use information from the context database 112 to generate a regenerated query that enhances the user query and may incorporate additional background information. The context database 112 may contain data related to management resources, product lifecycle, manufacturing execution, database address, data type, directory name, etc. In an embodiment, the context addition unit 114 may leverage a Large Language Model (LLM) to generate the regenerated query.
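The idea of a regenerated query can be illustrated without an LLM. The following Python sketch substitutes a simple keyword lookup for the LLM mentioned above, purely to show the shape of the input and output: a user query goes in, and the same query with appended context information comes out. The function name, the structure of the context dictionary, and the database address are assumptions made for illustration.

```python
def add_context(user_query, context_db):
    """Append every context entry whose keyword appears in the query.

    This keyword match is a stand-in for the LLM-based enrichment;
    it is not the method described in the disclosure.
    """
    lowered = user_query.lower()
    hints = [
        f"{key}: {value}"
        for keyword, entries in context_db.items()
        if keyword in lowered
        for key, value in entries.items()
    ]
    if not hints:
        return user_query
    return f"{user_query} [context: {'; '.join(hints)}]"


# Hypothetical context database contents.
context_db = {
    "robotic arm": {
        "data type": "time-series encoder data",
        "database address": "tsdb.example.internal:8086",
    },
}
regenerated = add_context("What caused the robotic arm to stop?", context_db)
unchanged = add_context("Show today's schedule", context_db)
```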
[0037] The data source selector 116 may identify related data sources using information stored in the system information database 120 in response to a regenerated query created by the context addition unit 114, for example, in order to create a related data source list. This list may include the system name, system ID, and address, and may further include the IP address, MAC address, database table name, column name, or path within the object storage system. The data source selector 116 may use an LLM to generate the related data source list.
[0038] The executable code generator 118 may generate ETL code by referring to the regenerated query and related data source list to retrieve relevant data for the user. The executable code generator 118 may also utilize an LLM to facilitate the generation of ETL code.
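The extract step of such ETL code might look like the following Python sketch, which builds a parameterized SQL query from a robot ID and a time window and runs it against an in-memory SQLite database. The table and column names follow the robot_execution_logs example used elsewhere in this description; the function name, the use of SQLite, and the second sample row are assumptions, and no LLM is involved here.

```python
import sqlite3


def build_extract_query(table, robot_id, start, end):
    """Return (sql, params) selecting one robot's logs in a time window."""
    # Table names cannot be bound as parameters, so validate before use.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    sql = (
        "SELECT robot_id, program_id, status, completion_timestamp "
        f"FROM {table} "
        "WHERE robot_id = ? AND completion_timestamp BETWEEN ? AND ? "
        "ORDER BY completion_timestamp"
    )
    return sql, (robot_id, start, end)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE robot_execution_logs ("
    "robot_id TEXT, program_id INTEGER, status TEXT, completion_timestamp TEXT)"
)
conn.execute(
    "INSERT INTO robot_execution_logs VALUES "
    "('robot1', 21, 'completed', '2024-09-01 09:23:46')"
)
# Hypothetical second row, added only to show that filtering works.
conn.execute(
    "INSERT INTO robot_execution_logs VALUES "
    "('robot2', 7, 'completed', '2024-09-01 09:30:00')"
)
sql, params = build_extract_query(
    "robot_execution_logs", "robot1",
    "2024-09-01 09:00:00", "2024-09-01 10:00:00",
)
rows = conn.execute(sql, params).fetchall()
```

Binding the robot ID and timestamps as parameters rather than formatting them into the SQL string keeps the generated code safe even when the values originate from a natural language query.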
[0039] Figure 2 shows an example of a system operating within a production process according to various embodiments of the present disclosure. In this example, arm robots 1 (142-1) and 2 (142-2) perform tasks on the worksite based on commands from, for example, MES 140. A user may be equipped with a wearable device and move around the worksite, completing tasks as instructed by MES 140. AMR 160 navigates the worksite under commands from MES 140, transporting parts to arm robots 1 (142-1) and 2 (142-2). Cameras 1 (144-1), 2 (144-2), and 3 (144-3) may be implemented as ceiling-mounted cameras that capture video of the worksite, and each of cameras 1 (144-1) to 3 (144-3) may cover a different range. RDB 132 is configured to manage the reading and writing of data related to the operation of MES 140. The time-series database 134 stores data collected from encoders (not shown) associated with the movement and motion of arm robots 1 (142-1) and 2 (142-2). In addition, object storage 138 holds video and still image data captured by cameras 1 (144-1), 2 (144-2), and 3 (144-3), as well as by wearable devices such as wearable device 150.
[0040] While the present invention is generally described in the context of a production environment, it should be noted that this is not intended to limit the scope of the disclosure, as the systems and methods for utilizing data from multiple data sources described herein may be used in any other variety of applications.
[0041] Figure 3 shows an example of data exchange between the systems shown in Figure 2 according to various embodiments of the present disclosure. In this embodiment, it is assumed that arm robot 2 (142-2) has stopped during a production process, and a user queries the data utilization platform (shown in Figure 2) for data related to arm robot 2 (142-2) in order to investigate the possible cause of the stoppage. The time-series database 134 may continuously receive data 310 and 312 from the encoders of arm robot 1 and arm robot 2, respectively. The exemplary data may include information such as the following:
robot_encoders,robot_id="JM-X7-001314",joint="rotary",encoder_value=10917,velocity=5.6,position=0.12,[timestamp=]2024-09-01T09:23:17Z
The MES 140 issues a work command to arm robot 1 (142-1) by sending a request to the endpoint /api/user_program/execute, which includes data such as the following:
[0042]
{"robot_id":"robot1","program_id":21}
When arm robot 1 (142-1) completes its task, it responds to the MES 140 by sending data to the endpoint /api/mes/notifyCompletion as shown below.
{"robot_id":"robot1","program_id":21,"status":"completed","timestamp":"2024-09-01T09:23:46Z"}
Based on the received data, MES 140 may insert the data into RDB 132 using a command like the one below.
INSERT INTO robot_execution_logs(robot_id,program_id,status,completion_timestamp) VALUES('robot1',21,'completed','2024-09-01 09:23:46')
Similar interactions may take place between MES 140 and arm robot 2 (142-2), and between MES 140 and AMR 160.
[0043] In this embodiment, the MES 140 issues a specific command to the user wearing the wearable device 150 by sending the following data to the endpoint /api/messages/instructions.
"09:04 Go to arm robot2 and assemble the parts"
The wearable device 150 may be equipped with a camera (not shown), and may capture video or still images automatically or in response to a user action and store them in object storage 138, for example, at the path shown below.
/WDEV-24/EMP2175/2024/09/01/09/03
Camera 1 (144-1) captures video and still images of the work site and stores them in object storage 138, for example, at the following path.
/CAM-891/0c7e23/2024/09/01/09
Similar data exchange may occur between MES 140 and camera 2 (144-2) or camera 3 (144-3). Thus, the system tracks the data exchange and keeps track of where the relevant data is stored, including the associated data path. Those skilled in the art will understand that the examples of data, endpoints, and paths used in the description of Figure 3 are for illustrative purposes only and that the disclosure is not limited to these cases.
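The example object storage paths appear to encode a device identifier followed by date and time components. As a minimal sketch, assuming the layout is device ID, a second identifier, then year, month, day, hour, and an optional minute (an interpretation of the example paths, not something this disclosure states), the following Python function recovers the device ID and capture time from a path. The function name is hypothetical, and no time zone is assumed.

```python
from datetime import datetime


def parse_storage_path(path):
    """Split a path into (device_id, sub_id, capture_time).

    Assumed layout: /<device>/<sub-id>/YYYY/MM/DD/HH[/mm].
    """
    parts = [p.strip() for p in path.split("/") if p.strip()]
    if len(parts) < 2:
        raise ValueError(f"path too short: {path!r}")
    device_id, sub_id, *time_parts = parts
    if not 4 <= len(time_parts) <= 5:
        raise ValueError(f"unexpected time components in {path!r}")
    # datetime accepts year, month, day, hour and an optional minute.
    capture_time = datetime(*(int(p) for p in time_parts))
    return device_id, sub_id, capture_time


wearable = parse_storage_path("/WDEV-24/EMP2175/2024/09/01/09/03")
camera = parse_storage_path("/CAM-891/0c7e23/2024/09/01/09")
```

Once a path is reduced to a device ID and a time, it can be compared against timestamps from the relational and time-series databases without inspecting the stored video itself.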
[0044] Figure 4 shows the state transitions within the data utilization platform shown in Figure 1 according to various embodiments of the present disclosure. In the system configuration status (S1), the system connection configuration unit configures the connection between the data utilization platform and the data source. In embodiments, this step may involve configuring the data utilization platform based on the provided information to ensure that the data source is properly linked to the data utilization platform for future data exchange.
[0045] Figure 5 shows an example of system information provided by the user during the configuration process according to various embodiments of the present disclosure. In embodiments, if similar information already exists in the system information database, it may be reused. In embodiments, the system connection configuration unit establishes a connection between the data utilization platform (hereinafter, the "platform") and the data source. In cases where certain information, such as authentication information, is missing, the system may prompt the user to enter the necessary details. In embodiments, the system connection configuration unit may then display to the user a list of systems that have successfully connected to verify that all target devices are properly registered. Once the user confirms, the platform may then transition to the data exchange measurement status (S2).
[0046] In the data exchange measurement status (S2), the data exchange measurement unit monitors and measures data exchanges between connected data sources. These exchanges may be recorded in the system information database, as shown in Figure 3. If a new data source is detected, or if a user requests the addition of a new data source, the platform may revert to the system configuration status (S1). If a user executes a query, or if the platform determines it is necessary, the status transitions to the inter-system relationship configuration status (S3).
[0047] In the inter-system relationship configuration status (S3), the inter-system relationship configuration unit may establish and record relationships between various data sources in the system information database to ensure, for example, that data interactions and dependencies are properly mapped and updated.
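The status transitions described for S1, S2, and S3 can be summarized as a small state machine. The following Python sketch encodes only the transitions stated above; the event names are hypothetical labels chosen for illustration, and any event not listed leaves the status unchanged.

```python
from enum import Enum


class Status(Enum):
    SYSTEM_CONFIGURATION = "S1"
    DATA_EXCHANGE_MEASUREMENT = "S2"
    RELATIONSHIP_CONFIGURATION = "S3"


# (current status, event) -> next status; event names are illustrative.
TRANSITIONS = {
    (Status.SYSTEM_CONFIGURATION, "user_confirmed"):
        Status.DATA_EXCHANGE_MEASUREMENT,
    (Status.DATA_EXCHANGE_MEASUREMENT, "new_data_source"):
        Status.SYSTEM_CONFIGURATION,
    (Status.DATA_EXCHANGE_MEASUREMENT, "query_executed"):
        Status.RELATIONSHIP_CONFIGURATION,
}


def next_status(current, event):
    """Return the next status, or the current one if no rule applies."""
    return TRANSITIONS.get((current, event), current)


status = Status.SYSTEM_CONFIGURATION
for event in ["user_confirmed", "new_data_source", "user_confirmed", "query_executed"]:
    status = next_status(status, event)
```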
[0048] Figure 6 is a flowchart relating to the inter-system relationship configuration status (S3) according to various embodiments of the present disclosure. In step S301, the platform uses information stored in the system information database to configure various inter-system relationships.
[0049] Figure 7 shows relationships established between systems in various embodiments of this disclosure. For example, Arm Robot 1 is associated with the time-series database entry 'robot_encoders,robot_id="JM-X7-001865"'. The identifier "JM-X7-001865" used by Arm Robot 1 differs from the name "Arm robot 1" used in the MES, but the data exchange measurement unit determines, by analyzing the data exchange, that both identifiers refer to the same entity. Similarly, the relationship between the RDB and Arm Robot 1 can be identified. Arm Robot 2 is associated in a similar manner. Relationships between the camera and object storage, and between the wearable device and object storage, may also be measured by the data exchange measurement unit to determine which system's data is stored in which object storage path. Not every relationship can be determined this way, however. For example, there was no direct data exchange between the arm robots and the cameras, and the MES data shows that the AMR was performing a task unrelated to Arm Robot 2, so these relationships cannot be easily determined.
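One way two differently named identifiers could be recognized as the same entity is temporal alignment of interactions. The following Python sketch, offered only as an assumption about how such matching might work, pairs each MES name with the encoder ID whose activity most often begins shortly after a command to that name. The function name, the five-second tolerance, all timestamps, and the identifier "ENCODER-ID-B" are hypothetical; only "JM-X7-001865" and "Arm robot 1" come from the example above.

```python
from datetime import datetime, timedelta


def match_identifiers(command_times, activity_starts,
                      tolerance=timedelta(seconds=5)):
    """Map each MES name to the encoder ID best aligned with its commands."""
    mapping = {}
    for name, commands in command_times.items():
        def score(encoder_id):
            # Number of commands followed by an activity start in time.
            starts = activity_starts[encoder_id]
            return sum(
                any(timedelta(0) <= s - c <= tolerance for s in starts)
                for c in commands
            )
        best = max(activity_starts, key=score)
        if score(best) > 0:
            mapping[name] = best
    return mapping


base = datetime(2024, 9, 1, 9, 0, 0)
command_times = {
    "Arm robot 1": [base, base + timedelta(minutes=10)],
    "Arm robot 2": [base + timedelta(minutes=4)],
}
activity_starts = {
    "JM-X7-001865": [base + timedelta(seconds=2),
                     base + timedelta(minutes=10, seconds=1)],
    "ENCODER-ID-B": [base + timedelta(minutes=4, seconds=3)],  # hypothetical ID
}
aliases = match_identifiers(command_times, activity_starts)
```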
[0050] In step S302, the data utilization platform evaluates whether the relationships between the data sources have been sufficiently clarified. In this embodiment, the relationship between the arm robot and the camera remains unknown, so the process proceeds to step S303.
[0051] In step S303, the data utilization platform further investigates the data content. For example, a command sent from the MES to the wearable device reveals that the wearable device was near arm robot 2 at 9:04. At the same time, when the data is sent from the wearable device to object storage, it is stored in the path "/WDEV-24/EMP2175/2024/09/01/09/04", which allows the data utilization platform to associate the data in this path with arm robot 2. In addition, information from encoders stored in the time-series database is used to determine when the arm robot was active. By analyzing the data from the camera stored in object storage, the platform can determine whether an object in the video is moving without needing to identify whether the moving object in the camera image is the arm robot. As a result, specialized training in robot motion detection in camera images is not required for this process.
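The path-based association in step S303 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the last five path segments encode year, month, day, hour, and minute, and that MES commands are available as (device, timestamp) records. The function names, the one-minute tolerance, and the command dates are hypothetical and are not taken from the disclosure.

```python
from datetime import datetime, timedelta

def path_timestamp(path: str) -> datetime:
    """Parse the trailing year/month/day/hour/minute segments of a storage path."""
    parts = [p.strip() for p in path.split("/") if p.strip()]
    year, month, day, hour, minute = (int(p) for p in parts[-5:])
    return datetime(year, month, day, hour, minute)

def associate_path(path, commands, tolerance=timedelta(minutes=1)):
    """Return devices whose command time lies within `tolerance` of the path time."""
    ts = path_timestamp(path)
    return [device for device, when in commands if abs(when - ts) <= tolerance]

# Hypothetical MES command records: (device the wearable was near, command time).
commands = [
    ("arm robot 2", datetime(2024, 9, 1, 9, 4)),
    ("arm robot 1", datetime(2024, 9, 1, 13, 30)),
]
matched = associate_path("/WDEV-24/EMP2175/2024/09/01/09/04", commands)
```

A tolerance window rather than exact equality is used here because clocks on separate systems rarely agree to the minute; the appropriate window would depend on the deployment.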
[0052] Structured data such as RDB data, time-series data, and API execution logs are generally well-defined and interconnected, which makes establishing and interpreting relationships with other systems relatively simple.
[0053] In contrast, camera data presents unique challenges because its operations are often independent and lack direct contextual relationships with other systems. Embodiments of this disclosure overcome this difficulty by relating camera data to other data sources without requiring object-level interpretation.
[0054] On their own, camera images capture movement but do not identify specific objects or events. Embodiments of this disclosure eliminate the need to rely on computer vision (CV) models to interpret the content of video footage, and instead relate camera data to other relevant data sources through contextual relationships. By leveraging a broader range of data content from various sources, the system simplifies the process of relating camera data to relevant events and integrating camera data into a comprehensive analysis without requiring complex machine learning models for object recognition.
[0055] Figure 8 shows a time series of the activity of a robotic arm and the corresponding movement of an object detected in a video captured by a camera, according to various embodiments of the present disclosure. Encoder data in the time series database indicates the timing of the robotic arm's movement. Simultaneously, video data from the camera in object storage enables the platform to detect movement in the video. At this point, it is unnecessary to determine whether the moving object in the camera's video is a robotic arm or not, thus eliminating the need for specialized training in detecting robotic motion in images.
[0056] In Figure 8, the activity of arm robot 1 (142-1) matches 94% of the motion detected by camera 1 (144-1), but only 46% and 44% of the motion detected by camera 2 (144-2) and camera 3 (144-3), respectively. As a result, arm robot 1 (142-1) can be associated with camera 1 (144-1). Similarly, the activity of arm robot 2 (142-2) matches 94% and 90% of the motion detected by camera 2 (144-2) and camera 3 (144-3), respectively, but only 43% of the motion detected by camera 1 (144-1). Therefore, arm robot 2 (142-2) can be associated with camera 2 (144-2) and camera 3 (144-3). Figure 2 shows that even though camera 3 (144-3) was not configured to capture images of arm robot 2 (142-2), it recorded the movement of arm robot 2, which is reflected in the video of camera 3 (144-3). This demonstrates an advantageous feature of such an embodiment: even if a data source is not directly specified by the user, it can be identified and used as a source of relevant data.
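The text does not state how the match percentages are computed. One plausible reading, sketched below under that assumption, is the fraction of a camera's motion time that coincides with a robot's active time. The interval values, the 0.8 threshold, and all function names are hypothetical; the intervals within each list are assumed not to overlap one another.

```python
def overlap_seconds(a, b):
    """Total overlap between two lists of (start, end) intervals, in seconds."""
    total = 0.0
    for a_start, a_end in a:
        for b_start, b_end in b:
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total

def match_ratio(robot_active, camera_motion):
    """Fraction of the camera's motion time that coincides with robot activity."""
    motion_time = sum(end - start for start, end in camera_motion)
    if motion_time == 0:
        return 0.0
    return overlap_seconds(robot_active, camera_motion) / motion_time

def associate(robots, cameras, threshold=0.8):
    """Map each robot to the cameras whose match ratio reaches the threshold."""
    return {
        robot: [cam for cam, motion in cameras.items()
                if match_ratio(active, motion) >= threshold]
        for robot, active in robots.items()
    }

# Hypothetical activity (from encoders) and motion (from video) intervals.
robots = {"arm_robot_1": [(0, 10), (20, 30)], "arm_robot_2": [(10, 20), (30, 40)]}
cameras = {
    "camera_1": [(0, 9), (21, 30), (35, 37)],
    "camera_2": [(11, 20), (30, 39)],
    "camera_3": [(10, 19), (31, 40), (5, 7)],
}
links = associate(robots, cameras)
```

The motion intervals only record that something moved in the frame, so no object recognition is involved; a camera is linked to a robot purely because their timelines coincide.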
[0057] Figure 9 shows the relationships between systems, which are determined after the data content has been verified, according to various embodiments of this disclosure. After this analysis, the process transitions back to the data exchange measurement status (S2). Corresponding to step S304 in Figure 6, the system determines whether the relationships between the data sources are sufficiently established. As shown in Figure 9, the process is completed once all relationships have been verified. If there are any relationships that are not sufficiently resolved, the process in Figure 6 proceeds to step S305.
[0058] In step S305 of Figure 6, the system queries the user regarding any relationships that could not be automatically determined. Upon receiving user input, the process terminates. If the user executes a query such as "What caused Arm Robot 2 to stop?", the data source selector (not shown) may identify relevant data associated with Arm Robot 2. As shown by the solid lines in Figure 9, the relevant data may include data from a time-series database for "JM-X7-001314", data from the robot_execution_logs table of the RDB where robot_id is "robot2", and data stored in object storage at the paths "/CAM-891/699e25/2024/09/01/09" and "/WDEV-24/EMP2175/2024/09/01/09/04". In embodiments, the identified relevant data may be compiled into a related data source list. An executable code generator (not shown) may then use the related data source list to generate code that causes, for example, ETL to perform a search for the relevant data.
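As a rough illustration of how a data source selector might compile the related data source list, the sketch below stores the relationships in a plain dictionary and selects the entries whose entity name appears in the query. The dictionary layout, the source-type labels, and the function `select_sources` are assumptions made for illustration; the identifiers and paths are those given in the description.

```python
# Relationships keyed by the entity name used in the MES (layout is hypothetical).
RELATIONSHIPS = {
    "Arm Robot 1": [
        ("time_series", "JM-X7-001865"),
    ],
    "Arm Robot 2": [
        ("time_series", "JM-X7-001314"),
        ("rdb", "robot_execution_logs where robot_id = 'robot2'"),
        ("object_storage", "/CAM-891/699e25/2024/09/01/09"),
        ("object_storage", "/WDEV-24/EMP2175/2024/09/01/09/04"),
    ],
}

def select_sources(query, relationships):
    """Return the data sources related to every entity named in the query."""
    q = query.lower()
    selected = []
    for entity, sources in relationships.items():
        if entity.lower() in q:
            selected.extend(sources)
    return selected

related = select_sources("What caused Arm Robot 2 to stop?", RELATIONSHIPS)
```

Simple substring matching stands in for whatever query matching criteria an actual implementation would apply.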
[0059] Figure 10 is a flowchart illustrating an exemplary process for utilizing data from multiple data sources according to various embodiments of the present disclosure. Process 1000 may begin in step 1002, in which system information stored in a system information database is used to establish connections between a data utilization platform and any number of data sources, which may include unstructured data.
[0060] In step 1004, the data utilization platform may monitor the interactions between those data sources and store those interactions in a system information database.
[0061] In step 1006, the data utilization platform may generate relationships between the data sources based on the interactions and store these relationships in the system information database. In an embodiment, the relationships may define the association between a device and a data storage path.
[0062] In step 1008, in response to receiving a user query, context information is added to the user query, and a regenerated query is generated.
[0063] In step 1010, a source selector may be used to identify a subset of the data sources that is associated with the regenerated query, for example, based on query matching criteria or the relationships.
[0064] Finally, in step 1012, based on the relationships, the unstructured data obtained from the device may be cross-referenced with the subset.
[0065] Those skilled in the art will recognize that (1) certain steps may be optional, (2) the steps are not limited to the specific order described herein, (3) certain steps may be performed in a different order, and (4) certain steps may be performed simultaneously.
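Steps 1008 and 1010 can be sketched as two small functions chained together. This is an illustrative sketch only: the `RegeneratedQuery` structure, the shape of the context database, and the assumption that "JM-X7-001314" is the robot ID corresponding to Arm Robot 2 are all hypothetical choices, not details confirmed by the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class RegeneratedQuery:
    text: str                                    # the user's original wording
    context: dict = field(default_factory=dict)  # context added in step 1008

def add_context(user_query, context_db):
    """Step 1008: attach context entries whose key term appears in the query."""
    q = user_query.lower()
    context = {}
    for term, info in context_db.items():
        if term.lower() in q:
            context.update(info)
    return RegeneratedQuery(user_query, context)

def select_subset(regenerated, relationships):
    """Step 1010: pick the storage paths related to the robot ID in the context."""
    robot_id = regenerated.context.get("robot_id")
    return relationships.get(robot_id, [])

# Hypothetical context database and device-to-path relationships.
context_db = {"arm robot 2": {"robot_id": "JM-X7-001314", "source_system": "MES"}}
relationships = {
    "JM-X7-001314": [
        "/CAM-891/699e25/2024/09/01/09",
        "/WDEV-24/EMP2175/2024/09/01/09/04",
    ],
}
regenerated = add_context("What caused Arm Robot 2 to stop?", context_db)
subset = select_subset(regenerated, relationships)
```

The regenerated query keeps the user's wording intact and carries the added context separately, so later steps can match on identifiers rather than on free text.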
[0066] Figure 11 shows an example computing environment having an example of a computer device suitable for use in several implementations according to various embodiments of the present disclosure. The computer device 1105 of the computing environment 1100 may include one or more processing units, cores, or processors 1110, memory 1115 (e.g., RAM, ROM, and/or others), internal storage 1120 (e.g., magnetic, optical, solid-state, and/or organic storage), and/or an I/O interface 1125, all of which may be coupled by a communication mechanism or bus 1130 for communicating information, or embedded in the computer device 1105. The I/O interface 1125 may also be configured to receive images from a camera or provide images to a projector or display, depending on the desired implementation.
[0067] Computer device 1105 can be communicatively coupled to input/user interface 1135 and output device/interface 1140. Either or both of input/user interface 1135 and output device/interface 1140 may be wired or wireless interfaces and may be detachable. Input/user interface 1135 may include any physical or virtual device, component, sensor, or interface that can be used to provide input (e.g., buttons, touchscreen interfaces, keyboards, pointing/cursor controls, microphones, cameras, Braille, motion sensors, optical readers, and/or others). Output device/interface 1140 may include displays, televisions, monitors, printers, speakers, Braille, and the like. In some exemplary implementations, input/user interface 1135 and output device/interface 1140 may be embedded in or physically coupled to computer device 1105. In other exemplary implementations, other computer devices may function as or provide input/user interface 1135 and output device/interface 1140 of computer device 1105.
[0068] Examples of computer device 1105 may include highly mobile devices (e.g., smartphones, devices in automobiles and other machines, devices carried by people and animals), mobile devices (e.g., tablets, notebooks, laptops, personal computers, portable televisions, radios, etc.), and devices not designed for portability (e.g., desktop computers, other computers, information kiosks, televisions with one or more processors embedded therein and/or coupled thereto, radios, etc.).
[0069] Computer device 1105 can be communicatively coupled to external storage 1145 and network 1150 (for example, via I/O interface 1125) to communicate with any number of networked components, devices, and systems, including one or more computer devices of the same or different configurations. Computer device 1105, or any connected computer device, can function as, provide, or be referred to as a server, client, thin server, general-purpose machine, dedicated machine, or any other label.
[0070] The I/O interface 1125 may include wired and/or wireless interfaces that use any communication or I/O protocol or standard (e.g., Ethernet, 802.11x, Universal Serial Bus, WiMAX®, modem, cellular network protocol, etc.) to communicate information with at least all connected components, devices, and networks of the computing environment 1100. The network 1150 may be any network or combination of networks (e.g., the Internet, local area network, wide area network, telephone network, cellular network, satellite network, etc.).
[0071] Computer device 1105 may use computer-usable or computer-readable media, including temporary and non-temporary media, and/or may communicate using such media. Temporary media include transmission media (e.g., metal cables, optical fibers), signals, carrier waves, etc. Non-temporary media include magnetic media (e.g., disks and tapes), optical media (e.g., CD-ROMs, digital video discs, Blu-ray discs), solid-state media (e.g., RAM, ROMs, flash memory, solid-state storage), and other non-volatile storage or memory.
[0072] Computer device 1105 can be used to implement techniques, methods, applications, processes, or computer executable instructions in several computing environment examples. Computer executable instructions can be retrieved from temporary media and can be stored in and retrieved from non-temporary media. Executable instructions can be in one or more of any programming languages, scripting languages, and machine languages (e.g., C, C++, C#, Java®, Visual Basic, Python®, Perl, JavaScript®, etc.).
[0073] The processor 1110 can run under any operating system (OS) (not shown) in a native or virtual environment. One or more applications can be deployed, including a logic unit 1160, an application programming interface (API) unit 1165, an input unit 1170, an output unit 1175, and an inter-unit communication mechanism 1195 for different units to communicate with each other, with the OS, and with other applications (not shown). The units and elements described may vary in design, function, configuration, or implementation, and are not limited to the description provided. The processor 1110 may take the form of a hardware processor such as a central processing unit (CPU), or a combination of hardware and software units.
[0074] In some exemplary implementations, when information or execution instructions are received by the API unit 1165, they may be communicated to one or more other units (e.g., logic unit 1160, input unit 1170, output unit 1175). In some examples, the logic unit 1160 may be configured to control the flow of information between units and to direct the services provided by the API unit 1165, input unit 1170, and output unit 1175 in some exemplary implementations described above. For example, the flow of one or more processes or implementations may be controlled by the logic unit 1160 alone or in combination with the API unit 1165. The input unit 1170 may be configured to receive input for the computations described in the exemplary implementations, and the output unit 1175 may be configured to provide outputs based on the computations described in the exemplary implementations.
[0075] The processor 1110 can be configured to execute a method or computer instruction that may involve, for example, as described with respect to Figures 1, 3, and 10, using system information stored in a system information database to configure a connection between a data utilization platform and a data source, monitoring one or more interactions between data sources, and storing one or more interactions in the system information database.
[0076] The processor 1110 can be configured to execute a method or computer instruction that, based on one or more interactions, generates relationships between data sources and stores the relationships in a system information database, as described with respect to Figures 1, 6, and 10, for example, where at least one of the data sources may include unstructured data, and the relationships define relationships between devices and data storage paths.
[0077] The processor 1110 may be configured to execute a method or computer instruction that, in response to receiving a user query, adds contextual information to the user query to generate a regenerated query; uses a source selector to identify a subset of data sources associated with the regenerated query, which may include associated timestamps, robot IDs, or system status information, based on query matching criteria or relationships; and cross-references unstructured data obtained from the device with the subset based on relationships. Cross-referencing unstructured data may eliminate the need to train a computer vision model for object detection to interpret content captured by camera data, for example, as described with respect to Figures 1 and 10.
[0078] Some parts of the detailed description are presented with respect to algorithms and symbolic representations of computer operations. These algorithmic descriptions and symbolic representations are means used by those skilled in the field of data processing to communicate the essence of technological innovations to others skilled in the field. An algorithm is a set of defined steps that lead to a desired final state or result. In exemplary implementations, the steps performed require the physical manipulation of tangible quantities to achieve a tangible result.
[0079] Unless specifically stated otherwise, as is evident from the discussion, any discussion using terms such as “processing,” “computing,” “calculating,” “determining,” and “displaying” throughout the description is understood to include the operations and processes of a computer system or other information processing device that manipulates data represented as physical (electronic) quantities in the registers and memory of a computer system and converts them into other data similarly represented as physical quantities in the memory or registers of a computer system or in other information storage, transmission, or display devices.
[0080] Exemplary implementations may also relate to apparatus for performing the operations described herein. This apparatus may include one or more general-purpose computers, which may be specifically constructed for a particular purpose or selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored on computer-readable media, such as computer-readable storage media or computer-readable signal media. Computer-readable storage media may include tangible media, such as optical disks, magnetic disks, read-only memory, random-access memory, solid-state devices, drives, or any other type of tangible or non-temporary medium suitable for storing electronic information. Computer-readable signal media may include media such as carrier waves. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Computer programs may include purely software implementations, which include instructions for performing the operations of a desired implementation.
[0081] Various general-purpose systems may be used with the programs and modules illustrated herein, or may prove useful in constructing more specialized devices to perform desired method steps. Furthermore, the exemplary implementations are not described with reference to any particular programming language. It will be recognized that various programming languages may be used to implement the techniques of the exemplary implementations described herein. Instructions in a programming language may be executed by one or more processing devices, such as a central processing unit (CPU), processor, or controller.
[0082] As is known in the art, the operations described above can be implemented by hardware, software, or any combination of software and hardware. Various embodiments of the exemplary implementations may be implemented using circuits and logic devices (hardware), while other embodiments may be implemented using instructions stored in a machine-readable medium (software) that, when executed by a processor, cause the processor to perform a method that carries out the implementations of this application. Furthermore, some implementations of this application may be implemented solely by hardware, while other exemplary implementations may be implemented solely by software. Moreover, the various functions described may be implemented in a single unit or spread across multiple components in various ways. When implemented by software, the methods may be executed by a processor, such as that of a general-purpose computer, based on instructions stored in a machine-readable medium. If desired, the instructions may be stored in the medium in a compressed and/or encrypted form.
[0083] Furthermore, other embodiments of this application will become apparent to those skilled in the art by considering this specification and practicing the techniques of this application. The various embodiments and/or components of the exemplary embodiments described herein may be used individually or in any combination. This specification and the exemplary embodiments are to be considered merely examples, and the true scope and intent of this application are indicated by the following claims.
[Explanation of symbols]
[0084] 100...System, 102...Data Utilization Platform, 104...Network, 106...Data Source, 112...Data Context, 114...Context Addition Unit, 116...Data Source Selector, 118...Executable Code Generator, 120...System Information Database, 122...System Connection Configuration Unit, 124...Data Exchange Measurement Unit, 126...Inter-System Relationship Configuration Unit, 128...Network Scanning Unit, 130...API Gateway, 134...Time Series Database, 138...Object Storage, 142...Robot System, 144...Ceiling Camera, 150...Wearable Device
Claims
1. A method for generating search queries, executed by a system that utilizes data from multiple data sources, the method comprising: configuring, using system information stored in a system information database, a connection between a data utilization platform and the data sources; monitoring one or more interactions between the data sources and storing the one or more interactions in the system information database; generating relationships between the data sources based on the one or more interactions and storing the relationships in the system information database, wherein at least one of the data sources includes unstructured data, and the relationships define a relationship between a device and a data storage path; in response to receiving a user query, adding context information to the user query to generate a regenerated query; identifying, using a source selector, a subset of the data sources that is associated with the regenerated query, based on query matching criteria or the relationships; and cross-referencing unstructured data obtained from the device with the subset based on the relationships.
2. The search query generation method according to claim 1, wherein the one or more interactions include at least one of the following: an API request between the data sources, sensor data, system logs, or an operation command, via an API gateway or network scanning unit.
3. The search query generation method according to claim 1, wherein cross-referencing unstructured data eliminates the need to train a computer vision model for object detection in order to interpret content captured by camera data.
4. The search query generation method according to claim 1, further comprising associating data from a wearable device with corresponding system commands based on one or more interactions recorded in the system information database.
5. The search query generation method according to claim 1, wherein generating the aforementioned relationship is based on at least one of a data exchange pattern or a unique identifier from the system information, the unique identifier including at least one of a system ID, a port number, an IP address, or the temporal alignment of two or more interactions.
6. The search query generation method according to claim 1, wherein the regenerated query includes at least one of related timestamps, robot IDs, or system status information, the system information includes at least one of system name, system ID, IP address, port number, installation location, or related operation, and the data source includes at least one of relational databases, time-series databases, video data, or image data.
7. The search query generation method according to claim 1, wherein the attributes of the one or more interactions include at least one of the following: source address, destination address, execution time, data size, exchange frequency, IP address, MAC address, database table name, column name, or path within an object storage system.
8. The search query generation method according to claim 1, further comprising prompting the user to provide additional system information while configuring the connection between the data utilization platform and the data source.
9. The search query generation method according to claim 1, wherein the context information is obtained from a context database, which includes information relating to at least one of the following: a management resource, a product lifecycle, a manufacturing execution, a database address, a data type, or a directory name.
10. The search query generation method according to claim 1, further comprising using the subset to enable ETL code to retrieve data from the subset based on the regenerated query, wherein the ETL code is generated, in response to the regenerated query and the subset, to extract, transform, and load the data into a user-accessible format.
11. A system that utilizes data from multiple data sources, the system comprising: a system information database configured to store system information and interactions between the data sources; a data utilization platform comprising a system connection configuration unit, wherein the data utilization platform is configured to configure a connection between the data utilization platform and the data sources using the system information stored in the system information database; a data exchange measurement unit configured to monitor the interactions through an API gateway or a network scanning unit and to store the interactions in the system information database; an inter-system relationship configuration unit configured to generate relationships between the data sources based on the interactions stored in the system information database, wherein at least one of the data sources includes unstructured data, and the relationships define a relationship between a device and a data storage path; a context addition unit configured to add context information from a context database to a user query and thereby generate a regenerated query; and a data source selector configured to identify a subset of the data sources associated with the regenerated query based on query matching criteria or the relationships, wherein the data utilization platform is configured to associate unstructured data obtained from the device with the subset based on the relationships.
12. The system according to claim 11, further comprising an executable code generator configured to generate ETL code to retrieve data from the subset based on the regenerated query.
13. The system according to claim 11, wherein the interaction includes at least one of the following: an API request between the data sources, sensor data, system logs, or an operation command, via an API gateway or network scanning unit.
14. The system according to claim 11, wherein associating the unstructured data with the subset eliminates the need to train a computer vision model for object detection in order to interpret the content captured by the camera data.
15. The system according to claim 11, wherein the data utilization platform is further configured to associate data from a wearable device with corresponding system commands based on the interactions recorded in the system information database.
16. The system according to claim 11, wherein the inter-system relationship configuration unit generates the relationship based on at least one of a data exchange pattern or a unique identifier from the system information, and the unique identifier includes at least one of a system ID, a port number, an IP address, or a temporal alignment of two or more of the interactions.
17. The system according to claim 11, wherein the regenerated query includes at least one of related timestamps, robot IDs, or system status information, the system information includes at least one of system name, system ID, IP address, port number, installation location, or related operation, and the data source includes at least one of relational databases, time-series databases, video data, or image data.
18. The system according to claim 11, wherein the attributes of the interaction include at least one of a source address, destination address, execution time, data size, exchange frequency, IP address, MAC address, database table name, column name, or path within an object storage system.
19. The system according to claim 11, wherein the context database includes information relating to at least one of the following: a management resource, a product lifecycle, a manufacturing execution, a database address, a data type, or a directory name.
20. A non-temporary computer-readable medium storing a program that causes a system to perform operations comprising: configuring, using system information stored in a system information database, a connection between a data utilization platform and data sources; monitoring one or more interactions between the data sources; storing the one or more interactions in the system information database; generating relationships between the data sources based on the one or more interactions and storing the relationships in the system information database, wherein at least one of the data sources includes unstructured data, and the relationships define a relationship between a device and a data storage path; in response to receiving a user query, adding context information to the user query to generate a regenerated query; identifying, using a source selector, a subset of the data sources that is associated with the regenerated query, based on query matching criteria or the relationships; and cross-referencing unstructured data obtained from the device with the subset based on the relationships.