A method, system, device and storage medium for monitoring input / output performance
By monitoring the completion queue and asynchronous work queue of the input/output framework tool and recording the time data of object requests, the problem of the inability to deeply monitor the input/output framework tool in the existing technology is solved, and higher monitoring accuracy and system performance optimization are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2024-12-17
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, the performance monitoring strategies of input/output framework tools cannot penetrate deep into the framework tools themselves, resulting in low accuracy in locating performance problems and affecting system performance.
By monitoring the completion queue and asynchronous work queue of the input/output framework tool, recording the time data of object requests, and calculating the latency data of asynchronous operations, the performance status of the input/output framework tool can be determined.
It improves the granularity and accuracy of monitoring, helps locate performance problems in the system kernel, and improves system performance.
Figure CN122240431A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and in particular to a method, system, device, and storage medium for monitoring input / output performance.
Background Technology
[0002] Currently, with the rapid development of information technology, related applications have gradually integrated into people's lives, providing a variety of services. For example, in the field of computer technology, there are some input / output framework tools (such as IO_uring). These tools run in the system kernel and can help handle the system's input / output operations, thereby improving the processing performance of data input / output operations.
[0003] In related technologies, to improve system performance as much as possible, there is a need to monitor input / output performance to identify system performance bottlenecks. Current strategies often involve statistically analyzing relevant inputs and outputs at the system's block devices or application level to determine overall processing latency. This approach fails to understand the internal processing of input / output operations within the input / output framework tool, making it difficult to determine the tool's operational status. This results in low accuracy in locating performance problems and can negatively impact system performance.
Summary of the Invention
[0004] This application provides a method, system, device, and storage medium for monitoring input / output performance. It can delve into the internal workings of input / output framework tools to determine the processing status of input / output operations, improve the granularity and accuracy of monitoring, and help locate performance problems in the system kernel, thereby improving the system's operating performance.
[0005] One aspect of this application provides a method for monitoring input / output performance, the method comprising:
[0006] Monitor the completion queue of the input / output framework tool, and record first time data for each object request, the first time data being the time at which the corresponding completion queue entry is generated in the completion queue after the system input / output layer has finished processing that object request;
[0007] Query the input / output operation flow corresponding to each of the object requests and, if an object request is converted into an asynchronous command in the input / output framework tool, record second time data, the second time data being the time at which the asynchronous command corresponding to that object request is added to the asynchronous work queue;
[0008] Determine the asynchronous operation latency data corresponding to a target request based on the difference between the first time data and the second time data corresponding to the target request, wherein the target request is an object request that is converted into an asynchronous command in the input / output framework tool;
[0009] Determine the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each of the target requests.
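The four steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the `RequestTimestamps` record, its field names, the use of integer nanosecond timestamps, and the choice of the mean as the summary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RequestTimestamps:
    """Hypothetical per-request record; field names are illustrative only."""
    completion_ns: int                      # first time data: completion queue entry generated
    async_enqueue_ns: Optional[int] = None  # second time data: None if never converted to async


def async_latencies(records: Dict[int, RequestTimestamps]) -> Dict[int, int]:
    """Return the asynchronous operation latency (ns) for target requests only."""
    return {
        req_id: rec.completion_ns - rec.async_enqueue_ns
        for req_id, rec in records.items()
        if rec.async_enqueue_ns is not None  # keep only requests converted to async commands
    }


records = {
    1: RequestTimestamps(completion_ns=5_000, async_enqueue_ns=2_000),
    2: RequestTimestamps(completion_ns=4_000),  # not converted to async, so not a target request
    3: RequestTimestamps(completion_ns=9_500, async_enqueue_ns=9_000),
}
latencies = async_latencies(records)
# One possible form of "first performance monitoring data": the mean latency.
first_performance_data = sum(latencies.values()) / len(latencies)
```

Request 2 is excluded because it has no second time data, which mirrors the distinction the text draws between object requests and target requests.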
[0010] In another aspect, embodiments of this application provide an input / output performance monitoring system, the system comprising:
[0011] The recording unit is used to monitor the completion queue of the input / output framework tool and record the first time data when the corresponding completion queue entry is generated in the completion queue after the system input / output layer has finished processing each object request.
[0012] The query unit is used to query the input / output operation flow corresponding to each of the object requests. If the object request is converted into an asynchronous command in the input / output framework tool, the second time data of the asynchronous command corresponding to the object request being added to the asynchronous work queue is recorded.
[0013] The calculation unit is used to determine the asynchronous operation delay data corresponding to the target request based on the difference between the first time data and the second time data corresponding to the target request; wherein the target request is an object request that is converted into an asynchronous command in the input / output framework tool;
[0014] The processing unit determines the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each of the target requests.
[0015] Optionally, in some embodiments, the system further includes a second processing unit, which is specifically used for:
[0016] The submission queue of the input / output framework tool is monitored, and third time data is recorded when each object request generates a corresponding submission queue entry in the submission queue.
[0017] Based on the difference between the first time data and the third time data corresponding to the object request, the processing delay data corresponding to the object request is determined;
[0018] Based on the processing latency data corresponding to each of the object requests, the second performance monitoring data of the input / output framework tool is determined.
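The processing latency described above (first time data minus third time data) might be computed along these lines. The dictionary layout keyed by a request identifier is an assumption made for illustration; the source does not specify how the timestamps are paired.

```python
def processing_latencies(submit_ns, complete_ns):
    """Both arguments map a request id to a timestamp in ns (hypothetical layout).

    submit_ns holds the third time data (submission queue entry generated);
    complete_ns holds the first time data (completion queue entry generated).
    """
    return {
        req_id: complete_ns[req_id] - submit_ns[req_id]
        for req_id in submit_ns
        if req_id in complete_ns  # skip requests that have not completed yet
    }


submit_ns = {1: 1_000, 2: 1_500, 3: 2_000}
complete_ns = {1: 5_000, 2: 4_000}
proc = processing_latencies(submit_ns, complete_ns)
```

Request 3 has been submitted but not completed, so it contributes no latency sample yet.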
[0019] Optionally, in some embodiments, the processing unit is specifically used for:
[0020] Check whether a summary output request for the first performance monitoring data has been received;
[0021] If the summary output request is not received at present, the asynchronous operation delay data between the time node of the last detection of the summary output request and the current time node will be migrated from the buffer to the specified storage space for storage.
[0022] Waiting for a predetermined period of time, storing the asynchronous operation delay data determined during the waiting process into the buffer, and returning to execute the step of detecting whether a summary output request for the first performance monitoring data has been received;
[0023] If the summary output request is received, the first performance monitoring data of the input / output framework tool is determined based on the asynchronous operation latency data stored in the storage space.
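The check, migrate, wait loop above can be sketched as below. The function name, the callback parameters, and the use of Python lists for the buffer and the storage space are assumptions; in practice the buffer would more plausibly live in kernel-side memory.

```python
import time
from typing import Callable, List


def collect_until_summary(
    buffer: List[int],
    summary_requested: Callable[[], bool],
    produce: Callable[[], List[int]],
    wait_seconds: float = 0.0,
) -> List[int]:
    """Migrate buffered latencies to storage each round until a summary is requested."""
    storage: List[int] = []
    while not summary_requested():
        storage.extend(buffer)      # migrate samples gathered since the last check
        buffer.clear()
        time.sleep(wait_seconds)    # predetermined waiting period
        buffer.extend(produce())    # samples determined while waiting go into the buffer
    # Following the text as written, the summary uses only what is already in storage;
    # samples still sitting in the buffer at this point are not included.
    return storage


batches = iter([[30, 40], [50]])
checks = iter([False, False, True])
result = collect_until_summary(
    buffer=[10, 20],
    summary_requested=lambda: next(checks),
    produce=lambda: next(batches),
)
```

Moving samples out of the buffer on every round keeps the buffer small, which is presumably the motivation for separating the buffer from the storage space.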
[0024] Optionally, in some embodiments, the processing unit is specifically used for:
[0025] Detect whether the size of the asynchronous operation delay data is greater than 0;
[0026] If the size of the asynchronous operation delay data is greater than 0, the asynchronous operation delay data is stored in the buffer;
[0027] If the size of the asynchronous operation delay data is less than or equal to 0, delete the asynchronous operation delay data.
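A minimal sketch of this validity check follows. The function name is hypothetical, and the source does not say why a non-positive value could occur; one plausible cause, stated here only as an assumption, is a mismatched pair of timestamps.

```python
def admit_to_buffer(buffer, latency_ns):
    """Store the sample only if it is positive; return whether it was kept."""
    if latency_ns > 0:
        buffer.append(latency_ns)
        return True
    return False  # non-positive sample is discarded


buf = []
kept = [admit_to_buffer(buf, v) for v in (1200, 0, -35, 800)]
```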
[0028] Optionally, in some embodiments, the processing unit is specifically used for:
[0029] A hash table is created in the buffer.
[0030] Assign a key to the asynchronous operation delay data, perform a hash calculation on the key, and determine the storage location in the hash table based on the hash calculation result;
[0031] The asynchronous operation delay data is used as the value and stored in the storage location.
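The key, hash, storage location sequence above can be illustrated with a toy hash table. The class name, the bucket count, the use of chaining for collisions, and the choice of a request identifier as the key are assumptions; the source does not specify any of them.

```python
class LatencyTable:
    """Toy fixed-size hash table with chaining; the key choice is an assumption."""

    def __init__(self, n_buckets: int = 8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _slot(self, key: int) -> int:
        return hash(key) % len(self.buckets)  # hash result selects the storage location

    def put(self, key: int, latency_ns: int) -> None:
        bucket = self.buckets[self._slot(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, latency_ns)  # overwrite an existing key
                return
        bucket.append((key, latency_ns))       # latency stored as the value

    def get(self, key: int):
        for k, v in self.buckets[self._slot(key)]:
            if k == key:
                return v
        return None


table = LatencyTable()
table.put(17, 3_000)
table.put(25, 500)  # in CPython, 17 and 25 map to the same bucket of 8 and share a chain
```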
[0032] Optionally, in some embodiments, the processing unit is specifically used for:
[0033] Obtain predetermined statistical indicators; wherein, the statistical indicators include at least one of mean, median, variance, standard deviation, and predetermined quantile values;
[0034] Based on the statistical indicators, the latency data of each asynchronous operation are processed to determine the first performance monitoring data of the input / output framework tool.
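As a rough illustration of applying the predetermined statistical indicators, the sketch below uses Python's standard `statistics` module. The `summarize` function and the indicator names are hypothetical, and population (rather than sample) variance is an assumption, since the source does not say which is intended.

```python
import statistics


def summarize(latencies_ns, indicators):
    """Compute the requested indicators over the latency samples."""
    funcs = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "variance": statistics.pvariance,  # population variance (assumed)
        "stdev": statistics.pstdev,        # population standard deviation (assumed)
    }
    return {name: funcs[name](latencies_ns) for name in indicators}


summary = summarize([100, 200, 300, 400], ["mean", "median", "variance"])
```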
[0035] Optionally, in some embodiments, the statistical indicator includes a predetermined quantile value; the processing unit is specifically used for:
[0036] Detect the first number of the asynchronous operation delay data;
[0037] The target sequence number is determined based on the predetermined quantile value and the first number;
[0038] The asynchronous operation delay data are sorted to obtain a sorting queue, and the queue number corresponding to each asynchronous operation delay data in the sorting queue is determined.
[0039] Based on the asynchronous operation delay data whose queue number equals the target sequence number, the first performance monitoring data of the input / output framework tool is determined.
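The quantile steps above resemble a nearest-rank computation, which can be sketched as follows. The rounding rule (ceiling of quantile times count, with 1-based queue numbers) is an assumption; the source only says the target sequence number is determined from the quantile value and the first number.

```python
import math


def quantile_by_rank(latencies_ns, q):
    """Return the sample whose 1-based queue number equals the target sequence number."""
    n = len(latencies_ns)                  # first number: how many samples there are
    target = max(1, math.ceil(q * n))      # target sequence number (assumed nearest-rank rule)
    ordered = sorted(latencies_ns)         # sorting queue
    for queue_no, value in enumerate(ordered, start=1):
        if queue_no == target:
            return value
    raise ValueError("no samples")


samples = [50, 10, 40, 20, 30]
p90 = quantile_by_rank(samples, 0.9)
p50 = quantile_by_rank(samples, 0.5)
```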
[0040] Optionally, in some embodiments, the system further includes a detection unit, which is specifically used for:
[0041] Detect the event information requested by each of the objects at the system input / output layer;
[0042] Based on the event information, determine the input / output layer latency data requested by each of the objects.
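One way to derive input / output layer latency from event information is to pair an issue event with its completion event. The event tuple layout and the `"issue"` / `"complete"` kinds below are hypothetical; the source does not name the events it detects.

```python
def io_layer_latencies(events):
    """events: iterable of (request_id, kind, timestamp_ns); kinds are hypothetical."""
    issued = {}
    latencies = {}
    for req_id, kind, ts in events:
        if kind == "issue":
            issued[req_id] = ts                          # request entered the I/O layer
        elif kind == "complete" and req_id in issued:
            latencies[req_id] = ts - issued.pop(req_id)  # time spent in the I/O layer
    return latencies


events = [(1, "issue", 100), (2, "issue", 150), (1, "complete", 400), (2, "complete", 900)]
io_lat = io_layer_latencies(events)
```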
[0043] Optionally, in some embodiments, the system further includes an anomaly localization unit, which is specifically used for:
[0044] Calculate the average processing latency data corresponding to each of the object requests to obtain the first average latency data; calculate the average asynchronous operation latency data corresponding to each of the target requests to obtain the second average latency data; and calculate the average input / output layer latency data corresponding to each of the object requests to obtain the third average latency data.
[0045] The first delay data is determined based on the difference between the first average delay data and the third average delay data;
[0046] Compare the second mean delay data with the first delay data;
[0047] If the ratio of the second mean delay data to the first delay data is greater than the first preset threshold, it is determined that there is an anomaly in the reading of the asynchronous output of the input / output framework tool.
[0048] Optionally, in some embodiments, the anomaly localization unit is further configured to:
[0049] Compare the first average latency data with the preset latency threshold;
[0050] If the first average latency data is greater than the latency threshold, it is determined that the performance of the input / output framework tool is abnormal.
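The two anomaly checks above can be combined in a short sketch. The function name, the result keys, and the threshold values are hypothetical, and the guard against a non-positive first delay data is an addition not found in the source.

```python
from statistics import fmean


def locate_anomalies(processing, async_ops, io_layer, ratio_threshold, latency_threshold):
    first_avg = fmean(processing)    # first average latency data
    second_avg = fmean(async_ops)    # second average latency data
    third_avg = fmean(io_layer)      # third average latency data
    framework_delay = first_avg - third_avg  # first delay data: time spent inside the framework
    return {
        # Guard against division by zero or a negative denominator (added assumption).
        "async_anomaly": framework_delay > 0
        and second_avg / framework_delay > ratio_threshold,
        "framework_anomaly": first_avg > latency_threshold,
    }


result = locate_anomalies(
    processing=[1000, 1200, 800],
    async_ops=[500, 700],
    io_layer=[300, 200, 400],
    ratio_threshold=0.5,
    latency_threshold=2000,
)
```

In this example the framework accounts for 700 ns of the 1000 ns average, and asynchronous operations average 600 ns, so the ratio of about 0.86 exceeds the assumed threshold of 0.5 while the overall latency stays under the assumed limit.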
[0051] Optionally, in some embodiments, the recording unit is specifically used for:
[0052] In response to a monitoring request from the target object, detect whether a running application exists in the current user space;
[0053] If the application is running in the user space, monitor the submission queue of the input / output framework tool.
[0054] In another aspect, embodiments of this application provide an electronic device, including a processor and a memory;
[0055] The memory is used to store computer programs;
[0056] The processor executes the computer program to implement the aforementioned method for monitoring input / output performance.
[0057] In another aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that is executed by a processor to implement the aforementioned input / output performance monitoring method.
[0058] In another aspect, embodiments of this application also provide a computer program product, which includes a computer program stored in a computer-readable storage medium. The processor of a computer device reads the computer program from the computer-readable storage medium and executes the computer program, causing the computer device to perform the aforementioned input / output performance monitoring method.
[0059] The embodiments of this application include at least the following beneficial effects: This application provides a method, system, device, and storage medium for monitoring input / output performance. This application monitors the processing flow of object requests input to an input / output framework tool. Through the completion queue of the input / output framework tool, it determines the first time data when each object request generates a completion queue entry after processing. For target requests that undergo asynchronous processing within the input / output framework tool, it determines the second time data after they are converted into asynchronous commands and added to the asynchronous work queue. Thus, the processing status of the input / output operations of the object requests can be determined within the input / output framework tool. Then, based on the difference between the first and second time data corresponding to the target request, the asynchronous operation latency data corresponding to the target request is determined, thereby determining the first performance monitoring data of the input / output framework tool. This method, based on event monitoring of the input / output framework tool, effectively determines the working state of the input / output framework tool. By recording the time data of relevant nodes, it determines the latency data of the input / output framework tool when performing asynchronous operations, which can improve the granularity and accuracy of monitoring, and is beneficial for locating performance problems in the system kernel and improving the system's operating performance.
Attached Figure Description
[0060] The accompanying drawings are used to provide a further understanding of the technical solutions of this application and constitute a part of the specification. They are used together with the embodiments of this application to explain the technical solutions of this application and do not constitute a limitation on the technical solutions of this application.
[0061] Figure 1 is a schematic diagram illustrating I / O performance monitoring in related technologies;
[0062] Figure 2 is a system architecture diagram of the input / output performance monitoring method provided in the embodiments of this application;
[0063] Figure 3 is a schematic diagram of a system architecture for a streaming media service provided in an embodiment of this application;
[0064] Figure 4 is a schematic diagram illustrating I / O operations based on IO_uring, as provided in an embodiment of this application;
[0065] Figure 5 is a flowchart illustrating an input / output performance monitoring method provided in an embodiment of this application;
[0066] Figure 6 is a flowchart illustrating how an input / output framework tool processes object requests, as provided in an embodiment of this application;
[0067] Figure 7 is a schematic diagram of a process for determining first performance monitoring data provided in an embodiment of this application;
[0068] Figure 8 is a flowchart illustrating the locating of a system anomaly provided in an embodiment of this application;
[0069] Figure 9 is a schematic diagram illustrating an application example of a monitoring method provided in the embodiments of this application;
[0070] Figure 10 is a schematic diagram illustrating the working principle of a kernel acquisition module provided in an embodiment of this application;
[0071] Figure 11 is a schematic diagram illustrating the working principle of an output module provided in an embodiment of this application;
[0072] Figure 12 is a structural block diagram of an input / output performance monitoring system provided in an embodiment of this application;
[0073] Figure 13 is a structural block diagram of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0074] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0075] It is understood that the terms “first,” “second,” etc., used in this application may be used to describe various concepts herein, but unless otherwise stated, these concepts are not limited by these terms. These terms are used only to distinguish one concept from another.
[0076] As used in this application, "at least one" includes one, two, or more; "multiple" includes two or more; "each" refers to every one of the corresponding multiple items; and "any" refers to any one of the multiple items.
[0077] Before providing a further detailed description of the embodiments of this application, the nouns and terms used in the embodiments of this application are explained, and the nouns and terms used in the embodiments of this application shall be interpreted as follows:
[0078] 1) IO_uring is a high-performance asynchronous I / O framework tool for Linux systems, designed to provide more efficient I / O operations than traditional system calls.
[0079] 2) Linux, an open-source operating system kernel. The Linux kernel is the core of the operating system, responsible for managing computer hardware resources, providing process management and scheduling, file system support, network communication, and other functions. A complete Linux operating system typically includes the Linux kernel and a series of user-space tools and libraries, which together constitute a complete operating system environment.
[0080] 3) Kernel: The core of the operating system, responsible for managing and controlling computer hardware resources and providing basic services to applications running in user space.
[0081] 4) User space: The area where applications and libraries running on the operating system reside. These applications and libraries cannot directly access hardware resources; instead, they interact with the kernel through system calls.
[0082] 5) User Mode: The operating mode of a normal application. In user mode, applications cannot directly access hardware resources such as memory, disk, and network interfaces; they can only perform limited operations and cannot execute privileged instructions.
[0083] 6) Kernel Mode: The operating mode in which the operating system kernel runs. Code running in kernel mode can access all system resources and execute any instruction, including privileged instructions. Generally, program code in kernel mode is responsible for managing and scheduling system resources, such as handling interrupts, managing memory, and executing system calls.
[0084] 7) Block device: A hardware device that can perform read and write operations on data blocks of a fixed size. Common block devices include hard drives, SSDs (solid-state drives), USB flash drives, etc.
[0085] 8) BPF (Berkeley Packet Filter) is a highly efficient packet filtering technology. Originally used in network sniffing tools, its application scope has greatly expanded over time, especially in modern operating systems and kernels. By using BPF technology, applications can efficiently filter out packets of interest.
[0086] Currently, with the rapid development of information technology, related applications have gradually integrated into people's lives, providing a variety of services. For example, in the field of computer technology, there are some input / output (I / O) framework tools (such as IO_uring). These tools run in the system kernel and can help process the system's input / output operations, thereby improving the processing performance of data input / output operations.
[0087] Specifically, taking IO_uring as an example, it provides asynchronous I / O functionality, allowing applications to initiate I / O requests and return immediately without waiting for the I / O operation to complete. Once the I / O operation is complete, the kernel notifies the application to perform asynchronous harvesting. Furthermore, in IO_uring, user mode and kernel mode communicate using shared memory, effectively reducing system calls. User-mode applications can submit I / O operations to be initiated to shared memory, and kernel threads can read and execute the corresponding I / O operations from shared memory, then return the results to the application via shared memory. By reducing the overhead of system calls and optimizing the I / O path, IO_uring can significantly reduce I / O latency and improve the performance of data I / O operations.
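The submit, execute, harvest flow described above can be pictured with a toy model. This sketch does not call the real IO_uring interface; the `ToyRing` class, its method names, and the use of in-process deques in place of shared memory are all assumptions made purely to illustrate the queue pattern.

```python
from collections import deque


class ToyRing:
    """Toy stand-in for the shared submission/completion queues; not the real API."""

    def __init__(self):
        self.sq = deque()  # submission queue: application writes, kernel side reads
        self.cq = deque()  # completion queue: kernel side writes, application reads

    def submit(self, req_id, op):
        self.sq.append((req_id, op))        # application returns immediately

    def kernel_step(self):
        while self.sq:
            req_id, op = self.sq.popleft()
            self.cq.append((req_id, op()))  # run the operation, then post its result

    def reap(self):
        done = list(self.cq)                # application harvests completed results
        self.cq.clear()
        return done


ring = ToyRing()
ring.submit(1, lambda: len(b"hello"))
ring.submit(2, lambda: len(b"io"))
pending_before = ring.reap()  # nothing has completed yet
ring.kernel_step()
completed = ring.reap()
```

The application never blocks in `submit`; results only become visible once the kernel side has processed the submission queue, which is the asynchronous behavior the paragraph describes.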
[0088] IO_uring is primarily designed for Linux systems and can be applied to performance-critical scenarios, especially server devices. For example, in some embodiments, the server device can be a web server. High-concurrency web servers typically handle a large number of network requests, and IO_uring can significantly reduce the processing time for each request, improving the throughput and response speed of the web server. In some embodiments, the server device can be a database server, which involves numerous disk read / write operations. IO_uring can optimize these operations, reduce I / O latency, and improve database performance. In some embodiments, the server device can also be a network proxy or load balancer; IO_uring can help them handle connections and data transmission more efficiently, reducing request processing time.
[0089] Of course, it is understood that the types of server devices to which IO_uring can be applied are not limited to the examples given above. In other scenarios, other server devices can also be used. Furthermore, in some embodiments, IO_uring can also be applied to terminal devices. For example, some terminal devices may run high-performance desktop applications, such as video editing software and large games. IO_uring can improve data processing efficiency and ensure the efficient and stable operation of the applications. For another example, terminal devices can also be IoT devices. IoT devices may need to process large amounts of sensor data. Based on IO_uring, data transmission efficiency can be improved and energy consumption of IoT devices can be reduced. In this application embodiment, there are no restrictions on the types of devices to which IO_uring can be applied.
[0090] In related technologies, to improve system performance as much as possible, there is a need to monitor input / output performance to identify system performance bottlenecks. This includes performance monitoring tasks for input / output framework tools. However, current input / output performance monitoring strategies often involve statistical analysis of relevant inputs and outputs at the system's block devices or application level. Refer to Figure 1, which shows a schematic diagram of I / O performance monitoring in related technologies.
[0091] For example, taking a hard drive as the block device, when monitoring the I / O operation processing performance of a system, relevant tools (such as built-in command-line tools or third-party monitoring tools) are used to monitor disk performance indicators such as utilization, saturation, IOPS (Input / Output Operations Per Second), throughput, and response time. The overall processing latency is inferred from the response time. Alternatively, a time data point is recorded at the moment the application initiates an I / O request to the operating system, and another time data point is recorded at the moment the application receives notification from the operating system that the I / O operation is complete. The difference between the two time data points is calculated to determine the overall processing latency.
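The application-level approach described above, with one timestamp before the request and one after completion, can be sketched as follows. The `timed_write` function is hypothetical, and using a temporary file with `fsync` is just one convenient way to exercise a real I / O path.

```python
import os
import tempfile
import time


def timed_write(data: bytes) -> int:
    """Return the latency (ns) of one write plus fsync as seen from the application."""
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter_ns()  # application initiates the I/O request
        f.write(data)
        f.flush()
        os.fsync(f.fileno())            # block until the data reaches the device
        end = time.perf_counter_ns()    # application learns the I/O is complete
    return end - start


elapsed_ns = timed_write(b"x" * 4096)
```

The measured value bundles together system call overhead, file system work, and device time, which is why this style of measurement cannot isolate the framework tool's own contribution.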
[0092] However, these implementation methods, when determining input / output performance, do not delve into the internal workings of the input / output framework tool to understand how input / output operations are processed, making it difficult to ascertain the tool's operational status. For example, when monitoring at the system's block device level, the monitored object is the hardware device, and the obtained input / output performance metrics cannot reflect the state of the input / output framework tool. When monitoring at the application level, there is inherent latency overhead between the application's I / O request to the operating system and the request reaching the input / output framework tool; the obtained processing latency data cannot accurately reflect the performance of the input / output framework tool.
[0093] Therefore, in summary, the monitoring strategies for input and output performance in related technologies cannot effectively monitor the performance of input and output framework tools, resulting in low accuracy in locating performance problems and easily affecting the system's operation.
[0094] In view of this, this application provides an input / output performance monitoring method, system, device, and storage medium. This application monitors the processing flow of object requests input to an input / output framework tool. Through the completion queue of the input / output framework tool, it determines the first time data when each object request generates a completion queue entry after processing. For target requests that undergo asynchronous processing within the input / output framework tool, it determines the second time data after they are converted into asynchronous commands and added to the asynchronous work queue. This allows for the determination of the processing status of the object request's input / output operations within the input / output framework tool. Then, based on the difference between the first and second time data corresponding to the target request, it determines the asynchronous operation latency data corresponding to the target request, thereby determining the first performance monitoring data of the input / output framework tool. This method, based on event monitoring of the input / output framework tool, effectively determines the working state of the input / output framework tool. By recording the time data of relevant nodes, it determines the latency data of the input / output framework tool when performing asynchronous operations, improving the granularity and accuracy of monitoring. This is beneficial for locating performance problems in the system kernel and improving system performance.
[0095] System architecture and scenario description used in the embodiments of this application
[0096] Figure 2 is a system architecture diagram of an input / output performance monitoring method provided in an embodiment of this application, which includes a terminal device 240, the Internet 230, a gateway 220, a backend server 210, etc.
[0097] In this embodiment, the terminal device 240 can install and run related applications, such as video playback applications, shopping applications, and social applications. The terminal device 240 can take various forms, including desktop computers, laptops, PDAs (personal digital assistants), mobile phones, in-vehicle terminals, home theater terminals, and dedicated terminals. Furthermore, it can be a single device or a collection of multiple devices. The terminal device 240 can communicate with the Internet 230 via wired or wireless means to exchange data.
[0098] A backend server 210 refers to a computer system that can provide certain services to terminal devices 240. Compared to ordinary terminal devices 240, backend servers 210 have higher requirements in terms of stability, security, and performance. A backend server 210 can be a single high-performance computer in a network platform, a cluster of multiple high-performance computers, a portion of a single high-performance computer (e.g., a virtual machine), or a combination of portions of multiple high-performance computers (e.g., virtual machines). Specifically, a backend server 210 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
[0099] Gateway 220, also known as an internetwork connector or protocol converter, is a computer system or device that acts as a translator, enabling network interconnection at the transport layer. It bridges the gap between two systems using different communication protocols, data formats, languages, or even completely different architectures. Gateways can also provide filtering and security functions. Messages sent from terminal device 240 to backend server 210 are forwarded to the corresponding backend server 210 via gateway 220. Messages sent from backend server 210 to terminal device 240 are also forwarded to the corresponding terminal device 240 via gateway 220.
[0100] The input / output performance monitoring method provided in this application embodiment can be executed locally on the terminal device 240 or the backend server 210.
[0101] For example, when the input / output performance monitoring method provided in this application embodiment is applied to the terminal device 240 for local execution, it can help to discover the specific reasons that cause the application of the terminal device to respond slowly, thereby facilitating the implementation of optimization measures and improving the user experience of the application; and in some scenarios, by monitoring the performance of input / output framework tools, it can also help the terminal device select a suitable input / output framework tool and improve the operating performance of the terminal device.
[0102] For example, when the input / output performance monitoring method provided in this application embodiment is executed locally on the backend server 210, the system performance bottleneck can be detected in a timely manner, making it convenient for maintenance personnel to take corresponding optimization measures to avoid system overload or crash.
[0103] Of course, it is understood that the system architecture diagram shown in Figure 2 is only used to illustrate the system architecture to which the methods in the embodiments of this application can be applied, and does not imply any limitation on the actual implementation of this application.
[0104] The input / output performance monitoring method provided in this application can be executed in various scenarios, and the following is an exemplary description of it.
[0105] (I) Game Scene
[0106] The method provided in this application embodiment can be applied in game application scenarios.
[0107] In gaming applications, I / O processing performance is crucial, directly impacting the player's gaming experience. The method described in this application can be applied to gaming scenarios. For example, a player's computer device can be equipped with an input / output framework tool to improve game performance.
[0108] By applying the method in the embodiments of this application, the input and output performance of the system can be monitored to determine whether the input and output framework tools on the current computer device are compatible, which helps to solve the performance bottleneck of the game application and improves the game experience.
[0109] (II) Scenarios for Streaming Media Applications
[0110] The method provided in this application embodiment can be applied to streaming media application scenarios.
[0111] Currently, with the continuous development of information technology, numerous streaming media applications have emerged. Streaming media is a technology that continuously transmits audio and video content over a network, allowing terminal devices to begin playing media files before the data is fully downloaded. Streaming media scenarios typically involve real-time or near-real-time content delivery, which can include online video, live audio broadcasts, IPTV, distance learning courses, live game streaming, video conferencing, and more.
[0112] The input / output performance monitoring method provided in this embodiment can be applied in streaming media scenarios. Please refer to Figure 3, which shows a schematic diagram of the system architecture of a streaming media service provided in an embodiment of the present application. As shown in Figure 3, streaming content providers can use a CDN (Content Delivery Network) to distribute streaming content, which involves caching relevant business data on various nodes of the CDN. When a user requests a streaming resource, the nearest node responds to the request and delivers the streaming resource to the relevant terminal device for playback and display. Input / output framework tools can be applied to CDN nodes to improve their performance and efficiency.
[0113] In this scenario, the performance monitoring data corresponding to the input / output framework tool on each node can be determined by the method provided in the embodiments of this application, and nodes that may be faulty can be located, thereby enabling timely node repair and improving the efficiency and stability of streaming media data distribution.
[0114] General Description of Embodiments in this Application
[0115] Before introducing and explaining the input / output performance monitoring method provided in the embodiments of this application, the input / output framework tools involved in the embodiments of this application will be introduced and explained first.
[0116] In this embodiment, the input / output framework tool IO_uring is used as an example for description. In practical applications, the tool is not limited to IO_uring. Any input / output framework tool with functions similar or related to those of IO_uring can be monitored using the input / output performance monitoring method provided in this embodiment. This application does not impose any restrictions on this.
[0117] As described above, IO_uring eliminates the need for system calls to initiate I / O operations by sharing memory between user space and kernel space. Specifically, functionally, IO_uring provides two queues that share memory with user space: the Submission Queue (SQ) and the Completion Queue (CQ). The Submission Queue is a circular queue consisting of a contiguous block of memory, used to store the data (requests) for which I / O operations will be performed. Similarly, the Completion Queue is also a circular queue consisting of a contiguous block of memory, used to store the data returned after the I / O operation is completed.
[0118] In the kernel, the `io_sq_ring` structure is typically used to represent the submission queue. The `io_sq_ring` structure contains `head`, `tail`, `ring_entries`, and `array` fields. The `head` field stores the head pointer of the circular queue, the `tail` field stores the tail pointer, the `ring_entries` field stores the total number of existing I / O requests in the queue, and the `array` field is an index within the circular queue array pointing to the submission queue entries. The kernel maps the `io_sq_ring` structure to the memory space of user-mode applications. This allows both user-mode applications and kernel-mode input / output framework tools to access and manipulate the `io_sq_ring` structure. For example, applications can directly submit I / O requests to the circular queue of the `io_sq_ring` structure, and input / output framework tools can read and execute the submitted requests (i.e., process the corresponding I / O operations) from the circular queue, thus avoiding system calls and context switching events.
[0119] In the `io_sq_ring` structure described above, the `array` field is an index pointing to a submission queue entry. Here, a submission queue entry (SQE) is a specific piece of data in the submission queue, which can be obtained by transforming the request submitted by the application. Specifically, a submission queue entry generally includes the `opcode`, `ioprio`, `fd`, `off`, `addr`, and `len` fields. The `opcode` field is the I / O opcode, primarily used to indicate the type of I / O operation requested, such as read, write, synchronous, or asynchronous; the `ioprio` field is the priority of the I / O operation, which can be used to execute important I / O operations in advance; the `fd` field is the file handle corresponding to the I / O operation; the `off` field represents the offset of the current I / O operation; the `addr` field points to the memory address associated with the current I / O operation, for example, for a write operation, it points to the memory address of the content to be written to the file; and the `len` field represents the length of the data in the current I / O operation.
[0120] When an application submits an I / O operation request, it first retrieves a free entry from the submission queue, then fills this entry with the corresponding field data (such as the aforementioned opcode and ioprio fields) to generate the corresponding submission queue entry. The index corresponding to this submission queue entry is written into the io_sq_ring structure, thereby enabling the I / O operation request to be submitted to IO_uring for processing.
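The submission flow described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's actual data layout: the names `Sqe`, `SubmissionRing`, `capacity`, `submit`, and `consume` are hypothetical, and the ring is modelled with ordinary Python lists rather than shared memory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sqe:
    """Simplified submission queue entry with the fields named in the text."""
    opcode: int
    ioprio: int
    fd: int
    off: int
    addr: int
    len: int


class SubmissionRing:
    """Toy circular submission queue (hypothetical model, not kernel code)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity               # number of slots in the ring
        self.head = 0                          # consumer position
        self.tail = 0                          # producer position
        self.sqes: List[Optional[Sqe]] = [None] * capacity
        self.array = [0] * capacity            # ring of indices into sqes

    def submit(self, sqe: Sqe) -> bool:
        if self.tail - self.head == self.capacity:
            return False                       # ring is full
        slot = self.tail % self.capacity
        self.sqes[slot] = sqe                  # fill the free entry
        self.array[slot] = slot                # publish its index
        self.tail += 1
        return True

    def consume(self) -> Optional[Sqe]:
        if self.head == self.tail:
            return None                        # ring is empty
        slot = self.head % self.capacity
        sqe = self.sqes[self.array[slot]]
        self.head += 1
        return sqe
```

In this model the application side calls `submit` and the framework side calls `consume`; the head and tail counters are the only state the two sides need to agree on.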
[0121] After completing I / O operations through the kernel, IO_uring saves the data results of the I / O operations to the completion queue. In the kernel, the completion queue is typically represented by the io_cq_ring structure. The io_cq_ring structure contains head, tail, ring_entries, and cqes fields. The head field records the head pointer of the ring queue, the tail field records the tail pointer, the ring_entries field records the total number of completed I / O operations in the queue, and the cqes field is a ring queue array storing the data results of the I / O operations. This array contains several entries called completion queue entries (CQEs), each storing the data result of one I / O operation corresponding to a request.
[0122] Please refer to Figure 4, which shows a schematic diagram of I / O operation processing based on IO_uring, as provided in an embodiment of this application. Generally, the I / O operation processing flow of IO_uring is as follows: the application submits an I / O operation request to the submission queue in IO_uring; an IO_uring thread reads the submission queue entry (obtained by transforming the request) from the submission queue; based on the data content in the submission queue entry, it initiates an I / O request; the system executes the I / O operation and returns the corresponding data result to IO_uring; this data result is stored in the completion queue in IO_uring as a completion queue entry. The application can then read the data result of the I / O operation from the completion queue, which completes the overall I / O operation process.
[0123] Based on the above introduction and explanation of the input / output framework tools, the following describes a method for monitoring input / output performance provided in an embodiment of this application. Please refer to Figure 5, which shows a flowchart of an input / output performance monitoring method provided in an embodiment of this application. The process illustrated in Figure 5 can be applied to the aforementioned terminal devices or backend servers. As shown in Figure 5, a method for monitoring input / output performance according to an embodiment of this application includes, but is not limited to, the following steps:
[0124] Step 510: Monitor the completion queue of the input / output framework tool and record first time data, namely the time at which the corresponding completion queue entry is generated in the completion queue after the system input / output layer has finished processing each object request;
[0125] Step 520: Query the input / output operation flow corresponding to each object request. If the object request is converted into an asynchronous command in the input / output framework tool, record second time data, namely the time at which the asynchronous command corresponding to the object request is added to the asynchronous work queue;
[0126] Step 530: Determine the asynchronous operation latency data corresponding to the target request based on the difference between the first time data and the second time data corresponding to the target request; wherein, the target request is an object request that is converted into an asynchronous command in the input / output framework tool;
[0127] Step 540: Determine the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request.
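Steps 510 to 540 can be sketched as a small bookkeeping class. This is a minimal illustration under assumptions: the class name `AsyncLatencyMonitor`, its method names, the integer request identifiers, and the millisecond timestamps are all hypothetical, and the mean is used in step 540 only as one possible aggregation.

```python
from statistics import mean
from typing import Dict, List


class AsyncLatencyMonitor:
    """Hypothetical recorder for first and second time data (milliseconds)."""

    def __init__(self) -> None:
        self.first_time: Dict[int, int] = {}    # request id -> CQE generated
        self.second_time: Dict[int, int] = {}   # request id -> async enqueue

    def on_completion(self, request_id: int, now_ms: int) -> None:
        self.first_time[request_id] = now_ms           # step 510

    def on_async_enqueue(self, request_id: int, now_ms: int) -> None:
        self.second_time[request_id] = now_ms          # step 520

    def latencies(self) -> List[int]:
        # Step 530: only target requests (those with second time data) count.
        return [self.first_time[r] - t2
                for r, t2 in self.second_time.items()
                if r in self.first_time]

    def first_performance_data(self) -> float:
        # Step 540: mean latency; raises StatisticsError if nothing completed.
        return mean(self.latencies())
```

A synchronous request only ever appears in `first_time`, so it is naturally excluded from the asynchronous latency statistics.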
[0128] This application provides a method for monitoring input / output performance. This method is based on event monitoring of input / output framework tools, which effectively determines the working status of the input / output framework tools. By recording the time data of relevant nodes, the latency data of the input / output framework tools when performing asynchronous operations can be determined, which can improve the granularity and accuracy of monitoring, and help to locate system kernel performance problems and improve system operation performance.
[0129] Specifically, when monitoring the performance of the input / output framework tool, the performance of the input / output framework tool in processing the I / O requests of related applications can be monitored. In this embodiment of the application, the I / O requests of the applications can be referred to as object requests.
[0130] In this application embodiment, the I / O operation type corresponding to the object request can be various, and this application embodiment does not limit it. For example, in some embodiments, the I / O operation type corresponding to the object request may be a file read / write operation. For instance, when a file management system reads the content of an uploaded file, it will trigger a file read / write operation object request. In some embodiments, the I / O operation type corresponding to the object request may be a network communication operation. For instance, when a web server receives data sent by a client, it will trigger a network communication operation object request. In some embodiments, the I / O operation type corresponding to the object request may be a database operation. For instance, when a database analysis system needs to query historical data to generate a report, it will trigger a database operation object request.
[0131] It is understood that the triggering conditions for object requests and the corresponding I / O operation types can be flexibly set according to the relevant functions of the application, and this embodiment does not impose any restrictions on this.
[0132] It should be noted that, in the embodiments of this application, the number of applications running in user space can be one or more when monitoring input / output performance. In other words, this application can monitor the I / O operation processing performance of the input / output framework tool when running a single application to determine the compatibility between the input / output framework tool and the application; it can also monitor the I / O operation processing performance of the input / output framework tool under a large number of concurrent requests when running multiple applications to determine the concurrent processing performance of the input / output framework tool.
[0133] Of course, regardless of whether one or more applications are running in user space, in this embodiment of the application, when monitoring input / output performance, the relevant performance data can be statistically analyzed and processed across multiple triggered object requests, thus enabling a more accurate evaluation.
[0134] In this application embodiment, the number of object requests involved in the monitoring process is not specifically limited and can be flexibly adjusted according to actual needs. For example, in some embodiments, a threshold for the number of object requests processed during the monitoring process can be preset. Once the number of object requests actually processed by the input / output framework tool reaches this threshold after monitoring begins, the current monitoring task ends. In some embodiments, a time period corresponding to the monitoring task can also be preset, and relevant performance data can be determined based on the object request processing status of the input / output framework tool within the set time period. This application does not impose any limitations on this.
[0135] In step 510, when monitoring the performance of the input / output framework tool, the completion queue of the input / output framework tool can be monitored. Referring to the previous description of the process for handling I / O operations based on IO_uring, in this embodiment, for each object request, after processing the corresponding I / O operation, the input / output framework tool will generate a completion queue entry corresponding to that object request in the completion queue. Here, when monitoring the completion queue of the input / output framework tool, the time data when the I / O operation of each object request generates the corresponding completion queue entry in the completion queue can be recorded. In this embodiment, this time data is recorded as the first time data.
[0136] It should be noted that there are multiple ways to monitor the completion queue of the input / output framework tool in step 510 to determine the first time data. For example, in some embodiments, an event notification mechanism can be used: event notifications can be registered in the input / output framework tool, so that when a new completion queue entry is added, the kernel triggers an event notification, and the first time data for the corresponding completion queue entry can be determined based on this event notification. In some embodiments, a file descriptor (fd) can be used to monitor the completion queue, with relevant interfaces specified in advance to observe the file descriptor. When the file descriptor becomes readable, it indicates that a new completion queue entry has appeared in the completion queue, and the current point in time is recorded as the first time data corresponding to that completion queue entry. Of course, it is understood that the actual method of determining the first time data when a completion queue entry is generated can be flexibly adjusted as needed, and this application does not impose any limitations on this.
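The event notification approach can be illustrated with a plain callback registry. This is a user-space sketch only and does not use any kernel interface: `CompletionQueue`, `register`, `post`, and `record_first_time` are hypothetical names, and the `user_data` key is assumed here as the way a completion is matched back to its request.

```python
import time
from typing import Callable, Dict, List


class CompletionQueue:
    """Toy completion queue that notifies listeners when an entry is added."""

    def __init__(self) -> None:
        self.cqes: List[dict] = []
        self._listeners: List[Callable[[dict], None]] = []

    def register(self, listener: Callable[[dict], None]) -> None:
        self._listeners.append(listener)

    def post(self, cqe: dict) -> None:
        self.cqes.append(cqe)
        for listener in self._listeners:       # fire the event notification
            listener(cqe)


first_time_ns: Dict[int, int] = {}


def record_first_time(cqe: dict) -> None:
    # A monotonic clock avoids jumps caused by wall-clock adjustments.
    first_time_ns[cqe["user_data"]] = time.monotonic_ns()
```

Registering `record_first_time` with `register` and then calling `post` stores one timestamp per completed request, which is the role the first time data plays in step 510.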
[0137] In step 510, the timing granularity of the recorded first time data can be flexibly set as needed. For example, a millisecond timing granularity can be used to record the first time data. For instance, for a certain object request, its corresponding first time data is recorded as 16 minutes, 4 seconds, and 2 milliseconds.
[0138] It should be noted that, in order to facilitate the calculation and processing of relevant data, the timing granularity of each time data in this application embodiment can be preset to the same specification, such as using a timing granularity of milliseconds for statistics. This application does not impose any restrictions on this.
[0139] In step 520, for each object request, its corresponding input / output operation flow in the input / output framework tool can be queried. As shown in Figure 4, the general input / output operation flow for an object request includes: the application submits an I / O operation request to the submission queue in IO_uring; an IO_uring thread reads the submission queue entry (obtained by transforming the request) from the submission queue; and an I / O request is initiated based on the data content in the submission queue entry. Here, for different types of object requests, the input / output framework tool uses different methods to actually initiate I / O requests to the system input / output layer.
[0140] Please refer to Figure 6, which shows a flowchart of an input / output framework tool processing object requests, as provided in an embodiment of this application. As shown in Figure 6, after receiving an object request, the input / output framework tool converts it into a submission queue entry and adds it to the submission queue. Then, the tool parses the submission queue entry corresponding to each object request. As mentioned earlier, a submission queue entry includes an opcode field, primarily used to indicate the type of I / O operation being requested, such as read, write, synchronous, or asynchronous. For synchronous I / O operations, the input / output framework tool can directly send a request to the system input / output layer to implement the corresponding input / output operation. For asynchronous I / O operations, however, the tool first converts the entry into an asynchronous command, adds it to the asynchronous work queue, and then sends a request to the system input / output layer based on the asynchronous commands in the queue.
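The branching between the synchronous and asynchronous paths can be sketched as follows. The opcode labels in `ASYNC_OPCODES`, the `Dispatcher` class, and the injected `clock_ms` function are all hypothetical and chosen for illustration; a real framework tool decides asynchrony inside the kernel.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

ASYNC_OPCODES = {"async_read", "async_write"}   # hypothetical opcode labels


class Dispatcher:
    """Routes parsed entries either directly to I/O or via an async queue."""

    def __init__(self, clock_ms: Callable[[], int]) -> None:
        self.clock_ms = clock_ms
        self.async_work_queue: Deque[Tuple[int, str]] = deque()
        self.second_time_ms: Dict[int, int] = {}
        self.issued: List[int] = []             # requests sent to the I/O layer

    def handle(self, request_id: int, opcode: str) -> None:
        if opcode in ASYNC_OPCODES:
            # Asynchronous path: enqueue and record the second time data.
            self.async_work_queue.append((request_id, opcode))
            self.second_time_ms[request_id] = self.clock_ms()
        else:
            # Synchronous path: issue to the I/O layer straight away.
            self.issued.append(request_id)

    def drain(self) -> None:
        # Worker side: issue queued asynchronous commands in FIFO order.
        while self.async_work_queue:
            request_id, _opcode = self.async_work_queue.popleft()
            self.issued.append(request_id)
```

Passing the clock in as a parameter keeps the sketch deterministic; only requests that take the asynchronous branch acquire second time data and therefore become target requests.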
[0141] In the above-described scenario, this embodiment of the application can detect whether each object request is converted into an asynchronous command in the input / output framework tool. If so, these object requests can be recorded as target requests. For target requests that have been converted into asynchronous commands, this embodiment of the application can record the time data when their corresponding asynchronous commands are added to the asynchronous work queue, and record this as second time data.
[0142] It is understood that, in the embodiments of this application, the implementation method of monitoring the asynchronous work queue in the input / output framework tool to determine the second time data corresponding to each target request can be implemented by referring to the processing method of the first time data in the aforementioned step 510, which will not be elaborated here.
[0143] In step 530, after obtaining the first time data and the second time data corresponding to the target request, the difference between the first time data and the second time data can be calculated. It can be understood that the first time data records the time node when the I / O operation corresponding to the target request is completed, and the second time data records the time node when the target request is added to the asynchronous work queue and is about to start executing the asynchronous operation. Therefore, based on the difference between the first time data and the second time data, the time consumed by the asynchronous operation corresponding to the target request from the start of execution to its completion can be determined. In this embodiment, this is recorded as asynchronous operation latency data. Asynchronous operation latency data can be used to characterize the performance of the input / output framework tool in performing asynchronous I / O operations. The larger the asynchronous operation latency data, the worse the performance of the input / output framework tool in performing asynchronous I / O operations; the smaller the asynchronous operation latency data, the better the performance of the input / output framework tool in performing asynchronous I / O operations.
[0144] For example, for a certain target request, assuming its corresponding second time data is 18 minutes 26 seconds and 43 milliseconds and its corresponding first time data is 18 minutes 27 seconds and 96 milliseconds, then the calculated asynchronous operation latency data corresponding to the target request is 1053 milliseconds.
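The arithmetic can be checked with a short sketch, taking the completion time (first time data) as 18 min 27 s 96 ms and the enqueue time (second time data) as 18 min 26 s 43 ms, since completion must be the later of the two. The helper `to_ms` is a hypothetical name introduced for illustration.

```python
def to_ms(minutes: int, seconds: int, millis: int) -> int:
    """Convert a minute/second/millisecond reading to whole milliseconds."""
    return (minutes * 60 + seconds) * 1000 + millis


first_ms = to_ms(18, 27, 96)     # completion queue entry generated
second_ms = to_ms(18, 26, 43)    # asynchronous command enqueued
async_latency_ms = first_ms - second_ms   # 1053
```

Converting both readings to a single unit before subtracting avoids borrow errors across the seconds and milliseconds fields.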
[0145] In step 540, as described above, the input / output performance monitoring task in this embodiment of the application involves multiple object requests. Among these object requests, there are generally also multiple target requests that are converted into asynchronous commands in the input / output framework tool. In this embodiment of the application, the performance monitoring data of the input / output framework tool in processing asynchronous I / O operations can be determined based on the asynchronous operation latency data corresponding to each target request. Here, this performance monitoring data is recorded as the first performance monitoring data.
[0146] In this application embodiment, the specific implementation method for determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to multiple target requests is not limited. For example, in some embodiments, the average value of the asynchronous operation latency data corresponding to each target request can be calculated, and this average value can be determined as the first performance monitoring data of the input / output framework tool; in other embodiments, the asynchronous operation latency data corresponding to each target request can be sorted, several larger asynchronous operation latency data can be removed, and the largest of the remaining asynchronous operation latency data can be determined as the first performance monitoring data. Of course, the actual method for determining the first performance monitoring data is not limited to the above forms, and in some embodiments, the first performance monitoring data can also include statistical data of multiple dimensions determined based on the asynchronous operation latency data corresponding to each target request. Specific implementation methods will be described in subsequent embodiments and will not be elaborated here.
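The two aggregation options mentioned above, the average and the largest value remaining after the several largest values are removed, can be sketched as follows. The function names and the `drop_largest` parameter are hypothetical; how many values to discard is left open by the text.

```python
from statistics import mean
from typing import List


def mean_latency(latencies_ms: List[int]) -> float:
    """Average of all asynchronous operation latency values."""
    return mean(latencies_ms)


def trimmed_max(latencies_ms: List[int], drop_largest: int) -> int:
    """Largest value left after discarding the `drop_largest` biggest ones."""
    if drop_largest < 0 or drop_largest >= len(latencies_ms):
        raise ValueError("drop_largest must leave at least one value")
    ordered = sorted(latencies_ms)
    return ordered[len(ordered) - drop_largest - 1]
```

For the sample values `[30, 50, 1053, 40]`, the mean is pulled up sharply by the single 1053 ms outlier, while `trimmed_max` with `drop_largest=1` reports 50 ms, which is why the second option can be more robust for tail-latency reporting.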
[0147] It is understood that the input / output performance monitoring method provided in this application monitors the processing flow of object requests input to the input / output framework tool. Through the completion queue of the input / output framework tool, it determines the first time data when each object request generates a completion queue entry after processing. For target requests that undergo asynchronous processing within the input / output framework tool, it determines the second time data after they are converted into asynchronous commands and added to the asynchronous work queue. In this way, the processing status of the input / output operations of the object requests can be determined within the input / output framework tool. Then, based on the difference between the first and second time data corresponding to the target request, the asynchronous operation latency data corresponding to the target request is determined, thereby determining the first performance monitoring data of the input / output framework tool. This method, based on event monitoring of the input / output framework tool, effectively determines the working state of the input / output framework tool. By recording the time data of relevant nodes, it determines the latency data of the input / output framework tool when performing asynchronous operations, which can improve the granularity and accuracy of monitoring, and is beneficial for locating performance problems in the system kernel and improving the system's operating performance.
[0148] Specifically, in some embodiments, the input / output performance monitoring method provided in this application further includes:
[0149] Monitor the submission queue of the input / output framework tool and record third time data, namely the time at which each object request generates a corresponding submission queue entry in the submission queue;
[0150] The processing latency data corresponding to the object request is determined based on the difference between the first time data and the third time data corresponding to the object request.
[0151] Based on the processing latency data corresponding to each object request, the second performance monitoring data of the input / output framework tool is determined.
[0152] In this embodiment of the application, in addition to determining the asynchronous operation latency data corresponding to the target request, for all object requests, whether asynchronous or synchronous operation types, the time spent from entering the input / output framework tool for processing to finally generating the corresponding completion queue entry can be determined. In this embodiment of the application, this time is recorded as processing latency data.
[0153] Understandably, processing latency data can be used to characterize the overall performance of the input / output framework tool in handling I / O operations. If the processing latency data corresponding to most object requests is large, it indicates that the input / output framework tool has poor performance in handling I / O operations; conversely, if the processing latency data corresponding to most object requests is small, it indicates that the input / output framework tool has good performance in handling I / O operations.
[0154] Specifically, in this embodiment, when determining the processing latency data corresponding to an object request, the submission queue of the input / output framework tool can be monitored. As described earlier regarding IO_uring, each object request is submitted to the submission queue by the application, generating a corresponding submission queue entry, which indicates that the request has entered the input / output framework tool for processing. Therefore, in this embodiment, the time at which each submission queue entry is generated in the submission queue can be monitored and recorded as third time data.
[0155] For each object request, the third time data when its corresponding submission queue entry is generated and the first time data when its corresponding completion queue entry is generated can be obtained, and the difference between the first time data and the third time data can then be calculated. The first time data records the time when the I / O operation corresponding to the object request is completed, while the third time data records the time when the I / O operation corresponding to the object request begins processing. The difference between the first time data and the third time data therefore gives the time elapsed from the start of processing to the completion of the I / O operation corresponding to the object request, i.e., the aforementioned processing latency data.
[0156] For example, for a certain object request, assuming its corresponding third time data is 18 minutes 29 seconds 102 milliseconds and its corresponding first time data is 18 minutes 30 seconds 514 milliseconds, then the calculated processing latency data for the object request is 1412 milliseconds.
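The relationship between the three timestamps can be captured in one record per object request. This is a sketch under assumptions: `RequestTimes` and its field names are hypothetical, timestamps are whole milliseconds, and a synchronous request is represented by leaving the second time data unset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestTimes:
    third_ms: int                     # submission queue entry generated
    first_ms: int                     # completion queue entry generated
    second_ms: Optional[int] = None   # async enqueue time; None if synchronous

    @property
    def processing_latency_ms(self) -> int:
        # Defined for every object request, synchronous or asynchronous.
        return self.first_ms - self.third_ms

    @property
    def async_latency_ms(self) -> Optional[int]:
        # Defined only for target requests.
        if self.second_ms is None:
            return None
        return self.first_ms - self.second_ms
```

With submission at 18 min 29 s 102 ms (1109102 ms) and completion at 18 min 30 s 514 ms (1110514 ms), `RequestTimes(third_ms=1109102, first_ms=1110514).processing_latency_ms` evaluates to 1412.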
[0157] In this embodiment, after obtaining the processing latency data corresponding to each object request, the overall performance monitoring data for I / O operations of the input / output framework tool can be determined based on this processing latency data. Here, this performance monitoring data is referred to as the second performance monitoring data. It is understood that the method for determining the second performance monitoring data can be implemented with reference to the aforementioned first performance monitoring data, and this application will not elaborate on this.
[0158] It is understood that, in this embodiment of the application, by monitoring the submission queue of the input / output framework tool in combination with the completion queue, the processing latency data of the input / output framework tool for I / O operations can be determined, so as to clearly and accurately determine the I / O operation processing performance of the input / output framework tool.
[0159] Specifically, in some embodiments, referring to Figure 7, determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request includes:
[0160] Step 710: Check if a summary output request for the first performance monitoring data has been received;
[0161] Step 720: If no summary output request has been received, migrate the asynchronous operation latency data determined between the previous check for a summary output request and the current point in time from the buffer to the specified storage space for storage;
[0162] Step 730: Wait for a predetermined time period, store the asynchronous operation latency data determined during the waiting process in the buffer, and return to the step of checking whether a summary output request for the first performance monitoring data has been received;
[0163] Step 740: If a summary output request is received, determine the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data stored in the storage space.
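Steps 710 to 740 can be sketched as a loop. To keep the example deterministic, the waiting period is modelled by pulling the next batch of latency values from an iterator instead of sleeping; `run_monitor`, `batches`, and `summary_requested` are hypothetical names, and the mean stands in for whatever summary is actually computed.

```python
from statistics import mean
from typing import Callable, Iterator, List


def run_monitor(batches: Iterator[List[int]],
                summary_requested: Callable[[], bool]) -> float:
    buffer: List[int] = []    # short-lived staging area
    storage: List[int] = []   # stands in for the specified storage space
    while True:
        if summary_requested():              # step 710
            return mean(storage)             # step 740
        storage.extend(buffer)               # step 720: migrate the buffer
        buffer.clear()
        # Step 730: one waiting period; latency values determined during it
        # are placed in the buffer.
        buffer.extend(next(batches, []))
```

Note that, following the steps as written, values still sitting in the buffer when the summary output request arrives are not part of that summary; in this sketch `mean` also raises `StatisticsError` if the request arrives before anything has been migrated.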
[0164] In this embodiment, when calculating the asynchronous operation latency data corresponding to each target request and determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data, newly determined asynchronous operation latency data can be placed in a buffer. Every so often, the asynchronous operation latency data in the buffer can be migrated to a designated storage space, and it is checked whether a summary output request for the first performance monitoring data has been received. If no summary output request for the first performance monitoring data has been received, the first performance monitoring data can be temporarily omitted from calculation; if a summary output request for the first performance monitoring data has been received, the first performance monitoring data can be calculated based on the asynchronous operation latency data stored in the storage space.
[0165] Specifically, in this embodiment, at a certain moment, it can be detected whether a summary output request for the first performance monitoring data has been received. Here, the summary output request can be used to trigger the task flow for calculating the first performance monitoring data. It can be input by relevant personnel or triggered automatically by the program; this application does not impose any restrictions on this. If no summary output request is received, the asynchronous operation latency data determined between the previous check for a summary output request and the current time can be migrated from the buffer to the designated storage space for storage.
[0166] In this embodiment, the buffer can be used to temporarily store asynchronous operation latency data. At regular intervals (i.e., the time interval between two consecutive checks for a summary output request), the asynchronous operation latency data in the buffer can be migrated to the designated storage space and the buffer can be cleared. In this way, the buffer can continue to cache the next batch of asynchronous operation latency data, facilitating the reuse of the buffer's storage resources. The designated storage space can be a user-space storage space, which relevant personnel can use to view the asynchronous operation latency data.
[0167] In this embodiment, after the asynchronous operation latency data in the buffer has been migrated to the designated storage space, a predetermined time period can be waited for. The length of this time period can be flexibly determined according to actual needs, and this application does not impose any restrictions on it. The asynchronous operation latency data determined during the waiting process can be stored in the buffer. When the waiting time reaches the preset length, the check for a summary output request can be executed again. This process is repeated cyclically. As long as no summary output request is received, the asynchronous operation latency data is migrated from the buffer to the designated storage space in batches.
[0168] In other cases, if a summary output request is received, the first performance monitoring data for the input / output framework tool can be determined based on the asynchronous operation latency data stored in the current storage space. The specific calculation process has been described in the foregoing embodiments and will not be repeated here.
[0169] It is understood that, in this embodiment of the application, the storage mode combining buffer and storage space can achieve continuous storage and processing of asynchronous operation latency data, which is suitable for scenarios requiring long-term performance monitoring. Furthermore, calculating the first performance monitoring data based on the summary output request allows for on-demand processing, reducing unnecessary data processing steps and making it more suitable for actual monitoring needs.
[0170] It should be noted that the above-described processing flow for asynchronous operation latency data and first performance monitoring data is also applicable to the processing of processing latency data and second performance monitoring data, and will not be elaborated upon here.
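By way of illustration only, the detection cycle described above can be sketched as follows. This is a minimal sketch, not the implementation of this application: the class name `LatencyCollector`, its methods, and the use of the mean as the summary are assumptions made for the example, and the wait for the predetermined time period between two checks is left to the caller.

```python
from statistics import mean


class LatencyCollector:
    """Hypothetical stand-in for the buffer and the designated storage space."""

    def __init__(self):
        self.buffer = []   # temporarily caches newly determined latency values
        self.storage = []  # designated storage space (e.g. in user space)

    def record(self, latency):
        """Cache one asynchronous operation latency value in the buffer."""
        self.buffer.append(latency)

    def poll(self, summary_requested):
        """Run one detection cycle.

        With a summary output request, summarize the data already held in
        the storage space (the mean is used here purely as an example).
        Without one, migrate the buffered batch and clear the buffer.
        """
        if summary_requested:
            return mean(self.storage) if self.storage else None
        self.storage.extend(self.buffer)
        self.buffer.clear()
        return None
```

In this sketch a caller would invoke `poll(False)` once per waiting period and `poll(True)` when a summary output request arrives.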
[0171] Specifically, in some embodiments, the asynchronous operation delay data determined during the waiting process is stored in a buffer, including:
[0172] Check if the size of the asynchronous operation latency data is greater than 0;
[0173] If the size of the asynchronous operation delay data is greater than 0, the asynchronous operation delay data will be stored in the buffer.
[0174] If the size of the asynchronous operation delay data is less than or equal to 0, delete the asynchronous operation delay data.
[0175] In this embodiment of the application, asynchronous operation delay data can be filtered before being stored in the buffer to remove some obviously abnormal data.
[0176] It is easy to understand that in this embodiment, for each target request, the first time data is the time node when its corresponding I / O operation is completed, and the second time data is the time node when its corresponding asynchronous operation starts execution. Normally, the first time data for each target request should be later than the second time data; that is, a completion queue entry cannot be generated before, or at the very moment when, the asynchronous operation corresponding to the target request starts execution. Therefore, the asynchronous operation latency data corresponding to each target request in this embodiment should be a value greater than 0. A value less than or equal to 0 indicates an anomaly in the statistics of the asynchronous operation latency data. For example, there may be an error in the calculation process or an error in the recorded first or second time data.
[0177] Specifically, in this embodiment, it is possible to detect whether the size of the asynchronous operation delay data is greater than 0. If so, it can be stored in the buffer. Conversely, if the size of the asynchronous operation delay data is less than or equal to 0, it indicates that it is obviously erroneous data. In this case, the data can be deleted and not stored in the buffer.
[0178] It is understood that in this embodiment of the application, filtering asynchronous operation latency data before storing it and deleting obviously erroneous asynchronous operation latency data can effectively improve the accuracy and reliability of data recording, which is beneficial to improving the performance monitoring effect and improving the utilization efficiency of storage resources. Of course, similarly, the above-mentioned filtering process for asynchronous operation latency data can also be applied to processing latency data, which will not be elaborated here.
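By way of illustration only, the filtering step can be sketched as follows. The function name `store_if_valid` and its parameters are hypothetical; the sketch assumes that the first time data and the second time data are numeric timestamps in the same unit.

```python
def store_if_valid(buffer, first_time, second_time):
    """Store the latency only if it is greater than 0.

    first_time:  timestamp at which the completion queue entry was generated
    second_time: timestamp at which the asynchronous command was queued
    Returns True if the value was stored, False if it was discarded.
    """
    latency = first_time - second_time
    if latency > 0:
        buffer.append(latency)
        return True
    # Zero or negative latency is treated as obviously erroneous data.
    return False
```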
[0179] Specifically, in some embodiments, storing asynchronous operation delay data in a buffer includes:
[0180] Create a hash table in the buffer;
[0181] Assign keys to asynchronous operation delay data, perform hash calculations on the keys, and determine the storage location in the hash table based on the hash calculation results;
[0182] The asynchronous operation delay data is used as the value and stored in the storage location.
[0183] In this embodiment of the application, a hash table can be used to store asynchronous operation latency data. A hash table is a data structure used to store associative arrays (i.e., collections of key-value pairs). It uses an algorithm called a hash function to map keys to a location in the table for fast access to records.
[0184] Specifically, in this embodiment, a hash table can be first established in the buffer. Then, for each piece of asynchronous operation delay data that needs to be stored, a key is assigned to it, a hash calculation is performed on the key, and the storage location corresponding to the asynchronous operation delay data is determined in the hash table based on the hash calculation result. Next, the asynchronous operation delay data can be used as the value and stored in that storage location. Here, different keys are assigned for different asynchronous operation delay data. By performing hash calculation on the key, the asynchronous operation delay data can be conveniently and evenly distributed to different storage locations.
[0185] It is understood that in this embodiment of the application, storing asynchronous operation latency data using a hash table can facilitate rapid data retrieval in the future, which is beneficial to improving the efficiency of data processing and thus improving the efficiency of determining performance monitoring data.
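By way of illustration only, the key-to-location mapping can be sketched with a small bucketed hash table. The class `LatencyTable`, the bucket count, and the use of a request identifier as the key are assumptions for the example; Python's built-in `hash` stands in for whichever hash function is actually used.

```python
class LatencyTable:
    """Hypothetical hash table mapping a key to one latency value."""

    def __init__(self, n_buckets=8):
        # Each bucket holds (key, latency) pairs whose keys hash to the same slot.
        self.buckets = [[] for _ in range(n_buckets)]

    def _slot(self, key):
        # The hash calculation result determines the storage location.
        return hash(key) % len(self.buckets)

    def put(self, key, latency):
        bucket = self.buckets[self._slot(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, latency)  # overwrite the earlier value
                return
        bucket.append((key, latency))

    def get(self, key):
        for existing_key, latency in self.buckets[self._slot(key)]:
            if existing_key == key:
                return latency
        return None
```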
[0186] Specifically, in some embodiments, the first performance monitoring data of the input / output framework tool is determined based on the asynchronous operation latency data corresponding to each target request, including:
[0187] Obtain predetermined statistical indicators; wherein, the statistical indicators include at least one of the mean, median, variance, standard deviation, and predetermined quantiles;
[0188] Based on the statistical indicators, the asynchronous operation latency data is processed to determine the first performance monitoring data for the input / output framework tool.
[0189] In this embodiment of the application, when determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request, the data results can be determined under different statistical indicators and then uniformly used as the first performance monitoring data output.
[0190] Specifically, in this embodiment of the application, corresponding statistical indicators can be obtained in advance. Here, the type of statistical indicator may include, but is not limited to, at least one of the mean, median, variance, standard deviation, and predetermined quantile values.
[0191] For example, in some embodiments, when the statistical metric includes the mean, the mean of the asynchronous operation latency data corresponding to each target request can be calculated and output as part of the first performance monitoring data. In some embodiments, when the statistical metric includes the median, variance, or standard deviation, the relevant data of the asynchronous operation latency data corresponding to each target request can also be calculated to obtain the corresponding first performance monitoring data output.
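By way of illustration only, the mean, median, variance, and standard deviation can be computed with Python's standard `statistics` module as sketched below. The function name `summarize` is hypothetical, and the choice of population (rather than sample) variance is an assumption, since this application does not specify which is used.

```python
import statistics


def summarize(latencies):
    """Compute several statistical indicators over the latency values."""
    return {
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "variance": statistics.pvariance(latencies),  # population variance (assumed)
        "std": statistics.pstdev(latencies),          # population standard deviation
    }
```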
[0192] Specifically, in this embodiment, the first performance monitoring data can also be determined based on a predetermined quantile value. Here, the predetermined quantile value can be between 0 and 100%, for example, it can be set to 10%, 25%, 40%, 70%, 90%, etc., and this application does not limit its size.
[0193] When determining the first performance monitoring data for the input / output framework tool based on predetermined quantile values, the relevant processing flow may include:
[0194] The first number of the asynchronous operation latency data is detected;
[0195] Determine the target sequence number based on the predetermined quantile value and the first number;
[0196] Sort the latency data of each asynchronous operation to obtain a sorted queue, and determine the queue number corresponding to each latency data of asynchronous operation in the sorted queue;
[0197] The first performance monitoring data of the input / output framework tool is determined based on the asynchronous operation latency data whose queue number is the same as the target sequence number.
[0198] In this embodiment of the application, for the implementation method of determining the first performance monitoring data by a predetermined quantile value, the number of asynchronous operation latency data corresponding to the obtained target request can be detected first, and recorded as the first number. Based on the first number and the predetermined quantile value, the target sequence number can be determined. For example, if the current first number is 200, and the corresponding predetermined quantile value is 25%, the target sequence number can be determined as 50 based on their product. Next, the asynchronous operation latency data can be sorted, specifically arranged in ascending order, to obtain a sorting queue. Then, the queue sequence number corresponding to each asynchronous operation latency data in the sorting queue can be determined, and the asynchronous operation latency data corresponding to the target sequence number is determined as the first performance monitoring data.
[0199] It should be noted that in the embodiments of this application, if the sorting is arranged in descending order to obtain a sorted queue, the method of determining the target sequence number can be adjusted. For example, the difference between 1 and the predetermined quantile value can be calculated first, and then the target sequence number can be determined based on the product of the difference and the first number. This application does not impose any restrictions on this.
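By way of illustration only, the quantile-based selection can be sketched as follows. The function name `quantile_value` is hypothetical; rounding the product up to an integer and clamping it to a valid position are assumptions, since this application only states that the target sequence number is determined from the product.

```python
import math


def quantile_value(latencies, q, descending=False):
    """Pick the latency value at the target sequence number.

    q is the predetermined quantile value as a fraction (e.g. 0.25 for 25%).
    """
    n = len(latencies)  # the "first number"
    if n == 0:
        raise ValueError("no latency data")
    # For a descending sort, use (1 - q) instead of q.
    fraction = (1 - q) if descending else q
    # 1-based target sequence number, rounded up and clamped (assumed).
    target = min(n, max(1, math.ceil(fraction * n)))
    ordered = sorted(latencies, reverse=descending)
    return ordered[target - 1]
```

With 200 values and a quantile of 25%, the ascending variant selects the 50th entry, matching the example above. Under these rounding assumptions the descending variant selects the 150th entry of the reversed queue, which may be a neighbouring element rather than exactly the same one.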
[0200] Specifically, in some embodiments, monitoring the submission queue of the input / output framework tool includes:
[0201] In response to a monitoring request from the target object, detect whether a running application exists in the current user space;
[0202] If a running application exists in user space, monitor the submission queue of the input / output framework tool.
[0203] It is readily understood that, in this embodiment of the application, when performing a monitoring task, the input / output framework tool requires a corresponding object request for processing. In other words, a running application is required in user space. Without a running application, an object request cannot be generated, and the monitoring task cannot be executed successfully.
[0204] Therefore, when actually performing a monitoring task, after the target object initiates a monitoring request, the system can detect whether there is a running application in the current user space. If there is, steps 510 to 540 mentioned above can be executed normally. If not, a prompt message can be sent to the target object, informing it that no application is running in user space and that monitoring cannot proceed normally. Here, the target object can be relevant operations and maintenance personnel, and this application does not restrict their identity.
[0205] Specifically, in some embodiments, the input / output performance monitoring method provided in this application further includes:
[0206] Detect event information of each object request at the system input / output layer;
[0207] Based on the event information, determine the input / output layer latency data of each object request.
[0208] In this embodiment, for each object request, event information at the system's input / output layer can also be obtained. Based on this event information, the input / output layer latency data of the object request can be determined, that is, the time taken for the system to actually execute the I / O operation of the object request after the input / output framework tool sends the request to the system.
[0209] In this embodiment of the application, the obtained input / output layer latency data, asynchronous operation latency data, first performance monitoring data, processing latency data, and second performance monitoring data can be used to determine the input / output performance of the system and facilitate the location of performance bottlenecks. The following is a detailed description with specific examples.
[0210] Exemplarily, in some embodiments, referring to Figure 8, the input / output performance monitoring method provided in this application embodiment further includes:
[0211] Step 810: Calculate the average processing latency data corresponding to each object request to obtain the first average latency data; calculate the average asynchronous operation latency data corresponding to each target request to obtain the second average latency data; and calculate the average input / output layer latency data corresponding to each object request to obtain the third average latency data.
[0212] Step 820: Determine the first delay data based on the difference between the first average delay data and the third average delay data;
[0213] Step 830: Compare the second mean delay data with the first delay data;
[0214] Step 840: If the ratio of the second mean delay data to the first delay data is greater than the first preset threshold, it is determined that there is an anomaly in the reading of the asynchronous output of the input / output framework tool.
[0215] In this embodiment of the application, for each object request, the average value of the processing latency data corresponding to them can be calculated, and the obtained data can be recorded as the first average latency data. The average value of the asynchronous operation latency data corresponding to the target request can be calculated, and the obtained data can be recorded as the second average latency data. The average value of the input and output layer latency data corresponding to each object request can be calculated, and the obtained data can be recorded as the third average latency data.
[0216] It is understandable that the third average latency data represents the average execution time of each object request in the system's input / output layer, while the first average latency data represents the overall average processing time of each object request. Calculating the difference between the first and third average latency data yields the first latency data, which is the actual average processing time of each object request within the input / output framework tool.
[0217] In this embodiment, the first latency data and the second average latency data can be compared. If they are relatively close, it indicates that the asynchronous output of the input / output framework tool is normal. Conversely, if the second average latency data is much larger than the first latency data, for example, if the ratio of the two is greater than a preset threshold, it indicates that the asynchronous output reading of the input / output framework tool is abnormal. Here, the preset threshold can be denoted as the first preset threshold, and its value can be set according to actual needs.
[0218] For example, in some scenarios, the first average latency data obtained by the method in this application embodiment is 1353ms, the second average latency data is 2151ms, and the third average latency data is 423ms. Taking the difference between the first and third average latency data, the first latency data is determined to be 930ms, indicating that the average processing time for all object requests in the input / output framework tool is 930ms. The second average latency data, 2151ms, is significantly larger than the first latency data (a ratio of about 2.3), indicating that the processing time of object requests handled as asynchronous operations is abnormal. This situation mainly arises because the application does not read the relevant data in time after the asynchronous operation is completed, causing tasks to accumulate in the input / output framework tool and affecting the overall latency of asynchronous I / O operations.
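By way of illustration only, steps 810 to 840 can be sketched as follows, taking the three averages as inputs. The function name `locate_async_anomaly` is hypothetical, and the threshold value used in the usage note is an assumption; this application leaves the first preset threshold to actual needs.

```python
def locate_async_anomaly(first_avg, second_avg, third_avg, ratio_threshold):
    """Return (first latency data, whether the asynchronous path is abnormal).

    first_avg:  average processing latency of all object requests
    second_avg: average asynchronous operation latency of target requests
    third_avg:  average input/output layer latency of all object requests
    """
    # Step 820: time spent inside the input/output framework tool.
    first_latency = first_avg - third_avg
    if first_latency <= 0:
        raise ValueError("first average must exceed third average")
    # Steps 830 and 840: compare by ratio against the preset threshold.
    ratio = second_avg / first_latency
    return first_latency, ratio > ratio_threshold
```

With the figures from the example above and an assumed threshold of 2.0, `locate_async_anomaly(1353, 2151, 423, 2.0)` yields a first latency of 930 and flags an anomaly, since 2151 / 930 is roughly 2.3.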
[0219] Exemplarily, in some embodiments, the input / output performance monitoring method provided in this application further includes:
[0220] Compare the first average latency data with the preset latency threshold;
[0221] If the first mean latency data is greater than the latency threshold, it is determined that there is an anomaly in the performance of the input / output framework tool.
[0222] In this embodiment of the application, as mentioned above, the first average latency data represents the average processing time of all object requests. After obtaining the first average latency data, it can be compared with a preset latency threshold. If the first average latency data is greater than the latency threshold, it indicates that the input / output framework tool has poor processing performance for all types of object requests. In this case, it can be determined that the performance of the input / output framework tool is abnormal.
[0223] The following describes and explains a method for monitoring input and output performance provided in this application embodiment, with reference to specific application examples.
[0224] The method described in this application embodiment can be encapsulated into a monitoring program. Generally, those who need to monitor the input / output performance of a system are professional developers. Therefore, the monitoring program in this application embodiment can use text output and command-line interaction, without a graphical user interface (GUI). In practical applications, the monitoring program can record various time data, such as the first time data and the second time data, as well as various latency data. If relevant performance monitoring data needs to be output, a command-line tool can be used to trigger a summary output request for the performance monitoring data.
[0225] For example, in this embodiment of the application, when performance monitoring data needs to be output, the device can output statistical information in the following form:
[0226] CTX: 0xffff888851b76000, comm: fio-124552
[0227] Total complete time (us):
[0228] Median=1605492 Mode=1287062 Std=447189 p50=1660708 p90=2354575
[0229] async complete time:
[0230] Median=2076264 Mode=1577505 Std=261205 p50=2078629 p90=2379722
[0231] Blk latency (us):
[0232] Median=426273 Mode=425723 Std=72856 p50=413872 p90=428729
[0233] Here, CTX represents the context of the input / output framework tool and the process running it. "Total complete time" is the second performance monitoring data determined based on processing latency data, "async complete time" is the first performance monitoring data determined based on asynchronous operation latency data, and "Blk latency" is the input / output layer latency data. The statistics above show specific data for the median, mode, standard deviation, 50th percentile (p50), and 90th percentile (p90).
[0234] Specifically, please refer to Figure 9. Figure 9 shows a schematic diagram illustrating an application example of a monitoring method provided in this application embodiment.
[0235] In this embodiment of the application, the monitoring program corresponding to the monitoring method can be used to monitor input / output framework tools. Taking the processing flow of the input / output framework tool shown in Figure 6 as an example, when performing a monitoring task, a monitoring program can be started, and relevant triggering events can be pre-registered within it. For example, in this embodiment of the application, monitoring of specific events in the input / output framework tool can be implemented based on BPF technology. As shown in Figure 9, when the input / output framework tool converts an object request into a submission queue entry, the monitoring program can record the third time data corresponding to the event; when the input / output framework tool converts an object request into an asynchronous command and adds it to the asynchronous work queue, the monitoring program can record the second time data corresponding to the event; when the input / output framework tool completes an I / O operation and generates the corresponding completion queue entry, the monitoring program can record the first time data corresponding to the event.
[0236] In this embodiment of the application, the monitoring program may specifically include two modules: a kernel acquisition module and an output module. Referring to Figure 10, Figure 10 shows a schematic diagram of the working principle of a kernel acquisition module provided in an embodiment of this application. In this embodiment, the kernel acquisition module can run in the kernel and includes a series of BPF programs for monitoring input / output framework tools. As shown in Figure 10, these BPF programs can be used to monitor event information from the submission queue, asynchronous work queue, completion queue, and system input / output layer within the input / output framework tool. The corresponding time data for these events can be recorded in relevant hash tables. Based on the time data from the submission and completion queues, processing latency can be calculated; similarly, asynchronous operation latency can be calculated based on the time data from the asynchronous work queue and completion queue. This data can be stored in a pre-defined buffer.
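By way of illustration only, the bookkeeping performed by the kernel acquisition module can be simulated in plain Python as sketched below. This is not BPF code; the class `LatencyTracker`, its event handlers, and the use of a request identifier as the hash table key are assumptions made for the example.

```python
class LatencyTracker:
    """Hypothetical user-space simulation of the timestamp bookkeeping."""

    def __init__(self):
        self.submit_ts = {}      # request id -> third time data
        self.async_ts = {}       # request id -> second time data
        self.processing = []     # processing latency data
        self.async_latency = []  # asynchronous operation latency data

    def on_submit(self, req_id, ts):
        """A submission queue entry was generated for the request."""
        self.submit_ts[req_id] = ts

    def on_async_queued(self, req_id, ts):
        """The request was added to the asynchronous work queue."""
        self.async_ts[req_id] = ts

    def on_complete(self, req_id, ts):
        """A completion queue entry was generated (first time data = ts)."""
        submitted = self.submit_ts.pop(req_id, None)
        if submitted is not None:
            self.processing.append(ts - submitted)
        queued = self.async_ts.pop(req_id, None)
        if queued is not None:
            # Only target requests (handled asynchronously) reach this branch.
            self.async_latency.append(ts - queued)
```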
[0237] Referring to Figure 11, Figure 11 shows a schematic diagram of the working principle of an output module provided in an embodiment of this application. In this embodiment, the output module can be used to transfer and output data collected by the kernel acquisition module. Specifically, the output module can transfer various data in the buffer to a predetermined storage space (such as user space) at fixed intervals. Furthermore, the output module can detect whether there is a summary output request; if so, it statistically analyzes the asynchronous operation latency data and processing latency data in the storage space to obtain performance monitoring data, and outputs the performance monitoring data.
[0238] It should be noted that, in Figure 11, when the data in the buffer is traversed and transferred, only the asynchronous operation latency data and the processing latency data may be output. The input / output layer latency data can be queried as needed and is not output when traversing the buffer. This application does not impose any restrictions on this.
[0239] It is understood that the method in this application embodiment, based on event monitoring of the input / output framework tool, can effectively determine the working status of the input / output framework tool. By recording the time data of relevant nodes, the granularity and accuracy of monitoring can be improved, which is helpful in locating performance problems of the system kernel and improving the system's operating performance.
[0240] Referring to Figure 12, in this embodiment of the application, an input / output performance monitoring system is also provided, which includes:
[0241] The recording unit 1210 is used to monitor the completion queue of the input / output framework tool and record the first time data when the corresponding completion queue entry is generated in the completion queue after the system input / output layer has completed processing each object request.
[0242] The query unit 1220 is used to query the input and output operation flow corresponding to each object request. If the object request is converted into an asynchronous command in the input and output framework tool, the second time data of the asynchronous command corresponding to the object request being added to the asynchronous work queue is recorded.
[0243] The calculation unit 1230 is used to determine the asynchronous operation delay data corresponding to the target request based on the difference between the first time data and the second time data corresponding to the target request; wherein, the target request is an object request that is converted into an asynchronous command in the input / output framework tool;
[0244] The processing unit 1240 determines the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request.
[0245] Optionally, in some embodiments, the system further includes a second processing unit, which is specifically used for:
[0246] Monitor the submission queue of the input / output framework tool and record third time data when each object request generates a corresponding submission queue entry in the submission queue;
[0247] The processing latency data corresponding to the object request is determined based on the difference between the first time data and the third time data corresponding to the object request.
[0248] Based on the processing latency data corresponding to each object request, the second performance monitoring data of the input / output framework tool is determined.
[0249] Optionally, in some embodiments, the processing unit is specifically used for:
[0250] Check if a summary output request for the first performance monitoring data has been received;
[0251] If no summary output request is received at present, the asynchronous operation delay data between the time node of the last detection of the summary output request and the current time node will be migrated from the buffer to the specified storage space for storage.
[0252] Wait for a predetermined period of time, store the asynchronous operation delay data determined during the waiting process in a buffer, and return to the step of checking whether a summary output request for the first performance monitoring data has been received.
[0253] If a summary output request is received, the first performance monitoring data for the input / output framework tool is determined based on the asynchronous operation latency data stored in the storage space.
[0254] Optionally, in some embodiments, the processing unit is specifically used for:
[0255] Check if the size of the asynchronous operation latency data is greater than 0;
[0256] If the size of the asynchronous operation delay data is greater than 0, the asynchronous operation delay data will be stored in the buffer.
[0257] If the size of the asynchronous operation delay data is less than or equal to 0, delete the asynchronous operation delay data.
[0258] Optionally, in some embodiments, the processing unit is specifically used for:
[0259] Create a hash table in the buffer;
[0260] Assign keys to asynchronous operation delay data, perform hash calculations on the keys, and determine the storage location in the hash table based on the hash calculation results;
[0261] The asynchronous operation delay data is used as the value and stored in the storage location.
[0262] Optionally, in some embodiments, the processing unit is specifically used for:
[0263] Obtain predetermined statistical indicators; wherein, the statistical indicators include at least one of the mean, median, variance, standard deviation, and predetermined quantiles;
[0264] Based on the statistical indicators, the asynchronous operation latency data is processed to determine the first performance monitoring data for the input / output framework tool.
[0265] Optionally, in some embodiments, the statistical indicator includes a predetermined quantile value; the processing unit is specifically used for:
[0266] The first number of the asynchronous operation latency data is detected;
[0267] Determine the target sequence number based on the predetermined quantile value and the first number;
[0268] Sort the latency data of each asynchronous operation to obtain a sorted queue, and determine the queue number corresponding to each latency data of asynchronous operation in the sorted queue;
[0269] The first performance monitoring data of the input / output framework tool is determined based on the asynchronous operation latency data whose queue number is the same as the target sequence number.
[0270] Optionally, in some embodiments, the system further includes a detection unit, which is specifically used for:
[0271] Detect event information of each object request at the system input / output layer;
[0272] Based on the event information, determine the input / output layer latency data of each object request.
[0273] Optionally, in some embodiments, the system further includes an anomaly localization unit, which is specifically used for:
[0274] Calculate the average processing latency data corresponding to each object request to obtain the first average latency data; calculate the average asynchronous operation latency data corresponding to each target request to obtain the second average latency data; and calculate the average input / output layer latency data corresponding to each object request to obtain the third average latency data.
[0275] The first delay data is determined based on the difference between the first average delay data and the third average delay data;
[0276] Compare the second mean latency data with the first latency data;
[0277] If the ratio of the second mean delay data to the first delay data is greater than the first preset threshold, it is determined that there is an anomaly in the reading of the asynchronous output of the input / output framework tool.
[0278] Optionally, in some embodiments, the anomaly location unit is further configured to:
[0279] Compare the first average latency data with the preset latency threshold;
[0280] If the first mean latency data is greater than the latency threshold, it is determined that there is an anomaly in the performance of the input / output framework tool.
[0281] Optionally, in some embodiments, the recording unit is specifically used for:
[0282] In response to a monitoring request from the target object, detect whether a running application exists in the current user space;
[0283] If a running application exists in user space, monitor the submission queue of the input / output framework tool.
[0284] It is understandable that the content of the input / output performance monitoring method embodiment shown in Figure 5 is applicable to the input / output performance monitoring system embodiment. The specific functions implemented by the system embodiment are the same as those of the method embodiment shown in Figure 5, and the beneficial effects achieved are also the same as those achieved by the method embodiment shown in Figure 5.
[0285] This application also discloses an electronic device, including:
[0286] At least one processor;
[0287] At least one memory for storing at least one program;
[0288] When the at least one program is executed by the at least one processor, the at least one processor implements the input / output performance monitoring method shown in Figure 5.
[0289] The electronic device in the embodiments of this application may be a terminal device, a computer device, or a server device.
[0290] For example, refer to Figure 13. Figure 13 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. Taking a terminal device as an example, in Figure 13, the electronic device 1300 may include an RF (Radio Frequency) circuit 1310, a memory 1320 including one or more computer-readable storage media, an input unit 1330, a display unit 1340, a sensor 1350, an audio circuit 1360, a short-range wireless transmission module 1370, a processor 1380 including one or more processing cores, and a power supply 1390, among other components. Those skilled in the art will understand that the device structure shown in Figure 13 does not constitute a limitation on the terminal device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.
[0291] RF circuit 1310 can be used for receiving and transmitting signals during information transmission or calls. Specifically, it receives downlink information from the base station and hands it over to one or more processors 1380 for processing; additionally, it transmits uplink data to the base station. Typically, RF circuit 1310 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a SIM card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, etc. Furthermore, RF circuit 1310 can also communicate wirelessly with networks and other devices. Wireless communication can use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), email, SMS (Short Messaging Service), etc.
[0292] Memory 1320 can be used to store software programs and modules (or units). Processor 1380 executes various functional applications and data processing by running the software programs and modules (or units) stored in memory 1320. Memory 1320 may primarily include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required for at least one function (such as a sound playback function, an image playback function, etc.); the data storage area may store data created based on the use of electronic device 1300 (such as audio data, a telephone directory, etc.). Furthermore, memory 1320 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, memory 1320 may also include a memory controller to provide access to memory 1320 for processor 1380 and input unit 1330. Although Figure 13 shows the RF circuit 1310, it is understood that it is not a necessary component of the electronic device 1300 and can be omitted as needed without changing the nature of the invention.
[0293] The input unit 1330 can be used to receive input digital or character information, and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to object settings and function control. Specifically, the input unit 1330 may include a touch-sensitive surface 1331 and other input devices 1332. The touch-sensitive surface 1331, also known as a touch display screen or touchpad, can collect touch operations on or near the object (such as operations performed by the object using a finger, stylus, or any suitable object or accessory on or near the touch-sensitive surface 1331), and drive the corresponding connection device according to a pre-set program. Optionally, the touch-sensitive surface 1331 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the object and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends it to the processor 1380, and can receive and execute instructions from the processor 1380. In addition, the touch-sensitive surface 1331 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 1331, the input unit 1330 may also include other input devices 1332. Specifically, other input devices 1332 may include, but are not limited to, one or more of the following: a physical keyboard, function keys (such as volume control buttons, power buttons, etc.), a trackball, a mouse, and a joystick.
[0294] Display unit 1340 can be used to display information input by an object or information provided to an object, as well as various graphical object interfaces for controlling electronic device 1300. These graphical object interfaces can be composed of graphics, text, icons, video, and any combination thereof. Display unit 1340 may include display panel 1341, optionally configured as an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, etc. Further, touch-sensitive surface 1331 may cover display panel 1341. When touch-sensitive surface 1331 detects a touch operation on or near it, it transmits the information to processor 1380 to determine the type of touch event. Subsequently, processor 1380 provides corresponding visual output on display panel 1341 according to the type of touch event. Although in Figure 13 the touch-sensitive surface 1331 and the display panel 1341 are implemented as two separate components to realize input and output functions, in some embodiments the touch-sensitive surface 1331 and the display panel 1341 can be integrated to realize input and output functions.
[0295] The electronic device 1300 may also include at least one sensor 1350, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1341 according to the ambient light level, and the proximity sensor can turn off the display panel 1341 or the backlight when the electronic device 1300 is moved to the ear. As a type of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in various directions (generally three axes). When stationary, it can detect the magnitude and direction of gravity and can be used for applications that recognize the phone's posture (such as landscape / portrait switching, related games, magnetometer posture calibration), vibration recognition-related functions (such as pedometers, taps), etc. Other sensors that the electronic device 1300 may be equipped with, such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, will not be described in detail here.
[0296] Audio circuitry 1360, speaker 1361, and microphone 1362 provide an audio interface between the device and electronic device 1300. Audio circuitry 1360 converts received audio data into electrical signals and transmits them to speaker 1361, where speaker 1361 converts them into sound signals for output. Conversely, microphone 1362 converts collected sound signals into electrical signals, which are then received by audio circuitry 1360, converted back into audio data, processed by processor 1380, and transmitted via RF circuitry 1310 to another electronic device, or output to memory 1320 for further processing. Audio circuitry 1360 may also include an earphone jack to facilitate communication between external headphones and electronic device 1300.
[0297] The short-range wireless transmission module 1370 can be a WIFI (wireless fidelity) module, Bluetooth module, or infrared module, etc. The electronic device 1300 can transmit information with wireless transmission modules on other devices via the short-range wireless transmission module 1370.
[0298] Processor 1380 is the control center of electronic device 1300. It connects various parts of the device via various interfaces and lines, and performs various functions and processes data of electronic device 1300 by running or executing software programs or modules stored in memory 1320 and calling data stored in memory 1320, thereby providing overall control of the device. Optionally, processor 1380 may include one or more processing cores; optionally, processor 1380 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the aforementioned modem processor may also not be integrated into processor 1380.
[0299] Electronic device 1300 also includes a power supply 1390 (such as a battery) for supplying power to various components. Optionally, the power supply 1390 can be logically connected to the processor 1380 through a power management system, thereby enabling functions such as managing charging, discharging, and power consumption through the power management system. The power supply 1390 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.
[0300] Although not shown, the electronic device 1300 may also include a camera, Bluetooth module, etc., which will not be described in detail here.
[0301] This application also discloses a computer-readable storage medium storing a processor-executable program which, when executed by a processor, is used to implement the method for monitoring input / output performance shown in Figure 5.
[0302] It is understandable that the content of the input / output performance monitoring method embodiment shown in Figure 5 is applicable to this computer-readable storage medium embodiment. The specific functions implemented by this computer-readable storage medium embodiment are the same as those of the method embodiment shown in Figure 5, and the beneficial effects achieved are also the same as those achieved by that method embodiment.
[0303] This application also discloses a computer program product or computer program, which includes computer instructions stored in the aforementioned computer-readable storage medium. The processor of the electronic device shown in Figure 13 can read the computer instructions from the aforementioned computer-readable storage medium, and the processor executes the computer instructions, causing the computer device to perform the method for monitoring input / output performance shown in Figure 5.
[0304] It is understandable that the content of the input / output performance monitoring method embodiment shown in Figure 5 is applicable to this computer program product or computer program embodiment. The specific functions implemented by this computer program product or computer program embodiment are the same as those of the method embodiment shown in Figure 5, and the beneficial effects achieved are also the same as those achieved by that method embodiment.
[0305] In some alternative embodiments, the functions / operations mentioned in the block diagrams may not occur in the order shown in the operation diagrams. For example, depending on the functions / operations involved, two consecutively shown blocks may actually be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order. Furthermore, the embodiments presented and described in the flowcharts of this application are provided by way of example to provide a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and sub-operations described as part of a larger operation are executed independently.
[0306] Furthermore, although this application is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the functions and / or features may be integrated into a single physical device and / or software module, or one or more functions and / or features may be implemented in a separate physical device or software module. It is also understood that a detailed discussion of the actual implementation of each module is unnecessary for understanding this application. Rather, given the properties, functions, and internal relationships of the various functional modules in the apparatus disclosed herein, the actual implementation of the module will be understood within the scope of conventional technology for an engineer. Therefore, those skilled in the art can implement the application set forth in the claims using ordinary techniques without excessive experimentation. It is also understood that the specific concepts disclosed are merely illustrative and not intended to limit the scope of this application, which is determined by the full scope of the appended claims and their equivalents.
[0307] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0308] In this application embodiment, the terms "module" or "unit" refer to a computer program or part of a computer program that has a predetermined function and works with other related parts to achieve a predetermined goal, and can be implemented wholly or partially using software, hardware (such as processing circuitry or memory), or a combination thereof. Similarly, a processor (or multiple processors or memory) can be used to implement one or more modules or units. Furthermore, each module or unit can be part of an overall module or unit that includes the functionality of that module or unit.
[0309] The logic and / or steps represented in the flowchart or otherwise described herein can, for example, be considered a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable storage medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable storage medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device.
[0310] It should be understood that various parts of this application can be implemented using hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented using software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
[0311] In the foregoing description of this specification, the references to terms such as "one embodiment," "another embodiment," or "some embodiments," etc., indicate that a specific feature, structure, material, or characteristic described in connection with an embodiment or example is included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0312] Although embodiments of this application have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of this application, the scope of which is defined by the claims and their equivalents.
[0313] The above is a detailed description of the preferred embodiments of this application, but this application is not limited to the embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of this application, and these equivalent modifications or substitutions are all included within the scope defined by the claims of this application.
Claims
1. A method for monitoring input / output performance, characterized in that the method includes:
monitoring a completion queue of an input / output framework tool, and recording first time data at which a corresponding completion queue entry is generated in the completion queue after a system input / output layer has finished processing each object request;
querying the input / output operation flow corresponding to each object request and, if an object request is converted into an asynchronous command in the input / output framework tool, recording second time data at which the asynchronous command corresponding to that object request is added to an asynchronous work queue;
determining asynchronous operation latency data corresponding to a target request based on the difference between the first time data and the second time data corresponding to the target request, wherein the target request is an object request that is converted into an asynchronous command in the input / output framework tool; and
determining first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request.
2. The method for monitoring input / output performance according to claim 1, characterized in that the method further includes:
monitoring a submission queue of the input / output framework tool, and recording third time data at which each object request generates a corresponding submission queue entry in the submission queue;
determining processing latency data corresponding to each object request based on the difference between the first time data and the third time data corresponding to that object request; and
determining second performance monitoring data of the input / output framework tool based on the processing latency data corresponding to each object request.
3. The method for monitoring input / output performance according to claim 1, characterized in that the step of determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request includes:
detecting whether a summary output request for the first performance monitoring data has been received;
if the summary output request has not been received, migrating the asynchronous operation latency data determined between the time node at which the summary output request was last detected and the current time node from a buffer to a specified storage space for storage;
waiting for a predetermined period of time, storing the asynchronous operation latency data determined during the waiting process into the buffer, and returning to the step of detecting whether a summary output request for the first performance monitoring data has been received; and
if the summary output request is received, determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data stored in the storage space.
4. The method for monitoring input / output performance according to claim 3, characterized in that the step of storing the asynchronous operation latency data determined during the waiting process into the buffer includes:
detecting whether the size of the asynchronous operation latency data is greater than 0;
if the size of the asynchronous operation latency data is greater than 0, storing the asynchronous operation latency data in the buffer; and
if the size of the asynchronous operation latency data is less than or equal to 0, deleting the asynchronous operation latency data.
5. The method for monitoring input / output performance according to claim 4, characterized in that the step of storing the asynchronous operation latency data in the buffer includes:
creating a hash table in the buffer;
assigning a key to the asynchronous operation latency data, performing a hash calculation on the key, and determining a storage location in the hash table based on the hash calculation result; and
storing the asynchronous operation latency data, as the value, at the storage location.
6. The method for monitoring input / output performance according to claim 1, characterized in that the step of determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request includes:
obtaining predetermined statistical indicators, wherein the statistical indicators include at least one of a mean, a median, a variance, a standard deviation, and a predetermined quantile value; and
processing the asynchronous operation latency data according to the statistical indicators to determine the first performance monitoring data of the input / output framework tool.
7. The method for monitoring input / output performance according to claim 6, characterized in that the statistical indicators include a predetermined quantile value, and the step of processing the asynchronous operation latency data according to the statistical indicators to determine the first performance monitoring data of the input / output framework tool includes:
detecting a first number of items of the asynchronous operation latency data;
determining a target sequence number based on the predetermined quantile value and the first number;
sorting the asynchronous operation latency data to obtain a sorting queue, and determining the queue number corresponding to each item of asynchronous operation latency data in the sorting queue; and
determining the first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data whose queue number is the same as the target sequence number.
8. The method for monitoring input / output performance according to claim 2, characterized in that the method further includes:
detecting event information of each object request at the system input / output layer; and
determining input / output layer latency data of each object request based on the event information.
9. The method for monitoring input / output performance according to claim 8, characterized in that the method further includes:
calculating the average of the processing latency data corresponding to the object requests to obtain first average latency data, calculating the average of the asynchronous operation latency data corresponding to the target requests to obtain second average latency data, and calculating the average of the input / output layer latency data corresponding to the object requests to obtain third average latency data;
determining first latency data based on the difference between the first average latency data and the third average latency data;
comparing the second average latency data with the first latency data; and
if the ratio of the second average latency data to the first latency data is greater than a first preset threshold, determining that there is an anomaly in the reading of the asynchronous output of the input / output framework tool.
10. The method for monitoring input / output performance according to claim 9, characterized in that the method further includes:
comparing the first average latency data with a preset latency threshold; and
if the first average latency data is greater than the latency threshold, determining that the performance of the input / output framework tool is abnormal.
11. The method for monitoring input / output performance according to any one of claims 1-10, characterized in that the monitoring of the submission queue of the input / output framework tool includes:
in response to a monitoring request from a target object, detecting whether a running application exists in the current user space; and
if a running application exists in the user space, monitoring the submission queue of the input / output framework tool.
12. A monitoring system for input / output performance, characterized in that the system includes:
a recording unit, used to monitor a completion queue of an input / output framework tool and to record first time data at which a corresponding completion queue entry is generated in the completion queue after a system input / output layer has finished processing each object request;
a query unit, used to query the input / output operation flow corresponding to each object request and, if an object request is converted into an asynchronous command in the input / output framework tool, to record second time data at which the asynchronous command corresponding to that object request is added to an asynchronous work queue;
a calculation unit, used to determine asynchronous operation latency data corresponding to a target request based on the difference between the first time data and the second time data corresponding to the target request, wherein the target request is an object request that is converted into an asynchronous command in the input / output framework tool; and
a processing unit, used to determine first performance monitoring data of the input / output framework tool based on the asynchronous operation latency data corresponding to each target request.
13. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that when the processor executes the computer program, it implements the input / output performance monitoring method according to any one of claims 1 to 11.
14. A computer-readable storage medium storing a computer program, characterized in that when the computer program is executed by a processor, it implements the input / output performance monitoring method according to any one of claims 1 to 11.
15. A computer program product comprising a computer program, characterized in that when the computer program is executed by a processor, it implements the input / output performance monitoring method according to any one of claims 1 to 11.
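By way of illustration only, the latency arithmetic recited in claims 1, 2, 7 and 9 can be sketched in user-space Python as follows. This is a minimal sketch under stated assumptions, not the implementation of this application: the record type `RequestTimes`, its field names, the nanosecond unit, the rounding rule used to derive the target sequence number, and the threshold values are all hypothetical, and the timestamps are assumed to have already been collected by whatever mechanism monitors the completion queue, submission queue, and asynchronous work queue.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestTimes:
    """Timestamps for one object request, in nanoseconds (hypothetical layout)."""
    submit_ns: int        # third time data: submission queue entry generated
    complete_ns: int      # first time data: completion queue entry generated
    io_layer_ns: int      # latency attributed to the system input/output layer
    async_enqueue_ns: Optional[int] = None  # second time data; None if the
                                            # request never became an async command


def processing_latency(r: RequestTimes) -> int:
    """Claim 2: first time data minus third time data."""
    return r.complete_ns - r.submit_ns


def async_latency(r: RequestTimes) -> Optional[int]:
    """Claim 1: first time data minus second time data, for target requests only."""
    if r.async_enqueue_ns is None:
        return None
    return r.complete_ns - r.async_enqueue_ns


def quantile(values: List[int], q: float) -> int:
    """Claim 7: sort, then pick the entry whose queue number equals the target
    sequence number. The ceil-based, 1-based rule is an assumption."""
    if not values:
        raise ValueError("no latency data")
    ordered = sorted(values)
    target = max(1, math.ceil(q * len(ordered)))
    return ordered[target - 1]


def mean(values: List[int]) -> float:
    return sum(values) / len(values)


def async_path_anomalous(requests: List[RequestTimes], ratio_threshold: float) -> bool:
    """Claim 9: compare the mean async latency with (mean processing latency
    minus mean I/O-layer latency) and test the ratio against a threshold."""
    # Claim 4: only latency values greater than 0 are kept.
    async_values = [v for v in map(async_latency, requests) if v is not None and v > 0]
    if not requests or not async_values:
        return False
    first_avg = mean([processing_latency(r) for r in requests])
    second_avg = mean(async_values)
    third_avg = mean([r.io_layer_ns for r in requests])
    first_latency = first_avg - third_avg
    if first_latency <= 0:  # guard added for the sketch; not recited in the claims
        return False
    return second_avg / first_latency > ratio_threshold


# Made-up sample data for illustration.
SAMPLE = [
    RequestTimes(submit_ns=0, complete_ns=1000, io_layer_ns=300, async_enqueue_ns=200),
    RequestTimes(submit_ns=100, complete_ns=700, io_layer_ns=400),
    RequestTimes(submit_ns=200, complete_ns=1600, io_layer_ns=500, async_enqueue_ns=1000),
]
```

With the sample data, the mean processing latency is 1000 ns, the mean input / output layer latency is 400 ns, and the mean asynchronous operation latency is 700 ns, so the ratio tested against the threshold is 700 / 600. A call such as `async_path_anomalous(SAMPLE, 1.0)` therefore reports an anomaly, while a threshold of 1.5 does not.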