Automatic capture of user interface screenshots for software product documentation
By creating a completion graph of the user interface window and analyzing it with a graph convolutional network, the system automatically captures screenshots of the user interface. This addresses the problem of excessive screenshots in existing technologies, improves the efficiency and accuracy of screenshot capture, and enables efficient generation of software product documentation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INTERNATIONAL BUSINESS MACHINES CORPORATION
- Filing Date
- 2022-07-28
- Publication Date
- 2026-06-30
AI Technical Summary
Existing automatic user interface screenshot capture technology tends to produce too many screenshots, requiring the technical author to sort them manually, which is time-consuming and inefficient.
By identifying the user interface windows of the software product, a completion graph is created, the completion percentage of each screenshot is calculated, and based on this, a subset of screenshots to be included in the software product documentation is identified. A graph convolutional network is then used to analyze the completion graph to automatically capture user interface screenshots.
It improves the efficiency and accuracy of screenshot capture, reduces the number of unnecessary screenshots, and increases the efficiency of automated generation of software product documentation.
Figure CN115705235B_ABST
Description
Technical Field
[0001] This invention generally relates to capturing user interface screenshots, and more specifically, to automatically capturing user interface screenshots for use in software product documentation.
Background Technology
[0002] Software product documentation, such as user guides, typically includes screenshots of the software product's user interface (UI). Including UI screenshots allows users to better follow the instructions step-by-step and verify their actions by comparing the user experience with the UI screenshots. Furthermore, including UI screenshots in product documentation allows users to understand the software product through UI captures without needing to install or access the product. Therefore, technical authors preparing software documentation recognize that UI captures are an important element of easy-to-use product documentation and thus include exemplary UI captures.
[0003] Manually capturing UI screenshots for product documentation (such as user guides) is typically time-consuming. Software is available that allows users to capture screenshots periodically, such as every two or five seconds, or automatically when UI events occur (e.g., a new active window appears or a UI control is activated). However, this type of automated capture software often results in capturing too many screenshots, requiring the technical author to sort the captured screenshots to determine the ones needed.
Summary of the Invention
[0004] Embodiments of the present invention relate to a computer-implemented method for automatically capturing user interface (UI) screenshots for use in software product documentation. Non-limiting examples of the computer-implemented method include identifying a user interface window of the software product and creating a completion graph of the user interface window. The method further includes capturing multiple screenshots of the user interface window during use of the software product and calculating a completion percentage for each of the multiple screenshots. The method further includes identifying a subset of the multiple screenshots to be included in the software product documentation based on the completion percentage.
[0005] Embodiments of the present invention relate to a system for automatically capturing user interface (UI) screenshots for use in software product documentation. A non-limiting example of the system includes a processor coupled to memory, operable to identify user interface windows of the software product and create a completion graph of the user interface windows. The processor is also operable to capture multiple screenshots of the user interface windows during use of the software product and calculate a completion percentage for each of the multiple screenshots. The processor is further operable to identify a subset of the multiple screenshots to be included in the software product documentation based on the completion percentage.
[0006] Embodiments of the present invention relate to a computer program product for automatically capturing user interface (UI) screenshots for use in software product documentation. The computer program product includes a computer-readable storage medium containing program instructions. The program instructions are executable by a processor to cause the processor to perform a method. Non-limiting examples of the method include identifying a user interface window of the software product and creating a completion graph of the user interface window. The method also includes capturing multiple screenshots of the user interface window during use of the software product and calculating a completion percentage for each of the multiple screenshots. The method further includes identifying a subset of the multiple screenshots to be included in the software product documentation based on the completion percentage.
[0007] Additional technical features and benefits are achieved through the technology of this invention. Embodiments and aspects of the invention are described in detail herein and are considered part of the claimed subject matter. For a better understanding, refer to the Detailed Description section and the accompanying drawings.
Brief Description of the Drawings
[0008] The details of the proprietary rights described herein are specifically pointed out and explicitly claimed in the claims at the end of the specification. The foregoing and other features and advantages of embodiments of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings, wherein:
[0009] Figure 1 depicts a cloud computing environment according to one or more embodiments of the present invention;
[0010] Figure 2 depicts abstraction model layers according to one or more embodiments of the present invention;
[0011] Figure 3 depicts a block diagram of a computer system for implementing one or more embodiments of the present invention;
[0012] Figure 4 depicts a system for automatically capturing user interface (UI) screenshots for use in software product documentation according to an embodiment of the present invention;
[0013] Figure 5A depicts a completion graph according to one or more embodiments of the present invention;
[0014] Figure 5B depicts a compressed completion graph according to one or more embodiments of the present invention;
[0015] Figure 6 depicts a flowchart of a method for automatically capturing user interface (UI) screenshots for use in software product documentation according to one or more embodiments of the present invention; and
[0016] Figure 7 depicts a flowchart of a method for updating software product documentation by automatically capturing user interface (UI) screenshots according to one or more embodiments of the present invention.
[0017] The accompanying drawings are illustrative. Many variations of the drawings or the operations described therein may be made without departing from the spirit of the invention. For example, operations may be performed in different orders, or operations may be added, deleted, or modified. Furthermore, the term "coupling" and its variations describe a communication path between two elements, and do not imply a direct connection between elements without intermediate elements or connections. All such variations are considered part of the specification.
Detailed Description
[0018] Various embodiments of the invention are described herein with reference to the accompanying drawings. Alternative embodiments of the invention can be devised without departing from its scope. Various connections and positional relationships (e.g., above, below, adjacent, etc.) between elements are illustrated in the following description and drawings. Unless otherwise specified, these connections and / or positional relationships can be direct or indirect, and the invention is not intended to be limiting in this respect. Thus, coupling of entities can refer to direct or indirect coupling, and positional relationships between entities can be direct or indirect positional relationships. Furthermore, the various tasks and process steps described herein can be incorporated into a more comprehensive program or process having other steps or functions not detailed herein.
[0019] The following definitions and abbreviations are used to interpret the claims and specification. As used herein, the terms “comprising,” “including,” “containing,” “having,” or any other variations thereof are intended to cover non-exclusive inclusion. For example, a composition, mixture, process, method, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such composition, mixture, process, method, article, or apparatus.
[0020] Furthermore, the term "exemplary" as used herein means "serving as an example, instance, or illustration." Any embodiment or design described herein as "exemplary" is not necessarily to be construed as superior to other embodiments or designs. The terms "at least one" and "one or more" can be understood to include any integer greater than or equal to one, i.e., one, two, three, four, etc. The term "multiple" can be understood to include any integer greater than or equal to two, i.e., two, three, four, five, etc. The term "connection" can include both indirect "connection" and direct "connection."
[0021] The terms “about,” “substantially,” “approximately,” and their variations are intended to include the degree of error associated with measurement of a particular quantity based on the equipment available at the time of filing the application. For example, “about” can include a range of ±8%, ±5%, or ±2% of a given value.
[0022] For the sake of brevity, conventional techniques related to the manufacture and use of this invention may or may not be described in detail herein. Specifically, various aspects of the computing systems used to implement the various technical features described herein and the specific computer programs are well known. Therefore, for the sake of brevity, well-known system and / or process details are not provided, and many conventional implementation details are either briefly mentioned or omitted entirely herein.
[0023] It should be understood that while this disclosure includes a detailed description of cloud computing, the implementation of the teachings given herein is not limited to cloud computing environments. Rather, embodiments of the invention can be implemented in conjunction with any other type of computing environment now known or developed hereafter.
[0024] Cloud computing is a service delivery model that enables convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services), which can be rapidly provisioned and released with minimal management effort or interaction with the service provider. This cloud model may include at least five features, at least three service models, and at least four deployment models.
[0025] The features are as follows:
[0026] On-demand self-service: Cloud consumers can unilaterally and automatically provision computing capabilities, such as server time and network storage, as needed, without requiring human interaction with the service provider.
[0027] Broad network access: Capabilities are available over the network and accessed via standard mechanisms that facilitate use by heterogeneous thin client or thick client platforms (e.g., mobile phones, laptops, and PDAs).
[0028] Resource pooling: A provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, where different physical and virtual resources are dynamically assigned and reassigned as needed. There is a sense of location independence because consumers typically do not have control or knowledge of the exact location of the resources provided, but may be able to specify the location at a higher level of abstraction (e.g., country, state, or data center).
[0029] Rapid elasticity: Capabilities can be provisioned rapidly and elastically, in some cases automatically, to scale out quickly, and released rapidly to scale in quickly. For consumers, the capabilities available for provisioning often appear unlimited and can be purchased in any quantity at any time.
[0030] Measured service: Cloud systems automatically control and optimize resource usage by leveraging metering capabilities at a level of abstraction appropriate to the service type (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency to both the providers and consumers of the services being utilized.
[0031] Infrastructure as a Service (IaaS): The capabilities offered to consumers are processing, storage, networking, and other basic computing resources that enable consumers to deploy and run arbitrary software, which may include operating systems and applications. Consumers do not manage or control the underlying cloud infrastructure, but rather have control over the operating system, storage, deployed applications, and potentially limited control over selected networking components (e.g., host firewalls).
[0032] The deployment models are as follows:
[0033] Private cloud: A cloud infrastructure that operates solely for an organization. It can be managed by the organization or a third party and can exist on-site or off-site.
[0034] Community cloud: A cloud infrastructure shared by several organizations and supporting a specific community with shared concerns (e.g., tasks, security requirements, policies, and compliance considerations). It can be managed by an organization or a third party and can exist on-site or off-site.
[0035] Public cloud: The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization that sells cloud services.
[0036] Hybrid cloud: The cloud infrastructure is a combination of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technologies that enable data and application portability (e.g., cloud bursting for load balancing between clouds).
[0037] Cloud computing environments are service-oriented, focusing on statelessness, loose coupling, modularity, and semantic interoperability. At the heart of cloud computing is the infrastructure comprising a network of interconnected nodes.
[0038] Referring now to Figure 1, an illustrative cloud computing environment 50 is depicted. As shown, the cloud computing environment 50 includes one or more cloud computing nodes 10 with which local computing devices used by cloud consumers can communicate. These local computing devices include, for example, personal digital assistants (PDAs) or cellular phones 54A, desktop computers 54B, laptop computers 54C, and/or automotive computer systems 54N. The nodes 10 can communicate with each other. They can be physically or virtually grouped (not shown) in one or more networks, such as private clouds, community clouds, public clouds, or hybrid clouds, or combinations thereof, as described above. This allows the cloud computing environment 50 to provide infrastructure, platforms, and/or software as services that cloud consumers do not need to maintain on their local computing devices. It should be understood that the types of computing devices 54A-N shown in Figure 1 are intended to be illustrative only, and that computing nodes 10 and cloud computing environment 50 can communicate with any type of computerized device via any type of network and/or network-addressable connection (e.g., using a web browser).
[0039] Referring now to Figure 2, a set of functional abstraction layers provided by the cloud computing environment 50 (Figure 1) is shown. It should be understood in advance that the components, layers, and functions shown in Figure 2 are intended to be illustrative only, and embodiments of the invention are not limited thereto. As shown, the following layers and corresponding functions are provided:
[0040] The hardware and software layer 60 includes hardware and software components. Examples of hardware components include: a mainframe 61; a RISC (Reduced Instruction Set Computer) based server 62; a server 63; a blade server 64; a storage device 65; and network and networking components 66. In some embodiments, software components include network application server software 67 and database software 68.
[0041] The virtualization layer 70 provides an abstraction layer from which the following examples of virtual entities can be provided: virtual server 71; virtual storage 72; virtual network 73, including virtual private network; virtual application and operating system 74; and virtual client 75.
[0042] In one example, management layer 80 may provide the following functionalities: Resource Provisioning 81 provides dynamic procurement of computing resources and other resources used to perform tasks within the cloud computing environment. Metering and Pricing 82 provides cost tracking as resources are utilized within the cloud computing environment and bills or invoices for the consumption of these resources. In one example, these resources may include application software licenses. Security provides authentication for cloud consumers and tasks, as well as protection for data and other resources. User Portal 83 provides access to the cloud computing environment for consumers and system administrators. Service Level Management 84 provides cloud resource allocation and management to ensure that required service levels are met. Service Level Agreement (SLA) Planning and Fulfillment 85 provides pre-scheduling and procurement of cloud resources based on anticipated future needs according to the SLA.
[0043] Workloads layer 90 provides examples of functionalities for which the cloud computing environment can be used. Examples of workloads and functionalities that can be provided from this layer include: mapping and navigation 91; software development and lifecycle management 92; virtual classroom teaching 93; data analysis and processing 94; transaction processing 95; and capturing user interface screenshots for product documentation 96.
[0044] Referring to Figure 3, an embodiment of a processing system 300 for implementing the teachings herein is illustrated. In this embodiment, system 300 has one or more central processing units (processors) 21a, 21b, 21c, etc. (collectively referred to as processors 21). In one or more embodiments, each processor 21 may include a Reduced Instruction Set Computer (RISC) microprocessor. Processors 21 are coupled to system memory 34 and various other components via system bus 33. Read-only memory (ROM) 22 is coupled to system bus 33 and may include a basic input/output system (BIOS) that controls certain basic functions of system 300.
[0045] Figure 3 also shows an input/output (I/O) adapter 27 and a network adapter 26 coupled to system bus 33. I/O adapter 27 may be a Small Computer System Interface (SCSI) adapter that communicates with hard disk 23 and/or tape storage device 25 or any other similar component. I/O adapter 27, hard disk 23, and tape storage device 25 are collectively referred to herein as mass storage 24. Operating system 40 executing on processing system 300 may be stored in mass storage 24. Network adapter 26 interconnects bus 33 with external network 36, enabling data processing system 300 to communicate with other such systems. A screen (e.g., a display monitor) 35 is connected to system bus 33 via display adapter 32, which may include a graphics adapter and a video controller for improving performance in graphics-intensive applications. In one embodiment, adapters 27, 26, and 32 may be connected to one or more I/O buses that are connected to system bus 33 via an intermediate bus bridge (not shown). Suitable I/O buses for connecting peripheral devices such as hard disk controllers, network adapters, and graphics adapters typically include common protocols such as Peripheral Component Interconnect (PCI). Additional input/output devices are shown connected to system bus 33 via user interface adapter 28 and display adapter 32. Keyboard 29, mouse 30, and speaker 31 are all interconnected to bus 33 via user interface adapter 28, which may include, for example, a super I/O chip that integrates multiple device adapters into a single integrated circuit.
[0046] In an exemplary embodiment, the processing system 300 includes a graphics processing unit 41, which is a dedicated electronic circuit designed to manipulate and modify memory to accelerate the creation of images in a frame buffer intended for output to a display. In general, the graphics processing unit 41 is highly efficient at manipulating computer graphics and image processing, and has a highly parallel architecture that makes it more efficient than a general-purpose CPU for algorithms that process large data blocks in parallel.
[0047] Thus, as configured in Figure 3, the system 300 includes processing capability in the form of processors 21, storage capability including system memory 34 and mass storage 24, input devices such as keyboard 29 and mouse 30, and output capability including speaker 31 and display 35. In one embodiment, a portion of system memory 34 and mass storage 24 collectively store an operating system to coordinate the functions of the various components shown in Figure 3.
[0048] Turning now to an overview of the technology more specifically related to aspects of the present invention, methods, systems, and computer program products are provided for automatically capturing user interface (UI) screenshots for inclusion in product documentation. In exemplary embodiments, the methods, systems, and computer program products are configured to automatically capture and identify screenshots for inclusion in product documentation based on a calculated degree of completion of the user interface window depicted in the screenshot. In exemplary embodiments, the degree of completion of the user interface window is calculated based on a completion graph, which is an undirected weighted topological graph created based on features of the user interface window. In exemplary embodiments, the completion graph is constructed in the backend to obtain the state of UI controls. Features of a UI window, as used herein, refer to whether the UI controls of the UI window have a state of user input. Once created, the completion graph is analyzed using a graph convolutional network (GCN) to obtain the degree of completion and degree-of-completion percentage of the user interface window depicted in the screenshot. In exemplary embodiments, the methods, systems, and computer program products are also configured to use machine learning to obtain a historical set of completion and degree-of-completion percentages for all categories of user interface windows from available systems and their associated user guides.
[0049] Turning now to a more detailed description of aspects of the invention, Figure 4 depicts a computing system 400 for automatically capturing user interface (UI) screenshots for use in software product documentation, according to an embodiment of the present invention. System 400 includes a software product 402, a user interface 404, a UI analysis engine 406, product documentation 408, and optional historical product documentation 410. The computing system 400 can be implemented on the processing system 300 of Figure 3. Furthermore, the cloud computing system 50 can communicate with one or all components of the computing system 400 via wired or wireless electronic communication. The cloud 50 can supplement, support, or replace some or all of the functions of the components of the computing system 400. Additionally, some or all of the functions of the components of the computing system 400 can be implemented as nodes 10 of the cloud 50 (shown in Figures 1 and 2). The cloud computing node 10 is merely an example of a suitable cloud computing node and is not intended to impose any limitation on the scope or functionality of the embodiments of the invention described herein.
[0050] As described herein, a completion level in a UI window is a measure of how well a user action (such as clicking a button, entering a value, or selecting a date) has been completed within the UI window. In exemplary embodiments, each completion level is independent of the order in which steps are performed, but relates to what action has been completed, and is closely related to the input value (or state) of the UI controls within the window. For example, the input state of a typical UI control in a window, such as a user ID field, would result in two different completion levels: one with a user ID input and another without.
[0051] In an exemplary embodiment, a completion graph is used to calculate the completion rate of user actions in the UI window. Figure 5A shows a schematic diagram of a completion graph 500 according to one embodiment. The completion graph 500 is an undirected weighted topological graph constructed based on the features of a UI window. In an exemplary embodiment, the completion graph 500 is created starting with window node 502, whose node index is equal to zero and which represents the UI window. Next, each UI control in the UI window is assigned a unique index from the beginning. Each UI control corresponds to two nodes in the completion graph 500: a no-input node 504 and an input node 506. In an exemplary embodiment, the no-input nodes 504 and the input nodes 506 are grouped into corresponding category groups 508, 510, each of which collects the no-input nodes or input nodes of controls with the same control type and the same value type. In one embodiment, N is the total number of UI control categories in the software world, rather than the number of UI control categories in a particular UI window. In an exemplary embodiment, if a UI control has an input value, a connection edge exists between its input node 506 and window node 502. If a UI control has no input value, a connection edge exists between its no-input node 504 and window node 502. In the exemplary embodiment, the weight of each connection edge is always set to 1.
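The construction described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Control` record, the `category` label (standing in for a control type plus value type), the dictionary representation of nodes and edges, and the choice to number controls from 1 are not taken from the patent text.

```python
from dataclasses import dataclass

WINDOW = 0  # index of the window node


@dataclass(frozen=True)
class Control:
    index: int       # unique index within the window (hypothetical numbering from 1)
    category: str    # hypothetical label for control type plus value type
    has_input: bool  # whether the control currently holds a user-entered value


def build_completion_graph(controls):
    """Return (nodes, edges) for one state of a UI window.

    nodes maps each node key to its category group; edges maps an
    undirected (window, node) pair to its weight, which is always 1.
    """
    nodes = {}
    edges = {}
    for c in controls:
        no_input_key = (c.index, "no_input")
        input_key = (c.index, "input")
        # Every control contributes two nodes, grouped by category and state.
        nodes[no_input_key] = (c.category, "no_input")
        nodes[input_key] = (c.category, "input")
        # Only the node matching the control's current state is connected.
        active = input_key if c.has_input else no_input_key
        edges[(WINDOW, active)] = 1
    return nodes, edges


controls = [
    Control(1, "text_field:string", True),
    Control(2, "text_field:string", False),
    Control(3, "date_picker:date", False),
]
nodes, edges = build_completion_graph(controls)
```

In this sketch, three controls yield six control nodes, of which exactly three are connected to the window node, one per control.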
[0052] In an exemplary embodiment, once the completion graph 500 has been created, it is compressed to create a compressed completion graph 550. Compressing the completion graph 500 simplifies its structure and reduces the computational cost required to analyze it. In an exemplary embodiment, the no-input nodes 504 in each no-input category group 508 are compressed into a virtual no-input node 514. The weight of the connection edge between the virtual no-input node 514 and the window node 502 is the sum of the weights of the connection edges between the no-input nodes 504 in that no-input category group 508 and the window node 502. If no no-input node 504 in the no-input category group 508 is connected to the window node 502, there is no connection edge between the virtual no-input node 514 and the window node 502. Similarly, the input nodes 506 in each input category group 510 are compressed into a virtual input node 516. The weight of the connection edge between the virtual input node 516 and the window node 502 is the sum of the weights of the connection edges between the input nodes 506 in that input category group 510 and the window node 502. If no input node 506 in the input category group 510 is connected to the window node 502, there is no connection edge between the virtual input node 516 and the window node 502.
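As a rough illustration of the compression step, the sketch below merges each category group into one virtual node and sums the edge weights. The dictionary layout, the category names, and the function name `compress_completion_graph` are hypothetical choices for this example only.

```python
from collections import defaultdict


def compress_completion_graph(nodes, edges, window=0):
    """Merge each category group into a single virtual node.

    nodes: {node_key: (category, state)}, state is "input" or "no_input".
    edges: {(window, node_key): weight} for nodes connected to the window.
    Returns {(category, state): summed_weight}. A group with no connected
    node is absent from the result, meaning its virtual node has no edge
    to the window node.
    """
    compressed = defaultdict(int)
    for (src, node_key), weight in edges.items():
        if src != window:
            continue
        compressed[nodes[node_key]] += weight
    return dict(compressed)


# Hypothetical window with three text fields and one date picker.
nodes = {}
for index, category in [(1, "text_field"), (2, "text_field"),
                        (3, "text_field"), (4, "date_picker")]:
    nodes[(index, "no_input")] = (category, "no_input")
    nodes[(index, "input")] = (category, "input")

# Control 1 has a value; controls 2, 3 and 4 are still empty.
edges = {
    (0, (1, "input")): 1,
    (0, (2, "no_input")): 1,
    (0, (3, "no_input")): 1,
    (0, (4, "no_input")): 1,
}

compressed = compress_completion_graph(nodes, edges)
```

Here the two empty text fields collapse into one virtual no-input node with weight 2, while the date picker's input group has no edge at all because no date has been entered.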
[0053] In an exemplary embodiment, after a completion graph of the user interface window has been created and compressed, the completion graph is analyzed to identify the completion level and/or completion percentage of the UI window screenshot. In an exemplary embodiment, to calculate the completion level, the compressed completion graph is fed as input into a three-layer graph convolutional network (GCN). After the GCN extracts features from the compressed completion graph, its output contains the node embedding of the window node, which is taken as the completion level value.
[0054] In the exemplary embodiment, each layer of the three-layer GCN applies the single-layer GCN formula
$$H = \mathrm{ReLU}\left(\hat{D}^{-1/2}\,\hat{A}\,\hat{D}^{-1/2}\,X\,W\right)$$
to calculate the node embedding of each node in the topology graph, where $\hat{A} = A + I$ and $\hat{D}_{ii} = \sum_j \hat{A}_{ij}$. Additionally, $A$ is the weighted adjacency matrix, $I$ is the identity matrix, $\hat{D}$ is the diagonal degree matrix of $\hat{A}$, $X$ is the identity matrix, and $W$ is a parameter matrix that is randomly initialized. ReLU is a non-linear activation function.
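A minimal pure-Python sketch of such a three-layer forward pass is given below, assuming the standard symmetric normalization of the adjacency matrix with self-loops. The hidden size, the seed, the uniform initialization range, the example adjacency matrix, and all function names are assumptions for illustration; the text only states that the parameter matrix is randomly initialized and that the input feature matrix is the identity.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Add self-loops, then scale symmetrically by the inverse square-root degree."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def gcn_layer(norm_adj, features, weights):
    """One layer: ReLU(norm_adj @ features @ weights)."""
    z = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


def window_embedding(adj, hidden=4, seed=0):
    """Run three GCN layers and return the embedding of node 0 (the window node)."""
    rng = random.Random(seed)  # fixed seed so repeated calls are comparable
    n = len(adj)
    norm_adj = normalized_adjacency(adj)
    h = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # X = identity
    in_dim = n
    for _ in range(3):
        w = [[rng.uniform(-1.0, 1.0) for _ in range(hidden)] for _ in range(in_dim)]
        h = gcn_layer(norm_adj, h, w)
        in_dim = hidden
    return h[0]


# Hypothetical compressed graph: node 0 is the window node, nodes 1 to 4 are
# virtual nodes; node 4 has no edge because its category group is empty.
adj = [
    [0, 2, 1, 1, 0],
    [2, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
embedding = window_embedding(adj)
```

Using the same seed for every graph keeps the random parameter matrices identical across screenshots, so differences between window embeddings in this sketch come only from differences in the graphs.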
[0055] In an exemplary embodiment, the completion percentage is calculated using the following formula:
[0056]
[0057] Here, the terms of the formula are: the node embedding of the window node in the current completion graph; the node embedding of the window node in a completion graph in which all no-input nodes of the UI controls are connected to the window node; and the node embedding of the window node in a completion graph in which all input nodes of the UI controls are connected to the window node.
[0058] Figure 6 A flowchart depicts a method 600 for automatically capturing user interface (UI) screenshots for software product documentation according to one or more embodiments of the present invention. Method 600 includes identifying a user interface window of the software product, as shown in block 602. Next, as shown in block 604, method 600 includes creating a completion graph for the user interface window. In an exemplary embodiment, the completion graph is an undirected weighted graph created by identifying whether UI controls have input and grouping the UI controls of the user interface window. Method 600 also includes capturing multiple screenshots of the user interface window during use of the software product, as shown in block 606. In one embodiment, capturing multiple screenshots of the user interface window during use of the software product includes periodically capturing screenshots while the user interface window is open. In another embodiment, capturing multiple screenshots of the user interface window during use of the software product includes capturing a screenshot each time a user interface event occurs on the user interface window.
[0059] Next, as shown in block 608, method 600 includes calculating the completion percentage for each of the multiple screenshots. In an exemplary embodiment, calculating the completion percentage for each of the multiple screenshots includes analyzing a completion graph corresponding to each of the multiple screenshots. In one embodiment, analyzing the completion graph includes inputting the completion graph into a three-layer graph convolutional network, wherein the node embeddings of the window node output by the network are used to calculate the completion percentage. Method 600 also includes identifying a subset of the multiple screenshots to be included in the software product documentation based on the completion percentage, as shown in block 610. In an exemplary embodiment, the method also includes automatically inserting the subset of the multiple screenshots into the software product documentation.
[0060] In one embodiment, the method further includes obtaining documentation for previous versions of the software product and identifying the screenshots included in that documentation. Once the screenshots are identified, each is categorized into a user interface window category. Then, when a newer version of the software product is used, the user interface window category of each new user interface window is determined. Next, the completion percentage is calculated for each identified screenshot belonging to the same user interface window category as the new user interface window. Each identified screenshot in the product documentation can then be replaced with the screenshot of the new user interface window whose completion percentage is closest to that of the identified screenshot.
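The closest-match replacement described above can be sketched as follows. The tuple layout, the file names, the category labels, and the function name `pick_replacements` are hypothetical, and the completion percentages are made-up inputs rather than computed values.

```python
def pick_replacements(documented, candidates):
    """Map each documented screenshot to its closest new screenshot.

    documented: list of (doc_name, window_category, completion_pct)
    candidates: list of (name, window_category, completion_pct)
    Only candidates in the same window category are compared.
    """
    replacements = {}
    for doc_name, category, target_pct in documented:
        same_category = [c for c in candidates if c[1] == category]
        if not same_category:
            continue  # nothing comparable was captured for this category
        best = min(same_category, key=lambda c: abs(c[2] - target_pct))
        replacements[doc_name] = best[0]
    return replacements


documented = [
    ("login_v1.png", "login", 50.0),
    ("report_v1.png", "report", 100.0),
]
candidates = [
    ("login_a.png", "login", 0.0),
    ("login_b.png", "login", 48.0),
    ("login_c.png", "login", 100.0),
    ("report_a.png", "report", 75.0),
]
replacements = pick_replacements(documented, candidates)
```

In this example the half-completed login screenshot from the old documentation is paired with `login_b.png`, the new capture whose completion percentage is nearest to 50.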
[0061] In an exemplary embodiment, the user interface windows in the software can be divided into a finite number of user interface window categories. The windows in each category serve the same purpose, such as performing the same business task or requiring similar user interface operations, so the control values in these windows should be similar. Accordingly, available historical product documentation containing screenshots of all UI window categories can be analyzed to determine, for each UI window category, what kind of screenshots, at what level of completion, an experienced technical writer would typically choose to capture and include in the product documentation. A machine learning model can then be trained to obtain a set of screenshots for each UI window category. Thus, by analyzing historical product documentation to obtain a historical set of completion levels and percentages for all UI window categories, the process of automatically identifying and classifying screenshots for new versions of user guides can be performed more efficiently.
[0062] Referring now to Figure 7, a flowchart of a method 700 for updating software product documentation by automatically capturing user interface (UI) screenshots according to one or more embodiments of the present invention is shown. Method 700 includes obtaining software product documentation for a previous version of the software product, as shown in block 702. Next, as shown in block 704, method 700 includes identifying each screenshot in the software product documentation, determining the UI window category of each screenshot, and calculating a completion percentage and completion level for each screenshot. Method 700 also includes executing a new version of the software product and capturing candidate screenshots on each UI operation, as shown in block 706. In an exemplary embodiment, candidate screenshots are automatically captured when the completion percentage of the active user interface window is within a predetermined range, such as 2%, of the identified current target completion percentage. The identified current target completion percentage is obtained from screenshots of the same UI window category in the software product documentation of the previous version of the software product.
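The capture condition of block 706 can be sketched as follows, as a non-limiting example. The sketch assumes that completion percentages are expressed on a 0 to 100 scale and that the 2% range means two percentage points on either side of the target; the function name `capture_candidates`, the event tuple layout, and the sample data are hypothetical.

```python
def capture_candidates(events, targets, tolerance=2.0):
    """Return ids of screenshots worth keeping as candidates.

    events:  iterable of (window_category, completion_pct, screenshot_id),
             one entry per UI operation, with completion_pct in 0..100.
    targets: dict mapping window_category -> list of target completion
             percentages taken from the previous documentation.
    """
    captured = []
    for category, pct, shot_id in events:
        # Keep the screenshot if it is close enough to any target of its category.
        if any(abs(pct - t) <= tolerance for t in targets.get(category, [])):
            captured.append(shot_id)
    return captured


events = [
    ("login", 0.0, "s1"),
    ("login", 49.0, "s2"),
    ("login", 100.0, "s3"),
    ("settings", 30.0, "s4"),
]
targets = {"login": [50.0, 100.0]}
candidates = capture_candidates(events, targets)
```

Only `s2` and `s3` are retained: `s1` is too far from every login target, and the settings category has no target in the previous documentation.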
[0063] Next, as shown in block 708, method 700 includes calculating the completion percentage and completion level for each candidate screenshot, and determining the user interface window category of each candidate screenshot. Method 700 further includes calculating the cosine similarity between each candidate screenshot and the identified screenshot of the same UI window category, as shown in block 710. In block 712, method 700 replaces the identified screenshot with the candidate screenshot of the same user interface window category that has the highest cosine similarity, thereby creating a new software product document, and then concludes.
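Blocks 710 and 712 can be sketched as follows, as a non-limiting example. This text does not specify which vector represents a screenshot, so the sketch assumes a hypothetical feature vector of (completion percentage, completion level); the dictionary keys and the function names `cosine_similarity` and `best_replacement` are likewise illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_replacement(identified, candidates):
    """Pick the candidate of the same window category with the highest similarity."""
    same_category = [c for c in candidates if c["category"] == identified["category"]]
    if not same_category:
        return None
    return max(same_category,
               key=lambda c: cosine_similarity(c["vector"], identified["vector"]))


# Hypothetical vectors: (completion percentage as a fraction, completion level).
identified = {"id": "old-login", "category": "login", "vector": [0.6, 2.0]}
candidates = [
    {"id": "a", "category": "login", "vector": [0.58, 2.0]},
    {"id": "b", "category": "login", "vector": [0.1, 1.0]},
    {"id": "c", "category": "settings", "vector": [0.6, 2.0]},
]
chosen = best_replacement(identified, candidates)
```

Candidate `a` is chosen: candidate `c` has an identical vector but belongs to a different window category, and candidate `b` points in a less similar direction.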
[0064] Other processes may also be included. It should be understood that the processes depicted in Figure 6 and Figure 7 are examples, and that other processes may be added, or existing processes may be removed, modified, or rearranged, without departing from the scope and spirit of this disclosure.
[0065] This invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out aspects of the invention.
[0066] A computer-readable storage medium can be a tangible device capable of retaining and storing instructions for use by an instruction execution device. The computer-readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of computer-readable storage media includes: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. As used herein, a computer-readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
[0067] The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to a corresponding computing/processing device, or downloaded via a network (e.g., the Internet, a local area network, a wide area network, and/or a wireless network) to an external computer or external storage device. The network may include copper transmission cables, optical transmission fibers, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the corresponding computing/processing device.
[0068] Computer-readable program instructions used to perform the operations of this invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages (such as Java, Smalltalk, C++, etc.) and conventional procedural programming languages (such as the "C" programming language or similar programming languages). The computer-readable program instructions may be executed entirely on a user's computer, partially on a user's computer, as a standalone software package, partially on a user's computer and partially on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer via any type of network (including a local area network (LAN) or a wide area network (WAN)), or the connection may be made to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of this invention.
[0069] This document describes various aspects of the invention with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
[0070] These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of a flowchart and/or block diagram. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other device to operate in a particular manner, such that the computer-readable storage medium storing the instructions comprises an article of manufacture containing instructions that implement aspects of the functions/actions specified in the blocks of the flowchart and/or block diagram.
[0071] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable apparatus, or other device implement the functions/actions specified in one or more blocks of a flowchart and/or block diagram.
[0072] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. Each block in a flowchart or block diagram may represent a module, segment, or portion of instructions, including one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions noted in the blocks may occur in a different order than indicated in the figures. For example, depending on the functions involved, two blocks shown in succession may actually be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or that carry out combinations of special-purpose hardware and computer instructions.
[0073] The description of various embodiments of the present invention is provided for illustrative purposes and is not intended to be exhaustive or limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or technical improvements relative to technologies found in the market, or to enable others skilled in the art to understand the embodiments described herein.
Claims
1. A method for automatically capturing user interface screenshots for use in software product documentation, the method comprising:
identifying a user interface window of a software product;
creating a first completion graph for the user interface window, wherein the first completion graph includes a window node representing the user interface window and, for each UI control in the user interface window, a no-input node and an input node, and wherein, if a UI control has an input value, there is a connecting edge between its input node and the window node, and if a UI control has no input value, there is a connecting edge between its no-input node and the window node;
capturing a plurality of screenshots of the user interface window during use of the software product;
calculating a completion percentage for each of the plurality of screenshots, wherein the completion percentage of each screenshot is calculated by inputting the completion graph corresponding to the screenshot into a three-layer graph convolutional network (GCN), wherein the node embedding of the window node in the completion graph corresponding to the screenshot, as output by the GCN, is used as the completion percentage value for that screenshot, wherein the calculation of the completion percentage of each screenshot is based on the node embeddings of the window nodes in the first completion graph, a second completion graph, and a third completion graph corresponding to the screenshot, and wherein, in the second completion graph, the no-input nodes of all UI controls are connected to the window node, and in the third completion graph, the input nodes of all UI controls are connected to the window node; and
identifying, based on the completion percentage, a subset of the plurality of screenshots to be included in the software product documentation.
2. The method according to claim 1, wherein the first completion graph is an undirected weighted graph created by identifying whether the user interface controls of the user interface window have input and grouping the user interface controls.
3. The method according to claim 1, wherein capturing the plurality of screenshots of the user interface window during use of the software product includes periodically capturing screenshots while the user interface window is open.
4. The method according to claim 1, wherein capturing the plurality of screenshots of the user interface window during use of the software product includes capturing a screenshot each time a user interface event occurs in the user interface window.
5. The method according to claim 1, further comprising: obtaining documentation for a previous version of the software product and identifying screenshots contained in the documentation; categorizing each identified screenshot into a user interface window category; determining the user interface window category of the user interface window; and calculating the completion percentage of each identified screenshot that belongs to the same user interface window category as the user interface window.
6. The method according to claim 5, wherein the subset of screenshots to be included in the software product documentation includes, among the plurality of screenshots, the screenshots that have the highest cosine similarity to the identified screenshots of the same user interface window category contained in the documentation.
7. A system for automatically capturing user interface screenshots for use in software product documentation, comprising a processor communicatively coupled to a memory, the processor being configured to perform the method according to any one of claims 1 to 6.
8. A computer program product for automatically capturing user interface screenshots for use in software product documentation, comprising computer-readable program instructions executable by a processor to cause the processor to perform the method according to any one of claims 1 to 6.