Method, device, and program for predicting usability of digital product on basis of reinforcement learning

The method addresses the challenge of complex hierarchical structure modeling in digital products by using reinforcement learning to analyze structure and connection graphs, enhancing usability prediction accuracy and adaptability.

WO2026141835A1 · PCT designated stage · Publication Date: 2026-07-02 · MANYFAST INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
MANYFAST INC
Filing Date
2025-08-20
Publication Date
2026-07-02


Abstract

According to various embodiments of the present invention, disclosed is a method for predicting usability of a digital product on the basis of reinforcement learning. The method may comprise the steps of: acquiring a structure graph representing inclusion relationships between components included in a digital product and a connection graph representing connection relationships between pages existing in the digital product; configuring a usability prediction environment on the basis of the structure graph and the connection graph; acquiring target data for predicting usability of the digital product, and defining a reward function on the basis of the target data; and acquiring a usability prediction result by performing reinforcement learning on the basis of the usability prediction environment and the reward function.

Description

Reinforcement learning-based method, device, and program for predicting the usability of digital products

[0001] The present invention relates to a method, apparatus, and program for predicting the usability of a digital product based on reinforcement learning. Specifically, it relates to a method, apparatus, and program for analyzing the hierarchical structure among the components of a digital product and performing reinforcement learning based on said structure to predict usability.

[0002]

[0003] Digital products are provided in various forms in modern society and are widely utilized in diverse fields, such as software applications, web pages, and electronic device interfaces. The quality and usability of these digital products are directly linked to user experience, making them one of the key factors determining product success.

[0004] Usability is generally defined as a concept encompassing user efficiency, satisfaction, and accessibility, and efforts are being made to measure and improve it during the digital product design phase. Currently, usability evaluation methods utilize a combination of qualitative and quantitative approaches, such as expert evaluations, user testing, and surveys. However, these traditional methods have limitations; they are costly and time-consuming, and are difficult to apply effectively, particularly in large-scale digital products or systems with complex interfaces.

[0005] Recently, artificial intelligence (AI) technology is being introduced into usability evaluation and prediction, with reinforcement learning-based approaches receiving particular attention. Reinforcement learning is a technique that learns policies through interaction with the environment and derives optimal actions through a reward system, and it is known to demonstrate excellent performance in solving complex problems. In this regard, technological attempts are underway to analyze the hierarchical structure of digital products and automatically measure or predict usability through reinforcement learning.

[0006] However, existing usability prediction technologies fail to effectively model the complex hierarchical structure of digital products, resulting in limited accuracy in usability measurement; furthermore, when applying reinforcement learning, the lack of clarity regarding the construction of the learning environment and the design of the reward function can lead to inefficiencies during the learning process.

[0007] Therefore, there is a demand in the industry for a more sophisticated and automated usability prediction method that performs reinforcement learning based on the hierarchical structure of digital products. In this regard, Korean Published Patent Application No. 10-2014-0113797 discloses a web usability evaluation system that generates a journey map.

[0008]

[0009] The present invention, devised in response to the aforementioned background technology, aims to provide a method, apparatus, and program for predicting the usability of a digital product based on reinforcement learning.

[0010] The technical problems of the present invention are not limited to those mentioned above, and other unmentioned technical problems will be clearly understood by those skilled in the art from the description below.

[0011]

[0012] The present invention can quantitatively predict usability based on the hierarchical structure of a digital product and the connection relationships between pages. Through this, the present invention can automate usability evaluation during the early design phase of a digital product, reduce costs and time, and provide a more optimized user experience.

[0013] In addition, by applying reinforcement learning, the present invention can minimize subjective bias that may occur in existing methods and provide flexibility and adaptability to predict usability in various user environments through learned policies.

[0014] The effects of the present invention are not limited to those mentioned above, and other unmentioned effects will be clearly understood by a person skilled in the art from the description below.

[0015]

[0016] FIG. 1 is a drawing illustrating a system according to one embodiment of the present invention.

[0017] FIG. 2 is a hardware configuration diagram of a computing device according to one embodiment of the present invention.

[0018] FIGS. 3 to 7 are drawings for explaining a method for predicting the usability of a digital product based on reinforcement learning according to an embodiment of the present invention.

[0019]

[0020] According to an embodiment of the present invention for solving the problem described above, a method for predicting the usability of a digital product based on reinforcement learning is disclosed. The method may include: a step of obtaining a structure graph representing the inclusion relationship between components included in the digital product and a connection graph representing the connection relationship between pages existing in the digital product; a step of configuring a usability prediction environment based on the structure graph and the connection graph; a step of obtaining target data for predicting the usability of the digital product and defining a reward function based on the target data; and a step of performing reinforcement learning based on the usability prediction environment and the reward function to obtain a usability prediction result.

[0021] In an alternative embodiment, the structure graph is configured such that the inclusion relationships between the components are represented by component nodes and there are no circular connections, the connection graph is configured such that the connection relationships between the pages are represented by page nodes and there are circular connections, and the target data may include content to be predicted, a function to be predicted, and a failure condition.

[0022] In an alternative embodiment, the step of configuring a usability prediction environment based on the structure graph and the connection graph may include: setting a component evaluation score for each component node; setting a page evaluation score for each page node including one or more component nodes; and creating the usability prediction environment for performing reinforcement learning based on the evaluation scores of each component node and each page node.

[0023] In an alternative embodiment, the step of setting a component evaluation score for each of the component nodes may comprise: calculating an index score based on the index order corresponding to the component relative to the total number of indices of the page containing the component node; calculating a layout score by querying the layout area of the child component nodes of the component node; obtaining an importance score of the content included in the component node; and determining the component evaluation score based on the index score, the layout score, and the importance score; and the step of setting a page evaluation score for each page node containing one or more component nodes may comprise: calculating a sum value by summing the numbers of content items and functions included in each of the child component nodes included in the page node and the child component nodes of the child component nodes; and calculating the page evaluation score by multiplying the sum value by a pre-set coefficient.
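The page evaluation score described in this paragraph can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the `ComponentNode` class, its field names, the example coefficient of 0.5, and the choice to recurse through all descendant levels are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComponentNode:
    """Hypothetical component node; the field names are illustrative only."""
    name: str
    num_contents: int = 0   # content items held directly by this node
    num_functions: int = 0  # functions held directly by this node
    children: List["ComponentNode"] = field(default_factory=list)


def count_items(node: ComponentNode) -> int:
    """Count content items and functions in a node and all of its descendants."""
    own = node.num_contents + node.num_functions
    return own + sum(count_items(child) for child in node.children)


def page_evaluation_score(page_children: List[ComponentNode], coefficient: float) -> float:
    """Multiply the summed count over the page's component subtree by a preset coefficient."""
    total = sum(count_items(child) for child in page_children)
    return total * coefficient


# Example page: a header holding a login button, plus a body component.
header = ComponentNode("header", num_contents=1, children=[
    ComponentNode("login_button", num_functions=1),
])
body = ComponentNode("body", num_contents=2, num_functions=1)
score = page_evaluation_score([header, body], coefficient=0.5)
```

Here the sum value is 5 (two items under the header, three in the body), so the page score is 5 multiplied by the assumed coefficient.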

[0024] In an alternative embodiment, the component evaluation score is a score used as a decision criterion by the agent during the process of generating an initial policy, and the page evaluation score may be a score referenced by the agent for improving the efficiency of the search path and determining actions to optimize the initial policy.

[0025] In an alternative embodiment, the step of acquiring target data for predicting the usability of the digital product and defining a reward function based on the target data may include: a step of setting reward and penalty criteria corresponding to search success criteria and search failure conditions included in the target data; and a step of defining a reward function that grants the reward or penalty by reflecting the evaluation scores of each of the component node and the page node in the usability prediction environment.
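One possible shape for such a reward function is sketched below in Python. The `TargetData` fields, the numeric reward and penalty values, and the 0.1 scaling applied to the evaluation scores are all assumptions chosen for illustration; the specification does not fix these values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetData:
    """Hypothetical target data: the page to reach and a failure condition."""
    target_page: str
    max_steps: int  # assumed failure condition: the search fails at this step count


def reward(target: TargetData, page: str, step: int,
           component_score: float, page_score: float,
           success_reward: float = 10.0,
           failure_penalty: float = -10.0,
           step_cost: float = -1.0) -> float:
    """Return the reward or penalty for arriving at `page` on step `step`."""
    if page == target.target_page:
        return success_reward      # search success criterion met
    if step >= target.max_steps:
        return failure_penalty     # search failure condition met
    # Intermediate step: a small cost, offset by the evaluation scores
    # of the component and page the agent just used (assumed weighting).
    return step_cost + 0.1 * (component_score + page_score)


target = TargetData(target_page="checkout", max_steps=5)
r_success = reward(target, "checkout", 2, component_score=0.8, page_score=3.0)
r_failure = reward(target, "home", 5, component_score=0.8, page_score=3.0)
r_step = reward(target, "home", 1, component_score=0.5, page_score=2.5)
```

Folding the evaluation scores into the intermediate reward is only one way to "reflect" them; they could equally be used to scale the terminal reward.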

[0026] In an alternative embodiment, the step of obtaining a usability prediction result by performing reinforcement learning based on the usability prediction environment and the reward function may include: initializing an agent in the usability prediction environment and setting it to an initial state; calculating a reward or penalty based on the agent's action based on the reward function when the agent performs exploration in the usability prediction environment and selects a state transition and an action; and generating an initial policy through iterative exploration and learning of the agent, and optimizing the initial policy to derive the usability prediction result.
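The initialize, explore, reward, and iterate loop described here can be illustrated with a small Python sketch. Tabular Q-learning with epsilon-greedy exploration is used only because it is a familiar reinforcement learning algorithm; the specification does not name a particular algorithm. The page names, reward values, and hyperparameters are assumptions.

```python
import random

# Hypothetical connection graph: page -> pages reachable in one transition.
CONNECTIONS = {
    "home": ["search", "settings"],
    "search": ["product", "home"],
    "settings": ["home"],
    "product": ["checkout", "search"],
    "checkout": [],
}
TARGET, MAX_STEPS = "checkout", 20


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table over (page, next_page) pairs by repeated exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, targets in CONNECTIONS.items() for a in targets}
    for _ in range(episodes):
        state = "home"  # initialize the agent to its initial state
        for _ in range(MAX_STEPS):
            actions = CONNECTIONS[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state = action  # an action is a move to a linked page
            r = 10.0 if next_state == TARGET else -1.0
            best_next = max((q[(next_state, a)] for a in CONNECTIONS[next_state]),
                            default=0.0)
            q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
            state = next_state
            if state == TARGET:
                break
    return q


def greedy_path(q, start="home"):
    """Follow the learned policy greedily from the start page."""
    path, state = [start], start
    while state != TARGET and len(path) <= MAX_STEPS:
        state = max(CONNECTIONS[state], key=lambda a: q[(state, a)])
        path.append(state)
    return path


q_table = train()
path = greedy_path(q_table)
```

In this toy graph the learned greedy policy follows the shortest route from `home` to `checkout`, which is the kind of navigation path the prediction result would report.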

[0027] In an alternative embodiment, the step of generating an initial policy through iterative exploration and learning of the agent and optimizing the initial policy to derive the usability prediction result may include: identifying a navigation path to reach target content or a page based on the initial policy and the optimized policy; calculating the navigation time by predicting the transition time between each page node in the navigation path; and deriving a usability prediction result including the navigation path and the navigation time.
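Deriving a result that contains the navigation path and navigation time can be sketched as follows. The per-transition times are hypothetical constants; the specification does not say how the transition time between page nodes is predicted.

```python
from typing import Dict, List, Tuple

# Hypothetical predicted transition times, in seconds, between page nodes.
TRANSITION_SECONDS: Dict[Tuple[str, str], float] = {
    ("home", "search"): 1.5,
    ("search", "product"): 2.0,
    ("product", "checkout"): 2.5,
}


def navigation_time(path: List[str], times: Dict[Tuple[str, str], float]) -> float:
    """Sum the predicted transition time over consecutive page pairs in the path."""
    return sum(times[(src, dst)] for src, dst in zip(path, path[1:]))


def usability_result(path: List[str], times: Dict[Tuple[str, str], float]) -> dict:
    """Bundle the navigation path and its total predicted time."""
    return {"navigation_path": path, "navigation_time": navigation_time(path, times)}


result = usability_result(["home", "search", "product", "checkout"], TRANSITION_SECONDS)
```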

[0028] According to one embodiment of the present invention for solving the above-described problem, an apparatus is disclosed. The apparatus comprises: a memory for storing one or more instructions; and a processor for executing the one or more instructions stored in the memory, and the processor can perform the above-described methods by executing the one or more instructions.

[0029] According to one embodiment of the present invention for solving the above-described problem, a computer program stored on a computer-readable recording medium is disclosed, which is combined with a computer as hardware to perform the above-described methods.

[0030] Other specific details of the present invention are included in the detailed description and drawings.

[0031]

[0032] Various embodiments are now described with reference to the drawings. In this specification, various descriptions are provided to facilitate an understanding of the invention. However, it is evident that these embodiments can be practiced without such specific descriptions.

[0033] As used herein, terms such as “component,” “module,” “system,” etc. refer to computer-related entities, hardware, firmware, software, combinations of software and hardware, or executions of software. For example, a component may be, but is not limited to, a procedure executed on a processor, a processor, an object, an execution thread, a program, and / or a computer. For example, both an application executed on a computing device and the computing device itself may be a component. One or more components may reside within a processor and / or an execution thread. A component may be localized within a single computer. A component may be distributed among two or more computers. Additionally, these components may be executed from various computer-readable media having various data structures stored therein. Components may communicate through local and / or remote processes, for example, according to signals having one or more data packets (e.g., data from a component interacting with another component in a local system or distributed system, and / or data transmitted through signals to other systems and networks such as the Internet).

[0034] Furthermore, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or." That is, unless otherwise specified or evident from the context, "X uses A or B" is intended to mean any of the natural inclusive permutations. In other words, "X uses A or B" applies if X uses A, if X uses B, or if X uses both A and B. Additionally, the term "and / or" as used herein should be understood to refer to and include all possible combinations of one or more of the enumerated related items.

[0035] Additionally, the terms “comprises” and / or “comprising” should be understood to mean that the stated features and / or components are present. However, the terms “comprises” and / or “comprising” should be understood not to exclude the presence or addition of one or more other features, components and / or groups thereof. Furthermore, unless otherwise specified or clearly evident from the context to indicate a singular form, the singular in this specification and claims should generally be interpreted to mean “one or more.”

[0036] Those skilled in the art should recognize that the various exemplary logical blocks, configurations, modules, circuits, means, logics, and algorithmic steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly exemplify the interchangeability of hardware and software, various exemplary components, blocks, configurations, means, logics, modules, circuits, and steps have been generally described above in terms of their functionality. Whether such functionality is implemented in hardware or software depends on the specific application and design constraints imposed on the overall system. Skilled technicians may implement the described functionality in various ways for each specific application. However, such decisions regarding implementation should not be interpreted as moving out of the scope of the invention.

[0037] The description of the presented embodiments is provided to enable those skilled in the art to use or practice the present invention. Various modifications to these embodiments will be apparent to those skilled in the art. The general principles defined herein may be applied to other embodiments without departing from the scope of the present invention. Thus, the present invention is not limited to the embodiments presented herein. The present invention should be interpreted in the broadest possible scope consistent with the principles and novel features presented herein.

[0038] In this specification, the term "computer" refers to any type of hardware device comprising at least one processor, and may be understood to include software configurations operating on said hardware device according to the embodiments. For example, the term "computer" may be understood to include smartphones, tablet PCs, desktops, laptops, and user clients and applications running on each of these devices, but is not limited thereto.

[0039] Hereinafter, embodiments of the present invention will be described in detail with reference to the attached drawings.

[0040] Each step described in this specification is described as being performed by a computer, but the subject of each step is not limited thereto, and depending on the embodiment, at least some of each step may be performed on different devices.

[0041]

[0042] FIG. 1 is a drawing illustrating a system according to one embodiment of the present invention.

[0043] Referring to FIG. 1, a system according to one embodiment of the present invention may include a computing device (100), a user terminal (200), and an external server (300). The system illustrated in FIG. 1 is according to one embodiment, and its components are not limited to the embodiment illustrated in FIG. 1 and may be added, changed, or deleted as needed.

[0044] In one embodiment, the computing device (100) can predict the usability of a digital product based on reinforcement learning. For example, the computing device (100) can analyze the hierarchical structure of the digital product and the connection relationships between pages, and perform an automated usability evaluation based on this.

[0045] Specifically, the computing device (100) can obtain a structure graph representing the inclusion relationships between components included in a digital product and a connection graph representing the connection relationships between pages existing in the digital product. Additionally, the computing device (100) can configure a usability prediction environment based on the structure graph and the connection graph. Furthermore, the computing device (100) can obtain target data for predicting the usability of the digital product and define a reward function based on the target data. Then, the computing device (100) can obtain a usability prediction result by performing reinforcement learning based on the usability prediction environment and the reward function.

[0046] Accordingly, the computing device (100) of the present invention can effectively evaluate usability during the design phase of a digital product and provide insights to improve the user experience.

[0047] An example of a method by which the computing device (100) predicts the usability of a digital product based on reinforcement learning is described below with reference to FIGS. 3 to 7.

[0048] In various embodiments, the computing device (100) may provide Web or Application-based services. However, it is not limited thereto.

[0049] The computing device (100) may include any type of computer system or computer device, such as, for example, a microprocessor, a mainframe computer, a digital processor, a portable device, and a device controller. However, it is not limited thereto.

[0050] Hereinafter, the hardware configuration of the computing device (100) will be described with reference to FIG. 2.

[0051] Meanwhile, the user terminal (200) may be connected to the computing device (100) via a network (400) and may be the terminal of a user who plans or produces a digital product whose usability is predicted by the computing device (100). For example, the user terminal (200) may include the terminal of a user who participates in the design process of a digital product or tests an initial prototype of the product.

[0052] Here, the user terminal (200) may include, for example, various types of computer devices. Specifically, for example, the user terminal (200) may refer to various terminal devices such as smartphones, tablet PCs, desktops, and laptops.

[0053] The user terminal (200) includes a display on at least a part of the terminal and may include an operating system for running applications or extension-based services provided by the computing device (100). For example, the user terminal (200) may be a smartphone, but is not limited thereto. As a wireless communication device that ensures portability and mobility, the user terminal (200) may include all kinds of handheld wireless communication devices, such as navigation devices, PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (Wideband Code Division Multiple Access), and WiBro (Wireless Broadband Internet) terminals, smartpads, and tablet PCs.

[0054] The external server (300) can be connected to the computing device (100) via the network (400). It can transmit and receive various information / data that the computing device (100) needs in order to predict the usability of a digital product based on reinforcement learning, and it can store and manage various information / data generated as the computing device (100) makes that prediction.

[0055] For example, the external server (300) may be a database server that stores information used in the reinforcement learning-based digital product usability prediction method. As another example, the external server (300) may be a server that provides information used in the reinforcement learning-based digital product usability prediction method.

[0056] The network (400) may refer to a connection structure capable of exchanging information between each node, such as computing devices, multiple terminals, and servers. For example, the network (400) includes a Local Area Network (LAN), a Wide Area Network (WAN), the World Wide Web (WWW), a wired and wireless data network, a telephone network, a wired and wireless television network, etc.

[0057] Wireless data communication networks include, but are not limited to, 3G, 4G, 5G, 3GPP (3rd Generation Partnership Project), 5GPP (5th Generation Partnership Project), LTE (Long Term Evolution), WiMAX (Worldwide Interoperability for Microwave Access), Wi-Fi, Internet, LAN (Local Area Network), Wireless LAN (Wireless Local Area Network), WAN (Wide Area Network), PAN (Personal Area Network), RF (Radio Frequency), Bluetooth network, NFC (Near-Field Communication) network, satellite broadcasting network, analog broadcasting network, DMB (Digital Multimedia Broadcasting) network, etc.

[0058]

[0059] FIG. 2 is a hardware configuration diagram of a computing device according to one embodiment of the present invention.

[0060] Referring to FIG. 2, a computing device (100) according to one embodiment of the present invention may include one or more processors (110), a memory (120) for loading a computer program (151) executed by the processor (110), a bus (130), a communication interface (140), and a storage (150) for storing the computer program (151). Here, FIG. 2 illustrates only the components related to the embodiment of the present invention. Therefore, a person skilled in the art to which the present invention pertains will understand that other general-purpose components may be included in addition to the components illustrated in FIG. 2.

[0061] The processor (110) controls the overall operation of each component of the computing device (100). The processor (110) may be composed of one or more cores and may include processors for data analysis and deep learning, such as a central processing unit (CPU), a general purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of the computing device. Alternatively, it may be configured to include any type of processor well known in the art of the present invention.

[0062] Additionally, the processor (110) can perform operations for at least one application or program for executing the method according to embodiments of the present invention, and the computing device (100) may have one or more processors.

[0063] In various embodiments, the processor (110) may further include RAM (Random Access Memory, not shown) and ROM (Read-Only Memory, not shown) for temporarily and / or permanently storing signals (or data) processed within the processor (110). Additionally, the processor (110) may be implemented in the form of a system-on-chip (SoC) comprising at least one of a graphics processing unit, RAM, and ROM.

[0064] Memory (120) stores various data, instructions and / or information. Memory (120) may load a computer program (151) from storage (150) to execute a method / operation according to various embodiments of the present invention. When the computer program (151) is loaded into memory (120), the processor (110) may perform the method / operation by executing one or more instructions constituting the computer program (151). Memory (120) may be implemented as volatile memory such as RAM, but the technical scope of the present invention is not limited thereto.

[0065] The bus (130) provides communication functions between components of the computing device (100). The bus (130) can be implemented as various types of buses, such as an address bus, a data bus, and a control bus.

[0066] The communication interface (140) supports wired and wireless internet communication of the computing device (100). Additionally, the communication interface (140) may support various communication methods other than internet communication. To this end, the communication interface (140) may be configured to include a communication module well known in the art of the present invention. In some embodiments, the communication interface (140) may be omitted.

[0067] Storage (150) can store the computer program (151) non-transitorily. When a process according to an embodiment of the present invention is performed through the computing device (100), storage (150) can store various information necessary to perform the method according to the disclosed embodiment or to provide a service.

[0068] The storage (150) may be configured to include non-volatile memory such as ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, or any form of computer-readable recording medium well known in the art to which the present invention belongs.

[0069] A computer program (151) may include one or more instructions that cause a processor (110) to perform a method / operation according to various embodiments of the present invention when loaded into memory (120). That is, the processor (110) may perform the method / operation according to various embodiments of the present invention by executing the one or more instructions.

[0070] In one embodiment, the computer program (151) may include one or more instructions to perform various methods related to various tasks related to learning a neural network model.

[0071] The steps of the method or algorithm described in connection with embodiments of the present invention may be implemented directly in hardware, implemented as a software module executed by hardware, or implemented by a combination thereof. The software module may reside in RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), Flash Memory, a hard disk, a removable disk, a CD-ROM, or any form of computer-readable recording medium well known in the art to which the present invention belongs.

[0072] The components of the present invention may be implemented as a program (or application) and stored on a medium to be executed in combination with a computer, which is hardware. The components of the present invention may be implemented as software programming or software elements, and similarly, embodiments may be implemented in programming or scripting languages such as C, C++, Java, assembler, etc., including various algorithms implemented as combinations of data structures, processes, routines, or other programming configurations. Functional aspects may be implemented as algorithms executed on one or more processors.

[0073]

[0074] FIGS. 3 to 7 are drawings for explaining a method for predicting the usability of a digital product based on reinforcement learning according to an embodiment of the present invention.

[0075] Referring to FIG. 3, the computing device (100) can obtain a structure graph representing the inclusion relationship between components included in a digital product and a connection graph representing the connection relationship between pages existing in the digital product (S110).

[0076] For example, a computing device (100) may receive a structure graph and a connection graph from a user terminal (200) of a user who plans or produces a digital product (e.g., a mobile application or a website). These graphs may be obtained by analyzing design drafts, wireframes, or prototype data generated during the interface design phase of the digital product.

[0077] For example, the computing device (100) can obtain a structure graph as shown in (a) of FIG. 4 and a connection graph as shown in (b) of FIG. 4.

[0078] In one embodiment, the structure graph may be configured such that inclusion relationships between components are represented by component nodes, and no circular connections exist.

[0079] Specifically, a structure graph represents UI elements used on each screen of a digital product as nodes and indicates how these elements are hierarchically organized. For example, a structure containing buttons, images, text, etc., under a top-level layout component can be represented by nodes and edges.

[0080] In one embodiment, the connection graph represents the connection relationships between pages using page nodes, and may be configured with a structure in which circular connections exist.

[0081] Specifically, the connection graph represents the user navigation flow and expresses the possibility of moving between pages as edges. For example, it may include paths such as moving from the home page to the settings page, or returning from the settings page to the home page.
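The difference between the two graphs can be illustrated with a short Python sketch: the structure graph is acyclic, while the connection graph may contain cycles such as home to settings and back. The adjacency-list representation, the node names, and the `has_cycle` helper are assumptions made for this example.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]

# Structure graph: parent component -> child components (inclusion relationship).
structure_graph: Graph = {
    "root_layout": ["header", "content"],
    "header": ["logo_image", "menu_button"],
    "content": ["title_text", "buy_button"],
    "logo_image": [], "menu_button": [], "title_text": [], "buy_button": [],
}

# Connection graph: page -> pages reachable from it (navigation flow).
connection_graph: Graph = {
    "home": ["settings"],
    "settings": ["home"],
}


def has_cycle(graph: Graph) -> bool:
    """Detect a directed cycle with a three-color depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        color[node] = GRAY  # node is on the current DFS path
        for nxt in graph.get(node, []):
            if color.get(nxt, WHITE) == GRAY:
                return True  # back edge found: a cycle exists
            if color.get(nxt, WHITE) == WHITE and visit(nxt):
                return True
        color[node] = BLACK  # fully explored
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)
```

A check like `has_cycle(structure_graph)` could be used to validate that an input structure graph really contains no circular connections before the environment is built.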

[0082] In various embodiments, the computing device (100) may receive design files or prototype data from a user terminal (200) of a user who plans or produces a digital product, and generate a structure graph and a connection graph based thereon. Specifically, the computing device (100) may analyze a file written in XML, JSON, or another markup or data format generated by a user interface design tool to identify the attributes and hierarchical structure of each component.

[0083] For example, the computing device (100) can analyze a Sketch file or Figma data exported from a prototyping tool to extract the attributes and inclusion relationships of the components and generate a structure graph. In addition, it can generate a connection graph by identifying the connection relationships between pages through screen transition information or link settings.
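As a rough illustration of this extraction step, the Python sketch below walks a JSON design export and builds both graphs. The JSON schema shown (`pages`, `root`, `children`, `link_to`) is entirely hypothetical and is not the export format of any real design tool; an actual implementation would have to follow the schema of the tool in use.

```python
import json

# Hypothetical export format; real design tools use their own schemas.
DESIGN_JSON = """
{
  "pages": [
    {"id": "home",
     "root": {"id": "home_layout", "children": [
        {"id": "settings_button", "link_to": "settings", "children": []}]}},
    {"id": "settings",
     "root": {"id": "settings_layout", "children": [
        {"id": "back_button", "link_to": "home", "children": []}]}}
  ]
}
"""


def build_graphs(design: dict):
    """Return (structure_graph, connection_graph) as adjacency lists."""
    structure, connection = {}, {}

    def walk(page_id: str, node: dict) -> None:
        # Inclusion relationship: component -> its child components.
        structure[node["id"]] = [c["id"] for c in node.get("children", [])]
        # Link setting: the containing page connects to the link target page.
        target = node.get("link_to")
        if target and target not in connection[page_id]:
            connection[page_id].append(target)
        for child in node.get("children", []):
            walk(page_id, child)

    for page in design["pages"]:
        connection[page["id"]] = []
        walk(page["id"], page["root"])
    return structure, connection


structure_graph, connection_graph = build_graphs(json.loads(DESIGN_JSON))
```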

[0084] Accordingly, the computing device (100) can automatically generate a structure graph and a connection graph based on design data provided by the user, and efficiently obtain base data for predicting the usability of a digital product.

[0085] Referring again to FIG. 3, the computing device (100) can configure a usability prediction environment based on a structure graph and a connection graph (S120).

[0086] Specifically, referring to FIG. 5, the computing device (100) can set a component evaluation score for each component node (S121). Here, the component evaluation score may be a score used as a decision criterion during the process of an agent creating an initial policy. For example, the agent may prioritize searching for components with high component evaluation scores to increase the probability of achieving the goal.

[0087] More specifically, the computing device (100) can calculate an index score based on the index order corresponding to the component relative to the total number of indices of the page containing the component node. For example, the index score may be higher for components located higher within the page.

[0088] The computing device (100) can calculate a layout score by querying the layout area of a sub-component node of a component node. For example, a banner or main button that occupies a large area on the screen may be assigned a high layout score.

[0089] For example, the computing device (100) can determine the layout score by calculating the area of the component in pixels and the ratio of that area to the total screen area.

[0090] The computing device (100) can obtain an importance score of the content included in the component node. For example, a component that performs a major function, such as a sign-up button or a purchase button, can receive a high importance score.

[0091] For example, a computing device (100) can evaluate the functional importance of each component by utilizing domain knowledge or a predefined set of rules and assign an importance score accordingly.

[0092] The computing device (100) can determine an evaluation score based on an index score, a layout score, and an importance score.

[0093] For example, the computing device (100) can calculate the final component evaluation score by combining the three scores as a weighted sum. Specifically, for example, the computing device (100) can calculate the evaluation score as 'Evaluation Score = (Index Score × Weight 1) + (Layout Score × Weight 2) + (Importance Score × Weight 3)'. Here, Weight 1, Weight 2, and Weight 3 are values that reflect the importance of each score and can be set by the user.
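The component evaluation score described above can be sketched as follows. The weighted sum follows the formula given in the text; the exact form of the index score (earlier position gives a higher score), the default weights, and all function names are assumptions made for illustration.

```python
def index_score(index, total):
    """Assumed form: a 0-based position earlier in the page scores higher."""
    return (total - index) / total


def layout_score(width, height, screen_width, screen_height):
    """Ratio of the component's pixel area to the total screen area."""
    return (width * height) / (screen_width * screen_height)


def component_score(index_s, layout_s, importance_s, weights=(0.3, 0.3, 0.4)):
    """Evaluation Score = index*w1 + layout*w2 + importance*w3.

    The default weights are placeholders; the text leaves them to the user.
    """
    w1, w2, w3 = weights
    return index_s * w1 + layout_s * w2 + importance_s * w3


# Example: the first of four components, covering a quarter of the screen,
# with an importance score of 0.5 assigned by a rule set.
score = component_score(
    index_score(0, 4),
    layout_score(100, 50, 200, 100),
    0.5,
    weights=(0.5, 0.25, 0.25),
)
```

Normalizing each of the three scores to the range 0 to 1, as done here, keeps the weights directly comparable; the text does not state whether such normalization is applied.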

[0094] Additionally, the computing device (100) can set a page evaluation score per page node including one or more component nodes (S122). Here, the page evaluation score may be a score that the agent refers to for improving the efficiency of the search path and determining actions to optimize the initial policy. For example, the agent can shorten the time to reach the goal by prioritizing the search of pages with high page evaluation scores.

[0095] Specifically, the computing device (100) can calculate a sum value by summing the number of contents and functions included in each child component node of the page node and in the sub-component nodes of those child component nodes. That is, the page evaluation score can be associated with the complexity and functions of the page.

[0096] For example, the computing device (100) can calculate a value by summing the total number of buttons, input fields, images, text blocks, etc. included in the page.

[0097] For example, the computing device (100) can calculate a sum value by applying weights according to the functional importance of each component. In this case, components that perform important functions may be assigned higher weights. Here, the functional importance can be determined by the computing device (100) using domain knowledge or a predefined rule set, or obtained from a user terminal (200).

[0098] The computing device (100) can calculate a page evaluation score by multiplying the sum value by a preset coefficient.

[0099] For example, the computing device (100) can calculate the page evaluation score as ‘page evaluation score = sum value × coefficient’.
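The page evaluation score can be sketched as a recursive count multiplied by a coefficient. The node layout (the "contents", "functions", and "children" keys) and the function names are hypothetical; only the relationship 'page evaluation score = sum value × coefficient' comes from the text.

```python
def count_items(node):
    """Count contents and functions in a component node and all nodes below it."""
    own = len(node.get("contents", [])) + len(node.get("functions", []))
    return own + sum(count_items(child) for child in node.get("children", []))


def page_score(page, coefficient):
    """page evaluation score = sum value x coefficient."""
    sum_value = sum(count_items(child) for child in page.get("children", []))
    return sum_value * coefficient


home_page = {
    "id": "home",
    "children": [
        {"contents": ["hero_image", "headline"], "functions": []},
        {
            "contents": [],
            "functions": ["search"],
            "children": [{"contents": ["icon"], "functions": ["submit"]}],
        },
    ],
}

home_score = page_score(home_page, coefficient=1.5)
```

Here the home page contains five contents and functions in total, so a coefficient of 1.5 yields a page evaluation score of 7.5.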

[0100] For example, if a specific page plays a key role in user navigation, such as a home page or a main menu page, the computing device (100) can apply a high coefficient to that page to increase the evaluation score.

[0101] Here, the computing device (100) can determine a coefficient based on attribute information such as the functional importance of a page, user access frequency, or the role of a page, or obtain a user-specified page-specific coefficient through a user terminal (200).

[0102] Specifically, the computing device (100) can set coefficients by analyzing metadata regarding the importance or role of each page during the design phase of the digital product. For example, pages that have a direct impact on user behavior, such as payment pages or sign-up pages, have high importance, so a larger coefficient can be applied to increase the evaluation score of the corresponding page. Additionally, the computing device (100) can adjust the coefficients by utilizing this statistical information if, based on past user behavior data, the visit frequency of a specific page is high or the user stay time is long.

[0103] Additionally, the computing device (100) may directly receive page-specific coefficients from the user terminal (200). This allows a designer or planner to subjectively judge the importance of each page and set the coefficient, thereby enabling the reflection of user experience or business goals. For example, if a user sets a high coefficient for a page to emphasize a newly released feature, the agent learns to prioritize exploring that page, which can then be reflected in the usability prediction results.

[0104] In this way, the computing device (100) can determine the coefficient by utilizing both automated data analysis and user input, thereby enabling the page evaluation score to more accurately reflect the actual usability and user experience of the digital product.

[0105] Meanwhile, the computing device (100) can generate a usability prediction environment for performing reinforcement learning based on the evaluation scores of each component node and page node (S123).

[0106] Specifically, the computing device (100) can set up an environment that allows the agent to make rational decisions during the learning process by reflecting the evaluation score in the value evaluation of the state and action.

[0107] For example, the computing device (100) can design a reward function to provide a greater reward to the agent when selecting a component or page with a high evaluation score. Additionally, the computing device (100) can induce the agent to learn an efficient navigation path by granting a reward proportional to the evaluation score of the corresponding element when the agent selects a specific component or moves to a specific page.

[0108] For example, the computing device (100) can apply rules to the reward function such as 'Reward = +100 points for achieving the goal', 'Penalty = -10 points for choosing the wrong path', and 'Additional reward = Evaluation score of the selected component × weight'. Through this, the agent learns to maximize the reward function and searches for a path with high usability.
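The example rules above can be written as a single reward function. The numeric values are the ones given in the text; the function signature and the way the three rules are combined into one sum are assumptions for illustration.

```python
GOAL_REWARD = 100.0          # 'Reward = +100 points for achieving the goal'
WRONG_PATH_PENALTY = -10.0   # 'Penalty = -10 points for choosing the wrong path'


def reward(reached_goal, wrong_path, evaluation_score, weight=1.0):
    """Combine the goal reward, the wrong-path penalty, and the score bonus."""
    total = evaluation_score * weight  # 'Additional reward = score x weight'
    if reached_goal:
        total += GOAL_REWARD
    if wrong_path:
        total += WRONG_PATH_PENALTY
    return total


# Reaching the goal through a component with evaluation score 0.5 and weight 2.
goal_reward = reward(True, False, 0.5, weight=2.0)
```

With these values, reaching the goal through that component yields 101.0, while a wrong turn through a component with a score of zero yields -10.0.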

[0109] Referring again to FIG. 3, the computing device (100) can acquire target data to predict the usability of a digital product and define a reward function based on the target data (S130). Here, the target data may include content to be predicted, functions to be predicted, and failure conditions.

[0110] For example, the content to be predicted can refer to specific information or pages that a user seeks within a digital product. In the case of a shopping app, this could be a specific product detail page; in a news app, a specific article page; or in a social media app, a specific user's profile page.

[0111] Furthermore, the functions to be predicted can represent specific tasks or functions that a user intends to perform within a digital product. For example, this may include purchasing products, sharing posts, adding friends, or changing settings. As these functions are core elements of the user experience, it is crucial for the agent to learn how to successfully perform them.

[0112] Failure conditions define situations or conditions that the agent must avoid during the navigation process. For example, failure conditions may include reaching an error page (e.g., a 404 page), page loading failing due to restricted access rights, or an excessive number of navigation steps. Additionally, users repeatedly visiting content they are not interested in may also be considered failure conditions.
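The target data described above can be represented as a small record with a failure check. The class name, field names, and the default step limit are hypothetical; only two of the failure conditions mentioned in the text (reaching an error page and an excessive number of navigation steps) are modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class TargetData:
    target_content: str                 # content to be predicted, e.g. a page id
    target_function: str                # function to be predicted, e.g. "purchase"
    error_pages: set = field(default_factory=set)
    max_steps: int = 20                 # assumed limit on navigation steps


def is_failure(target, path):
    """Return True if the visited path meets one of the modeled failure conditions."""
    steps = len(path) - 1               # transitions between visited pages
    if steps > target.max_steps:
        return True
    return any(page in target.error_pages for page in path)


target = TargetData("product_detail", "purchase", error_pages={"404"}, max_steps=3)
```

Other conditions named in the text, such as restricted access rights or repeated visits to irrelevant content, could be added as further fields and checks in the same way.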

[0113] The computing device (100) can define a reward function based on this target data to perform reinforcement learning so that the agent can efficiently reach the target content or function. For example, the agent can be effectively guided in the direction of learning by giving a high reward when the agent successfully reaches the predicted target content and giving a penalty when a situation corresponding to a failure condition occurs.

[0114] Specifically, referring to FIG. 6, the computing device (100) can set reward and penalty criteria corresponding to the search success criteria and search failure conditions included in the target data (S131).

[0115] More specifically, the computing device (100) can analyze specific content or functions that the agent must reach in the target data and failure conditions that must be avoided during the search process, and determine the size of the reward and the degree of the penalty accordingly.

[0116] For example, the computing device (100) can set reward and penalty criteria to grant a high reward when the agent successfully reaches the target page, and to impose a penalty when the agent fails to reach the target within a time limit or visits an error page or an irrelevant page.

[0117] The computing device (100) can then define a reward function that grants a reward or penalty by reflecting the evaluation scores of each component node and page node in the usability prediction environment (S132).

[0118] Specifically, the computing device (100) can adjust the reward value for the agent's behavior according to the evaluation score of the node.

[0119] For example, the computing device (100) can induce the agent to learn a more efficient path by giving additional rewards when selecting a component or page with a high evaluation score and imposing a penalty when selecting a node with a low evaluation score.

[0120] Referring again to FIG. 3, the computing device (100) can obtain a usability prediction result by performing reinforcement learning based on a usability prediction environment and a reward function (S140).

[0121] Specifically, referring to FIG. 7, the computing device (100) can initialize an agent in a usability prediction environment and set it to an initial state (S141).

[0122] More specifically, the computing device (100) can define the agent's state space and action space and randomly set an initial policy. Here, the state space represents the current search location within the digital product or the state of the interface components, and the action space includes the operations or movements that the agent can take. Since the definition of the state space and action space directly affects the accuracy and efficiency of the reinforcement learning algorithm, the computing device (100) can define them to suit the characteristics of the digital product.

[0123] For example, the computing device (100) can set the initial location of the agent to a home page or start screen, and possible actions may include all interactions that can be performed in a user interface, such as clicking a button, scrolling, or switching pages.

[0124] Additionally, when the agent explores the usability prediction environment and selects a state transition and an action, the computing device (100) can calculate a reward or penalty for the agent's action based on the reward function (S142).

[0125] Specifically, the computing device (100) can calculate a reward value corresponding to the action and state whenever the agent takes a specific action and transitions to a new state. At this time, the computing device (100) can update the agent's Q-value table by applying a Q-learning algorithm. Here, Q-learning is an off-policy reinforcement learning algorithm that enables the agent to learn the expected cumulative reward for each state-action pair.
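The Q-value table update mentioned above can be sketched with the standard tabular Q-learning rule, Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', a') − Q(s, a)). The learning rate α, the discount factor γ, and the state and action names are illustrative assumptions; the text does not specify them.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Off-policy target: best value available in the next state.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)  # unseen state-action pairs start at 0.0

# The agent moves from the home page to the settings page and receives +100.
first = q_update(q, "home", "go_settings", 100.0, "settings", ["back"],
                 alpha=0.5, gamma=0.9)
# A later step that leads into "home" now inherits part of that value.
second = q_update(q, "start", "go_home", 0.0, "home", ["go_settings"],
                  alpha=0.5, gamma=0.9)
```

The second update shows how the value of a rewarded action propagates backward along the path, which is what lets the agent prefer routes that lead toward the goal.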

[0126] For example, the computing device (100) may grant a positive reward when the agent takes an action that brings it closer to a target function or content, and impose a penalty when the agent deviates to an unnecessary path or takes an action corresponding to a failure condition. Through this, the agent learns a policy in a direction that maximizes the reward.

[0127] Additionally, the computing device (100) can adjust the reward value by utilizing the component evaluation score and the page evaluation score. For example, if an agent selects a component with a high evaluation score, additional rewards can be given to induce the agent to prefer that path.

[0128] The computing device (100) can then generate an initial policy through iterative exploration and learning of the agent, and optimize the policy to derive a usability prediction result (S143).

[0129] Specifically, the computing device (100) can update the agent's policy by applying a deep reinforcement learning algorithm such as a Deep Q-Network (DQN). Here, the DQN utilizes a neural network to enable efficient approximation of Q-values even in a large state space.

[0130] In this case, during the learning process, the agent can learn an optimal policy by balancing exploration and exploitation. The computing device (100) can be designed so that the agent applies an epsilon-greedy strategy, exploring new paths by selecting a random action with a certain probability and otherwise selecting the currently optimal action. In this way, the agent can optimize the policy in a direction that maximizes the cumulative reward defined by the reward function.
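The epsilon-greedy strategy described above can be sketched as follows. The function name, the dictionary of Q-values per action, and the example values are assumptions for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one.

    q_values maps each available action to its current Q-value estimate.
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)            # exploration
    return max(actions, key=q_values.get)     # exploitation


q_values = {"click_menu": 1.5, "scroll": 0.2, "open_settings": 3.0}
rng = random.Random(0)  # seeded generator so that runs are repeatable
chosen = epsilon_greedy(q_values, epsilon=0.1, rng=rng)
```

A common design choice, not stated in the text, is to start with a large epsilon and decrease it over episodes so that the agent explores widely at first and exploits its learned policy later.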

[0131] Additionally, the computing device (100) can calculate the search time by predicting the transition time between each page node in the search path.

[0132] Specifically, the computing device (100) can estimate the expected time required for each state transition by considering page switching time, user behavior patterns, interface complexity, etc.

[0133] For example, the computing device (100) can calculate the total time required for the entire navigation path by applying a time such as an average of 0.1 seconds for a button click and an average of 0.5 seconds for a page loading based on user experience data or a general usage scenario.
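The search time calculation can be sketched as a sum of per-step durations. The two durations are the example averages given in the text; the step labels and the function name are hypothetical.

```python
# Example averages from the text, in seconds.
STEP_SECONDS = {"click": 0.1, "page_load": 0.5}


def search_time(steps):
    """Total expected time for a navigation path given as a list of step types."""
    return sum(STEP_SECONDS[step] for step in steps)


# Two page transitions, each consisting of a click followed by a page load.
total_seconds = search_time(["click", "page_load", "click", "page_load"])
```

For this path the expected total is about 1.2 seconds. Per-page or per-component durations estimated from user experience data could replace the two fixed averages.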

[0134] The computing device (100) can then derive a usability prediction result including a search path and a search time.

[0135] Specifically, the computing device (100) can identify the most efficient path for an agent to reach a goal based on an optimized policy, and calculate usability metrics by combining the expected total navigation time, the order of visited pages and components, and the cumulative reward value along the path.

[0136] For example, the computing device (100) can generate a usability evaluation report including information such as the number of search steps, average response time, error frequency, and duplicate path ratio.

[0137] Accordingly, the computing device (100) can quantitatively evaluate the usability of a digital product and provide detailed usability prediction results to identify areas requiring improvement. Through this, the computing device (100) can help designers or developers optimize the interface of the digital product and improve the user experience.

[0138] Additionally, the computing device (100) can calculate path efficiency by summing the evaluation scores of nodes traversed in the search path based on an initial policy and an optimized policy, and calculate the ratio of duplicate paths by analyzing the degree of duplication of the search path to derive an efficiency prediction result.

[0139] Specifically, the computing device (100) can calculate a total path score by summing the evaluation scores of each node visited in the path explored by the agent. Here, the total path score can be used as an indicator of how efficiently the agent reached the goal. For example, the total path score increases as the agent passes through nodes with high evaluation scores, which may mean that a path with high usability was chosen.

[0140] For example, the computing device (100) can calculate average path efficiency by dividing the sum of node evaluation scores on the search path by the total number of steps taken in the search. Through this, the computing device (100) can quantitatively evaluate how efficient the agent's search path is.

[0141] Additionally, the computing device (100) can calculate the ratio of duplicate paths by dividing the number of times the same node in the search path is repeatedly visited by the total number of visits. Here, a higher duplication ratio indicates that the agent has repeatedly searched unnecessary paths, which may suggest areas requiring improvement in terms of usability.
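The average path efficiency and the ratio of duplicate paths described above can be sketched as follows. The sketch assumes that each visited node counts as one step and that a repeated visit is any visit to a node already seen earlier in the path; the function names and example scores are hypothetical.

```python
def average_path_efficiency(path, scores):
    """Sum of the evaluation scores of visited nodes divided by the number of steps."""
    if not path:
        return 0.0
    return sum(scores[node] for node in path) / len(path)


def duplicate_ratio(path):
    """Repeated visits divided by the total number of visits."""
    if not path:
        return 0.0
    repeated = len(path) - len(set(path))
    return repeated / len(path)


path = ["home", "menu", "home", "settings"]
scores = {"home": 2.0, "menu": 1.0, "settings": 3.0}

efficiency = average_path_efficiency(path, scores)
duplicates = duplicate_ratio(path)
```

In this example the total path score is 8.0 over four visits, giving an average path efficiency of 2.0, and one of the four visits is a repeat, giving a duplicate ratio of 0.25.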

[0142] Accordingly, the computing device (100) can derive efficiency prediction results including not only the search path and search time, but also path efficiency and the ratio of duplicate paths. Through these comprehensive usability prediction results, it is possible to identify factors that hinder the user experience in the interface of a digital product and provide specific details for improving the interface structure or navigation flow.

[0143]

[0144] According to an additional embodiment of the present invention, a computing device (100) can improve the accuracy of predicting the usability of a digital product by utilizing actual user behavior data.

[0145] Specifically, the computing device (100) can collect user interaction data and integrate it into a reinforcement learning process. Here, the user interaction data may include click patterns, page navigation paths, time spent on each page, frequency of interaction with specific components, etc., that occur while users actually use the digital product.

[0146] More specifically, the computing device (100) can pre-learn the initial policy of the agent based on user behavior data. To do this, an imitation learning or inverse reinforcement learning algorithm can be applied to enable the agent to learn the behavior patterns of actual users. This allows the agent to establish a policy that more accurately reflects the user experience.

[0147] For example, a computing device (100) can analyze user session logs to identify frequently used navigation paths and major areas of interest on the interface. In this case, the agent prioritizes the exploration of paths with high user preference during the reinforcement learning process based on frequently used navigation paths and major areas of interest on the interface, which can increase the realism of usability prediction.

[0148] For example, the computing device (100) can construct a probabilistic model from user behavior data to calculate, for each state, the probability of each next action. Then, the computing device (100) can utilize the results of the probabilistic model for policy initialization in reinforcement learning so that the agent can efficiently reduce the search space and improve the learning speed.
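One simple form of such a probabilistic model is a first-order transition table estimated from session logs. The sketch below assumes each session is an ordered list of visited page identifiers; the function name and the example sessions are hypothetical, and the text does not specify which kind of model is used.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next page | current page) from observed page sequences."""
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


sessions = [
    ["home", "search", "product"],
    ["home", "settings"],
    ["home", "search"],
]
probs = transition_probabilities(sessions)
```

Here two of the three observed transitions out of the home page lead to the search page, so that transition receives a probability of 2/3. Such probabilities could be used to bias the agent's initial action choices toward paths that real users take.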

[0149] Additionally, the computing device (100) can apply a hybrid model that combines reinforcement learning and supervised learning. That is, the computing device (100) can create a user-centered reinforcement learning environment by initializing the value function of an agent using user behavior data or by reflecting user preferences in the reward function.

[0150] Accordingly, the computing device (100) can improve the accuracy and efficiency of predicting the usability of digital products by combining actual user data with reinforcement learning. This overcomes the limitations of existing reinforcement learning-based methods that do not sufficiently reflect the complexity of user behavior and can contribute to the design of a more user-friendly interface.

[0151]

[0152] According to an additional embodiment of the present invention, a computing device (100) can combine reinforcement learning and natural language processing to analyze user feedback and improve the accuracy of predicting the usability of a digital product.

[0153] Specifically, the computing device (100) can collect unstructured text data such as online reviews, survey results, and user opinions collected from social media. This data includes advantages, disadvantages, and improvements directly mentioned by users regarding digital products, and can reflect the users' subjective experiences and satisfaction.

[0154] More specifically, the computing device (100) can analyze collected text data using natural language processing technology and perform key keyword extraction, sentiment analysis, topic modeling, etc. Through this, it is possible to determine how users feel about specific interface elements or functions and to identify factors affecting usability.

[0155] For example, the computing device (100) may recognize that there is a usability issue with the corresponding function or component if users repeatedly provide negative opinions such as "the login process is complicated" or "it is difficult to find the menu." Conversely, positive opinions such as "the search function is convenient" or "the design is intuitive" indicate high usability of the element.

[0156] For example, the computing device (100) can generate user satisfaction scores for each component node and page node based on sentiment analysis results. These satisfaction scores are additionally reflected in the reward function during the reinforcement learning process, which can induce the agent to prioritize exploring paths with high user satisfaction. Additionally, the computing device (100) can impose penalties on components or pages identified as major complaints, thereby causing the agent to avoid those paths or seek improvement measures.
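The use of satisfaction scores in the reward function can be sketched as follows. The sketch assumes that sentiment analysis has already produced a value between -1.0 (negative) and +1.0 (positive) for each piece of feedback and has linked it to a node; the function names and the weight are hypothetical.

```python
def satisfaction_scores(labeled_feedback):
    """Average the sentiment values per node.

    labeled_feedback is a list of (node id, sentiment) pairs, where the
    sentiment is assumed to come from a separate sentiment analysis step.
    """
    totals = {}
    for node, sentiment in labeled_feedback:
        running_sum, count = totals.get(node, (0.0, 0))
        totals[node] = (running_sum + sentiment, count + 1)
    return {node: s / n for node, (s, n) in totals.items()}


def shaped_reward(base_reward, node, scores, weight=5.0):
    """Add a bonus (or a penalty, if negative) proportional to satisfaction."""
    return base_reward + weight * scores.get(node, 0.0)


feedback = [("login", -1.0), ("login", -1.0), ("search", 1.0)]
satisfaction = satisfaction_scores(feedback)
```

With this feedback, selecting the login node lowers the reward by 5.0 and selecting the search node raises it by 5.0, while nodes without feedback leave the base reward unchanged.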

[0157] Accordingly, the computing device (100) can effectively utilize user feedback through the combination of reinforcement learning and natural language processing, and can perform usability prediction of digital products more accurately and realistically. Compared to a method that considers only interface structure and functional elements, this approach can improve the completeness of usability evaluation by reflecting the user's subjective experience and emotions. In addition, it is possible to design an interface that meets the needs and expectations of actual users, thereby improving user satisfaction.

[0158] Additionally, the computing device (100) can simulate various user profiles by applying a multi-agent system during the reinforcement learning process.

[0159] Specifically, each agent can be designed to reflect different user characteristics such as age, gender, and technical proficiency. Through this, the computing device (100) can perform usability predictions for various user groups in parallel and derive interface improvement plans optimized for each user group.

[0160] For example, an elderly agent may be sensitive to font size or button size, while a youth agent may be more interested in visual elements or animation effects. Taking these differences into account, the computing device (100) can identify usability issues by user segment and present customized solutions.

[0161] Accordingly, the computing device (100) can help with more detailed prediction and improvement of the usability of digital products by integrating reinforcement learning, natural language processing, and a multi-agent system.

[0162] Although embodiments of the present invention have been described above with reference to the attached drawings, those skilled in the art will understand that the present invention may be implemented in other specific forms without altering its technical concept or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive.

Claims

1. A reinforcement learning-based method for predicting the usability of a digital product, the method being performed by a computing device comprising at least one processor and comprising: obtaining a structure graph representing inclusion relationships between components included in the digital product and a connection graph representing connection relationships between pages existing in the digital product; configuring a usability prediction environment based on the structure graph and the connection graph; acquiring target data for predicting the usability of the digital product and defining a reward function based on the target data; and obtaining a usability prediction result by performing reinforcement learning based on the usability prediction environment and the reward function.

2. The method of claim 1, wherein the structure graph represents the inclusion relationships between the components using component nodes and is configured with a structure in which no circular connections exist; the connection graph represents the connection relationships between the pages using page nodes and is configured with a structure in which circular connections exist; and the target data includes content to be predicted, functions to be predicted, and failure conditions.

3. The method of claim 1, wherein configuring the usability prediction environment based on the structure graph and the connection graph comprises: setting a component evaluation score for each component node; setting a page evaluation score for each page node including one or more component nodes; and generating the usability prediction environment for performing reinforcement learning based on the evaluation scores of each component node and each page node.

4. The method of claim 3, wherein setting the component evaluation score for each component node comprises: calculating an index score based on the index order corresponding to the component relative to the total number of indices of the page containing the component node; calculating a layout score by querying the layout area of a sub-component node of the component node; obtaining an importance score of the content included in the component node; and determining the evaluation score based on the index score, the layout score, and the importance score; and wherein setting the page evaluation score for each page node including the one or more component nodes comprises: calculating a sum value by summing the number of contents and functions included in each of the child component nodes included in the page node and the sub-component nodes of the child component nodes; and calculating the page evaluation score by multiplying the sum value by a preset coefficient.

5. The method of claim 3, wherein the component evaluation score is a score used as a decision criterion while an agent generates an initial policy, and the page evaluation score is a score that the agent refers to for improving the efficiency of a search path and determining actions to optimize the initial policy.

6. The method of claim 3, wherein acquiring the target data for predicting the usability of the digital product and defining the reward function based on the target data comprises: setting reward and penalty criteria corresponding to search success criteria and search failure conditions included in the target data; and defining a reward function that grants the reward or penalty by reflecting the evaluation scores of each component node and each page node in the usability prediction environment.

7. The method of claim 1, wherein obtaining the usability prediction result by performing reinforcement learning based on the usability prediction environment and the reward function comprises: initializing an agent in the usability prediction environment and setting it to an initial state; when the agent performs exploration in the usability prediction environment and selects a state transition and an action, calculating a reward or penalty according to the agent's action based on the reward function; and generating an initial policy through iterative exploration and learning of the agent, and optimizing the initial policy to derive the usability prediction result.

8. The method of claim 7, wherein generating the initial policy through iterative exploration and learning of the agent and optimizing the initial policy to derive the usability prediction result comprises: identifying a search path to reach target content or a target page based on the initial policy and the optimized policy; calculating a search time by predicting the transition time between each page node in the search path; and deriving the usability prediction result including the search path and the search time.

9. A device comprising: a memory storing one or more instructions; and a processor executing the one or more instructions stored in the memory, wherein the processor performs the method of claim 1 by executing the one or more instructions.

10. A computer program stored on a computer-readable recording medium and combined with a computer, which is hardware, to perform the method of claim 1.