Method and apparatus for managing behaviors of an application on the basis of clustering, device, and medium

By extracting key links and attribute sets from application log data using a clustering-based approach and utilizing machine learning models to determine application behavior types, this approach solves the problem of managing massive amounts of application behavior in existing technologies and achieves efficient and accurate risk identification.

WO2026123264A1 · PCT designated stage · Publication Date: 2026-06-18 · BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2024-12-11
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively manage the massive amounts of data generated by applications, leading to misjudgments and omissions, and making it difficult to accurately identify potential risks.

Method used

By using clustering-based methods, key links and attribute sets are extracted from application log data, and machine learning models are used to determine application behavior types, thereby achieving automated management.

Benefits of technology

It improves the accuracy and efficiency of application behavior management, reduces the workload of manual analysis, and can accurately identify potential risks.

✦ Generated by Eureka AI based on patent content.

Abstract

Provided are a method and apparatus for managing behaviors of an application on the basis of clustering, a device, and a medium. In the method, during operation of an application, a plurality of links are acquired from log data of the application, each of the plurality of links being directed toward a code segment in code of the application and being associated with a behavior. On the basis of predetermined key features, a plurality of key links are determined among the plurality of links. A plurality of attribute sets respectively associated with the plurality of key links are determined, each of the plurality of attribute sets comprising a key link among the plurality of key links and at least one attribute associated with the key link. On the basis of at least one cluster of the plurality of attribute sets, at least one type of behavior of the application is determined, attributes in the attribute sets in the cluster satisfying a predetermined condition. A clustering operation can cluster massive amounts of behavior data into stable behavior types, and can accurately identify potential risks of each behavior of the application.

Description

Methods, apparatus, devices, and media for managing application behavior based on clustering

Technical Field

[0001] Implementations of this disclosure generally relate to application management, and in particular to methods, apparatus, devices, and computer-readable storage media for managing the behavior of applications based on clustering.

Background Technology

[0002] Applications can implement multiple functions, and as these functions become more complex, it becomes necessary to check whether various behaviors during application runtime meet expectations. Application developers and / or operators can determine application behavior through manual analysis of the application's code and / or runtime logs. However, manual analysis involves an enormous workload and may fail to accurately identify potentially risky behaviors due to misjudgments or omissions. Although some automated analysis tools have been developed, effectively managing application behavior remains difficult due to the massive amounts of behavior that applications may generate (e.g., millions, tens of millions, or even more). Therefore, a more efficient and accurate method for managing application behavior is desired.

Summary of the Invention

[0003] In a first aspect of this disclosure, a method for managing application behavior based on clustering is provided. In this method, during application execution, multiple links are obtained from the application's log data, with each link pointing to a behavior-related code segment within the application's code. Multiple key links are identified from the multiple links based on predetermined key features. Multiple attribute sets are determined, each associated with one of the multiple key links, and each attribute set includes the key links and at least one attribute associated with each key link. Based on at least one cluster of the multiple attribute sets, at least one type of application behavior is determined, where attributes in the attribute sets within the clusters satisfy predetermined conditions.

[0004] In a second aspect of this disclosure, an apparatus for managing the behavior of an application based on clustering is provided. The apparatus includes: an acquisition module configured to acquire multiple links from log data of the application during its operation, wherein the links in the multiple links point to code segments in the application's code that are associated with the behavior; a link determination module configured to determine multiple key links from the multiple links based on predetermined key features; a fact determination module configured to determine multiple attribute sets respectively associated with the multiple key links, wherein the attribute sets include the key links in the multiple key links and at least one attribute associated with the key links; and a type determination module configured to determine at least one type of the application's behavior based on at least one cluster of the multiple attribute sets, wherein the attributes in the attribute sets of the clusters satisfy predetermined conditions.

[0005] In a third aspect of this disclosure, an electronic device is provided. The electronic device includes: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions causing the electronic device to perform the method according to a first aspect of this disclosure when executed by the at least one processing unit.

[0006] In a fourth aspect of this disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, causes the processor to implement the method according to a first aspect of this disclosure.

[0007] In a fifth aspect of this disclosure, a computer program product is provided, comprising a computer program that, when executed by a processor, implements the method according to a first aspect of this disclosure.

[0008] It should be understood that the content described in this Summary section is not intended to identify key or essential features of the implementations of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Attached Figure Description

[0009] The above and other features, advantages, and aspects of the various implementations of this disclosure will become more apparent in the following detailed description, taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

[0010] Figure 1 shows a block diagram of an application environment according to one implementation of the present disclosure;

[0011] Figure 2 shows a block diagram for managing the behavior of an application based on clustering, according to some implementations of this disclosure;

[0012] Figure 3 shows a block diagram of determining links based on key features according to some implementations of this disclosure;

[0013] Figure 4 shows a block diagram of the structure of an attribute set according to some implementations of this disclosure;

[0014] Figure 5 shows a block diagram for determining a text description of a behavior according to some implementations of this disclosure;

[0015] Figure 6 shows a block diagram illustrating the use of machine learning models to determine the text interpretation of a link according to some implementations of this disclosure;

[0016] Figure 7 shows a block diagram of a database for clustering technical facts according to some implementations of this disclosure;

[0017] Figure 8 shows a flowchart of a method for managing the behavior of an application based on clustering, according to some implementations of this disclosure;

[0018] Figure 9 shows a block diagram of an apparatus for managing application behavior based on clustering, according to some implementations of this disclosure; and

[0019] Figure 10 shows a block diagram of a device capable of implementing various implementations of the present disclosure.

Detailed Implementation

[0020] Implementations of this disclosure will now be described in more detail with reference to the accompanying drawings. While some implementations of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the implementations set forth herein. Rather, these implementations are provided for a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and implementations of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0021] In the description of the implementations of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one implementation" or "the implementation" should be understood as "at least one implementation". The term "some implementations" should be understood as "at least some implementations". Other explicit and implicit definitions may also be included below. As used herein, the term "model" can represent the relationships between various data. For example, these relationships can be obtained based on various currently known and / or future-developed technical solutions.

[0022] It is understood that the data involved in this technical solution (including but not limited to the data itself, the acquisition or use of the data) shall comply with the requirements of relevant laws, regulations and related provisions.

[0023] It is understood that before using the technical solutions disclosed in the various embodiments of this disclosure, users should be informed of the types, scope of use, and usage scenarios of the personal information involved in this disclosure through appropriate means in accordance with relevant laws and regulations, and user authorization should be obtained.

[0024] For example, upon receiving a user's active request, a prompt message is sent to the user to explicitly inform them that the requested operation will require the acquisition and use of the user's personal information. This allows the user to independently choose whether to provide personal information to the software or hardware, such as the electronic device, application, server, or storage medium performing the operations of this disclosed technical solution, based on the prompt message.

[0025] As an optional but non-restrictive implementation, in response to a user's active request, a prompt message can be sent to the user, for example, via a pop-up window, where the prompt message can be presented in text format. Furthermore, the pop-up window can also include a selection control allowing the user to choose whether to "agree" or "disagree" to provide personal information to the electronic device.

[0026] It is understood that the above notification and user authorization process are merely illustrative and do not constitute a limitation on the implementation of this disclosure. Other methods that comply with relevant laws and regulations may also be applied to the implementation of this disclosure.

[0027] The term "in response to" as used herein refers to a state in which a corresponding event occurs or a condition is satisfied. It will be understood that the timing of subsequent actions performed in response to such event or condition is not necessarily strongly correlated with the time when the event occurs or the condition is met. For example, in some cases, subsequent actions may be performed immediately upon the occurrence of the event or the fulfillment of the condition; while in others, they may be performed some time after the occurrence of the event or the fulfillment of the condition.

[0028] Example Environment

[0029] Applications can implement multiple functions. As the functions of applications become more complex, it is necessary to check whether various behaviors during application operation meet expectations, for example, whether there are potential data security risks. The environment in which an application executes is described with reference to Figure 1, which shows a block diagram 100 of an application environment according to one implementation of this disclosure. As shown in Figure 1, the example environment may include a terminal device 110. In this example environment, the terminal device 110 may run an application 120 that supports user interface interaction. Application 120 may be any suitable type of application for user interface interaction, examples of which may include, but are not limited to, media applications or other suitable applications. User 140 may interact with application 120 via terminal device 110 and / or its attached devices. In the environment of Figure 1, if application 120 is active, terminal device 110 can present an interface 150 for supporting user interface interaction through application 120.

[0030] In some embodiments, terminal device 110 communicates with server 130 to provide services to application 120. Terminal device 110 can be any type of mobile terminal, fixed terminal, or portable terminal, including mobile phones, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, media computers, multimedia tablets, personal communication system (PCS) devices, personal navigation devices, personal digital assistants (PDAs), audio / video players, digital cameras / camcorders, positioning devices, television receivers, radio receivers, e-book devices, gaming devices, or any combination thereof, including accessories and peripherals of these devices or any combination thereof. In some embodiments, terminal device 110 can also support any type of user-facing interface (such as "wearable" circuitry).

[0031] Server 130 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks, and big data and artificial intelligence platforms. Server 130 may include, for example, computing systems / servers such as mainframes, edge computing nodes, computing devices in a cloud environment, etc. Server 130 can provide backend services for application 120, which supports content presentation on terminal device 110.

[0032] A communication connection can be established between server 130 and terminal device 110. This communication connection can be established via wired or wireless means. The communication connection may include, but is not limited to, Bluetooth, mobile network, Universal Serial Bus, and Wi-Fi connections, and the embodiments of this disclosure are not limited in this respect. In the embodiments of this disclosure, server 130 and terminal device 110 can achieve signaling interaction through their communication connection. It should be understood that the structure and function of various elements in the environment are described for illustrative purposes only and do not imply any limitation on the scope of this disclosure.

[0033] In the context of this disclosure, application 120 can be any of various types of application, including but not limited to video applications, short video applications, social networking applications, music applications, media sharing applications, communication applications, and so on. With the increasing number of applications and their growing complexity, data security issues are receiving increasing attention. Application developers and / or operators can manually analyze the application's code and / or runtime logs to determine whether the application's behavior meets security requirements. However, manual analysis involves a huge workload and may fail to accurately identify potentially risky behaviors due to misjudgments or omissions. Although partially automated analysis tools can improve analysis efficiency to some extent, it remains difficult to effectively manage application behavior due to the potentially massive amounts of behavior involved (e.g., millions, tens of millions, or even more). Therefore, it is desirable to categorize these massive amounts of behavior into multiple types to manage application behavior in a more effective and accurate manner.

[0034] Overview of Managing Application Behavior

[0035] To at least partially address the shortcomings of the prior art, a method for managing application behavior based on clustering is proposed according to one implementation of this disclosure. An overview of one implementation is described with reference to Figure 2, which shows a block diagram 200 for managing application behavior based on clustering according to some implementations of this disclosure. Here, application behavior may include one or more actions performed by the application. For example, one or more code segments of the application may perform an action. Actions may correspond to events in the application's log data; for example, in response to performing an action, a corresponding event may be recorded in the log data.

[0036] As shown in Figure 2, log data 250 may include multiple events (e.g., events 210-1, 210-2, 210-3, 210-4, 210-5, 210-6, etc., collectively referred to as events 210). During application execution, multiple code execution links can be obtained from the application's log data 250. These code execution links can point to code segments in the application's code that are associated with specific behaviors. Here, code execution links can be simply referred to as links. Furthermore, multiple key links can be identified from these links based on predetermined key features. Specifically, links 220 and 222 can each represent a different identified key link. It should be understood that log data may include a massive number of links; some links provide auxiliary functions and do not need to be analyzed, while others may pose security risks and therefore require analysis.

[0037] Multiple attribute sets, respectively associated with the multiple key links, can be determined. Each attribute set includes a key link among the multiple key links and at least one attribute associated with that key link. As shown in Figure 2, attribute set 230 can correspond to link 220, attribute set 232 can correspond to link 222, and so on. Here, the attribute sets can describe various aspects of information related to each behavior, thereby providing multifaceted data support for subsequent behavior management. The attribute sets can be referred to as technical facts, since they describe various technical aspects of the key links. Subsequently, at least one type of application behavior can be determined based on at least one cluster 240 of the multiple attribute sets. Attributes in the attribute sets within a cluster satisfy predetermined conditions.

[0038] With some implementations of this disclosure, key links can accurately identify data security-related behaviors, thus providing a solid foundation for data security protection. Furthermore, clustering operations can group massive amounts of behavioral data into stable behavioral types. In this way, potential risks during application operation can be accurately identified, thereby providing an effective solution for application data protection.

[0039] Detailed Process of Managing Application Behavior

[0040] Having described an overview of some implementations according to this disclosure, further details regarding methods for managing application behavior based on clustering will be described below. According to some implementations of this disclosure, in the process of obtaining multiple links from log data, multiple call stacks, respectively associated with multiple actions of the application, can be obtained from the log data. Further, for a given call stack among the multiple call stacks, a set of links for performing the corresponding action can be extracted from that call stack. In the context of this disclosure, a set of "XX" can include one or more "XX"; for example, a set of links can include one or more links. See Figure 3 for further details, which illustrates a block diagram 300 of determining links based on key features according to some implementations of this disclosure. As shown in Figure 3, call stack 310 can correspond to one or more events and can be reported by the application. Call stack 310 can include multiple links, such as links 220, 311, 312, 313, 314, etc.

[0041] The call stack 310 can be traversed in a top-down direction 330. This prioritizes the most recent log data, making it possible to detect whether recent behavior matches expectations. Alternatively and / or additionally, the call stack can be traversed in the opposite direction. According to some implementations of this disclosure, the links in a call stack can be represented in various formats and point to code segments in the application code associated with the behavior. For example, one link could be represented as com.***.os01.***.control01.onClick and be associated with the event of clicking the control control01; another could be represented as com.***.os01.***.control02.onClick and be associated with the event of clicking the control control02, and so on.

[0042] According to some implementations of this disclosure, key feature 320 may include various types, such as at least one of the following: user interaction 321, page feature 322, component lifecycle 323, application lifecycle 324, or callback 325. See Table 1 for further details on key features.

[0043] Table 1 Examples of Key Features

[0044] According to some implementations of this disclosure, a set of key links can be extracted from multiple links in the call stack based on key features. In the context of this disclosure, for ease of description, the extracted key links can be referred to as keyframes. Specifically, in the process of determining multiple key links from multiple links based on key features, at least one key link for performing an action can be determined from the multiple links based on the key features. Each call stack can be processed in a similar manner as described in Figure 3, thereby obtaining at least one key link corresponding to each call stack. Then, the at least one key link corresponding to each call stack can be combined to determine multiple key links. In other words, multiple keyframes can be determined, and these keyframes can correspond to various actions of the application. Here, an action can correspond to one or more keyframes.

[0045] According to some implementations of this disclosure, in the process of determining at least one key link for performing behavior from a set of links, key features can be used to perform text filtering on the set of links to determine at least one key link. Specifically, keywords corresponding to the key features can be determined separately, and then keyframes can be extracted through text filtering. For example, keywords corresponding to user interactions may include onClick, onDraw, etc.; and keywords corresponding to component lifecycles and application lifecycles may include onCreate, onResume, etc. Keywords can be searched in each link of the call stack, and links containing the keywords can be extracted.
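The keyword-based text filtering described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: only the keywords onClick, onDraw, onCreate, and onResume come from the text, while the feature names, the function name extract_key_links, and the com.example links are hypothetical placeholders.

```python
# Keywords per key feature. The keywords themselves come from the text;
# the dictionary keys are illustrative names, not terms from the disclosure.
KEYWORDS = {
    "user_interaction": ["onClick", "onDraw"],
    "lifecycle": ["onCreate", "onResume"],
}


def extract_key_links(call_stack):
    """Return (link, matched_features) pairs for links containing any keyword."""
    key_links = []
    for link in call_stack:  # the list is assumed to be ordered top-down
        matched = [
            feature
            for feature, words in KEYWORDS.items()
            if any(word in link for word in words)
        ]
        if matched:
            key_links.append((link, matched))
    return key_links


# Hypothetical call stack; the package names are placeholders.
call_stack = [
    "com.example.os01.ui.control01.onClick",
    "com.example.os01.util.Logger.write",
    "com.example.os01.ui.MainPage.onCreate",
]
key_links = extract_key_links(call_stack)
```

Because a link can match keywords of several key features, the sketch keeps a list of matched features per link rather than a single label.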

[0046] Alternatively and / or additionally, in determining at least one key link for performing an action from a set of links, the at least one key link can be determined based on the distance between each link in the set and the key features. Specifically, the text distance between a link and the keywords of a key feature can be determined using a predetermined text processing algorithm, thereby extracting keyframes.
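A distance-based variant might look like the sketch below. The disclosure does not name the text processing algorithm, so the use of Python's difflib.SequenceMatcher, the comparison against only the last segment of the link, and the threshold MAX_DISTANCE are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

KEYWORDS = ["onClick", "onCreate", "onResume"]
MAX_DISTANCE = 0.3  # hypothetical threshold, not taken from the disclosure


def text_distance(a, b):
    """Distance in [0, 1]; 0 means the strings are equal ignoring case."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_key_link(link):
    """Treat a link as a key link if its last segment is close to a keyword."""
    last_segment = link.rsplit(".", 1)[-1]
    nearest = min(text_distance(last_segment, kw) for kw in KEYWORDS)
    return nearest <= MAX_DISTANCE
```

Compared with exact keyword search, a distance threshold tolerates small naming variations (for example, a handler named onClicked), at the cost of having to tune the threshold.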

[0047] It should be understood that there can be a many-to-many relationship between links and key features. For example, a link may include multiple keywords corresponding to different key features; alternatively and / or additionally, multiple links may include the same keyword. Using some implementations of this disclosure, multiple keyframes to be analyzed can be accurately extracted by traversal, avoiding omissions.

[0048] According to some implementations of this disclosure, multiple attribute sets, respectively associated with the multiple key links, can be determined. Each attribute set includes a key link among the multiple key links and at least one attribute associated with that key link. Further details regarding the attribute sets are described with reference to Figure 4, which shows a block diagram 400 of the structure of an attribute set according to some implementations of this disclosure. As shown in Figure 4, the attribute set may include a keyframe 411 and at least one attribute of that keyframe. Specifically, the at least one attribute may include at least any of the following associated with the key link: network request path 412, domain 413, service node 414, client 415, and stack 416. Alternatively and / or additionally, the at least one attribute may include a text description 417.

[0049] According to some implementations of this disclosure, clustering can be performed based on the values of the attributes in the attribute sets, and the attributes of the attribute sets within a cluster can satisfy predetermined conditions. For example, attribute sets with the same domain can be grouped into the same cluster, attribute sets with the same service node can be grouped into the same cluster, and so on. Alternatively and / or additionally, the multiple attributes in an attribute set can be treated as multiple dimensions of that attribute set, and the clustering process can be performed based on the distance between attribute sets across these dimensions.
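The simplest case above, grouping attribute sets that share the same value for one attribute, can be sketched as below. The function name cluster_by_attribute and all attribute values are hypothetical; the distance-based alternative would require an additional choice of distance measure and clustering algorithm, which the text does not specify.

```python
from collections import defaultdict


def cluster_by_attribute(attribute_sets, attribute):
    """Group attribute sets that share the same value for one attribute."""
    clusters = defaultdict(list)
    for attribute_set in attribute_sets:
        clusters[attribute_set[attribute]].append(attribute_set)
    return dict(clusters)


# Hypothetical attribute sets; every value here is a placeholder.
attribute_sets = [
    {"keyframe": "com.example.os01.control01.onClick",
     "domain": "a.example.com", "service_node": "node01"},
    {"keyframe": "com.example.os01.control02.onClick",
     "domain": "a.example.com", "service_node": "node02"},
    {"keyframe": "com.example.os01.MainPage.onCreate",
     "domain": "b.example.com", "service_node": "node01"},
]
clusters_by_domain = cluster_by_attribute(attribute_sets, "domain")
clusters_by_node = cluster_by_attribute(attribute_sets, "service_node")
```

Here the predetermined condition is equality on a single attribute, so the same attribute sets fall into different clusters depending on which attribute is chosen.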

[0050] Here, keyframe 411 refers to the extracted key link, such as com.***.os01.***.control01.onClick in the example above. Path 412 can represent the path of a network request associated with this keyframe, such as a path on the server called "/app/plugin/config". Domain 413 can represent the address of the corresponding domain, such as "api-boot / ***.com". Service node 414 can represent a node in the server used to provide services, such as "app.***.api". Client 415 can represent the client device running the application, such as "device01". Stack 416 can represent the corresponding call stack, for example, by pointing to the entry address of the stack called "stack01".
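One way to picture the attribute set of Figure 4 is as a simple record type. The sketch below is an assumption about representation only: the class name AttributeSet and the field names are illustrative, and the example values are copied from the text with the masked parts (***) left as they appear.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AttributeSet:
    """Illustrative record for one keyframe and its attributes (Figure 4)."""
    keyframe: str                            # 411: the extracted key link
    path: str                                # 412: network request path
    domain: str                              # 413: domain address
    service_node: str                        # 414: server node providing the service
    client: str                              # 415: client device running the application
    stack: str                               # 416: reference to the call stack
    text_description: Optional[str] = None   # 417: optional natural-language description


# Values taken from the examples in the text; masked parts are kept as-is.
example = AttributeSet(
    keyframe="com.***.os01.***.control01.onClick",
    path="/app/plugin/config",
    domain="api-boot / ***.com",
    service_node="app.***.api",
    client="device01",
    stack="stack01",
)
record = asdict(example)
```

Keeping the text description optional reflects that it is described as an alternative or additional attribute rather than a required one.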

[0051] According to some implementations of this disclosure, the text description 417 can be expressed in natural language and used to describe the behavior. Specifically, the text description may include at least one of the following associated with the behavior: a scenario, indicating the initiator that triggers the behavior; or a triggering reason, indicating at least one of the following: an interactive behavior, a lifecycle change of a page or the application, or a page change. A scenario may represent, for example, a specific page (e.g., a Fragment), an activity (Activity), a business function, a business scenario, or a function involved in the behavior. A triggering reason may represent the specific cause in the application that leads to the behavior, such as a user's interactive behavior, a lifecycle change of a page or the application, or a page change. Using some implementations of this disclosure, the text description can provide richer information about the behavior of the application, thereby facilitating the determination of the type of behavior.

[0052] According to some implementations of this disclosure, at least one text interpretation of at least one key link can be determined based on a database associated with the application. This database can store various types of knowledge defined in the application. A text description of the behavior can then be determined based on the at least one text interpretation. See Figure 5 for more information, which shows a block diagram 500 for determining text descriptions of behavior according to some implementations of this disclosure.

[0053] As shown in Figure 5, during the operation of application 120, a set of key links associated with the behavior can be extracted from the log data 510 of application 120, such as link 512. The application may have an associated database 520, which may include various functions related to the application's code. Database 520 can be defined by the application's developers. Based on the database 520 associated with application 120, a text interpretation (e.g., text interpretation 530) can be determined for each link. Specifically, one link may correspond to one text interpretation describing the corresponding function of that link. Further, based on a set of text interpretations, a text description 540 describing the behavior can be determined. Here, the text description 540 is expressed in natural language. Specifically, the text description 540 may include various aspects, such as the scenario and / or triggering reason related to the behavior.

[0054] In applications involving large amounts of code and complex functionality, a vast number of behaviors will occur, making it difficult to determine the specific content of each behavior through manual analysis. Using implementations of this disclosure, the specific content of each behavior in the application can be automatically analyzed based on a database, thereby significantly reducing the workload of manual analysis and improving the efficiency and accuracy of application management. Specifically, machine learning models can be used to perform the management process; for example, language models can be used to understand and interpret one or more key links, thereby achieving automated interpretation of the application's key behaviors.

[0055] According to some implementations of this disclosure, the database may include a mapping between links and text interpretations. See Table 2 for further details, which shows examples of such mappings.

[0056] Table 2 Examples of mapping relationships

[0057] As shown in Table 2, the strings are strings that may appear in a link, and the text interpretations are the interpretations corresponding to those strings. It should be understood that different applications can have their own databases, and these databases can be defined by the application developers. Alternatively and / or additionally, different applications can share the same database, thereby allowing for more general management of the different behaviors of different applications.

[0058] According to some implementations of this disclosure, in determining at least one text interpretation for at least one key link, for each key link among the at least one key link, the text interpretation associated with that key link can be determined based on the mapping relationship. Each key link can be processed in a similar manner to determine the text interpretations associated with all key links.

[0059] Specifically, the database can be searched for individual strings within the link, and a corresponding text explanation can be provided upon detecting a particular string. It should be understood that a link can include one or more strings; for example, a link might include "inbox" and "click," in which case the text explanation could include, for example, "inbox" or "click." It should also be understood that the text explanation may only include a discrete set of descriptive words related to the behavior; these descriptive words are merely intermediate data. Since the mapping relationship includes the functional descriptions corresponding to each string in the link, the corresponding text explanations can be extracted from each key link in this way, serving as the basis for generating the text description.

[0060] According to some implementations of this disclosure, in determining the text interpretation associated with a link, at least one string can be determined based on the delimiters in the link. Furthermore, at least one string can be matched in the mapping relationship to determine the text interpretation. For example, an example link can be represented as: com.***.os01.***.inbox.onClick. In this case, the delimiter "dot" can be used to divide the link into multiple strings: com, ***, os01, ***, inbox, onClick. Based on the mapping relationship shown in Table 2, it can be determined that the link includes the string "inbox," thereby determining that the text interpretation includes: inbox.
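The delimiter-based lookup described in this paragraph can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the disclosed implementation: the `MAPPING` dictionary is a hypothetical stand-in for Table 2, and `split_link` and `interpret_link` are illustrative helper names.

```python
# Hypothetical stand-in for the mapping relationship of Table 2.
MAPPING = {
    "inbox": "inbox",
    "click": "click",
}


def split_link(link: str, delimiter: str = ".") -> list[str]:
    """Divide a link into strings at each delimiter, dropping empty pieces."""
    return [part for part in link.split(delimiter) if part]


def interpret_link(link: str, mapping: dict[str, str]) -> list[str]:
    """Collect the text interpretation of every string found in the mapping."""
    interpretations: list[str] = []
    for part in split_link(link):
        explanation = mapping.get(part.lower())
        if explanation is not None and explanation not in interpretations:
            interpretations.append(explanation)
    return interpretations


link = "com.***.os01.***.inbox.onClick"
strings = split_link(link)
interpretation = interpret_link(link, MAPPING)
print(strings)
print(interpretation)
```

With whole-string matching only, "onClick" is not found in the assumed mapping, so the interpretation contains just "inbox", which is the result stated in the paragraph.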

[0061] According to some implementations of this disclosure, in the process of determining a string, the string can be split into multiple substrings. For example, the string "onClick" can be split into the substrings "on" and "click", thereby determining that the link further includes "click", and thus determining that the text interpretation includes: inbox, click. As another example, the string "ProfileTab***" can be split into: "profile", "tab", and "***". Specifically, the splitting can be performed according to the capitalization of the characters in the string, or the capitalization can be ignored. Alternatively and / or additionally, partial matching can be performed; for example, assuming the mapping relationship includes a long string (made up of string 1 and string 2), a link that contains string 1 and / or string 2 can be treated as matching that long string. In this way, the link can be processed at a finer granularity, thereby improving the accuracy of the text interpretation.
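The substring splitting and partial matching described in this paragraph can be illustrated with a short Python sketch. The regular expression, the helper names `split_substrings` and `partial_match`, and the sample mapping are all assumptions made for illustration; the disclosure does not prescribe a particular splitting rule.

```python
import re

# Split at capitalization boundaries; digit runs and symbol runs are kept
# as separate pieces. This pattern is an illustrative assumption.
_PIECES = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+|[^A-Za-z\d]+")


def split_substrings(token: str) -> list[str]:
    """Split a string into lowercase substrings based on capitalization."""
    return [piece.lower() for piece in _PIECES.findall(token)]


def partial_match(substrings: list[str], mapping: dict[str, str]) -> list[str]:
    """Return explanations whose mapping key shares a substring with the link."""
    hits: list[str] = []
    for key, explanation in mapping.items():
        key_parts = split_substrings(key)
        if any(part in substrings for part in key_parts) and explanation not in hits:
            hits.append(explanation)
    return hits


print(split_substrings("onClick"))
print(split_substrings("ProfileTab***"))
# Hypothetical mapping entry keyed by a long string made of two parts.
print(partial_match(split_substrings("ProfileView"), {"ProfileTab": "profile tab"}))
```

Lowercasing the pieces corresponds to the option of ignoring capitalization after the split; a stricter variant could compare pieces case-sensitively.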

[0062] According to some implementations of this disclosure, text interpretations can be determined in various ways. For example, text interpretations can be determined by text analysis or by calling a machine learning model. In determining a set of text interpretations for a set of links, prompt words can be identified. These prompt words can instruct the machine learning model to determine a set of text interpretations for a set of links based on mapping relationships. Furthermore, the machine learning model's response to the prompt words can be received. See Figure 6 for further details, which shows a block diagram 600 of determining text interpretations for links using a machine learning model according to some implementations of this disclosure. As shown in Figure 6, a set of key links 610 can include multiple links. Prompt words 620 can be determined, which can be used to call a machine learning model 630 to output a corresponding set of text interpretations based on a set of key links 610 and a database 520. Prompt words 620 can have different structures; see Table 3 for examples of prompt words.

[0063] Table 3 Examples of prompt words

[0064] As shown in Table 3, part 1 can specify the task to be performed by the machine learning model, part 2 can specify relevant examples for performing the task, and part 3 can specify one or more key links to be processed. The prompt words can be input into the machine learning model, which will then use the mapping relationships in database 520 to output a text explanation for each key link. Utilizing some implementations of this disclosure, the processing capabilities of the machine learning model can be leveraged to analyze the strings in the key links, thereby providing corresponding text explanations.

[0065] According to some implementations of this disclosure, a machine learning model can be used to process a set of text interpretations to determine the corresponding text description. In this case, the prompt words further instruct the machine learning model to determine the text description used to describe the behavior based on the set of text interpretations. Specifically, Table 4 shows examples of prompt words used to determine the text description.

[0066] Table 4 Examples of prompt words

[0067] As shown in Table 4, part 1 can specify the task to be performed by the machine learning model, part 2 can specify relevant examples for performing the task, and part 3 can specify a set of text interpretations to be processed. The prompt word can be input into the machine learning model so that it can output the corresponding text description. Using some implementations of this disclosure, the powerful processing capabilities of the machine learning model can be invoked to obtain text descriptions of behaviors expressed in natural language based on each text interpretation.

[0068] According to some implementations of this disclosure, the prompt words may include more complex structures. For example, the prompt words may include: task description, steps, examples, and input data. Specifically, the task description may specify the task to be performed by the machine learning model, the steps may specify one or more steps required to perform the specified task, the examples may include one or more examples related to the task description (e.g., including input data and output data), and the input data may include the data to be processed that will be fed into the machine learning model.

[0069] According to some implementations of this disclosure, prompt words can be generated, and the steps specify that the following two steps are performed by a machine learning model: determining a set of text interpretations based on the call stack and the database, and determining a text description of the behavior based on the set of text interpretations.

[0070] Table 5 Examples of prompt words

[0071] As shown in Table 5, part 1 can specify the task to be performed by the machine learning model, part 2 can specify multiple steps (step 1 and step 2) to perform the task, part 3 can specify relevant examples for performing the task, and part 4 can specify multiple key links to be processed. The prompt words can be input into the machine learning model, which then outputs the corresponding text description. Using some implementations of this disclosure, the processing capabilities of the machine learning model can be invoked to obtain a natural-language text description of the behavior based on the various text interpretations.
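A four-part prompt of the kind described here (task description, steps, examples, input data) could be assembled as in the following Python sketch. The `Prompt` class, its field names, and the sample wording are hypothetical; the actual contents of Table 5 are not reproduced, and no particular machine learning model or API is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    """Hypothetical container for the four prompt parts."""
    task_description: str
    steps: list[str] = field(default_factory=list)
    examples: list[tuple[str, str]] = field(default_factory=list)  # (input, output)
    input_data: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the parts into one text block that can be sent to a model."""
        lines = ["Task: " + self.task_description]
        if self.steps:
            lines.append("Steps:")
            lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.examples:
            lines.append("Examples:")
            for given, expected in self.examples:
                lines.append(f"Input: {given}")
                lines.append(f"Output: {expected}")
        lines.append("Input data:")
        lines += self.input_data
        return "\n".join(lines)


prompt = Prompt(
    task_description="Describe the application behavior behind the key links.",
    steps=[
        "Determine a text interpretation for each key link using the mapping.",
        "Determine a text description of the behavior from the interpretations.",
    ],
    examples=[("com.app.inbox.onClick", "The user clicked in the inbox.")],
    input_data=["com.***.os01.***.inbox.onClick"],
)
text = prompt.render()
print(text)
```

Keeping the parts as separate fields makes it straightforward to build the simpler three-part prompts as well, by leaving `steps` empty.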

[0072] According to some implementations of this disclosure, the methods described above can be executed on server 130 and / or any other computing device as shown in Figure 1. Using the implementations of this disclosure, the specific content of various behaviors of the application can be automatically analyzed based on database 520, thereby greatly reducing the workload of manual analysis and improving the efficiency and accuracy of application management. Specifically, compared to traditional manual analysis methods, the technical solution of this disclosure can greatly improve analysis efficiency by automatically extracting key links and using machine learning models to provide interpretations.

[0073] By utilizing some implementation methods of this disclosure, the accuracy of analysis can be improved. Manual analysis is easily influenced by subjective factors, leading to inaccurate results. The technical solution of this disclosure, based on call stacks and powerful machine learning models, can improve the accuracy and consistency of analysis. Furthermore, the technical solution of this disclosure can adapt to complex behavioral patterns. For complex application behaviors, traditional methods struggle to accurately extract and interpret key information. The technical solution of this disclosure can better adapt to complex behavioral patterns, providing a more comprehensive and accurate description of behavior. Furthermore, the technical solution of this disclosure can improve versatility, thus being applicable to various types of applications and behaviors.

[0074] According to some implementations of this disclosure, in the process of determining at least one type of behavior of an application based on at least one cluster of multiple attribute sets, at least one cluster of multiple attribute sets can be determined based on at least one link cluster of multiple key links; and for a cluster in at least one cluster, the type of behavior corresponding to the cluster is determined based on the text interpretation associated with the cluster.

[0075] The method described above can be used to determine the type of behavior. Assume there are multiple types: Type 1 represents behavior that may cause data security risk 1, Type 2 represents behavior that may cause data security risk 2, and so on. In this case, Type 1 can correspond to cluster 1 of the attribute sets, and cluster 1 can include multiple attribute sets. In the notation for attribute sets, subscripts represent cluster numbers and superscripts represent attribute set numbers within a cluster. Similarly, Type 2 can correspond to cluster 2 of the attribute sets, and cluster 2 can include multiple attribute sets, and so on. Based on the methods described above, the attribute set associated with a particular behavior can be determined. Then, based on the cluster to which this attribute set belongs, the specific type of the behavior can be determined. For example, suppose the attribute set determined for a behavior belongs to cluster 2. The behavior is then identified as Type 2 and may trigger data security risk 2.

[0076] According to some implementations of this disclosure, various attribute sets and their corresponding clusters can be stored in a database. See Figure 7 for further details, which shows a block diagram 700 of a database for storing clusters of attribute sets according to some implementations of this disclosure. As shown in Figure 7, the database may include: an ID 711 for the attribute set, and a cluster 713 corresponding to that attribute set. Alternatively and / or additionally, the database may further include: an ID 714 for an administrator managing the attribute sets, and other fields that help improve application management performance.
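The storage layout of Figure 7 can be sketched with an in-memory SQLite table. This is only an illustration under stated assumptions: the table name, column names, and sample rows are hypothetical, and the disclosure does not specify a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE attribute_set_clusters (
        attribute_set_id TEXT PRIMARY KEY,  -- corresponds to ID 711
        cluster          TEXT NOT NULL,     -- corresponds to cluster 713
        administrator_id TEXT               -- corresponds to ID 714 (optional)
    )
    """
)

# Hypothetical sample rows.
rows = [
    ("as-1", "cluster-1", "admin-a"),
    ("as-2", "cluster-1", "admin-a"),
    ("as-3", "cluster-2", None),
]
conn.executemany("INSERT INTO attribute_set_clusters VALUES (?, ?, ?)", rows)


def attribute_sets_in(cluster: str) -> list[str]:
    """Return the IDs of all attribute sets stored for the given cluster."""
    cursor = conn.execute(
        "SELECT attribute_set_id FROM attribute_set_clusters "
        "WHERE cluster = ? ORDER BY attribute_set_id",
        (cluster,),
    )
    return [row[0] for row in cursor.fetchall()]


print(attribute_sets_in("cluster-1"))
```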

[0077] According to some implementations of this disclosure, the key link extraction process can map massive numbers of events (e.g., millions or more) to multiple key links (e.g., a smaller order of magnitude, such as hundreds of thousands), and the clustering process can then map these key links to an even smaller number of clusters of attribute sets (e.g., approximately ten thousand). In this way, the extremely large volume of data to be processed can be reduced to an acceptably smaller order of magnitude. The clustering-related data can be stored in a database for subsequent management and application.

[0078] According to some implementations of this disclosure, defined clusters can be used to manage application behavior. During application runtime, it may be necessary to check whether various behaviors of the application conform to expectations; for example, multiple behaviors that may pose data security risks can be detected. Specifically, in response to receiving a query request for behavior of a target type, a target cluster matching the target type of behavior can be determined from at least one cluster; and a set of target attributes corresponding to the target cluster can be provided from multiple attribute sets.

[0079] Suppose a query request specifies a behavior that may cause data security risk 1. In this case, the target type can be determined as "Type 1," and a target cluster matching "Type 1" can be identified from the at least one cluster. Subsequently, a set of target attributes corresponding to this target cluster can be determined from the multiple attribute sets. Specifically, the attribute sets belonging to the target cluster can be determined, and by analyzing these attribute sets, potential risks in the application can be identified, leading to modifications to the application's code. Utilizing some implementations disclosed herein, massive numbers of behaviors can be mapped to a manageable number of clusters. This approach reduces the workload involved in application management and improves the efficiency of application development and management.
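The query flow in this paragraph (target type, then target cluster, then attribute sets) can be sketched in Python as follows. The dictionaries, the field names, and the `query_behavior` function are hypothetical and serve only to make the lookup order concrete.

```python
# Hypothetical association between behavior types and clusters.
TYPE_TO_CLUSTER = {"Type 1": "cluster 1", "Type 2": "cluster 2"}

# Hypothetical attribute sets, each holding a key link and one attribute.
ATTRIBUTE_SETS = [
    {"id": "a1", "cluster": "cluster 1", "key_link": "com.app.contacts.read", "domain": "api.example"},
    {"id": "a2", "cluster": "cluster 1", "key_link": "com.app.contacts.sync", "domain": "api.example"},
    {"id": "a3", "cluster": "cluster 2", "key_link": "com.app.location.fetch", "domain": "geo.example"},
]


def query_behavior(target_type: str) -> list[dict]:
    """Resolve the target cluster for a type and return its attribute sets."""
    target_cluster = TYPE_TO_CLUSTER.get(target_type)
    if target_cluster is None:
        return []  # no cluster matches the requested type
    return [a for a in ATTRIBUTE_SETS if a["cluster"] == target_cluster]


result = query_behavior("Type 1")
print([a["id"] for a in result])
```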

[0080] By utilizing some implementations disclosed herein and through cross-thread call stacks, user behaviors, page lifecycle behaviors, and business attributes related to data security can be accurately identified, and key links that fully express the identity, scenario, and triggering behavior of the caller can be constructed. The clustering process can group a large number of similar data security-related behaviors under a single label, forming a stable set of attribute labels related to data security. Furthermore, machine learning and data mining techniques can be used to conduct in-depth analysis of the clustered attribute labels, thereby identifying behaviors that may lead to potential data security risks.

[0081] Example process

[0082] Figure 8 illustrates a flowchart of a method 800 for managing application behavior based on clustering, according to some implementations of this disclosure. At block 810, during application runtime, multiple links are obtained from the application's log data, each link pointing to a code segment within the application's code that is associated with a behavior. At block 820, multiple key links are determined from the multiple links based on predetermined key features. At block 830, multiple attribute sets respectively associated with the multiple key links are determined, each attribute set including a key link and at least one attribute associated with that key link. At block 840, based on at least one cluster of the multiple attribute sets, at least one type of application behavior is determined, where attributes in the attribute sets within the cluster satisfy a predetermined condition.
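Blocks 810 to 840 can be pictured as a small pipeline. The Python sketch below is a simplified illustration under several assumptions: each log line is taken to have the form "<timestamp> <link>", key features are matched as substrings, and the predetermined condition for a cluster is taken to be equality on chosen fields. None of these choices is specified by the disclosure, and all function names are hypothetical.

```python
from collections import defaultdict


def acquire_links(log_lines: list[str]) -> list[str]:
    """Block 810: read one link per log line (assumed '<timestamp> <link>')."""
    links = []
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            links.append(parts[1].strip())
    return links


def select_key_links(links: list[str], key_features: tuple[str, ...]) -> list[str]:
    """Block 820: keep links that contain any predetermined key feature."""
    return [l for l in links if any(f in l.lower() for f in key_features)]


def build_attribute_sets(key_links: list[str], attributes: dict[str, dict]) -> list[dict]:
    """Block 830: pair each key link with its associated attributes."""
    return [{"key_link": l, **attributes.get(l, {})} for l in key_links]


def cluster_attribute_sets(attribute_sets: list[dict], fields: tuple[str, ...]) -> dict:
    """Block 840: group attribute sets whose chosen fields are equal."""
    clusters = defaultdict(list)
    for attribute_set in attribute_sets:
        clusters[tuple(attribute_set.get(f) for f in fields)].append(attribute_set)
    return dict(clusters)


log = [
    "t1 com.app.inbox.onClick",
    "t2 com.app.util.Logger.log",
    "t3 com.app.inbox.onClick",
    "t4 com.app.profile.onResume",
]
attrs = {
    "com.app.inbox.onClick": {"domain": "mail.example"},
    "com.app.profile.onResume": {"domain": "user.example"},
}
key_links = select_key_links(acquire_links(log), ("inbox", "profile"))
clusters = cluster_attribute_sets(build_attribute_sets(key_links, attrs), ("key_link", "domain"))
print(len(key_links), len(clusters))
```

Each resulting cluster would then be assigned a behavior type, for example through the text interpretation associated with the cluster.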

[0083] According to some implementations of this disclosure, obtaining multiple links from log data includes: obtaining, from the log data, multiple call stacks respectively associated with multiple behaviors of the application; and extracting, from a call stack among the multiple call stacks, a set of links for executing a behavior among the multiple behaviors.
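As a rough illustration of extracting links from call stacks, the Python sketch below assumes the log data holds Java-style stack frames ("at package.Class.method(File.java:line)") with a blank line between call stacks. The log format, the `behavior=` header lines, and the function name are assumptions; the disclosure does not define the log format.

```python
import re

# Matches a Java-style stack frame and captures the fully qualified method.
FRAME = re.compile(r"^\s*at\s+([\w.$<>]+)\(")


def extract_call_stacks(log_text: str) -> list[list[str]]:
    """Return one list of links per call stack found in the log text."""
    stacks: list[list[str]] = []
    current: list[str] = []
    for line in log_text.splitlines():
        match = FRAME.match(line)
        if match:
            current.append(match.group(1))
        elif current and not line.strip():
            # A blank line closes the call stack collected so far.
            stacks.append(current)
            current = []
    if current:
        stacks.append(current)
    return stacks


sample_log = """
behavior=read_contacts
    at com.app.contacts.Reader.read(Reader.java:42)
    at com.app.inbox.InboxView.onClick(InboxView.java:17)

behavior=get_location
    at com.app.location.Locator.fetch(Locator.java:8)
"""
stacks = extract_call_stacks(sample_log)
print(stacks)
```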

[0084] According to some implementations of this disclosure, determining multiple key links from multiple links based on key features includes: determining at least one key link for performing behavior from a set of links based on key features.

[0085] According to some implementations of this disclosure, determining at least one key link for performing an action from a set of links includes at least one of the following: performing text filtering on a set of links using key features to determine at least one key link; or determining at least one key link based on the distance between a set of links and key features.
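The two selection options in this paragraph, text filtering and distance-based selection, can be compared in a short Python sketch. The distance used here (one minus the `difflib.SequenceMatcher` similarity ratio, taken over the dot-separated strings of a link) and the 0.3 threshold are illustrative assumptions; the disclosure does not name a specific distance measure.

```python
from difflib import SequenceMatcher


def filter_by_text(links: list[str], key_features: list[str]) -> list[str]:
    """Text filtering: keep links that literally contain a key feature."""
    return [l for l in links if any(f.lower() in l.lower() for f in key_features)]


def distance(link: str, feature: str) -> float:
    """Smallest dissimilarity between the feature and any string in the link."""
    tokens = link.lower().split(".")
    return min(1.0 - SequenceMatcher(None, t, feature.lower()).ratio() for t in tokens)


def select_by_distance(links: list[str], key_features: list[str], threshold: float = 0.3) -> list[str]:
    """Distance-based selection: keep links close enough to a key feature."""
    return [l for l in links if any(distance(l, f) <= threshold for f in key_features)]


links = ["com.app.inbox.onClick", "com.app.in_box.open", "com.app.util.log"]
features = ["inbox"]
print(filter_by_text(links, features))
print(select_by_distance(links, features))
```

In this example the distance-based option also selects the link containing "in_box", which plain text filtering misses; the trade-off is that a looser threshold can admit unrelated links.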

[0086] According to some implementations of this disclosure, at least one attribute includes a text description for describing the behavior, the text description being expressed in natural language and including at least one of the following associated with the behavior: scenario, indicating the initiator that triggered the behavior; or triggering reason, indicating at least one of the following: interactive behavior, page cycle or application cycle change, page change.

[0087] According to some implementations of this disclosure, the method further includes: determining at least one text interpretation of at least one key link based on a database associated with the application; and determining a text description for describing the behavior based on the at least one text interpretation.

[0088] According to some implementations of this disclosure, the database includes a mapping relationship between links and text interpretations, and determining at least one text interpretation for at least one key link includes: for a key link among at least one key link, determining the text interpretation associated with the key link based on the mapping relationship.

[0089] According to some implementations of this disclosure, determining at least one type of application behavior based on at least one cluster of multiple attribute sets includes: determining at least one cluster of multiple attribute sets based on at least one link cluster of multiple key links; and, for a cluster in the at least one cluster, determining the type of behavior corresponding to the cluster based on the text interpretation associated with the cluster.

[0090] According to some implementations of this disclosure, at least one attribute further includes at least one of the following associated with the key link: network request path, domain, service node, client, stack, administrator.

[0091] According to some implementations of this disclosure, the method further includes: in response to receiving a query request for a behavior of a target type, determining a target cluster from at least one cluster that matches the behavior of the target type; and providing a set of target attributes corresponding to the target cluster from multiple attribute sets.

[0092] Example devices and equipment

[0093] Figure 9 shows a block diagram of an apparatus 900 for managing application behavior based on clustering, according to some implementations of the present disclosure. The apparatus includes:

[0094] an acquisition module, configured to obtain multiple links from the application's log data during application runtime, each link pointing to a code segment in the application's code that is associated with a behavior;

[0095] a link determination module, configured to determine multiple key links from the multiple links based on predetermined key features;

[0096] an attribute determination module, configured to determine multiple attribute sets respectively associated with the multiple key links, each attribute set including a key link among the multiple key links and at least one attribute associated with that key link; and

[0097] a type determination module, configured to determine at least one type of application behavior based on at least one cluster of the multiple attribute sets, wherein attributes in the attribute sets of the cluster satisfy a predetermined condition.

[0098] According to some implementations of this disclosure, the acquisition module is further configured to: obtain, from the log data, multiple call stacks respectively associated with multiple behaviors of the application; and extract, from a call stack among the multiple call stacks, a set of links for executing a behavior among the multiple behaviors.

[0099] According to some implementations of this disclosure, the link determination module is further configured to: determine at least one key link from a set of links for performing actions based on key features.

[0100] According to some implementations of this disclosure, the link determination module is further configured to: perform text filtering on a set of links using key features to determine at least one key link; or determine at least one key link based on the distance between a set of links and key features.

[0101] According to some implementations of this disclosure, at least one attribute includes a text description for describing the behavior, the text description being expressed in natural language and including at least one of the following associated with the behavior: scenario, indicating the initiator that triggered the behavior; or triggering reason, indicating at least one of the following: interactive behavior, page cycle or application cycle change, page change.

[0102] According to some implementations of this disclosure, the apparatus further includes a processing module configured to: determine at least one text interpretation of at least one key link based on a database associated with the application; and determine a text description for describing the behavior based on the at least one text interpretation.

[0103] According to some implementations of this disclosure, the database includes a mapping relationship between links and text interpretations, and the processing module is further configured to: for a key link among the at least one key link, determine the text interpretation associated with the key link based on the mapping relationship.

[0104] According to some implementations of this disclosure, the type determination module is further configured to: determine at least one cluster of multiple attribute sets based on at least one link cluster of multiple key links; and, for a cluster in at least one cluster, determine the type of behavior corresponding to the cluster based on a text interpretation associated with the cluster.

[0105] According to some implementations of this disclosure, at least one attribute further includes at least one of the following associated with the key link: network request path, domain, service node, client, stack, administrator.

[0106] According to some implementations of this disclosure, the processing module is further configured to: in response to receiving a query request for a behavior of a target type, determine a target cluster from at least one cluster that matches the behavior of the target type; and provide a set of target attributes corresponding to the target cluster from multiple attribute sets.

[0107] Figure 10 shows a block diagram of a device 1000 capable of implementing various implementations of the present disclosure. It should be understood that the computing device 1000 shown in Figure 10 is merely exemplary and should not constitute any limitation on the functionality and scope of the implementations described herein. The computing device 1000 shown in Figure 10 can be used to implement the methods described above.

[0108] As shown in Figure 10, the computing device 1000 is in the form of a general-purpose computing device. Components of the computing device 1000 may include, but are not limited to, one or more processors or processing units 1010, memory 1020, storage devices 1030, one or more communication units 1040, one or more input devices 1050, and one or more output devices 1060. The processing unit 1010 may be a physical or virtual processor and is capable of performing various processes according to programs stored in memory 1020. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to improve the parallel processing capability of the computing device 1000.

[0109] Computing device 1000 typically includes multiple computer storage media. Such media can be any available media accessible to computing device 1000, including but not limited to volatile and non-volatile media, removable and non-removable media. Memory 1020 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Storage device 1030 can be removable or non-removable media and may include machine-readable media, such as flash drives, disks, or any other media capable of storing information and / or data (e.g., training data for training) and accessible within computing device 1000.

[0110] The computing device 1000 may further include additional removable / non-removable, volatile / non-volatile storage media. Although not shown in FIG. 10, disk drives for reading from or writing to removable, non-volatile disks (e.g., "floppy disks") and optical disk drives for reading from or writing to removable, non-volatile optical disks may be provided. In these cases, each drive may be connected to a bus (not shown) via one or more data media interfaces. The memory 1020 may include a computer program product 1025 having one or more program modules configured to perform various methods or actions of various implementations of this disclosure.

[0111] The communication unit 1040 enables communication with other computing devices via a communication medium. Additionally, the components of the computing device 1000 can function as a single computing cluster or multiple computing machines capable of communicating via communication connections. Therefore, the computing device 1000 can operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another network node.

[0112] Input device 1050 can be one or more input devices, such as a mouse, keyboard, trackball, etc. Output device 1060 can be one or more output devices, such as a monitor, speaker, printer, etc. Computing device 1000 can also communicate with one or more external devices (not shown) via communication unit 1040 as needed. These external devices include storage devices, display devices, etc., and can communicate with one or more devices that enable user interaction with computing device 1000, or with any device (e.g., network card, modem, etc.) that enables computing device 1000 to communicate with one or more other computing devices. Such communication can be performed via input / output (I / O) interface (not shown).

[0113] According to an implementation of this disclosure, a computer-readable storage medium is provided, on which computer-executable instructions are stored, wherein the computer-executable instructions are executed by a processor to implement the method described above. According to an implementation of this disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, which are executed by a processor to implement the method described above. According to an implementation of this disclosure, a computer program product is provided, on which a computer program is stored, which, when executed by a processor, implements the method described above.

[0114] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products implemented according to this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0115] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0116] Computer-readable program instructions can be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions that execute on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.

[0117] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0118] Various implementations of this disclosure have been described above. These descriptions are exemplary and not exhaustive, nor are they limited to the disclosed implementations. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described implementations. The terminology used herein is chosen to best explain the principles, practical applications, or improvements to technology in the market, or to enable others skilled in the art to understand the various implementations disclosed herein.

Claims

1. A method for managing application behavior based on clustering, comprising: during operation of an application, obtaining multiple links from log data of the application, each of the multiple links pointing to a code segment in the code of the application that is associated with a behavior; determining, based on predetermined key features, multiple key links from the multiple links; determining multiple attribute sets respectively associated with the multiple key links, each of the multiple attribute sets including a key link among the multiple key links and at least one attribute associated with the key link; and determining, based on at least one cluster of the multiple attribute sets, at least one type of the behavior of the application, wherein the attributes in the attribute sets of the cluster satisfy a predetermined condition.

2. The method of claim 1, wherein obtaining the multiple links from the log data comprises: obtaining, from the log data, multiple call stacks respectively associated with multiple behaviors of the application; and extracting, from a call stack among the multiple call stacks, a set of links for executing a behavior among the multiple behaviors.

3. The method of claim 2, wherein determining the plurality of key links from the plurality of links based on the key features comprises: determining, based on the key features, at least one key link for performing the behavior from the set of links.

4. The method of claim 3, wherein determining the at least one key link for performing the behavior from the set of links comprises at least one of the following: performing text filtering on the set of links using the key features to determine the at least one key link; or determining the at least one key link based on a distance between the set of links and the key features.

5. The method of claim 3, wherein the at least one attribute comprises a text description for describing the behavior, the text description being expressed in natural language and including at least one of the following associated with the behavior: a scenario, the scenario indicating an initiator that triggers the behavior; or a triggering reason, the triggering reason indicating at least one of the following: an interactive behavior, a change in a page cycle or an application cycle, or a change to a page.

6. The method of claim 5, further comprising: determining, based on a database associated with the application, at least one text interpretation of the at least one key link; and determining, based on the at least one text interpretation, the text description for describing the behavior.

7. The method of claim 6, wherein the database includes a mapping relationship between links and text interpretations, and determining the at least one text interpretation of the at least one key link comprises: for a key link among the at least one key link, determining, based on the mapping relationship, the text interpretation associated with the key link.

8. The method of claim 5, wherein determining the at least one type of the behavior of the application based on the at least one cluster of the plurality of attribute sets comprises: determining the at least one cluster of the plurality of attribute sets based on at least one link cluster of the plurality of key links; and for each cluster in the at least one cluster, determining the type of behavior corresponding to the cluster based on the text interpretation associated with the cluster.

9. The method of claim 3, wherein the at least one attribute further comprises at least one of the following associated with the key link: a network request path, a domain, a service node, a client, a stack, or an administrator.

10. The method of claim 1, further comprising: in response to receiving a query request for a behavior of a target type, determining, from the at least one cluster, a target cluster matching the behavior of the target type; and providing, from the plurality of attribute sets, a set of target attributes corresponding to the target cluster.

11. An apparatus for managing behaviors of an application based on clustering, comprising: an acquisition module configured to acquire, during operation of the application, a plurality of links from log data of the application, each of the plurality of links pointing to a code segment in code of the application and being associated with a behavior; a link determination module configured to determine, based on predetermined key features, a plurality of key links from the plurality of links; a fact determination module configured to determine a plurality of attribute sets respectively associated with the plurality of key links, each of the plurality of attribute sets comprising a key link among the plurality of key links and at least one attribute associated with the key link; and a type determination module configured to determine, based on at least one cluster of the plurality of attribute sets, at least one type of the behavior of the application, wherein attributes in the attribute sets in the cluster satisfy a predetermined condition.

12. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the electronic device to perform the method according to any one of claims 1 to 10.

13. A computer-readable storage medium having stored thereon computer instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 10.

14. A computer instruction product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the method according to any one of claims 1 to 10.
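
By way of non-limiting illustration only, the following Python sketch shows one way the steps recited in claims 1 to 4, 7, 8 and 10 might be arranged. It is not the claimed implementation. The log record layout, the key features in `KEY_FEATURES`, the link-to-text mapping `LINK_TEXT`, and every function name are hypothetical and chosen for the example. Grouping attribute sets by identical key link is an assumed stand-in for the "predetermined condition" and for whatever clustering operation an actual embodiment would use.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical key features: substrings that mark a link as a key link.
KEY_FEATURES = ("net.", "db.")

# Hypothetical mapping between links and text interpretations (cf. claim 7).
LINK_TEXT = {
    "net.HttpClient.post": "uploads data to a server",
    "db.Store.write": "writes to local storage",
}


@dataclass(frozen=True)
class AttributeSet:
    key_link: str     # the key link itself
    domain: str       # one example attribute (claim 9 lists "domain")
    description: str  # natural-language text description (cf. claim 5)


def extract_links(call_stack):
    """Split one logged call stack into links, assuming one frame per line."""
    return [frame.strip() for frame in call_stack.splitlines() if frame.strip()]


def select_key_links(links, key_features=KEY_FEATURES):
    """Text filtering: keep the links that contain any key feature."""
    return [link for link in links if any(f in link for f in key_features)]


def build_attribute_sets(log_records):
    """Create one attribute set per key link found in each log record."""
    result = []
    for record in log_records:
        for link in select_key_links(extract_links(record["stack"])):
            text = LINK_TEXT.get(link, "unknown behavior")
            result.append(AttributeSet(link, record["domain"], text))
    return result


def cluster_attribute_sets(attribute_sets):
    """Group attribute sets that share the same key link (assumed condition)."""
    clusters = defaultdict(list)
    for attribute_set in attribute_sets:
        clusters[attribute_set.key_link].append(attribute_set)
    return dict(clusters)


def behavior_types(clusters):
    """Label each cluster with the text interpretation of its key link."""
    return {link: LINK_TEXT.get(link, "unknown behavior") for link in clusters}


def query(clusters, target_type):
    """Return the attribute sets of every cluster whose type matches the query."""
    types = behavior_types(clusters)
    return [
        attribute_set
        for link, members in clusters.items()
        if types[link] == target_type
        for attribute_set in members
    ]


# Made-up log records for demonstration only.
LOG_RECORDS = [
    {"stack": "ui.Button.onClick\nnet.HttpClient.post", "domain": "api.example.com"},
    {"stack": "app.Lifecycle.onResume\nnet.HttpClient.post", "domain": "cdn.example.com"},
    {"stack": "ui.Form.onSubmit\ndb.Store.write", "domain": "local"},
]

attribute_sets = build_attribute_sets(LOG_RECORDS)
clusters = cluster_attribute_sets(attribute_sets)
uploads = query(clusters, "uploads data to a server")
```

In this sketch the two records that end in the same network call fall into one cluster even though they were triggered by different initiators, which is the sense in which many individual log entries collapse into a small number of behavior types. A distance-based selection of key links (the second alternative in claim 4) or a different clustering condition could replace `select_key_links` or `cluster_attribute_sets` without changing the rest of the flow.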