Machine-learning techniques for automated cloud service management

WO2026135664A1 · PCT designated stage · Publication Date: 2026-06-25 · EQUIFAX INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
EQUIFAX INC
Filing Date
2024-12-17
Publication Date
2026-06-25


Abstract

In some aspects, a computing system can train a machine-learning model to analyze a graph database for risk assessment. The computing system can use the machine-learning model to identify a risk indicator for a target component of one or more interactive computing environments. The graph database can include a set of nodes where each node represents a respective infrastructure service of one or more infrastructure services and a set of edges connecting individual nodes of the set of nodes. The computing system can generate the risk indicator for the target component based on an output of the machine-learning model. The computing system additionally can output a graphical user interface including at least the risk indicator for use in controlling access to the one or more infrastructure services.

Description

Attorney Docket No. 096923-1449744

MACHINE-LEARNING TECHNIQUES FOR AUTOMATED CLOUD SERVICE MANAGEMENT

Technical Field

[0001] The present disclosure relates generally to cloud service management. More specifically, but not by way of limitation, this disclosure relates to machine-learning models for automated cloud service management.

Background

[0002] Graph processing can be used to analyze large data sets and generate predictions based on that data. A graph may include a number of connected and unconnected nodes, where nodes are connected to each other by edges. Graph processing has a number of applications in data science such as label propagation, graph partitioning, node classification, risk prediction, and so on.

Summary

[0003] Various aspects of the present disclosure provide systems and methods for training and using a machine-learning model for risk assessment and outcome prediction using a graph database. In one example, a computer-implemented method is performed by one or more processing devices. The computer-implemented method includes accessing a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, where the machine-learning model is trained to analyze a graph database. The graph database can include a set of nodes, where each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments. The graph database also can include a set of edges connecting individual nodes of the set of nodes, where the set of edges represents relationships between the respective connected nodes. The computer-implemented method can also include generating, for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database. The computer-implemented method can further include outputting, to a remote computing device, a graphical user interface including at least the risk indicator for use in controlling access to the one or more infrastructure services.

[0004] In another example, a system includes a processor and a memory device in which instructions executable by the processor are stored for causing the processor to perform various operations. The processor can access a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, where the machine-learning model is trained to analyze a graph database. The graph database can include a set of nodes, where each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments. Additionally, the graph database can include a set of edges connecting individual nodes of the set of nodes, where the set of edges represents relationships between the respective connected nodes. The processor can generate, for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database. The processor also can output, to a remote computing device, a graphical user interface including at least the risk indicator for use in controlling access to the one or more infrastructure services.

[0005] In yet another example, a non-transitory computer-readable storage medium has program code that is executable by a processor to cause a computing device to perform various operations. The operations can include accessing, by the processor, a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, where the machine-learning model is trained to analyze a graph database. The graph database can include a set of nodes, where each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments. The graph database also can include a set of edges connecting individual nodes of the set of nodes, where the set of edges represents relationships between the respective connected nodes. The operations also can include generating, by the processor and for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database. The operations can further include outputting, by the processor and to a remote computing device, a graphical user interface comprising at least the risk indicator for use in controlling access to the one or more infrastructure services.

[0006] This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to determine the scope of the claimed subject matter. The subject matter should be understood by reference to appropriate portions of the entire specification, any or all drawings, and each claim.

[0007] The foregoing, together with other features and examples, will become more apparent upon referring to the following specification, claims, and accompanying drawings.

Brief Description of the Drawings

[0008] FIG. 1 is a block diagram depicting an example of an operating environment in which a machine-learning model can be trained using graph data and applied in a risk assessment application according to certain aspects of the present disclosure.

[0009] FIG. 2 is a flow chart depicting an example of a process for utilizing a machine-learning model to generate risk indicators for a target component based on predictor variables associated with the target component according to certain aspects of the present disclosure.

[0010] FIG. 3 is a block diagram of an example of a computing environment to generate a visual representation of cloud infrastructure according to certain aspects of the present disclosure.

[0011] FIG. 4 is a diagram depicting an example of graph data in a graph database according to certain aspects of the present disclosure.

[0012] FIG. 5 is a diagram depicting an example of a graphical user interface including an interactive architecture diagram visualizing one or more infrastructure services according to certain aspects of the present disclosure.

[0013] FIG. 6 is a block diagram depicting an example of a computing system suitable for implementing aspects of the techniques and technologies presented herein.

Detailed Description

[0014] Certain aspects and examples of the present disclosure relate to machine-learning techniques for automated cloud service management. A cloud computing environment can be a type of distributed computing environment where a cloud provider can provide one or more computing services via one or more networks, such as the Internet. More specifically, the cloud provider can provide datacenters and compute resources to host computing infrastructure that can enable users to deploy or manage applications and system resources within the cloud computing environment. For example, the users can be cloud administrators who manage access to the computing services (e.g., servers, storage, networking, software, etc.) provided by the cloud computing environment to ensure information security. Machine-learning techniques can facilitate the management of the computing services, such as by automating a determination of a respective risk indicator of each computing service. The machine-learning techniques can involve training or executing a machine-learning model to use graph data as input to generate an output related to the risk indicators.

[0015] Machine-learning techniques can provide powerful predictive insights. In one example, complex data can be used to predict or assess a risk associated with a target component of a computing environment. In some cases, the target component may include one or more secured resources to which access can be controlled. Accordingly, based on the risk associated with the target component, access by other computing components or computing services to the secured resources can be controlled. To make such predictions, a machine-learning model may be trained on large and complex data sets. For example, a machine-learning model, such as a graph neural network (GNN), can make determinations based on a graph database generated from a large set of data points. In particular, the graph database can include a set of nodes, each representing a respective infrastructure service. The target component can correspond to a particular node selected from the set of nodes, such as to determine a risk associated with the infrastructure service related to the particular node interacting with other infrastructure services. Disclosed examples provide systems and methods for training a machine-learning model to analyze the graph database to predict the risk associated with the target component. Further, disclosed examples facilitate the generation of the graph database using additional machine-learning models, such as machine-learning models trained to perform natural language processing (NLP).
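As an illustration only, the following minimal sketch shows how a score for a target node could be derived from the node's own features and those of its neighbors. It is not the disclosed GNN: it performs a single mean-aggregation step followed by a logistic function, and the node names, feature values, weights, and bias are all hypothetical values chosen for the example rather than trained parameters.

```python
import math

# Hypothetical toy graph: node names, features, and edges are illustrative only.
features = {
    "vm-1": [0.2, 1.0],
    "db-1": [0.9, 0.0],
    "bucket-1": [0.4, 0.5],
}
edges = [("vm-1", "db-1"), ("db-1", "bucket-1")]


def neighbors(node, edges):
    """Return nodes sharing an edge with `node` (edges treated as undirected)."""
    out = []
    for a, b in edges:
        if a == node:
            out.append(b)
        elif b == node:
            out.append(a)
    return out


def aggregate(node, features, edges):
    """One mean-aggregation step: average the node's features with its neighbors'."""
    group = [node] + neighbors(node, edges)
    dim = len(features[node])
    return [sum(features[n][i] for n in group) / len(group) for i in range(dim)]


def risk_indicator(node, features, edges, weights, bias):
    """Map the aggregated features to a score in (0, 1) with a logistic function."""
    h = aggregate(node, features, edges)
    z = sum(w * x for w, x in zip(weights, h)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# The weights and bias are made-up stand-ins for trained parameters.
score = risk_indicator("db-1", features, edges, weights=[2.0, -1.0], bias=0.0)
```

Here "db-1" plays the role of the target component, and its score depends on the services it is connected to as well as on its own attributes.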

[0016] In some cases, manual management or interpretation of computing environment configurations of one or more computing environments can be time-consuming or resource-intensive, such as with respect to processing power or storage resources. Misconfigurations of the computing environments resulting from the manual management of the computing environment configurations can expose the computing environments to security risks or cause noncompliance with certain regulations. Additionally, generating architecture diagrams of the computing configurations can be similarly time-consuming or resource-intensive. Certain aspects described herein for training a machine-learning model to perform risk assessment or other outcome predictions based on a graph database can address one or more issues identified herein.

[0017] In one example, managing one or more cloud computing environments can involve regulating or maintaining cloud configuration settings, parameters, or policies, such as for virtual machines, containers, or other suitable components of the cloud computing environments. The cloud computing environments may be provided or hosted by different cloud providers, which can result in heterogeneous data sources that can involve different types or formats of data. Scalability of the cloud computing environments can contribute to a dynamic nature of the cloud computing environments such that a respective computing environment configuration of the cloud computing environments, such as resource provisioning, may frequently vary or change. Additionally, the cloud computing environments may include different types of infrastructure services, such as storage, networks, databases, access control, virtualizations, or a combination thereof. As an example, each cloud computing environment can include tens, hundreds, thousands, or millions of virtual machines that can communicate with each other or with other components in the cloud computing environments. Disclosed examples herein describe constructing a graph database using one or more machine-learning models that can be trained and executed to interpret dependencies or relationships between the infrastructure services provided by the cloud computing environments, which can automate the generation of architecture diagrams of the computing configurations involving heterogeneous data sources. Additionally, in some cases, a machine-learning model that is different from the machine-learning models used to generate the graph database can be trained and used to analyze the graph database to perform the risk assessment.

[0018] To overcome the above-described problems, aspects of the present disclosure relate to using one or more machine-learning models (e.g., machine-learning models trained to perform NLP) to parse heterogeneous data sources. Outputs of these machine-learning models can be used to construct a graph database that indicates system resources, network topology, access permissions, or a combination thereof associated with the computing environments. The heterogeneous data sources can include configuration files, asset inventories, or application programming interfaces of the computing environments (e.g., cloud platforms). Aspects of the present disclosure can be used to analyze large data sets, enabling the machine-learning models to be trained using this data, thereby improving outcomes and accuracy of the machine-learning models. The graph database can include a set of nodes where each node can represent a respective infrastructure service provided by a particular computing environment. As an example, the machine-learning models can be trained to use NLP to extract data from documentation, such as configuration files, of the computing environments.
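To make the notion of heterogeneous data sources concrete, the sketch below normalizes two differently formatted configuration sources into one common record shape. It uses deterministic parsers from the Python standard library as a stand-in for the NLP-trained models described above. The file layouts, field names, and provider labels are assumptions made for the example.

```python
import configparser
import json


def records_from_json(text, provider):
    """Parse a JSON inventory with a top-level "resources" list (assumed layout)."""
    data = json.loads(text)
    return [
        {"provider": provider, "service": r["name"], "type": r["type"]}
        for r in data["resources"]
    ]


def records_from_ini(text, provider):
    """Parse an INI file with one section per service (assumed layout)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [
        {"provider": provider, "service": name, "type": parser[name]["type"]}
        for name in parser.sections()
    ]


# Two hypothetical sources describing services in different formats.
json_source = '{"resources": [{"name": "orders-db", "type": "database"}]}'
ini_source = "[web-vm]\ntype = virtual_machine\n"

records = (
    records_from_json(json_source, "provider-a")
    + records_from_ini(ini_source, "provider-b")
)
```

Once every source is reduced to the same record shape, downstream steps such as graph construction no longer need to know which provider or format a record came from.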

[0019] In some cases, a separate machine-learning model may be trained to generate a database schema based on the extracted data provided by the machine-learning models trained to perform NLP. The database schema can indicate nodes and edges of the graph database. Accordingly, the database schema can be used to construct the graph database. The separate machine-learning model can also be referred to as a schema model. In particular, the schema model can be trained to recognize the infrastructure services provided by the computing environments and extract relationships between the infrastructure services. Based on the recognized infrastructure services, the schema model can generate one or more nodes of the graph database, such as a respective node to represent each infrastructure service. The schema model can generate a set of edges based on the extracted relationships, such as to indicate a measure of relatedness between two infrastructure services. Additionally, the schema model can identify one or more attributes (e.g., operating system, central processing unit (CPU), memory, etc.) associated with each infrastructure service. In some cases, the attributes can relate to system resources that are configurable with respect to a respective infrastructure service. The attributes can be included in the database schema and in the graph database. Over time, the cloud computing environments may change, for example with respect to the infrastructure services provided, resource allocation, etc. The schema model can update the database schema (e.g., by generating an updated version of the database schema). Modifications to the database schema can be propagated to other components, such as by updating the graph database, an architecture diagram visualizing the graph database, a graphical user interface displaying the architecture diagram, or a combination thereof.
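The relationship between extracted data, the database schema, and schema updates can be sketched as follows. This is a rule-based illustration, not the trained schema model: the record fields ("service", "attributes", "depends_on"), the `Schema` class, and the helper names are hypothetical. The sketch builds nodes with attributes and edges from records, then computes the difference between two schema versions so that changes could be propagated to other components.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    nodes: dict = field(default_factory=dict)  # service name -> attributes
    edges: set = field(default_factory=set)    # (source, target, relation)


def build_schema(records):
    """Create one node per service and one edge per declared dependency."""
    schema = Schema()
    for rec in records:
        schema.nodes[rec["service"]] = dict(rec.get("attributes", {}))
    for rec in records:
        for target in rec.get("depends_on", []):
            if target in schema.nodes:  # ignore references to unknown services
                schema.edges.add((rec["service"], target, "depends_on"))
    return schema


def diff_schemas(old, new):
    """Summarize what changed between two schema versions."""
    shared = set(old.nodes) & set(new.nodes)
    return {
        "added_nodes": sorted(set(new.nodes) - set(old.nodes)),
        "removed_nodes": sorted(set(old.nodes) - set(new.nodes)),
        "changed_nodes": sorted(n for n in shared if old.nodes[n] != new.nodes[n]),
        "added_edges": sorted(new.edges - old.edges),
        "removed_edges": sorted(old.edges - new.edges),
    }


v1 = build_schema([
    {"service": "web-vm", "attributes": {"os": "linux", "cpu": 2},
     "depends_on": ["orders-db"]},
    {"service": "orders-db", "attributes": {"memory_gb": 16}},
])
v2 = build_schema([
    {"service": "web-vm", "attributes": {"os": "linux", "cpu": 4},
     "depends_on": ["orders-db", "cache"]},
    {"service": "orders-db", "attributes": {"memory_gb": 16}},
    {"service": "cache", "attributes": {"memory_gb": 2}},
])
changes = diff_schemas(v1, v2)
```

A consumer such as a diagram renderer could apply only the entries in `changes` instead of rebuilding its view from scratch.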

[0020] In some cases, once the graph database is generated, yet another machine-learning model (e.g., a risk prediction model) can be trained to analyze the graph database to perform risk assessment. The trained risk prediction model can be executed to identify one or more critical points in the cloud computing environments that may be more at risk with respect to security, reliability, or stability compared to the remaining portion of the cloud computing environments. In particular, the trained risk prediction model can be used to generate a risk indicator for controlling access of a target component to a computing environment or portions of the computing environment (e.g., the infrastructure services of the computing environment). As an example, the trained risk prediction model may determine a quantification of risk associated with a particular infrastructure service provided by the cloud computing environments. Access permissions of the particular infrastructure service can be adjusted based on the risk indicator associated with the particular infrastructure service. For instance, certain access permissions of the particular infrastructure service can be revoked to prevent access to other infrastructure services in the cloud computing environments if the particular infrastructure service is likely to have been compromised. The above examples can be used to determine a risk indicator for a target component based on data from heterogeneous data sources.
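The permission adjustment described above can be sketched as a simple policy over risk indicators. The grant representation, the service names, the scores, and the 0.8 threshold are all assumptions for illustration. The disclosure does not specify a threshold or a data format.

```python
def revoke_risky_grants(grants, risk_indicators, threshold=0.8):
    """Split access grants into kept and revoked sets.

    `grants` is a set of (source_service, target_service) pairs. A grant is
    revoked when the source service's risk indicator meets the threshold.
    The threshold value is a hypothetical policy choice.
    """
    kept, revoked = set(), set()
    for source, target in grants:
        if risk_indicators.get(source, 0.0) >= threshold:
            revoked.add((source, target))
        else:
            kept.add((source, target))
    return kept, revoked


# Hypothetical grants and model outputs.
grants = {("web-vm", "orders-db"), ("batch-job", "orders-db")}
risk_indicators = {"web-vm": 0.93, "batch-job": 0.12}

kept, revoked = revoke_risky_grants(grants, risk_indicators)
```

In this example the grant from "web-vm" is revoked because its score exceeds the threshold, while the grant from "batch-job" is retained.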

[0021] In some cases, a visualization module can output a graphical user interface (GUI) for display to a user based on the graph database. The user can use the GUI to monitor a current state of the cloud computing environments, such as in real-time or substantially contemporaneously. For instance, the GUI can include a respective visual representation corresponding to each network infrastructure service provided by the cloud computing environments. Each visual representation may present information related to its corresponding network infrastructure service, such as a respective risk indicator quantifying a level of risk associated with each network infrastructure service. Additionally, the visualization engine can update the GUI to provide a simulation of a modified computing environment, such as based on user input provided by the user. For instance, the visualization engine can receive a simulation request generated by the user and including one or more modified parameters by which to generate the simulation of the modified computing environment. As an example, the schema model may receive the modified parameters as input to then generate an updated database schema indicating effects of the modified parameters to the modified computing environment. Accordingly, the user can test or evaluate the effects of the modified parameters prior to actual implementation in the cloud computing environments.
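One way to picture a simulation request is as a set of modified parameters applied to a copy of the current graph data, leaving the live representation untouched. The sketch below assumes a plain dictionary layout for the graph and a hypothetical `simulate` helper. It does not model the schema model itself, only the "evaluate before implementing" idea.

```python
import copy


def simulate(graph, modified_parameters):
    """Return a modified copy of `graph`; the original is left unchanged.

    `modified_parameters` maps a service name to the attribute values to
    override. Both structures are assumptions made for this example.
    """
    simulated = copy.deepcopy(graph)
    for service, changes in modified_parameters.items():
        if service not in simulated["nodes"]:
            raise KeyError(f"unknown service: {service}")
        simulated["nodes"][service].update(changes)
    return simulated


# Hypothetical current state of a small environment.
live = {
    "nodes": {
        "web-vm": {"cpu": 2, "public": False},
        "orders-db": {"memory_gb": 16},
    },
    "edges": [("web-vm", "orders-db")],
}

simulated = simulate(live, {"web-vm": {"public": True}})
```

Because the copy is deep, risk indicators could be recomputed on `simulated` and compared against those for `live` without altering the monitored environment's representation.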

[0022] Certain aspects described herein provide improvements to assessing risks, for example, in access control associated with distributed computing environments. For instance, the schema model described herein can generate a database schema including graph data that can be used as training data to train a risk prediction model to determine a risk indicator related to an infrastructure service of a cloud computing environment. Manually determining the risk indicator can be difficult due to a large quantity of components in the cloud computing environment that can hinder a user’s ability to interpret dependencies or relationships within the cloud computing environment. Additionally, cloud providers may provide different cloud platforms that can each have a respective format, a respective set of system resources, a respective network topology, or other configurations that can hinder the user’s ability to manually determine the risk indicator. Disclosed systems and methods can enable generating a graph database including resource configuration data obtained from the cloud providers that has been transformed from an unstructured format into a structured format. The systems and methods described herein can enable previously unusable or difficult to use data sets to be used to train machine-learning models, such as the risk prediction model or the schema model. This can yield more accurate machine learning outcomes by enabling the machine-learning models to be trained on a larger corpus of data. Accordingly, improved predictive power and improved accuracy of the machine-learning models also improves a user’s or a management module’s ability to make accurate and informed decisions on whether to grant access, by a target component of a distributed computing environment, to a secured or restricted system resource.

[0023] These illustrative examples are given to introduce the reader to the general subject matter discussed here and are not intended to limit the scope of the disclosed concepts. The following sections describe various additional features and examples with reference to the drawings in which like numerals indicate like elements, and directional descriptions are used to describe the illustrative examples but, like the illustrative examples, should not be used to limit the present disclosure.

[0024] Referring now to the drawings, FIG. 1 is a block diagram depicting an example of an operating environment 100 in which a risk assessment computing system 130 builds and trains a machine-learning model that can be used to predict risk indicators based on predictor variables. FIG. 1 depicts examples of hardware components of a risk assessment computing system 130, according to some aspects of the present disclosure. The risk assessment computing system 130 can be a specialized computing system that may be used for processing large amounts of data using a large number of computer processing cycles. The risk assessment computing system 130 can include a model training server 110 for building and training one or more machine-learning models (e.g., a risk prediction model 120 or another machine-learning model 102 trained to perform natural language processing (NLP)). The other machine-learning model 102 can also be referred to herein as a language model 102. In some examples, other machine-learning models, such as a schema model further described with respect to FIG. 2 and FIG. 3, may also be built and trained using the model training server 110. The risk assessment computing system 130 can additionally include a risk assessment server 118 for performing a risk assessment for given predictor variables 124 using the trained risk prediction model 120.

[0025] The model training server 110 can include one or more processing devices that execute program code, such as a model training application 112. The program code is stored on a non-transitory computer-readable storage medium. The model training application 112 can execute one or more processes to implement a training process to train a machine-learning model for predicting risk indicators based on predictor variables 124. Additionally, the model training application 112 can train the language model 102 to perform NLP, natural language generation, or suitable information retrieval techniques to output structured data that has been transformed from an unstructured format to a structured format.

[0026] In some examples, the machine-learning model training samples 126 can vary, such as based on a particular machine-learning model being trained on a subset of the machine-learning model training samples 126. For example, a particular subset of the machine-learning model training samples 126 used to train the language model 102 can include documentation related to external computing environments (e.g., cloud platforms). Non-limiting examples of the documentation can include technical documents, configuration templates, infrastructure-as-code scripts, specification files, application programming interfaces (APIs), manuals, etc. In particular, the language model 102 can be trained using the machine-learning model training samples 126 to parse or otherwise analyze the documentation to generate an output including resource configuration data associated with the external computing environments.

[0027] In some cases, the machine-learning model training samples 126 can be generated from graph data 142 associated with various computing components, such as databases, storage systems, networks, software, processors, etc. The graph data 142 can include historical graph data, such as related to previous or existing graph databases. The language model 102 can facilitate a creation of graph data 142. As an example, an output of the language model 102 can be provided as an input to a different machine-learning model, such as the schema model, to generate the graph data 142 including one or more nodes and one or more edges connecting the nodes. The graph data 142 can include attributes of each of the computing components. For example, the graph data 142 can include a graph generated based on a semi-labeled data set. The graph data for each computing component can be represented as graph structures having nodes and edges. In some scenarios, the graph data 142 includes a graph generated from a large-scale data set, such as including hundreds, thousands, or millions of records. The graph data 142 can also be stored in the risk data repository 122.

[0028] In some aspects, the model training application 112 can build and train a risk prediction model 120 utilizing model training samples 126. The model training samples 126 can include multiple training vectors consisting of training predictor variables and training risk indicator outputs corresponding to the training vectors. The model training samples 126 can be stored in one or more network-attached storage units on which various repositories, databases, or other structures are stored. An example of these data structures is the risk data repository 122.
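The pairing of training predictor variables with training risk indicator outputs can be illustrated with a deliberately small model. The sketch below fits a logistic regression by batch gradient descent, which is a stand-in chosen for brevity and not the risk prediction model 120 of the disclosure. The sample values, learning rate, and epoch count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Return a risk score in (0, 1) for one vector of predictor variables."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, learning_rate=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in samples:
            error = predict(weights, bias, x) - y
            for i in range(dim):
                grad_w[i] += error * x[i]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Each sample pairs predictor variables with a risk indicator output (0 or 1).
# The values are made up for illustration.
training_samples = [
    ([0.0, 0.1], 0),
    ([0.1, 0.0], 0),
    ([0.9, 1.0], 1),
    ([1.0, 0.8], 1),
]

weights, bias = train(training_samples)
```

After training, `predict(weights, bias, x)` can be called on new predictor variables. A graph-based model would additionally take the connections between services into account.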

[0029] Network-attached storage units may store a variety of different types of data organized in a variety of different ways and from a variety of different sources. For example, the network-attached storage unit may include storage other than primary storage located within the model training server 110 that is directly accessible by processors located therein. In some aspects, the network-attached storage unit may include secondary, tertiary, or auxiliary storage, such as large hard drives, servers, virtual memory, among other types. Storage devices may include portable or non-portable storage devices, optical storage devices, and various other mediums capable of storing and containing data. A machine-readable storage medium or computer-readable storage medium may include a non-transitory medium, such as a non-transitory computer-readable storage medium, in which data can be stored and that does not include carrier waves or transitory electronic signals. Examples of a non-transitory medium may include, for example, a magnetic disk or tape, optical storage media such as a compact disk or digital versatile disk, flash memory, memory, or memory devices.

[0030] The risk assessment server 118 can include one or more processing devices that execute program code, such as a risk assessment application 114. The program code is stored on a non-transitory computer-readable medium. The risk assessment application 114 can execute one or more processes to utilize the risk prediction model 120 trained by the model training application 112 to predict risk indicators based on input predictor variables 124. In some examples, the input predictor variables 124 can correspond to parameters or characteristics (e.g., topology, access permissions, etc.) of infrastructure services provided by cloud platforms.

[0031] Furthermore, the risk assessment computing system 130 can communicate with various other computing systems, such as client computing systems 104. Each client computing system 104 may include one or more third-party devices, such as individual servers or groups of servers operating in a distributed manner. A client computing system 104 can include any computing device or group of computing devices operated by a provider of products or services (e.g., cloud providers). The client computing system 104 can include one or more server devices. The one or more server devices can include or can otherwise access one or more non-transitory computer-readable media. The client computing system 104 can also execute instructions that provide a computing environment accessible to user computing systems 106, the risk assessment computing system 130, or a combination thereof. An example of the computing environment provided by the client computing system 104 can include a cloud computing environment that can provide one or more infrastructure services to the user computing systems 106. The executable instructions are stored in one or more non-transitory computer-readable media.

[0032] The client computing system 104 can further include one or more processing devices that are capable of providing the computing environment to perform operations or provide the infrastructure services described herein. The computing environment can include executable instructions stored in one or more non-transitory computer-readable media. The instructions providing the computing environment can configure one or more processing devices to perform operations described herein. In some aspects, the executable instructions for the computing environment can include instructions that provide one or more graphical interfaces. The graphical interfaces are used by a user computing system 106 to access various functions of the computing environment. For instance, the computing environment may transmit data to and receive data from a user computing system 106 to shift between different states of the computing environment, where the different states involve different access permissions of infrastructure services provided by the computing environment.

[0033] In some examples, the client computing system 104 may have other system resources associated therewith (not shown in FIG. 1), such as server computers hosting and managing virtual machine instances or container instances to provide cloud computing services, server computers hosting and managing online storage resources for users, server computers for providing database services, and others. The interaction between the user computing system 106 and the client computing system 104 may be performed through graphical user interfaces presented by the client computing system 104 to the user computing system 106, or through application programming interface (API) calls or web service calls. In some aspects, the user computing system 106 can include a management module 117 that can automate management of the other system resources corresponding to the client computing system 104. The management module 117 can be provided as part of a software application or another suitable component of the user computing system 106. The management module 117 can apply machine learning techniques to extract suitable data related to the other system resources and interpret dependencies, relationships, or a combination thereof with respect to the other system resources. Additionally, the management module 117 can construct a graph database including graph data (e.g., graph data 128) as a representation of resources, network topology, permissions, or a combination thereof that are associated with the other system resources.

[0034] A user computing system 106 can include any computing device or other communication device operated by a user, such as a consumer or a customer. The user computing system 106 can include one or more computing devices, such as laptops, smartphones, and other personal computing devices. In some cases, the user computing system 106 may send risk assessment queries to the risk assessment server 118 for risk assessment or may send signals to the risk assessment server 118 that control or otherwise influence different aspects of the risk assessment computing system 130. A user computing system 106 can include executable instructions stored in one or more non-transitory computer-readable media. The user computing system 106 can also include one or more processing devices that are capable of executing program code to perform operations described herein. In various examples, the user computing system 106 can allow a user to access certain infrastructure services from a client computing system 104 or other system resources, to obtain controlled access to electronic content hosted by the client computing system 104, etc.

[0035] In some aspects, a computing environment implemented through a client computing system 104 can be used to provide access to various online functions (e.g., cloud computing). As a simplified example, a website or other interactive computing environment provided by an online resource provider can include electronic functions for requesting system resources, online storage resources, network resources, database resources, or other types of resources. A user computing system 106 can request modifications to the computing environment provided by the client computing system 104, such as to customize cloud services (e.g., infrastructure, platforms, software applications, etc.) of the computing environment. Due to complexity associated with architecture or customizability of the computing environment, it can be difficult to manually manage infrastructure of the computing environment. For example, it can be time-consuming or resource intensive for a user of the user computing system 106 to manually interpret cloud configurations or generate architecture diagrams to visualize the infrastructure of the computing environment. As a result, detection of compromised or at-risk components of the computing environment may be delayed with manual management of the computing environment, thereby decreasing security and compliance with cybersecurity regulations. Accordingly, the management module 117 of the user computing system 106 can facilitate the management of the computing environment by automatically monitoring the computing environment, such as using risk indicators outputted by the risk prediction model 120. For example, the management module 117 can collect data associated with the computing environment and communicate with the risk assessment server 118 for risk assessment, such as by periodically transmitting requests to generate or update risk indicators. Based on the risk indicators predicted by the risk assessment server 118, the management module 117 can determine whether to control access of a particular infrastructure service to certain features or components of the computing environment. In some examples, the management module 117 can automatically apply adjustments to the computing environment provided by the client computing system 104 based on the risk indicators outputted by the risk prediction model 120.

[0036] In a simplified example, the system depicted in FIG. 1 can configure a machine-learning model to be used for accurately determining risk indicators, such as a quantification or level of risk associated with permitting a request to access certain features or components of the computing environment, using predictor variables. In some cases, the request can be generated by an infrastructure service to access, communicate with, or otherwise interact with another infrastructure service in the computing environment. A predictor variable can be any variable predictive of risk that is associated with a computing component (e.g., an infrastructure service or another suitable feature or component of the computing environment).

[0037] Examples of predictor variables used for predicting the risk associated with a computing component accessing online resources include, but are not limited to, variables indicating the identification characteristics of the computing component (e.g., name of the computing component, the network connectivity of the computing component, a unique identifier of the computing component, etc.), variables indicative of prior actions or requests involving the computing component (e.g., past requests of system resources generated by the computing component, the amount of system resources currently held by the computing component, and so on), variables indicative of one or more behavioral traits of a computing component (e.g., whether an amount of requests transmitted is within an acceptable range, whether communication has occurred with an unverified or compromised component), etc. As an example, the online resources can include system resources provided using a cloud computing environment. In some aspects, predictor variables can be extracted from a labeled graph or a graph database. For example, predictor variables can be derived from metadata or attributes associated with each node of the graph database and from the relationships between the nodes of the graph database as indicated by edges between the nodes.
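
As a minimal sketch of how predictor variables might be derived from node attributes and edges, the following Python example computes three illustrative variables for one node. The node names, the attribute keys (`verified`, `requests_last_hour`), the `max_requests` limit, and the function `predictor_variables` are hypothetical and are not taken from this disclosure.

```python
from collections import defaultdict

# Hypothetical graph data: node attributes plus directed edges between services.
nodes = {
    "api-gateway": {"verified": True, "requests_last_hour": 120},
    "orders-db": {"verified": True, "requests_last_hour": 40},
    "legacy-batch": {"verified": False, "requests_last_hour": 900},
}
edges = [("api-gateway", "orders-db"), ("legacy-batch", "orders-db")]


def predictor_variables(target, nodes, edges, max_requests=500):
    """Derive illustrative predictor variables for one node of the graph."""
    # Treat edges as undirected when collecting the neighbors of each node.
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    attrs = nodes[target]
    return {
        # Connectivity: number of services the target is linked to.
        "degree": len(neighbors[target]),
        # Behavioral trait: is the request volume within the assumed range?
        "requests_in_range": attrs["requests_last_hour"] <= max_requests,
        # Behavioral trait: has the target communicated with an unverified node?
        "touches_unverified": any(
            not nodes[n]["verified"] for n in neighbors[target]
        ),
    }


features = predictor_variables("orders-db", nodes, edges)
```

In this sketch, the attribute-based variable comes from node metadata and the other two come from the edge structure, mirroring the two sources of predictor variables described above.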

[0038] The predicted risk indicator can be used by the risk prediction model 120 to determine the risk associated with a computing component (e.g., an infrastructure service or another suitable component of the computing environment) accessing a service provided by the computing environment. Based on the predicted risk indicator, the management module 117 can thereby grant or deny access by the computing component to a suitable portion of the computing environment implementing the service. For example, if the management module 117 determines that the predicted risk indicator is lower than a threshold risk indicator value, then the client computing system 104 can generate or otherwise provide access permission to a particular infrastructure service that requested the access. The access permission can include, for example, cryptographic keys used to generate valid access credentials or decryption keys used to decrypt access credentials. Additionally or alternatively, the access permission can include read permissions, write permissions, execute permissions, etc. The client computing system 104 associated with the service provider can also allocate resources to an infrastructure service based on the predicted risk indicator. With the appropriate access credentials or permissions, the infrastructure service can establish or maintain communication with other components in the computing environment hosted by the client computing system 104 and access system resources via invoking API calls, web service calls, HTTP requests, or other proper mechanisms.
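
The threshold comparison described above can be sketched as follows. This is an illustrative example only: the `AccessDecision` type, the `decide_access` function, the default threshold of 0.5, and the permission names are assumptions, and the sketch does not model cryptographic keys or credential issuance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessDecision:
    """Outcome of a risk-based access check (hypothetical structure)."""
    granted: bool
    permissions: frozenset = field(default_factory=frozenset)


def decide_access(risk_indicator, threshold=0.5, requested=("read",)):
    """Grant the requested permissions only if the risk is below the threshold."""
    if risk_indicator < threshold:
        return AccessDecision(True, frozenset(requested))
    # At or above the threshold, access is denied and no permissions are issued.
    return AccessDecision(False)


low_risk = decide_access(0.2, requested=("read", "write"))
high_risk = decide_access(0.8)
```

A strict less-than comparison is used here because the paragraph above describes granting access when the predicted risk indicator is lower than the threshold value.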

[0039] Each communication within the operating environment 100 may occur over one or more data networks, such as a public data network 108, a network 116 such as a private data network, or some combination thereof. A data network may include one or more of a variety of different types of networks, including a wireless network, a wired network, or a combination of a wired and wireless network. Examples of suitable networks include the Internet, a personal area network, a local area network (“LAN”), a wide area network (“WAN”), or a wireless local area network (“WLAN”). A wireless network may include a wireless interface or a combination of wireless interfaces. A wired network may include a wired interface. The wired or wireless networks may be implemented using routers, access points, bridges, gateways, or the like, to connect devices in the data network.

[0040] The number of devices depicted in FIG. 1 is provided for illustrative purposes. Different numbers of devices may be used. For example, while certain devices or systems are shown as single devices in FIG. 1, multiple devices may instead be used to implement these devices or systems. Similarly, devices or systems that are shown as separate, such as the model training server 110 and the risk assessment server 118 or the user computing system 106 and the risk assessment computing system 130, may be instead implemented in a single device or system.

[0041] FIG. 2 is a flow chart depicting an example of a process 200 for using a machine-learning model to generate risk indicators for a target component based on predictor variables associated with the target component. In some examples, the target component can be part of one or more interactive computing environments, such as cloud computing environments. As described herein, the target component can be an infrastructure service provided in a computing environment maintained by client computing system(s) 104, such as cloud providers. One or more computing devices (e.g., the risk assessment server 118) implement operations depicted in FIG. 2 by executing suitable program code (e.g., the risk assessment application 114). For illustrative purposes, the process 200 is described with reference to certain examples depicted in the figures. Other implementations, however, are possible.

[0042] At block 202, the process 200 involves accessing a machine-learning model trained using a training process to identify a risk indicator for the target component by analyzing a graph database. In some cases, a computing device (e.g., the risk assessment server 118) can receive a risk assessment query for a target component from a remote computing device, such as the user computing system(s) 106. For example, a management module 117 of the user computing system(s) 106 may transmit the risk assessment query to the risk assessment server 118 to monitor a respective risk level of each infrastructure service available in the computing environment provided by the client computing system(s) 104. Once the management module 117 transmits the risk assessment query to the risk assessment server 118, the risk assessment server 118 can provide access to the machine-learning model that can be executed to identify the risk indicator corresponding to the target component. In particular, the machine-learning model can be trained to analyze the graph database to generate an output that can be used to identify the risk indicator. The graph database can provide a representation of each infrastructure service of the computing environment, attributes associated with each infrastructure service, and a respective relationship between or among the infrastructure services.

[0043] In some examples, generating the graph database can involve machine-learning techniques. For example, a database generation module can use one or more machine-learning models to generate a database schema corresponding to the graph database. In particular, the database generation module can extract resource configuration data provided by one or more external data sources using a language model 102 trained to perform natural language processing. For example, the language model 102 can extract a dataset corresponding to one or more infrastructure services from one or more configuration files provided as part of the resource configuration data. Once the resource configuration data is extracted, the resource configuration data can be normalized using the language model 102 such as to convert the resource configuration data from an unstructured format into a structured format. As an example, the resource configuration data can be converted from unformatted text into a tabular format. Subsequent to generating the normalized resource configuration data, the normalized resource configuration data can be provided to a different machine-learning model, which can be referred to as a schema model. The schema model can be trained to generate a database schema using the normalized resource configuration data as at least a portion of its input. The database schema generated by the schema model can indicate a set of nodes and a set of edges associated with the graph database. For example, the schema model can generate the set of nodes based on the infrastructure services determined using the dataset extracted from the configuration files by the language model 102. As another example, the schema model can generate the set of edges to indicate a respective relationship among the set of nodes, such as between connected nodes. The database generation module then can generate the graph database based on the database schema outputted by the schema model.
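
The final step of this pipeline, turning normalized tabular data into a set of nodes and a set of edges, can be illustrated with the sketch below. It uses a simple rule-based function as a stand-in for the trained schema model, which this disclosure does not specify in code. The column names (`service`, `type`, `depends_on`), the `DEPENDS_ON` edge label, and the function `build_schema` are hypothetical.

```python
# Hypothetical normalized resource configuration data in a tabular format.
rows = [
    {"service": "web-app", "type": "compute", "depends_on": "orders-db"},
    {"service": "orders-db", "type": "database", "depends_on": ""},
    {"service": "worker", "type": "compute", "depends_on": "orders-db"},
]


def build_schema(rows):
    """Rule-based stand-in for the schema model: rows in, nodes and edges out."""
    # One node per infrastructure service, carrying its attributes.
    nodes = {row["service"]: {"type": row["type"]} for row in rows}
    # One labeled edge per declared dependency between two known services.
    edges = [
        (row["service"], row["depends_on"], "DEPENDS_ON")
        for row in rows
        if row["depends_on"] in nodes
    ]
    return {"nodes": nodes, "edges": edges}


schema = build_schema(rows)
```

Dependencies that name a service absent from the table are dropped in this sketch so that every edge connects two existing nodes; a real implementation might instead flag them for review.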

[0044] In some examples, the graph database may be updated over time, such as by the database generation module. For example, an occurrence of an event associated with the graph database may be detected. As an example, the event can involve an outage of a datacenter hosting a subset of infrastructure services represented as the set of nodes of the graph database. The database generation module can use a machine-learning model, such as the schema model, that is trained to perform natural language processing to determine a modification to the graph database based on the occurrence of the event. For example, the modification to the graph database can include removing a subset of the nodes corresponding to the subset of the infrastructure services affected by the outage of the datacenter. Once the modification is determined, the modification can be applied to the graph database to generate an updated graph database by modifying the set of nodes or the set of edges of the graph database. As another example, the event may involve system resources being added to a cloud computing environment that the graph database represents. Accordingly, the occurrence of the event can cause the database generation module to generate an updated graph database that includes one or more additional nodes representing the additional system resources. The updated graph database additionally can include one or more additional edges connecting the additional nodes to each other or to existing nodes of the graph database.
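
The two kinds of updates described above, removing nodes after a datacenter outage and adding nodes and edges when resources are added, can be sketched as follows. The event format (`kind`, `datacenter`, `nodes`, `edges`), the `datacenter` node attribute, and the function `apply_event` are assumptions made for illustration; in this sketch the modification is determined by fixed rules rather than by a trained model.

```python
import copy


def apply_event(graph, event):
    """Return an updated copy of the graph; the input graph is left unchanged."""
    updated = copy.deepcopy(graph)
    if event["kind"] == "outage":
        # Remove every node hosted in the affected datacenter.
        affected = {
            name
            for name, attrs in updated["nodes"].items()
            if attrs.get("datacenter") == event["datacenter"]
        }
        for name in affected:
            del updated["nodes"][name]
        # Drop edges that touched a removed node.
        updated["edges"] = [
            (src, dst)
            for src, dst in updated["edges"]
            if src not in affected and dst not in affected
        ]
    elif event["kind"] == "resources_added":
        # Add the new nodes and the edges that connect them to the graph.
        updated["nodes"].update(event["nodes"])
        updated["edges"].extend(event["edges"])
    else:
        raise ValueError(f"unsupported event kind: {event['kind']!r}")
    return updated


graph = {
    "nodes": {
        "a": {"datacenter": "dc-east"},
        "b": {"datacenter": "dc-west"},
        "c": {"datacenter": "dc-east"},
    },
    "edges": [("a", "b"), ("b", "c")],
}
after_outage = apply_event(graph, {"kind": "outage", "datacenter": "dc-east"})
after_growth = apply_event(
    graph,
    {
        "kind": "resources_added",
        "nodes": {"d": {"datacenter": "dc-west"}},
        "edges": [("b", "d")],
    },
)
```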

[0045] In some aspects, process 200 can include training the machine-learning model on graph data related to a graph database representing infrastructure of the computing environment. A model training server 110 can include a model training application 112 that can be executed to train a risk prediction model 120 to generate a trained risk prediction model. As described herein, the graph database can include a set of nodes representing infrastructure services or other components provided in the computing environment. The graph database additionally can include a set of edges connecting individual nodes in the set of nodes and indicating a respective relationship between the nodes. As an example, a first subset of edges can connect certain nodes representing receiving systems, and a second subset of edges can connect other nodes representing transmitting systems with the respective nodes representing the receiving systems to which the transmitting systems have transmitted data. In some implementations, the graph database can include one or more attributes assigned to the set of nodes, where the attributes can indicate parameters or characteristics associated with a corresponding node. For instance, a particular node representing an infrastructure service of a database may include a list of attributes indicating an amount of storage assigned to the database, an amount of remaining storage, connectivity of the database, etc.

[0046] At block 204, the process 200 involves generating the risk indicator based on the output of the machine-learning model (e.g., the risk prediction model 120) trained to analyze the graph database. In particular, resolving the risk assessment query can include executing a machine-learning model trained to generate an output related to risk indicator values. The machine-learning model can be trained to generate its output using input predictor variables or other data suitable for assessing risks associated with the target component, such as graph data 128 of the graph database or other predictive data generated based on graph data 128 of the graph database. In some examples, the risk assessment query can indicate or otherwise specify the target component for which to perform a risk assessment. Predictor variables associated with the target component can be used as inputs to the machine-learning model. The predictor variables associated with the target component can be obtained from a predictor variable database configured to store predictor variables associated with various computing components. The output of the machine-learning model would include the risk indicator for the target component based on its current predictor variables.

[0047] Examples of predictor variables can include data associated with a computing component that describes prior actions or transmissions involving the computing component, previous activity of the computing component, or any other traits that may be used to predict or quantify risks associated with the computing component. Prior actions or transmissions related to the computing component can include information that can be obtained from configuration files, technical documents, manuals, infrastructure-as-code scripts, specification files, or other data about the activities or characteristics of the computing component. In some aspects, predictor variables can be obtained from technical documents, configuration templates, etc. Additionally or alternatively, predictor variables can be generated based on analysis of graph data or the graph database. For example, predictor variables may be generated based on an analysis of related infrastructure services stored as individual nodes in the graph database that are connected using respective edges indicating relationships (e.g., dependencies) of the nodes. The risk indicator can indicate a level of risk associated with the computing component, such as with respect to vulnerability, likelihood of being compromised, or strength of security controls.

[0048] The machine-learning model can be constructed through an at least partially automated process called training that can have little or no human involvement. As described herein, in some cases, the machine-learning model can be trained by executing the model training application 112 in the model training server 110 of FIG. 1. During training, training data can be iteratively supplied to the machine-learning model to enable the machine-learning model to identify patterns related to the training data or to identify relationships between the training data and output data. The machine-learning model can be trained in a supervised manner, an unsupervised manner, or a semi-supervised manner. For example, in supervised training, each training sample in the training data can include training predictor variables correlated to desired training risk indicator outputs. The training risk indicator outputs can include a scalar, a vector, or a different type of data structure, such as text or an image. Correlating the training predictor variables and the desired training risk indicator outputs can enable the machine-learning model to learn a mapping or rules relating predictor variables and risk indicator outputs. In contrast, in unsupervised training, the training data may include the training predictor variables but not the desired training risk indicator outputs. Accordingly, in unsupervised training, the machine-learning model can use the training predictor variables to determine structure in the training data on its own. In semi-supervised training, a subset of the training predictor variables provided in the training data may be correlated to desired training risk indicator outputs.
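
The supervised case can be illustrated with a small sketch in which training predictor variables are paired with desired scalar risk outputs. Logistic regression trained by stochastic gradient descent is used here purely as a stand-in, since this disclosure does not tie the risk prediction model 120 to a particular model type. The two features, the training samples, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(model, x):
    """Return a risk value between 0 and 1 for one vector of predictor variables."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=500, learning_rate=0.5):
    """Fit a logistic regression by iterating over the labeled training samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Gradient of the log loss with respect to the pre-activation value.
            error = predict((weights, bias), x) - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical features: [touches_unverified_component, requests_out_of_range].
training_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
# Desired training risk indicator outputs (1 = high risk), chosen for illustration.
training_outputs = [0, 0, 1, 1]

model = train(training_inputs, training_outputs)
risky = predict(model, [1, 0])
safe = predict(model, [0, 0])
```

After training on these samples, the model assigns a higher risk value to a component that has communicated with an unverified component than to one that has not, which is the mapping the labels encode.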

[0049] At block 206, the process 200 involves outputting an interactive graphical user interface (GUI) including at least the risk indicator determined using the machine-learning model. The interactive GUI can provide a visualization of the set of nodes or the set of edges of the graph database. In some cases, resolving the risk assessment query can include generating and transmitting a response to the risk assessment query, such as to the management module 117. The response can include the risk indicator generated using the output of the machine-learning model. The risk indicator can be used for one or more operations that involve performing an operation with respect to the target component based on a predicted risk associated with the target component. In one example, the risk indicator can be used to control access of the target component with respect to one or more computing environments. In particular, access of the target component can be restricted or permitted with respect to certain functionalities, services, etc. provided by the computing environment(s). As discussed above with regard to FIG. 1, the risk assessment computing system 130 can communicate with client computing systems 104, which may send risk assessment queries to the risk assessment server 118 to request risk assessment. The client computing systems 104 may be associated with technological providers, such as cloud computing providers, online storage providers, or other types of organizations. The client computing systems 104 may provide interactive computing environments (e.g., a website, user interface, etc.) for users to access various services offered by these service providers. Users can utilize user computing systems 106 to access the interactive computing environments thereby accessing the services provided by these providers.

[0050] In some implementations, an infrastructure service of a computing environment provided by the client computing system 104 can submit a request to access or interact with another infrastructure service in the computing environment. Based on the request, the client computing system 104 can generate and submit a risk assessment query for the infrastructure service to the risk assessment server 118. The risk assessment query can include, for example, an identifier corresponding to the infrastructure service and other information associated with the infrastructure service that can be utilized to generate predictor variables. The risk assessment server 118 can perform a risk assessment based on predictor variables related to the infrastructure service and return the predicted risk indicator to the client computing system 104.

[0051] Based on the received risk indicator, the client computing system 104 can determine whether to grant the infrastructure service access to the interactive computing environment. In some cases, the client computing system 104 determines that the level of risk associated with the infrastructure service accessing or interacting with the other infrastructure service in the computing environment and the associated technical or financial service is too high. As a result, the client computing system 104 can deny access by the infrastructure service to the other infrastructure service. Conversely, if the client computing system 104 determines that the level of risk associated with the infrastructure service is acceptable (e.g., below a certain threshold), the client computing system 104 can grant access to the other infrastructure service in the computing environment by the infrastructure service. The infrastructure service then would be able to interact with or use the various services provided by the other infrastructure service. For example, with the granted access, the infrastructure service can utilize the other infrastructure service to access cloud computing resources, online storage resources, web pages, or other user interfaces provided by the other infrastructure service to execute applications, store data, query data, submit an online digital application, operate electronic tools, or perform various other operations within the computing environment hosted by the client computing system 104.

[0052] In some examples, outputting the interactive GUI can involve generating a set of visual representations such that a respective visual representation corresponds to each node of the graph database. For example, the user computing system 106 can execute a visualization engine that can communicate with the database generation module to use the schema of the graph database to generate the set of visual representations. Additionally, the visualization engine can generate the interactive GUI to include one or more filter options that a user of the interactive GUI can select to modify the set of visual representations outputted in the interactive GUI. Outputting the interactive GUI also can include providing an interface element that receives user input to generate a simulation request indicating a requested modification to the graph database.

[0053] In some examples, the user of the interactive GUI may generate the simulation request that can be received after outputting the interactive GUI, such as via a display device that can be part of the user computing system 106. For example, the user can provide the user input to the interface element of the interactive GUI via an input device (e.g., a keyboard, mouse, touchscreen, etc.) that can be part of the user computing system 106. The simulation request can indicate a requested modification to the graph database. For example, the requested modification may include or be part of a potential scenario that the user is requesting to simulate based on the graph database. Based on the simulation request, an updated risk indicator can be determined. In particular, the updated risk indicator can be determined subsequent to applying the requested modification to the graph database. Additionally, an updated GUI can be generated based on the requested modification indicated in the simulation request such that the updated GUI includes the updated risk indicator. The updated GUI may replace an existing version of the GUI that has been outputted for display to the user.
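
A simulation request of this kind can be sketched as applying the requested modification to a copy of the graph and recomputing the risk indicator on that copy. In the example below, the scoring function `risk_indicator` (the fraction of neighboring nodes that are unverified) is a toy stand-in for the output of the trained machine-learning model, and the request format (`remove_nodes`) and the node names are hypothetical.

```python
import copy


def risk_indicator(graph, target):
    """Toy stand-in for the model output: share of unverified neighbors."""
    neighbors = {dst for src, dst in graph["edges"] if src == target}
    neighbors |= {src for src, dst in graph["edges"] if dst == target}
    if not neighbors:
        return 0.0
    unverified = sum(1 for n in neighbors if not graph["nodes"][n]["verified"])
    return unverified / len(neighbors)


def simulate(graph, target, request):
    """Apply the requested modification to a copy and recompute the indicator."""
    candidate = copy.deepcopy(graph)
    for node in request.get("remove_nodes", []):
        candidate["nodes"].pop(node, None)
        # Remove edges that referenced the deleted node.
        candidate["edges"] = [
            (src, dst) for src, dst in candidate["edges"] if node not in (src, dst)
        ]
    return {
        "target": target,
        "current": risk_indicator(graph, target),
        "simulated": risk_indicator(candidate, target),
    }


graph = {
    "nodes": {
        "web-app": {"verified": True},
        "orders-db": {"verified": True},
        "legacy-batch": {"verified": False},
    },
    "edges": [("web-app", "orders-db"), ("legacy-batch", "orders-db")],
}
result = simulate(graph, "orders-db", {"remove_nodes": ["legacy-batch"]})
```

Working on a copy keeps the stored graph unchanged, so the updated GUI can show the simulated indicator alongside the current one without committing the scenario.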

[0054] Referring now to FIG. 3, a block diagram depicting an example of a computing environment 300 to generate a visual representation of cloud infrastructure is presented. For illustrative purposes, the computing environment 300 is described with reference to certain examples depicted in other figures (e.g., FIG. 1). In some cases, the components depicted in FIG. 3 may be part of the operating environment 100 of FIG. 1, such as part of user computing system(s) 106. Other implementations, however, are possible.

[0055] In some examples, the computing environment 300 can include a database generation module 302 used to generate a graph database 304 including graph data, such as one or more nodes and one or more edges connecting the nodes. The database generation module 302 can include at least one machine-learning model (e.g., language model 102) trained to perform language-related machine-learning tasks or information retrieval tasks. In some implementations, the language model 102 can be trained to receive documentation or other text-based data from heterogeneous sources as input and generate structured data as output. The heterogeneous sources can be external data sources 306 related to computing environments communicatively coupled with the computing environment 300, such as computing environments associated with or provided by the client computing systems 104. The external data sources 306 can use different formats to store or present their data, which may be in an unstructured format (e.g., text). Non-limiting examples of the documentation provided as input to the language model 102 can include service documentation 308, configuration files 310, or other suitable reference materials. The documentation can include information indicating how to configure a particular infrastructure service or environment, such as by declaring resources of a configuration or by including commands to create a deployment.

[0056] Using the documentation, the language model 102 can output a dataset including resource configuration data 312 indicating resources, network topology, or permissions related to the computing environments associated with the external data sources 306. In particular, the language model 102 can transform the documentation provided in various unstructured formats into the resource configuration data 312 that has a structured format usable to generate the graph database 304. The unstructured formats of the documentation may vary depending on hardware or software of the computing environments. As an example, the language model 102 may extract text from files in various formats, process the text using natural language processing, and output the resource configuration data 312 in a dataset having a data structure (e.g., a table) with a standardized or normalized format. In other words, the language model 102 can normalize the resource configuration data 312 to generate normalized resource configuration data 314 by transforming the resource configuration data 312 from the unstructured format(s) into a structured format. As another example, the language model 102 can be trained to extract data related to an existing instance of an infrastructure service in the computing environment 300. For instance, the computing environment 300 may include a running compute (not shown in FIG. 3) that has been assigned certain system resources (e.g., processing power, memory, etc.). The language model 102 can analyze a particular configuration file used to deploy the running compute to obtain resource configuration data 312 associated with the running compute.

[0057] The database generation module 302 additionally can include another machine-learning model (e.g., schema model 316) to generate a database schema 318 that can indicate infrastructure of the external data sources 306. In some cases, the database schema 318 can be a diagram representing data stored in the graph database 304. The schema model 316 can receive the dataset including the normalized resource configuration data 314 from the language model 102. Generating the database schema 318 can involve translating the normalized resource configuration data 314 into an implementation (e.g., a diagram) that can be interpreted to generate or maintain the graph database 304. For example, the normalized resource configuration data 314 may be outputted by the language model 102 in a tabular format. The schema model 316 can use the normalized resource configuration data 314 as an input and extract relevant information to construct the database schema 318 that represents the information using nodes, edges, attributes, or a combination thereof. In some cases, the schema model 316 can be trained to recognize computing components in the normalized resource configuration data 314 and generate a set of nodes including a respective node corresponding to each recognized computing component. Additionally, the schema model 316 can extract relationships between the computing components from the normalized resource configuration data 314 and generate corresponding edges connecting nodes in the set of nodes. In some cases, the schema model 316 also can identify attributes in the normalized resource configuration data 314 that describe the computing components or relationships. Once the attributes are identified, the schema model 316 can assign each attribute to its corresponding node or edge as part of constructing the database schema 318.

[0058] In some examples, the database schema 318 may change over time to reflect modifications (e.g., additions, removals, updates, etc.) to the infrastructure services or other events. For example, the infrastructure services may undergo changes or redistributions in resource allocation, such as causing certain infrastructure services to be allocated less system resources after the changes. As another example, certain infrastructure services may be unavailable, such as to implement an update or a replacement service. Other changes to the external data sources 306 or the corresponding computing environments are possible. In some cases, the computing environment 300 can include one or more event listeners (e.g., a plugin, application, etc.) that can initiate a predefined process or perform a predefined action if one or more events are detected. The event listeners can identify when an event associated with the database schema 318 has occurred and can collect event data 320 associated with each detected event. In some cases, the schema model 316 can receive input related to an occurrence of an event and can update the database schema 318 based on the event and its corresponding event data 320. As an example, a particular event listener can be implemented to log events related to a particular infrastructure service (e.g., a running application). The log can include event data 320 generated by the particular event listener and can be provided as input to the schema model 316 such that the schema model 316 can determine a modification to the database schema 318 based on the occurrence of the events indicated in the log. Accordingly, the schema model 316 can generate an updated version of the database schema 318 using the log. In some implementations, the database generation module 302 can implement version control, such as by tracking or storing multiple versions of the database schema 318. 
As an example, the database generation module 302 can store a current version of the database schema 318 as well as previous versions of the database schema 318 to be able to perform a rollback mechanism to revert to a particular previous version.
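
As a non-limiting illustration, event-driven updates combined with version tracking and rollback might be sketched as follows in Python. The class SchemaHistory, the event fields kind, service, and attributes, and the representation of a schema as a dictionary are hypothetical assumptions made for this sketch. The disclosure does not specify a storage format or an event format.

```python
import copy


class SchemaHistory:
    """Hypothetical version store: keeps every schema version for rollback."""

    def __init__(self, initial):
        self._versions = [copy.deepcopy(initial)]

    @property
    def current(self):
        return self._versions[-1]

    def apply_event(self, event):
        """Create a new version reflecting one logged event; return its index."""
        updated = copy.deepcopy(self.current)
        service = event["service"]
        if event["kind"] == "added":
            updated[service] = dict(event.get("attributes", {}))
        elif event["kind"] == "removed":
            updated.pop(service, None)
        elif event["kind"] == "updated":
            updated.setdefault(service, {}).update(event.get("attributes", {}))
        self._versions.append(updated)
        return len(self._versions) - 1

    def rollback(self, version):
        """Revert by appending a copy of an earlier version, preserving history."""
        self._versions.append(copy.deepcopy(self._versions[version]))
        return self.current


history = SchemaHistory({"app": {"memory_gb": 8}})
v1 = history.apply_event(
    {"kind": "updated", "service": "app", "attributes": {"memory_gb": 4}}
)
after_update = copy.deepcopy(history.current)
restored = history.rollback(0)
```

In this sketch, a rollback appends a copy of the earlier version rather than discarding later versions, so the full sequence of changes remains available.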

[0059] Once the schema model 316 outputs the database schema 318, a visualization engine 322 in the computing environment 300 can use the database schema 318 to generate an interactive architecture diagram 324 including one or more visual representations. The visual representations can indicate a current state of the computing environments associated with the external data sources 306, such as by providing information related to features, services, or components of the computing environments. In other words, the interactive architecture diagram 324 can model the computing environments, for example enabling a user to visually interpret and analyze the infrastructure services provided in the computing environments. Changes made to the database schema 318 can be reflected in the interactive architecture diagram 324, such as in batched updates or in real-time. For example, the visualization engine 322 can receive communication (e.g., an update request) from the database generation module 302 when the database schema 318 is updated, such as when an updated version of the database schema 318 is generated. The communication outputted by the database generation module 302 can indicate the changes made to the database schema 318. Using the communication from the database generation module 302, the visualization engine 322 can update the visual representations of the interactive architecture diagram 324.

[0060] Each visual representation included in the interactive architecture diagram 324 can correspond to a respective infrastructure service provided by the computing environments related to the external data sources 306. In some examples, the visualization engine 322 can output the interactive architecture diagram 324 via a graphical user interface (GUI) 326. For example, the visualization engine 322 can be communicatively coupled with an output device, such as a display device, that can provide the GUI 326 for display to a user (e.g., a user of the user computing systems 106). Additionally, the visualization engine 322 can be in communication with an input device (e.g., a keyboard, a mouse, a touchscreen, etc.) that can receive user input with respect to the interactive architecture diagram 324 displayed via the GUI 326. For example, a user may interact with the GUI 326 via the input device to drag a component (e.g., an interface element) of the interactive architecture diagram 324 from a starting position to an ending position to reposition the component. As another example, the user may hover over a particular component of the GUI 326 using a mouse as the input device. Based on this mouse event, the GUI 326 can be updated to display attributes (e.g., processing power, storage, resource allocation, network access, etc.) associated with the particular component of the interactive architecture diagram 324.

[0061] In some implementations, the database schema 318 may include hundreds, thousands, or millions of nodes and edges, which can be difficult for a user to manually interpret or analyze. Rendering a respective visual representation corresponding to each node in the database schema 318 can be inefficient depending on a quantity of visual representations being presented via the interactive architecture diagram 324. In some examples, the GUI 326 can include one or more filter options selectable by a user, such as to generate a customized view of the interactive architecture diagram 324 based on user input. For example, the GUI 326 may transmit an update request to the visualization engine 322, where the update request can include user input provided by a user to the GUI 326 indicating a selected filter option by which to update the interactive architecture diagram 324. Each filter option can correspond to a set of criteria by which the visual representations can be processed to output a subset of visual representations that match or fulfill the set of criteria. Examples of the criteria can include a type of the infrastructure service (e.g., a database, virtual machine, container, etc.), a location of the infrastructure service, connectivity (e.g., network access or lack thereof), access permissions, firewall rules, etc. Selecting a filter option can cause the visualization engine 322 to provide the subset of the visual representations for display via the GUI 326 based on the selected filter option. In other words, the filter options can be applied to decrease the number of visual representations displayed via the GUI 326, which can facilitate interpretation and analysis of the visual representations.
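
As a non-limiting illustration, applying a selected filter option to the visual representations might be sketched as follows in Python. The function apply_filter and the field names id, type, region, and storage_gb are hypothetical and are assumed only for this sketch. A criterion is expressed either as an exact value or as a predicate over the field value.

```python
def apply_filter(representations, criteria):
    """Return only the representations that satisfy every criterion.

    Each criterion maps a field name to either an exact value or a
    predicate (a callable that returns True for acceptable values).
    """
    def matches(rep):
        for key, expected in criteria.items():
            value = rep.get(key)
            if callable(expected):
                if value is None or not expected(value):
                    return False
            elif value != expected:
                return False
        return True

    return [rep for rep in representations if matches(rep)]


representations = [
    {"id": "db-1", "type": "database", "region": "us-east", "storage_gb": 2},
    {"id": "vm-1", "type": "virtual machine", "region": "us-east", "storage_gb": 16},
    {"id": "db-2", "type": "database", "region": "eu-west", "storage_gb": 8},
]

# Filter by service type, then by a minimum amount of storage.
databases = apply_filter(representations, {"type": "database"})
large = apply_filter(representations, {"storage_gb": lambda gb: gb >= 4})
```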

[0062] In addition to providing a representation of the current state of the computing environments, the interactive architecture diagram 324 can be outputted for display to a user to visualize proposed changes to the computing environments. The proposed changes can be applied to the database schema 318 prior to actual implementation in the computing environments, such as to predict results or consequences associated with implementing the proposed changes. In some cases, the visualization engine 322 can receive a simulation request 328 including a requested modification 330 of the graph database 304 or the database schema 318. For example, the simulation request 328 may be generated based on user input provided via the input device. Once the visualization engine 322 receives the simulation request 328, the visualization engine 322 can communicate with the database generation module 302 to update the database schema 318. The database generation module 302 can execute the schema model 316 to fulfill the simulation request 328 by applying the requested modification 330 and generating an updated version of the database schema 318. The graph database 304 can, in turn, be updated using the updated version of the database schema 318 that includes the requested modification 330. In some implementations, once the updated version of the database schema 318 is generated, the database generation module 302 may communicate with the risk assessment server 118 to perform a risk assessment with respect to the updated database schema. Changes applied based on the requested modification 330 may affect (e.g., increase or decrease) a respective level of risk associated with the infrastructure services or other components of the computing environments. Accordingly, the updated database schema can be used to generate new predictor variables provided as input to the risk prediction model 120 to generate updated risk indicators corresponding to the infrastructure services. 
The risk assessment server 118 can transmit the updated risk indicators to the database generation module 302 or the visualization engine 322 to include in the interactive architecture diagram 324.

[0063] In some examples, the requested modification 330 can include adjustments to resource allocation of the computing environments, such as by increasing an amount of system resources available to be allocated in the computing environments. For example, the requested modification 330 may include adding four additional processors to a particular compute in the computing environments. The requested modification 330 can be inputted into the schema model 316 to generate an updated database schema that assigns the four additional processors to the particular compute. In some cases, the updated database schema may include updated attributes related to the particular compute, such as increased bandwidth associated with the particular compute due to the increase in the quantity of processors. Additionally, the updated database schema may include updates to other components or infrastructure services related to the particular compute based on the increase in processors.

[0064] As another example, the requested modification 330 may involve simulating an outage in which a subset of system resources or infrastructure services of the computing environments are unavailable (e.g., offline or disconnected from network access). Simulating the outage can ensure that the computing environments are sufficiently overprovisioned to maintain functionality while the subset of system resources or infrastructure services are unavailable. In other words, simulating the outage can be used to determine weaknesses or vulnerabilities in resource allocation or provisioning of the computing environments. For instance, the simulated outage can be applied to evaluate whether to implement redundancy (e.g., duplicate versions or copies of certain components) in the computing environments. As an example, once an updated database schema is generated to simulate the outage, the database generation module 302 may communicate with the risk assessment server 118 to determine updated risk indicators of the infrastructure services or other components of the computing environments. The updated risk indicators can be used to identify vulnerable components of the computing environments in the updated database schema, such as a database that lacks a duplicate copy in another region or zone of a cloud computing environment. Once outputted by the risk prediction model 120 of the risk assessment server 118, the updated risk indicators can be provided in the interactive architecture diagram 324 via the GUI 326 for display to the user.
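
As a non-limiting illustration, an outage simulation that surfaces a database lacking a duplicate copy in another region might be sketched as follows in Python. The sketch uses a simple rule-based check in place of the risk prediction model 120. The functions simulate_outage and unprotected, the service names, and the fields region, status, and replicas are hypothetical assumptions made for this sketch.

```python
import copy


def simulate_outage(services, offline_region):
    """Mark every service in one region offline on a copy of the data.

    The input is left untouched, mirroring a simulation that does not
    affect the actual computing environments.
    """
    simulated = copy.deepcopy(services)
    for svc in simulated.values():
        if svc["region"] == offline_region:
            svc["status"] = "offline"
    return simulated


def unprotected(services):
    """Return names of offline services that have no online replica."""
    result = []
    for name, svc in services.items():
        if svc["status"] != "offline":
            continue
        replicas = [services[r] for r in svc.get("replicas", []) if r in services]
        if not any(r["status"] == "online" for r in replicas):
            result.append(name)
    return sorted(result)


services = {
    "orders-db": {"region": "us-east", "status": "online", "replicas": ["orders-db-eu"]},
    "orders-db-eu": {"region": "eu-west", "status": "online", "replicas": ["orders-db"]},
    "audit-db": {"region": "us-east", "status": "online", "replicas": []},
}
after = simulate_outage(services, "us-east")
at_risk = unprotected(after)
```

In this sketch, only the service without a replica outside the offline region is reported, which corresponds to a candidate for added redundancy.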

[0065] FIG. 4 illustrates a diagram depicting an example of graph data 400 in a graph database (e.g., the graph database 304 of FIG. 3). In some examples, each node of the graph data 400 can represent a respective computing component (e.g., an infrastructure service) associated with one or more computing environments provided by one or more computing systems. In the illustrated example, the graph data 400 includes three nodes corresponding to three infrastructure services (e.g., a first infrastructure service 402A, a second infrastructure service 402B, and a third infrastructure service 402C). Each node can include one or more edges 404 connecting individual nodes in the graph data 400 and indicating a relationship (e.g., dependency, parent-child relationship, etc.) between connected nodes. In some cases, an edge 404 can indicate that a node includes or belongs to another node. Additionally, in some examples, the graph data 400 can include one or more attributes 406 assigned to each node. For example, as shown in FIG. 4, the second infrastructure service 402B can have attributes 406 indicating its central processing unit (CPU) and memory allocated to the second infrastructure service 402B. As another example, the third infrastructure service 402C can have an attribute 406 indicating information related to its operating system, such as a version of the operating system or a type of the operating system.

[0066] In some examples, a particular node of the graph data 400 can be a target component that undergoes a risk assessment to determine a corresponding risk indicator quantifying a level of risk associated with the particular node. For example, the target component may be the first infrastructure service 402A that can be configured to provide firewall functionality to manage (e.g., allow or deny) access to other infrastructure services (e.g., the second and third infrastructure services 402B-C). A risk assessment server 118 can provide the graph data 400 as input to a risk prediction model 120 trained to output a risk indicator based on the graph data 400. In some cases, the risk prediction model 120 may determine one or more predictor variables 124 associated with the first infrastructure service 402A (e.g., related to its functionality, relationship to other nodes in the graph data 400, etc.). The risk prediction model 120 can evaluate the first infrastructure service 402A based on its predictor variables to determine a corresponding risk indicator quantifying a level of risk associated with the first infrastructure service 402A. For example, the risk indicator can be used to determine whether sufficient security controls are in place to prevent unauthorized access to the second and third infrastructure services 402B-C, which, in some examples, can be virtual machine instances. In some implementations, if the level of risk is above a predefined threshold, the risk prediction model 120 may output a recommendation (e.g., including an action or operation) to reduce the level of risk. For example, the recommendation may include updating a rule set of the first infrastructure service 402A to implement stricter access control with respect to the second and third infrastructure services 402B-C. 
In some cases, the recommendation can be automatically implemented by a computing environment corresponding to the first infrastructure service 402A.

[0067] As another example in which the infrastructure services 402A-C are separate applications in communication with each other, the risk assessment server 118 may use the risk prediction model 120 to generate updated risk indicators after a security breach. In some cases, an extent of the security breach may be initially unknown. The updated risk indicators can be used to determine whether the infrastructure services 402A-C have been compromised by a malicious actor or otherwise affected by the security breach. For example, the updated risk indicators can be compared to one or more previous risk indicators to determine whether a significant change (e.g., above a predefined threshold) has occurred, which can be indicative of an infrastructure service being compromised. In some implementations, if the first infrastructure service 402A is determined to be compromised based on the updated risk indicators, its access to the remaining infrastructure services (e.g., the second and third infrastructure services 402B-C) can automatically be revoked or limited.
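
As a non-limiting illustration, comparing updated risk indicators against previous risk indicators and revoking access for a flagged service might be sketched as follows in Python. The functions flag_compromised and revoke_access, the numeric risk values, the threshold of 0.25, and the representation of access as source and target pairs are hypothetical assumptions made for this sketch. The disclosure does not specify a scale for the risk indicators.

```python
def flag_compromised(previous, updated, threshold):
    """Return services whose risk indicator changed by more than the threshold."""
    return sorted(
        name
        for name, score in updated.items()
        if name in previous and abs(score - previous[name]) > threshold
    )


def revoke_access(access, flagged):
    """Drop every access grant that originates from a flagged service."""
    return {(src, dst) for (src, dst) in access if src not in flagged}


# Hypothetical indicator values before and after a security breach.
previous = {"402A": 0.20, "402B": 0.25, "402C": 0.30}
updated = {"402A": 0.75, "402B": 0.30, "402C": 0.28}

flagged = flag_compromised(previous, updated, threshold=0.25)

access = {("402A", "402B"), ("402A", "402C"), ("402B", "402C")}
remaining = revoke_access(access, flagged)
```

In this sketch, only the first service exceeds the threshold, so only its outgoing access to the second and third services is revoked.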

[0068] In some examples, as described herein, the graph data 400 can be used to generate an interactive architecture diagram visualizing the infrastructure services 402A-C. The interactive architecture diagram can be outputted for display to a user via a graphical user interface (GUI). An example of such a GUI 326 is depicted in FIG. 5. The GUI 326 can include a set of visual representations 502 provided as part of interactive architecture diagram 324. For example, as shown in FIG. 5, the interactive architecture diagram 324 can include a first visual representation 502A, a second visual representation 502B, and a third visual representation 502C. Each visual representation 502 can correspond to a respective node of the graph data 400. In some cases, each visual representation 502 can include information related to a respective infrastructure service corresponding to the nodes of the graph data 400. As an example, the visual representations 502A-C can include a respective risk indicator (e.g., a first risk indicator 504A, a second risk indicator 504B, or a third risk indicator 504C) corresponding to each infrastructure service. As described herein, the risk indicators 504A-C can be outputted by a risk prediction model 120 to quantify a level of risk associated with each infrastructure service. Other information related to the infrastructure services may be provided in the visual representations 502A-C. For example, the visual representations 502A-C can include attributes associated with a corresponding node.

[0069] In some examples, the GUI 326 can include one or more interface elements 506 selectable by a user to customize or modify the interactive architecture diagram 324 outputted via the GUI 326. For example, the GUI 326 can include a first interface element 506A enabling the user to filter the visual representations 502 of the interactive architecture diagram 324 using one or more filter options 508. In some implementations, the first interface element 506A can include checkboxes, a dropdown menu, text boxes, radio buttons, or other suitable interface elements to receive user input. Each filter option can include one or more criteria by which to select which visual representations to display via the interactive architecture diagram 324. As shown in FIG. 5, the first interface element 506A can include three filter options: a first filter option 508A, a second filter option 508B, and a third filter option 508C. Any number of filter options are possible. As an example, the first filter option 508A can enable the user to select which types (e.g., virtual machine instance, container instance, database, firewall service, etc.) of infrastructure services to display in the interactive architecture diagram 324. As another example, the second filter option 508B can enable the user to narrow down the displayed visual representations to those that correspond to infrastructure services with a predefined amount of storage available, such as a minimum of 4 gigabytes (GB).

[0070] In some implementations, the GUI 326 can include a second interface element 506B enabling the user to provide one or more proposed changes to display by updating the interactive architecture diagram 324. As an example, the second interface element 506B can include one or more text boxes such that the user can provide user input to generate a simulation request 328 indicating at least one requested modification 330 to update the GUI 326. In particular, the user can specify a first parameter 510A and a second parameter 510B as the proposed changes to include in the simulation request 328. As described herein, examples of the proposed changes can include simulating an outage of certain infrastructure services or system resources or simulating a different distribution of system resources. For instance, the first parameter 510A can involve assigning an offline status to certain infrastructure services in a computing environment (e.g., in a particular zone or region). As another example, the second parameter 510B can involve redistributing memory from one infrastructure service to another infrastructure service.

[0071] The proposed changes can be applied to the graph data 400 without affecting an actual implementation of the infrastructure services. In other words, the proposed changes can be used to apply a modification to the GUI 326 to generate an updated GUI for display to the user, rather than being directly applied to the infrastructure services. Accordingly, the user can use the GUI 326 to effect or simulate the proposed changes in the interactive architecture diagram 324 to evaluate any effects or results caused by the proposed changes. In some cases, the effects of the proposed changes can indicate pinch points or weaknesses in a computing environment, such as with respect to cybersecurity, reliability, etc. Additionally or alternatively, the effects of the proposed changes can be used as part of a cost-benefit analysis implemented to determine whether to actually implement the proposed changes in the computing environment. The graph data 400 of the interactive architecture diagram 324 can be used to train a machine-learning model (e.g., risk prediction model 120). The information represented in the interactive architecture diagram 324 can enable the machine-learning model to predict risk related to infrastructure services and determine remedial actions to decrease risk.

[0072] Any suitable computing system or group of computing systems can be used to perform the operations for the machine-learning operations described herein. For example, FIG. 6 is a block diagram depicting an example of a computing device 600, which can be used to implement the risk assessment server 118 or the model training server 110. The computing device 600 can include various devices for communicating with other devices in the operating environment 100, as described with respect to FIG. 1. The computing device 600 can include various devices for performing one or more transformation operations described above with respect to FIGS. 1-5.

[0073] The computing device 600 can include a processor 602 that is communicatively coupled to a memory 604. The processor 602 executes computer-executable program code stored in the memory 604, accesses information stored in the memory 604, or both. Program code may include machine-executable instructions that may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, among others.

[0074] Examples of a processor 602 include a microprocessor, an application-specific integrated circuit, a field-programmable gate array, or any other suitable processing device. The processor 602 can include any number of processing devices, including one. The processor 602 can include or communicate with a memory 604. The memory 604 stores program code that, when executed by the processor 602, causes the processor to perform the operations described in this disclosure.

[0075] The memory 604 can include any suitable non-transitory computer-readable medium. The computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable program code or other program code. Non-limiting examples of a computer-readable medium include a magnetic disk, memory chip, optical storage, flash memory, storage class memory, ROM, RAM, an ASIC, magnetic storage, or any other medium from which a computer processor can read and execute program code. The program code may include processor-specific program code generated by a compiler or an interpreter from code written in any suitable computer-programming language. Examples of suitable programming languages include Hadoop, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, ActionScript, etc.

[0076] The computing device 600 may also include a number of external or internal devices such as input or output devices. For example, the computing device 600 is shown with an input / output interface 608 that can receive input from input devices or provide output to output devices. A bus 606 can also be included in the computing device 600. The bus 606 can communicatively couple one or more components of the computing device 600.

[0077] The computing device 600 can execute program code 614 that includes the risk assessment application 114 and / or the model training application 112. The program code 614 for the risk assessment application 114 and / or the model training application 112 may be resident in any suitable computer-readable medium and may be executed on any suitable processing device. For example, as depicted in FIG. 6, the program code 614 for the risk assessment application 114 and / or the model training application 112 can reside in the memory 604 at the computing device 600 along with the program data 616 associated with the program code 614, such as the predictor variables 124, the model training samples 126, and / or the graph data 142. Executing the risk assessment application 114 or the model training application 112 can configure the processor 602 to perform the operations described herein. In some examples, the program code 614 includes the management module 117, visualization engine 322, database generation module 302, or a combination thereof.

[0078] In some aspects, the computing device 600 can include one or more output devices. One example of an output device is the network interface device 610 depicted in FIG. 6. A network interface device 610 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks described herein. Non-limiting examples of the network interface device 610 include an Ethernet network adapter, a modem, etc.

[0079] Another example of an output device is the presentation device 612 depicted in FIG. 6. A presentation device 612 can include any device or group of devices suitable for providing visual, auditory, or other suitable sensory output. Non-limiting examples of the presentation device 612 include a touchscreen, a monitor, a speaker, a separate mobile computing device, etc. In some aspects, the presentation device 612 can include a remote client-computing device that communicates with the computing device 600 using one or more data networks described herein. In other aspects, the presentation device 612 can be omitted.

[0080] The foregoing description of some examples has been presented only for the purpose of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Numerous modifications and adaptations thereof will be apparent to those skilled in the art without departing from the spirit and scope of the disclosure.

Claims

1. A computer-implemented method comprising: accessing, by a processor, a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, wherein the machine-learning model is trained to analyze a graph database comprising: a set of nodes, wherein each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments; and a set of edges connecting individual nodes of the set of nodes, wherein the set of edges represents relationships between the respective connected nodes; generating, by the processor and for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database; and outputting, by the processor and to a remote computing device, a graphical user interface comprising at least the risk indicator for use in controlling access to the one or more infrastructure services.

2. The computer-implemented method of claim 1, wherein the machine-learning model trained to analyze the graph database is a first machine-learning model, and wherein the method further comprises generating the graph database by: extracting, by the processor from one or more external data sources using a second machine-learning model trained to perform natural language processing, resource configuration data provided by the one or more external data sources; normalizing, by the processor using the second machine-learning model, the resource configuration data by converting the resource configuration data from an unstructured format into a structured format; providing, by the processor, the normalized resource configuration data to a third machine-learning model trained to generate a database schema, wherein the database schema indicates the set of nodes and the set of edges associated with the graph database; and generating, by the processor using a database generation module, the graph database based on the database schema outputted by the third machine-learning model.

3. The computer-implemented method of claim 2, wherein generating the database schema further comprises: extracting, by the processor from one or more configuration files of the resource configuration data using the second machine-learning model, a dataset corresponding to the one or more infrastructure services; generating, by the processor using the third machine-learning model, the set of nodes based on the one or more infrastructure services determined using the dataset; and generating, by the processor using the third machine-learning model, the set of edges to indicate a respective relationship among the set of nodes.

4. The computer-implemented method of claim 1, further comprising: detecting, by the processor, an occurrence of an event associated with the graph database; determining, by the processor using a separate machine-learning model trained to perform natural language processing, a modification to the graph database based on the occurrence of the event; and applying, by the processor, the modification to the graph database to generate an updated graph database by modifying the set of nodes or the set of edges of the graph database.

5. The computer-implemented method of claim 1, further comprising: receiving, by the processor, a simulation request generated by a user of the graphical user interface, wherein the simulation request indicates a requested modification to the graph database; determining, by the processor based on the simulation request, an updated risk indicator; and generating, by the processor based on the requested modification indicated in the simulation request, an updated graphical user interface that comprises the updated risk indicator.

6. The computer-implemented method of claim 1, wherein outputting the graphical user interface further comprises: generating, by the processor using a visualization engine, a set of visual representations comprising a respective visual representation corresponding to each node; providing, by the processor using the visualization engine, one or more filter options selectable by a user of the graphical user interface to modify the set of visual representations outputted in the graphical user interface; and providing, by the processor using the visualization engine, an interface element that receives user input to generate a simulation request indicating a requested modification to the graph database.

7. The computer-implemented method of claim 1, wherein each node of the set of nodes comprises a respective set of attributes defining a plurality of system resources provided by a corresponding node of the set of nodes.
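
As a non-limiting example, a node carrying a set of attributes that define system resources might be represented as shown below. The `ServiceNode` class, the attribute names (`cpu_cores`, `memory_gb`), and the `nodes_with` helper are hypothetical and chosen only for this sketch; the claims do not enumerate specific attributes.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceNode:
    """One infrastructure service and the system resources it provides."""
    service_id: str
    attributes: dict = field(default_factory=dict)


def nodes_with(nodes: list, resource: str, minimum: float) -> list:
    """Return ids of nodes whose named resource attribute meets a minimum."""
    return [
        node.service_id
        for node in nodes
        if node.attributes.get(resource, 0) >= minimum
    ]


nodes = [
    ServiceNode("api-gateway", {"cpu_cores": 2, "memory_gb": 4}),
    ServiceNode("orders-db", {"cpu_cores": 8, "memory_gb": 32}),
    ServiceNode("static-assets"),  # no resource attributes recorded
]
large_memory = nodes_with(nodes, "memory_gb", 16)
```

A node with no recorded value for a resource is treated as providing none of it in this sketch, which is one of several reasonable conventions.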

8. A system comprising: a processor; and a memory device in which instructions executable by the processor are stored for causing the processor to: access, by the processor, a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, wherein the machine-learning model is trained to analyze a graph database comprising: a set of nodes, wherein each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments; and a set of edges connecting individual nodes of the set of nodes, wherein the set of edges represents relationships between the respective connected nodes; generate, for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database; and output, to a remote computing device, a graphical user interface comprising at least the risk indicator for use in controlling access to the one or more infrastructure services.

9. The system of claim 8, wherein the machine-learning model trained to analyze the graph database is a first machine-learning model, and wherein the instructions cause the processor to generate the graph database by: extracting, by the processor from one or more external data sources using a second machine-learning model trained to perform natural language processing, resource configuration data provided by the one or more external data sources; normalizing, by the processor using the second machine-learning model, the resource configuration data by converting the resource configuration data from an unstructured format into a structured format; providing, by the processor, the normalized resource configuration data to a third machine-learning model trained to generate a database schema, wherein the database schema indicates the set of nodes and the set of edges associated with the graph database; and generating, by the processor using a database generation module, the graph database based on the database schema outputted by the third machine-learning model.

10. The system of claim 9, wherein the instructions cause the processor to generate the database schema by: extracting, by the processor from one or more configuration files of the resource configuration data using the second machine-learning model, a dataset corresponding to the one or more infrastructure services; generating, by the processor using the third machine-learning model, the set of nodes based on the one or more infrastructure services determined using the dataset; and generating, by the processor using the third machine-learning model, the set of edges to indicate a respective relationship among the set of nodes.

11. The system of claim 8, wherein the instructions further cause the processor to: detect an occurrence of an event associated with the graph database; determine, using a separate machine-learning model trained to perform natural language processing, a modification to the graph database based on the occurrence of the event; and apply the modification to the graph database to generate an updated graph database by modifying the set of nodes or the set of edges of the graph database.

12. The system of claim 8, wherein the instructions further cause the processor to: receive a simulation request generated by a user of the graphical user interface, wherein the simulation request indicates a requested modification to the graph database; determine, based on the simulation request, an updated risk indicator; and generate, based on the requested modification indicated in the simulation request, an updated graphical user interface that comprises the updated risk indicator.

13. The system of claim 8, wherein outputting the graphical user interface further comprises: generating, by the processor using a visualization engine, a set of visual representations comprising a respective visual representation corresponding to each node; providing, by the processor using the visualization engine, one or more filter options selectable by a user of the graphical user interface to modify the set of visual representations outputted in the graphical user interface; and providing, by the processor using the visualization engine, an interface element that receives user input to generate a simulation request indicating a requested modification to the graph database.

14. The system of claim 11, wherein each node of the set of nodes comprises a respective set of attributes defining a plurality of system resources provided by a corresponding node of the set of nodes.

15. A non-transitory computer-readable storage medium having program code that is executable by a processor to cause a computing device to perform operations, the operations comprising: accessing, by the processor, a machine-learning model trained using a training process to identify a risk indicator for a target component of one or more interactive computing environments, wherein the machine-learning model is trained to analyze a graph database comprising: a set of nodes, wherein each node of the set of nodes represents a respective infrastructure service of one or more infrastructure services provided by the one or more interactive computing environments; and a set of edges connecting individual nodes of the set of nodes, wherein the set of edges represents relationships between the respective connected nodes; generating, by the processor and for the target component, the risk indicator based on an output of the machine-learning model trained to analyze the graph database; and outputting, by the processor and to a remote computing device, a graphical user interface comprising at least the risk indicator for use in controlling access to the one or more infrastructure services.

16. The non-transitory computer-readable storage medium of claim 15, wherein the machine-learning model trained to analyze the graph database is a first machine-learning model, and wherein the operations further comprise generating the graph database by: extracting, by the processor from one or more external data sources using a second machine-learning model trained to perform natural language processing, resource configuration data provided by the one or more external data sources; normalizing, by the processor using the second machine-learning model, the resource configuration data by converting the resource configuration data from an unstructured format into a structured format; providing, by the processor, the normalized resource configuration data to a third machine-learning model trained to generate a database schema, wherein the database schema indicates the set of nodes and the set of edges associated with the graph database; and generating, by the processor using a database generation module, the graph database based on the database schema outputted by the third machine-learning model.

17. The non-transitory computer-readable storage medium of claim 16, wherein generating the database schema further comprises: extracting, by the processor from one or more configuration files of the resource configuration data using the second machine-learning model, a dataset corresponding to the one or more infrastructure services; generating, by the processor using the third machine-learning model, the set of nodes based on the one or more infrastructure services determined using the dataset; and generating, by the processor using the third machine-learning model, the set of edges to indicate a respective relationship among the set of nodes.

18. The non-transitory computer-readable storage medium of claim 15, wherein the operations further comprise: detecting, by the processor, an occurrence of an event associated with the graph database; determining, by the processor using a separate machine-learning model trained to perform natural language processing, a modification to the graph database based on the occurrence of the event; and applying, by the processor, the modification to the graph database to generate an updated graph database by modifying the set of nodes or the set of edges of the graph database.

19. The non-transitory computer-readable storage medium of claim 15, wherein the operations further comprise: receiving, by the processor, a simulation request generated by a user of the graphical user interface, wherein the simulation request indicates a requested modification to the graph database; determining, by the processor based on the simulation request, an updated risk indicator; and generating, by the processor based on the requested modification indicated in the simulation request, an updated graphical user interface that comprises the updated risk indicator.

20. The non-transitory computer-readable storage medium of claim 15, wherein outputting the graphical user interface further comprises: generating, by the processor using a visualization engine, a set of visual representations comprising a respective visual representation corresponding to each node; providing, by the processor using the visualization engine, one or more filter options selectable by a user of the graphical user interface to modify the set of visual representations outputted in the graphical user interface; and providing, by the processor using the visualization engine, an interface element that receives user input to generate a simulation request indicating a requested modification to the graph database.