Layer 2 networking storm control in virtualized cloud environments
An infinitely scalable distributed switch with Layer 2 VNICs and local switches addresses limitations in cloud computing environments, enhancing network management and storm control in virtualized networks.
Patent Information
- Authority / Receiving Office: JP
- Patent Type: Patents
- Current Assignee / Owner: ORACLE INT CORP
- Filing Date: 2021-11-24
- Publication Date: 2026-06-26
AI Technical Summary
Cloud computing environments face limitations in providing efficient Layer 2 networking functions, constraining the functionality and value of virtual networks.
Implementing an infinitely scalable distributed switch with Layer 2 virtual network interface cards (VNICs) and local switches to connect compute instances, supporting Layer 2 network services like storm control within virtualized cloud environments.
Enhances network management and scalability, enabling effective Layer 2 networking functions and storm control in virtualized cloud environments.
Abstract
Description
Technical Field
Reference to Related Applications
[0001] This international patent application claims priority to U.S. Patent Application No. 17/494,729, filed on October 5, 2021, entitled "LAYER-2 NETWORKING USING ACCESS CONTROL LISTS IN A VIRTUALIZED CLOUD ENVIRONMENT", which claims the benefit of U.S. Provisional Application No. 63/132,377, filed on December 30, 2020, entitled "LAYER-2 NETWORKING STORM CONTROL IN A VIRTUALIZED CLOUD ENVIRONMENT", the contents of which are hereby incorporated by reference in their entirety for all purposes.
Background Art
Background
[0002] Cloud computing provides on-demand availability of computing resources. Cloud computing can be based on data centers that are available to users via the Internet. Cloud computing can provide infrastructure as a service (IaaS). Virtual networks may be created for use by users. However, these virtual networks have limitations that constrain their functionality and value. Therefore, further improvements are desired.
Summary of the Invention
Means for Solving the Problems
Summary
[0003] This disclosure relates to a virtualized cloud environment. Techniques for providing Layer 2 networking functions in a virtualized cloud environment are described. The Layer 2 functions are provided in addition to, and in relation to, the Layer 3 networking functions provided by the virtualized cloud environment.
[0004] Some embodiments of this disclosure relate to providing customers with Layer 2 virtual local area networks (VLANs) within a private network, such as a customer's virtual cloud network (VCN). In a Layer 2 VLAN, different compute instances are connected. The customer is given the perception of an emulated single switch connecting the compute instances. In fact, this emulated switch is implemented as an infinitely scalable distributed switch including a set of local switches. More specifically, each compute instance runs on a host machine connected to a network virtualization device (NVD). For each compute instance on a host connected to the NVD, the NVD hosts a Layer 2 virtual network interface card (VNIC) and a local switch associated with the compute instance. The Layer 2 VNIC represents a port of the compute instance on the Layer 2 VLAN. The local switch connects the VNIC to other VNICs (e.g., other ports) associated with other compute instances on the Layer 2 VLAN. Various Layer 2 network services are supported, for example, including storm control.
[0005] This section describes various embodiments, including methods, systems, and non-transitory computer-readable storage media storing programs, code, or instructions executable by one or more processors.
Brief Description of the Drawings
[0006]
[Figure 1] A high-level diagram of a distributed environment showing a virtual or overlay cloud network hosted by a cloud service provider infrastructure, according to a specific embodiment.
[Figure 2] An architectural schematic diagram showing the physical elements of the physical network within the CSPI, according to a specific embodiment.
[Figure 3] An exemplary configuration of a CSPI in which a host machine is connected to multiple network virtualization devices (NVDs), according to a particular embodiment.
[Figure 4] A diagram showing the connection between a host machine and an NVD that provides I/O virtualization to support multi-tenancy functionality, according to a specific embodiment.
[Figure 5] A schematic block diagram showing the physical network provided by CSPI, according to a specific embodiment.
[Figure 6] A schematic diagram of a computing network, according to one embodiment.
[Figure 7] A schematic diagram of the logic and hardware of a VLAN, according to one embodiment.
[Figure 8] A schematic logic diagram of multiple connected L2 VLANs, according to one embodiment.
[Figure 9] A logical schematic diagram of multiple connected L2 VLANs and subnet 900, according to one embodiment.
[Figure 10] A schematic diagram of VLAN communication and VLAN learning, according to one embodiment.
[Figure 11] A schematic diagram of a VLAN, according to one embodiment.
[Figure 12] A flowchart showing process 1200 for communication within a VLAN, according to one embodiment.
[Figure 13] An exemplary environment suitable for defining a storm control configuration for an L2 virtual network, according to one embodiment.
[Figure 14] Exemplary storm control techniques in layered virtual networks, according to several embodiments.
[Figure 15] A sequence diagram illustrating the process for using storm control information in an L2 virtual network, according to several embodiments.
[Figure 16] A flowchart showing a process for determining the generation and distribution of storm control information, according to several embodiments.
[Figure 17] A flowchart showing the process for updating storm control policies based on collected metrics, according to several embodiments.
[Figure 18] A flowchart showing a process for updating storm control information, according to several embodiments.
[Figure 19] A block diagram showing one pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 20] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 21] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 22] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 23] A block diagram showing an exemplary computer system, according to at least one embodiment.
Modes for Carrying Out the Invention
Detailed Description
[0007] In the following description, certain details are included for illustrative purposes to facilitate a full understanding of particular embodiments. However, it will be apparent that various embodiments may be carried out without these specific details. The figures and descriptions are not intended to be limiting. The term "exemplary" is used here to mean "provided as an example, case, or illustration." Any embodiment or design described herein as "exemplary" should not necessarily be construed as being preferable or advantageous over other embodiments or designs.
A. Exemplary virtual networking architecture
[0008] The term "cloud service" generally refers to services that a cloud service provider (CSP) makes available to users or customers on demand (e.g., via a subscription model) using systems and infrastructure (cloud infrastructure). Typically, the servers and systems that make up the CSP's infrastructure are separate from the customer's own on-premises servers and systems. Thus, customers can utilize cloud services provided by a CSP without having to purchase hardware and software resources for the service separately. Cloud services are designed to provide customers who subscribe to them with easy and scalable access to applications and computing resources without the customer having to invest in the procurement of the infrastructure used to provide the service.
[0009] There are several cloud service providers that offer various types of cloud services. Cloud services include various different types or models such as SaaS (Software-as-a-Service), PaaS (Platform-as-a-Service), IaaS (Infrastructure-as-a-Service).
[0010] Customers can subscribe to one or more cloud services provided by a CSP. The customer can be any entity such as an individual, an organization, a company, etc. When a customer subscribes or signs up for the services provided by a CSP, a tenant or account is created for that customer. Thereafter, the customer can access one or more subscribed cloud resources associated with the account via this account.
[0011] As described above, IaaS (Infrastructure as a Service) is a specific type of cloud computing service. In the IaaS model, the CSP provides infrastructure (referred to as cloud service provider infrastructure or CSPI) that customers can use to build their own customizable networks and deploy customer resources. Therefore, the customer's resources and network are hosted in a distributed environment by the infrastructure provided by the CSP. This is different from traditional computing where the customer's infrastructure hosts the customer's resources and network.
[0012] CSPI may include interconnected high-performance computing resources, including various host machines, memory resources, and network resources, forming a physical network also known as an underlay network or base network. CSPI resources may be distributed across one or more data centers geographically distributed across one or more geographical regions. Virtualization software can run on these physical resources to provide a virtualized distributed environment. Virtualization creates an overlay network (also known as a software-based network, software-defined network, or virtual network) on top of the physical network. The CSPI physical network provides the foundation for creating one or more overlay or virtual networks on top of the physical network. A virtual network or overlay network may include one or more virtual cloud networks (VCNs). Virtual networks are implemented using software virtualization technologies (e.g., hypervisors, functions performed by network virtualization devices (NVDs) (e.g., smart NICs), top-of-rack (TOR) switches, smart TORs implementing one or more functions performed by NVDs, and other mechanisms) to create a network abstraction layer that can run on top of the physical network. Virtual networks can take various forms, such as peer-to-peer networks and IP networks. A virtual network is typically either a Layer 3 IP network or a Layer 2 VLAN. Such virtual or overlay networks are often referred to as virtual Layer 3 networks or overlay Layer 3 networks. Examples of protocols developed for virtual networks include IP-in-IP (or GRE (Generic Routing Encapsulation)), virtual extensible LAN (VXLAN - IETF RFC7348), virtual private networks (VPNs) (e.g., MPLS Layer 3 virtual private network (RFC4364)), VMware NSX, and GENEVE (Generic Network Virtualization Encapsulation).
[0013] In the case of IaaS, the infrastructure provided by the CSP (CSPI) may be configured to deliver virtualized computing resources over a public network (e.g., the internet). In the IaaS model, the cloud computing service provider can host infrastructure elements (e.g., servers, storage, network nodes (e.g., hardware), deployment software, platform virtualization (e.g., hypervisor layer)). In some cases, the IaaS provider can provide various services associated with these infrastructure elements (e.g., billing, monitoring, logging, security, load balancing, and clustering). Because these services are policy-driven, IaaS users can maintain application availability and performance by implementing policies to drive load balancing. The CSPI provides infrastructure and a set of complementary cloud services. This allows customers to build and run a wide range of applications and services in a highly available hosted distributed environment. The CSPI provides high-performance computing resources and capabilities, as well as storage capacity, on a flexible virtual network that can be securely accessed from various network locations, such as the customer's on-premises network. When a customer subscribes to or registers for an IaaS service provided by a CSP, the tenancy created for that customer is a secure, isolated partition within the CSPI in which the customer can create, organize, and manage cloud resources.
[0014] Customers can build their own virtual networks using the computing, memory, and networking resources provided by CSPI. They can deploy one or more customer resources or workloads, such as compute instances, on these virtual networks. For example, a customer can build one or more customizable private virtual networks called Virtual Cloud Networks (VCNs) using the resources provided by CSPI. On a customer VCN, a customer can deploy one or more customer resources, such as compute instances. Compute instances may be virtual machines, bare-metal instances, etc. Thus, CSPI provides the infrastructure and a set of complementary cloud services that enable customers to build and run a variety of applications and services in a highly available virtual host environment. Customers do not manage or control the underlying physical resources provided by CSPI, but they control the operating system, memory, and deployed applications, and, in some cases, have limited control over certain networking components (e.g., firewalls).
[0015] A CSP can provide a console that enables customers and network administrators to configure, access, and manage resources deployed to the cloud using CSPI resources. In certain embodiments, the console provides a web-based user interface that can be used to utilize and manage CSPI. In some embodiments, the console is a web-based application provided by the CSP.
[0016] CSPI can support single-tenancy or multi-tenancy architectures. In a single-tenancy architecture, software (e.g., applications, databases) or hardware elements (e.g., host machines or servers) serve a single customer or tenant. In a multi-tenancy architecture, software or hardware elements serve multiple customers or tenants. Therefore, in a multi-tenancy architecture, CSPI resources are shared among multiple customers or tenants. In a multi-tenancy environment, CSPI implements precautions and safeguards to isolate each tenant's data and prevent it from being visible to other tenants.
[0017] In a physical network, a network endpoint (endpoint) refers to a computing device or system that is connected to the physical network and communicates bidirectionally with the connected network. A network endpoint in a physical network may be connected to a local area network (LAN), a wide area network (WAN), or other types of physical networks. Examples of traditional endpoints in a physical network include modems, hubs, bridges, switches, routers, and other networking devices, as well as physical computers (or host machines). Each physical device in a physical network has a fixed network address that can be used to communicate with that device. This fixed network address may be a Layer 2 address (e.g., a MAC address), a fixed Layer 3 address (e.g., an IP address), etc. In a virtualized environment or virtual network, endpoints can include various virtual endpoints, such as virtual machines hosted by elements of the physical network (e.g., hosted by a physical host machine). These endpoints in a virtual network are addressed by overlay addresses, such as overlay Layer 2 addresses (e.g., overlay MAC addresses) and overlay Layer 3 addresses (e.g., overlay IP addresses). Network overlays provide flexibility by allowing network administrators to move overlay addresses associated with network endpoints using software management (e.g., through software implementing the control plane of the virtual network). Therefore, unlike physical networks, in virtual networks, network management software can be used to move overlay addresses (e.g., overlay IP addresses) from one endpoint to another. 
Because virtual networks are built on top of physical networks, both the virtual network and the underlying physical network are involved in communication between elements of the virtual network. To facilitate such communication, each element of the CSPI is configured to learn and store mappings that map the overlay address of the virtual network to the actual physical address of the underlying network, or vice versa. These mappings are used to facilitate communication. To facilitate routing within the virtual network, customer traffic is encapsulated.
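The mapping and encapsulation steps described above can be illustrated with a minimal sketch. It is not the implementation described in this disclosure: the `Packet`, `MappingTable`, `encapsulate`, and `decapsulate` names are hypothetical, as are all addresses, and a real NVD would use an actual encapsulation protocol rather than nested Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str         # source address
    dst: str         # destination address
    payload: object  # customer data, or an inner Packet when encapsulated


class MappingTable:
    """Hypothetical store of learned overlay-to-physical address mappings."""

    def __init__(self):
        self._overlay_to_physical = {}

    def learn(self, overlay_ip, physical_ip):
        # Re-learning an overlay address models moving it to another endpoint.
        self._overlay_to_physical[overlay_ip] = physical_ip

    def lookup(self, overlay_ip):
        return self._overlay_to_physical[overlay_ip]


def encapsulate(inner, local_physical_ip, table):
    """Wrap an overlay packet in an outer packet addressed on the underlay."""
    return Packet(src=local_physical_ip, dst=table.lookup(inner.dst), payload=inner)


def decapsulate(outer):
    """Recover the original overlay packet on the receiving side."""
    return outer.payload


# Illustrative addresses only (assumptions, not taken from the source).
table = MappingTable()
table.learn("10.0.0.5", "192.168.50.7")
inner = Packet(src="10.0.0.4", dst="10.0.0.5", payload=b"hello")
outer = encapsulate(inner, "192.168.50.3", table)
```

Because only the mapping table ties an overlay address to a physical location, moving an overlay address to another endpoint amounts to calling `learn` again with a new physical address; the inner packet is unchanged.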
[0018] Therefore, physical addresses (e.g., physical IP addresses) are associated with elements of a physical network, while overlay addresses (e.g., overlay IP addresses) are associated with entities in a virtual network. Both physical and overlay IP addresses are real IP addresses. They are distinct from virtual IP addresses, which map to multiple real IP addresses. Virtual IP addresses provide a one-to-many mapping between virtual IP addresses and multiple real IP addresses.
[0019] The cloud infrastructure, or CSPI, is physically hosted in one or more data centers in one or more regions around the world. The CSPI may include elements of a physical network or underlying network and virtualization elements of a virtual network built on top of the physical network elements (e.g., virtual networks, compute instances, virtual machines). In certain embodiments, the CSPI is organized and hosted in realms, regions, and availability domains. A region is typically a local geographical area containing one or more data centers. Regions are generally independent of each other and may be separated by vast distances, for example, across countries or continents. For example, one region may be in Australia, another in Japan, and yet another in India. CSPI resources are divided among these regions such that each region has an independent subset of CSPI resources. Each region can provide a set of core infrastructure services and resources, such as computing resources (e.g., bare metal servers, virtual machines, containers, and related infrastructure), storage resources (e.g., block volume storage, file storage, object storage, archive storage), networking resources (e.g., virtual cloud networks (VCNs), load balancing resources, connectivity to on-premises networks), database resources, edge networking resources (e.g., DNS), and access management and monitoring resources. Each region generally has multiple routes for connecting to other regions within the realm.
[0020] Generally, applications are deployed in the region where they are most frequently used (i.e., on infrastructure relevant to that region) because using nearby resources is faster than using distant resources. Applications may also be deployed in different regions for various reasons, such as redundancy to mitigate the risks of large-scale weather systems or region-wide events like earthquakes, or redundancy to meet various requirements for legal jurisdictions, tax domains, and other business or social standards.
[0021] Data centers within a region may be further organized and subdivided into availability domains (ADs). An availability domain may correspond to one or more data centers located in a region. A region may consist of one or more availability domains. In such a distributed environment, CSPI resources are region-specific, such as virtual cloud networks (VCNs), or availability domain-specific, such as compute instances.
[0022] ADs within a single region are isolated from one another, fault-tolerant, and configured so that they are highly unlikely to fail simultaneously. This is achieved by not sharing critical infrastructure resources such as networking, physical cabling, cabling routes, and cabling entry points among ADs, so that a failure in one AD within a region has little impact on the availability of other ADs within the same region. ADs within the same region can be connected by a low-latency, high-bandwidth network, which provides highly available connectivity to other networks (e.g., the internet, customer on-premises networks) and allows replicated systems to be built across multiple ADs for both high availability and disaster recovery. Cloud services use multiple ADs to ensure high availability and protect against resource failures. As the infrastructure provided by the IaaS provider grows, more regions and ADs may be added along with additional capacity. Traffic between availability domains is typically encrypted.
[0023] In certain embodiments, regions are grouped into realms. A realm is a logical collection of regions. Realms are isolated from each other and do not share any data. Regions within the same realm can communicate with each other, but regions in different realms cannot. A CSP customer's tenancy or account may reside in a single realm and span one or more regions belonging to that single realm. Typically, when a customer subscribes to an IaaS service, their tenancy or account is created in a customer-designated region within a realm (referred to as the "home" region). The customer can extend their tenancy to one or more other regions within a realm. A customer cannot access regions that do not reside in the realm where their tenancy resides.
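The realm isolation rule above (regions communicate, and tenancies extend, only within a single realm) can be expressed as a small check. This is an illustrative sketch; the realm and region names and both helper functions are hypothetical.

```python
# Hypothetical realm-to-region layout; all names are illustrative.
REALM_REGIONS = {
    "commercial": {"region-a", "region-b"},
    "government": {"region-g"},
}


def realm_of(region):
    """Return the realm that contains the given region."""
    for realm, regions in REALM_REGIONS.items():
        if region in regions:
            return realm
    raise KeyError(f"unknown region: {region}")


def can_extend_tenancy(home_region, target_region):
    """A tenancy may extend only to regions in the realm of its home region."""
    return realm_of(home_region) == realm_of(target_region)
```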
[0024] An IaaS provider can offer multiple realms, each corresponding to a specific set of customers or users. For example, a commercial realm may be offered for commercial customers. As another example, a realm may be offered for a specific country, for customers in that country. As yet another example, a government realm may be offered for a specific government and may have a higher security level than a commercial realm. For example, Oracle Cloud Infrastructure (OCI) currently offers a realm for commercial regions and two realms for government cloud regions (e.g., FedRAMP authorized and IL5 authorized).
[0025] In certain embodiments, an availability domain (AD) can be subdivided into one or more fault domains. A fault domain is a grouping of infrastructure resources within the AD that provides anti-affinity: fault domains allow compute instances to be distributed so that they are not located on the same physical hardware within a single AD. A fault domain refers to a collection of hardware elements (computers, switches, etc.) that share a single point of failure. The compute pool is logically divided into fault domains. Therefore, a hardware failure or compute hardware maintenance event affecting one fault domain does not affect instances in other fault domains. Depending on the embodiment, the number of fault domains in each AD may vary. For example, in certain embodiments, each AD may contain three fault domains. Fault domains function as logical data centers within the AD.
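As a rough illustration of anti-affinity, the sketch below spreads instances across fault domains by always choosing the least-loaded one. The placement policy, the `place_instances` function, and the instance and fault domain names are assumptions made for illustration; the source does not specify a placement algorithm.

```python
def place_instances(instance_names, fault_domains):
    """Assign each instance to the currently least-loaded fault domain."""
    load = {fd: 0 for fd in fault_domains}
    placement = {}
    for name in instance_names:
        # min() returns the first fault domain with the lowest load,
        # so ties are broken by the order of fault_domains.
        target = min(fault_domains, key=lambda fd: load[fd])
        placement[name] = target
        load[target] += 1
    return placement


# Three fault domains per AD, as in the example embodiment above.
placement = place_instances(
    ["web-1", "web-2", "web-3", "web-4"],
    ["FD-1", "FD-2", "FD-3"],
)
```

With this policy, the first three instances land in three different fault domains, so a hardware failure in one fault domain affects at most one of them.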
[0026] When a customer subscribes to an IaaS service, resources from CSPI are provisioned to the customer and associated with the customer's tenancy. The customer can use these provisioned resources to build private networks and deploy resources on these networks. Customer networks hosted on the cloud by CSPI are called Virtual Cloud Networks (VCNs). A customer can configure one or more Virtual Cloud Networks (VCNs) using the CSPI resources allocated to them. A VCN is a virtual or software-defined private network. Customer resources deployed in a customer's VCN can include compute instances (e.g., virtual machines, bare metal instances) and other resources. These compute instances may represent various customer workloads such as applications, load balancers, and databases. Compute instances deployed on a VCN can communicate with publicly accessible endpoints (public endpoints) over public networks such as the internet, with other instances within the same VCN or other VCNs (e.g., other VCNs of the customer, or VCNs not belonging to the customer), with the customer's on-premises data center or network, with service endpoints, and with other types of endpoints.
[0027] A CSP can provide a variety of services using a CSPI. In some cases, a CSPI customer can act like a service provider themselves and provide services using CSPI resources. A service provider can expose service endpoints characterized by identifying information (e.g., IP address, DNS name, and port). A customer's resources (e.g., compute instances) can consume a particular service by accessing the service endpoint of that particular service exposed by the service. These service endpoints are generally publicly accessible endpoints that users can access via public communication networks such as the internet using the public IP address associated with the endpoint. Publicly accessible network endpoints are sometimes called public endpoints.
[0028] In certain embodiments, a service provider may expose a service through an endpoint of the service (sometimes called a service endpoint). Customers of the service can access the service using this service endpoint. In certain embodiments, the service endpoint provided for a service may be accessible to multiple customers who wish to consume that service. In other implementations, a dedicated service endpoint may be provided to a customer. Thus, only that customer can access the service using that dedicated service endpoint.
[0029] In certain embodiments, a VCN, once created, is associated with a private overlay classless inter-domain routing (CIDR) address space, which is a private overlay IP address range (e.g., 10.0/16) assigned to that VCN. A VCN includes associated subnets, route tables, and gateways. While a VCN resides within a single region, it can span one or more, or all, of the availability domains within that region. A gateway is a virtual interface configured for a VCN, enabling traffic communication between the VCN and one or more endpoints outside the VCN. By configuring one or more different types of gateways for a VCN, communication with different types of endpoints can be enabled.
[0030] A VCN may be subdivided into one or more sub-networks (subnets). A subnet is thus a constituent unit or partition that can be created within a VCN. A VCN can have one or more subnets. Each subnet within a VCN is associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0/24 and 10.0.1.0/24) that does not overlap with the other subnets in that VCN and that represents a subset of the VCN's address space.
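The two subnet constraints above (each range lies inside the VCN's address space, and no two ranges overlap) can be checked with Python's standard `ipaddress` module. This is a minimal sketch; `validate_subnets` is a hypothetical helper, and the VCN range is written in full as 10.0.0.0/16 because `ipaddress` does not accept the 10.0/16 shorthand.

```python
import ipaddress


def validate_subnets(vcn_cidr, subnet_cidrs):
    """Check that subnets lie within the VCN range and do not overlap."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if not subnet.subnet_of(vcn):
            raise ValueError(f"{subnet} is outside the VCN range {vcn}")
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    return subnets


# The example ranges from the paragraph above.
subnets = validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"])
```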
[0031] Each compute instance is associated with a virtual network interface card (VNIC). This allows each compute instance to join a subnet in a VCN. A VNIC is a logical representation of a physical network interface card (NIC). Generally, a VNIC is the interface between an entity (e.g., compute instance, service) and a virtual network. A VNIC resides in a subnet and has one or more associated IP addresses and associated security rules or policies. A VNIC is equivalent to a Layer 2 port on a switch. A VNIC connects a compute instance to a subnet within a VCN. The VNIC associated with a compute instance enables the compute instance to be part of a subnet in a VCN and allows the compute instance to communicate (e.g., send and receive packets) with endpoints on the same subnet as the compute instance, endpoints in different subnets within the VCN, or endpoints outside the VCN. Therefore, the VNIC associated with a compute instance determines how the compute instance connects to endpoints inside and outside the VCN. The VNIC for a compute instance is created and associated with that compute instance when the compute instance is created and added to a subnet within the VCN. If a subnet consists of a set of compute instances, it includes the VNICs corresponding to that set of compute instances, and each VNIC is connected to a compute instance within the set.
[0032] Each compute instance is assigned a private overlay IP address via the VNIC associated with it. This private overlay IP address is assigned to the VNIC associated with the compute instance when the compute instance is created and is used to route the compute instance's traffic. All VNICs within a given subnet use the same route table, security lists, and DHCP options. As mentioned above, each subnet within a VCN does not overlap with other subnets within that VCN and is associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0/24 and 10.0.1.0/24) that represents a subset of the address space of that VCN. For a VNIC on a particular subnet of a VCN, the overlay IP address assigned to the VNIC is an address from the contiguous range of overlay IP addresses assigned to the subnet.
[0033] In certain embodiments, a compute instance may, as needed, be assigned additional overlay IP addresses in addition to its private overlay IP address, such as one or more public IP addresses in the case of a public subnet. These multiple addresses may be assigned to the same VNIC or to multiple VNICs associated with the compute instance. However, each instance has a primary VNIC that is created at instance launch and associated with the private overlay IP address assigned to the instance. This primary VNIC cannot be deleted. Additional VNICs, called secondary VNICs, can be added to an existing instance in the same availability domain as the primary VNIC. All VNICs are in the same availability domain as the instance. A secondary VNIC may be in a subnet of the same VCN as the primary VNIC, or in a different subnet that is either in the same VCN or in a different VCN.
[0034] Compute instances can optionally be assigned a public IP address if they are located in a public subnet. When creating a subnet, the customer can designate it as either a public or a private subnet. A private subnet means that resources within that subnet (e.g., compute instances) and associated VNICs cannot have public overlay IP addresses. A public subnet means that resources within that subnet and associated VNICs can have public IP addresses. Customers can specify subnets that exist in a single availability domain or across multiple availability domains within a region or realm.
[0035] As described above, a VCN may be subdivided into one or more subnets. In certain embodiments, a virtual router (referred to as a VCN VR or simply a VR) configured for the VCN enables communication between subnets within the VCN. For a subnet within a VCN, the VR represents the logical gateway for that subnet, enabling communication between the subnet (i.e., compute instances on that subnet) and endpoints on other subnets within the VCN and other endpoints outside the VCN. A VCN VR is a logical entity configured to route traffic between VNICs within the VCN and virtual gateways (gateways) associated with the VCN. Gateways are described further below with reference to Figure 1. A VCN VR is a Layer 3/IP layer concept. In one embodiment, there is one VCN VR per VCN. This VCN VR potentially has an unlimited number of ports addressed by IP addresses, with one port for each subnet of the VCN. Thus, the VCN VR has a different IP address for each subnet of the VCN to which it is connected. The VR is also connected to the various gateways configured for the VCN. In certain embodiments, a specific overlay IP address from a subnet's overlay IP address range is reserved for the port of the VCN VR for that subnet. For example, consider a VCN having two subnets with the associated address ranges 10.0/16 and 10.1/16, respectively. For the first subnet, with address range 10.0/16, an address from this range is reserved for the port of the VCN VR for that subnet. In some cases, the first IP address from the range may be reserved for the VCN VR. For example, for the subnet with overlay IP address range 10.0/16, the IP address 10.0.0.1 may be reserved for the port of the VCN VR for that subnet. For the second subnet within the same VCN, with address range 10.1/16, the VCN VR may have a port for that subnet with the IP address 10.1.0.1. The VCN VR thus has a different IP address for each subnet within the VCN.
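The address convention in this example (the first address of each subnet's range goes to the VCN VR port for that subnet) can be reproduced with Python's standard `ipaddress` module. This is only a sketch of the convention, which the text presents as one possibility ("in some cases"); `vr_port_address` is a hypothetical helper, and the ranges are written in full because `ipaddress` does not accept the 10.0/16 shorthand.

```python
import ipaddress


def vr_port_address(subnet_cidr):
    """Return the first address after the network address of a subnet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return subnet.network_address + 1


# One VR port per subnet, each with an address from that subnet's range.
vr_ports = {
    cidr: vr_port_address(cidr)
    for cidr in ("10.0.0.0/16", "10.1.0.0/16")
}
```

For the two ranges in the example, this yields 10.0.0.1 and 10.1.0.1, matching the addresses given in the text.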
[0036] In some other embodiments, each subnet within a VCN may have its own associated VR, which is addressable by the subnet using a reserved or default IP address associated with the VR. The reserved or default IP address may be, for example, a first IP address from a range of IP addresses associated with that subnet. A VNIC within a subnet can use this default or reserved IP address to communicate with the VR associated with the subnet (e.g., send and receive packets). In such embodiments, the VR is the ingress / egress point for that subnet. A VR associated with a subnet within a VCN can communicate with other VRs associated with other subnets within the VCN. A VR can also communicate with gateways associated with the VCN. The VR functionality of a subnet is performed on, or by, one or more NVDs that perform the VNIC functionality of the VNICs within the subnet.
[0037] Route tables, security rules, and DHCP options may be configured for the VCN. The route table is the VCN's virtual route table and contains rules for routing traffic from subnets within the VCN to destinations outside the VCN, via a gateway or specially configured instance. The VCN's route table can be customized to control packet forwarding / routing to and from the VCN. DHCP options refer to configuration information automatically provided to an instance when it is launched.
[0038] Security rules configured for a VCN represent the VCN's overlay firewall rules. Security rules can include inbound and outbound rules and can specify the types of traffic allowed to enter and exit instances within the VCN (e.g., based on protocol and port). Customers can choose whether certain rules are stateful or stateless. For example, a customer can allow incoming SSH traffic to a pair of instances from any location by configuring a stateful inbound rule with source CIDR 0.0.0.0 / 0 and destination TCP port 22. Security rules may be implemented using network security groups or security lists. A network security group consists of a set of security rules that apply only to resources within that group. A security list, on the other hand, contains rules that apply to all resources within a subnet that uses that security list. A VCN may include default security rules and default security lists. DHCP options configured for a VCN provide configuration information that is automatically provided when instances within the VCN start up.
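The SSH example above can be sketched as a simple rule match. This is a minimal illustration, not the described implementation: the class InboundRule, the function inbound_allowed, and the default-deny behavior for unmatched traffic are assumptions, and the source address 203.0.113.7 is taken from a documentation-only range.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    source_cidr: str       # allowed source range, e.g. "0.0.0.0/0"
    protocol: str          # e.g. "tcp"
    dest_port: int         # destination port the rule applies to
    stateful: bool = True  # stateful rules also admit response traffic (not modeled here)


def inbound_allowed(rules, src_ip: str, protocol: str, dest_port: int) -> bool:
    """Return True if any rule permits the packet; unmatched traffic is dropped (assumed default)."""
    src = ipaddress.ip_address(src_ip)
    return any(
        src in ipaddress.ip_network(rule.source_cidr)
        and rule.protocol == protocol
        and rule.dest_port == dest_port
        for rule in rules
    )


# Stateful inbound rule: source CIDR 0.0.0.0/0, destination TCP port 22.
ssh_rule = InboundRule(source_cidr="0.0.0.0/0", protocol="tcp", dest_port=22)
ssh_ok = inbound_allowed([ssh_rule], "203.0.113.7", "tcp", 22)
rdp_ok = inbound_allowed([ssh_rule], "203.0.113.7", "tcp", 3389)
print(ssh_ok, rdp_ok)
```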
[0039] In certain embodiments, VCN configuration information is determined and stored by the VCN control plane. VCN configuration information may include, for example, address ranges associated with the VCN, subnets and associated information within the VCN, one or more VRs associated with the VCN, compute instances and associated VNICs within the VCN, NVDs that perform various virtualized network functions associated with the VCN (e.g., VNICs, VRs, gateways), VCN status information, and other VCN-related information. In certain embodiments, the VCN distribution service exposes the configuration information or a portion thereof stored by the VCN control plane to the NVD. Using the distributed information, packets can be forwarded to and from compute instances within the VCN by updating information stored and used by the NVD (e.g., forwarding tables, routing tables, etc.).
[0040] In certain embodiments, the creation of VCNs and subnets is handled by the VCN control plane (CP), and the startup of compute instances is handled by the compute control plane. The compute control plane is configured to allocate physical resources for compute instances and then call the VCN control plane to create VNICs and connect to the compute instances. The VCN CP also sends VCN data mappings to the VCN data plane, which is configured to perform packet forwarding and routing functions. In certain embodiments, the VCN CP provides a distribution service configured to provide updates to the VCN data plane. Examples of VCN control planes are shown in Figures 17, 18, 19, and 20 (see reference numbers 1716, 1816, 1916, and 2016) and are described below.
[0041] Customers can create one or more VCNs using resources hosted by CSPI. Compute instances deployed on a customer VCN can communicate with different endpoints. These endpoints can include endpoints hosted by CSPI and endpoints outside of CSPI.
[0042] Various different architectures for implementing cloud-based services using CSPI are shown in Figures 1, 2, 3, 4, 5, 17, 18, 19, and 21, and are described below. Figure 1 is a high-level diagram of a distributed environment 100 showing an overlay VCN or customer VCN hosted by CSPI according to a particular embodiment. The distributed environment shown in Figure 1 includes multiple elements within the overlay network. The distributed environment 100 shown in Figure 1 is merely an example and is not intended to unduly limit the scope of the claimed embodiments. Many variations, alternatives, and modifications are possible. For example, in some implementations, the distributed environment shown in Figure 1 may have more or fewer systems or elements than those shown in Figure 1, may combine two or more systems, or may have different system configurations or arrangements.
[0043] As shown in the example in Figure 1, the distributed environment 100 includes a CSPI 101 that provides services and resources that customers can subscribe to and use to build a virtual cloud network (VCN). In a particular embodiment, the CSPI 101 provides IaaS services to subscriber customers. The data centers within the CSPI 101 may be organized into one or more regions. Figure 1 shows an example of a region, the “US region” 102. The customer has configured a customer VCN 104 for region 102. The customer can deploy various compute instances on the VCN 104, which may include virtual machines or bare metal instances. Examples of instances include applications, databases, load balancers, etc.
[0044] In the embodiment shown in Figure 1, customer VCN 104 includes two subnets, namely "Subnet-1" and "Subnet-2", each subnet having its own CIDR IP address range. In Figure 1, the overlay IP address range for subnet-1 is 10.0 / 16, and the address range for subnet-2 is 10.1 / 16. The VCN virtual router 105 represents the logical gateway of the VCN, enabling communication between subnets of VCN 104 and communication with other endpoints outside the VCN. The VCN VR 105 is configured to route traffic between VNICs within VCN 104 and gateways associated with VCN 104. The VCN VR 105 provides ports to each subnet of VCN 104. For example, the VR 105 can provide a port with IP address 10.0.0.1 to subnet-1 and a port with IP address 10.1.0.1 to subnet-2.
[0045] Multiple compute instances can be deployed on each subnet. In this case, compute instances may be virtual machine instances and / or bare metal instances. Compute instances within a subnet may be hosted by one or more host machines within CSPI101. Compute instances join the subnet via the VNIC associated with them. For example, as shown in Figure 1, compute instance C1 is part of subnet-1 via the VNIC associated with it. Similarly, compute instance C2 is part of subnet-1 via the VNIC associated with C2. Similarly, multiple compute instances, which may be virtual machine instances or bare metal instances, may be part of subnet-1. Each compute instance is assigned a private overlay IP address and MAC address via the associated VNIC. For example, in Figure 1, compute instance C1 has the overlay IP address 10.0.0.2 and MAC address M1, and compute instance C2 has the private overlay IP address 10.0.0.3 and MAC address M2. Each compute instance in subnet-1, including compute instances C1 and C2, has a default route to VCN VR105 using IP address 10.0.0.1, which is the IP address of the port of VCN VR105 in subnet-1.
[0046] Multiple compute instances, including virtual machine instances and / or bare metal instances, can be deployed in subnet-2. For example, as shown in Figure 1, compute instances D1 and D2 are part of subnet-2 via the VNIC associated with each compute instance. In the embodiment shown in Figure 1, compute instance D1 has the overlay IP address 10.1.0.2 and MAC address MM1, and compute instance D2 has the private overlay IP address 10.1.0.3 and MAC address MM2. Each compute instance in subnet-2, including compute instances D1 and D2, has a default route to VCN VR105 using IP address 10.1.0.1, which is the IP address of the port of VCN VR105 in subnet-2.
[0047] Furthermore, VCN 104 may include one or more load balancers. For example, a load balancer may be provided for a subnet and configured to load balance traffic among multiple compute instances on the subnet. Alternatively, a load balancer may be provided to load balance traffic among subnets within the VCN.
[0048] A specific compute instance deployed on VCN104 can communicate with various different endpoints. These endpoints may include endpoints hosted by CSPI101 and endpoints outside of CSPI101. Endpoints hosted by CSPI101 may include endpoints on the same subnet as a particular compute instance (e.g., communication between two compute instances in subnet-1), endpoints on different subnets but within the same VCN (e.g., communication between a compute instance in subnet-1 and a compute instance in subnet-2), endpoints in different VCNs within the same region (e.g., communication between a compute instance in subnet-1 and an endpoint in a VCN in the same region 106 or 110, or between a compute instance in subnet-1 and an endpoint in service network 110 in the same region), or endpoints in VCNs in different regions (e.g., communication between a compute instance in subnet-1 and an endpoint in a VCN in a different region 108). In addition, compute instances in subnets hosted by CSPI101 can communicate with endpoints not hosted by CSPI101 (i.e., outside of CSPI101). These external endpoints include endpoints within the customer's on-premises network 116, endpoints within other remote cloud host networks 118, public endpoints 114 accessible via public networks such as the internet, and other endpoints.
[0049] Communication between compute instances on the same subnet is facilitated using VNICs associated with the source and destination compute instances. For example, compute instance C1 in subnet-1 may want to send a packet to compute instance C2, also in subnet-1. For a packet sent from the source compute instance, whose destination is another compute instance on the same subnet, this packet is first processed by the VNIC associated with the source compute instance. The processing performed by the VNIC associated with the source compute instance may include determining the packet's destination information from the packet header, identifying any policies (e.g., security lists) configured for the VNIC associated with the source compute instance, determining the packet's next hop, performing any packet encapsulation / decapsulation functions as needed, and forwarding / routing the packet to the next hop to facilitate communication to its intended destination. If the destination compute instance is on the same subnet as the source compute instance, the VNIC associated with the source compute instance is configured to identify the VNIC associated with the destination compute instance and forward the packet to that VNIC for processing. Next, the VNIC associated with the destination compute instance is executed and forwards the packets to the destination compute instance.
[0050] When a packet is transmitted from a compute instance within a subnet to an endpoint in a different subnet of the same VCN, the communication is facilitated by the VNICs associated with the source and destination compute instances, and the VCN VR. For example, if compute instance C1 in subnet-1 in Figure 1 wants to send a packet to compute instance D1 in subnet-2, the packet is first processed by the VNIC associated with compute instance C1. The VNIC associated with compute instance C1 is configured to route the packet to VCN VR105 using the VCN VR's default route or port 10.0.0.1. VCN VR105 is configured to route the packet to subnet-2 using port 10.1.0.1. The packet is then received and processed by the VNIC associated with D1, and the VNIC forwards the packet to compute instance D1.
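The two intra-VCN cases described above (same subnet, and different subnets of the same VCN) can be sketched as a path lookup. This is a simplified illustration using the Figure 1 addresses, with the ranges written out as 10.0.0.0/16 and 10.1.0.0/16; the VNIC labels, the hop names, and the function forwarding_path are hypothetical, and encapsulation and policy checks are deliberately left out.

```python
import ipaddress

# Overlay addressing from the Figure 1 example.
SUBNETS = {
    "subnet-1": {"cidr": "10.0.0.0/16", "vr_port": "10.0.0.1"},
    "subnet-2": {"cidr": "10.1.0.0/16", "vr_port": "10.1.0.1"},
}
# Overlay IP address -> VNIC label (labels are hypothetical).
VNICS = {
    "10.0.0.2": "vnic-C1",
    "10.0.0.3": "vnic-C2",
    "10.1.0.2": "vnic-D1",
    "10.1.0.3": "vnic-D2",
}


def subnet_of(ip: str):
    """Return the name of the subnet containing ip, or None if it is outside the VCN."""
    addr = ipaddress.ip_address(ip)
    for name, info in SUBNETS.items():
        if addr in ipaddress.ip_network(info["cidr"]):
            return name
    return None


def forwarding_path(src_ip: str, dst_ip: str) -> list:
    """List the overlay hops between two compute instances in the same VCN."""
    src_subnet, dst_subnet = subnet_of(src_ip), subnet_of(dst_ip)
    if src_subnet is None or dst_subnet is None:
        raise ValueError("this sketch only covers endpoints inside the VCN")
    path = [VNICS[src_ip]]  # the source VNIC always processes the packet first
    if src_subnet != dst_subnet:
        # Different subnets: go through the VR port of each subnet.
        path.append("vr-port-" + SUBNETS[src_subnet]["vr_port"])
        path.append("vr-port-" + SUBNETS[dst_subnet]["vr_port"])
    path.append(VNICS[dst_ip])  # the destination VNIC delivers to the instance
    return path


print(forwarding_path("10.0.0.2", "10.0.0.3"))  # C1 -> C2, same subnet
print(forwarding_path("10.0.0.2", "10.1.0.2"))  # C1 -> D1, via the VCN VR
```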
[0051] To transmit packets from compute instances within VCN104 to endpoints outside VCN104, communication is facilitated by a VNIC associated with the source compute instance, VCN VR105, and a gateway associated with VCN104. One or more types of gateways can be associated with VCN104. A gateway is an interface between the VCN and another endpoint, which is outside the VCN. A gateway is a Layer 3 / IP layer concept that enables communication between the VCN and endpoints outside the VCN. Therefore, gateways facilitate traffic flow between the VCN and other VCNs or networks. Different types of gateways can be configured in the VCN to facilitate different types of communication with different types of endpoints. Through gateways, communication may take place over a public network (e.g., the internet) or a private network. Various communication protocols may be used for these communications.
[0052] For example, compute instance C1 may want to communicate with an endpoint outside of VCN104. The packet may first be processed by the VNIC associated with source compute instance C1. The VNIC processing determines that the packet's destination is outside subnet-1 of C1. The VNIC associated with C1 can then forward the packet to VCN VR105 of VCN104. VCN VR105 then processes the packet and, as part of the processing, determines a specific gateway associated with VCN104 as the packet's next hop based on the packet's destination. VCN VR105 can then forward the packet to the specific gateway. For example, if the destination is an endpoint within the customer's on-premises network, the packet may be forwarded by VCN VR105 to a dynamic routing gateway (DRG) 122 configured for VCN104. The packet is then forwarded from the gateway to the next hop, facilitating communication of the packet to its intended final destination.
[0053] Various different types of gateways may be configured for the VCN. Examples of gateways that may be configured for the VCN are shown in Figure 1 and described below. Examples of gateways associated with the VCN are also shown in Figures 17, 18, 19, and 20 (for example, gateways shown by reference numbers 1734, 1736, 1738, 1834, 1836, 1838, 1934, 1936, 1938, 2034, 2036, and 2038) and described below. As shown in the embodiment shown in Figure 1, a dynamic routing gateway (DRG) 122 may be added to or associated with the customer VCN 104. The DRG 122 provides a path for private network traffic communication between the customer VCN 104 and another endpoint. The other endpoint may be the customer on-premises network 116, VCN 108 in a different region of CSPI 101, or another remote cloud network 118 not hosted by CSPI 101. The customer on-premises network 116 may be a customer network or customer data center built using the customer's resources. Access to the customer on-premises network 116 is generally strictly restricted. For a customer that has both the customer on-premises network 116 and one or more VCNs 104 deployed or hosted in the cloud by CSPI 101, the customer may want the on-premises network 116 and the cloud-based VCNs 104 to be able to communicate with each other. This would allow the customer to build an enhanced hybrid environment that includes the customer's VCNs 104 hosted by CSPI 101 and the on-premises network 116. DRG 122 enables such communication. To enable such communication, a communication channel 124 is configured. In this case, one endpoint of the communication channel is on the customer on-premises network 116, and the other endpoint is on CSPI 101 and connected to the customer VCN 104. 
The communication channel 124 can be via a public communication network such as the internet, or a private communication network. Various different communication protocols can be used, such as IPsec VPN technology on public communication networks like the Internet, and Oracle®'s FastConnect technology which uses a private network instead of a public network. Devices or equipment within the customer on-premises network 116 that form one endpoint of communication channel 124 are called customer premises equipment (CPE), such as CPE126 shown in Figure 1. The endpoint on the CSPI101 side may be a host machine running DRG122.
[0054] In certain embodiments, a Remote Peering Connection (RPC) can be added to the DRG. This allows a customer to peer one VCN with another VCN in a different region. Using such an RPC, a customer VCN 104 can connect to a VCN 108 in a different region using the DRG 122. The DRG 122 may also be used to communicate with other remote cloud networks 118 not hosted by the CSPI 101, such as the Microsoft® Azure cloud or the Amazon® AWS cloud.
[0055] As shown in Figure 1, an Internet Gateway (IGW) 120 can be configured on the customer VCN 104 to enable compute instances on the customer VCN 104 to communicate with a public endpoint 114 accessible via a public network such as the Internet. The IGW 120 is a gateway for connecting the VCN to a public network such as the Internet. The IGW 120 enables public subnets within a VCN, such as VCN 104 (resources within public subnets have public overlay IP addresses), to directly access a public endpoint 112 on a public network such as the Internet 114. Connections can be initiated from subnets within VCN 104 or from the Internet using the IGW 120.
[0056] A Network Address Translation (NAT) gateway 128 can be configured in customer VCN 104. The NAT gateway 128 enables cloud resources within the customer VCN that do not have dedicated public overlay IP addresses to access the internet without directly exposing them to incoming internet connectivity (e.g., L4-L7 connectivity). This allows private subnets within the VCN, such as private subnet-1 of VCN 104, to have private access to public endpoints on the internet. With the NAT gateway, private subnets can initiate connections to the public internet, but connections cannot be initiated from the internet to the private subnets.
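The one-way behavior described above (connections can be initiated from the private subnet but not from the internet) can be sketched with a toy translation table. The text does not state how NAT gateway 128 translates addresses, so the port-based mapping, the class NatGatewaySketch, and all addresses other than 10.0.0.2 are assumptions made only for illustration.

```python
class NatGatewaySketch:
    """Toy model: only connections initiated from inside the VCN get a translation entry."""

    def __init__(self, public_ip: str, first_port: int = 20000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._by_private = {}  # (private_ip, private_port, remote) -> public_port
        self._by_public = {}   # (public_port, remote) -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int, remote: tuple) -> tuple:
        """Translate an outbound connection and remember it for return traffic."""
        key = (private_ip, private_port, remote)
        if key not in self._by_private:
            self._by_private[key] = self._next_port
            self._by_public[(self._next_port, remote)] = (private_ip, private_port)
            self._next_port += 1
        return (self.public_ip, self._by_private[key])

    def inbound(self, public_port: int, remote: tuple):
        """Return the private endpoint for return traffic, or None if nothing inside initiated it."""
        return self._by_public.get((public_port, remote))


nat = NatGatewaySketch("198.51.100.10")        # hypothetical public address of the gateway
server = ("203.0.113.7", 443)                  # hypothetical public endpoint
translated = nat.outbound("10.0.0.2", 43512, server)
reply_target = nat.inbound(translated[1], server)
unsolicited = nat.inbound(20001, server)       # no instance opened this connection
print(translated, reply_target, unsolicited)
```

Return traffic for a connection opened by the instance finds its entry, while an unsolicited inbound packet finds none and would be dropped.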
[0057] In certain embodiments, a service gateway (SGW) 126 can be configured in a customer VCN 104. The SGW 126 provides a route for private network traffic between VCN 104 and service endpoints supported by a service network 110. In certain embodiments, the service network 110 may be provided by a CSP and can provide a variety of services. An example of such a service network is the Oracle® service network, which provides a variety of services that customers can use. For example, compute instances (e.g., database systems) in a private subnet of customer VCN 104 can back up data to service endpoints (e.g., object storage devices) without requiring a public IP address or access to the internet. In some embodiments, a VCN may have only one SGW, and connections can only be initiated from subnets within the VCN, and not from the service network 110. When peering a VCN with another VCN, resources in the other VCN typically cannot access the SGW. Resources in an on-premises network connected to a VCN via FastConnect or VPN Connect can also use the service gateway configured for that VCN.
[0058] In some implementations, SGW126 uses service Classless Inter-Domain Routing (CIDR) labels. A CIDR label is a string representing all regionally exposed IP address ranges for a service or group of services of interest. Customers use service CIDR labels to control traffic to services when configuring SGW and associated routing rules. Customers can optionally use service CIDR labels when configuring security rules, so that the security rules need not be adjusted if the public IP addresses of services change in the future.
[0059] The Local Peering Gateway (LPG) 132 is a gateway that can be added to the customer VCN 104 and that enables the VCN 104 to peer with other VCNs within the same region. Peering means that VCNs communicate using private IP addresses without traffic traversing a public network such as the internet and without routing traffic through the customer's on-premises network 116. In a preferred embodiment, the VCN has a separate LPG for each established peering. Local peering, or VCN peering, is a common practice used to establish network connectivity between different applications or infrastructure management functions.
[0060] Service providers, such as service providers on service network 110, can provide access to their services using different access models. According to the public access model, a service may be exposed as a public endpoint accessible publicly by compute instances within the customer VCN via a public network such as the internet, or it may be accessed privately via SGW126. According to a specific private access model, a service may be accessed as a private IP endpoint within a private subnet within the customer VCN. This is called private endpoint (PE) access and allows service providers to expose their services as instances within the customer's private network. A private endpoint resource represents a service within the customer VCN. Each PE appears as a VNIC (referred to as a PE-VNIC, having one or more private IPs) selected by the customer from a subnet within the customer VCN. Thus, a PE provides a way to provide services within the customer's private VCN subnet using a VNIC. Because the endpoint is exposed as a VNIC, the PE VNIC can utilize all the features associated with a VNIC, such as routing rules and security lists.
[0061] Service providers enable access via PE by registering their services. Providers can associate policies with services that restrict their visibility to customer tenants. Providers can register multiple services under a single virtual IP address (VIP), especially in the case of multi-tenant services. Multiple private endpoints may exist representing the same service (across multiple VCNs).
[0062] Subsequently, compute instances within the private subnet can access the service using the PE VNIC's private IP address or service DNS name. Compute instances within the customer VCN can access the service by sending traffic to the PE's private IP address within the customer VCN. The Private Access Gateway (PAGW) 130 is a gateway resource that can connect to a service provider VCN (e.g., a VCN within service network 110) and act as the receiving / transmitting point for all traffic to and from the customer subnet private endpoint. The PAGW 130 allows the provider to scale the number of PE connections without utilizing internal IP address resources. The provider only needs to configure one PAGW for any number of services registered in a single VCN. The provider can present a service as a private endpoint in multiple VCNs for one or more customers. From the customer's perspective, the PE VNIC appears to be connected to the service the customer wants to interact with, rather than to the customer's instances. Traffic directed to the private endpoint is routed to the service via the PAGW 130. These are called customer-to-service private connections (C2S connections).
[0063] Furthermore, by using the PE concept, private access to the service can be extended to the customer's on-premises network and data center by enabling traffic to flow through FastConnect / IPsec links and private endpoints within the customer's VCN. Private access to the service can also be extended to the customer's peering VCN by enabling traffic to flow between LPG132 and PEs within the customer's VCN.
[0064] Customers can control VCN routing at the subnet level, allowing them to specify which subnets use which gateways within their VCN, such as VCN104. The VCN's route table can be used to determine whether traffic can be routed outside the VCN through a particular gateway. For example, in a specific case, the route table for a public subnet within customer VCN104 might allow non-local traffic to be sent via IGW120. The route table for a private subnet within the same customer VCN104 might allow traffic to CSP services via SGW126. All remaining traffic could be sent via NAT gateway 128. The route table only controls traffic leaving the VCN.
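The per-subnet route tables in this example can be sketched as a lookup from destination address to gateway. This is an illustrative sketch only: the rule format, the use of longest-prefix matching, the function select_gateway, and the range 192.0.2.0/24 standing in for CSP services are all assumptions not stated in the text.

```python
import ipaddress

# Subnet ranges of the VCN; traffic to these destinations never leaves the VCN.
LOCAL_RANGES = [ipaddress.ip_network("10.0.0.0/16"), ipaddress.ip_network("10.1.0.0/16")]

# Hypothetical route rules: (destination CIDR, target gateway).
PUBLIC_SUBNET_ROUTES = [("0.0.0.0/0", "IGW 120")]
PRIVATE_SUBNET_ROUTES = [
    ("192.0.2.0/24", "SGW 126"),       # stand-in range for CSP services
    ("0.0.0.0/0", "NAT gateway 128"),  # all remaining non-local traffic
]


def select_gateway(routes, dst_ip: str):
    """Pick the gateway for a destination, preferring the most specific matching rule."""
    dst = ipaddress.ip_address(dst_ip)
    if any(dst in net for net in LOCAL_RANGES):
        return "local"  # the route table only controls traffic leaving the VCN
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dst in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no rule: the packet cannot leave the VCN
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(select_gateway(PUBLIC_SUBNET_ROUTES, "203.0.113.7"))
print(select_gateway(PRIVATE_SUBNET_ROUTES, "192.0.2.15"))
print(select_gateway(PRIVATE_SUBNET_ROUTES, "203.0.113.7"))
print(select_gateway(PRIVATE_SUBNET_ROUTES, "10.1.0.2"))
```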
[0065] Security lists associated with a VCN are used to control inbound connections and traffic entering the VCN via gateways. All resources within a subnet use the same route table and security lists. Security lists may be used to control specific types of traffic entering and leaving instances within a VCN subnet. Security list rules may include inbound and outbound rules. For example, inbound rules may specify allowed source address ranges, and outbound rules may specify allowed destination address ranges. Security rules may specify specific protocols (e.g., TCP, ICMP), specific ports (e.g., port 22 for SSH, port 3389 for Windows® RDP), etc. In certain implementations, the instance's operating system may enforce its own firewall rules that match the security list rules. Rules may be stateful (e.g., connections are tracked and responses are automatically allowed without explicit security list rules for response traffic) or stateless.
[0066] Access from a customer VCN (i.e., resources or compute instances deployed on VCN104) can be classified as public access, private access, or dedicated access. Public access refers to an access model for accessing public endpoints using public IP addresses or NAT. Private access allows customer workloads within VCN104 with private IP addresses (e.g., resources in a private subnet) to access services without traversing a public network such as the internet. In certain embodiments, CSPI101 allows customer VCN workloads with private IP addresses to access the public service endpoint of a service using a service gateway. Thus, the service gateway provides a private access model by establishing a virtual link between the customer VCN and the public endpoint of a service that resides outside the customer's private network.
[0067] Furthermore, CSPI can provide dedicated public access using technologies such as FastConnect public peering. In this case, customer on-premises instances can access one or more services within the customer VCN using FastConnect connectivity without going through a public network such as the internet. CSPI can also provide dedicated private access using FastConnect private peering. In this case, customer on-premises instances with private IP addresses can access customer VCN workloads using FastConnect connectivity. FastConnect is a network connectivity option that serves as an alternative to using the public internet to connect customer on-premises networks to CSPI and its services. FastConnect provides a simple, flexible, and economical way to create dedicated private connectivity with higher bandwidth options and a reliable, consistent networking experience compared to internet-based connectivity.
[0068] Figure 1 and the accompanying description above illustrate various virtualization elements in an exemplary virtual network. As mentioned above, the virtual network is built on an underlying physical network or infrastructure network. Figure 2 is a simplified architectural diagram showing the physical elements within the physical network within the CSPI200 that provide the foundation for the virtual network, according to a particular embodiment. As shown, the CSPI200 provides a distributed environment including elements and resources (e.g., compute resources, memory resources, and networking resources) provided by a Cloud Service Provider (CSP). These elements and resources are used to provide cloud services (e.g., IaaS services) to subscribers, i.e., customers who subscribe to one or more services provided by the CSP. Based on the services a customer subscribes to, the CSPI200 provides some resources (e.g., compute resources, memory resources, and networking resources) to the customer. The customer can then use the physical compute resources, memory resources, and networking resources provided by the CSPI200 to build their own cloud-based (i.e., CSPI-hosted) customizable private virtual network. As mentioned above, these customer networks are called virtual cloud networks (VCNs). Customers can deploy one or more customer resources, such as compute instances, to these customer VCNs. Compute instances may be virtual machines, bare metal instances, etc. CSPI200 provides infrastructure and a suite of complementary cloud services that enable customers to build and run a wide range of applications and services in a highly available host environment.
[0069] In the exemplary embodiment shown in Figure 2, the physical elements of CSPI200 include one or more physical host machines or physical servers (e.g., 202, 206, 208), network virtualization devices (NVDs) (e.g., 210, 212), top-of-rack (TOR) switches (e.g., 214, 216), a physical network (e.g., 218), and switches within physical network 218. The physical host machines or servers can host and run various compute instances participating in one or more subnets of the VCN. Compute instances may include virtual machine instances and bare metal instances. For example, the various compute instances shown in Figure 1 may be hosted by the physical host machines shown in Figure 2. Virtual machine compute instances in the VCN may run on one host machine or on several different host machines. The physical host machines can also host virtual host machines, container-based hosts or functions, etc. The VNICs and VCN VR shown in Figure 1 may run on the NVDs shown in Figure 2. The gateway shown in Figure 1 may be run by the host machine and / or NVD shown in Figure 2.
[0070] A host machine or server can run a hypervisor (also known as a virtual machine monitor or VMM) that creates and enables a virtualized environment on the host machine. Virtualization or a virtualized environment facilitates cloud-based computing. One or more compute instances may be created, run, and managed on the host machine by a hypervisor on the host machine. The hypervisor on the host machine can share the host machine's physical compute resources (e.g., compute resources, memory resources, and networking resources) among various compute instances running on the host machine.
[0071] For example, as shown in Figure 2, host machines 202 and 208 run hypervisors 260 and 266, respectively. These hypervisors may be implemented using software, firmware, hardware, or a combination thereof. Typically, a hypervisor is a process or software layer residing in the host machine's operating system (OS), which runs on the host machine's hardware processor. A hypervisor provides a virtualization environment that allows the host machine's physical computing resources (e.g., processing resources such as processors / cores, memory resources, and networking resources) to be shared among various virtual machine computing instances running on the host machine. For example, in Figure 2, hypervisor 260 resides in the OS of host machine 202 and allows the host machine 202's computing resources (e.g., processing resources, memory resources, and networking resources) to be shared among computing instances (e.g., virtual machines) running on host machine 202. A virtual machine can have its own OS (called a guest OS). This guest OS may be the same as or different from the host machine's OS. The operating system (OS) of a virtual machine running on a host machine may be the same as, or different from, the operating systems of other virtual machines running on the same host machine. Therefore, a hypervisor can run multiple OSs in parallel while sharing the same computing resources of the host machine. The host machines shown in Figure 2 may have the same type of hypervisor or different types of hypervisors.
[0072] Compute instances may be virtual machine instances or bare metal instances. In Figure 2, compute instance 268 on host machine 202 and compute instance 274 on host machine 208 are examples of virtual machine instances. Host machine 206 is an example of a bare metal instance provided to a customer.
[0073] In certain examples, the entire host machine may be provided to a single customer, and one or more compute instances (either virtual machines or bare metal instances) hosted by that host machine may all belong to the same customer. In other examples, the host machine may be shared among multiple customers (i.e., multiple tenants). In such a multi-tenant scenario, the host machine can host virtual machine compute instances belonging to different customers. These compute instances may be members of different VCNs of different customers. In certain embodiments, bare metal compute instances are hosted by bare metal servers without a hypervisor. When bare metal compute instances are provided, a single customer or tenant maintains control of the physical CPU, memory, and network interfaces of the host machine hosting the bare metal instances, and the host machine is not shared with other customers or tenants.
[0074] As mentioned above, each compute instance that is part of a VCN is associated with a VNIC that enables the compute instance to be a member of the VCN's subnet. The VNIC associated with a compute instance facilitates the communication of packets or frames to and from the compute instance. The VNIC is associated with the compute instance when it is created. In certain embodiments, for a compute instance run by a host machine, the VNIC associated with that compute instance is run by an NVD connected to the host machine. For example, in Figure 2, host machine 202 runs virtual machine compute instance 268 associated with VNIC 276, and VNIC 276 is run by an NVD 210 connected to host machine 202. In another example, bare metal instance 272 hosted by host machine 206 is associated with VNIC 280, which is run by an NVD 212 connected to host machine 206. In yet another example, VNIC 284 is associated with compute instance 274 run by host machine 208, and VNIC 284 is run by an NVD 212 connected to host machine 208.
[0075] For compute instances hosted by a host machine, an NVD connected to that host machine executes a VCN VR corresponding to the VCN of which the compute instance is a member. For example, in the embodiment shown in Figure 2, NVD210 executes VCN VR277 corresponding to the VCN of which compute instance 268 is a member. Additionally, NVD212 can execute one or more VCN VR283 corresponding to the VCNs of compute instances hosted by host machines 206 and 208.
[0076] A host machine may include one or more network interface cards (NICs) for connecting it to other devices. The NICs on the host machine may provide one or more ports (or interfaces) for communicating with another device. For example, a host machine can be connected to an NVD using one or more ports (or interfaces) provided on the host machine and the NVD. Alternatively, a host machine can be connected to other devices, such as another host machine.
[0077] For example, in Figure 2, host machine 202 is connected to NVD210 using a link 220 that extends between port 234 provided by NIC 232 of host machine 202 and port 236 of NVD210. Host machine 206 is connected to NVD212 using a link 224 that extends between port 246 provided by NIC 244 of host machine 206 and port 248 of NVD212. Host machine 208 is connected to NVD212 using a link 226 that extends between port 252 provided by NIC 250 of host machine 208 and port 254 of NVD212.
[0078] Similarly, the NVDs are connected via communication links to top-of-rack (TOR) switches connected to a physical network 218 (also called a switch fabric). In certain embodiments, the links between the host machines and the NVDs, and between the NVDs and the TOR switches, are Ethernet® links. For example, in Figure 2, NVDs 210 and 212 are connected to TOR switches 214 and 216, respectively, via links 228 and 230. In certain embodiments, links 220, 224, 226, 228, and 230 are Ethernet® links. The collection of host machines and NVDs connected to the TOR is sometimes referred to as a rack.
[0079] The physical network 218 provides a communication fabric that enables communication between TOR switches. The physical network 218 may be a multi-layer network. In a particular implementation, the physical network 218 is a multi-layer Clos network of switches, and TOR switches 214 and 216 represent leaf-level nodes of the multi-layer and multi-node physical switching network 218. Different Clos network configurations are possible, including but not limited to 2-layer, 3-layer, 4-layer, 5-layer networks, and generally "n"-layer networks. An example of a Clos network is shown in Figure 5 and described below.
[0080] Various connection configurations are possible between host machines and NVDs, including one-to-one, many-to-one, and one-to-many configurations. In an example of a one-to-one configuration, each host machine is connected to its own separate NVD. For example, in Figure 2, host machine 202 is connected to NVD210 via host machine 202's NIC232. In a many-to-one configuration, multiple host machines are connected to a single NVD. For example, in Figure 2, host machines 206 and 208 are connected to the same NVD212 via NICs 244 and 250, respectively.
[0081] In a one-to-many configuration, one host machine is connected to multiple NVDs. Figure 3 shows an example within CSPI300 where a host machine is connected to multiple NVDs. As shown in Figure 3, the host machine 302 has a network interface card (NIC) 304 that includes multiple ports 306 and 308. The host machine 302 is connected to the first NVD 310 via port 306 and link 320, and to the second NVD 312 via port 308 and link 322. Ports 306 and 308 may be Ethernet® ports, and links 320 and 322 between the host machine 302 and the NVDs 310 and 312 may be Ethernet® links. The NVD 310 is connected to the first TOR switch 314, and the NVD 312 is connected to the second TOR switch 316. The links between the NVDs 310 and 312 and the TOR switches 314 and 316 may be Ethernet® links. TOR switches 314 and 316 represent layer-0 switching devices within a multilayer physical network 318.
[0082] The configuration shown in Figure 3 provides two separate physical network paths from the physical switch network 318 to the host machine 302: a first path from TOR switch 314 through NVD 310 to the host machine 302, and a second path from TOR switch 316 through NVD 312 to the host machine 302. These separate paths provide enhanced availability (referred to as high availability) for the host machine 302. If one path experiences a problem (e.g., one link in the path fails) or if there is a problem with a device (e.g., a particular NVD is not functioning), the other path can be used for communication with the host machine 302.
[0083] In the configuration shown in Figure 3, the host machine is connected to two different NVDs using two different ports provided by the host machine's NIC. In other embodiments, the host machine may include multiple NICs that enable connections between the host machine and multiple NVDs.
[0084] Referring again to Figure 2, the NVD is a physical device or element that performs one or more network virtualization functions and / or memory virtualization functions. The NVD may be any device having one or more processing units (e.g., a CPU, a network processing unit (NPU), an FPGA, a packet processing pipeline), memory including a cache, and ports. Various virtualization functions may be performed by software / firmware executed by one or more processing units of the NVD.
[0085] NVDs may be implemented in various different forms. For example, in certain embodiments, an NVD may be implemented as an interface card called a smart NIC or intelligent NIC with an integrated processor. A smart NIC is a separate device from the NIC on the host machine. In Figure 2, NVD210 may be implemented as a smart NIC connected to host machine 202, and NVD212 may be implemented as smart NICs connected to host machines 206 and 208.
[0086] However, the smart NIC is just one example of an NVD implementation. Various other implementations are possible. For example, in some other implementations, the NVD or one or more functions performed by the NVD may be incorporated into or performed by one or more host machines, one or more TOR switches, and other elements of the CSPI200. For example, the NVD may be integrated into the host machine. In this case, the functions performed by the NVD are performed by the host machine. As another example, the NVD may be part of a TOR switch, or the TOR switch may be configured to perform functions performed by the NVD, enabling the TOR switch to perform various complex packet translations used in public clouds. A TOR that performs the functions of the NVD is sometimes called a smart TOR. In yet another implementation that provides customers with virtual machine (VM) instances rather than bare metal (BM) instances, the functions provided by the NVD may be implemented within the hypervisor of the host machine. In some other implementations, some of the functions of the NVD may be offloaded to a centralized service running on a set of host machines.
[0087] As shown in Figure 2, in certain embodiments, such as when implemented as a smart NIC, the NVD may have multiple physical ports that enable it to connect to one or more host machines and one or more TOR switches. The ports on the NVD can be classified as host-facing ports (also called "south ports") or network-facing or TOR-facing ports (also called "north ports"). Host-facing ports on the NVD are the ports used to connect the NVD to a host machine. Examples of host-facing ports in Figure 2 include port 236 on the NVD210 and ports 248 and 254 on the NVD212. Network-facing ports on the NVD are the ports used to connect the NVD to a TOR switch. Examples of network-facing ports in Figure 2 include port 256 on the NVD210 and port 258 on the NVD212. As shown in Figure 2, the NVD210 is connected to the TOR switch 214 via a link 228 extending from port 256 on the NVD210 to the TOR switch 214. Similarly, the NVD212 is connected to the TOR switch 216 via a link 230 that extends from port 258 of the NVD212 to the TOR switch 216.
[0088] The NVD can receive packets and frames from the host machine (for example, packets and frames generated by compute instances hosted by the host machine) via its host-facing port, perform the necessary packet processing, and then forward the packets and frames to the TOR switch via its network-facing port. The NVD can also receive packets and frames from the TOR switch via its network-facing port, perform the necessary packet processing, and then forward the packets and frames to the host machine via its host-facing port.
[0089] In certain embodiments, multiple ports and associated links may be provided between the NVD and the TOR switch. By aggregating these ports and links, a link aggregator group (LAG) of multiple ports or links can be formed. Link aggregation allows multiple physical links between two endpoints (e.g., between the NVD and the TOR switch) to be treated as a single logical link. All physical links within a given LAG can operate in full-duplex mode at the same speed. LAGs help to increase the bandwidth and reliability of the connection between the two endpoints. If one of the physical links in the LAG fails, traffic is dynamically and transparently reassigned to another physical link within the LAG. Aggregated physical links provide higher bandwidth than individual links. Multiple ports associated with a LAG are treated as a single logical port. Traffic can be load-balanced across the multiple physical links of the LAG. One or more LAGs can be configured between two endpoints. The two endpoints may be, for example, between the NVD and the TOR switch, or between a host machine and the NVD.
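The LAG behavior described above (several physical links treated as one logical link, load balancing across members, and reassignment of traffic when a member fails) can be illustrated with a minimal sketch. The class name, the link names, and the choice of a CRC32 hash over a flow key are assumptions made for illustration only; the text does not specify how traffic is distributed.

```python
import zlib


class Lag:
    """Hypothetical model of a LAG: several physical links, one logical link."""

    def __init__(self, links):
        self.links = list(links)   # member links in a stable order
        self.up = set(self.links)  # links that are currently usable

    def fail(self, link):
        """Mark a member link as failed."""
        self.up.discard(link)

    def pick_link(self, flow_key: str) -> str:
        """Choose a member link for a flow (assumed hash-based policy)."""
        active = [link for link in self.links if link in self.up]
        if not active:
            raise RuntimeError("no active links in LAG")
        # The same flow key always maps to the same link while membership is stable.
        return active[zlib.crc32(flow_key.encode()) % len(active)]


lag = Lag(["link-a", "link-b"])
first = lag.pick_link("10.0.0.2->10.0.0.3")
lag.fail(first)                                 # simulate a physical link failure
second = lag.pick_link("10.0.0.2->10.0.0.3")    # the flow moves to the surviving link
```

Hashing on a flow key keeps the packets of one flow on one physical link, which is a common design choice but only one possible policy.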
[0090] The NVD implements or performs network virtualization functions. These functions are performed by software / firmware run by the NVD. Examples of network virtualization functions include, but are not limited to, packet encapsulation and decapsulation functions, functions for creating VCN networks, functions for implementing network policies such as VCN security list (firewall) functions, and functions for facilitating the routing and forwarding of packets between compute instances within the VCN. In certain embodiments, upon receiving a packet, the NVD is configured to run a packet processing pipeline that processes the packet and determines how to forward or route it. As part of this packet processing pipeline, the NVD may execute the VNICs associated with compute instances within the VCN, execute the virtual routers (VRs) associated with the VCN, encapsulate and decapsulate packets to facilitate forwarding or routing within the virtual network, execute specific gateways (e.g., local peering gateways), implement security lists and network security groups, and perform network address translation (NAT) functions (e.g., public IP to private IP translation on a per-host basis), throttling functions, and other functions.
[0091] In some embodiments, the packet processing data path in the NVD may include multiple packet pipelines. Each packet pipeline consists of a set of packet translation stages. In some implementations, upon receiving a packet, it is parsed and classified into a single pipeline. The packet is then processed linearly, stage by stage, until it is discarded or sent out through the NVD's interface. These stages provide the basic functional packet processing building blocks (e.g., header validation, throttling, insertion of new Layer 2 headers, L4 firewall execution, VCN encapsulation / decapsulation), and as a result, new pipelines can be constructed by assembling existing stages, and new functionality can be added by creating new stages and inserting them into existing pipelines.
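The idea of building pipelines from reusable stages can be sketched as follows. This is a minimal illustration under stated assumptions: packets are modeled as dictionaries, a stage returns None to discard a packet, and the stage names, the blocked-port policy, and the flow-tagging stage are hypothetical rather than taken from the NVD implementation.

```python
from typing import Callable, List, Optional

Packet = dict
Stage = Callable[[Packet], Optional[Packet]]  # a stage returns None to discard

BLOCKED_PORTS = {23}  # hypothetical firewall policy


def validate_header(pkt: Packet) -> Optional[Packet]:
    """Discard packets that lack the fields later stages rely on."""
    return pkt if "dst" in pkt and "dport" in pkt else None


def l4_firewall(pkt: Packet) -> Optional[Packet]:
    """Discard packets addressed to a blocked destination port."""
    return None if pkt["dport"] in BLOCKED_PORTS else pkt


def encapsulate(pkt: Packet) -> Optional[Packet]:
    """Mark the packet as encapsulated (stand-in for VCN encapsulation)."""
    return {**pkt, "encapsulated": True}


def tag_flow(pkt: Packet) -> Optional[Packet]:
    """A new stage: attach a flow label derived from the destination."""
    return {**pkt, "flow": f'{pkt["dst"]}:{pkt["dport"]}'}


def run_pipeline(stages: List[Stage], pkt: Packet) -> Optional[Packet]:
    """Process the packet stage by stage until it is discarded or emitted."""
    for stage in stages:
        result = stage(pkt)
        if result is None:
            return None
        pkt = result
    return pkt


BASE_PIPELINE = [validate_header, l4_firewall, encapsulate]
# New functionality is added by inserting a new stage into an existing pipeline.
EXTENDED_PIPELINE = [validate_header, l4_firewall, tag_flow, encapsulate]
```

Because each stage has the same signature, a new pipeline is just a different list of existing stages.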
[0092] The NVD can perform both control plane and data plane functions corresponding to the VCN's control plane and data plane. Examples of the VCN control plane are shown in Figures 17, 18, 19, and 20 (see reference numbers 1716, 1816, 1916, and 2016) and are described below. Examples of the VCN data plane are shown in Figures 17, 18, 19, and 20 (see reference numbers 1718, 1818, 1918, and 2018) and are described below. Control plane functions include functions used to configure the network to control how data is forwarded (e.g., setting routes and route tables, configuring VNICs). In certain embodiments, a VCN control plane is provided that centrally calculates the mapping of all overlays to the substrate and exposes it to the NVD and virtual network edge devices (e.g., various gateways such as DRG, SGW, IGW). Firewall rules can also be exposed using the same mechanism. In certain embodiments, the NVD retrieves only the mappings relevant to that NVD. The data plane function includes the ability to perform the actual routing / forwarding of packets based on the configuration set using the control plane. The VCN data plane is implemented by encapsulating customer network packets before they pass through the backbone network. The encapsulation / decapsulation function is implemented in the NVD. In certain embodiments, the NVD is configured to intercept all network packets entering and leaving the host machine and to perform network virtualization functions.
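The statement that an NVD retrieves only the mappings relevant to it can be illustrated with a small sketch. The table layout, the identifiers, the addresses, and the rule that "relevant" means "belonging to a VCN with a locally hosted compute instance" are all assumptions made for illustration; the text does not define how relevance is determined.

```python
# Hypothetical overlay-to-substrate table computed centrally by the control plane:
# (VCN identifier, overlay IP) -> substrate address of the NVD running that VNIC.
ALL_MAPPINGS = {
    ("vcn-a", "10.0.0.2"): "192.0.2.10",
    ("vcn-a", "10.0.0.3"): "192.0.2.11",
    ("vcn-b", "10.0.0.2"): "192.0.2.12",
}


def mappings_for_nvd(all_mappings, local_vcns):
    """Return only the entries for VCNs that have instances behind this NVD."""
    return {key: substrate for key, substrate in all_mappings.items()
            if key[0] in local_vcns}


# An NVD whose host machines only run instances of "vcn-a" fetches two entries.
local_view = mappings_for_nvd(ALL_MAPPINGS, {"vcn-a"})
```

Note that the same overlay IP address can appear in two VCNs, which is why the key includes the VCN identifier.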
[0093] As described above, the NVD performs various virtualization functions, including running VNICs and VCN VRs. An NVD can run VNICs associated with compute instances hosted by one or more host machines connected to the NVD. For example, as shown in Figure 2, NVD210 runs the functions of VNIC276 associated with compute instance 268 hosted by host machine 202 connected to NVD210. As another example, NVD212 runs VNIC280 associated with bare-metal compute instance 272 hosted by host machine 206 and VNIC284 associated with compute instance 274 hosted by host machine 208. A host machine can host compute instances belonging to different VCNs belonging to different customers. An NVD connected to a host machine can run VNICs corresponding to compute instances (i.e., perform functions associated with VNICs).
[0094] Furthermore, the NVD runs a VCN virtual router corresponding to the VCN of the compute instance. For example, in the embodiment shown in Figure 2, NVD210 runs VCN VR277 corresponding to the VCN to which compute instance 268 belongs. NVD212 runs one or more VCN VR283 corresponding to one or more VCNs to which compute instances hosted on host machines 206 and 208 belong. In a particular embodiment, a VCN VR corresponding to a VCN is run by all NVDs connected to a host machine that hosts at least one compute instance belonging to that VCN. If a host machine hosts compute instances belonging to a different VCN, the NVDs connected to that host machine can run VCN VRs corresponding to different VCNs.
[0095] In addition to VNICs and VCN VRs, an NVD may include one or more hardware elements that run various software (e.g., daemons) and facilitate various network virtualization functions performed by the NVD. For simplicity, these various elements are grouped as “packet processing elements” as shown in Figure 2. For example, NVD210 includes packet processing element 286, and NVD212 includes packet processing element 288. For example, a packet processing element of an NVD may include a packet processor configured to monitor all packets received and communicated using the NVD and to store network information by interacting with the NVD’s ports and hardware interfaces. Network information may include, for example, network flow information to identify different network flows processed by the NVD and information about each flow (e.g., statistics for each flow). In certain embodiments, network flow information may be stored on a per-VNIC basis. As another example, a packet processing element may include a replication agent configured to replicate the information stored by the NVD to one or more different replication target stores. As yet another example, the packet processing element may include a logging agent configured to perform the NVD's logging function. The packet processing element may also include software to monitor the performance and health of the NVD, and optionally the status and health of other elements connected to the NVD.
[0096] Figure 1 shows the elements of an exemplary virtual or overlay network, including a VCN, subnets within the VCN, compute instances deployed on the subnets, VNICs associated with the compute instances, a VR for the VCN, and a set of gateways configured for the VCN. The overlay elements shown in Figure 1 may be run or hosted by one or more of the physical elements shown in Figure 2. For example, compute instances within a VCN may be run or hosted by one or more host machines shown in Figure 2. In the case of compute instances hosted by host machines, the VNICs associated with those compute instances are typically run by NVDs connected to that host machine (i.e., VNIC functionality is provided by NVDs connected to that host machine). The VCN VR functionality of the VCN is run by all NVDs connected to the host machines that host or run the compute instances that are part of that VCN. Gateways associated with the VCN may be run by one or more different types of NVDs. For example, some gateways may be run by smart NICs, and others may be run by one or more host machines or other implementations of NVDs.
[0097] As described above, compute instances within a customer VCN can communicate with a variety of different endpoints. These endpoints may be on the same subnet as the source compute instance, on a different subnet but still within the same VCN, or may include endpoints outside the source compute instance's VCN. These communications are facilitated using the VNIC associated with the compute instance, the VCN VR, and the gateway associated with the VCN.
[0098] Communication between two compute instances on the same subnet within a VCN is facilitated using VNICs associated with the source and destination compute instances. The source and destination compute instances may be hosted on the same host machine or on different host machines. Packets originating from the source compute instance may be forwarded from the host machine hosting the source compute instance to an NVD connected to that host machine. In the NVD, packets are processed using a packet processing pipeline, which may include the execution of the VNIC associated with the source compute instance. Because the destination endpoint of the packets is on the same subnet, the execution of the VNIC associated with the source compute instance forwards the packets to the NVD running the VNIC associated with the destination compute instance, where the NVD processes the packets and forwards them to the destination compute instance. The VNICs associated with the source and destination compute instances may run on the same NVD (for example, if both the source and destination compute instances are hosted on the same host machine) or on different NVDs (for example, if the source and destination compute instances are hosted on different host machines connected to different NVDs). The VNIC can use the routing / forwarding table stored by the NVD to determine the next hop of a packet.
[0099] When a packet is communicated from a compute instance within a subnet to an endpoint in a different subnet within the same VCN, the packet originating from the source compute instance is communicated from the host machine hosting the source compute instance to the NVD connected to that host machine. In the NVD, the packet is processed using a packet processing pipeline and a VR associated with the VCN, which may include the execution of one or more VNICs. For example, as part of the packet processing pipeline, the NVD executes or invokes a function corresponding to the VNIC associated with the source compute instance (also called executing the VNIC). The function executed by the VNIC may include examining the VLAN tag on the packet. Because the packet's destination is outside the subnet, a VCN VR function is invoked and executed by the NVD. The VCN VR then routes the packet to the NVD executing the VNIC associated with the destination compute instance. The VNIC associated with the destination compute instance then processes the packet and forwards it to the destination compute instance. The VNICs associated with the source compute instance and the destination compute instance may run on the same NVD (for example, if both the source compute instance and the destination compute instance are hosted by the same host machine), or they may run on different NVDs (for example, if the source compute instance and the destination compute instance are hosted by different host machines connected to different NVDs).
[0100] If the packet's destination is outside the VCN of the source compute instance, the packet originating from the source compute instance is communicated from the host machine hosting the source compute instance to the NVD connected to that host machine. The NVD runs the VNIC associated with the source compute instance. Because the packet's destination endpoint is outside the VCN, the packet is processed by the VCN VR of that VCN. The NVD invokes VCN VR functionality, which may result in the packet being forwarded to an NVD running the appropriate gateway associated with the VCN. For example, if the destination is an endpoint within the customer's on-premises network, the packet may be forwarded by the VCN VR to an NVD running the DRG gateway configured for the VCN. The VCN VR may run on the same NVD as the NVD running the VNIC associated with the source compute instance, or it may run on a different NVD. The gateway may run on an NVD that is a smart NIC, a host machine, or another NVD implementation. The packet is then processed by the gateway and forwarded to the next hop to facilitate communication of the packet to its intended destination endpoint. For example, in the embodiment shown in Figure 2, a packet originating from compute instance 268 may be communicated from host machine 202 to NVD210 via link 220 (using NIC 232). VNIC 276 on NVD210 is invoked because it is the VNIC associated with source compute instance 268. VNIC 276 is configured to examine the encapsulated information in the packet, determine the next hop for forwarding the packet to facilitate communication of the packet to its intended destination endpoint, and forward the packet to the determined next hop.
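The three cases described above (destination in the same subnet, in a different subnet of the same VCN, or outside the VCN) can be summarized in a short sketch. The function name, the return labels, and the CIDR ranges used in the examples are hypothetical; the sketch only classifies the destination and does not model the actual NVD forwarding tables.

```python
import ipaddress


def classify_destination(dst_ip: str, src_subnet: str, vcn_cidr: str) -> str:
    """Decide which element handles a packet, based on where its destination lies."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in ipaddress.ip_network(src_subnet):
        # Same subnet: the source VNIC forwards toward the NVD running the destination VNIC.
        return "vnic"
    if dst in ipaddress.ip_network(vcn_cidr):
        # Different subnet, same VCN: the VCN VR routes the packet.
        return "vcn_vr"
    # Outside the VCN: the VCN VR forwards to an NVD running the appropriate gateway.
    return "gateway"


same_subnet = classify_destination("10.0.1.7", "10.0.1.0/24", "10.0.0.0/16")
other_subnet = classify_destination("10.0.2.9", "10.0.1.0/24", "10.0.0.0/16")
outside_vcn = classify_destination("192.168.5.1", "10.0.1.0/24", "10.0.0.0/16")
```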
[0101] Compute instances deployed on a VCN can communicate with various different endpoints. These endpoints may include endpoints hosted by CSPI200 and endpoints outside of CSPI200. Endpoints hosted by CSPI200 may include instances within the same VCN or other VCNs (which may be customer VCNs or VCNs not belonging to a customer). Communication between endpoints hosted by CSPI200 may be performed over the physical network 218. Compute instances can also communicate with endpoints not hosted by CSPI200 or located outside of CSPI200. Examples of these endpoints include endpoints within the customer's on-premises network or data center, or public endpoints accessible over a public network such as the Internet. Communication with endpoints outside of CSPI200 may be performed over a public network (e.g., the Internet) (not shown in Figure 2) or a private network (not shown in Figure 2) using various communication protocols.
[0102] The architecture of the CSPI200 shown in Figure 2 is merely an example and is not intended to be limiting. Alternative embodiments are possible, and variations, substitutions, and modifications are possible. For example, in some implementations, the CSPI200 may have more or fewer systems or elements than those shown in Figure 2, may combine two or more systems, or may have different system configurations or arrangements. The systems, subsystems, and other elements shown in Figure 2 may be implemented as software (e.g., code, instructions, programs) executed by one or more processing units (e.g., processors, cores) of each system, as hardware, or as a combination thereof. The software may be stored in a non-transitory storage medium (e.g., a memory device).
[0103] Figure 4 shows a connection between a host machine and an NVD to provide I / O virtualization to support multi-tenancy functionality, according to a particular embodiment. As shown in Figure 4, the host machine 402 runs a hypervisor 404 that provides the virtualization environment. The host machine 402 runs two virtual machine instances, namely VM1 406 belonging to customer / tenant #1 and VM2 408 belonging to customer / tenant #2. The host machine 402 includes a physical NIC 410 connected to the NVD 412 via link 414. Each compute instance is connected to a VNIC run by the NVD 412. In the embodiment of Figure 4, VM1 406 is connected to VNIC-VM1 420 and VM2 408 is connected to VNIC-VM2 422.
[0104] As shown in Figure 4, NIC 410 includes two logical NICs, namely logical NIC A 416 and logical NIC B 418. Each virtual machine is connected to its own logical NIC and configured to operate with its own logical NIC. For example, VM1 406 is connected to logical NIC A 416, and VM2 408 is connected to logical NIC B 418. Although the host machine 402 includes only one physical NIC 410 shared by multiple tenants, the logical NICs allow each tenant's virtual machine to believe that it owns its own host machine and NIC.
[0105] In a particular embodiment, each logical NIC is assigned its own VLAN ID. Thus, logical NIC A 416 for tenant #1 is assigned a specific VLAN ID, and logical NIC B 418 for tenant #2 is assigned a different VLAN ID. When a packet is communicated from VM1 406, the hypervisor attaches the tag assigned to tenant #1 to the packet and then communicates the packet from host machine 402 to NVD 412 via link 414. Similarly, when a packet is communicated from VM2 408, the hypervisor attaches the tag assigned to tenant #2 to the packet and then communicates the packet from host machine 402 to NVD 412 via link 414. Thus, the packet 424 communicated from host machine 402 to NVD 412 has an associated tag 426 that identifies a specific tenant and associated VM. When packet 424 is received by NVD 412 from host machine 402, the tag 426 associated with the packet is used to determine whether the packet should be processed by VNIC-VM1 420 or VNIC-VM2 422. The packet is then processed by the corresponding VNIC. The configuration shown in Figure 4 allows each tenant's compute instance to believe that it owns its own host machine and NIC, and thereby provides I / O virtualization to support multi-tenancy functionality.
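The tag-based selection of a VNIC described above can be sketched as a simple lookup. The VLAN ID values (101 and 102) and the dictionary-based packet representation are hypothetical; the text only states that each logical NIC has its own VLAN ID and that the tag determines which VNIC processes the packet.

```python
# Hypothetical VLAN IDs assigned to the logical NICs of tenant #1 and tenant #2.
VLAN_TENANT_1 = 101
VLAN_TENANT_2 = 102

# On the NVD side: which VNIC handles packets carrying each tag.
TAG_TO_VNIC = {
    VLAN_TENANT_1: "VNIC-VM1",
    VLAN_TENANT_2: "VNIC-VM2",
}


def tag_packet(vlan_id: int, payload: bytes) -> dict:
    """Hypervisor side: attach the tenant's tag before sending to the NVD."""
    return {"vlan_id": vlan_id, "payload": payload}


def select_vnic(packet: dict) -> str:
    """NVD side: use the tag to pick the VNIC that processes the packet."""
    try:
        return TAG_TO_VNIC[packet["vlan_id"]]
    except KeyError:
        raise ValueError(f"no VNIC configured for tag {packet.get('vlan_id')!r}")


from_vm1 = select_vnic(tag_packet(VLAN_TENANT_1, b"hello"))
from_vm2 = select_vnic(tag_packet(VLAN_TENANT_2, b"hello"))
```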
[0106] Figure 5 is a schematic block diagram showing a physical network 500 according to a particular embodiment. The embodiment shown in Figure 5 is constructed as a Clos network. A Clos network is a specific type of network topology designed to provide connectivity redundancy while maintaining high bisection bandwidth and maximum resource utilization. A Clos network is a type of non-blocking, multi-stage or multi-layer switching network, where the number of stages or layers may be 2, 3, 4, 5, etc. The embodiment shown in Figure 5 is a 3-layer network including layers 1, 2, and 3. A TOR switch 504 represents a layer-0 switch in the Clos network. One or more NVDs are connected to the TOR switch. Layer-0 switches are also called edge devices in the physical network. Layer-0 switches are connected to layer-1 switches, also called leaf switches. In the embodiment shown in Figure 5, "n" layer-0 TOR switches are connected to "n" layer-1 switches to form pods. Each layer-0 switch in a pod is interconnected to all layer-1 switches in the pod, but switches in different pods are not connected to each other. In a particular implementation, two pods are referred to as a block. Each block is serviced by or connected to n layer-2 switches (also called spine switches). The physical network topology may contain multiple blocks. Similarly, the layer-2 switches are connected to n layer-3 switches (also called superspine switches). Packet communication over the physical network 500 is typically performed using one or more OSI Layer 3 communication protocols. Typically, all layers of the physical network except the TOR layer are n-way redundant, thus achieving high availability. The physical network can be extended by specifying policies for pods and blocks to control the mutual visibility of switches in the physical network.
[0107] A key feature of Clos networks is that the maximum hop count required to reach one layer-0 switch from another (or to reach one NVD connected to a layer-0 switch from another NVD connected to a layer-0 switch) remains constant. For example, in a 3-layer Clos network, a packet requires a maximum of 7 hops to reach one NVD from another. In this case, the source NVD and target NVD are connected to the leaf layer of the Clos network. Similarly, in a 4-layer Clos network, a packet requires a maximum of 9 hops to reach one NVD from another. In this case, the source NVD and target NVD are connected to the leaf layer of the Clos network. Therefore, the Clos network architecture maintains a constant overall network latency, which is crucial for communication within and between data centers. Clos topologies are horizontally scalable and cost-effective. Network bandwidth / throughput capacity can be easily increased by adding more switches to each layer (e.g., more leaf and spine switches) and increasing the number of links between switches in adjacent layers.
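The two hop counts quoted above (7 hops with 3 layers, 9 hops with 4 layers) are both consistent with a maximum of 2n + 1 hops for an n-layer Clos network. The general formula below is an extrapolation from those two data points and is not stated in the text; the function name is hypothetical.

```python
def max_hops_between_nvds(layers: int) -> int:
    """Assumed worst-case NVD-to-NVD hop count for an n-layer Clos network.

    The 2n + 1 form is inferred from the two examples in the text
    (3 layers -> 7 hops, 4 layers -> 9 hops).
    """
    if layers < 1:
        raise ValueError("a Clos network needs at least one layer")
    return 2 * layers + 1


three_layer = max_hops_between_nvds(3)
four_layer = max_hops_between_nvds(4)
```

The bound depends only on the number of layers, not on how many switches each layer contains, which is why adding switches to a layer does not change the worst-case path length.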
[0108] In certain embodiments, each resource within the CSPI is assigned a unique identifier called a Cloud Identifier (CID). This identifier is included as part of the resource's information. This identifier can be used to manage the resource, for example, via a console or API. An example syntax for a CID is as follows:
[0109] ocid1.<RESOURCE TYPE>.<REALM>.[REGION][.FUTURE USE].<UNIQUE ID> Here, "ocid1" is a string that indicates the CID version.
[0110] "RESOURCE TYPE" represents the type of resource (e.g., instance, volume, VCN, subnet, user, group).
[0111] "REALM" represents the realm in which the resource resides. Exemplary values include "c1" representing the commercial realm, "c2" representing the government cloud realm, or "c3" representing the federal government cloud realm. Each realm can have its own domain name.
[0112] "REGION" represents the region to which the resource belongs. If no region applies to the resource, this section may be left blank.
[0113] "FUTURE USE" indicates that it is reserved for future use. The "UNIQUE ID" is the unique identifier portion. This format may vary depending on the type of resource or service.
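The field layout described above can be illustrated with a small parser. This is a minimal sketch that assumes the fields are separated by periods and that the unique ID itself contains no period; the class name CloudId, the function parse_cid, and the sample identifiers are hypothetical and are not part of any provider API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CloudId:
    version: str
    resource_type: str
    realm: str
    region: str                 # empty string when no region applies
    future_use: Optional[str]   # None when the reserved field is absent
    unique_id: str


def parse_cid(cid: str) -> CloudId:
    """Split a CID into its fields (assumes '.' separators, no '.' in the ID)."""
    parts = cid.split(".")
    if len(parts) == 5:
        version, resource_type, realm, region, unique_id = parts
        future_use = None
    elif len(parts) == 6:
        version, resource_type, realm, region, future_use, unique_id = parts
    else:
        raise ValueError(f"expected 5 or 6 dot-separated fields, got {len(parts)}")
    if version != "ocid1":
        raise ValueError(f"unsupported CID version: {version!r}")
    if not (resource_type and realm and unique_id):
        raise ValueError("resource type, realm and unique ID must not be empty")
    return CloudId(version, resource_type, realm, region, future_use, unique_id)


if __name__ == "__main__":
    # Made-up identifiers for illustration only.
    print(parse_cid("ocid1.instance.c1.region1.exampleuniqueid"))
    print(parse_cid("ocid1.user.c1..exampleuniqueid"))  # region left blank
```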
[0114] B. Exemplary Layer 2 VLAN Architecture This section describes technologies for providing Layer 2 networking capabilities in a virtualized cloud environment. Layer 2 capabilities are provided in addition to, and in relation to, the Layer 3 networking capabilities provided by the virtualized cloud environment. In certain embodiments, virtual Layer 2 and Layer 3 capabilities are provided by Oracle Cloud Infrastructure (OCI), provided by Oracle Corporation.
[0115] Following the introduction of Layer 2 networking functionality, this section explains Layer 2 implementation for VLANs. Subsequently, Layer 2 VLAN services, including storm control, are described.
[0116] Preface The number of enterprise customers migrating their on-premises applications to cloud environments provided by cloud service providers (CSPs) continues to grow rapidly. However, many of these customers quickly realize that the migration journey to the cloud can be extremely challenging, requiring their existing applications to be rebuilt and redesigned to function in the cloud environment. This is because applications written for on-premises environments often rely on the characteristics of the physical network in terms of monitoring, availability, and scalability. Therefore, these on-premises applications need to be rebuilt and redesigned before they can function in the cloud environment.
[0117] There are several reasons why on-premises applications cannot easily migrate to a cloud environment. One of the main reasons is that current cloud virtual networks operate at Layer 3 of the OSI model (for example, the IP layer) and do not provide the Layer 2 functionality required by applications. Layer 3-based routing or forwarding involves determining where a packet should be sent (for example, to which customer instance) based on information contained in the packet's Layer 3 header, for example, the destination IP address. To facilitate this, the location of IP addresses within a virtualized cloud network is determined via a centralized control and orchestration system or controller. These may include, for example, IP addresses associated with customer entities or resources within the virtualized cloud environment.
[0118] Many customers are running applications in their on-premises environments that have stringent requirements for Layer 2 networking capabilities that are not addressed by current cloud offerings and IaaS service providers. For example, traffic is routed using Layer 3 protocols with Layer 3 headers in current cloud offerings, and the Layer 2 capabilities required by the applications are not supported. These Layer 2 capabilities may include features such as Address Resolution Protocol (ARP) processing, Media Access Control (MAC) address learning, Layer 2 broadcast capabilities, Layer 2 (MAC-based) forwarding, and Layer 2 networking constructs. By providing virtualized Layer 2 networking capabilities in a virtualized cloud network as described in this disclosure, customers can now seamlessly migrate their legacy applications to the cloud environment without requiring any substantial restructuring or redesign. For example, the virtualized Layer 2 networking capabilities described herein enable such applications (e.g., VMware vSphere, vCenter, vSAN, and NSX-T components) to communicate at Layer 2 as they would in an on-premises environment. These applications can run the same versions and configurations in the public cloud, allowing customers to use legacy on-premises applications, including existing knowledge, tools, and processes associated with the legacy applications. Customers can also access native cloud services from their applications (for example, using VMware Software-Defined Data Centers (SDDCs)).
[0119] Another example is several legacy on-premises applications that require Layer 2 broadcast support for failover (e.g., enterprise clustering software applications, network virtual appliances). Illustrative applications include Fortinet FortiGate, IBM® QRadar, Palo Alto firewalls, Cisco ASA, Juniper SRX, and Oracle RAC (Real Application Clusters). As described in this disclosure, by providing virtualized Layer 2 networking in a virtualized public cloud, these appliances can now operate in a virtualized public cloud environment without modification. Virtualized Layer 2 networking capabilities comparable to on-premises are provided, as described herein. The virtualized Layer 2 networking capabilities described in this disclosure support traditional Layer 2 networking, including customer-defined VLANs, as well as unicast, broadcast, and multicast Layer 2 traffic. Layer 2-based packet routing and forwarding uses Layer 2 protocols and the information contained in the packet's Layer 2 header, for example, routing or forwarding a packet based on the destination MAC address contained in that header. Protocols used by enterprise applications (e.g., clustering software applications), such as ARP, Gratuitous Address Resolution Protocol (GARP), and Reverse Address Resolution Protocol (RARP), can now also function in cloud environments.
[0120] There are several reasons why traditional virtualized cloud infrastructure supports virtualized Layer 3 networking and not Layer 2 networking. Layer 2 networks typically do not scale in the same way as Layer 3 networks. Layer 2 network control protocols do not have the level of sophistication desired for scaling. For example, Layer 3 networks do not have to worry about packet looping, which Layer 2 networks must deal with. IP packets (i.e., Layer 3 packets) have the concept of Time To Live (TTL), while Layer 2 packets do not. IP addresses contained within Layer 3 packets have topological concepts such as subnets and CIDR ranges, while Layer 2 addresses (e.g., MAC addresses) do not. Layer 3 IP networks have built-in tools that facilitate troubleshooting, such as ping and traceroute for finding path information. Such tools are not available for Layer 2. Layer 3 networks support multipath functionality, which is not available for Layer 2. Because Layer 2 lacks sophisticated control protocols for exchanging information between entities in a network, such as the Border Gateway Protocol (BGP) and Open Shortest Path First (OSPF) available at Layer 3, Layer 2 networks must rely on broadcast and multicast to learn about the network, which can negatively impact network performance. As the network changes, the learning process for Layer 2 must be repeated, which is not necessary for Layer 3. For these and other reasons, it is more desirable for cloud IaaS service providers to provide infrastructure that operates at Layer 3 rather than Layer 2.
[0121] However, Layer 2 functionality is required by many on-premises applications despite its numerous drawbacks. For example, consider a virtualized cloud configuration where a customer (customer 1) has two instances, instance A with IP1 and instance B with IP2, in a virtual network "V" where instances can be compute instances (e.g., bare metal, virtual machines, or containers) or service instances such as load balancers, NFS mount points, or other service instances. Virtual network V is a separate address space isolated from other virtual networks and the underlying physical network. This isolation can be achieved using various techniques, including packet encapsulation or NAT. For this reason, the IP addresses of instances in the customer's virtual network are different from the addresses in the physical network where they are hosted. A centralized SDN (Software Defined Networking) control plane is provided that knows the physical IPs and the virtual interfaces of all virtual IP addresses. When a packet is sent from instance A to a destination IP2 in virtual network V, the virtual network SDN stack needs to know where IP2 is located. It must know this in advance so that it can send the packet to the IP in the physical network where the virtual IP address IP2 for V is hosted. The location of a virtual IP address can be modified within the cloud, thus changing the relationship between the physical IP and the virtual IP address. Every time a virtual IP address is moved (for example, moving the IP address associated with a virtual machine to another virtual machine, or migrating a virtual machine to a new physical host), an API call must be made to the SDN control plane to inform the controller that the IP has been moved, so that it can update all participants in the SDN stack, including the packet processor (data plane). 
However, there is a class of applications that does not make such API calls. Examples include various on-premises applications and applications provided by various virtualization software vendors such as VMware. The value of facilitating virtual Layer 2 networking in a virtualized cloud environment lies in enabling support for applications that are not programmed to make such API calls, or applications that rely on other Layer 2 networking features, such as support for non-IP Layer 3 protocols and MAC learning.
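The dependency on explicit API calls described in this paragraph can be sketched as a lookup table that only changes when the controller is told about a move. The class SdnControlPlane, its method names, and the sample addresses below are hypothetical and serve only to illustrate the idea; they do not describe any actual control plane interface.

```python
from typing import Dict, Tuple


class SdnControlPlane:
    """Toy stand-in for a centralized controller (illustrative only).

    Maps (virtual network, virtual IP) to the physical IP of the host that
    currently carries that virtual IP.
    """

    def __init__(self) -> None:
        self._location: Dict[Tuple[str, str], str] = {}

    def register(self, vnet: str, virtual_ip: str, physical_ip: str) -> None:
        self._location[(vnet, virtual_ip)] = physical_ip

    def move(self, vnet: str, virtual_ip: str, new_physical_ip: str) -> None:
        """The API call an application must make when a virtual IP moves."""
        key = (vnet, virtual_ip)
        if key not in self._location:
            raise KeyError(f"{virtual_ip} is not registered in {vnet}")
        self._location[key] = new_physical_ip

    def resolve(self, vnet: str, virtual_ip: str) -> str:
        """Where the data plane should send packets for this virtual IP."""
        return self._location[(vnet, virtual_ip)]


if __name__ == "__main__":
    controller = SdnControlPlane()
    controller.register("V", "IP2", "192.0.2.10")   # documentation addresses
    print(controller.resolve("V", "IP2"))
    # If IP2 moves but nobody calls move(), resolve() keeps returning the old
    # host. That stale mapping is the problem for applications that never
    # make the API call.
    controller.move("V", "IP2", "192.0.2.20")
    print(controller.resolve("V", "IP2"))
```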
[0122] A virtual Layer 2 network creates a broadcast domain, and learning is performed by the members of the broadcast domain. In a virtual Layer 2 domain, any IP can exist on any MAC on any host within that Layer 2 domain, and the system learns using standard Layer 2 networking protocols. The system virtualizes these networking primitives, and it does not need to be explicitly told by a central controller where the MAC and IP reside within its virtual Layer 2 network. This makes it possible to run applications requiring low-latency failover, applications that need to support broadcast or multicast protocols to multiple nodes, and legacy applications that do not know how to make API calls to the SDN control plane or to an API endpoint to report where IP and MAC addresses reside. Therefore, providing Layer 2 networking capabilities in a virtualized cloud environment is required to support functionality that is not available at the IP Layer 3 level.
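Learning within a broadcast domain, without a central controller being told where addresses live, can be sketched with the classic learning-switch rule: remember the port on which a source MAC was seen, and flood frames whose destination is not yet known. The class LearningSwitch and the port names are illustrative assumptions, not the switch implementation of this disclosure.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Minimal Layer 2 learning switch (a sketch, not the disclosed design)."""

    def __init__(self, ports: List[str]) -> None:
        self.ports = list(ports)
        self.mac_table: Dict[str, str] = {}   # MAC address -> port

    def receive(self, in_port: str, src_mac: str, dst_mac: str) -> List[str]:
        """Learn the sender's location and return the ports to forward on."""
        # Learning step: the source MAC is reachable through the ingress port.
        # Re-learning on every frame also covers a MAC that has moved.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood to all other ports.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                          # destination is on the same port
        return [out_port]


if __name__ == "__main__":
    switch = LearningSwitch(["p1", "p2", "p3"])
    print(switch.receive("p1", "mac-a", "mac-b"))   # mac-b unknown: flood
    print(switch.receive("p2", "mac-b", "mac-a"))   # mac-a learned on p1
    print(switch.receive("p3", "mac-b", "mac-a"))   # mac-b moved to p3
    print(switch.receive("p1", "mac-a", "mac-b"))   # follows the move
```

Nothing in this loop calls out to a controller; the table converges from observed traffic alone, which is what lets unmodified applications move addresses freely.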
[0123] Another technical advantage of providing virtual Layer 2 in a virtualized cloud environment is that it enables support for a variety of different Layer 3 protocols (such as IPv4 and IPv6), including non-IP protocols. For example, it can support various non-IP protocols such as IPX and AppleTalk. Existing cloud IaaS providers do not provide Layer 2 functionality in their virtualized cloud networks, and therefore cannot support these non-IP protocols. By providing Layer 2 networking functionality as described in this disclosure, it is possible to provide support for applications that require and depend on the availability of Layer 3 protocols and Layer 2 level functionality.
[0124] Using the technologies described in this disclosure, both Layer 3 and Layer 2 functionality is provided in a virtualized cloud infrastructure. As previously stated, Layer 3-based networking provides certain efficiencies not provided by Layer 2 networking, particularly efficiencies that are well-suited for scaling. By providing Layer 2 functionality in addition to Layer 3 functionality, it becomes possible to leverage such efficiencies provided by Layer 3 (for example, to provide a more scalable solution) while providing Layer 2 functionality in a more scalable manner. For example, virtualized Layer 3 avoids the need to use broadcasts for learning purposes. Customers are given complete flexibility in a virtualized cloud environment: Layer 3 is provided for its efficiency, while virtualized Layer 2 is provided at the same time to enable applications that require it, applications that cannot function without Layer 2 functionality, support for non-IP protocols, and so on.
[0125] Customers themselves have hybrid environments where Layer 2 and Layer 3 environments coexist, and virtualized cloud environments can now support both of these environments. Customers can have Layer 3 networks such as subnets and / or Layer 2 networks such as VLANs, and these two environments can interact with each other within the virtualized cloud environment.
[0126] Virtualized cloud environments also need to support multi-tenancy. Multi-tenancy makes provisioning both Layer 3 and Layer 2 functionalities within the same virtualized cloud environment technically difficult and complex. For example, a Layer 2 broadcast domain must be managed across many different customers within the cloud provider's infrastructure. The embodiments described in this disclosure overcome these technical challenges.
[0127] For virtualization providers (e.g., VMware), a virtualized Layer 2 network that emulates a physical Layer 2 network allows workloads to run without modification. Applications provided by such a virtualization provider can then run on the virtualized Layer 2 network provided by the cloud infrastructure. For example, such an application might include a set of instances that need to run on a Layer 2 network. If a customer wants to lift and shift such applications from their on-premises environment to a virtualized cloud environment, they cannot simply import the applications and run them in the cloud because these applications rely on an underlying Layer 2 network that is not provided by current virtualized cloud providers (for example, Layer 2 networking capabilities are used to perform virtual machine migration or to change where MAC and IP addresses reside). For these reasons, such applications cannot run natively in a virtualized cloud environment. Using the techniques described here, a cloud provider can provide a virtualized Layer 2 network in addition to a virtualized Layer 3 network. Such an application stack can then run without modification in the cloud environment and can perform nested virtualization within the cloud environment. Customers can now run and manage their own Layer 2 applications within the cloud. Application providers do not need to make any changes to their own software to facilitate this. Such legacy applications or workloads (e.g., legacy load balancers, legacy applications, KVM, OpenStack, clustering software) can now run unchanged in a virtualized cloud environment.
[0128] By providing virtualized Layer 2 functionality as described here, various Layer 3 protocols, including non-IP protocols, can now be supported by virtualized cloud environments. Taking Ethernet as an example, it can support various different EtherTypes (a field in the Layer 2 header that indicates what type of Layer 3 packet is being sent, that is, what protocol should be expected at Layer 3), including various non-IP protocols. An EtherType is a two-octet field within an Ethernet® frame. It is used by the data link layer at the receiving end to determine which protocol is encapsulated in the frame's payload and how the payload will be processed. The EtherType is also used as the basis for 802.1Q VLAN tagging, which encapsulates packets from a VLAN for transmission multiplexed with other VLAN traffic over an Ethernet trunk. Examples of EtherTypes include IPv4, IPv6, Address Resolution Protocol (ARP), AppleTalk, and IPX. Cloud networks that support Layer 2 protocols can support any protocol at Layer 3. Similarly, when cloud infrastructure provides support for Layer 3 protocols, it can support various Layer 4 protocols such as TCP, UDP, and ICMP. A network can be agnostic to Layer 4 protocols when virtualization is provided at Layer 3. Similarly, a network can be agnostic to Layer 3 protocols when virtualization is provided at Layer 2. This technology can be extended to support any Layer 2 network type, including FDDI and InfiniBand.
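How a receiver uses the two-octet EtherType, and how an 802.1Q tag sits in front of it, can be shown with a short parser. This is a sketch: the function name classify_frame is hypothetical, while the numeric EtherType values are the publicly registered ones for the protocols named above.

```python
import struct
from typing import Optional, Tuple

# Registered EtherType values for the protocols mentioned in the text.
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x809B: "AppleTalk",
    0x8137: "IPX",
    0x86DD: "IPv6",
}
VLAN_TPID = 0x8100   # marks an 802.1Q VLAN tag


def classify_frame(frame: bytes) -> Tuple[Optional[int], str]:
    """Return (VLAN ID or None, Layer 3 protocol name) for an Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType or TPID.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    vlan_id = None
    if ethertype == VLAN_TPID:
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        # The tag carries 16 bits of control information (low 12 bits are the
        # VLAN ID), followed by the real EtherType.
        tci, ethertype = struct.unpack_from("!HH", frame, 14)
        vlan_id = tci & 0x0FFF
    return vlan_id, ETHERTYPES.get(ethertype, f"unknown(0x{ethertype:04x})")


if __name__ == "__main__":
    untagged = bytes(12) + struct.pack("!H", 0x0800)
    tagged = bytes(12) + struct.pack("!HHH", VLAN_TPID, 100, 0x8137)
    print(classify_frame(untagged))
    print(classify_frame(tagged))
```

The forwarding decision never needs to understand the payload, which is why a network virtualized at Layer 2 can carry IPX or AppleTalk as readily as IPv4.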
[0129] Therefore, many applications written for physical networks, particularly those operating with clusters of computer nodes sharing a broadcast domain, utilize Layer 2 features that are not supported in L3 virtual networks. The following six examples highlight the complexities that can arise from the lack of Layer 2 networking capabilities:
(1) MAC and IP assignment without prior API calls. Network appliances and hypervisors (such as VMware) were not built for cloud virtual networks. They assume that they can use any MAC as long as it is unique, and that they can obtain dynamic addresses from a DHCP server or use any IP assigned to the cluster. Often there is no mechanism by which they can be configured to notify the control plane about the assignment of these Layer 2 and Layer 3 addresses. If the location of the MAC and IP is unknown, the Layer 3 virtual network does not know where to send the traffic.
(2) Low-latency reallocation of MAC and IP for high availability and live migration. Many on-premises applications use ARP to reallocate IPs and MACs for high availability: when an instance in a cluster or HA pair stops responding, a newly active instance sends a Gratuitous ARP (GARP) to reallocate the service IP to its MAC, or sends a Reverse ARP (RARP) to reallocate the service MAC to its interface. This is also important when live migrating instances on a hypervisor: the new host must send a RARP when the guest moves so that guest traffic is sent to the new host. The reallocation must not only be done without API calls, but also with very low latency (sub-millisecond). This cannot be achieved with HTTPS calls to REST endpoints.
(3) Interface multiplexing by MAC address. When a hypervisor hosts multiple virtual machines on a single host, all of which are on the same network, guest interfaces are distinguished by their MAC addresses. This requires support for multiple MAC addresses on the same virtual interface.
(4) VLAN support. A single physical virtual machine host may need to be on multiple broadcast domains, as indicated by the use of VLAN tags. For example, VMware ESX uses VLANs for traffic isolation (for example, a guest virtual machine may communicate on one VLAN, storage on another VLAN, and the host virtual machine on yet another VLAN).
(5) Use of broadcast and multicast traffic. ARP requires L2 broadcasting, and there are examples of on-premises applications that use broadcast and multicast traffic for cluster and HA applications.
(6) Support for non-IP traffic. Since L3 networks require IPv4 or IPv6 headers to communicate, L3 protocols other than IP do not work on them. L2 virtualization means that networks within a VLAN can be L3 protocol independent: the L3 header could be IPv4, IPv6, IPX, or something else, or could even be absent.
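The failover mechanism in example (2) relies on a single broadcast frame rather than an API call. The sketch below builds a gratuitous ARP request in the standard ARP wire format (sender and target protocol address both set to the service IP). The function names and sample addresses are hypothetical, and actually sending the frame on a network interface is deliberately left out.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to its 6-byte form."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return raw


def build_garp(service_ip: str, new_mac: str) -> bytes:
    """Build a gratuitous ARP request announcing service_ip at new_mac."""
    sha = mac_to_bytes(new_mac)            # sender hardware address
    spa = socket.inet_aton(service_ip)     # sender protocol address
    ethernet = BROADCAST_MAC + sha + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # operation: request
        sha,
        spa,
        b"\x00" * 6,  # target hardware address is unused in a GARP request
        spa,          # target IP equals sender IP, which makes it gratuitous
    )
    return ethernet + arp


if __name__ == "__main__":
    frame = build_garp("10.0.0.5", "02:00:00:00:00:01")
    print(len(frame), frame.hex())
```

Every member of the broadcast domain that receives this frame can update its IP-to-MAC binding immediately, which is how the sub-millisecond takeover described above becomes possible.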
[0130] Example of Layer 2 VLAN implementation As disclosed herein, a Layer 2 (L2) network can be created within a cloud network. This virtual L2 network includes one or more Layer 2 virtual networks, such as virtualized L2 VLANs, which are referred to here as VLANs. Each VLAN may contain multiple compute instances, each of which may be associated with at least one L2 virtual network interface (e.g., an L2 VNIC) and an L2 virtual switch. In some embodiments, each pair of L2 virtual network interface and L2 virtual switch is hosted on an NVD. The NVD may host multiple such pairs, each pair being associated with a different compute instance. The collection of L2 virtual switches represents a single emulated L2 switch of the VLAN. The L2 virtual network interfaces represent a collection of L2 ports on that emulated single L2 switch. VLANs can connect to other VLANs, Layer 3 (L3) networks, on-premises networks, and / or other networks via a VLAN Switching and Routing Service (VSRS), also referred to here as a Real Virtual Router (RVR) or L2 VSRS. An example of this architecture is described below.
[0131] Referring here to Figure 6, a schematic diagram of one embodiment of a computing network is shown. VCN602 resides in CSPI601. VCN602 includes multiple gateways that connect VCN602 to other networks. These gateways include DRG604, which can connect VCN602 to an on-premises network, such as an on-premises data center 606. The gateways may further include gateway 608, which may include an LPG for connecting VCN602 to another VCN, and / or an IGW and / or NAT gateway for connecting VCN602 to the Internet. The gateways of VCN602 may further include a service gateway 610, which can connect VCN602 to a service network 612. The service network 612 may include one or more databases and / or stores, such as an autonomous database 614 and / or an object store 616. The service network may include a conceptual network that includes an aggregation of IP ranges, which may be, for example, a public IP range. In some embodiments, these IP ranges may cover some or all of the public services provided by the CSPI601 provider. These services can be accessed, for example, via an internet gateway or a NAT gateway. In some embodiments, the service network provides a way for services within the service network to be accessed from the local area through a dedicated gateway (service gateway) for that purpose. In some embodiments, the backend of these services can be implemented, for example, in their own private network. In some embodiments, the service network 612 may include further additional databases.
[0132] VCN602 can contain multiple virtual networks. Each of these networks can contain one or more compute instances, and one or more compute instances can communicate within their respective networks, between networks, or outside of VCN602. One of the virtual networks in VCN602 is L3 subnet 620. L3 subnet 620 is a unit of configuration or a subdivision created within VCN602. Subnet 620 can contain a virtual Layer 3 network in the virtualized cloud environment of VCN602, and VCN602 is hosted on the underlying physical network of CSPI601. Figure 6 shows a single subnet 620, but VCN602 can have one or more subnets. Each subnet within VCN602 can be associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0 / 24 and 10.0.1.0 / 24) that does not overlap with other subnets within that VCN and represents a subset of the address space within the VCN's address space. In some embodiments, this IP address space can be isolated from the address space associated with the CSPI601.
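The two constraints stated above for overlay ranges (each range lies inside the VCN's address space, and no two ranges overlap) can be checked with Python's standard ipaddress module. This is a minimal sketch: the function validate_subnets is hypothetical and the VCN range 10.0.0.0/16 is an assumed value, since the text gives only the two /24 examples.

```python
import ipaddress
from itertools import combinations
from typing import Iterable


def validate_subnets(vcn_cidr: str, subnet_cidrs: Iterable[str]) -> None:
    """Raise ValueError if a range leaves the VCN or two ranges overlap."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if not subnet.subnet_of(vcn):
            raise ValueError(f"{subnet} is not inside the VCN range {vcn}")
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            raise ValueError(f"{first} overlaps {second}")


if __name__ == "__main__":
    # The two /24 ranges come from the text; the /16 VCN range is assumed.
    validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"])
    print("ranges are valid")
    try:
        validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.0.128/25"])
    except ValueError as err:
        print("rejected:", err)
```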
[0133] Subnet 620 contains one or more compute instances, specifically a first compute instance 622-A and a second compute instance 622-B. Compute instances 622-A and 622-B can communicate with each other within subnet 620, or with other instances, devices, and / or networks outside subnet 620. Communication outside subnet 620 is enabled by a virtual router (VR) 624. VR624 enables communication between subnet 620 and other networks in VCN602. For subnet 620, VR624 represents a logical gateway that enables subnet 620 (e.g., compute instances 622-A and 622-B) to communicate with endpoints on other networks within VCN602, and with other endpoints outside VCN602.
[0134] VCN602 may further include additional networks, specifically one or more L2 VLANs (referred to here as VLANs), which are examples of virtual L2 networks. Each of these VLANs may include a virtual Layer 2 network, localized to the cloud environment of VCN602 and / or hosted by the underlying physical network of CSPI601. In the embodiment of Figure 6, VCN602 includes VLAN A630 and VLAN B640. Each VLAN 630, 640 within VCN602 may be associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0 / 24 and 10.0.1.0 / 24) that do not overlap with other networks within that VCN, such as other subnets or VLANs within that VCN, and represent a subset of the address space within the VCN's address space. In some embodiments, this IP address space of the VLANs may be isolated from the address space associated with CSPI601. Each of VLANs 630 and 640 may contain one or more compute instances. Specifically, VLAN A630 may contain, for example, a first compute instance 632-A and a second compute instance 632-B. In some embodiments, VLAN A630 may contain additional compute instances. VLAN B640 may contain, for example, a first compute instance 642-A and a second compute instance 642-B. Each of compute instances 632-A, 632-B, 642-A, and 642-B may have an IP address and a MAC address. These addresses may be assigned or generated in any desired manner. In some embodiments, these addresses may be within the VLAN's CIDR, and in some embodiments, these addresses may be arbitrary addresses. 
In embodiments where a compute instance of a VLAN communicates with an endpoint outside the VLAN, one or both of these addresses may be from the VLAN CIDR; however, if all communication is within the VLAN, these addresses are not limited to addresses within the VLAN CIDR. In contrast to networks where addresses are assigned by the control plane, the IP and / or MAC addresses of compute instances within a VLAN may be assigned by the users / customers of that VLAN, and these IP and / or MAC addresses may then be discovered and / or learned by compute instances within the VLAN according to the learning process discussed below.
[0135] Each VLAN can include a VLAN Switching and Routing Service (VSRS); specifically, VLAN A630 includes VSRS A634, and VLAN B640 includes VSRS B644. Each VSRS634, 644 participates in Layer 2 switching and local learning within the VLAN, and also performs all necessary Layer 3 network functions, including ARP, NDP, and routing. VSRS performs ARP (which is a Layer 2 protocol) because it must map IP to MAC addresses.
[0136] In these cloud-based VLANs, each virtual interface or virtual gateway can be associated with one or more media access control (MAC) addresses, which may be virtual MAC addresses. Within the VLAN, one or more compute instances 632-A, 632-B, 642-A, 642-B, and / or one or more service instances, which may be bare metal, VMs, or containers, can communicate directly with each other via a virtual switch. External communication with other VLANs or L3 networks is enabled via VSRS634,644. VSRS634,644 is a distributed service that provides Layer 3 functionality, such as IP routing, to the VLAN network. In some embodiments, VSRS634,644 is a horizontally scalable, highly available routing service located at the intersection of IP and L2 networks, and capable of participating in IP routing and L2 learning within a cloud-based L2 domain.
[0137] VSRS634,644 can be distributed across multiple nodes within the infrastructure, and the VSRS634,644 functionality can be scalable, specifically horizontally scalable. In some embodiments, each node implementing the VSRS634,644 functionality shares and replicates router and / or switch functionality with one another. Furthermore, these nodes can present themselves as a single VSRS634,644 to all instances within VLAN630,640. VSRS634,644 can be implemented on any virtualization device within CSPI601, specifically within a virtual network. Therefore, in some embodiments, VSRS634,644 can be implemented on any virtual network virtualization device, including NICs, SmartNICs, switches, Smart switches, or general-purpose computing hosts.
[0138] VSRS634,644 can be a service residing on one or more hardware nodes, such as one or more x86 servers or one or more networking devices, specifically one or more SmartNICs, that support a cloud network. In some embodiments, VSRS634,644 can be implemented on a server fleet. Thus, VSRS634,644 can be a service distributed across a fleet of nodes, which may be a centrally managed fleet or distributed to the edge, that participates in and shares L2 and L3 learning along with evaluating routing and security policies. In some embodiments, each VSRS instance can update other VSRS instances with new mapping information learned by the VSRS instance. For example, if a VSRS instance learns IP, interface, and / or MAC mappings for one or more CIs in its VLAN, the VSRS instance can provide its updated information to other VSRS instances in the VCN. Through this cross-update, a VSRS instance associated with a first VLAN can know the mappings, including IP, interface, and / or MAC mappings, for CIs in other VLANs, and in some embodiments, for CIs in other VLANs within VCN602. When VSRS resides on a server fleet and / or is distributed across a fleet of nodes, these updates can be greatly accelerated.
[0139] In some embodiments, VSRS634, 644 may also host one or more higher-level services necessary for networking, including but not limited to DHCP relay; DHCP (hosting); DHCPv6; the IPv6 Neighbor Discovery Protocol (NDP); DNS; hosting DNSv6; SLAAC for IPv6; NTP; metadata services; and block store mount points. In some embodiments, VSRS may support one or more network address translation (NAT) functions for translating between multiple network address spaces. In some embodiments, VSRS may incorporate anti-spoofing, anti-MAC spoofing, ARP cache poisoning countermeasures for IPv4, IPv6 router advertisement (RA) guard, DHCP guard, packet filtering with access control lists (ACLs), and / or reverse path forwarding checks. VSRS may implement functions including, for example, ARP, GARP, packet filtering (ACLs), DHCP relay, and / or IP routing protocols. VSRS634 and 644 can, for example, learn MAC addresses, invalidate expired MAC addresses, handle MAC address migration, look up MAC address information, handle MAC information flooding, handle storm control, prevent loops, perform Layer 2 multicast via protocols such as IGMP in the cloud, collect statistics including logs, use SNMP for statistics and monitoring, and / or collect and use statistics about broadcast, total traffic, bits, spanning tree packets, etc.
[0140] Within a virtual network, VSRS634,644 may appear as different instantiations. In some embodiments, each of these instantiations of VSRS may be associated with VLAN630,640, and in some embodiments, each VLAN630,640 may have an instantiation of VSRS634,644. In some embodiments, each instantiation of VSRS634,644 may have one or more unique tables corresponding to the VLAN630,640 to which the VSRS634,644 instantiation is associated. Each instantiation of VSRS634,644 may generate and / or curate unique tables associated with that instantiation of VSRS634,644. Therefore, while a single service may provide VSRS634,644 functionality for one or more cloud networks, individual instances of VSRS634,644 within a cloud network may have their own Layer 2 and Layer 3 forwarding tables, while multiple such customer networks may have overlapping Layer 2 and Layer 3 forwarding tables.
[0141] In some embodiments, VSRS634,644 may support conflicting VLAN and IP spaces across multiple tenants. This may include having multiple tenants on the same VSRS634,644. In some embodiments, some or all of these tenants may choose to use some or all of the same IP address space, the same MAC space, and the same VLAN space. This can provide users with extreme flexibility when choosing addresses. In some embodiments, this multi-tenancy is supported by providing each tenant with a separate virtual network, which is a private network within the cloud network. Each virtual network is given a unique identifier. Similarly, in some embodiments, each host may have a unique identifier, and / or each virtual interface or virtual gateway may have a unique identifier. In some embodiments, these unique identifiers, specifically the unique identifiers of the virtual network for each tenant, may be encoded in each communication. By providing each virtual network with a unique identifier and including it in the communication, a single instantiation of VSRS634,644 can accommodate multiple tenants with overlapping addresses and / or namespaces.
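The role of the per-tenant virtual network identifier can be sketched as the outer key of a forwarding table: the same MAC address may then appear once per tenant without collision. The class MultiTenantForwarder, the identifier strings, and the interface names below are hypothetical illustrations and are not taken from this disclosure.

```python
from typing import Dict, Optional


class MultiTenantForwarder:
    """One service instance holding a separate MAC table per virtual network."""

    def __init__(self) -> None:
        # virtual network identifier -> (MAC address -> virtual interface)
        self._tables: Dict[str, Dict[str, str]] = {}

    def learn(self, vn_id: str, mac: str, interface: str) -> None:
        self._tables.setdefault(vn_id, {})[mac] = interface

    def lookup(self, vn_id: str, mac: str) -> Optional[str]:
        """Resolve a MAC only within the virtual network named in the packet."""
        return self._tables.get(vn_id, {}).get(mac)


if __name__ == "__main__":
    forwarder = MultiTenantForwarder()
    # Two tenants deliberately reuse the same MAC address.
    forwarder.learn("vn-tenant-1", "02:00:00:00:00:01", "vnic-a")
    forwarder.learn("vn-tenant-2", "02:00:00:00:00:01", "vnic-z")
    print(forwarder.lookup("vn-tenant-1", "02:00:00:00:00:01"))
    print(forwarder.lookup("vn-tenant-2", "02:00:00:00:00:01"))
```

Because every lookup carries the identifier encoded in the communication, overlapping tenant address spaces never meet in the same table.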
[0142] VSRS634 and 644 can perform switching and / or routing functions to facilitate and / or enable the creation of and / or communication with L2 networks within VLANs 630 and 640. These VLANs 630 and 640 may be found within a cloud computing environment, more specifically within a virtual network in that cloud computing environment.
[0143] For example, each of VLANs 630 and 640 includes multiple compute instances 632-A, 632-B, 642-A, and 642-B. VSRS634 and 644 enable communication between compute instances in one VLAN 630 or 640 and compute instances in another VLAN 630 or 640, or subnet 620. In some embodiments, VSRS634 and 644 enable communication between compute instances within one VLAN 630 or 640 and another VCN, another network outside the VCN including the internet, an on-premises data center, etc. In such embodiments, for example, compute instance 632-A can send communications to an endpoint outside the VLAN, in this example VLAN A630. A compute instance (632-A) can send communications to VSRS A634, which can then direct those communications to routers 624, 644, or gateways 604, 608, 610 that are communicatively coupled to the desired endpoint. Routers 624, 644, or gateways 604, 608, 610 that are communicatively coupled to the desired endpoint can receive communications from the compute instance (632-A) and direct those communications to the desired endpoint.
[0144] Referring here to Figure 7, a schematic diagram of the logical and hardware aspects of VLAN 700 is shown. As can be understood, VLAN 700 includes multiple endpoints, specifically multiple compute instances and VSRSs. Multiple compute instances (CIs) are instantiated on one or more host machines. In some embodiments, this can be a one-to-one relationship, where each CI is instantiated on its own host machine, and / or in some embodiments, this can be a many-to-one relationship, where multiple CIs are instantiated on a single common host machine. In various embodiments, CIs can be Layer 2 CIs by being configured to communicate with each other using an L2 protocol. Figure 7 illustrates a scenario where several CIs are instantiated on their own host machines and several CIs share a common host machine. As shown in Figure 7, instance 1 (CI1) 704-A is instantiated on host machine 1 702-A, instance 2 (CI2) 704-B is instantiated on host machine 2 702-B, and instance 3 (CI3) 704-C and instance 4 (CI4) 704-D are instantiated on a common host machine 702-C.
[0145] Each of the CI704-A, 704-B, 704-C, and 704-D is coupled to communicate with other CI704-A, 704-B, 704-C, 704-D, and VSRS714 within VLAN 700. Specifically, each of the CI704-A, 704-B, 704-C, and 704-D is connected to other CI704-A, 704-B, 704-C, 704-D, and VSRS714 within VLAN 700 via L2 VNICs and switches. Each CI704-A, 704-B, 704-C, and 704-D is associated with its own L2 VNIC and switch. The switch may be a local, L2 virtual switch that is specifically associated with and deployed for the L2 VNIC. Specifically, CI1 704-A is associated with L2 VNIC1 708-A and switch 1 710-A; CI2 704-B is associated with L2 VNIC2 708-B and switch 2 710-B; CI3 704-C is associated with L2 VNIC3 708-C and switch 3 710-C; and CI4 704-D is associated with L2 VNIC4 708-D and switch 4 710-D.
[0146] In some embodiments, each L2 VNIC 708 and its associated switch 710 may be instantiated on an NVD 706. This instantiation can be one-to-one, with a single L2 VNIC 708 and its associated switch 710 being instantiated on a specific NVD 706, or it can be many-to-one, with multiple L2 VNICs 708 and their associated switches 710 being instantiated on a single common NVD 706. Specifically, L2 VNIC1 708-A and switch 1 710-A are instantiated on NVD1 706-A, L2 VNIC2 708-B and switch 2 710-B are instantiated on NVD2 706-B, and both L2 VNIC3 708-C and switch 3 710-C, as well as L2 VNIC4 708-D and switch 4 710-D, are instantiated on a common NVD, i.e., NVD 706-C.
[0147] In some embodiments, the VSRS714 can support conflicting VLANs and IP spaces across multiple tenants. This may include having multiple tenants on the same VSRS714. In some embodiments, some or all of these tenants may choose to use some or all of the same IP address space, the same MAC space, and the same VLAN space. This may provide users with extreme flexibility when choosing addresses. In some embodiments, this multi-tenancy is supported by providing each tenant with a separate virtual network, which is a private network within the cloud network. Each virtual network (e.g., each VLAN or VCN) is given a unique identifier, such as a VCN identifier which may be a VLAN identifier. This unique identifier may be selected, for example, by the control plane, specifically by the CSPI control plane. In some embodiments, this unique VLAN identifier may include one or more bits which may be included and / or used in packet encapsulation. Similarly, in some embodiments, each host may have a unique identifier, and / or each virtual interface or virtual gateway may have a unique identifier. In some embodiments, these unique identifiers, specifically the unique identifiers of the virtual network for a tenant, can be encoded in each communication. By providing a unique identifier for each virtual network and including it in the communication, a single instantiation of VSRS can accommodate multiple tenants with overlapping addresses and / or namespaces. In some embodiments, VSRS714 can determine which tenant a packet belongs to based on the VCN identifier and / or VLAN identifier associated with the communication, specifically in the VCN header of the communication.
In the embodiments disclosed herein, communications entering and leaving a VLAN may have a VCN header that can include a VLAN identifier. Based on the VCN header containing the VLAN identifier, the VSRS714 can determine the tenancy; in other words, the receiving VSRS can determine which VLAN and / or tenant to send communications to. In addition, each compute instance belonging to a VLAN (e.g., an L2 compute instance) is given a unique interface identifier that identifies the L2 VNIC associated with that compute instance. The interface identifier may be included in traffic from and to the compute instance (e.g., by being included in the frame header) and can be used by the NVD to identify the L2 VNIC associated with the compute instance. In other words, the interface identifier can uniquely identify a compute instance and / or its associated L2 VNIC.
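As an illustration of how a single VSRS instantiation could keep overlapping tenant address spaces apart, the following minimal Python sketch keys every forwarding entry by the virtual-network identifier carried with the communication. The class name `TenantAwareTables`, the identifier strings, and the table layout are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class TenantAwareTables:
    """Hypothetical per-tenant forwarding state held by one VSRS instantiation."""
    # (network_id, mac) -> interface identifier. network_id stands in for the
    # VCN / VLAN identifier carried in the VCN header (an assumption of this sketch).
    entries: dict = field(default_factory=dict)

    def learn(self, network_id: str, mac: str, interface_id: str) -> None:
        self.entries[(network_id, mac)] = interface_id

    def lookup(self, network_id: str, mac: str):
        # The same MAC may exist in several tenants; the network identifier
        # selects which tenant's entry applies.
        return self.entries.get((network_id, mac))


tables = TenantAwareTables()
tables.learn("vcn-tenant-a/vlan-10", "02:00:00:00:00:01", "vnic-a1")
tables.learn("vcn-tenant-b/vlan-10", "02:00:00:00:00:01", "vnic-b7")
```

Because the network identifier is part of the key, two tenants that choose the same MAC address and the same VLAN number resolve to different interfaces.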
[0148] As shown in Figure 7, switches 710-A, 710-B, 710-C, and 710-D together can form an L2 distributed switch 712, also referred to here as the distributed switch 712. From the customer's perspective, each switch 710-A, 710-B, 710-C, and 710-D within the L2 distributed switch 712 appears as a single switch connecting to all CIs within the VLAN. However, the L2 distributed switch 712, which emulates the user experience of a single switch, is infinitely scalable and includes a set of local switches (for example, switches 710-A, 710-B, 710-C, and 710-D in the example in Figure 7). As shown in Figure 7, each CI runs on a host machine connected to the NVD. For each CI on a host connected to the NVD, the NVD hosts a Layer 2 VNIC and a local switch associated with the compute instance (for example, an L2 virtual switch that is local to the NVD, associated with the Layer 2 VNIC, and is a member or component of the L2 distributed switch 712). The Layer 2 VNIC represents the port of the compute instance on the Layer 2 VLAN. The local switch connects the VNIC to other VNICs (for example, other ports) associated with other compute instances on the Layer 2 VLAN.
[0149] Each of the CI704-A, 704-B, 704-C, and 704-D can communicate with other CI704-A, 704-B, 704-C, and 704-D within VLAN 700, or with a VSRS714. One of the CI704-A, 704-B, 704-C, and 704-D transmits a frame to another CI704-A, 704-B, 704-C, 704-D, or to VSRS714, by sending the frame to the MAC address and interface identifier of the receiving CI or VSRS714. The MAC address and interface identifier may be included in the frame header. Here, as described above, the interface identifier may indicate the L2 VNIC of the receiving CI or VSRS714.
[0150] In one embodiment, CI1 704-A may be a source CI, L2 VNIC1 708-A may be a source L2 VNIC, and switch 1 710-A may be a source L2 virtual switch. In this embodiment, CI3 704-C may be a destination CI, and L2 VNIC3 708-C may be a destination L2 VNIC. The source CI can send a frame along with the source MAC address and destination MAC address. This frame may be intercepted by NVD1 706-A, which instantiates the source VNIC and source switch.
[0151] L2 VNICs 708-A, 708-B, 708-C, and 708-D can each learn to map MAC addresses to L2 VNIC interface identifiers for VLAN 700. This mapping can be learned based on frames and / or communications received from within VLAN 700. Based on this previously determined mapping, the source VNIC can determine the interface identifier of the destination interface associated with the destination CI within the VLAN and encapsulate the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation which includes an L2 (Ethernet®) header of the frame being encapsulated. The encapsulated frame can identify the destination MAC, destination interface identifier, source MAC, and source interface identifier.
[0152] The source VNIC can pass the encapsulated frame to the source switch, which can then direct the frame to the destination VNIC. Upon receiving the frame, the destination VNIC can deencapsulate it and then provide the frame to the destination CI.
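The encapsulation and decapsulation steps described above can be sketched as follows. This is a simplified illustration only: the classes `Frame` and `EncapsulatedFrame`, the function names, and the identifier strings are hypothetical, and the sketch does not implement the Geneve wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: bytes


@dataclass(frozen=True)
class EncapsulatedFrame:
    # Fields mirror what the text says the encapsulated frame can identify;
    # the MAC addresses travel inside the preserved inner L2 header.
    src_interface_id: str
    dst_interface_id: str
    inner: Frame


def encapsulate(frame, src_interface_id, mac_to_interface):
    """Source-VNIC step: resolve the destination interface from learned mappings."""
    dst_interface_id = mac_to_interface[frame.dst_mac]  # KeyError if not yet learned
    return EncapsulatedFrame(src_interface_id, dst_interface_id, frame)


def decapsulate(encapsulated):
    """Destination-VNIC step: recover the original frame for the destination CI."""
    return encapsulated.inner


learned = {"M.3": "vnic-3"}  # previously learned MAC -> interface identifier
frame = Frame(src_mac="M.1", dst_mac="M.3", payload=b"hello")
wrapped = encapsulate(frame, "vnic-1", learned)
```

In this sketch the destination CI receives exactly the frame the source CI sent, since `decapsulate(wrapped)` returns the unmodified inner frame.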
[0153] Referring now to Figure 8, a logical schematic diagram of multiple connected L2 VLANs 800 is shown. In the particular embodiment shown in Figure 8, both VLANs reside in the same VCN. As can be understood, the multiple connected L2 VLANs 800 can include a first VLAN, VLAN A802-A, and a second VLAN, VLAN B802-B. Each of these VLANs 802-A and 802-B can contain one or more CIs, each of which can have an associated L2 VNIC and an associated L2 virtual switch. Furthermore, each of these VLANs 802-A and 802-B can contain a VSRS.
[0154] Specifically, VLAN A802-A can include instance 1 804-A connected to L2 VNIC1 806-A and switch 1 808-A, instance 2 804-B connected to L2 VNIC2 806-B and switch 2 808-B, and instance 3 804-C connected to L2 VNIC3 806-C and switch 3 808-C. VLAN B 802-B can include instance 4 804-D connected to L2 VNIC4 806-D and switch 4 808-D, instance 5 804-E connected to L2 VNIC5 806-E and switch 5 808-E, and instance 6 804-F connected to L2 VNIC6 806-F and switch 6 808-F. VLAN A802-A may further include VSRS A810-A, and VLAN B802-B may include VSRS B810-B. Each of CI804-A, 804-B, and 804-C of VLAN A802-A can be communicatively coupled to VSRS A810-A, and each of CI804-D, 804-E, and 804-F of VLAN B802-B can be communicatively coupled to VSRS B810-B.
[0155] VLAN A802-A can be communicably coupled to VLAN B802-B via their respective VSRS810-A and 810-B. Each VSRS can similarly be coupled to gateway 812, which can provide CI804-A, 804-B, 804-C, 804-D, 804-E, and 804-F within each VLAN802-A and 802-B with access to other networks outside the VCN where VLAN802-A and 802-B are located. In some embodiments, these networks may include, for example, one or more on-premises networks, another VCN, a service network, or a public network such as the Internet.
[0156] Each of the CIs 804-A, 804-B, and 804-C in VLAN A802-A can communicate with CIs 804-D, 804-E, and 804-F in VLAN B802-B via the VSRS 810-A and 810-B of each VLAN 802-A and 802-B. For example, one of the CIs 804-A, 804-B, 804-C, 804-D, 804-E, and 804-F in one of VLANs 802-A and 802-B can send a frame to CIs 804-A, 804-B, 804-C, 804-D, 804-E, and 804-F in the other VLAN 802-A and 802-B. This frame can leave the source VLAN via the VSRS of the source VLAN, enter the destination VLAN, and be routed to the destination CI via the destination VSRS.
[0157] In one embodiment, CI1 804-A may be a source CI, L2 VNIC1 806-A may be a source VNIC, and switch 1 808-A may be a source switch. In this embodiment, CI5 804-E may be a destination CI, and L2 VNIC5 806-E may be a destination VNIC. VSRS A810-A may be a source VSRS identified as an SVSRS, and VSRS B810-B may be a destination VSRS identified as a DVSRS.
[0158] The source CI can send a frame along with its MAC address. This frame may be intercepted by the NVD that instantiates the source VNIC and source switch. The source VNIC encapsulates the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation. The encapsulated frame can identify the destination address of the destination CI. In some embodiments, this destination address may also include the destination address of the destination VSRS. The destination address of the destination CI may include the destination IP address, the destination MAC address of the destination CI, and / or the destination interface identifier of the destination VNIC associated with the destination CI. The destination address of the destination VSRS may include the IP address of the destination VSRS, the interface identifier of the destination VNIC associated with the destination VSRS, and / or the MAC address of the destination VSRS.
[0159] The source VSRS can receive the frame from the source switch, look up the VNIC mapping from the frame's destination address (which may be a destination IP address), and forward the packet to the destination VSRS. The destination VSRS can receive the frame. Based on the destination address contained in the frame, the destination VSRS can forward the frame to the destination VNIC. The destination VNIC can receive the frame, decapsulate it, and then deliver the frame to the destination CI.
[0160] Referring now to Figure 9, a logical schematic diagram of multiple connected L2 VLANs and subnets 900 is shown. In the particular embodiment shown in Figure 9, both the VLANs and subnets reside in the same VCN. This is shown because the virtual routers and VSRSs of both the VLANs and subnets are directly connected rather than connected via a gateway.
[0161] As can be understood, the multiple connected L2 VLANs and subnets 900 can include a first VLAN, VLAN A902-A, a second VLAN, VLAN B902-B, and subnet 930. Each of these VLANs 902-A and 902-B can contain one or more CIs, each of which can have an associated L2 VNIC and an associated L2 switch. Furthermore, each of these VLANs 902-A and 902-B can contain a VSRS. Similarly, subnet 930, which can be an L3 subnet, can contain one or more CIs, each of which can have an associated L3 VNIC, and L3 subnet 930 can contain a virtual router 916.
[0162] Specifically, VLAN A902-A can include instance 1 904-A connected to L2 VNIC1 906-A and switch 1 908-A, instance 2 904-B connected to L2 VNIC2 906-B and switch 2 908-B, and instance 3 904-C connected to L2 VNIC3 906-C and switch 3 908-C. VLAN B 902-B can include instance 4 904-D connected to L2 VNIC4 906-D and switch 4 908-D, instance 5 904-E connected to L2 VNIC5 906-E and switch 5 908-E, and instance 6 904-F connected to L2 VNIC6 906-F and switch 6 908-F. VLAN A902-A may further include VSRS A910-A, and VLAN B902-B may include VSRS B910-B. Each of CI904-A, 904-B, and 904-C of VLAN A902-A can be communicatively coupled to VSRS A910-A, and each of CI904-D, 904-E, and 904-F of VLAN B902-B can be communicatively coupled to VSRS B910-B. L3 subnet 930 may include one or more CIs, specifically instance 7 904-G which is communicatively coupled to L3 VNIC 7 906-G. L3 subnet 930 may include virtual router 916.
[0163] VLAN A902-A can be communicably coupled to VLAN B902-B via their respective VSRS instances 910-A and 910-B. L3 subnet 930 can be communicably coupled to VLAN A902-A and VLAN B902-B via virtual router 916. Virtual router 916 and each of VSRS instances 910-A and 910-B can similarly be coupled to gateway 912, which can give CIs 904-A, 904-B, 904-C, 904-D, 904-E, 904-F, and 904-G in each VLAN 902-A, 902-B and subnet 930 access to other networks outside the VCN where VLAN 902-A, 902-B and subnet 930 are located. In some embodiments, these networks may include, for example, one or more on-premises networks, another VCN, a service network, a public network such as the Internet, etc.
[0164] Each VSRS instance 910-A, 910-B can provide an outgoing path for frames leaving their associated VLANs 902-A, 902-B, and an inbound path for frames entering their associated VLANs 902-A, 902-B. From VSRS instances 910-A, 910-B of VLANs 902-A, 902-B, frames can be sent to any desired endpoint, including L2 endpoints such as L2 CIs in another VLAN on the same VCN or a different VCN or network, and / or L3 endpoints such as L3 CIs in a subnet on the same VCN or a different VCN or network.
[0165] In one embodiment, CI1 904-A may be a source CI, L2 VNIC1 906-A may be a source VNIC, and switch 1 908-A may be a source switch. In this embodiment, CI7 904-G may be a destination CI, and L3 VNIC7 906-G may be a destination VNIC. VSRS A910-A may be a source VSRS identified as an SVSRS, and virtual router (VR) 916 may be a destination VR.
[0166] The source CI can send a frame along with its MAC address. This frame may be intercepted by the NVD that instantiates the source VNIC and source switch. The source VNIC encapsulates the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation. The encapsulated frame can identify the destination address of the destination CI. In some embodiments, this destination address may also include the destination address of the VSRS of the source CI's VLAN. The destination address of the destination CI may include the destination IP address, the destination MAC address of the destination CI, and / or the destination interface identifier of the destination VNIC of the destination CI.
[0167] The source VSRS can receive the frame from the source switch, look up the VNIC mapping from the frame's destination address (which may be a destination IP address), and forward the frame to the destination VR. The destination VR can receive the frame. Based on the destination address contained in the frame, the destination VR can forward the frame to the destination VNIC. The destination VNIC can receive the frame, decapsulate it, and then deliver the frame to the destination CI.
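The three forwarding cases described so far (within one VLAN, between two VLANs, and from a VLAN to an L3 subnet) differ only in the intermediate hops. The following sketch summarizes them under that reading; the function name and the `destination_kind` labels are invented for illustration.

```python
def forwarding_path(destination_kind):
    """Return the ordered hops for a frame, by kind of destination.

    destination_kind is a label invented for this sketch:
    "same_vlan", "other_vlan", or "l3_subnet".
    """
    path = ["source CI", "source VNIC (encapsulate)", "source switch"]
    if destination_kind == "other_vlan":
        # Frame leaves via the source VSRS and enters via the destination VSRS.
        path += ["source VSRS", "destination VSRS"]
    elif destination_kind == "l3_subnet":
        # Frame leaves via the source VSRS and is handed to the virtual router.
        path += ["source VSRS", "destination VR"]
    elif destination_kind != "same_vlan":
        raise ValueError(f"unknown destination kind: {destination_kind}")
    path += ["destination VNIC (decapsulate)", "destination CI"]
    return path


to_subnet = forwarding_path("l3_subnet")
```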
Learning by L2 VNICs and / or L2 virtual switches within a virtual L2 network
[0168] Referring now to Figure 10, a schematic diagram of one embodiment of intra-VLAN communication and learning within VLAN 1000 is shown. The learning here is specific to how the L2 VNIC, the VSRS of the source CI's VLAN, and / or the L2 virtual switch learn the association between MAC addresses and the L2 VNIC / VSRS VNIC (more specifically, between MAC addresses associated with an L2 compute instance or VSRS and the interface identifiers associated with the L2 VNIC of that L2 compute instance or with the VSRS VNIC). Generally, the learning is based on incoming traffic. This interface-to-MAC address learning is different from the learning process (e.g., the ARP process) that an L2 compute instance may implement to learn a destination MAC address. The two learning processes (e.g., L2 VNIC / L2 virtual switch and L2 compute instance) are shown as being implemented jointly in Figure 12.
[0169] As can be understood, VLAN1000 includes compute instance 1 1000-A, which is communicatively coupled to NVD1 1001-A, which instantiates L2 VNIC1 1002-A and L2 switch 1 1004-A. VLAN1000 also includes compute instance 2 1000-B, which is communicatively coupled to NVD2 1001-B, which instantiates L2 VNIC2 1002-B and L2 switch 2 1004-B. VLAN1000 also includes VSRS1015, which runs on a server fleet and includes VSRS VNIC1002-C and VSRS switch 1004-C. All switches 1004-A, 1004-B, and 1004-C together form L2 distributed switch 1050. VSRS1015 is communicatively coupled to endpoint 1008, which may include a gateway, specifically an L2 / L3 router in the form of another VSRS, or an L3 router in the form of a virtual router.
[0170] The control plane 1010 of the VCN hosting VLAN 1000 maintains information identifying each L2 VNIC on VLAN 1000 and the network configuration of each L2 VNIC. For example, this information may include, for each L2 VNIC, the interface identifier associated with the L2 VNIC and / or the physical IP address of the NVD hosting the L2 VNIC. The control plane 1010 updates the interfaces in VLAN 1000 with this information (e.g., periodically or on demand). Thus, each L2 VNIC 1002-A, 1002-B, 1002-C in VLAN 1000 receives from the control plane 1010 information identifying the interfaces in the VLAN and populates this information into a table. The table populated by an L2 VNIC can be stored locally on the NVD hosting that L2 VNIC. If VNIC1002-A, 1002-B, and 1002-C already contain a current table, VNIC1002-A, 1002-B, and 1002-C can determine any discrepancies between their current table and the information / table received from control plane 1010. In some embodiments, VNIC1002-A, 1002-B, and 1002-C can update their table to match the information received from control plane 1010.
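One way a VNIC might determine discrepancies and then align its local table with the control plane's view is sketched below. The table is modeled as a plain dictionary from interface identifier to the physical IP address of the hosting NVD; the function name `reconcile` and all values are illustrative assumptions.

```python
def reconcile(local, from_control_plane):
    """Report discrepancies, then bring `local` in line with the control plane."""
    added = {k: v for k, v in from_control_plane.items() if k not in local}
    removed = {k: v for k, v in local.items() if k not in from_control_plane}
    changed = {k: (local[k], v) for k, v in from_control_plane.items()
               if k in local and local[k] != v}
    # Update the local table in place to match the received information.
    local.clear()
    local.update(from_control_plane)
    return {"added": added, "removed": removed, "changed": changed}


# interface identifier -> physical IP of the hosting NVD (illustrative values)
local_table = {"vnic-1": "10.0.0.1", "vnic-2": "10.0.0.2"}
pushed = {"vnic-1": "10.0.0.1", "vnic-2": "10.0.0.9", "vnic-3": "10.0.0.3"}
diff = reconcile(local_table, pushed)
```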
[0171] As shown in Figure 10, frames are transmitted via L2 switches 1004-A, 1004-B, and 1004-C and received by receiving VNICs 1002-A, 1002-B, and 1002-C. When a frame is received by one of VNICs 1002-A, 1002-B, and 1002-C, that VNIC learns the mapping between the source interface (source VNIC) and the source MAC address of that frame. Based on the table of information received from the control plane 1010, the VNIC can map the source MAC address (from the received frame) to the interface identifier of the source VNIC and the IP address of that VNIC and / or the IP address of the NVD hosting that VNIC (if the interface identifier and IP address are available from the table). Thus, L2 VNICs 1002-A, 1002-B, and 1002-C learn the mapping of interface identifiers to MAC addresses based on the received communications and / or frames. Each VNIC 1002-A, 1002-B, and 1002-C may update its L2 forwarding (FWD) table 1006-A, 1006-B, and 1006-C with this learned mapping information. In some embodiments, the L2 forwarding table includes a MAC address and associates it with at least one of an interface identifier or a physical IP address. In such embodiments, the MAC address may be an address assigned to an L2 compute instance and may correspond to a port emulated by the L2 VNIC associated with the L2 compute instance. The interface identifier can uniquely identify the L2 VNIC and / or L2 compute instance. The virtual IP address may be that of the L2 VNIC, and the physical IP address may be that of the NVD hosting the L2 VNIC. The L2 forwarding table updated by the L2 VNIC may be stored locally on the NVD hosting the L2 VNIC and may be used by the L2 virtual switch associated with the L2 VNIC to direct frames. In some embodiments, VNICs within a common VLAN can share all or part of their mapping tables with one another.
[0172] In light of the network architecture described above, the traffic flow is described below. For clarity, the traffic flow is described in relation to compute instance 2 1000-B, L2 VNIC2 1002-B, L2 switch 2 1004-B, and NVD2 1001-B. This description applies equivalently to the traffic flow between other compute instances.
[0173] As described above, VLANs are implemented within a VCN as overlay L2 networks on top of L3 physical networks. An L2 compute instance of a VLAN can send or receive L2 frames that include an overlay MAC address (also called a virtual MAC address) as the source MAC address and destination MAC address. An L2 frame can also encapsulate a packet that includes an overlay IP address (also called a virtual IP address) as the source IP address and destination IP address. In some embodiments, the overlay IP address of a compute instance may belong to the CIDR range of the VLAN. Other overlay IP addresses may belong to the CIDR range (in which case the L2 frame flows within the VLAN) or fall outside the CIDR range (in which case the L2 frame is destined for or received from another network). An L2 frame may also include a VLAN tag, which can be used to uniquely identify the VLAN and to distinguish between multiple L2 VNICs on the same NVD. L2 frames can be received by an NVD in encapsulated packets via a tunnel from the host machine of a compute instance, from another NVD, or from a server fleet hosting a VSRS. In these different cases, the encapsulated packet may be an L3 packet transmitted over the physical network, with source and destination IP addresses being physical IP addresses. Different types of encapsulation are possible, including Geneve encapsulation. An NVD can decapsulate the received packet to extract the L2 frame. Similarly, to transmit an L2 frame, an NVD can encapsulate it in an L3 packet and transmit it over the physical network.
[0174] For outgoing traffic within the VLAN from instance 2 1000-B, NVD2 1001-B receives a frame from the host machine of instance 2 1000-B via the Ethernet link. The frame contains an interface identifier that identifies L2 VNIC2 1002-B. The frame contains the overlay MAC address of instance 2 1000-B (e.g., M.2) as the source MAC address and the overlay MAC address of instance 1 1000-A (e.g., M.1) as the destination MAC address. Given the interface identifier, NVD2 1001-B passes the frame to L2 VNIC2 1002-B for further processing. L2 VNIC2 1002-B forwards the frame to L2 switch 2 1004-B. Based on L2 forwarding table 1006-B, L2 switch 2 1004-B determines whether the destination MAC address is known (for example, by matching it with an entry in L2 forwarding table 1006-B).
[0175] If known, L2 switch 2 1004-B determines that L2 VNIC1 1002-A is the associated tunnel endpoint and forwards the frame to L2 VNIC1 1002-A. Forwarding may involve encapsulation and decapsulation of the frame in the packet (e.g., Geneve encapsulation and decapsulation), and the packet may contain the frame, the physical IP address of NVD1 1001-A as the destination address (e.g., IP.1), and the physical IP address of NVD 2 1001-B as the source address (e.g., IP.2).
[0176] If unknown, L2 switch 2 1004-B broadcasts the frame to various VNICs in the VLAN (e.g., including L2 VNIC 1 1002-A and any other L2 VNICs in the VLAN), and the broadcasted frame is processed (e.g., encapsulated, transmitted, decapsulated) among the relevant NVDs. In some embodiments, this broadcast is performed in the physical network, or more specifically, emulated in the physical network, and the frame can be encapsulated separately to each L2 VNIC, including VSRS in the VLAN. Thus, the broadcast is emulated in the physical network via a series of replicated unicast packets. Each L2 VNIC then receives the frame and learns the association between the interface identifier of L2 VNIC 2 1002-B and the source MAC address (e.g., M.2) and the source physical IP address (e.g., IP.2).
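The known and unknown cases above can be sketched as a single lookup that returns either one tunnel endpoint or a list of replicated unicast targets. The function name `forward`, the table shapes, and the identifier strings are assumptions made for this illustration; excluding the sender's own interface from the flood is likewise an assumption.

```python
def forward(frame_dst_mac, fwd_table, vlan_members, self_interface):
    """Return the (interface_id, physical_ip) targets for one frame.

    fwd_table:    learned MAC -> (interface_id, physical_ip)
    vlan_members: control-plane view, interface_id -> physical_ip
    """
    entry = fwd_table.get(frame_dst_mac)
    if entry is not None:
        return [entry]  # known destination: a single tunnel endpoint
    # Unknown destination: emulate broadcast as one unicast copy per other
    # interface in the VLAN, including the VSRS VNIC.
    return [(iface, ip) for iface, ip in sorted(vlan_members.items())
            if iface != self_interface]


members = {"vnic-1": "IP.1", "vnic-2": "IP.2", "vsrs-vnic": "IP.3"}
fwd = {"M.1": ("vnic-1", "IP.1")}
known = forward("M.1", fwd, members, "vnic-2")
flooded = forward("M.9", fwd, members, "vnic-2")
```

Each returned target would then receive its own separately encapsulated copy of the frame over the physical network.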
[0177] For incoming traffic within a VLAN from compute instance 1 1000-A to compute instance 2 1000-B, NVD2 1001-B receives the packet from NVD1 1001-A. The packet has IP.1 as the source address and contains a frame; the frame contains M.2 as the destination MAC address and M.1 as the source MAC address. The frame also contains the interface identifier of L2 VNIC1 1002-A. Upon decapsulation, L2 VNIC2 1002-B receives the frame and learns that this interface identifier is associated with M.1 and / or IP.1, and if this information was previously unknown, stores this learned information in the L2 forwarding table 1006-B on switch 2 for subsequent outgoing traffic. Alternatively, upon decapsulation, L2 VNIC2 1002-B receives the frame and learns that this interface identifier is associated with M.1 and / or IP.1, and if this information is already known, refreshes its validity period.
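The store-or-refresh behavior described above can be illustrated with a small table whose entries expire. The class `LearningTable`, the 300-second validity period, and the explicit `now` argument (used instead of a wall clock so the example is deterministic) are assumptions of this sketch; the text does not specify a validity period.

```python
class LearningTable:
    """Hypothetical L2 forwarding table with a per-entry validity period."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # mac -> (interface_id, physical_ip, expires_at)

    def learn(self, mac, interface_id, physical_ip, now):
        """Store a new mapping, or refresh the validity period of a known one.

        Returns True if the mapping was previously unknown (or had expired).
        """
        is_new = mac not in self._entries or self._entries[mac][2] <= now
        self._entries[mac] = (interface_id, physical_ip, now + self.ttl)
        return is_new

    def lookup(self, mac, now):
        entry = self._entries.get(mac)
        if entry is None or entry[2] <= now:
            return None  # unknown or expired
        return entry[:2]


learning = LearningTable(ttl_seconds=300)
first = learning.learn("M.1", "vnic-1", "IP.1", now=0)     # stored as new
again = learning.learn("M.1", "vnic-1", "IP.1", now=200)   # refreshed until t=500
```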
[0178] For outgoing traffic sent from instance 2 1000-B in VLAN 1000 to an instance in another VLAN, a similar flow to the outgoing traffic described above may exist, except that a VSRS VNIC and VSRS switch are used. In particular, the destination MAC address is not within the L2 broadcast domain of VLAN 1000 (it is in another L2 VLAN). Therefore, the overlay destination IP address of the destination instance (e.g., IP.A) is used for this outgoing traffic. For example, L2 VNIC 2 1002-B determines that IP.A is outside the CIDR range of VLAN 1000. Therefore, L2 VNIC 2 1002-B sets the destination MAC address to the default gateway MAC address (e.g., M.DG). Based on M.DG, L2 switch 2 1004-B sends the outgoing traffic to the VSRS VNIC (e.g., via a tunnel with appropriate end-to-end encapsulation). The VSRS VNIC forwards the outgoing traffic to the VSRS switch. Next, the VSRS switch performs routing functionality, where, based on the overlay destination IP address (e.g., IP.A), the VSRS switch on VLAN 1000 sends the outgoing traffic to the VSRS switch on the other VLAN (e.g., via a virtual router between these two VLANs, with appropriate end-to-end encapsulation). The VSRS switch on the other VLAN then performs switching functionality by determining that IP.A is within the CIDR range of this VLAN, and performs an ARP cache lookup based on IP.A to determine the destination MAC address associated with IP.A. If no match is found in the ARP cache, an ARP request is sent to the different L2 VNICs on the other VLAN to determine the destination MAC address. Otherwise, the VSRS switch sends the outgoing traffic to the relevant VNIC (e.g., via a tunnel, with appropriate encapsulation).
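The CIDR-range decision described above can be sketched with Python's standard `ipaddress` module. The function name, the example addresses, and the dictionary used as an ARP cache are illustrative assumptions; `M.DG` is the placeholder label used in the text, not a real MAC address.

```python
import ipaddress

DEFAULT_GATEWAY_MAC = "M.DG"  # placeholder label from the text


def select_destination_mac(dst_ip, vlan_cidr, arp_cache):
    """Pick the destination MAC for an outgoing frame."""
    network = ipaddress.ip_network(vlan_cidr)
    if ipaddress.ip_address(dst_ip) not in network:
        # Destination outside the VLAN: hand the frame to the VSRS by
        # addressing it to the default gateway MAC.
        return DEFAULT_GATEWAY_MAC
    # Destination inside the VLAN: use the resolved MAC. None means no cache
    # entry exists yet and an ARP request would be needed.
    return arp_cache.get(dst_ip)


arp = {"10.0.1.5": "M.1"}
on_vlan = select_destination_mac("10.0.1.5", "10.0.1.0/24", arp)
off_vlan = select_destination_mac("10.0.2.7", "10.0.1.0/24", arp)
unresolved = select_destination_mac("10.0.1.9", "10.0.1.0/24", arp)
```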
[0179] For incoming traffic from an instance in another VLAN to an instance in VLAN 1000, the traffic flow is the same as above, except that it is in the reverse direction. For outgoing traffic from an instance in VLAN 1000 to the L3 network, the traffic flow is the same as above, except that the VSRS switch in VLAN 1000 directly routes the packet to the destination VNIC in the virtual L3 network via the virtual router (e.g., the packet does not need to be routed through another VSRS switch). For incoming traffic from the virtual L3 network to an instance in VLAN 1000, the traffic flow is the same as above, except that the packet is received by the VSRS switch in VLAN 1000, which transmits the packet as a frame within the VLAN. For traffic between VLAN 1000 and other networks (outgoing or incoming), the VSRS switch is used similarly, with its routing function used for outgoing traffic to send packets through the appropriate gateway (e.g., IGW, NGW, DRG, SGW, LPG), and its switching function used for incoming traffic to transmit frames within VLAN 1000.
[0180] Referring to Figure 11, a schematic diagram of an embodiment of VLAN 1100 (for example, a cloud-based virtual L2 network) is shown, specifically a diagram of the VLAN implementation.
[0181] As described above, a VLAN can contain "n" compute instances 1102-A, 1102-B, 1102-N, each running on a host machine. As previously stated, there can be one-to-one associations between compute instances and host machines, or many-to-one associations between multiple compute instances and a single host machine. Each compute instance 1102-A, 1102-B, 1102-N can be an L2 compute instance, in which case it is associated with at least one virtual interface (e.g., L2 VNIC) 1104-A, 1104-B, 1104-N and switch 1106-A, 1106-B, 1106-N. Switches 1106-A, 1106-B, 1106-N are L2 virtual switches and together form an L2 distributed switch 1107.
[0182] Pairs of L2 VNICs 1104-A, 1104-B, 1104-N and switches 1106-A, 1106-B, 1106-N associated with compute instances 1102-A, 1102-B, 1102-N on the host machine are pairs of software modules on NVDs 1108-A, 1108-B, 1108-N connected to the host machine. Each L2 VNIC 1104-A, 1104-B, 1104-N represents an L2 port of a single customer-recognized switch (referred to here as the v-switch). Generally, host machine "i" runs compute instance "i" and is connected to NVD "i". Then NVD "i" runs L2 VNIC "i" and switch "i". L2 VNIC "i" represents L2 port "i" of the v-switch. "i" is a positive integer between 1 and "n". Here, a one-to-one association is described, but other types of associations are also possible. For example, a single NVD can be connected to multiple hosts, each running one or more compute instances belonging to a VLAN. In this case, the NVD hosts multiple pairs of L2 VNICs and switches, each corresponding to one of the compute instances.
[0183] A VLAN can include an instance of VSRS 1110. VSRS 1110 performs switching and routing functions and includes instances of VSRS VNIC 1112 and VSRS switch 1114. VSRS VNIC 1112 represents a port on the v-switch that connects the v-switch to other networks via a virtual router. As shown, VSRS 1110 can be instantiated on server fleet 1116.
[0184] The control plane 1118 can track information identifying the L2 VNICs 1104-A, 1104-B, and 1104-N and their placement in the VLAN. The control plane 1118 can further provide this information to the L2 interfaces 1104-A, 1104-B, and 1104-N within the VLAN.
[0185] As shown in Figure 11, the VLAN can be a cloud-based virtual L2 network that can be built on top of the physical network 1120. In some embodiments, this physical network 1120 may include NVDs 1108-A, 1108-B, and 1108-N.
[0186] Generally, a first L2 compute instance in a VLAN (e.g., compute instance 1 1102-A) can communicate with a second compute instance in the VLAN (e.g., compute instance 2 1102-B) using L2 protocols. For example, a frame can be sent between these two L2 compute instances across the VLAN. Nevertheless, the frame can be encapsulated, tunneled, routed, and / or otherwise processed so that it can be sent over the underlying physical network 1120.
[0187] For example, compute instance 1 1102-A sends a frame destined for compute instance 2 1102-B. Depending on the network connections between host machine 1 and NVD1, between NVD1 and physical network 1120, between physical network 1120 and NVD2, and between NVD2 and host machine 2 (e.g., TCP / IP connection, Ethernet® connection, tunneling connection, etc.), different types of processing may be applied to the frame. For example, the frame is received and encapsulated by NVD1, and further processing is applied at each subsequent hop until the frame reaches compute instance 2. This processing is assumed to be in place so that the frame can be transmitted between lower-layer physical resources, and for brevity and clarity, its explanation is omitted from the explanation of VLANs and related L2 operations.
[0188] Virtual L2 network communication Multiple forms of communication can occur within or using a virtual L2 network. These may include intra-VLAN communication. In such embodiments, a source compute instance (CI) can send communication to a destination compute instance located in the same VLAN as the source compute instance (CI). Communication can also be sent to an endpoint outside the VLAN of the source CI. This may include, for example, communication between a source CI in a first VLAN and a destination CI in a second VLAN, communication between a source CI in a first VLAN and a destination CI in an L3 subnet, and / or communication from a source CI in a first VLAN to a destination CI outside the VCN containing the source CI's VLAN. This communication may further include, for example, receiving communication at the destination CI from a source CI outside the destination CI's VLAN. This source CI may be in another VLAN, an L3 subnet, or outside the VCN containing the source CI's VLAN.
[0189] Each CI within a VLAN can play an active role in the traffic flow. This includes learning the interface-identifier-to-MAC-address mappings (also referred to here as interface-to-MAC address mappings) of instances within the VLAN in order to maintain the L2 forwarding table within the VLAN, and sending and / or receiving communications (e.g., frames in the case of L2 communications). VSRS can play an active role in communications within the VLAN and in communications with source or destination CIs outside the VLAN. VSRS can maintain its presence within the L2 and L3 networks, enabling outgoing and incoming communications.
[0190] Referring now to Figure 12, a flowchart illustrating one embodiment of process 1200 for intra-VLAN communication is shown. In some embodiments, process 1200 may be executed by a compute instance within a common VLAN. Specifically, this process may be executed when a source CI sends communication to a destination CI within a VLAN, but does not know the IP-to-MAC address mapping of that destination CI. This can occur, for example, when a source CI sends a packet to a destination CI that has an IP address in the VLAN, but the source CI does not know the MAC address corresponding to that IP address. In this case, an ARP process can be executed to learn the destination MAC address and the IP-to-MAC address mapping.
[0191] If the source CI knows the IP-to-MAC address mapping, the source CI can send the packet directly to the destination CI without the need for an ARP process to be performed. In some embodiments, this packet may be intercepted by the source VNIC, which in intra-VLAN communication is an L2 VNIC. If the source VNIC knows the interface-to-MAC address mapping for the destination MAC address, the source VNIC can encapsulate the packet, for example, with L2 encapsulation, and forward the corresponding frame to the destination VNIC, which in intra-VLAN communication is also an L2 VNIC, for delivery to the destination MAC address.
[0192] If the source VNIC does not know the interface-to-MAC address mapping for a MAC address, the source VNIC can perform an interface-to-MAC address learning process. This may involve the source VNIC sending a frame to all interfaces in the VLAN. In some embodiments, this frame may be sent to all interfaces in the VLAN via broadcast. In some embodiments, this broadcast may be implemented in the form of a serial unicast in the physical network. This frame may include the destination MAC and IP addresses, the interface identifier, and the MAC and IP addresses of the source VNIC. Each VNIC in the VLAN can receive this frame and learn the interface-to-MAC address mapping of the source VNIC.
[0193] Each receiving VNIC can further decapsulate a frame and forward the decapsulated frame (e.g., the corresponding packet) to its associated CI. Each CI may include a network interface from which it can evaluate the forwarded packet. If the network interface determines that the CI receiving the forwarded packet does not match the destination MAC and / or IP address, the packet is dropped. If the network interface determines that the CI receiving the forwarded frame matches the destination MAC and / or IP address, the packet is received by the CI. In some embodiments, a CI having a MAC and / or IP address that matches the destination MAC and / or IP address of a packet can send a response to the source CI, thereby allowing the source VNIC to learn the interface-to-MAC address mapping of the destination CI, and thereby allowing the source CI to learn the IP-to-MAC address mapping of the destination CI.
[0194] Process 1200 can be executed if the source CI does not know the IP-to-MAC address mapping of the destination CI, or if that mapping is outdated. Once the IP-to-MAC address mapping is known, the source CI can send the packet. If the interface-to-MAC address mapping is not known, the interface-to-MAC address learning process outlined above can be executed; if it is known, the source VNIC can send the corresponding frame to the destination VNIC. Process 1200 begins at block 1202, where the source CI determines that the destination CI's IP-to-MAC address mapping is unknown to the source CI. In some embodiments, this may include the source CI determining the destination IP address for the packet and determining that the destination IP address is not associated with any MAC address stored in the source CI's mapping table. Alternatively, the source CI may determine that the IP-to-MAC address mapping for its destination CI is outdated. In some embodiments, a mapping can be outdated if it has not been updated and / or validated within a certain time limit. If the source CI determines that the destination CI's IP-to-MAC address mapping is unknown and / or outdated, the source CI initiates an ARP request for the destination IP address and sends the ARP request as an Ethernet broadcast.
[0195] In block 1204, the source VNIC, also called the source interface, receives the ARP request from the source CI. The source interface identifies all interfaces on the VLAN and sends the ARP request to all interfaces in the VLAN broadcast domain. As previously mentioned, the control plane knows all interfaces on the VLAN and provides this information to the interfaces within the VLAN, so the source interface also knows all interfaces within the VLAN and can send the ARP request to each of them. To do this, the source interface duplicates the ARP request and encapsulates one copy of the ARP request for each interface on the VLAN. Each encapsulated ARP request includes the source CI interface identifier, the source CI MAC address and IP address, the target IP address, and the destination CI interface identifier. The source interface emulates the Ethernet broadcast by sending the duplicated and encapsulated ARP requests (e.g., ARP messages) as serial unicast, so that one ARP request is sent to each interface in the VLAN.
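The replication step of block 1204 can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the class names, field names, and IP addresses are hypothetical, and the sketch assumes that the source interface does not send a copy to itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpRequest:
    src_mac: str
    src_ip: str
    target_ip: str


@dataclass(frozen=True)
class EncapsulatedArp:
    src_interface_id: str  # interface identifier of the source VNIC
    dst_interface_id: str  # interface identifier of the receiving VNIC
    payload: ArpRequest


def replicate_broadcast(request, src_interface_id, vlan_interface_ids):
    """Return one encapsulated copy of the request per other interface in the VLAN."""
    return [
        EncapsulatedArp(src_interface_id, dst, request)
        for dst in vlan_interface_ids
        if dst != src_interface_id  # assumption: no copy is sent back to the source
    ]


# Hypothetical addresses; the interface list is what the control plane provides.
request = ArpRequest(src_mac="M.1", src_ip="10.0.0.1", target_ip="10.0.0.2")
copies = replicate_broadcast(request, "ID.1", ["ID.1", "ID.2", "ID.n"])
```

Each element of `copies` would then be sent as an individual unicast over the physical network, which is how the broadcast is emulated without flooding.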
[0196] In block 1206, each interface in the VLAN broadcast domain receives and decapsulates an ARP message. Each interface in the VLAN broadcast domain that receives an ARP message learns the interface-to-MAC address mapping of the source VNIC of the source CI (e.g., the interface identifier of the source interface to the MAC address of the source CI) because the message identifies the source CI's MAC address and IP address, as well as the source CI interface identifier. As part of learning the interface-to-MAC address mapping for the source CI, each interface can update its mapping table (e.g., its L2 forwarding table) and provide the updated mapping to its associated switch and / or CI. Each receiving interface, except for VSRS, can forward the decapsulated packet to its associated CI. The CI recipient of the forwarded decapsulated packet, specifically the network interface of that CI, can determine whether the target IP address matches the IP address of the CI. If the IP address of the CI associated with that interface does not match the destination CI IP address, in some embodiments, the packet is dropped by that CI and no further action is taken. In the case of VSRS, VSRS can determine whether the target IP address matches the VSRS's IP address. If the VSRS's IP address does not match the target IP address specified in the received packet, in some embodiments, the packet is dropped by the VSRS and no further action is taken.
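The receive-side behavior of block 1206 can be sketched as follows. This is an illustration only: the function name and message fields are hypothetical, and for brevity the sketch folds the target-IP check, which the text attributes to the network interface of the CI, into the same function as the learning step.

```python
def handle_arp_message(forwarding_table, local_ci_ip, message):
    """Learn the sender's interface-to-MAC mapping, then report whether the local CI is the target."""
    # Learning step: remember which interface identifier the source MAC address sits behind.
    forwarding_table[message["src_mac"]] = message["src_interface_id"]
    # Matching step: only the CI that owns the target IP address should answer.
    return message["target_ip"] == local_ci_ip


# Hypothetical decapsulated ARP message from the source interface ID.1.
message = {
    "src_interface_id": "ID.1",
    "src_mac": "M.1",
    "src_ip": "10.0.0.1",
    "target_ip": "10.0.0.2",
}

table_vnic2 = {}
vnic2_should_reply = handle_arp_message(table_vnic2, "10.0.0.2", message)

table_vnic_n = {}
vnic_n_should_reply = handle_arp_message(table_vnic_n, "10.0.0.9", message)
```

Both receivers learn the mapping for M.1, but only the instance whose IP address matches the target goes on to send the ARP response.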
[0197] If the destination CI IP address specified in the received packet is determined to match the IP address of the CI (destination CI) associated with the receiving interface, the destination CI sends a response, which may be a unicast ARP response, addressed to the source CI, as shown in block 1208. This response includes the destination CI MAC address and destination CI IP address, and the source CI IP address and MAC address. This response is received by the destination interface, as shown in block 1210, which encapsulates the unicast ARP response. In some embodiments, this encapsulation may include Geneve encapsulation. The destination interface can forward the encapsulated ARP response to the source interface via the destination switch. The encapsulated response includes the destination CI MAC address and IP address, as well as the destination CI interface identifier, and the source CI MAC address and IP address, as well as the source CI interface identifier.
[0198] In block 1212, the source interface receives the ARP response and decapsulates it. The source interface can then learn an interface-to-MAC address mapping for the destination CI based on the information contained in the encapsulated and / or decapsulated frame. In some embodiments, the source interface can forward the ARP response to the source CI.
[0199] In block 1214, the source CI receives an ARP response. In some embodiments, the source CI can update its mapping table based on the information contained in the ARP response, specifically, it can update the mapping table to reflect the IP-to-MAC address mapping based on the destination CI's MAC address and IP address. The source CI can then send a packet to the destination CI based on this MAC address. This packet may include the source CI's MAC address and interface identifier as the source MAC address and source interface, and the destination CI's MAC address and interface identifier as the destination MAC address and destination interface.
[0200] In block 1216, the source interface can receive packets from the source CI. The source interface can encapsulate packets, and in some embodiments, this encapsulation uses Geneve encapsulation. The source interface can forward the corresponding frames to the destination CI, specifically to the destination interface. The encapsulated frames may include the MAC address and interface identifier of the source CI as the source MAC address and source interface identifier, and the MAC address and interface identifier of the destination CI as the destination MAC address and destination interface.
[0201] In block 1218, the destination interface receives a frame from the source interface. The destination interface can decapsulate the frame and then forward the corresponding packet to the destination CI. In block 1220, the destination CI receives a packet from the destination interface.
[0202] Storm control Physical L2 networks can suffer from frame storms, where end stations can send a large number of frames in rapid bursts of traffic. Such traffic bursts can be further amplified by the flooding characteristics of L2 networks, where frames with unknown unicast destination addresses or broadcast or multicast destination addresses are duplicated. Such "amplified traffic bursts" can quickly bring down the entire network. This is especially true if the network has loops—even for only a short period. Thus, L2 traffic storms can cause failures across the entire network. Different techniques exist to prevent L2 traffic storms, including the use of spanning tree. However, such techniques typically prohibit the use of multipath communication within physical L2 networks.
[0203] Embodiments of the present disclosure provide an L2 virtual network that is overlaid on a physical network (e.g., an L3 physical network), as described herein. The L2 virtual network, such as a virtualized L2 VLAN (referred to herein as a VLAN), is implemented using a technique that provides storm control while enabling multipath communication.
[0204] In VLANs, broadcast or multicast can be used to send frames from one compute instance to multiple compute instances within the VLAN. If a loop exists between two compute instances, broadcast or multicast can result in a frame storm. Different techniques are possible to prevent this storm. In an exemplary technique, the frame transmission rate (e.g., frames / second and / or bits / second) is monitored across different VNICs representing ports on the customer's switch (referred to here as a "v-switch," which indicates that this corresponds to the customer's awareness of a single virtual switch). As described above, this v-switch is actually an L2 distributed switch spanning multiple NVDs, each NVD hosting one or more L2 virtual switches belonging to the L2 distributed switch. The rate is compared against a limiting policy. If the rate violates this policy, the violating VNIC can be shut down, or, depending on the type of violation, some of the frames it is processing can be dropped. These and other aspects of storm control are described below.
[0205] Figure 13 shows an exemplary environment suitable for defining a storm control configuration for an L2 virtual network according to one embodiment. In the embodiment, the environment includes a computer system 1310 that communicates with a customer device 1320 over one or more networks (not shown). The computer system 1310 may include a set of hardware computing resources that host a VCN 1312. A control plane hosted by one or more of the hardware computing resources can receive and process inputs from the customer device 1320 and deploy an L2 virtual network (shown as L2VLAN 1314 in Figure 13) within the VCN 1312.
[0206] In one example, input from customer device 1320 may include various types of information. This information can be specified via console or API calls and may include, among other things, customer-specified configuration information, L2 VLAN configuration 1322 and storm control configuration 1324.
[0207] The L2 VLAN configuration 1322 can, for example, specify the number, type, and configuration of L2 compute instances that should be included in L2 VLAN 1314. In addition, the L2 VLAN configuration 1322 can specify the customer-designated name of the port on the customer-aware v-switch, the MAC address of the compute instance (which may be an L2 compute instance), and the association between the port and the MAC address (or, more generally, the compute instance). For example, the customer may specify that L2 VLAN 1314 should contain two L2 compute instances, the first L2 compute instance having MAC address M.1 and associated with a first port named P1, and the second L2 compute instance having MAC address M.2 and associated with a second port named P2.
[0208] Storm control configuration 1324 can, for example, represent a storm control policy that controls the flow of traffic, including frames sent within, to, and / or from L2 VLAN 1314. A storm control policy can represent a set of actions and a set of traffic flow conditions. When a traffic flow condition is detected or a violation of it is detected (e.g., measured), the corresponding action may be initiated or executed. Storm control configuration 1324 can further represent an escalation policy that further controls the flow of traffic depending on the type of violation of the traffic flow condition. For example, an escalation policy may indicate that if a violation is repeatedly detected within a certain duration (e.g., at a certain frequency or number of violations) or persists for longer than a certain duration, another set of actions (e.g., escalation actions) may be initiated or executed.
[0209] In the embodiment, different storm control configuration types are possible and can be used together or independently. The first storm control configuration type indicates whether storm control applies to a specific port of a v-switch, to a subset of its ports, or to the entire set of ports (e.g., the entire VLAN). In particular, the customer can specify the permissible transmission rate (e.g., the maximum transmission rate defined in units of frames / second and / or bits / second) for each port, a set of ports, and / or the entire set of ports. The second storm control configuration type indicates whether storm control applies to unicast frames, and / or broadcast and / or multicast frames. The third storm control configuration type indicates the type of transmission rate (e.g., frames / second and / or bits / second) to be used for storm control. The fourth storm control configuration type indicates the action to be taken in response to a violation of the storm control policy. For example, the customer can specify that frames exceeding the permitted transmission rate should be dropped. Alternatively, the customer can specify that a port violating the permitted transmission rate (e.g., receiving and / or sending frames exceeding the permitted transmission rate) should be shut down and rendered inoperable (e.g., the link state should be brought down). A fifth storm control configuration type can define an escalation policy that includes dropping frames and subsequently shutting down the port (e.g., if the violation is intermittent, excess frames may be discarded; however, if the violation is more persistent over a period of time, the violating port should be shut down). A sixth storm control configuration type indicates metrics and / or statistics to be reported, which may be used for troubleshooting. For example, customers can request system logs (syslog), flow logs, specific metrics (e.g., how many frames were sent, how many frames were dropped, which ports were used, how often frames were dropped (e.g., frame drop rate), how often frame transmission spikes (e.g., exceeding the permitted transmission rate by a certain amount) were observed), watermarks (e.g., the highest transmission spike and its timing), and alerts regarding violations.
[0210] The above inputs can be received by the control plane, and the customer specifies the parameters for each storm control configuration type using its own customer presentation (e.g., by using its own nomenclature for the v-switch ports). The control plane generates storm control information based on the actual network implementation (e.g., L2 distributed switch) and L2 VLAN configuration 1322 (e.g., customer-defined ports). The control plane also distributes the generated storm control information to the NVD, which then organizes the implementation of storm control.
[0211] Therefore, the control plane receives various information, deploys and manages the different resources of L2 VLAN 1314, generates the associated storm control configuration, and distributes it to these resources. For example, L2 VLAN 1314 is configured according to L2 VLAN configuration 1322 and includes the requested compute instances hosted on host machines and pairs of L2 VNICs and L2 virtual switches hosted on NVDs. To generate the storm control configuration, the control plane translates the customer definition from the storm control configuration into the actual topology of L2 VLAN 1314. For example, each L2 VNIC emulates a port, and the control plane associates the L2 VNIC (e.g., its interface identifier, its MAC address (if not specified), and / or the IP address of the NVD hosting the L2 VNIC) with the port name (and, if specified, the specified MAC address). Instead of using port names, the generated storm control configuration indicates storm control by identifying the associated L2 VNICs (e.g., their interface identifiers, their MAC addresses, and / or the IP address of the NVD hosting the L2 VNICs). The NVD hosting the L2 VNICs receives and applies the storm control configuration associated with the L2 VNICs, and the NVD can then perform traffic flow enforcement.
[0212] Figure 14 illustrates an exemplary storm control technique in a Layer 2 virtual network according to one embodiment. The Layer 2 virtual network is referred to here as a VLAN. The top of Figure 14 shows a VLAN implementation diagram 1410. The bottom of Figure 14 shows a customer presentation of the VLAN 1420.
[0213] As mentioned above, a VLAN can contain "n" compute instances, each of which runs on a host machine. Figure 14 illustrates a one-to-one association between a compute instance and a host machine, but a many-to-one association is possible, where one host machine can run multiple compute instances. Each compute instance is associated with at least one virtual interface (e.g., an L2 VNIC) and a switch (e.g., an L2 virtual switch). The VNIC and switch pair associated with a compute instance on a host machine can be a pair of software modules on an NVD connected to the host machine. Each L2 VNIC represents an L2 port on the customer's v-switch. In the example in Figure 14, host machine "i" runs compute instance "i" and is connected to NVD "i". Next, NVD "i" runs VNIC "i" and switch "i". VNIC "i" represents L2 port "i" on the v-switch, where i is a positive integer from 1 to n. Again, a one-to-one association is described, but other types of associations are also possible. For example, a single NVD can be connected to multiple hosts, each running one or more compute instances belonging to a VLAN. In this case, the NVD would host multiple pairs of VNICs and switches, each corresponding to one of those compute instances.
[0214] Customer input can be received by the control plane (e.g., the VCN control plane, including VLANs). Input can be received via API calls and / or console, and can specify different dimensions of storm control. The organization of storm control can be managed by the control plane, and the implementation of storm control can be done at the NVD (data plane) level.
[0215] In one example, as described above, the NVD's L2 VNIC learns interface-to-MAC address mappings based on incoming traffic. Such mappings, along with VLAN identifiers, can be sent to the control plane. The control plane can receive similar mappings from different NVDs hosting different L2 VNICs and generate mappings between interface identifiers, MAC addresses, physical IP addresses (for example, those of the NVDs), VLAN identifiers, and storm control parameters.
[0216] For example, VNIC1 learns that M.2 (the overlay MAC address of compute instance 2) is associated with ID.2 (the interface identifier of L2 VNIC2) and IP.2 (the physical address of NVD2), and that M.n (the overlay MAC address of compute instance n) is associated with ID.n (the interface identifier of L2 VNIC n) and IP.n (the physical address of NVD n). Similarly, VNIC2 learns that M.1 (the overlay MAC address of compute instance 1) is associated with ID.1 (the interface identifier of L2 VNIC1) and IP.1 (the physical address of NVD1). These associations are reported to the control plane as part of the mapping, and the control plane can generate mappings such as {Customer 1; M.1 → ID.1, IP.1; VLAN A}, {Customer 1; M.2 → ID.2, IP.2; VLAN A}, ..., {Customer 1; M.n → ID.n, IP.n; VLAN A}.
[0217] Customer input can also specify a storm control configuration 1422 in addition to the L2 VLAN configuration described in relation to Figure 13. For illustrative purposes, as part of the storm control configuration 1422, customer input specifies that ports 1, 2, and n have limits of 1,000 FPS, 2,000 FPS, and 3,000 FPS, respectively, and that frames should be dropped if a violation occurs. Based on the association between the customer definition of the VLAN and its actual implementation (for example, VNIC 1, 2, ..., n correspond to ports 1, 2, ..., n, respectively), the control plane may generate storm control information 1411 for the entire VLAN, namely {Customer 1; M.1 → ID.1, IP.1; VLAN A; Limit: 1,000 FPS; Action: Drop}, {Customer 1; M.2 → ID.2, IP.2; VLAN A; Limit: 2,000 FPS; Action: Drop}, ..., {Customer 1; M.n → ID.n, IP.n; VLAN A; Limit: 3,000 FPS; Action: Drop}. In this example, "Limit" corresponds to a traffic flow condition, its value (e.g., "1,000 FPS") corresponds to the maximum FPS rate, and "Action: Drop" corresponds to the action of dropping frames when the maximum FPS rate is exceeded.
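The translation from customer-defined ports to per-VNIC storm control information can be sketched in Python as a join of three tables. The function name, dictionary keys, and port labels are hypothetical and chosen only to mirror the example values above; the actual control plane data model is not specified in the text.

```python
def build_storm_control_info(customer, vlan, port_policies, port_to_vnic, vnic_mappings):
    """Join customer port policies with learned VNIC mappings into per-VNIC records."""
    records = []
    for port, policy in port_policies.items():
        interface_id = port_to_vnic[port]      # customer port -> L2 VNIC
        mapping = vnic_mappings[interface_id]  # L2 VNIC -> overlay MAC, NVD address
        records.append({
            "customer": customer,
            "vlan": vlan,
            "mac": mapping["mac"],
            "interface_id": interface_id,
            "nvd_ip": mapping["nvd_ip"],
            "limit_fps": policy["limit_fps"],
            "action": policy["action"],
        })
    return records


# Customer view: policies keyed by port name.
port_policies = {
    "port 1": {"limit_fps": 1000, "action": "drop"},
    "port 2": {"limit_fps": 2000, "action": "drop"},
    "port n": {"limit_fps": 3000, "action": "drop"},
}
# Implementation view: which VNIC emulates each port, and what was learned about it.
port_to_vnic = {"port 1": "ID.1", "port 2": "ID.2", "port n": "ID.n"}
vnic_mappings = {
    "ID.1": {"mac": "M.1", "nvd_ip": "IP.1"},
    "ID.2": {"mac": "M.2", "nvd_ip": "IP.2"},
    "ID.n": {"mac": "M.n", "nvd_ip": "IP.n"},
}

storm_control_info = build_storm_control_info(
    "Customer 1", "VLAN A", port_policies, port_to_vnic, vnic_mappings
)
```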
[0218] In another example, customer input could specify a total limit (e.g., 6,000 FPS) to be allowed within a VLAN, with frames dropped in the event of intermittent violations and the violating port shut down in the event of a persistent violation. The control plane could apportion the total limit across ports (e.g., L2 VNICs) to determine individual limits per port (e.g., the total limit divided by "n" per L2 VNIC). A dynamic multiplier "X" could be associated with each individual limit and adjusted over time in response to confirmed violations (e.g., the dynamic multiplier "X" for L2 VNIC1 could be initially set to "2", increased if violations are confirmed, and decreased if no violations are confirmed within a period). Furthermore, the control plane could define that frames should be dropped in the event of a violation based on FPS, and that violating ports should be shut down based on the total number of frames exceeding a threshold within a period. For example, in the case of L2 VNIC1 and with "n" equal to 3, the control plane generates the following individual storm control information: {Limit: (6,000 / 3)*X; X=2; Action: Drop frames; Escalation: Shutdown if the total number of frames in one hour exceeds 10,000,000}. The control plane generates similar individual storm control information for the remaining L2 VNICs, and each of such pieces of information can be included in the overall storm control information 1411 for the entire VLAN (for example, {Customer 1; M.1 → ID.1, IP.1; VLAN A; Limit: (6,000 / 3)*X; X=2; Action: Drop frames; Escalation: Shutdown if the total number of frames in one hour exceeds 10,000,000}).
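As a worked instance of the limit expression above, the following Python sketch evaluates (6,000 / 3)*X for X=2, which gives an individual limit of 4,000 FPS for L2 VNIC1. The function name is hypothetical, and equal apportionment across VNICs is the example's choice rather than a requirement.

```python
def individual_limit(total_limit_fps, vnic_count, multiplier):
    """Per-VNIC limit: an equal share of the VLAN-wide limit, scaled by the dynamic multiplier X."""
    return (total_limit_fps / vnic_count) * multiplier


# Values from the example: total limit of 6,000 FPS, three L2 VNICs, X = 2.
limit_vnic1 = individual_limit(6000, 3, 2)
```

Note that with X greater than 1 the individual limits can sum to more than the VLAN-wide total, so the multiplier acts as a per-port burst allowance rather than a strict partition.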
[0219] Based on the association between the customer definition and actual implementation of the VLAN (e.g., L2 VNIC1 corresponds to port 1) and the mapping (e.g., {Customer 1; M.1 → ID.1, IP.1; VLAN A}), the control plane can deliver the relevant individual storm control portions to the NVDs for local implementation of storm control. For example, individual storm control information 1414(1) applicable to L2 VNIC1 is sent to NVD1, which hosts L2 VNIC1. For illustrative purposes, this individual storm control information 1414(1) may include {VNIC1 → Limit: 1,000 FPS; Action: Drop} or, if applicable, {VNIC1 → Limit: (6,000 / 3)*X; X=2; Action: Drop frames; Escalation: Shutdown if the total number of frames in one hour exceeds 10,000,000}. Similarly, individual storm control information 1414(2) applicable to L2 VNIC2 is sent to NVD2, which hosts L2 VNIC2. The customer may also choose not to request storm control for some ports (e.g., port n). In that case, individual storm control information is not generated for those ports and does not need to be sent to the relevant NVD (e.g., individual storm control information is not defined for L2 VNICn and is not sent to NVDn).
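The per-NVD delivery described above can be sketched as a grouping step. The record layout and function name below are hypothetical; the sketch only shows that VNICs without a requested policy produce no individual information and that their NVD receives nothing.

```python
from collections import defaultdict


def plan_distribution(vlan_records):
    """Group individual storm control information by the NVD that hosts each L2 VNIC."""
    per_nvd = defaultdict(list)
    for record in vlan_records:
        policy = record.get("policy")
        if policy is None:
            continue  # no storm control requested for this port: nothing to send
        per_nvd[record["nvd_ip"]].append({"vnic": record["interface_id"], **policy})
    return dict(per_nvd)


# Hypothetical VLAN-wide records; port n (VNIC ID.n) has no policy.
vlan_records = [
    {"interface_id": "ID.1", "nvd_ip": "IP.1", "policy": {"limit_fps": 1000, "action": "drop"}},
    {"interface_id": "ID.2", "nvd_ip": "IP.2", "policy": {"limit_fps": 2000, "action": "drop"}},
    {"interface_id": "ID.n", "nvd_ip": "IP.n", "policy": None},
]

distribution = plan_distribution(vlan_records)
```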
[0220] Storm control can be performed by the NVD on incoming and / or outgoing traffic. For incoming implementation, and for illustrative purposes, referencing NVD1, NVD1 monitors the traffic flow to L2 VNIC1 (e.g., transmission rate such as FPS and / or BPS of frames sent to L2 VNIC1, and / or total transmission volume such as the total number of frames or bits sent to L2 VNIC1 within the last hour) to compare against applicable traffic flow conditions (e.g., FPS and / or BPS limits of storm control and / or escalation policies). If a violation is detected, NVD1 takes an applicable action (e.g., drops frames sent to L2 VNIC1 or links down and shuts down L2 VNIC1). In contrast, for outgoing implementation, and also referencing NVD1 for illustrative purposes, NVD1 receives the individual storm control information of the remaining NVDs and uses this information for outgoing implementation from L2 VNIC1. For example, NVD1 receives the applicable limits set for NVD2 from the control plane. If the transmission rate (e.g., FPS and / or BPS) of frames sent by L2 VNIC1 to L2 VNIC2 exceeds the FPS limit and / or BPS limit, such frames are dropped by NVD1 rather than being sent to NVD2 and subsequently dropped there. If the total amount of such frames transmitted exceeds the allowable limit per unit of time, L2 VNIC1 is brought down.
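The incoming enforcement logic can be sketched as a small state machine. The text does not say how the NVD measures rates, so the fixed one-second counting window below is a substitute technique chosen for simplicity. The class name, the decision to count dropped frames toward the escalation total, and the explicit timestamps are likewise assumptions.

```python
class VnicStormControl:
    """Per-VNIC enforcement: drop frames above the FPS limit, shut down on escalation."""

    def __init__(self, limit_fps, escalation_total, escalation_period_s=3600.0):
        self.limit_fps = limit_fps
        self.escalation_total = escalation_total
        self.escalation_period_s = escalation_period_s
        self.window_start = 0.0  # start of the current one-second rate window
        self.window_count = 0
        self.period_start = 0.0  # start of the current escalation period
        self.period_count = 0
        self.link_up = True

    def on_frame(self, now):
        """Return 'forward', 'drop', or 'shutdown' for a frame arriving at time `now` (seconds)."""
        if not self.link_up:
            return "shutdown"
        if now - self.period_start >= self.escalation_period_s:
            self.period_start, self.period_count = now, 0
        if now - self.window_start >= 1.0:
            self.window_start, self.window_count = now, 0
        self.period_count += 1  # assumption: every arriving frame counts toward escalation
        if self.period_count > self.escalation_total:
            self.link_up = False  # escalation: bring the link down
            return "shutdown"
        self.window_count += 1
        if self.window_count > self.limit_fps:
            return "drop"
        return "forward"


# Tiny limits so the behavior is visible: 2 FPS, shutdown after 5 frames in the period.
control = VnicStormControl(limit_fps=2, escalation_total=5)
decisions = [control.on_frame(t) for t in (0.0, 0.1, 0.2, 1.5, 1.6, 1.7)]
```

In this run the third frame exceeds the per-second limit and is dropped, while the sixth frame crosses the escalation threshold and brings the link down. A production design would more likely use a token bucket or sliding window to avoid bursts at window boundaries.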
[0221] Different NVDs can report information about frame transmission and frame drops to the control plane. This information may include the transmission rate, total transmission amount (e.g., the total number of frames transmitted and / or the total number of bits transmitted), drop rate, total drop amount (e.g., the total number of dropped frames and / or the total number of dropped bits), and the action to be taken (e.g., drop, shutdown, escalation). The information transmitted from the NVD can be annotated with metadata about the associated L2 VNIC and / or VLAN (e.g., the metadata may identify the associated L2 VNIC and may include the VLAN ID).
[0222] Next, the control plane can collect information from different NVDs and generate metrics and / or statistics requested by the customer. It can push alerts. Other types of metrics and / or statistics can be pushed or presented upon customer request.
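The collection step can be sketched as a simple aggregation over NVD reports. The report fields and the chosen statistics (totals and a peak-rate watermark) are hypothetical examples of the metrics mentioned earlier, not a format defined by the text.

```python
def summarize(reports):
    """Aggregate per-VNIC totals and the highest observed transmission rate."""
    summary = {}
    for report in reports:
        stats = summary.setdefault(report["vnic"], {"sent": 0, "dropped": 0, "peak_fps": 0})
        stats["sent"] += report["sent"]
        stats["dropped"] += report["dropped"]
        stats["peak_fps"] = max(stats["peak_fps"], report["fps"])  # watermark
    return summary


# Hypothetical reports from two NVDs, annotated with the associated L2 VNIC.
reports = [
    {"vnic": "ID.1", "sent": 900, "dropped": 0, "fps": 900},
    {"vnic": "ID.1", "sent": 1500, "dropped": 500, "fps": 1500},
    {"vnic": "ID.2", "sent": 100, "dropped": 0, "fps": 100},
]

metrics = summarize(reports)
```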
[0223] When a dynamic multiplier "X" is used in the storm control configuration information, the control plane can compare the transmission rate and / or total transmission volume against the set limits. Depending on the comparison, the dynamic multiplier "X" can be increased or decreased for each VNIC.
[0224] As further explained above, a customer's VLAN may include an instance of VSRS (not shown in Figure 14). VSRS performs switching and routing functions and includes a VSRS VNIC representing a port on the v-switch, which connects the v-switch to other networks via a virtual router. Similar configuration information for storm control may be generated for VSRS and sent to VSRS for local enforcement. Furthermore, to support the dropping of traffic that VSRS would route to and from another network, the mapping may include the overlay IP address of the compute instance. If traffic with the compute instance's overlay IP address (e.g., as a source or destination IP address) results in a violation, VSRS can drop this traffic and, if applicable depending on the escalation configuration, link down the VSRS VNIC.
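One possible adjustment rule is sketched below. The direction of adjustment follows the example given in [0218] (increase when violations are confirmed, decrease otherwise). The step size and the lower and upper bounds are assumptions introduced for illustration, as is the function name.

```python
def adjust_multiplier(x, violations_in_period, step=0.5, minimum=1.0, maximum=4.0):
    """Return the next value of the dynamic multiplier X for one L2 VNIC."""
    if violations_in_period > 0:
        return min(maximum, x + step)  # violations confirmed: increase, capped at maximum
    return max(minimum, x - step)      # quiet period: decrease, floored at minimum


x_after_violations = adjust_multiplier(2.0, violations_in_period=3)
x_after_quiet_period = adjust_multiplier(2.0, violations_in_period=0)
```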
[0225] Figure 15 is a sequence diagram illustrating the process for using storm control information in an L2 virtual network according to several embodiments. In the embodiments, a remote device 1510 operated by the customer (for example, a device from the customer's on-premises network, such as customer device 1320, which is remotely connected to the VCN) communicates with the control plane 1520 to configure storm control for the customer's VLAN. The control plane 1520 organizes the implementation of storm control by the NVD 1530, which hosts the VLAN's L2 VNIC and L2 virtual switch.
[0226] As illustrated, the sequence diagram can begin when the customer device 1510 sends customer input to the control plane 1520. The input, among other information (e.g., VLAN configuration), indicates the storm control configuration. The storm control configuration may be specific to a port, a set of ports, or the entire set of ports on a v-switch known to the customer. The control plane 1520 then generates storm control information based on the storm control configuration. The storm control information may include global storm control information applicable to VLANs and / or individual storm control information for each L2 VNIC. Generally, the control plane 1520 translates the storm control configuration from customer-defined ports to an L2 VNIC implementation and may include storm control policies, escalation policies, or modifications thereto (e.g., by adjusting transmission limits based on a dynamic multiplier "X") in the storm control information. The control plane 1520 also determines the set of NVDs 1530 that will receive the storm control information. Generally, when a customer specifies the ports to which storm control applies, the control plane 1520 determines the corresponding L2 VNIC and the NVD 1530 hosting this L2 VNIC. Individual storm control information defined for the L2 VNIC (corresponding to the storm control configuration defined for the port by the customer) is sent to the NVD 1530.
[0227] Next, the NVD 1530 receives and stores the relevant storm control information. Incoming traffic to an L2 VNIC that is hosted by the NVD 1530, and for which individual storm control information is stored, is controlled by the NVD. Similarly, outgoing traffic from an L2 VNIC to another L2 VNIC (which may be hosted on a different NVD) can be controlled by the NVD using this storm control information and / or the storm control information of the other L2 VNIC. Control may include determining whether a violation of a traffic flow condition has been detected, and / or determining the type of violation for which a storm control policy and / or escalation policy should be enforced.
[0228] Furthermore, the NVD1530 can collect metrics and / or statistics regarding incoming and / or outgoing traffic flows of the L2 VNIC. Such metrics and / or statistics are reported to the control plane 1520 using a push mechanism (e.g., periodic) or a pull mechanism (e.g., on-demand from the control plane 1520). The control plane 1520 can send the metrics / statistics received from the NVD1530 to the customer device 1510, and / or generate new metrics and / or statistics based on the aggregation or combination of metrics and / or statistics reported from multiple NVD1530s and send them to the customer device 1510. In addition, the control plane 1520 can generate updates to storm control information based on the metrics and / or statistics reported from one or more of the NVD1530s. For example, if an FPS limit is set for an L2 VNIC and this limit is partially defined using a multiplier "X", this multiplier can be adjusted (e.g., increased or decreased) depending on metrics and / or statistics indicating the amount of incoming and / or outgoing traffic to the L2 VNIC, and / or the type of violation of the storm control policy and / or escalation policy defined for the L2 VNIC. Updates to individual storm control information associated with the NVD1530 can be sent to the NVD1530 (e.g., via a push mechanism). Alternatively, the entire updated storm control information may be sent to this NVD1530.
[0229] Figure 16 is a flowchart showing a process 1600 for determining, generating, and distributing storm control information in several embodiments. In some embodiments, the process 1600 can be performed by a control plane that manages the deployment of a Layer 2 virtual network on the underlying physical network.
[0230] Process 1600 begins in block 1602, when the control plane stores the customer's network configuration, which represents the port definitions. In some embodiments, customer input is received from the customer device via API calls and / or the console and represents the L2 virtual network configuration (e.g., an L2 VLAN configuration as described in relation to Figure 13). This input can also represent the customer definition of ports for customer-aware v-switches in the L2 virtual network. This input can be stored as part of the network configuration.
[0231] In block 1604, the control plane stores mapping information that associates the addresses of L2 virtual networks with the addresses of the physical networks hosting the L2 virtual networks. For example, an L2 virtual network includes compute instances and, for each compute instance, a pair of L2 VNICs and L2 virtual switches. A physical network includes the host machines running the compute instances and the NVDs running the L2 VNIC-L2 virtual switch pairs. The addresses of the compute instances (e.g., IP addresses) and / or the addresses of the L2 VNICs (e.g., MAC addresses and interface identifiers) can be mapped to the addresses of the host machines and NVDs (e.g., IP addresses).
[0232] In block 1606, the control plane receives customer input indicating a storm control configuration. In some embodiments, customer input is received from a customer device via API calls and / or a console to indicate a storm control configuration (e.g., a storm control configuration described in relation to Figure 13).
[0233] In block 1608, the control plane determines the set of NVDs that should receive storm control information. In some embodiments, the storm control configuration is specified for the ports of the v-switch. Based on the network configuration, the control plane determines the correspondence between the ports and the L2 VNICs of the L2 virtual network. Based on the mapping information, the control plane determines the association between the L2 VNICs and the NVDs of the physical network, where the NVDs host the L2 VNICs. Therefore, the control plane determines that control information should be defined for the L2 VNICs and sent to the NVDs. Similar determinations can be made for a set of L2 VNICs hosted on a set of NVDs, or for all L2 VNICs hosted on multiple NVDs, depending on whether customer input indicates that the storm control configuration applies to a set of ports or the entire set of ports.
[0234] In block 1610, the control plane generates storm control information based on mapping information and network configuration. In some embodiments, the control plane translates customer-specified ports to L2 VNICs based on network configuration and determines the associated NVD based on mapping information. The control plane can also determine storm control policies, escalation policies, and / or modifications thereto from the storm control configuration to be included in the storm control information defined for the L2 VNIC and deployed to the NVD, as described in relation to Figures 14-15.
[0235] In block 1612, the control plane transmits storm control information to a set of NVDs. In some embodiments, where applicable, separate storm control information is generated for the L2 VNIC. This separate storm control information is transmitted to the NVD hosting the L2 VNIC.
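Blocks 1602 to 1612 can be sketched, under stated assumptions, as two lookups followed by a per-NVD grouping. All identifiers, addresses, and the dictionary-based representation below are hypothetical and serve only to illustrate the translation from customer-defined ports to L2 VNICs and from L2 VNICs to the NVDs that host them.

```python
from collections import defaultdict

# Block 1602: network configuration, customer-defined port -> L2 VNIC.
port_to_vnic = {"port-1": "vnic-1", "port-2": "vnic-2", "port-3": "vnic-3"}

# Block 1604: mapping information, L2 VNIC -> address of the hosting NVD.
vnic_to_nvd = {"vnic-1": "10.0.0.1", "vnic-2": "10.0.0.2", "vnic-3": "10.0.0.1"}


def build_storm_control_info(storm_config):
    """Blocks 1606-1610: translate per-port limits into per-NVD, per-VNIC information."""
    per_nvd = defaultdict(dict)
    for port, limits in storm_config.items():
        vnic = port_to_vnic[port]   # port -> L2 VNIC (network configuration)
        nvd = vnic_to_nvd[vnic]     # L2 VNIC -> NVD (mapping information)
        per_nvd[nvd][vnic] = dict(limits)
    return dict(per_nvd)


def distribute(per_nvd, send):
    """Block 1612: send each NVD only the information for the L2 VNICs it hosts."""
    for nvd, info in per_nvd.items():
        send(nvd, info)
```

Here send stands in for whatever transport the control plane uses to reach an NVD; it is passed in as a callable so that the sketch stays independent of any particular mechanism.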
[0236] Figure 17 is a flowchart of process 1700 for updating the storm control policy based on collected metrics, according to several embodiments. In some embodiments, process 1700 may be performed by the control plane to update storm control information previously sent to the NVD.
[0237] Process 1700 begins in block 1702, where the control plane indicates the types of metrics / statistics to collect to a set of NVDs. In some embodiments, the NVD hosts an L2 VNIC. The types of metrics / statistics may relate to incoming and / or outgoing traffic flows on the L2 VNIC (e.g., FPS, BPS, number of violations, violation rate, duration of violations, etc., for incoming and / or outgoing flows). This L2 VNIC may correspond to ports identified by the customer in the customer input for metric / statistics monitoring. The types of metrics / statistics may be specified by the customer in the input. Additionally or alternatively, the control plane may define the types of metrics / statistics to monitor so that updates can be made to storm control policies and / or escalation policies. The types of metrics / statistics may be included in the storm control information that is defined and sent to the NVD, or in separate information sent to the NVD.
[0238] In block 1704, the control plane receives metrics and / or statistics about traffic flow within the L2 virtual network from a set of NVDs. In some embodiments, the NVDs that have received the above information can collect metrics and / or statistics for each indicated type and report them to the control plane. The control plane can then collect such metrics and / or statistics from multiple NVDs over time.
[0239] In block 1706, the control plane determines updates to the storm control information. In some embodiments, the storm control information is global storm control information applicable to multiple L2 VNICs. In this case, the update may involve changing control parameters (e.g., limits, multipliers) or actions (e.g., performing a link down instead of dropping) for the relevant NVD, depending on metrics and / or statistics related to the multiple L2 VNICs. In other embodiments, the storm control information is individual storm control information applicable to a specific L2 VNIC. In this case, the update may involve changing control parameters (e.g., limits, multipliers) or actions (e.g., performing a link down instead of dropping) for the NVD hosting the L2 VNIC, depending on metrics and / or statistics specific to that L2 VNIC or multiple L2 VNICs.
[0240] In block 1708, the control plane determines the set of NVDs that receive updates. In some embodiments, the updates are for global storm control information. In this case, the NVDs that receive this information are identified. In some embodiments, the updates are for individual storm control information for a particular L2 VNIC hosted on an NVD. In this case, this NVD is identified.
[0241] In block 1710, the control plane transmits the update, or the updated storm control information, to the set of NVDs. In some embodiments, a push mechanism is used.
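One possible form of blocks 1706 and 1708 is sketched below for individual storm control information. The trigger used here (escalating the action from dropping to a link down once the reported number of violations exceeds a threshold) is an assumption chosen for illustration, as are the function names, dictionary keys, and threshold value.

```python
def determine_updates(current_info, metrics, max_violations=3):
    """Block 1706: return updated per-VNIC information where a change is warranted."""
    updates = {}
    for vnic, reported in metrics.items():
        info = current_info[vnic]
        if reported["violations"] > max_violations and info["action"] == "drop":
            # Assumed rule: replace dropping with a link down after repeated violations.
            updates[vnic] = {**info, "action": "link_down"}
    return updates


def nvds_to_update(updates, vnic_to_nvd):
    """Block 1708: identify the NVDs hosting the L2 VNICs whose information changed."""
    return {vnic_to_nvd[vnic] for vnic in updates}
```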
[0242] Figure 18 is a flowchart showing a process 1800 for enforcing storm control according to several embodiments. In some embodiments, process 1800 may be executed by an NVD that communicates with the control plane and runs an L2 VNIC and an L2 virtual switch associated as a pair with a compute instance. The L2 VNIC, L2 virtual switch, and compute instance may belong to the customer's L2 virtual network.
[0243] Process 1800 begins in block 1802, where the NVD hosts L2 VNICs and L2 virtual switches, and receives and stores storm control information associated with the L2 VNICs. In some embodiments, this storm control information is individual storm control information defined for each L2 VNIC and is received from the control plane. In some embodiments, this storm control information is global storm control information defined for multiple L2 VNICs hosted by multiple NVDs and is transmitted to the multiple NVDs by the control plane.
[0244] In block 1804, the NVD monitors traffic flow to and / or from the L2 VNIC. In some embodiments, monitoring is for incoming traffic to the L2 VNIC and is performed based on storm control policies and / or escalation policies indicated by storm control information. For example, storm control information indicates limits on the FPS of incoming frames to the L2 VNIC and / or the BPS of incoming bits. Thus, the FPS and / or BPS of traffic to the L2 VNIC are monitored over time. In other embodiments, monitoring is for outgoing traffic from the L2 VNIC and is performed based on storm control policies and / or escalation policies indicated by storm control information defined for the L2 VNIC or another L2 VNIC. For example, storm control information indicates limits on the FPS of outgoing frames from the L2 VNIC and / or the BPS of outgoing bits. Thus, the FPS and / or BPS of traffic from the L2 VNIC are monitored over time.
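Monitoring FPS and BPS over time can be done in several ways. The sketch below assumes a sliding window that keeps a timestamp and size for each recent frame; the class name SlidingRate and the default window length are illustrative, and the description does not mandate this technique.

```python
from collections import deque


class SlidingRate:
    """Tracks frames and bits observed within the last `window` seconds."""

    def __init__(self, window: float = 1.0):
        self.window = window
        self.events = deque()  # (timestamp, bits) for each recent frame
        self.total_bits = 0

    def _expire(self, now: float) -> None:
        # Discard frames that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, bits = self.events.popleft()
            self.total_bits -= bits

    def record(self, now: float, bits: int) -> None:
        """Register one frame of the given size observed at time `now`."""
        self.events.append((now, bits))
        self.total_bits += bits
        self._expire(now)

    def rates(self, now: float):
        """Return (FPS, BPS) averaged over the window ending at `now`."""
        self._expire(now)
        return len(self.events) / self.window, self.total_bits / self.window
```

The returned pair can then be compared against the FPS and / or BPS limits indicated by the storm control information for the monitored direction.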
[0245] In block 1806, NVD determines whether a storm control policy violation has been detected. In some embodiments, the storm control information indicates a storm control policy that specifies traffic flow conditions. The monitored traffic flow is compared to the traffic flow conditions to determine whether a violation has occurred. For example, if an FPS / BPS limit is defined by the storm control policy, and the monitored FPS / BPS exceeds this limit (e.g., maximum transmission rate), a violation is detected. If a storm control policy violation is detected, block 1810 follows block 1806. Otherwise, block 1820 follows block 1806.
[0246] In block 1810, the NVD determines the type of violation. In some embodiments, the storm control policy indicates a duration. If the duration of a violation exceeds that duration, an unacceptable persistent violation is detected. Otherwise, the violation is determined to be non-persistent. In other embodiments, the storm control policy indicates a violation rate (e.g., the number of violations per unit time). If the number of violations detected per unit time exceeds the violation rate, an unacceptable frequent violation is detected. Otherwise, the violation is determined to be rare.
[0247] In block 1812, the NVD initiates action based on the type of violation. In some embodiments, the storm control policy indicates the action to be taken when the violation is non-persistent and / or infrequent. Otherwise, the storm control policy points to an escalation policy that indicates the action to be taken (e.g., an escalated action to be taken when the violation is persistent and / or frequent). The NVD takes action (e.g., drops frames, links down the L2 VNIC, etc.).
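The classification in block 1810 and the choice of action in block 1812 can be illustrated with the following sketch. The function names, the default action labels, and the representation of violations as timestamps are assumptions for this example only.

```python
def is_persistent(violation_start, now, max_duration):
    """Duration-based check: the violation has lasted longer than the policy allows."""
    return (now - violation_start) > max_duration


def is_frequent(violation_times, now, unit, max_per_unit):
    """Rate-based check: more violations in the last `unit` seconds than allowed."""
    recent = [t for t in violation_times if now - unit < t <= now]
    return len(recent) > max_per_unit


def choose_action(persistent, frequent,
                  policy_action="drop", escalated_action="link_down"):
    """Use the escalation policy's action for persistent or frequent violations."""
    if persistent or frequent:
        return escalated_action
    return policy_action
```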
[0248] In block 1820, the NVD sends frames to and from the L2 VNIC. No violations are detected. Therefore, storm control does not need to be applied, and incoming and / or outgoing traffic remains unchanged.
[0249] In block 1822, the NVD collects metrics and / or statistics and transmits them to the control plane. The types of metrics and / or statistics may be pre-indicated to the NVD by the control plane, as described above. The NVD can report the collected metrics and / or statistics via a push or pull mechanism.
[0250] C. Exemplary Infrastructure as a Service Architecture As mentioned above, IaaS (Infrastructure as a Service) is a specific type of cloud computing. IaaS may be configured to provide virtualized computing resources over a public network (e.g., the internet). In the IaaS model, a cloud computing provider can host infrastructure elements (e.g., servers, storage, network nodes (e.g., hardware), deployment software, platform virtualization (e.g., hypervisor layer)). In some cases, the IaaS provider can provide various services associated with the infrastructure elements (e.g., billing, monitoring, logging, security, load balancing, and clustering). Therefore, since these services can be policy-driven, IaaS users can implement policies to drive load balancing in order to maintain application availability and performance.
[0251] In some cases, IaaS customers can access resources and services over a wide area network (WAN), such as the internet, and install the rest of their application stack using the cloud provider's services. For example, a user can log into an IaaS platform, create virtual machines (VMs), install an operating system (OS) on each VM, deploy middleware such as databases, create storage buckets for workloads and backups, and install enterprise software on the VMs. Customers can use the provider's services to perform a variety of functions, including balancing network traffic, troubleshooting applications, monitoring performance, and managing disaster recovery.
[0252] In most cases, the cloud computing model requires the participation of a cloud provider. This cloud provider may or may not be a third-party service specializing in IaaS provision (e.g., offering, renting, or selling). Alternatively, a company can become a provider of private clouds and infrastructure services.
[0253] In some cases, IaaS deployment is the process of deploying a new application or a new version of an application to a pre-configured application server. IaaS deployment may also include the process of preparing the server (e.g., installing libraries, daemons, etc.). IaaS deployment is often managed by the cloud provider below the hypervisor layer (e.g., servers, storage, network hardware, and virtualization). The customer can therefore handle deployment of the OS, middleware, and / or applications (e.g., on self-service virtual machines, which can be spun up on demand).
[0254] In some cases, IaaS provisioning may include acquiring the computers or virtual hosts to be used and installing the necessary libraries or services on those computers or virtual hosts. In most cases, deployment does not include provisioning, and provisioning must be performed first.
[0255] In some cases, IaaS provisioning presents two distinct challenges. First, there's the challenge of provisioning an initial set of infrastructure before doing anything. Second, there's the challenge of evolving existing infrastructure after everything has been provisioned (e.g., adding new services, modifying services, removing services). In some cases, these two challenges can be addressed by enabling the declarative definition of infrastructure configuration. In other words, the infrastructure (e.g., what elements are needed and how these elements interact) may be defined by one or more configuration files. Thus, the overall topology of the infrastructure (e.g., which resources depend on which and how they work together) can be described declaratively. In some examples, once the topology is defined, workflows can be generated to create and / or manage the different elements described in the configuration files.
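As a simple illustration of deriving a workflow from a declaratively described topology, the sketch below orders infrastructure elements so that each is created after the elements it depends on. The element names and the dictionary format are hypothetical; the ordering uses the topological sorter from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical declarative configuration: each element lists what it depends on.
config = {
    "vcn": [],
    "subnet": ["vcn"],
    "security_rules": ["vcn"],
    "vm": ["subnet", "security_rules"],
    "load_balancer": ["subnet", "vm"],
}


def provisioning_order(topology):
    """Return the elements in an order that respects every declared dependency."""
    return list(TopologicalSorter(topology).static_order())
```

Several valid orders may exist for the same configuration; the sorter returns one of them, and raises graphlib.CycleError if the declared dependencies form a cycle.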
[0256] In some examples, infrastructure can include many interconnected elements. For example, there may be one or more virtual private clouds (VPCs) (e.g., potentially on-demand pools of configurable and / or shared compute resources), also known as core networks. In some examples, there may be one or more security group rules provisioned to define how network security is configured, and one or more virtual machines (VMs). Other infrastructure elements such as load balancers and databases may also be provisioned. As more and more infrastructure elements are desired and / or added, infrastructure can evolve incrementally.
[0257] In some examples, sequential deployment techniques may be employed to enable the deployment of infrastructure code across various virtual computing environments. Furthermore, the techniques described can enable infrastructure management within these environments. In some examples, a service team may write code that is intended to be deployed to one or more different production environments, typically many different geographical locations, sometimes across the globe. However, in some examples, the infrastructure for deploying the code must first be configured. In some examples, provisioning can be done manually, resources can be provisioned using provisioning tools, and / or the code can be deployed using deployment tools after the infrastructure has been provisioned.
[0258] Figure 19 is a block diagram 1900 illustrating an exemplary pattern of an IaaS architecture according to at least one embodiment. The service operator 1902 may be communicatively coupled to a secure host tenancy 1904 which may include a virtual cloud network (VCN) 1906 and a secure host subnet 1908. In some examples, the service operator 1902 may use one or more client computing devices. One or more client computing devices may be handheld mobile devices (e.g., iPhone®, mobile phones, iPad®, tablets, personal digital assistants (PDAs) or wearable devices (Google® Glass® head-mounted displays)) with Internet, email, short message service (SMS), BlackBerry®, or other communication protocols enabled, and which can run software such as Microsoft Windows Mobile® and / or various mobile operating systems such as iOS, Windows Phone, Android®, BlackBerry 8, and Palm OS. The client computing devices may also be general-purpose personal computers, including, by way of example, personal computers and / or laptop computers, which run various versions of the Microsoft Windows® operating system, Apple Macintosh® operating system, and / or Linux® operating system. Alternatively, the client computing devices may be workstation computers running various commercially available UNIX® or UNIX-like operating systems, including, but not limited to, various GNU / Linux operating systems and, for example, Google Chrome® OS. Alternatively or additionally, the client computing device may be other electronic devices that can communicate via a network that has access to VCN1906 and / or the Internet, such as a thin client computer, an Internet-enabled game system (e.g., a Microsoft Xbox® game console with or without a Kinect® gesture input device), and / or a personal messaging device.
[0259] VCN1906 may include a local peering gateway (LPG) 1910 that can communicatively connect to Secure Shell (SSH) VCN1912 via LPG1910 contained within SSH VCN1912. SSH VCN1912 may include an SSH subnet 1914, and SSH VCN1912 can communicatively connect to control plane VCN1916 via LPG1910 contained within control plane VCN1916. Furthermore, SSH VCN1912 can communicatively connect to data plane VCN1918 via LPG1910. Control plane VCN1916 and data plane VCN1918 may be contained within a service tenancy 1919, which may be owned and / or operated by an IaaS provider.
[0260] The control plane VCN1916 may include a control plane DMZ (demilitarized zone) layer 1920 that functions as a perimeter network (e.g., the portion of the corporate network between the corporate intranet and the external network). DMZ-based servers may have restricted responsibilities and can help contain security breaches. Furthermore, the DMZ layer 1920 may include one or more load balancer (LB) subnets 1922, and the control plane VCN1916 may include a control plane application layer 1924 that may include application subnets 1926 and a control plane data layer 1928 that may include database (DB) subnets 1930 (e.g., a front-end DB subnet and / or a back-end DB subnet). The LB subnet 1922 included in the control plane DMZ layer 1920 may be communicatively coupled to the application subnet 1926 included in the control plane application layer 1924 and to an internet gateway 1934 which may be included in the control plane VCN 1916. The application subnet 1926 may be communicatively coupled to the DB subnet 1930 included in the control plane data layer 1928, to a service gateway 1936, and to a network address translation (NAT) gateway 1938. The control plane VCN 1916 may include the service gateway 1936 and the NAT gateway 1938.
[0261] The control plane VCN 1916 may include a data plane mirror application layer 1940, which may include an application subnet 1926. The application subnet 1926 included in the data plane mirror application layer 1940 may include a virtual network interface controller (VNIC) 1942 on which compute instance 1944 can run. Compute instance 1944 can communicatively connect the application subnet 1926 of the data plane mirror application layer 1940 to an application subnet 1926 that may be included in the data plane application layer 1946.
[0262] The data plane VCN1918 may include a data plane application layer 1946, a data plane DMZ layer 1948, and a data plane data layer 1950. The data plane DMZ layer 1948 may include an LB subnet 1922 that can be communicatively coupled to the application subnet 1926 of the data plane application layer 1946 and the internet gateway 1934 of the data plane VCN1918. The application subnet 1926 may be communicatively coupled to the service gateway 1936 of the data plane VCN1918 and the NAT gateway 1938 of the data plane VCN1918. The data plane data layer 1950 may also include a DB subnet 1930 that can be communicatively coupled to the application subnet 1926 of the data plane application layer 1946.
[0263] The Internet gateway 1934 of the control plane VCN1916 and the Internet gateway 1934 of the data plane VCN1918 may be communicatively coupled to a metadata management service 1952 which can be communicatively coupled to the public internet 1954. The public internet 1954 may be communicatively coupled to the NAT gateway 1938 of the control plane VCN1916 and the NAT gateway 1938 of the data plane VCN1918. The service gateway 1936 of the control plane VCN1916 and the service gateway 1936 of the data plane VCN1918 may be communicatively coupled to a cloud service 1956.
[0264] In some cases, a service gateway 1936 of the control plane VCN1916 or data plane VCN1918 can make application programming interface (API) calls to a cloud service 1956 without going through the public internet 1954. API calls from the service gateway 1936 to the cloud service 1956 can be one-way. The service gateway 1936 can make API calls to the cloud service 1956, and the cloud service 1956 can send the requested data to the service gateway 1936. However, the cloud service 1956 may not initiate an API call to the service gateway 1936.
[0265] In some examples, secure host tenancy 1904 may be directly connected to a potentially isolated service tenancy 1919. Secure host subnet 1908 can communicate with SSH subnet 1914 via LPG 1910, which enables bidirectional communication with isolated systems. By connecting secure host subnet 1908 to SSH subnet 1914, secure host subnet 1908 can access other entities within service tenancy 1919.
[0266] The control plane VCN1916 allows users of service tenancy 1919 to configure or provision desired resources. Desired resources provisioned in the control plane VCN1916 may be deployed or used in the data plane VCN1918. In some examples, the control plane VCN1916 may be isolated from the data plane VCN1918, and the data plane mirror application layer 1940 of the control plane VCN1916 can communicate with the data plane application layer 1946 of the data plane VCN1918 via a VNIC 1942 which may be included in the data plane mirror application layer 1940 and the data plane application layer 1946.
[0267] In some examples, a system user or customer may make requests, such as create, read, update, or delete (CRUD) operations, via the public internet 1954, which can communicate the requests to the metadata management service 1952. The metadata management service 1952 can communicate the requests to the control plane VCN 1916 via the internet gateway 1934. The requests may be received by an LB subnet 1922 included in the control plane DMZ layer 1920. The LB subnet 1922 may determine that the request is valid, and in response to this determination, the LB subnet 1922 may send the request to an application subnet 1926 included in the control plane application layer 1924. If the request is validated and requires a call to the public internet 1954, the call can be sent to a NAT gateway 1938, which can make the call to the public internet 1954. Data to be stored for the request may be stored in the DB subnet 1930.
[0268] In some cases, the data plane mirror application layer 1940 can facilitate direct communication between the control plane VCN1916 and the data plane VCN1918. For example, it may be desirable that changes, updates, or other appropriate modifications to the configuration be applied to the resources contained in the data plane VCN1918. Since the control plane VCN1916 can communicate directly with the resources contained in the data plane VCN1918 via VNIC1942, it can perform changes, updates, or other appropriate modifications to the configuration.
[0269] In some embodiments, the control plane VCN1916 and the data plane VCN1918 may be included in the service tenancy 1919. In this case, the system user or customer does not have to own or operate either the control plane VCN1916 or the data plane VCN1918. Instead, the IaaS provider may own or operate the control plane VCN1916 and the data plane VCN1918, and both may be included in the service tenancy 1919. This embodiment can prevent a user or customer from interacting with other users' resources or other customers' resources by enabling network isolation. This embodiment can also enable a system user or customer to store databases privately without having to rely on the public internet 1954, which may not have the desired level of security for storage.
[0270] In another embodiment, the LB subnet 1922 included in the control plane VCN 1916 may be configured to receive signals from the service gateway 1936. In this embodiment, the control plane VCN 1916 and the data plane VCN 1918 may be configured to be invoked by the IaaS provider's customers without calling the public internet 1954. The IaaS provider's customers may prefer this embodiment because the database used by the customer may be stored in a service tenancy 1919 that is controlled by the IaaS provider and can be isolated from the public internet 1954.
[0271] FIG. 20 is a block diagram 2000 showing another exemplary pattern of an IaaS architecture, according to at least one embodiment. A service operator 2002 (e.g., service operator 1902 of FIG. 19) may be communicatively coupled to a secure host tenancy 2004 (e.g., secure host tenancy 1904 of FIG. 19) that may include a virtual cloud network (VCN) 2006 (e.g., VCN 1906 of FIG. 19) and a secure host subnet 2008 (e.g., secure host subnet 1908 of FIG. 19). The VCN 2006 can include a local peering gateway (LPG) 2010 (e.g., LPG 1910 of FIG. 19) that can be communicatively coupled to a secure shell (SSH) VCN 2012 (e.g., SSH VCN 1912 of FIG. 19) via the LPG 2010 included in the SSH VCN 2012. The SSH VCN 2012 can include an SSH subnet 2014 (e.g., SSH subnet 1914 of FIG. 19), and the SSH VCN 2012 can be communicatively coupled to a control plane VCN 2016 (e.g., control plane VCN 1916 of FIG. 19) via the LPG 2010 included in the control plane VCN 2016. The control plane VCN 2016 may be included in a service tenancy 2019 (e.g., service tenancy 1919 of FIG. 19), and a data plane VCN 2018 (e.g., data plane VCN 1918 of FIG. 19) may be included in a customer tenancy 2021 that may be owned or operated by a user or customer of the system.
[0272] The control plane VCN2016 may include a control plane DMZ tier 2020 (e.g., control plane DMZ tier 1920 in Figure 19) which may include an LB subnet 2022 (e.g., LB subnet 1922 in Figure 19), a control plane application tier 2024 (e.g., control plane application tier 1924 in Figure 19) which may include an application subnet 2026 (e.g., application subnet 1926 in Figure 19), and a control plane data tier 2028 (e.g., control plane data tier 1928 in Figure 19) which may include a database (DB) subnet 2030 (e.g., similar to DB subnet 1930 in Figure 19). The LB subnet 2022 included in the control plane DMZ tier 2020 may be communicatively coupled to the application subnet 2026 included in the control plane application tier 2024 and to an internet gateway 2034 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN2016. The application subnet 2026 may be communicatively coupled to the DB subnet 2030 included in the control plane data tier 2028, to a service gateway 2036 (e.g., the service gateway in Figure 19), and to a network address translation (NAT) gateway 2038 (e.g., NAT gateway 1938 in Figure 19). The control plane VCN 2016 may include the service gateway 2036 and the NAT gateway 2038.
[0273] The control plane VCN 2016 can include a data plane mirror app layer 2040 (e.g., the data plane mirror app layer 1940 of FIG. 19) that can include an app subnet 2026. The app subnet 2026 included in the data plane mirror app layer 2040 can include a virtual network interface controller (VNIC) 2042 (e.g., VNIC 1942) that can execute a compute instance 2044 (similar to the compute instance 1944 of FIG. 19). The compute instance 2044 can facilitate communication between the app subnet 2026 of the data plane mirror app layer 2040 and an app subnet 2026 that may be included in the data plane app layer 2046 (e.g., the data plane app layer 1946 of FIG. 19) via the VNIC 2042 included in the data plane mirror app layer 2040 and the VNIC 2042 included in the data plane app layer 2046.
[0274] The internet gateway 2034 included in the control plane VCN 2016 may be communicatively coupled to a metadata management service 2052 (e.g., the metadata management service 1952 of FIG. 19) that may be communicatively coupled to the public internet 2054 (e.g., the public internet 1954 of FIG. 19). The public internet 2054 may be communicatively coupled to the NAT gateway 2038 included in the control plane VCN 2016. The service gateway 2036 included in the control plane VCN 2016 may be communicatively coupled to a cloud service 2056 (e.g., the cloud service 1956 of FIG. 19).
[0275] In some examples, the data plane VCN2018 may be included in customer tenancy 2021. In this case, the IaaS provider can provide a control plane VCN2016 per customer, and the IaaS provider can configure a unique compute instance 2044 for each customer, included in service tenancy 2019. Each compute instance 2044 can allow communication between the control plane VCN2016 included in service tenancy 2019 and the data plane VCN2018 included in customer tenancy 2021. The compute instance 2044 can allow resources provisioned in the control plane VCN2016 included in service tenancy 2019 to be deployed or used in the data plane VCN2018 included in customer tenancy 2021.
[0276] In another example, an IaaS provider's customer may have a database residing in customer tenancy 2021. In this example, control plane VCN 2016 may include a data plane mirror app tier 2040 that can include app subnet 2026. The data plane mirror app tier 2040 may reside in data plane VCN 2018, but does not have to reside in data plane VCN 2018. That is, the data plane mirror app tier 2040 can access customer tenancy 2021, but does not have to reside in data plane VCN 2018 and does not have to be owned or operated by the IaaS provider's customer. The data plane mirror app tier 2040 may be configured to make calls to data plane VCN 2018, but does not have to be configured to make calls to any entity contained in control plane VCN 2016. The customer may wish to deploy or otherwise use, in the data plane VCN 2018, resources that are provisioned in the control plane VCN 2016, and the data plane mirror app tier 2040 can facilitate the customer's desired deployment or other use of those resources.
[0277] In some embodiments, a customer of the IaaS provider can apply filters to the data plane VCN2018. In this embodiment, the customer can determine what the data plane VCN2018 can access and can restrict access from the data plane VCN2018 to the public internet 2054. The IaaS provider may not be able to apply filters or control access from the data plane VCN2018 to any external network or database. Applying filters and controls to the data plane VCN2018 included in the customer tenancy 2021 can help isolate the data plane VCN2018 from other customers and the public internet 2054.
[0278] In some embodiments, cloud service 2056 can be invoked by service gateway 2036 to access services that may not reside on the public internet 2054, on control plane VCN2016, or on data plane VCN2018. The connection between cloud service 2056 and control plane VCN2016 or data plane VCN2018 does not have to be live or continuous. Cloud service 2056 may reside on a separate network owned or operated by the IaaS provider. Cloud service 2056 may be configured to receive calls from service gateway 2036 and not to receive calls from the public internet 2054. Some cloud services 2056 may be isolated from other cloud services 2056, and control plane VCN2016 may be isolated from cloud services 2056 that may not be located in the same region as control plane VCN2016. For example, control plane VCN2016 may be located in "Region 1", and cloud service "Deployment 19" may be located in "Region 1" and "Region 2". If a call to deployment 19 is made by a service gateway 2036 included in the control plane VCN2016 located in region 1, this call may be sent to deployment 19 in region 1. In this example, the control plane VCN2016 or deployment 19 in region 1 does not need to be communicably coupled with deployment 19 in region 2.
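The region-local routing in the "Deployment 19" example can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the `DEPLOYMENTS` table and the `route_call` function are hypothetical names, and the sketch assumes that a call issued by a service gateway is served only by a deployment in the caller's own region.

```python
# Hypothetical table: cloud service name -> regions where it is deployed.
DEPLOYMENTS = {"Deployment 19": {"Region 1", "Region 2"}}


def route_call(service: str, caller_region: str) -> str:
    """Return the region whose deployment should serve the call.

    Assumption for this sketch: a call is never forwarded to a
    deployment outside the caller's region, so no cross-region
    coupling between deployments is needed.
    """
    regions = DEPLOYMENTS.get(service, set())
    if caller_region not in regions:
        raise LookupError(f"{service} is not deployed in {caller_region}")
    return caller_region
```

Under these assumptions, a call from the control plane VCN in "Region 1" resolves to the "Region 1" deployment without consulting "Region 2" at all.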
[0279] Figure 21 is a block diagram 2100 showing another exemplary pattern of an IaaS architecture according to at least one embodiment. A service operator 2102 (e.g., service operator 1902 in Figure 19) may be communicatively coupled to a secure host tenancy 2104 (e.g., secure host tenancy 1904 in Figure 19), which may include a virtual cloud network (VCN) 2106 (e.g., VCN 1906 in Figure 19) and a secure host subnet 2108 (e.g., secure host subnet 1908 in Figure 19). VCN 2106 may include an LPG 2110 (e.g., LPG 1910 in Figure 19), which may be communicatively coupled to an SSH VCN 2112 (e.g., SSH VCN 1912 in Figure 19) via an LPG 2110 contained in the SSH VCN 2112. SSH VCN 2112 may include SSH subnet 2114 (e.g., SSH subnet 1914 in Figure 19), and SSH VCN 2112 may be communicatively coupled to control plane VCN 2116 (e.g., control plane VCN 1916 in Figure 19) via LPG 2110 included in control plane VCN 2116, and may be communicatively coupled to data plane VCN 2118 (e.g., data plane VCN 1918 in Figure 19) via LPG 2110 included in data plane VCN 2118. Control plane VCN 2116 and data plane VCN 2118 may be included in service tenancy 2119 (e.g., service tenancy 1919 in Figure 19).
[0280] The control plane VCN2116 may include a control plane DMZ layer 2120 (e.g., control plane DMZ layer 1920 in Figure 19) which may include a load balancer (LB) subnet 2122 (e.g., LB subnet 1922 in Figure 19), a control plane application layer 2124 (e.g., control plane application layer 1924 in Figure 19) which may include an application subnet 2126 (e.g., similar to application subnet 1926 in Figure 19), and a control plane data layer 2128 (e.g., control plane data layer 1928 in Figure 19) which may include a DB subnet 2130. The LB subnet 2122 included in the control plane DMZ layer 2120 may be communicably coupled to the application subnet 2126 included in the control plane application layer 2124 and to an internet gateway 2134 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN2116. The application subnet 2126 may be communicatively coupled to the DB subnet 2130 included in the control plane data layer 2128, and to the service gateway 2136 (e.g., the service gateway in Figure 19) and the network address translation (NAT) gateway 2138 (e.g., the NAT gateway 1938 in Figure 19). The control plane VCN 2116 may include the service gateway 2136 and the NAT gateway 2138.
[0281] The data plane VCN2118 may include a data plane application layer 2146 (e.g., data plane application layer 1946 in Figure 19), a data plane DMZ layer 2148 (e.g., data plane DMZ layer 1948 in Figure 19), and a data plane data layer 2150 (e.g., data plane data layer 1950 in Figure 19). The data plane DMZ layer 2148 may include an LB subnet 2122 that can be communicatively coupled to the trusted application subnet 2160 and untrusted application subnet 2162 of the data plane application layer 2146 and Internet gateway 2134 included in the data plane VCN2118. The trusted application subnet 2160 may be communicatively coupled to the service gateway 2136 included in the data plane VCN2118, the NAT gateway 2138 included in the data plane VCN2118, and the DB subnet 2130 included in the data plane data layer 2150. The untrusted application subnet 2162 may be communicatively coupled to the service gateway 2136 included in the data plane VCN 2118 and to the DB subnet 2130 included in the data plane data layer 2150. The data plane data layer 2150 may include the DB subnet 2130, which can be communicatively coupled to the service gateway 2136 included in the data plane VCN 2118.
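The couplings listed for the data plane VCN 2118 can be read as an undirected graph. The sketch below records only the couplings stated in the preceding paragraph; the `COUPLINGS` list, `build_graph`, and `is_coupled` are hypothetical names used for illustration, and treating every coupling as symmetric is an assumption of the sketch.

```python
from collections import defaultdict

# Couplings stated for the data plane VCN 2118, as undirected edges.
COUPLINGS = [
    ("LB subnet 2122", "trusted application subnet 2160"),
    ("LB subnet 2122", "untrusted application subnet 2162"),
    ("LB subnet 2122", "Internet gateway 2134"),
    ("trusted application subnet 2160", "service gateway 2136"),
    ("trusted application subnet 2160", "NAT gateway 2138"),
    ("trusted application subnet 2160", "DB subnet 2130"),
    ("untrusted application subnet 2162", "service gateway 2136"),
    ("untrusted application subnet 2162", "DB subnet 2130"),
    ("DB subnet 2130", "service gateway 2136"),
]


def build_graph(edges):
    """Build an adjacency map, treating each coupling as symmetric."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


GRAPH = build_graph(COUPLINGS)


def is_coupled(a: str, b: str) -> bool:
    """True if the two elements are directly coupled in the sketch."""
    return b in GRAPH.get(a, set())
```

One point the graph makes visible is that only the trusted application subnet 2160 is coupled to the NAT gateway 2138; the untrusted application subnet 2162 is not.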
[0282] An untrusted application subnet 2162 may include one or more primary VNICs 2164(1)-(N) that can be communicatively coupled to tenant virtual machines (VMs) 2166(1)-(N). Each tenant VM 2166(1)-(N) may be communicatively coupled to a respective application subnet 2167(1)-(N) that can be included in a respective container egress VCN 2168(1)-(N), which in turn can be included in a respective customer tenancy 2170(1)-(N). Respective secondary VNICs 2172(1)-(N) can facilitate communication between the untrusted application subnet 2162 included in the data plane VCN 2118 and the application subnets included in the container egress VCNs 2168(1)-(N). Each container egress VCN 2168(1)-(N) may include a NAT gateway 2138 that can be communicatively coupled to the public internet 2154 (e.g., public internet 1954 in Figure 19).
[0283] The Internet gateway 2134 included in the control plane VCN 2116 and the Internet gateway 2134 included in the data plane VCN 2118 may be communicatively coupled to a metadata management service 2152 (e.g., the metadata management service 1952 in Figure 19) which can be communicatively coupled to the public internet 2154. The public internet 2154 may be communicatively coupled to the NAT gateway 2138 included in the control plane VCN 2116 and the NAT gateway 2138 included in the data plane VCN 2118. The service gateway 2136 included in the control plane VCN 2116 and the service gateway 2136 included in the data plane VCN 2118 may be communicatively coupled to a cloud service 2156.
[0284] In some embodiments, the data plane VCN2118 may be integrated with the customer tenancy 2170. This integration may be useful or desirable for the IaaS provider's customer in some cases, such as when they may want support when executing code. The customer may provide code that, when executed, could be destructive, could communicate with other customer resources, or could cause undesirable effects. Thus, the IaaS provider can determine whether or not to execute the code that the customer has provided to the IaaS provider.
[0285] In some examples, an IaaS provider's customer may grant the IaaS provider temporary network access and request functionality to be added to the data plane application layer 2146. The code for executing the functionality may run on VMs 2166(1)-(N), and may not be configured to run anywhere else on the data plane VCN 2118. Each VM 2166(1)-(N) may be connected to one customer tenancy 2170. Each container 2171(1)-(N) contained within VMs 2166(1)-(N) may be configured to run the code. In this case, a double isolation may exist (for example, containers 2171(1)-(N) run the code, and containers 2171(1)-(N) may be contained within VMs 2166(1)-(N) that are in at least an untrusted application subnet 2162), which can help prevent erroneous or undesirable code from damaging the IaaS provider's network or the networks of different customers. Containers 2171(1)-(N) may be communicatively coupled to customer tenancy 2170 and may be configured to send data to or receive data from customer tenancy 2170. Containers 2171(1)-(N) do not have to be configured to send data to or receive data from any other entities in the data plane VCN 2118. Once code execution is complete, the IaaS provider may kill or discard containers 2171(1)-(N).
[0286] In some embodiments, a trusted application subnet 2160 may execute code that may be owned or operated by the IaaS provider. In this embodiment, the trusted application subnet 2160 may be communicatively coupled to a DB subnet 2130 and configured to perform CRUD operations in the DB subnet 2130. An untrusted application subnet 2162 may be communicatively coupled to a DB subnet 2130, but in this embodiment, the untrusted application subnet may be configured to perform read operations within the DB subnet 2130. Containers 2171(1)-(N) contained in each customer's VM 2166(1)-(N) and capable of executing code from the customer do not need to be communicatively coupled to the DB subnet 2130.
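The differing access levels described in this embodiment can be sketched as a small permission table. This is an illustrative assumption, not the disclosed mechanism: the `Op` enum, the `PERMISSIONS` table, and `is_allowed` are hypothetical names, and the customer containers are modeled as having no DB operations because they need not be coupled to the DB subnet 2130.

```python
from enum import Enum


class Op(Enum):
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"


# Hypothetical permission table for operations in the DB subnet 2130.
PERMISSIONS = {
    # Trusted subnet: full CRUD.
    "trusted application subnet 2160": {Op.CREATE, Op.READ, Op.UPDATE, Op.DELETE},
    # Untrusted subnet: read operations only.
    "untrusted application subnet 2162": {Op.READ},
    # Customer containers: modeled here as having no DB access.
    "container 2171": set(),
}


def is_allowed(source: str, op: Op) -> bool:
    """Return True if the source may perform the operation; unknown sources are denied."""
    return op in PERMISSIONS.get(source, set())
```

Denying unknown sources by default is a design choice of the sketch, in keeping with the isolation goals described above.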
[0287] In other embodiments, the control plane VCN2116 and the data plane VCN2118 do not have to be directly coupled in a communicative manner. In this embodiment, direct communication between the control plane VCN2116 and the data plane VCN2118 may not exist. However, indirect communication by at least one method may exist. An LPG2110 that facilitates communication between the control plane VCN2116 and the data plane VCN2118 may be established by the IaaS provider. In another example, the control plane VCN2116 or the data plane VCN2118 can make a call to the cloud service 2156 via the service gateway 2136. For example, a call from the control plane VCN2116 to the cloud service 2156 may include a request for a service that can communicate with the data plane VCN2118.
[0288] Figure 22 is a block diagram 2200 showing another exemplary pattern of an IaaS architecture according to at least one embodiment. A service operator 2202 (e.g., service operator 1902 in Figure 19) may be communicatively coupled to a secure host tenancy 2204 (e.g., secure host tenancy 1904 in Figure 19), which may include a virtual cloud network (VCN) 2206 (e.g., VCN 1906 in Figure 19) and a secure host subnet 2208 (e.g., secure host subnet 1908 in Figure 19). VCN 2206 may include an LPG 2210 (e.g., LPG 1910 in Figure 19), which may be communicatively coupled to an SSH VCN 2212 (e.g., SSH VCN 1912 in Figure 19) via an LPG 2210 contained in the SSH VCN 2212. SSH VCN 2212 may include SSH subnet 2214 (e.g., SSH subnet 1914 in Figure 19), and SSH VCN 2212 may be communicatively coupled to control plane VCN 2216 (e.g., control plane VCN 1916 in Figure 19) via LPG 2210 included in control plane VCN 2216, and may be communicatively coupled to data plane VCN 2218 (e.g., data plane VCN 1918 in Figure 19) via LPG 2210 included in data plane VCN 2218. Control plane VCN 2216 and data plane VCN 2218 may be included in service tenancy 2219 (e.g., service tenancy 1919 in Figure 19).
[0289] The control plane VCN2216 may include a control plane DMZ layer 2220 (e.g., control plane DMZ layer 1920 in Figure 19) which may include an LB subnet 2222 (e.g., LB subnet 1922 in Figure 19), a control plane application layer 2224 (e.g., control plane application layer 1924 in Figure 19) which may include an application subnet 2226 (e.g., application subnet 1926 in Figure 19), and a control plane data tier 2228 (e.g., control plane data tier 1928 in Figure 19) which may include a DB subnet 2230 (e.g., DB subnet 2130 in Figure 21). The LB subnet 2222 included in the control plane DMZ layer 2220 may be communicably coupled to the application subnet 2226 included in the control plane application layer 2224 and to an internet gateway 2234 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN2216. The application subnet 2226 may be communicatively coupled to the DB subnet 2230 included in the control plane data layer 2228, and to the service gateway 2236 (e.g., the service gateway in Figure 19) and the network address translation (NAT) gateway 2238 (e.g., the NAT gateway 1938 in Figure 19). The control plane VCN 2216 may include the service gateway 2236 and the NAT gateway 2238.
[0290] The data plane VCN 2218 may include a data plane application layer 2246 (e.g., data plane application layer 1946 in Figure 19), a data plane DMZ layer 2248 (e.g., data plane DMZ layer 1948 in Figure 19), and a data plane data layer 2250 (e.g., data plane data layer 1950 in Figure 19). The data plane DMZ layer 2248 may include an LB subnet 2222 that can be communicatively coupled to the trusted application subnet 2260 (e.g., trusted application subnet 2160 in Figure 21) and the untrusted application subnet 2262 (e.g., untrusted application subnet 2162 in Figure 21) of the data plane application layer 2246, and to the internet gateway 2234 included in the data plane VCN 2218. The trusted application subnet 2260 may be communicatively coupled to a service gateway 2236 included in data plane VCN 2218, a NAT gateway 2238 included in data plane VCN 2218, and a DB subnet 2230 included in data plane data layer 2250. The untrusted application subnet 2262 may be communicatively coupled to the service gateway 2236 included in data plane VCN 2218 and to the DB subnet 2230 included in data plane data layer 2250. The data plane data layer 2250 may include the DB subnet 2230, which can be communicatively coupled to the service gateway 2236 included in data plane VCN 2218.
[0291] An untrusted application subnet 2262 may include primary VNICs 2264(1)-(N) that can be communicatively coupled to tenant virtual machines (VMs) 2266(1)-(N) residing in the untrusted application subnet 2262. Each tenant VM 2266(1)-(N) may execute code in its respective container 2267(1)-(N) and may be communicatively coupled to an application subnet 2226 that may be included in a data plane application layer 2246 that may be included in a container egress VCN 2268. Respective secondary VNICs 2272(1)-(N) can facilitate communication between the untrusted application subnet 2262 included in the data plane VCN 2218 and the application subnet included in the container egress VCN 2268. The container egress VCN 2268 may include a NAT gateway 2238 that can be communicatively coupled to the public internet 2254 (e.g., public internet 1954 in Figure 19).
[0292] The Internet gateway 2234 included in the control plane VCN 2216 and the Internet gateway 2234 included in the data plane VCN 2218 may be communicatively coupled to a metadata management service 2252 (e.g., the metadata management service 1952 in Figure 19) which can be communicatively coupled to the public internet 2254. The public internet 2254 may be communicatively coupled to the NAT gateway 2238 included in the control plane VCN 2216 and the NAT gateway 2238 included in the data plane VCN 2218. The service gateway 2236 included in the control plane VCN 2216 and the service gateway 2236 included in the data plane VCN 2218 may be communicatively coupled to a cloud service 2256.
[0293] In some examples, the pattern shown by the architecture in block diagram 2200 of Figure 22 can be considered an exception to the pattern shown by the architecture in block diagram 2100 of Figure 21, and may be desirable for the IaaS provider's customers when the IaaS provider cannot communicate directly with the customers (e.g., in a disconnected region). Each customer can access, in real time, the respective containers 2267(1)-(N) contained within that customer's VMs 2266(1)-(N). Containers 2267(1)-(N) may be configured to call the respective secondary VNICs 2272(1)-(N) contained within the application subnet 2226 of the data plane application layer 2246, which may be contained within the container egress VCN 2268. Secondary VNICs 2272(1)-(N) can send the calls to a NAT gateway 2238, which can send the calls to the public internet 2254. In this example, the containers 2267(1)-(N) that customers can access in real time may be isolated from the control plane VCN 2216 and from other entities included in the data plane VCN 2218. Furthermore, the containers 2267(1)-(N) may be isolated from other customers' resources.
[0294] In other examples, a customer can call the cloud service 2256 using the containers 2267(1)-(N). In this example, the customer can execute code in the containers 2267(1)-(N) that requests a service from the cloud service 2256. The containers 2267(1)-(N) can send the request to the secondary VNICs 2272(1)-(N), which can send the request to a NAT gateway, which can send the request to the public internet 2254. The public internet 2254 can send this request to the LB subnet 2222 included in the control plane VCN 2216 via the internet gateway 2234. In response to determining that the request is valid, the LB subnet can send this request to the app subnet 2226, and the app subnet 2226 can send this request to the cloud service 2256 via the service gateway 2236.
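The hop sequence in this example can be sketched as an ordered path with a validity check at the LB subnet. This is a minimal illustration under stated assumptions: `REQUEST_PATH`, `forward`, and the `is_valid` callback are hypothetical names, and the sketch assumes that an invalid request simply stops at the LB subnet 2222, which the description does not specify.

```python
from typing import Callable, List

# Ordered hops taken by a request from a customer container to the cloud service.
REQUEST_PATH = [
    "container 2267",
    "secondary VNIC 2272",
    "NAT gateway",
    "public internet 2254",
    "internet gateway 2234",
    "LB subnet 2222",
    "app subnet 2226",
    "service gateway 2236",
    "cloud service 2256",
]


def forward(request: dict, is_valid: Callable[[dict], bool]) -> List[str]:
    """Return the hops the request traverses.

    Assumption for this sketch: the LB subnet 2222 is the point where
    validity is determined, and an invalid request goes no further.
    """
    hops = []
    for hop in REQUEST_PATH:
        hops.append(hop)
        if hop == "LB subnet 2222" and not is_valid(request):
            break
    return hops
```

A valid request reaches the cloud service 2256 through the service gateway 2236, while an invalid one never reaches the app subnet 2226.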
[0295] Note that the illustrated IaaS architectures 1900, 2000, 2100, and 2200 may include elements other than those shown. Also, the illustrated embodiments are merely examples of a part of a cloud infrastructure system that can incorporate the embodiments of the present disclosure. In some other embodiments, the IaaS system may have more or fewer elements than those shown, may combine two or more elements, or may have a different configuration or arrangement of elements.
[0296] In certain embodiments, the IaaS system described herein can include a suite of application, middleware, and database services provided to customers in a self-service, subscription-based, elastically scalable, reliable, highly available, and secure manner. An example of such an IaaS system is the Oracle® Cloud Infrastructure (OCI) provided by the applicant.
[0297] Figure 23 shows an exemplary computer system 2300 in which various embodiments may be implemented. System 2300 may be used to implement any of the computer systems described above. As shown, computer system 2300 includes a processing unit 2304 that communicates with a number of peripheral subsystems via a bus subsystem 2302. These peripheral subsystems may include a processing acceleration unit 2306, an I / O subsystem 2308, a storage subsystem 2318, and a communication subsystem 2324. The storage subsystem 2318 includes a tangible computer-readable storage medium 2322 and system memory 2310.
[0298] The bus subsystem 2302 provides a mechanism for various components and subsystems of the computer system 2300 to communicate with each other as intended. Although the bus subsystem 2302 is schematically shown as a single bus, alternative embodiments of the bus subsystem may utilize multiple buses. The bus subsystem 2302 may be any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. For example, such architectures may include the Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, which can be implemented as a mezzanine bus manufactured in accordance with the IEEE P1386.1 standard.
[0299] A processing unit 2304, which can be implemented as one or more integrated circuits (e.g., conventional microprocessors or microcontrollers), controls the operation of the computer system 2300. The processing unit 2304 may include one or more processors. These processors may include single-core or multi-core processors. In some embodiments, the processing unit 2304 may be implemented as one or more independent processing units 2332 and / or 2334, each containing a single-core or multi-core processor. In other embodiments, the processing unit 2304 may be implemented as a quad-core processing unit formed by integrating two dual-core processors onto a single chip.
[0300] In various embodiments, the processing unit 2304 can execute various programs in response to program code and can maintain multiple concurrently executing programs or processes. At any given time, some or all of the program code being executed can reside in the processing unit 2304 and/or the storage subsystem 2318. The processing unit 2304 can provide the various functionalities described above through appropriate programming. The computer system 2300 may further include a processing acceleration unit 2306, which may include a digital signal processor (DSP), a special-purpose processor, and/or the like.
[0301] The I/O subsystem 2308 may include a user interface input device and a user interface output device. The user interface input device may include a keyboard, a pointing device such as a mouse or trackball, a touchpad or touchscreen integrated into a display, a scroll wheel, a click wheel, a dial, a button, a switch, a keypad, a voice input device with a voice command recognition system, a microphone, and other types of input devices. The user interface input device may also include a motion detection and/or gesture recognition device, such as a Microsoft Kinect® motion sensor. The Microsoft Kinect® motion sensor enables a user to control and interact with an input device, such as a Microsoft Xbox® 360 game controller, via a natural user interface (NUI) that utilizes gestures and voice commands. The user interface input device may also include an eye gesture recognition device, such as a Google Glass® blink detector. The Google Glass® blink detector detects the user's eye activity (e.g., blinking when taking a picture and/or selecting a menu) and converts the eye activity into input to an input device (e.g., Google Glass®). Furthermore, the user interface input device may include a voice recognition sensing device that enables the user to interact with a voice recognition system (e.g., Siri® navigator) via voice commands.
[0302] Furthermore, user interface input devices include, but are not limited to, three-dimensional (3D) mice, joysticks or pointing sticks, gamepads, graphic tablets, audio / visual devices such as speakers, digital cameras, digital video cameras, portable media players, webcams, image scanners, fingerprint scanners, barcode readers, 3D scanners, 3D printers, laser rangefinders, and eye-tracking devices. In addition, user interface input devices may include medical image input devices such as computed tomography scanners, magnetic resonance imaging scanners, ultrasound imaging scanners, or medical ultrasound devices. Furthermore, user interface input devices may include audio input devices such as MIDI keyboards and electronic musical instruments.
[0303] The user interface output device may include a display subsystem, indicator lights, or non-visual displays such as audio output devices. The display subsystem may be, for example, a cathode ray tube (CRT), a flat-panel device such as one using a liquid crystal display (LCD) or plasma display, a projection device, or a touchscreen. Generally, when the term "output device" is used, it is intended to include all possible types of devices and mechanisms for outputting information from the computer system 2300 to a user or another computer. For example, the user interface output device includes, but is not limited to, various display devices that visually convey text, images, and audio/video information, such as monitors, printers, speakers, headphones, car navigation systems, plotters, audio output devices, and modems.
[0304] The computer system 2300 may include a storage subsystem 2318. The storage subsystem 2318 comprises software elements, which, in the illustration, are located in the system memory 2310. The system memory 2310 can store program instructions that can be loaded and executed by the processing unit 2304, and data generated by the execution of these programs.
[0305] Depending on the configuration and type of the computer system 2300, the system memory 2310 may be volatile memory (e.g., random access memory (RAM)) and/or non-volatile memory (e.g., read-only memory (ROM), flash memory). Generally, RAM contains data and/or program modules that are immediately accessible to the processing unit 2304 and/or that are currently being operated on and executed by the processing unit 2304. In some implementations, the system memory 2310 may include several different types of memory, such as static random access memory (SRAM) or dynamic random access memory (DRAM). In some implementations, a basic input/output system (BIOS), which includes the basic routines that help transfer information between elements within the computer system 2300, such as during startup, may generally be stored in ROM. By way of example and not limitation, the system memory 2310 is also shown as containing application programs 2312, which may include client applications, web browsers, middle-tier applications, relational database management systems (RDBMS), etc., as well as program data 2314 and an operating system 2316. For example, the operating system 2316 may include various versions of Microsoft Windows®, Apple Macintosh®, and/or Linux® operating systems, various commercially available UNIX® or UNIX-like operating systems (including, but not limited to, various GNU/Linux operating systems, Google Chrome® OS, etc.), and/or mobile operating systems such as iOS, Windows® Phone, Android® OS, BlackBerry® OS, and Palm® OS.
[0306] Furthermore, the storage subsystem 2318 can provide a tangible, computer-readable storage medium for storing basic programming and data structures that provide the functionality of several embodiments. Software (programs, code modules, instructions) that provides the above functionality when executed by the processor may be stored in the storage subsystem 2318. These software modules or instructions may be executed by the processing unit 2304. The storage subsystem 2318 can also provide a repository for storing data used in accordance with this disclosure.
[0307] Furthermore, the storage subsystem 2318 may include a computer-readable storage medium reader 2320 that can be further connected to the computer-readable storage medium 2322. The computer-readable storage medium 2322, together with the system memory 2310, or in combination with the system memory 2310 as needed, can comprehensively represent remote storage devices, local storage devices, fixed storage devices, and/or removable storage devices, in addition to storage media for temporarily and/or permanently containing, storing, transmitting, and retrieving computer-readable information.
[0308] Furthermore, the computer-readable storage medium 2322 containing code or a portion of code may include any suitable medium known or used in the art. Such media include, but are not limited to, volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing and/or transmitting information. This may include tangible computer-readable storage media such as RAM, ROM, electrically erasable programmable ROM (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD), or other optical storage devices, magnetic cassettes, magnetic tapes, magnetic disk storage devices or other magnetic storage devices, or other tangible computer-readable media. It may also include intangible computer-readable media such as data signals, data transmissions, or other media that can be used to transmit desired information and are accessible by the computer system 2300.
[0309] For example, the computer-readable storage medium 2322 may include a hard disk drive that reads from or writes to a non-removable non-volatile magnetic medium, a magnetic disk drive that reads from or writes to a removable non-volatile magnetic disk, and an optical disk drive that reads from or writes to a removable non-volatile optical disk such as a CD-ROM, DVD, or Blu-ray® disc or other optical medium. The computer-readable storage medium 2322 may include, but is not limited to, a Zip® drive, a flash memory card, a universal serial bus (USB) flash drive, a secure digital (SD) card, a DVD disc, a digital videotape, and the like. Furthermore, the computer-readable storage medium 2322 may include solid-state drives (SSDs) based on non-volatile memory, such as flash memory-based SSDs, enterprise flash drives, and solid-state ROM; SSDs based on volatile memory, such as solid-state RAM, dynamic RAM, static RAM, and DRAM-based SSDs; magnetoresistive RAM (MRAM) SSDs; and hybrid SSDs that use a combination of DRAM and flash memory-based SSDs. Disk drives and their associated computer-readable media can provide computer system 2300 with non-volatile storage for computer-readable instructions, data structures, program modules, and other data.
[0310] The communication subsystem 2324 provides interfaces with other computer systems and networks. It acts as an interface for receiving data from other systems and transmitting data from computer system 2300 to other systems. For example, the communication subsystem 2324 may enable computer system 2300 to connect to one or more devices via the Internet. In some embodiments, the communication subsystem 2324 may include radio frequency (RF) transceiver components for accessing wireless voice and / or data networks (using, for example, cellular telephone technologies, advanced data network technologies such as 3G, 4G, or EDGE (enhanced data rates for global evolution), WiFi (IEEE 802.11 family standards), other mobile communication technologies, or any combination thereof), global positioning system (GPS) receiver components, and / or other components. In some embodiments, the communication subsystem 2324 may provide wired network connectivity (e.g., Ethernet) in addition to, or instead of, the wireless interface.
[0311] In addition, in some embodiments, the communication subsystem 2324 can receive input communications on behalf of one or more users who can use the computer system 2300, in the form of structured and / or unstructured data feeds 2326, event streams 2328, event updates 2330, etc.
[0312] For example, the communications subsystem 2324 may be configured to receive data feeds 2326 in real time from users of social networks and / or other communications services, such as Twitter® feeds, Facebook® updates, and Rich Site Summary (RSS) feeds, and / or to receive real-time updates from one or more third-party sources.
[0313] Furthermore, the communication subsystem 2324 may be configured to receive data in the form of continuous data streams, which may include event streams 2328 and / or event updates 2330 of real-time events and which may be continuous or unbounded in nature, with no explicit end. Examples of applications that generate continuous data may include sensor data applications, financial tickers, network performance measurement tools (e.g., network monitoring and traffic management applications), clickstream analysis tools, and automotive traffic monitoring.
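By way of illustration only, the handling of a continuous data stream with no explicit end can be sketched as follows. This is a minimal sketch, not part of the disclosed embodiments: the source `sensor_readings` and the aggregate `running_mean` are hypothetical names, and the synthetic readings are assumed values standing in for an event stream 2328. The sketch shows that an unbounded stream can be consumed incrementally, one event at a time, without ever holding the whole stream in memory.

```python
import itertools
from typing import Iterable, Iterator


def sensor_readings() -> Iterator[float]:
    """Hypothetical unbounded source: yields synthetic readings forever."""
    for i in itertools.count():
        yield 20.0 + (i % 5)


def running_mean(events: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all events seen so far, one value per event."""
    total = 0.0
    for n, value in enumerate(events, start=1):
        total += value
        yield total / n


# The stream never ends, so the consumer decides how much of it to take.
first_five = list(itertools.islice(running_mean(sensor_readings()), 5))
```

Because both functions are generators, nothing is computed until the consumer asks for the next value, which is what allows the source to be unbounded.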
[0314] Furthermore, the communication subsystem 2324 may be configured to output structured and / or unstructured data feeds 2326, event streams 2328, event updates 2330, etc., to one or more databases that can communicate with one or more streaming data source computers coupled to the computer system 2300.
[0315] The computer system 2300 may be one of a variety of types, including handheld portable devices (e.g., iPhone® mobile phones, iPad® computing tablets, PDAs), wearable devices (e.g., Google Glass® head-mounted displays), PCs, workstations, mainframes, kiosks, server racks, or other data processing systems.
[0316] In the preceding description, specific details are provided for illustrative purposes to enable a full understanding of the embodiments of this disclosure. However, it will be apparent that various embodiments can be carried out without these specific details. The following description provides examples only and is not intended to limit the scope, applicability, or configuration of this disclosure. Rather, the following description of embodiments provides a possible description for carrying out the embodiments to those skilled in the art. It should be understood that various modifications can be made to the function and arrangement of elements without departing from the spirit and scope of this disclosure as set forth in the appended claims. The drawings and descriptions are not intended to be restrictive. Circuits, systems, networks, processes, and other components may be shown as components in the form of block diagrams so as not to obscure the embodiments with unnecessary details. In other embodiments, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary details so as not to obscure the embodiments. The teachings of this disclosure can also be applied to various types of applications, such as mobile applications, non-mobile applications, desktop applications, web applications, and enterprise applications. Furthermore, the teachings in this disclosure are not limited to a specific operating environment (e.g., operating system, device, platform, etc.) but can be applied to multiple different operating environments.
[0317] It should also be noted that individual embodiments may be described as a process depicted as a flowchart, flow diagram, data flow diagram, structure diagram, or block diagram. While flowcharts describe operations as sequential processes, many operations can be performed in parallel or simultaneously. Furthermore, the order of operations may be rearranged. A process terminates when its operations are completed, but it may include additional steps not shown in the diagram. A process can correspond to a method, function, procedure, subroutine, subprogram, etc. If a process corresponds to a function, its termination can correspond to the return of that function to the calling function or the main function.
[0318] The terms “example” and “exemplary” are used herein to mean “serving as an example, case, or illustration.” Any embodiment or design described herein as “exemplary” or “example” should not necessarily be construed as being preferable or advantageous to other embodiments or designs.
[0319] The terms “machine-readable storage medium” or “computer-readable storage medium” include, but are not limited to, portable or non-portable storage devices, optical storage devices, and various other media that can store, contain, or carry instructions and / or data. A machine-readable storage medium or computer-readable storage medium may include non-transitory media that can store data and that do not include carrier waves and / or transitory electronic signals propagating over wireless or wired connections. Examples of non-transitory media may include, but are not limited to, magnetic disks or tapes, optical storage media such as compact discs (CDs) or digital versatile discs (DVDs), flash memory, memory, or memory devices. Computer program products may include code and / or machine-executable instructions that can represent any combination of procedures, functions, subprograms, programs, routines, subroutines, modules, software packages, classes, instructions, data structures, or program statements. Code segments may be coupled to other code segments or hardware circuits by passing and / or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, and data may be passed, forwarded, or transmitted via any suitable means, such as memory sharing, message passing, token passing, or network transmission.
[0320] Furthermore, embodiments may be implemented in hardware, software, firmware, middleware, microcode, hardware description language, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segment that performs the required work may be stored in a machine-readable medium. The processor can then perform the required work. Systems shown in some drawings may be provided in various configurations. In some examples, the system may be configured as a distributed system in which one or more elements of the system are distributed across one or more networks within a cloud computing system. Where an element is described as "configured" to perform a particular operation, such a configuration may be achieved, for example, by designing electronics or other hardware to perform the operation, by programming or controlling electronics (e.g., a microprocessor or other suitable electronics) to perform the operation, or by any combination thereof.
[0321] While specific embodiments of the Disclosure have been described, various changes, modifications, alternative configurations, and equivalents are also included within the scope of the Disclosure. Embodiments of the Disclosure are not limited to operating within a specific data processing environment, but can freely operate within multiple data processing environments. Furthermore, while embodiments of the Disclosure have been described using a set of specific procedures and steps, it will be apparent to those skilled in the art that the scope of the Disclosure is not limited to the procedures and steps described. The various features and aspects of the embodiments described above can be used individually or in combination.
[0322] Furthermore, while embodiments of the Disclosure have been described using specific combinations of hardware and software, it should be recognized that other combinations of hardware and software are also included within the scope of the Disclosure. Embodiments of the Disclosure can be implemented using hardware alone, software alone, or a combination thereof. The various processes described in the Disclosure can be run on the same processor or on different processors in any combination. Thus, where it is described that a component or module is configured to perform a particular process, that configuration can be implemented, for example, by designing an electronic circuit to perform that process, by programming a programmable electronic circuit (such as a microprocessor) to perform that process, or by a combination thereof. Processes can communicate using a variety of technologies, including but not limited to conventional techniques for inter-process communication. Different pairs of processes may use different technologies, or the same pair of processes may use different technologies at different times.
[0323] Therefore, the specification and drawings should be considered illustrative rather than restrictive. However, it will be apparent that additions, reductions, deletions, and other modifications and alterations may be made without departing from the broad spirit and scope defined by the claims. Thus, although specific embodiments of this disclosure have been described, these embodiments are not intended to be limiting. Various modifications and their equivalents are included in the appended claims.
[0324] Examples of embodiments of this disclosure can be described in light of the following items. Item 1. A method comprising: storing mapping information relating the address of a customer's Layer 2 virtual network to the address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a compute instance among the plurality of compute instances is hosted on a host machine among the plurality of host machines, the compute instance is associated with a Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and with a Layer 2 virtual switch among the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD among the plurality of NVDs, and the NVD and the host machine are communicatively coupled; receiving customer input, the input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generating storm control information for the NVD based on the storm control configuration and the mapping information; and transmitting the storm control information to the NVD.
[0325] Item 2. The method according to Item 1, wherein the above input indicates that the above storm control configuration is applied to a port, specifies an action to be performed under traffic flow conditions, and the method further includes determining that the port corresponds to a media access control (MAC) address of a certain Layer 2 virtual network interface, determining, based on the mapping information, that the MAC address is associated with an Internet Protocol (IP) address of a certain NVD, and indicating the traffic flow conditions and the action in the storm control information based on the MAC address and the IP address.
[0326] Item 3. The method according to Item 2, wherein the traffic flow conditions include a frame transmission rate, and the action includes dropping frames or linking down the port.
[0327] Item 4. The method according to Item 2, wherein the traffic flow conditions include a maximum frame transmission rate and duration, and the action includes dropping frames when the frame transmission rate exceeds the maximum frame transmission rate and linking down the port when the exceedance is detected for a duration exceeding the duration.
[0328] Item 5. The method according to any one of Items 1 to 4, wherein the input indicates that a storm control policy of the storm control configuration applies to at least one of unicast, broadcast, or multicast frames, and the storm control information indicates, to the NVD, the storm control policy and the applicability of the storm control policy to at least one of frame broadcasts or frame multicasts, in order to control the flow of frames associated with the Layer 2 virtual network interface.
[0329] Item 6. The method according to any one of Items 1 to 5, wherein the input indicates a storm control policy of the storm control configuration, an action of the storm control policy to be performed in the event of a violation of the storm control policy, and an escalation policy of the storm control configuration to be applied in the event of repeated violations of the storm control policy, and the storm control information indicates, to the NVD, the storm control policy, the action, and the escalation policy, in order to control the flow of frames associated with the Layer 2 virtual network interface.
[0330] Item 7. The method according to any one of Items 1 to 6, wherein the input indicates a transmission rate type of the storm control configuration, the transmission rate type includes at least one of frames / second or bits / second, and the storm control information indicates, to the NVD, the transmission rate type, in order to control the flow of frames associated with the Layer 2 virtual network interface.
[0331] Item 8. The method according to any one of Items 1 to 7, wherein the input indicates a type of statistics for the storm control configuration, and the storm control information indicates, to the NVD, the type of statistics, in order to collect statistics on the flow of frames associated with the Layer 2 virtual network interface.
[0332] Item 9. The method according to any one of Items 1 to 8, wherein the above input indicates that the above storm control configuration is applied to multiple ports, the method further comprises determining that the multiple ports correspond to a set of Layer 2 virtual network interfaces among the multiple Layer 2 virtual network interfaces, determining, based on the mapping information, that the set of Layer 2 virtual network interfaces is associated with a set of NVDs among the multiple NVDs, and transmitting the storm control information to each NVD in the set of NVDs.
[0333] Item 10. The method according to any one of Items 1 to 9, wherein the storm control configuration, the storm control information, the NVD, and the Layer 2 virtual network interface are, respectively, a first storm control configuration, first storm control information, a first NVD, and a first Layer 2 virtual network interface, the method further comprising: determining that the input indicates (i) a second storm control configuration, (ii) that the first storm control configuration is applied to a first port, and (iii) that the second storm control configuration is applied to a second port; determining that the first port corresponds to the first Layer 2 virtual network interface and that the second port corresponds to a second Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces; determining, based on the mapping information, that the first Layer 2 virtual network interface is hosted by the first NVD and the second Layer 2 virtual network interface is hosted by a second NVD among the plurality of NVDs; generating second storm control information based on the second storm control configuration; and transmitting the second storm control information to the second NVD.
[0334] Item 11. The method according to any one of items 1 to 10, wherein the storm control configuration provides a first limit on the transmission rate, and the storm control configuration provides a second limit corresponding to the adjustment of the first limit by a multiplier.
[0335] Item 12. The method according to Item 11, further comprising collecting metrics relating to the traffic flow, updating the multiplier based on the metrics, and sending updates relating to the storm control configuration to one NVD, wherein the updates include at least one of the updated multiplier or an updated second limit based on the updated multiplier.
[0336] Item 13. A network virtualization device comprising one or more processors and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the network virtualization device to: host a Layer 2 virtual network interface and a Layer 2 virtual switch belonging to a customer's Layer 2 virtual network, wherein the Layer 2 virtual network interface and the Layer 2 virtual switch are associated with a Layer 2 compute instance belonging to the Layer 2 virtual network, the Layer 2 compute instance is hosted on a host machine of a physical network comprising the network virtualization device, the host machine and the network virtualization device are communicatively coupled, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of Layer 2 compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; store storm control information indicating a storm control policy and an action to be taken upon violation of the storm control policy; monitor traffic flows to and / or from the Layer 2 virtual network interface; determine that a traffic flow violates the storm control policy; and initiate the action based on the traffic flow violating the storm control policy.
[0337] Item 14. The network virtualization device described in Item 13, wherein the storm control policy indicates a maximum transmission rate for incoming traffic to the Layer 2 virtual network interface, the action includes dropping frames, and determining that the traffic flow violates the storm control policy includes determining that the transmission rate of incoming frames to the Layer 2 virtual network interface exceeds the maximum transmission rate, and initiating the action includes dropping incoming frames.
[0338] Item 15. The network virtualization device described in Item 14, wherein the storm control policy further includes specifying a duration for escalation action, the escalation action includes link down, and determining that the traffic flow violates the storm control policy further includes determining that the transmission rate has lasted longer than the duration, and initiating the action further includes linking down the Layer 2 virtual network interface.
[0339] Item 16. The network virtualization device as described in Item 14, wherein the Layer 2 virtual network interface is a first Layer 2 virtual network interface, the storm control policy indicates a maximum transmission rate for incoming traffic to a second Layer 2 virtual network interface, the action includes dropping frames, determining that the traffic flow violates the storm control policy includes determining that the transmission rate of frames from the first Layer 2 virtual network interface to the second Layer 2 virtual network interface exceeds the maximum transmission rate, and initiating the action includes dropping frames from the first Layer 2 virtual network interface to the second Layer 2 virtual network interface.
[0340] Item 17. The network virtualization device according to Item 14, wherein the storm control information is generated based on a storm control configuration indicated by customer input and on mapping information relating the address of the Layer 2 virtual network to the address of the physical network.
[0341] Item 18. The network virtualization device according to Item 17, wherein the customer input indicates a port to which the storm control configuration is applied, and the storm control information is further generated based on a correspondence between the port and the Layer 2 virtual network interface and is applied to the Layer 2 virtual network interface based on the correspondence.
[0342] Item 19. A system comprising one or more processors and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the system to: store mapping information relating the address of a customer's Layer 2 virtual network to the address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a compute instance among the plurality of compute instances is hosted on a host machine among the plurality of host machines, the compute instance is associated with a Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and with a Layer 2 virtual switch among the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD among the plurality of NVDs, and the NVD and the host machine are communicatively coupled; receive customer input, the input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generate storm control information for the NVD based on the storm control configuration and the mapping information; and transmit the storm control information to the NVD.
[0343] Item 20. The system as described in Item 19, further comprising the above-mentioned NVD, wherein the NVD is configured to store the above-mentioned storm control information, which indicates a storm control policy and an action to be taken in violation of the above-mentioned storm control policy, and the NVD is further configured to monitor traffic flows to and / or from the above-mentioned Layer 2 virtual network interface, to determine that the traffic flow violates the above-mentioned storm control policy, and to initiate the above-mentioned action based on the traffic flow that violates the above-mentioned storm control policy.
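By way of illustration only, the two halves described in the items above can be sketched as follows: a control-plane step that turns a customer's storm control configuration and the mapping information into per-NVD storm control information, and an NVD-side step that monitors a Layer 2 virtual network interface, drops frames above the limit, and links the interface down when the violation persists. This is a minimal sketch under stated assumptions, not the disclosed implementation. All class, function, and field names (`StormControlConfig`, `generate_storm_control_info`, `VnicStormControl`, `port_to_mac`, `mac_to_nvd_ip`) are hypothetical, the mapping information is assumed to be two plain dictionaries, and the fixed one-second counting window is an assumption chosen for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StormControlConfig:
    """Customer input (hypothetical field names)."""
    ports: List[str]                  # ports the configuration applies to
    max_rate: float                   # first limit on the transmission rate
    rate_type: str = "frames"         # "frames" (per second) or "bits" (per second)
    multiplier: float = 1.0           # second limit = max_rate * multiplier
    link_down_after_s: Optional[float] = None  # escalation duration, if any


def generate_storm_control_info(
    config: StormControlConfig,
    port_to_mac: Dict[str, str],
    mac_to_nvd_ip: Dict[str, str],
) -> Dict[str, List[dict]]:
    """Resolve each port to a VNIC MAC and its NVD; return entries grouped by NVD IP."""
    per_nvd: Dict[str, List[dict]] = {}
    for port in config.ports:
        mac = port_to_mac[port]
        nvd_ip = mac_to_nvd_ip[mac]
        per_nvd.setdefault(nvd_ip, []).append({
            "vnic_mac": mac,
            "rate_type": config.rate_type,
            "limit": config.max_rate * config.multiplier,
            "link_down_after_s": config.link_down_after_s,
        })
    return per_nvd


class VnicStormControl:
    """Per-VNIC enforcement state on an NVD, counting in one-second windows."""

    def __init__(self, entry: dict) -> None:
        self.limit = entry["limit"]
        self.rate_type = entry["rate_type"]
        self.link_down_after_s = entry["link_down_after_s"]
        self.window_start: Optional[float] = None
        self.used = 0.0
        self.exceeded = False
        self.violation_start: Optional[float] = None
        self.link_up = True

    def on_frame(self, now: float, size_bytes: int) -> str:
        """Return "forward", "drop", or "link_down" for one frame seen at time `now`."""
        if not self.link_up:
            return "link_down"
        if self.window_start is None or now - self.window_start >= 1.0:
            # Start a new window; a clean or idle previous window ends the violation.
            idle = self.window_start is None or now - self.window_start >= 2.0
            if idle or not self.exceeded:
                self.violation_start = None
            self.window_start, self.used, self.exceeded = now, 0.0, False
        self.used += 1 if self.rate_type == "frames" else size_bytes * 8
        if self.used <= self.limit:
            return "forward"
        self.exceeded = True
        if self.violation_start is None:
            self.violation_start = now
        if (self.link_down_after_s is not None
                and now - self.violation_start > self.link_down_after_s):
            self.link_up = False  # escalation: link down the interface
            return "link_down"
        return "drop"
```

In this sketch the control plane would send each value of the returned dictionary to the NVD whose IP address is the key, and the NVD would create one `VnicStormControl` per entry. A production design would more likely use a token bucket or sliding window than fixed windows, which can admit short bursts at window boundaries.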
[0344] The indefinite articles “a” / “an”, the definite article “the”, and similar references used in the context of describing this disclosure (particularly in the context of the claims) should be interpreted as covering both the singular and the plural unless otherwise specifically stated herein or clearly contradicted by context. The terms “comprising”, “having”, “including”, and “containing” should be interpreted as open-ended terms (i.e., “including but not limited to”) unless otherwise specifically stated. The term “connected” should be interpreted as partly or wholly contained within, attached to, or joined together, even if something is intervening. In this specification, recitations of value ranges are intended simply as a shorthand way of referring to each individual value that falls within that range, and unless otherwise specifically stated herein, each individual value is incorporated herein as if it were individually recited herein. Unless otherwise specifically stated herein or clearly contradicted by context, all methods described herein may be performed in any appropriate order. In this specification, the use of any and all examples or exemplary language (e.g., "such as") is intended to clarify embodiments of the Disclosure and, unless otherwise specified, does not limit the scope of the Disclosure. No language in the specification should be construed as indicating any non-claimed element as essential to the implementation of the Disclosure.
[0345] Disjunctive language, such as the phrase "at least one of X, Y, or Z," is intended to be understood in context as commonly used to indicate that an item, term, etc., may be X, Y, Z, or any combination thereof (e.g., X, Y, and / or Z), unless otherwise specified. Therefore, such disjunctive language is not generally intended to, and should not, imply that a particular embodiment requires the presence of at least one of X, at least one of Y, or at least one of Z.
[0346] Preferred embodiments of the Disclosure are described herein, including the best known mode for carrying out the Disclosure. Variations of these preferred embodiments will become apparent to those skilled in the art by reading the foregoing description. Those skilled in the art may, as appropriate, adopt such modifications, and the Disclosure may be carried out in ways other than those specifically described herein. Accordingly, the Disclosure includes all variations and equivalents of the subject matter described in the claims appended herein, as permitted by applicable law. Furthermore, any combination of the above elements in all possible variations is incorporated herein unless otherwise indicated herein.
[0347] All references cited herein, including publications, patent applications, and patents, are incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
[0348] While aspects of the disclosure described above are explained with reference to specific embodiments, those skilled in the art will recognize that the disclosure is not limited thereto. The various features and aspects of the disclosure described above may be used individually or in combination. Furthermore, embodiments may be used in any number of environments and applications beyond those described herein without departing from the broader spirit and scope of this specification. Accordingly, the specification and drawings should be considered illustrative rather than restrictive.
Claims
1. A computer-implemented method comprising: storing mapping information that associates the address of a Layer 2 virtual network interface provided by a customer's Layer 2 virtual network with the address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a compute instance among the plurality of compute instances is hosted on a host machine among the plurality of host machines, the compute instance is associated with a Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and with a Layer 2 virtual switch among the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD among the plurality of NVDs, the Layer 2 virtual network interface hosted on the NVD corresponds to a port of the Layer 2 virtual switch hosted on the NVD, and the NVD and the host machine are communicatively coupled; receiving customer input, the input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generating storm control information for the NVD based on the storm control configuration and the mapping information; and transmitting the storm control information to the NVD, wherein the input indicates whether the storm control configuration applies to a specific port in the Layer 2 virtual network or to all ports in the Layer 2 virtual network.
2. The method according to claim 1, wherein the input indicates that the storm control configuration is applied to a port and specifies an action to be performed under a traffic flow condition, the method further comprising: determining that the port corresponds to a media access control (MAC) address of the Layer 2 virtual network interface; determining, based on the mapping information, that the MAC address is associated with an Internet Protocol (IP) address of the NVD; and indicating the traffic flow condition and the action in the storm control information based on the MAC address and the IP address.
3. The method according to claim 2, wherein the traffic flow condition includes a frame transmission rate, and the action includes dropping a frame or linking down the port.
4. The method according to claim 2, wherein the traffic flow conditions include a maximum frame transmission rate and a duration, and the action includes dropping a frame when it is detected that the frame transmission rate exceeds the maximum frame transmission rate, and linking down the port when it is detected that the exceedance occurs for a duration longer than the duration.
5. The method according to any one of claims 1 to 4, wherein the input indicates that a storm control policy of the storm control configuration is applied to at least one of unicast, broadcast, or multicast frames, and the storm control information indicates, to the NVD, the storm control policy and the applicability of the storm control policy to at least one of frame broadcast or frame multicast, in order to control the flow of frames associated with the Layer 2 virtual network interface.
6. The method according to any one of claims 1 to 5, wherein the input indicates a storm control policy of the storm control configuration, an action of the storm control policy to be performed in the event of a violation of the storm control policy, and an escalation policy of the storm control configuration to be applied in the event of repeated violations of the storm control policy, and the storm control information indicates, to the NVD, the storm control policy, the action, and the escalation policy, in order to control the flow of frames associated with the Layer 2 virtual network interface.
7. The method according to any one of claims 1 to 6, wherein the input indicates a transmission rate type of the storm control configuration, the transmission rate type includes at least one of frames per second or bits per second, and the storm control information indicates, to the NVD, the transmission rate type, in order to control the flow of frames associated with the Layer 2 virtual network interface.
8. The method according to any one of claims 1 to 7, wherein the input indicates a type of statistics for the storm control configuration, and the storm control information indicates, to the NVD, the type of statistics, in order to collect statistics relating to the flow of frames associated with the Layer 2 virtual network interface.
9. The method according to any one of claims 1 to 8, wherein the input indicates that the storm control configuration is applied to a plurality of ports, and the method further comprises: determining that the plurality of ports corresponds to a set of Layer 2 virtual network interfaces among the plurality of Layer 2 virtual network interfaces; determining, based on the mapping information, that the set of Layer 2 virtual network interfaces is associated with a set of NVDs among the plurality of NVDs; and transmitting the storm control information to each NVD in the set of NVDs.
10. The method according to any one of claims 1 to 9, wherein the storm control configuration, the storm control information, the NVD, and the Layer 2 virtual network interface are, respectively, a first storm control configuration, first storm control information, a first NVD, and a first Layer 2 virtual network interface, and the method further comprises: determining that the input indicates (i) a second storm control configuration, (ii) that the first storm control configuration is applied to a first port, and (iii) that the second storm control configuration is applied to a second port; determining that the first port corresponds to the first Layer 2 virtual network interface and that the second port corresponds to a second Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces; determining, based on the mapping information, that the first Layer 2 virtual network interface is hosted by the first NVD and that the second Layer 2 virtual network interface is hosted by a second NVD among the plurality of NVDs; generating second storm control information based on the second storm control configuration; and transmitting the second storm control information to the second NVD.
11. The method according to any one of claims 1 to 10, wherein the storm control configuration indicates a first limit on the transmission rate, and the storm control configuration indicates a second limit corresponding to the adjustment of the first limit by a multiplier.
12. The method according to claim 11, further comprising: collecting metrics related to the traffic flow; updating the multiplier based on the metrics; and transmitting an update relating to the storm control configuration to the NVD, wherein the update includes at least one of the updated multiplier or an updated second limit based on the updated multiplier.
13. A computer-implemented method comprising: storing mapping information that associates an address of a Layer 2 virtual network interface belonging to a customer's Layer 2 virtual network with an address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a computing instance of the plurality of computing instances is hosted on a host machine of the plurality of host machines, the computing instance is associated with a Layer 2 virtual network interface of the plurality of Layer 2 virtual network interfaces and a Layer 2 virtual switch of the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD of the plurality of NVDs, and the NVD and the host machine are communicatively connected; receiving customer input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generating storm control information for the NVD based on the storm control configuration and the mapping information; and transmitting the storm control information to the NVD, wherein the input indicates a storm control policy of the storm control configuration, an action of the storm control policy to be performed in the event of a violation of the storm control policy, and an escalation policy of the storm control configuration to be applied in the event of repeated violations of the storm control policy, and the storm control information indicates, to the NVD, the storm control policy, the action, and the escalation policy in order to control the flow of frames associated with the Layer 2 virtual network interface.
14. A computer-implemented method comprising: storing mapping information that associates an address of a Layer 2 virtual network interface belonging to a customer's Layer 2 virtual network with an address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a computing instance of the plurality of computing instances is hosted on a host machine of the plurality of host machines, the computing instance is associated with a Layer 2 virtual network interface of the plurality of Layer 2 virtual network interfaces and a Layer 2 virtual switch of the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD of the plurality of NVDs, and the NVD and the host machine are communicatively connected; receiving customer input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generating storm control information for the NVD based on the storm control configuration and the mapping information; and transmitting the storm control information to the NVD, wherein the storm control configuration indicates a first limit on the transmission rate and a second limit corresponding to the adjustment of the first limit by a multiplier.
15. A network virtualization device comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the network virtualization device to: host a Layer 2 virtual network interface and a Layer 2 virtual switch belonging to a customer's Layer 2 virtual network, wherein the Layer 2 virtual network interface and the Layer 2 virtual switch are associated with a Layer 2 computing instance belonging to the Layer 2 virtual network, the Layer 2 computing instance is hosted on a host machine of a physical network that includes the network virtualization device, the host machine and the network virtualization device are communicatively connected, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of Layer 2 computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; store storm control information that indicates a storm control policy and an action to be taken upon a violation of the storm control policy; monitor traffic flow to and/or from the Layer 2 virtual network interface; determine that the traffic flow violates the storm control policy; and initiate the action based on the traffic flow violating the storm control policy, wherein the storm control policy specifies a maximum transmission rate for incoming traffic to the Layer 2 virtual network interface, the action includes dropping frames, determining that the traffic flow violates the storm control policy includes determining that the transmission rate of incoming frames to the Layer 2 virtual network interface exceeds the maximum transmission rate, initiating the action includes dropping incoming frames, the storm control policy further specifies a duration for an escalation action, the escalation action includes a link down, determining that the traffic flow violates the storm control policy further includes determining that the transmission rate has exceeded the maximum transmission rate for longer than the duration, and initiating the action further includes bringing the link of the Layer 2 virtual network interface down.
16. The network virtualization device according to claim 15, wherein the Layer 2 virtual network interface is a first Layer 2 virtual network interface, the storm control policy indicates a maximum transmission rate for incoming traffic to a second Layer 2 virtual network interface, the action includes dropping frames, determining that the traffic flow violates the storm control policy includes determining that the transmission rate of frames from the first Layer 2 virtual network interface to the second Layer 2 virtual network interface exceeds the maximum transmission rate, and initiating the action includes dropping frames from the first Layer 2 virtual network interface to the second Layer 2 virtual network interface.
17. The network virtualization device according to claim 15, wherein the storm control information is generated based on a storm control configuration indicated by the customer input and mapping information relating the address of the Layer 2 virtual network interface belonging to the Layer 2 virtual network to the address of the physical network.
18. The network virtualization device according to claim 17, wherein the customer input indicates a port to which the storm control configuration is applied, and the storm control information is further generated based on a correspondence between the port and the Layer 2 virtual network interface and is applied to the Layer 2 virtual network interface based on the correspondence.
19. A system comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the system to: store mapping information that associates an address of a Layer 2 virtual network interface belonging to a customer's Layer 2 virtual network with an address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a computing instance of the plurality of computing instances is hosted on a host machine of the plurality of host machines, the computing instance is associated with a Layer 2 virtual network interface of the plurality of Layer 2 virtual network interfaces and a Layer 2 virtual switch of the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD of the plurality of NVDs, the Layer 2 virtual network interface hosted on the NVD corresponds to a port of the Layer 2 virtual switch hosted on the NVD, and the NVD and the host machine are communicatively connected; receive customer input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generate storm control information for the NVD based on the storm control configuration and the mapping information; and transmit the storm control information to the NVD, wherein the input indicates whether the storm control configuration applies to all ports in the Layer 2 virtual network or to specific ports in the Layer 2 virtual network.
20. A system comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the system to: store mapping information that associates an address of a Layer 2 virtual network interface of a customer's Layer 2 virtual network with an address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a computing instance of the plurality of computing instances is hosted on a host machine of the plurality of host machines, the computing instance is associated with a Layer 2 virtual network interface of the plurality of Layer 2 virtual network interfaces and a Layer 2 virtual switch of the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD of the plurality of NVDs, and the NVD and the host machine are communicatively connected; receive customer input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generate storm control information for the NVD based on the storm control configuration and the mapping information; and transmit the storm control information to the NVD, wherein the input indicates a storm control policy of the storm control configuration, an action of the storm control policy to be performed in the event of a violation of the storm control policy, and an escalation policy of the storm control configuration to be applied in the event of repeated violations of the storm control policy, and the storm control information indicates, to the NVD, the storm control policy, the action, and the escalation policy in order to control the flow of frames associated with the Layer 2 virtual network interface.
21. A system comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the system to: store mapping information that associates an address of a Layer 2 virtual network interface of a customer's Layer 2 virtual network with an address of a physical network hosting the Layer 2 virtual network, wherein the Layer 2 virtual network comprises a plurality of computing instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises a plurality of network virtualization devices (NVDs) and a plurality of host machines, a computing instance of the plurality of computing instances is hosted on a host machine of the plurality of host machines, the computing instance is associated with a Layer 2 virtual network interface of the plurality of Layer 2 virtual network interfaces and a Layer 2 virtual switch of the plurality of Layer 2 virtual switches, the Layer 2 virtual network interface and the Layer 2 virtual switch are hosted on an NVD of the plurality of NVDs, and the NVD and the host machine are communicatively connected; receive customer input specifying a storm control configuration for traffic flow in the Layer 2 virtual network; generate storm control information for the NVD based on the storm control configuration and the mapping information; and transmit the storm control information to the NVD, wherein the storm control configuration indicates a first limit on the transmission rate and a second limit corresponding to the adjustment of the first limit by a multiplier.
22. The system according to any one of claims 19 to 21, further comprising the NVD, wherein the NVD is configured to: store the storm control information, the storm control information indicating a storm control policy and an action to be taken upon a violation of the storm control policy; monitor traffic flow to and/or from the Layer 2 virtual network interface; determine that the traffic flow violates the storm control policy; and initiate the action based on the traffic flow violating the storm control policy.
23. A program for causing a processor to perform the method described in any one of claims 1 to 14.
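
Illustrative sketch (not part of the claims). The enforcement behavior recited in claims 4, 11, and 15 (dropping frames above a rate limit, deriving a second limit from a first limit and a multiplier, and escalating to a link down when the exceedance lasts longer than a duration) could be modeled roughly as follows. The class names, field names, the fixed one-second counting window, and the integer-millisecond clock are all assumptions made for this example; the claims do not prescribe how the transmission rate is measured.

```python
from dataclasses import dataclass
from typing import Optional

WINDOW_MS = 1000  # assumed measurement window: frames are counted per second


@dataclass
class StormControlPolicy:
    """Hypothetical policy record; field names are illustrative only."""
    max_frames_per_second: float      # first limit on the transmission rate
    multiplier: float = 1.0           # adjusts the first limit (claim 11)
    link_down_after_ms: int = 5000    # duration before the escalation action

    @property
    def effective_limit(self) -> float:
        # Second limit: the first limit adjusted by the multiplier.
        return self.max_frames_per_second * self.multiplier


class PortMonitor:
    """Counts frames for one Layer 2 VNIC (switch port) in fixed windows."""

    def __init__(self, policy: StormControlPolicy) -> None:
        self.policy = policy
        self.link_up = True
        self._window_start: Optional[int] = None
        self._count = 0
        self._violation_since: Optional[int] = None

    def on_frame(self, now_ms: int) -> str:
        """Return 'forward', 'drop', or 'link_down' for a frame at now_ms."""
        if not self.link_up:
            return "link_down"
        if self._window_start is None:
            self._window_start = now_ms
        elapsed = now_ms - self._window_start
        if elapsed >= WINDOW_MS:
            # The violation streak ends if the last window stayed within the
            # limit, or if at least one whole window passed with no frames.
            quiet = self._count <= self.policy.effective_limit
            if quiet or elapsed >= 2 * WINDOW_MS:
                self._violation_since = None
            self._window_start, self._count = now_ms, 0
        self._count += 1
        if self._count <= self.policy.effective_limit:
            return "forward"
        if self._violation_since is None:
            self._violation_since = now_ms
        if now_ms - self._violation_since > self.policy.link_down_after_ms:
            self.link_up = False  # escalation action: link down
            return "link_down"
        return "drop"  # base action: drop the frame


# Usage: 10 frames/s offered against an effective limit of 10 * 0.5 = 5.
policy = StormControlPolicy(max_frames_per_second=10, multiplier=0.5,
                            link_down_after_ms=2000)
monitor = PortMonitor(policy)
results = [monitor.on_frame(t) for t in range(0, 4000, 100)]
print(results.count("forward"), results.count("drop"),
      results.count("link_down"))  # prints: 15 11 14
```

A fixed window is used here only for brevity; a token bucket or sliding window would be an equally plausible way to measure the rate. Updating `multiplier` from collected metrics, as in claim 12, would change `effective_limit` without touching the first limit.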