Internet Group Management Protocol (IGMP) for Layer 2 networks in virtualized cloud environments
A distributed switch architecture with Layer 2 VNICs and local switches addresses limitations in virtualized cloud environments, enhancing Layer 2 networking and supporting IGMP functionality for scalable and flexible VLANs, improving network management and communication.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Patents
- Current Assignee / Owner
- ORACLE INT CORP
- Filing Date
- 2021-11-24
- Publication Date
- 2026-06-22
Abstract
Description
Technical Field
[0001] Reference to Related Applications This international patent application claims priority to U.S. Patent Application No. 17/494,725, filed on October 5, 2021, titled "LAYER-2 NETWORKING USING ACCESS CONTROL LISTS IN A VIRTUALIZED CLOUD ENVIRONMENT", which claims the benefit of U.S. Provisional Application No. 63/132,377, filed on December 30, 2020, titled "INTERNET GROUP MANAGEMENT PROTOCOL (IGMP) OF A LAYER 2 NETWORK IN A VIRTUALIZED CLOUD ENVIRONMENT", the content of which is hereby incorporated by reference in its entirety for all purposes.
Background Art
[0002] Background Cloud computing provides on-demand availability of computing resources. Cloud computing can be based on a data center that is available to users via the Internet. Cloud computing can provide infrastructure as a service (IaaS). Virtual networks may be created for user use. However, these virtual networks have limitations that restrict their functionality and value. Therefore, further improvements are desired.
Summary of the Invention
Means for Solving the Problems
[0003] Summary The present disclosure relates to a virtualized cloud environment. Techniques for providing layer 2 networking functionality in a virtualized cloud environment are described. The layer 2 functionality is provided in addition to, and in relation to, the layer 3 networking functionality provided by the virtualized cloud environment.
[0004] Some embodiments of this disclosure relate to providing customers with Layer 2 virtual local area networks (VLANs) within a private network, such as a customer's virtual cloud network (VCN). In a Layer 2 VLAN, different compute instances are connected. The customer is given the perception of an emulated single switch connecting the compute instances. In fact, this emulated switch is implemented as an infinitely scalable distributed switch including a set of local switches. More specifically, each compute instance runs on a host machine connected to a network virtualization device (NVD). For each compute instance on a host connected to the NVD, the NVD hosts a Layer 2 virtual network interface card (VNIC) and a local switch associated with the compute instance. A Layer 2 VNIC represents a port of the compute instance on the Layer 2 VLAN. Local switches connect the VNIC to other VNICs (e.g., other ports) associated with other compute instances on the Layer 2 VLAN. Various Layer 2 network services are supported, for example, including Internet Group Management Protocol (IGMP) functionality.
[0004] This section describes various embodiments, including methods, systems, and non-transitory computer-readable storage media storing programs, code, or instructions executable by one or more processors.
Brief Description of the Drawings
[0005]
[Figure 1] A high-level diagram of a distributed environment showing a virtual or overlay cloud network hosted by a cloud service provider infrastructure, according to a specific embodiment.
[Figure 2] An architectural schematic diagram showing the physical elements of the physical network within the CSPI, according to a specific embodiment.
[Figure 3] An exemplary configuration of a CSPI in which a host machine is connected to multiple network virtualization devices (NVDs), according to a specific embodiment.
[Figure 4] A diagram showing the connection between a host machine and an NVD that provides I/O virtualization to support multi-tenancy functionality, according to a specific embodiment.
[Figure 5] A schematic block diagram showing the physical network provided by the CSPI, according to a specific embodiment.
[Figure 6] A schematic diagram of a computing network, according to one embodiment.
[Figure 7] A schematic diagram of the logic and hardware of a VLAN, according to one embodiment.
[Figure 8] A schematic logic diagram of multiple connected L2 VLANs, according to one embodiment.
[Figure 9] A logical schematic diagram of multiple connected L2 VLANs and subnet 900, according to one embodiment.
[Figure 10] A schematic diagram of VLAN communication and VLAN learning, according to one embodiment.
[Figure 11] A schematic diagram of a VLAN, according to one embodiment.
[Figure 12] A flowchart of process 1200 for VLAN communication, according to one embodiment.
[0018]
[Figure 13] An exemplary environment suitable for defining a Layer 2 virtual network configuration, according to one embodiment.
[Figure 14] An exemplary IGMP technique in a Layer 2 virtual network, according to one embodiment.
[Figure 15] A flowchart showing a process for generating an IGMP table in a Layer 2 virtual network, according to one embodiment.
[Figure 16] A flowchart showing a process for updating an IGMP table in a Layer 2 virtual network, according to one embodiment.
[Figure 17] A flowchart showing a process for executing an IGMP query in a Layer 2 virtual network, according to one embodiment.
[Figure 18] A flowchart showing a process for using an IGMP table in a Layer 2 virtual network, according to one embodiment.
[Figure 19] A block diagram showing one pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 20] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 21] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 22] A block diagram showing another pattern for realizing a cloud infrastructure as a service system, according to at least one embodiment.
[Figure 23] A block diagram showing an exemplary computer system, according to at least one embodiment.
Modes for Carrying Out the Invention
[0006] Detailed explanation In the following description, certain details are included for illustrative purposes to facilitate a full understanding of the particular embodiment. However, it will be apparent that various embodiments may be carried out without these specific details. The figures and descriptions are not intended to be limiting. The term “exemplary” is used here to mean “provided as an example, case, or illustration.” Any embodiment or design described herein as “exemplary” should not necessarily be construed as being preferable or advantageous over other embodiments or designs.
[0007] A. Exemplary virtual networking architecture The term "cloud service" generally refers to services that a cloud service provider (CSP) makes available to users or customers on demand (e.g., via a subscription model) using systems and infrastructure (cloud infrastructure). Typically, the servers and systems that make up the CSP's infrastructure are separate from the customer's own on-premises servers and systems. Thus, a customer can utilize cloud services provided by a CSP without separately purchasing the hardware and software resources for the service. Cloud services are designed to give subscribing customers easy and scalable access to applications and computing resources without the customer having to invest in procuring the infrastructure used to provide the service.
[0008] There are several cloud service providers that offer various types of cloud services. Cloud services include various different types or models such as SaaS (Software-as-a-Service), PaaS (Platform-as-a-Service), and IaaS (Infrastructure-as-a-Service).
[0009] A customer can subscribe to one or more cloud services provided by a CSP. The customer can be any entity such as an individual, an organization, or a business. When a customer subscribes or registers for a service provided by a CSP, a tenant or account is created for that customer. Thereafter, the customer can access one or more subscribed cloud resources associated with the account via this account.
[0010] As described above, IaaS (Infrastructure as a Service) is a specific type of cloud computing service. In the IaaS model, the CSP provides infrastructure (referred to as cloud service provider infrastructure or CSPI) that customers can use to build their own customizable networks and deploy customer resources. Therefore, the customer's resources and networks are hosted in a distributed environment by the infrastructure provided by the CSP. This is different from traditional computing where the customer's infrastructure hosts the customer's resources and networks.
[0011] CSPI may include interconnected high-performance computing resources, including various host machines, memory resources, and network resources, forming a physical network also known as an underlay network or base network. CSPI resources may be distributed across one or more data centers geographically distributed across one or more geographical regions. Virtualization software can run on these physical resources to provide a virtualized distributed environment. Virtualization creates an overlay network (also known as a software-based network, software-defined network, or virtual network) on top of the physical network. The CSPI physical network provides the foundation for creating one or more overlay or virtual networks on top of the physical network. A virtual network or overlay network may include one or more virtual cloud networks (VCNs). Virtual networks are implemented using software virtualization technologies (e.g., hypervisors, functions performed by network virtualization devices (NVDs) (e.g., smart NICs), top-of-rack (TOR) switches, smart TORs implementing one or more functions performed by NVDs, and other mechanisms) to create a network abstraction layer that can run on top of the physical network. Virtual networks can take various forms, such as peer-to-peer networks and IP networks. A virtual network is typically either a Layer 3 IP network or a Layer 2 VLAN. Such virtual or overlay networks are often referred to as virtual Layer 3 networks or overlay Layer 3 networks. Examples of protocols developed for virtual networks include IP-in-IP (or GRE (Generic Routing Encapsulation)), virtual extensible LAN (VXLAN - IETF RFC7348), virtual private networks (VPNs) (e.g., MPLS Layer 3 virtual private network (RFC4364)), VMware NSX, and GENEVE (Generic Network Virtualization Encapsulation).
[0012] In the case of IaaS, the infrastructure provided by the CSP (CSPI) may be configured to deliver virtualized computing resources over a public network (e.g., the internet). In the IaaS model, the cloud computing service provider can host infrastructure elements (e.g., servers, storage, network nodes (e.g., hardware), deployment software, platform virtualization (e.g., hypervisor layer)). In some cases, the IaaS provider can provide various services associated with these infrastructure elements (e.g., billing, monitoring, logging, security, load balancing, and clustering). Because these services are policy-driven, IaaS users can maintain application availability and performance by implementing policies to drive load balancing. The CSPI provides infrastructure and a set of complementary cloud services. This allows customers to build and run a wide range of applications and services in a highly available hosted distributed environment. The CSPI provides high-performance computing resources and capabilities, as well as storage capacity, on a flexible virtual network that can be securely accessed from various network locations, such as the customer's on-premises network. When a customer subscribes to or registers for an IaaS service provided by a CSP, the tenancy created for that customer is a secure, isolated partition within the CSPI in which the customer can create, organize, and manage cloud resources.
[0013] Customers can build their own virtual networks using the computing, memory, and networking resources provided by CSPI. They can deploy one or more customer resources or workloads, such as compute instances, on these virtual networks. For example, a customer can build one or more customizable private virtual networks called Virtual Cloud Networks (VCNs) using the resources provided by CSPI. On a customer VCN, a customer can deploy one or more customer resources, such as compute instances. Compute instances may be virtual machines, bare-metal instances, etc. Thus, CSPI provides the infrastructure and a set of complementary cloud services that enable customers to build and run a variety of applications and services in a highly available virtual host environment. Customers do not manage or control the underlying physical resources provided by CSPI, but they control the operating system, memory, and deployed applications, and, in some cases, have limited control over certain networking components (e.g., firewalls).
[0014] A CSP can provide a console that enables customers and network administrators to configure, access, and manage resources deployed to the cloud using CSPI resources. In certain embodiments, the console provides a web-based user interface that can be used to utilize and manage CSPI. In some embodiments, the console is a web-based application provided by the CSP.
[0015] CSPI can support single-tenancy or multi-tenancy architectures. In a single-tenancy architecture, software (e.g., applications, databases) or hardware elements (e.g., host machines or servers) serve a single customer or tenant. In a multi-tenancy architecture, software or hardware elements serve multiple customers or tenants. Therefore, in a multi-tenancy architecture, CSPI resources are shared among multiple customers or tenants. In a multi-tenancy environment, CSPI implements precautions and safeguards to isolate each tenant's data and prevent it from being visible to other tenants.
[0016] In a physical network, a network endpoint (endpoint) refers to a computing device or system that is connected to the physical network and communicates bidirectionally with the connected network. A network endpoint in a physical network may be connected to a local area network (LAN), a wide area network (WAN), or other types of physical networks. Examples of traditional endpoints in a physical network include modems, hubs, bridges, switches, routers, and other networking devices, as well as physical computers (or host machines). Each physical device in a physical network has a fixed network address that can be used to communicate with that device. This fixed network address may be a Layer 2 address (e.g., a MAC address), a fixed Layer 3 address (e.g., an IP address), etc. In a virtualized environment or virtual network, endpoints can include various virtual endpoints, such as virtual machines hosted by elements of the physical network (e.g., hosted by a physical host machine). These endpoints in a virtual network are addressed by overlay addresses, such as overlay Layer 2 addresses (e.g., overlay MAC addresses) and overlay Layer 3 addresses (e.g., overlay IP addresses). Network overlays provide flexibility by allowing network administrators to move overlay addresses associated with network endpoints using software management (e.g., through software implementing the control plane of the virtual network). Therefore, unlike physical networks, in virtual networks, network management software can be used to move overlay addresses (e.g., overlay IP addresses) from one endpoint to another. 
Because virtual networks are built on top of physical networks, both the virtual network and the underlying physical network are involved in communication between elements of the virtual network. To facilitate such communication, each element of the CSPI is configured to learn and store mappings that map the overlay address of the virtual network to the actual physical address of the underlying network, or vice versa. These mappings are used to facilitate communication. To facilitate routing within the virtual network, customer traffic is encapsulated.
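The learning of overlay-to-physical mappings and the encapsulation of customer traffic described above can be illustrated with a minimal Python sketch. The class and function names (MappingTable, EncapsulatedPacket, encapsulate) and all addresses are hypothetical and are not taken from this disclosure; the sketch only shows how a learned mapping could select the outer (physical) destination for a packet addressed to an overlay IP address, and how re-learning a mapping models moving an overlay address to another endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncapsulatedPacket:
    outer_src: str   # physical (underlay) address of the sending element
    outer_dst: str   # physical (underlay) address of the receiving element
    inner_src: str   # overlay address of the sending endpoint
    inner_dst: str   # overlay address of the destination endpoint
    payload: bytes


class MappingTable:
    """Hypothetical store of learned overlay-to-physical address mappings."""

    def __init__(self):
        self._overlay_to_physical = {}

    def learn(self, overlay_ip, physical_ip):
        # Learning the same overlay address again overwrites the old entry,
        # which models moving the overlay address to a different endpoint.
        self._overlay_to_physical[overlay_ip] = physical_ip

    def lookup(self, overlay_ip):
        return self._overlay_to_physical.get(overlay_ip)


def encapsulate(table, local_physical_ip, inner_src, inner_dst, payload):
    """Wrap an overlay packet in an outer header addressed on the underlay."""
    physical_dst = table.lookup(inner_dst)
    if physical_dst is None:
        raise LookupError(f"no mapping learned for overlay address {inner_dst}")
    return EncapsulatedPacket(local_physical_ip, physical_dst,
                              inner_src, inner_dst, payload)


table = MappingTable()
table.learn("10.0.0.5", "192.168.10.2")
packet = encapsulate(table, "192.168.10.1", "10.0.0.4", "10.0.0.5", b"hello")

# The overlay address is moved to an endpoint behind a different physical address.
table.learn("10.0.0.5", "192.168.20.7")
moved = encapsulate(table, "192.168.10.1", "10.0.0.4", "10.0.0.5", b"hello")
```

In this sketch the overlay addresses in the inner header are unchanged by the move; only the outer destination differs.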
[0017] Therefore, physical addresses (e.g., physical IP addresses) are associated with elements of a physical network, while overlay addresses (e.g., overlay IP addresses) are associated with entities in a virtual network. Both physical and overlay IP addresses are real IP addresses. They are distinct from virtual IP addresses, which map to multiple real IP addresses. Virtual IP addresses provide a one-to-many mapping between virtual IP addresses and multiple real IP addresses.
[0018] The cloud infrastructure, or CSPI, is physically hosted in one or more data centers in one or more regions around the world. The CSPI may include elements of a physical network or underlying network and virtualization elements of a virtual network built on top of the physical network elements (e.g., virtual networks, compute instances, virtual machines). In certain embodiments, the CSPI is organized and hosted in realms, regions, and availability domains. A region is typically a localized geographic area containing one or more data centers. Regions are generally independent of each other and may be separated by vast distances, for example, across countries or continents. For example, one region may be in Australia, another in Japan, and yet another in India. CSPI resources are divided among these regions such that each region has an independent subset of CSPI resources. Each region can provide a set of core infrastructure services and resources, such as computing resources (e.g., bare metal servers, virtual machines, containers, and related infrastructure), storage resources (e.g., block volume storage, file storage, object storage, archive storage), networking resources (e.g., virtual cloud networks (VCNs), load balancing resources, connectivity to on-premises networks), database resources, edge networking resources (e.g., DNS), and access management and monitoring resources. Each region generally has multiple paths connecting it to other regions within the realm.
[0019] Generally, applications are deployed in the region where they are most frequently used (i.e., on infrastructure relevant to that region) because using nearby resources is faster than using distant resources. Applications may also be deployed in different regions for various reasons, such as redundancy to mitigate the risks of large-scale weather systems or region-wide events like earthquakes, or redundancy to meet various requirements for legal jurisdictions, tax domains, and other business or social standards.
[0020] Data centers within a region may be further organized and subdivided into availability domains (ADs). An availability domain may correspond to one or more data centers located in a region. A region may consist of one or more availability domains. In such a distributed environment, CSPI resources are region-specific, such as virtual cloud networks (VCNs), or availability domain-specific, such as compute instances.
[0021] ADs within a single region are isolated from one another, fault-tolerant, and configured to be highly unlikely to fail simultaneously. This is achieved by not sharing critical infrastructure resources such as networking, physical cabling, cabling routes, and cabling entry points between ADs, so that a failure in one AD within a region has little impact on the availability of other ADs within the same region. ADs within the same region can be connected to each other by a low-latency, high-bandwidth network, which makes it possible to provide highly available connectivity to other networks (e.g., the internet, customer on-premises networks) and to build replicated systems across multiple ADs for both high availability and disaster recovery. Cloud services use multiple ADs to ensure high availability and protect against resource failures. As the infrastructure provided by the IaaS provider grows, more regions and ADs may be added along with additional capacity. Traffic between availability domains is typically encrypted.
[0022] In certain embodiments, regions are grouped into realms. A realm is a logical collection of regions. Realms are isolated from each other and do not share any data. Regions within the same realm can communicate with each other, but regions in different realms cannot. A CSP customer's tenancy or account may reside in a single realm and span one or more regions belonging to that single realm. Typically, when a customer subscribes to an IaaS service, their tenancy or account is created in a customer-designated region within a realm (referred to as the "home" region). The customer can extend their tenancy to one or more other regions within a realm. A customer cannot access regions that do not reside in the realm where their tenancy resides.
[0023] An IaaS provider can offer multiple realms, each corresponding to a specific set of customers or users. For example, a commercial realm may be offered for commercial customers. As another example, a realm may be offered for a specific country or for customers in that country. As yet another example, a government realm may be offered for a specific government and may have a higher security level than a commercial realm. For example, Oracle Cloud Infrastructure (OCI) currently offers a realm for the commercial domain and two realms for the government cloud domain (e.g., FedRAMP accreditation and IL5 accreditation).
[0024] In certain embodiments, an availability domain (AD) can be subdivided into one or more fault domains. A fault domain is a grouping of infrastructure resources within the AD that provides anti-affinity: fault domains allow compute instances to be distributed so that they are not located on the same physical hardware within a single AD. A fault domain refers to a collection of hardware elements (computers, switches, etc.) that share a single point of failure. The compute pool is logically divided into fault domains. Therefore, a hardware failure or compute hardware maintenance event affecting one fault domain does not affect instances in other fault domains. Depending on the embodiment, the number of fault domains in each AD may vary. For example, in certain embodiments, each AD may contain three fault domains. Fault domains function as logical data centers within the AD.
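As an illustration of the anti-affinity idea above, the following Python sketch spreads compute instances across fault domains and then checks which instances remain unaffected when one fault domain fails. The round-robin policy, the function names, and the instance and fault-domain labels are assumptions made for this example only; the disclosure does not specify a placement algorithm.

```python
from itertools import cycle


def place_instances(instance_names, fault_domains):
    """Assign instances to fault domains in round-robin order (assumed policy)."""
    if not fault_domains:
        raise ValueError("at least one fault domain is required")
    domains = cycle(fault_domains)
    return {name: next(domains) for name in instance_names}


def surviving_instances(assignment, failed_domain):
    """Return the instances that are not placed in the failed fault domain."""
    return sorted(name for name, domain in assignment.items()
                  if domain != failed_domain)


# Three instances of one workload spread over three fault domains in one AD.
placement = place_instances(["web-1", "web-2", "web-3"],
                            ["FD-1", "FD-2", "FD-3"])
still_running = surviving_instances(placement, "FD-2")
```

With this placement, a failure confined to one fault domain affects only the instance placed there, which is the property the paragraph above describes.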
[0025] When a customer subscribes to an IaaS service, resources from CSPI are provisioned to the customer and associated with the customer's tenancy. The customer can use these provisioned resources to build private networks and deploy resources on these networks. Customer networks hosted on the cloud by CSPI are called Virtual Cloud Networks (VCNs). A customer can configure one or more VCNs using the CSPI resources allocated to them. A VCN is a virtual or software-defined private network. Customer resources deployed in a customer's VCN can include compute instances (e.g., virtual machines, bare metal instances) and other resources. These compute instances may represent various customer workloads such as applications, load balancers, and databases. Compute instances deployed on a VCN can communicate with publicly accessible endpoints (public endpoints) over public networks such as the internet, with other instances within the same VCN or other VCNs (e.g., other VCNs of the customer, or VCNs not belonging to the customer), with the customer's on-premises data center or network, with service endpoints, and with other types of endpoints.
[0026] A CSP can provide a variety of services using a CSPI. In some cases, a CSPI customer can act like a service provider themselves and provide services using CSPI resources. A service provider can expose service endpoints characterized by identifying information (e.g., IP address, DNS name, and port). A customer's resources (e.g., compute instances) can consume a particular service by accessing the service endpoint of that particular service exposed by the service. These service endpoints are generally publicly accessible endpoints that users can access via public communication networks such as the internet using the public IP address associated with the endpoint. Publicly accessible network endpoints are sometimes called public endpoints.
[0027] In certain embodiments, a service provider may expose a service through an endpoint of the service (sometimes called a service endpoint). Customers of the service can access the service using this service endpoint. In certain embodiments, the service endpoint provided for a service may be accessible to multiple customers who wish to consume that service. In other implementations, a dedicated service endpoint may be provided to a customer. Thus, only that customer can access the service using that dedicated service endpoint.
[0028] In certain embodiments, a VCN, once created, is associated with a private overlay classless inter-domain routing (CIDR) address space, which is a private overlay IP address range (e.g., 10.0/16) assigned to that VCN. A VCN includes associated subnets, route tables, and gateways. While a VCN resides within a single region, it can span one or more, or all, of the availability domains within that region. A gateway is a virtual interface configured for a VCN, enabling traffic communication between the VCN and one or more endpoints outside the VCN. By configuring one or more different types of gateways for a VCN, communication with different types of endpoints can be enabled.
[0029] A VCN may be subdivided into one or more sub-networks (subnets). Thus, a subnet is a constituent unit or partition that can be created within a VCN. A VCN can have one or more subnets. Each subnet within a VCN does not overlap with other subnets within that VCN and is associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0/24 and 10.0.1.0/24) that represents a subset of the VCN's address space.
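The two subnet constraints stated above (each subnet is a subset of the VCN's address space, and subnets within a VCN do not overlap) can be checked with Python's standard ipaddress module. The following is a minimal sketch; the function name validate_subnets is hypothetical, and the shorthand 10.0/16 is written in full as 10.0.0.0/16 because the module requires a complete network address.

```python
import ipaddress


def validate_subnets(vcn_cidr, subnet_cidrs):
    """Check that every subnet lies inside the VCN range and that none overlap."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]

    for subnet in subnets:
        if not subnet.subnet_of(vcn):
            raise ValueError(f"{subnet} is not within the VCN range {vcn}")

    for index, first in enumerate(subnets):
        for second in subnets[index + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")

    return subnets


# The example ranges from the text, inside a 10.0.0.0/16 VCN.
valid = validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"])
```

A call such as validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.0.128/25"]) raises ValueError in this sketch, because the second range falls inside the first.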
[0030] Each compute instance is associated with a virtual network interface card (VNIC). This allows each compute instance to join a subnet in a VCN. A VNIC is a logical representation of a physical network interface card (NIC). Generally, a VNIC is the interface between an entity (e.g., compute instance, service) and a virtual network. A VNIC resides in a subnet and has one or more associated IP addresses and associated security rules or policies. A VNIC is equivalent to a Layer 2 port on a switch. A VNIC connects a compute instance to a subnet within a VCN. The VNIC associated with a compute instance enables the compute instance to be part of a subnet in a VCN and allows the compute instance to communicate (e.g., send and receive packets) with endpoints on the same subnet as the compute instance, endpoints in different subnets within the VCN, or endpoints outside the VCN. Therefore, the VNIC associated with a compute instance determines how the compute instance connects to endpoints inside and outside the VCN. The VNIC for a compute instance is created and associated with that compute instance when the compute instance is created and added to a subnet within the VCN. If a subnet consists of a set of compute instances, it includes the VNICs corresponding to that set, and each VNIC is connected to a compute instance within the set of compute instances.
[0031] Each compute instance is assigned a private overlay IP address via the VNIC associated with it. This private overlay IP address is assigned to the VNIC associated with the compute instance when the compute instance is created and is used to route the compute instance's traffic. All VNICs within a given subnet use the same route table, security lists, and DHCP options. As mentioned above, each subnet within a VCN does not overlap with other subnets within that VCN and is associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0/24 and 10.0.1.0/24) that represents a subset of the address space of that VCN. For a VNIC on a particular subnet of a VCN, the overlay IP address assigned to the VNIC is an address from the contiguous range of overlay IP addresses assigned to the subnet.
[0032] In certain embodiments, a compute instance may be assigned additional overlay IP addresses, such as one or more public IP addresses in the case of a public subnet, in addition to its private overlay IP address, as needed. These multiple addresses may be assigned to the same VNIC or to multiple VNICs associated with the compute instance. However, each instance has a primary VNIC that is created at instance launch and associated with the overlay private IP address assigned to the instance. This primary VNIC cannot be deleted. Additional VNICs, called secondary VNICs, can be added to an existing instance in the same availability domain as the primary VNIC. All VNICs are in the same availability domain as the instance. A secondary VNIC may be in a subnet in the same VCN as the primary VNIC, or in a different subnet that is either in the same VCN or in a different VCN.
[0033] Compute instances can optionally be assigned a public IP address if they are located in a public subnet. When creating a subnet, a customer can designate it as either a public or a private subnet. A private subnet means that resources within that subnet (e.g., compute instances) and associated VNICs cannot have public overlay IP addresses. A public subnet means that resources within that subnet and associated VNICs can have public IP addresses. Customers can specify subnets that exist in a single availability domain or across multiple availability domains within a region or realm.
[0034] As described above, a VCN may be subdivided into one or more subnets. In certain embodiments, a virtual router (referred to as a VCN VR or simply a VR) configured for the VCN enables communication between subnets within the VCN. For a subnet within a VCN, the VR represents the logical gateway for that subnet, enabling communication between the subnet (i.e., compute instances on that subnet) and endpoints on other subnets within the VCN and other endpoints outside the VCN. A VCN VR is a logical entity configured to route traffic between VNICs within the VCN and virtual gateways (gateways) associated with the VCN. Gateways are described further below with reference to Figure 1. A VCN VR is a Layer 3 / IP layer concept. In one embodiment, there is one VCN VR for one VCN. This VCN VR potentially has an unlimited number of ports addressed by IP addresses, with one port for each subnet of the VCN. Thus, the VCN VR has a different IP address for each subnet of the VCN to which the VCN VR is connected. The VR is also connected to various gateways configured for the VCN. In certain embodiments, a specific overlay IP address from a subnet's overlay IP address range is reserved for the port of the VCN VR for that subnet. For example, consider a VCN having two subnets with the associated address ranges 10.0/16 and 10.1/16, respectively. For the first subnet, having the address range 10.0/16, an address from this range is reserved for the port of the VCN VR for that subnet. In some cases, the first IP address from the range may be reserved for the VCN VR. For example, for the subnet having the overlay IP address range 10.0/16, the IP address 10.0.0.1 may be reserved for the port of the VCN VR for that subnet. For the second subnet within the same VCN, having the address range 10.1/16, the VCN VR may have a port for the second subnet with the IP address 10.1.0.1.
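The "first IP address of the subnet range" convention in the example above can be reproduced with Python's standard ipaddress module. This is a sketch only: the function names vr_port_address and build_vr_ports are hypothetical, and the ranges 10.0/16 and 10.1/16 are written in full because the module requires complete network addresses.

```python
import ipaddress


def vr_port_address(subnet_cidr):
    """Return the first usable host address of a subnet's overlay range."""
    subnet = ipaddress.ip_network(subnet_cidr)
    # hosts() skips the network address itself, so the first value
    # for 10.0.0.0/16 is 10.0.0.1.
    return str(next(subnet.hosts()))


def build_vr_ports(subnet_cidrs):
    """Map each subnet of a VCN to the address of its VR port (one port per subnet)."""
    return {cidr: vr_port_address(cidr) for cidr in subnet_cidrs}


ports = build_vr_ports(["10.0.0.0/16", "10.1.0.0/16"])
```

Here ports maps 10.0.0.0/16 to 10.0.0.1 and 10.1.0.0/16 to 10.1.0.1, matching the addresses given in the text. Other embodiments may reserve a different address from the range.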
[0035] In some other embodiments, each subnet within a VCN may have its own associated VR, which is addressable by the subnet using a reserved or default IP address associated with the VR. The reserved or default IP address may be, for example, a first IP address from a range of IP addresses associated with that subnet. A VNIC within a subnet can use this default or reserved IP address to communicate with the VR associated with the subnet (e.g., send and receive packets). In such embodiments, the VR is the incoming / outgoing point for that subnet. A VR associated with a subnet within a VCN can communicate with other VRs associated with other subnets within the VCN. A VR can also communicate with gateways associated with the VCN. The VR functionality of a subnet is performed on, or by, one or more NVDs that perform the VNIC functionality of the VNICs within the subnet.
[0036] Route tables, security rules, and DHCP options may be configured for the VCN. The route table is the VCN's virtual route table and contains rules for routing traffic from subnets within the VCN to destinations outside the VCN, via a gateway or specially configured instance. The VCN's route table can be customized to control packet forwarding / routing to and from the VCN. DHCP options refer to configuration information automatically provided to an instance when it is launched.
[0037] Security rules configured for a VCN represent the VCN's overlay firewall rules. Security rules can include inbound and outbound rules and can specify the types of traffic allowed to enter and exit instances within the VCN (e.g., based on protocol and port). Customers can choose whether certain rules are stateful or stateless. For example, a customer can allow incoming SSH traffic to a pair of instances from any location by configuring a stateful inbound rule with source CIDR 0.0.0.0 / 0 and destination TCP port 22. Security rules may be implemented using network security groups or security lists. A network security group consists of a set of security rules that apply only to resources within that group. A security list, on the other hand, contains rules that apply to all resources within a subnet that uses that security list. A VCN may include default security rules and default security lists. DHCP options configured for a VCN provide configuration information that is automatically provided when instances within the VCN start up.
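The SSH example above can be sketched as a small rule-matching routine. This is an illustrative model only: the `IngressRule` type, the `is_allowed` function, and the default-deny behavior when no rule matches are assumptions made for the sketch and are not taken from the disclosure.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """Hypothetical representation of one inbound security rule."""
    source_cidr: str
    protocol: str
    dest_port: int
    stateful: bool = True


def is_allowed(rules, src_ip, protocol, dest_port):
    """Return True if any inbound rule admits the packet (assumed default deny)."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.source_cidr)
                and protocol == rule.protocol
                and dest_port == rule.dest_port):
            return True
    return False


# Stateful inbound rule: source CIDR 0.0.0.0/0, destination TCP port 22.
ssh_from_anywhere = [IngressRule("0.0.0.0/0", "TCP", 22, stateful=True)]
print(is_allowed(ssh_from_anywhere, "203.0.113.7", "TCP", 22))    # True
print(is_allowed(ssh_from_anywhere, "203.0.113.7", "TCP", 3389))  # False
```

The source address 203.0.113.7 is a documentation address chosen for the example; any IPv4 source matches the 0.0.0.0/0 range.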
[0038] In certain embodiments, VCN configuration information is determined and stored by the VCN control plane. VCN configuration information may include, for example, address ranges associated with the VCN, subnets and associated information within the VCN, one or more VRs associated with the VCN, compute instances and associated VNICs within the VCN, NVDs that perform various virtualized network functions associated with the VCN (e.g., VNICs, VRs, gateways), VCN status information, and other VCN-related information. In certain embodiments, the VCN distribution service exposes the configuration information or a portion thereof stored by the VCN control plane to the NVD. Using the distributed information, packets can be forwarded to and from compute instances within the VCN by updating information stored and used by the NVD (e.g., forwarding tables, routing tables, etc.).
[0039] In certain embodiments, the creation of VCNs and subnets is handled by the VCN control plane (CP), and the startup of compute instances is handled by the compute control plane. The compute control plane is configured to allocate physical resources for compute instances and then call the VCN control plane to create VNICs and connect to the compute instances. The VCN CP also sends VCN data mappings to the VCN data plane, which is configured to perform packet forwarding and routing functions. In certain embodiments, the VCN CP provides a distribution service configured to provide updates to the VCN data plane. Examples of VCN control planes are shown in Figures 17, 18, 19, and 20 (see reference numbers 1716, 1816, 1916, and 2016) and are described below.
[0040] Customers can create one or more VCNs using resources hosted by CSPI. Compute instances deployed on a customer VCN can communicate with different endpoints. These endpoints can include endpoints hosted by CSPI and endpoints outside of CSPI.
[0041] Various different architectures for implementing cloud-based services using CSPI are shown in Figures 1, 2, 3, 4, 5, 17, 18, 19, and 21, and are described below. Figure 1 is a high-level diagram of a distributed environment 100 showing an overlay VCN or customer VCN hosted by CSPI according to a particular embodiment. The distributed environment shown in Figure 1 includes multiple elements within the overlay network. The distributed environment 100 shown in Figure 1 is merely an example and is not intended to unduly limit the scope of the claimed embodiments. Many variations, alternatives, and modifications are possible. For example, in some implementations, the distributed environment shown in Figure 1 may have more or fewer systems or elements than those shown in Figure 1, may combine two or more systems, or may have different system configurations or arrangements.
[0042] As shown in the example in Figure 1, the distributed environment 100 includes a CSPI 101 that provides services and resources that customers can subscribe to and use to build a virtual cloud network (VCN). In a particular embodiment, the CSPI 101 provides IaaS services to subscriber customers. The data centers within the CSPI 101 may be organized into one or more regions. Figure 1 shows an example of a region, the “US region” 102. The customer has configured a customer VCN 104 for region 102. The customer can deploy various compute instances on the VCN 104, which may include virtual machines or bare metal instances. Examples of instances include applications, databases, load balancers, etc.
[0043] In the embodiment shown in Figure 1, customer VCN 104 includes two subnets, namely "Subnet-1" and "Subnet-2", each subnet having its own CIDR IP address range. In Figure 1, the overlay IP address range for subnet-1 is 10.0 / 16, and the address range for subnet-2 is 10.1 / 16. The VCN virtual router 105 represents the logical gateway of the VCN, enabling communication between subnets of VCN 104 and communication with other endpoints outside the VCN. The VCN VR 105 is configured to route traffic between VNICs within VCN 104 and gateways associated with VCN 104. The VCN VR 105 provides ports to each subnet of VCN 104. For example, the VR 105 can provide a port with IP address 10.0.0.1 to subnet-1 and a port with IP address 10.1.0.1 to subnet-2.
[0044] Multiple compute instances can be deployed on each subnet. In this case, compute instances may be virtual machine instances and / or bare metal instances. Compute instances within a subnet may be hosted by one or more host machines within CSPI101. Compute instances join the subnet via the VNIC associated with them. For example, as shown in Figure 1, compute instance C1 is part of subnet-1 via the VNIC associated with it. Similarly, compute instance C2 is part of subnet-1 via the VNIC associated with C2. Similarly, multiple compute instances, which may be virtual machine instances or bare metal instances, may be part of subnet-1. Each compute instance is assigned a private overlay IP address and MAC address via the associated VNIC. For example, in Figure 1, compute instance C1 has the overlay IP address 10.0.0.2 and MAC address M1, and compute instance C2 has the private overlay IP address 10.0.0.3 and MAC address M2. Each compute instance in subnet-1, including compute instances C1 and C2, has a default route to VCN VR105 using IP address 10.0.0.1, which is the IP address of the port of VCN VR105 in subnet-1.
[0045] Multiple compute instances, including virtual machine instances and / or bare metal instances, can be deployed in subnet-2. For example, as shown in Figure 1, compute instances D1 and D2 are part of subnet-2 via the VNIC associated with each compute instance. In the embodiment shown in Figure 1, compute instance D1 has the overlay IP address 10.1.0.2 and MAC address MM1, and compute instance D2 has the private overlay IP address 10.1.0.3 and MAC address MM2. Each compute instance in subnet-2, including compute instances D1 and D2, has a default route to VCN VR 105 using IP address 10.1.0.1, which is the IP address of the port of VCN VR 105 in subnet-2.
[0046] Furthermore, VCN 104 may include one or more load balancers. For example, a load balancer may be provided for a subnet and configured to load balance traffic among multiple compute instances on the subnet. Alternatively, a load balancer may be provided to load balance traffic among subnets within the VCN.
[0047] A specific compute instance deployed on VCN 104 can communicate with various different endpoints. These endpoints may include endpoints hosted by CSPI 101 and endpoints outside of CSPI 101. Endpoints hosted by CSPI 101 may include endpoints on the same subnet as a particular compute instance (e.g., communication between two compute instances in subnet-1), endpoints on different subnets but within the same VCN (e.g., communication between a compute instance in subnet-1 and a compute instance in subnet-2), endpoints in different VCNs within the same region (e.g., communication between a compute instance in subnet-1 and an endpoint in a VCN in the same region 106 or 110, or between a compute instance in subnet-1 and an endpoint in service network 110 in the same region), or endpoints in VCNs in different regions (e.g., communication between a compute instance in subnet-1 and an endpoint in a VCN in a different region 108). In addition, compute instances in subnets hosted by CSPI 101 can communicate with endpoints not hosted by CSPI 101 (i.e., outside of CSPI 101). These external endpoints include endpoints within the customer's on-premises network 116, endpoints within other remote cloud host networks 118, public endpoints 114 accessible via public networks such as the internet, and other endpoints.
[0048] Communication between compute instances on the same subnet is facilitated using VNICs associated with the source and destination compute instances. For example, compute instance C1 in subnet-1 may want to send a packet to compute instance C2, also in subnet-1. For a packet sent from the source compute instance, whose destination is another compute instance on the same subnet, this packet is first processed by the VNIC associated with the source compute instance. The processing performed by the VNIC associated with the source compute instance may include determining the packet's destination information from the packet header, identifying any policies (e.g., security lists) configured for the VNIC associated with the source compute instance, determining the packet's next hop, performing any packet encapsulation / decapsulation functions as needed, and forwarding / routing the packet to the next hop to facilitate communication to its intended destination. If the destination compute instance is on the same subnet as the source compute instance, the VNIC associated with the source compute instance is configured to identify the VNIC associated with the destination compute instance and forward the packet to that VNIC for processing. Next, the VNIC associated with the destination compute instance is executed and forwards the packets to the destination compute instance.
[0049] When a packet is transmitted from a compute instance within a subnet to an endpoint in a different subnet of the same VCN, the communication is facilitated by the VNICs associated with the source and destination compute instances, and the VCN VR. For example, if compute instance C1 in subnet-1 in Figure 1 wants to send a packet to compute instance D1 in subnet-2, the packet is first processed by the VNIC associated with compute instance C1. The VNIC associated with compute instance C1 is configured to route the packet to VCN VR105 using the VCN VR's default route or port 10.0.0.1. VCN VR105 is configured to route the packet to subnet-2 using port 10.1.0.1. The packet is then received and processed by the VNIC associated with D1, and the VNIC forwards the packet to compute instance D1.
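The two intra-VCN cases described above (same subnet versus different subnets of the same VCN) can be sketched as follows, using the addresses given for Figure 1. The function names and the hop labels are hypothetical, and the sketch models only the logical path, not encapsulation or policy checks.

```python
import ipaddress

# Addresses from the Figure 1 description; ranges written out in full.
SUBNETS = {
    "subnet-1": ipaddress.ip_network("10.0.0.0/16"),
    "subnet-2": ipaddress.ip_network("10.1.0.0/16"),
}
VR_PORTS = {"subnet-1": "10.0.0.1", "subnet-2": "10.1.0.1"}
INSTANCES = {"C1": "10.0.0.2", "C2": "10.0.0.3", "D1": "10.1.0.2", "D2": "10.1.0.3"}


def subnet_of(ip):
    """Return the name of the subnet containing the address, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in SUBNETS.items():
        if addr in net:
            return name
    return None


def path(src, dst):
    """List the logical hops a packet takes between two instances in the VCN."""
    src_subnet = subnet_of(INSTANCES[src])
    dst_subnet = subnet_of(INSTANCES[dst])
    hops = [f"VNIC({src})"]
    if src_subnet != dst_subnet:
        # Different subnets: the packet goes through the VCN VR, entering on
        # the source subnet's port and leaving on the destination subnet's port.
        hops.append(f"VR port {VR_PORTS[src_subnet]}")
        hops.append(f"VR port {VR_PORTS[dst_subnet]}")
    hops.append(f"VNIC({dst})")
    return hops


print(path("C1", "C2"))
print(path("C1", "D1"))
```

In this model a packet from C1 to C2 passes directly between the two VNICs, while a packet from C1 to D1 passes through VR ports 10.0.0.1 and 10.1.0.1 before reaching the VNIC associated with D1.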
[0050] To transmit packets from compute instances within VCN104 to endpoints outside VCN104, communication is facilitated by a VNIC associated with the source compute instance, VCN VR105, and a gateway associated with VCN104. One or more types of gateways can be associated with VCN104. A gateway is an interface between the VCN and another endpoint, which is outside the VCN. A gateway is a Layer 3 / IP layer concept that enables communication between the VCN and endpoints outside the VCN. Therefore, gateways facilitate traffic flow between the VCN and other VCNs or networks. Different types of gateways can be configured in the VCN to facilitate different types of communication with different types of endpoints. Through gateways, communication may take place over a public network (e.g., the internet) or a private network. Various communication protocols may be used for these communications.
[0051] For example, compute instance C1 may want to communicate with an endpoint outside of VCN 104. The packet may first be processed by the VNIC associated with source compute instance C1. The VNIC processing determines that the packet's destination is outside subnet-1 of C1. The VNIC associated with C1 can then forward the packet to VCN VR 105 of VCN 104. VCN VR 105 then processes the packet and, as part of the processing, determines a specific gateway associated with VCN 104 as the packet's next hop based on the packet's destination. VCN VR 105 can then forward the packet to the specific gateway. For example, if the destination is an endpoint within the customer's on-premises network, the packet may be forwarded by VCN VR 105 to a dynamic routing gateway (DRG) 122 configured for VCN 104. The packet is then forwarded from the gateway to the next hop, facilitating communication of the packet to its intended final destination.
[0052] Various different types of gateways may be configured for the VCN. Examples of gateways that may be configured for the VCN are shown in Figure 1 and described below. Examples of gateways associated with the VCN are also shown in Figures 17, 18, 19, and 20 (for example, gateways shown by reference numbers 1734, 1736, 1738, 1834, 1836, 1838, 1934, 1936, 1938, 2034, 2036, and 2038) and described below. As shown in the embodiment shown in Figure 1, a dynamic routing gateway (DRG) 122 may be added to or associated with the customer VCN 104. The DRG 122 provides a path for private network traffic communication between the customer VCN 104 and another endpoint. The other endpoint may be the customer on-premises network 116, VCN 108 in a different region of CSPI 101, or another remote cloud network 118 not hosted by CSPI 101. The customer on-premises network 116 may be a customer network or customer data center built using the customer's resources. Access to the customer on-premises network 116 is generally strictly restricted. For a customer that has both the customer on-premises network 116 and one or more VCNs 104 deployed or hosted in the cloud by CSPI 101, the customer may want the on-premises network 116 and the cloud-based VCNs 104 to be able to communicate with each other. This would allow the customer to build an enhanced hybrid environment that includes the customer's VCNs 104 hosted by CSPI 101 and the on-premises network 116. DRG 122 enables such communication. To enable such communication, a communication channel 124 is configured. In this case, one endpoint of the communication channel is on the customer on-premises network 116, and the other endpoint is on CSPI 101 and connected to the customer VCN 104. 
The communication channel 124 can be via a public communication network such as the internet, or a private communication network. Various different communication protocols can be used, such as IPsec VPN technology on public communication networks like the Internet, and Oracle®'s FastConnect technology, which uses a private network instead of a public network. Devices or equipment within the customer on-premises network 116 that form one endpoint of communication channel 124 are called customer premises equipment (CPE), such as CPE 126 shown in Figure 1. The endpoint on the CSPI 101 side may be a host machine running DRG 122.
[0053] In certain embodiments, a Remote Peering Connection (RPC) can be added to the DRG. This allows a customer to peer one VCN with another VCN in a different region. Using such an RPC, a customer VCN 104 can connect to a VCN 108 in a different region using the DRG 122. The DRG 122 may also be used to communicate with other remote cloud networks 118 not hosted by the CSPI 101, such as the Microsoft® Azure cloud or the Amazon® AWS cloud.
[0054] As shown in Figure 1, an Internet Gateway (IGW) 120 can be configured on the customer VCN 104 to enable compute instances on the customer VCN 104 to communicate with a public endpoint 114 accessible via a public network such as the Internet. The IGW 120 is a gateway for connecting the VCN to a public network such as the Internet. The IGW 120 enables public subnets within a VCN, such as VCN 104 (resources within public subnets have public overlay IP addresses), to directly access a public endpoint 112 on a public network such as the Internet 114. Connections can be initiated from subnets within VCN 104 or from the Internet using the IGW 120.
[0055] A Network Address Translation (NAT) gateway 128 can be configured in customer VCN 104. The NAT gateway 128 enables cloud resources within the customer VCN that do not have dedicated public overlay IP addresses to access the internet without directly exposing them to incoming internet connectivity (e.g., L4-L7 connectivity). This allows private subnets within the VCN, such as private subnet-1 of VCN 104, to have private access to public endpoints on the internet. With the NAT gateway, private subnets can initiate connections to the public internet, but connections cannot be initiated from the internet to the private subnets.
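The one-way behavior of the NAT gateway (private subnets can initiate connections, the internet cannot) can be sketched with a simple translation table. The class and method names are hypothetical, and port-based address translation is an assumption of the sketch; the disclosure does not specify the translation mechanism.

```python
class NatGateway:
    """Hypothetical model of a NAT gateway that only admits reply traffic."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._next_port = 40000      # arbitrary starting port for this sketch
        self._by_public_port = {}    # public port -> (private ip, private port)

    def outbound(self, private_ip, private_port):
        """Record an outbound flow and return the public (ip, port) it uses."""
        public_port = self._next_port
        self._next_port += 1
        self._by_public_port[public_port] = (private_ip, private_port)
        return self.public_ip, public_port

    def inbound(self, public_port):
        """Return the private endpoint for a known flow, or None to drop."""
        return self._by_public_port.get(public_port)


nat = NatGateway("198.51.100.10")          # documentation address, illustrative
ip, port = nat.outbound("10.0.0.2", 5555)  # C1 initiates a connection
print(nat.inbound(port))                   # reply reaches ('10.0.0.2', 5555)
print(nat.inbound(port + 1))               # unsolicited traffic: None (dropped)
```

Only flows first recorded by `outbound` have an entry, so traffic arriving on any other port has no private destination and is dropped in this model.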
[0056] In certain embodiments, a service gateway (SGW) 126 can be configured in a customer VCN 104. The SGW 126 provides a route for private network traffic between VCN 104 and service endpoints supported by a service network 110. In certain embodiments, the service network 110 may be provided by a CSP and can provide a variety of services. An example of such a service network is the Oracle® service network, which provides a variety of services that customers can use. For example, compute instances (e.g., database systems) in a private subnet of customer VCN 104 can back up data to service endpoints (e.g., object storage devices) without requiring a public IP address or access to the internet. In some embodiments, a VCN may have only one SGW, and connections can only be initiated from subnets within the VCN, and not from the service network 110. When peering a VCN with another VCN, resources in the other VCN typically cannot access the SGW. Resources in an on-premises network connected to a VCN via FastConnect or VPN Connect can also use the service gateway configured for that VCN.
[0057] In some implementations, SGW 126 uses service Classless Inter-Domain Routing (CIDR) labels. A CIDR label is a string representing all regionally exposed IP address ranges for a service or group of services of interest. Customers use service CIDR labels to control traffic to services when configuring the SGW and associated routing rules. Customers can optionally use service CIDR labels when configuring security rules, so that the security rules do not have to be adjusted if the public IP addresses of services change in the future.
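The benefit of referring to a label rather than to literal address ranges can be sketched as below. The label name, the address ranges, and the function `matches_label` are all hypothetical and chosen only for illustration (the ranges are documentation prefixes, not real service addresses).

```python
import ipaddress

# Hypothetical label and ranges, for illustration only.
SERVICE_CIDR_LABELS = {
    "example-region-object-storage": ["192.0.2.0/25", "198.51.100.0/24"],
}


def matches_label(label, dest_ip):
    """Return True if dest_ip falls in any range currently behind the label."""
    addr = ipaddress.ip_address(dest_ip)
    return any(
        addr in ipaddress.ip_network(cidr) for cidr in SERVICE_CIDR_LABELS[label]
    )


label = "example-region-object-storage"
print(matches_label(label, "192.0.2.10"))    # inside a listed range
print(matches_label(label, "203.0.113.1"))   # not listed yet

# If the provider later adds a range, rules that name the label need no edits.
SERVICE_CIDR_LABELS[label].append("203.0.113.0/24")
print(matches_label(label, "203.0.113.1"))
```

A rule written against the label keeps working after the provider's ranges change, which is the property described above for security rules.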
[0058] The Local Peering Gateway (LPG) 132 is a gateway that can be added to the customer VCN 104 and that enables the VCN 104 to peer with other VCNs within the same region. Peering means that VCNs communicate using private IP addresses without traffic traversing a public network such as the internet or routing traffic through the customer's on-premises network 116. In a preferred embodiment, the VCN has a separate LPG for each established peering. Local peering, or VCN peering, is a common practice used to establish network connectivity between different applications or infrastructure management functions.
[0059] Service providers, such as service providers on service network 110, can provide access to their services using different access models. According to the public access model, a service may be exposed as a public endpoint accessible publicly by compute instances within the customer VCN via a public network such as the internet, or it may be accessed privately via SGW126. According to a specific private access model, a service may be accessed as a private IP endpoint within a private subnet within the customer VCN. This is called private endpoint (PE) access and allows service providers to expose their services as instances within the customer's private network. A private endpoint resource represents a service within the customer VCN. Each PE appears as a VNIC (referred to as a PE-VNIC, having one or more private IPs) selected by the customer from a subnet within the customer VCN. Thus, a PE provides a way to provide services within the customer's private VCN subnet using a VNIC. Because the endpoint is exposed as a VNIC, the PE VNIC can utilize all the features associated with a VNIC, such as routing rules and security lists.
[0060] Service providers enable access via PE by registering their services. Providers can associate policies with services that restrict their visibility to customer tenants. Providers can register multiple services under a single virtual IP address (VIP), especially in the case of multi-tenant services. Multiple private endpoints may exist representing the same service (across multiple VCNs).
[0061] Subsequently, compute instances within the private subnet can access the service using the PE VNIC's private IP address or service DNS name. Compute instances within the customer VCN can access the service by sending traffic to the PE's private IP address within the customer VCN. The Private Access Gateway (PAGW) 130 is a gateway resource that can connect to a service provider VCN (e.g., a VCN within service network 110) and act as the receiving / transmitting point for all traffic to and from the customer subnet private endpoint. The PAGW 130 allows the provider to scale the number of PE connections without utilizing internal IP address resources. The provider only needs to configure one PAGW for any number of services registered in a single VCN. The provider can present a service as a private endpoint in multiple VCNs for one or more customers. From the customer's perspective, the PE VNIC appears to be connected to the service the customer wants to interact with, rather than to the customer's instances. Traffic directed to the private endpoint is routed to the service via the PAGW 130. These are called customer-to-service private connections (C2S connections).
[0062] Furthermore, by using the PE concept, private access to the service can be extended to the customer's on-premises network and data center by enabling traffic to flow through FastConnect / IPsec links and private endpoints within the customer's VCN. Private access to the service can also be extended to the customer's peering VCN by enabling traffic to flow between LPG132 and PEs within the customer's VCN.
[0063] Customers can control VCN routing at the subnet level, allowing them to specify which subnets use which gateways within their VCN, such as VCN104. The VCN's route table can be used to determine whether traffic can be routed outside the VCN through a particular gateway. For example, in a specific case, the route table for a public subnet within customer VCN104 might allow non-local traffic to be sent via IGW120. The route table for a private subnet within the same customer VCN104 might allow traffic to CSP services via SGW126. All remaining traffic could be sent via NAT gateway 128. The route table only controls traffic leaving the VCN.
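The per-subnet route table example above can be sketched as a lookup that picks an egress gateway for traffic leaving the VCN. The longest-prefix-match selection, the service address range, and the function name `egress_target` are assumptions of the sketch; the gateway labels reuse the reference numbers from Figure 1.

```python
import ipaddress

# Subnet ranges from Figure 1; traffic to these stays inside the VCN.
LOCAL = [ipaddress.ip_network("10.0.0.0/16"), ipaddress.ip_network("10.1.0.0/16")]
SERVICE_RANGE = "192.0.2.0/24"  # hypothetical stand-in for CSP service addresses

ROUTE_TABLES = {
    "public": [("0.0.0.0/0", "IGW 120")],
    "private": [(SERVICE_RANGE, "SGW 126"), ("0.0.0.0/0", "NAT gateway 128")],
}


def egress_target(subnet_kind, dest_ip):
    """Pick the gateway for traffic leaving the VCN (assumed longest prefix wins)."""
    dest = ipaddress.ip_address(dest_ip)
    if any(dest in net for net in LOCAL):
        return "local (VCN VR)"  # route tables only govern traffic leaving the VCN
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in ROUTE_TABLES[subnet_kind]
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[0].prefixlen)[1]


print(egress_target("public", "203.0.113.5"))   # IGW 120
print(egress_target("private", "192.0.2.9"))    # SGW 126
print(egress_target("private", "203.0.113.5"))  # NAT gateway 128
```

In this model the private subnet sends service-bound traffic through SGW 126 because the service range is more specific than the default route, and everything else through NAT gateway 128.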
[0064] Security lists associated with a VCN are used to control inbound connections and traffic entering the VCN via gateways. All resources within a subnet use the same route table and security lists. Security lists may be used to control specific types of traffic entering and leaving instances within a VCN subnet. Security list rules may include inbound and outbound rules. For example, inbound rules may specify allowed source address ranges, and outbound rules may specify allowed destination address ranges. Security rules may specify specific protocols (e.g., TCP, ICMP), specific ports (e.g., port 22 for SSH, port 3389 for Windows® RDP), etc. In certain implementations, the instance's operating system may enforce its own firewall rules that match the security list rules. Rules may be stateful (e.g., connections are tracked and responses are automatically allowed without explicit security list rules for response traffic) or stateless.
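The difference between stateful and stateless rules can be sketched with a minimal connection tracker. The class `StatefulFilter` is hypothetical, and the sketch deliberately configures no explicit outbound rules, so outbound packets pass only when they are responses to a tracked inbound connection.

```python
class StatefulFilter:
    """Hypothetical stateful filter: replies to tracked connections are allowed."""

    def __init__(self, allowed_inbound_ports):
        self.allowed_inbound_ports = set(allowed_inbound_ports)
        self.tracked = set()  # (src ip, src port, dst ip, dst port) of admitted flows

    def inbound(self, src, src_port, dst, dst_port):
        """Admit inbound traffic to an allowed port and remember the connection."""
        if dst_port in self.allowed_inbound_ports:
            self.tracked.add((src, src_port, dst, dst_port))
            return True
        return False

    def outbound(self, src, src_port, dst, dst_port):
        """Allow outbound traffic only if it reverses a tracked inbound flow."""
        return (dst, dst_port, src, src_port) in self.tracked


f = StatefulFilter([22])
print(f.inbound("203.0.113.7", 50000, "10.0.0.2", 22))   # True: SSH allowed
print(f.outbound("10.0.0.2", 22, "203.0.113.7", 50000))  # True: tracked response
print(f.outbound("10.0.0.2", 22, "203.0.113.7", 50001))  # False: no such flow
```

A stateless rule would skip the `tracked` set entirely, in which case response traffic would need its own explicit outbound rule.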
[0065] Access from a customer VCN (i.e., resources or compute instances deployed on VCN104) can be classified as public access, private access, or dedicated access. Public access refers to an access model for accessing public endpoints using public IP addresses or NAT. Private access allows customer workloads within VCN104 with private IP addresses (e.g., resources in a private subnet) to access services without traversing a public network such as the internet. In certain embodiments, CSPI101 allows customer VCN workloads with private IP addresses to access the public service endpoint of a service using a service gateway. Thus, the service gateway provides a private access model by establishing a virtual link between the customer VCN and the public endpoint of a service that resides outside the customer's private network.
[0066] Furthermore, CSPI can provide dedicated public access using technologies such as FastConnect public peering. In this case, customer on-premises instances can access one or more services within the customer VCN using FastConnect connectivity without going through a public network such as the internet. CSPI can also provide dedicated private access using FastConnect private peering. In this case, customer on-premises instances with private IP addresses can access customer VCN workloads using FastConnect connectivity. FastConnect is a network connectivity option that serves as an alternative to using the public internet to connect customer on-premises networks to CSPI and its services. FastConnect provides a simple, flexible, and economical way to create dedicated private connectivity with higher bandwidth options and a more reliable, consistent networking experience than internet-based connectivity.
[0067] Figure 1 and the accompanying description above illustrate various virtualization elements in an exemplary virtual network. As mentioned above, the virtual network is built on an underlying physical network or infrastructure network. Figure 2 is a simplified architectural diagram showing the physical elements within the physical network within the CSPI200 that provide the foundation for the virtual network, according to a particular embodiment. As shown, the CSPI200 provides a distributed environment including elements and resources (e.g., compute resources, memory resources, and networking resources) provided by a Cloud Service Provider (CSP). These elements and resources are used to provide cloud services (e.g., IaaS services) to subscribers, i.e., customers who subscribe to one or more services provided by the CSP. Based on the services a customer subscribes to, the CSPI200 provides some resources (e.g., compute resources, memory resources, and networking resources) to the customer. The customer can then use the physical compute resources, memory resources, and networking resources provided by the CSPI200 to build their own cloud-based (i.e., CSPI-hosted) customizable private virtual network. As mentioned above, these customer networks are called virtual cloud networks (VCNs). Customers can deploy one or more customer resources, such as compute instances, to these customer VCNs. Compute instances may be virtual machines, bare metal instances, etc. CSPI200 provides infrastructure and a suite of complementary cloud services that enable customers to build and run a wide range of applications and services in a highly available host environment.
[0068] In the exemplary embodiment shown in Figure 2, the physical elements of CSPI 200 include one or more physical host machines or physical servers (e.g., 202, 206, 208), network virtualization devices (NVDs) (e.g., 210, 212), top-of-rack (TOR) switches (e.g., 214, 216), a physical network (e.g., 218), and switches within physical network 218. The physical host machines or servers can host and run various compute instances participating in one or more subnets of the VCN. Compute instances may include virtual machine instances and bare metal instances. For example, the various compute instances shown in Figure 1 may be hosted by the physical host machines shown in Figure 2. Virtual machine compute instances in the VCN may run on one host machine or on several different host machines. The physical host machines can also host virtual host machines, container-based hosts or functions, etc. The VNICs and VCN VR shown in Figure 1 may run on the NVDs shown in Figure 2. The gateway shown in Figure 1 may be run by the host machine and / or NVD shown in Figure 2.
[0069] A host machine or server can run a hypervisor (also known as a virtual machine monitor or VMM) that creates and enables a virtualized environment on the host machine. Virtualization or a virtualized environment facilitates cloud-based computing. One or more compute instances may be created, run, and managed on the host machine by a hypervisor on the host machine. The hypervisor on the host machine can share the host machine's physical compute resources (e.g., compute resources, memory resources, and networking resources) among various compute instances running on the host machine.
[0070] For example, as shown in Figure 2, host machines 202 and 208 run hypervisors 260 and 266, respectively. These hypervisors may be implemented using software, firmware, hardware, or a combination thereof. Typically, a hypervisor is a process or software layer residing in the host machine's operating system (OS), which runs on the host machine's hardware processor. A hypervisor provides a virtualization environment that allows the host machine's physical computing resources (e.g., processing resources such as processors / cores, memory resources, and networking resources) to be shared among various virtual machine computing instances running on the host machine. For example, in Figure 2, hypervisor 260 resides in the OS of host machine 202 and allows the host machine 202's computing resources (e.g., processing resources, memory resources, and networking resources) to be shared among computing instances (e.g., virtual machines) running on host machine 202. A virtual machine can have its own OS (called a guest OS). This guest OS may be the same as or different from the host machine's OS. The operating system (OS) of a virtual machine running on a host machine may be the same as, or different from, the operating systems of other virtual machines running on the same host machine. Therefore, a hypervisor can run multiple OSs in parallel while sharing the same computing resources of the host machine. The host machines shown in Figure 2 may have the same type of hypervisor or different types of hypervisors.
[0071] Compute instances may be virtual machine instances or bare metal instances. In Figure 2, compute instance 268 on host machine 202 and compute instance 274 on host machine 208 are examples of virtual machine instances. Host machine 206 is an example of a bare metal instance provided to a customer.
[0072] In certain examples, the entire host machine may be provided to a single customer, and one or more compute instances (either virtual machines or bare metal instances) hosted by that host machine may all belong to the same customer. In other examples, the host machine may be shared among multiple customers (i.e., multiple tenants). In such a multi-tenant scenario, the host machine can host virtual machine compute instances belonging to different customers. These compute instances may be members of different VCNs of different customers. In certain embodiments, bare metal compute instances are hosted by bare metal servers without a hypervisor. When bare metal compute instances are provided, a single customer or tenant maintains control of the physical CPU, memory, and network interfaces of the host machine hosting the bare metal instances, and the host machine is not shared with other customers or tenants.
[0073] As mentioned above, each compute instance that is part of a VCN is associated with a VNIC that enables the compute instance to be a member of the VCN's subnet. The VNIC associated with a compute instance facilitates the communication of packets or frames to and from the compute instance. The VNIC is associated with the compute instance when it is created. In certain embodiments, for a compute instance run by a host machine, the VNIC associated with that compute instance is run by an NVD connected to the host machine. For example, in Figure 2, host machine 202 runs virtual machine compute instance 268 associated with VNIC 276, and VNIC 276 is run by an NVD 210 connected to host machine 202. In another example, bare metal instance 272 hosted by host machine 206 is associated with VNIC 280, which is run by an NVD 212 connected to host machine 206. In yet another example, VNIC 284 is associated with compute instance 274 run by host machine 208, and VNIC 284 is run by an NVD 212 connected to host machine 208.
[0074] For compute instances hosted by a host machine, an NVD connected to that host machine executes a VCN VR corresponding to the VCN of which the compute instance is a member. For example, in the embodiment shown in Figure 2, NVD210 executes VCN VR277 corresponding to the VCN of which compute instance 268 is a member. Additionally, NVD212 can execute one or more VCN VR283 corresponding to the VCNs of compute instances hosted by host machines 206 and 208.
[0075] A host machine may include one or more network interface cards (NICs) for connecting it to other devices. The NICs on the host machine may provide one or more ports (or interfaces) for communicating with another device. For example, a host machine can be connected to an NVD using one or more ports (or interfaces) provided on the host machine and the NVD. Alternatively, a host machine can be connected to other devices, such as another host machine.
[0076] For example, in Figure 2, host machine 202 is connected to NVD210 using a link 220 that extends between port 234 provided by NIC 232 of host machine 202 and port 236 of NVD210. Host machine 206 is connected to NVD212 using a link 224 that extends between port 246 provided by NIC 244 of host machine 206 and port 248 of NVD212. Host machine 208 is connected to NVD212 using a link 226 that extends between port 252 provided by NIC 250 of host machine 208 and port 254 of NVD212.
[0077] Similarly, the NVDs are connected via communication links to top-of-rack (TOR) switches connected to a physical network 218 (also called a switch fabric). In certain embodiments, the links between the host machines and the NVDs, and between the NVDs and the TOR switches, are Ethernet® links. For example, in Figure 2, NVDs 210 and 212 are connected to TOR switches 214 and 216, respectively, via links 228 and 230. In certain embodiments, links 220, 224, 226, 228, and 230 are Ethernet® links. The collection of host machines and NVDs connected to the TOR is sometimes referred to as a rack.
[0078] The physical network 218 provides a communication fabric that enables communication between TOR switches. The physical network 218 may be a multi-layer network. In a particular implementation, the physical network 218 is a multi-layer Clos network of switches, and TOR switches 214 and 216 represent leaf-level nodes of the multi-layer and multi-node physical switching network 218. Different Clos network configurations are possible, including but not limited to 2-layer, 3-layer, 4-layer, 5-layer networks, and generally "n"-layer networks. An example of a Clos network is shown in Figure 5 and described below.
[0079] Various connection configurations are possible between host machines and NVDs, including one-to-one, many-to-one, and one-to-many configurations. In an example of a one-to-one configuration, each host machine is connected to its own separate NVD. For example, in Figure 2, host machine 202 is connected to NVD210 via host machine 202's NIC232. In a many-to-one configuration, multiple host machines are connected to a single NVD. For example, in Figure 2, host machines 206 and 208 are connected to the same NVD212 via NIC244 and 250, respectively.
[0080] In a one-to-many configuration, one host machine is connected to multiple NVDs. Figure 3 shows an example within CSPI300 where a host machine is connected to multiple NVDs. As shown in Figure 3, the host machine 302 has a network interface card (NIC) 304 that includes multiple ports 306 and 308. The host machine 302 is connected to the first NVD 310 via port 306 and link 320, and to the second NVD 312 via port 308 and link 322. Ports 306 and 308 may be Ethernet® ports, and links 320 and 322 between the host machine 302 and the NVDs 310 and 312 may be Ethernet® links. The NVD 310 is connected to the first TOR switch 314, and the NVD 312 is connected to the second TOR switch 316. The links between the NVDs 310 and 312 and the TOR switches 314 and 316 may be Ethernet® links. TOR switches 314 and 316 represent layer-0 switching devices within a multilayer physical network 318.
[0081] The configuration shown in Figure 3 provides two separate physical network paths from the physical switch network 318 to the host machine 302: a first path from TOR switch 314 through NVD 310 to the host machine 302, and a second path from TOR switch 316 through NVD 312 to the host machine 302. These separate paths provide enhanced availability (referred to as high availability) for the host machine 302. If one path experiences a problem (e.g., one link in the path fails) or if there is a problem with a device (e.g., a particular NVD is not functioning), the other path can be used for communication with the host machine 302.
[0082] In the configuration shown in Figure 3, the host machine is connected to two different NVDs using two different ports provided by the host machine's NIC. In other embodiments, the host machine may include multiple NICs that enable connections between the host machine and multiple NVDs.
[0083] Referring again to Figure 2, the NVD is a physical device or element that performs one or more network virtualization functions and / or memory virtualization functions. The NVD may be any device having one or more processing units (e.g., a CPU, a network processing unit (NPU), an FPGA, a packet processing pipeline), memory including a cache, and ports. Various virtualization functions may be performed by software / firmware executed by one or more processing units of the NVD.
[0084] NVDs may be implemented in various different forms. For example, in certain embodiments, an NVD may be implemented as an interface card called a smart NIC or intelligent NIC with an integrated processor. A smart NIC is a separate device from the NIC on the host machine. In Figure 2, NVD210 may be implemented as a smart NIC connected to host machine 202, and NVD212 may be implemented as smart NICs connected to host machines 206 and 208.
[0085] However, the smart NIC is just one example of an NVD implementation. Various other implementations are possible. For example, in some other implementations, the NVD or one or more functions performed by the NVD may be incorporated into or performed by one or more host machines, one or more TOR switches, and other elements of the CSPI200. For example, the NVD may be integrated into the host machine. In this case, the functions performed by the NVD are performed by the host machine. As another example, the NVD may be part of a TOR switch, or the TOR switch may be configured to perform functions performed by the NVD, enabling the TOR switch to perform various complex packet translations used in public clouds. A TOR that performs the functions of the NVD is sometimes called a smart TOR. In yet another implementation that provides customers with virtual machine (VM) instances rather than bare metal (BM) instances, the functions provided by the NVD may be implemented within the hypervisor of the host machine. In some other implementations, some of the functions of the NVD may be offloaded to a centralized service running on a set of host machines.
[0086] As shown in Figure 2, in certain embodiments, such as when implemented as a smart NIC, the NVD may have multiple physical ports that enable it to connect to one or more host machines and one or more TOR switches. The ports on the NVD can be classified as host-facing ports (also called "south ports") or network-facing or TOR-facing ports (also called "north ports"). Host-facing ports on the NVD are the ports used to connect the NVD to a host machine. Examples of host-facing ports in Figure 2 include port 236 on the NVD210 and ports 248 and 254 on the NVD212. Network-facing ports on the NVD are the ports used to connect the NVD to a TOR switch. Examples of network-facing ports in Figure 2 include port 256 on the NVD210 and port 258 on the NVD212. As shown in Figure 2, the NVD210 is connected to the TOR switch 214 via a link 228 extending from port 256 on the NVD210 to the TOR switch 214. Similarly, the NVD212 is connected to the TOR switch 216 via a link 230 that extends from port 258 of the NVD212 to the TOR switch 216.
[0087] The NVD can receive packets and frames from the host machine (for example, packets and frames generated by compute instances hosted by the host machine) via its host-facing port, perform the necessary packet processing, and then forward the packets and frames to the TOR switch via its network-facing port. The NVD can also receive packets and frames from the TOR switch via its network-facing port, perform the necessary packet processing, and then forward the packets and frames to the host machine via its host-facing port.
[0088] In certain embodiments, multiple ports and associated links may be provided between the NVD and the TOR switch. By aggregating these ports and links, a link aggregation group (LAG) of multiple ports or links can be formed. Link aggregation allows multiple physical links between two endpoints (e.g., between the NVD and the TOR switch) to be treated as a single logical link. All physical links within a given LAG can operate in full-duplex mode at the same speed. LAGs help to increase the bandwidth and reliability of the connection between the two endpoints. If one of the physical links in the LAG fails, traffic is dynamically and transparently reassigned to another physical link within the LAG. Aggregated physical links provide higher bandwidth than individual links. Multiple ports associated with a LAG are treated as a single logical port. Traffic can be load-balanced across the multiple physical links of the LAG. One or more LAGs can be configured between two endpoints. The two endpoints may be, for example, an NVD and a TOR switch, or a host machine and an NVD.
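As a non-limiting illustration, the load-balancing and failover behavior of a LAG can be sketched as follows. The class name, the link names, and the use of a CRC32 hash of a flow key to select a link are assumptions made for this sketch only; the present description does not specify how traffic is distributed across the physical links.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass, field


@dataclass
class LinkAggregationGroup:
    """Hypothetical model of a LAG: several physical links, one logical link."""

    links: list[str]
    failed: set[str] = field(default_factory=set)

    def active_links(self) -> list[str]:
        # Only links that have not failed may carry traffic.
        return [link for link in self.links if link not in self.failed]

    def pick_link(self, flow_key: str) -> str:
        # Assumption: a flow is pinned to a link by hashing its key, so that
        # packets of one flow stay on one link while flows spread across links.
        active = self.active_links()
        if not active:
            raise RuntimeError("no active physical link in the LAG")
        return active[zlib.crc32(flow_key.encode()) % len(active)]


lag = LinkAggregationGroup(links=["link-a", "link-b"])
chosen = lag.pick_link("10.0.0.5->10.0.1.7")
lag.failed.add(chosen)  # simulate a failure of the chosen physical link
fallback = lag.pick_link("10.0.0.5->10.0.1.7")  # traffic moves to the other link
```

In this sketch the caller only ever addresses the LAG object, which mirrors the treatment of the aggregated ports as a single logical port.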
[0089] The NVD implements or performs network virtualization functions. These functions are performed by software / firmware run by the NVD. Examples of network virtualization functions include, but are not limited to, packet encapsulation and decapsulation functions, functions for creating VCN networks, functions for implementing network policies such as VCN security list (firewall) functions, and functions for facilitating the routing and forwarding of packets between compute instances within the VCN. In certain embodiments, upon receiving a packet, the NVD is configured to run a packet processing pipeline that processes the packet and determines how to forward or route it. As part of this packet processing pipeline, the NVD can execute VNICs associated with compute instances within the VCN, execute virtual routers (VRs) associated with the VCN, perform packet encapsulation and decapsulation to facilitate forwarding or routing within the virtual network, execute specific gateways (e.g., local peering gateways), and implement security lists, network security groups, network address translation (NAT) functions (e.g., translation from public IP to private IP on a per-host basis), throttling functions, and other functions.
[0090] In some embodiments, the packet processing data path in the NVD may include multiple packet pipelines. Each packet pipeline consists of a set of packet translation stages. In some implementations, upon receiving a packet, it is parsed and classified into a single pipeline. The packet is then processed linearly, stage by stage, until it is discarded or sent out through the NVD's interface. These stages provide the basic functional packet processing building blocks (e.g., header validation, throttling, insertion of new Layer 2 headers, L4 firewall execution, VCN encapsulation / decapsulation), and as a result, new pipelines can be constructed by assembling existing stages, and new functionality can be added by creating new stages and inserting them into existing pipelines.
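As a non-limiting illustration, a pipeline assembled from reusable stages can be sketched as follows. The stage functions, the dictionary-based packet representation, and the example firewall rule are hypothetical and are chosen only to show linear, stage-by-stage processing in which a packet is either discarded or passed on.

```python
from typing import Callable, Optional

Packet = dict
# A stage returns the (possibly modified) packet, or None to discard it.
Stage = Callable[[Packet], Optional[Packet]]


def validate_header(packet: Packet) -> Optional[Packet]:
    # Discard packets that lack a destination address.
    return packet if "dst" in packet else None


def l4_firewall(packet: Packet) -> Optional[Packet]:
    # Hypothetical rule for this sketch: discard traffic to port 23.
    return None if packet.get("dst_port") == 23 else packet


def vcn_encapsulate(packet: Packet) -> Optional[Packet]:
    # Wrap the customer packet in a placeholder outer header.
    return {"outer": "substrate-header", "inner": packet}


def run_pipeline(stages: list, packet: Packet) -> Optional[Packet]:
    # Process the packet linearly until it is discarded or leaves the pipeline.
    current: Optional[Packet] = packet
    for stage in stages:
        current = stage(current)
        if current is None:
            return None
    return current


pipeline = [validate_header, l4_firewall, vcn_encapsulate]
sent = run_pipeline(pipeline, {"dst": "10.0.1.7", "dst_port": 443})
dropped = run_pipeline(pipeline, {"dst": "10.0.1.7", "dst_port": 23})
```

A new pipeline is obtained by building a different list from the same stages, and a new function is added by writing one more stage and inserting it into an existing list.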
[0091] The NVD can perform both control plane and data plane functions corresponding to the VCN's control plane and data plane. Examples of the VCN control plane are shown in Figures 17, 18, 19, and 20 (see reference numbers 1716, 1816, 1916, and 2016) and are described below. Examples of the VCN data plane are shown in Figures 17, 18, 19, and 20 (see reference numbers 1718, 1818, 1918, and 2018) and are described below. Control plane functions include functions used to configure the network to control how data is forwarded (e.g., setting routes and route tables, configuring VNICs). In certain embodiments, a VCN control plane is provided that centrally calculates the mapping of all overlays to the substrate and exposes it to the NVD and virtual network edge devices (e.g., various gateways such as DRG, SGW, IGW). Firewall rules can also be exposed using the same mechanism. In certain embodiments, the NVD retrieves only the mappings relevant to that NVD. The data plane function includes the ability to perform the actual routing / forwarding of packets based on the configuration set using the control plane. The VCN data plane is implemented by encapsulating customer network packets before they pass through the backbone network. The encapsulation / decapsulation function is implemented in the NVD. In certain embodiments, the NVD is configured to intercept all network packets entering and leaving the host machine and to perform network virtualization functions.
[0092] As described above, the NVD performs various virtualization functions, including VNICs and VCN VRs. An NVD can run VNICs associated with compute instances hosted by one or more host machines connected to the NVD. For example, as shown in Figure 2, NVD210 runs the functions of VNIC276 associated with compute instance 268 hosted by host machine 202 connected to NVD210. As another example, NVD212 runs VNIC280 associated with bare-metal compute instance 272 hosted by host machine 206 and VNIC284 associated with compute instance 274 hosted by host machine 208. A host machine can host compute instances belonging to different VCNs belonging to different customers. An NVD connected to a host machine can run VNICs corresponding to compute instances (i.e., perform functions associated with VNICs).
[0093] Furthermore, the NVD runs a VCN virtual router corresponding to the VCN of the compute instance. For example, in the embodiment shown in Figure 2, NVD210 runs VCN VR277 corresponding to the VCN to which compute instance 268 belongs. NVD212 runs one or more VCN VR283 corresponding to one or more VCNs to which compute instances hosted on host machines 206 and 208 belong. In a particular embodiment, a VCN VR corresponding to a VCN is run by all NVDs connected to a host machine that hosts at least one compute instance belonging to that VCN. If a host machine hosts compute instances belonging to different VCNs, the NVDs connected to that host machine can run VCN VRs corresponding to those different VCNs.
[0094] In addition to VNICs and VCN VRs, an NVD may include one or more hardware elements that run various software (e.g., daemons) and facilitate various network virtualization functions performed by the NVD. For simplicity, these various elements are grouped as “packet processing elements” as shown in Figure 2. For example, NVD210 includes packet processing element 286, and NVD212 includes packet processing element 288. For example, a packet processing element of an NVD may include a packet processor configured to monitor all packets received and communicated using the NVD and to store network information by interacting with the NVD’s ports and hardware interfaces. Network information may include, for example, network flow information to identify different network flows processed by the NVD and information about each flow (e.g., statistics for each flow). In certain embodiments, network flow information may be stored on a per-VNIC basis. As another example, a packet processing element may include a replication agent configured to replicate the information stored by the NVD to one or more different replication target stores. As yet another example, the packet processing element may include a logging agent configured to perform the NVD's logging function. The packet processing element may also include software to monitor the performance and health of the NVD, and optionally the status and health of other elements connected to the NVD.
[0095] Figure 1 shows the elements of an exemplary virtual or overlay network, including a VCN, subnets within the VCN, compute instances deployed on the subnets, VNICs associated with the compute instances, a VR for the VCN, and a set of gateways configured for the VCN. The overlay elements shown in Figure 1 may be run or hosted by one or more of the physical elements shown in Figure 2. For example, compute instances within a VCN may be run or hosted by one or more host machines shown in Figure 2. In the case of compute instances hosted by host machines, the VNICs associated with those compute instances are typically run by NVDs connected to that host machine (i.e., VNIC functionality is provided by NVDs connected to that host machine). The VCN VR functionality of the VCN is run by all NVDs connected to the host machines that host or run the compute instances that are part of that VCN. Gateways associated with the VCN may be run by one or more different types of NVDs. For example, some gateways may be run by smart NICs, and others may be run by one or more host machines or other implementations of NVDs.
[0096] As described above, compute instances within a customer VCN can communicate with a variety of different endpoints. These endpoints may be on the same subnet as the source compute instance, on a different subnet but still within the same VCN, or may include endpoints outside the source compute instance's VCN. These communications are facilitated using the VNIC associated with the compute instance, the VCN VR, and the gateway associated with the VCN.
[0097] Communication between two compute instances on the same subnet within a VCN is facilitated using VNICs associated with the source and destination compute instances. The source and destination compute instances may be hosted on the same host machine or on different host machines. Packets originating from the source compute instance may be forwarded from the host machine hosting the source compute instance to an NVD connected to that host machine. In the NVD, packets are processed using a packet processing pipeline, which may include the execution of the VNIC associated with the source compute instance. Because the destination endpoint of the packets is on the same subnet, the execution of the VNIC associated with the source compute instance forwards the packets to the NVD running the VNIC associated with the destination compute instance, where the NVD processes the packets and forwards them to the destination compute instance. The VNICs associated with the source and destination compute instances may run on the same NVD (for example, if both the source and destination compute instances are hosted on the same host machine) or on different NVDs (for example, if the source and destination compute instances are hosted on different host machines connected to different NVDs). The VNIC can use the routing / forwarding table stored by the NVD to determine the next hop of a packet.
[0098] When a packet is communicated from a compute instance within a subnet to an endpoint in a different subnet within the same VCN, the packet originating from the source compute instance is communicated from the host machine hosting the source compute instance to the NVD connected to that host machine. In the NVD, the packet is processed using a packet processing pipeline, which may include the execution of one or more VNICs, and the VR associated with the VCN. For example, as part of the packet processing pipeline, the NVD executes or invokes a function corresponding to the VNIC associated with the source compute instance (also called executing the VNIC). The function executed by the VNIC may include examining the VLAN tag on the packet. Because the packet's destination is outside the subnet, a VCN VR function is invoked and executed by the NVD. The VCN VR then routes the packet to the NVD executing the VNIC associated with the destination compute instance. The VNIC associated with the destination compute instance then processes the packet and forwards it to the destination compute instance. The VNICs associated with the source compute instance and the destination compute instance may run on the same NVD (for example, if both the source compute instance and the destination compute instance are hosted by the same host machine), or they may run on different NVDs (for example, if the source compute instance and the destination compute instance are hosted by different host machines connected to different NVDs).
[0099] If the packet's destination is outside the VCN of the source compute instance, the packet originating from the source compute instance is communicated from the host machine hosting the source compute instance to the NVD connected to that host machine. The NVD runs the VNIC associated with the source compute instance. Because the packet's destination endpoint is outside the VCN, the packet is processed by the VCN VR of that VCN. The NVD invokes VCN VR functionality, which may result in the packet being forwarded to an NVD running the appropriate gateway associated with the VCN. For example, if the destination is an endpoint within the customer's on-premises network, the packet may be forwarded by the VCN VR to an NVD running the DRG gateway configured for the VCN. The VCN VR may run on the same NVD as the NVD running the VNIC associated with the source compute instance, or it may run on a different NVD. The gateway may run on an NVD that is a smart NIC, a host machine, or another NVD implementation. The packet is then processed by the gateway and forwarded to the next hop to facilitate communication of the packet to its intended destination endpoint. For example, in the embodiment shown in Figure 2, a packet originating from compute instance 268 may be communicated from host machine 202 to NVD210 via link 220 (using NIC 232). VNIC 276 on NVD210 is invoked because it is the VNIC associated with source compute instance 268. VNIC 276 is configured to examine the encapsulated information in the packet, determine the next hop for forwarding the packet to facilitate communication of the packet to its intended destination endpoint, and forward the packet to the determined next hop.
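As a non-limiting illustration, the three cases described above (same subnet, different subnet within the same VCN, and destination outside the VCN) can be sketched as a single classification step. The function name, the returned labels, and the use of CIDR membership tests to distinguish the cases are assumptions made for this sketch; the present description does not state how the NVD makes this determination.

```python
import ipaddress


def classify_destination(dst_ip: str, source_subnet: str, vcn_cidr: str) -> str:
    """Return which overlay element handles the packet next (hypothetical labels)."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in ipaddress.ip_network(source_subnet):
        # Same subnet: the source VNIC forwards toward the NVD that runs
        # the VNIC associated with the destination compute instance.
        return "vnic"
    if dst in ipaddress.ip_network(vcn_cidr):
        # Different subnet, same VCN: the VCN VR routes the packet.
        return "vcn_vr"
    # Outside the VCN: the VCN VR hands the packet to an appropriate gateway.
    return "gateway"


same_subnet = classify_destination("10.0.1.7", "10.0.1.0/24", "10.0.0.0/16")
other_subnet = classify_destination("10.0.2.9", "10.0.1.0/24", "10.0.0.0/16")
outside_vcn = classify_destination("192.168.5.1", "10.0.1.0/24", "10.0.0.0/16")
```

The example addresses are arbitrary private ranges and do not correspond to any element shown in the figures.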
[0100] Compute instances deployed on a VCN can communicate with various different endpoints. These endpoints may include endpoints hosted by CSPI200 and endpoints outside of CSPI200. Endpoints hosted by CSPI200 may include instances within the same VCN or other VCNs (which may be customer VCNs or VCNs not belonging to a customer). Communication between endpoints hosted by CSPI200 may be performed over the physical network 218. Compute instances can also communicate with endpoints not hosted by CSPI200 or located outside of CSPI200. Examples of these endpoints include endpoints within the customer's on-premises network or data center, or public endpoints accessible over a public network such as the Internet. Communication with endpoints outside of CSPI200 may be performed over a public network (e.g., the Internet) (not shown in Figure 2) or a private network (not shown in Figure 2) using various communication protocols.
[0101] The architecture of the CSPI200 shown in Figure 2 is merely an example and is not intended to be limiting. Alternative embodiments are possible, and variations, substitutions, and modifications are possible. For example, in some implementations, the CSPI200 may have more or fewer systems or elements than those shown in Figure 2, may combine two or more systems, or may have different system configurations or arrangements. The systems, subsystems, and other elements shown in Figure 2 may be implemented as software (e.g., code, instructions, programs), hardware, or a combination thereof, executed by one or more processing units (e.g., processors, cores) of each system. The software may be stored in a non-transitory storage medium (e.g., a memory device).
[0102] Figure 4 shows a connection between a host machine and an NVD to provide I / O virtualization to support multi-tenancy functionality, according to a particular embodiment. As shown in Figure 4, the host machine 402 runs a hypervisor 404 that provides the virtualization environment. The host machine 402 runs two virtual machine instances, namely VM1 406 belonging to customer / tenant #1 and VM2 408 belonging to customer / tenant #2. The host machine 402 includes a physical NIC 410 connected to the NVD 412 via link 414. Each compute instance is connected to a VNIC run by the NVD 412. In the embodiment of Figure 4, VM1 406 is connected to VNIC-VM1 420 and VM2 408 is connected to VNIC-VM2 422.
[0103] As shown in Figure 4, NIC410 includes two logical NICs, namely logical NIC A 416 and logical NIC B 418. Each virtual machine is connected to its own logical NIC and configured to operate with its own logical NIC. For example, VM1 406 is connected to logical NIC A 416, and VM2 408 is connected to logical NIC B 418. Although the host machine 402 includes only one physical NIC 410 shared by multiple tenants, the logical NICs allow each tenant's virtual machine to believe that it owns its own host machine and NIC.
[0104] In a particular embodiment, each logical NIC is assigned its own VLAN ID. Thus, logical NIC A 416 for tenant #1 is assigned a specific VLAN ID, and logical NIC B 418 for tenant #2 is assigned a different VLAN ID. When a packet is communicated from VM1 406, the hypervisor attaches the tag assigned to tenant #1 to the packet and then communicates the packet from host machine 402 to NVD412 via link 414. Similarly, when a packet is communicated from VM2 408, the hypervisor attaches the tag assigned to tenant #2 to the packet and then communicates the packet from host machine 402 to NVD412 via link 414. Thus, the packet 424 communicated from host machine 402 to NVD412 has an associated tag 426 that identifies a specific tenant and associated VM. When packet 424 is received from host machine 402 at NVD412, the tag 426 associated with the packet is used to determine whether the packet should be processed by VNIC-VM1 420 or VNIC-VM2 422. The packet is then processed by the corresponding VNIC. The configuration shown in Figure 4 allows each tenant's compute instance to believe that it owns its own host machine and NIC. The configuration shown in Figure 4 provides I / O virtualization to support multi-tenancy functionality.
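As a non-limiting illustration, the tag-based selection of a VNIC shown in Figure 4 can be sketched as follows. The VLAN ID values 100 and 200, the function names, and the dictionary-based packet representation are hypothetical; the present description states only that each logical NIC is assigned its own VLAN ID.

```python
# Hypothetical VLAN IDs assigned to the two logical NICs of Figure 4.
TAG_TO_VNIC = {
    100: "VNIC-VM1",  # tenant #1, logical NIC A
    200: "VNIC-VM2",  # tenant #2, logical NIC B
}


def hypervisor_tag(payload: bytes, vlan_id: int) -> dict:
    # The hypervisor attaches the tenant's tag before the packet leaves the host.
    return {"vlan_id": vlan_id, "payload": payload}


def nvd_select_vnic(packet: dict, tag_to_vnic: dict) -> str:
    # The NVD uses the tag to decide which VNIC processes the packet.
    vnic = tag_to_vnic.get(packet["vlan_id"])
    if vnic is None:
        raise LookupError(f"no VNIC configured for VLAN ID {packet['vlan_id']}")
    return vnic


from_vm1 = hypervisor_tag(b"hello from VM1", vlan_id=100)
from_vm2 = hypervisor_tag(b"hello from VM2", vlan_id=200)
vnic_for_vm1 = nvd_select_vnic(from_vm1, TAG_TO_VNIC)
vnic_for_vm2 = nvd_select_vnic(from_vm2, TAG_TO_VNIC)
```

Both packets travel over the same physical link, and only the tag distinguishes the tenant to which each packet belongs.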
[0105] Figure 5 is a schematic block diagram showing a physical network 500 according to a particular embodiment. The embodiment shown in Figure 5 is constructed as a Clos network. A Clos network is a specific type of network topology designed to provide connectivity redundancy while maintaining high bisection bandwidth and maximum resource utilization. A Clos network is a type of non-blocking, multi-stage or multi-layer switching network, where the number of stages or layers may be 2, 3, 4, 5, etc. The embodiment shown in Figure 5 is a 3-layer network including layers 1, 2, and 3. A TOR switch 504 represents a layer-0 switch in the Clos network. One or more NVDs are connected to the TOR switch. Layer-0 switches are also called edge devices in the physical network. Layer-0 switches are connected to layer-1 switches, also called leaf switches. In the embodiment shown in Figure 5, "n" layer-0 TOR switches are connected to "n" layer-1 switches to form pods. Each layer-0 switch in a pod is interconnected to all layer-1 switches in the pod, but switches between pods are not connected. In a particular implementation, two pods are referred to as a block. Each block is serviced by or connected to n layer-2 switches (also called spine switches). The physical network topology may contain multiple blocks. Similarly, the layer-2 switches are connected to n layer-3 switches (also called superspine switches). Packet communication over the physical network 500 is typically performed using one or more Layer 3 communication protocols. Typically, all layers of the physical network except the TOR layer are n-way redundant, thus achieving high availability. The physical network can be extended by specifying policies for pods and blocks to control the mutual visibility of switches in the physical network.
[0106] A key feature of Clos networks is that the maximum hop count required to reach one layer-0 switch from another (or to reach an NVD connected to a layer-0 switch from another NVD connected to a layer-0 switch) remains constant. For example, in a 3-layer Clos network, a packet requires a maximum of 7 hops to reach one NVD from another. In this case, the source NVD and target NVD are connected to the leaf layer of the Clos network. Similarly, in a 4-layer Clos network, a packet requires a maximum of 9 hops to reach one NVD from another. In this case, the source NVD and target NVD are connected to the leaf layer of the Clos network. Therefore, the Clos network architecture maintains a constant overall network latency, which is crucial for communication within and between data centers. Clos topologies are horizontally scalable and cost-effective. Network bandwidth / throughput capacity can be easily increased by adding more switches to each layer (e.g., more leaf and spine switches) and increasing the number of links between switches in adjacent layers.
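As a non-limiting illustration, the two hop counts given above (7 hops for a 3-layer network and 9 hops for a 4-layer network) are consistent with a bound of 2n + 1 hops for an n-layer network. This formula is an extrapolation from those two values, made for this sketch only; the present description does not state a general formula.

```python
def max_hops_between_nvds(layers: int) -> int:
    """Assumed worst-case hop count between two NVDs in an n-layer Clos network.

    The expression 2 * layers + 1 is fitted to the two values stated in the
    text (3 layers -> 7 hops, 4 layers -> 9 hops) and is not taken from it.
    """
    if layers < 1:
        raise ValueError("a Clos network needs at least one layer")
    return 2 * layers + 1


three_layer = max_hops_between_nvds(3)
four_layer = max_hops_between_nvds(4)
```

Under this assumption the bound depends only on the number of layers and not on the number of switches per layer, which is why adding switches to a layer increases capacity without increasing the worst-case path length.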
[0107] In certain embodiments, each resource within the CSPI is assigned a unique identifier called a Cloud Identifier (CID). This identifier is included as part of the resource's information. This identifier can be used to manage the resource, for example, via a console or API. An example syntax for a CID is as follows:
[0108] ocid1.<RESOURCE TYPE>.<REALM>.[REGION][.FUTURE USE].<UNIQUE ID> where "ocid1" is a string that indicates the CID version.
[0109] "RESOURCE TYPE" represents the type of resource (e.g., instance, volume, VCN, subnet, user, group).
[0110] "REALM" represents the realm in which the resource resides. Exemplary values include "c1" representing the commercial realm, "c2" representing the government cloud realm, or "c3" representing the federal government cloud realm. Each realm can have its own domain name.
[0111] "REGION" represents the region to which the resource belongs. If no region applies to the resource, this section may be left blank.
[0112] "FUTURE USE" indicates that it is reserved for future use. The "UNIQUE ID" is the unique identifier portion. This format may vary depending on the type of resource or service.
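The field descriptions above can be illustrated with a small parser. This is a sketch only: it assumes the fields are separated by dots, that the unique ID itself contains no dots, and that the reserved field is simply absent when unused. The class name `CloudId`, the function name `parse_cid`, and the sample region and unique-ID values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudId:
    version: str
    resource_type: str
    realm: str
    region: str       # empty string when no region applies
    future_use: str   # empty string when the reserved field is absent
    unique_id: str


def parse_cid(cid: str) -> CloudId:
    """Split a CID into its fields (assumes dot-separated fields)."""
    parts = cid.split(".")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 dot-separated fields, got {len(parts)}")
    version, resource_type, realm, region = parts[:4]
    future_use = parts[4] if len(parts) == 6 else ""
    unique_id = parts[-1]
    if version != "ocid1":
        raise ValueError(f"unsupported CID version: {version!r}")
    if not resource_type or not realm or not unique_id:
        raise ValueError("resource type, realm and unique ID must be non-empty")
    return CloudId(version, resource_type, realm, region, future_use, unique_id)


# Hypothetical identifiers, for illustration only.
print(parse_cid("ocid1.instance.c1.phx.exampleuniqueid"))
print(parse_cid("ocid1.user.c1..exampleuniqueid"))  # region left blank
```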
[0113] B. Exemplary Layer 2 VLAN Architecture This section describes technologies for providing Layer 2 networking capabilities in a virtualized cloud environment. Layer 2 capabilities are provided in addition to, and in relation to, the Layer 3 networking capabilities provided by the virtualized cloud environment. In certain embodiments, virtual Layer 2 and Layer 3 capabilities are provided by Oracle Cloud Infrastructure (OCI), provided by Oracle Corporation.
[0132] Following the introduction of Layer 2 networking functionality, this section explains Layer 2 implementation for VLANs. Subsequently, it describes Layer 2 VLAN services, including Internet Group Management Protocol (IGMP) functionality.
[0114] Preface The number of enterprise customers migrating their on-premises applications to cloud environments provided by cloud service providers (CSPs) continues to grow rapidly. However, many of these customers quickly realize that the migration journey to the cloud can be extremely challenging, requiring their existing applications to be rebuilt and redesigned to function in the cloud environment. This is because applications written for on-premises environments often rely on the characteristics of the physical network in terms of monitoring, availability, and scalability. Therefore, these on-premises applications need to be rebuilt and redesigned before they can function in the cloud environment.
[0115] There are several reasons why on-premises applications cannot easily migrate to a cloud environment. One of the main reasons is that current cloud virtual networks operate at Layer 3 of the OSI model, for example, the IP layer, and do not provide the Layer 2 functionality required by applications. Layer 3-based routing or forwarding involves determining where a packet should be sent (for example, to which customer instance) based on information contained in the Layer 3 header of the packet, for example, based on the destination IP address contained in the Layer 3 header of the packet. To facilitate this, the location of IP addresses within the virtualized cloud network is determined via a centralized control and orchestration system or controller. These may include, for example, IP addresses associated with customer entities or resources within the virtualized cloud environment.
[0116] Many customers are running applications in their on-premises environments that have stringent requirements for Layer 2 networking capabilities that are not currently addressed by current cloud offerings and IaaS service providers. For example, traffic is routed using Layer 3 protocols with Layer 3 headers in current cloud offerings, and the Layer 2 capabilities required by the applications are not supported. These Layer 2 capabilities may include features such as Address Resolution Protocol (ARP) processing, Media Access Control (MAC) address learning, and Layer 2 broadcast capabilities, Layer 2 (MAC-based) forwarding, and Layer 2 networking constructs. By providing virtualized Layer 2 networking capabilities in a virtualized cloud network as described in this disclosure, customers can now seamlessly migrate their legacy applications to the cloud environment without requiring any substantial restructuring or redesign. For example, the virtualized Layer 2 networking capabilities described herein enable such applications (e.g., VMware vSphere, vCenter, vSAN, and NSX-T components) to communicate at Layer 2 as they would in an on-premises environment. These applications can run the same versions and configurations in the public cloud, allowing customers to use legacy on-premises applications, including existing knowledge, tools, and processes associated with the legacy applications. Customers can also access native cloud services from their applications (for example, using VMware Software-Defined Data Centers (SDDCs)).
[0117] As another example, several legacy on-premises applications (e.g., enterprise clustering software applications, network virtual appliances) require Layer 2 broadcast support for failover. Illustrative applications include Fortinet FortiGate, IBM® QRadar, Palo Alto firewalls, Cisco ASA, Juniper SRX, and Oracle RAC (Real Application Clusters). As described in this disclosure, by providing virtualized Layer 2 networking in a virtualized public cloud, these appliances can now operate in a virtualized public cloud environment without modification. Virtualized Layer 2 networking capabilities comparable to on-premises are provided, as described herein. The virtualized Layer 2 networking capabilities described in this disclosure support traditional Layer 2 networking, including customer-defined VLANs, as well as unicast, broadcast, and multicast Layer 2 traffic. Layer 2-based packet routing and forwarding uses Layer 2 protocols and the information contained in the packet's Layer 2 header, for example, routing or forwarding a packet based on the destination MAC address contained in that header. Protocols used by enterprise applications (e.g., clustering software applications), such as ARP, Gratuitous Address Resolution Protocol (GARP), and Reverse Address Resolution Protocol (RARP), can now also function in cloud environments.
[0118] There are several reasons why traditional virtualized cloud infrastructure supports virtualized Layer 3 networking and not Layer 2 networking. Layer 2 networks typically do not scale in the same way as Layer 3 networks. Layer 2 network control protocols do not have the level of sophistication desired for scaling. For example, Layer 3 networks do not have to worry about packet looping, which Layer 2 networks must deal with. IP packets (i.e., Layer 3 packets) have the concept of Time To Live (TTL), while Layer 2 packets do not. IP addresses contained within Layer 3 packets have topological concepts such as subnets and CIDR ranges, while Layer 2 addresses (e.g., MAC addresses) do not. Layer 3 IP networks have built-in tools that facilitate troubleshooting, such as ping and traceroute for finding path information. Such tools are not available for Layer 2. Layer 3 networks support multipath functionality, which is not available for Layer 2. Because Layer 2 lacks sophisticated control protocols for exchanging information between entities in a network (such as the Border Gateway Protocol (BGP) and Open Shortest Path First (OSPF) used at Layer 3), Layer 2 networks must rely on broadcast and multicast to learn about the network, which can negatively impact network performance. As the network changes, the learning process for Layer 2 must be repeated, which is not necessary for Layer 3. For these and other reasons, it is more desirable for cloud IaaS service providers to provide infrastructure that operates at Layer 3 rather than Layer 2.
[0119] However, Layer 2 functionality is required by many on-premises applications despite its numerous drawbacks. For example, consider a virtualized cloud configuration where a customer (customer 1) has two instances, instance A with IP1 and instance B with IP2, in a virtual network "V", where instances can be compute instances (e.g., bare metal, virtual machines, or containers) or service instances such as load balancers, NFS mount points, or other service instances. Virtual network V is a separate address space isolated from other virtual networks and the underlying physical network. This isolation can be achieved using various techniques, including packet encapsulation or NAT. For this reason, the IP addresses of instances in the customer's virtual network are different from the addresses in the physical network where they are hosted. A centralized SDN (Software Defined Networking) control plane is provided that knows the physical IP and virtual interface at which every virtual IP address resides. When a packet is sent from instance A to a destination IP2 in virtual network V, the virtual network SDN stack needs to know where IP2 is located. It must know this in advance so that it can send the packet to the IP in the physical network where the virtual IP address IP2 for V is hosted. The location of a virtual IP address can be modified within the cloud, thus changing the relationship between the physical IP and the virtual IP address. Every time a virtual IP address is moved (for example, moving the IP address associated with a virtual machine to another virtual machine, or migrating a virtual machine to a new physical host), an API call must be made to the SDN control plane to inform the controller that the IP has been moved, so that it can update all participants in the SDN stack, including the packet processor (data plane). However, there is a class of applications that does not make such API calls. Examples include various on-premises applications and applications provided by various virtualization software vendors such as VMware. The value of facilitating virtual Layer 2 networking in a virtualized cloud environment lies in enabling support for applications that are not programmed to make such API calls, or applications that rely on other Layer 2 networking features, such as support for non-IP Layer 3 and MAC learning.
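The dependency on explicit API calls described above can be sketched as a lookup table that only changes when it is told to. This is an illustrative model, not the disclosed control plane: the class `SdnControlPlane`, its method names, and the addresses used are all hypothetical.

```python
class SdnControlPlane:
    """Toy centralized mapping of virtual IPs to physical host IPs."""

    def __init__(self) -> None:
        # (virtual network, virtual IP) -> physical IP hosting that virtual IP
        self._location: dict[tuple[str, str], str] = {}

    def register(self, vnet: str, virtual_ip: str, physical_ip: str) -> None:
        self._location[(vnet, virtual_ip)] = physical_ip

    def move(self, vnet: str, virtual_ip: str, new_physical_ip: str) -> None:
        """The API call that must accompany every relocation of a virtual IP."""
        if (vnet, virtual_ip) not in self._location:
            raise KeyError(f"{virtual_ip} is not registered in {vnet}")
        self._location[(vnet, virtual_ip)] = new_physical_ip

    def resolve(self, vnet: str, virtual_ip: str) -> str:
        """Where the data plane should send a packet addressed to virtual_ip."""
        return self._location[(vnet, virtual_ip)]


cp = SdnControlPlane()
cp.register("V", "IP2", "192.0.2.11")   # instance B initially on host 192.0.2.11
print(cp.resolve("V", "IP2"))

# If instance B migrates but nobody calls move(), resolve() keeps returning the
# old host. Only the explicit call below brings the mapping up to date.
cp.move("V", "IP2", "192.0.2.22")
print(cp.resolve("V", "IP2"))
```

An application that never calls `move` leaves the table stale, which is the gap that Layer 2 learning is meant to close.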
[0120] A virtual Layer 2 network creates a broadcast domain, and learning is performed by the members of the broadcast domain. In a virtual Layer 2 domain, any IP can exist on any MAC on any host within that Layer 2 domain, and the system learns using standard Layer 2 networking protocols. The system virtualizes these networking primitives, and it does not need to be explicitly told by a central controller where the MAC and IP reside within its virtual Layer 2 network. This makes it possible to run applications requiring low-latency failover, applications that need to support broadcast or multicast protocols to multiple nodes, and legacy applications that do not know how to make API calls to the SDN control plane or API endpoints to determine where IP and MAC addresses reside. Therefore, providing Layer 2 networking capabilities in a virtualized cloud environment is required to support functionality that is not available at the IP Layer 3 level.
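The learning behavior described above can be sketched as a conventional MAC-learning switch: it records the port on which each source MAC was seen and floods frames whose destination is unknown or broadcast. This is a generic illustration of standard Layer 2 learning, not the distributed implementation of the disclosure; the class `LearningSwitch` and its port numbering are hypothetical.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Minimal single-switch model of Layer 2 learning and flooding."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Learn (or relearn, if the MAC has moved) from the source address.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            return []  # destination sits behind the ingress port; drop
        return [out_port]


sw = LearningSwitch(ports=[1, 2, 3])
print(sw.receive(1, "aa:aa", "bb:bb"))    # bb:bb unknown, so flood
print(sw.receive(2, "bb:bb", "aa:aa"))    # aa:aa already learned on port 1
print(sw.receive(3, "bb:bb", BROADCAST))  # bb:bb moved to port 3 and announces itself
print(sw.receive(1, "aa:aa", "bb:bb"))    # now forwarded to port 3, no API call needed
```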
[0121] Another technical advantage of providing virtual Layer 2 in a virtualized cloud environment is that it enables support for a variety of different Layer 3 protocols (such as IPv4 and IPv6), including non-IP protocols such as IPX and AppleTalk. Existing cloud IaaS providers do not provide Layer 2 functionality in their virtualized cloud networks, and therefore cannot support these non-IP protocols. By providing Layer 2 networking functionality as described in this disclosure, it is possible to support non-IP Layer 3 protocols as well as applications that require and depend on the availability of Layer 2 level functionality.
[0122] Using the technologies described in this disclosure, both Layer 3 and Layer 2 functionality is provided in a virtualized cloud infrastructure. As previously stated, Layer 3-based networking provides certain efficiencies not provided by Layer 2 networking, particularly efficiencies that are well-suited for scaling. By providing Layer 2 functionality in addition to Layer 3 functionality, it becomes possible to leverage such efficiencies provided by Layer 3 (for example, to provide a more scalable solution) while providing Layer 2 functionality in a more scalable manner. For example, virtualized Layer 3 avoids the need to use broadcasts for learning purposes. By providing Layer 3 for its efficiency, and simultaneously providing virtualized Layer 2 to enable applications that require it, applications that cannot function without Layer 2 functionality, and to support non-IP protocols, etc., customers are provided with complete flexibility in a virtualized cloud environment.
[0123] Customers themselves have hybrid environments where Layer 2 and Layer 3 environments coexist, and virtualized cloud environments can now support both of these environments. Customers can have Layer 3 networks such as subnets and / or Layer 2 networks such as VLANs, and these two environments can interact with each other within the virtualized cloud environment.
[0124] Virtualized cloud environments also need to support multi-tenancy. Multi-tenancy makes provisioning both Layer 3 and Layer 2 functionalities within the same virtualized cloud environment technically difficult and complex. For example, a Layer 2 broadcast domain must be managed across many different customers within the cloud provider's infrastructure. The embodiments described in this disclosure overcome these technical challenges.
[0125] For virtualization providers (e.g., VMware), a virtualized Layer 2 network that emulates a physical Layer 2 network allows workloads to run without modification. Applications provided by such a virtualization provider can then run on the virtualized Layer 2 network provided by the cloud infrastructure. For example, such an application might include a set of instances that need to run on a Layer 2 network. If a customer wants to lift and shift such applications from their on-premises environment to a virtualized cloud environment, they cannot simply import the applications and run them in the cloud, because these applications rely on an underlying Layer 2 network that is not provided by current virtualized cloud providers (for example, Layer 2 networking capabilities are used to perform virtual machine migration or to move where MAC and IP addresses reside). For these reasons, such applications cannot run natively in a virtualized cloud environment. Using the techniques described here, a cloud provider can provide a virtualized Layer 2 network in addition to a virtualized Layer 3 network. Such an application stack can then run without modification in the cloud environment and can perform nested virtualization within the cloud environment. Customers can now run and manage their own Layer 2 applications within the cloud. Application providers do not need to make any changes to their own software to facilitate this. Such legacy applications or workloads (e.g., legacy load balancers, legacy applications, KVM, OpenStack, clustering software) can now run unchanged in a virtualized cloud environment.
[0126] By providing virtualized Layer 2 functionality as described here, various Layer 3 protocols, including non-IP protocols, can now be supported by virtualized cloud environments. Taking Ethernet as an example, it can support various different EtherTypes (a field in the Layer 2 header that indicates what type of Layer 3 packet is being sent, that is, which protocol should be expected at Layer 3), including various non-IP protocols. An EtherType is a two-octet field within an Ethernet® frame. It is used by the data link layer at the receiving end to determine which protocol is encapsulated in the frame's payload and how the payload will be processed. EtherTypes are also used as the basis for 802.1Q VLAN tagging, which encapsulates packets from a VLAN for transmission multiplexed with other VLAN traffic over an Ethernet trunk. Examples of EtherTypes include IPv4, IPv6, Address Resolution Protocol (ARP), AppleTalk, and IPX. Cloud networks that support Layer 2 protocols can support any protocol at Layer 3. Similarly, when cloud infrastructure provides support for Layer 3 protocols, it can support various Layer 4 protocols such as TCP, UDP, and ICMP. A network can be agnostic to Layer 4 protocols when virtualization is provided at Layer 3. Similarly, a network can be agnostic to Layer 3 protocols when virtualization is provided at Layer 2. This technology can be extended to support any Layer 2 network type, including FDDI and InfiniBand.
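As an illustration of how a receiver uses the EtherType field, the sketch below reads the two-octet field from a raw Ethernet frame and, when the frame carries an 802.1Q tag, extracts the VLAN ID and the inner EtherType. The numeric values are the standard registered EtherTypes; the function name `classify_frame` is hypothetical and the frames are synthetic.

```python
import struct

# Standard EtherType values for the protocols named in the text.
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x809B: "AppleTalk",
    0x8137: "IPX",
    0x86DD: "IPv6",
}
VLAN_TPID = 0x8100  # marks an 802.1Q tag in place of the EtherType


def classify_frame(frame: bytes):
    """Return (vlan_id or None, protocol name) for a raw Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Octets 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType or TPID.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    vlan_id = None
    if ethertype == VLAN_TPID:
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        # The tag control field holds the 12-bit VLAN ID in its low bits and is
        # followed by the EtherType of the encapsulated payload.
        tci, ethertype = struct.unpack_from("!HH", frame, 14)
        vlan_id = tci & 0x0FFF
    name = ETHERTYPES.get(ethertype, f"unknown(0x{ethertype:04x})")
    return vlan_id, name


untagged = bytes(12) + b"\x08\x00"
tagged = bytes(12) + b"\x81\x00" + b"\x00\x64" + b"\x81\x37"
print(classify_frame(untagged))
print(classify_frame(tagged))
```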
[0127] Therefore, many applications written for physical networks, particularly those operating with clusters of computer nodes sharing a broadcast domain, utilize Layer 2 features that are not supported in L3 virtual networks. The following six examples highlight the complexities that can arise from the lack of Layer 2 networking capabilities:
(1) MAC and IP assignment without prior API calls. Network appliances and hypervisors (such as VMware) were not built for cloud virtual networks. They assume that they can use any MAC as long as it is unique, and that they can either obtain dynamic addresses from a DHCP server or use any IP assigned to the cluster. Often there is no mechanism by which they can be configured to notify the control plane about the assignment of these Layer 2 and Layer 3 addresses. If the location of the MAC and IP is unknown, the Layer 3 virtual network does not know where to send the traffic.
(2) Low-latency reassignment of MAC and IP for high availability and live migration. Many on-premises applications use ARP to reassign IPs and MACs for high availability: when an instance in a cluster or HA pair stops responding, the newly active instance sends a Gratuitous ARP (GARP) to reassign the service IP to its MAC, or sends a Reverse ARP (RARP) to reassign the service MAC to its interface. This is also important when live migrating instances on a hypervisor: the new host must send a RARP when the guest moves so that guest traffic is sent to the new host. The reassignment must not only be done without API calls, but also with very low latency (sub-millisecond). This cannot be achieved with HTTPS calls to REST endpoints.
(3) Interface multiplexing by MAC address. When a hypervisor hosts multiple virtual machines on a single host, all of which are on the same network, guest interfaces are distinguished by their MAC addresses. This requires support for multiple MAC addresses on the same virtual interface.
(4) VLAN support. A single physical virtual machine host may need to be on multiple broadcast domains, as indicated by the use of VLAN tags. For example, VMware ESX uses VLANs for traffic isolation (for example, guest virtual machines may communicate on one VLAN, storage on another VLAN, and host virtual machines on yet another VLAN).
(5) Use of broadcast and multicast traffic. ARP requires L2 broadcasting, and there are examples of on-premises applications that use broadcast and multicast traffic for cluster and HA applications.
(6) Support for non-IP traffic. Since L3 networks require IPv4 or IPv6 headers to communicate, the use of L3 protocols other than IP would not work. L2 virtualization means that networks within a VLAN can be L3 protocol independent: the L3 header could be IPv4, IPv6, IPX, or something else, or could even be absent.
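To make example (2) concrete, the sketch below builds one common form of a Gratuitous ARP: a broadcast ARP request in which the sender and target protocol addresses are both the service IP, so that every listener updates its IP-to-MAC mapping. The field layout follows the standard ARP packet format; the function name `build_garp` and the addresses are illustrative, and actually transmitting the frame (which requires a raw socket and elevated privileges) is left out.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806


def build_garp(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 octets")
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination (broadcast), source, EtherType.
    eth_header = broadcast + mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac + ip_bytes           # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_bytes   # target hardware unset; target IP == sender IP
    return eth_header + arp


frame = build_garp(b"\x02\x00\x00\x00\x00\x01", "10.0.0.5")
print(len(frame), frame.hex())
```

Because the announcement is a single broadcast frame, peers can update their caches without any control-plane round trip, which is why the text contrasts it with HTTPS calls to REST endpoints.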
[0128] Example of Layer 2 VLAN implementation As disclosed herein, a Layer 2 (L2) network can be created within a cloud network. This virtual L2 network includes one or more Layer 2 virtual networks, such as virtualized L2 VLANs, which are referred to here as VLANs. Each VLAN may contain multiple compute instances, each of which may be associated with at least one L2 virtual network interface (e.g., an L2 VNIC) and an L2 virtual switch. In some embodiments, each pair of an L2 virtual network interface and an L2 virtual switch is hosted on an NVD. The NVD may host multiple such pairs, each pair being associated with a different compute instance. The collection of L2 virtual switches represents a single emulated L2 switch of the VLAN. The L2 virtual network interfaces represent a collection of L2 ports on that single emulated L2 switch. VLANs can connect to other VLANs, Layer 3 (L3) networks, on-premises networks, and / or other networks via a VLAN Switching and Routing Service (VSRS), also referred to here as a Real Virtual Router (RVR) or L2 VSRS. An example of this architecture is described below.
[0129] Referring here to Figure 6, a schematic diagram of one embodiment of a computing network is shown. VCN602 resides in CSPI601. VCN602 includes multiple gateways that connect VCN602 to other networks. These gateways include DRG604, which can connect VCN602 to an on-premises network, such as an on-premises data center 606. The gateways may further include gateway 608, which may include an LPG for connecting VCN602 to another VCN, and / or an IGW and / or NAT gateway for connecting VCN602 to the Internet. The gateways of VCN602 may further include a service gateway 610, which can connect VCN602 to a service network 612. The service network 612 may include one or more databases and / or stores, such as an autonomous database 614 and / or an object store 616. The service network may include a conceptual network that includes an aggregation of IP ranges, which may be, for example, a public IP range. In some embodiments, these IP ranges may cover some or all of the public services provided by the CSPI601 provider. These services can be accessed, for example, via an internet gateway or a NAT gateway. In some embodiments, the service network provides a way for services within the service network to be accessed from the local region through a dedicated gateway (service gateway) for that purpose. In some embodiments, the backend of these services can be implemented, for example, in their own private network. In some embodiments, the service network 612 may include further additional databases.
[0130] VCN602 can contain multiple virtual networks. Each of these networks can contain one or more compute instances, and these compute instances can communicate within their respective networks, between networks, or outside of VCN602. One of the virtual networks in VCN602 is L3 subnet 620. L3 subnet 620 is a unit of configuration or a subdivision created within VCN602. Subnet 620 can contain a virtual Layer 3 network in the virtualized cloud environment of VCN602, and VCN602 is hosted on the underlying physical network of CSPI601. Figure 6 shows a single subnet 620, but VCN602 can have one or more subnets. Each subnet within VCN602 can be associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0 / 24 and 10.0.1.0 / 24) that does not overlap with other subnets within that VCN and represents a subset of the address space within the VCN's address space. In some embodiments, this IP address space can be isolated from the address space associated with the CSPI601.
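The two constraints on subnet ranges stated above (each range lies within the VCN's address space, and no two ranges overlap) can be checked with Python's standard `ipaddress` module. In this sketch the enclosing VCN range 10.0.0.0/16 and the function name `validate_subnets` are assumptions made for illustration; only the two /24 ranges come from the text.

```python
import ipaddress
from itertools import combinations


def validate_subnets(vcn_cidr: str, subnet_cidrs: list[str]):
    """Check that every subnet lies inside the VCN range and none overlap."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]

    for subnet in subnets:
        if not subnet.subnet_of(vcn):
            raise ValueError(f"{subnet} is not within the VCN range {vcn}")

    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            raise ValueError(f"{a} overlaps {b}")

    return subnets


# The two example ranges from the text, inside an assumed 10.0.0.0/16 VCN.
print(validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"]))
```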
[0131] Subnet 620 contains one or more compute instances, specifically a first compute instance 622-A and a second compute instance 622-B. Compute instances 622-A and 622-B can communicate with each other within subnet 620, or with other instances, devices, and / or networks outside subnet 620. Communication outside subnet 620 is enabled by a virtual router (VR) 624. VR624 enables communication between subnet 620 and other networks in VCN602. For subnet 620, VR624 represents a logical gateway that enables subnet 620 (e.g., compute instances 622-A and 622-B) to communicate with endpoints on other networks within VCN602, and with other endpoints outside VCN602.
[0132] VCN602 may further include additional networks, specifically one or more L2 VLANs (referred to here as VLANs), which are examples of virtual L2 networks. Each of these VLANs may include a virtual Layer 2 network, localized to the cloud environment of VCN602 and / or hosted by the underlying physical network of CSPI601. In the embodiment of Figure 6, VCN602 includes VLAN A630 and VLAN B640. Each VLAN 630, 640 within VCN602 may be associated with a contiguous range of overlay IP addresses (e.g., 10.0.0.0 / 24 and 10.0.1.0 / 24) that does not overlap with other networks within that VCN, such as other subnets or VLANs within that VCN, and that represents a subset of the address space within the VCN's address space. In some embodiments, this IP address space of the VLANs may be isolated from the address space associated with CSPI601. Each of VLANs 630 and 640 may contain one or more compute instances. Specifically, VLAN A630 may contain, for example, a first compute instance 632-A and a second compute instance 632-B. In some embodiments, VLAN A630 may contain additional compute instances. VLAN B640 may contain, for example, a first compute instance 642-A and a second compute instance 642-B. Each of compute instances 632-A, 632-B, 642-A, and 642-B may have an IP address and a MAC address. These addresses may be assigned or generated in any desired manner. In some embodiments, these addresses may be within the VLAN's CIDR, and in some embodiments, these addresses may be arbitrary addresses. In embodiments where a compute instance of a VLAN communicates with an endpoint outside the VLAN, one or both of these addresses may be from the VLAN CIDR; however, if all communication is within the VLAN, these addresses are not limited to addresses within the VLAN CIDR. In contrast to networks where addresses are assigned by the control plane, the IP and / or MAC addresses of compute instances within a VLAN may be assigned by the users / customers of that VLAN, and these IP and / or MAC addresses may then be discovered and / or learned by compute instances within the VLAN according to the learning process discussed below.
[0133] Each VLAN can include a VLAN Switching and Routing Service (VSRS); specifically, VLAN A630 includes VSRS A634, and VLAN B640 includes VSRS B644. Each VSRS634, 644 participates in Layer 2 switching and local learning within the VLAN, and also performs all necessary Layer 3 network functions, including ARP, NDP, and routing. VSRS performs ARP (which is a Layer 2 protocol) because it must map IP to MAC addresses.
[0134] In these cloud-based VLANs, each virtual interface or virtual gateway can be associated with one or more media access control (MAC) addresses, which may be virtual MAC addresses. Within the VLAN, one or more compute instances 632-A, 632-B, 642-A, 642-B, and / or one or more service instances, which may be bare metal, VMs, or containers, can communicate directly with each other via a virtual switch. External communication with other VLANs or L3 networks is enabled via VSRS634,644. VSRS634,644 is a distributed service that provides Layer 3 functionality, such as IP routing, to the VLAN network. In some embodiments, VSRS634,644 is a horizontally scalable, highly available routing service located at the intersection of IP and L2 networks, and capable of participating in IP routing and L2 learning within a cloud-based L2 domain.
[0135] VSRS634,644 can be distributed across multiple nodes within the infrastructure, and the VSRS634,644 functionality can be scalable, specifically horizontally scalable. In some embodiments, each node implementing the VSRS634,644 functionality shares and replicates router and / or switch functionality with one another. Furthermore, these nodes can present themselves as a single VSRS634,644 to all instances within VLAN630,640. VSRS634,644 can be implemented on any virtualization device within CSPI601, specifically within a virtual network. Therefore, in some embodiments, VSRS634,644 can be implemented on any virtual network virtualization device, including NICs, SmartNICs, switches, Smart switches, or general-purpose computing hosts.
[0136] VSRS634,644 can be a service residing on one or more hardware nodes, such as one or more x86 servers or one or more networking devices, specifically one or more SmartNICs, that support a cloud network. In some embodiments, VSRS634,644 can be implemented on a server fleet. Thus, VSRS634,644 can be a service distributed across a fleet of nodes, which may be a centrally managed fleet or distributed to the edge, that participates in and shares L2 and L3 learning along with evaluating routing and security policies. In some embodiments, each VSRS instance can update other VSRS instances with new mapping information learned by the VSRS instance. For example, if a VSRS instance learns IP, interface, and / or MAC mappings for one or more CIs in its VLAN, the VSRS instance can provide its updated information to other VSRS instances in the VCN. Through this cross-update, a VSRS instance associated with a first VLAN can know the mappings, including IP, interface, and / or MAC mappings, for CIs in other VLANs, and in some embodiments, it can know the mappings, including IP, interface, and / or MAC mappings, for CIs in other VLANs within VCN602. When VSRS resides on a server fleet and / or is distributed across a fleet of nodes, these updates can be greatly accelerated.
[0137] In some embodiments, VSRS634, 644 may also host one or more higher-level services necessary for networking, including but not limited to DHCP relay; DHCP (hosting); DHCPv6; a neighbor discovery protocol such as the IPv6 Neighbor Discovery Protocol; DNS; DNSv6 hosting; SLAAC for IPv6; NTP; metadata services; and block store mount points. In some embodiments, VSRS may support one or more network address translation (NAT) functions for translating between multiple network address spaces. In some embodiments, VSRS may incorporate anti-spoofing, anti-MAC spoofing, ARP cache poisoning countermeasures for IPv4, IPv6 router advertisement (RA) guard, DHCP guard, packet filtering with access control lists (ACLs), and / or reverse path forwarding checks. VSRS may implement functions including, for example, ARP, GARP, packet filtering (ACLs), DHCP relay, and / or IP routing protocols. VSRS634 and 644 can, for example, learn MAC addresses, invalidate expired MAC addresses, handle MAC address migration, look up MAC address information, handle MAC information flooding, handle storm control, prevent loops, perform Layer 2 multicast via protocols such as IGMP in the cloud, collect statistics including logs, use SNMP for statistics and monitoring, and / or collect and use statistics about broadcast, total traffic, bits, spanning tree packets, etc.
[0138] Within a virtual network, VSRS634,644 may appear as different instantiations. In some embodiments, each of these instantiations of VSRS may be associated with VLAN630,640, and in some embodiments, each VLAN630,640 may have an instantiation of VSRS634,644. In some embodiments, each instantiation of VSRS634,644 may have one or more unique tables corresponding to the VLAN630,640 to which the VSRS634,644 instantiation is associated. Each instantiation of VSRS634,644 may generate and / or curate unique tables associated with that instantiation of VSRS634,644. Therefore, while a single service may provide VSRS634,644 functionality for one or more cloud networks, individual instances of VSRS634,644 within a cloud network may have their own Layer 2 and Layer 3 forwarding tables, while multiple such customer networks may have overlapping Layer 2 and Layer 3 forwarding tables.
[0139] In some embodiments, VSRS634,644 may support conflicting VLAN and IP spaces across multiple tenants. This may include having multiple tenants on the same VSRS634,644. In some embodiments, some or all of these tenants may choose to use some or all of the same IP address space, the same MAC space, and the same VLAN space. This can provide users with great flexibility when choosing addresses. In some embodiments, this multi-tenancy is supported by providing each tenant with a separate virtual network, which is a private network within the cloud network. Each virtual network is given a unique identifier. Similarly, in some embodiments, each host may have a unique identifier, and / or each virtual interface or virtual gateway may have a unique identifier. In some embodiments, these unique identifiers, specifically the unique identifiers of the virtual network for each tenant, may be encoded in each communication. By providing each virtual network with a unique identifier and including it in the communication, a single instantiation of VSRS634,644 can accommodate multiple tenants with overlapping addresses and / or namespaces.
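One way to picture how a per-tenant virtual network identifier lets overlapping addresses coexist is a forwarding table keyed by the pair (virtual network identifier, MAC address) rather than by MAC address alone. The sketch below is illustrative only; the class `MultiTenantMacTable`, the identifier strings, and the interface names are hypothetical and do not describe the disclosed encoding.

```python
from typing import Optional


class MultiTenantMacTable:
    """Forwarding table in which every entry is scoped by a virtual network ID."""

    def __init__(self) -> None:
        # (virtual network identifier, MAC address) -> virtual interface
        self._entries: dict[tuple[str, str], str] = {}

    def learn(self, vnet_id: str, mac: str, interface: str) -> None:
        self._entries[(vnet_id, mac)] = interface

    def lookup(self, vnet_id: str, mac: str) -> Optional[str]:
        # The identifier carried with each communication selects the tenant's
        # own view of the address space.
        return self._entries.get((vnet_id, mac))


table = MultiTenantMacTable()
same_mac = "02:00:00:00:00:01"
table.learn("vnet-tenant-a", same_mac, "vnic-a1")
table.learn("vnet-tenant-b", same_mac, "vnic-b7")  # same MAC, different tenant

print(table.lookup("vnet-tenant-a", same_mac))
print(table.lookup("vnet-tenant-b", same_mac))
print(table.lookup("vnet-tenant-c", same_mac))  # unknown tenant: no entry
```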
[0140] VSRS634 and 644 can perform switching and / or routing functions to facilitate and / or enable the creation of and / or communication with L2 networks within VLANs 630 and 640. These VLANs 630 and 640 may be found within a cloud computing environment, more specifically within a virtual network in that cloud computing environment.
[0141] For example, each of VLANs 630 and 640 includes multiple compute instances 632-A, 632-B, 642-A, and 642-B. VSRS634 and 644 enable communication between compute instances in one VLAN 630 or 640 and compute instances in another VLAN 630 or 640, or subnet 620. In some embodiments, VSRS634 and 644 enable communication between compute instances within one VLAN 630 or 640 and another VCN, another network outside the VCN including the internet, an on-premises data center, etc. In such embodiments, for example, compute instance 632-A can send communications to an endpoint outside the VLAN, in this example VLAN A630. A compute instance (632-A) can send communications to VSRS A634, which can then direct those communications to routers 624, 644, or gateways 604, 608, 610 that are communicatively coupled to the desired endpoint. Routers 624, 644, or gateways 604, 608, 610 that are communicatively coupled to the desired endpoint can receive communications from the compute instance (632-A) and direct those communications to the desired endpoint.
[0142] Referring here to Figure 7, a schematic diagram of the logical and hardware aspects of VLAN 700 is shown. As can be understood, VLAN 700 includes multiple endpoints, specifically multiple compute instances and VSRSs. Multiple compute instances (CIs) are instantiated on one or more host machines. In some embodiments, this can be a one-to-one relationship, where each CI is instantiated on its own host machine, and / or in some embodiments, this can be a many-to-one relationship, where multiple CIs are instantiated on a single common host machine. In various embodiments, CIs can be Layer 2 CIs by being configured to communicate with each other using an L2 protocol. Figure 7 illustrates a scenario where several CIs are instantiated on their own host machines and several CIs share a common host machine. As shown in Figure 7, instance 1 (CI1) 704-A is instantiated on host machine 1 702-A, instance 2 (CI2) 704-B is instantiated on host machine 2 702-B, and instance 3 (CI3) 704-C and instance 4 (CI4) 704-D are instantiated on a common host machine 702-C.
[0143] Each of the CIs 704-A, 704-B, 704-C, and 704-D is coupled to communicate with the other CIs 704-A, 704-B, 704-C, 704-D and with VSRS714 within VLAN 700. Specifically, each of the CIs 704-A, 704-B, 704-C, and 704-D is connected to the other CIs 704-A, 704-B, 704-C, 704-D and to VSRS714 within VLAN 700 via L2 VNICs and switches. Each CI 704-A, 704-B, 704-C, and 704-D is associated with its own L2 VNIC and switch. The switch may be a local L2 virtual switch that is specifically associated with and deployed for the L2 VNIC. Specifically, CI1 704-A is associated with L2 VNIC1 708-A and switch 1 710-A; CI2 704-B is associated with L2 VNIC2 708-B and switch 2 710-B; CI3 704-C is associated with L2 VNIC3 708-C and switch 3 710-C; and CI4 704-D is associated with L2 VNIC4 708-D and switch 4 710-D.
[0144] In some embodiments, each L2 VNIC 708 and its associated switch 710 may be instantiated on an NVD 706. This instantiation can be one-to-one, with a single L2 VNIC 708 and its associated switch 710 being instantiated on a specific NVD 706, or it can be many-to-one, with multiple L2 VNICs 708 and their associated switches 710 being instantiated on a single common NVD 706. Specifically, L2 VNIC1 708-A and switch 1 710-A are instantiated on NVD1 706-A, L2 VNIC2 708-B and switch 2 710-B are instantiated on NVD2, and both L2 VNIC3 708-C and switch 3 710-C, as well as L2 VNIC4 708-D and switch 4 710-D, are instantiated on a common NVD, i.e., NVD 706-C.
[0145] In some embodiments, the VSRS714 can support overlapping VLANs and IP spaces across multiple tenants. This may include having multiple tenants on the same VSRS714. In some embodiments, some or all of these tenants may choose to use some or all of the same IP address space, the same MAC space, and the same VLAN space. This may provide users with great flexibility when choosing addresses. In some embodiments, this multi-tenancy is supported by providing each tenant with a separate virtual network, which is a private network within the cloud network. Each virtual network (e.g., each VLAN or VCN) is given a unique identifier, such as a VCN identifier, which may be a VLAN identifier. This unique identifier may be selected, for example, by the control plane, specifically by the CSPI control plane. In some embodiments, this unique VLAN identifier may include one or more bits which may be included and / or used in packet encapsulation. Similarly, in some embodiments, each host may have a unique identifier, and / or each virtual interface or virtual gateway may have a unique identifier. In some embodiments, these unique identifiers, specifically the unique identifiers of the virtual network for a tenant, can be encoded in each communication. By providing a unique identifier for each virtual network and including it in the communication, a single instantiation of VSRS can accommodate multiple tenants with overlapping addresses and / or namespaces. In some embodiments, VSRS714 can determine which tenant a packet belongs to based on the VCN identifier and / or VLAN identifier associated with the communication, specifically in the VCN header of the communication. In the embodiments disclosed herein, communications entering and leaving a VLAN may have a VCN header that can include a VLAN identifier. Based on the VCN header containing the VLAN identifier, the VSRS714 can determine the tenancy; in other words, the receiving VSRS can determine which VLAN and / or tenant to send communications to. In addition, each compute instance belonging to a VLAN (e.g., an L2 compute instance) is given a unique interface identifier that identifies the L2 VNIC associated with that compute instance. The interface identifier may be included in traffic from and to the compute instance (e.g., by being included in the frame header) and can be used by the NVD to identify the L2 VNIC associated with the compute instance. In other words, the interface identifier can uniquely identify a compute instance and / or its associated L2 VNIC.
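As a non-limiting illustration of how a single VSRS instantiation could keep tenants with overlapping addresses apart, the following minimal Python sketch keys all forwarding state by the identifiers carried in the VCN header. The names VcnHeader, SharedVsrs, vcn_id, and vlan_id are hypothetical and are not taken from this disclosure; the field layout of an actual VCN header is not specified here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VcnHeader:
    """Hypothetical stand-in for the VCN header carried with each communication."""
    vcn_id: str   # assumed unique identifier of the tenant's virtual network
    vlan_id: int  # assumed VLAN identifier within that virtual network


class SharedVsrs:
    """One VSRS instantiation serving several tenants (illustrative only)."""

    def __init__(self) -> None:
        # State is partitioned by (vcn_id, vlan_id), so identical MAC
        # addresses used by different tenants never collide.
        self._tables: Dict[Tuple[str, int], Dict[str, str]] = {}

    def learn(self, header: VcnHeader, mac: str, interface_id: str) -> None:
        key = (header.vcn_id, header.vlan_id)
        self._tables.setdefault(key, {})[mac] = interface_id

    def lookup(self, header: VcnHeader, mac: str) -> Optional[str]:
        key = (header.vcn_id, header.vlan_id)
        return self._tables.get(key, {}).get(mac)


vsrs = SharedVsrs()
tenant_a = VcnHeader("vcn-a", 10)
tenant_b = VcnHeader("vcn-b", 10)  # same VLAN number, different tenant
vsrs.learn(tenant_a, "02:00:00:00:00:01", "vnic-a1")
vsrs.learn(tenant_b, "02:00:00:00:00:01", "vnic-b7")  # same MAC, no conflict
```

Because the tenant key is consulted before the MAC address, both tenants can reuse the same MAC address and VLAN number without seeing each other's entries.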
[0146] As shown in Figure 7, switches 710-A, 710-B, 710-C, and 710-D together can form an L2 distributed switch 712, also referred to here as the distributed switch 712. From the customer's perspective, the switches 710-A, 710-B, 710-C, and 710-D within the L2 distributed switch 712 appear as a single switch connecting to all CIs within the VLAN. However, the L2 distributed switch 712, which emulates the user experience of a single switch, is infinitely scalable and includes a set of local switches (for example, switches 710-A, 710-B, 710-C, and 710-D in the example of Figure 7). As shown in Figure 7, each CI runs on a host machine connected to the NVD. For each CI on a host connected to the NVD, the NVD hosts a Layer 2 VNIC and a local switch associated with the compute instance (for example, an L2 virtual switch that is local to the NVD, associated with the Layer 2 VNIC, and is a member or component of an L2 distributed switch 712). The Layer 2 VNIC represents the port of the compute instance on the Layer 2 VLAN. The local switch connects the VNIC to other VNICs (for example, other ports) associated with other compute instances on the Layer 2 VLAN.
[0147] Each of the CIs 704-A, 704-B, 704-C, and 704-D can communicate with the other CIs 704-A, 704-B, 704-C, and 704-D within VLAN 700, or with VSRS714. One of the CIs 704-A, 704-B, 704-C, and 704-D transmits a frame to another of the CIs 704-A, 704-B, 704-C, 704-D or to VSRS714 by sending the frame to the MAC address and interface identifier of the receiving CI or VSRS714. The MAC address and interface identifier may be included in the frame header. Here, as described above, the interface identifier may indicate the L2 VNIC of the receiving CI or VSRS714.
[0148] In one embodiment, CI1 704-A may be a source CI, L2 VNIC708-A may be a source L2 VNIC, and switch 710-A may be a source L2 virtual switch. In this embodiment, CI3 704-C may be a destination CI, and L2 VNIC3 708-C may be a destination L2 VNIC. The source CI can send a frame along with the source MAC address and destination MAC address. This frame may be intercepted by NVD706-A, which instantiates the source VNIC and source switch.
[0149] L2 VNICs 708-A, 708-B, 708-C, and 708-D can each learn to map MAC addresses to L2 VNIC interface identifiers for VLAN 700. This mapping can be learned based on frames and / or communications received from within VLAN 700. Based on this previously determined mapping, the source VNIC can determine the interface identifier of the destination interface associated with the destination CI within the VLAN and encapsulate the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation which includes an L2 (Ethernet®) header of the frame being encapsulated. The encapsulated frame can identify the destination MAC, destination interface identifier, source MAC, and source interface identifier.
[0150] The source VNIC can pass the encapsulated frame to the source switch, which can then direct the frame to the destination VNIC. Upon receiving the frame, the destination VNIC can decapsulate it and then provide the frame to the destination CI.
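By way of a non-limiting sketch, the intra-VLAN sequence described above (look up the destination interface identifier from the learned MAC mapping, encapsulate, deliver, decapsulate) might be modeled as follows in Python. This is not Geneve encapsulation; the EncapsulatedFrame wrapper, its field names, and the sample identifiers are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: bytes


@dataclass(frozen=True)
class EncapsulatedFrame:
    """Hypothetical wrapper standing in for an L2 encapsulation header."""
    src_interface_id: str
    dst_interface_id: str
    inner: Frame  # the original L2 frame, header included


def encapsulate(frame: Frame, src_interface_id: str,
                mac_to_interface: Dict[str, str]) -> Optional[EncapsulatedFrame]:
    """Source VNIC: resolve the destination interface from the learned mapping."""
    dst_interface_id = mac_to_interface.get(frame.dst_mac)
    if dst_interface_id is None:
        return None  # mapping not learned yet; the caller must handle this case
    return EncapsulatedFrame(src_interface_id, dst_interface_id, frame)


def decapsulate(encapsulated: EncapsulatedFrame) -> Frame:
    """Destination VNIC: recover the original frame for the destination CI."""
    return encapsulated.inner


learned = {"M.3": "vnic-3"}            # previously learned MAC -> interface id
frame = Frame("M.1", "M.3", b"hello")  # CI1 sending to CI3
wrapped = encapsulate(frame, "vnic-1", learned)
```

The encapsulated frame carries both MAC addresses (inside the inner frame) and both interface identifiers, matching the four items the description says the encapsulated frame can identify.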
[0151] Referring now to Figure 8, a logical schematic diagram of multiple connected L2 VLANs 800 is shown. In the particular embodiment shown in Figure 8, both VLANs reside in the same VCN. As can be understood, the multiple connected L2 VLANs 800 can include a first VLAN, VLAN A802-A, and a second VLAN, VLAN B802-B. Each of these VLANs 802-A and 802-B can contain one or more CIs, each of which can have an associated L2 VNIC and an associated L2 virtual switch. Furthermore, each of these VLANs 802-A and 802-B can contain a VSRS.
[0152] Specifically, VLAN A802-A can include instance 1 804-A connected to L2 VNIC1 806-A and switch 1 808-A, instance 2 804-B connected to L2 VNIC2 806-B and switch 2 808-B, and instance 3 804-C connected to L2 VNIC3 806-C and switch 3 808-C. VLAN B 802-B can include instance 4 804-D connected to L2 VNIC4 806-D and switch 4 808-D, instance 5 804-E connected to L2 VNIC5 806-E and switch 5 808-E, and instance 6 804-F connected to L2 VNIC6 806-F and switch 6 808-F. VLAN A802-A may further include VSRS A810-A, and VLAN B802-B may include VSRS B810-B. Each of CI804-A, 804-B, and 804-C of VLAN A802-A can be communicatively coupled to VSRS A810-A, and each of CI804-D, 804-E, and 804-F of VLAN B802-B can be communicatively coupled to VSRS B810-B.
[0153] VLAN A802-A can be communicably coupled to VLAN B802-B via their respective VSRS810-A and 810-B. Each VSRS can similarly be coupled to gateway 812, which can give CI804-A, 804-B, 804-C, 804-D, 804-E, and 804-F within each VLAN802-A and 802-B access to other networks outside the VCN where VLAN802-A and 802-B are located. In some embodiments, these networks may include, for example, one or more on-premises networks, another VCN, a service network, or a public network such as the Internet.
[0154] Each of the CIs 804-A, 804-B, and 804-C in VLAN A802-A can communicate with CIs 804-D, 804-E, and 804-F in VLAN B802-B via the VSRS 810-A and 810-B of each VLAN 802-A and 802-B. For example, one of the CIs 804-A, 804-B, 804-C, 804-D, 804-E, and 804-F in one of VLANs 802-A and 802-B can send a frame to CIs 804-A, 804-B, 804-C, 804-D, 804-E, and 804-F in the other VLAN 802-A and 802-B. This frame can leave the source VLAN via the VSRS of the source VLAN, enter the destination VLAN, and be routed to the destination CI via the destination VSRS.
[0155] In one embodiment, CI1 804-A may be a source CI, VNIC806-A may be a source VNIC, and switch 808-A may be a source switch. In this embodiment, CI 5 804-E may be a destination CI, and L2 VNIC5 806-E may be a destination VNIC. VSRS A810-A may be a source VSRS identified as an SVSRS, and VSRS B810-B may be a destination VSRS identified as a DVSRS.
[0156] The source CI can send a frame along with its MAC address. This frame may be intercepted by the NVD that instantiates the source VNIC and source switch. The source VNIC encapsulates the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation. The encapsulated frame can identify the destination address of the destination CI. In some embodiments, this destination address may also include the destination address of the destination VSRS. The destination address of the destination CI may include the destination IP address, the destination MAC address of the destination CI, and / or the destination interface identifier of the destination VNIC associated with the destination CI. The destination address of the destination VSRS may include the IP address of the destination VSRS, the interface identifier of the destination VNIC associated with the destination VSRS, and / or the MAC address of the destination VSRS.
[0157] The source VSRS can receive the frame from the source switch, look up the VNIC mapping from the frame's destination address (which may be a destination IP address), and forward the frame to the destination VSRS. The destination VSRS can receive the frame. Based on the destination address contained in the frame, the destination VSRS can forward the frame to the destination VNIC. The destination VNIC can receive the frame, decapsulate it, and then deliver the frame to the destination CI.
[0158] Referring now to Figure 9, a logical schematic diagram of multiple connected L2 VLANs and subnets 900 is shown. In the particular embodiment shown in Figure 9, both the VLANs and subnets reside in the same VCN. This is shown because the virtual routers and VSRSs of both the VLANs and subnets are directly connected rather than connected via a gateway.
[0159] As can be understood, this can include a first VLAN, VLAN A902-A, a second VLAN, VLAN B902-B, and subnet 930. Each of these VLANs 902-A and 902-B can contain one or more CIs, each of which can have an associated L2 VNIC and an associated L2 switch. Furthermore, each of these VLANs 902-A and 902-B can contain a VSRS. Similarly, subnet 930, which can be an L3 subnet, can contain one or more CIs, each of which can have an associated L3 VNIC, and L3 subnet 930 can contain a virtual router 916.
[0160] Specifically, VLAN A902-A can include instance 1 904-A connected to L2 VNIC1 906-A and switch 1 908-A, instance 2 904-B connected to L2 VNIC2 906-B and switch 2 908-B, and instance 3 904-C connected to L2 VNIC3 906-C and switch 3 908-C. VLAN B 902-B can include instance 4 904-D connected to L2 VNIC4 906-D and switch 4 908-D, instance 5 904-E connected to L2 VNIC5 906-E and switch 5 908-E, and instance 6 904-F connected to L2 VNIC6 906-F and switch 6 908-F. VLAN A902-A may further include VSRS A910-A, and VLAN B902-B may include VSRS B910-B. Each of CI904-A, 904-B, and 904-C of VLAN A902-A can be communicatively coupled to VSRS A910-A, and each of CI904-D, 904-E, and 904-F of VLAN B902-B can be communicatively coupled to VSRS B910-B. L3 subnet 930 may include one or more CIs, specifically instance 7 904-G which is communicatively coupled to L3 VNIC 7 906-G. L3 subnet 930 may include virtual router 916.
[0161] VLAN A902-A can be communicably coupled to VLAN B902-B via their respective VSRS instances 910-A and 910-B. L3 subnet 930 can be communicably coupled to VLAN A902-A and VLAN B902-B via virtual router 916. Virtual router 916 and each of VSRS instances 910-A and 910-B can similarly be coupled to gateway 912, which can give CIs 904-A, 904-B, 904-C, 904-D, 904-E, 904-F, and 904-G in each VLAN 902-A, 902-B and subnet 930 access to other networks outside the VCN where VLAN 902-A, 902-B and subnet 930 are located. In some embodiments, these networks may include, for example, one or more on-premises networks, another VCN, a service network, a public network such as the Internet, etc.
[0162] Each VSRS instance 910-A, 910-B can provide an outgoing path for frames leaving their associated VLANs 902-A, 902-B, and an inbound path for frames entering their associated VLANs 902-A, 902-B. From VSRS instances 910-A, 910-B of VLANs 902-A, 902-B, frames can be sent to any desired endpoint, including L2 endpoints such as L2 CIs in another VLAN on the same VCN or a different VCN or network, and / or L3 endpoints such as L3 CIs in a subnet on the same VCN or a different VCN or network.
[0163] In one embodiment, CI1 904-A may be a source CI, VNIC906-A may be a source VNIC, and switch 908-A may be a source switch. In this embodiment, CI7 904-G may be a destination CI, and VNIC7 906-G may be a destination VNIC. VSRS A910-A may be a source VSRS identified as an SVSRS, and virtual router (VR) 916 may be a destination VR.
[0164] The source CI can send a frame along with its MAC address. This frame may be intercepted by the NVD that instantiates the source VNIC and source switch. The source VNIC encapsulates the frame. In some embodiments, this encapsulation may include Geneve encapsulation, specifically L2 Geneve encapsulation. The encapsulated frame can identify the destination address of the destination CI. In some embodiments, this destination address may also include the destination address of the VSRS of the source CI's VLAN. The destination address of the destination CI may include the destination IP address, the destination MAC address of the destination CI, and / or the destination interface identifier of the destination VNIC of the destination CI.
[0165] The source VSRS can receive the frame from the source switch, look up the VNIC mapping from the frame's destination address (which may be a destination IP address), and forward the frame to the destination VR. The destination VR can receive the frame. Based on the destination address contained in the frame, the destination VR can forward the frame to the destination VNIC. The destination VNIC can receive the frame, decapsulate it, and then deliver the frame to the destination CI.
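As a non-limiting illustration, the two cross-segment paths described for Figures 8 and 9 (VLAN to VLAN via a source VSRS and a destination VSRS, and VLAN to L3 subnet via a source VSRS and a virtual router) can be summarized as an ordered list of hops. The sketch below is a simplification under stated assumptions: the Endpoint type, the ROUTER_OF table, and the segment names are hypothetical, and the source switch and all encapsulation steps are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    name: str     # compute instance label
    vnic: str     # VNIC associated with the compute instance
    segment: str  # VLAN or subnet the instance belongs to


# Hypothetical table: which routing element serves each segment.
ROUTER_OF = {"VLAN A": "VSRS A", "VLAN B": "VSRS B", "subnet": "VR"}


def hops(src: Endpoint, dst: Endpoint) -> List[str]:
    """Return the ordered elements a frame traverses from src to dst."""
    if src.segment == dst.segment:
        # Same segment: no routing element is needed.
        return [src.name, src.vnic, dst.vnic, dst.name]
    return [src.name, src.vnic,
            ROUTER_OF[src.segment],  # source VSRS (or VR)
            ROUTER_OF[dst.segment],  # destination VSRS (or VR)
            dst.vnic, dst.name]


ci1 = Endpoint("CI1", "VNIC1", "VLAN A")
ci5 = Endpoint("CI5", "VNIC5", "VLAN B")
ci7 = Endpoint("CI7", "VNIC7", "subnet")
vlan_to_vlan = hops(ci1, ci5)
vlan_to_subnet = hops(ci1, ci7)
```

In both cases the frame leaves the source VLAN through that VLAN's VSRS; only the element that receives it on the destination side differs.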
[0166] Learning by L2 VNICs and / or L2 virtual switches within a virtual L2 network Referring now to Figure 10, a schematic diagram of one embodiment of intra-VLAN communication and learning within VLAN 1000 is shown. The learning here is specific to how the L2 VNIC, the VSRS of the source CI's VLAN, and / or the L2 virtual switch learn the association between MAC addresses and L2 VNICs / VSRS VNICs (more specifically, between the MAC addresses associated with L2 compute instances or the VSRS and the interface identifiers associated with the L2 VNICs of those L2 compute instances or with the VSRS VNIC). Generally, the learning is based on incoming traffic. With respect to interface-to-MAC address learning, this learning is different from the learning process (e.g., the ARP process) that an L2 compute instance may implement to learn a destination MAC address. The two learning processes (e.g., that of the L2 VNIC / L2 virtual switch and that of the L2 compute instance) are shown as being implemented jointly in Figure 12.
[0167] As can be understood, VLAN1000 includes compute instance 1 1000-A, which is communicatively coupled to NVD1 1001-A, which instantiates L2 VNIC1 1002-A and L2 switch 1 1004-A. VLAN1000 also includes compute instance 2 1000-B, which is communicatively coupled to NVD2 1001-B, which instantiates L2 VNIC2 1002-B and L2 switch 2 1004-B. VLAN1000 also includes VSRS1015, which runs on a server fleet and includes VSRS VNIC1002-C and VSRS switch 1004-C. All switches 1004-A, 1004-B, and 1004-C together form L2 distributed switch 1050. VSRS1015 is connected to endpoint 1008 for communication, and endpoint 1008 may include a gateway, specifically, for example, an L2 / L3 router in the form of another VSRS, or for example, an L3 router in the form of a virtual router.
[0168] The control plane 1010 of the VCN hosting VLAN 1000 maintains information identifying each L2 VNIC on VLAN 1000 and the network configuration of each L2 VNIC. For example, this information may include, for each L2 VNIC, the interface identifier associated with the L2 VNIC and / or the physical IP address of the NVD hosting the L2 VNIC. The control plane 1010 updates the interfaces in VLAN 1000 with this information (e.g., periodically or on demand). Thus, each L2 VNIC 1002-A, 1002-B, 1002-C in VLAN 1000 receives from the control plane 1010 information identifying the interfaces in the VLAN and populates a table with this information. The table populated by an L2 VNIC can be stored locally on the NVD hosting that L2 VNIC. If a VNIC 1002-A, 1002-B, or 1002-C already has a current table, that VNIC can determine any discrepancies between its current table and the information / table received from control plane 1010. In some embodiments, the VNIC 1002-A, 1002-B, or 1002-C can update its table to match the information received from control plane 1010.
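As a non-limiting sketch of the discrepancy check described above, a VNIC's local table can be compared against a control-plane update and then overwritten to match it. The table shape used here (interface identifier mapped to the physical IP address of the hosting NVD) follows the example given above; the function name reconcile and the sample values are assumptions.

```python
from typing import Dict, List, Tuple


def reconcile(local: Dict[str, str],
              update: Dict[str, str]) -> Tuple[List[str], List[str], List[str]]:
    """Report discrepancies, then make the local table match the update.

    Both tables map an interface identifier to the physical IP address of
    the NVD hosting that interface. Returns (added, removed, changed).
    """
    added = sorted(update.keys() - local.keys())
    removed = sorted(local.keys() - update.keys())
    changed = sorted(k for k in local.keys() & update.keys()
                     if local[k] != update[k])
    # Adopt the control-plane view in place.
    local.clear()
    local.update(update)
    return added, removed, changed


local_table = {"vnic-1": "10.0.0.1", "vnic-2": "10.0.0.2"}
control_plane_update = {"vnic-2": "10.0.0.9", "vnic-3": "10.0.0.3"}
result = reconcile(local_table, control_plane_update)
```

Here vnic-3 is new, vnic-1 has left the VLAN, and vnic-2 has moved to a different NVD, which are the three kinds of discrepancy the comparison can surface.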
[0169] As shown in Figure 10, frames are transmitted via L2 switches 1004-A, 1004-B, and 1004-C and received by receiving VNICs 1002-A, 1002-B, and 1002-C. When a frame is received by a VNIC 1002-A, 1002-B, or 1002-C, the VNIC learns the mapping between the source interface (source VNIC) and the source MAC address of that frame. Based on the table of information received from the control plane 1010, the VNIC can map the source MAC address (from the received frame) to the interface identifier of the source VNIC and to the IP address of that VNIC and / or the IP address of the NVD hosting that VNIC (if the interface identifier and IP address are available from the table). Thus, L2 VNICs 1002-A, 1002-B, and 1002-C learn the mapping of interface identifiers to MAC addresses based on the received communications and / or frames. Each VNIC 1002-A, 1002-B, and 1002-C can update its L2 forwarding (FWD) table 1006-A, 1006-B, 1006-C with this learned mapping information. In some embodiments, the L2 forwarding table includes a MAC address and associates it with at least one of an interface identifier or a physical IP address. In such embodiments, the MAC address may be an address assigned to an L2 compute instance and may correspond to a port emulated by the L2 VNIC associated with the L2 compute instance. The interface identifier can uniquely identify the L2 VNIC and / or L2 compute instance. The virtual IP address may be that of the L2 VNIC, and the physical IP address may be that of the NVD hosting the L2 VNIC. The L2 forwarding table updated by the L2 VNIC may be stored locally on the NVD hosting the L2 VNIC and may be used by the L2 virtual switch associated with the L2 VNIC to direct frames. In some embodiments, VNICs within a common VLAN can share all or part of their mapping tables with one another.
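By way of a non-limiting illustration, the learning step described above can be sketched as follows: on each received frame, the receiving VNIC records the source MAC address against the source interface identifier and, where the control-plane table has it, the physical IP address of the hosting NVD. The class and field names (L2Vnic, FwdEntry, on_frame_received) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FwdEntry:
    interface_id: str           # identifies the source L2 VNIC
    physical_ip: Optional[str]  # NVD hosting that VNIC, if known


class L2Vnic:
    """Illustrative receiving VNIC that learns from incoming frames."""

    def __init__(self, interface_table: Dict[str, str]) -> None:
        # interface id -> NVD physical IP, as supplied by the control plane
        self.interface_table = interface_table
        # source MAC -> learned forwarding entry (the L2 FWD table)
        self.fwd_table: Dict[str, FwdEntry] = {}

    def on_frame_received(self, src_mac: str, src_interface_id: str) -> None:
        # The physical IP is filled in only if the control-plane table has it.
        physical_ip = self.interface_table.get(src_interface_id)
        self.fwd_table[src_mac] = FwdEntry(src_interface_id, physical_ip)


vnic2 = L2Vnic({"vnic-1": "IP.1"})
vnic2.on_frame_received("M.1", "vnic-1")  # interface known to control plane
vnic2.on_frame_received("M.7", "vnic-7")  # interface not yet in the table
```

Entries learned this way are what the associated L2 virtual switch would later consult to direct frames toward the right NVD.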
[0170] In light of the network architecture described above, the traffic flow is described below. For clarity, the traffic flow is described in relation to compute instance 2 1000-B, L2 VNIC2 1002-B, L2 switch 2 1004-B, and NVD2 1001-B. This description applies equivalently to the traffic flow between other compute instances.
[0171] As described above, VLANs are implemented within a VCN as overlay L2 networks on top of L3 physical networks. An L2 compute instance of a VLAN can send or receive L2 frames that include overlay MAC addresses (also called virtual MAC addresses) as the source MAC address and destination MAC address. An L2 frame can also encapsulate a packet that includes overlay IP addresses (also called virtual IP addresses) as the source IP address and destination IP address. In some embodiments, the overlay IP address of a compute instance may belong to the CIDR range of the VLAN. The other overlay IP address may belong to the CIDR range (in which case the L2 frame flows within the VLAN) or lie outside the CIDR range (in which case the L2 frame is destined for or received from another network). An L2 frame may also include a VLAN tag, which uniquely identifies the VLAN and can be used to distinguish among multiple L2 VNICs on the same NVD. L2 frames can be received by an NVD in encapsulated packets via a tunnel from the host machine of a compute instance, from another NVD, or from a server fleet hosting a VSRS. In these different cases, the encapsulated packet may be an L3 packet transmitted over the physical network, with source and destination IP addresses being physical IP addresses. Different types of encapsulation are possible, including Geneve encapsulation. An NVD can decapsulate the received packet to extract the L2 frame. Similarly, to transmit an L2 frame, an NVD can encapsulate it in an L3 packet and transmit it over the physical network.
[0172] For outgoing traffic within the VLAN from instance 2 1000-B, NVD2 1001-B receives a frame from the host machine of instance 2 1000-B via the Ethernet link. The frame contains an interface identifier that identifies L2 VNIC2 1002-B. The frame contains the overlay MAC address of instance 2 1000-B (e.g., M.2) as the source MAC address and the overlay MAC address of instance 1 1000-A (e.g., M.1) as the destination MAC address. Given the interface identifier, NVD2 1001-B passes the frame to L2 VNIC2 1002-B for further processing. L2 VNIC2 1002-B forwards the frame to L2 switch 2 1004-B. Based on L2 forwarding table 1006-B, L2 switch 2 1004-B determines whether the destination MAC address is known (for example, by matching it with an entry in L2 forwarding table 1006-B).
[0173] If known, L2 switch 2 1004-B determines that L2 VNIC1 1002-A is the associated tunnel endpoint and forwards the frame to L2 VNIC1 1002-A. Forwarding may involve encapsulation and decapsulation of the frame in the packet (e.g., Geneve encapsulation and decapsulation), and the packet may contain the frame, the physical IP address of NVD1 1001-A as the destination address (e.g., IP.1), and the physical IP address of NVD 2 1001-B as the source address (e.g., IP.2).
[0174] If unknown, L2 switch 2 1004-B broadcasts the frame to various VNICs in the VLAN (e.g., including L2 VNIC 1 1002-A and any other L2 VNICs in the VLAN), and the broadcasted frame is processed (e.g., encapsulated, transmitted, decapsulated) among the relevant NVDs. In some embodiments, this broadcast is performed in the physical network, or more specifically, emulated in the physical network, and the frame can be encapsulated separately to each L2 VNIC, including VSRS in the VLAN. Thus, the broadcast is emulated in the physical network via a series of replicated unicast packets. Each L2 VNIC then receives the frame and learns the association between the interface identifier of L2 VNIC 2 1002-B and the source MAC address (e.g., M.2) and the source physical IP address (e.g., IP.2).
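As a non-limiting sketch, the known and unknown destination cases described above can be expressed as one forwarding decision: a single encapsulated packet when the destination MAC address is in the forwarding table, and otherwise one replicated unicast packet per other L2 VNIC in the VLAN. The Packet fields, the forward function, and the sample identifiers are assumptions made for illustration; real encapsulation (e.g., Geneve) is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str


@dataclass(frozen=True)
class Packet:
    src_ip: str            # physical IP of the sending NVD
    dst_ip: str            # physical IP of the receiving NVD or server
    dst_interface_id: str  # VNIC the inner frame is intended for
    frame: Frame


def forward(frame: Frame, src_interface_id: str,
            vlan_interfaces: Dict[str, str],
            fwd_table: Dict[str, str]) -> List[Packet]:
    """vlan_interfaces: interface id -> physical IP of its host.
    fwd_table: learned destination MAC -> interface id."""
    src_ip = vlan_interfaces[src_interface_id]
    known = fwd_table.get(frame.dst_mac)
    if known is not None:
        # Known destination: one unicast packet to the hosting NVD.
        return [Packet(src_ip, vlan_interfaces[known], known, frame)]
    # Unknown destination: emulate broadcast with replicated unicast
    # packets, one per other VNIC in the VLAN (including the VSRS VNIC).
    return [Packet(src_ip, ip, iface, frame)
            for iface, ip in vlan_interfaces.items()
            if iface != src_interface_id]


interfaces = {"vnic-1": "IP.1", "vnic-2": "IP.2", "vsrs-vnic": "IP.3"}
frame = Frame("M.2", "M.1")
flooded = forward(frame, "vnic-2", interfaces, {})
unicast = forward(frame, "vnic-2", interfaces, {"M.1": "vnic-1"})
```

Each replicated packet still carries IP.2 as its source, which is what lets every receiving VNIC learn the association between M.2, the interface identifier of L2 VNIC 2, and IP.2.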
[0175] For incoming traffic within the VLAN from compute instance 1 1000-A to compute instance 2 1000-B, NVD2 1001-B receives the packet from NVD1 1001-A. The packet has IP.1 as the source address and carries a frame; the frame contains M.2 as the destination MAC address and M.1 as the source MAC address. The frame also contains the interface identifier of L2 VNIC1 1002-A. Upon decapsulation, L2 VNIC2 1002-B receives the frame and learns that this interface identifier is associated with M.1 and / or IP.1; if this information was previously unknown, L2 VNIC2 1002-B stores the learned information in the L2 forwarding table 1006-B on switch 2 for subsequent outgoing traffic. Alternatively, if this information is already known, L2 VNIC2 1002-B refreshes its validity period.
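As a non-limiting illustration of the store-or-refresh behavior described above, the sketch below keeps an expiry time with each learned entry. The 300-second validity period, the ForwardingTable class, and the explicit now argument (used so the example is deterministic) are assumptions; the disclosure does not specify how the validity period is tracked.

```python
from typing import Dict, Optional, Tuple


class ForwardingTable:
    """Illustrative L2 forwarding table with a per-entry validity period."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:  # assumed value
        self.ttl = ttl_seconds
        # MAC -> (interface id, physical IP, expiry time)
        self.entries: Dict[str, Tuple[str, str, float]] = {}

    def observe(self, mac: str, interface_id: str,
                physical_ip: str, now: float) -> str:
        """Record what an incoming frame reveals; report what happened."""
        current = self.entries.get(mac)
        known = current is not None and current[:2] == (interface_id, physical_ip)
        # Either way the entry ends up valid for a full period from now.
        self.entries[mac] = (interface_id, physical_ip, now + self.ttl)
        return "refreshed" if known else "learned"

    def lookup(self, mac: str, now: float) -> Optional[Tuple[str, str]]:
        entry = self.entries.get(mac)
        if entry is None or entry[2] <= now:
            self.entries.pop(mac, None)  # drop expired entries lazily
            return None
        return entry[:2]


table = ForwardingTable()
first = table.observe("M.1", "vnic-1", "IP.1", now=0.0)
second = table.observe("M.1", "vnic-1", "IP.1", now=100.0)
```

An entry that is never refreshed eventually expires, after which the destination is treated as unknown again.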
[0176] For outgoing traffic sent from instance 2 1000-B in VLAN 1000 to an instance in another VLAN, a similar flow to the outgoing traffic described above may exist, except that a VSRS VNIC and VSRS switch are used. In particular, the destination MAC address is not within the L2 broadcast domain of VLAN 1000 (it is in another L2 VLAN). Therefore, the overlay destination IP address of the destination instance (e.g., IP.A) is used for this outgoing traffic. For example, L2 VNIC 2 1002-B determines that IP.A is outside the CIDR range of VLAN 1000. Therefore, L2 VNIC 2 1002-B sets the destination MAC address to the default gateway MAC address (e.g., M.DG). Based on M.DG, L2 switch 2 1004-B sends the outgoing traffic to the VSRS VNIC (e.g., via a tunnel with appropriate end-to-end encapsulation). The VSRS VNIC forwards the outgoing traffic to the VSRS switch. Next, the VSRS switch performs routing functionality, where, based on the overlay destination IP address (e.g., IP.A), the VSRS switch on VLAN 1000 sends the outgoing traffic to the VSRS switch on the other VLAN (e.g., via a virtual router between these two VLANs, with appropriate end-to-end encapsulation). The VSRS switch on the other VLAN then performs switching functionality by determining that IP.A is within the CIDR range of this VLAN, and performs an ARP cache lookup based on IP.A to determine the destination MAC address associated with IP.A. If no match is found in the ARP cache, an ARP request is sent to a different L2 VNIC on the other VLAN to determine the destination MAC address. Otherwise, the VSRS switch sends the outgoing traffic to the relevant VNIC (e.g., via a tunnel, with appropriate encapsulation).
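By way of a non-limiting sketch, the two decisions in this inter-VLAN flow can be written with Python's standard ipaddress module: the source side substitutes the default gateway MAC address when the overlay destination IP address lies outside the VLAN's CIDR range, and the VSRS switch on the destination VLAN consults its ARP cache. The function names, the returned action labels, and the sample addresses are assumptions introduced for illustration.

```python
import ipaddress
from typing import Dict, Optional, Tuple

DEFAULT_GATEWAY_MAC = "M.DG"  # placeholder label used in the description


def egress_dst_mac(dst_ip: str, vlan_cidr: str, resolved_mac: str) -> str:
    """Source side: keep the resolved MAC for in-VLAN destinations,
    otherwise address the frame to the default gateway (the VSRS)."""
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(vlan_cidr):
        return resolved_mac
    return DEFAULT_GATEWAY_MAC


def resolve_on_destination_vlan(dst_ip: str, vlan_cidr: str,
                                arp_cache: Dict[str, str]
                                ) -> Tuple[str, Optional[str]]:
    """Destination VSRS switch: decide between routing onward, sending an
    ARP request, or switching the frame to a known MAC address."""
    if ipaddress.ip_address(dst_ip) not in ipaddress.ip_network(vlan_cidr):
        return ("route", None)        # not local to this VLAN
    mac = arp_cache.get(dst_ip)
    if mac is None:
        return ("arp_request", None)  # cache miss: resolve first
    return ("switch", mac)            # cache hit: deliver within the VLAN


# Assumed example: VLAN 1000 is 10.0.0.0/24, the other VLAN is 10.0.1.0/24.
mac_out = egress_dst_mac("10.0.1.5", "10.0.0.0/24", "unused")
hit = resolve_on_destination_vlan("10.0.1.5", "10.0.1.0/24", {"10.0.1.5": "M.A"})
miss = resolve_on_destination_vlan("10.0.1.5", "10.0.1.0/24", {})
```

The same CIDR test therefore serves two roles: it triggers the default gateway substitution on the source side and confirms local delivery on the destination side.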
[0177] For incoming traffic from an instance in another VLAN to an instance in VLAN 1000, the traffic flow is the same as above, except that it is in the reverse direction. For outgoing traffic from an instance in VLAN 1000 to the L3 network, the traffic flow is the same as above, except that the VSRS switch in VLAN 1000 directly routes the packet to the destination VNIC in the virtual L3 network via the virtual router (e.g., the packet does not need to be routed through another VSRS switch). For incoming traffic from the virtual L3 network to an instance in VLAN 1000, the traffic flow is the same as above, except that the packet is received by the VSRS switch in VLAN 1000, which transmits the packet as a frame within the VLAN. For traffic between VLAN 1000 and other networks (outgoing or incoming), the VSRS switch is used similarly, with its routing function used for outgoing traffic to send packets through the appropriate gateway (e.g., IGW, NGW, DRG, SGW, LPG), and its switching function used for incoming traffic to transmit frames within VLAN 1000.
[0178] Referring to Figure 11, a schematic diagram of an embodiment of VLAN 1100 (for example, a cloud-based virtual L2 network) is shown, specifically a diagram of the VLAN implementation.
[0179] As described above, a VLAN can contain "n" compute instances 1102-A, 1102-B, 1102-N, each running on a host machine. As previously stated, there can be one-to-one associations between compute instances and host machines, or many-to-one associations between multiple compute instances and a single host machine. Each compute instance 1102-A, 1102-B, 1102-N can be an L2 compute instance, in which case it is associated with at least one virtual interface (e.g., L2 VNIC) 1104-A, 1104-B, 1104-N and a switch 1106-A, 1106-B, 1106-N. Switches 1106-A, 1106-B, 1106-N are L2 virtual switches and together form an L2 distributed switch 1107.
[0180] Pairs of L2 VNICs 1104-A, 1104-B, 1104-N and switches 1106-A, 1106-B, 1106-N associated with compute instances 1102-A, 1102-B, 1102-N on the host machine are pairs of software modules on NVDs 1108-A, 1108-B, 1108-N connected to the host machine. Each L2 VNIC 1104-A, 1104-B, 1104-N represents an L2 port of a single customer-recognized switch (referred to here as the v-switch). Generally, host machine "i" runs compute instance "i" and is connected to NVD "i". Then NVD "i" runs L2 VNIC "i" and switch "i". L2 VNIC "i" represents L2 port "i" of the v-switch. "i" is a positive integer between 1 and "n". Here, a one-to-one association is described, but other types of associations are also possible. For example, a single NVD can be connected to multiple hosts, each running one or more compute instances belonging to a VLAN. In this case, the NVD hosts multiple pairs of L2 VNICs and switches, each corresponding to one of the compute instances.
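As a non-limiting illustration of the associations described above, the sketch below assigns each compute instance "i" an L2 VNIC "i", a switch "i", and port "i" of the v-switch, and groups the VNIC and switch pairs by the NVD that hosts them, which covers both the one-to-one case and the case of a single NVD serving several hosts. The function name build_vswitch and the string labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_vswitch(placements: List[Tuple[str, str, str]]):
    """placements: (compute instance, host machine, NVD) per instance.

    Returns (ports, pairs_by_nvd): ports maps port number i to the
    elements behind it; pairs_by_nvd lists the (VNIC, switch) software
    pairs that each NVD hosts."""
    ports: Dict[int, Dict[str, str]] = {}
    pairs_by_nvd: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for i, (instance, host, nvd) in enumerate(placements, start=1):
        vnic, switch = f"L2 VNIC {i}", f"switch {i}"
        ports[i] = {"compute_instance": instance, "host": host,
                    "nvd": nvd, "vnic": vnic, "switch": switch}
        pairs_by_nvd[nvd].append((vnic, switch))
    return ports, dict(pairs_by_nvd)


ports, pairs_by_nvd = build_vswitch([
    ("CI 1", "host 1", "NVD 1"),
    ("CI 2", "host 2", "NVD 2"),
    ("CI 3", "host 3", "NVD 2"),  # NVD 2 is connected to two hosts
])
```

From the customer's side only the port numbers are visible; the grouping by NVD reflects where the pairs of software modules actually run.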
[0181] A VLAN can include an instance of VSRS 1110. VSRS 1110 performs switching and routing functions and includes instances of VSRS VNIC 1112 and VSRS switch 1114. VSRS VNIC 1112 represents a port on the v-switch that connects the v-switch to other networks via a virtual router. As shown, VSRS 1110 can be instantiated on server fleet 1116.
[0182] The control plane 1118 can track information identifying the L2 VNICs 1104-A, 1104-B, and 1104-N and their placement in the VLAN. The control plane 1118 can further provide this information to the L2 interfaces 1104-A, 1104-B, and 1104-N within the VLAN.
[0183] As shown in Figure 11, the VLAN can be a cloud-based virtual L2 network that can be built on top of the physical network 1120. In some embodiments, this physical network 1120 may include NVDs 1108-A, 1108-B, and 1108-N.
[0184] Generally, a first L2 compute instance in a VLAN (e.g., compute instance 1 1102-A) can communicate with a second compute instance in the VLAN (e.g., compute instance 2 1102-B) using L2 protocols. For example, a frame can be sent between these two L2 compute instances across the VLAN. Nevertheless, the frame can be encapsulated, tunneled, routed, and / or otherwise processed so that it can be sent over the underlying physical network 1120.
[0185] For example, compute instance 1 1102-A sends a frame destined for compute instance 2 1102-B. Depending on the network connections between host machine 1 and NVD1, between NVD1 and physical network 1120, between physical network 1120 and NVD2, and between NVD2 and host machine 2 (e.g., TCP / IP connection, Ethernet® connection, tunneling connection, etc.), different types of processing may be applied to the frame. For example, the frame is received by NVD1, encapsulated, and so on, until it reaches compute instance 2. This processing is assumed to take place so that the frame can be transmitted between the underlying physical resources, and for brevity and clarity, it is omitted from the description of VLANs and related L2 operations.
[0186] Virtual L2 network communication Multiple forms of communication can occur within or using a virtual L2 network. These may include intra-VLAN communication. In such embodiments, a source compute instance (CI) can send communication to a destination compute instance located in the same VLAN as the source compute instance (CI). Communication can also be sent to an endpoint outside the VLAN of the source CI. This may include, for example, communication between a source CI in a first VLAN and a destination CI in a second VLAN, communication between a source CI in a first VLAN and a destination CI in an L3 subnet, and / or communication from a source CI in a first VLAN to a destination CI outside the VCN containing the source CI's VLAN. This communication may further include, for example, receiving communication at the destination CI from a source CI outside the destination CI's VLAN. This source CI may be in another VLAN, an L3 subnet, or outside the VCN containing the source CI's VLAN.
[0187] Each CI within a VLAN can play an active role in the traffic flow. This includes learning interface identifiers versus MAC addresses (also referred to here as interface versus MAC address), mapping instances within the VLAN to maintain the L2 forwarding table within the VLAN, and sending and / or receiving communications (e.g., frames in the case of L2 communications). VSRS can play an active role in communications within the VLAN and in communications with source or destination CIs outside the VLAN. VSRS can maintain its presence within the L2 and L3 networks, enabling outgoing and incoming communications.
[0188] Referring now to Figure 12, a flowchart illustrating one embodiment of process 1200 for intra-VLAN communication is shown. In some embodiments, process 1200 may be executed by a compute instance within a common VLAN. Specifically, this process may be executed when a source CI sends communication to a destination CI within a VLAN, but does not know the IP-to-MAC address mapping of that destination CI. This can occur, for example, when a source CI sends a packet to a destination CI that has an IP address in the VLAN, but the source CI does not know the MAC address corresponding to that IP address. In this case, an ARP process can be executed to learn the destination MAC address and the IP-to-MAC address mapping.
[0189] If the source CI knows the IP-to-MAC address mapping, the source CI can send the packet directly to the destination CI without performing an ARP process. In some embodiments, this packet may be intercepted by the source VNIC, which in intra-VLAN communication is an L2 VNIC. If the source VNIC knows the interface-to-MAC address mapping for the destination MAC address, the source VNIC can encapsulate the packet, for example with L2 encapsulation, and forward the corresponding frame to the destination VNIC associated with the destination MAC address. In intra-VLAN communication, the destination VNIC is also an L2 VNIC.
[0190] If the source VNIC does not know the interface-to-MAC address mapping for a MAC address, the source VNIC can perform an interface-to-MAC address learning process. This may involve the source VNIC sending a frame to all interfaces in the VLAN. In some embodiments, this frame may be sent to all interfaces in the VLAN via broadcast. In some embodiments, this broadcast may be implemented in the form of a serial unicast in the physical network. This frame may include the destination MAC and IP addresses, the interface identifier, and the MAC and IP addresses of the source VNIC. Each VNIC in the VLAN can receive this frame and learn the interface-to-MAC address mapping of the source VNIC.
[0191] Each receiving VNIC can further decapsulate a frame and forward the decapsulated frame (e.g., the corresponding packet) to its associated CI. Each CI may include a network interface from which it can evaluate the forwarded packet. If the network interface determines that the CI receiving the forwarded packet does not match the destination MAC and / or IP address, the packet is dropped. If the network interface determines that the CI receiving the forwarded frame matches the destination MAC and / or IP address, the packet is received by the CI. In some embodiments, a CI having a MAC and / or IP address that matches the destination MAC and / or IP address of a packet can send a response to the source CI, thereby allowing the source VNIC to learn the interface-to-MAC address mapping of the destination CI, and thereby allowing the source CI to learn the IP-to-MAC address mapping of the destination CI.
[0192] If the source CI does not know the IP-to-MAC address mapping, or if the source CI's IP-to-MAC address mapping for the destination CI is outdated, process 1200 can be executed. Once the IP-to-MAC address mapping is known, the source CI can send the packet. Likewise, if the interface-to-MAC address mapping is not known, the interface-to-MAC address learning process outlined above can be executed; if it is known, the source VNIC can send the corresponding frame to the destination VNIC. Process 1200 begins at block 1202, where the source CI determines that the destination CI's IP-to-MAC address mapping is unknown to the source CI. In some embodiments, this may include the source CI determining the destination IP address for the packet and determining that the destination IP address is not associated with any MAC address stored in the source CI's mapping table. Alternatively, the source CI may determine that the IP-to-MAC address mapping for the destination CI is outdated. In some embodiments, a mapping can be outdated if it has not been updated and / or validated within a certain time limit. If the source CI determines that the destination CI's IP-to-MAC address mapping is unknown and / or outdated, the source CI initiates an ARP request for the destination IP address and sends the ARP request as an Ethernet broadcast.
[0193] In block 1204, the source VNIC, also called the source interface, receives the ARP request from the source CI. The source interface identifies all interfaces on the VLAN and sends the ARP request to all interfaces in the VLAN broadcast domain. As previously mentioned, the control plane knows all interfaces on the VLAN and provides this information to the interfaces within the VLAN, so the source interface also knows all interfaces within the VLAN and can send the ARP request to each of them. To do this, the source interface replicates the ARP request and encapsulates one copy for each interface on the VLAN. Each encapsulated ARP request includes the source CI interface identifier, the source CI MAC address and IP address, the target IP address, and the destination CI interface identifier. The source interface reproduces the Ethernet broadcast by sending the replicated and encapsulated ARP requests (e.g., ARP messages) as serial unicast, so that one ARP request is sent to each interface in the VLAN.
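The replication step in block 1204 can be pictured with a minimal Python sketch. It assumes hypothetical `ArpRequest` and `EncapsulatedArp` record types and illustrative IP addresses, none of which come from this description, and it further assumes that the source interface does not send a copy to itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArpRequest:
    """Hypothetical ARP request as issued by the source CI."""
    src_mac: str
    src_ip: str
    target_ip: str


@dataclass(frozen=True)
class EncapsulatedArp:
    """Hypothetical encapsulated copy addressed to one interface."""
    src_interface_id: str
    dst_interface_id: str
    payload: ArpRequest


def replicate_broadcast(request: ArpRequest,
                        src_interface_id: str,
                        vlan_interfaces: List[str]) -> List[EncapsulatedArp]:
    """Emulate an Ethernet broadcast as one unicast copy per peer interface."""
    return [
        EncapsulatedArp(src_interface_id, dst, request)
        for dst in vlan_interfaces
        if dst != src_interface_id  # assumption: no copy to the sender itself
    ]


# The interface list would be supplied by the control plane.
request = ArpRequest(src_mac="M.1", src_ip="10.0.0.1", target_ip="10.0.0.2")
copies = replicate_broadcast(request, "ID.1", ["ID.1", "ID.2", "ID.3"])
```

Each element of `copies` carries the source interface identifier, the source MAC and IP addresses, the target IP address, and the destination interface identifier, matching the fields listed for the encapsulated ARP request.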
[0194] In block 1206, each interface in the VLAN broadcast domain receives and decapsulates an ARP message. Each interface in the VLAN broadcast domain that receives an ARP message learns the interface-to-MAC address mapping of the source VNIC of the source CI (e.g., the interface identifier of the source interface to the MAC address of the source CI) because the message identifies the source CI's MAC address and IP address, as well as the source CI interface identifier. As part of learning the interface-to-MAC address mapping for the source CI, each interface can update its mapping table (e.g., its L2 forwarding table) and provide the updated mapping to its associated switch and / or CI. Each receiving interface, except for VSRS, can forward the decapsulated packet to its associated CI. The CI recipient of the forwarded decapsulated packet, specifically the network interface of that CI, can determine whether the target IP address matches the IP address of the CI. If the IP address of the CI associated with that interface does not match the destination CI IP address, in some embodiments, the packet is dropped by that CI and no further action is taken. In the case of VSRS, VSRS can determine whether the target IP address matches the VSRS's IP address. If the VSRS's IP address does not match the target IP address specified in the received packet, in some embodiments, the packet is dropped by the VSRS and no further action is taken.
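The learning and filtering behavior of block 1206 can be sketched as follows. This is an illustrative Python sketch only: the `L2Vnic` class, its method name, and the IP addresses are assumptions introduced here, and the forwarding table is modeled as a plain dictionary.

```python
from typing import Dict


class L2Vnic:
    """Hypothetical receiving interface; names are illustrative only."""

    def __init__(self, interface_id: str, ci_ip: str) -> None:
        self.interface_id = interface_id
        self.ci_ip = ci_ip
        # L2 forwarding table: overlay MAC address -> interface identifier
        self.forwarding_table: Dict[str, str] = {}

    def on_arp_request(self, src_interface_id: str, src_mac: str,
                       target_ip: str) -> bool:
        """Learn the sender's mapping; return True if the attached CI should reply."""
        self.forwarding_table[src_mac] = src_interface_id
        return target_ip == self.ci_ip


vnic2 = L2Vnic("ID.2", "10.0.0.2")
vnic3 = L2Vnic("ID.3", "10.0.0.3")
# Both interfaces receive a copy of the ARP request sent on behalf of M.1.
answers = [v.on_arp_request("ID.1", "M.1", "10.0.0.2") for v in (vnic2, vnic3)]
```

Every receiver learns the mapping for the source, but only the interface whose compute instance owns the target IP address produces a reply; the others drop the packet.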
[0195] If the destination CI IP address specified in the received packet is determined to match the IP address of the CI (destination CI) associated with the receiving interface, the destination CI sends a response, which may be a unicast ARP response, to the source interface, as shown in block 1208. This response includes the destination CI MAC address and destination CI IP address, and the source CI IP address and MAC address. This response is received by the destination interface, as shown in block 1210, and it encapsulates the unicast ARP response. In some embodiments, this encapsulation may include Geneve encapsulation. The destination interface can forward the encapsulated ARP response to the source interface via the destination switch. This response includes the destination CI MAC address and IP address, as well as the destination CI interface identifier, and the source CI MAC address and IP address, as well as the source CI interface identifier.
[0196] In block 1212, the source interface receives the ARP response and decapsulates it. The source interface can then learn an interface-to-MAC address mapping for the destination CI based on the information contained in the encapsulated and / or decapsulated frame. In some embodiments, the source interface can forward the ARP response to the source CI.
[0197] In block 1214, the source CI receives an ARP response. In some embodiments, the source CI can update its mapping table based on the information contained in the ARP response, specifically, it can update the mapping table to reflect the IP-to-MAC address mapping based on the destination CI's MAC address and IP address. The source CI can then send a packet to the destination CI based on this MAC address. This packet may include the source CI's MAC address and interface identifier as the source MAC address and source interface, and the destination CI's MAC address and interface identifier as the destination MAC address and destination interface.
[0198] In block 1216, the source interface can receive packets from the source CI. The source interface can encapsulate packets, and in some embodiments, this encapsulation uses Geneve encapsulation. The source interface can forward the corresponding frames to the destination CI, specifically to the destination interface. The encapsulated frames may include the MAC address and interface identifier of the source CI as the source MAC address and source interface identifier, and the MAC address and interface identifier of the destination CI as the destination MAC address and destination interface.
[0199] In block 1218, the destination interface receives a frame from the source interface. The destination interface can decapsulate the frame and then forward the corresponding packet to the destination CI. In block 1220, the destination CI receives a packet from the destination interface.
[0200] IGMP In an L2 physical network, frames can be multicast to a group of devices associated with a group MAC address. To do this, the network's switches send IGMP queries to different devices in the network, asking whether each device should be added to the multicast group having the group MAC address. If a device's IGMP response indicates that the device should be a member of the multicast group, the switch associates that device's MAC address with the group MAC address in its IGMP table. Subsequently, when the switch receives a frame destined for the group MAC address, it duplicates that frame into multiple frames destined for the MAC addresses of the devices in the multicast group and sends the duplicated frames to the relevant devices.
[0201] In contrast, as explained here, a Layer 2 virtual network does not involve a single switch. Instead, a distributed switch is implemented (referred to here as a v-switch or L2 v-switch), including multiple L2 VNICs and L2 virtual switches hosted on one or more NVDs. Therefore, there does not need to be a single central switch for sending IGMP queries to different compute instances of the Layer 2 virtual network. Instead, IGMP queries need to be delivered across L2 virtual switches. Furthermore, IGMP responses are not received by a single central switch. Instead, each L2 virtual switch can receive IGMP responses from its corresponding compute instance. To enable group multicast, an IGMP table must be created from the received responses collected by multiple L2 virtual switches and then delivered to multiple L2 virtual switches.
[0202] Figure 13 shows an exemplary environment suitable for defining an L2 virtual network configuration. In this embodiment, the environment includes a computer system 1310 that communicates with a customer device 1320 over one or more networks (not shown). The computer system 1310 may include a set of hardware computing resources that host a VCN 1312. A control plane hosted by one or more of the hardware computing resources can receive and process input from the customer device 1320 and deploy an L2 virtual network (shown as L2VLAN 1314 in Figure 13) within the VCN 1312.
[0203] In one example, input from customer device 1320 may include various types of information. This information can be specified via console or API calls and may include, among other things, customer-specified configuration information for the L2 VLAN (shown as L2 VLAN configuration 1322).
[0204] The L2 VLAN configuration 1322 can, for example, specify the number, type, and configuration of L2 compute instances that should be included in L2 VLAN 1314. In addition, the L2 VLAN configuration 1322 can specify the customer-designated name of the port on the customer-aware v-switch, the MAC address of the compute instance (which may be an L2 compute instance), and the association between the port and the MAC address (or, more generally, the compute instance). For example, the customer may specify that L2 VLAN 1314 should contain two L2 compute instances, the first L2 compute instance having MAC address M.1 and associated with a first port named P1, and the second L2 compute instance having MAC address M.2 and associated with a second port named P2.
[0205] The control plane receives the various information and then deploys and manages the different resources of L2 VLAN 1314. For example, L2 VLAN 1314 is configured according to L2 VLAN configuration 1322 and includes a requested compute instance hosted on a host machine and an L2 VNIC-L2 virtual switch pair hosted on an NVD. The control plane can also maintain a mapping between the customer-specified L2 VLAN configuration 1322 and the actual topology of L2 VLAN 1314. For example, the mapping indicates that an L2 VNIC emulates a port on a v-switch and that the L2 VNIC (e.g., its interface identifier, its MAC address (if one was not specified by the customer), and / or the IP address of the NVD hosting the L2 VNIC) is associated with a port name (and with the specified MAC address, if one was specified). The mapping can also associate the L2 VNIC and its associated L2 virtual switch with the associated compute instance, indicating the NVD hosting the L2 VNIC and L2 virtual switch and the host machine hosting the compute instance.
[0206] Figure 14 illustrates an exemplary IGMP technique in a Layer 2 virtual network. The Layer 2 virtual network is referred to here as a VLAN. The top of Figure 14 shows a VLAN implementation diagram 1410. The bottom of Figure 14 shows a customer presentation 1420 of the VLAN. The customer presentation 1420 gives the customer for whom the Layer 2 virtual network is deployed the perception that the L2 v-switch in this network has IGMP-enabled ports. Implementation diagram 1410 shows how the IGMP technique can be implemented: each port on the L2 v-switch is emulated by an L2 VNIC, which, together with the associated L2 virtual switch, can support IGMP functionality.
[0207] In one example, as described above, an NVD's L2 VNIC learns interface-to-MAC address mappings based on incoming traffic. Such mappings, along with VLAN identifiers, can be sent to the control plane (or, more generally, to the computer system that manages the VCN containing the VLAN). The control plane can receive similar mappings from different NVDs hosting different L2 VNICs and, among other things, generate mappings between interface identifiers, MAC addresses, physical IP addresses (e.g., of the NVDs), and VLAN identifiers.
[0208] For example, VNIC1 learns that M.2 (the overlay MAC address of compute instance 2) is associated with ID.2 (the interface identifier of L2 VNIC2) and IP.2 (the physical address of NVD2), and that M.n (the overlay MAC address of compute instance n) is associated with ID.n (the interface identifier of L2 VNIC n) and IP.n (the physical address of NVD n). Similarly, VNIC2 learns that M.1 (the overlay MAC address of compute instance 1) is associated with ID.1 (the interface identifier of L2 VNIC1) and IP.1 (the physical address of NVD1). These associations are reported to the control plane as part of the mapping, and the control plane can generate mappings such as {Customer 1; M.1 → ID.1, IP.1; VLAN A}, {Customer 1; M.2 → ID.2, IP.2; VLAN A}, ..., {Customer 1; M.n → ID.n, IP.n; VLAN A}.
[0209] Furthermore, the control plane can generate and maintain a mapping between customer-specified port names and the distribution of VLAN resources on the physical network. For example, this mapping indicates that port "i" is configured for compute instance "i" and corresponds to L2 VNIC "i", which is in turn associated with L2 virtual switch "i", and that this pair is hosted by NVD "i". This mapping may be represented as {Customer 1; CI.1 → P.1; P.1 → M.1; M.1 → ID.1, SW.1, IP.1; VLAN A}, where CI.1 identifies compute instance 1 (using customer-defined naming conventions), P.1 identifies port 1 (using customer-defined naming conventions), M.1 identifies the MAC address associated with compute instance 1, ID.1 identifies L2 VNIC 1, SW.1 identifies L2 virtual switch 1, IP.1 identifies the IP address of NVD 1, and VLAN A identifies the VLAN.
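One way to picture the mapping of paragraph [0209] is as a flat record per port. The following Python sketch is only an assumption about how such a record could be stored; the `PortMapping` type, the `nvds_for_vlan` helper, and the extra "VLAN B" entry are hypothetical and not part of this description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortMapping:
    """Hypothetical record for {customer; CI -> P; P -> M; M -> ID, SW, IP; VLAN}."""
    customer: str
    compute_instance: str  # CI.i, customer-defined name
    port: str              # P.i, customer-defined port name
    mac: str               # M.i, overlay MAC address
    vnic_id: str           # ID.i, L2 VNIC interface identifier
    switch_id: str         # SW.i, L2 virtual switch
    nvd_ip: str            # IP.i, physical IP address of the hosting NVD
    vlan: str


def nvds_for_vlan(mappings: List[PortMapping], vlan: str) -> List[str]:
    """Return the distinct NVD addresses hosting VNIC/switch pairs of a VLAN."""
    seen: List[str] = []
    for m in mappings:
        if m.vlan == vlan and m.nvd_ip not in seen:
            seen.append(m.nvd_ip)
    return seen


mappings = [
    PortMapping("Customer 1", "CI.1", "P.1", "M.1", "ID.1", "SW.1", "IP.1", "VLAN A"),
    PortMapping("Customer 1", "CI.2", "P.2", "M.2", "ID.2", "SW.2", "IP.2", "VLAN A"),
    PortMapping("Customer 1", "CI.9", "P.9", "M.9", "ID.9", "SW.9", "IP.1", "VLAN B"),
]
targets = nvds_for_vlan(mappings, "VLAN A")
```

A lookup of this kind is what would let the control plane decide which NVDs to contact for a given VLAN.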
[0210] Therefore, the control plane maintains customer-defined configurations of VLANs (e.g., compute instance 1 is connected to port 1, etc.), mappings of VLANs to actual network realities on the physical network, and mappings that associate customer-defined configurations with actual realities. Such information enables the control plane to integrate IGMP functionality within VLANs.
[0211] For example, based on the mapping, the control plane determines which NVDs host the L2 virtual switches associated with the compute instances in the VLAN (e.g., NVD1, NVD2, ..., NVDn). The control plane sends each of these NVDs a request to perform periodic IGMP queries within the VLAN. The periodic time interval can be 30 seconds or some other predetermined value. An NVD (e.g., NVD1) receives the request and determines the applicable L2 VNIC (e.g., L2 VNIC1). The L2 VNIC (e.g., L2 VNIC1) then sends an IGMP query to its associated compute instance (e.g., compute instance 1) to determine whether the compute instance should be added to the multicast group (or removed from the multicast group if previously added). In response to the compute instance's IGMP response regarding an addition (or removal), the L2 VNIC updates the associated L2 virtual switch (e.g., L2 virtual switch 1) so that the L2 virtual switch can update its local IGMP configuration (in the case of an addition, by including an association between the compute instance's overlay MAC address and the multicast group's group MAC address (e.g., "MG"); in the case of a removal, by removing this association from the local IGMP table). Responses from different compute instances (or updates to the local IGMP tables) are also reported to the control plane, which then updates the mapping to include an IGMP configuration indicating, for example, that compute instances 1 and n are members of the multicast group but compute instance 2 is not (e.g., {Customer 1; M.1 → ID.1, IP.1; VLAN A; MG listener: yes}, {Customer 1; M.2 → ID.2, IP.2; VLAN A; MG listener: no}, ..., {Customer 1; M.n → ID.n, IP.n; VLAN A; MG listener: yes}). The control plane can send all or part of the collected IGMP configuration (for example, an IGMP table in the following format: {M.1, MG, Listener: yes}, {M.2, MG, Listener: no}, ..., {M.n, MG, Listener: yes}) to the NVDs. Each NVD then updates its local L2 virtual switches with the relevant parts of the received IGMP configuration for use in subsequent frame multicast.
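The per-switch handling of an IGMP response described in paragraph [0211] can be sketched as below. This is a minimal Python sketch under stated assumptions: the `L2VirtualSwitch` class, the `apply_response` method, and the tuple used as the report to the control plane are all hypothetical names chosen for illustration.

```python
from typing import Dict, Set, Tuple


class L2VirtualSwitch:
    """Hypothetical L2 virtual switch holding a local IGMP table."""

    def __init__(self) -> None:
        # Local IGMP table: group MAC address -> set of member overlay MACs
        self.local_igmp: Dict[str, Set[str]] = {}

    def apply_response(self, member_mac: str, group_mac: str,
                       join: bool) -> Tuple[str, str, bool]:
        """Apply a join or leave locally and return a report for the control plane."""
        members = self.local_igmp.setdefault(group_mac, set())
        if join:
            members.add(member_mac)
        else:
            members.discard(member_mac)
        return (member_mac, group_mac, join)


sw1 = L2VirtualSwitch()
# Compute instance 1 answers the IGMP query and asks to join group MG.
report = sw1.apply_response("M.1", "MG", join=True)
```

The returned tuple stands in for whatever message the NVD uses to report the response (or the local table update) to the control plane.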
[0212] Multicast can be performed by the NVD based on a locally stored IGMP table. For example, NVD1 receives a frame from compute instance 1 that contains a multicast MAC address (e.g., MG) as its destination MAC address. Based on its local IGMP table 1, L2 virtual switch 1 determines that M.n (the MAC address of compute instance n) and M.3 (the MAC address of compute instance 3, not shown in Figure 14) are associated with MG. Therefore, L2 virtual switch 1 replicates the frame into two frames: a first frame with M.n as its destination MAC address and a second frame with M.3 as its destination MAC address. These replicated frames are sent to L2 VNIC n and L2 VNIC 3 (L2 VNIC 3 corresponds to compute instance 3, not shown in Figure 14).
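The replication in paragraph [0212] can be illustrated with a short Python sketch. The function name, the table layouts, and the payload are assumptions made for illustration; the sketch also assumes that the sender's own MAC address may appear in the local IGMP table and is skipped when frames are replicated.

```python
from typing import Dict, List, Set, Tuple


def replicate_multicast(src_mac: str, group_mac: str, payload: bytes,
                        igmp_table: Dict[str, Set[str]],
                        forwarding_table: Dict[str, str]
                        ) -> List[Tuple[str, str, bytes]]:
    """Return one (destination MAC, destination interface, payload) per listener."""
    copies = []
    # Sorted only to make the order of the replicated frames deterministic.
    for mac in sorted(igmp_table.get(group_mac, set())):
        if mac == src_mac:
            continue  # assumption: the sender does not receive its own frame
        copies.append((mac, forwarding_table[mac], payload))
    return copies


# Local IGMP table 1 and L2 forwarding table of L2 virtual switch 1.
local_igmp_table_1 = {"MG": {"M.1", "M.n", "M.3"}}
forwarding_table_1 = {"M.n": "ID.n", "M.3": "ID.3"}
frames = replicate_multicast("M.1", "MG", b"data",
                             local_igmp_table_1, forwarding_table_1)
```

Here `frames` contains two entries, one destined for M.3 via L2 VNIC 3 and one destined for M.n via L2 VNIC n, mirroring the two replicated frames in the example.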
[0213] As further explained above, an NVD hosting the VSRS VNIC and the VSRS switch can be used for communication between the VLAN and resources outside the VLAN. The IGMP configuration collected by the control plane can be sent to this NVD. Thus, when the speaker to the multicast group is outside the VLAN, the VSRS VNIC can receive the associated traffic. Using its switching function, the VSRS switch determines the compute instances of the VLAN that are associated with the multicast group and sends multicast frames to the L2 VNICs associated with these compute instances.
[0214] Figure 15 is a flowchart illustrating the process for generating an IGMP table in a Layer 2 virtual network. In one embodiment, process 1500 can be performed by a control plane or another VCN system that manages the deployment of the Layer 2 virtual network on the underlying physical network. For clarity, process 1500 is described in relation to two NVDs, each hosting a pair of L2 VNICs and L2 virtual switches for different compute instances. However, this process is equally applicable when there are different numbers of NVDs, L2 VNIC-L2 virtual switch pairs, and / or compute instances.
[0215] Process 1500 begins in block 1502, in which the control plane receives first information from the first NVD regarding the IGMP response of a first compute instance of an L2 virtual network. In one embodiment, the first information indicates that the first compute instance should be added to a multicast group. This compute instance may be hosted on a host machine communicably coupled to the first NVD. The first NVD may host a first L2 VNIC and a first L2 virtual switch associated with the first compute instance. In one embodiment, the control plane may send a request to the first NVD to perform an IGMP query. The first information is received from the first NVD in response to the request. In such an embodiment, the control plane may determine, based on mapping information, that the first NVD hosts a first L2 VNIC and a first L2 virtual switch associated with a compute instance of an L2 virtual network, and may send a request based on this determination.
[0216] In block 1504, the control plane receives second information from the second NVD regarding the IGMP response of a second compute instance of the L2 virtual network. In one embodiment, the second information also indicates that the second compute instance should be added to a multicast group. This compute instance may be hosted on the same or a different host machine that is communicably coupled with the second NVD. The second NVD may host a second L2 VNIC and a second L2 virtual switch associated with the second compute instance. Although process 1500 is described in relation to two different NVDs, the process also applies when a single NVD hosts multiple pairs of L2 VNICs and L2 virtual switches. In one embodiment, the control plane may send a request to the second NVD to perform an IGMP query. The second information is received from the second NVD in response to the request. In such embodiments, the control plane can determine, based on mapping information, that the second NVD hosts a second L2 VNIC and a second L2 virtual switch associated with a second compute instance of the L2 virtual network, and can send a request based on this determination. The requests in blocks 1502 and 1504 may be sent in parallel or sequentially using any type of mechanism, such as broadcast or unicast.
[0217] In block 1506, the control plane generates an IGMP table based on the first and second information. In one embodiment, the IGMP table associates the MAC addresses of the first compute instance and the second compute instance with the group MAC address of the multicast group, indicating that the two MAC addresses are listeners. The control plane may also receive information indicating that another compute instance of the L2 virtual network should not be added to (or should be removed from) the multicast group. In this case, the IGMP table may omit any association between that compute instance's MAC address and the group MAC address, or may include an association indicating that the MAC address is not a listener. The MAC addresses of compute instances may be determined based on the customer-defined configuration of the L2 virtual network and on mapping information maintained by the control plane regarding the deployment of L2 virtual network resources on the physical network. The group MAC address may be generated by the control plane or defined based on customer input.
[0218] In block 1508, the control plane sends at least a first portion of the IGMP table to the first NVD. In one embodiment, the first NVD already has first information about a first compute instance associated with a multicast group. Therefore, the control plane does not need to send the entire IGMP table to the first NVD. Instead, the control plane may send a portion of the IGMP table that contains information that may not yet be available to the first NVD. For example, this portion may include information about a second compute instance (e.g., the association between the second compute instance and the multicast group) but not information about the first compute instance (e.g., the association between the first compute instance and the multicast group). In other embodiments, the entire IGMP table is sent to the first NVD.
[0219] In block 1510, the control plane sends at least a second portion of the IGMP table to the second NVD. In one embodiment, the second NVD already has second information about a second compute instance associated with a multicast group. Therefore, the control plane does not need to send the entire IGMP table to the second NVD. Instead, the control plane may send a portion of the IGMP table that contains information that may not yet be available to the second NVD. For example, this portion may contain information about a first compute instance (e.g., the association between the first compute instance and the multicast group) but not information about a second compute instance (e.g., the association between the second compute instance and the multicast group). In other embodiments, the entire IGMP table is sent to the second NVD.
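Blocks 1506 through 1510 can be summarized in a small Python sketch. The report tuple layout and the helper names `build_igmp_table` and `portion_for` are assumptions introduced here; the sketch implements the embodiment in which each NVD is sent only the entries it did not report itself.

```python
from typing import Dict, List, Tuple

# Hypothetical report: (reporting NVD, member MAC, group MAC, is listener)
Report = Tuple[str, str, str, bool]
IgmpTable = Dict[Tuple[str, str], bool]


def build_igmp_table(reports: List[Report]) -> IgmpTable:
    """Block 1506: merge per-NVD reports into one table keyed by (MAC, group)."""
    table: IgmpTable = {}
    for _nvd, mac, group, listener in reports:
        table[(mac, group)] = listener
    return table


def portion_for(nvd: str, reports: List[Report], table: IgmpTable) -> IgmpTable:
    """Blocks 1508/1510: entries that the given NVD did not report itself."""
    own = {(mac, group) for r_nvd, mac, group, _ in reports if r_nvd == nvd}
    return {key: value for key, value in table.items() if key not in own}


reports = [("NVD1", "M.1", "MG", True), ("NVD2", "M.2", "MG", True)]
table = build_igmp_table(reports)
portion_1 = portion_for("NVD1", reports, table)
portion_2 = portion_for("NVD2", reports, table)
```

In the other embodiment, where the entire IGMP table is sent, `table` itself would be delivered to both NVDs instead of the two portions.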
[0220] Figure 16 is a flowchart illustrating the process for updating the IGMP table in a Layer 2 virtual network. In one embodiment, process 1600 can be performed by a control plane or another VCN system that manages the deployment of the Layer 2 virtual network on the underlying physical network. For clarity, process 1600 is described in relation to a certain update from a certain NVD. However, process 1600 is equally applicable when multiple updates are received from the same NVD or different NVDs. Assume that the IGMP table has already been generated (for example, by performing process 1500).
[0221] Process 1600 begins in block 1602, in which the control plane receives first information from the first NVD indicating an update to the multicast group. In one embodiment, the first information indicates that the first compute instance should be removed from the multicast group (if it is already added to the multicast group) or should be added to the multicast group (if it is not already added to the multicast group or has been previously removed from the multicast group). The first information may be received based on a push mechanism implemented by the first NVD, or the first information may be received in response to a request from the control plane for an update regarding an IGMP query.
[0222] In block 1604, the control plane generates an update to the IGMP table. In one embodiment, when the first information indicates that the first compute instance should be removed, the control plane updates the IGMP table by removing the association of the first compute instance with the multicast group (for example, by deleting the entry that associates the MAC address of the first compute instance with the group MAC address, or by indicating that the first compute instance is not a listener to the multicast group). Conversely, when the first information indicates that the first compute instance should be added, the control plane updates the IGMP table by adding the association of the first compute instance with the multicast group (for example, by inserting an entry that associates the MAC address of the first compute instance with the group MAC address, or by indicating that the first compute instance is a listener to the multicast group). In a first embodiment, the first NVD may already have updated its own local IGMP table to reflect the change in the first compute instance's membership in the multicast group. Therefore, the control plane does not need to send the update or the updated IGMP table to the first NVD. In such an embodiment, blocks 1606 and 1608 can be executed for the other NVDs. In a second embodiment, the control plane may send the updated IGMP table and / or the updates to all NVDs hosting pairs of L2 VNICs and L2 virtual switches of the L2 virtual network, either periodically, whenever it receives an update, or when the number of received updates exceeds a threshold.
[0223] In block 1606, the control plane determines the set of NVDs to which the update should be sent. As described above, the set may, in the first embodiment, include one or more NVDs other than the first NVD, and in the second embodiment, may include the first NVD. The control plane may determine the set of NVDs based on mapping information about the L2 virtual network. In the first embodiment, the control plane does not include the first NVD in the set. For a second NVD hosting a second L2 VNIC and L2 virtual switch associated with a second compute instance of the L2 virtual network, the control plane may also determine whether the second compute instance is a member of the multicast group. If the second compute instance is a member of the multicast group, the control plane may determine that the second NVD should receive the update because a frame that originates from the second compute instance and is addressed to the group MAC address may no longer need to be replicated to the first compute instance (if the update removes the first compute instance) or may need to be replicated to it (if the update adds the first compute instance). Otherwise, the control plane can determine that the second NVD does not need to receive the update. In the second embodiment, both the first and second NVDs would be added to the set (for example, their identifiers would be included in the set).
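The selection of block 1606 can be illustrated with the following minimal sketch. The function name `nvds_to_notify`, its parameters, and the representation of the mapping information as a dictionary from compute instance MAC address to NVD identifier are all assumptions made for illustration.

```python
def nvds_to_notify(instance_to_nvd, group_members, reporting_nvd, notify_all=False):
    """Return the identifiers of the NVDs that should receive an update.

    instance_to_nvd: compute instance MAC -> identifier of the NVD hosting
                     its L2 VNIC and L2 virtual switch (hypothetical mapping).
    group_members:   MACs of the compute instances in the multicast group.
    reporting_nvd:   the first NVD, which reported the membership change.
    notify_all:      True models the second embodiment (every NVD of the
                     L2 virtual network, including the reporting one).
    """
    if notify_all:
        return set(instance_to_nvd.values())
    # First embodiment: only NVDs hosting a group member, minus the reporter,
    # which already knows about the change.
    return {
        nvd
        for mac, nvd in instance_to_nvd.items()
        if mac in group_members and nvd != reporting_nvd
    }
```

For example, with instances on `nvd-1`, `nvd-2`, and `nvd-3` where only the first two host group members and `nvd-1` reported the change, the first embodiment yields `{"nvd-2"}` and the second yields all three identifiers.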
[0224] In block 1608, the control plane sends the update to the set of NVDs. In one embodiment, the control plane sends the entire updated IGMP table. In some other embodiments, the control plane sends only the updated portion of the IGMP table (e.g., the added entries, or an indication of the entries that were removed from the IGMP table and that each NVD in the set should remove from its local IGMP table).
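The two options of block 1608 (sending the entire table or only the changed portion) can be sketched as follows. The function name `build_update` and the dictionary layout of the message are hypothetical; no wire format is defined in this description.

```python
def build_update(old, new, send_full=False):
    """Build the message sent to each NVD in the set.

    old / new: IGMP table before and after the change, as
               group MAC -> set of listener MACs (assumed representation).
    """
    if send_full:
        return {"kind": "full", "table": {g: sorted(m) for g, m in new.items()}}
    groups = set(old) | set(new)
    added = {g: sorted(new.get(g, set()) - old.get(g, set())) for g in groups}
    removed = {g: sorted(old.get(g, set()) - new.get(g, set())) for g in groups}
    return {
        "kind": "delta",
        # Keep only groups whose membership actually changed.
        "added": {g: m for g, m in added.items() if m},
        "removed": {g: m for g, m in removed.items() if m},
    }
```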
[0225] Figure 17 is a flowchart illustrating the process for executing IGMP queries in a Layer 2 virtual network. In one embodiment, process 1700 can be executed by an NVD hosting a first L2 VNIC and a first L2 virtual switch associated with a first compute instance, where the L2 virtual network includes the L2 VNIC, the L2 virtual switch, and the first compute instance. For clarity, process 1700 is described in relation to an NVD hosting a single L2 VNIC-L2 virtual switch pair. However, this process also applies when the NVD hosts multiple such pairs. In this case, the NVD can store a local IGMP table for each L2 virtual switch, or the NVD can store a single local IGMP table and associate the entries in this table with different L2 virtual switches.
[0226] Process 1700 starts at block 1702, and the NVD receives a request for an IGMP query. In one embodiment, the request is periodically received from the control plane of a VCN (or another VCN system) that includes an L2 virtual network. Based on mapping information stored by the control plane, the request may include an identifier for a first compute instance and / or an identifier for a first L2 VNIC that emulates the ports of the first compute instance.
[0227] In block 1704, the NVD sends an IGMP query to the first compute instance. In one embodiment, the NVD identifies the first compute instance based on the request and the host machine hosting the first compute instance. The NVD may then send a request to the host machine indicating that an IGMP response from the first compute instance is requested.
[0228] In block 1706, the NVD receives an IGMP response for the first compute instance. In one embodiment, the IGMP response is received as a response to an IGMP query and indicates whether the first compute instance should be added to the multicast group (if not previously added) or removed from the multicast group (if previously added).
[0229] In block 1708, the NVD stores first information regarding the IGMP response. In one embodiment, the first information indicates whether the first compute instance should be added to or removed from the multicast group. For addition, the NVD (e.g., the first L2 VNIC or the first L2 virtual switch) may add an association of the first compute instance with the multicast group to the local IGMP table of the first L2 virtual switch (e.g., by inserting an entry that associates the MAC address of the first compute instance with the group MAC address, or by indicating that the first compute instance is a listener for the multicast group). For removal, the NVD (e.g., the first L2 VNIC or the first L2 virtual switch) may remove the association of the first compute instance with the multicast group from the local IGMP table (e.g., the corresponding entry is deleted, or the first information is stored to indicate that the first compute instance is not a listener for the multicast group).
[0230] In block 1710, the NVD transmits first information to a computer system. In one embodiment, the computer system includes a control plane. The first information indicates the membership status of a first compute instance in a multicast group.
[0231] In block 1712, the NVD receives at least a portion of the IGMP table. In one embodiment, the NVD receives the entire IGMP table generated and / or maintained by the computer system based on the first information. In one embodiment, the NVD receives only a portion of this IGMP table, which does not contain information about the membership of a first compute instance in the multicast group (because this information may already be known to the NVD and corresponds to the first information), but contains information about the membership of a second compute instance in the multicast group. An example of a second compute instance includes a compute instance associated with a second L2 VNIC and a second L2 virtual switch, which are instead hosted by a second different NVD.
[0232] In block 1714, the NVD stores the local IGMP table. In one embodiment, the NVD updates the already stored local IGMP table based on the portion received in block 1712. For example, entries from the received portion are added to the local IGMP table. In some other embodiments, the NVD receives the entire IGMP table. In this case, the NVD can replace the local IGMP table with the received IGMP table, or it can compare the two tables and add to the local IGMP table any entries that are missing from it but present in the received IGMP table.
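The handling of a received table in block 1714 can be illustrated by the following sketch, which covers both the replace option and the compare-and-add option. The function name `merge_received` and the dictionary representation of the tables are assumptions for illustration only.

```python
def merge_received(local, received, replace=False):
    """Return the new local IGMP table of an NVD.

    local / received: group MAC -> set of listener MACs (assumed layout).
    replace=True  : the received table supersedes the local one.
    replace=False : entries missing from the local table are added to it.
    """
    if replace:
        return {group: set(members) for group, members in received.items()}
    merged = {group: set(members) for group, members in local.items()}
    for group, members in received.items():
        merged.setdefault(group, set()).update(members)
    return merged
```

Note that the compare-and-add option never deletes local entries, so removals would have to reach the NVD by another route, such as the removal indications described for block 1608 or a full replacement.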
[0233] Figure 18 is a flowchart illustrating the process for using an IGMP table in a Layer 2 virtual network. In one embodiment, process 1800 may be performed by an NVD hosting a first L2 VNIC and a first L2 virtual switch associated with a first compute instance, where the L2 virtual network includes the L2 VNIC, the L2 virtual switch, and the first compute instance. For clarity, process 1800 is described in relation to an NVD hosting a single L2 VNIC-L2 virtual switch pair. However, this process also applies when the NVD hosts multiple such pairs. In this case, the NVD may store a local IGMP table for each L2 virtual switch, or the NVD may store a single local IGMP table and associate entries in this table with different L2 virtual switches. In either case, the IGMP table entries can be examined to determine whether a frame should be replicated.
[0234] Process 1800 starts at block 1802, where the NVD receives a frame containing header information. In one embodiment, the frame may be received from a first compute instance via the host machine (for example, the frame is an outgoing frame sent from the first compute instance).
[0235] In block 1804, the NVD determines whether the frame needs to be replicated. If the frame needs to be replicated, block 1806 follows block 1804; otherwise, block 1810 follows block 1804. In one embodiment, this determination involves the header information and the IGMP table entries. Specifically, the NVD parses the header information to determine the destination MAC address and whether this destination MAC address matches the group MAC address of the multicast group (for example, a first L2 VNIC passes the frame to a first L2 virtual switch, which then performs a lookup of its IGMP table entries). Assuming the destination MAC address is the group MAC address, the NVD (for example, the first L2 virtual switch) determines that the frame should be replicated, identifies the MAC address of each second compute instance associated with the group MAC address, and determines the number of copies (which is equal to the number of second compute instances identified).
[0236] In block 1806, the NVD generates a copy of the frame. In one embodiment, the NVD (e.g., the first L2 virtual switch) copies the payload and modifies the header information to include the MAC address of a second compute instance instead of the group MAC address as the destination. This copying can be repeated as needed, depending on the number of second compute instances determined in block 1804.
[0237] In block 1808, the NVD transmits a copy. In one embodiment, the replicated frame is transmitted by the first L2 virtual switch to a second L2 VNIC associated with the second compute instance that has the destination MAC address. This transmission can be repeated as needed, depending on the number of second compute instances determined in block 1804.
[0238] In block 1810, the NVD sends a frame. Here, the header information includes the MAC address of the second compute instance as the destination, so the frame is actually destined for the second compute instance, not the multicast group. Therefore, in one embodiment, the frame is sent by the first L2 virtual switch to the second L2 VNIC associated with the second compute instance.
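Blocks 1804 through 1810 can be summarized in the following minimal sketch. The `Frame` type, the `forward` function, and the dictionary form of the IGMP table are hypothetical. The sketch also assumes that the sender is not sent a copy of its own frame, which is consistent with the copies being addressed to "second" compute instances but is not stated explicitly.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: bytes


def forward(frame, igmp_table):
    """Return the frames an L2 virtual switch would transmit for one frame.

    igmp_table: group MAC -> set of listener MACs (assumed layout).
    """
    listeners = igmp_table.get(frame.dst_mac)
    if listeners is None:
        # Block 1810: the destination is not a group MAC, send as is.
        return [frame]
    # Blocks 1806 and 1808: one copy per listener, with the group MAC
    # replaced by that listener's MAC and the payload left unchanged.
    return [
        replace(frame, dst_mac=mac)
        for mac in sorted(listeners)
        if mac != frame.src_mac
    ]
```

Because each copy carries a unicast destination MAC address, the receiving L2 VNICs need no knowledge of the multicast group to deliver it.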
[0239] C. Exemplary Infrastructure as a Service Architecture As mentioned above, IaaS (Infrastructure as a Service) is a specific type of cloud computing. IaaS may be configured to provide virtualized computing resources over a public network (e.g., the internet). In the IaaS model, a cloud computing provider can host infrastructure elements (e.g., servers, storage, network nodes (e.g., hardware), deployment software, platform virtualization (e.g., hypervisor layer)). In some cases, the IaaS provider can provide various services associated with the infrastructure elements (e.g., billing, monitoring, logging, security, load balancing, and clustering). Therefore, since these services can be policy-driven, IaaS users can implement policies to drive load balancing in order to maintain application availability and performance.
[0240] In some cases, IaaS customers can access resources and services over a wide area network (WAN), such as the internet, and install the rest of their application stack using the cloud provider's services. For example, a user can log into an IaaS platform, create virtual machines (VMs), install an operating system (OS) on each VM, deploy middleware such as databases, create storage buckets for workloads and backups, and install enterprise software on the VMs. Customers can use the provider's services to perform a variety of functions, including balancing network traffic, troubleshooting applications, monitoring performance, and managing disaster recovery.
[0241] In most cases, the cloud computing model requires the participation of a cloud provider. This cloud provider may or may not be a third-party service specializing in IaaS provision (e.g., offering, renting, or selling). Alternatively, a company can become a provider of private clouds and infrastructure services.
[0242] In some cases, IaaS deployment is the process of deploying a new application or a new version of an application to a pre-configured application server. IaaS deployment may also include the process of preparing the server (e.g., installing libraries, daemons, etc.). IaaS deployment is often managed by the cloud provider below the hypervisor layer (e.g., servers, storage, network hardware, and virtualization). Therefore, customers can handle deployment of the OS, middleware, and / or applications (e.g., on self-service virtual machines, which can be spun up on demand).
[0243] In some cases, IaaS provisioning may include acquiring the computers or virtual hosts to be used and installing the necessary libraries or services on those computers or virtual hosts. In most cases, deployment does not include provisioning, and provisioning must be performed first.
[0244] In some cases, IaaS provisioning presents two distinct challenges. First, there's the challenge of provisioning an initial set of infrastructure before doing anything. Second, there's the challenge of evolving existing infrastructure after everything has been provisioned (e.g., adding new services, modifying services, removing services). In some cases, these two challenges can be addressed by enabling the declarative definition of infrastructure configuration. In other words, the infrastructure (e.g., what elements are needed and how these elements interact) may be defined by one or more configuration files. Thus, the overall topology of the infrastructure (e.g., which resources depend on which and how they work together) can be described declaratively. In some examples, once the topology is defined, workflows can be generated to create and / or manage the different elements described in the configuration files.
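As a non-limiting illustration of deriving a workflow from a declaratively described topology, the following sketch orders resources so that each is created after the resources it depends on. The resource names and the `provisioning_order` function are hypothetical; the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) is used here only as one possible way to compute such an order.

```python
from graphlib import TopologicalSorter

# Hypothetical declarative description: resource -> resources it depends on.
declared = {
    "vcn": [],
    "subnet": ["vcn"],
    "security_rules": ["vcn"],
    "vm": ["subnet", "security_rules"],
    "load_balancer": ["subnet", "vm"],
}


def provisioning_order(resources):
    """Return one valid creation order in which dependencies come first."""
    return list(TopologicalSorter(resources).static_order())


if __name__ == "__main__":
    for step, name in enumerate(provisioning_order(declared), start=1):
        print(step, name)
```

Several orders can be valid for the same description, and a circular dependency causes `static_order()` to raise `graphlib.CycleError`, which flags a description that cannot be provisioned as written.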
[0245] In some examples, infrastructure can include many interconnected elements. For example, there may be one or more virtual private clouds (VPCs), also known as core networks (e.g., configurable compute resources and / or potential on-demand pools of shared compute resources). In some examples, there may be one or more security group rules provisioned to define how network security is configured, and one or more virtual machines (VMs). Other infrastructure elements such as load balancers and databases may also be provisioned. As more and more infrastructure elements are desired and / or added, infrastructure can evolve incrementally.
[0246] In some examples, sequential deployment techniques may be employed to enable the deployment of infrastructure code across various virtual computing environments. Furthermore, the techniques described can enable infrastructure management within these environments. In some examples, a service team may write code that is intended to be deployed to one or more different production environments, typically many different geographical locations, sometimes across the globe. However, in some examples, the infrastructure for deploying the code must first be configured. In some examples, provisioning can be done manually, resources can be provisioned using provisioning tools, and / or the code can be deployed using deployment tools after the infrastructure has been provisioned.
[0247] Figure 19 is a block diagram 1900 illustrating an exemplary pattern of an IaaS architecture according to at least one embodiment. The service operator 1902 may be communicably coupled to a secure host tenancy 1904 which may include a virtual cloud network (VCN) 1906 and a secure host subnet 1908. In some examples, the service operator 1902 may use one or more client computing devices. The one or more client computing devices may be handheld mobile devices (e.g., iPhone®, mobile phones, iPad®, tablets, personal digital assistants (PDAs)) or wearable devices (e.g., Google® Glass® head-mounted displays) that run software such as Microsoft Windows Mobile® and / or various mobile operating systems such as iOS, Windows Phone, Android®, BlackBerry 8, and Palm OS, and that have Internet, email, short message service (SMS), BlackBerry®, or other communication protocols enabled. The client computing devices may also be general-purpose personal computers, including, by way of example, personal computers and / or laptop computers, which run various versions of the Microsoft Windows® operating system, Apple Macintosh® operating system, and / or Linux® operating system. Alternatively, the client computing devices may be workstation computers running various commercially available UNIX® or UNIX-like operating systems, including, but not limited to, various GNU / Linux operating systems and, for example, Google Chrome® OS. Alternatively or additionally, the client computing device may be another electronic device that can communicate via a network that has access to the VCN 1906 and / or the Internet, such as a thin client computer, an Internet-enabled game system (e.g., a Microsoft Xbox® game console with or without a Kinect® gesture input device), and / or a personal messaging device.
[0248] The VCN 1906 may include a local peering gateway (LPG) 1910 that can be communicatively coupled to a Secure Shell (SSH) VCN 1912 via an LPG 1910 contained within the SSH VCN 1912. The SSH VCN 1912 may include an SSH subnet 1914, and the SSH VCN 1912 can be communicatively coupled to a control plane VCN 1916 via an LPG 1910 contained within the control plane VCN 1916. Furthermore, the SSH VCN 1912 can be communicatively coupled to a data plane VCN 1918 via an LPG 1910. The control plane VCN 1916 and the data plane VCN 1918 may be contained within a service tenancy 1919, which may be owned and / or operated by an IaaS provider.
[0249] The control plane VCN 1916 may include a control plane DMZ (demilitarized zone) layer 1920 that functions as a perimeter network (e.g., the portion of a corporate network between the corporate intranet and external networks). DMZ-based servers may have limited responsibilities and can help keep security breaches contained. Furthermore, the control plane DMZ layer 1920 may include one or more load balancer (LB) subnets 1922, and the control plane VCN 1916 may include a control plane application layer 1924 that may include application subnets 1926 and a control plane data layer 1928 that may include database (DB) subnets 1930 (e.g., a front-end DB subnet and / or a back-end DB subnet). The LB subnet 1922 included in the control plane DMZ layer 1920 may be communicatively coupled to the application subnet 1926 included in the control plane application layer 1924 and to an internet gateway 1934 which may be included in the control plane VCN 1916. The application subnet 1926 may be communicatively coupled to the DB subnet 1930 included in the control plane data layer 1928, to a service gateway 1936, and to a network address translation (NAT) gateway 1938. The control plane VCN 1916 may include the service gateway 1936 and the NAT gateway 1938.
[0250] The control plane VCN 1916 may include a data plane mirror application layer 1940, which may include an application subnet 1926. The application subnet 1926 included in the data plane mirror application layer 1940 may include a virtual network interface controller (VNIC) 1942 on which compute instance 1944 can run. Compute instance 1944 can communicatively connect the application subnet 1926 of the data plane mirror application layer 1940 to an application subnet 1926 that may be included in the data plane application layer 1946.
[0251] The data plane VCN1918 may include a data plane application layer 1946, a data plane DMZ layer 1948, and a data plane data layer 1950. The data plane DMZ layer 1948 may include an LB subnet 1922 that can be communicatively coupled to the application subnet 1926 of the data plane application layer 1946 and the internet gateway 1934 of the data plane VCN1918. The application subnet 1926 may be communicatively coupled to the service gateway 1936 of the data plane VCN1918 and the NAT gateway 1938 of the data plane VCN1918. The data plane data layer 1950 may also include a DB subnet 1930 that can be communicatively coupled to the application subnet 1926 of the data plane application layer 1946.
[0252] The Internet gateway 1934 of the control plane VCN1916 and the Internet gateway 1934 of the data plane VCN1918 may be communicatively coupled to a metadata management service 1952 which can be communicatively coupled to the public internet 1954. The public internet 1954 may be communicatively coupled to the NAT gateway 1938 of the control plane VCN1916 and the NAT gateway 1938 of the data plane VCN1918. The service gateway 1936 of the control plane VCN1916 and the service gateway 1936 of the data plane VCN1918 may be communicatively coupled to a cloud service 1956.
[0253] In some cases, a service gateway 1936 of the control plane VCN 1916 or data plane VCN 1918 can make application programming interface (API) calls to a cloud service 1956 without going through the public internet 1954. API calls from the service gateway 1936 to the cloud service 1956 can be one-way: the service gateway 1936 can make API calls to the cloud service 1956, and the cloud service 1956 can send the requested data to the service gateway 1936. However, the cloud service 1956 may not initiate an API call to the service gateway 1936.
[0254] In some examples, secure host tenancy 1904 may be directly connected to a potentially isolated service tenancy 1919. Secure host subnet 1908 can communicate with SSH subnet 1914 via LPG 1910, which enables bidirectional communication with isolated systems. By connecting secure host subnet 1908 to SSH subnet 1914, secure host subnet 1908 can access other entities within service tenancy 1919.
[0255] The control plane VCN1916 allows users of service tenancy 1919 to configure or provision desired resources. Desired resources provisioned in the control plane VCN1916 may be deployed or used in the data plane VCN1918. In some examples, the control plane VCN1916 may be isolated from the data plane VCN1918, and the data plane mirror application layer 1940 of the control plane VCN1916 can communicate with the data plane application layer 1946 of the data plane VCN1918 via a VNIC 1942 which may be included in the data plane mirror application layer 1940 and the data plane application layer 1946.
[0256] In some examples, a system user or customer may make requests, such as create, read, update, or delete (CRUD) operations, via the public internet 1954, which can communicate the requests to the metadata management service 1952. The metadata management service 1952 can communicate the requests to the control plane VCN 1916 via the internet gateway 1934. The requests may also be received by an LB subnet 1922 included in the control plane DMZ layer 1920. The LB subnet 1922 may determine that the request is valid, and in response to this determination, the LB subnet 1922 may send the request to an application subnet 1926 included in the control plane application layer 1924. If the request is validated and requires a call to the public internet 1954, the call can be sent to a NAT gateway 1938, which can make the call to the public internet 1954. Data that the request requires to be stored may be stored in the DB subnet 1930.
[0257] In some cases, the data plane mirror application layer 1940 can facilitate direct communication between the control plane VCN1916 and the data plane VCN1918. For example, it may be desirable that changes, updates, or other appropriate modifications to the configuration be applied to the resources contained in the data plane VCN1918. Since the control plane VCN1916 can communicate directly with the resources contained in the data plane VCN1918 via VNIC1942, it can perform changes, updates, or other appropriate modifications to the configuration.
[0258] In some embodiments, the control plane VCN1916 and the data plane VCN1918 may be included in the service tenancy 1919. In this case, the system user or customer does not have to own or operate either the control plane VCN1916 or the data plane VCN1918. Instead, the IaaS provider may own or operate the control plane VCN1916 and the data plane VCN1918, and both may be included in the service tenancy 1919. This embodiment can prevent a user or customer from interacting with other users' resources or other customers' resources by enabling network isolation. This embodiment can also enable a system user or customer to store databases privately without having to rely on the public internet 1954, which may not have the desired level of security for storage.
[0259] In another embodiment, the LB subnet 1922 included in the control plane VCN 1916 may be configured to receive signals from the service gateway 1936. In this embodiment, the control plane VCN 1916 and the data plane VCN 1918 may be configured to be invoked by the IaaS provider's customers without calling the public internet 1954. The IaaS provider's customers may prefer this embodiment because the database used by the customer may be stored in a service tenancy 1919 that is controlled by the IaaS provider and can be isolated from the public internet 1954.
[0260] Figure 20 is a block diagram 2000 showing another exemplary pattern of an IaaS architecture according to at least one embodiment. A service operator 2002 (e.g., service operator 1902 in Figure 19) may be communicatively coupled to a secure host tenancy 2004 (e.g., secure host tenancy 1904 in Figure 19), which may include a virtual cloud network (VCN) 2006 (e.g., VCN 1906 in Figure 19) and a secure host subnet 2008 (e.g., secure host subnet 1908 in Figure 19). The VCN 2006 may include a local peering gateway (LPG) 2010 (e.g., LPG 1910 in Figure 19), which may be communicatively coupled to a secure shell (SSH) VCN 2012 (e.g., SSH VCN 1912 in Figure 19) via an LPG 2010 contained in the SSH VCN 2012. The SSH VCN 2012 may include an SSH subnet 2014 (e.g., SSH subnet 1914 in Figure 19), and the SSH VCN 2012 may be communicatively coupled to a control plane VCN 2016 (e.g., control plane VCN 1916 in Figure 19) via an LPG 2010 included in the control plane VCN 2016. The control plane VCN 2016 may be included in a service tenancy 2019 (e.g., service tenancy 1919 in Figure 19), and a data plane VCN 2018 (e.g., data plane VCN 1918 in Figure 19) may be included in a customer tenancy 2021, which may be owned or operated by a user or customer of the system.
[0261] The control plane VCN 2016 may include a control plane DMZ layer 2020 (e.g., control plane DMZ layer 1920 in Figure 19) which may include an LB subnet 2022 (e.g., LB subnet 1922 in Figure 19), a control plane application layer 2024 (e.g., control plane application layer 1924 in Figure 19) which may include an application subnet 2026 (e.g., application subnet 1926 in Figure 19), and a control plane data layer 2028 (e.g., control plane data layer 1928 in Figure 19) which may include a database (DB) subnet 2030 (e.g., similar to DB subnet 1930 in Figure 19). The LB subnet 2022 included in the control plane DMZ layer 2020 may be communicatively coupled to the application subnet 2026 included in the control plane application layer 2024 and to an internet gateway 2034 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN 2016. The application subnet 2026 may be communicatively coupled to the DB subnet 2030 included in the control plane data layer 2028, to a service gateway 2036 (e.g., service gateway 1936 in Figure 19), and to a network address translation (NAT) gateway 2038 (e.g., NAT gateway 1938 in Figure 19). The control plane VCN 2016 may include the service gateway 2036 and the NAT gateway 2038.
[0262] The control plane VCN2016 may include a data plane mirror application layer 2040 (e.g., data plane mirror application layer 1940 in Figure 19) which may include an application subnet 2026. The application subnet 2026 included in the data plane mirror application layer 2040 may include a virtual network interface controller (VNIC) 2042 (e.g., VNIC1942) which may run a compute instance 2044 (e.g., similar to compute instance 1944 in Figure 19). The compute instance 2044 can facilitate communication between the application subnet 2026 of the data plane mirror application layer 2040 and the application subnet 2026 that may be included in the data plane application layer 2046 (e.g., data plane application layer 1946 in Figure 19) via the VNIC2042 included in the data plane mirror application layer 2040 and the VNIC2042 included in the data plane application layer 2046.
[0263] The Internet gateway 2034 included in the control plane VCN2016 may be communicably coupled to a metadata management service 2052 (e.g., metadata management service 1952 in Figure 19), which can be communicably coupled to the public internet 2054 (e.g., public internet 1954 in Figure 19). The public internet 2054 may be communicably coupled to a NAT gateway 2038 included in the control plane VCN2016. The service gateway 2036 included in the control plane VCN2016 may be communicably coupled to a cloud service 2056 (e.g., cloud service 1956 in Figure 19).
[0264] In some examples, the data plane VCN2018 may be included in customer tenancy 2021. In this case, the IaaS provider can provide a control plane VCN2016 per customer, and the IaaS provider can configure a unique compute instance 2044 for each customer, included in service tenancy 2019. Each compute instance 2044 can allow communication between the control plane VCN2016 included in service tenancy 2019 and the data plane VCN2018 included in customer tenancy 2021. The compute instance 2044 can allow resources provisioned in the control plane VCN2016 included in service tenancy 2019 to be deployed or used in the data plane VCN2018 included in customer tenancy 2021.
[0265] In another example, an IaaS provider's customer may have a database residing in customer tenancy 2021. In this example, the control plane VCN 2016 may include a data plane mirror application layer 2040 that can include an application subnet 2026. The data plane mirror application layer 2040 may reside in the data plane VCN 2018, but does not have to. That is, the data plane mirror application layer 2040 can access customer tenancy 2021, but does not have to reside in the data plane VCN 2018 and does not have to be owned or operated by the IaaS provider's customer. The data plane mirror application layer 2040 may be configured to make calls to the data plane VCN 2018, but does not have to be configured to make calls to any entity contained in the control plane VCN 2016. Customers may wish to deploy or use, within the data plane VCN 2018, resources that are provisioned in the control plane VCN 2016, and the data plane mirror application layer 2040 can facilitate the customer's desired deployment or other use of those resources.
[0266] In some embodiments, a customer of the IaaS provider can apply filters to the data plane VCN2018. In this embodiment, the customer can determine what the data plane VCN2018 can access and can restrict access from the data plane VCN2018 to the public internet 2054. The IaaS provider may not be able to apply filters or control access from the data plane VCN2018 to any external network or database. Applying filters and controls to the data plane VCN2018 included in the customer tenancy 2021 can help isolate the data plane VCN2018 from other customers and the public internet 2054.
[0267] In some embodiments, cloud service 2056 can be invoked by service gateway 2036 to access services that may not reside on the public internet 2054, on control plane VCN2016, or on data plane VCN2018. The connection between cloud service 2056 and control plane VCN2016 or data plane VCN2018 does not have to be live or continuous. Cloud service 2056 may reside on a separate network owned or operated by the IaaS provider. Cloud service 2056 may be configured to receive calls from service gateway 2036 and not to receive calls from the public internet 2054. Some cloud services 2056 may be isolated from other cloud services 2056, and control plane VCN2016 may be isolated from cloud services 2056 that may not be located in the same region as control plane VCN2016. For example, control plane VCN2016 may be located in "Region 1", and cloud service "Deployment 19" may be located in "Region 1" and "Region 2". If a call to deployment 19 is made by a service gateway 2036 included in the control plane VCN2016 located in region 1, this call may be sent to deployment 19 in region 1. In this example, the control plane VCN2016 or deployment 19 in region 1 does not need to be communicably coupled with deployment 19 in region 2.
[0268] Figure 21 is a block diagram 2100 showing another exemplary pattern of an IaaS architecture according to at least one embodiment. A service operator 2102 (e.g., service operator 1902 in Figure 19) may be communicatively coupled to a secure host tenancy 2104 (e.g., secure host tenancy 1904 in Figure 19), which may include a virtual cloud network (VCN) 2106 (e.g., VCN 1906 in Figure 19) and a secure host subnet 2108 (e.g., secure host subnet 1908 in Figure 19). The VCN 2106 may include an LPG 2110 (e.g., LPG 1910 in Figure 19), which may be communicatively coupled to an SSH VCN 2112 (e.g., SSH VCN 1912 in Figure 19) via an LPG 2110 contained in the SSH VCN 2112. The SSH VCN 2112 may include an SSH subnet 2114 (e.g., SSH subnet 1914 in Figure 19), and the SSH VCN 2112 may be communicatively coupled to a control plane VCN 2116 (e.g., control plane VCN 1916 in Figure 19) via an LPG 2110 included in the control plane VCN 2116, and may be communicatively coupled to a data plane VCN 2118 (e.g., data plane VCN 1918 in Figure 19) via an LPG 2110 included in the data plane VCN 2118. The control plane VCN 2116 and the data plane VCN 2118 may be included in a service tenancy 2119 (e.g., service tenancy 1919 in Figure 19).
[0269] The control plane VCN2116 may include a control plane DMZ layer 2120 (e.g., control plane DMZ layer 1920 in Figure 19) which may include a load balancer (LB) subnet 2122 (e.g., LB subnet 1922 in Figure 19), a control plane application layer 2124 (e.g., control plane application layer 1924 in Figure 19) which may include an application subnet 2126 (e.g., similar to application subnet 1926 in Figure 19), and a control plane data layer 2128 (e.g., control plane data layer 1928 in Figure 19) which may include a DB subnet 2130. The LB subnet 2122 included in the control plane DMZ layer 2120 may be communicably coupled to the application subnet 2126 included in the control plane application layer 2124 and to an internet gateway 2134 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN2116. The application subnet 2126 may be communicatively coupled to the DB subnet 2130 included in the control plane data layer 2128, and to the service gateway 2136 (e.g., the service gateway in Figure 19) and the network address translation (NAT) gateway 2138 (e.g., the NAT gateway 1938 in Figure 19). The control plane VCN 2116 may include the service gateway 2136 and the NAT gateway 2138.
[0270] The data plane VCN 2118 can include a data plane application layer 2146 (e.g., the data plane application layer 1946 in FIG. 19), a data plane DMZ layer 2148 (e.g., the data plane DMZ layer 1948 in FIG. 19), and a data plane data layer 2150 (e.g., the data plane data layer 1950 in FIG. 19). The data plane DMZ layer 2148 can include an LB subnet 2122 communicatively coupled to a trusted app subnet 2160 and an untrusted app subnet 2162 of the data plane application layer 2146 and the Internet gateway 2134 included in the data plane VCN 2118. The trusted app subnet 2160 may be communicatively coupled to a service gateway 2136 included in the data plane VCN 2118, a NAT gateway 2138 included in the data plane VCN 2118, and a DB subnet 2130 included in the data plane data layer 2150. The untrusted app subnet 2162 may be communicatively coupled to a service gateway 2136 included in the data plane VCN 2118 and a DB subnet 2130 included in the data plane data layer 2150. The data plane data layer 2150 can include a DB subnet 2130 communicatively coupled to a service gateway 2136 included in the data plane VCN 2118.
[0271] The untrusted application subnet 2162 can include one or more primary VNICs 2164(1)-(N) communicatively coupled to tenant virtual machines (VMs) 2166(1)-(N). Each tenant VM 2166(1)-(N) may be communicatively coupled to respective application subnets 2167(1)-(N) that may be included in respective container transit VCNs 2168(1)-(N) that may be included in respective customer tenancies 2170(1)-(N). Each secondary VNIC 2172(1)-(N) can facilitate communication between the untrusted application subnet 2162 included in the data plane VCN 2118 and the application subnets included in the container transit VCNs 2168(1)-(N). Each container transit VCN 2168(1)-(N) can include a NAT gateway 2138 communicatively coupled to the public Internet 2154 (e.g., the public Internet 1954 of FIG. 19).
[0272] The Internet gateway 2134 included in the control plane VCN 2116 and the Internet gateway 2134 included in the data plane VCN 2118 may be communicatively coupled to a metadata management service 2152 (e.g., the metadata management system 1952 of FIG. 19) communicatively coupled to the public Internet 2154. The public Internet 2154 may be communicatively coupled to the NAT gateway 2138 included in the control plane VCN 2116 and the NAT gateway 2138 included in the data plane VCN 2118. The service gateway 2136 included in the control plane VCN 2116 and the service gateway 2136 included in the data plane VCN 2118 may be communicatively coupled to cloud services 2156.
[0273] In some embodiments, the data plane VCN2118 may be integrated with the customer tenancy 2170. This integration may be useful or desirable for the IaaS provider's customer in some cases, such as when they may want support when executing code. The customer may provide code that, when executed, could be destructive, could communicate with other customer resources, or could cause undesirable effects. Thus, the IaaS provider can determine whether or not to execute the code that the customer has provided to the IaaS provider.
[0274] In some examples, an IaaS provider's customer may grant the IaaS provider temporary network access and request functionality to be added to the data plane application layer 2146. The code for executing the functionality may run on VMs 2166(1)-(N), and may not be configured to run elsewhere on the data plane VCN 2118. Each VM 2166(1)-(N) may be connected to one customer tenancy 2170. Each container 2171(1)-(N) contained within VMs 2166(1)-(N) may be configured to run the code. In this case, a double isolation may exist (for example, containers 2171(1)-(N) run the code, and containers 2171(1)-(N) may be contained within VMs 2166(1)-(N) that are in at least an untrusted application subnet 2162), which can help prevent erroneous or undesirable code from damaging the IaaS provider's network or the networks of different customers. Containers 2171(1)-(N) may be communicatively coupled to customer tenancy 2170 and may be configured to send data to or receive data from customer tenancy 2170. Containers 2171(1)-(N) do not have to be configured to send data to or receive data from any other entities in the data plane VCN 2118. Once code execution is complete, the IaaS provider may kill or discard containers 2171(1)-(N).
[0275] In some embodiments, a trusted application subnet 2160 may execute code that may be owned or operated by the IaaS provider. In this embodiment, the trusted application subnet 2160 may be communicatively coupled to a DB subnet 2130 and configured to perform CRUD operations in the DB subnet 2130. An untrusted application subnet 2162 may be communicatively coupled to a DB subnet 2130, but in this embodiment, the untrusted application subnet 2162 may be configured to perform read operations within the DB subnet 2130. Containers 2171(1)-(N) contained in each customer's VM 2166(1)-(N) and capable of executing code from the customer do not need to be communicatively coupled to the DB subnet 2130.
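The access levels described in this embodiment can be summarized as a small policy table. The sketch below is illustrative only: the `DB_SUBNET_POLICY` mapping, its keys, and the `is_allowed` helper are hypothetical names, and treating the containers as having no DB access is an assumption drawn from the statement that they need not be coupled to the DB subnet.

```python
# The four CRUD operations, named here for illustration.
CRUD = frozenset({"create", "read", "update", "delete"})

# Hypothetical policy: which operations each source may perform in the DB subnet.
DB_SUBNET_POLICY = {
    "trusted_app_subnet": CRUD,                   # provider-owned code: full CRUD
    "untrusted_app_subnet": frozenset({"read"}),  # customer-facing: read only
    "customer_container": frozenset(),            # assumed: not coupled to the DB subnet
}


def is_allowed(source: str, operation: str) -> bool:
    """Return True if `source` may perform `operation` in the DB subnet.

    Unknown sources are denied by default.
    """
    return operation in DB_SUBNET_POLICY.get(source, frozenset())
```

A default-deny lookup keeps the sketch consistent with the isolation goal: any source not listed in the table has no access at all.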
[0276] In other embodiments, the control plane VCN2116 and the data plane VCN2118 do not have to be directly coupled in a communicative manner. In this embodiment, direct communication between the control plane VCN2116 and the data plane VCN2118 may not exist. However, indirect communication by at least one method may exist. An LPG2110 that facilitates communication between the control plane VCN2116 and the data plane VCN2118 may be established by the IaaS provider. In another example, the control plane VCN2116 or the data plane VCN2118 can make a call to the cloud service 2156 via the service gateway 2136. For example, a call from the control plane VCN2116 to the cloud service 2156 may include a request for a service that can communicate with the data plane VCN2118.
[0277] Figure 22 is a block diagram 2200 showing another exemplary pattern of an IaaS architecture according to at least one embodiment. A service operator 2202 (e.g., service operator 1902 in Figure 19) may be communicatively coupled to a secure host tenancy 2204 (e.g., secure host tenancy 1904 in Figure 19), which may include a virtual cloud network (VCN) 2206 (e.g., VCN 1906 in Figure 19) and a secure host subnet 2208 (e.g., secure host subnet 1908 in Figure 19). VCN 2206 may include an LPG 2210 (e.g., LPG 1910 in Figure 19), which may be communicatively coupled to an SSH VCN 2212 (e.g., SSH VCN 1912 in Figure 19) via an LPG 2210 contained in the SSH VCN 2212. SSH VCN 2212 may include SSH subnet 2214 (e.g., SSH subnet 1914 in Figure 19), and SSH VCN 2212 may be communicatively coupled to control plane VCN 2216 (e.g., control plane VCN 1916 in Figure 19) via LPG 2210 included in control plane VCN 2216, and may be communicatively coupled to data plane VCN 2218 (e.g., data plane VCN 1918 in Figure 19) via LPG 2210 included in data plane VCN 2218. Control plane VCN 2216 and data plane VCN 2218 may be included in service tenancy 2219 (e.g., service tenancy 1919 in Figure 19).
[0278] The control plane VCN 2216 may include a control plane DMZ layer 2220 (e.g., control plane DMZ layer 1920 in Figure 19) which may include an LB subnet 2222 (e.g., LB subnet 1922 in Figure 19), a control plane application layer 2224 (e.g., control plane application layer 1924 in Figure 19) which may include an application subnet 2226 (e.g., application subnet 1926 in Figure 19), and a control plane data layer 2228 (e.g., control plane data layer 1928 in Figure 19) which may include a DB subnet 2230 (e.g., DB subnet 2130 in Figure 21). The LB subnet 2222 included in the control plane DMZ layer 2220 may be communicatively coupled to the application subnet 2226 included in the control plane application layer 2224 and to an internet gateway 2234 (e.g., internet gateway 1934 in Figure 19) which may be included in the control plane VCN 2216. The application subnet 2226 may be communicatively coupled to the DB subnet 2230 included in the control plane data layer 2228, and to the service gateway 2236 (e.g., the service gateway in Figure 19) and the network address translation (NAT) gateway 2238 (e.g., the NAT gateway 1938 in Figure 19). The control plane VCN 2216 may include the service gateway 2236 and the NAT gateway 2238.
[0279] The data plane VCN 2218 may include a data plane application layer 2246 (e.g., data plane application layer 1946 in Figure 19), a data plane DMZ layer 2248 (e.g., data plane DMZ layer 1948 in Figure 19), and a data plane data layer 2250 (e.g., data plane data layer 1950 in Figure 19). The data plane DMZ layer 2248 may include an LB subnet 2222 that can be communicatively coupled to the trusted application subnet 2260 (e.g., trusted application subnet 2160 in Figure 21) and the untrusted application subnet 2262 (e.g., untrusted application subnet 2162 in Figure 21) of the data plane application layer 2246, and to the internet gateway 2234 included in the data plane VCN 2218. The trusted application subnet 2260 may be communicatively coupled to a service gateway 2236 included in data plane VCN 2218, a NAT gateway 2238 included in data plane VCN 2218, and a DB subnet 2230 included in data plane data layer 2250. The untrusted application subnet 2262 may be communicatively coupled to a service gateway 2236 included in data plane VCN 2218 and a DB subnet 2230 included in data plane data layer 2250. The data plane data layer 2250 may include a DB subnet 2230 that can be communicatively coupled to a service gateway 2236 included in data plane VCN 2218.
[0280] An untrusted application subnet 2262 may include primary VNICs 2264(1)-(N) that can be communicatively coupled to tenant virtual machines (VMs) 2266(1)-(N) residing in the untrusted application subnet 2262. Each tenant VM 2266(1)-(N) may execute code in its respective container 2267(1)-(N) and may be communicatively coupled to an application subnet 2226 that may be included in a data plane application layer 2246 that may be included in a container-transmitting VCN 2268. Each secondary VNIC 2272(1)-(N) can facilitate communication between the untrusted application subnet 2262 included in the data plane VCN 2218 and the application subnet included in the container-transmitting VCN 2268. The container-transmitting VCN 2268 may include a NAT gateway 2238 that can be communicatively coupled to the public internet 2254 (e.g., public internet 1954 in Figure 19).
[0281] The Internet gateway 2234 included in the control plane VCN 2216 and the Internet gateway 2234 included in the data plane VCN 2218 may be communicatively coupled to a metadata management service 2252 (e.g., the metadata management system 1952 in Figure 19), which can be communicatively coupled to the public internet 2254. The public internet 2254 may be communicatively coupled to the NAT gateway 2238 included in the control plane VCN 2216 and the NAT gateway 2238 included in the data plane VCN 2218. The service gateway 2236 included in the control plane VCN 2216 and the service gateway 2236 included in the data plane VCN 2218 may be communicatively coupled to a cloud service 2256.
[0282] In some examples, the pattern shown by the architecture in block diagram 2200 of Figure 22 can be considered an exception to the pattern shown by the architecture in block diagram 2100 of Figure 21, and may be desirable for the IaaS provider's customers when the IaaS provider cannot communicate directly with the customers (e.g., in a disconnected area). Customers can access, in real time, each container 2267(1)-(N) contained within each customer's VM 2266(1)-(N). Containers 2267(1)-(N) may be configured to call each secondary VNIC 2272(1)-(N) contained within the application subnet 2226 of the data plane application layer 2246, which may be contained within the container-transmitting VCN 2268. Secondary VNICs 2272(1)-(N) can send calls to a NAT gateway 2238, which can send the calls to the public internet 2254. In this example, the containers 2267(1)-(N) that customers can access in real time may be isolated from the control plane VCN 2216 and from other entities included in the data plane VCN 2218. Furthermore, the containers 2267(1)-(N) may be isolated from other customers' resources.
[0283] In another example, a customer can use containers 2267(1)-(N) to invoke cloud service 2256. In this example, the customer can execute code in containers 2267(1)-(N) to request a service from cloud service 2256. Containers 2267(1)-(N) can then send this request to secondary VNICs 2272(1)-(N), which can send the request to a NAT gateway that can send the request to the public internet 2254. The public internet 2254 can then send this request to LB subnet 2222, which is included in control plane VCN 2216, via internet gateway 2234. In response to determining that the request is valid, the LB subnet can send this request to application subnet 2226, which can then send this request to cloud service 2256 via service gateway 2236.
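The sequence of hops in this example can be traced with a short sketch. It is illustrative only: `REQUEST_PATH`, `forward`, and the `is_valid` callback are hypothetical names, and stopping the request at the LB subnet when validation fails is an assumption, since the text describes only the valid case.

```python
from typing import Callable, List

# Hop sequence taken from the paragraph above; the labels are illustrative.
REQUEST_PATH = [
    "container 2267",
    "secondary VNIC 2272",
    "NAT gateway 2238",
    "public internet 2254",
    "internet gateway 2234",
    "LB subnet 2222",
    "application subnet 2226",
    "service gateway 2236",
    "cloud service 2256",
]


def forward(request: dict, is_valid: Callable[[dict], bool]) -> List[str]:
    """Return the hops a request traverses on its way to the cloud service.

    Assumption: the LB subnet is the validation point, and an invalid
    request is not forwarded beyond it.
    """
    hops = []
    for hop in REQUEST_PATH:
        hops.append(hop)
        if hop == "LB subnet 2222" and not is_valid(request):
            break
    return hops
```

Calling `forward(request, is_valid)` with a validator that accepts the request returns all nine hops, ending at the cloud service; a rejecting validator returns the path only up to the LB subnet.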
[0284] Note that the illustrated IaaS architectures 1900, 2000, 2100, and 2200 may include elements other than those shown. Furthermore, the illustrated embodiments are only examples of some cloud infrastructure systems that may incorporate embodiments of this disclosure. In some other embodiments, the IaaS system may have more or fewer elements than those shown, may combine two or more elements, or may have different configurations or arrangements of elements.
[0285] In certain embodiments, the IaaS systems described herein may include a suite of applications, middleware, and database services delivered to customers in a self-service, subscription-based, flexibly scalable, reliable, highly available, and secure manner. An example of such an IaaS system is Oracle® Cloud Infrastructure (OCI), provided by the applicant.
[0286] Figure 23 shows an exemplary computer system 2300 in which various embodiments may be implemented. System 2300 may be used to implement any of the computer systems described above. As shown, computer system 2300 includes a processing unit 2304 that communicates with a number of peripheral subsystems via a bus subsystem 2302. These peripheral subsystems may include a processing acceleration unit 2306, an I / O subsystem 2308, a storage subsystem 2318, and a communication subsystem 2324. The storage subsystem 2318 includes a tangible computer-readable storage medium 2322 and system memory 2310.
[0287] The bus subsystem 2302 provides a mechanism for various components and subsystems of the computer system 2300 to communicate with each other as intended. Although the bus subsystem 2302 is schematically shown as a single bus, alternative embodiments of the bus subsystem may utilize multiple buses. The bus subsystem 2302 may be one of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of the various bus architectures. For example, such architectures may include the Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, which can be implemented as a mezzanine bus manufactured in accordance with the IEEE P1386.1 standard.
[0288] A processing unit 2304, which can be implemented as one or more integrated circuits (e.g., conventional microprocessors or microcontrollers), controls the operation of the computer system 2300. The processing unit 2304 may include one or more processors. These processors may include single-core or multi-core processors. In some embodiments, the processing unit 2304 may be implemented as one or more independent processing units 2332 and / or 2334, each containing a single-core or multi-core processor. In other embodiments, the processing unit 2304 may be implemented as a quad-core processing unit formed by integrating two dual-core processors onto a single chip.
[0289] In various embodiments, the processing unit 2304 can execute various programs in response to program code and can maintain multiple programs or processes running simultaneously. At any given time, some or all of the program code being executed can reside in the processing unit 2304 and / or the storage subsystem 2318. The processing unit 2304 can provide the various functionalities described above through appropriate programming. The computer system 2300 may further include a processing acceleration unit 2306, which may include a digital signal processor (DSP), a dedicated processor, and / or the like.
[0290] The I / O subsystem 2308 may include a user interface input device and a user interface output device. The user interface input device may include a keyboard, a pointing device such as a mouse or trackball, a touchpad or touchscreen integrated into a display, a scroll wheel, a click wheel, a dial, a button, a switch, a keypad, a voice input device with a voice command recognition system, a microphone, and other types of input devices. The user interface input device may also include a motion detection and / or gesture recognition device, such as a Microsoft Kinect® motion sensor. The Microsoft Kinect® motion sensor enables users to control and interact with input devices such as a Microsoft Xbox® 360 game controller via a natural user interface (NUI) that utilizes gestures and voice commands. The user interface input device may also include an eye gesture recognition device, such as a Google Glass® blink detector. The Google Glass® blink detector detects the user's eye activity (e.g., blinking when taking a picture and / or selecting a menu) and converts the eye activity into input to an input device (e.g., Google Glass®). Furthermore, the user interface input device may include a voice recognition detection device that enables interaction between the user and a voice recognition system (e.g., Siri® Navigator) via voice commands.
[0291] Furthermore, user interface input devices include, but are not limited to, three-dimensional (3D) mice, joysticks or pointing sticks, gamepads, graphic tablets, audio / visual devices such as speakers, digital cameras, digital video cameras, portable media players, webcams, image scanners, fingerprint scanners, barcode readers, 3D scanners, 3D printers, laser rangefinders, and eye-tracking devices. In addition, user interface input devices may include medical image input devices such as computed tomography scanners, magnetic resonance imaging scanners, ultrasound imaging scanners, or medical ultrasound devices. Furthermore, user interface input devices may include audio input devices such as MIDI keyboards and electronic musical instruments.
[0292] The user interface output device may include a display subsystem, indicator lights, or non-visual displays such as audio output devices. The display subsystem may be, for example, a cathode ray tube (CRT), a flat-panel device such as one using a liquid crystal display (LCD) or plasma display, a projection device, or a touchscreen. Generally, when the term “output device” is used, it is intended to include all possible types of devices and mechanisms for outputting information from the computer system 2300 to a user or another computer. For example, the user interface output device includes, but is not limited to, various display devices that visually convey text, images, and audio / video information, such as monitors, printers, speakers, headphones, car navigation systems, plotters, audio output devices, and modems.
[0293] The computer system 2300 can include a storage subsystem 2318. The storage subsystem 2318 comprises software elements, and in the illustration, these software elements are disposed within the system memory 2310. The system memory 2310 can store program instructions that are loadable and executable by the processing unit 2304, and data generated by the execution of these programs.
[0294] Depending on the configuration and type of the computer system 2300, the system memory 2310 may be volatile memory (e.g., random access memory: RAM) and / or non-volatile memory (e.g., read-only memory: ROM, flash memory). Generally, RAM contains data and / or program modules that the processing unit 2304 can access immediately, and / or data and / or program modules currently being operated on and executed by the processing unit 2304. In some implementations, the system memory 2310 may include several different types of memory, such as static random access memory (SRAM) or dynamic random access memory (DRAM). In some implementations, a basic input / output system (BIOS), which includes basic routines that help transfer information between elements within the computer system 2300 during startup and at other times, may generally be stored in ROM. By way of example, and not limitation, system memory 2310 is also shown as containing application programs 2312, program data 2314, and operating system 2316, where the application programs may include client applications, web browsers, middle-tier applications, relational database management systems (RDBMS), etc. For example, operating system 2316 may include various versions of Microsoft Windows®, Apple Macintosh®, and / or Linux® operating systems, various commercially available UNIX® or UNIX-like operating systems (including, but not limited to, various GNU / Linux operating systems, Google Chrome® OS, etc.), and / or mobile operating systems such as iOS, Windows® Phone, Android® OS, BlackBerry® OS, and Palm® OS.
[0295] Furthermore, the storage subsystem 2318 can provide a tangible, computer-readable storage medium for storing basic programming and data structures that provide the functionality of several embodiments. Software (programs, code modules, instructions) that provides the above functionality when executed by the processor may be stored in the storage subsystem 2318. These software modules or instructions may be executed by the processing unit 2304. The storage subsystem 2318 can also provide a repository for storing data used in accordance with this disclosure.
[0296] Furthermore, the storage subsystem 2318 may include a computer-readable storage medium reader 2320 that can be further connected to the computer-readable storage medium 2322. The computer-readable storage medium 2322, together with the system memory 2310, or in combination with the system memory 2310 as needed, can comprehensively represent remote storage devices, local storage devices, fixed storage devices, and / or removable storage devices, in addition to storage media for temporarily and / or permanently containing, storing, transmitting, and retrieving computer-readable information.
[0297] Furthermore, the computer-readable storage medium 2322 containing code or a portion of code may include any suitable medium known or used in the art. Such media include, but are not limited to, volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing and / or transmitting information. This may include tangible computer-readable storage media such as RAM, ROM, electrically erasable programmable ROM (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD), or other optical storage devices, magnetic cassettes, magnetic tapes, magnetic disk storage devices or other magnetic storage devices, or other tangible computer-readable media. It may also include intangible computer-readable media such as data signals, data transmissions, or other media that can be used to transmit desired information and are accessible by the computer system 2300.
[0298] For example, the computer-readable storage medium 2322 may include a hard disk drive that reads from or writes to a non-removable non-volatile magnetic medium, a magnetic disk drive that reads from or writes to a removable non-volatile magnetic disk, and an optical disk drive that reads from or writes to a removable non-volatile optical disk such as a CD-ROM, DVD, or Blu-ray® disc or other optical medium. The computer-readable storage medium 2322 may include, but is not limited to, a Zip® drive, a flash memory card, a universal serial bus (USB) flash drive, a secure digital (SD) card, a DVD disc, a digital videotape, and the like. Furthermore, the computer-readable storage medium 2322 may include solid-state drives (SSDs) based on non-volatile memory, such as flash memory-based SSDs, enterprise flash drives, and solid-state ROM; SSDs based on volatile memory, such as solid-state RAM, dynamic RAM, static RAM, and DRAM-based SSDs; magnetoresistive RAM (MRAM) SSDs; and hybrid SSDs that use a combination of DRAM and flash memory-based SSDs. Disk drives and their associated computer-readable media can provide computer system 2300 with non-volatile storage for computer-readable instructions, data structures, program modules, and other data.
[0299] The communication subsystem 2324 provides interfaces with other computer systems and networks. It acts as an interface for receiving data from other systems and transmitting data from computer system 2300 to other systems. For example, the communication subsystem 2324 may enable computer system 2300 to connect to one or more devices via the Internet. In some embodiments, the communication subsystem 2324 may include radio frequency (RF) transceiver components for accessing wireless voice and / or data networks (e.g., using cellular telephone technology, advanced data network technologies such as 3G, 4G, or EDGE (enhanced data rates for global evolution), WiFi (IEEE 802.11 family standards), or other mobile communication technologies, or any combination thereof), global positioning system (GPS) receiver components, and / or other components. In some embodiments, the communication subsystem 2324 may provide wired network connectivity (e.g., Ethernet) in addition to, or instead of, the wireless interface.
[0300] In addition, in some embodiments, the communication subsystem 2324 can receive input communications on behalf of one or more users who can use the computer system 2300, in the form of structured and / or unstructured data feeds 2326, event streams 2328, event updates 2330, etc.
[0301] For example, the communications subsystem 2324 may be configured to receive data feeds 2326 in real time from users of social networks and / or other communications services, such as Twitter® feeds, Facebook® updates, and Rich Site Summary (RSS) feeds, and / or to receive real-time updates from one or more third-party sources.
[0302] Furthermore, the communication subsystem 2324 may be configured to receive data in the form of a continuous data stream, which may include an event stream 2328 and / or event updates 2330 of real-time events, and which may be continuous or unbounded in nature with no explicit end. Examples of applications that generate continuous data may include, for example, sensor data applications, financial tickers, network performance measurement tools (e.g., network monitoring and traffic management applications), clickstream analysis tools, and automotive traffic monitoring.
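Because such a stream has no defined end, a consumer typically processes events incrementally rather than collecting them first. The sketch below illustrates that idea with a generator; `running_mean` is a hypothetical helper, and the numeric events stand in for, say, sensor readings. It is not part of the disclosed system.

```python
import itertools
from typing import Iterable, Iterator


def running_mean(events: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all events seen so far, one value per event.

    The input may be unbounded: only a running total is kept in memory,
    never the stream itself.
    """
    total = 0.0
    for count, value in enumerate(events, start=1):
        total += value
        yield total / count


# Usage: itertools.count(1) is an endless source; islice takes a finite prefix.
first_three = list(itertools.islice(running_mean(itertools.count(1)), 3))
```

The generator never waits for the stream to finish, so it works equally well on a finite batch and on an endless feed.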
[0303] Furthermore, the communication subsystem 2324 may be configured to output structured and / or unstructured data feeds 2326, event streams 2328, event updates 2330, etc., to one or more databases that can communicate with one or more streaming data source computers coupled to the computer system 2300.
[0304] The computer system 2300 may be one of a variety of types, including handheld portable devices (e.g., iPhone® mobile phones, iPad® computing tablets, PDAs), wearable devices (e.g., Google Glass® head-mounted displays), PCs, workstations, mainframes, kiosks, server racks, or other data processing systems.
[0305] In the preceding description, specific details are provided for illustrative purposes to enable a full understanding of the embodiments of this disclosure. However, it will be apparent that various embodiments can be carried out without these specific details. The following description provides examples only and is not intended to limit the scope, applicability, or configuration of this disclosure. Rather, the following description of embodiments provides a possible description for carrying out the embodiments to those skilled in the art. It should be understood that various modifications can be made to the function and arrangement of elements without departing from the spirit and scope of this disclosure as set forth in the appended claims. The drawings and descriptions are not intended to be restrictive. Circuits, systems, networks, processes, and other components may be shown as components in the form of block diagrams so as not to obscure the embodiments with unnecessary details. In other embodiments, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary details so as not to obscure the embodiments. The teachings of this disclosure can also be applied to various types of applications, such as mobile applications, non-mobile applications, desktop applications, web applications, and enterprise applications. Furthermore, the teachings in this disclosure are not limited to a specific operating environment (e.g., operating system, device, platform, etc.) but can be applied to multiple different operating environments.
[0306] It should also be noted that each embodiment is described as a process shown as a flowchart, flow diagram, data flow diagram, structure diagram, or block diagram. While flowcharts describe operations as sequential processes, many operations can be performed in parallel or simultaneously. Furthermore, the order of operations may be rearranged. A process terminates when its operations are completed, but it may include additional steps not shown in the diagram. A process can correspond to a method, function, procedure, subroutine, subprogram, etc. If a process corresponds to a function, its termination can correspond to the return of the calling function or the main function.
[0307] The terms “example” and “exemplary” are used herein to mean “serving as an example, case, or illustration.” Any embodiment or design described herein as “exemplary” or “example” should not necessarily be construed as being preferable or advantageous to other embodiments or designs.
[0308] The terms “machine-readable storage medium” or “computer-readable storage medium” include, but are not limited to, portable or non-portable storage devices, optical storage devices, and various other media that can store, contain, or carry instructions and / or data. A machine-readable storage medium or computer-readable storage medium may also include non-transitory media that can store data and that do not include carrier waves and / or transitory electronic signals propagating over wireless or wired connections. Examples of non-transitory media may include, but are not limited to, magnetic disks or tapes, optical storage media such as compact discs (CDs) or digital versatile discs (DVDs), flash memory, memory, or memory devices. Computer program products may include code and / or machine-executable instructions that can represent any combination of procedures, functions, subprograms, programs, routines, subroutines, modules, software packages, classes, instructions, data structures, or program statements. Code segments may be coupled to other code segments or hardware circuits by passing and / or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, and data may be passed, forwarded, or transmitted via any suitable means, such as memory sharing, message passing, token passing, or network transmission.
[0309] Furthermore, embodiments may be implemented in hardware, software, firmware, middleware, microcode, hardware description language, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segment that performs the required work may be stored in a machine-readable medium. The processor can then perform the required work. Systems shown in some drawings may be provided in various configurations. In some examples, the system may be configured as a distributed system in which one or more elements of the system are distributed across one or more networks within a cloud computing system. Where an element is described as "configured" to perform a particular operation, such a configuration may be achieved, for example, by designing electronics or other hardware to perform the operation, by programming or controlling electronics (e.g., a microprocessor or other suitable electronics) to perform the operation, or by any combination thereof.
[0310] While specific embodiments of the Disclosure have been described, various changes, modifications, alternative configurations, and equivalents are also included within the scope of the Disclosure. The embodiments of the Disclosure are not limited to operating within a specific data processing environment, but can freely operate within multiple data processing environments. Furthermore, while the embodiments of the Disclosure have been described using a set of specific procedures and steps, it will be apparent to those skilled in the art that the scope of the Disclosure is not limited to the procedures and steps described. The various features and aspects of the embodiments described above can be used individually or in combination.
[0311] Furthermore, while embodiments of the Disclosure have been described using specific combinations of hardware and software, it should be recognized that other combinations of hardware and software are also included within the scope of the Disclosure. Embodiments of the Disclosure can be implemented using hardware alone, software alone, or a combination thereof. The various processes described in the Disclosure can be run on the same processor or on different processors in any combination. Thus, where a component or module is described as being configured to perform a particular process, that configuration can be implemented, for example, by designing an electronic circuit to perform that process, by programming a programmable electronic circuit (such as a microprocessor) to perform that process, or by a combination thereof. Processes can communicate using a variety of techniques, including but not limited to conventional techniques for inter-process communication. Different pairs of processes may use different techniques, or the same pair of processes may use different techniques at different times.
[0312] Therefore, the specification and drawings should be considered illustrative rather than restrictive. It will be apparent, however, that additions, subtractions, deletions, and other modifications and changes may be made without departing from the broader spirit and scope defined by the claims. Thus, although specific embodiments of this disclosure have been described, these embodiments are not intended to be limiting. Various modifications and equivalents are within the scope of the appended claims.
[0313] Examples of embodiments of this disclosure can be described in view of the following items. Item 1. A method implemented by a computer system, the method comprising: receiving, from a first network virtualization device (NVD), first information regarding an Internet Group Management Protocol (IGMP) response of a first compute instance, wherein the first information indicates that the first compute instance should be added to a multicast group, the first compute instance is hosted on a host machine and belongs to a Layer 2 virtual network, the Layer 2 virtual network is hosted on a physical network and comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network comprises the first NVD and the host machine, the first NVD hosts a first Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and a first Layer 2 virtual switch among the plurality of Layer 2 virtual switches, and the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with the first compute instance; generating, based on the first information, an IGMP table indicating that the first compute instance is added to the multicast group; and sending at least a first portion of the IGMP table to a second NVD, wherein the second NVD hosts a second Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and a second Layer 2 virtual switch among the plurality of Layer 2 virtual switches, and the second Layer 2 virtual network interface and the second Layer 2 virtual switch are associated with a second compute instance among the plurality of compute instances.
[0314] Item 2. The method according to Item 1, further comprising receiving, from the second NVD, second information relating to an IGMP response of the second compute instance, wherein the second information indicates that the second compute instance should be added to the multicast group, and the IGMP table is generated further based on the second information and indicates that the second compute instance is added to the multicast group.
[0315] Item 3. The method according to Item 2, further comprising transmitting at least a second portion of the IGMP table to the first NVD.
[0316] Item 4. The method according to Item 3, wherein the first portion transmitted to the second NVD does not include a first indication that the second compute instance is added to the multicast group but includes a second indication that the first compute instance is added to the multicast group, and the second portion transmitted to the first NVD includes the first indication but does not include the second indication.
[0317] Item 5. The method according to Item 3, wherein the IGMP table is sent to the first NVD and the second NVD.
[0318] Item 6. The method according to any one of Items 1 to 5, further comprising sending a request for an IGMP query to the first NVD, wherein the IGMP response is received by the first NVD based on the IGMP query.
[0319] Item 7. The method according to Item 6, further comprising determining, based on configuration information of the Layer 2 virtual network, that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the first NVD, wherein the request is sent to the first NVD based on the determination.
[0320] Item 8. The method according to any one of Items 1 to 7, further comprising: receiving, from the first NVD, second information indicating that the first compute instance should be removed from the multicast group; generating an update to the IGMP table based on the second information, the update indicating that the first compute instance is removed from the multicast group; and transmitting at least the update to the second NVD.
[0321] Item 9. The method according to Item 8, wherein the update is further sent to another NVD rather than to the first NVD.
[0322] Item 10. A network virtualization device comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the network virtualization device to: host a first Layer 2 virtual network interface and a first Layer 2 virtual switch that belong to a Layer 2 virtual network, wherein the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with a first compute instance that belongs to the Layer 2 virtual network, the first compute instance is hosted on a host machine of a physical network that comprises the network virtualization device, the host machine and the network virtualization device are communicatively coupled, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; send an Internet Group Management Protocol (IGMP) query to the first compute instance; and receive an IGMP response from the first compute instance, the IGMP response indicating that the first compute instance should be added to a multicast group.
[0323] Item 11. The network virtualization device according to Item 10, wherein execution of the instructions further configures the network virtualization device to: transmit, to a computer system, first information relating to the IGMP response, the first information indicating that the first compute instance should be added to the multicast group; and receive, from the computer system, at least a portion of an IGMP table, the IGMP table being generated based on the first information.
[0324] Item 12. The network virtualization device according to Item 11, wherein the portion includes a first indication that a second compute instance among the plurality of compute instances is added to the multicast group, and the second compute instance is associated with a second Layer 2 virtual network interface and a second Layer 2 virtual switch hosted by a second network virtualization device.
[0325] Item 13. The network virtualization device according to Item 12, wherein execution of the instructions further configures the network virtualization device to store a local IGMP table based on the IGMP response of the first compute instance and the portion of the IGMP table received from the computer system, wherein the portion does not include a second indication that the first compute instance is added to the multicast group, and the local IGMP table includes the second indication.
[0326] Item 14. The network virtualization device according to Item 11, wherein execution of the instructions further configures the network virtualization device to receive a request for the IGMP query from the computer system, and the IGMP query is sent based on the request.
[0327] Item 15. The network virtualization device according to Item 11, wherein execution of the instructions further configures the network virtualization device to: store a local IGMP table based on the portion of the IGMP table received from the computer system, the local IGMP table indicating that a second compute instance among the plurality of compute instances is added to the multicast group; receive a frame of the first compute instance; determine, based on the local IGMP table, that the frame should be replicated and sent to the second compute instance; and send the replicated frame to the second compute instance.
[0328] Item 16. One or more computer-readable storage media storing instructions that, when executed on a network virtualization device, cause the network virtualization device to perform operations comprising: hosting a first Layer 2 virtual network interface and a first Layer 2 virtual switch that belong to a Layer 2 virtual network, wherein the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with a first compute instance that belongs to the Layer 2 virtual network, the first compute instance is hosted on a host machine of a physical network that comprises the network virtualization device, the host machine and the network virtualization device are communicatively coupled, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; sending a first Internet Group Management Protocol (IGMP) query to the first compute instance; and receiving a first IGMP response from the first compute instance, the first IGMP response indicating that the first compute instance should be added to a multicast group.
[0329] Item 17. The one or more computer-readable storage media according to Item 16, wherein the operations further comprise: transmitting, to a computer system, first information relating to the first IGMP response, the first information indicating that the first compute instance should be added to the multicast group; and receiving, from the computer system, at least a portion of an IGMP table, the IGMP table being generated based on the first information.
[0330] Item 18. The one or more computer-readable storage media according to Item 17, wherein the operations further comprise: sending a second IGMP query to the first compute instance; and receiving a second IGMP response from the first compute instance, the second IGMP response indicating that the first compute instance should be removed from the multicast group.
[0331] Item 19. The one or more computer-readable storage media according to Item 18, wherein the operations further comprise: storing a local IGMP table based on the portion of the IGMP table received from the computer system; and updating the local IGMP table to indicate that the first compute instance is removed from the multicast group.
[0332] Item 20. The one or more computer-readable storage media according to Item 18, wherein the operations further comprise: storing a local IGMP table based on the portion of the IGMP table received from the computer system; and sending an update to the computer system indicating that the first compute instance should be removed from the multicast group.
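As an illustration only, the table handling attributed to the computer system in Items 1 to 9 can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class name `IgmpTableService`, its method names, the dictionary layout, and the identifiers `nvd-1`, `instance-1`, and `239.1.1.1` are all hypothetical. The sketch also assumes that each portion omits the members that the receiving NVD itself reported, and that a removal update goes to every known NVD other than the reporting one.

```python
from collections import defaultdict


class IgmpTableService:
    """Hypothetical stand-in for the "computer system" of Items 1 to 9."""

    def __init__(self):
        # multicast group -> {compute instance id -> id of the NVD that reported it}
        self.table = defaultdict(dict)
        # every NVD that has reported at least once (assumed bookkeeping)
        self.nvds = set()

    def report_join(self, nvd_id, instance_id, group):
        """Record information from an NVD that an instance should be added."""
        self.nvds.add(nvd_id)
        self.table[group][instance_id] = nvd_id

    def report_leave(self, nvd_id, instance_id, group):
        """Record a removal and return the update keyed by each other NVD."""
        self.nvds.add(nvd_id)
        self.table[group].pop(instance_id, None)
        update = {"group": group, "removed": instance_id}
        return {other: update for other in sorted(self.nvds - {nvd_id})}

    def portion_for(self, nvd_id):
        """Portion of the table for one NVD: only members hosted elsewhere."""
        return {
            group: sorted(i for i, owner in members.items() if owner != nvd_id)
            for group, members in self.table.items()
        }


service = IgmpTableService()
service.report_join("nvd-1", "instance-1", "239.1.1.1")
service.report_join("nvd-2", "instance-2", "239.1.1.1")
print(service.portion_for("nvd-2"))  # {'239.1.1.1': ['instance-1']}
print(service.portion_for("nvd-1"))  # {'239.1.1.1': ['instance-2']}
print(service.report_leave("nvd-1", "instance-1", "239.1.1.1"))
```

In this sketch the portion sent to each NVD leaves out what that NVD already knows from its own IGMP responses, which mirrors Item 4. Sending the whole table to both NVDs, as in Item 5, would simply skip the filter in `portion_for`.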
[0333] The indefinite articles “a” and “an”, the definite article “the”, and similar referents used in the context of describing this disclosure (particularly in the context of the claims) should be interpreted as covering both the singular and the plural unless otherwise indicated herein or clearly contradicted by context. The terms “comprising”, “having”, “including”, and “containing” should be interpreted as open-ended terms (i.e., “including but not limited to”) unless otherwise noted. The term “connected” should be interpreted as partly or wholly contained within, attached to, or joined together, even if something is intervening. In this specification, recitations of ranges of values are intended merely as a shorthand way of referring individually to each separate value that falls within the range, and unless otherwise indicated herein, each separate value is incorporated into the specification as if it were individually recited herein. Unless otherwise indicated herein or clearly contradicted by context, all methods described herein may be performed in any suitable order. The use of any and all examples or exemplary language (e.g., “such as”) is intended merely to better illuminate embodiments of the Disclosure and, unless otherwise claimed, does not limit the scope of the Disclosure. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the Disclosure.
[0334] Disjunctive language, such as the phrase "at least one of X, Y, or Z," is intended to be understood in context as commonly used to indicate that an item, term, etc., may be X, Y, Z, or any combination thereof (e.g., X, Y, and/or Z), unless otherwise specified. Therefore, such disjunctive language is not generally intended to, and should not, imply that a particular embodiment requires at least one of X, at least one of Y, or at least one of Z to each be present.
[0335] Preferred embodiments of the Disclosure are described herein, including the best known mode for carrying out the Disclosure. Variations of these preferred embodiments will become apparent to those skilled in the art by reading the foregoing description. Those skilled in the art may, as appropriate, adopt such modifications, and the Disclosure may be carried out in ways other than those specifically described herein. Accordingly, the Disclosure includes all variations and equivalents of the subject matter described in the claims appended herein, as permitted by applicable law. Furthermore, any combination of the above elements in all possible variations is incorporated herein unless otherwise indicated herein.
[0336] All references cited herein, including publications, patent applications, and patents, are incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
[0337] While aspects of the disclosure described above are explained with reference to specific embodiments, those skilled in the art will recognize that the disclosure is not limited thereto. The various features and aspects of the disclosure described above may be used individually or in combination. Furthermore, embodiments may be used in any number of environments and applications beyond those described herein without departing from the broader spirit and scope of this specification. Accordingly, the specification and drawings should be considered illustrative rather than restrictive.
Claims
1. A method performed by a computer, the method comprising: receiving, from a first network virtualization device (NVD), first information regarding an Internet Group Management Protocol (IGMP) response of a first compute instance, wherein the IGMP response is a response to an IGMP query, the first information indicates that the first compute instance should be added to a multicast group, the first compute instance is hosted on a host machine and belongs to a Layer 2 virtual network, the Layer 2 virtual network is hosted on a physical network and includes a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches, the physical network includes the first NVD and the host machine, the first NVD hosts a first Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and a first Layer 2 virtual switch among the plurality of Layer 2 virtual switches, and the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with the first compute instance; receiving, from a second NVD, second information regarding an IGMP response of a second compute instance, the second information indicating that the second compute instance should be added to the multicast group; generating, based on the first information and the second information, an IGMP table indicating that the first compute instance and the second compute instance are added to the multicast group; and transmitting at least a first portion of the IGMP table to the second NVD, wherein the second NVD hosts a second Layer 2 virtual network interface among the plurality of Layer 2 virtual network interfaces and a second Layer 2 virtual switch among the plurality of Layer 2 virtual switches, the second Layer 2 virtual network interface and the second Layer 2 virtual switch are associated with the second compute instance among the plurality of compute instances, and the first portion transmitted to the second NVD includes a first indication that the first compute instance is added to the multicast group but does not include a second indication that the second compute instance is added to the multicast group.
2. The method according to claim 1, wherein the entire IGMP table is transmitted to the first NVD and the second NVD.
3. The method according to claim 1 or 2, further comprising sending a request for the IGMP query to the first NVD, wherein the IGMP response of the first compute instance is received by the first NVD based on the IGMP query.
4. The method according to claim 3, further comprising determining, based on configuration information of the Layer 2 virtual network, that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the first NVD, wherein the request is transmitted to the first NVD based on the determination.
5. The method according to any one of claims 1 to 4, further comprising: receiving, from the first NVD, third information indicating that the first compute instance should be removed from the multicast group; generating an update to the IGMP table based on the third information, the update indicating that the first compute instance is removed from the multicast group; and transmitting at least the update to the second NVD.
6. The method according to claim 5, wherein the update is further transmitted to another NVD rather than to the first NVD.
7. A network virtualization device comprising: one or more processors; and one or more computer-readable storage media storing instructions that, when executed by the one or more processors, configure the network virtualization device to: host a first Layer 2 virtual network interface and a first Layer 2 virtual switch that belong to a Layer 2 virtual network, wherein the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with a first compute instance that belongs to the Layer 2 virtual network, the first compute instance is hosted on a host machine of a physical network that includes the network virtualization device, the host machine and the network virtualization device are communicatively coupled, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; send an Internet Group Management Protocol (IGMP) query to the first compute instance; receive an IGMP response from the first compute instance, the IGMP response indicating that the first compute instance should be added to a multicast group; receive at least a portion of an IGMP table from a computer system, wherein the IGMP table is generated by the computer system based on first information and second information relating to the IGMP response, the first information indicates that the first compute instance should be added to the multicast group, the second information indicates that a second compute instance among the plurality of compute instances should be added to the multicast group, and the second compute instance is associated with a second Layer 2 virtual network interface and a second Layer 2 virtual switch hosted on a second network virtualization device; and store a local IGMP table based on the IGMP response of the first compute instance and the portion of the IGMP table received from the computer system, wherein the portion does not include a second indication that the first compute instance is added to the multicast group but includes a first indication that the second compute instance should be added to the multicast group, and the stored local IGMP table includes the second indication.
8. The network virtualization device according to claim 7, wherein execution of the instructions further configures the network virtualization device to transmit the first information relating to the IGMP response to the computer system.
9. The network virtualization device according to claim 8, wherein execution of the instructions further configures the network virtualization device to receive a request for the IGMP query from the computer system, and the IGMP query is transmitted based on the request.
10. The network virtualization device according to claim 8, wherein execution of the instructions further configures the network virtualization device to: receive a frame of the first compute instance; determine, based on the local IGMP table, that the frame should be replicated and sent to the second compute instance; and transmit the replicated frame destined for the second compute instance.
11. A program that causes a processor to perform operations, the operations comprising: hosting a first Layer 2 virtual network interface and a first Layer 2 virtual switch that belong to a Layer 2 virtual network, wherein the first Layer 2 virtual network interface and the first Layer 2 virtual switch are associated with a first compute instance that belongs to the Layer 2 virtual network, the first compute instance is hosted on a host machine of a physical network that includes a network virtualization device, the host machine and the network virtualization device are communicatively coupled, and the Layer 2 virtual network is hosted on the physical network and comprises a plurality of compute instances, a plurality of Layer 2 virtual network interfaces, and a plurality of Layer 2 virtual switches; sending a first Internet Group Management Protocol (IGMP) query to the first compute instance; receiving a first IGMP response from the first compute instance, the first IGMP response indicating that the first compute instance should be added to a multicast group; receiving at least a portion of an IGMP table from a computer system, wherein the IGMP table is generated by the computer system based on first information and second information relating to the IGMP response, the first information indicates that the first compute instance should be added to the multicast group, the second information indicates that a second compute instance among the plurality of compute instances should be added to the multicast group, and the second compute instance is associated with a second Layer 2 virtual network interface and a second Layer 2 virtual switch hosted on a second network virtualization device; and storing a local IGMP table based on the first IGMP response of the first compute instance and the portion of the IGMP table received from the computer system, wherein the portion does not include a second indication that the first compute instance is added to the multicast group but includes a first indication that the second compute instance should be added to the multicast group, and the stored local IGMP table includes the second indication.
12. The program according to claim 11, wherein the operations further comprise transmitting the first information relating to the first IGMP response to the computer system.
13. The program according to claim 12, wherein the operations further comprise: sending a second IGMP query to the first compute instance; and receiving a second IGMP response of the first compute instance, the second IGMP response indicating that the first compute instance should be removed from the multicast group.
14. The program according to claim 13, wherein the operations further comprise updating the local IGMP table to indicate that the first compute instance is removed from the multicast group.
15. The program according to claim 13, wherein the operations further comprise sending an update to the computer system indicating that the first compute instance should be removed from the multicast group.
16. The program according to any one of claims 11 to 15, wherein the operations further comprise receiving a request for an IGMP query from the computer system.
17. The program according to claim 16, wherein the computer system determines, based on the configuration information of the Layer 2 virtual network, that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the network virtualization device, and based on the determination that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the network virtualization device, transmits the request to the processor.
18. The network virtualization device according to claim 9, wherein the computer system determines, based on the configuration information of the Layer 2 virtual network, that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the network virtualization device, and based on the determination that the first Layer 2 virtual switch is associated with the first compute instance and is hosted by the network virtualization device, transmits the request to the determined network virtualization device.
19. A program for causing a processor to perform the method according to any one of claims 1 to 6.
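For illustration, the device-side behavior recited in claims 7 to 15 (keeping a local IGMP table that merges the device's own IGMP responses with the portion received from the computer system, then using that table to decide where a frame is replicated) might be sketched as below. This is a hedged sketch rather than the claimed implementation: the class `NvdMulticastState`, its methods, the report dictionary, and all identifiers are hypothetical, and frames are modeled as opaque byte strings.

```python
class NvdMulticastState:
    """Hypothetical per-NVD multicast state; all names are illustrative."""

    def __init__(self, nvd_id):
        self.nvd_id = nvd_id
        self.local_members = {}   # group -> instances attached to this NVD
        self.remote_members = {}  # group -> instances reported by the computer system

    def on_igmp_response(self, instance_id, group, join=True):
        """Apply a local IGMP response and build the information to report."""
        members = self.local_members.setdefault(group, set())
        if join:
            members.add(instance_id)
        else:
            members.discard(instance_id)
        return {"nvd": self.nvd_id, "instance": instance_id,
                "group": group, "join": join}

    def on_table_portion(self, portion):
        """Store the portion of the IGMP table received from the computer system."""
        self.remote_members = {g: set(m) for g, m in portion.items()}

    def local_table(self):
        """Local IGMP table: local members plus members from the received portion."""
        groups = set(self.local_members) | set(self.remote_members)
        return {
            g: self.local_members.get(g, set()) | self.remote_members.get(g, set())
            for g in groups
        }

    def replicate(self, src_instance, group, frame):
        """Return one (destination, frame) pair per group member except the sender."""
        targets = self.local_table().get(group, set()) - {src_instance}
        return [(t, frame) for t in sorted(targets)]


nvd = NvdMulticastState("nvd-1")
report = nvd.on_igmp_response("instance-1", "239.1.1.1")
nvd.on_table_portion({"239.1.1.1": ["instance-2"]})
print(nvd.replicate("instance-1", "239.1.1.1", b"frame"))
# [('instance-2', b'frame')]
```

Keeping local and received membership in separate maps is a design choice of this sketch: it lets a newly received portion replace the remote view without discarding what the device learned from its own IGMP queries.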