AWS Networking

Virtual Private Cloud (VPC)

Networking 101

Networks are the roads that allow our data to move from device to device. Networks connect to other networks to allow data to move to remote devices.

Switch: Forwards data between hosts within a single network. It operates at the data link layer (layer 2) of the OSI model. Layer 2 switches forward frames using the Media Access Control (MAC) addresses of hosts to deliver data between hosts on the same network.

Layer 2 switches reduce collisions because each switch port is its own collision domain.

Routers: Route data between hosts that belong to different networks. They operate at the network layer (layer 3) and forward packets using IP addresses.

Overview of VPCs

AWS definition of VPC:

Amazon Virtual Private Cloud (Amazon VPC) lets us provision a logically isolated section of the Amazon Web Services (AWS) cloud where we can launch AWS resources in a virtual network that we define. We have complete control over our virtual networking environment, including selection of our own IP address range, creation of subnets, and configuration of route tables and network gateways.

Note: When we create an AWS account, a "Default" VPC is created for us.

Subnets

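A subnet (shorthand for "subnetwork") is a subsection of a network. In general IPv4 networking, a subnet can hold from 2 usable hosts (a /30) up to 16,777,214 (a /8). In a VPC, AWS limits IPv4 subnets to sizes between /28 and /16. As per the AWS definition: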

When we create a VPC, it spans all the Availability Zones in the region. After creating a VPC, we can add one or more subnets in each Availability Zone. Each subnet must reside entirely within one Availability Zone and cannot span over multiple zones.

Note: Our default VPC already has a subnet created by default.

Public Subnet: A subnet whose route table has a route to an Internet Gateway (IGW). Instances that have a public IP address can communicate with resources on the Internet; the IGW translates between their private and public IP addresses (NAT).

Private Subnet: A subnet whose route table has no route to an IGW. Instances can communicate directly only with resources inside the VPC and cannot be reached from the Internet.
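
The public/private distinction comes down to the contents of the subnet's route table. The check below is a minimal sketch, assuming a route table is represented as a list of (destination, target) pairs and that IGW targets start with "igw-"; the function name and the sample tables are hypothetical, not an AWS API.

```python
def is_public_subnet(routes: list[tuple[str, str]]) -> bool:
    """Return True if any default route points at an internet gateway."""
    return any(
        destination in ("0.0.0.0/0", "::/0") and target.startswith("igw-")
        for destination, target in routes
    )


# Hypothetical route tables for illustration only.
public_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-XXXXX")]
private_routes = [("10.0.0.0/16", "local")]

print(is_public_subnet(public_routes))   # True
print(is_public_subnet(private_routes))  # False
```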

IP Address

There are two types of IP addresses: IPv4 and IPv6.

IPv4

An IPv4 address consists of 4 octets of binary digits, usually written in decimal format. For example: 192.168.1.1

Here 192.168.1.1 is 11000000.10101000.00000001.00000001 in binary format.

Every host on a subnet has a unique IP address. We can assign IP addresses to hosts in a subnet manually or by using a DHCP (Dynamic Host Configuration Protocol) server.
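
The decimal-to-binary conversion above can be reproduced with a few lines of Python. This is a small sketch; the helper name ipv4_to_binary is made up for illustration.

```python
def ipv4_to_binary(address: str) -> str:
    """Return the dotted-binary form of a dotted-decimal IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    # Format each octet as exactly 8 binary digits.
    return ".".join(f"{octet:08b}" for octet in octets)


print(ipv4_to_binary("192.168.1.1"))
# 11000000.10101000.00000001.00000001
```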

IPv4 has the following classes:

Class A - 0.0.0.0 - 127.255.255.255 - The first octet (8 bits) identifies the network and the last three octets (24 bits) identify the host.

Class B - 128.0.0.0 - 191.255.255.255 - The first two octets (16 bits) identify the network and the last two octets (16 bits) identify the host.

Class C - 192.0.0.0 - 223.255.255.255 - The first three octets (24 bits) identify the network and the last octet (8 bits) identifies the host.

Class D - 224.0.0.0 - 239.255.255.255 - Multicast range. Addresses are not split into network and host portions.

Class E - 240.0.0.0 - 255.255.255.255 - Reserved for research and experimental use.

127.0.0.1 is the loopback address.
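
Because the class boundaries depend only on the first octet, the class of an address can be worked out with a simple comparison chain. The function below is an illustrative sketch; its name is hypothetical.

```python
def ipv4_class(address: str) -> str:
    """Return the classful category (A-E) of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"first octet out of range: {first_octet}")
    if first_octet <= 127:
        return "A"
    if first_octet <= 191:
        return "B"
    if first_octet <= 223:
        return "C"
    if first_octet <= 239:
        return "D"
    return "E"


for sample in ("10.1.2.3", "172.16.0.1", "192.168.1.1", "224.0.0.5", "250.0.0.1"):
    print(sample, "-> Class", ipv4_class(sample))
```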

Classes A, B and C are available as public IPs, apart from the private ranges listed below. Within a subnet there are three types of addresses:

  1. Network Address - Host portion contains all zeroes.
  2. Broadcast Address - Host portion contains all ones.
  3. Host Address - All IPs except all zeroes and all ones.
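
The three address types can be inspected with Python's standard ipaddress module. This is a sketch for a generic IPv4 subnet; the example CIDR 192.168.1.0/24 and the helper name describe_subnet are chosen only for illustration.

```python
import ipaddress


def describe_subnet(cidr: str) -> dict:
    """Summarise the network, broadcast and usable host addresses of a subnet."""
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())  # excludes the network and broadcast addresses
    return {
        "network": str(net.network_address),      # host bits all zeroes
        "broadcast": str(net.broadcast_address),  # host bits all ones
        "first_host": str(hosts[0]),
        "last_host": str(hosts[-1]),
        "host_count": len(hosts),
    }


print(describe_subnet("192.168.1.0/24"))
```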

Private IP ranges

Class A - 10.0.0.0 - 10.255.255.255

Class B - 172.16.0.0 - 172.31.255.255

Class C - 192.168.0.0 - 192.168.255.255
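
To check whether an address falls in one of these three private ranges, the ranges can be written in CIDR form (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) and tested for membership. The sketch below uses the standard ipaddress module; the helper name is hypothetical.

```python
import ipaddress

# The three private ranges listed above, in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_range(address: str) -> bool:
    """Return True if the address lies in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


print(is_private_range("172.31.255.255"))  # True
print(is_private_range("172.32.0.1"))      # False
```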

CIDR - Classless Inter-Domain Routing

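CIDR was introduced to ease the scarcity of public IPv4 addresses. Instead of fixed class boundaries, CIDR lets the boundary between the network and host portions fall at any bit, written as a prefix length (for example /24), so a network can be sized to the number of hosts it needs.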
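
As an illustration of CIDR-based subnetting, the sketch below splits one address block into equal smaller blocks using the standard ipaddress module. The block 10.0.0.0/16 and the /18 target size are assumptions picked for the example, as is the helper name.

```python
import ipaddress


def carve_subnets(block_cidr: str, new_prefix: int) -> list[str]:
    """Split a CIDR block into equal subnets with the given prefix length."""
    block = ipaddress.ip_network(block_cidr)
    return [str(subnet) for subnet in block.subnets(new_prefix=new_prefix)]


# Moving the prefix from /16 to /18 borrows 2 host bits, giving 4 subnets.
for cidr in carve_subnets("10.0.0.0/16", 18):
    print(cidr, ipaddress.ip_network(cidr).num_addresses)
```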

IPv6

An IPv6 address is 128 bits long: 32 nibbles (32 x 4 bits) or 8 hextets (8 x 16 bits). Typically the network and host portions are 64 bits each.
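
The hextet structure is easier to see when a compressed address is expanded. The sketch below uses the standard ipaddress module and assumes the common 64/64 split between network and host portions; 2001:db8::1 is a documentation address used only as an example.

```python
import ipaddress


def split_ipv6(address: str) -> tuple[list[str], list[str]]:
    """Return (network hextets, host hextets), assuming a 64/64 split."""
    exploded = ipaddress.IPv6Address(address).exploded  # all 8 hextets, zero-padded
    hextets = exploded.split(":")
    return hextets[:4], hextets[4:]


network, host = split_ipv6("2001:db8::1")
print(network)  # ['2001', '0db8', '0000', '0000']
print(host)     # ['0000', '0000', '0000', '0001']
```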

Types of IPv6 Addresses

  1. Global Unicast address. For global communication.
  2. Link Local address. For the local network. It always starts with FE80.
  3. Loopback address. ::1/128
  4. Unspecified address. ::/128
  5. Unique Local address. FC00::/7 for private addressing.
  6. Multicast address
    • One to many communication
    • not implemented globally
  7. Anycast address
    • One IPv6 address, many devices
    • Used for load balancing
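
Most of these types can be told apart from the address prefix alone. The classifier below is a rough sketch using the standard ipaddress module; it assumes 2000::/3 as the global unicast range, and it cannot detect anycast, because an anycast address looks the same as a unicast address.

```python
import ipaddress

UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")  # assumed global unicast range


def ipv6_type(address: str) -> str:
    """Classify an IPv6 address by prefix (anycast is not detectable this way)."""
    ip = ipaddress.IPv6Address(address)
    if ip.is_unspecified:
        return "unspecified"
    if ip.is_loopback:
        return "loopback"
    if ip.is_multicast:
        return "multicast"
    if ip.is_link_local:
        return "link-local"
    if ip in UNIQUE_LOCAL:
        return "unique local"
    if ip in GLOBAL_UNICAST:
        return "global unicast"
    return "other"


for sample in ("::", "::1", "fe80::1", "fd12:3456::1", "ff02::1", "2001:db8::1"):
    print(sample, "->", ipv6_type(sample))
```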

ARP Table - mapping of IP addresses to MAC addresses

MAC Address Table - mapping of switchports to MAC addresses

Routing Table - mapping of IP subnets to egress interfaces. It is populated from directly connected interfaces, static routes and dynamic routing protocols.

How a website is opened on a PC

On a PC, we open an HTTP client application such as Google Chrome or Internet Explorer, type the URL (Uniform Resource Locator) google.com in the address bar and press Enter. The HTTP client takes the domain name from the URL and initiates a UDP (User Datagram Protocol) message to DNS to resolve it to the IP address of the google.com web server.

The DNS query becomes a segment at Layer 4 (Transport), where a source port and destination port 53 (the UDP port for DNS) are added. At Layer 3 (Network), the segment becomes a packet: the source IP (the PC) and destination IP (the DNS resolver) are added. The PC checks its routing table and, because the resolver is on a different network, decides to send the packet via its default gateway (the router). At Layer 2 (Data Link), the packet becomes a frame: the source MAC (the PC) and destination MAC (the router/default gateway) are added.

If the destination MAC address is not in the ARP table, the Address Resolution Protocol (ARP) kicks in. The PC sends an ARP request whose destination MAC address is the broadcast address (all Fs) and which asks for the owner of the gateway's IP address. The switch floods the request out of every port except the one it arrived on. Each host compares the requested IP with its own; only the owner (the router in this case) replies, with its MAC address, in an ARP reply. The PC can now complete the frame and pass it to Layer 1.

At Layer 1 (Physical), the frame becomes a bit stream and travels to the default gateway (router). The router processes Layers 1, 2 and 3 and performs Network Address Translation (NAT): it replaces the private source IP with its own public IP and leaves the destination IP (the DNS resolver) unchanged. It then checks its routing table for the next hop towards the destination IP and forwards the packet there (i.e. the RSP's network). The RSP's router receives the packet and looks up its own routing table for the next hop, and so on until the packet reaches the DNS resolver.

The DNS resolver looks at the domain name and sends a query to a DNS root name server. A domain name has three parts, read from right to left. First is the Top Level Domain (e.g. .com, .edu, .org, .au). Second is the Second Level Domain (e.g. google, amazon, microsoft). Third is the Third Level Domain, or subdomain (e.g. play in play.google.com). www is a host name at this level.

The DNS root name server replies to the resolver with a referral to the name server for the .com Top Level Domain (TLD). The .com TLD name server checks its records for the authoritative name server for google.com and replies with a referral to that server (e.g. an AWS Route 53 DNS server). The resolver then sends the request to the authoritative name server, which looks up the records for the domain name (A record etc.) and returns the IP of the google.com web server.

The DNS resolver sends the IP address of google.com back in a UDP response, which travels all the way back to the original client. The return trip is faster because the ARP tables, routing tables and MAC address tables along the path are already populated.
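
The chain of lookups (root, then TLD, then the domain's own name server) can be modelled with plain dictionaries. This is a toy simulation, not real DNS: the server names are invented, and 203.0.113.10 is a documentation address standing in for the real IP of google.com.

```python
# Hypothetical data: each "server" is a dictionary of what it knows.
ROOT_SERVER = {"com": "tld-server-com"}
TLD_SERVERS = {"tld-server-com": {"google.com": "ns-google"}}
AUTHORITATIVE_SERVERS = {"ns-google": {"google.com": "203.0.113.10"}}


def resolve(domain: str) -> tuple[str, list[str]]:
    """Walk root -> TLD -> authoritative server and return (ip, steps taken)."""
    steps = []
    tld = domain.rsplit(".", 1)[-1]

    tld_server = ROOT_SERVER[tld]                   # root refers us to the TLD server
    steps.append(f"root: ask {tld_server}")

    auth_server = TLD_SERVERS[tld_server][domain]   # TLD refers us to the domain's server
    steps.append(f"{tld_server}: ask {auth_server}")

    ip = AUTHORITATIVE_SERVERS[auth_server][domain]  # A record lookup
    steps.append(f"{auth_server}: {domain} is {ip}")
    return ip, steps


ip, steps = resolve("google.com")
print("\n".join(steps))
```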

The client then builds an HTTP GET request at the application layer. At the transport layer, the destination port (80 for HTTP; other well-known ports are 443 for HTTPS, 23 for Telnet, 22 for SSH/SFTP, and 20 and 21 for FTP) and the client's source port are added.

Because HTTP runs over TCP, a connection must be established before the request is sent. The client starts a three-way handshake by sending a SYN message (SEQ: 0, CTL: SYN) to the web server, which responds with SYN, ACK (SEQ: 0, ACK: 1, CTL: SYN-ACK). The client then responds with ACK (SEQ: 1, ACK: 1, CTL: ACK), which completes the handshake and establishes the connection. Data can now flow between server and client. Received data is acknowledged with ACK messages, from client to server for a download or from server to client for an upload. For example, if the server sends 22 bytes of data starting at SEQ 1, the client replies with ACK (SEQ: 1, ACK: 23).

Once the data is transferred, either the client or the server initiates the session close. Here, the server sends a FIN, ACK message (SEQ: 23, ACK: 1). The client sends ACK (SEQ: 1, ACK: 24), then its own FIN, ACK message (SEQ: 1, ACK: 24). The server responds with ACK (SEQ: 24, ACK: 2), and the session is closed.
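
The acknowledgement numbers in this exchange follow one rule: the ACK is the sender's sequence number plus the number of data units received, and the SYN and FIN flags each count as one. The helper below is a simplified sketch using relative sequence numbers as in the text; the function name is made up.

```python
def next_ack(peer_seq: int, payload_len: int = 0, syn: bool = False, fin: bool = False) -> int:
    """Return the ACK number that acknowledges a segment starting at peer_seq."""
    # SYN and FIN each consume one sequence number, like one unit of data.
    return peer_seq + payload_len + (1 if syn else 0) + (1 if fin else 0)


print(next_ack(0, syn=True))         # 1  (ACK for the SYN at SEQ 0)
print(next_ack(1, payload_len=22))   # 23 (ACK for 22 units of data from SEQ 1)
print(next_ack(23, fin=True))        # 24 (ACK for the server's FIN at SEQ 23)
print(next_ack(1, fin=True))         # 2  (ACK for the client's FIN at SEQ 1)
```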

Internet Gateways (IGWs)

An internet gateway is a horizontally scaled, redundant and highly available VPC component that allows communication between instances in our VPC and the internet. Our default VPC already has an IGW attached.

Route Tables

A route table contains a set of rules, called routes, that are used to determine where network traffic is directed.

Our default VPC already has a main route table. Every route table in a VPC also contains a local route covering the VPC's CIDR range by default, so clients in each subnet can communicate with clients in other subnets in the same VPC.

0.0.0.0/0 with target igw-XXXXX means a default route to Internet Gateway.
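
When more than one route matches a destination, the most specific route (longest prefix) is used, which is why in-VPC traffic stays local even though 0.0.0.0/0 also matches it. The lookup below is a sketch of that rule; the VPC range 10.0.0.0/16 is an assumed example, and the table is a plain Python list rather than an AWS API object.

```python
import ipaddress

# Example route table: (destination CIDR, target). The VPC CIDR is an assumption.
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-XXXXX"),
]


def lookup(destination: str, routes=ROUTES):
    """Return the target of the most specific matching route, or None."""
    ip = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # Longest prefix wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(lookup("10.0.1.25"))  # local
print(lookup("8.8.8.8"))    # igw-XXXXX
```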

Networking Security

Network Access Control List (NACL)

An optional firewall/security layer that controls inbound and outbound traffic for one or more subnets. The default VPC has a default NACL and ALL traffic is allowed (both inbound/outbound). We define Inbound Rules and Outbound Rules.

  1. Rules are evaluated based on rule # from lowest to highest.
  2. The first rule evaluated that applies to the traffic type gets immediately applied and executed regardless of the rules that come after (have a higher rule #).

NACLs are stateless which means that we have to explicitly allow traffic in both inbound and outbound rules.

Default NACL:

  • Allows both inbound and outbound traffic since the ALLOW rule has a lower rule # than the catch-all DENY rule.

New NACL:

  • When we create a new NACL, ALL traffic is denied by default until we add rules.
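
The evaluation order described above (lowest rule # first, first match wins, deny if nothing matches) can be sketched in a few lines. The Rule structure and the sample rules are hypothetical simplifications; real NACL rules also carry port ranges and separate inbound and outbound lists.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    number: int
    protocol: str         # "tcp", "udp" or "all"
    port: Optional[int]   # None means any port
    cidr: str             # source range the rule applies to
    action: str           # "ALLOW" or "DENY"


def evaluate(rules: list, protocol: str, port: int, source_ip: str) -> str:
    """Return the action of the lowest-numbered matching rule, else DENY."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol not in ("all", protocol):
            continue
        if rule.port is not None and rule.port != port:
            continue
        if ip not in ipaddress.ip_network(rule.cidr):
            continue
        return rule.action  # first match wins; later rules are ignored
    return "DENY"  # catch-all when no rule matches


rules = [
    Rule(100, "tcp", 80, "0.0.0.0/0", "ALLOW"),
    Rule(90, "tcp", 80, "203.0.113.0/24", "DENY"),
]
print(evaluate(rules, "tcp", 80, "203.0.113.5"))   # DENY  (rule 90 matches first)
print(evaluate(rules, "tcp", 80, "198.51.100.7"))  # ALLOW (rule 100)
print(evaluate(rules, "tcp", 21, "198.51.100.7"))  # DENY  (no rule matches)
```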

Security group (SG)

A firewall/security layer at the instance level. It controls which traffic can reach specific instances/resources. Unlike NACLs, security groups are stateful: return traffic for an allowed connection is allowed automatically.

What is a Firewall?

A firewall is software or hardware that allows or blocks certain kinds of network traffic passing through it.

Example: If the NACL and SG are configured to allow web traffic (HTTP) then HTTP requests will be allowed into the subnet and then into the EC2 instance. If they are configured to deny FTP traffic, then all FTP requests will be blocked.
