Compute Fundamentals

Elastic Compute Cloud (EC2) Fundamentals

EC2 is one of the most widely used services within AWS. As an Infrastructure as a Service (IaaS) product, it provides long-running compute as a service. An EC2 instance (CPU, RAM, and a root drive) runs on an EC2 host, which can also provide an optional instance store volume (ephemeral storage).

  • Create a key pair and download the private key.
  • Go to VPC and create a default VPC.

Now go back to EC2 and launch an instance.

  1. Select an Amazon Machine Image (AMI). This determines the OS and the architecture, either 64-bit (x86) or 64-bit (Arm).
  2. Select an instance type. This determines the number of vCPUs and the RAM size (memory).
  3. Configure instance details. Set the network to the default VPC, the subnet to a specific Availability Zone, and Auto-assign Public IP to Use subnet setting (Enable).
  4. Add storage. By default a root volume is attached to the instance (an EBS volume for EBS-backed AMIs, an instance store volume for instance store-backed AMIs). Additional EBS volumes can be attached as optional storage.
  5. Add tags. Enter a key and value to mark the instance as part of a project or purpose.
  6. Configure a security group (SG). An SG is a filter that allows traffic entering or leaving the instance, and it is attached to the instance's ENI (Elastic Network Interface). For a Linux AMI, allow SSH on TCP port 22 (for a Windows AMI, allow RDP). Select My IP rather than all IPs (0.0.0.0/0) as the source for incoming SSH requests.
  7. Launch the instance.
  8. Click the Monitoring tab of the newly created instance.

We are not billed while an instance is in the pending, stopping (when preparing to stop), stopped, shutting-down, or terminated state (this excludes Reserved Instances). EBS volumes incur charges regardless of the instance's state.
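The billing rule above can be sketched as a small lookup. This is only an illustration of the list in the text for on-demand instances; the function name billing_status is hypothetical, and the state names follow the spelling the EC2 API reports (for example shutting-down).

```shell
# Hypothetical helper: map an instance state to its billing status,
# following the list in the text (on-demand instances only).
# Note: "stopping" is treated as not billed, which matches the
# "preparing to stop" case; stopping in order to hibernate is an exception.
billing_status() {
  case "$1" in
    running) echo "billed" ;;
    pending|stopping|stopped|shutting-down|terminated) echo "not billed" ;;
    *) echo "unknown" ;;
  esac
}

for state in pending running stopping stopped shutting-down terminated; do
  echo "$state: $(billing_status "$state")"
done
```

Storage is deliberately left out of this sketch, because EBS charges apply in every state.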

EC2 instances are grouped into families, each designed for a specific broad type of workload. The type determines the set of features, and the size determines the level of workload an instance can cope with.

The current EC2 families are general purpose, compute optimized, memory optimized, storage optimized and accelerated computing.

Instance types include:

  • T2 and T3: Low-cost instance types that provide burst capability (general purpose)
  • M5: For general workloads (general purpose)
  • C4: Provides a more capable CPU (compute optimized)
  • X1 and R4: Optimized for large amounts of fast memory (memory optimized)
  • I3: Delivers fast IO (storage optimized)
  • P2, G3 and F1: Deliver GPUs and FPGAs (Field Programmable Gate Arrays) (accelerated computing)
  • A1: Arm based, built on the AWS Nitro System (general purpose)

Instance sizes include nano, micro, small, medium, large, xlarge, 2xlarge, and larger.

Special Cases:

  • "a": Use AMD CPUs
  • "A": Arm based
  • "n": higher speed networking
  • "d": NVMe storage
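As a rough illustration of how the type, suffix letters, and size fit together in a name such as m5ad.2xlarge, the sketch below splits a name into its parts. The function name parse_instance_type and the labels series, generation, suffix, and size are hypothetical, and the pattern assumes the simple letters-digits-letters.size shape; unusual names are not handled.

```shell
# Hypothetical helper: split an instance type name into its parts.
# Assumes the shape <letters><digits><optional letters>.<size>.
parse_instance_type() {
  prefix=${1%%.*}   # part before the dot, e.g. "m5ad"
  size=${1#*.}      # part after the dot, e.g. "2xlarge"
  series=$(printf '%s' "$prefix" | sed 's/[0-9].*$//')
  generation=$(printf '%s' "$prefix" | sed 's/^[a-z]*\([0-9]*\).*$/\1/')
  suffix=$(printf '%s' "$prefix" | sed 's/^[a-z]*[0-9]*//')
  echo "series=$series generation=$generation suffix=$suffix size=$size"
}

parse_instance_type m5ad.2xlarge   # series=m generation=5 suffix=ad size=2xlarge
parse_instance_type t3.micro       # series=t generation=3 suffix= size=micro
```

In the first example the suffix "ad" combines two of the special cases listed above: AMD CPUs and NVMe storage.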

Be aware of CPU credits on burstable instance types (T2 and T3). Each instance has a baseline CPU utilization (for example, around 20-30% for a T3 instance). When the CPU runs above the baseline it starts consuming CPU credits, and once all CPU credits are gone the CPU is throttled.
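To get a feel for how quickly credits drain, here is a minimal sketch under simplifying assumptions: a single vCPU, a constant utilization, and the rule that one CPU credit equals one vCPU at 100% for one minute. The function name minutes_until_throttle and the example numbers are illustrative, not taken from any specific instance size.

```shell
# Hypothetical helper: minutes until the credit balance reaches zero.
# Arguments: starting credits, baseline utilization (%), actual utilization (%).
minutes_until_throttle() {
  credits=$1
  baseline_pct=$2
  util_pct=$3
  if [ "$util_pct" -le "$baseline_pct" ]; then
    echo "never"   # at or below baseline, credits are not drained
    return
  fi
  # Net drain per minute is (util - baseline) / 100 credits.
  echo $(( credits * 100 / (util_pct - baseline_pct) ))
}

minutes_until_throttle 60 20 100   # 75
minutes_until_throttle 60 20 15    # never
```

With 60 credits and a 20% baseline, running flat out drains 0.8 credits per minute, so the balance lasts 75 minutes in this model.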

Instance Storage Architecture

Instance store volumes are optional volumes that are directly attached to the EC2 host in addition to root storage. To check the instance store volume, launch an instance type that includes instance store (for example M5d; plain M5 is EBS-only) and run the following commands.

df -h                                # display all mounted file systems
lsblk                                # list block devices; the additional instance store volume appears here
sudo mkdir /ephemeral                # create a directory named /ephemeral
sudo mkfs -t ext4 /dev/nvme1n1       # create a file system on the instance store volume (nvme1n1)
sudo mount /dev/nvme1n1 /ephemeral   # mount the file system on /ephemeral
df -h                                # shows /dev/nvme1n1 mounted on /ephemeral
cd /ephemeral                        # change directory to /ephemeral
sudo nano important.txt              # create important.txt with the nano text editor

Type some text in the file and save the file

ls -la        # shows important.txt in the /ephemeral directory
sudo reboot   # reboot the OS of the instance

Once the OS has rebooted and we are logged back in:

df -k                                # list mounted file systems; /ephemeral is not mounted, so mount it again
sudo mount /dev/nvme1n1 /ephemeral
cd /ephemeral/
ls -la                               # shows our important.txt file
cat important.txt                    # shows the text inside the file

Now reboot the instance from the AWS EC2 console and, once it has rebooted, log in to the instance again.

df -k                                # list mounted file systems; /ephemeral is not mounted, so mount it again
sudo mount /dev/nvme1n1 /ephemeral
cd /ephemeral/
ls -la                               # shows our important.txt file
cat important.txt                    # shows the text inside the file

However, if we stop an instance, the instance store volume is detached, and when we start the instance again a new instance store volume is allocated, so our file important.txt is lost.

Elastic Block Storage (EBS)

Elastic Block Storage (EBS) is a storage service that creates and manages volumes based on four underlying storage types. Volumes are persistent, can be attached and removed from EC2 instances, and are replicated within a single AZ.

Volume Types:

  • Mechanical (HDD): sc1 and st1; solid state (SSD): gp2 and io1
  • sc1: lowest cost, infrequent access, cannot be a boot volume
  • st1: low cost, throughput intensive, cannot be a boot volume
  • gp2: default, balances IOPS and MiB/s, IOPS scale per GB with a burst pool
  • io1: highest performance, size and IOPS can be adjusted separately

To protect against AZ failure, EBS snapshots (to S3) can be used. Snapshot data is replicated across AZs in the region and can optionally be copied to other regions.

We can also create an EBS volume from the EC2 dashboard (left-hand menu, then Create Volume). Make sure the EBS volume is created in the same AZ as the EC2 instance so that it can be attached to that instance.

Storage performance is measured in two main ways, linked by the block size.

IOPS - number of input/output operations per second.

Throughput - data rate in MB/s.

Block size - size of each operation, for example 256 KB.

Throughput = block size x IOPS, for example 256 KB x 400 IOPS = 102,400 KB/s, or about 102 MB/s.

For gp2, baseline IOPS scale with volume size: 100 GB x 3 IOPS/GB = 300 IOPS.
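The relationship between block size, IOPS, and throughput is plain multiplication, as this small sketch shows. The function name throughput_kb_per_s is hypothetical, and the conversion to MB/s uses decimal units with integer division, so it rounds down.

```shell
# Hypothetical helper: throughput in KB/s = block size (KB) x IOPS.
throughput_kb_per_s() {
  echo $(( $1 * $2 ))
}

kbps=$(throughput_kb_per_s 256 400)
echo "$kbps KB/s"                # 102400 KB/s
echo "$(( kbps / 1000 )) MB/s"   # 102 MB/s
```

The same IOPS figure with a smaller block size gives proportionally less throughput, which is why both numbers matter when sizing a volume.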

When selecting a volume type, consider whether the workload is throughput demanding or IOPS demanding.

EBS supports a maximum per instance throughput of 1750 MB/s and 80K IOPS.

General Purpose (gp2): (SSD)

  • 3 IOPS/GB (100 IOPS - 16K IOPS)
  • Bursts up to 3K IOPS (credit based)
  • 1 GB - 16 TB size, max throughput per volume of 250 MB/s
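The gp2 baseline described in the list above (3 IOPS per GB, with a floor of 100 and a ceiling of 16K) can be sketched as a clamp. The function name gp2_baseline_iops is hypothetical, and burst behaviour is not modelled.

```shell
# Hypothetical helper: baseline IOPS for a gp2 volume of the given size in GB.
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))          # 3 IOPS per GB
  if [ "$iops" -lt 100 ]; then
    iops=100                  # minimum baseline
  fi
  if [ "$iops" -gt 16000 ]; then
    iops=16000                # maximum baseline
  fi
  echo "$iops"
}

gp2_baseline_iops 10     # 100
gp2_baseline_iops 100    # 300
gp2_baseline_iops 8000   # 16000
```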

Provisioned IOPS SSD (io1):(SSD)

  • Used for applications that require sustained IOPS performance
  • Large database workloads
  • Volume size of 4 GB - 16 TB, with up to 64K IOPS per volume on Nitro-based instances (32K IOPS on other instances)
  • Max throughput per volume of 1,000 MB/s

Throughput Optimized (st1):(HDD)

  • Low storage cost
  • Used for frequently accessed, throughput intensive workloads (streaming big data)
  • cannot be a boot volume
  • Volume size of 500 GB - 16 TB
  • Per volume max throughput of 500 MB/s and 500 IOPS (these volumes are rated by throughput rather than IOPS)

Cold HDD (sc1):(HDD)

  • Lowest cost
  • Infrequently accessed data
  • Cannot be a boot volume
  • Volume size of 500 GB - 16 TB
  • Per volume max throughput of 250 MB/s and 250 IOPS (these volumes are rated by throughput rather than IOPS)

EBS snapshots

EBS snapshots are a point in time backup of an EBS volume stored in S3. The initial snapshot is a full copy of the volume. Future snapshots only store the data changed since the last snapshot.

Snapshots can be used to create new volumes and are a great way to move or copy instances between AZs. When creating a snapshot of the root/boot volume of an instance, or of a busy volume, it's recommended that the instance is powered off (in the stopped state) or that disks are flushed.

Snapshots can be copied between regions, shared and automated using Data Lifecycle Manager (DLM).

Snapshots use an incremental backup mechanism, with one difference from a traditional incremental chain: a snapshot missing from the chain does not hamper data retrieval, and no data is lost.

Security groups

Security groups are software firewalls that can be attached to network interfaces and (by association) products in AWS. Security groups each have inbound rules and outbound rules. A rule allows traffic to or from a source (IP, network, named AWS entity) and protocol. There can be a maximum of 5 security groups for each Elastic Network Interface.

Security groups have a hidden implicit/default deny rule but cannot explicitly deny traffic. By default, a security group allows no inbound traffic until a rule is added, and it includes an outbound rule that allows all outbound traffic.

Security groups are stateful, meaning that for any traffic allowed in or out, the return traffic is automatically allowed. Security groups can reference AWS resources, other security groups, and even themselves. Several instances can belong to the same security group, even across different Availability Zones in a VPC.
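The allow-only behaviour with an implicit deny can be illustrated with a toy model. This is a sketch, not how AWS evaluates rules internally: it only matches on port, ignores sources and protocols, and the function name sg_inbound_decision is hypothetical.

```shell
# Hypothetical model: decide whether inbound traffic to a port is allowed.
# Arguments: space-separated list of ports that have an allow rule, then the port.
sg_inbound_decision() {
  allowed_ports=$1
  port=$2
  for p in $allowed_ports; do
    if [ "$p" = "$port" ]; then
      echo "allow"   # an allow rule matched
      return
    fi
  done
  echo "deny"        # implicit deny: no rule matched
}

sg_inbound_decision "22 80" 22     # allow
sg_inbound_decision "22 80" 3306   # deny
sg_inbound_decision "" 22          # deny (new group with no inbound rules)
```

There is no way to express an explicit deny in this model, which mirrors the point above: rules can only allow, and everything else falls through to the implicit deny.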

Instance Metadata

Instance metadata is data relating to the instance that can be accessed from within the instance itself using a utility capable of accessing HTTP and using the URL:

http://169.254.169.254/latest/meta-data

AMI used to create the instance:

http://169.254.169.254/latest/meta-data/ami-id

Instance ID:

http://169.254.169.254/latest/meta-data/instance-id

Instance type:

http://169.254.169.254/latest/meta-data/instance-type

Instance metadata is a way that scripts and applications running on EC2 can get visibility of data they would normally need API calls for.

The metadata can provide the current external IPv4 address for the instance, which is not configured on the instance itself but provided by the internet gateway in the VPC. It provides the Availability Zone the instance was launched in and the security groups applied to the instance. In the case of spot instances, it also provides the approximate time the instance will terminate.

To retrieve the public IP of a Linux instance:

curl -s http://169.254.169.254/latest/meta-data/public-ipv4
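Instances can also be configured to require session tokens for metadata requests (IMDSv2), in which case the plain request above is rejected. The sketch below wraps the token step in a helper; the function names imds_url and imds_get are hypothetical, and the 21600-second token lifetime is just an example value. It only works when run on an EC2 instance.

```shell
IMDS_BASE="http://169.254.169.254/latest"

# Hypothetical helper: build the metadata URL for a given path.
imds_url() {
  printf '%s/meta-data/%s\n' "$IMDS_BASE" "$1"
}

# Hypothetical helper: fetch a metadata value using a session token.
imds_get() {
  # Request a session token first, then send it with the metadata request.
  token=$(curl -s -X PUT "$IMDS_BASE/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
  curl -s -H "X-aws-ec2-metadata-token: $token" "$(imds_url "$1")"
}

# Usage on an instance:
#   imds_get public-ipv4
#   imds_get instance-id
```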

Lab: Launching a Linux instance, connecting via SSH (OpenSSH) and installing a web server (httpd) on that instance.

  • Make sure the default VPC exists with an IGW attached.
  • Launch a Linux instance and connect via SSH (PuTTY or OpenSSH).
  • Run the commands below to install a web server on the instance.
#!/bin/bash
sudo yum update -y           # apply available package updates
sudo yum install -y httpd    # install the Apache web server
sudo yum install -y wget     # install wget for downloading files
sudo chkconfig httpd on      # start httpd automatically at boot
sudo service httpd start     # start the web server now

We can now put an index.html file in the /var/www/html folder using the commands below.

cd /var/www/html
sudo wget http://file location

Now copy the public IPv4 address from the instance details, paste it into a new browser window, and press Enter to see the new website load.
