EC2 Status Checks
The exam can include 2 to 3 questions on status checks, so it is important to understand the differences between system and instance status checks.
EC2 Status Checks¶
- Automated checks to identify hardware and software issues.
Status checks are built into Amazon EC2, so they CANNOT be disabled or deleted.
System Status¶
- Problems with the underlying host.
- The fault lies with AWS infrastructure, not with your instance.
- Some examples are:
- Loss of network connectivity
- Loss of system power
- Software issues on the physical host
- Hardware issues on the physical host that impact network reachability
- Resolution:
- Stop and start the instance, which migrates it to a new host (possible for EBS-backed instances only), or
- Wait for AWS to fix it.
Instance Status¶
- Instance status checks monitor the software and network configuration of your individual instance.
- Some examples are:
- Incorrect networking or startup configuration
- Exhausted memory
- Corrupted file system
- Incompatible kernel
- Fixing these problems requires your intervention.
- Resolution:
- Reboot the instance, or
- Change the instance configuration.
Rebooting can fix instance status check failures. Stopping and starting can fix system status check failures.
- When you stop and start an instance, it is moved to a different host, which can fix system status check failures.
- A reboot does not move the instance to a different host, so it won't fix system status check failures.
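The rule above can be sketched in Python. This is a minimal illustration, not a definitive implementation: `suggest_remediation` and `remediate` are hypothetical helper names, and `ec2_client` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`) with permission to describe, stop, start, and reboot the instance. Treating a failed system check as taking priority over a failed instance check is also an assumption.

```python
from typing import Any, Dict


def suggest_remediation(status: Dict[str, Any]) -> str:
    """Map one entry of describe_instance_status()["InstanceStatuses"] to an action."""
    system = status.get("SystemStatus", {}).get("Status")
    instance = status.get("InstanceStatus", {}).get("Status")
    if system == "impaired":
        # Host-level problem: a reboot stays on the same host, so only a
        # stop/start (or waiting for AWS) can help.
        return "stop_start"
    if instance == "impaired":
        # Instance-level problem: a reboot or a configuration change may help.
        return "reboot"
    return "none"


def remediate(ec2_client: Any, instance_id: str) -> str:
    """Look up the status checks for one instance and apply the suggested action."""
    resp = ec2_client.describe_instance_status(
        InstanceIds=[instance_id], IncludeAllInstances=True
    )
    statuses = resp.get("InstanceStatuses", [])
    if not statuses:
        return "none"
    action = suggest_remediation(statuses[0])
    if action == "stop_start":
        ec2_client.stop_instances(InstanceIds=[instance_id])
        # Wait until the instance is fully stopped before starting it again.
        ec2_client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
        ec2_client.start_instances(InstanceIds=[instance_id])
    elif action == "reboot":
        ec2_client.reboot_instances(InstanceIds=[instance_id])
    return action
```

Keeping the decision in a pure function makes it easy to check the rule without calling AWS.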
CW Metrics and Recovery¶
- CloudWatch metrics (1-minute interval):
- StatusCheckFailed_System
- StatusCheckFailed_Instance
- StatusCheckFailed (for both)
- Option 1: CloudWatch Alarm (Preferred)
- Recovers the EC2 instance with the same private/public IP, EIP, metadata, and placement group.
- The recover action is triggered by the StatusCheckFailed_System metric.
- Can also send notifications using SNS.
- Option 2: Auto Scaling Group
- Set min/max/desired to 1 so that a failed instance is replaced automatically. The replacement won't keep the same private IP or Elastic IP.
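Both recovery options can be sketched as request parameters in Python. This is an illustrative sketch under stated assumptions: the function names, the alarm name pattern, the two evaluation periods, and the `$Latest` launch template version are hypothetical choices, not values taken from these notes. The dictionaries are intended to be passed to the boto3 calls `put_metric_alarm` (CloudWatch client) and `create_auto_scaling_group` (Auto Scaling client).

```python
from typing import Any, Dict, List


def recover_alarm_params(instance_id: str, region: str, sns_topic_arn: str) -> Dict[str, Any]:
    """Option 1: alarm on the system status check, then recover and notify."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # status check metrics are reported every minute
        "EvaluationPeriods": 2,  # assumed: two failing minutes in a row
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [
            f"arn:aws:automate:{region}:ec2:recover",  # EC2 recover action
            sns_topic_arn,  # SNS notification
        ],
    }


def single_instance_asg_params(
    name: str, launch_template_id: str, subnet_ids: List[str]
) -> Dict[str, Any]:
    """Option 2: an Auto Scaling group pinned to exactly one instance."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id, "Version": "$Latest"},
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        "HealthCheckType": "EC2",  # replace the instance when EC2 health checks fail
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
```

With boto3 installed and credentials configured, these would be used roughly as `boto3.client("cloudwatch").put_metric_alarm(**recover_alarm_params(...))` and `boto3.client("autoscaling").create_auto_scaling_group(**single_instance_asg_params(...))`.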
Last updated: 2023-03-20