AWS Day-to-Day Operational Tasks
This guide outlines common daily tasks for a Cloud Engineer or SysOps Administrator working with AWS.
1. Monitoring and Observability
- CloudWatch Dashboards: Review custom dashboards for key metrics like EC2 CPU utilization, RDS connections, and ALB latency.
- Alarms: Check the status of CloudWatch Alarms. Investigate any "In Alarm" states (e.g., HighCPU, DiskSpaceLow).
- CloudTrail: Review CloudTrail logs for unauthorized API calls or unexpected changes to resources (e.g., a Security Group rule change).
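
The alarm check can be scripted rather than done by eye. The following is a minimal sketch, assuming boto3 is installed and credentials are already configured. The helper names (`alarms_in_alarm`, `fetch_metric_alarms`) and the sample data are hypothetical and only illustrate the shape of a CloudWatch `describe_alarms` response.

```python
from datetime import datetime, timezone


def alarms_in_alarm(metric_alarms):
    """Return (name, reason) pairs for alarms in the ALARM state, oldest first."""
    firing = [a for a in metric_alarms if a.get("StateValue") == "ALARM"]
    firing.sort(key=lambda a: a["StateUpdatedTimestamp"])
    return [(a["AlarmName"], a.get("StateReason", "")) for a in firing]


def fetch_metric_alarms():
    """Fetch metric alarms currently in ALARM (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the filter above works offline

    paginator = boto3.client("cloudwatch").get_paginator("describe_alarms")
    alarms = []
    for page in paginator.paginate(StateValue="ALARM"):
        alarms.extend(page["MetricAlarms"])
    return alarms


# Hypothetical sample data shaped like the "MetricAlarms" list in the response.
SAMPLE_ALARMS = [
    {"AlarmName": "HighCPU", "StateValue": "ALARM",
     "StateReason": "Threshold Crossed",
     "StateUpdatedTimestamp": datetime(2024, 5, 2, tzinfo=timezone.utc)},
    {"AlarmName": "DiskSpaceLow", "StateValue": "ALARM",
     "StateReason": "Threshold Crossed",
     "StateUpdatedTimestamp": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"AlarmName": "ALBLatency", "StateValue": "OK",
     "StateUpdatedTimestamp": datetime(2024, 5, 3, tzinfo=timezone.utc)},
]

for name, reason in alarms_in_alarm(SAMPLE_ALARMS):
    print(f"{name}: {reason}")
```

Sorting by the state-change time puts the longest-running alarms first, which is usually where an investigation should start. In real use, pass the result of `fetch_metric_alarms()` instead of the sample list.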
2. Security and Compliance
- Trusted Advisor: Check AWS Trusted Advisor for security recommendations (e.g., "Security Groups - Specific Ports Unrestricted", "IAM Use").
- IAM Management: Rotate IAM access keys that are older than 90 days. Ensure MFA is enabled for all users with console access.
- Patching: Use AWS Systems Manager (SSM) Patch Manager to verify compliance of EC2 instances and schedule maintenance windows for OS updates.
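
The 90-day key rotation rule above lends itself to a small check. This is a sketch under stated assumptions: boto3 is available, the caller has permission to list IAM users and access keys, and the helper names (`keys_due_for_rotation`, `fetch_all_key_metadata`) are hypothetical. The input dictionaries follow the `AccessKeyMetadata` entries returned by IAM `list_access_keys`.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # threshold taken from the guideline above


def keys_due_for_rotation(key_metadata, now=None, max_age=MAX_KEY_AGE):
    """Return (user, key id) pairs for active keys older than max_age."""
    now = now or datetime.now(timezone.utc)
    return [
        (k["UserName"], k["AccessKeyId"])
        for k in key_metadata
        if k["Status"] == "Active" and now - k["CreateDate"] > max_age
    ]


def fetch_all_key_metadata():
    """Collect access key metadata for every IAM user (requires boto3)."""
    import boto3  # third-party; imported lazily so the check above works offline

    iam = boto3.client("iam")
    metadata = []
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            keys = iam.get_paginator("list_access_keys").paginate(
                UserName=user["UserName"]
            )
            for key_page in keys:
                metadata.extend(key_page["AccessKeyMetadata"])
    return metadata
```

Inactive keys are skipped here because they cannot be used to sign requests; whether to delete them is a separate policy decision. A typical call would be `keys_due_for_rotation(fetch_all_key_metadata())`.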
3. Cost Optimization
- Cost Explorer: Review daily spend in Cost Explorer. Group by "Service" to see what is driving costs. Look for anomalies, such as a service whose daily cost jumps sharply compared with the preceding days.
- Unused Resources: Identify and delete unattached EBS volumes, unassociated Elastic IPs, and idle Load Balancers.
- Spot Instances: Check if non-critical workloads running on On-Demand instances can be moved to Spot Instances for savings.
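
The unused-resource check can be sketched as a report that lists candidates without deleting anything. It assumes boto3 and read access to EC2; the helper names are hypothetical. The filters rely on two properties of the EC2 API: an unattached EBS volume has `State` equal to `available`, and an Elastic IP that is associated with something carries an `AssociationId` field.

```python
def unattached_volume_ids(volumes):
    """Return IDs of EBS volumes that are not attached to any instance."""
    return [v["VolumeId"] for v in volumes if v["State"] == "available"]


def unassociated_ips(addresses):
    """Return Elastic IPs that are allocated but not associated with a resource."""
    return [a["PublicIp"] for a in addresses if "AssociationId" not in a]


def fetch_volumes_and_addresses():
    """Fetch all volumes and Elastic IPs in the current region (requires boto3)."""
    import boto3  # third-party; imported lazily so the filters above work offline

    ec2 = boto3.client("ec2")
    volumes = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    addresses = ec2.describe_addresses()["Addresses"]
    return volumes, addresses


# Hypothetical sample data shaped like the EC2 responses.
SAMPLE_VOLUMES = [
    {"VolumeId": "vol-0aaa", "State": "in-use"},
    {"VolumeId": "vol-0bbb", "State": "available"},
]
SAMPLE_ADDRESSES = [
    {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-0aaa",
     "AssociationId": "eipassoc-0aaa"},
    {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-0bbb"},
]

print("Unattached volumes:", unattached_volume_ids(SAMPLE_VOLUMES))
print("Unassociated Elastic IPs:", unassociated_ips(SAMPLE_ADDRESSES))
```

Keeping the script read-only is a deliberate choice: an unattached volume may still hold data someone needs, so a human review (or a snapshot) before deletion is the safer workflow.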
4. Backup and Recovery
- AWS Backup: Verify that daily backup jobs (for EBS, RDS, DynamoDB, EFS) completed successfully via the AWS Backup console.
- Snapshot Lifecycle: Ensure old snapshots are being deleted according to the retention policy to manage storage costs.
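
One way to confirm the retention policy is being enforced is to list snapshots that have outlived it. The sketch below assumes boto3 and a hypothetical 30-day retention period; substitute the value from your own policy. The helper names are also hypothetical, and the input follows the `Snapshots` entries returned by EC2 `describe_snapshots`.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # hypothetical value; use your actual retention policy


def snapshots_past_retention(snapshots, retention_days=RETENTION_DAYS, now=None):
    """Return IDs of snapshots whose StartTime is older than the retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


def fetch_own_snapshots():
    """Fetch EBS snapshots owned by this account (requires boto3)."""
    import boto3  # third-party; imported lazily so the check above works offline

    paginator = boto3.client("ec2").get_paginator("describe_snapshots")
    snapshots = []
    for page in paginator.paginate(OwnerIds=["self"]):
        snapshots.extend(page["Snapshots"])
    return snapshots
```

A non-empty result from `snapshots_past_retention(fetch_own_snapshots())` suggests that the lifecycle rule is not running or does not cover those snapshots. Snapshots kept on purpose (for example, for audits) would need to be excluded, for instance by tag.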
5. Support and Troubleshooting
- Support Center: Check the status of any open support cases with AWS Support.
- AWS Health Dashboard (formerly Personal Health Dashboard): Check for any scheduled maintenance events (e.g., EC2 hardware maintenance) that might affect your resources.