Introduction to Security and Architecture on AWS
Course Link: https://www.pluralsight.com/courses/introduction-security-architecture-aws
AWS Architecture Core Concepts
AWS Shared Responsibility Model: AWS is responsible for the security of the cloud, whereas the customer is responsible for security in the cloud.
AWS Well-Architected Framework: It’s a collection of best practices across five key pillars for building systems that create business value on AWS.
Link: https://aws.amazon.com/architecture/well-architected/
Pillars are:
- Operational Excellence: Running and monitoring systems for business value
- Security: Protecting information and business assets
- Reliability: Enabling infrastructure to recover from disruptions
- Performance Efficiency: Using resources efficiently to achieve business value
- Cost Optimization: Achieving minimal costs for the desired value
Reliability on AWS:
- Fault Tolerance: Being able to support the failure of components within your architecture
- High-availability: Keeping your entire solution running in the expected manner despite issues that may occur
Some services can enable fault tolerance in your custom application. Ex: SQS, Route53
Common Compliance standards:
- PCI-DSS: Compliance for processing credit cards
- HIPAA: Compliance for healthcare data
- SOC 1, SOC 2, SOC 3: Third-party reviews of operational processes
- FedRAMP: Standards for US gov data handling
- ISO 27018: Standard for handling PII
Compliance services in AWS:
- AWS Config: Provides conformance packs for standards
- AWS Artifact: Provides self-service access to reports
- Amazon GuardDuty: Provides intelligent threat detection
AWS Identities and User Management
IAM supports identity federation through SAML providers including Active Directory
AWS IAM Identities:
- Users
- Groups
- Roles (enables a user or AWS service to assume permissions for a task)
AWS IAM best practices:
- multi factor authentication (MFA)
- least privilege access
Amazon Cognito enables you to handle authentication and aspects of authorization for your custom web and mobile applications through AWS.
Amazon Cognito:
- It’s a user directory service for custom applications
- Provides UI components for many platforms (like Signup/Login UI for iOS/Android app)
- Provides features to control account access
- Enables controlled access to AWS resources
- Can work with social and enterprise identity providers (like Google, Amazon, FB, Active Directory and SAML 2.0 providers)
Data Architecture on AWS
On-prem data integration services:
- AWS Storage Gateway: Hybrid-cloud storage service
- AWS DataSync: Automated data transfer service
AWS Storage Gateway:
- Integrates cloud storage into your local network
- Deployed as a VM or specific hardware appliance
- Integrates with S3 and EBS
- Supports three different gateway types:
- Tape Gateway: Enables tape backup processes to store data in the cloud on virtual tapes. Tape gateway acts as virtual tape library (VTL).
- Volume Gateway: Provides cloud based iSCSI volumes to local applications.
- File Gateway: Stores files in Amazon S3 while providing cached low-latency local access for certain files.
AWS DataSync:
- Leverages the DataSync agent deployed as a VM on your network
- Integrates with S3, EFS, and FSx for Windows File Server on AWS
- Greatly improved speed of transfer due to custom protocol and optimizations
- Charged per GB of data transferred
Data Processing Services:
- AWS Glue: Managed Extract, Transform and Load (ETL) service
- Amazon EMR (Elastic MapReduce): Big data cloud processing using popular tools
- AWS Data Pipeline: Data workflow orchestration service across AWS services
AWS Glue:
- Fully managed ETL (extract, transform and load) service on AWS
- Supports data in Amazon RDS, DynamoDB, Redshift, and S3
- Supports a serverless model of execution
Amazon EMR:
- Enables big-data processing on Amazon EC2 and S3
- Supports popular open-source frameworks and tools
- Apache Spark
- Apache Hive
- Apache HBase
- Apache Flink
- Apache Hudi
- Presto
- Operates in a clustered environment without additional configuration
- Supports many different big-data use cases
AWS Data Pipeline:
- Managed extract, transform and load (ETL) service on AWS
- Manages data workflow through AWS services
- Supports S3, EMR, Redshift, DynamoDB and RDS
- Can integrate with on-prem data stores
Data Analysis services:
- Amazon Athena: Service that enables querying of data stored in Amazon S3
- Amazon QuickSight: Business intelligence service enabling data dashboards
- Amazon CloudSearch: Managed search service for custom applications
Amazon Athena:
- Fully managed serverless service
- Enables querying of large-scale data stored in Amazon S3
- Queries are written using standard SQL
- Charged based on data scanned for query
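Since Athena is charged by data scanned, a rough per-query estimate is simple arithmetic. A minimal sketch: `estimate_athena_cost` is an illustrative helper (not an AWS API), and the default rate of 5 USD per TB is an assumption, so check the current pricing page before relying on it.

```python
BYTES_PER_TB = 1024 ** 4


def estimate_athena_cost(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes.

    `usd_per_tb` is an assumed rate, not an authoritative price.
    """
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * usd_per_tb
```

Anything that reduces the bytes scanned (partitioning, compression, columnar formats) lowers the estimate proportionally.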
Amazon QuickSight:
- Fully managed business intelligence service
- Enables dynamic data dashboards based on data stored in AWS
- Charged on a per-user and per-session pricing model
- Multiple editions are offered to fit different needs
Amazon CloudSearch:
- Fully managed search service on AWS
- Supports scaling of search infrastructure to meet demand
- Charged per hour and instance type of search infrastructure
- Enables developers to integrate search into custom applications
AI and Machine Learning services:
- Amazon Rekognition: Computer vision service powered by machine learning
- Amazon Translate: Text translation service powered by machine learning
- Amazon Transcribe: Speech to text solution using machine learning
Amazon Rekognition:
- Fully managed image and video recognition deep learning service
- Identifies objects in images
- Identifies objects and actions in videos
- Can detect specific people using facial analysis
- Supports custom labels for your business objects
Amazon Translate:
- Fully managed service for translation of text
- Currently supports 71 languages and variants (as of Dec 2021) - https://aws.amazon.com/translate/details/ & https://docs.aws.amazon.com/translate/latest/dg/what-is.html#language-pairs
- Can perform language identification
- Works in both batch and real-time modes
Amazon Transcribe:
- Fully managed speech recognition service
- Recorded speech is converted into text in custom applications
- Includes a specific sub-service for medical use
- Supports batch and real-time transcription
- Supports 31 languages - https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html#table-language-matrix
Disaster Recovery on AWS
Disaster Recovery architectures:
- Backup and Restore
- Pilot Light
- Warm Standby
- Multi Site
In the above architectures, cost and complexity increase from top to bottom, while recovery time decreases from top to bottom.
Backup and Restore:
- Prod data is backed up into Amazon S3
- Data can be stored in either standard or archival storage classes
- EBS data can also be stored as snapshots in Amazon S3
- In a disaster recovery event, a process is started to launch a new environment
- This approach has the longest recovery time but least cost
Pilot Light:
- Key infrastructure components are kept running in the cloud
- Designed to reduce recovery time over the Backup and Restore approach
- Does incur cost of this infra continually running in the cloud
- AMIs are prepared for additional systems and can be launched quickly
Warm Standby:
- A scaled-down version of the full environment is running in the cloud
- Critical systems can be running on less capable instance types
- Instance types and other systems can be ramped up for disaster recovery event
- Does incur cost of this infrastructure continually running in the cloud
Multi Site:
- Full environment is running in the cloud at all times
- Utilizes instance types needed for production not just recovery
- Provides a near seamless recovery process
- Incurs the most cost over the other approaches
Disaster Recovery Approach considerations:
- Recovery Time Objective (RTO) - amount of time to recover
- Recovery Point Objective (RPO) - amount of data loss (in terms of time)
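The cost vs. recovery-time trade-off across the four architectures can be sketched as picking the cheapest one that still meets the RTO. The cost ordering comes from the notes above; the `typical_rto_hours` figures are hypothetical placeholders for illustration, not AWS numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    relative_cost: int        # 1 = cheapest ... 4 = most expensive
    typical_rto_hours: float  # hypothetical value, for illustration only


STRATEGIES = [
    Strategy("Backup and Restore", 1, 24.0),
    Strategy("Pilot Light", 2, 4.0),
    Strategy("Warm Standby", 3, 1.0),
    Strategy("Multi Site", 4, 0.1),
]


def cheapest_strategy(required_rto_hours: float) -> Optional[Strategy]:
    """Return the lowest-cost strategy whose RTO fits the requirement."""
    candidates = [s for s in STRATEGIES
                  if s.typical_rto_hours <= required_rto_hours]
    if not candidates:
        return None  # no listed strategy recovers fast enough
    return min(candidates, key=lambda s: s.relative_cost)
```

A real decision would also weigh RPO, which this sketch leaves out.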
Architecting Applications on Amazon EC2
Scaling EC2 Infra
- Vertical scaling: “scale up” with larger instances
- Horizontal scaling: “scale out” add more instances
Horizontal scaling services in AWS:
- Auto Scaling Group: Set of EC2 instances with rules for scaling & management
- Elastic Load Balancer: Distributes traffic across multiple targets
Auto Scaling Group:
- Launch template defines the instance config for the group
- Defines max, min and desired number of instances required
- Performs health checks on all instances
- Exists in one or more availability zones within the same region
- Works with on-demand and spot instances
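The min, max and desired settings bound how far the group can scale. A minimal sketch of that bound, assuming a hypothetical helper `clamp_desired_capacity` (not part of any AWS SDK):

```python
def clamp_desired_capacity(desired: int, minimum: int, maximum: int) -> int:
    """Keep a requested instance count within the group's min/max bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(desired, maximum))
```

For example, a scaling rule asking for 10 instances in a group with min 2 and max 6 ends up at 6.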
AWS Secrets Manager:
- Secure way to integrate credentials, secrets, etc
- Integrates natively with RDS, DocumentDB and Redshift
- Can auto-rotate credentials with integrated services
- Enables fine-grained access control to secrets
Security in Amazon VPC:
- Security groups
- Network ACLs
- AWS VPN
Security Groups:
- Serve as firewalls to EC2s and are applied at instance level
- EC2 instances can belong to multiple security groups
- Must be explicitly associated with instances
- By default all outbound traffic is allowed
Network ACL:
- Works at the subnet level within a VPC
- Enables you to allow or deny traffic
- Each VPC has a default ACL that allows all inbound and outbound traffic
- Custom ACLs deny all traffic until rules are added
AWS VPN:
- Creates an encrypted tunnel into your VPC
- Can be used to connect your data center or even individual client machines
- Supported in two services:
- Site-to-site VPN
- Client VPN
Security Services:
- AWS Shield: Managed DDoS protection service for apps on AWS
- Amazon Macie: Data protection service powered by machine learning
- Amazon Inspector: Automated security assessment service for EC2 instances
AWS Shield:
- Provides protection against DDoS attacks for apps running on AWS
- Enables ongoing threat detection and mitigation
- Has two service levels:
- Standard
- Advanced
Amazon Macie:
- Utilizes machine learning to analyze data stored in Amazon S3
- It can detect personally identifiable information (PII) and intellectual property (IP) in S3
- Provides dashboards to show how the data is being stored and accessed
- Enables alerts if it detects anything unusual about data access
Amazon Inspector:
- Enables scanning of Amazon EC2 instances for security vulns
- Charged by instance per assessment run
- Two types of rules packages:
- Network reachability assessment
- Host assessment
Deploying Pre-defined solutions on AWS:
- AWS Service Catalog
- AWS Marketplace
AWS Service Catalog:
- Targeted to serve as an organizational service catalog for the cloud
- Can include single server image to multi-tier custom applications
- Enables organizations to leverage services that meet their compliance requirements
- Supports a lifecycle for services released in the catalog
AWS Developer Services:
- AWS CodeCommit
- AWS CodeBuild
- AWS CodeDeploy
- AWS CodePipeline
- AWS CodeStar
AWS CodeCommit:
- Managed VCS
- Utilizes git for repos
- Control access with IAM policies
- Serves as an alternative to GitHub and Bitbucket
AWS CodeBuild:
- Fully managed build and CI service
- Don’t need to worry about maintaining infra
- Charged per minute for compute resources you utilize
AWS CodeDeploy:
- Managed deployment service for custom apps
- Deploys to Amazon EC2, Fargate, Lambda and on-premises servers
- Provides dashboard for deployments
AWS CodePipeline:
- Fully managed continuous delivery service
- Provides capabilities to automate building, testing and deploying
- Integrates with other tools and GitHub
AWS CodeStar:
- Workflow tool that automates the use of other developer services
- Creates complete continuous delivery toolchain
- Provides custom dashboards and configurations
- Charged for the other services you leverage
Security Policies and Standards Automation on AWS for DevOps Engineers
Course Link: https://www.pluralsight.com/courses/security-policies-standards-automation-aws-devops-engineers
IAM & STS
- IAM users, groups and roles help with authentication
- IAM policies help with authorization
- IAM Roles: Service roles and service-linked roles (predefined roles)
- IAM Policy:
  - Statement ID (SID)
  - Effect (Allow or Deny)
  - Principal
  - Action
  - Resource (optional)
  - Condition (optional)
- IAM Policy types:
  - Identity based
  - Resource based (not supported by all services)
- IAM policies are first evaluated for explicit “Deny” statements. Permission is granted only if there is no explicit deny and there is an explicit allow.
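The evaluation order described above (explicit deny wins, otherwise an explicit allow is required, otherwise the request is denied) can be sketched in a few lines. This is a simplified model, not the real IAM engine: it ignores Principal and Condition, assumes every statement has a Resource, and uses shell-style wildcards. The bucket name and Sids are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical identity-based policy used only for illustration.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AllowReadReports", "Effect": "Allow",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "DenySecrets", "Effect": "Deny",
         "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-bucket/secret/*"},
    ],
}


def _matches(patterns, value):
    """True if `value` matches a pattern or any pattern in a list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def evaluate(statements, action, resource):
    """Return 'ExplicitDeny', 'Allow' or 'ImplicitDeny' for a request."""
    allowed = False
    for stmt in statements:
        if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
            if stmt["Effect"] == "Deny":
                return "ExplicitDeny"  # a matching deny always wins
            allowed = True
    return "Allow" if allowed else "ImplicitDeny"
```

A read of `example-bucket/secret/key.pem` matches both statements, so the deny takes precedence even though an allow also matches.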
IAM Best Practices:
- Enable MFA
- Never use AWS root account access
- Use IAM groups
- Grant least privilege
- Rotate credentials
- Use managed policies (create customer managed policy if required)
- Review & monitor user activity
IAM STS
- Can be used in
- Enterprise Identity Federation and Web Identity Federation
- Cross account access and giving privs to EC2 instances
- AWS STS actions:
- AssumeRole
- AssumeRoleWithSAML
- AssumeRoleWithWebIdentity
- DecodeAuthorizationMessage
- GetSessionToken
- GetCallerIdentity
- GetFederationToken
- GetAccessKeyInfo
Identity Federation in AWS
- Supports SAML and non-SAML based identity federation
- Non-SAML based ID federation is done using AWS Managed Microsoft AD (Secure Windows Trusts)
- AWS recommends Cognito for Web Identity Federation in mobile apps
- AWS allows role switching, where the target AWS account needs to trust the source AWS account and attach a policy to the trusted role. Any IAM user accessing the target AWS account must have permission to call sts:AssumeRole.
- AWS Trusted Advisor Categories:
  - Basic (Security & Service limits)
  - Enterprise/Business (Cost-optimization, Fault tolerance and Performance)
- CloudWatch alarms can be set up in two ways:
  - Simple alarms (for specific metrics)
  - Composite alarms (for combining two or more alarms)
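The difference between simple and composite alarms can be sketched as follows: a simple alarm compares one metric to a threshold, and a composite alarm combines the states of other alarms. This is a toy model with hypothetical function names; real composite alarms are defined with a rule expression (for example `ALARM("a") AND ALARM("b")`), which is not parsed here.

```python
from typing import Dict

ALARM, OK = "ALARM", "OK"


def simple_alarm(metric_value: float, threshold: float) -> str:
    """State of a single-metric alarm: ALARM when the value exceeds the threshold."""
    return ALARM if metric_value > threshold else OK


def composite_all(states: Dict[str, str]) -> str:
    """AND-style composite: ALARM only when every child alarm is in ALARM."""
    return ALARM if states and all(s == ALARM for s in states.values()) else OK


def composite_any(states: Dict[str, str]) -> str:
    """OR-style composite: ALARM when at least one child alarm is in ALARM."""
    return ALARM if any(s == ALARM for s in states.values()) else OK
```

An AND-style composite is a common way to cut noise: page only when, say, high CPU and a high error rate occur together.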
AWS Organizations
- AWS Organizations have Service Control Policies (SCP) that specify maximum permission for an entity
- Features:
- Consolidated billing and account management
- Hierarchical groupings as Organizational Units (OU)
- Centrally control member accounts using SCPs
- IAM integration and support
- Integration with other AWS services
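"Maximum permission" means an SCP never grants anything by itself; it only caps what identity policies can grant. A minimal sketch using plain sets of action names (a simplification: real SCPs are JSON policies with Allow and Deny statements, and `effective_permissions` is a hypothetical helper):

```python
from typing import Iterable, Set


def effective_permissions(identity_allows: Set[str],
                          scp_allows_by_level: Iterable[Set[str]]) -> Set[str]:
    """Intersect identity-policy allows with the SCP allows at every level
    (root, OU, account). An action must be allowed at all levels."""
    effective = set(identity_allows)
    for scp_allows in scp_allows_by_level:
        effective &= scp_allows
    return effective
```

In this model an action that an SCP allows but no identity policy grants is still unavailable, and an action that an identity policy grants but any level's SCP omits is blocked.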
Service Control Policies
- Need to be explicitly enabled in an organization
- Organizational policies to manage permissions
- Can be applied at the root level, OU level or individual account level
- Don’t affect resource-based policies
- Control access for users that are part of the organization
- Don’t affect service-linked roles
- SCPs attached at the root level are inherited by the child entities
- Tag policies help standardize tags on all tagged resources across the organization. Ex: Restrict the “environment” key to the values “dev”, “stag” or “prod” only.
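The “environment” example can be sketched as a simple validation check. This only illustrates the idea of restricting tag values; real tag policies are JSON documents managed through AWS Organizations, and `find_tag_violations` is a hypothetical helper.

```python
# Allowed values per tag key, taken from the example in the notes.
ALLOWED_TAG_VALUES = {"environment": {"dev", "stag", "prod"}}


def find_tag_violations(tags: dict, allowed: dict = ALLOWED_TAG_VALUES) -> list:
    """Return the tag keys whose values are outside the allowed set.

    Keys that are absent from `tags` are not reported in this sketch.
    """
    return sorted(key for key, values in allowed.items()
                  if key in tags and tags[key] not in values)
```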
AWS re:Inforce 2019: The Fundamentals of AWS Cloud Security (FND209-R)
Patterns to know:
- Permissions management - AWS IAM
- Data Encryption - AWS KMS
- Network Security controls - Amazon VPC
Two patterns for cross account access:
- Create policies in both accounts (source and destination) with the same resource and permissions, but in the destination account policy, add a “Principal” JSON object naming the “AWS” account to trust
- For services that don’t support resource-based policies (like DynamoDB at the time of the talk), create a policy in the destination account only, with the required permissions, and attach it to a separate role whose trust policy trusts the source AWS account. Then create a new role in the source AWS account that can assume the role in the destination account.
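The role-based pattern needs two policy documents: a trust policy on the destination role and a permission to assume that role in the source account. A minimal sketch that builds both as Python dicts; the account IDs and role name used in any call are placeholders, and the helper names are hypothetical.

```python
def trust_policy(source_account_id: str) -> dict:
    """Trust policy for the destination role: lets the source account assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{source_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def assume_role_policy(destination_account_id: str, role_name: str) -> dict:
    """Identity policy for the source side: permits assuming the destination role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{destination_account_id}:role/{role_name}",
        }],
    }
```

Both halves are required: the trust policy alone does not let a source-account identity assume the role unless that identity is also allowed to call sts:AssumeRole on it.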
AWS Organizations - lets you organize accounts
AWS re:Inforce 2019: Security Best Practices the Well-Architected Way (SDD318)
What should you do first?
- IAM
- Use automation
- Enable detection
- Prepare for an incident
Incident Response: Use Amazon GuardDuty as a starting point.
Some GuardDuty docs:
- https://docs.aws.amazon.com/guardduty/latest/ug/guardduty-ug.pdf
- https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_finding-types-active.html
Permission Boundaries for IAM: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_boundaries.html
Automate credential management:
- Disable/delete unused access keys
- Cleanup after federation
- Regularly remind people who still have an access key
- Remove leavers
- Constantly reduce permissions
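Finding unused access keys is easy to automate once you have last-used timestamps. A minimal sketch over plain records: the record shape (`id`, `last_used`) and the 90-day threshold are assumptions for illustration, and fetching the data from IAM is left out.

```python
from datetime import datetime, timedelta
from typing import List


def find_stale_keys(keys: List[dict], now: datetime,
                    max_idle_days: int = 90) -> List[str]:
    """Return IDs of keys never used or not used within `max_idle_days`."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for key in keys:
        last_used = key.get("last_used")  # None means the key was never used
        if last_used is None or last_used < cutoff:
            stale.append(key["id"])
    return stale
```

The returned IDs are candidates for the disable/delete step; disabling first and deleting later leaves room to recover from a false positive.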
Account & IAM:
- AWS Organizations
- IAM
- AWS SSO
- MFA
- AWS Secrets Manager
Management:
- AWS CloudFormation
- AWS Systems Manager
- AWS Code Services
- Event
Detective:
- Amazon GuardDuty
- AWS CloudTrail
- Amazon CloudWatch
- AWS Config
Infrastructure and data:
- Amazon Inspector
- AWS Security Hub
- AWS Certificate Manager (Private Certificate Authority)
- AWS KMS
- Amazon Macie
Automate detection of data leaks:
- Consider all sources of data!
- Interesting API calls
- Any service that hosts data
- Pre-signed URL generation
- S3 access logs
- Custom access log information
- Data classification code for analysis
Switch on default Amazon EBS encryption - https://docs.aws.amazon.com/cli/latest/reference/ec2/enable-ebs-encryption-by-default.html
Links to learn more:
- AWS Well Architected Framework - https://wa.aws.amazon.com/index.en.html
- AWS Solutions: https://aws.amazon.com/solutions/
- Labs: https://wellarchitectedlabs.com/
Advanced Network Security on AWS
Course: https://www.pluralsight.com/courses/advanced-networking-security-aws
Amazon VPC Security Best Practices
- Public subnet: Has a local route to other instances in the VPC and an internet gateway route for internet access
- Private subnet: Has a local route and a NAT gateway route for outbound internet access
- Protected/Isolated subnet: Has only a local route, with no route to an internet gateway or NAT gateway
- Network Access Control List (NACL)
  - Subnet-level firewall
  - Stateless traffic filters
  - Rules to blacklist or whitelist traffic
  - Rules are evaluated by sequence number (lowest to highest)
- Security Groups
  - Instance-level firewall
  - Supports only allow rules for ingress and egress (cannot deny)
  - Configuration is required to allow communication
  - Stateful protection
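The three subnet types at the top of this list differ only in their route targets, so a subnet can be classified from its route table. A minimal sketch, assuming route targets are given as ID strings with the usual `igw-` and `nat-` prefixes; the IDs in any call are made up.

```python
from typing import Iterable


def classify_subnet(route_targets: Iterable[str]) -> str:
    """Classify a subnet as public, private or isolated from its route targets."""
    targets = list(route_targets)
    if any(t.startswith("igw-") for t in targets):
        return "public"    # direct route to an internet gateway
    if any(t.startswith("nat-") for t in targets):
        return "private"   # outbound internet access via a NAT gateway
    return "isolated"      # local route only
```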
Difference between SG and NACL:
Security Group | Network ACL |
---|---|
Associated with an EC2 instance via its elastic network interface (ENI) | Associated with a subnet and enforced at the network level |
Supports Allow rules only because it blocks traffic by default | Supports Allow and Deny rules |
Stateful | Stateless |
All rules are evaluated before deciding whether to allow traffic | Rules are processed in order of their sequence number |
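The rule-evaluation row of the table can be sketched with a tiny rule model. Assumptions: rules match on a single port only, the first matching NACL rule in ascending rule-number order decides the outcome, and unmatched traffic is denied. Statefulness is not modeled.

```python
from typing import List, Tuple

# (rule_number, port, action): a deliberately tiny NACL rule model
NaclRule = Tuple[int, int, str]


def nacl_decision(rules: List[NaclRule], port: int) -> str:
    """Walk rules in ascending rule-number order; the first match decides."""
    for _, rule_port, action in sorted(rules):
        if rule_port == port:
            return action
    return "deny"  # nothing matched


def security_group_allows(allowed_ports: List[int], port: int) -> bool:
    """Allow-only model: traffic passes if any rule allows it, with no ordering."""
    return port in allowed_ports
```

With rules `(100, 22, "deny")` and `(200, 22, "allow")`, port 22 is denied because rule 100 is reached first. A security group has no way to express that deny at all.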
Layers of defense for an instance for public traffic: Route table, Network ACL, Security Group
NAT Gateways are scoped to a single Availability Zone, so they are not highly available across AZs by default. Internet Gateways are not scoped to an AZ; they are attached at the VPC level and are highly available.
Notes for each service
Cloud HSM
- Cloud-based Hardware Security Module
- Compliant with HIPAA, PCI and FedRAMP
- Usually deployed as cluster with >= 2 HSMs spread across different availability zones in same region
- Automatically load balanced and keys synced across HSMs in cluster
- Automatically backs up HSMs to an S3 bucket in the same region. Backups can be restored to CloudHSM only.
- When new HSM added to cluster, the backup is restored to make it exact clone of existing HSMs.
- AWS has no visibility/access to encryption keys
- No vendor lock-in. Keys can be exported to other commercial HSMs if required
- AWS KMS can be configured to use CloudHSM as custom key store (along with storing master keys)
- Charged per hour (pay as you go)
- Scale up to 32 HSMs per cluster. Scale down to 0; restore from backup.
- Up to 1024 unique users on HSMs
Amazon Inspector
- Agent-based scanning (with `awsagent` having sudo privs)
- Checks vulnerabilities using CVE database
- Has package rules
- CVE
- CIS Benchmark
- Security Best Practices
- Runtime Behaviour Analysis
- Security Levels (High, Medium, Low, Informational)
- Has supported regions and OS
- Supports roles and instance tagging. You need to tag the instances that need to be included in the assessment.
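Tag-based targeting amounts to filtering instances by required tag key/value pairs. A minimal sketch over plain dicts: the record shape and the tag names in any call are hypothetical, and no AWS API is called.

```python
from typing import Dict, List


def select_targets(instances: List[dict], required_tags: Dict[str, str]) -> List[str]:
    """Return IDs of instances that carry every required tag key/value pair."""
    return [
        inst["id"]
        for inst in instances
        if all(inst.get("tags", {}).get(k) == v for k, v in required_tags.items())
    ]
```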