Coalfire

Architecture Diagram


Solution overview

Coalfire-CF/terraform-aws-account-setup module used to create the initial account configuration.

Coalfire-CF/terraform-aws-vpc-nfw module used to create a VPC named primary in us-west-2 with:

  • a CIDR range of 10.1.0.0/16

  • 3 subnets:

    Name         AZ          CIDR         Type
    Management   us-west-2a  10.1.0.0/24  public
    Application  us-west-2b  10.1.1.0/24  private
    Backend      us-west-2b  10.1.2.0/24  private

  • a NAT gateway in us-west-2b

  • an IGW

  • Route tables and routes
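
A module call along the following lines produces the VPC described above. The input names are illustrative and should be checked against the vpc-nfw module's documented variables.

```hcl
# Sketch of the VPC module call; input names are illustrative and may differ
# from the actual Coalfire-CF/terraform-aws-vpc-nfw variables.
module "primary_vpc" {
  source = "github.com/Coalfire-CF/terraform-aws-vpc-nfw"

  name = "primary"
  cidr = "10.1.0.0/16"

  azs             = ["us-west-2a", "us-west-2b"]
  public_subnets  = ["10.1.0.0/24"]                 # Management
  private_subnets = ["10.1.1.0/24", "10.1.2.0/24"]  # Application, Backend

  # NAT gateway in us-west-2b for private-subnet egress; the module also
  # creates the IGW, route tables, and routes.
  enable_nat_gateway = true
  single_nat_gateway = true
}
```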

Coalfire-CF/terraform-aws-securitygroup module used to create a SG which allows traffic on port 443 from the ALB to app instances and SSH traffic on port 22 from the Management subnet.
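
The module wraps standard security group resources; spelled out as raw resources for clarity, the intended rules look roughly like this (resource names and the ALB SG reference are illustrative).

```hcl
# Illustrative expansion of the SG rules the module creates:
# 443 from the ALB's security group, 22 from the Management subnet CIDR.
resource "aws_security_group" "app" {
  name   = "app-instances"
  vpc_id = module.primary_vpc.vpc_id # illustrative output name

  ingress {
    description     = "HTTPS from the ALB"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id] # assumed ALB SG
  }

  ingress {
    description = "SSH from the Management subnet"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.1.0.0/24"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```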

terraform-aws-modules/autoscaling/aws module used to create an ASG which manages 2-6 Application t2.micro Amazon Linux 2 instances, which run a script to install apache at creation time.
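
A sketch of the ASG module call with the Apache bootstrap script. Input names follow terraform-aws-modules/autoscaling/aws conventions but vary by module version, and the AMI data source and subnet references are assumed.

```hcl
# Sketch of the ASG: 2-6 t2.micro instances in the Application subnet,
# bootstrapped with Apache via user data.
module "app_asg" {
  source = "terraform-aws-modules/autoscaling/aws"

  name = "application"

  min_size         = 2
  max_size         = 6
  desired_capacity = 2

  vpc_zone_identifier = [module.primary_vpc.private_subnets[0]] # Application subnet
  instance_type       = "t2.micro"
  image_id            = data.aws_ami.amazon_linux_2.id # assumed AMI data source

  user_data = base64encode(<<-EOT
    #!/bin/bash
    yum install -y httpd
    systemctl enable --now httpd
  EOT
  )
}
```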

terraform-aws-modules/acm/aws module used to create an ACM cert to be attached to the ALB for HTTPS traffic.
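
A sketch of the ACM module call; the domain and hosted zone values are placeholders, since the zone is assumed to be managed outside this project (see Assumptions).

```hcl
# Sketch of the ACM certificate used by the ALB's HTTPS listener.
# Domain and zone values are placeholders.
module "acm" {
  source = "terraform-aws-modules/acm/aws"

  domain_name       = "app.example.com"
  zone_id           = var.route53_zone_id # zone managed outside this project
  validation_method = "DNS"
}
```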

terraform-aws-modules/alb/aws module used to create an ALB in the public Management subnet which routes traffic to the Application instances managed by the ASG.
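
A sketch of the ALB module call. Listener and target-group syntax differs between major versions of terraform-aws-modules/alb/aws, so this only illustrates the intent; the security group reference is assumed.

```hcl
# Sketch of the ALB in the public Management subnet, terminating HTTPS with
# the ACM cert and forwarding to the ASG's target group. Indicative only;
# check the inputs for the module version in use.
module "alb" {
  source = "terraform-aws-modules/alb/aws"

  name            = "application-alb"
  vpc_id          = module.primary_vpc.vpc_id
  subnets         = module.primary_vpc.public_subnets # Management subnet
  security_groups = [aws_security_group.alb.id]       # assumed ALB SG

  https_listeners = [
    {
      port               = 443
      certificate_arn    = module.acm.acm_certificate_arn
      target_group_index = 0
    }
  ]
}
```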

I create a key pair to be used with the EC2 instances so that SSH connections can be established.
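
For reference, the key pair is just an aws_key_pair resource pointing at a local public key; the key name and path below are illustrative.

```hcl
# Key pair for SSH access to the EC2 instances; the name and key path are
# illustrative and specific to my machine.
resource "aws_key_pair" "ssh" {
  key_name   = "coalfire-challenge"
  public_key = file("~/.ssh/id_rsa.pub")
}
```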

Coalfire-CF/terraform-aws-ec2 module used to create a single t2.micro Amazon Linux 2 instance in the Management subnet, which allows SSH access from my local machine.
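
A sketch of the management instance module call; input names are illustrative and should be checked against the Coalfire module's actual variables.

```hcl
# Sketch of the management/bastion instance in the public Management subnet.
# Input names are illustrative and may differ from the module's variables.
module "management_instance" {
  source = "github.com/Coalfire-CF/terraform-aws-ec2"

  name          = "management"
  ami           = data.aws_ami.amazon_linux_2.id       # assumed AMI data source
  instance_type = "t2.micro"
  subnet_id     = module.primary_vpc.public_subnets[0] # Management subnet
  key_name      = aws_key_pair.ssh.key_name

  # SG allowing SSH (22) from my local IP; assumed to be defined elsewhere.
  vpc_security_group_ids = [aws_security_group.management.id]
}
```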

Deployment instructions

  • cd env/challenge
  • terraform init
  • terraform plan
  • terraform apply

Design decisions

The spec called for "3 subnets, spread evenly across two availability zones". Spreading 3 subnets across 2 AZs is a bit ambiguous, so I opted to put the Management subnet in one AZ and the Application and Backend subnets in the other. This co-locates the app and backend layers in the same AZ to cut down on latency and data transfer costs.

I opted to use Amazon Linux 2 for the EC2 instances since it is purpose-built by Amazon for cloud workloads on EC2, has strong security defaults, and is sufficient for general-purpose use.

For the SSH access to the Application instances from the Management subnet, I opted to allow access from the Management subnet CIDR instead of the single instance to make managing that connection simpler. If another management instance were added, it would have access without needing to update the SG. Optionally, this access could be reduced to the single instance for improved security.

I provisioned the ALB in the Management subnet, since that is the only public subnet, allowing external traffic to reach the ALB, which then routes it to the application instances. If the traffic were purely internal, the ALB could be moved into the Application subnet for improved security.

References to resources used

Assumptions made

  • For provisioning the ACM cert to be used for https traffic, I assumed the domain would be managed by a higher level infrastructure stack outside of this project.

Analysis of deployed infra

1) What security gaps exist?

Instead of having a management instance in a public subnet with SSH access to the application instances, we could use AWS Systems Manager Session Manager to connect to the application instances via the AWS console or CLI. This would allow us to remove the management instance, remove the SG rules that allow SSH access, and use IAM to control access to the application instances.
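
Concretely, attaching the AWS-managed SSM core policy to the instances' role and instance profile is what enables Session Manager; the names below are illustrative.

```hcl
# Sketch: instance role and profile that enable Session Manager access,
# replacing SSH from the management instance. Names are illustrative.
resource "aws_iam_role" "app_instance" {
  name = "app-instance-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ssm_core" {
  role       = aws_iam_role.app_instance.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "app_instance" {
  name = "app-instance-profile"
  role = aws_iam_role.app_instance.name
}
```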

2) What availability issues exist?

Having the app and backend layers running in a single AZ each does not provide strong availability. I would provision Application and Backend subnets in at least 2 AZs and split any deployments across both AZs.

While looking into possible availability issues with having the ALB in a single public subnet, I found that this setup isn't even allowed: an ALB requires subnets in at least two Availability Zones. We would need to provision another public subnet in the other AZ so the ALB can be created and meet AWS's availability requirements.

Having a single management instance could pose an availability risk, but its use would likely be infrequent, so the risk is low.

3) Cost optimization opportunities?

Remove the management instance and use AWS session manager.

Allow ASG to scale to 0 if a delay due to instance startup is permissible, otherwise allow it to scale to 1.

Improvement Plan

1) List specific changes you'd make to improve security, resilience, cost, and maintainability.

  • Multiple AZs for app and backend availability
  • Migrate to use AWS session manager and remove bastion instance

2) Prioritize them (what would you fix first and why?).

I try to prioritize work by weighing impact against the difficulty of completing the task. Migrating to AWS Session Manager has a lower overall impact, but would be a relatively quick change to implement, so I would knock it out first. Then I would take on the larger, more impactful change of restructuring the AZs/subnets and deployments. Doing the Session Manager migration first also avoids the throwaway work of extending the management instance's SSH access to instances in additional AZs, since that access would eventually be removed anyway.

3) Include at least 2 implemented improvements in code or scripts (e.g., tightening SG rules, adding CloudWatch alarms, setting bucket policies).
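
As a sketch of one such improvement, a CloudWatch alarm on sustained high ASG CPU could look like the following; the threshold, the ASG output name, and the SNS topic are assumptions.

```hcl
# Sketch of a CloudWatch alarm on sustained high CPU across the ASG.
# The threshold and the SNS topic for notifications are assumptions.
resource "aws_cloudwatch_metric_alarm" "asg_cpu_high" {
  alarm_name          = "application-asg-cpu-high"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80
  period              = 300
  evaluation_periods  = 3

  dimensions = {
    AutoScalingGroupName = module.app_asg.autoscaling_group_name # illustrative output
  }

  alarm_actions = [aws_sns_topic.alerts.arn] # assumed SNS topic
}
```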

Runbook-style notes

1) How would someone else deploy and operate your environment?

Ideally, this project would have CI/CD configured so that future updates could be made by anyone via a pull request, which would be planned and applied by the CI/CD platform.

2) How would you respond to an outage for the EC2 instance?

  • Check status of instance in AWS console
  • SSH to it to test connectivity

3) How would you restore data if the S3 bucket were deleted?

For this project, I am assuming the S3 bucket in question is the one holding the Terraform state, since there are no other references to S3. Deleting the bucket would not have a direct impact on the resources currently in AWS, but it would prevent updates until the state is restored. If changes are managed by a CI/CD platform, you may get lucky and find the state in the artifacts of the most recent pipeline run. If not, a new state file would need to be created and each resource imported into it.
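
The re-import step would look roughly like this for each resource; the resource addresses and instance ID below are placeholders.

```sh
terraform init      # point at the new/restored backend first
terraform import aws_key_pair.ssh coalfire-challenge
terraform import module.management_instance.aws_instance.this i-0123456789abcdef0
terraform plan      # confirm the rebuilt state matches the configuration
```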
