Don’t reinvent the wheel: Risk Assessment and Incident Response Plan

Years ago, I was inspired by Eric Ries’ lean methodology, with its focus on actionable, straightforward processes that avoid unnecessary complexity and buzzwords. Why can’t those principles be applied in the compliance world? Our team shifted its emphasis toward pragmatic, efficient strategies for risk assessment and incident response, using Amazon Web Services (AWS) to strengthen our security posture. We’re already on AWS, so what advantages can we gain from such a full-fledged cloud provider from the SOC 2 perspective? This is where “don’t reinvent the wheel” started.

Risk Assessment: A Lean Approach

Initially, our risk assessment exercise focused on identifying where sensitive data resides and how it’s protected. AWS Config gives a comprehensive view of AWS resource configurations, and AWS Security Hub shows how well they comply with best practices. We use RDS, S3, Glacier, and other services for secure, encrypted data storage, augmented by continuous backups. By leveraging these tools, our resource-constrained team was able to pinpoint vulnerabilities and prioritize them by potential impact, following the lean principle of focusing on what truly matters. With this automated foundation for data management in place, I decided to broaden the scope and look at AWS Trusted Advisor and Amazon Inspector for in-depth risk assessments.
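To make “prioritize by potential impact” concrete, here is a minimal sketch of how findings could be ranked once they have been exported. It assumes findings shaped like the AWS Security Finding Format (Title, Severity.Label, Resources[].Type). The sample findings and the choice of which resource types count as “sensitive” are made up for illustration, and this is not how Security Hub itself orders findings.

```python
# Sketch: rank Security Hub style findings so the highest-impact items come first.
# Field names follow the AWS Security Finding Format (ASFF); sample data is invented.

SEVERITY_RANK = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1, "INFORMATIONAL": 0}

# Assumption: resource types we treat as holding sensitive data.
SENSITIVE_TYPES = {"AwsS3Bucket", "AwsRdsDbInstance"}


def priority(finding):
    """Return a sortable (severity rank, touches sensitive data) tuple."""
    label = finding.get("Severity", {}).get("Label", "INFORMATIONAL")
    touches_data = any(
        resource.get("Type") in SENSITIVE_TYPES
        for resource in finding.get("Resources", [])
    )
    return (SEVERITY_RANK.get(label, 0), touches_data)


def prioritize(findings):
    """Sort findings from most to least urgent."""
    return sorted(findings, key=priority, reverse=True)


sample_findings = [
    {
        "Title": "Security group allows unrestricted SSH",
        "Severity": {"Label": "HIGH"},
        "Resources": [{"Type": "AwsEc2SecurityGroup"}],
    },
    {
        "Title": "S3 bucket without default encryption",
        "Severity": {"Label": "HIGH"},
        "Resources": [{"Type": "AwsS3Bucket"}],
    },
    {
        "Title": "CloudTrail log file validation disabled",
        "Severity": {"Label": "LOW"},
        "Resources": [{"Type": "AwsCloudTrailTrail"}],
    },
]

for finding in prioritize(sample_findings):
    print(finding["Severity"]["Label"], "-", finding["Title"])
```

With equal severity, the finding that touches a data store sorts first, which matches the idea of starting from where sensitive data lives.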

Trusted Advisor inspects the AWS environment against best practices, identifying opportunities to improve performance, reduce costs, and tighten security. Amazon Inspector complements it by automating security assessments around the clock, scanning AWS resources for vulnerabilities and deviations from best practices. Together, these tools provide a continuous, automated risk assessment process, so our organization can proactively manage and mitigate potential security risks.

We built the risk assessment matrix with Vanta’s scoring and mapping tool, which is available out of the box. It’s convenient and easy to map all the services (Vanta’s integration list is immense) against their respective risks in minutes and then focus on the personnel aspects.
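For readers who have not worked with a risk matrix before, the underlying idea can be sketched in a few lines. This is a generic likelihood-times-impact scheme, not Vanta’s scoring model; the 1 to 5 scales, the band boundaries, and the register entries are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    asset: str
    threat: str
    likelihood: int  # 1 (rare) .. 5 (almost certain)
    impact: int      # 1 (negligible) .. 5 (severe)

    @property
    def score(self):
        """Classic matrix score: likelihood multiplied by impact."""
        return self.likelihood * self.impact

    @property
    def level(self):
        # Assumed bands; adjust them to your own risk appetite.
        if self.score >= 15:
            return "high"
        if self.score >= 8:
            return "medium"
        return "low"


# Hypothetical register entries, covering services and personnel.
register = [
    Risk("RDS", "snapshot shared publicly", likelihood=2, impact=5),
    Risk("S3", "bucket policy grants public read", likelihood=3, impact=5),
    Risk("Laptops", "lost device without disk encryption", likelihood=2, impact=3),
]

for risk in sorted(register, key=lambda r: r.score, reverse=True):
    print(f"{risk.level:<6} {risk.score:>2}  {risk.asset}: {risk.threat}")
```

Sorting by score gives the team a single ordered list to work through, which is the part that matters when time is limited.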

Incident Response: Stages Unfolded

Preparation

If you’re small or face resource constraints as we do, automation is your best ally. Actually, I have a strong opinion that automation is an unwavering ally from day one, regardless of how big or small the team is. Automation doesn’t make mistakes because of emotions, lack of energy, bad mood, or burnout. Our team uses Amazon GuardDuty for intelligent threat detection and AWS Identity and Access Management (IAM) to ensure that only authorized users have access to sensitive resources. Alongside these services, we follow the AWS Well-Architected Framework Review (WAFR) recommendations on account and region segregation, which we will discuss in the next chapter, “Learn from our mistakes: Infrastructure readiness starting from day one”. This preparation aligns with Ries’ principle of building a secure foundation before scaling.
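One preparation step that is easy to automate is checking IAM policies for statements that grant far more than “only authorized users” need. The sketch below works on a policy document in the standard IAM JSON shape (Statement, Effect, Action, Resource). The sample policy and the rule for what counts as “overly broad” are assumptions for illustration, and a real review would also consider conditions, NotAction, and resource-based policies.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy):
    """Return the Sids of Allow statements with wildcard actions on all resources."""
    flagged = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            flagged.append(statement.get("Sid", "<no Sid>"))
    return flagged


# Hypothetical policy used only to exercise the check.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadReports", "Effect": "Allow",
         "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::example-reports/*"},
        {"Sid": "FullS3", "Effect": "Allow",
         "Action": "s3:*", "Resource": "*"},
        {"Sid": "DenyDelete", "Effect": "Deny",
         "Action": "*", "Resource": "*"},
    ],
}

print(overly_broad_statements(sample_policy))
```

Running a check like this in CI keeps least privilege from depending on someone remembering to look.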

Detection and Analysis

AWS CloudTrail and Amazon GuardDuty play critical roles here. CloudTrail records user activity and API usage, while GuardDuty analyzes logs for suspicious activity, so Trisk can swiftly identify potential security incidents. But to detect something suspicious, it is crucial to have the right, normalized data. The Application Load Balancer (ALB), part of Elastic Load Balancing, distributes incoming traffic across multiple targets, enhancing the availability and fault tolerance of Trisk’s applications. ALB also serves as a first line of defense by terminating SSL/TLS, which offloads decryption from the application servers and simplifies certificate management. This secures data in transit and, because the load balancer sees the decrypted HTTP requests, enables request-level inspection and more sophisticated routing rules for increased security. Where and how application logs are stored is equally non-negotiable. Use the full power of CloudTrail and GuardDuty, and don’t miss any indication of a potential breach.
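To show what “the right data” looks like in practice, here is a small sketch that scans CloudTrail records for repeated failed console sign-ins. It relies on the documented record fields eventName, responseElements, and sourceIPAddress; the sample records and the threshold of three failures are assumptions, and GuardDuty performs far richer analysis than this.

```python
from collections import Counter


def failed_console_logins(records):
    """Count failed ConsoleLogin events per source IP address."""
    failures = Counter()
    for record in records:
        if record.get("eventName") != "ConsoleLogin":
            continue
        # responseElements can be null in CloudTrail records.
        outcome = (record.get("responseElements") or {}).get("ConsoleLogin")
        if outcome == "Failure":
            failures[record.get("sourceIPAddress", "unknown")] += 1
    return failures


def suspicious_sources(records, max_failures=3):
    """Return IPs with more failed logins than the assumed threshold."""
    counts = failed_console_logins(records)
    return sorted(ip for ip, n in counts.items() if n > max_failures)


def _login(ip, outcome):
    """Build a minimal, made-up ConsoleLogin record for the demo."""
    return {
        "eventName": "ConsoleLogin",
        "sourceIPAddress": ip,
        "responseElements": {"ConsoleLogin": outcome},
    }


sample_records = (
    [_login("203.0.113.10", "Failure") for _ in range(4)]
    + [_login("203.0.113.10", "Success"), _login("198.51.100.7", "Failure")]
    + [{"eventName": "DescribeInstances", "sourceIPAddress": "198.51.100.7",
        "responseElements": None}]
)

print(suspicious_sources(sample_records))
```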

Amazon Elastic Container Registry (ECR) is integral to Trisk’s container management strategy, providing a secure location to store, manage, and deploy container images. With scan on push enabled, the ECR image scanning feature automatically scans images on upload for known vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database. This allows vulnerabilities to be identified and remediated early in the development cycle, reinforcing the security of the containerized applications.
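Scan results are most useful when something acts on them. The sketch below shows one way a deployment gate could read a scan result; it assumes a response shaped like ECR’s DescribeImageScanFindings output (imageScanStatus.status and imageScanFindings.findingSeverityCounts). The severity limits and the sample response are hypothetical, and the function name is ours, not part of any AWS SDK.

```python
# Assumed policy: no critical or high findings, at most five medium ones.
DEFAULT_LIMITS = {"CRITICAL": 0, "HIGH": 0, "MEDIUM": 5}


def gate_image(scan_response, limits=DEFAULT_LIMITS):
    """Return (passed, violations) for one image scan result.

    An unfinished or failed scan never passes, since there is nothing to judge.
    """
    status = scan_response.get("imageScanStatus", {}).get("status")
    if status != "COMPLETE":
        return False, {"scan_status": status}

    counts = scan_response.get("imageScanFindings", {}).get(
        "findingSeverityCounts", {}
    )
    violations = {
        severity: counts.get(severity, 0)
        for severity, limit in limits.items()
        if counts.get(severity, 0) > limit
    }
    return not violations, violations


# Made-up scan result for illustration.
sample_scan = {
    "imageScanStatus": {"status": "COMPLETE"},
    "imageScanFindings": {
        "findingSeverityCounts": {"HIGH": 2, "MEDIUM": 3, "LOW": 11}
    },
}

passed, violations = gate_image(sample_scan)
print("deploy" if passed else "block", violations)
```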

Containment, Eradication, and Recovery

Monitoring and alerts are critical for early detection of potential security incidents. Amazon CloudWatch and AWS Security Hub provide comprehensive monitoring and alerting capabilities. CloudWatch monitors operational data, setting alarms for specific thresholds, while Security Hub aggregates, organizes, and prioritizes security alerts. We leverage these services for real-time insights into operational health and security posture, enabling quick reaction to potential threats.
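The idea of an alarm on a threshold can be sketched without touching AWS at all. The function below mimics the “M out of N datapoints” style of evaluation that CloudWatch alarms offer; it is a simplified model for intuition only, ignores missing-data handling, and uses invented CPU readings and parameter defaults.

```python
def alarm_state(datapoints, threshold, evaluation_periods=5, datapoints_to_alarm=3):
    """Return "ALARM" if enough of the most recent datapoints exceed the threshold."""
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Hypothetical CPU utilization readings, one per period.
busy_cpu = [42, 55, 91, 88, 47, 93, 95]
calm_cpu = [42, 55, 91, 40, 47, 50, 95]

print(alarm_state(busy_cpu, threshold=80))
print(alarm_state(calm_cpu, threshold=80))
```

Requiring several breaching datapoints rather than one is a simple way to avoid paging the team for a single spike.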

Once an incident is detected, our incident response team (IRT) focuses on containment to prevent spread, using Amazon VPC (Virtual Private Cloud) for network isolation. Eradication involves removing the threat, often by patching vulnerabilities or updating security groups, and recovery is facilitated by AWS Backup, which restores services to their pre-incident state efficiently. Thankfully, so far it’s all on paper and in our regular testing exercises and simulations.
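One common way to isolate a compromised instance is to swap its security groups for a single locked-down “quarantine” group. The sketch below illustrates that idea under stated assumptions: the function expects a boto3-style EC2 client and uses its describe_instances and modify_instance_attribute calls, while the FakeEc2 class, the instance ID, and the group IDs are hypothetical stand-ins so the example runs without AWS credentials. It is not a description of Trisk’s actual runbook.

```python
def quarantine_instance(ec2, instance_id, quarantine_sg_id):
    """Replace an instance's security groups with one quarantine group.

    Returns the original group IDs so they can be restored during recovery.
    """
    response = ec2.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    original = [group["GroupId"] for group in instance["SecurityGroups"]]
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])
    return original


class FakeEc2:
    """In-memory stand-in for an EC2 client, for dry runs and tests only."""

    def __init__(self, groups):
        self.groups = list(groups)

    def describe_instances(self, InstanceIds):
        security_groups = [{"GroupId": g} for g in self.groups]
        return {"Reservations": [{"Instances": [
            {"InstanceId": InstanceIds[0], "SecurityGroups": security_groups}
        ]}]}

    def modify_instance_attribute(self, InstanceId, Groups):
        self.groups = list(Groups)


fake = FakeEc2(["sg-web", "sg-ssh"])
saved = quarantine_instance(fake, "i-0123456789abcdef0", "sg-quarantine")
print("saved for recovery:", saved)
print("now attached:", fake.groups)
```

Passing the client in as an argument is a deliberate choice: the same function can be exercised in a simulation with the fake and pointed at a real client during an actual incident.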

Post-Incident Analysis

The focus here is on learning and improvement, core to Ries’ lean startup approach. AWS services facilitate the analysis of what happened, why, and how processes can be improved to prevent future incidents. This might involve adjusting AWS WAF (Web Application Firewall) rules or revising IAM policies based on lessons learned.

Conclusion: A Lean, Not Light, Comprehensive Security Posture

In adopting AWS services for Trisk’s SOC 2 journey, our team is not seeking shortcuts but rather a streamlined, effective path to compliance and security. This approach, inspired by the lean methodology, underscores the importance of continuous learning, agility, and the efficient use of resources. By leveraging AWS’s robust ecosystem, we can ensure that our risk assessment and incident response are not only SOC 2 compliant but also resilient against the evolving landscape of cyber threats, embodying an ethos of adaptability and relentless improvement.

Focus on what adds value, eliminate waste, and continuously iterate.