Customizing Config Remediation Notifications with Amazon Q Developer
Optimizing Slack-based triage for AWS Config Remediation
To continuously enforce AWS security standards, security engineers often implement AWS Config auto-remediations via AWS Systems Manager (SSM) Automation documents to correct several categories of cloud misconfigurations. Although this approach is widely known and used (see the AWS Config remediation docs), our security team found it difficult to monitor the auto-remediations Config performed, largely because of how CloudTrail logs those remediation events.
In our first iteration of monitoring Config remediations, we created a CloudWatch Events (now Amazon EventBridge) rule over the CloudTrail events in our account to detect Config invoking an SSM document, using the pattern:
{
  "detail": {
    "eventName": ["StartAutomationExecution"],
    "eventSource": ["ssm.amazonaws.com"],
    "userIdentity": {
      "invokedBy": ["config.amazonaws.com"]
    }
  },
  "source": ["aws.ssm"]
}
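For reference, here is a minimal boto3 sketch of how such a rule can be wired to an SNS topic (the rule name and topic ARN are placeholders, and your topic's resource policy must allow events.amazonaws.com to publish):

import json
import boto3

events = boto3.client("events")

# Placeholder values; substitute your own rule name and SNS topic ARN.
RULE_NAME = "config-remediation-started"
TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:config-remediation-alerts"

pattern = {
    "detail": {
        "eventName": ["StartAutomationExecution"],
        "eventSource": ["ssm.amazonaws.com"],
        "userIdentity": {"invokedBy": ["config.amazonaws.com"]},
    },
    "source": ["aws.ssm"],
}

# Create (or update) the rule and point it at the SNS topic.
events.put_rule(Name=RULE_NAME, EventPattern=json.dumps(pattern))
events.put_targets(Rule=RULE_NAME, Targets=[{"Id": "sns-target", "Arn": TOPIC_ARN}])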
Matching events were sent to an SNS topic and then to Amazon Q Developer (formerly AWS Chatbot), which forwarded them to our on-call security team in Slack using the architecture below:

However, the alerts sent to Slack left much to be desired, as can be seen in an example Slack message we received:

We found these Slack alerts lacking for two reasons:
- First, the matched events included the name of the SSM document Config invoked, but the parameters passed to the document were obscured, replaced with the text HIDDEN_DUE_TO_SECURITY_REASONS.
For example, below is a sample event of Config invoking the AWS-DisableIncomingSSHOnPort22 SSM document as part of an auto-remediation:

{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "<<redacted>>:AwsConfigRemediation",
    "arn": "arn:aws:sts::<<redacted>>:assumed-role/AWSServiceRoleForConfigRemediation/AwsConfigRemediation",
    "accountId": "<<redacted>>",
    "accessKeyId": "ASIA......",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "principalId": "<<redacted>>",
        "arn": "arn:aws:iam::<<redacted>>:role/aws-service-role/remediation.config.amazonaws.com/AWSServiceRoleForConfigRemediation",
        "accountId": "<<redacted>>",
        "userName": "AWSServiceRoleForConfigRemediation"
      },
      "webIdFederationData": {},
      "attributes": {
        "creationDate": "2025-01-14T21:26:26Z",
        "mfaAuthenticated": "false"
      }
    },
    "invokedBy": "config.amazonaws.com"
  },
  "eventTime": "2025-01-14T21:26:26Z",
  "eventSource": "ssm.amazonaws.com",
  "eventName": "StartAutomationExecution",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "config.amazonaws.com",
  "userAgent": "config.amazonaws.com",
  "requestParameters": {
    "documentName": "AWS-DisableIncomingSSHOnPort22",
    "documentVersion": "1",
    "parameters": {
      "AutomationAssumeRole": ["HIDDEN_DUE_TO_SECURITY_REASONS"],
      "SecurityGroupIds": ["HIDDEN_DUE_TO_SECURITY_REASONS"]
    }
  },
  "responseElements": {
    "automationExecutionId": "eda2ec0c-c778-4d2b-a59d-aad5b52d5e9a"
  },
  "requestID": "eda2ec0c-c778-4d2b-a59d-aad5b52d5e9a",
  "eventID": "6e2f6630-f439-439d-93ee-841aba72182c",
  "readOnly": false,
  "eventType": "AwsApiCall",
  "managementEvent": true,
  "recipientAccountId": "<<redacted>>",
  "eventCategory": "Management"
}
Since we use Config auto-remediations across many resource types and actions, there was no way to tell from the alert which resources were updated, what actions were taken, or even which account the changes occurred in. Without that information the severity of an alert couldn't be determined, so every Config remediation notification had to be treated as high-priority.
- Secondly, Amazon Q Developer did not support modifying AWS service event notifications, so although some of the fields we wanted were present in the original CloudWatch events (such as recipientAccountId), we had no way to shape the message sent to Slack to include them.
Because of these limitations, our on-call engineers ran many manual CloudWatch Logs Insights queries to connect the original Config remediation event to the actions the SSM document actually took, and thereby determine the impact of Config's changes. As our on-call team grew, with varying levels of AWS familiarity, and our AWS footprint expanded, it became difficult to manage and disseminate these queries across AWS regions and accounts. It also caused responder fatigue, since engineers had to leave Slack and navigate the AWS console to complete their analysis.
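For the SSH example above, such a query looked roughly like the following (a sketch only; the log group and the exact filter depend on how your automation role's API calls appear in CloudTrail):

fields @timestamp, eventName, requestParameters.groupId, recipientAccountId
| filter eventSource = "ec2.amazonaws.com"
    and eventName = "RevokeSecurityGroupIngress"
    and userIdentity.invokedBy = "ssm.amazonaws.com"
| sort @timestamp desc
| limit 20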
To solve this problem, we introduced an AWS Lambda function as an interceptor between the SNS topic receiving events from CloudWatch and the Amazon Q Developer integration. We also used the new Amazon Q Developer custom notifications feature to construct bespoke Slack notifications that contain all the relevant information for triage:

The Lambda function uses SSM Parameter Store to hold a mapping between the SSM document Config invoked and the CloudWatch Logs Insights query on-call engineers previously ran by hand. With this information, plus the ability to customize fields through Amazon Q Developer, we send our on-call engineers messages containing everything they need to triage the event without ever leaving Slack.
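Custom notifications are ordinary SNS messages that follow the JSON schema Amazon Q Developer documents for this feature; a minimal example of the kind of payload we publish (all values illustrative):

{
  "version": "1.0",
  "source": "custom",
  "content": {
    "textType": "client-markdown",
    "title": ":wrench: Config remediation: AWS-DisableIncomingSSHOnPort22",
    "description": "Account 111122223333 (us-east-1): revoked port 22 ingress on sg-0123456789abcdef0",
    "nextSteps": ["Confirm the security group change was expected"]
  }
}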
In detail, when a Config remediation occurs, the original unaltered event is sent to Slack through Amazon Q Developer. At the same time, the Lambda function starts a CloudWatch Logs Insights query to determine the full details of the event. Because CloudWatch Logs data can trail the original CloudTrail event by several minutes, the enrichment arrives asynchronously: when the query completes, a follow-up message is posted through Amazon Q Developer in the thread of the original Slack message.
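A condensed sketch of the interceptor Lambda illustrates the flow; the parameter name, log group, and environment variables are illustrative, and error handling and the Slack threading details are omitted:

import json
import os
import time

import boto3

logs = boto3.client("logs")
sns = boto3.client("sns")
ssm = boto3.client("ssm")

# Illustrative configuration: the Parameter Store key holding the
# document-to-query mapping, the CloudTrail log group, and the SNS
# topic Amazon Q Developer is subscribed to.
QUERY_MAP_PARAM = os.environ.get("QUERY_MAP_PARAM", "/config-remediation/query-map")
TRAIL_LOG_GROUP = os.environ.get("TRAIL_LOG_GROUP", "cloudtrail-logs")
QD_TOPIC_ARN = os.environ["QD_TOPIC_ARN"]


def handler(event, context):
    # The SNS record wraps the original CloudTrail event from CloudWatch.
    trail_event = json.loads(event["Records"][0]["Sns"]["Message"])
    document = trail_event["detail"]["requestParameters"]["documentName"]

    # Look up the Logs Insights query engineers previously ran by hand.
    mapping = json.loads(ssm.get_parameter(Name=QUERY_MAP_PARAM)["Parameter"]["Value"])
    query = mapping.get(document)
    if query is None:
        return  # no enrichment defined for this document

    # CloudTrail delivery to CloudWatch Logs can lag by minutes, so query
    # a window around the event and poll until the query finishes.
    now = int(time.time())
    query_id = logs.start_query(
        logGroupName=TRAIL_LOG_GROUP,
        startTime=now - 900,
        endTime=now,
        queryString=query,
    )["queryId"]
    while True:
        result = logs.get_query_results(queryId=query_id)
        if result["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(5)

    rows = [{f["field"]: f["value"] for f in row} for row in result.get("results", [])]

    # Publish a follow-up custom notification for Amazon Q Developer.
    notification = {
        "version": "1.0",
        "source": "custom",
        "content": {
            "textType": "client-markdown",
            "title": f"Config remediation details: {document}",
            "description": "\n".join(json.dumps(r) for r in rows)
            or "No matching CloudTrail events found yet.",
        },
    }
    sns.publish(TopicArn=QD_TOPIC_ARN, Message=json.dumps(notification))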
In Slack, these enriched messages look like:

With this improvement, the time it takes to triage Config remediation events went from several minutes down to seconds. Additionally, because all the information needed for triage is now in Slack, on-call engineers can decide directly from their mobile devices whether an investigation needs to be launched, which is especially important for off-hours alerts!
I have provided a sample Terraform & Python GitHub repo for you to use as a reference or incorporate into your own workstreams!