Defender Track

Objective 6: Use Athena

For this objective, we'll explore the logs much as we did with jq, but using the AWS service Athena. You'll need to do this from your own account, as there isn't a way I can give untrusted users access to Athena in my account without people doing undesirable things. Athena can be accessed at https://console.aws.amazon.com/athena/home?region=us-east-1#query. We're working with such a small dataset that any charges on your account should amount to a few pennies. You'll need Athena and Glue privileges.

In the query editor, run:

create database flaws2;

Switch to the flaws2 database you just created and run:

CREATE EXTERNAL TABLE `cloudtrail`(
    `eventversion` string COMMENT 'from deserializer', 
    `useridentity` struct<type:string,principalid:string,arn:string,accountid:string,invokedby:string,accesskeyid:string,username:string,sessioncontext:struct<attributes:struct<mfaauthenticated:string,creationdate:string>,sessionissuer:struct<type:string,principalid:string,arn:string,accountid:string,username:string>>> COMMENT 'from deserializer', 
    `eventtime` string COMMENT 'from deserializer', 
    `eventsource` string COMMENT 'from deserializer', 
    `eventname` string COMMENT 'from deserializer', 
    `awsregion` string COMMENT 'from deserializer', 
    `sourceipaddress` string COMMENT 'from deserializer', 
    `useragent` string COMMENT 'from deserializer', 
    `errorcode` string COMMENT 'from deserializer', 
    `errormessage` string COMMENT 'from deserializer', 
    `requestparameters` string COMMENT 'from deserializer', 
    `responseelements` string COMMENT 'from deserializer', 
    `additionaleventdata` string COMMENT 'from deserializer', 
    `requestid` string COMMENT 'from deserializer', 
    `eventid` string COMMENT 'from deserializer', 
    `resources` array<struct<arn:string,accountid:string,type:string>> COMMENT 'from deserializer', 
    `eventtype` string COMMENT 'from deserializer', 
    `apiversion` string COMMENT 'from deserializer', 
    `readonly` string COMMENT 'from deserializer', 
    `recipientaccountid` string COMMENT 'from deserializer', 
    `serviceeventdetails` string COMMENT 'from deserializer', 
    `sharedeventid` string COMMENT 'from deserializer', 
    `vpcendpointid` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
    'com.amazon.emr.hive.serde.CloudTrailSerde' 
STORED AS INPUTFORMAT 
    'com.amazon.emr.cloudtrail.CloudTrailInputFormat' 
OUTPUTFORMAT 
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
    's3://flaws2-logs/AWSLogs/653711331788/CloudTrail';

You can now run:

select eventtime, eventname from cloudtrail;

You can run all your normal SQL queries against this data now, for example:

SELECT 
    eventname,
    count(*) AS mycount 
FROM cloudtrail 
GROUP BY eventname 
ORDER BY mycount;
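
Nested fields can be queried too. As a rough sketch, the query below builds a timeline of who called what and from where. It reads the `arn` field inside the `useridentity` struct with dot notation, and uses Athena's `json_extract_scalar` function to pull a value out of the `requestparameters` JSON string. The `bucketName` key is an assumption about what S3 events carry in that field, and the column aliases are only illustrative; rows without that key should simply show NULL.

```sql
-- Timeline of API calls: when, what, from which IP, and by which identity.
SELECT
    eventtime,
    eventname,
    sourceipaddress,
    useridentity.arn AS caller_arn,  -- struct fields use dot notation
    -- requestparameters is a JSON string; the bucketName key is assumed
    json_extract_scalar(requestparameters, '$.bucketName') AS bucket
FROM cloudtrail
ORDER BY eventtime;  -- ISO 8601 strings sort chronologically
```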

Athena is great for incident response because you don't have to wait for the data to load anywhere: just define the table in Athena and start querying it. If you use Athena this way on your own logs, you should also create partitions, which will reduce your costs by letting you query only a specific day. Alex Smolen gives a good explanation of how to do that in his article Partitioning CloudTrail Logs in Athena.
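
To give a feel for what partitioning looks like, here is a minimal sketch; it is not taken from the article mentioned above. It assumes a hypothetical table named `cloudtrail_partitioned`, created with the same columns as `cloudtrail` plus a `PARTITIONED BY (region string, year string, month string, day string)` clause. The region and date values are placeholders, so substitute the prefix that actually holds the logs you care about. Run each statement separately in the query editor.

```sql
-- Register one day's S3 prefix as a partition.
-- Table name, region, and date are placeholders for illustration.
ALTER TABLE cloudtrail_partitioned ADD IF NOT EXISTS
PARTITION (region = 'us-east-1', year = '2018', month = '11', day = '28')
LOCATION 's3://flaws2-logs/AWSLogs/653711331788/CloudTrail/us-east-1/2018/11/28/';

-- Filtering on the partition columns limits the scan to that day's files.
SELECT
    eventname,
    count(*) AS mycount
FROM cloudtrail_partitioned
WHERE region = 'us-east-1'
  AND year = '2018'
  AND month = '11'
  AND day = '28'
GROUP BY eventname
ORDER BY mycount;
```

Because Athena bills by the amount of data scanned, restricting a query to one partition is what keeps costs down on a large trail.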

The End

You now have some core skills for doing security work in AWS. These include assuming roles in other accounts, reading IAM and resource policies, querying JSON logs with jq and Athena, and understanding CloudTrail.

If you found this helpful, please tweet about it! If you'd like to know more about AWS security or need help, I do consulting work where I can provide private training, assessments, and more. To get in contact, go to https://summitroute.com/