Protect Sensitive Data on AWS with Amazon Macie

I’m a few days late on this, but I just read on the AWS blog that they launched a new service called Amazon Macie on August 14. According to the website, Macie is “a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS.” Here are a few points that stood out for me:

  1. It looks like S3 is the initial data store for the service, and that is welcome news based on the recent high-profile data exposures caused by misconfiguration of S3 bucket policies.
  2. The pricing also seems reasonable (though you always have to watch out for costs with cloud services – they can sneak up on you quickly).
  3. The interface looks easy to get started with. Some AWS services are not very intuitive and take for granted that you have a lot of AWS knowledge (which is not an unfair assumption for the most part). But the Macie designers seem to have considered that the exposures on S3 might mean some people don’t know what the heck they are doing. You will still need to know about CloudTrail and other AWS concepts.
  4. The Macie dashboard gives a lot of good information and doesn’t appear to be cluttered. Another nicely designed UX so far.
  5. Straight from the product page: “For data classification purposes, Macie utilizes CloudTrail’s ability to capture object-level API activity on S3 objects (data events).”
  6. Alerting seems to be very strong. Go check out the alerting page to see what all is available (there’s a lot there).
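
To make the CloudTrail point in item 5 a bit more concrete, here is a rough sketch of what an event selector for S3 object-level (data event) logging looks like. The helper name `s3_data_event_selector` and the bucket name `example-bucket` are made up for illustration, and this is only my reading of the CloudTrail event selector format, not anything taken from the Macie setup docs.

```python
import json


def s3_data_event_selector(bucket_name, prefix=""):
    """Build a CloudTrail event selector dict for S3 object-level activity.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    return {
        # Log both reads (e.g. GetObject) and writes (e.g. PutObject).
        "ReadWriteType": "All",
        "IncludeManagementEvents": True,
        "DataResources": [
            {
                "Type": "AWS::S3::Object",
                # A trailing slash with an empty prefix covers every object
                # in the bucket.
                "Values": [f"arn:aws:s3:::{bucket_name}/{prefix}"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_data_event_selector("example-bucket"), indent=2))
```

If you use boto3, a dict like this should be what you pass in the `EventSelectors` list of the CloudTrail client’s `put_event_selectors` call, along with the name of your own trail.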

It looks like a few high-profile AWS customers are already using it (Edmunds, Netflix, and Autodesk are the examples included in the marketing info), so AWS has done some due diligence with vetting this out. Machine learning is becoming more and more of a norm in our lives these days, and many of the use cases are proving themselves out. With so many different customers that could potentially use this with so many different kinds of data, it will be interesting to see some case studies come out over time to see if machine learning can be applied here with high success. I predict it will be successful and highly utilized.

Great post on the RNC AWS file leak discovery from UpGuard

UpGuard’s post on their discovery of the RNC data is trending big time on the netsec subreddit. I highly recommend going to read the post if you want to know what they found. But in a nutshell, it all centers around the misconfiguration of permissions to the AWS S3 bucket where the database was stored.
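
To give a feel for what a “misconfigured permission” can look like, here is a minimal sketch that scans a bucket policy document for statements that allow access to anyone. To be clear, this is my own illustration and not how UpGuard found the data; their post does not say whether the problem was a bucket policy or an ACL, and this sketch only looks at policies. The function names and the sample bucket are hypothetical.

```python
import json


def _as_list(value):
    """Normalize a policy field that may be missing, a string, or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def public_statements(policy_json):
    """Return the Allow statements in a bucket policy that apply to everyone.

    Hypothetical helper: it flags a wildcard principal with no Condition
    block. It does not evaluate ACLs or account-level public access settings.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        is_public = principal == "*" or (
            isinstance(principal, dict) and "*" in _as_list(principal.get("AWS"))
        )
        if is_public and "Condition" not in stmt:
            flagged.append(stmt)
    return flagged


if __name__ == "__main__":
    sample = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }],
    })
    for stmt in public_statements(sample):
        print("Publicly readable:", stmt["Resource"])
```

A check like this is deliberately conservative: a statement with a `Condition` block may still be too open, so treat an empty result as “nothing obviously public,” not as proof the bucket is locked down.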

I would like to say that the carelessness that was shown here is surprising, but unfortunately it’s not anything new. As I get deeper into the cloud, I see more and more parallels to my straight networking days. Permissions have always been an issue with networks in general, and now that Amazon, Microsoft, and other cloud providers are making it so easy to provision resources in the public cloud, the implications of faulty permissions are huge. This is just one example of a slew of problems, but it just so happens to be a VERY BIG example. We’re talking about the potential exposure of data on nearly all 200 million registered US voters, plus the inner workings of the GOP in the last election. And no one really knows how long it was out there.

One last thing: when you go read the post, be sure to read the whole thing. Not only does the article talk about what was exposed; it also goes into the implications of that exposure that go beyond just your basic “muh data.” There is some very targeted… almost metadata… about people that is derived from some sophisticated data analytics, which could lead to some very specific targeting. Kinda eerie.
