As a confession to anyone in my IT team reading this blog: I’m a hoarder.
I began at AvePoint in 2008 and have faithfully retained archives of every e-mail sent since then. I’ve treated nearly every presentation, training, and market analysis the same way since I took on this role. My OneDrive, where most of this content goes, has no tags or meaningful labels, and PowerPoints from 2011 live happily next to this blog post that I’m authoring in 2020!
Normally we wouldn’t think this is an issue. I can even hear some of you admit “at least it’s not on his laptop!” But as we’re going to see, my OneDrive isn’t just for personal documents. Let’s remember:
For many of us, there’s that handy dumping ground that guarantees our thoughts can be seen by our colleagues – the “Shared with Everyone” folder!
For a few of us (hopefully very few), we decided years ago to make co-authoring easier and went as far as the “Editable (Everyone)” folder!
This data is co-mingling with our Teams 1:1 chat data as well, which for many of us includes passport photos, tax documents sent to HR, and yes, even photos of our finisher medals from recent races!
This confession matters to my IT team because our management of personal and customer data has been under heavy scrutiny since we passed our first ISO 27001:2013 audits in 2018. While we can’t guarantee every user will follow best practices for information management and tagging, we’re still accountable for how they handle sensitive data owned by our company.
That could be data relating to employees and customers under GDPR, data regarding trials for the FDA, contracts, operational and intellectual property data for our company, etc. For our company to continue doing business under CCPA, GDPR, ISO, or other laws and standards, we need a solid information management strategy that takes into account the fact that most data that gets shared is unstructured.
We can’t control what we can’t see. Let’s first make sure we understand where sensitive data sits before we start creating policies to lock down user data.
Discovering where sensitive data sits in your environment is actually easier than you may think! Microsoft has done a substantial job of marketing “E5” features around security and compliance, and we will certainly be exploring these in future posts. But let’s realize that there’s a quick step that we are able to take right now:
I’m going to make some suggestions in this post based on what you’re most likely to discover in your environment. For me, with an E5 license enabled, the Security and Compliance center tells me that I have a very good chance of discovering credit card information:
Now you may be thinking, “I don’t have security labels auto-applied, retention labels auto-applied, trainable classifiers or exact data match switched on for my environment. I don’t even have an E5! How are we going to find sensitive information!?”
Every time you deploy a policy for labeling, content discovery, or privacy filters (such as sensitive information blocking in MCAS or Microsoft Cloud App Security) you’ll typically see a set of policies that looks like this:
These are summaries of something called Sensitive Information Types from Microsoft. If you look at the U.S. Financial Data policies for native Office 365 DLP (an E3 feature at the time of writing), you will see the “credit card number” check right in there! You can check out the full list of 100 here (assuming you have admin access to your tenant). If not, you can read about them here.
What makes this solution possible is that Microsoft is already indexing this data for use in all your future policies. That means that from the moment your data is indexed, you can find out where any of the 100 sensitive information types appear!
You’re going to head over to the eDiscovery center to build a Content Search, a basic feature enabled for all Office 365 license levels. You must have access to the eDiscovery center in your environment to make this work. The fastest way to get there is from the https://protection.office.com/ home page, under Search -> Content Search.
You have a few options for building this search, but I strongly recommend “guided search” for your first try:
For my environment, I’m looking for a quick “catch-all” check that looks for any Credit Card numbers. If you plan on running this against a production environment, you should probably limit this scope!
Since we’re just getting started, you want to run your search for: SensitiveType:"Credit Card Number" (no space after the colon, and straight quotes around the type name).
There are much more specific terms you can include, outlined here, but keeping it generic will help us make the biggest splash with our first query.
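If you would rather assemble the query text programmatically before pasting it into the search box, here is a minimal Python sketch. It assumes the documented keyword form SensitiveType:"<type>|<count range>|<confidence range>", where the two ranges are optional and * means "any". The helper names (sensitive_type_query, any_of) are hypothetical and exist only for this illustration; the script builds strings and does not talk to Office 365.

```python
from typing import Optional, Tuple

# (low, high) bounds, e.g. (5, 25) for "between 5 and 25 matches"
Range = Tuple[int, int]


def _fmt_range(bounds: Optional[Range]) -> str:
    """Render a range as 'low..high', or '*' when no bound is given."""
    return "*" if bounds is None else f"{bounds[0]}..{bounds[1]}"


def sensitive_type_query(
    type_name: str,
    count: Optional[Range] = None,
    confidence: Optional[Range] = None,
) -> str:
    """Build a SensitiveType keyword query for one sensitive information type.

    Hypothetical helper: it only formats text under the assumed syntax
    SensitiveType:"<type>|<count range>|<confidence range>".
    """
    parts = [type_name]
    # The count slot must be present whenever a confidence range follows it.
    if count is not None or confidence is not None:
        parts.append(_fmt_range(count))
    if confidence is not None:
        parts.append(_fmt_range(confidence))
    return 'SensitiveType:"' + "|".join(parts) + '"'


def any_of(*queries: str) -> str:
    """Join several keyword queries with OR."""
    return " OR ".join(queries)


if __name__ == "__main__":
    # The generic catch-all query used in this post.
    print(sensitive_type_query("Credit Card Number"))
    # Narrower: only items with 5 to 25 card numbers.
    print(sensitive_type_query("Credit Card Number", count=(5, 25)))
    # Narrower: any count, but high-confidence matches only.
    print(sensitive_type_query("Credit Card Number", confidence=(85, 100)))
```

Starting with the generic form and tightening the count or confidence range afterwards keeps the first run simple while giving you a way to cut down noisy results later.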
You may want to grab a cup of coffee; this will take a while. (Don’t worry, it’ll save this search so you can pull up the results later!) If you do happen to walk away, you’ll see your previous searches on the main landing page of “Content Search” with a helpful report of the results:
You’ve successfully built ONE query against a SINGLE sensitive information type that runs ONCE, and is targeted at ONE scope. You’re also now responsible for all the information you return, including next-steps on securing that copy of the data.
When you sign up for updates on our new Policies and Insights product you’ll learn how you can grab dozens of these sensitive information types in one click, as well as the ability to map these back to your audit history and avoid potential risk from exposed permissions—all in reports that are intended for sharing!
For a bit more on Policies and Insights, check out the video below:
John Hodges is Senior Vice President of Product Strategy at AvePoint, focusing on developing compliance solutions that address modern data privacy, classification, and data protection needs for organizations worldwide. Since joining AvePoint in 2008, John has worked directly with the company’s product management and research & development teams to cultivate creative ideas and bridge the gap between sales and technology – providing a practical target for innovation and a focused message for sales and marketing. John has been actively engaged in the SharePoint community for several years, working with many Fortune 500 companies to drive sustainable adoption of Microsoft technology and optimize SharePoint’s larger purpose-built implementations. John’s insights and opinions on modern Information Technology can be found in various industry publications, as well as throughout his numerous speaking sessions in webinars and at events worldwide.