Data is the new currency in today’s digital age. Just as you wouldn’t leave your house title lying around for anyone to take, understanding where your data resides is crucial for its protection. Knowing the exact location of your data allows you to implement proper security measures, ensuring it’s not vulnerable to unauthorized access or breaches.
Understanding your data’s location also plays a vital role in regulatory compliance. For instance, the CIS Controls (https://www.cisecurity.org/controls) include Control 13: Data Protection and Control 14: Controlled Access Based on the Need to Know (as numbered in version 7; version 8 covers the same ground under Control 3: Data Protection and Control 6: Access Control Management). Both emphasize the need to secure data and limit access strictly to those who need it. By mapping out where your data lives, you can better align your practices with these controls, reducing risks and meeting compliance requirements.
In this blog, I will guide you through the various methods to discover where your data resides, the specific tools to use for different types of data, and when and how to effectively utilize each tool.
The 2 Methods of Data Discovery
Manual methods involve physically documenting all the locations where your data is stored. This approach requires you to actively track and record each data repository, whether it’s on-premises, in the cloud, or across various applications and devices. While this method can be thorough and provide a deep understanding of your data landscape, it is also time-consuming and prone to human error. Think of it as manually creating an inventory of every item in your home – it’s detailed but can be exhausting and easy to miss something.
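To make the manual approach a bit more concrete, here is a minimal sketch of what a hand-maintained inventory could look like if you keep it as a CSV file. The column names and the sample row are my own assumptions for illustration, not a prescribed template; adjust them to whatever your organization needs to track.

```python
import csv
import io

# Hypothetical inventory columns; pick the ones that matter to your organization.
FIELDS = ["repository", "location", "owner", "data_types", "last_reviewed"]

def write_inventory(rows, stream):
    """Write inventory rows (dicts keyed by FIELDS) to a CSV stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

def missing_fields(row):
    """Return the columns that are empty or absent, to catch incomplete entries."""
    return [name for name in FIELDS if not row.get(name)]

# Example entry (made-up values for illustration).
sample_row = {
    "repository": "HR share",
    "location": "on-premises",
    "owner": "HR team",
    "data_types": "personal data",
    "last_reviewed": "2024-01-15",
}

buffer = io.StringIO()
write_inventory([sample_row], buffer)
print(buffer.getvalue())
```

A small check like `missing_fields` helps with the human-error problem: it flags entries where someone forgot to record an owner or a review date.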
Automatic methods leverage technology to scan, map, and classify your data across different environments. These methods use specialized tools to automatically discover data locations, classify sensitive information, and provide insights into data usage and movement.
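As a rough illustration of the "scan and classify" idea, here is a minimal sketch that walks a folder and flags text files containing patterns that look like sensitive information. This is not how any Microsoft tool works internally; the patterns, function names, and the choice to scan only .txt files are assumptions I made to keep the example small. Real discovery tools use far richer detection logic.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_like_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify_text(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))

def scan_folder(root):
    """Walk a folder and map each readable .txt file to the patterns found in it."""
    findings = {}
    for path in Path(root).rglob("*.txt"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip files we cannot read
        labels = classify_text(text)
        if labels:
            findings[str(path)] = labels
    return findings
```

Calling `scan_folder` on a directory returns a dictionary of file paths and the kinds of sensitive-looking content found in each, which is essentially a tiny data map.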
Types of Data in an Organization
Organizations typically handle two primary types of business data: documents and organizational business data.
Documents include files like reports, presentations, spreadsheets, and PDFs, which often contain sensitive information and require careful management and protection.
Organizational business data, on the other hand, encompasses the data generated by business operations, workflows, and applications, such as transaction records, customer information, and operational metrics. Think of the data held in applications such as Dynamics 365, Workday, and SAP. This is the data used for day-to-day operations.
Now that we know about the 2 types of data in an organization, let’s look at the Microsoft solutions you can use to discover data (most of which are already included in your Microsoft 365 Business Premium, E3, or E5 licenses).
Quick Note:
Some solutions that are not on this list also have some form of search or discovery capability (e.g., Purview Data Lifecycle Management and Audit Log Search). I’ve omitted them because their primary purpose is data governance, and their data discovery capability relies on the other items listed below.
Document Discovery Tools
Microsoft Purview Information Protection (for documents stored in email, SharePoint, OneDrive, and Teams): This helps you classify and label data based on its sensitivity. Start by defining your data classification schema, apply labels to your documents using built-in or custom labels, and configure policies to automatically classify and protect sensitive information as it is created or modified.
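If "defining a data classification schema" sounds abstract, here is a minimal sketch of the idea: an ordered set of labels plus a few rules that suggest a label for a piece of text. The label names and keyword rules are hypothetical examples of mine, and this is not the Purview API; in practice you would configure labels and auto-labeling policies in the portal.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    """Hypothetical label schema, ordered from least to most sensitive."""
    PUBLIC = 0
    GENERAL = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3

# Hypothetical keyword rules standing in for an auto-labeling policy.
AUTO_LABEL_RULES = [
    ("salary", Sensitivity.HIGHLY_CONFIDENTIAL),
    ("passport", Sensitivity.HIGHLY_CONFIDENTIAL),
    ("contract", Sensitivity.CONFIDENTIAL),
    ("internal", Sensitivity.GENERAL),
]

def suggest_label(text, default=Sensitivity.PUBLIC):
    """Return the most sensitive label whose keyword appears in the text."""
    lowered = text.lower()
    matched = [label for keyword, label in AUTO_LABEL_RULES if keyword in lowered]
    return max(matched, default=default)

print(suggest_label("Internal memo about the contract").name)
```

The design point worth noticing is the ordering: when several rules match, the document gets the most sensitive label, which is usually the safer default.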
Microsoft Purview Information Protection scanner (for on-premises data): This is designed to scan and classify on-premises data. To use it, deploy the scanner to your on-premises environment, configure scanning jobs to target specific data repositories, and review the scan results to understand where sensitive information resides and how it is being used.
Microsoft Purview compliance portal, formerly the Microsoft 365 compliance center (Content Search tool): The Content Search tool allows you to search for and manage content across your organization, such as email, documents, and Teams conversations.
Microsoft 365 eDiscovery: This helps you manage and analyze large volumes of data for legal and compliance purposes. To use it, access the eDiscovery portal, create a case, add data sources, and run searches and analytics to gather relevant information for your legal or compliance needs.
Defender for Cloud Apps: This is a comprehensive solution for monitoring and controlling data movement across cloud applications. The tool also offers data classification and protection through integration with Microsoft Purview Information Protection, ensuring consistent data security across your organization.
Priva (using Privacy Assessments): This is specifically for personal data discovery. It automates the discovery, documentation, and evaluation of personal data use across your entire data estate. Using this regulatory-independent solution, you can automate privacy assessments and build a complete compliance record for the responsible use of personal data.
Organizational Business Data Tools
Purview Data Map: Helps you create a unified map of your data estate by automatically scanning and classifying your data sources. To use it, configure scanning rules and connect your data sources to Purview. The Data Map will continuously update, providing an up-to-date view of your data landscape, including classification and sensitivity labels, which helps in managing data compliance and governance.
Purview Data Catalog: Provides a searchable catalog of data assets, making it easy to discover and understand data across your organization. To use it, start by connecting your data sources to Purview, which will automatically scan and index your data. Users can then search for data assets, view metadata, and understand data lineage, facilitating better data governance and management.
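To picture what a searchable catalog gives you, here is a minimal in-memory sketch: a list of data assets with metadata, and a search function that filters by name and classification. The asset names, sources, and classification strings are made up for illustration, and this is not the Purview API; it only shows the kind of question a catalog lets you answer.

```python
from dataclasses import dataclass, field

@dataclass
class DataAsset:
    """Hypothetical catalog entry: an asset name, its source system, and its classifications."""
    name: str
    source: str
    classifications: list = field(default_factory=list)

def search_assets(assets, keyword="", classification=None):
    """Return assets whose name contains the keyword and that carry the classification, if given."""
    keyword = keyword.lower()
    results = []
    for asset in assets:
        if keyword not in asset.name.lower():
            continue
        if classification is not None and classification not in asset.classifications:
            continue
        results.append(asset)
    return results

# Made-up sample catalog for illustration.
CATALOG = [
    DataAsset("customer_orders", "SAP", ["Personal Data"]),
    DataAsset("employee_payroll", "Workday", ["Personal Data", "Financial"]),
    DataAsset("sales_metrics", "Dynamics 365", []),
]

for asset in search_assets(CATALOG, classification="Personal Data"):
    print(asset.name, "from", asset.source)
```

A query like "show me everything classified as personal data" is exactly the kind of discovery question that becomes easy once your sources are scanned and indexed.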