Microsoft Purview – Victor's Blog

When to use Purview Information Barriers and Purview DLP

12/03/2025 09/03/2025 victorwingsing

At first glance, it’s easy to think that if you already have Data Loss Prevention (DLP) policies monitoring internal data flows, Information Barriers are an unnecessary extra. After all, DLP diligently scans every email, document, and chat for sensitive content. This is certainly the sentiment that I often hear when talking to Cyber Security teams.

This made me realise 2 things:

Microsoft needs to do a better job of marketing and promoting Purview Information Barriers, and
Information Barrier has its own purpose that DLP cannot fulfil.

What is Purview Information Barrier

Microsoft Purview Information Barrier is designed to restrict communication and collaboration between defined groups within an organisation. Its primary function is to ensure that teams with conflicting interests (think of trading and research groups in financial services) cannot interact with each other. By enforcing internal boundaries, these policies help maintain confidentiality and avoid accidental data leakage between sensitive departments (for example, leaks that could enable insider trading).

With Purview Information Barrier, you can create policies that automatically prevent internal teams from communicating with each other through Microsoft Teams. These include the following actions:

In SharePoint and OneDrive, Information Barriers can prevent the following unauthorized collaboration:

Capabilities shared by Information Barrier and Microsoft Purview DLP

You probably noticed that activities above such as “Sharing a file with another user” and “Sharing content with another user” can already be blocked with Microsoft Purview DLP. In essence, yes, that is correct. An admin can set up a policy to BLOCK this kind of file sharing with another user.

Where DLP falls short and Information Barriers shine

While Purview DLP is effective at blocking explicit sending or sharing actions, it misses scenarios where access has already been granted, which is where Purview Information Barriers come to the rescue. DLP policies activate when a user actively sends or shares data, but if sensitive information is exposed through granted permissions, the DLP policy remains dormant. For example, if User A (Finance) adds User B (Sales) as a member of the Finance team in Teams or of the Finance SharePoint site, User B gains immediate access to all files without any explicit sharing event, leaving DLP unable to intervene.

Alternatively, User A could simply send a meeting invite and start a Teams call with screen sharing, bypassing the trigger for DLP.

As another example, consider a situation where User A uploads a confidential document to a shared folder that automatically grants access to a broader group. Here, Information Barriers would prevent unauthorised viewing by restricting access at the source, whereas DLP would not block the document being placed in that shared location.
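To make the difference concrete, here is a toy Python model of the membership scenario. It is only a sketch and not a Purview API: every name in it (BLOCKED_PAIRS, dlp_allows, barrier_allows, add_member) is hypothetical. DLP is modelled as a check that only fires on explicit share or send events, while the barrier is modelled as a check on who is involved.

```python
from dataclasses import dataclass, field

# Hypothetical pair of segments that a barrier policy keeps apart.
BLOCKED_PAIRS = {frozenset({"Finance", "Sales"})}


@dataclass
class User:
    name: str
    segment: str


@dataclass
class Site:
    owner_segment: str
    members: list = field(default_factory=list)


def dlp_allows(event: str, content_is_sensitive: bool) -> bool:
    """Toy DLP: only evaluates explicit share/send events."""
    if event in {"share", "send"}:
        return not content_is_sensitive
    return True  # membership changes never trigger this check


def barrier_allows(segment_a: str, segment_b: str) -> bool:
    """Toy barrier: evaluates who is involved, not what is being sent."""
    return frozenset({segment_a, segment_b}) not in BLOCKED_PAIRS


def add_member(site: Site, user: User, barriers_enabled: bool) -> bool:
    """Try to add a user to a site holding sensitive files."""
    if not dlp_allows("add_member", content_is_sensitive=True):
        return False
    if barriers_enabled and not barrier_allows(site.owner_segment, user.segment):
        return False
    site.members.append(user.name)
    return True


finance_site = Site(owner_segment="Finance")
user_b = User("User B", "Sales")
print(add_member(finance_site, user_b, barriers_enabled=False))  # True
print(add_member(finance_site, user_b, barriers_enabled=True))   # False
```

With DLP alone, the membership change goes through because no share event ever happens. With the barrier check switched on, the same request is refused before any file is exposed.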

Strategy for using BOTH Information Barrier and DLP

You should view Purview Information Barriers as a key part of your data governance and protection strategy. Relying solely on DLP leaves gaps that Information Barriers can fill by preventing risky internal interactions before they even happen. Here are a few actionable items that you can do today:

Start by reviewing your organisation’s internal communication flows to identify potential conflicts of interest, then define segment rules that restrict who can communicate with whom.
Work with your Corporate Communications, Human Resources and Legal teams to identify when and where to apply restrictions between groups of users.
Ensure these barriers align with your overall compliance and governance framework, and conduct regular testing to confirm their effectiveness. Then codify them in your data governance policies.
Finally, train your teams on why these measures are necessary and how to adhere to them.

Adopting a dual strategy with both DLP and Information Barriers will give you a much stronger data protection stance, reducing the chance of inadvertent data leaks from within.

References:

Information Barriers: https://learn.microsoft.com/en-us/purview/information-barriers-solution-overview
Get started with Information Barriers: https://learn.microsoft.com/en-us/purview/information-barriers-policies?tabs=microsoft-purview-portal
Use Microsoft 365 Information Barriers to stay secure & compliant in your Microsoft Teams workplace: https://www.youtube.com/watch?v=39HDyKVZTSA

Deep dive in PDF labeling and data protection

09/03/2025 09/03/2025 victorwingsing

Let’s cut to the chase – PDFs are everywhere in your organisation, and they’re housing your sensitive data. I’m talking about those finalised e-signed contracts, bank statements, and countless other critical documents. While we’re all busy protecting our Office files with fancy security measures, PDFs often slip through the cracks. But here’s the thing – they need the same level of classification and protection as your typical .docx or .xlsx files.

Here are the different ways you can label PDF files, plus a simple-to-follow deployment strategy for bringing PDF data classification to your data.

Labeling PDFs: Three Approaches

Label data natively in Microsoft Office then save it as PDF
Label data using Adobe Acrobat
Label data using the Microsoft Purview Information Protection client

Read all the way to the end to see what happens when you open an encrypted PDF file in Word.

Approach 1: Label natively using Microsoft Office then save it as a PDF

This approach is something you can do now. This method involves applying a sensitivity label directly to an Office document in an application like Microsoft Word, and then saving it as a PDF. Although the label transfers to the PDF, note that if your label incorporates encryption, you must disable the PDF/A option when saving. The resulting PDF will display protection via Purview Information Protection, and its custom properties will indicate the applied label.

Saved as a PDF. The document security properties show no security, because the label that I used is a plain label without any encryption.

The custom properties show the label that I used.

TAKE NOTE that if your label has ENCRYPTION turned on, then you need to unselect the PDF/A option when you save it.

The security tab displays that it’s protected by Purview Information Protection.

The custom properties show the Privileged / Protected / Encrypted label used.
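If you want to check those custom properties in bulk rather than by eye, a small helper can group them by label. This is a minimal sketch: it assumes the property names follow the MSIP_Label_<GUID>_<Attribute> convention used for label metadata in Office files, and that you have already read the properties into a dictionary with a PDF library of your choice (that part is not shown). The sample values are made up.

```python
import re

# Assumed naming convention: MSIP_Label_<GUID>_<Attribute>
LABEL_KEY = re.compile(r"^MSIP_Label_(?P<guid>[0-9a-fA-F-]{36})_(?P<attr>\w+)$")


def extract_labels(custom_properties: dict) -> dict:
    """Group label attributes by label GUID, ignoring unrelated properties."""
    labels = {}
    for key, value in custom_properties.items():
        match = LABEL_KEY.match(key)
        if match:
            labels.setdefault(match["guid"], {})[match["attr"]] = value
    return labels


# Hypothetical properties, as they might be read from a labelled PDF.
sample = {
    "Author": "victor",
    "MSIP_Label_11111111-2222-3333-4444-555555555555_Enabled": "true",
    "MSIP_Label_11111111-2222-3333-4444-555555555555_Name": "Confidential",
}
print(extract_labels(sample))
```

The output is one entry per label GUID, which makes it easy to spot files that carry no label at all (an empty result).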

Approach 2: Label data using Adobe Acrobat PDF Reader

Here’s where it gets interesting (and a bit challenging). Most of us view these PDFs through web browsers or PDF readers, with Adobe being the undisputed king of the PDF world. In fact, Adobe’s so dominant that in most organisations I’ve worked with, it’s practically become the default way to handle PDFs – much like how we all say “Google it” instead of “search for it”.

Unlike your Microsoft Office suite (Word, Excel, PowerPoint, Outlook), Adobe Acrobat doesn’t play nicely with Sensitivity labels. The “solution”? Mucking about in the Windows registry. Yes, you read that right: registry editing. Adobe’s own support documentation lists the exact steps to do this (Adobe MPIP Support: https://helpx.adobe.com/enterprise/kb/mpip-support-acrobat.html).

Sure, tweaking the registry is not difficult to do. But imagine rolling this out across thousands of machines in your enterprise. Any experienced IT admin who’s attempted large-scale registry changes will tell you that it’s not fun.

There is a way to do this via Intune to simplify things. You can read about it on Simon Skotheimsvik’s blog: https://skotheimsvik.no/how-to-use-intune-to-enable-sensitivity-labels-on-pdf-files

Why would you choose this option?

This option is great if you need to add the same Header, Footer or Watermark that you use in your Word, Excel and PowerPoint files to your PDF.

Approach 3: Label data using the Microsoft Purview Information Protection client

This client must first be installed on your Windows devices before this approach will work. You can get it here: https://www.microsoft.com/en-gb/download/details.aspx?id=53018

Once installed, you have a tool that can label PDF files and do so much more. There are some limitations, which you’ll see below. The client can be launched by right-clicking a file and selecting Apply sensitivity label with Microsoft Purview.

One big benefit of using this client is that you can select multiple files or even an entire folder and mass label them in one go. You can use this to MANUALLY label all the files sitting on a PC or even on a shared network drive.

The limitation.

The limitation of this tool is that you cannot label a PDF while it is open, as there is no label interface inside Adobe Acrobat. The tool also cannot apply headers, footers or watermarks. This is by design, as the client is a process that applies labels outside of Office apps. Read about it here: https://learn.microsoft.com/en-us/purview/sensitivity-labels-office-apps#when-office-apps-apply-content-marking-and-encryption

Opening Encrypted PDF in Word?

This was a question put to me by a client: What happens when a user tries to open a PDF in Word?

Most of us by now know that you can open and edit a PDF in Word (if you don’t know how, please check this: https://support.microsoft.com/en-us/office/opening-pdfs-in-word-1d1d2acc-afa0-46ef-891d-b76bcd83d9c8).

The short answer is that your data is still protected. Here’s what happened when I tried to open an encrypted PDF file in Word.

Here’s the original PDF file that I have encrypted.

After using Word to open the PDF, a pop-up prompt asked me to select how I wanted the file to be opened.

From the Preview window, I can already see that the data is encrypted by Microsoft IRM Services. This gives me confidence that the data is protected.

Then, upon opening the file, all I could see was scrambled, encrypted data. The text and image in the original file were no longer readable.

Deployment strategy

Now that you know how labels work for PDFs, let’s talk about deployment.

Begin with Approach 1 because it leverages familiar tools like Microsoft Word and allows you to secure sensitive PDFs right from the document creation stage. This straightforward step minimises the learning curve and reduces the likelihood of errors, enabling your team to adopt essential security measures immediately.

Once the basics are in place, invest in user education to ensure proper application and management of sensitivity labels. Training reinforces security compliance and builds a strong foundation, empowering your staff to understand and uphold data protection practices across the organisation.

After establishing confidence in Approach 1, transition to the Microsoft Purview Information Protection client (Approach 3) to enable scalable, mass labelling across devices and shared drives. This phased progression not only improves operational efficiency and consistency but also sets the stage for introducing more advanced options like registry adjustments (Approach 2) when additional formatting or watermark requirements arise.

References:

PDF Support: https://learn.microsoft.com/en-us/purview/sensitivity-labels-office-apps#pdf-support
Reading resources https://techcommunity.microsoft.com/blog/microsoft365insiderblog/apply-sensitivity-labels-to-pdfs-created-with-office-apps/4214665 https://office365itpros.com/2022/06/20/protected-pdfs-microsoft-365-easier/
MIP end-user document https://support.microsoft.com/en-us/topic/label-and-protect-files-in-file-explorer-in-windows-67829155-2d0e-4122-9677-7c53c8cba18a
MIP Labeller download https://www.microsoft.com/en-gb/download/details.aspx?id=53018
News about labeller https://learn.microsoft.com/en-us/purview/about-infoprotect-client Release notes https://learn.microsoft.com/en-us/purview/information-protection-client-relnotes
How to use sensitivity labels in your PDF files by Nikki Chapple (2022): https://nikkichapple.com/how-to-use-sensitivity-labels-with-your-pdf-files/
How to use Intune to enable sensitivity labels on PDF files (2022): https://skotheimsvik.no/how-to-use-intune-to-enable-sensitivity-labels-on-pdf-files

All Adobe related guides:

Adobe MPIP Support: https://helpx.adobe.com/enterprise/kb/mpip-support-acrobat.html
Watermarking, Headers and Footers are enabled in PDF
Video here: https://experienceleague.adobe.com/en/docs/document-cloud-learn/acrobat-learning/integrations/microsoftsensitivitylabels
Press release on Adobe PDF support (2023) https://www.microsoft.com/en-us/security/blog/2023/03/07/get-integrated-microsoft-purview-information-protection-in-adobe-acrobat-now-available/

In the center of it all.

04/02/2025 04/02/2025 victorwingsing

Among all Microsoft Purview security solutions, there’s one that you absolutely must get right. If you don’t, your entire data security strategy could fall apart, no matter what other security tools you’re using.

This key solution brings together three basic but crucial tasks: finding your sensitive data, labelling it correctly, and keeping it safe. This solution is Microsoft Purview Information Protection (MIP), and it’s at the heart of how you protect your company’s data.

Why is MIP so critical?

Think of Microsoft Purview’s Data Classification service as the system that helps all other security tools know what to do. Here’s how it works with different Purview tools:

Purview Data Loss Prevention (DLP):

Works like a security guard that reads the labels
If it sees a file marked ‘Secret’, it knows exactly what protection rules to follow
For example: “This is confidential data, so don’t let it be shared outside the company”

Endpoint DLP (Devices) and Microsoft Defender for Cloud Apps:

These tools check the labels whether you’re working on your laptop or in cloud apps like Workday, Salesforce, etc.
They constantly ask “What’s this file’s label?” before allowing any action
Then they make sure the right safety measures are in place

Microsoft Purview Insider Risk Management:

This one’s particularly clever about using the labels
It watches for unusual behaviour with sensitive data
For example: If someone suddenly downloads 100 files marked ‘Highly Confidential’, it raises an alert
It can then start extra monitoring or take other protective steps
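The download example can be pictured as a simple threshold rule over labelled activity. The snippet below is only an illustration of that idea and not how Insider Risk Management is implemented; the function name, the event format and the threshold of 100 are assumptions borrowed from the example above.

```python
from collections import Counter

THRESHOLD = 100  # hypothetical alert threshold, taken from the example above


def users_to_flag(download_events, label="Highly Confidential", threshold=THRESHOLD):
    """Return users whose downloads of files carrying `label` reach the threshold.

    `download_events` is an iterable of (user, file_label) tuples.
    """
    counts = Counter(user for user, file_label in download_events if file_label == label)
    return sorted(user for user, count in counts.items() if count >= threshold)


events = (
    [("alice", "Highly Confidential")] * 100
    + [("bob", "Highly Confidential")] * 3
    + [("bob", "General")] * 200
)
print(users_to_flag(events))  # ['alice']
```

Notice that bob’s 200 downloads of ‘General’ files do not count. Without the label there would be nothing to tell the two users apart, which is the whole point of getting MIP right first.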

Microsoft Purview Data Governance (Data Map)

This service uses MIP to help you map and catalog your structured data.
It gives you the ability to apply consistent classification across your data estate. You can have a standardised label across your organisation.
For example: “A ‘Confidential’ label means the same thing everywhere, making it easier to manage and protect”

Third party services using MIP

Even third-party services leverage the MIP data classification service.

Trellix integrates its DLP network appliance with MIP: https://docs.trellix.com/bundle/data-loss-prevention-11.11.x-product-guide/page/UUID-5d61c924-38ac-3cb9-fb84-17596363740f.html

CrowdStrike leverages Microsoft Purview Information Protection labels (page 5 of 7): https://www.crowdstrike.com/wp-content/uploads/2023/12/A-Modern-Approach-to-Confidently-Stopping-Unauthorized-Data-Exfiltration_WhitePaper.pdf

Zscaler and Egnyte can import MIP labels as part of their DLP: https://help.zscaler.com/downloads/zscaler-technology-partners/data/zscaler-and-egnyte-deployment-guide/Zscaler-Egnyte-Deployment-Guide-FINAL.pdf

Microsoft Purview Information Protection is the foundation that your entire data security and governance strategy builds upon. Without a properly planned and implemented MIP deployment, even the most sophisticated Purview solutions won’t deliver their full value. Think of it as building a house – you need to get the foundation right first.

As your organisation grows and your data landscape becomes more complex, your MIP strategy needs to evolve too. Regular reviews of your classification labels, updating sensitivity rules, and fine-tuning your protection policies aren’t just good practice – they’re essential for keeping your data secure and compliant.

Making the case for Optical Character Recognition (OCR) in your Data Protection strategy

01/02/2025 03/02/2025 victorwingsing

I recently applied for a U.S. visa, and as part of the process, I had to submit my passport, bank records, and a lot of personally identifiable information to the embassy in the form of PDF and JPEG files. This meant that much of my sensitive data is now stored as images. This made me wonder: How are organisations safeguarding data that is image-based rather than text-based?

Traditional Data Loss Prevention (DLP) strategies, while effective in monitoring text-based data, often fall short when it comes to image-based content. This shortcoming can lead to significant vulnerabilities, as sensitive information is frequently embedded within images (see my example above). Optical Character Recognition (OCR) addresses this gap by enabling organisations to extract and analyse text from images. For Cyber Security teams aiming to enhance their data security posture, integrating OCR into their DLP strategy is not just beneficial, it is a must!

What are the industry use cases for OCR in DLP?

Financial services: Sensitive information such as account numbers, credit card details, and personally identifiable information (PII) is often embedded in scanned documents, receipts, and screenshots.
Healthcare industry: Data often takes the form of medical records and scans, prescriptions and doctor’s notes (assuming that your doctor can write legibly).
Retail and Ecommerce: Scanned receipts and invoices, plus most product returns and refunds, which start on paper and get scanned and stored.
Manufacturing: Contracts, blueprints, R&D documents and even internal presentations (most of which get converted to either an image or a PDF).
Government and Public Sector: Scanned copies of passports, driver’s licences and other PII, and incident reports (which again start on paper and end up as an image).

These are just examples of where OCR in DLP can come in to ensure that data is not leaked out.

OCR in Microsoft Purview

Microsoft Purview has an OCR capability that allows you to identify and protect data held in images. It lets you scan images for sensitive information, but do remember that this is an OPTIONAL feature and must be enabled at the tenant level. There’s also a bit of a cost to it (more on this later).

To turn on OCR in Microsoft Purview, you need to do the following.

Go to Settings > Select Optical character recognition (OCR).
Choose where you want OCR to scan.

The full technical instructions can be found here: https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#workflow-at-a-glance

The Cost of OCR

This capability is powerful because it leverages Azure AI to perform OCR. As of today, the cost is $1.00 USD per 1,000 scanned items. The key phrase to look out for in the pricing is ‘per scanned item’, because Microsoft counts each page in a PDF, and each individual image in a set of images, as 1 scan. So a PDF that contains 10 pages counts as 10 scans. https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#estimate-your-ocr-scanning-charges
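Here is a quick back-of-the-envelope calculator for that pricing rule. It is a sketch that assumes the rate quoted above and a simple linear charge; the function name is made up, and how Microsoft rounds or bills partial thousands is not covered, so check the pricing page before budgeting.

```python
PRICE_PER_1000_ITEMS_USD = 1.00  # rate quoted above; verify against current pricing


def estimate_ocr_cost(standalone_images: int, pdf_page_counts: list) -> float:
    """Estimate OCR charges: each image and each PDF page is one scanned item."""
    scanned_items = standalone_images + sum(pdf_page_counts)
    return scanned_items * PRICE_PER_1000_ITEMS_USD / 1000


# 5,000 images plus three PDFs of 10, 250 and 40 pages = 5,300 scanned items
print(f"${estimate_ocr_cost(5000, [10, 250, 40]):.2f}")  # $5.30
```

The takeaway is that page counts, not file counts, drive the bill. A handful of long scanned PDFs can cost more than thousands of single images.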

Data Strategy for using OCR for the first time.

To limit your cost and be more deliberate in running OCR scans, here’s a helpful strategy that you can use to get started.

Data Search Using Content Search in Purview: Utilize Microsoft Purview’s Content Search feature to filter by file type, such as JPEG and PNG, to identify images that potentially contain sensitive information. This targeted approach shows you which image files exist, and where, before you decide what OCR should scan.

Focus on Known Locations: Identify departments or teams that handle sensitive data, such as Finance, Sales, and Marketing, and focus OCR searches on their respective SharePoint sites. This strategy maximizes the efficiency of OCR by concentrating on areas where sensitive information is most likely to reside.

File Name Analysis: Implement keyword searches for terms that indicate sensitive content, such as “passport” or “ccn” (credit card number), in file names. This proactive approach helps in identifying and flagging files that may contain sensitive information.
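If you export a list of file names from your search results, the file type and file name steps above can be combined into a quick triage script. This is a minimal sketch that runs over plain file names and is not a Content Search query; the function name and sample paths are hypothetical, and the extensions and keywords are simply the ones mentioned above.

```python
from pathlib import PurePath

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # file types named above
NAME_KEYWORDS = ("passport", "ccn")           # keywords named above


def ocr_candidates(file_names):
    """Yield (name, reason) for files worth prioritising for an OCR scan."""
    for name in file_names:
        path = PurePath(name)
        lowered = path.name.lower()
        hits = [keyword for keyword in NAME_KEYWORDS if keyword in lowered]
        if hits:
            yield name, "keyword: " + ", ".join(hits)
        elif path.suffix.lower() in IMAGE_EXTENSIONS:
            yield name, "image file"


files = ["Finance/passport_scan.PNG", "Sales/q3-deck.pptx", "HR/IMG_0042.jpeg"]
for name, reason in ocr_candidates(files):
    print(name, "->", reason)
```

Keyword hits are reported first because a file called passport_scan is a stronger signal than an image with a camera-generated name.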

AI Implementation Failures: What We Learned from 2024

03/01/2025 03/01/2025 victorwingsing

My news feed is filled with “A Year in Review” pieces about what happened in 2024, and the thing that stood out to me was that 2024 was a bit of a mess for AI implementations.

We saw everything from chat-bots giving illegal advice to fake content flooding our news and social media feeds (I’m pretty sure that I’m not the only one who’s seen the Pope wearing a cool puffy jacket).

Credit: https://x.com/skyferrori/status/1639647708154675200?mx=2

So how did we get here?

The rush to implement AI solutions was largely driven by market pressure and FOMO (Fear of Missing Out). Companies, desperate to stay competitive, rushed to deploy AI solutions without proper governance frameworks or security controls. Board rooms worldwide echoed with demands for “AI strategy,” often without understanding what that actually meant for their business.

This perfect storm was further fueled by the accessibility of AI tools and platforms. What used to require deep technical expertise became available through simple APIs and low-code interfaces. While this democratisation of AI is generally positive, it led to a “wild west” scenario where implementations often outpaced proper security and compliance considerations.

The result? Poor deployments, terrible user experiences and many half-baked AI solutions, along with security vulnerabilities and trust issues.

Before You Start: The Boring (But Essential) Bits

Look, I get it – you want to jump straight into the exciting world of AI. But here’s the thing: you need to sort out your data house first. Think of it like baby-proofing your home. Your CISO and security team need to know exactly what data you’ve got, where it lives, and who’s allowed to play with it.

Get your Microsoft Purview DLP policies sorted, tag your sensitive stuff using Purview Information Protection, and make sure you’ve got the right security controls in place. Trust me, this boring bit will save you from some proper headaches later.

The Fix: Four Simple Actionable Steps

Sort Out Your Governance
- Get an AI committee going
- Write clear policies on AI usage, Data Protection, etc
- Set proper standards
- Actually check if things work (please audit!)
Lock Down Security
- Content verification is your friend
- Proper access controls (seriously)
- Encrypt everything (not just email but the underlying data must be encrypted too)
- Have a plan when things go wrong
  - Microsoft Purview is ready to secure data for AI
Quality Control
- Keep humans in the loop
- Test, test, test
- Watch those outputs (again please run audit checks)
- Clean data = better results
  - Azure has a good framework for this: Test and evaluate AI workloads on Azure
Smart Implementation
- Start small, scale later (even on a controlled Copilot for Microsoft 365, pilot it first with a handful of trusted people)
- Train your people properly (end-user education is a must)
- Listen to user feedback
- Don’t rush it

2024 showed us that rushing in without proper planning is a recipe for disaster. Take your time, do it right, and maybe we won’t see your company in next year’s “AI Fails” list.


Excluding a specific user (or group) from Sensitivity labels

13/10/2024 03/02/2025 victorwingsing

I’m excited to share a practical guide I’ve created that walks you through the process of excluding specific users or groups from Microsoft Purview Sensitivity Labels. This guide comes from a real-world scenario where an organization is piloting a new approach to simplify its labeling structure. They wanted to test how reducing the number of labels applied to users would affect workflows and information protection. To support this, I’ve put together detailed instructions on how to effectively manage exclusions in Purview, along with a back-out process to ensure a smooth rollback if needed.

This PDF guide is packed with step-by-step instructions, screenshots, and expert tips to help you navigate the nuances of label exclusions. Whether you’re in the middle of a label simplification pilot or simply looking to better control label application, this guide will help streamline your process. Get ready to dive in and experience a more flexible, user-centered approach to managing Sensitivity Labels in Microsoft Purview!

Excluding a Group from Sensitivity label policy Download

From Novice to Ninja: a new CISO’s guide to DLP

06/10/2024 03/02/2025 victorwingsing

Congratulations, CISO! 🎉 Great job in landing your new role, where protecting sensitive data isn’t just a job—it’s a daily tightrope walk over a pit of cyber threats, compliance demands, and evolving technology.

Now that you’re at the steering wheel, your inbox is probably overflowing with security concerns, regulatory requirements, and a few “fun” audit emails. Don’t worry, you’re in good company. This guide is here to give you actionable steps to set up your Data Loss Prevention (DLP) strategy, ensuring you don’t just survive in this role—you thrive.

So, what does being a CISO mean? Well, you’re now the go-to person when sensitive data sneaks out, malicious insiders get a bit too curious, or someone clicks that suspicious link promising free money from an unknown relative in Timbuktu. No pressure, right? But here’s the deal: inaction is risk. Delaying or overlooking the core elements of a solid DLP strategy could lead to breaches that cost more than your next cybersecurity budget.

To make your journey smoother, I’ve prepared a handy worksheet that you can use right now to take action on your Data Loss Prevention strategy. These aren’t just checkboxes—these are critical steps to lock down your organization’s data and avoid waking up to a breach nightmare.

You can download the worksheet below.

CISO DLP worksheet

Here’s what you can expect to see inside:

1. Classifying Data and Why It’s Important

Why it matters: Not all data is created equal. By classifying your data, you can prioritize resources and security measures where they’re needed most. Would you protect the company picnic plan with the same force as your customers’ financial information? (Spoiler: probably not!)

Example:

High-risk data: Customer credit card details, proprietary code, or confidential HR files—things you’d never want to see in the wrong hands.
Medium-risk data: Internal meeting notes, marketing strategies—sensitive, but not catastrophic if leaked.
Low-risk data: Public reports, customer FAQs—this is the stuff you’d share at a conference.

Take Action Today: Review your organization’s data and start tagging it by risk level. Ask yourself, “What would happen if this data got out?” and use that to guide your classification efforts.

2. Why and How to Identify Sensitive Data

Why it matters: You can’t protect what you don’t know exists. Sensitive data is often hidden across different platforms—sometimes even in the most unexpected places (like a random email attachment or NTFS file shares). Identifying it is the first step to ensuring it stays secure.

Example:

Sensitive Data: Personally Identifiable Information (PII) like social security numbers or health records, intellectual property (IP), and anything that’s subject to regulations like GDPR or HIPAA.
Surprise Discovery: Finding a list of client emails attached to a forgotten project buried in a shared folder.

Take Action Today: Use a discovery tool or audit your data manually. Start with cloud storage, email servers, and shared folders. Look for data that could lead to a privacy violation or financial loss if exposed.

3. Developing a Data Handling Policy

Why it matters: A solid data handling policy is the foundation of your DLP strategy. Without clear rules in place, sensitive information can slip through the cracks, exposing your organization to unnecessary risk. Your data handling policy ensures everyone—from top execs to interns—understands the dos and don’ts of handling sensitive information.

Example:

Clear Guidelines: For high-risk data like financial information, the policy might mandate encryption during transfer and restricted access to authorized personnel only.
Real-Life Scenario: Imagine your marketing team accidentally sharing a file with customer details over an unsecured network. A proper data handling policy would prevent this by enforcing secure file transfer practices.

Take Action Today: Draft a policy that covers how different types of data (high, medium, low risk) should be handled. It should specify everything from encryption requirements to access control and data retention periods. Involve key stakeholders (Legal, IT, HR) to ensure all bases are covered.

Now that you know the key steps to securing your organization’s data, it’s time to plan it out, partner with your internal stakeholders, and take action. DLP isn’t a one-person job—it’s a team effort that involves collaboration across IT, Legal, HR, and beyond. The risks of inaction are far too high, so don’t wait until something goes wrong. Proactively implementing these best practices today will not only protect your data but also strengthen your leadership as a new CISO.

Take a Load Off and SIT (an oversimplified explanation of using SIT)

11/06/2024 11/06/2024 victorwingsing

In my Purview Ninja Training (you can take the training too), one of the Purview capabilities that I struggled to understand at first was using Sensitive Information Types for automatic classification. Not because it’s difficult to understand but because there are so many different options to choose from that can be applied to similar use cases.

So to save time in understanding it, here is an over-simplified matrix of when to use the different automatic classification options using Microsoft Purview Information Protection.

When to use each capability.

Built-in SIT: Ready-to-use, predefined data types like social security numbers, credit card numbers, and other common sensitive data formats. Ideal for general compliance and basic data protection needs.
Custom SIT: Customizable to meet unique organizational requirements. Suitable for both structured and unstructured data.
EDM (Exact Data Match SITs): Best for exact matches of structured data with consistent formats, such as financial records and personal IDs.
Document Fingerprinting: Detects and protects standardized documents with repeatable structures, like legal forms and templates.
Named Entities SIT: Used for detecting contextual sensitive or important data, like names or organizations, particularly within unstructured formats.
Trainable Classifiers: Useful for complex or ever-changing data types, especially in unstructured data, where static rules or patterns are inadequate.
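To give a feel for what a pattern-based SIT does under the hood, here is a simplified Python analogue: a primary pattern, a checksum, and a supporting keyword within a proximity window. It is a conceptual sketch and not Purview’s detection logic. The regex, the keyword list, the 300-character window and the confidence labels are all assumptions made for illustration.

```python
import re

CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")  # 16 digits, optional separators
KEYWORDS = ("credit card", "card number", "ccn")     # hypothetical supporting evidence
PROXIMITY = 300                                      # characters; hypothetical window


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum used by payment card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str):
    """Return (number, confidence) pairs for Luhn-valid 16-digit matches."""
    results = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if not luhn_valid(digits):
            continue  # right shape, wrong checksum
        window = lowered[max(0, match.start() - PROXIMITY): match.end() + PROXIMITY]
        confidence = "high" if any(keyword in window for keyword in KEYWORDS) else "low"
        results.append((digits, confidence))
    return results


print(find_card_numbers("Card number: 4111 1111 1111 1111"))
print(find_card_numbers("id 1234 5678 9012 3456"))
```

The first call finds the well-known test number with high confidence because a keyword sits next to it. The second returns nothing because the digits fail the checksum. EDM, fingerprinting and trainable classifiers exist for the cases where a pattern like this cannot describe your data.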

Dude, Where’s my DATA

02/06/2024 02/06/2024 victorwingsing

Data is the new currency in today’s digital age. Just as you wouldn’t leave your house title lying around for anyone to take, understanding where your data resides is crucial for its protection. Knowing the exact location of your data allows you to implement proper security measures, ensuring it’s not vulnerable to unauthorized access or breaches.

Understanding your data’s location also plays a vital role in regulatory compliance. For instance, the CIS Controls (https://www.cisecurity.org/controls), specifically Control 13: Data Protection and Control 14: Controlled Access Based on the Need to Know (as numbered in CIS Controls v7), emphasize the need to secure data and limit access strictly to those who need it. By mapping out where your data lives, you can better align your practices with these controls, reducing risks and meeting compliance requirements.

In this blog, I will guide you through the various methods to discover where your data resides, the specific tools to use for different types of data, and when and how to effectively utilize each tool.

The 2 Methods of Data Discovery

Manual methods involve physically documenting all the locations where your data is stored. This approach requires you to actively track and record each data repository, whether it’s on-premises, in the cloud, or across various applications and devices. While this method can be thorough and provide a deep understanding of your data landscape, it is also time-consuming and prone to human error. Think of it as manually creating an inventory of every item in your home – it’s detailed but can be exhausting and easy to miss something.

Automatic methods leverage technology to scan, map, and classify your data across different environments. These methods use specialized tools to automatically discover data locations, classify sensitive information, and provide insights into data usage and movement.
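
As a rough illustration of what an automatic method does under the hood, the sketch below walks a folder, reads each text file, and counts strings shaped like a US social security number. It is a toy example, not a stand-in for any Microsoft tool. The .txt filter and the simplified SSN pattern are assumptions, and real scanners add validation, context checks, and support for many more file types.

```python
import re
from pathlib import Path

# Simplified US SSN shape (assumption; real detectors add validation and context).
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_folder(root: str) -> dict[str, int]:
    """Map each .txt file under root to the number of SSN-shaped strings in it."""
    findings = {}
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = len(SSN_LIKE.findall(text))
        if hits:
            # Only files with at least one hit are reported.
            findings[str(path)] = hits
    return findings
```

The result is a simple inventory of which files hold sensitive-looking data and how much, which is the same kind of output the manual method tries to build by hand.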

Types of Data in an Organization

Organizations typically handle two primary types of business data: documents and organizational business data.

Documents include files like reports, presentations, spreadsheets, and PDFs, which often contain sensitive information and require careful management and protection.

On the other hand, organizational business data encompasses the data generated from business operations, workflows, and applications, such as transaction records, customer information, and operational metrics. Think of the data held in applications such as Dynamics 365, Workday, and SAP. This is the data used for day-to-day operations.

Now that we know the 2 types of data in an organisation, let’s look at the Microsoft solutions available to DISCOVER DATA (most of which are already included in your Microsoft 365 Business Premium, E3, or E5 licenses).

Quick Note:

Some solutions that are not on this list also have some form of search/discovery capability (ex. Purview Data Lifecycle Management, Audit Log Search). I’ve omitted them because their primary purpose is data governance, and their data discovery capability relies on the other items listed below.

Document Discovery Tools

Microsoft Purview Information Protection (for documents stored in Email, SharePoint, OneDrive and Teams): It helps classify and label data based on its sensitivity. Start by defining your data classification schema, apply labels to your documents using built-in or custom labels, and configure policies to automatically classify and protect sensitive information as it is created or modified.

Microsoft Purview Information Protection scanner (for on-prem data): This is designed to scan and classify on-premises data. To use it, deploy the scanner to your on-premises environment, configure scanning jobs to target specific data repositories, and review the scan results to understand where sensitive information resides and how it is being used.

Microsoft Compliance Center (Content Search Tool): The Content search tool in the Microsoft Compliance Center allows you to search for and manage content across your organization.

Microsoft 365 eDiscovery: This helps you manage and analyze large volumes of data for legal and compliance purposes. To use it, access the eDiscovery portal, create a case, add data sources, and run searches and analytics to gather relevant information for your legal or compliance needs.

Defender for Cloud Apps: This is a comprehensive solution for monitoring and controlling data movement across cloud applications. The tool also offers data classification and protection through integration with Microsoft Purview Information Protection, ensuring consistent data security across your organization.

Priva (using Privacy Assessments): This is specifically for personal data discovery. It automates the discovery, documentation, and evaluation of personal data use across your entire data estate. Using this regulatory-independent solution, you can automate privacy assessments and build a complete compliance record for the responsible use of personal data.

Organizational Business Data Tools

Purview Data Map: Helps you create a unified map of your data estate by automatically scanning and classifying your data sources. To use it, configure scanning rules and connect your data sources to Purview. The Data Map will continuously update, providing an up-to-date view of your data landscape, including classification and sensitivity labels, which helps in managing data compliance and governance.

Purview Data Catalog: Provides a searchable catalog of data assets, making it easy to discover and understand data across your organization. To use it, start by connecting your data sources to Purview, which will automatically scan and index your data. Users can then search for data assets, view metadata, and understand data lineage, facilitating better data governance and management.

Information Rights Management vs. Encryption via Sensitivity Labels: Why You Can’t Use Both on One Document

22/05/2024 victorwingsing

An interesting use case came from a client who was looking at enabling encryption using Sensitivity Labels and doing away with the existing Information Rights Management (IRM) protection on their files in SharePoint.

One of the security analysts asked why not use BOTH at the same time. If both of them offer security protection, surely having DOUBLE protection will be better, right? Well…

Before we dive deeper into the reason, let’s first understand what Information Rights Management and encryption through Sensitivity Labels are.

What is IRM? Information Rights Management (IRM) is a tool that helps protect and control who can access, edit, print, or forward your documents and emails. Think of it as a digital lock that only lets certain people in and tells them what they can and can’t do with the information.

How to Use IRM in Microsoft 365:

Go to the document or email you want to protect.
Click on the “File” tab.
Select “Info” and then “Protect Document.”
Choose “Restrict Access” and set the permissions for who can access and what they can do.

What are Sensitivity Labels? Sensitivity Labels are part of Microsoft Information Protection solutions. They allow organizations to classify and protect documents and emails based on their sensitivity. These labels can apply encryption, watermarking, and content marking, as well as define access policies. The organisation decides which labels apply encryption.

In simple terms, the encryption is applied once the appropriate Label is selected.

https://learn.microsoft.com/en-us/purview/encryption-sensitivity-labels
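
The relationship between a label and encryption can be pictured with a small Python model. This is only a conceptual sketch and not the Purview API: the label names, the encrypts flag, and the Document class are all hypothetical, and real labels carry far more settings (permissions, expiry, content marking, and so on).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Label:
    name: str
    encrypts: bool  # decided by the organisation when it designs the label


# Hypothetical label taxonomy; names and settings are illustrative only.
LABELS = {
    "Public": Label("Public", encrypts=False),
    "Internal": Label("Internal", encrypts=False),
    "Confidential": Label("Confidential", encrypts=True),
}


@dataclass
class Document:
    title: str
    label: Optional[str] = None
    encrypted: bool = False


def apply_label(doc: Document, label_name: str) -> Document:
    """Set the label and let the label's configuration decide on encryption."""
    label = LABELS[label_name]
    doc.label = label.name
    doc.encrypted = label.encrypts
    return doc
```

In this model the user never switches encryption on directly. Choosing "Confidential" is what encrypts the document, because that is how the organisation configured the label.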

Here’s why you can’t use both of them at the same time.

The primary reason you cannot use both IRM and Sensitivity Label encryption simultaneously on a document is the overlapping functionality and potential for conflict between the two systems:

Redundant Encryption: Both systems apply encryption, which can lead to conflicts or redundancy in the encryption process. Encrypting a document twice can complicate access management and decryption processes.

Policy Conflicts: IRM and Sensitivity Labels both define access and usage policies. Applying both might result in conflicting policies, making it difficult to enforce a clear and consistent set of rules.

Which encryption wins if the 2 are used at the same time?

Based on my testing, Information Rights Management (IRM) wins. When a document is protected by both IRM and a Sensitivity Label, the document retains the IRM encryption and loses the Sensitivity Label.

This outcome makes sense because IRM encryption is embedded directly into the document. On the other hand, Sensitivity Label encryption is more flexible and can be easily changed by applying or reapplying different labels. Therefore, the more rigid and integrated IRM encryption overrides the more adaptable Sensitivity Label encryption.