In the center of it all.

Among all Microsoft Purview security solutions, there’s one that you absolutely must get right. If you don’t, your entire data security strategy could fall apart, no matter what other security tools you’re using.

This key solution brings together three basic but crucial tasks: finding your sensitive data, labelling it correctly, and keeping it safe. This solution is Microsoft Purview Information Protection (MIP), and it’s at the heart of how you protect your company’s data.

Why is MIP so critical?

Think of Microsoft Purview’s Data Classification service as the system that tells all your other security tools what to do. Here’s how it works with different Purview tools:

Purview Data Loss Prevention (DLP):

  • Works like a security guard that reads the labels
  • If it sees a file marked ‘Secret’, it knows exactly what protection rules to follow
  • For example: “This is confidential data, so don’t let it be shared outside the company”
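To make the "security guard that reads the labels" idea concrete, here is a minimal Python sketch of a label-driven sharing decision. It is only an illustration, not how Purview DLP is implemented: the label names, the `EXTERNAL_SHARING_ALLOWED` table, the `ShareRequest` type and the `contoso.com` domain are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical rule table: label -> is sharing outside the company allowed?
EXTERNAL_SHARING_ALLOWED = {
    "Public": True,
    "General": True,
    "Confidential": False,
    "Secret": False,
}


@dataclass(frozen=True)
class ShareRequest:
    file_name: str
    label: str
    recipient_domain: str


def evaluate_share(request: ShareRequest, company_domain: str = "contoso.com") -> str:
    """Return "allow" or "block" for a sharing attempt, based only on the label."""
    is_external = request.recipient_domain.lower() != company_domain.lower()
    if not is_external:
        return "allow"
    # Labels missing from the table are treated as restricted (fail closed).
    allowed = EXTERNAL_SHARING_ALLOWED.get(request.label, False)
    return "allow" if allowed else "block"
```

The point of the sketch is that the rule never inspects the file content itself; it trusts the label, which is why getting the labelling right matters so much.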

Endpoint DLP (Devices) and Microsoft Defender for Cloud Apps:

  • These tools check the labels whether you’re working on your laptop or in cloud apps like Workday, Salesforce, etc.
  • They constantly ask “What’s this file’s label?” before allowing any action
  • Then they make sure the right safety measures are in place

Microsoft Purview Insider Risk Management:

  • This one’s particularly clever about using the labels
  • It watches for unusual behaviour with sensitive data
  • For example: If someone suddenly downloads 100 files marked ‘Highly Confidential’, it raises an alert
  • It can then start extra monitoring or take other protective steps
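The "100 files marked ‘Highly Confidential’" example can be sketched as a simple threshold check. This is a rough illustration only: the event format (a list of `(user, label)` tuples), the function name and the fixed threshold are assumptions for this example, and the real Insider Risk Management service uses far richer signals.

```python
from collections import Counter


def flag_bulk_downloads(events, label="Highly Confidential", threshold=100):
    """Return the sorted users whose download count for `label` meets the threshold.

    `events` is an iterable of (user, label) tuples, one per downloaded file.
    """
    counts = Counter(user for user, event_label in events if event_label == label)
    return sorted(user for user, total in counts.items() if total >= threshold)
```

Note that the check only works because each download event carries a label; unlabelled files would be invisible to it.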

Microsoft Purview Data Governance (Data Map):

  • This service uses MIP to help you map and catalog your structured data.
  • It gives you the ability to apply consistent classification across your data estate. You can have a standardised label across your organisation.
  • For example: “A ‘Confidential’ label means the same thing everywhere, making it easier to manage and protect”

Third-party services using MIP

Even third-party services leverage the MIP data classification service.

Trellix integrates its DLP network appliance with MIP: https://docs.trellix.com/bundle/data-loss-prevention-11.11.x-product-guide/page/UUID-5d61c924-38ac-3cb9-fb84-17596363740f.html

CrowdStrike leverages Microsoft Purview Information Protection labels (page 5 of 7): https://www.crowdstrike.com/wp-content/uploads/2023/12/A-Modern-Approach-to-Confidently-Stopping-Unauthorized-Data-Exfiltration_WhitePaper.pdf

Zscaler and Egnyte can import MIP labels as part of their DLP: https://help.zscaler.com/downloads/zscaler-technology-partners/data/zscaler-and-egnyte-deployment-guide/Zscaler-Egnyte-Deployment-Guide-FINAL.pdf

Microsoft Purview Information Protection is the foundation that your entire data security and governance strategy builds upon. Without a properly planned and implemented MIP deployment, even the most sophisticated Purview solutions won’t deliver their full value. Think of it as building a house – you need to get the foundation right first.

As your organisation grows and your data landscape becomes more complex, your MIP strategy needs to evolve too. Regular reviews of your classification labels, updating sensitivity rules, and fine-tuning your protection policies aren’t just good practice – they’re essential for keeping your data secure and compliant.

Making the case for Optical Character Recognition (OCR) in your Data Protection strategy

I recently applied for a U.S. visa, and as part of the process, I had to submit my passport, bank records, and a lot of personally identifiable information to the embassy in the form of PDF and JPEG files. This meant that much of my sensitive data is now stored as images. This made me wonder: How are organisations safeguarding data that is image-based rather than text-based?

Traditional Data Loss Prevention (DLP) strategies, while effective at monitoring text-based data, often fall short when it comes to image-based content. This shortcoming can lead to significant vulnerabilities, as sensitive information is frequently embedded within images (see my example above). Optical Character Recognition (OCR) addresses this gap by enabling organisations to extract and analyze text from images. For Cyber Security teams aiming to enhance their data security posture, integrating OCR into their DLP strategy is not just beneficial, it is a must!

What are the industry use cases for OCR in DLP?

  • Financial services: Sensitive information such as account numbers, credit card details, and personally identifiable information (PII) is often embedded in scanned documents, receipts, and screenshots.
  • Healthcare industry: Data often takes the form of medical records and scans, prescriptions, and doctor’s notes (assuming that your doctor can write legibly).
  • Retail and Ecommerce: Scanned receipts and invoices; most product returns and refunds start on paper and then get scanned and stored.
  • Manufacturing: Contracts, blueprints, R&D documents, and even internal presentations (most of which get converted to either an image or a PDF).
  • Government and Public Sector: Scanned copies of passports, driver’s licenses and other PII, plus incident reports (which again start on paper and end up as an image).

These are just examples of where OCR in DLP can come in to ensure that data is not leaked out.

OCR in Microsoft Purview

Microsoft Purview has an OCR capability that lets you identify and protect data held in images. It allows you to scan images for sensitive information, but do remember that this is an OPTIONAL feature and must be enabled at the tenant level. There’s also a bit of a cost to it (more on this later).

To turn on OCR in Microsoft Purview, you’d need to do the following.

  1. Go to Settings > select Optical character recognition (OCR).
  2. Choose where you want OCR to scan.

The full technical instructions can be found here: https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#workflow-at-a-glance

The Cost of OCR

This capability is powerful because it uses Azure AI to perform OCR. As of today, the cost is $1.00 USD per 1,000 scanned items. The key phrase to look out for in the pricing is ‘per scanned item’: Microsoft counts each page in a PDF, or each individual image in a set of images, as 1 scan. So a PDF that contains 10 pages counts as 10 scans. https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#estimate-your-ocr-scanning-charges
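To get a feel for the numbers before you switch OCR on, the pricing rule above can be turned into a quick back-of-the-envelope estimate. This is a sketch under stated assumptions: the function names are made up for this example, the rate is the $1.00 per 1,000 scans quoted above (check the linked page for current pricing), and it assumes cost scales linearly with the number of scans.

```python
from decimal import Decimal

# USD per 1,000 scanned items, as quoted above; verify against current pricing.
PRICE_PER_1000_SCANS = Decimal("1.00")


def count_scans(pdf_page_counts, image_count):
    """Each PDF page and each standalone image counts as one scan."""
    return sum(pdf_page_counts) + image_count


def estimate_ocr_cost(pdf_page_counts, image_count):
    """Return the estimated OCR cost in USD, assuming linear per-scan pricing."""
    scans = count_scans(pdf_page_counts, image_count)
    return Decimal(scans) * PRICE_PER_1000_SCANS / Decimal(1000)
```

For example, 500 ten-page PDFs plus 2,000 standalone images come to 7,000 scans, which works out to roughly $7.00 at the quoted rate.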

Data Strategy in using OCR for the first time.

To limit your cost and be more deliberate in running OCR scans, here’s a helpful strategy that you can use to get started.

Data Search Using Content Search in Purview: Utilize Microsoft Purview’s Content Search feature to filter by file type, such as JPEG and PNG, to identify potential images containing sensitive information. This targeted approach ensures that all image files are scanned for embedded text.

Focus on Known Locations: Identify departments or teams that handle sensitive data, such as Finance, Sales, and Marketing, and focus OCR searches on their respective SharePoint sites. This strategy maximizes the efficiency of OCR by concentrating on areas where sensitive information is most likely to reside.

File Name Analysis: Implement keyword searches for terms that indicate sensitive content, such as “passport” or “ccn” (credit card number), in file names. This proactive approach helps in identifying and flagging files that may contain sensitive information.
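If you export a list of file names (for example from a Content Search report), the file type and file name checks described above can be combined into a simple shortlist. This is a minimal sketch: the keyword list, the extension set and the function names are assumptions for illustration, and it works on plain file names rather than calling any Purview API.

```python
from pathlib import PurePath

# Illustrative values only; extend both to suit your environment.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
SENSITIVE_KEYWORDS = ("passport", "ccn")


def is_ocr_candidate(file_name: str) -> bool:
    """True if the file is an image whose name hints at sensitive content."""
    path = PurePath(file_name)
    if path.suffix.lower() not in IMAGE_EXTENSIONS:
        return False
    stem = path.stem.lower()
    return any(keyword in stem for keyword in SENSITIVE_KEYWORDS)


def shortlist(file_names):
    """Keep only the file names worth prioritising for an OCR scan."""
    return [name for name in file_names if is_ocr_candidate(name)]
```

A shortlist like this will miss badly named files, so treat it as a way to prioritise your first scans rather than as a complete inventory.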

AI Implementation Failures: What We Learned from 2024

My news feed is filled with “A Year in Review” posts about what happened in 2024, and the thing that stood out to me was that 2024 was a bit of a mess for AI implementations.

We saw everything from chat-bots giving illegal advice to fake content flooding our news and social media feeds (I’m pretty sure that I’m not the only one who’s seen the Pope wear a cool puffy jacket).

So how did we get here?

The rush to implement AI solutions was largely driven by market pressure and FOMO (Fear of Missing Out). Companies, desperate to stay competitive, rushed to deploy AI solutions without proper governance frameworks or security controls. Board rooms worldwide echoed with demands for “AI strategy,” often without understanding what that actually meant for their business.

This perfect storm was further fueled by the accessibility of AI tools and platforms. What used to require deep technical expertise became available through simple APIs and low-code interfaces. While this democratisation of AI is generally positive, it led to a “wild west” scenario where implementations often outpaced proper security and compliance considerations.

The result? Poor deployments, terrible user experiences, and many half-baked AI solutions, along with security vulnerabilities and trust issues.


Before You Start: The Boring (But Essential) Bits

Look, I get it – you want to jump straight into the exciting world of AI. But here’s the thing: you need to sort out your data house first. Think of it like baby-proofing your home. Your CISO and security team need to know exactly what data you’ve got, where it lives, and who’s allowed to play with it.

Get your Microsoft Purview DLP policies sorted, tag your sensitive stuff using Purview Information Protection, and make sure you’ve got the right security controls in place. Trust me, this boring bit will save you from some proper headaches later.


The Fix: Four Simple Actionable Steps

  1. Sort Out Your Governance
    • Get an AI committee going
    • Write clear policies on AI usage, Data Protection, etc
    • Set proper standards
    • Actually check if things work (please audit!)
  2. Lock Down Security
  3. Quality Control
    • Keep humans in the loop
    • Test, test, test
    • Watch those outputs (again please run audit checks)
    • Clean data = better results
  4. Smart Implementation
    • Start small, scale later (even on a controlled Copilot for Microsoft 365, pilot it first with a handful of trusted people)
    • Train your people properly (end-user education is a must)
    • Listen to user feedback
    • Don’t rush it

2024 showed us that rushing in without proper planning is a recipe for disaster. Take your time, do it right, and maybe we won’t see your company in next year’s “AI Fails” list.

Excluding a specific user (or group) from Sensitivity labels

I’m excited to share a practical guide I’ve created that walks you through the process of excluding specific users or groups from Microsoft Purview Sensitivity Labels. This guide comes from a real-world scenario where an organization is piloting a new approach to simplify its labeling structure. They wanted to test how reducing the number of labels applied to users would affect workflows and information protection. To support this, I’ve put together detailed instructions on how to effectively manage exclusions in Purview, along with a back-out process to ensure a smooth rollback if needed.

This PDF guide is packed with step-by-step instructions, screenshots, and expert tips to help you navigate the nuances of label exclusions. Whether you’re in the middle of a label simplification pilot or simply looking to better control label application, this guide will help streamline your process. Get ready to dive in and experience a more flexible, user-centered approach to managing Sensitivity Labels in Microsoft Purview!

From Novice to Ninja: a new CISO’s guide to DLP

Congratulations, CISO! 🎉 Great job in landing your new role, where protecting sensitive data isn’t just a job—it’s a daily tightrope walk over a pit of cyber threats, compliance demands, and evolving technology.

Now that you’re at the steering wheel, your inbox is probably overflowing with security concerns, regulatory requirements, and a few “fun” audit emails. Don’t worry, you’re in good company. This guide is here to give you actionable steps to set up your Data Loss Prevention (DLP) strategy, ensuring you don’t just survive in this role—you thrive.

So, what does being a CISO mean? Well, you’re now the go-to person when sensitive data sneaks out, malicious insiders get a bit too curious, or someone clicks that suspicious link promising free money from an unknown relative in Timbuktu. No pressure, right? But here’s the deal: inaction is risk. Delaying or overlooking the core elements of a solid DLP strategy could lead to breaches that cost more than your next cybersecurity budget.

To make your journey smoother, I’ve prepared a handy worksheet that you can use right now to take action on your Data Loss Prevention strategy. These aren’t just checkboxes—these are critical steps to lock down your organization’s data and avoid waking up to a breach nightmare.

You can download the worksheet below.

Here’s what you can expect to see inside:

1. Classifying Data and Why It’s Important

Why it matters: Not all data is created equal. By classifying your data, you can prioritize resources and security measures where they’re needed most. Would you protect the company picnic plan with the same force as your customers’ financial information? (Spoiler: probably not!)

Example:

  • High-risk data: Customer credit card details, proprietary code, or confidential HR files—things you’d never want to see in the wrong hands.
  • Medium-risk data: Internal meeting notes, marketing strategies—sensitive, but not catastrophic if leaked.
  • Low-risk data: Public reports, customer FAQs—this is the stuff you’d share at a conference.

Take Action Today: Review your organization’s data and start tagging it by risk level. Ask yourself, “What would happen if this data got out?” and use that to guide your classification efforts.

2. Why and How to Identify Sensitive Data

Why it matters: You can’t protect what you don’t know exists. Sensitive data is often hidden across different platforms—sometimes even in the most unexpected places (like a random email attachment or NTFS file shares). Identifying it is the first step to ensuring it stays secure.

Example:

  • Sensitive Data: Personally Identifiable Information (PII) like social security numbers or health records, intellectual property (IP), and anything that’s subject to regulations like GDPR or HIPAA.
  • Surprise Discovery: Finding a list of client emails attached to a forgotten project buried in a shared folder.

Take Action Today: Use a discovery tool or audit your data manually. Start with cloud storage, email servers, and shared folders. Look for data that could lead to a privacy violation or financial loss if exposed.
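If you are auditing manually, even a small script can surface obvious candidates. The sketch below looks for US social-security-style numbers and for card-like numbers that pass the Luhn checksum. It is a deliberately simplified illustration (the patterns and function names are assumptions for this example) and is no substitute for a proper discovery tool, which uses many more patterns and supporting evidence.

```python
import re

# Simplified patterns for illustration; real detectors are far stricter.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Check the Luhn checksum over the digits in `number` (separators ignored)."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return len(digits) >= 13 and checksum % 10 == 0


def find_sensitive(text: str) -> dict:
    """Return SSN-like strings and Luhn-valid card-like strings found in `text`."""
    cards = [
        match.group().strip()
        for match in CARD_PATTERN.finditer(text)
        if luhn_valid(match.group())
    ]
    return {"ssn_like": SSN_PATTERN.findall(text), "card_like": cards}
```

The Luhn check is what keeps random 16-digit reference numbers from being flagged as credit cards, which is the same reason commercial tools pair patterns with validation.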

3. Developing a Data Handling Policy

Why it matters: A solid data handling policy is the foundation of your DLP strategy. Without clear rules in place, sensitive information can slip through the cracks, exposing your organization to unnecessary risk. Your data handling policy ensures everyone—from top execs to interns—understands the dos and don’ts of handling sensitive information.

Example:

  • Clear Guidelines: For high-risk data like financial information, the policy might mandate encryption during transfer and restricted access to authorized personnel only.
  • Real-Life Scenario: Imagine your marketing team accidentally sharing a file with customer details over an unsecured network. A proper data handling policy would prevent this by enforcing secure file transfer practices.

Take Action Today: Draft a policy that covers how different types of data (high, medium, low risk) should be handled. It should specify everything from encryption requirements to access control and data retention periods. Involve key stakeholders (Legal, IT, HR) to ensure all bases are covered.
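One way to keep a handling policy unambiguous is to write the rules down as data and check transfers against them. The sketch below is purely illustrative: the three tiers mirror the high, medium and low risk levels above, but the specific rule values and the function name are placeholder assumptions that your Legal, IT and HR stakeholders would need to agree on.

```python
# Hypothetical handling rules per risk tier; the values are placeholders to adapt.
HANDLING_POLICY = {
    "high": {"encrypt_in_transit": True, "authorized_only": True},
    "medium": {"encrypt_in_transit": True, "authorized_only": False},
    "low": {"encrypt_in_transit": False, "authorized_only": False},
}


def policy_violations(risk_level, encrypted, recipient_authorized):
    """Return a list of rule violations for one proposed data transfer."""
    rules = HANDLING_POLICY[risk_level]
    violations = []
    if rules["encrypt_in_transit"] and not encrypted:
        violations.append("transfer must be encrypted")
    if rules["authorized_only"] and not recipient_authorized:
        violations.append("recipient is not authorized")
    return violations
```

In the marketing scenario above, an unencrypted transfer of high-risk customer details to an unauthorized recipient would come back with both violations listed.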

Now that you know the key steps to securing your organization’s data, it’s time to plan it out, partner with your internal stakeholders, and take action. DLP isn’t a one-person job—it’s a team effort that involves collaboration across IT, Legal, HR, and beyond. The risks of inaction are far too high, so don’t wait until something goes wrong. Proactively implementing these best practices today will not only protect your data but also strengthen your leadership as a new CISO.

Take a Load Off and SIT (an oversimplified explanation of using SIT)

In my Purview Ninja Training (you can take the training too), one of the Purview capabilities that I struggled to understand at first was using Sensitive Information Types for automatic classification. Not because it’s difficult to understand, but because there were so many different options to choose from that can be applied to similar use cases.

So to save time in understanding it, here is an over-simplified matrix of when to use the different automatic classification options using Microsoft Purview Information Protection.

When to use each capability.

  • Built-in SIT: Ready-to-use, predefined data types like social security numbers, credit card numbers, and other common sensitive data formats. Ideal for general compliance and basic data protection needs.
  • Custom SIT: Customizable to meet unique organizational requirements. Suitable for both structured and unstructured data.
  • EDM (Exact Data Match SITs): Best for exact matches of structured data with consistent formats, such as financial records and personal IDs.
  • Document Fingerprinting: Detects and protects standardized documents with repeatable structures, like legal forms and templates.
  • Named Entities SIT: Used for detecting contextual sensitive or important data, like names or organizations, particularly within unstructured formats.
  • Trainable Classifiers: Useful for complex or ever-changing data types, especially in unstructured data, where static rules or patterns are inadequate.
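To show what a pattern-based SIT boils down to, here is a rough Python sketch of the general idea: a primary pattern, plus supporting keywords nearby that raise confidence in the match. It is only an analogy, not Purview’s actual matching engine; the function name, the two confidence levels and the 300-character window are assumptions for this example.

```python
import re


def match_custom_sit(text, pattern, keywords, proximity=300):
    """Return (matched_text, confidence) pairs for every pattern hit in `text`.

    A hit is "high" confidence when any supporting keyword appears within
    `proximity` characters of the match, otherwise "low".
    """
    results = []
    lowered = text.lower()
    for match in re.finditer(pattern, text):
        window_start = max(0, match.start() - proximity)
        window_end = min(len(text), match.end() + proximity)
        window = lowered[window_start:window_end]
        has_keyword = any(keyword.lower() in window for keyword in keywords)
        results.append((match.group(), "high" if has_keyword else "low"))
    return results
```

With a made-up employee ID format such as `EMP-` followed by six digits, the text "Employee ID: EMP-123456" would match with high confidence, while the same number in an unrelated sentence would only match with low confidence.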

Dude, Where’s my DATA

Data is the new currency in today’s digital age. Just as you wouldn’t leave your house title lying around for anyone to take, understanding where your data resides is crucial for its protection. Knowing the exact location of your data allows you to implement proper security measures, ensuring it’s not vulnerable to unauthorized access or breaches.

Understanding your data’s location also plays a vital role in regulatory compliance. For instance, the CIS Controls (https://www.cisecurity.org/controls), specifically Control 13: Data Protection and Control 14: Controlled Access Based on the Need to Know (as numbered in CIS Controls v7), emphasize the need to secure data and limit access strictly to those who need it. By mapping out where your data lives, you can better align your practices with these controls, reducing risks and meeting compliance requirements.

In this blog, I will guide you through the various methods to discover where your data resides, the specific tools to use for different types of data, and when and how to effectively utilize each tool.


The 2 Methods of Data Discovery

    Manual methods involve physically documenting all the locations where your data is stored. This approach requires you to actively track and record each data repository, whether it’s on-premises, in the cloud, or across various applications and devices. While this method can be thorough and provide a deep understanding of your data landscape, it is also time-consuming and prone to human error. Think of it as manually creating an inventory of every item in your home – it’s detailed but can be exhausting and easy to miss something.

    Automatic methods leverage technology to scan, map, and classify your data across different environments. These methods use specialized tools to automatically discover data locations, classify sensitive information, and provide insights into data usage and movement.
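As a tiny taste of what automated discovery does under the hood, the sketch below walks a folder tree and counts files by extension, giving a first rough inventory of what lives where. It is an illustration only: the function name is an assumption for this example, it covers just a local or mounted file share, and real discovery tools also inspect and classify the content itself.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root):
    """Count the files under `root` (recursively), grouped by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "<none>".
            counts[path.suffix.lower() or "<none>"] += 1
    return counts
```

Even this crude count is often enough to show where the PDFs, spreadsheets and images are concentrated, which tells you where to point the heavier tools first.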


    Type of Data in an Organization

    Organizations typically handle two primary types of business data: documents and organizational business data.

    Documents include files like reports, presentations, spreadsheets, and PDFs, which often contain sensitive information and require careful management and protection.

    On the other hand, organizational business data encompasses the data generated from business operations, workflows, and applications, such as transaction records, customer information, and operational metrics. Think of data from applications such as Dynamics 365, Workday, SAP, etc. This type of data is what is used for day-to-day operations.

    Now that we know about the 2 different types of data in an organisation, let’s have a look at the available Microsoft solutions you can use to DISCOVER DATA (most of which are already included in your Microsoft 365 Business Premium, E3, or E5 licenses).

    Quick Note:

    There are solutions not on this list that have some form of search/discovery capability (e.g. Purview Data Lifecycle Management, Audit Log Search). I’ve omitted them from this list because their primary purpose is data governance, and their data discovery capability relies on the other items that I’ve listed below.


    Document Discovery Tools

    Microsoft Purview Information Protection (for documents stored in Email, SharePoint, OneDrive and Teams): It helps classify and label data based on its sensitivity. Start by defining your data classification schema, apply labels to your documents using built-in or custom labels, and configure policies to automatically classify and protect sensitive information as it is created or modified.

    Microsoft Purview Information Protection scanner (for on-prem data): This is designed to scan and classify on-premises data. To use it, deploy the scanner to your on-premises environment, configure scanning jobs to target specific data repositories, and review the scan results to understand where sensitive information resides and how it is being used.

    Microsoft Compliance Center (Content Search Tool): The Content search tool in the Microsoft Compliance Center allows you to search for and manage content across your organization.

    Microsoft 365 eDiscovery: This helps you manage and analyze large volumes of data for legal and compliance purposes. To use it, access the eDiscovery portal, create a case, add data sources, and run searches and analytics to gather relevant information for your legal or compliance needs.

    Defender for Cloud Apps: This is a comprehensive solution for monitoring and controlling data movement across cloud applications. The tool also offers data classification and protection through integration with Microsoft Purview Information Protection, ensuring consistent data security across your organization.

    Priva (using Privacy Assessments): This is specifically for personal data discovery. It automates the discovery, documentation, and evaluation of personal data use across your entire data estate. Using this regulatory-independent solution, you can automate privacy assessments and build a complete compliance record for the responsible use of personal data.

    Organizational Business Data Tools

    Purview Data Map: Helps you create a unified map of your data estate by automatically scanning and classifying your data sources. To use it, configure scanning rules and connect your data sources to Purview. The Data Map will continuously update, providing an up-to-date view of your data landscape, including classification and sensitivity labels, which helps in managing data compliance and governance.

    Purview Data Catalog: Provides a searchable catalog of data assets, making it easy to discover and understand data across your organization. To use it, start by connecting your data sources to Purview, which will automatically scan and index your data. Users can then search for data assets, view metadata, and understand data lineage, facilitating better data governance and management.

    Creating an Insider Risk Management Strategy: A Simplified Guide

    When thinking about Insider Risk Management strategy, it’s easy to get lost in a maze of complex solutions and cutting-edge technologies. However, before we dive into program specifics, let’s take a step back.

    Simplification is our guiding principle here, and it brings us to the four core elements essential for any successful strategy: People, Process, Technology, and Implementing the Action.

    People: The Core of Insider Risk Management

    Insider risk management starts with understanding that your people are both your biggest asset and potential risk. Training and awareness are crucial. Employees should be aware of the organization’s policies, the significance of data protection, and the consequences of non-compliance. Engage departments across the board—security, HR, legal—to foster a culture of accountability and transparency. Regular training ensures everyone is up-to-date on the latest protocols and threats.

    Ask yourself the following:

    • How can we enhance our current training programs to better address the specific risks and policies relevant to our organization, ensuring all employees are not only aware but fully understand their role in data protection?
    • In what ways can we foster a stronger culture of accountability and transparency within our organization, encouraging open communication between departments such as security, HR, and legal?
    • What measures can we implement to regularly update and refresh our team’s knowledge on the latest data protection protocols and potential insider threats, keeping our defenses as current as possible?

    Process: Streamlining Risk Management

    The process involves setting up a clear, structured approach to identifying, investigating, and responding to insider threats. Begin with establishing clear policies using Microsoft Purview Insider Risk Management, which offers templates for common scenarios like data theft by departing users or unintentional data leaks. Regular audits and analytics help in preemptively identifying potential risks, while a defined triage process ensures timely response to alerts. Cases are managed systematically, from investigation to action, ensuring a thorough review and appropriate response to each incident.

    Ask yourself the following:

    • How can we tailor Microsoft Purview Insider Risk Management templates to better reflect our organization’s specific risk scenarios and policies, ensuring a more targeted and effective approach?
    • What strategies can we implement to enhance our regular audit and analytics processes, enabling us to identify potential insider risks more proactively and accurately?
    • How can we improve our triage process for responding to alerts, ensuring that each case is addressed timely and efficiently, from investigation to action?

    Technology: Leveraging Microsoft Purview for Enhanced Security

    Technology underpins the entire insider risk management framework. Microsoft Purview Insider Risk Management provides a comprehensive suite of tools for monitoring, detection, and response. Use its analytics for a deep dive into user activities, identifying anomalies that could signal potential risks. The platform’s case management feature streamlines investigations, integrating data from various sources for a holistic view of each incident. Collaboration tools facilitate cross-departmental action, ensuring a unified response to insider threats.

    Ask yourself the following:

    • In what ways can we optimize the use of the platform’s case management features to ensure a more efficient investigation process, integrating data from diverse sources for a comprehensive analysis of incidents?
    • What steps can we take to enhance collaboration across departments using the tools provided by Microsoft Purview, ensuring a coordinated and unified response to insider risks?

    Implementing Your Strategy

    1. Audit and Analytics: Activate auditing to track activities within your organization. Use insider risk analytics to scan for potential risks even before setting up specific policies.
    2. Policy Setup: Choose from Microsoft Purview’s policy templates tailored to different risk scenarios. Customize these to align with your organization’s specific needs.
    3. Alert Management: Configure alerts to notify you of suspicious activities. Establish a process for reviewing, evaluating, and addressing these alerts efficiently.
    4. Investigation and Action: Investigate incidents with the aid of user activity reports and take decisive actions based on your findings. Collaborate with HR, legal, and security teams to ensure comprehensive case management.
    5. Continuous Review and Optimization: Regularly review your insider risk policies and processes. Update them as needed to adapt to evolving threats and organizational changes.
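For step 3, "a process for reviewing, evaluating, and addressing these alerts" usually starts with deciding which alert to look at first. The sketch below shows one simple ordering rule: highest severity first, and within the same severity, the oldest alert first. The alert fields, the severity names and the function name are assumptions for this example and not an export format defined by Purview.

```python
# Lower rank means the alert is reviewed earlier.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def triage_order(alerts):
    """Sort alerts by severity (high first), then by age (oldest first).

    Each alert is a dict with "id", "severity" and "age_hours" keys.
    """
    return sorted(
        alerts,
        key=lambda alert: (SEVERITY_RANK[alert["severity"]], -alert["age_hours"]),
    )
```

Writing the rule down, even this crudely, gives HR, legal, and security a shared and auditable answer to "why was this case looked at before that one?".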

    In essence, managing insider risks effectively requires a blend of proactive people engagement, streamlined processes, and advanced technology.

    By leveraging Microsoft Purview Insider Risk Management and Communication Compliance, organizations can establish a robust framework that mitigates risks while fostering a culture of security and compliance.

    For more detailed guidance on setting up and optimizing your insider risk management framework with Microsoft Purview, you can explore resources directly from Microsoft Learn and Microsoft Security playlist.

    Embrace Change, Secure Data: Navigating the UK’s Data Protection Evolution with Microsoft Purview

    UK’s Data Protection Refresh

    The UK is introducing a new law that brings a host of updates to existing UK data protection legislation. You can read details of the change here and here, and from the UK government itself here.

    The UK Data Protection and Digital Information Bill proposes a transformative approach to data protection, aiming to balance innovation with data security. The bill introduces easier data transfer processes, a risk-based approach to international transfers, and a streamlined accountability framework. This legislative evolution represents the UK’s commitment to fostering a secure yet flexible data-driven landscape post-Brexit.

    To kickstart your journey towards embracing the changes, I encourage your organization to consider initiating with these key steps. This approach not only prepares you for the transition but also positions you to leverage change using proven tools that are purpose built for Security and Compliance.

    Microsoft Purview: Your Data Protection Ally
    Microsoft Purview is a comprehensive toolkit designed to help organizations navigate the complexities of the new data protection landscape. Here’s how:

    1. Simplified Data Transfers: With Microsoft Purview Information Protection, organizations can classify, label, and protect data, ensuring compliance with the bill’s simplified data transfer requirements.
    2. Streamlined Accountability in Action: Microsoft Priva adapts to the bill’s accountability revamp, offering privacy management solutions that align with the shift towards a “senior responsible individual” model.
    3. Legitimate Interests Simplified: The platform aids in discerning when and how to process data based on legitimate interests, reflecting the bill’s nuanced take on data processing rights.
    4. Embracing a Risk-Based Approach: Microsoft Purview Data Loss Prevention (DLP) fortifies organizations against data breaches, embodying the bill’s risk-based ethos for international data transfers.

    The Takeaway: Future-Proof Your Data Practices
    The UK’s legislative update signals a new era of data protection, where flexibility and security go hand in hand. Microsoft Purview stands out as the go-to resource for organizations aiming to thrive in this changing regulatory environment. By leveraging Purview’s suite of solutions, businesses can ensure their data practices are not only compliant but also conducive to growth and innovation in the digital age.

    Dive Deeper:
    For those keen to explore the intricacies of the UK Data Protection and Digital Information Bill and Microsoft Purview’s capabilities further, insightful resources await at IAPP’s overview and Pinsent Masons’ detailed analysis here.