When inheriting (a label) is an issue

I encountered a MIP labelling use case that I have not encountered before. The use case question is:

The answer: a whole lot of complication as you can see below.

Starting with the basics of access control: Microsoft Purview Information Protection lets IT admins choose which permission model to use, either the IT defined model or the User defined model. The User defined model enables end users to define the encryption for their own documents. This is configured in the label itself, under controls.

This lets users mix and match how they want their data to be accessed. They can select who gets read (view) or edit rights, and different individuals can be combined in 1 permission.

In the label publishing policy, you can then configure whether emails should inherit the label of an attachment if the document’s label is higher. This ensures that the higher label (with its higher security) takes precedence.

I ran a test with the following parameters.

  • Open Outlook > The email label is set to NO LABEL
  • Attached the Word document I created earlier with the label called Highly Confidential (this is the same file with the permission set from the 2nd screenshot above)
  • Sent it to 2 other accounts that were not in the permission list above. This is to simulate how the recipients would see the message
    • Sent to 1 internal account (Barry Allen)
    • Sent to 1 external account that is NOT in the permission list

Outcome:

  1. Outlook was NOT able to inherit the higher label.

On a positive note, the encryption/permission still works on the document. The screenshot above is from an external email account of mine that was not in the permission list of the attached file. So IT security can at least have the peace of mind that, as long as the data is properly labelled, data leakage is kept to a minimum.

In another test, I used a label whose encryption model was set to IT Admin defined (all permissions are pre-defined).

The Outlook email was able to properly inherit the label.
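
To make the two outcomes easier to compare, here is a small Python sketch that summarises what I observed. It is only a toy model of the test results, not how Outlook or Purview actually implements inheritance. The `Label` class, the `priority` numbers and the `resolve_email_label` function are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Label:
    name: str
    priority: int               # assumption: a higher number means a more sensitive label
    user_defined: bool = False  # True when the label uses user-defined permissions


def resolve_email_label(email: Optional[Label],
                        attachment: Optional[Label]) -> Optional[Label]:
    """Return the label the email ends up with, based on the behaviour seen in my tests."""
    if attachment is None:
        return email
    if email is not None and email.priority >= attachment.priority:
        # The email already carries an equal or higher label, nothing to inherit.
        return email
    if attachment.user_defined:
        # Observed in the first test: the email did NOT inherit the higher label.
        return email
    # Observed in the second test: the email inherited the admin-defined label.
    return attachment


hc_user = Label("Highly Confidential", priority=3, user_defined=True)
hc_admin = Label("Highly Confidential", priority=3, user_defined=False)

print(resolve_email_label(None, hc_user))   # None: the email stays unlabelled
print(resolve_email_label(None, hc_admin))  # the Highly Confidential label is inherited
```

The point of the sketch is the single `user_defined` branch: in my tests, that was the only difference between the email inheriting the label and staying unlabelled.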

The Value of Testing and Advice for IT Admins

Testing is essential when setting up MIP labels and encryption. Real-world testing helps uncover issues or behaviours that might not be obvious from the documentation. By testing it yourself, you can be confident that the setup works as expected in your environment and meets your organisation’s needs.

Advice for IT Admins:
If you plan to use user-defined encryption, make sure your users are properly trained. This model can be confusing, and users might think they’ve set permissions correctly when they haven’t. To avoid mistakes, provide clear instructions and training. Testing these scenarios yourself will also help you spot potential problems and give better support to your users.

Reference: https://learn.microsoft.com/en-us/purview/create-sensitivity-labels#publish-sensitivity-labels-by-creating-a-label-policy

New Built-in Role in Entra: AI Admin

Microsoft has recognised the need for a specialised admin role to manage AI and Microsoft Copilot across the organisation. The AI Admin role has been rolling out across all Microsoft 365 and Microsoft Entra clients since November 2024.

An account with the AI Admin role can do the following tasks:

  • Manage all aspects of Microsoft 365 Copilot
  • Manage AI-related enterprise services, extensibility, and copilot agents from the Integrated apps page in the Microsoft 365 admin center
  • Approve and publish line-of-business copilot agents
  • Allow users to install an app or install an app for users in the organization if the app does not require permission
  • Read and configure Azure and Microsoft 365 service health dashboards
  • View usage reports, adoption insights, and organizational insight
  • Create and manage support tickets in Azure and the Microsoft 365 admin center

Reference: https://learn.microsoft.com/en-us/entra/identity/role-based-access-control/permissions-reference#ai-administrator

In the center of it all.

Among all Microsoft Purview security solutions, there’s one that you absolutely must get right. If you don’t, your entire data security strategy could fall apart, no matter what other security tools you’re using.

This key solution brings together three basic but crucial tasks: finding your sensitive data, labelling it correctly, and keeping it safe. This solution is Microsoft Purview Information Protection (MIP), and it’s at the heart of how you protect your company’s data.

Why is MIP so critical?

Think of Microsoft Purview’s Data Classification service as the system that helps all other security tools know what to do. Here’s how it works with different Purview tools:

Purview Data Loss Prevention (DLP):

  • Works like a security guard that reads the labels
  • If it sees a file marked ‘Secret’, it knows exactly what protection rules to follow
  • For example: “This is confidential data, so don’t let it be shared outside the company”

Endpoint DLP (Devices) and Microsoft Defender for Cloud Apps:

  • These tools check the labels whether you’re working on your laptop or in cloud apps like Workday, Salesforce, etc.
  • They constantly ask “What’s this file’s label?” before allowing any action
  • Then they make sure the right safety measures are in place

Microsoft Purview Insider Risk Management:

  • This one’s particularly clever about using the labels
  • It watches for unusual behaviour with sensitive data
  • For example: If someone suddenly downloads 100 files marked ‘Highly Confidential’, it raises an alert
  • It can then start extra monitoring or take other protective steps

Microsoft Purview Data Governance (Data Map):

  • This service uses MIP to help you map and catalog your structured data.
  • It gives you the ability to apply consistent classification across your data estate. You can have a standardised label across your organisation.
  • For example: “A ‘Confidential’ label means the same thing everywhere, making it easier to manage and protect”

Third party services using MIP

Even third-party services leverage the MIP data classification service.

Trellix integrates its DLP network appliance with MIP: https://docs.trellix.com/bundle/data-loss-prevention-11.11.x-product-guide/page/UUID-5d61c924-38ac-3cb9-fb84-17596363740f.html

CrowdStrike leverages Microsoft Purview Information Protection labels (page 5 of 7): https://www.crowdstrike.com/wp-content/uploads/2023/12/A-Modern-Approach-to-Confidently-Stopping-Unauthorized-Data-Exfiltration_WhitePaper.pdf

Zscaler and Egnyte can import MIP labels as part of their DLP: https://help.zscaler.com/downloads/zscaler-technology-partners/data/zscaler-and-egnyte-deployment-guide/Zscaler-Egnyte-Deployment-Guide-FINAL.pdf

Microsoft Purview Information Protection is the foundation that your entire data security and governance strategy builds upon. Without a properly planned and implemented MIP deployment, even the most sophisticated Purview solutions won’t deliver their full value. Think of it as building a house – you need to get the foundation right first.

As your organisation grows and your data landscape becomes more complex, your MIP strategy needs to evolve too. Regular reviews of your classification labels, updating sensitivity rules, and fine-tuning your protection policies aren’t just good practice – they’re essential for keeping your data secure and compliant.

Making the case for Optical Character Recognition (OCR) in your Data Protection strategy

I recently applied for a U.S. visa, and as part of the process, I had to submit my passport, bank records, and a lot of personally identifiable information to the embassy in the form of PDF and JPEG files. This meant that much of my sensitive data is now stored as images. This made me wonder: How are organisations safeguarding data that is image-based rather than text-based?

Traditional Data Loss Prevention (DLP) strategies, while effective in monitoring text-based data, often fall short when it comes to image-based content. This shortcoming can lead to significant vulnerabilities, as sensitive information is frequently embedded within images (see my example above). Optical Character Recognition (OCR) addresses this gap by enabling organisations to extract and analyze text from images. For Cyber Security teams aiming to enhance their data security posture, integrating OCR into their DLP strategy is not just beneficial, it is a must!

What are the industry use cases for OCR in DLP?

  • Financial services: Sensitive information such as account numbers, credit card details, and personally identifiable information (PII) is often embedded in scanned documents, receipts, and screenshots
  • Healthcare industry: Data often comes in the form of medical records and scans, prescriptions and doctor’s notes (assuming that your doctor can write legibly)
  • Retail and Ecommerce: Scanned receipts and invoices. Most product returns and refunds that start on paper get scanned and stored.
  • Manufacturing: Contracts, blueprints, R&D documents and even internal presentations (most of which get converted to either an image or a PDF)
  • Government and Public Sector: Scanned copies of passports, driver’s licenses and PII data, and incident reports (which again start on paper and end up as an image)

These are just examples of where OCR in DLP can come in to ensure that data is not leaked out.

OCR in Microsoft Purview

Microsoft Purview has an OCR capability that lets you identify and protect data by scanning images for Sensitive Information. Do remember that this is an OPTIONAL feature and must be enabled at the Tenant level. There’s also a bit of a cost to it (more on this later).

To turn on OCR in Microsoft Purview, do the following.

  1. Go to Settings > Select Optical Character Recognition.
  2. Choose where you want OCR to scan.

The full technical instructions can be found here: https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#workflow-at-a-glance

The Cost of OCR

This capability is powerful because it uses Azure AI to perform OCR. As of today, it costs $1.00 USD per 1,000 scanned items. The key phrase to look out for in the pricing is ‘per scanned item’, because Microsoft counts each page in a PDF, or each individual image in a set of images, as 1 scan. So a PDF that contains 10 pages counts as 10 scans. https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#estimate-your-ocr-scanning-charges
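
To get a feel for the numbers, here is a rough Python sketch of the pricing rule above. It assumes the charge scales proportionally with the number of scans (no minimums, rounding or free tier), which is my simplification, so treat the result as a ballpark figure. The function names and the sample volumes are made up for illustration.

```python
PRICE_PER_1000_SCANS_USD = 1.00  # rate quoted above, check the current Microsoft pricing


def count_scans(pdf_page_counts, image_count):
    """Each PDF page counts as 1 scan, and each standalone image counts as 1 scan."""
    return sum(pdf_page_counts) + image_count


def estimate_cost_usd(scans, price_per_1000=PRICE_PER_1000_SCANS_USD):
    """Simple proportional estimate (assumption: no rounding or minimum charge)."""
    return scans / 1000 * price_per_1000


# Hypothetical volume: 200 PDFs of 10 pages each, plus 500 standalone images.
scans = count_scans([10] * 200, image_count=500)
print(scans)                     # 2500
print(estimate_cost_usd(scans))  # 2.5
```

Running it against your own file counts before you enable OCR on a location gives you a quick sanity check on the likely bill.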

Data Strategy in using OCR for the first time.

To limit your cost and be more deliberate in running this OCR scan, here’s a helpful strategy that you can use to get started.

Data Search Using Content Search in Purview: Utilize Microsoft Purview’s Content Search feature to filter by file type, such as JPEG and PNG, to identify potential images containing sensitive information. This targeted approach ensures that all image files are scanned for embedded text.

Focus on Known Locations: Identify departments or teams that handle sensitive data, such as Finance, Sales, and Marketing, and focus OCR searches on their respective SharePoint sites. This strategy maximizes the efficiency of OCR by concentrating on areas where sensitive information is most likely to reside.

File Name Analysis: Implement keyword searches for terms that indicate sensitive content, such as “passport” or “ccn” (credit card number), in file names. This proactive approach helps in identifying and flagging files that may contain sensitive information.
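
The file type filter and the file name analysis can be tried out on an exported list of file names before you touch any Purview settings. Below is a minimal Python sketch of that triage. It is a local script, not a Purview feature or API, and the extension list, the keyword list and the `triage` function are my own assumptions that you should adapt.

```python
from pathlib import PurePath

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumption: extend to match your data
KEYWORDS = ("passport", "ccn")                # the example terms mentioned above


def is_image(file_name):
    """True when the file extension is one of the image types we care about."""
    return PurePath(file_name).suffix.lower() in IMAGE_EXTENSIONS


def keyword_hits(file_name):
    """Return the keywords found in the file name (case-insensitive substring match)."""
    stem = PurePath(file_name).stem.lower()
    return [keyword for keyword in KEYWORDS if keyword in stem]


def triage(file_names):
    """Keep only image files whose names suggest sensitive content."""
    return [(name, keyword_hits(name))
            for name in file_names
            if is_image(name) and keyword_hits(name)]


files = ["passport_scan.JPG", "team_photo.png", "ccn_list.txt", "invoice_CCN_2024.jpeg"]
for name, hits in triage(files):
    print(name, hits)
# passport_scan.JPG ['passport']
# invoice_CCN_2024.jpeg ['ccn']
```

Substring matching is deliberately crude and will produce some false positives, but it is a cheap way to build a shortlist of locations worth paying to scan.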