The common Insider Risks and how to mitigate them

17/05/2026 victorwingsing

In Part 1 (read it here), we established the strategic and collaborative foundations of Insider Risk Management.

Now, we move to the practical hands-on application of IRM: how to detect and investigate the specific patterns of insider risk using Microsoft Purview. This section is for those who are ready to implement these controls.

Let’s look at the 4 common patterns (plus an extra special one about AI) that most organisations see when employees try to take data out of the organisation…whether they are intentional about it or not.

The Departing Employee risk

People sometimes take client lists, pricing files, or other company information when they are about to leave because they think it will help them in their next job. They may want to keep customer relationships, prove their value to a new employer, or make their move to a competitor easier and faster. Some also tell themselves that the information is “theirs” because they worked on it or built those client relationships.

In other cases, the reason is fear or frustration. A departing employee may worry that once they leave, they will lose access to important contacts, documents, or knowledge, so they copy it “just in case.” Even if they do not see themselves as doing something serious, taking company data before leaving can expose the organisation to legal, commercial, and security risk.

Insider Prevention tip: Use HR connectors to flag resignations. Configure a policy that monitors for unusual collecting/sharing 90 days pre-departure.

In Purview Insider Risk Management, go to Policies and select the Data theft by departing users template. Then open the HR connector configuration screen for Insider Risk Management. This connector imports the resignation or employment status data that drives the departing employee risk indicators.

Here’s the link on how to set up the connector: LINK

How to use these settings: Configure the HR connector to bring in employee status changes, such as resignations or planned departures. After the connector is active, map the relevant HR fields correctly and verify that departing users are being detected. You can then use this signal in an Insider Risk Management policy to increase scrutiny during the pre-departure window.
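The pre-departure window logic can be sketched in a few lines. This is only an illustration of the idea, not how Purview evaluates the HR signal internally; the function name and the 90-day default are assumptions taken from the tip above.

```python
from datetime import date, timedelta


def in_pre_departure_window(today, last_working_date, window_days=90):
    """Return True if `today` falls in the window before the last working date.

    Hypothetical helper: the real evaluation happens inside Purview once the
    HR connector supplies the resignation data.
    """
    window_start = last_working_date - timedelta(days=window_days)
    return window_start <= today <= last_working_date


# Example: a leaver whose last working day is 30 June 2026
print(in_pre_departure_window(date(2026, 5, 1), date(2026, 6, 30)))
print(in_pre_departure_window(date(2026, 1, 1), date(2026, 6, 30)))
```

A check like this is handy when validating that the dates coming through your HR feed produce the monitoring window you expect.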

The Email to self risk

The “remote work” excuse – emailing sensitive attachments to their own personal accounts (Gmail, Outlook.com, etc).

Mitigate this by creating a policy for detecting emails with attachments sent to personal email accounts or other external recipients.

How to use these settings: Select indicators for email activity to external recipients and focus on messages that include attachments. If available in your configuration, narrow the scope to personal domains and combine the policy with sensitivity labels or priority content so that high-value data is reviewed first.

Implementation Tip: Detect emails with attachments to personal domains. Correlate this with sensitivity labels to prioritise high-value data.
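As a rough sketch of the triage logic described above (and not Purview's own scoring), the example below flags attachments sent to personal domains and ranks them by sensitivity label. The domain list, label names and priority numbers are all hypothetical placeholders.

```python
# Hypothetical values: replace with your own domains and label taxonomy.
PERSONAL_DOMAINS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com"}
LABEL_PRIORITY = {"Highly Confidential": 3, "Confidential": 2, "General": 1}


def triage_email(recipient, has_attachment, label=None):
    """Return a review priority: 0 means ignore, higher means review sooner."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if not has_attachment or domain not in PERSONAL_DOMAINS:
        return 0
    # Unlabelled attachments still get the lowest non-zero priority.
    return LABEL_PRIORITY.get(label, 1)


print(triage_email("dave@gmail.com", True, "Confidential"))
```

The point of the ordering is that reviewers see labelled, high-value content first rather than every holiday photo sent home.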

The Drip transfer risk

There are users who try to be sneaky by doing small, repeated transfers over time that individually look benign but collectively represent a significant leak.

To mitigate this, set your threshold or sequence settings for repeated low-volume transfers to the same external recipient over time. You can even use the same policy as the Email to Self policy above.

How to use these settings: Set thresholds that look for repeated actions rather than one large event, such as multiple small sends to the same recipient across several days. Tune the volume, frequency, and time window so the policy can identify slow exfiltration patterns without creating too many false positives.

Implementation Tip: Set thresholds for repeated sends to the same external recipient. Use volume-based triggers to catch this slow-and-steady exfiltration.
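To make the "repeated small sends" idea concrete, here is a minimal sliding-window sketch. It is an illustration under assumed inputs (a list of timestamp, recipient and size tuples) and assumed thresholds, not a description of how Purview implements its cumulative detection.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_drip_recipients(sends, min_count=5, window=timedelta(days=7), max_size_mb=2.0):
    """Flag recipients who received `min_count` or more small sends within `window`.

    `sends` is an iterable of (timestamp, recipient, size_mb) tuples.
    All thresholds are hypothetical starting points to tune.
    """
    by_recipient = defaultdict(list)
    for timestamp, recipient, size_mb in sends:
        if size_mb <= max_size_mb:  # only small transfers count as "drips"
            by_recipient[recipient].append(timestamp)

    flagged = set()
    for recipient, stamps in by_recipient.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Shrink the window from the left until it spans at most `window`.
            while stamps[end] - stamps[start] > window:
                start += 1
            if end - start + 1 >= min_count:
                flagged.add(recipient)
                break
    return flagged


base = datetime(2026, 5, 1)
daily = [(base + timedelta(days=i), "x@gmail.com", 1.0) for i in range(5)]
print(find_drip_recipients(daily))
```

Widening the window or lowering `min_count` catches slower drips at the cost of more false positives, which is the same trade-off you tune in the policy.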

The “Detour” risk

This is when a user is blocked by DLP and immediately tries a workaround (e.g., downgrading a sensitivity label or using a personal device).

Modify your policy configuration to look for a sequence of events in which a user hits a DLP block and then shows signs of an attempted workaround, such as a sensitivity label downgrade. Use the sequence detection settings to tie these events together.

How to use these settings: Configure the policy to look for a DLP block followed by a related action that suggests circumvention, such as a label downgrade or a second attempt through another route. The key is to use sequence-based detection so the system recognises the pattern of behaviour, not just a single isolated event.

Implementation Tip: Trigger on DLP blocks followed by label downgrades. This pattern is a strong indicator of intentional circumvention.
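The difference between single-event and sequence-based detection is easier to see in code. The sketch below is an assumption-laden illustration: the event names and the 60-minute window are hypothetical, and Purview's sequence detection is configured in the portal rather than written like this.

```python
from datetime import datetime, timedelta

# Hypothetical event names for actions that suggest a workaround.
WORKAROUND_EVENTS = {"label_downgrade", "copy_to_personal_device"}


def find_detours(events, window=timedelta(minutes=60)):
    """Return users who had a DLP block followed by a workaround within `window`.

    `events` is a list of (timestamp, user, event_type) tuples.
    """
    last_block = {}
    flagged = set()
    for timestamp, user, kind in sorted(events):
        if kind == "dlp_block":
            last_block[user] = timestamp
        elif kind in WORKAROUND_EVENTS and user in last_block:
            if timestamp - last_block[user] <= window:
                flagged.add(user)
    return flagged


events = [
    (datetime(2026, 5, 1, 10, 0), "dave", "dlp_block"),
    (datetime(2026, 5, 1, 10, 10), "dave", "label_downgrade"),
    (datetime(2026, 5, 1, 11, 0), "ann", "label_downgrade"),
]
print(find_detours(events))
```

Ann's downgrade on its own is not flagged; only the block-then-downgrade pair is, which is the behaviour pattern the policy is meant to surface.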

The Agentic AI risk

AI agents and copilots now act on behalf of users, accessing and moving data. 94% of organisations report AI is increasing insider risk. If your organisation does not have basic data protection controls in place, there is a high likelihood of data risk.

To mitigate this risk: Use both Purview Insider Risk Management and Purview Data Security Posture Management to create policies that specifically look for risky AI usage.

As with your basic policies, you can tune thresholds to separate false positives from true positives.

Conclusion: Starting Small, Thinking Big

Don’t try to boil the ocean. Start with a pilot group (e.g., M&A or Finance). Insider Risk Management is a journey of cultural and technical maturity.

It’s about building a resilient organisation where data is respected, privacy is protected, and risk is managed collaboratively.

A Practical Guide to Insider Risk Management in the UK

03/05/2026 victorwingsing

There are many, many posts talking about Insider Risk Management but very few that cover a practical, realistic and field-tested approach to it. This is my attempt to tip the scale towards the latter. I’m skipping the textbook definitions to share real-world scenarios from the trenches: specifically, the messy, human problems clients have thrown at me and the practical, field-tested responses we’ve workshopped to address them.

Let’s start with the Human and the strategic foundations of Insider Risk which is…

The Human Element

Let’s be honest: we’ve built digital fortresses with firewalls taller than the Shard and MFA that demands a blood sample. But what happens when the threat isn’t a hooded hacker, but friendly Dave from Sales “backing up” his client list before jumping ship?

In the UK, 90% of organisations face insider incidents annually, and 74% of those incidents come down to negligence: people like Dave, who aren’t villains, just human [Source: Cybersecurity Insiders]. IRM isn’t about building higher walls; it’s about understanding who’s walking through the gate. With the FCA and GDPR watching closely, “set and forget” security will no longer work.

IRM is a Team Sport

If you think IRM is just a “Cyber Security thing,” you’re in for a rude awakening. It’s more like a heist movie, but instead of stealing diamonds, you’re trying to stop data from walking out the door. And you can’t do it alone. You need a “Triad of Trust” (there’s 4 below since I’ve not used Triad before):

HR: They’re the ones who know Dave is leaving. They provide the context—resignations, performance reviews, the “vibes.” Without HR, you’re just watching random data movements and guessing.
Legal: They’re the ones who keep you out of court. They ensure your monitoring doesn’t cross the line into “Big Brother” territory, keeping you compliant with employment law, Privacy laws and GDPR.
IT/Cyber: You. The tech wizards. You provide the tools (Purview, DLP, Logging) and the forensic skills to figure out what’s actually happening.
Business Leaders: They define what “sensitive” actually means. For M&A, it’s the merger docs; for Customer Support, it’s the client list. One size does not fit all.

Pro Tip: Form a small, cross-functional steering group. Call it the “Data Defence League” if you want. Just get them in a room.

The Privacy Paradox (aka Balancing Monitoring with Trust)

Let’s address the elephant in the room: IRM tools are intrusive by design. They’re supposed to be. They monitor user activity and correlate events to spot patterns. But in the UK, we have a thing called “privacy,” and it’s kind of a big deal. Here’s how you can balance it.

The UK – GDPR Balance:

Transparency: Tell people you’re watching. Update those employment contracts. Add it to your Employee Training programme and include it in the End-user Agreement that they see when they log in to their Corporate PC. Send an email. Be open. Why: Because secrecy breeds mistrust.
Proportionality: Don’t monitor the intern with the same intensity as the Head of M&A. Start with high-risk roles (Tier 1) and expand based on evidence. It’s called “being reasonable.”
Pseudonymisation: This is your best friend. Purview keeps data private by default. Analysts see “ANON2340,” not “Dave from HR,” until a formal case is opened. It’s like a mask for your data.
Policy-Led Monitoring: Only trigger monitoring when a highly defined policy is breached. This isn’t about general surveillance; it’s about catching specific, pre-agreed risk behaviors. If the policy isn’t broken, the system stays quiet.
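To show what pseudonymisation means in practice, here is a minimal keyed-hash sketch. It is not Purview's implementation (which is not documented in this post); the function name, the key handling and the four-digit format are assumptions chosen to mirror the "ANON2340" style above.

```python
import hashlib
import hmac


def pseudonym(user_principal_name, secret_key):
    """Derive a stable alias for a user without exposing the real identity.

    Hypothetical helper. Four digits can collide in a large tenant, so a real
    system would use a longer token.
    """
    digest = hmac.new(
        secret_key,
        user_principal_name.lower().encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return "ANON" + str(int(digest[:8], 16) % 10000).zfill(4)


alias = pseudonym("dave@contoso.com", b"rotate-me-and-store-securely")
print(alias)
```

Because the alias is derived with a secret key, analysts can follow one user's activity across alerts while only the key holder can map the alias back to a name.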

You can’t protect what you haven’t classified

Here’s another hard truth. Purview IRM is only as good as the data it can see. If you haven’t done the boring work of classification, you’re flying blind. There’s a clear dependency chain:

Sensitivity Labels: The bedrock. If a document isn’t labelled “Confidential,” IRM can’t prioritise it. It’s like trying to find a needle in a haystack without knowing what a needle looks like.
Sensitive Information Types (SITs): Teach Purview to recognise UK-specific data like NINs, IBANs, or NHS numbers. If it doesn’t know what a NIN is, it can’t protect it.
Data Loss Prevention (DLP): DLP is the “first line of defence.” IRM is the “second line” that investigates when DLP is bypassed or when subtle patterns emerge. Think of DLP as the bouncer and IRM as the detective.
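For a feel of what a pattern-based SIT looks for, here is a simplified sketch of a National Insurance number matcher. It is an assumption-heavy illustration: the regular expression covers the commonly quoted letter restrictions but ignores unused prefixes, and real SITs also weigh supporting evidence such as nearby keywords and confidence levels.

```python
import re

# Simplified pattern: two prefix letters, six digits (optionally spaced in
# pairs), one suffix letter A-D. It does not exclude unused prefixes.
NINO_PATTERN = re.compile(
    r"\b[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] ?\d{2} ?\d{2} ?\d{2} ?[A-D]\b"
)


def find_ninos(text):
    """Return every substring of `text` that looks like a NINO."""
    return [match.group(0) for match in NINO_PATTERN.finditer(text)]


print(find_ninos("NI number AB 12 34 56 C on file"))
```

A bare pattern like this will produce false positives on reference codes that happen to fit, which is exactly why classification needs tuning before IRM depends on it.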

Warning: If your DLP policies are noisy or your labels are inconsistent, your IRM alerts will be useless. Start by tuning your DLP and Classification strategy before turning on IRM. Otherwise, you’ll just be drowning in false positives.

Questions from my clients’ HR, Legal and Business Operations teams

Q1 (HR/Legal): “How do we ensure we aren’t creating a ‘Big Brother’ culture that destroys employee morale?”

Answer: Focus on “Privacy by Design.” Use pseudonymisation, limit access to investigation data to a need-to-know basis, and ensure all monitoring is tied to a legitimate business interest (e.g., protecting IP) rather than general performance monitoring. Transparency is your best defence against mistrust. Think of it as “security with respect.”

Q2 (Business Ops): “How do we distinguish between ‘normal’ high-volume work and ‘risky’ data exfiltration, especially in data-heavy roles like Legal or Finance?”

Answer: Use “Scoped Policies” and “Baseline Behaviour.” Purview allows you to set different thresholds for different groups. A Legal team downloading 500 files for a DSAR is normal; a Sales rep doing the same is a risk. Use group-based scoping to reduce false positives and respect business context. It’s about context, not just volume.
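The group-based scoping idea boils down to a lookup of thresholds per group. The sketch below uses made-up group names and numbers purely to illustrate the principle; in Purview you set these through policy scoping and threshold settings.

```python
# Hypothetical per-group daily download thresholds.
GROUP_THRESHOLDS = {"Legal": 1000, "Finance": 800}
DEFAULT_THRESHOLD = 100


def is_unusual_download(group, files_downloaded):
    """Return True when the volume exceeds what is normal for that group."""
    return files_downloaded > GROUP_THRESHOLDS.get(group, DEFAULT_THRESHOLD)


print(is_unusual_download("Legal", 500))
print(is_unusual_download("Sales", 500))
```

The same 500 files produce two different answers, which is the whole point: the threshold carries the business context.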

Q3 (Legal/Compliance): “What are the legal repercussions for a first-time offender versus a repeat offender?”

Answer: Define a “Graduated Response” framework. First-time negligent offenses should trigger coaching and re-training. Repeat offenses or malicious intent should trigger formal HR/Legal escalation. Consistency is key to procedural fairness. Don’t fire Dave for a first-time mistake; teach him.

Q4 (IT/Security): “How do we handle long notice periods (e.g., 3-6 months) for senior leavers?”

Answer: Map AD “accountExpires” attributes to IRM triggering events. Configure a 90-day pre-expiry monitoring window to catch pre-resignation data gathering. It’s like having a security camera on the exit door.
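If you are working with the raw attribute, note that Active Directory stores accountExpires as a count of 100-nanosecond intervals since 1 January 1601 (UTC), with 0 or 0x7FFFFFFFFFFFFFFF meaning "never expires". The sketch below converts that value and derives the start of a 90-day monitoring window; the function names are hypothetical and the window length is taken from the answer above.

```python
from datetime import datetime, timedelta, timezone

NEVER_EXPIRES = {0, 0x7FFFFFFFFFFFFFFF}
EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)


def account_expiry(account_expires):
    """Convert an AD accountExpires value to a UTC datetime, or None if never."""
    if account_expires in NEVER_EXPIRES:
        return None
    # The value is in 100-nanosecond ticks; ten ticks make one microsecond.
    return EPOCH_1601 + timedelta(microseconds=account_expires // 10)


def monitoring_start(account_expires, window_days=90):
    """Return when pre-expiry monitoring should begin, or None if never expiring."""
    expiry = account_expiry(account_expires)
    if expiry is None:
        return None
    return expiry - timedelta(days=window_days)


print(account_expiry(0))
```

For notice periods longer than 90 days you would widen `window_days`, or rely on the HR resignation date instead of the expiry date.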

Q5 (HR): “How do we integrate IRM with our existing HR processes for terminations?”

Answer: Use HR connectors to automatically flag resignations or terminations. This ensures IRM policies are triggered without manual intervention, reducing the risk of human error. Automate the boring stuff.

Q6 (Business Leaders): “How do we measure the success of our IRM programme?”

Answer: Track metrics like “Mean Time to Investigate,” “False Positive Rate,” and “Number of High-Severity Cases Resolved.” The goal is to move from reaction to resilience. Show them the value, not just the alerts.
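Two of these metrics are simple to compute from a case export. The sketch below assumes a hypothetical list of case records with `opened`, `closed` and `outcome` fields, and treats the false positive rate as the share of closed cases judged benign; adapt the field names to whatever your case data actually contains.

```python
from datetime import datetime
from statistics import mean


def programme_metrics(cases):
    """Compute mean hours to investigate and the share of benign cases.

    `cases` must be a non-empty list of dicts with `opened` and `closed`
    datetimes and an `outcome` of "confirmed" or "benign" (assumed schema).
    """
    hours = [(c["closed"] - c["opened"]).total_seconds() / 3600 for c in cases]
    benign = sum(1 for c in cases if c["outcome"] == "benign")
    return {
        "mean_hours_to_investigate": mean(hours),
        "false_positive_rate": benign / len(cases),
    }


cases = [
    {"opened": datetime(2026, 5, 1, 9), "closed": datetime(2026, 5, 1, 11), "outcome": "benign"},
    {"opened": datetime(2026, 5, 2, 9), "closed": datetime(2026, 5, 2, 13), "outcome": "confirmed"},
]
print(programme_metrics(cases))
```

Tracking these per quarter shows whether tuning is actually reducing noise and investigation time.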

1. Establish clear data storage policies

The recommended solution: Codify in your Information Security Standards that confidential or sensitive data must NOT be stored in third-party systems.

Why this works: By keeping sensitive data within Microsoft 365’s ecosystem, you maintain full control over encryption, access permissions, and audit trails. Microsoft 365 provides native integration between all its services—from SharePoint and OneDrive to Teams and Outlook—ensuring encrypted documents work seamlessly across your organisation’s approved platforms.

Implementation tip: Create a simple classification guide that shows users exactly which data types belong where. Make it clear that “Confidential” and above stays in Microsoft 365, while “General” business information can live elsewhere.

2. Educate users on platform selection

The recommended solution: Train your end-users on what platforms to use, when to use them, and how to properly share confidential data.

Why this works: Most encryption-related issues stem from users not understanding the boundaries of their tools. When people know that encrypted documents won’t work in Dropbox, they’ll choose SharePoint instead.

Implementation tip: Create simple decision trees: “Need to share confidential data externally? Use secure email with expiry dates. Need to collaborate on sensitive documents? Use SharePoint with guest access controls.”

3. Configure service accounts for automation

The recommended solution: For AI and RPA systems (especially in-house built ones), add the named user accounts that run these systems to your encryption policies as approved users.

Why this works: Many automation systems use dedicated service accounts to access files. By explicitly granting these accounts decryption rights, your automated workflows continue functioning while maintaining security controls.

Implementation tip: Create a dedicated security group for automation accounts and include this group in your sensitivity label encryption settings. This makes it easier to manage permissions as you add more automated systems.

4. Implement data minimisation for BI tools

The recommended solution: Third-party BI tools should not access confidential data directly. Instead, use data minimisation, anonymisation, and masking techniques.

What this means: Data minimisation involves only sharing the minimum data necessary for analysis. Data anonymisation removes personally identifiable information, while data masking replaces sensitive values with realistic but fake data that maintains statistical properties.

Why it’s important: This approach protects sensitive information while still enabling business intelligence. Your sales dashboard can show trends and patterns without exposing individual customer details or confidential pricing information.

Implementation tip: Create sanitised data exports specifically for BI tools, removing or masking sensitive fields before the data leaves your secure environment.
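A sanitised export can be as simple as dropping some fields and tokenising others before the data leaves your environment. The sketch below uses salted hashing, which is pseudonymisation rather than the "realistic fake data" style of masking described above; the field names and salt handling are hypothetical.

```python
import hashlib

# Hypothetical field classifications for a sales export.
TOKENISED_FIELDS = {"customer_name", "email"}
DROPPED_FIELDS = {"unit_price"}


def sanitise_row(row, salt):
    """Return a copy of `row` that is safer to hand to a third-party BI tool."""
    clean = {}
    for field, value in row.items():
        if field in DROPPED_FIELDS:
            continue  # data minimisation: do not export it at all
        if field in TOKENISED_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            clean[field] = digest[:12]  # stable token, still usable for grouping
        else:
            clean[field] = value
    return clean


row = {"customer_name": "Acme", "email": "a@acme.com", "region": "UK", "unit_price": 9.5, "quantity": 3}
print(sanitise_row(row, "keep-this-salt-secret"))
```

Because the tokens are stable, the dashboard can still count and group by customer without ever seeing who the customer is.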

5. Standardise PDF readers organisation-wide

The recommended solution: Ensure all devices run the same, supported version of PDF readers using Intune, Group Policy, or IT deployment checklists.

Why this works: Consistency eliminates the “it works on my machine” problem. When everyone uses Adobe Reader/Acrobat version 22 or later, encrypted PDFs open reliably across your organisation.

Implementation tip: Include PDF reader version checks in your device compliance policies. Set up automatic updates where possible, and create a simple verification script for IT teams to run during device setup.
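The version check itself is a tuple comparison. The sketch below assumes you have already obtained the installed version string by some other means (that part is left out), and uses 22.003.20258, the build this blog quotes from Microsoft in the companion post, as the minimum.

```python
# Minimum build quoted from Microsoft elsewhere in this blog: 22.003.20258
MINIMUM_VERSION = (22, 3, 20258)


def parse_version(text):
    """Turn '22.003.20258' into (22, 3, 20258), ignoring any extra suffix parts."""
    return tuple(int(part) for part in text.split(".")[:3])


def supports_labelled_pdfs(installed_version):
    """Return True if the installed Acrobat/Reader build meets the minimum."""
    return parse_version(installed_version) >= MINIMUM_VERSION


print(supports_labelled_pdfs("21.007.20099"))
```

Comparing parsed tuples avoids the classic mistake of comparing version strings alphabetically.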

6. Map your external ecosystem

The recommended solution: Understand what software your vendors, suppliers, and customers are using before sharing encrypted documents.

Why this works: Knowing that your vendor, partner, supplier or law firm uses LibreOffice or your client prefers Google Docs helps you choose the right sharing method upfront, avoiding embarrassing “I can’t open your file” conversations.

Best practice examples:

Maintain a simple spreadsheet of key partners and their preferred platforms
Ask about software compatibility during vendor onboarding
Include system requirements in your standard contract templates
Create partner-specific sharing guidelines for your teams

7. Identify critical platform dependencies

The recommended solution: In connection to item 6, note which critical partners use non-Microsoft platforms that could be impacted by encrypted data sharing, then ensure users know the right channels for sensitive data exchange.

Why this works: This builds on your data storage policies (solution 1) and user education (solution 2) by creating specific guidance for high-stakes relationships.

Implementation tip: For critical partners who can’t handle encrypted files, establish secure alternatives like password-protected SharePoint links with expiry dates, or use secure email gateways that work across platforms.

Two additional best practices you shouldn’t miss

8. Create encryption exception processes

The recommended solution: Establish a formal process for temporarily removing encryption when legitimate business needs arise.

Why you need this: Sometimes encrypted documents genuinely need to be shared with systems that can’t handle them. Rather than having users work around security controls, create an approved exception process with proper approvals, time limits, and audit trails.

9. Implement regular compatibility testing

The recommended solution: Schedule quarterly tests of your encryption policies against your actual business workflows.

Why this matters: Software updates, new vendor relationships, and changing business processes can break previously working encryption setups. Regular testing catches these issues before they impact critical business operations.

Implementation tip: Create a simple test matrix covering your most common document types, sharing scenarios, and external platforms. Run through this checklist each quarter and after major system updates.
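A test matrix is just the combination of every document type, sharing scenario and platform you care about. The sketch below generates one as a checklist; the three lists are hypothetical examples, so swap in your own.

```python
from itertools import product

# Hypothetical inputs: replace with your real document types and platforms.
DOCUMENT_TYPES = ["docx", "xlsx", "pdf"]
SCENARIOS = ["email attachment", "SharePoint link"]
PLATFORMS = ["Adobe Acrobat", "LibreOffice", "Google Docs"]


def build_test_matrix():
    """Return one checklist row per combination, ready to be filled in."""
    return [
        {"document_type": doc, "scenario": scenario, "platform": platform, "result": "not run"}
        for doc, scenario, platform in product(DOCUMENT_TYPES, SCENARIOS, PLATFORMS)
    ]


matrix = build_test_matrix()
print(len(matrix), "test cases")
```

Generating the rows rather than typing them keeps the checklist complete when you add a new platform or document type.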

Remember: The goal isn’t perfect security—it’s effective security that people can actually use.

When Encryption Meets Reality: What Actually Breaks When You Deploy Sensitivity Labels

09/06/2025 victorwingsing

Microsoft Purview’s sensitivity labels are brilliant for protecting your organisation’s data—until they’re not. While the encryption capabilities of labels like “Highly Confidential” and “Internal Only” provide robust security, they can also create unexpected roadblocks that’ll have your users reaching for the IT helpdesk.

I’ve previously written several blog posts on the subject that you can read.

Adding to what I’ve already mentioned above, let’s explore the seven other common issues when encryption through sensitivity labels meets the real world.

1. Third-Party Cloud Storage Platforms

What breaks: Dropbox, Adobe Creative Cloud, DocuSign, and similar platforms

Why it happens: Purview treats these as external environments and blocks access to encrypted content. Your beautifully protected document becomes a digital paperweight the moment someone tries to edit it outside the Microsoft 365 ecosystem.

The impact: Users can’t collaborate on sensitive documents stored in Dropbox or send encrypted contracts through DocuSign—two very common business scenarios.

2. AI and RPA Systems

What breaks: Third-party artificial intelligence tools and robotic process automation systems

Why it happens: These systems need to read and process your data, but encryption renders the content unreadable to external AI engines.

The impact: Your automated processing stops working, chatbots can’t access knowledge base documents, and data extraction workflows grind to a halt.

3. Business Intelligence Dashboards

What breaks: Third-party analytics platforms that pull data from encrypted Excel files.

Why it happens: BI tools can’t decrypt and read the underlying data in your spreadsheets, leaving your dashboards empty or displaying errors.

The impact: Executive reports fail to update, sales dashboards show no data, and business intelligence grinds to a halt.

4. Legacy Adobe PDF Readers

What breaks: Adobe versions older than Adobe Reader/Acrobat 22

Why it happens: Older Adobe versions lack the necessary components to handle Purview’s encryption standards.

The impact: Users with older software installations can’t open encrypted PDFs, creating accessibility issues across different departments or external partners.

According to Microsoft, the version that supports labelling is 22.003.20258.

Adobe’s official documentation is more up to date and lists version 23.003.20201.1ec7624.

In my personal experience, devices running Acrobat 21, 19 and 15 could not open encrypted PDF files at all.

5. Online PDF Viewers

What breaks: Browser-based PDF viewers (with the exception of Microsoft Edge) and third-party PDF reader apps.

Why it happens: These lightweight PDF viewers (e.g. Nitro PDF and PDFgear) don’t have the decryption capabilities required for Purview-protected documents.

The impact: Document previews fail and web-based workflows break. Users of third-party reader apps either cannot open encrypted PDFs at all or get an error message when they try.

6. Open Source Office Suites

What breaks: LibreOffice, OpenOffice, and similar free alternatives

Why it happens: These applications lack the proprietary decryption libraries needed to handle Microsoft’s encryption.

The impact: Vendors, remote branch offices or sub-member firms that run their own IT systems and use these free office suites suddenly can’t access company documents, creating a two-tier system of document access.

I’ve checked the LibreOffice documentation and could not find any mention of support for RMS.

7. Non-Microsoft Productivity Platforms

What breaks: Google Workspace (Docs, Sheets, Slides) and Apple iWork (Pages, Numbers, Keynote)

Why it happens: Competing platforms don’t support Microsoft’s encryption standards—hardly surprising, but often overlooked during planning and deployment.

You can read more about that here:

Does sensitivity label applied docs can be opened in google docs if I add my google account while applying the label?

Also, Google Workspace has a competing data classification scheme (Enable or disable a classification label), which is why I think it is unlikely that Google will make this work cross-platform.

The impact: Cross-platform collaboration becomes impossible, and BYOD policies clash with security requirements.

If you encounter any of these, read part 2 with my recommended actions/ workarounds.

When Purview SIT NAMES gives me headaches

07/06/2025 victorwingsing

There are now 326 built-in Purview SITs (as of 07-June-2025). The new additions are great, but Microsoft needs to do a better job of managing how SITs are documented and communicated to its users. Here’s my quick rant on this matter:

Rant 1: The Update list that is stuck in February 2024

The official Microsoft page (Sensitive information type entity definitions) hasn’t been updated since February 7, 2024. Seriously? One year and four months and counting.

Why it’s frustrating: We’re stuck creating custom SITs because the docs are MIA. (Although, I did win a project out of creating custom SITs, but still!)

Rant 2: Inconsistencies in SIT naming

If you compare the SIT names in the official documentation against the SIT names that you get in the Purview portal, the names that show up are different.

You can see in the image below that in the Purview portal (left) there are items such as Hungarian Social Security number versus Hungary Social security number.

Some of the names have an added n (Australian instead of Australia), and some have the word Identification spelled out instead of ID. There’s a long list of inconsistencies in the naming.

Admittedly, within the XML data the names are correct. So I ask the product team: why are the names in the documentation titles different?

Rant 3: No announcements when new built-in SITs are out

There are no announcements to the Purview community when new built-in SITs come out. I have to check manually every now and then by exporting a list of SITs and comparing it with previous data. Good thing I keep track of what comes out (read my latest post on the matter: https://www.linkedin.com/pulse/12-new-sits-available-microsoft-purview-victor-wingsing-uzzze/)

Advice to Purview admins

Use the Immutable IDs when deploying SITs. Use the following command to export all the IDs to a CSV file:

Get-DlpSensitiveInformationType | Select-Object Name, Identity | Export-Csv -Path "SIT_Names_and_IDs.csv" -NoTypeInformation

Change the path name to a location where you want the file to be exported.

Using the Immutable Identity (ID) is better when deploying your Purview policies, as names can easily change but these IDs (as the name implies) are immutable. This makes your deployment scripts more future-proof.
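Since there are no announcements, comparing two exports is the practical way to spot new or renamed SITs. The sketch below assumes two CSV exports with the Name and Identity columns produced by the command above; the function names are hypothetical and the IDs in the example are placeholders, not real SIT identifiers.

```python
import csv
import io


def load_sits(csv_text):
    """Map each Identity to its Name from an exported CSV (as text)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Identity"]: row["Name"] for row in reader}


def compare_exports(old, new):
    """Return (added, renamed) between two Identity-to-Name mappings."""
    added = {sit_id: name for sit_id, name in new.items() if sit_id not in old}
    renamed = {
        sit_id: (old[sit_id], name)
        for sit_id, name in new.items()
        if sit_id in old and old[sit_id] != name
    }
    return added, renamed


# Placeholder rows for illustration only.
old = load_sits('"Name","Identity"\n"Hungary Social Security Number","id-1"\n')
new = load_sits('"Name","Identity"\n"Hungarian Social Security Number","id-1"\n"Example New SIT","id-2"\n')
print(compare_exports(old, new))
```

Keying the comparison on the ID rather than the name is what lets a rename show up as a rename instead of as one deletion plus one addition.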

Using eReaders with Microsoft Purview Information Protection: A “Remarkable” Case Study

11/03/2025 10/03/2025 victorwingsing

I’ve already decided what I’ll buy first when I win the lottery and it’s going to be the Remarkable Paper Pro.

I saw a C-level executive from a client using this device in a meeting and I was immediately impressed by its design. The form factor, the way it writes like paper and the ability to write directly on top of a PDF file are just so cool.

This same client later asked whether implementing sensitivity labelling for PDF files would impact their users, many of whom (especially VIPs) use this device for reading and annotating documents whilst travelling. So…I decided to investigate.

Remarkable Paper Pro: Technical Overview

Operating System: Codex (custom Linux-based OS)
Supported formats: Limited to PDF and ePub
Web capabilities: No built-in browser

File Management Options

Email: Direct file sharing via email.
Cable transfer: USB connection for importing/exporting
Cloud integration: Syncs with personal Google Drive, Dropbox and OneDrive
Remarkable custom app: The device can import files through my.remarkable.com

Device limitation (for Device Management or Data Security)

The Operating system (a Linux OS) cannot be onboarded to Microsoft Device Management or Intune
The operating system does not have a browser to access the Microsoft authentication portal
Users can get corporate data onto the device in only 3 general ways (sending it to the device via email, via USB cable, or by syncing files from their personal online storage, i.e. personal Dropbox, OneDrive or Google Drive)
The reMarkable tablet can open, view, and annotate password-protected PDFs. However, this feature is limited to basic password protection and does not extend to Microsoft Purview’s advanced encryption methods, such as Rights Management Services (RMS) or Microsoft Information Protection (MIP).

Impact Assessment

Users will encounter issues only when sensitivity labels with encryption are applied to PDF files. This limitation exists because the Remarkable devices cannot process Microsoft Purview’s advanced encryption methods, lacking both the necessary authentication capabilities and OS support to decrypt protected content.

The device also has no browser to authenticate with Microsoft services and its custom Linux-based OS (Codex) cannot be integrated with Microsoft’s security ecosystem. This makes it impossible to work on encrypted PDFs.

However, if PDF files are merely labelled without encryption applied (visual marking only), users will experience no impact whatsoever. These files remain fully accessible and maintain all annotation capabilities, as the labelling exists purely as metadata without affecting the file’s core accessibility.

Potential Solutions

Simple approach: Instruct executives to use sensitivity labels without encryption for PDF files they need to access on their Remarkable devices. Implement DLP monitoring to track PDFs sent to personal email addresses, providing security oversight without disrupting workflow.

Moderate approach (but costly): Issue corporate Onyx Boox eReaders as an alternative. Onyx Boox is a direct competitor of Remarkable but the key difference is that it runs on Android.

The big benefit: these Android-based devices (Android 13) support Microsoft authentication and can be properly integrated with MDM solutions, allowing full compatibility with encrypted documents.

It also costs less than the Remarkable Paper Pro, but buying an extra corporate device (even at $499 USD) just for reading PDF files and note-taking might not go down well with your CFO.

Complex approach: Create a special sensitivity label variant without encryption specifically for executive use cases involving eReaders. This label would maintain visual markings and tracking capabilities while ensuring accessibility on the Remarkable device.

Supporting your current Remarkable device users today

If supporting Remarkable devices for VIP users is necessary, focus on monitoring data flow rather than blocking device use.

Set up DLP policies that track document transfers to personal emails and cloud services used with Remarkable. Include:

Alerts when sensitive documents are transferred
Required business justification for transfers
Time limits on sensitive document access
Targeted security training for Remarkable users
Regular reviews of transferred documents
Clear audit logs of document movement (once reviews are done)

This approach balances users’ device preferences with security needs. Monitoring works better than banning devices that senior staff prefer to use.

Reference:

About Remarkable: https://support.remarkable.com/s/article/About-reMarkable-2
Receiving email using Remarkable: Trouble sharing files via email
Importing and Exporting files: Importing and exporting files and How to import and export files with the desktop app | reMarkable
Integration with 3rd party file storage: 11 WAYS to UPLOAD to your reMarkable 2

Deep dive in PDF labeling and data protection

09/03/2025 victorwingsing

Let’s cut to the chase – PDFs are everywhere in your organisation, and they’re housing your sensitive data. I’m talking about those finalised e-signed contracts, bank statements, and countless other critical documents. While we’re all busy protecting our Office files with fancy security measures, PDFs often slip through the cracks. But here’s the thing – they need the same level of classification and protection as your typical .docx or .xlsx files.

Here are the different ways you can label PDF files, plus a simple-to-follow deployment strategy for enabling PDF classification across your data.

Labeling PDFs: Three Approaches

Label data natively in Microsoft Office then save it as PDF
Label data using Adobe Acrobat
Label data using the Microsoft Purview Information Protection client

Read all the way to the end to see what happens if you use the “Open in PDF Word” function on an encrypted PDF file.

Approach 1: Label natively using Microsoft Office then save it as a PDF

This approach is something you can do now. This method involves applying a sensitivity label directly to an Office document in an application like Microsoft Word, and then saving it as a PDF. Although the label transfers to the PDF, note that if your label incorporates encryption, you must disable the PDF/A option when saving. The resulting PDF will display protection via Purview Information Protection, and its custom properties will indicate the applied label.

Saved as a PDF. The document security shows no security because the label that I used is a plain label without any encryption.

The custom properties show the label that I used.

TAKE NOTE: if your label has ENCRYPTION turned on, you need to unselect the PDF/A option when you save the file.

The security tab displays that it’s protected by Purview Information Protection.

The custom properties show the Privileged / Protected / Encrypted label that I used.
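If you want to check those custom properties across many files rather than one screenshot at a time, the idea can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes you have already read the PDF’s custom properties into a dictionary with a PDF library of your choice, and that the label entries follow the `MSIP_Label_<GUID>_<Field>` naming used for sensitivity label metadata. The GUID and values in the sample are made up.

```python
import re

# Assumed key pattern: MSIP_Label_<label GUID>_<field>, e.g. ..._Name or ..._Enabled
LABEL_KEY = re.compile(r"^MSIP_Label_([0-9a-fA-F-]{36})_(\w+)$")


def extract_labels(custom_properties):
    """Group label-related custom properties by label GUID."""
    labels = {}
    for key, value in custom_properties.items():
        match = LABEL_KEY.match(key)
        if match:
            guid, field = match.groups()
            labels.setdefault(guid, {})[field] = value
    return labels


# Made-up sample of what a labelled PDF's custom properties might look like
sample = {
    "Title": "Contract",
    "MSIP_Label_11111111-2222-3333-4444-555555555555_Enabled": "true",
    "MSIP_Label_11111111-2222-3333-4444-555555555555_Name": "Confidential",
}
labels = extract_labels(sample)
print(labels)
```

A file that returns an empty dictionary here has no label metadata in its properties, which makes it a candidate for labelling.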

Approach 2: Label data using Adobe Acrobat PDF Reader

Here’s where it gets interesting (and a bit challenging). Most of us view these PDFs through web browsers or PDF readers, with Adobe being the undisputed king of the PDF world. In fact, Adobe’s so dominant that in most organisations I’ve worked with, it’s practically become the default way to handle PDFs – much like how we all say “Google it” instead of “search for it”.

Unlike your Microsoft Office suite (Word, Excel, PowerPoint, Outlook), Adobe Acrobat doesn’t play nicely with Sensitivity labels out of the box. The “solution”? Mucking about in the Windows registry. Yes, you read that right – registry editing. Adobe’s own support documentation lists the exact steps to do this. Source (Adobe MPIP Support: https://helpx.adobe.com/enterprise/kb/mpip-support-acrobat.html)

Sure, tweaking the registry is not difficult to do. But imagine rolling this out across thousands of machines in your enterprise. Any experienced IT admin who’s attempted large-scale registry changes will tell you that it’s not fun.

There is a way to do this via Intune to simplify things. You can read it here on Simon Skotheimsvik’s blog: https://skotheimsvik.no/how-to-use-intune-to-enable-sensitivity-labels-on-pdf-files
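To give a feel for how small the registry change itself is, here is a hedged Python sketch that only generates the text of a .reg file. The helper `build_reg_file` is my own illustration. The key path and the `bShowDMB` value name are what I recall from the Adobe article, so treat them as assumptions and confirm them against the Adobe MPIP Support page before deploying anything.

```python
def build_reg_file(key_path, dword_values):
    """Return the text of a .reg file that sets DWORD values under one key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        # DWORD values are written as 8 hexadecimal digits
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


# ASSUMPTION: key path and value name recalled from the Adobe KB article.
# Verify both against Adobe's documentation before use.
reg_text = build_reg_file(
    r"HKEY_CURRENT_USER\Software\Adobe\Adobe Acrobat\DC\MicrosoftAIP",
    {"bShowDMB": 1},
)
print(reg_text)
```

Generating the file rather than writing to the registry directly keeps the change reviewable, and the same text can be handed to whichever deployment tool you use.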

Why would you choose this option?

This option is great if you need to add the same Header, Footer or Watermark that you use in your Word, Excel and PowerPoint files to your PDF.

Approach 3: Label data using the Microsoft Purview Information Protection client

This client must be installed on your Windows devices before this will work. You can get it here: https://www.microsoft.com/en-gb/download/details.aspx?id=53018

Once installed, you have a tool that can label PDF files and do so much more. There are some limitations, which you’ll see below. You launch the client by right-clicking a file and selecting Apply sensitivity label with Microsoft Purview.

One big benefit of using this client is that you can select multiple files, or even an entire folder, and label them all in one go. You can use this to MANUALLY label all the files sitting on a PC or even on a shared network drive.
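Before you mass label a folder, it helps to know how many files are actually in scope. A minimal Python sketch, assuming you only care about a handful of extensions. The function name and the default extension list are my own choices for illustration.

```python
from collections import Counter
from pathlib import Path


def inventory(root, extensions=(".pdf", ".docx", ".xlsx")):
    """Count files per extension under root, including subfolders."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        # Compare in lower case so report.PDF and report.pdf are counted together
        if path.is_file() and path.suffix.lower() in extensions:
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `inventory` on a folder returns a count per extension, which gives you a rough size for the labelling job before you right-click anything.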

The limitations.

With this tool you cannot label a PDF while it is open, because it adds no label interface inside Adobe Acrobat. It also cannot apply headers, footers or watermarks. This is by design, as the client is a separate application that applies labels outside of the Office apps. Read it here: https://learn.microsoft.com/en-us/purview/sensitivity-labels-office-apps#when-office-apps-apply-content-marking-and-encryption

Opening Encrypted PDF in Word?

A client asked me this question: What happens when a user tries to open a PDF in Word?

Most of us by now know that you can open and edit a PDF in Word (if you don’t know how, please check this: https://support.microsoft.com/en-us/office/opening-pdfs-in-word-1d1d2acc-afa0-46ef-891d-b76bcd83d9c8).

The short answer is that your data is still protected. Here’s what happened when I tried to open an encrypted PDF file in Word.

Here’s the original PDF file that I have encrypted.

After I used Word to open the PDF, a pop-up prompt asked me to select how I wanted the file to be opened.

From the Preview window, I can already see that the data is encrypted by Microsoft IRM Services. This gives me confidence that the data is protected.

Then, upon opening the file, all I can see is scrambled data. The text and image in the original file are no longer readable.

Deployment strategy

Now that you know how labels work for PDFs, let’s talk about deployment.

Begin with Approach 1 because it leverages familiar tools like Microsoft Word and allows you to secure sensitive PDFs right from the document creation stage. This straightforward step minimises the learning curve and reduces the likelihood of errors, enabling your team to adopt essential security measures immediately.

Once the basics are in place, invest in user education to ensure proper application and management of sensitivity labels. Training reinforces security compliance and builds a strong foundation, empowering your staff to understand and uphold data protection practices across the organisation.

After establishing confidence in Approach 1, transition to the Microsoft Purview Information Protection client (Approach 3) to enable scalable, mass labelling across devices and shared drives. This phased progression not only improves operational efficiency and consistency but also sets the stage for introducing more advanced options like registry adjustments (Approach 2) when additional formatting or watermark requirements arise.

References:

PDF Support: https://learn.microsoft.com/en-us/purview/sensitivity-labels-office-apps#pdf-support
Reading resources: https://techcommunity.microsoft.com/blog/microsoft365insiderblog/apply-sensitivity-labels-to-pdfs-created-with-office-apps/4214665
Reading resources: https://office365itpros.com/2022/06/20/protected-pdfs-microsoft-365-easier/
MIP end-user document: https://support.microsoft.com/en-us/topic/label-and-protect-files-in-file-explorer-in-windows-67829155-2d0e-4122-9677-7c53c8cba18a
MIP Labeller download: https://www.microsoft.com/en-gb/download/details.aspx?id=53018
News about labeller: https://learn.microsoft.com/en-us/purview/about-infoprotect-client
Release notes: https://learn.microsoft.com/en-us/purview/information-protection-client-relnotes
How to use sensitivity labels in your PDF files by Nikki Chapple (2022): https://nikkichapple.com/how-to-use-sensitivity-labels-with-your-pdf-files/
How to use Intune to enable sensitivity labels on PDF files (2022): https://skotheimsvik.no/how-to-use-intune-to-enable-sensitivity-labels-on-pdf-files

All Adobe related guides:

Adobe MPIP Support: https://helpx.adobe.com/enterprise/kb/mpip-support-acrobat.html
Watermarking, Headers and Footers are enabled in PDF. Video here: https://experienceleague.adobe.com/en/docs/document-cloud-learn/acrobat-learning/integrations/microsoftsensitivitylabels
Press release on Adobe PDF support (2023) https://www.microsoft.com/en-us/security/blog/2023/03/07/get-integrated-microsoft-purview-information-protection-in-adobe-acrobat-now-available/

In the center of it all.

04/02/202504/02/2025 victorwingsing

Among all Microsoft Purview security solutions, there’s one that you absolutely must get right. If you don’t, your entire data security strategy could fall apart, no matter what other security tools you’re using.

This key solution brings together three basic but crucial tasks: finding your sensitive data, labelling it correctly, and keeping it safe. This solution is Microsoft Purview Information Protection (MIP), and it’s at the heart of how you protect your company’s data.

Why is MIP so critical?

Think of Microsoft Purview’s data classification service as the system that helps all the other security tools know what to do. Here’s how it works with different Purview tools:

Purview Data Loss Prevention (DLP):

Works like a security guard that reads the labels
If it sees a file marked ‘Secret’, it knows exactly what protection rules to follow
For example: “This is confidential data, so don’t let it be shared outside the company”
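The “security guard reading the label” idea can be sketched in Python as a simple lookup. To be clear, this is only a conceptual illustration: real DLP policies are configured in the Purview portal, not in code, and the label names and the `RULES` table below are hypothetical.

```python
# Hypothetical labels and rules, for illustration only
RULES = {
    "Secret": {"block_external_sharing": True},
    "Confidential": {"block_external_sharing": True},
    "General": {"block_external_sharing": False},
}


def can_share_externally(label):
    """Return True if a file carrying this label may leave the company."""
    rule = RULES.get(label)
    if rule is None:
        # Unlabelled or unknown label: this sketch chooses to fail closed
        return False
    return not rule["block_external_sharing"]


print(can_share_externally("General"))       # allowed by the table above
print(can_share_externally("Confidential"))  # blocked by the table above
```

The point of the sketch is that the decision depends entirely on the label. If the label is missing or wrong, the guard has nothing reliable to read.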

Endpoint DLP (Devices) and Microsoft Defender for Cloud Apps:

These tools check the labels whether you’re working on your laptop or in cloud apps like Workday, Salesforce, etc.
They constantly ask “What’s this file’s label?” before allowing any action
Then they make sure the right safety measures are in place

Microsoft Purview Insider Risk Management:

This one’s particularly clever about using the labels
It watches for unusual behaviour with sensitive data
For example: If someone suddenly downloads 100 files marked ‘Highly Confidential’, it raises an alert
It can then start extra monitoring or take other protective steps
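The “100 files marked Highly Confidential” example boils down to counting labelled downloads inside a time window. Here is a small Python sketch of that idea. It is not how Insider Risk Management is implemented, and the one-hour window, the event format and the function name are all assumptions I made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def users_over_threshold(events, label="Highly Confidential",
                         threshold=100, window=timedelta(hours=1)):
    """events: iterable of (user, timestamp, label) download records.

    Returns the set of users with at least `threshold` downloads of files
    carrying `label` inside any single `window` of time.
    """
    by_user = defaultdict(list)
    for user, ts, lbl in events:
        if lbl == label:
            by_user[user].append(ts)

    flagged = set()
    for user, times in by_user.items():
        times.sort()
        start = 0
        for end, ts in enumerate(times):
            # Slide the window start forward until it spans at most `window`
            while ts - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return flagged


# Made-up data: alice downloads 100 labelled files in under two minutes
base = datetime(2025, 2, 4, 9, 0, 0)
events = [("alice", base + timedelta(seconds=i), "Highly Confidential") for i in range(100)]
events += [("bob", base + timedelta(minutes=i), "Highly Confidential") for i in range(5)]
print(users_over_threshold(events))
```

Notice that unlabelled files are invisible to this check. That is the practical reason the labelling has to be right before this kind of alerting is useful.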

Microsoft Purview Data Governance (Data Map):

This service uses MIP to help you map and catalog your structured data.
It gives you the ability to apply consistent classification across your data estate. You can have a standardised label across your organisation.
For example: “A ‘Confidential’ label means the same thing everywhere, making it easier to manage and protect”

Third party services using MIP

Even third-party services leverage the MIP data classification service.

Trellix integrates its DLP network appliance with MIP: https://docs.trellix.com/bundle/data-loss-prevention-11.11.x-product-guide/page/UUID-5d61c924-38ac-3cb9-fb84-17596363740f.html

CrowdStrike leverages Microsoft Purview Information Protection labels (page 5 of 7): https://www.crowdstrike.com/wp-content/uploads/2023/12/A-Modern-Approach-to-Confidently-Stopping-Unauthorized-Data-Exfiltration_WhitePaper.pdf

Zscaler and Egnyte can import MIP labels as part of their DLP: https://help.zscaler.com/downloads/zscaler-technology-partners/data/zscaler-and-egnyte-deployment-guide/Zscaler-Egnyte-Deployment-Guide-FINAL.pdf

Microsoft Purview Information Protection is the foundation that your entire data security and governance strategy builds upon. Without a properly planned and implemented MIP deployment, even the most sophisticated Purview solutions won’t deliver their full value. Think of it as building a house – you need to get the foundation right first.

As your organisation grows and your data landscape becomes more complex, your MIP strategy needs to evolve too. Regular reviews of your classification labels, updating sensitivity rules, and fine-tuning your protection policies aren’t just good practice – they’re essential for keeping your data secure and compliant.

Making the case for Optical Character Recognition (OCR) in your Data Protection strategy

01/02/202503/02/2025 victorwingsing

I recently applied for a U.S. visa, and as part of the process, I had to submit my passport, bank records, and a lot of personally identifiable information to the embassy in the form of PDF and JPEG files. This meant that much of my sensitive data is now stored as images. This made me wonder: How are organisations safeguarding data that is image-based rather than text-based?

Traditional Data Loss Prevention (DLP) strategies, while effective in monitoring text-based data, often fall short when it comes to image-based content. This shortcoming can lead to significant vulnerabilities, as sensitive information is frequently embedded within images (see my example above). Optical Character Recognition (OCR) addresses this gap by enabling organisations to extract and analyze text from images. For Cyber Security teams aiming to enhance their data security posture, integrating OCR into their DLP strategy is not just beneficial, it is a must!

What are the industry use cases for OCR in DLP?

Financial services: Sensitive information such as account numbers, credit card details, and personally identifiable information (PII) is often embedded in scanned documents, receipts, and screenshots
Healthcare industry: Medical records and scans, prescriptions and doctor’s notes (assuming that your doctor can write legibly)
Retail and Ecommerce: Scanned receipts and invoices, plus most product returns and refunds, which start on paper and then get scanned and stored
Manufacturing: Contracts, blueprints, R&D documents and even internal presentations (most of which get converted to either an image or a PDF)
Government and Public Sector: Scanned copies of passports, driver’s licenses and other PII data, and incident reports (which again start on paper and end up as an image)

These are just examples of where OCR in DLP can help ensure that data is not leaked.

OCR in Microsoft Purview

Microsoft Purview has an OCR capability that lets you identify and protect data held in images. It scans images for Sensitive Information, but do remember that this is an OPTIONAL feature and must be enabled at the Tenant level. There’s also a bit of a cost to it (more on this later).

To turn on OCR in Microsoft Purview, do the following.

Go to Settings > Select Optical Character Recognition (OCR).
Choose where you want OCR to scan.

The full technical instructions can be found here: https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#workflow-at-a-glance

The Cost of OCR

This capability is powerful because it uses Azure AI to perform the OCR. As of today, it costs $1.00 USD per 1,000 scanned items. The key phrase in the pricing is ‘per scanned item’: Microsoft counts each page in a PDF, or each individual image in a set of images, as 1 scan. So a PDF that contains 10 pages counts as 10 scans. https://learn.microsoft.com/en-us/purview/ocr-learn-about?tabs=purview#estimate-your-ocr-scanning-charges
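The arithmetic is easy to get wrong when you think in files instead of pages, so here is a quick Python sketch of the estimate. It uses the $1.00 per 1,000 scans figure quoted above and assumes the charge scales linearly with the number of scans. The function name and the sample numbers are made up, and you should check the current pricing page before budgeting.

```python
PRICE_PER_1000_SCANS_USD = 1.00  # figure quoted in this post; check current pricing


def estimate_ocr_cost(pdf_page_counts, image_count):
    """Each PDF page and each standalone image counts as one scan."""
    scans = sum(pdf_page_counts) + image_count
    cost = scans / 1000 * PRICE_PER_1000_SCANS_USD
    return scans, cost


# Made-up estate: three PDFs (10, 3 and 250 pages) plus 1,737 standalone images
scans, cost = estimate_ocr_cost([10, 3, 250], image_count=1737)
print(f"{scans} scans, about ${cost:.2f}")  # 2000 scans, about $2.00
```

Three PDFs and 1,737 images sounds like 1,740 items, but it bills as 2,000 scans because the PDF pages are counted individually.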

Data Strategy in using OCR for the first time.

To limit your cost and be more deliberate in running OCR scans, here’s a helpful strategy that you can use to get started.

Data Search Using Content Search in Purview: Utilize Microsoft Purview’s Content Search feature to filter by file type, such as JPEG and PNG, to identify potential images containing sensitive information. This targeted approach ensures that all image files are scanned for embedded text.

Focus on Known Locations: Identify departments or teams that handle sensitive data, such as Finance, Sales, and Marketing, and focus OCR searches on their respective SharePoint sites. This strategy maximizes the efficiency of OCR by concentrating on areas where sensitive information is most likely to reside.

File Name Analysis: Implement keyword searches for terms that indicate sensitive content, such as “passport” or “ccn” (credit card number), in file names. This proactive approach helps in identifying and flagging files that may contain sensitive information.
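The three steps above can be combined into a single keyword query for Content Search. Here is a hedged Python sketch that only builds the query string. The property names `fileextension`, `filename` and `path` are the searchable properties as I understand them, so check them against Microsoft’s keyword query documentation. The helper name and the example site URL are hypothetical.

```python
def build_kql(extensions, filename_keywords, site_urls=()):
    """Build a KQL string from file types, file name keywords and optional sites."""
    clauses = []
    if extensions:
        clauses.append("(" + " OR ".join(f"fileextension:{e}" for e in extensions) + ")")
    if filename_keywords:
        clauses.append("(" + " OR ".join(f"filename:{k}" for k in filename_keywords) + ")")
    if site_urls:
        clauses.append("(" + " OR ".join(f'path:"{u}"' for u in site_urls) + ")")
    # Every group must match, so the groups are joined with AND
    return " AND ".join(clauses)


query = build_kql(["jpg", "jpeg", "png"], ["passport", "ccn"])
print(query)
# (fileextension:jpg OR fileextension:jpeg OR fileextension:png) AND (filename:passport OR filename:ccn)

# Hypothetical site URL, to narrow the search to a known location
finance_query = build_kql(["png"], ["passport"], ["https://contoso.sharepoint.com/sites/finance"])
```

Running a query like this first gives you an item count, which you can feed into a cost estimate before you switch OCR on for a location.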

Excluding a specific user (or group) from Sensitivity labels

13/10/202403/02/2025 victorwingsing

I’m excited to share a practical guide I’ve created that walks you through the process of excluding specific users or groups from Microsoft Purview Sensitivity Labels. This guide comes from a real-world scenario where an organization is piloting a new approach to simplify its labeling structure. They wanted to test how reducing the number of labels applied to users would affect workflows and information protection. To support this, I’ve put together detailed instructions on how to effectively manage exclusions in Purview, along with a back-out process to ensure a smooth rollback if needed.

This PDF guide is packed with step-by-step instructions, screenshots, and expert tips to help you navigate the nuances of label exclusions. Whether you’re in the middle of a label simplification pilot or simply looking to better control label application, this guide will help streamline your process. Get ready to dive in and experience a more flexible, user-centered approach to managing Sensitivity Labels in Microsoft Purview!

Excluding a Group from Sensitivity label policy Download