Your Top 10 Data Redaction Questions Answered

March 30, 2021

We recently welcomed Redacted.ai into the OneTrust family to expand our data redaction capabilities. The acquisition brings data redaction into the OneTrust platform to address a broad range of privacy, information security, and legal use cases. We hosted a webinar about the acquisition and received more than 100 questions about the product, the future of OneTrust data redaction, and what these capabilities can do for your privacy, security, and data governance programs.

We took your top questions and created an FAQ series to dive deep into our data redaction capabilities and what they mean for you.

Let’s start with some context.

What Is Data Redaction?

Data redaction means removing sensitive information from files or databases. There are two types of data redaction:

Unstructured file redaction: Redacting data that is stored in files rather than in a database, such as PDFs, Word documents, images, and emails. This type of data accounts for almost 90% of all digital data.
‘Database’ redaction: Redacting data that is typically stored under known fields in a database or structured data system. In other words, the data you typically interact with on a day-to-day basis that has already been organized and understood, such as names, addresses, and SSNs.

Note: The term ‘database’ redaction is sometimes used to mean data masking, a technique that hides or obfuscates data in the results of a database query (e.g., a credit card number replaced by xxxx in the results file).
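To make the masking idea concrete, here is a minimal Python sketch that masks card numbers in query results before they are written to a results file. The function name, the pattern, and the sample row are assumptions made for illustration only; they do not describe how OneTrust Data Redaction works internally.

```python
import re

# Hypothetical pattern: 13 to 19 digits with optional space or hyphen
# separators. Real card detection is stricter than this.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def mask_card_number(value: str, visible: int = 4) -> str:
    """Replace every digit except the last `visible` ones with 'x'."""

    def _mask(match: re.Match) -> str:
        text = match.group(0)
        total_digits = sum(ch.isdigit() for ch in text)
        to_mask = max(total_digits - visible, 0)
        out = []
        for ch in text:
            if ch.isdigit() and to_mask > 0:
                out.append("x")
                to_mask -= 1
            else:
                out.append(ch)  # keep separators and the trailing digits
        return "".join(out)

    return CARD_PATTERN.sub(_mask, value)


# Illustrative query result; the stored value is untouched, only the
# copy that goes into the results is masked.
rows = [{"name": "A. Customer", "card": "4111 1111 1111 1111"}]
masked_rows = [{**row, "card": mask_card_number(row["card"])} for row in rows]
print(masked_rows[0]["card"])
```

Note that masking hides the value in the output only. The underlying record still contains the full number, which is the key difference from redacting a file.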

Why Is Data Redaction Needed?

There are multiple reasons to redact sensitive and personal information from unstructured files such as emails, PDFs, and docs.

Let’s look at a few key use cases.

Legal redaction

As part of eDiscovery processes, M&A, investigations, and regulatory information sharing, there is a need to redact sensitive information before disclosing it.

Government / FOIA redaction

Governments in several countries are subject to freedom of information requests and privacy-related data requests. Before information is disclosed to the person who made the request, sensitive and personal information needs to be redacted from the files.

Life Sciences

Clinical trials produce documents relating to patients that need to be shared with other parties so that research can be conducted. The patient’s personal information needs to be redacted before those files are shared.

Privacy related redaction in the context of DSARs

When organizations receive data subject access requests or consumer rights requests under privacy laws like the GDPR and the CCPA, the requester’s information is sometimes commingled with other people’s information and other sensitive information in the relevant files, such as emails, PDFs, and docs. Other people’s information and sensitive information need to be redacted before the files are disclosed to the requester.

Now let’s jump into your top 10 questions:

1. In what languages do you automatically detect sensitive information?

OneTrust Data Redaction currently supports the following languages, and more are being added continually. If the language you need is not listed here, please contact us:

English
French
German
Spanish
Italian
Chinese
Arabic

2. Can you specify the type of redaction on each redacted item e.g., “Third party name” or “Business Sensitive” or “Third party opinion”?

The default redaction is black and covers the applicable area in the file. You can also annotate the redaction with text that gives the reason for redaction and/or states the rule under which the information is not being shown. You can also choose different redaction colors on the page, including for different types of entities. For example, you could redact one name in red and another name in green.

3. How does it handle attachments on emails?

OneTrust Data Redaction shows the email(s) and displays the attachment(s) inline, so they can all be reviewed in one go.

4. How do we redact information relating to one person, but not information relating to other people?

OneTrust Data Redaction has the concept of a Do Not Redact list and an Always Redact list. If you do not want a particular person’s information to be redacted, you can add it to the Do Not Redact list. The product will automatically detect other people’s personal information and redact it.
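The way the two lists interact with automatic detection can be sketched as follows. This is only an illustration: the `Detection` type, the function name, and the precedence rule (Always Redact wins, and anything on neither list is redacted by default) are assumptions, not a description of the product's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    text: str         # the detected string, e.g. a name
    entity_type: str  # e.g. "NAME" or "EMAIL"


def _norm(value: str) -> str:
    """Normalize case and whitespace so list lookups are forgiving."""
    return " ".join(value.lower().split())


def decide_redactions(detections, do_not_redact, always_redact):
    """Return the detections that should be redacted.

    Assumed precedence: an Always Redact entry wins over a Do Not Redact
    entry, and anything on neither list is redacted by default.
    """
    keep = {_norm(v) for v in do_not_redact}
    force = {_norm(v) for v in always_redact}
    result = []
    for det in detections:
        key = _norm(det.text)
        if key in force or key not in keep:
            result.append(det)
    return result


# Illustrative DSAR: Jane Doe is the requester, so her name stays visible.
detected = [
    Detection("Jane Doe", "NAME"),
    Detection("John Smith", "NAME"),
    Detection("john.smith@example.com", "EMAIL"),
]
to_redact = decide_redactions(detected, do_not_redact=["Jane Doe"], always_redact=[])
print([d.text for d in to_redact])
```

In this sketch the requester's name is left alone while the other person's name and email address are selected for redaction.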

5. Are the PDFs that are produced safe from software that can remove redactions?

OneTrust Data Redaction creates a new file that contains the redactions. In this new file, the redacted information is no longer present. In other words, that information is in the original file, but not in the new redacted file. In that sense, the redacted area covers a blank space. Even if software were capable of removing the redaction mark, it would only uncover a blank space behind it.
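The principle can be illustrated with plain text. The sketch below builds a new string in which the redacted characters are replaced rather than covered, so nothing is left to recover from the output. It is a simplified analogy under our own assumptions (the function name and the character-offset spans are hypothetical) and does not show how PDF content is actually processed.

```python
from __future__ import annotations


def redact_text(original: str, spans: list[tuple[int, int]], fill: str = "█") -> str:
    """Build a NEW string in which each (start, end) span is replaced by `fill`.

    The original characters are never copied into the output, so there
    is nothing underneath the redaction to uncover.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        start = max(start, cursor)  # skip any part already redacted
        if end <= start:
            continue
        pieces.append(original[cursor:start])  # untouched text before the span
        pieces.append(fill * (end - start))    # placeholder of the same length
        cursor = end
    pieces.append(original[cursor:])
    return "".join(pieces)


source = "Contact John Smith at the office."
name = "John Smith"
start = source.index(name)
redacted = redact_text(source, [(start, start + len(name))])
print(redacted)
```

The `source` string still holds the name, just as the original file does, while `redacted` contains only fill characters in that position.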

6. Are you able to redact multiple pages at once?

OneTrust Data Redaction has a few features that make this possible:

Automatic detection redacts the types of information you choose (for example, names) across the whole file
Within a file, you can manually choose to redact an item on all pages of that file
You can add items to the Do Not Redact list, which also applies across several files

7. Is there a dictionary for the texts used for redaction? Do we need to maintain this dictionary? Do you have to enter the actual data item to be redacted (e.g., SSN: 123-45-6789) or do you indicate redact anything set as xxx-xx-xxxx in the file?

We use a combination of approaches to detect sensitive information, including Natural Language Processing (NLP). This means that even if the machine has never seen a particular name or address before, it will still predict that a given set of characters is likely to be a name or address. It does not search a database of exact matches, as a dictionary-only approach would. Similarly, for a social security number (SSN), the tool uses machine learning and NLP to predict that a combination of numbers is an SSN without having seen that particular number before.

8. Does it work on PNG files? Which file formats are supported?

Yes, we support PNG files as well as a wide variety of other input file formats:

Email file types
- PST
- MSG
- EML
- MBOX
PDF
Word
Excel
PowerPoint
PNG
JPEG
JSON

9. I’d like to understand if it’s possible to write new classification rules or customize the ones the solution provides. And how to do it?

You can write your own rules using regular expressions (regex) and save them for use across different contexts.
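As a rough idea of what regex-based rules look like, here is a minimal Python sketch of a saved rule set applied to a piece of text. The rule names, the `EMP-` identifier format, and the helper functions are invented for illustration; the product's own rule syntax and storage may differ.

```python
import re

# Hypothetical saved rule set: rule name -> regular expression.
CUSTOM_RULES = {
    "US_SSN": r"\b\d{3}-\d{2}-\d{4}\b",  # matches the xxx-xx-xxxx layout
    "EMPLOYEE_ID": r"\bEMP-\d{6}\b",     # invented format for illustration
}


def compile_rules(rules):
    """Compile each pattern once so the rule set can be reused across files."""
    return {name: re.compile(pattern) for name, pattern in rules.items()}


def find_matches(text, compiled_rules):
    """Return (rule name, start, end, matched text) tuples in document order."""
    hits = []
    for name, pattern in compiled_rules.items():
        for match in pattern.finditer(text):
            hits.append((name, match.start(), match.end(), match.group(0)))
    return sorted(hits, key=lambda hit: hit[1])


compiled = compile_rules(CUSTOM_RULES)
sample = "SSN: 123-45-6789, badge EMP-004217."
for rule_name, start, end, value in find_matches(sample, compiled):
    print(rule_name, start, end, value)
```

A pattern-based rule like this finds any value with the right shape, so you do not have to enter each actual SSN to be redacted.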

10. Will it also look at metadata and redact this?

OneTrust Data Redaction looks for metadata in the original files and removes it before generating the new redacted files. For example, metadata containing sensitive information is filtered out, so it does not appear in the new redacted file.

Learn more about OneTrust Data Redaction capabilities or request a demo today!

