How automation helps reduce your sensitive data footprint

Establish data retention and minimization policies to reduce your organization’s attack surface

Sam Curcuruto
Sr. Product Marketing Manager, Data Discovery, OneTrust
May 5, 2023

Data is more valuable than ever, and companies are looking for ways to optimize how they collect and use it to give customers timely, personalized experiences. As data’s value increases, so do the associated risks and costs. Cloud storage alone accounts for 30% of a company’s overall IT budget, with one terabyte (TB) of data costing $3,351 per year on average. At that rate, 300 TB of data costs roughly $1M per year in storage alone (300 × $3,351 = $1,005,300). Beyond rising storage costs, data breaches are becoming more prevalent as the volume and variety of data collected by organizations grow. The average cost of a data breach in 2022 was $4.35M.

The problem is clear: More data, more costs, more risk

But is there more value? That depends on how your organization uses its data. Hoarding data, or collecting it without a clear purpose, not only increases the storage costs and breach risk mentioned above, it can also violate the data minimization and data retention requirements found in many regulations.

Unstructured data and its challenges

If data minimization and data retention so clearly address high storage costs, data breach risk, and non-compliance, why isn’t everyone doing it? One major reason is that more than 80% of the data stored by organizations is unstructured.

This means it’s in the form of:

  • Emails
  • File attachments
  • Images
  • PDFs
  • Other forms of data that don’t have predefined fields like those in a structured database

This data also usually loses its value within 90 days, and nearly a third of it is considered redundant, obsolete, or trivial (ROT). ROT data not only adds storage costs with no return, it’s also prime fodder for data breaches because it typically sits outside secure systems. It expands your company’s attack surface: the full set of points through which an unauthorized user or attacker could breach your systems.

Given these concerns about unstructured data and a growing attack surface, most privacy regulations today call for data minimization practices as part of standard operating procedures. Recent enforcement actions from the Federal Trade Commission (FTC) show that data minimization is a key tenet of privacy and data security best practices. Companies can start building this into their data workflows, using privacy by design principles in their products and services so that data is minimized from the outset and its collection and use are clearly communicated to customers.

How can companies operationalize data retention and minimization?

Once you’ve committed to building privacy by design into your products and services from their inception, the next step is figuring out how to integrate data retention and minimization into your processes seamlessly.

1. Observe your current data lifecycle

To kick things off, look at your most common data workflows and scenarios. Analyze your metadata for relevant fields, such as when data was created and when it was last accessed or modified. Identify when data stops being necessary and where it is commonly deleted in these scenarios, then see how these patterns could map to a data retention schedule.
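As a rough illustration of this kind of metadata review, here is a minimal Python sketch that walks a folder and reports how long ago each file was last modified and accessed. The function names, the `FileAge` type, and the 90-day default are illustrative assumptions, not part of any OneTrust product. It also leaves out creation time, which is not exposed consistently across operating systems, and access times can be unreliable on volumes mounted with access-time updates disabled.

```python
import os
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileAge:
    """Age of a single file, in days (hypothetical helper type)."""
    path: str
    days_since_modified: float
    days_since_accessed: float


def collect_file_ages(root, now=None):
    """Walk `root` and record how long ago each file was modified and accessed."""
    now = time.time() if now is None else now
    ages = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            stats = os.stat(full_path)
            ages.append(
                FileAge(
                    path=full_path,
                    days_since_modified=(now - stats.st_mtime) / SECONDS_PER_DAY,
                    days_since_accessed=(now - stats.st_atime) / SECONDS_PER_DAY,
                )
            )
    return ages


def stale_files(ages, threshold_days=90):
    """Return files whose last modification is older than `threshold_days`."""
    return [age for age in ages if age.days_since_modified > threshold_days]
```

Running `stale_files(collect_file_ages("/path/to/share"))` on a representative folder gives a first impression of how much content has gone untouched, which can inform the retention schedule in the next step.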

2. Establish a deletion method

After identifying where data is deleted and building a retention schedule around these scenarios, you can apply the retention periods to your data, for example by archiving or deleting SharePoint files once they pass a certain age threshold.
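To make the idea of a time threshold concrete, the sketch below maps a file’s last-modified date to a keep, archive, or delete decision. The one-year and three-year thresholds, the `Action` enum, and the function names are hypothetical placeholders. Your own schedule should come from the regulations and business needs that apply to you, and the sketch only plans actions rather than touching any files.

```python
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    ARCHIVE = "archive"
    DELETE = "delete"


def retention_action(
    last_modified,
    now,
    archive_after=timedelta(days=365),      # assumed threshold, adjust to your policy
    delete_after=timedelta(days=3 * 365),   # assumed threshold, adjust to your policy
):
    """Decide what to do with an item based on how long ago it was modified."""
    age = now - last_modified
    if age >= delete_after:
        return Action.DELETE
    if age >= archive_after:
        return Action.ARCHIVE
    return Action.KEEP


def plan_actions(files, now):
    """Build a dry-run plan: file name -> Action, without modifying anything."""
    return {name: retention_action(modified, now) for name, modified in files.items()}
```

Producing a plan first, instead of deleting immediately, leaves room for a review step before anything is removed.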

3. Use a centralized data governance tool 

Once your retention periods are defined and your deletion methods are established, a centralized tool is the most efficient way to run the process. It allows you to:

  • Determine the most accurate set of retention policies for your organization based on your relevant regulations
  • Automate the retention and deletion process by setting business rules and applying them to your files
  • Flag and identify any violations of retention rules in the system
  • Decide whether data needs to be deleted, anonymized, or de-identified and carry out that action accordingly
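The capabilities above can be sketched as a small rule table. In this hypothetical Python example, each data category has a maximum age and a disposition (delete, anonymize, or de-identify), and records that exceed their rule, or that have no rule at all, are flagged. The category names, retention periods, and types are invented for illustration and do not describe how any particular governance tool works internally.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    max_age_days: int
    disposition: str  # "delete", "anonymize", or "de-identify"


@dataclass(frozen=True)
class Record:
    record_id: str
    category: str
    created: date


# Hypothetical business rules; real values depend on your regulations.
RULES = {
    "support_ticket": RetentionRule(max_age_days=365, disposition="anonymize"),
    "marketing_lead": RetentionRule(max_age_days=180, disposition="delete"),
}


def find_violations(records, rules, today):
    """Return (record_id, required_action) for every record breaking a rule.

    Records in a category with no rule are flagged for manual review.
    """
    violations = []
    for record in records:
        rule = rules.get(record.category)
        if rule is None:
            violations.append((record.record_id, "review"))
        elif (today - record.created).days > rule.max_age_days:
            violations.append((record.record_id, rule.disposition))
    return violations
```

Keeping the rules in one table, separate from the code that applies them, makes it easier to update retention periods when regulations change.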

How can automation help?

OneTrust Data Discovery can help your organization operationalize data retention policies by first helping you identify unstructured data across your entire IT infrastructure. Once you have full visibility across your data ecosystem in structured, semi-structured, and unstructured environments, you can then:

  • Capture business and technical metadata to enable data retention
  • Leverage machine learning to automate policy rules and label data accurately
  • Monitor data over time against the defined policies and ensure controls are followed
  • Track performance with advanced analytics across your data ecosystem, identifying trends and at-risk data

To learn more about how OneTrust Data Discovery can take your organization’s data retention and minimization policies to the next level, request a demo today. 

