LEGACY DATA CLEAN-UP FOR GDPR COMPLIANCE USING IBM STOREDIQ SUITE

The Data Protection Officer of a leading UK insurer needed insight into the organisation’s unstructured data estate (network file shares, SharePoint sites and Exchange). They needed to identify where individuals’ Personal Data resided and ensure that it was managed in accordance with the GDPR and did not infringe the rights of the individual.

Initial attempts to identify this Personal Data using mainly manual processes showed that:

  • the results would be extremely inconsistent and therefore unreliable
  • the process could not be completed in any reasonable timescale for GDPR compliance: the estimate for completion was in years, not months or weeks
  • the impact on the business was too great: business users were having to spend a considerable part of their working day on data identification tasks

The lack of visibility of the Personal Data held by the organisation also meant that it would not be able to respond completely to DSARs (Data Subject Access Requests).

The problem was exacerbated by a lack of information governance within the organisation, which had resulted in massive over-retention of unstructured data, meaning that far more data than necessary would have to be analysed.

The Solution

Insight 2 Value were engaged to implement an Unstructured Data Discovery programme that would allow the Company to gain timely insight into all of their unstructured data and act on the findings. IBM StoredIQ was implemented to build a consolidated view of their entire unstructured data estate.

With the single view of their data in place, the company were able to rapidly segment data, identifying:

  • Redundant, Obsolete or Trivial (ROT) data
  • Critical data by business unit

Detailed content analytics was then performed to identify the data containing Personal Data.
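
As a rough illustration of what content analytics for Personal Data involves, the sketch below scans text for a few common UK identifiers using regular expressions. It is not how IBM StoredIQ works internally; the pattern set and the names `PATTERNS` and `find_personal_data` are assumptions made for this example, and the patterns are deliberately simplified.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# They will produce false positives and false negatives on real data.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "uk_nino": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"),
}


def find_personal_data(text):
    """Return a dict mapping each pattern name to the matches found in text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


sample = "Contact jane.doe@example.com, NI number AB123456C, address SW1A 1AA."
print(find_personal_data(sample))
```

A document with one or more hits would be flagged for review rather than treated as confirmed Personal Data, since pattern matching alone cannot establish context.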

The solution also provided the means to remediate the data accordingly:

  • ROT data was deleted (around 30% of the total estate)
  • Documents containing Personal Data were assessed to ensure that they were being stored in accordance with the Company’s Records of Processing Activities. If not, the data was either moved to a secure system or deleted.

The Outcome

The IBM StoredIQ solution provides a single consistent approach to discovery across all unstructured data repositories – allowing the easy identification of personal information and the ability to perform remediation actions as required. As a result, the company is now able to quickly query their unstructured data estate in support of a Data Subject Access Request.

Furthermore, having built this single view of their data to support their GDPR initiatives, the company realised that the information the solution provided could also support additional use cases:

IT operations:

  • Storage costs and efficiencies – the initial ROT removal provided immediate savings
  • Duplication of data
  • Categories and age of data being held
  • Data retained for users no longer with the company
  • Genuine data from which to build and implement an Information Retention Strategy
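
To make the duplication and data-age items above concrete, here is a minimal sketch of how such a report could be produced for an ordinary file share. It uses content hashing and file modification times from the Python standard library and does not represent IBM StoredIQ's own mechanism; the function names and the age threshold parameter are assumptions for illustration.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's content in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of files under root whose content is byte-for-byte identical."""
    by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


def find_stale(root, max_age_days, now=None):
    """Return files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

Calling `find_duplicates("/path/to/share")` and `find_stale("/path/to/share", max_age_days=7 * 365)` would give candidate lists for review; the seven-year figure is only an example and any real threshold would come from the organisation's retention policy.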

The initial removal of Redundant, Obsolete or Trivial (ROT) data had an additional positive impact on the business. Users, especially those in customer-facing roles, found they were spending less time finding the right information and more time focusing on their customers.