Understanding the Value of Database Discovery beyond Unstructured Data

Doug Herman

When tackling electronic discovery, many attorneys and their teams know they need to focus on collecting, reviewing and producing potentially responsive electronic data. However, the scope of discovery is not limited to emails, user files and similar materials. Organizations create, maintain and store two types of data: unstructured and structured. Understanding what structured data is, how it is created, where it exists and how to collect it is critical to ensuring a defensible and thorough discovery review.

Structured and unstructured data exist in two extremely different formats. According to the Sedona Conference's “Database Principles: Addressing the Preservation & Production of Databases & Database Information in Civil Litigation”, “Information stored in databases differs fundamentally from discrete unstructured data, because unstructured data files tend to be static and self-contained.”

Structured data, on the other hand, has three common characteristics, according to the Sedona Conference: it consists of many pieces of discrete information; it is subdivided into fields or records; and it is saved in a common format such as a database.

The average organization may have information in many structured data formats. Examples of structured databases include:

  • Customer Relationship Management (CRM) systems, which sales teams use to track their activity, sales leads and pipeline;
  • Enterprise Resource Planning (ERP) systems, which many organizations use to track their general ledger, accounts payable and sometimes inventory; and
  • Human Resource Information Systems (HRIS), which organize critical information for each individual employee.

In fact, email systems such as Microsoft Exchange are actually databases, and each individual email is a record in the structured system. If you have ever had to pull data out of Microsoft SharePoint, which is a database on the back end, then you have dealt with a structured system. 

Structured data often exists either in a flat-file format (such as an Excel spreadsheet) or in a relational database (such as an SQL database). Each table in a relational database is actually a flat file, but when the tables are connected to each other via a common element and that “relationship” is maintained through the software, the files become relational. Consider a CRM database that stores an individual's contact information in one table and information about the company that individual works for in another table. When the data is presented to the user through the application, these elements appear to be part of one record, but behind the scenes the data is divided across many tables.
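To make the CRM example concrete, the sketch below builds a two-table database in memory and joins the tables on their common element. It is only an illustration: the table names, column names and sample rows are hypothetical, and it uses SQLite through Python's standard library rather than any particular CRM product.

```python
import sqlite3

# Hypothetical two-table CRM layout; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (
        company_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE contact (
        contact_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        email      TEXT,
        company_id INTEGER REFERENCES company(company_id)
    );
    INSERT INTO company VALUES (1, 'Acme Supply');
    INSERT INTO contact VALUES (10, 'Pat Lee', 'pat@example.com', 1);
""")


def contact_records(connection):
    """Join the two tables on their common element (company_id)."""
    cursor = connection.execute(
        """
        SELECT contact.full_name, contact.email, company.name
        FROM contact
        JOIN company ON company.company_id = contact.company_id
        ORDER BY contact.contact_id
        """
    )
    return cursor.fetchall()


records = contact_records(conn)
print(records)
```

The application shows the user a single contact record, but the company name comes from a different table. An extract that pulls only the contact table would lose it.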

Potentially responsive data can reside in a number of different structured systems. For example, information relevant to an alleged price-fixing case may exist in a combination of the sales CRM and financial ERP databases. Data in the HRIS may be useful in defending against age-discrimination claims.

In order to find the information, though, the legal team needs to know where to look and how to get the data.

Managing Structured Data During Discovery

Considering the breadth and depth of the information contained in structured databases, how can the legal team conduct e-discovery in a way that is cost-effective, timely and defensible? There are several different approaches, each with its own set of advantages and drawbacks.

Back up and produce the entire database

While backing up and producing the entire database represents a thorough approach, most legal teams rightfully avoid it at all costs. Even with a small database, the legal team will be handing over tremendous amounts of information, some of which may be privileged or confidential and much of which will be irrelevant. 

Opposing counsel will be able to review the contents of the backup only if they can restore it, which requires a suitable server and software framework and a copy of the software that was used to get data into the system. Reviewing the data requires the same software application and server configuration from which the backup was created. Even if the opposing side has the same hardware and software, organizations often customize their database systems over time, so licensing and other compatibility issues are likely to arise.

The brute force method 

This approach requires some technical finesse and assumes that critical documentation about the database structure and fielded data within exists. 

A database schema defines the structure of a relational database, identifying what tables exist and how those tables interact with each other. It serves as a map of the database and is often depicted graphically. A data dictionary is the critical document that provides a nontechnical definition for each column, helping the legal team understand what information actually exists. Stated another way, a data dictionary, also known as a metadata repository, contains information about data: how it relates to other data, when it was created, who created it and where it resides.

Through this approach, the legal team works with a data dictionary and schema to identify the data that counsel is seeking and then extracts that information from the back end by running SQL or similar queries.
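When no schema document exists, a technical resource can often recover at least the raw structure from the database itself. The following is a minimal sketch assuming a SQLite database, where `sqlite_master` and the `PRAGMA` statements are SQLite-specific; other database products expose similar catalog information under different names. The helper functions `describe_schema` and `describe_relationships`, and the sample tables, are hypothetical. The output lists tables, columns and declared relationships only. It does not replace a data dictionary, because it says nothing about what each column means to the business.

```python
import sqlite3


def describe_schema(connection):
    """Return {table: [(column, declared_type), ...]} for every user table."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


def describe_relationships(connection, table):
    """Return (local_column, referenced_table, referenced_column) tuples."""
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    rows = connection.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    return [(row[3], row[2], row[4]) for row in rows]


# Hypothetical sample database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (
        company_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE contact (
        contact_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        company_id INTEGER REFERENCES company(company_id)
    );
""")

schema = describe_schema(conn)
links = describe_relationships(conn, "contact")
for table, columns in schema.items():
    print(table, columns)
print("contact relationships:", links)
```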

While it seems straightforward, this approach is generally more complicated than it first appears. It requires that organizations compile and regularly update both a “data dictionary” and “database schema,” so that the data can be thoroughly identified and tracked. Few organizations have both the dictionary and the schema. 

It is also very time-consuming, since both sides may need to meet and confer to negotiate what data should be extracted, the parameters of that data (e.g., a date range), production methods and other considerations.
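Once the parties agree on parameters such as a date range, the extraction itself can be fairly compact. The sketch below is an assumption-laden illustration, not a production workflow: the `invoice` table, its columns, the sample rows and the helper names `extract_invoices` and `to_csv` are all hypothetical, and dates are assumed to be stored as ISO 8601 text so that they sort correctly. Passing the agreed dates as query parameters, rather than typing them into the SQL text, keeps the query itself fixed and makes it easier to document exactly what was run.

```python
import csv
import io
import sqlite3

# Hypothetical ERP-style table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (
        invoice_id   INTEGER PRIMARY KEY,
        customer     TEXT NOT NULL,
        invoice_date TEXT NOT NULL,  -- ISO 8601 text (YYYY-MM-DD)
        amount       REAL NOT NULL
    );
    INSERT INTO invoice VALUES (1, 'Acme Supply', '2011-12-15', 1200.0);
    INSERT INTO invoice VALUES (2, 'Acme Supply', '2012-01-10', 850.0);
    INSERT INTO invoice VALUES (3, 'Birch Co',    '2012-06-30', 430.5);
    INSERT INTO invoice VALUES (4, 'Birch Co',    '2013-01-02', 990.0);
""")


def extract_invoices(connection, start_date, end_date):
    """Return (header, rows) for invoices dated within the agreed range, inclusive."""
    cursor = connection.execute(
        "SELECT invoice_id, customer, invoice_date, amount "
        "FROM invoice "
        "WHERE invoice_date BETWEEN ? AND ? "
        "ORDER BY invoice_date, invoice_id",
        (start_date, end_date),
    )
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, rows):
    """Render the extract as CSV text suitable for a flat-file production."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = extract_invoices(conn, "2012-01-01", "2012-12-31")
production = to_csv(header, rows)
print(production)
```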

Legal teams often find that many of the fields in the database, while they exist in the schema and data dictionary, are associated with software functionality that is not used by the organization. That adds to the complexity of the approach. In-house resources can be leveraged for this approach, but that leaves those employees open to testifying as subject matter experts. 
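One way to spot fields that exist in the schema but are not actually used is to count how many rows hold a value in each column. The sketch below assumes a SQLite database and a hypothetical `contact` table; the helper name `column_usage` is also an assumption. A column that is empty in every row is a candidate for exclusion, although that decision still belongs to counsel and the parties, not to the query.

```python
import sqlite3


def column_usage(connection, table):
    """Return {column: (populated_rows, total_rows)} for one table."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = [
        row[1] for row in connection.execute(f'PRAGMA table_info("{table}")')
    ]
    total = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    usage = {}
    for column in columns:
        # Treat NULLs and blank or whitespace-only values as unpopulated.
        query = (
            f"""SELECT COUNT(*) FROM "{table}" """
            f"""WHERE "{column}" IS NOT NULL AND TRIM("{column}") <> ''"""
        )
        populated = connection.execute(query).fetchone()[0]
        usage[column] = (populated, total)
    return usage


# Hypothetical sample data: fax and territory_code are never filled in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (
        contact_id     INTEGER PRIMARY KEY,
        full_name      TEXT,
        fax            TEXT,
        territory_code TEXT
    );
    INSERT INTO contact VALUES (1, 'Pat Lee',   NULL, '');
    INSERT INTO contact VALUES (2, 'Sam Ortiz', NULL, '');
    INSERT INTO contact VALUES (3, 'Kim Park',  NULL, '   ');
""")

usage = column_usage(conn, "contact")
unused = [column for column, (populated, _) in usage.items() if populated == 0]
print("Columns with no populated rows:", unused)
```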

Leverage the application's reporting functionality 

With this approach, the legal team uses functionality built into most software applications to extract the potentially relevant data in an organized and meaningful format. Many of today's databases have fairly robust reporting functionality, and using it is often the most intuitive and painless way to get the data. The legal team can often leverage internal resources that are familiar with the data and the software. Since this approach is much simpler than the others and leverages the application itself, there will be limited need for a technical expert.

This method focuses more on the reports that are generated from the system than the data itself. After all, most legal professionals take the position that reports from the system are what drive an organization's activities. When was the last time you heard of a CEO querying a database directly to glean information, rather than relying on the trends indicated by the reports coming out of it?

Structured data is an emerging issue, and this article only scratches the surface of the complexity associated with the topic. Unfortunately, in the ever-emerging landscape of electronic discovery, attorneys and their legal teams cannot ignore databases and other structured systems when conducting discovery. It is critical that counsel know when structured data may be relevant and how to leverage the appropriate resources to get it out.

Doug Herman is a Managing Director with UHY Advisors FLVS, Inc. based in Chicago. He leads the eDiscovery & Digital Forensics Practice Group. He has delivered and administered hundreds of litigations involving the collection, preservation, culling, review and production of electronic documents. He has also provided expert analysis and preparation in a large number of litigations as a testifying and non-testifying expert as well as a court-appointed neutral expert for resolution of discovery disputes.

Michael Gooch is Manager of eDiscovery Operations at UHY Advisors. In this capacity, he serves as an advisor on technology and best practices in the areas of processing and hosting frameworks. He often streamlines, tests and implements existing and new software packages for eDiscovery, develops and documents technical workflows, and oversees training of the data analyst and project management teams.
 

Copyright © 2023 Legal IT Professionals. All Rights Reserved.
