Understanding the Value of Database Discovery beyond Unstructured Data

17 Jun 2013; Douglas Herman & Michael Gooch

When tackling electronic discovery, many attorneys and their teams know they need to focus on collecting, reviewing and producing potentially responsive electronic data. However, the scope of discovery is not limited to emails, user files and other similar materials. Organizations actually create, maintain and store two types of data: unstructured and structured. Understanding how structured data is created, what it is, where it exists and how to collect it is critical in ensuring a defensible and thorough discovery review.

Structured and unstructured data exist in two extremely different formats. According to the Sedona Conference's “Database Principles: Addressing the Preservation & Production of Databases & Database Information in Civil Litigation”, “Information stored in databases differs fundamentally from discrete unstructured data, because unstructured data files tend to be static and self-contained.”

Structured data, on the other hand, contains three common characteristics, according to the Sedona Conference: it consists of many pieces of discrete information; it is subdivided into fields or records; and it is saved in a common format such as a database.

The average organization may have information in many structured data formats. Examples of structured databases include:

Client Relationship Management (CRM) systems, which sales teams use to track their activity, sales leads and pipeline;
Enterprise Resource Planning (ERP) systems, which many organizations use to track their general ledger, accounts payable and sometimes inventory; and
Human Resource Information Systems (HRIS), which organize critical information for each individual employee.

In fact, email systems such as Microsoft Exchange are actually databases, and each individual email is a record in the structured system. If you have ever had to pull data out of Microsoft SharePoint, which is a database on the back end, then you have dealt with a structured system.

Structured data often exists in either a flat-file format (such as an Excel spreadsheet) or in relational databases (such as an SQL database). Each table in a relational database is actually a flat file, but when the tables are connected to each other via a common element and that “relationship” is maintained through the software, the files become relational. Consider a CRM database that includes contact information for an individual in one table and information about the company that individual works for in another table. When the data is presented to the user through the application, it looks like these elements are all part of one record, but behind the scenes, this data is actually divided across many tables.
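The CRM example can be made concrete with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database engine a real CRM runs on; the table names, column names and sample records are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical two-table CRM held in memory; all names and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (
        company_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        city       TEXT
    );
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        email      TEXT,
        company_id INTEGER REFERENCES companies(company_id)
    );
    INSERT INTO companies VALUES (1, 'Acme Corp', 'Chicago');
    INSERT INTO contacts VALUES (10, 'Jane Doe', 'jane@acme.example', 1);
    INSERT INTO contacts VALUES (11, 'John Roe', 'john@acme.example', 1);
""")


def contact_records(connection):
    """Join the two tables on the shared company_id column.

    This mimics what an application does when it shows a contact
    and the contact's company as a single record.
    """
    query = """
        SELECT c.full_name, c.email, co.name, co.city
        FROM contacts AS c
        JOIN companies AS co ON co.company_id = c.company_id
        ORDER BY c.contact_id
    """
    return connection.execute(query).fetchall()


records = contact_records(conn)
for row in records:
    print(row)
```

Each printed row looks like one record, yet the company name and city are stored only once, in a separate table. A collection that copies only the contacts table would miss them.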

Potentially responsive data can reside in a number of different structured systems. For example, information relevant to an alleged price-fixing case may exist in a combination of the sales CRM and financial ERP databases. Data in the HRIS system may be useful in combating age-discrimination claims.

In order to find the information, though, the legal team needs to know where to look and how to get the data.

Managing Structured Data During Discovery

Considering the breadth and depth of the information contained in structured databases, how can the legal team conduct e-discovery in a way that is cost-effective, timely and defensible? There are several different approaches, each with its own set of advantages and drawbacks.

Back up and produce the entire database

While backing up and producing the entire database represents a thorough approach, most legal teams rightfully avoid it at all costs. Even with a small database, the legal team will be handing over tremendous amounts of information, some of which may be privileged or confidential and much of which will be irrelevant.

Opposing counsel will only be able to review the content of the backup if they can restore it, which requires a server and software framework to restore it to and a copy of the software that was used to get data into the system. Reviewing the data requires the same software application and server configuration from which the backup was created. Even if the opposing side has the same hardware and software, organizations often customize their database systems over time, so licensing and other compatibility considerations will most likely be an issue.

The brute force method

This approach requires some technical finesse and assumes that critical documentation exists about the database structure and the fielded data within it.

A database schema provides a map of how a relational database is structured, identifying what tables exist and how those tables interact with each other; it is often depicted graphically. A data dictionary, also known as a metadata repository, is the critical document that provides a nontechnical definition for each column, helping the legal team to understand what information actually exists. Stated another way, a data dictionary contains information about data: how it relates to other data, when it was created, who created it and where it resides.
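When written documentation is missing, the database itself can usually report part of its own structure. The sketch below assumes a SQLite database and uses SQLite's `sqlite_master` catalog and its `table_info` and `foreign_key_list` pragmas; other engines expose similar information through different catalogs. The tables and the `DATA_DICTIONARY` entries are hypothetical, and a real data dictionary has to come from people who know the system.

```python
import sqlite3

# Hypothetical database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (
        company_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        company_id INTEGER REFERENCES companies(company_id)
    );
""")

# Hypothetical data dictionary: plain-language meaning of each column.
DATA_DICTIONARY = {
    ("companies", "company_id"): "Internal identifier for a company",
    ("companies", "name"): "Legal name of the company",
    ("contacts", "contact_id"): "Internal identifier for a person",
    ("contacts", "full_name"): "Name of the person as entered by sales staff",
    ("contacts", "company_id"): "Company the person works for",
}


def describe_schema(connection):
    """Return each table's columns and its links to other tables."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [(row[1], row[2]) for row in
                   connection.execute(f'PRAGMA table_info("{table}")')]
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        links = [(row[3], row[2], row[4]) for row in
                 connection.execute(f'PRAGMA foreign_key_list("{table}")')]
        schema[table] = {"columns": columns, "links": links}
    return schema


schema = describe_schema(conn)
for table, info in schema.items():
    for column, column_type in info["columns"]:
        meaning = DATA_DICTIONARY.get((table, column), "(undocumented)")
        print(f"{table}.{column} [{column_type}]: {meaning}")
    for column, other_table, other_column in info["links"]:
        print(f"  {table}.{column} -> {other_table}.{other_column}")
```

The structural half of this output can be generated automatically. The plain-language half cannot, which is why the dictionary matters so much.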

Through this approach, the legal team works with a data dictionary and schema to identify and mine the data that counsel is seeking and then extract the information via the back end through an SQL or similar query.

While it seems straightforward, this approach is generally more complicated than it first appears. It requires that organizations compile and regularly update both a “data dictionary” and “database schema,” so that the data can be thoroughly identified and tracked. Few organizations have both the dictionary and the schema.

It is also very time-consuming, since both sides may need to meet and confer in an effort to negotiate what data should be extracted, the parameters of that data (e.g., a date range), production methods and other considerations.
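Once the parameters are agreed, a back-end extraction can be quite compact. The following is a minimal sketch rather than a production workflow: it assumes a SQLite database with a hypothetical `sales_orders` table, dates stored as ISO-formatted text, and CSV as the agreed production format.

```python
import csv
import io
import sqlite3

# Hypothetical sales table; names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT,
        order_date TEXT,   -- ISO dates (YYYY-MM-DD) sort correctly as text
        unit_price REAL
    );
    INSERT INTO sales_orders VALUES (1, 'Acme Corp', '2011-12-30', 9.50);
    INSERT INTO sales_orders VALUES (2, 'Acme Corp', '2012-03-15', 10.00);
    INSERT INTO sales_orders VALUES (3, 'Globex',    '2012-11-02', 10.00);
    INSERT INTO sales_orders VALUES (4, 'Globex',    '2013-01-20', 10.25);
""")


def extract_orders(connection, start_date, end_date):
    """Return the column names and the rows inside the agreed date range."""
    cursor = connection.execute(
        "SELECT order_id, customer, order_date, unit_price "
        "FROM sales_orders "
        "WHERE order_date BETWEEN ? AND ? "
        "ORDER BY order_date",
        (start_date, end_date),  # bound parameters, not string concatenation
    )
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, rows):
    """Render the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = extract_orders(conn, "2012-01-01", "2012-12-31")
production = to_csv(header, rows)
print(production)
```

Keeping the date range as explicit parameters makes it easy to record exactly which criteria were applied and to rerun the extraction if the scope is renegotiated.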

Legal teams often find that many of the fields in the database, while they exist in the schema and data dictionary, are associated with software functionality that is not used by the organization. That adds to the complexity of the approach. In-house resources can be leveraged for this approach, but that leaves those employees open to testifying as subject matter experts.
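One way to spot fields that exist in the schema but are never used is to count how many rows actually hold a value in each column. The sketch below assumes SQLite and a hypothetical `contacts` table; it treats NULL and blank text as "empty," which is a simplification, because some applications store other placeholder values.

```python
import sqlite3

# Hypothetical table in which two fields were never put to use.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (
        contact_id   INTEGER PRIMARY KEY,
        full_name    TEXT,
        fax          TEXT,
        loyalty_tier TEXT
    );
    INSERT INTO contacts VALUES (1, 'Jane Doe', NULL, NULL);
    INSERT INTO contacts VALUES (2, 'John Roe', '',   NULL);
""")


def unused_columns(connection, table):
    """List columns of `table` that hold no non-blank value in any row.

    The table name is placed directly into the SQL text, so it must come
    from a trusted source such as the schema itself, never from user input.
    """
    columns = [row[1] for row in
               connection.execute(f'PRAGMA table_info("{table}")')]
    unused = []
    for column in columns:
        query = f"""
            SELECT COUNT(*) FROM "{table}"
            WHERE "{column}" IS NOT NULL AND TRIM("{column}") <> ''
        """
        populated = connection.execute(query).fetchone()[0]
        if populated == 0:
            unused.append(column)
    return unused


empty_fields = unused_columns(conn, "contacts")
print(empty_fields)
```

A list like this can help narrow the meet-and-confer discussion to fields that actually contain data.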

Leverage the application's reporting functionality

With this approach, the legal team uses functionality built into most software applications to extract the potentially relevant data in an organized and meaningful format. Many of today's databases have fairly robust reporting functionality, and using it is often the most intuitive and painless way to get the data. The legal team can often leverage internal resources that are familiar with the data and the software. Since this approach is much simpler than the others and leverages the application itself, there will be limited need for a technical expert.

This method focuses more on the reports that are generated from the system than the data itself. After all, most legal professionals take the position that reports from the system are what drive an organization's activities. When was the last time you heard of a CEO querying a database directly to glean information, rather than relying on the trends indicated by the reports coming out of it?

Structured data is an emerging issue, and this article only scratches the surface of the complexity associated with the topic. Unfortunately, in the ever-emerging landscape of electronic discovery, attorneys and their legal teams cannot ignore databases and other structured systems when conducting discovery. It is critical that counsel know when structured data may be relevant and how to leverage the appropriate resources to get it out.

Doug Herman is a Managing Director with UHY Advisors FLVS, Inc. based in Chicago. He leads the eDiscovery & Digital Forensics Practice Group. He has delivered and administered hundreds of litigations involving the collection, preservation, culling, review and production of electronic documents. He has also provided expert analysis and preparation in a large number of litigations as a testifying and non-testifying expert as well as a court-appointed neutral expert for resolution of discovery disputes. Michael Gooch is Manager of eDiscovery Operations at UHY Advisors. In this capacity, he serves as advisor on topics of technology and best practices in the areas of processing and hosting frameworks. He often streamlines, tests and implements existing and new software packages for eDiscovery, develops and documents technical workflows, and oversees training of the data analyst and project management teams.
