Part Two of ESI Basics: Processing

July 25th, 2018

This is Part Two of a continuing series on ESI basics. In this series, we cover some of the terms used most often on the tech side of e-discovery. In Part One, we provided an overview of PSTs. Whether this is an introduction or a refresher for you, and whether you are an attorney, a member of an in-house team or a data analyst, this information may come in handy in your practice.

What Is Processing and Why Do You Need It?

Let’s say you have a new case and you request proposals from various e-discovery vendors (or your discovery counsel firm) for preparing the documents for review and production. Maybe you get a proposal back with a very large number next to a line item called “ESI processing” or “Metadata Extraction.” What do those terms mean and what are you actually getting for your money?

In general terms, ESI processing and metadata extraction involve collecting electronic data and making it usable. In more specific terms, ESI processing takes all of your native documents and runs them through software that parses the data it reads from each file's header and extracted text into searchable fields. These searchable fields are often referred to as metadata (data about the data). Metadata allows you to search, put documents in chronological order and otherwise analyze your documents in a way that is efficient and consistent across software applications. Note that there are also methods for forensic metadata extraction, but this post is confined to the more common non-forensic analysis.
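To make the idea of parsing headers and text into searchable fields concrete, here is a minimal sketch using Python's standard email library. It is an illustration only, not how any particular processing platform works; the three sample messages, the addresses and the field names are all made up for the example.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample messages (not real data), deliberately out of date order.
RAW_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.com\n"
    "Subject: Q3 forecast\nDate: Tue, 03 Jul 2018 14:05:00 +0000\n\n"
    "Numbers attached.\n",
    "From: bob@example.com\nTo: alice@example.com\n"
    "Subject: Kickoff agenda\nDate: Fri, 29 Jun 2018 09:30:00 +0000\n\n"
    "See you Monday.\n",
    "From: carol@example.com\nTo: alice@example.com\n"
    "Subject: Contract draft\nDate: Mon, 02 Jul 2018 16:45:00 +0000\n\n"
    "Redline to follow.\n",
]


def extract_fields(raw):
    """Parse one raw message into a flat record of searchable fields."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]),  # header text -> datetime
        "text": msg.get_payload().strip(),           # extracted body text
    }


# Once the fields exist, chronological ordering is a simple sort on "sent".
records = sorted((extract_fields(r) for r in RAW_MESSAGES), key=lambda r: r["sent"])

for rec in records:
    print(rec["sent"].date(), rec["from"], rec["subject"])
```

The same record structure is what makes field-level searching possible, for example finding every message from a given sender within a date range.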

ESI Processing – Items to Consider

1. Processing is only as good as the tool that is being used. Some advanced processing tools can handle a very large number of file types and can extract the maximum amount of information available. They can also do it quickly. Other more basic processing tools can handle the most common file types and may be sufficient for your needs. Some popular processing tools are LAW, Nuix, IPRO and Relativity Processing, along with proprietary tools created by database hosting vendors.

2. The software provider will have a list of file types that can be processed and a definition of what the tool considers an "exception." Exception files are usually proprietary file types used with specific software, or files that don't extract (for example, some system files and database files). Ask your vendor for the list of processable file types to make sure that their platform can handle your files, particularly if you use 3D modeling software, accounting software or legacy files that no longer have software support. Also, keep in mind that metadata extraction processes may differ depending on the type of device from which the data was collected.

3. You should be able to request a list of the specific pieces of data that are extracted. There can be hundreds of items depending on the file type, and you won't necessarily need them all for production, but it is good to know what can be provided.

4. You’ll want to ask about the handling of duplicates, attachments and compressed files. Will documents be de-duped on a global or custodial basis at the time of processing? Does the processing platform automatically unzip container files? Does it also handle archives created by other compression applications, like 7-Zip or PKZIP? Can it open EnCase files?

5. Also, you may need to ask how the processing platform manages password-protected or corrupt files. Will it log them for you so that you can review a report and consult with the client regarding a solution? Most platforms should be able to provide you with such a report.
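The difference between global and custodial de-duplication in item 4 can be sketched in a few lines. This is a simplified illustration under stated assumptions: duplicates are identified by a SHA-256 hash of the file content, and the custodians, file names and contents are hypothetical. How a given platform actually computes its duplicate hash (especially for email) is a question for the vendor.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Fingerprint a file by its content (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def dedupe(documents, scope="global"):
    """Keep the first copy of each document.

    scope="global":    one copy across the whole collection.
    scope="custodial": one copy per custodian.
    """
    seen = set()
    kept = []
    for custodian, name, data in documents:
        digest = content_hash(data)
        key = digest if scope == "global" else (custodian, digest)
        if key in seen:
            continue  # duplicate within the chosen scope
        seen.add(key)
        kept.append((custodian, name))
    return kept


# Hypothetical collection: the same spreadsheet appears three times.
DOCS = [
    ("Smith", "budget.xlsx", b"budget v1"),
    ("Smith", "budget copy.xlsx", b"budget v1"),
    ("Jones", "budget.xlsx", b"budget v1"),
    ("Jones", "memo.docx", b"memo text"),
]

global_kept = dedupe(DOCS, scope="global")
custodial_kept = dedupe(DOCS, scope="custodial")

print("Global:   ", len(global_kept), "documents to review")
print("Custodial:", len(custodial_kept), "documents to review")
```

Global de-duplication leaves fewer documents to host and review, while custodial de-duplication preserves the fact that each custodian had a copy, which is why the choice is worth settling before processing starts.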

File Compression and Data Expansion

In most cases, your data will expand after processing, which means the size of the data collected is not the size of the data hosted. Data expansion occurs most often with email because processing pulls out attachments as separate files and extracts the metadata from them. It can also pull out embedded files and unwrap various containers, including the zip files mentioned above. While the expanded size can be estimated, it largely depends on the original systems used and how they were set up. As a rule of thumb, an Outlook .pst file will expand to about 1.75 to 2.00 times its original size. Lotus Notes .nsf files can expand much more than that, sometimes to 3 to 4 times the original size, because of the way the files are compressed. So, when budgeting for a case, it’s important to understand what type of data you have in order to reasonably estimate the potential processing and hosting charges.
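As a quick budgeting aid, the expansion ranges above can be turned into a rough estimate. The sketch below uses only the multipliers quoted in this post; the collected sizes are hypothetical, and actual expansion will vary with how the source systems were set up.

```python
# Expansion multipliers (low, high) taken from the ranges quoted above.
EXPANSION_RANGES = {
    "pst": (1.75, 2.00),  # Outlook
    "nsf": (3.00, 4.00),  # Lotus Notes
}


def estimate_hosted_gb(collected_gb, source_type):
    """Return a (low, high) estimate of post-processing size in GB."""
    low, high = EXPANSION_RANGES[source_type]
    return collected_gb * low, collected_gb * high


# Hypothetical collection sizes for illustration.
pst_low, pst_high = estimate_hosted_gb(40, "pst")
nsf_low, nsf_high = estimate_hosted_gb(10, "nsf")

print(f"40 GB of .pst -> roughly {pst_low:.0f} to {pst_high:.0f} GB hosted")
print(f"10 GB of .nsf -> roughly {nsf_low:.0f} to {nsf_high:.0f} GB hosted")
```

Multiplying the resulting range by a vendor's per-gigabyte processing and hosting rates gives a first-pass cost range to compare proposals against.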

File Type Filtering

During the early stages of processing (the “pre-processing” stage), you may have the option of narrowing down your document collection before full processing begins. Simple filtering can include de-NISTing, which compares each file's hash value against the National Software Reference Library (NSRL), a list of known software files maintained by the National Institute of Standards and Technology (NIST), and filters out the matches. The chief purpose of de-NISTing is to remove files that are unlikely to be usable or responsive, such as system files. Some processing providers also use their own common list of non-document file types that can be removed prior to processing and review. Examples include .db, .tmp and .bak files.
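The two pre-processing filters can be sketched as follows. This is an illustration, assuming de-NISTing is done by comparing each file's hash against a reference set of known files and that the provider's own exclusion list is keyed on file extension. The reference set below is a one-item hypothetical stand-in, not real NIST data, and the use of SHA-1 is an assumption for the example.

```python
import hashlib
import os

# Hypothetical stand-in for a reference set of known system/application files.
KNOWN_FILE_HASHES = {
    hashlib.sha1(b"standard operating system library").hexdigest(),
}

# Provider-style list of non-document file types (examples from this post).
EXCLUDED_EXTENSIONS = {".db", ".tmp", ".bak"}


def prefilter(files):
    """Split (name, content) pairs into kept names and (name, reason) removals."""
    kept, removed = [], []
    for name, data in files:
        digest = hashlib.sha1(data).hexdigest()
        extension = os.path.splitext(name)[1].lower()
        if digest in KNOWN_FILE_HASHES:
            removed.append((name, "de-NIST"))    # matched a known file by hash
        elif extension in EXCLUDED_EXTENSIONS:
            removed.append((name, "file type"))  # matched the exclusion list
        else:
            kept.append(name)
    return kept, removed


# Hypothetical collection for illustration.
FILES = [
    ("kernel32.dll", b"standard operating system library"),
    ("thumbs.db", b"thumbnail cache"),
    ("report.docx", b"quarterly report text"),
    ("report.bak", b"quarterly report text"),
]

kept, removed = prefilter(FILES)
print("Kept:   ", kept)
print("Removed:", removed)
```

Because the hash check looks at content rather than the file name, a renamed system file is still caught, while a user document that merely has an unusual name is not removed by mistake.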

Post-Processing

You should request a summary report after the documents are processed, which will show you the various file types that were in the processed data set. This report is very handy if you have to go back to the client to ask about specific file types that couldn’t be processed. It will also reveal the number of emails, Word documents, spreadsheets, Adobe PDFs, etc., so that you can get an idea of how long and detailed the review may be.
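A summary report of this kind boils down to a count per file type plus a list of exceptions. The sketch below shows the idea with hypothetical file names and a made-up "ok"/"exception" status flag; a real platform's report will have its own layout and far more detail.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical processing results: (file name, status).
PROCESSED = [
    ("contract.docx", "ok"),
    ("Forecast.XLSX", "ok"),
    ("notes.docx", "ok"),
    ("plant_model.dwg", "exception"),
    ("scan.pdf", "ok"),
]


def summarize(results):
    """Count files per extension and collect the names of exception files."""
    by_type = Counter()
    exceptions = []
    for name, status in results:
        extension = PurePath(name).suffix.lower() or "(no extension)"
        by_type[extension] += 1
        if status != "ok":
            exceptions.append(name)
    return by_type, exceptions


by_type, exceptions = summarize(PROCESSED)

for extension, count in sorted(by_type.items()):
    print(f"{extension}\t{count}")
print("Exceptions to raise with the client:", exceptions)
```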

After processing is complete, you will be able to run keyword searches to cull the data set needing review. Keywords can be set to run across metadata as well as document text (which we have found to be extremely helpful). You would provide a list of relevant search terms and then receive a report displaying the number of documents identified, by term. You can then tweak the terms to get to the document set you intend to review.
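A per-term hit report can be sketched as below. This is a simplified illustration, assuming case-insensitive whole-word matching across every field of each document (metadata and text alike); the documents and search terms are hypothetical, and review platforms offer much richer search syntax.

```python
import re

# Hypothetical processed documents: metadata fields plus extracted text.
DOCUMENTS = [
    {"subject": "Merger timeline", "author": "A. Smith", "text": "Draft schedule attached."},
    {"subject": "Lunch", "author": "B. Jones", "text": "Discussing the merger over lunch."},
    {"subject": "Invoice 1042", "author": "C. Lee", "text": "Payment due on receipt."},
]


def hit_report(documents, terms):
    """Return {term: number of documents with a hit in any field}."""
    report = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        report[term] = sum(
            1
            for doc in documents
            if any(pattern.search(value) for value in doc.values())
        )
    return report


report = hit_report(DOCUMENTS, ["merger", "invoice", "smith"])

for term, count in report.items():
    print(f"{term}: {count} document(s)")
```

Here "smith" only hits because the author field is searched, which is the kind of result you would miss if keywords ran against document text alone.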

In Conclusion

It’s helpful to discuss processing options with your service provider up front so that you are aware of any impact on timelines. In addition, you’ll want to have realistic expectations when you discuss discovery deadlines and ESI specifications with opposing counsel. Asking the right questions before and during processing will help you avoid document-related issues down the road, particularly during the review and production phases.

DISCLAIMER: The information contained in this blog is not intended as legal advice or as an opinion on specific facts. For more information about these issues, please contact the author(s) of this blog or your existing LitSmart contact. The invitation to contact the author is not to be construed as a solicitation for legal work. Any new attorney/client relationship will be confirmed in writing.
