Technology Advantage

Pitfalls of Complex Search Protocols in ESI Agreements


ESI agreements cover the full gamut of e-discovery issues, from preservation expectations to production specifications. Sometimes, these agreements include an overview of the process by which the parties will identify the universe of potentially responsive documents by using specific date ranges, identifying priority custodians and developing proposed search terms. Adding another layer of complexity, this discussion often takes place in a vacuum, before the parties even know how much data their clients have. Coming up with criteria that identify the relevant documents, but are not overly broad, can be difficult in this vacuum. However, the process is important because the date range, custodians and terms will inevitably dictate how much data is collected, processed and reviewed, which can significantly affect any litigation budget.

Discussions with Opposing Counsel Regarding How to Narrow Data Sets

As opposing litigation teams begin to discuss how to narrow the data sets they will be dealing with in the case, each side has its own ideas about how to identify relevant data. As a starting point, date limitations are usually fairly straightforward, and the parties can usually agree without much dispute on a date range that will return information related to the specific case. (Note that disagreement over dates typically comes later, when the parties decide on the “anticipation of litigation” date, after which documents likely would not need to be logged on a privilege log.) Discussions about how many and which custodians’ data will be collected can be tricky, but again the parties will ultimately settle on a finite number. (Note that this number may represent individual names or roles; particularly in construction cases, where multiple people may have held the same role over the course of a project, the parties may agree to a set number of custodian roles rather than names.) So, while the date ranges and custodians may require a bit of negotiation, coming to an agreement on keywords can be much more complicated.

Complex Search Strings and Boolean Operators

Many ESI agreements place limitations on the number of keywords and/or custodians with the very reasonable goal of controlling the scope of discovery. While this process generally accomplishes that goal, there can be unexpected drawbacks if these limitations are too restrictive. One of the most common strategies for managing a restriction on the number of key terms a party can use is to add Boolean operators in an effort to expand the impact of an individual term. While AND or proximity operators such as w/# (within a certain number of words) help limit the number of results, the OR operator typically expands the number of results. Depending on the agreement with opposing counsel, the OR operator can be used in an attempt to ‘sneak’ in additional terms. As an example of how complicated the development of search strings can become when parties are faced with term restrictions, I have run into a number of situations where an individual search ‘term’ exceeded Relativity’s very generous 450-character limit for an individual term. In cases such as these, it may have made more sense to allow a higher number of terms but restrict them to AND or w/# operators, so that each additional word in a term string serves to limit rather than expand the number of results.
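To make the effect of each operator concrete, here is a minimal sketch on a made-up four-document corpus. The documents, the helper names (hits, within) and the simple word-position reading of w/# are all assumptions for illustration only; a real review platform tokenizes text and counts proximity by its own rules.

```python
# Hypothetical toy corpus: document id -> text.
DOCS = {
    1: "concrete pour delay reported by subcontractor",
    2: "schedule delay notice sent",
    3: "lunch menu for friday",
    4: "concrete invoice attached",
}


def hits(word):
    """Return the ids of documents that contain the exact word."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}


def within(a, b, n):
    """Return ids of documents where a and b occur at most n words apart.

    This is a simplified stand-in for a w/n proximity operator.
    """
    found = set()
    for doc_id, text in DOCS.items():
        words = text.split()
        pos_a = [i for i, w in enumerate(words) if w == a]
        pos_b = [i for i, w in enumerate(words) if w == b]
        if any(abs(i - j) <= n for i in pos_a for j in pos_b):
            found.add(doc_id)
    return found


or_hits = hits("concrete") | hits("delay")   # OR is a union: results grow
and_hits = hits("concrete") & hits("delay")  # AND is an intersection: results shrink
near_hits = within("concrete", "delay", 2)   # proximity: narrower still as n drops

print(sorted(or_hits))    # [1, 2, 4]
print(sorted(and_hits))   # [1]
print(sorted(near_hits))  # [1]
```

Every word added with OR can only add documents, while every word added with AND or a proximity operator can only remove them, which is why a single OR-heavy ‘term’ can quietly do the work of many.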

In addition to the sheer length of some search terms, there is also the potential for increasingly complex terms that employ combinations of Boolean operators. It’s not uncommon to see a term like the following one, which includes AND, OR, proximity operators and wildcard characters:

((Term1* OR Term2* OR Term3* OR Term4* OR Term5* OR Term6 OR Term7 OR Term8* OR Term9) AND ((Term10* OR Term11* OR Term12* OR Term13*) w/10 (Term14 OR Term15 OR Term16 OR Term17 OR Term18* OR Term19* OR Term20* OR Term21* OR Term22*)))

Such complex search strings also create room for error. For example, different platforms use different operators and syntax in search terms, and differ in how complex a search they can handle. Litigation support teams are tasked with adapting search terms constructed by opposing parties to the syntax used by their particular review platforms. In addition, a single extra or missing parenthesis could drastically and unintentionally change the results. I’m not in any way advocating that terms should be very simple, because that would likely lead to overbroad results and unnecessary review; however, if the parties agree to complex search strings, they should be sure to double- and triple-check the strings and also test the results to make sure the terms are working as intended.
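One inexpensive check before running a negotiated string is to confirm that its parentheses pair up. The sketch below is a generic illustration, not a feature of any review platform; the function name unmatched_parens and the sample strings are hypothetical.

```python
def unmatched_parens(query):
    """Return (unmatched_open_positions, unmatched_close_positions).

    Positions are zero-based character offsets into the query string.
    Both lists are empty when every parenthesis has a partner.
    """
    open_stack = []    # offsets of "(" still waiting for a ")"
    stray_closes = []  # offsets of ")" that had no "(" to close
    for i, ch in enumerate(query):
        if ch == "(":
            open_stack.append(i)
        elif ch == ")":
            if open_stack:
                open_stack.pop()
            else:
                stray_closes.append(i)
    return open_stack, stray_closes


# Hypothetical strings: the first is missing one closing parenthesis.
print(unmatched_parens("((a OR b) AND (c w/10 d)"))  # ([0], [])
print(unmatched_parens("(a OR b) AND (c w/10 d)"))   # ([], [])
```

A check like this only catches parentheses that are missing or extra. A parenthesis that is misplaced but still balanced changes the grouping without tripping it, so testing the results against known documents remains necessary.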

Different Search Terms for Each Custodian

In an effort to more effectively target potentially relevant data, sometimes parties will agree to limit the number of custodians and limit the number of terms but then also agree that there can be different terms (and sometimes even different date ranges) for each custodian. Imagine the parties agree to twelve custodians and each custodian is allotted twelve separate and distinct terms, for a total of 144 potential terms. In a further effort to reduce the number of documents needing review, the parties agree that the number of search results for each individual term will not exceed a certain number of documents (including families, when email is involved). After each set of terms is run, the parties exchange the results. The receiving party then suggests modifications to the terms that exceed the specified threshold (such as adding Boolean operators, like AND, or proximity operators, like w/#) and sends the revised terms back to be run again. While this process results in smaller, targeted review sets that are likely to contain the most relevant information, the back-and-forth process of reviewing search term reports and negotiating revisions to search strings adds to the complexity.
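The bookkeeping behind that exchange can be sketched in a few lines. Everything here is hypothetical: the custodian labels, the hit counts, the 1,000-document threshold and the function name terms_over_threshold are invented for illustration, and a real search term report would come out of the review platform.

```python
CUSTODIANS = 12
TERMS_PER_CUSTODIAN = 12
total_terms = CUSTODIANS * TERMS_PER_CUSTODIAN  # 12 x 12 = 144 strings to track


def terms_over_threshold(report, threshold):
    """Return (custodian, term, hits) rows above the threshold, largest first.

    report maps (custodian, term) -> hit count including families.
    """
    over = [(c, t, n) for (c, t), n in report.items() if n > threshold]
    return sorted(over, key=lambda row: row[2], reverse=True)


# Hypothetical slice of a search term report.
report = {
    ("Custodian A", "term 1"): 1200,
    ("Custodian A", "term 2"): 310,
    ("Custodian B", "term 1"): 5400,
}

flagged = terms_over_threshold(report, threshold=1000)
for custodian, term, count in flagged:
    print(f"{custodian} / {term}: {count} hits, needs narrowing")
```

Each flagged row represents a string that goes back to the other side for revision and must then be rerun, which is where the added rounds of negotiation come from.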

Sometimes It’s Best to Keep It Simple

On the plus side, more complex searching protocols usually achieve their intended results and limit the universe of documents needing review, creating a predictable cost for review. On the downside, however, the process can be lengthy and create a risk of error. With every iteration of exchanging terms comes added complexity and the risk of misplaced parentheses, inaccurate or unintended use of wildcards, or incorrect syntax.

In the early stages of a case, simplicity is often the best course of action. Protocols that are too restrictive can lead to using methodologies designed to work around the restrictions. By contrast, protocols that are too broad can lead to the review of documents that have nothing to do with the meat of the case. If possible (and understanding that it may not be), a middle ground would be ideal: an agreement in which the parties settle on a date range, a set number of terms and a set number of custodians, and then exchange just one or, at the most, two iterations of terms. In most cases, the KISS theory is the best approach. Let’s Keep It Simple, as long as possible.


DISCLAIMER: The information contained in this blog is not intended as legal advice or as an opinion on specific facts. For more information about these issues, please contact the author(s) of this blog or your existing LitSmart contact. The invitation to contact the author is not to be construed as a solicitation for legal work. Any new attorney/client relationship will be confirmed in writing.

