Searching is a core e-discovery skill that has been part of the legal case landscape for about two decades now. Throughout that time, the fundamental capabilities of keyword searching have not changed much, and keyword searching persists as a useful technique in spite of newer legal technology. Logical connectors (OR, AND, NOT, etc.) and proximity operators (w/X, meaning within X words) are simple keyword search tools. Yes, they are crude compared to the technologies du jour, but they work well! Thank 19th-century mathematicians, 20th-century computer scientists, and many others!
Keyword searching is a handy tool when used correctly.
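To make the connectors and proximity operators mentioned above concrete, here is a minimal Python sketch of how a query such as `(merger w/3 agreement) AND NOT privileged` could be evaluated. The sample documents, the function names, and the choice to measure distance as a difference in word positions are all assumptions made for illustration. Real review platforms define tokenization and proximity in their own ways, so treat this as a toy model rather than any product's behavior.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def contains(tokens, term):
    """True if the term appears anywhere in the token list."""
    return term.lower() in tokens

def within(tokens, term_a, term_b, distance):
    """True if term_a and term_b occur no more than `distance` positions apart.

    This is one possible reading of `term_a w/distance term_b`.
    """
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return any(abs(a - b) <= distance for a in pos_a for b in pos_b)

# Hypothetical sample documents keyed by document ID.
documents = {
    "DOC-001": "The merger agreement was signed before the audit began.",
    "DOC-002": "Please review the agreement. Unrelated: lunch is at noon, and the merger talk is off.",
    "DOC-003": "Draft merger memo, privileged and confidential.",
}

def search(docs):
    """Evaluate (merger w/3 agreement) AND NOT privileged over every document."""
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if within(tokens, "merger", "agreement", 3) and not contains(tokens, "privileged"):
            hits.append(doc_id)
    return hits

print(search(documents))
```

Only the first document qualifies here: the second contains both terms but too far apart, and the third lacks "agreement" and is excluded by the NOT clause anyway.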
The Cheshire Cat: "Then it doesn't matter which way you walk."
The process of creating a document search is a mini-project.
The Project Management Institute defines a project as "a temporary endeavor undertaken to create a unique product, service, or result." A case team's request to create a search, then, supplies the mini-project's requirements. Invest as much time as needed to understand the motivation behind a search request. When I receive precise instructions to create some search, I like to probe a bit. What are you trying to achieve? Why the specificity? I find that asking such questions sometimes brings to mind similar past requests. You can leverage previous experiences for the lessons they offer. I also like to ask a requester if they have an idea of the rough number of documents they expect to see from their search. Knowledge of this number helps frame a conversation if the search result is far off.
Here are a few search characteristics useful to limit document search results:
- Custodians: Can you remove any unnecessary custodians?
- Date Filter: Can you limit the search to a specific date range(s)?
- Families: Searches typically include document families by default. What if a document collection contains a high average number of email attachments? The case team may not need to see all those attachments. Ask them if you are unsure.
- Email Threading: Use this powerful technique to pinpoint relevant email threads. Explain to the case team that reviewing redundant email chains, rather than using email threading to cut them, is a waste of resources.
- File Types: Is the case team interested in only specific file types? Create a file extension report to see a cross-section of file types in a file collection.
- Analytics: Language identification and near-duplicates analysis are two helpful culling techniques. The former tool can help you locate only documents that contain specific languages. The latter can identify very similar near-duplicate non-email documents based on a percentage of similarity of your choosing.
- Use Transparent Features: Minimize the use of opaque database features in searches. By 'opaque features,' I mean criteria whose origin you cannot trace. Imagine that you create a complicated search A. You mark search A's results with a tag named "simple tag." You then delete search A. Now, you create search B, one of the criteria of which is the tag "simple tag." Search B runs fine, but you have introduced a problem. The meaning of the "simple tag" criterion is no longer available, because the search that defined it is gone.
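Several of the limits listed above (custodians, date range, file types) amount to simple metadata filters, and the file extension report is just a count per extension. The sketch below shows one way to picture that in Python. The `Doc` record, its field names, and the sample collection are hypothetical; an actual platform would expose these as fields and saved-search conditions rather than code.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath

@dataclass
class Doc:
    """Hypothetical document record with a few metadata fields."""
    doc_id: str
    custodian: str
    sent: date
    filename: str

def extension_of(doc):
    """Return the lowercase file extension, or '(none)' if there is none."""
    return PurePath(doc.filename).suffix.lower() or "(none)"

def extension_report(docs):
    """Count documents per file extension (a cross-section of the collection)."""
    return Counter(extension_of(d) for d in docs)

def cull(docs, custodians, start, end, extensions):
    """Keep only documents that pass every filter; the date range is inclusive."""
    return [
        d for d in docs
        if d.custodian in custodians
        and start <= d.sent <= end
        and extension_of(d) in extensions
    ]

collection = [
    Doc("DOC-001", "alice", date(2020, 1, 15), "budget.xlsx"),
    Doc("DOC-002", "bob", date(2020, 2, 1), "notes.DOCX"),
    Doc("DOC-003", "alice", date(2019, 6, 30), "memo.docx"),
    Doc("DOC-004", "carol", date(2020, 3, 9), "photo.jpg"),
]

report = extension_report(collection)
kept = cull(
    collection,
    custodians={"alice", "bob"},
    start=date(2020, 1, 1),
    end=date(2020, 12, 31),
    extensions={".docx", ".xlsx"},
)

print(dict(report))
print([d.doc_id for d in kept])
```

Every filter here is visible in the call to `cull`, which is the transparency the last bullet argues for: anyone reading the search can see exactly why a document was kept or dropped.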
The point is to be as thorough as possible in this search creation process. Strive to craft searches that return the smallest set of the most relevant documents.
Crafting searches that return the desired documents is half the battle. Here are a few other considerations:
- Performance: The faster a search runs, the better. As a rule of thumb, the more criteria you add to a search, the slower it will run. If your search will not execute at all, something is likely wrong with the search setup: it may contain too many criteria or incorrect syntax.
- Naming Convention: Use simple, consistent search names. Clear search naming conventions help future users understand their purpose. I find it helpful to incorporate a date, initials, and descriptive text to document the query's meaning. It is always helpful to include an explanatory note to the search if you need to provide more background.
- Organization: If a database becomes cluttered with searches, try saving your searches in an organized folder structure.
- Do Unto Others: As you work on searches, try to consider how others will view and interpret your work product. Is the intent of your search criteria obvious or confusing? This is a reason why including explanatory notes can be so helpful.
- KISS Principle: Use the 'Keep It Simple, Stupid' rule of thumb. The intuitive gist of the principle is that simple things work well. Complicated searches that contain mind-twisting logic are not helpful; shoot for elegant simplicity. If there is too much going on in one search, then consider dividing that search into two or more. Computer programmers call this practice "divide and conquer." Solve small problems, then combine the results to solve the overall problem.
- Leverage Resources: Law firms and service providers hire talented technical personnel. Engage with them! If you are working on a search and things are not clicking, reach out for help. It is quite likely that they have dealt with issues like yours before.
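Two of the points above, the naming convention and "divide and conquer," lend themselves to a small illustration. The Python sketch below builds a search name from a date, initials, and descriptive text, then combines the results of three small searches with set operations. The name format, the helper `search_name`, and the document IDs are assumptions chosen for the example, not a standard.

```python
from datetime import date

def search_name(created, initials, description):
    """Build a name from date, initials, and descriptive text.

    The 'YYYY-MM-DD_INITIALS_description' layout is a hypothetical convention.
    """
    slug = "-".join(description.lower().split())
    return f"{created.isoformat()}_{initials.upper()}_{slug}"

# Results of three small, easy-to-read searches, as sets of document IDs.
keyword_hits = {"DOC-001", "DOC-002", "DOC-005"}
date_range_hits = {"DOC-001", "DOC-002", "DOC-003"}
privileged = {"DOC-002"}

# Combine the pieces: (keywords AND date range) NOT privileged.
final = (keyword_hits & date_range_hits) - privileged

name = search_name(date(2024, 3, 5), "jd", "Merger terms")
print(name, sorted(final))
```

Each small search can be checked on its own before it is combined, which tends to be easier than debugging one large search with nested logic.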
In conclusion, sound search creation is a core e-discovery technical (and artistic!) skill. The need for accurate and timely search results has been here since the industry's infancy, and it will be here for the foreseeable future. Become familiar with your platform’s search capabilities. Leverage the available tools to help you save resources. Recognize that the effort you pour into keeping a database organized pays real dividends. Attorneys wrestle with the finer points of their wordplay to achieve precise results. Strive to manage your database tools in the same fashion. Who knows, you may find the smoking gun your attorneys need.
DISCLAIMER: The information contained in this blog is not intended as legal advice or as an opinion on specific facts. For more information about these issues, please contact the author(s) of this blog or your existing LitSmart contact. The invitation to contact the author is not to be construed as a solicitation for legal work. Any new attorney/client relationship will be confirmed in writing.
George Boole, The Mathematical Analysis of Logic, Being an Essay Towards a Calculus of Deductive Reasoning (Cambridge: Macmillan, Barclay, & Macmillan, 1847; repr. Oxford: Basil Blackwell, 1951).
Herbert Rubenstein and John Goodenough, "Contextual Correlates of Synonymy," Commun. ACM 8 (1965): 627-633, doi:10.1145/365628.365657.
Lewis Carroll, Alice’s Adventures in Wonderland, 1866, 89.
PMI, PMBOK Guide, 5th ed., 2013, 2.