What is Data Mapping?
Data mapping is the creation of an inventory of an organization’s data. A data map describes:
- The type of data that exists in the organization;
- Where that data is stored;
- Who the responsible party is for that data; and
- When the data is eligible for archival or deletion.
A data map is usually generated along with a data retention plan, and it is a great starting point if the organization later becomes involved in litigation and discovery. Despite its name, it is usually (although not always!) not an actual “map” but a written document with headers, paragraphs and bullets.
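Although a data map is usually a written document, the four items it describes can also be kept as structured records. The following is only a minimal sketch of that idea: the field names, the sample entries and the retention periods are hypothetical, not taken from any particular organization's data map.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataMapEntry:
    """One row of a data map; all field names are illustrative."""
    data_type: str                  # the type of data that exists
    location: str                   # where that data is stored
    responsible_party: str          # who is responsible for that data
    retention_years: Optional[int]  # years until eligible for archival or deletion; None if undecided


# Hypothetical sample entries; the retention periods are made up for illustration.
data_map = [
    DataMapEntry("email", "mail server", "IT department", 7),
    DataMapEntry("customer data", "customer management system", "Sales operations", 10),
]


def describe(entry: DataMapEntry) -> str:
    """Render one entry as the kind of sentence a written data map might contain."""
    if entry.retention_years is None:
        retention = "no retention period set"
    else:
        retention = f"eligible for archival or deletion after {entry.retention_years} years"
    return (f"{entry.data_type}: stored in {entry.location}, "
            f"responsible party is {entry.responsible_party}, {retention}")


for entry in data_map:
    print(describe(entry))
```

Keeping the entries in a structured form like this makes it easier to sort, filter and update them, while the written document remains the version most people will read.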
What Types of Data Exist?
To determine what types of data exist, an organization should review its data management practices and develop a list of the types of data that are generated. This can include what we call “user created data,” such as email, Word documents, drawings, reports or presentations. It can also include “structured data,” such as financial data or customer data, as well as text messages and photos. In litigation, the definition of data often expands beyond what is stored electronically to include hard copy documents and tangible things, such as models. Once the types of data are determined, it is easier to investigate where those sources of data may live within the organization.
Where is the Data Stored?
Data can be stored in a variety of locations, including (but not limited to) shared network spaces, personal network spaces, local laptops, the cloud, instant messaging systems, document management systems, structured databases such as finance or customer management systems, email systems, and mobile devices.
In addition to determining the physical locations of the data, it is important to note its status on your data map since some data may be difficult to access. For example:
- Online data is usually readily accessible in the active Exchange environment or on an organization’s network.
- Nearline data may be archived data that is still fairly accessible to an end user.
- Offline data may be located on backup tapes or decommissioned servers, and will be harder to access. This data may be considered “inaccessible.”
- Inaccessible data may also be in legacy data systems that are no longer accessible to the end user.
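One way to record these statuses on a data map is as an ordered set of values, so that sources can be ranked from easiest to hardest to reach. This is only a sketch: the ordering of offline versus inaccessible data and the sample sources are assumptions made for illustration.

```python
from enum import IntEnum


class Accessibility(IntEnum):
    """Status values from the list above; lower numbers are assumed easier to access."""
    ONLINE = 1        # readily accessible on the active network
    NEARLINE = 2      # archived but still fairly accessible
    OFFLINE = 3       # backup tapes or decommissioned servers
    INACCESSIBLE = 4  # legacy systems no longer accessible to the end user


# Hypothetical sources and statuses, for illustration only.
sources = [
    ("backup tapes", Accessibility.OFFLINE),
    ("network share", Accessibility.ONLINE),
    ("legacy finance system", Accessibility.INACCESSIBLE),
    ("email archive", Accessibility.NEARLINE),
]


def by_ease_of_access(items):
    """Return the sources ordered from most to least accessible."""
    return sorted(items, key=lambda item: item[1])


for name, status in by_ease_of_access(sources):
    print(f"{status.name:<12} {name}")
```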
Who is the Responsible Party?
The individual or department that is responsible for the data in question will be the most valuable resource in determining the location of that data and the best way to identify and collect what may be important in an investigation or litigation. They will have real-time information on how that data is used in the day-to-day workflow and where it can be accessed. It is important to remember that responsible parties for the data may also exist at companies that are external to the organization, which is especially the case when cloud-based or third-party applications are utilized.
When is the Data Archived or Deleted?
It is important for an organization to have a plan for archival and deletion of data (also known as “data disposition policies”), which should be laid out in the document retention and litigation hold policies. Archival and deletion plans should be based on the type and status of the data, as well as a time component. Failure to detail this step will lead to inflated costs for hardware and data storage, and ultimately will increase the cost of e-discovery if the organization becomes involved in litigation.
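To illustrate how the type of data, a time component and a litigation hold can combine, here is a minimal sketch of a disposition check. The retention periods are hypothetical, the function names are made up for this example, and the status of the data is left out for brevity; an actual policy would come from the organization's own retention schedule.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by data type.
RETENTION_YEARS = {"email": 3, "financial data": 7}


def disposition_date(created: date, data_type: str) -> date:
    """Date on which data of this type first becomes eligible for archival or deletion."""
    target_year = created.year + RETENTION_YEARS[data_type]
    try:
        return created.replace(year=target_year)
    except ValueError:
        # created is February 29 and the target year is not a leap year
        return created.replace(year=target_year, day=28)


def eligible_for_disposition(created: date, data_type: str, today: date,
                             on_legal_hold: bool = False) -> bool:
    """Data under a litigation hold is never eligible, regardless of its age."""
    if on_legal_hold:
        return False
    return today >= disposition_date(created, data_type)


print(eligible_for_disposition(date(2020, 1, 15), "email", today=date(2023, 6, 1)))
print(eligible_for_disposition(date(2020, 1, 15), "email", today=date(2023, 6, 1),
                               on_legal_hold=True))
```

Checking the litigation hold first reflects the idea that a hold overrides the normal retention schedule.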
Why is Data Mapping Important for Successful E-Discovery?
Have you ever run into a new grocery store to grab just one item? Since you are not familiar with the store, you spend more time than usual searching for the item, and you may be tempted by other items along the way, leading you to spend additional money. In your go-to grocery store, you could have gotten in and out much more quickly and cheaply. Collecting data during the e-discovery process is similar: to avoid going down a discovery rabbit hole, rely on a data map to lead you to the sources of data and the custodians that are important.
When e-discovery is about to commence, one of the very first steps an organization should take is a review of its data map. It is important to understand which data sources are relevant and where they can be found within the organization’s various data systems. In addition, it is important to treat the data map as a living document and to update it regularly to reflect the current state of the organization’s systems. Today, more employees are working remotely and may be storing data differently than in years past. Organizations should ensure that they are accounting for those locations and work practices in their updated data maps.
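If the data map is kept in a structured form, that first review can be as simple as a lookup. The sketch below assumes a hypothetical list of entries and made-up function names; it shows how a legal team might pull the sources and custodians for the types of data that matter in a given case.

```python
# Hypothetical data map entries, for illustration only.
data_map = [
    {"data_type": "email", "location": "mail server", "custodian": "IT department"},
    {"data_type": "text messages", "location": "mobile devices", "custodian": "Individual employees"},
    {"data_type": "financial data", "location": "finance system", "custodian": "Finance department"},
]


def sources_for(relevant_types, entries):
    """Return the data map entries whose data type is relevant to the matter."""
    wanted = set(relevant_types)
    return [entry for entry in entries if entry["data_type"] in wanted]


def custodians_for(relevant_types, entries):
    """Return the custodians to interview, without duplicates, in alphabetical order."""
    return sorted({entry["custodian"] for entry in sources_for(relevant_types, entries)})


relevant = ["email", "financial data"]
for entry in sources_for(relevant, data_map):
    print(f'{entry["data_type"]}: {entry["location"]}')
print(custodians_for(relevant, data_map))
```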
Understanding the documents needed, where they can be found, which custodians of that data will need to be interviewed, and the policies around archival helps the legal team target the data that’s actually relevant and ultimately reduces costs down the road.