E-Discovery Continues Its Move to the Left
No, that is not a political statement. It is more akin to your teenager telling you that a 2009 Mike Trout rookie baseball card just sold for $3.93 million — a record — and that he thinks he had that very same card growing up.
“Well, what happened to it?” you ask, to which he replies, “I think we still have it.”
“Where is it?” you may ask, with sudden interest. “Don’t know. Mom cleaned my room and I think she stuck it in a box with my other stuff,” replies the prospective millionaire. “It could be in the garage, or in the attic, or in the basement…”
With those words your sudden interest vanishes. You know your house probably contains at least 2,500 boxes of various shapes and sizes, and that none of the possible locations has been inventoried or organized since before the turn of the century. A manual search for Mr. Trout circa 2009 would be a herculean effort and could easily end in a false positive. “If I had only …” is now stuck in your head as you think about how valuable it would have been to build a spreadsheet or label the boxes back then.
Data is leading the way…
Such is the case in the world of e-discovery today: more data, in various form factors, hiding in a variety of unmanaged locations. What started before the pandemic with the growth of collaborative platforms like Microsoft Teams and mobile applications like WeChat was then accelerated by the historic explosion of Zoom and other video conferencing technologies.
Today, organizations are faced with an inevitable e-discovery future of miles of boxes hiding their own singular version of Mike Trout.
The e-discovery industry’s response to the expansion and explosion of communications data has been consistent and swift. In short: get closer to the data and dive deeper into it, in order to apply better data governance and machine-assisted approaches. Or, in common e-discovery vocabulary, add capabilities that address the earlier stages of the E-Discovery Reference Model (EDRM), on the left side of the diagram. To illustrate:
- iPro has acquired ZyLAB’s One Discovery and archiving platform to help “analyze data earlier” in the e-discovery process
- Lighthouse has acquired H5 to add analytics and review capabilities to its discovery service portfolio
- Relativity acquired TextIQ to enhance its analytics capabilities within its email and document review platform
Each of these acquisitions shows that, as an individual use case, discovery is more about deriving insights from disparate, quickly accumulating data. The traditional approach of reviewing individual documents and threaded email conversations is a thing of the past.
…but e-discovery technology hasn’t embraced new channels
The use of advanced predictive analytics in discovery is not new; products such as Content Analyst and Recommind are well established. However, most of these tools were built before the proliferation of voice, video, persistent chats, and interactive content, which are redefining today’s massive corporate data sets.
In other words, these tools can find the Mike Trout card if you have an indexed and well-organized box of cards in a known location. Given the nature of today’s data, it’s more likely you’ll end up with Mike Trout bobbleheads, autographed jerseys, and recent purchases from the TroutPro website.
It’s also good to keep in mind that advanced analytics and machine learning approaches alone are not a panacea. The effectiveness of each is highly dependent on having a clean, normalized content repository to draw from, one that accounts for the interactive, multi-modal, and non-text-based nature of today’s collaborative content. Consider Milbeck v. TrueCar. In this recent case, 1.67 GB of Slack data yielded 17 million “content objects,” which the court concluded could not be used by the plaintiff due to the challenge of “identifying the start and end of relevant conversations.” Even the most advanced predictive analytics tools may not have been able to solve that problem.
While that case focuses on the uniqueness of Slack, the coming tidal wave of legal cases involving the use of Zoom, Microsoft Teams, WeChat and other collaborative technologies is inevitable. That is why discovery providers are working to build more complete and reliable methods to get closer to the data.
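To make the conversation-boundary problem from the Slack case concrete, here is a minimal sketch of one common heuristic: treat a long silence in a channel as the end of a conversation. Everything in it is an assumption for illustration, including the `Message` fields, the `split_into_conversations` helper and the 30-minute gap. It is not how any particular review platform or the parties in that case handled the data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical, simplified shape of an exported chat message.
    channel: str
    author: str
    sent_at: datetime
    text: str


def split_into_conversations(messages, max_gap=timedelta(minutes=30)):
    """Group messages into conversations, per channel.

    A new conversation starts whenever a channel has been silent
    for longer than max_gap (an assumed, tunable threshold).
    """
    conversations = []
    # channel -> (timestamp of last message, index of its conversation)
    last_seen = {}
    for msg in sorted(messages, key=lambda m: m.sent_at):
        prev = last_seen.get(msg.channel)
        if prev is not None and msg.sent_at - prev[0] <= max_gap:
            index = prev[1]
            conversations[index].append(msg)
        else:
            conversations.append([msg])
            index = len(conversations) - 1
        last_seen[msg.channel] = (msg.sent_at, index)
    return conversations
```

A time gap is only a first approximation. Threads, replies that arrive days later, and edits or deletions all blur the boundaries, which is part of why a clean, normalized repository matters before any analytics are applied.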
Find your Mike Trout
So, how should firms evaluate their discovery provider’s capabilities to identify, collect and enable review of today’s content? Start by considering:
It’s a “big data” problem
Video, voice recordings and persistent chats will create larger and larger discovery datasets. Preservation and review storage locations will require scaling capacity like never before. More firms will respond by demanding that early case assessment (ECA) capabilities move closer to the data — as opposed to finding creative ways to move a massive collection of content directly into review platforms.
On-demand collection strategies take on new risks
Today’s content is heterogeneous not only in terms of features but also in terms of the methods available to capture content and contextually important metadata. The availability of APIs, and the limits that content source providers impose on the quantity and frequency of collection, should be known up front rather than stumbled upon in the face of discovery. This was also at issue in the recent Calendar Research v. StubHub case, where the production of Slack messages was an ongoing problem due to the use of a free account (which limits full historical data and is at the discretion of Slack to provide) versus the more suitable Enterprise version (which provides connectors for data storage). StubHub’s ability to complete a full, timely discovery request was compromised.
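A back-of-the-envelope calculation shows why quantity and frequency limits are worth knowing before a request arrives. The sketch below is purely illustrative: `estimate_collection_hours` is a hypothetical helper, and the page size and request rate in the usage note are made-up figures, not the published limits of Slack or any other provider.

```python
import math


def estimate_collection_hours(total_items, items_per_request, requests_per_minute):
    """Estimate wall-clock hours to collect total_items through a paged,
    rate-limited API, assuming the limit is hit continuously and
    ignoring retries, outages and processing time."""
    if items_per_request <= 0 or requests_per_minute <= 0:
        raise ValueError("page size and request rate must be positive")
    requests_needed = math.ceil(total_items / items_per_request)
    minutes = requests_needed / requests_per_minute
    return minutes / 60
```

For example, collecting 17 million content objects at an assumed 200 items per request and 50 requests per minute gives `estimate_collection_hours(17_000_000, 200, 50)`, or a little over 28 hours of uninterrupted collection. Real-world throughput is usually lower, so the estimate is best read as a floor.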
Modern tools…modern problems
Discovery teams are already running into challenges created by new collaborative tool features, such as “modern attachments” that replace an actual file with a link to the file. Many are now evaluating specific features, such as Slack Huddles, Microsoft Teams transcripts, and embedded digital assistants, to proactively assess the implications for collection and review strategies.
Prepare for “the next network”
The pandemic accelerated the use of digital communications tools. Many of these tools have crept over from our personal lives due to their familiarity and accessibility when working from home. Cases like the GameStop trading incident involve the use of YouTube, Reddit, and Instagram, but it could just as easily have happened on Discord, WhatsApp, Clubhouse or TikTok. Many firms are now receiving requests to support at least one new communications tool per week. They need to remain vigilant, not only by staying aware of popular tools (perhaps start by asking the aforementioned prospective millionaire), but also by monitoring for the use of tools that are prohibited by the business.
Ultimately, moving to the left is about attempting to govern today’s heterogeneous unstructured information more effectively. It entails more than just labeling boxes, using AI or reorganizing your storage locations.
Simply replacing the EDRM diagram with the Information Governance Reference Model (IGRM) in marketing materials does not make an e-discovery provider an expert in governing data. Now is the time to engage with information governance domain experts. Your next 2009 Mike Trout may be at stake.