Predictive coding has received a lot of attention lately as the next magic wand in the e-discovery bag of tricks. However, as with any new technology, the variety of implementations and marketing claims makes it hard to see how predictive coding can actually make the e-discovery process more efficient and ultimately reduce costs.
In a nutshell, predictive coding applies machine learning so that the computer can suggest coding decisions for documents, based on the decisions human reviewers have already made and on the content of the documents themselves.
All predictive coding implementations involve the review lawyer coding a subset of the records in the collection. The system examines the decisions made by the reviewer and identifies properties of the documents that it can use to make determinations automatically. As the reviewer continues to code documents, the system predicts what the reviewer will code. When the system’s predictions and the reviewer’s actual coding coincide (within reason), the system has learned enough to make confident predictions on its own.
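The learning loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-count model (`TinyCoder`), the `window` and `threshold` parameters, and the function name `train_until_agreement` are all hypothetical, and commercial predictive coding products use their own, far more sophisticated methods.

```python
from collections import Counter, deque
import math


class TinyCoder:
    """Toy relevance model (naive Bayes on word counts). Illustrative only."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def learn(self, text, relevant):
        """Record one reviewer decision (relevant is True or False)."""
        self.doc_counts[relevant] += 1
        self.word_counts[relevant].update(text.lower().split())

    def score(self, text):
        """Log-odds that the document is relevant; positive means relevant."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing; the extra +1 leaves room for unseen words.
                p = (self.word_counts[label][word] + 1) / (total + len(vocab) + 1)
                log_odds += sign * math.log(p)
        return log_odds

    def predict(self, text):
        return self.score(text) > 0


def train_until_agreement(coded_docs, window=4, threshold=0.75):
    """Feed reviewer decisions to the model one at a time.

    Before learning from each document, the model predicts its coding.
    Training stops once the predictions matched the reviewer on at least
    `threshold` of the last `window` documents. Returns the model and the
    number of documents reviewed (None if agreement was never reached).
    """
    model = TinyCoder()
    recent = deque(maxlen=window)
    for reviewed, (text, relevant) in enumerate(coded_docs, start=1):
        recent.append(model.predict(text) == relevant)  # predict first
        model.learn(text, relevant)                     # then learn
        if len(recent) == window and sum(recent) / window >= threshold:
            return model, reviewed
    return model, None
```

To try it, pass a list of `(text, relevant)` pairs in the order the reviewer coded them. In practice the window would be hundreds or thousands of documents rather than four, and the agreement threshold would be set by the review team.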
Predictive coding is being applied at several stages in the e-discovery analysis and review processes:
Culling: In this mode, a lawyer who is an authority on the matter makes relevance decisions on a subset of the records. Once a sufficient number of records have been reviewed (typically a few thousand), the system applies its predictive analysis to the entire set to cull out the records most likely to be relevant. These records can then be subjected to the normal, manual review process.
Subjective Coding: The predictive coding system examines the subjective coding decisions made by lawyers as they manually review records. When a sufficient number of records have been reviewed, the system will start to make coding suggestions for subsequent records to assist the lawyers.
Review Quality Control: Along the same lines as predictive subjective coding, the system uses the subjective coding decisions made by lawyers to predict how documents should be coded. However, instead of suggesting codes for un-reviewed records, the system applies its predictions to all manually coded records and identifies those records where its predictions and the actual coding diverge. This enables reviewers to zero in on documents that may not be coded correctly.
Prioritization of Records for Review: Predictive coding can also be used to prioritize records in a review. Once a sufficient number of records have been manually reviewed and coded, the system can group un-reviewed documents based on its coding predictions. The review project manager can then group all documents likely to be coded relevant, for instance, and assign these to be reviewed first.
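Two of the uses listed above, review quality control and prioritization, come down to comparing or sorting on the system's predictions. The sketch below shows the idea in Python. It assumes the system has already learned a weight for each word; the `weights` table, the `cutoff` value and all function names are hypothetical stand-ins for whatever model a real product uses.

```python
def relevance_score(text, weights):
    """Sum the learned weights of the words in a document (unknown words count 0)."""
    return sum(weights.get(word, 0.0) for word in text.lower().split())


def flag_for_second_look(coded_docs, weights, cutoff=0.0):
    """Quality control: return IDs of documents whose manual coding
    differs from the system's prediction."""
    flagged = []
    for doc_id, text, coded_relevant in coded_docs:
        predicted_relevant = relevance_score(text, weights) > cutoff
        if predicted_relevant != coded_relevant:
            flagged.append(doc_id)
    return flagged


def prioritize(unreviewed_docs, weights):
    """Prioritization: return IDs of un-reviewed documents, ordered so the
    ones predicted most likely to be relevant are reviewed first."""
    ranked = sorted(
        unreviewed_docs,
        key=lambda doc: relevance_score(doc[1], weights),
        reverse=True,
    )
    return [doc_id for doc_id, _text in ranked]
```

Culling works the same way as prioritization, except that the ranked list is cut off at some score and only the records above the line go forward to manual review.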
Predictive coding technology is also being considered in several electronic records management solutions to permit automatic classification of records, removing the burden from individual users.
This technology is being incorporated into more and more e-Discovery software systems, and may soon become a standard way to cull and review electronic data.
For more information on this technology and other cutting-edge e-discovery solutions, contact us.
One of the challenging and often expensive aspects of collecting data for litigation, regulatory investigation or audit is locating all sources of potentially relevant evidence. That exercise is difficult enough when considering only company equipment and devices (computers, Blackberries, servers, shared drives, etc.). The scope grows exponentially when one considers the personal devices possessed by employees, including home computers, cell phones, Blackberries, iPhones, iPads, etc. Your employees are using these personal devices for business purposes, which means that potentially relevant evidence is stored on devices your organization does not ultimately control.
Think this doesn’t apply in your business? Think again.
According to a study of 4,500 users in 13 countries by KRC Research (published in The Globe and Mail on Tuesday, April 19, 2011 on page B7), 40% of workers use their personal devices for business purposes. Further, 50% of workers who use their own devices for business reasons access company networks without their employer’s knowledge.
Perhaps it is time to revisit your organization’s Records Retention, Acceptable Use, Security and/or Technology policies?
Solid state drives are gaining in popularity in laptop computers and tablets. Compared to their hard disk drive counterparts, solid state drives are more expensive and offer less storage. However, they are much faster and lack the moving parts that can make HDDs prone to failure, particularly in mobile devices that experience a great deal of movement. Solid state drives also consume much less power, allowing portable devices to be used longer between charging.
As with any digital technology, as they move into the mainstream, the price of solid state drives will fall and the storage capacity will increase. It is expected that solid state drives will virtually replace conventional hard drives in portable devices within the next 3 to 5 years.
All of this sounds great, except when it comes to computer forensics. For years, computer forensic professionals have been claiming that “delete does not mean delete”. When you drag a file into the Windows recycle bin, or delete an email in Outlook, a computer forensic technician can usually recover it. This is because, when you “delete” a record on a computer, all that happens is that the record is hidden from view and is suitably marked so that sometime in the future, the computer can replace it with newer data.
Unlike conventional hard drives, solid state drives are little computers unto themselves. They insulate the main device from all the nitty-gritty details of storing and retrieving information. Among other things, a solid state drive can automatically purge deleted information on its own, often within 30 to 60 minutes. It does this to keep writing fast: flash memory has to be erased before it can be rewritten, so the drive clears out unused blocks ahead of time. Unfortunately (from a computer forensics perspective), this means that when you “delete” a file or email, it may be permanently erased from the solid state drive within about an hour.
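The difference between the two kinds of drive can be illustrated with a toy model in Python. It is a deliberate simplification: the `ToyDrive` class and its methods are hypothetical, and real drives and file systems are far more complicated. It only shows why "hidden but still stored" data is recoverable, and why a drive that cleans up after itself leaves nothing to recover.

```python
class ToyDrive:
    """Toy model of a drive. Not how any real drive is implemented."""

    def __init__(self, purges_deleted_blocks):
        # True models a solid state drive that cleans up on its own;
        # False models a conventional hard drive.
        self.purges_deleted_blocks = purges_deleted_blocks
        self.blocks = {}      # file name -> data physically on the drive
        self.visible = set()  # file names the user can still see

    def write(self, name, data):
        self.blocks[name] = data
        self.visible.add(name)

    def delete(self, name):
        # "Delete" only hides the file; the data itself stays put.
        self.visible.discard(name)

    def background_cleanup(self):
        """Housekeeping the drive runs by itself, some time after a delete."""
        if self.purges_deleted_blocks:
            for name in list(self.blocks):
                if name not in self.visible:
                    del self.blocks[name]  # data is now gone for good

    def forensic_recover(self):
        """Return everything still stored on the drive but hidden from the user."""
        return {n: d for n, d in self.blocks.items() if n not in self.visible}
```

With `ToyDrive(False)`, a deleted file still shows up in `forensic_recover()` no matter how often `background_cleanup()` runs. With `ToyDrive(True)`, the file is recoverable only until the first cleanup.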
Although most e-Discovery matters involve only active data, there are situations, such as fraud or harassment investigations, where deleted information may be important. The widespread use of solid state drives will make investigations such as these more difficult.
For more information about the computer forensic implications of solid state drives, refer to the Journal of Digital Forensics, Security and Law, Volume 5, Number 3.