Redacting documents should be easy... and interesting
How do you stay focused and avoid mistakes in a job that is highly repetitive? Imagine that you work in security at the airport: you run the same processes over and over again and always ask the same questions. Do you have any liquids? Do you have a computer? You guide people through the same detector and check more or less the same bags.
Why bring up this example in an article about redacting and anonymizing documents? Are there any similarities? Not many, but there is one: the process of looking for personal information in documents and redacting it is highly repetitive. You need to look for generic personal information that can directly or indirectly identify a person, and then remove it or replace it (pseudonymize it). On the surface it may not be the most exciting task. But what an important one!
And just like at the airport, where you risk letting in the bad guys if you don't pay proper attention, you soon find yourself on a risky path if you don't pay proper attention when redacting documents. The consequences of failing to anonymize a name, a social security number or a rare disease in a document can be catastrophic for the person who is suddenly exposed.
Now try to imagine that the security personnel had to carry every bag by hand, and then open each one and search it for dangerous content. At the same time, a family is waiting impatiently on the side with their tired and hungry kids. I guess you would see a higher employee turnover rate. But worst of all, mistakes would most likely happen more often.
Fortunately, that is not the case at the airport. If you work with redaction, however, it is. If you have ever tried to redact and anonymize a document, you probably agree with me that it wasn't the most thrilling process. You most likely had to search the document manually and black out or pseudonymize what you found. Even if you were lucky enough to have a redaction system to assist you, you would still be left with the task of finding names, addresses, property numbers, ship numbers, social security numbers etc. Automatic? No! Manual? Yes.
That is fine if you only need to redact and anonymize a few pages. But what if you have to read through hundreds, or even thousands, of pages in this manner? Chances are, despite your good intentions, that you will miss a name or two. And in the process of finding generic GDPR information, you lose focus on the more difficult parts that require human judgement and that a machine most likely cannot detect. For instance, confidential company information, or a description that by itself lets any reader figure out who a person is. It doesn't take long for a person to figure out who “The founder of the company whose mission it is to build an everything store and by the way has a great interest in going to orbit” is. But for an AI model? Well, that requires a lot of training.
And now comes the surprising part: that is exactly how people redact or anonymize documents in most organizations today. Using manual processes. The consequences: mistakes happen, the time spent redacting is enormous, and people quit.
We talked to more than 25 government institutions and a dozen law firms in Denmark. In these organizations, redaction is part of everyday life, and vast amounts of time and resources are spent anonymizing and redacting documents. And none of them uses a digital redaction tool to assist them. Why? Because it doesn't exist. Yet...
Law firms and government institutions need to redact and anonymize information for various reasons. We will explore the use cases and methods more thoroughly in a later article.
The question we have most often been met with at Cleardox is: There must be a good redaction tool out there? The fact is: There isn't. Well, define a “good” redaction tool. In our view a good and modern redaction tool rests on three pillars:
Automatic detection. A system that automatically finds the personal information in the document under review and presents it in a user-friendly manner. The systems we have looked at (and believe me: we have searched the corners of the web) don't have this feature. Some can find emails, and others can find numbers. But it's never the full range of entities.
Security, security, security. The reason why most redaction tasks are still manual is that people are afraid that someone can reverse-engineer the redacted document and recover the original entities. A redaction tool must ensure that this can never happen.
UX & design. Most of the systems we have seen don't have the look and feel you would expect from a modern “apple like” software system. Many of them have too many unnecessary features and require too much training to master.
So, to answer my question from the very beginning: How do you stay focused and avoid mistakes in a job that is repetitive? By having the right set of tools to assist you in your daily work. Tools that make it interesting and enjoyable to go to work. In this case: a redaction tool.
All in all: We believe redaction and anonymization of documents should be easy, fast and interesting. This important task deserves a proper redaction tool. Just as the security personnel at the airport have advanced scanning equipment and metal detectors at their disposal, the people responsible for redacting documents should have proper tools at theirs. Rather than spending their attention on names, addresses and the like, they should be able to focus on the difficult redaction scenarios.
Cheers,
The Cleardox team