The Open Government Act of 2007 emphasizes putting administrative and court
records in the public domain as much as possible, and this is supported by the
Obama administration. Because of the vast number of such records, the redaction
process is increasingly automated, at least for an initial screening, because it is
impractical for people to sift through so many documents and edit out personal
information such as Social Security Numbers, names of minors, and dates of birth.
Automated legal redaction software not only removes visible information that has
been previously identified for editing, but also helps in identifying and removing
metadata about the information and any hidden tracking data that could otherwise
be recovered later with specialized tools. While major editing software
developers such as Microsoft and Adobe provide their own redaction tools, several
stand-alone automated legal redaction tools are also available.
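The initial automated screening can be pictured as a pattern search over the text. The following is a minimal Python sketch, assuming plain text input and two illustrative formats (Social Security Numbers written as NNN-NN-NNNN and dates written as MM/DD/YYYY). The names PATTERNS and find_candidates are hypothetical and are not taken from any particular redaction product; real tools use far richer rules and also handle metadata, which this sketch does not.

```python
import re

# Illustrative patterns only (assumptions, not a complete rule set).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def find_candidates(text):
    """Return (label, start, end, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((label, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])
```

Each hit is only a candidate for removal; names of minors, for example, cannot be found by a fixed pattern like these and would need a list or a human reviewer.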
These automated legal redaction tools work in several ways: they can delete
the selected information and then create a new file without it; they can replace
the information with garbage lettering (in effect replacing one string of
binary numbers with another, meaningless one); or they can draw rectangles on top
of the marked sections, generate an image of the page, and then flatten it, so that
the original redacted information is lost. One can then print out these images or
run them through OCR software in batch mode for electronic storage and retrieval.
These tools also feature self-learning capabilities, and
try to match patterns in your previous work to predict potential candidates for
removal.
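The first two approaches can be illustrated on plain text. This is a rough Python sketch under stated assumptions: the input is a string rather than a real document file, the only sensitive pattern is a Social Security Number in NNN-NN-NNNN form, and a fixed filler character stands in for the "garbage lettering" described above. The function names are hypothetical. The third approach (drawing rectangles and flattening the page to an image) needs a rendering library and is not shown.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # assumed format NNN-NN-NNNN

def redact_by_deletion(text):
    """Approach 1: build a new string that omits the sensitive spans."""
    return SSN.sub("", text)

def redact_by_overwrite(text, filler="X"):
    """Approach 2: overwrite each sensitive span with same-length filler."""
    return SSN.sub(lambda m: filler * len(m.group()), text)
```

Overwriting keeps the length and layout of the text unchanged, whereas deletion shortens it; in both cases the original characters are absent from the new output.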
However, a document that has been processed by automated legal
redaction software should still be screened by a human prior to publication to
check that all sensitive information has been removed and that no clues remain
from which that information could be recovered.
2. (Redact a PDF)
Requirements for PDF Redaction
Legal and administrative requirements are increasingly placing more documents in
the public domain, and electronic documents, including PDFs, are no exception.
Before publishing these files, however, it is necessary to redact a PDF either
manually or through batch software. There are several techniques for
redacting a PDF properly and avoiding common mistakes that inadvertently
leave data in the redacted PDF file. Two such mistakes are making the text the same
color as the background and drawing a rectangle over the text: in both cases the
text is still in the file, so a reader can simply select it and read it.
Using Acrobat's built-in tool
The full version of Adobe Acrobat comes with a tool for automatic redaction, and
offers several options. When you start to redact a PDF using this tool, it parses the
document automatically, identifies private data according to criteria you set,
removes the data, and then creates a new file so that one cannot access redacted
data anymore. It can also load a pre-defined wordlist and search the document
according to this list. It is relatively easy and fast to redact a PDF using this tool.
Adobe has also released a short video on its website that takes you through
the steps and teaches you how to redact a PDF.
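The idea of searching a document against a pre-defined wordlist can be sketched in a few lines of Python. This is not Acrobat's implementation or API; it is an illustration on plain text, and the names load_wordlist and redact_terms, the "[REDACTED]" marker, and the whole-word, case-insensitive matching rule are all assumptions made for the example.

```python
import re

def load_wordlist(lines):
    """Strip whitespace and drop blank lines from an iterable of terms."""
    return [ln.strip() for ln in lines if ln.strip()]

def redact_terms(text, terms, mark="[REDACTED]"):
    """Replace every whole-word, case-insensitive occurrence of any term."""
    if not terms:
        return text
    # Longest first so multi-word terms win over shorter overlapping ones.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(mark, text)
```

In practice the terms could be read from a file with `load_wordlist(open("terms.txt"))`, where `terms.txt` is a hypothetical file holding one term per line. The word-boundary check assumes each term starts and ends with a letter or digit.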
Batch Mode Redaction
You can also use stand-alone plugins that can handle a large number of files, and let
you redact a PDF in batch mode, saving you a considerable amount of time.
Batch mode redaction should be used with caution, however: always
check some files at random to make sure that no clues to the missing data are left
behind inadvertently. Once you have had some practice and mastered the art of
setting up proper filters, you can redact a PDF with confidence.
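The batch workflow with random spot checks might look like the following Python sketch. It works on a folder of plain .txt files rather than PDFs, uses a single assumed filter (Social Security Numbers in NNN-NN-NNNN form), and the function names redact_folder and pick_for_review are hypothetical; a real batch tool would apply its own filters to PDF files.

```python
import random
import re
from pathlib import Path

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # assumed filter

def redact_folder(src, dst):
    """Redact every .txt file in src into dst; return the output paths."""
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    outputs = []
    for path in sorted(src.glob("*.txt")):
        cleaned = SSN.sub("[REDACTED]", path.read_text(encoding="utf-8"))
        out = dst / path.name
        out.write_text(cleaned, encoding="utf-8")
        outputs.append(out)
    return outputs

def pick_for_review(paths, k=5, seed=None):
    """Choose up to k output files at random for a human to inspect."""
    rng = random.Random(seed)
    paths = list(paths)
    return rng.sample(paths, min(k, len(paths)))
```

Writing the results to a separate folder leaves the originals untouched, so a filter that turns out to be wrong during the spot check can be corrected and the batch run again.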