A Tutorial On Redaction

This image has been redacted for your convenience.

Writing software is fun. Redacting software is torturous. It gets worse when the software was never intended for redaction, but you need to redact it anyway.

I’ve worked on multiple projects in the past decade and all of them have involved redaction at some level. The source code, the documentation, the bug fix requests; I’ve redacted every type of digital thing you can think of. Sadly, even after all this time, I’ve yet to find a way to remove the vampiric qualities of redaction which consume the souls of those who perform it. However, I have learned a few ways to make redaction more effective with less time-consuming rework, and that’s what this post is all about.

Redaction, if you are uninitiated into the cult, is a form of evil magic where you remove sections of important information from documents or source code all while somehow retaining most of your sanity. Sometimes, the information is removed because you aren’t licensed to give away someone else’s work, but you need to deliver something that contains the work. Other times, you want to protect your own inventions but still be able to sell portions of your work. You might be allowed to make redaction obvious, printing black boxes where text should be. Or maybe you’ll need to completely hide the fact you messed with the content. This latter approach is the one I’ll assume, because I don’t think there’s a whole lot of value in the former.

What’s the Point?

Stop redacting for a second. It’s easy to jump into redaction work, go through some easy, repeatable steps to get your job done, and end up missing the whole point of redaction. Remember, you are redacting because whoever is receiving the information you have can’t have specific stuff it contains.

Are you redacting to remove terms? Maybe the names of intellectual properties? If that’s all, you might think you can search and replace the contents of the files you need to redact. Replacing terms is easy enough. You can probably finish your redaction work in a few minutes. But what value is there in removing terms? If the people who are being provided the redacted material have any idea what the material was used for, they’ll know which terms were redacted. You haven’t done anything. And if they have no idea what it was used for, why do they care about it?

I’ve found that redaction is time-consuming and tedious, but also an inconsistent process. You can’t write a program to perform redaction for you, because a program can’t interpret every conceivable spelling error, phrasing (especially poor English), or acronym. Searching for terms with software is really helpful, but it only catches the most obvious stuff.

Consider this paragraph:

“The software uses a proprietary component called This Sure Is Awesome Technology. This technology is used to generate output in a comma-separated list, but in columns instead of rows. This is protected by patent Des. 555,555. Our tool uses this tool to turn pictures of ducks into pictures of chickens. Chickens are better than ducks.”

Suppose you can’t transfer This Sure Is Awesome Technology, because it is licensed. And suppose you can’t transfer the patent information because of international law. A search for the relevant terms would get you what you want, but try just removing them:

“The software uses a proprietary component called. This technology is used to generate output in a comma-separated list, but in columns instead of rows. This is protected by patent. Our tool uses this tool to turn pictures of ducks into pictures of chickens. Chickens are better than ducks.”

You’ve left in the nature of the technology in question and the content of the patent. It doesn’t take much effort for someone to replace your redacted text. So what was the point? You need to do this by hand:

“The software uses a proprietary component which translates files from one type to another. Our tool uses this tool to turn pictures of ducks into pictures of chickens. Chickens are better than ducks.”

Here, the meaning is preserved, but not the method, which is the focus of the redaction. Redaction is almost always an effort to protect methods and concepts, so why apply a method that can’t protect anything?

Search, however, is limited. Consider a different writing of the same text:

“The software uses a proprietary component called TSiAT. This technology is used makes row-based CSVs. This is protected by PD 555,556. Our tool uses this tool to turn pictures of ducks into pictures of chickens. Chickens are better than ducks.”

Your search won’t catch every possible spelling error. It won’t catch different forms of the same phrases, especially if those forms have poor grammar. It won’t catch acronyms you don’t expect. If you understand the point of redaction, you won’t consider your job complete just because a search doesn’t return terms from a list of “bad” words.
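Here is a minimal sketch of that failure, assuming a simple case-insensitive term list. The `find_terms` helper and the shortened sample strings are my own illustration, not a real redaction tool:

```python
# Terms we are not allowed to hand over (taken from the example paragraph).
TERMS = ["This Sure Is Awesome Technology", "Des. 555,555"]

# Shortened versions of the two example paragraphs.
ORIGINAL = (
    "The software uses a proprietary component called This Sure Is "
    "Awesome Technology. This is protected by patent Des. 555,555."
)
VARIANT = (
    "The software uses a proprietary component called TSiAT. "
    "This is protected by PD 555,556."
)


def find_terms(text, terms):
    """Return every listed term that appears in text, ignoring case."""
    lowered = text.lower()
    return [term for term in terms if term.lower() in lowered]


print(find_terms(ORIGINAL, TERMS))  # both terms are found
print(find_terms(VARIANT, TERMS))   # [] even though the same secrets are there
```

An empty result for the second text doesn’t mean it is clean. It only means the list didn’t anticipate the acronym or the mistyped patent number.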

The Redaction Balance

It’s easy not to go far enough in your redaction and leave too much content behind. On the other hand, redacting massive sections of documents removes any value from them. At that point, why even give the documents away?

You’ve probably seen documents redacted by the government. You know, those poorly scanned pages that have a handful of words floating in a sea of black ink. You might find pieces of information here and there but you might not. Why did they release the document at all if there’s nothing in it?

A better approach, as described above, is to search for terms and concepts yourself. You, a human being and not a computer, can understand practically everything that might show up in the information you are redacting. It’s tedious, it’s terrible, and it might be evil, but redaction is something you can’t describe in logical terms any more than you can describe writing a book in logical terms. You can’t delegate this terrible work to a computer, no matter how much you want to. And if you are doing redaction the right way, you’ll really, really want to.

Your goal is to remove terms and ideas in a careful way that doesn’t make it obvious that the terms and ideas ever existed. For instance, if you need to remove the section in brackets, do it like this:

“The tool is capable of [feature A], which does X, Y, and [Z] in order, as well as feature B, which does X and Y only.”

Becomes:

“The tool is capable of feature B, which does X and Y in order.”

A hard redaction of this, replacing [feature A] with [redacted feature] and [Z] with [redacted function] would read like this:

“The tool is capable of redacted feature, which does X, Y, and redacted function in order, as well as feature B, which does X and Y only.”

This gives away the fact that a feature exists as well as two-thirds of what it does. If you want someone to know about “Feature B” and not “Feature A”, this is a terrible way to do it.

Acronyms Are Your Enemy

If a term you are replacing is an acronym, things get much worse.

Imagine you have an acronym like RED. That term might appear thousands of times in unrelated words: hundred, redaction, credibility, bred, not to mention the word “red” itself. If this term is just replaced forcibly with something like “Supplier Technology”, you end up with ridiculous sentences like:

“There are two-hundSupplier Technology tests, each of which appear in black if they passed and Supplier Technology if they failed, establishing the cSupplier Technologyibility of the claim that the software was tested.”

You’d need to go through these results by hand, which is no faster than searching and reading in the first place. If you left this in place, anyone reading the redacted document would realize that “Supplier Technology” is clearly what all instances of “RED” become. Again, search-and-replace has accomplished nothing.
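To see the problem in code, here is a rough sketch using Python’s standard `re` module. The helper names are hypothetical, and the final sentence of the sample text (“The RED module ran them.”) is something I added so that one genuine use of the acronym exists. The second helper does not redact anything; it only lists whole-word, case-sensitive hits with some context so a person can review them.

```python
import re

# Sample text; the last sentence is an assumed, genuine use of the acronym.
TEXT = (
    "There are two-hundred tests, each of which appear in black if they "
    "passed and red if they failed, establishing the credibility of the "
    "claim that the software was tested. The RED module ran them."
)


def naive_replace(text, term, replacement):
    """Replace every occurrence of term, ignoring case and word boundaries."""
    return re.sub(re.escape(term), replacement, text, flags=re.IGNORECASE)


def whole_word_hits(text, term, context=20):
    """List case-sensitive, whole-word matches with surrounding context."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    hits = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        hits.append(text[start:end])
    return hits


print(naive_replace(TEXT, "RED", "Supplier Technology"))
for hit in whole_word_hits(TEXT, "RED"):
    print(hit)
```

Word boundaries cut down the noise, but they still miss lowercase or misspelled uses of the acronym, so the output is a list of candidates for a human to read rather than a finished redaction.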

And this isn’t the end of the pain you will suffer at the hands of linguistic shortcuts. Laziness compels people to turn all sorts of things into acronyms where you may not expect them. And even if you try to expect them, they probably won’t use the same letters you would. This doesn’t even take spelling errors into account. A misspelled acronym is like a land mine of important information you can’t sweep for. It’s just waiting for the recipient of the redacted material to trip on, blowing up in your face. Acronyms are just another reason you should be performing redaction by hand.

Some Precautions

You may have no idea if the documents or software you are writing will be subject to redaction in the future. But if you somehow do know, there are a few things you should keep in mind.

If you want to make redaction trivial, don’t mix different proprietary information. If you are working with three companies, try to keep the IP of each of them separate, restricting interaction to as few documents as possible. You’ll find that this won’t be possible, at least not completely. The closer you get to it, however, the easier the redaction.

If you want to make redaction impossibly difficult, use extremely short proprietary terms. For instance, two-character terms like “A1” will show up in binary data and GUIDs, maybe millions of times. Even if an engineer can look through things at a rate of 10/second (which is practically begging for human error), that’s still four full terrible days to look for a term which might legitimately appear twice in its proprietary context. Inconsistent acronyms, lack of spelling and grammatical checks, and images all help multiply the length of time you will need to perform redaction. Avoid these things as much as possible in any material that might be redacted in the future.
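The arithmetic is easy to check. Four days of nonstop review at 10 hits per second works out to about 3.5 million hits, which is the figure I assume below. The function names and the sample bytes are made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def count_hits(data: bytes, term: bytes) -> int:
    """Count non-overlapping occurrences of term in raw data."""
    return data.count(term)


def review_days(hit_count: int, hits_per_second: float = 10.0) -> float:
    """Days of nonstop review needed to inspect every hit by hand."""
    return hit_count / hits_per_second / SECONDS_PER_DAY


# A tiny, invented sample: only the last "A1" is the proprietary term.
sample = b"id=A1B2-00A1-FFA1; vendor part A1"
print(count_hits(sample, b"A1"))  # 4
print(review_days(3_456_000))     # 4.0
```

Even in this tiny sample, three of the four hits are noise, and that ratio gets far worse in real binary data.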

Writing Well Means Thinking Clearly

In the three or four English classes I was required to take in college, there were no lectures on the topic of writing well. We “studied” politics – exclusively by reading poorly written papers created by our peers, all combined into a parody of a textbook – but we never studied English.

Good writing is a product of clear thinking. If you can’t get your thoughts into written form, you probably don’t understand what you are thinking about. When English professors stop teaching how to write clear English, they either do it because they are unqualified to teach or because they aren’t English professors in the first place but amateur political hacks. I’m not sure which of these is worse.

I’ve since graduated college, but bad writing thrives just as much in business as it does in education. Thankfully, good writers have addressed the problem before, and yesterday I found an old piece by George Orwell which had me thinking about it again.

…quite apart from avoidable ugliness, two qualities are common to [bad writing]. The first is staleness of imagery; the other is lack of precision. The writer either has a meaning and cannot express it, or he inadvertently says something else, or he is almost indifferent as to whether his words mean anything or not. This mixture of vagueness and sheer incompetence is the most marked characteristic of modern English prose, and especially of any kind of political writing.

Bad writing is ugly, stale, and imprecise. It follows that good writing is better in each way. I hope in my own writing to avoid ugliness, staleness, and imprecision.

Orwell lists a few bad habits that writers should avoid. Business-speak – the dread language invented by people who wanted to seem important by using many words to say very little – seems to be nothing but these habits taken as law:

Dying metaphors … there is a huge dump of worn-out metaphors which have lost all evocative power and are merely used because they save people the trouble of inventing phrases for themselves…

Operators, or verbal false limbs. These save the trouble of picking out appropriate verbs and nouns, and at the same time pad each sentence with extra syllables which give it an appearance of symmetry …  In addition, the passive voice is wherever possible used in preference to the active…

Pretentious diction. Words like phenomenon, element, individual (as noun), objective, categorical, effective, virtual, basic, primary, promote, constitute, exhibit, exploit, utilize, eliminate, liquidate, are used to dress up simple statements and give an air of scientific impartiality to biassed judgements…

Meaningless words. In certain kinds of writing, particularly in art criticism and literary criticism, it is normal to come across long passages which are almost completely lacking in meaning…

A quick look at some recent company emails I’ve received removes any doubt that the sort of language spoken in the business world is one with a passing resemblance to English. It has English words, but unlike English, its purpose is not the communication of information.

Consider these incredible phrases:

  1. “distracting instability”: This is pure jargon. It appears to convey information, but it is more of a passphrase used to indicate membership in a group – the group of professional businessmen. Like all jargon, it could be replaced by a simple English expression like “
  2. “operational excellence”: More jargon. This phrase does not mean what the English words that comprise it mean, which makes it bad. A thing which is operational is in use or ready for use. Excellence, on the other hand, is the quality of surpassing mere goodness and being great. Imagine someone using the phrase “operational red”. The only difference is substituting one quality with another. It doesn’t make any sense, either.
  3. “get the ball rolling”: A metaphor that can always be replaced by the word “start”.
  4. “bubbled to the top”: There are few more complicated or less clever ways to say “rose”.
  5. “compliant to the ever-evolving requirements related to this area”: The end of the phrase (“related to this area”) is redundant. Was there any question the requirements were related to what we’re already talking about?
  6. “opportunities for growth”: More redundancy. The word “opportunities” gets to the point without the botany reference.
  7. “tackle this challenge”: This is not only a dying metaphor, but a bad one in the first place. You don’t “tackle challenges”. Challenges are abstract, and tackling is a physical act.
  8. “eliminating potential delays”: Since potential delays, being potential and not actual, do not actually exist, it seems impossible to know what they are, let alone to eliminate them.
  9. “a ticket to entry toward building a partnership”: Another dying metaphor, this time used to pad a sentence toward artificial importance. The entire phrase “a ticket to entry toward” could be replaced with the single-syllable word “start”. Does the author know that English has such a word available?
  10. “this will allow us to ensure we not only enable”: We will do something. What will we do? We will be allowed. What will be allowed? We will be allowed to ensure. What will we be allowed to ensure? That we not only enable, but also do something else. All that this phrase adds is confusing layers of verbs. Is that a useful device in other languages?
  11. “working to leverage”: The word “leverage”, outside of physics, can always (ALWAYS) be replaced with the word “use”. And it always (ALWAYS) should be.

I’m probably guilty of business-speak and other errors in writing. This is especially so because I didn’t realize just how bad business-speak was until years after I began being forced to read it.

Useful to me, and hopefully useful to you, Orwell gives a list of rules to keep in mind as you write:

i. Never use a metaphor, simile or other figure of speech which you are used to seeing in print.

ii. Never use a long word where a short one will do.

iii. If it is possible to cut a word out, always cut it out.

iv. Never use the passive where you can use the active.

v. Never use a foreign phrase, a scientific word or a jargon word if you can think of an everyday English equivalent.

vi. Break any of these rules sooner than say anything outright barbarous.

Orwell’s purpose for expounding the virtue of good writing is to avoid the political manipulation that requires bad writing to hide bad thinking. This same sort of motivation exists in the business world. Business-speak is used to hide things – ignorance, motivations, lies, manipulation, information – from readers by making those readers feel they’ve been told something important and informative.

A Critique of Salary

Articles about business are often full of jargon, ugliness, and imprecision, but I recently discovered an article on salaries that seems to avoid egregious examples of those linguistic evils. I had been looking into the origin of the term “salary” and the bureaucratic inventions based on it: “salaried exempt” and “salaried non-exempt.”

I’m a software engineer, and like most of the people who work on software, I am paid according to the “salaried exempt” rules. This is like a normal salary (I am paid a certain amount of money over the course of the year for my work, rather than per hour), except that my company is not required to pay me extra if I need to work extra hours to get my job done. Not all companies abuse this policy, and my own actually provides some extra money, up to a point, for overtime. Nevertheless, I have some critical thoughts about the entire concept of salary.

I’m not writing this to just summarize my thoughts on salary, but to compare and contrast them with the article I mentioned, which is titled “4 reasons why companies can ask their employees to work for ‘free’”. Lack of capitalization aside, I already have some problems with this. I’m not interested in why a company can ask their employees to work for free. The answer is intuitively obvious: it’s legal. The author talks about the legality of salaried employees being asked to work extra hours, so at least she covers the title. However, out of her four sections, only half talk about why employers can do this. The other half talk about why they would choose to.

A leaked Urban Outfitters memo from 2015 was the motivation for the article, itself written two years ago. It begins (emphasis mine):

The leaked Urban Outfitters memo asking salaried employees to volunteer one or more weekend shifts at an Urban Outfitters fulfillment center to pick, pack and ship merchandise is really no story at all, despite Internet shaming and sensational claims that Urban Outfitters is making management employees work for “free.”  The request of Urban Outfitters is not unusual; it is just unusual that the request was leaked to the media.  Employers regularly require exempt employees to go over and above a 40-hour work week without additional pay, and this approach is appropriate under wage-hour laws.

My disagreement with the article begins with the first paragraph. We’ll come back to the use of the word “free”, used to describe the hours worked by many salaried employees beyond the contractual obligation they have, and focus for now on the line “this approach is appropriate”. Why is it appropriate? Because it is legal. This is the theme of the article. The salary system in place is legitimate because it is legal, which is almost a tautology. The fact is, I don’t think there are many good reasons to have this system, and I think a lot of people realize that and appeal to the legality of it as justification.

And, while some media commentators have dubbed this as “working for free,” the reality is that the employees are not working for free.  They have agreed to work all required hours in exchange for a certain salary.

This is true, but the “required hours” amount to forty hours every week. What value is an agreement to work forty hours a week if this number is merely a suggestion?

After all, remember that there are salary requirements for exempt employees, so those who are being asked to “volunteer” are being compensated at a higher pay grade, at or above a salary set by our federal and state governments pursuant to public policy considerations.  Therefore, it is in fact “fair” to ask exempt employees for the extra work…

“Fair” in the context of this government means “legal”. It is constantly the reference point for fairness and appropriateness. I think it’s a bad standard though; why is the law written as it is written? The real question is what objectively determines fairness. The author tries to answer this by saying the quantity of money being paid justifies overtime. The salary for exempt employees exceeds an arbitrary government limit in the Fair Labor Standards Act, and is thus “fair”. After another reference to the law, she goes on again to give more rationale:

 The increased responsibility and salary levels of exempt employees also means they likely have more bargaining power in the marketplace and freedom to leave an oppressive employer, so government is less concerned about extra “unpaid” work in their case.

I don’t care what the government is concerned with. I don’t care what the government permits under law. I think it is wrong to require employees to work more hours than they are contractually obligated to work, and I’m convinced the entire concept of “salaried exempt” is absurd. The fact I have “more bargaining power” doesn’t offset this, and it turns out that many salaried exempt positions require such specialized skills that this bargaining is done by more people for fewer jobs anyway.

 1. Employees who are exempt can work over 40 hours without additional compensation.

Her first argument is a restating of the law. Of course employees can be required to work over forty hours without additional pay. We’ve already established this. But the interesting question is why this ought to be the case. Yet another attempt to rationalize this is provided:

Exempt employees take customers to dinner after hours without additional compensation.  They answer after-hour calls and emails without additional compensation.  This happens all the time.  And, it’s legal.

Employees often do things after work hours for which they are not paid and it is legal, so therefore companies can ask employees to work more than forty hours a week. It’s not really an argument, but a restating.

2. Volunteering for additional work does not change the employee’s primary duty.

Exempt employees who “volunteer” for  production type duties (e.g. pick, pack, and ship merchandise) do not have their jobs transformed into hourly non-exempt jobs as long as their primary duty remains exempt.

Again, another restating of the fact that companies can do what we’ve already established they can legally do. It gets a little more interesting after this:

3. Production work doubles as leadership training for exempt workers.

…Rolling up their sleeves to help might provide a real eye-opening education for how hard the hourly employees work and how decisions by exempt  personnel affect those hourly workers.  This could be valuable training for managers, administrators and professionals.  Also, isn’t rolling up your sleeves to perform “undesirable” tasks one definition of leadership?  Leaders should not be above any task, no matter how “menial.”

It isn’t doing undesirable tasks that repulses people from the concept of “salaried exempt”. It’s doing those tasks without getting paid for the extra hours worked. This rationale doesn’t even enter into the discussion when the jobs in question are in the world of engineering, since there often isn’t any sort of “leadership” in the sense described here going on.

The fact is, we are no closer to an answer as to why this is a good practice than we were when we started. One final reason is given:

4. ‘Volunteer’ work can reduce overtime.

Reducing overtime of hourly workers by asking exempt employees to pitch in, as long as the company does it legally, is a perfectly legitimate business decision.

Some people – who don’t get paid extra for working extra hours – can work in place of those who do get paid extra for working extra hours, which if done legally, is a legitimate business decision. Because, as we’ve already seen many times, it is the legality of the practice that makes it fair, legitimate, and appropriate. Overtime doesn’t reduce overtime, even if it means the business isn’t required to pay as much if they shift the employees working overtime around.

My response to all of this is pretty straightforward. An employee who agrees to a contract to work forty hours each week and then proceeds to do just that for $50,000 a year is making $50,000 / (52 * 40) ≈ $24 an hour. Another employee who agrees to the same contract but who is asked to work evenings and weekends, averaging 50 hours of work a week, is making $50,000 / (52 * 50) ≈ $19 an hour. This makes sense; 25% more hours worked for the same amount of money means a 20% decrease in hourly pay.
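The same calculation as a short sketch, in case you want to plug in your own numbers. The function name is mine, and the 52-week year with no unpaid time off is the same simplifying assumption used above:

```python
def effective_hourly_rate(annual_salary, hours_per_week, weeks_per_year=52):
    """Salary divided by the hours actually worked in a year."""
    return annual_salary / (hours_per_week * weeks_per_year)


contract_rate = effective_hourly_rate(50_000, 40)  # the hours agreed to
overtime_rate = effective_hourly_rate(50_000, 50)  # the hours actually asked for

# Fraction of hourly pay lost by working 50 hours instead of 40.
pay_cut = 1 - overtime_rate / contract_rate

print(round(contract_rate, 2))  # 24.04
print(round(overtime_rate, 2))  # 19.23
print(round(pay_cut, 2))        # 0.2
```

The pay cut depends only on the ratio of hours, not on the salary: 40 / 50 = 0.8, so every salaried employee working those hours loses a fifth of their hourly rate.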

A government or business can come along and say “we’re paying you for a certain amount of work, not a certain number of hours”, but this isn’t entirely accurate. If it were, an employee could leave the office after getting their work done. This rarely happens for “salaried exempt” employees. It’s more accurate to say that “salaried exempt” means working a minimum of forty hours a week and a maximum of whatever the managers of the company ask them to work.

While I don’t think the law is wrong to permit what it does, I think people should be a bit wiser than merely repeating what the law says to justify the behavior of companies. I understand that overtime is sometimes required. Companies can’t anticipate everything that might get in the way of an important deadline, and sometimes there isn’t time to hire and teach new employees (who would need to be laid off once the deadline is achieved anyway). This is fine and even fair as an emergency tactic, but it is a terrible policy for normal work.

I’ve often seen companies require employees to work extra hours to avoid hiring new employees, even though the employees working overtime were hired under the pretense of working forty hours a week. It might be legal, but it isn’t fair. I don’t think the government should come and sue the companies doing this sort of thing, but the employees working the mandatory overtime should probably look for jobs elsewhere. The market has already begun correcting this abuse, and companies are even advertising their commitment to a forty-hour workweek as a perk.

Medieval peasants worked fewer hours than we wealthy Americans do, and it’s probably part of the cause of our moral decay as a civilization that we give so little time to genuine rest. Companies expect their employees to give up anything to get their jobs done when it turns out that many of the things employees give up are more valuable than the work.