Several years back, my colleague Ben and I developed a Chrome extension for editorial quality assurance. Ben provided the actual coding ability, and I was there to try to define the logic for the various rules we wanted to detect. We started with relatively simple checks, like preferred words, spelling and capitalisation, heading and link length, and some basic metadata checks, but then we thought, ‘it’d be cool if this could detect passive voice’…
So, how did we do it?
Consider the passive voice below:
The story was written by Kermit.
The ‘was written’ bit was the key to being able to detect it. This pattern looks like:
Auxiliary verb + Past participle
The auxiliary verb forms we looked for included:
- has been
- will be
- should be
- can be
- can not be
- can’t be
- could be
- couldn’t be
- could not be
- must be
- must not be
- mustn’t be
- was not
- were not
- has not been
- will not be
- shouldn’t be
I’m probably forgetting a few, but that’s most of them, as far as we’re likely to need. As we weren’t using any fuzzy matching, the contraction forms (e.g. wasn’t) needed to be called out specifically.
For the past participles, we used the hundred most common verbs in English, plus a bunch of ones common to government… Things like ‘implemented’ and ‘delivered’.
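To make the idea concrete, here's a rough sketch of how that matching could look in plain JavaScript. This is not Ben's actual code, just an illustration under a few assumptions: the function names (`buildPassivePattern`, `findPassiveVoice`) are made up, the participle list is a tiny stand-in for the real one, and I've added plain ‘was’ and ‘were’ to the auxiliaries so the Kermit example matches. It also treats curly and straight apostrophes as the same thing, which the original may or may not have done.

```javascript
// Hypothetical word lists. AUXILIARIES is the list from the post plus
// "was" and "were" (an assumption, so "was written" matches).
// PARTICIPLES is a tiny illustrative sample, not the real list.
const AUXILIARIES = [
  "has been", "will be", "should be", "can be", "can not be", "can't be",
  "could be", "couldn't be", "could not be", "must be", "must not be",
  "mustn't be", "was not", "were not", "has not been", "will not be",
  "shouldn't be", "was", "were",
];
const PARTICIPLES = ["written", "implemented", "delivered", "made", "given"];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one case-insensitive pattern: (auxiliary) whitespace (participle).
function buildPassivePattern(auxiliaries, participles) {
  const aux = auxiliaries
    .map((a) => escapeRegExp(a).replace(/ /g, "\\s+"))
    .join("|");
  const pp = participles.map(escapeRegExp).join("|");
  return new RegExp(`\\b(?:${aux})\\s+(?:${pp})\\b`, "gi");
}

// Return every suspected passive construction with its position in the text.
function findPassiveVoice(text) {
  // Swap curly apostrophes for straight ones so "can’t" and "can't" both match.
  // This is a one-for-one character swap, so match positions stay valid.
  const normalised = text.replace(/\u2019/g, "'");
  const pattern = buildPassivePattern(AUXILIARIES, PARTICIPLES);
  return Array.from(normalised.matchAll(pattern), (m) => ({
    index: m.index,
    match: m[0],
  }));
}

console.log(findPassiveVoice("The story was written by Kermit."));
console.log(findPassiveVoice("Kermit wrote the story."));
```

Because there's no fuzzy matching, anything missing from either list simply won't be flagged, which is why every negated and contracted form has to be spelled out.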
Ben incanted the sacred words while facing east and sacrificing a chicken to the gods of jQuery, and hurrah, we were away. It worked brilliantly. When passive voice was detected, it would be highlighted yellow in the page, with the font changed to fuchsia Comic Sans, making it so jarring that the author had to consider whether passive voice was appropriate for that particular situation.
And people who couldn’t grasp passive voice thought that it was voodoo.