Coding passive voice detection

Several years back, my colleague Ben and I developed a Chrome extension for editorial quality assurance. Ben provided the actual coding ability, and I was there to try and define the logic for the various rules we were trying to detect. We started with relatively simple checks, like preferred words, preferred spelling and capitalisation, checks for heading and link length, and some basic metadata checks… Then we thought, ‘it’d be cool if this could detect passive voice’…

So, how did we do it?

There are simple ways of explaining passive voice, but these are not easily replicated in JavaScript… So we had to bite into the grammatical structure.

Consider the passive voice below:

The story was written by Kermit.

The bold bit (‘was written’) was the key to being able to detect it. This pattern looks like:

Auxiliary verb + Past participle

[An auxiliary verb is a verb used just to carry the tense; it’s not your traditional ‘action’ word. The past participle is… Well, here’s an example: Today I eat. Yesterday I ate. I have eaten. Eaten is the past participle.]

So to detect this form with JavaScript, we just needed lists of all possible auxiliary verbs, and all past participles…

Auxiliary verbs

was
were
has been
will be
should be
can be
can not be
can’t be
could be
couldn’t be
could not be
must be
must not be
mustn’t be
was not
wasn’t
were not
has not been
will not be
shouldn’t be
…

I’m probably forgetting a few, but that’s most of them, as far as we’re likely to need. As we weren’t using any fuzzy matching, the contraction forms (e.g. wasn’t) needed to be called out specifically.

For the past participles, we used the hundred most common verbs in English, plus a bunch of ones common to government… Things like ‘implemented’ and ‘delivered’.
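To make the ‘auxiliary verb + past participle’ idea concrete, here is a minimal sketch of how two lists like these could be turned into a single regular expression. It is not the code Ben wrote: the names (AUXILIARIES, PAST_PARTICIPLES, findPassiveVoice) are made up for illustration, and the lists are tiny samples rather than the full sets we used.

```javascript
// Illustrative samples only; the real lists were much longer.
const AUXILIARIES = [
  'was', 'were', 'has been', 'will be', 'should be',
  'can be', 'was not', 'wasn’t', 'were not'
];
const PAST_PARTICIPLES = ['written', 'eaten', 'implemented', 'delivered'];

// Escape any regex metacharacters so each term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build: word boundary, any auxiliary, whitespace, any participle, word boundary.
const PASSIVE_PATTERN = new RegExp(
  '\\b(?:' + AUXILIARIES.map(escapeRegExp).join('|') + ')' +
  '\\s+' +
  '(?:' + PAST_PARTICIPLES.map(escapeRegExp).join('|') + ')\\b',
  'gi'
);

// Return every matching phrase in the text (an empty array if none).
function findPassiveVoice(text) {
  return text.match(PASSIVE_PATTERN) || [];
}

console.log(findPassiveVoice('The story was written by Kermit.'));
```

Because this matches exact strings only, a contraction such as ‘wasn’t’ is found only if it appears in the list with the same apostrophe character the text uses.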

Ben incanted the sacred words while facing east and sacrificing a chicken to the gods of jQuery, and hurrah, we were away. It worked brilliantly. When passive voice was detected, it was highlighted yellow in the page, and the font was changed to fuchsia Comic Sans, making it so jarring that the author had to consider whether passive voice was appropriate for that particular situation.
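For the curious, the highlighting step might look something like the sketch below. The extension itself changed the page with jQuery; this version is a stand-in that builds an HTML string instead, so it runs without a browser. The names (HIGHLIGHT_STYLE, DEMO_PATTERN, highlightPassive), the class name and the tiny demo pattern are all assumptions for illustration.

```javascript
// Assumed styling, based on the description: yellow highlight, fuchsia Comic Sans.
const HIGHLIGHT_STYLE =
  "background: yellow; color: fuchsia; font-family: 'Comic Sans MS', cursive;";

// Tiny demo pattern; a real check would be built from the full word lists.
const DEMO_PATTERN = /\b(?:was|were)\s+(?:written|detected|highlighted)\b/gi;

// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap each match in a styled <mark>, escaping everything else.
// The pattern must have the global flag, as matchAll requires it.
function highlightPassive(text, pattern = DEMO_PATTERN) {
  let html = '';
  let last = 0;
  for (const match of text.matchAll(pattern)) {
    html += escapeHtml(text.slice(last, match.index));
    html += '<mark class="passive-voice" style="' + HIGHLIGHT_STYLE + '">' +
      escapeHtml(match[0]) + '</mark>';
    last = match.index + match[0].length;
  }
  return html + escapeHtml(text.slice(last));
}

console.log(highlightPassive('The story was written by Kermit.'));
```

Escaping the text around each match keeps any markup in the original content from being interpreted when the result is put back into a page.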

And people who couldn’t grasp passive voice thought that it was voodoo.

Epilogue—But wait, it’s not that simple…

[An update from November 2017]

In the above text, besides the specific examples, there are 6 instances of passive voice. Can you detect them all?

One of these wouldn’t have been picked up by our Chrome extension, even if we were checking for every possible past participle. Who knows why?

Put your smarty pants on and let me know in the comments. [Hint, it’s in the third paragraph.]

By Rory | 2017-11-08T08:52:41+10:00 | March 16th, 2015 | Coding, Grammar | 6 Comments

About the Author: Rory

Writer, editor, musician, plain English evangelist, content ninja for hire, and general web guy, Rory does lots of things, when he has time...

6 Comments

Ben Boyle March 16, 2015 at 11.00pm

Good times! We should publish those tests, I’ll look into that today 🙂
Phillip Lincoln March 17, 2015 at 2.38am

You two are a couple of web savants. The people thank you.
Ben Boyle March 17, 2015 at 6.40am

Tests are now public! (Not the tool itself unfortunately)

Here are the sacred words: https://github.com/qld-gov-au/qgov_qa_tests/blob/master/tests/editorial.js#L648

And the list of passive voice and past participle terms:

– https://github.com/qld-gov-au/qgov_qa_tests/blob/master/tests/editorial.js#L428
– https://github.com/qld-gov-au/qgov_qa_tests/blob/master/tests/editorial.js#L458
Sharon L March 17, 2015 at 8.40am

Agree with Phil! How can we get the tool?
Rory March 18, 2015 at 12.30am

If you’re still with QG, you can get access from the online services team or editorial team at the place I used to work…

🙂

Try editorial@…
Helping people make websites – Ben Boyle lives here! March 1, 2016 at 1.53am

[…] 2012 we tackled editorial quality by testing for passive voice, non-preferred terms, number formats and more. Computers are great at consistently performing the […]
