GNU diction Database

My database for GNU diction flags grammar, punctuation, and style problems in Standard American English and suggests replacements. GNU diction is a command-line tool that runs on most desktop operating systems.


File: diction.txt
Size: ~900 kilobytes
Type: plain text

Contents and Resources

The groups contain sentences, phrases, and words. Most lines don't have a suggested replacement. Lines that use = never reference another line outside their group. I sort entries in this order.

  1. Foreign Language
  2. Dialects
  3. Mistakes
  4. Suggestions

Foreign Language

Prefer familiar English words to borrowed foreign words. The foreign language section uses the borrowed meaning, not the original-language meaning. Ironically, many of the familiar words were probably borrowed too. There is some slang mixed in, but all of it is based on a foreign word.

Dialects

The database is for Standard American English, so anything from another dialect is flagged as wrong. That includes different spellings, slang, mistakes, or anything else in that dialect.

Mistakes

Unlike suggestions, these are problems that should be fixed. Being vulgar or using slang isn't wrong, but using two consecutive articles or putting an apostrophe in the wrong place is.

Suggestions

Jargon, self-censorship, slang, double negatives, and so on are all here. These are situational or possible problems rather than outright mistakes.




Try this command if you know what you are doing and want to test the database immediately.

$ diction -nsf diction.txt example.txt | \
  fmt | more

Continue reading if you need further explanation.

Installing GNU diction

If you are using GNU/Linux, BSD, or another operating system with a software repository, GNU diction is probably in there. If it isn't in your repository, you can download and compile it. The GNU diction and style website has source code distributions that include instructions on how to compile it.

For Windows users there is GnuWin32, which has diction as an extra package. I don't use Windows, so I have never tested it.

OS X users might be able to follow the general instructions for installing GNU command-line tools with Homebrew. As with Windows, I have not tested OS X.
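
Whichever route you take, it can help to confirm that diction is actually on your PATH before going further. The following is a minimal sketch using only the standard `command -v` shell builtin; the helper name `have_tool` is made up for this example.

```shell
# have_tool NAME: succeed if NAME resolves to a command on PATH.
# (have_tool is a hypothetical helper name, not part of diction.)
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool diction; then
  echo "diction found at: $(command -v diction)"
else
  echo "diction not found; install it from your repository or build it from source"
fi
```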

Using GNU diction

After installing diction, the first thing to read is the manual. Use this command to see it.

$ man diction

With no options, diction prints lines with brackets around flagged words and phrases.

$ diction example.txt

Checking 20,000 Leagues Under the Sea with that command would print lines like this.

pg164.txt:72: For some time past vessels had been met by "an enormous thing," a long object, spindle-shaped, occasionally phosphorescent, and infinitely larger and more rapid in [its] movements [than] a whale.

The -s or --suggestions switch does what it sounds like and enables suggestions. In the database the suggestions are on the right side after a tab.

$ diction -s example.txt

The output now matches the database with the flagged words on the left and the suggestions on the right.

pg164.txt:72: For some time past vessels had been met by "an enormous thing," a long object, spindle-shaped, occasionally phosphorescent, and infinitely larger and more rapid in [its -> = "it is" or "its"?] movements [than -> (examine sentences containing "than" to insure that they are not missing words: I love my father more than my mother. I love my father more than my mother loves my father. I love my father more than I love my mother)] a whale.

Using My Database

The database offers some basic automated checks and sometimes a suggested replacement. Don't use it as a style guide or dictionary.

Some suggestions have the wrong tense, number, or participle. I made similar entries point to a single one to save time and make fewer mistakes when editing the database. Adjust the suggestion to fit the sentence, or rewrite the sentence.

Load my database with diction's -f switch, which selects a user database. Use the -n switch to disable the default database. This example disables the default database, uses my database, enables suggestions, formats the text for reading, and pages through the results.

$ diction -nsf diction.txt example.txt | \
  fmt | more

The output would look like this.

pg164.txt:424: Besides," thought I, "all roads lead back to Europe; and the unicorn may be [amiable -> friendly (complex)] enough to hurry me towards the coast of France.
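
Because every suggestion ends with its group in parentheses, the output can also be tallied by group. This is a rough sketch, not a tested part of the database: it assumes the group is the last parenthesized text before each closing bracket, it relies on `grep -o` (widely available but not required by POSIX), and the function name `count_groups` is made up. The second sample line is invented for illustration; only the "amiable" line comes from the output above. Run it before `fmt`, since wrapped lines could split a group name.

```shell
# count_groups: read diction -s output on stdin and print how many
# times each "(group)" appears right before a closing bracket.
# (count_groups is a hypothetical helper name.)
count_groups() {
  grep -o '([^()]*)]' | tr -d ']' | sort | uniq -c | sort -rn
}

# Sample output: the first line is shortened from the example above,
# the second line is invented for this sketch.
sample='pg164.txt:424: the unicorn may be [amiable -> friendly (complex)] enough to hurry me.
example.txt:1: It was an [arduous -> hard (complex)] task to pick a [colour -> color (British English)].'

tally=$(printf '%s\n' "$sample" | count_groups)
printf '%s\n' "$tally"
```

With real input, the same function would sit at the end of the pipeline in place of `fmt | more`, for example `diction -nsf diction.txt example.txt | count_groups`.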

Ignoring a Group

Using grep is one way to ignore output from a group. This example ignores output flagged as British English.

$ diction -nsf diction.txt example.txt | \
  grep -v "(British English)" | fmt | more

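That command ignores the default database, uses my database, enables suggestions (which also shows the groups), keeps only the output lines that do not contain (British English), formats the text for reading, and pages through the results. Note that grep drops the whole output line, so any other flags on that line are hidden too.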
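
To ignore several groups at once, the same idea extends to one `grep -e` pattern per group. The following is a sketch under assumptions: the function name `ignore_groups` is made up, and the first two sample lines are invented for illustration (only the group names "(British English)", "(Leet)", and "(complex)" appear in this document). The `-F` option makes grep treat the parentheses as literal text.

```shell
# ignore_groups GROUP...: drop every stdin line that mentions any of
# the named groups. Needs at least one group argument.
# (ignore_groups is a hypothetical helper name.)
ignore_groups() (
  # Rewrite the argument list as: -e GROUP1 -e GROUP2 ...
  n=$#
  while [ "$n" -gt 0 ]; do
    g=$1
    shift
    set -- "$@" -e "$g"
    n=$((n - 1))
  done
  grep -F -v "$@"
)

# Sample output; the first two lines are invented for this sketch.
sample='example.txt:1: She picked her favourite [colour -> color (British English)].
example.txt:2: He called himself a [haxor -> hacker (Leet)].
example.txt:3: The unicorn may be [amiable -> friendly (complex)] enough.'

kept=$(printf '%s\n' "$sample" | ignore_groups "(British English)" "(Leet)")
printf '%s\n' "$kept"
```

In a real pipeline this would replace the single grep, for example `diction -nsf diction.txt example.txt | ignore_groups "(British English)" "(Leet)" | fmt | more`.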

The database is licensed under the WTFPL version 2.

File Format

The documentation for the latest release of GNU diction doesn't describe the database format, but it is simple. Matching ignores case when the entry in the database is all lowercase. An equals sign at the start of the second column, after the tab, prints the text of the entry named after it. The manual describes a switch for beginner mistakes, but I don't use that ability.

SPACE text to find TAB suggestions

letters at the end of a word TAB suggestions

SPACE text to find TAB = show text of this entry

letters at the end of a word TAB = show text of this entry

I formatted my suggested text with categories in parentheses and comments in braces because GNU diction uses brackets when inserting text. Every entry has a group to identify the problem, to let you process the output with tools such as grep, and to make the database easier to edit.

This example flags any word that ends with xor. If suggestions are on, it prints everything after the tab.

xor TAB {except for flexor, Luxor, and plexor}(Leet)
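
To experiment with the format, a tiny test database can be written with `printf`, which makes the tab explicit as `\t`. This is only a sketch: the " amiable" entry is a guess at how such a line might look, based on the sample output earlier, and the awk check simply counts lines that do not have exactly two tab-separated columns.

```shell
# Write a two-entry test database to a temporary file.
db=$(mktemp)

# Leading space: text to find, then a tab, then the suggestion.
# (This entry is an assumption, not copied from diction.txt.)
printf ' amiable\tfriendly (complex)\n' > "$db"

# No leading space: letters at the end of a word.
printf 'xor\t{except for flexor, Luxor, and plexor}(Leet)\n' >> "$db"

# Count lines that do not have exactly two tab-separated columns.
bad=$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$db")
echo "malformed lines: $bad"

# With diction installed, the file could then be tried with:
#   diction -nsf "$db" example.txt
```

Remove the temporary file with `rm "$db"` when you are done.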

Made by Mr. Satterly
With help from Mrs. Satterly