Echo Word Utility

An echo is a repeated word that breaks reader immersion. Echoes can be difficult to find in your own writing.

Echoes are a problem I often have, so I built a tool to help me discover where echoes occur. I wrote this blog post to explain how it works, and I hope it can help you, too.

How to Use the Echo Word Detector

Paste an excerpt from your story into the textarea on this page and press Detect.

Paste your excerpt into the textarea and press detect

Potential echoes are highlighted in gray.

Potential echoes

Click on a highlighted word to focus on a specific echo. Click on the word again to show all potential echoes.

Focused on a potential echo

Red words are potential echoes at the beginning or ending of a sentence. These echoes tend to stand out more.

Words that echo at beginning or ending of sentence

There is also a list of every word in the excerpt, sorted by how many times the word is used. You can click on one of those words to highlight every occurrence in your excerpt.

Word count

You shouldn’t trust the echo word utility completely, just like you shouldn’t trust every piece of advice from every critique you receive. It’s a tool. It highlights potential weaknesses; it does not tell you your writing is bad.

A few example excerpts can be analyzed with the click of a button. Since they come from well-known works, they give a quick, unbiased view of how the tool behaves.

Example excerpts

Feel free to paste excerpts from other stories for comparison.

Different types of words have different thresholds. You can even exclude certain word types.

Click the Alter Thresholds button to change thresholds. Uncheck a word type to exclude it from being highlighted. You will need to click the Detect button again for changes to appear.

Alter thresholds


How it Works

Consider the following block of text, where each word is numbered:

Meet Leslie and Dave.
1    2      3   4    

Leslie and Dave like writing.
5      6   7    8    9       

Leslie writes more.
10     11     12  

The text is converted to a data structure where each word maps to a list of the positions where it appears:

{
    meet: 1
    leslie: 2, 5, 10
    and: 3, 6
    dave: 4, 7
    like: 8
    writing: 9
    writes: 11
    more: 12
}
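As a rough illustration, the conversion above could be sketched in Python like this. This is not the utility's actual code: the function name index_words and the tokenizing rule (a word is a run of letters or apostrophes, lowercased) are assumptions made for the example.

```python
import re
from collections import defaultdict

def index_words(text):
    """Map each lowercased word to the 1-based positions where it appears."""
    positions = defaultdict(list)
    # Assumption: a word is a run of letters or apostrophes.
    # The real tool may split words differently.
    words = re.findall(r"[A-Za-z']+", text)
    for position, word in enumerate(words, start=1):
        positions[word.lower()].append(position)
    return dict(positions)

text = "Meet Leslie and Dave. Leslie and Dave like writing. Leslie writes more."
index = index_words(text)
print(index["leslie"])  # [2, 5, 10]
```

Lowercasing means "Leslie" and "leslie" count as the same word, which matches the lowercase keys in the structure above.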

Each word is then analyzed for echoes.

An echo is determined using two thresholds: distance and cluster size.

Calculating distance is done by simple subtraction. Consider leslie, which is the second, fifth, and tenth word:

distance(leslie) = { 5 - 2, 10 - 5 }
distance(leslie) = { 3, 5 }

If the distance threshold is four, then 3 is a hit because three is less than four. The 5 is a miss, since five is not less than four.

distance_threshold = 4
threshold(distance(leslie)) = { 3 < 4, 5 < 4 }
threshold(distance(leslie)) = { true, false }
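The same two steps can be written as a short Python sketch. The names distances and within_threshold are hypothetical, chosen to mirror the distance() and threshold() notation used here.

```python
def distances(positions):
    """Gap between each occurrence and the one before it."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]

def within_threshold(gaps, distance_threshold):
    """True where a gap is strictly less than the distance threshold."""
    return [gap < distance_threshold for gap in gaps]

leslie = [2, 5, 10]
gaps = distances(leslie)
hits = within_threshold(gaps, 4)
print(gaps)  # [3, 5]
print(hits)  # [True, False]
```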

Based on the distance threshold, the first two occurrences of leslie are close to each other, but the last two occurrences are not.

Next, the cluster threshold is applied to determine if there are too many occurrences that are close to each other. In the examples here, a run of close occurrences counts as a cluster when it contains at least that many words.

If we continue the example, with a cluster size of two, we find one cluster:

L = threshold(distance(leslie)) = { 
    true,    // (words 2, 5)
    false    // (words 5, 10)
}

cluster_threshold = 2
clusters(L) = { { 2, 5 } }

In other words, the occurrences of leslie at positions 2 and 5 are considered echoes.
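One way to sketch the clustering step in Python is shown below. It assumes the cluster threshold is the minimum number of occurrences a run must contain, which is how the examples in this post behave; find_clusters is a hypothetical name, not the utility's own function.

```python
def find_clusters(positions, distance_threshold, cluster_threshold):
    """Group consecutive positions whose gaps are under the distance threshold."""
    clusters = []
    current = positions[:1]  # start a run with the first occurrence, if any
    for earlier, later in zip(positions, positions[1:]):
        if later - earlier < distance_threshold:
            current.append(later)  # still close: extend the run
        else:
            # Gap too large: keep the run only if it is big enough.
            if len(current) >= cluster_threshold:
                clusters.append(current)
            current = [later]
    if len(current) >= cluster_threshold:
        clusters.append(current)
    return clusters

leslie_clusters = find_clusters([2, 5, 10], distance_threshold=4, cluster_threshold=2)
print(leslie_clusters)  # [[2, 5]]
```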

Let’s start over, this time with different input and thresholds:

//=======================================
// INPUT
//---------------------------------------
{
    the: 1, 5, 9, 22, 41, 50
}

//=======================================
// CALCULATE DISTANCE
//---------------------------------------
distance_threshold = 10

D = distance(the) =  { 
    4,       // ( 5 -  1)
    4,       // ( 9 -  5)
    13,      // (22 -  9)
    19,      // (41 - 22)
    9        // (50 - 41)
}

//=======================================
// ARE WORDS CLOSE TO EACH OTHER?
//---------------------------------------
T = threshold(D) = {
    4  < 10,   // (words 1, 5)
    4  < 10,   // (words 5, 9)
    13 < 10,   // (words 9, 22)
    19 < 10,   // (words 22, 41)
    9  < 10    // (words 41, 50)
}
T = threshold(D) = {
    true,      // (words 1, 5)
    true,      // (words 5, 9)
    false,     // (words 9, 22)
    false,     // (words 22, 41)
    true       // (words 41, 50)
}

//=======================================
// ARE THERE CLUSTERS?
//---------------------------------------
cluster_threshold = 2

clusters(T) = { 
    { 1, 5, 9 },
    { 41, 50 }
}

//=======================================
// SUMMARY
//---------------------------------------
There are two clusters.
The first is a cluster of three occurrences at positions 1, 5, and 9.
The second is a cluster of two occurrences at positions 41 and 50.
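For readers who want to try the walkthrough themselves, here is a minimal Python sketch that follows the same three stages (distance, threshold, clusters). The function name echo_clusters is an assumption, as is treating the cluster threshold as a minimum run size; the utility itself may be implemented differently.

```python
def echo_clusters(positions, distance_threshold, cluster_threshold):
    """Return runs of close occurrences that meet the cluster threshold."""
    # CALCULATE DISTANCE
    gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
    # ARE WORDS CLOSE TO EACH OTHER?
    hits = [gap < distance_threshold for gap in gaps]
    # ARE THERE CLUSTERS?
    clusters, current = [], []
    for i, hit in enumerate(hits):
        if hit:
            if not current:
                current = [positions[i]]      # a new run starts here
            current.append(positions[i + 1])
        elif current:
            clusters.append(current)          # a miss ends the current run
            current = []
    if current:
        clusters.append(current)
    # Assumption: a run is a cluster when it has at least cluster_threshold words.
    return [run for run in clusters if len(run) >= cluster_threshold]

index = {"the": [1, 5, 9, 22, 41, 50]}
result = {word: echo_clusters(pos, 10, 2) for word, pos in index.items()}
print(result)  # {'the': [[1, 5, 9], [41, 50]]}
```

Raising cluster_threshold to 3 in this sketch would keep only the first cluster, since the second contains just two occurrences.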
 