Echo Word Utility

An echo is a repeated word that breaks reader immersion. Echoes can be difficult to find in your own writing.

Echoes are a problem I often have, so I built a tool to help me discover where echoes occur. I wrote this blog post to explain how it works, and I hope it can help you, too.

How to Use the Echo Word Detector

Paste an excerpt from your story into the textarea on this page and press Detect.

Paste your excerpt into the textarea and press detect

Potential echoes are highlighted in gray.

Potential echoes

Click on a highlighted word to focus on a specific echo. Click on the word again to show all potential echoes.

Focused on a potential echo

Red words are potential echoes at the beginning or ending of a sentence. These echoes tend to stand out more.

Words that echo at beginning or ending of sentence

There is also a list of every word in the excerpt, sorted by how many times the word is used. You can click on one of those words to highlight every occurrence in your excerpt.

Word count

You shouldn’t trust the echo word utility completely, just like you shouldn’t trust every piece of advice from every critique you receive. It’s a tool. It highlights potential weaknesses; it does not tell you your writing is bad.

A few example excerpts can be analyzed with the click of a button. This provides a quick, unbiased view of how the tool works, since it’s analyzing well-known works.

Example excerpts

Feel free to paste excerpts from other stories for comparison.

Different types of words have different thresholds. You can even exclude certain word types.

Click the Alter thresholds button to change thresholds. Uncheck a word type to exclude it from being highlighted. You will need to click the Detect button again for changes to appear.

Alter thresholds


How it Works

Consider the following block of text, where each word is numbered:

Meet Leslie and Dave.
1    2      3   4    

Leslie and Dave like writing.
5      6   7    8    9       

Leslie writes more.
10     11     12  

The text is converted to a data structure where each word is a list of where it appears:

{
    meet: 1
    leslie: 2, 5, 10
    and: 3, 6
    dave: 4, 7
    like: 8
    writing: 9
    writes: 11
    more: 12
}
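
As a rough illustration, here is a minimal Python sketch of that conversion. It is not the tool's actual code: the function name index_words and the rule for what counts as a word (a run of letters or apostrophes, lowercased) are assumptions made for this example.

```python
import re
from collections import defaultdict


def index_words(text):
    """Map each lowercased word to the 1-based positions where it appears."""
    positions = defaultdict(list)
    # Assumption: a word is a run of letters or apostrophes.
    # The real tool may split words differently.
    words = re.findall(r"[A-Za-z']+", text.lower())
    for number, word in enumerate(words, start=1):
        positions[word].append(number)
    return dict(positions)


index = index_words(
    "Meet Leslie and Dave. Leslie and Dave like writing. Leslie writes more."
)
print(index["leslie"])  # [2, 5, 10]
print(index["and"])     # [3, 6]
```

Numbering starts at 1 so the positions match the numbered text above.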

Each word is then analyzed for echoes.

An echo is determined using two thresholds: distance and cluster size.

Calculating distance is done by simple subtraction. Consider leslie, which is the second, fifth, and tenth word:

distance(leslie) = { 5 - 2, 10 - 5 }
distance(leslie) = { 3, 5 }

If the distance threshold is four, then 3 is a hit because three is less than four. The 5 is a miss, since five is not less than four.

distance_threshold = 4
threshold(distance(leslie)) = { 3 < 4, 5 < 4 }
threshold(distance(leslie)) = { true, false }
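
The same two steps can be sketched in Python. The helper names distances and is_close are made up for this example, and the strict less-than comparison follows the walk-through above.

```python
def distances(positions):
    """Gap between each occurrence and the one before it."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def is_close(gaps, distance_threshold):
    """True where a gap is strictly less than the distance threshold."""
    return [gap < distance_threshold for gap in gaps]


gaps = distances([2, 5, 10])
print(gaps)               # [3, 5]
print(is_close(gaps, 4))  # [True, False]
```

A word with n occurrences always produces n - 1 gaps, so a word that appears once has nothing to compare.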

Based on the distance threshold, the first two occurrences of leslie are close to each other, but the last two occurrences are not.

Next, the cluster threshold is applied to determine if there are too many occurrences that are close to each other.

If we continue the example, with a cluster size of two, we find one cluster:

L = threshold(distance(leslie)) = { 
    true,    // (words 2, 5)
    false    // (words 5, 10)
}

cluster_threshold = 2
clusters(L) = { { 2, 5 } }

In other words, the occurrences of leslie at positions 2 and 5 are considered echoes.
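
Here is a small Python sketch of the clustering step, working from the true/false list. The function name clusters mirrors the notation above, but the body is my own reading of the examples: I assume the cluster threshold is the minimum number of occurrences a group needs before it counts as a cluster.

```python
def clusters(positions, close_flags, cluster_threshold):
    """Group neighbouring occurrences whose gap passed the distance test."""
    if not positions:
        return []
    found = []
    current = [positions[0]]
    # close_flags[i] says whether positions[i] and positions[i + 1] are close.
    for position, close in zip(positions[1:], close_flags):
        if close:
            current.append(position)
        else:
            # Assumption: a group counts only if it has at least
            # cluster_threshold occurrences.
            if len(current) >= cluster_threshold:
                found.append(current)
            current = [position]
    if len(current) >= cluster_threshold:
        found.append(current)
    return found


print(clusters([2, 5, 10], [True, False], 2))  # [[2, 5]]
```

The tenth word is left out because it sits alone: one occurrence is below the cluster threshold of two.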

Let’s start over, this time with different input and thresholds:

//=======================================
// INPUT
//---------------------------------------
{
    the: 1, 5, 9, 22, 41, 50
}

//=======================================
// CALCULATE DISTANCE
//---------------------------------------
distance_threshold = 10

D = distance(the) =  { 
    4,       // ( 5 -  1)
    4,       // ( 9 -  5)
    13,      // (22 -  9)
    19,      // (41 - 22)
    9        // (50 - 41)
}

//=======================================
// ARE WORDS CLOSE TO EACH OTHER?
//---------------------------------------
T = threshold(D) = {
    4  < 10,   // (words 1, 5)
    4  < 10,   // (words 5, 9)
    13 < 10,   // (words 9, 22)
    19 < 10,   // (words 22, 41)
    9  < 10    // (words 41, 50)
}
T = threshold(D) = {
    true,      // (words 1, 5)
    true,      // (words 5, 9)
    false,     // (words 9, 22)
    false,     // (words 22, 41)
    true       // (words 41, 50)
}

//=======================================
// ARE THERE CLUSTERS?
//---------------------------------------
cluster_threshold = 2

clusters(T) = { 
    { 1, 5, 9 },
    { 41, 50 }
}

//=======================================
// SUMMARY
//---------------------------------------
There are two clusters.
The first is a cluster of three occurrences at positions 1, 5, and 9.
The second is a cluster of two occurrences at positions 41 and 50.
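
To try this walk-through with other thresholds, the whole thing can be folded into one pass over the positions. This is a minimal Python sketch, not the tool's real implementation. The name find_echoes is made up, and treating the cluster threshold as a minimum cluster size is an assumption drawn from the examples.

```python
def find_echoes(positions, distance_threshold, cluster_threshold):
    """Return runs of nearby occurrences that are large enough to be clusters."""
    runs = [[positions[0]]] if positions else []
    for position in positions[1:]:
        if position - runs[-1][-1] < distance_threshold:
            runs[-1].append(position)   # close to the previous occurrence
        else:
            runs.append([position])     # too far away, start a new run
    # Assumption: keep a run only if it has at least cluster_threshold words.
    return [run for run in runs if len(run) >= cluster_threshold]


the = [1, 5, 9, 22, 41, 50]
print(find_echoes(the, 10, 2))  # [[1, 5, 9], [41, 50]]
print(find_echoes(the, 10, 3))  # [[1, 5, 9]]
```

Raising the cluster threshold to three drops the pair at 41 and 50, which shows how a stricter setting can quiet the highlighting for common words like "the".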
 