Last week, I analyzed the relative lengths of paragraphs across a representative selection of fantasy novels. (You can read the details here.) This week, I’m turning my attention to a related measurement: the dialogue passages. I’ve often felt that my writing was rather dialogue heavy, but is that really the case, or am I just reflecting my secret belief that I was really intended to write for the stage?

To the collection of titles I examined last time, I’ve added two new ones and removed one. Peter S. Beagle’s The Last Unicorn and Mike Resnick’s Santiago were added, since I acquired them recently in a Humble Bundle. Whirlwind in the Thorn Tree, by S. A. Hunt, was removed due to a typographic issue: a number of lengthy passages from a book-within-the-book were set in double quotes, which made it impossible to isolate the actual dialogue passages.

The analysis consisted of extracting all double-quoted passages from the work, and applying the same word-count metrics that were used in the previous article, to provide basic descriptive statistics for the length of dialogue utterances one finds in the fantasy genre.
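For the curious, here is a minimal sketch of how an extraction like this might be done in Python. It is not the actual tool behind these numbers: the function names, the regular expression, and the nearest-rank 95th percentile used for the mostly-less-than figure are all assumptions made for illustration.

```python
import math
import re
import statistics

# Matches a passage in straight or curly double quotes (assumed convention).
QUOTE_RE = re.compile(r'"([^"]+)"|“([^”]+)”')

def extract_utterances(text):
    """Return the contents of every double-quoted passage in the text."""
    return [m.group(1) or m.group(2) for m in QUOTE_RE.finditer(text)]

def utterance_stats(text):
    """Word-count statistics over the quoted passages, or None if there are none."""
    lengths = sorted(len(u.split()) for u in extract_utterances(text))
    if not lengths:
        return None
    # Nearest-rank 95th percentile: 95% of utterances are this long or shorter.
    idx = max(0, math.ceil(0.95 * len(lengths)) - 1)
    return {
        "max": lengths[-1],
        "avg": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "mostly_less": lengths[idx],
    }
```

A sketch like this pairs each opening quote with the next closing quote, so it would stumble on exactly the kind of typographic quirks that forced me to drop a title, such as long quoted passages that are not dialogue at all.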

How The Numbers Shape Up

Longest Utterance: The award for longest-winded speech goes to Strider, in Lord of the Rings, for his recounting of the tale of Beren and Lúthien, which he did in 469 uninterrupted words. The book with the most short-winded characters was The Final Empire (Mistborn #1). The closest that story comes to a filibuster is Sazed’s 112-word recap near the end, describing how the Lord Ruler’s magical trickery worked.

Average Utterance Length: In marked contrast to last week’s analysis of overall paragraph lengths, it seems that fantasy writers are much more similar to one another as regards their dialogue lengths. The Dark Tower is the tersest of the contenders, averaging just 7.5 words per speech, while The Lord of the Rings tops the list with the most verbose characters, averaging 19.1 words per rant.

Mostly Less Than: When we look at the mostly-less-than length, we see almost exactly the same order as we did for average length, with Santiago just barely edging out The Dark Tower for the brevity crown: 95% of its utterances are 28 words or fewer. Again, LoTR gets the windbag award, needing 71 words to capture 95% of its dialogue.

Dialogue Density: One way in which these books differ significantly from one another is in the proportion of narration to dialogue. Most of the titles are less than 50% dialogue, ranging from a low, narrative-heavy score of 13% dialogue for The Wizard of Earthsea, to a much chattier 37% dialogue in The Final Empire (Mistborn #1). But the real oddballs are Santiago, which is 59% dialogue, and The Last Unicorn, which scores a whopping 63% talky-talk. These two outliers seemed so at odds with the rest of the group that I had to go into the text and examine it myself, to be sure that there wasn’t some kind of bug in my analysis tool, but my visual inspection did reveal an awful lot of dialogue in these two books.
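One plausible way to compute a dialogue percentage like this is to divide the words found inside double quotes by the total words in the text. The sketch below does that in Python; measuring by words (rather than by characters or paragraphs) and the name dialogue_ratio are my assumptions for illustration, not a description of the actual tool.

```python
import re

# Matches a passage in straight or curly double quotes (assumed convention).
QUOTE_RE = re.compile(r'"([^"]+)"|“([^”]+)”')

def dialogue_ratio(text):
    """Fraction of the text's words that fall inside double-quoted passages."""
    total_words = len(text.split())
    if total_words == 0:
        return 0.0
    quoted_words = sum(
        len((m.group(1) or m.group(2)).split())
        for m in QUOTE_RE.finditer(text)
    )
    return quoted_words / total_words
```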

As for Strange Places, I am comforted to see that it is not an oddball outlier, sitting comfortably in the middle of the pack with 26% dialogue. Not as terse as Earthsea, nor as chatty as Unicorn. Something tells me I can safely forget those worries about being too dialogue-heavy for novels, although that also puts a spike in the eye of my dream of discovering I had a “natural” instinct for the stage. Oh well.

Speaking Characters: Another dimension in which these titles varied dramatically was the number of characters who are given speaking roles. This was counted by examining the speech attribution tags themselves. Any dialogue passage that ended with “said Xxxx,” or “Xxxx said” after the closing quote was considered to be a direct dialogue attribution, and if ‘Xxxx’ was capitalized, it was assumed to be attributed to the name of a character. By collecting and counting all of these occurrences, the analysis tool was able to give an approximate count of the characters with speaking roles. Obviously, any character who only ever “whined,” “barked,” “demanded” or even “asked,” but never actually “said” anything, would not be counted. The other cause for imprecision was double counting. No attempt was made to merge attributions for two labels that were in fact the same character. So, Strider and Aragorn, for example, were counted as distinct characters.
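The attribution rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tool itself: the regular expression, the NAME pattern for a capitalized word, and the function name count_speakers are all hypothetical.

```python
import re
from collections import Counter

# A capitalized word, allowing internal apostrophes and hyphens (assumed).
NAME = r"[A-Z][\w’'-]*"

# A closing double quote, then either "said Name" or "Name said".
ATTRIB_RE = re.compile(
    r'["”]\s+(?:said\s+(' + NAME + r')|(' + NAME + r')\s+said\b)'
)

def count_speakers(text):
    """Count direct 'said' attributions per capitalized label."""
    counts = Counter()
    for m in ATTRIB_RE.finditer(text):
        counts[m.group(1) or m.group(2)] += 1
    return counts
```

The number of speaking roles is then len(count_speakers(text)). As in the description above, "he said" and "asked Sam" are ignored, and no attempt is made to merge aliases, so Strider and Aragorn would still be tallied separately.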

Given these caveats, I was nevertheless surprised to see such a broad range of cast sizes in this group. The smallest chorus of characters was found in The Wizard of Earthsea, with only 7 speaking roles, though it is probably no surprise at all that The Lord of the Rings set the bedlam standard with 76 distinct voices. (One day, I’ll have to compare it to A Song of Ice and Fire, just to see who has the larger cast of voices.) The average number across the complete collection was 24 speaking roles, and the median was 18. So my slight concern that Strange Places might have more voiced characters than normal is shown to be completely groundless. (Funny how we can obsess about minutiae, isn’t it?)

Tighter Graphs: One last thing struck me, as I was reviewing the numbers this time around: the pattern of dialogue lengths is different from the pattern of overall paragraph lengths. If you’ll recall, in the previous article, I pointed out that the distribution graphs could be broken down into three basic shapes. But when we look at dialogue only, there is only one shape. True, the books that tend toward short paragraphs also tend to have short dialogue passages, and the more expansive titles have more of the longer speeches, but the shapes of the graphs are almost identical, appearing essentially the same as the one shown here, from Lord of the Rings. The shape shows a strong skewing toward the very shortest utterances, only a very few of the longer speeches, and a rapid, scooped curve from the short lengths to the longer ones. It suggests to me that these writers employ different rules when it comes to writing dialogue vs. narration. I don’t know what those rules might be, but there does appear to be a difference.
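If you want to look at the shape of a distribution like this for a text of your own, a rough text histogram is enough to see the skew. The following is a minimal Python sketch; the bin width of 5 words, the function names, and the plain-text rendering are assumptions for illustration and are not how the graphs in this article were produced.

```python
from collections import Counter

def length_histogram(lengths, bin_width=5):
    """Count utterances per word-count bin; keys are each bin's lower bound."""
    bins = Counter((n // bin_width) * bin_width for n in lengths)
    return dict(sorted(bins.items()))

def print_histogram(hist, bin_width=5, scale=1):
    """Print one row of '#' marks per bin, one mark per `scale` utterances."""
    for start, count in hist.items():
        label = f"{start}-{start + bin_width - 1}"
        print(f"{label:>9} {'#' * (count // scale)}")
```

Feeding it a list of utterance word counts and calling print_histogram on the result should show a tall first row and a long, thin tail if the text follows the pattern described above.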

And for those who were interested in the ‘spiking’ we saw in some of the graphs from last time, yes, there is still some degree of spiking involved here, in the dialogue distributions, as well. I still think this indicates a tendency to trim paragraphs for length, after the writing has been done. But the effect is less extreme and does not significantly alter the shape of the basic curve.

Is there anything shocking in these numbers? No. Not to me anyway. But I find them oddly comforting, just the same.

The Data

For those who want to explore the data more closely, here’s the table of length distribution stats for each of the titles in this analysis.

Title | Max | Avg | Median | Mostly Less | Ratio | Speakers
