
Amazon Introduces Statistical Comparison Features

Amazon.com has launched some pretty cool features that I only recently became aware of: Concordance, Statistically Improbable Phrases, and Text Stats. They appear for books whose full text is available to Amazon.

When available, these features are listed under the "Inside this Book" heading. To see them in action, take a look at the record for War and Peace. For more information, read the Washington Post article.

The following summaries are excerpted from Amazon:

Concordance is an alphabetized list of the most frequently occurring words in a book. The font size of a word is proportional to the number of times it occurs in the book. Hover your mouse over a word to see how many times it occurs, or click on a word to see a list of book excerpts containing that word.
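To make the concordance description concrete, here is a minimal Python sketch of the idea: count the words, keep the most frequent ones, sort them alphabetically, and scale a font size in proportion to each count. This is only my illustration, not Amazon's code. The function name `concordance`, the `top_n` and `max_pt` parameters, and the tokenizing rule are all assumptions, and the sketch filters out no words at all.

```python
import re
from collections import Counter

def concordance(text, top_n=100, max_pt=40):
    """Return (word, count, font_size) tuples, alphabetized.

    Hypothetical sketch: font size is proportional to the count,
    with the most frequent word drawn at max_pt points.
    """
    # Crude tokenizer (an assumption): lowercase runs of letters,
    # optionally with one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    most_common = Counter(words).most_common(top_n)
    if not most_common:
        return []
    peak = most_common[0][1]  # count of the single most frequent word
    rows = [(word, n, round(max_pt * n / peak, 1)) for word, n in most_common]
    return sorted(rows)  # alphabetical order, as in the description


if __name__ == "__main__":
    for word, n, size in concordance("the cat and the hat and the bat", top_n=2):
        print(f"{word}: {n} occurrences, {size}pt")
```

The count returned for each word is what a hover tooltip would show. A fuller version would also keep the positions of each word so that clicking could list the excerpts.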

Statistically Improbable Phrases, or "SIPs", are the most distinctive phrases in the text of books. To identify SIPs, our computers scan the text of all books in the Search Inside! program. If they find a phrase that occurs a large number of times in a particular book relative to all Search Inside! books, that phrase is a SIP in that book.
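Amazon does not spell out how it scores SIPs, so the following Python sketch is just one plausible reading of the description: compare how often a phrase occurs in one book with how often it occurs across all books, and keep the phrases with the highest ratio. The names `phrases` and `improbable_phrases`, the fixed two-word phrase length, and the `min_count` cutoff are my assumptions.

```python
import re
from collections import Counter

def phrases(text, n=2):
    """Split text into overlapping n-word phrases (n=2 is an assumption)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def improbable_phrases(book, other_books, n=2, top=5, min_count=2):
    """Rank phrases by (rate in this book) / (rate across all books).

    Hypothetical scoring, not Amazon's actual method.
    """
    book_counts = Counter(phrases(book, n))
    # "All books" includes the book itself, so no phrase has a zero rate.
    corpus_counts = Counter(book_counts)
    for other in other_books:
        corpus_counts.update(phrases(other, n))

    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())

    scored = []
    for phrase, count in book_counts.items():
        if count < min_count:  # ignore phrases that appear only once
            continue
        book_rate = count / book_total
        corpus_rate = corpus_counts[phrase] / corpus_total
        scored.append((-book_rate / corpus_rate, phrase))

    scored.sort()  # highest ratio first, ties broken alphabetically
    return [phrase for _, phrase in scored[:top]]


if __name__ == "__main__":
    book = "the red wolf ran and the red wolf hid in the end"
    others = ["in the end the red car ran", "the end of the red road"]
    print(improbable_phrases(book, others))
```

In the toy example, "the red" is frequent in the book but just as frequent everywhere else, so it scores low, while "red wolf" appears only in this book and ranks first.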

Text Stats calculates a variety of statistics for each book in the Search Inside!™ program. The Readability calculations estimate how easy it is to read and understand the text of a book.

  • The Fog Index indicates the number of years of formal education required to read and understand a passage of text.
  • The Flesch Index is another indicator of reading ease. The score returned is based on a 100 point scale, with 100 being easiest to read. Scores between 90 and 100 are appropriate for 5th and 6th graders, while a college degree is considered necessary to understand text with a score between 0 and 30.
  • The Flesch-Kincaid Index is a refinement to the Flesch Index that tries to relate the score to a U.S. grade level. For example, text with a Flesch-Kincaid score of 10.1 would be considered suitable for someone with a 10th grade or higher reading level.
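For the curious, the three readability scores can be sketched in a few lines of Python using the standard published formulas (Gunning Fog, Flesch Reading Ease, and Flesch-Kincaid Grade Level). Whether Amazon uses exactly these constants is an assumption on my part, and the syllable counter below is a rough heuristic of my own, so the numbers will not necessarily match what Amazon reports.

```python
import re

def fog_index(words, sentences, complex_words):
    """Gunning Fog: approximate years of schooling needed."""
    return 0.4 * (words / sentences + 100 * complex_words / words)

def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: roughly 0 to 100, higher is easier."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid: approximate U.S. grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per vowel group."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_counts(text):
    """Derive the inputs to the formulas above from raw text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Fog treats words of three or more syllables as "complex".
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return len(words), sentences, syllables, complex_words


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was a remarkably comfortable mat."
    w, s, syl, cx = text_counts(sample)
    print("Fog:", round(fog_index(w, s, cx), 1))
    print("Flesch:", round(flesch_reading_ease(w, s, syl), 1))
    print("Flesch-Kincaid:", round(flesch_kincaid_grade(w, s, syl), 1))
```

All three scores are driven by the same two ratios, words per sentence and syllables (or long words) per word, which is why a book that scores as hard on one index usually scores as hard on the others.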