First Rowling, then Shakespeare… who’s next?

The Times today has an article about the possibility that Shakespeare wrote a passage in an edition of The Spanish Tragedy, an early Elizabethan play by Thomas Kyd.  The original computer analysis (by Brian Vickers) was very similar to that used to suggest that J. K. Rowling was the author of The Cuckoo’s Calling, which we talked about here.  Big Think describes what Vickers did:

Sir Brian has employed software called Pl@giarism–a free program developed by Maastricht University to catch law students cheating on their written work–to search a database of the 58 different plays performed in London between 1580 and 1595. But Sir Brian isn’t looking to catch anyone cheating. Rather, he is looking for examples of so-called “self-plagiarism.” The Pl@giarism software identifies every occasion that a sequence of three words appears in Shakespeare’s known works, and then looks for repetitions of these sequences in an unattributed text. Some of these word sequences are common, everyday collocations such as “by the way” or “Yes, my lord.”

Excluding those phrases, Sir Brian focuses on word sequences that are unique to Shakespeare. For instance, the word sequence “eyebrows jutty over” appears only twice in all of Elizabethan drama. One instance is in Shakespeare’s Henry V, written in approximately 1599. The only other instance is found in the fourth edition of Thomas Kyd’s “The Spanish Tragedy,” published in 1602. This version contains additions to five scenes, totaling 320 lines. In these short passages, Sir Brian found 46 collocation matches that are completely unique to Shakespeare’s poems and plays written before 1596. That evidence is hard to argue with.
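For the curious, here is a rough Python sketch of the three-word matching idea described above. It is not the Pl@giarism software, whose internals I haven't seen; the function names and the toy strings are my own inventions for illustration, and "common" phrases are approximated here as anything that also turns up in other authors' texts.

```python
import re
from collections import Counter

def trigrams(text):
    """Return a Counter of every three-word sequence in text (lowercased)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:], words[2:]))

def distinctive_matches(known_author, unattributed, other_authors):
    """Trigrams shared by the known author and the unattributed text
    that appear nowhere in the other authors' texts."""
    shared = set(trigrams(known_author)) & set(trigrams(unattributed))
    common = set()
    for text in other_authors:
        common |= set(trigrams(text))
    return sorted(shared - common)

# Toy strings, invented for illustration only (not real quotations).
known = "yes my lord the eyebrows jutty over the brow"
mystery = "yes my lord his eyebrows jutty over his eyes"
others = ["yes my lord by the way", "by the way my lord"]

print(distinctive_matches(known, mystery, others))
```

With these toy inputs, "yes my lord" is thrown out because the other authors use it too, and only the "eyebrows jutty over" sequence survives.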

What got the Times’ attention was another paper that focuses on Shakespeare’s handwriting and how that helps explain oddities in the passage:

In a terse four-page paper, to be published in the September issue of the journal Notes and Queries, Douglas Bruster argues that various idiosyncratic features of the Additional Passages — including some awkward lines that have struck some doubters as distinctly sub-Shakespearean — may be explained as print shop misreadings of Shakespeare’s penmanship.

“What we’ve got here isn’t bad writing, but bad handwriting,” Mr. Bruster said in a telephone interview.

What I couldn’t find in a cursory Google search was the actual passage in question.  It’s easy enough, though, to find the standard sample of Shakespeare’s messy handwriting–the passage from the manuscript of the play Sir Thomas More that is generally agreed to be by Shakespeare:

Next time, should J. K. Rowling disguise her writing style?

Here’s an interesting interview on Science Friday with Patrick Juola, the guy whose computerized analysis helped identify J. K. Rowling as the author of the mystery The Cuckoo’s Calling.  He gives much more detail about exactly what kind of analysis he did over at Language Log.  Essentially, he compared the novel to works by Rowling, P.D. James, Ruth Rendell, and Val McDermid on four linguistic variables: distribution of word lengths, use of the 100 most common English words, and two other tests based on authorial vocabulary.

So, the final score? The results look “mixed,” but pointing strongly to Rowling. There were certainly a couple of likely losers: nothing at all pointed to Rendell as a possible author, and only one test, and an unreliable one at that, suggested James. McDermid could be a reasonable candidate author, but the word length distribution seemed almost entirely uncharacteristic of her. The only person consistently suggested by every analysis was Rowling, who showed up as the winner or the runner-up in each instance.
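To make the first of those four tests a little more concrete, here is a minimal Python sketch of a word-length comparison. This is my own guess at the general shape of such a test, not Juola's actual code: the function names, the 12-letter cutoff, and the sum-of-absolute-differences distance are all assumptions.

```python
import re
from collections import Counter

def word_length_distribution(text, max_len=12):
    """Fraction of words at each length 1..max_len (longer words share the last bin)."""
    words = re.findall(r"[A-Za-z]+", text)
    counts = Counter(min(len(w), max_len) for w in words)
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty text
    return [counts[n] / total for n in range(1, max_len + 1)]

def distance(p, q):
    """Sum of absolute differences between two distributions (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def rank_candidates(mystery, candidates):
    """Candidate names sorted from closest to farthest word-length profile."""
    target = word_length_distribution(mystery)
    scores = {name: distance(target, word_length_distribution(text))
              for name, text in candidates.items()}
    return sorted(scores, key=scores.get)

# Toy texts, invented for illustration; real tests need whole novels.
samples = {
    "A": "a dog ran to the hut",
    "B": "extraordinary circumstances necessitate deliberation",
}
print(rank_candidates("the cat sat on the mat", samples))
```

A single test like this is weak on its own, which is presumably why the verdict rested on several tests agreeing.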

One of the comments to Juola’s Language Log post suggests that a determined author can defeat analyses like these.  This is referred to as “adversarial stylometry.”  There are two basic approaches: obfuscation, where you try to simply hide your own style, and imitation, where you try to copy someone else’s style.  (A third approach is to run your original passage through machine translation services, typically into another language and back again, to scrub out the original style.)  I doubt that any of this is worth Rowling’s time, but you might consider it if, say, you’re a whistleblower who wants to remain anonymous.

Of course, all of this analysis is overshadowed by the Onion’s shocking revelation that J.K. Rowling’s books were really written by Newt Gingrich:

“Assuming a fake identity really gave me a lot of freedom to build out the world of Hogwarts and flesh out the characters without drawing unwanted attention to myself or having the novels associated in any way with my political career,” Gingrich said in a statement, confirming reports he wrote the first four books in the fantasy series while still in office, but wrote the remainder before his 2012 presidential run.

Why do people rely on anything besides the Onion for their news?

Writing e-book sales copy — sheesh, it’s harder than you think

I have to trust that my e-book publisher knows more about the business than I do.  They certainly seem to.  They have convinced me to change my title from Portal to The Portal because one-word titles aren’t selling well nowadays, unless you’re James Patterson or Clive Cussler.  OK, fine — they can have the “the.”

Now they have sent me these instructions for the sales copy that will appear online:

Maximum overall word count: 200 words. (this includes sales blurb only)

Ideal length: 150 words

Why the length limits? Readers/people are basically lazy.  Amazon allows for approx 120 words before the reader has to click “read more”.  The incentivizing plot twist (or a strong suggestion of the twist) must appear in the first 120 words.

First Paragraph length max: 250 characters including spaces.  More than that and the number of lines exceeds three on most standard monitors.  More than three lines and the reader tends to “click away” unless the title is highly anticipated.

Apps present a new wrinkle.  200 characters including spaces to incentivize the reader to “click” read more.  Because readers are basically lazy, the buy-now case is best made in the first 200 characters (including spaces).

Copy Structure: Every word in the copy must either introduce the protagonist/antagonist, present the internal or external conflict, or contribute to a  relevant and non-clichéd sub-genre plot twist that sets the book apart. (but not too far apart.  Readers also tend to read in a rut).

OK, then.  The text I came up with here doesn’t fit the guidelines, so there is work to be done.  The limitation on total character count (including spaces) is an interesting modern development.  I’ve just started using Word 2013, and it took a bit of fumbling around before I figured out how to get it to show me the character count.  Sure enough, it will display the number of characters, and the number of characters including spaces, with a single mouse click.  Good job, Microsoft!
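If you'd rather not click around in Word, the publisher's limits are easy to check with a few lines of Python. This is only a sketch under my own assumptions: the function names are made up, paragraphs are assumed to be separated by blank lines, and the numbers (200 words, 250 characters, 120 words, 200 characters) come straight from the instructions above.

```python
def check_sales_copy(copy):
    """Return a list of guideline problems found in the blurb (empty list = none)."""
    problems = []
    words = copy.split()
    if len(words) > 200:
        problems.append(f"{len(words)} words; the maximum is 200")
    # Assumes paragraphs are separated by a blank line.
    first_paragraph = copy.strip().split("\n\n")[0]
    if len(first_paragraph) > 250:  # len() counts spaces too
        problems.append(
            f"first paragraph is {len(first_paragraph)} characters; the maximum is 250"
        )
    return problems

def teasers(copy):
    """What shows before 'read more': about 120 words on the web, 200 characters in apps."""
    return " ".join(copy.split()[:120]), copy.strip()[:200]
```

Calling teasers() on a draft shows exactly which words have to carry the plot twist and the buy-now case.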

 

Want to see a Shakespeare play in ten minutes?

. . . without all the annoying Shakespearean verbiage that slows down most productions of his plays?

Of course you do.  So you want to see early silent movies of Shakespeare plays.  Here is an 11-minute Tempest from 1908 that features special effects like Ariel disappearing:

And here is a hand-tinted King Lear from Italy in 1910:

It lasts 16 minutes, but King Lear is pretty complicated (even without the Edmund/Edgar subplot).

If you’re like me (and who isn’t?) you love this kind of stuff.  And you probably also love the Reduced Shakespeare Company, which gets Shakespeare done quickly, even if they have to use words.

What books do you pretend to have read?

Book Riot did an informal poll of its readers about books they pretend to have read.  Here are the top 20:

  1. Pride and Prejudice by Jane Austen (85 mentions)
  2. Ulysses by James Joyce
  3. Moby-Dick by Herman Melville
  4. War and Peace by Leo Tolstoy
  5. The Bible
  6. 1984 by George Orwell
  7. The Lord of the Rings by J.R.R. Tolkien
  8. The Great Gatsby by F. Scott Fitzgerald
  9. Anna Karenina by Leo Tolstoy
  10. Catcher in the Rye by J.D. Salinger
  11. Infinite Jest by David Foster Wallace
  12. Catch-22 by Joseph Heller
  13. To Kill a Mockingbird by Harper Lee
  14. Fifty Shades of Grey by E.L. James
  15. Jane Eyre by Charlotte Bronte
  16. Crime and Punishment by Fyodor Dostoevsky
  17. Wuthering Heights by Emily Bronte
  18. Great Expectations by Charles Dickens
  19. Harry Potter (series) by J.K. Rowling
  20. A Tale of Two Cities by Charles Dickens (21 mentions)

“Pretend to have read” is a slippery category — Pretend to whom?  Your snobby literary friends?  Your co-workers standing around the water cooler?  Your girlfriend the English major who won’t sleep with you if you haven’t finished Ulysses?  Does anyone really care nowadays what you’ve read and what you haven’t read?  Presumably the folks that Book Riot readers hang out with do.

Can you spot the one that isn’t as classic-y as the rest?  I thought you could.  As the Book Riot writer suggests, presumably people pretend to have read Fifty Shades of Grey so they don’t get left out of interesting conversations.

Of the books on the list, I haven’t read Pride and Prejudice or Wuthering Heights (among the nineteenth-century classics), or Fifty Shades of Grey or Infinite Jest (among the recent novels).  I’ve dipped into the Harry Potter books with my kids, but haven’t read any of the novels straight through.

There, I’m glad I could finally get that off my chest.

How do you spell the plural of “you”? Whitey Bulger needs to know

This is from the Fox News transcript of Whitey Bulger’s statement at his trial yesterday:

And my thing is, as far as I’m concerned, I didn’t get a fair trial, and this is a sham, and do what youse want with me. That’s it. That’s my final word.

The Boston Globe’s online version of the statement also spells the word youse.  But the headline of its print edition this morning spells it yous.  Online, ABC News also spells it yous, while NBC News sanitizes it to you.

I would have spelled it youse.  Or maybe even you’se.  Google Ngram Viewer gives a slight lead to yous lately, but that might be because yous gets credit for thank-yous.  Youse had a big lead in American English from 1900 to 1940, and you’se had the lead briefly in the 1860s before falling back to third place.

I wonder if the Globe and other newspapers have the word in their style guides.  It probably doesn’t come up that often, but it pays to be prepared.  You never know when you’re going to get another Whitey Bulger.