Distributed Proofreading

Doing my part to preserve history and get out-of-copyright books into digital format.

About a month ago, before I left home for the summer, I stumbled upon the Distributed Proofreaders Web site. The best way to describe the site is to echo the text on its home page under Site Concept:

Distributed Proofreaders provides a web-based method to ease the conversion of Public Domain books into e-books. By dividing the workload into individual pages, many volunteers can work on a book at the same time, which significantly speeds up the creation process.

Here’s how it works. Someone, somewhere scans printed book pages into a computer as images. OCR software then converts those page images into editable, machine-readable text. Then volunteer proofreaders step in and compare the original scanned pages to the editable text. Proofreaders follow a set of proofing guidelines to ensure consistency as they correct the OCR text. Each page passes through a series of steps that eventually turns all of a book’s pages into a single text document. That document is then released as a free ebook in a variety of formats via Project Gutenberg.
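
For the programmers in the audience, here’s a minimal Python sketch of the divide-and-merge idea. It is not Distributed Proofreaders’ actual software. The function names (assign_pages, proof_and_merge), the round-robin assignment, and the sample text are all my own assumptions, made up purely for illustration.

```python
from typing import Callable


def assign_pages(page_numbers: list[int], volunteers: list[str]) -> dict[str, list[int]]:
    """Deal pages out round-robin so every volunteer can work at the same time.

    Hypothetical helper: the real site lets volunteers pick pages themselves.
    """
    assignments: dict[str, list[int]] = {name: [] for name in volunteers}
    for i, page in enumerate(page_numbers):
        assignments[volunteers[i % len(volunteers)]].append(page)
    return assignments


def proof_and_merge(
    ocr_pages: dict[int, str],
    assignments: dict[str, list[int]],
    proof: Callable[[str], str],
) -> str:
    """Apply a proofing step to each assigned page, then join pages in order."""
    corrected: dict[int, str] = {}
    for pages in assignments.values():
        for page in pages:
            corrected[page] = proof(ocr_pages[page])
    # Pages are merged by page number, no matter who proofed them.
    return "\n".join(corrected[p] for p in sorted(corrected))


if __name__ == "__main__":
    # Made-up OCR output with typical misreads ("Tbe", "ovcr").
    ocr_pages = {1: "Tbe quick brown fox", 2: "jumps ovcr the lazy dog"}
    assignments = assign_pages(sorted(ocr_pages), ["ann", "patty"])
    fix = lambda text: text.replace("Tbe", "The").replace("ovcr", "over")
    print(proof_and_merge(ocr_pages, assignments, fix))
```

The point of the sketch is that no single volunteer ever needs the whole book. Each one works on a page at a time, and the pages are stitched back together in order at the end.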

I became a volunteer. So far, I’ve proofed 14 pages. I know that doesn’t seem like a lot — and it’s not — but if 100 people each proofed 14 pages a week, 1,400 pages a week would be proofed. That’s what the “distributed” in Distributed Proofreading is all about.
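
The arithmetic behind that claim fits in a few lines of Python. This is only a sketch: the function names and the 500-page book length are made up for illustration, and only the 100 volunteers and 14 pages a week come from the figures above.

```python
import math


def pages_per_week(volunteers: int, pages_each: int) -> int:
    """Total pages proofed in a week if every volunteer does the same share."""
    return volunteers * pages_each


def weeks_to_finish(book_pages: int, volunteers: int, pages_each: int) -> int:
    """Whole weeks needed to proof every page of a book once, rounding up."""
    return math.ceil(book_pages / pages_per_week(volunteers, pages_each))


if __name__ == "__main__":
    print(pages_per_week(100, 14))        # 1400
    print(weeks_to_finish(500, 1, 14))    # one volunteer alone: 36
    print(weeks_to_finish(500, 100, 14))  # 100 volunteers: 1
```

One person working alone on a hypothetical 500-page book would need most of a year at that pace. A hundred people would finish in a week.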

The good part about being a proofreader (other than that warm, fuzzy feeling you get from helping to make the world a better place) is that you get to read lots of old books about topics that interest you. The day I joined, I proofread two pages of a New York newspaper account of World War I. It was fascinating. Today, I proofread 12 pages of a biography of Benjamin Franklin, who I believe is the greatest American who ever lived. (There is a lot to be learned from Franklin’s life and writings.)

Why am I blogging about this? Well, I’m hoping that other folks will embrace this project and donate an hour or two a week (or a month) to proofreading pages. The more folks who work on this project, the more quickly these great old books and other pieces of literature will get into free digital format for readers and students to enjoy.

Want to help ebooks thrive? Give distributed proofreading a try.

A Tale of Two Copyright Infringements

Together, we can stop it.

The other day, while trying in vain to catch up with missed tweets by the people I follow on Twitter, I noticed that two of my Twitter friends were dealing with copyright-related issues. Since then, both issues have been resolved. I believe that part of the reason for the speedy resolution of these problems was involvement by the Twitter community.

Content Theft

The first case came to light when one of my Twitter friends, @anntorrence, complained that she had not gotten a response from the blogger who used one of her articles on his site. The link to the article in question told the rest of the story. Ann had written a great tips piece about preparing for a cold-weather photowalk. The article was originally published on Ann’s blog, Pixel Remix: the Ann-alog. Later, it was picked up with her permission on Photowalking Utah. The same article was picked up without her permission by a new photowalking Web site that was obviously anxious to build content and Google juice.

Ann’s article is copyrighted — as is most content on the Web. Her obvious distress over the piece being used without her permission bothered me. After all, I earn my living as a writer and have seen my own content stolen again and again. In my case, it often affects my livelihood by distributing content that I normally receive royalties for. But that doesn’t mean that content theft is any less wrong when it’s from a blog or other free source.

I went to the Web site guilty of the theft and posted a comment there. I also wrote to the owner of the site. I was horrified not only by the theft, but also because that site was one of the few I actually paid to advertise my helicopter business on. I was not interested in supporting a site that was stealing content. If they stole from Ann, who else had they stolen from? How much of the content was original or reused with permission? (Needless to say, I pulled my ad immediately.)

The owner of the site made the fatal error of replying to me in Twitter. He defended his actions by saying that he “gives credit when due.” He was obviously clueless about copyright law — as most people incapable of creating their own content appear to be. He seemed to think that if it was on the Web, it was free for use anywhere, as long as he put a byline for the original author. He appeared to think he was being generous by including a link back to the article — not the original, but the site he stole it from.

An @reply argument ensued, with me trying to educate him and him responding arrogantly. He tried to continue the argument in e-mail. After I left my computer (and Twitterrific), he was apparently blasted by other Twitter users who got in on the discussion with their own @replies.

Ann has since gotten satisfaction for the situation — her article has been removed. Unfortunately, the owner of the site still doesn’t get it. He has written a post apologizing for not giving proper links back to original articles. He evidently does not understand that he needs permission to reuse copyrighted work.

I wonder what Scott Kelby will say when he sees his work used on the offender’s site. Personally, I hope he sues the site owner’s sorry ass.

I would urge people to boycott the site, but that might send new visitors there just to check it out. Instead, I’ll just urge people not to frequent sites that steal content. If you think a blog’s post contains content used without permission, don’t be afraid to comment about it.

Removing Copyright Notices

The second case was far more blatant. Some idiot had written a blog post about how to remove copyright notices from photos and other images found on the Web. As if that wasn’t bad enough, he used someone else’s copyrighted image for his example. That someone else was @PattyHankins, one of my Twitter friends.

Patty mentioned the problem in Twitter and I went to investigate. The post in question was a typical hacker/pirate post with instructions for removing copyright notices that were part of a photo. Patty’s photo appeared numerous times in the step-by-step instructions. After the first time, the author of the post made a comment like, “I don’t know who Patty Hankins is, but nice picture.” Extremely obnoxious.

I posted a comment to the post. I can’t remember exactly what I said, but it clearly pointed out that the author of the post and site was violating Patty’s copyright. Evidently, many other Twitter users did the same thing. So when Patty sent his ISP a DMCA notice, she got a quick response. The photo was removed within four hours.

Patty referred me to “Using the DMCA Takedown Notice to Battle Copyright Infringement” on NatureScapes.net for what she says is the most effective sample DMCA letter she’s ever used.

Again, I believe that one of the reasons Patty had a relatively easy time of getting the photo off the infringer’s Web site was the outpouring of comments by outraged Twitter users.

For More Information…

If this post interests you, you might be interested in the following links.

And please do use the Comments link or form to add your thoughts about this matter. If you are one of the offending bloggers, however, don’t waste your time. My blog is not your soapbox.


Could be hazardous to your good name.

A few months ago, I read a blog post by some A-list pro blogger that briefly discussed eZineArticles.com as a place to publish articles and generate hits for your site. The idea was that the articles contained a byline with links and people who read them would come back to your site to read more. The result: more hits.

I dug deeply into my well of content and found a handful of articles I didn’t mind republishing. I formatted them as required and submitted them to eZineArticles.com, after setting up an account as an author. A bunch of the articles were bounced back because they read like blog posts. But I successfully argued that they did provide useful information in my somewhat conversational and bloggish writing style. All five articles were published on the eZine Articles site.

First Surprise: Anyone Can Republish!

What I didn’t realize at first was that anyone who sets up a publisher relationship with eZineArticles.com could republish my work, as long as it was republished exactly as written and included my byline, bio, and links. I discovered this when an article I wrote about flying at sunrise was picked up by a Web site with content about cruising.

After a few e-mails went back and forth between me and the site owner and eZineArticles support staff, I realized what I’d missed by not reading the fine print — I was basically granting a very broad set of rights to eZineArticles.com. But the site that had used the piece was a high quality site and I didn’t mind my recycled work appearing there. And the eZineArticles folks assured me that publishers had to meet certain requirements to use the work.

Second Surprise: Hot Sex?

But I wasn’t very happy when I traced a link to my Antelope Canyon photos article back to a Blogger blog with the words “hot-sex” in its domain name. Although the site didn’t appear to contain any porn, I didn’t want my content (or name!) associated with it. So I wrote to eZineArticles support to complain.

Today, I found the same article used on a site with “nurse-fetish” in the domain name. Now I was pissed. I wrote again to the eZineArticles staff.

eZineArticles.com Responds

My new message crossed their response to the first one in the ether. In their response, they told me that if I didn’t want my work on a specific site, it was my responsibility to contact the owner of that site and ask him to remove it.

Ever try to contact the owner of a Blogger blog? It’s not possible if they don’t want to make it possible.

I replied that their response was completely unsatisfactory and that I would be deleting all of my articles from their site.

And then I did.

Lessons Learned

I am certainly not desperate enough to be published or to get hits by releasing my work on a site that allows distribution without prior approval by the author. Frankly, I don’t think any author should be that desperate.

eZineArticles.com obviously doesn’t give a damn about its authors if it won’t work to prevent this kind of activity with an author’s work. Any author who publishes with them deserves whatever shit he gets — including his name spread around on sites of questionable quality and purpose.

From now on, I will publish my work electronically in only three places:

  • Here, on this site, where my work is covered by a copyright notice that helps protect my work from misuse.
  • On the sites of publishers who pay me for my efforts and protect our copyrights.
  • On the sites of other bloggers who have asked me to guest author for them and will protect our copyrights.

I’m angry about this, but I know it’s my own fault. I was conned, first by the pro blogger who pushed eZineArticles.com and then by eZineArticles.com itself. I don’t understand why anyone would allow their work to be reproduced in a way that they cannot control. Could they all be as stupid as I was when I signed up?

As for the “hot-sex” and “nurse-fetish” sites, I wonder how the other female eZineArticles authors feel about their work — and their names — appearing there.

Could it be? Piracy site shut down?

Too early to be sure, but not too early to hope.

Last night, before shutting down for the night, I decided to check a pirate Web site I’ve been monitoring to see if any new ebooks had arrived. I’ve been finding my books — and the books of author friends — on a number of pirate Web sites, but one of them was especially blatant and offensive. It listed literally hundreds of ebooks and complete training DVDs by dozens of publishers and scores of authors. If you can’t figure out why this bothers me, read this.

After a long wait, an error message appeared in place of the site’s home page:

The requested URL could not be retrieved
While trying to retrieve the URL: http://[omitted]/
The following error was encountered:
* Connection to [omitted] Failed
The system returned:
(111) Connection refused
The remote host or network may be down. Please try the request again.

I tried a few more times and got the same result.

Then my normal state of paranoia set in and I thought that the site’s owner may have blocked my IP address. I’d been checking the site with an alias user ID that pointed to a domain name I never use for personal stuff. But I didn’t mask my IP address. So I asked Jonathan at Plagiarism Today to try. He got the same result (and taught me a trick for checking for IP blocking another way).

About the Site

The site was hosted somewhere in Asia or the Pacific, although the guy who ran it wrote in perfect English. So there wasn’t much to be done in the way of DMCA notices to the guy’s hosting ISP.

Most of the pirated files were being hosted on a Germany-based free file hosting site. That site’s gimmick is that people can download one file at a time unless they pay for a “premium account.” So I think one could make a good argument that the hosting company was selling access to our files.

To the hosting company’s credit, they made it pretty easy to get the files taken down. All I had to do was get the complete URL to the file and send it to them via an online form. Within 24 hours, the link simply stopped working. So even though the pirate site still listed my ebooks, none of the download links would work. To me, that was almost as good as taking the whole site down.

Take Down!

Join us in our fight to stop ebook piracy! Authors Against Piracy is a private Yahoo Group dedicated to educating authors on how they can find illegal copies of their books online and get them off. We can make a difference!

But I do have reason to hope that the site may have been taken down. When I saw the extent of the copyright infringement there, I was outraged. I spent almost two full days contacting authors and publishers to tell them about what I’d seen. Among the publishers I contacted were Pearson, McGraw-Hill, O’Reilly, Symantec, Lynda.com, and Total Training. I thought that if I got some big guns out against this guy, he’d get taken down.

And maybe it did work. Maybe one of them threw a big enough legal staff at the site owner, his ISP, or the file hosting sites to get the whole thing taken offline. Or maybe just having all those publishers and authors going at him with e-mail and other communications made him realize that his efforts to earn a few dollars by setting up illegal downloads just weren’t worth the hassle of fighting all these people.

Whacking Moles

I don’t care what the reason might be. I just rejoice in the possibility that we may have succeeded in “whacking this mole.”

Because as one of my publishers pointed out: “Trying to stop these guys is a game of whack-a-mole. You hit one and another one pops up.”

I agree. But there are more people and resources on our team than on theirs. If we work together, we can keep those moles in their holes.

Copyright for Writers and Bloggers – Part III: Fair Use and Public Domain

What’s fair? Use common sense.

In the first article of this series (Part I: Why Copyright is Important), I discussed the importance of copyrights to authors. In the second article (Part II: Creative Commons), I told you about the Creative Commons license I use to protect the work on this site.

In this last article of the series, I explain the concept of fair use — or attempt to, anyway — and how it enables you to quote copyrighted works for certain purposes.

Fair Use

Now here’s a good question. What if you want to use one of my articles on your AdSense-supported Web site? Obviously, that’s in violation of my Creative Commons license. But what if you’re satisfied using only a part of it?

That’s where Fair Use comes into play. Fair use allows you to take a portion of copyright-protected material and use it provided the use meets the definition of “fair” as set forth by the Copyright Act of 1976:

…the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include—

  1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
  2. the nature of the copyrighted work;
  3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
  4. the effect of the use upon the potential market for or value of the copyrighted work.

You can read more about this on Wikipedia.

Fair Use is Common Sense

Fair use, of course, is ruled upon by judges when copyright infringement cases get to court. But you can keep yourself out of court — and be a good member of the blogging community — by using common sense and thinking through the use you have in mind.

For example, suppose you want to use portions of this article as part of a college course you’re teaching about copyright in the Internet age. You could print the article and share it as a handout with your students. Of course, you should also credit me as the author. That’s common courtesy in the writing world.

Or suppose you want to blog about this article as part of your own opinion piece about copyright. You could take a quote from my article and use it to make one of your points — or to present one of my points that you want to argue. (Be gentle, please.) For fair use, you’d have to limit the amount of material you used so it’s only a portion of the entire piece. You should also include my byline and a link back to my article — that’s common courtesy in the blogging world.

Both of these uses would be considered fair. What’s not fair is using a work in a way that would reduce demand or marketability for it — like reproducing it in whole on your Web site without a link back to the original. Or using it to make money by providing content on a site that exists primarily to generate advertising revenue.

Public Domain

There’s one more thing I want to mention here.

If you don’t care about how people use your work, you can release it into the public domain. This essentially means that you’re giving up all rights to it and people can do with it what they want.

If you find a work that’s in the public domain, including classic novels that are out of copyright, you can use it pretty much any way you like. But let your conscience be your guide. Do you really want to claim that that passage from Mark Twain’s Roughing It was really penned by you?

Just remember, there’s nothing in this blog (or in most others) that’s in the public domain. Respect the author’s copyrights, whether they’re covered by a standard “All Rights Reserved” copyright notice, a Creative Commons license, or something less formal. It’s not just courtesy. It’s the law.

What Do You Think?

Got something to say about this? Use the Comments link or form for this post to get it off your chest.