Archive for Copyright news

To Defend Fair Use, You Need To Quantify It

A website called Defend Fair Use just launched, alleging that large media and content companies are misrepresenting consumer rights under copyright law. The initiative is led by the Computer & Communications Industry Association, a nonprofit backed by Google, Microsoft and Yahoo, among others.

While we welcome more discussion among these players about the contours of consumers’ rights and copyright law, it’s ironic that the same companies alleging exaggerated copyright notices are profiting from duplicate content.

“Big Content” and “Big Technology” are clearly trying to spin the issue. To clarify, let me break down the four factors of Fair Use and show where Attributor can provide objective metrics to guide Fair Use determination . . . without boring you to death.

Factor 1: The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.

Detectable. While Attributor won’t identify whether the usage is transformative, we automatically detect whether the page on which reuse occurs has advertising present. As evidenced by recent moves by the New York Times, advertising is clearly driving the online content economy, making commercial use an increasingly important factor.

Also, you can learn a lot about the purpose and character of a use by whether or not attribution is provided, which in the online world, amounts to links from the copy to the original - we report back on attribution for every match we find.

Factor 2: The nature of the copyrighted work

Not Detectable. Sorry, we can’t determine whether your content is fiction or non-fiction, but we’ll add it as a feature request!

Factor 3: The amount and substantiality of the portion used in relation to the copyrighted work as a whole.

Detectable. This is a fancy way of saying that the less of your content that is taken, the more likely it qualifies as Fair Use. For each match, we report back on the percentage of the original content that has been reused.
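
To make "percentage of the original content that has been reused" concrete, here is a minimal sketch of one way such a number could be computed. It is an illustration only, not a description of how Attributor works: the word-shingle approach, the function names `shingles` and `percent_reused`, and the default shingle size are all assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    # Lowercase and keep only simple word tokens so punctuation is ignored.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def percent_reused(original: str, copy: str, size: int = 5) -> float:
    """Percentage of the original's shingles that also appear in the copy."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(copy, size)
    return 100.0 * len(shared) / len(original_shingles)
```

The score is measured against the original rather than the copy, so a short quote lifted from a long article scores low even if the quote makes up the entire reusing page.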

Factor 4: The effect of the use upon the potential market for or value of the copyrighted work.

Detectable. Not only will we indicate if ads are present on the reusing site, but we will also provide the amount of monthly traffic for the site. We’re also adding functionality that will help you understand the impact of content reuse on your ranking in search engines. As noted in our Harry Potter research, much content reuse occurs on sites that rank higher in search engines than the original content owner’s site. This can have a major impact on the relative market value of the original work.

Attributor won’t remove all the emotion from the room in copyright discussions, but it will provide an objective means to evaluate Fair Use disputes and (hopefully) result in less litigation and less posturing between “Big Content” and “Big Technology”.

Dirty Money?

CNET’s Elinor Mills has a great piece about sites republishing song lyrics and making money from Google text ads. She details the plight of Alexander Perls Rousmaniere, a Los Angeles artist who is losing money to sites reusing his lyrics without permission.

The chief villains are the shifty sites reusing the lyrics, but the article goes a step further by pointing out the lucrative role of search engines:

“Google is selling advertising on all the big copyright-infringing lyric Web sites,” Rousmaniere said. “It may seem like small potatoes, but lyrics are a huge search term on the Internet–these sites (and Google) are probably pulling in hundreds of thousands of dollars monthly, all on the back of copyrighted material.”

Google takes a lot of copyright “heat”, but it is not the only search engine to profit from unauthorized use of copyrighted material — and consider the size of the “dirty” money when you include the ad networks who serve display ads on these pages and the sites hosting the content in the first place.

It is an intricate plot that needs to be sorted out before publishers and content creators like Rousmaniere lose incentive to put their original content online.

Harry Potter Wrap-Up

On Friday, we found a site containing the first 10 chapters of Harry Potter and the Deathly Hallows. This seemed like a better example of infringement than the previously analyzed spoiler page, so we plugged the chapters into Attributor and checked the results at midnight Sunday night.

Here are our findings:

  • 2,806 sites lifted the book content
  • Duplication by type of site breaks down as follows
    • 54% Forums/Blogs (other than Harry Potter fan sites)
    • 27% Splogs or other commercial sites
    • 19% Harry Potter fan sites
  • Across all sites, the percentage of full chapter text copied is ~71%
  • Over 80% of the sites duplicating the content have ads on their pages
  • Sites duplicating the book are based in 43 different countries.

By all accounts sales of the book are phenomenal, and judging by an informal Attributor office poll, the impact on the first weekend’s sales appears to be zilch.

That said, significant portions of the book continue to pop up all over the web, so the downstream impact of the duplication remains unknown.

One thing is certain: the hysteria over the book’s release has filled many sploggers’ pockets. We just hope they repaid Scholastic by buying a few copies of the book!

August 1st Update

After reading about the Spanish spoiler’s release via TechCrunch, we loaded the first 10 Spanish chapters of the book into Attributor.

Here are our findings:

  • 440 sites lifted the book content in Spanish
  • Duplication by type of site breaks down as follows
    • 48% Splogs or other commercial sites
    • 40% Forums/Blogs (other than Harry Potter fan sites)
    • 12% Harry Potter fan sites
  • Across all sites, the percentage of full chapter Spanish text copied is 60%
  • Over 85% of the sites duplicating the content have ads on their pages
  • Sites duplicating the book are based in 11 different countries.

The increase in splog site duplication is further proof of how easy it is to monetize popular search terms using AdSense or Yahoo Search Marketing text links.

This is the first of a series of analyses we’ll be sharing in the coming months. We hope to provide insights on how the content economy works and how it could be better managed with web-wide visibility and accountability.

The Spread of the Harry Potter Spoiler: Day 2

Here are some updated numbers on the spread of content lifted from Harry Potter and the Deathly Hallows

  • We found 574 unique pages duplicating the spoiler content. New sites are coming in at a rate of ~20 per hour.
  • The duplication is spread over 27 countries: the United States, Russia, the Netherlands, UK, China, Germany, Italy, Poland, British Virgin Islands, Argentina, Hungary, Brazil, Croatia, Samoa, Spain, Colombia, Singapore, Malaysia, Philippines, Peru, Mexico, Australia, Vietnam, Canada, Czech Republic, Indonesia and the Ukraine.
  • Harry Potter fan sites represent ~10% of the duplication, indicating that many fans don’t want to know the ending before they buy the book. Instead, matches are primarily “splogger” sites: sites that place ads around the lifted content and game the search engines to appear high in search rankings. This enables them to profit from the increased Harry Potter search activity.

5PM Update

  • There are now 708 pages duplicating the Harry Potter and the Deathly Hallows spoiler content. Many sites that appeared on our list in the first 12 hours have taken down the spoiler content; however, new sites are popping up at a faster rate. Attributor keeps a cached copy of all matches, and we will tally the gross match number in a future post.
  • 9% of the sites duplicating the spoiler content are Harry Potter fan sites.
  • 559 (79%) of the sites have ads on their pages.
  • Most of the duplication is verbatim. The percentage copied across all domains is >80%
  • Over the last 24 hours, duplication on Chinese sites has grown the fastest.
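
The ad-share figure in the update can be checked with a few lines of arithmetic. This is a small sketch that assumes the 79% is measured against the 708 matching pages reported above (the update uses "pages" and "sites" interchangeably); the helper name `share_percent` is hypothetical.

```python
def share_percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


total_pages = 708     # pages duplicating the spoiler, from the 5PM update
pages_with_ads = 559  # pages carrying ads, from the 5PM update

# Assumption: the reported percentage uses the 708 total as its denominator.
ad_share = share_percent(pages_with_ads, total_pages)
print(f"{ad_share:.1f}% of matching pages carry ads")
```

Rounded to the nearest whole number, the result agrees with the 79% quoted in the update.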

More info to come, including a thorough “post-game” analysis of what Attributor found.

The Spread of the Harry Potter Spoiler: Day 1

We thought we’d enter the Harry Potter discussion from a new angle: a quantitative look at the spoiler content’s reach and insights into the types of sites that are lifting and publishing the book content.

Yesterday we added to Attributor the web page that includes excerpts from the unreleased novel Harry Potter and the Deathly Hallows. Attributor’s monitoring platform immediately found 312 separate reuses of the spoiler page across the Web.

The top 3 sites hosting the re-used content are

  1. livejournal.com
  2. groups.google.com
  3. twoj.net

With Attributor, instead of having to manually search and sort through a haystack of tertiary matches, you get specific citations showing where your content is being re-used, which enables you to act appropriately by issuing a licensing request or sending automatic DMCA takedown notices. We also enable more innovative approaches, such as allowing a teaser portion of your content to be posted freely as long as the reuser links back to your site or includes one of your widgets to sell the full version.

We will update the blog over the next few days with the latest numbers and analysis of the sites that are re-using the content. Watch this space and let us know what you want to find out.

Fair Use Day

July 11th marked the 3rd annual global “Fair Use” day – an event created to raise awareness of copyright issues, particularly the legal use of copyrighted material. Because there are few objective ways to classify “Fair Use” online, the Fair Use day fanfare was minimal. The 7-11 free slurpee giveaway didn’t help either.

We’re working hard to bring objectivity to this discussion by showing all content re-use in context: the percentage of the original copied, whether advertising is present, and whether the publisher-defined attribution rules are followed.

In our view, visibility + objectivity = smarter decisions, increased opportunity and a more functional online content economy.

Here’s to a much more objective Fair Use debate the rest of the year and more free slurpees.

Do the Internet’s gatekeepers have a copyright responsibility?

The Belgian ISP ruling is grabbing headlines this week as a potentially precedent-setting case within the European Union — specifically, the declaration that Internet Service Providers bear responsibility for stopping illegal file-sharing on their network. While the ruling is unlikely to jump the Atlantic anytime soon, it raises broader questions of gatekeepers’ responsibilities and highlights the need for the industry to embrace new technologies that will prevent the issue from being “solved” in the courts.

How do we prevent a litigious ending to this story in the U.S.? By creating an online content marketplace in which all players have web-wide visibility into content re-use. And not just ISPs: this marketplace should include anyone who serves, indexes, aggregates or monetizes online content.

Billions of dollars and the future growth of the online content economy depend on it.
