[poppler] Removing watermark/footer from JSTOR PDFs

Pablo Rodríguez oinos at web.de
Sat Jul 23 03:55:10 PDT 2011


On 07/23/2011 09:44 AM, Federico Leva (Nemo) wrote:
> Hello,
> you might have heard about 
> <http://arstechnica.com/tech-policy/news/2011/07/swartz-supporter-dumps-18592-jstor-docs-on-the-pirate-bay.ars>
> We're now going to upload those ~19000 PDFs to the Internet Archive, but 
> we need to remove a watermark.

Hi Federico,

please consider the following reasons for not doing this: the content,
the Internet Archive and yourself.

[I'm not a lawyer and this is not and cannot be considered legal advice.
If you need legal advice, you will have to hire a lawyer.]

The content itself is under copyright and not in the public domain (at
least, the vast majority of these PDF documents would be under copyright
protection). Putting so many documents online could be considered a
criminal offense, and it is probably a felony under US law, at least.
Civil damages for copyright infringement in the US are up to 30K US$ for
a single non-willful infringement and up to 150K US$ for a willful one.
It is clear that in your case the infringement would be willful, so do
the math: multiply the number of documents protected by copyright by the
amount for a single willful infringement. Civil and criminal penalties
are not mutually exclusive, so you could face both.

Regarding the Internet Archive, it is located in the US and is subject
to US law. Hosting services may face secondary liability for copyright
infringement. This is what the DMCA is about. There is a safe harbor for
hosting services that remove the offending content at the copyright
holder's request. You could upload the file again and again (I wonder
whether the Internet Archive would allow that), but in that case the
prosecuting party could argue that the Internet Archive should have
effectively prevented the new upload of such a unique file (the file is
huge, so easily identifiable). Remember that the Internet Archive makes
content directly available to others (unlike the Pirate Bay), which is
the main reason why it could face secondary liability for copyright
infringement. And even having to face a trial is too expensive for the
vast majority of individuals or institutions.

About yourself, the most basic Google search gives extremely accurate
details in the first result. You are giving away too much information,
both in this message to a public mailing list and elsewhere on the
internet, so it wouldn't be hard to find you. And you could find
yourself opening your home to a police raid. You might argue that you
aren't a US citizen, but don't forget that the German registrant of
wikileaks.de had his house raided because of the leaks (and he hosted no
content).

I could agree with you that copyright law is going crazy, but what
you're planning (and announcing) is not the way to fix it. Even if you
don't care about yourself, please don't get the Internet Archive into
trouble.

Please don't take this the wrong way. It is nothing personal.

I hope it helps,


Pablo
-- 
http://www.ousia.tk

