[Clipart] Clipart mining guide

John Olsen johnny_automatic at mac.com
Tue Oct 16 20:21:42 PDT 2007

I have had a couple of people ask me to go through the steps I use to
get images from old books.  I am happy to share them.  I can only do
a version using an Adobe Creative Suite workflow, as this is what I am
most familiar with, and it seems the GIMP needs some additional
helpers to handle PDFs.  I imagine this can all be done with an
Open Source workflow; I am just not the authority to write on it.
Maybe someone can translate it.

Anyway, I have a draft below.  I was thinking of adding it to the
Wiki section "http://openclipart.org/wiki/Clipart_Acquisition" but
wondered if that would clutter up that section.  Maybe it is better in a
separate subsection under either "Clip Art Information" or
"Contributor & User Handbook".  If so, someone who can unlock the top
level would need to set that up for me.  Ryan?

Any suggestions or ideas are appreciated.

John Olsen

== Guideline for Mining Images from Online Book Libraries ==
* Sites such as those collected under the texts section of
www.Archive.org offer a gold mine of Public Domain images.  This is a
guide to extracting these images for use here on OCAL.  Please
note that the author's workflow uses Adobe Creative Suite 3 because
he is most familiar with this software.  It can probably all be done
using Open Source software.  Someone else will need to add those  
* Find a book with images you would like to extract.  Keep in mind
that the scan resolution is not extremely high, so small images may not
have enough resolution to produce good SVGs.
* Download the PDF version of the book.  It usually has the best
resolution.  The black and white PDF is made for reading the
text and might not have the best images.  It is better to get the
full color PDF and do your own adjustments.
* Open the PDF in Photoshop.  You will be asked to select a page.   
Pick the page you want and open it.  Then crop the image tight around  
the graphic you want.
* Alternatively you can extract individual pages using Adobe Acrobat  
and then open these single pages in Photoshop.  This can be faster  
and less memory intensive when mining large books.
* Using Image>Adjustments>Black & White, convert the image to black &
white.  The High Contrast Red Filter preset usually does a good job.
* Further enhance the image using Image>Adjustments>Brightness/Contrast
to increase contrast and brightness if necessary, so you get
a nice high contrast image.
* Save the file as .PSD or .TIF or any format that Adobe Illustrator  
* Open this file in Adobe Illustrator.
* Use Live Trace to convert the photo to vector art.  The following
presets usually give the best results: One Color Logo (gives black
lines only, smallest file size), Black & White Logo (white parts are
filled shapes as well) and Comic Art.
* When happy with the image, save it as SVG.  Do not preserve
Illustrator editing capabilities, so the file is basic SVG.
* Using File>Save for Web & Devices, make a PNG file to upload with the SVG.
* Upload the files to OCAL.
* You usually get better results with strong black & white images.
They make nice clean traces with reasonable file sizes.  It is easier
to color these afterwards than to try to trace a full color image and
expect clean, crisp lines.