[Clipart] Searching and download graphics

Wed May 16 13:54:18 PDT 2007

You're thinking in terms of "on-demand" PDFs.  What if those PDFs were 
generated once a week instead, say Sunday morning at 1 a.m. when server 
load is low?

Another question is what package you were using to generate the PDFs. 
Some are faster and/or lower-demand than others.  It might be worth 
exploring a few different methods and performance testing/tuning them to 
determine which one brings in the best combo of speed and CPU load.

Perhaps the best bet is not to create the PDFs from SVG, but to use a 
two-stage process where all the SVGs are converted into JPEG using Batik 
or ImageMagick or command-line Inkscape, then the PDFs are built from 
the JPEG images.  That might be faster because you can pick the fastest 
method for the SVG to JPEG conversions and, separately, the fastest 
method for generating the PDFs.
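
As a rough sketch of that two-stage idea (not a tested pipeline), here 
is how it could look in Python with ImageMagick doing both stages.  The 
function names and paths are made up for illustration.  It assumes the 
classic "convert" binary is on the PATH (newer ImageMagick releases call 
it "magick") and that your build can read SVG.  Either stage could be 
swapped for Batik or Inkscape.

```python
import subprocess
from pathlib import Path


def svg_to_jpeg_cmd(svg_path, jpeg_dir):
    """Stage 1 command: rasterize one SVG to a JPEG in jpeg_dir."""
    jpeg_path = Path(jpeg_dir) / (Path(svg_path).stem + ".jpg")
    return ["convert", str(svg_path), str(jpeg_path)], jpeg_path


def jpegs_to_pdf_cmd(jpeg_paths, pdf_path):
    """Stage 2 command: ImageMagick writes one PDF page per input image."""
    return ["convert", *[str(p) for p in jpeg_paths], str(pdf_path)]


def build_catalogue(svg_paths, jpeg_dir, pdf_path, run=subprocess.run):
    """Run both stages. `run` is injectable so the pipeline can be dry-run."""
    jpeg_paths = []
    for svg in svg_paths:
        cmd, jpeg = svg_to_jpeg_cmd(svg, jpeg_dir)
        run(cmd, check=True)
        jpeg_paths.append(jpeg)
    run(jpegs_to_pdf_cmd(jpeg_paths, pdf_path), check=True)
    return jpeg_paths
```

Keeping the two commands in separate functions makes it easy to time 
each stage on its own and replace whichever one turns out to be slow.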

Or... offload portions of the processing to the user's machine.

Once you've built some XML indices and bitmaps of the images, you could 
build a collection browser frontend that offloads a large chunk of the 
processing to a client-side interface in Flash.

For example, if you wanted a tag browser, you'd create an XML file with 
one entry per image, like this:

	<item>
	  <itemtitle>Witch on Broom</itemtitle>
	  <author>Zeimusu</author>
	  <date>12-12-2006</date>
	  <svgurl>http://...</svgurl>
	  <bitmapurl>http://...</bitmapurl>
	  <thumburl>http://...</thumburl>
	  <tags>witch,broom,Halloween,clip_art</tags>
	</item>
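
A minimal sketch of generating that kind of index server-side, in 
Python with the standard library.  The element names inside <item> come 
from the example above.  The <items> root element, the record 
dictionaries, and the build_index name are my assumptions, since the 
real database layout isn't known here.

```python
import xml.etree.ElementTree as ET

# Child elements of <item>, in the order used in the example above.
FIELDS = ("itemtitle", "author", "date", "svgurl", "bitmapurl", "thumburl")


def build_index(records):
    """Serialize a list of record dicts into one XML index string."""
    root = ET.Element("items")  # root element name is an assumption
    for rec in records:
        item = ET.SubElement(root, "item")
        for field in FIELDS:
            ET.SubElement(item, field).text = rec[field]
        # Tags are stored as one comma-separated string, as in the example.
        ET.SubElement(item, "tags").text = ",".join(rec["tags"])
    return ET.tostring(root, encoding="unicode")
```

Each record would be a dict with the six string fields plus a "tags" 
list, filled from wherever the collection metadata actually lives.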

For the current collection of 3000 or so tagged images, the XML file 
might be 1.5-2 megs.  Use zlib compression (Flash 9 / ActionScript 3 has 
zlib compression support, IIRC) to compress the XML file.
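
On the server side the compression step is tiny.  A sketch in Python, 
assuming the index is already in memory as a string (the function names 
are made up):

```python
import zlib


def compress_index(xml_text):
    """Compress the XML index with zlib at the highest compression level."""
    return zlib.compress(xml_text.encode("utf-8"), 9)


def decompress_index(blob):
    """Inverse of compress_index, handy for checking the round trip."""
    return zlib.decompress(blob).decode("utf-8")
```

XML this repetitive should compress very well, but I haven't measured 
it against the real collection.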

The frontend can decompress the file, then parse the XML into Tag and 
Author arrays which can be used to generate thumbnail indices for all 
the tags and authors, show a higher-res (say 400x400) preview of 
individual images, and offer the user an SVG download link.  Generate 
the XML file and bitmaps nightly during a slower period.
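
To make the Tag and Author arrays concrete, here is the same indexing 
logic sketched in Python rather than ActionScript.  The real frontend 
would do the equivalent on the client.  The build_indices name and the 
dict-of-lists layout are my assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indices(xml_text):
    """Group <item> entries by tag and by author."""
    by_tag = defaultdict(list)
    by_author = defaultdict(list)
    for item in ET.fromstring(xml_text).iter("item"):
        # Flatten the item's child elements into a plain dict.
        entry = {child.tag: (child.text or "") for child in item}
        by_author[entry.get("author", "")].append(entry)
        for tag in entry.get("tags", "").split(","):
            tag = tag.strip()
            if tag:
                by_tag[tag].append(entry)
    return by_tag, by_author
```

A thumbnail page for one tag is then just the "thumburl" of every entry 
in by_tag[tag], and the download link is that entry's "svgurl".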

Make the XML file structure available and let people play with 
developing client side browsers based on the XML.  Heck, you might make 
the browser a Flash component and let people have fun doing mash-ups 
that incorporate the component.

OTOH, there's also the option of making sure Google has an accurate 
sitemap of all the different pages you want to offer, then using an 
embedded Google Search (using the Google AJAX search API or AdSense for 
Search) to offload some of the search stress to Google.
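
Generating the sitemap itself could be part of the same nightly job.  A 
minimal Python sketch, assuming you already have the list of page URLs 
you want indexed (build_sitemap is a made-up name, and optional 
elements such as <lastmod> are left out):

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```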

- Greg

Thomas Zastrow wrote:
> Alan wrote:
>> "would be not too difficult"
>>
>> One of the really great things about open source software is that it 
>> truly benefits from ease of use and simplicity.   Highly effective 
>> ideas made as effective/transparent/simple as possible is true 
>> genius.  Open source software caters to this model because there is 
>> nothing to hide and nothing to protect.
>>
>> Be the best you can be, because open source encourages exactly this.
>>
>>
> Alan, I already tried to build PDF-catalogues from the last public 
> provided package 0.18. It wasn't a big problem to write the code: the 
> problem was the really big amount of cliparts which were already in OCAL. 
> It took too much time and CPU performance for generating PDFs on the 
> fly. And OCAL still grows and grows. So, I think the only way for 
> producing PDFs from OCAL would be an offline way. For doing that, I 
> would need information about the structure of the database, how files 
> are stored and referenced and so on.
> 
> Best,
> 
> Tom