[Clipart] help for the project (at least next release)

Alan alan at ccnsweb.com
Tue Sep 4 16:35:25 PDT 2007


Below I've summarized some of our previous conversations.  (Maybe we should start putting some of this on a wiki?)

Mockup: see the attached PDF.


*4 Primary Requirements for Open Clipart*


*1.  Easy to find and download graphics*

If you want this project to really take off, I would suggest this is the priority.

Since the sheer size of the package makes downloading all of the graphics impractical, perhaps it would be better to design an installable utility that contains field-searchable thumbnails and an update function.  I have attached a mockup as a PDF.

Scenario:  The user downloads and installs the utility.  Once installed, it automatically begins updating optimized thumbnail graphics (PNG?).  The utility also updates all search criteria that are already attached to each graphic.  The user searches for a particular graphic using the local utility and views the thumbnail(s), each of which links to the full SVG graphic.  If the user likes a graphic and wants the full version, there is a checkbox beside its thumbnail in the utility.  The user selects as many graphics as he/she desires using the checkboxes.  Once all selections are made, the user clicks a download button and the full SVG graphics are downloaded from the server.

There is a legend below the checkbox for each thumbnail.  This legend indicates whether the graphic has already been downloaded and whether updates are available for it.  There is also the option to choose which personal storage folder to add it to (drop-down list).

(The user can create personal storage folders for downloaded graphics.  These then appear in the drop-down list under the thumbnails.)
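In rough terms, the download step might look something like this sketch (the function, field names, and folder layout are made up for illustration; nothing here is the real OCAL setup):

    import urllib.request
    from pathlib import Path

    def download_selected(selected, folder):
        """Fetch the full SVG of every checked thumbnail into a personal folder."""
        dest = Path.home() / "openclipart" / folder
        dest.mkdir(parents=True, exist_ok=True)
        for rec in selected:
            # each record is assumed to look like {"id": "...", "svg_url": "http://..."}
            urllib.request.urlretrieve(rec["svg_url"], str(dest / (rec["id"] + ".svg")))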

These thumbnail graphics have a "thumbprint" - essentially a unique ID which permanently identifies the original graphic.  I would suggest the thumbprint be a time-date-author stamp, which should make for a completely unique ID.  The stamp should be applied automatically when the graphic is first submitted.  Allow the timestamp to include seconds or even milliseconds?
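Roughly, such a thumbprint might be generated like this (a sketch only; the function name and stamp format are just illustrations, not an existing OCAL convention):

    import re
    from datetime import datetime, timezone

    def make_thumbprint(author, submitted=None):
        """Build a time-date-author stamp, e.g. '20070904-163525-123-alan'."""
        submitted = submitted or datetime.now(timezone.utc)
        # millisecond precision keeps two uploads by the same author distinct
        ms = submitted.microsecond // 1000
        stamp = submitted.strftime("%Y%m%d-%H%M%S") + "-%03d" % ms
        slug = re.sub(r"[^a-z0-9]+", "-", author.lower()).strip("-")
        return stamp + "-" + slug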

These thumbnails also have the identical search fields of the original SVG graphic.  Much thought should be put into the search fields.  Some of the more obvious ones are author, date, and content.  The search fields can be part of a drop-down list, where the user chooses an item such as author and then enters keywords to the right.  There should also be a feature to add or remove search fields in order to narrow the list even further.  (See the more/less controls in the mockup.)
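The local filtering could be something as simple as this sketch, assuming each thumbnail record is a small dictionary keyed by its thumbprint (all field names are illustrative):

    def search(records, **criteria):
        """Return records whose fields contain every given keyword (case-insensitive)."""
        hits = []
        for rec in records:
            if all(str(value).lower() in str(rec.get(field, "")).lower()
                   for field, value in criteria.items()):
                hits.append(rec)
        return hits

    # e.g. search(index.values(), author="zeimusu", tags="halloween")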

Later work might include integrating the utility into OpenOffice.

*2.  Easy for the server to handle all required tasks*

As you can see below, creating and updating thumbnails can be resource intensive.  What is the best way to do this?

If it were possible to decentralize the files, as in a peer-sharing form of search utility, download problems for large files would disappear, but security might become an issue for the user.
*3.  Easy to administer*

Everything needs to be set up in such a way that updating the database with search criteria and graphics is as easy as possible.

Is there a backup in place for server content?

*4.  Easy to add new graphics*

This seems to already work pretty well.

---------------------------
The more we can work with the materials already to hand, the easier 
this will probably be.  All of the graphics are already available in 
small, bite-size PNG chunks.  I'm not sure how everything is currently 
set up, but how about assigning a descriptive permanent name to each 
PNG graphic which then becomes a permanent key?  Perhaps it could be 
done using a numeric sequence made from the date and time of submission 
and/or acceptance into the database.  This would practically guarantee 
a completely unique key for each graphic.

That permanent sequence can then be attached to all kinds of 
information, including search data (descriptive terms, size, colors, 
author, date, whatever).  In a local utility, the sequence could be 
attached to a download function for the PNG for quick local searches, 
but it could also be used to define the SVG download.  The update 
function in the local utility could refresh a text-based file or files 
which update the search fields any time you wanted.  Quick downloads 
with easy updates to search functionality.

The PNG graphics for local search would also be quick downloads in 
small chunks.  The biggest update, of course, would be the first one, 
where the utility and the bulk of the PNG graphics are downloaded.  
After that, it would be just very small chunks of updates for search 
fields and added PNG graphics.  When a graphic or graphics are chosen 
for SVG download, the choices are loaded into a component of the 
utility which knows to look for the SVG attached to the permanent key.
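As a rough sketch of that text-based update, assuming the server 
published its search data as a single JSON index keyed by the permanent 
key (the URL and file layout below are placeholders, not the real OCAL 
setup):

    import json
    import urllib.request
    from pathlib import Path

    INDEX_URL = "http://openclipart.example/index.json"       # placeholder
    LOCAL_INDEX = Path.home() / "openclipart" / "index.json"

    def update_index():
        """Fetch the latest search-field index and merge it into the local copy."""
        local = json.loads(LOCAL_INDEX.read_text()) if LOCAL_INDEX.exists() else {}
        with urllib.request.urlopen(INDEX_URL) as resp:
            remote = json.load(resp)
        local.update(remote)       # new keys added, changed fields overwritten
        LOCAL_INDEX.parent.mkdir(parents=True, exist_ok=True)
        LOCAL_INDEX.write_text(json.dumps(local, indent=1))
        return local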

This allows complete flexibility in updating the search fields, easy 
identification of graphics based on a permanent key, quick local 
updates after the initial installation, offloading of the search 
function and storage to the local system, an apparent increase in 
search speed thanks to local storage, redundancy should the web or the 
server be slowed down, and a PNG fallback for the user should the SVG 
not be immediately available.

I'm sure I've missed things, and I'm not a programmer (yet), but would 
it be possible to set this up?

Alan

------------------

You're thinking in terms of "on-demand" PDFs.  If those PDFs could be 
generated every week on Sunday Morning at 1 a.m. when server load is low...

Another question is what package you were using to generate the PDFs. 
Some are faster and/or lower-demand than others.  It might be worth 
exploring a few different methods and performance testing/tuning them to 
determine which one brings in the best combo of speed and CPU load.

Perhaps the best bet is not to create the PDFs from SVG, but to use a 
two-stage process where all the SVGs are converted into JPEG using 
Batik, ImageMagick, or command-line Inkscape, and then the PDFs are 
built from the JPEG images.  That might be faster because you can use 
the fastest method for all the SVG-to-JPEG conversions, then the 
fastest method to generate the PDFs.
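A rough sketch of that first stage, assuming command-line Inkscape 
(pre-1.0 CLI flags) and ImageMagick are installed; the directory names 
are illustrative:

    import subprocess
    from pathlib import Path

    def svg_to_jpeg(svg_dir, out_dir, width=400):
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        for svg in Path(svg_dir).glob("*.svg"):
            png = out / (svg.stem + ".png")
            jpg = out / (svg.stem + ".jpg")
            # Inkscape exports a PNG; ImageMagick then converts it to JPEG
            subprocess.run(["inkscape", str(svg), "--export-png=" + str(png),
                            "--export-width=" + str(width)], check=True)
            subprocess.run(["convert", str(png), str(jpg)], check=True)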

Or... offload portions of the processing to the user's machine.

Once you've built some XML indices and bitmaps of the images, you could 
build a collection-browser frontend that offloads a large chunk of the 
processing to a client-side interface in Flash.

For example, if you wanted a tag browser, you could create an XML file 
with one entry per image...

  <item>
    <itemtitle>Witch on Broom</itemtitle>
    <author>Zeimusu</author>
    <date>12-12-2006</date>
    <svgurl>http://...</svgurl>
    <bitmapurl>http://...</bitmapurl>
    <thumburl>http://...</thumburl>
    <tags>witch,broom,Halloween,clip_art</tags>
  </item>

For the current collection of 3000 or so tagged images, the XML file 
might be 1.5-2 megs.  Use zLib compression (Flash 9 / ActionScript 3 has 
zLib compression support, IIRC) to compress the XML file.

The frontend can unzip the file, then parse the XML into Tag and Author 
arrays which can be used to generate thumbnail indices for all the tags 
and authors, show a higher-res (say 400x400) preview of individual 
images, and offer the user an SVG download link.  Generate the XML file 
and bitmaps nightly during a slower period.
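The frontend itself would be ActionScript, but the decompress-and-index 
step looks roughly like this Python sketch, assuming the <item> entries 
above are wrapped in a single root element:

    import zlib
    import xml.etree.ElementTree as ET
    from collections import defaultdict

    def build_indices(compressed_xml):
        """Inflate the zlib-compressed index and group items by tag and by author."""
        root = ET.fromstring(zlib.decompress(compressed_xml))
        by_tag, by_author = defaultdict(list), defaultdict(list)
        for item in root.iter("item"):
            for tag in item.findtext("tags", "").split(","):
                by_tag[tag.strip()].append(item)
            by_author[item.findtext("author", "")].append(item)
        return by_tag, by_author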

Make the XML file structure available and let people play with 
developing client side browsers based on the XML.  Heck, you might make 
the browser a Flash component and let people have fun doing mash-ups 
that incorporate the component.

OTOH, there's also the option of making sure Google has an accurate 
sitemap of all the different pages you want to offer, then using an 
embedded Google Search (using the Google AJAX search API or AdSense for 
Search) to offload some of the search stress to Google.


- Greg


Alan wrote:


> "would be not too difficult"
>
> One of the really great things about open source software is that it 
> truly benefits from ease of use and simplicity.  Making highly 
> effective ideas as effective/transparent/simple as possible is true 
> genius.  Open source software caters to this model because there is 
> nothing to hide and nothing to protect.
>
> Be the best you can be, because open source encourages exactly this.
>
>
Alan, I already tried to build PDF catalogues from the last publicly 
provided package, 0.18.  Writing the code wasn't a big problem; the 
problem was the really large number of cliparts already in OCAL.  
Generating PDFs on the fly took too much time and CPU, and OCAL keeps 
growing.  So I think the only way to produce PDFs from OCAL would be an 
offline one.  To do that, I would need information about the structure 
of the database: how files are stored and referenced, and so on.
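An offline run along those lines could look roughly like this sketch, 
using the ReportLab library and the JPEG previews from the conversion 
step above (all layout values are illustrative):

    from pathlib import Path
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    def build_catalogue(jpeg_dir, out_pdf, per_row=3):
        page_w, page_h = A4
        cell = page_w / per_row
        c = canvas.Canvas(out_pdf, pagesize=A4)
        for i, jpg in enumerate(sorted(Path(jpeg_dir).glob("*.jpg"))):
            col, row = i % per_row, (i // per_row) % 4
            if i and col == 0 and row == 0:
                c.showPage()                       # new page every 12 images
            x = col * cell
            y = page_h - (row + 1) * (page_h / 4)
            c.drawImage(str(jpg), x + 10, y + 20, width=cell - 20,
                        height=page_h / 4 - 40, preserveAspectRatio=True)
            c.drawString(x + 10, y + 8, jpg.stem)  # label with the file name
        c.save()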

Best,

Tom




Jon Phillips wrote:

> On Thu, 2007-05-10 at 11:02 -0600, Alan wrote:
>
>> I've been thinking.
>>
>> Your server is going to get hammered every time people do searches on it
>> for particular graphics.  I like the idea of setting up a thumbnail
>> database and a search function to find particular photos, or you could
>> organize by type as well.
>>
>> Would it be possible to create a utility which downloads to a user's
>> computer the thumbnail database and other search criteria?  This would
>> allow searches to take place locally.  When a user has decided what they
>> want, the utility will connect to the server and download what they
>> want.  The local utility could be set up to link to a complete download
>> or a multiple set of chosen graphics or a single picture at a time.
>>
>> This will reduce the workload on the server and allow users to browse
>> through the thumbnail photo collection at their leisure on their own
>> computer.
>
> This is a great idea. Would you like to help realize this?
>
> Bryce Harrington (on the list) has been working on Inkscape + Open Clip
> Art Library integration, and this is definitely something we have talked
> about.
>
> Jon!


John Olsen wrote:
>
> I agree with Jon that online searching and use of the archive is
> probably more likely in the future.  Pointing at resources on the web
> rather than downloading them all seems like the trend.  Maybe it is
> the visualist in me, but for a library of any art I think a browser
> with thumbnails is the most important feature.  It is currently quite
> tedious to drill down to each piece of art then back up again.  Even
> a next and previous button might help here.  But clearly being able
> to see the collection displayed as a gallery would be a huge
> improvement.
>
> And even though the classic method for organizing files is folder
> and sub-folder, it seems the tagging system is far more dynamic and
> lets everyone custom-tailor their results.
>
> BTW, I have the cleanup under 100 items now, but some of them are not
> fixable by me.  I think they were made with the beta of Inkscape or
> something, as I can find no way to clean them up without breaking them.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SearchUtility - mockup.pdf
Type: application/pdf
Size: 130196 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/clipart/attachments/20070904/0a4b1d90/attachment.pdf>

