[Clipart] help for the project (at least next release)

Jon Phillips jon at rejon.org
Tue Sep 4 22:14:11 PDT 2007


On Tue, 2007-09-04 at 17:35 -0600, Alan wrote:
> 
> Below I've summarized some of our previous conversations: (Maybe start
> putting some of this on a wiki?)

Absolutely put up on the wiki and merge into what is there...

> Mockup - see pdf (attached).
> 
> 
> 4 Primary Requirements for Open Clipart
> 
> 
> 1.  Easy to find and download graphics 
> 
> If you want this project to really take off, I would suggest this is
> the priority
> 
> In the absence of being able to download all of the graphics because
> of the sheer size of the package, perhaps it would be better to design
> an installable executable which contains field searchable thumbnails
> and an update function.  I have sent a mockup in pdf.
> 
> Scenario:  The user downloads the utility.  Once the utility is
> downloaded and installed, it automatically begins updating optimized
> thumbnail graphics (png?).  The utility also updates all search
> criteria which is already attached to each graphic.  The user searches
> for a particular graphic using the local utility.  The user views the
> thumbnail(s) which also contains a link to the full svg graphic.  If
> the user likes and wants the full graphic, there is a check box beside
> the thumbnail graphic in the utility.  The user selects as many
> graphics as he/she desires, using the checkbox.  Once all selections
> are made, the user clicks a download button and the full svg graphic
> is downloaded from the server.
> 
> There is a legend below the checkbox for each thumbnail.  This legend
> indicates whether the graphic has already been downloaded and whether
> updates are available for the graphic.  There is also the option to
> choose which personal storage folder to add this to (drop-down list).
> 
> (The user can create personal storage folders for downloaded graphics.
> These then become available under the thumbnails.)
> 
> These thumbnail graphics have a "Thumbprint" - essentially an
> absolutely unique id which permanently identifies the original graphic.
> I would suggest the thumbprint have a time-date-author stamp which
> should make for a completely unique id.  The stamp should be
> automatically applied when the graphic is first submitted.  Allow the
> time stamp to include seconds or even milliseconds?  
> 
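A minimal sketch of the thumbprint idea, assuming a time-date-author stamp with millisecond precision. The function name make_thumbprint and the stamp layout are made up for illustration; nothing here is an existing OCAL or ccHost format.

```python
import re
from datetime import datetime, timezone


def make_thumbprint(author: str, submitted_at: datetime) -> str:
    """Build a time-date-author stamp for a newly submitted graphic.

    Assumes submitted_at is timezone-aware; it is normalized to UTC so
    that stamps from different servers or users sort consistently.
    """
    utc = submitted_at.astimezone(timezone.utc)
    # Date and time down to the second, then milliseconds as three digits.
    stamp = utc.strftime("%Y%m%dT%H%M%S") + f"{utc.microsecond // 1000:03d}Z"
    # Reduce the author name to lowercase letters and digits joined by "-".
    slug = re.sub(r"[^a-z0-9]+", "-", author.lower()).strip("-")
    return f"{stamp}-{slug}"
```

Two uploads by the same author within the same millisecond would still collide, so a server applying this stamp would probably want to check for an existing id before accepting it.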
> These thumbnails also have the same search fields as the original
> svg graphic.  Much thought should be put into search fields.  Some
> more obvious ones are author, date, content.  The search fields can be
> part of a drop down list, where the user can choose an item such as
> author and then to the right enter keywords.  There should also be a
> feature to add or remove search fields in order to narrow the list
> even further.  (See mockup: more - less.)
> 
> Later issues would perhaps include integration of the utility into
> OpenOffice?

Cool...we were hopeful that the integration work with Inkscape for
import/export Open Clip Art Library could be re-used in other apps.

Anyway, all of what you say is easy to do with ccHost's apis.
http://creativecommons.org/project/cchost
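
As a rough illustration of the "pick a field, enter keywords, add more fields to narrow" search described above, here is a small local sketch. It does not use ccHost's API at all; the Clip record and the search function are hypothetical names, and the metadata is assumed to already be on the user's machine.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    key: str                      # permanent id of the graphic
    author: str
    date: str                     # e.g. "2006-12-12"
    tags: list[str] = field(default_factory=list)


def search(clips, criteria):
    """Return the clips that match every (field_name, keyword) pair.

    Each extra pair in criteria narrows the result, like adding another
    search row in the mockup. Matching is a case-insensitive substring test.
    """
    def matches(clip, name, keyword):
        value = getattr(clip, name)
        keyword = keyword.lower()
        if isinstance(value, list):
            return any(keyword in item.lower() for item in value)
        return keyword in value.lower()

    return [c for c in clips if all(matches(c, n, k) for n, k in criteria)]
```

For example, search(clips, [("tags", "broom"), ("author", "zeimusu")]) would keep only broom-tagged clips by that author, and an empty criteria list returns everything.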

> 2.  Easy for the server to handle all required tasks
> 
> As you can see below, updating thumbnails and creating them can be
> resource intensive.  What is the best way to do this?
> 
> If it were possible to decentralize the files as in a peer-sharing
> form of search utility, download problems would be removed for large
> files, but security may become an issue for the user.

Creating thumbnails on the local machine would be best. It would also
help to push SVG as a file format, so that thumbnails don't need to be
generated at all :)

> 3.  Easy to administer
> 
> Everything needs to be setup in such a way that updating the database
> with search criteria and graphics is as easy as possible.
> 
> Is there a backup in place for server content?

The content downloads at http://openclipart.org/downloads are on a
fast, mirrored OSUOSL ftp server. We considered putting all uploads
there as well, but currently do not...we need to think more about
mirroring content, etc...

> 4.  Easy to add new graphics
> 
> This seems to already work pretty well.
> 

Cool :)

Cheers, keep the ideas flowing!

> 
> 
> 
> 
> ---------------------------
> The more we can work with the materials already to hand, probably the
> easier.  All of the graphics are already available in small bite-size
> png chunks.  I'm not sure how everything is currently set up, but how
> about assigning a descriptive permanent name to each png graphic which
> then becomes a permanent key.  Perhaps it could be done using a numeric
> sequence which is the date and time of submission and/or acceptance into
> the database.  This would practically guarantee a completely unique key
> for each graphic.  That permanent sequence then can be attached to all
> kinds of information, including search data (descriptive terms, size,
> colors, author, date, whatever).  That sequence could be attached in a
> local utility to a download function for png for quick local searches,
> but it could also be used to define the svg download.  Your update
> function in the local utility could be used to update a text based file
> or files which would update the search fields any time you wanted.
> Quick downloads with easy updates to search functionality.  Your png
> graphics for local search would also be quick downloads in the small
> chunks.  The biggest update of course would be the first update where
> the utility and the bulk of the png graphics would be downloaded.  After
> that, it would be just very small chunks of updates for search fields
> and added png graphics.  When a graphic or graphics is chosen for svg
> download, the choices are loaded into a component of the utility which
> knows to look for the svg component attached to the permanent key.
> 
> This allows complete flexibility with updating the search fields, easy
> identification of graphics based on a permanent key, quick updates
> locally after initial installation, offloading of search function and
> storage to local system, increase of apparent speed in searches due to
> local storage, redundancy should the www or the server be slowed down,
> and enables the user to use the png graphics should the svg not be
> immediately available.
> 
> I'm sure I've missed things, and I'm not a programmer (yet), but would
> it be possible to set this up?
> 
> Alan

Sure! It just waits on some coding...ccHost and the code and content are
open and at your disposal! Jump on in! And, ask lots of questions.
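
To make the "small chunks of updates" part concrete, here is one possible sketch of how a local utility could decide what to fetch once every graphic has a permanent key. The plan_update name and the idea of a per-graphic revision number are assumptions for illustration; the thread does not specify how the server would publish its index.

```python
def plan_update(local_index, server_index):
    """Compare two {permanent_key: revision} maps.

    Returns (to_fetch, to_drop): keys whose thumbnail and search data are
    new or changed on the server, and keys the server no longer lists.
    """
    to_fetch = sorted(
        key for key, rev in server_index.items() if local_index.get(key) != rev
    )
    to_drop = sorted(key for key in local_index if key not in server_index)
    return to_fetch, to_drop
```

On first install the local index is empty, so everything is fetched; after that only keys that were added or changed show up in to_fetch, which keeps each update small.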

> ------------------
> 
> You're thinking in terms of "on-demand" PDFs.  If those PDFs could be
> generated every week on Sunday Morning at 1 a.m. when server load is
> low...
> 
> Another question is what package you were using to generate the PDFs.
> Some are faster and/or lower-demand than others.  It might be worth
> exploring a few different methods and performance testing/tuning them
> to determine which one brings in the best combo of speed and CPU load.
> 
> Perhaps the best bet is not to create the PDFs from SVG, but to use a
> two stage process where all the SVGs are converted into JPEG using
> Batik or ImageMagick or commandline Inkscape, then the PDFs are built
> using the JPEG images.  That might be faster because you can use the
> fastest method for all the SVG to JPEG conversions, then the fastest
> method to generate PDFs.
> 
> Or... offload portions of the processing to the user's machine.
> 
> Once you built some XML indices and bitmaps of the images, you could
> build a collection browser frontend that offloaded a large chunk of
> the processing to a client-side interface in Flash.
> 
> For example, if you wanted a tag browser, you create an XML file...
> 
>     <item>
>       <itemtitle>Witch on Broom</itemtitle>
>       <author>Zeimusu</author>
>       <date>12-12-2006</date>
>       <svgurl>http://...</svgurl>
>       <bitmapurl>http://...</bitmapurl>
>       <thumburl>http://...</thumburl>
>       <tags>witch,broom,Halloween,clip_art</tags>
>     </item>
> 
> For the current collection of 3000 or so tagged images, the XML file
> might be 1.5-2 megs.  Use zLib compression (Flash 9 / ActionScript 3
> has zLib compression support, IIRC) to compress the XML file.
> 
> The frontend can unzip the file, then parse the XML into Tag and
> Author arrays which can be used to generate thumbnail indices for all
> the tags and authors, a higher-res (say 400x400) preview of individual
> images, and offer the user an SVG download link.  Generate the XML
> file and bitmaps nightly during a slower period.
> 
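The unzip-then-parse step can be sketched outside Flash too. The following assumes the items are wrapped in a root element called items (the example above only shows a single item) and uses Python's zlib and ElementTree purely for illustration; build_indices is a hypothetical name.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

# One item copied from the example above; the <items> wrapper is assumed.
SAMPLE_XML = b"""<items>
  <item>
    <itemtitle>Witch on Broom</itemtitle>
    <author>Zeimusu</author>
    <date>12-12-2006</date>
    <svgurl>http://...</svgurl>
    <bitmapurl>http://...</bitmapurl>
    <thumburl>http://...</thumburl>
    <tags>witch,broom,Halloween,clip_art</tags>
  </item>
</items>"""


def build_indices(compressed: bytes):
    """Decompress the index file and group item titles by tag and by author."""
    root = ET.fromstring(zlib.decompress(compressed))
    by_tag, by_author = defaultdict(list), defaultdict(list)
    for item in root.iter("item"):
        title = item.findtext("itemtitle", default="")
        by_author[item.findtext("author", default="")].append(title)
        # Tags are a comma-separated string; normalize to lowercase.
        for tag in item.findtext("tags", default="").split(","):
            if tag.strip():
                by_tag[tag.strip().lower()].append(title)
    return dict(by_tag), dict(by_author)


# What a nightly job might publish: the same XML, zlib-compressed.
compressed_index = zlib.compress(SAMPLE_XML)
```

A client would download compressed_index, call build_indices on it, and then render thumbnail pages per tag or per author from the two dictionaries.
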
> Make the XML file structure available and let people play with
> developing client side browsers based on the XML.  Heck, you might
> make the browser a Flash component and let people have fun doing
> mash-ups that incorporate the component.
> 
> OTOH, there's also the option of making sure Google has an accurate
> sitemap of all the different pages you want to offer, then using an
> embedded Google Search (using the Google AJAX search API or AdSense
> for Search) to offload some of the search stress to Google.

Yes, creating a google sitemap plugin for ccHost would be brilliant to
get more traffic.
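
For anyone who wants to try it, the sitemap file itself is simple. This is only a sketch of the file format (the sitemaps.org urlset/url/loc/lastmod elements), not a ccHost plugin; build_sitemap and the page list passed to it are hypothetical, and how ccHost would enumerate its pages is left open.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Serialize (url, lastmod) pairs as a sitemap XML string.

    lastmod is expected as a YYYY-MM-DD string.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

The output could be written to a file and regenerated nightly alongside the other batch jobs discussed in this thread.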

All your ideas are great, and I would say, flesh out your plan more and
put it up on the wiki...others?

Jon

> 
> - Greg
> 
> 
> Alan wrote:
> 
> > "would be not too difficult" 
> > 
> > One of the really great things about open source software is that it
> > truly benefits from ease of use and simplicity.   Highly effective
> > ideas made as effective/transparent/simple as possible is true
> > genius.  Open source software caters to this model because there is
> > nothing to hide and nothing to protect. 
> > 
> > Be the best you can be, because open source encourages exactly
> > this. 
> > 
> > 
> Alan, I already tried to build PDF catalogues from the last publicly
> provided package, 0.18. Writing the code wasn't a big problem: the
> problem was the really large number of clipart files already in OCAL.
> Generating PDFs on the fly took too much time and CPU. And OCAL still
> grows and grows. So I think the only way to produce PDFs from OCAL
> would be offline. To do that, I would need information about the
> structure of the database, how files are stored and referenced, and
> so on.
> 
> Best, 
> 
> Tom 
> 
> 
> 
> Jon Phillips wrote:
> > > On Thu, 2007-05-10 at 11:02 -0600, Alan wrote:
> > >   
> >   
> > > > >> I've been thinking.
> > > > >>
> > > > >> Your server is going to get hammered every time people do
> > > > >> searches on it for particular graphics.   I like the idea of
> > > > >> setting up a thumbnail database and a search function to find
> > > > >> particular photos, or you could organize by type as well.
> > > > >>
> > > > >> Would it be possible to create a utility which downloads to a
> > > > >> user's computer the thumbnail database and other search
> > > > >> criteria?  This would allow searches to take place locally.
> > > > >> When a user has decided what they want, the utility will
> > > > >> connect to the server and download what they want.  The local
> > > > >> utility could be set up to link to a complete download or a
> > > > >> multiple set of chosen graphics or a single picture at a time.
> > > > >>
> > > > >> This will reduce the workload on the server and allow users to
> > > > >> browse through the thumbnail photo collection at their leisure
> > > > >> on their own computer.
> > > >>     
> > >     
> > >
> > > This is a great idea. Would you like to help realize this?
> > >
> > > > Bryce Harrington (on the list) has been working on Inkscape + Open
> > > > Clip Art Library integration, and this is definitely something we
> > > > have talked about.
> > >
> > > Jon!
> 
> 
> John Olsen wrote:
> > 
> > I agree with Jon that online searching and use of the archive is
> > probably more likely in the future.  Pointing at resources on the
> > web rather than downloading them all seems like the trend.  Maybe it
> > is the visualist in me, but for a library of any art I think a
> > browser with thumbnails is the most important feature.  It is
> > currently quite tedious to drill down to each piece of art then back
> > up again.  Even a next and previous button might help here.  But
> > clearly being able to see the collection displayed as a gallery
> > would be a huge improvement.
> > 
> > And even though the classic method for organizing files is folder
> > and sub-folder, it seems the tagging system is far more dynamic and
> > lets everyone custom tailor their results.
> > 
> > BTW, I have the cleanup under 100 items now, but some of them are
> > not fixable by me.  I think they were made with the beta of Inkscape
> > or something as I can find no way to clean them up without breaking
> > them.
> > 
> >   
-- 
Jon Phillips

San Francisco, CA
USA PH 510.499.0894
jon at rejon.org
http://www.rejon.org

MSN, AIM, Yahoo Chat: kidproto
Jabber Chat: rejon at gristle.org
IRC: rejon at irc.freenode.net



