[Clipart] help for the project (at least next release)
Jon Phillips
jon at rejon.org
Tue Sep 4 22:14:11 PDT 2007
On Tue, 2007-09-04 at 17:35 -0600, Alan wrote:
>
> Below I've summarized some of our previous conversations: (Maybe start
> putting some of this on a wiki?)
Absolutely put up on the wiki and merge into what is there...
> Mockup - see pdf (attached).
>
>
> 4 Primary Requirements for Open Clipart
>
>
> 1. Easy to find and download graphics
>
> If you want this project to really take off, I would suggest this is
> the priority
>
> In the absence of being able to download all of the graphics because
> of the sheer size of the package, perhaps it would be better to design
> an installable executable which contains field searchable thumbnails
> and an update function. I have sent a mockup in pdf.
>
> Scenario: The user downloads the utility. Once the utility is
> downloaded and installed, it automatically begins updating optimized
> thumbnail graphics (png?). The utility also updates all search
> criteria which is already attached to each graphic. The user searches
> for a particular graphic using the local utility. The user views the
> thumbnail(s) which also contains a link to the full svg graphic. If
> the user likes and wants the full graphic, there is a check box beside
> the thumbnail graphic in the utility. The user selects as many
> graphics as he/she desires, using the checkbox. Once all selections
> are made, the user clicks a download button and the full svg graphic
> is downloaded from the server.
>
> There is a legend below the checkbox for each thumbnail. This legend
> indicates whether the graphic has already been downloaded and whether
> updates are available for the graphic. There is also the option to
> choose which personal storage folder to add this to (drop-down list).
>
> (The user can create personal storage folders for downloaded graphics.
> These then become available under the thumbnails.)
>
> These thumbnail graphics have a "Thumbprint" - essentially an
> absolutely unique id which permanently identifies the original graphic.
> I would suggest the thumbprint have a time-date-author stamp which
> should make for a completely unique id. The stamp should be
> automatically applied when the graphic is first submitted. Allow the
> time stamp to include seconds or even milliseconds?
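A minimal sketch of the thumbprint idea above, in Python. The function name and the exact id layout (UTC timestamp to the millisecond, a dash, then the author name) are assumptions made for illustration; nothing in this thread fixes a format.

```python
from datetime import datetime, timezone


def make_thumbprint(author: str, submitted_at: datetime) -> str:
    """Build a permanent id from the submission time and the author name."""
    utc = submitted_at.astimezone(timezone.utc)
    # Timestamp down to the millisecond, as suggested above.
    stamp = utc.strftime("%Y%m%dT%H%M%S") + f"{utc.microsecond // 1000:03d}"
    # Keep only letters and digits so the id is safe to use in file names.
    safe_author = "".join(ch for ch in author.lower() if ch.isalnum())
    return f"{stamp}-{safe_author}"


when = datetime(2007, 9, 4, 17, 35, 0, 123000, tzinfo=timezone.utc)
print(make_thumbprint("Alan", when))  # 20070904T173500123-alan
```

Two submissions by the same author within the same millisecond would still collide, so a real implementation would probably want a server-side counter or a uniqueness check on top of this.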
>
> These thumbnails also have the same search fields as the original
> svg graphic. Much thought should be put into search fields. Some
> more obvious ones are author, date, content. The search fields can be
> part of a drop down list, where the user can choose an item such as
> author and then to the right enter keywords. There should also be a
> feature to add or remove search fields in order to narrow the list
> even further. (See mockup: more - less.)
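The add/remove search fields described above could be sketched like this in Python. The `Graphic` record, the `search` function and the sample catalogue are all hypothetical; the only fields used are the "more obvious ones" named above (author, date, content). Each added (field, keyword) pair narrows the result list further.

```python
from dataclasses import dataclass


@dataclass
class Graphic:
    thumbprint: str  # permanent id (hypothetical format)
    author: str
    date: str        # e.g. "2006-12-12"
    content: str     # free-text description or keywords


def search(graphics, criteria):
    """Return the graphics matching every (field, keyword) pair.

    Matching is a case-insensitive substring test on the named field.
    """
    results = []
    for g in graphics:
        if all(kw.lower() in getattr(g, field).lower() for field, kw in criteria):
            results.append(g)
    return results


# Made-up sample data for illustration only.
CATALOGUE = [
    Graphic("20061212T000000000-zeimusu", "Zeimusu", "2006-12-12", "witch broom halloween"),
    Graphic("20070101T000000000-example", "Example Author", "2007-01-01", "broom cleaning"),
]

one_field = search(CATALOGUE, [("content", "broom")])
two_fields = search(CATALOGUE, [("content", "broom"), ("author", "zeimusu")])
print(len(one_field), len(two_fields))
```

Removing a pair from the criteria list widens the search again, which is what the more/less control in the mockup would do.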
>
> Later issues would perhaps include integration of the utility into
> OpenOffice?
Cool...we were hopeful that the integration work with Inkscape for
import/export Open Clip Art Library could be re-used in other apps.
Anyway, all of what you say is easy to do with ccHost's apis.
http://creativecommons.org/project/cchost
> 2. Easy for the server to handle all required tasks
>
> As you can see below, updating thumbnails and creating them can be
> resource intensive. What is the best way to do this?
>
> If it were possible to decentralize the files as in a peer-sharing
> form of search utility, download problems would be removed for large
> files, but security may become an issue for the user.
Creating thumbnails on the local machine would be best. It would also
help to push SVG as a file format, so that there is no need to generate
thumbnails at all :)
> 3. Easy to administer
>
> Everything needs to be setup in such a way that updating the database
> with search criteria and graphics is as easy as possible.
>
> Is there a backup in place for server content?
The downloads of the content at http://openclipart.org/downloads are on a
fast OSUOSL mirrored ftp server. We considered putting up all uploads to
this, but currently are not...we need to think more about mirroring
content, etc...
> 4. Easy to add new graphics
>
> This seems to already work pretty well.
>
Cool :)
Cheers keep the ideas flowing!
>
>
>
>
> ---------------------------
> The more we can work with the materials already to hand, probably the
> easier. All of the graphics are already available in small bite-size
> png chunks. I'm not sure how everything is currently set up, but how
> about assigning a descriptive permanent name to each png graphic which
> then becomes a permanent key. Perhaps it could be done using a numeric
> sequence which is the date and time of submission and/or acceptance
> into the database. This would practically guarantee a completely
> unique key for each graphic. That permanent sequence then can be
> attached to all kinds of information, including search data
> (descriptive terms, size, colors, author, date, whatever). That
> sequence could be attached in a local utility to a download function
> for png for quick local searches, but it could also be used to define
> the svg download. Your update function in the local utility could be
> used to update a text based file or files which would update the
> search fields any time you wanted. Quick downloads with easy updates
> to search functionality. Your png graphics for local search would also
> be quick downloads in the small chunks. The biggest update of course
> would be the first update where the utility and the bulk of the png
> graphics would be downloaded. After that, it would be just very small
> chunks of updates for search fields and added png graphics. When a
> graphic or graphics is chosen for svg download, the choices are loaded
> into a component of the utility which knows to look for the svg
> component attached to the permanent key.
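The permanent-key scheme above might look roughly like this in Python. Everything here is an assumption made for the sketch: the update file is taken to be one JSON object per line, the keys are plain date-time digit strings, and the SVG URL layout is invented. It only shows how a small text update can be merged into a local index and how the same key can point at the SVG.

```python
import json


def apply_update(index, update_text):
    """Merge a text update file into the local index and return the keys touched.

    Each non-empty line is a JSON object carrying a "key" plus the fields to set.
    """
    changed = []
    for line in update_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        key = record.pop("key")
        # New keys are added; for existing keys only the listed fields change.
        index.setdefault(key, {}).update(record)
        changed.append(key)
    return changed


def svg_url(base_url, key):
    """Derive the SVG download location from the key (hypothetical layout)."""
    return f"{base_url.rstrip('/')}/{key}.svg"


index = {"20070904173500": {"author": "alan", "tags": ["tree"]}}
update = (
    '{"key": "20070904173500", "tags": ["tree", "oak"]}\n'
    '{"key": "20070905090000", "author": "tom", "tags": ["cat"]}\n'
)
print(apply_update(index, update))
print(svg_url("http://example.org/svg/", "20070905090000"))
```

After the first full download, each later update would only need to carry the lines for new or changed graphics.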
>
> This allows complete flexibility with updating the search fields, easy
> identification of graphics based on a permanent key, quick updates
> locally after initial installation, offloading of search function and
> storage to local system, increase of apparent speed in searches due to
> local storage, redundancy should the www or the server be slowed down,
> and enables the user to use the png graphics should the svg not be
> immediately available.
>
> I'm sure I've missed things, and I'm not a programmer (yet), but would
> it be possible to set this up?
>
> Alan
Sure! It just waits on some coding...ccHost and the code and content are
open and at your disposal! Jump on in! And, ask lots of questions.
> ------------------
>
> You're thinking in terms of "on-demand" PDFs. If those PDFs could be
> generated every week on Sunday Morning at 1 a.m. when server load is
> low...
>
> Another question is what package you were using to generate the PDFs.
> Some are faster and/or lower-demand than others. It might be worth
> exploring a few different methods and performance testing/tuning them
> to determine which one brings in the best combo of speed and CPU load.
>
> Perhaps the best bet is not to create the PDFs from SVG, but to use a
> two stage process where all the SVGs are converted into JPEG using
> Batik or ImageMagick or commandline Inkscape, then the PDFs are built
> using the JPEG images. That might be faster because you can use the
> fastest method for all the SVG to JPEG conversions, then the fastest
> method to generate PDFs.
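One way to picture the two-stage process above is a small Python planner that only builds the command lines for an offline batch job. It assumes ImageMagick's `convert` tool for both stages (Batik or Inkscape could be swapped in for stage 1), assumes the local ImageMagick install can read SVG, and uses made-up file paths. Nothing is executed here.

```python
from pathlib import Path


def plan_catalogue(svg_paths, work_dir, pdf_path):
    """Return (stage1, stage2): per-file SVG -> JPEG commands, then one JPEGs -> PDF command."""
    work = Path(work_dir)
    stage1 = []
    jpegs = []
    for svg in map(Path, svg_paths):
        jpeg = work / (svg.stem + ".jpg")
        # Stage 1: rasterize each SVG to a JPEG.
        stage1.append(["convert", str(svg), str(jpeg)])
        jpegs.append(str(jpeg))
    # Stage 2: bundle all JPEGs into a single multi-page PDF.
    stage2 = ["convert", *jpegs, str(pdf_path)]
    return stage1, stage2


stage1, stage2 = plan_catalogue(["art/witch.svg", "art/cat.svg"], "tmp", "catalogue.pdf")
for cmd in stage1 + [stage2]:
    print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run(cmd, check=True)` from a weekly scheduled job, and the two stages can be timed separately to compare converters.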
>
> Or... offload portions of the processing to the user's machine.
>
> Once you built some XML indices and bitmaps of the images, you could
> build a collection browser frontend that offloaded a large chunk of
> the processing to a client-side interface in Flash.
>
> For example, if you wanted a tag browser, you create an XML file...
>
> <item>
> <itemtitle>Witch on Broom</itemtitle>
> <author>Zeimusu</author>
> <date>12-12-2006</date>
> <svgurl>http://...</svgurl>
> <bitmapurl>http://...</bitmapurl>
> <thumburl>http://...</thumburl>
> <tags>witch,broom,Halloween,clip_art</tags>
> </item>
>
> For the current collection of 3000 or so tagged images, the XML file
> might be 1.5-2 megs. Use zLib compression (Flash 9 / ActionScript 3
> has zLib compression support, IIRC) to compress the XML file.
>
> The frontend can unzip the file, then parse the XML into Tag and
> Author arrays which can be used to generate thumbnail indices for all
> the tags and authors, a higher-res (say 400x400) preview of individual
> images, and offer the user an SVG download link. Generate the XML file
> and bitmaps nightly during a slower period.
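The unzip-and-parse step above, sketched in Python rather than Flash/ActionScript purely for illustration. The `<catalogue>` root element is an assumption (only a single `<item>` is shown above), and the elided URLs are left exactly as they appear in the example.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

CATALOGUE_XML = """<catalogue>
  <item>
    <itemtitle>Witch on Broom</itemtitle>
    <author>Zeimusu</author>
    <date>12-12-2006</date>
    <svgurl>http://...</svgurl>
    <bitmapurl>http://...</bitmapurl>
    <thumburl>http://...</thumburl>
    <tags>witch,broom,Halloween,clip_art</tags>
  </item>
</catalogue>"""


def build_indices(compressed):
    """Decompress the catalogue and map each tag and each author to item titles."""
    root = ET.fromstring(zlib.decompress(compressed))
    by_tag = defaultdict(list)
    by_author = defaultdict(list)
    for item in root.iter("item"):
        title = item.findtext("itemtitle")
        by_author[item.findtext("author")].append(title)
        for tag in item.findtext("tags", default="").split(","):
            if tag.strip():
                by_tag[tag.strip()].append(title)
    return by_tag, by_author


# The server side would do the compression during the nightly run.
compressed = zlib.compress(CATALOGUE_XML.encode("utf-8"))
by_tag, by_author = build_indices(compressed)
print(sorted(by_tag), dict(by_author))
```

The two dictionaries play the role of the Tag and Author arrays: a tag browser would list the keys of `by_tag` and show thumbnails for the titles under the chosen key.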
>
> Make the XML file structure available and let people play with
> developing client side browsers based on the XML. Heck, you might make
> the browser a Flash component and let people have fun doing mash-ups
> that incorporate the component.
>
> OTOH, there's also the option of making sure Google has an accurate
> sitemap of all the different pages you want to offer, then using an
> embedded Google Search (using the Google AJAX search API or AdSense
> for Search) to offload some of the search stress to Google.
Yes, creating a google sitemap plugin for ccHost would be brilliant to
get more traffic.
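For anyone who wants to try the sitemap idea, here is a rough Python sketch that writes a sitemap in the sitemaps.org 0.9 XML format. It is not a ccHost plugin and uses no ccHost API; the function name is made up, and the list of page URLs would have to come from wherever the site keeps them.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return a sitemap document with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_sitemap([
    "http://openclipart.org/",
    "http://openclipart.org/downloads",
])
print(xml_text)
```

The output could be regenerated during the same nightly run as the other indices and saved as sitemap.xml on the site.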
All your ideas are great, and I would say, flesh out your plan more and
put it up on the wiki...others?
Jon
>
> - Greg
>
>
> Alan schrieb:
>
> > "would be not too difficult"
> >
> > One of the really great things about open source software is that it
> > truly benefits from ease of use and simplicity. Highly effective
> > ideas made as effective/transparent/simple as possible is true
> > genius. Open source software caters to this model because there is
> > nothing to hide and nothing to protect.
> >
> > Be the best you can be, because open source encourages exactly
> > this.
> >
> >
> Alan, I already tried to build PDF catalogues from the last publicly
> provided package, 0.18. Writing the code wasn't a big problem: the
> problem was the really large amount of clipart already in OCAL.
> Generating PDFs on the fly took too much time and CPU performance.
> And OCAL still grows and grows. So I think the only way to produce
> PDFs from OCAL would be offline. For that, I would need information
> about the structure of the database, how files are stored and
> referenced, and so on.
>
> Best,
>
> Tom
>
>
>
> Jon Phillips wrote:
> > > On Thu, 2007-05-10 at 11:02 -0600, Alan wrote:
> > >
> >
> > > >> I've been thinking.
> > > >>
> > > >> Your server is going to get hammered every time people do
> > > >> searches on it for particular graphics. I like the idea of
> > > >> setting up a thumbnail database and a search function to find
> > > >> particular photos, or you could organize by type as well.
> > > >>
> > > >> Would it be possible to create a utility which downloads to a
> > > >> user's computer the thumbnail database and other search
> > > >> criteria? This would allow searches to take place locally. When
> > > >> a user has decided what they want, the utility will connect to
> > > >> the server and download what they want. The local utility could
> > > >> be set up to link to a complete download or a multiple set of
> > > >> chosen graphics or a single picture at a time.
> > > >>
> > > >> This will reduce the workload on the server and allow users to
> > > >> browse through the thumbnail photo collection at their leisure
> > > >> on their own computer.
> > > >>
> > > >>
> > >
> > >
> > > This is a great idea. Would you like to help realize this?
> > >
> > > Bryce Harrington (on the list) has been working on Inkscape + Open
> > > Clip Art Library integration, and this is definitely something we
> > > have talked about.
> > >
> > > Jon!
>
>
> John Olsen wrote:
> >
> > I agree with Jon that online searching and use of the archive is
> > probably more likely in the future. Pointing at resources on the web
> > rather than downloading them all seems like the trend. Maybe it is
> > the visualist in me, but for a library of any art I think a browser
> > with thumbnails is the most important feature. It is currently quite
> > tedious to drill down to each piece of art then back up again. Even
> > a next and previous button might help here. But clearly being able
> > to see the collection displayed as a gallery would be a huge
> > improvement.
> >
> > And even though the classic method for organizing files is folder
> > and sub-folder, it seems the tagging system is far more dynamic and
> > lets everyone custom tailor their results.
> >
> > BTW, I have the cleanup under 100 items now, but some of them are
> > not fixable by me. I think they were made with the beta of Inkscape
> > or something as I can find no way to clean them up without breaking
> > them.
> >
> >
--
Jon Phillips
San Francisco, CA
USA PH 510.499.0894
jon at rejon.org
http://www.rejon.org
MSN, AIM, Yahoo Chat: kidproto
Jabber Chat: rejon at gristle.org
IRC: rejon at irc.freenode.net