[Clipart] help for the project (at least next release)
Alan
alan at ccnsweb.com
Tue Sep 4 18:36:18 PDT 2007
Below I've summarized some of our previous conversations: (Maybe
start putting some of this on a wiki?)
Four Primary Requirements for Open Clipart
1. Easy to find and download graphics
In the absence of being able to download all of the graphics because
of the sheer size of the package, perhaps it would be better to
design an installable executable which contains field searchable
thumbnails and an update function. I have sent a mockup in pdf.
Scenario: The user downloads the utility. Once the utility is
downloaded and installed, it automatically begins updating optimized
thumbnail graphics (png?). The utility also updates all search
criteria which are already attached to each graphic. The user
searches for a particular graphic using the local utility. The user
views the thumbnail(s) which also contain a link to the full svg
graphic. If the user likes and wants the full graphic, there is a
check box beside the thumbnail graphic in the utility. The user
selects as many graphics as he/she desires, using the checkbox. Once
all selections are made, the user clicks a download button and the
full svg graphics are downloaded from the server.
There is a legend below the checkbox for each thumbnail. This legend
indicates whether the graphic has already been downloaded and
whether updates are available for the graphic. There is also the
option to choose which personal storage folder to add it to
(drop-down list).
(The user can create personal storage folders for downloaded
graphics. These then become available under the thumbnails.)
These thumbnail graphics have a "Thumbprint" - essentially an
absolutely unique id which permanently identifies the original
graphic. I would suggest the thumbprint have a time-date-author
stamp which should make for a completely unique id. The stamp should
be automatically applied when the graphic is first submitted. Allow
the time stamp to include seconds or even milliseconds?
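A minimal sketch of such a thumbprint in Python, assuming the stamp is
built from the UTC submission time (to the millisecond) plus a
normalised author name. The function name and the exact layout of the
id are hypothetical, not an existing OCAL format. Two submissions by the
same author within the same millisecond would still collide, so the
server would need to check for an existing id before accepting one.

```python
from datetime import datetime, timezone


def make_thumbprint(author: str, submitted_at: datetime) -> str:
    """Build a time-date-author stamp for a newly submitted graphic."""
    utc = submitted_at.astimezone(timezone.utc)
    # Date and time down to the second, then milliseconds as three digits.
    stamp = utc.strftime("%Y%m%dT%H%M%S") + f"{utc.microsecond // 1000:03d}Z"
    # Lower-case the author and replace anything that is not a letter or
    # digit, so the id is safe to use in file names and URLs.
    slug = "".join(c.lower() if c.isalnum() else "_" for c in author.strip())
    return f"{stamp}-{slug}"


example = make_thumbprint(
    "Zeimusu",
    datetime(2006, 12, 12, 9, 30, 15, 123000, tzinfo=timezone.utc),
)
# example == "20061212T093015123Z-zeimusu"
```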
These thumbnails also have the same search fields as the original
svg graphic. Much thought should be put into search fields. Some
more obvious ones are author, date, content. The search fields can
be part of a drop down list, where the user can choose an item such
as author and then to the right enter keywords. There should also be
a feature to add or remove search fields in order to narrow the list
even further. (See mockup: more - less.)
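The field search described above could look roughly like this in
Python. It is only a sketch: the record layout (a dict per thumbnail
with "key", "author", "date" and "content" entries) is an assumption,
and each (field, keyword) pair stands for one row of the drop-down plus
keyword box. Adding a pair narrows the list, removing one widens it.

```python
def search(records, criteria):
    """Return the records that match every (field, keyword) pair."""

    def matches(record, field, keyword):
        value = record.get(field, "")
        # A field may hold several terms (e.g. content keywords).
        if isinstance(value, (list, tuple, set)):
            value = " ".join(value)
        # Case-insensitive substring match.
        return keyword.lower() in str(value).lower()

    return [r for r in records if all(matches(r, f, k) for f, k in criteria)]


# Hypothetical local thumbnail records.
records = [
    {"key": "k1", "author": "Zeimusu", "date": "2006-12-12",
     "content": ["witch", "broom", "Halloween"]},
    {"key": "k2", "author": "Zeimusu", "date": "2007-01-05",
     "content": ["cat"]},
    {"key": "k3", "author": "Tom", "date": "2006-10-31",
     "content": ["pumpkin", "Halloween"]},
]

by_author = search(records, [("author", "zeimusu")])
narrowed = search(records, [("author", "zeimusu"), ("content", "halloween")])
```

Here by_author holds the records k1 and k2, and adding the second
search field narrows the result to k1 only.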
Later issues would perhaps include integration of the utility into
OpenOffice?
2. Easy for the server to handle all required tasks
As you can see below, updating thumbnails and creating them can be
resource intensive. What is the best way to do this?
If it were possible to decentralize the files as in a peer-sharing
form of search utility, download problems would be removed for large
files, but security may become an issue for the user.
3. Easy to administer
Everything needs to be set up in such a way that updating the
database with search criteria and graphics is as easy as possible.
Is there a backup in place for server content?
4. Easy to add new graphics
This seems to already work pretty well.
---------------------------
The more we can work with the materials already to hand, the easier it
will probably be. All of the graphics are already available in small bite-size
png chunks. I'm not sure how everything is currently set up, but how
about assigning a descriptive permanent name to each png graphic which
then becomes a permanent key. Perhaps it could be done using a numeric
sequence which is the date and time of submission and/or acceptance into
the database. This would practically guarantee a completely unique key
for each graphic. That permanent sequence then can be attached to all
kinds of information, including search data (descriptive terms, size,
colors, author, date, whatever). That sequence could be attached in a
local utility to a download function for png for quick local searches,
but it could also be used to define the svg download. Your update
function in the local utility could be used to update a text based file
or files which would update the search fields any time you wanted.
Quick downloads with easy updates to search functionality. Your png
graphics for local search would also be quick downloads in the small
chunks. The biggest update of course would be the first update where
the utility and the bulk of the png graphics would be downloaded. After
that, it would be just very small chunks of updates for search fields
and added png graphics. When a graphic or graphics is chosen for svg
download, the choices are loaded into a component of the utility which
knows to look for the svg component attached to the permanent key.
This allows complete flexibility with updating the search fields, easy
identification of graphics based on a permanent key, quick updates
locally after initial installation, offloading of search function and
storage to local system, increase of apparent speed in searches due to
local storage, redundancy should the www or the server be slowed down,
and enables the user to use the png graphics should the svg not be
immediately available.
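As a rough illustration of the permanent key idea, here is a Python
sketch of a local index keyed by that sequence. The update file format
(one "key, tab, field, tab, value" entry per line), the function names
and the example.org address are all assumptions made for the example,
not anything that exists on the server today.

```python
def apply_update(index, update_text):
    """Merge a text-based update into the local index.

    index maps a permanent key to a dict of search fields. Each line of
    update_text is 'key<TAB>field<TAB>value'; new keys are added and
    existing fields are overwritten.
    """
    for line in update_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, field, value = line.split("\t", 2)
        index.setdefault(key, {})[field] = value
    return index


def svg_url(base_url, key):
    """Derive the svg download address from the permanent key."""
    return f"{base_url.rstrip('/')}/{key}.svg"


# Hypothetical local index with one graphic already known.
index = {"20070904183618": {"author": "Alan"}}
update = "20070904183618\ttags\tsun,sky\n20070905010000\tauthor\tGreg\n"
apply_update(index, update)
url = svg_url("https://example.org/svg/", "20070904183618")
```

Because only the key is fixed, the search fields can be changed or
extended in later updates without touching the graphics themselves.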
I'm sure I've missed things, and I'm not a programmer (yet), but would
it be possible to set this up?
Alan
------------------
You're thinking in terms of "on-demand" PDFs. If those PDFs could be
generated every week on Sunday Morning at 1 a.m. when server load is low...
Another question is what package you were using to generate the PDFs.
Some are faster and/or lower-demand than others. It might be worth
exploring a few different methods and performance testing/tuning them to
determine which one brings in the best combo of speed and CPU load.
Perhaps the best bet is not to create the PDFs from SVG, but to use a
two stage process where all the SVGs are converted into JPEG using Batik
or ImageMagick or commandline Inkscape, then the PDFs are built using
the JPEG images. That might be faster because you can use the fastest
method for all the SVG to JPEG conversions, then the fastest method to
generate PDFs.
Or... offload portions of the processing to the user's machine.
Once you built some XML indices and bitmaps of the images, you could
build a collection browser frontend that offloaded a large chunk of the
processing to a client-side interface in Flash.
For example, if you wanted a tag browser, you create an XML file...
<item>
<itemtitle>Witch on Broom</itemtitle>
<author>Zeimusu</author>
<date>12-12-2006</date>
<svgurl>http://...</svgurl>
<bitmapurl>http://...</bitmapurl>
<thumburl>http://...</thumburl>
<tags>witch,broom,Halloween,clip_art</tags>
</item>
For the current collection of 3000 or so tagged images, the XML file
might be 1.5-2 megs. Use zlib compression (Flash 9 / ActionScript 3 has
zlib compression support, IIRC) to compress the XML file.
The frontend can unzip the file, then parse the XML into Tag and Author
arrays which can be used to generate thumbnail indices for all the tags
and authors, a higher-res (say 400x400) preview of individual images,
and offer the user an SVG download link. Generate the XML file and
bitmaps nightly during a slower period.
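The decompress-and-index step can be sketched as follows. This uses
Python rather than ActionScript, purely to show the logic, and it
assumes the <item> entries are wrapped in a single root element called
<items> (the root name is hypothetical; only the <item> layout comes
from the example above).

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indices(compressed_xml: bytes):
    """Decompress the catalogue and group item titles by tag and author."""
    root = ET.fromstring(zlib.decompress(compressed_xml))
    by_tag, by_author = defaultdict(list), defaultdict(list)
    for item in root.iter("item"):
        title = item.findtext("itemtitle", default="")
        by_author[item.findtext("author", default="")].append(title)
        # Tags are stored as one comma-separated string.
        for tag in item.findtext("tags", default="").split(","):
            if tag.strip():
                by_tag[tag.strip()].append(title)
    return dict(by_tag), dict(by_author)


# A one-item catalogue, compressed the way the nightly job might do it.
SAMPLE = (
    b"<items><item>"
    b"<itemtitle>Witch on Broom</itemtitle>"
    b"<author>Zeimusu</author>"
    b"<date>12-12-2006</date>"
    b"<tags>witch,broom,Halloween,clip_art</tags>"
    b"</item></items>"
)
payload = zlib.compress(SAMPLE)
by_tag, by_author = build_indices(payload)
```

A real frontend would keep the svgurl, bitmapurl and thumburl values
alongside each title so the thumbnail pages and download links can be
built from the same pass over the file.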
Make the XML file structure available and let people play with
developing client side browsers based on the XML. Heck, you might make
the browser a Flash component and let people have fun doing mash-ups
that incorporate the component.
OTOH, there's also the option of making sure Google has an accurate
sitemap of all the different pages you want to offer, then using an
embedded Google Search (using the Google AJAX search API or AdSense for
Search) to offload some of the search stress to Google.
- Greg
Alan wrote:
> "would be not too difficult"
>
> One of the really great things about open source software is that it
> truly benefits from ease of use and simplicity. Highly effective
> ideas made as effective/transparent/simple as possible is true
> genius. Open source software caters to this model because there is
> nothing to hide and nothing to protect.
>
> Be the best you can be, because open source encourages exactly this.
>
>
Alan, I already tried to build PDF catalogues from the last publicly
provided package, 0.18. Writing the code wasn't a big problem; the
problem was the really large number of clipart items already in OCAL.
Generating PDFs on the fly took too much time and CPU. And OCAL keeps
growing. So I think the only way to produce PDFs from OCAL is to do it
offline. For that, I
would need information about the structure of the database, how files
are stored and referenced and so on.
Best,
Tom
Jon Phillips wrote:
> > On Thu, 2007-05-10 at 11:02 -0600, Alan wrote:
> >
>
>> >> I've been thinking.
>> >>
>> >> Your server is going to get hammered every time people do searches on it
>> >> for particular graphics. I like the idea of setting up a thumbnail
>> >> database and a search function to find particular photos, or you could
>> >> organize by type as well.
>> >>
>> >> Would it be possible to create a utility which downloads to a user's
>> >> computer the thumbnail database and other search criteria? This would
>> >> allow searches to take place locally. When a user has decided what they
>> >> want, the utility will connect to the server and download what they
>> >> want. The local utility could be set up to link to a complete download
>> >> or a multiple set of chosen graphics or a single picture at a time.
>> >>
>> >> This will reduce the workload on the server and allow users to browse
>> >> through the thumbnail photo collection at their leisure on their own
>> >> computer.
>> >>
>>
> >
> > This is a great idea. Would you like to help realize this?
> >
> > Bryce Harrington (on the list) has been working on Inkscape + Open Clip
> > Art Library integration, and this is definitely something we have talked
> > about.
> >
> > Jon!
John Olsen wrote:
>
> I agree with Jon that online searching and use of the archive is
> probably more likely in the future. Pointing at resources on the web
> rather than downloading them all seems like the trend. Maybe it is
> the visualist in me, but for a library of any art I think a browser
> with thumbnails is the most important feature. It is currently quite
> tedious to drill down to each piece of art then back up again. Even
> a next and previous button might help here. But clearly being able
> to see the collection displayed as a gallery would be a huge
> improvement.
>
> And even though the classic method for organizing files is folder
> and sub-folder, it seems the tagging system is far more dynamic and
> lets everyone custom-tailor their results.
>
> BTW, I have the cleanup under 100 items now, but some of them are not
> fixable by me. I think they were made with the beta of Inkscape or
> something as I can find no way to clean them up without breaking them.
>
>