A Standard for Thumbnailers

Wed Jan 10 03:35:38 PST 2007

On Wed, 10 Jan 2007 10:40:18 +0000 Erlend Davidson
<E.R.M.Davidson at sms.ed.ac.uk> babbled:

> 
> 
> Carsten Haitzler (The Rasterman) wrote:
> > On Tue, 09 Jan 2007 16:07:40 +0000 Erlend Davidson
> > <E.R.M.Davidson at sms.ed.ac.uk> babbled:
> >
> >   
> >> Carsten Haitzler (The Rasterman) wrote:
> >>     
> >>> On Tue, 09 Jan 2007 12:08:46 +0100 Benedikt Meurer <benny at xfce.org>
> >>> babbled:
> >>>
> >>>   
> >>>       
> >>>> Stanislav Brabec wrote:
> >>>>     
> >>>>         
> >>>>> - Image files in public directories are thumbnailed many times:
> >>>>>   once for each accessing user.
> >>>>>       
> >>>>>           
> >>>> Sharing thumbnails might be a security problem.
> >>>>
> >>>>     
> >>>>         
> >>>>> - Thumbnails on removable media are thumbnailed many times: once for
> >>>>>   each machine, where medium is accessed
> >>>>>       
> >>>>>           
> >>>> IIRC the standard already covers this, tho it's not implemented in Xfce
> >>>> and GNOME.
> >>>>
> >>>>     
> >>>>         
> >>>>> - Thumbnail files are often larger than the images themselves, especially for
> >>>>>   jpeg images below 20kB in size.
> >>>>>       
> >>>>>           
> >>>> The software can detect this and load the JPEG directly.
> >>>>
> >>>>     
> >>>>         
> >>>>> - There is a very small chance to detect deleted images and delete
> >>>>>   corresponding thumbnail.
> >>>>>       
> >>>>>           
> >>>> The file manager should clean up the thumbnails when deleting files.
> >>>>
> >>>>     
> >>>>         
> >>>>> Use unique inode_number/volume_id instead of file_path.
> >>>>>       
> >>>>>           
> >>>> This will not work with certain FUSE file systems, that generate unique
> >>>> inodes on-demand, because it's likely that the inodes will be different
> >>>> each time the file system is mounted.
> >>>>
> >>>>     
> >>>>         
> >>>>> Have a widely used desktop-neutral thumbnailing library understanding many
> >>>>> embedded thumbnails and providing thumbnailing for images without
> >>>>> embedded thumbnail.
> >>>>>       
> >>>>>           
> >>>> I think external thumbnailers as used by Thunar/Nautilus are better,
> >>>> because it's more flexible and avoids loading various libraries required
> >>>> to generate thumbnails for different formats into the processes.
> >>>>
> >>>> Using .desktop files to register thumbnailers instead of GConf should be
> >>>> fine for desktop-neutral usage.
> >>>>
> >>>>     
> >>>>         
> >>>>> Allow jpeg thumbnails.
> >>>>>       
> >>>>>           
> >>>> Aside from maybe a few bytes saved per thumbnail, why would you want to
> >>>> do this?
> >>>>     
> >>>>         
> >>> it's not a few bytes. for 128x128 sized thumbs a thumbnail can go from a
> >>> 20k png to a 2k jpeg very very very easily. the jpeg thumbs invariably are
> >>> significantly smaller than the png ones (depending on the quality level
> >>> you are happy with). over the space of a few thousand thumbnails this
> >>> makes a BIG difference.
> >>>   
> >>>       
> >> I have 6731 thumbnails in ~/thumbnails/normal.  This directory has a 
> >> size of 110MB, so that's about 15KB a file.  My home directory has 25GB 
> >> of used space, with all pictures, latex files, postscript files, pdf 
> >> files (which I have a lot of) being thumbnailed, so that seems quite 
> >> small to me.
> >>     
> >
> > depends how you do it. i use jpeg for thumbnails (within a container format)
> > and 128x128 thumbs end up 4-8k each. thats 3193 thumbs, 17Mb.
> >
> > excluding any time used to load or scale src images i timed the overhead of
> > writing thumbs:
> > time to WRITE 3193 128x128 PNG thumbs: 115.55 sec
> > time to WRITE 3193 128x128 JPG thumbs: 7.59 sec
> >
> > size of thumbs as PNG: 69M
> > size of thumbs as JPG: 17M
> >   
> These are using your epeg program I assume?  This speed-up only
> applies to jpeg images then?

actually no - not using epeg. i have a different path for doing it that i use
but it uses the same principles as epeg in a much more generic gfx pipeline.
the numbers above were from an isolated test program i wrote just for the
purpose of this email :)
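
For reference, the figures quoted above (3193 thumbs at 128x128, 115.55 sec vs 7.59 sec to write, 69M vs 17M on disk) can be reduced to per-thumbnail numbers with a few lines of Python. This is only a sketch of the arithmetic: the function name `summarize` is made up for illustration, and "M" is assumed to mean 10**6 bytes, which the mail does not state.

```python
# Figures quoted in this thread for 3193 thumbnails at 128x128.
# Sizes are treated as megabytes of 10**6 bytes (an assumption; the
# mail only says "M").
COUNT = 3193
PNG_SECONDS, JPG_SECONDS = 115.55, 7.59
PNG_MB, JPG_MB = 69, 17


def summarize(count, png_s, jpg_s, png_mb, jpg_mb):
    """Return derived per-thumbnail and total figures as a dict."""
    return {
        "write_speedup": png_s / jpg_s,          # how many times faster JPG writes are
        "seconds_saved": png_s - jpg_s,          # total write time saved
        "png_kb_each": png_mb * 1000 / count,    # average PNG thumb size in KB
        "jpg_kb_each": jpg_mb * 1000 / count,    # average JPG thumb size in KB
        "mb_saved": png_mb - jpg_mb,             # total disk space saved
    }


if __name__ == "__main__":
    figures = summarize(COUNT, PNG_SECONDS, JPG_SECONDS, PNG_MB, JPG_MB)
    for name, value in figures.items():
        print(f"{name}: {value:.2f}")
```

The per-thumbnail JPG size this yields falls inside the "4-8k each" range mentioned earlier in the thread.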

> > so... by doing nothing more complex than being more intelligent about my
> > thumbnail format choice - no complex code, no "amazing tricks", nothing
> > beyond what is already done with PNG, but instead doing it with JPG, I
> >
> > 1. spend 105 seconds or so less time in writing out thumbnails that i
> > shouldn't have spent (i.e. WASTE of CPU for no real gain).
> > 2. I have WASTED 52M of disk storage pointlessly.
> >
> > just by being adamant on using PNG as a format. if this were a discussion of
> > "but that extra cpu used and disk used buys you features X, Y and Z" then we
> > have something to discuss. we are making a tradeoff. but we are not making a
> > tradeoff. we are simply wasting resources. these add up over 1000's of
> > users. thats extra disk IO time needed to load thumbs on retrieval, extra
> > IO time to save, extra processing time to save (and load incidentally). ALL
> > we need to do is SAVE as a JPEG file (and you can add alpha to JPEG if you
> > so please, so you can't argue that only PNG can do this). You don't need to
> > add any amazing code - you already have wrappers for saving as JPEG right
> > there - waiting to be
> > used. We can argue there is quality loss (which there is) but frankly - it's
> > imperceptible when you are scanning a list of thumbnails for your favorite
> > image (that's the point of JPEG).
> >
> > efficiency MATTERS. The above timings were on a very high-end box (core 2
> > duo laptop, 5600 with 2G of RAM). these numbers blow out significantly when
> > you are talking about people in not so wealthy countries who are stuck on 500mhz
> > p3's because that is all they can afford (or less - e.g. the OLPC). why
> > waste their valuable disk space and cpu cycles? it's an incredibly simple
> > thing to do (supporting JPEG thumbnails) and has significantly MEASURABLE
> > benefits (with the only detriments being quality encoding loss - see above).
> >
> >   
> >> Thumbnails are quite a high-end feature anyway - even on a fast computer 
> >> they take a little time to produce.
> >>     
> >
> > i disagree. if you do them right they can be generated VERY quickly even on
> > a low-end PC. I devoted a little bit of time and effort a few years back
> > making that trivial with EPEG - it's a wrapper around libjpeg intended for
> > 1 thing only. rapid jpeg thumbnailing.
> I was about to post about EPEG here... you wrote it, so I guess you 
> already know!
> 
> Yes I have noticed it is extremely quick to generate a thumbnail... it 
> takes about 0.1 seconds, compared with 0.5s with convert, and 0.6 using 
> convert to go from jpg to png. The visual quality of the thumbnails it 
> generates is low, even at the 100% thumbnail size.
> 
> I tried using epeg to scale the image, and then convert to turn it into 
> a png... it worked very well:
> 
> $ time epeg -m 128,128 input.jpg output.jpg
> real    0m0.126s
> user    0m0.116s
> sys     0m0.004s
> 
> $ time convert output.jpg -antialias output.png
> real    0m0.030s
> user    0m0.010s
> sys     0m0.006s
> 
> (I turned off antialiasing in the above because you cannot see the 
> difference when you resize photographs that much, and it makes a 30% 
> reduction in generation time).

converting a jpeg to a png will not make any difference to the image - it will
look identical to the source jpeg. what you say above doesn't make sense? also
note convert is not a great tool for benchmarks - it's one of the slowest image
processing engines around. :)

> >  it makes use of the fact that JPEG's are DCT's
> > and you get "free" downscaling on decode by factors of 2, 4 and 8 times in
> > each dimension simply by decoding the DCT at a different output res (and
> > throwing out higher frequency DCT elements). it means for an incredibly
> > faster decode and thus thumbnail generation is sped up massively. you speed
> > up the other half of this work by using JPEG output for thumbnails. it
> > makes a difference. a perceptible and measurable one. people notice.
> Could a similar optimisation be applied to PNG to generate those 
> thumbnails very quickly?

to generate thumbnails OF png files? if so - yes, IF the png happens to be
encoded with "interlacing". if not - no. this is only on DECODE. on ENCODE this
optimisation isn't used. we need to separate reading and writing into different
things. reading a jpeg and have it already scaled down on read is possible -
and libjpeg can do it - epeg just makes it dead-easy to use, that's all. it's
simply a wrapper front-end. for writing - that's another matter. in my speed
tests i used evas itself (as it can do the "epeg trick" using libjpeg
natively), and can also save. libjpeg is simply faster at writing files than
libpng - by a large margin. by their nature jpeg will be faster as it literally
has less to pass into the huffman compressor compared to png (as it strips out
information during encode to dct and throwing out of frequencies).
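
As a rough illustration of the decode-side trick described above, the sketch below picks which of the "free" reductions (1/1, 1/2, 1/4, 1/8) to ask the JPEG decoder for, so that the decoded image is as small as possible while still being at least as big as the final thumbnail. It does not decode anything itself. The helper names `fit_within` and `pick_decode_scale` are hypothetical, and rounding decoded sizes up is an assumption about how the decoder computes its output dimensions. In libjpeg the chosen factor would be requested through the `scale_num` and `scale_denom` fields of the decompression struct before starting the decompress.

```python
import math

# Denominators the thread mentions ("factors of 2, 4 and 8"), plus 1
# for "no reduction".
DENOMINATORS = (1, 2, 4, 8)


def fit_within(src_w, src_h, box_w, box_h):
    """Size of the thumbnail that fits the box, keeping aspect ratio.

    Images already smaller than the box are not enlarged.
    """
    scale = min(box_w / src_w, box_h / src_h, 1.0)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def pick_decode_scale(src_w, src_h, box_w, box_h):
    """Return (denom, decoded_w, decoded_h) for the largest 1/denom
    reduction whose decoded size still covers the final thumbnail."""
    thumb_w, thumb_h = fit_within(src_w, src_h, box_w, box_h)
    best = (1, src_w, src_h)
    for denom in DENOMINATORS:
        w = math.ceil(src_w / denom)  # assumed: decoder rounds sizes up
        h = math.ceil(src_h / denom)
        if w >= thumb_w and h >= thumb_h:
            best = (denom, w, h)      # later entries are larger reductions
    return best


if __name__ == "__main__":
    # A 3000x2000 photo thumbnailed into a 128x128 box.
    print(pick_decode_scale(3000, 2000, 128, 128))
```

Whatever the decoder hands back at the reduced size still needs one ordinary (and now much cheaper) scaling pass down to the exact thumbnail dimensions.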

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com
裸好多
Tokyo, Japan (東京 日本)