A Standard for Thumbnailers

Erlend Davidson E.R.M.Davidson at sms.ed.ac.uk
Thu Jan 11 15:15:23 PST 2007



Carsten Haitzler (The Rasterman) wrote:
> On Wed, 10 Jan 2007 10:40:18 +0000 Erlend Davidson
> <E.R.M.Davidson at sms.ed.ac.uk> babbled:
>
>   
>> Carsten Haitzler (The Rasterman) wrote:
>>     
>>> On Tue, 09 Jan 2007 16:07:40 +0000 Erlend Davidson
>>> <E.R.M.Davidson at sms.ed.ac.uk> babbled:
>>>
>>>   
>>>       
>>>> Carsten Haitzler (The Rasterman) wrote:
>>>>     
>>>>         
>>>>> On Tue, 09 Jan 2007 12:08:46 +0100 Benedikt Meurer <benny at xfce.org>
>>>>> babbled:
>>>>>
>>>>>   
>>>>>       
>>>>>           
>>>>>> Stanislav Brabec wrote:
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> - Image files in public directories are thumbnailed many times:
>>>>>>>   once for each accessing user.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> Sharing thumbnails might be a security problem.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> - Images on removable media are thumbnailed many times: once for
>>>>>>>   each machine where the medium is accessed.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> IIRC the standard already covers this, tho it's not implemented in Xfce
>>>>>> and GNOME.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> - Thumbnail files are often larger than the images themselves,
>>>>>>>   especially for jpeg images below 20kB in size.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> The software can detect this and load the JPEG directly.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> - There is a very small chance to detect deleted images and delete
>>>>>>>   corresponding thumbnail.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> The file manager should clean up the thumbnails when deleting files.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> Use unique inode_number/volume_id instead of file_path.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> This will not work with certain FUSE file systems, that generate unique
>>>>>> inodes on-demand, because it's likely that the inodes will be different
>>>>>> each time the file system is mounted.
>>>>>>
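A minimal sketch of the two keying schemes being compared here, assuming (as I understand the current standard) that the thumbnail filename is the MD5 of the file's URI plus ".png". The function names are made up for illustration, and st_dev is only a stand-in for a real volume id:

```python
import hashlib
import os


def path_key(uri):
    """Path-based key: MD5 of the URI, plus the '.png' suffix."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest() + ".png"


def inode_key(path):
    """Inode-based key (hypothetical): device number and inode number.

    st_dev is an assumption standing in for 'volume_id'; it can change
    between mounts, which is the FUSE concern raised above.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)
```

The path-based key changes whenever a file is renamed or moved, while the inode-based key survives a rename on the same file system but is only as stable as the inode numbers the file system reports.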
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> Have a widely used desktop-neutral thumbnailing library understanding many
>>>>>>> embedded thumbnails and providing thumbnailing for images without
>>>>>>> embedded thumbnail.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> I think external thumbnailers as used by Thunar/Nautilus are better,
>>>>>> because they are more flexible and avoid loading the various libraries
>>>>>> required to generate thumbnails for different formats into the processes.
>>>>>>
>>>>>> Using .desktop files to register thumbnailers instead of GConf should be
>>>>>> fine for desktop-neutral usage.
>>>>>>
>>>>>>     
>>>>>>         
>>>>>>             
>>>>>>> Allow jpeg thumbnails.
>>>>>>>       
>>>>>>>           
>>>>>>>               
>>>>>> Aside from maybe a few bytes saved per thumbnail, why would you want to
>>>>>> do this?
>>>>>>     
>>>>>>         
>>>>>>             
>>>>> it's not a few bytes. for 128x128 sized thumbs a thumbnail can go from a
>>>>> 20k png to a 2k jpeg very very very easily. the jpeg thumbs invariably are
>>>>> significantly smaller than the png ones (depending on the quality level
>>>>> you are happy with). over the space of a few thousand thumbnails this
>>>>> makes a BIG difference.
>>>>>   
>>>>>       
>>>>>           
>>>> I have 6731 thumbnails in ~/.thumbnails/normal.  This directory has a
>>>> size of 110MB, so that's about 16KB a file.  My home directory has 25GB
>>>> of used space, with all pictures, latex files, postscript files, pdf 
>>>> files (which I have a lot of) being thumbnailed, so that seems quite 
>>>> small to me.
>>>>     
>>>>         
>>> depends how you do it. i use jpeg for thumbnails (within a container format)
>>> and 128x128 thumbs end up 4-8k each. thats 3193 thumbs, 17Mb.
>>>
>>> excluding any time used to load or scale src images i timed the overhead of
>>> writing thumbs:
>>> time to WRITE 3193 128x128 PNG thumbs: 115.55 sec
>>> time to WRITE 3193 128x128 JPG thumbs: 7.59 sec
>>>
>>> size of thumbs as PNG: 69M
>>> size of thumbs as JPG: 17M
>>>   
>>>       
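For reference, the arithmetic on the figures quoted above can be checked with a short sketch. The constants are copied from the quoted timings (taking the JPG total as 17M); the function name is made up for illustration:

```python
# Figures copied from the timings quoted above.
N_THUMBS = 3193
PNG_WRITE_S, JPG_WRITE_S = 115.55, 7.59
PNG_SIZE_MB, JPG_SIZE_MB = 69, 17


def summarize():
    """Derive savings and per-thumbnail write costs from the quoted totals."""
    return {
        "seconds_saved": round(PNG_WRITE_S - JPG_WRITE_S, 2),
        "write_speedup": round(PNG_WRITE_S / JPG_WRITE_S, 1),
        "mb_saved": PNG_SIZE_MB - JPG_SIZE_MB,
        "png_ms_per_thumb": round(1000 * PNG_WRITE_S / N_THUMBS, 1),
        "jpg_ms_per_thumb": round(1000 * JPG_WRITE_S / N_THUMBS, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(name, value)
```

On these numbers the JPG write path is roughly 15 times faster, saves about 108 seconds in total, and uses 52M less disk.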
>> These are using your epeg program I assume?  This speed-up only
>> applies to jpeg images then?
>>     
>
> actually no - not using epeg. i have a different path for doing it that i use
> but it uses the same principles as epeg in a much more generic gfx pipeline.
> the numbers above were from an isolated test program i wrote just for the
> purpose of this email :)
>
>   
>>> so... by doing nothing more complex than being more intelligent about my
>>> thumbnail format choice - no complex code, no "amazing tricks", nothing
>>> beyond what is already done with PNG, but instead doing it with JPG, I
>>>
>>> 1. spend about 108 seconds less writing out thumbnails, time that i
>>> shouldn't have spent (i.e. a WASTE of CPU for no real gain).
>>> 2. avoid WASTING 52M of disk storage pointlessly.
>>>
>>> just by being adamant on using PNG as a format. if this were a discussion of
>>> "but that extra cpu used and disk used buys you features X, Y and Z" then we
>>> have something to discuss. we are making a tradeoff. but we are not making a
>>> tradeoff. we are simply wasting resources. these add up over 1000's of
>>> users. thats extra disk IO time needed to load thumbs on retrieval, extra
>>> IO time to save, extra processing time to save (and load incidentally). ALL
>>> we need to do is SAVE as a JPEG file (and you can add alpha to JPEG if you
>>> so please so you can't argue that only PNG can do this). You don't need to
>>> add any amazing code - you already have wrappers for saving as JPEG right
>>> there - waiting to be used. We can argue there is quality loss (which
>>> there is) but frankly - it's
>>> imperceptible when you are scanning a list of thumbnails for your favorite
>>> image (that's the point of JPEG).
>>>
>>> efficiency MATTERS. The above timings were on a very high-end box (core 2
>>> duo laptop, 5600 with 2G of RAM). these numbers blow out significantly when
>>> you are talking about people in not so wealthy countries who are stuck on 500mhz
>>> p3's because that is all they can afford (or less - e.g. the OLPC). why
>>> waste their valuable disk space and cpu cycles? it's an incredibly simple
>>> thing to do (supporting JPEG thumbnails) and has significantly MEASURABLE
>>> benefits (with the only detriments being quality encoding loss - see above).
>>>
>>>   
>>>       
>>>> Thumbnails are quite a high-end feature anyway - even on a fast computer 
>>>> they take a little time to produce.
>>>>     
>>>>         
>>> i disagree. if you do them right they can be generated VERY quickly even on
>>> a low-end PC. I devoted a little bit of time and effort a few years back
>>> making that trivial with EPEG - it's a wrapper around libjpeg intended for
>>> 1 thing only. rapid jpeg thumbnailing.
>>>       
>> I was about to post about EPEG here... you wrote it, so I guess you 
>> already know!
>>
>> Yes I have noticed it is extremely quick to generate a thumbnail... it 
>> takes about 0.1 seconds, compared with 0.5s with convert, and 0.6 using 
>> convert to go from jpg to png. The visual quality of the thumbnails it 
>> generates is low, even at the 100% thumbnail size.
>>
>> I tried using epeg to scale the image, and then convert to turn it into 
>> a png... it worked very well:
>>
>> $ time epeg -m 128,128 input.jpg output.jpg
>> real    0m0.126s
>> user    0m0.116s
>> sys     0m0.004s
>>
>> $ time convert output.jpg -antialias output.png
>> real    0m0.030s
>> user    0m0.010s
>> sys     0m0.006s
>>
>> (I turned off antialiasing in the above because you cannot see the 
>> difference when you resize photographs that much, and it makes a 30% 
>> reduction in generation time).
>>     
>
> converting a jpeg to a png will not make any difference to the image - it will
> look identical to the source jpeg. what you say above doesn't make sense?
The current standard means that thumbnails should be in the png format.  
This is why I converted to png.
>  also
> note convert is not a great tool for benchmarks - its one of the slowest image
> processing engines around. :)
>   
What other tool should I use?
>   
>>>  it makes use of the fact that JPEGs are DCT-encoded
>>> and you get "free" downscaling on decode by factors of 2, 4 and 8 times in
>>> each dimension simply by decoding the DCT at a different output res (and
>>> throwing out higher frequency DCT elements). it makes for an incredibly
>>> faster decode and thus thumbnail generation is sped up massively. you speed
>>> up the other half of this work by using JPEG output for thumbnails. it
>>> makes a difference. a perceptible and measurable one. people notice.
>>>       
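A minimal sketch of how a thumbnailer might choose the decode-time scale factor described above. As far as I know libjpeg exposes this through the scale_num/scale_denom fields set before decompression starts; the helper names, the 128-pixel default and the "longest side must stay at least as large as the target" rule are assumptions for illustration:

```python
import math


def pick_dct_scale(src_w, src_h, target=128):
    """Return the largest denominator (8, 4, 2 or 1) that still leaves the
    longest side of the decoded image at least `target` pixels, so the
    final resize only ever scales down."""
    for denom in (8, 4, 2):
        longest = max(math.ceil(src_w / denom), math.ceil(src_h / denom))
        if longest >= target:
            return denom
    return 1


def scaled_size(src_w, src_h, denom):
    """Decoded dimensions at scale 1/denom, rounding up."""
    return math.ceil(src_w / denom), math.ceil(src_h / denom)
```

For a 3000x2000 photo this picks 1/8, so the decoder produces a 375x250 image and the remaining resize to 128 pixels touches 64 times fewer pixels than a full-size decode would.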
>> Could a similar optimisation be applied to PNG to generate those 
>> thumbnails very quickly?
>>     
>
> to generate thumbnails OF png files? if so - yes, IF the png happens to be
> encoded with "interlacing". if not - no. this is only on DECODE. on ENCODE this
> optimisation isn't used. we need to separate reading and writing into different
> things. reading a jpeg and have it already scaled down on read is possible -
> and libjpeg can do it - epeg just makes it dead-easy to use, that's all. it's
> simply a wrapper front-end. for writing - that's another matter. in my speed
> tests i used evas itself (as it can do the "epeg trick" using libjpeg
> natively), and can also save. libjpeg is simply faster at writing files than
> libpng - by a large margin. by its nature jpeg will be faster as it literally
> has less to pass into the huffman compressor compared to png (as it strips out
> information during the encode to dct by throwing out frequencies).
>
>   
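To illustrate the interlacing case mentioned above: PNG interlacing uses the Adam7 scheme, where the first pass holds every 8th pixel in each dimension, so a decoder that stops after pass 1 already has a 1/8-scale image. The sketch below only computes the size of each pass; the function name is made up, and it says nothing about how libpng exposes partial decoding:

```python
# Adam7 passes as (x_start, y_start, x_step, y_step).
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def pass_dims(width, height):
    """Return the (width, height) in pixels of each of the 7 Adam7 passes."""
    dims = []
    for x0, y0, dx, dy in ADAM7:
        w = (width - x0 + dx - 1) // dx if width > x0 else 0
        h = (height - y0 + dy - 1) // dy if height > y0 else 0
        dims.append((w, h))
    return dims
```

Note that pass 1 is a plain point sample of the image rather than a filtered downscale, so the result will look rougher than a properly scaled thumbnail.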


