[poppler] pdftohtml lets you run random shell commands

suzuki toshiya mpsuzuki at hiroshima-u.ac.jp
Thu Apr 19 04:08:49 PDT 2012


Ihar `Philips` Filipau wrote:
> On 4/19/12, Albert Astals Cid <aacid at kde.org> wrote:
>> --- On Thu, 19/4/12, Ihar `Philips` Filipau <thephilips at gmail.com> wrote:
>>
>> And now realize that pdftohtml can be called from a webservice.
>>
> 
> Get real, man.
> 
> In that case, a user or a random person off the street will NEVER ever
> have a chance to supply a random string to a command to be run on
> the server.

When I was writing my last post, I asked myself whether a use case like a
web service should be mentioned. I could not come up with a good example in
which pdftohtml, invoked by a network client, needs to receive Unix commands
over the network.

Or is the possibility we are discussing that, even though it is NOT required,
a careless administrator will often build a web service that takes the command
string from its clients, and that we should guard against this?

Anyway, in my personal opinion, poppler is expected to be self-contained
PDF rendering software, so a situation where pdftohtml has to invoke
yet another rasterizer, Ghostscript, is not good.

> This is the same as SQL injection and should be handled by the
> webservice the same way - by NEVER EVER exposing anything to raw
> unfiltered user input.
> 
>> Now let's be serious, the world is full of people that don't have a clue,
>> and those people usually copy and paste from the interwebs. Now imagine that
>> I run an obscure pdftohtml command line I found in a forum that says
>> it'll work better because it does magic, and it ends up removing all the
>> files in my home folder. I'd call that unexpected behaviour.
>>
> 
> There are lots of ways - and on forums text coloring is the most
> popular among them - to sneak a stealthy command into something
> innocent-looking. That's why on all *nix forums there is a
> merciless ban hammer against such jokers. (I'm an old-time Perl coder
> and there was a period of this on Perl forums too.)
> 
> And btw, the same way, one can simply append an invisible "; rm -rf *" to
> the end of a pdftohtml invocation. And there is nothing you can do about
> it.
> 
> Overall, I think you are overreacting. I'm perfectly aware of what I'm
> talking about, as I actually develop and maintain software running as
> a back-end for a B2B webservice of sorts. (And I did develop
> webservices in the past too.) And I do have two suid-root tools under my
> responsibility, so these problems are rather closer to me than you
> think.
> 
> But then again, as I said from the beginning, I do not mind the change
> (esp. if it would reuse some list or even a fixed-size array instead
> of va_list), since it cleans up the main() of pdftohtml. If you wish, I
> can experiment with escaping the string too (I have some spare time
> right now). Theoretically, for system() it should suffice to escape
> every single quote and wrap the string in single quotes.
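
For reference, the quoting described above could look roughly like the
sketch below. It assumes a POSIX sh behind system(); shellQuote() and
buildCommand() are hypothetical names, not existing poppler functions.
Note that a backslash has no special meaning inside single quotes in sh,
so each embedded single quote has to be written as '\'' (close the
quoted string, emit an escaped quote, reopen the quoted string).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of poppler): quote one argument so that a
// POSIX sh, as started by system(), treats it as a single literal word.
std::string shellQuote(const std::string &arg)
{
    std::string out = "'";
    for (char c : arg) {
        if (c == '\'') {
            // Close the quoted string, emit an escaped quote, reopen it.
            out += "'\\''";
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}

// Hypothetical helper: join already-separated arguments into one command
// string for system(), quoting each of them.
std::string buildCommand(const std::vector<std::string> &args)
{
    std::string cmd;
    bool first = true;
    for (const std::string &a : args) {
        if (!first) {
            cmd += ' ';
        }
        cmd += shellQuote(a);
        first = false;
    }
    return cmd;
}
```

With this, a file name such as "x; rm -rf *" reaches the shell as one
quoted word instead of a second command. It only covers the shell's own
parsing; what the invoked program does with the argument is a separate
question.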
