[systemd-devel] [PATCH 2/5] shared/json: JSON parser + number tokenizer bugfix

Pavel Odvody podvody at redhat.com
Mon May 18 01:45:38 PDT 2015


On Fri, 2015-05-15 at 17:11 +0200, Lennart Poettering wrote:
> On Fri, 15.05.15 17:05, Pavel Odvody (podvody at redhat.com) wrote:
> 
> > On Fri, 2015-05-15 at 16:12 +0200, Lennart Poettering wrote:
> > > On Fri, 15.05.15 16:03, Pavel Odvody (podvody at redhat.com) wrote:
> > > 
> > > > On Fri, 2015-05-15 at 15:23 +0200, Lennart Poettering wrote:
> > > > > On Thu, 07.05.15 17:47, Pavel Odvody (podvody at redhat.com) wrote:
> > > > > 
> > > > > Hmm, so if I grok this right, then this is a DOM-like ("object
> > > > > model") parser for json, where we previously had a SAX-like ("stream")
> > > > > parser only. What's the rationale for this? Why doesn't the stream
> > > > > parser suffice?
> > > > > 
> > > > > I intentionally opted for a stream parser when I wrote the code, and
> > > > > that's actually the primary reason why I rolled my own parser here,
> > > > > instead of using some existing library....
> > > > > 
> > > > 
> > > > Hmm, I'd call it a lexer/tokenizer, since the burden of syntactic
> > > > analysis is on the user. The parser is actually a rather thin wrapper
> > > > around json_tokenize.
> > > > 
> > > > Rationale: the v2 manifest (also) contains embedded JSON documents and
> > > > is itself versioned, so it will change sooner or later.
> > > > I believe that parsing the manifest, or any "decently" complex JSON
> > > > document, using the stream parser would yield an equal or bigger chunk
> > > > of code than a generic DOM parser plus the few lines that consume its API.
> > > 
> > > Can you give an example of these embedded JSON documents?
> > > 
> > > Couldn't this part be handled nicely by providing a call that skips
> > > nicely over json objects we don't want to process?
> > > 
> > > Lennart
> > > 
> > 
> > http://pastebin.com/rrkVxHzT
> > 
> > Yes, it could be handled, but I wouldn't call it nicely :)
> > Since there are a lot of nested objects/arrays, I guess you'd need
> > to do the syntactic analysis anyway. It'd be even worse if some
> > values shadowed the key names, or some parts of the document were
> > re-ordered.
> 
> Well, what I really don't like about object parsers is that they might
> take unbounded memory, which is much less a problem with stream
> parsers...
> 
> If we do object model parsing we really need to be careful with
> enforcing limits on everything...
> 
> Lennart
> 

Hmm, I could add a function that measures the size of the resulting object:

  int json_parse_check(const char *data, size_t *size);

It would accept a JSON string and, on success, output the size of the final
parsed object.
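A rough sketch of what I have in mind follows. Everything here is an
illustration, not actual systemd code: the NODE_SIZE constant stands in for
the real per-node allocation size, and the single-pass token counting assumes
the input is already well-formed JSON (a real version would reuse
json_tokenize and report malformed input properly):

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-node cost; a real implementation would use the
 * sizeof() of the parser's actual node structure. */
#define NODE_SIZE 48

/* Sketch of json_parse_check(): scan the input once, count the nodes
 * a DOM parse would allocate (containers, strings, numbers, literals),
 * and report the total size without building the tree. Object keys are
 * counted as string nodes too. */
int json_parse_check(const char *data, size_t *size) {
        size_t nodes = 0;
        const char *p = data;

        if (!data || !size)
                return -EINVAL;

        while (*p) {
                switch (*p) {
                case '{':
                case '[':                          /* container node */
                        nodes++;
                        p++;
                        break;
                case '"':                          /* string node */
                        nodes++;
                        for (p++; *p && *p != '"'; p++)
                                if (*p == '\\' && p[1])
                                        p++;       /* skip escaped char */
                        if (*p != '"')
                                return -EINVAL;    /* unterminated string */
                        p++;
                        break;
                case 't':
                case 'f':
                case 'n':                          /* true/false/null */
                        nodes++;
                        while (isalpha((unsigned char) *p))
                                p++;
                        break;
                default:
                        if (*p == '-' || isdigit((unsigned char) *p)) {
                                nodes++;           /* number node */
                                while (*p && strchr("-+.eE0123456789", *p))
                                        p++;
                        } else
                                p++;               /* whitespace, ',', ':', '}', ']' */
                        break;
                }
        }

        *size = nodes * NODE_SIZE;
        return 0;
}
```

The caller could then reject documents whose reported size exceeds some
configured limit before doing the actual allocation, which should address the
unbounded-memory concern.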

What do you think?

-- 
Pavel Odvody <podvody at redhat.com>
Software Engineer - EMEA ENG Developer Experience
5EC1 95C1 8E08 5BD9 9BBF 9241 3AFA 3A66 024F F68D
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno


