[Swfdec] compiling vs interpreting

Fri Jan 12 15:36:24 PST 2007

So I thought I'd write this down, in case anyone has useful input to it or
wants to read up on the decisions later.

Currently swfdec compiles actions from a Flash file into JSScript objects
and executes them using the standard SpiderMonkey API. While this allows
all the niceties that are available with SpiderMonkey (most
important: a whole debugging framework with stack inspection etc - see
player/swfdebug to get an idea about this), I don't like this approach.

I'm probably going to rewrite this into an interpreter that duplicates a
lot of the functionality of jsinterp.c and interprets the original
ActionScript from the file. So in short: I'm going to end up where Eric
already was. Here's a list of the reasons for this decision:

- security
SpiderMonkey checks the security of its scripts at compile time. This is
nice for SpiderMonkey, since it can use opcodes tailored directly to its
compiler and do all necessary checks while it compiles. However,
ActionScript bytecodes are stack-based, and I'd need to do runtime checking
to make sure a script never pops more values than it has pushed before.
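
A minimal sketch of what such a runtime check could look like. The names
Stack, stack_push, stack_pop and STACK_MAX are made up for illustration and
are not swfdec or SpiderMonkey API; how an interpreter should react to an
underflow (abort the script, hand back an undefined value, ...) is left open
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_MAX 16  /* hypothetical limit, chosen only for this sketch */

typedef struct {
  int values[STACK_MAX];
  size_t depth;         /* number of values currently on the stack */
} Stack;

/* Push a value; fails instead of writing past the end of the array. */
static bool
stack_push (Stack *s, int value)
{
  assert (s != NULL);
  if (s->depth == STACK_MAX)
    return false;
  s->values[s->depth++] = value;
  return true;
}

/* Pop a value; fails if the script tries to pop more than it pushed. */
static bool
stack_pop (Stack *s, int *value)
{
  assert (s != NULL && value != NULL);
  if (s->depth == 0)
    return false;
  *value = s->values[--s->depth];
  return true;
}
```

Every action that touches the stack would go through these two functions,
so a broken or malicious script gets a failed pop instead of reading memory
below the stack.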

- incompatible bytecodes
The SpiderMonkey bytecode for a function call (JSOP_CALL) includes the
number of arguments to call the function with. However, the ActionScript
code pops that number off the stack. Besides being security-sensitive (see
above), there is also no SpiderMonkey bytecode for that.
Another such example is the GetVariable action, which can request a whole
old-style path such as "/root/movie/video:width". Good luck trying that
with JSOP_GETELEM. I'd need to hack this functionality into all these
opcodes or add new ones, but that doesn't make it simpler.
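
To illustrate the function call case, here is a rough sketch of a call
action that takes its argument count from the stack instead of from the
opcode. Stack, action_call and sum_args are hypothetical names, the values
are plain ints to keep it short, and sum_args just stands in for whatever
function the script wants to call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
  int values[32];
  size_t depth;         /* number of values currently on the stack */
} Stack;

/* Hypothetical stand-in for a script function: sums its arguments. */
static int
sum_args (const int *args, size_t n_args)
{
  int total = 0;
  for (size_t i = 0; i < n_args; i++)
    total += args[i];
  return total;
}

/* Execute a call whose argument count lives on the stack.
 * Stack layout, top first: n_args, then the arguments themselves.
 * Returns false if the script claims more arguments than the stack holds. */
static bool
action_call (Stack *s)
{
  assert (s != NULL);
  if (s->depth == 0)
    return false;
  int count = s->values[--s->depth];
  if (count < 0 || (size_t) count > s->depth)
    return false;
  size_t n_args = (size_t) count;
  s->depth -= n_args;                       /* pop the arguments */
  int result = sum_args (&s->values[s->depth], n_args);
  s->values[s->depth++] = result;           /* push the return value */
  return true;
}
```

The count is only known once the action runs, which is why it has to be
checked against the current stack depth at that point.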

- one layer less
Since I don't convert to SpiderMonkey bytecodes (which are probably slower
because of the various workarounds), there's one area less that can fuck
up, namely jsinterp.c - which is hell to debug, because 90% of the code in
that file is inside macros, and you tell me why
ELEMENT_OP (2, -1, CACHED_GET (-1)) or whatever crashes. So I hope to get
the crashes into my code - there I can work around them more easily without
affecting other working code.
Oh, and it's probably faster, too.

- version differences
Flash executes the same bytecode differently depending on the version. So I
have to add lots of workarounds that generate stupid bytecodes (like
converting NULL to 0 for some versions), which I could solve far more easily
in my own interpreter.
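
In an interpreter of my own, such a difference could become a simple check
at the point where the value is converted. A sketch, where Interpreter,
Value, value_to_number and the version threshold are all placeholders made
up for illustration and not taken from Flash's actual rules:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NULL_IS_ZERO_BEFORE 7  /* placeholder threshold, not a Flash fact */

typedef struct {
  unsigned int version;  /* version of the Flash file being executed */
} Interpreter;

typedef struct {
  bool is_null;
  double number;         /* only meaningful if is_null is false */
} Value;

/* Convert a value to a number the way the running version wants it.
 * Returns false if the value has no numeric conversion in this version. */
static bool
value_to_number (const Interpreter *interp, Value v, double *result)
{
  assert (interp != NULL && result != NULL);
  if (!v.is_null) {
    *result = v.number;
    return true;
  }
  if (interp->version < NULL_IS_ZERO_BEFORE) {
    *result = 0.0;       /* old files: NULL silently becomes 0 */
    return true;
  }
  return false;          /* newer files: leave it to the caller */
}
```

The point is that the version check sits in one conversion function instead
of being spread over generated bytecode.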

- better housekeeping
Since a JSScript is bound to a JSContext, I have to carry the JSContext
everywhere to all my parser objects (aka SwfdecCharacter) that can contain
scripts. If I just had a SwfdecBuffer there, it'd be far easier to
maintain.

- debugging
Since you want to step through ActionScript bytecode chunks when reverse
engineering, you need a way to do this. It works quite ok (see
swfdec_debugger.c), but it has its issues. And it's hard to catch internal
bugs this way. You end up in the hell that is jsinterp.c again. :)

I think I'm versed well enough in SpiderMonkey now to make JSFunction cope
with 3 types of functions: JSScript, native and SwfdecBuffer. :)
I'm probably gonna create a new branch (named interpreter) and try it out
there.

Benjamin