[Spice-devel] RFC: Lightweight tracing mechanism

Christophe de Dinechin dinechin at redhat.com
Mon Jun 12 07:47:00 UTC 2017


Top remark: here is what I expected to see discussed when I shared the code.

1. What categories do we want?

2. How do we spread the categorization work?

3. How much do today’s developers depend on existing output?

4. What categories do we request e.g. in bug reports? (IOW, what should --spice-debug select?)

5. What tracing options and tweaks are potentially dangerous for end-users (e.g. what could expose security information)?

6. Consequently, what tracing options do we hide in release builds, with current POVs ranging from “as many as possible” (me) to “none of them” (you)?

7. What are the existing hard-coded parameters, explicit (e.g. the 400ms delay in the server) or implicit (e.g. unbounded memory usage)?

8. Among these, which ones would we like to tweak?

9. Among those we want to tweak, which ones would deserve dedicated options or env variables, and which ones would be perfectly OK with a more generic option like the -t I proposed?


It did not go as planned, but if you have time, I think these are interesting questions.



> On 9 Jun 2017, at 18:06, Marc-André Lureau <marcandre.lureau at redhat.com> wrote:
> 
> 
> 
> ----- Original Message -----
> 
>> [drop_stats]id=0 #in-frames=99 #late=99 (100.00%, avg 2995.89 ms)
>> #drop-on-playback=0 (0.00%)
>> [drop_stats]id=0 #in-frames=198 #late=198 (100.00%, avg 2160.48 ms)
>> #drop-on-playback=3 (1.52%)
>> [drop_stats]id=0 #in-frames=297 #late=297 (100.00%, avg 1881.11 ms)
>> #drop-on-playback=12 (4.04%)
>> 
>> But then:
>> 
>> Why 5? How much I want to put a tweak there and see what happens if I change
>> that value.
>> 
>> 	#define STREAM_PLAYBACK_SYNC_DROP_SEQ_LEN_LIMIT 5
>> 
>> Why 100? What happens if you change it?
>> 
>>        op->multi_media_time = mmtime + 100; /* workaround... */
>> 
> 
> I don't get your tweaks, they look specific to the problem at hand, I don't think such esoteric code should be checked in code, it will just decay.

Are you telling me that you like these magic, undocumented values that you can’t change without recompiling? If that is really what you are saying, could you please explain the rationale to me? If not, what are you really saying?

Let me go on record stating that to me, most magic values like this in the code reek of “quick and dirty”: it was cheap enough at the time, we believe the exact value does not matter, but we never quite tested that belief.
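
To make that concrete, here is the kind of change I have in mind for such constants. This is only a sketch: spice_tweak_int is an illustrative name, not something that exists in the tree, and reading an environment variable is just the simplest possible backend for it.

    #include <stdlib.h>

    /* Sketch only: read an integer tweak from the environment, falling
       back to the hard-coded default.  spice_tweak_int is an illustrative
       name, not an existing spice function. */
    static int spice_tweak_int(const char *name, int fallback)
    {
        const char *value = getenv(name);
        return value ? atoi(value) : fallback;
    }

    /* The magic constant becomes a default you can override at run time,
       e.g. SPICE_DROP_SEQ_LEN_LIMIT=8 spicy ... */
    #define STREAM_PLAYBACK_SYNC_DROP_SEQ_LEN_LIMIT \
        spice_tweak_int("SPICE_DROP_SEQ_LEN_LIMIT", 5)

The point is not this particular implementation, but that the value becomes something you can experiment with without recompiling.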

> 
>> 
>> Here is a more serious follow up with another tweaks I just pushed,
>> latency_offset:
>> 
>> ddd at ptitpuce[traces] spice-gtk> spicy -t latency_offset=1900:drop_stats=25 -h
>> turbo -p 5901
>> [drop_stats]id=0 #in-frames=25 #late=0 (0.00%, avg 0.00 ms)
>> #drop-on-playback=0 (0.00%)
>> […]
>> [drop_stats]id=0 #in-frames=225 #late=0 (0.00%, avg 0.00 ms)
>> #drop-on-playback=14 (6.22%)
>> [drop_stats]id=0 #in-frames=250 #late=17 (6.80%, avg 2027.82 ms)
>> #drop-on-playback=15 (6.00%)
>> [drop_stats]id=0 #in-frames=275 #late=36 (13.09%, avg 2353.86 ms)
>> #drop-on-playback=26 (9.45%)
>> [drop_stats]id=0 #in-frames=300 #late=58 (19.33%, avg 2445.26 ms)
>> #drop-on-playback=50 (16.67%)
>> [drop_stats]id=0 #in-frames=325 #late=58 (17.85%, avg 2445.26 ms)
>> #drop-on-playback=50 (15.38%)
>> 
> 
> same idea, this is playing with the code, not code that should remain.

First, “playing with the code”, aka “interactivity”, is indeed what this is about. That was the point of my “punchcard” analogy. By being interactive, terminals made many things possible that don’t even seem to make sense with a punchcard reader and a line printer. And the mechanism I’m talking about here, as simple as it is, makes spice testing more interactive.

Second, whether that or any other instrumentation remains or not would depend on a number of things. If the instrumentation showed that something interesting happens with “latency_offset=1600”, then I believe I would share the instrumentation to let others replicate my experiment. If variations of that parameter allowed me to observe a 2x effect in some performance aspect, then I would certainly leave the tweak in and suggest we collectively work on optimizing its value.



> 
>>> 
>>>> 
>>>> 3. What is actually received during the early protocol exchange? That’s
>>>> mostly for my curiosity, to see how chatty the protocol is, the size of
>>>> packets it tends to exchange, etc. To answer a question from Marc André,
>>>> this one and the next are tools that would immediately have shown me where
>>>> the macOS version was goofing up.
>>>> 
>>>>> spicy -t wire_read -h turbo -p 5901
>>>> [wire_read]Read 16 bytes in 0x7f8b150d85a8
>>>> [wire_read]00000000:  52 45 44 51 02 00 00 00 02 00 00 00 BA 00 00 00
>>>> 
>>>> [wire_read]Read 186 bytes in 0x7f8b141c6650
>>>> [wire_read]00000000:  00 00 00 00 30 81 9F 30 0D 06 09 2A 86 48 86 F7
>>>> [wire_read]00000010:  0D 01 01 01 05 00 03 81 8D 00 30 81 89 02 81 81
>>>> [wire_read]00000020:  00 D2 A9 FE 7B F0 36 B7 33 3C C1 F0 5E 10 50 F3
>>>> [wire_read]00000030:  11 D3 3E 31 4A 95 AC AC 7B 0C 20 CB 57 B2 C8 1D
>>>> [wire_read]00000040:  10 F1 E2 4A 1A F2 8F 2A 03 70 25 BB 63 C7 D6 51
>>>> [wire_read]00000050:  D4 DA 64 3C 95 0C B6 9E A7 4E B3 3F 65 12 83 86
>>>> [wire_read]00000060:  F4 35 F3 AC 3D C3 9A 6C C9 EE 34 D9 67 F0 8A 2D
>>>> [wire_read]00000070:  53 4D 73 46 06 CA 80 04 CF 7F 72 4A 8D 74 81 78
>>>> [wire_read]00000080:  E8 28 39 51 7C 9E 3B C2 19 D6 BD EC B4 F7 0A 68
>>>> [wire_read]00000090:  D3 AF D7 1F E6 93 E1 4C 23 F5 83 68 E1 7D 31 9A
>>>> [wire_read]000000A0:  53 02 03 01 00 01 01 00 00 00 01 00 00 00 B2 00
>>>> [wire_read]000000B0:  00 00 0B 00 00 00 09 00 00 00
>>>> [wire_read]Read 4 bytes in 0x109478b44
>>>> [wire_read]00000000:  00 00 00 00
>>>> ** Message: main channel: opened
>>>> [wire_read]Read 6 bytes in 0x7f8b1274efb0
>>>> [wire_read]00000000:  67 00 20 00 00 00
>>>> [wire_read]Read 32 bytes in 0x7f8b1274dff0
>>>> [wire_read]00000000:  31 AE 1E 5E 01 00 00 00 03 00 00 00 02 00 00 00
>>>> [wire_read]00000010:  01 00 00 00 0A 00 00 00 9B AF BE 0B 00 00 00 01
>>>> 
>>> 
>>> This is so low-level than even debug log don't make much sense. Imho when
>>> you need to read the byte-stream, you can just attach gdb and put a
>>> breakpoint/command. Then GDB allows to do a lot more interesting stuff,
>>> like conditional breakpoint, various print etc..
>> 
>> Valid opinion. Don’t like trace options, don’t have to use them. By why
>> prevent me from using them?
> 
> Cluttering the code because you can't use proper tools is not acceptable imho. 

I have to call out the cheap shot here. What on earth makes you think it is OK to publicly imply that:

a) I can’t use these tools.

b) there is so little debate about these being the “proper tools” that my suggesting otherwise is “not acceptable”.

Please refrain from making such unfounded attacks. I don’t mind you asking if I know about this or that tool, or why I think they are not the right tools. It is definitely unreasonable to claim I don’t know of alternatives, in particular of alternatives as widely known as gdb. And believe me, it is entirely wrong to deduce that I am proposing the trace options because I don’t know how to use gdb.

I have what I consider very good reasons to regard the alternatives you suggested as inferior for my purpose, and I have made numerous attempts at explaining what these reasons are. I try again in this email; search below for “semantics” vs. “primitive”.

The claim of “clutter” is bogus too. Adding internal reports or instrumentation is not cluttering the code. If anything, it is documenting it. It is adding code you can use to check internal consistency, to measure performance, or to test your hypotheses about how the code behaves under this or that load. Any large enough system has such tools. Linux has fsck for filesystem consistency checks, and iostat, lsof, and plenty of other tools for analysis or for asking specific questions. Suggesting that these tools clutter Linux because you could use a kernel debugger to get the same information is not a very compelling argument. How is spice different?

Finally, let me point out that I usually try to give precise and detailed answers to your questions. This often leads me to, in turn, ask precise and specific questions in my answers. I would very much appreciate it if you started answering these questions. Otherwise, it is not a productive discussion; it is just you asserting your personal preferences and beliefs as the only possible choice.


> 
>> 
>> Please send me your GDB script for -t wire_read:marshall_read, which I use
>> for LEB128 development, and for -t drop_stats=25:latency_offset=N, which I
>> use to investigate where the latency is coming from. Please make it
>> compatible with LLDB too, unless you know how to install GDB on macOS (read
>> this first if you think I’m being silly
>> https://stackoverflow.com/questions/33162757/how-to-install-gdb-debugger-in-mac-osx-el-capitan).
> 
> Well that looks like a general problem if you can't use debuggers on your platform. I don't own a mac to help you here, nor do I have much experience with it.

I notice you did not send me the GDB script. This is an example of you not answering a question I asked. Could you please make a serious effort at writing such a script, and refrain from further suggestions that I use alternative tools until you have done so? I’m not trying to shut you up here, but rather to have you reflect on what you were asking, on why this is absolutely not the “proper tool”, and on the fact that my supposedly not knowing how to use GDB has nothing to do with the matter.

Hypothesizing that I suggest a mechanism like the one I shared because I don’t know how to use GDB or whatever else is not just insulting; it shows you did not put much effort into understanding why I am making this proposal and what problems it addresses. I am sorry, but based on your track record, I am highly convinced you can do much better.

Of course, the fact that GDB does not work well on the macOS platform (and neither do systemtap, or ftrace, or …) would be enough to kill these approaches for me. But that’s not the core of the argument. Even if I decided to focus only on today’s Linux, and assumed you can install any package you want, I would still consider using gdb for the examples I gave as, at best, a clumsy workaround, and at worst a big fat waste of time. If you write the gdb script, I believe you will reach that conclusion yourself.



> 
>>> 
>>>> 
>>>> 4. A symbolic view of the initial exchange (not all cases are covered in
>>>> marshal.py / demarshal.py yet, so some fields are likely missing).
>>>> 
>>>>> spicy -t marshall_read -h turbo -p 5901
>>>> 
>>>> [marshall_read]  SpiceMsgMainInit.session_id=1745897490
>>>> [marshall_read]  SpiceMsgMainInit.display_channels_hint=1
>>>> [marshall_read]  SpiceMsgMainInit.supported_mouse_modes=3
>>>> [marshall_read]  SpiceMsgMainInit.current_mouse_mode=2
>>>> [marshall_read]  SpiceMsgMainInit.agent_connected=1
>>>> [marshall_read]  SpiceMsgMainInit.agent_tokens=10
>>>> [marshall_read]  SpiceMsgMainInit.multi_media_time=226863047
>>>> [marshall_read]  SpiceMsgMainInit.ram_hint=16777216
>>>> [marshall_read]  SpiceMsgMainName.name_len=10
>>>> [marshall_read]  SpiceMsgPing.id=1
>>>> [marshall_read]  SpiceMsgPing.timestamp=226863447500
>>>> [marshall_read]  SpiceMsgPing.id=2
>>>> [marshall_read]  SpiceMsgPing.timestamp=226863447502
>>>> [marshall_read]  SpiceMsgPing.id=3
>>>> [marshall_read]  SpiceMsgPing.timestamp=226863447503
>>>> [marshall_read]  SpiceMsgChannels.num_of_channels=8
>>>> [marshall_read]  SpiceChannelId.type=2
>>>> [marshall_read]  SpiceChannelId.id=0
>>>> [marshall_read]  SpiceChannelId.type=4
>>>> [marshall_read]  SpiceChannelId.id=0
>>>> [marshall_read]  SpiceChannelId.type=9
>>>> [marshall_read]  SpiceChannelId.id=1
>>>> [marshall_read]  SpiceChannelId.type=9
>>>> [marshall_read]  SpiceChannelId.id=0
>>>> [marshall_read]  SpiceChannelId.type=6
>>>> [marshall_read]  SpiceChannelId.id=0
>>>> [marshall_read]  SpiceChannelId.type=5
>>>> [marshall_read]  SpiceChannelId.id=0
>>>> [marshall_read]  SpiceChannelId.type=2
>>>> [marshall_read]  SpiceChannelId.id=1
>>>> [marshall_read]  SpiceChannelId.type=3
>>>> [marshall_read]  SpiceChannelId.id=0
>>> 
>>> Same here, this is so verbose and low-level than you can simply attach gdb
>>> and pretty-print the structure you are interested in.
>> 
>> Possible ≠ easy. You can also simply edit source code with sed. That does not
>> make it a good idea.
>> 
>>> This is really just useful to developpers, no one else could make sense of
>>> these internal stuff.
>> 
>> Asking a customer to send the output of spicy -t magic_trace seems easier
>> than asking them to run a gdb script.
> 
> I never said you have to ask customer for running gdb.

Actually, you did, if implicitly. Just above, you suggested that this was “really just useful to developers”, and that the correct way to get the information was to “simply attach gdb and pretty print the structure you are interested in”. Meaning that if, as a developer, I want that information from a customer, I have to ask the customer to run GDB. It is the most logical conclusion of what you wrote.

The very reason I dragged a hypothetical customer into the scenario was to demonstrate that “no one else [than a developer] could make sense of these internal stuff” does not imply the conclusion you wanted me to draw, namely that the best way to print that stuff is with gdb, and (presumably from there) that whoever thinks otherwise does not know how to use gdb. See also the ssh -vvv example below.


> At this point, the developper should have enough information to reproduce the problem.

How? You said yourself that the information was available through GDB and that this was the right way to get it. But let’s assume I understood you incorrectly. Without GDB, using only the existing debug log, how does a developer know that some step in the protocol misbehaved, and how it misbehaved? Don’t you find it strange to have a program that is designed to talk to another, where the only way to know what it understood from the other side is to attach a debugger or a packet analyzer? (As an aside, wireshark tells me what goes on the wire, not what spice thinks it sent or received.)

When some ssh session does not work the way I expect, I don’t attach a debugger or wireshark to ssh; I use ssh -vvv. And quite often, ssh -vvv is enough to tell me where things broke down. I don’t have that for spice, and that’s a Bad Thing™ in my opinion. If some ssh developer told me that I cannot add code for -vvv because gdb or wireshark are the “proper tools” for figuring out what’s going on, I would find that ludicrous.


> I happened very seldomly that you have to provide the user with a custom binary for testing, which may provide additional log (a good indication that this log should be upstream this time)

Can you give some concrete examples? It would be interesting to see whether the information that was added makes sense to have all the time, or whether it is something you would ask for only when specific symptoms are present. My working hypothesis is that a lot of that “sometimes useful” information never made it into the code precisely because it would have been there all the time even when no one needed it, and would have been seen as clutter. This is precisely the problem that the -t option addresses: a scalable way to add as many surgically precise messages as you want, without cluttering the general output that everybody shares.
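
To illustrate what I mean by a surgically precise message, here is roughly what such a trace looks like at the point of interest. This is an illustrative fragment, not the exact code from my patches; spice_trace is the macro I proposed, and the variable names are made up for the example.

    /* Illustrative call site (not the exact code from my patches):
       report per-stream drop statistics, silent unless the drop_stats
       category was selected with -t. */
    static void report_drop_stats(unsigned id, unsigned frames_in,
                                  unsigned frames_late, double avg_late_ms)
    {
        spice_trace(drop_stats,
                    "id=%u #in-frames=%u #late=%u (%.2f%%, avg %.2f ms)",
                    id, frames_in, frames_late,
                    frames_in ? 100.0 * frames_late / frames_in : 0.0,
                    avg_late_ms);
    }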

> 
>> 
>>> 
>>>> 
>>>> I believe this example illustrates well the kind of instrumentation that
>>>> is
>>>> useful and convenient to have there all the time, but that we don’t want
>>>> to
>>>> send to a debug log if not explicitly requested. I see no reason to
>>>> pollute
>>>> even a structured log with that stuff, unless I actually need it. One line
>>>> per field with a common structure on purpose, to make it easy to grep /
>>>> awk
>>>> the data you want.
>>>> 
>>> 
>>> No, I don't think we need this available "all the time" or put differently,
>>> in the code. We need useful debug log, but when it comes to internal data
>>> structure dump or byte stream analaysis, use proper tools: gdb &
>>> tcpdump/wireshark for ex.
>> 
>> Use what you want. But why prevent me from using tools I find less primitive
>> for the task?
> 
> We don't have the same perception of primitive tools.

No, we don’t. But I also believe you misunderstood what I meant by “primitive”, and interpreted it as “simple”.

I used “primitive” in the same sense that C keywords are the primitives of the language, i.e. something designed as a building block for systems, without any built-in semantics. This is in opposition to something having semantics; it does not imply “simple”, but it does imply “insufficient”. GDB is way more complicated than my little preprocessor tricks and printfs. Yet a context-sensitive, semantics-aware printf statement in spice will still be less primitive, because it deals with aspects that I would have to build manually in GDB. In other words, GDB and printf are primitives. GDB scripts and printf statements are not, and can be said to be higher-level.

I propose a trace options system which lets us add content with a lot of semantics to the spice logs. You countered with GDB pretty printing of structures, which is a primitive on which I could build a (different) semantic system using GDB scripts. Now, I believe that this can be done. I also believe that it would be more complex to build a decent collection of spice-aware GDB scripts than to add printf statements in the code. This is why I asked you to provide such a script, as a comparison point. I’m not trying to be a smartass or anything; I genuinely want to see whether gdb would effectively be shorter and simpler than printf for the examples. I personally think it wouldn’t.

But when I say I want to use less primitive tools, it is to point out that you can’t offer primitives as a replacement for semantics. You can’t offer putchar as a replacement for printf: one is a primitive, the other has semantics about streams and formatting that you would have to reconstruct on top of putchar. You can’t offer gdb alone as a replacement for protocol traces; you need to add scripts that dump the relevant data.

Does that clarify what I meant?

Regarding the simplicity of GDB, x/100bx data is a good manual alternative to my hex dump, and automated alternatives are possible (https://stackoverflow.com/questions/9233095/memory-dump-formatted-like-xxd-from-gdb). As you wrote, pretty printing is a good way to show C structures. BUT the GDB hex dump is not any simpler than the one I wrote in C (and presumably slower), and the pretty printing of C structures is made more complicated by the fact that there are many marshalling functions, so you need to add breakpoints and print commands in each of them. Again, try it, share your findings, and let’s see if it’s really simpler.
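
For reference, the hex dump side of this is about a dozen lines of C. The following is a minimal sketch of that style of dump, not the exact code from my patches:

    #include <stdio.h>

    /* Minimal sketch of a [wire_read]-style hex dump (not the exact code
       from my patches): 16 bytes per line, offset on the left. */
    static void hex_dump(const unsigned char *data, size_t size)
    {
        for (size_t offset = 0; offset < size; offset += 16) {
            printf("%08zX: ", offset);
            for (size_t i = offset; i < offset + 16 && i < size; i++)
                printf(" %02X", data[i]);
            printf("\n");
        }
    }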

> 
>>> 
>>> I don't think you make a good point about the "log tweaks" either.
>> 
>> I don’t see you having made any good point against them, only suggesting
>> purely theoretical alternatives that would turn minutes of work into days.
> 
> You won't be surprised that I think quite the contrary.
> 
> I'd simply like to:
> - not make needless tweaks available in production builds

Please justify that they are needless, when I have repeatedly stated that I need them and explained why. It is important that you address my needs; otherwise you are just asserting your personal preferences.


> - not clutter the code with logging very specific to a bug (just imagine if everybody added its own bug-specific log in upstream code)

On the code side, please explain how the spice_trace(name_of_trace, printf_format, …) syntax I suggested clutters the code, and what you’d see as an alternative, taking into account that I need the trace (and that I mostly classified existing traces), so simply removing the trace statement does not address my requirements.
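
For reference, the whole mechanism behind that syntax can be as small as this. It is a simplified sketch of the idea, not the actual patch; the bit-field flags and the fprintf to stderr are illustrative:

    #include <stdio.h>

    /* Simplified sketch, not the actual patch: each trace name becomes a
       flag that the -t handler sets, and the call site is one conditional
       fprintf prefixed with its category. */
    struct spice_traces { unsigned wire_read : 1, marshall_read : 1, drop_stats : 1; };
    extern struct spice_traces spice_traces;   /* filled in by the -t parser */

    #define spice_trace(name, fmt, ...)                                 \
        do {                                                            \
            if (spice_traces.name)                                      \
                fprintf(stderr, "[" #name "]" fmt "\n", ##__VA_ARGS__); \
        } while (0)

    /* e.g. spice_trace(wire_read, "Read %d bytes in %p", size, buffer); */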

On the log side, please explain why it’s better to clutter the logs unconditionally with uncategorized messages.

If I were to use the word “clutter”, it would be to describe the existing logs, with over 60% of the byte count in there being generally useless or redundant information (I derive the 60% from the ratio of sizes between two files I shared earlier, 79M vs 29M).


> - not reinvent the wheel (like parsing etc)

Please explain why you think my proposal does that. I’m adding one option handler, and one shared option parser that deals both with environment variables and command-line options. Does this duplicate any existing parsing?
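
For reference, the parsing involved is on the order of a dozen lines: split the argument on ‘:’, where each item is either a name or a name=value pair. A minimal sketch follows, with set_trace() standing in for whatever records the flag or tweak value (the function name is illustrative):

    #include <string.h>
    #include <stdlib.h>

    /* Sketch of the -t / environment-variable parser: the argument is a
       colon-separated list of name or name=value items, for example
       "latency_offset=1900:drop_stats=25".  set_trace() stands in for
       whatever records the flag or tweak value. */
    extern void set_trace(const char *name, long value);

    static void parse_traces(const char *arg)
    {
        char *copy = strdup(arg);
        for (char *item = strtok(copy, ":"); item; item = strtok(NULL, ":")) {
            char *eq = strchr(item, '=');
            if (eq)
                *eq = '\0';
            set_trace(item, eq ? strtol(eq + 1, NULL, 0) : 1);
        }
        free(copy);
    }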


> - use existing logging infrastructure to be coherent with the rest of the system, and stay familiar to the developer/admin of that system 

Please explain why you think I’m not using the existing logging infrastructure, when I made quite a few efforts to ensure that existing logging options behave mostly like they used to. If there are errors in the way I do this, they can be fixed. But as stated, this is a cheap attack on code you obviously did not take the time to read.


> - use existing, powerful tools and framework, familiar to existing developers

Please explain how adding a few printf statements precludes you from using these “powerful tools and frameworks”. I did respond to this earlier. Nothing I did prevents you from using gdb, wireshark, or systemtap if you want to. But please acknowledge that none of these tools really solves the problem I’m addressing.

> 
> What I consider useful in what you propose, is to add more debug and add log categories. The rest fits a poor-man way of debugging, imho, and doesn't match well with common practices and tools. I will study your patches in more details when I have time to see if I missed something. 

I do resent attacks like “poor-man way of debugging”.


Thanks,
Christophe



> _______________________________________________
> Spice-devel mailing list
> Spice-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/spice-devel


