[PATCH i-g-t v6] igt-runner fact checking
Peter Senna Tschudin
peter.senna at linux.intel.com
Tue Nov 19 10:07:14 UTC 2024
On 19.11.2024 09:19, Luca Coelho wrote:
> On Mon, 2024-11-18 at 17:03 +0100, Peter Senna Tschudin wrote:
>>
>> On 18.11.2024 14:07, Luca Coelho wrote:
>> [...]
>>>> +
>>>> +/**
>>>> + * report_fact_change: prints changes using igt_info()
>>>> + *
>>>> + * Reports facts that are new, changed and deleted.
>>>> + *
>>>> + * Returns: void
>>>> + */
>>>
>>> If you are planning to create docs from this (which I assume you do,
>>> because of the "/**"), you should add descriptions to all function
>>> arguments, otherwise you'll get warnings, I think.
>>
>> Thank you for the heads up. I will double check.
>>
>>>
>>>
>>>> +static void report_fact_change(igt_fact *fact, const char *new_value,
>>>> + const char *last_test, bool delete)
>>>
>>> I think you don't need this function. If you take away all the
>>> boilerplate and the checks to see who is actually calling the function
>>> (i.e. if it is new, update, delete...), all that remains is an
>>> igt_info() call. It seems to me that you are checking things which you
>>> already knew again inside this function.
>>
>> No no, this is absolutely needed. I am not concerned with additional
>> function calls at this scale as we can comfortably afford them.
>
> I have to disagree with "absolutely needed". How can a new feature
> implementation _require_ a specific function to exist?
>
>
>> The reason why this is a separated function is because logging is the
>> most important function of the patch, and I want to make sure it is in a
>> single function. It is a desing choice that I am happy with.
>
> Sure, I get that. But I don't see how adding a lot of boiler plate and
> redundant code can help.
Hi, and thank you for taking the time. Please show me what you have in
mind here. How can we have all logging in a separate function with code
that you are happy with?
>
> This all-important function is static and hidden well inside the code,
> being called by the list functions (also static), which in turn are
> called by other static functions until finally reaching igt_facts(),
> which is the one that is exported.
I get you weight priorities differently, but you really did not offer me
what is the benefit is of getting rid of the logging function. When I
made the decision I had ease of change in mind. I wanted to make sure to
make it as simple as possible to change the logging. The reason being
that we may need to do stuff like creating files to save the logs in the
future. So this is the benefit we get from having a separate logging
function: making it easy to change _how_ we log, and _what_ we log.
What do we get by doing it your way?
>
>
>>>> +{
>>>> + struct timespec uptime_ts;
>>>> + char *uptime = NULL;
>>>> + bool changed = false;
>>>> + const char *before_tests = "before any test";
>>>> +
>>>> + if (clock_gettime(CLOCK_BOOTTIME, &uptime_ts) != 0)
>>>> + return;
>>>> +
>>>> + asprintf(&uptime,
>>>> + "%ld.%06ld",
>>>> + uptime_ts.tv_sec,
>>>> + uptime_ts.tv_nsec / 1000);
>>>> +
>>>> + /* If delete is true, the fact was deleted */
>>>> + if (delete) {
>>>> + igt_info("[%s] [FACT %s] deleted: %s: %s\n",
>>>> + uptime,
>>>> + last_test ? last_test : before_tests,
>>>> + fact->name,
>>>> + fact->value);
>>>> + return;
>>>> + }
>>>
>>> For example here, only the deletion code calls this function with
>>> delete == true, so it could just as well print the info by itself.
>>>
>>> Same thing for the subsequent blocks.
>>
>> Oh yeah!
>>
>>
>>>
>>>
>>>> +
>>>> + /* If new_value is NULL, it is a new fact */
>>>> + if (new_value == NULL) {
>>>> + igt_info("[%s] [FACT %s] new: %s: %s\n",
>>>> + uptime,
>>>> + last_test ? last_test : before_tests,
>>>> + fact->name,
>>>> + fact->value);
>>>> + return;
>>>> + }
>>>> +
>>>> + /* Check if the value changed
>>>> + *
>>>> + * There is a corner case because the gpu driver can change the human
>>>> + * readeable strings. If the fact->name contains pci_gpu_fact, compare
>>>> + * just the first nine characters (pci_id).
>>>> + */
>>>> + if (strstr(fact->name, pci_gpu_fact) != NULL) {
>>>> + if (strncmp(fact->value, new_value, 9) != 0)
>>>> + changed = true;
>>>> + } else if (strcmp(fact->value, new_value) != 0)
>>>> + changed = true;
>>>
>>> And this, the only actual logic in this function is better done when
>>> you get this information.
>>
>> Having a separate function for logging is a design decision. Feel free
>> to propose concrete changes that keep it as a separate function.
>
> This is not related to logging in itself. According to what I
> understood from your comment, this was because the _input_ can change
> the later part of the string without this being a significant fact
> change. I thought it would be better to keep this check in the part of
> the code that actually knows what this string is, namely, the function
> that reads it from the kernel.
>
> Now that I think about it again, I think I understand your view. It's
> probably because you want to make this decision (to log or not to log)
> here. So this is fine.
Thanks
>
>
>>>> +
>>>> + if (changed) {
>>>> + igt_info("[%s] [FACT %s] changed: %s: %s -> %s\n",
>>>> + uptime,
>>>> + last_test,
>>>> + fact->name,
>>>> + fact->value,
>>>> + new_value);
>>>> + changed = false;
>>>> + }
>>>> + free(uptime);
>>>> +}
>>>> +
>>>> +/**
>>>> + * get_fact: Returns a fact from the list by name.
>>>> + *
>>>> + * Iterates over the list of facts and returns the first one with the same name.
>>>> + * If no fact is found, returns NULL.
>>>> + *
>>>> + * Returns: igt_fact* or NULL
>>>> + */
>>>> +static igt_fact *get_fact(igt_fact_list *list, const char *name)
>>>> +{
>>>> + if (!list || list->facts == NULL || list->num_facts == 0)
>>>> + return NULL;
>>>
>>> AFAICT list->facts == NULL and list->num_facts == 0 are redundant here,
>>> right? It should not happen. And if you did lose track of the
>>> num_facts you actually had in the list (regardless of whether it's
>>> zero), you would be in trouble in other places too.
>>
>> For me, redundancy is not a problem. Again, this was a design decision.
>> I am in no way trying to minimize the number of comparisons or optimize
>> to the level you are mentioning.
>
> Redundancy is not a problem per se, if it's deemed important for
> readability, performance etc. I don't think it is, in this case.
>
>
>> For me it is more important to catch errors here, to increase chances of
>> preventing weird behaviors downdstream. I am very happy to pay the very
>> small price of redundant checks.
>
> You're hiding errors here. If list->facts is not NULL, but list-
>> num_facts == 0, you'll simply return, as if there were nothing in the
> list, when in fact the list is broken. The weird behavior could still
> happen elsewhere, for instance, in the merge_facts() function you have
> this:
>
> for (int i = 0; i < list->num_facts; i++) {
> if (strstr(list->facts[i].name, fact_group) == NULL)
> continue;
>
> You are assuming list->facts is not NULL based on list->num_facts being
> greater than zero.
Thank you for that. I will make sure to extend the unit testing to cover
the case you described:
list->facts is not NULL
list->num_facts == 0
>
>
>>>> + for (int i = 0; i < list->num_facts; i++) {
>>>> + if (list->facts[i].name == NULL)
>>>> + continue;
>>>> +
>>>> + if (strcmp(list->facts[i].name, name) == 0)
>>>> + return &list->facts[i];
>>>> + }
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +/**
>>>> + * delete_fact: Deletes a fact from the list by name.
>>>> + *
>>>> + * Iterates over the list of facts and deletes the first one with the same name.
>>>> + * Returns true if the fact was deleted, false otherwise.
>>>> + *
>>>> + * Returns: bool idicating if the fact was deleted
>>>> + */
>>>> +static bool delete_fact(igt_fact_list *list, const char *name, const char *last_test)
>>>> +{
>>>> + igt_fact *fact = NULL;
>>>> + int i;
>>>> +
>>>> + if (!list || !name || list->facts == NULL || list->num_facts == 0)
>>>> + return false;
>>>> +
>>>> + fact = get_fact(list, name);
>>>> + if (fact == NULL)
>>>> + return false;
>>>> +
>>>> + /* Report the deletion */
>>>> + report_fact_change(fact, NULL, last_test, true);
>>>> +
>>>> + /* Move all facts after the one to be deleted one step back */
>>>> + for (i = 0; i < list->num_facts; i++) {
>>>> + if (strcmp(list->facts[i].name, name) == 0) {
>>>> + free(list->facts[i].name);
>>>> + free(list->facts[i].value);
>>>> +
>>>> + for (int j = i; j < list->num_facts - 1; j++)
>>>> + list->facts[j] = list->facts[j + 1];
>>>> + break;
>>>> + }
>>>> + }
>>>> + list->num_facts--;
>>>> +
>>>> + if (list->num_facts == 0) {
>>>> + free(list->facts);
>>>> + list->facts = NULL;
>>>> + } else {
>>>> + list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>>>> + igt_assert(list->facts != NULL);
>>>> + }
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> +/**
>>>> + * set_fact: Adds a fact to the list.
>>>> + *
>>>> + * This is intended for the "new_list" that is created for being merged with the
>>>> + * "list". It adds a new fact to the list. If the fact already exists, it
>>>> + * returns false.
>>>> + *
>>>> + * Returns: bool indicating if the fact was added
>>>> + */
>>>> +static bool set_fact(igt_fact_list *list, const char *name,
>>>> + const char *value, const char *last_test)
>>>> +{
>>>> + igt_fact *fact = NULL;
>>>> +
>>>> + /* Check that name and value are not null */
>>>> + if (name == NULL || value == NULL)
>>>> + return false;
>>>> +
>>>> + fact = get_fact(list, name);
>>>> + if (fact != NULL)
>>>> + return false; /* Should not happen */
>>>> +
>>>> + /* Add a new fact */
>>>> + list->num_facts++;
>>>> +
>>>> + list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>>>> + igt_assert(list->facts != NULL);
>>>> +
>>>> + list->facts[list->num_facts - 1].name = strdup(name);
>>>> + list->facts[list->num_facts - 1].value = strdup(value);
>>>> +
>>>> + if (last_test)
>>>> + list->facts[list->num_facts - 1].last_test = strdup(last_test);
>>>> + else
>>>> + list->facts[list->num_facts - 1].last_test = NULL;
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> +/**
>>>> + * update_list: Updates the main list with new or different facts
>>>> + *
>>>> + * This function is used to update the main list with new or different facts. It
>>>> + * differs from set_fact because by calling report_fact_change() as this is
>>>> + * changing the actual fact list.
>>>
>>> Again, I think you should know this at the caller side (and you do), so
>>> if you just print instead of having the report_fact_change() function,
>>> you can use a single function for add and update.
>>
>> Just acking that I did read this one as well.
>>
>>>
>>>
>>>> + *
>>>> + * Returns: bool indicating if the list was changed
>>>> + */
>>>> +static bool update_list(igt_fact_list *list, const char *name,
>>>> + const char *value, const char *last_test)
>>>> +{
>>>> + igt_fact *fact = NULL;
>>>> +
>>>> + /* Check that name and value are not null */
>>>> + if (name == NULL || value == NULL)
>>>> + return false;
>>>> +
>>>> + fact = get_fact(list, name);
>>>> + if (fact != NULL) {
>>>> + /* Return false if fact->value equals value */
>>>> + if (strcmp(fact->value, value) == 0)
>>>> + return false; /* Not changed */
>>>> +
>>>> + report_fact_change(fact, value, last_test, false);
>>>> +
>>>> + /* Update the value */
>>>> + fact->value = strdup(value);
>>>> + return true;
>>>> + }
>>>> +
>>>> + /* Add a new fact */
>>>> + list->num_facts++;
>>>> +
>>>> + list->facts = realloc(list->facts, list->num_facts * sizeof(igt_fact));
>>>> + igt_assert(list->facts != NULL);
>>>> +
>>>> + list->facts[list->num_facts - 1].name = strdup(name);
>>>> + list->facts[list->num_facts - 1].value = strdup(value);
>>>> +
>>>> + /* Report new fact */
>>>> + report_fact_change(&list->facts[list->num_facts - 1], NULL,
>>>> + last_test, false);
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> +/**
>>>> + * merge_facts: Merges two lists of facts
>>>> + *
>>>> + * This function merges two lists of facts. It filters the facts by fact_group
>>>> + * and deletes the ones that are not present in the new_list. It also updates
>>>> + * the facts that are present in both lists.
>>>> + *
>>>> + * As an example if fact_group is "pci_gpu", we can delete the fact
>>>> + * hardware.pci.gpu_at.0000:00:02.0 if it doesn't exist in new_list.
>>>> + *
>>>> + * Returns: void
>>>> + */
>>>> +static void merge_facts(igt_fact_list *list, igt_fact_list *new_list,
>>>> + const char *fact_group, const char *last_test)
>>>> +{
>>>> + /* Filter by fact_group and delete facts that exist on list
>>>> + * but not on new_list
>>>> + */
>>>> + for (int i = 0; i < list->num_facts; i++) {
>>>> + if (strstr(list->facts[i].name, fact_group) == NULL)
>>>> + continue;
>>>> + if (get_fact(new_list, list->facts[i].name) == NULL)
>>>> + delete_fact(list, list->facts[i].name, last_test);
>>>> + }
>>>> +
>>>> + /* Add and update facts from new_list to list */
>>>> + for (int i = 0; i < new_list->num_facts; i++) {
>>>> + if (strstr(new_list->facts[i].name, fact_group) == NULL)
>>>> + continue; /* Should never happen */
>
> This is another case of hiding errors. If this shouldn't happen, but
> does, you should make it very visible, maybe even an igt_assert? Now
> you're just ignoring it and could hide an actual problem in the test's
> target code, by not detecting it due to a bug in the test code that
> goes unnoticed.
Yes, thank you for that too!
>
>
>>>> + update_list(list, new_list->facts[i].name,
>>>> + new_list->facts[i].value,
>>>> + new_list->facts[i].last_test);
>>>> + }
>>>> +}
>>>
>>> Can't you just grow the list instead of creating new ones and then
>>> merging?
>>
>> Please show me the code you have in mind.
>
> I've seen and implemented so many lists before that this just seems a
> lot of complexity for such a simple, single-user list. I don't even
> understand why you need the executor to know the list... Is there going
> to be more than one user using fact lists in the same thread? I'm not
> saying that you should, but it _could_ even be a global in the facts.c
> code that gets initialized internally when the executor calls.
>
> In fact, you could even just have a global fixed-size array for it.
> Since resources are not much of a concern, you could make it large
> enough so that it would never, in practice, overflow. And if it did,
> just warn, crop it and keep going. Much simpler.
>
>
>>> This makes me start thinking... it would be so much easier to make all
>>> this in python. Would it be possible to glue a python script to igt-
>>> runner (or whoever calls it) and get the data from dmesg, lsmod etc.?
>>> Or otherwise aren't there any libraries already used by IGT that
>>> include list handling?
>>
>> Absolutely no python here. Never! :-)
>
> Oh, well. Can't really argue against dogmas...
Wow, that was a low punch. This code is going to run about 1M times /
day in our CI. There is history that precedes me working on IGT of
issues created by trying to mix c and python in igt. The reasons for not
trying python again are runtime overhead and the history of such
attempts in igt that just made a bunch of people and our CI very upset.
[...]
Thanks, I will update and resend.
More information about the igt-dev
mailing list