[PATCH v2 libdrm 4/7] tegra: Add channel, job, pushbuf and fence APIs

Fri May 2 07:53:55 PDT 2014

On Fri, May 2, 2014 at 4:06 PM, Thierry Reding <thierry.reding at gmail.com> wrote:
> On Thu, Apr 10, 2014 at 07:13:22PM +0200, Erik Faye-Lund wrote:
>>
>> But this raises the question, how is job->cmdbufs (and job->relocs)
>> supposed to get free'd?
>>
>> I'm a bit curious as to what the expected life-time of these objects
>> are. Do I need to create a new job-object for each submit, or can I
>> reuse a job? I think the latter would be preferable... So perhaps we
>> should have a drm_tegra_job_reset that allows us to throw away the
>> accumulated cmdbufs and relocs, and start building a new job?
>
> They are currently freed by drm_tegra_job_free(), so yes, the current
> intended usage is to create a new job for each submit. Do you mind if we
> keep your proposal for a drm_tegra_job_reset() in mind for a follow-up
> patch. It's an optimization that we can easily add later on and I'd like
> to avoid too much premature optimization at this point.

Sure, I don't mind.

>> > diff --git a/tegra/private.h b/tegra/private.h
>> > index 9b6bc9395d23..fc74fb56b58d 100644
>> > --- a/tegra/private.h
>> > +++ b/tegra/private.h
>> > @@ -26,13 +26,31 @@
>> >  #define __DRM_TEGRA_PRIVATE_H__ 1
>> >
>> >  #include <stdbool.h>
>> > +#include <stddef.h>
>> >  #include <stdint.h>
>> >
>> >  #include <libdrm.h>
>> > +#include <libdrm_lists.h>
>> >  #include <xf86atomic.h>
>> >
>> > +#include "tegra_drm.h"
>> >  #include "tegra.h"
>> >
>> > +#define container_of(ptr, type, member) ({                             \
>> > +               const typeof(((type *)0)->member) *__mptr = (ptr);      \
>> > +               (type *)((char *)__mptr - offsetof(type, member));      \
>> > +       })
>> > +
>> <snip>
>> > +
>> > +struct drm_tegra_pushbuf_private {
>> > +       struct drm_tegra_pushbuf base;
>> > +       struct drm_tegra_job *job;
>> > +       drmMMListHead list;
>> > +       drmMMListHead bos;
>> > +
>> > +       struct drm_tegra_bo *bo;
>> > +       uint32_t *start;
>> > +       uint32_t *end;
>> > +};
>> > +
>> > +static inline struct drm_tegra_pushbuf_private *
>> > +pushbuf_priv(struct drm_tegra_pushbuf *pb)
>> > +{
>> > +       return container_of(pb, struct drm_tegra_pushbuf_private, base);
>> > +}
>> > +
>>
>> This seems to be the only use-case for container_of, and it just
>> happens to return the exact same pointer as was passed in... so do we
>> really need that helper?
>
> It has the benefit of keeping this code working even when somebody
> suddenly adds to the beginning of drm_tegra_pushbuf_private. I'd rather
> not resolve to force casting just to be on the safe side.

Is that a really a legitimate worry? There's a very clear relationship
between the two structures. Both nouveau and freedreno does the same
thing (for nouveau_bo_priv, nouveau_device_priv, msm_device, msm_pipe
and msm_bo structures), and it seems to work fine for them. Perhaps a
comment along the lines of "/* this needs to be the first member */"
is enough to discourage people from messing with it?

>> > diff --git a/tegra/pushbuf.c b/tegra/pushbuf.c
>> > new file mode 100644
>> > index 000000000000..178d5cd57541
>> > --- /dev/null
>> > +++ b/tegra/pushbuf.c
>> > @@ -0,0 +1,205 @@
>> <snip>
>> > +drm_public
>> > +int drm_tegra_pushbuf_new(struct drm_tegra_pushbuf **pushbufp,
>> > +                         struct drm_tegra_job *job)
>> > +{
>> > +       struct drm_tegra_pushbuf_private *pushbuf;
>> > +       void *ptr;
>> > +       int err;
>> > +
>> > +       pushbuf = calloc(1, sizeof(*pushbuf));
>> > +       if (!pushbuf)
>> > +               return -ENOMEM;
>> > +
>> > +       DRMINITLISTHEAD(&pushbuf->list);
>> > +       DRMINITLISTHEAD(&pushbuf->bos);
>> > +       pushbuf->job = job;
>> > +
>> > +       *pushbufp = &pushbuf->base;
>> > +
>> > +       DRMLISTADD(&pushbuf->list, &job->pushbufs);
>>
>> Hmm. So the job keeps a list of pushbufs. What purpose does this
>> serve? Shouldn't the job only need the cmdbuf-list (which it already
>> has) and the BOs (so they can be dereferenced after being submitted)?
>
> This is currently used to keep track of existing pushbuffers and make
> sure that they are freed (when the job is freed).
>

OK, it seems to me that the need for keeping the pushbuffers around is
simply not there. Only the BOs needs to be kept around, no?

>> AFAICT, we could make drm_tegra_pushbuf_queue append the BO to a list
>> in the job instead. That way we don't need to keep all the
>> pushbuf-objects around, and we might even be able to reuse the same
>> object rather than keep reallocating them. No?
>
> I guess we could make drm_tegra_pushbuf_new() reinitialize the current
> pushbuf object and always return the same. But perhaps there are
> occasions where it's useful to keep a few of them around?

This sounds a bit hacky (and potentially hazardous, depending on the
client). I'd rather have clear semantics, and flexibility for the
clients to do what suits it best, as long as the cost is negligible.

> If reusing the pushbuf objects makes sense, then perhaps a different API
> would be more useful. Rather than allocate in the context of a job, they
> could be allocated in a channel-wide context and added to/associated
> with a job only as needed.

Yes, I think this might make sense. I'll have to ponder a bit more
about this, though.

>> > +drm_public
>> > +int drm_tegra_pushbuf_sync(struct drm_tegra_pushbuf *pushbuf,
>> > +                          enum drm_tegra_syncpt_cond cond)
>> > +{
>> > +       struct drm_tegra_pushbuf_private *priv = pushbuf_priv(pushbuf);
>> > +
>> > +       if (cond >= DRM_TEGRA_SYNCPT_COND_MAX)
>> > +               return -EINVAL;
>> > +
>> > +       *pushbuf->ptr++ = HOST1X_OPCODE_NONINCR(0x0, 0x1);
>> > +       *pushbuf->ptr++ = cond << 8 | priv->job->syncpt;
>>
>> Perhaps "HOST1X_OPCODE_IMM(0x0, cond << 8 | priv->job->syncpt)", which
>> saves a word?
>
> Interesting. Have you ever tested whether that works properly that way?
> I've never seen that in a command stream from the binary driver.

No, I haven't. But from the documentation, there's nothing to suggest
it wouldn't work.

I'll give it a spin later to verify!

>> Either way, I think this should either call drm_tegra_pushbuf_prepare
>> to make sure two words are available, or be renamed to reflect that it
>> causes a push (or two, as in the current form).
>
> I've added a call to drm_tegra_pushbuf_prepare() here and in
> drm_tegra_pushbuf_relocate() which also pushes one word to the pushbuf.
>

Actually, I think drm_tegra_pushbuf_relocate should just be renamed to
drm_tegra_pushbuf_push_reloc(), as that conveys quite clearly that it
is just a push-helper and need to be counted when preparing. That way
it doesn't need to prepare itself.

The reason why I think this is better for that function, is that
relocs will naturally appear in the middle of other pushes, so it
makes sense to perpare them just like the other pushes. I *think*
drm_tegra_pushbuf_sync is only really useful when switching between
GR2D and GR3D, so it's not such a big concern. But I could be wrong,
dunno. Perhaps it's also useful when texturing with a newly rendered
FBO also.

>> > +struct drm_tegra_channel;
>> > +struct drm_tegra_job;
>> > +
>> > +struct drm_tegra_pushbuf {
>> > +       uint32_t *ptr;
>> > +};
>>
>> Hmm, single-member structs always makes me cringe a bit. But in this
>> case it's much better than a double-pointer IMO.
>>
>> But perhaps some of the members in the private-struct might be useful
>> out here? For instance, if "uint32_t *end;" was also available, we
>> could do:
>>
>> inline void drm_tegra_pushbuf_push_word(struct drm_tegra_pushbuf
>> *pushbuf, uint32_t value)
>> {
>>        assert(pushbuf->ptr < pushbuf->end);
>>        *pushbuf->ptr++ = value;
>> }
>>
>> ...and easily pick up bugs with too few words? The helper would of
>> course be in the header-file, so the write gets inlined nicely.
>
> Sounds good. If you have no strong objections, I'd like to keep that for
> a follow on patch when there's more code that actively uses this API. It
> should be fairly safe to make more members public via drm_tegra_pushbuf
> rather than the other way around, so I'd like to err on the side of
> carefulness for a bit longer.

Hmm, isn't it the *current* code that is "careless" in this case? The
client doesn't have enough context to validate this itself (without
having access to the end-pointer)?

> Thanks for the review, and apologies for being sluggish.

No worries, thanks for the work so far! :)