<div dir="ltr">Hi Maarten, I tested your changes and needed the attached patch; with it, behavior now seems equivalent to Android sync. I haven't tested performance.<div><br></div><div>The issue resolved by this patch happens in sync_fence_merge when i_b < b->num_fences and i_a >= a->num_fences (or vice versa). Then pt_a is NULL, and because pt_b is non-NULL the first condition goes on to evaluate pt_a->context, which causes a crash.</div>
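<div><br></div><div>The failure mode above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the attached patch: struct pt, pt_is_later() and merge_pts() are hypothetical names, and struct pt stands in for the two struct fence fields the loop reads (context and seqno). The only change relative to the loop in your patch is the extra pt_a check in the first branch, so a NULL pt_a falls through to the branch that takes pt_b.</div>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fence: only the fields the loop reads. */
struct pt {
	unsigned context;	/* timeline id; inputs are sorted by this */
	unsigned seqno;		/* position on the timeline; may wrap around */
};

/* Wraparound-safe "a is at or after b", same test as in the patch. */
static int pt_is_later(const struct pt *a, const struct pt *b)
{
	return a->seqno - b->seqno <= INT_MAX;
}

/*
 * Merge two arrays that are sorted by context and hold at most one
 * point per context. For a shared context only the later point is kept.
 * out must have room for na + nb entries. Returns the number written.
 */
static size_t merge_pts(const struct pt *a, size_t na,
			const struct pt *b, size_t nb,
			struct pt *out)
{
	size_t i = 0, i_a = 0, i_b = 0;

	assert(out != NULL);

	while (i_a < na || i_b < nb) {
		/* NULL marks an exhausted input; at most one can be NULL here. */
		const struct pt *pt_a = i_a < na ? &a[i_a] : NULL;
		const struct pt *pt_b = i_b < nb ? &b[i_b] : NULL;

		if (!pt_b || (pt_a && pt_a->context < pt_b->context)) {
			/* b exhausted, or a's context sorts first */
			out[i++] = *pt_a;
			i_a++;
		} else if (!pt_a || pt_a->context > pt_b->context) {
			/* a exhausted, or b's context sorts first */
			out[i++] = *pt_b;
			i_b++;
		} else {
			/* same context: keep only the later point */
			out[i++] = pt_is_later(pt_a, pt_b) ? *pt_a : *pt_b;
			i_a++;
			i_b++;
		}
	}
	return i;
}
```

<div>With a holding contexts 1 and 3 and b holding contexts 2, 3 and 4, the loop keeps one point per context and still reaches b's trailing context 4 entry after a is exhausted, which is the case that crashed.</div>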
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Nov 4, 2013 at 2:31 AM, Maarten Lankhorst <span dir="ltr"><<a href="mailto:maarten.lankhorst@canonical.com" target="_blank">maarten.lankhorst@canonical.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 02-11-13 22:36, Colin Cross wrote:<br>
</div><div><div class="h5">> On Wed, Oct 30, 2013 at 5:17 AM, Maarten Lankhorst<br>
> <<a href="mailto:maarten.lankhorst@canonical.com">maarten.lankhorst@canonical.com</a>> wrote:<br>
>> On 24-10-13 14:13, Maarten Lankhorst wrote:<br>
>>> So I actually tried to implement it now. I killed all the deprecated members and assumed a linear timeline.<br>
>>> This means that syncpoints can only be added at the end, not in between. In particular it means sw_sync<br>
>>> might be slightly broken.<br>
>>><br>
>>> I only tested it with a simple program I wrote called ufence.c, it's in drivers/staging/android/ufence.c in the following tree:<br>
>>><br>
>>> <a href="http://cgit.freedesktop.org/~mlankhorst/linux" target="_blank">http://cgit.freedesktop.org/~mlankhorst/linux</a><br>
>>><br>
>>> the "rfc: convert android to fence api" has all the changes from my dma-fence proposal to what android would need,<br>
>>> it also converts the userspace fence api to use the dma-fence api.<br>
>>><br>
>>> sync_pt is implemented as a fence too. This meant not having to convert all of Android right away, though I did make some changes.<br>
>>> I killed the deprecated members and made all the fence calls forward to the sync_timeline_ops. dup and compare are no longer used.<br>
>>><br>
>>> I haven't given this a spin on a full Android kernel, only with the components that are in the mainline kernel under staging and my dumb test program.<br>
>>><br>
>>> ~Maarten<br>
>>><br>
>>> PS: The nomenclature is very confusing. I want to rename dma-fence to syncpoint, but I want some feedback from the android devs first. :)<br>
>>><br>
>> Come on, any feedback? I want to move the discussion forward.<br>
>><br>
>> ~Maarten<br>
> I experimented with it a little on a device that uses sync and came<br>
> across a few bugs:<br>
> 1. sync_timeline_signal needs to call __fence_signal on all signaled<br>
> points on the timeline, not just the first<br>
> 2. fence_add_callback doesn't always initialize cb.node<br>
> 3. sync_fence_wait should take ms<br>
> 4. sync_print_pt status printing was incorrect<br>
> 5. there is a deadlock:<br>
> sync_print_obj takes obj->child_list_lock<br>
> sync_print_pt<br>
> fence_is_signaled<br>
> fence_signal takes fence->lock == obj->child_list_lock<br>
> 6. freeing a timeline before all the fences holding points on that<br>
> timeline have timed out results in a crash<br>
><br>
> With the attached patch to fix these issues, our libsync and sync_test<br>
> give the same results as with our sync code. I haven't tested against<br>
> the full Android framework yet.<br>
><br>
> The compare op and timeline ordering are critical to the efficiency of<br>
> sync points on Android. The compare op is used when merging fences to<br>
> drop all but the latest point on the same timeline. This is necessary<br>
> for example when the same buffer is submitted to the display on<br>
> multiple frames, like when there is a live wallpaper in the background<br>
> updating at 60 fps and a static screen of widgets on top of it. The<br>
> static widget buffer is submitted on every frame, returning a new<br>
> fence each time. The compositor merges the new fence with the fence<br>
> for the previous buffer, and because they are on the same timeline it<br>
> merges down to a single point. I experimented with disabling the<br>
> merge optimization on a Nexus 10, and found that leaving the screen on<br>
> running a live wallpaper eventually resulted in 100k outstanding sync<br>
> points.<br>
<br>
</div></div>Well, here I did the same for dma-fence, can you take a look?<br>
<br>
---<br>
<br>
diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c<br>
index 2c7fd3f2ab23..d1d89f1f8553 100644<br>
--- a/drivers/staging/android/sync.c<br>
+++ b/drivers/staging/android/sync.c<br>
@@ -232,39 +232,62 @@ void sync_fence_install(struct sync_fence *fence, int fd)<br>
}<br>
EXPORT_SYMBOL(sync_fence_install);<br>
<br>
+static void sync_fence_add_pt(struct sync_fence *fence, int *i, struct fence *pt) {<br>
+ fence->cbs[*i].sync_pt = pt;<br>
+ fence->cbs[*i].fence = fence;<br>
+<br>
+ if (!fence_add_callback(pt, &fence->cbs[*i].cb, fence_check_cb_func)) {<br>
+ fence_get(pt);<br>
+ (*i)++;<br>
+ }<br>
+}<br>
+<br>
 struct sync_fence *sync_fence_merge(const char *name,<br>
 struct sync_fence *a, struct sync_fence *b)<br>
{<br>
int num_fences = a->num_fences + b->num_fences;<br>
struct sync_fence *fence;<br>
- int i;<br>
+ int i, i_a, i_b;<br>
<br>
fence = sync_fence_alloc(offsetof(struct sync_fence, cbs[num_fences]), name);<br>
if (fence == NULL)<br>
return NULL;<br>
<br>
- fence->num_fences = num_fences;<br>
atomic_set(&fence->status, num_fences);<br>
<br>
- for (i = 0; i < a->num_fences; ++i) {<br>
- struct fence *pt = a->cbs[i].sync_pt;<br>
-<br>
- fence_get(pt);<br>
- fence->cbs[i].sync_pt = pt;<br>
- fence->cbs[i].fence = fence;<br>
- if (fence_add_callback(pt, &fence->cbs[i].cb, fence_check_cb_func))<br>
- atomic_dec(&fence->status);<br>
+ /*<br>
+ * Assume sync_fence a and b are both ordered and have no<br>
+ * duplicates with the same context.<br>
+ *<br>
+ * If a sync_fence can only be created with sync_fence_merge<br>
+ * and sync_fence_create, this is a reasonable assumption.<br>
+ */<br>
+ for (i = i_a = i_b = 0; i_a < a->num_fences || i_b < b->num_fences; ) {<br>
+ struct fence *pt_a = i_a < a->num_fences ? a->cbs[i_a].sync_pt : NULL;<br>
+ struct fence *pt_b = i_b < b->num_fences ? b->cbs[i_b].sync_pt : NULL;<br>
+<br>
+ if (!pt_b || pt_a->context < pt_b->context) {<br>
+ sync_fence_add_pt(fence, &i, pt_a);<br>
+<br>
+ i_a++;<br>
+ } else if (!pt_a || pt_a->context > pt_b->context) {<br>
+ sync_fence_add_pt(fence, &i, pt_b);<br>
+<br>
+ i_b++;<br>
+ } else {<br>
+ if (pt_a->seqno - pt_b->seqno <= INT_MAX)<br>
+ sync_fence_add_pt(fence, &i, pt_a);<br>
+ else<br>
+ sync_fence_add_pt(fence, &i, pt_b);<br>
+<br>
+ i_a++;<br>
+ i_b++;<br>
+ }<br>
}<br>
<br>
- for (i = 0; i < b->num_fences; ++i) {<br>
- struct fence *pt = b->cbs[i].sync_pt;<br>
-<br>
- fence_get(pt);<br>
- fence->cbs[a->num_fences + i].sync_pt = pt;<br>
- fence->cbs[a->num_fences + i].fence = fence;<br>
- if (fence_add_callback(pt, &fence->cbs[a->num_fences + i].cb, fence_check_cb_func))<br>
- atomic_dec(&fence->status);<br>
- }<br>
+ if (num_fences > i)<br>
+ atomic_sub(num_fences - i, &fence->status);<br>
+ fence->num_fences = i;<br>
<br>
sync_fence_debug_add(fence);<br>
return fence;<br>
<div class="HOEnZb"><div class="h5"><br>
</div></div></blockquote></div><br></div>