[RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework
Inki Dae
inki.dae at samsung.com
Thu Jun 13 19:32:40 PDT 2013
Hi Russell,
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
> Sent: Friday, June 14, 2013 2:26 AM
> To: Inki Dae
> Cc: maarten.lankhorst at canonical.com; daniel at ffwll.ch; robdclark at gmail.com;
> linux-fbdev at vger.kernel.org; dri-devel at lists.freedesktop.org;
> kyungmin.park at samsung.com; myungjoo.ham at samsung.com; yj44.cho at samsung.com;
> linux-arm-kernel at lists.infradead.org; linux-media at vger.kernel.org
> Subject: Re: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization
> framework
>
> On Thu, Jun 13, 2013 at 05:28:08PM +0900, Inki Dae wrote:
> > This patch adds a buffer synchronization framework based on DMA BUF[1]
> > and the reservation object[2] for managing dma-buf resources, and on
> > ww-mutexes[3] for the locking mechanism.
> >
> > The purpose of this framework is not only to couple cache operations
> > and buffer access control between CPU and DMA, but also to provide
> > easy-to-use interfaces for device drivers and potentially for user
> > applications (not implemented for user applications yet). The framework
> > can be used by all DMA devices that use system memory as a DMA buffer,
> > which is the case for most ARM based SoCs.
> >
> > The mechanism of this framework has the following steps,
> >     1. Register dmabufs to a sync object - A task gets a new sync object
> >     and can add one or more dmabufs that the task wants to access.
> >     This registration should be performed when a device context or an
> >     event context such as a page flip event is created, or before the
> >     CPU accesses a shared buffer.
> >
> >         dma_buf_sync_get(a sync object, a dmabuf);
> >
> >     2. Lock a sync object - A task tries to lock all dmabufs added to
> >     its own sync object. The lock mechanism uses ww-mutex[3] to avoid
> >     deadlock issues and race conditions between CPU and CPU, CPU and
> >     DMA, and DMA and DMA. Taking a lock means that others cannot access
> >     the locked dmabufs until the task that locked them unlocks them all.
> >     This locking should be performed before DMA or CPU accesses these
> >     dmabufs.
> >
> >         dma_buf_sync_lock(a sync object);
> >
> >     3. Unlock a sync object - The task unlocks all dmabufs added to its
> >     own sync object. The unlock means that the DMA or CPU accesses to
> >     the dmabufs have been completed, so others may access them.
> >     This unlocking should be performed after DMA or CPU has completed
> >     its accesses to the dmabufs.
> >
> >         dma_buf_sync_unlock(a sync object);
> >
> >     4. Unregister one or all dmabufs from a sync object - A task
> >     unregisters the given dmabufs from the sync object. This means that
> >     the task no longer wants to lock the dmabufs.
> >     The unregistering should be performed after DMA or CPU has completed
> >     its accesses to the dmabufs, or when dma_buf_sync_lock() fails.
> >
> >         dma_buf_sync_put(a sync object, a dmabuf);
> >         dma_buf_sync_put_all(a sync object);
> >
> >     The described steps may be summarized as:
> >         get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put
> >
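> > For illustration, a device driver might use the whole sequence roughly
> > as in the sketch below (hypothetical driver code, not taken from the
> > patch; it also shows the error path where the registered dmabufs are
> > simply put again if the lock cannot be taken):
> >
> >     struct dmabuf_sync *sync;
> >     int ret;
> >
> >     sync = dmabuf_sync_init(NULL, "xxx driver");
> >
> >     /* Register every dmabuf this job will touch. */
> >     ret = dmabuf_sync_get(sync, dmabuf,
> >                           DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA);
> >     if (ret < 0)
> >             goto out_fini;
> >
> >     /* Take the locks before the hardware or the CPU touches the buffers. */
> >     ret = dmabuf_sync_lock(sync);
> >     if (ret < 0)
> >             goto out_put;
> >
> >     /* ... CPU or DMA accesses the locked buffers here ... */
> >
> >     dmabuf_sync_unlock(sync);
> >
> > out_put:
> >     dmabuf_sync_put_all(sync);
> > out_fini:
> >     dmabuf_sync_fini(sync);
> >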
> > This framework includes the following two features.
> >     1. Read (shared) and write (exclusive) locks - A task is required
> >     to declare the access type when it registers a dmabuf:
> >     READ, WRITE, READ DMA, or WRITE DMA.
> >
> >     Below is example code:
> >         struct dmabuf_sync *sync;
> >
> >         sync = dmabuf_sync_init(NULL, "test sync");
> >
> >         dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> >         ...
> >
> >     And the following can be used as access types:
> >         DMA_BUF_ACCESS_READ,
> >         - CPU will access a buffer for read.
> >         DMA_BUF_ACCESS_WRITE,
> >         - CPU will access a buffer for read or write.
> >         DMA_BUF_ACCESS_READ | DMA_BUF_ACCESS_DMA,
> >         - DMA will access a buffer for read.
> >         DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA,
> >         - DMA will access a buffer for read or write.
> >
> >     2. Mandatory resource releasing - a task cannot hold a lock
> >     indefinitely. A task might never unlock a buffer after locking it.
> >     In this case, a timer handler for the corresponding sync object is
> >     called after five seconds (the default), and the timed-out buffer is
> >     then unlocked by a workqueue handler to avoid lockups and to forcibly
> >     release the resources held by the buffer.
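> >
> >     A minimal sketch of how such a timeout could be armed when the locks
> >     are taken (hypothetical code; the actual handling lives in
> >     dmabuf-sync.c further below, and the timer field of struct
> >     dmabuf_sync is assumed here):
> >
> >         static void dmabuf_sync_timeout(unsigned long arg)
> >         {
> >                 struct dmabuf_sync *sync = (struct dmabuf_sync *)arg;
> >
> >                 /* Timer context is atomic; defer the unlock to a workqueue. */
> >                 schedule_work(&sync->work);
> >         }
> >
> >         setup_timer(&sync->timer, dmabuf_sync_timeout, (unsigned long)sync);
> >         mod_timer(&sync->timer, jiffies + MAX_SYNC_TIMEOUT * HZ);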
> >
> > Below is how to use it:
> >     1. Allocate and initialize a sync object:
> >         struct dmabuf_sync *sync;
> >
> >         sync = dmabuf_sync_init(NULL, "test sync");
> >         ...
> >
> >     2. Add a dmabuf to the sync object when setting up the DMA-buffer
> >        related registers:
> >         dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> >         ...
> >
> >     3. Lock all dmabufs of the sync object before DMA or CPU accesses
> >        the dmabufs:
> >         dmabuf_sync_lock(sync);
> >         ...
> >
> >     4. Now CPU or DMA can access all dmabufs locked in step 3.
> >
> >     5. Unlock all dmabufs added to the sync object after the DMA or CPU
> >        access to these dmabufs is completed:
> >         dmabuf_sync_unlock(sync);
> >
> >     And call the following functions to release all resources:
> >         dmabuf_sync_put_all(sync);
> >         dmabuf_sync_fini(sync);
> >
> > You can refer to actual example code:
> >     https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dmabuf-sync&id=4030bdee9bab5841ad32faade528d04cc0c5fc94
> >
> >     https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dmabuf-sync&id=6ca548e9ea9e865592719ef6b1cde58366af9f5c
> >
> > The framework performs cache operations based on the previous and current
> > access types to the dmabufs after the locks to all dmabufs are taken:
> >     Call dma_buf_begin_cpu_access() to invalidate the cache if
> >         the previous access type is DMA_BUF_ACCESS_WRITE | DMA and
> >         the current access type is DMA_BUF_ACCESS_READ.
> >
> >     Call dma_buf_end_cpu_access() to clean the cache if
> >         the previous access type is DMA_BUF_ACCESS_WRITE and
> >         the current access type is DMA_BUF_ACCESS_READ | DMA.
> >
> > Such cache operations are invoked via the dma-buf interfaces, so the
> > dma-buf exporter should implement the dmabuf->ops->begin_cpu_access and
> > end_cpu_access callbacks.
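> >
> > As a rough illustration, on a non-coherent ARM SoC an exporter's
> > callbacks could be implemented along these lines (hypothetical exporter;
> > struct xxx_buffer with its dev/sgt fields is assumed, and the
> > start/len/direction callback signatures follow the dma-buf API of this
> > kernel version):
> >
> >     static int xxx_dmabuf_begin_cpu_access(struct dma_buf *dmabuf,
> >                                            size_t start, size_t len,
> >                                            enum dma_data_direction dir)
> >     {
> >             struct xxx_buffer *buf = dmabuf->priv;
> >
> >             /* Invalidate CPU caches so the CPU sees what the device wrote. */
> >             dma_sync_sg_for_cpu(buf->dev, buf->sgt->sgl, buf->sgt->nents, dir);
> >             return 0;
> >     }
> >
> >     static void xxx_dmabuf_end_cpu_access(struct dma_buf *dmabuf,
> >                                           size_t start, size_t len,
> >                                           enum dma_data_direction dir)
> >     {
> >             struct xxx_buffer *buf = dmabuf->priv;
> >
> >             /* Clean CPU caches so the device sees what the CPU wrote. */
> >             dma_sync_sg_for_device(buf->dev, buf->sgt->sgl, buf->sgt->nents, dir);
> >     }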
> >
> > [1] http://lwn.net/Articles/470339/
> > [2] http://lwn.net/Articles/532616/
> > [3] https://patchwork-mail1.kernel.org/patch/2625321/
> >
> > Signed-off-by: Inki Dae <inki.dae at samsung.com>
> > ---
> > Documentation/dma-buf-sync.txt | 246 ++++++++++++++++++
> > drivers/base/Kconfig | 7 +
> > drivers/base/Makefile | 1 +
> > drivers/base/dmabuf-sync.c | 555 ++++++++++++++++++++++++++++++++++++++++
> > include/linux/dma-buf.h | 5 +
> > include/linux/dmabuf-sync.h | 115 +++++++++
> > include/linux/reservation.h | 7 +
> > 7 files changed, 936 insertions(+), 0 deletions(-)
> > create mode 100644 Documentation/dma-buf-sync.txt
> > create mode 100644 drivers/base/dmabuf-sync.c
> > create mode 100644 include/linux/dmabuf-sync.h
> >
> > diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-
> sync.txt
> > new file mode 100644
> > index 0000000..e71b6f4
> > --- /dev/null
> > +++ b/Documentation/dma-buf-sync.txt
> > @@ -0,0 +1,246 @@
> > + DMA Buffer Synchronization Framework
> > + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > + Inki Dae
> > + <inki dot dae at samsung dot com>
> > + <daeinki at gmail dot com>
> > +
> > +This document is a guide for device-driver writers describing the DMA
> > +buffer synchronization API. It also describes how to use the API for
> > +buffer synchronization between CPU and CPU, CPU and DMA, and DMA and DMA.
> > +
> > +The DMA buffer synchronization API provides a buffer synchronization
> > +mechanism; i.e., buffer access control between CPU and DMA, cache
> > +operations, and easy-to-use interfaces for device drivers and potentially
> > +for user applications (not implemented for user applications yet). The
> > +API can be used by all DMA devices that use system memory as a DMA
> > +buffer, which is the case for most ARM based SoCs.
> > +
> > +
> > +Motivation
> > +----------
> > +
> > +When sharing a buffer, a device cannot be aware of when another device
> > +will access the shared buffer: a device may access a buffer containing
> > +wrong data if it accesses the shared buffer while another device is
> > +still accessing it. Therefore, a user process should wait for the
> > +completion of DMA access by another device before a device tries to
> > +access the shared buffer.
> > +
> > +Besides, the same issue exists when CPU and DMA share a buffer; i.e.,
> > +a user process has to consider this when it sends a buffer to a device
> > +driver for the device driver to access the buffer as input.
> > +This means that a user process needs to understand how the device driver
> > +works. Hence, the conventional mechanism not only makes user applications
> > +complicated but also incurs performance overhead, because the
> > +conventional mechanism cannot control devices precisely without
> > +additional and complex implementations.
> > +
> > +In addition, in the case of ARM based SoCs, most devices have no hardware
> > +cache coherency mechanism between CPU and DMA devices because they do not
> > +use the ACP (Accelerator Coherency Port). The ACP can be connected to a
> > +DMA engine or similar device in order to keep cache coherency between the
> > +CPU cache and the DMA device.
> > +Thus, we need additional cache operations to have the devices operate
> > +properly; i.e., user applications should request cache operations from
> > +the kernel before DMA accesses the buffer and after the completion of
> > +buffer access by the CPU, or vice versa.
> > +
> > + buffer access by CPU -> cache clean -> buffer access by DMA
> > +
> > +Or,
> > + buffer access by DMA -> cache invalidate -> buffer access by CPU
> > +
> > +The below shows why cache operations should be requested by the user
> > +process,
> > +     (Presume that CPU and DMA share a buffer and the buffer is mapped
> > +     into user space as cacheable)
> > +
> > + handle = drm_gem_alloc(size);
> > + ...
> > +     va1 = drm_gem_mmap(handle);
> > + va2 = malloc(size);
> > + ...
> > +
> > + while(conditions) {
> > + memcpy(va1, some data, size);
> > + ...
> > + drm_xxx_set_dma_buffer(handle, ...);
> > + ...
> > +
> > +             /* The user needs to request a cache clean here. */
> > +
> > + /* blocked until dma operation is completed. */
> > + drm_xxx_start_dma(...);
> > + ...
> > +
> > +             /* The user needs to request a cache invalidate here. */
> > +
> > + memcpy(va2, va1, size);
> > + }
> > +
> > +The issue is that user processes may incur excessive cache operations:
> > +user processes may request unnecessary cache operations from the kernel,
> > +and the kernel cannot prevent user processes from requesting such cache
> > +operations. Therefore, we need to prevent such excessive and unnecessary
> > +cache operations by user processes.
> > +
> > +
> > +Basic concept
> > +-------------
> > +
> > +The mechanism of this framework has the following steps,
> > +    1. Register dmabufs to a sync object - A task gets a new sync object
> > +    and can add one or more dmabufs that the task wants to access.
> > +    This registration should be performed when a device context or an
> > +    event context such as a page flip event is created, or before the
> > +    CPU accesses a shared buffer.
> > +
> > +            dma_buf_sync_get(a sync object, a dmabuf);
> > +
> > +    2. Lock a sync object - A task tries to lock all dmabufs added to
> > +    its own sync object. The lock mechanism uses ww-mutex[1] to avoid
> > +    deadlock issues and race conditions between CPU and CPU, CPU and
> > +    DMA, and DMA and DMA. Taking a lock means that others cannot access
> > +    the locked dmabufs until the task that locked them unlocks them all.
> > +    This locking should be performed before DMA or CPU accesses these
> > +    dmabufs.
> > +
> > +            dma_buf_sync_lock(a sync object);
> > +
> > +    3. Unlock a sync object - The task unlocks all dmabufs added to its
> > +    own sync object. The unlock means that the DMA or CPU accesses to
> > +    the dmabufs have been completed, so others may access them.
> > +    This unlocking should be performed after DMA or CPU has completed
> > +    its accesses to the dmabufs.
> > +
> > +            dma_buf_sync_unlock(a sync object);
> > +
> > +    4. Unregister one or all dmabufs from a sync object - A task
> > +    unregisters the given dmabufs from the sync object. This means that
> > +    the task no longer wants to lock the dmabufs.
> > +    The unregistering should be performed after DMA or CPU has completed
> > +    its accesses to the dmabufs, or when dma_buf_sync_lock() fails.
> > +
> > +            dma_buf_sync_put(a sync object, a dmabuf);
> > +            dma_buf_sync_put_all(a sync object);
> > +
> > +    The described steps may be summarized as:
> > +            get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put
> > +
> > +This framework includes the following two features.
> > +    1. Read (shared) and write (exclusive) locks - A task is required
> > +    to declare the access type when it registers a dmabuf:
> > +    READ, WRITE, READ DMA, or WRITE DMA.
> > +
> > +    Below is example code:
> > +            struct dmabuf_sync *sync;
> > +
> > +            sync = dmabuf_sync_init(NULL, "test sync");
> > +
> > +            dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> > +            ...
> > +
> > +    2. Mandatory resource releasing - a task cannot hold a lock
> > +    indefinitely. A task might never unlock a buffer after locking it.
> > +    In this case, a timer handler for the corresponding sync object is
> > +    called after five seconds (the default), and the timed-out buffer is
> > +    then unlocked by a workqueue handler to avoid lockups and to forcibly
> > +    release the resources held by the buffer.
> > +
> > +
> > +Access types
> > +------------
> > +
> > +DMA_BUF_ACCESS_READ - CPU will access a buffer for read.
> > +DMA_BUF_ACCESS_WRITE - CPU will access a buffer for read or write.
> > +DMA_BUF_ACCESS_READ | DMA_BUF_ACCESS_DMA - DMA will access a buffer
> > +     for read.
> > +DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA - DMA will access a buffer
> > +     for read or write.
> > +
> > +
> > +API set
> > +-------
> > +
> > +bool is_dmabuf_sync_supported(void)
> > +    - Check whether dmabuf sync is supported or not.
> > +
> > +struct dmabuf_sync *dmabuf_sync_init(void *priv, const char *name)
> > +    - Allocate and initialize a new sync object. The caller gets a new
> > +      sync object for buffer synchronization. priv is used to set the
> > +      caller's private data and name is the name of the sync object.
> > +
> > +void dmabuf_sync_fini(struct dmabuf_sync *sync)
> > +    - Release all resources of the sync object.
> > +
> > +int dmabuf_sync_get(struct dmabuf_sync *sync, void *sync_buf,
> > +                    unsigned int type)
> > +    - Add a dmabuf to a sync object. The caller can group multiple
> > +      dmabufs by calling this function several times. Internally, this
> > +      function also takes a reference to the dmabuf.
> > +
> > +void dmabuf_sync_put(struct dmabuf_sync *sync, struct dma_buf *dmabuf)
> > +    - Remove a given dmabuf from a sync object. Internally, this
> > +      function also releases every reference to the given dmabuf.
> > +
> > +void dmabuf_sync_put_all(struct dmabuf_sync *sync)
> > +    - Remove all dmabufs added to a sync object. Internally, this
> > +      function also releases every reference to the dmabufs of the
> > +      sync object.
> > +
> > +int dmabuf_sync_lock(struct dmabuf_sync *sync)
> > +    - Lock all dmabufs added to a sync object. The caller should call
> > +      this function prior to CPU or DMA access to the dmabufs so that
> > +      others cannot access the dmabufs. Internally, this function
> > +      avoids deadlock issues by using ww-mutex.
> > +
> > +int dmabuf_sync_unlock(struct dmabuf_sync *sync)
> > +    - Unlock all dmabufs added to a sync object. The caller should call
> > +      this function after CPU or DMA access to the dmabufs is completed
> > +      so that others can access the dmabufs.
> > +
> > +
> > +Tutorial
> > +--------
> > +
> > +1. Allocate and initialize a sync object:
> > +    struct dmabuf_sync *sync;
> > +
> > +    sync = dmabuf_sync_init(NULL, "test sync");
> > +    ...
> > +
> > +2. Add a dmabuf to the sync object when setting up the DMA-buffer
> > +   related registers:
> > +    dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> > +    ...
> > +
> > +3. Lock all dmabufs of the sync object before DMA or CPU accesses
> > +   the dmabufs:
> > +    dmabuf_sync_lock(sync);
> > +    ...
> > +
> > +4. Now CPU or DMA can access all dmabufs locked in step 3.
> > +
> > +5. Unlock all dmabufs added to the sync object after the DMA or CPU
> > +   access to these dmabufs is completed:
> > +    dmabuf_sync_unlock(sync);
> > +
> > +   And call the following functions to release all resources:
> > +    dmabuf_sync_put_all(sync);
> > +    dmabuf_sync_fini(sync);
> > +
> > +
> > +Cache operation
> > +---------------
> > +
> > +The framework performs cache operations based on the previous and
> > +current access types to the dmabufs after the locks to all dmabufs
> > +are taken:
> > +    Call dma_buf_begin_cpu_access() to invalidate the cache if
> > +            the previous access type is DMA_BUF_ACCESS_WRITE | DMA and
> > +            the current access type is DMA_BUF_ACCESS_READ.
> > +
> > +    Call dma_buf_end_cpu_access() to clean the cache if
> > +            the previous access type is DMA_BUF_ACCESS_WRITE and
> > +            the current access type is DMA_BUF_ACCESS_READ | DMA.
> > +
> > +Such cache operations are invoked via the dma-buf interfaces. Thus, the
> > +dma-buf exporter should implement the dmabuf->ops->begin_cpu_access and
> > +end_cpu_access callbacks.
> > +
> > +
> > +References:
> > +[1] https://patchwork-mail1.kernel.org/patch/2625321/
> > diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> > index 5ccf182..54a1d5a 100644
> > --- a/drivers/base/Kconfig
> > +++ b/drivers/base/Kconfig
> > @@ -212,6 +212,13 @@ config FENCE_TRACE
> > lockup related problems for dma-buffers shared across multiple
> > devices.
> >
> > +config DMABUF_SYNC
> > + bool "DMABUF Synchronization Framework"
> > + depends on DMA_SHARED_BUFFER
> > + help
> > +	  This option enables dmabuf sync framework for buffer
> > +	  synchronization between DMA and DMA, CPU and DMA, and CPU and CPU.
> > +
> > config CMA
> > bool "Contiguous Memory Allocator"
> > depends on HAVE_DMA_CONTIGUOUS && HAVE_MEMBLOCK
> > diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> > index 8a55cb9..599f6c90 100644
> > --- a/drivers/base/Makefile
> > +++ b/drivers/base/Makefile
> > @@ -11,6 +11,7 @@ obj-y += power/
> > obj-$(CONFIG_HAS_DMA) += dma-mapping.o
> > obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
> > obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o fence.o reservation.o
> > +obj-$(CONFIG_DMABUF_SYNC) += dmabuf-sync.o
> > obj-$(CONFIG_ISA) += isa.o
> > obj-$(CONFIG_FW_LOADER) += firmware_class.o
> > obj-$(CONFIG_NUMA) += node.o
> > diff --git a/drivers/base/dmabuf-sync.c b/drivers/base/dmabuf-sync.c
> > new file mode 100644
> > index 0000000..c8723a5
> > --- /dev/null
> > +++ b/drivers/base/dmabuf-sync.c
> > @@ -0,0 +1,555 @@
> > +/*
> > + * Copyright (C) 2013 Samsung Electronics Co.Ltd
> > + * Authors:
> > + * Inki Dae <inki.dae at samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or
> modify it
> > + * under the terms of the GNU General Public License as published by
> the
> > + * Free Software Foundation; either version 2 of the License, or (at
> your
> > + * option) any later version.
> > + *
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <linux/debugfs.h>
> > +#include <linux/uaccess.h>
> > +
> > +#include <linux/dmabuf-sync.h>
> > +
> > +#define MAX_SYNC_TIMEOUT 5 /* Second. */
> > +
> > +#define NEED_BEGIN_CPU_ACCESS(old, new) \
> > + (old->accessed_type == (DMA_BUF_ACCESS_WRITE | \
> > + DMA_BUF_ACCESS_DMA) && \
> > + (new->access_type == DMA_BUF_ACCESS_READ))
> > +
> > +#define NEED_END_CPU_ACCESS(old, new) \
> > + (old->accessed_type == DMA_BUF_ACCESS_WRITE && \
> > + ((new->access_type == (DMA_BUF_ACCESS_READ | \
> > + DMA_BUF_ACCESS_DMA)) || \
> > + (new->access_type == (DMA_BUF_ACCESS_WRITE | \
> > + DMA_BUF_ACCESS_DMA))))
> > +
> > +int dmabuf_sync_enabled = 1;
> > +
> > +MODULE_PARM_DESC(enabled, "Check if dmabuf sync is supported or not");
> > +module_param_named(enabled, dmabuf_sync_enabled, int, 0444);
> > +
> > +static void dmabuf_sync_timeout_worker(struct work_struct *work)
> > +{
> > +	struct dmabuf_sync *sync = container_of(work, struct dmabuf_sync,
> > +						work);
> > + struct dmabuf_sync_object *sobj;
> > +
> > + mutex_lock(&sync->lock);
> > +
> > + list_for_each_entry(sobj, &sync->syncs, head) {
> > + if (WARN_ON(!sobj->robj))
> > + continue;
> > +
> > +		printk(KERN_WARNING "%s: timeout = 0x%x [type = %d, " \
> > +					"refcnt = %d, locked = %d]\n",
> > +			sync->name, (u32)sobj->dmabuf,
> > +			sobj->access_type,
> > +			atomic_read(&sobj->robj->shared_cnt),
> > +			sobj->robj->locked);
> > +
> > + /* unlock only valid sync object. */
> > + if (!sobj->robj->locked)
> > + continue;
> > +
> > + if (sobj->robj->shared &&
> > + atomic_read(&sobj->robj->shared_cnt) > 1) {
> > + atomic_dec(&sobj->robj->shared_cnt);
>
> So, in my long standing complaints about atomic_t, this one really takes
> the biscuit. What makes you think that this is somehow safe?
>
> What happens if:
>
> shared_cnt = 2
> CPU0 CPU1
> atomic_read(&shared_cnt)
> atomic_read(&shared_cnt)
> atomic_dec(&shared_cnt)
> atomic_dec(&shared_cnt)
>
> Now, it's zero. That's not what the above code intends. You probably
> think that because it's called "atomic_*" it has some magical properties
> which saves you from stuff like this. I'm afraid it doesn't.
>
> sync->lock may save you from that, but if that's the case, why use
> atomic_t's anyway here because you're already in a protected region.
> But maybe not, because I see other uses of shared_cnt without this
> lock below.
It seems you are referring to the dmabuf_sync_lock_objs function. Right,
that function doesn't use sync->lock. However, ww_mutex_lock is used there
instead, so I think that path is also inside a protected region.
>
> I think you need to revisit this and think more carefully about how
> to deal with this counting. If you wish to continue using the
> atomic_* API, please take time to get familiar with it, and most
> importantly realise that virtually any sequence of:
>
> if some-condition-based-on atomic_read(&something)
> do something with atomic_*(&something)
>
> is a bug. Maybe take a look at atomic_add_unless() which can be used
> with negative values to decrement.
Thanks for your advice. Anyway, it is likely better to use
atomic_add_unless here even though the region is protected.
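
For example, something along these lines (just a sketch) instead of the
separate atomic_read()/atomic_dec() pair:

	/*
	 * Decrement shared_cnt only if it is still greater than 1,
	 * in a single atomic step.
	 */
	if (atomic_add_unless(&sobj->robj->shared_cnt, -1, 1))
		continue;

	/* shared_cnt was 1: last shared user, so fall through and unlock. */
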
Thanks,
Inki Dae