[systemd-devel] [PATCH] Fix broken syscall(__NR_fanotify_mark... on 32bit mips.
David Daney
ddaney at caviumnetworks.com
Wed Apr 20 10:36:49 PDT 2011
On 04/20/2011 09:14 AM, Eric Paris wrote:
> On Wed, 2011-04-20 at 12:16 +0200, Aurelien Jarno wrote:
>> On Wed, Apr 20, 2011 at 03:22:08AM +0200, Lennart Poettering wrote:
>>> On Wed, 20.04.11 09:15, fykcee1 at gmail.com (fykcee1 at gmail.com) wrote:
>>>
>>>>
>>>> 2011/4/20 Lennart Poettering<lennart at poettering.net>
>>>>>
>>>>> On Thu, 14.04.11 17:34, fykcee1 at gmail.com (fykcee1 at gmail.com) wrote:
>>>>>
>>>>>> diff --git a/src/missing.h b/src/missing.h
>>>>>> index 35e209f..b367831 100644
>>>>>> --- a/src/missing.h
>>>>>> +++ b/src/missing.h
>>>>>> @@ -125,7 +125,12 @@ static inline int fanotify_init(unsigned int flags, unsigned int event_f_flags)
>>>>>>
>>>>>> static inline int fanotify_mark(int fanotify_fd, unsigned int flags, uint64_t mask,
>>>>>> int dfd, const char *pathname) {
>>>>>> - return syscall(__NR_fanotify_mark, fanotify_fd, flags, mask, dfd, pathname);
>>>>>> + if (sizeof(void *) == 4)
>>>>>> + return syscall(__NR_fanotify_mark, fanotify_fd, flags,
>>>>>> + *((uint32_t *)&mask), *((uint32_t *)&mask + 1),
>>>>>> + dfd, pathname);
>>>>>> + else /* 64bit */
>>>>>> + return syscall(__NR_fanotify_mark, fanotify_fd, flags, mask, dfd, pathname);
>>>>>> }
>>>>>>
>>>>>> #ifndef BTRFS_IOCTL_MAGIC
>>>>>
>>>>> Hmm, wouldn't this break x86-32?
>>>> It works fine on my asus laptop with a 32bit CPU Pentium-M.
>>>>
>>>> We pass two 32bit arguments instead of one 64bit argument. syscall()
>>>> doesn't know the prototype, I guess it will only fetch 9 native-width
>>>> arguments(1 syscall number + 8 parameters) from the stack. Aurelien,
>>>> am I right?
>>>
>>> Hmm, I am not sure I understand this? How is it possible that both ways
>>> work on x86? There must be quite a difference between passing 6 or 7
>>> arguments to syscall.
>>
>> syscall(64_bit_arg) actually pass the 64-bit argument in two different
>> registers (or on two different 32-bit stack entries). Then the GNU libc
>> copies the registers while converting from the userspace ABI to the
>> kernel ABI (usually passing more values into registers).
>>
>> On a lot of 32-bit cpus which needs alignment, a 64-bit arguments have
>> to be stack aligned, or register aligned. That's why due to the userspace
>> ABI, one register or one stack entry has to be empty. This later break
>> when this empty register arrives at the kernel level.
>>
>>> Or are you suggesting that the current invocation is broken even on x86,
>>> just doesn't become visible since only the lower 32 bit of "mask" are
>>> used?
>>>
>>
>> No, on x86 it just work "by chance", as x86 is very flexible with
>> alignement.
>
> When I implemented the kernel interface I remember someone brought up
> 64-bit alignment and it was suggest I go with the ordering I have. I
> understood that fanotify_fd would be the first 32 bits, flags be the
> second and then mask should be 64-bit aligned as the 3rd/4th?
>
> I am pretty arch stupid, and I have no idea at all if it is related, but
> I'm looking at kernel commit 5e844b31c2ace282ab8bea630b63e0212d9532d4
> which wires up the fanotify syscalls for mips. I see that it used a u64
> for a3 and a4 when these are only going to be 32 values. Does that
> matter?
It is correct, because it is what is expected by merge_64().
> I also see that some prototype such as sys32_readahead()
> explicitly added a pad into the 32 helper. Should we be doing that?
No.
The o32 ABI requires that 64-bit arguments be aligned starting in an
even argument register number. Since there are already two 32-bit
arguments, no padding is needed.
> I
> also notice that the naming of arguments seems wrong (although it
> shouldn't matter)
That was an unfortunate mistake. If someone wanted to, we could do:
s/a3/a2/
s/a4/a3/
That might reduce the confusing somewhat.
>
> SYSCALL_DEFINE6(32_fanotify_mark, int, fanotify_fd, unsigned int, flags,
> u64, a3, u64, a4, int, dfd, const char __user *, pathname)
>
> I would have thought equivalently:
> fanotify_fd == a0
> flags == a1
> 'a3' == a2
> 'a4' == a3
> dfd == a4
>
> etc.....
>
> Anyway, no idea if it is related, but I thought I designed the
> systemcall with this padding in mind.....
It is difficult to to call fanotify_mark() using the o32 syscall().
You would have to do something like this (untested):
int foo_fanotify_mark(int fanotify_fd, unsigned int flags, u64 mask, int
dfd, const char __user * pathname)
{
u32 mask_low = (u32)mask;
u32 mask_high = (u32)(mask >> 32);
return syscall(4337, fanotify_fd, flags, mask_low, mask_high, dfd,
pathname);
}
The order of mask_low, mask_high in the syscall argument list depends on
the endianness. Figuring out the correct order is left as an exercise
for the reader.
David Daney
More information about the systemd-devel
mailing list