[Intel-gfx] [PATCH 4/4] drm/i915/bios: do not discard address space
Lucas De Marchi
lucas.demarchi at intel.com
Fri Nov 8 18:18:52 UTC 2019
On Fri, Nov 08, 2019 at 01:14:03PM +0200, Jani Nikula wrote:
>On Thu, 07 Nov 2019, Lucas De Marchi <lucas.demarchi at intel.com> wrote:
>> When we are mapping the VBT through pci_map_rom() we may not be allowed
>> to simply discard the address space and go on reading the memory. After
>> checking on my test system that dumping the rom via sysfs I could
>> actually get the correct vbt, I decided to change the implementation to
>> use the same approach, by calling memcpy_fromio().
>>
>> In order to avoid copying the entire oprom this implements a simple
>> memmem() searching for "$VBT". Contrary to the previous implementation
>> this also takes care of not issuing unaligned PCI reads that would
>> otherwise get translated into more even more reads. I also vaguely
>> remember unaligned reads failing in the past with some devices.
>>
>> Also make sure we copy only the VBT and not the entire oprom that is
>> usually much larger.
>
>So you have
>
>1. a fix to unaligned reads
unaligned io reads, yes
>
>2. an optimization to avoid reading individual bytes four times
it was by no means an optimization. Not reading the same byte 4 bytes is
there actually to stop doing the unaligned IO reads. You can't have (2)
without (1) unless you switch to ioreadb() and add a shift (which may
not be a bad idea.
>
>3. respecting __iomem and copying (I guess these are tied together)
>
>Seems to me that really should be at least three patches. Not
>necessarily in the above order.
(3) is actually the most important I think, so I will start by that.
>
>Follow-up: store pointer to the oprom vbt somewhere under i915->vbt, and
>have debugfs i915_vbt() handle that properly.
I don't think this is needed. The thing I'm doing here is the same as
what can be accomplished by reading the rom from sysfs:
find /sys/bus/pci/devices/*/ -name rom
... choose one
echo 1 > rom # to allow reading the rom
hexdump -C rom
>
>>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> ---
>> drivers/gpu/drm/i915/display/intel_bios.c | 95 +++++++++++++++++++----
>> 1 file changed, 79 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
>> index 671bbce6ba5b..c401e90b7cf1 100644
>> --- a/drivers/gpu/drm/i915/display/intel_bios.c
>> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
>> @@ -1806,31 +1806,88 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
>> return vbt;
>> }
>>
>> -static const struct vbt_header *find_vbt(void __iomem *oprom, size_t size)
>> +void __iomem *find_vbt(void __iomem *oprom, size_t size)
>> {
>> - size_t i;
>> + const u32 MAGIC = *((const u32 *)"$VBT");
>> + size_t done = 0, cur = 0;
>> + void __iomem *p;
>> + u8 buf[128];
>> + u32 val;
>>
>> - /* Scour memory looking for the VBT signature. */
>> - for (i = 0; i + 4 < size; i++) {
>> - void *vbt;
>> + /*
>> + * poor's man memmem() with sizeof(buf) window to avoid frequent
>> + * wrap-arounds and using u32 for comparison. This gives us 4
>> + * comparisons per ioread32() and avoids unaligned io reads (although it
>> + * still does unaligned cpu access).
>> + */
>
>If we're really worried about performance here, and use a local buffer
>to optimize the wraparounds, would it actually be more efficient to use
>memcpy_fromio() which has an arch specific implementation in asm?
Not really worried about performance. I actually did 3 implementations
that avoids the unaligned io reads.
1) this one
2) memcpy_fromio() to the local buffer + strnstr()
3) allocate a oprom buffer, memcpy_fromio() the entire rom and keep a
pointer to it. Then free the oprom after the vbt is used
(2) and (1) had basically the same complexity involved of requiring a
wrap around local buffer, so I went with (1)
I didn't feel confortable with (3) because it would allocate much more
memory than really needed.
>
>In any case makes you think you should first have the patch that the
>patch subject claims, fix unaligned reads and add optimizations
>next. This one does too much.
Again, it was not really meant to be an optimization.
Ville told me that we may not really need to deal with the unaligned
access and change the implementation to expect the VBT to be aligned.
This I would be the simplest way to change it, but I'm not fond on
changing this and breaking old systems usin it... anyway, we can give it
a try and revert if it breaks.
I will send a new version with the split.
Lucas De Marchi
>
>BR,
>Jani.
>
>
>
>> + for (p = oprom; p < oprom + size; p += 4) {
>> + *(u32 *)(&buf[done]) = ioread32(p);
>> + done += 4;
>>
>> - if (ioread32(oprom + i) != *((const u32 *)"$VBT"))
>> - continue;
>> + while (cur + 4 <= done) {
>> + val = *(u32 *)(buf + cur);
>> + if (val == MAGIC)
>> + return p - (done - cur) + 4;
>>
>> - /*
>> - * This is the one place where we explicitly discard the address
>> - * space (__iomem) of the BIOS/VBT.
>> - */
>> - vbt = (void __force *)oprom + i;
>> - if (intel_bios_is_valid_vbt(vbt, size - i))
>> - return vbt;
>> + cur++;
>> + }
>>
>> - break;
>> + /* wrap-around */
>> + if (done + 4 >= sizeof(buf)) {
>> + buf[0] = buf[done - 3];
>> + buf[1] = buf[done - 2];
>> + buf[2] = buf[done - 1];
>> + cur = 0;
>> + done = 3;
>> + }
>> }
>>
>> + /* Read the entire oprom and no VBT found */
>> return NULL;
>> }
>>
>> +/*
>> + * Copy vbt to a new allocated buffer and update @psize to match the VBT
>> + * size
>> + */
>> +static struct vbt_header *copy_vbt(void __iomem *oprom, size_t *psize)
>> +{
>> + off_t vbt_size_offset = offsetof(struct vbt_header, vbt_size);
>> + struct vbt_header *vbt;
>> + void __iomem *p;
>> + u16 vbt_size;
>> + size_t size;
>> +
>> + size = *psize;
>> + p = find_vbt(oprom, size);
>> + if (!p)
>> + return NULL;
>> +
>> + size -= p - oprom;
>> +
>> + /*
>> + * We need to at least be able to read the size and make sure it doesn't
>> + * overflow the oprom. The rest will be validated later.
>> + */
>> + if (sizeof(*vbt) > size) {
>> + DRM_DEBUG_DRIVER("VBT header incomplete\n");
>> + return NULL;
>> + }
>> +
>> + vbt_size = ioread16(p + vbt_size_offset);
>> + if (vbt_size > size) {
>> + DRM_DEBUG_DRIVER("VBT incomplete\n");
>> + return NULL;
>> + }
>> +
>> + vbt = kmalloc(vbt_size, GFP_KERNEL);
>> + memcpy_fromio(vbt, p, vbt_size);
>> +
>> + *psize = vbt_size;
>> +
>> + return vbt;
>> +}
>> +
>> /**
>> * intel_bios_init - find VBT and initialize settings from the BIOS
>> * @dev_priv: i915 device instance
>> @@ -1861,10 +1918,13 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
>> if (!oprom)
>> goto out;
>>
>> - vbt = find_vbt(oprom, size);
>> + vbt = copy_vbt(oprom, &size);
>> if (!vbt)
>> goto out;
>>
>> + if (!intel_bios_is_valid_vbt(vbt, size))
>> + goto out;
>> +
>> DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
>> }
>>
>> @@ -1897,6 +1957,9 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
>>
>> if (oprom)
>> pci_unmap_rom(pdev, oprom);
>> +
>> + if (vbt != dev_priv->opregion.vbt)
>> + kfree(vbt);
>> }
>>
>> /**
>
>--
>Jani Nikula, Intel Open Source Graphics Center
More information about the Intel-gfx
mailing list