[RFC PATCH v2] Utilize the PCI API in the TTM framework.

Wed Jan 12 01:12:14 PST 2011

Hi, Konrad.

This discussion has become a bit lengthy. I'll filter out the sorted-out 
stuff, which leaves me with two remaining issues:

On 01/11/2011 04:55 PM, Konrad Rzeszutek Wilk wrote:
>
> So at the end we have 16GB taken from 8GB->24GB, and 320MB taken from
> 0->4GB. When you start allocating coherent memory from each guest
> (and yeah, say we use 2GB each), we end up with the first guest getting
> the 2GB, the second getting 1.7GB, and then the next two getting zil.
>
> You still have GFP_KERNEL memory in each guest - the first one has 2GB left,
> the second 2.3GB, and the next two have 4GB each.
>
> From the hypervisor pool perspective, the 0-4GB zone is exhausted, so
> is the 8GB->24GB, but it still has 4GB->8GB free - so it can launch one more
> guest (but without PCI passthrough devices).
>
>    
>> On a 4GB machine or less, that would be the same as kernel memory.
>> Now, if 4 guests think they can allocate 2GB of coherent memory
>> each, you might run out of kernel memory on the host?
>>      
> So host in this case refers to the Hypervisor and it does not care
> about the DMA or what - it does not have any device drivers(*) or such.
> The first guest (dom0) is the one that deals with the device drivers.
>
> *: It has one: the serial port, but that is not really that important
> for this discussion.
>    
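
To make the arithmetic in the quoted example concrete, here is a rough
sketch. It assumes four guests of 4GB each, a 0-4GB zone with 320MB already
taken, first-come-first-served allocation, and that coherent pages are
swapped in for a guest's own pages (so they reduce what the guest has left).
The function name and the model are illustrative only, not how Xen
actually does its accounting.

```python
MB_PER_GB = 1024


def allocate_coherent(pool_free_mb, guest_mem_mb, requests_mb):
    """Serve coherent-memory requests from a shared below-4GB pool.

    Hypothetical model: guests are served in order, each gets as much of
    its request as the pool still holds, and the granted amount is
    subtracted from that guest's remaining ordinary memory.
    """
    results = []
    for request in requests_mb:
        granted = min(request, pool_free_mb)
        pool_free_mb -= granted
        results.append((granted, guest_mem_mb - granted))
    return results, pool_free_mb


# 0-4GB zone with 320MB already taken; four guests asking for 2GB each.
pool_free = 4 * MB_PER_GB - 320
results, left = allocate_coherent(pool_free, 4 * MB_PER_GB, [2 * MB_PER_GB] * 4)

for i, (granted, remaining) in enumerate(results):
    print(f"guest {i}: coherent {granted / MB_PER_GB:.1f}GB, "
          f"other memory left {remaining / MB_PER_GB:.1f}GB")
print(f"0-4GB zone left: {left}MB")
```

Under these assumptions the first two guests drain the zone and the last
two get nothing, which is the situation the question below starts from.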

Let's assume we're at the point where the hypervisor (or host) has exhausted 
the 0-4GB zone, due to the guests' coherent memory allocations, and that the 
physical machine has 4GB of memory, all in the 0-4GB zone. Now if the 
hypervisor were running on a Linux kernel, there would be no more 
GFP_KERNEL memory available on the *host* (hypervisor), and the 
hypervisor would crash. Now I don't know much about Xen, but it might be 
that this is not a problem with Xen at all?

>>
>> Another thing that I was thinking of is what happens if you have a
>> huge gart and allocate a lot of coherent memory. Could that
>> potentially exhaust IOMMU resources?
>>      
> <scratches his head>
>    

I need to be more specific. Let's assume we're on "bare metal", and we 
want to allocate 4GB of coherent memory. For most IOMMUs that would mean, 
as you previously stated, that we actually allocate GFP_DMA32 memory. But 
for some IOMMUs that would perhaps mean that we allocate *any* memory 
and set up a permanent DMA mapping in the IOMMU for the coherent pages. 
What if, in such a case, the IOMMU can only set up 2GB of coherent memory?

Or in short, can there *ever* be "bare metal" cases where the amount of 
coherent memory available is less than DMA32 memory available?

Thanks,
Thomas