[Intel-xe] [PATCH v3 1/6] drm/xe/hwmon: Add HWMON infrastructure

Rodrigo Vivi rodrigo.vivi at kernel.org
Fri Aug 11 18:48:59 UTC 2023


On Fri, Aug 11, 2023 at 10:39:00AM -0700, Guenter Roeck wrote:
> On 8/11/23 09:01, Rodrigo Vivi wrote:
> > On Tue, Aug 08, 2023 at 03:07:43PM -0700, Guenter Roeck wrote:
> > > On 8/8/23 14:31, Rodrigo Vivi wrote:
> > > > On Fri, Aug 04, 2023 at 08:06:22PM +0530, Nilawar, Badal wrote:
> > > > > 
> > > > > 
> > > > > On 04-08-2023 19:56, Guenter Roeck wrote:
> > > > > > On 8/4/23 06:19, Nilawar, Badal wrote:
> > > > > > > 
> > > > > > > Hi Guenter,
> > > > > > > On 03-08-2023 04:42, Guenter Roeck wrote:
> > > > > > > > On 8/2/23 15:40, Andi Shyti wrote:
> > > > > > > > > Hi Badal,
> > > > > > > > > 
> > > > > > > > > [...]
> > > > > > > > > 
> > > > > > > > > > +struct xe_hwmon_data {
> > > > > > > > > > +    struct device *hwmon_dev;
> > > > > > > > > > +    struct xe_gt *gt;
> > > > > > > > > > +    char name[12];
> > > > > > > > > > +};
> > > > > > > > > > +
> > > > > > > > > > +struct xe_hwmon {
> > > > > > > > > > +    struct xe_hwmon_data ddat;
> > > > > > > > > > +    struct mutex hwmon_lock;
> > > > > > > > > > +};
> > > > > > > > > 
> > > > > > > > > why do we need two structures here? Can we merge them?
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > A later patch adds multiple hwmon devices which makes use of it.
> > > > > > > > I think that is flawed, and I am not inclined to accept it.
> > > > > > > Is there any obvious reason that there shouldn't be multiple
> > > > > > > devices? In i915 we are doing the same.
> > > > > > > https://patchwork.freedesktop.org/patch/497324/?series=104278&rev=3
> > > > > > > 
> > > > > > 
> > > > > > Technically you can do whatever you like as long as the code doesn't reside
> > > > > > in drivers/hwmon. I won't NACK it, but I won't give it a Reviewed-by:
> > > > > > either. i915 shouldn't do it, but I didn't realize what they are doing
> > > > > > at the time. Other drivers doing it wrong is not an argument. You can't
> > > > > > argue that you may drive faster than the speed limit because others do it
> > > > > > or because police didn't stop you last time you did either.
> > > > > > 
> > > > > > One chip, one hwmon device. Do you have separate parent devices for
> > > > > > all your hwmon devices ? If yes, you can argue that having multiple hwmon
> > > > > > devices make sense. If not, you can't.
> > > > > Thanks for clarification. There is only one parent device. So will try to
> > > > > accommodate single hwmon device.
> > > > 
> > > > Well, it is one PCI device, but under 1 pci device we can have multiple "tiles"
> > > > that can duplicate many components. Inside each tile we can even have multiple
> > > > "gt"s.
> > > > 
> > > > But back to the tile, each tile has its own metrics. It's own power delivery,
> > > > own sensors and all. They can even be seen as independent devices from this
> > > > angle.
> > > > 
> > > > I'm afraid that the attempt to put everything as one device, but all the
> > > > entries duplicated per tile/gt we might end up with a messed api.
> > > > 
> > > 
> > > Your argument does not make sense. I am not asking to duplicate anything.
> > 
> > Okay, I'm sorry, maybe 'duplication' was a bad choice of words from my part.
> > 
> > You had told that having multiple hwmon device for a single chip was not
> > acceptable.
> > 
> > But I'm trying to explain that we have a hardware architecture where the graphics
> > is duplicated in 'tiles' inside the same PCI card. Each tile with its
> > own sensors and monitoring systems. And also an extra sensors monitoring the
> > entire 'package' that includes the tiles and the SoC.
> > So 1 hwmon device per gt-tile and package sound the appropriated way to me.
> > 
> 
> No, it isn't. Next you are going to tell me to split CPU temperature devices
> in the same way because they are split in "tiles" on the same CPU core.

okay, let's align with coretemp then.

> 
> > Your lines had convinced Badal to get them all and merge in a single hwmon
> > device. If we do this, the API will get messed up.
> > 
> > And this is what I meant by 'messed up':
> > quoting Badal:
> > """
> > With single device energy entries will look like hwmonxx/energy1_input,
> > energy2_input, energy3_input.
> > To identify which entry for what need to expose additional entry energyX_lable
> > which will contain ("package", "gtN")
> 
> So what is the problem with that ? That is a description and not "messed up".

>From the user space perspective it would be easier to get an unique handle
for the subdevice (directory) and then inspect each property (files) with single
and direct access, without having to inspect every single 'label' file of that
property in the directory to match the desired sub-device.

But nevermind. Let's have a single hwmon per device with multiple label files
and aligning with cputemp.

> 
> Guenter
> 


More information about the Intel-xe mailing list