[Intel-gfx] [PATCH 0/3] Support 64 bpp half float formats
ville.syrjala at linux.intel.com
Fri Nov 30 14:15:53 UTC 2018
On Thu, Nov 29, 2018 at 09:39:52PM +0000, Strasser, Kevin wrote:
> Ville Syrjälä wrote:
> > On Wed, Nov 28, 2018 at 10:38:10PM -0800, Kevin Strasser wrote:
> >> This series defines new formats and adds a plane property to be used for
> >> floating point framebuffer content. Implementation is then added to i915.
> >> I have shared an IGT branch which adds test coverage for the new formats:
> >> https://github.com/strassek/xorg-intel-gpu-tools/tree/fp16
> > Looks about similar as what I had written. I wrote my half<->full
> > conversion thing from scratch which probably means it has more rounding
> > errors and whatnot. The speed of mine wasn't exactly stellar and looks
> > like your version probably has the same issue. So I was actually
> > thinking that using the sse<something> instructions meant for this
> > could provide a nice speedup. I guess we might want the pure C version
> > as a backup though. Hmm. Now I also seem to recall that I noticed
> > there being a compiler intrinsic even for single value half<->full
> > precision conversion. Did you look into using that (if I didn't imagine
> > it)?
> You are thinking of vcvtps2ph and vcvtph2ps. I haven't yet had a chance to
> give them a try, but I agree it seems like a good idea.
> > BTW I just rebased my fp16 for pre-icl platforms:
> > git://github.com/vsyrjala/linux.git fp16_scanout_2
> > Apart from the ivb/hsw w/a there isn't all that much unexpected
> > when it comes to fp16 on those platforms either.
> I don't mean to step on your toes with this series. Were you waiting for /
> working on a real use case before pushing that code?
I pretty much just did it so that I could test >10bpc gamma LUTs. But
I got sidetracked by other things so I didn't really get even that far.
Also another problem is that igt depends on cairo which didn't support
rendering at >10bpc, so I couldn't really test that stuff properly even
if I wanted to. Maarten has patches to wire up floats into cairo but I
think he just said that it still kinda uses 8bpc precision only :(
Anyways, the fact that you did icl and I did pre-icl is pretty good
division of labour. Sometimes things work out by accident :)
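
For reference, a minimal sketch of the single-value half<->full conversion
discussed above. It is not taken from either series or from IGT; the names
half_to_float() and float_to_half() are hypothetical. It assumes the F16C
intrinsics _cvtsh_ss()/_cvtss_sh() are usable whenever the compiler defines
__F16C__ (e.g. building with -mf16c), and otherwise falls back to pure C
with round-to-nearest-even.

```c
#include <stdint.h>
#include <string.h>

#ifdef __F16C__
#include <immintrin.h>
#endif

/* Hypothetical helper: raw IEEE 754 binary16 bits -> float. */
static float half_to_float(uint16_t h)
{
#ifdef __F16C__
	return _cvtsh_ss(h);
#else
	uint32_t sign = (uint32_t)(h & 0x8000) << 16;
	uint32_t exp = (h >> 10) & 0x1f;
	uint32_t mant = h & 0x3ff;
	uint32_t bits;
	float f;

	if (exp == 0x1f) {
		/* Inf or NaN: all-ones exponent, keep the mantissa payload */
		bits = sign | 0x7f800000 | (mant << 13);
	} else if (exp != 0) {
		/* Normal number: rebias exponent from 15 to 127 */
		bits = sign | ((exp + 112) << 23) | (mant << 13);
	} else if (mant == 0) {
		/* Signed zero */
		bits = sign;
	} else {
		/* Subnormal half: normalise into a normal float */
		uint32_t shift = 0;

		while (!(mant & 0x400)) {
			mant <<= 1;
			shift++;
		}
		mant &= 0x3ff;
		bits = sign | ((113 - shift) << 23) | (mant << 13);
	}

	memcpy(&f, &bits, sizeof(f));
	return f;
#endif
}

/* Hypothetical helper: float -> raw binary16 bits, round-to-nearest-even. */
static uint16_t float_to_half(float f)
{
#ifdef __F16C__
	return _cvtss_sh(f, _MM_FROUND_TO_NEAREST_INT);
#else
	uint32_t bits, exp, mant, rem;
	uint16_t sign, half;
	int e;

	memcpy(&bits, &f, sizeof(bits));
	sign = (bits >> 16) & 0x8000;
	exp = (bits >> 23) & 0xff;
	mant = bits & 0x7fffff;

	if (exp == 0xff) /* Inf stays Inf, NaN stays a (quiet) NaN */
		return sign | 0x7c00 | (mant ? 0x200 | (mant >> 13) : 0);

	e = (int)exp - 127 + 15; /* exponent rebiased for half */
	if (e >= 0x1f) /* too large for half: overflow to Inf */
		return sign | 0x7c00;

	if (e <= 0) {
		uint32_t shift, halfway;

		if (e < -10) /* below half the smallest subnormal: +-0 */
			return sign;

		/* Result is a half subnormal: shift in the implicit 1 */
		mant |= 0x800000;
		shift = 14 - e;
		halfway = 1u << (shift - 1);
		rem = mant & ((1u << shift) - 1);
		half = mant >> shift;
		if (rem > halfway || (rem == halfway && (half & 1)))
			half++;
		return sign | half;
	}

	half = sign | (e << 10) | (mant >> 13);
	rem = mant & 0x1fff;
	/* A mantissa carry rolls into the exponent, which is what we want */
	if (rem > 0x1000 || (rem == 0x1000 && (half & 1)))
		half++;
	return half;
#endif
}
```

Something like half_to_float(0x3C00) should give 1.0f on either path. The
packed vcvtph2ps/vcvtps2ph forms would presumably be the better fit for
converting whole rows of pixels, with the pure C path kept as the backup.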