IndexMaps with dim > 2 and the GPU

(Greg Cotten) #1

Hi all, directly implementing IndexMaps with dim > 2 (with the exception of HalfDomain) is hard to do efficiently on the GPU. You basically have two options:

  1. Directly implement IndexMaps on the GPU with a for loop (bad)
  2. Convert the IndexMap to what is effectively a 1D LUT. This can introduce imprecision unless the generated 1D LUT is very finely sampled. I think @nick mentioned that a 20-bit 1D LUT would be fine for Linear to ACEScc.
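
To make the tradeoff concrete, here is a rough CPU-side sketch of both options in Python. It is not GPU code and not taken from any implementation; the function names (`apply_index_map_direct`, `resample_to_uniform`, `apply_uniform_lut`) are hypothetical, and the IndexMap is assumed to be a sorted list of input positions with one output value per position.

```python
from bisect import bisect_right

def apply_index_map_direct(x, positions, values):
    """Option 1: search the non-uniform input positions for every sample.
    On a GPU this search turns into a per-pixel loop."""
    if x <= positions[0]:
        return values[0]
    if x >= positions[-1]:
        return values[-1]
    i = bisect_right(positions, x) - 1  # segment containing x
    t = (x - positions[i]) / (positions[i + 1] - positions[i])
    return values[i] + t * (values[i + 1] - values[i])

def resample_to_uniform(positions, values, size):
    """Option 2: bake the IndexMap into an equally spaced 1D LUT."""
    lo, hi = positions[0], positions[-1]
    return [
        apply_index_map_direct(lo + (hi - lo) * k / (size - 1), positions, values)
        for k in range(size)
    ]

def apply_uniform_lut(x, lut, lo, hi):
    """Constant-time lookup with linear interpolation, no search needed."""
    pos = (x - lo) / (hi - lo) * (len(lut) - 1)
    pos = min(max(pos, 0.0), len(lut) - 1.0)  # clamp to the LUT domain
    i = min(int(pos), len(lut) - 2)
    t = pos - i
    return lut[i] + t * (lut[i + 1] - lut[i])
```

With densely packed entries near zero (for example positions `[0.0, 0.01, 0.1, 1.0]`), a small uniform LUT skips right over the first segment, which is where the imprecision in option 2 comes from. Increasing `size` reduces the error at the cost of memory.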

My question is - should we forgo IndexMaps with dim > 2 except for HalfDomain? I’ve implemented a HalfDomain index map on the GPU and it was pretty straightforward.

(Scott Dyer) #2

The issue of IndexMaps came up again during last week’s call. Many have already shared thoughts on this issue, but we need to finalize what this group’s recommendation for changing the spec (if any) will be.

In a different thread, a few pros/cons of IndexMaps were already discussed:

Key points (according to my interpretation of the discussion):

  • We do not want to leave the spec as is - either require IndexMap or remove it entirely.
  • There are cases where arbitrarily spaced input entries can be useful.
  • There is also some value in being able to directly encapsulate features of any other LUT format (in this case, .csp).
    However, there are serious performance issues with arbitrarily spaced LUTs - so in practice they get cast into a suitably large equally spaced LUT.
  • An IndexMap could be converted to a halfDomain LUT, which is effectively a preset IndexMap with samples at every possible half-float code value. If the precision of half-float is acceptable, then by definition its code values are spaced closely enough. For higher precision inputs, values can be interpolated between the halfDomain samples.
  • Use cases for IndexMap with dim > 2 are even fewer now that we are planning to add a dedicated element for lin-to-log and log-to-lin conversions.
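
As an illustration of the halfDomain point above, here is a minimal Python sketch of a LUT indexed by half-float bit pattern. The helper names (`half_bits`, `bits_to_half`, `build_half_domain_lut`, `apply_half_domain`) are hypothetical, and passing Inf/NaN code values through unchanged is just an assumption made for this example. It only does nearest-code-value lookup; interpolating between halfDomain samples for higher precision inputs is not shown.

```python
import math
import struct

def half_bits(x):
    """Return the 16-bit pattern of x rounded to IEEE 754 half precision.
    struct raises OverflowError if x is too large for a half float."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def bits_to_half(bits):
    """Return the half-float value encoded by a 16-bit pattern."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]

def build_half_domain_lut(func):
    """Sample func at every half-float code value (65536 entries)."""
    lut = []
    for bits in range(65536):
        h = bits_to_half(bits)
        # Assumption for this sketch: Inf/NaN entries are passed through.
        lut.append(func(h) if math.isfinite(h) else h)
    return lut

def apply_half_domain(x, lut):
    """Look up x by its half-float bit pattern: no search, no IndexMap."""
    return lut[half_bits(x)]
```

Because the index is just the bit pattern of the input, the lookup costs the same no matter how unevenly the original samples were spaced.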

It feels to me like the group is leaning toward removing IndexMap from the specification, because halfDomain allows for an arbitrarily spaced LUT to be practically applied (“automated” conversion/compatibility from one to the other is the only thing lost).

Yet I get the feeling that some are still hesitant about whether removing it for the sake of ensuring compatibility is worth losing the potentially useful feature of arbitrarily spaced LUTs.