The Difference In Optimizations Between NIR & GLSL
One of the biggest additions to Mesa so far this year has been the introduction of NIR, the new intermediate representation intended to replace GLSL IR and designed by a bright student fresh out of high school.
The Intel driver has begun using NIR by default, the Freedreno driver has NIR support, and the Raspberry Pi VC4 Gallium3D driver has also been working on NIR support, making these the initial "customers" of this new intermediate representation.
One of the advantages advertised from the get-go for this intermediate representation is that more optimizations can be shared across drivers, and in a better way than with the current GLSL IR. So what's the difference? Fortunately, Matt Turner of Intel today committed the same optimization to both GLSL IR and NIR, which indirectly does a nice job of demonstrating the difference.
The optimization done by Matt transforms a pow(x, 4) call into (x*x)*(x*x). The GLSL IR optimization basically comes down to:
is_vec_four(ir_constant *ir)
{
   return (ir == NULL) ? false : ir->is_value(4.0, 4);
}

if (is_vec_four(op_const[1]))
{
   ir_variable *x = new(ir) ir_variable(ir->operands[1]->type, "x",
                                        ir_var_temporary);
   base_ir->insert_before(x);
   base_ir->insert_before(assign(x, ir->operands[0]));

   ir_variable *squared = new(ir) ir_variable(ir->operands[1]->type,
                                              "squared",
                                              ir_var_temporary);
   base_ir->insert_before(squared);
   base_ir->insert_before(assign(squared, mul(x, x)));
   return mul(squared, squared);
}

Meanwhile, the same transformation in NIR is just one line:

(('fpow', a, 4.0), ('fmul', ('fmul', a, a), ('fmul', a, a))),
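
To give a feel for why the NIR version can be so short, here is a minimal Python sketch of a search-and-replace rewriter driven by a tuple like the one above. This is not Mesa's code: the helpers match, substitute, rewrite and evaluate are hypothetical names made up for illustration, expressions are modeled as plain nested tuples, and the sketch assumes the name a is simply bound to the string 'a' so that it can act as a pattern variable.

```python
# Toy search-and-replace rewriter. NOT Mesa's implementation; every helper
# name here is hypothetical and exists only for this illustration.

def match(pattern, expr, bindings):
    """Return True if expr fits pattern, recording variables in bindings."""
    if isinstance(pattern, str):
        # A bare string is a pattern variable such as 'a'. A repeated
        # variable must match the same sub-expression every time.
        if pattern in bindings:
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    if isinstance(pattern, tuple):
        # A tuple is (opcode, operand, ...): opcode and arity must agree.
        if not isinstance(expr, tuple) or len(expr) != len(pattern):
            return False
        if pattern[0] != expr[0]:
            return False
        return all(match(p, e, bindings)
                   for p, e in zip(pattern[1:], expr[1:]))
    # Anything else is a constant that must compare equal.
    return pattern == expr


def substitute(template, bindings):
    """Build the replacement expression from the matched variables."""
    if isinstance(template, str):
        return bindings[template]
    if isinstance(template, tuple):
        return (template[0],) + tuple(substitute(t, bindings)
                                      for t in template[1:])
    return template


def rewrite(expr, rules):
    """Rewrite operands first, then try each rule once on this node."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(e, rules) for e in expr[1:])
    for search, replace in rules:
        bindings = {}
        if match(search, expr, bindings):
            return substitute(replace, bindings)
    return expr


def evaluate(expr, env):
    """Tiny interpreter used only to check that a rewrite keeps the value."""
    if isinstance(expr, str):
        return env[expr]
    if isinstance(expr, tuple):
        args = [evaluate(e, env) for e in expr[1:]]
        if expr[0] == 'fmul':
            return args[0] * args[1]
        if expr[0] == 'fpow':
            return args[0] ** args[1]
        raise ValueError("unknown opcode: %r" % (expr[0],))
    return expr


a = 'a'  # assumption: pattern variables are plain strings

RULES = [
    (('fpow', a, 4.0), ('fmul', ('fmul', a, a), ('fmul', a, a))),
]

before = ('fmul', ('fpow', 'x', 4.0), ('fpow', 'x', 2.0))
after = rewrite(before, RULES)
print(after)
```

In this sketch only the pow(x, 4) operand is rewritten, while pow(x, 2) is left alone because no rule matches it. Supporting another algebraic identity would mean appending one more tuple to RULES rather than writing new tree-manipulation code, which is the contrast the two commits illustrate.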