Ptx model has `redux.sync` that performs reduction operation on the data from each predicated active thread in the thread group. It only is available sm80+.
This revision adds redux as on op to nvvm dialect.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D142088