Update the unrolling preferences for Apple Silicon CPUs to enable partial and runtime unrolling for small loops with reductions.

This builds on the unroller changes that introduce parallel reduction phis where possible: https://github.com/llvm/llvm-project/pull/149470.

PR: https://github.com/llvm/llvm-project/pull/149699
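
As a rough source-level illustration of why parallel reduction phis matter when unrolling a reduction loop, the sketch below compares a single-accumulator sum with a hand-unrolled version that keeps several independent accumulators. This is not the unroller's actual output: the function names `sum_rolled` and `sum_unrolled`, the unroll factor of 4, and the epilogue-style remainder loop are all assumptions chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: one accumulator, so every add depends on the previous add.
std::uint64_t sum_rolled(const std::vector<std::uint32_t> &v) {
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    acc += v[i];
  return acc;
}

// Hypothetical hand-unrolled equivalent (factor 4 is an assumption).
// Each accumulator plays the role of one parallel reduction phi: the four
// dependency chains are independent and can execute concurrently.
std::uint64_t sum_unrolled(const std::vector<std::uint32_t> &v) {
  const std::size_t n = v.size();
  std::uint64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
  std::size_t i = 0;

  // Main unrolled loop: handles groups of four elements.
  for (; i + 4 <= n; i += 4) {
    acc0 += v[i];
    acc1 += v[i + 1];
    acc2 += v[i + 2];
    acc3 += v[i + 3];
  }

  // Combine the partial results once, after the loop.
  std::uint64_t acc = (acc0 + acc1) + (acc2 + acc3);

  // Remainder loop: the trip count is only known at run time, so the
  // leftover 0 to 3 elements are handled here.
  for (; i < n; ++i)
    acc += v[i];
  return acc;
}
```

Integer addition is used here because it can be reassociated freely; a floating-point reduction can only be split this way when reassociation is permitted.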