Barrier removal in OpenMPOpt normally removes barriers by proving that
they are redundant with barriers preceding them. However, it can't do
this with the "pseudo-barrier" at the end of kernels because that can't
be removed. Instead, it removes the barriers preceding the end of the
kernel which that end-of-kernel barrier is redundant with. However,
these barriers aren't always redundant with the end-of-kernel barrier
when loops are involved, and removing them can lead to incorrect results
in compiled code.
This change fixes this by requiring that these pre-end-of-kernel
barriers also have the kernel end as a unique successor before removing
them. It also changes the initialization of `ExitED` for kernels since
the kernel end is not an aligned barrier.