This commit uses the new `load.async.to.lds` intrinsic to add an `async`
mode to `gather_to_lds`. In this mode, completion of the load must be
managed explicitly with the `asyncmark` and `wait.asyncmark` intrinsics
instead of being derived implicitly by alias analysis.

This commit adds the flag and its lowering, and updates the tests.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>