This fixes a crash in the SCF→GPU conversion when building the per-dimension
index for a mapped scf.parallel.
**Change**:
- Map the step and lower bound through cloningMap, then pass each result
  through ensureLaunchIndependent.
- If either value is still unavailable at launch scope, emit a match failure;
  otherwise build the affine.apply.
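
The control flow described above can be illustrated with a small toy model. This is only a sketch, not the pass's C++ code: `Value`, `build_dim_index`, and the boolean `launch_independent` flag are hypothetical stand-ins, and the real ensureLaunchIndependent decides availability from the IR rather than from a flag. The returned tuple merely stands for "an affine.apply was built with these operands".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Value:
    """Toy stand-in for an SSA value (hypothetical, not an MLIR type)."""
    name: str
    launch_independent: bool  # assumption: usable at gpu.launch scope?


def ensure_launch_independent(value: Value) -> Optional[Value]:
    # Toy version: hand the value back if usable at launch scope, else None.
    return value if value.launch_independent else None


def build_dim_index(
    hw_id: Value,
    step: Value,
    lower_bound: Value,
    cloning_map: Dict[Value, Value],
) -> Optional[Tuple[str, str, str, str]]:
    # 1. Map step and lower bound through the cloning map (identity if absent),
    #    then check each result for launch independence.
    mapped_step = ensure_launch_independent(cloning_map.get(step, step))
    mapped_lb = ensure_launch_independent(cloning_map.get(lower_bound, lower_bound))

    # 2. If either is unavailable, report failure instead of building an op
    #    with an invalid operand.
    if mapped_step is None or mapped_lb is None:
        return None  # stands for a match failure

    # 3. Otherwise "build" the index op from the hardware id, step, and bound.
    return ("affine.apply", hw_id.name, mapped_step.name, mapped_lb.name)


block_id = Value("block_id_x", True)
step = Value("c4", True)
inner_lb = Value("%inner_lb", False)  # not usable at launch scope
outer_lb = Value("%outer_lb", True)   # its launch-scope counterpart

ok = build_dim_index(block_id, step, inner_lb, {inner_lb: outer_lb})
bad = build_dim_index(block_id, step, inner_lb, {})
```

Here `ok` describes a built op because the lower bound is remapped to a launch-scope value, while `bad` is `None` because no usable mapping exists. The second case corresponds to the situation that previously crashed.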
**Why this is correct**:
- Matches how the pass already handles launch bounds.
- Avoids creating an op with invalid operands, replacing a segfault with a
  clear diagnostic.
**Tests**:
- Added two small regression tests that lower to gpu.launch and exercise the
  affine.apply path.
Fixes: #167654
Signed-off-by: Shashi Shankar <shashishankar1687@gmail.com>