I noticed that NVPTX will sometimes emit `mad.lo` to multiply by 1, e.g.
in https://gcc.godbolt.org/z/4j47Y9W4c.
This happens when DAGCombiner operates on the add before the mul, so the
imad contraction happens regardless of whether the mul could have been
simplified.
To fix this, I remove `NVPTXISD::IMAD` and only combine to mad during
selection. This allows the default DAGCombiner patterns to simplify
the graph without any NVPTX-specific intervention.
Similar to 806761a7629df268c8aed49657aeccffa6bca449
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple (e.g. Windows, macOS),
leaving a target triple which may not make sense.
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
nvptx{,64}-apple-darwin as ELF instead of rejecting it outrightly.