280 Commits

Author SHA1 Message Date
Matt Arsenault
2502e3b7ba
IR: Promote "denormal-fp-math" to a first class attribute (#174293)
Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first
class denormal_fpenv attribute. Previously the query for the effective
denormal mode involved two string attribute queries with parsing. I'm
introducing more uses of this, so it makes sense to convert this
to a more efficient encoding. The old representation was also awkward
since it was split across two separate attributes. The new encoding
just stores the default and float modes as bitfields, largely avoiding
the need to consider if the other mode is set.

The syntax in the common cases looks like this:
  `denormal_fpenv(preservesign,preservesign)`
  `denormal_fpenv(float: preservesign,preservesign)`
  `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)`

I wasn't sure about reusing the float type name instead of adding a
new keyword. It's parsed as a type but only accepts float. I'm also
debating switching the name to subnormal to match the current
preferred IEEE terminology (also used by nofpclass and other
contexts).

This has a behavior change when using the command flag debug
options to set the denormal mode. The behavior of the flag
ignored functions with an explicit attribute set, per
the default and f32 version. Now that these are one attribute,
the flag logic can't distinguish which of the two components
was explicitly set on the function. Only one test appeared to
rely on this behavior, so I just avoided using the flags in it.

This also does not perform all the code cleanups this enables.
In particular the attributor handling could be cleaned up.

I also guessed at how to support this in MLIR. I followed
MemoryEffects as a reference; it appears bitfields are expanded
into arguments to attributes, so the representation there is
a bit uglier with the 2 2-element fields flattened into 4 arguments.
2026-02-05 13:31:26 +00:00
Florian Hahn
49288b6523
[UTC] Add initial VPlan support. (#178534)
Add support for extracting a VPlan from LV debug output and generalizing
matching for unnamed VPValues.

Once we have support for -vplan-print-after=xxxx we can strip the logic
to extract a VPlan manually. We cannot use a regex, as we need to match
from the opening bracket to the correct closing bracket.
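
The bracket-matching constraint can be sketched with a simple depth counter. This is a minimal illustration, not the code added to UTC; the function name `extract_balanced` and the choice of curly braces as the delimiter are assumptions.

```python
def extract_balanced(text, start):
    """Return the substring from the '{' at `start` to its matching '}'."""
    if text[start] != "{":
        raise ValueError("start must point at an opening bracket")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Found the bracket that closes the one at `start`.
                return text[start : i + 1]
    raise ValueError("unbalanced brackets")

block = extract_balanced("a { b { c } d } e }", 2)
```

A regular expression without recursion support would stop at either the first or the last closing bracket, which is why a counter is needed for nested input.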

PR: https://github.com/llvm/llvm-project/pull/178534
2026-02-04 20:29:22 +00:00
Jay Foad
7ea33e6848
[CodeGen] Remove unused first operand of SUBREG_TO_REG (#179690)
The first input operand of SUBREG_TO_REG was an immediate that most
targets set to 0. In practice it had no effect on codegen. Remove it.
2026-02-04 17:35:21 +00:00
Francesco Petrogalli
7890960218
[UTC][RISC-V] Enable riscv32 Mach-O triple. (#178738)
Co-authored-by: Adam Nemet <anemet@apple.com>
2026-01-30 10:13:16 -08:00
David Sherwood
73c7c562dd
[LLVM][DAGCombiner] Look through freeze when combining extensions of loads (#175022)
Following on from https://github.com/llvm/llvm-project/pull/172484 I
have added support to tryToFoldExtOfLoad for looking through freezes, in
order to catch more cases of extending loads. This type of code is
sometimes seen being generated by the loop vectoriser. For now I've
limited this to cases where the load is only used by the freeze, since
otherwise it leads to worse code in some X86 tests.
2026-01-29 12:01:43 +00:00
Alexander Richardson
7b8b23a84d
[update_mc_test_checks] Support --show-inst output
This is useful to check that the correct registers were used in cases
where different register classes use the same name in asm input/output.

Pull Request: https://github.com/llvm/llvm-project/pull/174011
2026-01-19 22:22:37 -08:00
hev
c9cc782e0b
[llvm][LoongArch] Add PC-relative address materialization using pcadd instructions (#175358)
This patch adds support for PC-relative address materialization using
pcadd-class relocations, covering the HI20/LO12 pair and their GOT and
TLS variants (IE, LD, GD, and DESC).

Link:
https://gcc.gnu.org/pipermail/gcc-patches/2025-December/703312.html
2026-01-14 16:12:18 +08:00
Kunqiu Chen
1cb9b790f0
[UTC] Align label var handling of old lines to new lines (#173850)
BB labels have been treated as variables in newer UTC versions. 

However, UTC previously handled BB labels in old lines differently from
new lines, causing incorrect `remap_metavar_names`.

E.g., 
- New lines var `exit:` and `label %exit`: UTC generalized them as
`[[@@]]` and `[[@@]]`.
- Old lines var `[[EXIT]]:` and `label %[[EXIT]]`: UTC generalized them
as `[[@@]]:` and `label %[[@@]]`, which mismatched with the
generalization of new lines.

This mismatch might cause unexpected variable name remappings, even if
the new lines are indeed equivalent to the old lines.

This PR aligns label var handling of old lines to new lines, i.e.,
generalizes `[[EXIT]]:` and `label %[[EXIT]]` as `[[@@]]` and `[[@@]]`.
2026-01-11 19:39:41 +08:00
Jay Foad
7f5dbbc342
[Utils][update_mc_test_checks] Handle double quotes in asm source (#175161)
2026-01-09 14:04:00 +00:00
Alexander Richardson
f00575eacd
[UpdateTestChecks] Fix %update_mc_test_checks substitution
We need to explicitly pass --llvm-mc-binary to avoid picking up llvm-mc
from somewhere else in $PATH. Noticed this because test lines were being
generated that didn't include my latest changes to update_mc_test_checks.py

Pull Request: https://github.com/llvm/llvm-project/pull/172230
2025-12-30 11:02:31 -08:00
David Green
8a4cc440f2
[AArch64] Run optimizeTerminators earlier too. (#170907)
Running optimizeTerminators prior to other optimizations like branch
layout can lead to more folding and better codegen, but is not on its
own able to capture all cases. There is benefit to running it in both
places. This adds the existing code from #161508 into the
AArch64RedundantCopyElimination pass, which sounds like a sensible
enough place for it.

This is a recommit with an extra fix for shrink-wrapping domtree use.
2025-12-11 15:33:15 +00:00
Arthur Eubanks
53cd4ab413
Revert "[AArch64] Run optimizeTerminators earlier too." (#171505)
Reverts llvm/llvm-project#170907

Causes crashes, see
https://github.com/llvm/llvm-project/pull/170907#issuecomment-3634271414
2025-12-10 00:22:17 +00:00
David Green
ca927e564d
[AArch64] Run optimizeTerminators earlier too. (#170907)
Running optimizeTerminators prior to other optimizations like branch
layout can lead to more folding and better codegen, but is not on its
own able to capture all cases. There is benefit to running it in both
places. This adds the existing code from #161508 into the
AArch64RedundantCopyElimination pass, which sounds like a sensible
enough place for it.
2025-12-09 09:11:11 +00:00
Alexander Richardson
9baf76a9f8
[MCAsmStreamer] Print register names in --show-inst mode
Passing the context to `Inst.dump_pretty()` allows printing symbolic
register names instead of `<MCOperand Reg:1234>` in the output.
I plan to use this in future RVY test cases where we have register
classes with the same name in assembly syntax, but different underlying
register enum values. Printing the name of the enum value makes it
easier to test that we selected the correct register.

Reviewed By: lenary

Pull Request: https://github.com/llvm/llvm-project/pull/171252
2025-12-08 21:53:15 -08:00
David Green
26f5116266
[AArch64] Optimize CBZ wzr and friends. (#161508)
In certain situations, especially with zero phi operands propagated after tail
duplications, we can end up with CBZ/CBNZ/TBZ/TBNZ with a zero register. It
can be introduced late in the pipeline.

This patch adds a basic pass to fold them away to either a direct branch or
removing the instruction entirely. It runs quite late in the pipeline, so doesn't
fit into any of the existing passes. It only needs to look at the terminators
of each BB, so the new pass should have a limited compile-time impact.
2025-12-05 17:44:45 +00:00
Ivan Kosarev
83765f435d
[Utils][update_mc_test_checks] Support generating asm tests from templates. (#168946)
Reduces the pain of manually editing tests, applying the same
changes over multiple instructions, and keeping them consistent.
2025-11-24 13:38:41 +00:00
Matt Arsenault
fbf74b2553
AMDGPU: Select vector reg class for divergent build_vector (#168169)
The main improvement is to the mfma tests. There are some
mild regressions scattered around, and a few major ones.
The worst regressions are in some of the bitcast tests;
these are cases where the SGPR argument list runs out
and uses VGPRs, and the copies-from-VGPR are misidentified
as divergent. Most of the shufflevector tests are also
regressions. These end up with cleaner MIR, but then get poor
regalloc decisions.
2025-11-14 21:53:39 -08:00
Matt Arsenault
2bf92787df
AMDGPU: Start using RegClassByHwMode for wavesize operands (#159884)

This eliminates the pseudo register classes used to hack the
wave register class, which are now replaced with RegClassByHwMode,
so most of the diff is from register class ID renumbering.
2025-11-11 15:07:59 -08:00
Valery Pykhtin
59f6f33bc3
Reapply "[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_checks.py." (#164965) (#166575)
This change enables update_llc_test_checks.py to automatically generate
MIR checks for RUN lines that use `-stop-before` or `-stop-after` flags
allowing tests to verify intermediate compilation stages (e.g., after
instruction selection but before peephole optimizations) alongside the
final assembly output. If the `-debug-only` flag is present in the run line, it's
considered the main point of interest for testing and the stop flags above
are ignored (that is, no MIR checks are generated).

This resulted from a scenario where I needed to test two instruction
matching patterns where the later pattern in the peepholer reverts the
earlier pattern in the instruction selector and distinguish it from the
case where the earlier pattern didn't work at all.

Initially created by Claude Sonnet 4.5, it was improved later to handle
conflicts in MIR <-> ASM prefixes and formatting.
2025-11-06 11:35:46 +01:00
Valery Pykhtin
ba1dbdd44a
Revert "[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_checks.py." (#166549)
Reverts llvm/llvm-project#164965
2025-11-05 14:30:27 +01:00
Valery Pykhtin
c782ed3440
[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_checks.py. (#164965)
This change enables update_llc_test_checks.py to automatically generate
MIR checks for RUN lines that use `-stop-before` or `-stop-after` flags
allowing tests to verify intermediate compilation stages (e.g., after
instruction selection but before peephole optimizations) alongside the
final assembly output. If the `-debug-only` flag is present in the run line, it's
considered the main point of interest for testing and the stop flags above
are ignored (that is, no MIR checks are generated).

This resulted from a scenario where I needed to test two instruction
matching patterns where the later pattern in the peepholer reverts the
earlier pattern in the instruction selector and distinguish it from the
case where the earlier pattern didn't work at all.

Initially created by Claude Sonnet 4.5, it was improved later to handle
conflicts in MIR <-> ASM prefixes and formatting.
2025-11-05 13:31:10 +01:00
Kunqiu Chen
82cf54fbf6
[UTC] CHECK-EMPTY instead of skipping blank lines (#165718)
Previously, any blank lines in IR were ignored by UTC, leading to more
fragile `CHECK`s being generated.

This change lets UTC, 1) emit `CHECK-EMPTY` to check blank lines, and 2)
generate more `CHECK-NEXT`s, landing the discussion
https://github.com/llvm/llvm-project/pull/165419#issuecomment-3457572422.

Moreover, this change also aligns the behavior of IR check-gen to ASM
check-gen, which has been emitting `CHECK-EMPTY` since
a8a89c77ea.
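
A minimal sketch of the check-generation rule described above, assuming a simplified model in which the first output line gets a plain `CHECK` and every later line gets either `CHECK-NEXT` or `CHECK-EMPTY`. The function name `make_checks` and the `; ` comment prefix are illustrative assumptions, not the actual UTC code.

```python
def make_checks(lines, prefix="CHECK"):
    """Turn tool output lines into FileCheck directives (simplified model)."""
    checks = []
    for i, line in enumerate(lines):
        if i == 0:
            checks.append(f"; {prefix}: {line}")
        elif line.strip() == "":
            # A blank output line is now checked instead of being skipped.
            checks.append(f"; {prefix}-EMPTY:")
        else:
            checks.append(f"; {prefix}-NEXT: {line}")
    return checks

checks = make_checks(["define void @f() {", "", "  ret void", "}"])
```

Because the blank line is matched explicitly, the line after it can use `CHECK-NEXT` rather than a looser `CHECK`, which is what makes the generated checks less fragile.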
2025-11-01 17:01:30 +08:00
Kunqiu Chen
566c7311d4
[UTC] Indent switch cases (#165212)
LLVM prints switch cases indented by 2 additional spaces, as follows:
```LLVM
  switch i32 %x, label %default [
    i32 0, label %phi
    i32 1, label %phi
  ]
```

Since this only changes the output IR of update_test_checks.py and does
not change the logic of the File Check Pattern, there seems to be no
need to update the existing test cases.
2025-10-28 22:00:54 +08:00
Tomer Shafir
9abae17b25
[UpdateTestChecks][llc] Support arm64-apple-darwin (#165092)
Adds `arm64-apple-darwin` support to `asm.py` matching and removes the
now-invalidated `target-triple-mismatch` test (I don't have another triple
supported by llc but not the autogenerator that would make this test useful).
2025-10-27 21:22:23 +02:00
Matt Arsenault
853760bca6
AMDGPU: Use ELF mangling in data layout (#163011)
Closes #95219
2025-10-13 03:01:45 +00:00
Matt Arsenault
4af3e8f1d4
AMDGPU: Remove LDS_DIRECT_CLASS register class (#161762)
This is a singleton register class which is a bad idea,
and not actually used.
2025-10-04 03:56:43 +00:00
Matt Arsenault
1b30e49b9b
AMDGPU: Remove m0 classes (#161758)
These are singleton register classes, which are not a good idea
and also are unused.
2025-10-04 02:33:45 +00:00
Alex Bradbury
9d48df7a92
[UpdateTestChecks] Don't fail silently when conflicting CHECK lines means no checks are generated for some functions (#159321)
There is a warning that triggers if you (for instance) run
`update_llc_test_checks.py` on an input where _all_ functions have
conflicting check lines and so no checks are generated. However, there
are no warnings emitted at all for the case where some functions have
non-conflicting check lines but others don't. This is a source of
frustration because running update_llc_test_checks can result in all
check lines being removed for certain functions when such a conflict
exists with no warning, meaning we have to be extra vigilant inspecting
the diff. I've also personally wasted time tracking down the source of
the dropped lines assuming that update_test_checks would emit a warning
in such cases.

This change adds logic to emit warnings on a function-by-function basis
for any RUN that has conflicting prefixes meaning no output is
generated. This subsumes the previous warning for when _all_ functions
conflict.
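
The per-function warning logic can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not the UpdateTestChecks implementation: the `find_uncovered` name and the input shape (one `(prefixes, {function: body})` tuple per RUN line) are hypothetical, and a conflict is modelled only as "same prefix, different body".

```python
def find_uncovered(runs):
    """Return (run_index, function) pairs for which no checks can be emitted.

    `runs` is a list of (prefixes, {function: body}) tuples, one per RUN line.
    """
    bodies = {}  # (prefix, function) -> body, or None once a conflict is seen
    for prefixes, funcs in runs:
        for prefix in prefixes:
            for func, body in funcs.items():
                key = (prefix, func)
                if key in bodies and bodies[key] != body:
                    bodies[key] = None  # same prefix, different output
                else:
                    bodies.setdefault(key, body)
    uncovered = []
    for index, (prefixes, funcs) in enumerate(runs):
        for func in funcs:
            # Warn only when every prefix of this RUN line conflicts.
            if all(bodies[(p, func)] is None for p in prefixes):
                uncovered.append((index, func))
    return uncovered

runs = [
    (["CHECK"], {"f": "mov a", "g": "ret"}),
    (["CHECK", "B"], {"f": "mov b", "g": "ret"}),
]
uncovered = find_uncovered(runs)
```

Here `f` produces different output under the shared `CHECK` prefix, so the first RUN line has no usable prefix for it, while the second can still fall back to `B`.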
2025-09-23 16:17:35 +00:00
Stanislav Mekhanoshin
fd59fd563f
[AMDGPU] Add aperture classes to VS_64 (#158823)
Should not do anything.
2025-09-16 11:15:50 -07:00
Stanislav Mekhanoshin
72aa946762
[AMDGPU] Drop high 32 bits of aperture registers (#158725)
Fixes: SWDEV-551181
2025-09-16 02:11:39 -07:00
Antonio Frighetto
1cacc7339b
[UTC] Record TBAA semantics when autogenerating check lines
UpdateTestChecks has been updated to take TBAA semantics into
account when emitting checks. This is achieved by parsing TBAA
metadata for each tool invocation (each tool being identified by
its prefixes) and maintaining a global dict of prefixes to TBAA
nodes.
2025-09-10 19:40:30 +02:00
Antonio Frighetto
cb9cb4eb2e
[UTC] Introduce test for PR147670 (NFC)
2025-09-10 19:40:30 +02:00
Stanislav Mekhanoshin
6aebbb0a85
[AMDGPU] Define 1024 VGPRs on gfx1250 (#156765)
This is baseline support; it is not usable yet.
2025-09-03 16:25:18 -07:00
Matt Arsenault
1ff6bfe7a5
AMDGPU: Add VS_64_Align2 class (#156132)
We need an aligned version of the VS class to properly
represent operand constraints.

This fixes regressions with #155559
2025-09-02 23:24:07 +09:00
Michael Berg
efa99eccfc
[LoopDist] Add metadata for checking post process state of distributed loops (#153902)

Add a count of the number of partitions LoopDist made when distributing
a loop in metadata, then check for loops which are already distributed
to prevent reprocessing.

We see this happen on some spec apps; LD is on by default at SiFive.
2025-08-22 11:05:31 -07:00
Alex MacLean
d494eb0fa3
[NVPTX] Skip numbering unreferenced virtual registers (readability) (#154391)
When assigning numbers to registers, skip any with neither uses nor
defs. This will not have any impact at all on the final SASS but it
makes for slightly more readable PTX. This change should also ensure
that future minor changes are less likely to cause noisy diffs in
register numbering.
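
The numbering rule can be sketched in a few lines. This is only an illustration of the idea: the `number_registers` name, the string register identifiers, and the choice to start counting at 1 are assumptions, not details taken from the NVPTX backend.

```python
def number_registers(regs, referenced):
    """Assign consecutive numbers only to registers with a use or def."""
    numbering = {}
    for reg in regs:
        if reg in referenced:
            # Unreferenced registers are skipped and leave no gap behind.
            numbering[reg] = len(numbering) + 1
    return numbering

numbering = number_registers(["v0", "v1", "v2", "v3"], {"v0", "v3"})
```

Since dead registers no longer consume a number, adding or removing one upstream does not shift the numbers of the registers that are actually printed.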
2025-08-19 12:27:46 -07:00
Philip Reames
4d629f9744
[MIR] Remove std::variant from multiple save/restore point handling [nfc] (#153226)
In review of bbde6b, I had originally proposed that we support the
legacy text format. As review evolved, it became clear this had been a
bad idea (too much complexity), but in order to let that patch finally
move forward, I approved the change with the variant. This change undoes
the variant, and updates all the tests to just use the array form.
2025-08-12 11:23:05 -07:00
Tomer Shafir
d64e6b5e27
[utils][UpdateTestChecks] Warn about possible target triple mismatch (#149645)
Aims to improve error reporting by printing a warning if the target
function regex that has been selected finds no matches. For example, a
`-mtriple=arm64-apple-darwin` run line would map to the `arm64` prefix
by `update_llc_test_checks.py` and wouldn't match Apple's function
layout, generating unintelligible garbage checks.

The implementation changes `common.process_run_line` to return an
abstract indicator of number of functions processed, without breaking
the drivers. Then `update_llc_test_checks.py` prints a driver specific
error message.
2025-08-12 11:44:42 +03:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Aiden Grossman
6ba25c1a56
[llvm] Remove uses of %T in tests (#151621)
This patch removes all uses of %T from within LLVM tests. %T has been
deprecated for about seven years and use is not advised given it is not
unique per test and can thus lead to races. The goal of this is to
eventually remove support for %T from lit.
2025-08-01 08:24:56 -07:00
Alex MacLean
35693daa70
[NVPTX] Fix v2i8 call lowering, use generic ld/st nodes for call params (#146930)
2025-07-28 10:41:51 -07:00
sivadeilra
b933f0c376
Fix Windows EH IP2State tables (remove +1 bias) (#144745)
This changes how LLVM constructs certain data structures that relate to
exception handling (EH) on Windows. Specifically this changes how
IP2State tables for functions are constructed. The purpose of this
change is to align LLVM to the requirements of the Windows AMD64 ABI, which
requires that the IP2State table entries point to the boundaries between
instructions.

On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32),
exception handling works by looking up instruction pointers in lookup
tables. These lookup tables are stored in `.xdata` sections in
executables. One element of the lookup tables are the `IP2State` tables
(Instruction Pointer to State).

If a function has any instructions that require cleanup during exception
unwinding, then it will have an IP2State table. Each entry in the
IP2State table describes a range of bytes in the function's instruction
stream, and associates an "EH state number" with that range of
instructions. A value of -1 means "the null state", which does not
require any code to execute. A value other than -1 is an index into the
State table.

The entries in the IP2State table contain byte offsets within the
instruction stream of the function. The Windows ABI requires that these
offsets are aligned to instruction boundaries; they are not permitted to
point to a byte that is not the first byte of an instruction.

Unfortunately, CALL instructions present a problem during unwinding.
CALL instructions push the address of the instruction after the CALL
instruction, so that execution can resume after the CALL. If the CALL is
the last instruction within an IP2State region, then the return address
(on the stack) points to the *next* IP2State region. This means that the
unwinder will use the wrong cleanup funclet during unwinding.

To fix this problem, compilers should insert a NOP after a CALL
instruction, if the CALL instruction is the last instruction within an
IP2State region. The NOP is placed within the same IP2State region as
the CALL, so that the return address points to the NOP and the unwinder
will locate the correct region.

This PR modifies LLVM so that it inserts NOP instructions after CALL
instructions, when needed. In performance tests, the NOP has no
detectable significance. The NOP is rarely inserted, since it is only
inserted when the CALL is the last instruction before an IP2State
transition or the CALL is the last instruction before the function
epilogue.
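
The placement rule for the NOP can be sketched as below. This is a toy model under stated assumptions, not LLVM's code: instructions are `(opcode, eh_state)` tuples, the function epilogue is represented by a hypothetical `"epilogue"` marker, and `insert_eh_nops` is an illustrative name.

```python
def insert_eh_nops(instrs):
    """Add a NOP after any CALL that ends an IP2State region (toy model)."""
    out = []
    for i, (op, state) in enumerate(instrs):
        out.append((op, state))
        if op != "call" or i + 1 == len(instrs):
            continue
        next_op, next_state = instrs[i + 1]
        if next_state != state or next_op == "epilogue":
            # The NOP stays in the CALL's region so the return address
            # still maps to the same EH state.
            out.append(("nop", state))
    return out

padded = insert_eh_nops([
    ("call", 0), ("mov", 0),        # call inside a region: no NOP
    ("call", 0), ("mov", -1),       # call ends state 0: NOP added
    ("call", -1), ("epilogue", -1), # call right before epilogue: NOP added
])
```

The first call needs no padding because the instruction that follows it belongs to the same region, which matches the observation that the NOP is rarely inserted.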

NOP padding is only necessary on Windows AMD64 targets. On ARM64 and
ARM32, instructions have a fixed size so the unwinder knows how to "back
up" by one instruction.

Interaction with Import Call Optimization (ICO):

Import Call Optimization (ICO) is a compiler + OS feature on Windows
which improves the performance and security of DLL imports. ICO relies
on using a specific CALL idiom that can be replaced by the OS DLL
loader. This removes a load and indirect CALL and replaces it with a
single direct CALL.

To achieve this, ICO also inserts NOPs after the CALL instruction. If
the end of the CALL is aligned with an EH state transition, we *also*
insert a single-byte NOP. **Both forms of NOPs must be preserved.** They
cannot be combined into a single larger NOP; nor can the second NOP be
removed.

This is necessary because, if ICO is active and the call site is
modified by the loader, the loader will end up overwriting the NOPs that
were inserted for ICO. That means that those NOPs cannot be used for the
correct termination of the exception handling region (the IP2State
transition), so we still need an additional NOP instruction. The NOPs
cannot be combined into a longer NOP (which is ordinarily desirable)
because then ICO would split one instruction, producing a malformed
instruction after the ICO call.
2025-07-22 09:18:13 -07:00
Alex MacLean
f03782dd67
[NVPTX] Fixup v2i8 parameter and return lowering (#145585)
This change fixes v2i8 lowering for parameters and returned values. As
part of this work, I move the lowering for return values to use generic
ISD::STORE nodes as these are more flexible and have existing
legalization handling.

Note that calling a function with v2i8 arguments or returns is still not
working but this is left for a subsequent change as this MR is already
fairly large.

Partially addresses #128853
2025-06-27 09:26:10 -07:00
Alex MacLean
70333de6cf
[NVPTX] Consolidate and cleanup various NVPTXISD nodes (NFC) (#145581)
This change consolidates and cleans up various NVPTXISD target-specific
nodes in order to simplify SDAG ISel. While there are some whitespace
changes in the emitted PTX it is otherwise a non-functional change.

NVPTXISD::Wrapper - This node was used to wrap external-symbol and
global-address nodes. It is redundant and has been removed. Instead we
use the non-target versions of these nodes and convert them
appropriately during ISel.

NVPTXISD::CALL - Much of the family of nodes used to represent a PTX
call instruction have been replaced by this new single node. It
corresponds to a single instruction and is therefore much simpler to
create and lower.
2025-06-25 11:42:21 -07:00
Pierre van Houtryve
01848731d3
[tools] Allow RegClass/Bank in update_givaluetracking_test_checks.py (#141727)
The script previously assumed an underscore after the :
2025-05-28 10:29:18 +02:00
David Green
a2aa88192f
[GlobalISel] Add a update_givaluetracking_test_checks.py script (#140296)
As with the other update scripts this takes the output of
-passes=print<gisel-value-tracking> and inserts the results into an
existing mir file. This means that the input is a lot like
update_analysis_test_checks.py, and the output needs to insert into a
mir file similarly to update_mir_test_checks.py. The code used to do the
inserting has been moved to common, to allow it to be reused. Otherwise
it tries to reuse the existing infrastructure, and
update_givaluetracking_test_checks is kept relatively short.
2025-05-22 09:06:37 +01:00
Ramkumar Ramachandra
bb2791609d
[LAA] Tweak debug output for UTC stability (#140764)
UpdateTestChecks has a make_analyzer_generalizer to replace pointer
addresses from the debug output of LAA with a pattern, which is an
acceptable solution when there is one RUN line. However, when there are
multiple RUN lines with a common pattern, UTC fails to recognize common
output due to mismatched pointer addresses. Instead of hacking UTC to scrub
the output before comparing the outputs from the different RUN lines,
fix the issue once and for all by making LAA not output unstable pointer
addresses in the first place.

The removal of the now-dead make_analyzer_generalizer is left as a
non-trivial exercise for a follow-up.
2025-05-21 12:01:49 +01:00
hev
746c682c4a
[LoongArch] Introduce 32s target feature for LA32S ISA extensions (#139695)
According to the official LoongArch reference manual, the 32-bit
LoongArch is divided into two variants: the Reduced version (LA32R) and
Standard version (LA32S). LA32S extends LA32R by adding additional
instructions, and the 64-bit version (LA64) fully includes the LA32S
instruction set.

This patch introduces a new target feature `32s` for the LoongArch
backend, enabling support for instructions specific to the LA32S
variant.

The LA32S extension includes the following additional instructions:

- ALSL.W
- {AND,OR}N
- B{EQ,NE}Z
- BITREV.{4B,W}
- BSTR{INS,PICK}.W
- BYTEPICK.W
- CL{O,Z}.W
- CPUCFG
- CT{O,Z}.W
- EXT.W.{B,H}
- F{LD,ST}X.{D,S}
- MASK{EQ,NE}Z
- PC{ADDI,ALAU12I}
- REVB.2H
- ROTR{I}.W

Additionally, LA32R defines three new instruction aliases:

- RDCNTID.W RJ => RDTIMEL.W ZERO, RJ
- RDCNTVH.W RD => RDTIMEH.W RD, ZERO
- RDCNTVL.W RD => RDTIMEL.W RD, ZERO
2025-05-20 18:28:08 +08:00
Ruiling, Song
b8e5307031
update_mir_test_checks: keep comment embedded in MIR (#140016)
We often add inline comments in MIR. It is useful to keep them.
2025-05-20 09:55:18 +08:00
Alex MacLean
369891b674
[NVPTX] use untyped loads and stores wherever possible (#137698)
In most cases, the type information attached to load and store
instructions is meaningless and inconsistently applied. We can usually
use ".b" loads and avoid the complexity of trying to assign the correct
type. The one exception is sign-extending loads, which will continue to
use ".s" to ensure the sign extension into a larger register is done
correctly.
2025-05-10 08:26:26 -07:00