36951 Commits

Author SHA1 Message Date
Finn Plummer
45c01e8a33
[NFC][TargetTransformInfo][VectorUtils] Consolidate isVectorIntrinsic... api (#117635)
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
  consistently name them
- move `isTriviallyScalarizable` to VectorUtils
  
- update all uses of the api and provide the TTI parameter

Resolves #117030
2024-12-19 11:54:26 -08:00
Craig Topper
f139bde8d8
[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536)
This allows us to write more range based for loops because we no
longer need the iterator. It also matches IR's Use class.
2024-12-19 09:07:42 -08:00
Craig Topper
e6b2495545
[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531)
SDNode::use_iterator now returns an SDUse& when dereferenced.
SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses
work on use_iterator. SDNode::user_begin/user_end/users work on
user_iterator.

We can now write range based for loops using SDUse& and SDNode::uses().
I've converted many of these in this patch. I didn't update loops that
have additional variables updated in their for statement.

Some loops use SDNode::use_iterator::getOperandNo() which also prevents
using range based for loops. I plan to move this into SDUse in a follow
up patch.
2024-12-19 08:35:32 -08:00
Shubham Sandeep Rastogi
16d952898f Revert "Add a pass to collect dropped var stats for MIR (#120501)"
This reverts commit 223c7648468cd4f649a578d3f9cbc27a63523192.

Reverted due to vuildbot failure:

flang-aarch64-libcxx

Linking CXX shared library lib/libLLVMAnalysis.so.20.0git
FAILED: lib/libLLVMAnalysis.so.20.0git
2024-12-19 00:48:40 -08:00
Shubham Sandeep Rastogi
223c764846
Add a pass to collect dropped var stats for MIR (#120501)
Reland "Add a pass to collect dropped var stats for MIR" (#117044)

I am trying to reland https://github.com/llvm/llvm-project/pull/115566

I also moved the DroppedVariableStats code to the Analysis lib

This is part of a stack of patches with
https://github.com/llvm/llvm-project/pull/120502 being the first one in
the stack
2024-12-19 00:41:48 -08:00
Craig Topper
bd261ecc5a
[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509)
Most of these are just places that want the first user and aren't
iterating over the whole list.

While there I changed some use_size() == 1 to hasOneUse() which
is more efficient.

This is part of an effort to rename use_iterator to user_iterator
and provide a use_iterator that dereferences to SDUse&. This patch
helps reduce the diff on later patches.
2024-12-18 22:13:04 -08:00
Craig Topper
4ca4287da4
[SelectionDAG] Replace findGlueUse in SelectionDAGISel with SDNode::getGluedUser. NFC (#120512) 2024-12-18 21:46:52 -08:00
Craig Topper
104ad9258a
[SelectionDAG] Rename SDNode::uses() to users(). (#120499)
This function is most often used in range based loops or algorithms
where the iterator is implicitly dereferenced. The dereference returns
an SDNode * of the user rather than SDUse * so users() is a better name.

I've long beeen annoyed that we can't write a range based loop over
SDUse when we need getOperandNo. I plan to rename use_iterator to
user_iterator and add a use_iterator that returns SDUse& on dereference.
This will make it more like IR.
2024-12-18 20:09:33 -08:00
Zhaoxin Yang
f334db92be
[llvm][CodeGen] Intrinsic llvm.powi.* code gen for vector arguments (#118242)
Scalarize vector FPOWI instead of promoting the type. This allows the
scalar FPOWIs to be visited and converted to libcalls before promoting
the type.

FIXME: This should be done in LegalizeVectorOps/LegalizeDAG, but call
lowering needs the unpromoted EVT.

Without this patch, in some backends, such as RISCV64 and LoongArch64,
the i32 type is illegal and will be promoted. This causes exponent type
check to fail when ISD::FPOWI node generates a libcall.

Fix https://github.com/llvm/llvm-project/issues/118079
2024-12-19 08:57:31 +08:00
Florian Hahn
76714be5fd
Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875)"
This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a.

This has been breaking most AArch64 stage2 builds for 4+ hours,
reverting to get the bots back to green.

https://lab.llvm.org/buildbot/#/builders/41/builds/4172
https://lab.llvm.org/buildbot/#/builders/4/builds/4281
https://lab.llvm.org/buildbot/#/builders/199/builds/263
https://lab.llvm.org/buildbot/#/builders/198/builds/334
https://lab.llvm.org/buildbot/#/builders/143/builds/4276
https://lab.llvm.org/buildbot/#/builders/17/builds/4725
2024-12-18 15:06:52 +00:00
Paul Walker
3146911eb0
[LLVM][AsmPrinter] Add vector ConstantInt/FP support to emitGlobalConstantImpl. (#120077)
The fixes a failure path for fixed length vector globals when
ConstantInt/FP is used to represent splats instead of
ConstantDataVector.
2024-12-18 11:51:01 +00:00
Nicholas Guy
b3eede5e1f
Add support for single reductions in ComplexDeinterleavingPass (#112875)
The Complex Deinterleaving pass assumes that all values emitted will
result in complex numbers, this patch aims to remove that assumption and
adds support for emitting just the real or imaginary components, not
both.
2024-12-18 10:34:26 +00:00
Florian Mayer
d9703501b0
[MTE] [NFC] use vector to collect globals to tag (#120283)
The same pattern caused test failures in the HWASan pass, so is brittle.
Let's go for the easier approach.
2024-12-18 00:38:19 -08:00
Pengcheng Wang
1235a93fae
[MachinePipeliner] Use RegisterClassInfo::getRegPressureSetLimit (#119827)
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and
remove replicated code.

Separate from https://github.com/llvm/llvm-project/pull/118787
2024-12-18 15:13:03 +08:00
Pengcheng Wang
b6ad231666
[MachineSink] Use RegisterClassInfo::getRegPressureSetLimit (#119830)
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Separate from https://github.com/llvm/llvm-project/pull/118787
2024-12-18 14:51:01 +08:00
Jon Roelofs
01d7a187a4
[llvm] Add missing dependency of libLLVMCodeGen on vt_gen
```
llvm-project/llvm/include/llvm/CodeGenTypes/MachineValueType.h:43:10: fatal error: 'llvm/CodeGen/GenVT.inc' file not found
   43 | #include "llvm/CodeGen/GenVT.inc"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
```

rdar://141643651
2024-12-17 17:02:55 -07:00
Thurston Dang
e8a6563768
Fix-forward 'RegAllocFast: Avoid using temporary DiagnosticInfo #120184' (#120268)
There was a buildbot breakage

(https://lab.llvm.org/buildbot/#/builders/24/builds/3329/steps/11/logs/stdio):


/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/AMDGPU/ran-out-of-registers-error-all-regs-reserved.ll:9:10:
error: CHECK: expected string not found in input
; CHECK: error: <unknown>:0:0: no registers from class available to
allocate in function 'no_registers_from_class_available_to_allocate'

2: ==75198==ERROR: AddressSanitizer: stack-use-after-scope on address
0xfa23f9f1c270 at pc 0xb2660dda9340 bp 0xfffffe8ab340 sp 0xfffffe8ab338

caused by https://github.com/llvm/llvm-project/pull/120184, which made a
partial fix but also renabled the tests. This patch attempts to fix
forward by applying the same fix to the error message highlighted in the
buildbot.
2024-12-17 09:09:13 -08:00
Florian Hahn
a487b792e2
[TySan] Add initial Type Sanitizer (LLVM) (#76259)
This patch introduces the LLVM components of a type sanitizer: a
sanitizer for type-based aliasing violations.

It is based on Hal Finkel's https://reviews.llvm.org/D32198.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.

For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.

The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the byes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.

The instrumentation pass inserts calls to the memset intrinsic to set
the memory updated by memset, memcpy, and memmove, as well as
allocas/byval (and for lifetime.start/end) to reset the shadow memory to
reflect that the type is now unknown. The runtime intercepts memset,
memcpy, etc. to perform the same function for the library calls.

The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.

Clang's TBAA representation currently has a problem representing unions,
as demonstrated by the one XFAIL'd test in the runtime patch. We'll
update the TBAA representation to fix this, and at the same time, update
the sanitizer.

When the sanitizer is active, we disable actually using the TBAA
metadata for AA. This way we're less likely to use TBAA to remove memory
accesses that we'd like to verify.

As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.

It goes together with the corresponding clang changes
(https://github.com/llvm/llvm-project/pull/76260) and compiler-rt
changes (https://github.com/llvm/llvm-project/pull/76261)

PR: https://github.com/llvm/llvm-project/pull/76259
2024-12-17 13:57:34 +00:00
Florian Hahn
c1f5937eb4
[SelectOpt] Support BinOps with SExt operands. (#115879)
Building on top of https://github.com/llvm/llvm-project/pull/115489
extend support for binops with SExt operand.

PR: https://github.com/llvm/llvm-project/pull/115879
2024-12-17 11:52:15 +00:00
Benjamin Maxwell
a7dafea384
[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117)
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:

1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
* If it is, the expansion would lead to nested call sequences (which is
invalid)
2. The node does not appear as a predecessor to the store
* If it does, attempting to merge the store into the call would result
in a cycle in the DAG

These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`
2024-12-17 10:54:17 +00:00
Matt Arsenault
10b12e6e07
LiveVariables: Use Register (#120204) 2024-12-17 17:45:24 +07:00
Matt Arsenault
3508d8f6dd
RegAllocFast: Avoid using temporary DiagnosticInfo (#120184)
This reverts commit 1297933f35b4948b4d281259627a72094c407a75.
2024-12-17 16:19:26 +07:00
Florian Mayer
514580b438
[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918)
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
2024-12-17 00:47:02 -08:00
Matt Arsenault
5e727e8bed
[Statepoint] Treat undef operands less specially (#119682)
This reverts commit f7443905af1e06eaacda1e437fff8d54dc89c487.

This is to avoid an assertion if an undef operand appears in a
stackmap. This is important to avoid hitting verifier errors
when register allocation starts adding undefs in error scenarios.

Rather than trying to treat undef operands as special, leave them
alone and avoid producing an invalid spill. It would a bit more
precise to produce a spill of an undef register here, but that's not
exposed through the storeRegToStackSlot API.

https://reviews.llvm.org/D122605

This was an alternative to https://reviews.llvm.org/D122582
2024-12-17 12:59:46 +07:00
Matt Arsenault
e2cabd715b RegAllocGreedy: Fix comment typo 2024-12-17 12:46:04 +07:00
Simon Pilgrim
0954c67d7a [DAG] visitFREEZE - only fold integer types to an all ones constant
ISD::isBuildVectorAllOnes can peek through bitcasts, so this can match against FP NAN (ish) data (e.g. double (bitcast i64 -1)) under certain circumstances - bail if the type isn't an integer and let bitcast folding handle it first.

Fixes #120093
2024-12-16 16:46:38 +00:00
Aiden Grossman
76f258920d
[MLGO] Do not include urgent LRs in max cascade calculation (#120052)
A previous PR introduced a threshold where we would mask out a LR that
had been evicted a certain number of times to combat pathological
compile time cases with a somewhat adversarial model. However, this
patch did not take into account urgent LRs which led to compilation
failures when greedy would expect us to provide an eviction and we could
not due to the newly introduced logic.
2024-12-16 07:14:34 -08:00
Björn Pettersson
3ad2399148
[DAGCombiner] Refactor and improve ReduceLoadOpStoreWidth (#119564)
This patch make a couple of improvements to ReduceLoadOpStoreWidth.

When determining the minimum size of "NewBW" we now take byte boundaries
into account. If we for example touch bits 6-10 we shouldn't accept
NewBW=8, because we would fail later when detecting that we can't access
bits from two different bytes in memory using a single load. Instead we
make sure to align LSB/MSB according to byte size boundaries up front
before searching for a viable "NewBW".

In the past we only tried to find a "ShAmt" that was a multiple of
"NewBW", but now we use a sliding window technique to scan for a viable
"ShAmt" that is a multiple of the byte size. This can help out finding
more opportunities for optimization (specially if the original type
isn't byte sized, and for big-endian targets when the original
load/store is aligned on the most significant bit).
2024-12-16 12:15:11 +01:00
David Green
a35db2880a [NFC] Remove some unnecessary semicolons
All inside LLVM_DEBUG, some of which have been cleaned up by adding block
scopes to allow them to format more nicely.
2024-12-16 08:48:57 +00:00
Matt Arsenault
a3db5910b4
RegAllocBase: Avoid using temporary DiagnosticInfo (#120046) 2024-12-16 14:57:13 +07:00
Daniil Kovalev
f65a21a4ec
[PAC][ELF][AArch64] Support signed personality function pointer (#119361)
Re-apply #113148 after revert in #119331

If function pointer signing is enabled, sign personality function
pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key,
0x7EAD = `ptrauth_string_discriminator("personality")` constant
discriminator and address diversity enabled.
2024-12-16 10:24:09 +03:00
Craig Topper
54dac27c57
[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_UMAX/G_UMIN. (#120041)
Similar to what we do for unsigned comparisons after #120032.
2024-12-15 23:16:58 -08:00
Craig Topper
115872902b
[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_ICMP. (#120032)
Sign extending i32->i64 is more efficient than zero extend for RV64.
2024-12-15 22:55:58 -08:00
Craig Topper
6dc24f6a2f
[GISel] Improve MachineVerifier for G_SCMP/UCMP. (#120017)
-Ensure destination type is at least 2 bits.
-Remove unnecessary check that both sources are the same type. The
verifier already handles this generically.
2024-12-15 22:08:43 -08:00
Craig Topper
de1a423c23
[GISel][RISCV][AArch64] Support legalizing G_SCMP/G_UCMP to sub(isgt,islt). (#119265)
Convert the LLT to EVT and call
TargetLowering::shouldExpandCmpUsingSelects to determine if we should do
this.

We don't have a getSetccResultType, so I'm boolean extending the
compares to the result type and using that. If the compares legalize to
the same type, these extends will get removed. Unfortunately, if the
compares legalize to a different type, we end up with truncates or
extends that might not be optimally placed.
2024-12-15 20:47:17 -08:00
Matt Arsenault
818bffcb1c
RegAlloc: Fix failure on undef use when all registers are reserved (#119647)
Greedy and fast would hit different assertions on undef uses if all
registers in a class were reserved.
2024-12-16 10:56:45 +09:00
Matt Arsenault
61f99a1c75
RegAlloc: Do not fatal error if there are no registers in the alloc order (#119640)
Try to use DiagnosticInfo if every register in the class is reserved
by forcing assignment to a reserved register. Also reduces the number
of redundant errors emitted, particularly with fast.

This is still broken in the case of undef uses. There are additional
complications in greedy and fast, so leave it for a separate fix.
2024-12-16 10:52:49 +09:00
Matt Arsenault
bb18e49edb
RegAlloc: Use DiagnosticInfo to report register allocation failures (#119492)
Improve the non-fatal cases to use DiagnosticInfo, which will now
provide a location. The allocators attempt to report different errors
if it happens to see inline assembly is involved (this detection is
quite unreliable) using srcloc instead of dbgloc. For now, leave this
behavior unchanged. I think reporting the full location and context
function would be more useful.
2024-12-16 10:49:08 +09:00
Craig Topper
ca60ee2b8c
[GISel] Remove unnecessary MachineVerifier checks for G_ABDS/G_ABDU. (#120014)
These are declared to use a single type index for all operands in
GenericOpcodes.td and the verifier knows how to check that all operands
with the same type index match.
2024-12-15 17:30:32 -08:00
David Green
4c8c130847
[AArch64][GlobalISel] Scalarize i128 shufflevector instructions. (#119980)
This, like other operations, scalarizes shuffle vector operations with
types larger than 64bits. ImplicitDef and Freeze are also handled the
same way, to allow them to legalize. The legalization of
fewerElementsVectorShuffle is adjusted to handled scalarization.
2024-12-15 10:44:40 +00:00
Kazu Hirata
f01b62ad48 [GlobalISel] Fix warnings
This patch fixes:

  llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp:167:21: error:
  unused variable 'DL' [-Werror,-Wunused-variable]

  llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp:320:15: error: unused
  variable 'DL' [-Werror,-Wunused-variable]
2024-12-13 10:24:06 -08:00
Craig Topper
0d9fc17433
[GISel] Remove unused DataLayout operand from getApproximateEVTForLLT (#119833) 2024-12-13 09:09:20 -08:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
David Green
94a77ebe24 [AArch64][GlobalISel] Guard against no operands in matchHoistLogicOpWithSameOpcodeHands
In case both LeftHandInst and RightHandInst are IMPLICIT_DEF with no input
operands, this patch protects against the post-legalizer-combiner
matchHoistLogicOpWithSameOpcodeHands with no operands. The
prelegalizercombiner-hoist-same-hands.mir test was cleaned up a little in the
process, and has a post-legalizer run line added so that the implicit_def do
not get folded awwy.
2024-12-13 11:02:55 +00:00
Aiden Grossman
60325abeb3
[MLGO] Add Threshold to Prevent Pathological Compile Time Cases (#119807)
This patch adds a threshold flag, -mlregalloc-max-cascade, to prevent
live ranges from being evicted more than is necessary.

After deploying a new regalloc model, we ran into some pathological
cases where the model decided it wanted to ping-pong evictions, taking
up a large amount of compile time. This threshold is mostly a stop gap
while we continue to investigate other solutions and work on
minimizing/constructing test cases.
2024-12-12 20:46:45 -08:00
paperchalice
1562b70eaf
Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547)
This relands commit #115111.
Use traditional way to update post dominator tree, i.e. break critical
edge splitting into insert, insert, delete sequence.
When splitting critical edges, the post dominator tree may change its
root node, and `setNewRoot` only works in normal dominator tree...
See

6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)
2024-12-13 11:43:09 +08:00
Aiden Grossman
ada517b40c [MLGO][NFC] Clang format MLRegAllocEvictAdvisor.cpp
Run clang-format to fix an issue in spacing in a comment.
2024-12-13 03:40:15 +00:00
Craig Topper
7ece560a50
[GISel] Support narrowing G_ICMP with more than 2 parts. (#119335)
This allows us to support i128 G_ICMP on RV32. I'm not sure how to test
the "left over" part of this as RISC-V always widens to a power of 2
before narrowing.
2024-12-12 09:50:26 -08:00
Tim Gymnich
2db2dc8ab9
[GlobalISel][NFC] Fix LLT Propagation (#119587)
Retain LLT type information by creating new LLTs from the original LLT
instead of only using the original scalar size.

This PR prepares for the [LLT FPInfo
RFC](https://discourse.llvm.org/t/rfc-globalisel-adding-fp-type-information-to-llt/83349/24)
where LLTs will carry additional floating point type information in
addition to the scalar size.
2024-12-12 09:47:46 -08:00
Igor Kirillov
e909c0ccd4
[SelectOpt] Add support for AShr/LShr operands (#118495)
For conditional increments with sign check conditions like X < 0 or X >= 0,
the compiler may generate code like this:

  %cmp = icmp sgt i64 %1, -1
  %shift = ashr i64 %1, 63
  %j.next = add nsw i64 %j, %shift
  %sel = select i1 %cmp ...

, where %cmp is not in computation but in some other implicit or regular
expressions. This patch allows SelectOptimize pass to recognise these
cases.
2024-12-12 14:06:24 +00:00