989 Commits

Author SHA1 Message Date
c8ef
0c1c37bfbe
[TLI] Add support for the tgamma libcall. (#113791)
This patch adds the `tgamma` libcall.
2024-10-29 10:08:38 +08:00
Thomas Fransham
69ead949d0
[llvm] Enable building Analysis plugins on windows (#112303)
Enable building InlineAdvisorPlugin and InlineOrderPlugin on windows for
shared library builds.

This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and LLVM
plugins on Windows.
2024-10-26 13:15:37 +03:00
Fawdlstty
20bda93e43
[TLI] Add basic support for scalbnxx (#112936)
This patch adds basic support for `scalbln, scalblnf, scalblnl, scalbn,
scalbnf, scalbnl`. Constant folding support will be submitted in a
subsequent patch.

Related issue: <#112631>
2024-10-20 14:17:15 -07:00
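
For readers unfamiliar with the `scalbn` family named in the commit above, here is a minimal sketch of what the libcall computes, assuming a radix-2 floating-point format: `scalbn(x, n)` returns `x` scaled by `2**n`. Python's `math.ldexp` is used only as a stand-in for the C function; this is not how LLVM models or folds the call.

```python
import math

def scalbn(x: float, n: int) -> float:
    """Model of C's scalbn for a radix-2 float: x scaled by 2**n."""
    return math.ldexp(x, n)

print(scalbn(3.0, 4))   # 48.0
print(scalbn(1.0, -1))  # 0.5
```

The `f` and `l` suffixed variants differ only in the floating-point type, and `scalbln` takes a `long` exponent instead of an `int`.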
c8ef
761fa5844e
[TLI] Add support for the ilogb libcall. (#112725)
This patch adds the `ilogb` libcall. Constant folding will be handled in
subsequent patches.
2024-10-18 14:20:34 +08:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
2024-10-11 05:26:03 -07:00
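
The distinction the rename above draws (plain lookup versus lookup that creates on a miss) can be sketched with a toy symbol table. `ToyModule` and its method names are hypothetical and only mirror the naming in the commit message; they are not LLVM's API.

```python
class ToyModule:
    """Hypothetical stand-in for a module's symbol table; not LLVM's API."""

    def __init__(self):
        self.functions = {}

    def get_function(self, name):
        # Pure lookup: returns None when the name was never declared.
        return self.functions.get(name)

    def get_or_insert_function(self, name):
        # Lookup that creates the declaration on a miss.
        return self.functions.setdefault(name, f"declare @{name}")


m = ToyModule()
print(m.get_function("llvm.sqrt.f64"))            # None
print(m.get_or_insert_function("llvm.sqrt.f64"))  # declare @llvm.sqrt.f64
print(m.get_function("llvm.sqrt.f64"))            # declare @llvm.sqrt.f64
```

A name containing "get or insert" tells the caller that the call may mutate the module, which a name like `getDeclaration` did not.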
Jeremy Morse
056a3f4673 [NFC] Reapply 3f37c517f, SmallDenseMap speedups
This time with 100% more building unit tests. Original commit message follows.

[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)

If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.
2024-09-26 10:49:29 +01:00
Nikita Popov
37e5319a12 [UnitTests] Fix APInt signed flags (NFC)
This makes unit tests compatible with the assertion added in
https://github.com/llvm/llvm-project/pull/106524, by setting the
isSigned flag to the correct value or changing how the value is
constructed.
2024-09-20 12:13:33 +02:00
braw-lee
173841cc56
[TLI] Add basic support for fdim libcall (#108702)
First PR to fix #108695.

Signed-off-by: Kushal Pal <kushalpal109@gmail.com>
2024-09-20 10:22:33 +04:00
Benjamin Maxwell
43c9203d49
[TLI] Support inferring function attributes for sincos[f|l] (#108554)
2024-09-18 09:40:29 +01:00
Mircea Trofin
82266d3a2b
[nfc][ctx_prof] Factor the callsite instrumentation exclusion criteria (#108471)
Reusing this in the logic fetching the instrumentation in `CtxProfAnalysis`.
2024-09-13 21:25:47 -07:00
JOE1994
52b48a70d3 [llvm][unittests] Strip unneeded use of raw_string_ostream::str() (NFC)
Avoid excess layer of indirection.
2024-09-13 19:01:08 -04:00
Lang Hames
35a0fd507f Fix another unit test for LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off.
Building with -DLLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off should not
prevent the AnalysisTests unit test from working.

This fix uses the approach implemented in
https://github.com/llvm/llvm-project/pull/101741.

rdar://135849875
2024-09-12 15:27:08 +10:00
Philip Reames
3d9abfc9f8 Consolidate all IR logic for getting the identity value of a reduction [nfc]
This change merges the three different places (at the IR layer) for
finding the identity value of a reduction into a single copy.  This
depends on several prior commits which fix omissions and bugs in
the distinct copies, but this patch itself should be fully
non-functional.

As the new comments and naming try to make clear, the identity value
is a property of the @llvm.vector.reduce.* intrinsic, not of e.g.
the recurrence descriptor.  (We still provide an interface for
clients using recurrence descriptors, but the implementation simply
translates to the intrinsic which each corresponds to.)

As a note, the getIntrinsicIdentity API does not support fminnum/fmaxnum
or fminimum/fmaximum which is why we still need manual logic (but at
least only one copy of manual logic) for those cases.
2024-09-04 08:23:21 -07:00
Daniil Fukalov
89e6a28867
[NFC] Add explicit #include llvm-config.h where its macros are used. (#106621)
Without these explicit includes, removing other headers that implicitly
include llvm-config.h may have non-trivial side effects.
2024-08-30 09:35:06 +02:00
Mircea Trofin
1991aa6b48
Reapply "[nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPropertiesAnalysis (#104867)" (#106309)
Reverts c992690179eb5de6efe47d5c8f3a23f2302723f2.

The problem is that if there is a sequence "{delete A->B} {delete A->B}
{insert A->B}" the net result is "{delete A->B}", which is not what we
want.

Duplicate successors may happen in cases like switch statements (as
shown in the unit test).

The second problem was that in `invoke` cases, some edges we speculate may get deleted don't, but are also not reachable from the inlined call site's basic block. We just need to check which edges are actually not present anymore.

The fix is to sanitize the list of deletes, just like we do for inserts.
2024-08-29 18:28:09 -07:00
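
The net-effect problem described in the commit above can be illustrated with a toy model. It assumes that updates are collapsed per edge by counting inserts against deletes, which is a simplification for illustration and not LLVM's implementation; `net_updates` and `dedupe` are hypothetical helper names.

```python
from collections import defaultdict

def net_updates(updates):
    """Collapse (kind, edge) updates into their net effect per edge."""
    balance = defaultdict(int)
    for kind, edge in updates:
        balance[edge] += 1 if kind == "insert" else -1
    return {edge: ("insert" if n > 0 else "delete")
            for edge, n in balance.items() if n != 0}

def dedupe(updates):
    """Drop repeated (kind, edge) pairs, keeping first occurrences in order."""
    return list(dict.fromkeys(updates))

edge = ("A", "B")
raw = [("delete", edge), ("delete", edge), ("insert", edge)]
print(net_updates(raw))          # {('A', 'B'): 'delete'}
print(net_updates(dedupe(raw)))  # {}
```

Under this model the duplicated delete, as produced by a switch with repeated successors, outweighs the single insert; once the delete list is sanitized the two updates cancel.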
Nikita Popov
69c43468d3 [LoopUnrollAnalyzer] Don't simplify signed pointer comparison
We're generally not able to simplify signed pointer comparisons
(because we don't have no-wrap flags that would permit it), so
we shouldn't pretend that we can in the cost model.

The unsigned comparison case is also not modelled correctly,
as explained in the added comment. As this is a cost model
inaccuracy at worst, I'm leaving it alone for now.
2024-08-28 12:15:12 +02:00
Mircea Trofin
1e70122cbc
[ctx_prof] API to get the instrumentation of a BB (#105468)
Analogous to PR #104491 

Issue #89287
2024-08-21 17:17:46 -07:00
Dmitri Gribenko
a811f26335 [llvm][test] Write temporary files into a temporary directory
2024-08-21 10:21:06 +02:00
Mircea Trofin
30318401ad
Fix post-104491 (#105191)
2024-08-20 11:07:10 -07:00
Mircea Trofin
c8a678b1e4
[ctx_prof] Add analysis utility to fetch ID of a callsite (#104491)
This will be needed when maintaining the contextual profile for ICP or inlining - we'll need to first fetch the ID of a callsite, which is in an instrumentation instruction (intrinsic) preceding the callsite.
2024-08-20 10:49:42 -07:00
Noah Goldstein
c6e16a49ef [TLI] Add support for inferring attr cold/noreturn on std::terminate and __cxa_throw
These functions are both inherently on the error path so `cold` seems
appropriate. `noreturn` is definitional.

Closes #101622
2024-08-18 15:37:56 -07:00
Daniil Fukalov
0da2ba811a
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused direct includes and forward declarations in ADT and
Analysis headers.
2024-08-17 13:11:18 +02:00
Tyler Nowicki
7ff377ba60
[Analysis] Fix null ptr dereference when using WriteGraph without branch probability info (#104102)
The call to 'CFGInfo->getBPI()->getEdgeProbability(Node, SuccBB);' fails
when BPI is not provided. In this case we can give up and not print any
edge attributes.

---------

Co-authored-by: tnowicki <tnowicki.nowicki@amd.com>
2024-08-16 15:23:51 -04:00
YunQiang Su
fb9e685fc4
Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)
C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IEEE754-2019. Let's
introduce new intrinsics to support them.

This patch introduces support for scalar values only. The
support of
  vector (vp, vp.reduce, vector.reduce),
  experimental.constrained
will be added in future patches.

With this patch, MIPSr6 and LoongArch can work out of the box with
fcanonical and fmax/fmin.

Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, while
they have no fcanonical support yet.
I will add it in future patches.

The FMIN/FMAX instructions of RISC-V follow the
minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a
future patch.

Background

https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which behave differently on
different platforms for NUM vs sNaN:
   1) Fallback to fmin(3)/fmax(3): return qNaN.
   2) ARM64/ARM32+Neon: same as libc.
   3) MIPSr6/LoongArch/RISC-V: return NUM.

The fix to make fminnum/fmaxnum follow minNUM/maxNUM of IEEE754-2008
will be submitted as separate patches.
2024-08-15 14:09:36 +08:00
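
As a rough illustration of the IEEE754-2019 minimumNumber operation that the commit above refers to, the sketch below models its behaviour for quiet inputs: a single NaN operand is ignored, both-NaN yields NaN, and -0.0 is ordered below +0.0. Python cannot distinguish signaling NaNs, so that part of the semantics is not modelled, and `minimum_number` is a hypothetical name rather than an LLVM or libm function.

```python
import math

def minimum_number(x: float, y: float) -> float:
    """Sketch of IEEE 754-2019 minimumNumber for quiet inputs."""
    if math.isnan(x):
        return y  # also covers the both-NaN case, which yields NaN
    if math.isnan(y):
        return x
    if x == y == 0.0:
        # -0.0 is ordered below +0.0
        return x if math.copysign(1.0, x) < 0 else y
    return min(x, y)

print(minimum_number(float("nan"), 1.0))  # 1.0
print(minimum_number(0.0, -0.0))          # -0.0
```

maximumNumber is the mirror image: swap `min` for `max` and prefer +0.0 over -0.0.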
Sergei Barannikov
6cf3e7d067
[DataLayout] Use member initialization (NFC) (#103712)
This also adds a default constructor and a few uses of it.
2024-08-14 15:02:47 +03:00
Nikita Popov
6da3361f50
[SCEV] Look through multiply in computeConstantDifference() (#103051)
Inside computeConstantDifference(), handle the case where both sides are
of the form `C * %x`, in which case we can strip off the common
multiplication (as long as we remember to multiply by it for the
following difference calculation).

There is an obvious alternative implementation here, which would be to
directly decompose multiplies inside the "Multiplicity" accumulation.
This does work, but I've found this to be both significantly slower
(because everything has to work on APInt) and more complex in
implementation (e.g. because we now need to match back the new More/Less
with an arbitrary factor) without providing more power in practice. As
such, I went for the simpler variant here.

This is the last step to make computeConstantDifference() sufficiently
powerful to replace existing uses of
`cast<SCEVConstant>(getMinusSCEV())` with it.
2024-08-14 09:37:38 +02:00
Snehasish Kumar
1ccd7ab8b6
Enhance TLI detection of __size_returning_new lib funcs. (#102391)
Previously the return types of __size_returning_new variants were not
validated based on their members. This patch checks the members
manually and also generalizes the size_t checks to be based on the module
instead of being hardcoded.

As requested in a follow-up comment on
https://github.com/llvm/llvm-project/pull/101564.
2024-08-13 12:44:10 -07:00
Nikita Popov
306b9c7b48
[SCEV] Handle more add/addrec mixes in computeConstantDifference() (#101999)
computeConstantDifference() can currently look through addrecs with
identical steps, and then through adds with identical operands (apart
from constants).

However, it fails to handle minor variations, such as two nested add
recs, or an outer add with an inner addrec (rather than the other way
around).

This patch supports these cases by adding a loop over the
simplifications, limited to a small number of iterations. The motivation
is the same as in #101339, to make
computeConstantDifference() powerful enough to replace existing uses of
`dyn_cast<SCEVConstant>(getMinusSCEV())` with it. Though as the IR test
diff shows, other callers may also benefit.
2024-08-13 11:01:39 +02:00
Sergei Barannikov
4bffbba7e9
[UnitTests] Convert a test to use opaque pointers (#102668)
2024-08-10 00:54:29 +03:00
Jeremy Morse
2b1122eaec
[DebugInfo][RemoveDIs] Use iterator-insertion in unittests and fuzzer (#102015)
These are the final few places in LLVM that use instruction pointers to
insert instructions -- use iterators instead, which is needed for
debug-info correctness in the future. Most of this is a gentle
scattering of getIterator calls or not deref-then-addrofing iterators.
libfuzzer does require a storage change to keep built instruction
positions in a container though. The unit-test changes are very
straightforward.

This leaves us in a position where libfuzzer can't fuzz on either kind of
debug-info record; however, I don't believe that fuzzing of debug-info
is in scope for the library.
2024-08-08 15:18:34 +01:00
Steven Wu
b8c560f159
[CMake] Remove EXPORT_SYMBOLS_FOR_PLUGINS from #102138 (#102396)
Partially remove some of the changes from #102138 as
EXPORT_SYMBOLS_FOR_PLUGINS doesn't work on all the configurations.
2024-08-08 06:00:11 -07:00
Steven Wu
01b488faab
Reapply "[CMake] Fold export_executable_symbols_* into function args. (#101741)" (#102138)
Fix the builds with LLVM_TOOL_LLVM_DRIVER_BUILD enabled.

LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES is not completely
compatible with export_executable_symbols as the latter will be ignored
if the former is set to NO.

Fix the issue by passing whether symbols need to be exported to
llvm_add_exectuable so the link flag can be determined directly
without calling export_executable_symbols_* later.
2024-08-07 09:12:15 -07:00
Snehasish Kumar
874890c682
Add __size_returning_new variant detection to TLI. (#101564)
Add support to detect __size_returning_new variants defined in proposal
P0901R5 to extend operator new; see
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0901r5.html for
details.

This PR matches the declarations exported by tcmalloc in
f2516691d0/tcmalloc/malloc_extension.h (L707-L711)
2024-08-06 17:41:46 -07:00
Steven Wu
f9b69a378c Revert "[CMake] Fold export_executable_symbols_* into function args. (#101741)"
This reverts commit 5c56b46a32a8856a022a54291bc9294068f7ddbd. This breaks
the lld build when using GENERATE_DRIVER.
2024-08-06 06:08:16 -07:00
Steven Wu
5c56b46a32
[CMake] Fold export_executable_symbols_* into function args. (#101741)
`LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES` is not completely
compatible with `export_executable_symbols` as the latter will be ignored
if the former is set to NO.

Fix the issue by passing whether symbols need to be exported to
`llvm_add_exectuable` so the link flag can be determined directly
without calling `export_executable_symbols_*` later.
2024-08-05 19:08:27 -07:00
Nikita Popov
337c8b1a4f [SCEV] Add additional computeConstantDifference() tests (NFC)
And some tweaks to make this test easier to update. Use raw
string literals and print parse and verifier errors.
2024-08-05 16:49:09 +02:00
Kazu Hirata
7df9da7d78
[llvm] Construct SmallVector with ArrayRef (NFC) (#101872)
2024-08-04 08:54:23 -07:00
Florian Hahn
edf46f365c
[SCEV] Use const SCEV * explicitly in more places.
Use const SCEV * explicitly in more places to prepare for
https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
2024-08-03 20:10:01 +01:00
Nikita Popov
79af6892f8
[SCEV] Handle more adds in computeConstantDifference() (#101339)
Currently it only deals with the case where we're subtracting adds with
at most one non-constant operand. This patch extends it to cancel out
common operands for the subtraction of arbitrary add expressions.

The background here is that I want to replace a getMinusSCEV() call in
LAA with computeConstantDifference():

93fecc2577/llvm/lib/Analysis/LoopAccessAnalysis.cpp (L1602-L1603)

This particular call is very expensive in some cases (e.g. lencod with
LTO) and computeConstantDifference() could achieve this much more
cheaply, because it does not need to construct new SCEV expressions.

However, the current computeConstantDifference() implementation is too
weak for this and misses many basic cases. This is a step towards making
it more powerful while still keeping it pretty fast.
2024-08-02 13:43:02 +02:00
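
The idea of cancelling common operands when subtracting two add expressions, as described in the commit above, can be sketched as follows. Each add expression is modelled as a constant plus a list of opaque symbolic operands; this representation and the name `constant_difference` are assumptions made for illustration and do not reflect how SCEV stores expressions.

```python
from collections import Counter

def constant_difference(more, less):
    """Return more - less if it is a constant, else None.

    Each add expression is modelled as (constant, [symbolic operands]).
    """
    more_const, more_ops = more
    less_const, less_ops = less
    # Common operands cancel; anything left over makes the difference symbolic.
    if Counter(more_ops) != Counter(less_ops):
        return None
    return more_const - less_const

print(constant_difference((7, ["%a", "%b"]), (3, ["%b", "%a"])))  # 4
print(constant_difference((7, ["%a"]), (3, ["%b"])))              # None
```

Comparing multisets this way never builds a new expression for the difference, which is the cost advantage the commit message attributes to computeConstantDifference() over getMinusSCEV().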
Yingwei Zheng
6aa723daa9
[TLI] Add support for nan libfunc (#101356)
Reference: https://en.cppreference.com/w/cpp/numeric/math/nan
2024-08-01 01:49:38 +08:00
Justin Bogner
6992ebcb4b
Reapply "[DXIL][Analysis] Make alignment on StructuredBuffer optional" (#101113)
Unfortunately storing a `MaybeAlign` in ResourceInfo deletes our move
constructor in compilers that haven't implemented [P0602R4], like GCC 7.
Since we only ever use the alignment in ways where alignment 1 and unset
are ambiguous anyway, we'll just store the integer AlignLog2 value that
we'll eventually use directly.

[P0602R4]:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0602r4.html

This reverts commit c22171f12fa9f260e2525cf61b93c136889e17f2, reapplying
a94edb6b8e321a46fe429934236aaa4e2e9fb97f.
2024-07-30 10:38:34 -07:00
Johannes Reifferscheid
65697b1c7c
Remove value cache in SCEV comparator. (#100721)
The cache almost never triggers, and seems unlikely to help with
performance. However, when it does, it is likely to cause the comparator
to become inconsistent due to a bad interaction of the depth limit and
cache hits. This leads to crashes in debug builds. See the new unit test
for a reproducer.
2024-07-30 14:27:37 +02:00
Justin Bogner
c22171f12f
Revert "[DXIL][Analysis] Make alignment on StructuredBuffer optional" (#101088)
Seeing build failures, reverting to investigate.

Reverts llvm/llvm-project#100697
2024-07-29 14:49:28 -07:00
Justin Bogner
a94edb6b8e
[DXIL][Analysis] Make alignment on StructuredBuffer optional
HLSL allows StructuredBuffer<> to be defined with scalar or
up-to-4-element vectors as well as with structs, but when doing so
`dxc` doesn't set the alignment. Emulate this.

Pull Request: https://github.com/llvm/llvm-project/pull/100697
2024-07-29 14:41:15 -07:00
Noah Goldstein
67fb7c34f1 [TLI] Add support for inferring attr cold on exit/abort
`abort` can always be assumed cold, and a non-zero `exit` status can be
assumed to be a `cold` path as well.

Closes #101003
2024-07-30 00:56:53 +08:00
Justin Bogner
59e91d4c50
[DXIL][Analysis] Make the DXILResource binding optional. NFC
This makes the binding structure in a DXILResource default to empty
and need a separate call to set up, and also moves the unique ID into
it since bindings are the only place where those are actually used.

This will put us in a better position when dealing with resource
handles in libraries.

Pull Request: https://github.com/llvm/llvm-project/pull/100623
2024-07-25 12:27:57 -07:00
Justin Bogner
b365dbbd8d
[DXIL][Analysis] Move dxil::ResourceInfo to the Analysis library. NFC
I had put this in Transforms/Utils, but that doesn't actually make
sense if we want to populate these structures via an analysis pass.

Pull Request: https://github.com/llvm/llvm-project/pull/100621
2024-07-25 11:22:04 -07:00
David Sherwood
ac6061e084
[Analysis] Add new function isDereferenceableReadOnlyLoop (#97292)
I created this patch due to a reviewer request on PR #88385 to split off
the analysis changes; however, without the other code in that PR I can
only test the new function with unit tests.
Yingwei Zheng
e8fbefe15b
[TLI] Add basic support for remquo libcall (#99611)
This patch adds basic support for `remquo`. Constant folding support
will be submitted in a subsequent patch.

Related issue: https://github.com/llvm/llvm-project/issues/99497
2024-07-19 16:35:59 +08:00
mskamp
b22fa9093b
[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)
Add KnownBits computations to ValueTracking and X86 DAG lowering.
    
These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.
    
There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.
    
Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.
    
Fixes #82516.
2024-07-16 15:50:21 +01:00
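
The horizontal-add example from the commit above, `phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]`, can be written out as a small model. The 16-bit default lane width and the wrap-around masking are assumptions for illustration; the sketch shows only the lane arithmetic, not the KnownBits computation itself.

```python
def phadd(x, y, bits=16):
    """Horizontal add: sum adjacent lanes of x, then of y, wrapping to `bits`."""
    mask = (1 << bits) - 1
    lanes = list(x) + list(y)
    return [(lanes[i] + lanes[i + 1]) & mask for i in range(0, len(lanes), 2)]

print(phadd([1, 2], [10, 20]))  # [3, 30]
```

Because each result lane is an ordinary addition of two input lanes, whatever is known about every such sum holds for the whole result, which is why the commit intersects the per-lane results.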