llvm-project

Author	SHA1	Message	Date
c8ef	0c1c37bfbe	[TLI] Add support for the `tgamma` libcall. (#113791 ) This patch adds the `tgamma` libcall.	2024-10-29 10:08:38 +08:00
Ellis Hoag	6ab26eab4f	Check hasOptSize() in shouldOptimizeForSize() (#112626 )	2024-10-28 09:45:03 -07:00
Eirik Byrkjeflot Anonsen	d2e9532fe1	[DemoteRegToStack] Use correct variable for branch instructions in DemoteRegToStack (#113798 ) I happened to see this code, and it seems "obviously" wrong to me. So here's what I think this code is supposed to look like.	2024-10-27 17:09:39 +08:00
davidtrevelyan	4102625380	[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155 ) # What This PR renames the newly-introduced llvm attribute `sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise, sibling variables such as `SanitizeRealtimeUnsafe` are renamed to `SanitizeRealtimeBlocking` respectively. There are no other functional changes. # Why? - There are a number of problems that can cause a function to be real-time "unsafe", - we wish to communicate what problems rtsan detects and why they're unsafe, and - a generic "unsafe" attribute is, in our opinion, too broad a net - which may lead to future implementations that need extra contextual information passed through them in order to communicate meaningful reasons to users. - We want to avoid this situation and make the runtime library boundary API/ABI as simple as possible, and - we believe that restricting the scope of attributes to names like `sanitize_realtime_blocking` is an effective means of doing so. We also feel that the symmetry between `[[clang::blocking]]` and `sanitize_realtime_blocking` is easier to follow as a developer. # Concerns - I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been part of the tree for a few weeks now (introduced here: https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't been released in version 20 yet, am I correct in considering this to not be a breaking change?	2024-10-26 13:06:11 +01:00
Justin Fargnoli	8a12e0131f	Revert "[LLVM] Add IRNormalizer Pass" (#113392 ) Reverts llvm/llvm-project#68176 Introduced BuildBot failure: https://github.com/llvm/llvm-project/pull/68176#issuecomment-2428243474	2024-10-22 16:01:32 -07:00
Fabian Ritter	4c697f7037	[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707 ) The IR lowering of memcpy/memmove intrinsics uses a target-specific type for its load/store operations. So far, the loaded and stored addresses are computed with GEPs based on this type. That is wrong if the allocation size of the type differs from its store size: The width of the accesses is determined by the store size, while the GEP stride is determined by the allocation size. If the allocation size is greater than the store size, some bytes are not copied/moved. This patch changes the GEPs to use i8 addressing, with offsets based on the type's store size. The correctness of the lowering therefore no longer depends on the type's allocation size. This is in support of PR #112332, which allows adjusting the memcpy loop lowering type through a command line argument in the AMDGPU backend.	2024-10-22 16:48:50 +02:00
Justin Fargnoli	1295d2e6da	[LLVM] Add IRNormalizer Pass (#68176 ) Add the llvm-canon tool. Description from the [original PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu): > Added a new llvm-canon tool which aims to transform LLVM Modules into a canonical form by reordering and renaming instructions while preserving the same semantics. This tool makes it easier to spot semantic differences while diffing two modules which have undergone different transformation passes. The current version of this tool can: - Reorder instructions within a function. - Rename instructions based on the operands. - Sort commutative operands. This code was originally written by @michalpaszkowski and [submitted to mainline LLVM](`14d358537f`). However, it was quickly [reverted](`335de55fa3`) due to BuildBot errors. Michal presented his version of the tool in [LLVM-Canon: Shooting for Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg). @AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains. Note that this is the first LLVM submission for @AidanGoldfarb and me. Please liberally critique the PR! CC @plotfi for initial review. --------- Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>	2024-10-21 18:11:11 -07:00
Fawdlstty	20bda93e43	[TLI] Add basic support for scalbnxx (#112936 ) This patch adds basic support for `scalbln, scalblnf, scalblnl, scalbn, scalbnf, scalbnl`. Constant folding support will be submitted in a subsequent patch. Related issue: <#112631>	2024-10-20 14:17:15 -07:00
Kazu Hirata	6ec113d4c3	[Local] Avoid repeated map lookups (NFC) (#113072 )	2024-10-20 09:06:22 -07:00
c8ef	761fa5844e	[TLI] Add support for the `ilogb` libcall. (#112725 ) This patch adds the `ilogb` libcall. Constant folding will be handled in subsequent patches.	2024-10-18 14:20:34 +08:00
Matt Arsenault	f225b07799	Utils: Preserve address space for global_ctors (#112532 )	2024-10-18 09:53:46 +04:00
goldsteinn	69a798a996	Reapply "[Inliner] Propagate more attributes to params when inlining (#91101 )" (2nd Attempt) (#112749 ) Root cause of the bug was code hanging onto `range` attr after changing BitWidth. This was fixed in PR #112633.	2024-10-17 20:28:47 -05:00
Florian Hahn	b060661da8	[SCEVExpander] Expand UDiv avoiding UB when in seq_min/max. (#92177 ) Update SCEVExpander to introduce a SafeUDivMode, which is set when expanding operands of SCEVSequentialMinMaxExpr. In this mode, the expander will make sure that the divisor of the expanded UDiv is neither 0 nor poison. Fixes https://github.com/llvm/llvm-project/issues/89958. PR https://github.com/llvm/llvm-project/pull/92177	2024-10-17 13:55:20 -07:00
goldsteinn	c85611e858	[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649 ) In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated it will cause an error. Fixes #112633	2024-10-17 10:32:55 -05:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Kazu Hirata	91b2ac640e	[Transforms] Avoid repeated hash lookups (NFC) (#112654 )	2024-10-17 07:45:02 -07:00
Nikita Popov	255a99c29f	[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309 ) This fixes all the places that hit the new assertion added in https://github.com/llvm/llvm-project/pull/106524 in tests. That is, cases where the value passed to the APInt constructor is not an N-bit signed/unsigned integer, where N is the bit width and signedness is determined by the isSigned flag. The fixes either set the correct value for isSigned, set the implicitTrunc flag, or perform more calculations inside APInt. Note that the assertion is currently still disabled by default, so this patch is mostly NFC.	2024-10-17 08:48:08 +02:00
Arthur Eubanks	9e6d24f61f	Revert "[Inliner] Propagate more attributes to params when inlining (#91101 )" This reverts commit ae778ae7ce72219270c30d5c8b3d88c9a4803f81. Creates broken IR, see comments in #91101.	2024-10-16 21:21:34 +00:00
goldsteinn	ae778ae7ce	[Inliner] Propagate more attributes to params when inlining (#91101 ) - [Inliner] Add tests for propagating more parameter attributes; NFC - [Inliner] Propagate more attributes to params when inlining Add support for propagating: - `dereferenceable` - `dereferenceable_or_null` - `align` - `nonnull` - `range` These are only propagated if the parameter to the to-be-inlined callsite matches the exact parameter used in the to-be-inlined function.	2024-10-16 11:53:21 -05:00
Kazu Hirata	e1d205a385	[SCCP] Simplify code with DenseMap::operator[] (NFC) (#112473 )	2024-10-16 00:09:12 -07:00
goldsteinn	3c777f04f0	[Inliner] Don't propagate access attr to byval params (#112256 ) - [Inliner] Add tests for bad propagation of access attr for `byval` param; NFC - [Inliner] Don't propagate access attr to `byval` params We previously only handled the case where the `byval` attr was in the callbase's param attr list. This PR also handles the case where `byval` is a param attr on the function's param attr list.	2024-10-15 09:25:16 -05:00
elhewaty	9efb07f261	[IR] Add `samesign` flag to icmp instruction (#111419 ) Inspired by https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423	2024-10-15 17:11:25 +08:00
Tim Renouf	76007138f4	[LLVM] New NoDivergenceSource function attribute (#111832 ) A call to a function that has this attribute is not a source of divergence, as used by UniformityAnalysis. That allows a front-end to use known-name calls as an instruction extension mechanism (e.g. https://github.com/GPUOpen-Drivers/llvm-dialects ) without such a call being a source of divergence.	2024-10-12 09:34:45 +01:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
braw-lee	3645c64d87	[SimplifyLibCalls] fdim constant fold (#109235 ) 2nd PR to fix #108695 based on #108702 --------- Signed-off-by: Kushal Pal <kushalpal109@gmail.com>	2024-10-10 14:44:39 +04:00
David Green	5184d763c7	[InstCombine] Convert @log to @llvm.log if the input is known positive. (#111428 ) Similar to 112aac4e8961b9626bb84f36deeaa5a674f03f5a, this converts log libcalls to llvm.log.f64 intrinsics if we know they do not set errno, as the input is not zero and not negative. As log will produce errno if the input is 0 (returning -inf) or if the input is negative (returning nan), we also perform the conversion when we have noinf and nonan.	2024-10-10 09:54:25 +01:00
Noah Goldstein	82ac399733	[SimplifyCFG] Allow merging invoke's with different attrs Same logic as other callsites, if the attributes are intersectable, we merge. Closes #111713	2024-10-10 01:07:59 -05:00
Amara Emerson	18d655fdcc	[SimplifyCFG][NFC] Improve compile time for TryToSimplifyUncondBranchFromEmptyBlock optimization. (#110715 ) In some pathological cases this optimization can spend an unreasonable amount of time populating the set for predecessors of the successor block. This change sinks some of that initializing to the point where it's actually necessary so we can take advantage of the existing early-exits. rdar://137063034	2024-10-09 10:12:07 -07:00
Florian Mayer	5f36042508	[NFC] [HWASan] [MTE] factor out threadlong increment (#110340 )	2024-10-08 15:53:01 -07:00
David Green	db98be3c71	[InstCombine] Minor cleanup for optimizeFMod. NFC	2024-10-08 15:00:30 +01:00
Teresa Johnson	79b32bcda6	[MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998 ) We weren't flagging inlined callee functions with callsite but not memprof metadata correctly, leading to the callsite metadata not being stripped when that function was inlined into a callsite that didn't itself have callsite metadata. In practice, this meant that we went into the LTO link with many more calls than necessary having callsite metadata / summary records, which in turn made the graph larger than necessary. Fixing this oversight resulted in huge reductions in the thin link of a large target: 99% fewer duplicated context ids (recall we have to duplicate when callsites containing the same stack ids are in different functions) 71% fewer graph edges 17% fewer graph nodes 13% fewer functions cloned 44% smaller peak memory 47% smaller time	2024-10-03 08:06:56 -07:00
Mehdi Amini	6c7a3f80e7	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110938 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if. Re-apply #110185 with more fixes for debug build with the ABI breaking checks disabled.	2024-10-03 01:24:14 +02:00
Christopher Di Bella	45ad1ac4a3	Revert "Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if inst… (#110923 ) …ead of #ifdef (#110883)" This reverts commit 1905cdbf4ef15565504036c52725cb0622ee64ef, which causes lots of failures where LLVM doesn't have the right header guards. The errors can be seen on [BuildKite](https://buildkite.com/llvm-project/upstream-bazel/builds/112362#01924eae-231c-4d06-ba87-2c538cf40e04), where the source uses `#ifndef NDEBUG`, but the content in question is defined when `LLVM_ENABLE_ABI_BREAKING_CHECKS == 1`. For example, `llvm/include/llvm/Support/GenericDomTreeConstruction.h` has the following: ```cpp // Helper struct used during edge insertions. struct InsertionInfo { // ... #ifdef LLVM_ENABLE_ABI_BREAKING_CHECKS SmallVector<TreeNodePtr, 8> VisitedUnaffected; #endif }; // ... InsertionInfo II; // ... #ifndef NDEBUG II.VisitedUnaffected.push_back(SuccTN); #endif ```	2024-10-02 13:54:09 -07:00
Mehdi Amini	1905cdbf4e	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110883 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if. Reapply https://github.com/llvm/llvm-project/pull/110185 with fixes.	2024-10-02 18:43:16 +02:00
Rahul Joshi	5a40bc2383	[NFC] Fix typo in function name `generatedUnsignedRemainderCode` (#110743 ) Rename `generatedUnsignedRemainderCode` to `generateUnsignedRemainderCode`.	2024-10-02 06:05:41 -07:00
Nikita Popov	9f3d1695eb	[SCEVExpander] Preserve gep nuw during expansion (#102133 ) When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)	2024-10-02 11:45:00 +02:00
Noah Goldstein	4d4beeb43c	[SimplifyCFG] Supporting hoisting/sinking callbases with differing attrs Some (many) attributes can safely be dropped to enable sinking. For example removing `nonnull` on a return/param can't affect correctness. Closes #109472	2024-10-01 18:27:08 -05:00
Ramkumar Ramachandra	9f6f6afa31	LoopSimplify: strip dependency on DA (NFC) (#107379 ) Since no passes compute DependenceAnalysis via the PassManager, there is no value in preserving it here. Hence, strip the unnecessary dependency on DependenceAnalysis.	2024-10-01 16:24:57 +01:00
Nikita Popov	4b3ba64ba7	[SCEVExpander] Clear flags when reusing GEP (#109293 ) As pointed out in the review of #102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.	2024-10-01 14:22:54 +02:00
Alex Voicu	4852374135	[llvm][opt][Transforms] Replacement `calloc` should match replaced `malloc` (#110524 ) Currently DSE unconditionally emits `calloc` as returning a pointer to AS0. However, this is incorrect for targets that have a non-zero default AS, as it'd not match the `malloc` signature. This patch addresses that by piping through the AS for the pointer returned by `malloc` into the `calloc` insertion call.	2024-10-01 02:05:28 +01:00
Jeremy Morse	96f37ae453	[NFC] Use initial-stack-allocations for more data structures (#110544 ) This replaces some of the most frequent offenders of using a DenseMap that cause a malloc, where the typical element-count is small enough to fit in an initial stack allocation. Most of these are fairly obvious, one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we end up calling malloc as well for the DenseMap in the MapVector.	2024-09-30 23:15:18 +01:00
Simone Campanoni	5d19d55ce1	[SimplifyCFG] Better aligned a comment. (#109307 )	2024-09-30 09:39:35 -07:00
Nikita Popov	f445e39ab2	[SimplifyCFG] Use isWritableObject() API (#110127 ) SimplifyCFG store speculation currently has some homegrown code to check for a writable object, handling the alloca special case only. Switch it to use the generic isWritableObject() API, which means that we also support byval arguments, allocator return values, and writable arguments. I've adjusted isWritableObject() to also check for the noalias attribute when handling writable. Otherwise, I don't think that we can generalize from at-entry writability. This was not relevant for previous uses of the function, because they'd already require noalias for other reasons anyway.	2024-09-30 10:03:46 +02:00
Joshua Cao	0bc98349c8	[LICM] Use DomTreeUpdater version of SplitBlockPredecessors, nfc (#107190 ) The DominatorTree version is marked for deprecation, so we use the DomTreeUpdater version. We also update sinkRegion() to iterate over basic blocks instead of DomTreeNodes. The loop body calls SplitBlockPredecessors. The DTU version calls DomTreeUpdater::apply_updates(), which may call DominatorTree::reset(). This invalidates the worklist of DomTreeNodes to iterate over.	2024-09-29 21:28:45 -07:00
Mehdi Amini	68ddd6c80e	Revert "Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef" (#110310 ) Reverts llvm/llvm-project#110185 There are inconsistencies in some of these macros, which unfortunately isn't caught by a single upstream bot.	2024-09-27 20:34:14 +02:00
Mircea Trofin	c4952e513f	[nfc][ctx_prof] Efficient profile traversal and update (#110052 ) This optimizes profile updates and visits, where we want to access contexts for a specific function. These are all the current update cases. We do so by maintaining a list of contexts for each function, preserving preorder traversal. The list is updated whenever contexts are `std::move`-d or deleted.	2024-09-27 08:09:10 -07:00
Mehdi Amini	5e98136679	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110185 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if.	2024-09-27 11:52:22 +02:00
Ellis Hoag	fbec1c2a08	[NFC][CodeLayout] Remove unused parameter (#110145 ) The `NodeCounts` parameter of `calcExtTspScore()` is unused, so remove it. Use `SmallVector` since arrays are expected to be small since they represent MBBs.	2024-09-26 10:28:06 -07:00
Jeremy Morse	056a3f4673	[NFC] Reapply 3f37c517f, SmallDenseMap speedups This time with 100% more building unit tests. Original commit message follows. [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.	2024-09-26 10:49:29 +01:00
Mircea Trofin	c8365feed7	[ctx_prof] Simple ICP criteria during module inliner (#109881 ) This is mostly for test: under contextual profiling, we perform ICP for those indirect callsites which have targets marked as `alwaysinline`. This helped uncover a bug with the way the profile was updated upon ICP, where we were skipping over the update if the target wasn't called in that context. That was resulting in incorrect counts for the indirect BB. Also flyby fix to the total/direct count values, they should be 64-bit (as all counters are in the contextual profile)	2024-09-25 15:05:52 -07:00

7588 Commits