llvm-project

Author	SHA1	Message	Date
Johannes Rudolf Doerfert	41a278f56a	[OpenMP][FIX] Do not add custom state machine eagerly in LTO runs If we run LTO optimization we migth end up introducing a custom state machine and later transforming the region into SPMD. This is a problem. While a follow up will introduce a check for the SPMD conversion, this already prevents the eager custom state machine generation. Only if the kernel init function is defined, rather then declared, we will emit a custom state machine. SPMD-zation can happen eagerly though. Tests are adjusted via a weak definition. The LTO test was added to verify this works as expected. Differential Revision: https://reviews.llvm.org/D136740	2022-10-26 10:40:11 -07:00
Momchil Velikov	9901583968	Revert "[FuncSpec] Fix specialisation based on literals" This reverts commit a8b0f580170089fcd555ade5565ceff0ec60f609 because of "reverse-iteration" buildbot failure.	2022-10-26 13:54:12 +01:00
Momchil Velikov	2c8a4c6e62	Revert "[FuncSpec][NFC] Refactor finding specialisation opportunities" This reverts commit a8853924bd3c50deebfbf993c037257ccf9805f4 due to dependency on a8b0f5801700	2022-10-26 13:54:12 +01:00
Momchil Velikov	a8853924bd	[FuncSpec][NFC] Refactor finding specialisation opportunities This patch reorders the traversal of function call sites and function formal parameters to: * do various argument feasibility checks (`isArgumentInteresting` ) only once per argument, i.e. doing N-args checks instead of N-calls x N-args checks. * do hash table lookups only once per call site, i.e. N-calls lookups/inserts instead of N-call x N-args lookups/inserts. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D135968	2022-10-26 10:18:35 +01:00
Momchil Velikov	606d25e545	[FuncSpec] Compute specialisation gain even when forcing specialisation When rewriting the call sites to call the new specialised functions, a single call site can be matched by two different specialisations - a "less specialised" version of the function and a "more specialised" version of the function, e.g. for a function void f(int x, int y) the call like `f(1, 2)` could be matched by either void f.1(int x /* int y == 2 /); or void f.2(/ int x == 1, int y == 2 */); The `FunctionSpecialisation` pass tries to match specialisation in the order of decreasing gain, so "more specialised" functions are preferred to "less specialised" functions. This breaks, however, when using the flag `-force-function-specialization`, in which case the cost/benefit analysis is not performed and all the specialisations are equally preferable. This patch makes the pass calculate specialisation gain and order the specialisations accordingly even when `-force-function-specialization` is used, under the assumption that this flag has purely debugging purpose and it is reasonable to ignore the extra computing effort it incurs. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D136180	2022-10-26 10:08:03 +01:00
Momchil Velikov	a8b0f58017	[FuncSpec] Fix specialisation based on literals The `FunctionSpecialization` pass has support for specialising functions, which are called with literal arguments. This functionality is disabled by default and is enabled with the option `-function-specialization-for-literal-constant` . There are a few issues with the implementation, though: * even with the default, the pass will still specialise based on floating-point literals * even when it's enabled, the pass will specialise only for the `i1` type (or `i2` if all of the possible 4 values occur, or `i3` if all of the possible 8 values occur, etc) The reason for this is incorrect check of the lattice value of the function formal parameter. The lattice value is `overdefined` when the constant range of the possible arguments is the full set, and this is the reason for the specialisation to trigger. However, if the set of the possible arguments is not the full set, that must not prevent the specialisation. This patch changes the pass to NOT consider a formal parameter when specialising a function if the lattice value for that parameter is: * unknown or undef * a constant * a constant range with a single element on the basis that specialisation is pointless for those cases. Is also changes the criteria for picking up an actual argument to specialise if the argument is: * a LLVM IR constant * has `constant` lattice value has `constantrange` lattice value with a single element. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135893	2022-10-26 09:55:33 +01:00
Momchil Velikov	1a525dec7f	[FuncSpec] Fix missed opportunities for function specialisation When collecting the possible constant arguments to specialise a function the compiler will abandon the search on the first argument that is for some reason unsuitable as a specialisation constant. Thus, depending on the traversal order of the functions and call sites, the compiler can end up with a different set of possible constants, hence with different set of specialisations. With this patch, the compiler will skip unsuitable constants, but nevertheless will continue searching for more. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135867	2022-10-25 23:19:48 +01:00
Momchil Velikov	c47739b45c	[FuncSpec] Consider small noinline functions for specialisation Small functions with size under a given threshold are not considered for specialisaion on the presumption that they are easy to inline. This does not apply to `noinline` functions, though. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135862	2022-10-25 19:49:04 +01:00
Nikita Popov	7c32c7e777	Reapply [FunctionAttrs] Make location classification more precise Reapplying after the fix for volatile modelling in D135863. ----- Don't add argmem if the pointer is clearly not an argument (e.g. a global). I don't think this makes a difference right now, but gives more obvious results with D135780.	2022-10-20 15:13:20 +02:00
Nikita Popov	bed31153b7	[FuncAttrs] Extract code for adding a location access (NFC) This code is the same for accesses from call arguments and for accesses from other (single-location) instructions. Extract i into a common function.	2022-10-20 12:01:27 +02:00
Nikita Popov	f2fe289374	[FunctionAttrs] Volatile operations can access inaccessible memory Per LangRef, volatile operations are allowed to access the location of their pointer argument, plus inaccessible memory: > Any volatile operation can have side effects, and any volatile > operation can read and/or modify state which is not accessible > via a regular load or store in this module. > [...] > The allowed side-effects for volatile accesses are limited. If > a non-volatile store to a given address would be legal, a volatile > operation may modify the memory at that address. A volatile > operation may not modify any other memory accessible by the > module being compiled. A volatile operation may not call any > code in the current module. FunctionAttrs currently does not model this and ends up marking functions with volatile accesses on arguments as argmemonly, even though they should be inaccessiblemem_or_argmemonly. Differential Revision: https://reviews.llvm.org/D135863	2022-10-20 11:57:10 +02:00
Fangrui Song	c80b12d352	Revert D135427 "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally" This reverts commit 8ef3fd8d59ba0100bc6e83350ab1e978536aa531. I mentioned that GlobalAlias was not handled. It turns out GlobalAlias has to be handled in the same patch (as opposed to in a follow-up), as otherwise clang codegen of C5/D5 constructor/destructor would regress (https://reviews.llvm.org/D135427#3869003).	2022-10-19 11:24:12 -07:00
Nikita Popov	747f27d97d	[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC) Follow up on D135962, renaming the method name to match the new type name.	2022-10-19 11:03:54 +02:00
Nikita Popov	1a9d9823c5	[AA] Rename uses of FunctionModRefBehavior (NFC) Followup to D135962 to rename remaining uses of FunctionModRefBehavior to MemoryEffects. Does not touch API names yet, but also updates variables names FMRB/MRB to ME, to match the new type name.	2022-10-19 10:54:47 +02:00
Sjoerd Meijer	f7c42a278b	Revert "Recommit "[LoopFlatten] Enable it by default"" This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f. A miscompilation was reported: https://github.com/llvm/llvm-project/issues/58441 Reverting this while I look at that.	2022-10-18 23:36:36 +05:30
Sjoerd Meijer	5b9597f59a	Recommit "[LoopFlatten] Enable it by default" The sanitizer bots turned green again after another change went in, i.e. revert 26dd64ba9cfabe5474bb207f3b7099965f81fed7, so I don't think this patch was causing the problems.	2022-10-17 23:27:19 +05:30
Sjoerd Meijer	a71c4e4fbb	Revert "[LoopFlatten] Enable it by default" This reverts commit 233659c7ae9b83b64a9f739d340736bca39c3d2e. I see some sanitizer build bot failures. Not sure if it is change causing it, but let's see if a revert returns the bots to green...	2022-10-17 22:14:20 +05:30
Sjoerd Meijer	233659c7ae	[LoopFlatten] Enable it by default LoopFlatten has been in the code base off by default for years, but this enables it to run by default. Downstream this has been running for years, so it has been exposed to quite some code. Then around the time we switched to the NPM, several fixes went in related to updating the MemorySSA state and we moved it to a loop pass manager, which both helped preventing rerunning certain analysis passes, and thus helped a bit with compile-times. About compile-times, adding a pass isn't free, but this should see only very minor increases. The pass is relatively simple and there shouldn't be anything algorithmically expensive because all it does is looking at inner/outer loops and it checks assumptions on loop increments and indices. If we see increases, I expect this to mainly come from invalidation of analysis info, and perhaps subsequent passes to trigger and do more. Despite its simplicity/restrictions, it triggers in most code-bases, which makes it worth to enable this by default. Differential Revision: https://reviews.llvm.org/D109958	2022-10-17 17:11:39 +05:30
Nikita Popov	d44cd1bbeb	Revert "[FunctionAttrs] Make location classification more precise" This reverts commit b05f5b90a12098660a4fc16da0b4d421ddfe14e2. There are thread sanitizer buildbot failures in simple_stack.c. I think that's because this ended up affecting the handling of volatile accesses to allocas. Reverting for now.	2022-10-13 12:11:04 +02:00
Nikita Popov	b05f5b90a1	[FunctionAttrs] Make location classification more precise Don't add argmem if the pointer is clearly not an argument (e.g. a global). I don't think this makes a difference right now, but gives more obvious results with D135780.	2022-10-13 11:24:23 +02:00
Nikita Popov	440ce05fbf	[FunctionAttrs] Handle potential access of captured argument We have to account for accesses to argument memory via captures. I don't think there's any way to make this produce incorrect results right now (because as soon as "other" is set, we lose the ability to infer argmemonly), but this avoids incorrect results once we have more precise representation.	2022-10-13 11:15:36 +02:00
Nikita Popov	5b3776842f	[FunctionAttrs] Account for memory effects of inalloca/preallocated The code for inferring memory attributes on arguments claims that inalloca/preallocated arguments are always clobbered: `d71ad41080/llvm/lib/Transforms/IPO/FunctionAttrs.cpp (L640-L642)` However, we would still infer memory attributes for the whole function without taking this into account, so we could still end up inferring readnone for the function. This adds an argument clobber if there are any inalloca/preallocated arguments. Differential Revision: https://reviews.llvm.org/D135783	2022-10-13 10:20:17 +02:00
Fangrui Song	8ef3fd8d59	[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) This fixes two problems. (a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to linger around (unreferenced, hence benign), and is now correctly discarded. ``` int foo(); inline int v = foo(); ``` (b) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' int main() { bar(); foo(); } eof cat > m2.cc <<'eof' __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_.profraw clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_.profraw ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427	2022-10-11 15:30:07 -07:00
Jordan Rupprecht	fb27fd5f88	Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally" This reverts commit 4fbe33593c8132fdc48647c06f4d1455bfff1c88. It causes linking errors, with details provided internally. (Hopefully the author/reviewers will be able to upstream the internal repro).	2022-10-10 11:40:45 -07:00
Nikita Popov	874c0327e7	[Attributor] Use ConstantFoldLoadFromConst() When determining the initial value of the object, use the constant folding API to load a given type at a given offset in the global initializer. This makes it work for cases where the load doesn't directly correspond to an aggregate member. Differential Revision: https://reviews.llvm.org/D135435	2022-10-10 10:17:37 +02:00
Fangrui Song	4fbe33593c	[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally See the updated linkonce_resolution_comdat.ll. For a local linkage GV in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' #include "c.h" int main() { bar(); foo(); } eof cat > m2.cc <<'eof' #include "c.h" __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_.profraw # one _Z3foov clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_.profraw # one _Z3foov ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427	2022-10-08 11:09:43 -07:00
Johannes Doerfert	e18736149c	[Attributor] Teach AAPointerInfo about atomic cmxchg and rmw The atomic operations behave similar to a store except that we don't know the new value and we read the result first.	2022-10-05 06:48:00 -07:00
Johannes Doerfert	93e51fa444	[Attributor] AAPointerInfo can model non-escaping call uses If a call base use will not capture a pointer we can approximate the effects. This is important especially for readnone/only uses. Even may-write uses are not too bad with reachability in place. Capturing is the problem as we loose track of update sides.	2022-10-05 06:29:14 -07:00
Johannes Doerfert	477e8e10f0	[Attributor] Teach AAPointerInfo to look into aggregates If we have a constant aggregate, e.g., as an initializer, we usually failed to extract the proper value/type from it. This patch provides the size and offset information necessary to extract the right part of the constant.	2022-10-05 06:19:47 -07:00
Johannes Doerfert	a9557115b4	[Attributor] Qualify variables to avoid clashes in the future	2022-10-04 19:43:04 -07:00
Nikita Popov	412141663c	Reapply [FunctionAttrs] Infer precise FMRB The previous version of the patch would incorrect convert an existing argmemonly attribute into an inaccessiblemem_or_argmemonly attribute. ----- This updates checkFunctionMemoryAccess() to infer a precise FunctionModRefBehavior, rather than an approximation split into read/write and argmemonly. Afterwards, we still map this back to imprecise function attributes. This still allows us to infer some cases that we previously did not handle, namely inaccessiblememonly and inaccessiblemem_or_argmemonly. In practice, this means we get better memory attributes in the presence of intrinsics like @llvm.assume. Differential Revision: https://reviews.llvm.org/D134527	2022-09-29 14:02:15 +02:00
Sameer Sahasrabuddhe	3f078b308b	[AAPointerInfo] OffsetInfo: Unassigned is distinct from Unknown A User like the PHINode may be visited multiple times for the same pointer along different def-use edges. The uninitialized state of OffsetInfo at the first visit needs to be distinct from the Unknown value that may be assigned after processing the PHINode. Without that, a PHINode with all inputs Unknown is never followed to its uses. This results in incorrect optimization because some interfering accessess are missed. Differential Revision: https://reviews.llvm.org/D134704	2022-09-28 20:31:36 +05:30
Benjamin Kramer	0fb2676c24	Revert "[FunctionAttrs] Infer precise FMRB" This reverts commit 97dfa536260c434e68913129d79d863b26c1c179. It can make DSE crash. Reduced test case at https://reviews.llvm.org/P8291	2022-09-28 16:57:43 +02:00
Doru Bercea	c9adeca501	Move allocas converted from __kmpc_alloc_shared to entry block.	2022-09-27 17:16:58 +00:00
Nikita Popov	97dfa53626	[FunctionAttrs] Infer precise FMRB This updates checkFunctionMemoryAccess() to infer a precise FunctionModRefBehavior, rather than an approximation split into read/write and argmemonly. Afterwards, we still map this back to imprecise function attributes. This still allows us to infer some cases that we previously did not handle, namely inaccessiblememonly and inaccessiblemem_or_argmemonly. In practice, this means we get better memory attributes in the presence of intrinsics like @llvm.assume. Differential Revision: https://reviews.llvm.org/D134527	2022-09-27 10:14:35 +02:00
Sebastian Peryt	46fc75ab28	[NFC][2/n] Remove PrunePH pass Second patch in the series to remove legacy PM and associated -enable-new-pm=0 flag targets pass that has not been ported to new PM - PruneEH. Discussion about this can be found in D44415. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134686	2022-09-26 18:38:04 -07:00
Nikita Popov	fe196380cc	[FunctionAttrs] Use MemoryLocation::getOrNone() when infering memory attrs MemoryLocation::getOrNone() already has the necessary logic to handle different instruction types. Use it, rather than repeating a subset of the logic. This adds support for previously unhandled instructions like atomicrmw.	2022-09-23 13:56:55 +02:00
Kazu Hirata	00874c48ea	[IPO] Reorder parameters of InlineFunction (NFC) With the recent addition of new parameter MergeAttributes (D134117), callers need to specify several default parameters before getting to specify the new parameter. This patch reorders the parameters so that callers do not have to specify as many default parameters. Differential Revision: https://reviews.llvm.org/D134125	2022-09-20 09:09:38 -07:00
Kazu Hirata	d3b95ecc98	[ModuleInliner] Remove InlineOrder::front (NFC) InlineOrder::front is a remnant from the era when we had a nested "while" loops in the module inliner, with the inner one grouping the call sites with the same caller. Now that we have a simple "while" loop draining the priority queue, we can just use InlineOrder::pop. Differential Revision: https://reviews.llvm.org/D134121	2022-09-18 08:49:44 -07:00
Benjamin Kramer	b987fe4972	Silence unused variable warning in release builds. NFC	2022-09-18 09:15:32 +02:00
Kazu Hirata	284f0397e2	[Transforms] Merge function attributes within InlineFunction (NFC) In the past, we've had a bug resulting in a compiler crash after forgetting to merge function attributes (D105729). This patch teaches InlineFunction to merge function attributes. This way, we minimize the "time" when the IR is valid, but the function attributes are not. Differential Revision: https://reviews.llvm.org/D134117	2022-09-17 23:10:23 -07:00
Kazu Hirata	6e4fbd2f51	[ModuleInliner] Set Changed earlier (NFC) It makes more sense to set Changed to true immediately after a successful inlining.	2022-09-17 14:16:32 -07:00
Kazu Hirata	31b91356bc	[ModuleInliner] Don't include SetVector.h (NFC) We don't use SetVector in the module inliner.	2022-09-17 12:17:52 -07:00
Kazu Hirata	5faf4bf195	[ModuleInliner] Move UseInlinePriority to InlineOrder.cpp (NFC) UseInlinePriority specifies the priority function. This patch simplifies the code by moving UseInlinePriority closer to the actual consumer -- the switch statement inside getInlineOrder. Differential Revision: https://reviews.llvm.org/D134100	2022-09-17 11:41:28 -07:00
Kazu Hirata	6e30a9cc08	[Inliner] Retire DefaultInlineOrder (NFC) DefaultInlineOrder was largely an exercise in generalizing the traversal order of call sites within the inliner. Now that the module inliner is starting to form its shape, there is no point in sharing DefaultInlineOrder between the module inliner and the CGSCC inliner. DefaultInlineOrder and all the other inline orders are mutually exclusive in the following sense: - The use of DefaultInlineOrder doesn't make sense in the module inliner because there is no priority inherent in the order in which call sites are added to the list of call sites -- SmallVector. - The use of any other inline order doesn't make sense in the CGSCC inliner because little prioritization can be done within one CGSCC. This patch essentially reverts the addition of DefaultInlineOrder so that the loop structure of Inliner.cpp looks like the state just before we started working on the module inliner (circa June 2021). At the same time, ww remove the choice of DefaultInlineOrder from UseInlinePriority. Differential Revision: https://reviews.llvm.org/D134080	2022-09-16 15:36:40 -07:00
Teresa Johnson	c2cf93c1a9	[WPD/LTT] Lower type test feeding assumes via phi correctly This fixes https://github.com/llvm/llvm-project/issues/57616. Type test lowering in ThinLTO modules relies on having type id summaries set up for the referenced types, which provide the type test resolution. If there is no summary, the type tests are lowered to false. At the very least, a default type id summary gives the type tests a resolution of Unknown, which is handled correctly (ignored by the first invocation of LTT, and lowered to true by the second). WPD sets up the type id summaries (with a default type test resolution) as it is processing the type tests, but only does this for the patterns handled by WPD, which is a type test directly feeding an assume. In the case of type tests feeding an assume via a phi, the type id summary was not being set up, leading to the type tests being lowered to false incorrectly. Fix this by adding the default type id summary entries for all type ids used on globals during index-only WPD. This is not an issue for hybrid (split-lto-unit) LTO, as in that case the type test resolution is determined and set up during LTT, since the type definitions are in the regular LTO split module, and exported via the summary to the ThinLTO split module. Differential Revision: https://reviews.llvm.org/D134012	2022-09-16 13:50:01 -07:00
Kazu Hirata	9111920af8	[ModuleInliner] clang-format ModuleInliner.cpp (NFC)	2022-09-16 09:41:42 -07:00
Kazu Hirata	4475470529	[ModuleInliner] Remove a stale comment (NFC) These comments refer to the nested loop in the module inliner where the inner loop grouped call sites from the same caller. We don't group call sites anymore, so the comment has become stale.	2022-09-16 09:37:43 -07:00
Kazu Hirata	42a90e6017	[ModuleInliner] Remove a redundaunt variable (NFC) In the CGSCC inliner, DidInline was used as an indicator to update the call graph. In the module inliner, DidInline is always true at the end of the "while" loop, so can just drop it.	2022-09-16 09:32:02 -07:00
Kazu Hirata	513717ddd0	[ModuleInliner] Remove a write-only variable (NFC) InlinedCallees is a remnant from the CGSCC inliner. We don't use it in the module inliner.	2022-09-16 09:15:53 -07:00

1 2 3 4 5 ...

5465 Commits