llvm-project

Author	SHA1	Message	Date
Nikita Popov	658b260dbf	[Attributor] Don't construct pretty GEPs Bring this in line with other transforms like ArgPromotion/SROA/ SCEVExpander and always produce canonical i8 GEPs.	2023-12-22 16:48:13 +01:00
Ivan R. Ivanov	39f09ec245	Invalidate analyses after running Attributor in OpenMPOpt (#74908 ) Using the LoopInfo from OMPInfoCache after the Attributor ran resulted in a crash due to it being in an invalid state. --------- Co-authored-by: Ivan Radanov Ivanov <ivanov2@llnl.gov>	2023-12-20 15:01:21 -08:00
Paul Walker	dea16ebd26	[LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217 ) The specialisation will not be valid when ConstantInt gains native support for vector types. This is largely a mechanical change but with extra attention paid to constant folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to remove the need to call `getIntegerType()`. Co-authored-by: Nikita Popov <github@npopov.com>	2023-12-18 11:58:42 +00:00
Mircea Trofin	ed10fba1b2	[ThinLTO] Allow importing based on a workload definition (#74545 ) An example of a "workload definition" would be "the transitive closure of functions actually called to satisfy a RPC request", i.e. a (typically significantly) smaller subset of the transitive closure (static + possible indirect call targets) of callees. This means this workload definition is a type of flat dynamic profile. Producing one is not in scope - it can be produced offline from traces, or from sample-based profiles, etc. This patch adds awareness to ThinLTO of such a concept. A workload is defined as a root and a list of functions. All function references are by-name (more readable than GUIDs). In the case of aliases, the expectation is the list contains all the alternative names. The workload definitions are presented to the linker as a json file, containing a dictionary. The keys are the roots, the values are the list of functions. The import list for a module defining a root will be the functions listed for it in the profile. Using names this way assumes unique names for internal functions, i.e. clang's `-funique-internal-linkage-names`. Note that the behavior affects the entire module where a root is defined (i.e. different workloads best be defined in different modules), and does not affect modules that don't define roots.	2023-12-14 15:10:48 -08:00
Nikita Popov	696cc20d4e	[LVI] Make UndefAllowed argument of getConstantRange() required For the two remaining uses that did not explicitly specify it, set UndefAllowed=false. In both cases, I believe that treating undef as a full range is the correct behavior.	2023-12-12 11:43:52 +01:00
Youngsuk Kim	9071a15d4b	[llvm][Attributor] Strip AddressSpaceCast from 'constructPointer' (#74742 ) * Remove pointer AddressSpaceCast in function `constructPointer` * Remove 1st parameter (`ResTy`) of function `constructPointer` 1st input argument to function `constructPointer` in all 4 call-sites is `ptr addrspace(0)`. Function `constructPointer` performs a pointer AddressSpaceCast to `ResTy`, making the returned pointer have type `ptr addrspace(0)` in all 4 call-sites. Unless there's a clear reason to discard the addrspace info of input parameter `Ptr`, I think we should keep and forward that info to the returned pointer of `constructPointer`. Opaque ptr cleanup effort.	2023-12-11 09:29:01 -05:00
Oskar Wirga	81360ec582	[CFI] Fix Direct Call Issues in CFI Dispatch Table (#69663 ) I discovered two issues for when a CFI dispatch table entry is used as a direct call. # Inlining There is the possibility that the dispatch table entry contains only a single function pointer: ``` ; Function Attrs: naked nocf_check define private void @.cfi.jumptable() #6 align 8 { entry: call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @_Z7throw_ei) unreachable } ``` If this function is inlined, the unreachable follows and ruins the containing function. # Exception Handling The dispatch table is always marked NoUnwind. This is fine if the entries are never used directly, but if a direct call is used which the containing function expects to throw, it will no longer throw and the exception handling code will be lost.	2023-12-06 12:56:59 -08:00
Teresa Johnson	88fbc4d3df	[ThinLTO] Add tail call flag to call edges in summary (#74043 ) This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.	2023-12-06 08:41:44 -08:00
Kazu Hirata	92c2529ccd	[llvm] Stop including vector (NFC) Identified with clangd.	2023-12-03 22:32:21 -08:00
Paul Kirth	cfe1ece833	[clang][llvm][fatlto] Avoid cloning modules in FatLTO (#72180 ) https://github.com/llvm/llvm-project/issues/70703 pointed out that cloning LLVM modules could lead to miscompiles when using FatLTO. This is due to an existing issue when cloning modules with labels (see #55991 and #47769). Since this can lead to miscompilation, we can avoid cloning the LLVM modules, which was desirable anyway. This patch modifies the EmbedBitcodePass to no longer clone the module or run an input pipeline over it. Further, it makes FatLTO always perform UnifiedLTO, so we can still defer the Thin/Full LTO decision to link-time. Lastly, it removes dead/obsolete code related to now defunct options that do not work with the EmbedBitcodePass implementation any longer.	2023-11-30 17:09:34 -08:00
Youngsuk Kim	c57ef2c698	[llvm][OpenMPOpt] Remove no-op ptr-to-ptr bitcast (NFC) (#73869 ) * Remove a call to CreatePointerBitCastOrAddrSpaceCast which merely adds a no-op ptr-to-ptr bitcast. * Most of the diff is from removing checks for no-op ptr-to-ptr bitcasts in relevant LIT tests	2023-11-29 20:47:37 -05:00
Davide Italiano	61ab43ae00	[IROutliner] Skip dbg values during the candidate search. (#72945 ) dbg values don't really have a value number associated as they have no semantic value associated, i.e. they don't change the code being generated. Use the correct API to go over them. Fixes https://github.com/llvm/llvm-project/issues/62876	2023-11-22 11:26:36 -08:00
Mats Petersson	2fb51fba8c	[FuncSpec] Update function specialization to handle phi-chains (#72903 ) When using the LLVM flang compiler with alias analysis (AA) enabled, SPEC2017:548.exchange2_r was running significantly slower than without the AA. This was caused by the GVN pass replacing many of the loads in the pre-AA code with phi-nodes that form a long chain of dependencies, which the function specialization was unable to follow. This adds a function to discover phi-nodes in a transitive set, with some limitations to avoid spending ages analysing phi-nodes. The minimum latency savings also had to be lowered - fewer load instructions means less saving. Adding some more prints to help debugging the isProfitable decision. No significant change in compile time or generated code-size. (A previous attempt to fix this was abandoned: https://github.com/llvm/llvm-project/pull/71442) --------- Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>	2023-11-22 10:41:01 +00:00
Jeremy Morse	f42482def2	[DebugInfo][RemoveDIs] Don't convert debug-intrinsics to Unreachable (#72380 ) It might seem obvious, but it's not a good idea to convert a debug-intrinsic instruction into an UnreachableInst, as this means things operate differently with and without the -g option. However this can happen due to the "mutate the next instruction" API calls we make. With RemoveDIs eliminating debug intrinsics, this behaviour is at risk of changing, hence this patch ensures we only ever mutate the next _non_ debuginfo instruction into an Unreachable. The tests instrumented with the --try... flag all exercise this, I've added some metadata to a SCCP test to ensure it's exercised.	2023-11-20 20:53:24 +00:00
Davide Italiano	615ebfc3e5	[SampleProfileProbe] Downgrade probes too large from error to warning. (#72574 )	2023-11-16 15:57:51 -08:00
Matthias Braun	cb4627d150	Add setBranchWeights convenience function. NFC (#72446 ) Add `setBranchWeights` convenience function to ProfDataUtils.h and use it where appropriate.	2023-11-16 10:55:19 -08:00
Teresa Johnson	24a618f69e	[MemProf] Look through alias when applying cloning in ThinLTO backend (#72156 ) Mirror the handling in ModuleSummaryAnalysis to look through aliases when handling call instructions in the ThinLTO backend MemProf handling. Fixes #72094	2023-11-15 13:14:19 -08:00
Kazu Hirata	d4360e428f	[llvm] Stop including llvm/ADT/DenseMap.h (NFC) Identified with clangd.	2023-11-11 10:07:19 -08:00
Kazu Hirata	84a48ee9fb	[llvm] Stop including llvm/ADT/SetVector.h (NFC) Identified with clangd.	2023-11-10 23:50:23 -08:00
Vidhush Singhal	754b93e466	[Attributor] New attribute to identify what byte ranges are alive for an allocation (#66148 ) Changes the size of allocations automatically. For now, implements the case when a single range from start of the allocation is alive and the allocation can be reduced.	2023-11-10 16:26:37 -08:00
William Junda Huang	683f2df6e5	[SampleProfile] Fix bug where remapper returns empty string and crashing Sample Profile loader (#71479 ) Normally SampleContext does not allow using an empty StringRef to construct an object; this is to prevent bugs reading the profile. However, empty names may be emitted by a function whose name is intentionally set to empty, or by a bug in the remapper that returns an empty string. Regardless, converting it to FunctionId first will prevent the assert, and that assert check is unnecessary, which will be addressed in another patch	2023-11-10 21:38:13 +00:00
Jeremy Morse	f1b0a54451	Reapply 7d77bbef4ad92, adding new debug-info classes This reverts commit 957efa4ce4f0391147cec62746e997226ee2b836. Original commit message below -- in this follow up, I've shifted un-necessary inclusions of DebugProgramInstruction.h into being forward declarations (fixes clang-compile time I hope), and a memory leak in the DebugInfoTest.cpp IR unittests. I also tracked a compile-time regression in D154080, more explanation there, but the result of which is hiding some of the changes behind the EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is tested by the "new-debug-iterators" buildbot. [DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierachy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-08 16:42:35 +00:00
Nikita Popov	e360a16fee	[GlobalOpt] Cache whether CC is changeable (#71381 ) The hasAddressTaken() call in hasOnlyColdCalls() has quadratic complexity if there are many cold calls to a function: We're going to visit each call of the function, and then for each of them iterate all the users of the function. We've recently encountered a case where GlobalOpt spends more than an hour in these hasAddressTaken() checks when full LTO is used. Avoid this by moving the hasAddressTaken() check into hasChangeableCC() and caching its result, so it is only computed once per function.	2023-11-07 10:36:45 +01:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Nikita Popov	c4c0ac10f1	[IPO] Remove unnecessary bitcasts (NFC)	2023-11-06 16:49:45 +01:00
Nikita Popov	16a595e398	[Attributor] Avoid use of ConstantExpr::getFPTrunc() (NFC) Use the constant folding API instead. For simplicity I'm using the DL-independent API here.	2023-11-06 15:27:01 +01:00
Dominik Adamski	2cce0f6c57	[OpenMP][OMPIRBuilder] Add support to omp target parallel (#67000 ) Added support for LLVM IR code generation which is used for handling omp target parallel code. The call for __kmpc_parallel_51 is generated and the parallel region is outlined to separate function. The proper setup of kmpc_target_init mode is not included in the commit. It is assumed that the SPMD mode for target initialization is properly set by other codegen functions.	2023-11-06 11:44:00 +01:00
Nikita Popov	a682a9cfd0	Revert "Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235 )" This reverts commit 19b5495b653a00da7a250f48b4f739fcf2bbe82f. PR landed without approval, with severe quality issues.	2023-11-03 21:15:46 +01:00
Manman Ren	19b5495b65	Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235 ) See RFC for details: https://discourse.llvm.org/t/rfc-for-moving-swift-s-merge-function-pass-to-llvm/73778 We will need to refactor extension to FunctionComparator/FunctionHash to StructuralHash. This patch adds a new pass which is ported from Swift, and will need to discuss on how to migrate Swift’s pass over after we land this in llvm. Create this PR to get some early review on the patch. --------- Co-authored-by: Manman Ren <mren@meta.com>	2023-11-03 11:13:58 -07:00
Johannes Doerfert	d3e7a48cbd	[OpenMP][NFC] Remove a no-op function	2023-11-03 10:28:36 -07:00
Jeremy Morse	957efa4ce4	Revert "[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info" And some intervening fixups. There are two remaining problems: * A memory leak via https://lab.llvm.org/buildbot/#/builders/236/builds/7120/steps/10/logs/stdio * A performance slowdown with -g where I'm not completely sure what the cause is These might be fairly straightforward to fix, but it's the end of the day here, so I figure I'll clear the buildbots til tomorrow. This reverts commit 7d77bbef4ad9230f6f427649373fe46a668aa909. This reverts commit 9026f35afe6ffdc5e55b6615efcbd36f25b11558. This reverts commit d97b2b389a0e511c65af6845119eb08b8a2cb473.	2023-11-02 17:41:36 +00:00
Jeremy Morse	7d77bbef4a	[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierachy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-02 12:44:53 +00:00
Johannes Doerfert	a8152086ff	[Attributor][FIX] Ensure new BBs are registered	2023-11-01 12:12:14 -07:00
Nikita Popov	6b8ed78719	[IR] Add writable attribute This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that this many bytes are writable without trapping or introducing data races. See https://llvm.org/docs/Atomics.html#optimization-outside-atomic for why the second point is important. This attribute can be added to sret arguments. I believe Rust will also be able to use it for by-value (moved) arguments. Rust likely won't be able to use it for &mut arguments (tree borrows does not appear to allow spurious stores). In this patch the new attribute is only used by LICM scalar promotion. However, the actual motivation for this is to fix a correctness issue in call slot optimization, which needs this attribute to avoid optimization regressions. Followup to the discussion on D157499. Differential Revision: https://reviews.llvm.org/D158081	2023-11-01 10:46:31 +01:00
Davide Italiano	954af75ceb	[PGO] Skip optimizing probes that don't fit. (#70875 ) The discriminator can only pack 16 bits, so anything exceeding that value will cause the packing code to crash. Emit a diagnostic and skip the optimization instead.	2023-10-31 21:30:36 -07:00
Joseph Huber	e8c0ae60d7	[OpenMP] Add optimization to remove the RPC client (#70683 ) Summary: Part of the work done in the `libc` project is to provide host services for things like `printf` or `malloc`, or generally any syscall-like behaviour. This scheme works by emitting an externally visible global called `__llvm_libc_rpc_client` that the host runtime can pick up to get a handle to the global memory associated with the client. We use the presence of this symbol to indicate whether or not we need to run an RPC server. Normally, this symbol is only present if something requiring an RPC server was linked in, such as `printf`. However, if this call to `printf` was subsequently optimized out, the symbol would remain and cannot be removed (rightfully so) because of its linkage. This patch adds a special-case optimization to remove this symbol so we can indicate that an RPC server is no longer needed. This patch puts this logic in `OpenMPOpt` as the most readily available place for it. In the future, we should think how to move this somewhere more generic. Furthermore, we use a hard-coded runtime name (which isn't uncommon given all the other magic symbol names). But it might be nice to abstract that part away.	2023-10-31 17:23:24 -05:00
Sander de Smalen	00a831421f	[AArch64][SME] Extend Inliner cost-model with custom penalty for calls. (#68416 ) This is a stacked PR following on from #68415 This patch has two purposes: (1) It tries to make inlining more likely when it can avoid a streaming-mode change. (2) It avoids inlining when inlining causes more streaming-mode changes. An example of (1) is: ``` void streaming_compatible_bar(void); void foo(void) __arm_streaming { /* other code */ streaming_compatible_bar(); /* other code */ } void f(void) { foo(); // expensive streaming mode change } -> void f(void) { /* other code */ streaming_compatible_bar(); /* other code */ } ``` where it wouldn't have inlined the function when foo would be a non-streaming function. An example of (2) is: ``` void streaming_bar(void) __arm_streaming; void foo(void) __arm_streaming { streaming_bar(); streaming_bar(); } void f(void) { foo(); // expensive streaming mode change } -> (do not inline into) void f(void) { streaming_bar(); // these are now two expensive streaming mode changes streaming_bar(); } ```	2023-10-31 10:28:40 +00:00
Johannes Doerfert	31b91213bd	[OpenMP] Unify the min/max thread/teams pathways We used to pass the min/max threads/teams values through different paths from the frontend to the middle end. This simplifies the situation by passing the values once, only when we will create the KernelEnvironment, which contains the values. At that point we also manifest the metadata, as appropriate. Some footguns have also been removed, e.g., our target check is now triple-based, not calling convention-based, as the latter is dependent on the ordering of operations. The types of the values have been unified to int32_t.	2023-10-29 10:53:20 -07:00
Mehdi Amini	f390a76b7e	Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257 )"" This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab. Reapply the original commit, the broken test was repaired in 5e51363f38d083ab326736c0d4d1b5f9fe0de080 in the meantime.	2023-10-26 17:30:01 -07:00
Mehdi Amini	ddbaa11e9f	Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257 )" This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec. The MLIR bots are broken with an omp test failure.	2023-10-26 17:25:20 -07:00
Johannes Doerfert	c2a1249a82	[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257 ) The runtime needs to know about the acceptable launch bounds, especially if the compiler (middle- or backend) assumed those bounds. While this patch does not yet inform the runtime, it stores the bounds in a place that can/will be accessed and is associated with the kernel.	2023-10-26 14:46:55 -07:00
Harvin Iriawan	211dc4ad40	[Analysis] Add Scalable field in MemoryLocation.h (#69716 ) This is the first of a series of patches to improve Alias Analysis on Scalable quantities. Keep Scalable information from TypeSize which will be used in Alias Analysis.	2023-10-24 18:18:51 +01:00
Krzysztof Drewniak	93f8e52d1f	[FunctionAttrs] Improve handling of alias-preserving intrinsic calls (#68453 ) Fixes #68270 The function attribute analysis handles many instructions, like addrspacecast, which do not themselves read or write memory but which transform pointers into other values in the same alias set. There are intrinsic functions, such as ptrmask or the AMDGPU-specific make.buffer.rsrc, which also preserve membership in alias sets without capturing. This commit adds the addrspacecast-like behavior to these calls.	2023-10-24 11:16:54 -05:00
Carlos Alberto Enciso	f3b20cb16a	[IPSCCP] Variable not visible at Og. (#66745 ) https://bugs.llvm.org/show_bug.cgi?id=51559 https://github.com/llvm/llvm-project/issues/50901 IPSCCP pass removes the global variable and does not create a constant expression for the initializer value.	2023-10-24 06:22:18 +01:00
Igor Kudrin	42a3a3b3f0	[ThinLTOBitcodeWriter] Do not crash on a typed declaration (#69564 ) This fixes a crash when `splitAndWriteThinLTOBitcode()` hits a declaration with type metadata. For example, such declarations can be generated by the `EliminateAvailableExternally` pass.	2023-10-24 04:29:53 +07:00
Kazu Hirata	a5dca533bd	Use llvm::count (NFC)	2023-10-22 21:18:23 -07:00
Kazu Hirata	9c5a5a421d	[llvm] Stop including llvm/ADT/iterator_range.h (NFC) Identified with misc-include-cleaner.	2023-10-22 15:41:18 -07:00
Kazu Hirata	935d8e12e0	[llvm] Stop including llvm/ADT/StringMap.h (NFC) Identified with misc-include-cleaner.	2023-10-22 11:53:56 -07:00
Johannes Doerfert	ba87fba806	[Attributor] Ignore different kernels for kernel lifetime objects If a potential interfering access is in a different kernel and the underlying object has kernel lifetime we can straight out ignore the interfering access. TODO: This should be made much stronger via "reaching kernels", which we already track in AAKernelInfo.	2023-10-21 12:31:06 -07:00
Johannes Doerfert	499fb1b8d8	[Attributor][FIX] Interposable constants cannot be propagated	2023-10-20 19:28:09 -07:00
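
The workload definition format described in commit ed10fba1b2 above (a JSON dictionary whose keys are roots and whose values are lists of function names) can be illustrated with a minimal sketch. All function names below are hypothetical, chosen only for illustration, and the `import_list_for` helper is an assumption of this sketch, not part of ThinLTO; it only mirrors the statement that the import list for a module defining a root is the list of functions given for that root.

```python
import json

# Hypothetical workload definitions: each key is a root function,
# each value is the list of functions reached when serving that root.
# None of these names come from the commit; they are placeholders.
workload_def = {
    "handle_rpc_request": ["parse_header", "lookup_user", "send_reply"],
    "run_batch_job": ["load_inputs", "process_chunk"],
}


def import_list_for(root, definitions):
    """Return the functions listed for `root`, or an empty list if the
    root has no workload definition (illustrative helper only)."""
    return definitions.get(root, [])


# Serialize to the JSON dictionary shape the commit message describes,
# then read it back as a consumer of the file would.
serialized = json.dumps(workload_def, indent=2)
roundtrip = json.loads(serialized)

print(serialized)
print(import_list_for("handle_rpc_request", roundtrip))
```

Because references are by name, such a file is only unambiguous if internal functions have unique names, which is the assumption the commit message ties to clang's `-funique-internal-linkage-names`.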


6118 Commits