llvm-project

Author	SHA1	Message	Date
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Yeaseen	d80eb928c7	[llvm] Remove `undef` from `llvm/test/Transforms` tests (#123889 ) This PR replaces some instances of `undef` with `function argument value` or `poison` or `concrete values` in several tests under `llvm/test/Transforms/` directory. These changes align with modern LLVM standards for better-defined behavior and test determinism. If this small PR is okay and gets merged, I will work on the rest. This is inspired by [this project](https://discourse.llvm.org/t/gsoc-2024-remove-undefined-behavior-from-tests/77236/29), work done on this by @leewei05	2025-01-22 14:47:15 +00:00
Lee Wei	58ca7078ce	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#115688 ) This PR aims to remove undefined behavior from tests.	2024-11-12 08:41:27 +00:00
Vitaly Buka	5ce47a5813	Reland "[Support] Assert that DomTree nodes share parent" (#102782 ) A dominance query of a block that is in a different function is ill-defined, so assert that getNode() is only called for blocks that are in the same function. There are three cases, where this behavior did occur. LoopFuse didn't explicitly do this, but didn't invalidate the SCEV block dispositions, leaving dangling pointers to free'ed basic blocks behind, causing use-after-free. We do, however, want to be able to dereference basic blocks inside the dominator tree, so that we can refer to them by a number stored inside the basic block. Reverts #102780 Reland #101198 Fixes #102784 Co-authored-by: Alexis Engelke <engelke@in.tum.de>	2024-08-13 11:56:02 +02:00
Vitaly Buka	3c3df1bef8	Revert "[Support] Assert that DomTree nodes share parent" (#102780 ) Reverts llvm/llvm-project#101198 Breaks multiple bots: https://lab.llvm.org/buildbot/#/builders/72/builds/2103 https://lab.llvm.org/buildbot/#/builders/164/builds/1909 https://lab.llvm.org/buildbot/#/builders/66/builds/2706	2024-08-10 18:36:09 -07:00
Alexis Engelke	8101d1863c	[Support] Assert that DomTree nodes share parent (#101198 ) A dominance query of a block that is in a different function is ill-defined, so assert that getNode() is only called for blocks that are in the same function. There are two cases, where this behavior did occur. LoopFuse didn't explicitly do this, but didn't invalidate the SCEV block dispositions, leaving dangling pointers to free'ed basic blocks behind, causing use-after-free. We do, however, want to be able to dereference basic blocks inside the dominator tree, so that we can refer to them by a number stored inside the basic block.	2024-08-10 18:19:05 +02:00
alex-t	d8cd7fc1f4	AlignmentFromAssumptions should only track pointer operand users (#73370 ) AlignmentFromAssumptions uses SCEV to update the load/store alignment. It tracks down the use-def chains for the pointer which it takes from the assumption cache until it reaches the load or store instruction. It mistakenly adds to the worklist the users of the load result irrespective of the fact that the load result has no connection with the original pointer, moreover, it is not a pointer at all in most cases. Thus the def-use chain contains irrelevant load users. When it is a store instruction the algorithm attempts to adjust its alignment to the alignment of the original pointer. The problem appears when the load and store memory operand pointers belong to different address spaces and possibly have different sizes. The 4bf015c035e4e5b63c7222dfb15ff274a5ed905c was an attempt to address a similar problem. The truncation or zero extension was added to make pointers the same size. That looks strange to me because the zero extension of the pointer is not legal. The test in the 4bf015c035e4e5b63c7222dfb15ff274a5ed905c does not work any longer as for the explicit address spaces conversion the addrspacecast is generated. Summarize: 1. For the alloca to global address spaces conversion addrspacecasts are used, so the code added by the 4bf015c035e4e5b63c7222dfb15ff274a5ed905c is no longer functional. 2. The AlignmentFromAssumptions algorithm should not add the load users to the worklist as they have nothing to do with the original pointer. 3. Instead we only track users that are: GetelementPtrIns, PHINodes.	2023-12-07 17:35:35 +01:00
Nikita Popov	97c0df5eb2	[AlignmentFromAssumes] Handle non-power-of-two alignment (PR64687) Align operand bundles can contain non-power-of-two alignments, but LLVM otherwise does not support them. Bail out in that case instead of crashing. Fixes https://github.com/llvm/llvm-project/issues/64687.	2023-08-15 16:12:53 +02:00
Nikita Popov	d047342012	[AlignmentFromAssumptions] Regenerate test checks (NFC)	2023-08-15 16:12:53 +02:00
Krzysztof Drewniak	f0415f2a45	Re-land "[AMDGPU] Define data layout entries for buffers"" Re-land D145441 with data layout upgrade code fixed to not break OpenMP. This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27. Differential Revision: https://reviews.llvm.org/D149776	2023-05-03 19:43:56 +00:00
Krzysztof Drewniak	3f2fbe92d0	Revert "[AMDGPU] Define data layout entries for buffers" This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850. Differential Revision: https://reviews.llvm.org/D149758	2023-05-03 16:11:00 +00:00
Krzysztof Drewniak	f9c1ede254	[AMDGPU] Define data layout entries for buffers Per discussion at https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798, we define two new address spaces for AMDGCN targets. The first is address space 7, a non-integral address space (which was already in the data layout) that has 160-bit pointers (which are 256-bit aligned) and uses a 32-bit offset. These pointers combine a 128-bit buffer descriptor and a 32-bit offset, and will be usable with normal LLVM operations (load, store, GEP). However, they will be rewritten out of existence before code generation. The second of these is address space 8, the address space for "buffer resources". These will be used to represent the resource arguments to buffer instructions, and new buffer intrinsics will be defined that take them instead of <4 x i32> as resource arguments. ptr addrspace(8). These pointers are 128-bits long (with the same alignment). They must not be used as the arguments to getelementptr or otherwise used in address computations, since they can have arbitrarily complex inherent addressing semantics that can't be represented in LLVM. Even though, like their address space 7 cousins, these pointers have deterministic ptrtoint/inttoptr semantics, they are defined to be non-integral in order to prevent optimizations that rely on pointers being a [0, [addr_max]] value from applying to them. Future work includes: - Defining new buffer intrinsics that take ptr addrspace(8) resources. - A late rewrite to turn address space 7 operations into buffer intrinsics and offset computations. This commit also updates the "fallback address space" for buffer intrinsics to the buffer resource, and updates the alias analysis table. Depends on D143437 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D145441	2023-05-03 15:25:58 +00:00
Nikita Popov	615efc3ed5	[AlignmentFromAssumptions] Migrate tests to opaque pointers (NFC) Tests were updated with (without manual fixup): https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34	2022-06-22 13:57:47 +02:00
Bjorn Pettersson	3f8027fb67	[test] Update some test cases to use -passes when specifying the pipeline This updates transform test cases for ADCE AddDiscriminators AggressiveInstCombine AlignmentFromAssumptions ArgumentPromotion BDCE CalledValuePropagation DCE Reg2Mem WholeProgramDevirt to use the -passes syntax when specifying the pipeline. Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is a deprecated feature) the updated test cases already used the new pass manager, but they were using the legacy syntax when specifying the passes to run. This patch can be seen as a step toward deprecating that interface. This patch also removes some redundant RUN lines. Here I am referring to test cases that had multiple RUN lines verifying both the legacy "-passname" syntax and the new "-passes=passname" syntax. Since we switched the default pass manager to "new PM" both RUN lines have verified the new PM version of the pass (more or less wasting time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER is set to "off". It is assumed that it is enough to run these tests with the new pass manager now. Differential Revision: https://reviews.llvm.org/D108472	2021-09-29 21:51:08 +02:00
Philip Reames	9b45fd909f	[AlignFromAssume] Bailout w/non-constant alignments (pr51680) This is a bailout for pr51680. This pass appears to assume that the alignment operand to an align tag on an assume bundle is constant. This doesn't appear to be required anywhere, and clang happily generates non-constant alignments for cases such as this case taken from the bug report: // clang -cc1 -triple powerpc64-- -S -O1 opal_pci-min.c extern int a[]; long b; long c; void d(long, int *, int, long, long, long) __attribute__((__alloc_align__(6))); void e() { b = d(c, a, 0, 0, 5, c); b[0] = 0; } This was exposed by a SCEV change which allowed a non-constant alignment to reach further into the pass' code. We could generalize the pass, but for now, let's fix the crash.	2021-08-31 09:20:52 -07:00
Juneyoung Lee	c664769330	[AssumeBundles] offset should be added to correctly calculate align This is a patch to fix the bug in alignment calculation (see https://reviews.llvm.org/D90529#2619492). Consider this code: ``` call void @llvm.assume(i1 true) ["align"(i32* %a, i32 32, i32 28)] %arrayidx = getelementptr inbounds i32, i32* %a, i64 -1 ; aligment of %arrayidx? ``` The llvm.assume guarantees that `%a - 28` is 32-bytes aligned, meaning that `%a` is 32k + 28 for some k. Therefore `a - 4` cannot be 32-bytes aligned but the existing code was calculating the pointer as 32-bytes aligned. The reason why this happened is as follows. `DiffSCEV` stores `%arrayidx - %a` which is -4. `OffSCEV` stores the offset value of “align”, which is 28. `DiffSCEV` + `OffSCEV` = 24 should be used for `a - 4`'s offset from 32k, but `DiffSCEV` - `OffSCEV` = 32 was being used instead. Reviewed By: Tyker Differential Revision: https://reviews.llvm.org/D98759	2021-04-02 12:32:05 +09:00
Tyker	78de7297ab	Reland [AssumeBundles] Use operand bundles to encode alignment assumptions NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complemantary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignemnt assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as a "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining.	2020-09-12 15:36:06 +02:00
Eric Christopher	7bfaa40086	Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753. An SROA change soon may obviate some of these problems. This reverts commit 8d09f20798ac180b1749276bff364682ce0196ab.	2020-07-16 11:54:04 -07:00
Tyker	8d09f20798	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complemantary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignemnt assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as a "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-07-14 01:05:58 +02:00
Roman Lebedev	7ea46aee36	Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" Assume bundle can have more than one entry with the same name, but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses getOperandBundle("align"), which internally assumes that it isn't the case, and happily crashes otherwise. Minimal reduced reproducer: run `opt -alignment-from-assumptions` on target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %0 = type { i64, %1, i8, i64, %2, i32, %3, i8 } %1 = type opaque %2 = type { i8, i8, i16 } %3 = type { i32, i32, i32, i32 } ; Function Attrs: nounwind define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 { bb: call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ] ret i32 0 } ; Function Attrs: nounwind willreturn declare void @llvm.assume(i1) #1 attributes #0 = { nounwind "reciprocal-estimates"="none" } attributes #1 = { nounwind willreturn } This is what we'd have with -mllvm -enable-knowledge-retention This reverts commit c95ffadb2474a4d8c4f598d94d35a9f31d9606cb.	2020-07-04 23:49:23 +03:00
Tyker	c95ffadb24	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complemantary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignemnt assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as a "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-06-25 12:59:44 +02:00
Tyker	8938a6c9ed	[NFC] update test to make diff of the following commit clear	2020-06-25 12:59:44 +02:00
Richard Diamond	4bf015c035	[AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences. Summary: On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue. In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV of i64. This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type. Reviewers: hfinkel, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75471	2020-03-29 01:26:31 -05:00
Fangrui Song	3fc933af8b	[AlignmentFromAssumptions] getNewAlignmentDiff(): use getURemExpr() The alignment is calculated incorrectly, thus sometimes it doesn't generate aligned mov instructions, as shown by the example below: ``` // b.cc typedef long long index; extern "C" index g_tid; extern "C" index g_num; void add3(float* __restrict__ a, float* __restrict__ b, float* __restrict__ c) { index n = 641024; index m = 161024; index k = 41024; index tid = g_tid; index num = g_num; __builtin_assume_aligned(a, 32); __builtin_assume_aligned(b, 32); __builtin_assume_aligned(c, 32); for (index i0=tidk; i0<m; i0+=numk) for (index i1=0; i1<nm; i1+=m) for (index i2=0; i2<k; i2++) c[i1+i0+i2] = b[i0+i2] + a[i1+i0+i2]; } ``` Compile with `clang b.cc -Ofast -march=skylake -mavx2 -S` ``` vmovaps -224(%rdi,%rbx,4), %ymm0 vmovups -192(%rdi,%rbx,4), %ymm1 # should be movaps vmovups -160(%rdi,%rbx,4), %ymm2 # should be movaps vmovups -128(%rdi,%rbx,4), %ymm3 # should be movaps vaddps -224(%rsi,%rbx,4), %ymm0, %ymm0 vaddps -192(%rsi,%rbx,4), %ymm1, %ymm1 vaddps -160(%rsi,%rbx,4), %ymm2, %ymm2 vaddps -128(%rsi,%rbx,4), %ymm3, %ymm3 vmovaps %ymm0, -224(%rdx,%rbx,4) vmovups %ymm1, -192(%rdx,%rbx,4) # should be movaps vmovups %ymm2, -160(%rdx,%rbx,4) # should be movaps vmovups %ymm3, -128(%rdx,%rbx,4) # should be movaps ``` Differential Revision: https://reviews.llvm.org/D66575 Patch by Dun Liang llvm-svn: 369723	2019-08-23 02:17:04 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Sean Silva	a4c2d150d0	[PM] Port AlignmentFromAssumptions to the new PM. This uses the "runImpl" pattern to share code between the old and new PM. llvm-svn: 272757	2016-06-15 06:18:01 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
David Blaikie	79e6c74981	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.?[^%\w]getelementptr inbounds )(((?:<\d x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") normrep = re.compile( r"(^.?[^%\w]getelementptr )(((?:<\d* x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786	2015-02-27 19:29:02 +00:00
Hal Finkel	f83e1f7f66	[AlignmentFromAssumptions] Don't crash just because the target is 32-bit We used to crash processing any relevant @llvm.assume on a 32-bit target (because we'd ask SE to subtract expressions of differing types). I've copied our 'simple.ll' test, but with the data layout from arm-linux-gnueabihf to get some meaningful test coverage here. llvm-svn: 217574	2014-09-11 08:40:17 +00:00
Hal Finkel	71b7084112	[AlignmentFromAssumptions] Don't divide by zero for unknown starting alignment The routine that determines an alignment given some SCEV returns zero if the answer is unknown. In a case where we could determine the increment of an AddRec but not the starting alignment, we would compute the integer modulus by zero (which is illegal and traps). Prevent this by returning early if either the start or increment alignment is unknown (zero). llvm-svn: 217544	2014-09-10 21:05:52 +00:00
Hal Finkel	d67e463901	Add an AlignmentFromAssumptions Pass This adds a ScalarEvolution-powered transformation that updates load, store and memory intrinsic pointer alignments based on invariant((a+q) & b == 0) expressions. Many of the simple cases we can get with ValueTracking, but we still need something like this for the more complicated cases (such as those with an offset) that require some algebra. Note that gcc's __builtin_assume_aligned's optional third argument provides exactly for this kind of 'misalignment' offset for which this kind of logic is necessary. The primary motivation is to fixup alignments for vector loads/stores after vectorization (and unrolling). This pass is added to the optimization pipeline just after the SLP vectorizer runs (which, admittedly, does not preserve SE, although I imagine it could). Regardless, I actually don't think that the preservation matters too much in this case: SE computes lazily, and this pass won't issue any SE queries unless there are any assume intrinsics, so there should be no real additional cost in the common case (SLP does preserve DT and LoopInfo). llvm-svn: 217344	2014-09-07 20:05:11 +00:00

36 Commits