1045 Commits

Author SHA1 Message Date
Xiang1 Zhang
a56f264958 Refine tls-load-hoista llvm option
Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D122890
2022-04-01 19:03:58 +08:00
yanming
a7c0b7504c [VP] Add more cast VPintrinsic and docs.
Add vp.fptoui, vp.uitofp, vp.fptrunc, vp.fpext, vp.trunc, vp.zext, vp.sext, vp.ptrtoint, vp.inttoptr intrinsic and docs.

Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D122291
2022-04-01 09:16:10 +08:00
Fraser Cormack
cc67a8fcf1 [VP][LangRef] Correct select operands in vp.fptosi docs 2022-03-31 08:30:18 +01:00
Fraser Cormack
73244e8f85 [VP] Add vp.icmp comparison intrinsic and docs
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122729
2022-03-30 17:05:11 +01:00
Fraser Cormack
a0e5d9e1f4 [LangRef][NFC] Correct select operands in vp.fcmp docs
Thanks for craig.topper for spotting this.
2022-03-30 17:03:15 +01:00
Fraser Cormack
da6131f20a [VP] Add vp.fcmp comparison intrinsic and docs
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121292
2022-03-30 14:39:18 +01:00
Michael Kruse
793b7f903d [docs] Update LoopTerminology.
Includes 2 corrections:

 * Update irreducible control flow and add references to CycleTerminology;
   Natural loop is not the only definition of something looping in LLVM anymore.

 * Mention mustprogress loop and function attributes to be used
   instead of the llvm.sideeffect intrinsic.
2022-03-29 23:26:18 -05:00
Xiang1 Zhang
9566405020 [Inline asm] Fix mangle problem when variable used in inline asm.
(Correct 'Mem symbol + IntelExpr' output in PIC model)

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D121785
2022-03-24 09:41:23 +08:00
Nikita Popov
5737ce259b [LangRef] Allow non-power-of-two assume operand bundle
There has been a lot of confusion on this in the past (see for
example https://reviews.llvm.org/D110634 and earlier revisions),
so let's try to get some clarity here. This patch specifies that
a) specifying a non-constant assumed alignment is explicitly
allowed and b) an invalid (non-power-of-two) alignment is not UB,
but rather converts it into an assumption that the pointer is null.

This change is done for two reasons:
a) Assume operand bundles are specifically used in cases where the
alignment is not known during frontend codegen (otherwise we'd just
use an align attribute), so rejecting this case doesn't make sense.
b) At least for aligned_alloc the C standard specifies that passing
an invalid alignment results in a null pointer, not undefined
behavior.

Differential Revision: https://reviews.llvm.org/D119414
2022-03-23 15:51:16 +01:00
Itay Bookstein
d155c7da51 [docs] Fix a couple of typos
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
2022-03-19 20:24:38 +02:00
Fangrui Song
c6692f819e [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249
2022-03-18 14:17:05 -07:00
Lorenzo Albano
28cfa764c2 [VP] Strided loads/stores
This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114884
2022-03-10 18:46:54 +01:00
Xiang1 Zhang
c31014322c TLS loads opimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
2022-03-10 09:29:06 +08:00
Augie Fackler
d664c4b73c Attributes: add a new allocalign attribute
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.

Differential Revision: https://reviews.llvm.org/D117921
2022-03-04 15:57:53 -05:00
Arthur Eubanks
f0b61f7957 Revert "[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible"
This reverts commit 30e8f83c84c5a302a559722fc0d2973dc3f425ee.

Causes huge compile time regressions on certain large files. Will followup offline with author.
2022-03-03 11:04:14 -08:00
Simon Moll
d05ddb86f6 [VP] vp.sitofp cast intrinsic and docs
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119922
2022-03-02 10:16:19 +01:00
Simon Moll
febf548129 [VP] Fix vp.fptosi LangRef example
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120068
2022-03-02 10:15:32 +01:00
Tong Zhang
17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Nuno Lopes
da23fc966b [docs] Simplify the description of poison values 2022-02-20 11:41:49 +00:00
Nuno Lopes
5c404049b5 [docs] Add a note saying that the use of poison is preferred to the use of undef
Plus fix a few wrong examples with undef
2022-02-20 11:33:47 +00:00
Kristof Beyls
520a925272 Fix 2 RestructuredText warnings. 2022-02-16 14:16:52 +01:00
Simon Moll
03e83cc8eb [VP] vp.fptosi cast intrinsic and docs
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119535
2022-02-15 18:17:19 +01:00
Ahmed Bougacha
c703f852c9 [IR] Define "ptrauth" operand bundle.
This introduces a new "ptrauth" operand bundle to be used in
call/invoke. At the IR level, it's semantically equivalent to an
@llvm.ptrauth.auth followed by an indirect call, but it additionally
provides additional hardening, by preventing the intermediate raw
pointer from being exposed.

This mostly adds the IR definition, verifier checks, and support in
a couple of general helper functions. Clang IRGen and backend support
will come separately.

Note that we'll eventually want to support this bundle in indirectbr as
well, for similar reasons.  indirectbr currently doesn't support bundles
at all, and the IR data structures need to be updated to allow that.

Differential Revision: https://reviews.llvm.org/D113685
2022-02-14 11:27:35 -08:00
Momchil Velikov
6398903ac8 Extend the uwtable attribute with unwind table kind
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e.  we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
      -  `uwtable` (default to `async`)
      -  `uwtable(sync)`, synchronous unwind tables
      -  `uwtable(async)`, asynchronous (instruction precise) unwind tables

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543
2022-02-14 14:35:02 +00:00
Julien Pages
dcb2da13f1 [AMDGPU] Add a new intrinsic to control fp_trunc rounding mode
Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.

Differential Revision: https://reviews.llvm.org/D110579
2022-02-11 12:08:23 -05:00
Craig Topper
60745fb16f [VP] llvm.vp.fneg intrinsic and LangRef
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119262
2022-02-09 07:54:36 -08:00
Craig Topper
cef177d186 [VP] llvm.vp.fma intrinsic and LangRef
Differential Revision: https://reviews.llvm.org/D119185
2022-02-07 15:53:27 -08:00
Nikita Popov
e990e591c9 [LangRef] Require elementtype attribute for gc.statepoint intrinsic
The gc.statepoint intrinsic currently determines the target function
type based on the pointer element type of the argument. In order to
support opaque pointers, require that the argument is annotated with
an elementtype attribute.

Here's an example of the change:

    ; Before:
      %safepoint_token = tail call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i1f(i64 0, i32 0, i1 ()* @return_i1, i32 0, i32 0, i32 0, i32 0)

    ; After:
      %safepoint_token = tail call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i1f(i64 0, i32 0, i1 ()* elementtype(i1 ()) @return_i1, i32 0, i32 0, i32 0, i32 0)

    ; After with opaque pointers:
      %safepoint_token = tail call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0(i64 0, i32 0, ptr elementtype(i1 ()) @return_i1, i32 0, i32 0, i32 0, i32 0)

Differential Revision: https://reviews.llvm.org/D117890
2022-02-04 09:47:31 +01:00
Fangrui Song
30e8f83c84 [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249
2022-02-01 10:41:16 -08:00
Ahmed Bougacha
634ca7349d [ObjCARC] Require the function argument in the clang.arc.attachedcall bundle.
Currently, the clang.arc.attachedcall bundle takes an optional function
argument.  Depending on whether the argument is present, calls with this
bundle have the following semantics:

- on x86, with the argument present, the call is lowered to:
    call _target
    mov rax, rdi
    call _objc_retainAutoreleasedReturnValue

- on AArch64, without the argument, the call is lowered to:
    bl _target
    mov x29, x29

  and the objc runtime call is expected to be emitted separately.

That's because, on x86, the objc runtime checks for both the mov and
the call on x86, and treats the combination as the ARC autorelease elision
marker.

But on AArch64, it only checks for the dedicated NOP marker, as that's
historically been sufficiently unique.  Thanks to that, the runtime call
wasn't required to be adjacent to the NOP marker, so it wasn't emitted
as part of the bundle sequence.

This patch unifies both architectures: on AArch64, we now emit all
3 instructions for the bundle.  This guarantees that the runtime call
is adjacent to the marker in the sequence, and that's information the
runtime can use to further optimize this.

This helps simplify some of the handling, in particular
BundledRetainClaimRVs, which no longer needs to know whether the bundle
is sufficient or not: it now always should be.

Note that this does not include an AutoUpgrade for the nullary bundles,
as they are only produced in ObjCContract as part of the obj/asm emission
pipeline, and are not expected to be in bitcode.

Differential Revision: https://reviews.llvm.org/D118214
2022-01-28 12:41:45 -08:00
Ellis Hoag
11d3074267 [InstrProf] Add single byte coverage mode
Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track functions coverage. This mode has significantly less size overhead in both code and data because
  * We mark a function as "covered" with a store instead of an increment which generally requires fewer assembly instructions
  * We use a single byte per function rather than 8 bytes per block

The trade off of course is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code.

When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%).

[0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D116180
2022-01-27 17:38:55 -08:00
Sanjay Patel
2e26633af0 [IR] document and update ctlz/cttz intrinsics to optionally return poison rather than undef
The behavior in Analysis (knownbits) implements poison semantics already,
and we expect the transforms (for example, in instcombine) derived from
those semantics, so this patch changes the LangRef and remaining code to
be consistent. This is one more step in removing "undef" from LLVM.

Without this, I think https://github.com/llvm/llvm-project/issues/53330
has a legitimate complaint because that report wants to allow subsequent
code to mask off bits, and that is allowed with undef values. The clang
builtins are not actually documented anywhere AFAICT, but we might want
to add that to remove more uncertainty.

Differential Revision: https://reviews.llvm.org/D117912
2022-01-23 11:22:48 -05:00
Fraser Cormack
b8cb79404b [LangRef] Mangle all vector operands in insert/extract intrinsics
This better matches the canonical mangling of these intrinsics.

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D117675
2022-01-19 15:23:15 +00:00
Philip Reames
17beee44e1 [LangRef] Clarify that inaccessiblememonly functions are allowed noalias returns
Confusion over this point came up in a couple of recent changes (D117180, e20b32ff3). Current tone of discussion seems to be that we think inaccessiblememonly was always legal on allocation functions, so this change takes the form of a clarification instead of a change.

Differential Revision: https://reviews.llvm.org/D117571
2022-01-18 14:47:46 -08:00
Fraser Cormack
c8e33978fb [VP] Propagate align parameter attr on VP gather/scatter to ISel
This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.gather and llvm.vp.scatter was being dropped
during the conversion to the SelectionDAG. The default alignment equal
to the ABI type alignment of the vector type was kept. It also updates
the documentation to reflect the fact that the parameter attribute is
now properly supported.

The default alignment of these intrinsics was previously documented as
being equal to the ABI alignment of the *scalar* type, when in fact that
wasn't the case: the ABI alignment of the vector type was used instead.
This has also been fixed in this patch.

Reviewed By: simoll, craig.topper

Differential Revision: https://reviews.llvm.org/D114423
2022-01-18 17:33:24 +00:00
Nikita Popov
a2261e399a [Docs] Use anonymous reference (NFC)
Hopefully fixes the build failure. Also fix a typo.
2022-01-14 18:01:06 +01:00
Nikita Popov
3bbf7f5ed8 [Docs] Update opaque pointer docs (NFC)
Mention -opaque-pointers, write a bit more about migration pitfalls
and update the open issues.
2022-01-14 17:42:43 +01:00
Hans Wennborg
2bc57d85eb Don't override __attribute__((no_stack_protector)) by inlining (PR52886)
Since 26c6a3e736d3, LLVM's inliner will "upgrade" the caller's stack protector
attribute based on the callee. This lead to surprising results with Clang's
no_stack_protector attribute added in 4fbf84c1732f (D46300). Consider the
following code compiled with clang -fstack-protector-strong -Os
(https://godbolt.org/z/7s3rW7a1q).

  extern void h(int* p);

  inline __attribute__((always_inline)) int g() {
    return 0;
  }

  int __attribute__((__no_stack_protector__)) f() {
    int a[1];
    h(a);
    return g();
  }

LLVM will inline g() into f(), and f() would get a stack protector, against the
users explicit wishes, potentially breaking the program e.g. if h() changes the
value of the stack cookie. That's a miscompile.

More recently, bc044a88ee3c (D91816) addressed this problem by preventing
inlining when the stack protector is disabled in the caller and enabled in the
callee or vice versa. However, the problem remained if the callee is marked
always_inline as in the example above. This affected users, see e.g.
http://crbug.com/1274129 and http://llvm.org/pr52886.

One way to fix this would be to prevent inlining also in the always_inline
case. Despite the name, always_inline does not guarantee inlining, so this
would be legal but potentially surprising to users.

However, I think the better fix is to not enable the stack protector in a
caller based on the callee. The motivation for the old behaviour is unclear, it
seems counter-intuitive, and causes real problems as we've seen.

This commit implements that fix, which means in the example above, g() gets
inlined into f() (also without always_inline), and f() is emitted without stack
protector. I think that matches most developers' expectations, and that's also
what GCC does.

Another effect of this change is that a no_stack_protector function can now be
inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856):

  extern void h(int* p);

  inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() {
    return 0;
  }

  int f() {
    int a[1];
    h(a);
    return g();
  }

I think that's fine. Such code would be unusual since no_stack_protector is
normally applied to a program entry point which sets up the stack canary. And
even if such code exists, inlining doesn't change the semantics: there is still
no stack cookie setup/check around entry/exit of the g() code region, but there
may be in the surrounding context, as there was before inlining. This also
matches GCC.

See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722

Differential revision: https://reviews.llvm.org/D116589
2022-01-13 12:04:49 +01:00
Sebastian Neubauer
f4139440f1 [Docs] Fix IR and TableGen grammar inconsistencies
IR:
- globals (and functions, ifuncs, aliases) can have a partition
- catchret has a `to` before the label
- the sint/int types do not exist
- signext comes after the type
- a variable was missing its type

TableGen:
- The second value after a `#` concatenation is optional
  See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351
- IncludeDirective and PreprocessorDirective were never referenced in
  the grammar
- Add some missing ;
- Parent classes of multiclasses can have generic arguments.
  Reuse the `ParentClassList` that is already used in other places.

MIR:
- liveins only allows physical registers, which start with a $

Differential Revision: https://reviews.llvm.org/D116674
2022-01-13 11:55:13 +01:00
Simon Moll
33efbc8184 [VP] llvm.vp.merge intrinsic and LangRef
llvm.vp.merge interprets the %evl operand differently than the other vp
intrinsics: all lanes at positions greater or equal than the %evl
operand are passed through from the second vector input. Otherwise it
behaves like llvm.vp.select.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116725
2022-01-12 14:06:56 +01:00
David Sherwood
51497dc0b2 [IR] Change vector.splice intrinsic to reject out-of-bounds indices
I've changed the definition of the experimental.vector.splice
instrinsic to reject indices that are known to be or possibly
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.

The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.

This patch was created in response to review comments on D115863

Differential Revision: https://reviews.llvm.org/D115933
2022-01-11 09:37:39 +00:00
Luís Ferreira
34435fd105 [llvm] Add support for DW_TAG_immutable_type
Added documentation about DW_TAG_immutable_type too.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D113633
2022-01-05 19:17:08 +00:00
Nikita Popov
8484bab9cd [LangRef] Require elementtype attribute for indirect inline asm operands
Indirect inline asm operands may require the materialization of a
memory access according to the pointer element type. As this will
no longer be available with opaque pointers, we require it to be
explicitly annotated using the elementtype attribute, for example:

    define void @test(i32* %p, i32 %x) {
      call void asm "addl $1, $0", "=*rm,r"(i32* elementtype(i32) %p, i32 %x)
      ret void
    }

This patch only includes the LangRef change and Verifier updates to
allow adding the elementtype attribute in this position. It does not
yet enforce this, as this will require changes on the clang side
(and test updates) first.

Something I'm a bit unsure about is whether we really need the
elementtype for all indirect constraints, rather than only indirect
register constraints. I think indirect memory constraints might not
strictly need it (though the backend code is written in a way that
does require it). I think it's okay to just make this a general
requirement though, as this means we don't need to carefully deal
with multiple or alternative constraints. In addition, I believe
that MemorySanitizer benefits from having the element type even in
cases where it may not be strictly necessary for normal lowering
(cd2b050fa4/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (L4066)).

Differential Revision: https://reviews.llvm.org/D116531
2022-01-04 10:02:06 +01:00
Fraser Cormack
d762794040 [IR] Allow the 'align' param attr on vectors of pointers
This patch extends the available uses of the 'align' parameter attribute
to include vectors of pointers. The attribute specifies pointer
alignment element-wise.

This change was previously requested and discussed in D87304.

The vector predication (VP) intrinsics intend to use this for scatter
and gather operations, as they lack the explicit alignment parameter
that the masked versions use.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D115161
2022-01-03 12:32:46 +00:00
Nuno Lopes
b23669123a [docs] Mark @llvm.sideeffect() as willreturn
Changed by https://reviews.llvm.org/D65455
2022-01-01 18:04:04 +00:00
Sami Tolvanen
5dc8aaac39 [llvm][IR] Add no_cfi constant
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.

For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.

This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Reviewed By: nickdesaulniers, pcc

Differential Revision: https://reviews.llvm.org/D108478
2021-12-20 12:55:32 -08:00
Fraser Cormack
8002fa6760 [LangRef] Remove incorrect vector alignment rules
The LangRef incorrectly says that if no exact match is found when
seeking alignment for a vector type, the largest vector type smaller
than the sought-after vector type. This is incorrect as vector types
require an exact match, else they fall back to reporting the natural
alignment.

The corrected rule was not added in its place, as rules for other types
(e.g., floating-point types) aren't documented.

A unit test was added to demonstrate this.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D112463
2021-12-14 14:35:40 +00:00
Fraser Cormack
2d60bc87a2 [VP] [NFC] Fix vp_store signature and vp_gather examples
Reviewed By: frasercrmck, simoll

Differential Revision: https://reviews.llvm.org/D115027
2021-12-13 17:53:19 +00:00
Kito Cheng
39c861719b [RISCV] Fix vm operand constraint to fit GCC's behavior
- `vm` constraint is used for masking operand, which always v0.

- Update testcase, only masking operand should use `vm`, vector mask operations
  should just use `vr` for any vector register.

 - Revise the description of `vm` constraint.

- This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp.

  RISCVRegisterInfo.td:
  - The first VT in the list must be the largest total size since the
    SelectionDAGBuilder uses the first register in the list as the canonical
    type for the register.

  RISCVISelLowering.cpp:
  - Fix RISCVTargetLowering::splitValueIntoRegisterParts and
    RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors
    with different total size, that will happened on fractional LMUL since
    fractional LMUL is always occupy one vector register.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D112599
2021-12-09 14:46:49 +08:00
Fraser Cormack
3460cc2585 [VP] Propagate align parameter attr on VP load/store to ISel
This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.load and llvm.vp.store was being dropped
during the conversion to the SelectionDAG. The default alignment
equal to the ABI type alignment of the vector type was kept. It also
updates the documentation to reflect the fact that the parameter
attribute is now properly supported.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114422
2021-12-07 10:16:16 +00:00