CodeGenFunction::EmitCXXDeleteExpr contains logic to go from a pointer
to an array to a pointer to the first element of the array using a
getelementptr LLVM IR instruction. This was done for pointers that were
not variable length arrays, as pointers to variable length arrays never
existed in LLVM IR, but rather than checking for arrays that were not
variable length arrays, it checked for arrays that had a constant bound.
This caused incomplete arrays to be inadvertently omitted.
This getelementptr was necessary back when LLVM IR used typed pointers,
but they have been gone for a while, a gep with a constant zero offset
does nothing now, so we can simplify the code by removing that.
"amdgpu-as" is way too vague and doesn't give enough context.
We may want to support it on normal atomics too, to control the synchronized (ordered) AS.
If we do that, the name has to be less vague.
FMV priority is the returned value of a polymorphic function. On RISC-V
and X86 targets a 32-bit value is enough. On AArch64 we currently need
64 bits and we will soon exceed that. APInt seems to be a suitable
replacement for uint64_t, presumably with minimal compile time overhead.
It allows bit manipulation, comparison and variable bit width.
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
Let Clang emit `dead_on_return` attribute on pointer arguments
that are passed indirectly, namely, large aggregates that the
ABI mandates be passed by value; thus, the parameter is destroyed
within the callee. Writes to such arguments are not observable by
the caller after the callee returns.
This should desirably enable further MemCpyOpt/DSE optimizations.
Previous discussion: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
Currently, clang coerces (u)int128_t to two i64 IR parameters when they
are passed in registers. This leads to broken debug info for them after
applying SROA+InstCombine. SROA generates IR like this
([godbolt](https://godbolt.org/z/YrTa4chfc)):
```llvm
define dso_local { i64, i64 } @add(i64 noundef %a.coerce0, i64 noundef %a.coerce1) {
entry:
%a.sroa.2.0.insert.ext = zext i64 %a.coerce1 to i128
%a.sroa.2.0.insert.shift = shl nuw i128 %a.sroa.2.0.insert.ext, 64
%a.sroa.0.0.insert.ext = zext i64 %a.coerce0 to i128
%a.sroa.0.0.insert.insert = or i128 %a.sroa.2.0.insert.shift, %a.sroa.0.0.insert.ext
#dbg_value(i128 %a.sroa.0.0.insert.insert, !17, !DIExpression(), !18)
// ...
!17 = !DILocalVariable(name: "a", arg: 1, scope: !10, file: !11, line: 1, type: !14)
// ...
```
and InstCombine then removes the `or`, moving it into the
`DIExpression`, and the `shl` at which point the debug info salvaging in
`Transforms/Local` replaces the arguments with `poison` as it does not
allow constants larger than 64 bit in `DIExpression`s.
I'm working under the assumption that there is interest in fixing this.
If not, please tell me.
By not coercing `int128_t`s into `{i64, i64}` but keeping them as
`i128`, the debug info stays intact and SelectionDAG then generates two
`DW_OP_LLVM_fragment` expressions for the two corresponding argument
registers.
Given that the ABI code for x64 seems to not coerce the argument when it
is passed on the stack, it should not lead to any problems keeping it as
an `i128` when it is passed in registers.
Alternatively, this could be fixed by checking if a constant value fits
in 64 bits in the debug info salvaging code and then extending the value
on the expression stack to the necessary width. This fixes InstCombine
breaking the debug info but then SelectionDAG removes the expression and
that seems significantly more complex to debug.
Another fix may be to generate `DW_OP_LLVM_fragment` expressions when
removing the `or` as it gets marked as disjoint by InstCombine. However,
I don't know if the KnownBits information is still available at the time
the `or` gets removed and it would probably require refactoring of the
debug info salvaging code as that currently only seems to replace single
expressions and is not designed to support generating new debug records.
Converting `(u)int128_t` arguments to `i128` in the IR seems like the
simpler solution, if it doesn't cause any ABI issues.
Currently, when specifying `vectorize(disable) unroll_count(8)`, the
generated metadata appears as follows:
```
!loop0 = !{!"loop0", !vectorize_width, !followup}
!vectorize_width = !{!"llvm.loop.vectorize.width", i32 1}
!followup = !{!"llvm.loop.vectorize.followup_all", !unroll}
!unroll = !{!"llvm.loop.unroll_count", i32 8}
```
Since the metadata `!vectorize_width` implies that the vectorization is
disabled, the vectorization process is skipped, and the `!followup`
metadata is not processed correctly.
This patch addresses the issue by directly appending properties to the
metadata node when vectorization is disabled, instead of creating a new
follow-up MDNode. In the above case, the generated metadata will now
look like this:
```
!loop0 = !{!"loop0", !vectorize_width, !vectorize_width, !unroll}
!vectorize_width = !{!"llvm.loop.vectorize.width", i32 1}
!unroll = !{!"llvm.loop.unroll_count", i32 8}
```
- [x] Implement refract using HLSL source in hlsl_intrinsics.h
- [x] Implement the refract SPIR-V target built-in in
clang/include/clang/Basic/BuiltinsSPIRV.td
- [x] Add sema checks for refract to CheckSPIRVBuiltinFunctionCall in
clang/lib/Sema/SemaSPIRV.cpp
- [x] Add codegen for spv refract to EmitSPIRVBuiltinExpr in
CGBuiltin.cpp
- [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/refract.hlsl
- [x] Add spv codegen test to clang/test/CodeGenSPIRV/Builtins/refract.c
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/refract-errors.hlsl
- [x] Add spv sema tests to
clang/test/SemaSPIRV/BuiltIns/refract-errors.c
- [x] Create the int_spv_refract intrinsic in IntrinsicsSPIRV.td
- [x] In SPIRVInstructionSelector.cpp create the refract lowering and
map it to int_spv_refract in SPIRVInstructionSelector::selectIntrinsic.
- [x] Create SPIR-V backend test case in
llvm/test/CodeGen/SPIRV/hlsl-intrinsics/refract.ll
- [x] Check for what OpenCL support is needed.
Resolves https://github.com/llvm/llvm-project/issues/99153
This PR removes the command line parsing workaround introduced in
https://github.com/llvm/llvm-project/pull/146342 by moving
`LangOptions::ExceptionHandling` to `CodeGenOptions` that get parsed
even for IR input. Additionally, this improves layering, where the
codegen library now checks `CodeGenOptions` instead of `LangOptions` for
exception handling. (This got enabled by
https://github.com/llvm/llvm-project/pull/146422.)
The verifier check was in the wrong place, meaning it wasn't actually
checking many instructions.
Fixing that causes a test failure (coro-dwarf-key-instrs.cpp) because
coros turn off the feature but still annotate instructions with the
metadata (which is a supported situation, but the verifier doesn't like
it, and it's hard to teach the verifier to like it).
Fix that by avoiding emitting any key instruction metadata if the
DISubprogram has opted out of key instructions.
XAndesBFHCvt provides two builtins functions for converting between
float and bf16. Users can use them to convert bf16 values loaded from
memory to float, perform arithmetic operations, then convert them back
to bf16 and store them to memory.
The load/store and move operations for bf16 will be handled in a later
patch.
Fixes: https://github.com/llvm/llvm-project/issues/148953
Currently when coroutine return object type is const qualified, we don't
do direct emission. The regular emission logic assumed that the auto var
emission will always result in an `AllocaInst`. However, based on my
findings, NRVO var emissions don't result in `AllocaInst`s. Therefore,
this
[assertion](1a940bfff9/clang/lib/CodeGen/CGCoroutine.cpp (L712))
will fail.
Since the NRVOed returned object don't live on the coroutine frame, we
won't have the problem of it outliving the coroutine frame, therefore,
we can safely omit this metadata.
Some `LangOptions` duplicate their `CodeGenOptions` counterparts. My
understanding is that this was done solely because some infrastructure
(like preprocessor initialization, serialization, module compatibility
checks, etc.) were only possible/convenient for `LangOptions`. This PR
implements the missing support for `CodeGenOptions`, which makes it
possible to remove some duplicate `LangOptions` fields and simplify the
logic. Motivated by https://github.com/llvm/llvm-project/pull/146342.
When constructing the protocol list in the class metadata generation
(`GenerateClass`), only the protocols from the base class are added but
not protocols declared in class extensions.
This is fixed by using `all_referenced_protocol_{begin, end}` instead of
`protocol_{begin, end}`, matching the behaviour on Apple platforms.
A unit test is included to check if all protocol metadata was emitted
and that no duplication occurs in the protocol list.
Fixes https://github.com/gnustep/libobjc2/issues/339
CC: @davidchisnall
This PR introduces the use of pointer authentication to objective-c[++].
This includes:
* __ptrauth qualifier support for ivars
* protection of isa and super fields
* protection of SEL typed ivars
* protection of class_ro_t data
* protection of methodlist pointers and content
Clang can chose which sort of source-file hash is attached to a DIFile
metadata node. However, whenever hashing is possible, we /always/ attach
a hash. This patch permits users who want DWARF5 but don't want the file
hashes to opt out, by adding a "none" option to the -gsrc-hash option
that skips hash computation.
At this time (immediately prior to llvm21 branching) we haven't
instrumented coroutine generation to identify the "key" instructions of
things like co_return and similar. This will lead to worse stepping
behaviours, as there won't be any key instruction for those lines.
This patch removes the key-instructions flag from the DISubprograms for
coroutines, which will cause AsmPrinter to use the "old" / existing
linetable stepping behaviour, avoiding a regression until we can
instrument these constructs.
(I'm going to post on discourse about whether this is a good idea or not
in a moment)
__enqueue_kernel_varargs' last parameter is in addrspace(5), but CodeGen
currently misses this qualifier. This commit fixes the code to preserve
the qualifier by referencing Alloca, which has its casts removed, rather
than TmpPtr.
The patch adds intrinsics and lowering logic for GlobalSize,
GlobalOffset, SubgroupMaxSize, NumWorkgroups, WorkgroupSize,
WorkgroupId, LocalInvocationId, GlobalInvocationId, SubgroupSize,
NumSubgroups, SubgroupId and SubgroupLocalInvocationId SPIR-V builtins.
The patch also extend spv_thread_id, spv_group_id and
spv_thread_id_in_group to return anyint rather than i32. This allows the
intrinsics to support the opencl environment.
For each of the intrinsics, new clang builtins were added as well as a
binding for the SPIR-V "friendly" format. The original format doesn't
define such binding (uses global variables) but it is not possible to
express the Input SC which is normally required by the environement
specs, and using builtin functions is the most usual approach for other
backend and programming models.
https://reviews.llvm.org/D104808 set the alignment of long double to 64
bits. This is also the alignment specified in the LLVM data layout.
However, the alignment of __float128 was left at 128 bits.
I assume that this was just an oversight, rather than an intentional
divergence. The C ABI document currently does not make any statement
about `__float128`:
https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md