As explained by commit 849f1dd15e92fda2b83dbb6144e6b28b2cb946e0,
-fxray-function-index was the original default but was accidentally flipped by
commit d8a8e5d6240a1db809cd95106910358e69bbf299. Restore the previous behavior.
Originally reported by Oleksii Lozovskyi in D145848.
Apply my post-commit comment on D81995. The negative name misguided commit
d8a8e5d6240a1db809cd95106910358e69bbf299 (`[clang][cli] Remove marshalling from
Opt{In,Out}FFlag`) to:
* accidentally flip the option to not emit the xray_fn_idx section.
* change -fno-xray-function-index (instead of -fxray-function-index) to emit xray_fn_idx
This patch renames XRayOmitFunctionIndex and makes -fxray-function-index emit
xray_fn_idx, but the default remains -fno-xray-function-index.
We stopped testing with -check-prefix=SAMPLEPGO-OLDPM and
-check-prefix=THINLTO-OLDPM as of:
commit 8a7a28075b7fa70d56b131c10a4d1add777d5830
Author: Thomas Preud'homme <thomasp@graphcore.ai>
Date: Fri Sep 17 10:23:40 2021 +0100
This commit implements support for WebAssembly table types and
respective builtins. Tables are WebAssembly objects that store
reference types. They are subject to many semantic restrictions,
including, but not limited to: they may only be declared at the
top level as static zero-length arrays, they may not be arguments
or results of functions, and they may not be stored to memory.
This commit introduces the __attribute__((wasm_table)) to attach to
arrays of WebAssembly reference types. And the following builtins to
manage tables:
* ref __builtin_wasm_table_get(table, idx)
* void __builtin_wasm_table_set(table, idx, ref)
* uint __builtin_wasm_table_size(table)
* uint __builtin_wasm_table_grow(table, ref, uint)
* void __builtin_wasm_table_fill(table, idx, ref, uint)
* void __builtin_wasm_table_copy(table, table, uint, uint, uint)
This commit also enables the reference-types feature at bleeding-edge.
This is joint work with Alex Bradbury (@asb).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D139010
The first patch supported only LMUL=1 types. This patch supports
LMUL!=1.
LMUL is length multiplier that allows multiple vector registers to
be treated as one large register or a fraction of a single vector
register. Supported values for LMUL are 1/8, 1/4, 1/2, 1, 2, 4, and 8.
An LMUL=2 type will be twice as large as an LMUL=1 type. An LMUL=1/2
type will be half the size of an LMUL=1 type.
A type name with "m2" is LMUL=2, one with "m4" is LMUL=4.
A type name with "mf2" is LMUL=1/2, one with "mf4" is LMUL=1/4.
For the LMUL!=1 types the user will need to scale __riscv_v_fixed_vlen
by the LMUL before passing to the attribute.
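The scaling rule can be sketched in plain C. This is only an illustration, not code from the patch: the helper `scaled_vector_bits` and its numerator/denominator parameters are hypothetical names, and `vlen_bits` stands in for the value of __riscv_v_fixed_vlen.

```c
#include <assert.h>

/* Hypothetical helper (not part of the patch): the bit count to pass to the
 * attribute for a type whose LMUL is the fraction lmul_num / lmul_den.
 * vlen_bits stands in for __riscv_v_fixed_vlen. */
static unsigned scaled_vector_bits(unsigned vlen_bits, unsigned lmul_num,
                                   unsigned lmul_den) {
  /* Supported LMUL values are 1/8, 1/4, 1/2, 1, 2, 4 and 8, so at least one
   * side of the fraction is 1 and the other is a power of two up to 8. */
  assert(lmul_num == 1 || lmul_den == 1);
  assert(lmul_num == 1 || lmul_num == 2 || lmul_num == 4 || lmul_num == 8);
  assert(lmul_den == 1 || lmul_den == 2 || lmul_den == 4 || lmul_den == 8);
  return vlen_bits * lmul_num / lmul_den;
}
```

With a VLEN of 128, `scaled_vector_bits(128, 2, 1)` yields 256 for an "m2" type and `scaled_vector_bits(128, 1, 2)` yields 64 for an "mf2" type. In source code this would presumably correspond to writing `__riscv_v_fixed_vlen * 2` or `__riscv_v_fixed_vlen / 2` as the attribute argument.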
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D150926
Migration of clang tests to opaque pointers is finished, so remove
the -no-opaque-pointers flag.
Differential Revision: https://reviews.llvm.org/D152447
This patch is to fix the [[ https://github.com/llvm/llvm-project/issues/63045 | issue 63045]].
Look at the following code:
```
int main(int argc, char *argv[]) {
int arr[1000];
__asm movdir64b rax, ZMMWORD PTR [arr]
return 0;
}
```
Compiling this code using `clang -O0 -fasm-blocks bug.c` gives a linker error.
The problem seems to be in the generated assembly. The following is the output of `clang -S -O0 -fasm-blocks bug.c`:
```
movq %rsi, -16(%rbp)
#APP
movdir64b arr, %rax
#NO_APP
xorl %eax, %eax
```
The symbol `arr` should be replaced with some address like `-4(%rbp)`.
This makes me believe that the issue is not in the linker, but in the ASM parser.
This issue originates with patch [D145893](https://reviews.llvm.org/D145893), which is why reverting it fixes the issue. More specifically, the problem is in the function [isMem512_GR64()](ff471dcf76/llvm/lib/Target/X86/AsmParser/X86Operand.h (L404)) within the [llvm/lib/Target/X86/AsmParser/X86Operand.h](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/AsmParser/X86Operand.h) file.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D151863
Due to the nature of WebAssembly, it's always better to keep
rotates instead of trying to optimize them. Commit 9485d983
disabled the generation of fsh for rotates; these tests ensure
that future changes don't alter this behaviour for the Wasm
backend, which tends to have different optimization
requirements than other architectures. Also see:
https://github.com/llvm/llvm-project/issues/62703
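For reference, the kind of source-level rotate these tests are concerned with can be sketched as below. This is an illustrative idiom only, not one of the added test cases; the names `rotl32` and `rotl32_examples` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate x left by n bits. n is reduced modulo 32 so that neither shift
 * count reaches the operand width, which would be undefined behaviour. */
static uint32_t rotl32(uint32_t x, unsigned n) {
  n &= 31u;
  return (x << n) | (x >> ((32u - n) & 31u));
}

/* A few worked values for the idiom above. */
static void rotl32_examples(void) {
  assert(rotl32(0x80000001u, 1) == 0x00000003u); /* top bit wraps around */
  assert(rotl32(0x12345678u, 8) == 0x34567812u); /* top byte moves to bottom */
  assert(rotl32(0x12345678u, 0) == 0x12345678u); /* rotate by 0 is identity */
}
```

WebAssembly has a native `i32.rotl` instruction, so keeping such a pattern recognizable as a rotate lets the backend select it directly.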
Differential Revision: https://reviews.llvm.org/D152126
This reverts commit 35a0079238ce9fc36cdc8c6a2895eb5538bf7b4a.
The backend support is not present yet. The intrinsics will crash
the compiler if compiled to assembly or binary.
Inline builtins behave very differently from other functions, so it's
better to keep them restricted to a minimal set of functions.
Add a linkage check which prevents considering ODR definitions as inline
builtins.
Fix #62958
Differential Revision: https://reviews.llvm.org/D148723
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svld1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
SveEmitter.cpp is extended to generate arm_sme.h (currently named
arm_sme_draft_spec_subject_to_change.h) and other SME definitions from
arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions
are moved into arm_sve_sme_incl.td.
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D127910
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
architectures except ARM. This change has been made in
accordance with the finalization of the mangling for the
std::bfloat16_t type, as discussed at
https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
the __bf16 type. This applies to hardware architectures that do not
natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.
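The effect of excess precision can be sketched with a software model of bfloat16. This is an illustration under assumptions, not the compiler's implementation: `bf16_bits` and the helper names are hypothetical, bfloat16 is modeled as the upper 16 bits of a 32-bit float, and the intermediate arithmetic is assumed to be carried out in `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software stand-in for __bf16: the upper 16 bits of an
 * IEEE-754 binary32 value. */
typedef uint16_t bf16_bits;

/* Round a float to bfloat16 using round-to-nearest-even. */
static bf16_bits bf16_from_float(float f) {
  uint32_t u;
  assert(f == f); /* NaN inputs are outside the scope of this sketch */
  memcpy(&u, &f, sizeof u);
  u += 0x7FFFu + ((u >> 16) & 1u);
  return (bf16_bits)(u >> 16);
}

/* Widen a bfloat16 value to float; this conversion is exact. */
static float bf16_to_float(bf16_bits h) {
  uint32_t u = (uint32_t)h << 16;
  float f;
  memcpy(&f, &u, sizeof f);
  return f;
}

/* a + b + c with a rounding to bfloat16 after every operation. */
static bf16_bits add3_round_each_step(bf16_bits a, bf16_bits b, bf16_bits c) {
  bf16_bits t = bf16_from_float(bf16_to_float(a) + bf16_to_float(b));
  return bf16_from_float(bf16_to_float(t) + bf16_to_float(c));
}

/* a + b + c evaluated in float, rounded to bfloat16 once at the end. */
static bf16_bits add3_excess_precision(bf16_bits a, bf16_bits b, bf16_bits c) {
  float sum = bf16_to_float(a) + bf16_to_float(b) + bf16_to_float(c);
  return bf16_from_float(sum);
}
```

With a = 1.0 (0x3F80) and b = c = 2^-8 (0x3B80), rounding after every step gives 1.0 (0x3F80), because each intermediate sum lands exactly halfway between two bfloat16 values and rounds back to 1.0. Evaluating in float and rounding once gives 1 + 2^-7 (0x3F81).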
Reviewed By: rjmccall, pengfei, zahiraam
Differential Revision: https://reviews.llvm.org/D150913
Function pointers are checked by loading a prefix structure from just
before the function's entry point. However, on Arm, the function
pointer is not always exactly equal to the address of the entry point,
because Thumb function pointers have the low bit set to tell the BX
instruction to enter them in Thumb state. So the generated code loads
from an odd address and suffers an alignment fault.
Fixed by clearing the low bit of the function pointer before
subtracting 8.
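The address computation can be sketched as follows. This is a host-side illustration on integer values rather than the generated code; `prefix_address` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given a function pointer value, compute the address
 * of the prefix data located 8 bytes before the function's entry point. */
static uintptr_t prefix_address(uintptr_t fn_ptr) {
  /* On Arm, a Thumb function pointer has bit 0 set; clear it to recover
   * the real entry point before doing address arithmetic. */
  uintptr_t entry = fn_ptr & ~(uintptr_t)1;
  assert(entry >= 8); /* the prefix must fit below the entry point */
  return entry - 8;
}
```

For a Thumb function at 0x1000, the pointer value is 0x1001. Without the mask the load address would be 0xFF9, which is odd. With the mask it is 0xFF8, the same as for an Arm-state pointer to 0x1000.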
Differential Revision: https://reviews.llvm.org/D151308
This commit updates all intrinsic tests under
`clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated` because the
new version of the `update_llc_test_checks.py` script generates many
lines differently.
This NFC commit updates the test cases in a single batch.
Signed-off-by: eop Chen <eop.chen@sifive.com>
If there is an infinite cycle in the IR, the loop will never exit. Keep
track of visited basic blocks in a set and return nullptr if a block is
visited again.
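The idea can be sketched on a simplified model. The `struct block` type, its fields, and `find_target` are hypothetical stand-ins, not the LLVM data structures touched by this change; each block is assumed to have at most one successor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_BLOCKS = 64 };

/* Hypothetical stand-in for a basic block with a single successor. */
struct block {
  unsigned id;        /* unique index, must be < MAX_BLOCKS */
  bool is_target;     /* true for the block the walk is looking for */
  struct block *next; /* sole successor, or NULL at the end of the chain */
};

/* Follow successors from start until a target block is found. Returns NULL
 * if the chain ends or if a block is reached a second time (a cycle). */
static struct block *find_target(struct block *start) {
  bool visited[MAX_BLOCKS] = {false};
  for (struct block *bb = start; bb != NULL; bb = bb->next) {
    assert(bb->id < MAX_BLOCKS);
    if (visited[bb->id])
      return NULL; /* already seen: bail out instead of looping forever */
    visited[bb->id] = true;
    if (bb->is_target)
      return bb;
  }
  return NULL;
}
```

Without the `visited` check, a chain such as A -> B -> A with no target block would never terminate.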
Fixes #62830.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D151076
As reported by @kees, GCC treats __builtin_object_size of structures
containing flexible array members (aka arrays with incomplete type) not
just as the sizeof the underlying type, but additionally the size of the
members in a designated initializer list.
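A minimal sketch of the situation, assuming a statically initialized object and mode 0 of the builtin. `struct packet`, `pkt`, and `packet_object_size` are illustrative names and are not taken from the bug report.

```c
#include <assert.h>
#include <stddef.h>

struct packet {
  int len;
  char data[]; /* flexible array member */
};

/* Statically initializing a flexible array member is a GNU extension;
 * here the initializer gives the object four extra bytes of storage. */
static struct packet pkt = {.len = 4, .data = {1, 2, 3, 4}};

/* Ask the compiler how large it believes the whole object is. */
static size_t packet_object_size(void) {
  size_t size = __builtin_object_size(&pkt, 0);
  /* The result should never be smaller than the fixed part of the struct. */
  assert(size >= sizeof(struct packet));
  return size;
}
```

Under the behavior described above, the result would be expected to include the four initialized `data` bytes in addition to `sizeof(struct packet)`. The exact value depends on the compiler and its version.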
Fixes: https://github.com/llvm/llvm-project/issues/62789
Reviewed By: erichkeane, efriedma
Differential Revision: https://reviews.llvm.org/D150892
There are two motivations.
The `__stack_chk_guard` symbol created by
`-fno-pic -fstack-protector -mstack-protector-guard=global` is referenced
directly on all ELF OSes except FreeBSD. This patch allows referencing the
symbol indirectly with -fno-direct-access-external-data.
Some Linux kernel folks want the `__stack_chk_guard` symbol created by
`-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard`
to be referenced directly, avoiding R_X86_64_REX_GOTPCRELX (even if the
relocation may be optimized out by the linker).
https://github.com/llvm/llvm-project/issues/60116
Why they need this isn't so clear to me.
---
Add module flag "direct-access-external-data" and set the dso_local property of
the stack protector symbol. The module flag can benefit other LLVMCodeGen
synthesized symbols that are not represented in LLVM IR.
Nowadays, with `-fno-pic` being uncommon, ideally we should set
"direct-access-external-data" when it is true. However, doing so would require
~90 clang/test tests to be updated, which is too much.
As a compromise, we set "direct-access-external-data" only when it's different
from the implied default value.
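The compromise can be sketched as a small predicate. The function names are hypothetical, and the sketch assumes that the implied default is direct access for -fno-pic code and indirect access otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: without an explicit option, -fno-pic code accesses external
 * data directly and PIC/PIE code does not. */
static bool implied_direct_access(bool is_pic) { return !is_pic; }

/* Emit the module flag only when the effective setting differs from the
 * implied default, so that existing tests keep their current output. */
static bool should_emit_module_flag(bool is_pic, bool direct_access) {
  return direct_access != implied_direct_access(is_pic);
}

/* The four combinations under the assumption above. */
static void module_flag_examples(void) {
  assert(!should_emit_module_flag(false, true)); /* -fno-pic, default */
  assert(should_emit_module_flag(false, false)); /* -fno-pic, overridden */
  assert(!should_emit_module_flag(true, false)); /* PIC, default */
  assert(should_emit_module_flag(true, true));   /* PIC, overridden */
}
```

Emitting the flag only for non-default settings is what avoids touching the ~90 existing tests.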
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D150841
Add more builtins for stdio functions as in GCC, along with their
mutations under IEEE float128 ABI.
Reviewed By: tuliom
Differential Revision: https://reviews.llvm.org/D150087
This is an ongoing series of commits that are reformatting our
Python code.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D150761
As reported by @kees, GCC treats __builtin_object_size of structures
containing flexible array members (aka arrays with incomplete type) not
just as the sizeof the underlying type, but additionally the size of the
members in a designated initializer list.
Fixes: https://github.com/llvm/llvm-project/issues/62789
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D150892
With D148785, -fsanitize=function no longer uses C++ RTTI objects and therefore
can support C. The rationale for reporting errors is C11 6.5.2.2p9:
> If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
The mangled types approach we use does not exactly match the C type
compatibility (see `f(callee1)` below).
This is probably fine as the rules are unlikely to be leveraged in practice. In
addition, the call is diagnosed by -Wincompatible-function-pointer-types-strict.
```
void callee0(int (*a)[]) {}
void callee1(int (*a)[1]) {}
void f(void (*fp)(int (*)[])) { fp(0); }
int main() {
int a[1];
f(callee0);
f(callee1); // compatible but flagged by -fsanitize=function, -fsanitize=kcfi, and -Wincompatible-function-pointer-types-strict
}
```
Skip indirect call sites of a function type without a prototype to avoid dealing
with C11 6.5.2.2p6. -fsanitize=kcfi skips such calls as well.
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D148827
For the cover letter of this patch-set, please check out D146872.
Depends on D147916.
This is the 11th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Only vset for tuple type of NF=2, EEW=32, LMUL=1 is
defined now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147917
For the cover letter of this patch-set, please check out D146872.
Depends on D147915.
This is the 10th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Only vget for tuple type of NF=2, EEW=32, LMUL=1 is
defined now.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147916
For the cover letter of this patch-set, please check out D146872.
Depends on D147914.
This is the 9th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple indexed segment store is
not removed, and only signed integer indexed segment store of NF=2,
EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147915
For the cover letter of this patch-set, please check out D146872.
Depends on D147913.
This is the 8th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple indexed segment load is
not removed, and only signed integer indexed segment load of NF=2,
EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147914
For the cover letter of this patch-set, please check out D146872.
Depends on D147912.
This is the 7th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple strided segment store is
not removed, and only signed integer strided segment store of NF=2,
EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147913
For the cover letter of this patch-set, please check out D146872.
Depends on D147911.
This is the 6th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple strided segment load is not
removed, and only signed integer strided segment load of NF=2, EEW=32
is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147912
For the cover letter of this patch-set, please check out D146872.
Depends on D147774.
This is the 5th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple unit-stride fault-first
segment load is not removed, and only signed integer unit-stride
fault-first segment load of NF=2, EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147911
For the cover letter of this patch-set, please check out D146872.
Depends on D147731.
This is the 4th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple unit-stride segment store is
not removed, and only signed integer unit-strided segment store of NF=2,
EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147774
For the cover letter of this patch-set, please check out D146872.
Depends on D146873.
This is the 3rd patch of the patch-set. This patch originates from
D99593.
Note: This patch is a proof-of-concept and will be extended to full
coverage in the future. Currently, the old non-tuple unit-stride
segment load is not removed, and only signed integer unit-strided
segment load of NF=2, EEW=32 is defined here.
When replacing the old intrinsics, the extra `IsTuple` parameter under
various places will be redundant and removed.
Authored-by: eop Chen <eop.chen@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147731