7802 Commits

Author SHA1 Message Date
David Green
9c48b7f0e7 [AArch64][ARM] Alter v8.1a neon intrinsics to be target-based, not preprocessor based
As a continuation of D132034, this switches the QRDMX v8.1a neon
intrinsics over from preprocessor defines to be target-gated. As there
is no "rdma" or "qrdmx" target feature, they use the "v8.1a"
architecture feature directly.

This works well for AArch64, but something needs to be done for Arm at
the same time, as they both use the same header and tablegen emitter.
This patch opts for adding "v8.1a" and all dependant target features to
the Arm TargetParser, similar to what was recently done for AArch64 but
through initFeatureMap when the Architecture is parsed. I attempted to
make the code similar to the AArch64 backend.

Otherwise this is similar to the changes made in D132034.

Differential Revision: https://reviews.llvm.org/D135615
2022-10-25 09:02:52 +01:00
Freddy Ye
fdac4c4e92 [X86] Add CMPCCXADD instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135933
2022-10-25 14:33:39 +08:00
Markus Böck
3637dc601c [clang][CodeGen] Consistently return nullptr Values for void builtins and scalar initalization
A common post condition of the various visitor functions in CodeGen is that instructions, that do not return any values, simply return a nullptr Value as a sentinel. This has not been the case however for calls to some builtins returning void, as well as for an initializer expression of the form `void()`. This would then lead to ICEs in CodeGen on code relying on nullptr being returned for void values, which is eg. the case for conditional expressions [0].
This patch fixes that by returning nullptr Values for intrinsics known not to return any values as well as for a scalar initializer returning void.

Fixes https://github.com/llvm/llvm-project/issues/53127

[0] 266ec801fb/clang/lib/CodeGen/CGExprScalar.cpp (L4849-L4892)

Differential Revision: https://reviews.llvm.org/D136548
2022-10-24 21:41:13 +02:00
Rong Xu
6cee539337 [Clang] Change AnonStructIds in MangleContext to per-function based
Clang is generating different mangled names for the same lambda
function in slightly changed builds (like with non-related
source/Macro change). This is due to the fact that clang uses a
cross-translation-unit sequential string "$_<n>" in lambda's
mangled name. Here, "n" is the AnonStructIds field in MangleContext.

Different mangled names for a unchanged function is undesirable:
it makes perf comparison harder, and can cause some unnecessary
profile mismatch in SampleFDO.

This patch makes mangled name for lambda functions more stable
by changing AnonStructIds to a per-function based seq number if the
DeclContext is a function.

Differential Revision: https://reviews.llvm.org/D136397
2022-10-23 22:33:52 -07:00
Xiang1 Zhang
661881d436 [X86] Add AMX-FP16 instructions.
Differential Revision: https://reviews.llvm.org/D135941
2022-10-22 08:05:22 +08:00
Bill Wendling
283e0a81ef [clang] Correct sanitizer behavior in union FAMs
Clang doesn't have the same behavior as GCC does with union flexible
array members. (Technically, union FAMs are probably not acceptable in
C99 and are an extension of GCC and Clang.)

Both Clang and GCC treat *all* arrays at the end of a structure as FAMs.
GCC does the same with unions. Clang does it for some arrays in unions
(incomplete, '0', and '1'), but not for all. Instead of having this
half-supported feature, sync Clang's behavior with GCC's.

Reviewed By: kees

Differential Revision: https://reviews.llvm.org/D135727
2022-10-20 16:08:11 -07:00
Michael Francis
922f42d531 [clang][AIX] Fix mcount name and call arguments
Currently, compiling a program with the `-pg` flag will result in an
undefined symbol error for `.mcount`. This revision fixes the call to
use `__mcount`, which requires a pointer argument to a pointer-sized
object (unique per inserted call) on AIX.

This is only a partial fix. This patch should fix the `-pg` flag's
behaviour on AIX to work with code you are compiling, but it will not
link against standard libraries with `mcount` instrumentation calls. The
next step is to add profiled libraries to the linker search paths in the
Clang driver for the AIX toolchain when linking with `-pg`.

Differential Review: https://reviews.llvm.org/D135384
2022-10-20 16:20:00 -04:00
Phoebe Wang
62ca79102c [X86][1/2] Support PREFETCHI instructions
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136040
2022-10-20 08:46:01 +08:00
Philip Reames
9a8f3b113d [clang][RISCV] Set vscale_range attribute based on VLEN
Follow up on D135894, restructure code to work in terms of minimum and maximum VLEN coming from RISCVISAInfo.cpp. In the original review, I'd mentioned that MinVLEN was sometimes zero. This turns out to be a case of human error, combined with really bad (lack of) error reporting.

This patch adds appropriate tests for various vector extension combinations to show the mechanism works, but doesn't try to provide exhaustive coverage of the extension interactions. Presumably, that is already covered in existing tests elsewhere.

Differential Revision: https://reviews.llvm.org/D136106
2022-10-19 16:14:33 -07:00
Phoebe Wang
bc1819389f [X86][RFC] Using __bf16 for AVX512_BF16 intrinsics
This is an alternative of D120395 and D120411.

Previously we use `__bfloat16` as a typedef of `unsigned short`. The
name may give user an impression it is a brand new type to represent
BF16. So that they may use it in arithmetic operations and we don't have
a good way to block it.

To solve the problem, we introduced `__bf16` to X86 psABI and landed the
support in Clang by D130964. Now we can solve the problem by switching
intrinsics to the new type.

Reviewed By: LuoYuanke, RKSimon

Differential Revision: https://reviews.llvm.org/D132329
2022-10-19 23:47:04 +08:00
Daniel Kiss
0d0ca64356 [AArch64] Make ACLE intrinsics always available part MTE
Make MTE intrinsics available in function scope too.
Followup from D133359.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D136062
2022-10-18 11:03:02 +02:00
Daniel Kiss
a175d8b177 Revert "[AArch64] Make ACLE intrinsics always available part MTE"
This reverts commit 09aaf190d93393d9e29d29a033cc3979589c5e84.
2022-10-18 10:45:32 +02:00
Daniel Kiss
09aaf190d9 [AArch64] Make ACLE intrinsics always available part MTE
Make MTE intrinsics available in function scope too.
Followup from D133359.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D136062
2022-10-18 10:35:40 +02:00
Philip Reames
4467c781d7 [clang][RISCV] Set vscale_range attribute based on presence of "v" extension
This follows the path that AArch64 SVE has taken. Doing this via a function attribute set in the frontend is basically a workaround for the fact that several analyzes which need the information (i.e. known bits, lvi, scev) can't easily use TTI without significant amounts of plumbing changes.

This patch hard codes "v" numbers, and directly follows the SVE precedent as a result. In a follow up, I hope to drive this from RISCVISAInfo.h/cpp instead, but the MinVLen number being returned from that interface seemed to always be 0 (which is wrong), and I haven't figured out what's going wrong there.

Differential Revision: https://reviews.llvm.org/D135894
2022-10-17 11:33:03 -07:00
Zahira Ammarguellat
d0a4741392 Fix LIT test func-attr.c added by https://reviews.llvm.org/D135097.
Differential Revision: https://reviews.llvm.org/D136084
2022-10-17 14:26:17 -04:00
Ellis Hoag
970e1ea01a [clang] Fix crash with -funique-internal-linkage-names
Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the delcaration has internal linkage.

I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm.
9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)

Fixes https://github.com/llvm/llvm-project/issues/54139

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D135926
2022-10-17 08:57:23 -07:00
Ting Wang
ee703b5cb1 [clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register
PPC64 ABI pass aggregates smaller than a register into the least
significant bits of the register. In the case of variadic functions,
they will end up right-aligned in their argument slots in the argument
area on big-endian targets. Apply right-alignment for these aggregates.

Fixes #55900.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D133338
2022-10-16 22:01:47 -04:00
Yi Kong
f118280b04 [clang] Fix func-attr.c test
The test was introduced by 84a9ec2ff1ee. The check is over-specific and
is broken on the Android buildbot. Fixed by relaxing the variable name
check.
2022-10-15 23:16:26 +09:00
Stefan Pintilie
6897dbc463 [PowerPC] Fix parameters for __builtin_crypto_vsbox
The documentation specifies that the input and ouput for the builtin
__builtin_crypto_vsbox should be vector unsigned char.

This patch fixes this type for the builtin.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D135834
2022-10-14 13:30:59 -05:00
Daniel Kiss
30b67c677c [AArch64] Make ACLE intrinsics always available part1
A given arch feature might enabled by a pragma or a function attribute so in this cases would be nice to use intrinsics.
Today GCC offers the intrinsics without the march flag[1].
PR[2] for ACLE to clarify the intention and remove the need for -march flag for a given intrinsics.

This is going to be more useful when D127812 lands.

[1] https://godbolt.org/z/bxcMhav3z
[2] https://github.com/ARM-software/acle/pull/214

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D133359
2022-10-14 17:23:11 +02:00
Zahira Ammarguellat
84a9ec2ff1 Remove redundant option -menable-unsafe-fp-math.
There are currently two options that are used to tell the compiler to perform
unsafe floating-point optimizations:
'-ffast-math' and '-funsafe-math-optimizations'.

'-ffast-math' is enabled by default. It automatically enables the driver option
'-menable-unsafe-fp-math'.
Below is a table illustrating the special operations enabled automatically by
'-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math'
respectively.

Special Operations -ffast-math	-funsafe-math-optimizations -menable-unsafe-fp-math
MathErrno	       0	         1	                    1
FiniteMathOnly         1 	         0                          0
AllowFPReassoc	       1         	 1                          1
NoSignedZero	       1                 1                          1
AllowRecip             1                 1                          1
ApproxFunc             1                 1                          1
RoundingMath	       0                 0                          0
UnsafeFPMath	       1                 0                          1
FPContract	       fast	         on	                    on

'-ffast-math' enables '-fno-math-errno', '-ffinite-math-only',
'-funsafe-math-optimzations' and sets 'FpContract' to 'fast'. The driver option
'-menable-unsafe-fp-math' enables the same special options than
'-funsafe-math-optimizations'. This is redundant.
We propose to remove the driver option '-menable-unsafe-fp-math' and use
instead, the setting of the special operations to set the function attribute
'unsafe-fp-math'. This attribute will be enabled only if those special
operations are enabled and if 'FPContract' is either 'fast' or set to the
default value.

Differential Revision: https://reviews.llvm.org/D135097
2022-10-14 10:55:29 -04:00
Ting Wang
00b9bed1f0 [clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D133488
2022-10-13 22:57:40 -04:00
wanglei
defe7c07f0 Reland "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch"
Differential Revision: https://reviews.llvm.org/D135526
2022-10-11 20:36:09 +08:00
Weining Lu
42b70793a1 Reland "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC"
Reference: https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html

k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.

m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.

ZB: An address that is held in a general-purpose register. The offset
is zero.

ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.

Note:
The INLINEASM SDNode flags in below tests are updated because the new
introduced enum `Constraint_k` is added before `Constraint_m`.
  llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
  llvm/test/CodeGen/X86/callbr-asm-kill.mir

This patch passes `ninja check-all` on a X86 machine with all official
targets and the LoongArch target enabled.

Differential Revision: https://reviews.llvm.org/D134638
2022-10-11 19:51:48 +08:00
Weining Lu
b32a1bdf42 Revert "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch"
This reverts commit 6547565e7bdcd9c3f683ad196b62d08c7061fdf1.

This breaks test: Preprocessor/init-loongarch.c
2022-10-11 19:21:28 +08:00
wanglei
6547565e7b [clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch
Differential Revision: https://reviews.llvm.org/D135526
2022-10-11 18:12:37 +08:00
David Green
b879f99f0e [AArch64][ARM] Alter most of arm_neon.h to be target-based, not preprocessor based.
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.

The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
 - For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
   is added to the definition of the function. Along with the existing
   always_inline intrinsic, this will give a compile time error if the
   function is used in a context where the target feature is not available.
 - For intrinsics emitted as macros, the __builtins are emitted into
   arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
   the target feature and gives an error if the builtin is found in a
   function without the required features, similar to arm_sve.h.

The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.

Differential Revision: https://reviews.llvm.org/D132034
2022-10-11 09:09:16 +01:00
Nikita Popov
39db5e1ed8 [CodeGen] Convert tests to opaque pointers (NFC)
Conversion performed using the script at:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34

These are only tests where no manual fixup was required.
2022-10-07 14:22:00 +02:00
Manuel Brito
14e2592ff6 [clang][CodeGen] Use poison instead of undef as placeholder in ARM builtins [NFC]
Differential Revision: https://reviews.llvm.org/D135392
2022-10-07 12:50:59 +01:00
Stefan Pintilie
0e2e1fc90a [PowerPC] Fix types for vcipher builtins.
The documentation specifies that the parameters for the vcipher builtins are
```
vector unsigned char
```
The code used
```
vector unsigned long long
```

This patch fixes the types for the vcipher builtins.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D135300
2022-10-06 14:21:34 -05:00
David Green
4987ae8462 [ARM][AArch64] Dont use macros for half instrinsics in NeonEmitter
We don't require arm_neon.h fp16 intrinsics to be treated as macros any
more.

Differential Revision: https://reviews.llvm.org/D131504
2022-10-03 15:27:23 +01:00
David Green
781b491bba [Clang][AArch64] Support AArch64 target(..) attribute formats.
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped, this patch attempts to bring it line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.

The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
  function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
  atributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
  per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
  compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enabled/disables the specific feature, for
  backward compatibility with previous releases.

To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.

Differential Revision: https://reviews.llvm.org/D133848
2022-10-01 15:40:59 +01:00
David Green
123064dc39 [Clang][Arm] Convert -fallow-half-arguments-and-returns to a target option. NFC
This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.

This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.

Differential Revision: https://reviews.llvm.org/D133885
2022-09-29 11:00:32 +01:00
Michael Platings
dba8fced96 Fix frint ACLE intrinsic names
Although the instruction names begin "frint", the ACLE spec states that
the intrinsic names begin "__rint", without the "f".

Differential Revision: https://reviews.llvm.org/D134824
2022-09-29 09:13:07 +01:00
Fangrui Song
04a65d62a0 Revert D134638 "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC"
This reverts commit b7baddc7557e5c35a0f6a604a134d849265a99d4.

Broke CodeGen/X86/callbr-asm-kill.mir
We shall pay attention when adding new constraints.
2022-09-29 00:54:56 -07:00
Weining Lu
b7baddc755 [Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC
k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.

m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.

ZB: An address that is held in a general-purpose register. The offset
is zero.

ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.

Differential Revision: https://reviews.llvm.org/D134638
2022-09-29 15:02:08 +08:00
Yonghong Song
75be0482a2 [clang][DebugInfo] Emit debuginfo for non-constant case value
Currently, clang does not emit debuginfo for the switch stmt
case value if it is an enum value. For example,
  $ cat test.c
  enum { AA = 1, BB = 2 };
  int func1(int a) {
    switch(a) {
    case AA: return 10;
    case BB: return 11;
    default: break;
    }
    return 0;
  }
  $ llvm-dwarfdump test.o | grep AA
  $
Note that gcc does emit debuginfo for the same test case.

This patch added such a support with similar implementation
to CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch,
  $ clang -g -c test.c
  $ llvm-dwarfdump test.o | grep AA
                  DW_AT_name    ("AA")
  $

Differential Revision: https://reviews.llvm.org/D134705
2022-09-28 12:10:48 -07:00
Arthur Eubanks
44ad67031c [clang][msan] Turn on -fsanitize-memory-param-retval by default
This eagerly reports use of undef values when passed to noundef
parameters or returned from noundef functions.

This also decreases binary sizes under msan.

To go back to the previous behavior, pass `-fno-sanitize-memory-param-retval`.

Reviewed By: vitalybuka, MaskRay

Differential Revision: https://reviews.llvm.org/D134669
2022-09-28 09:36:39 -07:00
Daniel Kiss
712de9d171 [AArch64] Add all predecessor archs in target info
A given function is compatible with all previous arch versions.
To avoid compering values of the attribute this logic adds all predecessor
architecture values.

Reviewed By: dmgreen, DavidSpickett

Differential Revision: https://reviews.llvm.org/D134353
2022-09-27 10:23:21 +02:00
Fangrui Song
b2d7a0dcf1 [AArch64] Check target feature support for __builtin_arm_crc*
This is the AArch64 counterpart of D134127.
Daniel Kiss will change more `BUILTIN` to `TARGET_BUILTIN`.

Fix #57802
2022-09-26 17:16:44 -07:00
Weining Lu
394f30919a [Clang][LoongArch] Add inline asm support for constraints f/l/I/K
This patch adds support for constraints `f`, `l`, `I`, `K` according
to [1]. The remain constraints (`k`, `m`, `ZB`, `ZC`) will be added
later as they are a little more complex than the others.
f: A floating-point register (if available).
l: A signed 16-bit constant.
I: A signed 12-bit constant (for arithmetic instructions).
K: An unsigned 12-bit constant (for logic instructions).

For now, no need to support register alias (e.g. `$a0`) in llvm as
clang will correctly decode the usage of register name aliases into
their official names. And AFAIK, the not yet upstreamed `rustc` for
LoongArch will always use official register names (e.g. `$r4`).

[1] https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html

Differential Revision: https://reviews.llvm.org/D134157
2022-09-26 08:49:58 +08:00
Nico Weber
ea8371247f [clang-cl] Implement /ZH: flag
Based on a patch by Arlo Siemsen (D98438)!

Differential Revision: https://reviews.llvm.org/D134544
2022-09-25 14:43:14 -04:00
eopXD
75279aeecd [RISCV][Clang] Replace all undef value with poison
Address remaining work that dates back to discussion in D126745

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134513
2022-09-24 04:42:04 -07:00
Teresa Johnson
b1926f308f Restore "[MemProf] Memprof profile matching and annotation"
This reverts commit 794b7ea960ccc3222f2af582efadbc5e5c464292, and
thus restores commit a212d8da94d08e229aa8d65283e4b116310bba10, and
follow on fixes 0cd6763fa93159b84d70a5bb602c24996acaafaa,
e9ff53d42feac7fc157718523275619a8106f2f3, and
37c6a25e9ab230e5e21fa34e246d9fec55275df0.

Use a hash function (BLAKE3) instead of hash_combine/hash_code which are
not guaranteed to be stable across executions.

Additionally, it adds a "REQUIRES: x86_64-linux" to the tests that have
raw profile inputs to avoid failures on big endian bots.

Reviewers: snehasish, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D128142
2022-09-23 11:38:47 -07:00
Teresa Johnson
794b7ea960 Revert "[MemProf] Memprof profile matching and annotation"
This reverts commit a212d8da94d08e229aa8d65283e4b116310bba10, and follow
on fixes 0cd6763fa93159b84d70a5bb602c24996acaafaa,
e9ff53d42feac7fc157718523275619a8106f2f3, and
37c6a25e9ab230e5e21fa34e246d9fec55275df0.

After re-reading the documentation for hash_combine, I don't think this
is the appropriate hash function to use for computing the hash to use as
a stack id in the metadata, since it is not guaranteed to produce stable
values across executions. I have not hit this problem, but plan to
switch to using an MD5 hash. I am hitting an issue with one of the bots
(https://lab.llvm.org/buildbot/#/builders/171/builds/20732)
where the values produced are only the lower 32 bits of the expected
hash values, however, which I assume is related to the implementation of
hash_combine and hash_code.

I believe I fixed all of the other bot failures with the follow on fixes,
which I'll merge into the new version before reapplying.
2022-09-22 16:08:03 -07:00
Pavel Samolysov
1c530500ab [Pipelines] Introduce DAE after ArgumentPromotion
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830
2022-09-22 15:33:46 -07:00
Craig Topper
52708be182 [RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions.
These extensions do not appear to be on their way to ratification.
2022-09-22 13:04:41 -07:00
Teresa Johnson
a212d8da94 [MemProf] Memprof profile matching and annotation
Profile matching and IR annotation for memprof profiles.

See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]*

* Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceeding patch adding the basic
metadata and verification support.

The matching is performed during the normal PGO annotation phase, to
ensure that the inlines applied in the IR at that point are a subset
of the inlines in the profiled binary and thus reflected in the
profile's call stacks. This is important because the call frames are
associated with functions in the profile based on the inlining in the
symbolized call stacks, and this simplifies locating the subset of
profile data relevant for matching onto each function's IR.

The PGOInstrumentationUse pass is enhanced to perform matching for
whatever combination of memprof and regular PGO profile data exists in
the profile.

Using the utilities introduced in D128854:
The memprof profile data for each context is converted to "cold" or
"notcold" based on parameterized thresholds for size, access count, and
lifetime. The memprof allocation contexts are trimmed to the minimal
amount of context required to uniquely identify whether the context is
cold or not cold. For allocations where all profiled contexts have the
same allocation type, no memprof metadata is attached and instead the
allocation call is directly annotated with an attribute specifying the
alloction type. This is the same attributed that will be applied to
allocation calls once cloned for different contexts, and later used
during LibCall simplification to emit allocation hints [4].

Depends on D128141 and D128854.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
[2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
[3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165
[4] ab87cf382d

Differential Revision: https://reviews.llvm.org/D128142
2022-09-22 12:48:31 -07:00
serge-sans-paille
d442040292 [clang] Fix interaction between asm labels and inline builtins
One must pick the same name as the one referenced in CodeGenFunction when
generating .inline version of an inline builtin, otherwise they are not
correctly replaced.

Differential Revision: https://reviews.llvm.org/D134362
2022-09-22 09:24:47 +02:00
Craig Topper
182aa0cbe0 [RISCV] Remove support for the unratified Zbp extension.
This extension does not appear to be on its way to ratification.

Still need some follow up to simplify the RISCVISD nodes.
2022-09-21 21:22:42 -07:00