llvm-project

Author	SHA1	Message	Date
David Green	9c48b7f0e7	[AArch64][ARM] Alter v8.1a neon intrinsics to be target-based, not preprocessor based As a continuation of D132034, this switches the QRDMX v8.1a neon intrinsics over from preprocessor defines to be target-gated. As there is no "rdma" or "qrdmx" target feature, they use the "v8.1a" architecture feature directly. This works well for AArch64, but something needs to be done for Arm at the same time, as they both use the same header and tablegen emitter. This patch opts for adding "v8.1a" and all dependent target features to the Arm TargetParser, similar to what was recently done for AArch64 but through initFeatureMap when the Architecture is parsed. I attempted to make the code similar to the AArch64 backend. Otherwise this is similar to the changes made in D132034. Differential Revision: https://reviews.llvm.org/D135615	2022-10-25 09:02:52 +01:00
Freddy Ye	fdac4c4e92	[X86] Add CMPCCXADD instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135933	2022-10-25 14:33:39 +08:00
Markus Böck	3637dc601c	[clang][CodeGen] Consistently return nullptr Values for void builtins and scalar initialization A common post condition of the various visitor functions in CodeGen is that instructions that do not return any values simply return a nullptr Value as a sentinel. This has not been the case however for calls to some builtins returning void, as well as for an initializer expression of the form `void()`. This would then lead to ICEs in CodeGen on code relying on nullptr being returned for void values, which is e.g. the case for conditional expressions [0]. This patch fixes that by returning nullptr Values for intrinsics known not to return any values as well as for a scalar initializer returning void. Fixes https://github.com/llvm/llvm-project/issues/53127 [0] `266ec801fb/clang/lib/CodeGen/CGExprScalar.cpp (L4849-L4892)` Differential Revision: https://reviews.llvm.org/D136548	2022-10-24 21:41:13 +02:00
Rong Xu	6cee539337	[Clang] Change AnonStructIds in MangleContext to per-function based Clang is generating different mangled names for the same lambda function in slightly changed builds (like with non-related source/Macro change). This is due to the fact that clang uses a cross-translation-unit sequential string "$_<n>" in lambda's mangled name. Here, "n" is the AnonStructIds field in MangleContext. Different mangled names for an unchanged function are undesirable: they make perf comparison harder, and can cause some unnecessary profile mismatch in SampleFDO. This patch makes mangled names for lambda functions more stable by changing AnonStructIds to a per-function sequence number if the DeclContext is a function. Differential Revision: https://reviews.llvm.org/D136397	2022-10-23 22:33:52 -07:00
Xiang1 Zhang	661881d436	[X86] Add AMX-FP16 instructions. Differential Revision: https://reviews.llvm.org/D135941	2022-10-22 08:05:22 +08:00
Bill Wendling	283e0a81ef	[clang] Correct sanitizer behavior in union FAMs Clang doesn't have the same behavior as GCC does with union flexible array members. (Technically, union FAMs are probably not acceptable in C99 and are an extension of GCC and Clang.) Both Clang and GCC treat all arrays at the end of a structure as FAMs. GCC does the same with unions. Clang does it for some arrays in unions (incomplete, '0', and '1'), but not for all. Instead of having this half-supported feature, sync Clang's behavior with GCC's. Reviewed By: kees Differential Revision: https://reviews.llvm.org/D135727	2022-10-20 16:08:11 -07:00
Michael Francis	922f42d531	[clang][AIX] Fix mcount name and call arguments Currently, compiling a program with the `-pg` flag will result in an undefined symbol error for `.mcount`. This revision fixes the call to use `__mcount`, which requires a pointer argument to a pointer-sized object (unique per inserted call) on AIX. This is only a partial fix. This patch should fix the `-pg` flag's behaviour on AIX to work with code you are compiling, but it will not link against standard libraries with `mcount` instrumentation calls. The next step is to add profiled libraries to the linker search paths in the Clang driver for the AIX toolchain when linking with `-pg`. Differential Review: https://reviews.llvm.org/D135384	2022-10-20 16:20:00 -04:00
Phoebe Wang	62ca79102c	[X86][1/2] Support PREFETCHI instructions For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136040	2022-10-20 08:46:01 +08:00
Philip Reames	9a8f3b113d	[clang][RISCV] Set vscale_range attribute based on VLEN Follow up on D135894, restructure code to work in terms of minimum and maximum VLEN coming from RISCVISAInfo.cpp. In the original review, I'd mentioned that MinVLEN was sometimes zero. This turns out to be a case of human error, combined with really bad (lack of) error reporting. This patch adds appropriate tests for various vector extension combinations to show the mechanism works, but doesn't try to provide exhaustive coverage of the extension interactions. Presumably, that is already covered in existing tests elsewhere. Differential Revision: https://reviews.llvm.org/D136106	2022-10-19 16:14:33 -07:00
Phoebe Wang	bc1819389f	[X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics This is an alternative to D120395 and D120411. Previously we used `__bfloat16` as a typedef of `unsigned short`. The name may give users the impression that it is a brand new type to represent BF16, so they may use it in arithmetic operations, and we don't have a good way to block that. To solve the problem, we introduced `__bf16` to the X86 psABI and landed the support in Clang by D130964. Now we can solve the problem by switching intrinsics to the new type. Reviewed By: LuoYuanke, RKSimon Differential Revision: https://reviews.llvm.org/D132329	2022-10-19 23:47:04 +08:00
Daniel Kiss	0d0ca64356	[AArch64] Make ACLE intrinsics always available part MTE Make MTE intrinsics available in function scope too. Followup from D133359. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D136062	2022-10-18 11:03:02 +02:00
Daniel Kiss	a175d8b177	Revert "[AArch64] Make ACLE intrinsics always available part MTE" This reverts commit 09aaf190d93393d9e29d29a033cc3979589c5e84.	2022-10-18 10:45:32 +02:00
Daniel Kiss	09aaf190d9	[AArch64] Make ACLE intrinsics always available part MTE Make MTE intrinsics available in function scope too. Followup from D133359. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D136062	2022-10-18 10:35:40 +02:00
Philip Reames	4467c781d7	[clang][RISCV] Set vscale_range attribute based on presence of "v" extension This follows the path that AArch64 SVE has taken. Doing this via a function attribute set in the frontend is basically a workaround for the fact that several analyses which need the information (i.e. known bits, lvi, scev) can't easily use TTI without significant amounts of plumbing changes. This patch hard codes "v" numbers, and directly follows the SVE precedent as a result. In a follow up, I hope to drive this from RISCVISAInfo.h/cpp instead, but the MinVLen number being returned from that interface seemed to always be 0 (which is wrong), and I haven't figured out what's going wrong there. Differential Revision: https://reviews.llvm.org/D135894	2022-10-17 11:33:03 -07:00
Zahira Ammarguellat	d0a4741392	Fix LIT test func-attr.c added by https://reviews.llvm.org/D135097 . Differential Revision: https://reviews.llvm.org/D136084	2022-10-17 14:26:17 -04:00
Ellis Hoag	970e1ea01a	[clang] Fix crash with -funique-internal-linkage-names Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the declaration has internal linkage. I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm. `9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)` Fixes https://github.com/llvm/llvm-project/issues/54139 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D135926	2022-10-17 08:57:23 -07:00
Ting Wang	ee703b5cb1	[clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register The PPC64 ABI passes aggregates smaller than a register in the least significant bits of the register. In the case of variadic functions, they will end up right-aligned in their argument slots in the argument area on big-endian targets. Apply right-alignment for these aggregates. Fixes #55900. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D133338	2022-10-16 22:01:47 -04:00
Yi Kong	f118280b04	[clang] Fix func-attr.c test The test was introduced by 84a9ec2ff1ee. The check is over-specific and is broken on the Android buildbot. Fixed by relaxing the variable name check.	2022-10-15 23:16:26 +09:00
Stefan Pintilie	6897dbc463	[PowerPC] Fix parameters for __builtin_crypto_vsbox The documentation specifies that the input and output for the builtin __builtin_crypto_vsbox should be vector unsigned char. This patch fixes this type for the builtin. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D135834	2022-10-14 13:30:59 -05:00
Daniel Kiss	30b67c677c	[AArch64] Make ACLE intrinsics always available part1 A given arch feature might be enabled by a pragma or a function attribute, so in these cases it would be nice to be able to use the intrinsics. Today GCC offers the intrinsics without the march flag[1]. There is a PR[2] for ACLE to clarify the intention and remove the need for the -march flag for a given intrinsic. This is going to be more useful when D127812 lands. [1] https://godbolt.org/z/bxcMhav3z [2] https://github.com/ARM-software/acle/pull/214 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D133359	2022-10-14 17:23:11 +02:00
Zahira Ammarguellat	84a9ec2ff1	Remove redundant option -menable-unsafe-fp-math. There are currently two options that are used to tell the compiler to perform unsafe floating-point optimizations: '-ffast-math' and '-funsafe-math-optimizations'. '-ffast-math' is enabled by default. It automatically enables the driver option '-menable-unsafe-fp-math'. Below is a table illustrating the special operations enabled automatically by '-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math' respectively. Special Operations -ffast-math -funsafe-math-optimizations -menable-unsafe-fp-math MathErrno 0 1 1 FiniteMathOnly 1 0 0 AllowFPReassoc 1 1 1 NoSignedZero 1 1 1 AllowRecip 1 1 1 ApproxFunc 1 1 1 RoundingMath 0 0 0 UnsafeFPMath 1 0 1 FPContract fast on on '-ffast-math' enables '-fno-math-errno', '-ffinite-math-only', '-funsafe-math-optimizations' and sets 'FpContract' to 'fast'. The driver option '-menable-unsafe-fp-math' enables the same special options as '-funsafe-math-optimizations'. This is redundant. We propose to remove the driver option '-menable-unsafe-fp-math' and instead use the setting of the special operations to set the function attribute 'unsafe-fp-math'. This attribute will be enabled only if those special operations are enabled and if 'FPContract' is either 'fast' or set to the default value. Differential Revision: https://reviews.llvm.org/D135097	2022-10-14 10:55:29 -04:00
Ting Wang	00b9bed1f0	[clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D133488	2022-10-13 22:57:40 -04:00
wanglei	defe7c07f0	Reland "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch" Differential Revision: https://reviews.llvm.org/D135526	2022-10-11 20:36:09 +08:00
Weining Lu	42b70793a1	Reland "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC" Reference: https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html k: A memory operand whose address is formed by a base register and (optionally scaled) index register. m: A memory operand whose address is formed by a base register and offset that is suitable for use in instructions with the same addressing mode as st.w and ld.w. ZB: An address that is held in a general-purpose register. The offset is zero. ZC: A memory operand whose address is formed by a base register and offset that is suitable for use in instructions with the same addressing mode as ll.w and sc.w. Note: The INLINEASM SDNode flags in the tests below are updated because the newly introduced enum `Constraint_k` is added before `Constraint_m`. llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll llvm/test/CodeGen/X86/callbr-asm-kill.mir This patch passes `ninja check-all` on an X86 machine with all official targets and the LoongArch target enabled. Differential Revision: https://reviews.llvm.org/D134638	2022-10-11 19:51:48 +08:00
Weining Lu	b32a1bdf42	Revert "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch" This reverts commit 6547565e7bdcd9c3f683ad196b62d08c7061fdf1. This breaks test: Preprocessor/init-loongarch.c	2022-10-11 19:21:28 +08:00
wanglei	6547565e7b	[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch Differential Revision: https://reviews.llvm.org/D135526	2022-10-11 18:12:37 +08:00
David Green	b879f99f0e	[AArch64][ARM] Alter most of arm_neon.h to be target-based, not preprocessor based. Similar to D131064, this alters most of the intrinsics in arm_neon.h to be target based, not preprocessor based. The intrinsics that are changed are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm and bf16). The ones that are not yet altered are the ones without target features like rdma (8.1) and complex (8.3). Those will be switched in a followup patch that allows targeting architecture versions. The existing ArchGuard in arm_neon.td is split into ArchGuard that still adds ifdef defines (for example for intrinsics that require __aarch64__), and TargetGuards for intrinsics dependent on target features. From there the TargetGuards are used in two ways: - For intrinsics emitted as functions, __attribute__((target(TargetGuard))) is added to the definition of the function. Along with the existing always_inline intrinsic, this will give a compile time error if the function is used in a context where the target feature is not available. - For intrinsics emitted as macros, the __builtins are emitted into arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes the target feature and gives an error if the builtin is found in a function without the required features, similar to arm_sve.h. The second method requires that the intrinsics be separable from the existing _v intrinsics used in other types. For example __builtin_neon_splat_lane_bf16 is used as opposed to __builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin to account for intrinsics that can be treated similarly, except for their target features. Differential Revision: https://reviews.llvm.org/D132034	2022-10-11 09:09:16 +01:00
Nikita Popov	39db5e1ed8	[CodeGen] Convert tests to opaque pointers (NFC) Conversion performed using the script at: https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 These are only tests where no manual fixup was required.	2022-10-07 14:22:00 +02:00
Manuel Brito	14e2592ff6	[clang][CodeGen] Use poison instead of undef as placeholder in ARM builtins [NFC] Differential Revision: https://reviews.llvm.org/D135392	2022-10-07 12:50:59 +01:00
Stefan Pintilie	0e2e1fc90a	[PowerPC] Fix types for vcipher builtins. The documentation specifies that the parameters for the vcipher builtins are ``` vector unsigned char ``` The code used ``` vector unsigned long long ``` This patch fixes the types for the vcipher builtins. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D135300	2022-10-06 14:21:34 -05:00
David Green	4987ae8462	[ARM][AArch64] Don't use macros for half intrinsics in NeonEmitter We don't require arm_neon.h fp16 intrinsics to be treated as macros any more. Differential Revision: https://reviews.llvm.org/D131504	2022-10-03 15:27:23 +01:00
David Green	781b491bba	[Clang][AArch64] Support AArch64 target(..) attribute formats. This adds support under AArch64 for the target("..") attributes. The current parsing is very X86-shaped, this patch attempts to bring it in line with the GCC implementation from https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes. The supported formats are: - "arch=<arch>" strings, that specify the architecture features for a function as per the -march=arch+feature option. - "cpu=<cpu>" strings, that specify the target-cpu and any implied attributes as per the -mcpu=cpu+feature option. - "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as per -mtune. - "+<feature>", "+no<feature>" enables/disables the specific feature, for compatibility with GCC target attributes. - "<feature>", "no-<feature>" enables/disables the specific feature, for backward compatibility with previous releases. To do this, the parsing of target attributes has been moved into TargetInfo to give the target the opportunity to override the existing parsing. The only non-aarch64 change should be a minor alteration to the error message, specifying using "CPU" to describe the cpu, not "architecture", and the DuplicateArch/Tune from ParsedTargetAttr have been combined into a single option. Differential Revision: https://reviews.llvm.org/D133848	2022-10-01 15:40:59 +01:00
David Green	123064dc39	[Clang][Arm] Convert -fallow-half-arguments-and-returns to a target option. NFC This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be passed by argument and returned, without giving an error. It is currently always enabled for Arm and AArch64, by forcing the option in the driver. This means any cc1 tests (especially those needing arm_neon.h) need to specify the option too, to prevent the error from being emitted. This changes it to a target option instead, set to true for Arm and AArch64. This allows the option to be removed. Previously it was implied by -fnative_half_arguments_and_returns, which is set for certain languages like open_cl, renderscript and hlsl, so that option now too controls the errors. There were a few other non-arm uses of -fallow-half-arguments-and-returns but I believe they were unnecessary. The strictfp_builtins.c tests were converted from __fp16 to _Float16 to avoid the issues. Differential Revision: https://reviews.llvm.org/D133885	2022-09-29 11:00:32 +01:00
Michael Platings	dba8fced96	Fix frint ACLE intrinsic names Although the instruction names begin "frint", the ACLE spec states that the intrinsic names begin "__rint", without the "f". Differential Revision: https://reviews.llvm.org/D134824	2022-09-29 09:13:07 +01:00
Fangrui Song	04a65d62a0	Revert D134638 "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC" This reverts commit b7baddc7557e5c35a0f6a604a134d849265a99d4. Broke CodeGen/X86/callbr-asm-kill.mir We shall pay attention when adding new constraints.	2022-09-29 00:54:56 -07:00
Weining Lu	b7baddc755	[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC k: A memory operand whose address is formed by a base register and (optionally scaled) index register. m: A memory operand whose address is formed by a base register and offset that is suitable for use in instructions with the same addressing mode as st.w and ld.w. ZB: An address that is held in a general-purpose register. The offset is zero. ZC: A memory operand whose address is formed by a base register and offset that is suitable for use in instructions with the same addressing mode as ll.w and sc.w. Differential Revision: https://reviews.llvm.org/D134638	2022-09-29 15:02:08 +08:00
Yonghong Song	75be0482a2	[clang][DebugInfo] Emit debuginfo for non-constant case value Currently, clang does not emit debuginfo for the switch stmt case value if it is an enum value. For example, $ cat test.c enum { AA = 1, BB = 2 }; int func1(int a) { switch(a) { case AA: return 10; case BB: return 11; default: break; } return 0; } $ llvm-dwarfdump test.o | grep AA $ Note that gcc does emit debuginfo for the same test case. This patch added such a support with similar implementation to CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch, $ clang -g -c test.c $ llvm-dwarfdump test.o | grep AA DW_AT_name ("AA") $ Differential Revision: https://reviews.llvm.org/D134705	2022-09-28 12:10:48 -07:00
Arthur Eubanks	44ad67031c	[clang][msan] Turn on -fsanitize-memory-param-retval by default This eagerly reports use of undef values when passed to noundef parameters or returned from noundef functions. This also decreases binary sizes under msan. To go back to the previous behavior, pass `-fno-sanitize-memory-param-retval`. Reviewed By: vitalybuka, MaskRay Differential Revision: https://reviews.llvm.org/D134669	2022-09-28 09:36:39 -07:00
Daniel Kiss	712de9d171	[AArch64] Add all predecessor archs in target info A given function is compatible with all previous arch versions. To avoid comparing values of the attribute, this logic adds all predecessor architecture values. Reviewed By: dmgreen, DavidSpickett Differential Revision: https://reviews.llvm.org/D134353	2022-09-27 10:23:21 +02:00
Fangrui Song	b2d7a0dcf1	[AArch64] Check target feature support for __builtin_arm_crc* This is the AArch64 counterpart of D134127. Daniel Kiss will change more `BUILTIN` to `TARGET_BUILTIN`. Fix #57802	2022-09-26 17:16:44 -07:00
Weining Lu	394f30919a	[Clang][LoongArch] Add inline asm support for constraints f/l/I/K This patch adds support for constraints `f`, `l`, `I`, `K` according to [1]. The remaining constraints (`k`, `m`, `ZB`, `ZC`) will be added later as they are a little more complex than the others. f: A floating-point register (if available). l: A signed 16-bit constant. I: A signed 12-bit constant (for arithmetic instructions). K: An unsigned 12-bit constant (for logic instructions). For now, there is no need to support register aliases (e.g. `$a0`) in llvm as clang will correctly decode the usage of register name aliases into their official names. And AFAIK, the not yet upstreamed `rustc` for LoongArch will always use official register names (e.g. `$r4`). [1] https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html Differential Revision: https://reviews.llvm.org/D134157	2022-09-26 08:49:58 +08:00
Nico Weber	ea8371247f	[clang-cl] Implement /ZH: flag Based on a patch by Arlo Siemsen (D98438)! Differential Revision: https://reviews.llvm.org/D134544	2022-09-25 14:43:14 -04:00
eopXD	75279aeecd	[RISCV][Clang] Replace all undef value with poison Address remaining work that dates back to discussion in D126745 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134513	2022-09-24 04:42:04 -07:00
Teresa Johnson	b1926f308f	Restore "[MemProf] Memprof profile matching and annotation" This reverts commit 794b7ea960ccc3222f2af582efadbc5e5c464292, and thus restores commit a212d8da94d08e229aa8d65283e4b116310bba10, and follow on fixes 0cd6763fa93159b84d70a5bb602c24996acaafaa, e9ff53d42feac7fc157718523275619a8106f2f3, and 37c6a25e9ab230e5e21fa34e246d9fec55275df0. Use a hash function (BLAKE3) instead of hash_combine/hash_code which are not guaranteed to be stable across executions. Additionally, it adds a "REQUIRES: x86_64-linux" to the tests that have raw profile inputs to avoid failures on big endian bots. Reviewers: snehasish, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D128142	2022-09-23 11:38:47 -07:00
Teresa Johnson	794b7ea960	Revert "[MemProf] Memprof profile matching and annotation" This reverts commit a212d8da94d08e229aa8d65283e4b116310bba10, and follow on fixes 0cd6763fa93159b84d70a5bb602c24996acaafaa, e9ff53d42feac7fc157718523275619a8106f2f3, and 37c6a25e9ab230e5e21fa34e246d9fec55275df0. After re-reading the documentation for hash_combine, I don't think this is the appropriate hash function to use for computing the hash to use as a stack id in the metadata, since it is not guaranteed to produce stable values across executions. I have not hit this problem, but plan to switch to using an MD5 hash. I am hitting an issue with one of the bots (https://lab.llvm.org/buildbot/#/builders/171/builds/20732) where the values produced are only the lower 32 bits of the expected hash values, however, which I assume is related to the implementation of hash_combine and hash_code. I believe I fixed all of the other bot failures with the follow on fixes, which I'll merge into the new version before reapplying.	2022-09-22 16:08:03 -07:00
Pavel Samolysov	1c530500ab	[Pipelines] Introduce DAE after ArgumentPromotion The ArgumentPromotion pass uses Mem2Reg promotion at the end to cut down on generated `alloca` instructions as well as meaningless `store`s, and this behavior can leave unused (dead) arguments. To eliminate the dead arguments, and therefore let the DeadCodeElimination remove the inserted `GEP`s as well as `load`s and `cast`s in the callers that become dead, the DeadArgumentElimination pass should be run after the ArgumentPromotion one. Differential Revision: https://reviews.llvm.org/D128830	2022-09-22 15:33:46 -07:00
Craig Topper	52708be182	[RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions. These extensions do not appear to be on their way to ratification.	2022-09-22 13:04:41 -07:00
Teresa Johnson	a212d8da94	[MemProf] Memprof profile matching and annotation Profile matching and IR annotation for memprof profiles. See also related RFCs: RFC: Sanitizer-based Heap Profiler [1] RFC: A binary serialization format for MemProf [2] RFC: IR metadata format for MemProf [3]* * Note that the IR metadata format has changed from the RFC during implementation, as described in the preceding patch adding the basic metadata and verification support. The matching is performed during the normal PGO annotation phase, to ensure that the inlines applied in the IR at that point are a subset of the inlines in the profiled binary and thus reflected in the profile's call stacks. This is important because the call frames are associated with functions in the profile based on the inlining in the symbolized call stacks, and this simplifies locating the subset of profile data relevant for matching onto each function's IR. The PGOInstrumentationUse pass is enhanced to perform matching for whatever combination of memprof and regular PGO profile data exists in the profile. Using the utilities introduced in D128854: The memprof profile data for each context is converted to "cold" or "notcold" based on parameterized thresholds for size, access count, and lifetime. The memprof allocation contexts are trimmed to the minimal amount of context required to uniquely identify whether the context is cold or not cold. For allocations where all profiled contexts have the same allocation type, no memprof metadata is attached and instead the allocation call is directly annotated with an attribute specifying the allocation type. This is the same attribute that will be applied to allocation calls once cloned for different contexts, and later used during LibCall simplification to emit allocation hints [4]. Depends on D128141 and D128854. 
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html [2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html [3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165 [4] `ab87cf382d` Differential Revision: https://reviews.llvm.org/D128142	2022-09-22 12:48:31 -07:00
serge-sans-paille	d442040292	[clang] Fix interaction between asm labels and inline builtins One must pick the same name as the one referenced in CodeGenFunction when generating .inline version of an inline builtin, otherwise they are not correctly replaced. Differential Revision: https://reviews.llvm.org/D134362	2022-09-22 09:24:47 +02:00
Craig Topper	182aa0cbe0	[RISCV] Remove support for the unratified Zbp extension. This extension does not appear to be on its way to ratification. Still need some follow up to simplify the RISCVISD nodes.	2022-09-21 21:22:42 -07:00
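
The right-alignment rule described in ee703b5cb1 (PPC64 VAArg) can be sketched as plain offset arithmetic. This is only an illustration, not the code from the patch: the helper name `aggregate_offset_in_slot`, the `big_endian` flag, and the 8-byte `SLOT_SIZE` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size in bytes of one argument slot (a doubleword) on PPC64. */
#define SLOT_SIZE 8u

/*
 * Hypothetical helper: byte offset inside an argument slot at which an
 * aggregate of `size` bytes starts. On a big-endian target an aggregate
 * smaller than the slot is right-aligned, so it sits in the last `size`
 * bytes of the slot; otherwise it starts at offset 0.
 */
size_t aggregate_offset_in_slot(size_t size, int big_endian) {
    assert(size > 0);
    if (big_endian && size < SLOT_SIZE)
        return SLOT_SIZE - size; /* skip the padding at the front of the slot */
    return 0;
}
```

Under these assumptions, a 3-byte struct read by `va_arg` on a big-endian target would be found at offset 5 of its slot, while on a little-endian target it would be found at offset 0.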

7802 Commits