llvm-project

Author	SHA1	Message	Date
Kazu Hirata	890c4bece2	[memprof] Use SmallVector for InlinedCallStack (NFC) (#114599 ) We can stay within 8 inlined elements more than 99% of the time while building a large application.	2024-11-01 19:52:11 -07:00
Thurston Dang	e549ec529c	[msan] Add handleIntrinsicByApplyingToShadow; support NEON tbl/tbx (#114490 ) This adds a general function that handles intrinsics by applying the intrinsic to the shadows, and applies it to the specific case of Arm NEON TBL/TBX intrinsics. This also updates the tests from https://github.com/llvm/llvm-project/pull/114462	2024-11-01 14:58:45 -07:00
Lei Wang	bef3b54ea1	[InstrPGO] Avoid using global variable to fix potential data race (#114364 ) https://github.com/llvm/llvm-project/pull/109837 set a global variable (`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp, which introduced a data race detected by TSan. To fix this, I decouple the flag setting: the flags are now set separately (`instrument-cold-function-only-path` must be used together with `--pgo-instrument-cold-function-only`).	2024-10-31 21:28:13 -07:00
Dmitry Chernenkov	d924a9ba03	Revert "[InstrPGO] Support cold function coverage instrumentation (#109837 )" This reverts commit e517cfc531886bf6ed64b4e7109bb3141ac7f430.	2024-10-31 10:55:17 +00:00
Lei Wang	e517cfc531	[InstrPGO] Support cold function coverage instrumentation (#109837 ) This patch adds support for cold function coverage instrumentation based on sampling PGO counts. The major motivation is to detect dead functions for the services that are optimized with sampling PGO. If a function is covered by the sampling profile (e.g., it has an entry count > 0), we skip instrumenting it, which significantly reduces the instrumentation overhead. More details about the implementation and flags: - Added a flag `--pgo-instrument-cold-function-only` in `PGOInstrumentation.cpp` as the main switch to control skipping the instrumentation. - Built the extra instrumentation passes (a bundle of passes in `addPGOInstrPasses`) under the sampling PGO pipeline. This is controlled by the `--instrument-cold-function-only-path` flag. - Added a driver flag `-fprofile-generate-cold-function-coverage`: - 1) Configures the flags in one place, i.e., adding `--instrument-cold-function-only-path=<...>` and `--pgo-function-entry-coverage`. Note that the instrumentation file path is passed through `--instrument-sample-cold-function-path`, because we cannot use the `PGOOptions.ProfileFile` as it's already used by `-fprofile-sample-use=<...>`. - 2) Makes the linker link the `compiler_rt.profile` library (see [ToolChain.cpp#L1125-L1131](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChain.cpp#L1125-L1131) ). - Added a flag (`--pgo-cold-instrument-entry-threshold`) to configure the entry count used to determine whether a function is cold. Overall, the full command looks like: ``` clang++ -O2 -fprofile-generate-cold-function-coverage=<...> -fprofile-sample-use=<...> code.cc -o code ```	2024-10-28 10:13:45 -07:00
davidtrevelyan	4102625380	[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155 ) # What This PR renames the newly-introduced llvm attribute `sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise, sibling variables such as `SanitizeRealtimeUnsafe` are renamed to `SanitizeRealtimeBlocking` respectively. There are no other functional changes. # Why? - There are a number of problems that can cause a function to be real-time "unsafe", - we wish to communicate what problems rtsan detects and why they're unsafe, and - a generic "unsafe" attribute is, in our opinion, too broad a net - which may lead to future implementations that need extra contextual information passed through them in order to communicate meaningful reasons to users. - We want to avoid this situation and make the runtime library boundary API/ABI as simple as possible, and - we believe that restricting the scope of attributes to names like `sanitize_realtime_blocking` is an effective means of doing so. We also feel that the symmetry between `[[clang::blocking]]` and `sanitize_realtime_blocking` is easier to follow as a developer. # Concerns - I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been part of the tree for a few weeks now (introduced here: https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't been released in version 20 yet, am I correct in considering this to not be a breaking change?	2024-10-26 13:06:11 +01:00
Vitaly Buka	cf8d24531e	[msan] Reduces overhead of #113200 , by 10% (#113201 ) CTMark #113200 size overhead was 5.3%, now it's 4.7%. The patch affects only signed integers. https://alive2.llvm.org/ce/z/Lv5hyi * The patch replaces code which extracted the sign bit, maximized/minimized it, then packed it back, with a simple sign bit flip. Another way to think about the transformation is as a subtraction of MIN_SINT from A/B. Then we map MIN_SINT to 0, 0 to -MIN_SINT, and MAX_SINT to MAX_UINT. * Then to maximize/minimize A/B we don't need to extract the sign bit; we can apply the shadow the same way as to the other bits. * After the sign bit flip, we had to switch to the unsigned versions of the predicates. * After the change above, getHighestPossibleValue/getLowestPossibleValue became very similar, so we can combine them into a single function. * Because the function does the sign bit flip and requires unsigned predicates to be used for the returned values, there is no point in keeping it as a member of the class; to hide it, we switch to a function-local lambda.	2024-10-24 20:46:49 -07:00
Michael O'Farrell	10f0c1aadd	[PGO] Ensure non-zero entry-count after `populateCounters` (#112029 ) With sampled instrumentation (#69535), profile counts may appear corrupt and `fixFuncEntryCount` may assert. In particular a function can have a 0 block count for its entry, while later blocks are non zero. This is only likely to happen for colder functions, so it is reasonable to take any action that does not crash. Here we simply bail from fixing the entry count.	2024-10-22 16:05:40 -07:00
Michael O'Farrell	b4fcaa137f	[PGO][SampledInstr] Correct off by 1s and allow 100% sampling (#113350 ) This corrects a couple of off-by-one errors related to the sampling of instrumented counters, and enables setting 100% rates for burst sampling (burst duration = period). Off-by-ones: Prior to this change it was impossible to set a period of 65535 because this was converted to fast sampling, which rolls over at USHRT_MAX + 1 (65536). Similarly, the burst durations would collect burst duration + 1 counts as they used a ULE comparison. 100% sampling: Although this is not useful for a productionized use case, it does allow for more deterministic testing with the sampling checks in place. After all the off-by-ones are fixed, allowing for 100% sampling is a matter of letting burst duration = period.	2024-10-22 16:01:13 -07:00
Vitaly Buka	c77d8edf80	Revert "Revert "[msan] Switch to -msan-handle-icmp-exact my default"" (#113379 ) Reverts llvm/llvm-project#113376 Fixed with #113378	2024-10-22 14:05:35 -07:00
Vitaly Buka	71792dc570	[NFC][msan] Workaround arg evaluation order diff GCC vs Clang (#113378 )	2024-10-22 13:31:46 -07:00
Vitaly Buka	c3aa8b7dd6	Revert "[msan] Switch to -msan-handle-icmp-exact my default" (#113376 ) Reverts llvm/llvm-project#113200 Breaks bots, see llvm/llvm-project#113200	2024-10-22 13:05:59 -07:00
Vitaly Buka	395093ec15	[msan] Switch to -msan-handle-icmp-exact my default (#113200 ) Fixes #111212. This grows .text by 5.3% on CTMark (or 2.6% on a large internal binary). Perf regressed by 1.6%. We will try to improve in follow-up patches. It is worth paying some performance regression to fix correctness and avoid issues like #111212.	2024-10-22 12:35:18 -07:00
goldsteinn	c85611e858	[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649 ) In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated, it will cause an error. Fixes #112633	2024-10-17 10:32:55 -05:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Qiongsi Wu	f9d0789064	[PGO] Initialize GCOV Writeout and Reset Functions in the Runtime on AIX (#108570 ) This PR registers the writeout and reset functions for `gcov` for all modules in the PGO runtime, instead of registering them using global constructors in each module. The change is made for AIX only, but the same mechanism works on Linux on Power. When registering such functions using global constructors in each module without `-ffunction-sections`, the AIX linker cannot garbage collect unused undefined symbols, because such symbols are grouped in the same section as the `__sinit` symbol. Keeping such undefined symbols causes link errors (see test case https://github.com/llvm/llvm-project/pull/108570/files#diff-500a7e1ba871e1b6b61b523700d5e30987900002add306e1b5e4972cf6d5a4f1R1 for this scenario). This PR implements the initialization in the runtime, hence avoiding introducing `__sinit` into each module. The implementation adds a new global variable `__llvm_covinit_functions` to each module. This new global variable contains the function pointers to the `Writeout` and `Reset` functions. `__llvm_covinit_functions`'s section is the named section `__llvm_covinit`. The linker will aggregate all the `__llvm_covinit` sections from each module to form one single named section in the final binary. The pair of functions ``` const __llvm_gcov_init_func_struct __llvm_profile_begin_covinit(); const __llvm_gcov_init_func_struct __llvm_profile_end_covinit(); ``` are implemented to return the start and end address of this named section in the final binary, and they are used in function ``` __llvm_profile_gcov_initialize() ``` (which is a constructor function in the runtime) so the runtime knows the addresses of all the `Writeout` and `Reset` functions from all the modules. 
One noticeable implementation detail relevant to AIX is that to preserve the `__llvm_covinit` from the linker's garbage collection, a `.ref` pseudo instruction is inserted into them, referring to the section that contains the `__llvm_gcov_ctr` variables, which are used in the instrumented code. The `__llvm_gcov_ctr` variables did not belong to named sections before, but this PR added them to the `__llvm_gcov_ctr_section` named section, so we can add a `.ref` pseudo instruction that refers to them in the `__llvm_covinit` section.	2024-10-17 09:32:10 -04:00
thetruestblue	927af63fdd	[SanitizerCoverage] Add an option to gate the invocation of the tracing callbacks (#108328 ) Implement -sanitizer-coverage-gated-trace-callbacks to gate the invocation of the tracing callbacks based on the value of a global variable, which is stored in a specific section. When this option is enabled, the instrumentation will not call into the runtime-provided callbacks for tracing, thus incurring only a trivial branch without going through a function call. It is up to the runtime to toggle the value of the global variable in order to enable tracing. This option is only supported for trace-pc-guard. Note: will add additional support for trace-cmp in a follow-up PR. Patch by Filippo Bigarella rdar://101626834	2024-10-16 21:52:38 -07:00
Jay Foad	9255850e89	[LLVM] Remove unused variables after #112546	2024-10-16 16:15:34 +01:00
Jay Foad	d9c95efb6c	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546 ) Convert almost every instance of: CreateCall(Intrinsic::getOrInsertDeclaration(...), ...) to the equivalent CreateIntrinsic call.	2024-10-16 15:43:30 +01:00
Rahul Joshi	6924fc0326	[LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428 ) Add `Intrinsic::getDeclarationIfExists` to lookup an existing declaration of an intrinsic in a `Module`.	2024-10-16 07:21:10 -07:00
Howard Roark	e36b22f3bf	Revert "[PGO] Preserve analysis results when nothing was instrumented (#93421 )" This reverts commit 23c64beeccc03c6a8329314ecd75864e09bb6d97.	2024-10-16 10:50:48 +03:00
Yuta Saito	d4efc3e097	[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332 ) Currently, WebAssembly/WASI target does not provide direct support for code coverage. This patch set fixes several issues to unlock the feature. The main changes are: 1. Port `compiler-rt/lib/profile` to WebAssembly/WASI. 2. Adjust profile metadata sections for Wasm object file format. - [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections instead of data segments. - [lld] Align the interval space of custom sections at link time. - [llvm-cov] Copy misaligned custom section data if the start address is not aligned. - [llvm-cov] Read `__llvm_prf_names` from data segments 3. [clang] Link with profile runtime libraries if requested See each commit message for more details and rationale. This is part of the effort to add code coverage support in Wasm target of Swift toolchain.	2024-10-15 02:41:43 +09:00
Pavel Samolysov	23c64beecc	[PGO] Preserve analysis results when nothing was instrumented (#93421 ) The `PGOInstrumentationGen` pass should preserve all analysis results when nothing was actually instrumented. Currently, only modules that contain at least a single function definition are instrumented. When a module contains only function declarations and, optionally, global variable definitions (a module for the regular-LTO phase for thin-LTO when LTOUnit splitting is enabled, for example), such module is not instrumented (yet?) and there is no reason to invalidate any analysis results. NFC.	2024-10-12 06:29:55 +03:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Youngsuk Kim	f0ed31ce4b	[llvm][PGOCtxProfLowering] Avoid Type::getPointerTo() (NFC) (#111857 ) `Type::getPointerTo()` is to be deprecated & removed soon.	2024-10-10 16:02:13 -04:00
Florian Mayer	5f36042508	[NFC] [HWASan] [MTE] factor out threadlong increment (#110340 )	2024-10-08 15:53:01 -07:00
davidtrevelyan	4547d6042a	[llvm][rtsan] Add transform pass for sanitize_realtime_unsafe (#109543 )	2024-10-03 06:32:21 -07:00
NAKAMURA Takumi	6c331e50e4	[MC/DC] Rework tvbitmap.update to get rid of the inlined function (#110792 ) Per the discussion in #102542, it is safe to insert BBs under `lowerIntrinsics()` since #69535 has made it tolerant of modifying BBs. So, I can get rid of the inlined function `rmw_or`, introduced in #96040.	2024-10-03 17:57:03 +09:00
Mingming Liu	34f0edd509	[TypeProf][PGO]Support skipping vtable comparisons for a class and its derived ones (#110575 ) Performance-critical core libraries could be highly optimized for arch or micro-arch features. For instance, the absl crc library specializes different templated classes for different hardware [1]. In a practical setting, it's likely that instrumented profiles are collected on one type of machine and used to optimize binaries that run on multiple types of hardware. While this kind of specialization is rare in terms of lines of code, the compiler can do a better job by skipping vtable-based ICP. * The per-class `Extend` implementation is arch-specific as well. If an instrumented profile is collected on one arch and applied to another arch where the `Extend` implementation is different, `Extend` might be regarded as an unlikely function in the latter case. The `ABSL_ATTRIBUTE_HOT` annotation alleviates the problem by putting all `Extend` implementations into the hot text section [2]. This change introduces a comma-separated list to specify the mangled vtable names, and the ICP pass will skip vtable-based comparison if a vtable variable definition is shown to be in its class hierarchy (per LLVM type metadata). [1] `c6b27359c3/absl/crc/internal/crc_x86_arm_combined.cc (L621-L650)` [2] `c6b27359c3/absl/crc/internal/crc_x86_arm_combined.cc (L370C3-L370C21)`	2024-10-02 10:23:54 -07:00
Vitaly Buka	b2180481ec	[hwasan] Consider order of mapping copts (#109621 ) Flags "-hwasan-mapping-offset" and "-hwasan-mapping-offset-dynamic" are mutually exclusive, use the last one.	2024-09-24 21:11:13 -07:00
Vitaly Buka	4ca4460bae	[hwasan] Add "-hwasan-with-frame-record" (#109620 ) It should not be implied from mapping settings. No longer disable frame records for fixed offset.	2024-09-24 19:46:23 -07:00
Vitaly Buka	0673642cab	[hwasan] Replace "-hwasan-with-ifunc" and "-hwasan-with-tls" options (#109619 ) The relationship between "-hwasan-mapping-offset", "-hwasan-with-ifunc", and "-hwasan-with-tls" can be too hard to understand. Now we will have "-hwasan-mapping-offset", the presence of which will imply a fixed shadow. "-hwasan-mapping-offset-dynamic" will select one of the 3 available dynamic shadows. As-is, "-hwasan-mapping-offset" has precedence over "-hwasan-mapping-offset-dynamic". In follow-up patches we need to use the one with the last occurrence.	2024-09-23 17:13:25 -07:00
Vitaly Buka	083f0fa454	[NFC][hwasan] Remove code duplication in ShadowMapping::init (#109618 ) The goal is to reorder this function so that initialization happens in the following order: 1. Defaults 2. Target specific overrides 3. Explicit copt<> overrides	2024-09-23 16:55:42 -07:00
Vitaly Buka	8dbb739ffb	[NFC][hwasan] Use `enum class` in `ShadowMapping` (#109617 )	2024-09-23 15:51:56 -07:00
Vitaly Buka	c9e2c38f2c	[NFC][hwasan] Convert ShadowMapping into class (#109616 ) In the next patch we can switch to enum.	2024-09-23 15:34:12 -07:00
Mircea Trofin	783bac7ffb	[ctx_prof] Handle `select` and its `step` instrumentation (#109185 ) The `step` instrumentation shouldn't be treated, during use, like an `increment`. The latter is treated as a BB ID. The step isn't that, it's more of a type of value profiling. We need to distinguish between the 2 when really looking for BB IDs (==increments), and handle appropriately `step`s. In particular, we need to know when to elide them because `select`s may get elided by function cloning, if the condition of the select is statically known.	2024-09-23 15:21:25 -07:00
Nikita Popov	ecb98f9fed	[IRBuilder] Remove uses of CreateGlobalStringPtr() (NFC) Since the migration to opaque pointers, CreateGlobalStringPtr() is the same as CreateGlobalString(). Normalize to the latter.	2024-09-23 16:30:50 +02:00
Vitaly Buka	10266279c3	[NFC][hwasan] Add a few of {}	2024-09-22 18:12:59 -07:00
Florian Mayer	0cab475d11	[NFC] [HWASan] pull removeFnAttributes into function (#109488 )	2024-09-20 20:37:13 -07:00
Florian Mayer	cdf29709d7	[NFC] [HWASan] fix LLVM style guide violations	2024-09-20 16:29:45 -07:00
Youngsuk Kim	d31e314131	[llvm] Don't call raw_string_ostream::flush() (NFC) Don't call raw_string_ostream::flush(), which is essentially a no-op. As specified in the docs, raw_string_ostream is always unbuffered. ( 65b13610a5226b84889b923bae884ba395ad084d for further reference )	2024-09-20 12:19:59 -05:00
Alex Rønne Petersen	72a218056d	[llvm][Triple] Add `Environment` members and parsing for glibc/musl parity. (#107664 ) This adds support for: * `muslabin32` (MIPS N32) * `muslabi64` (MIPS N64) * `muslf32` (LoongArch ILP32F/LP64F) * `muslsf` (LoongArch ILP32S/LP64S) As we start adding glibc/musl cross-compilation support for these targets in Zig, it would make our life easier if LLVM recognized these triples. I'm hoping this'll be uncontroversial since the same has already been done for `musleabi`, `musleabihf`, and `muslx32`. I intentionally left out a musl equivalent of `gnuf64` (LoongArch ILP32D/LP64D); my understanding is that Loongson ultimately settled on simply `gnu` for this much more common case, so there doesn't seem to be a particularly compelling reason to add a `muslf64` that's basically deprecated on arrival. Note: I don't have commit access.	2024-09-20 08:53:03 +08:00
Pavel Skripkin	8a34f6dba1	[ASAN] Do not consider alignment during object size calculations (#109120 ) It was found that ASAN logic optimizes away out-of-bound access instrumentation for over-aligned arrays. See #108287 for complete code examples. Fix it by not considering alignment during object size calculation, since out-of-bounds access for over-aligned object is still UB and should be reported by ASAN. Closes: #108287	2024-09-19 10:16:28 -07:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Mircea Trofin	12d94850cd	[ctx_prof] Avoid `llvm::append_range` to fix some build bots Example: https://lab.llvm.org/buildbot/#/builders/169/builds/3381 The CI allowed the `llvm::append_range` instantiation, but on the other hand it's quite unnecessary here.	2024-09-18 21:19:28 -07:00
Mircea Trofin	ce9209f50e	[ctx_prof] Fix `ProfileAnnotator::allTakenPathsExit` (#109183 ) Added tests to the validator and fixed issues stemming from the previous skipping over BBs with single successors, which is incorrect. That would now be caught by the added tests, where the assertions are expected to be triggered.	2024-09-18 21:08:34 -07:00
Mircea Trofin	b2d3c315d5	[ctx_prof] Fix checks in `PGOCtxprofFlattening` (#108467 ) The assertion that all out-edges of a BB can't be 0 is incorrect: they can be, if that branch is on a cold subgraph. Added validators and asserts about the expected properties of the propagated counters.	2024-09-17 18:19:20 -07:00
Antonio Frighetto	942e872d5b	[Instrumentation] Do not request sanitizers for naked functions Sanitizer instrumentation may be incompatible with naked functions, which lack a standard prologue/epilogue.	2024-09-17 09:23:39 +02:00
Antonio Frighetto	2ae968a0d9	[Instrumentation] Move out to Utils (NFC) (#108532 ) Utility functions have been moved out to Utils. Minor opportunity to drop the header where not needed.	2024-09-15 21:07:40 -07:00
Mircea Trofin	82266d3a2b	[nfc][ctx_prof] Factor the callsite instrumentation exclusion criteria (#108471 ) Reusing this in the logic fetching the instrumentation in `CtxProfAnalysis`.	2024-09-13 21:25:47 -07:00
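
Commit cf8d24531e above describes replacing signed comparisons with unsigned ones after a sign bit flip. The following is a minimal sketch of that arithmetic identity only, not of the MSan shadow propagation itself. The 8-bit width and all function names are assumptions chosen for illustration; none of them come from the LLVM sources.

```python
# Illustrative sketch: flipping the sign bit turns a signed comparison
# into an unsigned one. Width and names are hypothetical.
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)  # bit pattern of MIN_SINT for this width


def to_signed(pattern: int) -> int:
    """Interpret a BITS-wide bit pattern as a two's-complement value."""
    pattern &= MASK
    return pattern - (1 << BITS) if pattern & SIGN else pattern


def flip_sign_bit(pattern: int) -> int:
    """Flip the top bit; equivalent to subtracting MIN_SINT modulo 2**BITS."""
    return (pattern & MASK) ^ SIGN


def signed_less_than_via_unsigned(a: int, b: int) -> bool:
    """Compare two bit patterns as signed values using only an unsigned compare."""
    return flip_sign_bit(a) < flip_sign_bit(b)


# The mapping stated in the commit message:
# MIN_SINT -> 0, 0 -> -MIN_SINT, MAX_SINT -> MAX_UINT.
assert flip_sign_bit(0x80) == 0x00
assert flip_sign_bit(0x00) == 0x80
assert flip_sign_bit(0x7F) == 0xFF

# Exhaustive check over every pair of 8-bit patterns.
assert all(
    (to_signed(a) < to_signed(b)) == signed_less_than_via_unsigned(a, b)
    for a in range(1 << BITS)
    for b in range(1 << BITS)
)
```

Because the flipped values are ordered the same way under unsigned comparison, the sign bit no longer needs special handling, which is consistent with the commit's note that the shadow can be applied to it the same way as to the other bits.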

3367 Commits
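
The burst-duration off-by-one described in commit b4fcaa137f can be illustrated with a small simulation. This is a sketch under assumptions: the counter model (a counter running from 0 to period - 1, with events recorded while it is inside the burst window) and the name `counts_collected` are hypothetical and are not taken from the LLVM implementation.

```python
def counts_collected(period: int, burst: int, inclusive: bool) -> int:
    """Count how many of `period` consecutive events get recorded.

    Hypothetical model: a sampling counter takes the values 0..period-1.
    An event is recorded while the counter is inside the burst window.
    `inclusive=True` models a ULE (<=) check, `inclusive=False` a ULT (<) check.
    """
    collected = 0
    for counter in range(period):
        in_burst = counter <= burst if inclusive else counter < burst
        if in_burst:
            collected += 1
    return collected


# With an inclusive check, a burst of 3 records 4 events per period.
assert counts_collected(period=10, burst=3, inclusive=True) == 4
# With a strict check, the same settings record exactly 3.
assert counts_collected(period=10, burst=3, inclusive=False) == 3
# With a strict check, burst == period gives 100% sampling.
assert counts_collected(period=10, burst=10, inclusive=False) == 10
```

Under this model the strict comparison records exactly `burst` events per period, so setting the burst duration equal to the period samples every event, which matches the 100% sampling case the commit message describes.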