llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	2a79ef66eb	[AMDGPU] canCreateUndefOrPoisonForTargetNode - BFE_I32/U32 can't create poison/undef (#154932 ) Add AMDGPUTargetLowering::canCreateUndefOrPoisonForTargetNode handler and tag BFE_I32/U32 nodes as they can only propagate poison, not create poison/undef. Fighting some of the remaining regressions in #152107	2025-08-22 12:14:45 +00:00
Jungwook Park	b149fc7755	[mlir][scf] Quick fix to scf.execute_region no_inline (#154931 ) Asm printer should exclude `no_inline` attr during printing optional attrs at the bottom.	2025-08-22 13:11:27 +01:00
Michael Halkenhäuser	7c1d2467f1	Reland: [OpenMP] Add ompTest library to OpenMP (#154786 ) Reland of https://github.com/llvm/llvm-project/pull/147381 Added changes to fix observed BuildBot failures: * CMake version (reduced minimum to `3.20`, was: `3.22`) * GoogleTest linking (missing `./build/lib/libllvm_gtest.a`) * Related header issue (missing `#include "llvm/Support/raw_os_ostream.h"`) Original message Description =========== OpenMP Tooling Interface Testing Library (ompTest) ompTest is a unit testing framework for testing OpenMP implementations. It offers a simple-to-use framework that allows a tester to check for OMPT events in addition to regular unit testing code, supported by linking against GoogleTest by default. It also facilitates writing concise tests while bridging the semantic gap between the unit under test and the OMPT-event testing. Background ========== This library has been developed to provide the means of testing OMPT implementations with reasonable effort. In particular, asynchronous or unordered events are supported and can be verified with ease, which may prove to be challenging with LIT-based tests. Additionally, since the assertions are part of the code being tested, ompTest can reference all corresponding variables during assertion. Basic Usage =========== OMPT event assertions are placed before the code that is to be tested. These assertions can either be provided as one block or interleaved with the test code. There are two types of asserters: (1) sequenced "order-sensitive" and (2) set "unordered" asserters. Once the test is being run, the corresponding events are triggered by the OpenMP runtime and can be observed. Each of these observed events notifies asserters, which then determine if the test should pass or fail. Example (partial, interleaved) ============================== ```c++ int N = 100000; int a[N]; int b[N]; OMPT_ASSERT_SEQUENCE(Target, TARGET, BEGIN, 0); OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // a ? 
OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &a); OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // b ? OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &b); OMPT_ASSERT_SEQUENCE(TargetSubmit, 1); OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &b); OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &a); OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE); OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE); OMPT_ASSERT_SEQUENCE(Target, TARGET, END, 0); #pragma omp target parallel for { for (int j = 0; j < N; j++) a[j] = b[j]; } ``` References ========== This work has been presented at SC'24 workshops, see: https://ieeexplore.ieee.org/document/10820689 Current State and Future Work ============================= ompTest's development was mostly device-centric and aimed at OMPT device callbacks and device-side tracing. Consequentially, a substantial part of host-related events or features may not be supported in its current state. However, we are confident that the related functionality can be added and ompTest provides a general foundation for future OpenMP and especially OMPT testing. This PR will allow us to upstream the corresponding features, like OMPT device-side tracing in the future with significantly reduced risk of introducing regressions in the process. Build ===== ompTest is linked against LLVM's GoogleTest by default, but can also be built 'standalone'. Additionally, it comes with a set of unit tests, which in turn require GoogleTest (overriding a standalone build). The unit tests are added to the `check-openmp` target. Use the following parameters to perform the corresponding build: `LIBOMPTEST_BUILD_STANDALONE` (Default: ${OPENMP_STANDALONE_BUILD}) `LIBOMPTEST_BUILD_UNITTESTS` (Default: OFF) --------- Co-authored-by: Jan-Patrick Lehr <JanPatrick.Lehr@amd.com> Co-authored-by: Joachim <protze@rz.rwth-aachen.de> Co-authored-by: Joachim Jenke <jenke@itc.rwth-aachen.de>	2025-08-22 13:56:12 +02:00
Leandro Lacerda	15a192cde5	[libc] Enable double math functions on the GPU (#154857 ) This patch adds the `acos` math function to the NVPTX build. It also adds the `sincos` math function to the `math.h` header.	2025-08-22 06:52:13 -05:00
paperchalice	2014890c09	[SelectionDAG] Remove `UnsafeFPMath` in `visitFP_ROUND` (#154768 ) Remove `UnsafeFPMath` in `visitFP_ROUND` part, it blocks some bugfixes related to clang and the ultimate goal is to remove `resetTargetOptions` method in `TargetMachine`, see FIXME in `resetTargetOptions`. See also https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract Now all UnsafeFPMath uses are eliminated in LLVMCodeGen	2025-08-22 19:46:33 +08:00
Simon Pilgrim	d8769bb5b7	[AMDGPU] bf16-conversions.ll - regenerate checks Reduce diffs in #152107	2025-08-22 12:20:50 +01:00
Lang Hames	3292edb7b4	[orc-rt] Add C and C++ APIs for WrapperFunctionResult. (#154927 ) orc_rt_WrapperFunctionResult is a byte-buffer with inline storage and a builtin error state. It is intended as a general purpose return type for functions that return a serialized result (e.g. for communication across ABIs or via IPC/RPC). orc_rt_WrapperFunctionResult contains a small amount of inline storage, allowing it to avoid heap-allocation for small return types (e.g. bools, chars, pointers).	2025-08-22 21:18:30 +10:00
Mehdi Amini	d2b810e24f	[MLIR] Apply clang-tidy fixes for readability-identifier-naming in DataFlowFramework.cpp (NFC)	2025-08-22 04:12:50 -07:00
Mehdi Amini	a8aacb1b66	[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in toy Tutorial (NFC)	2025-08-22 04:12:50 -07:00
Mehdi Amini	d2dee948a4	[MLIR] Improve clang-tidy script This just helps to better keep track of the failures.	2025-08-22 04:12:50 -07:00
Jacek Caban	a6fcd1a663	[LLD][COFF] Set isUsedInRegularObj for target symbols in resolveAlternateNames (#154837 ) Fixes: #154595 Prior to commit bbc8346e6bb543b0a87f52114fed7d766446bee1, this flag was set by `insert()` from `addUndefined()`. Set it explicitly now.	2025-08-22 13:05:19 +02:00
Ramkumar Ramachandra	2975e674ec	[VPlan] Improve style in match_combine_or (NFC) (#154793 )	2025-08-22 12:01:42 +01:00
Hans Wennborg	ee5367bedb	Revert "[compiler-rt]: fix CodeQL format-string warnings via explicit casts (#153843 )" It broke the build: compiler-rt/lib/hwasan/hwasan_thread.cpp:177:11: error: unknown type name 'ssize_t'; did you mean 'size_t'? 177 \| (ssize_t)unique_id_, (void )this, (void )stack_bottom(), \| ^~~~~~~ \| size_t > This change addresses CodeQL format-string warnings across multiple > sanitizer libraries by adding explicit casts to ensure that printf-style > format specifiers match the actual argument types. > > Key updates: > - Cast pointer arguments to (void*) when used with %p. > - Use appropriate integer types and specifiers (e.g., size_t -> %zu, > ssize_t -> %zd) to avoid mismatches. > - Fix format specifier mismatches across xray, memprof, lsan, hwasan, > dfsan. > > These changes are no-ops at runtime but improve type safety, silence > static analysis warnings, and reduce the risk of UB in variadic calls. This reverts commit d3d5751a39452327690b4e011a23de8327f02e86.	2025-08-22 12:50:53 +02:00
Lang Hames	d5af08a221	[orc-rt] Add inline specifier to orc_rt::make_error. (#154922 ) Prevents linker errors for duplicate definitions when make_error is used from more than one file.	2025-08-22 20:37:10 +10:00
nerix	d6fcaef281	[LLDB][Value] Require type size when reading a scalar (#153386 ) When reading a value as a scalar, the type size is required. It's returned as a `std::optional`. This optional isn't checked for scalar values, where it is unconditionally accessed. This came up in the [Shell/Process/Windows/msstl_smoke.cpp](`4e10b62442/lldb/test/Shell/Process/Windows/msstl_smoke.cpp`) test. There, LLDB breaks at the function entry, so all locals aren't initialized yet. Most values will contain garbage. The [`std::list` synthetic provider](`4e10b62442/lldb/source/Plugins/Language/CPlusPlus/GenericList.cpp (L517)`) tries to read the value using `GetData`. However, in [`ValueObject::GetData`](`4e10b62442/lldb/source/ValueObject/ValueObject.cpp (L766)`), [`ValueObjectChild::UpdateValue`](`88c993fbc5/lldb/source/ValueObject/ValueObjectChild.cpp (L102)`) fails because the parent already failed to read its data, so `m_value` won't have a compiler type, thus the size can't be read.	2025-08-22 12:26:03 +02:00
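The unchecked-optional problem described in the commit above can be illustrated with a minimal sketch. This is not the LLDB code: `queryByteSize` and `readScalarSize` are hypothetical names, and the 4-byte size is an arbitrary placeholder. It only shows the general pattern of testing a `std::optional` before dereferencing it.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a type-size query; the result is empty when the
// value has no usable compiler type (the failure mode described above).
std::optional<uint64_t> queryByteSize(bool hasCompilerType) {
  if (!hasCompilerType)
    return std::nullopt;
  return 4; // placeholder: pretend the scalar is 4 bytes wide
}

// Reads the scalar size only after confirming the optional holds a value.
bool readScalarSize(bool hasCompilerType, uint64_t &sizeOut) {
  std::optional<uint64_t> size = queryByteSize(hasCompilerType);
  if (!size)
    return false; // report failure instead of dereferencing an empty optional
  sizeOut = *size;
  return true;
}
```

With this shape, a caller that passes `hasCompilerType = false` gets a `false` return rather than undefined behaviour from accessing an empty optional.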
Ross Brunton	17dbb92612	[Offload][NFC] Use tablegen names rather than `name` parameter for API (#154736 )	2025-08-22 11:13:57 +01:00
tangaac	8439777131	[LoongArch] Pre-commit tests for vecreduce_and/or/... (#154879 )	2025-08-22 17:52:43 +08:00
YafetBeyene	fda24dbc16	[BOLT] Add dump-dot-func option for selective function CFG dumping (#153007 ) ## Change: * Added `--dump-dot-func` command-line option that allows users to dump CFGs only for specific functions instead of dumping all functions (the only option currently available being `--dump-dot-all`) ## Usage: * Users can now specify function names or regex patterns (e.g., `--dump-dot-func=main,helper` or `--dump-dot-func="init.`") to generate .dot files only for functions of interest This aims to save time when analysing specific functions in large binaries (e.g., only dumping graphs for performance-critical functions identified through profiling) and avoids the output clutter of generating thousands of unnecessary .dot files when analysing large binaries ## Testing The introduced test `dump-dot-func.test` confirms the new option does the following: - [x] 1. `dump-dot-func` can correctly filter a specified function - [x] 2. Can achieve the above with regexes - [x] 3. Can do 1. with a list of functions - [x] No option specified creates no dot files - [x] Passing in a non-existent function generates no dumping messages - [x] `dump-dot-all` continues to work as expected	2025-08-22 10:51:09 +01:00
Ivan Kosarev	7594b4b8d1	[AMDGPU] Fix compilation errors.	2025-08-22 10:30:43 +01:00
Abhinav Garg	bfc16510c7	[AMDGPU] Regenerate test case to cover gfx10 check lines. (#154909 ) Check lines for GFX10 are missing in this test case. Regenerate to fix the test case.	2025-08-22 15:00:28 +05:30
Nikolas Klauser	fd52f4d232	[libc++][NFC] Simplify the special member functions of the node containers (#154707 ) This patch does two things: - Remove exception specifications of `= default`ed special member functions - `= default` special member functions The first part is NFC because the explicit specification does exactly the same as the implicit specification. The second is NFC because it does exactly what the `= default`ed special member does.	2025-08-22 11:24:28 +02:00
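The first NFC claim in the commit above (an explicit exception specification on a `= default`ed member matches the implicit one) can be checked on a toy type. This is a sketch with hypothetical `ExplicitSpec` and `ImplicitSpec` structs, not the libc++ node containers, and it assumes all members are nothrow-movable.

```cpp
#include <type_traits>

// Hypothetical type spelling out the exception specification by hand.
struct ExplicitSpec {
  int value = 0;
  ExplicitSpec() = default;
  ExplicitSpec(ExplicitSpec &&) noexcept = default;
};

// Same type, leaving the exception specification to the compiler.
struct ImplicitSpec {
  int value = 0;
  ImplicitSpec() = default;
  ImplicitSpec(ImplicitSpec &&) = default;
};

// Both move constructors end up with the same noexcept-ness.
static_assert(std::is_nothrow_move_constructible<ExplicitSpec>::value ==
                  std::is_nothrow_move_constructible<ImplicitSpec>::value,
              "implicit and explicit specifications agree");
```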
Florian Hahn	8bc038daf2	[InstComb] Allow more user for (add (ptrtoint %B), %O) to GEP transform. (#153566 ) Generalize the logic from https://github.com/llvm/llvm-project/pull/153421 to support additional cases where the pointer is only used as integer. Alive2 Proof: https://alive2.llvm.org/ce/z/po58pP This enables vectorizing std::find for some cases, if additional assumptions are provided: https://godbolt.org/z/94oq3576E Depends on https://github.com/llvm/llvm-project/pull/15342. PR: https://github.com/llvm/llvm-project/pull/153566	2025-08-22 10:17:12 +01:00
Ivan Kosarev	faca8c9ed4	[AMDGPU][NFC] Only include CodeGenPassBuilder.h where needed. (#154769 ) Saves around 125-210 MB of compilation memory usage per source for roughly one third of our backend sources, ~60 MB on average.	2025-08-22 10:05:06 +01:00
Simon Pilgrim	1b4fe26343	[clang][x86] Add release note entries describing recent work to making SSE intrinsics generic and usable with constexpr (#154737 ) I haven't created an exhaustive list of intrinsic changes, but I suppose I could if people see a strong need for it.	2025-08-22 09:59:10 +01:00
Baranov Victor	00a405f666	[clang-tidy][NFC] Fix "llvm-prefer-static-over-anonymous-namespace" warnings 1/N (#153885 )	2025-08-22 11:54:17 +03:00
Hans Wennborg	8bf105cb01	[asan] Build the Windows runtime with /hotpatch (#154694 ) Win/ASan relies on the runtime's functions being 16-byte aligned so it can intercept them with hotpatching. This used to be true (but not guaranteed) until #149444. Passing /hotpatch will give us enough alignment and generally ensure that the functions are hotpatchable.	2025-08-22 10:40:04 +02:00
Bjorn Pettersson	2d3167f8d8	[SeparateConstOffsetFromGEP] Avoid miscompiles related to trunc nuw/nsw (#154582 ) Drop poison generating flags on trunc when distributing trunc over add/sub/or. We need to do this since for example (add (trunc nuw A), (trunc nuw B)) is more poisonous than (trunc nuw (add A, B)). In some situations it is pessimistic to drop the flags, for example if the add in the example above also has the nuw flag. For now we keep it simple and always drop the flags. Worth mentioning is that we drop the flags when cloning instructions and rebuilding the chain. This is done after the "allowsPreservingNUW" checks in ConstantOffsetExtractor::Extract. So we still take the "trunc nuw" into consideration when determining if nuw can be preserved in the gep (which should be ok since that check also requires that all the involved binary operations have nuw). Fixes #154116	2025-08-22 10:27:57 +02:00
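The "more poisonous" claim in the commit above can be made concrete with a small model. This is a sketch, not LLVM code: poison is modelled as an empty `std::optional`, the types are fixed at i16 to i8, and all function names are hypothetical.

```cpp
#include <cstdint>
#include <optional>

// An empty optional models a poison value.
using Poisonable = std::optional<uint8_t>;

// Models `trunc nuw i16 -> i8`: poison if any dropped high bit is set.
Poisonable truncNuw(uint16_t v) {
  if (v > 0xFF)
    return std::nullopt;
  return static_cast<uint8_t>(v);
}

// Models (trunc nuw (add A, B)): a plain i16 add that wraps, then the trunc.
Poisonable truncOfAdd(uint16_t a, uint16_t b) {
  return truncNuw(static_cast<uint16_t>(a + b));
}

// Models (add (trunc nuw A), (trunc nuw B)): poison if either trunc is poison.
Poisonable addOfTruncs(uint16_t a, uint16_t b) {
  Poisonable ta = truncNuw(a);
  Poisonable tb = truncNuw(b);
  if (!ta || !tb)
    return std::nullopt;
  return static_cast<uint8_t>(*ta + *tb);
}
```

For `A = 0x0100` and `B = 0xFF00`, the i16 add wraps to 0, so `truncOfAdd` yields a well-defined 0, while `addOfTruncs` is poison because each operand has high bits set. Keeping `nuw` on the distributed truncs would therefore introduce poison the original expression did not have.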
Bjorn Pettersson	4ff7ac2330	[SeparateConstOffsetFromGEP] Add test case with trunc nuw/nsw showing miscompile Pre commit a test case for issue #154116. When redistributing trunc over add/sub/or we may need to drop poison generating flags from the trunc.	2025-08-22 10:26:09 +02:00
Simon Pilgrim	8d7df8bba1	[X86] Allow AVX2 per-element shift intrinsics to be used in constexpr (#154780 ) This handles constant folding for the AVX2 per-element shift intrinsics, which handle out of bounds shift amounts (logical result = 0, arithmetic result = signbit splat) AVX512 intrinsics will follow in follow up patches First stage of #154287	2025-08-22 09:24:24 +01:00
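The out-of-bounds behaviour mentioned in the commit above (logical result = 0, arithmetic result = sign-bit splat) can be sketched per 32-bit element in scalar code. This is an illustrative model with hypothetical function names, not the Clang constant-folding implementation.

```cpp
#include <cstdint>

// Logical left shift of one 32-bit lane: out-of-range amounts give 0.
uint32_t laneShiftLeftLogical(uint32_t v, uint32_t amt) {
  return amt >= 32 ? 0u : v << amt;
}

// Logical right shift of one 32-bit lane: out-of-range amounts give 0.
uint32_t laneShiftRightLogical(uint32_t v, uint32_t amt) {
  return amt >= 32 ? 0u : v >> amt;
}

// Arithmetic right shift of one 32-bit lane: out-of-range amounts fill the
// lane with copies of the sign bit (all ones for negative, zero otherwise).
int32_t laneShiftRightArithmetic(int32_t v, uint32_t amt) {
  if (amt >= 32)
    return v < 0 ? -1 : 0;
  // Shift the complement for negative inputs so the result does not rely on
  // how the compiler shifts negative values.
  return v < 0 ? ~(~v >> amt) : (v >> amt);
}
```

Unlike a plain C++ shift, where an amount of 32 or more on a 32-bit operand is undefined behaviour, every shift amount here has a defined result.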
Pierre van Houtryve	4ab5efd48d	[AMDGPU][gfx1250] Add memory legalizer tests (NFC) (#154725 )	2025-08-22 10:14:09 +02:00
Fangrui Song	f1aee598e7	ARM: Remove unneeded ARM::fixup_arm_thumb_bl special case This is a weird special case added in 2015, simplifying an even older condition. It is a no-op for ELF (isExternal is always false) and seems unneeded for non-ELF.	2025-08-22 01:08:33 -07:00
LLVM GN Syncbot	2a59400003	[gn build] Port 2b8e80694263	2025-08-22 08:03:17 +00:00
Muhammad Omair Javaid	2b8e806942	Revert "[lldb-dap] Add module symbol table viewer to VS Code extension #140626 (#153836 )" This reverts commit 8b64cd8be29da9ea74db5a1a21f7cd6e75f9e9d8. This breaks lldb-aarch64-* bots causing a crash in lldb-dap while running test TestDAP_moduleSymbols.py https://lab.llvm.org/buildbot/#/builders/59/builds/22959 https://lab.llvm.org/buildbot/#/builders/141/builds/10975	2025-08-22 13:02:52 +05:00
Zhaoxin Yang	149d9a38e1	[ELF][LoongArch] -r: Synthesize R_LARCH_ALIGN at input section start (#153935 ) Similar to `94655dc8ae` The difference is that in LoongArch, the ALIGN is synthesized when the alignment is >4, (instead of >=4), and the number of bytes inserted is `sec->addralign - 4`.	2025-08-22 16:02:41 +08:00
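The size rule stated in the commit above can be written as a one-line helper. This is a sketch under stated assumptions: `synthesizedAlignBytes` is a hypothetical name, not an LLD function, and returning 0 is this sketch's way of saying that no ALIGN is synthesized.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes inserted for a synthesized ALIGN at
// the start of an input section, following the rule quoted above.
// Returns 0 when the alignment is not greater than 4 (nothing synthesized).
uint64_t synthesizedAlignBytes(uint64_t addralign) {
  return addralign > 4 ? addralign - 4 : 0;
}
```

For example, a section aligned to 16 bytes gets 12 inserted bytes, while a section aligned to 4 bytes gets none.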
Connector Switch	6560adb584	[flang] optimize atand/atan2d precision (#154544 ) Part of https://github.com/llvm/llvm-project/issues/150452.	2025-08-22 15:55:46 +08:00
Matt Arsenault	2b46f31ee3	AMDGPU: Sign extend immediates for 32-bit subregister extracts (#154870 ) extractSubregFromImm previously would sign extend the 16-bit subregister extracts, but not the 32-bit. We try to consistently store immediates as sign extended, since not doing it can result in misreported isInlineImmediate checks.	2025-08-22 16:50:36 +09:00
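The sign-extension convention described in the commit above can be sketched as follows. The function names are hypothetical and this is not the `extractSubregFromImm` implementation. It only shows what "store the 32-bit half as a sign-extended immediate" means numerically.

```cpp
#include <cstdint>

// Extracts the low 32 bits of a 64-bit immediate and sign-extends them.
// (The uint32_t -> int32_t conversion wraps modulo 2^32 on mainstream
// compilers and is guaranteed to do so from C++20 onwards.)
int64_t signExtendLo32(uint64_t imm) {
  return static_cast<int32_t>(static_cast<uint32_t>(imm));
}

// Extracts the high 32 bits of a 64-bit immediate and sign-extends them.
int64_t signExtendHi32(uint64_t imm) {
  return static_cast<int32_t>(static_cast<uint32_t>(imm >> 32));
}
```

With this convention the low half of `0x00000000FFFFFFFF` is stored as -1 rather than 4294967295, which is the form a check for small inline constants would expect.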
Stanislav Mekhanoshin	e0945dfa30	[AMDGPU] Add test to show failure with SRC_*_HI registers. NFC. (#154828 ) Since src_{private\|shared}_{base\|limit} registers are added and are not artificial, the compiler happily uses them when it can. In HW these registers do not exist and the encoding belongs to their 64-bit super-register or 32-bit low register. The same instructions will produce a relocation if run through asm.	2025-08-22 00:50:25 -07:00
Jay Foad	cf5243619a	[AMDGPU] Common up two local memory size calculations. NFCI. (#154784 )	2025-08-22 08:44:11 +01:00
serge-sans-paille	50f7c6a5b9	Default to GLIBCXX_USE_CXX11_ABI=ON Because many of our bots actually don't run a libstdc++ compatible with _GLIBCXX_USE_CXX11_ABI=0. See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html for details. This is a follow-up to be179d069664ce03c485e49fa1f6e2ca3d6286fa related to #154447.	2025-08-22 09:35:40 +02:00
paperchalice	945a186089	[DAGCombiner] Remove most `UnsafeFPMath` references (#146295 ) This pull request removes all references to `UnsafeFPMath` in dag combiner except FP_ROUND. - Set fast math flags in some tests.	2025-08-22 15:27:25 +08:00
Fangrui Song	06ab660911	MCSymbol: Avoid isExported/setExported The next change will move these methods from the base class.	2025-08-22 00:25:55 -07:00
Durgadoss R	36dc6146b8	[MLIR][NVVM] Update TMA tensor prefetch Op (#153464 ) This patch updates the TMA Tensor prefetch Op to add support for im2col_w/w128 and tile_gather4 modes. This completes support for all modes available in Blackwell. * lit tests are added for all possible combinations. * The invalid tests are moved to a separate file with more coverage. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-08-22 12:51:29 +05:30
Djordje Todorovic	5050da7ba1	[RISCV] Add initial assembler/MC layer support for big-endian (#146534 ) This patch adds basic assembler and MC layer infrastructure for RISC-V big-endian targets (riscv32be/riscv64be): - Register big-endian targets in RISCVTargetMachine - Add big-endian data layout strings - Implement endianness-aware fixup application in assembler backend - Add byte swapping for data fixups on BE cores - Update MC layer components (AsmInfo, MCTargetDesc, Disassembler, AsmParser) This provides the foundation for BE support but does not yet include: - Codegen patterns for BE - Load/store instruction handling - BE-specific subtarget features	2025-08-22 09:21:10 +02:00
Jason Molenda	a2f542b7a5	[lldb][debugserver] update --help to list all the options (#154853 ) These are almost all for internal-developer-users only so "look at debugserver.cpp" wasn't unreasonable, but we rarely add any new options so a simple list of all recognized options isn't a burden to throw in the help method.	2025-08-22 00:05:13 -07:00
Fangrui Song	04a3dd5a19	MCSymbol: Avoid isExported/setExported The next change will move it to MCSymbol{COFF,MachO,Wasm} to make it clear that other object file formats (e.g. ELF) do not use this field.	2025-08-22 00:00:29 -07:00
Fangrui Song	1def457228	MC: Avoid MCSymbol::isExported This bit is only used by COFF/MachO. The upcoming change will move isExported/setExported to MCSymbolCOFF/MCSymbolMachO.	2025-08-21 23:26:53 -07:00
Amit Kumar Pandey	d3d5751a39	[compiler-rt]: fix CodeQL format-string warnings via explicit casts (#153843 ) This change addresses CodeQL format-string warnings across multiple sanitizer libraries by adding explicit casts to ensure that printf-style format specifiers match the actual argument types. Key updates: - Cast pointer arguments to (void*) when used with %p. - Use appropriate integer types and specifiers (e.g., size_t -> %zu, ssize_t -> %zd) to avoid mismatches. - Fix format specifier mismatches across xray, memprof, lsan, hwasan, dfsan. These changes are no-ops at runtime but improve type safety, silence static analysis warnings, and reduce the risk of UB in variadic calls.	2025-08-22 11:51:13 +05:30
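The two "key updates" listed in the commit above can be illustrated generically. This is a sketch, not the sanitizer runtime code, which uses its own printf-like routines. `formatPointer` and `formatCount` are hypothetical helpers built on the standard `std::snprintf`.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// %p expects a void pointer, so the argument is cast explicitly.
std::string formatPointer(int *ptr) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "ptr=%p", static_cast<void *>(ptr));
  return buf;
}

// size_t is printed with %zu so the specifier matches the argument type.
std::string formatCount(std::size_t count) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "count=%zu", count);
  return buf;
}
```

The revert listed earlier in this log shows the portability catch: `ssize_t` is a POSIX type, so a cast to it does not compile where that type is not declared.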
Med Ismail Bennani	595148ab76	[lldb/crashlog] Avoid StopAtEntry when launch crashlog in interactive mode (#154651 ) In 88f409194, we changed the way the crashlog scripted process was launched since the previous approach required parsing the file twice, by stopping at entry, setting the crashlog object in the middle of the scripted process launch and resuming it. Since then, we've introduced SBScriptObject, which allows passing an arbitrary Python object across the SBAPI boundary to another scripted affordance. This patch makes use of that to include the parsed crashlog object in the scripted process launch info dictionary, which alleviates the need to stop at entry. Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>	2025-08-21 23:16:45 -07:00
Brad Smith	0fff460592	[Driver] DragonFly does not support C11 threads (#154886 )	2025-08-22 02:02:52 -04:00
Rajat Bajpai	b08b219650	[MLIR][NVVM] Add "blocksareclusters" kernel attribute support (#154519 ) This change adds "nvvm.blocksareclusters" kernel attribute support in NVVM Dialect/MLIR.	2025-08-22 11:32:21 +05:30


549622 Commits