llvm-project

Author	SHA1	Message	Date
Aiden Grossman	de3e8fff20	[NFC][CI] Reformat python files Looks like some of these were not properly formatted at some point. This patch reformats these files so that future diffs are cleaner when running the formatter over the whole file.	2025-05-20 21:52:33 +00:00
Daniel Paoliello	a414877a7a	[x64][win] Add compiler support for x64 import call optimization (equivalent to MSVC /d2guardretpoline) (#126631 ) This is the x64 equivalent of #121516 Since import call optimization was originally [added to x64 Windows to implement a more efficient retpoline mitigation](https://techcommunity.microsoft.com/blog/windowsosplatform/mitigating-spectre-variant-2-with-retpoline-on-windows/295618) the section and constant names relating to this all mention "retpoline" and we need to mark indirect calls, control-flow guard calls and jumps for jump tables in the section alongside calls to imported functions. As with the AArch64 feature, this emits a new section into the obj which is used by the MSVC linker to generate the Dynamic Value Relocation Table and the section itself does not appear in the final binary. The Windows Loader requires a specific sequence of instructions be emitted when this feature is enabled: * Indirect calls/jumps must have the function pointer to jump to in `rax`. * Calls to imported functions must use the `rex` prefix and be followed by a 5-byte nop. * Indirect calls must be followed by a 3-byte nop.	2025-05-20 14:48:41 -07:00
Aiden Grossman	a690852b29	[llvm-exegesis] Error instead of aborting on verification failure (#137581 ) This patch makes llvm-exegesis emit an error when the machine function fails in MachineVerification rather than aborting. This allows downstream users (particularly https://github.com/google/gematria) to handle these errors rather than having the entire process crash. This should essentially be NFC from the user perspective minus the addition of the new error message.	2025-05-20 14:48:17 -07:00
Andrew Rogers	98595cfd6f	[llvm] prepare explicit template instantiations in llvm/CodeGen for DLL export annotations (#140653 ) ## Purpose This patch prepares the llvm/CodeGen library for public interface annotations in support of an LLVM Windows DLL (shared library) build, tracked in #109483. The purpose of this patch is to make the upcoming codemod of this library more straight-forward. It is not expected to impact any functionality. The `LLVM_ABI` annotations will be added in a subsequent patch. These changes are required to build with visibility annotations using Clang and gcc on Linux/Darwin/etc; Windows DLL can build fine without them. ## Overview This PR does four things in preparation for adding `LLVM_ABI` annotations to llvm/CodeGen: 1. Explicitly include `Machine.h` and `Function.h` headers from `MachinePassManager.cpp` so that `Function` and `Machine` types are available for the instantiations of `InnerAnalysisManagerProxy`. Without this change, Clang will only export one of the templates after visibility annotations are added to them. Unclear if this is a Clang bug or expected behavior, but this change avoids the issue and should be harmless. 2. Refactor the definition of `MachineFunctionAnalysisManager` to its own header file. Without this change, it is not possible to add visibility annotations to the declaration without causing gcc to produce `-Wattribute` warnings. 3. Remove the redundant specialization of the `DominatorTreeBase<MachineBasicBlock, false>::addRoot` method. The specialization is the same as implemented in `DominatorTreeBase` so should be unnecessary. Without this change, it is not possible to annotate the subsequent instantiations of `DominatorTreeBase` in the header file without gcc producing `-Wattribute` warnings. Mark unspecialized `addRoot` as `inline` to match the removed specialized version. 4. Move the explicit instantiations of the `GenericDomTreeUpdater` template earlier in the header file. These need to appear before being used in the `MachineDomTreeUpdater` class definition or gcc will produce warnings once visibility annotations are added. ## Background The LLVM Windows DLL effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307). Clang and gcc handle visibility attributes on explicit template instantiations a bit differently; gcc is pickier and generates `-Wattribute` warnings when an explicit instantiation with a visibility annotation appears after the type has already appeared in the translation unit. These warnings can be avoided by moving explicit template instantiations so they always appear first. ## Validation Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations: - Windows with MSVC - Windows with Clang - Linux with GCC - Linux with Clang - Darwin with Clang	2025-05-20 14:40:20 -07:00
Kazu Hirata	e25abd0d54	[bugpoint] Use a range-based for loop (NFC) (#140743 )	2025-05-20 14:34:44 -07:00
Kazu Hirata	cbac2a9241	[llvm] Use llvm::is_contained (NFC) (#140742 )	2025-05-20 14:34:16 -07:00
Peiyong Lin	04ad8d4900	Emit inbounds and nuw attributes in memref. (#138984 ) Now that MLIR accepts nuw and nusw in getelementptr, this patch emits the inbounds and nuw attributes when lowering memref to LLVM in load and store operators. This patch also strengthens the memref.load and memref.store spec about undefined behaviour during lowering. This patch also lifts the |rewriter| parameter in getStridedElementPtr ahead so that LLVM::GEPNoWrapFlags can be added at the end with a default value and grouped together with other operators' parameters. Signed-off-by: Lin, Peiyong <linpyong@gmail.com>	2025-05-20 14:16:22 -07:00
Arthur Eubanks	11db1285e4	[gn build] Manually port 8f03e1a	2025-05-20 21:09:37 +00:00
Aaron Puchert	317c932622	Suppress errors from well-formed-testing type traits in SFINAE contexts (#135390 ) There are several type traits that produce a boolean value or type based on the well-formedness of some expression (more precisely, the immediate context, i.e. for example excluding nested template instantiation): * `__is_constructible` and variants, * `__is_convertible` and variants, * `__is_assignable` and variants, * `__reference_{binds_to,{constructs,converts}_from}_temporary`, * `__is_trivially_equality_comparable`, * `__builtin_common_type`. (It should be noted that the standard doesn't always base this on the immediate context being well-formed: for `std::common_type` it's based on whether some expression "denotes a valid type." But I assume that's an editorial issue and means the same thing.) Errors in the immediate context are suppressed, instead the type traits return another value or produce a different type if the expression is not well-formed. This is achieved using an `SFINAETrap` with `AccessCheckingSFINAE` set to true. If the type trait is used outside of an SFINAE context, errors are discarded because in that case the `SFINAETrap` sets `InNonInstantiationSFINAEContext`, which makes `isSFINAEContext` return an `optional(nullptr)`, which causes the errors to be discarded in `EmitDiagnostic`. However, in an SFINAE context this doesn't happen, and errors are added to `SuppressedDiagnostics` in the `TemplateDeductionInfo` returned by `isSFINAEContext`. Once we're done with deducing template arguments and have decided which template is going to be instantiated, the errors corresponding to the chosen template are then emitted. At this point we get errors from those type traits that we wouldn't have seen if used with the same arguments outside of an SFINAE context. That doesn't seem right. So what we want to do is always set `InNonInstantiationSFINAEContext` when evaluating these well-formed-testing type traits, regardless of whether we're in an SFINAE context or not. This should only affect the immediate context, as nested contexts add a new `CodeSynthesisContext` that resets `InNonInstantiationSFINAEContext` for the time it's active. Going through uses of `SFINAETrap` with `AccessCheckingSFINAE` = `true`, it occurred to me that all of them want this behavior and we can just use this parameter to decide whether to use a non-instantiation context. The uses are precisely the type traits mentioned above plus the `TentativeAnalysisScope`, where I think it is also fine. (Though I think we don't do tentative analysis in SFINAE contexts anyway.) Because the parameter no longer just sets `AccessCheckingSFINAE` in Sema but also `InNonInstantiationSFINAEContext`, I think it should be renamed (along with uses, which also point the reviewer to the affected places). Since we're testing for validity of some expression, `ForValidityCheck` seems to be a good name. The added tests should more or less correspond to the users of `SFINAETrap` with `AccessCheckingSFINAE` = `true`. I added a test for errors outside of the immediate context for only one type trait, because it requires some setup and is relatively noisy. We put the `ForValidityCheck` condition first because it's constant in all uses and this would then allow the compiler to prune the call to `isSFINAEContext` when true. Fixes #132044.	2025-05-20 23:02:51 +02:00
Igor Kudrin	3f196e0293	[lldb][core] Fix getting summary of a variable pointing to r/o memory (#139196 ) Motivation example: ``` > lldb -c altmain2.core ... (lldb) var F (const char *) F = 0x0804a000 "" ``` The variable `F` points to a read-only memory page not dumped to the core file, so `Process::ReadMemory()` cannot read the data. The patch switches to `Target::ReadMemory()`, which can read data both from the process memory and the application binary.	2025-05-20 13:50:24 -07:00
Matt Arsenault	5aa3171f2c	AMDGPU: Add regression test for multiple frame index lowering (#140784 ) Failures appeared after https://github.com/llvm/llvm-project/pull/140587 but this case wasn't covered	2025-05-20 22:37:43 +02:00
Jeremy Morse	26d9cb17a6	[MC][DebugInfo] Emit linetable entries with known offsets immediately (#134677 ) DWARF linetable entries are usually emitted as a sequence of MCDwarfLineAddrFragment fragments containing the line-number difference and an MCExpr describing the instruction-range the linetable entry covers. These then get relaxed during assembly emission. However, a large number of these instruction-range expressions are ranges within a fixed MCDataFragment, i.e. a range over fixed-size instructions that are not subject to relaxation at a later stage. Thus, we can compute the address-delta immediately, and not spend time and memory describing that computation so it can be deferred.	2025-05-20 21:26:56 +01:00
Florian Hahn	705e27c234	[LoopPeel] Add tests for peeling from end with variable trip counts. Add more test coverage for peeling the last iteration with variable trip counts. Separate test cases for constant and variable trip counts in different files.	2025-05-20 21:07:21 +01:00
Ebuka Ezike	1b6b036c02	[lldb][docs] add command to save core file in gdb to lldb command map. (#140771 )	2025-05-20 20:57:51 +01:00
Anthony Cabrera-Lara	be5b4fad29	Update InterpreterProperties.td (#140746 ) Fix typo in interpreter property description. Fixes #140708	2025-05-20 12:55:59 -07:00
Philip Reames	8708c42e31	[RISCV] Add zvqdotq tests using partial.reduce.add [nfc]	2025-05-20 11:48:36 -07:00
Timm Baeder	9260d310f1	[clang][bytecode][NFC] Remove Frame.cpp (#140750 ) The file was basically empty. The actual implementations of function frames for the two interpreters live in their own respective files.	2025-05-20 20:41:32 +02:00
Aaron Ballman	c555c8d554	[C] Do not diagnose flexible array members with -Wdefault-const-init-field-unsafe (#140578 ) This addresses post-commit review feedback from someone who discovered that we diagnosed code like the following: ``` struct S { int len; const char fam[]; } s; ``` despite it being invalid to initialize the flexible array member. Note, this applies to flexible array members and zero-sized arrays at the end of a structure (an old-style flexible array member), but it does not apply to one-sized arrays at the end of a structure because those do occupy storage that can be initialized.	2025-05-20 14:40:12 -04:00
Philip Reames	0ccd57e289	[RISCV] Add basic coverage of vector.partial.reduce.add [nfc]	2025-05-20 11:31:46 -07:00
Dan Blackwell	4964d98057	[compiler-rt] Replace deprecated os_trace calls on mac (#138908 ) Currently there are deprecation warnings suppressed for `os_trace`; this patch replaces all uses with `os_log_error`. rdar://140295247	2025-05-20 11:31:40 -07:00
Jan Patrick Lehr	b99e57583e	Revert "[mlir] [XeGPU] Add XeGPU workgroup to subgroup pass (#139477 )" (#140779 ) This reverts commit 747620db2a02b889ae3ba3921d6c0e526a3e7677. Multiple bot failures	2025-05-20 20:31:00 +02:00
Kazu Hirata	611f47c46c	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp:1377:10: error: variable 'isValid' set but not used [-Werror,-Wunused-but-set-variable]	2025-05-20 11:24:15 -07:00
Arthur Eubanks	38250ed3b2	[gn build] Manually port a9ee8e4a Can make these into gn args later if needed.	2025-05-20 18:22:04 +00:00
Valentin Clement (バレンタインクレメン)	c17ae161fd	[flang][cuda] Use nullptr for comparison (#140767 ) Comparison without explicit nullptr seems to bring false positives. Use explicit nullptr.	2025-05-20 11:04:06 -07:00
Abhina Sree	a9ee8e4a45	Create a EncodingConverter class with both iconv and icu support. (#138893 ) This patch adds a wrapper class called EncodingConverter for ConverterEBCDIC. This class is then extended to support the ICU library or iconv library. The ICU library currently takes priority over the iconv library. Relevant RFCs: https://discourse.llvm.org/t/rfc-adding-a-charset-converter-to-the-llvm-support-library/69795 https://discourse.llvm.org/t/rfc-enabling-fexec-charset-support-to-llvm-and-clang-reposting/71512 Stacked PR to enable fexec-charset that depends on this: https://github.com/llvm/llvm-project/pull/138895 See old PR for review and commit history: https://github.com/llvm/llvm-project/pull/74516	2025-05-20 14:02:22 -04:00
Andy Kaylor	cbcfe667bb	[CIR] Upstream support for iterator-based range for loops (#140636 ) This change adds handling for C++ member operator calls, implicit no-op casts, and l-value call expressions. Together, these changes enable handling of range for loops based on iterators.	2025-05-20 10:52:15 -07:00
Justin Cady	0931874b21	[Coverage] Add testing to validate code coverage for exceptions (#133463 ) While investigating an issue with code coverage reporting around exceptions it was useful to have a baseline of what works today. This change adds end-to-end testing to validate code coverage behavior that is currently working with regards to exception handling.	2025-05-20 13:43:32 -04:00
Maksim Panchenko	51e222ef48	[BOLT][AArch64] Fix crash for conditional tail calls (#140669 ) When a conditional tail call is located in old code while BOLT is operating in lite mode, the call will require an optional pending relocation with a type that is currently not supported, resulting in a build-time crash. Before a proper fix is implemented, ignore conditional tail calls for relocation purposes and mark their target functions to be patched, i.e. to be served as veneers/thunks.	2025-05-20 10:38:00 -07:00
Nishant Patel	747620db2a	[mlir] [XeGPU] Add XeGPU workgroup to subgroup pass (#139477 ) This PR adds the XeGPU workgroup (wg) to subgroup (sg) pass. The wg to sg pass transforms the xegpu wg level operations to subgroup operations based on the sg_layout and sg_data attribute. The PR adds transformation patterns for the following ops: 1. CreateNdDesc 2. LoadNd 3. StoreNd 4. PrefetchNd 5. UpdateNdOffset 6. Dpas	2025-05-20 12:35:50 -05:00
Sarah Spall	5999988af8	[HLSL] Move where ZExt happens in 'EmitStoreThroughExtVectorComponentLValue' to handle bug with hlsl boolean vector swizzles (#140627 ) In 'EmitStoreThroughExtVectorComponentLValue', move the code which ZExts in the case the Destination Scalar Type is larger than the Source Scalar Type, to the top of the function, to ensure each condition is handled. The previous code missed this case: ``` bool4 b = true.xxxx; b.xyz = false.xxx; ``` Leading to a bad shuffle vector. Closes #140564	2025-05-20 10:27:34 -07:00
Sarah Spall	2a1af502d4	[DirectX] scalarize the dx.isinf intrinsic (#140638 ) The DXIL IsInf op only takes scalars. Closes #140577	2025-05-20 10:26:58 -07:00
Craig Topper	0cf6b4f5ee	[Docs][RISCV] Move Zilsd to 'Supported' status. NFC (#140757 )	2025-05-20 10:23:13 -07:00
David Green	47b89fb412	[AArch64] Use i32 extract from UADDV in popcount lowering. (#140718 ) We need the top bits to be zeroes, but a v8i8->i32 EXTRACT_VECTOR_ELT will anyext into the top bits. The instruction we create (UADDV) is known to be zeroes in the upper bits, so we can convert to a larger v2i32 vector and extract from there, similar to the operation currently performed for i64 types. Fixes #140707	2025-05-20 18:09:18 +01:00
Aaron Ballman	6fb23afb8d	[C] Do not diagnose unions with -Wdefault-const-init (#140725 ) A default-initialized union with a const member is generally reasonable in C and isn't necessarily incompatible with C++, so we now silence the diagnostic in that case. However, we do still diagnose a const-qualified, default-initialized union as that is incompatible with C++.	2025-05-20 13:04:24 -04:00
Fangrui Song	a1e314d10d	[test] Add lit.local.cfg after #140471	2025-05-20 09:51:13 -07:00
Dave Lee	ff127624be	[lldb] Reduce max-children-count default to readable size (#139826 ) Change the default from 256 to 24. The argument is that 256 is too large to be scanned by eye, and too large to print in a terminal which can be only 40-50 lines in height. When all children must be shown, `frame variable` and `expression` both support the `-A` (`--show-all-children`) flag. rdar://145327522	2025-05-20 09:34:42 -07:00
Brox Chen	7e9d9dba9c	[AMDGPU][True16][CodeGen] update test fmax3/fmin3 test with true16 mode (#140752 ) This is an NFC patch. It duplicates the GFX11plus run lines and applies them with "+mattr=+real-true16" and "+mattr=-real-true16" on the fmax3/fmin3 tests, puts '-real-true16' on the gisel run line, and then updates the tests with the update script.	2025-05-20 12:33:41 -04:00
Min-Yih Hsu	b3c3297c1a	[RISCV] Fix missing WriteRes for Q extensions in SiFiveP800 scheduling model	2025-05-20 09:24:48 -07:00
Slava Zakharin	54aa9282ed	[flang] Undo the effects of CSE for hlfir.exactly_once. (#140190 ) CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once.	2025-05-20 09:22:05 -07:00
Min-Yih Hsu	b92b548168	[RISCV] Add scheduling model for SiFive P800 processors (#139316 ) The scheduling model for SiFive P800 series cores. They have 6 integer pipes, 2 floating point pipes, and 2 vector pipes. https://chipsandcheese.com/p/hot-chips-2023-sifives-p870-takes-risc-v-further The tests are meant to have the same coverage as its P600 counterpart.	2025-05-20 09:13:08 -07:00
CarolineConcatto	17e293d5b8	[LLVM][AArch64]CFINV - Add UNPREDICTABLE behaviour if CRm is not zero (#140593 ) Now CFINV follows AXFLAGS behaviour for CRm. It looks like (0) in the instruction encoding means that the behaviour is UNPREDICTABLE if that bit is not zero.	2025-05-20 17:11:11 +01:00
erichkeane	e8dff7bea4	[OpenACC] Fix location of array-section diagnostic. In a sub-subscript of an array-section, it is actually an array section. So make sure we get the location correct when there isn't a 'colon' to look at.	2025-05-20 09:04:32 -07:00
Craig Topper	4a0ae4f504	[RISCV] Add LD_RV32/SD_RV32 to a few more functions in RISCVInstrInfo. (#140640 ) isLoadFromStackSlot/isStoreToStackSlot/getMemOperandsWithOffsetWidth The first two probably require spills/reloads, which we don't use LD_RV32/SD_RV32 for yet. I think getMemOperandsWithOffsetWidth is mainly used for load/store clustering. I think we can assume this just works.	2025-05-20 09:01:03 -07:00
Fangrui Song	95e4db8fa7	[llvm-objdump] --adjust-vma: Call getInstruction with adjusted address llvm-objdump currently calls MCDisassembler::getInstruction with unadjusted address and MCInstPrinter::printInst with adjusted address. The decoded branch targets will be adjusted as expected for most targets (as the getInstruction address is insignificant) but not for SystemZ (where the getInstruction address is displayed). Specify an adjusted address to fix SystemZInstPrinter output. The added test utilizes llvm/utils/update_test_body.py to make updates easier and additionally checks that we don't adjust SHN_ABS symbol addresses. Pull Request: https://github.com/llvm/llvm-project/pull/140471	2025-05-20 08:54:53 -07:00
Kazu Hirata	ad80f73631	[X86] Fix a warning This patch fixes: llvm/lib/Target/X86/X86ISelLowering.cpp:39622:12: error: explicitly assigning value of variable of type 'SDValue' to itself [-Werror,-Wself-assign-overloaded]	2025-05-20 08:37:48 -07:00
erichkeane	138a899fe0	[OpenACC][CIR] Implement simple 'copy' lowering for combined constructs These are identical in IR as the 'compute' constructs, but require a little additional work since we have 2 operations to work around, not just 1. Note that the test is nearly identical to the compute version, except that the combined 'tag's are present, plus the 'loop' construct.	2025-05-20 08:37:30 -07:00
Alexey Bataev	2318491432	[SLP][NFC]Do the analysis first and then actual codegen, NFC	2025-05-20 08:12:53 -07:00
Simon Pilgrim	09fd8f0093	[X86] matchBinaryPermuteShuffle - match AVX512 "cross lane" SHLDQ/SRLDQ style patterns using VALIGN (#140538 ) Very similar to what we do in lowerShuffleAsVALIGN I've updated isTargetShuffleEquivalent to correctly handle SM_SentinelZero in the expected shuffle mask, but it only allows an exact match (or the test mask was undef) - it can't be used to match zero elements with MaskedVectorIsZero. Noticed while working on #140516	2025-05-20 16:07:56 +01:00
Simon Pilgrim	621a5a976e	[X86] combineAdd - use SDPatternMatch to simplify "(add (zext (vXi1 X)), Y) -> (sub Y, (sext (vXi1 X)))" matching. (#140731 )	2025-05-20 15:59:56 +01:00
Prabhu Rajasekaran	1a9377bef3	[clang][analysis] Thread Safety Analysis: Handle parenthesis (#140656 )	2025-05-20 07:45:14 -07:00

538174 Commits