549597 Commits

Author SHA1 Message Date
pvanhout
857644e104 Comments 2025-08-22 10:14:27 +02:00
pvanhout
3b51b225ba Drop -CU suffix 2025-08-22 10:14:27 +02:00
pvanhout
e83355bd69 clang-format 2025-08-22 10:14:26 +02:00
pvanhout
c66127cb33 [AMDGPU][gfx1250] Implement SIMemoryLegalizer
Implements the base of the MemoryLegalizer for a roughly correct GFX1250 memory model.
Documentation will come later, and some remaining changes still have to be added, but this is the backbone of the model.
2025-08-22 10:14:26 +02:00
Pierre van Houtryve
4ab5efd48d
[AMDGPU][gfx1250] Add memory legalizer tests (NFC) (#154725) 2025-08-22 10:14:09 +02:00
Fangrui Song
f1aee598e7 ARM: Remove unneeded ARM::fixup_arm_thumb_bl special case
This is a weird special case added in 2015, simplifying an even older
condition. It is a no-op for ELF (isExternal is always false) and seems
unneeded for non-ELF.
2025-08-22 01:08:33 -07:00
LLVM GN Syncbot
2a59400003 [gn build] Port 2b8e80694263 2025-08-22 08:03:17 +00:00
Muhammad Omair Javaid
2b8e806942 Revert "[lldb-dap] Add module symbol table viewer to VS Code extension #140626 (#153836)"
This reverts commit 8b64cd8be29da9ea74db5a1a21f7cd6e75f9e9d8.

This breaks lldb-aarch64-* bots causing a crash in lldb-dap while
running test TestDAP_moduleSymbols.py

https://lab.llvm.org/buildbot/#/builders/59/builds/22959
https://lab.llvm.org/buildbot/#/builders/141/builds/10975
2025-08-22 13:02:52 +05:00
Zhaoxin Yang
149d9a38e1
[ELF][LoongArch] -r: Synthesize R_LARCH_ALIGN at input section start (#153935)
Similar to

94655dc8ae

The difference is that in LoongArch, the ALIGN is synthesized when the
alignment is >4, (instead of >=4), and the number of bytes inserted is
`sec->addralign - 4`.
2025-08-22 16:02:41 +08:00
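The padding rule stated in the commit message above can be sketched as plain arithmetic. This is only an illustration in Python, not the lld code; the function name `synthesized_align_padding` is hypothetical, and the only inputs taken from the message are the `> 4` threshold and the `sec->addralign - 4` byte count.

```python
def synthesized_align_padding(addralign: int) -> int:
    """Bytes covered by a synthesized R_LARCH_ALIGN for a section alignment.

    Hypothetical helper: mirrors only the rule quoted in the commit message.
    """
    if addralign <= 0 or addralign & (addralign - 1):
        raise ValueError("alignment must be a positive power of two")
    # Nothing is synthesized unless the alignment is strictly greater than 4.
    if addralign <= 4:
        return 0
    # Otherwise the number of bytes inserted is addralign - 4.
    return addralign - 4


for align in (4, 8, 16, 64):
    print(align, synthesized_align_padding(align))
```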
Connector Switch
6560adb584
[flang] optimize atand/atan2d precision (#154544)
Part of https://github.com/llvm/llvm-project/issues/150452.
2025-08-22 15:55:46 +08:00
Matt Arsenault
2b46f31ee3
AMDGPU: Sign extend immediates for 32-bit subregister extracts (#154870)
extractSubregFromImm previously would sign extend the 16-bit subregister
extracts, but not the 32-bit. We try to consistently store immediates
as sign extended, since not doing it can result in misreported
isInlineImmediate checks.
2025-08-22 16:50:36 +09:00
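To illustrate why the sign extension in the commit above matters, here is a small Python sketch of extracting a subregister-sized piece of an immediate. The helper names are hypothetical and this is not the `extractSubregFromImm` implementation; it only shows the difference between the zero-extended and sign-extended views of the same 32 bits.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def extract_subreg_imm(imm: int, index: int, bits: int = 32) -> int:
    """Hypothetical model: take piece `index` of `imm` and sign extend it."""
    return sign_extend(imm >> (index * bits), bits)


imm = 0x00000001_FFFFFFFF
low_zero_extended = imm & 0xFFFFFFFF      # 4294967295
low_sign_extended = extract_subreg_imm(imm, 0)  # -1
print(low_zero_extended, low_sign_extended, extract_subreg_imm(imm, 1))
```

With the sign-extended form the low half reads as -1, which is the representation a check written against sign-extended immediates would expect.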
Stanislav Mekhanoshin
e0945dfa30
[AMDGPU] Add test to show failure with SRC_*_HI registers. NFC. (#154828)
Since src_{private|shared}_{base|limit} registers are added and
are not artificial, the compiler happily uses them when it can. In HW
these registers do not exist and the encoding belongs to their
64-bit super-register or 32-bit low register. The same instructions
will produce a relocation if run through the assembler.
2025-08-22 00:50:25 -07:00
Jay Foad
cf5243619a
[AMDGPU] Common up two local memory size calculations. NFCI. (#154784) 2025-08-22 08:44:11 +01:00
serge-sans-paille
50f7c6a5b9
Default to GLIBCXX_USE_CXX11_ABI=ON
Because many of our bots actually don't run a libstdc++ compatible with
_GLIBCXX_USE_CXX11_ABI=0. See
https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html for
details.

This is a follow-up to be179d069664ce03c485e49fa1f6e2ca3d6286fa related
to #154447.
2025-08-22 09:35:40 +02:00
paperchalice
945a186089
[DAGCombiner] Remove most UnsafeFPMath references (#146295)
This pull request removes all references to `UnsafeFPMath` in dag
combiner except FP_ROUND.
- Set fast math flags in some tests.
2025-08-22 15:27:25 +08:00
Fangrui Song
06ab660911 MCSymbol: Avoid isExported/setExported
The next change will move these methods from the base class.
2025-08-22 00:25:55 -07:00
Durgadoss R
36dc6146b8
[MLIR][NVVM] Update TMA tensor prefetch Op (#153464)
This patch updates the TMA Tensor prefetch Op
to add support for im2col_w/w128 and tile_gather4 modes.
This completes support for all modes available in Blackwell.
* lit tests are added for all possible combinations.
* The invalid tests are moved to a separate file with more coverage.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-08-22 12:51:29 +05:30
Djordje Todorovic
5050da7ba1
[RISCV] Add initial assembler/MC layer support for big-endian (#146534)
This patch adds basic assembler and MC layer infrastructure for
RISC-V big-endian targets (riscv32be/riscv64be):
      - Register big-endian targets in RISCVTargetMachine
      - Add big-endian data layout strings
      - Implement endianness-aware fixup application in assembler
        backend
      - Add byte swapping for data fixups on BE cores
      - Update MC layer components (AsmInfo, MCTargetDesc, Disassembler,
        AsmParser)
    
This provides the foundation for BE support but does not yet include:
      - Codegen patterns for BE
      - Load/store instruction handling
      - BE-specific subtarget features
2025-08-22 09:21:10 +02:00
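The "endianness-aware fixup application" item in the commit above can be pictured with a short Python sketch. It assumes a data fixup is merged into the section bytes by OR-ing, and the function name and signature are hypothetical; it is not the RISC-V backend code and says nothing about instruction fixups.

```python
def apply_data_fixup(data: bytearray, offset: int, value: int,
                     size: int, big_endian: bool) -> None:
    """Write a `size`-byte data fixup at `offset` in the target byte order.

    Hypothetical model: the value is truncated to `size` bytes, laid out
    big- or little-endian, and OR-ed into the existing bytes.
    """
    order = "big" if big_endian else "little"
    patch = (value & ((1 << (8 * size)) - 1)).to_bytes(size, order)
    for i, byte in enumerate(patch):
        data[offset + i] |= byte


le = bytearray(4)
be = bytearray(4)
apply_data_fixup(le, 0, 0x12345678, 4, big_endian=False)
apply_data_fixup(be, 0, 0x12345678, 4, big_endian=True)
print(le.hex(), be.hex())
```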
Jason Molenda
a2f542b7a5
[lldb][debugserver] update --help to list all the options (#154853)
These are almost all for internal-developer-users only so "look at
debugserver.cpp" wasn't unreasonable, but we rarely add any new options
so a simple list of all recognized options isn't a burden to throw in
the help method.
2025-08-22 00:05:13 -07:00
Fangrui Song
04a3dd5a19 MCSymbol: Avoid isExported/setExported
The next change will move it to MCSymbol{COFF,MachO,Wasm} to make it
clear that other object file formats (e.g. ELF) do not use this field.
2025-08-22 00:00:29 -07:00
Fangrui Song
1def457228 MC: Avoid MCSymbol::isExported
This bit is only used by COFF/MachO. The upcoming change will move
isExported/setExported to MCSymbolCOFF/MCSymbolMachO.
2025-08-21 23:26:53 -07:00
Amit Kumar Pandey
d3d5751a39
[compiler-rt]: fix CodeQL format-string warnings via explicit casts (#153843)
This change addresses CodeQL format-string warnings across multiple
sanitizer libraries by adding explicit casts to ensure that printf-style
format specifiers match the actual argument types.

Key updates:
- Cast pointer arguments to (void*) when used with %p.
- Use appropriate integer types and specifiers (e.g., size_t -> %zu,
ssize_t -> %zd) to avoid mismatches.
- Fix format specifier mismatches across xray, memprof, lsan, hwasan,
dfsan.

These changes are no-ops at runtime but improve type safety, silence
static analysis warnings, and reduce the risk of UB in variadic calls.
2025-08-22 11:51:13 +05:30
Med Ismail Bennani
595148ab76
[lldb/crashlog] Avoid StopAtEntry when launch crashlog in interactive mode (#154651)
In 88f409194, we changed the way the crashlog scripted process was
launched, since the previous approach required parsing the file twice,
by stopping at entry, setting the crashlog object in the middle of the
scripted process launch and resuming it.

Since then, we've introduced SBScriptObject, which allows passing an
arbitrary Python object across the SBAPI boundary to another scripted
affordance.

This patch makes use of that to include the parsed crashlog object in
the scripted process launch info dictionary, which alleviates the need to
stop at entry.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>

2025-08-21 23:16:45 -07:00
Brad Smith
0fff460592
[Driver] DragonFly does not support C11 threads (#154886) 2025-08-22 02:02:52 -04:00
Rajat Bajpai
b08b219650
[MLIR][NVVM] Add "blocksareclusters" kernel attribute support (#154519)
This change adds "nvvm.blocksareclusters" kernel attribute support in NVVM Dialect/MLIR.
2025-08-22 11:32:21 +05:30
Mike Hommey
be179d0696
Be explicit about what libstdc++ C++11 ABI to use (#154447)
libstdc++ can be configured to default to a different C++11 ABI, and
when the system that is used to build clang has a different default than
the system used to build a clang plugin, that leads to uses of different
ABIs, leading to breakage (missing symbols) when using clang APIs that
use types like std::string.

We arbitrarily choose to default to the old ABI, but the user can opt-in
to the new ABI. The important part is that whichever is picked is
reflected in llvm-config's output.
2025-08-22 05:55:42 +00:00
Craig Topper
dee25a8a8e
[TableGen] Validate the shift amount for !srl, !shl, and !sra operators. (#132492)
The C operator has undefined behavior for out of bounds shifts so we
should check this.
2025-08-21 22:41:36 -07:00
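The kind of check the commit above describes can be sketched in Python. This assumes 64-bit integers (the width of TableGen's `int`) and a valid shift range of 0 to 63; the function name and error text are hypothetical and are not TableGen's actual diagnostics.

```python
BITS = 64
MASK = (1 << BITS) - 1


def to_signed(value: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def checked_shift(op: str, value: int, amount: int) -> int:
    """Model of !shl / !srl / !sra that rejects out-of-range shift amounts."""
    if not 0 <= amount < BITS:
        raise ValueError(f"!{op} shift amount {amount} is out of range")
    if op == "shl":
        return to_signed(value << amount)
    if op == "srl":  # logical shift: operate on the unsigned bit pattern
        return to_signed((value & MASK) >> amount)
    if op == "sra":  # arithmetic shift: keep the sign
        return to_signed(value) >> amount
    raise ValueError(f"unknown operator {op}")


print(checked_shift("shl", 1, 3), checked_shift("sra", -8, 1))
```

Rejecting the amount up front is what avoids relying on the host C++ shift, whose result is undefined for amounts at or beyond the operand width.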
Frederik Harwath
d6fae7f921
Reapply "[Clang] Take libstdc++ into account during GCC detection" #145056 (#154487)
The Generic_GCC::GCCInstallationDetector class picks the GCC installation directory with the largest version number. Since the location of the libstdc++ include directories is tied to the GCC version, this can break C++ compilation if the libstdc++ headers for this particular GCC version are not available. Linux distributions tend to package the libstdc++ headers separately from GCC. This frequently leads to situations in which a newer version of GCC gets installed as a dependency of another package without installing the corresponding libstdc++ package. Clang then fails to compile C++ code because it cannot find the libstdc++ headers. Since libstdc++ headers are in fact installed on the system, the GCC installation continues to work, the user may not be aware of the details of the GCC detection, and the compiler does not recognize the situation and emit a warning, this behavior can be hard to understand - as witnessed by many related bug reports over the years.

The goal of this work is to change the GCC detection to prefer GCC installations that contain libstdc++ include directories over those which do not. This should happen regardless of the input language since picking different GCC installations for a build that mixes C and C++ might lead to incompatibilities.
Any change to the GCC installation detection will probably have a negative impact on some users. For instance, for a C user who relies on using the GCC installation with the largest version number, it might become necessary to use the --gcc-install-dir option to ensure that this GCC version is selected.
This seems like an acceptable trade-off given that the situation for users who do not have any special demands on the particular GCC installation directory would be improved significantly.
 
This patch does not yet change the automatic GCC installation directory choice. Instead, it does introduce a warning that informs the user about the future change if the chosen GCC installation directory differs from the one that would be chosen if the libstdc++ headers are taken into account.

See also this related Discourse discussion: https://discourse.llvm.org/t/rfc-take-libstdc-into-account-during-gcc-detection/86992.

This patch reapplies #145056. The test in the original PR did not specify a target in the clang RUN line and used a wrong way of piping to FileCheck.
2025-08-22 07:39:11 +02:00
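The selection-plus-warning behavior described in the commit above can be sketched as follows. This is a Python illustration under stated assumptions: the `GccInstall` record, the paths, and the warning text are all hypothetical, and the real logic lives in `Generic_GCC::GCCInstallationDetector`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class GccInstall:
    path: str                      # hypothetical install directory
    version: Tuple[int, ...]
    has_libstdcxx_headers: bool


def pick_install(installs: Sequence[GccInstall]) -> Tuple[GccInstall, Optional[str]]:
    """Keep today's choice (largest version) but warn if the future rule differs."""
    current = max(installs, key=lambda i: i.version)
    with_headers = [i for i in installs if i.has_libstdcxx_headers]
    # Future rule: prefer installations that ship libstdc++ include directories.
    future = max(with_headers, key=lambda i: i.version) if with_headers else current
    warning = None
    if future != current:
        warning = (f"future releases will select {future.path} "
                   f"instead of {current.path}")
    return current, warning


installs = [
    GccInstall("/usr/lib/gcc/x86_64-linux-gnu/13", (13,), True),
    GccInstall("/usr/lib/gcc/x86_64-linux-gnu/14", (14,), False),
]
chosen, warning = pick_install(installs)
print(chosen.path, warning)
```

Here the newest installation is still chosen, matching the message's statement that the automatic choice is unchanged for now, and the warning names the directory the header-aware rule would pick.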
Craig Topper
630712f4c1
[RISCV] Add a helper class to reduce PseudoAtomicLoadNand* pattern duplication. NFC (#154838) 2025-08-21 22:35:28 -07:00
Matt Arsenault
b1b5102624
AMDGPU: Start considering new atomicrmw metadata on integer operations (#122138)
Start considering !amdgpu.no.remote.memory.access and
!amdgpu.no.fine.grained.host.memory metadata when deciding to expand
integer atomic operations. This does not yet attempt to accurately
handle fadd/fmin/fmax, which are trickier and require migrating the
old "amdgpu-unsafe-fp-atomics" attribute.
2025-08-22 05:29:36 +00:00
Lang Hames
c1625fad02
[orc-rt] Rename unique_function to move_only_function. (#154888)
This will allow the ORC runtime and its clients to easily adopt the
C++23 std::move_only_function type.
2025-08-22 15:26:10 +10:00
Craig Topper
c346f4079a
[RISCV] Use llvm_anyint_ty instead of llvm_any_ty for scalar intrinsics. NFC (#154816) 2025-08-21 22:18:39 -07:00
Matt Arsenault
fc5fcc0c95
AMDGPU: Start using AV_MOV_B64_IMM_PSEUDO (#154500) 2025-08-22 13:59:36 +09:00
Matt Arsenault
01f785cac4
AMDGPU: Expand remaining system atomic operations (#122137)
System scope atomics need to use cmpxchg loops if we know
nothing about the allocation the address is from.
aea5980e26e6a87dab9f8acb10eb3a59dd143cb1 started this, this
expands the set to cover the remaining integer operations.

Don't expand xchg and add, those theoretically should work over PCIe.
This is a pre-commit which will introduce performance regressions.
Subsequent changes will add handling of new atomicrmw metadata, which
will avoid the expansion.

Note this still isn't conservative enough; we do need to expand
some device scope atomics if the memory is in fine-grained remote
memory.
2025-08-22 13:55:04 +09:00
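For readers unfamiliar with the expansion mentioned in the commit above, a read-modify-write operation can be rebuilt from compare-and-exchange with a retry loop. The Python below is a single-threaded model with a hypothetical `Cell` class standing in for memory; it shows the loop shape only and is not the AMDGPU lowering.

```python
from typing import Callable, Tuple


class Cell:
    """Hypothetical memory location with a compare-and-exchange primitive."""

    def __init__(self, value: int) -> None:
        self.value = value

    def load(self) -> int:
        return self.value

    def cmpxchg(self, expected: int, new: int) -> Tuple[int, bool]:
        # Store `new` only if the cell still holds `expected`.
        old = self.value
        if old == expected:
            self.value = new
        return old, old == expected


def atomicrmw_via_cmpxchg(cell: Cell, op: Callable[[int, int], int],
                          operand: int) -> int:
    """Apply `op` with a cmpxchg retry loop and return the previous value."""
    old = cell.load()
    while True:
        seen, ok = cell.cmpxchg(old, op(old, operand))
        if ok:
            return seen
        old = seen  # someone else changed the value; retry with what we saw


c = Cell(0b1100)
previous = atomicrmw_via_cmpxchg(c, lambda a, b: a & b, 0b1010)
print(bin(previous), bin(c.value))
```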
Sergei Barannikov
6a7ade03d1
[TableGen][DecoderEmitter] Remove redundant variable (NFC) (#154880)
`NumFiltered` is the number of elements in all vectors in a map.
It is only ever compared to 1, which is equivalent to checking if the map
contains exactly one vector with exactly one element.
2025-08-22 04:42:06 +00:00
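The equivalence claimed in the commit above can be checked with a tiny Python model. It assumes the map never holds an empty vector (otherwise the two tests could disagree); the function names are hypothetical.

```python
from typing import Dict, List


def num_filtered(filtered: Dict[int, List[int]]) -> int:
    """Total number of elements across all vectors in the map."""
    return sum(len(v) for v in filtered.values())


def has_single_entry(filtered: Dict[int, List[int]]) -> bool:
    """True if the map holds exactly one vector with exactly one element."""
    return len(filtered) == 1 and len(next(iter(filtered.values()))) == 1


# Assumption: no vector in the map is empty.
samples = [{}, {0: [5]}, {0: [1, 2]}, {0: [1], 1: [2]}]
for sample in samples:
    print(sample, num_filtered(sample) == 1, has_single_entry(sample))
```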
Craig Topper
586a7131d3
[RISCV][LoongArch] Prefix tablegen class names for intrinsics with 'RISCV'. NFC (#154821)
All targets are included by Intrinsics.td so we should name things
carefully to avoid interfering with other targets.

Copy one class that LoongArch was also using.
2025-08-21 21:40:35 -07:00
Jordan Rupprecht
49d4712129
[bazel] Port #154774: unroll vector.from_elements (#154882) 2025-08-22 04:33:50 +00:00
Lang Hames
6df9a13e40
[orc-rt] Use LLVM-style header naming scheme. (#154881)
This is more consistent with the rest of the LLVM project, and the
resulting names are closer to the types defined in each of the headers.
2025-08-22 14:28:02 +10:00
dpalermo
d26ea02060
Revert "Fix Debug Build Using GCC 15" (#154877)
Reverts llvm/llvm-project#152223
2025-08-21 21:54:58 -05:00
Yang Bai
f1f194bf10
[mlir][vector] fix: unroll vector.from_elements in gpu pipelines (#154774)
### Problem

PR #142944 introduced a new canonicalization pattern which caused
failures in the following GPU-related integration tests:

- mlir/test/Integration/GPU/CUDA/TensorCore/sm80/transform-mma-sync-matmul-f16-f16-accum.mlir
- mlir/test/Integration/GPU/CUDA/TensorCore/sm80/transform-mma-sync-matmul-f32.mlir

The issue occurs because the new canonicalization pattern can generate
multi-dimensional `vector.from_elements` operations (rank > 1), but the
GPU lowering pipelines were not equipped to handle these during the
conversion to LLVM.

### Fix

This PR adds `vector::populateVectorFromElementsLoweringPatterns` to the
GPU lowering passes that are integrated in `gpu-lower-to-nvvm-pipeline`:

- `GpuToLLVMConversionPass`: the general GPU-to-LLVM conversion pass.
- `LowerGpuOpsToNVVMOpsPass`: the NVVM-specific lowering pass.

Co-authored-by: Yang Bai <yangb@nvidia.com>
2025-08-21 21:46:06 -05:00
Sergei Barannikov
418fb50301
[TableGen][DecoderEmitter] Calculate encoding bits once (#154026)
Parse the `Inst` and `SoftField` fields once and store them in
`InstructionEncoding` so that we don't parse them every time
`getMandatoryEncodingBits()` is called.
2025-08-22 05:19:35 +03:00
Lang Hames
273b6f2911
[orc-rt] Add orc_rt::unique_function. (#154874)
A bare-bones version of LLVM's unique_function: this behaves like a
std::function, except that it supports move-only callable types.
2025-08-22 12:19:15 +10:00
Muhammad Bassiouni
4d323206ed
[libc][math] Refactor cospif16 implementation to header-only in src/__support/math folder. (#154222)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-22 05:04:13 +03:00
Lang Hames
3ce25abd4a
[orc-rt] Add error.h: structured error support. (#154869)
Adds support for the Error class, Expected class template, and related
APIs that will be used for error propagation and handling in the new ORC
runtime.

The implementations of these types are cut-down versions of similar APIs
in llvm/Support/Error.h. Most advice on llvm::Error and llvm::Expected
(e.g. from the LLVM Programmer's manual) applies equally to
orc_rt::Error and orc_rt::Expected.

Ported from the old ORC runtime at compiler-rt/lib/orc.
2025-08-22 11:53:47 +10:00
Muhammad Bassiouni
783859b2a0
[libc][math] Refactor cospif implementation to header-only in src/__support/math folder. (#154215)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-22 04:53:18 +03:00
Aiden Grossman
9d2a66fb32 [Clang] Slightly clean up __cpuidex_conflict.c
This was intended to be fixed in #154217, but given that didn't land, it
still needs to be done. I think it still makes sense to have this change
in.
2025-08-22 01:37:17 +00:00
Anthony Latsis
0bc02096f6
[clang] Upstream clang::CodeGen::getConstantSignedPointer (#154453)
This function was introduced to Swift's fork in

https://github.com/swiftlang/llvm-project/commit/a9dd959e60c32#diff-db27b2738ad84e3f1093f9174710710478f853804d995a6de2816d1caaad30d1.

The Swift compiler cannot use `CodeGenModule::getConstantSignedPointer`,
to which it forwards, because that is a private interface.
2025-08-21 17:55:57 -07:00
Craig Topper
6167b1e6e9
[TableGen] Remove unnecessary use of utostr when writing to raw_ostream. NFC (#154800)
raw_ostream is capable of printing unsigned or uint64_t directly.
2025-08-21 17:44:53 -07:00
Sergei Barannikov
b3f04bf44c [M68k] Rename a generated file to be consistent with other targets (NFC) 2025-08-22 03:38:30 +03:00
Rahul Joshi
4eeeb8a01e
[NFC][MC][Decoder] Fix off-by-one indentation in generated code (#154855) 2025-08-21 17:20:05 -07:00