The `bool serial` condition in scanRelocations disabled parallelism for
three cases: -z nocombreloc, MIPS, and PPC64. Resolve two cases:
- nocombreloc: .rela.dyn is now always created with combreloc=true so
non-relative relocations are sorted deterministically. Since
#187964 already separates relative relocations unconditionally,
the only remaining effect of -z nocombreloc is suppressing
DT_RELACOUNT (gated on ctx.arg.zCombreloc in DynamicSection).
- PPC64: After #181496 moved scanning into scanSectionImpl, the
sole thread-unsafe access is ctx.ppc64noTocRelax (DenseSet::insert).
Protect it with ctx.relocMutex, which is already used for rare
operations during parallel scanning.
MIPS retains serial scanning due to `MipsGotSection` mutations.
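The PPC64 fix follows a common pattern: a rarely hit shared container is guarded by a mutex so the rest of the scan can stay parallel. A minimal sketch of that pattern, assuming hypothetical stand-ins (`Ctx`, `noTocRelax` as a `std::unordered_set<int>`, `recordNoTocRelax`, `scanInParallel`) rather than lld's real `ctx` and `DenseSet`:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the linker context: one shared set plus the
// mutex that guards it. Not lld's actual data structures.
struct Ctx {
  std::mutex relocMutex;
  std::unordered_set<int> noTocRelax;
};

// The only shared mutation; it takes the lock for the duration of the insert.
void recordNoTocRelax(Ctx &ctx, int key) {
  std::lock_guard<std::mutex> lock(ctx.relocMutex);
  ctx.noTocRelax.insert(key);
}

// Each worker inserts a disjoint range of keys; returns the final set size.
std::size_t scanInParallel(Ctx &ctx, int numThreads, int keysPerThread) {
  assert(numThreads > 0 && keysPerThread >= 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < numThreads; ++t)
    workers.emplace_back([&ctx, t, keysPerThread] {
      for (int i = 0; i < keysPerThread; ++i)
        recordNoTocRelax(ctx, t * keysPerThread + i);
    });
  for (std::thread &w : workers)
    w.join();
  return ctx.noTocRelax.size();
}
```

Because the insert is rare relative to the rest of the scan, the lock is unlikely to become a bottleneck in this sketch.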
This test doesn't work as intended when an alternative default linker is
specified via `-DCLANG_DEFAULT_LINKER=ld`. If this test isn't intended
to support an alternate default linker, lmk and I can just change the
downstream usage I'm seeing, though I figure other folks may have
similar configurations. Repro:
```
cmake -S llvm -B build -DLLVM_ENABLE_PROJECTS="clang" -DCLANG_DEFAULT_LINKER=ld -GNinja
ninja -C build
./build/bin/llvm-lit -v clang/test/DebugInfo/CXX/hotpatch.cpp
...
possible intended match
# | 6: "/usr/bin/ld" "-out:hotpatch.exe" "-libpath:lib/amd64" "-libpath:atlmfc/lib/amd64" "-nologo" "-functionpadmin" "/tmp/lit-tmp-o7x0r1o_/hotpatch-4595de.obj"
```
afaict it passed before because `-mincremental-linker-compatible` was
being used until e97a42d5f9fe51de50aabd4d9bf6874a4955f9fa, which would
match on the compilation line.
`__arm_agnostic("sme_za_state")` does not require +sme, but we must
still preserve ZA in case the function is used with code that makes use
of ZA:
> The use of `__arm_agnostic("sme_za_state")` allows writing functions
> that are compatible with ZA state without having to share ZA state
> with the caller, as required by `__arm_preserves`. The use of this
> attribute does not imply that SME is available.
A kernel developer noticed that I missed a call to index the local
filesystem in one of our codepaths, and had a use case that depended on
that working.
rdar://173814556
In some situations, such as the one reported at
https://github.com/llvm/llvm-project/pull/177953#issuecomment-4179014239,
LLVM_(DEFAULT_)TARGET_TRIPLE is not set. It is used to derive the output
directory in #177953. Only flang-rt currently uses
RUNTIMES_(INSTALL|OUTPUT)_RESOURCE_LIB_PATH, so we should not fail building
other runtimes despite a missing LLVM_TARGET_TRIPLE.
Compiler-rt instead uses COMPILER_RT_DEFAULT_TARGET_TRIPLE, which it
derives itself. Most other LLVM runtime libraries just skip the target
portion of the library path (explicitly so since #93354). Do the same
for RUNTIMES_(INSTALL|OUTPUT)_RESOURCE_LIB_PATH, which we hope can
eventually replace the other mechanisms.
…(#135079)"
This reverts commit a757f23404c594f4a48b4ddb6625f88b349d11d5. The commit
causes compilation of the reduce.cu file in hipcub/warp to go from
2 minutes to several hours.
Many tests have ad hoc forms of the launch & break steps done by
`lldbutil.run_to_source_breakpoint`. This changes some of those tests to
use `run_to_source_breakpoint` instead.
Assisted-by: claude
We had a bug where exceptions caught with catch-all were not properly
handling a thrown exception if the catch-all handler enclosed a cleanup
handler. The structured CIR was generated correctly, but when we
flattened the CFG and introduced cir.eh.initiate operations, the
cir.eh.initiate for the cleanup's EH path was incorrectly marked as
cleanup-only, even though it chained to the dispatch for the catch-all
handler. This resulted in the landing pad generated for the cleanup not
being marked as having a catch-all handler, so the exception was not
caught.
This change fixes the problem in the FlattenCFG pass.
Assisted-by: Cursor / claude-4.6-opus-high
This PR adds the TableGen-generated headers from
https://github.com/llvm/llvm-project/pull/187610 to the HLSL
distribution.
Currently the HLSL distribution is incomplete due to missing these
generated headers, preventing successful compilation:
```
Command Output (stderr):
--
In file included from <built-in>:1:
In file included from D:\a\_work\1\ClangHLSL\Binaries\lib\clang\23\include\hlsl.h:24:
D:\a\_work\1\ClangHLSL\Binaries\lib\clang\23\include\hlsl/hlsl_alias_intrinsics.h:42:10: fatal error: 'hlsl_alias_intrinsics_gen.inc' file not found
42 | #include "hlsl_alias_intrinsics_gen.inc"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
```
This PR fixes the error by including `hlsl_alias_intrinsics_gen.inc` and
`hlsl_inline_intrinsics_gen.inc` in the HLSL distribution.
This allows a build system to direct Clang to prune a module cache
directory using the same method Clang does internally.
This also changes `clang::maybePruneImpl` to clean up files directly in
the directory, not just subdirectories.
This moves the LLVM_LIBC_IS_DEFINED macro to its own header in
__support/macros. Its implementation leverages cpp::string_view
instead of rolling its own strcmp; this necessitated fixing
several missing constexpr in the string_view implementation.
The new __support/macros/macro-utils.h is also broken out to hold
the stringification macro and can be used in future for token
pasting shenanigans and other such generic macro machinery.
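The stringify-and-compare idea behind such a macro can be sketched as follows. This is an illustration only: the `DEMO_*` names are hypothetical, and `std::string_view` stands in for `cpp::string_view`. The macro stringifies its argument twice, once after expansion and once verbatim, and reports the macro as defined when the two strings differ.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical names for illustration; not the actual LLVM libc macros.
// Two-level stringification so the argument is macro-expanded first.
#define DEMO_STRINGIFY_IMPL(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_IMPL(x)

// Defined macros expand to something other than their own name, so the
// expanded and unexpanded spellings compare unequal.
#define DEMO_IS_DEFINED(macro)                                                 \
  (std::string_view(DEMO_STRINGIFY(macro)) != std::string_view(#macro))

#define DEMO_FEATURE_ON 1
#define DEMO_FEATURE_EMPTY

// The comparison is a constant expression (requires C++17).
static_assert(DEMO_IS_DEFINED(DEMO_FEATURE_ON));
static_assert(DEMO_IS_DEFINED(DEMO_FEATURE_EMPTY));
static_assert(!DEMO_IS_DEFINED(DEMO_FEATURE_MISSING));
```

One limitation of this sketch: a macro defined to expand to its own name is reported as undefined.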
We had an off-by-one error in the CIR generation for array destructor
loops, causing us to miss destructing one element of the array. This
change fixes the problem.
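For reference, the loop shape that such codegen has to reproduce can be written in plain C++. This is a sketch of the source-level semantics only, not the CIR that is emitted; `Tracked`, `destroyArray`, and `constructAndDestroy` are hypothetical helpers. The cursor starts one past the last element and is decremented before each destructor call, so element 0 is included.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical element type that counts destructor calls.
struct Tracked {
  static inline int destroyed = 0;
  ~Tracked() { ++destroyed; }
};

// Destroy `count` elements in reverse order. Decrementing before the call
// and stopping when the cursor reaches `begin` covers every element exactly
// once, including the first one.
template <typename T> void destroyArray(T *begin, std::size_t count) {
  T *cur = begin + count;
  while (cur != begin) {
    --cur;
    cur->~T();
  }
}

// Construct `count` elements in raw storage, destroy them with the loop
// above, and return how many destructor calls were observed.
int constructAndDestroy(std::size_t count) {
  assert(count > 0);
  std::allocator<Tracked> alloc;
  Tracked *elems = alloc.allocate(count);
  for (std::size_t i = 0; i < count; ++i)
    std::allocator_traits<std::allocator<Tracked>>::construct(alloc, elems + i);
  Tracked::destroyed = 0;
  destroyArray(elems, count);
  alloc.deallocate(elems, count);
  return Tracked::destroyed;
}
```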
As discussed in #182203, use enums instead.
I tried to name/use them appropriately, but I'm not sure I'm really
happy with the results; suggestions welcome.
For signed int-to-FP casts, ComputeNumSignBits can prove exactness where
computeKnownBits cannot -- e.g. through ashr(shl x, a), b where sign propagation is
tracked precisely but individual known bits are all unknown.
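A worked sketch of the reasoning, under stated assumptions: a W-bit signed value with S sign bits lies in [-2^(W-S), 2^(W-S) - 1], and a float format with a p-bit significand represents every integer of magnitude up to 2^p exactly, so W - S <= p is sufficient for an exact cast. The helpers below are hypothetical and work on concrete i32 values with equal shift amounts; they are not LLVM's ComputeNumSignBits.

```cpp
#include <cassert>
#include <cstdint>

// Count the leading bits equal to the sign bit, including the sign bit.
unsigned numSignBits(int32_t v) {
  uint32_t u = static_cast<uint32_t>(v);
  uint32_t sign = u >> 31;
  unsigned n = 1;
  for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
    ++n;
  return n;
}

// ashr(shl x, a), a for 0 <= a < 32. The unsigned-to-signed conversion and
// the arithmetic right shift are guaranteed since C++20 and behave this way
// on mainstream compilers before that.
int32_t shlThenAshr(int32_t x, unsigned a) {
  assert(a < 32);
  uint32_t shifted = static_cast<uint32_t>(x) << a;
  return static_cast<int32_t>(shifted) >> a;
}

// Sufficient condition for a signed int-to-FP cast to be exact.
bool signedToFPIsExact(unsigned width, unsigned signBits,
                       unsigned significandBits) {
  return width - signBits <= significandBits;
}
```

With `a = 8`, every result has at least 9 sign bits even though no individual bit is known, so 32 - 9 = 23 <= 24 and a cast to `float` (24-bit significand) is exact.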
Summary:
`Status` is unfortunately heavily overloaded in practice. Things like
X11 define it as a macro. Best to just remove that possibility entirely.
Change `NVVM_SyncWarpOp` base class from `NVVM_Op` to
`NVVM_IntrOp<"bar.warp.sync">`, which auto-generates `llvmEnumName =
nvvm_bar_warp_sync` and registers it with
`-gen-intr-from-llvmir-conversions` and
`-gen-convertible-llvmir-intrinsics`. This enables LLVM IR to MLIR
import. The hand-written `llvmBuilder` is removed as the default
`LLVM_IntrOpBase` builder is equivalent.
Rename several arguments to intrinsic related functions from `ArgsTys`
to `OverloadTys` to better reflect their meaning. The only variables
left with name `ArgTys` now actually mean function argument types.
Also remove an incorrect comment in Intrinsics.td. Dependent types do
allow forward references starting with
7957fc6547.
The evaluation order of function arguments is unspecified by the C++
standard. We had two getNode calls as function arguments which causes
the nodes to be created in a different order depending on the compiler
used. This patch moves them to their own variables to ensure they are
called in the same order on all compilers.
Possible fix for #190148.
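The pattern can be illustrated with a small sketch; `Builder`, `getNode`, and `makePair` here are hypothetical stand-ins, not the SelectionDAG API. When two side-effecting calls appear as arguments of the same call, either may run first; hoisting them into named locals pins the order.

```cpp
#include <cassert>
#include <utility>

// Hypothetical node factory: each call hands out the next creation index.
struct Builder {
  int nextId = 0;
  int getNode() { return nextId++; }
};

std::pair<int, int> makePair(int lhs, int rhs) { return {lhs, rhs}; }

// Order-dependent: the compiler may evaluate either getNode() call first,
// so the pair can be {0, 1} or {1, 0}.
std::pair<int, int> buildUnspecified(Builder &b) {
  return makePair(b.getNode(), b.getNode());
}

// Deterministic: separate statements are sequenced, so lhs is always
// created before rhs.
std::pair<int, int> buildDeterministic(Builder &b) {
  int lhs = b.getNode();
  int rhs = b.getNode();
  return makePair(lhs, rhs);
}
```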
The `-mno-incremental-linker-compatible` switch translates to the Brepro
linker flag and must be passed on to the underlying linker to match the
behavior of the Windows triples that produce PE COFF.
Make sure that the module has a target triple set before trying to parse
machine functions. This can be required for (downstream) targets if MIR
parsing relies on features guarded by the target triple.
Use CMake's native MACHO_COMPATIBILITY_VERSION and MACHO_CURRENT_VERSION
properties rather than manually pass linker flags. These properties are
available since CMake 3.17.0, released in 2020.
This commit adds the GetDimensions methods to Texture2D. For DXIL, it
requires intrinsics that are not yet available. They are added, but not
implemented.
Assisted-by: Gemini
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
This test technically does not require libc++. The test binary mimics
libc++'s namespace layout to trigger some frame hiding logic in lldb,
but it does not require libc++ to function.
This is explicitly marked as a libc++ test and functionally tests the
formatter for a vector of enums. I put it in the generic directory
because there's no reason this couldn't work for other C++ stdlibs.
Additionally, this should be using the custom libc++ like the other
tests.