58069 Commits

Author SHA1 Message Date
Brad Smith
89d636ba91
[Support] Fix building on FreeBSD and OpenBSD (#127005)
Fix building after a6f7cb54d3c268ea4748a0ff783b4b030c3195d9.

Check for the function getauxval() instead of just the sys/auxv.h
header.
2025-02-12 22:55:22 -05:00
Tristan Ross
a6f7cb54d3
[Support] Prefer AUX vector for page size (#126863)
Prefers the page size to come from the AUX vector, `getpagesize` is
removed from POSIX.1-2001. Also throws in a couple asserts to ensure the
page size is a valid value.
2025-02-13 11:39:49 +11:00
Shubham Sandeep Rastogi
92f916faba
Add a pass to collect dropped var statistics for MIR (#126686)
This patch attempts to reland
https://github.com/llvm/llvm-project/pull/120780 while addressing the
issues that caused the patch to be reverted.

Namely:

1. The patch had included code from the llvm/Passes directory in the
llvm/CodeGen directory.

2. The patch increased the backend compile time by 2% due to adding a
very expensive include in MachineFunctionPass.h

The patch has been re-structured so that there is no dependency between
the llvm/Passes and llvm/CodeGen directory, by moving the base class,
`class DroppedVariableStats` to the llvm/IR directory.

The expensive include in MachineFunctionPass.h has been changed to
contain forward declarations instead of other header includes which was
pulling a ton of code into MachineFunctionPass.h and should resolve any
issues when it comes to compile time increase.
2025-02-12 14:08:18 -08:00
Florian Hahn
82605285b8 [LAA] Also clear CheckingGroups in RuntimePointerChecking::reset.
This fixes a crash when trying to print access-info in the newly added
test cases.
2025-02-12 21:49:22 +01:00
vporpo
7a7f9190d0
[SandboxVec][Legality] Fix mask on diamond reuse with shuffle (#126963)
This patch fixes a bug in the creation of shuffle masks when vectorizing
vectors in case of a diamond reuse with shuffle. The mask needs to
enumerate all elements of a vector, not treat the original vector value
as a single element. That is: if vectorizing two <2 x float> vectors
into a <4 x float> the mask needs to have 4 indices, not just 2.
2025-02-12 12:29:09 -08:00
Harald van Dijk
23209eb1d9 Revert "[DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)"
This reverts commit 3ec9f7494b31f2fe51d5ed0e07adcf4b7199def6.
2025-02-12 17:50:39 +00:00
Harald van Dijk
3ec9f7494b
[DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)
After #124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred), or with a deprecation warning an Instruction *, or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.
2025-02-12 17:38:59 +00:00
Kazu Hirata
e9e717f405
[Utils] Avoid repeated hash lookups (NFC) (#126856) 2025-02-12 08:48:16 -08:00
Kazu Hirata
71cceb1439
[CodeGen] Avoid repeated hash lookups (NFC) (#126852) 2025-02-12 08:45:53 -08:00
Kazu Hirata
df09290407
[Analysis] Avoid repeated hash lookups (NFC) (#126851) 2025-02-12 08:45:27 -08:00
Nick Sarnie
cb3498c670
[OpenMP][OpenMPIRBuilder] Support SPIR-V device variant matches (#126801)
We should be able to use `spirv64` as a device variant match and it
should be considered a GPU.

Also add the triple to an RTTI check.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-02-12 16:40:05 +00:00
Vy Nguyen
fc4d87100a
Define -DLLVM_BUILD_TELEMETRY to be used in ifdef (#126746)
Background: 

Telemetry code isn't always built (controlled by this
LLVM_BUILD_TELEMETRY cmake flag)
This means users of the library may not have the library. So we're
definding the `-DLLVM_BUILD_TELEMETRY` to be used in ifdef.
2025-02-12 09:33:52 -05:00
Nikita Popov
f085261b59 [IRBuilder] Add additional overload with in-place Inserter construction (NFC)
Currently, for IRBuilders that require an explicitly constructed
Folder, we also force Inserter to be constructed and then copied.
Provide a variant where the Inserter uses in-place default
construction, to support cases where it is self-referential.
2025-02-12 15:29:47 +01:00
Akshat Oke
7b60e03d73
Reland "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126684)
`RegisterClassInfo` was supposed to be kept alive between pass runs,
which wasn't being done leading to recomputations increasing the compile
time.

Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.

---------

Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
2025-02-12 18:54:39 +05:30
Haohai Wen
ec28e9b757
[MC] Replace MCContext::GenericSectionID with MCSection::NonUniqueID (#126202)
They have same semantics. NonUniqueID is more friendly for isUnique
implementation in MCSectionELF.

History: 97837b7 added support for unique IDs in sections and added
GenericSectionID. Later, 1dc16c7 added NonUniqueID.
2025-02-12 14:28:37 +08:00
Sam Elliott
d222488007
[AsmParser] Remove OperandMatchResultTy (#126650)
This has been deprecated since a479be0f39a3301e9ca634d37cf6454b6d3865c6
from September 2023, before LLVM 18. Surely now enough release cycles
have happened that it can be removed upstream.
2025-02-11 21:59:05 -08:00
Abhishek Kaushik
df2dca7a73
[MC] Use std::move to avoid copy (#126700) 2025-02-12 10:01:30 +05:30
Daniel Hoekwater
3a22cf9bd8
[CFIFixup] Fixup CFI for split functions with synchronous uwtables (#125299)
- **Precommit tests for synchronous uwtable CFI fixup**
- **[CFIFixup] Fixup CFI for split functions with synchronous uwtables**

Commit
6e54fccede
disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable *block-level* CFI fixup for functions with
synchronous tables.

Unwind tables can be:
- N/A (not present)
- Asynchronous
- Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:
```
(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>
```

Even if `Foo` has a synchronous unwind table, we still need to insert
call frame information into `BB3` so that unwinding the call stack from
`BB3` or `BB4` works properly.
2025-02-11 18:25:08 -05:00
Lang Hames
84fe1f63b0
[ORC] Switch to singleton pattern for UnwindInfoManager. (#126691)
The find-dynamic-unwind-info callback registration APIs in libunwind
limit the number of callbacks that can be registered. If we use multiple
UnwindInfoManager instances, each with their own own callback function
(as was the case prior to this patch) we can quickly exceed this limit
(see https://github.com/llvm/llvm-project/issues/126611).

This patch updates the UnwindInfoManager class to use a singleton
pattern, with the single instance shared between all LLVM JITs in the
process.

This change does _not_ apply to compact unwind info registered through
the ORC runtime (which currently installs its own callbacks).

As a bonus this change eliminates the need to load an IR "bouncer"
module to supply the unique callback for each instance, so support for
compact-unwind can be extended to the llvm-jitlink tools (which does not
support adding IR).
2025-02-12 10:00:10 +11:00
vporpo
07600f80c7
[SandboxVec][Scheduler] Update ready list comparator (#126160)
This patch implements a hierarchical comparator for the ready list. PHIs
have higher priority than non-phis and terminators are always last.
2025-02-11 13:52:02 -08:00
Alireza Torabian
3c74430320
[DependenceAnalysis][NFC] Removing PossiblyLoopIndependent parameter (#124615)
Parameter PossiblyLoopIndependent has lost its intended purpose. This
flag is always set to true in all cases when depends() is called, hence
we want to reconsider the utility of this variable and remove it from
the function signature entirely. This is an NFC patch.
2025-02-11 16:23:28 -05:00
Philip Reames
e4016bf5c3 [DAG] Use ArrayRef to simplify ShuffleVectorSDNode::isSplatMask 2025-02-11 12:47:10 -08:00
Kazu Hirata
67e1e98811 Revert "[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891)"
This reverts commit 070f84ebc89b11df616a83a56df9ac56efbab783.

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/51/builds/10694
2025-02-11 12:39:01 -08:00
Zahira Ammarguellat
070f84ebc8
[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891)
Implement basic parsing and semantic support for `#pragma omp stripe`
constuct introduced in
https://www.openmp.org/wp-content/uploads/[OpenMP-API-Specification-6-0.pdf](https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf),
section 11.7.
2025-02-11 13:58:21 -05:00
Benjamin Maxwell
19556eccf6
[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705)
This makes the name more consistent with the other helpers.
2025-02-11 11:51:35 +00:00
Benjamin Maxwell
701223ac20
[IR] Add llvm.sincospi intrinsic (#125873)
This adds the `llvm.sincospi` intrinsic, legalization, and lowering
(mostly reusing the lowering for sincos and frexp).

The `llvm.sincospi` intrinsic takes a floating-point value and returns
both the sine and cosine of the value multiplied by pi. It computes the
result more accurately than the naive approach of doing the
multiplication ahead of time, especially for large input values.

```
declare { float, float }          @llvm.sincospi.f32(float  %Val)
declare { double, double }        @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincospi.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincospi.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float>  %Val)
```

Currently, the default lowering of this intrinsic relies on the
`sincospi[f|l]` functions being available in the target's runtime (e.g.
libc).
2025-02-11 09:01:30 +00:00
Abhilash Majumder
6a961dc03d
[NVPTX] Add intrinsics for prefetch.* (#125887)
\[NVPTX\] Add Prefetch intrinsics

This PR adds prefetch intrinsics with the relevant eviction priorities.
* Lit tests are added as part of prefetch.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.

For more information, refer PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-11 14:24:46 +05:30
jeanPerier
99e1308c41
[mlir][LLVM] handle argument and result attributes in llvm.call and llvm.invoke (#123177)
Update llvm.call/llvm.invoke pretty printer/parser and the llvm ir import/export
to deal with the argument and result attributes.

This patch is made on top of PR 123176 that modified the
CallOpInterface and added the argument and result attributes to
llvm.call and llvm.invoke without doing anything with them.

RFC: https://discourse.llvm.org/t/mlir-rfc-adding-argument-and-result-attributes-to-llvm-call/84107
2025-02-11 09:39:51 +01:00
Rahul Joshi
0f674cce82
[NFC][LLVM] Remove unused TargetIntrinsicInfo class (#126003)
Remove `TargetIntrinsicInfo` class as its practically unused (its pure
virtual with no subclasses) and its references in the code.
2025-02-10 14:56:30 -08:00
Nico Weber
872aaddba9 Revert "Modify dwarfdump verification to allow sub-category counts (#125062)"
This reverts commit 13f63010784d8d55620fa7846ac2192f20f95113.
Breaks check-llvm.
2025-02-10 15:08:46 -05:00
youngd007
13f6301078
Modify dwarfdump verification to allow sub-category counts (#125062)
It was discovered that BOLT had several distinct issues of missing debug
information by various tags for debug names (119493 & 119023 as
examples), but the verification of a DWARF with llvm-dwarfdump prior to
those fixes only gave one 'missing name' category.
```
{"error-categories":{"Name Index DIE entry missing name":{"count":36355210}},"error-count":36355210}
```
To more easily leverage dwarf verification for debug health, the JSON
output will be improved to allow having detailed counts by a
sub-category when it makes sense.
For now, this is only implemented on the missing tags, but can be
extended to more.
```
{"error-categories":{"Name Index DIE entry missing name":{"count":10,"details":{"DW_TAG_inlined_subroutine":1,"DW_TAG_label":1,"DW_TAG_namespace":2,"DW_TAG_subprogram":2,"DW_TAG_variable":4}}},"error-count":10}
```

This diff also modifies the tests created in pull request 124936 (not
yet landed) to ensure the JSON switches. Ideally this lands after that
but it did not correctly create a stack of pull requests.
2025-02-10 11:06:15 -08:00
Fangrui Song
ad61e53333
[ARM] Move MCStreamer::emitThumbFunc to ARMTargetStreamer
MCStreamer should not declare arch-specific functions. Such functions
should go to MCTargetStreamer.

Move MCMachOStreamer::emitThumbFunc to ARMTargetMachOStreamer, which is
a new subclass of ARMTargetStreamer. (The new class is just placed in
ARMMachObjectWriter.cpp. The conventional split like
ARMELFObjectWriter.cpp/ARMELFObjectWriter.cpp is overkill.)

`emitCFILabel`, called by ARMWinCOFFStreamer.cpp, has to be made public.

Pull Request: https://github.com/llvm/llvm-project/pull/126199
2025-02-10 09:40:43 -08:00
Nick Sarnie
f3cd223838
[OpenMP][OpenMPIRBuilder] Add initial changes for SPIR-V target frontend support (#125920)
As Intel is working to add support for SPIR-V OpenMP device offloading
in upstream clang/liboffload, we need to modify the OpenMP frontend to
allow SPIR-V as well as generate valid IR for SPIR-V. For example, we
need the frontend to generate code to define and interact with device
globals used in the DeviceRTL.

This is the beginning of what I expect will be (many) other changes, but
let's get started with something simple.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-02-10 16:16:40 +00:00
Kazu Hirata
de563951b7
[Analysis] Avoid repeated hash lookups (NFC) (#126465) 2025-02-10 07:50:12 -08:00
Ramkumar Ramachandra
3019e49ebf
SCEV: thread samesign in isBasicBlockEntryGuardedByCond (NFC) (#125840)
isBasicBlockEntryGuardedByCond inadvertedenly drops samesign information
when calling ICmpInst::getNonStrictPredicate. Fix this.
2025-02-10 14:47:13 +00:00
Kazu Hirata
db348c8e8b
[Passes] Avoid repeated hash lookups (NFC) (#126404) 2025-02-09 08:55:55 -08:00
Jonas Devlieghere
7c60725fcf
Revert "Remove dependence on <ciso646>" (#126399)
Reverts llvm/llvm-project#73273
2025-02-08 20:22:15 -08:00
Michael Kenzel
c89735d289
Remove dependence on <ciso646> (#73273)
C++23 removed `<ciso646>` from the standard library. The header is used
in two places: Once in order to pull in standard library macros. Since
this file also includes `<optional>`, that use of `<ciso646>` is
technically redundant, but should probably be left in in case a future
change ever removes the include of `<optional>`. A second use of
`<ciso646>` appears to have been introduced in
da650094b187ee3c8017d74f63c885663faca1d8, but seems unnecessary (the
file doesn't seem to use anything from that header, and it seems to
build just fine on MSVC here without it). The new `<version>` header
should be supported by all supported implementations. This change
replaces uses of `<ciso646>` with the `<version>` header, or removes
them entirely where unnecessary.
2025-02-08 18:48:01 -08:00
vporpo
69b8cf4f06
[SandboxVec][BottomUpVec] Add cost estimation and tr-accept-or-revert pass (#126325)
The TransactionAcceptOrRevert pass is the final pass in the Sandbox
Vectorizer's default pass pipeline. It's job is to check the cost
before/after vectorization and accept or revert the IR to its original
state.

Since we are now starting the transaction in BottomUpVec, tests that run
a custom pipeline need to accept the transaction. This is done with the
help of the TransactionAlwaysAccept pass (tr-accept).
2025-02-08 08:34:18 -08:00
Akshat Oke
564b9b7f4d
Revert "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126268)
This reverts commit 5aa4979c47255770cac7b557f3e4a980d0131d69 while I
investigate what's causing the compile-time regression.
2025-02-08 15:36:48 +05:30
joaosaffran
76985fd7ca
[DXIL] Adding support to RootSignatureFlags in obj2yaml (#122396)
This PR adds:
- `RootSignatureFlags` extraction from DXContainer using `obj2yaml`


This PR is part of: #121493

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-02-07 14:19:19 -08:00
Rahul Joshi
fb1216e829
[NFC][GlobalISel] Minor cleanup in LegalityQuery constructors (#126285)
- Remove a redundant LegalityQuery constructor by using a default value
for `MMODescrs` and remove const for ArrayRef arguments.
- Use a delegating constructor for `MemDesc` constructor that takes
`MachineMemOperand`.
2025-02-07 13:07:18 -08:00
Durgadoss R
f3040498f0
[NVPTX] Add tcgen05 wait/fence/commit intrinsics (#126091)
This patch adds intrinsics for tcgen05 wait,
fence and commit PTX instructions.

lit tests are added and verified with a
ptxas-12.8 executable.

Docs are updated in the NVPTXUsage.rst file.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-07 22:10:25 +05:30
Yashas Andaluri
a361de6d13
[RDF] Create phi nodes for clobbering defs (#123694)
When a def in a block A reaches another block B that is in A's iterated
dominance frontier, a phi node is added to B for the def register.

A clobbering def can be created at a call instruction, for a register
clobbered by a call.
However, phi nodes are not created for a register, when one of the
reaching defs of the register is a clobbering def.

This patch adds phi nodes for registers that have a clobbering reaching
def. These additional phis help in checking reaching defs for an
instruction in RDF based copy propagation and addressing mode
optimizations.
2025-02-07 08:28:29 -06:00
Sjoerd Meijer
612df14c00
[Clang][Driver] Add an option to control loop-interchange (#125830)
This introduces options `-floop-interchange` and `-fno-loop-interchange`
to enable/disable the loop-interchange pass. This is part of the work
that tries to get that pass enabled by default (#124911), where it was
remarked that a user facing option to control this would be convenient
to have. The option name is the same as GCC's.
2025-02-07 10:31:24 +00:00
Benjamin Maxwell
4bf97aa818
[IR] Add llvm.modf intrinsic (#121948)
This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly
reusing the lowering for sincos and frexp).

The `llvm.modf` intrinsic takes a floating-point value and returns both
the integral and fractional parts (as a struct).

```
declare { float, float }             @llvm.modf.f32(float  %Val)
declare { double, double }           @llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 }       @llvm.modf.f80(x86_fp80  %Val)
declare { fp128, fp128 }             @llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }     @llvm.modf.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float>  %Val)
```

This corresponds to the libm `modf` function but returns multiple values
in a struct (rather than take output pointers), which makes it easier to
vectorize.
2025-02-07 09:25:13 +00:00
Lang Hames
e2eaf8ded7 [ORC] Force eh-frame use for older Darwins on x86-64 in MachOPlatform, LLJIT.
The system libunwind on older Darwins does not support JIT registration of
compact-unwind. Since the CompactUnwindManager utility discards redundant
eh-frame FDEs by default we need to remove the compact-unwind section first
when targeting older libunwinds in order to preserve eh-frames.

While LLJIT was already doing this as of eae6d6d18bd, MachOPlatform was not.
This was causing buildbot failures in the ORC runtime (e.g. in
https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/3479/).

This patch updates both LLJIT and MachOPlatform to check a bootstrap value,
"darwin-use-ehframes-only", to determine whether to forcibly preserve
eh-frame sections. If this value is present and set to true then compact-unwind
sections will be discarded, causing eh-frames to be preserved. If the value is
absent or set to false then compact-unwind will be used and redundant FDEs in
eh-frames discarded (FDEs that are needed by the compact-unwind section are
always preserved).

rdar://143895614
2025-02-07 17:04:05 +11:00
Lang Hames
63bb4ba84a [ORC] Add ExecutionSession convenience methods to access bootstrap values.
The getBootstrapMap, getBootstrapMapValue, getBootstrapSymbolsMap, and
getBootstrapSymbols methods forward to their respective counterparts in
ExecutorProcessControl, similar to the callWrapper functions.

These methods will be used to simplify an upcoming patch that accesses
the bootstrap values.
2025-02-07 17:04:05 +11:00
Ming-Yi Lai
a1984ec5ea
[llvm-readobj][ELF][RISCV] Dump .note.gnu.property section contents (#125642)
RISCV Zicfilp/Zicfiss extensions uses the `.note.gnu.property` section
to store flags indicating the adoption of features based on these
extensions. This patch enables the llvm-readobj/llvm-readelf tools to
dump these flags with the `--note` flag.
2025-02-07 13:55:16 +08:00
Kazu Hirata
4590f755cf
[Analysis] Avoid repeated hash lookups (NFC) (#126011) 2025-02-06 16:23:04 -08:00