4257 Commits

Author SHA1 Message Date
paperchalice
c53acf0443
[SelectionDAGBuilder] Remove NoNaNsFPMath uses (#169904)
Replaced by checking fast-math flags or value tracking results.
2026-02-09 09:48:07 +08:00
Wael Yehia
13c9276daa
[AIX] fix aix-ifunc-toc-restore-query-neg.ll (#153049)
2026-02-05 19:00:23 +00:00
Matt Arsenault
2502e3b7ba
IR: Promote "denormal-fp-math" to a first class attribute (#174293)
Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first
class denormal_fpenv attribute. Previously the query for the effective
denormal mode involved two string attribute queries with parsing. I'm
introducing more uses of this, so it makes sense to convert this
to a more efficient encoding. The old representation was also awkward
since it was split across two separate attributes. The new encoding
just stores the default and float modes as bitfields, largely avoiding
the need to consider if the other mode is set.
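
As a rough illustration of the "two modes as bitfields" idea, the following C sketch packs a default mode and a float mode into one byte. The enum values, the 2-bit field width, the (output, input) reading of each pair, and all function names are assumptions made for this example; they are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical mode kinds. Only preservesign and dynamic appear in the
   syntax examples; the other names and all numeric values are assumed. */
enum denormal_kind {
    DK_IEEE = 0,
    DK_PRESERVE_SIGN = 1,
    DK_POSITIVE_ZERO = 2,
    DK_DYNAMIC = 3
};

/* One mode is read here as an (output, input) pair. */
struct denormal_mode {
    enum denormal_kind output;
    enum denormal_kind input;
};

/* Pack the default mode and the float mode into one byte, 2 bits per field. */
static uint8_t pack_fpenv(struct denormal_mode def, struct denormal_mode f32) {
    return (uint8_t)(def.output | (def.input << 2) |
                     (f32.output << 4) | (f32.input << 6));
}

/* Recover the default mode without touching the float bits. */
static struct denormal_mode unpack_default(uint8_t bits) {
    struct denormal_mode m;
    m.output = (enum denormal_kind)(bits & 3);
    m.input = (enum denormal_kind)((bits >> 2) & 3);
    return m;
}

/* Recover the float mode without touching the default bits. */
static struct denormal_mode unpack_float(uint8_t bits) {
    struct denormal_mode m;
    m.output = (enum denormal_kind)((bits >> 4) & 3);
    m.input = (enum denormal_kind)((bits >> 6) & 3);
    return m;
}
```

With a layout like this, querying one mode is a shift and a mask, with no string parsing and no second attribute lookup.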

The syntax in the common cases looks like this:
  `denormal_fpenv(preservesign,preservesign)`
  `denormal_fpenv(float: preservesign,preservesign)`
  `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)`

I wasn't sure about reusing the float type name instead of adding a
new keyword. It's parsed as a type but only accepts float. I'm also
debating switching the name to subnormal to match the current
preferred IEEE terminology (also used by nofpclass and other
contexts).

This has a behavior change when using the command flag debug
options to set the denormal mode. The behavior of the flag
ignored functions with an explicit attribute set, per
the default and f32 version. Now that these are one attribute,
the flag logic can't distinguish which of the two components
were explicitly set on the function. Only one test appeared to
rely on this behavior, so I just avoided using the flags in it.

This also does not perform all the code cleanups this enables.
In particular the attributor handling could be cleaned up.

I also guessed at how to support this in MLIR. I followed
MemoryEffects as a reference; it appears bitfields are expanded
into arguments to attributes, so the representation there is
a bit uglier with the 2 2-element fields flattened into 4 arguments.
2026-02-05 13:31:26 +00:00
Jay Foad
7ea33e6848
[CodeGen] Remove unused first operand of SUBREG_TO_REG (#179690)
The first input operand of SUBREG_TO_REG was an immediate that most
targets set to 0. In practice it had no effect on codegen. Remove it.
2026-02-04 17:35:21 +00:00
Nikita Popov
90c632ab48
[PowerPC] Only set QualName symbol on first section switch (#179253)
We were setting it every time when switching to the section. This caused
problems when the debug_aranges emission performed a switch at the end
of the section, resulting in symbols incorrectly pointing to the end
instead of the start of the function.
2026-02-04 10:21:07 +01:00
Wael Yehia
e2061328a8
[AIX] fix aix-ifunc-toc-restore-query.ll (#153049)
2026-02-04 03:53:59 +00:00
Wael Yehia
cc5859671d
[AIX] disable aix-ifunc-toc-restore-query-neg.ll on all platforms except ppc for now (#153049)
2026-02-04 03:44:28 +00:00
paperchalice
1ffe78811e
[PowerPC] Remove NoInfsFPMath uses (#163029)
Only `ninf` should be used.
This is the PowerPC part.
2026-02-04 00:35:15 +00:00
Wael Yehia
e1f69ee8e8
[AIX] Implement the ifunc attribute. (#153049)
Currently, the AIX linker and loader do not provide a mechanism to
implement ifuncs similar to GNU_ifunc on ELF Linux.
On AIX, we will lower `__attribute__((ifunc("resolver")))` to the llvm
`ifunc` as other platforms do. The llvm `ifunc` in turn will get lowered
at late stages of the optimization pipeline to an AIX-specific
implementation. No special linkage or relocations are needed when
generating assembly/object output.

On AIX, a function `foo` has two symbols associated with it: a function
descriptor (`foo`) residing in the `.data` section, and an entry point
(`.foo`) residing in the `.text` section. The first field of the
descriptor is the address of the entry point. Typically, the address
field in the descriptor is initialized once: statically, at load time
(?), or at runtime if runtime linking is enabled.

Here we would like to use the address field in the descriptor to
implement the `ifunc` semantics. Specifically, the ifunc function will
become a stub that jumps to the entry point in the address field. A
constructor function is linked into every linkage module. The
constructor walks an array of `{descriptor, resolver}` pairs, calling
the resolver and saving the result in the address field in the
descriptor (thus setting `foo`'s descriptor to point to the resolved
version early during program runtime).
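
The descriptor-patching scheme described above can be modelled in portable C. This is only a sketch: `struct descriptor` is a simplified stand-in for a real AIX function descriptor, and `resolve_foo`, `foo_generic`, `foo_tuned`, `cpu_has_feature`, and `init_ifuncs` are hypothetical names, not symbols from the patch.

```c
#include <stddef.h>

typedef int (*entry_fn)(int);
typedef entry_fn (*resolver_fn)(void);

/* Simplified descriptor: only the first field (entry-point address) is modelled. */
struct descriptor {
    entry_fn entry;
};

/* One element of the {descriptor, resolver} array walked at startup. */
struct ifunc_pair {
    struct descriptor *desc;
    resolver_fn resolver;
};

/* Two candidate implementations with identical behaviour. */
static int foo_generic(int x) { return x + 1; }
static int foo_tuned(int x) { return x + 1; }

/* Hypothetical feature flag consulted by the resolver. */
static int cpu_has_feature = 1;

static entry_fn resolve_foo(void) {
    return cpu_has_feature ? foo_tuned : foo_generic;
}

/* Descriptor for foo; its address field is filled in by init_ifuncs. */
static struct descriptor foo_desc = { NULL };

/* The ifunc itself becomes a stub that jumps through the address field. */
static int foo(int x) { return foo_desc.entry(x); }

/* Stand-in for the constructor: call each resolver and store its result
   in the address field of the matching descriptor. */
static void init_ifuncs(const struct ifunc_pair *pairs, size_t n) {
    for (size_t i = 0; i < n; ++i)
        pairs[i].desc->entry = pairs[i].resolver();
}
```

In this model, calling `init_ifuncs` once on a table containing `{ &foo_desc, resolve_foo }` makes every later call to `foo` dispatch to the resolved implementation.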

Known limitations:
- Due to bug #161576, which affects the object generation path, you will
need either `-ffunction-sections` or `-fno-integrated-as` to generate a
correct/linkable object file.
- aliases to ifuncs are not supported; a testcase has been added and
marked XFAIL. I'm planning to address this in a follow-up PR because it's
not important enough, IMHO, for this PR.
- dead ifuncs in a CU that contains at least one live ifunc will result
in all ifuncs being kept by the linker. The fix for this is common with
a similar problem we have with PGO. PR #159435 is trying to provide a
mechanism that will allow the ifunc and PGO implementations to avoid the
dead code retention at the link step.
- the resolver must return a function that is in the same DSO as the
ifunc; the compiler will try to detect if this condition is violated and
report it, but it cannot detect it in general. To be safe, all candidate
functions (returned by a particular resolver) must either be static or
have hidden/protected visibility. This is so that the ifunc stub doesn't
have to save and restore the TOC register r2. In future work, this case
will be supported and the requirement will be lifted.

---------

Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>
2026-02-03 14:15:16 -05:00
zhijian lin
dc520ea4af
[PowerPC] using milicode call for strcmp instead of lib call (#177009)
AIX has "millicode" routines, which are functions loaded at boot time
into fixed addresses in kernel memory. This allows them to be customized
for the processor. The __strcmp routine is a millicode implementation;
we use millicode for the strcmp function instead of a library call to
improve performance.
2026-02-02 09:34:53 -05:00
SiliconA-Z
b4797d4c03
[PowerPC] Fix miscompilation when using 32-bit ucmp on 64-bit PowerPC (#178979)
I forgot that you need to clear the upper 32 bits for the carry flag to
work properly on ppc64 or else there will be garbage and possibly
incorrect results.
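
To make the failure mode concrete, here is a small C model of a 32-bit three-way unsigned compare carried out in 64-bit registers. The function names are made up for this illustration, and the code models only the arithmetic, not the actual PowerPC instruction sequence.

```c
#include <stdint.h>

/* Correct: clear the upper 32 bits of both registers before comparing.
   Returns -1, 0, or 1. */
static int ucmp32_cleared(uint64_t a, uint64_t b) {
    uint64_t x = a & 0xFFFFFFFFu;
    uint64_t y = b & 0xFFFFFFFFu;
    return (x > y) - (x < y);
}

/* Buggy: compares the full 64-bit registers, so stale upper bits
   can decide the result. */
static int ucmp32_uncleared(uint64_t a, uint64_t b) {
    return (a > b) - (a < b);
}
```

For a register holding `0xDEADBEEF00000001` compared against `2`, the low 32 bits order as 1 < 2, yet the uncleared version reports "greater".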

Fixes: https://github.com/llvm/llvm-project/issues/179119

I do not have merge permissions.
2026-02-02 09:00:40 +01:00
陈子昂
a994198906
[DAG] Reland: Enable bitcast STLF for Constant/Undef (#178890)
This is a reland of #172523.

The original patch caused an assertion failure on RISC-V because it
attempted to create a bitcast from an illegal type (i32 on RV64) during
the post-type-legalization DAGCombine stage.

Added a `TLI.isTypeLegal(Val.getValueType())` check to ensure we only
proceed with the bitcast STLF optimization when the source value's type
is legal for the target.
2026-01-30 18:21:32 +01:00
Alex Bradbury
41f453efe2
Revert "[DAG] Enable bitcast STLF for Constant/Undef" (#178872)
Reverts llvm/llvm-project#172523

As explained in
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3823234270
(along with reproducer), this causes compiler crashes building
llvm-test-suite for RVV targets.
2026-01-30 12:18:38 +00:00
陈子昂
d3c64633c3
[DAG] Enable bitcast STLF for Constant/Undef (#172523)
This patch introduces support for Store-to-Load Forwarding (STLF) in
`DAGCombiner::ForwardStoreValueToDirectLoad` when the store and load
have **different types but equal memory size** (e.g., storing an `i32`
then loading a `float` from the same location).

### What this patch does:
**Enables Optimization:** It allows for the safe forwarding of the
stored value as a Bitcast when the value is:
* A **Constant** (`ConstantSDNode`, `ConstantFPSDNode`,
`ConstantPoolSDNode`).
    * **Undef**.
    * And the memory sizes (`LdMemSize` == `StMemSize`) match.
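
The effect of forwarding a same-size store to a load of a different type can be mimicked in C. This sketch assumes `float` is a 32-bit IEEE-754 binary32 value; the function names are illustrative only.

```c
#include <stdint.h>
#include <string.h>

/* Unoptimized shape: store an i32 to a memory slot, then load a float
   from the same slot. */
static float store_then_load(uint32_t bits) {
    unsigned char slot[sizeof(uint32_t)];
    float f;
    memcpy(slot, &bits, sizeof bits);  /* store i32 */
    memcpy(&f, slot, sizeof f);        /* load float */
    return f;
}

/* Forwarded shape: reinterpret the stored value directly (a bitcast),
   with no round trip through the slot. */
static float forwarded_bitcast(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Both functions return the same value for every input, which is why the load can be replaced by a bitcast of the stored constant.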

### Scope and Next Steps:

This patch **only implements forwarding for constant and undef values
that have the same memory size** so far.

**I am submitting this initial patch to get early review feedback on the
core logic and fix the immediate crashes before tackling the more
complex scenarios.**

For the simple case:
```llvm
; Case Handled by this PR so far (e.g., zeroinitializer is a constant)
define float @test_stlf_integer(ptr %p, float %v) {
  store i32 0, ptr %p, align 4 
  %f = load float, ptr %p, align 4 
  ; ...
}
```
Fixes: #151683
2026-01-30 10:11:59 +01:00
Maryam Moghadas
c41691c8b6
[PowerPC] Fix XXPERMDI peephole and ISEL LiveVariables bugs (#172122)
Fixes https://github.com/llvm/llvm-project/issues/159116
Prevent XXPERMDI splat optimization when the splat output register is
used in other instructions, which caused undefined register references. 
Also track removed ISEL operands in simplifyToLI to prevent 
LiveVariables corruption during ISEL-to-COPY conversion.
2026-01-27 09:44:23 -05:00
Nikita Popov
1bad00adc4
[SDAG] Remove non-canonical fabs libcall handling (#177967)
This is a followup to https://github.com/llvm/llvm-project/pull/171288,
which removed lowering of libcalls to SDAG nodes for most libcalls that
get unconditionally canonicalized to intrinsics. This handles the
remaining fabs case, which I originally skipped due to larger test
impact.
2026-01-26 15:11:17 +00:00
Matt Arsenault
aa53f6f3db
ValueTracking: Improve handling for fma/fmuladd (#175614)
The handling for fma was very basic and only handled the
repeated input case. Re-use the fmul and fadd handling for more
accurate sign bit and nan handling.
2026-01-24 11:35:14 +01:00
Sean Fertile
30fc5c1cdf
[PPC64] Convert assert in patchpoint emission to usage error. (#177453)
If the patchpoint intrinsic has requested fewer bytes than it takes to
make the call, report a fatal usage error. Also fix a bug where we
forgot to count one of the instructions emitted.
2026-01-22 18:08:49 -05:00
Simon Pilgrim
7e01b33a42
[PPC] Fix suspicious AltiVec VAVG patterns (#176891)
The existing ((X+Y+1)>>1) patterns didn't correctly handle overflow the
way the VAVG instructions do.

Remove the old patterns and correctly mark the altivec VAVGS/VAVGU
patterns as matching the ISD::AVGCEIL opcodes - the generic DAG folds
will handle everything else

I've updated the vavg.ll tests to correctly match ISD::AVGCEILS/U patterns
and added the old tests as negative "overflow" patterns that shouldn't
fold to VAVG instructions
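
The overflow problem can be reproduced with 8-bit lanes in plain C. This is a scalar sketch with made-up function names; it models one unsigned byte lane rather than an AltiVec vector.

```c
#include <stdint.h>

/* Naive pattern evaluated in the element type: the sum wraps modulo 256
   before the shift. */
static uint8_t avg_naive_u8(uint8_t x, uint8_t y) {
    uint8_t sum = (uint8_t)(x + y + 1);
    return (uint8_t)(sum >> 1);
}

/* Rounding-up average computed with a wider intermediate, so the carry
   out of the 8-bit sum is kept. */
static uint8_t avgceil_u8(uint8_t x, uint8_t y) {
    return (uint8_t)(((unsigned)x + (unsigned)y + 1u) >> 1);
}
```

The two agree for small inputs such as 3 and 4, but for 255 and 255 the naive form yields 127 while the widened form yields 255.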

Fixes #174718
2026-01-21 16:48:26 +00:00
Aditi Medhane
7cf30a7d3d
[PowerPC] Add Support for BCDSHIFT, BCDSHIFTR, BCDTRUNC, BCDUTRUNC, and BCDUSHIFT instruction support (#154715)
Support the following BCD format conversion builtins for PowerPC.

- `__builtin_bcdshift` – Shifts a packed decimal value by a specified
number of decimal digits.
- `__builtin_bcdshiftround` – Shifts a packed decimal value by a
specified number of decimal digits, with rounding applied.
- `__builtin_bcdtruncate` – Truncates a packed decimal value to a
specified number of digits.
- `__builtin_bcdunsignedtruncate` – Truncates a packed decimal value and
returns the result as an unsigned packed decimal.
- `__builtin_bcdunsignedshift` – Shifts an unsigned packed decimal value
by a specified number of digits.

> Note: These built-in functions are valid only when all of the following
conditions are met:
> - `-qarch` is set to utilize POWER9 technology.
> - The bcd.h file is included.

## Prototypes

```c
vector unsigned char __builtin_bcdshift(vector unsigned char, int, unsigned char);
vector unsigned char __builtin_bcdshiftround(vector unsigned char, int, unsigned char);
vector unsigned char __builtin_bcdtruncate(vector unsigned char, int, unsigned char);
vector unsigned char __builtin_bcdunsignedtruncate(vector unsigned char, int);
vector unsigned char __builtin_bcdunsignedshift(vector unsigned char, int);
```

---------
2026-01-21 21:34:06 +05:30
Matt Arsenault
0d4a35d560
IR: Remove llvm.convert.to.fp16 and llvm.convert.from.fp16 intrinsics (#174484)
These are long overdue for removal. These were originally a hack
to support loading half values before there was any / decent support
for the half type through the backend. There's no reason to continue
supporting these, they're equivalent to fpext/fptrunc with a bitcast.

SelectionDAG stopped translating these directly, and used the
bitcast + fp cast since f7a02c17628e825, so there's been no reason
to use these since 2014.
2026-01-21 09:50:28 +00:00
Simon Pilgrim
39028cc55a
[DAG] foldAddToAvg - add patterns to form avgceil(A, B) from ((A >> 1) + (B >> 1)) + ((A | B) & 1) (#174719)
Alive2 proof: https://alive2.llvm.org/ce/z/mcatXZ
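
Independently of the Alive2 proof, the identity in the title can be checked exhaustively for 8-bit unsigned values. The sketch below uses hypothetical function names and compares the folded form against a widened reference.

```c
#include <stdint.h>

/* Reference: rounding-up average with a wider intermediate. */
static uint8_t avgceil_wide_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Pattern from the title: ((A >> 1) + (B >> 1)) + ((A | B) & 1).
   The result never exceeds 255, so nothing wraps. */
static uint8_t avgceil_folded_u8(uint8_t a, uint8_t b) {
    return (uint8_t)((a >> 1) + (b >> 1) + ((a | b) & 1));
}

/* Count disagreements over all 65536 input pairs. */
static unsigned count_mismatches(void) {
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b)
            if (avgceil_wide_u8((uint8_t)a, (uint8_t)b) !=
                avgceil_folded_u8((uint8_t)a, (uint8_t)b))
                ++bad;
    return bad;
}
```

The identity holds because each halved operand drops its low bit, and the rounded-up average needs exactly one extra unit when at least one of those low bits was set.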

I've raised #174718 as supposedly PPC has AVGCEIL instructions, but the
patterns in PPCInstrAltivec.td are either incorrect or the instructions
don't account for overflow.

Fixes #128377
2026-01-20 12:12:51 +00:00
Maryam Moghadas
196548988e
[PowerPC] Add support for AMO store builtins (#170933)
This commit adds 4 Clang builtins for PowerPC AMO store operations:

__builtin_amo_stwat for 32-bit unsigned operations
__builtin_amo_stdat for 64-bit unsigned operations
__builtin_amo_stwat_s for 32-bit signed operations
__builtin_amo_stdat_s for 64-bit signed operations

and maps GCC's AMO store functions to these Clang builtins for
compatibility.
2026-01-19 10:58:32 -05:00
Maryam Moghadas
6f8a7e79db
[PowerPC] Add AMO load builtins for conditional increment/decrement (#169435)
This commit adds 4 Clang builtins for PowerPC AMO load conditional 
increment and decrement operations: 

 __builtin_amo_lwat_cond for 32-bit unsigned operations
 __builtin_amo_ldat_cond for 64-bit unsigned operations
 __builtin_amo_lwat_cond_s for 32-bit signed operations
 __builtin_amo_ldat_cond_s for 64-bit signed operations
2026-01-16 12:11:45 -05:00
Tony Linthicum
15b9109bc7
Make MachineBlockFrequencyInfo a required pass for the MachineScheduler pass. (#176172)
This is needed to support functionality in the AMDGPU scheduler. Various
passes have been modified to preserve MBFI to ensure that this change
does not introduce new invocations of MBFI. Some targets have passes
reordered, but there are no new runs of MBFI.
2026-01-15 20:26:51 +00:00
zhijian lin
7b90f426a6
[PowerPC] using milicode call for strstr instead of lib call (#176002)
AIX has "millicode" routines, which are functions loaded at boot time
into fixed addresses in kernel memory. This allows them to be customized
for the processor. The __strstr routine is a millicode implementation;
we use millicode for the strstr function instead of a library call to
improve performance.

I add a helper function `getRuntimeCallSDValueHelper` in the patch. I
will refactor the function `SelectionDAG::getStrlen`
`SelectionDAG::getStrcpy` etc later in another patch.
2026-01-15 14:58:17 -05:00
zhijian lin
cfefd3e46c
[NFC][PowerPC] add test cases for milicode (#175559)
In this PR, we do the following:

1. Simplify the test case for the millicode function `___memmove`.
2. Add test cases for the millicode functions `___memcpy`,
`____memset`, `____memmove` which are supported in the patch
https://reviews.llvm.org/D143997.
3. Add pre-commit test cases for the functions `___strstr`,
`___memccpy`, `___strcmp`
2026-01-14 11:46:39 -05:00
zhijian lin
b983b0e92a
[PowerPC] using milicode call for strcpy instead of lib call (#174782)
AIX has "millicode" routines, which are functions loaded at boot time
into fixed addresses in kernel memory. This allows them to be customized
for the processor. The __strcpy routine is a millicode implementation;
we use millicode for the strcpy function instead of a library call to
improve performance.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2026-01-12 08:58:45 -05:00
Himadhith
743f0a4fdb
[PowerPC] Optimize not equal compares against zero vectors (#150422)
This patch is for special cases involving 0 vectors. During the
comparison of vector operands, current code generation checks with
`vcmpequh (vector compare equal unsigned halfword)` followed by a
negation `xxlnor (VSX Vector Logical NOR XX3-form)`.

This means that for the special case, instead of using `vcmpequh` and
then negating the result, we can directly use `vcmpgtuh (vector compare
greater than unsigned halfword)`.

As a result the negation is avoided since the only condition where this
will be false is for 0 as it is an `unsigned halfword`.
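
The equivalence this relies on, that for unsigned lanes `x != 0` is the same as `x > 0`, can be sketched lane by lane in C. The helpers below only imitate the mask-producing behaviour of the vector compares on eight 16-bit lanes; their names are invented for this example.

```c
#include <stdint.h>

#define LANES 8  /* eight 16-bit lanes in a 128-bit vector */

/* Lane-wise compare-equal: all-ones mask where a[i] == b[i], else zero. */
static void cmp_eq_u16(const uint16_t *a, const uint16_t *b, uint16_t *out) {
    for (int i = 0; i < LANES; ++i)
        out[i] = (a[i] == b[i]) ? 0xFFFFu : 0u;
}

/* Lane-wise unsigned compare-greater-than. */
static void cmp_gt_u16(const uint16_t *a, const uint16_t *b, uint16_t *out) {
    for (int i = 0; i < LANES; ++i)
        out[i] = (a[i] > b[i]) ? 0xFFFFu : 0u;
}

/* Old shape: compare equal against zero, then negate with a NOR. */
static void ne_zero_eq_nor(const uint16_t *v, uint16_t *out) {
    const uint16_t zero[LANES] = {0};
    uint16_t eq[LANES];
    cmp_eq_u16(v, zero, eq);
    for (int i = 0; i < LANES; ++i)
        out[i] = (uint16_t)~(eq[i] | eq[i]);
}

/* New shape: a single unsigned greater-than against zero. */
static void ne_zero_gt(const uint16_t *v, uint16_t *out) {
    const uint16_t zero[LANES] = {0};
    cmp_gt_u16(v, zero, out);
}

/* Returns 1 when both masks match in every lane. */
static int masks_equal(const uint16_t *a, const uint16_t *b) {
    for (int i = 0; i < LANES; ++i)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Both shapes produce identical masks for any input, and the second needs one operation fewer.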

---------

Co-authored-by: himadhith <himadhith.v@ibm.com>
2026-01-12 12:29:55 +05:30
Yingwei Zheng
b8892b9a9b
[SDAG] Add freeze when simplifying select with undef arms (#175199)
Consider the following pattern:
```
%trunc = trunc nuw i64 %x to i48
%sel = select i1 %cmp, i48 %trunc, i48 undef
```
We cannot simplify `%sel` to `%trunc` as `%trunc` may be poison, which
cannot be refined into undef.

This patch checks whether the replacement may be poison. If so, it will
insert a freeze.
We may need SDAG's version of `impliesPoison` if it causes significant
regressions.
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=ded109c0cff41714ebf9bd60b073aaab07fa4ca8&to=103e605ce6b33bc9145526faf805ee38b972c215&stat=instructions%3Au

Closes https://github.com/llvm/llvm-project/issues/175018.
2026-01-10 13:49:53 +08:00
Sean Fertile
b6212a4caf
XCOFF associated metadata (#159096)
Add a new metadata node `!implicit.ref` to represent an implicit
dependency between two symbols. The metadata is unique to AIX and is
lowered to a relocation that adds an explicit link from the section in
which the annotated global is allocated to the associated symbol. This
relocation keeps the associated symbol live as long as the section is
not garbage collected. This is used mainly for compiler features where
there is some hidden runtime dependency between the symbols that isn't
otherwise obvious to the linker.
2026-01-09 13:49:21 -05:00
RolandF77
057c7a79e3
[PowerPC] Add type checking for DMF insert (#172078)
Create PPCISD nodes for DMF DMXXINSTDMR512 and DMXXINSTDMR256 operations
to allow type checking.
2026-01-08 12:34:43 -05:00
Maryam Moghadas
325869c7fc
[PowerPC] Add AMO load signed builtins (#168747)
This commit adds two Clang builtins for AMO load signed operations:

__builtin_amo_lwat_s for 32-bit signed operations
__builtin_amo_ldat_s for 64-bit signed operations
2026-01-08 11:59:54 -05:00
Trevor Gross
db26ce5c55
[PowerPC] Change half to use soft promotion rather than PromoteFloat (#152632)
On PowerPC targets, `half` uses the default legalization of promoting to
an `f32`. However, this has some fundamental issues because values cannot
always round-trip. Resolve this by switching to the soft legalization,
which passes `f16` as an `i16`.

The PowerPC ABI Specification does not define a `_Float16` type, so the
calling convention changes are acceptable.

Fixes the PowerPC part of
https://github.com/llvm/llvm-project/issues/97975
Fixes the PowerPC part of
https://github.com/llvm/llvm-project/issues/97981
2026-01-08 15:35:01 +01:00
Himadhith
b3564b2f5d
[NFC][PowerPC] fix IR to be splat and not zeroinitializer (#174699)
IR should be a splat of 7 as this compares vector of elements with 7
(`vec[i]!=7`). Having `zeroinitializer` goes against this comparison.

Co-authored-by: himadhith <himadhith.v@ibm.com>
2026-01-08 10:07:53 +05:30
Simon Pilgrim
423b2dad40
[AArch64][PPC][X86] Add test coverage for #128377 (#174661)
2026-01-06 22:46:38 +00:00
zhijian lin
448f5fe41b
[NFC][PowerPC] Pre-commit adding test case: use millicode for strcpy (#174243)
Add a pre-commit test case showing that a library call is currently used where the `___strcpy` millicode routine will be.
2026-01-06 11:31:05 -05:00
Frederik Harwath
5c05824d2b
[CodeGen] Rename expand-fp to expand-ir-insts (#172681)
The pass now contains a non-fp expansion and should
be used for any similar expansions regardless of the
types involved. Hence a generic name seems apt.

Rename the source files, pass, and adjust the pass
description. Move all tests for the expansions
that have previously been merged into the pass
to a single directory.
2025-12-18 11:15:04 +00:00
Kevin Per
98b82f90df
[PowerPC]: Add check for cast when shufflevector (#172443)
The crash happens because the cast for `Mask =
cast<ShuffleVectorSDNode>(Res)->getMask();` fails for node `t197: v16i8
= vector_shuffle<16,17,18,19,4,5,6,7,8,9,10,11,u,u,u,u> t196, t196`.
However, both `LHS` and `RHS` are the same node, so
`DAG.getCommutedVectorShuffle` doesn't return a `ShuffleVectorSDNode`
and crashes. The fix is to add a check before the cast is performed.

Closes https://github.com/llvm/llvm-project/issues/172265
2025-12-18 17:14:01 +08:00
Frederik Harwath
71760f324f
[CodeGen] Merge ExpandLargeDivRem into ExpandFp (#172680)
Both passes expand instructions at the IR level.
They use the same kind of instruction visitation
logic and contain significant code duplication e.g.
for scalarization.
2025-12-18 09:22:47 +01:00
Himadhith
c3e7a1ab8f
[NFC][PowerPC] Optimize vector compares for not equal to non zero vectors (#171635)
Lock down the instructions generated for vector compares of the form
"not equal to non-zero" (e.g. `vec[i] != 7`). The current implementation
can be improved by removing the negation and using the identity
`0xFFFF + 1 = 0` and `0 + 1 = 1`.

Co-authored-by: himadhith <himadhith.v@ibm.com>
2025-12-12 11:23:14 +05:30
Nikita Popov
b7c0452a9a
[PowerPC][AIX] Specify correct ABI alignment for double (#144673)
Add `f64:32:64` to the data layout for AIX, to indicate that doubles
have a 32-bit ABI alignment and 64-bit preferred alignment.

Clang was already taking this into account, but it was not reflected in
LLVM's data layout.

A notable effect of this change is that `double` loads/stores with 4
byte alignment are no longer considered "unaligned" and avoid the
corresponding unaligned access legalization. I assume that this is
correct/desired for AIX. (The codegen previously already relied on this
in some places related to the call ABI simply by dint of assuming
certain stack locations were 8 byte aligned, even though they were only
actually 4 byte aligned.)

Fixes https://github.com/llvm/llvm-project/issues/133599.
2025-12-11 08:57:26 +01:00
Nikita Popov
5a24dfa339
[SDAG] Remove most non-canonical libcall handing (#171288)
This is a followup to https://github.com/llvm/llvm-project/pull/171114,
removing the handling for most libcalls that are already canonicalized
to intrinsics in the middle-end. The only remaining one is fabs, which
has more test coverage than the others.
2025-12-10 11:45:26 +01:00
paperchalice
c05ba635c4
[PowerPC] Use the same lowering rule for vector rounding instructions (#166307)
They should have the same lowering rule.
2025-12-09 00:49:14 +00:00
Sean Fertile
7dfe599bda
Fix VarArgs FixedStack object on AIX. (#170240)
Create a mutable aliased fixed stack object for the va_list when any of
the optional arguments are passed in GPRs. Since we need to spill the
GPRs into the parameter save area, the stack object is not immutable.
Since the values will almost certainly be accessed through the IR value
for a va_list, make the stack object aliased as well.
2025-12-08 14:36:45 -05:00
zhijian lin
d1ad0856f8
Fix [PowerPC] llc crashed at -O1/O2/O3: Assertion `isImm() && "Wrong MachineOperand mutator"' failed. (#170548)
Fixed issue 
[[PowerPC] llc crashed at -O1/O2/O3: Assertion `isImm() && "Wrong
MachineOperand mutator"'
failed.](https://github.com/llvm/llvm-project/issues/167672)

The root cause of the crash is that the IMM operand is at a different
operand index in PPC::XXSPLTW than in PPC::XXSPLTB/PPC::XXSPLTH.

The patch also fixes a potential bug: the new element index for
PPC::XXSPLTB/PPC::XXSPLTH/XXSPLTW was computed with the same logic, but it
should differ. We need to convert the element index into the proper unit
(byte for VSPLTB, halfword for VSPLTH, word for VSPLTW) because
PPC::XXSLDWI interprets its ShiftImm in 32-bit word units.
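
The unit conversion can be illustrated with a few lines of C. This is a hypothetical sketch, not the code from the patch: it assumes a 16-byte vector, treats the word shift as a rotation (so indices wrap), and uses invented function names.

```c
#include <stdint.h>

/* Convert a shift given in 32-bit words into a count of elements that
   are elem_bytes wide (1 = byte, 2 = halfword, 4 = word). */
static unsigned word_shift_in_elements(unsigned shift_words, unsigned elem_bytes) {
    return shift_words * (4u / elem_bytes);
}

/* Element index to splat after folding a rotate by shift_words words
   into the splat, wrapping within a 16-byte vector. */
static unsigned adjusted_splat_index(unsigned elem_index, unsigned shift_words,
                                     unsigned elem_bytes) {
    unsigned num_elems = 16u / elem_bytes;
    return (elem_index + word_shift_in_elements(shift_words, elem_bytes)) % num_elems;
}
```

Under these assumptions a one-word shift moves a word index by 1, a halfword index by 2, and a byte index by 4, which is why a single shared formula cannot serve all three splat widths.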
2025-12-08 11:16:55 -05:00
YunQiang Su
c6f45f51fb
PowerPC/VSX: Select FMINNUM and FMAXNUM (#135739)
In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum
and maxNum operators in IEEE754-2008.

PowerPC/VSX does have such instructions: XSMINDP and XSMAXDP.

For now we use FMINNUM_IEEE and FMAXNUM_IEEE, since the
target-independent expansion code currently uses them.
In future, we may replace all FMINNUM_IEEE/FMAXNUM_IEEE with FMINNUM and
FMAXNUM.

---------

Co-authored-by: Your Name <you@example.com>
2025-12-08 13:18:52 +08:00
Maryam Moghadas
f650330665
[PowerPC] Add initial support for AMO load builtins (#168746)
This commit adds two Clang builtins for PowerPC AMO load operations:

__builtin_amo_lwat for 32-bit unsigned operations
__builtin_amo_ldat for 64-bit unsigned operations

Also adds an amo.h header that maps GCC's AMO functions to these Clang
builtins for compatibility.
2025-12-03 17:47:56 -05:00
Matt Arsenault
04c81a9973
CodeGen: Add LibcallLoweringInfo analysis pass (#168622)
The libcall lowering decisions should be program dependent,
depending on the current module's RuntimeLibcallInfo. We need
another related analysis derived from that plus the current
function's subtarget to provide concrete lowering decisions.

This takes on a somewhat unusual form. It's a Module analysis,
with a lookup keyed on the subtarget. This is a separate module
analysis from RuntimeLibraryAnalysis to avoid that depending on
codegen. It's not a function pass to avoid depending on any
particular function, to avoid repeated subtarget map lookups in
most of the use passes, and to avoid any recomputation in the
common case of one subtarget (and keeps it reusable across
repeated compilations).

This also switches ExpandFp and PreISelIntrinsicLowering as
a sample function and module pass. Note this is not yet wired
up to SelectionDAG, which is still using the LibcallLoweringInfo
constructed inside of TargetLowering.
2025-12-03 22:00:12 +01:00
Nikita Popov
d0f5a49fb6
[Support] Support debug counters in non-assertion builds (#170468)
This enables the use of debug counters in (non-assertion) release
builds. This is useful to enable debugging without having to switch to
an assertion-enabled build, which may not always be easy.

After some recent improvements, always supporting debug counters no
longer has measurable overhead.
2025-12-03 16:21:47 +01:00