3998 Commits

Author SHA1 Message Date
Hubert Tong
e438513f2e
[AIX][AsmPrinter] Fix unsigned subtraction wrap-around (#122214)
Unsigned subtraction wrap-around occurs in `emitGlobalConstantImpl` on
an AIX-specific code path from 8e4423eb0888 when a structure type has
zero elements.

With assertions enabled, this manifests as:
```
TypeSize llvm::StructLayout::getElementOffset(unsigned int) const: Assertion `Idx < NumElements && "Invalid element idx!"' failed.
```
2025-01-09 00:07:57 -04:00
Fangrui Song
5e0be962fe
[PowerPC] Support PIC Secure PLT for CALL_RM
https://reviews.llvm.org/D111433 introduced PPCISD::CALL_RM for
-frounding-math. -msecure-plt -frounding-math {-fpic,-fPIC} codegen for
PPC32 became incorrect when a function contains function calls but no
global variable references (GlobalBaseReg).

As reported by @q66 , musl/src/dirent/closedir.c implements such a
function, which is miscompiled.

PPCISD::CALL has custom logic to set up the base register
(https://reviews.llvm.org/D42112). Add an extra case for CALL_RM.

While here, improve the test to

* actually test `case PPCISD::CALL`: we need a non-leaf function that
  doesn't access global variables (global variables lead to
  GlobalBaseReg, which call `getGlobalBaseReg()` as well).
* test `ExternalSymbolSDNode` with a memset.

Supersedes: #72758

Pull Request: https://github.com/llvm/llvm-project/pull/121281
2025-01-06 08:59:42 -08:00
Craig Topper
4dfea22e77
[ExpandMemCmp][AArch64][PowerPC][RISCV][X86] Use llvm.ucmp instead of (sub (zext (icmp ugt)), (zext (icmp ult))). (#121530)
AArch64 and PowerPC look like a improvements.
RISC-V is neutral.
X86 trades a dependency breaking xor before a seta for a movsx after a
sbbb. Depending on how the result is used, this movsx might go away.
2025-01-03 09:19:32 -08:00
Simon Pilgrim
b3a7ab6f1f [DAG] Don't allow implicit truncation in extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C)) fold
Limits #117900 to only fold when scalar_to_vector doesn't perform implicit truncation, as the scaled shift calculation doesn't currently account for this - this can be addressed in a future update.

Fixes #121306
2024-12-30 16:08:35 +00:00
Fangrui Song
9efa7d7af3
Remove -print-lsr-output in favor of --stop-after=loop-reduce
Pull Request: https://github.com/llvm/llvm-project/pull/121305
2024-12-29 18:58:30 -08:00
Fangrui Song
7b23f413d1 MCAsmStreamer: Omit initial ".text"
llvm-mc --assemble prints an initial `.text` from `initSections`.
This is weird for quick assembly tasks that do not specify `.text`.

Omit the .text by moving section directive printing from `changeSection`
to `switchSection`. switchSectionNoPrint now correctly calls the
`changeSection` hook (needed by MachO).

The initial directives of clang -S are now reordered. On ELF targets, we
get `.file "a.c"; .text` instead of `.text; .file "a.c"`.
If there is no function, `.text` will be omitted.
2024-12-22 22:03:44 -08:00
Benjamin Maxwell
a7dafea384
[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117)
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:

1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
* If it is, the expansion would lead to nested call sequences (which is
invalid)
2. The node does not appear as a predecessor to the store
* If it does, attempting to merge the store into the call would result
in a cycle in the DAG

These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`
2024-12-17 10:54:17 +00:00
Matt Arsenault
bb18e49edb
RegAlloc: Use DiagnosticInfo to report register allocation failures (#119492)
Improve the non-fatal cases to use DiagnosticInfo, which will now
provide a location. The allocators attempt to report different errors
if it happens to see inline assembly is involved (this detection is
quite unreliable) using srcloc instead of dbgloc. For now, leave this
behavior unchanged. I think reporting the full location and context
function would be more useful.
2024-12-16 10:49:08 +09:00
Fangrui Song
133352feb3 [test] Remove redundant -march= when target triple is specified in IR 2024-12-15 12:42:17 -08:00
Stefan Pintilie
67eb05b292
[PowerPC] Add special handling for arguments that are smaller than pointer size. (#119003)
When arguments are passed in memory instead of registers we currently
load the entire pointer size even though the argument may be smaller.
For exmaple if the pointer size if i32 then we use a load word even if
the argument is only an i8. This patch zeros / extends the bits that are
not required to ensure that we are getting the correct value even if the
load is larger.
2024-12-12 09:43:53 -05:00
Guillaume DI FATTA
a1ee1a9126
[CodeGen] @llvm.experimental.stackmap make operands immediate (#117932)
This pull request modifies the behavior of the
`@llvm.experimental.stackmap` intrinsic to require that its two first
operands (`id` and `numShadowBytes`) be **immediate values**. This
change ensures that variables cannot be passed as two first arguments to
this intrinsic.


Related Issue: https://github.com/llvm/llvm-project/issues/115733

### Testing
- Added new test cases to ensure errors are emitted for non-immediate
operands.
- Ran the full LLVM test suite to verify no regressions were introduced.
2024-12-11 17:41:19 +08:00
zhijian lin
4d06623b28
recalculate the live interval of the defined register of xvmaddmdp in the VSX FMA mutation pass. (#116071)
The patch fix https://github.com/llvm/llvm-project/issues/116061

The root cause of the assertion is that the FMA mutation pass does not
update the subranges of the live interval for the defined register of
the modified instruction .

it recalculate the live interval of the defined register of xvmaddmdp in
the VSX FMA mutation pass.
2024-12-10 11:21:15 -05:00
Amy Kwan
f31099ce58
[PowerPC][AIX] Emit PowerPC version for XCOFF (#113214)
This PR emits implements the ability to emit the PPC version for both
assembly and object files on AIX.
2024-12-10 11:11:50 -05:00
Lei Huang
a13ec9cd54
[PowerPC] Update data layout aligment of i128 to 16 (#118004)
Fix 64-bit PowerPC part of
https://github.com/llvm/llvm-project/issues/102783.
2024-12-09 18:02:24 -05:00
Maryam Moghadas
68e75eebec
[PPC] Custom lower ssubo for i64 (#118711)
This is a follow-up patch to improve the codegen for ssubo node for i64
in 64-bit mode by custom lowering.
2024-12-05 17:22:44 -05:00
zhijian lin
6b5c67bd16
[PowerPC][Backend] using signed extend value instead of zero extend value for isIntS34Immediate() (#118703)
The patch fix the issue
https://github.com/llvm/llvm-project/issues/118695
2024-12-05 09:08:18 -05:00
Simon Pilgrim
b1a48af56a
[DAG] SimplifyDemandedVectorElts - add handling for INT<->FP conversions (#117884) 2024-12-04 07:37:01 +00:00
Zaara Syeda
935bbbbde4
[PPC] Remove missed cases of ppc-merge-string-pool (#117626)
PPCMergeStringPool was replaced with GlobalMerge with commit aaa37d6.
Some cases of option ppc-merge-string-pool were missed being removed.
2024-12-03 13:31:26 -05:00
Simon Pilgrim
31b7d4333a
[DAG] Extend extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C)) (#117900)
When extracting a smaller integer from a scalar_to_vector source, we were limited to only folding/truncating the lowest bits of the scalar source.

This patch extends the fold to handle extraction of any other element, by right shifting the source before truncation.

Fixes a regression from #117884
2024-11-29 17:24:38 +00:00
Maryam Moghadas
dab4121a55
[PowerPC] Add custom lowering for ssubo (#111748) (#115875)
This patch is to improve the codegen for ssubo node for i32 by custom lowering.
2024-11-28 13:55:53 -05:00
Maryam Moghadas
66d350a017 [PowerPC][NFC] Pre-commit test case to prepare for patch to custom lower ssubo 2024-11-28 10:37:03 -05:00
Nikita Popov
04a2d50efd [PPC] Use getSignedConstant() for frame index offset
The offset is signed. Fixes assertion failure reported at:
https://github.com/llvm/llvm-project/pull/117558#issuecomment-2504413074
2024-11-28 10:49:45 +01:00
RolandF77
a475180498
[PowerPC] Use setbc for values from vector compare conditions (#114858)
For P10 use the setbc instruction to get int values from vector compare
summary condition results.
2024-11-27 12:47:10 -05:00
Zaara Syeda
b1a34b80b8
[NFC][Test] Fix PowerPC test gcov_ctr_ref_init.ll (#117577) 2024-11-26 12:09:49 -05:00
Zaara Syeda
8e4423eb08
[AsmPrinter] Fix handling in emitGlobalConstantImpl for AIX (#116255)
When GlobalMerge creates a MergedGlobal of statics all initialized to
zero, emitGlobalConstantImpl sees a ConstantAggregateZero. This results
in just emitting zeros followed by labels for the aliases. We need to
handle it more like how emitGlobalConstantStruct does by emitting each
global inside the aggregate.

---------

Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
2024-11-19 09:58:25 -05:00
Akshat Oke
3f9d02aae8
[CodeGen][NewPM] Port PeepholeOptimizer to NPM (#116326)
With this, all machine SSA optimization passes are available in the new codegen pipeline.
2024-11-18 11:02:01 +05:30
Sergei Barannikov
032014ef10
[PowerPC] Add SDNPMemOperand to some nodes (#115580)
Nodes created with `getMemIntrinsicNode` have memory operands. In order
for operands to be propagated to machine instructions, the nodes should
have `SDNPMemOperand` property.

Similar to 3c8c385a.
2024-11-15 20:36:56 +03:00
Jake Egan
48cc435109
Reland "[PowerPC] Add error for incorrect use of memory operands (#114277)" (#115958)
Commit 93589057830b2c3c35500ee8cac25c717a1e98f9 was reverted because it
caused a failure with test `lld :: ELF/ppc64-local-exec-tls.s`. This
relands the commit with a fix for the test.
2024-11-13 22:24:19 -05:00
Tex Riddell
5c2a133b13
Emit constrained atan2 intrinsic for clang builtin (#113636)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin
- `clang/test/CodeGenCXX/builtin-calling-conv.cpp` - Use erff instead of
atan2 for clang builtin to lib call calling convention check, now that
atan2 maps to an intrinsic.
- add atan2 cases to llvm.experimental.constrained tests for more
backends: ARM, PowerPC, RISCV, SystemZ.
- LangRef.rst: add llvm.experimental.constrained.atan2, revise
llvm.atan2 description.

Last part of Implement the atan2 HLSL Function. Fixes #70096.
2024-11-12 13:34:29 -08:00
Benjamin Maxwell
014455a587
[SDAG] Limit sincos/frexp stack slot folding to stores chained to entry (#115906)
When the chain is not the entry node there is a risk the stores are
within a (CALLSEQ_START, CALLSEQ_END), which when the node is expanded
will lead to nested call sequences.

It should be possible to check for this and allow more cases, but for
now, let's limit this to cases where it's definitely safe.

Fixes #115323
2024-11-12 20:48:41 +00:00
Zaara Syeda
aaa37d6755
[PPC] Replace PPCMergeStringPool with GlobalMerge for Linux (#114850)
Enable merging all constants without looking at use in GlobalMerge by
default to replace PPCMergeStringPool pass on Linux.
2024-11-12 14:02:01 -05:00
Jake Egan
0e52a0721e Revert "[PowerPC] Add error for incorrect use of memory operands (#114277)"
This commit broke a test on a couple bots
lld :: ELF/ppc64-local-exec-tls.s

This reverts commit 93589057830b2c3c35500ee8cac25c717a1e98f9.
2024-11-12 04:03:06 -05:00
Jake Egan
9358905783
[PowerPC] Add error for incorrect use of memory operands (#114277)
If an instruction doesn't support memory operands, but one is provided,
an error should be raised. And conversely, if an instruction requires a
memory operand, but none is given, an error should be raised.
2024-11-12 03:00:06 -05:00
Amy Kwan
4981f8cb72
[PowerPC] Fix vector_shuffle combines when inputs are scalar_to_vector of differing types. (#80784)
This patch fixes the combines for vector_shuffles when either or both of
its left and right hand side inputs are scalar_to_vector nodes.

Previously, when both left and right side inputs are scalar_to_vector
nodes, the current combine could not handle this situation, as the shuffle
mask was updated incorrectly. To temporarily solve this solution, this combine
was simply disabled and not performed.

Now, not only does this patch aim to resolve the previous issue of the
incorrect shuffle mask adjustments respectively, but it also updates any test
cases that are affected by this change.

Patch migrated from https://reviews.llvm.org/D130487.
2024-11-11 10:53:51 -05:00
Nikita Popov
dd116369f6
[InstSimplify] Fix incorrect poison propagation when folding phi (#96631)
We can only replace phi(X, undef) with X, if X is known not to be
poison. Otherwise, the result may be more poisonous on the undef branch.

Fixes https://github.com/llvm/llvm-project/issues/68683.
2024-11-07 14:09:45 +01:00
abhishek-kaushik22
d2aff182d3
Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.

Based on the discussions in #112772, this pass is not needed after the
introduction of `llvm.threadlocal.address` intrinsic.

Fixes https://github.com/llvm/llvm-project/issues/112771.
2024-11-07 10:10:28 +01:00
Benjamin Maxwell
ea6b8fa4b9
[SDAG] Merge multiple-result libcall expansion into DAG.expandMultipleResultFPLibCall() (#114792)
This merges the logic for expanding both FFREXP and FSINCOS into one
method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication
and also allows FFREXP to benefit from the stack slot elimination
implemented for FSINCOS. This method will also be used in future to
implement more multiple-result intrinsics (such as modf and sincospi).
2024-11-06 11:06:06 +00:00
Simon Pilgrim
9540a7ae82
[DAG] SimplifyMultipleUseDemandedBits - bypass ADD nodes if either operand is zero (#112588)
The dpbusd_const.ll test change is due to us losing the expanded add reduction pattern as one of the elements is known to be zero (removing one of the adds from the reduction pyramid). I don't think its of concern.

Noticed while working on #107423
2024-11-05 17:20:41 +00:00
Simon Pilgrim
aef0e77c76
[DAG] visitAND - Fold (and (srl X, C), 1) -> (srl X, BW-1) for signbit extraction (#114992)
If we're masking the LSB of a SRL node result and that is shifting down an extended sign bit, see if we can change the SRL to shift down the MSB directly.

These patterns can occur during legalisation when we've sign extended to a wider type but the SRL is still shifting from the subreg.

Alternative to #114967

Fixes the remaining regression in #112588
2024-11-05 14:42:15 +00:00
zhijian lin
2cd32132db
[PowerPC] Utilize getReservedRegs to find asm clobberable registers. (#107863)
This patch utilizes getReservedRegs() to find asm clobberable registers.
And to make the result of getReservedRegs() accurate, this patch
implements the todo, which is to make r2 allocatable on AIX for some
leaf functions.
2024-11-04 12:57:26 -05:00
zhijian lin
01463a269e
Fixed test case error which caused the default cpu is changed from generic to ppc64 for triple powerpc64 (#114828)
[Summary]
Fixed test case error in the
https://lab.llvm.org/buildbot/#/builders/33/builds/5801 which
  caused by the commit https://github.com/llvm/llvm-project/pull/113943
2024-11-04 11:37:35 -05:00
zhijian lin
a51712751c
[PowerPC][LLC] Utilize PPC::getNormalizedPPCTargetCPU() to set CPU (#113943)
Utilize common API in PPCTargetParser
(https://github.com/llvm/llvm-project/pull/97541) to set default CPU
with same interfaces for LLC.
This will update AIX default CPU to pwr7 and LoP powerppc64 default CPU
to ppc64.
2024-11-04 09:40:54 -05:00
Ramkumar Ramachandra
5e75880165
CodeGen/test: improve a test, regen with UTC (#113338) 2024-11-04 11:29:25 +00:00
Maryam Moghadas
c7c5042e3c
Revert "[PowerPC] Add custom lowering for ssubo (#111748)" (#114672)
This reverts commit 8a0cb9ac869334fd6c6bd6aad8408623a7ccd7f6.
Reverting due to PPC bootstrap bot failure.
2024-11-02 10:48:36 -04:00
zhijian lin
674574d25c
Promote 32bit pseudo instr that infer extsw removal to 64bit in PPCMIPeephole (#85451)
Fixes:   https://github.com/llvm/llvm-project/issues/71030

Bug only happens in 64bit involving spills. Since we don't know when the
spill will happen, all instructions in the chain used to deduce sign
extension for eliminating 'extsw' will need to be promoted to 64-bit
pseudo instructions.

The following instruction will promoted in PPCMIPeepholes: EXTSH, LHA,
ISEL to EXTSH8, LHA8, ISEL8
2024-10-31 15:49:36 -04:00
Zaara Syeda
155956834a
Fix failing test gcov_ctr_ref_init.ll (#114428)
Fix test failing after merge of commit:
ccddd136024305be8b6aa77e4ce576d8e6521529
2024-10-31 13:08:16 -04:00
Zaara Syeda
ccddd13602
Enable aggressive constant merge in GlobalMerge for AIX (#113956)
Enable merging all constants without looking at use in GlobalMerge by
default to replace PPCMergeStringPool pass on AIX.
2024-10-31 11:22:48 -04:00
Maryam Moghadas
8a0cb9ac86
[PowerPC] Add custom lowering for ssubo (#111748)
This patch is to improve the codegen for ssubo node for i32 in 64-bit
mode by custom lowering.
2024-10-29 15:43:05 -04:00
Simon Pilgrim
32aa782ea2 [PowerPC] copysignl.ll - regenerate to reduce the diff in #111269 2024-10-29 10:57:43 +00:00
Serge Pavlov
819abe412d
[Test] Fix usage of constrained intrinsics (#113523)
Some tests contain errors in constrained intrinsic usage, such as missed
or extra type parameters, wrong type parameters order and some other.

---------

Co-authored-by: Andy Kaylor <andy_kaylor@yahoo.com>
2024-10-28 14:07:32 +07:00