451 Commits

Author SHA1 Message Date
Youngsuk Kim
d31e314131 [llvm] Don't call raw_string_ostream::flush() (NFC)
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
(See 65b13610a5226b84889b923bae884ba395ad084d for further reference.)
2024-09-20 12:19:59 -05:00
weiwei chen
3d0846bedc
[MC] Explicitly mark MCSymbol for MO_ExternalSymbol (#108880)
- [x] Mark `MCSymbol` for `MO_ExternalSymbol` to be external when
created.
2024-09-20 00:16:31 -04:00
Simon Pilgrim
c59ac1a2f6
[X86] Cleanup AVX512 VBROADCAST subvector instruction names. (#108888)
This patch makes the `VBROADCAST***X**` subvector broadcast instructions consistent - the `***X**` section represents the original subvector type/size, but we were not correctly using the AVX512 Z/Z256/Z128 suffix to consistently represent the destination width (or we missed it entirely).
2024-09-18 10:34:35 +01:00
Matt Arsenault
b4f3a9662d X86: Avoid using MachineFunction::getMMI 2024-07-20 13:11:40 +04:00
Feng Zou
e603451f3c
[X86] Support branch hint (#97721)
For more details about this feature, please refer to latest Intel 64 and
IA-32 Architectures Optimization Reference Manual Volume 1:
https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
2024-07-08 13:12:50 +08:00
Alexis Engelke
a89669cb6b
[X86][MC] Drop optional from LowerMachineOperand (#96338)
This caused the MCOperand to be returned in memory. An MCOperand is only
16 bytes and therefore can be returned in registers on x86-64 and
AArch64 (and others).
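
The size argument can be checked with a minimal C++ sketch. `Operand16` is a hypothetical 16-byte stand-in, not the real `MCOperand`, and the invalid-kind sentinel is only an assumption about how "no operand" could be signalled without `std::optional`. On common standard library implementations, `std::optional` stores an extra engaged flag, which pushes the wrapper past 16 bytes:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a small operand type (not the real MCOperand).
enum class OpKind : std::uint64_t { Invalid, Reg, Imm };

struct Operand16 {
  OpKind Kind = OpKind::Invalid;
  std::int64_t Value = 0;
  bool isValid() const { return Kind != OpKind::Invalid; }
};

// Two 8-byte fields: small enough to be returned in a register pair.
static_assert(sizeof(Operand16) == 16, "operand is exactly 16 bytes");
// The optional wrapper needs room for its engaged flag as well.
static_assert(sizeof(std::optional<Operand16>) > 16,
              "optional wrapper exceeds 16 bytes");

// Sketch: report "nothing to lower" with an invalid operand instead of
// an empty std::optional.
Operand16 lowerOperand(bool Lowerable, std::int64_t Imm) {
  if (!Lowerable)
    return Operand16{};
  return Operand16{OpKind::Imm, Imm};
}

// Reads the immediate payload; only meaningful for Imm operands.
std::int64_t immValue(const Operand16 &Op) {
  assert(Op.Kind == OpKind::Imm && "not an immediate operand");
  return Op.Value;
}
```

A caller would test `lowerOperand(...).isValid()` where it previously tested whether the optional held a value.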
2024-06-22 09:20:45 +02:00
Simon Pilgrim
9476671dc3
[X86] Lower vXi8 multiplies by constant using PMADDUBSW on SSSE3+ targets (#95403)
As discussed on #90748 - we can avoid unpacks/extensions from vXi8 to vXi16 by using PMADDUBSW instead and packing the vXi16 results back together.
2024-06-15 14:07:33 +01:00
Simon Pilgrim
74fe1da01e [MC][X86] addConstantComments - add mul vXi16 comments
Based on feedback from #95403 - we use multiply by constant for various lowerings (shifts, division etc.), so it's very useful to print out the constants to help understand the transform involved.

vXi16 multiplies are the easiest to add for this initial commit, but we can add other arithmetic instructions as follow ups when the need arises (I intend to add PMADDUBSW handling for #95403 next).

I've done my best to update all test checks but there are bound to be ones that got missed that will only appear when the file is regenerated.
2024-06-14 15:43:36 +01:00
Ricky Zhou
fcffea06fd
[XRay][X86] Handle conditional calls when lowering patchable tail calls (#89364)
xray instruments tail call function exits by inserting a nop sled before
the tail call. When tracing is enabled, the nop sled is replaced with a
call to `__xray_FunctionTailExit()`. This currently does not work for
conditional tail calls, as the instrumentation assumes that the tail
call will be unconditional. This causes two issues:
 - `__xray_FunctionTailExit()` is inappropriately called even when the
   tail call is not taken.
 - `__xray_FunctionTailExit()`'s prologue/epilogue adjusts the stack
   pointer with add/sub instructions. This clobbers condition flags,
   which can flip the condition used for the tail call, leading to
   incorrect program behavior.

Fix this by rewriting conditional calls when lowering patchable tail
calls.

With this change, a conditional patchable tail call like:
```
  je target
```

Will be lowered to:
```
  jne .fallthrough
  .p2align 1, ..
.Lxray_sled_N:
  SLED_CODE
  jmp target
.fallthrough:
```
2024-05-27 21:43:10 -07:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to a missing TRI argument, as shown in issue #82411.

Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have been updated accordingly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Alexandre Ganea
ec1af63dde
[Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639)
This fixes an edge case where functions starting with inline assembly
would assert while trying to lower that inline asm instruction.

After this PR, for now we always add a no-op (xchgw in this case) without
considering the size of the next inline asm instruction. We might want
to revisit this in the future.

This fixes Unreal Engine 5.3.2 compilation with clang-cl and /HOTPATCH.

Should close https://github.com/llvm/llvm-project/issues/56234
2024-04-08 20:02:19 -04:00
Phoebe Wang
f4676b6be6
[X86] Add Support for X86 TLSDESC Relocations (#83136) 2024-03-15 22:09:56 +08:00
Fangrui Song
a331937197 [MC] Move CompressDebugSections/RelaxELFRelocations from TargetOptions/MCAsmInfo to MCTargetOptions
The convention is for such MC-specific options to reside in
MCTargetOptions. However, CompressDebugSections/RelaxELFRelocations do
not follow the convention: `CompressDebugSections` is defined in both
TargetOptions and MCAsmInfo and there is forwarding complexity.

Move the option to MCTargetOptions and hereby simplify the code. Rename
the misleading RelaxELFRelocations to X86RelaxRelocations. llvm-mc
-relax-relocations and llc -x86-relax-relocations can now be unified.
2024-03-06 23:19:59 -08:00
Simon Pilgrim
448fe73428 [X86] Add X86::getVectorRegisterWidth helper. NFC.
Replaces internal helper used by addConstantComments to allow reuse in a future patch.
2024-02-08 12:42:33 +00:00
Simon Pilgrim
2096e57905 [X86] addConstantComments - add FP16 MOVSH asm comments support 2024-02-05 18:02:03 +00:00
Simon Pilgrim
f958ad3b89 [X86] printZeroUpperMove - add support for mask predicated instructions
Handle masked predicated movss/movsd in addConstantComments now that we can generically handle the destination + mask register

This will more significantly help improve 'fixup constant' comments from #73509
2024-02-05 16:23:16 +00:00
Simon Pilgrim
47dcf5d5dc [X86] printBroadcast - add support for mask predicated instructions
Handle masked predicated load/broadcasts in addConstantComments now that we can generically handle the destination + mask register

This will more significantly help improve 'fixup constant' comments from #73509
2024-02-05 16:23:15 +00:00
Simon Pilgrim
f4714204d0 [X86] printExtend - add support for mask predicated instructions
Remove handling from EmitAnyX86InstComments and handle all VPMOVSX/VPMOVZX comments in addConstantComments now that we can generically handle the destination + mask register and shuffle mask comment
2024-02-05 16:23:15 +00:00
Simon Pilgrim
de9a87301a [X86] Split up getShuffleComment into printShuffleMask and printDstRegisterName helpers. NFC.
This will allow us to easily use printDstRegisterName for other mask predicate destination registers, and print out shuffle masks from other instruction types.
2024-02-05 16:23:15 +00:00
Simon Pilgrim
1af05363d6 [X86] getShuffleComment - use MI description to determine AVX512 masked predicates instead of src index offsets. 2024-02-05 14:22:46 +00:00
Simon Pilgrim
bc6370abd3 [X86] addConstantComments - split VPERMILPS/VPERMILPD handling to reduce repeated switch cases etc. NFC. 2024-02-05 13:48:15 +00:00
Simon Pilgrim
66397435ed [X86] Add common getSrcIdx helper to determine source index after AVX512 masked predicates. NFC. 2024-02-05 13:48:15 +00:00
Simon Pilgrim
69ffa7be3b
[X86] X86FixupVectorConstants - load+zero vector constants that can be stored in a truncated form (#80428)
Further develops the vsextload support added in #79815 / b5d35feacb7246573c6a4ab2bddc4919a4228ed5 - reduces the size of the vector constant by storing it in the constant pool in a truncated form, and zero-extend it as part of the load.
2024-02-05 12:17:58 +00:00
Jie Fu
4cf2ed4396 [X86] Fix -Wsign-compare in X86MCInstLower.cpp (NFC)
llvm-project/llvm/lib/Target/X86/X86MCInstLower.cpp:1588:48:
error: comparison of integers of different signs: 'unsigned int' and 'int' [-Werror,-Wsign-compare]
  if (C && C->getType()->getScalarSizeInBits() == SrcEltBits) {
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~
1 error generated.
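
For context, a minimal sketch of why the warning matters and one common way to address it. The function names are hypothetical and this is not necessarily the change made in this commit. In a mixed comparison the signed operand is converted to unsigned, so a negative value can compare equal to a large unsigned one:

```cpp
#include <cassert>
#include <climits>

// Spells out the conversion the compiler applies implicitly for
// `unsigned == int`: the signed operand becomes unsigned first.
bool equalAfterImplicitConversion(unsigned A, int B) {
  return A == static_cast<unsigned>(B);
}

// One possible fix: state the non-negativity assumption, then compare
// both sides with the same signedness via an explicit cast.
bool equalWidths(unsigned ScalarBits, int SrcEltBits) {
  assert(SrcEltBits >= 0 && "bit widths are never negative");
  return ScalarBits == static_cast<unsigned>(SrcEltBits);
}
```

With these definitions, `equalAfterImplicitConversion(UINT_MAX, -1)` is true, which is the kind of surprise the warning guards against.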
2024-02-02 19:41:06 +08:00
Simon Pilgrim
b5d35feacb
[X86] X86FixupVectorConstants - load+sign-extend vector constants that can be stored in a truncated form (#79815)
Reduce the size of the vector constant by storing it in the constant pool in a truncated form, and sign-extend it as part of the load.

I've extended the existing FixupConstant functionality to support these sext constant rebuilds - we still select the smallest stored constant entry and prefer vzload/broadcast/vextload for same bitwidth to avoid domain flips.

I intend to add the matching load+zero-extend handling in a future PR, but that requires some alterations to the existing MC shuffle comments handling first.
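
The core eligibility check can be sketched in standalone C++. This is an illustration under assumptions, not the pass's code: `truncateIfSignExtendable` is a hypothetical helper, and the element types (32-bit values narrowed to 8 bits) are chosen only for the example. A constant can be stored in truncated form when every element survives a truncate and sign-extend round trip:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Returns the narrowed elements if every 32-bit value can be recovered by
// sign-extending its low 8 bits; otherwise returns std::nullopt.
std::optional<std::vector<std::int8_t>>
truncateIfSignExtendable(const std::vector<std::int32_t> &Elts) {
  std::vector<std::int8_t> Narrow;
  Narrow.reserve(Elts.size());
  for (std::int32_t V : Elts) {
    if (V < std::numeric_limits<std::int8_t>::min() ||
        V > std::numeric_limits<std::int8_t>::max())
      return std::nullopt; // Value does not fit in the narrow type.
    Narrow.push_back(static_cast<std::int8_t>(V));
    // Sign-extending the stored element restores the original value.
    assert(static_cast<std::int32_t>(Narrow.back()) == V &&
           "round trip must be lossless");
  }
  return Narrow;
}
```

In this example the narrowed form takes a quarter of the storage of the original elements, at the cost of an extending load.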
2024-02-02 11:28:58 +00:00
Simon Pilgrim
4318b033bd
[MC][X86] Merge lane/element broadcast comment printers. (#79020)
This is /almost/ NFC - the only annoyance is that for some reason we were using "<C1,C2,..>" for ConstantVector types unlike all other cases - these now use the same "[C1,C2,..]" format as the other constant printers.
2024-01-23 12:33:52 +00:00
Alexandre Ganea
bb28442c0b
[CodeGen][X86] Fix lowering of tailcalls when -ms-hotpatch is used (#77245)
Previously, tail jump pseudo-opcodes were skipped by the
`encodeInstruction()` call inside `X86AsmPrinter::LowerPATCHABLE_OP`.
This caused emission of a 2-byte NOP and dropping of the tail jump.

With this PR, we change `PATCHABLE_OP` so that it no longer wraps the first
`MachineInstr` but is instead inserted before it,
leaving the instruction unaltered. At lowering time in `X86AsmPrinter`,
we now "look ahead" for the next non-pseudo `MachineInstr` and
lower+encode it, to inspect its size. If the size is below what
`PATCHABLE_OP` expects, it inserts NOPs; otherwise it does nothing. That
way, now the first `MachineInstr` is always lowered as usual even if
`"patchable-function"="prologue-short-redirect"` is used.

Fixes https://github.com/llvm/llvm-project/issues/76879,
https://github.com/llvm/llvm-project/issues/76958 and
https://github.com/llvm/llvm-project/issues/59039
2024-01-22 14:19:08 -05:00
Simon Pilgrim
27eb8d53ae [X86] printConstant - add ConstantVector handling 2024-01-22 15:59:55 +00:00
Simon Pilgrim
74ab7958bd [X86] printZeroUpperMove - add support for constant vectors.
Allows cases where movss/movsd etc. are loading constant (ConstantDataSequential) sub-vectors, ensuring we pad with the correct number of zero upper elements by making repeated printConstant calls to print zeroes in a matching int/fp format.
2024-01-22 15:40:46 +00:00
Simon Pilgrim
4e64ed9780 [X86] Update X86::getConstantFromPool to take base OperandNo instead of Displacement MachineOperand
This allows us to check the entire constant address calculation, and ensure we're not performing any runtime address math into the constant pool (noticed in an upcoming patch).
2024-01-22 15:40:45 +00:00
Simon Pilgrim
60963272c5 [X86] Add printElementBroadcast constant comments helper. NFC.
Pull out helper instead of repeating switch cases.
2024-01-22 12:16:19 +00:00
Simon Pilgrim
09bd2cb70f [X86] Add printLaneBroadcast constant comments helper. NFC.
Pull out helper instead of repeating switch cases.
2024-01-22 12:16:18 +00:00
Simon Pilgrim
1a5eeade16 [X86] Add printZeroUpperMove constant/shuffle comments helper. NFC.
Pull out helper instead of repeating switch cases.
2024-01-22 11:44:51 +00:00
Jie Fu
0d51c8704c [X86] Fix -Wsign-compare in X86MCInstLower.cpp (NFC)
llvm-project/llvm/lib/Target/X86/X86MCInstLower.cpp:1867:20:
 error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare]
      if (SclWidth == C->getType()->getScalarSizeInBits()) {
          ~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
2024-01-19 22:38:47 +08:00
Simon Pilgrim
a2a0089ac3
[X86] movsd/movss/movd/movq - add support for constant comments (#78601)
If we're loading a constant value, print the constant (and the zero upper elements) instead of just the shuffle mask.

This did require me to move the shuffle mask handling into addConstantComments as we can't handle this in the MC layer.
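
As an illustration of the intended comment shape, here is a small standalone sketch. `zeroUpperComment` is a hypothetical helper, integer elements are assumed for simplicity, and the bracketed `[C1,C2,..]` layout is only an approximation of the real printer's output:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds a "[c0,c1,...]" comment: the loaded constant elements first,
// then zeroes for the upper elements of the destination register.
std::string zeroUpperComment(const std::vector<int> &Loaded,
                             std::size_t NumDstElts) {
  assert(Loaded.size() <= NumDstElts && "load is wider than destination");
  std::string Out = "[";
  for (std::size_t I = 0; I != NumDstElts; ++I) {
    if (I != 0)
      Out += ",";
    // Elements beyond the loaded constant are implicitly zero.
    Out += std::to_string(I < Loaded.size() ? Loaded[I] : 0);
  }
  Out += "]";
  return Out;
}
```

For a single loaded element 42 in a four-element destination, this produces `[42,0,0,0]`.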
2024-01-19 14:21:26 +00:00
Simon Pilgrim
110e1717b3 [X86] X86MCInstLower.cpp - fix spelling mistake 2024-01-18 15:44:27 +00:00
Simon Pilgrim
33287e35f2
[X86] Emit verbose (constant) comments before EVEX compression tag (#78585)
This helps ensure the encoding details are next to the EVEX tag

Noticed while preparing to add more constant commenting as part of #73783 and #71078
2024-01-18 15:13:42 +00:00
Simon Pilgrim
d12dffacaa [X86] Add X86::getConstantFromPool helper function to replace duplicate implementations.
We had the same helper function in shuffle decode / vector constant code - move this to X86InstrInfo to avoid duplication.
2024-01-18 11:59:46 +00:00
Shengchen Kan
4f71068b72 [X86] Correct the asm comment for compression NF_ND -> NF 2024-01-12 12:55:11 +08:00
Shengchen Kan
1c674666fa
[X86] Support EVEX compression for EGPR (#77202)
Compress promoted instruction (EVEX) to pre-promotion instruction
(legacy/VEX) when R16-R31 is not used.

Alternative to #77065
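
The register condition alone can be sketched as follows. `usesOnlyLegacyGPRs` is a hypothetical helper that takes plain register indices (0 to 31), and the real pass likely checks more than this single condition:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True when no operand uses an extended GPR (R16-R31), i.e. the
// register condition for compression described above holds.
bool usesOnlyLegacyGPRs(const std::vector<unsigned> &GprIndices) {
  return std::all_of(GprIndices.begin(), GprIndices.end(), [](unsigned R) {
    assert(R < 32 && "GPR index out of range");
    return R < 16;
  });
}
```

An instruction whose operands are all in R0-R15 passes the check, while any use of R16-R31 keeps the EVEX form.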
2024-01-08 16:50:23 +08:00
Simon Pilgrim
d1deeae094
[X86] Rename VBROADCASTF128/VBROADCASTI128 to VBROADCASTF128rm/VBROADCASTI128rm (#75040)
Add missing rm postfix to show these are load instructions
2023-12-11 11:52:53 +00:00
Simon Pilgrim
2ed15877e7 [X86] Ensure asm comments only print the constant values for the vector load's register width
We were printing the entire Constant, which, when loading from a wider constant pool entry, cluttered the asm comment with upper bits that aren't actually part of the load result.
2023-11-17 14:30:30 +00:00
Kazu Hirata
9c5a5a421d [llvm] Stop including llvm/ADT/iterator_range.h (NFC)
Identified with misc-include-cleaner.
2023-10-22 15:41:18 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Elliot Goodrich
b0abd4893f [llvm] Add missing StringExtras.h includes
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
2023-06-25 15:42:22 +01:00
prabhukr
30198bd788 [Triple] Add triple for UEFI
Target triple to support "x86_64-unknown-uefi"

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D131594
2023-06-06 08:42:28 -07:00
Simon Pilgrim
f1a42300aa [X86] printConstant - fix asm comment issue when broadcasting from a wider constant pool entry
In cases where a broadcast op is loading from a constant entry wider than the broadcast element, we were incorrectly printing the entire entry and not just the lower bits referenced by the broadcast.
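
A minimal sketch of the bit selection involved, assuming a little-endian target (as on x86) and a constant pool entry of at most 64 bits. `lowBits` is a hypothetical helper, not the function changed by this commit:

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the low `Bits` bits of a constant pool entry, i.e. the part
// that a narrower broadcast element actually reads on little-endian.
std::uint64_t lowBits(std::uint64_t Entry, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "element width out of range");
  if (Bits == 64)
    return Entry; // Shifting by 64 would be undefined, so handle it here.
  return Entry & ((std::uint64_t{1} << Bits) - 1);
}
```

For a 64-bit entry `0x1122334455667788` and a 32-bit broadcast element, only `0x55667788` belongs in the comment.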
2023-05-31 12:28:17 +01:00
Simon Pilgrim
0f8e0f4228 [X86] lowerBuildVectorAsBroadcast - broadcast Constant of original (BuildVector) element size
Noticed in D150143/D150526 - we currently create scalar Constant values using the broadcast instruction width, which might be wider than the original build vector width, making it tricky to recognise the original constant bits data.

If we have widened the broadcast value, it's much more useful for asm comments if we create a ConstantVector with the original element data, add that to the constant-pool and load that with the same (wider) broadcast instruction.
2023-05-27 14:05:44 +01:00
Kyle Huey
3be667ae5a [X86] Use the CFA when appropriate for better variable locations around calls.
Without frame pointers, the locations of variables on the stack are emitted
relative to the stack pointer (via the stack pointer being the value of
DW_AT_frame_base on the subprogram). If a call modifies the stack pointer
this results in the locations being wrong and the debugger displaying the
wrong values for variables.

By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer
(provided LLVM is emitting the correct CFI directives elsewhere, of course).
The CFA needs to be adjusted for the size of the stack frame (including the
return address) to allow the variable locations themselves to remain
unchanged by this patch.

Certain LLDB features cannot cope with DW_OP_call_frame_cfa, so this change
is heuristically limited to the cases where it's necessary for correctness
to minimize the fallout there.

Reviewed By: #debug-info, scott.linder, jryans, jmorse

Differential Revision: https://reviews.llvm.org/D143463
2023-05-23 20:24:55 +00:00
Shengchen Kan
c81a121f3f Revert "Revert "[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI""
This reverts commit cb16b33a03aff70b2499c3452f2f817f3f92d20d.

In fact, the test https://bugs.chromium.org/p/chromium/issues/detail?id=1446973#c2
already passed after 5586bc539acb26cb94e461438de01a5080513401
2023-05-19 22:21:56 +08:00