llvm-project

Author	SHA1	Message	Date
Qihan Cai	5f0515debd	[RISCV] Support Remaining P Extension Instructions for RV32/64 (#150379 ) This patch implements pages 15-17 from jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf Documentation: jhauser.us/RISCV/ext-P/RVP-baseInstrs-014.pdf jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf	2025-08-20 22:54:07 +10:00
Paul Walker	d6a688fb3d	[LLVM][CodeGen][SME] hasB16b16() is not sufficient to prove BFADD availability. (#154143 ) The FEAT_SVE_B16B16 arithmetic instructions are only available to streaming mode functions when SME2 is available.	2025-08-20 11:12:43 +01:00
SivanShani-Arm	460e9a8837	[LLVM][AArch64] Build attributes: Support switching to a defined subsection by name only (#154159 ) The AArch64 build attribute specification now allows switching to an already-defined subsection using its name alone, without repeating the optionality and type parameters. This patch updates the parser to support that behavior. Spec reference: https://github.com/ARM-software/abi-aa/pull/230/files	2025-08-20 10:50:48 +01:00
Link	46e77ebf71	[RISCV][NFC] Ensure files end with newline. (#154457 ) Add trailing newlines to the following files to comply with POSIX standards: - llvm/lib/Target/RISCV/RISCVInstrInfoXSpacemiT.td - llvm/test/MC/RISCV/xsmtvdot-invalid.s - llvm/test/MC/RISCV/xsmtvdot-valid.s Closes #151706	2025-08-20 17:18:16 +08:00
Gang Chen	ef68d1587d	[AMDGPU] upstream barrier count reporting part1 (#154409 )	2025-08-19 16:42:31 -07:00
Craig Topper	60aa0d4bfc	[RISCV] Add P-ext MC support for pli.dh, pli.db, and plui.dh. (#153972 ) Refactor the pli.b/h/w and plui.h/w tablegen classes.	2025-08-18 08:23:14 -07:00
Jonathan Thackray	f38c83c582	[AArch64][llvm] Disassemble instructions in `SYS` alias encoding space more correctly (#153905 ) For instructions in the `SYS` alias encoding space which take no register operands, and where the unused 5 register bits are not all set (0x1f, 0b11111), disassemble to a `SYS` alias and not the instruction, since it is not considered valid. This is because it is specified in the Arm ARM in text similar to this (e.g. page C5-1037 of DDI0487L.b for `TLBI ALLE1`, or page C5-1585 for `GCSPOPX`): ``` Rt should be encoded as 0b11111. If the Rt field is not set to 0b11111, it is CONSTRAINED UNPREDICTABLE whether: * The instruction is UNDEFINED. * The instruction behaves as if the Rt field is set to 0b11111. ``` Since we want to follow "should" directives, and not encourage undefined behaviour, only assemble or disassemble instructions considered valid. Add an extra test case for this; all existing test cases continue to pass.	2025-08-18 14:41:41 +01:00
ZhaoQi	8f671a675f	[LoongArch] Always emit symbol-based relocations regardless of relaxation (#153943 ) This commit changes all relocations to be relocated with symbols. Without this commit, errors may occur in some cases, such as when using `llc/lto+relax`, or combining relaxed and norelaxed object files using `ld -r`. Some tests updated.	2025-08-18 20:15:49 +08:00
林克	6842cc5562	[RISCV] Add SpacemiT XSMTVDot (SpacemiT Vector Dot Product) extension. (#151706 ) The full spec can be found at spacemit-x60 processor support scope: Section 2.1.2.2 (Features): https://developer.spacemit.com/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb#2.1 This patch only supports assembler.	2025-08-18 18:03:17 +08:00
ZhaoQi	76fb1619f0	[LoongArch] Reduce number of reserved relocations when relax enabled (#153769 )	2025-08-18 17:42:43 +08:00
ZhaoQi	8181c76bca	[LoongArch][NFC] More tests to ensure branch relocs reserved when relax enabled (#153768 )	2025-08-18 16:07:36 +08:00
ZhaoQi	6957e44d8e	[LoongArch][MC] Refine conditions for emitting ALIGN relocations (#153365 ) According to the suggestions in https://github.com/llvm/llvm-project/pull/150816, this commit refines the conditions for emitting R_LARCH_ALIGN relocations. Some existing tests are updated to avoid being affected by this optimization. New tests are added to verify: removal of redundant ALIGN relocations, ALIGN emitted after the first linker-relaxable instruction, and conservatively emitted ALIGN in lower-numbered subsections.	2025-08-18 14:54:27 +08:00
Craig Topper	4a3b69920b	[RISCV] Accept [-128,255] instead of [0, 255] for pli.b (#153913 ) pli.h and pli.w both accept signed immediates, so pli.b should too. But unlike those instructions, pli.b doesn't do any extension, so it's OK to accept an unsigned immediate as well.	2025-08-17 21:39:08 -07:00
Sergei Barannikov	76d993bd25	[Hexagon] Add missing operand when disassembling Y4_crswap10 (#153849 ) Auto-generated decoder fails to add the $sgp10 operand because it has no encoding bits. Work around this by adding the missing operand after decoding is complete. Fixes #153829.	2025-08-16 02:13:43 +00:00
Craig Topper	e67ec12640	[RISCV] Remove experimental from Smctr and Ssctr. (#153903 ) These extensions were ratified in November 2024.	2025-08-15 17:18:09 -07:00
Craig Topper	e2eaea412a	[RISCV] Add MC support for more P extension instructions. (#153629 ) This implements pages 10-14 from https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf Test cases copied from #123271 with a couple mistakes fixed. Co-authored-by: realqhc <caiqihan021@hotmail.com>	2025-08-14 23:23:28 -07:00
Stanislav Mekhanoshin	57c1e01e48	[AMDGPU] Don't allow wgp mode on gfx1250 (#153680 ) - gfx1250 only supports cu mode	2025-08-14 15:16:56 -07:00
Craig Topper	cba5f1b6c1	[RISCV] Add MC support for P extensions with scalar second operands. (#153502 ) These are the instructions from page 8 and the second half of page 9 here in https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf Co-authored-by: realqhc <caiqihan021@hotmail.com>	2025-08-14 07:03:36 -07:00
Stanislav Mekhanoshin	80d430df5d	[AMDGPU] Add MSG_SAVEWAVE_HAS_TDM on gfx1250 (#153483 )	2025-08-13 23:01:50 -07:00
Stanislav Mekhanoshin	fc911fe928	[AMDGPU] Add HW_REG_IB_STS2 on gfx1250 (#153479 )	2025-08-13 23:01:28 -07:00
Stanislav Mekhanoshin	cc0d227154	[AMDGPU] Disable s_setkill on gfx1250 (#153471 )	2025-08-13 23:01:04 -07:00
Craig Topper	ace08d5ccf	[RISCV] Add MC support for more P extension instructions. (#153458 ) These instructions are the shift by immediate and saturate by immediate instructions from the top half of page 9 of https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf I've also improved the CHECK lines in the invalid tests to check line and column number from the diagnostic. Co-authored-by: realqhc <caiqihan021@hotmail.com>	2025-08-13 22:07:03 -07:00
Craig Topper	e9b4e68928	[RISCV] Fix CHECK line for pslli.b in rv64p-valid.s. NFC	2025-08-13 09:01:12 -07:00
Jonathan Thackray	7bd0c5fa66	[AArch64][llvm] Unify AArch64 tests into a single file (4/4) (NFC) (#146331 ) This is a series of patches (4/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * removes .txt tests whose .s tests have functions * makes the .s tests have a roundabout run line to test both encoding and assembly See also #146328, #146329 and #146330. Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>	2025-08-13 14:40:41 +00:00
Jonathan Thackray	8453f205eb	[AArch64][llvm] Unify AArch64 tests into a single file (3/4) (NFC) (#146330 ) This is a series of patches (3/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * removes .txt tests which have multiple feature dependencies * makes the .s tests have a roundabout run line to test both encoding and assembly * creates diagnostic tests when needed See also #146328, #146329 and #146331. Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>	2025-08-13 14:07:28 +00:00
Jonathan Thackray	b878793739	[AArch64][llvm] Unify AArch64 tests into a single file (2/4) (NFC) (#146329 ) This is a series of patches (2/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * removes .txt tests which have only one feature required * makes the .s tests have a roundabout run line to test both encoding and assembly * creates diagnostic tests when needed * fixes naming convention of tests See also #146328, #146330 and #146331. Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>	2025-08-13 13:39:45 +00:00
Jonathan Thackray	69452d50ce	[AArch64][llvm] Unify AArch64 tests into a single file (1/4) (NFC) (#146328 ) This is a series of patches (1/4) to unify assembly/disassembly of recent AArch64 tests into a single file. The aim is to improve consistency, so that all instructions and system registers are thoroughly tested, and future test cases will be in a unified format. This patch: * unifies errorless .s and .txt tests into a single file * remove .txt tests which don't have feature requirements * makes the .s tests have a roundabout run line to test both encoding and assembly See also #146329, #146330 and #146331. --------- Co-authored-by: Virginia Cangelosi <virginia.cangelosi@arm.com>	2025-08-13 13:45:25 +01:00
Stanislav Mekhanoshin	d0ee82040c	[AMDGPU] Add s_barrier_init|join|leave instructions (#153296 )	2025-08-12 15:07:07 -07:00
Sterling-Augustine	2e9944a03e	Generate an .sframe section with a skeleton header (#151223 ) This continues the sframe implementation discussed previously. Of note, this also adds some target dependent functions to the object file. Additional fields will be needed later. It would be possible to do all of this inside the sframe implementation itself if it feels a little messy and specialized, but generally I think that target info goes with target info. Another question is if we want a sentinel value for unimplemented sframe abi arches, or a std::optional. Both work.	2025-08-12 12:57:31 -07:00
Sam Elliott	9e8f7acd2b	[RISCV] Track Linker Relaxable through Assembly Relaxation (#152602 ) Span-dependent instructions on RISC-V interact in a complex manner with linker relaxation. The span-dependent assembler algorithm implemented in LLVM has to start with the smallest version of an instruction and then only make it larger, so we compress instructions before emitting them to the streamer. When the instruction is streamed, the information that the instruction (or rather, the fixup on the instruction) is linker relaxable must be accurate, even though the assembler relaxation process may transform a not-linker-relaxable instruction/fixup into one that is linker relaxable, for instance `c.jal` becoming `qc.e.jal`, or `bne` getting turned into `beq; jal` (the `jal` is linker relaxable). In order for this to work, the following things have to happen: - Any instruction/fixup which might be relaxed to a linker-relaxable instruction/fixup gets marked as `RelaxCandidate = true` in RISCVMCCodeEmitter. - In RISCVAsmBackend, when emitting the `R_RISCV_RELAX` relocation, we have to check that the relocation/fixup kind is one that may need a relax relocation, as well as that it is marked as linker relaxable (the latter will not be set if relaxation is disabled). - Linker Relaxable instructions streamed to a Relaxable fragment need to mark the fragment and its section as linker relaxable. I also added more debug output for Sections/Fixups which are marked Linker Relaxable. This results in more relocations, when these PC-relative fixups cross an instruction with a fixup that is resolved as not linker-relaxable but caused the fragment to be marked linker relaxable at streaming time (i.e. `c.j`). Fixes: #150071	2025-08-12 09:02:48 +01:00
Peter Collingbourne	55e71c0b14	Improve test to include multiple fragments and PATCHINST relocations.	2025-08-11 12:50:39 -07:00
Peter Collingbourne	a9227316bf	MC: Introduce R_AARCH64_PATCHINST relocation type. The R_AARCH64_PATCHINST relocation type is to support deactivation symbols. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 Part of the AArch64 psABI extension: https://github.com/ARM-software/abi-aa/issues/340	2025-08-11 11:31:36 -07:00
Fangrui Song	3769ce013b	MC: Refine ALIGN relocation conditions Each section now tracks the index of the first linker-relaxable fragment, enabling two changes: * Delete redundant ALIGN relocations before the first linker-relaxable instruction in a section. The primary example is the offset 0 R_RISCV_ALIGN relocation for a text section aligned by 4. * For alignments larger than the NOP size after the first linker-relaxable instruction, ALIGN relocations are now generated, even in norelax regions. This fixes the issue #150159. The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s verifies the required ALIGN in a norelax region following linker-relaxable instructions. By using a fragment index within the subsection (which is less than or equal to the section's index), the implementation may generate redundant ALIGN relocations in lower-numbered subsections before the first linker-relaxable instruction. align-option-relax.s demonstrates the ALIGN optimization. Add an initial `call` to a few tests to prevent the ALIGN optimization. --- When the alignment exceeds 2, we insert $alignment-2 bytes of NOPs, even in non-RVC code. This enables non-RVC code following RVC code to handle a 2-byte adjustment without requiring an additional state in MCSection or AsmParser. ``` .globl _start _start: // GNU ld can relax this to 6505 lui a0, 0x1 // LLD hasn't implemented this transformation. lui a0, %hi(foo) .option push .option norelax .option norvc // Now we generate R_RISCV_ALIGN with addend 2, even if this is a norvc region. .balign 4 b0: .word 0x3a393837 .option pop foo: ``` Pull Request: https://github.com/llvm/llvm-project/pull/150816	2025-08-07 19:16:58 -07:00
Stanislav Mekhanoshin	dddeb07c2e	[AMDGPU] Restrict packed math FP32 instructions to read only one SGPR per operand on gfx12+ (#152465 ) Sec. 4.6.7.1 of the gfx1250 SPG states that if an SGPR is used as an operand, only one SGPR will be read for both the low and high operations. As a result, the corresponding bits in `op_sel` and `op_sel_hi` must be the same when the operand is an SGPR. Co-authored-by: Tian, Shilei <Shilei.Tian@amd.com>	2025-08-07 16:13:34 -07:00
Sam Elliott	4e11f89904	[RISCV] Basic Objdump Mapping Symbol Support (#151452 ) This implements very basic support for RISC-V mapping symbols in llvm-objdump, sharing the implementation with how Arm/AArch64/CSKY implement this feature. This only supports the `$x` (instruction) and `$d` (data) mapping symbols for RISC-V, and not the version of `$x` which includes an architecture string suffix.	2025-08-07 11:28:07 -07:00
David Spickett	246990dc02	[llvm][MC][test] Disable many-instructons.s on 32-bit systems Added by https://github.com/llvm/llvm-project/pull/150846. Checks the size of a structure, which is only correct for 64-bit systems.	2025-08-07 10:09:04 +00:00
Stanislav Mekhanoshin	b296ea9c14	[AMDGPU] s_get_shader_cycles_u64 gfx1250 instruction (#152390 ) It is the same as reading SHADER_CYCLES_LO and SHADER_CYCLES_HI but with a single instruction.	2025-08-06 15:32:28 -07:00
Stanislav Mekhanoshin	66392a8d8d	[AMDGPU] Add XNACK_STATE_PRIV and _MASK gfx1250 registers (#152374 ) Co-authored-by: Pierre Vanhoutryve <pierre.vanhoutryve@amd.com>	2025-08-06 14:44:17 -07:00
Stanislav Mekhanoshin	c3103068b7	[AMDGPU] Add more gfx1250 MC tests. NFC. (#152388 ) These are already working, but left downstream.	2025-08-06 14:38:28 -07:00
Stanislav Mekhanoshin	184821b63d	[AMDGPU] Add gfx1250 DS MC tests. NFC. (#152378 )	2025-08-06 14:15:35 -07:00
Jonathan Thackray	c3d24217bf	[AArch64][llvm] Fix disassembly of `ldt{add,set,clr}` instructions using `xzr/wzr` (#152292 ) The current disassembly of `ldt{add,set,clr}` instructions when using `xzr/wzr` is incorrect. The Armv9.6-A Memory Systems specification says: ``` For each of LDT{ADD|SET|CLR}{L}, there is the corresponding STT{ADD|SET|CLR}{L} alias, for the case where the register selected by the Rt field is XZR or WZR ``` and: ``` LDT{ADD|SET|CLR}{A}{L} is equivalent to LD{ADD|SET|CLR}{A}{L} except that: <..conditions..> ``` The Arm ARM specifies the preferred form of disassembly for these aliases: ``` STADD <Xs>, [<Xn|SP>] is equivalent to LDADD <Xs>, XZR, [<Xn|SP>] and is always the preferred disassembly. ``` (ref: DDI 0487L.b C6-2317) This means that `sttadd` is the preferred disassembly for `ldtadd w0, wzr, [x2]` when Rt is `xzr` or `wzr`. This change also aligns llvm disassembly with GNU binutils, as shown by the following examples: llvm before this change: ``` % cat test.s stadd w0, [sp] sttadd w0, [sp] ldadd w0, wzr, [sp] ldtadd w0, wzr, [sp] % llvm-mc-20 -triple aarch64 -mattr=+lse,+lsui test.s stadd w0, [sp] ldtadd w0, wzr, [sp] stadd w0, [sp] ldtadd w0, wzr, [sp] ``` llvm after this change: ``` % llvm-mc -triple aarch64 -mattr=+lse,+lsui test.s stadd w0, [sp] sttadd w0, [sp] stadd w0, [sp] sttadd w0, [sp] ``` GCC-15 test: ``` % gas test.s -march=armv8-a+lsui+lse -o test.o % objdump -dr test.o 0: b82003ff stadd w0, [sp] 4: 192007ff sttadd w0, [sp] 8: b82003ff stadd w0, [sp] c: 192007ff sttadd w0, [sp] ``` Many thanks to Ezra Sitorus and Alice Carlotti for reporting and confirming this issue.	2025-08-06 15:44:15 +01:00
Stanislav Mekhanoshin	34aed0ed56	[AMDGPU] Add gfx1250 wmma_scale[16]_f32_32x16x128_f4 instructions (#152194 )	2025-08-05 15:15:21 -07:00
Stanislav Mekhanoshin	d08c2977e8	[AMDGPU] Add MC support for new gfx1250 src_flat_scratch_base_lo/hi (#152203 )	2025-08-05 14:35:48 -07:00
Oliver Stannard	f6c2a357e7	[AArch64] Add Apple assembly syntax for recent instructions (#152111 ) Some vector instructions override AsmString in the tablegen description, but did not include the Apple syntax variant, so were printed without operands. Fixes #151330	2025-08-05 16:04:25 +01:00
Stanislav Mekhanoshin	37fe9f6382	[AMDGPU] Add gfx1250 v_wmma_scale[16]_f32_16x16x128_f8f6f4 MC support (#152014 ) This adds new VOP3PX2e encoding	2025-08-04 14:20:12 -07:00
Stanislav Mekhanoshin	dd0737bd99	[AMDGPU] gfx1250 v_wmma_ld_scale instructions (#152010 )	2025-08-04 11:36:48 -07:00
Fangrui Song	e8fc808bf8	Reapply "MCFragment: Use trailing data for fixed-size part" (#150846 ) The fixed-size content of the MCFragment object is now stored as trailing data, replacing ContentStart/ContentEnd with ContentSize. The available space for trailing data is tracked using `FragSpace`. If the available space is insufficient, a new block is allocated within the bump allocator `MCObjectStreamer::FragStorage`. FragList::Tail cannot be reused when switching sections or subsections, as it is not associated with the fragment space tracked by `FragSpace`. Instead, allocate a new fragment, which becomes less expensive after #150574. Data can only be appended to the tail fragment of a subsection, not to fragments in the middle. Post-assembler-layout adjustments (such as .llvm_addrsig and .llvm.call-graph-profile) have been updated to use the variable-size part instead. --- This reverts commit a2fef664c29a53bfa8a66927fcf8b2e5c9da4876, which reverted the innocent f1aa6050bd90f8ec4273da55d362e23905ad3a81 . Commit df71243fa885cd3db701dc35a0c8d157adaf93b3, the MCOrgFragment fix, has fixed the root cause of https://github.com/ClangBuiltLinux/linux/issues/2116	2025-08-04 09:10:42 -07:00
Fangrui Song	df71243fa8	MC: Evaluate .org during fragment relaxation Similar to 742ecfc13e8aa34cfff2900e31838f657fcafe30 for MCFillFragment, ensure `.org` directives with expressions are re-evaluated during fragment relaxation, as their sizes may change. Continue iteration to prevent stale, incorrect sizes. While I knew MCOrgFragment likely needed to be re-evaluated, I did not have a motivation to add it;-) This fixes the root cause of https://github.com/ClangBuiltLinux/linux/issues/2116 (writeSectionData assertion failure when building the Linux kernel for arm64) The issue cannot be reliably replicated. The specific test case would not replicate if any of the following conditions were not satisfied: * .org was not re-evaluated. Fixed by this commit. * clang -cc1as has a redundant `initSections` call, leading to a redundant initial FT_Align fragment. llvm-mc -filetype=obj, lacking the redundant `initSections`, doesn't replicate. * faa931b717c02d57f0814caa9133219040e6a85b decreased sizeof(MCFragment). * f1aa6050bd90f8ec4273da55d362e23905ad3a81 added more fragments	2025-08-04 00:29:14 -07:00
Fangrui Song	a2fef664c2	Revert "MCFragment: Use trailing data for fixed-size part" This reverts commit f1aa6050bd90f8ec4273da55d362e23905ad3a81 (reland of #150846), fixing conflicts. It caused https://github.com/ClangBuiltLinux/linux/issues/2116 , which surfaced after a subsequent commit faa931b717c02d57f0814caa9133219040e6a85b decreased sizeof(MCFragment). ``` % /tmp/Debug/bin/clang "-cc1as" "-triple" "aarch64" "-filetype" "obj" "-main-file-name" "a.s" "-o" "a.o" "a.s" clang: /home/ray/llvm/llvm/lib/MC/MCAssembler.cpp:615: void llvm::MCAssembler::writeSectionData(raw_ostream &, const MCSection *) const: Assertion `getContext().hadError() || OS.tell() - Start == getSectionAddressSize(Sec)' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /tmp/Debug/bin/clang -cc1as -triple aarch64 -filetype obj -main-file-name a.s -o a.o a.s Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it): 0 libLLVMSupport.so.22.0git 0x00007cf91eb753cd llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 61 fish: Job 1, '/tmp/Debug/bin/clang "-cc1as" "…' terminated by signal SIGABRT (Abort) ``` The test is sensitive to precise fragment offsets. Using llvm-mc -filetype=obj -triple=aarch64 a.s does not replicate the issue. However, clang -cc1as includes an unnecessary `initSections` (adding an extra FT_Align), which causes the problem.	2025-08-03 23:18:51 -07:00
Stanislav Mekhanoshin	849009c635	[AMDGPU] Add missing v_permlane_up_b32 test. NFC. (#151811 )	2025-08-02 15:22:29 -07:00

11009 Commits