llvm-project

Author	SHA1	Message	Date
Craig Topper	6fbfbd7c88	[RISCV] Add some additional notes about mask pseudo instructions to RISCVVectorExtension.rst. NFC (#120337 )	2024-12-17 21:32:06 -08:00
Krzysztof Drewniak	b24caf3d2b	[llvm][TableGen] Add a !initialized predicate to allow testing for ? (#117964 ) There are cases (like in an upcoming patch to MLIR's `Property` class) where the ? value is a useful null value. However, existing predicates make it difficult to test whether a value in the record one is operating on is ? or not. This commit adds the !initialized predicate, which is 1 on concrete, non-? values and 0 on ?. --------- Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>	2024-12-17 20:34:35 -06:00
Nick Desaulniers	7153a21916	[libc][docs] update sphinx requirement hashes (#120315 ) Link: #120274	2024-12-17 14:14:03 -08:00
Nick Desaulniers	4c5ddc9ed4	[libc][docs] add redirect for math/index.html (#120274 ) commit a9aff440d9dd ("[libc][docs] reorganize documentation (#118836)") moved https://libc.llvm.org/math/index.html to https://libc.llvm.org/headers/math/index.html which makes links from various slide decks stale. There's an extension for sphinx that can generate redirects. Add a dependency on that, then use it to create a redirect so that those older links still work. I was able to install this sphinx extension via: $ sudo apt install python3-sphinx-reredirects We may need to install this on whatever server generates the llvm documentation.	2024-12-17 10:37:21 -08:00
Fangrui Song	c6ff809ae9	[llvm-mc] Add --hex to disassemble hex bytes `--disassemble`/`--cdis` parses input bytes as decimal, 0bbin, 0ooct, or 0xhex. While the hexadecimal digit form is most commonly used, requiring a 0x prefix for each byte (`0x48 0x29 0xc3`) is cumbersome. Tools like xxd -p and rz-asm use a plain hex dump form without the 0x prefix or space separator. This patch adds --hex to disassemble such hex bytes with optional whitespace. ``` % rz-asm -a x86 -b 64 -d 4829c34829c4 sub rbx, rax sub rsp, rax % llvm-mc -triple=x86_64 --cdis --hex --output-asm-variant=1 <<< 4829c34829c4 .text sub rbx, rax sub rsp, rax ``` Pull Request: https://github.com/llvm/llvm-project/pull/119992	2024-12-16 21:05:08 -08:00
Vyacheslav Levytskyy	978de2d666	[SPIR-V] Add saturation and float rounding mode decorations, a subset of arithmetic constrained floating-point intrinsics, and SPV_INTEL_float_controls2 extension (#119862 ) This PR adds the following features: * saturation and float rounding mode decorations, * arithmetic constrained floating-point intrinsics (strict_fadd, strict_fsub, strict_fmul, strict_fdiv, strict_frem, strict_fma and strict_fldexp), * and SPV_INTEL_float_controls2 extension, * using recent improvements of emit-intrinsics step, this PR also simplifies pre- and post-legalizer steps and improves instruction selection.	2024-12-16 10:29:46 +01:00
Djordje Todorovic	52e9f2c52c	[RISCV] Add MIPS P8700 processor (#119882 ) The P8700 is a high-performance processor from MIPS designed to meet the demands of modern workloads, offering exceptional scalability and efficiency. It builds on MIPS's established architectural strengths while introducing enhancements that set it apart. For more details, you can check out the official product page here: https://mips.com/products/hardware/p8700/. Scheduling model will be added in a separate commit/PR.	2024-12-13 20:54:25 +01:00
Sudharsan Veeravalli	668d9688ac	[RISCV] Add Qualcomm uC Xqcilsm (Load Store Multiple) extension (#119823 ) This extension adds 6 instructions that can do multi-word load/store. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/latest This patch adds assembler only support.	2024-12-14 00:06:58 +05:30
AidinT	6f8a363a48	[Kaleidoscope] Add mem2reg pass to function pass manager (#119707 ) Kaleidoscope has switched to new pass manager before (#72324), but both code and tutorial document have some missing parts. This pull request fixes the following problems: 1. Adds `PromotePass` to the function pass manager. This pass was removed during the switch from legacy pass manager to the new pass manager. 2. Syncs the tutorial with the code.	2024-12-12 16:25:09 +01:00
quic_hchandel	0614c601b4	[RISCV] Add Qualcomm uC Xqcics(Conditional Select) extension (#119504 ) The Qualcomm uC Xqcics extension adds 8 conditional select instructions. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/latest This patch adds assembler only support. --------- Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>	2024-12-12 11:12:09 +05:30
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
Abhay Kanhere	de56df9eb5	[Nomination] Add additional Apple representative to the Security Group (#118571 ) I'd like to nominate myself as an additional Apple representative (vendor contact) on the llvm security group. I met many of you at the llvm-dev meeting roundtable(s) in Santa Clara. I closely work with @ahmedbougacha @jroelofs at Apple. - Abhay	2024-12-11 11:00:08 -08:00
Nuno Lopes	03661fbe45	[docs][UB] add section on poison propagation through select Examples from Nikita Popov, thank you!	2024-12-11 16:52:03 +00:00
Nuno Lopes	0100c631f8	[docs] Add guide about Undefined Behavior (#119220 ) Thanks Antonio Frighetto, John Regehr, and Nikita Popov for reviewing this!	2024-12-11 12:23:51 +00:00
Guillaume DI FATTA	a1ee1a9126	[CodeGen] @llvm.experimental.stackmap make operands immediate (#117932 ) This pull request modifies the behavior of the `@llvm.experimental.stackmap` intrinsic to require that its first two operands (`id` and `numShadowBytes`) be immediate values. This change ensures that variables cannot be passed as the first two arguments to this intrinsic. Related Issue: https://github.com/llvm/llvm-project/issues/115733 ### Testing - Added new test cases to ensure errors are emitted for non-immediate operands. - Ran the full LLVM test suite to verify no regressions were introduced.	2024-12-11 17:41:19 +08:00
anoopkg6	dc04d414df	SystemZ: Add support for __builtin_setjmp and __builtin_longjmp. (#119257 ) This PR includes fixes for the original PR #116642. Implementation of __builtin_setjmp and __builtin_longjmp for SystemZ.	2024-12-10 19:50:51 +01:00
Anastasia Stulova	dadd8455fe	Removed Anastasia Stulova from Office Hours Calendar. (#119384 ) Unfortunately, due to other commitments I am no longer able to host this community meeting. Co-authored-by: Anastasia Stulova <astulova@nvidia.com>	2024-12-10 14:52:01 +00:00
Amara Emerson	a4c7c66098	[GlobalISel] Document minimum legality requirements for G_IMPLICIT_DEF. (#117609 ) The reason for this change is to clarify an existing technical restriction of LLVM: there needs to be a way to implicitly define a type if there is any way to legally define that type by another means.	2024-12-09 22:10:13 -08:00
Simon Pilgrim	290a111792	[docs] Add a brief description of using -fveclib to enable some math library vectorizations (#119215 ) Fixes #62283	2024-12-09 15:06:23 +00:00
Thorsten Schütt	fc2cc018ec	[GlobalISel] list undocumented opcodes in docs (#119089 )	2024-12-08 16:35:33 +01:00
Austin Kerbow	aebd3389a9	[AMDGPU] Fix user SGPR alloc order in docs (#119092 ) NFC. Preload kernarg SGPRs are allocated after the private segment size SGPR. This patch updates AMDGPUUsage.rst to reflect this.	2024-12-07 13:08:35 -08:00
Feng Zou	94c6dd62fa	[docs] Update release notes for APX relocation types (#118575 )	2024-12-07 21:27:10 +08:00
Ulrich Weigand	8787bc72a6	Revert "[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642 )" This reverts commit 030bbc92a705758f1131fb29cab5be6d6a27dd1f.	2024-12-07 00:55:54 +01:00
Jeffrey Byrnes	9ac52ce8d6	[AMDGPU] Add iglp_opt(3) for simple mfma / exp interleaving (#117269 ) Adds a minimal iglp_opt to do simple exp / mfma interleaving.	2024-12-06 15:19:07 -08:00
anoopkg6	030bbc92a7	[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642 ) Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.	2024-12-06 23:33:33 +01:00
Chandler Carruth	28bba0d717	Bump minimum MSVC version by one dot release to VS 2019 16.8 (#118833 ) This is a small change, but unblocks using longer string literals in LLVM's source code, and hopefully isn't disruptive. Discussed in an RFC here: https://discourse.llvm.org/t/rfc-raising-minimum-msvc-version-by-one-dot-release/	2024-12-06 01:48:02 -08:00
Dmitry Sidorov	d057b53a7d	[SPIR-V] Add SPV_INTEL_joint_matrix extension (#118578 ) The spec is available here: https://github.com/intel/llvm/pull/12497 The PR doesn't add the OpCooperativeMatrixApplyFunctionINTEL instruction, as it's still experimental and not properly tested E2E. The PR also fixes a few bugs in the related code: 1. CooperativeMatrixMulAddKHR optional operand must be a literal, not a constant; 2. Fixed creation of the available capabilities table for the case where a single extension adds a few capabilities that occupy non-contiguous opcodes. --------- Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>	2024-12-04 19:00:19 +01:00
Rahul Joshi	e2c3d16282	[NFC] Eliminate need of Emacs tag and file name in file header (#118553 ) - Simplify file header to not require file name and C++ Emacs tag. See https://discourse.llvm.org/t/is-c-in-header-files-still-relevant/83124/1	2024-12-04 08:57:27 -08:00
Thorsten Schütt	148fdc519c	[GlobalISel] Add G_ABDS and G_ABDU instructions (#118122 ) The DAG has the same instructions: the signed and unsigned absolute difference of its inputs. For AArch64, they map to uabd and sabd for Neon and SVE. The Neon and SVE instructions will require custom patterns. They are pseudo opcodes and are not imported by the IRTranslator. We need combines to create them. PowerPC, ARM, and AArch64 have native instructions. /// i.e. trunc(abs(sext(Op0) - sext(Op1))) becomes abds(Op0, Op1) /// or trunc(abs(zext(Op0) - zext(Op1))) becomes abdu(Op0, Op1) For GlobalISel, we are going to write the combines in MIR patterns. See: llvm/test/CodeGen/AArch64/abd-combine.ll - [ ] combine into abd - [ ] legalize and add td patterns	2024-12-04 12:53:15 +01:00
John Brawn	ecbe4d1e36	[IR] Allow fast math flags on fptrunc and fpext (#115894 ) This consists of: * Make these instructions part of FPMathOperator. * Adjust bitcode/ir readers/writers to expect fast math flags on these instructions. * Make IRBuilder set the fast math flags on these instructions. * Update langref and release notes. * Update a bunch of tests. Some of these are due to InstCombineCasts incorrectly adding fast math flags to fptrunc, which will be fixed in a later patch.	2024-12-04 10:53:04 +00:00
Shilei Tian	68bcba6d7a	Revert "[AMDGPU] Use COV6 by default (#118515 )" This reverts commit 410cbe3cf28913cca2fc61b3437306b841d08172 because some buildbots are not ready yet.	2024-12-03 20:17:06 -05:00
Shilei Tian	410cbe3cf2	[AMDGPU] Use COV6 by default (#118515 )	2024-12-03 19:38:35 -05:00
Dan Gohman	35cce408ee	[WebAssembly] Support the new "Lime1" CPU (#112035 ) This adds WebAssembly support for the new [Lime1 CPU]. First, this defines some new target features. These are subsets of existing features that reflect implementation concerns: - "call-indirect-overlong" - implied by "reference-types"; just the overlong encoding for the `call_indirect` immediate, and not the actual reference types. - "bulk-memory-opt" - implied by "bulk-memory": just `memory.copy` and `memory.fill`, and not the other instructions in the bulk-memory proposal. Next, this defines a new target CPU, "lime1", which enables mutable-globals, bulk-memory-opt, multivalue, sign-ext, nontrapping-fptoint, extended-const, and call-indirect-overlong. Unlike the default "generic" CPU, "lime1" is meant to be frozen, and followed up by "lime2" and so on when new features are desired. [Lime1 CPU]: https://github.com/WebAssembly/tool-conventions/blob/main/Lime.md#lime1 --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>	2024-12-03 16:35:23 -08:00
Shilei Tian	17cfd016b4	[AMDGPU][Doc] Add `gfx950` to `gfx9-4-generic` in the document	2024-12-03 11:17:22 -05:00
Vyacheslav Levytskyy	874b4fb6ad	[SPIR-V] Fix emission of debug and annotation instructions and add SPV_EXT_optnone SPIR-V extension (#118402 ) This PR fixes: * emission of OpNames (added newly inserted internal intrinsics and basic blocks) * emission of function attributes (SRet is added) * implementation of SPV_INTEL_optnone so that it emits OptNoneINTEL Function Control flag, and add implementation of the SPV_EXT_optnone SPIR-V extension.	2024-12-03 16:18:06 +01:00
Viktoria Maximova	4a6ecd3821	Add support for SPIR-V extension: SPV_INTEL_media_block_io (#118024 ) This change implements the SPV_INTEL_media_block_io extension in the SPIR-V backend.	2024-12-03 13:47:18 +01:00
Sudharsan Veeravalli	6881c6d2a6	[RISCV] Add Qualcomm uC Xqcia (Arithmetic) extension (#118113 ) This extension adds 11 instructions that perform integer arithmetic. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/latest This patch adds assembler only support.	2024-12-01 17:06:22 +05:30
Nuno Lopes	ed7f36e1ec	[LangRef] update a couple of struct/vector creation examples to use poison	2024-11-29 09:42:25 +00:00
Min-Yih Hsu	96dd39c575	[XRay] Add `__xray_default_options` to specify build-time defined options (#117921 ) Similar to `__asan_default_options`, users can specify default options upon building the instrumented binaries by providing their own definition of `__xray_default_options` which returns the option strings. This is useful in cases where setting the `XRAY_OPTIONS` environment variable might be difficult. Plus, it's a convenient way to populate XRay options when you always want the instrumentation to be enabled.	2024-11-28 22:48:57 -08:00
Sudharsan Veeravalli	8fcbba82d6	[RISCV] Add Qualcomm uC Xqcisls (Scaled Load Store) extension (#117987 ) This extension adds 8 load/store instructions with a scaled index addressing mode. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/latest This patch adds assembler only support.	2024-11-29 10:26:00 +05:30
Sudharsan Veeravalli	c4645ffeda	[RISCV] Add Qualcomm uC Xqcicsr (CSR) extension (#117169 ) The Qualcomm uC Xqcicsr extension adds 2 instructions that can read and write CSRs. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/latest This patch adds assembler only support.	2024-11-28 12:46:15 +05:30
Durgadoss R	40d0058e6a	[NVPTX] Add TMA bulk tensor reduction intrinsics (#116854 ) This patch adds NVVM intrinsics and NVPTX codegen for: * cp.async.bulk.tensor.reduce.1D -> 5D variants, supporting both Tile and Im2Col modes. * These intrinsics optionally support cache_hints as indicated by the boolean flag argument. * Lit tests are added for all combinations of these intrinsics in cp-async-bulk-tensor-reduce.ll. * The generated PTX is verified with a 12.3 ptxas executable. * Added docs for these intrinsics in NVPTXUsage.rst file. PTX Spec reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-reduce-async-bulk-tensor Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2024-11-27 10:57:51 +05:30
Brandon Wu	4a7dbede6b	[RISCV] Support `svukte` extension (#115657 ) This is the extension for "Address-Independent Latency of User-Mode Faults to Supervisor Addresses". Spec: https://github.com/riscv/riscv-isa-manual/pull/1564, https://lf-riscv.atlassian.net/browse/RVS-2977 The spec states that the `svukte` depends on `sv39`, but we don't have `sv39` yet, so I didn't add it to the implied list.	2024-11-27 10:54:57 +08:00
Louis Dionne	5bdcaf1a08	[github] Document the process for requesting the CI/CD role (#115321 ) See https://discourse.llvm.org/t/rfc-proposing-a-new-ci-cd-admin-for-the-project	2024-11-26 14:18:49 -05:00
LiqinWeng	bf07a569b7	[LangRef] Remove extra commas of llvm.vp.ctlz (#117542 )	2024-11-26 10:22:26 +08:00
Justin Bogner	bb88fd171a	[DirectX] Calculate resource binding offsets using the lower bound (#117303 ) In the DXIL CreateHandle and CreateHandleFromBinding ops, resource bindings are indexed from the beginning of the binding space, not from the binding itself. Translate from an index into the binding to one from the beginning of the space when lowering to these operations.	2024-11-25 10:44:01 -08:00
LiqinWeng	73bebf96bc	[LangRef] Update the position of some parameters in the vp intrinsic of abs/cttz/ctlz (#117519 )	2024-11-25 14:47:50 +08:00
Matt Arsenault	d1cca3133a	AMDGPU: Add v_permlane16_swap_b32 and v_permlane32_swap_b32 for gfx950 (#117260 ) This was a bit annoying because these introduce a new special case encoding usage. op_sel is repurposed as a subset of dpp controls, and is eligible for VOP3->VOP1 shrinking. For some reason fi also uses an enum value, so we need to convert the raw boolean to 1 instead of -1. The 2 registers are swapped, so this has 2 defs. Ideally the builtin would return a pair, but that's difficult so return a vector instead. This would make a hypothetical builtin that supports v2f16 directly uglier.	2024-11-22 20:12:50 -08:00
Matt Arsenault	01c9a14ccf	AMDGPU: Define v_mfma_f32_{16x16x128\|32x32x64}_f8f6f4 instructions (#116723 ) These use a new VOP3PX encoding for the v_mfma_scale_* instructions, which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers are supported yet (op_sel, neg or clamp). I'm not sure the intrinsic should really expose op_sel (or any of the others). If I'm reading the documentation correctly, we should be able to just have the raw scale operands and auto-match op_sel to byte extract patterns. The op_sel syntax also seems extra horrible in this usage, especially with the usual assumed op_sel_hi=-1 behavior.	2024-11-21 08:51:58 -08:00
Jonas Devlieghere	8bfa87cadf	Release note lldb completion improvements (#117058 )	2024-11-21 07:02:45 -08:00


11339 Commits