llvm-project

Author	SHA1	Message	Date
Steven Wu	ba8d9ce8d4	[ADT] Fix unused variable from #69528 (#114114 ) Remove unused variable to fix build failures from bot.	2024-10-29 13:00:59 -07:00
David Majnemer	5c12434906	[X86] Emit comments explaining the immediate in vfpclass This makes the assembly a lot more readable at a glance. As an example: ``` vfpclasspd $4, %zmm0, %k0 # k0 = isNegativeZero(zmm0) ```	2024-10-29 19:54:34 +00:00
Maryam Moghadas	8a0cb9ac86	[PowerPC] Add custom lowering for ssubo (#111748 ) This patch is to improve the codegen for ssubo node for i32 in 64-bit mode by custom lowering.	2024-10-29 15:43:05 -04:00
Adam Yang	3a1228a543	[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic (#111888 ) partially fixes #70103 ### Changes * Added int_spv_group_memory_barrier_with_group_sync intrinsic in IntrinsicsSPIRV.td * Added lowering for int_spv_group_memory_barrier_with_group_sync in SPIRVInstructionSelector.cpp * Added SPIRV backend test case ### Related PRs * [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic #111883](https://github.com/llvm/llvm-project/pull/111883) * [[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic #111884](https://github.com/llvm/llvm-project/pull/111884)	2024-10-29 12:40:01 -07:00
Rahul Joshi	a18af41c20	[LLVM] Change error messages to start with lower case (#113748 ) Change LLVM Asm and TableGen Lexer/Parser error messages to begin with lower case.	2024-10-29 12:26:33 -07:00
Ellis Hoag	9cc5a4bf66	Remove llvm::shouldOptForSize() from Utils.h (#112630 ) Remove `llvm::shouldOptForSize()` from `Utils.h` since we can use `llvm::shouldOptimizeForSize()` from `SizeOpts.h` instead. Depends on https://github.com/llvm/llvm-project/pull/112626	2024-10-29 14:23:47 -05:00
Kazu Hirata	c79827cd15	[SandboxIR] Fix a warning This patch fixes: llvm/lib/SandboxIR/Context.cpp:684:22: error: unused variable 'MaxRegisteredCallbacks' [-Werror,-Wunused-const-variable]	2024-10-29 12:05:18 -07:00
Lang Hames	9e37cbb469	[ORC] Add some missing FIXMEs, move a temporary Error into an if condition.	2024-10-29 11:12:48 -07:00
Min-Yih Hsu	ba65710908	[RISCV] Avoid redundant SchedRead on _TIED VPseudos (#113940 ) _TIED and _MASK_TIED pseudos have one less operand compared to other pseudos, thus we shouldn't attach the same number of SchedRead for these instructions. I don't think we have a way to (explicitly) check scheduling classes. So I only test this patch with existing tests.	2024-10-29 10:49:35 -07:00
Harald van Dijk	950ee75909	[RISC-V] Fix check of minimum vlen. (#114055 ) If we have a minimum vlen, we were adjusting StackSize to change the unit from vscale to bytes, and then calculating the required padding size for alignment in bytes. However, we then used that padding size as an offset in vscale units, resulting in misplaced stack objects. While it would be possible to adjust the object offsets by dividing AlignmentPadding by ST.getRealMinVLen() / RISCV::RVVBitsPerBlock, we can simplify the calculation a bit if instead we adjust the alignment to be in vscale units. @topperc This fixes a bug I am seeing after #110312, but I am not 100% certain I am understanding the code correctly, could you please see if this makes sense to you?	2024-10-29 17:30:30 +00:00
Steven Wu	b510cdb895	[ADT] Add TrieRawHashMap (#69528 ) Implement TrieRawHashMap, which can be used to store an object with its associated hash. The user needs to supply a strong hashing function to guarantee the uniqueness of the hashes of the objects to be inserted. Hash collisions are not supported and will lead to an error or a failed insertion. TrieRawHashMap is thread-safe and lock-free and can be used as a foundation data structure to implement content-addressable storage. TrieRawHashMap owns the data stored in it and is designed to be: * Fast to look up. * Fast to "insert" if the data has already been inserted. * Usable without locks, requiring no knowledge of the participating threads or extra coordination between threads. It is not currently designed for inserting unique new data under high contention, due to limitations of the memory allocator.	2024-10-29 10:29:39 -07:00
Afanasyev Ivan	4e1b9d34f9	[mir-strip-debug] Fix debug location info strip for bundled instructions (#113676 ) Fix a bug where the `mir-strip-debug` pass does not remove debug locations from bundled instructions. The problem arises when testing that debug info does not affect the output of optimization passes (`llvm-lit` with ` -Dllc="llc -debugify-and-strip-all-safe"`) and a pass operates on MIR with bundled instructions + memory operands. Suppose a MIR test check looks like: ``` CHECK-NEXT: BUNDLE { CHECK-NEXT: $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2) CHECK-NEXT: } ``` Because the `mir-strip-debug` pass does not process bundled instructions, running `llc -debugify-and-strip-all-safe` on the test will produce the following output: ``` BUNDLE { $r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2) } ``` The test will then fail, but it shouldn't. The root cause seems to be that the `mir-strip-debug` pass should remove debug locations from bundled instructions.	2024-10-29 10:26:15 -07:00
Adam Yang	9a5b3a1bbc	[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884 ) fixes #112974 partially fixes #70103 ### Changes - Added new tablegen based way of lowering dx intrinsics to DXIL ops. - Added int_dx_group_memory_barrier_with_group_sync intrinsic in IntrinsicsDirectX.td - Added expansion for int_dx_group_memory_barrier_with_group_sync in DXILIntrinsicExpansion.cpp - Added DXIL backend test case ### Related PRs * [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic #111883](https://github.com/llvm/llvm-project/pull/111883) * [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic #111888](https://github.com/llvm/llvm-project/pull/111888)	2024-10-29 10:17:35 -07:00
Craig Topper	b1d0fe095b	[RISCV] Remove trailing whitespace. NFC	2024-10-29 10:09:28 -07:00
Jubilee	f53889ffca	[RISCV] Allow crypto features to imply dependents (#112659 ) This relationship is a logical dependency. Note Zvbc and Zvknhb. They are explicitly called out in the spec as requiring 64 bits: - `56ed7952d1/doc/vector/riscv-crypto-spec-vector.adoc`	2024-10-29 10:07:20 -07:00
SpencerAbson	2a9dd8af5a	[AArch64] Add assembly/disassembly for zeroing SVE FCVT{X} and BFCVT (#113916 ) This patch adds assembly/disassembly support for the following SVE2.2 instructions - FCVT (zeroing) - FCVTX (zeroing) - BFCVT (zeroing) In accordance with: https://developer.arm.com/documentation/ddi0602/2024-09/SVE-Instructions	2024-10-29 16:55:19 +00:00
Fangrui Song	318bdd0aeb	[StackSafetyAnalysis] Bail out when calling ifunc An assertion failure arises when a call instruction calls a GlobalIFunc. Since we cannot reason about the underlying function, just bail out. Fix #87923 Pull Request: https://github.com/llvm/llvm-project/pull/113841	2024-10-29 09:26:47 -07:00
Jorge Gorbe Moya	4df71ab78e	[SandboxIR] Add callbacks for instruction insert/remove/move ops (#112965 )	2024-10-29 09:25:51 -07:00
Jay Foad	a156362e93	[AMDGPU] Fix machine verification failure after SIFoldOperandsImpl::tryFoldOMod (#113544 ) Fixes #54201	2024-10-29 14:59:37 +00:00
Sarah Spall	75e7ba8c0b	[HLSL] Re-implement countbits with the correct return type (#113189 ) Restricts hlsl countbits to always return a uint32. Implements a lowering from llvm.ctpop which has an overloaded return type to dxil cbits op which always returns uint32. Closes #112779	2024-10-29 07:56:05 -07:00
Shilei Tian	e268398fa8	[NFC][AMDGPU] Use `!foreach` to replace explicit list of registers (#114005 )	2024-10-29 10:50:06 -04:00
Elvina Yakubova	80a09735ac	Revert "[clang][AArch64] Add getHostCPUFeatures to query for enabled features in cpu info (#97749)" (#114066 ) This reverts commit d732c0b13c55259177f2936516b6087d634078e0. This is breaking buildbots https://lab.llvm.org/buildbot/#/builders/190/builds/8413, https://lab.llvm.org/buildbot/#/builders/56/builds/10880 and a few others.	2024-10-29 14:43:01 +00:00
Momchil Velikov	b6a84e77b6	[AArch64] Add assembly/disassembly for FMOP4A (widening, 4-way) instructions (#113347 ) The new instructions are described in https://developer.arm.com/documentation/ddi0602/2024-09/SME-Instructions	2024-10-29 14:36:07 +00:00
neildhickey	d732c0b13c	[clang][AArch64] Add getHostCPUFeatures to query for enabled features in cpu info (#97749 ) Add getHostCPUFeatures into the AArch64 Target Parser to query the cpuinfo for the device in the case where we are compiling with -mcpu=native. Add LLVM_CPUINFO environment variable to test mock /proc/cpuinfo files for -mcpu=native Co-authored-by: Elvina Yakubova <eyakubova@nvidia.com>	2024-10-29 13:34:43 +00:00
Matt Arsenault	88e23eb2cf	DAG: Fix legalization of vector addrspacecasts (#113964 )	2024-10-29 08:08:50 -05:00
Lukacma	3c2d77185e	[AARCH64] Add assembly/disassembly for FMMLA instructions (#113313 ) This patch adds assembly/disassembly for the following instructions: FMMLA (widening, FP16 to FP32) FMMLA (widening, FP8 to FP16) FMMLA (widening, FP8 to FP32) According to [1] [1]https://developer.arm.com/documentation/ddi0602	2024-10-29 13:02:46 +00:00
Hari Limaye	e19a5fc6d3	[FuncSpec] Improve accounting of specialization codesize growth (#113448 ) Only accumulate the codesize increase of functions that are actually specialized, rather than for every candidate specialization that we analyse. This fixes a subtle bug where prior analysis of candidate specializations that were deemed unprofitable could prevent subsequent profitable candidates from being recognised.	2024-10-29 11:53:12 +00:00
Momchil Velikov	ec427df2b9	[AArch64] Add assembly/disassembly for FMOP4{A,S} (non-widening) half-precision instructions (#113343 ) The new instructions are described in https://developer.arm.com/documentation/ddi0602/2024-09/SME-Instructions	2024-10-29 11:50:29 +00:00
Jay Foad	2443549b85	[IR] Remove some uses of StructType::setBody. NFC. (#113685 ) It is simple to create the struct body up front, now that we have transitioned to opaque pointers.	2024-10-29 11:44:53 +00:00
Hari Limaye	06664fdc76	[FuncSpec] Enable SpecializeLiteralConstant by default (#113442 ) Enable specialization on literal constant arguments by default in Function Specialization. --------- Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>	2024-10-29 11:41:25 +00:00
Lukacma	98c8d64353	[AArch64] Add assembly/disassembly for BFSCALE instructions (#113538 ) This patch adds assembly/disassembly for the following instructions: BFSCALE (multiple and single vector) BFSCALE (multiple vectors) As specified in https://developer.arm.com/documentation/ddi0602/2024-09 Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>	2024-10-29 11:08:36 +00:00
Benjamin Maxwell	c3260c65e8	[IR] Add `llvm.sincos` intrinsic (#109825 ) This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct). ``` declare { float, float } @llvm.sincos.f32(float %Val) declare { double, double } @llvm.sincos.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val) ``` The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.	2024-10-29 10:52:20 +00:00
Rohit Aggarwal	dfb60bb919	Adding more vector calls for -fveclib=AMDLIBM (#109662 ) AMD has its own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos. Please refer to https://github.com/amd/aocl-libm-ose --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>	2024-10-29 10:09:55 +00:00
CarolineConcatto	8d38fbf2f0	[LLVM][AArch64] Add assembly/disassembly for SVE Integer Unary Arithmetic Predicated instructions (#113670 ) This patch adds the following instructions: SVE bitwise unary operations (predicated) CLS, CLZ, CNT, CNOT, FABS, FNEG, NOT SVE integer unary operations (predicated) SXT{B,H,W}, UXT{B,H,W}, ABS, NEG SVE2 integer unary operations (predicated) URECPE, URSQRTE, SQABS, SQNEG According to https://developer.arm.com/documentation/ddi0602 Co-authored-by: Spencer Abson Spencer.Abson@arm.com	2024-10-29 09:09:55 +00:00
CarolineConcatto	d4197f3ac1	[LLVM][AArch64] Add assembly/disassembly for MUL/BFMUL SME instructions (#113535 ) According to https://developer.arm.com/documentation/ddi0602 Co-authored-by: Momchil-Velikov Momchil.Velikov@arm.com	2024-10-29 09:09:13 +00:00
Alex Bradbury	7544d3af0e	[RISCV] Mark RVB23U64 and RVB23S64 as non-experimental (#113918 ) The specification was recently ratified <https://github.com/riscv/riscv-profiles/blob/main/src/rvb23-profile.adoc>.	2024-10-29 07:57:34 +00:00
Craig Topper	3f4468faaa	[RISCV] Teach expandRV32ZdinxStore to handle memoperand not being present. (#113981 ) I received a report that the outliner drops memoperands and causes this code to crash. Handle this by only copying the memoperand if it exists. Similar for expandRV32ZdinxLoad	2024-10-28 22:37:47 -07:00
NAKAMURA Takumi	828467a54e	Fix warnings introduced in #111434 [-Wnontrivial-memaccess]	2024-10-29 14:18:24 +09:00
Craig Topper	635c344dfb	[X86] Add vector_compress patterns with a zero vector passthru. (#113970 ) We can use the kz form to automatically zero the extra elements. Fixes #113263.	2024-10-28 19:59:00 -07:00
Yingwei Zheng	18311093ab	[InstCombine] Do not fold `shufflevector(select)` if the select condition is a vector (#113993 ) Since `shufflevector` is not element-wise, we cannot fold it into select when the select condition is a vector. For a shufflevector that doesn't change the length, it doesn't crash, but it is still a miscompilation: https://alive2.llvm.org/ce/z/s8saCx Fixes https://github.com/llvm/llvm-project/issues/113986.	2024-10-29 10:39:07 +08:00
c8ef	0c1c37bfbe	[TLI] Add support for the `tgamma` libcall. (#113791 ) This patch adds the `tgamma` libcall.	2024-10-29 10:08:38 +08:00
Lang Hames	6128ff6630	[JITLink][MachO] Add convenience functions for default text/data sections. The getMachODefaultTextSection and getMachODefaultRWDataSection functions return the "__TEXT,__text" and "__DATA,__data" sections respectively, creating empty sections if the default sections are not already present in the graph. These functions can be used by utilities that want to add code or data to these standard sections (e.g. these functions can be used to supply the section argument to the createAnonymousPointerJumpStub and createPointerJumpStubBlock functions in the various targets).	2024-10-28 18:05:40 -07:00
vporpo	a461869db3	[SandboxIR][Pass] Implement Analyses class (#113962 ) The Analyses class provides a way to pass around commonly used Analyses to SandboxIR passes through the `runOnFunction()` and `runOnRegion()` functions.	2024-10-28 18:00:52 -07:00
Matt Arsenault	1ceccbb0dd	VirtRegRewriter: Add implicit register defs for live out undef lanes (#112679 ) If an undef subregister def is live into another block, we need to maintain a physreg def to track the liveness of those lanes. This would manifest a verifier error after branch folding, when the cloned tail block use no longer had a def. We need to detect interference with other assigned intervals to avoid clobbering the undef lanes defined in other intervals, since the undef def didn't count as interference. This is pretty ugly and adds a new dependency on LiveRegMatrix, keeping it live for one more pass. It also adds a lot of implicit operand spam (we really should have a better representation for this). There is a missing verifier check for this situation. Added an xfailed test that demonstrates this. We may also be able to revert the changes in 47d3cbcf842a036c20c3f1c74255cdfc213f41c2. It might be better to insert an IMPLICIT_DEF before the instruction rather than using the implicit-def operand. Fixes #98474	2024-10-28 17:33:53 -07:00
Igor Kudrin	757d0e4764	Revert "[CFI][LowerTypeTests] Fix indirect call with alias" (#113978 ) Reverts llvm/llvm-project#106185 This is breaking Sanitizer bots: https://lab.llvm.org/buildbot/#/builders/66/builds/5449/steps/8/logs/stdio	2024-10-28 16:13:32 -07:00
David Majnemer	902acde341	[InstCombine] Optimize away certain additions using modular arithmetic We can turn: ``` %add = add i8 %arg, C1 %and = and i8 %add, C2 %cmp = icmp eq i8 %and, C3 ``` into: ``` %and = and i8 %arg, C2 %cmp = icmp eq i8 %and, (C3 - C1) & C2 ``` This is only worth doing if the sequence is the sole user of the addition operation.	2024-10-28 22:51:35 +00:00
Matthias Braun	5903c6af44	InstCombine: Fold shufflevector(select) and shufflevector(phi) (#113746 ) - Transform `shufflevector(select(c, x, y), C)` to `select(c, shufflevector(x, C), shufflevector(y, C))` by re-using the `FoldOpIntoSelect` helper. - Transform `shufflevector(phi(x, y), C)` to `phi(shufflevector(x, C), shufflevector(y, C))` by re-using the `foldOpIntoPhi` helper.	2024-10-28 15:35:17 -07:00
vporpo	bf4b31ad54	[SandboxVec][Legality] Check Fastmath flags (#113967 )	2024-10-28 15:32:20 -07:00
vporpo	5ea694816b	[SandboxVec][Legality] Check opcodes and types (#113741 )	2024-10-28 14:05:58 -07:00
joaosaffran	481bce018e	Adding splitdouble HLSL function (#109331 ) - Adding hlsl `splitdouble` intrinsics - Adding DXIL lowering - Adding SPIRV lowering - Adding test Fixes: #108901 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>	2024-10-28 13:26:59 -07:00
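
Note on 902acde341 ([InstCombine] Optimize away certain additions using modular arithmetic): the rewrite described in that message can be sanity-checked by brute force. The sketch below is not taken from the patch. The helper names are hypothetical, and the precondition it tests (C2 is a low-bit mask of the form 2^k - 1, and C3 has no bits outside C2) is an assumption about what "certain additions" means, since the commit message does not state the condition. It models fixed-width integers with Python's modulo operator and checks that both forms of the comparison agree for every value of the argument.

```python
def original(arg, c1, c2, c3, bits=8):
    """((arg + C1) & C2) == C3, with wraparound at the given bit width."""
    return (((arg + c1) % (1 << bits)) & c2) == c3


def folded(arg, c1, c2, c3, bits=8):
    """(arg & C2) == ((C3 - C1) & C2), with wraparound at the given bit width."""
    return (arg & c2) == (((c3 - c1) % (1 << bits)) & c2)


def fold_is_sound(c1, c2, c3, bits=8):
    """True if both forms agree for every possible value of arg."""
    return all(
        original(a, c1, c2, c3, bits) == folded(a, c1, c2, c3, bits)
        for a in range(1 << bits)
    )


def sound_for_low_masks(bits):
    """Exhaustive check under the ASSUMED precondition:
    C2 = 2**k - 1 (a low-bit mask) and C3 <= C2."""
    for k in range(bits + 1):
        c2 = (1 << k) - 1
        for c1 in range(1 << bits):
            for c3 in range(c2 + 1):
                if not fold_is_sound(c1, c2, c3, bits):
                    return False
    return True


if __name__ == "__main__":
    # Exhaustive at 5 bits to keep the run short; the argument is the same at i8.
    print(sound_for_low_masks(5))
    # One i8 instance with a low-bit mask.
    print(fold_is_sound(c1=0x13, c2=0x0F, c3=0x05))
    # Outside the assumed precondition: C2 = 2 is not a low-bit mask, so a
    # carry out of bit 0 can change bit 1 and the two forms disagree.
    print(fold_is_sound(c1=1, c2=2, c3=2))
```

With a low-bit mask the `and` is a reduction modulo 2^k, so subtracting C1 from both sides of the comparison is valid. For other masks a carry from lower bits can break the equivalence, which is why the last check is expected to fail.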


187775 Commits