llvm-project

Author	SHA1	Message	Date
Yuta Saito	d4efc3e097	[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332 ) Currently, WebAssembly/WASI target does not provide direct support for code coverage. This patch set fixes several issues to unlock the feature. The main changes are: 1. Port `compiler-rt/lib/profile` to WebAssembly/WASI. 2. Adjust profile metadata sections for Wasm object file format. - [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections instead of data segments. - [lld] Align the interval space of custom sections at link time. - [llvm-cov] Copy misaligned custom section data if the start address is not aligned. - [llvm-cov] Read `__llvm_prf_names` from data segments 3. [clang] Link with profile runtime libraries if requested See each commit message for more details and rationale. This is part of the effort to add code coverage support in Wasm target of Swift toolchain.	2024-10-15 02:41:43 +09:00
Michael Marjieh	b5600c6f85	[TargetLowering][SelectionDAG] Exploit nneg Flag in UINT_TO_FP (#108931 ) 1. Propagate the nneg flag in WidenVecRes 2. Use SINT_TO_FP in expandUINT_TO_FP when possible.	2024-10-14 20:55:48 +04:00
Akshat Oke	cd6c2b80be	[NewPM][CodeGen] Port StackColoring to NPM (#111812 )	2024-10-14 19:23:34 +05:30
c8ef	a3b0c31ebc	Revert "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112200 ) Reverts llvm/llvm-project#111774 This appears to be causing some tests to fail.	2024-10-14 21:43:49 +08:00
c8ef	11f625cb87	[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes. (#111774 ) Closes #108218. This patch adds icmp+select patterns for integer min/max matchers in SDPatternMatch, similar to those in IR PatternMatch.	2024-10-14 21:19:34 +08:00
David Green	828d72b263	[GlobalISel] Add an assert for the DemandedElts APInt size. (#112150 ) Similar to the other implementations in DAG/ValueTracking, this adds an assert that the size of the DemandedElts is what we expect it to be - the size of a fixed length vector or APInt(1,1) otherwise. The G_BUILDVECTOR is fixed as it was passing an original DemandedElts for the scalar operands.	2024-10-14 09:59:26 +01:00
Akshat Oke	3dba5d8584	[MIR] Add missing noteNewVirtualRegister callbacks (#111634 ) The delegates' callback isn't invoked on parsing new virtual registers. There are two places in the serialization where new virtual registers can be discovered: in register infos and in instructions.	2024-10-14 14:29:09 +05:30
Akshat Oke	dbfca24b99	[MIR] Serialize virtual register flags (#110228 ) This introduces target-specific vreg flag serialization. Flags are represented as `uint8_t` and the `TargetRegisterInfo` override provides methods `getVRegFlagValue` to deserialize and `getVRegFlagsOfReg` to serialize.	2024-10-14 14:19:53 +05:30
duk	464a7ee79e	[CodeGen] Generalize trap emission after SP check fail (#109744 ) Generalize and improve some target-specific code that emits traps after stack protector failure in SelectionDAG & GlobalIsel.	2024-10-12 20:01:22 -04:00
Kazu Hirata	a62768c427	[CodeGen] Simplify code with *Map::operator[] (NFC) (#112075 )	2024-10-11 23:01:21 -07:00
Tex Riddell	82b40fd4fd	Fix scalar overload name constructed by ReplaceWithVeclib.cpp (#111095 ) ReplaceWithVeclib.cpp would construct overload name using all the arguments in the intrinsic, but overloads should only be constructed from arguments for which isVectorIntrinsicWithOverloadTypeAtArg returns true, including the return type first (index -1). Additionally, - skip when `Intrinsic::not_intrinsic`, otherwise `isVectorIntrinsicWithOverloadTypeAtArg` asserts for some IntrinsicCalls. Unblocks translation for pow and atan2 intrinsics. Fixes #111093	2024-10-11 14:38:35 -07:00
Ellis Hoag	adaa603224	[MachineVerifier] Report errors from one thread at a time (#111605 ) Create the `ReportedErrors` class to track the number of reported errors during verification. The class will block reporting errors if some other thread is currently reporting an error. I've encountered a case where there were many different verifications reporting errors at the same time on different threads. This ensures that we don't start printing the error from one case until we are completely done printing errors from other cases. Most of the time `AbortOnError = true` so we usually abort after reporting the first error. Depends on https://github.com/llvm/llvm-project/pull/111602.	2024-10-11 11:53:44 -07:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Yingwei Zheng	ec3e0a5900	Revert "[CodeGenPrepare] Convert `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1`" (#111932 ) Reverts llvm/llvm-project#111284 to fix clang stage2 builds. Investigating... Failed buildbots: https://lab.llvm.org/buildbot/#/builders/76/builds/3576 https://lab.llvm.org/buildbot/#/builders/168/builds/4308 https://lab.llvm.org/buildbot/#/builders/127/builds/1087	2024-10-11 11:08:07 +08:00
Yingwei Zheng	e3894f58e1	[CodeGenPrepare] Convert `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` (#111284 ) Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) == 1`. After https://github.com/llvm/llvm-project/pull/100899, we set the range of ctpop's return value to indicate the argument/result is non-zero. This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP to fix https://github.com/llvm/llvm-project/issues/95255.	2024-10-11 09:08:38 +08:00
Ellis Hoag	cb5fbd2f60	[CodeLayout] Do not verify after assigning blocks (#111754 ) Rather than invariantly running `F->verify()` when asserts are enabled, run machine IR verification in LIT tests only. Swap `CHECK-PERF` and `CHECK-SIZE` in `code_placement_ext_tsp_large.ll`. Remove `={0,1,true,false}` from flags in tests.	2024-10-10 09:01:50 -07:00
Vladimir Radosavljevic	dabb0ddbd7	[MCP] Skip invalidating def constant regs during forward propagation (#111129 ) Before this patch, redundant COPY couldn't be removed for the following case: ``` %reg1 = COPY %const-reg ... // There is a def of %const-reg %reg2 = COPY killed %reg1 ``` where this can be optimized to: ``` ... // There is a def of %const-reg %reg2 = COPY %const-reg ``` This patch allows for such optimization by not invalidating defined constant registers. This is safe, as architectures like AArch64 and RISCV replace a dead definition of a GPR with a zero constant register for certain instructions.	2024-10-10 18:05:42 +04:00
Oliver Stannard	1e49670b31	[DAGISel] Keep flags when converting FP load/store to integer (#111679 ) This DAG combine replaces a floating-point load/store pair which has no other uses with an integer one, but did not copy the memory operand flags to the new instructions, resulting in it dropping the volatile flag. This optimisation is still valid if one or both of the instructions is volatile, so we can copy over the whole MachineMemOperand to generate volatile integer loads and stores where needed.	2024-10-10 09:17:50 +01:00
YunQiang Su	8d35ab80fc	AArch64: Add FMINNUM_IEEE and FMAXNUM_IEEE support (#107855 ) FMINNM/FMAXNM instructions of AArch64 follow IEEE754-2008. We can use them to canonicalize a floating point number. FMINNUM_IEEE/FMAXNUM_IEEE are also used by, for example, the expansion of FMINIMUMNUM/FMAXIMUMNUM, so let's define them. Update combine_andor_with_cmps.ll. Add fp-maximumnum-minimumnum.ll, with nnan testcases only. V1F64 is not supported yet. If we set v1f64 as legal, FMINNUM/FMAXNUM will have a problem: both of them use `if (isOperationLegalOrCustom(FMAXNUM_IEEE, VT))`. AArch64 depends on `expandFMINNUM_FMAXNUM` returning `SDValue()` for FMAXNUM and FMINNUM. This should be fixed, but that is left for a future patch.	2024-10-10 15:09:47 +08:00
YunQiang Su	d52c8408ff	SelectionDAG/expandFMINNUM_FMAXNUM: skips vector if SETCC/VSELECT is not legal (#109570 ) If SETCC or VSELECT is not legal for the vector type, we should not expand it; instead we can split the vectors, so that some simple scale instructions can be emitted instead of pairs of comparison+selection.	2024-10-10 08:39:25 +08:00
Jeffrey Byrnes	853c43d04a	[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564 ) Porting to TTI provides direct access to the instruction cost model, which can enable instruction cost based sinking without introducing code duplication.	2024-10-09 14:30:09 -07:00
Ellis Hoag	d905a3c51b	[NFC] Format MachineVerifier.cpp to remove extra indentation (#111602 ) Many structs in this class have the wrong indentation. To generate this diff, I touched the first line of each struct and then ran `git clang-format`. This will make blaming more difficult, but this autoformatting is difficult to avoid triggering. I think it's best to push this as one NFC PR.	2024-10-09 08:26:30 -07:00
Matt Arsenault	ced15cd418	DAG: Preserve more flags when expanding gep (#110815 ) This allows selecting the addressing mode for stack instructions in cases where we need to prove the sign bit is zero.	2024-10-09 13:51:52 +04:00
William G Hatch	181840459d	[LiveDebugValues][NVPTX]VarLocBasedImpl handle vregs, enable for NVPTX (#111456 ) This patch handles virtual registers in the VarLocBasedImpl of the LiveDebugValues pass, allowing it to be used on architectures that depend on virtual registers in debugging, like NVPTX. It enables the pass for NVPTX.	2024-10-08 19:38:47 -06:00
Simon Pilgrim	1dcb6dc757	[DAG] foldVSelectToSignBitSplatMask - pull out repeated code and use getShiftAmountConstant helper. We're assuming shift amount type matches the result type - which is true for vectors, but I'm hoping to generalize this fold in the future.	2024-10-08 17:36:34 +01:00
Juan Manuel Martinez Caamaño	327124ece7	[NFC][EarlyIfConverter] Rename SSAIfConv::runOnMachineFunction to SSAIfConv::init (#111500 )	2024-10-08 11:11:23 +02:00
Ralf Jung	29ec0716a8	Fix comment typo in ExpandFCOPYSIGN (#111489 ) I noticed this while following https://github.com/llvm/llvm-project/pull/111269. It makes little sense that FCOPYSIGN would look at the sign of `x`, right? Surely this must be `y`. Also fix the inconsistency where it's sometimes `x` and sometimes `X`.	2024-10-08 12:47:56 +04:00
Juan Manuel Martinez Caamaño	1df8ccd35b	Revert "[NFC][EarlyIfConverter] Turn SSAIfConv into a local variable (#107390 )" (#111385 ) This reverts commit 09a4c23eb410d4be52202bed21c967a3653c3544.	2024-10-08 10:27:22 +02:00
Juan Manuel Martinez Caamaño	d5ec01b0dd	Revert "[NFC][EarlyIfConverter] Replace boolean Predicate for a class (#108519 )" (#111372 ) This reverts commit 9e7315912656628b606e884e39cdeb261b476f16.	2024-10-07 16:03:11 +02:00
Juan Manuel Martinez Caamaño	a018353f4b	Revert "[NFC][EarlyIfConverter] Remove unused member variables" This reverts commit 3c83102f0615c7d66f6df698ca472ddbf0e9483d.	2024-10-07 14:37:37 +02:00
Paul Walker	02dd6b1014	[LLVM][CodeGen] Add lowering for scalable vector bfloat operations. (#109803 ) Specifically: fabs, fadd, fceil, fdiv, ffloor, fma, fmax, fmaxnm, fmin, fminnm, fmul, fnearbyint, fneg, frint, fround, froundeven, fsub, fsqrt & ftrunc	2024-10-07 13:01:59 +01:00
Luke Lau	c98e41f858	[LegalizeVectorTypes] Always widen fabs (#111298 ) fabs and fneg are similar nodes in that they can always be expanded to integer ops, but currently they diverge when widened. If the widened vector fabs is marked as expand (and the corresponding scalar type is too), LegalizeVectorTypes thinks that it may be turned into a libcall and so will unroll it to avoid the overhead on the undef elements. However unlike the other ops in that list like fsin, fround, flog etc., an fabs marked as expand will never be legalized into a libcall. Like fneg, it can always be expanded into an integer op. This moves it below unrollExpandedOp to bring it in line with fneg, which fixes an issue on RISC-V with f16 fabs being unexpectedly scalarized when there's no zfhmin.	2024-10-07 17:40:32 +08:00
Luke Lau	18d3a5d558	[LegalizeVectorTypes] When widening don't check for libcalls if promoted (#111297 ) When widening some FP ops, LegalizeVectorTypes will check to see if the widened op may be scalarized and then turned into a bunch of libcalls, and if so unroll early to avoid unnecessary libcalls of the padded undef elements. It checks if the widened op is legal or custom to see if it will be scalarized, but promoted ops will also avoid scalarization. This relaxes the check to account for this which fixes some illegal vector types on RISC-V from being scalarized when they could be widened.	2024-10-07 16:42:36 +08:00
Kazu Hirata	bea28037f6	[CodeGen] Avoid repeated hash lookups (NFC) (#111274 )	2024-10-06 09:21:25 -07:00
Craig Topper	20e37f03c6	[GISel] Don't preserve NSW flag when converting G_MUL of INT_MIN to G_SHL. (#111230 ) mul and shl have different meanings for the nsw flag. We need to drop it when converting a multiply by the minimum negative value.	2024-10-05 10:27:09 -07:00
Daniel Hoekwater	8305e9fc09	Revert "[CFIFixup] Factor CFI remember/restore insertion into a helper (NFC)" (#111168 ) Reverts llvm/llvm-project#111066 This seems to be breaking some builds: - https://lab.llvm.org/buildbot/#/builders/51/builds/4732 - https://lab.llvm.org/buildbot/#/builders/41/builds/2534 - https://lab.llvm.org/buildbot/#/builders/73/builds/6601	2024-10-04 10:34:03 -04:00
Daniel Hoekwater	47c8b95dae	[CFIFixup] Factor CFI remember/restore insertion into a helper (NFC) (#111066 ) Inserting a remember/restore pair is a very clean abstraction, so we can split the logic out into a helper function. Additionally, cleaning this up will make it easier to add logic for handling functions that are split across multiple sections.	2024-10-04 09:31:22 -04:00
Stephen Tozer	b01be72af0	[NFC][CodeGen] Remove unused HasFakeUses MachineFunctionProperty A previous commit d826b0c9 accidentally added a new MachineFunctionProperty, HasFakeUses, that was unused by the commit (and results in an uncovered-switch warning, which was fixed by a separate followup 1811e872); this patch removes that enum value.	2024-10-04 13:58:29 +01:00
Jie Fu	1811e87204	[CodeGen] Fix enumeration value 'HasFakeUses' not handled in switch (NFC) llvm-project/llvm/lib/CodeGen/MachineFunction.cpp:95:10: error: enumeration value 'HasFakeUses' not handled in switch [-Werror,-Wswitch] switch(Prop) { ^~~~ 1 error generated.	2024-10-04 20:27:40 +08:00
Stephen Tozer	d826b0c90f	[LLVM] Add HasFakeUses to MachineFunction (#110097 ) Following the addition of the llvm.fake.use intrinsic and corresponding MIR instruction, two further changes are planned: to add an -fextend-lifetimes flag to Clang that emits these intrinsics, and to have -Og enable this flag by default. Currently, some logic for handling fake uses is gated by the optdebug attribute, which is intended to be switched on by -fextend-lifetimes (and by extension -Og later on). However, the decision was made that a general optdebug attribute should be incompatible with other opt_ attributes (e.g. optsize, optnone), since they all express different intents for how to optimize the program. We would still like to allow -fextend-lifetimes with optsize however (i.e. -Os -fextend-lifetimes should be legal), since it may be a useful configuration and there is no technical reason to not allow it. This patch resolves this by tracking MachineFunctions that have fake uses, allowing us to run passes that interact with them and skip passes that clash with them.	2024-10-04 13:13:30 +01:00
William G Hatch	ae635d6f99	[NVPTX] add support for .debug_loc section (#110905 ) Enable .debug_loc section for NVPTX backend. This commit makes NVPTX omit DW_AT_low_pc (and DW_AT_high_pc) for DW_TAG_compile_unit. This is because cuda-gdb uses the compile unit's low_pc as a base address, and adds the addresses in the debug_loc section to it. Removing low_pc is equivalent to setting that base address to zero, so addition doesn't break the location ranges. Additionally, this patch forces debug_loc label emission to emit single labels with no subtraction or base. This would not be necessary if we could emit `label1 - label2` expressions in PTX. The PTX documentation at https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#debugging-directives-section makes it seem like this is supported, but it doesn't actually work. I believe when that documentation says that you can subtract “label addresses between labels in the same dwarf section”, it doesn't merely mean that the labels need to be in the same section as each other, but in fact they need to be in the same section as the use. If label subtraction becomes supported such that in the debug_loc section you can subtract labels from the main code section, then we can remove the workarounds added in this PR. Also, since this now emits valid .debug_loc sections, the empty section emitted to force the existence of at least one debug section is now an empty .debug_macinfo section rather than an empty .debug_loc, which matches what nvcc does.	2024-10-03 16:31:49 -06:00
Luke Lau	487686b82e	[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000 ) In https://reviews.llvm.org/D153848, promotion was added for a variety of f16 ops with zvfhmin, including VP reductions. However I don't believe it's correct to promote f16 fadd or fmul reductions to f32 since we need to round the intermediate results. Today if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two different results depending on whether we compiled with +zvfh or +zvfhmin, for example with a 3 element reduction: ; v9 = [0.1563, 5.97e-8, 0.00006104] ; zvfh vsetivli x0, 3, e16, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v9, v8 vfmv.f.s fa0, v8 ; fa0 = 0.1563 ; zvfhmin vsetivli x0, 3, e16, m1, ta, ma vfwcvt.f.f.v v10, v9 vsetivli x0, 3, e32, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v10, v8 vfmv.f.s fa0, v8 fcvt.h.s fa0, fa0 ; fa0 = 0.1564 This same thing happens with reassociative reductions e.g. vfredusum.vs, and this also applies for bf16. I couldn't find anything in the LangRef for reductions that suggest the excess precision is allowed. There may be something we can do in Clang with -fexcess-precision=fast, but I haven't looked into this yet. I presume the same precision issue occurs with fmul, but not with fmin/fmax/fminimum/fmaximum. I can't think of another way of lowering these other than scalarizing, and we can't scalarize scalable vectors, so this just removes the promotion and adjusts the cost model to return an invalid cost. (It looks like we also don't currently cost fmul reductions, so presumably they also have an invalid cost?) I think this should be enough to stop the loop vectorizer or SLP from emitting these intrinsics.	2024-10-04 00:17:45 +08:00
Mehdi Amini	6c7a3f80e7	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110938 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if. Re-apply #110185 with more fixes for debug build with the ABI breaking checks disabled.	2024-10-03 01:24:14 +02:00
Christopher Di Bella	45ad1ac4a3	Revert "Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110883)" (#110923 ) This reverts commit 1905cdbf4ef15565504036c52725cb0622ee64ef, which causes lots of failures where LLVM doesn't have the right header guards. The errors can be seen on [BuildKite](https://buildkite.com/llvm-project/upstream-bazel/builds/112362#01924eae-231c-4d06-ba87-2c538cf40e04), where the source uses `#ifndef NDEBUG`, but the content in question is defined when `LLVM_ENABLE_ABI_BREAKING_CHECKS == 1`. For example, `llvm/include/llvm/Support/GenericDomTreeConstruction.h` has the following: ```cpp // Helper struct used during edge insertions. struct InsertionInfo { // ... #ifdef LLVM_ENABLE_ABI_BREAKING_CHECKS SmallVector<TreeNodePtr, 8> VisitedUnaffected; #endif }; // ... InsertionInfo II; // ... #ifndef NDEBUG II.VisitedUnaffected.push_back(SuccTN); #endif ```	2024-10-02 13:54:09 -07:00
Francis Visoiu Mistrih	916e6ad7d0	[CodeGen] Fix InstructionCount remarks for MI bundles (#107621 ) For MI bundles, the instruction count remark doesn't count the instructions inside the bundle.	2024-10-02 13:37:25 -07:00
spupyrev	9016f27c42	[CodeLayout] Size-aware machine block placement (#109711 ) This is an implementation of a new "size-aware" machine block placement. The idea is to reorder blocks so that the number of fall-through jumps is maximized. Observe that profile data is ignored for the optimization, and it is applied only for instances with hasOptSize()=true. This strategy has two benefits: (i) it eliminates jump instructions, which results in smaller text size; (ii) we avoid using profile data while reordering blocks, which yields more "uniform" functions, thus helping ICF and machine outliner/merger. For large (mobile) apps, the size benefits of (i) and (ii) are roughly the same, combined providing up to 0.5% uncompressed and up to 1% compressed size savings on top of the current solution. The optimization is turned off by default.	2024-10-02 10:48:08 -07:00
Mehdi Amini	1905cdbf4e	Fix LLVM_ENABLE_ABI_BREAKING_CHECKS macro check: use #if instead of #ifdef (#110883 ) This macro is always defined: either 0 or 1. The correct pattern is to use #if. Reapply https://github.com/llvm/llvm-project/pull/110185 with fixes.	2024-10-02 18:43:16 +02:00
Matt Arsenault	187dcd8e22	DAG: Preserve disjoint flag when emitting final instructions (#110795 )	2024-10-02 19:37:04 +04:00
Bevin Hansson	1a65d95d00	[CodeGen][RAGreedy] Inform LiveDebugVariables about snippets spilled by InlineSpiller. (#109962 ) RAGreedy invokes InlineSpiller to spill a particular virtreg inline. When the spiller does this, it also identifies small, adjacent liveranges called snippets. These are also spilled or rematerialized in the process. However, the spiller does not inform RA that it has spilled these regs. This means that debug variable locations referencing these regs/ranges are lost. Mark any spilled regs which do not have a stack slot assigned to them as allocated to the slot being spilled to to tell LDV that those regs are located in that slot, even though the regs might no longer exist in the program after regalloc is finished. Also, inform RA about all of the regs which were replaced (spilled or rematted), not just the one that was requested so that it can properly manage the ranges of the debug vars.	2024-10-02 10:29:56 +02:00
Michael Maitland	f957d080e9	[RISCV][GISEL] Legalize G_EXTRACT_SUBVECTOR (#109426 ) This is heavily based on the SelectionDAG lowerEXTRACT_SUBVECTOR code.	2024-10-01 14:08:49 -04:00

36581 Commits