llvm-project

Author	SHA1	Message	Date
Nikita Popov	fe4dbbb467	[DAGCombiner] Fold add (mul x, C), x to mul x, C+1 While this is normally non-canonical IR, this pattern can appear during SDAG lowering if the add is actually a getelementptr, as illustrated in `@test_ptr`. This pattern comes up when doing provenance-aware high-bit pointer tagging. Proof: https://alive2.llvm.org/ce/z/DLoEcs Fixes https://github.com/llvm/llvm-project/issues/62093. Differential Revision: https://reviews.llvm.org/D148341	2023-04-17 12:33:46 +02:00
Akshay Khadse	8bf7f86d79	Fix uninitialized pointer members in CodeGen This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303	2023-04-17 16:32:46 +08:00
Wang, Xin10	9fa721c7c6	remove useless call in MIRSampleProfile.cpp This call to getSummary returns a value that nobody uses. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D148305	2023-04-17 04:14:28 -04:00
NAKAMURA Takumi	077a2a4bcd	[CMake] Cleanup deps	2023-04-17 00:38:49 +09:00
Kazu Hirata	972983539b	[llvm] Apply fixes from readability-redundant-control-flow (NFC)	2023-04-16 00:13:46 -07:00
Sergei Barannikov	38d84e3d76	[GISel] Legalize G_FSUB to G_FADD + G_FNEG even if G_FNEG is illegal `G_FNEG` used to be legalized to `G_FSUB -0, x` causing infinite loop. This is no longer the case after D84287. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D148187	2023-04-15 08:11:49 +03:00
Daniel Hoekwater	6b62166b4c	Account for PATCHABLE instrs in Branch Relaxation PATCHABLE_* instructions expand to up to 36-byte sleds. Updating the size of PATCHABLE instructions causes them to be outlined, so we need to add a check to prevent the outliner from considering basic blocks that contain PATCHABLE instructions. Differential Revision: https://reviews.llvm.org/D147982	2023-04-14 16:14:50 -07:00
Bjorn Pettersson	40c60c025c	[Passes] Remove the legacy DemandedBitsWrapperPass Last user of DemandedBitsWrapperPass was the BDCE pass. Since the legacy PM version of BDCE was removed in an earlier commit, this patch removes the now unused DemandedBitsWrapperPass. Differential Revision: https://reviews.llvm.org/D148336	2023-04-14 18:56:20 +02:00
Nikita Popov	62ef97e063	[llvm-c] Remove PassRegistry and initialization APIs Remove C APIs for interacting with PassRegistry and pass initialization. These are legacy PM concepts, and are no longer relevant for the new pass manager. Calls to these initialization functions can simply be dropped. Differential Revision: https://reviews.llvm.org/D145043	2023-04-14 12:12:48 +02:00
Aiden Grossman	35714e3a9c	[MLGO] Change MBB Profile Dump from using MBB numbers to MBB IDs Currently, setting the -mbb-profile-dump option dumps a CSV file with blocks inside an individual function identified by their MBB numbers. This patch changes the MBBs to be identified by their ID, which is set at MBB creation and not changed afterwards, making it inherently stable throughout the backend. This alleviates concerns with the MBB numbers changing between the profile dump and what ends up in the final object file. The MBBs inside the SHT_LLVM_BB_ADDR_MAP sections are also identified using their MBB ID rather than number, so if we want to match them up we need to identify the MBBs here by ID. Reviewed By: mtrofin, rahmanl Differential Revision: https://reviews.llvm.org/D147366	2023-04-14 07:04:07 +00:00
Nick Desaulniers	fc4494dffa	[StackProtector] don't check stack protector before calling nounwind functions https://reviews.llvm.org/rGd656ae28095726830f9beb8dbd4d69f5144ef821 introduced additional checks before calling noreturn functions in response to this security paper related to Catch Handler Oriented Programming (CHOP): https://download.vusec.net/papers/chop_ndss23.pdf See also: https://bugs.chromium.org/p/llvm/issues/detail?id=30 This causes stack canaries to be inserted in C code, which was unexpected; we noticed certain Linux kernel trees stopped booting after this (in functions trying to initialize the stack canary itself). https://github.com/ClangBuiltLinux/linux/issues/1815 There is no point checking the stack canary like this when exceptions are disabled (-fno-exceptions or function is marked noexcept) or for C code. The GCC patch for this issue does something similar: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a25982ada523689c8745d7fb4b1b93c8f5dab2e7 Android measured a 2% regression in RSS as a result of d656ae280957 and undid it globally: https://android-review.googlesource.com/c/platform/build/soong/+/2524336 Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D147975	2023-04-13 09:37:06 -07:00
sgokhale	bb5befefc6	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83. An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833	2023-04-13 10:52:28 +05:30
Anshil Gandhi	6530bd3030	[BranchRelaxation] Correct JumpToFT value Toggle true/false values of the JumpToFallThrough parameter to simplify code and make it consistent with the documentation for the `getFallThrough(..)` method. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D148139	2023-04-12 23:21:20 -06:00
Amara Emerson	719024a0d0	[GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types. These reduce the typing and clutter from: Register Dst = MI.getOperand(0).getReg(); Register Src1 = MI.getOperand(1).getReg(); Register Src2 = MI.getOperand(2).getReg(); Register Src3 = MI.getOperand(3).getReg(); LLT DstTy = MRI.getType(Dst); ... etc etc To just: auto [Dst, Src1, Src2, Src3] = MI.getFirst4Regs(); auto [DstTy, Src1Ty, Src2Ty, Src3Ty] = MI.getFirst4LLTs(); Or even more concise: auto [Dst, DstTy, Src1, Src1Ty, Src2, Src2Ty, Src3, Src3Ty] = MI.getFirst4RegLLTs(); Differential Revision: https://reviews.llvm.org/D144687	2023-04-12 16:43:14 -07:00
Amara Emerson	29c851f4e2	[GlobalISel] Move the truncstore_merge combine to the LoadStoreOpt pass and add support for an extra case. If we have a set of mergeable stores of shifts, but the original source value being shifted is wider than the merged size, we should still be able to merge if we truncate first. To do this, however, we need to search for stores speculatively up the block, without knowing exactly how many stores we should see before we stop. The old algorithm has to match an exact number of stores to fit the wide type, or it dies. The new one will try to set the wide type to however many stores we found in the upwards block traversal and use later checks to verify if they're a valid mergeable set. The reason I need to move this to LoadStoreOpt is that the combiner works top down inside a block, which means that we end up doing partial merges because we haven't seen all the possible stores before we mutate the MIR. In LoadStoreOpt we can go bottom up. As a side effect of this change, we also end up doing better on an existing test case (missing_store) since we manage to do a partial merge there.	2023-04-12 16:43:14 -07:00
Archibald Elliott	17cd511007	[DAGCombiner] Fix (srl (ctlz x) n) for non-power-of-two Data This DAGCombine is not valid for some combinations of the known bits of x and non-power-of-two widths of x. As shown in the bug: - The bitwidth of x is 35 (n=5) - The only unknown bit of x is the least significant bit - This gives the result of the ctlz two possible values: 34 or 35, both of which will give 1 when right-shifted 5 bits. - So the `eor x, 1` that this optimisation would give is not correct. A similar instcombine optimisation is only applied when the width of x is a power-of-two. GlobalISel does not have this bug, as shown by the testcase. Fixes #61549 Differential Revision: https://reviews.llvm.org/D147518	2023-04-12 17:38:39 +01:00
Wang, Xin10	b00fc5ac99	Fix Mem leak in LLVMTargetMachine.cpp If we reach line 302 when one of MCE or MAB is not nullptr, we could leak memory here. Using unique_ptr to hold these two pointers avoids the leak. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148003	2023-04-12 05:27:04 -04:00
Ellis Hoag	244be0b0de	[InstrProf] Temporal Profiling As described in [0], this extends IRPGO to support //Temporal Profiling//. When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile. Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively. In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup. Special thanks to Julian Mestre for helping with reservoir sampling. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D147287	2023-04-11 08:30:52 -07:00
Amaury Séchet	91105df3df	[DAG] Peek through zext/trunc when matching (or (and X, (not Y)), Y). This shows up in the wild, notably as a regression in D127115 . Depends on D147821 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D147827	2023-04-11 13:48:22 +00:00
Felipe de Azevedo Piovezan	b2020fe3aa	[DbgHistoryCalculator] Improve debug messages I've found that a frequent source of debug information loss in optimized code is due to DEBUG_VALUE intrinsics in a position of the instruction stream that is outside the scope of the variable it describes. Tracking these is pretty difficult with the existing debug messages of the history calculator; this patch addresses the issue by making it obvious when this event happens. Differential Revision: https://reviews.llvm.org/D147718	2023-04-11 08:41:29 -04:00
Amaury Séchet	9041e1fa29	[DAG] Peek through zext/trunc in haveNoCommonBitsSet. This limitation was discovered thanks to some regression in D127115 . Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D147821	2023-04-11 11:44:15 +00:00
Momchil Velikov	4ac6f99ae0	[LiveInterval] Fix live range overlap check Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D145707	2023-04-11 11:11:30 +01:00
Alexis Engelke	8e59fe2d8e	[FastISel] Correctly report prototype on miss The type of a function is nowadays just an opaque pointer, which is not helpful when analyzing FastISel misses. Instead print the actual function type of the function. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D147716	2023-04-11 11:49:08 +02:00
sgokhale	5f0bccc3d1	[CodeGen][ShrinkWrap] Split restore point This patch splits a restore point to allow it to only post-dominate blocks reachable by use or def of CSRs(Callee Saved Registers)/FI(Frame Index). Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change for others. Co-authored-by: junbuml Differential Revision: https://reviews.llvm.org/D42600	2023-04-11 11:58:50 +05:30
Kazu Hirata	63c4967352	Use APInt::getOneBitSet (NFC)	2023-04-10 18:19:17 -07:00
Bing1 Yu	87c1ed5385	Change dyn_cast to cast Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D147923	2023-04-11 00:14:39 +08:00
wangpc	267708f9d5	[MachineOutliner] Add IsOutlined to MachineFunction We add a field `IsOutlined` to indicate whether a MachineFunction is outlined and set it true for outlined functions in MachineOutliner. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D146191	2023-04-10 10:57:29 +08:00
Kazu Hirata	c121f3a9fb	[CodeGen] Use range-based for loops (NFC)	2023-04-08 16:22:39 -07:00
Nathan Lanza	87c0f67739	[Outliner] Add an option to only enable outlining of patterns above a certain threshold Outlining isn't always a win when the saved instruction count is >= 1. The overhead of representing a new function in the binary depends on exception metadata and alignment. So parameterize this for local tuning. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D136774	2023-04-08 02:12:40 -04:00
Luo, Yuanke	9db75b23bd	[Coverity] Initialize pointer member.	2023-04-06 17:29:53 +08:00
Valery Pykhtin	e09b33feec	[CodeGen] Speedup stack slot sharing during stack coloring (interval overlapping test). AMDGPU code with address sanitizer enabled generates tons of stack objects (> 200000 in my testcase) and takes forever to compile due to the time spent on stack slot sharing. While the LiveRange::overlaps method has logarithmic complexity in the number of segments in the involved live ranges, the problem is that when a new interval is assigned to a used color, it is tested for overlap against every other interval assigned to that color. Instead I decided to join all assigned intervals for a color into a single interval, which gives logarithmic complexity in the number of segments of the joined interval. This patch reduced time spent on the stack slot coloring pass from 628 to 3 seconds on my testcase. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D146057	2023-04-06 07:23:45 +02:00
Craig Topper	f1924d965a	[SelectionDAG] Expand VP SDNodes by default. Differential Revision: https://reviews.llvm.org/D147643	2023-04-05 18:52:28 -07:00
Matt Arsenault	7907fd4961	RegAllocFast: Fix dropping subreg indexes on unassigned subreg defs This was assuming all register operands were assigned to physical registers. This should ignore the operands which weren't assigned in this run. Fixes #61134	2023-04-05 18:25:51 -04:00
Felipe de Azevedo Piovezan	79a1e32915	[GlobalISel] Improve stack slot tracking in dbg.values For IR like: ``` %alloca = alloca ... dbg.value(%alloca, !myvar, OP_deref(<other_ops>)) ``` GlobalISel lowers it to MIR: ``` %some_reg = G_FRAME_INDEX <stack_slot> DBG_VALUE %some_reg, !myvar, OP_deref(<other_ops>) ``` In other words, if the value of `!myvar` can be obtained by dereferencing an alloca, in MIR we say that the _location_ of a variable is obtained by dereferencing register %some_reg (plus some `<other_ops>`). We can instead remove the use of `%some_reg`: the location of `!myvar` _is_ `<stack_slot>` (plus some `<other_ops>`). This patch implements this transformation, which improves debug information handling in O0, as these registers hardly ever survive register allocation. A note about testing: similar to what was done in D76934 (f24e2e9eebde4b7a1d), this patch exposed a bug in the Builder class when using `-debug`, where we tried to print an incomplete instruction. The changes in `MachineIRBuilder.cpp` address that. Differential Revision: https://reviews.llvm.org/D147536	2023-04-05 08:21:00 -04:00
Sven van Haastregt	5af5ac4e3e	Update mentions of reduction intrinsics; NFC The intrinsics have been out of experimental since 322d0afd875d ("[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.", 2020-10-07); update some places that still referred to them as experimental.	2023-04-05 11:49:41 +01:00
Dinar Temirbulatov	7f05bdf4ee	[AArch64][SME] Fix an infinite loop in DAGCombine related to adding -force-streaming-compatible-sve flag. The compiler hits an infinite loop in DAGCombine. For force-streaming-compatible-sve mode we have custom lowering for 128-bit vector splats, and later in DAGCombiner::SimplifyVCastOp() we scalarized SPLAT because we have custom lowering for SME. Later, we restored the SPLAT operation via performMulCombine().	2023-04-05 10:10:55 +00:00
OCHyams	93c194fc9f	[Assignment Tracking] Ignore zero-sized fragments Such dbg.assigns will occur if you write zero-sized memcpys (see https://reviews.llvm.org/D146987#4240016). Handle this in AssignmentTrackingAnalysis (back end) rather than AssignmentTrackingPass (declare-to-assign) in case it is possible to reproduce this as a result of optimisations. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D147435	2023-04-05 09:31:23 +01:00
Hongtao Yu	5b461d5ec1	[FS-AFDO] Assign discriminators to pseudo probes This is the first change for FS-AFDO integration with CSSPGO. There are more patches coming. With pseudo probes, we do not assign FS discriminators to any other instructions since we will be using only probes for profile correlation. Also call instructions are excluded since their dwarf discriminators are used for other purposes, i.e, storing probe ids. Since they are not getting a FS discriminator, they will also be excluded from MIR profile loading. The corresponding changes will be in the subsequent patches. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D147286	2023-04-04 17:04:37 -07:00
Hans Wennborg	91beab69cd	Revert "Recommit DwarfEHPrepare: insert extra unwind paths for stack protector to instrument" This broke Objective-C autorelease / retainAutoreleasedReturnValue, see comments on the code review. > This is a mitigation patch for > https://bugs.chromium.org/p/llvm/issues/detail?id=30, where existing stack > protection is skipped if a function is returned through by an unwinder rather > than the normal call/return path. The recent patch D139254 added the ability to > instrument a visible unwind path, at least in the IR case (I'm working on the > SelectionDAG instrumentation too) but there are still invisible unwinds it > can't reach. > > So this patch adds logic to DwarfEHPrepare that goes through a function, > converting any call that might throw into an invoke to a simple resume cleanup, > and adding cleanup clauses to existing landingpads that lack them. Obviously we > don't really want to do this if it's wasted effort, so I also exposed > requiresStackProtector from the actual StackProtector code to skip the extra > paths if they won't be used. > > Changes: > * Move test to AArch64 directory as it relies on target presence. > * Re-add Dominator-tree maintenance. Accidentally cherry-picked wrong patch. > * Skip adding paths on Windows EH functions. > > https://reviews.llvm.org/D143637 This reverts commit 2d690684f66fabc9ac6a2c70fcff3b31c9520794.	2023-04-04 18:09:26 +02:00
Jay Foad	5509a18b5a	[MachineVerifier] Try harder to verify SlotIndexes Verify the SlotIndexes analysis after a pass that claims to preserve it, even if there are no further passes (apart from the verifier itself) that would use the analysis. Differential Revision: https://reviews.llvm.org/D129201	2023-04-04 15:23:36 +01:00
Simon Pilgrim	00e3ae4471	[CodeGen] ExpandReductions - add reduce_and/or(<X x i1> V) -> icmp(iX bitcast(<X x i1> V)) canonicalization This already exists in InstCombine but was missing from the late stage ExpandReductions pass Fixes #53419 Fixes #61923 Differential Revision: https://reviews.llvm.org/D147452	2023-04-04 11:19:35 +01:00
Craig Topper	65f3794111	[SelectionDAG] Use MemVT for FoldingSetNodeID in SelectionDAG::getLoadVP. Return types and operands are put in the ID by AddNodeIDNode. I'm pretty sure this was supposed to be the memory VT.	2023-04-03 15:15:48 -07:00
Craig Topper	de92a20131	[SelectionDAG] Move variable declaration to its first assignment. NFC We declared this variable and assigned it to true, but then overwrote it before its first use.	2023-04-03 14:03:05 -07:00
Craig Topper	bb64fd571b	[SelectionDAGBuilder] Use SmallVectorImpl& for function arguments. NFC Make the reference const since we aren't modifying the vectors.	2023-04-03 14:03:05 -07:00
Jun Zhang	7657e50fef	[DAGCombiner] Fold avg(x, x) --> x Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D147404	2023-04-03 16:57:50 +08:00
Craig Topper	b5f207e5b2	[SelectionDAG] Rename Flag->Glue. NFC	2023-04-02 19:46:51 -07:00
Simon Pilgrim	2434c8fcf9	[DAG] canCreateUndefOrPoison - add ISD::INSERT_VECTOR_ELT handling If the inserted element index is guaranteed to be inbounds then a ISD::INSERT_VECTOR_ELT will not create poison/undef.	2023-04-02 16:28:26 +01:00
Simon Pilgrim	8153b92d9b	[DAG] Add SelectionDAG::SplitScalar helper Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source. Differential Revision: https://reviews.llvm.org/D147264	2023-03-31 18:35:40 +01:00
David Green	7b6fae42f7	[InterleaveAccess] Check that binop shuffles have an undef second operand It is expected that shuffles that we hoist through binops only have a single vector operand, the other being undef/poison. The checks for isDeInterleaveMaskOfFactor check that all the elements come from inside the first vector, but with non-canonical shuffles the second operand could still have a value. Add a quick check to make sure it is UndefValue as expected, to make sure we don't run into problems with BinOpShuffles not using BinOps. Fixes #61749 Differential Revision: https://reviews.llvm.org/D147306	2023-03-31 15:38:27 +01:00
Qiongsi Wu	f624372ccb	[AIX][CodeGen] Renaming mroptr to mxcoff-roptr This patch renames the `mroptr` option to `mxcoff-roptr` to indicate in the option itself that it is xcoff specific. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D147161	2023-03-31 10:09:48 -04:00
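
Note on fe4dbbb467 ([DAGCombiner] Fold add (mul x, C), x to mul x, C+1): the identity behind the fold can be spot-checked with plain wrapping arithmetic. The following is a minimal sketch, not the DAGCombiner code. It assumes 32-bit unsigned values stand in for the SDAG integer type, and the names addOfMul, mulByCPlusOne and foldAgrees are hypothetical helpers introduced only for this illustration. The linked Alive2 proof remains the authoritative argument.

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: add (mul x, C), x, in wrapping 32-bit arithmetic.
uint32_t addOfMul(uint32_t x, uint32_t c) { return x * c + x; }

// Folded form: mul x, C+1. C+1 may wrap to 0, which is still consistent.
uint32_t mulByCPlusOne(uint32_t x, uint32_t c) { return x * (c + 1u); }

// Spot-check both forms on a few boundary values (hypothetical sample set).
bool foldAgrees() {
  const uint32_t samples[] = {0u, 1u, 2u, 7u,
                              0x7fffffffu, 0x80000000u, 0xffffffffu};
  for (uint32_t x : samples)
    for (uint32_t c : samples)
      if (addOfMul(x, c) != mulByCPlusOne(x, c))
        return false;
  return true;
}

// Convenience wrapper so the check can be dropped into a test driver.
void checkFold() { assert(foldAgrees()); }
```

The sample set deliberately includes C = 0xffffffff, where C+1 wraps to zero; both forms then evaluate to zero for every x, so the fold needs no overflow guard in modular arithmetic.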

33876 Commits
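
Note on 17cd511007 (the ctlz shift combine): the counterexample in that message can be reproduced with a small model. This is a sketch under stated assumptions, not the DAGCombiner logic: the shift is read as a logical right shift by n = floor(log2(width)) (34 and 35 only yield 1 under a right shift), x is restricted to the two values 0 and 1 (only the least significant bit unknown, all others known zero), and ctlzWidth, floorLog2 and foldIsSound are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of x viewed as a `width`-bit integer.
// Returns `width` when x == 0.
unsigned ctlzWidth(uint64_t x, unsigned width) {
  assert(width >= 1 && width <= 64);
  unsigned count = 0;
  for (unsigned bit = width; bit-- > 0;) {
    if ((x >> bit) & 1u)
      break;
    ++count;
  }
  return count;
}

// floor(log2(width)) for width >= 1.
unsigned floorLog2(unsigned width) {
  unsigned n = 0;
  while ((width >> (n + 1)) != 0)
    ++n;
  return n;
}

// True when (ctlz(x) >> n) equals (x ^ 1) for both x = 0 and x = 1,
// i.e. when rewriting the shift of ctlz as `x ^ 1` preserves the value.
bool foldIsSound(unsigned width) {
  const unsigned n = floorLog2(width);
  for (uint64_t x = 0; x <= 1; ++x)
    if ((ctlzWidth(x, width) >> n) != (x ^ 1u))
      return false;
  return true;
}
```

With width 32 the model agrees with the rewrite: ctlz is 32 or 31, and shifting right by 5 gives 1 or 0. With width 35 it does not: ctlz is 35 or 34, both of which shift down to 1, while x ^ 1 is 0 for x = 1.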