llvm-project

Author	SHA1	Message	Date
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specification of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
Craig Topper	f139bde8d8	[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536 ) This allows us to write more range based for loops because we no longer need the iterator. It also matches IR's Use class.	2024-12-19 09:07:42 -08:00
Craig Topper	e6b2495545	[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531 ) SDNode::use_iterator now returns an SDUse& when dereferenced. SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses work on use_iterator. SDNode::user_begin/user_end/users work on user_iterator. We can now write range based for loops using SDUse& and SDNode::uses(). I've converted many of these in this patch. I didn't update loops that have additional variables updated in their for statement. Some loops use SDNode::use_iterator::getOperandNo() which also prevents using range based for loops. I plan to move this into SDUse in a follow up patch.	2024-12-19 08:35:32 -08:00
Shubham Sandeep Rastogi	16d952898f	Revert "Add a pass to collect dropped var stats for MIR (#120501 )" This reverts commit 223c7648468cd4f649a578d3f9cbc27a63523192. Reverted due to buildbot failure: flang-aarch64-libcxx Linking CXX shared library lib/libLLVMAnalysis.so.20.0git FAILED: lib/libLLVMAnalysis.so.20.0git	2024-12-19 00:48:40 -08:00
Shubham Sandeep Rastogi	223c764846	Add a pass to collect dropped var stats for MIR (#120501 ) Reland "Add a pass to collect dropped var stats for MIR" (#117044) I am trying to reland https://github.com/llvm/llvm-project/pull/115566 I also moved the DroppedVariableStats code to the Analysis lib This is part of a stack of patches with https://github.com/llvm/llvm-project/pull/120502 being the first one in the stack	2024-12-19 00:41:48 -08:00
Craig Topper	bd261ecc5a	[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509 ) Most of these are just places that want the first user and aren't iterating over the whole list. While there I changed some use_size() == 1 to hasOneUse() which is more efficient. This is part of an effort to rename use_iterator to user_iterator and provide a use_iterator that dereferences to SDUse&. This patch helps reduce the diff on later patches.	2024-12-18 22:13:04 -08:00
Craig Topper	4ca4287da4	[SelectionDAG] Replace findGlueUse in SelectionDAGISel with SDNode::getGluedUser. NFC (#120512 )	2024-12-18 21:46:52 -08:00
Craig Topper	104ad9258a	[SelectionDAG] Rename SDNode::uses() to users(). (#120499 ) This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long been annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.	2024-12-18 20:09:33 -08:00
Zhaoxin Yang	f334db92be	[llvm][CodeGen] Intrinsic `llvm.powi.*` code gen for vector arguments (#118242 ) Scalarize vector FPOWI instead of promoting the type. This allows the scalar FPOWIs to be visited and converted to libcalls before promoting the type. FIXME: This should be done in LegalizeVectorOps/LegalizeDAG, but call lowering needs the unpromoted EVT. Without this patch, in some backends, such as RISCV64 and LoongArch64, the i32 type is illegal and will be promoted. This causes exponent type check to fail when ISD::FPOWI node generates a libcall. Fix https://github.com/llvm/llvm-project/issues/118079	2024-12-19 08:57:31 +08:00
Florian Hahn	76714be5fd	Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875 )" This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a. This has been breaking most AArch64 stage2 builds for 4+ hours, reverting to get the bots back to green. https://lab.llvm.org/buildbot/#/builders/41/builds/4172 https://lab.llvm.org/buildbot/#/builders/4/builds/4281 https://lab.llvm.org/buildbot/#/builders/199/builds/263 https://lab.llvm.org/buildbot/#/builders/198/builds/334 https://lab.llvm.org/buildbot/#/builders/143/builds/4276 https://lab.llvm.org/buildbot/#/builders/17/builds/4725	2024-12-18 15:06:52 +00:00
Paul Walker	3146911eb0	[LLVM][AsmPrinter] Add vector ConstantInt/FP support to emitGlobalConstantImpl. (#120077 ) This fixes a failure path for fixed length vector globals when ConstantInt/FP is used to represent splats instead of ConstantDataVector.	2024-12-18 11:51:01 +00:00
Nicholas Guy	b3eede5e1f	Add support for single reductions in ComplexDeinterleavingPass (#112875 ) The Complex Deinterleaving pass assumes that all values emitted will result in complex numbers. This patch aims to remove that assumption and adds support for emitting just the real or imaginary components, not both.	2024-12-18 10:34:26 +00:00
Florian Mayer	d9703501b0	[MTE] [NFC] use vector to collect globals to tag (#120283 ) The same pattern caused test failures in the HWASan pass, so is brittle. Let's go for the easier approach.	2024-12-18 00:38:19 -08:00
Pengcheng Wang	1235a93fae	[MachinePipeliner] Use `RegisterClassInfo::getRegPressureSetLimit` (#119827 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just as the comment "This limit must be adjusted dynamically for reserved registers" says. Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and remove replicated code. Separate from https://github.com/llvm/llvm-project/pull/118787	2024-12-18 15:13:03 +08:00
Pengcheng Wang	b6ad231666	[MachineSink] Use `RegisterClassInfo::getRegPressureSetLimit` (#119830 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just as the comment "This limit must be adjusted dynamically for reserved registers" says. Separate from https://github.com/llvm/llvm-project/pull/118787	2024-12-18 14:51:01 +08:00
Jon Roelofs	01d7a187a4	[llvm] Add missing dependency of libLLVMCodeGen on vt_gen ``` llvm-project/llvm/include/llvm/CodeGenTypes/MachineValueType.h:43:10: fatal error: 'llvm/CodeGen/GenVT.inc' file not found 43 \| #include "llvm/CodeGen/GenVT.inc" \| ^~~~~~~~~~~~~~~~~~~~~~~~ ``` rdar://141643651	2024-12-17 17:02:55 -07:00
Thurston Dang	e8a6563768	Fix-forward 'RegAllocFast: Avoid using temporary DiagnosticInfo #120184 ' (#120268 ) There was a buildbot breakage (https://lab.llvm.org/buildbot/#/builders/24/builds/3329/steps/11/logs/stdio): /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/AMDGPU/ran-out-of-registers-error-all-regs-reserved.ll:9:10: error: CHECK: expected string not found in input ; CHECK: error: <unknown>:0:0: no registers from class available to allocate in function 'no_registers_from_class_available_to_allocate' 2: ==75198==ERROR: AddressSanitizer: stack-use-after-scope on address 0xfa23f9f1c270 at pc 0xb2660dda9340 bp 0xfffffe8ab340 sp 0xfffffe8ab338 caused by https://github.com/llvm/llvm-project/pull/120184, which made a partial fix but also re-enabled the tests. This patch attempts to fix forward by applying the same fix to the error message highlighted in the buildbot.	2024-12-17 09:09:13 -08:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). 
The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). 
That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
Florian Hahn	c1f5937eb4	[SelectOpt] Support BinOps with SExt operands. (#115879 ) Building on top of https://github.com/llvm/llvm-project/pull/115489 extend support for binops with SExt operand. PR: https://github.com/llvm/llvm-project/pull/115879	2024-12-17 11:52:15 +00:00
Benjamin Maxwell	a7dafea384	[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117 ) This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to check that it is safe to fold a store into a node that will expand to a library call that takes output pointers. This requires checking for two (independent) properties: 1. The store is not within a CALLSEQ_START..CALLSEQ_END pair * If it is, the expansion would lead to nested call sequences (which is invalid) 2. The node does not appear as a predecessor to the store * If it does, attempting to merge the store into the call would result in a cycle in the DAG These two properties are checked as part of the same traversal in `canFoldStoreIntoLibCallOutputPointers()`	2024-12-17 10:54:17 +00:00
Matt Arsenault	10b12e6e07	LiveVariables: Use Register (#120204 )	2024-12-17 17:45:24 +07:00
Matt Arsenault	3508d8f6dd	RegAllocFast: Avoid using temporary DiagnosticInfo (#120184 ) This reverts commit 1297933f35b4948b4d281259627a72094c407a75.	2024-12-17 16:19:26 +07:00
Florian Mayer	514580b438	[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918 ) This makes sure no optimizations are applied that assume the bigger alignment or size, which could be incorrect if we link together with non-instrumented code.	2024-12-17 00:47:02 -08:00
Matt Arsenault	5e727e8bed	[Statepoint] Treat undef operands less specially (#119682 ) This reverts commit f7443905af1e06eaacda1e437fff8d54dc89c487. This is to avoid an assertion if an undef operand appears in a stackmap. This is important to avoid hitting verifier errors when register allocation starts adding undefs in error scenarios. Rather than trying to treat undef operands as special, leave them alone and avoid producing an invalid spill. It would be a bit more precise to produce a spill of an undef register here, but that's not exposed through the storeRegToStackSlot API. https://reviews.llvm.org/D122605 This was an alternative to https://reviews.llvm.org/D122582	2024-12-17 12:59:46 +07:00
Matt Arsenault	e2cabd715b	RegAllocGreedy: Fix comment typo	2024-12-17 12:46:04 +07:00
Simon Pilgrim	0954c67d7a	[DAG] visitFREEZE - only fold integer types to an all ones constant ISD::isBuildVectorAllOnes can peek through bitcasts, so this can match against FP NAN (ish) data (e.g. double (bitcast i64 -1)) under certain circumstances - bail if the type isn't an integer and let bitcast folding handle it first. Fixes #120093	2024-12-16 16:46:38 +00:00
Aiden Grossman	76f258920d	[MLGO] Do not include urgent LRs in max cascade calculation (#120052 ) A previous PR introduced a threshold where we would mask out a LR that had been evicted a certain number of times to combat pathological compile time cases with a somewhat adversarial model. However, this patch did not take into account urgent LRs which led to compilation failures when greedy would expect us to provide an eviction and we could not due to the newly introduced logic.	2024-12-16 07:14:34 -08:00
Björn Pettersson	3ad2399148	[DAGCombiner] Refactor and improve ReduceLoadOpStoreWidth (#119564 ) This patch makes a couple of improvements to ReduceLoadOpStoreWidth. When determining the minimum size of "NewBW" we now take byte boundaries into account. If we for example touch bits 6-10 we shouldn't accept NewBW=8, because we would fail later when detecting that we can't access bits from two different bytes in memory using a single load. Instead we make sure to align LSB/MSB according to byte size boundaries up front before searching for a viable "NewBW". In the past we only tried to find a "ShAmt" that was a multiple of "NewBW", but now we use a sliding window technique to scan for a viable "ShAmt" that is a multiple of the byte size. This can help find more opportunities for optimization (especially if the original type isn't byte sized, and for big-endian targets when the original load/store is aligned on the most significant bit).	2024-12-16 12:15:11 +01:00
David Green	a35db2880a	[NFC] Remove some unnecessary semicolons All inside LLVM_DEBUG, some of which have been cleaned up by adding block scopes to allow them to format more nicely.	2024-12-16 08:48:57 +00:00
Matt Arsenault	a3db5910b4	RegAllocBase: Avoid using temporary DiagnosticInfo (#120046 )	2024-12-16 14:57:13 +07:00
Daniil Kovalev	f65a21a4ec	[PAC][ELF][AArch64] Support signed personality function pointer (#119361 ) Re-apply #113148 after revert in #119331 If function pointer signing is enabled, sign personality function pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key, 0x7EAD = `ptrauth_string_discriminator("personality")` constant discriminator and address diversity enabled.	2024-12-16 10:24:09 +03:00
Craig Topper	54dac27c57	[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_UMAX/G_UMIN. (#120041 ) Similar to what we do for unsigned comparisons after #120032.	2024-12-15 23:16:58 -08:00
Craig Topper	115872902b	[GISel][RISCV] Use isSExtCheaperThanZExt when widening G_ICMP. (#120032 ) Sign extending i32->i64 is more efficient than zero extend for RV64.	2024-12-15 22:55:58 -08:00
Craig Topper	6dc24f6a2f	[GISel] Improve MachineVerifier for G_SCMP/UCMP. (#120017 ) -Ensure destination type is at least 2 bits. -Remove unnecessary check that both sources are the same type. The verifier already handles this generically.	2024-12-15 22:08:43 -08:00
Craig Topper	de1a423c23	[GISel][RISCV][AArch64] Support legalizing G_SCMP/G_UCMP to sub(isgt,islt). (#119265 ) Convert the LLT to EVT and call TargetLowering::shouldExpandCmpUsingSelects to determine if we should do this. We don't have a getSetccResultType, so I'm boolean extending the compares to the result type and using that. If the compares legalize to the same type, these extends will get removed. Unfortunately, if the compares legalize to a different type, we end up with truncates or extends that might not be optimally placed.	2024-12-15 20:47:17 -08:00
Matt Arsenault	818bffcb1c	RegAlloc: Fix failure on undef use when all registers are reserved (#119647 ) Greedy and fast would hit different assertions on undef uses if all registers in a class were reserved.	2024-12-16 10:56:45 +09:00
Matt Arsenault	61f99a1c75	RegAlloc: Do not fatal error if there are no registers in the alloc order (#119640 ) Try to use DiagnosticInfo if every register in the class is reserved by forcing assignment to a reserved register. Also reduces the number of redundant errors emitted, particularly with fast. This is still broken in the case of undef uses. There are additional complications in greedy and fast, so leave it for a separate fix.	2024-12-16 10:52:49 +09:00
Matt Arsenault	bb18e49edb	RegAlloc: Use DiagnosticInfo to report register allocation failures (#119492 ) Improve the non-fatal cases to use DiagnosticInfo, which will now provide a location. The allocators attempt to report different errors if they happen to see that inline assembly is involved (this detection is quite unreliable), using srcloc instead of dbgloc. For now, leave this behavior unchanged. I think reporting the full location and context function would be more useful.	2024-12-16 10:49:08 +09:00
Craig Topper	ca60ee2b8c	[GISel] Remove unnecessary MachineVerifier checks for G_ABDS/G_ABDU. (#120014 ) These are declared to use a single type index for all operands in GenericOpcodes.td and the verifier knows how to check that all operands with the same type index match.	2024-12-15 17:30:32 -08:00
David Green	4c8c130847	[AArch64][GlobalISel] Scalarize i128 shufflevector instructions. (#119980 ) This, like other operations, scalarizes shuffle vector operations with types larger than 64 bits. ImplicitDef and Freeze are also handled the same way, to allow them to legalize. The legalization of fewerElementsVectorShuffle is adjusted to handle scalarization.	2024-12-15 10:44:40 +00:00
Kazu Hirata	f01b62ad48	[GlobalISel] Fix warnings This patch fixes: llvm/lib/CodeGen/GlobalISel/CombinerHelperCasts.cpp:167:21: error: unused variable 'DL' [-Werror,-Wunused-variable] llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp:320:15: error: unused variable 'DL' [-Werror,-Wunused-variable]	2024-12-13 10:24:06 -08:00
Craig Topper	0d9fc17433	[GISel] Remove unused DataLayout operand from getApproximateEVTForLLT (#119833 )	2024-12-13 09:09:20 -08:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
David Green	94a77ebe24	[AArch64][GlobalISel] Guard against no operands in matchHoistLogicOpWithSameOpcodeHands In case both LeftHandInst and RightHandInst are IMPLICIT_DEF with no input operands, this patch protects against the post-legalizer-combiner matchHoistLogicOpWithSameOpcodeHands with no operands. The prelegalizercombiner-hoist-same-hands.mir test was cleaned up a little in the process, and has a post-legalizer run line added so that the implicit_def do not get folded away.	2024-12-13 11:02:55 +00:00
Aiden Grossman	60325abeb3	[MLGO] Add Threshold to Prevent Pathological Compile Time Cases (#119807 ) This patch adds a threshold flag, -mlregalloc-max-cascade, to prevent live ranges from being evicted more than is necessary. After deploying a new regalloc model, we ran into some pathological cases where the model decided it wanted to ping-pong evictions, taking up a large amount of compile time. This threshold is mostly a stop gap while we continue to investigate other solutions and work on minimizing/constructing test cases.	2024-12-12 20:46:45 -08:00
paperchalice	1562b70eaf	Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547 ) This relands commit #115111. Use traditional way to update post dominator tree, i.e. break critical edge splitting into insert, insert, delete sequence. When splitting critical edges, the post dominator tree may change its root node, and `setNewRoot` only works in normal dominator tree... See `6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)`	2024-12-13 11:43:09 +08:00
Aiden Grossman	ada517b40c	[MLGO][NFC] Clang format MLRegAllocEvictAdvisor.cpp Run clang-format to fix an issue in spacing in a comment.	2024-12-13 03:40:15 +00:00
Craig Topper	7ece560a50	[GISel] Support narrowing G_ICMP with more than 2 parts. (#119335 ) This allows us to support i128 G_ICMP on RV32. I'm not sure how to test the "left over" part of this as RISC-V always widens to a power of 2 before narrowing.	2024-12-12 09:50:26 -08:00
Tim Gymnich	2db2dc8ab9	[GlobalISel][NFC] Fix LLT Propagation (#119587 ) Retain LLT type information by creating new LLTs from the original LLT instead of only using the original scalar size. This PR prepares for the [LLT FPInfo RFC](https://discourse.llvm.org/t/rfc-globalisel-adding-fp-type-information-to-llt/83349/24) where LLTs will carry additional floating point type information in addition to the scalar size.	2024-12-12 09:47:46 -08:00
Igor Kirillov	e909c0ccd4	[SelectOpt] Add support for AShr/LShr operands (#118495 ) For conditional increments with sign check conditions like X < 0 or X >= 0, the compiler may generate code like this: %cmp = icmp sgt i64 %1, -1 %shift = ashr i64 %1, 63 %j.next = add nsw i64 %j, %shift %sel = select i1 %cmp ... , where %cmp is not in computation but in some other implicit or regular expressions. This patch allows SelectOptimize pass to recognise these cases.	2024-12-12 14:06:24 +00:00
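
The last entry above (e909c0ccd4) describes a conditional increment expressed as `ashr i64 %1, 63` followed by an `add`. The following is a minimal standalone C++ sketch of the arithmetic identity behind that pattern, not code from the SelectOptimize pass itself. The function names `step_with_select`, `step_with_ashr`, and `check_equivalence` are hypothetical and introduced only for illustration. It assumes that `>>` on a negative signed value is an arithmetic shift, which C++20 guarantees and mainstream compilers implement for earlier standards.

```cpp
#include <cassert>
#include <cstdint>

// Select form: subtract one from j when x is negative (the "X < 0" case),
// otherwise leave j unchanged.
std::int64_t step_with_select(std::int64_t j, std::int64_t x) {
    return x < 0 ? j - 1 : j;
}

// Shift form mirroring the IR in the commit message:
//   %shift  = ashr i64 %x, 63
//   %j.next = add nsw i64 %j, %shift
// An arithmetic shift by 63 copies the sign bit into every bit position,
// so shift is -1 when x is negative and 0 otherwise.
// Assumption: >> on a negative signed value is arithmetic (guaranteed in C++20).
std::int64_t step_with_ashr(std::int64_t j, std::int64_t x) {
    std::int64_t shift = x >> 63;
    return j + shift;
}

// Compares both forms on a few representative inputs, including the extremes.
void check_equivalence() {
    const std::int64_t samples[] = {INT64_MIN, -5, -1, 0, 1, 5, INT64_MAX};
    for (std::int64_t x : samples) {
        assert(step_with_select(100, x) == step_with_ashr(100, x));
    }
}
```

Because the two forms agree for every `x`, the shift can be read as a hidden select on the sign of `x`, which is presumably why the commit teaches the pass to recognise it.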

36951 Commits