llvm-project

Author	SHA1	Message	Date
Vikash Gupta	283806695a	[GlobalIsel] Add combine for select with constants (#121088 ) The SelectionDAG Isel supports both versions of the combines mentioned below: ``` select Cond, Pow2, 0 --> (zext Cond) << log2(Pow2) select Cond, 0, Pow2 --> (zext !Cond) << log2(Pow2) ``` The GlobalIsel for now only supports the first one, defined in its generic combinerHelper.cpp. This patch adds the missing second one.	2025-01-01 11:14:53 +05:30
Simon Pilgrim	b3a7ab6f1f	[DAG] Don't allow implicit truncation in extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C)) fold Limits #117900 to only fold when scalar_to_vector doesn't perform implicit truncation, as the scaled shift calculation doesn't currently account for this - this can be addressed in a future update. Fixes #121306	2024-12-30 16:08:35 +00:00
Fangrui Song	9efa7d7af3	Remove -print-lsr-output in favor of --stop-after=loop-reduce Pull Request: https://github.com/llvm/llvm-project/pull/121305	2024-12-29 18:58:30 -08:00
Fangrui Song	c3ef6d469d	Move two LLVM_DEBUG banners after skippers so that they don't show in -debug output when they are not run.	2024-12-29 10:55:07 -08:00
Fangrui Song	66dd7e63d8	Simplify enablePostRAScheduler and test enablePostRAScheduler() early	2024-12-28 23:51:44 -08:00
Kyungwoo Lee	815343e7dd	[CGData][Merger] Avoid merging the attached call target (#121030 ) For global function merging, the target of the arc-attached call must be a constant and cannot be parameterized. This change adds a check to bypass this case in `canParameterizeCallOperand()`.	2024-12-27 11:59:25 -08:00
Vikash Gupta	c21a3776c9	[GlobalIsel] [Utility] [NFC] Added isConstantOrConstantSplatVectorFP to handle float constants. (#120935 ) Needed for #120104	2024-12-26 18:57:19 +05:30
Igor Kirillov	3469996d0d	[SelectOpt] Optimise big select groups in the latch of a non-inner loop to branches (#119728 ) Loop latches often have a loop-carried dependency, and if they have several SelectLike instructions in one select group, it is usually profitable to convert it to branches rather than keep selects.	2024-12-25 12:58:21 +00:00
Fangrui Song	25bb6592c9	MCAsmInfo: replace AIX-specific variables with IsAIX AIX assembly is very different from the gas syntax. We don't expect other targets to share these differences. Unify the numerous, essentially AIX-specific variables.	2024-12-24 22:46:13 -08:00
Fangrui Song	56600c11ad	MCAsmInfo: replace HLASM-specific variables with IsHLASM HLASM is very different from the gas syntax. We don't expect other targets to customize the differences. Unify the numerous variables.	2024-12-24 18:37:46 -08:00
Ryotaro Kasuga	0d6a584f69	[MachinePipeliner] Add an abstract layer to manipulate Data Dependence Graph (#109918 ) In MachinePipeliner, a DAG class is used to represent the Data Dependence Graph. Data Dependence Graph generally contains cycles, so it's not appropriate to use DAG classes. In fact, some "hacks" are used to express back-edges in the current implementation. This patch adds a new class to provide a better interface for manipulating dependencies. Our approach is as follows: - To build the graph, we use the ScheduleDAGInstrs class as it is, because it has powerful functions and the current implementation depends heavily on it. - After the graph construction is finished (i.e., during scheduling), we use the new class DataDependenceGraph to manipulate the dependencies. Since we don't change the dependencies during scheduling, the new class only provides functions to read them. Also, this patch is just a refactoring, i.e., scheduling results should not change with or without this patch.	2024-12-24 10:02:15 +09:00
Fangrui Song	7b23f413d1	MCAsmStreamer: Omit initial ".text" llvm-mc --assemble prints an initial `.text` from `initSections`. This is weird for quick assembly tasks that do not specify `.text`. Omit the .text by moving section directive printing from `changeSection` to `switchSection`. switchSectionNoPrint now correctly calls the `changeSection` hook (needed by MachO). The initial directives of clang -S are now reordered. On ELF targets, we get `.file "a.c"; .text` instead of `.text; .file "a.c"`. If there is no function, `.text` will be omitted.	2024-12-22 22:03:44 -08:00
Pengcheng Wang	2e3003211f	[TRI][RISCV] Add methods to get common register class of two registers (#118435 ) Here we add two methods `getCommonMinimalPhysRegClass` and a LLT version `getCommonMinimalPhysRegClassLLT`, which return the most sub register class of the right type that contains these two input registers. We don't overload the `getMinimalPhysRegClass` as there will be ambiguities. We use it to simplify some code in RISC-V target.	2024-12-23 13:10:34 +08:00
NAKAMURA Takumi	d328d41061	Revert "Add a pass to collect dropped var stats for MIR (#120780 )" This reverts commit 3bf91ad2a9c75dd045961e45fdd830fd7b7a5455. (llvmorg-20-init-16123-g3bf91ad2a9c7) `llvm/CodeGen` should not depend on `llvm/Passes`.	2024-12-21 12:42:26 +09:00
Sergei Barannikov	9ae92d7056	[SelectionDAG] Virtualize isTargetStrictFPOpcode / isTargetMemoryOpcode (#119969 ) With this change, targets are no longer required to put memory / strict-fp opcodes after special `ISD::FIRST_TARGET_MEMORY_OPCODE`/`ISD::FIRST_TARGET_STRICTFP_OPCODE` markers. This will also allow autogenerating `isTargetMemoryOpcode`/`isTargetStrictFPOpcode` (#119709). Pull Request: https://github.com/llvm/llvm-project/pull/119969	2024-12-21 05:29:51 +03:00
Shubham Sandeep Rastogi	3bf91ad2a9	Add a pass to collect dropped var stats for MIR (#120780 ) This patch uses the DroppedVariableStats class to add dropped variable statistics for MIR passes. Reland 1c082c9cd12efaa67a32c5da89a328c458ed51c5	2024-12-20 10:08:54 -08:00
Wang Pengcheng	d7ddc976d5	[MachinePipeliner] Remove unused private field MF	2024-12-20 19:45:58 +08:00
Pengcheng Wang	d66f653c8d	[MachinePipeliner] Skip reserved registers when computing register pressure (#120694 ) We used to skip fixed registers, but fixed registers are not enough because there are some runtime unusable registers like registers reserved by `-ffixed-xxx` options. Here we change to use reserved registers so that the estimated pressure is more accurate.	2024-12-20 18:30:17 +08:00
Craig Topper	ecd59f802f	[SelectionDAG] Use SmallVectorImpl& to avoid repeating SmallVector size. NFC	2024-12-19 22:03:42 -08:00
Shubham Sandeep Rastogi	e7e622f153	Revert "Move DroppedVariableStats to CodeGen lib (#120650 )" This reverts commit 4307198d51487cc16f98eebb2113caf4a1905914. Broke bot ppc64le-clang-multistage-test: undefined reference to `llvm::DroppedVariableStats::populateVarIDSetAndInlinedMap` in function `llvm::DroppedVariableStatsIR::visitEveryInstruction`	2024-12-19 19:59:34 -08:00
Shubham Sandeep Rastogi	4307198d51	Move DroppedVariableStats to CodeGen lib (#120650 ) To get Dropped variable statistics for MIR, we need to move the base class DroppedVariableStats code to the CodeGen library because we cannot have CodeGen link against Passes. Also moved the code for the virtual functions to the header because clang/lib/CodeGen doesn't link against llvm/lib/CodeGen however it does link against Passes which contains the `class StandardInstrumentations` code but not the definition for the virtual functions leading to the error about not finding vtable for `class DroppedVariableStatsIR`	2024-12-19 18:09:14 -08:00
Paul Bowen-Huggett	ee7ca0ddda	Make CombinerHelper methods const (#119529 ) There are a number of backends (specifically AArch64, AMDGPU, Mips, and RISCV) which contain a “TODO: make CombinerHelper methods const” comment. This PR does just that and makes all of the CombinerHelper methods const, removes the TODO comments and makes the associated instances const. This change makes some sense because the CombinerHelper class simply modifies the state of _other_ objects to which it holds pointers or references. Note that AMDGPU contains an identical comment for an instance of AMDGPUCombinerHelper (a subclass of CombinerHelper). I deliberately haven’t modified the methods of that class in order to limit the scope of the change. I’m happy to do so either now or as a follow-up.	2024-12-20 08:29:18 +07:00
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specification of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
Craig Topper	f139bde8d8	[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536 ) This allows us to write more range based for loops because we no longer need the iterator. It also matches IR's Use class.	2024-12-19 09:07:42 -08:00
Craig Topper	e6b2495545	[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531 ) SDNode::use_iterator now returns an SDUse& when dereferenced. SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses work on use_iterator. SDNode::user_begin/user_end/users work on user_iterator. We can now write range based for loops using SDUse& and SDNode::uses(). I've converted many of these in this patch. I didn't update loops that have additional variables updated in their for statement. Some loops use SDNode::use_iterator::getOperandNo() which also prevents using range based for loops. I plan to move this into SDUse in a follow up patch.	2024-12-19 08:35:32 -08:00
Shubham Sandeep Rastogi	16d952898f	Revert "Add a pass to collect dropped var stats for MIR (#120501 )" This reverts commit 223c7648468cd4f649a578d3f9cbc27a63523192. Reverted due to buildbot failure: flang-aarch64-libcxx Linking CXX shared library lib/libLLVMAnalysis.so.20.0git FAILED: lib/libLLVMAnalysis.so.20.0git	2024-12-19 00:48:40 -08:00
Shubham Sandeep Rastogi	223c764846	Add a pass to collect dropped var stats for MIR (#120501 ) Reland "Add a pass to collect dropped var stats for MIR" (#117044) I am trying to reland https://github.com/llvm/llvm-project/pull/115566 I also moved the DroppedVariableStats code to the Analysis lib This is part of a stack of patches with https://github.com/llvm/llvm-project/pull/120502 being the first one in the stack	2024-12-19 00:41:48 -08:00
Craig Topper	bd261ecc5a	[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509 ) Most of these are just places that want the first user and aren't iterating over the whole list. While there I changed some use_size() == 1 to hasOneUse() which is more efficient. This is part of an effort to rename use_iterator to user_iterator and provide a use_iterator that dereferences to SDUse&. This patch helps reduce the diff on later patches.	2024-12-18 22:13:04 -08:00
Craig Topper	4ca4287da4	[SelectionDAG] Replace findGlueUse in SelectionDAGISel with SDNode::getGluedUser. NFC (#120512 )	2024-12-18 21:46:52 -08:00
Craig Topper	104ad9258a	[SelectionDAG] Rename SDNode::uses() to users(). (#120499 ) This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long been annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.	2024-12-18 20:09:33 -08:00
Zhaoxin Yang	f334db92be	[llvm][CodeGen] Intrinsic `llvm.powi.*` code gen for vector arguments (#118242 ) Scalarize vector FPOWI instead of promoting the type. This allows the scalar FPOWIs to be visited and converted to libcalls before promoting the type. FIXME: This should be done in LegalizeVectorOps/LegalizeDAG, but call lowering needs the unpromoted EVT. Without this patch, in some backends, such as RISCV64 and LoongArch64, the i32 type is illegal and will be promoted. This causes exponent type check to fail when ISD::FPOWI node generates a libcall. Fix https://github.com/llvm/llvm-project/issues/118079	2024-12-19 08:57:31 +08:00
Florian Hahn	76714be5fd	Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875 )" This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a. This has been breaking most AArch64 stage2 builds for 4+ hours, reverting to get the bots back to green. https://lab.llvm.org/buildbot/#/builders/41/builds/4172 https://lab.llvm.org/buildbot/#/builders/4/builds/4281 https://lab.llvm.org/buildbot/#/builders/199/builds/263 https://lab.llvm.org/buildbot/#/builders/198/builds/334 https://lab.llvm.org/buildbot/#/builders/143/builds/4276 https://lab.llvm.org/buildbot/#/builders/17/builds/4725	2024-12-18 15:06:52 +00:00
Paul Walker	3146911eb0	[LLVM][AsmPrinter] Add vector ConstantInt/FP support to emitGlobalConstantImpl. (#120077 ) This fixes a failure path for fixed length vector globals when ConstantInt/FP is used to represent splats instead of ConstantDataVector.	2024-12-18 11:51:01 +00:00
Nicholas Guy	b3eede5e1f	Add support for single reductions in ComplexDeinterleavingPass (#112875 ) The Complex Deinterleaving pass assumes that all values emitted will result in complex numbers, this patch aims to remove that assumption and adds support for emitting just the real or imaginary components, not both.	2024-12-18 10:34:26 +00:00
Florian Mayer	d9703501b0	[MTE] [NFC] use vector to collect globals to tag (#120283 ) The same pattern caused test failures in the HWASan pass, so is brittle. Let's go for the easier approach.	2024-12-18 00:38:19 -08:00
Pengcheng Wang	1235a93fae	[MachinePipeliner] Use `RegisterClassInfo::getRegPressureSetLimit` (#119827 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just as the comment "This limit must be adjusted dynamically for reserved registers" says. Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and remove replicated code. Separate from https://github.com/llvm/llvm-project/pull/118787	2024-12-18 15:13:03 +08:00
Pengcheng Wang	b6ad231666	[MachineSink] Use `RegisterClassInfo::getRegPressureSetLimit` (#119830 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just as the comment "This limit must be adjusted dynamically for reserved registers" says. Separate from https://github.com/llvm/llvm-project/pull/118787	2024-12-18 14:51:01 +08:00
Jon Roelofs	01d7a187a4	[llvm] Add missing dependency of libLLVMCodeGen on vt_gen ``` llvm-project/llvm/include/llvm/CodeGenTypes/MachineValueType.h:43:10: fatal error: 'llvm/CodeGen/GenVT.inc' file not found 43 \| #include "llvm/CodeGen/GenVT.inc" \| ^~~~~~~~~~~~~~~~~~~~~~~~ ``` rdar://141643651	2024-12-17 17:02:55 -07:00
Thurston Dang	e8a6563768	Fix-forward 'RegAllocFast: Avoid using temporary DiagnosticInfo #120184 ' (#120268 ) There was a buildbot breakage (https://lab.llvm.org/buildbot/#/builders/24/builds/3329/steps/11/logs/stdio): /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/AMDGPU/ran-out-of-registers-error-all-regs-reserved.ll:9:10: error: CHECK: expected string not found in input ; CHECK: error: <unknown>:0:0: no registers from class available to allocate in function 'no_registers_from_class_available_to_allocate' 2: ==75198==ERROR: AddressSanitizer: stack-use-after-scope on address 0xfa23f9f1c270 at pc 0xb2660dda9340 bp 0xfffffe8ab340 sp 0xfffffe8ab338 caused by https://github.com/llvm/llvm-project/pull/120184, which made a partial fix but also re-enabled the tests. This patch attempts to fix forward by applying the same fix to the error message highlighted in the buildbot.	2024-12-17 09:09:13 -08:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). 
The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). 
That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
Florian Hahn	c1f5937eb4	[SelectOpt] Support BinOps with SExt operands. (#115879 ) Building on top of https://github.com/llvm/llvm-project/pull/115489 extend support for binops with SExt operand. PR: https://github.com/llvm/llvm-project/pull/115879	2024-12-17 11:52:15 +00:00
Benjamin Maxwell	a7dafea384	[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117 ) This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to check that it is safe to fold a store into a node that will expand to a library call that takes output pointers. This requires checking for two (independent) properties: 1. The store is not within a CALLSEQ_START..CALLSEQ_END pair * If it is, the expansion would lead to nested call sequences (which is invalid) 2. The node does not appear as a predecessor to the store * If it does, attempting to merge the store into the call would result in a cycle in the DAG These two properties are checked as part of the same traversal in `canFoldStoreIntoLibCallOutputPointers()`	2024-12-17 10:54:17 +00:00
Matt Arsenault	10b12e6e07	LiveVariables: Use Register (#120204 )	2024-12-17 17:45:24 +07:00
Matt Arsenault	3508d8f6dd	RegAllocFast: Avoid using temporary DiagnosticInfo (#120184 ) This reverts commit 1297933f35b4948b4d281259627a72094c407a75.	2024-12-17 16:19:26 +07:00
Florian Mayer	514580b438	[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918 ) This makes sure no optimizations are applied that assume the bigger alignment or size, which could be incorrect if we link together with non-instrumented code.	2024-12-17 00:47:02 -08:00
Matt Arsenault	5e727e8bed	[Statepoint] Treat undef operands less specially (#119682 ) This reverts commit f7443905af1e06eaacda1e437fff8d54dc89c487. This is to avoid an assertion if an undef operand appears in a stackmap. This is important to avoid hitting verifier errors when register allocation starts adding undefs in error scenarios. Rather than trying to treat undef operands as special, leave them alone and avoid producing an invalid spill. It would be a bit more precise to produce a spill of an undef register here, but that's not exposed through the storeRegToStackSlot API. https://reviews.llvm.org/D122605 This was an alternative to https://reviews.llvm.org/D122582	2024-12-17 12:59:46 +07:00
Matt Arsenault	e2cabd715b	RegAllocGreedy: Fix comment typo	2024-12-17 12:46:04 +07:00
Simon Pilgrim	0954c67d7a	[DAG] visitFREEZE - only fold integer types to an all ones constant ISD::isBuildVectorAllOnes can peek through bitcasts, so this can match against FP NAN (ish) data (e.g. double (bitcast i64 -1)) under certain circumstances - bail if the type isn't an integer and let bitcast folding handle it first. Fixes #120093	2024-12-16 16:46:38 +00:00
Aiden Grossman	76f258920d	[MLGO] Do not include urgent LRs in max cascade calculation (#120052 ) A previous PR introduced a threshold where we would mask out a LR that had been evicted a certain number of times to combat pathological compile time cases with a somewhat adversarial model. However, this patch did not take into account urgent LRs which led to compilation failures when greedy would expect us to provide an eviction and we could not due to the newly introduced logic.	2024-12-16 07:14:34 -08:00
Björn Pettersson	3ad2399148	[DAGCombiner] Refactor and improve ReduceLoadOpStoreWidth (#119564 ) This patch makes a couple of improvements to ReduceLoadOpStoreWidth. When determining the minimum size of "NewBW" we now take byte boundaries into account. If we for example touch bits 6-10 we shouldn't accept NewBW=8, because we would fail later when detecting that we can't access bits from two different bytes in memory using a single load. Instead we make sure to align LSB/MSB according to byte size boundaries up front before searching for a viable "NewBW". In the past we only tried to find a "ShAmt" that was a multiple of "NewBW", but now we use a sliding window technique to scan for a viable "ShAmt" that is a multiple of the byte size. This can help find more opportunities for optimization (especially if the original type isn't byte sized, and for big-endian targets when the original load/store is aligned on the most significant bit).	2024-12-16 12:15:11 +01:00
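
The byte-boundary reasoning in the ReduceLoadOpStoreWidth entry (3ad2399148) can be illustrated with a small standalone sketch. It is not the DAGCombiner code: `ByteAlignedRange` and `alignToBytes` are hypothetical names, and 8-bit bytes are assumed. It only shows why touching bits 6-10 rules out a width of 8.

```cpp
#include <cstdint>

// Hypothetical illustration type: a bit range whose ends sit on byte boundaries.
struct ByteAlignedRange {
  unsigned LowBit; // first bit of the range, a multiple of 8
  unsigned Width;  // number of bits covered, a multiple of 8
};

// Widen the inclusive bit range [Lsb, Msb] so that it starts and ends on
// byte boundaries (assumes 8-bit bytes).
inline ByteAlignedRange alignToBytes(unsigned Lsb, unsigned Msb) {
  unsigned Low = Lsb & ~7u;       // round the LSB down to a byte boundary
  unsigned High = (Msb | 7u) + 1; // one past the MSB, rounded up to a byte boundary
  return {Low, High - Low};
}
```

With this sketch, `alignToBytes(6, 10)` yields a range starting at bit 0 with a width of 16, so any candidate narrower than 16 bits cannot cover the touched bits with a single access.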

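
The two select combines quoted in the GlobalIsel entry (283806695a) rest on a simple arithmetic identity. The sketch below checks that identity on plain integers; it is not the combiner implementation, and the function names are made up for illustration.

```cpp
#include <cstdint>

// select Cond, Pow2, 0 --> (zext Cond) << log2(Pow2)
// Log2 is log2(Pow2) and is assumed to be less than 32.
inline std::uint32_t selectPow2OrZero(bool Cond, unsigned Log2) {
  return static_cast<std::uint32_t>(Cond) << Log2;
}

// select Cond, 0, Pow2 --> (zext !Cond) << log2(Pow2)
// Same as above, but the condition is inverted before the zero-extension.
inline std::uint32_t selectZeroOrPow2(bool Cond, unsigned Log2) {
  return static_cast<std::uint32_t>(!Cond) << Log2;
}
```

For Pow2 = 8 (Log2 = 3), `selectPow2OrZero(true, 3)` gives 8 and `selectZeroOrPow2(true, 3)` gives 0, matching `Cond ? 8 : 0` and `Cond ? 0 : 8` respectively.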

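
The shadow-memory check described in the TySan entry (a487b792e2) can be modelled roughly as follows. This is a toy sketch under stated assumptions, not the instrumentation pass: the shadow is a plain vector with one slot per byte, type descriptors are arbitrary nonzero integers, and `Outcome` and `checkAccess` are hypothetical names. Only the values 0 (unknown) and -1 (interior byte) come from the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Outcome { Match, Claimed, CallRuntime };

constexpr std::intptr_t Unknown = 0;   // shadow value for "type unknown"
constexpr std::intptr_t Interior = -1; // shadow value for a non-first byte of a type

// Toy model of the check for an access of Size bytes at Addr with type
// descriptor Desc. Assumes Addr + Size <= Shadow.size().
inline Outcome checkAccess(std::vector<std::intptr_t> &Shadow, std::size_t Addr,
                           std::size_t Size, std::intptr_t Desc) {
  std::intptr_t First = Shadow[Addr];
  if (First == Desc) {
    // Exact match: the remaining bytes must all be interior bytes.
    for (std::size_t I = 1; I < Size; ++I)
      if (Shadow[Addr + I] != Interior)
        return Outcome::CallRuntime;
    return Outcome::Match;
  }
  if (First == Unknown) {
    // Unknown: the remaining bytes must also be unknown before we claim them.
    for (std::size_t I = 1; I < Size; ++I)
      if (Shadow[Addr + I] != Unknown)
        return Outcome::CallRuntime;
    Shadow[Addr] = Desc;
    for (std::size_t I = 1; I < Size; ++I)
      Shadow[Addr + I] = Interior;
    return Outcome::Claimed;
  }
  // Neither an exact match nor unknown: defer to the runtime.
  return Outcome::CallRuntime;
}
```

In this model, a first 4-byte access with descriptor 42 to an all-zero shadow claims the bytes, a second identical access takes the matching path, and an access with a different descriptor falls through to the runtime.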
36973 Commits