llvm-project

Author	SHA1	Message	Date
Matt Arsenault	1d0370872f	AMDGPU: Expand flat atomics that may access private memory (#109407) If the runtime flat address resolves to a scratch address, 64-bit atomics do not work correctly. Insert a runtime address space check (which is quite likely to be uniform) and select between the non-atomic and real atomic cases. Consider noalias.addrspace metadata and avoid this expansion when possible (we also need to consider it to avoid infinitely expanding after adding the predication code).	2024-10-31 08:08:48 -07:00
Matt Arsenault	c198f775cd	AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642) These have been replaced with atomicrmw	2024-10-09 09:27:28 +04:00
Matt Arsenault	fbd2a91865	InferAddressSpaces: Handle llvm.fake.use (#109567)	2024-10-09 09:24:37 +04:00
Matt Arsenault	a87640c97e	AMDGPU: Fix assertion on load of vector of pointers (#110436) Fix InferAddressSpaces asserting on a load of a vector of flat pointers. Fixes #110433	2024-09-30 10:16:38 +04:00
Matt Arsenault	ee08d9cba5	AMDGPU: Remove global/flat atomic fadd intrinics (#97051) These have been replaced with atomicrmw.	2024-08-22 23:27:33 +04:00
Matt Arsenault	1db674b83d	InferAddressSpaces: Convert test to generated checks Also use named values	2024-08-16 15:05:41 +04:00
Matt Arsenault	2ccbf92f87	InferAddressSpaces: Restore non-instruction user check Fixes regression after 79658d65c3c7a075382b74d81e74714e2ea9bd2d. We were missing test coverage for the nested constant expression case.	2024-08-15 15:55:09 +04:00
Matt Arsenault	7a51dde4e6	InferAddressSpaces: Improve handling of instructions with multiple pointer uses (#101922) The use list iteration worked correctly for the load and store case. The atomic instructions happen to have the pointer value as the last visited operand, but we rejected the instruction as simple after the first encountered use. Ignore the use list for the recognized load/store/atomic instructions, and just try to directly replace the known pointer use.	2024-08-08 13:19:35 +04:00
Matt Arsenault	2ef553c05f	InferAddressSpaces: Handle llvm.is.constant (#102010)	2024-08-06 00:20:01 +04:00
Matt Arsenault	47fc4c37bb	InferAddressSpaces: Handle masked load and store intrinsics (#102007)	2024-08-06 00:17:07 +04:00
Matt Arsenault	f01a6f5ecb	InferAddressSpaces: Handle prefetch intrinsic (#101982)	2024-08-06 00:14:02 +04:00
Matt Arsenault	3c483b887e	InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877)	2024-08-04 16:36:00 +04:00
Matt Arsenault	b1bcb7ca46	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.	2024-07-15 11:51:44 +04:00
dyung	adaff46d08	Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green.	2024-07-14 18:48:54 -07:00
Matt Arsenault	78bc1b64a6	AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs.	2024-07-14 08:36:33 +04:00
Shan Huang	a355c2d074	[DebugInfo][InferAddressSpaces] Fix the missing debug location update for the new addrspacecast (#97038) Fix #97006.	2024-07-03 09:39:17 +08:00
Matt Arsenault	eda9ff899f	AMDGPU: Flat instructions do not have signed offsets gfx7-gfx11 (#95852) Fixes some atomicrmw fadd and intrinsic cases	2024-06-18 13:20:34 +02:00
Nikita Popov	deab451e7a	[IR] Remove support for icmp and fcmp constant expressions (#93038) Remove support for the icmp and fcmp constant expressions. This is part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179 As usual, many of the updated tests will no longer test what they were originally intended to -- this is hard to preserve when constant expressions get removed, and in many cases just impossible as the existence of a specific kind of constant expression was the cause of the issue in the first place.	2024-06-04 08:31:03 +02:00
Nikita Popov	d10b76552f	[ConstantFold] Remove notional over-indexing fold (#93697) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities.	2024-05-30 08:36:44 +02:00
Nikita Popov	a49b5cad99	[InferAddressSpaces] Generate test checks (NFC)	2024-05-29 15:26:59 +02:00
Matt Arsenault	9f9856d623	AMDGPU: Update name for amdgpu.no.remote.memory metadata	2024-05-03 11:50:59 +02:00
Florian Hahn	c8e5ad4e12	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)" This reverts commit 7dbba39e583a3fd64e7e6b947251c035e483f054. Revert as there are reports this triggers during ThinLTO in some configurations.	2024-04-22 10:50:49 +01:00
Matt Arsenault	f433c3b380	AMDGPU: Add tests for atomicrmw handling of new metadata (#89248) Add baseline tests which should comprehensively test the new atomic metadata. Test codegen / expansion, and preservation in a few transforms. New metadata defined in #85052	2024-04-20 00:43:36 +02:00
Julian Nagele	7dbba39e58	Reapply "[TBAA] Add verifier for tbaa.struct metadata (#86709)" This reverts commit b9cd48f96acdd07c627ccafbf4386a1f3dcd6c51. ------------------------------------------------------------- Original commit message: Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-04-15 11:25:06 +01:00
Florian Hahn	b9cd48f96a	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)" This reverts commit df75183d70e029352a49c93f275db703c81a65c1. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio	2024-03-27 21:22:15 +00:00
Julian Nagele	df75183d70	[TBAA] Add verifier for tbaa.struct metadata (#86709) Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-03-27 10:30:27 +01:00
Pierre van Houtryve	c831d83bb1	[InferAddrSpaces] Correctly replace identical operands of insts (#82610) It's important for PHI nodes because if a PHI node has multiple edges coming from the same block, we can have the same incoming value multiple times in the list of incoming values. All of those need to be consistent (exact same Value*) otherwise verifier complains. Fixes SWDEV-445797	2024-02-22 13:59:04 +01:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Wenju He	fe146e9b59	[InferAddressSpaces] Fix constant replace to avoid modifying other functions (#70611) A constant value is unique in llvm context. InferAddressSpaces was replacing its users in other functions as well. This leads to unexpected behavior in our downstream use case after the pass. InferAddressSpaces is a function passe, so it shall not modify functions other than currently processed one. Co-authored-by: Abhinav Gaba <abhinav.gaba@intel.com> --------- Co-authored-by: Abhinav Gaba <abhinav.gaba@intel.com>	2023-11-13 13:28:56 +08:00
Noah Goldstein	8c2fcf5b77	[InstSimplify] Add some basic simplifications for `llvm.ptrmask` Mostly the same as `and`. We also have a check for a useless `llvm.ptrmask` if the ptr is already known aligned. Differential Revision: https://reviews.llvm.org/D156633	2023-11-01 23:50:35 -05:00
Wenju He	d199ff1765	[InferAddressSpaces] collect flat address expression from return value (#70610) If function return value's type is pointer, we can try to collect flat address expression from it. This PR also fixes noop_ptrint_pair_ce2 in noop-ptrint-pair.ll in #70611	2023-11-01 13:32:38 +08:00
Nikita Popov	eb86de63d9	[IR] Require that ptrmask mask matches pointer index size (#69343) Currently, we specify that the ptrmask intrinsic allows the mask to have any size, which will be zero-extended or truncated to the pointer size. However, what semantics of the specified GEP expansion actually imply is that the mask is only meaningful up to the pointer type index size -- any higher bits of the pointer will always be preserved. In other words, the mask gets 1-extended from the index size to the pointer size. This is also the behavior we want for CHERI architectures. This PR makes two changes: * It spells out the interaction with the pointer type index size more explicitly. * It requires that the mask matches the pointer type index size. The intention here is to make handling of this intrinsic more robust, to avoid accidental mix-ups of pointer size and index size in code generating this intrinsic. If a zero-extend or truncate of the mask is desired, it should just be done explicitly in IR. This also cuts down on the amount of testing we have to do, and things transforms needs to check for. As far as I can tell, we don't actually support pointers with different index type size at the SDAG level, so I'm just asserting the sizes match there for now. Out-of-tree targets using different index sizes may need to adjust that code.	2023-10-24 09:54:29 +02:00
Simon Pilgrim	e4d0e12099	[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2) (REAPPLIED) Assuming the ADD is nsw then it may be sign-extended to merge with a SHL op in a similar fold to the existing (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold. This is most useful for helping to expose address math for X86, but has also touched several aarch64 test cases as well. Alive2: https://alive2.llvm.org/ce/z/2UpSbJ Differential Revision: https://reviews.llvm.org/D159198	2023-09-06 13:19:42 +01:00
Matt Arsenault	92ee60b66f	AMDGPU: Drop and upgrade llvm.amdgcn.atomic.inc/dec to atomicrmw	2023-06-21 21:20:26 -04:00
Amaury Séchet	a70d5e25f3	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-13 09:14:37 +00:00
Matt Arsenault	5b657f50b8	AMDGPU: Move LICM after AMDGPUCodeGenPrepare The commit that added the run says it's to hoist uniform parts of integer division expansion. That expansion is performed later, so this didn't do anything in that case. Move this later so the original test shows the improvement. This also saves a run of "Canonicalize natural loops". Not sure why this appears to be still getting a separate loop PM run. Also feels a bit heavy to run this just for divide. Is there a way to specifically hoist the divide sequence when it expands?	2023-06-10 07:37:32 -04:00
CaprYang	44096e6904	[InferAddressSpaces] Handle vector of pointers type & Support intrinsic masked gather/scatter	2023-05-17 23:40:06 +01:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
Juan Manuel MARTINEZ CAAMAÑO	33da608ecc	[AMDGPU][InferAddressSpaces] Only rewrite address-spaces that can be trivially casted to flat for llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} The intrinsic @llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} can only be selected for flat address spaces (constant, flat and global). This patch restricts the cases over which GCNTTIImpl::rewriteIntrinsicWithAddressSpace rewrites the intrinsic. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D149938	2023-05-16 17:32:58 +02:00
Krzysztof Drewniak	f0415f2a45	Re-land "[AMDGPU] Define data layout entries for buffers"" Re-land D145441 with data layout upgrade code fixed to not break OpenMP. This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27. Differential Revision: https://reviews.llvm.org/D149776	2023-05-03 19:43:56 +00:00
Krzysztof Drewniak	3f2fbe92d0	Revert "[AMDGPU] Define data layout entries for buffers" This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850. Differential Revision: https://reviews.llvm.org/D149758	2023-05-03 16:11:00 +00:00
Krzysztof Drewniak	f9c1ede254	[AMDGPU] Define data layout entries for buffers Per discussion at https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798, we define two new address spaces for AMDGCN targets. The first is address space 7, a non-integral address space (which was already in the data layout) that has 160-bit pointers (which are 256-bit aligned) and uses a 32-bit offset. These pointers combine a 128-bit buffer descriptor and a 32-bit offset, and will be usable with normal LLVM operations (load, store, GEP). However, they will be rewritten out of existence before code generation. The second of these is address space 8, the address space for "buffer resources". These will be used to represent the resource arguments to buffer instructions, and new buffer intrinsics will be defined that take them instead of <4 x i32> as resource arguments. ptr addrspace(8). These pointers are 128-bits long (with the same alignment). They must not be used as the arguments to getelementptr or otherwise used in address computations, since they can have arbitrarily complex inherent addressing semantics that can't be represented in LLVM. Even though, like their address space 7 cousins, these pointers have deterministic ptrtoint/inttoptr semantics, they are defined to be non-integral in order to prevent optimizations that rely on pointers being a [0, [addr_max]] value from applying to them. Future work includes: - Defining new buffer intrinsics that take ptr addrspace(8) resources. - A late rewrite to turn address space 7 operations into buffer intrinsics and offset computations. This commit also updates the "fallback address space" for buffer intrinsics to the buffer resource, and updates the alias analysis table. Depends on D143437 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D145441	2023-05-03 15:25:58 +00:00
Nikita Popov	bbfb13a5ff	[ConstExpr] Remove select constant expression This removes the select constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. Uses of this expressions have already been removed in advance, so this just removes related infrastructure and updates tests. Differential Revision: https://reviews.llvm.org/D145382	2023-03-16 10:32:08 +01:00
Nikita Popov	5f01a626dd	[ConstantFold] Fix inbounds inference on mismatching source element type When inferring that a GEP of a global variable is inbounds because there is no notional overindexing, we need to check that the global value type and the GEP source element type match. This was not necessary with typed pointers (because we would have a bitcast in between), but is necessary with opaque pointers. We should be able to recover some of the safe cases by performing an offset based inbounds inference in DL-aware ConstantFolding.	2023-01-31 11:33:00 +01:00
Nikita Popov	0254de09eb	[InferAddressSpaces] Regenerate test checks (NFC)	2023-01-30 15:28:46 +01:00
Bjorn Pettersson	3528e63d89	[test] Remove duplicate RUN lines in Transform tests	2022-12-08 11:47:16 +01:00
Matt Arsenault	5651af896c	InferAddressSpaces: Switch tests to use opt -passes	2022-11-27 20:26:16 -05:00
Matt Arsenault	a982f09567	InferAddressSpaces: Convert tests to opaque pointers Had constantexprs be mangled by the opaquify script; had to update those lines manually: NVPTX/bug31948.ll AMDGPU/old-pass-regressions.ll AMDGPU/old-pass-regressions-inseltpoison.ll AMDGPU/infer-address-space.ll Required re-reunning update_test_checks: AMDGPU/redundant-addrspacecast.ll In AMDGPU/insert-pos-assert.ll, bitcast_insert_pos_assert_2 deleted a getelementptr of 0 which I'm guessing was relevant. Replaced with an offset 1 GEP to ensure another addrspacecast is inserted. AMDGPU/infer-getelementptr.ll had one case improve by introducing an inbounds.	2022-11-27 20:26:16 -05:00
Fangrui Song	6b852ffa99	[Sink] Process basic blocks with a single successor This condition seems unnecessary. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D93511	2022-11-18 01:23:12 +00:00

107 Commits