llvm-project

Author	SHA1	Message	Date
Danila Malyutin	ed6c309d4b	[APFloat] Fix truncation of certain subnormal numbers Certain subnormals would be incorrectly rounded away from zero. Fixes #55838 Differential Revision: https://reviews.llvm.org/D127140	2022-06-08 21:54:35 +03:00
Sanjay Patel	abb21b54bc	[ConstProp] add tests for APFloat truncate miscompile; NFC issue #55838	2022-06-05 20:07:18 -04:00
Benjamin Kramer	08b20f20d2	[ConstantFold] Use getFltSemantics instead of manually checking the type Simplifies the code and makes fpext/fptrunc constant folding not crash when the result is bf16.	2022-05-05 15:52:19 +02:00
Craig Topper	ac8c720d48	[IR] Allow constant folding (insertelement <vscale x 2 x i32> zeroinitializer, i32 0, i32 i32 0. Most of insertelement constant folding is blocked if the vector type is scalable. I believe we can make an exception for inserting null into an all zeros vector. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123413	2022-04-15 17:44:32 -07:00
Nikita Popov	659871cede	[ConstantFold] Add test for load of i8 from i1 (NFC) Semantics here are a bit unclear, but the store-to-load forwarding case at least should be a miscompile.	2022-04-08 16:32:51 +02:00
Sanjay Patel	7cc0a29b3f	[Analysis] propagate poison through add/sub saturate intrinsics A more general enhancement needs to add tests and make sure that intrinsics that return structs are correct. There are also target-specific intrinsics, and I'm not sure what behavior is expected for those.	2022-02-15 10:45:32 -05:00
Sanjay Patel	00218c188b	[Analysis] propagate poison through integer min/max intrinsics A more general enhancement needs to add tests and make sure that intrinsics that return structs are correct. There are also target-specific intrinsics, and I'm not sure what behavior is expected for those.	2022-02-15 10:45:32 -05:00
Sanjay Patel	765b5b8105	[ConstProp] add tests for intrinsics with poison ops; NFC	2022-02-15 10:45:32 -05:00
Bjorn Pettersson	b280ee1dd7	[test] Use -passes=instsimplify instead of -instsimplify in a number of tests. NFC Another step moving away from the deprecated syntax of specifying pass pipeline in opt. Differential Revision: https://reviews.llvm.org/D119080	2022-02-07 14:26:58 +01:00
Bjorn Pettersson	4f73528403	[test][NewGVN] Use -passes=newgvn instead of -newgvn Use the new PM syntax when specifying the pipeline in regression tests previously running "opt -newgvn ..." Instead we now do "opt -passes=newgvn ..." Notice that this also changes the aa-pipeline to become the default aa-pipeline instead of just basic-aa. Since these tests haven't been explicitly requesting basic-aa in the past (compared to the test cases updated in a separate patch involving "-basic-aa -newgvn") it is assumed that the exact aa-pipeline isn't important for the validity of the test cases. An alternative could have been to add -aa-pipeline=basic-aa as well to the run lines, but that might just add clutter in case the test cases do not care about the aa-pipeline. This is another step to move away from the legacy PM syntax when specifying passes in opt. Differential Revision: https://reviews.llvm.org/D118341	2022-01-28 13:58:22 +01:00
Sanjay Patel	2e26633af0	[IR] document and update ctlz/cttz intrinsics to optionally return poison rather than undef The behavior in Analysis (knownbits) implements poison semantics already, and we expect the transforms (for example, in instcombine) derived from those semantics, so this patch changes the LangRef and remaining code to be consistent. This is one more step in removing "undef" from LLVM. Without this, I think https://github.com/llvm/llvm-project/issues/53330 has a legitimate complaint because that report wants to allow subsequent code to mask off bits, and that is allowed with undef values. The clang builtins are not actually documented anywhere AFAICT, but we might want to add that to remove more uncertainty. Differential Revision: https://reviews.llvm.org/D117912	2022-01-23 11:22:48 -05:00
Nikita Popov	b4900296e4	[ConstantFold] Allow all float types in reinterpret load folding Rather than hardcoding just half, float and double, allow all floating point types.	2022-01-21 09:26:51 +01:00
Nikita Popov	3f9d1f516e	[InstSimplify] Add tests for reinterpret load of floats (NFC) Add tests for currently unsupported float types.	2022-01-21 09:26:50 +01:00
Nikita Popov	6a19cb837c	[ConstantFold] Support pointers in reinterpret load folding Peculiarly, the necessary code to handle pointers (including the check for non-integral address spaces) is already in place, because we were already allowing vectors of pointers here, just not plain pointers.	2022-01-21 09:13:37 +01:00
Nikita Popov	805bc24868	[InstSimplify] Add test for load of non-integral pointer (NFC)	2022-01-20 16:50:05 +01:00
Nikita Popov	0f283de9d1	[InstSimplify] Add test for reinterpret load of pointer type (NFC)	2022-01-20 16:25:54 +01:00
Nikita Popov	20d9c51dc0	[ConstantFold] Check for uniform value before reinterpret load The reinterpret load code will convert undef values into zero. Check the uniform value case before it to produce a better result for all-undef initializers. However, the uniform value handling will return the uniform value even if the access is out of bounds, while the reinterpret load code will return undef. Add an explicit check to retain the previous result in this case.	2022-01-14 10:18:02 +01:00
Nikita Popov	e7ce6acc83	[InstSimplify] Add test for load from undef (NFC) If we're loading from an all-undef value, we sometimes still return zero rather than undef.	2022-01-14 10:18:02 +01:00
Nikita Popov	c41aa41957	[ConstFold] Add missing check for inbounds gep If the gep is not inbounds, then the gep might compute a null value even if the base pointer is non-null.	2022-01-06 09:59:40 +01:00
Nikita Popov	37c9171764	[ConstantFold] Add test for invalid non-inbounds gep icmp fold The gep evaluated to null in this case, and as such is not ne null.	2022-01-06 09:59:40 +01:00
Nikita Popov	3dc1907d06	[ConstantFold] Use ConstantFoldLoadFromUniformValue() in more places In particular, this also preserves undef when loading from padding, rather than converting it to zero through a different codepath. This is the remaining part of D115924.	2022-01-05 12:47:50 +01:00
Nikita Popov	4e62d210c4	[ConstantFold] Add test for load of padding (NFC) This currently load zero rather than undef.	2022-01-05 12:47:49 +01:00
Nikita Popov	99c6b12b92	[ConstantFolding] Unify handling of load from uniform value There are a number of places that specially handle loads from a uniform value where all the bits are the same (zero, one, undef, poison), because we a) don't care about the load offset in that case b) it bypasses casts that might not be legal generally but do work with uniform values. We had multiple implementations of this, with a different set of supported values each time. This replaces two usages with a more complete helper. Other usages will be replaced separately, because they have larger impact. This is part of D115924.	2022-01-05 12:30:46 +01:00
Nikita Popov	00686ab4af	[ConstantFold] Add additional load from uniform value tests (NFC)	2022-01-05 12:30:46 +01:00
Nikita Popov	6c031780aa	[ConstantFold] Remove another incorrect icmp of gep fold This folded (null + X) == g to false, but of course this is incorrect if X == g. Possibly this got confused with the null == g case, which is already handled elsewhere.	2022-01-04 16:08:09 +01:00
Nikita Popov	25448826dd	[InstSimplify] Update test to make miscompile more obvious (NFC) This is now testing (null + g3) != g3 and still coming up with "true" as the answer. The original case was a less obvious miscompile with index overflow involved.	2022-01-04 16:08:09 +01:00
Nikita Popov	75db002725	[ConstantFold] Remove another incorrect icmp of GEP fold This fold is not correct, because indices might evaluate to zero even if they are not a literal zero integer. Additionally, this fold would be wrong (in the general case) for non-i8 types as well, due to index overflow. Drop this fold and instead let the target-dependent constant folder compute the actual offset and fold the comparison based on that.	2022-01-04 12:27:40 +01:00
Nikita Popov	aefab6f8d5	[InstSimplify] Use weak symbol in test to show miscompile (NFC) This fold is incorrect, because it assumes that all indices are non-zero. This happens to be true for the test as written, but doesn't hold if we use an extern weak global instead, for which ptrtoint might be zero. Add separate tests for the simple constant int case.	2022-01-04 12:27:40 +01:00
Nikita Popov	5afbfe33e7	[ConstantFold] Make icmp of gep fold offset based We can fold an equality or unsigned icmp between base+offset1 and base+offset2 with inbounds offsets by comparing the offsets directly. This replaces a pair of specialized folds that tried to reason based on the GEP structure instead. One of those folds was plain wrong (because it does not account for negative offsets), while the other is unnecessarily complicated and limited (e.g. it will fail with bitcasts involved). The disadvantage of this change is that it requires data layout, so the fold is no longer performed by datalayout-independent constant folding. I don't think this is a loss in practice, but it does regress the ConstantExprFold.ll test, which checks folding without running any passes. Differential Revision: https://reviews.llvm.org/D116332	2022-01-03 09:41:37 +01:00
Nikita Popov	3bfe0962ba	[ConstFold] Add another icmp of gep of global test (NFC) This time with some complex arithmetic involving bitcasts.	2021-12-28 14:28:28 +01:00
Nikita Popov	23de66d163	[ConstFold] Don't fold signed comparison of gep of global An inbounds GEP may still cross the sign boundary, so signed icmps cannot be folded (https://alive2.llvm.org/ce/z/XSgi4D). This was previously fixed for other folds in this function, but this one was missed.	2021-12-28 14:13:33 +01:00
Nikita Popov	1bd11d34fe	[ConstFold] Add additional icmp of gep of global tests (NFC) The fold is incorrect for the sgt case, as gep inbounds is allowed to cross the sign boundary.	2021-12-28 14:07:15 +01:00
Nikita Popov	2926d6d335	[ConstantFold][GlobalOpt] Don't create x86_mmx null value This fixes the assertion failure reported at https://reviews.llvm.org/D114889#3198921 with a straightforward check, until the cleaner fix in D115924 can be reapplied.	2021-12-21 09:11:41 +01:00
Nikita Popov	aeb36ae0f4	Revert "[ConstantFolding] Unify handling of load from uniform value" This reverts commit 9fd4f80e33a4ae4567483819646650f5735286e2. This breaks SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c in test-suite. Either the test is incorrect, or clang is generating incorrect union initialization code. I've submitted https://reviews.llvm.org/D115994 to fix the test, assuming my interpretation is correct. Reverting this in the meantime as it may take some time to resolve.	2021-12-18 20:46:52 +01:00
Nikita Popov	9fd4f80e33	[ConstantFolding] Unify handling of load from uniform value There are a number of places that specially handle loads from a uniform value where all the bits are the same (zero, one, undef, poison), because we a) don't care about the load offset in that case and b) it bypasses casts that might not be legal generally but do work with uniform values. We had multiple implementations of this, with a different set of supported values each time, as well as incomplete type checks in some cases. In particular, this fixes the assertion reported in https://reviews.llvm.org/D114889#3198921, as well as a similar assertion that could be triggered via constant folding. Differential Revision: https://reviews.llvm.org/D115924	2021-12-17 17:05:06 +01:00
Nikita Popov	65bec04295	[ConstantFold] Handle same type in ConstantFoldLoadThroughBitcast Usually the case where the types are the same ends up being handled fine because it's legal to do a trivial bitcast to the same type. However, this is not true for aggregate types. Short-circuit the whole code if the types match exactly to account for this.	2021-12-10 16:39:50 +01:00
Nikita Popov	9c244a33e7	[InstSimplify] Add test for load of aggregate (NFC) The test is switched to use -instsimplify as it is in the InstSimplify directory. In this particular case InstCombine does fold the load (in a very roundabout way), but InstSimplify does not.	2021-12-10 16:18:18 +01:00
David Green	ab0c5cea0b	[ARM] Use v2i1 for MVE and CDE intrinsics This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal type, to use a <2 x i1> as opposed to emulating the predicate with a <4 x i1>. The v4i1 workarounds have been removed leaving the natural v2i1 types, notably in vctp64 which now generates a v2i1 type. AutoUpgrade code has been added to upgrade old IR, which needs to convert the old v4i1 to a v2i1 be converting it back and forth to an integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be optimized away in the final assembly. Differential Revision: https://reviews.llvm.org/D114455	2021-12-03 15:27:58 +00:00
Zarko Todorovski	7f7dac7126	[NFC][llvm] Inclusive language: reword uses of sanity test and check Part of continuing work to use more inclusive language. Reworded uses of sanity check and sanity test in llvm/test/	2021-11-25 07:21:42 -05:00
Nikita Popov	c117d77e93	[ConstantFold] Refactor load folding This refactors load folding to happen in two cleanly separated steps: ConstantFoldLoadFromConstPtr() takes a pointer to load from and decomposes it into a constant initializer base and an offset. Then ConstantFoldLoadFromConst() loads from that initializer at the given offset. This makes the core logic independent of having actual GEP expressions (and those GEP expressions having certain structure) and will allow exposing ConstantFoldLoadFromConst() as an independent API in the future. This is mostly only a refactoring, but it does make the folding logic slightly more powerful. Differential Revision: https://reviews.llvm.org/D111023	2021-10-05 18:07:57 +02:00
Nikita Popov	3be4acbaa3	[InstSimplify] Add additional load from constant test (NFC) This case does not get folded, because the GEP indexes too deeply (to the i8), making the bitcast logic not apply (on the [8 x i8]).	2021-10-03 15:52:36 +02:00
Nikita Popov	af382b9383	[IR] Handle constant expressions in containsUndefinedElement() If the constant is a constant expression, then getAggregateElement() will return null. Guard against this before calling HasFn().	2021-09-09 22:04:12 +02:00
Roman Lebedev	3f1f08f0ed	Revert @llvm.isnan intrinsic patchset. Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.) TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile. While the end result of discussion may lead back to the current design, it may also not lead to the current design. Therefore i take it upon myself to revert the tree back to last known good state. This reverts commit 4c4093e6e39fe6601f9c95a95a6bc242ef648cd5. This reverts commit 0a2b1ba33ae6dcaedb81417f7c4cc714f72a5968. This reverts commit d9873711cb03ac7aedcaadcba42f82c66e962e6e. This reverts commit 791006fb8c6fff4f33c33cb513a96b1d3f94c767. This reverts commit c22b64ef66f7518abb6f022fcdfd86d16c764caf. This reverts commit 72ebcd3198327da12804305bda13d9b7088772a8. This reverts commit 5fa6039a5fc1b6392a3c9a3326a76604e0cb1001. This reverts commit 9efda541bfbd145de90f7db38d935db6246dc45a. This reverts commit 94d3ff09cfa8d7aecf480e54da9a5334e262e76b.	2021-09-02 13:53:56 +03:00
David Sherwood	8439415333	[IR] Let ConstantVector::getSplat use poison instead of undef This patch updates ConstantVector::getSplat to use poison instead of undef when using insertelement/shufflevector to splat. This follows on from D93793. Differential Revision: https://reviews.llvm.org/D107751	2021-08-10 08:27:43 +01:00
Serge Pavlov	4c4093e6e3	Introduce intrinsic llvm.isnan This is recommit of the patch 16ff91ebccda1128c43ff3cee104e2c603569fb2, reverted in 0c28a7c990c5218d6aec47c5052a51cba686ec5e because it had an error in call of getFastMathFlags (base type should be FPMathOperator but not Instruction). The original commit message is duplicated below: Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-06 14:32:27 +07:00
Serge Pavlov	0c28a7c990	Revert "Introduce intrinsic llvm.isnan" This reverts commit 16ff91ebccda1128c43ff3cee104e2c603569fb2. Several errors were reported mainly test-suite execution time. Reverted for investigation.	2021-08-04 17:18:15 +07:00
Serge Pavlov	16ff91ebcc	Introduce intrinsic llvm.isnan Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-04 15:27:49 +07:00
Sanjay Patel	13302c06cd	[ConstantFolding] avoid crashing on a fake math library call https://llvm.org/PR50960	2021-07-20 18:25:21 -04:00
Juneyoung Lee	5af8bacc94	[InstSimplify] Add more poison folding optimizations This adds more poison folding optimizations to InstSimplify. Since all binary operators propagate poison, these are fine. Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104661	2021-06-23 20:25:24 +09:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit 0ee439b705e82a4fe20e2, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00

1 2

97 Commits