The CRC optimization relies on stripping the auxiliary data completely,
and should hence be forbidden when the auxiliary data has a user in the exit block.
Forbid this case, fixing a miscompile.
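A minimal C++ sketch of the problematic shape, assuming a little-endian
CRC-8 loop over 16-bit data with a trip-count of 8. The names CrcResult
and crc8LeKeepData are hypothetical and do not appear in LLVM. The
shifted data is returned after the loop, so it is live-out: a rewrite
that drops the data recurrence entirely would lose remainingData.

```cpp
#include <cstdint>

// Hypothetical result type: the checksum plus the data left after the loop.
struct CrcResult {
  uint8_t crc;
  uint16_t remainingData;
};

// Bit-at-a-time little-endian CRC-8 that consumes the low 8 bits of `data`.
// `poly` is the reflected generator polynomial.
CrcResult crc8LeKeepData(uint16_t data, uint8_t crc, uint8_t poly) {
  for (int i = 0; i < 8; ++i) {
    bool feedback = ((crc ^ data) & 1) != 0;
    crc >>= 1;
    data >>= 1; // auxiliary-data recurrence
    if (feedback)
      crc ^= poly;
  }
  // `data` is used after the loop, so it cannot simply be stripped.
  return {crc, data};
}
```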
Fixes #165382.
Checking if trip-count exceeds 256 is no longer necessary, as we have
moved away from KnownBits computations to pattern-matching, which is
very cheap and independent of the trip-count.
The ValueEvolution logic is deeply flawed, and checking that zero-bits
are shifted can be exploited for miscompiles. In an effort to redo
HashRecognize with a pattern-matching based approach, extract and fix
the core logic of ValueEvolution, and strip it completely, showing that
none of the tests rely on the KnownBits computation of ValueEvolution.
Co-authored-by: Piotr Fusik <p.fusik@samsung.com>
The trip-count of a CRC algorithm can legitimately be greater than the
bitwidth of the result: what it cannot exceed is the bitwidth of the
data, or LHSAux. crc8.le.tc16 is now successfully recognized as a CRC
algorithm.
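For illustration, a C++ sketch of a loop of this kind: an 8-bit
little-endian CRC fed from 16-bit data, so a trip-count of 16 exceeds
the result bitwidth but not the data bitwidth. The function name
crc8LeBitwise and its parameters are assumptions made for this sketch;
this is not the IR of the crc8.le.tc16 test.

```cpp
#include <cstdint>

// Bit-at-a-time little-endian CRC-8 over the low `bits` bits of `data`.
// `poly` is the reflected generator polynomial; `bits` must be <= 16,
// the bitwidth of the data.
uint8_t crc8LeBitwise(uint16_t data, unsigned bits, uint8_t crc,
                      uint8_t poly) {
  for (unsigned i = 0; i < bits; ++i) {
    bool feedback = ((crc ^ data) & 1) != 0;
    crc >>= 1;
    data >>= 1;
    if (feedback)
      crc ^= poly;
  }
  return crc;
}
```

Running 16 iterations is the same as processing the low byte and then
the high byte with 8 iterations each, which is why the larger
trip-count is still a valid CRC.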
The test crc8.le.tc16 is a valid CRC algorithm, but isn't recognized as
such due to a buggy arePHIsIntertwined, which is asymmetric in its
PHINode arguments. There is also a fundamental correctness issue: the
core functionality is to match a XOR that's a recurrence in both PHI
nodes, ignoring casts, but the user of the XOR is never checked. Rewrite
and rename the function.
crc8.le.tc16 is still not recognized as a valid CRC algorithm, due to an
incorrect check that rejects loops whose trip-count exceeds the bitwidth
of the result: in reality, the trip-count should not exceed the bitwidth
of LHSAux, but we leave this fix to a follow-up.
Co-authored-by: Piotr Fusik <p.fusik@samsung.com>
Make HashRecognize a non-PassManager analysis that can be called to get
the result on-demand, creating a new getResult() entry-point. The issue
was discovered when attempting to use the analysis to perform a
transform in LoopIdiomRecognize.
Const-qualifying Values in the analysis result makes them unusable with
IRBuilder. The issue was discovered when attempting to use the result of
the analysis for a transform.
Big-endian CRC tables are incorrect due to the initial value of CRC in
genSarwateTable being hard-coded for CRC-8. 128 is the signed-min value
for CRC-8, but it should be generalized to APInt::getSignedMinValue. The
issue was found when writing CRC verification tests for llvm-test-suite.
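A rough C++ sketch of the generalized big-endian table generation,
using plain uint32_t instead of APInt. The helper name makeMsbFirstTable
and its signature are assumptions for illustration, not LLVM's
genSarwateTable. The seed is 1 << (width - 1), which is 128 only when
the width is 8.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build a 256-entry MSB-first (big-endian) Sarwate table for a CRC of the
// given width, with 8 <= width <= 32. `poly` is the generator polynomial
// without its leading term.
std::array<uint32_t, 256> makeMsbFirstTable(uint32_t poly, unsigned width) {
  assert(width >= 8 && width <= 32);
  const uint32_t top = uint32_t{1} << (width - 1); // signed-min for `width`
  const uint32_t mask = (top << 1) - 1;            // low `width` bits set
  std::array<uint32_t, 256> table{};
  uint32_t crc = top; // generalized seed; equals 128 only for width 8
  for (unsigned i = 1; i < 256; i <<= 1) {
    crc = ((crc << 1) ^ ((crc & top) ? poly : 0u)) & mask;
    // Entries for i..2i-1 follow from linearity over GF(2).
    for (unsigned j = 0; j < i; ++j)
      table[i + j] = crc ^ table[j];
  }
  return table;
}
```

With the seed hard-coded to 128, any width other than 8 would start
from the wrong bit and yield an incorrect table.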
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (e.g. X86 and the RISC-V crypto extension). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized
in the absence of target-specific instructions to use a lookup table.
At the moment, we only recognize the CRC algorithm.
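To illustrate the lookup-table idea, here is a C++ sketch that computes
the common reflected CRC-32 (polynomial 0xEDB88320) both bit-at-a-time
and byte-at-a-time through a 256-entry table. The function names are
hypothetical, and this is not the code the analysis or any transform
emits; it only shows that the two forms agree because addition in a
field of characteristic 2 is XOR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Bit-at-a-time reflected CRC-32: the loop shape an analysis would match.
uint32_t crc32Bitwise(const uint8_t *data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
  }
  return ~crc;
}

// Precompute the effect of the 8 inner iterations for every byte value.
std::array<uint32_t, 256> makeCrc32Table() {
  std::array<uint32_t, 256> table{};
  for (uint32_t byte = 0; byte < 256; ++byte) {
    uint32_t crc = byte;
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    table[byte] = crc;
  }
  return table;
}

// Byte-at-a-time variant: one table lookup replaces the inner loop.
uint32_t crc32Table(const uint8_t *data, size_t len) {
  static const std::array<uint32_t, 256> table = makeCrc32Table();
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i)
    crc = table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
  return ~crc;
}
```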
RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268