This PR prevents the mutator from generating illegal memory operations
for AMDGCN. In particular, direct store and load instructions on
addrspace(8) are not permitted. This PR fixes that by properly
introducing casts to addrspace(7) when required.
This PR addresses issues related to the `amdgcn_cs_chain` intrinsic:
1. Ensures the intrinsic's special attribute and calling convention
requirements are not violated by the mutator.
2. Enforces the necessary unreachable statement following this type of
intrinsic, preventing the fuzzer from generating invalid code.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
The constructor initializes `*this` with `M->getDataLayout()`, which
is effectively the same as calling the copy constructor.
There does not seem to be a case where a copy would be necessary.
Pull Request: https://github.com/llvm/llvm-project/pull/102841
These are the final few places in LLVM that use instruction pointers to
insert instructions -- use iterators instead, which is needed for
debug-info correctness in the future. Most of this is a gentle
scattering of getIterator calls or not deref-then-addrofing iterators.
libfuzzer does require a storage change to keep built instruction
positions in a container though. The unit-test changes are very
straightforwards.
This leaves us in a position where libfuzzer can't fuzz on either of
debug-info records, however I don't believe that fuzzing of debug-info
is in scope for the library.
Move PassInstrumentationAnalysis into PassInstrumentation.h and stop
including it in PassManager.h (effectively inverting the direction of
the dependency).
Most places using PassManager are not interested in PassInstrumentation,
and we no longer have any uses of it in PassManager.h itself (only in
PassManagerImpl.h).
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
70 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
Per discussion in https://github.com/SecurityLab-UCD/IRFuzzer/issues/49,
generating undef during fuzzing seems to be less fruitful. Let's
eliminate undef in favor of poison unless the user explicitly asked for
it.
Signed-off-by: Peter Rong <PeterRong96@gmail.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
FuzzMutate didn't consider some corner cases and leads to mutation failure when mutating some modules.
This patch fixes 3 bugs:
- Add null check when encountering basic blocks without predecessor to avoid segmentation fault
- Avoid insertion after `musttail call` instruction
- Avoid sinking token type
Unit tests are also added.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D151936
When there is a function with metadata/token parameter/return type, `InsertFunctionStrategy` will crash.
This patch fixes the problem by falling back to create function declaration when the sampled function contains metadata/token parameter/return type.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D150627
IRMutation::mutateModule() currently requires the bitcode size of the module.
To compute the bitcode size, one way is to write the module to a buffer using
BitcodeWriter and calculating the buffer size. This would be fine for a single
mutation, but infeasible for repeated mutations due to the large overhead. It
turns out that the only IR strategy weight calculation method that depends on
the current module size is InstDeleterStrategy, which deletes instructions more
frequently as the module size approaches a given max size. However, there is no
real need for the size to be in bytes of bitcode, so we can use a different
metric. One alternative is to let the size be the number of objects in the
Module, including instructions, basic blocks, globals, and aliases. Although
getting the number of instructions is still O(n), it should have significantly
less overhead than BitcodeWriter. This suggestion would cause a change to the
IRMutator API, since IRMutator::mutateModule() can calculate the Module size
itself.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D149989
This revision makes ShuffleBlockStrategy deterministic by replacing
SmallPtrSet with other data structures that has a deterministic iteration
order.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D149676
This patch addresses 2 problems:
- In `ShuffleBlockStrategy`, when `BB` is an EHPad, `BB.getFirstInsertionPt()` will return `BB.end()`, which cannot be dereferenced and will cause crash in following loop.
- In `isCompatibleReplacement`, a call instruction's callee might be replaced by a pointer, causing 2 subproblems:
- we cannot guarantee that the pointer is a function pointer (even if it is, we cannot guarantee it matches the signature).
- after such a replacement, `getCalledFunction` will from then on return `nullptr` (since it's indirect call) which causes Segmentation Fault in the lines below.
This patch fixes the first problem by checking if a block to be mutated is an EHPad in base class `IRMutationStrategy` and skipping mutating it if so.
This patch fixes the second problem by avoiding replacing callee with pointer and adding a null check for indirect calls.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D148853
This revision fixes an incorrect type cast from Instruction to ICmpInstr, which should have been to FCmpInstr instead. It turns out that StrategiesTest.cpp was missing a test case for InstModificationIRStrategy and FCmp, which is also now implemented in this revision. After this revision, [[ https://reviews.llvm.org/D148854 | llvm-stress in D148854 ]] no longer crashes randomly.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D148972
This revision fixes an incorrect type cast from Instruction to ICmpInstr, which should have been to FCmpInstr instead. It turns out that StrategiesTest.cpp was missing a test case for InstModificationIRStrategy and FCmp, which is also now implemented in this revision. After this revision, [[ https://reviews.llvm.org/D148854 | llvm-stress in D148854 ]] no longer crashes randomly.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D148972
This revision fixes an incorrect type cast from Instruction to ICmpInstr, which should have been to FCmpInstr instead. It turns out that StrategiesTest.cpp was missing a test case for InstModificationIRStrategy and FCmp, which is also now implemented in this revision. After this revision, [[ https://reviews.llvm.org/D148854 | llvm-stress in D148854 ]] no longer crashes randomly.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D148972
InsertFunctionStrategy does two things:
1. Add a random function declaration or definition to the module. This would replace previously used `createEmptyFunction`.
2. Add a random function call between instructions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D148568
Source and Sink are required when generating a new instruction.
(Term defined by previous author, in LLVM terms it's probably Use and User.)
Previously, only instructions in the same block is considered when taking source and sink.
In this patch, more source and sink types are considered.
For source, we have SrcFromInstInCurBlock, FunctionArgument, InstInDominator, SrcFromGlobalVariable, and NewConstOrStack.
For sink, we have SinkToInstInCurBlock, PointersInDominator, InstInDominatee, NewStore, and SinkToGlobalVariable.
A unit test to make sure source always dominates an instruction, and the instruction always dominates the sink is included.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139907
Source and Sink are required when generating a new instruction.
(Term defined by previous author, in LLVM terms it's probably Use and User.)
Previously, only instructions in the same block is considered when taking source and sink.
In this patch, more source and sink types are considered.
For source, we have SrcFromInstInCurBlock, FunctionArgument, InstInDominator, SrcFromGlobalVariable, and NewConstOrStack.
For sink, we have SinkToInstInCurBlock, PointersInDominator, InstInDominatee, NewStore, and SinkToGlobalVariable.
A unit test to make sure source always dominates an instruction, and the instruction always dominates the sink is included.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139907
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
I think there are more attributes, flags we can add to `call`, functions declarations and global variables. Let's start with these two flags.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139594
Mutating CFG is hard as we have to maintain dominator relations.
We avoid this problem by inserting a CFG into a splitted block.
switch, ret, and br instructions are generated.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139067
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
PHI Node can't be modeled like other instructions since its operand
number depends on predecessors. So we have a stand alone strategy for it.
Signed-off-by: Peter Rong <PeterRong96@gmail.com>
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138959
Randomlly select an instruction and try to use it in the future by replacing it with another instruction's operand.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138948
`connectToSink` uses a value by putting it in a future instruction.
It will replace the operand of a future instruction with the current value.
However, if current value is an `Instruction` and put into a switch case, the module is invalid.
We fix that by only connecting to Br/Switch's condition, and don't touch other operands.
Will have other strategies to mutate other Br/Switch operands to be patched once this patch is passed
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138890
`ShuffleBlockStrategy` will shuffle the instructions in a basic block without breaking the dependency of instructions.
It is implemented as a topological sort, only we randomly select instructions with no dependency.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138339