* Construct frequently-accessed TimerLock/DefaultTimerGroup early to
reduce overhead.
* Rename `aquireDefaultGroup` to `acquireTimerGlobals` and restore
ManagedStatic::claim. https://reviews.llvm.org/D76099
* Drop mtg::. We use internal linkage, so mtg:: is unneeded and might
mislead users. In addition, llvm/ code almost never introduces a named
namespace not in llvm::. Drop mtg::.
* Replace some unique_ptr with optional to reduce overhead.
* Switch to `functionName()`.
* Simplify `llvm::initTimerOptions` and `TimerGroup::constructForStatistics()`
Pull Request: https://github.com/llvm/llvm-project/pull/122429
The system call `__CELQTBCK()` is used to build a backtrace like
on other systems. The collected information are the address of the PC,
the address of the entry point (EP), the difference between both
addresses (+EP), the dynamic storage area (DSA aka the stack
pointer), and the function name.
The system call is described here:
https://www.ibm.com/docs/en/zos/3.1.0?topic=cwicsa6a-celqtbck-also-known-as-celqtbck-64-bit-traceback-service
TimerGroup's dtor accesses this field.
once_flag has a trivial destructor so it doesn't really matter, but msan
checks that it's not accessed after destruction.
/llvm-project/llvm/lib/Support/Timer.cpp:565:74:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
static bool mtg::TrackSpace() { return ManagedTimerGlobals->TrackSpace; };
^
1 error generated.
This is intended to help with flang `-ftime-report` support:
- #107270.
With this change, I was able to cherry-pick #107270, uncomment
`llvm::TimePassesIsEnabled = true;` and compile with `-ftime-report`.
I also noticed that `clang/lib/Driver/OffloadBundler.cpp` was statically
constructing a `TimerGroup` and changed it to lazily construct via
ManagedStatic.
Before 925471ed903dad871042d7ed0bab89ab6566a564 remove_directories
supports path with slash (instead of backslash).
The ILCreateFromPathW in new implementation requires backslash path,
so the call to remove_directories will fail if the path contains slash.
This is to normalize the path to make sure remove_directories still
support path with slash as well.
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
Printing IR spends a large amount of time updating the column count via
generic UTF-8 APIs. Speed it up by adding a fast path for the common
printable ASCII case.
This makes `-print-on-crash -print-module-scope` about two times faster.
Nested sequences could be defined but the YAML output was incorrect.
`Output::newLineCheck()` was not able to emit multiple dashes `- ` and
YAML parser sometimes didn't accept its output as the result.
This fixes for emitting corresponding dashes for consecutive
`inSeqFirstElement`, but suppresses emission to the top
`inSeqFirstElement`.
This also fixes for emitting flow elements onto nested sequences.
Currently SmallPtrSet stores CurArray and SmallArray, with the former
pointing to the latter in small representation. Instead, only use
CurArray and a separate IsSmall boolean.
Most of the implementation doesn't need the separate SmallArray member
-- we only need it for copy/move/swap operations, where we may switch
back from large to small representation. In those cases, we explicitly
pass down the pointer to SmallStorage.
This reduces the size of SmallPtrSet and improves compile-time.
ErrorAsOutParameter's Error* constructor supports cases where an Error might not
be passed in (because in the calling context it's known that this call won't
fail). Most clients always have an Error present however, and for them an Error&
overload is more convenient.
We should check the zstd decompress result before doing the msan
unpoison. If the res is abnormal, then it would be a huge number, which
will cause undesired msan unpoison behavior and will run for a long
time.
This unbreaks the build on macOS.
Without the include, the build fails with
llvm/lib/Support/RWMutex.cpp:47:36: error: use of undeclared identifier 'safe_malloc'
47 |
static_cast<pthread_rwlock_t*>(safe_malloc(sizeof(pthread_rwlock_t)));
| ^
Changes are:
1) Make signed-overflow detection optimal
2) For signed-overflow, try to rule out direction even if we can't
totally rule out overflow.
3) Intersect add/sub assuming no overflow with possible overflow
clamping values as opposed to add/sub without the assumption.
Implement TrieRawHashMap can be used to store object with its associated
hash. User needs to supply a strong hashing function to guarantee the
uniqueness of the hash of the objects to be inserted. A hash collision
is not supported and will lead to error or failed to insert.
TrieRawHashMap is thread-safe and lock-free and can be used as
foundation data structure to implement a content addressible storage.
TrieRawHashMap owns the data stored in it and is designed to be:
* Fast to lookup.
* Fast to "insert" if the data has already been inserted.
* Can be used without lock and doesn't require any knowledge of the
participating threads or extra coordination between threads.
It is not currently designed to be used to insert unique new data with
high contention, due to the limitation on the memory allocator.
This patch uses data() and size() within StringRef instead of Data and
Length. This makes it easier to replace Data and Length with
std::string_view in the future, which in turn allows us to forward
most of StringRef functions to the counterparts in std::string_view.
The Intel C++ Compiler (ICX) passes linker flags through the driver
unlike MSVC and clang-cl, and therefore needs them to be prefixed with
`/Qoption,link` (the equivalent of `-Wl,` for gcc on *nix).
Use `LINKER:` prefix wherever supported by cmake, when that's not
possible fall-back to `${CMAKE_CXX_LINKER_WRAPPER_FLAG}`. CMake replaces
these with `/Qoption,link` for ICX and with the empty string for MSVC
and clang-cl.
For `target_link_libraries` neither `LINKER:` (not supported prior to
CMake 3.32) nor `${CMAKE_CXX_LINKER_WRAPPER_FLAG}` (does not begin with
`-` would be taken as a library name) works, use `-Qoption,link`
directly within a conditional generator expression that we're linking
with ICX.
For MSVC and clang-cl no functional change is intended.
Tested by compiling with ICX and setting
`CMAKE_(EXE|SHARED|STATIC|MODULE)_LINKER_FLAGS_INIT` to
`-Werror=unknown-argument`.
RFC:
https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446
This patch adds an IsText parameter to the following getBufferForFile,
getBufferForFileImpl. We introduce a new virtual function
openFileForReadBinary which defaults to openFileForRead except in
RealFileSystem which uses the OF_None flag instead of OF_Text.
The default is set to OF_Text instead of OF_None, this change in value
does not affect any other platforms other than z/OS. Setting this
parameter correctly is required to open files on z/OS in the correct
encoding. The IsText parameter is based on the context of where we open
files, for example, in the ASTReader, HeaderMap requires that files
always be opened in binary even though they might be tagged as text.
Line ending policies were changed in the parent, dccebddb3b80. To make
it easier to resolve downstream merge conflicts after line-ending
policies are adjusted this is a separate whitespace-only commit. If you
have merge conflicts as a result, you can simply `git add --renormalize
-u && git merge --continue` or `git add --renormalize -u && git rebase
--continue` - depending on your workflow.
This header was included after the implementations to work around an
issue with FreeBSD, however, , this causes some issues when
dllexport\explicit visibility
attributes will be added to the headers on Windows, since the
definitions need to see the declarations for the attributes to apply.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
windows.
---------
Co-authored-by: Tom Stellard <tstellar@redhat.com>
Since both APFloat and (Double)IEEEFloat inherit from APFloatBase, empty
base optimization is not performed by GCC/Clang (Minimal reproducer:
https://godbolt.org/z/dY8cM3Wre). This patch removes inheritance
relation between (Double)IEEEFloat and APFloatBase to make sure EBO is
performed on APFloat. After this patch, the size of `ConstantFPRange`
will be reduced from 72 to 56.
Address comment
https://github.com/llvm/llvm-project/pull/111544#discussion_r1792398427.
This is a prep for https://github.com/llvm/llvm-project/pull/90933.
- Change `FileCache` from a function to a type.
- Store the cache directory in the type, which will be used when creating additional caches for two-codegen round runs that inherit this value.
I'm on a system that has have_pthread_getname_np defined but set to 0.
The correct behavior here is not to use the function, but presently the
macros will attempt to use a non-existing function.
This previously worked before
https://github.com/llvm/llvm-project/pull/106486/files
This patch adds an APFloat type for unsigned E8M0 format. This format is
used for representing the "scale-format" in the MX specification:
(section 5.4)
https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf
This format does not support {Inf, denorms, zeroes}. Like FP32, this
format's exponents are 8-bits (all bits here) and the bias value is 127.
However, it differs from IEEE-FP32 in that the minExponent is -127
(instead of -126). There are updates done in the APFloat utility
functions to handle these constraints for this format.
* The bias calculation is different and convertIEEE* APIs are updated to
handle this.
* Since there are no significand bits, the isSignificandAll{Zeroes/Ones}
methods are updated accordingly.
* Although the format does not have any precision, the precision bit in
the fltSemantics is set to 1 for consistency with APFloat's internal
representation.
* Many utility functions are updated to handle the fact that this format
does not support Zero.
* Provide a separate initFromAPInt() implementation to handle the quirks
of the format.
* Add specific tests to verify the range of values for this format.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>