7034 Commits

Author SHA1 Message Date
Fangrui Song
24bd9bc0b5 [Timer] Remove signpots overhead on unsupported systems
startTimer/stopTimer are frequently called. It's important to reduce
overhead.
2025-01-10 20:46:57 -08:00
Fangrui Song
0de18e72c6
-ftime-report: reorganize timers
The code generation time is unclear in the -ftime-report output:

* The two clang timers "Code Generation Time" and "LLVM IR Generation
  Time" are in the default group "Miscellaneous Ungrouped Timers".
* There is also a "Clang front-end time" group, which actually includes
  code generation time.

```
===-------------------------------------------------------------------------===
                         Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0611 (  1.7%)   0.0099 (  4.4%)   0.0710 (  1.9%)   0.0713 (  1.9%)  LLVM IR Generation Time
   3.5140 ( 98.3%)   0.2165 ( 95.6%)   3.7306 ( 98.1%)   3.7342 ( 98.1%)  Code Generation Time
   3.5751 (100.0%)   0.2265 (100.0%)   3.8016 (100.0%)   3.8055 (100.0%)  Total
...
===-------------------------------------------------------------------------===
                          Clang front-end time report
===-------------------------------------------------------------------------===
  Total Execution Time: 3.9108 seconds (3.9146 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   3.6802 (100.0%)   0.2306 (100.0%)   3.9108 (100.0%)   3.9146 (100.0%)  Clang front-end timer
   3.6802 (100.0%)   0.2306 (100.0%)   3.9108 (100.0%)   3.9146 (100.0%)  Total
```

This patch

* renames "Clang front-end time report" (FrontendAction time) to "Clang
  time report",
* renames "Clang front-end" to "Front end",
* moves "LLVM IR Generation" into the group,
* replaces "Code Generation time" with "Optimizer" (middle end) and
  "Machine code generation" (back end).

```
% clang -c sqlite3.i -w -ftime-report -mllvm -sort-timers=0
...
===-------------------------------------------------------------------------===
                               Clang time report
===-------------------------------------------------------------------------===
  Total Execution Time: 1.5922 seconds (1.5972 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.5107 ( 35.9%)   0.0105 (  6.2%)   0.5211 ( 32.7%)   0.5222 ( 32.7%)  Front end
   0.2464 ( 17.3%)   0.0340 ( 20.0%)   0.2804 ( 17.6%)   0.2814 ( 17.6%)  LLVM IR generation
   0.6240 ( 43.9%)   0.1235 ( 72.7%)   0.7475 ( 47.0%)   0.7503 ( 47.0%)  Machine code generation
   0.0413 (  2.9%)   0.0018 (  1.0%)   0.0431 (  2.7%)   0.0433 (  2.7%)  Optimizer
   1.4224 (100.0%)   0.1698 (100.0%)   1.5922 (100.0%)   1.5972 (100.0%)  Total
```

Pull Request: https://github.com/llvm/llvm-project/pull/122225
2025-01-10 19:25:18 -08:00
Fangrui Song
af4d76d909
[Support] Reduce globaal variable overhead after #121663
* Construct frequently-accessed TimerLock/DefaultTimerGroup early to
  reduce overhead.
* Rename `aquireDefaultGroup` to `acquireTimerGlobals` and restore
  ManagedStatic::claim. https://reviews.llvm.org/D76099

* Drop mtg::. We use internal linkage, so mtg:: is unneeded and might
  mislead users. In addition, llvm/ code almost never introduces a named
  namespace not in llvm::. Drop mtg::.
* Replace some unique_ptr with optional to reduce overhead.
* Switch to `functionName()`.
* Simplify `llvm::initTimerOptions` and `TimerGroup::constructForStatistics()`

Pull Request: https://github.com/llvm/llvm-project/pull/122429
2025-01-10 17:59:28 -08:00
Kai Nacke
ca10deaa72
[z/OS] Add backtrace support for z/OS. (#121826)
The system call `__CELQTBCK()` is used to build a backtrace like
on other systems. The collected information are the address of the PC,
the address of the entry point (EP), the difference between both
addresses (+EP), the dynamic storage area (DSA aka the stack
pointer), and the function name.
The system call is described here:

https://www.ibm.com/docs/en/zos/3.1.0?topic=cwicsa6a-celqtbck-also-known-as-celqtbck-64-bit-traceback-service
2025-01-09 13:53:59 -05:00
Benjamin Kramer
8ac6a6b816 Reorder fields so InitDeferredFlag is destroyed last
TimerGroup's dtor accesses this field.

once_flag has a trivial destructor so it doesn't really matter, but msan
checks that it's not accessed after destruction.
2025-01-09 18:33:42 +01:00
Jie Fu
e3e26dc41a [llvm] Remove extra ';' outside of a function (NFC)
/llvm-project/llvm/lib/Support/Timer.cpp:565:74:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
static bool mtg::TrackSpace() { return ManagedTimerGlobals->TrackSpace; };
                                                                         ^
1 error generated.
2025-01-09 20:47:54 +08:00
macurtis-amd
52c338daec
[llvm][NFC] Rework Timer.cpp globals to ensure valid lifetimes (#121663)
This is intended to help with flang `-ftime-report` support:
- #107270.

With this change, I was able to cherry-pick #107270, uncomment
`llvm::TimePassesIsEnabled = true;` and compile with `-ftime-report`.

I also noticed that `clang/lib/Driver/OffloadBundler.cpp` was statically
constructing a `TimerGroup` and changed it to lazily construct via
ManagedStatic.
2025-01-09 06:32:48 -06:00
Jinsong Ji
5de7af4b9f
[llvm][Support][Windows] Fix slash in path for remove_directories (#121448)
Before 925471ed903dad871042d7ed0bab89ab6566a564 remove_directories
supports path with slash (instead of backslash).
The ILCreateFromPathW in new implementation requires backslash path,
so the call to remove_directories will fail if the path contains slash.

This is to normalize the path to make sure remove_directories still
support path with slash as well.
2025-01-02 12:32:42 -05:00
Brad Smith
6ea8b4cebd
[llvm][Support] Use __NR_gettid on Linux for compat with older glibc (#120007) 2024-12-18 16:19:09 -05:00
Dmitry Vasilyev
d098ce0ec9
[llvm][Support][Windows] Refactored remove_directories() w/o CComPtr and atlbase.h (#119843)
This is the update of #118677. This patch fixes building with mingw.
2024-12-13 16:59:31 +04:00
Dmitry Vasilyev
925471ed90
[llvm][Support][Windows] Avoid crash calling remove_directories() (#118677)
We faced an unexpected crash in SHELL32_CallFileCopyHooks() on the buildbot 
[lldb-remote-linux-win](https://lab.llvm.org/staging/#/builders/197/builds/1066).
The host is Windows Server 2022 w/o any 3rd party shell extensions. See #118032 for more details.
Based on [this article](https://devblogs.microsoft.com/oldnewthing/20120330-00/?p=7963).
2024-12-12 11:17:54 +04:00
Abhina Sree
04379c9863
[SystemZ][z/OS] Update autoconversion functions to improve support for UTF-8 (#98652)
This fixes the following error when reading source and header files on
z/OS: error: source file is not valid UTF-8
2024-12-11 07:46:51 -05:00
Fangrui Song
1f9f68a1cd [BalancedPartitioning] Fix -Wdeprecated-this-capture 2024-12-07 15:24:04 -08:00
Matthias Springer
f50ce316ec
[llvm][NFC] APFloat: Add missing semantics to enum (#117291)
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
2024-12-04 14:58:59 -08:00
Nikita Popov
da23a8372c
[FormattedStream] Add ASCII fast path (#117892)
Printing IR spends a large amount of time updating the column count via
generic UTF-8 APIs. Speed it up by adding a fast path for the common
printable ASCII case.

This makes `-print-on-crash -print-module-scope` about two times faster.
2024-12-02 14:31:24 +01:00
NAKAMURA Takumi
39209724e6
[YAML] Fix incorrect dash output in nested sequences (#116488)
Nested sequences could be defined but the YAML output was incorrect.
`Output::newLineCheck()` was not able to emit multiple dashes `- ` and
YAML parser sometimes didn't accept its output as the result.

This fixes for emitting corresponding dashes for consecutive
`inSeqFirstElement`, but suppresses emission to the top
`inSeqFirstElement`.

This also fixes for emitting flow elements onto nested sequences.
2024-11-30 22:10:31 +09:00
Nikita Popov
a8a494faab
[SmallPtrSet] Remove SmallArray member (NFC) (#118099)
Currently SmallPtrSet stores CurArray and SmallArray, with the former
pointing to the latter in small representation. Instead, only use
CurArray and a separate IsSmall boolean.

Most of the implementation doesn't need the separate SmallArray member
-- we only need it for copy/move/swap operations, where we may switch
back from large to small representation. In those cases, we explicitly
pass down the pointer to SmallStorage.

This reduces the size of SmallPtrSet and improves compile-time.
2024-11-30 10:21:10 +01:00
Lang Hames
d02c1676d7 [Support][Error] Add ErrorAsOutParameter constructor that takes an Error by ref.
ErrorAsOutParameter's Error* constructor supports cases where an Error might not
be passed in (because in the calling context it's known that this call won't
fail). Most clients always have an Error present however, and for them an Error&
overload is more convenient.
2024-11-29 15:57:53 +11:00
Wu Yingcong
5ed09d552d
[Support] Check zstd decompress result before msan unpoison (#117276)
We should check the zstd decompress result before doing the msan
unpoison. If the res is abnormal, then it would be a huge number, which
will cause undesired msan unpoison behavior and will run for a long
time.
2024-11-25 08:59:17 +08:00
Abhina Sreeskantharajan
9cada10911 [SystemZ][z/OS] Add back removed AutoConvert.h headers that were incorrectly identified as unused to fix a build error on z/OS 2024-11-21 08:23:45 -05:00
Florian Hahn
5174d00365
[llvm] Add back Allocator.h include to RWMutex.cpp.
This unbreaks the build on macOS.

Without the include, the build fails with

llvm/lib/Support/RWMutex.cpp:47:36: error: use of undeclared identifier 'safe_malloc'
   47 |
   static_cast<pthread_rwlock_t*>(safe_malloc(sizeof(pthread_rwlock_t)));
         |                                    ^
2024-11-20 15:00:11 +00:00
Kazu Hirata
d44ea7186b
[Support] Remove unused includes (NFC) (#116752)
Identified with misc-include-cleaner.
2024-11-20 06:51:43 -08:00
Matthias Springer
309c890921
[llvm] APFloat: Add helpers to query NaN/inf semantics (#116315)
`APFloat` changes extracted from #116176 as per reviewer comments.
2024-11-16 11:48:05 +09:00
Matthias Springer
2f55de4e31
[llvm] APFloat: Query hasNanOrInf from semantics (#116158)
Whether a floating point type supports NaN or infinity can be queried
from its semantics. No need to hard-code a list of types.
2024-11-15 09:29:54 +09:00
Kazu Hirata
4048c64306
[llvm] Remove redundant control flow statements (NFC) (#115831)
Identified with readability-redundant-control-flow.
2024-11-12 10:09:42 -08:00
goldsteinn
877cb9a2ed
[KnownBits] Make {s,u}{add,sub}_sat optimal (#113096)
Changes are:
    1) Make signed-overflow detection optimal
    2) For signed-overflow, try to rule out direction even if we can't
       totally rule out overflow.
    3) Intersect add/sub assuming no overflow with possible overflow
       clamping values as opposed to add/sub without the assumption.
2024-11-05 09:03:37 -06:00
Daniel Sanders
5b356f27a0 Trivial change llvm::CreateInfoOutputFile() to return raw_ostream. NFC
This is NFC w.r.t upstream but allows us to return raw_null_ostream in our
downstream fork without changing the interface.
2024-10-31 11:22:22 -07:00
Simon Pilgrim
03948882d3 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC
NumBits should be less than 20 so using an unsigned instead of size_t should be OK
2024-10-30 10:12:57 +00:00
Steven Wu
ba8d9ce8d4
[ADT] Fix unused variable from #69528 (#114114)
Remove unused variable to fix build failures from bot.
2024-10-29 13:00:59 -07:00
Steven Wu
b510cdb895
[ADT] Add TrieRawHashMap (#69528)
Implement TrieRawHashMap can be used to store object with its associated
hash. User needs to supply a strong hashing function to guarantee the
uniqueness of the hash of the objects to be inserted. A hash collision
is not supported and will lead to error or failed to insert.

TrieRawHashMap is thread-safe and lock-free and can be used as
foundation data structure to implement a content addressible storage.
TrieRawHashMap owns the data stored in it and is designed to be:
* Fast to lookup.
* Fast to "insert" if the data has already been inserted.
* Can be used without lock and doesn't require any knowledge of the
participating threads or extra coordination between threads.

It is not currently designed to be used to insert unique new data with
high contention, due to the limitation on the memory allocator.
2024-10-29 10:29:39 -07:00
Kazu Hirata
ec2da0ca19
[ADT] Use data() and size() within StringRef (NFC) (#113657)
This patch uses data() and size() within StringRef instead of Data and
Length.  This makes it easier to replace Data and Length with
std::string_view in the future, which in turn allows us to forward
most of StringRef functions to the counterparts in std::string_view.
2024-10-25 19:37:55 -07:00
Kazu Hirata
141574bacb
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#113415) 2024-10-23 10:44:09 -07:00
Mészáros Gergely
7ab6d39a4d
[LLVM][CMake][MSVC] Wrap linker flags for ICX on Windows (#112680)
The Intel C++ Compiler (ICX) passes linker flags through the driver
unlike MSVC and clang-cl, and therefore needs them to be prefixed with
`/Qoption,link` (the equivalent of `-Wl,` for gcc on *nix).

Use `LINKER:` prefix wherever supported by cmake, when that's not
possible fall-back to `${CMAKE_CXX_LINKER_WRAPPER_FLAG}`. CMake replaces
these with `/Qoption,link` for ICX and with the empty string for MSVC
and clang-cl.

For `target_link_libraries` neither `LINKER:` (not supported prior to
CMake 3.32) nor `${CMAKE_CXX_LINKER_WRAPPER_FLAG}` (does not begin with
`-` would be taken as a library name) works, use `-Qoption,link`
directly within a conditional generator expression that we're linking
with ICX.

For MSVC and clang-cl no functional change is intended.

Tested by compiling with ICX and setting
`CMAKE_(EXE|SHARED|STATIC|MODULE)_LINKER_FLAGS_INIT` to
`-Werror=unknown-argument`.

RFC:
https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446
2024-10-23 13:03:25 +02:00
Sergey Kozub
7b308b18c3
Fix bitcasting E8M0 APFloat to APInt (#113298)
Fixes a bug in APFloat handling of E8M0 type (zero mantissa).

Related PRs:
- https://github.com/llvm/llvm-project/pull/107127
- https://github.com/llvm/llvm-project/pull/111028
2024-10-22 19:15:29 +02:00
Abhina Sree
46dc91e7d9
[SystemZ][z/OS] Add new openFileForReadBinary function, and pass IsText parameter to getBufferForFile (#111723)
This patch adds an IsText parameter to the following getBufferForFile,
getBufferForFileImpl. We introduce a new virtual function
openFileForReadBinary which defaults to openFileForRead except in
RealFileSystem which uses the OF_None flag instead of OF_Text.

The default is set to OF_Text instead of OF_None, this change in value
does not affect any other platforms other than z/OS. Setting this
parameter correctly is required to open files on z/OS in the correct
encoding. The IsText parameter is based on the context of where we open
files, for example, in the ASTReader, HeaderMap requires that files
always be opened in binary even though they might be tagged as text.
2024-10-21 08:20:22 -04:00
Kazu Hirata
d1401822e2
[Support] Use a hetrogenous lookup with std::map (NFC) (#113075) 2024-10-20 10:42:53 -07:00
Luke Drummond
b55c52c047 Revert "Renormalize line endings whitespace only after dccebddb3b80"
This reverts commit 9d98acb196a40fee5229afeb08f95fd36d41c10a.
2024-10-18 21:16:50 +01:00
Luke Drummond
9d98acb196 Renormalize line endings whitespace only after dccebddb3b80
Line ending policies were changed in the parent, dccebddb3b80. To make
it easier to resolve downstream merge conflicts after line-ending
policies are adjusted this is a separate whitespace-only commit. If you
have merge conflicts as a result, you can simply `git add --renormalize
-u && git merge --continue` or `git add --renormalize -u && git rebase
--continue` - depending on your workflow.
2024-10-17 14:49:26 +01:00
NAKAMURA Takumi
83953c7df1 APInt.cpp: Prune a stray semicolon. 2024-10-17 20:04:00 +09:00
David Blaikie
f796a0c7c9
[formatv] Leave format parameters unstripped (#112625)
This is consistent with std::formatv and allows formatters to support a
wider variety of use cases (like having a bare string in their formatter
if that's useful, etc).

Came up in the context of some Carbon diagnostic work here:
https://github.com/carbon-language/carbon-lang/pull/4411#discussion_r1803688859
2024-10-16 15:53:52 -07:00
Thomas Fransham
9093ba9f7e
[Support] Include Support/thread.h before api implementations (#111175)
This header was included after the implementations to work around an
issue with FreeBSD, however, , this causes some issues when
dllexport\explicit visibility
attributes will be added to the headers on Windows, since the
definitions need to see the declarations for the attributes to apply.

This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
windows.

---------

Co-authored-by: Tom Stellard <tstellar@redhat.com>
2024-10-10 07:45:29 +03:00
Ariel-Burton
3b7091bcf3
[APFloat] add predicates to fltSemantics for hasZero and hasSignedRepr (#111451)
We add static methods to APFloatBase to allow the hasZero and
hasSignedRepr properties of fltSemantics to be obtained.
2024-10-09 08:37:40 -04:00
Yingwei Zheng
6004f5550c
[ADT][APFloat] Make sure EBO is performed on APFloat (#111641)
Since both APFloat and (Double)IEEEFloat inherit from APFloatBase, empty
base optimization is not performed by GCC/Clang (Minimal reproducer:
https://godbolt.org/z/dY8cM3Wre). This patch removes inheritance
relation between (Double)IEEEFloat and APFloatBase to make sure EBO is
performed on APFloat. After this patch, the size of `ConstantFPRange`
will be reduced from 72 to 56.

Address comment
https://github.com/llvm/llvm-project/pull/111544#discussion_r1792398427.
2024-10-09 16:39:02 +08:00
Timm Baeder
a579782a77
[llvm] Add serialization to uint32_t for FixedPointSemantics (#110288)
FixedPointSemantics is exactly 32bits and this static_assert'ed after
its declaration. Add support for converting it to and from a uint32_t.
2024-10-09 05:44:19 +02:00
ivanaivanovska
871f69f0b6
[TimeProfiler] Added instant events to llvm TimeProfiler. (#103039)
This adds support for adding instant events to the TimeProfiler. Later we plan to use it to record deferring of template instantiations.
2024-10-08 13:04:39 +02:00
Craig Topper
bcb15d0059
[APInt] Slightly simplify APInt::ashrSlowCase. NFC (#111220)
Use an arithmetic shift for the last word copy when BitShift!=0. This
avoids an explicit sign extend after the shift.
2024-10-05 10:25:50 -07:00
Kyungwoo Lee
ed59d571f2
[ThinLTO][NFC] Refactor FileCache (#110463)
This is a prep for https://github.com/llvm/llvm-project/pull/90933.
 
  - Change `FileCache` from a function to a type.
  - Store the cache directory in the type, which will be used when creating additional caches for two-codegen round runs that inherit this value.
2024-10-04 07:50:28 -07:00
William Moses
8bcf79914b
Fix have pthread_getname for some system (#110854)
I'm on a system that has have_pthread_getname_np defined but set to 0.
The correct behavior here is not to use the function, but presently the
macros will attempt to use a non-existing function.

This previously worked before
https://github.com/llvm/llvm-project/pull/106486/files
2024-10-03 08:13:53 -05:00
Durgadoss R
99f527d280
[APFloat] Add APFloat support for E8M0 type (#107127)
This patch adds an APFloat type for unsigned E8M0 format. This format is
used for representing the "scale-format" in the MX specification:
(section 5.4)

https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf

This format does not support {Inf, denorms, zeroes}. Like FP32, this
format's exponents are 8-bits (all bits here) and the bias value is 127.
However, it differs from IEEE-FP32 in that the minExponent is -127
(instead of -126). There are updates done in the APFloat utility
functions to handle these constraints for this format.

* The bias calculation is different and convertIEEE* APIs are updated to
handle this.
* Since there are no significand bits, the isSignificandAll{Zeroes/Ones}
methods are updated accordingly.
* Although the format does not have any precision, the precision bit in
the fltSemantics is set to 1 for consistency with APFloat's internal
representation.
* Many utility functions are updated to handle the fact that this format
does not support Zero.
* Provide a separate initFromAPInt() implementation to handle the quirks
of the format.
* Add specific tests to verify the range of values for this format.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-10-02 23:04:21 +05:30
Jay Foad
5cabf1505d
[KnownBits] Make avg{Ceil,Floor}S optimal (#110688)
Rewrite the signed functions in terms of the unsigned ones which are
already optimal.
2024-10-01 19:34:50 +01:00