114 Commits

Author SHA1 Message Date
Aiden Grossman
99cdc26c94
[CI] Cleanup buildkite test report script
This patch removes the generate_test_report_buildkite script entrypoint
as we no longer need it now that buildkite has been sunset. Also
remove the calls in the monolithic-* scripts given they are adding
complexity for no value.

Also remove the generate-buildkite-pipeline-premerge script as it is
no longer needed.

Reviewers: tstellar, DavidSpickett, lnihlen, cmtice

Reviewed By: DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/143480
2025-06-22 17:17:12 +00:00
Aiden Grossman
214ca3161b
[CI] Test all projects when CI scripts change
This patch resolves a fixme in the compute_projects script to actually
test all the projects we can when touching something in the .ci
directory. This ensures we test things like compiler-rt before landing
changes.

Reviewers: gburgessiv, lnihlen, cmtice

Reviewed By: cmtice, gburgessiv

Pull Request: https://github.com/llvm/llvm-project/pull/144034
2025-06-22 17:12:16 +00:00
Aiden Grossman
ee414e3504
[CI] Refactor out some early exits in compute_projects
I have a habit of using early exits given it is in the LLVM coding
standards, but most of the early exits used within this script were
trivial and actually adding complexity. These are all instances where we
only perform one operation after the early exit, so removing the early
exit means fewer lines of code and arguably more readable code.
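
As a hypothetical illustration of this kind of simplification (the function
names and data below are invented for the example and are not taken from the
compute_projects script), a guard followed by a single operation can be
folded into one positive condition:

```python
def add_dependents_with_early_exit(project, dependents, result):
    # Early-exit style: a guard followed by exactly one operation.
    if project not in dependents:
        return
    result.update(dependents[project])


def add_dependents(project, dependents, result):
    # Same behavior without the early exit: one line shorter, and the
    # condition sits directly next to the only operation it guards.
    if project in dependents:
        result.update(dependents[project])
```

Both functions leave `result` untouched for unknown projects, so the change
is purely stylistic.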

Reviewers: DavidSpickett, tstellar, cmtice, lnihlen

Reviewed By: DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/143478
2025-06-22 15:08:11 +00:00
Aiden Grossman
b7be8786af
Reapply "[CI] Migrate to runtimes build" (#143612)
This reverts commit 6f62979a5a5bcf70d65f23e0991a274e6df5955b.

This reapplies commit 80ea5f46df3e365a0a2112889bb91732167b6214.

That commit was reverted because it was causing compiler-rt test
failures due to tysan not having its dependencies set up properly within
CMake. That situation has since been rectified in
3cef099ceddccefca8e11268624397cde9e04af6.

Reviewers: lnihlen, rnk, gburgessiv, cmtice

Reviewed By: rnk, cmtice

Pull Request: https://github.com/llvm/llvm-project/pull/144033
2025-06-20 15:07:00 -07:00
George Burgess IV
6f62979a5a
Revert "[CI] Migrate to runtimes build" (#143612)
Reverts llvm/llvm-project#142696

See https://github.com/llvm/llvm-project/issues/143610 for details; I
believe this PR causes CI builders to build LLVM in a way that's been
broken for a while. To keep CI green, if this is the correct culprit,
those tests should be fixed or skipped.
2025-06-10 17:57:16 -06:00
Aiden Grossman
80ea5f46df
[CI] Migrate to runtimes build
This patch migrates the premerge pipeline to use LLVM_ENABLE_RUNTIMES to
build libc and compiler-rt.

Reviewers: DavidSpickett, tstellar, cmtice, lnihlen

Reviewed By: DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/142696
2025-06-10 08:11:25 +00:00
Aiden Grossman
591678bebd
[CI] Explicitly compute needed runtime targets
This patch adjusts the compute_projects script to explicitly determine
what runtimes should be built and what runtimes should be tested. This
is mainly to support enabling runtimes for LLDB testing but not test
them unless we are building clang.
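
A minimal sketch of the build/test split, assuming two lookup tables keyed
by project. The table names and their contents are hypothetical and only
illustrate the idea; the real script may organize this differently:

```python
# Hypothetical tables: which runtimes a project needs built, and which
# runtimes should additionally be tested when that project is tested.
RUNTIMES_BUILT_FOR = {
    "lldb": {"libcxx", "libcxxabi", "libunwind"},
}
RUNTIMES_TESTED_FOR = {
    "clang": {"libcxx", "libcxxabi", "libunwind"},
}


def compute_runtimes(projects):
    """Return (runtimes_to_build, runtimes_to_test) as sorted lists."""
    to_build, to_test = set(), set()
    for project in projects:
        to_build |= RUNTIMES_BUILT_FOR.get(project, set())
        to_test |= RUNTIMES_TESTED_FOR.get(project, set())
    # Anything we test must also be built.
    to_build |= to_test
    return sorted(to_build), sorted(to_test)
```

With these assumed tables, testing only LLDB builds the runtimes without
testing them, while adding clang also schedules them for testing.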

Reviewers: Endilll, tstellar, DavidSpickett, lnihlen

Reviewed By: DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/142695
2025-06-10 05:16:02 +00:00
Aiden Grossman
b93e4215ec
[CI] Use LLVM_ENABLE_RUNTIMES for runtimes builds on Linux
This patch switches us to using LLVM_ENABLE_RUNTIMES rather than using
separate runtimes builds for some reductions in CMake configuration time
and some simplification of the monolithic-linux.sh script.

Reviewers: DavidSpickett, cmtice, lnihlen, Endilll, tstellar

Reviewed By: Endilll, DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/142694
2025-06-08 22:06:53 +00:00
Aiden Grossman
34e5d8ef16
[CI] Remove buildkite from metrics container (#143049)
Now that buildkite has been sunset, remove buildkite tracking from
the metrics container as it does not do anything.
2025-06-06 12:58:57 -07:00
David Spickett
bd3c632e64
[ci] Use different emoji for Linux and Windows job titles on GitHub (#142101)
Buildkite has a bunch of custom emoji that include Linux and Windows
logos -
https://github.com/buildkite/emojis?tab=readme-ov-file#smileys--emotion.

GitHub doesn't have these so let's just use a penguin and a generic
window. 🐧 🪟

These are standard emoji so Buildkite does support them too, but there's
no motivation to change it there.

Plus, I think the metrics collection might be tied to Buildkite pipeline
name so keeping it the same saves hassle there.
2025-06-02 09:22:55 +01:00
Aiden Grossman
8083944be0
[CI] Do not fail with no JUnit XML
Currently we will fail if there are no JUnit XML files produced from
llvm-lit invocations. This can happen if the build fails and no test
suites end up getting run or if we test a project that does not use
llvm-lit, like libc.
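
The intended behavior can be sketched roughly as follows. The function name
and the messages are hypothetical and this is not the actual report
generator; it only shows an empty file list being treated as a normal case:

```python
def summarize_results(junit_files, build_return_code):
    """Produce a one-line summary instead of failing on an empty list."""
    if not junit_files:
        # No llvm-lit output at all: report based on the build result
        # rather than raising an error.
        if build_return_code == 0:
            return "No test results were produced; the build succeeded."
        return "No test results were produced; the build failed."
    return f"Found {len(junit_files)} JUnit XML file(s)."
```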

This fixes #142038.
2025-05-29 21:56:21 +00:00
Aiden Grossman
7d44430f66
[Github] Fix TODO after removal of continue on error (#141896)
Previously we were using continue-on-error within the workflows to
prevent sending out notifications for workflow failures. We worked
around this in the metrics container to see what was actually
failing/passing by looking at the individual steps. Now that we have
gotten rid of the continue-on-error flag, we just have to look at the
job status.
2025-05-29 08:58:51 -07:00
Aiden Grossman
62e28d4c31
[CI] Upload JUnit Test Results as Artifacts (#141905)
This enables a script to come through later and download all the test
files for further offline analysis. This is intended to enable
developing a tool that can spot flaky tests.
2025-05-29 08:58:36 -07:00
Aiden Grossman
deedc8a181
[CI][Github] Remove test naming from premerge jobs (#141527)
This patch removes the "test only please ignore" tagline from the
premerge job names. Now that we are looking to sunset the old
infrastructure pretty soon and the new infrastructure is reporting
errors, we want people to actually pay attention to the failures and
report anything erroneous.
2025-05-26 13:19:09 -07:00
Aiden Grossman
3910a2c40d
[CI] Fix compute_projects.py unit tests
It seems like these tests were never run during the development of
30747cfe41f5d4f0b0083750ba9c20cfcccec117, which proceeded to break them.
This patch updates the tests to correspond to the changes introduced in
that patch.
2025-05-25 05:23:10 +00:00
Aiden Grossman
466720960b
[CI] Add link to issue tracker upon job failures (#140817)
The premerge system will fail somewhat often due to issues unrelated to
the patch being tested. This patch adds a link within the long form
outputs to the issue tracker prompting users to open an issue if they
see flakes/something broken at HEAD/anything else wrong.
2025-05-22 01:12:11 -07:00
Aiden Grossman
de3e8fff20
[NFC][CI] Reformat python files
Looks like some of these were not properly formatted at some point. This
patch reformats these files so that future diffs are cleaner when
running the formatter over the whole file.
2025-05-20 21:52:33 +00:00
Aiden Grossman
32adf2c360
[Github][CI] Rename New Premerge Jobs (#138024)
This patch renames the new premerge job as suggested in
https://discourse.llvm.org/t/github-ci-notifications-and-main-branch/85868/10.
This uses more industry standard terms (like CI vs premerge checks which
might be somewhat of an LLVM CI idiom?) and makes it more generic if we
end up doing postcommit testing through Github.
2025-05-01 23:22:05 -07:00
Aiden Grossman
e4feb2d5ca
[CI] Hash pin CI python deps (#137489)
The CI scripts install some python dependencies, primarily for testing.
This patch moves these over to a single requirements file that also hash
pins everything using pip-compile to conform to best security and
reproducibility practices.
2025-04-27 08:21:20 -07:00
Matheus Izvekov
1c4ff5128a
[ci] add dependencies for lldb python binding tests (#136158)
2025-04-19 03:50:30 -03:00
Matheus Izvekov
2ce97fd43c
[ci] upload any generated clang reproducers as artifacts (#136157)
Make sure any generated clang reproducers end up as artifacts.
2025-04-18 19:47:06 -03:00
Matheus Izvekov
808f63824a
[ci] set up llvm-symbolizer environment variable (#136156)
Set up llvm-symbolizer environment variable so that it is preferred over
any symbolizer just built, as it can be much slower when built for
debugging.
2025-04-18 19:46:44 -03:00
Matheus Izvekov
30747cfe41
[ci] add all projects as dependencies of ci (#136153)
Add all projects as dependencies of .ci, to make sure everything is
tested when it changes.

Originally split-off from #135499
2025-04-18 17:42:29 -03:00
Aiden Grossman
1fd7e4c517
Revert "[CI] monolithic-linux improvements (#135499)"
This reverts commit a399c6926a8701083c767cbb041e22ff92e9d717.

This is causing some premerge workflow failures.

Example:
https://buildkite.com/llvm-project/github-pull-requests/builds/169129#01963d1d-dc75-4b4c-9952-fb60efbf91b4
2025-04-17 05:53:31 +00:00
Matheus Izvekov
a399c6926a
[CI] monolithic-linux improvements (#135499)
Some improvements to monolithic-linux CI:

1) Add correct configuration and dependencies for LLDB testing which
   is actually relevant for clang changes.
2) Skip clang installation and separate configuration for runtimes.
   They will be built with the just built clang either way.
   This avoids building the runtimes twice when LLDB is also tested.
3) Make sure any generated clang reproducers end up as artifacts.
4) Set up llvm-symbolizer environment variable so that it is preferred
   over any symbolizer just built, as it can be much slower when built
   for debugging.
5) Add all projects as dependencies of `.ci`, to make sure everything is
   tested when it changes.
2025-04-15 13:00:49 -03:00
Aiden Grossman
dbeb7c1bbb
[Github][CI] Upload .ninja_log as an artifact
This enables using tools like https://github.com/nico/ninjatracing for
performance introspection.

Reviewers: mizvekov, lnihlen, tstellar, Endilll, Keenuts

Reviewed By: Keenuts

Pull Request: https://github.com/llvm/llvm-project/pull/135539
2025-04-14 16:28:50 +02:00
David Spickett
d9cfd90524
[ci] Improve wording in CI test reports
We weren't saying where to click; make it clear that you click on a
test name.
2025-04-10 09:20:13 +00:00
Nathan Gauër
fe4f666363
[CI] Always upload queue/running count (#134814)
Before this commit, we only pushed a queue/running count when the value
was not zero. This makes building Grafana alerting a bit harder.
Changing this to always upload a value for watched workflows.
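
A rough sketch of the idea, with hypothetical names and a simplified input
format (a list of `(workflow, status)` pairs): every watched workflow starts
at zero, so a value is always available to upload.

```python
def count_jobs(watched_workflows, jobs):
    """Count queued/running jobs, always emitting an entry per workflow."""
    # Start every watched workflow at zero so that an explicit 0 is
    # reported even when nothing is queued or running.
    counts = {name: {"queued": 0, "running": 0} for name in watched_workflows}
    for workflow, status in jobs:
        if workflow in counts and status in counts[workflow]:
            counts[workflow][status] += 1
    return counts
```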
2025-04-08 11:16:24 +02:00
Aiden Grossman
582b1b2ac9
[CI] Use env variable to enable pip breaking system packages
This patch uses an env variable instead of the --break-system-packages
flag. This enables the heterogeneous configuration between the old and
new premerge systems as the old premerge container does not recognize
the --break-system-packages flag. An env variable will work on new
premerge and have no impact on old premerge.
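
For illustration, pip reads command-line options from environment variables
of the form `PIP_<OPTION_NAME>`, so the flag can be expressed as
`PIP_BREAK_SYSTEM_PACKAGES`. The helper below is a hypothetical sketch, not
the code from this patch:

```python
import os


def pip_environment(base_env=None):
    """Return a copy of the environment with the pip override set."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment-variable equivalent of --break-system-packages.
    env["PIP_BREAK_SYSTEM_PACKAGES"] = "1"
    return env


# Example use (not executed here; the requirements file is an assumption):
#   subprocess.run(
#       [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
#       env=pip_environment(), check=True)
```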
2025-04-05 20:04:04 +00:00
Aiden Grossman
fb96d5171e
Reapply "[CI] Fix Monolithic Linux Build in Ubuntu 24.04 (#133628)"
This reverts commit d72be157823d41e7eaf457cc37ea99c07431a25c.

Now that the container version got bumped, we need to reland this.
2025-04-05 07:24:36 +00:00
Aiden Grossman
d72be15782
Revert "[CI] Fix Monolithic Linux Build in Ubuntu 24.04 (#133628)"
This reverts commit 23fb048ce35f672d8db3f466a2522354bbce66e5.

This broke the new premerge system as it appears the pip installations within
the CI image do not support this option. Buildkite was unaffected.
2025-04-01 23:43:35 +00:00
Aiden Grossman
ce296f1eba
[CI] Exclude gn changes from running premerge (#133623)
These changes are mostly pushed by the gnsyncbot directly to main and
thus don't go through a PR, but we still test on main to see if main is
broken. Given these touch llvm/, they end up burning a decent amount of
testing time for no real benefit, so I think it makes sense to exclude
them from premerge testing explicitly.
2025-04-01 12:58:16 -07:00
Aiden Grossman
23fb048ce3
[CI] Fix Monolithic Linux Build in Ubuntu 24.04 (#133628)
This patch fixes the monolithic linux build in Ubuntu 24.04. Newer
versions of debian/ubuntu refuse to install packages at the system level
using pip, as doing so interferes with python packages installed by the
system package manager. We do not use any python packages installed by
the system package manager, so we override this check by passing the
--break-system-packages flag.
2025-04-01 12:55:07 -07:00
Aiden Grossman
41c906fe2b
[CI] Add rich build information for github workflows
This patch adds rich test failure information to the Github output,
using the same library that is used for the buildkite pipeline.
Eventually I think we want to add more information like reproduction
information using the containers, but that is very divergent between
Github and Buildkite, so we probably want to wait until we've switched
over before doing that.

Reviewers: Keenuts, tstellar, lnihlen, DavidSpickett

Reviewed By: DavidSpickett, Keenuts

Pull Request: https://github.com/llvm/llvm-project/pull/133197
2025-03-28 23:48:20 -07:00
Aiden Grossman
21eeca3db0
[CI] Exclude docs directories from triggering rebuilds
Currently when someone touches a docs directory in a subproject, it is
treated as if the source code of that project got touched, so the
project is built, it is tested, and the same for all of its enumerated
dependents. This is wasteful, particularly for patches just touching
docs in places like LLVM where we might spend an hour of node time to do
nothing useful given changes in the docs shouldn't cause test failures
and there is already another workflow that tests the documentation build
completes successfully.
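
A minimal sketch of the filtering, assuming the docs directory is the second
path component (for example `llvm/docs/...`). The function name is
hypothetical and the real script may match paths differently:

```python
from pathlib import PurePosixPath


def projects_from_paths(modified_files):
    """Map modified file paths to top-level projects, skipping docs."""
    projects = set()
    for path in modified_files:
        parts = PurePosixPath(path).parts
        if len(parts) < 2:
            continue  # A top-level file does not belong to a subproject.
        if parts[1] == "docs":
            continue  # Docs-only changes should not trigger a rebuild.
        projects.add(parts[0])
    return projects
```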

Reviewers: Keenuts, tstellar, lnihlen

Reviewed By: tstellar

Pull Request: https://github.com/llvm/llvm-project/pull/133185
2025-03-28 22:30:41 -07:00
Aiden Grossman
34d858635f
[CI] Move CI over to new project computation script
This patch migrates the CI over to the new compute_projects.py script
for calculating what projects need to be tested based on a change to
LLVM.

Reviewers: lnihlen, ldionne, tstellar, Endilll, joker-eph, Keenuts

Reviewed By: Keenuts, tstellar

Pull Request: https://github.com/llvm/llvm-project/pull/132642
2025-03-28 22:25:52 -07:00
Aiden Grossman
2fb53f59c1
[CI] Refactor generate_test_report script
This patch refactors the generate_test_report script, namely turning it
into a proper library, and pulling the script/unittests out into
separate files, as is standard with most python scripts. The main
purpose of this is to enable reusing the library for the new Github
premerge.

Reviewers: tstellar, DavidSpickett, Keenuts, lnihlen

Reviewed By: DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/133196
2025-03-27 12:59:43 -07:00
Aiden Grossman
2ca27e7c3e
[CI] Fix typo in compute_projects_test.py
I apparently forgot to properly name the test before submitting the last
patch. This patch properly names the test.
2025-03-27 01:51:13 +00:00
Aiden Grossman
7da71a6b71
[CI] Exclude runtimes from being tested as projects
Before this patch, making a change to a runtime directory (like libcxx)
would cause the project to be added to the LLVM_ENABLE_PROJECTS CMake
flag which is illegal as they can only be built as part of
LLVM_ENABLE_RUNTIMES. This patch fixes that behavior. Test added.
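
The fixed behavior can be sketched as below. The `RUNTIMES` set is an
illustrative subset and the helper name is hypothetical:

```python
# Illustrative subset of directories that must go through
# LLVM_ENABLE_RUNTIMES rather than LLVM_ENABLE_PROJECTS.
RUNTIMES = {"libcxx", "libcxxabi", "libunwind"}


def cmake_flags(touched):
    """Split touched directories into projects and runtimes flags."""
    projects = sorted(p for p in touched if p not in RUNTIMES)
    runtimes = sorted(p for p in touched if p in RUNTIMES)
    flags = []
    if projects:
        flags.append("-DLLVM_ENABLE_PROJECTS=" + ";".join(projects))
    if runtimes:
        flags.append("-DLLVM_ENABLE_RUNTIMES=" + ";".join(runtimes))
    return flags
```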
2025-03-26 23:59:41 +00:00
Aiden Grossman
9224165871
[CI] Add Python Script for Computing Projects/Runtimes to Test
This patch adds a python script, compute_projects, and associated unit
tests for computing the projects and runtimes that need to be tested in
premerge. Rewriting in Python opens up a couple new
improvements/opportunities:
1. I personally find python to be much easier to work with than shell
   scripts for tasks like this. Particularly it becomes a lot easier to
   work with paths with proper array support.
2. Unit testing becomes easier which makes it a lot easier to reason
   about behavior changes, especially in review.
3. Most of the configuration is now setup in some dictionaries, which
   makes changes much easier to apply for most of the common changes.

This preserves the behavior of the existing premerge scripts as much as
possible.
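
To illustrate point 3, here is a small sketch of a dictionary-driven
computation. The `DEPENDENTS` table is a made-up example, not the
configuration from the actual script:

```python
# Hypothetical configuration: project -> projects that depend on it.
DEPENDENTS = {
    "llvm": {"clang", "lld"},
    "clang": {"clang-tools-extra"},
}


def projects_to_test(modified_projects):
    """Return the modified projects plus all transitive dependents."""
    result = set()
    worklist = list(modified_projects)
    while worklist:
        project = worklist.pop()
        if project in result:
            continue
        result.add(project)
        worklist.extend(DEPENDENTS.get(project, ()))
    return sorted(result)
```

Adding or removing a dependency then only requires editing the table, and
the function is straightforward to cover with unit tests.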

Reviewers: ldionne, lnihlen, Endilll, tstellar, Keenuts

Reviewed By: Keenuts

Pull Request: https://github.com/llvm/llvm-project/pull/132634
2025-03-26 12:37:16 -07:00
Vlad Serebrennikov
8f863fcd77
[clang][CI] Reuse build dir between C++26 and modules build of runtimes (#132480)
Between C++26 and Clang modules build of runtimes, which are used as an
additional testing for Clang, the only differences are
`LIBCXX_TEST_PARAMS` and `LIBCXXABI_TEST_PARAMS`. Both of them are
transformed into actual lit configuration lines, and put into
`SERIALIZED_LIT_PARAMS`, which end up in `libcxx/test/cmake-bridge.cfg`
via `configure_file` command. Notably, it seems that they are not used
in any other way.

I checked that if those variables are changed, subsequent runs of CMake
configuration step regenerate `cmake-bridge.cfg` with the new values.
Which means that we don't need to do clean builds for every runtimes
configuration we want to test.

I hope that together with #131913, this will be enough to alleviate
Linux CI pains we're having, and we wouldn't have to make a tough choice
between C++26 and Clang modules builds for pre-merge CI.
2025-03-25 19:38:33 +04:00
Aiden Grossman
052a4b54a7
[CI] Clean up runtimes builds (#131913)
This patch cleans up the runtimes build in premerge due to queuing
delays, dropping the C++03 testing, but keeping the C++20 and Modules
configurations as they are deemed important by clang contributors.

This patch also makes it easier in the future when we need to rework the
runtimes build to anticipate the deprecation of building most of the
runtimes with LLVM_ENABLE_PROJECTS.
2025-03-21 12:39:47 -07:00
Nathan Gauër
77edfbb96c
[CI] Don't count canceled buildkite builds (#132015)
We don't count canceled jobs on GCP, so we shouldn't count canceled jobs
on Buildkite either.

Signed-off-by: Nathan Gauër <brioche@google.com>
2025-03-21 10:14:44 +01:00
Aiden Grossman
43c21f96a7
Revert "[Premerge] Add flang-rt (#128678)" (#131915)
This reverts commit 95d28fe503cc3d2bc0bb980442d3defaf199ea5a.

I did not fully realize the implications of this change when reviewing.
With how it is set up currently, it causes clang and all of the runtimes
to be built and tested every time a change to MLIR is made. This is a
large regression in build/test time, which seems to have been causing
large queueing delays.

Reverting for now. Once we rework the runtimes build for premerge (which
I hope to do soon, ideally in the next week), I will make sure flang-rt
gets added in.
2025-03-18 15:52:59 -07:00
Aiden Grossman
0619892cab
[CI] Bump max workflow to process count in metrics
This patch bumps the maximum number of workflows to look through when
collecting metrics data. We are currently running into issues where we
are losing data due to the most recent 1000 workflows not containing the
workflows that we actually need to query. Just double it for now.

I plan on monitoring this reasonably closely to ensure we do not run
into issues, mainly API rate limits.
2025-03-18 19:34:57 +00:00
Nathan Gauër
05df923b0e
[CI] Add dateutil dependency to the metrics container (#131333)
2025-03-14 14:45:44 +01:00
Nathan Gauër
44f4e43b4f
[CI] Extend metrics container to log BuildKite metrics (#130996)
The current container focuses on Github metrics. Before deprecating
BuildKite, we want to make sure the new infra quality is better, or at
least the same.

Being able to compare buildkite metrics with github metrics on grafana
will allow us to easily present the comparison.

BuildKite API allows filtering, but doesn't allow changing the result
ordering. Meaning we are left with builds ordered by IDs. This means a
completed job can appear before a running job in the list. 2 solutions
from there:
 - keep the cursor on the oldest running workflow
 - keep a list of running workflows to compare.

Because there are no guarantees in workflow ordering, waiting for the
oldest build to complete before reporting any newer build could mean
delaying the more recent build completion reporting by a few hours. And
because grafana cannot ingest metrics older than 2 hours, this is not an
option.

Thus we are left with the second solution: remember what jobs were running
during the last iteration, and record them as soon as they are
completed. Buildkite has at most ~100 pending jobs, so keeping all those
IDs should be OK.
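
A simplified sketch of the second solution, using hypothetical names and a
list of `(build_id, state)` pairs as input. It ignores builds that start and
finish between two iterations, which a real implementation would also need
to handle:

```python
def process_builds(builds, previously_running):
    """Return (newly_completed_ids, currently_running_ids)."""
    newly_completed = []
    running = set()
    for build_id, state in builds:
        if state == "running":
            # Remember it so its completion is noticed next iteration.
            running.add(build_id)
        elif build_id in previously_running:
            # Was running last time and no longer is: record it now.
            newly_completed.append(build_id)
    return newly_completed, running
```

The caller keeps the returned running set and passes it back in on the next
iteration.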
2025-03-14 11:44:39 +01:00
Michael Kruse
95d28fe503
[Premerge] Add flang-rt (#128678)
Flang's runtime can now be built using LLVM's LLVM_ENABLE_RUNTIMES
mechanism, with the intent to remove the old mechanism in #124126.
Update the pre-merge builders to use the new mechanism.

In the current form, #124126 actually will add
LLVM_ENABLE_RUNTIMES=flang-rt implicitly, so no change is strictly
needed. I still think it is a good idea to do it explicitly and in
advance.

On Windows, flang-rt also requires compiler-rt, which is not built on
Windows anyway.
2025-03-13 12:17:59 +01:00
Nathan Gauër
1282878c52
[CI] Fix bad timestamps being reported (#130941)
Yesterday, the monitoring reported a job queued for 23h59. After some
checks, it appears no such job existed: the age of the workflows on
completion was at most 5 hours during the last 48 hours.

After some digging, I found out GitHub could return a job with a start
date slightly before the creation date, or completion date before start
date.
This would cause python to compute a negative timedelta, which would
then be reported in grafana as a full 24h delta due to the conversions.

Adding code to ignore negative deltas, but still log them.
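
The wrap-around comes from how Python normalizes negative timedeltas: a
delta of minus one second is stored as `days=-1, seconds=86399`. A minimal
sketch of the guard, with a hypothetical function name:

```python
import datetime
import logging


def queue_time_seconds(created_at, started_at):
    """Return the queue time in seconds, or None for a negative delta."""
    delta = started_at - created_at
    if delta < datetime.timedelta(0):
        # Reading delta.seconds here would give almost 24 hours.
        logging.warning("Ignoring negative delta: %s", delta)
        return None
    return delta.total_seconds()


# A delta of -1 second normalizes to days=-1, seconds=86399.
wrapped = datetime.timedelta(seconds=-1)
```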
2025-03-13 10:18:02 +01:00
Nathan Gauër
389a705b8e
[CI] Rework github workflow processing (#130317)
Before this patch, the job/workflow name impacted the metric name,
meaning a change in the workflow definition could break monitoring. This
patch adds a map to get a stable name on metrics from a workflow name.

In addition, it reworks a bit how we track the last processed workflow:
the github queries are broken if filtering is applied, meaning we have a
list of workflow, ordered by 'created_at', which mixes completed &
running workflows.
We have no guarantees over the order of completion, meaning we cannot
stop at the first completed job we found (even per-workflow).

This PR processes the last 1000 workflows, but allows an early stop if
the created_at time is older than 8 hours. This means we could miss
long-running workflows (>8 hours), and if the number of workflows
started before another one completes becomes high (>1000), we'll miss
it.
To detect this kind of behavior, a new metric is added "oldest workflow
processed", which should at least indicate if the depth is too small.

An alternative without arbitrary cut would be to initially parse all
workflows, and then record the last non-completed one we find and always
start from the last (moving the lower bound as they complete). But LLVM
has forever-queued workflow runs (>1 year), hence this would cause us
to iterate over a very large number of jobs.
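
The name map and the bounded scan can be sketched as follows. All names and
the example map entry are hypothetical; the input is assumed to be a list of
`(workflow_name, created_at)` pairs ordered newest first:

```python
import datetime

# Hypothetical map from workflow name to a stable metric name.
WORKFLOW_METRIC_NAMES = {"CI Checks": "premerge_checks"}
MAX_WORKFLOWS = 1000
MAX_AGE = datetime.timedelta(hours=8)


def select_workflows(runs, now):
    """Return (selected, oldest_created_at_processed)."""
    selected = []
    oldest = None
    for index, (name, created_at) in enumerate(runs):
        # Stop at the depth limit or once runs are older than 8 hours.
        if index >= MAX_WORKFLOWS or now - created_at > MAX_AGE:
            break
        oldest = created_at
        metric = WORKFLOW_METRIC_NAMES.get(name)
        if metric is not None:
            selected.append((metric, created_at))
    return selected, oldest
```

The second return value corresponds to the "oldest workflow processed"
metric: if it stays well short of 8 hours, the depth limit is likely too
small.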

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2025-03-11 14:16:18 +01:00