Today, on Darwin platforms, almost every test binary in our test suite
loads two copies of libc++, libc++abi, and libunwind. This is because
each of the test binaries explicitly link against a just-built libc++
(which is explicitly required on Darwin right now) but we don't take the
correct steps to replace the system libc++. Doing so is unnecessary and
potentially error-prone, so most tests should link against the system
libc++ where possible.
Background:
The lldb test suite has a collection of tests that rely on libc++
explicitly. The two biggest categories are data formatter tests (which
make sure that we can correctly display values for std types) and
import-std-module tests (which test that we can import the libc++ std
module). To make sure these tests are run, we require a just-built
libc++ to be used.
All of the test binaries link against the just-built libc++, so it gets
loaded. However, when any system library tries to load libc++, it
attempts to load the system one. dyld checks loaded libraries against
the request to load a new one using the full path, meaning anyone
linking against `/usr/lib/libc++.1.dylib` will get it no matter what
other libc++ dylib is already loaded.
The proper way to handle this is using `DYLD_LIBRARY_PATH`, which
switches dyld to checking the leaf name of a dylib instead of the full
path. In theory this works, but we run into an issue where the system
libc++ has additional symbols and many system libraries fail to load.
Louis Dionne added stubs in libc++abi for these missing symbols, meaning
it would be possible to make this scenario work. This may be useful for
the existing libc++ tests.
### Summary
`SBModuleSpecList` already supports `len()` and iteration via `__len__`
and `__iter__`, but is not subscriptable — `specs[0]` raises
`TypeError`.
This adds a `__getitem__` method that supports integer indexing (with
negative index support) and string lookup using `endswith()` matching,
which works for both Unix and Windows paths.
### Supported key types
| Key type | Example | Behavior |
|---|---|---|
| `int` | `specs[0]`, `specs[-1]` | Direct index with negative index
support |
| `str` | `specs['a.out']`, `specs['/usr/lib/liba.dylib']` | Lookup by
basename or partial/full path via `endswith()`. Returns first match or
`None` |
### Error handling
- **`IndexError`** for out-of-bounds integer indices
- **`TypeError`** for unsupported key types (e.g., `float`)
- **`None`** for string lookups with no match
### Before
```python
>>> specs = lldb.SBModuleSpecList.GetModuleSpecifications('/bin/ls')
>>> specs[0]
Traceback (most recent call last):
File "<console>", line 1, in <module>
TypeError: 'SBModuleSpecList' object is not subscriptable
```
### After
```python
>>> import lldb, re
>>> specs = lldb.SBModuleSpecList.GetModuleSpecifications('/bin/ls')
>>> specs[0]
file = '/bin/ls', arch = x86_64-*-linux, uuid = 3CCC0D8A-..., object size = 140928
>>> specs[-1]
file = '/bin/ls', arch = x86_64-*-linux, uuid = 3CCC0D8A-..., object size = 140928
>>> specs['ls']
file = '/bin/ls', arch = x86_64-*-linux, uuid = 3CCC0D8A-..., object size = 140928
>>> specs[999]
IndexError: list index out of range
>>> specs[1.5]
TypeError: unsupported index type: <class 'float'>
```
### Test plan
Added test_module_spec_list_indexing to TestSBModule.py covering:
- Positive and negative integer indexing
- Out-of-bounds raises IndexError
- Unsupported key type raises TypeError
- String lookup by basename and full path (endswith() matching)
- Missing key returns None
```
bin/llvm-lit -sv lldb/test/API/python_api/sbmodule/TestSBModule.py
```
```
PASS: LLDB :: test_GetObjectName_dwarf (TestSBModule.SBModuleAPICase)
PASS: LLDB :: test_GetObjectName_dwo (TestSBModule.SBModuleAPICase)
PASS: LLDB :: test_module_spec_list_indexing_dwarf (TestSBModule.SBModuleAPICase)
PASS: LLDB :: test_module_spec_list_indexing_dwo (TestSBModule.SBModuleAPICase)
----------------------------------------------------------------------
Ran 12 tests in 1.854s
OK (skipped=6)
```
Co-authored-by: Piyush Jaiswal <piyushjais@meta.com>
`isRISCV()` check always returns false because we only get here if
`min_op_byte_size` and `max_op_byte_size` are equal, which is not true
for RISC-V.
Also, replase `if (!got_op)` check with an `else`. The check is
equivalent to
`if (min_op_byte_size != max_op_byte_size)`, and the `if` above checks
for the opposite condition.
This changes `DenseMapInfo<UUID>::getTombstoneKey()` to return a 1-byte
`{0xFF}` sentinel instead of the empty, default constructed UUID().
Returning the same key for the empty and tombstone value apparently
violates the `DenseMap` invariant.
Since #189401, LLVM and Clang generate `S_DEFRANGE_REGISTER_REL_INDIR`
for indirect locations. This adds support in LLDB.
The offset added after dereferencing is signed here - unlike in
`S_REGREL32_INDIR` (at least that's the assumption). So I updated
`MakeRegisterBasedIndirectLocationExpressionInternal` to handle the
signedness. This is the reason the MSVC test was changed here.
I didn't find a test case where LLVM emits the record with the `VFRAME`
register. Other than that, the clang test is similar to the MSVC one
except that the locations are slightly different.
In the non-ARM case, the offset was left unset, so the symbol
synthesized for the entry point pointed to the start of the containing
section.
As a drive-by change, simplify offset adjustment in ARM case.
The LLVM Coding Standards [1] specify that:
> [T]o match error message styles commonly produced by other tools,
> start the first sentence with a lowercase letter, and finish the last
> sentence without a period, if it would end in one otherwise.
Historically, that hasn't been something we've enforced in LLDB, but in
the past year or so I've started to pay more attention to this in code
reviews. This PR brings more error messages in compliance, further
increasing consistency.
I also adopted `createStringErrorV` where it improved the code as a
drive-by for lines I was already touching.
[1] https://llvm.org/docs/CodingStandards.html#error-and-warning-messages
Assisted-by: Claude Code
Value::ResolveValue() only does something if the value has an associated
compiler type, which is never set on values used in DWARF expressions.
Simplify code by inlining the method.
Some WebAssembly runtimes might use environment variables such as `HOME`
or `XDG_CONFIG_HOME` to store configuration files, additionally some VMs
might allow wasm modules to read environment variables from the host.
Currently, if a runtime is launched using the WebAssembly platform it
doesn't inherit the environment. As a result wasm runtimes and modules
are unable to read from environment variables and might even fail to
launch (if the config file is required). This PR aims to resolve this
issue, I have tested this with the WARDuino runtime and my WIP gdbstub.
Fixes a crash with the following alias, which I use for printing the
contents of pointer variables:
```
command alias vp v -P1
```
At some point in the recent-ish past, parsing this alias has started
crashing lldb. The problem is code that assumes the option and its value
are separate. This assumption causes an index past the end of a vector.
This fix changes `FindArgumentIndexForOption`. The function now returns
a pair of indexes, the first index is the option, the second index is
the index of the value. In the case of joined options like `-P1`, the
two indexes are the same.
This patch does two thing:
- It reverts to the previous behavior of warning for untrusted dSYMs.
- It includes whether a dSYM is trusted or untrusted in the warning
output.
My reasoning is that there's no tooling for automatically signing dSYMs
and therefore we shouldn't change the behavior until this is more
common. The inclusion of whether the dSYM is signed or not is the first
step towards advertising the existence of the feature.
This now also means the release note I added in #189444 is correct
(again).
I had auto-merge enabled in #189444 and since the formatter is
non-blocking it got merged despite the issue. Given I'm already here, I
just formatted the whole file.
LLDB automatically discovers, but doesn't automatically load, scripts in
the dSYM bundle. This is to prevent running untrusted code. Users can
choose to import the script manually or toggle a global setting to
override this policy. This isn't a great user experience: the former
quickly becomes tedious and the latter leads to decreased security.
This PR offers a middle ground that allows LLDB to automatically load
scripts from trusted dSYM bundles. Trusted here means that the bundle
was signed with a certificate trusted by the system. This can be a
locally created certificate (but not an ad-hoc certificate) or a
certificate from a trusted vendor.
* Correctly return the result when used from the console, so that
`DiagnosticsRendering` could use it to output the error.
* Add location pointer to `DILDiagnosticError` internal formatting to
show diagnostics when called from the API.
This patch is motivated by
https://github.com/llvm/llvm-project/pull/189943, where we would like to
print the "these module scripts weren't loaded" warning for *all*
modules batched together. I.e., we want to print the warning *after* all
the script loading attempts, not from within each attempt.
To do so we want to hoist the `ReportWarning` calls in
`Module::LoadScriptingResourceInTarget` out into the callsites. But if
we do that, the callers have to remember to print the warnings. To avoid
this, we redirect all callsites to use
`ModuleList::LoadScriptingResourceInTarget`, which will be responsible
for printing the warnings.
To avoid future accidental uses of
`Module::LoadScriptingResourceInTarget` I moved the API into
`ModuleList` and made it `private`.
A kernel developer noticed that I missed a call to index the local
filesystem in one of our codepaths, and had a use case that depended on
that working.
rdar://173814556
Many tests have ad hoc forms of the launch & break steps done by
`lldbutil.run_to_source_breakpoint`. This changes some of those tests to
use `run_to_source_breakpoint` instead.
Assisted-by: claude
This test technically does not require libc++. The test binary mimics
libc++'s namespace layout to trigger some frame hiding logic in lldb,
but it does not require libc++ to function.
This is explicitly marked as a libc++ test and functionally tests the
formatter for a vector of enums. I put it in the generic directory
because there's no reason this couldn't work for other c++ stdlibs.
Additionally, this should be using the custom libc++ like the other
tests.
In some environments like swiftlang, the `''` causes the command used to
drain the init sequence of the ConPTY to fail. Replacing with a `cls`
invocation removes the need for quotation marks and works just as well.
For whatever reason we ended up with register/register but the first
register just had the second register folder in it.
Move the files up one level so we have register/<test files>.
Fixes#165413. Where a build failure was reported:
```
/b/s/w/ir/x/w/llvm-llvm-project/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp:1182:9: error: unknown type name 'user_sve_header'; did you mean 'sve::user_sve_header'?
1182 | user_sve_header *header =
| ^~~~~~~~~~~~~~~
| sve::user_sve_header
```
To fix this, add sve:: as we do for all other uses of this.
This is LLDB's copy of a structure that Linux also defines. I think the
build worked on some machines because that version ended up being
included, but with a more isolated build, it may not.
We have our own definition of it so we can be sure what we're using in
case Linux extends it later.
Tests with custom a.out targets in their Makefile (i.e.
`TestBSDArchives.py`) bypass the standard Makefile.rules linking step
where `CODESIGN` is applied. This leaves the binary unsigned, causing
the process to get kill it on remote darwin devices.
This adds a codesigning step to the all target in Makefile.rules that
signs both $(EXE) and a.out if they exist. This ensures all test
binaries are signed regardless of how they were built.
rdar://173840592
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This PR is the tests for #138717. I have split it from implementation as
there is a lot of code, and the two don't really overlap. The tests are
checking what a user will see in LLDB, and only indirectly what happens
in lldb-server.
There are two types of tests, the first is register access tests. These
start in a given mode, where a mode is a combination of the streaming
mode on or off, ZA on or off, and a vector length.
For register access tests I have used two states:
* Streaming mode, ZA on, default vector length
* Non-streaming mode, ZA off, default vector length
(changing mode and vector length is tested in existing SVE+SME tests and
the expression tests I'm adding in this same PR)
The test program:
* Sets up the requested mode.
* Allocates a bunch of buffers for LLDB to write expected register
values to.
* Sets initial register values.
Then LLDB looks for those expected values. Assuming they are present, it
then writes a new set of values to the registers, and to the buffers
within the program. The program is then stepped, which runs an
in-process function to check that it now sees those register values. If
that function fails, we have a bug. If it does not, LLDB then checks for
the same values.
This process repeats for several registers. Hence the repeated calls to
check_register_values in the test program.
In this way we make sure that LLDB sends the correct values to ptrace by
validating them in process. Also that LLDB does not write other
registers in the process of writing one register. Which happened to me
several times during development.
The buffer writes are done with memory writes not expression evaluation
becuase at this stage we cannot trust that expressions work (the
expression tests will later prove that they do).
The second type of tests are expression state restoration tests. For
each combination of states we check that we can restore to a given state
if an expression takes us into the other state. This includes between
the same state, and adds an extra vector length to check expanding and
shrinking SVE buffers. This produces roughly 64 tests.
I have written them all as one test case in a loop because these take a
long time (~20 minutes on an FVP) and you almost always want it to fail
fast. When tracing is enabled you'll see the current states logged.
It's possible that we could skip some of these tests as being very
unlikely to ever happen, but in the spirit of
"There are Only Four Billion Floats–So Test Them All!" I think it is
simpler to enumerate every possible state change.
I have not added a test for executing an expression incompatible with
the current mode because this leads to SIGILL, which I know we already
handle. At this time LLDB makes no effort to make the expression
compatible with the current mode either.
This is the implementation part of #138717, tests and documentation will
be in a separate PR.
SME only systems have SME but not SVE. Previously we assumed that you
would either have SVE, SVE+SME, or neither. On an SME only system, the
SVE register state exists only in streaming mode. SVE instructions may
be used, but only in streaming mode. The Linux kernel will report that
such a processor has SME features but no SVE features.
This means that the non-streaming SVE ptrace register set is
inaccessible (with one exception mentioned below). So the main change
here is around the management of FP registers.
* In non-streaming mode, FP registers must be accessed via the FP
register set. Not the non-streaming SVE register set.
* In streaming mode, FP registers are a subset of the streaming SVE
registers, accessed via the streaming SVE register set.
The one exception the kernel allows is:
> On systems that do not support SVE it is permitted to use SETREGSET to
write SVE_PT_REGS_FPSIMD formatted data via NT_ARM_SVE, in this case the
vector length should be specified as 0. This allows streaming mode to be
disabled on systems with SME but not SVE.
https://docs.kernel.org/arch/arm64/sve.html
This is how we are able to restore to a non-streaming state when an
expression leaves us in streaming mode.
The major changes to LLDB are:
* If we receive only SVG in the expedited registers, we now assume it's
an SME only system and therefore VG == SVG.
* A new state SVEState::StreamingFPSIMD is added to mark the mode where
we are in FPSIMD mode on an SME only system (as opposed to the FPSIMD
state of an SVE or SVE+SME system).
* CalculateFprOffset now returns different results depending on whether
we have only SIMD or we have streaming SVE. In the latter case, the
stored register offsets are relative to the SVE registers, but actual
accesses need to be relative to the ptrace FP register set.
* In WriteAllRegisters, if we have an FP set to restore on an SME only
system we assume we must have been in non-streaming mode at the time of
the save. So we restore it using the previously mentioned non-streaming
SVE write.
* GetSVERegSet and ConfigureRegisterContext were updated to handle the
new StreamingFPSIMDOnly state.
The only difference between them is that OptionalBool's third state
is "unknown" and LazyBool's is "calculate". We don't need to tell
the difference in a single context, so I've made a new eLazyBoolDontKnow
which is an alias of eLazyBoolCalculate.
This patch changes the `Platform::LocateXXX` to return a map from
`FileSpec` to `LoadScriptFromSymFile` enum.
This is needed for https://github.com/llvm/llvm-project/pull/188722,
where I intend to set `LoadScriptFromSymFile` per-module.
By default the `Platform::LocateXXX` set the value to whatever the
target's current `target.load-script-from-symbol-file` is set to. In
https://github.com/llvm/llvm-project/pull/188722 we'll allow overriding
this per-target setting on a per-module basis.
Drive-by:
* Added logging when we fail to load a script.
In preparation for updating DIL to handle assignments, this adds a
member variable to the DIL Interpreter indicating whether or not
updating program variables is allowed. For invocations from the LLDB
command prompt (through "frame variable") we want to allow it, but from
other places we might not. Therefore we also add new StackFrame
ExpressionPathOption, eExpressionPathOptionsAllowVarUpdates, which we
add to calls from CommandObjectFrame, and which is checked in
GetValueForVariableExpressionPath. Finally, we also add a parameter,
can_update_vars, with a default value of true, to
ValueObject::SetValueFromInteger, as that will be the main function used
to by assignment in DIL.
I've been tracking sporadic timeouts waiting for a file to appear on
macOS buildbots (and occasionally local development environments). I
believe I've tracked it down to a regression in process launch
performance in macOS.
What I noticed is that running multiple test suites simultaneously
almost always triggered these failures and that the tests were always
waiting on files created by the inferior. Increasing this timeout no
longer triggers the failures on my loaded machine locally.
This timeout moves from about 16 seconds of total wait time to about 127
seconds of total wait time. This may feel a bit extreme, but this is a
performance issue. While I was here, I cleaned up logging code I was
using to investigate the test failures.
rdar://172122213
We already log the import attempt. This patch logs the failure to load
the script since otherwise the log may be misleading. Didn't think it
was worth adding a test-case for this.
Fixes an issue where attaching by program would fail if the program name
was a partial name (e.g. "foobar" instead of "/path/to/foobar").
We failed to create the target which caused the attach to fail. Now we
fallback to the dummy target and update to the real target after the
attach completes.
Here is an example launch configuration that fail:
```
{
"type": "lldb-dap",
"name": "Attach (wait)",
"request": "attach",
"program": "foobar",
"waitFor": true
},
```