llvm-project

Author	SHA1	Message	Date
Nick Fitzgerald	6018930ef1	[lld][WebAssembly] Support for the custom-page-sizes WebAssembly proposal (#128942 ) This commit adds support for WebAssembly's custom-page-sizes proposal to `wasm-ld`. An overview of the proposal can be found [here](https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md). In a sentence, it allows customizing a Wasm memory's page size, enabling Wasm to target environments with less than 64KiB of memory (the default Wasm page size) available for Wasm memories. This commit contains the following: * Adds a `--page-size=N` CLI flag to `wasm-ld` for configuring the linked Wasm binary's linear memory's page size. * When the page size is configured to a non-default value, the final Wasm binary will use the encodings defined in the custom-page-sizes proposal to declare the linear memory's page size. * Defines a `__wasm_first_page_end` symbol, whose address points to the end of the first page in the Wasm linear memory, i.e. its value is the Wasm memory's page size. This allows writing code that is compatible with any page size, and doesn't require re-compiling its object code. At the same time, because it just lowers to a constant rather than a memory access or something, it enables link-time optimization. * Adds tests for these new features. r? @sbc100 cc @sunfishcode	2025-03-04 09:39:30 -08:00
Hood Chatham	cc7f22ee6c	[object][WebAssembly] Add support for RUNTIME_PATH to yaml2obj and obj2yaml (#126080 ) This is the first step of adding RPATH support for wasm. See corresponding update to the WebAssembly/tool-conventions repo on dynamic linking: https://github.com/WebAssembly/tool-conventions/pull/246	2025-02-24 09:15:41 -08:00
Sam Clegg	48415777ea	Revert "[Object][WebAssembly] Fix data segment offsets higher than 2^31 (#125739 )" (#125786 ) This reverts commit c798a5c4d5c3c8cb21e6001f505d8f44217c2244. This broke a bunch of tests on the emscripten side. Reverting while we investigate.	2025-02-04 16:16:17 -08:00
Sam Clegg	c798a5c4d5	[Object][WebAssembly] Fix data segment offsets higher than 2^31 (#125739 ) Fixes: #58555	2025-02-04 14:06:07 -08:00
Derek Schuff	9fdc38c81c	[WebAssembly][Object] Support more elem segment flags (#123427 ) Some tools (e.g. Rust tooling) produce element segment descriptors with neither elemkind nor element type descriptors, but with init exprs instead of func indices (this is with the flags value of 4 in https://webassembly.github.io/spec/core/binary/modules.html#element-section). LLVM doesn't fully model reference types or the various ways to initialize element segments, but we do want to correctly parse and skip over all type sections, so this change updates the object parser to handle that case, and refactors for more clarity. The test file is updated to include one additional elem segment with a flags value of 4, an initializer value of (i32.const 0) and an empty vector. Also support parsing files that export imported (undefined) functions.	2025-01-17 17:26:44 -08:00
Lang Hames	d02c1676d7	[Support][Error] Add ErrorAsOutParameter constructor that takes an Error by ref. ErrorAsOutParameter's Error* constructor supports cases where an Error might not be passed in (because in the calling context it's known that this call won't fail). Most clients always have an Error present however, and for them an Error& overload is more convenient.	2024-11-29 15:57:53 +11:00
Kazu Hirata	e9c8106a90	[Object] Remove unused includes (NFC) (#116750 ) Identified with misc-include-cleaner.	2024-11-19 19:42:09 -08:00
Kazu Hirata	4048c64306	[llvm] Remove redundant control flow statements (NFC) (#115831 ) Identified with readability-redundant-control-flow.	2024-11-12 10:09:42 -08:00
Heejin Ahn	be64ca9123	[WebAssembly] Remove WASM_FEATURE_PREFIX_REQUIRED (NFC) (#113729 ) This has not been emitted since `3f34e1b883`. The corresponding proposed tool-conventions change: https://github.com/WebAssembly/tool-conventions/pull/236	2024-11-04 16:12:57 -08:00
Sam Clegg	22b7b84860	[lld][WebAssembly] Report undefined symbols in -shared/-pie builds (#75242 ) Previously we would ignore all undefined symbols when using `-shared` or `-pie`. All undefined symbols would be treated as imports regardless of whether those symbols were defined in any shared library. With this change we now track symbols in shared libraries and report undefined symbols in the main program by default. The old behavior is still available via the `--unresolved-symbols=import-dynamic` command line flag. The rationale for allowing this type of breaking change is that `-pie` and `-shared` are both still experimental and will warn as such, unless `--experimental-pic` is passed. As part of this change the linker now models shared library symbols via new SharedFunctionSymbol and SharedDataSymbol types. I've also added a new `--no-shlib-sigcheck` option that bypasses the checking of function signatures in shared libraries. This is specifically required by emscripten in the case where the imports/exports of shared libraries have been modified via JS type legalization (this is only needed when targeting old JS engines where bigint is not yet available). See https://github.com/emscripten-core/emscripten/issues/18198	2024-07-12 13:26:52 -07:00
Heejin Ahn	c179d50fd3	[WebAssembly] Add exnref type (#93586 ) This adds (back) the exnref type restored in the new EH proposal adopted in Oct 2023 CG meeting: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md	2024-05-28 16:10:11 -07:00
Derek Schuff	2eaeae7e9a	[Object][Wasm] Use offset instead of index for Global address and store size (#81781 ) Currently the address reported by binutils for a global is its index; but its offset (in the file or section) is more useful for binary size attribution. This PR treats globals similarly to functions, and tracks their offset and size. It also centralizes the logic differentiating linked from object and dylib files (where section addresses are 0).	2024-02-15 09:36:44 -08:00
Derek Schuff	01706e7677	[llvm-nm][WebAssembly] Print function symbol sizes (#81315 ) nm already prints sizes for data symbols. Do that for function symbols too, and update objdump to also print size information. Implements item 3 from https://github.com/llvm/llvm-project/issues/76107	2024-02-09 14:22:47 -08:00
Sam Clegg	c429f48b56	[Object][WebAssembly] Improve error on invalid relocation (#81203 ) See https://github.com/emscripten-core/emscripten/issues/21140	2024-02-08 15:20:37 -08:00
Derek Schuff	5818572789	[Object][Wasm] Generate symbol info from name section names (#81063 ) Currently symbol info is generated from a linking section or from export names. This PR generates symbols in a WasmObjectFile from the name section as well, which allows tools like objdump and nm to show useful information for more linked binaries. There are some limitations: most notably that we don't assume any particular ABI, so we don't get detailed information about data symbols if the segments are merged (which is the default). Covers most of the desired functionality from #76107	2024-02-08 13:20:47 -08:00
Derek Schuff	8b0f47bfa4	[Object][Wasm] Use file offset for section addresses in linked wasm files (#80529 ) Wasm has no unified virtual memory space as other object formats and architectures do, so previously WasmObjectFile reported 0 for all section addresses, and until 428cf71ff used section offsets for function symbols. Now we use file offsets for function symbols, and this change switches section addresses to do the same (in linked files). The main result of this is that objdump now reports VMAs in section listings, and also uses file offsets rather than section offsets when disassembling linked binaries (matching the behavior of other disassemblers and stack traces produced by browsers). To make this work, this PR also updates objdump's generation of synthetic fallback symbols to match lib/Object and also correctly plumbs symbol types for regular and dummy symbols through to the backend to avoid needing special knowledge of address 0. This also paves the way for generating symbols from name sections rather than symbol tables or imports (see #76107) by allowing the disassembler's synthetic fallback symbols to match the name-section generated symbols (in a followup PR).	2024-02-07 11:51:19 -08:00
Derek Schuff	ef1f999e13	[Object][Wasm] Move WasmSymbolInfo directly into WasmSymbol (NFC) (#80219 ) Move the WasmSymbolInfos from their own vector on the WasmLinkingData directly into the WasmSymbol object. Removing the const-ref to an external object allows the vector of WasmSymbols to be safely expanded/reallocated; generating symbol info from the name section will require this, as the numbers of function and data segment names are stored separately. This is a step toward generating symbol information from name sections for #76107	2024-02-02 10:44:52 -08:00
Derek Schuff	7f409cd82b	[Object][Wasm] Allow parsing of GC types in type and table sections (#79235 ) This change allows a WasmObjectFile to be created from a wasm file even if it uses typed funcrefs and GC types. It does not significantly change how lib/Object models its various internal types (e.g. WasmSignature, WasmElemSegment), so LLVM does not really "support" or understand such files, but it is sufficient to parse the type, global and element sections, discarding types that are not understood. This is useful for low-level binary tools such as nm and objcopy, which use only limited aspects of the binary (such as function definitions) or deal with sections as opaque blobs. This is done by allowing `WasmValType` to have a value of `OTHERREF` (representing any unmodeled reference type), and adding a field to `WasmSignature` indicating it's a placeholder for an unmodeled reference type (since there is a 1:1 correspondence between WasmSignature objects and types in the type section). Then the object file parsers for the type and element sections are expanded to parse encoded reference types and discard any unmodeled fields.	2024-01-25 09:48:38 -08:00
Derek Schuff	103fa3250c	[WebAssembly] Use ValType instead of integer types to model wasm tables (#78012 ) LLVM models some features found in the binary format with raw integers and others with nested or enumerated types. This PR switches modeling of tables and segments to use wasm::ValType rather than uint32_t. This NFC change is in preparation for modeling more reference types, but IMO is also cleaner and closer to the spec.	2024-01-17 11:29:19 -08:00
Derek Schuff	428cf71ffa	Reland "[WebAssembly][Object]Use file offset as function symbol address for linked files (#76198 )" WebAssembly doesn't have a single virtual memory space the way other object formats or architectures do, so "addresses" mean different things depending on the context. Function symbol addresses in object files are offsets from the start of the code section. This is good for linking and relocation. However when dealing with linked binaries, offsets from the start of the file/module are more often used (e.g. for stack traces in browsers), and are more useful for use cases like binary size attribution. This PR changes Object to use the file offset instead of the section offset for function symbols, but only for linked (non-DSO) files. This is a reland of fc5f51cf with a fix for the MSan failure (it was not caused by this change, but it was revealed by the new tests).	2024-01-03 15:39:48 -08:00
Mitch Phillips	665d1a0eb4	Revert "[WebAssembly][Object]Use file offset as function symbol address for linked files (#76198 )" This reverts commit fc5f51cf5af4364b38bf22e491d46e1e892ade0c. Reason: Broke the sanitizer buildbot - https://lab.llvm.org/buildbot/#/builders/5/builds/39751/steps/12/logs/stdio	2024-01-03 11:23:10 +01:00
Derek Schuff	fc5f51cf5a	[WebAssembly][Object]Use file offset as function symbol address for linked files (#76198 ) WebAssembly doesn't have a single virtual memory space the way other object formats or architectures do, so "addresses" mean different things depending on the context. Function symbol addresses in object files are offsets from the start of the code section. This is good for linking and relocation. However when dealing with linked binaries, offsets from the start of the file/module are more often used (e.g. for stack traces in browsers), and are more useful for use cases like binary size attribution. This PR changes Object to use the file offset instead of the section offset for function symbols, but only for linked (non-DSO) files. This implements item number 4 from #76107	2024-01-02 14:54:54 -08:00
DavidKorczynski	e8b6fa5f30	[WebAssembly] Add bounds check in parseCodeSection (#76407 ) This is needed as otherwise `Ctx.Ptr` will be incremented to a position outside its available buffer, which is being used to read values e.g. `966d564e43/llvm/lib/Object/WasmObjectFile.cpp (L1469)` Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=28856 Signed-off-by: David Korczynski <david@adalogics.com>	2023-12-26 13:32:13 -08:00
Derek Schuff	35a5df2de6	[WebAssembly][Object] Record section start offsets at start of payload (#76188 ) LLVM ObjectFile currently records the start offsets of sections as the start of the section header, whereas most other tools (WABT, emscripten, wasm-tools) record it as the start of the section content, after the header. This affects binutils tools such as objdump and nm, but not compilation/assembly (since that is driven by symbols and assembler labels which already have their values inside the section payload rather than in the header). This patch updates LLVM to match the other tools.	2023-12-21 14:16:37 -08:00
Sam Clegg	4e8cb01b01	[WebAssembly] Add symbol information for shared libraries (#75238 ) The current (experimental) spec for WebAssembly shared libraries does not include a full symbol table like the object format. This change extracts symbol information from the normal wasm exports. This is the first step in having the linker report undefined symbols when linking with shared libraries. The current behaviour is to ignore all undefined symbols when linking with `-pie` or `-shared`. See https://github.com/emscripten-core/emscripten/issues/18198	2023-12-20 11:13:09 -08:00
Kazu Hirata	586ecdf205	[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-11 21:01:36 -08:00
Sam Clegg	afe957ea95	[WebAssembly] Allow absolute symbols in the linking section (symbol table) (#67493 ) Fixes a crash in `-Wl,-emit-relocs` where the linker was not able to write linker-synthetic absolute symbols to the symbol table. This change adds a new symbol flag (`WASM_SYMBOL_ABS`), which means that the symbol's offset is absolute and not relative to a given segment. Such symbols include `__stack_low` and `__stack_high`. Note that wasm object files never contain such symbols, only binaries linked with `-Wl,-emit-relocs`. Fixes: #67111	2023-10-03 13:16:16 -07:00
Sam Clegg	79cf24e211	[llvm-nm][WebAssembly] Report the size of data symbols Fixes: https://github.com/llvm/llvm-project/issues/58839 Differential Revision: https://reviews.llvm.org/D158799	2023-08-25 16:45:50 -07:00
Derek Schuff	1b21067cf2	[WebAssembly][Objcopy] Write output section headers identically to inputs Previously when objcopy generated section headers, it padded the LEB that encodes the section size out to 5 bytes, matching the behavior of clang. This is correct, but results in a binary that differs from the input. This can sometimes have undesirable consequences (e.g. breaking source maps). This change makes the object reader remember the size of the LEB encoding in the section header, so that llvm-objcopy can reproduce it exactly. For sections not read from an object file (e.g. that llvm-objcopy is adding itself), pad to 5 bytes. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D155535	2023-07-27 15:43:51 -07:00
Brendan Dahl	220fe00a7c	[WebAssembly] Support `annotate` clang attributes for marking functions. Annotation attributes may be attached to a function to mark it with custom data that will be contained in the final Wasm file. The annotation causes a custom section named "func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will contain each function's index value that was marked with the annotation. A new patchable relocation type for function indexes had to be created so the custom section could be updated during linking. Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D150803	2023-07-11 15:17:26 -07:00
Job Noorman	8de9f2b558	Move SubtargetFeature.h from MC to TargetParser SubtargetFeature.h is currently part of MC while it doesn't depend on anything in MC. Since some LLVM components might have the need to work with target features without necessarily needing MC, it might be worthwhile to move SubtargetFeature.h to a different location. This will reduce the dependencies of said components. Note that I chose TargetParser as the destination because that's where Triple lives and SubtargetFeatures feels related to that. This issue came up during a JITLink review (D149522). JITLink would like to avoid a dependency on MC while still needing to store target features. Reviewed By: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D150549	2023-06-26 11:20:08 +02:00
Sam Clegg	d65ed8cde0	[lld][WebAssembly] Fix handling of mixed strong and weak references When adding an undefined symbol to the symbol table, if the existing reference is weak, replace the symbol flags with (potentially) non-weak binding. Fixes: https://github.com/llvm/llvm-project/issues/60829 Differential Revision: https://reviews.llvm.org/D144747	2023-02-27 14:20:01 -08:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Elena Lepilkina	537cdf92c4	[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute Differential Revision: https://reviews.llvm.org/D139553	2023-01-16 16:57:55 +03:00
Guillaume Chatelet	3e5f54d6d7	Revert D139098 "[Alignment] Use Align for ObjectFile::getSectionAlignment" This breaks lld. This reverts commit 10c47465e2505ddfee4e62a2ab2e535abea3ec56.	2022-12-09 09:45:04 +00:00
Guillaume Chatelet	10c47465e2	[Alignment] Use Align for ObjectFile::getSectionAlignment Differential Revision: https://reviews.llvm.org/D139098	2022-12-09 09:34:43 +00:00
Dan Gohman	9f049e9993	[lld][WebAssembly] Allow import module names to be empty strings. The component-model [canonical ABI] is currently using import names with empty strings. Remove the special cases for empty strings from WasmObjectFile.cpp so that they can pass through as-is. [canonical ABI]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md Differential Revision: https://reviews.llvm.org/D133037	2022-08-31 15:30:15 -07:00
Kazu Hirata	7094ab4ee7	[llvm] Modernize bool literals (NFC) Identified with modernize-use-bool-literals.	2022-07-17 18:08:51 -07:00
Derek Schuff	5a082d9c1c	[WebAssembly][Object] Remove requirement that objects must have code sections When parsing name and linking sections, we currently require that the object must have a code section (it seems that this was intended to verify section ordering). However it can be useful for binaries to have their code sections stripped out (e.g. if we just want the debug info). In that case we need the rest of the known sections (so e.g. we know how many functions there are, to verify the name section) but not the actual code. I've removed the restriction completely. I think this is OK because the section-parsing code already checks function and global indices in many places for validity and will return appropriate errors if the relevant sections are missing. Also we can't just replace the requirement of seeing a code section with a requirement that we see a function or global section, because a binary may just not have any functions or globals. But there's only a problem if the name or linking section tries to name a nonexistent function. Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084 Differential Revision: https://reviews.llvm.org/D128094	2022-06-23 13:56:17 -07:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Derek Schuff	2ae385e560	[WebAssembly] Add WASM_SEC_LAST_KNOWN to BinaryFormat section types list [NFC] There are 3 places where we were using WASM_SEC_TAG as the "last" known section type, which requires updating (or leaves a bug) when a new known section type is added. Instead add a "last type" to the enum for this purpose. Differential Revision: https://reviews.llvm.org/D127164	2022-06-07 12:05:23 -07:00
Derek Schuff	a205f2904d	[WebAssembly] Consolidate sectionTypeToString in BinaryFormat [NFC] Currently there are 2 duplicate implementations, and I want to add a use in a 3rd place. Combine them in lib/BinaryFormat so they can be shared. Also update toString for symbol and reloc types to use StringRef. Differential Revision: https://reviews.llvm.org/D126553	2022-05-27 09:26:36 -07:00
Sam Clegg	9b27fbd19c	[WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349	2022-03-15 11:36:56 -07:00
Sam Clegg	2481adb59c	[WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349	2022-03-14 19:57:50 -07:00
Sam Clegg	9504ab32b7	[WebAssembly] Second phase of implemented extended const proposal This change continues to lay the ground work for supporting extended const expressions in the linker. The included test covers object file reading and writing and the YAML representation. Differential Revision: https://reviews.llvm.org/D121349	2022-03-14 08:55:47 -07:00
serge-sans-paille	e72c195fdc	Cleanup LLVMObject headers Most notably, llvm/Object/Binary.h no longer includes llvm/Support/MemoryBuffer.h llvm/Object/MachOUniversal*.h no longer include llvm/Object/Archive.h llvm/Object/TapiUniversal.h no longer includes llvm/Object/TapiFile.h llvm-project preprocessed size: before: 1068185081 after: 1068324320 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119457	2022-02-10 21:13:44 +01:00
Quinn Pham	c71fbdd87b	[NFC] Inclusive language: Remove instances of master in URLs [NFC] This patch fixes URLs containing "master". Old URLs were either broken or redirecting to the new URL. Reviewed By: #libc, ldionne, mehdi_amini Differential Revision: https://reviews.llvm.org/D113186	2021-11-05 08:48:41 -05:00
Sam Clegg	659a08399a	[WebAssembly] Add import info to `dylink` section of shared libraries See https://github.com/WebAssembly/tool-conventions/pull/175 Differential Revision: https://reviews.llvm.org/D111345	2021-10-15 11:49:16 -07:00
Heejin Ahn	9261ee32dc	[WebAssembly] Make EH work with dynamic linking This makes Wasm EH work with dynamic linking. So far we were only able to handle destructors, which do not use any tags or LSDA info. 1. This uses `TargetExternalSymbol` for `GCC_except_tableN` symbols, which points to the address of per-function LSDA info. It is more convenient to use than `MCSymbol` because it can take additional target flags. 2. When lowering `wasm_lsda` intrinsic, if PIC is enabled, make the symbol relative to `__memory_base` and generate the `add` node. If PIC is disabled, continue to use the absolute address. 3. Make tag symbols (`__cpp_exception` and `__c_longjmp`) undefined in the backend, because it is hard to make it work with dynamic linking's loading order. Instead, we make all tag symbols undefined in the LLVM backend and import it from JS. 4. Add support for undefined tags to the linker. Companion patches: - https://github.com/WebAssembly/binaryen/pull/4223 - https://github.com/emscripten-core/emscripten/pull/15266 Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D111388	2021-10-12 23:28:27 -07:00
Heejin Ahn	3ec1760d91	[WebAssembly] Remove WasmTagType This removes `WasmTagType`. `WasmTagType` contained an attribute and a signature index: ``` struct WasmTagType { uint8_t Attribute; uint32_t SigIndex; }; ``` Currently the attribute field is not used and reserved for future use, and always 0. And that this class contains `SigIndex` as its property is a little weird, because the tag type's signature index is not an inherent property of a tag but rather a reference to another section that changes after linking. This makes tag handling in the linker also weird in that tag-related methods are taking both `WasmTagType` and `WasmSignature` even though `WasmTagType` contains a signature index. This is because the signature index changes in linking so it doesn't have any info at this point. This instead moves `SigIndex` to `struct WasmTag` itself, as we did for `struct WasmFunction` in D111104. In this CL, in lib/MC and lib/Object, this now treats tag types in the same way as function types. Also in YAML, this removes `struct Tag`, because now it only contains the tag index. Also tags set `SigIndex` in `WasmImport` union, as functions do. I think this makes things simpler and makes tag handling more in line with function handling. These two share similar properties in that both of them have signatures, but they are kind of nominal so having the same signature doesn't mean they are the same element. Also a drive-by fix: the reserved 'attribute' part's encoding changed from uleb32 to uint8 a while ago. This was fixed in lib/MC and lib/Object but not in YAML. This doesn't change object files because the field's value is always 0 and its encoding is the same for both encodings. This is effectively NFC; I didn't mark it as such just because it changed YAML test results. Reviewed By: sbc100, tlively Differential Revision: https://reviews.llvm.org/D111086	2021-10-05 17:11:22 -07:00

226 Commits