llvm-project

Author	SHA1	Message	Date
Dávid Bolvanský	44572be295	[NFCI] Fix set-but-unused warning in X86AsmBackend.cpp	2022-03-24 08:13:28 +01:00
Craig Topper	6cfe41dcc8	[X86] Rename more target feature related things for consistency. NFC -Rename ModeBit to IsBit to match X86Subtarget. -Rename FeatureLAHFSAHF to FeatureLAHFSAHF64 to match X86Subtarget. -Use consistent capitalization Reviewed By: skan Differential Revision: https://reviews.llvm.org/D121975	2022-03-17 22:27:17 -07:00
Amir Ayupov	999fa9f687	[X86][NFC] Move table from getRelaxedOpcodeArith into its own class Move out the table and prepare the code to reuse it for the reverse mapping. Follows the example of memory folding/unfolding tables in X86InstrFoldTables.cpp Preparation step to unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in BOLT X86MCPlusBuilder.cpp. Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html Reviewed By: skan, MaskRay Differential Revision: https://reviews.llvm.org/D121402	2022-03-12 09:06:17 -08:00
Kazu Hirata	fd7d40640d	[llvm] Use range-based for loops (NFC)	2021-11-28 18:14:49 -08:00
Quinn Pham	b11c66accf	[NFC] Inclusive language: rename master flag to main flag [NFC] As part of using inclusive language within the llvm project, this patch renames master flag to main flag in these comments. Reviewed By: ZarkoCA Differential Revision: https://reviews.llvm.org/D114090	2021-11-25 15:16:11 -06:00
Kazu Hirata	d000431fb2	[X86] Remove X86ELFObjectWriter in X86AsmBackend.cpp (NFC) Note that the identically named class is defined in an anonymous namespace in X86ELFObjectWriter.cpp.	2021-11-01 08:31:54 -07:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Peter Smith	e63455d5e0	[MC] Use local MCSubtargetInfo in writeNops On some architectures such as Arm and X86 the encoding for a nop may change depending on the subtarget in operation at the time of encoding. This change replaces the per module MCSubtargetInfo retained by the targets AsmBackend in favour of passing through the local MCSubtargetInfo in operation at the time. On Arm using the architectural NOP instruction can have a performance benefit on some implementations. For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to limit the chances of this causing problems in the future. I've not done this for other targets such as X86 as there is more frequent use of the MCSubtargetInfo and it looks to be for stable properties that we would not expect to vary per function. This change required threading STI through MCNopsFragment and MCBoundaryAlignFragment. I've attempted to take into account the in tree experimental backends. Differential Revision: https://reviews.llvm.org/D45962	2021-09-07 15:46:19 +01:00
Simon Pilgrim	e78bf49a58	[X86] Rename Subtarget Tuning Feature Flag Prefix. NFC. As suggested on D107370, this patch renames the tuning feature flags to start with 'Tuning' instead of 'Feature'. Differential Revision: https://reviews.llvm.org/D107459	2021-08-05 13:09:23 +01:00
Harald van Dijk	75521bd9d8	[X32] Add Triple::isX32(), use it. So far, support for x86_64-linux-gnux32 has been handled by explicit comparisons of Triple.getEnvironment() to GNUX32. This worked as long as x86_64-linux-gnux32 was the only X32 environment to worry about, but we now have x86_64-linux-muslx32 as well. To support this, this change adds an isX32() function and uses it. It replaces all checks for GNUX32 or MuslX32 by isX32(), except for the following: - Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat GNUX32 and MuslX32 differently. - computeTargetTriple() needs to be able to transform triples to add or remove X32 from the environment and needs to map GNU to GNUX32, and Musl to MuslX32. - getMultiarchTriple() completely lacks any Musl support and retains the explicit check for GNUX32 as it can only return x86_64-linux-gnux32. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D103777	2021-06-07 20:48:39 +01:00
Tim Northover	ba1509da7b	Recommit X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specifically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86). Recommitted with a fix for unwind info when i386 pc-rel thunks are adjacent to a prologue.	2021-05-18 15:19:05 +01:00
Mitch Phillips	6791a6b309	Revert "X86: support Swift Async context" This reverts commit 747e5cfb9f5d944b47fe014925b0d5dc2fda74d7. Reason: New frame layout broke the sanitizer unwinder. Not clear why, but seems like some of the changes aren't always guarded by Swift checks. See https://reviews.llvm.org/rG747e5cfb9f5d944b47fe014925b0d5dc2fda74d7 for more information.	2021-05-17 12:44:57 -07:00
Tim Northover	747e5cfb9f	X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specifically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86).	2021-05-17 11:56:16 +01:00
Fangrui Song	4f7562d52f	[MC][X86] Support .reloc , BFD_RELOC_{NONE,8,16,32,64}, The names are unfortunate, but BFD_RELOC_NONE provides a generic way indicating a dependency between two sections, which is useful for ld --gc-sections. See https://sourceware.org/bugzilla/show_bug.cgi?id=27530	2021-03-05 21:31:05 -08:00
Bill Wendling	a9f9ceb35f	[X86] Use correct padding when in 16-bit mode In 16-bit mode, some of the nop patterns used in 32-bit mode can end up mangling other instructions. For instance, an aligned "movz" instruction may have the 0x66 and 0x67 prefixes omitted, because the nop that's used messes things up. xorl %ebx, %ebx .p2align 4, 0x90 movzbl (%esi,%ebx), %ecx Use instead nop patterns we know 16-bit mode can handle. Differential Revision: https://reviews.llvm.org/D97268	2021-02-25 20:05:45 -08:00
Fangrui Song	a048ce13e3	[X86] Default to -x86-pad-for-align=false to drop assembler difference with or w/o -g Fix PR48742: the D75203 assembler optimization locates MCRelaxableFragment's within two MCSymbol's and relaxes some MCRelaxableFragment's to reduce the size of a MCAlignFragment. A -g build has more MCSymbol's and therefore may have different assembler output (e.g. a MCRelaxableFragment (jmp) may have 5 bytes with -O1 while 2 bytes with -O1 -g). `.p2align 4, 0x90` is common due to loops. For a larger program, with a lot of temporary labels, the assembly output difference is somewhat destined. The cost seems to outweigh the benefits so we default to -x86-pad-for-align=false until the heuristic is improved. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D94542	2021-01-16 16:39:54 -08:00
Simon Pilgrim	af71298648	[X86] Cleanup/add namespace closure comments. NFCI. Fixes some clang-tidy llvm-namespace-comment warnings.	2020-09-22 15:06:58 +01:00
Jian Cai	c6334db577	[X86] support .nops directive Add support of .nops on X86. This addresses llvm.org/PR45788. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D82826	2020-08-03 11:50:56 -07:00
Craig Topper	0aad82943a	[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment. The default CPU used by llvm-mc doesn't have the NOPL feature, but if we know we're compiling in 64-bit mode we should be able to use nopl.	2020-07-01 23:59:01 -07:00
Craig Topper	c420762172	Revert "[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment." Looks like lld tests need updates too This reverts commit 3367e9dac56024147bbd916c40bfe6a4ee61079b.	2020-07-01 15:20:53 -07:00
Craig Topper	3367e9dac5	[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment. The default CPU used by llvm-mc doesn't have the NOPL feature, but if we know we're compiling in 64-bit mode we should be able to use nopl.	2020-07-01 10:57:24 -07:00
Fangrui Song	0f6bd9cda6	[MC] Drop unneeded std::abs for DW_def_cfa_offset in DarwinX86AsmBackend::generateCompactUnwindEncoding This clean-up is available after double negation bugs are fixed.	2020-05-22 21:12:47 -07:00
Shengchen Kan	99ac9ce701	[NFC] Clean up in MCObjectStreamer and X86AsmBackend	2020-05-09 12:50:44 +08:00
Shengchen Kan	8bb059ab63	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the third argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Shengchen Kan	0d3149f431	[MC][X86] Disable branch align in non-text section Summary: Instructions in a non-text section cannot be executed, so they do not affect performance. In addition, their encoding values are treated as data, so we should not touch them. Reviewers: MaskRay, reames, LuoYuanke, jyknight Reviewed By: MaskRay Subscribers: annita.zhang, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77971	2020-04-18 14:41:25 +08:00
Shengchen Kan	2477cec2ac	[NFC][X86] Refine code in X86AsmBackend Summary: Move code to a better place, rename function, etc Tags: #llvm Differential Revision: https://reviews.llvm.org/D77778	2020-04-09 21:31:52 +08:00
Shengchen Kan	916044d819	[X86][MC] Support enhanced relaxation for branch align Summary: Since D75300 has landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxation" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization. The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable". Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76286	2020-04-08 19:08:19 +08:00
Shengchen Kan	9f92d4612f	Revert "[NFC][X86] Refine code in X86AsmBackend" This reverts commit a157cde0ac0a804b49f50df0a6faae7416ac3fb4.	2020-04-02 15:57:06 +08:00
Shengchen Kan	a157cde0ac	[NFC][X86] Refine code in X86AsmBackend Replace pattern getContents().size with universe function call	2020-04-02 15:41:10 +08:00
Shengchen Kan	d0efd7bfcf	[X86][MC] Disable Prefix padding after hardcode/prefix Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight, eli.friedman Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76475	2020-04-01 09:49:52 +08:00
Fangrui Song	152d14da64	[MC][X86] Make .reloc support arbitrary relocation types Generalizes D62014 (R_386_NONE/R_X86_64_NONE). Unlike ARM (D76746) and AArch64 (D76754), we cannot delete FK_NONE from getFixupKindSize because FK_NONE is still used by R_386_TLS_DESC_CALL/R_X86_64_TLSDESC_CALL.	2020-03-27 13:33:15 -07:00
Shengchen Kan	1fb4f99a21	[X86][MC] Fix the bug for prefix padding support Summary: There is a tiny logic error in D75300 that causes branches to be incorrectly aligned with option -x86-pad-max-prefix-size Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76285	2020-03-27 14:16:09 +08:00
Shengchen Kan	39bcc76a92	[X86] Disable nop padding before instruction following hardcode Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: LuoYuanke Subscribers: annita.zhang, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D76176	2020-03-17 09:45:12 +08:00
Shengchen Kan	b1a7a245ec	[NFC][MC] Rename alignBranches* to emitInstruction* alignBranches is X86 specific, change the name to a more general one since other targets can do some state changes before and after emitting the instruction.	2020-03-16 17:13:14 +08:00
Philip Reames	a79863f2f7	Support prefix padding for alignment purposes (Relaxable instructions only) Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straight forward. For the moment, this patch adds the functional support and some basic testing there of, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not. Differential Revision: https://reviews.llvm.org/D75300	2020-03-15 19:53:41 -07:00
Shengchen Kan	e6f1dd40bd	[X86] Disable nop padding before instruction following a prefix Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: LuoYuanke Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76052	2020-03-14 13:15:30 +08:00
Simon Pilgrim	1e686d2689	[X86] Add FeatureFast7ByteNOP flag Lets us remove another SLM proc family flag usage. This is NFC, but we should probably check whether atom/glm/knl? should be using this flag as well...	2020-03-12 13:06:43 +00:00
Shengchen Kan	3a503ce663	[X86] Reduce the number of emitted fragments due to branch align Summary: Currently, a BoundaryAlign fragment may be inserted after the branch that needs to be aligned to truncate the current fragment; this fragment is unused most of the time. To avoid that, we can insert a new empty Data fragment instead. Non-relaxable instructions are usually emitted into a Data fragment, so the inserted empty Data fragment is very likely to be reused. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames, LuoYuanke Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75438	2020-03-12 15:37:35 +08:00
Philip Reames	c93f1046fc	[X86/MC] Factor out common code [NFC]	2020-03-05 09:43:41 -08:00
David Blaikie	7a6878a72e	X86AsmBackend.cpp: #ifndef NDEBUG some only-used-in-asserts variables to fix the -Werror non-asserts build	2020-03-04 22:36:24 -08:00
Philip Reames	c94a4133bb	Consistently capitalize a variable [NFC] One instance in a copy paste was pointed out in a review, fix all instances at once.	2020-03-04 20:00:08 -08:00
Shengchen Kan	b3722dea3b	[X86] Add a private member function determinePaddingPrefix for X86AsmBackend Summary: X86 can reduce the bytes of NOP by padding instructions with prefixes to get a better performance in some cases. So a private member function `determinePaddingPrefix` is added to determine which prefix is the most suitable. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75357	2020-03-05 09:26:33 +08:00
Philip Reames	f708c823f0	[X86] Relax existing instructions to reduce the number of nops needed for alignment purposes If we have an explicit align directive, we currently default to emitting nops to fill the space. As discussed in the context of the prefix padding work for branch alignment (D72225), we're allowed to play other tricks such as extending the size of previous instructions instead. This patch will convert near jumps to far jumps if doing so decreases the number of bytes of nops needed for a following align. It does so as a post-pass after relaxation is complete. It intentionally works without moving any labels or doing anything which might require another round of relaxation. The point of this patch is mainly to mock out the approach. The optimization implemented is real, and possibly useful, but the main point is to demonstrate an approach for implementing such "pad previous instruction" approaches. The key notion in this patch is to treat padding previous instructions as an optional optimization, not as a core part of relaxation. The benefit to this is that we avoid the potential concern about increasing the distance between two labels and thus causing further potentially non-local code growth due to relaxation. The downside is that we may miss some opportunities to avoid nops. For the moment, this patch only implements a small set of existing relaxations. Assuming the approach is satisfactory, I plan to extend this to a broader set of instructions where there are obvious "relaxations" which are roughly performance equivalent. Note that this patch doesn't change which instructions are relaxable. We may wish to explore that separately to increase optimization opportunity, but I figured that deserved its own separate discussion. There are possible downsides to this optimization (and all "pad previous instruction" variants). The major two are potentially increasing instruction fetch and perturbing uop caching. (i.e. the usual alignment risks) Specifically: * If we pad an instruction such that it crosses a fetch window (16 bytes on modern X86-64), we may cause the decoder to have to trigger a fetch it wouldn't have otherwise. This can affect both decode speed and icache pressure. * Intel's uop caching has particular restrictions on instruction combinations which can fit in a particular way. By moving around instructions, we can both cause misses and change misses into hits. Many of the most painful cases are around branch density, so I don't expect this to be too bad on the whole. On the whole, I expect to see small swings (i.e. the typical alignment change problem), but nothing major or systematic in either direction. Differential Revision: https://reviews.llvm.org/D75203	2020-03-04 16:52:35 -08:00
Shengchen Kan	af57b139a0	Temporarily Revert [X86] Not track size of the boundaryalign fragment during the layout Summary: This reverts commit 2ac19feb1571960b8e1479a451b45ab56da7034e. This commit causes some test cases to fail when branches are aligned.	2020-03-03 11:15:56 +08:00
Philip Reames	7049cf6496	[BranchAlign] Fix bug w/nop padding for SS manipulation X86 has several instructions which are documented as enabling interrupts exactly one instruction after the one which changes the SS segment register. Inserting a nop between these two instructions allows an interrupt to arrive before the execution of the following instruction which changes semantic behaviour. The list of instructions is documented in "Table 24-3. Format of Interruptibility State" in Volume 3c of the Intel manual. They basically all come down to different ways to write to the SS register. Differential Revision: https://reviews.llvm.org/D75359	2020-03-02 14:40:25 -08:00
Shengchen Kan	2ac19feb15	[X86] Not track size of the boundaryalign fragment during the layout Summary: Currently the boundaryalign fragment caches its size during the process of layout, and then it is relaxed and updates the size in each iteration. This behaviour is unnecessary and ugly. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: MaskRay Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75404	2020-03-02 09:32:30 +08:00
Shengchen Kan	95fa5c4f24	[X86] Move the function getOrCreateBoundaryAlignFragment MCObjectStreamer is more suitable to create fragments than X86AsmBackend, for example, the function getOrCreateDataFragment is defined in MCObjectStreamer. Differential Revision: https://reviews.llvm.org/D75351	2020-02-29 15:11:16 +08:00
Shengchen Kan	129a762555	[X86] Disable the NOP padding for branches when bundle is enabled When bundle is enabled, the data fragment itself has space to emit NOPs to bundle-align instructions. This behaviour makes it impossible for us to determine whether macro fusion really happens when emitting instructions. In addition, the boundary-align fragment is also used to emit NOPs to align instructions, and currently using them together sometimes produces unpredictable code. Differential Revision: https://reviews.llvm.org/D75346	2020-02-29 15:07:06 +08:00
Francis Visoiu Mistrih	1874dee566	[macho][NFC] Extract all CPU_(SUB_)TYPE logic to BinaryFormat This moves all the logic of converting LLVM Triples to MachO::CPU_(SUB_)TYPE from the specific target (Target)AsmBackend to more convenient functions in lib/BinaryFormat. This also gets rid of the separate two X86AsmBackend classes. The previous attempt was to add it to libObject, but that adds an unnecessary dependency to libObject from all the targets. Differential Revision: https://reviews.llvm.org/D74808	2020-02-21 12:43:29 -08:00
Francis Visoiu Mistrih	3f785212e9	Revert "[macho][NFC] Extract all CPU_(SUB_)TYPE logic to libObject" This reverts commit 726c342ce27ada28efe90cb04ffb69c75065710a. This breaks the windows bots with linker errors.	2020-02-20 10:51:25 -08:00


195 Commits