llvm-project

Author	SHA1	Message	Date
yonghong-song	0e0bfacff7	[BPF] Add support for may_goto insn (#85358) Alexei added may_goto insn in [1]. The asm syntax for may_goto looks like may_goto <label> The instruction represents a conditional branch but the condition is implicit. Later in bpf kernel verifier, the 'may_goto <label>' insn will be rewritten with an explicit condition. The encoding of 'may_goto' insn is enforced in [2] and is also implemented in this patch. In [3], 'may_goto' insn is encoded with raw bytes. I made the following change ``` --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -328,10 +328,7 @@ l_true: \ #define cond_break \ ({ __label__ l_break, l_continue; \ - asm volatile goto("1:.byte 0xe5; \ - .byte 0; \ - .long ((%l[l_break] - 1b - 8) / 8) & 0xffff; \ - .short 0" \ + asm volatile goto("may_goto %l[l_break]" \ :::: l_break); \ goto l_continue; \ l_break: break; ``` and ran the selftests with the latest llvm with this patch. All tests passed. [1] https://lore.kernel.org/bpf/20240306031929.42666-1-alexei.starovoitov@gmail.com/ [2] https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com/ [3] https://lore.kernel.org/bpf/20240306031929.42666-4-alexei.starovoitov@gmail.com/	2024-03-15 07:24:28 -07:00
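
The raw-byte form removed by the diff above can be read as a recipe for the 8-byte instruction: an opcode byte `0xe5`, a zero register byte, a 16-bit offset counted in 8-byte instruction slots relative to the following instruction, and zero padding. The sketch below mirrors that arithmetic on the host. The helper names `may_goto_offset` and `encode_may_goto` are hypothetical, and the byte order of the offset field assumes a little-endian BPF target; this is an illustration of the quoted expression, not the LLVM encoder.

```c
#include <stdint.h>
#include <string.h>

/* Opcode byte taken from the ".byte 0xe5" in the quoted diff. */
#define MAY_GOTO_OPCODE 0xe5

/* Hypothetical helper: mirrors "((%l[l_break] - 1b - 8) / 8) & 0xffff".
 * Both addresses are byte addresses and are assumed to be multiples of 8. */
static uint16_t may_goto_offset(int64_t insn_addr, int64_t target_addr)
{
    int64_t slots = (target_addr - insn_addr - 8) / 8;
    return (uint16_t)(slots & 0xffff);
}

/* Hypothetical helper: lays out the 8 bytes in the same order as the
 * .byte/.byte/.long/.short directives, assuming little-endian output. */
static void encode_may_goto(uint8_t out[8], int64_t insn_addr,
                            int64_t target_addr)
{
    uint16_t off = may_goto_offset(insn_addr, target_addr);

    memset(out, 0, 8);
    out[0] = MAY_GOTO_OPCODE;        /* opcode */
    out[1] = 0;                      /* register byte, unused */
    out[2] = (uint8_t)(off & 0xff);  /* offset, low byte */
    out[3] = (uint8_t)(off >> 8);    /* offset, high byte */
    /* out[4..7] stay zero */
}
```

With this layout, a `may_goto` at byte address 0 that targets byte address 24 gets an offset of 2, and a backward branch wraps into the 16-bit field as a two's-complement value.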
eddyz87	65b123e287	[BPF] rename 'arena' to 'address_space' (#85161) There are a few places where `arena` name is used for pointers in non-zero address space in BPF backend, rename these to use a more generic `address_space`: - macro `__BPF_FEATURE_ARENA_CAST` -> `__BPF_FEATURE_ADDR_SPACE_CAST` - name for arena global variables section `.arena.N` -> `.addr_space.N`	2024-03-14 19:20:06 -07:00
4ast	2aacb56e83	BPF address space insn (#84410) This commit aims to support BPF arena kernel side [feature](https://lore.kernel.org/bpf/20240209040608.98927-1-alexei.starovoitov@gmail.com/): - arena is a memory region accessible from both BPF program and userspace; - base pointers for this memory region differ between kernel and user spaces; - `dst_reg = addr_space_cast(src_reg, dst_addr_space, src_addr_space)` translates src_reg, a pointer in src_addr_space to dst_reg, equivalent pointer in dst_addr_space, {src,dst}_addr_space are immediate constants; - number 0 is assigned to kernel address space; - number 1 is assigned to user address space. On the LLVM side, the goal is to make load and store operations on arena pointers "transparent" for BPF programs: - assume that pointers with non-zero address space are pointers to arena memory; - assume that arena is identified by address space number; - assume that address space zero corresponds to kernel address space; - assume that every BPF-side load or store from arena is done via pointer in user address space, thus convert base pointers using `addr_space_cast(src_reg, 0, 1)`; Only load, store, cmpxchg and atomicrmw IR instructions are handled by this transformation. For example, the following C code: ```c #define __as __attribute__((address_space(1))) void copy(int __as *from, int __as *to) { *to = *from; } ``` Compiled to the following IR: ```llvm define void @copy(ptr addrspace(1) %from, ptr addrspace(1) %to) { entry: %0 = load i32, ptr addrspace(1) %from, align 4 store i32 %0, ptr addrspace(1) %to, align 4 ret void } ``` Is transformed to: ```llvm %to2 = addrspacecast ptr addrspace(1) %to to ptr ;; ! %from1 = addrspacecast ptr addrspace(1) %from to ptr ;; ! 
%0 = load i32, ptr %from1, align 4, !tbaa !3 store i32 %0, ptr %to2, align 4, !tbaa !3 ret void ``` And compiled as: ```asm r2 = addr_space_cast(r2, 0, 1) r1 = addr_space_cast(r1, 0, 1) r1 = *(u32 *)(r1 + 0) *(u32 *)(r2 + 0) = r1 exit ``` Co-authored-by: Eduard Zingerman <eddyz87@gmail.com>	2024-03-13 02:27:25 +02:00
Nikolas Klauser	4a58284559	[clang] Refactor Builtins.def to be a tablegen file (#68324) This makes the builtins list quite a bit more verbose, but IMO this is a huge win in terms of readability.	2024-01-24 11:22:43 +01:00
yonghong-song	4e67234357	[Clang][BPF] Add __BPF_CPU_VERSION__ macro (#71856) Sometimes bpf developers might want to write different code for particular cpu versions. For example, cpu v1/v2/v3 branch target is 16bit while cpu v4 branch target is 32bit, thus cpu v4 allows more aggressive loop unrolling than cpu v1/v2/v3 (see [1] for a kernel selftest failure due to this). We would like to maintain aggressive loop unrolling for cpu v4 while limiting loop unrolling for earlier cpu versions. As another example, signed divide is also only available with cpu v4. Actually, adding cpu specific macros is fairly common in llvm. For example, x86 has macros like 'i486', '__pentium_mmx__', etc. AArch64 has '__ARM_NEON', '__ARM_FEATURE_SVE', etc. This patch added __BPF_CPU_VERSION__ macro. Current possible values are 0/1/2/3/4. The following are the -mcpu=... to __BPF_CPU_VERSION__ mapping: ``` cpu __BPF_CPU_VERSION__ no -mcpu=<...> 1 -mcpu=v1 1 -mcpu=v2 2 -mcpu=v3 3 -mcpu=v4 4 -mcpu=generic 1 -mcpu=probe 0 ``` This patch also added some macros for developers to identify some cpu insn features: ``` feature macro enabled in which cpu __BPF_FEATURE_JMP_EXT >= v2 __BPF_FEATURE_JMP32 >= v3 __BPF_FEATURE_ALU32 >= v3 __BPF_FEATURE_LDSX >= v4 __BPF_FEATURE_MOVSX >= v4 __BPF_FEATURE_BSWAP >= v4 __BPF_FEATURE_SDIV_SMOD >= v4 __BPF_FEATURE_GOTOL >= v4 __BPF_FEATURE_ST >= v4 ``` [1] https://lore.kernel.org/bpf/3e3a8a30-dde0-43a1-981e-2274962780ef@linux.dev/	2023-11-10 10:18:54 -08:00
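
The two motivating cases in the message above (unroll limits and signed division) might be guarded roughly as follows. This is a sketch only: `EXAMPLE_UNROLL_LIMIT`, its values 600 and 150, and the functions `example_unroll_limit` and `example_sdiv` are hypothetical names chosen for illustration. Only `__BPF_CPU_VERSION__`, `__BPF_FEATURE_SDIV_SMOD` and `__bpf__` come from the commit log.

```c
/* Pick a larger unroll limit only when the compiler reports cpu v4.
 * The concrete limits are illustrative assumptions, not recommendations. */
#if defined(__BPF_CPU_VERSION__) && __BPF_CPU_VERSION__ >= 4
#define EXAMPLE_UNROLL_LIMIT 600
#else
#define EXAMPLE_UNROLL_LIMIT 150
#endif

/* Hypothetical accessor so the chosen limit can be inspected at run time. */
static int example_unroll_limit(void)
{
    return EXAMPLE_UNROLL_LIMIT;
}

/* Hypothetical signed division wrapper. When compiling for BPF without
 * the sdiv/smod feature macro, it falls back to unsigned division plus a
 * sign fix-up. On any other target it simply uses the native operator. */
static int example_sdiv(int a, int b)
{
#if defined(__BPF_FEATURE_SDIV_SMOD) || !defined(__bpf__)
    return a / b;
#else
    unsigned int ua = a < 0 ? 0u - (unsigned int)a : (unsigned int)a;
    unsigned int ub = b < 0 ? 0u - (unsigned int)b : (unsigned int)b;
    unsigned int q = ua / ub;

    return ((a < 0) != (b < 0)) ? -(int)q : (int)q;
#endif
}
```

Testing with `defined(...)` first keeps the code compiling on older clang releases and on non-BPF hosts, where none of these macros exist.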
Yonghong Song	6c412b6c6f	[BPF] Add a few new insns under cpu=v4 In [1], a few new insns are proposed to expand BPF ISA to . fixing the limitation of existing insn (e.g., 16bit jmp offset) . adding new insns which may improve code quality (sign_ext_ld, sign_ext_mov, st) . feature complete (sdiv, smod) . better user experience (bswap) This patch implemented insn encoding for . sign-extended load . sign-extended mov . sdiv/smod . bswap insns . unconditional jump with 32bit offset The new bswap insns are generated under cpu=v4 for __builtin_bswap. For cpu=v3 or earlier, for __builtin_bswap, be or le insns are generated which is not intuitive for the user. To support 32-bit branch offset, a 32-bit ja (JMPL) insn is implemented. For conditional branch which is beyond 16-bit offset, llvm will do some transformation 'cond_jmp' -> 'cond_jmp + jmpl' to simulate 32bit conditional jmp. See BPFMIPeephole.cpp for details. The algorithm is heuristic based. I have tested bpf selftest pyperf600 with unroll count 600 which can indeed generate 32-bit jump insn, e.g., 13: 06 00 00 00 9b cd 00 00 gotol +0xcd9b <LBB0_6619> Eduard is working on adding 'st' insn to cpu=v4. A list of llc flags: disable-ldsx, disable-movsx, disable-bswap, disable-sdiv-smod, disable-gotol can be used to disable a particular insn for cpu v4. For example, user can do: llc -march=bpf -mcpu=v4 -disable-movsx t.ll to enable cpu v4 without movsx insns. References: [1] https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/ Differential Revision: https://reviews.llvm.org/D144829	2023-07-26 08:37:30 -07:00
serge-sans-paille	5a7f47cc02	[clang] Optimize clang::Builtin::Info density Reorganize clang::Builtin::Info to have them naturally align on 4-byte boundaries. Instead of storing builtin headers as a straight char pointer, enumerate them and store the enum. This allows using a small enum instead of a pointer to reference them. On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56 down to 48 bytes. On a release build on my Linux 64 bit machine, it shrinks the size of libclang-cpp.so by 193kB. The impact on performance is negligible in terms of instruction count, but the wall time seems better, see https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock Differential Revision: https://reviews.llvm.org/D142024	2023-01-23 14:27:44 +01:00
serge-sans-paille	a3c248db87	Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files. Differential Revision: https://reviews.llvm.org/D141139	2023-01-09 12:15:24 +01:00
serge-sans-paille	d9ab3e82f3	[clang] Use a StringRef instead of a raw char pointer to store builtin and call information This avoids recomputing string length that is already known at compile time. It has a slight impact on preprocessing / compile time, see https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u This is a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470. The above patchset caused some versions of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e. The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable. Differential Revision: https://reviews.llvm.org/D139881	2022-12-27 09:55:19 +01:00
Kazu Hirata	0e9373a6a6	[Basic] Use llvm::is_contained (NFC)	2021-10-10 08:52:14 -07:00
Alessandro Decina	833e9b2ea7	[BPF] add support for 32 bit registers in inline asm Add "w" constraint type which allows selecting 32 bit registers. 32 bit registers were added in https://reviews.llvm.org/rGca31c3bb3ff149850b664838fbbc7d40ce571879. Differential Revision: https://reviews.llvm.org/D102118	2021-05-16 11:01:47 -07:00
Yonghong Song	05e46979d2	[BPF] do compile-once run-everywhere relocation for bitfields A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. An example: struct s { int a; int b1:9; int b2:4; }; enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; void bpf_probe_read(void *, unsigned, const void *); int field_read(struct s *arg) { unsigned long long ull = 0; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); #ifdef USE_PROBE_READ bpf_probe_read(&ull, size, (const void *)arg + offset); unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ lshift = lshift + (size << 3) - 64; #endif #else switch(size) { case 1: ull = *(unsigned char *)((void *)arg + offset); break; case 2: ull = *(unsigned short *)((void *)arg + offset); break; case 4: ull = *(unsigned int *)((void *)arg + offset); break; case 8: ull = *(unsigned long long *)((void *)arg + offset); break; } unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #endif ull <<= lshift; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) return (long long)ull >> __builtin_preserve_field_info(arg->b2, 
FIELD_RSHIFT_U64); return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); } There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode: r3 = r1 r1 = 0 r1 = 4 <=== relocation (FIELD_BYTE_OFFSET) r3 += r1 r1 = r10 r1 += -8 r2 = 4 <=== relocation (FIELD_BYTE_SIZE) call bpf_probe_read r2 = 51 <=== relocation (FIELD_LSHIFT_U64) r1 = *(u64 *)(r10 - 8) r1 <<= r2 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) r0 = r1 r0 >>= r2 r3 = 1 <=== relocation (FIELD_SIGNEDNESS) if r3 == 0 goto LBB0_2 r1 s>>= r2 r0 = r1 LBB0_2: exit Compared to the above code between relocations FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64, the code with big endian mode has four more instructions. r1 = 41 <=== relocation (FIELD_LSHIFT_U64) r6 += r1 r6 += -64 r6 <<= 32 r6 >>= 32 r1 = *(u64 *)(r10 - 8) r1 <<= r6 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) The code and relocation generated when using direct load. r2 = 0 r3 = 4 r4 = 4 if r4 s> 3 goto LBB0_3 if r4 == 1 goto LBB0_5 if r4 == 2 goto LBB0_6 goto LBB0_9 LBB0_6: # %sw.bb1 r1 += r3 r2 = *(u16 *)(r1 + 0) goto LBB0_9 LBB0_3: # %entry if r4 == 4 goto LBB0_7 if r4 == 8 goto LBB0_8 goto LBB0_9 LBB0_8: # %sw.bb9 r1 += r3 r2 = *(u64 *)(r1 + 0) goto LBB0_9 LBB0_5: # %sw.bb r1 += r3 r2 = *(u8 *)(r1 + 0) goto LBB0_9 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 r2 <<= r1 r1 = 60 r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit Considering that the verifier is able to do limited constant propagation following branches, the following is the code actually traversed. 
r2 = 0 r3 = 4 <=== relocation r4 = 4 <=== relocation if r4 s> 3 goto LBB0_3 LBB0_3: # %entry if r4 == 4 goto LBB0_7 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 <=== relocation r2 <<= r1 r1 = 60 <=== relocation r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 llvm-svn: 374099	2019-10-08 18:23:17 +00:00
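
The shift pair in the listings above (`51` for FIELD_LSHIFT_U64 and `60` for FIELD_RSHIFT_U64 on little endian) can be checked on the host without the builtin. The sketch below replays only the final "shift left, then shift right" step of `field_read` on an already loaded 64-bit value. The function name `extract_bitfield` is hypothetical, and the test word is built by hand on the assumption that `b1` occupies bits 0 to 8 and `b2` bits 9 to 12 of the loaded 32-bit word, which is what the relocation values imply.

```c
#include <stdint.h>

/* Hypothetical helper: moves the field's top bit to bit 63, then shifts
 * back down so the field lands in the low bits. For signed fields the
 * right shift is done on a signed value so the sign bit is replicated.
 * Right-shifting a negative value is implementation-defined in C; this
 * sketch assumes the arithmetic shift that mainstream compilers provide. */
static int64_t extract_bitfield(uint64_t loaded, unsigned lshift,
                                unsigned rshift, int is_signed)
{
    uint64_t ull = loaded << lshift;

    if (is_signed)
        return (int64_t)ull >> rshift;
    return (int64_t)(ull >> rshift);
}
```

For a word holding `b1 = 0x1ff` and `b2 = 0xd`, that is `0x1ff | (0xd << 9)`, the shifts 51 and 60 yield -3 when the field is treated as signed and 13 when it is treated as unsigned.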
Yonghong Song	51a4a0d68f	[BPF] do not generate predefined macro bpf "DefineStd(Builder, "bpf", Opts)" generates the following three macros: bpf __bpf __bpf__ and the macro "bpf" is due to the fact that the target language is C which allows GNU extensions. The name "bpf" could be easily used as variable name or type field name. For example, in current linux kernel, there are four places where bpf is used as a field name. If the corresponding types are included in bpf program, a compilation error will occur. This patch removed predefined macro "bpf" as well as "__bpf" which is rarely used if used at all. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61173 llvm-svn: 359310	2019-04-26 15:35:51 +00:00
Jiong Wang	862e7405e8	bpf: teach BPF driver about the new CPU "v3" This patch simply teaches the BPF driver about the new CPU "v3" introduced in LLVM backend. Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 353479	2019-02-07 22:51:56 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Erich Keane	e44bdb3f70	Add Rest of Targets Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 Most of the rest of the Targets were pretty rote, so this patch knocks them all out at once. Differential Revision: https://reviews.llvm.org/D43057 llvm-svn: 324676	2018-02-08 23:16:55 +00:00
Erich Keane	ebba592682	Break up Targets.cpp into a header/impl pair per target type[NFCI] Targets.cpp is getting unwieldy, and even minor changes cause the entire thing to cause recompilation for everyone. This patch bites the bullet and breaks it up into a number of files. I tended to keep function definitions in the class declaration unless it caused additional includes to be necessary. In those cases, I pulled it over into the .cpp file. Content is copy/paste for the most part, besides includes/format/etc. Differential Revision: https://reviews.llvm.org/D35701 llvm-svn: 308791	2017-07-21 22:37:03 +00:00

17 Commits