4539 Commits

Author SHA1 Message Date
enh-google
71dd4e17dd
[libc] fix strsep()/strtok()/strtok_r() "subsequent searches" behavior. (#154370)
These functions turned out to have the same bug that was in wcstok()
(fixed by 4fc9801), so add the missing tests and fix the code in a way
that matches wcstok().

Also fix incorrect test expectations in existing tests.

Also update the BUILD.bazel files to actually build the strsep() test.
2025-08-21 09:32:35 -04:00
Joseph Huber
e90ce511e0 Disable asan on last wide string function 2025-08-21 08:30:47 -05:00
Joseph Huber
3596005148 Fix wide read defaults 2025-08-21 08:17:06 -05:00
Joseph Huber
6ac01d12d6
Reapply "[libc] Enable wide-read memory operations by default on Linux (#154602)" (#154640)
Reland afterr the sanitizer and arm32 builds complained.
2025-08-21 07:40:23 -05:00
Mikhail R. Gadelha
2b1dcf5383
[libc] Remove hardcoded sizeof in __barrier_type.h (#153718)
This PR modifies the static_asserts checking the expected sizes in
__barrier_type.h, so that we can guarantee that our internal
implementation fits the public header.
2025-08-21 09:37:56 -03:00
Joseph Huber
27fc9671f9 Revert "[libc] Enable wide-read memory operations by default on Linux (#154602)"
This reverts commit c80d1483c6d787edf62ff9e86b1e97af5eb5abf9.
2025-08-20 17:27:13 -05:00
Joseph Huber
c80d1483c6
[libc] Enable wide-read memory operations by default on Linux (#154602)
Summary:
This patch changes the linux build to use the wide reads on the memory
operations by default. These memory functions will now potentially read
outside of the bounds explicitly allowed by the current function. While
technically undefined behavior in the standard, plenty of C library
implementations do this. it will not cause a segmentation fault on linux
as long as you do not cross a page boundary, and because we are only
*reading* memory it should not have atomic effects.
2025-08-20 17:17:12 -05:00
Guillaume Chatelet
4dd9e99284
[libc] Fix constexpr add_with_carry/sub_with_borrow (#154282)
The previous version of the code would prevent the use of the compiler
builtins.
2025-08-20 10:41:51 +02:00
Sterling-Augustine
317920063b
Add vector-based strlen implementation for x86_64 and aarch64 (#152389)
These replace the default LIBC_CONF_STRING_UNSAFE_WIDE_READ
implementation
on x86_64 and aarch64.

These are substantially faster than both the character-by-character
implementation and the original unsafe_wide_read implementation. Some
below
I have been unable to performance-test the aarch64 version, but I
suspect
speedups similar to avx2.

```
Function: strlen
Variant:
                                    char	       wide              	ull	        sse2	                 avx2           	avx512
=============================================================================================================================================================
   length=1, alignment=1:        13.18	       20.47 (-55.24%)	       20.21 (-53.27%)	       32.50 (-146.54%)	       26.05 (-97.61%)	       18.03 (-36.74%)
   length=1, alignment=0:        12.80	       34.92 (-172.89%)	       20.01 (-56.39%)	       17.52 (-36.86%)	       17.78 (-38.92%)	       18.04 (-40.94%)
   length=2, alignment=2:         9.91	       19.02 (-91.95%)	       12.64 (-27.52%)	       11.06 (-11.59%)	        9.48 (  4.38%)	        9.48 (  4.34%)
   length=2, alignment=0:         9.56	       26.88 (-181.24%)	       12.64 (-32.31%)	       11.06 (-15.73%)	       11.06 (-15.72%)	       11.83 (-23.80%)
   length=3, alignment=3:         8.31	       10.45 (-25.84%)	        8.28 (  0.32%)	        8.28 (  0.36%)	        6.21 ( 25.28%)	        6.21 ( 25.24%)
   length=3, alignment=0:         8.39	       14.53 (-73.20%)	        8.28 (  1.33%)	        7.24 ( 13.69%)	        7.56 (  9.94%)	        7.25 ( 13.65%)
   length=4, alignment=4:         9.84	       21.76 (-121.24%)	       15.55 (-58.11%)	        6.57 ( 33.18%)	        5.02 ( 48.98%)	        6.00 ( 39.00%)
   length=4, alignment=0:         8.64	       13.70 (-58.51%)	        7.28 ( 15.73%)	        6.37 ( 26.31%)	        6.36 ( 26.36%)	        6.36 ( 26.36%)
   length=5, alignment=5:        11.85	       23.81 (-100.97%)	       12.17 ( -2.67%)	        5.68 ( 52.09%)	        4.87 ( 58.94%)	        6.48 ( 45.33%)
   length=5, alignment=0:        11.82	       13.64 (-15.42%)	        7.27 ( 38.45%)	        6.36 ( 46.15%)	        6.37 ( 46.11%)	        6.36 ( 46.14%)
   length=6, alignment=6:        10.50	       19.37 (-84.56%)	       13.64 (-29.93%)	        6.54 ( 37.71%)	        6.89 ( 34.35%)	        9.45 ( 10.01%)
   length=6, alignment=0:        14.96	       14.05 (  6.04%)	        6.49 ( 56.62%)	        5.68 ( 62.04%)	        5.68 ( 62.04%)	       13.15 ( 12.05%)
   length=7, alignment=7:        10.97	       18.02 (-64.35%)	       14.59 (-33.06%)	        6.36 ( 41.96%)	        5.46 ( 50.25%)	        5.46 ( 50.25%)
   length=7, alignment=0:        10.96	       15.76 (-43.77%)	       15.37 (-40.15%)	        6.96 ( 36.51%)	        5.68 ( 48.22%)	        7.04 ( 35.83%)
   length=4, alignment=0:         8.66	       13.69 (-58.02%)	        7.28 ( 16.00%)	        6.37 ( 26.44%)	        6.37 ( 26.52%)	        6.61 ( 23.74%)
   length=4, alignment=7:         8.87	       17.35 (-95.73%)	       12.18 (-37.39%)	        5.68 ( 35.94%)	        4.87 ( 45.11%)	        6.00 ( 32.36%)
   length=4, alignment=2:         8.67	       10.05 (-15.91%)	        7.28 ( 16.01%)	        7.37 ( 15.02%)	        5.46 ( 37.02%)	        5.47 ( 36.89%)
   length=2, alignment=2:         5.64	       10.01 (-77.64%)	        7.29 (-29.34%)	        6.37 (-13.04%)	        5.46 (  3.19%)	        5.46 (  3.19%)
   length=8, alignment=0:        12.78	       16.52 (-29.33%)	       18.27 (-43.00%)	       11.82 (  7.47%)	        9.83 ( 23.03%)	       11.46 ( 10.27%)
   length=8, alignment=7:        14.24	       17.30 (-21.49%)	       12.16 ( 14.59%)	        5.68 ( 60.14%)	        4.87 ( 65.83%)	        6.23 ( 56.28%)
   length=8, alignment=3:        12.34	       26.15 (-111.98%)	       12.20 (  1.14%)	        6.50 ( 47.34%)	        4.87 ( 60.54%)	        6.18 ( 49.94%)
   length=5, alignment=3:        10.95	       19.74 (-80.30%)	       12.17 (-11.11%)	        5.68 ( 48.16%)	        4.87 ( 55.56%)	        5.96 ( 45.55%)
   length=16, alignment=0:        20.33	       29.29 (-44.08%)	       36.18 (-77.97%)	        5.68 ( 72.06%)	        5.68 ( 72.08%)	       10.60 ( 47.86%)
   length=16, alignment=7:        19.29	       17.52 (  9.16%)	       12.98 ( 32.73%)	        7.05 ( 63.47%)	        4.87 ( 74.75%)	        6.23 ( 67.71%)
   length=16, alignment=4:        20.54	       25.18 (-22.56%)	       15.42 ( 24.92%)	        7.31 ( 64.43%)	        4.87 ( 76.29%)	        5.98 ( 70.88%)
   length=10, alignment=4:        14.59	       21.26 (-45.71%)	       12.17 ( 16.58%)	        5.68 ( 61.07%)	        4.87 ( 66.65%)	        6.00 ( 58.91%)
   length=32, alignment=0:        35.46	       22.00 ( 37.95%)	       16.22 ( 54.26%)	        7.32 ( 79.35%)	        5.68 ( 83.98%)	        7.01 ( 80.22%)
   length=32, alignment=7:        35.23	       24.14 ( 31.48%)	       16.22 ( 53.96%)	        7.30 ( 79.28%)	        8.76 ( 75.12%)	        6.14 ( 82.58%)
   length=32, alignment=5:        35.16	       28.56 ( 18.76%)	       16.22 ( 53.87%)	        7.30 ( 79.23%)	        6.77 ( 80.75%)	        9.82 ( 72.07%)
   length=21, alignment=5:        26.47	       27.66 ( -4.49%)	       15.04 ( 43.17%)	        6.90 ( 73.95%)	        4.87 ( 81.60%)	        6.04 ( 77.18%)
   length=64, alignment=0:        66.45	       25.16 ( 62.14%)	       22.70 ( 65.83%)	       12.99 ( 80.44%)	        7.47 ( 88.77%)	        8.70 ( 86.90%)
   length=64, alignment=7:        64.75	       27.78 ( 57.10%)	       22.72 ( 64.91%)	       10.85 ( 83.25%)	        7.46 ( 88.48%)	        8.68 ( 86.60%)
   length=64, alignment=6:        67.26	       28.58 ( 57.51%)	       22.70 ( 66.24%)	       11.26 ( 83.25%)	        9.46 ( 85.94%)	       13.90 ( 79.33%)
   length=42, alignment=6:        73.42	       27.97 ( 61.91%)	       19.46 ( 73.49%)	        8.92 ( 87.84%)	        6.49 ( 91.16%)	        6.00 ( 91.83%)
  length=128, alignment=0:       172.07	       39.18 ( 77.23%)	       35.68 ( 79.26%)	       13.02 ( 92.43%)	       12.98 ( 92.46%)	        9.76 ( 94.33%)
  length=128, alignment=7:       163.98	       43.79 ( 73.30%)	       36.03 ( 78.03%)	       15.68 ( 90.44%)	       11.35 ( 93.08%)	       10.51 ( 93.59%)
  length=128, alignment=7:       185.86	       40.27 ( 78.33%)	       36.04 ( 80.61%)	       13.78 ( 92.58%)	       11.35 ( 93.89%)	       10.49 ( 94.36%)
   length=85, alignment=7:       121.61	       55.66 ( 54.23%)	       32.34 ( 73.40%)	       13.88 ( 88.59%)	        7.30 ( 94.00%)	        8.72 ( 92.83%)
  length=256, alignment=0:       295.54	       66.48 ( 77.50%)	       61.63 ( 79.15%)	       19.54 ( 93.39%)	       12.97 ( 95.61%)	       12.45 ( 95.79%)
  length=256, alignment=7:       308.06	       78.92 ( 74.38%)	       61.63 ( 80.00%)	       22.90 ( 92.57%)	       12.97 ( 95.79%)	       13.23 ( 95.71%)
  length=256, alignment=8:       295.32	       65.83 ( 77.71%)	       61.62 ( 79.13%)	       23.19 ( 92.15%)	       12.97 ( 95.61%)	       13.50 ( 95.43%)
  length=170, alignment=8:       234.39	       48.79 ( 79.18%)	       43.79 ( 81.32%)	       16.22 ( 93.08%)	       13.97 ( 94.04%)	       10.48 ( 95.53%)
  length=512, alignment=0:       563.75	      116.89 ( 79.27%)	      114.99 ( 79.60%)	       62.71 ( 88.88%)	       19.58 ( 96.53%)	       17.76 ( 96.85%)
  length=512, alignment=7:       580.53	      120.91 ( 79.17%)	      114.47 ( 80.28%)	       37.75 ( 93.50%)	       19.55 ( 96.63%)	       18.68 ( 96.78%)
  length=512, alignment=9:       584.05	      128.35 ( 78.02%)	      114.74 ( 80.35%)	       39.09 ( 93.31%)	       19.76 ( 96.62%)	       18.71 ( 96.80%)
  length=341, alignment=9:       405.84	       90.87 ( 77.61%)	       78.79 ( 80.59%)	       28.77 ( 92.91%)	       14.60 ( 96.40%)	       14.15 ( 96.51%)
 length=1024, alignment=0:      1143.61	      247.03 ( 78.40%)	      243.70 ( 78.69%)	       75.59 ( 93.39%)	       67.02 ( 94.14%)	       28.99 ( 97.46%)
 length=1024, alignment=7:      1124.55	      267.87 ( 76.18%)	      259.16 ( 76.95%)	       64.96 ( 94.22%)	       33.05 ( 97.06%)	       30.91 ( 97.25%)
length=1024, alignment=10:      1459.58	      257.79 ( 82.34%)	      239.91 ( 83.56%)	       65.00 ( 95.55%)	       33.10 ( 97.73%)	       30.33 ( 97.92%)
 length=682, alignment=10:       732.89	      163.67 ( 77.67%)	      170.54 ( 76.73%)	       46.48 ( 93.66%)	       24.32 ( 96.68%)	       21.44 ( 97.07%)
 length=2048, alignment=0:      2141.96	      451.61 ( 78.92%)	      448.00 ( 79.08%)	      133.24 ( 93.78%)	       61.22 ( 97.14%)	       80.08 ( 96.26%)
 length=2048, alignment=7:      2145.05	      458.26 ( 78.64%)	      449.99 ( 79.02%)	      140.19 ( 93.46%)	       60.26 ( 97.19%)	       51.71 ( 97.59%)
length=2048, alignment=11:      2162.61	      463.37 ( 78.57%)	      448.07 ( 79.28%)	      140.29 ( 93.51%)	       59.51 ( 97.25%)	       51.59 ( 97.61%)
length=1365, alignment=11:      1439.74	      322.86 ( 77.58%)	      310.84 ( 78.41%)	      116.08 ( 91.94%)	       42.43 ( 97.05%)	       36.15 ( 97.49%)
 length=4096, alignment=0:      4278.68	      871.60 ( 79.63%)	      865.25 ( 79.78%)	      252.50 ( 94.10%)	      161.17 ( 96.23%)	       94.97 ( 97.78%)
 length=4096, alignment=7:      4253.01	      871.62 ( 79.51%)	      864.21 ( 79.68%)	      243.90 ( 94.27%)	      171.17 ( 95.98%)	       95.14 ( 97.76%)
length=4096, alignment=12:      4252.18	      879.66 ( 79.31%)	      863.68 ( 79.69%)	      244.26 ( 94.26%)	      185.36 ( 95.64%)	       93.61 ( 97.80%)
length=2730, alignment=12:      2868.22	      597.65 ( 79.16%)	      586.22 ( 79.56%)	      175.09 ( 93.90%)	      120.35 ( 95.80%)	      101.35 ( 96.47%)
    length=0, alignment=0:         4.87	        8.11 (-66.73%)	        6.49 (-33.34%)	        5.80 (-19.26%)	        5.68 (-16.67%)	        6.86 (-40.91%)
   length=32, alignment=0:        33.82	       22.36 ( 33.89%)	       17.03 ( 49.66%)	        7.30 ( 78.42%)	        5.68 ( 83.22%)	        7.50 ( 77.83%)
   length=64, alignment=0:        66.20	       26.76 ( 59.58%)	       23.22 ( 64.93%)	       12.99 ( 80.37%)	        7.34 ( 88.92%)	        8.44 ( 87.25%)
   length=96, alignment=0:       130.26	       31.62 ( 75.72%)	       30.00 ( 76.97%)	       11.39 ( 91.26%)	       10.54 ( 91.91%)	        8.68 ( 93.34%)
  length=128, alignment=0:       164.66	       39.05 ( 76.29%)	       35.68 ( 78.33%)	       13.07 ( 92.07%)	       12.97 ( 92.12%)	        9.59 ( 94.18%)
  length=160, alignment=0:       196.63	       45.18 ( 77.02%)	       42.16 ( 78.56%)	       14.65 ( 92.55%)	       10.87 ( 94.47%)	        9.31 ( 95.27%)
  length=192, alignment=0:       225.50	       52.71 ( 76.63%)	       49.61 ( 78.00%)	       16.22 ( 92.81%)	       11.36 ( 94.96%)	       11.08 ( 95.09%)
  length=224, alignment=0:       261.08	       57.57 ( 77.95%)	       55.82 ( 78.62%)	       17.84 ( 93.17%)	       12.16 ( 95.34%)	       11.51 ( 95.59%)
  length=256, alignment=0:       295.13	       65.56 ( 77.79%)	       62.59 ( 78.79%)	       19.46 ( 93.41%)	       13.12 ( 95.56%)	       12.33 ( 95.82%)
  length=288, alignment=0:       325.69	       72.16 ( 77.84%)	       69.20 ( 78.75%)	       21.08 ( 93.53%)	       13.94 ( 95.72%)	       12.32 ( 96.22%)
  length=320, alignment=0:       364.18	       78.78 ( 78.37%)	       75.69 ( 79.21%)	       22.71 ( 93.77%)	       14.70 ( 95.96%)	       14.46 ( 96.03%)
  length=352, alignment=0:       391.40	       84.87 ( 78.32%)	       82.15 ( 79.01%)	       24.50 ( 93.74%)	       15.62 ( 96.01%)	       14.27 ( 96.35%)
  length=384, alignment=0:       428.50	       91.43 ( 78.66%)	       88.70 ( 79.30%)	       26.16 ( 93.90%)	       17.29 ( 95.97%)	       15.04 ( 96.49%)
  length=416, alignment=0:       457.30	       98.23 ( 78.52%)	       95.02 ( 79.22%)	       27.81 ( 93.92%)	       17.22 ( 96.23%)	       15.05 ( 96.71%)
  length=448, alignment=0:       488.38	      104.52 ( 78.60%)	      101.87 ( 79.14%)	       31.22 ( 93.61%)	       18.07 ( 96.30%)	       16.89 ( 96.54%)
  length=480, alignment=0:       526.44	      109.61 ( 79.18%)	      108.11 ( 79.46%)	       31.11 ( 94.09%)	       18.88 ( 96.41%)	       17.10 ( 96.75%)
  length=512, alignment=0:       556.50	      117.29 ( 78.92%)	      113.78 ( 79.56%)	       62.57 ( 88.76%)	       19.88 ( 96.43%)	       17.80 ( 96.80%)
  length=576, alignment=0:       622.17	      152.93 ( 75.42%)	      127.58 ( 79.49%)	       39.34 ( 93.68%)	       21.31 ( 96.58%)	       19.99 ( 96.79%)
  length=640, alignment=0:       691.01	      142.56 ( 79.37%)	      161.78 ( 76.59%)	       39.20 ( 94.33%)	       22.98 ( 96.67%)	       20.13 ( 97.09%)
  length=704, alignment=0:       756.90	      156.31 ( 79.35%)	      176.19 ( 76.72%)	       45.03 ( 94.05%)	       24.82 ( 96.72%)	       22.33 ( 97.05%)
  length=768, alignment=0:       826.23	      193.17 ( 76.62%)	      188.41 ( 77.20%)	       50.81 ( 93.85%)	       27.46 ( 96.68%)	       23.25 ( 97.19%)
  length=832, alignment=0:       890.17	      204.81 ( 76.99%)	      201.61 ( 77.35%)	       53.77 ( 93.96%)	       27.73 ( 96.88%)	       25.06 ( 97.18%)
  length=896, alignment=0:       959.52	      217.89 ( 77.29%)	      213.86 ( 77.71%)	       57.99 ( 93.96%)	       29.53 ( 96.92%)	       26.29 ( 97.26%)
  length=960, alignment=0:      1024.52	      231.06 ( 77.45%)	      227.05 ( 77.84%)	       60.36 ( 94.11%)	       32.29 ( 96.85%)	       27.94 ( 97.27%)
 length=1024, alignment=0:      1086.71	      244.17 ( 77.53%)	      239.87 ( 77.93%)	       64.72 ( 94.04%)	       72.38 ( 93.34%)	       28.72 ( 97.36%)
 length=1152, alignment=0:      1231.48	      270.22 ( 78.06%)	      266.47 ( 78.36%)	       73.38 ( 94.04%)	       40.24 ( 96.73%)	       32.42 ( 97.37%)
 length=1280, alignment=0:      1349.29	      295.45 ( 78.10%)	      292.69 ( 78.31%)	      111.80 ( 91.71%)	       42.44 ( 96.85%)	       34.59 ( 97.44%)
 length=1408, alignment=0:      1487.13	      322.57 ( 78.31%)	      318.18 ( 78.60%)	       84.47 ( 94.32%)	       44.35 ( 97.02%)	       37.31 ( 97.49%)
 length=1536, alignment=0:      1623.52	      347.98 ( 78.57%)	      344.24 ( 78.80%)	      108.31 ( 93.33%)	       49.82 ( 96.93%)	       39.94 ( 97.54%)
 length=1664, alignment=0:      1748.88	      373.80 ( 78.63%)	      370.03 ( 78.84%)	      118.76 ( 93.21%)	       52.89 ( 96.98%)	       42.93 ( 97.55%)
 length=1792, alignment=0:      1886.22	      399.59 ( 78.82%)	      397.39 ( 78.93%)	      127.32 ( 93.25%)	       53.64 ( 97.16%)	       45.39 ( 97.59%)
 length=1920, alignment=0:      2018.37	      425.98 ( 78.89%)	      422.31 ( 79.08%)	      126.70 ( 93.72%)	       57.08 ( 97.17%)	       48.12 ( 97.62%)
 length=2048, alignment=0:      2167.09	      451.70 ( 79.16%)	      447.70 ( 79.34%)	      141.68 ( 93.46%)	       61.63 ( 97.16%)	       79.06 ( 96.35%)
 length=2304, alignment=0:      2422.03	      503.63 ( 79.21%)	      502.23 ( 79.26%)	      149.62 ( 93.82%)	       73.10 ( 96.98%)	       56.97 ( 97.65%)
 length=2560, alignment=0:      2678.68	      556.84 ( 79.21%)	      553.24 ( 79.35%)	      161.06 ( 93.99%)	      127.74 ( 95.23%)	       58.81 ( 97.80%)
 length=2816, alignment=0:      2941.95	      608.70 ( 79.31%)	      604.03 ( 79.47%)	      171.85 ( 94.16%)	       87.11 ( 97.04%)	       67.08 ( 97.72%)
 length=3072, alignment=0:      3229.89	      660.14 ( 79.56%)	      659.19 ( 79.59%)	      183.85 ( 94.31%)	      140.25 ( 95.66%)	       73.01 ( 97.74%)
 length=3328, alignment=0:      3496.08	      713.05 ( 79.60%)	      710.00 ( 79.69%)	      209.72 ( 94.00%)	      138.78 ( 96.03%)	       77.81 ( 97.77%)
 length=3584, alignment=0:      3756.52	      766.19 ( 79.60%)	      763.94 ( 79.66%)	      214.16 ( 94.30%)	      146.36 ( 96.10%)	       83.43 ( 97.78%)
 length=3840, alignment=0:      4017.15	      817.43 ( 79.65%)	      819.77 ( 79.59%)	      242.07 ( 93.97%)	      164.56 ( 95.90%)	       89.72 ( 97.77%)
 length=4096, alignment=0:      4281.59	      867.87 ( 79.73%)	      864.71 ( 79.80%)	      243.33 ( 94.32%)	      173.11 ( 95.96%)	       95.65 ( 97.77%)
 length=4608, alignment=0:      4810.30	      977.80 ( 79.67%)	      985.03 ( 79.52%)	      271.13 ( 94.36%)	      190.62 ( 96.04%)	      107.82 ( 97.76%)
 length=5120, alignment=0:      5380.16	     1075.77 ( 80.00%)	     1071.80 ( 80.08%)	      294.27 ( 94.53%)	      206.04 ( 96.17%)	      141.90 ( 97.36%)
 length=5632, alignment=0:      5925.70	     1195.61 ( 79.82%)	     1193.68 ( 79.86%)	      323.42 ( 94.54%)	      223.55 ( 96.23%)	      125.28 ( 97.89%)
 length=6144, alignment=0:      6402.20	     1285.52 ( 79.92%)	     1281.04 ( 79.99%)	      342.68 ( 94.65%)	      234.84 ( 96.33%)	      167.01 ( 97.39%)
 length=6656, alignment=0:      6997.01	     1387.32 ( 80.17%)	     1384.21 ( 80.22%)	      365.93 ( 94.77%)	      269.89 ( 96.14%)	      176.40 ( 97.48%)
 length=7168, alignment=0:      7454.76	     1492.10 ( 79.98%)	     1488.45 ( 80.03%)	      391.92 ( 94.74%)	      280.81 ( 96.23%)	      187.73 ( 97.48%)
 length=7680, alignment=0:      8163.34	     1608.43 ( 80.30%)	     1615.98 ( 80.20%)	      460.03 ( 94.36%)	      299.86 ( 96.33%)	      201.40 ( 97.53%)
```
2025-08-19 15:18:04 -07:00
Michael Jones
d2b2d6ff10
[libc] Fix missing close at the end of file test (#154392)
The test added by #150802 was missing a close at the end.
2025-08-19 10:26:05 -07:00
codefaber
fd7f69bfe7
[libc] Fix copy/paste error in file.cpp (#150802)
Fix using wrong variable due to copy/paste error.

---------

Co-authored-by: codefaber <codefaber>
2025-08-19 10:05:38 -07:00
Krishna Pandey
550dbec03a
[libc][math][c++23] Add {,u}fromfp{,x}bf16 math functions (#153992)
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- fromfpbf16
- fromfpxbf16
- ufromfpbf16
- ufromfpxbf16

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-19 22:19:03 +05:30
William Huynh
0c622d72fc
[libc] Add _Returns_twice to C++ code (#153602)
Fixes issue with `<csetjmp>` which requires `_Returns_twice` but in C++
mode
2025-08-19 09:28:23 +01:00
Muhammad Bassiouni
523c3a0197
[libc][math] fix coshf16 build errors. (#154226) 2025-08-19 03:39:24 +03:00
Muhammad Bassiouni
2c79dc1082
[libc][math] Refactor coshf16 implementation to header-only in src/__support/math folder. (#153582)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-19 02:08:07 +03:00
Mohamed Emad
40833eea21
Reland "[libc][math][c23] Implement C23 math function asinpif16" (#152690)
#146226 with fixing asinpi MPFR number function and make it work when
mpfr < `4.2.0`
2025-08-18 00:04:47 +03:00
Aiden Grossman
71925a90c8
[libc] Setup hdrgen for ioctl (#153976)
This patch adds some hdrgen yaml for ioctl(). Otherwise the function
never actually ends up being available in a full build. This is the last
thing that is needed to enable turning on LIBCXX_ENABLE_RANDOM_DEVICE.
2025-08-17 08:52:29 -07:00
Aiden Grossman
29d49c8a37
[libc] Correct standard for getcpu (#153982) 2025-08-16 16:05:45 -07:00
Leandro Lacerda
75bf739208
[libc][gpu] Disable loop unrolling in the throughput benchmark loop (#153971)
This patch makes GPU throughput benchmark results more comparable across
targets by disabling loop unrolling in the benchmark loop.

Motivation:
* PTX (post-LTO) evidence on NVPTX: for libc `sin`, the generated PTX
shows the `throughput` loop unrolled 8x at `N=128` (one iteration
advances the input pointer by 64 bytes = 8 doubles), interleaving eight
independent chains before the back-edge. This hides latency and
significantly reduces cycles/call as the batch size `N` grows.
* Observed scaling (NVPTX measurements): with unrolling enabled, `sin`
dropped from ~3,100 cycles/call at `N=1` to ~360 at `N=128`. After
enforcing `#pragma clang loop unroll(disable)`, results stabilized
(e.g., from ~3100 cycles/call at `N=1` to ~2700 at `N=128`).
* libdevice contrast: the libdevice `sin` path did not exhibit a similar
drop in our measurements, and the PTX appears as compact internal calls
rather than a long FMA chain, leaving less ILP for the outer loop to
extract.

What this change does:
* Applies `#pragma clang loop unroll(disable)` to the GPU `throughput()`
loop in both NVPTX and AMDGPU backends.

Leaving unrolling entirely to the optimizer makes apples-to-apples
comparisons uneven (e.g., libc vs. vendor). Disabling unrolling yields
fairer, more consistent numbers.
2025-08-16 20:14:26 +00:00
Leandro Lacerda
cf5f311b26
[libc] Polish GPU benchmarking (#153900)
This patch provides cleanups and improvements for the GPU benchmarking
infrastructure. The key changes are:

- Fix benchmark convergence bug: Round up the scaled iteration count
(ceil) to ensure it grows properly. The previous truncation logic causes
the iteration count to get stuck.
- Resolve remaining compiler warning.
- Remove unused `BenchmarkLogger` files: This is dead code that added
maintenance and cognitive overhead without providing functionality.
- Improve build hygiene: Clean up headers and CMake dependencies to
strictly follow the 'include what you use' (IWYU) principle.
2025-08-15 19:51:52 -05:00
Leandro Lacerda
08ff017fb0
[libc] Improve GPU benchmarking (#153512)
This patch improves the GPU benchmarking in this way:

* Replace `rand`/`srand` with a deterministic per-thread RNG seeded by
`call_index`: reproducible, apples-to-apples libc vs vendor comparisons.
* Fix input generation: sample the unbiased exponent uniformly in
`[min_exp, max_exp]`, clamp bounds, and skip `Inf`, `NaN`, `-0.0`, and
`+0.0`.
* Fix standard deviation: use an explicit estimator from sums and
sums-of-squares (`sqrt(E[x^2] − E[x]^2)`) across samples.
* Fix throughput overhead: subtract a loop-only baseline inside
NVPTX/AMDGPU timing backends so `benchmark()` gets cycles-per-call
already corrected (no `overhead()` call).
* Adapt existing math benchmarks to the new RNG/timing plumbing (plumb
`call_index`, drop `rand/srand`, clean includes).
* Correct inter-thread aggregation: use iteration-weighted pooling to
compute the global mean/variance, ensuring statistically sound `Cycles
(Mean)` and `Stddev`.
* Remove `Time / Iteration` column from the results table: it reported
per-thread convergence time (not per-call latency) and was
redundant/misleading next to `Cycles (Mean)`.
* Remove unused `BenchmarkLogger` files: dead code that added
maintenance and cognitive overhead without providing functionality.

---

## TODO (before merge)

* [ ] Investigate compiler warnings and address their root causes.
* [x] Review how per-thread results are aggregated into the overall
result.

## Follow-ups (future PRs)

* Add support to run throughput benchmarks with uniform (linear) input
distributions, alongside the current log2-uniform scheme.
* Review/adjust the configuration and coverage of existing math
benchmarks.
* Add more math benchmarks (e.g., `exp`/`expf`, others).
2025-08-15 11:00:17 -05:00
Mikhail R. Gadelha
d7199544af
[libc] Fix mbrtowc test (#153721)
Previously, we were trying to memset a pointer that wasn't being
initialized, and the test would randomly fail.

This PR replaces the pointers with actual objects.
2025-08-15 11:44:33 -03:00
Krishna Pandey
6602d6c7a7
[libc][math][docs] Add documentation for BFloat16 type (#153475)
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-15 20:07:33 +05:30
William Huynh
6b16a276ef
[libc] Add startup code for ARM v7-A, ARM v7-R variants (#153576)
These variants require a different exception table that requires a bit
of initialisation.

This allows us to enable testing for these variants downstream.
2025-08-15 09:17:50 +00:00
Muhammad Bassiouni
9ddc85f6d5
[libc][math] Refactor coshf implementation to header-only in src/__support/math folder. (#153427)
Part of #147386

in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-14 17:19:47 +03:00
Krishna Pandey
41c9510d72
[libc][math][c++23] Add bf16fma{,f,l,f128} math functions (#153231)
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- bf16fma
- bf16fmaf
- bf16fmal
- bf16fmaf128

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-13 23:26:15 +05:30
Muhammad Bassiouni
0f6d3ad0fe
[libc][math] Refactor cosf16 implementation to header-only in src/__support/math folder. (#152871)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-13 18:04:35 +03:00
Jin Huang
91de0a2c43
[libc] Refactor libc code to improve readability. (#153308)
The PR is going to improve the readability for the files under
`llvm-project/libc/src/wchar` directory.

---------

Co-authored-by: Jin Huang <jingold@google.com>
2025-08-12 21:41:21 -07:00
Alexey Samsonov
04081caa09
[libc] Remove LIBC_ERRNO_MODE_SYSTEM mode. (#153077)
Use LIBC_ERRNO_MODE_SYSTEM_INLINE instead as the default for the "public
packaging" (i.e. release mode) of an overlay build. The Bazel build has
already switched to use it by default in
5ccc734fa0355f971f8f515457a0bece33ab6642. This should be a safe change,
as LIBC_ERRNO_MODE_SYSTEM_INLINE works a drop-in (but simpler)
LIBC_ERRNO_MODE_SYSTEM replacement. Remove the associated code paths and
config settings.

Fixes issue #143454.
2025-08-12 19:52:40 -07:00
Krishna Pandey
c819c246f3
[libc][math][c++23] Add bf16div{,f,l,f128} math functions (#153191)
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- bf16div
- bf16divf
- bf16divl
- bf16divf128

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-12 21:46:22 +05:30
Krishna Pandey
8c5e9399f6
[libc][math][c++23] Add f{max,min}imum{,_mag,_mag_num,_num}bf16 math functions (#152881)
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- fmaximumbf16
- fmaximum_magbf16
- fmaximum_mag_numbf16
- fmaximum_numbf16
- fminimumbf16
- fminimum_magbf16
- fminimum_mag_numbf16
- fminimum_numbf16

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-12 20:37:31 +05:30
William Huynh
33fe6353ef
Revert "[libc] Add -Wextra for libc tests" (#153169)
Reverts llvm/llvm-project#133643
2025-08-12 11:40:14 +00:00
Vinay Deshmukh
e617dc80bf
[libc] Add -Wextra for libc tests (#133643)
* Relates to: https://github.com/llvm/llvm-project/issues/119281
2025-08-12 12:27:13 +01:00
Joseph Huber
005895290d
[libc] Simplifiy slab waiting in GPU memory allocator (#152872)
Summary:
This moves the waiting to be done inside of the `try_lock` routine
instead. This makes the logic much simpler since it's just a single loop
on a load. We should have the same effect here, and since we don't care
about this being a generic interface it shouldn't matter that it waits
abit. Still wait free since it's guaranteed to make progress
*eventually*.
2025-08-11 13:11:39 -05:00
Muhammad Bassiouni
200a99073f
[libc][math] Refactor cosf implementation to header-only in src/__support/math folder. (#152069)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-11 21:08:21 +03:00
William Huynh
372d86dcf1
[libc] Cleanup startup/baremetal/arm/start.cpp (#151532)
Post-commit review changes as suggested by @petrhosek in #146863
2025-08-11 15:11:36 +01:00
William Huynh
0adcbc0228
[libc] Disable LlvmLibcTimespecGet.Monotonic for baremetal targets (#152290)
This test was caught by our hermetic testing downstream. The baremetal
implementation does not support monotonic, so we disable it.
2025-08-11 14:48:11 +01:00
Krishna Pandey
628c0e33e4
[libc][math][c++23] Add bf16mul{,f,l,f128} math functions (#152847)
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- bf16mul
- bf16mulf
- bf16mull
- bf16mulf128

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-10 21:38:27 -04:00
Muhammad Bassiouni
e86c9b6b1b
[libc][math] Refactor cos implementation to header-only in src/__support/math folder. (#151883)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-09 18:51:54 +03:00
Joseph Huber
0c139883f4 [libc] Fix server code when GPU is acting as the server
Summary:
Small fix that just ignores all the extra lanes if we're running the
server from a platform that potentially has more.
2025-08-08 19:15:13 -05:00
Krishna Pandey
246f92324f
[libc][math][c++23] Add f{max,min}bf16 math functions (#152782)
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-08 17:54:02 -04:00
Krishna Pandey
10088b64ef
[libc][math] Update entrypoints for bf16{add,sub}{,f,l,f128} math functions (#152784)
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-08 17:06:48 -04:00
Joseph Huber
ca006898b3
[libc] Cache old slabs when allocating GPU memory (#151866)
Summary:
This patch introduces a lock-free stack used to store a fixed number of
slabs. Instead of going directly through RPC memory, we instead can
consult the cache and use that. Currently, this means that ~64 MiB of
memory will remain in-use if the user completely fills the cache.
However, because we always fully destroy the object, the chunk size can
be reset so they can be fully reused.

This greatly improves performance in cases where the user has previously
accessed malloc, lowering the difference between an implementation that
does not free slabs at all and one that does.

We can also skip the expensive zeroing step if the old chunk size was
smaller than the previous one. Smaller chunk sizes need a larger
bitfield, and because we know for a fact that the number of users
remaining in this slab is zero thanks to the reference counting we can
guarantee that the bitfield is all zero like when it was initialized.
2025-08-08 14:28:41 -05:00
Krishna Pandey
1ffb99520d
[libc][math][c++23] Add bf16{add,sub}{,f,l,f128} math functions (#152774)
This PR adds implements following basic math functions for BFloat16 type
along with the tests:
- bf16add
- bf16addf
- bf16addl
- bf16addf128
- bf16sub
- bf16subf
- bf16subl
- bf16subf128

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-08 15:20:24 -04:00
Muhammad Bassiouni
45b15946b1
[libc][hdrgen] Fix hdrgen when using macros as guards in stdlib.yaml. (#152732) 2025-08-08 18:39:47 +03:00
Muhammad Bassiouni
66734f4c3c
[libc][math] Refactor cbrtf implementation to header-only in src/__support/math folder. (#151846)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-08-08 18:28:50 +03:00
Alexey Samsonov
2422972eea
[libc] Migrate FEnvSafeTest and FPTest to ErrnoCheckingTest. (#152633)
This would ensure that errno value is cleared out before test execution
and tests pass even when LIBC_ERRNO_MODE_SYSTEM_INLINE is specified (and
errno may be clobbered before test execution).

A lot of the tests would fail, however, since errno would end up getting
set to EDOM or ERANGE during test execution and never validated before
the end of the test. This should be fixed - and errno should be
explicitly checked or ignored in all of those cases, but for now add a
TODO to address it later (see open issue #135320) and clear out errno in
test fixture to avoid test failures.
2025-08-07 21:49:57 -07:00
Krishna Pandey
15a705dc93
[libc][math][c++23] Add {ceil,floor,round,roundeven,trunc}bf16 math functions (#152352)
This PR implements the following basic math functions for BFloat16 type
along with the tests:
- ceilbf16
- floorbf16
- roundbf16
- roundevenbf16
- truncbf16

---------

Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-08-08 00:15:22 -04:00
Caslyn Tonelli
b8195e3a8e
[libc] Fix typo and amend restrict qualifier (#152410)
This removes an extraneous ',' in the generated dlfcn header.

This also adds `__restrict` to `dladdr`'s declaration per POSIX. Another
fix is made: the C standard `restrict` keyword is removed from
dlinfo.cpp/dlinfo.h (but note that dlfcn.yaml still annotates
`__restrict` for dlinfo's decl).
2025-08-07 16:45:14 -07:00
Caslyn Tonelli
e1171e6a98
[libc][dlfcn] Remove unused errno dep (#152222)
This removes the errno dep for the stub libdl functions, since there is
no need for it.
2025-08-07 07:48:28 -07:00