r/bcachefs
Suboptimal allocator behavior under heavy load with a somewhat asymmetric device setup
Greetings everybody
I'm having some trouble with my volume (3x NVMe + 3x HDD).
Device label                  Device     State  Size   Used   Use%  Leaving
bhdd.seaJ6ER (device 24):     sdc4       rw     15.8T  174G   1%
bhdd.tosh21F0 (device 13):    sda4       rw     10.5T  3.21T  30%   4.25M
bhdd.tosh4310 (device 14):    sdb4       rw     10.5T  2.96T  28%
bnvme.970evo (device 5):      nvme2n1p6  rw     62.8G  61.8G  97%   27.8G
bnvme.990pro (device 23):     nvme1n1p6  rw     387G   217G   57%   212G
bnvme.sn720 (device 11):      nvme0n1p6  rw     74.0G  72.8G  97%   38.2G
nvme1 is substantially bigger (and faster) than the other two NVMe drives. sdc is somewhat bigger than the other HDDs and was added recently, so it is still nearly empty.
Now, here is iostat under heavy load:
avg-cpu: %user %nice %system %iowait %steal %idle
2.4% 0.0% 1.0% 38.4% 0.0% 58.2%
r/s rkB/s rrqm/s %rrqm r_await rareq-sz Device
35.33 4.4M 37.53 51.5% 1.45 126.4k nvme0n1
16.80 1.0M 0.00 0.0% 0.79 63.6k nvme1n1
21.33 3.3M 33.40 61.0% 3.52 158.0k nvme2n1
163.00 15.2M 259.80 61.4% 229.71 95.7k sda
211.27 28.1M 582.67 73.4% 171.60 136.3k sdb
90.87 5.5M 27.60 23.3% 27.99 61.8k sdc
w/s wkB/s wrqm/s %wrqm w_await wareq-sz Device
30.60 2.9M 5.20 14.5% 0.59 98.7k nvme0n1
82.00 44.0M 67.80 45.3% 1.82 549.8k nvme1n1
23.67 2.3M 4.87 17.1% 2.46 98.7k nvme2n1
26.73 43.4M 119.40 81.7% 324.65 1.6M sda
5.20 2.5M 35.93 87.4% 381.90 498.4k sdb
3.80 136.5k 1.93 33.7% 4.18 35.9k sdc
d/s dkB/s drqm/s %drqm d_await dareq-sz Device
6.93 2.7M 3.87 35.8% 0.55 398.8k nvme0n1
3.80 5.2M 1.40 26.9% 0.81 1.4M nvme1n1
8.27 2.1M 0.00 0.0% 2.26 256.0k nvme2n1
0.00 0.0k 0.00 0.0% 0.00 0.0k sda
0.00 0.0k 0.00 0.0% 0.00 0.0k sdb
0.00 0.0k 0.00 0.0% 0.00 0.0k sdc
f/s f_await aqu-sz %util Device
3.40 0.25 0.07 0.6% nvme0n1
3.40 2.16 0.17 2.9% nvme1n1
3.40 2.08 0.16 2.7% nvme2n1
3.33 114.66 46.50 85.9% sda
3.33 90.46 38.54 85.3% sdb
3.33 3.84 2.57 6.7% sdc
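As you can see, sdc is barely used, while sda and sdb sit at about 85% utilization with write latencies in the hundreds of milliseconds. The heavy use of sda/sdb is therefore the bottleneck.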
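To put a number on that imbalance, here is a rough Python sketch that compares each HDD's share of free space with its share of write throughput, using only figures copied from the tables above. The helper names (`to_kib`, `shares`) are made up for this post, and the comparison assumes that an even spread would follow free space, which is only a yardstick for illustration and not a statement about how the bcachefs allocator actually decides. Note also that iostat reports whole disks (sda) while the usage table reports partitions (sda4), so traffic from anything else on those disks is mixed in.

```python
# Figures copied from the post: size/used from the usage table,
# write throughput (wkB/s) and %util from the iostat sample.
UNITS = {"k": 1, "M": 1024, "G": 1024**2, "T": 1024**3}  # multipliers to KiB


def to_kib(text):
    """Convert a human-readable figure such as '3.21T' or '136.5k' to KiB."""
    return float(text[:-1]) * UNITS[text[-1]]


hdds = {
    # name: (size, used, wkB/s, %util)
    "sda": ("10.5T", "3.21T", "43.4M", 85.9),
    "sdb": ("10.5T", "2.96T", "2.5M", 85.3),
    "sdc": ("15.8T", "174G", "136.5k", 6.7),
}


def shares(devices):
    """Return {name: (share of free space, share of write throughput)}."""
    free = {n: to_kib(size) - to_kib(used)
            for n, (size, used, _, _) in devices.items()}
    writes = {n: to_kib(wkb) for n, (_, _, wkb, _) in devices.items()}
    total_free = sum(free.values())
    total_writes = sum(writes.values())
    return {n: (free[n] / total_free, writes[n] / total_writes)
            for n in devices}


result = shares(hdds)
for name, (free_share, write_share) in result.items():
    util = hdds[name][3]
    print(f"{name}: {free_share:6.1%} of HDD free space, "
          f"{write_share:6.1%} of HDD writes, {util}% util")
```

In this single sample, sdc holds about half of the free HDD space but receives well under 1% of the HDD write throughput. One iostat interval is not much evidence on its own, so it would be worth repeating this over a longer capture.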
The same hardware, through different partitions, also backs a second bcachefs volume that is being read from at the same time. One is simply the data volume; the other, whose details I'm giving here, is the backup. The data volume is laid out somewhat differently but is broadly similar to the backup volume.
Is this something to fix by tuning parameters, or does it need a source-code patch?
Any suggestions are welcome.
u/krismatu — 9 days ago