============================
Userspace block driver(ublk)
============================

Introduction
============

This is the userspace daemon part (ublksrv) of the ublk framework; the other
part is the ``ublk driver`` [#ublk_driver]_, which supports multiple queues.

The two parts communicate through io_uring's IORING_OP_URING_CMD, with one
per-queue shared cmd buffer for storing io commands. The buffer is read-only
for ublksrv, and each io command can be indexed directly by io request tag.
A command is written by the ublk driver and read by ublksrv after ublksrv
gets a notification from the ublk driver.

For example, when a READ io request is submitted to the ublk block device,
the ublk driver first stores the io command into the cmd buffer, then
completes one IORING_OP_URING_CMD to notify ublksrv. ublksrv issues that
URING_CMD to the ublk driver beforehand so that it can be notified of any new
io request. Each URING_CMD is associated with one io request by tag, so the
depth for URING_CMD is the same as the queue depth of the ublk block device.

After ublksrv gets the io command, it translates and handles the ublk io
request. For the ublk-loop target, for example, ublksrv translates the request
into the same request on another file or disk, like the kernel loop block
driver.
In ublksrv's implementation, the io is still handled by io_uring, and shares
the same ring with the IORING_OP_URING_CMD commands. When the target io
request is done, the same IORING_OP_URING_CMD is issued to the ublk driver,
both to commit the io request result and to get future notification of new
io requests.

So far, the ublk driver needs to copy the io request pages into the userspace
buffer (pages) for WRITE before notifying ublksrv of the request, and to copy
the userspace buffer (pages) back to the io request pages after ublksrv
handles READ. It also looks like linux-mm can't support zero copy for this
case yet. [#zero_copy]_

More ublk targets will be added with this framework in the future, even though
only ublk-loop and ublk-null are implemented now.

libublksrv is also generated, and it helps to integrate ublk into existing
projects. The demo_null example shows how to make a ublk device over
libublksrv.

Quick start
===========

how to build ublksrv:
---------------------

.. code-block:: console

  autoreconf -i
  ./configure   #pkg-config and libtool are usually needed
  make

note: ``./configure`` requires the liburing 2.2 package to be installed. If
liburing 2.2 isn't available in your distribution, please configure via the
following command, or refer to ``build_with_liburing_src``
[#build_with_liburing_src]_

.. code-block:: console

  PKG_CONFIG_PATH=${LIBURING_DIR} \
  ./configure \
  CFLAGS="-I${LIBURING_DIR}/src/include" \
  CXXFLAGS="-I${LIBURING_DIR}/src/include" \
  LDFLAGS="-L${LIBURING_DIR}/src"

Here LIBURING_DIR points to the directory of the liburing source code, and
liburing needs to be built before running the above commands. Also,
IORING_SETUP_SQE128 has to be supported in the liburing source.
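
The IORING_SETUP_SQE128 requirement can be checked before running
``./configure``. The following is a minimal sketch and not part of ublksrv:
the function names are hypothetical, and it assumes the flag is declared in
``src/include/liburing/io_uring.h`` of the liburing source tree.

.. code-block:: shell

  # Succeeds if the liburing source tree in $1 mentions IORING_SETUP_SQE128.
  # Assumption: the flag lives in src/include/liburing/io_uring.h.
  liburing_has_sqe128() {
      header="$1/src/include/liburing/io_uring.h"
      [ -f "$header" ] && grep -q 'IORING_SETUP_SQE128' "$header"
  }

  # Hypothetical wrapper: run the configure line documented above, but only
  # after the check passes. $1 is the liburing source directory.
  configure_with_liburing_src() {
      LIBURING_DIR="$1"
      if ! liburing_has_sqe128 "$LIBURING_DIR"; then
          echo "liburing at $LIBURING_DIR lacks IORING_SETUP_SQE128" >&2
          return 1
      fi
      PKG_CONFIG_PATH="$LIBURING_DIR" \
      ./configure \
          CFLAGS="-I$LIBURING_DIR/src/include" \
          CXXFLAGS="-I$LIBURING_DIR/src/include" \
          LDFLAGS="-L$LIBURING_DIR/src"
  }

With these functions loaded, ``configure_with_liburing_src /path/to/liburing``
would be run from the top of the ublksrv source tree after ``autoreconf -i``.
The check is a plain text search of the header, so it only shows that the
header declares the flag, not that the running kernel supports it.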

c++20 is required for building the ublk utility, but libublksrv and
demo_null.c & demo_event.c can be built independently:

- build libublksrv ::

    make -C lib/

- build demo_null && demo_event ::

    make -C lib/
    make demo_null demo_event

help
----

- ublk help

add one ublk-null disk
----------------------

- ublk add -t null


add one ublk-loop disk
----------------------

- ublk add -t loop -f /dev/vdb

or

- ublk add -t loop -f 1.img


add one qcow2 disk
------------------

- ublk add -t qcow2 -f test.qcow2

note: qcow2 support is experimental, see details in qcow2 status [#qcow2_status]_
and readme [#qcow2_readme]_


remove one ublk disk
--------------------

- ublk del -n 0		#remove /dev/ublkb0

- ublk del -a		#remove all ublk devices

list ublk devices
-----------------

- ublk list

- ublk list -v	#with all device info dumped


unprivileged mode
==================

A typical use case is a container [#stefan_container]_ in which a user
can manage their own devices, which are not exposed to other containers.

By default, controlling a ublk device needs a privileged user, since
/dev/ublk-control is permitted for the administrator only; this
is called privileged mode.

For unprivileged mode, /dev/ublk-control needs to be allowed for
all users, so the following udev rule needs to be added: ::

  KERNEL=="ublk-control", MODE="0666", OPTIONS+="static_node=ublk-control"

Also, when a new ublk device is added, ublk needs to change the device
ownership to the device's real owner, so the following rules are
needed: ::

  KERNEL=="ublkc*",RUN+="ublk_chown.sh %k"
  KERNEL=="ublkb*",RUN+="ublk_chown.sh %k"

``ublk_chown.sh`` can be found under ``utils/`` too.

``utils/ublk_dev.rules`` includes the above rules.
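
Installing the rules file might look like the following. This is a minimal
sketch, not a script shipped with ublksrv: ``install_ublk_rules`` is a
hypothetical helper, and ``/etc/udev/rules.d`` is assumed to be the local
udev rules directory, which may differ between distributions.

.. code-block:: shell

  # Copy a udev rules file into a rules directory.
  # $1: rules file (default assumes the top of the ublksrv source tree)
  # $2: destination directory (assumed default: /etc/udev/rules.d)
  install_ublk_rules() {
      src="${1:-utils/ublk_dev.rules}"
      dest_dir="${2:-/etc/udev/rules.d}"
      [ -f "$src" ] || { echo "missing rules file: $src" >&2; return 1; }
      mkdir -p "$dest_dir" && install -m 0644 "$src" "$dest_dir/"
  }

Run as the administrator, ``install_ublk_rules`` copies the file with its
defaults; ``udevadm control --reload-rules`` then asks udev to re-read its
rules. The helper only copies the rules file, so ``ublk_chown.sh`` still has
to be installed where udev can run it.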

With the above two administrator changes, an unprivileged user can
create/delete/list/use ublk devices, and anyone who isn't permitted
can't access or control these ublk devices (ublkc*/ublkb*).

An unprivileged user can pass ``--unprivileged`` to ``ublk add`` to create an
unprivileged ublk device; the created ublk device is then only available
to the owner and the administrator.

use unprivileged ublk in docker
-------------------------------

- install the following udev rules in host machine: ::

    ACTION=="add",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'add' '%M' '%m'"
    ACTION=="remove",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'remove' '%M' '%m'"

``ublk_chown_docker.sh`` can be found under ``utils/``.

- run one container and install ublk & its dependency packages

.. code-block:: console

  docker run \
      --name fedora \
      --hostname=ublk-docker.example.com \
      --device=/dev/ublk-control \
      --device-cgroup-rule='a *:* rmw' \
      --tmpfs /tmp \
      --tmpfs /run \
      --volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
      -ti \
      fedora:38

.. code-block:: console

  #run the following commands inside the above container
  dnf install -y git libtool automake autoconf g++ liburing-devel
  git clone https://github.com/ming1/ubdsrv.git
  cd ubdsrv
  autoreconf -i && ./configure && make -j 4 && make install

- add/delete ublk device inside container by unprivileged user

.. code-block:: console

  docker exec -u 1001:1001 -ti fedora /bin/bash

.. code-block:: console

  #run the following commands inside the above container
  bash-5.2$ ublk add -t null --unprivileged
  dev id 0: nr_hw_queues 1 queue_depth 128 block size 512 dev_capacity 524288000
      max rq size 524288 daemon pid 178 flags 0x62 state LIVE
      ublkc: 237:0 ublkb: 259:1 owner: 1001:1001
      queue 0: tid 179 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )
      target {"dev_size":268435456000,"name":"null","type":0}

  bash-5.2$ ls -l /dev/ublk*
  crw-rw-rw-. 1 root root  10, 123 May  1 04:35 /dev/ublk-control
  brwx------. 1 1001 1001 259,   1 May  1 04:36 /dev/ublkb0
  crwx------. 1 1001 1001 237,   0 May  1 04:36 /dev/ublkc0

  bash-5.2$ ublk del -n 0
  bash-5.2$ ls -l /dev/ublk*
  crw-rw-rw-. 1 root root  10, 123 May  1 04:35 /dev/ublk-control

- example of ublk in docker: ``tests/debug/ublk_docker``

test
====

run all built tests
-------------------

make test T=all


run test group
--------------

make test T=null

make test T=loop

make test T=generic


run single test
---------------

make test T=generic/001

make test T=null/001

make test T=loop/001
...

run specified tests or test groups
----------------------------------

make test T=generic:loop/001:null


Debug
=====

ublksrv runs as a daemon process, so most debug messages won't be
shown in the terminal. If any issue is observed, please collect the log via
the command "journalctl | grep ublksrvd".

``./configure --enable-debug`` builds a debug version of ublk which
dumps lots of runtime debug messages; it can't be used in a production
environment and should be used for debug purposes only.
For the debug version of
ublksrv, ``ublk add --debug_mask=0x{MASK}`` controls which kinds of
debug log are dumped; see ``UBLK_DBG_*`` defined in include/ublksrv_utils.h
for each kind of debug log.

libublksrv API doc
==================

The API is documented in include/ublksrv.h, and the doxygen doc can be
generated by running 'make doxygen_doc'; the generated html docs are in
doc/html.

Contributing
============

Any kind of contribution is welcome!

Development is done on GitHub.


Todo:
=====

libublk
-------

Move libublksrv out of the ublksrv project, make it a standalone repo,
and name it libublk.

This is planned for when ublk driver UAPI changes (feature additions) slow down.

License
=======

nlohmann (include/nlohmann/json.hpp) is from [#nlohmann]_, which is covered
by the MIT license.

The library functions (all code in the lib/ directory and include/ublksrv.h)
are dual licensed under LGPL and MIT, see COPYING.LGPL and LICENSE.

The qcow2 target code is covered by GPL-2.0, see COPYING.

All other source code is dual licensed under GPL and MIT, see
COPYING and LICENSE.

References
==========

.. [#ublk_driver] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/block/ublk_drv.c?h=v6.0
.. [#zero_copy] https://lore.kernel.org/all/[email protected]/
.. [#nlohmann] https://github.com/nlohmann/json
.. [#qcow2_status] https://github.com/ming1/ubdsrv/blob/master/qcow2/STATUS.rst
.. [#qcow2_readme] https://github.com/ming1/ubdsrv/blob/master/qcow2/README.rst
.. [#build_with_liburing_src] https://github.com/ming1/ubdsrv/blob/master/build_with_liburing_src
.. [#stefan_container] https://lore.kernel.org/linux-block/[email protected]/