<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Wayland-Architecture">
  <title>Wayland Architecture</title>
  <section id="sect-Wayland-Architecture-wayland_architecture">
    <title>X vs. Wayland Architecture</title>
    <para>
      A good way to understand the Wayland architecture
      and how it differs from X is to follow an event
      from the input device to the point where the change
      it causes appears on screen.
    </para>
    <para>
      This is where we are now with X:
    </para>
    <figure>
      <title>X architecture diagram</title>
      <mediaobjectco>
        <imageobjectco>
          <areaspec id="map1" units="other" otherunits="imagemap">
            <area id="area1_1" linkends="x_flow_1" x_steal="#step_1"/>
            <area id="area1_2" linkends="x_flow_2" x_steal="#step_2"/>
            <area id="area1_3" linkends="x_flow_3" x_steal="#step_3"/>
            <area id="area1_4" linkends="x_flow_4" x_steal="#step_4"/>
            <area id="area1_5" linkends="x_flow_5" x_steal="#step_5"/>
            <area id="area1_6" linkends="x_flow_6" x_steal="#step_6"/>
          </areaspec>
          <imageobject>
            <imagedata fileref="images/x-architecture.png" format="PNG" />
          </imageobject>
        </imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
        <listitem id="x_flow_1">
          <para>
            The kernel gets an event from an input
            device and sends it to X through the evdev
            input driver. The kernel does all the hard
            work here by driving the device and
            translating the different device-specific
            event protocols to the Linux evdev input
            event standard.
          </para>
        </listitem>
        <listitem id="x_flow_2">
          <para>
            The X server determines which window the
            event affects and sends it to the clients
            that have selected for the event in question
            on that window. The X server doesn't
            actually know how to do this right, since
            the window location on screen is controlled
            by the compositor and may be transformed in
            a number of ways that the X server doesn't
            understand (scaled down, rotated, wobbling,
            etc.).
          </para>
        </listitem>
        <listitem id="x_flow_3">
          <para>
            The client looks at the event and decides
            what to do. Often the UI will have to change
            in response to the event; perhaps a check
            box was clicked or the pointer entered a
            button that must be highlighted.
            Thus the
            client sends a rendering request back to the
            X server.
          </para>
        </listitem>
        <listitem id="x_flow_4">
          <para>
            When the X server receives the rendering
            request, it sends it to the driver to let it
            program the hardware to do the rendering.
            The X server also calculates the bounding
            region of the rendering, and sends that to
            the compositor as a damage event.
          </para>
        </listitem>
        <listitem id="x_flow_5">
          <para>
            The damage event tells the compositor that
            something changed in the window and that it
            has to recomposite the part of the screen
            where that window is visible. The compositor
            is responsible for rendering the entire
            screen contents based on its scenegraph and
            the contents of the X windows. Yet, it has
            to go through the X server to render this.
          </para>
        </listitem>
        <listitem id="x_flow_6">
          <para>
            The X server receives the rendering requests
            from the compositor and either copies the
            compositor back buffer to the front buffer
            or does a page flip.
            In the general case, the
            X server has to do this step so it can
            account for overlapping windows, which may
            require clipping, and determine whether or
            not it can page flip. However, for a
            compositor, which is always fullscreen, this
            is another unnecessary context switch.
          </para>
        </listitem>
      </orderedlist>
    </para>
    <para>
      As suggested above, there are a few problems with this
      approach. The X server doesn't have the information to
      decide which window should receive the event, nor can it
      transform the screen coordinates to window-local
      coordinates. And even though X has handed responsibility for
      the final painting of the screen to the compositing manager,
      X still controls the front buffer and modesetting. Most of
      the complexity that the X server used to handle is now
      available in the kernel or self-contained libraries (KMS,
      evdev, mesa, fontconfig, freetype, cairo, Qt, etc.). In
      general, the X server is now just a middleman that
      introduces an extra step between applications and the
      compositor and an extra step between the compositor and the
      hardware.
    </para>
    <para>
      In Wayland the compositor is the display server. We transfer
      the control of KMS and evdev to the compositor.
      The Wayland
      protocol lets the compositor send the input events directly
      to the clients and lets the client send the damage event
      directly to the compositor:
    </para>
    <figure>
      <title>Wayland architecture diagram</title>
      <mediaobjectco>
        <imageobjectco>
          <areaspec id="mapB" units="other" otherunits="imagemap">
            <area id="areaB_1" linkends="wayland_flow_1" x_steal="#step_1"/>
            <area id="areaB_2" linkends="wayland_flow_2" x_steal="#step_2"/>
            <area id="areaB_3" linkends="wayland_flow_3" x_steal="#step_3"/>
            <area id="areaB_4" linkends="wayland_flow_4" x_steal="#step_4"/>
          </areaspec>
          <imageobject>
            <imagedata fileref="images/wayland-architecture.png" format="PNG" />
          </imageobject>
        </imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
        <listitem id="wayland_flow_1">
          <para>
            The kernel gets an event and sends
            it to the compositor. This
            is similar to the X case, which is
            great, since we get to reuse all the
            input drivers in the kernel.
          </para>
        </listitem>
        <listitem id="wayland_flow_2">
          <para>
            The compositor looks through its
            scenegraph to determine which window
            should receive the event.
            The
            scenegraph corresponds to what's on
            screen and the compositor
            understands the transformations that
            it may have applied to the elements
            in the scenegraph. Thus, the
            compositor can pick the right window
            and transform the screen coordinates
            to window-local coordinates by
            applying the inverse
            transformations. The types of
            transformation that can be applied
            to a window are restricted only by
            what the compositor can do, as long
            as it can compute the inverse
            transformation for the input events.
          </para>
        </listitem>
        <listitem id="wayland_flow_3">
          <para>
            As in the X case, when the client
            receives the event, it updates the
            UI in response. But in the Wayland
            case, the rendering happens in the
            client, and the client just sends a
            request to the compositor to
            indicate the region that was
            updated.
          </para>
        </listitem>
        <listitem id="wayland_flow_4">
          <para>
            The compositor collects damage
            requests from its clients and then
            recomposites the screen. The
            compositor can then directly issue
            an ioctl to schedule a page flip with
            KMS.
          </para>
        </listitem>
      </orderedlist>
    </para>
  </section>
  <section id="sect-Wayland-Architecture-wayland_rendering">
    <title>Wayland Rendering</title>
    <para>
      One of the details I left out in the above overview
      is how clients actually render under Wayland. By
      removing the X server from the picture we also
      removed the mechanism by which X clients typically
      render. But there's another mechanism that we're
      already using with DRI2 under X: direct rendering.
      With direct rendering, the client and the server
      share a video memory buffer. The client links to a
      rendering library such as OpenGL that knows how to
      program the hardware and renders directly into the
      buffer. The compositor in turn can take the buffer
      and use it as a texture when it composites the
      desktop. After the initial setup, the client only
      needs to tell the compositor which buffer to use and
      when and where it has rendered new content into it.
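    </para>
    <para>
      The bookkeeping described above (which buffer to use, and
      when and where new content was rendered) can be sketched as
      a toy model in C. This is a minimal sketch under stated
      assumptions, not the libwayland implementation: the names
      toy_surface, toy_attach, toy_damage and toy_commit are
      hypothetical, and collapsing all damage into a single
      bounding box is a simplification chosen for brevity. In the
      real client API the corresponding requests are
      wl_surface_attach, wl_surface_damage and wl_surface_commit.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not part of libwayland. */
struct rect {
    int x, y, width, height;
};

struct toy_surface {
    int attached_buffer;  /* id of the buffer the compositor should use */
    struct rect damage;   /* bounding box of all damage since last commit */
    bool has_damage;
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* "Which buffer to use": remember the buffer that holds the contents. */
void toy_attach(struct toy_surface *s, int buffer_id)
{
    s->attached_buffer = buffer_id;
}

/* "Where it has rendered": grow the pending damage box to cover r. */
void toy_damage(struct toy_surface *s, struct rect r)
{
    assert(r.width > 0 && r.height > 0);
    if (!s->has_damage) {
        s->damage = r;
        s->has_damage = true;
        return;
    }
    int x1 = min_int(s->damage.x, r.x);
    int y1 = min_int(s->damage.y, r.y);
    int x2 = max_int(s->damage.x + s->damage.width, r.x + r.width);
    int y2 = max_int(s->damage.y + s->damage.height, r.y + r.height);
    s->damage = (struct rect){ x1, y1, x2 - x1, y2 - y1 };
}

/* "When": hand the pending damage to the compositor side and reset it.
 * Returns false if nothing changed, so no repaint is needed. */
bool toy_commit(struct toy_surface *s, struct rect *out)
{
    if (!s->has_damage)
        return false;
    *out = s->damage;
    s->has_damage = false;
    return true;
}
```
]]></programlisting>
    <para>
      For example, damaging a 5x5 rectangle at (10, 10) and a
      10x2 rectangle at (12, 20) before committing yields a single
      12x12 damage box at (10, 10) in this model; a second commit
      without new damage reports that nothing needs repainting.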
    </para>

    <para>
      This leaves an application with two ways to update its window contents:
    </para>
    <para>
      <orderedlist>
        <listitem>
          <para>
            Render the new content into a new buffer and tell the compositor
            to use that instead of the old buffer. The application can
            allocate a new buffer every time it needs to update the window
            contents or it can keep two (or more) buffers around and cycle
            between them. The buffer management is entirely under
            application control.
          </para>
        </listitem>
        <listitem>
          <para>
            Render the new content into the buffer that it previously
            told the compositor to use. While it's possible to just
            render directly into the buffer shared with the compositor,
            this might race with the compositor. What can happen is that
            repainting the window contents could be interrupted by the
            compositor repainting the desktop. If the application gets
            interrupted just after clearing the window but before
            rendering the contents, the compositor will texture from a
            blank buffer. The result is that the application window will
            flicker between a blank window and half-rendered content.
            The
            traditional way to avoid this is to render the new content
            into a back buffer and then copy from there into the
            compositor surface. The back buffer can be allocated on the
            fly and just big enough to hold the new content, or the
            application can keep a buffer around. Again, this is under
            application control.
          </para>
        </listitem>
      </orderedlist>
    </para>
    <para>
      In either case, the application must tell the compositor
      which area of the surface holds new contents. When the
      application renders directly to the shared buffer, the
      compositor needs to be notified that there is new content.
      But also when exchanging buffers, the compositor doesn't
      assume anything changed, and needs a request from the
      application before it will repaint the desktop. The idea is
      that even if an application passes a new buffer to the
      compositor, only a small part of the buffer may be
      different, like a blinking cursor or a spinner.
    </para>
  </section>
  <section id="sect-Wayland-Architecture-wayland_hw_enabling">
    <title>Hardware Enabling for Wayland</title>
    <para>
      Typically, hardware enabling includes modesetting/display
      and EGL/GLES2. On top of that Wayland needs a way to share
      buffers efficiently between processes.
      There are two sides
      to that: the client side and the server side.
    </para>
    <para>
      On the client side we've defined a Wayland EGL platform. In
      the EGL model, that consists of the native types
      (EGLNativeDisplayType, EGLNativeWindowType and
      EGLNativePixmapType) and a way to create those types. In
      other words, it's the glue code that binds the EGL stack and
      its buffer sharing mechanism to the generic Wayland API. The
      EGL stack is expected to provide an implementation of the
      Wayland EGL platform. The full API is in the wayland-egl.h
      header. The open source implementation in the mesa EGL stack
      is in wayland-egl.c and platform_wayland.c.
    </para>
    <para>
      Under the hood, the EGL stack is expected to define a
      vendor-specific protocol extension that lets the client-side
      EGL stack communicate buffer details with the compositor in
      order to share buffers. The point of the wayland-egl.h API
      is to abstract that away and just let the client create an
      EGLSurface for a Wayland surface and start rendering. The
      open source stack uses the drm Wayland extension, which lets
      the client discover the drm device to use, authenticate,
      and then share drm (GEM) buffers with the compositor.
    </para>
    <para>
      The server side of Wayland is the compositor and core UX for
      the vertical, typically integrating task switcher, app
      launcher and lock screen in one monolithic application. The
      server runs on top of a modesetting API (kernel modesetting,
      OpenWF Display or similar) and composites the final UI using
      a mix of EGL/GLES2 compositing and hardware overlays if
      available. Enabling modesetting, EGL/GLES2 and overlays is
      something that should be part of standard hardware bring-up.
      The extra requirement for Wayland enabling is the
      EGL_WL_bind_wayland_display extension that lets the
      compositor create an EGLImage from a generic Wayland shared
      buffer. It's similar to the EGL_KHR_image_pixmap extension,
      which creates an EGLImage from an X pixmap.
    </para>
    <para>
      The extension has a setup step where you have to bind the
      EGL display to a Wayland display. Then as the compositor
      receives generic Wayland buffers from the clients (typically
      when the client calls eglSwapBuffers), it will be able to
      pass the struct wl_buffer pointer to eglCreateImageKHR as
      the EGLClientBuffer argument and with EGL_WAYLAND_BUFFER_WL
      as the target. This will create an EGLImage, which can then
      be used by the compositor as a texture or passed to the
      modesetting code to use as an overlay plane.
      Again, this is
      implemented by the vendor-specific protocol extension, which
      on the server side will receive the driver-specific details
      about the shared buffer and turn that into an EGL image when
      the user calls eglCreateImageKHR.
    </para>
  </section>
</chapter>