<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Wayland-Architecture">
  <title>Wayland Architecture</title>
  <section id="sect-Wayland-Architecture-wayland_architecture">
    <title>X vs. Wayland Architecture</title>
    <para>
      A good way to understand the Wayland architecture
      and how it differs from X is to follow an event
      from the input device to the point where the change
      it causes appears on screen.
    </para>
    <para>
      This is where we are now with X:
    </para>
    <figure>
      <title>X architecture diagram</title>
      <mediaobjectco>
	<imageobjectco>
	  <areaspec id="map1" units="other" otherunits="imagemap">
	    <area id="area1_1" linkends="x_flow_1" x_steal="#step_1"/>
	    <area id="area1_2" linkends="x_flow_2" x_steal="#step_2"/>
	    <area id="area1_3" linkends="x_flow_3" x_steal="#step_3"/>
	    <area id="area1_4" linkends="x_flow_4" x_steal="#step_4"/>
	    <area id="area1_5" linkends="x_flow_5" x_steal="#step_5"/>
	    <area id="area1_6" linkends="x_flow_6" x_steal="#step_6"/>
	  </areaspec>
	  <imageobject>
	    <imagedata fileref="images/x-architecture.png" format="PNG" />
	  </imageobject>
	</imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
	<listitem id="x_flow_1">
	  <para>
	    The kernel gets an event from an input
	    device and sends it to X through the evdev
	    input driver. The kernel does all the hard
	    work here by driving the device and
	    translating the different device-specific
	    event protocols to the Linux evdev input
	    event standard.
	  </para>
	</listitem>
	<listitem id="x_flow_2">
	  <para>
	    The X server determines which window the
	    event affects and sends it to the clients
	    that have selected for the event in question
	    on that window. The X server doesn't
	    actually know how to do this right, since
	    the window location on screen is controlled
	    by the compositor and may be transformed in
	    a number of ways that the X server doesn't
	    understand (scaled down, rotated, wobbling,
	    etc.).
	  </para>
	</listitem>
	<listitem id="x_flow_3">
	  <para>
	    The client looks at the event and decides
	    what to do. Often the UI will have to change
	    in response to the event - perhaps a check
	    box was clicked or the pointer entered a
	    button that must be highlighted. Thus the
	    client sends a rendering request back to the
	    X server.
	  </para>
	</listitem>
	<listitem id="x_flow_4">
	  <para>
	    When the X server receives the rendering
	    request, it sends it to the driver to let it
	    program the hardware to do the rendering.
	    The X server also calculates the bounding
	    region of the rendering, and sends that to
	    the compositor as a damage event.
	  </para>
	</listitem>
	<listitem id="x_flow_5">
	  <para>
	    The damage event tells the compositor that
	    something changed in the window and that it
	    has to recomposite the part of the screen
	    where that window is visible. The compositor
	    is responsible for rendering the entire
	    screen contents based on its scenegraph and
	    the contents of the X windows. Yet, it has
	    to go through the X server to render this.
	  </para>
	</listitem>
	<listitem id="x_flow_6">
	  <para>
	    The X server receives the rendering requests
	    from the compositor and either copies the
	    compositor back buffer to the front buffer
	    or does a page flip. In the general case, the
	    X server has to do this step so it can
	    account for overlapping windows, which may
	    require clipping, and determine whether or
	    not it can page flip. However, for a
	    compositor, which is always fullscreen, this
	    is another unnecessary context switch.
	  </para>
	</listitem>
      </orderedlist>
    </para>
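    <para>
      Step 4 says that the X server calculates the bounding region
      of a rendering request and reports it as damage. As a rough
      illustration only, the sketch below computes a single bounding
      rectangle for a set of drawn rectangles. The
      <literal>Rect</literal> type and the
      <literal>bounding_region</literal> helper are hypothetical
      names, not part of X or any of its extensions, and a real
      server may track more precise regions than one rectangle.
    </para>
    <programlisting language="python"><![CDATA[
```python
from typing import Iterable, NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle in window coordinates (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def bounding_region(drawn: Iterable[Rect]) -> Optional[Rect]:
    """Return the smallest rectangle that covers every drawn rectangle."""
    rects = [r for r in drawn if r.width > 0 and r.height > 0]
    if not rects:
        return None  # nothing was drawn, so there is no damage to report
    left = min(r.x for r in rects)
    top = min(r.y for r in rects)
    right = max(r.x + r.width for r in rects)
    bottom = max(r.y + r.height for r in rects)
    return Rect(left, top, right - left, bottom - top)


# Two drawing operations that belong to one rendering request.
damage = bounding_region([Rect(10, 10, 20, 5), Rect(25, 12, 30, 30)])
print(damage)
```
]]></programlisting>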
    <para>
      As suggested above, there are a few problems with this
      approach. The X server doesn't have the information to
      decide which window should receive the event, nor can it
      transform the screen coordinates to window-local
      coordinates. And even though X has handed responsibility for
      the final painting of the screen to the compositing manager,
      X still controls the front buffer and modesetting. Most of
      the complexity that the X server used to handle is now
      available in the kernel or self-contained libraries (KMS,
      evdev, Mesa, fontconfig, freetype, cairo, Qt, etc.). In
      general, the X server is now just a middleman that
      introduces an extra step between applications and the
      compositor and an extra step between the compositor and the
      hardware.
    </para>
    <para>
      In Wayland the compositor is the display server. We transfer
      the control of KMS and evdev to the compositor. The Wayland
      protocol lets the compositor send the input events directly
      to the clients and lets the client send the damage event
      directly to the compositor:
    </para>
    <figure>
      <title>Wayland architecture diagram</title>
      <mediaobjectco>
	<imageobjectco>
	  <areaspec id="mapB" units="other" otherunits="imagemap">
	    <area id="areaB_1" linkends="wayland_flow_1" x_steal="#step_1"/>
	    <area id="areaB_2" linkends="wayland_flow_2" x_steal="#step_2"/>
	    <area id="areaB_3" linkends="wayland_flow_3" x_steal="#step_3"/>
	    <area id="areaB_4" linkends="wayland_flow_4" x_steal="#step_4"/>
	  </areaspec>
	  <imageobject>
	    <imagedata fileref="images/wayland-architecture.png" format="PNG" />
	  </imageobject>
	</imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
	<listitem id="wayland_flow_1">
	  <para>
	    The kernel gets an event and sends
	    it to the compositor. This
	    is similar to the X case, which is
	    great, since we get to reuse all the
	    input drivers in the kernel.
	  </para>
	</listitem>
	<listitem id="wayland_flow_2">
	  <para>
	    The compositor looks through its
	    scenegraph to determine which window
	    should receive the event. The
	    scenegraph corresponds to what's on
	    screen and the compositor
	    understands the transformations that
	    it may have applied to the elements
	    in the scenegraph. Thus, the
	    compositor can pick the right window
	    and transform the screen coordinates
	    to window-local coordinates by
	    applying the inverse
	    transformations. The types of
	    transformation that can be applied
	    to a window are restricted only by
	    what the compositor can do, as long
	    as it can compute the inverse
	    transformation for the input events.
	  </para>
	</listitem>
	<listitem id="wayland_flow_3">
	  <para>
	    As in the X case, when the client
	    receives the event, it updates the
	    UI in response. But in the Wayland
	    case, the rendering happens in the
	    client, and the client just sends a
	    request to the compositor to
	    indicate the region that was
	    updated.
	  </para>
	</listitem>
	<listitem id="wayland_flow_4">
	  <para>
	    The compositor collects damage
	    requests from its clients and then
	    recomposites the screen. The
	    compositor can then directly issue
	    an ioctl to schedule a pageflip with
	    KMS.
	  </para>
	</listitem>
      </orderedlist>
    </para>
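    <para>
      Step 2 of the Wayland flow can be sketched as follows. This is
      a minimal model, not compositor code: the
      <literal>Window</literal> class, its fields and the
      <literal>pick</literal> function are hypothetical, and the
      sketch assumes each window is placed with a translation, a
      uniform non-zero scale and a rotation. The compositor walks
      the scenegraph from top to bottom, applies the inverse of each
      window's transformation to the screen coordinates, and
      delivers the event to the first window whose local rectangle
      contains the result.
    </para>
    <programlisting language="python"><![CDATA[
```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Window:
    """Scenegraph entry; every field here is illustrative only."""
    name: str
    width: float
    height: float
    tx: float = 0.0      # translation applied by the compositor
    ty: float = 0.0
    scale: float = 1.0   # uniform scale factor, assumed non-zero
    angle: float = 0.0   # rotation in radians

    def to_local(self, sx: float, sy: float) -> Tuple[float, float]:
        """Undo translate, rotate and scale to get window-local coordinates."""
        dx, dy = sx - self.tx, sy - self.ty
        cos_a, sin_a = math.cos(-self.angle), math.sin(-self.angle)
        rx = dx * cos_a - dy * sin_a
        ry = dx * sin_a + dy * cos_a
        return rx / self.scale, ry / self.scale

    def contains_local(self, x: float, y: float) -> bool:
        return 0 <= x < self.width and 0 <= y < self.height


def pick(scenegraph: List[Window], sx: float, sy: float
         ) -> Optional[Tuple[Window, Tuple[float, float]]]:
    """Return the topmost window under (sx, sy) and the local coordinates.

    The scenegraph is ordered bottom to top, so it is searched in reverse.
    """
    for window in reversed(scenegraph):
        local = window.to_local(sx, sy)
        if window.contains_local(*local):
            return window, local
    return None


# A full-size terminal below a browser that is drawn at half size.
scene = [
    Window("terminal", 200, 100),
    Window("browser", 400, 300, tx=100, ty=50, scale=0.5),
]
print(pick(scene, 150, 100))
```
]]></programlisting>
    <para>
      Because the browser in this example is drawn at half size, a
      pointer offset of 50 pixels on screen corresponds to 100
      pixels inside the window. The client only ever sees the
      window-local value.
    </para>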
  </section>
  <section id="sect-Wayland-Architecture-wayland_rendering">
    <title>Wayland Rendering</title>
    <para>
      One of the details left out of the above overview
      is how clients actually render under Wayland. By
      removing the X server from the picture we also
      removed the mechanism by which X clients typically
      render. But there's another mechanism that we're
      already using with DRI2 under X: direct rendering.
      With direct rendering, the client and the server
      share a video memory buffer. The client links to a
      rendering library such as OpenGL that knows how to
      program the hardware and renders directly into the
      buffer. The compositor in turn can take the buffer
      and use it as a texture when it composites the
      desktop. After the initial setup, the client only
      needs to tell the compositor which buffer to use and
      when and where it has rendered new content into it.
    </para>

    <para>
      This leaves an application with two ways to update its window contents:
    </para>
    <para>
      <orderedlist>
	<listitem>
	  <para>
	    Render the new content into a new buffer and tell the compositor
	    to use that instead of the old buffer. The application can
	    allocate a new buffer every time it needs to update the window
	    contents or it can keep two (or more) buffers around and cycle
	    between them. The buffer management is entirely under
	    application control.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Render the new content into the buffer that it previously
	    told the compositor to use. While it's possible to just
	    render directly into the buffer shared with the compositor,
	    this might race with the compositor. What can happen is that
	    repainting the window contents could be interrupted by the
	    compositor repainting the desktop. If the application gets
	    interrupted just after clearing the window but before
	    rendering the contents, the compositor will texture from a
	    blank buffer. The result is that the application window will
	    flicker between a blank window and half-rendered content. The
	    traditional way to avoid this is to render the new content
	    into a back buffer and then copy from there into the
	    compositor surface. The back buffer can be allocated on the
	    fly and just big enough to hold the new content, or the
	    application can keep a buffer around. Again, this is under
	    application control.
	  </para>
	</listitem>
      </orderedlist>
    </para>
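    <para>
      The first approach, cycling between two buffers, might look
      roughly like the sketch below. All class and method names are
      hypothetical stand-ins rather than Wayland API. The sketch
      also assumes the compositor reads only from the most recently
      attached buffer, so it leaves out any handshake that tells the
      application when an old buffer is safe to reuse.
    </para>
    <programlisting language="python"><![CDATA[
```python
from itertools import cycle
from typing import Optional


class Buffer:
    """A block of pixel memory shared with the compositor (toy model)."""

    def __init__(self, name: str, size: int) -> None:
        self.name = name
        self.pixels = bytearray(size)


class Compositor:
    """Stand-in compositor that remembers which buffer to texture from."""

    def __init__(self) -> None:
        self.current: Optional[Buffer] = None

    def attach(self, buffer: Buffer) -> None:
        self.current = buffer


class Application:
    """Keeps two buffers around and alternates between them."""

    def __init__(self, compositor: Compositor, size: int = 4) -> None:
        self.compositor = compositor
        self._buffers = cycle([Buffer("A", size), Buffer("B", size)])

    def update(self, value: int) -> Buffer:
        # With two buffers in rotation, the next one is never the buffer
        # that the compositor was most recently told to use.
        back = next(self._buffers)
        back.pixels[:] = bytes([value]) * len(back.pixels)
        # Hand the buffer over only after the frame is complete, so the
        # compositor never textures from a half-rendered frame.
        self.compositor.attach(back)
        return back


compositor = Compositor()
app = Application(compositor)
for frame in (1, 2, 3):
    shown = app.update(frame)
    print(shown.name, bytes(shown.pixels))
```
]]></programlisting>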
    <para>
      In either case, the application must tell the compositor
      which area of the surface holds new contents. When the
      application renders directly to the shared buffer, the
      compositor needs to be notified that there is new content.
      But also when exchanging buffers, the compositor doesn't
      assume anything changed, and needs a request from the
      application before it will repaint the desktop. The idea is
      that even if an application passes a new buffer to the
      compositor, only a small part of the buffer may be
      different, like a blinking cursor or a spinner.
    </para>
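    <para>
      To make the damage rule concrete, here is a toy model of a
      surface that repaints nothing until the application reports a
      changed area. The <literal>Surface</literal> class, its
      methods and the <literal>repaint</literal> function are
      illustrative assumptions, not the actual protocol. Clipping
      the reported area to the surface size is likewise a choice
      made for this sketch.
    </para>
    <programlisting language="python"><![CDATA[
```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


class Surface:
    """Toy surface: holds one buffer plus the damage reported so far."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.buffer: Optional[str] = None
        self._pending: List[Rect] = []

    def attach(self, buffer: str) -> None:
        # A new buffer alone does not imply that anything changed.
        self.buffer = buffer

    def damage(self, x: int, y: int, width: int, height: int) -> None:
        # Clip the reported area to the surface and drop empty results.
        x0, y0 = max(x, 0), max(y, 0)
        x1 = min(x + width, self.width)
        y1 = min(y + height, self.height)
        if x1 > x0 and y1 > y0:
            self._pending.append((x0, y0, x1 - x0, y1 - y0))

    def take_damage(self) -> List[Rect]:
        damage, self._pending = self._pending, []
        return damage


def repaint(surface: Surface) -> List[Rect]:
    """Return the areas the compositor would redraw for this surface."""
    return surface.take_damage()


surface = Surface(640, 480)
surface.attach("frame-1")
print(repaint(surface))            # no damage reported, nothing to redraw

surface.attach("frame-2")
surface.damage(600, 470, 100, 100)  # e.g. a spinner near the corner
print(repaint(surface))
```
]]></programlisting>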
  </section>
  <section id="sect-Wayland-Architecture-wayland_hw_enabling">
    <title>Hardware Enabling for Wayland</title>
    <para>
      Typically, hardware enabling includes modesetting/display
      and EGL/GLES2. On top of that, Wayland needs a way to share
      buffers efficiently between processes. There are two sides
      to that: the client side and the server side.
    </para>
    <para>
      On the client side we've defined a Wayland EGL platform. In
      the EGL model, that consists of the native types
      (EGLNativeDisplayType, EGLNativeWindowType and
      EGLNativePixmapType) and a way to create those types. In
      other words, it's the glue code that binds the EGL stack and
      its buffer sharing mechanism to the generic Wayland API. The
      EGL stack is expected to provide an implementation of the
      Wayland EGL platform. The full API is in the wayland-egl.h
      header. The open source implementation in the Mesa EGL stack
      is in wayland-egl.c and platform_wayland.c.
    </para>
    <para>
      Under the hood, the EGL stack is expected to define a
      vendor-specific protocol extension that lets the client-side
      EGL stack communicate buffer details with the compositor in
      order to share buffers. The point of the wayland-egl.h API
      is to abstract that away and just let the client create an
      EGLSurface for a Wayland surface and start rendering. The
      open source stack uses the drm Wayland extension, which lets
      the client discover the drm device to use, authenticate,
      and then share drm (GEM) buffers with the compositor.
    </para>
    <para>
      The server side of Wayland is the compositor and core UX for
      the vertical, typically integrating the task switcher, app
      launcher, and lock screen in one monolithic application. The
      server runs on top of a modesetting API (kernel modesetting,
      OpenWF Display or similar) and composites the final UI using
      a mix of EGL/GLES2 compositing and hardware overlays if
      available. Enabling modesetting, EGL/GLES2 and overlays is
      something that should be part of standard hardware bring-up.
      The extra requirement for Wayland enabling is the
      EGL_WL_bind_wayland_display extension that lets the
      compositor create an EGLImage from a generic Wayland shared
      buffer. It's similar to the EGL_KHR_image_pixmap extension,
      which creates an EGLImage from an X pixmap.
    </para>
    <para>
      The extension has a setup step where you have to bind the
      EGL display to a Wayland display. Then as the compositor
      receives generic Wayland buffers from the clients (typically
      when the client calls eglSwapBuffers), it will be able to
      pass the struct wl_buffer pointer to eglCreateImageKHR as
      the EGLClientBuffer argument, with EGL_WAYLAND_BUFFER_WL
      as the target. This will create an EGLImage, which can then
      be used by the compositor as a texture or passed to the
      modesetting code to use as an overlay plane. Again, this is
      implemented by the vendor-specific protocol extension, which
      on the server side will receive the driver-specific details
      about the shared buffer and turn that into an EGLImage when
      the user calls eglCreateImageKHR.
    </para>
  </section>
</chapter>