TP2: Dynamically Generated Cacheable xDS Resources
----
* Author(s): markdroth, htuch
* Approver: htuch
* Implemented in: <xDS client, ...>
* Last updated: 2022-02-09

## Abstract

This xRFC proposes a new mechanism to allow xDS servers to
dynamically generate the contents of xDS resources for individual
clients while at the same time preserving cacheability. Unlike the
context parameter mechanism that is part of the new xDS naming scheme (see
[xRFC TP1](TP1-xds-transport-next.md)), the mechanism described in
this proposal is visible only to the transport protocol layer, not to the
data model layer. This means that if a resource has a parameter that
affects its contents, that parameter is not part of the resource's name,
which means that any other resources that refer to the resource do not
need to encode the parameter. Therefore, use of these parameters is
not viral, thus making the mechanism much easier to use.

## Background

There are many use-cases where a control plane may need to
dynamically generate the contents of xDS resources to tailor the
resources for individual clients. One common case is where the
server has a list of routes to configure, but individual routes in
the list may be included or excluded based on the client's dynamic
selection parameters (today, conveyed as node metadata). Thus,
the server needs to generate a slightly different version of the
`RouteConfiguration` for clients based on the parameters they send. (See
https://cloud.google.com/traffic-director/docs/configure-advanced-traffic-management#config-filtering-metadata
for an example.)

The new xDS naming scheme described in [xRFC TP1](TP1-xds-transport-next.md)
provides a mechanism called context parameters, which is intended to move all
parameters that affect resource contents into the resource name, thus adding
cacheability to the xDS ecosystem.
However, this approach means that these
parameters become part of the resource graph on an individual client, which
causes a number of problems:
- Dynamic context parameters are viral, spreading from a given resource
  to all earlier resources in the resource graph. For example, if
  two variants of an EDS resource are needed, there need to be two
  different instances of the resource with different names,
  distinguished by a context parameter. But because the contents of the
  CDS resource include the name of the corresponding EDS resource,
  that means that we also need two different versions of the CDS
  resource, also distinguished by the same context parameter. And then
  we need two different versions of the RDS resource, since that needs
  to refer to the CDS resource. And then two different versions of the
  LDS resource, which refers to the RDS resource. This causes a
  combinatorial explosion in the number of resources needed, and it adds
  complexity to xDS servers, which need to construct the right variants
  of every resource and make sure that they refer to each other using
  the right names.
- In the new xDS naming scheme, context parameters are exact-match-only.
  This means that if a control plane wants to provide the same resource
  both with and without a given parameter, it needs to publish two
  versions of the resource, each with a different name, even though the
  contents are the same, which can also cause unnecessarily poor cache
  performance. For example, in the "dynamic route selection" use-case,
  let's say that every client uses two different dynamic selection
  parameters, `env` (which can have one of the values `prod`, `canary`, or
  `test`) and `version` (which can have one of the values `v1`, `v2`, or
  `v3`).
  Now let's say that there is a `RouteConfiguration` with one route
  that should be selected via the parameter `env=prod` and another route that
  should be selected via the parameter `version=v1`. This means that there
  are only four variants of the `RouteConfiguration` resource (`{env!=prod,
  version!=v1}`, `{env=prod, version!=v1}`, `{env!=prod, version=v1}`, and
  `{env=prod, version=v1}`). However, the exact-match semantics means
  that there will have to be nine different versions of this resource,
  one for each combination of values of the two parameters.

### Related Proposals:
* [xRFC TP1: new xDS naming scheme](TP1-xds-transport-next.md)

## Proposal

This document proposes an alternative approach. We start with the
observation that resource names are used in two places:

- The **transport protocol** layer, which needs to identify the right
  resource contents to send for a given resource name, often obtaining
  those resource contents from a cache.
- The **resource graph** used on an individual client, where there are a
  set of data model resources that refer to each other by name. For
  example, a `RouteConfiguration` refers to individual `Cluster` resources
  by name.

The use-cases that we're aware of for dynamic resource selection have
an important property that we can take advantage of. When multiple
variants of a given resource exist, any given client will only ever use
one of those variants at a given time. That means that the parameters
that affect which variant of the resource is used are required by the
transport protocol, but they are not required by the client's data model.

It should be noted that caching xDS proxies, unlike "leaf" clients, will
need to track multiple variants of each resource, since a given caching
proxy may be serving clients that need different variants of a given
resource.
However, since caching xDS proxies deal with resources only
at the transport protocol layer, the resource graph layer is
essentially irrelevant in that case.

### Dynamic Parameters

With the above property in mind, this document proposes the following
data structures:
- **Dynamic parameters**, which are a set of key/value pairs sent by the
  client when subscribing to a resource.
- **Dynamic parameter constraints**, which are a set of criteria that
  can be used to determine whether a set of dynamic parameters matches
  the constraints. These constraints are considered part of the unique
  identifier for an xDS resource (along with the resource name itself)
  on xDS servers, xDS clients, and xDS caching proxies. This provides a
  mechanism to represent multiple variants of a given resource in a
  cacheable way.

Both of these data structures are used in the xDS transport protocol,
but they are not part of the resource name and therefore do not appear as
part of the resource graph.

When a client subscribes to a resource, it specifies a set of dynamic
parameters. In response, the server will send a resource whose dynamic
parameter constraints match the dynamic parameters in the subscription
request. A client that subscribes to multiple variants of a resource (such
as a caching xDS proxy) will use the dynamic parameter constraints on the
returned resource to determine which of its subscriptions the resource is
associated with.

Dynamic parameters, unlike context parameters, will not be
exact-match-only. Dynamic parameter constraints will be able to represent
certain simple types of flexible matching, such as matching an exact
value or the existence of a key, and simple AND and OR combinations
of constraints. This flexible matching semantic means that there may be
ambiguities when determining which resources match which subscriptions,
which are discussed below.

#### Constraints Representation

Dynamic parameter constraints will be represented in protobuf form as follows:

```proto
message DynamicParameterConstraints {
  // A single constraint for a given key.
  message SingleConstraint {
    message Exists {}
    // The key to match against.
    string key = 1;
    // How to match.
    oneof constraint_type {
      // Matches this exact value.
      string value = 2;
      // Key is present (matches any value except for the key being absent).
      Exists exists = 3;
    }
  }

  message ConstraintList {
    repeated DynamicParameterConstraints constraints = 1;
  }

  oneof type {
    // A single constraint to evaluate.
    SingleConstraint constraint = 1;

    // A list of constraints to be ORed together.
    ConstraintList or_constraints = 2;

    // A list of constraints to be ANDed together.
    ConstraintList and_constraints = 3;

    // The inverse (NOT) of a set of constraints.
    DynamicParameterConstraints not_constraints = 4;
  }
}
```

#### Background: xDS Client and Server Architecture

Before discussing where dynamic parameter matching is performed, it is
useful to provide some additional background on xDS client and server
architecture, independent of this design.

The xDS transport protocol is fundamentally a mechanism that matches up
subscriptions provided by a client with resources provided by a server.
The client controls what it is subscribing to at any given time,
and the server must send the resources from its database that match the
currently active subscriptions.

An xDS server may be thought of as containing a database of resources,
in which each resource has an associated list of clients that are currently
subscribed to that resource.
Whenever a client subscribes to a resource,
the server will send the current version of that resource to the client,
and it will add the client to the list of clients currently subscribed to
that resource. Whenever the server receives a new version of that resource
in its database, it will send the update to all clients that are currently
subscribed to that resource. Whenever a client unsubscribes from a
resource, it is removed from the list of clients subscribed to that
resource, so that the server knows not to send it subsequent updates for
that resource.

This same paradigm of matching up subscriptions with resources actually
applies to the xDS client as well. Because the xDS transport protocol
does not require a server to resend a resource unless its contents have
changed, clients need to cache the most recently seen value locally in
case they need it again. In general, the best way to structure an xDS
transport protocol client is as an API where the caller can start or
stop subscribing to a given resource at any time, and the xDS client will
handle the wire-level communication and cache the resources returned by
the server. The cache in the xDS client functions very similarly to the
database in an xDS server: each cache entry contains the current value
of the resource received from the xDS server and a list of subscribers to
that resource. When the xDS client sees the first subscription start for
a given resource, it will create the cache entry for that resource, add
the subscriber to the list of subscribers for that resource, and request
that resource from the xDS server. When it receives the resource from
the server, it will store the resource in the cache entry and deliver
it to all subscribers.
When the xDS client sees a second subscription
start for the same resource, it will add the new subscriber to the list
of subscribers for that resource and immediately deliver the cached value
of the resource to the new subscriber. Whenever the server sends an
updated version of the resource, the xDS client will deliver the update
to all subscribers. When all subscriptions are stopped, the xDS client
will unsubscribe from the resource on the wire, so that the xDS server
knows to stop sending updates for that resource to the client.

In effect, the logic in an xDS client is essentially the same as that in an
xDS server, with only two differences. First, subscriptions come from local
API callers instead of downstream RPC clients. And second, the database does
not contain the authoritative source of the resource contents but rather cached
values obtained from the server, and the database entries are removed when
the last subscription for a given resource is stopped.

The logic in a caching xDS proxy is also essentially the same as that in an xDS
server, with only one difference. Just like an xDS client, the database
does not contain the authoritative source of the resource contents but
rather cached values obtained from the server. However, like an xDS
server, subscriptions do come from downstream RPC clients rather than local
API callers.
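
The subscription bookkeeping described above for an xDS client can be
sketched as follows. This is only an illustration under stated assumptions:
the `ClientCache` class, its method names, the integer watcher IDs, and the
use of plain strings for resource contents are all hypothetical and do not
come from any existing xDS implementation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch of the cache inside an xDS client. Resource contents
// are plain strings here; a real client would store protobuf messages.
class ClientCache {
 public:
  using Watcher = std::function<void(const std::string& contents)>;

  // Starts a subscription. Returns true if this is the first subscriber for
  // the resource, meaning the caller must request it from the xDS server.
  bool Subscribe(const std::string& name, int watcher_id, Watcher watcher) {
    assert(watcher != nullptr);
    Entry& entry = entries_[name];
    const bool first = entry.watchers.empty();
    // A later subscriber immediately receives the cached value, if any.
    if (entry.contents.has_value()) watcher(*entry.contents);
    entry.watchers[watcher_id] = std::move(watcher);
    return first;
  }

  // Stores a resource received from the server and notifies all subscribers.
  void OnResourceReceived(const std::string& name,
                          const std::string& contents) {
    auto it = entries_.find(name);
    if (it == entries_.end()) return;  // No longer subscribed; ignore.
    it->second.contents = contents;
    for (auto& [id, watcher] : it->second.watchers) watcher(contents);
  }

  // Stops a subscription. Returns true if it was the last one, meaning the
  // caller must unsubscribe on the wire. The cache entry is then removed.
  bool Unsubscribe(const std::string& name, int watcher_id) {
    auto it = entries_.find(name);
    if (it == entries_.end()) return false;
    it->second.watchers.erase(watcher_id);
    if (!it->second.watchers.empty()) return false;
    entries_.erase(it);
    return true;
  }

 private:
  struct Entry {
    std::optional<std::string> contents;
    std::map<int, Watcher> watchers;
  };
  std::map<std::string, Entry> entries_;
};
```

The boolean return values stand in for the wire-level actions: `true` from
`Subscribe()` corresponds to sending a request upstream, and `true` from
`Unsubscribe()` corresponds to unsubscribing on the wire.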

The following table summarizes this structure:

<table>

  <tr>
    <th>xDS Node Type</th>
    <th>Source of Subscriptions</th>
    <th>Source of Resource Contents</th>
  </tr>

  <tr>
    <td>xDS Server</td>
    <td>downstream xDS clients</td>
    <td>authoritative data</td>
  </tr>

  <tr>
    <td>xDS Client</td>
    <td>local API callers</td>
    <td>cached data from upstream xDS server</td>
  </tr>

  <tr>
    <td>xDS Caching Proxy</td>
    <td>downstream xDS clients</td>
    <td>cached data from upstream xDS server</td>
  </tr>

</table>

#### Where Dynamic Parameter Matching is Performed

Because of the architecture described above, evaluation of matching between
a set of dynamic parameters and a set of constraints may need to be
performed by both xDS servers and xDS clients.

xDS servers that support multiple variants of a resource perform this
matching when deciding which variant of a given resource to return for a
given subscription request. xDS servers that support multiple variants of
a resource MUST send the dynamic parameter constraints associated with a
resource variant to the client along with that variant. Any server
implementation that fails to do so is in violation of this specification.

xDS caching proxies that support multiple variants of a resource also
perform this matching when deciding which variant of a given resource to
return for a given subscription request. Caching proxies MUST store the
dynamic parameter constraints obtained from the upstream server along with
each resource variant, which they will use when deciding which variant of a
given resource to return for a given subscription request from a downstream
xDS client. Caching proxies MUST send those dynamic parameter constraints to
the downstream client when sending that variant of the resource.
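
The matching that servers and caching proxies perform can be sketched as a
recursive evaluation over the constraint tree. The sketch below uses a
hand-written `Constraints` struct as a stand-in for the
`DynamicParameterConstraints` proto; that struct, the helper functions, and
the choice to treat an empty AND list as "no constraints" are assumptions
made for illustration, not part of this proposal.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using DynamicParameters = std::map<std::string, std::string>;

// Hypothetical stand-in for the DynamicParameterConstraints proto.
struct Constraints {
  enum class Type { kSingle, kOr, kAnd, kNot };
  // Assumption: a default-constructed value (empty AND) matches everything.
  Type type = Type::kAnd;
  std::string key;                    // kSingle only.
  std::optional<std::string> value;   // kSingle: nullopt means "exists".
  std::vector<Constraints> children;  // kOr, kAnd, or kNot (exactly one).
};

bool Matches(const Constraints& c, const DynamicParameters& params) {
  switch (c.type) {
    case Constraints::Type::kSingle: {
      auto it = params.find(c.key);
      if (it == params.end()) return false;  // Key absent: no match.
      return !c.value.has_value() || it->second == *c.value;
    }
    case Constraints::Type::kOr:
      for (const Constraints& child : c.children) {
        if (Matches(child, params)) return true;
      }
      return false;
    case Constraints::Type::kAnd:
      for (const Constraints& child : c.children) {
        if (!Matches(child, params)) return false;
      }
      return true;
    case Constraints::Type::kNot:
      assert(c.children.size() == 1);
      return !Matches(c.children.front(), params);
  }
  return false;
}

// Convenience constructors for the sketch.
Constraints Exact(std::string key, std::string value) {
  Constraints c;
  c.type = Constraints::Type::kSingle;
  c.key = std::move(key);
  c.value = std::move(value);
  return c;
}

Constraints Exists(std::string key) {
  Constraints c;
  c.type = Constraints::Type::kSingle;
  c.key = std::move(key);
  return c;  // value stays nullopt, meaning "key is present".
}

Constraints Combine(Constraints::Type type, std::vector<Constraints> children) {
  Constraints c;
  c.type = type;
  c.children = std::move(children);
  return c;
}
```

Note that keys present in the parameters but absent from the constraints
never cause a mismatch in this evaluation, since only constrained keys are
ever looked up.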

Note this design assumes that a given leaf client will use a fixed set of
dynamic parameters, typically configured in a local bootstrap file, for all
subscriptions over its lifetime. Given that, it is not strictly necessary
for a leaf client to perform this matching, since it should only ever
receive a single variant of a given resource, which should always match the
dynamic parameters it subscribed with. However, clients MAY perform this
matching, which may be useful in cases where the same cache implementation
is used on both a leaf client and a caching proxy.

It is important to note that the dynamic parameter matching behavior becomes
an inherent part of the xDS transport protocol. xDS servers that interact
only with leaf clients may be tempted not to send dynamic parameter
constraints to the client along with the chosen resource variant, and
leaf clients may accept that. However, as soon as that server wants to
start interacting with a caching proxy or a client that does verify the
constraints, it will run into problems. xDS server implementors are
strongly encouraged not to omit the dynamic parameter constraints in their
responses.

#### Example: Basic Dynamic Parameters Usage

Let's say that the clients are currently categorized by the parameter
`env`, whose value is either `prod` or `test`.
So any given client will
send one of the following sets of dynamic parameters:
- `{env=prod}`
- `{env=test}`

Now let's say that the server has two variants of a given resource, and
the variants have the following dynamic parameter constraints:

```textproto
// For {env=prod}
{constraint:{key:"env" value:"prod"}}

// For {env=test}
{constraint:{key:"env" value:"test"}}
```

When a client subscribes to this resource with dynamic parameters
`{env=prod}`, the server will return the first variant; when a client
subscribes to this resource with dynamic parameters `{env=test}`, the
server will return the second variant. When the client receives the
returned resource, it will verify that the dynamic parameters it sent
match the constraints of the returned resource.

#### Unconstrained Parameters

Note that clients may send dynamic parameters that are not specified in
the constraints on the resulting resource. If a set of constraints does
not specify any constraint for a given parameter sent by the client, that
parameter does not prevent the constraints from matching. This allows
clients to add new parameters before a server begins using them.
(In general, we expect clients to send a lot of keys that may not
actually be used by the server, since deployments often divide their
clients into categories before they have a need to differentiate the
configs for those categories.)

Continuing the example above, if the server wanted to send the same
contents for a given resource to both `{env=prod}` and `{env=test}` clients,
it would have only a single variant of that resource, and that variant would
not have any constraints. The server would therefore send that variant to
all clients, and the clients would consider it a match for the dynamic
parameters that they subscribed with.

#### Example: Transition Scenarios

Consider what happens in transition scenarios, where a deployment initially
groups its clients on a single key but then wants to add a second key.
The second key needs to be added both in the constraints on the server
side and in the clients' configurations, but those two changes cannot
occur atomically.

Let's start with the above example where the clients are already divided into
`env=prod` and `env=test`. Let's say that now the deployment wants to add
an additional key called `version`, whose value will be either `v1` or `v2`,
so that it can further subdivide its clients' configs.

The first step is to add the new key on the clients, so that any given client
will send one of the following sets of dynamic parameters:
- `{env=prod, version=v1}`
- `{env=prod, version=v2}`
- `{env=test, version=v1}`
- `{env=test, version=v2}`

At this point, the server still does not have a variant of any resource
that has constraints for the `version` key; it has only variants that
differentiate between `env=prod` and `env=test`. But the addition of
the new key on the clients will not affect which resource variant is
sent to each client, because it does not affect the matching. Clients
sending `{env=prod, version=v1}` or `{env=prod, version=v2}` will both get
the resource variant for `env=prod`, and clients sending
`{env=test, version=v1}` or `{env=test, version=v2}` will both get the
resource variant for `env=test`.

Once the clients have all been updated to send the new key, then the
server can be updated to have different resource variants based on the
`version` key.
For example, it may replace the single resource variant
for `env=prod` with the following two variants:

```textproto
// For {env=prod, version=v1}
{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"version" value:"v1"}}
]}

// For {env=prod, version=v2}
{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"version" value:"v2"}}
]}
```

Once that change happens on the server, the clients will start getting
the correct variant of the resource based on their `version` key.

Note that in order to avoid causing matching ambiguity, the server must
handle this kind of change by sending the deletion of the original resource
variant and the creation of the replacement resource variants in a
single xDS response. This will allow the client to atomically apply the
change to its database. For any given subscriber, the client should
present the change as if there was only one variant of the resource and
that variant had just been updated.

#### Matching Ambiguity

As mentioned above, this design does introduce the possibility of
matching ambiguity in certain cases, where there may be more than one
variant of a resource that matches the dynamic parameters specified by
the client.

If an xDS transport protocol implementation does encounter multiple
possible matching variants of a resource, its behavior is undefined.
In the following sections, we evaluate the cases where that can occur
and specify how each one will be addressed.

##### Adding a New Key on the Server First

Consider what would happen in the above transition scenario if we changed
the server to have multiple variants of a resource differentiated by
the new `version` key before all of the clients were upgraded to use
that key.
For clients sending `{env=prod}`, there would be two possible
matching variants of the resource, one for `version=v1` and another for
`version=v2`, and there would be no way to determine which variant to
use for that client.

As stated above, we are optimizing for the case where new keys are added
on clients first, since that is expected to be the common scenario.
However, there may be cases where it is not feasible to have all clients
start sending a new key before the server needs to start making use of
that key.

For example, let's say that this transition scenario is occurring in
an environment where the xDS server is controlled by one team and the
clients are controlled by various other teams, so it's not feasible to
force all clients to start sending the new `version` key all at once.
But there is one particular client team that is eager to start using
the new `version` key to differentiate the configs of their clients,
and they don't want to wait for all of the other client teams to start
sending the new key.

Consider what happens if the server simply adds a variant of the
resource with the new key, while leaving the original resource variant
in place:

```textproto
// Existing variant for older clients that are not yet sending the
// version key.
{constraint:{key:"env" value:"prod"}}

// New variant intended for clients sending the version key.
{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"version" value:"v1"}}
]}
```

This will work fine for older clients that are not yet sending the
`version` key, because their dynamic parameters will not match the new
variant's constraints. However, newer clients that are sending dynamic
parameters `{env=prod, version=v1}` will run into ambiguity: those
parameters can match either of the above variants of the resource.
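
The ambiguity above can be reproduced with a small variant-selection sketch.
To keep it short, the sketch assumes a simplified constraint model in which
each variant's constraints are an AND of exact-match key/value pairs (no
`exists`, OR, or NOT). The `Variant`, `SelectStatus`, and `SelectVariant`
names are hypothetical. Since the behavior on multiple matches is undefined
in this design, reporting an explicit "ambiguous" status is just one possible
choice for an implementation.

```cpp
#include <map>
#include <string>
#include <vector>

using DynamicParameters = std::map<std::string, std::string>;

// Simplifying assumption: constraints are an AND of exact-match pairs.
using ExactConstraints = std::map<std::string, std::string>;

struct Variant {
  std::string contents;
  ExactConstraints constraints;
};

enum class SelectStatus { kFound, kNotFound, kAmbiguous };

struct SelectResult {
  SelectStatus status;
  const Variant* variant;  // Non-null only when status is kFound.
};

// Every constrained key must be present with the required value. Keys that
// the client sends but the variant does not constrain are ignored.
bool Matches(const ExactConstraints& constraints,
             const DynamicParameters& params) {
  for (const auto& [key, value] : constraints) {
    auto it = params.find(key);
    if (it == params.end() || it->second != value) return false;
  }
  return true;
}

// Returns the single matching variant, or reports that none or several match.
SelectResult SelectVariant(const std::vector<Variant>& variants,
                           const DynamicParameters& params) {
  const Variant* match = nullptr;
  for (const Variant& variant : variants) {
    if (!Matches(variant.constraints, params)) continue;
    if (match != nullptr) return {SelectStatus::kAmbiguous, nullptr};
    match = &variant;
  }
  if (match == nullptr) return {SelectStatus::kNotFound, nullptr};
  return {SelectStatus::kFound, match};
}
```

With the two variants shown above, a client sending `{env=prod}` selects the
existing variant, while a client sending `{env=prod, version=v1}` matches
both and gets `kAmbiguous`.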

This situation will be avoided via a best practice that all authoritative
xDS servers should have **all variants of a given resource specify
constraints for the same set of keys**.

In order to make this work for the case where the server starts sending
the constraint on the new key before all clients are sending it, we
provide the `exists` matcher, which will allow the server to specify
a default explicitly for clients that are not yet sending a new key.
In this example, the server would actually have the following two
variants:

```textproto
// Existing variant for older clients that are not yet sending the
// version key.
{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {not_constraints:
    {constraint:{key:"version" exists:{}}}
  }
]}

// New variant for clients sending the version key.
{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"version" value:"v1"}}
]}
```

This allows maintaining the requirement that all variants of a given
resource have constraints on the same set of keys, while also allowing
the server to explicitly provide a result for older clients that do not
yet send the new key.

##### Variants With Overlapping Constraint Values

There is also a possible ambiguity that can occur if a server provides
multiple variants of a resource whose constraints for a given key
overlap in terms of the values they can match. For example, let's say
that a server has the following two variants of a resource:

```textproto
// Matches {env=prod} or {env=test}.
{or_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"env" value:"test"}}
]}

// Matches {env=qa} or {env=test}.
{or_constraints:[
  {constraint:{key:"env" value:"qa"}},
  {constraint:{key:"env" value:"test"}}
]}
```

Now consider what happens if a client subscribes with dynamic parameters
`{env=test}`. Those dynamic parameters can match either of the above
variants of the resource.

This situation will be avoided via a best practice that all authoritative
xDS servers should have **all variants of a given resource specify
non-overlapping constraints for the same set of keys**. Control planes
must not accept a set of resources that violates this requirement.

#### Matching Behavior and Best Practices

We advise deployments to avoid ambiguity through the following best practices:
- Whenever there are multiple variants of a resource, all variants must
  list the same set of keys. This allows the server to ignore keys sent
  by the client that do not affect the choice of variant, without
  causing ambiguity in cache misses. Servers may use the
  `exists` mechanism to provide backward compatibility for clients that
  are not yet sending a newly added key.
- The constraints on each variant of a given resource must be mutually
  exclusive. For example, if one variant of a resource matches a given key
  with values "foo" or "bar", and another variant matches that same key
  with values "bar" or "baz", that would cause ambiguity, because both
  variants would match the value "bar".
- There must be a variant of the resource for every value of a key that is
  going to be present. For example, if clients will send the `env` key
  with one of the values `prod`, `test`, or `qa`, then you must have a
  variant of the resource that matches each of those three values. (Failure
  to do this will result in the server acting as if the requested
  resource does not exist.)
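
A control plane could enforce the first two practices with a check along the
following lines. This is a minimal sketch under a simplifying assumption:
each variant's constraints are an AND of exact-match key/value pairs, so two
variants over the same key set overlap only if they are identical. The
`ValidateVariants` function and its error strings are hypothetical, and a
full implementation would also need to reason about `exists`, OR, and NOT.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Simplifying assumption: constraints are an AND of exact-match pairs.
using ExactConstraints = std::map<std::string, std::string>;

// Returns the set of keys that a variant constrains.
std::set<std::string> Keys(const ExactConstraints& constraints) {
  std::set<std::string> keys;
  for (const auto& [key, value] : constraints) keys.insert(key);
  return keys;
}

// Returns std::nullopt if the variants follow the best practices that can
// be checked in this simplified model, or a description of the violation.
std::optional<std::string> ValidateVariants(
    const std::vector<ExactConstraints>& variants) {
  std::set<ExactConstraints> seen;
  for (const ExactConstraints& variant : variants) {
    // Practice 1: every variant lists the same set of keys.
    if (Keys(variant) != Keys(variants.front())) {
      return std::string("variants do not constrain the same set of keys");
    }
    // Practice 2: constraints are mutually exclusive. With exact-match
    // constraints over identical key sets, overlap means a duplicate.
    if (!seen.insert(variant).second) {
      return std::string("two variants have overlapping constraints");
    }
  }
  return std::nullopt;
}
```

The third practice (a variant for every value that clients will send) cannot
be checked from the variants alone, since it depends on knowing the set of
values in use across the deployment.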

#### Transport Protocol Changes

The following message will be added to represent a subscription to a
resource by name with associated dynamic parameters:

```proto
// A specification of a resource used when subscribing or unsubscribing.
message ResourceLocator {
  // The resource name to subscribe to.
  string name = 1;

  // A set of dynamic parameters used to match against the dynamic parameter
  // constraints on the resource. This allows clients to select between
  // multiple variants of the same resource.
  map<string, string> dynamic_parameters = 2;
}
```

The following new field will be added to `DiscoveryRequest`, to allow clients
to specify dynamic parameters when subscribing to a resource:

```proto
  // Alternative to resource_names field that allows specifying dynamic
  // parameters along with each resource name. Clients that populate this field
  // must be able to handle responses from the server where resources are
  // wrapped in a Resource message.
  repeated ResourceLocator resource_locators = 7;
```

Similarly, the following fields will be added to `DeltaDiscoveryRequest`:

```proto
  // Alternative to resource_names_subscribe field that allows specifying
  // dynamic parameters along with each resource name.
  repeated ResourceLocator resource_locators_subscribe = 8;

  // Alternative to resource_names_unsubscribe field that allows specifying
  // dynamic parameters along with each resource name.
  repeated ResourceLocator resource_locators_unsubscribe = 9;
```

The following message will be added to represent the name of a specific
variant of a resource:

```proto
// Specifies a concrete resource name.
message ResourceName {
  // The name of the resource.
  string name = 1;

  // Dynamic parameter constraints associated with this resource.
  // To be used by client-side caches (including xDS proxies) when matching
  // subscribed resource locators.
  DynamicParameterConstraints dynamic_parameter_constraints = 2;
}
```

The following field will be added to the `Resource` message, to allow the
server to return the dynamic parameter constraints associated with each
resource:

```proto
  // Alternative to the *name* field, to be used when the server supports
  // multiple variants of the named resource that are differentiated by
  // dynamic parameter constraints.
  // Only one of *name* or *resource_name* may be set.
  ResourceName resource_name = 8;
```

And finally, the following field will be added to `DeltaDiscoveryResponse`:

```proto
  // Alternative to removed_resources that allows specifying which variant of
  // a resource is being removed. This variant must be used for any resource
  // for which dynamic parameter constraints were sent to the client.
  repeated ResourceName removed_resource_names = 8;
```

### Client Configuration

Client configuration is outside of the scope of this design. However,
this section lists some considerations for client implementors to take
into account.

#### Configuring Dynamic Parameters

Each leaf client should have a way of configuring the dynamic parameters
that it sends.

For old-style resource names (those not using the new `xdstp` URI
scheme from [xRFC TP1](TP1-xds-transport-next.md)), clients should
send the same set of dynamic parameters for all resource subscriptions.
The client's configuration should allow setting these default dynamic
parameters globally.

For new-style resource names, clients should send the same set of
dynamic parameters for all resource subscriptions in a given authority.
The client's configuration should allow setting the dynamic parameters to
use for each authority.
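
One way a client might choose which dynamic parameters to send for a given
resource name is sketched below. Since client configuration is out of scope
for this design, everything here is hypothetical: the
`DynamicParameterConfig` struct, the function names, and the choice to send
no parameters for an authority that has no configured entry are assumptions
made purely for illustration.

```cpp
#include <map>
#include <string>

using DynamicParameters = std::map<std::string, std::string>;

// Hypothetical client-side configuration.
struct DynamicParameterConfig {
  // Used for old-style (non-xdstp) resource names.
  DynamicParameters defaults;
  // Used for xdstp resource names, keyed by authority.
  std::map<std::string, DynamicParameters> per_authority;
};

// If `name` is of the form "xdstp://authority/...", stores the authority
// (which may be empty) in `*authority` and returns true.
bool ParseXdstpAuthority(const std::string& name, std::string* authority) {
  static const std::string kPrefix = "xdstp://";
  if (name.compare(0, kPrefix.size(), kPrefix) != 0) return false;
  const std::string::size_type end = name.find('/', kPrefix.size());
  *authority = name.substr(
      kPrefix.size(),
      end == std::string::npos ? std::string::npos : end - kPrefix.size());
  return true;
}

// Returns the dynamic parameters to send when subscribing to `name`.
DynamicParameters ParametersFor(const DynamicParameterConfig& config,
                                const std::string& name) {
  std::string authority;
  if (!ParseXdstpAuthority(name, &authority)) return config.defaults;
  auto it = config.per_authority.find(authority);
  // Assumption: an authority without an entry gets no dynamic parameters.
  if (it == config.per_authority.end()) return DynamicParameters{};
  return it->second;
}
```

Keeping the lookup in one function means every subscription in a given
authority gets the same parameters, which is the property the text above
asks for.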

#### Migrating From Node Metadata

Today, the equivalent of dynamic parameters is node metadata,
which can be used by servers to determine the set of resources to send
for LDS and CDS wildcard subscriptions or to determine the contents of
other resources (e.g., to select individual routes to be included in an
RDS resource). For transition purposes, this mechanism can continue
to be supported by the client performing direct translation of node
metadata to dynamic parameters.

Any given xDS client may support either or both of these mechanisms.

### Considerations for Implementations

This specification does not prescribe implementation details for xDS
clients or servers. However, for illustration purposes, this section
describes how a naive implementation might be structured.

The database of an xDS server or cache of an xDS client can be thought
of as a map, keyed by resource type and resource name. Prior to this
specification, the value of the map would have been the current value of the
resource and a list of subscribers that need to be updated when the
resource changes. In C++ syntax, the data structure might look like this:

```c++
#include <map>
#include <optional>
#include <set>
#include <string>

#include <google/protobuf/any.pb.h>

// Represents a subscriber (either a downstream xDS client or a local API caller).
class Subscriber {
 public:
  // ...
};

struct DatabaseEntry {
  // Current contents of resource.
  // Whenever this changes, the change will be sent to all subscribers.
  std::optional<google::protobuf::Any> resource_contents;

  // Current list of subscribers.
  // Entries are added and removed as subscriptions are started and stopped.
  std::set<Subscriber*> subscribers;
};

using Database =
    std::map<std::string /*resource_type*/,
             std::map<std::string /*resource_name*/, DatabaseEntry>>;
```

This design does not change the key structure of the map, but it does
change the structure of the value of the map. In particular, instead of
storing a single value for the resource contents, it will need to store
multiple values, keyed by the associated dynamic parameter constraints.
And for each subscriber, it will need to store the dynamic parameters that
the subscriber specified. In a naive implementation (not optimized at all),
the modified data structure may look like this:

```c++
// Represents a subscriber (either a downstream xDS client or a local API caller).
class Subscriber {
 public:
  // ...

  // Returns the dynamic parameters specified for the subscription.
  DynamicParameters dynamic_parameters() const;
};

struct DatabaseEntry {
  // Resource contents for each variant of the resource, keyed by
  // dynamic parameter constraints.
  // Whenever a given variant of the resource changes, the change will be
  // sent to all subscribers whose dynamic parameters match the constraints
  // of the resource variant that changed.
  std::map<DynamicParameterConstraints,
           std::optional<google::protobuf::Any>> resource_contents;

  // Current list of subscribers.
  // Entries are added and removed as subscriptions are started and stopped.
  std::set<Subscriber*> subscribers;
};
```

When a variant of a resource is updated, the variant is stored in the map
based on its dynamic parameter constraints. The implementation will then
iterate through the list of subscribers, sending the updated resource
variant and its dynamic parameter constraints to each subscriber whose
dynamic parameters match those constraints.
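
The update path described above might be sketched as follows. This is a
minimal, unoptimized sketch under stated assumptions, not a prescribed
implementation: `Constraint` is a hypothetical, simplified stand-in for the
`DynamicParameterConstraints` proto, `Matches()` reflects one reading of the
matching semantics (a key/value constraint matches only when the subscriber
sent that key with that value), variants are keyed by a hypothetical
caller-supplied string instead of the constraints themselves, and "sending"
is modeled by appending to a per-subscriber vector.

```c++
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using DynamicParameters = std::map<std::string, std::string>;

// Hypothetical, simplified stand-in for DynamicParameterConstraints.
struct Constraint {
  enum class Kind { kKeyValue, kAnd, kOr, kNot };
  Kind kind = Kind::kKeyValue;
  std::string key;                   // Used only for kKeyValue.
  std::string value;                 // Used only for kKeyValue.
  std::vector<Constraint> children;  // kAnd/kOr: any number; kNot: exactly one.
};

Constraint KeyValue(std::string key, std::string value) {
  Constraint c;
  c.key = std::move(key);
  c.value = std::move(value);
  return c;
}

Constraint Not(Constraint inner) {
  Constraint c;
  c.kind = Constraint::Kind::kNot;
  c.children.push_back(std::move(inner));
  return c;
}

Constraint And(std::vector<Constraint> children) {
  Constraint c;
  c.kind = Constraint::Kind::kAnd;
  c.children = std::move(children);
  return c;
}

// Returns true if the subscriber's parameters satisfy the constraints.
bool Matches(const Constraint& c, const DynamicParameters& params) {
  switch (c.kind) {
    case Constraint::Kind::kKeyValue: {
      const auto it = params.find(c.key);
      return it != params.end() && it->second == c.value;
    }
    case Constraint::Kind::kAnd:
      for (const Constraint& child : c.children) {
        if (!Matches(child, params)) return false;
      }
      return true;
    case Constraint::Kind::kOr:
      for (const Constraint& child : c.children) {
        if (Matches(child, params)) return true;
      }
      return false;
    case Constraint::Kind::kNot:
      assert(c.children.size() == 1);
      return !Matches(c.children.front(), params);
  }
  return false;
}

struct Subscriber {
  DynamicParameters dynamic_parameters;
  std::vector<std::string> received;  // Stand-in for updates sent on the wire.
};

struct Variant {
  Constraint constraints;
  std::string contents;  // Stand-in for the serialized resource.
};

struct DatabaseEntry {
  std::map<std::string /*hypothetical variant key*/, Variant> variants;
  std::set<Subscriber*> subscribers;
};

// Stores the updated variant, then notifies every matching subscriber.
void UpdateVariant(DatabaseEntry& entry, const std::string& variant_key,
                   Constraint constraints, std::string contents) {
  Variant& variant = entry.variants[variant_key];
  variant.constraints = std::move(constraints);
  variant.contents = std::move(contents);
  for (Subscriber* subscriber : entry.subscribers) {
    if (Matches(variant.constraints, subscriber->dynamic_parameters)) {
      subscriber->received.push_back(variant.contents);
    }
  }
}
```

For example, updating a variant constrained by
`And({KeyValue("env", "prod"), Not(KeyValue("version", "v1"))})` would reach a
subscriber that sent `env=prod, version=v2` but not one that sent
`env=test, version=v1`.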

A more optimized implementation may instead choose to store a separate list
of subscribers for each resource variant, thus avoiding the need to perform
matching for every subscriber upon every update of a resource variant.
However, this would require moving subscribers from one variant to another
whenever the dynamic parameter constraints change on the resource variants.

### Example

This section shows how the mechanism described in this proposal can be
used to address the use-case described in the "Background" section above.

Let's say that every client uses two different dynamic selection
parameters, `env` (which can have one of the values `prod`, `canary`,
or `test`) and `version` (which can have one of the values `v1`, `v2`,
or `v3`). Now let's say that there is a `RouteConfiguration` with one
route that should be selected via the parameter `env=prod` and another
route that should be selected via the parameter `version=v1`. Without
this design, the server would need to actually provide the cross-product
of these parameter values, so there would be 9 different variants of the
resource, even though there are only 4 unique contents for the resource.
However, this design instead allows the server to provide only the 4
unique variants of the resource, with constraints allowing each client
to get the appropriate one:

<table>
  <tr>
    <th>Dynamic Parameter Constraints on Resource</th>
    <th>Resource Contents</th>
  </tr>

  <tr>
    <td>
<code>{and_constraints:[
  {not_constraints:
    {constraint:{key:"env" value:"prod"}}
  },
  {not_constraints:
    {constraint:{key:"version" value:"v1"}}
  }
]}</code>
    </td>
    <td>
      <ul>
      <li>does <i>not</i> include the route for <code>env=prod</code>
      <li>does <i>not</i> include the route for <code>version=v1</code>
      </ul>
    </td>
  </tr>

  <tr>
    <td>
<code>{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {not_constraints:
    {constraint:{key:"version" value:"v1"}}
  }
]}</code>
    </td>
    <td>
      <ul>
      <li>does include the route for <code>env=prod</code>
      <li>does <i>not</i> include the route for <code>version=v1</code>
      </ul>
    </td>
  </tr>

  <tr>
    <td>
<code>{and_constraints:[
  {not_constraints:
    {constraint:{key:"env" value:"prod"}}
  },
  {constraint:{key:"version" value:"v1"}}
]}</code>
    </td>
    <td>
      <ul>
      <li>does <i>not</i> include the route for <code>env=prod</code>
      <li>does include the route for <code>version=v1</code>
      </ul>
    </td>
  </tr>

  <tr>
    <td>
<code>{and_constraints:[
  {constraint:{key:"env" value:"prod"}},
  {constraint:{key:"version" value:"v1"}}
]}</code>
    </td>
    <td>
      <ul>
      <li>does include the route for <code>env=prod</code>
      <li>does include the route for <code>version=v1</code>
      </ul>
    </td>
  </tr>

</table>

## Rationale

This section documents limitations and design alternatives that we
considered.

### Limitation on Enhancing Matching in the Future

One limitation of this design is that, because all xDS transport protocol
implementations (clients, servers, and caching proxies) need to implement
this matching behavior, it will be very difficult to add new matching
behavior in the future. Doing so will probably require some sort of
client capability. This will make it feasible to expand this mechanism
in an environment where all of the caching xDS proxies are under centralized
control, but it will be quite difficult to deploy those changes in
environments that depend on distributed third-party caching xDS proxies.

Because of this, reviewers of this design are encouraged to carefully
scrutinize the proposed matching semantics to ensure that they meet our
expected needs.

### Complexity of Constraint Expressions

Although the `DynamicParameterConstraints` proto allows specifying
arbitrarily nested combinations of AND, OR, and NOT expressions, control
planes do not need to actually support that full arbitrary power. It is
possible to limit the sets of supported constraints to (e.g.) a
simple flat list of AND or OR expressions, which would make it easier
for a control plane to optimize its implementation.

Similarly, caching xDS proxies may be able to provide an optimized
implementation if all of the constraints that they see are limited to
some subset of the full flexibility allowed by the protocol. However,
any general-purpose caching proxy implementation will likely need to
support a less optimized implementation that does support the full
flexibility allowed by the protocol.

### Using Context Parameters

We considered extending the context parameter mechanism from [xRFC
TP1](TP1-xds-transport-next.md) to support flexible matching semantics,
rather than its current exact-match semantics.
However, that approach had
some downsides:
- It would not have solved the virality problem described in the "Background"
  section above.
- It would have made the new xDS naming scheme a prerequisite for using
  the dynamic resource selection mechanism. (The mechanism described in
  this doc is completely independent of the new xDS naming scheme; it can
  be used with the legacy xDS naming scheme as well.)

### Stricter Matching to Avoid Ambiguity

We could avoid much of the matching ambiguity described above by saying that
a set of constraints must specify all keys present in the subscription
request in order to match. However, this would mean that if the client
starts subscribing with a new key before the corresponding constraint is
added on the resources on the server, then it will fail to match the
existing resources. In other words, the process would be:

1. Add a variant of all resources on the server side with a constraint
   for `version=v1` (in addition to all existing constraints).
2. Change clients to start sending the new key.
3. When all clients are updated, remove the resource variants that do
   *not* have the new key.

This will effectively require adding new keys on the server side first,
which seems like a large burden on users. It also seems fairly tricky
for most users to get the exactly correct set of dynamic parameters on
each resource variant, and if they fail to do it right, they will break
their existing configuration.

Ultimately, although this approach is more semantically precise, it is
also considered too rigid and difficult for users to work with.

## Implementation

TBD (Will probably be implemented in gRPC before Envoy)

## Open issues (if applicable)

N/A