Load Balancing in gRPC
======================

# Scope

This document explains the design for load balancing within gRPC.

# Background

Load-balancing within gRPC happens on a per-call basis, not a
per-connection basis. In other words, even if all requests come from a
single client, we still want them to be load-balanced across all servers.

# Architecture

## Overview

The gRPC client supports an API that allows load balancing policies to
be implemented and plugged into gRPC.
An LB policy is responsible for:
- receiving updated configuration and list of server addresses from the
  resolver
- creating subchannels for the server addresses and managing their
  connectivity behavior
- setting the overall [connectivity state](connectivity-semantics-and-api.md)
  (usually computed by aggregating the connectivity states of its subchannels)
  of the channel
- for each RPC sent on the channel, determining which subchannel to send
  the RPC on

There are a number of LB policies provided with gRPC. The most
notable ones are `pick_first` (the default), `round_robin`, and
`grpclb`. There are also a number of additional LB policies to support
[xDS](grpc_xds_features.md), although they are not currently configurable
directly.

## Workflow

Load-balancing policies fit into the gRPC client workflow in between
name resolution and the connection to the server. Here's how it all
works:

1. On startup, the gRPC client issues a [name resolution](naming.md) request
   for the server name. The name will resolve to a list of IP addresses,
   a [service config](service_config.md) that indicates which client-side
   load-balancing policy to use (e.g., `round_robin` or `grpclb`) and
   provides a configuration for that policy, and a set of attributes
   (channel args in C-core).
2. The client instantiates the load balancing policy and passes it its
   configuration from the service config, the list of IP addresses, and
   the attributes.
3. The load balancing policy creates a set of subchannels for the IP
   addresses of the servers (which might be different from the IP
   addresses returned by the resolver; see below). It also watches the
   subchannels' connectivity states and decides when each subchannel
   should attempt to connect.
4. For each RPC sent, the load balancing policy decides which
   subchannel (i.e., which server) the RPC should be sent to.

See below for more information on `grpclb`.
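
As an illustration of step 1, the sketch below builds a service config that
selects an LB policy with an empty configuration. It assumes the JSON form of
the service config with its `loadBalancingConfig` field; the helper name
`make_service_config` is hypothetical and is not part of any gRPC API.

```python
import json


def make_service_config(policy_name: str) -> str:
    """Return a service config JSON string selecting one LB policy.

    Hypothetical helper for illustration only. The policy is given an
    empty configuration object.
    """
    config = {"loadBalancingConfig": [{policy_name: {}}]}
    return json.dumps(config)


service_config_json = make_service_config("round_robin")
print(service_config_json)
```

In a real deployment this document would normally be returned by the
resolver alongside the address list, as described in step 1.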

## Load Balancing Policies

### `pick_first`

This is the default LB policy if the service config does not specify any
LB policy. It does not require any configuration.

The `pick_first` policy takes a list of addresses from the resolver. It
attempts to connect to those addresses one at a time, in order, until it
finds one that is reachable. If none of the addresses are reachable, it
sets the channel's state to TRANSIENT_FAILURE while it attempts to
reconnect. Appropriate [backoff](connection-backoff.md) is applied for
repeated connection attempts.

If it is able to connect to one of the addresses, it sets the channel's
state to READY, and then all RPCs sent on the channel will be sent to
that address. If the connection to that address is later broken,
the `pick_first` policy will put the channel into state IDLE, and it
will not attempt to reconnect until the application requests that it
does so (either via the channel's connectivity state API or by sending
an RPC).
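
The connection pass described above can be sketched roughly as follows. This
is a simplified model, not the actual implementation: `pick_first_connect`
and the `try_connect` callback are hypothetical names, and backoff,
reconnection, and the later transition to IDLE are deliberately left out.

```python
from typing import Callable, Optional, Sequence, Tuple


def pick_first_connect(
    addresses: Sequence[str],
    try_connect: Callable[[str], bool],
) -> Tuple[str, Optional[str]]:
    """Try each address in order and stop at the first reachable one.

    Returns the resulting channel state and the selected address.
    `try_connect` is a stand-in (assumption) for a real connection attempt.
    """
    for address in addresses:
        if try_connect(address):
            # All RPCs on the channel would now go to this address.
            return "READY", address
    # No address was reachable on this pass.
    return "TRANSIENT_FAILURE", None


# Usage: only the second address is reachable in this made-up scenario.
reachable = {"10.0.0.2:50051"}
state, chosen = pick_first_connect(
    ["10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"],
    lambda address: address in reachable,
)
print(state, chosen)
```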

### `round_robin`

This LB policy is selected via the service config. It does not require
any configuration.

This policy takes a list of addresses from the resolver. It creates a
subchannel for each of those addresses and constantly monitors the
connectivity state of the subchannels. Whenever a subchannel becomes
disconnected, the `round_robin` policy will ask it to reconnect, with
appropriate connection [backoff](connection-backoff.md).

The policy sets the channel's connectivity state by aggregating the
states of the subchannels:
- If any one subchannel is in READY state, the channel's state is READY.
- Otherwise, if there is any subchannel in state CONNECTING, the channel's
  state is CONNECTING.
- Otherwise, if there is any subchannel in state IDLE, the channel's state is
  IDLE.
- Otherwise, if all subchannels are in state TRANSIENT_FAILURE, the channel's
  state is TRANSIENT_FAILURE.
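
The aggregation rules above can be sketched as a simple priority check. The
function name `aggregate_state` and the use of plain strings for states are
assumptions made for illustration. Returning TRANSIENT_FAILURE for an empty
subchannel list is also an assumption, since the rules above do not cover
that case.

```python
from typing import Iterable


def aggregate_state(subchannel_states: Iterable[str]) -> str:
    """Compute the channel state from subchannel states, in priority order."""
    states = list(subchannel_states)
    # The first state in this priority order held by any subchannel wins.
    for candidate in ("READY", "CONNECTING", "IDLE"):
        if candidate in states:
            return candidate
    # Reached when every subchannel is in TRANSIENT_FAILURE
    # (and, by assumption, when there are no subchannels at all).
    return "TRANSIENT_FAILURE"


print(aggregate_state(["TRANSIENT_FAILURE", "CONNECTING", "IDLE"]))
```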

Note that when a given subchannel reports TRANSIENT_FAILURE, it is
considered to still be in TRANSIENT_FAILURE until it successfully
reconnects and reports READY. In particular, we ignore the transition
from TRANSIENT_FAILURE to CONNECTING.

When an RPC is sent on the channel, the `round_robin` policy will
iterate over all subchannels that are currently in READY state, sending
each successive RPC to the next successive subchannel in the list,
wrapping around to the start of the list when needed.

### `grpclb`

(This policy is deprecated. We recommend using [xDS](grpc_xds_features.md)
instead.)

This LB policy was originally intended as gRPC's primary extensibility
mechanism for load balancing. The intent was that instead of adding new
LB policies directly in the client, the client could implement only
simple algorithms like `round_robin`, and any more complex algorithms
would be provided by a look-aside load balancer.

The client relies on the load balancer to provide _load balancing
configuration_ and _the list of server addresses_ to which the client should
send requests. The balancer updates the server list as needed to balance
the load as well as handle server unavailability or health issues. The
load balancer will make any necessary complex decisions and inform the
client. The load balancer may communicate with the backend servers to
collect load and health information.

The `grpclb` policy uses the addresses returned by the resolver (if any)
as fallback addresses, which are used when it loses contact with the
balancers.

The `grpclb` policy gets the list of addresses of the balancers to talk to
via an attribute returned by the resolver.
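
The fallback behavior can be pictured with the following simplified sketch.
It assumes that "no contact with the balancers" is modeled as the absence of
a balancer-provided server list. The function `choose_backend_addresses` is
hypothetical, and the real policy's handling of timing and reconnection is
not modeled here.

```python
from typing import List, Optional, Sequence


def choose_backend_addresses(
    balancer_server_list: Optional[Sequence[str]],
    resolver_addresses: Sequence[str],
) -> List[str]:
    """Pick the address list the client should send requests to.

    `balancer_server_list` is None when there is no contact with any
    balancer (an assumption of this sketch); in that case the addresses
    returned by the resolver serve as fallback addresses.
    """
    if balancer_server_list is not None:
        return list(balancer_server_list)
    return list(resolver_addresses)


# Usage: with a balancer list available, it takes precedence.
print(choose_backend_addresses(["10.1.0.1:443"], ["10.9.0.1:443"]))
# Usage: without balancer contact, the resolver addresses are used.
print(choose_backend_addresses(None, ["10.9.0.1:443"]))
```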