Improve BalancedPool balancing algorithm #1065

Closed
mcollina opened this issue Oct 18, 2021 · 19 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@mcollina
Member

Currently BalancedPool uses a simple round-robin algorithm if the target upstream is not available.

A more nuanced algorithm could easily produce better performance.

This is a good paper describing a better algorithm: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.3985&rep=rep1&type=pdf.
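
For context, plain round robin amounts to roughly this (a sketch with illustrative names, not undici's actual internals; `available` is an assumed flag):

```js
// Plain round robin over a fixed list of upstreams, skipping any that are
// marked unavailable. Illustrative sketch only, not undici's internals.
class RoundRobin {
  constructor (upstreams) {
    this.upstreams = upstreams
    this.index = -1
  }

  next () {
    // One full scan at most; skip upstreams that are currently unavailable.
    for (let i = 0; i < this.upstreams.length; i++) {
      this.index = (this.index + 1) % this.upstreams.length
      const upstream = this.upstreams[this.index]
      if (upstream.available) {
        return upstream
      }
    }
    return null // every upstream is down
  }
}
```

Every healthy upstream gets the same share of requests regardless of how it is actually performing, which is what a weighted scheme would improve on.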

@mcollina mcollina added enhancement New feature or request good first issue Good for newcomers labels Oct 18, 2021
@jodevsa
Contributor

jodevsa commented Oct 18, 2021

Hi @mcollina

I'd like to work on this one. Can you assign me to it?

@mcollina
Member Author

go for it!

@jodevsa
Contributor

jodevsa commented Oct 19, 2021

@mcollina what is the reason behind introducing BalancedPool? Is it to support distributing requests across multiple upstreams?

@mcollina
Member Author

Yes exactly. There can be a few situations where deploying a separate load balancer in front of a pool of services is not an option.

@jodevsa
Contributor

jodevsa commented Oct 19, 2021

Do you already have an idea what the algorithm should do?

I took a brief look into the paper; the algorithm used to decide the weights assumes that we know the different request classes up front. That isn't the case in our situation, right? I'll dig deeper into the paper in the following days.

I'm also taking a look at this: https://github.com/elastic/elastic-transport-js/blob/main/src/pool/WeightedConnectionPool.ts

It looks like they start each connection with the max weight, and then decrease the weight if the connection errors or times out.

They also have an option to decrease the weight if the HTTP response code is 502, 503, or 504.
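
In other words, the bookkeeping could look roughly like this (a simplified sketch of the idea, not elastic-transport-js's actual code; `MAX_WEIGHT`, `PENALTY`, and the recovery-on-success step are assumptions):

```js
// Assumed constants for illustration only.
const MAX_WEIGHT = 100
const PENALTY = 7

function onResponse (upstream, err, statusCode) {
  if (err || statusCode === 502 || statusCode === 503 || statusCode === 504) {
    // Connection error, timeout, or bad-gateway-style status: lower the weight.
    upstream.weight = Math.max(1, upstream.weight - PENALTY)
  } else {
    // Healthy response: let the upstream recover towards the max weight.
    // (One possible recovery policy; elastic's actual logic may differ.)
    upstream.weight = Math.min(MAX_WEIGHT, upstream.weight + PENALTY)
  }
}
```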

@mcollina
Member Author

That seems like a really good start!

@mcollina
Member Author

@delvedor can surely help reviewing.

@jodevsa
Contributor

jodevsa commented Jul 5, 2022

Hi guys, I finally have a vacation, so hopefully I'll be able to wrap this up 🙏 I was stuck at testing and am currently rebasing after the new factory changes.

I need some help. While writing tests I noticed a behaviour in the algorithm where one server is chosen many times in a row. I'll try to explain it with an example:

Let us assume that we have 2 upstreams, A and B, with an initial weight of 50 for both, and that our dynamic-weight logic applies a penalty of 19 points when an upstream request fails. If B fails, the weight of B drops to 31 while the weight of A stays at 50. The scheduler's current weight starts around 50 and is decreased every N iterations (2 in our case, one per full pass over the servers) by the GCD of all the weights, which here is GCD(50, 31) = 1. This results in the algorithm choosing server A more than 78 times (((50-11)/1) * 2) before it shifts back to server B.

One solution here might be to choose a penalty P where max_weight % P == 0, something like 10; then we'd get a GCD of 10, and B would be chosen after 2 iterations (((50 - 40)/10) * 2).

It would be very helpful if someone else could validate my findings and check whether this is acceptable behaviour. I'm also up for a call if anyone would like to discuss this further :)
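
For reference, the scheduler shape I'm describing looks roughly like this (a sketch of the interleaved weighted round robin from the paper, the same shape elastic-transport-js uses; not the final implementation):

```js
// Euclidean GCD.
function gcd (a, b) { return b === 0 ? a : gcd(b, a % b) }

function createScheduler (weights) {
  let index = -1
  let currentWeight = 0

  return function next () {
    while (true) {
      index = (index + 1) % weights.length
      if (index === 0) {
        // One full pass completed: lower the threshold by the GCD.
        currentWeight -= weights.reduce(gcd)
        if (currentWeight <= 0) currentWeight = Math.max(...weights)
      }
      // Pick the first server whose weight meets the current threshold.
      if (weights[index] >= currentWeight) return index
    }
  }
}

// With weights [50, 31] the GCD is 1, so currentWeight only drops by 1 per
// full pass, and server 0 keeps being selected until the threshold reaches 31.
const next = createScheduler([50, 31])
```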

@mcollina
Member Author

mcollina commented Jul 5, 2022

I think 78 times is not that bad, from my understanding.

If we are under low load, always picking the same server is not a problem. If we are under high load, 78 requests is not a long wait: less than a second if we are sending around 100 req/sec.

How does the math look for 3 or 5 servers?

@jodevsa
Contributor

jodevsa commented Jul 6, 2022

I've written a test runner as part of the balanced-pool.js tests to make it easy to exercise the algorithm. Please note the results vary with the starting weight per server and the error penalty.

Here is a successful test with 3 servers, a starting weight of 100, an error penalty of 7, and server A failing on the first request:

  {
    // number of requests to send using the undici client
    iterations: 100,
    startingWeightPerServer: 100,
    // the error penalty in case the server was down or we got a socket hang-up
    errorPenalty: 7,
    config: [{ server: 'A', simulateUpstreamDownOnIterations: [0] }, { server: 'B' }, { server: 'C' }],
    // sequence of successfully served requests. Notice that A failed on iteration 0
    // (simulateUpstreamDownOnIterations: [0]) and the algorithm picked it again after 14 requests
    expected: ['B', 'C', 'B', 'C', 'B', 'C', 'B', 'C', 'B', 'C', 'B', 'C', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    // the number of connection-refused errors the undici client saw; expected,
    // since server A simulates an upstream being down on iteration 0
    expectedConnectionRefusedErrors: 1,
    expectedSocketErrors: 0,
    // out of the 100 iterations, A was called 29% of the time, B 35%, and C 35%
    expectedRatios: [0.29, 0.35, 0.35]
  },

Here is how it looks with 3 servers and no failing requests:

  {
    iterations: 100,
    startingWeightPerServer: 100,
    errorPenalty: 7,
    config: [{ server: 'A' }, { server: 'B' }, { server: 'C' }],
    expected: ['A', 'B', 'C', 'A', 'B', 'C'],
    expectedConnectionRefusedErrors: 0,
    expectedSocketErrors: 0,
    expectedRatios: [0.34, 0.33, 0.33]
  },

The load is distributed equally across the 3 servers.

@jodevsa
Contributor

jodevsa commented Jul 6, 2022

I'm not sure whether we should let the user configure errorPenalty and startingWeightPerServer, or define static values ourselves.

@mcollina
Member Author

mcollina commented Jul 6, 2022

It's a good idea to make it configurable with a good default
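
Something like the following shape, perhaps (the option names here are hypothetical, not a committed API):

```js
import { BalancedPool } from 'undici'

const pool = new BalancedPool(
  ['http://upstream-a:3000', 'http://upstream-b:3000'],
  {
    // hypothetical option names with defaults shown inline
    maxWeightPerServer: 100, // the startingWeightPerServer discussed above
    errorPenalty: 7 // weight subtracted on connection error or timeout
  }
)
```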

@jodevsa
Contributor

jodevsa commented Jul 6, 2022

@mcollina do we have an estimate of how many upstreams would typically be added? Like a max of 100?

I'm asking because the algorithm currently suffers from a case where the complexity can be up to O(maxWeight) if the GCD of all the weights is 1. We could optimise that to O(N) by either calculating the next lower weight directly from currentWeight, or keeping the logic the same and making sure we never end up with weights whose GCD is 1.

I've also noticed that the elastic-transport-js library has a bug due to this assumption: https://github.com/elastic/elastic-transport-js/blob/main/src/pool/WeightedConnectionPool.ts#L54

    // we should be able to find the next node in 1 array scan,
    // if we don't, it means that we are in an infinite loop

This is wrong, especially when the GCD is 1; it's why the algorithm in the paper uses a while (true) loop. In the worst case we need to loop maxWeight times to find the next lower weight.
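
One possible shape for the O(N) variant mentioned above (jumping straight to the next lower weight in a single scan instead of decrementing by the GCD; a sketch, not a final implementation):

```js
// Find the largest weight strictly below currentWeight in one pass,
// wrapping around to the max weight when nothing is lower.
function nextLowerWeight (weights, currentWeight) {
  let next = 0
  for (const w of weights) {
    if (w < currentWeight && w > next) next = w
  }
  return next === 0 ? Math.max(...weights) : next
}

// With weights [50, 31] and currentWeight 50, this jumps the threshold
// directly to 31, skipping the 19 GCD-sized decrements of the naive loop.
```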

@mcollina
Member Author

mcollina commented Jul 6, 2022

I would not expect more than 100 peers.

@jodevsa
Contributor

jodevsa commented Jul 7, 2022

I'm seeing a weird behaviour where I have 3 upstreams (A, B, C) and only one of them, C, goes down. After C is down, upstreams A and B also emit a disconnect event (.on('disconnect')) with a socket idle timeout. Does this ring a bell @mcollina? If not, I'll try to prepare a test that reproduces the problem.

@mcollina
Member Author

mcollina commented Jul 7, 2022

Looks like a bug to me.

@jodevsa
Contributor

jodevsa commented Jul 7, 2022

My bad, it seems related to the test server's keep-alive timeout configuration :(

@jodevsa
Contributor

jodevsa commented Oct 14, 2022

Hi @mcollina, can we close this issue now?

@mcollina
Member Author

yes
