
[RIP‐70] Optimizing Lock Mechanisms

lizhimins edited this page Feb 12, 2025 · 1 revision

Status

Background & Motivation

What do we need to do

  • Design flexible lock optimization strategies and optimize message sending and receiving logic to improve message sending and processing performance.

Why should we do that

  • As concurrent systems grow more complex internally, effective lock management becomes key to preserving performance. The way locks are used in RocketMQ's concurrent code leaves room for optimization. The current locks are critical for ensuring consistency and preventing race conditions, but they could be refined to raise overall message throughput without significantly degrading other aspects of performance. In practice, we have observed that adjusting the lock strategy affects RocketMQ's message-sending performance: merely changing the backoff strategy of SpinLock can produce a performance difference of 20% or more between the best and worst cases.

Goals

  • What problem is this proposal designed to solve?

    1. Design locking strategies that cope with different levels of concurrency pressure
    2. Optimize the message sending and receiving logic
    3. Design an adaptive locking mechanism
  • To what degree should we solve the problem?

    1. Under different lock contention conditions, the adaptive lock selects an appropriate locking mechanism based on the critical-section size and the degree of contention.
    2. Design the structure for converting between locking mechanisms so that switching mechanisms cannot cause a deadlock or a lock failure.
    3. Optimize the message sending and receiving logic
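
To make the selection step in item 1 concrete, here is a minimal sketch of how a lock might be chosen from the two measured inputs. All names (`LockKind`, `ContentionStats`, `LockSelector`) and both thresholds are hypothetical and are not taken from the RocketMQ code base. The mapping itself is an assumed heuristic: spin when sections are short and contention is low, add backoff as contention rises, and block when sections are long.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical set of mechanisms the adaptive lock could choose from.
enum LockKind { SPIN, BACKOFF_SPIN, MUTEX }

// Collects the two inputs named in the proposal: critical-section size and contention.
final class ContentionStats {
    private final LongAdder acquisitions = new LongAdder();
    private final LongAdder contended = new LongAdder();
    private final LongAdder heldNanos = new LongAdder();

    // Called once per completed critical section.
    void record(boolean wasContended, long nanosHeld) {
        acquisitions.increment();
        if (wasContended) {
            contended.increment();
        }
        heldNanos.add(nanosHeld);
    }

    // Average time spent inside the critical section, in nanoseconds.
    long avgCriticalNanos() {
        long n = acquisitions.sum();
        return n == 0 ? 0 : heldNanos.sum() / n;
    }

    // Fraction of acquisitions that had to wait for another holder.
    double contentionRatio() {
        long n = acquisitions.sum();
        return n == 0 ? 0.0 : (double) contended.sum() / n;
    }
}

final class LockSelector {
    // Assumed thresholds, for illustration only; real values would need tuning.
    static final long SHORT_SECTION_NANOS = 1_000;
    static final double LOW_CONTENTION = 0.1;

    static LockKind choose(long avgCriticalNanos, double contentionRatio) {
        if (avgCriticalNanos > SHORT_SECTION_NANOS) {
            return LockKind.MUTEX;          // long sections: blocking avoids wasted spinning
        }
        return contentionRatio <= LOW_CONTENTION
                ? LockKind.SPIN             // short and quiet: plain spinning is cheapest
                : LockKind.BACKOFF_SPIN;    // short but busy: back off between attempts
    }
}
```

A caller would feed `ContentionStats` from the lock's acquire and release paths and periodically pass the two aggregates to `LockSelector.choose`.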

Non-Goals

  • Are there any limits of this proposal?
    1. The calculations performed by the adaptive mechanism may be affected by hardware fluctuations.
    2. In special scenarios the optimization results may not be ideal, so supporting tools are needed to update the related configuration.

Changes

Architecture

  • No single locking mechanism can cover every contention scenario, even if it adapts dynamically. The adaptive lock therefore integrates several locking mechanisms: when dynamic tuning of the current mechanism reaches its limit and still cannot fit a scenario, the adaptive lock switches to a more suitable mechanism.
  1. As shown in the figure, each time a thread enters the critical section, the average critical-section size and the degree of lock contention are calculated.
  2. When the current locking mechanism is judged to be unsuitable, a command to change the locking mechanism is issued and the status is set to 0, which prevents threads from acquiring the lock.
  3. The lock state is synchronized and the calculation is reset; the status is then set to 1 to restore normal operation.
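
The switching steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposal's implementation: the class name `SwitchableLock`, the `inFlight` counter, and the use of `java.util.concurrent.locks.Lock` as the common interface are all hypothetical. Only the status values 0 (switching, acquisition blocked) and 1 (normal) come from the text.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper that can swap its underlying lock without breaking mutual exclusion.
final class SwitchableLock {
    private static final int SWITCHING = 0; // status 0: new acquisitions are held back
    private static final int NORMAL = 1;    // status 1: normal operation

    private final AtomicInteger status = new AtomicInteger(NORMAL);
    // Threads currently between entering lock() and leaving unlock().
    private final AtomicInteger inFlight = new AtomicInteger();
    private volatile Lock delegate = new ReentrantLock();

    void lock() {
        while (true) {
            while (status.get() != NORMAL) {
                Thread.yield();             // a switch is in progress: wait
            }
            inFlight.incrementAndGet();     // announce ourselves first...
            if (status.get() == NORMAL) {
                break;                      // ...then re-check, so a switcher cannot miss us
            }
            inFlight.decrementAndGet();     // a switch started in between: back out and retry
        }
        delegate.lock();                    // delegate cannot change while inFlight > 0
    }

    void unlock() {
        delegate.unlock();
        inFlight.decrementAndGet();
    }

    // Must not be called by a thread that currently holds this lock.
    boolean switchTo(Lock replacement) {
        if (!status.compareAndSet(NORMAL, SWITCHING)) {
            return false;                   // another switch is already running
        }
        while (inFlight.get() != 0) {
            Thread.yield();                 // drain holders and waiters of the old lock
        }
        delegate = replacement;             // statistics would be reset at this point
        status.set(NORMAL);                 // reopen the gate
        return true;
    }
}
```

The counter is incremented before the status is re-checked so that either the acquiring thread sees the switch and backs out, or the switching thread sees the acquirer and waits for it. This ordering is what prevents two threads from holding different underlying locks at the same time.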

Preliminary Verification

  • So far, a spin lock with an optimal K-times backoff strategy has been preliminarily verified.
  • Stress test with four processes on a single machine (message body size 2B):
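| CPU Arch | Flush Policy | Original QPS | k    | Optimal QPS | Improvement |
|----------|--------------|--------------|------|-------------|-------------|
| X86      | ASYNC        | 176312.35    | 10^3 | 184214.98   | +4.47%      |
| X86      | SYNC         | 177403.12    | 10^3 | 187215.47   | +5.56%      |
| ARM      | ASYNC        | 185321.49    | 10^3 | 206431.82   | +11.44%     |
| ARM      | SYNC         | 188312.17    | 10^3 | 212314.43   | +12.85%     |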
  • The results above show the change in QPS. RocketMQ currently uses several different locking mechanisms with no guidance on which one to choose, and the performance gap between them is not negligible, so we will also unify the locking mechanisms under the adaptive lock.
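
For reference, a spin lock with a bounded backoff count could look like the sketch below. It assumes that K means the number of failed compare-and-set attempts a thread makes before it yields the CPU; the proposal does not spell out this definition, and the class name `BackoffSpinLock` and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical spin lock that yields after K consecutive failed attempts.
final class BackoffSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final int maxSpins; // K: failed attempts allowed before backing off

    BackoffSpinLock(int maxSpins) {
        if (maxSpins <= 0) {
            throw new IllegalArgumentException("maxSpins must be positive");
        }
        this.maxSpins = maxSpins;
    }

    void lock() {
        int spins = 0;
        while (!locked.compareAndSet(false, true)) {
            if (++spins >= maxSpins) {
                Thread.yield(); // back off: let the current holder make progress
                spins = 0;
            }
        }
    }

    // Single attempt without spinning; returns true if the lock was taken.
    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    void unlock() {
        locked.set(false);
    }
}
```

Under this assumption, the `k` column in the table would correspond to `new BackoffSpinLock(1000)`. A small K yields early and suits longer critical sections; a large K keeps the thread on the CPU and suits very short ones.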

Implementation Outline

Phase 1

  1. Optimize the locking logic for writing messages to the CommitLog
  2. Introduce the spin lock with an optimal K-times backoff strategy
  3. Optimize the back pressure mechanism of the client

Phase 2

  1. Implement an initial version of the adaptive lock
  2. Introduce other locking mechanisms
  3. Optimize message receiving logic

Phase 3

  1. Improve the adaptive lock
  2. Provide tools for adjusting the locking mechanism
  3. Design the necessary tests

Rejected Alternatives

  • How do alternatives solve the issue you proposed?

    • None
  • Pros and Cons of alternatives?

    • None
  • Why should we reject the above alternatives?

    • None