-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-623 engine: Fix a typo #14329
DAOS-623 engine: Fix a typo #14329
Conversation
tgt_nr should divide by tgt_per_numa, not number of numa nodes. Otherwise, it can trigger the assertion at the end of the function. Required-githooks: true Change-Id: I92f65924f9b4b3dce6a756e01e5bfc9e584af0b6 Signed-off-by: Jeff Olivier <[email protected]>
Ticket title is 'Generic ticket for minor code cleanup and improvement' |
Signed-off-by: Jeff Olivier <[email protected]>
Test stage NLT on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14329/1/display/redirect |
target = base + ((offload + tgt_id) % dss_offload_per_numa_nr); | ||
offload = target + 17; /* Seed next selection */ | ||
base = dss_sys_xs_nr + dss_tgt_nr + (socket * dss_offload_per_numa_nr); | ||
target = base + ((offload + tgt_id) % dss_offload_per_numa_nr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to keep the "target" and "tgt_id" within the same NUMA node or not? If yes, how? It seems that "base" may exceeds the boundary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be more precise, the helper threads are split among numa nodes but base is the xs id of the first helper thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank for the detailed explanation.
Backport for the following patches DAOS-13380 engine: refine tgt_nr check (#12405) DAOS-15739 engine: Add multi-socket support (#14234) DAOS-623 engine: Fix a typo (#14329) * DAOS-13380 engine: refine tgt_nr check 1. for non-DAOS_TARGET_OVERSUBSCRIBE case fail to start engine if #cores is not enough 2. for DAOS_TARGET_OVERSUBSCRIBE case allow to force start engine The #nr_xs_helpers possibly be reduced for either case. * DAOS-15739 engine: Add multi-socket support (#14234) Add a simple multi-socket mode for use cases where a single engine must be used. Avoids the issue of having all helper xstreams automatically assigned to a single NUMA node thus increasing efficiency of synchronizations between I/O and helper xstreams. It is the default behavior if all of the following are true Neither pinned_numa_node nor first_core are used. No oversubscription is requested NUMA has uniform number of cores targets and helpers divide evenly among numa nodes There is more than one numa node Update server config logic to ensure first_core is passed on to engine if it's set while keeping existing behavior when both first_core: 0 and pinned_numa_node are set. Signed-off-by: Jeff Olivier <[email protected]> Signed-off-by: Xuezhao Liu <[email protected]> Signed-off-by: Tom Nabarro <[email protected]>
Required-githooks: true
target = base + ((offload + tgt_id) % dss_offload_per_numa_nr); | ||
offload = target + 17; /* Seed next selection */ | ||
base = dss_sys_xs_nr + dss_tgt_nr + (socket * dss_offload_per_numa_nr); | ||
target = base + ((offload + tgt_id) % dss_offload_per_numa_nr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank for the detailed explanation.
tgt_nr should divide by tgt_per_numa, not number of numa nodes. Otherwise, it can trigger the assertion at the end of the function.
Required-githooks: true
Change-Id: I92f65924f9b4b3dce6a756e01e5bfc9e584af0b6
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: