Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DeprecationLogger's deduplication performance #37530

Closed
markharwood opened this issue Jan 16, 2019 · 4 comments
Closed

DeprecationLogger's deduplication performance #37530

markharwood opened this issue Jan 16, 2019 · 4 comments
Assignees
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement

Comments

@markharwood
Copy link
Contributor

As part of types removal we discovered that DeprecationLogger.deprecatedAndMaybeLog() calls were the source of a big performance regression.
I suspect the bulk of the performance cost is in the use of the extractWarningValueFromWarningHeader method which has to parse prior logged messages to reject any duplicates.
Since that discovery we have implemented our own upstream event thinning in the bulk indexing API and now search APIs to reduce the number of duplicate deprecation calls.
The remaining concern is that other areas of the system may be tempted to make the same mistake given the logging API advertises a de-duplicate function through the use of a common key. Even if developers are aware of the poor performance they may not have a single loop (like in bulk indexing) where it is convenient to add the required trimming of log calls.
If the DeprecationLogger implementation was made more efficient this would help with all these concerns.

@markharwood markharwood added >enhancement :Core/Infra/Core Core issues without another label labels Jan 16, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jasontedor
Copy link
Member

I started attacking #35754 after a discussion in our team discussion meeting today. For the performance regression there, the improvements I have made in #37590 and #37597 take the performance penalty on my machine from 40x to 10x. The remaining time is spent in de-duplicating the warning headers and then what remains in encoding and formatting the log message. I don't see any room for improvement on the latter, and the de-duplication is around 50% of the remaining time. I would consider the encoding/formatting the tax we have to pay for deprecation logging something, but there is still room for improvement in the de-duplication.

ebadyano added a commit to ebadyano/elasticsearch that referenced this issue Feb 12, 2019
1. Setting length for formatWarning String to avoid
AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid
unnecessarily creating StringBuilder in LoggerMessageFormat.format

Relates to elastic#37530
Relates to elastic#37411
@ebadyano
Copy link
Contributor

The original performance gap from #37411 for geonames benchmark was ~21%, with above fixes from Jason I see the gap narrowed to ~7%. I added a few small fixes in #38780 based in what I saw in the jfr. With them the final gap in geonames is ~4%.

ebadyano added a commit that referenced this issue Feb 22, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: #37530 (comment) 

Relates to #37530
Relates to #37411
Relates to #35754
ebadyano added a commit to ebadyano/elasticsearch that referenced this issue Feb 25, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (elastic#37411) by 3%. For more details: elastic#37530 (comment) 

Relates to elastic#37530
Relates to elastic#37411
Relates to elastic#35754
ebadyano added a commit that referenced this issue Feb 25, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: #37530 (comment) 

Relates to #37530
Relates to #37411
Relates to #35754
@ebadyano
Copy link
Contributor

Closing this for now, since the majority of the original 21% gap got narrowed to 4% with all the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement
Projects
None yet
Development

No branches or pull requests

5 participants