Ianb/dogstreams aggregate #3732
Issue: if multiple metrics with the same metric_name and timestamp but different tags are read, only one is submitted; the metrics carrying the other tags are lost.
checks/datadog.py
return (p[1], p[0], p[3].get('host_name', None), p[3].get('device_name', None))

# Sort and group by timestamp, metric name, host_name, device_name, (tags or attributes)
tags = p[3].get('tags', None)
Since `None` is the default return value of `get`, we could simplify this to:
tags = p[3].get("tags")
attribs = sorted(tags.split(",") if tags else p[3])
The same applies on line 32: we can drop the second argument to `get`.
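As a quick check of the point above: `dict.get` returns `None` when the key is missing and no default is passed, so the explicit second argument is redundant. The `attrs` dict below is made up for illustration and only stands in for `p[3]`.

```python
# Hypothetical attribute dict, standing in for p[3] in the check.
attrs = {"host_name": "web-01"}

# For a key that exists, both spellings return the stored value.
with_default = attrs.get("host_name", None)
without_default = attrs.get("host_name")

# For a missing key, both spellings return None.
missing_with_default = attrs.get("device_name", None)
missing_without_default = attrs.get("device_name")
```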
Thanks for that. It was just the pattern of the original code; let me make the changes.
# Sort and group by timestamp, metric name, host_name, device_name
return (p[1], p[0], p[3].get('host_name', None), p[3].get('device_name', None))

# Sort and group by timestamp, metric name, host_name, device_name, (tags or attributes)
Why do we try to find `tags` instead of using `attribs` all the time? Is there a special case where, if `tags` is present in the attributes, the code uses it later?
Here is the reason why I did it this way.
As per our docs, the canonical format is:
metric unix_timestamp value [attribute1=v1 attributes2=v2 ...]
And for ParserFunction-parsed logs, the attributes contain `tags`:
metric_attributes = {
'tags': tags,
'metric_type': 'gauge',
}
return (metric_name, date, metric_value, metric_attributes)
So ideally, in canonical format, metric tags should be sent as shown in the following line, which is why I look for `tags`:
mymetric 1531900000 99 tags=env:prod,app:web metric_type=gauge
BUT, most of the time in practice (and in an example we provided in old KB content), the tags are sent as follows. In this case there is no `tags` attribute, so we assume that `attribs` could contain the tags.
applications.function.runtime_seconds 1464462187 24 application=myapp code_author=gus
mymetric 1531900000 99 env=prod app=web
mymetric 1531900001 99 env=prod app=web
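The fallback described above can be sketched as a key function. This is a minimal illustration, not the code from the PR: `aggregation_key` and the sample points are hypothetical, and the tuple layout `(metric_name, timestamp, value, attributes)` is assumed from the parser-function snippet earlier in this comment.

```python
def aggregation_key(p):
    """Build a grouping key for p = (metric_name, timestamp, value, attributes)."""
    attrs = p[3]
    tags = attrs.get("tags")
    if tags:
        # Canonical format gives a comma-separated string; a parser
        # function may give a list. Normalise both, then sort.
        tag_list = tags.split(",") if isinstance(tags, str) else list(tags)
        identity = tuple(sorted(tag_list))
    else:
        # No explicit tags: treat every attribute as part of the identity.
        identity = tuple(sorted(attrs.items()))
    return (p[1], p[0], attrs.get("host_name"), attrs.get("device_name"), identity)


# Hypothetical points with the same name and timestamp.
tagged_a = ("mymetric", 1531900000, 99, {"tags": "env:prod,app:web", "metric_type": "gauge"})
tagged_b = ("mymetric", 1531900000, 99, {"tags": "app:web,env:prod", "metric_type": "gauge"})
attr_prod = ("mymetric", 1531900000, 99, {"env": "prod", "app": "web"})
attr_staging = ("mymetric", 1531900000, 99, {"env": "staging", "app": "web"})
```

With this key, `tagged_a` and `tagged_b` fall into the same group, while `attr_prod` and `attr_staging` stay separate.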
Why not `attribs` all the time?
Because we would then have to assume that the tags (if present) are always in the same order, which isn't always the case. Here are two examples where using `attribs` as-is doesn't work:
1_ Using the canonical format, these two lines would look like 2 unique metrics and would not be aggregated. groupby() apparently treats them as different keys.
mymetric 1531900000 99 tags=env:prod,app:web metric_type=gauge
mymetric 1531900000 99 tags=app:web,env:prod metric_type=gauge
2_ Using a parser function, it's the same thing. If the tags are submitted in a different order, the metrics will not be aggregated, and we can't assume that `tags` are always sorted when returned.
user.crashes|2016-05-28 20:24:43.111|24|LotusNotes,Outlook,Explorer
user.crashes|2016-05-28 20:24:43.222|24|LotusNotes,Explorer,Outlook
tags = extras.split(',')
metric_attributes = {
'tags': tags,
'metric_type': 'gauge',
}
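The ordering problem in both examples can be reproduced with a small sketch. The points, the shared timestamp, and the helper names below are assumptions made for illustration; only `itertools.groupby` and `sorted` are real library calls.

```python
from itertools import groupby

# Hypothetical parsed points: (metric_name, timestamp, value, attributes).
# Both carry the same tag set, in a different order.
points = [
    ("user.crashes", 1464467083, 24, {"tags": ["LotusNotes", "Outlook", "Explorer"]}),
    ("user.crashes", 1464467083, 24, {"tags": ["LotusNotes", "Explorer", "Outlook"]}),
]


def raw_key(p):
    # Tags kept in arrival order: differently ordered lists look distinct.
    return (p[1], p[0], tuple(p[3]["tags"]))


def sorted_key(p):
    # Tags sorted first: the same tag set always yields the same key.
    return (p[1], p[0], tuple(sorted(p[3]["tags"])))


def count_groups(key):
    # groupby only merges adjacent items, so sort by the same key first.
    return len([k for k, _ in groupby(sorted(points, key=key), key=key)])


raw_groups = count_groups(raw_key)        # 2: treated as two unique metrics
sorted_groups = count_groups(sorted_key)  # 1: aggregated together
```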
Thanks a lot for that great explanation!
Should we not also sort `p[3]` if `tags` is not set? Right now we're only sorting `tags`.
Apparently, there's no need. Per my testing, we don't need to sort a dict; I only needed to do it for a list. It took me a while to figure this out: ¯_(ツ)_/¯
The `attribs` returned is a list if `tags` were found.
The `attribs` returned is a dict if it comes from `p[3]`.
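A short sketch of why the dict case needs no sorting for grouping purposes: dict equality ignores insertion order, while list equality does not. The variable names and the `attrs_key` helper are hypothetical. One caveat, stated as general Python behaviour rather than something from this PR: Python 3 cannot order dicts with `<`, so a sort key that contains a dict raises `TypeError` there; converting to a sorted tuple of items avoids that.

```python
# Same attributes, different insertion order: the dicts still compare equal.
attrs_a = {"env": "prod", "app": "web"}
attrs_b = {"app": "web", "env": "prod"}
dicts_equal = attrs_a == attrs_b

# Same tags, different order: the lists differ until they are sorted.
tags_a = ["env:prod", "app:web"]
tags_b = ["app:web", "env:prod"]
lists_equal = tags_a == tags_b
sorted_lists_equal = sorted(tags_a) == sorted(tags_b)


def attrs_key(attrs):
    # Hashable, orderable stand-in for a dict inside a sort/group key.
    return tuple(sorted(attrs.items()))
```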
What does this PR do?
Includes the tags (or the metric attributes, if there are no tags) as part of the metric aggregation.
Motivation
Given the following logs, dogstreams will only submit one of the metrics (usually the last one) and drop the others, even if they are tagged differently.
Expected metrics to be sent to DD after aggregation:
3 unique:
Actual result: (just one)