Add support for Citrix Netscaler access & error logs #384
Codecov Report
Coverage Diff (main vs. #384)
            main      #384      +/-
Coverage    73.88%    74.04%    +0.15%
Files       271       272       +1
Lines       22476     22529     +53
Hits        16607     16681     +74
Misses      5869      5848      -21
I was thinking about this too. Besides keeping some defaults, this might be worth pursuing. Perhaps @JSCU-CNI has some thoughts on that, since they originally wrote this plugin.
I have not entirely finished my review; I will continue at a later time!
(?P<remote_ip>.*?) # Client IP address of the request.
\s
->
\s
(?P<local_ip>.*?) # Local IP of the Netscaler.
\s
(?P<remote_logname>.*?) # Remote logname (from identd, if supplied).
\s
(?P<remote_user>.*?) # Remote user if the request was authenticated.
Maybe you can think of an elegant way to reuse the REMOTE_REGEX pattern here, since it is quite similar. For example, use a function that returns a specific version based on a boolean, or something in that direction.
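A minimal sketch of what such a helper could look like, not the code from this PR. The function name build_access_regex, the with_local_ip flag, and the trailing timestamp group are assumptions made for illustration; only the first four named groups mirror the snippet under review.

```python
import re


def build_access_regex(with_local_ip: bool) -> re.Pattern:
    """Hypothetical helper: build one of two closely related access-log patterns."""
    local_part = ""
    if with_local_ip:
        # Only the Citrix-style variant has the "-> <local ip>" part.
        local_part = r"""
        \s
        ->
        \s
        (?P<local_ip>.*?)           # Local IP of the Netscaler.
        """
    return re.compile(
        rf"""
        (?P<remote_ip>.*?)          # Client IP address of the request.
        {local_part}
        \s
        (?P<remote_logname>.*?)     # Remote logname (from identd, if supplied).
        \s
        (?P<remote_user>.*?)        # Remote user if the request was authenticated.
        \s
        \[(?P<ts>[^\]]*)\]          # Timestamp in brackets (assumed layout, for illustration).
        """,
        re.VERBOSE,
    )


# Made-up sample lines, not taken from real Netscaler logs.
citrix = build_access_regex(True).match(
    "10.0.0.5 -> 192.168.1.1 - admin [01/Jan/2024:00:00:00 +0000]"
)
plain = build_access_regex(False).match(
    "10.0.0.5 - admin [01/Jan/2024:00:00:00 +0000]"
)
```

The shared groups live in one place, and the boolean only decides whether the local IP fragment is spliced in. Whether this is more readable than two separate patterns is a judgment call.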
I agree it's quite similar, but didn't see a way of re-using components here. Considering the regex pattern has been moved to a separate file, I think the code overlap is a little less glaring. Still open to suggestions on this though.
"""
splitted_line = line.split(" ")
first_part = splitted_line[0]
second_part = splitted_line[1]
if ":" in first_part and "." in first_part:
Is it necessary to also check for the "."? I cannot think of a scenario (other than a corrupt file) where it would be missing. Or maybe you mean second_part instead, to check for an IP address?
I think with vhosts, you have something like example.com:80 as the first part of a line, which contains both a ":" and a ".".
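To make the check concrete, here is a small sketch that mirrors the condition from the snippet above. The function name has_vhost_prefix and the sample lines are made up for illustration. It also shows one case where the extra "." check might matter: a line starting with an IPv6 address contains a ":" but no ".".

```python
def has_vhost_prefix(line: str) -> bool:
    """Hypothetical helper mirroring the check in the reviewed snippet."""
    first_part = line.split(" ")[0]
    # A vhost prefix such as example.com:80 contains both a colon and a dot.
    return ":" in first_part and "." in first_part


# Made-up sample lines.
vhost_line = 'example.com:80 10.0.0.5 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 512'
ipv4_line = '10.0.0.5 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 512'
ipv6_line = '2001:db8::1 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 512'

for sample in (vhost_line, ipv4_line, ipv6_line):
    print(has_vhost_prefix(sample), sample.split(" ")[0])
```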
error_log_paths.extend(
path
for path in custom_log.parent.glob(f"{custom_log.name}*")
if path not in error_log_paths
Maybe use a set as default for the log files and cast it to a list if necessary. Then you don't need to do this existence check ;)
Refactored into sets; however, this does change the ordering of the values. In the old version, the values of access_log_paths would be sorted by when they were encountered in the configuration. Now they are sorted alphabetically. I don't see a problem with this myself (the log paths themselves aren't returned to end-users), but still thought it was worth mentioning.
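The ordering difference can be illustrated with a short sketch. The paths are made up, and the dict.fromkeys variant is only an alternative worth knowing about if encounter order ever matters; it is not what this PR does.

```python
from pathlib import PurePosixPath

# Made-up log paths in the order they might appear in a configuration.
ssl_log = PurePosixPath("/var/log/httpd/ssl_access.log")
access_log = PurePosixPath("/var/log/httpd/access.log")
encountered = [ssl_log, access_log, ssl_log]

# A set drops the duplicate without an explicit membership check,
# but the original order is lost; sorting gives a stable, alphabetical order.
deduplicated_sorted = sorted(set(encountered))

# dict.fromkeys also drops duplicates and keeps the order of first appearance.
deduplicated_ordered = list(dict.fromkeys(encountered))

print(deduplicated_sorted)
print(deduplicated_ordered)
```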
Thanks for the mention. Deriving the log format from Apache config files definitely sounds like a nice-to-have feature to us. I would propose keeping the current Apache regexes if no config file is found on a target. On another note, we feel the current ApachePlugin would get a little bloated with Citrix functionality once this is merged.
Thanks for your input, I agree that the plugin would get too bloated with Citrix-specific stuff with the PR in its initially proposed form. Subclassing the To be able to subclass To make this work, some things were moved around or changed:
The refactor also takes the initial review suggestions into account. One additional thing that has been changed compared to 56f8821 is a small rewrite of how Apache configurations are parsed for the presence of
This has been altered to a regex, to be more in line with the rest of the plugin structure. I realize this is a substantial refactor of the
Thanks for the reply @MaxGroot. I've quickly looked through your new changes. To prevent accidental confusion between the I haven't got anything else to add, this looks like a good addition to Dissect (and thanks for making the regexes easier to read).
Also apply previous code suggestions
Depends on #463.
This pull request adds support for access logs as Citrix produces them by adding a few regexes to the Apache access plugin. In doing so, the WebserverAccessLogRecord descriptor is extended with the fields local_ip and pid.
Moreover, this PR adds a generic WebserverErrorLogRecord and extends the Apache plugin with an error method. To keep the plugin readable I changed most regex strings to a multi-line version with comments that are parsed using re.VERBOSE.
It might be worth considering 'parsing' the Apache configuration files to construct regexes dynamically. However, those files might not always be available. Having said that, I'm unsure whether it is desirable to keep adding / extending regexes to keep covering other log formats in the future. Open to feedback in that regard.
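As a rough illustration of the dynamic approach mentioned above, the sketch below turns a LogFormat string into a regex. It is not part of this PR. The mapping DIRECTIVE_PATTERNS, the function logformat_to_regex, and the group names ts, request, status_code and bytes_sent are assumptions. Only a handful of simple directives are covered, and anything else raises an error.

```python
import re

# Hypothetical mapping from a few Apache LogFormat directives to named groups.
DIRECTIVE_PATTERNS = {
    "%h": r"(?P<remote_ip>\S+)",
    "%l": r"(?P<remote_logname>\S+)",
    "%u": r"(?P<remote_user>\S+)",
    "%t": r"\[(?P<ts>[^\]]+)\]",
    "%r": r'(?P<request>[^"]*)',
    "%>s": r"(?P<status_code>\d{3})",
    "%b": r"(?P<bytes_sent>\d+|-)",
}

# Matches directives such as %h, %>s or %{Referer}i.
DIRECTIVE = re.compile(r"%(?:\{[^}]*\})?[<>]?[a-zA-Z]")


def logformat_to_regex(log_format: str) -> re.Pattern:
    """Build a regex from a LogFormat string, escaping the literal text in between."""
    parts = []
    pos = 0
    for token in DIRECTIVE.finditer(log_format):
        parts.append(re.escape(log_format[pos:token.start()]))
        directive = token.group()
        if directive not in DIRECTIVE_PATTERNS:
            raise ValueError(f"Unsupported LogFormat directive: {directive}")
        parts.append(DIRECTIVE_PATTERNS[directive])
        pos = token.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile("".join(parts))


# Apache's common log format and a made-up matching line.
pattern = logformat_to_regex('%h %l %u %t "%r" %>s %b')
match = pattern.fullmatch(
    '10.0.0.5 - admin [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 512'
)
```

A fallback to the existing static regexes would still be needed for targets where no configuration file is available.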