Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Citrix Netscaler access & error logs #384

Merged

Conversation

MaxGroot
Copy link
Contributor

@MaxGroot MaxGroot commented Sep 2, 2023

This pull request adds support for access logs as Citrix produces them by adding a few regexes to the Apache access plugin. In doing so, the WebserverAccessLogRecord descriptor is extended with the fields local_ip and pid.

Moreover, this PR adds a generic WebserverErrorLogRecord and extends the Apache plugin with a error method. To keep the plugin readable I changed most regex strings to a multi-line version with comments that are parsed using re.VERBOSE.

It might be worth considering to 'parse' the configuration files for Apache to construct regexes dynamically. However, those files might not always be available. Having said that, I'm unsure whether it is desirable to keep adding / extending regexes to keep covering other log formats in the future. Open to feedback in that regard.

@codecov
Copy link

codecov bot commented Sep 2, 2023

Codecov Report

Attention: 18 lines in your changes are missing coverage. Please review.

Comparison is base (080de0f) 73.88% compared to head (b54a384) 74.04%.

Files Patch % Lines
dissect/target/plugins/apps/webserver/apache.py 83.65% 17 Missing ⚠️
dissect/target/plugins/apps/webserver/citrix.py 95.45% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #384      +/-   ##
==========================================
+ Coverage   73.88%   74.04%   +0.15%     
==========================================
  Files         271      272       +1     
  Lines       22476    22529      +53     
==========================================
+ Hits        16607    16681      +74     
+ Misses       5869     5848      -21     
Flag Coverage Δ
unittests 74.04% <87.05%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Schamper Schamper requested a review from Poeloe September 4, 2023 10:50
@Schamper
Copy link
Member

Schamper commented Sep 4, 2023

It might be worth considering to 'parse' the configuration files for Apache to construct regexes dynamically. However, those files might not always be available. Having said that, I'm unsure whether it is desirable to keep adding / extending regexes to keep covering other log formats in the future. Open to feedback in that regard.

I was thinking about this too. Besides keeping some defaults, this might be worth pursuing. Perhaps @JSCU-CNI has some thoughts on that, since they originally wrote this plugin.

Copy link
Contributor

@Poeloe Poeloe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not entirely finished my review, will continue at a later time!

Comment on lines 98 to 106
(?P<remote_ip>.*?) # Client IP address of the request.
\s
->
\s
(?P<local_ip>.*?) # Local IP of the Netscaler.
\s
(?P<remote_logname>.*?) # Remote logname (from identd, if supplied).
\s
(?P<remote_user>.*?) # Remote user if the request was authenticated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can think of an elegant way to re use the REMOTE_REGEX pattern here, since it is quite similar. So use a function that returns a specific version based on a boolean or something in that direction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's quite similar, but didn't see a way of re-using components here. Considering the regex pattern has been moved to a separate file, I think the code overlap is a little less glaring. Still open to suggestions on this though.

"""
splitted_line = line.split(" ")
first_part = splitted_line[0]
second_part = splitted_line[1]
if ":" in first_part and "." in first_part:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to also check for the .? Cannot think of a scenario (other than a corrupt file)

Or maybe you mean second_part instead, to check for an IP-address?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with vhosts, you have something like example.com:80 as the first part of a line, which contains both a : and a .

Poeloe
Poeloe previously requested changes Sep 4, 2023
error_log_paths.extend(
path
for path in custom_log.parent.glob(f"{custom_log.name}*")
if path not in error_log_paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use a set as default for the log files and cast it to a list if necessary. Then you don't need to do this existence check ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored into sets, however this does mean that does change the ordering of the values. In the old version, the values of access_log_paths would be sorted by when they were encountered in the configuration. Now it's sorted alphabetically. I don't see a problem with this myself (the log paths themselves aren't returned to end-users), but still thought it's worth mentioning.

@JSCU-CNI
Copy link
Contributor

JSCU-CNI commented Sep 5, 2023

It might be worth considering to 'parse' the configuration files for Apache to construct regexes dynamically. However, those files might not always be available. Having said that, I'm unsure whether it is desirable to keep adding / extending regexes to keep covering other log formats in the future. Open to feedback in that regard.

I was thinking about this too. Besides keeping some defaults, this might be worth pursuing. Perhaps @JSCU-CNI has some thoughts on that, since they originally wrote this plugin.

Thanks for the mention. Deriving the log format from Apache config files definitely sounds like a nice to have feature to us. I would propose to keep the current Apache regexes if no config file is found on a target.

On another note, we feel the current ApachePlugin would get a little bloated with Citrix functionality once this is merged.
Looking at the current proposal of this extension of the Apache webserver plugin, we wonder if a new plugin class (eg. CitrixPlugin) would be beneficial. This plugin could inherit the current ApachePlugin and build on existing logic, while keeping Apache specific logic separate.

@MaxGroot
Copy link
Contributor Author

Thanks for your input, I agree that the plugin would get too bloated with Citrix-specific stuff with the PR in its initially proposed form. Subclassing the ApachePlugin with a Citrix-specific plugin seems like the right approach. Apologies it took a while, I went out for a bit to get some milk.

To be able to subclass ApachePlugin into a CitrixWebserverPlugin, substantial changes were necessary to the ApachePlugin. I initially considered splitting this PR into one for refactoring the ApachePlugin while adding error log support, and creating a second PR for the CitrixWebserverPlugin. However, I think some of the design choices can best be discussed with having both the superclass and the subclass in the same PR.

To make this work, some things were moved around or changed:

  • infer_access_log_format is now a static method of ApachePlugin. This way, the CitrixWebserverPlugin can override it for inferring citrix-specific log formats.
  • The LogFormat enum was removed. As enums are not extendible, the CitrixWebserverPlugin could not return a log format of the same type as its superclass. Instead, log formats are now noted using a LogFormat named tuple, which contain both a name and a regex pattern.
  • Some constants, such as default_log_dirs and access_log_names are now properties of the ApachePlugin rather than hardcoded lists contained within methods. This way, the subclass can easily extend these properties.

The refactor also takes the initial review suggestions into account. One additional thing that has been changed compared to 56f8821 is a small rewrite of how Apache configurations are parsed for the presence of CustomLog directives. To also support ErrorLog directives this code had to be supplemented anyway, and the parsing would previously use a mix-and-match of split, strip, replace and indexing:

log_path = line.split("CustomLog")[1].strip().split(" ")[0].replace('"', "")

This has been altered to a regex, to be more in line with the rest of the plugin structure.

I realize this is a substantial refactor of the ApachePlugin. @JSCU-CNI feel free to leave your thoughts on the proposed changes, as you originally wrote the plugin.

@JSCU-CNI
Copy link
Contributor

Thanks for the reply @MaxGroot. I've quickly looked through your new changes. To prevent accidental confusion between the ApachePlugin and the CitrixWebserverPlugin, you could perhaps add a detect method to CitrixWebserverPlugin which checks for os == "citrix".

I haven't got anything else to add, this looks like a good addition to Dissect (and thanks for making the regexes easier to read).

@MaxGroot
Copy link
Contributor Author

Depends on #463.

@Schamper Schamper merged commit fe16e92 into fox-it:main Jan 14, 2024
@MaxGroot MaxGroot deleted the improvement/citrix-access-and-error-logs branch March 9, 2024 19:48
Zawadidone pushed a commit to Zawadidone/dissect.target that referenced this pull request Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants