Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs… #1673

atcuno · 2025-03-06T23:54:46Z

… in new API. Add deprecation warnings throughout old API.

This PR fixes a serious set of issues in the Linux APIs related to mapping kernel pointers to their module and symbol name (a critical aspect of rootkit detection). Previously, I had written an API in the symbols/linux/__init__.py that performed these tasks and were very easy for plugin writers to use - you called a function to get the handler information then you called a second function anytime you needed to map an address inside the handlers.

There was a recent push for covering the tracing APIs (ftrace and tracepoints) that added an entirely new API for mapping modules and symbols, but this work had three major issues that were not handled in attempting to replicate my old API, but with new features added:

This API did not incorporate the kernel .text space in its "modules" set, which meant that every pointer inside the kernel was reported as a dash (rootkit marker) in plugins. In the mass testing runs, this led to every single sample reporting rootkits all over the place.
This API did not directly incorporate the symbol lookups, leading to a bunch of copy/paste code between plugins that wanted to use it - see the diffs in this PR for ftrace and tracepoint - now they match the easy call setup from my old API, but also incorporates the hidden module searching.
The PRs for the new API left the old API in place, plus many existing plugins using the old API. This left Volatility with two public APIs for mapping Linux kernel pointers, neither of which were complete. The updated API is unified and gets the best features of each (hidden modules, mapping inside kernel .text), while leaving it very easy on plugin writers.

This PR fixes all the above issues, with the minimal amount of code possible. To rectify all the issues, and get it to where all the plugins will use new API, I did the following:

Deprecated all the old code paths
Bumped the Linux utilities version on the plugins using the old paths, but left them still using the deprecated paths to minimze the size of this PR. In a PR right after this one is merged, I will update them to use the new paths to avoid the deprecation warnings.
Volatility 3 has the really nice deprecated_method feature, but this does not cover when an entire function will be deprecated and going away without a direct replacement. I added this feature. This feature was necessary as the old API functions that worked with the tuples of (name, start, end) are just in no way compatible with the new API.
Moved all shared code to the utility class to avoid plugins importing from multiple plugins + the utility. I assume these functions were spread out through plugins before the versioned Modules utility, but that is the wrong way to approach it now. This also prevents and avoids the need for circular imports.
I fixed up typing where I noticed it in plugins already getting a major version bump.

ikelos

Cool, looks really good! For a minute I was worried we were changing the LinuxUtilities MAJOR version, but then saw it was just the Modules extension. 5:)

There's a couple considerations about changing the API but they may be excessive, so feel free to reject them. I really don't like the way there's a Deprecation class and we're importing it directly, I'd really like that cleaned up as part of this PR please but will accept an open bug if you want to put it off a bit. Otherwise looks really good and simplifies/collects together a lot of the functionality, nice! 5:D

volatility3/framework/__init__.py

volatility3/framework/symbols/linux/__init__.py

volatility3/framework/plugins/linux/hidden_modules.py

volatility3/framework/symbols/linux/utilities/modules.py

ikelos · 2025-03-07T00:52:16Z

volatility3/framework/symbols/linux/utilities/modules.py

+        )
+
+    @classmethod
+    def run_modules_scanners(


Something to consider here is whether we want to do something like the linux pslist thing we did? With different methods with the same signature that scanned different ways? Then you just hand in the list of scanners you want rather than boolean flags to turn on and off specific sets of scanners?

I was going to revisit the implementation_method for this after this PR goes on. I honestly don't even like the skipping hidden part, but left it here as it was there already. We should just get all the sources all the time.

The issue with doing the hidden_modules scanning by default, is that if the caller never needs it (i.e. it is able to resolve all symbols to modules without it), then we will just have wasted time and CPU scanning the space.

The previous design was triggering the scanning on a per-need basis, if a resolution issue was raised. I think we should leave the choice, depending on the use case ?

This is even more true on unaligned spaces, where hidden_modules can take quite a long time running.

Its only slow when the path for 1 byte alignment is hit, which I am going to fix in a follow up PR. Otherwise, it didn't change speed much in my testing since it only scans inside the vmmemap bounds anyway. If there are samples where this is problematic let me know though as I am confident we can make the scanner quick vs having to conditionally run it.

Can you raise a ticket and note it here to make sure this gets addressed in the future please? (The redo to make the set of gatherers more generic/pluggable)

volatility3/framework/plugins/linux/hidden_modules.py

atcuno · 2025-03-08T21:31:12Z

@ikelos I addressed all your feedback except for adding a datetime to deprecation headings as I am not sure what the implementation would look like.

So, if the calls took a datetime, what would happen with them? The specific date printed in the warning message? Backtrace the calling plugin with a new deprecation exception? Or would they be a parameter just to be grepped before releases?

ikelos · 2025-03-08T21:35:02Z

Yeah, didn't need to be a datetime per-se, even a date string would do (YYYY-MM-DD please), so that we could easily search for ones that should be removed, and we could tell everyone exactly when the method should no longer be used...

ikelos · 2025-03-08T21:36:21Z

We could theoretically throw a full on exception if it tries to be used after that date, but I think we should just allow it until we get around to removing the method (hopefully it'll happen ever release, one for us to check during the release process).

ikelos · 2025-03-08T21:37:53Z

We need to come up with a general policy, I'd suggest at least one year (although I'd possibly go as low as 6 months, because with our release cadence it's more likely to be 9 months) or when a MAJOR version bump happens (no point carry over cruft when everything changes)...

atcuno · 2025-03-09T04:44:42Z

I did 6 months plus a few days @ikelos

ikelos

Looks good (assuming a follow up to make the types of module scans more... modular... so to speak). A couple minor points but no show stoppers...

ikelos · 2025-03-09T12:12:49Z

volatility3/framework/symbols/linux/utilities/modules.py

+
+        kernel_layer = context.layers[kernel.layer_name]
+
+        if modules[0].start != modules[0].start & kernel_layer.address_mask:


This seems a really suspect test, it's checking that high bits aren't set? Why is it only doing it for one entry? What if the list is empty (for some reason)?

I will add the check for an empty list (good catch), but I want to keep the index 0 check as it will enforce callers going through the right API flow to reach that code. All kernel modules will have the high bits set initially, so if the right API flow isn't used then non-masked ones will make it there and everything processes incorrectly.

Shouldn't it be something like any(modules[i].start =! modules[i].start & kernel_layer.address_mask for i in modules) though? Picking out the first one seems like a half hearted check?

If the caller comes through the right API then all will be masked the same, if they don't then none will be masked... if they somehow randomly enumerate on their own and only mask half.. then they are just trying to be broken.

That's not a good argument for why not to check them all? You're applying trust to something you could check, so why not just check it? Otherwise I'm certain someone could write code that masks the first value and then makes this call. There's little point in half a test (and it also requires that the list has elements, whereas any/all would not)...

ikelos · 2025-03-09T12:14:29Z

volatility3/framework/symbols/linux/utilities/modules.py

+        if len(matches) >= 1:
+            if len(matches) > 1:
+                warnings.warn(
+                    f"Address {hex(target_address)} fits in modules at {[hex(module.start) for module in matches]}, indicating potential modules memory space overlap.",


Since this doesn't then bomb out, you should say which one will win.

ikelos · 2025-03-09T12:17:24Z

volatility3/framework/symbols/linux/utilities/modules.py

+        mask = context.layers[kernel.layer_name].address_mask
+
+        start_addr = kernel.object_from_symbol("_text")
+        start_addr = start_addr.vol.offset & mask
+
+        end_addr = kernel.object_from_symbol("_etext")
+        end_addr = end_addr.vol.offset & mask


Suggested change

mask = context.layers[kernel.layer_name].address_mask

start_addr = kernel.object_from_symbol("_text")

start_addr = start_addr.vol.offset & mask

end_addr = kernel.object_from_symbol("_etext")

end_addr = end_addr.vol.offset & mask

address_mask = context.layers[kernel.layer_name].address_mask

start_addr = kernel.object_from_symbol("_text")

start_addr = start_addr.vol.offset & address_mask

end_addr = kernel.object_from_symbol("_etext")

end_addr = end_addr.vol.offset & address_mask

Could we call this address_mask to keep it consistent with other methods please?

ikelos · 2025-03-09T12:18:49Z

volatility3/framework/symbols/linux/utilities/modules.py

+        )
+
+    @classmethod
+    def run_modules_scanners(


Can you raise a ticket and note it here to make sure this gets addressed in the future please? (The redo to make the set of gatherers more generic/pluggable)

ikelos · 2025-03-09T12:22:03Z

volatility3/framework/symbols/linux/utilities/modules.py

+                for modinfo in run_results["kernel"]
+                + run_results["lsmod"]
+                + run_results["check_modules"]


Suggested change

for modinfo in run_results["kernel"]

+ run_results["lsmod"]

+ run_results["check_modules"]

for modinfo in cls.flatten_run_modules_results(run_results)

This should probably be driven by the contents of the list, rather than hard coding the available tests, which looks to be what flatten_run_modules_results does?

ikelos · 2025-03-09T12:23:54Z

volatility3/framework/symbols/linux/utilities/modules.py

+
+        modules = vmlinux.object_from_symbol(symbol_name="modules").cast("list_head")
+
+        table_name = modules.vol.type_name.split(constants.BANG)[0]


This should just be vmlinux.symbol_table_name, no?

atcuno · 2025-03-09T19:36:32Z

@ikelos I managed to break the commit history while rebasing. I asked @dgmcdona to look at it. Will finish it up after.

ikelos · 2025-03-09T20:54:48Z

@j-t-1, please avoid modifying other people's pull requests. It can cause a huge amount of confusion, and in general, it's better to have any changes you want to make to go through a proper PR so they can be reviewed...

j-t-1 · 2025-03-09T22:32:00Z

@j-t-1, please avoid modifying other people's pull requests. It can cause a huge amount of confusion, and in general, it's better to have any changes you want to make to go through a proper PR so they can be reviewed...

Apologies for this. If I understand what has caused this, I deleted pid_list = pid_list or [], which I then realised broke the iteration in the line below, so I undeleted it in a subsequent commit, but only the first was used in a rebase. I will not deliberately modify other people's pull requests (if there is a way to remove access for me to do this, please do as one less potential error pathway).

ikelos · 2025-03-09T22:36:12Z

Thanks, we will be improving permissions so that we don't have to spot when things go wrong, but can prevent them from being done in the first place. I fully understand it was done intending to improve the commit, but we probably ought to prevent others from modifying someone's PR as you say to reduce potential error pathways. Thanks for understanding... 5:)

… in new API. Add deprecation warnings throughout old API.

Co-authored-by: ikelos <[email protected]>

Ensure black doesn't complain

ikelos

Looking really good, just the currently open comments to resolve please

ikelos · 2025-03-10T09:10:12Z

volatility3/framework/deprecation.py

+from volatility3.framework.configuration import requirements
+
+
+def method_being_removed(message: str, removal_date: str):


Any chance of a docstring for this please?

atcuno added parity-release linux_parity_push labels Mar 6, 2025

atcuno requested a review from ikelos March 6, 2025 23:54

ikelos reviewed Mar 7, 2025

View reviewed changes

ikelos assigned atcuno Mar 7, 2025

ikelos added the awaiting-author-response This issue/pull request needs attention from the original author label Mar 7, 2025

atcuno force-pushed the linux_kernel_api_overhaul branch from 710a4c2 to 4d01f9a Compare March 8, 2025 20:03

ikelos reviewed Mar 8, 2025

View reviewed changes

volatility3/framework/plugins/linux/hidden_modules.py Outdated Show resolved Hide resolved

ikelos reviewed Mar 9, 2025

View reviewed changes

dgmcdona force-pushed the linux_kernel_api_overhaul branch from 9b58c7e to 68c680b Compare March 10, 2025 03:22

atcuno and others added 9 commits March 9, 2025 22:23

Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs…

bdc3443

… in new API. Add deprecation warnings throughout old API.

update for black

2ead4d3

Update volatility3/framework/plugins/linux/hidden_modules.py

636bb10

Co-authored-by: ikelos <[email protected]>

Update volatility3/framework/plugins/linux/hidden_modules.py

00678fb

Ensure black doesn't complain

Deprecation API updates

6a32aa4

Bring back change after merge conflict

b68b4f3

Add new deprecation module

4a87bf5

Add removal_date to deprecation API

6cabb05

Add removal_date to deprecation API

3eae04c

dgmcdona force-pushed the linux_kernel_api_overhaul branch from 68c680b to 3eae04c Compare March 10, 2025 03:23

ikelos reviewed Mar 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs… #1673

Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs… #1673

atcuno commented Mar 6, 2025

ikelos left a comment

ikelos Mar 7, 2025

atcuno Mar 7, 2025

Abyss-W4tcher Mar 7, 2025 •

edited

Loading

atcuno Mar 7, 2025

ikelos Mar 9, 2025 •

edited

Loading

atcuno commented Mar 8, 2025

ikelos commented Mar 8, 2025

ikelos commented Mar 8, 2025

ikelos commented Mar 8, 2025

atcuno commented Mar 9, 2025

ikelos left a comment

ikelos Mar 9, 2025

atcuno Mar 9, 2025

ikelos Mar 9, 2025 •

edited

Loading

atcuno Mar 9, 2025

ikelos Mar 9, 2025

ikelos Mar 9, 2025

ikelos Mar 9, 2025

ikelos Mar 9, 2025 •

edited

Loading

ikelos Mar 9, 2025

ikelos Mar 9, 2025

atcuno commented Mar 9, 2025

ikelos commented Mar 9, 2025

j-t-1 commented Mar 9, 2025

ikelos commented Mar 9, 2025

ikelos left a comment

ikelos Mar 10, 2025


		kernel_layer = context.layers[kernel.layer_name]

		if modules[0].start != modules[0].start & kernel_layer.address_mask:


		modules = vmlinux.object_from_symbol(symbol_name="modules").cast("list_head")

		table_name = modules.vol.type_name.split(constants.BANG)[0]

		from volatility3.framework.configuration import requirements


		def method_being_removed(message: str, removal_date: str):

Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs… #1673

Are you sure you want to change the base?

Merge two Linux APIs for mapping kernel modules to pointers. Fix bugs… #1673

Conversation

atcuno commented Mar 6, 2025

ikelos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Abyss-W4tcher Mar 7, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ikelos Mar 9, 2025 • edited Loading

Choose a reason for hiding this comment

atcuno commented Mar 8, 2025

ikelos commented Mar 8, 2025

ikelos commented Mar 8, 2025

ikelos commented Mar 8, 2025

atcuno commented Mar 9, 2025

ikelos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ikelos Mar 9, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ikelos Mar 9, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

atcuno commented Mar 9, 2025

ikelos commented Mar 9, 2025

j-t-1 commented Mar 9, 2025

ikelos commented Mar 9, 2025

ikelos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Abyss-W4tcher Mar 7, 2025 •

edited

Loading

ikelos Mar 9, 2025 •

edited

Loading

ikelos Mar 9, 2025 •

edited

Loading

ikelos Mar 9, 2025 •

edited

Loading