Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent double-creation of nerdctl default network. #1538

Merged
merged 3 commits into from
Dec 6, 2022

Conversation

aznashwan
Copy link
Contributor

This patch adds logic to pkg/netutil to prevent nerdctl from accidentally creating a second default network if a non-nerdctl-created network with the same default name already happens to exist.

Additionally, a new nerdctl/default-network label definition has been added to indicate whether a network was created by nerdctl with the express purpose of being used as a default network.

Signed-off-by: Nashwan Azhari [email protected]

@aznashwan aznashwan force-pushed the default-net-name-collisions branch from 1573c2d to d719b8b Compare November 22, 2022 15:33
@aznashwan
Copy link
Contributor Author

Note that this is basically just a workaround and the root cause of the issue is nerdctl using network names as their identifiers instead of IDs for most internal filtering logic.

IMHO it'd be worth creating an issue where we could discuss switching to network IDs further...

@fahedouch
Copy link
Member

This patch adds logic to pkg/netutil to prevent nerdctl from accidentally creating a second default network if a non-nerdctl-created network with the same default name already happens to exist.

can you give an example plz?

Comment on lines 62 to 118
defaultNet, err := e.getDefaultNetworkConfig()
if err != nil {
return nil, fmt.Errorf("failed to check for default network: %s", err)
}
if defaultNet == nil {
if err := e.createDefaultNetworkConfig(); err != nil {
return nil, fmt.Errorf("failed to create default network: %s", err)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we moveensureDefaultNetworkConfig after e.networkConfigList() and then put that in ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean since ensureDefaultNetworkConfig() has now become createDefaultNetworkConfig(). (which is indeed now called after e.networkConfigList())

It has however just hit me that creating the default network after the networkConfigList() will mean that e.Networks will not include the newly-created default network.

While I could just go e.Networks = append(e.Networks, newDefaultNetwork), I still think this shows a deeper underlying problem in how we expose this field:

I'm guessing e.Networks might have been added for caching reasons or something but at this stage IMO it just makes the network listing logic less consistent and should be removed entirely, what do you think?

Copy link
Member

@fahedouch fahedouch Nov 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • SGTM regarding introducing NetworkList() to expose networkConfigs
  • What I propose is to wrap the e.getDefaultNetworkConfig() and createDefaultNetworkConfig() into ensureDefaultNetworkConfig()

stgh like this

func (e *CNIEnv) ensureDefaultNetworkConfig() error{
	defaultNet, err := e.GetDefaultNetworkConfig()
	if err != nil {
		return  fmt.Errorf("failed to check for default network: %s", err)
	}
	if defaultNet == nil {
		if err := e.createDefaultNetworkConfig(); err != nil {
			return fmt.Errorf("failed to create default network: %s", err)
		}
	}
        return nil
}

and then

func NewCNIEnv(cniPath, cniConfPath string) (*CNIEnv, error) {
	e := CNIEnv{
		Path:        cniPath,
		NetconfPath: cniConfPath,
	}
	if err := os.MkdirAll(e.NetconfPath, 0755); err != nil {
		return nil, err
	}
	if err := e.ensureDefaultNetworkConfig(); err != nil {
		return nil, err
        }
	return &e, nil
}

I believe this make a code more compréhensible since the CNI env constructor should :

  1. create the config path
  2. ensure a config file (default network)
  3. return cniEnv

@aznashwan
Copy link
Contributor Author

@fahedouch it happened while I was running nerdctl on a containerd install on Windows 2019 which already had a CNI network defined bearing the name nat from some common setup script in the containerd repo, though under a different filename. (0-containerd-nat.conf)

nat is the default network name in nerdctl on Windows, and because CNIEnv.ensureDefaultNetworkConfig() always gets called and nerdctl it only checks for the nerdctl-specific CNI network file definition is present before re-creating the network, it meant that nerdctl created its own network named nat, and now I had two.

Note that the original nat thus became completely inaccessible, as nerdctl would always sort the CNI config files alphabetically by filename and cram them first-come-from-served into a map, in which case the original network definition in 0-containerd-nat.conf would always get superseded by the one created by nerdctl in nerdctl-nat.conflist.

In the end this can only be really be solved by having nerdctl use network IDs instead of names for any nerdctl-managed networks, and only use names for networks created externally.

Either way, I felt there needed to be some way to differentiate the default network through a label on it. (am open to any better ideas though)

// created and owned by Nerdctl.
// Boolean value which can be parsed with strconv.ParseBool() is required.
// (like "nerdctl/default-network=true" or "nerdctl/default-network=false")
NerdctlDefaultNetwork = Prefix + "default-network"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cover multi-network (nerdctl run --net=foo --net=bar, commonly invoke via compose) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to see an integration test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cover multi-network (nerdctl run --net=foo --net=bar, commonly invoke via compose)?

The label will only get set on the default network nerdctl creates, and the label will only be checked when trying to find the default network.

I'll be adding an integration test shortly.

@aznashwan aznashwan force-pushed the default-net-name-collisions branch 7 times, most recently from e1c2800 to 033dada Compare November 24, 2022 19:51
@fahedouch
Copy link
Member

@aznashwan need some rebase ( because of this #1554)

@aznashwan aznashwan force-pushed the default-net-name-collisions branch 2 times, most recently from 9cc8708 to 6071186 Compare November 29, 2022 10:40
@aznashwan
Copy link
Contributor Author

aznashwan commented Nov 29, 2022

@fahedouch just updated the PR, thanks for the heads-up!

One thing worth noting is that due to #1554 making default network creation no longer happen on any nerdctl network subcommand, I had to "downgrade" the integration tests to unit tests.

@fahedouch fahedouch self-requested a review November 29, 2022 20:54
m := make(map[string]*networkConfig, len(networks))
for _, n := range networks {
if original, exists := m[n.Name]; exists {
logrus.Warnf("duplicate network name %q, %#v will get superseded by %#v", n.Name, original, n)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to have duplicate network name here ?

Copy link
Contributor Author

@aznashwan aznashwan Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Nerdctl uses libcni.ConfListFromFile to parse the libcni.NetworkConfigList which will holds the actual Name field which is loaded from the name in the CNI JSON conf file. (netutil.NetworkConfig inherits all fields from the NetworkConfigList)

There are no checks preventing multiple JSON config files having the same name field in them, which was exactly what happened to me as described in my first comment.

The only real solution would be to switch nerdctl to using network IDs internally (for networks it had created itself), and only resort to the name field only for networks which were not defined by nerdctl. (and thus do not have a nerdctl ID field set)

I noticed issue #1568 (and PR #1581) are a step in the right direction but most of the network-listing codebase is still relying on identifying networks only by name. (or using cniEnv.NetworkMap() which has the potential of networks "shadowing" one-another)

@fahedouch
Copy link
Member

@aznashwan PTAL!

This patch adds logic to pkg/netutil to prevent nerdctl from
accidentally creating a second default network if a non-nerdctl-created
network with the same default name already happens to exist.

Additionally, a new `nerdctl/default-network` label definition has been
added to indicate whether a network was created by nerdctl with the
express purpose of being used as a default network.

Signed-off-by: Nashwan Azhari <[email protected]>
@aznashwan aznashwan force-pushed the default-net-name-collisions branch 2 times, most recently from 9323c59 to ba61d5f Compare December 5, 2022 10:38
This patch makes all the methods on `netutil.CNIEnv` stateless in order
to prevent possible state mismanagement issues. (e.g. calling
`networkConfigList()` after a `NetworkCreate()` excluding the new network)

Signed-off-by: Nashwan Azhari <[email protected]>
@aznashwan aznashwan force-pushed the default-net-name-collisions branch from ba61d5f to fbcd366 Compare December 5, 2022 11:02
Copy link
Member

@fahedouch fahedouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, CI failing not related to PR

@fahedouch fahedouch added this to the v1.1.0 milestone Dec 6, 2022
@fahedouch
Copy link
Member

@aznashwan
FreeBSD CI is failing

=== RUN   TestDefaultNetworkCreation
    netutil_test.go:165: assertion failed: error is not nil: failed to create default network: needs CNI plugin "bridge" to be installed in CNI_PATH ("/opt/cni/bin"), see https://github.com/containernetworking/plugins/releases: exec: "/opt/cni/bin/bridge": stat /opt/cni/bin/bridge: no such file or directory

@aznashwan aznashwan force-pushed the default-net-name-collisions branch from fbcd366 to c0fc27e Compare December 6, 2022 10:25
@aznashwan aznashwan force-pushed the default-net-name-collisions branch from c0fc27e to 4404583 Compare December 6, 2022 10:29
@aznashwan
Copy link
Contributor Author

@aznashwan FreeBSD CI is failing

@fahedouch good catch, it seems there is no FreeBSD CNI plugin being set up in the CI so I just made that test only run on Linux and Windows for now.

FreeBSD and Windows checks pass now, and the remaining CI failures seem to be unrelated. (22.04 and 20.04 rootless)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants