Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove webhook mode from k8s-workload-registrar #3235

Merged
merged 2 commits into from
Jul 12, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 8 additions & 62 deletions support/k8s/k8s-workload-registrar/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,19 +31,9 @@ The configuration file is a **required** by the registrar. It contains
| `cluster` | string | required | Logical cluster to register nodes/workloads under. Must match the SPIRE SERVER PSAT node attestor configuration. | |
| `pod_label` | string | optional | The pod label used for [Label Based Workload Registration](#label-based-workload-registration) | |
| `pod_annotation` | string | optional | The pod annotation used for [Annotation Based Workload Registration](#annotation-based-workload-registration) | |
| `mode` | string | optional | How to run the registrar, either using a `"webhook"`, `"reconcile`" or `"crd"`. See [Differences](#differences-between-modes) for more details. | `"webhook"` |
| `mode` | string | required | How to run the registrar, either `"reconcile"` or `"crd"`. See [Differences](#differences-between-modes) for more details. | |
| `disabled_namespaces` | []string | optional | Comma seperated list of namespaces to disable auto SVID generation for | `"kube-system", "kube-public"` |

The following configuration directives are specific to `"webhook"` mode:

| Key | Type | Required? | Description | Default |
| -------------------------- | --------| ---------| ----------------------------------------- | ------- |
| `addr` | string | required | Address to bind the HTTPS listener to | `":8443"` |
| `cert_path` | string | required | Path on disk to the PEM-encoded server TLS certificate | `"cert.pem"` |
| `key_path` | string | required | Path on disk to the PEM-encoded server TLS key | `"key.pem"` |
| `cacert_path` | string | required | Path on disk to the CA certificate used to verify the client (i.e. API server) | `"cacert.pem"` |
| `insecure_skip_client_verification` | boolean | required | If true, skips client certificate verification (in which case `cacert_path` is ignored). See [Security Considerations](#security-considerations) for more details. | `false` |

The following configuration directives are specific to `"reconcile"` mode:

| Key | Type | Required? | Description | Default |
Expand All @@ -67,16 +57,15 @@ cluster = "production"
```

## Workload Registration
When running in webhook, reconcile, or crd mode with `pod_controller=true` entries will be automatically created for
When running in reconcile or crd mode with `pod_controller=true` entries will be automatically created for
Pods. The available workload registration modes are:

| Registration Mode | pod_label | pod_annotation | identity_template | Service Account Based |
| ----------------- | --------- | -------------- | ----------------- | --------------- |
| `webhook` | as specified by pod_label | as specified by pod_annotation | _unavailable_ | service account |
| `reconcile` | as specified by pod_label | as specified by pod_annotation | _unavailable_ | service account |
| `crd` | as specified by pod_label | as specified by pod_annotation | as specified by identity_template | _unavailable_ |

If using `webhook` and `reconcile` modes with [Service Account Based SPIFFE IDs](#service-account-based-workload-registration), don't specify either `pod_label` or `pod_annotation`. If you use Label Based SPIFFE IDs, specify only `pod_label`. If you use Annotation Based SPIFFE IDs, specify only `pod_annotation`.
If using the `reconcile` mode with [Service Account Based SPIFFE IDs](#service-account-based-workload-registration), don't specify either `pod_label` or `pod_annotation`. If you use Label Based SPIFFE IDs, specify only `pod_label`. If you use Annotation Based SPIFFE IDs, specify only `pod_annotation`.

For `crd` mode, if neither `pod_label` nor `pod_annotation`
workload registration mode is selected,
Expand Down Expand Up @@ -188,41 +177,6 @@ An example can be found in `mode-reconcile/config/role.yaml`, which you would ap

See [Quick Start for CRD Kubernetes Workload Registrar](mode-crd/README.md#quick-start)

### Webhook Mode Configuration
The registrar will need access to its server keypair and the CA certificate it uses to verify clients.

The following K8S objects are required to set up the validating admission controller:
* `Service` pointing to the registrar port within the spire-server container
* `ValidatingWebhookConfiguration` configuring the registrar as a validating admission controller.

Additionally, unless you disable client authentication (`insecure_skip_client_verification`), you will need:
* `Config` with a user entry for the registrar service client containing the client certificate/key the API server should use to authenticate with the registrar.
* `AdmissionConfiguration` describing where the API server can locate the file containing the `Config`. This file is passed to the API server via the `--admission-control-config-file` flag.

For convenience, a command line utility is provided to generate authentication
material and relevant Kubernetes configuration YAML.

```
$ go run generate-config.go
.... YAML configuration dump ....
```

#### Webhook mode Security Considerations

The registrar authenticates clients by default. This is a very important aspect
of the overall security of the registrar since the registrar can be used to
provide indirect access to the SPIRE server API, albeit scoped. It is *NOT*
recommended to skip client verification (via the
`insecure_skip_client_verification` configurable) unless you fully understand
the risks.

#### Migrating away from the webhook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth keeping some info on migration for users that are still on the webhook mode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sure. That's probably a good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it into a "troubleshooting" section on the CRD mode docs. Hope that spot makes sense.


The k8s ValidatingWebhookConfiguration will need to be removed or pods may fail admission. If you used the default
configuration this can be done with:

`kubectl validatingwebhookconfiguration delete k8s-workload-registrar-webhook`

## DNS names

Both `"reconcile"` and `"crd"` mode provide the ability to add DNS names to registration entries for pods. They
Expand All @@ -238,19 +192,11 @@ to entries. This is known to affect etcd.

## Differences between modes

The `"webhook"` mode uses a Validating Admission Webhook to capture pod creation/deletion events at admission time. It
was the first of the registrar implementations, but suffers from the following problems:
* Race conditions between add and delete for StatefulSets will regularly lead to StatefulSets without entries;
* Unavailability of the webhook either has to block admission entirely, or you'll end up with pods with no entries;
* Spire server errors have to block admission entirely, or you'll end up with pods with no entries;
* It will not clean up left behind entries for pods deleted while the webhook/spire-server was unavailable;
* Entries are not parented to individual Nodes, all SVIDs are flooded to all agents in a cluster, which severely limits scalability.
Use of the `"webhook"` mode is thus strongly discouraged, but it remains the default for backward compatibility reasons.

The `"reconcile"` mode and `"crd"` mode both make use of reconciling controllers instead of webhooks. `"reconcile"` mode,
and `"crd"` mode with the pod_controller enabled, have similar automated workload creation functionality to webhook, but
they do not suffer from the same race conditions, are capable of recovering from (and cleaning up after) failure of the registrar,
and both also ensure that automatically created entries for Pods are limited to the appropriate Nodes to prevent SVID
The `"reconcile"` and `"crd"` modes both make use of reconciling controllers. Both modes,
with the pod_controller enabled, have similar automated workload creation
functionality and are capable of recovering from (and cleaning up after)
failure of the registrar. Each also ensure that automatically created
entries for Pods are limited to the appropriate Nodes to prevent SVID
flooding. When used in this way, `"reconcile"` may be slightly faster to create new entries than `"crd"` mode, and requires
less configuration.

Expand Down
16 changes: 8 additions & 8 deletions support/k8s/k8s-workload-registrar/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,7 @@ const (
defaultLogLevel = "info"

modeCRD = "crd"
modeWebhook = "webhook"
modeReconcile = "reconcile"
defaultMode = modeWebhook
)

type Mode interface {
Expand Down Expand Up @@ -58,7 +56,6 @@ type CommonMode struct {
}

func (c *CommonMode) ParseConfig(hclConfig string) (err error) {
c.Mode = defaultMode
if err := hcl.Decode(c, hclConfig); err != nil {
return errs.New("unable to decode configuration: %v", err)
}
Expand Down Expand Up @@ -89,8 +86,11 @@ func (c *CommonMode) ParseConfig(hclConfig string) (err error) {
if c.PodLabel != "" && c.PodAnnotation != "" {
return errs.New("workload registration mode specification is incorrect, can't specify both pod_label and pod_annotation")
}
if c.Mode != modeCRD && c.Mode != modeWebhook && c.Mode != modeReconcile {
return errs.New("invalid mode \"%s\", valid values are %s, %s and %s", c.Mode, modeCRD, modeWebhook, modeReconcile)
if c.Mode == "" {
return errs.New("mode must be specified, valid values are %q and %q", modeCRD, modeReconcile)
}
if c.Mode != modeCRD && c.Mode != modeReconcile {
return errs.New("invalid mode %q, valid values are %q and %q", c.Mode, modeCRD, modeReconcile)
}
if c.DisabledNamespaces == nil {
c.DisabledNamespaces = defaultDisabledNamespaces()
Expand Down Expand Up @@ -137,9 +137,9 @@ func LoadMode(path string) (Mode, error) {
CommonMode: *c,
}
default:
mode = &WebhookMode{
CommonMode: *c,
}
// This case is defensive since ParseConfig ensures we have
// a valid mode value.
return nil, errs.New("unknown mode: %q", c.Mode)
}

err = mode.ParseConfig(string(hclBytes))
Expand Down
51 changes: 25 additions & 26 deletions support/k8s/k8s-workload-registrar/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ var (
trust_domain = "domain.test"
cluster = "CLUSTER"
server_socket_path = "SOCKETPATH"
mode = "reconcile"
`
)

Expand All @@ -35,66 +36,65 @@ func TestLoadMode(t *testing.T) {
config, err := LoadMode(confPath)
require.NoError(err)

require.Equal(&WebhookMode{
require.Equal(&ReconcileMode{
CommonMode: CommonMode{
ServerSocketPath: "SOCKETPATH",
ServerAddress: "unix://SOCKETPATH",
TrustDomain: "domain.test",
Cluster: "CLUSTER",
LogLevel: defaultLogLevel,
Mode: "webhook",
Mode: "reconcile",
DisabledNamespaces: []string{"kube-system", "kube-public"},
trustDomain: spiffeid.RequireTrustDomainFromString("domain.test"),
},
Addr: ":8443",
CertPath: defaultCertPath,
KeyPath: defaultKeyPath,
CaCertPath: defaultCaCertPath,
ControllerName: "spire-k8s-registrar",
ClusterDNSZone: "cluster.local",
LeaderElectionResourceLock: "configmaps",
MetricsAddr: ":8080",
}, config)

testCases := []struct {
name string
in string
out *WebhookMode
out *ReconcileMode
err string
}{
{
name: "defaults",
in: testMinimalConfig,
out: &WebhookMode{
out: &ReconcileMode{
CommonMode: CommonMode{
LogLevel: defaultLogLevel,
ServerSocketPath: "SOCKETPATH",
ServerAddress: "unix://SOCKETPATH",
TrustDomain: "domain.test",
Cluster: "CLUSTER",
Mode: "webhook",
Mode: "reconcile",
DisabledNamespaces: []string{"kube-system", "kube-public"},
trustDomain: spiffeid.RequireTrustDomainFromString("domain.test"),
},
Addr: ":8443",
CertPath: defaultCertPath,
KeyPath: defaultKeyPath,
CaCertPath: defaultCaCertPath,
InsecureSkipClientVerification: false,
ControllerName: "spire-k8s-registrar",
ClusterDNSZone: "cluster.local",
LeaderElectionResourceLock: "configmaps",
MetricsAddr: ":8080",
},
},
{
name: "overrides",
in: `
mode = "reconcile"
log_level = "LEVELOVERRIDE"
log_path = "PATHOVERRIDE"
addr = ":1234"
cert_path = "CERTOVERRIDE"
key_path = "KEYOVERRIDE"
cacert_path = "CACERTOVERRIDE"
insecure_skip_client_verification = true
server_socket_path = "SOCKETPATHOVERRIDE"
trust_domain = "domain.test"
cluster = "CLUSTEROVERRIDE"
pod_label = "PODLABEL"
controller_name = "override"
cluster_dns_zone = "override.local"
leader_election_resource_lock = "leases"
metrics_addr = ":8081"
`,
out: &WebhookMode{
out: &ReconcileMode{
CommonMode: CommonMode{
LogLevel: "LEVELOVERRIDE",
LogPath: "PATHOVERRIDE",
Expand All @@ -103,15 +103,14 @@ func TestLoadMode(t *testing.T) {
TrustDomain: "domain.test",
Cluster: "CLUSTEROVERRIDE",
PodLabel: "PODLABEL",
Mode: "webhook",
Mode: "reconcile",
DisabledNamespaces: []string{"kube-system", "kube-public"},
trustDomain: spiffeid.RequireTrustDomainFromString("domain.test"),
},
Addr: ":1234",
CertPath: "CERTOVERRIDE",
KeyPath: "KEYOVERRIDE",
CaCertPath: "CACERTOVERRIDE",
InsecureSkipClientVerification: true,
ControllerName: "override",
ClusterDNSZone: "override.local",
LeaderElectionResourceLock: "leases",
MetricsAddr: ":8081",
},
},
{
Expand Down
95 changes: 0 additions & 95 deletions support/k8s/k8s-workload-registrar/config_webhook.go

This file was deleted.

Loading