Release 2023-02-21 #3656

github-actions · 2023-02-21T10:05:19Z

No description provided.

a size of a *database* cannot be a sum of the sizes of *all databases* indicating that a logical size is calculated for a branch

## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [x] i checked the suggested changes
- [x] this is not a core feature
- [x] this is just a docs update, does not require analytics
- [x] this PR does not require a public announcement

Clients may specify endpoint/project name via `options=project=...`, so we should not only remove `project=` from `options` but also drop `options` entirely, because connection pools don't support it. Discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1676464382670119
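One reading of the change described above, as a minimal Python sketch rather than the proxy's actual Rust code. The helper name `split_project_from_options` and the dict-shaped startup parameters are hypothetical, and the sketch assumes `options` is dropped whenever it carried a `project=` entry.

```python
from typing import Dict, Optional, Tuple


def split_project_from_options(
    params: Dict[str, str],
) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (project, params_to_forward).

    Hypothetical helper: extracts `project=<name>` from the `options`
    startup parameter and decides which parameters to forward.
    """
    project = None
    for token in params.get("options", "").split():
        if token.startswith("project="):
            project = token[len("project="):]
    if project is None:
        # Nothing to strip; forward the parameters unchanged (assumption).
        return None, dict(params)
    # Drop `options` entirely, since (per the commit message)
    # connection pools do not support it.
    forwarded = {k: v for k, v in params.items() if k != "options"}
    return project, forwarded
```

For example, `split_project_from_options({"user": "alice", "options": "project=my-endpoint"})` yields the project name plus parameters without an `options` key.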

## Describe your changes
Updates PITR and GC_PERIOD default value doc

## Issue ticket number and link

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Refactor the tenant_size_model code. Segment now contains just the minimum amount of information needed to calculate the size. Other information that is useful for building up the segment tree, and for display purposes, is now kept elsewhere. The code in 'main.rs' has a new ScenarioBuilder struct for that.

Calculating which Segments are "needed" is now the responsibility of the caller of tenant_size_model, not part of the calculation itself. So it's up to the caller to make all the decisions with retention periods for each branch.

The output of the sizing calculation is now a Vec of SizeResults, rather than a tree. It uses a tree representation internally, when doing the calculation, but it's not exposed to the caller anymore.

Refactor the way the recursive calculation is performed.

Rewrite the code in size.rs that builds the Segment model. Get rid of the intermediate representation with Update structs. Build the Segments directly, with some local HashMaps and Vecs to track branch points to help with that. retention_period is now an input to gather_inputs(), rather than an output.

Update pageserver http API: rename /size endpoint to /synthetic_size with the following parameters:
- /synthetic_size?inputs_only to get debug info;
- /synthetic_size?retention_period=0 to override the cutoff that is used to calculate the size;

pass header -H "Accept: text/html" to get HTML output, otherwise JSON is returned

Update python tests and openapi spec.

---------

Co-authored-by: Anastasia Lubennikova <[email protected]>
Co-authored-by: Joonas Koivunen <[email protected]>
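The `/synthetic_size` parameters listed in the commit message above could be exercised from a test client along these lines. This is a sketch under assumptions: `tenant_url` is a hypothetical base URL (the commit message names only the `/synthetic_size` suffix, its two query parameters, and the `Accept` header), and the request is built but never sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def synthetic_size_request(
    tenant_url: str,
    inputs_only: bool = False,
    retention_period: Optional[int] = None,
    html: bool = False,
) -> Request:
    """Build (but do not send) a request for the /synthetic_size endpoint.

    `tenant_url` is a hypothetical per-tenant base URL supplied by the caller.
    """
    query = []
    if inputs_only:
        query.append("inputs_only")  # flag-style parameter, no value
    if retention_period is not None:
        query.append(urlencode({"retention_period": retention_period}))
    url = tenant_url.rstrip("/") + "/synthetic_size"
    if query:
        url += "?" + "&".join(query)
    # HTML is requested explicitly; without the header JSON is returned.
    headers = {"Accept": "text/html"} if html else {}
    return Request(url, headers=headers)
```

Passing `retention_period=0` with `html=True` corresponds to the override-plus-HTML case from the commit message.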

Repeatedly (twice) try to download the compaction targeted layers before actual compaction.

Adds tests for both L0 compaction downloading layers and image creation downloading layers. Image creation support existed already.

Fixes #3591

Co-authored-by: Christian Schwarz <[email protected]>
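The "try twice before compacting" idea above can be sketched as follows. This is an illustration in Python, not the pageserver's Rust implementation; `download_before_compaction`, `is_remote` and `download` are hypothetical names, and the sketch assumes a failed download simply leaves the layer remote.

```python
from typing import Callable, Iterable, List


def download_before_compaction(
    layers: Iterable[str],
    is_remote: Callable[[str], bool],
    download: Callable[[str], None],
    attempts: int = 2,
) -> List[str]:
    """Make up to `attempts` passes over the layers that are still remote.

    Returns the layers that remain remote after the last pass.
    """
    pending = [layer for layer in layers if is_remote(layer)]
    for _ in range(attempts):
        if not pending:
            break
        for layer in pending:
            download(layer)
        # Re-check which layers are still remote after this pass.
        pending = [layer for layer in pending if is_remote(layer)]
    return pending
```

A caller would proceed with compaction only when the returned list is empty.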

It's not a property of the credentials that we receive from the client, so remove it from ClientCredentials. Instead, pass it as an argument directly to the 'authenticate' function, where it's actually used. The rest of the changes are just plumbing to pass it through the call stack to 'authenticate'.

@latest

## Describe your changes
```
$ poetry add werkzeug@latest "moto[server]@latest"
Using version ^2.2.3 for werkzeug
Using version ^4.1.2 for moto

Updating dependencies
Resolving dependencies... (1.6s)

Writing lock file

Package operations: 0 installs, 2 updates, 1 removal

  • Removing pytz (2022.1)
  • Updating werkzeug (2.1.2 -> 2.2.3)
  • Updating moto (3.1.18 -> 4.1.2)
```

Resolves:
- https://github.com/neondatabase/neon/security/dependabot/14
- https://github.com/neondatabase/neon/security/dependabot/13

`@dependabot` failed to create a PR for some reason (I guess because it also needed to handle `moto` dependency)

## Issue ticket number and link
N/A

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.
- [x] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
- [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Fixes #3468.

This does change how the panics look and, most importantly, makes sure they are not interleaved with other messages. Adds a `GET /v1/panic` endpoint for panic testing (useful for sentry dedup and for testing this hook).

The panics are now logged within a new error level span called `panic`, which separates them from other error level events. The panic info is unpacked into span fields:
- thread=mgmt request worker
- location="pageserver/src/http/routes.rs:898:9"

Co-authored-by: Christian Schwarz <[email protected]>

This commit sets up OpenTelemetry tracing and exporter, so that tracing spans can be exported as OpenTelemetry traces as well. All outgoing HTTP requests will be traced. A separate (child) span is created for each outgoing HTTP request, and the tracing context is also propagated to the server in the HTTP headers.

If tracing is enabled in the control plane and compute node too, you can now get an end-to-end distributed trace of what happens when a new connection is established, starting from the handshake with the client, creating the 'start_compute' operation in the control plane, starting the compute node, all the way down to fetching the base backup and the availability checks in compute_ctl.

Co-authored-by: Dmitry Ivanov <[email protected]>

On the surface, this doesn't add much, but there are some benefits:

* We can do graceful shutdowns and thus record more code coverage data.
* We now have a foundation for the more interesting behaviors, e.g. "stop accepting new connections after SIGTERM but keep serving the existing ones".
* We give the otel machinery a chance to flush trace events before finally shutting down.
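The shutdown flow sketched in the benefits above (record the signal, stop accepting, flush telemetry, exit) might look like this in miniature. It is a Python illustration only, since the real service is written in Rust; `serve`, `accept_one` and `flush_telemetry` are hypothetical names.

```python
import signal
import threading

# Set by the signal handler, polled by the accept loop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame) -> None:
    """Signal handler: only record the request; the main loop does the rest."""
    shutdown_requested.set()


def install_handler() -> None:
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(accept_one, flush_telemetry) -> int:
    """Accept connections until shutdown is requested, then flush and return.

    `accept_one` and `flush_telemetry` are hypothetical callables supplied
    by the caller; returns the number of connections accepted.
    """
    served = 0
    while not shutdown_requested.is_set():
        accept_one()
        served += 1
    # Give the tracing exporter a chance to flush before exiting.
    flush_telemetry()
    return served
```

Keeping the handler trivial and doing the actual work in the main loop is a common pattern, because very little is safe to do inside a signal handler.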

## Describe your changes
test_on_demand_download is flaky because it does not wait until the created image layer is transferred to S3. test_tenants_with_remote_storage just leaves garbage at the end of the overwritten file.

The right solution for test_on_demand_download is to add an API call that waits for synchronization with S3 to complete (not just based on the last record LSN). But right now it is solved using sleep.

## Issue ticket number and link
#3209

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Enables the tracing panic hook, introduced for pageserver in #3475, in these binaries as well:
- proxy
- safekeeper
- storage_broker

For proxy, a drop guard which resets the original std panic hook was added in the first commit. Other binaries don't need it, so they never reset anything, by `disarm`ing the drop guard.

The aim of the change is to make sure all panics
a) have span information
b) are logged similarly to other messages, not interleaved with other messages as happens right now.

Interleaving happens right now because std prints panics to stderr, and other logging happens in stdout. If this was handled gracefully by some utility, the log message splitter would treat panics as belonging to the previous message because it expects a message to start with a timestamp.

Cc: #3468
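The splitter behaviour described above can be illustrated with a toy example. This is a sketch under assumptions, not the actual log pipeline: the timestamp format and the function name `split_messages` are made up for illustration.

```python
import re
from typing import List

# Assumed timestamp format, for illustration only (ISO-8601-like prefix).
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def split_messages(lines: List[str]) -> List[List[str]]:
    """Group lines into messages; a new message starts at a timestamp."""
    messages: List[List[str]] = []
    for line in lines:
        if TIMESTAMP.match(line) or not messages:
            messages.append([line])
        else:
            # No timestamp: treated as a continuation of the previous message.
            messages[-1].append(line)
    return messages
```

A raw std panic line such as `thread 'main' panicked at ...` has no timestamp prefix, so a splitter of this kind attaches it to whatever message came before it. Logging the panic as a regular event gives it a timestamp of its own.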

arssher · 2023-02-21T10:06:51Z

I'd like to include neondatabase/postgres#259 here to avoid hotfix.

vadim2404 · 2023-02-21T10:49:03Z

@arssher, I don't think you'll be able to avoid the hotfix. At the moment, Bitmap scan prefetch has been merged into the pg branches with an open bug (cc @MMeent, @knizhnik)

arssher · 2023-02-21T10:55:27Z

Okay, good to know, but then it would be nice to do one hotfix (in PG repo) instead of two (roll one more main release with hotfix containing hotfix).

SergeyMelnikov

Please ping me just before deploying to the first region (I'd like to observe rollout of #3651)

MMeent · 2023-02-21T11:01:50Z

> Okay, good to know, but then it would be nice to do one hotfix (in PG repo) instead of two (roll one more main release with hotfix containing hotfix).

Yes. But we can't roll out a PostgreSQL update without a Neon release, so "hotfixes" in PostgreSQL need hotfixes in Neon's release branch too.

arssher · 2023-02-21T11:04:38Z

> Yes. But we can't roll out a PostgreSQL update without a Neon release, so "hotfixes" in PostgreSQL need hotfixes in Neon's release branch too.

That is the point -- I want to include PG hotfix in this PR to avoid Neon hotfix.

vadim2404 · 2023-02-21T11:40:41Z

@arssher before having this PR on stg, I don't want to see it in the release branch

koivunej · 2023-02-21T11:52:07Z

> This branch is out-of-date with the base branch
> Merge the latest changes from release into this branch.

This is because a41b524 got added to the release branch yesterday, but the merge commit should handle it.

lubennikovaav and others added 30 commits February 15, 2023 16:02

Add debug messages around timeline.get_current_logical_size

a5ce2b5

Add debug messages around sending cached metrics

a839860

Fix periodic metric sending: don't reset timer on every iteration (#3617)

7991bd3

Previously timer was reset on every collect_metrics_iteration and sending of cached metrics was never triggered. This is a follow-up for a69da4a.
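The timer bug described above is easy to reproduce in miniature. The following is a hedged Python sketch, not the pageserver's Rust code; `CachedMetricsSender` and its fields are hypothetical, and the clock is injected so the behaviour can be checked without sleeping.

```python
from typing import Callable, List


class CachedMetricsSender:
    """Hypothetical stand-in for the metrics loop described above."""

    def __init__(self, send_interval: float, now: Callable[[], float]):
        self.send_interval = send_interval
        self.now = now
        self.last_sent = now()  # set once, outside the per-iteration path
        self.sent: List[float] = []

    def iteration(self) -> None:
        # The bug corresponds to resetting `last_sent` here on every call,
        # so the elapsed time would never reach the interval.
        if self.now() - self.last_sent >= self.send_interval:
            self.sent.append(self.now())
            self.last_sent = self.now()  # reset only after an actual send
```

With a 10-second interval and iterations every 3 seconds, a send fires once the elapsed time reaches the interval. With the per-iteration reset, it would never fire.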

Compile pgjwt extension

5082d84

Propose less verbose way to build neon (#3624)

f0b41e7

Closes #3518 and might help #3611 and the future build attempts. Propose `-s` flag in the Readme when building via `make` command, to help people to spot build errors easier.

Revert "Add debug messages around timeline.get_current_logical_size"

d9ba3c5

This reverts commit a5ce2b5.

Revert "Add debug messages around sending cached metrics"

6139e8e

This reverts commit a839860.

Only use active timelines in synthetic_size calculation

0d3aefb

Extract password hack & cleartext hack

edffe0d

Move hacks to a dedicated module.

a4d5c80

Do not deploy storage to old account (#3630)

a1b0621

It's gone

fix: flaky test_compaction_downloads_on_demand_with_image_creation (#3629)

501702b

fix is to stop postgres before the final checkpoint to ensure no inmemory layer gets created. Fixes #3627.

fix: avoid busy loop on replacement failure (#3613)

8e6b27b

Add an AtomicBool per RemoteLayer, use it to mark together with closed semaphore that remotelayer is unusable until restart or ignore+load. #3533 (comment)

[proxy] Improve tracing spans here and there.

d90cd36

Add debug messages to catch abnormal consumption metric values

40799d8

Fix make clean:

53128d5

Use correct paths in neon-pg-ext-clean

staging: enable automatic layer eviction at 20m threshold + period (#3636)

8d28a24

What it says on the tin. Part of #2476

Update Postgres extensions (#3615)

564fa11

- Update postgis from 3.3.1 to 3.3.2
- Update plv8 from 3.1.4 to 3.1.5
- Update h3-pg from 4.0.1 to 4.1.2 (and underlying h3 from 4.0.1 to 4.1.0)

Run compute_ctl in a cgroup in VMs (#3577)

2153d2e

koivunej and others added 2 commits February 21, 2023 10:03

Compile xml2 extension

5c5b03c

github-actions bot requested review from a team as code owners February 21, 2023 10:05

github-actions bot requested review from arssher, nikitakalyanov, MMeent and koivunej and removed request for a team February 21, 2023 10:05

vadim2404 self-requested a review February 21, 2023 10:47

vadim2404 approved these changes Feb 21, 2023

SergeyMelnikov approved these changes Feb 21, 2023

Warn when background tasks exceed their configured period (#3654)

7de3732

Fixes #3648.

add random init delay for background tasks (#3655)

b220ba6

Fixes #3649.

Merge branch 'release' into releases/2023-02-21

6508540

vadim2404 enabled auto-merge February 21, 2023 14:33

vadim2404 merged commit acbf414 into release Feb 21, 2023

vadim2404 deleted the releases/2023-02-21 branch February 21, 2023 15:03
