Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 2023-02-21 #3656

Merged
merged 43 commits into from
Feb 21, 2023
Merged

Release 2023-02-21 #3656

merged 43 commits into from
Feb 21, 2023

Conversation

github-actions[bot]
Copy link

No description provided.

lubennikovaav and others added 30 commits February 15, 2023 16:02
a size of a *database* cannot be a sum of the sizes of *all databases*
indicating that a logical size is calculated for a branch

## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [x] i checked the suggested changes
- [x] this is not a core feature
- [x] this is just a docs update, does not require analytics
- [x] this PR does not require a public announcement
Clients may specify endpoint/project name via `options=project=...`,
so we should not only remove `project=` from `options` but also
drop `options` entirely, because connection pools don't support it.

Discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1676464382670119
## Describe your changes
Updates PITR and GC_PERIOD default value doc
## Issue ticket number and link

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
Refactor the tenant_size_model code. Segment now contains just the
minimum amount of information needed to calculate the size. Other
information that is useful for building up the segment tree, and for
display purposes, is now kept elsewhere. The code in 'main.rs' has a new
ScenarioBuilder struct for that.

Calculating which Segments are "needed" is now the responsibility of the
caller of tenant_size_mode, not part of the calculation itself. So it's
up to the caller to make all the decisions with retention periods for
each branch.

The output of the sizing calculation is now a Vec of SizeResults, rather
than a tree. It uses a tree representation internally, when doing the
calculation, but it's not exposed to the caller anymore.

Refactor the way the recursive calculation is performed.

Rewrite the code in size.rs that builds the Segment model. Get rid of
the intermediate representation with Update structs. Build the Segments
directly, with some local HashMaps and Vecs to track branch points to
help with that.

retention_period is now an input to gather_inputs(), rather than an
output.

Update pageserver http API: rename /size endpoint to /synthetic_size
with following parameters:
    - /synthetic_size?inputs_only to get debug info;
- /synthetic_size?retention_period=0 to override cutoff that is used to
calculate the size;
pass header -H "Accept: text/html" to get HTML output, otherwise JSON is
returned

Update python tests and openapi spec.

---------

Co-authored-by: Anastasia Lubennikova <[email protected]>
Co-authored-by: Joonas Koivunen <[email protected]>
)

Previously timer was reset on every collect_metrics_iteration and sending of cached metrics was never triggered.

This is a follow-up for a69da4a.
Closes #3518 and might help
#3611 and the future build
attempts.

Propose `-s` flag in the Readme when building via `make` command, to
help people to spot build errors easier.
Repeatedly (twice) try to download the compaction targeted layers before actual
compaction. Adds tests for both L0 compaction downloading layers and image
creation downloading layers. Image creation support existed already.

Fixes #3591

Co-authored-by: Christian Schwarz <[email protected]>
It's not a property of the credentials that we receive from the
client, so remove it from ClientCredentials. Instead, pass it as an
argument directly to 'authenticate' function, where it's actually
used. All the rest of the changes is just plumbing to pass it through
the call stack to 'authenticate'
## Describe your changes

```
$ poetry add werkzeug@latest "moto[server]@latest"
Using version ^2.2.3 for werkzeug
Using version ^4.1.2 for moto

Updating dependencies
Resolving dependencies... (1.6s)

Writing lock file

Package operations: 0 installs, 2 updates, 1 removal

  • Removing pytz (2022.1)
  • Updating werkzeug (2.1.2 -> 2.2.3)
  • Updating moto (3.1.18 -> 4.1.2)
```

Resolves:
- https://github.com/neondatabase/neon/security/dependabot/14
- https://github.com/neondatabase/neon/security/dependabot/13

`@dependabot` failed to create a PR for some reason (I guess because it
also needed to handle `moto` dependency)

## Issue ticket number and link
N/A

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.
- [x] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [x] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
…3629)

fix is to stop postgres before the final checkpoint to ensure no
inmemory layer gets created.

Fixes #3627.
Fixes #3468.

This does change how the panics look, and most importantly, make sure
they are not interleaved with other messages. Adds a `GET /v1/panic`
endpoint for panic testing (useful for sentry dedup and this hook
testing).

The panics are now logged within a new error level span called `panic`
which separates it from other error level events. The panic info is
unpacked into span fields:
- thread=mgmt request worker
- location="pageserver/src/http/routes.rs:898:9"

Co-authored-by: Christian Schwarz <[email protected]>
Add an AtomicBool per RemoteLayer, use it to mark together with closed
semaphore that remotelayer is unusable until restart or ignore+load.

#3533 (comment)
This commit sets up OpenTelemetry tracing and exporter, so that they
can be exported as OpenTelemetry traces as well.

All outgoing HTTP requests will be traced. A separate (child)
span is created for each outgoing HTTP request, and the tracing
context is also propagated to the server in the HTTP headers.

If tracing is enabled in the control plane and compute node too, you
can now get an end-to-end distributed trace of what happens when a new
connection is established, starting from the handshake with the
client, creating the 'start_compute' operation in the control plane,
starting the compute node, all the way to down to fetching the base
backup and the availability checks in compute_ctl.

Co-authored-by: Dmitry Ivanov <[email protected]>
On the surface, this doesn't add much, but there are some benefits:

* We can do graceful shutdowns and thus record more code coverage data.

* We now have a foundation for the more interesting behaviors, e.g. "stop
  accepting new connections after SIGTERM but keep serving the existing ones".

* We give the otel machinery a chance to flush trace events before
  finally shutting down.
## Describe your changes

test_on_demand_download is flaky because not waiting until created image
layer is transferred to S3.
test_tenants_with_remote_storage just leaves garbage at the end of
overwritten file.

Right solution for test_on_demand_download is to add some API call to
wait completion of synchronization with S3 (not just based on last
record LSN). But right now it is solved using sleep.

## Issue ticket number and link

#3209 

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
Use correct paths in neon-pg-ext-clean
- Update postgis from 3.3.1 from 3.3.2
- Update plv8 from 3.1.4 to 3.1.5
- Update h3-pg from 4.0.1 to 4.1.2 (and underlying h3 from 4.0.1 to 4.1.0)
koivunej and others added 2 commits February 21, 2023 10:03
Enables tracing panic hook in addition to pageserver introduced in
#3475:

- proxy
- safekeeper
- storage_broker

For proxy, a drop guard which resets the original std panic hook was
added on the first commit. Other binaries don't need it so they never
reset anything by `disarm`ing the drop guard.

The aim of the change is to make sure all panics a) have span
information b) are logged similar to other messages, not interleaved
with other messages as happens right now. Interleaving happens right now
because std prints panics to stderr, and other logging happens in
stdout. If this was handled gracefully by some utility, the log message
splitter would treat panics as belonging to the previous message because
it expects a message to start with a timestamp.

Cc: #3468
@github-actions github-actions bot requested review from a team as code owners February 21, 2023 10:05
@github-actions github-actions bot requested review from arssher, nikitakalyanov, MMeent and koivunej and removed request for a team February 21, 2023 10:05
@arssher
Copy link
Contributor

arssher commented Feb 21, 2023

I'd like to include neondatabase/postgres#259 here to avoid hotfix.

@vadim2404 vadim2404 self-requested a review February 21, 2023 10:47
@vadim2404
Copy link
Contributor

@arssher, I don't think you'll be able to avoid the hotfix. At this moment, in pg branches were merged Bitmap scan prefetch with an open bug (cc @MMeent, @knizhnik )

@arssher
Copy link
Contributor

arssher commented Feb 21, 2023

Okay, good to know, but then it would be nice to do one hotfix (in PG repo) instead of two (roll one more main release with hotfix containing hotfix).

Copy link
Contributor

@SergeyMelnikov SergeyMelnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ping me just before deploying to the first region (I'd like to observe rollout of #3651)

@MMeent
Copy link
Contributor

MMeent commented Feb 21, 2023

Okay, good to know, but then it would be nice to do one hotfix (in PG repo) instead of two (roll one more main release with hotfix containing hotfix).

Yes. But we can't roll out a PostgreSQL update without a Neon release, so "hotfixes" in PostgreSQL need hotfixes in Neon's release branche too.

@arssher
Copy link
Contributor

arssher commented Feb 21, 2023

Yes. But we can't roll out a PostgreSQL update without a Neon release, so "hotfixes" in PostgreSQL need hotfixes in Neon's release branche too.

That is the point -- I want to include PG hotfix in this PR to avoid Neon hotfix.

@vadim2404
Copy link
Contributor

@arssher before having this PR on stg, I don't want to see in release branch

@koivunej
Copy link
Member

koivunej commented Feb 21, 2023

This branch is out-of-date with the base branch
Merge the latest changes from release into this branch.

This is because of a41b524 got added to release branch yesterday but merge commit should handle it.

@vadim2404 vadim2404 enabled auto-merge February 21, 2023 14:33
@vadim2404 vadim2404 merged commit acbf414 into release Feb 21, 2023
@vadim2404 vadim2404 deleted the releases/2023-02-21 branch February 21, 2023 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.