[BUG] primary/secondary pg cluster wal-g backup restore error #6566

Closed
JashBook opened this issue Jan 31, 2024 · 6 comments
Labels: kind/bug (Something isn't working), severity/major (Great chance user will encounter the same problem)
Milestone: Release 0.9.0

Comments

@JashBook
Collaborator

Describe the bug
Restoring a primary/secondary (2-replica) PostgreSQL cluster from a wal-g backup fails. On the restored cluster, Patroni's custom bootstrap script (kb_restore.sh) never completes: the data directory is renamed to data.failed after the first failed attempt, every later attempt fails with "FileNotFoundError: /home/postgres/pgdata/pgroot/data/recovery.conf", both postgresql pods stay 4/5 Ready, and the Restore OpsRequest hangs in Running.
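
Based on the log messages further down, the custom bootstrap step appears to play out roughly as in the sketch below (an illustration inferred from the errors, not the actual contents of the KubeBlocks kb_restore.sh):

# 1st bootstrap attempt on the restored cluster
bash /home/postgres/pgdata/kb_restore/kb_restore.sh   # runs, but the bootstrap fails
# Patroni then renames the data directory:
#   /home/postgres/pgdata/pgroot/data -> /home/postgres/pgdata/pgroot/data.failed
# 2nd bootstrap attempt
bash /home/postgres/pgdata/kb_restore/kb_restore.sh
#   mv '/home/postgres/pgdata/pgroot/data.old/*' fails: No such file or directory
# Patroni then tries to write /home/postgres/pgdata/pgroot/data/recovery.conf,
#   which raises FileNotFoundError because the data directory no longer exists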

kbcli version
Kubernetes: v1.26.6-gke.1700
KubeBlocks: 0.8.2-beta.2
kbcli: 0.8.1

To Reproduce
Steps to reproduce the behavior:

  1. Create cluster
kbcli cluster create  postgres-ttywey --termination-policy=Halt --monitoring-interval=0 --cluster-definition=postgresql --enable-all-logs=false --cluster-version=postgresql-14.8.0 --set cpu=100m,memory=0.5Gi,replicas=2,storage=3Gi  --namespace default
  2. Create config-wal-g backup
kbcli cluster backup postgres-ttywey --method config-wal-g --namespace default
  3. Reconfigure archive_command
kbcli cluster configure postgres-ttywey --auto-approve   --set archive_command="'envdir /home/postgres/pgdata/wal-g/env /home/postgres/pgdata/wal-g/wal-g wal-push %p'" --components postgresql --config-spec postgresql-configuration  --namespace default
  4. Run wal-g backup and restore (the restore command itself is shown after this list)
kbcli cluster backup postgres-ttywey --method wal-g --namespace default 

kbcli cluster describe-backup backup-default-postgres-ttywey-20240131173822 --namespace default
  5. See error
kubectl get pod,ops -l app.kubernetes.io/instance=postgres-ttywey-backup
NAME                                      READY   STATUS    RESTARTS   AGE
pod/postgres-ttywey-backup-postgresql-0   4/5     Running   0          9m16s
pod/postgres-ttywey-backup-postgresql-1   4/5     Running   0          9m16s

NAME                                                   TYPE      CLUSTER                  STATUS    PROGRESS   AGE
opsrequest.apps.kubeblocks.io/postgres-ttywey-backup   Restore   postgres-ttywey-backup   Running   -/-        9m44s
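
Note: the restore into the new cluster postgres-ttywey-backup is not shown above; it was presumably started with something like the following (the exact kbcli flags here are an assumption, not taken from the original report):

kbcli cluster restore postgres-ttywey-backup --backup backup-default-postgres-ttywey-20240131173822 --namespace default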

describe pod

 kubectl describe pod postgres-ttywey-backup-postgresql-0
Name:         postgres-ttywey-backup-postgresql-0
Namespace:    default
Priority:     0
Node:         gke-infracreate-gke-kbdata-4c8g-b9b386a5-h9kn/10.10.0.88
Start Time:   Wed, 31 Jan 2024 17:39:30 +0800
Labels:       app.kubernetes.io/component=postgresql
              app.kubernetes.io/instance=postgres-ttywey-backup
              app.kubernetes.io/managed-by=kubeblocks
              app.kubernetes.io/name=postgresql
              app.kubernetes.io/version=
              apps.kubeblocks.io/component-name=postgresql
              apps.kubeblocks.postgres.patroni/scope=postgres-ttywey-backup-postgresql-patroni5feea8e5
              controller-revision-hash=postgres-ttywey-backup-postgresql-7478c545fb
              statefulset.kubernetes.io/pod-name=postgres-ttywey-backup-postgresql-0
Annotations:  apps.kubeblocks.io/component-replicas: 2
              status:
                {"conn_url":"postgres://10.128.40.8:5432/postgres","api_url":"http://10.128.40.8:8008/patroni","state":"stopped","role":"uninitialized","v...
Status:       Running
IP:           10.128.40.8
IPs:
  IP:           10.128.40.8
Controlled By:  StatefulSet/postgres-ttywey-backup-postgresql
Init Containers:
  pg-init-container:
    Container ID:  containerd://81a77a1ac40a5c878b3f90eea4834f733b18b9c666bb7e4ca98ad44832adac5c
    Image:         infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo:14.8.0-pgvector-v0.5.0
    Image ID:      infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo@sha256:fcb43ae7be09ee319652334369f5ced5450566772ee0a54f9da941d4a0fe4238
    Port:          <none>
    Host Port:     <none>
    Command:
      /kb-scripts/init_container.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 31 Jan 2024 17:39:42 +0800
      Finished:     Wed, 31 Jan 2024 17:39:42 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:   postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:     (v1:metadata.uid)
      KB_NAMESPACE:  default (v1:metadata.namespace)
      KB_SA_NAME:     (v1:spec.serviceAccountName)
      KB_NODENAME:    (v1:spec.nodeName)
      KB_HOST_IP:     (v1:status.hostIP)
      KB_POD_IP:      (v1:status.podIP)
      KB_POD_IPS:     (v1:status.podIPs)
      KB_HOSTIP:      (v1:status.hostIP)
      KB_PODIP:       (v1:status.podIP)
      KB_PODIPS:      (v1:status.podIPs)
      KB_POD_FQDN:   $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
    Mounts:
      /home/postgres/conf from postgresql-config (rw)
      /home/postgres/pgdata from data (rw)
      /kb-podinfo from pod-info (rw)
      /kb-scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
Containers:
  postgresql:
    Container ID:  containerd://fdf72a891951be662c568387ecf23dcee2ad183c3ec8dea4f88f68256b481a43
    Image:         infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo:14.8.0-pgvector-v0.5.0
    Image ID:      infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo@sha256:fcb43ae7be09ee319652334369f5ced5450566772ee0a54f9da941d4a0fe4238
    Ports:         5432/TCP, 8008/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /kb-scripts/setup.sh
    State:          Running
      Started:      Wed, 31 Jan 2024 17:39:43 +0800
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  512Mi
    Requests:
      cpu:      100m
      memory:   512Mi
    Readiness:  exec [/bin/sh -c -ee exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432
[ -f /postgresql/tmp/.initialized ] || [ -f /postgresql/.initialized ]
] delay=10s timeout=5s period=30s #success=1 #failure=3
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env      ConfigMap  Optional: false
      postgres-ttywey-backup-postgresql-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_POD_FQDN:                $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
      SERVICE_PORT:               5432
      DCS_ENABLE_KUBERNETES_API:  true
      KUBERNETES_USE_CONFIGMAPS:  true
      SCOPE:                      $(KB_CLUSTER_NAME)-$(KB_COMP_NAME)-patroni$(KB_CLUSTER_UID_POSTFIX_8)
      KUBERNETES_SCOPE_LABEL:     apps.kubeblocks.postgres.patroni/scope
      KUBERNETES_ROLE_LABEL:      apps.kubeblocks.postgres.patroni/role
      KUBERNETES_LABELS:          {"app.kubernetes.io/instance":"$(KB_CLUSTER_NAME)","apps.kubeblocks.io/component-name":"$(KB_COMP_NAME)"}
      RESTORE_DATA_DIR:           /home/postgres/pgdata/kb_restore
      KB_PG_CONFIG_PATH:          /home/postgres/conf/postgresql.conf
      SPILO_CONFIGURATION:        bootstrap:
                                    initdb:
                                      - auth-host: md5
                                      - auth-local: trust
                                  
      ALLOW_NOSSL:                true
      PGROOT:                     /home/postgres/pgdata/pgroot
      POD_IP:                      (v1:status.podIP)
      POD_NAMESPACE:              default (v1:metadata.namespace)
      PGUSER_SUPERUSER:           <set to the key 'username' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      PGPASSWORD_SUPERUSER:       <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      PGUSER_ADMIN:               superadmin
      PGPASSWORD_ADMIN:           <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      PGUSER_STANDBY:             standby
      PGPASSWORD_STANDBY:         <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      PGUSER:                     <set to the key 'username' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      PGPASSWORD:                 <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
    Mounts:
      /dev/shm from dshm (rw)
      /home/postgres/conf from postgresql-config (rw)
      /home/postgres/pgdata from data (rw)
      /kb-podinfo from pod-info (rw)
      /kb-scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
  pgbouncer:
    Container ID:  containerd://9be966297fd39e6793b6644d929b3520a157a0b43a39b7e6c2c80931f32b8684
    Image:         infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/pgbouncer:1.19.0
    Image ID:      infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/pgbouncer@sha256:c788fda7436d178e6ea94cf7b8fa767005f0ef483edd90b328c560e1935b26b5
    Port:          6432/TCP
    Host Port:     0/TCP
    Command:
      /kb-scripts/pgbouncer_setup.sh
    State:          Running
      Started:      Wed, 31 Jan 2024 17:39:43 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Liveness:   tcp-socket :tcp-pgbouncer delay=15s timeout=5s period=30s #success=1 #failure=3
    Readiness:  tcp-socket :tcp-pgbouncer delay=15s timeout=5s period=30s #success=1 #failure=3
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env      ConfigMap  Optional: false
      postgres-ttywey-backup-postgresql-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:             postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:               (v1:metadata.uid)
      KB_NAMESPACE:            default (v1:metadata.namespace)
      KB_SA_NAME:               (v1:spec.serviceAccountName)
      KB_NODENAME:              (v1:spec.nodeName)
      KB_HOST_IP:               (v1:status.hostIP)
      KB_POD_IP:                (v1:status.podIP)
      KB_POD_IPS:               (v1:status.podIPs)
      KB_HOSTIP:                (v1:status.hostIP)
      KB_PODIP:                 (v1:status.podIP)
      KB_PODIPS:                (v1:status.podIPs)
      KB_POD_FQDN:             $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
      PGBOUNCER_AUTH_TYPE:     md5
      POSTGRESQL_USERNAME:     <set to the key 'username' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      POSTGRESQL_PASSWORD:     <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      POSTGRESQL_PORT:         5432
      POSTGRESQL_HOST:          (v1:status.podIP)
      PGBOUNCER_PORT:          6432
      PGBOUNCER_BIND_ADDRESS:  0.0.0.0
    Mounts:
      /home/pgbouncer/conf from pgbouncer-config (rw)
      /kb-scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
  metrics:
    Container ID:  containerd://870c8d2b233d0462b06084c4b9edf43ac0ecd14e5154ffa73c7772c9990072d2
    Image:         infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:      infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/agamotto@sha256:cbab349b90490807a8d5039bf01bc7e37334f20c98c7dd75bc7fc4cf9e5b10ee
    Port:          9187/TCP
    Host Port:     0/TCP
    Command:
      /bin/agamotto
      --config=/opt/agamotto/agamotto-config.yaml
    State:          Running
      Started:      Wed, 31 Jan 2024 17:39:44 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env      ConfigMap  Optional: false
      postgres-ttywey-backup-postgresql-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:       postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:         (v1:metadata.uid)
      KB_NAMESPACE:      default (v1:metadata.namespace)
      KB_SA_NAME:         (v1:spec.serviceAccountName)
      KB_NODENAME:        (v1:spec.nodeName)
      KB_HOST_IP:         (v1:status.hostIP)
      KB_POD_IP:          (v1:status.podIP)
      KB_POD_IPS:         (v1:status.podIPs)
      KB_HOSTIP:          (v1:status.hostIP)
      KB_PODIP:           (v1:status.podIP)
      KB_PODIPS:          (v1:status.podIPs)
      KB_POD_FQDN:       $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
      ENDPOINT:          127.0.0.1:5432
      DATA_SOURCE_PASS:  <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      DATA_SOURCE_USER:  <set to the key 'username' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
    Mounts:
      /opt/agamotto from agamotto-configuration (rw)
      /opt/conf from postgresql-custom-metrics (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
  kb-checkrole:
    Container ID:  containerd://162b72cb2c506ff1657983d9337343698da9dac5929227f6147680b328299d83
    Image:         docker.io/apecloud/kubeblocks-tools:0.8.2-beta.2
    Image ID:      docker.io/apecloud/kubeblocks-tools@sha256:01d02531f0093eff056e1dcd17b1c3dc988a1c0441ba263c392fdb2d50d358f8
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      lorry
      --port
      3501
      --grpcport
      50001
    State:          Running
      Started:      Wed, 31 Jan 2024 17:39:44 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Readiness:  http-get http://:3501/v1.0/checkrole delay=0s timeout=1s period=1s #success=1 #failure=2
    Startup:    tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env      ConfigMap  Optional: false
      postgres-ttywey-backup-postgresql-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                   postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:                     (v1:metadata.uid)
      KB_NAMESPACE:                  default (v1:metadata.namespace)
      KB_SA_NAME:                     (v1:spec.serviceAccountName)
      KB_NODENAME:                    (v1:spec.nodeName)
      KB_HOST_IP:                     (v1:status.hostIP)
      KB_POD_IP:                      (v1:status.podIP)
      KB_POD_IPS:                     (v1:status.podIPs)
      KB_HOSTIP:                      (v1:status.hostIP)
      KB_PODIP:                       (v1:status.podIP)
      KB_PODIPS:                      (v1:status.podIPs)
      KB_POD_FQDN:                   $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_PORT:               5432
      KB_DATA_PATH:                  /home/postgres/pgdata
      KB_BUILTIN_HANDLER:            postgresql
      KB_SERVICE_USER:               <set to the key 'username' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:           <set to the key 'password' in secret 'postgres-ttywey-backup-conn-credential'>  Optional: false
      KB_RSM_ACTION_SVC_LIST:        null
      KB_RSM_ROLE_UPDATE_MECHANISM:  DirectAPIServerEventUpdate
      KB_RSM_ROLE_PROBE_TIMEOUT:     1
      KB_CLUSTER_NAME:                (v1:metadata.labels['app.kubernetes.io/instance'])
      KB_COMP_NAME:                   (v1:metadata.labels['apps.kubeblocks.io/component-name'])
      KB_SERVICE_CHARACTER_TYPE:     postgresql
    Mounts:
      /home/postgres/pgdata from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
  config-manager:
    Container ID:  containerd://afef0419d30699ba949cddabd0430d102e94ea70174db9c72d47e63ae95ea453
    Image:         docker.io/apecloud/kubeblocks-tools:0.8.2-beta.2
    Image ID:      docker.io/apecloud/kubeblocks-tools@sha256:01d02531f0093eff056e1dcd17b1c3dc988a1c0441ba263c392fdb2d50d358f8
    Port:          <none>
    Host Port:     <none>
    Command:
      env
    Args:
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
      /bin/reloader
      --log-level
      info
      --operator-update-enable
      --tcp
      9901
      --config
      /opt/config-manager/config-manager.yaml
    State:          Running
      Started:      Wed, 31 Jan 2024 17:39:44 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      postgres-ttywey-backup-postgresql-env      ConfigMap  Optional: false
      postgres-ttywey-backup-postgresql-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:            postgres-ttywey-backup-postgresql-0 (v1:metadata.name)
      KB_POD_UID:              (v1:metadata.uid)
      KB_NAMESPACE:           default (v1:metadata.namespace)
      KB_SA_NAME:              (v1:spec.serviceAccountName)
      KB_NODENAME:             (v1:spec.nodeName)
      KB_HOST_IP:              (v1:status.hostIP)
      KB_POD_IP:               (v1:status.podIP)
      KB_POD_IPS:              (v1:status.podIPs)
      KB_HOSTIP:               (v1:status.hostIP)
      KB_PODIP:                (v1:status.podIP)
      KB_PODIPS:               (v1:status.podIPs)
      KB_POD_FQDN:            $(KB_POD_NAME).postgres-ttywey-backup-postgresql-headless.$(KB_NAMESPACE).svc
      CONFIG_MANAGER_POD_IP:   (v1:status.podIP)
      DB_TYPE:                postgresql
      TOOLS_PATH:             /opt/kb-tools/reload/postgresql-configuration:/opt/config-manager
    Mounts:
      /home/postgres/conf from postgresql-config (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/postgresql-configuration from cm-script-postgresql-configuration (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hvdd7 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-postgres-ttywey-backup-postgresql-0
    ReadOnly:   false
  dshm:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  512Mi
  pod-info:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels['kubeblocks.io/role'] -> pod-role
      metadata.annotations['rs.apps.kubeblocks.io/primary'] -> primary-pod
      metadata.annotations['apps.kubeblocks.io/component-replicas'] -> component-replicas
  agamotto-configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      postgres-ttywey-backup-postgresql-agamotto-configuration
    Optional:  false
  pgbouncer-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      postgres-ttywey-backup-postgresql-pgbouncer-configuration
    Optional:  false
  postgresql-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      postgres-ttywey-backup-postgresql-postgresql-configuration
    Optional:  false
  postgresql-custom-metrics:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      postgres-ttywey-backup-postgresql-postgresql-custom-metrics
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      postgres-ttywey-backup-postgresql-postgresql-scripts
    Optional:  false
  cm-script-postgresql-configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-patroni-reload-script-postgres-ttywey-backup
    Optional:  false
  config-manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-postgres-ttywey-backup-postgresql-config-manager-config
    Optional:  false
  kube-api-access-hvdd7:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 kb-data=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               9m47s                  default-scheduler        Successfully assigned default/postgres-ttywey-backup-postgresql-0 to gke-infracreate-gke-kbdata-4c8g-b9b386a5-h9kn
  Normal   SuccessfulAttachVolume  9m37s                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-e2dcbf9c-7b8b-42a8-a3fe-e838b36ccfa0"
  Normal   Pulled                  9m35s                  kubelet                  Container image "infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo:14.8.0-pgvector-v0.5.0" already present on machine
  Normal   Created                 9m35s                  kubelet                  Created container pg-init-container
  Normal   Started                 9m35s                  kubelet                  Started container pg-init-container
  Normal   Pulled                  9m34s                  kubelet                  Container image "infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1" already present on machine
  Normal   Pulled                  9m34s                  kubelet                  Container image "infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/spilo:14.8.0-pgvector-v0.5.0" already present on machine
  Normal   Created                 9m34s                  kubelet                  Created container postgresql
  Normal   Started                 9m34s                  kubelet                  Started container postgresql
  Normal   Pulled                  9m34s                  kubelet                  Container image "infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/pgbouncer:1.19.0" already present on machine
  Normal   Created                 9m34s                  kubelet                  Created container pgbouncer
  Normal   Started                 9m34s                  kubelet                  Started container pgbouncer
  Normal   Created                 9m33s                  kubelet                  Created container metrics
  Normal   Started                 9m33s                  kubelet                  Started container metrics
  Normal   Pulled                  9m33s                  kubelet                  Container image "docker.io/apecloud/kubeblocks-tools:0.8.2-beta.2" already present on machine
  Normal   Created                 9m33s                  kubelet                  Created container kb-checkrole
  Normal   Started                 9m33s                  kubelet                  Started container kb-checkrole
  Normal   Pulled                  9m33s                  kubelet                  Container image "docker.io/apecloud/kubeblocks-tools:0.8.2-beta.2" already present on machine
  Normal   Created                 9m33s                  kubelet                  Created container config-manager
  Normal   Started                 9m33s                  kubelet                  Started container config-manager
  Warning  Unhealthy               4m5s (x20 over 9m23s)  kubelet                  Readiness probe failed: 127.0.0.1:5432 - no response

logs pod

kubectl logs postgres-ttywey-backup-postgresql-0 postgresql 
2024-01-31 09:40:18,068 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-01-31 09:40:18,167 - bootstrapping - INFO - Looks like you are running google
2024-01-31 09:40:18,470 - bootstrapping - INFO - kubeblocks generate local configuration: 
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_size: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:
  - auth-host: md5
  - auth-local: trust
  kb_restore_from_time:
    command: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
    keep_existing_recovery_conf: false
    recovery_conf:
      restore_command: envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g
        wal-fetch %f %p
  method: kb_restore_from_time
postgresql:
  config_dir: /home/postgres/pgdata/conf
  custom_conf: /home/postgres/conf/postgresql.conf
  parameters:
    log_destination: csvlog
    log_directory: log
    log_filename: postgresql-%Y-%m-%d.log
    logging_collector: 'True'
    pg_stat_statements.track_utility: 'False'
    shared_buffers: 128MB
  pg_hba:
  - host     all             all             0.0.0.0/0                md5
  - host     all             all             ::/0                     md5
  - local    all             all                                     trust
  - host     all             all             127.0.0.1/32            trust
  - host     all             all             ::1/128                 trust
  - local     replication     all                                    trust
  - host      replication     all             0.0.0.0/0               md5
  - host      replication     all             ::/0                    md5

2024-01-31 09:40:18,769 - bootstrapping - INFO - Configuring certificate
2024-01-31 09:40:18,770 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-01-31 09:40:21,174 - bootstrapping - INFO - Configuring patroni
2024-01-31 09:40:21,276 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-01-31 09:40:21,277 - bootstrapping - INFO - Configuring log
2024-01-31 09:40:21,364 - bootstrapping - INFO - Configuring wal-e
2024-01-31 09:40:21,364 - bootstrapping - INFO - Configuring pam-oauth2
2024-01-31 09:40:21,364 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-01-31 09:40:21,364 - bootstrapping - INFO - Configuring crontab
2024-01-31 09:40:21,364 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-01-31 09:40:21,365 - bootstrapping - INFO - Configuring standby-cluster
2024-01-31 09:40:21,365 - bootstrapping - INFO - Configuring pgqd
2024-01-31 09:40:21,365 - bootstrapping - INFO - Configuring pgbouncer
2024-01-31 09:40:21,365 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-01-31 09:40:21,365 - bootstrapping - INFO - Configuring bootstrap
2024-01-31 09:40:26,570 INFO: Selected new K8s API server endpoint https://10.10.0.2:443
2024-01-31 09:40:26,973 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-01-31 09:40:27,070 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-0
2024-01-31 09:40:27,367 INFO: trying to bootstrap a new cluster
2024-01-31 09:40:27,367 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
2024-01-31 09:40:31 GMT [105]: [1-1] 65ba158f.69 0     LOG:  Auto detecting pg_stat_kcache.linux_hz parameter...
2024-01-31 09:40:31 GMT [105]: [2-1] 65ba158f.69 0     LOG:  pg_stat_kcache.linux_hz is set to 500000
2024-01-31 09:40:32,162 INFO: postmaster pid=105
/var/run/postgresql:5432 - no response
2024-01-31 09:40:32 GMT [105]: [3-1] 65ba158f.69 0     LOG:  redirecting log output to logging collector process
2024-01-31 09:40:32 GMT [105]: [4-1] 65ba158f.69 0     HINT:  Future log output will appear in directory "log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-01-31 09:40:37,573 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-0
2024-01-31 09:40:37,573 INFO: not healthy enough for leader race
2024-01-31 09:40:38,169 INFO: bootstrap in progress
/var/run/postgresql:5432 - no response
2024-01-31 09:40:38,463 INFO: removing initialize key after failed attempt to bootstrap the cluster
2024-01-31 09:40:38,563 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data.failed
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
...
2024-01-31 09:45:58,177 INFO: Selected new K8s API server endpoint https://10.10.0.2:443
2024-01-31 09:45:58,566 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-01-31 09:45:58,570 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-0
2024-01-31 09:45:58,680 INFO: trying to bootstrap a new cluster
2024-01-31 09:45:58,680 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-01-31 09:45:58,780 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-01-31 09:45:58,782 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: exceeded maximum number of restarts 5
stopping /etc/runit/runsvdir/default/patroni
timeout: finish: .: (pid 346) 11s, want down
kubectl logs postgres-ttywey-backup-postgresql-1 postgresql
2024-01-31 09:40:23,880 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-01-31 09:40:23,983 - bootstrapping - INFO - Looks like you are running google
2024-01-31 09:40:24,465 - bootstrapping - INFO - kubeblocks generate local configuration: 
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_size: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:
  - auth-host: md5
  - auth-local: trust
  kb_restore_from_time:
    command: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
    keep_existing_recovery_conf: false
    recovery_conf:
      restore_command: envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g
        wal-fetch %f %p
  method: kb_restore_from_time
postgresql:
  config_dir: /home/postgres/pgdata/conf
  custom_conf: /home/postgres/conf/postgresql.conf
  parameters:
    log_destination: csvlog
    log_directory: log
    log_filename: postgresql-%Y-%m-%d.log
    logging_collector: 'True'
    pg_stat_statements.track_utility: 'False'
    shared_buffers: 128MB
  pg_hba:
  - host     all             all             0.0.0.0/0                md5
  - host     all             all             ::/0                     md5
  - local    all             all                                     trust
  - host     all             all             127.0.0.1/32            trust
  - host     all             all             ::1/128                 trust
  - local     replication     all                                    trust
  - host      replication     all             0.0.0.0/0               md5
  - host      replication     all             ::/0                    md5

2024-01-31 09:40:24,765 - bootstrapping - INFO - Configuring patroni
2024-01-31 09:40:24,965 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-01-31 09:40:24,966 - bootstrapping - INFO - Configuring crontab
2024-01-31 09:40:24,966 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-01-31 09:40:24,967 - bootstrapping - INFO - Configuring wal-e
2024-01-31 09:40:24,967 - bootstrapping - INFO - Configuring standby-cluster
2024-01-31 09:40:24,967 - bootstrapping - INFO - Configuring log
2024-01-31 09:40:24,967 - bootstrapping - INFO - Configuring bootstrap
2024-01-31 09:40:24,967 - bootstrapping - INFO - Configuring pgqd
2024-01-31 09:40:24,968 - bootstrapping - INFO - Configuring pgbouncer
2024-01-31 09:40:24,968 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-01-31 09:40:24,968 - bootstrapping - INFO - Configuring certificate
2024-01-31 09:40:24,969 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-01-31 09:40:30,363 - bootstrapping - INFO - Configuring pam-oauth2
2024-01-31 09:40:30,365 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-01-31 09:40:34,675 INFO: Selected new K8s API server endpoint https://10.10.0.2:443
2024-01-31 09:40:35,073 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-01-31 09:40:35,166 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-1
2024-01-31 09:40:35,370 INFO: waiting for leader to bootstrap
2024-01-31 09:40:45,672 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-1
2024-01-31 09:40:45,680 INFO: trying to bootstrap a new cluster
2024-01-31 09:40:45,680 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
2024-01-31 09:40:48 GMT [107]: [1-1] 65ba15a0.6b 0     LOG:  Auto detecting pg_stat_kcache.linux_hz parameter...
2024-01-31 09:40:48 GMT [107]: [2-1] 65ba15a0.6b 0     LOG:  pg_stat_kcache.linux_hz is set to 500000
2024-01-31 09:40:49 GMT [107]: [3-1] 65ba15a0.6b 0     LOG:  redirecting log output to logging collector process
2024-01-31 09:40:49 GMT [107]: [4-1] 65ba15a0.6b 0     HINT:  Future log output will appear in directory "log".
2024-01-31 09:40:49,562 INFO: postmaster pid=107
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - no response
2024-01-31 09:40:54,767 INFO: removing initialize key after failed attempt to bootstrap the cluster
2024-01-31 09:40:54,871 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data.failed
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
...
2024-01-31 09:46:11,568 INFO: Selected new K8s API server endpoint https://10.10.0.2:443
2024-01-31 09:46:11,865 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-01-31 09:46:11,870 INFO: Lock owner: None; I am postgres-ttywey-backup-postgresql-1
2024-01-31 09:46:11,979 INFO: trying to bootstrap a new cluster
2024-01-31 09:46:11,980 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-01-31 09:46:12,081 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-01-31 09:46:12,083 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: exceeded maximum number of restarts 5
stopping /etc/runit/runsvdir/default/patroni
timeout: finish: .: (pid 337) 11s, want down
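
For reference, the state of the restored data directory can be checked directly in a pod; the command below uses paths taken from the logs above, while the exact on-disk layout is an assumption about the spilo image:

kubectl exec -n default postgres-ttywey-backup-postgresql-0 -c postgresql -- ls -la /home/postgres/pgdata/pgroot
# Expect a data.failed directory (renamed after the first failed bootstrap) and no data/ directory,
# which matches the "No such file or directory: /home/postgres/pgdata/pgroot/data/recovery.conf" errors.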

Expected behavior
The restored cluster bootstraps successfully from the wal-g backup and both postgresql pods become Ready.

@JashBook JashBook added the kind/bug Something isn't working label Jan 31, 2024
@JashBook JashBook added this to the Release 0.9.0 milestone Jan 31, 2024

github-actions bot commented Mar 4, 2024

This issue has been marked as stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the Stale label Mar 4, 2024
@ahjing99
Collaborator

ahjing99 commented Apr 8, 2024

This still fails on 0.9.

➜  ~ k logs postgres-juiecs-backup-postgresql-0
Defaulted container "postgresql" out of: postgresql, pgbouncer, metrics, lorry, config-manager, pg-init-container (init)
2024-04-08 06:00:33,201 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-04-08 06:00:33,299 - bootstrapping - INFO - Looks like you are running google
2024-04-08 06:00:33,498 - bootstrapping - INFO - kubeblocks generate local configuration:
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_size: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:
  - auth-host: md5
  - auth-local: trust
  kb_restore_from_time:
    command: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
    keep_existing_recovery_conf: false
    recovery_conf:
      restore_command: envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g
        wal-fetch %f %p
  method: kb_restore_from_time
postgresql:
  config_dir: /home/postgres/pgdata/conf
  custom_conf: /home/postgres/conf/postgresql.conf
  parameters:
    pg_stat_statements.track_utility: 'False'
    shared_buffers: 128MB
    shared_preload_libraries: pg_stat_statements,auto_explain,bg_mon,pgextwlist,pg_auth_mon,set_user,pg_cron,pg_stat_kcache,timescaledb,pgaudit
  pg_hba:
  - host     all             all             0.0.0.0/0                md5
  - host     all             all             ::/0                     md5
  - local    all             all                                     trust
  - host     all             all             127.0.0.1/32            trust
  - host     all             all             ::1/128                 trust
  - local     replication     all                                    trust
  - host      replication     all             0.0.0.0/0               md5
  - host      replication     all             ::/0                    md5

2024-04-08 06:00:33,696 - bootstrapping - INFO - Configuring pgbouncer
2024-04-08 06:00:33,696 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-04-08 06:00:33,696 - bootstrapping - INFO - Configuring patroni
2024-04-08 06:00:33,795 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-04-08 06:00:33,796 - bootstrapping - INFO - Configuring certificate
2024-04-08 06:00:33,796 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-04-08 06:00:40,499 - bootstrapping - INFO - Configuring wal-e
2024-04-08 06:00:40,500 - bootstrapping - INFO - Configuring bootstrap
2024-04-08 06:00:40,500 - bootstrapping - INFO - Configuring crontab
2024-04-08 06:00:40,500 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-04-08 06:00:40,500 - bootstrapping - INFO - Configuring pam-oauth2
2024-04-08 06:00:40,500 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-04-08 06:00:40,501 - bootstrapping - INFO - Configuring pgqd
2024-04-08 06:00:40,501 - bootstrapping - INFO - Configuring standby-cluster
2024-04-08 06:00:40,501 - bootstrapping - INFO - Configuring log
2024-04-08 06:00:43,808 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:00:44,003 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:00:44,095 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:00:44,399 INFO: waiting for leader to bootstrap
2024-04-08 06:00:54,614 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:00:54,614 INFO: waiting for leader to bootstrap
2024-04-08 06:01:04,602 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:01:04,610 INFO: trying to bootstrap a new cluster
2024-04-08 06:01:04,610 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
2024-04-08 06:01:06 GMT [117]: [1-1] 66138822.75 0     LOG:  Auto detecting pg_stat_kcache.linux_hz parameter...
2024-04-08 06:01:06 GMT [117]: [2-1] 66138822.75 0     LOG:  pg_stat_kcache.linux_hz is set to 500000
2024-04-08 06:01:06 GMT [117]: [3-1] 66138822.75 0     LOG:  pgaudit extension initialized
2024-04-08 06:01:06,594 INFO: postmaster pid=117
/var/run/postgresql:5432 - no response
2024-04-08 06:01:06 GMT [117]: [4-1] 66138822.75 0     LOG:  redirecting log output to logging collector process
2024-04-08 06:01:06 GMT [117]: [5-1] 66138822.75 0     HINT:  Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:01:14,601 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:01:14,601 INFO: not healthy enough for leader race
2024-04-08 06:01:14,894 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - no response
2024-04-08 06:01:16,699 INFO: removing initialize key after failed attempt to bootstrap the cluster
2024-04-08 06:01:16,709 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data.failed
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
2024-04-08 06:01:49,608 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:01:49,893 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:01:49,896 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:01:50,000 INFO: trying to bootstrap a new cluster
2024-04-08 06:01:50,000 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 06:01:50,198 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 06:01:50,199 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 60 seconds
2024-04-08 06:02:53,402 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:02:53,598 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:02:53,601 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:02:53,706 INFO: trying to bootstrap a new cluster
2024-04-08 06:02:53,706 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 06:02:53,728 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 06:02:53,729 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 90 seconds
2024-04-08 06:04:26,307 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:04:26,499 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:04:26,501 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:04:26,607 INFO: trying to bootstrap a new cluster
2024-04-08 06:04:26,608 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 06:04:26,710 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 06:04:26,711 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 120 seconds
2024-04-08 06:06:29,303 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:06:29,499 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:06:29,501 INFO: Lock owner: None; I am postgres-juiecs-backup-postgresql-0
2024-04-08 06:06:29,607 INFO: trying to bootstrap a new cluster
2024-04-08 06:06:29,607 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 06:06:29,629 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 06:06:29,693 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: exceeded maximum number of restarts 5
stopping /etc/runit/runsvdir/default/patroni
timeout: finish: .: (pid 368) 10s, want down

@ahjing99 ahjing99 added severity/major Great chance user will encounter the same problem and removed Stale labels Apr 8, 2024

ahjing99 commented Apr 8, 2024

The error still occurs after running select pg_switch_wal(); the restored pod's log is shown below.
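
For reference, a quick check that the switched segment was actually archived could look like this (a sketch only; <source-primary-pod> is a placeholder for the source cluster's primary pod, and it assumes the default postgres superuser is usable from inside the postgresql container):

# Force a WAL segment switch, then confirm the archiver picked it up.
kubectl exec -it <source-primary-pod> -c postgresql -- psql -U postgres -c "select pg_switch_wal();"
kubectl exec -it <source-primary-pod> -c postgresql -- psql -U postgres -c "select archived_count, last_archived_wal, failed_count from pg_stat_archiver;"

A non-zero failed_count or a stale last_archived_wal would point at the archive_command rather than at the restore itself.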

 k logs pg1-postgresql-0
Defaulted container "postgresql" out of: postgresql, pgbouncer, metrics, lorry, config-manager, pg-init-container (init)
2024-04-08 06:35:53,398 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-04-08 06:35:53,494 - bootstrapping - INFO - Looks like you are running google
2024-04-08 06:35:53,695 - bootstrapping - INFO - kubeblocks generate local configuration:
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_size: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:
  - auth-host: md5
  - auth-local: trust
  kb_restore_from_time:
    command: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
    keep_existing_recovery_conf: false
    recovery_conf:
      restore_command: envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g
        wal-fetch %f %p
  method: kb_restore_from_time
postgresql:
  config_dir: /home/postgres/pgdata/conf
  custom_conf: /home/postgres/conf/postgresql.conf
  parameters:
    pg_stat_statements.track_utility: 'False'
    shared_buffers: 128MB
    shared_preload_libraries: pg_stat_statements,auto_explain,bg_mon,pgextwlist,pg_auth_mon,set_user,pg_cron,pg_stat_kcache,timescaledb,pgaudit
  pg_hba:
  - host     all             all             0.0.0.0/0                md5
  - host     all             all             ::/0                     md5
  - local    all             all                                     trust
  - host     all             all             127.0.0.1/32            trust
  - host     all             all             ::1/128                 trust
  - local     replication     all                                    trust
  - host      replication     all             0.0.0.0/0               md5
  - host      replication     all             ::/0                    md5

2024-04-08 06:35:53,894 - bootstrapping - INFO - Configuring standby-cluster
2024-04-08 06:35:53,894 - bootstrapping - INFO - Configuring patroni
2024-04-08 06:35:53,902 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-04-08 06:35:53,993 - bootstrapping - INFO - Configuring log
2024-04-08 06:35:53,993 - bootstrapping - INFO - Configuring pgqd
2024-04-08 06:35:53,993 - bootstrapping - INFO - Configuring bootstrap
2024-04-08 06:35:53,993 - bootstrapping - INFO - Configuring certificate
2024-04-08 06:35:53,993 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-04-08 06:35:55,300 - bootstrapping - INFO - Configuring wal-e
2024-04-08 06:35:55,300 - bootstrapping - INFO - Configuring pgbouncer
2024-04-08 06:35:55,300 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-04-08 06:35:55,300 - bootstrapping - INFO - Configuring pam-oauth2
2024-04-08 06:35:55,300 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-04-08 06:35:55,300 - bootstrapping - INFO - Configuring crontab
2024-04-08 06:35:55,300 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-04-08 06:35:57,699 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 06:35:57,896 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 06:35:57,898 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:35:58,093 INFO: trying to bootstrap a new cluster
2024-04-08 06:35:58,093 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
2024-04-08 06:36:00 GMT [117]: [1-1] 66139050.75 0     LOG:  Auto detecting pg_stat_kcache.linux_hz parameter...
2024-04-08 06:36:00 GMT [117]: [2-1] 66139050.75 0     LOG:  pg_stat_kcache.linux_hz is set to 1000000
2024-04-08 06:36:00 GMT [117]: [3-1] 66139050.75 0     LOG:  pgaudit extension initialized
2024-04-08 06:36:00,694 INFO: postmaster pid=117
/var/run/postgresql:5432 - no response
2024-04-08 06:36:00 GMT [117]: [4-1] 66139050.75 0     LOG:  redirecting log output to logging collector process
2024-04-08 06:36:00 GMT [117]: [5-1] 66139050.75 0     HINT:  Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:08,493 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:08,493 INFO: not healthy enough for leader race
2024-04-08 06:36:08,996 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:18,593 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:18,593 INFO: not healthy enough for leader race
2024-04-08 06:36:18,593 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:28,595 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:28,595 INFO: not healthy enough for leader race
2024-04-08 06:36:28,595 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:38,493 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:38,493 INFO: not healthy enough for leader race
2024-04-08 06:36:38,493 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:48,404 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:48,404 INFO: not healthy enough for leader race
2024-04-08 06:36:48,404 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:36:58,404 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:36:58,404 INFO: not healthy enough for leader race
2024-04-08 06:36:58,404 INFO: bootstrap in progress
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
2024-04-08 06:37:08,994 INFO: Lock owner: None; I am pg1-postgresql-0
2024-04-08 06:37:08,994 INFO: Still starting up as a standby.
2024-04-08 06:37:08,994 INFO: establishing a new patroni connection to the postgres cluster
2024-04-08 06:37:10,043 INFO: establishing a new patroni connection to the postgres cluster
2024-04-08 06:37:10,045 WARNING: Retry got exception: 'connection problems'
2024-04-08 06:37:10,045 WARNING: Failed to determine PostgreSQL state from the connection, falling back to cached role
2024-04-08 06:37:10,046 INFO: waiting for end of recovery after bootstrap
/var/run/postgresql:5432 - rejecting connections

@ahjing99 ahjing99 closed this as completed Apr 8, 2024
@ahjing99 ahjing99 reopened this Apr 8, 2024

ahjing99 commented Apr 8, 2024

The restore still fails:

k logs pg-cluster-restore-postgresql-0
Defaulted container "postgresql" out of: postgresql, pgbouncer, metrics, lorry, config-manager, pg-init-container (init)
2024-04-08 09:58:11,670 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-04-08 09:58:11,680 - bootstrapping - INFO - Looks like you are running google
2024-04-08 09:58:11,713 - bootstrapping - INFO - kubeblocks generate local configuration:
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '112'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_segments: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:
  - auth-host: md5
  - auth-local: trust
  kb_restore_from_time:
    command: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
    keep_existing_recovery_conf: false
    recovery_conf:
      restore_command: envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g
        wal-fetch %f %p
  method: kb_restore_from_time
postgresql:
  config_dir: /home/postgres/pgdata/conf
  custom_conf: /home/postgres/conf/postgresql.conf
  parameters:
    pg_stat_statements.track_utility: 'False'
    shared_buffers: 256MB
    shared_preload_libraries: pg_stat_statements,auto_explain,bg_mon,pgextwlist,pg_auth_mon,set_user,pg_cron,pg_stat_kcache,timescaledb,pgaudit
  pg_hba:
  - host     all             all             0.0.0.0/0                md5
  - host     all             all             ::/0                     md5
  - local    all             all                                     trust
  - host     all             all             127.0.0.1/32            trust
  - host     all             all             ::1/128                 trust
  - local     replication     all                                    trust
  - host      replication     all             0.0.0.0/0               md5
  - host      replication     all             ::/0                    md5

2024-04-08 09:58:11,740 - bootstrapping - INFO - Configuring pam-oauth2
2024-04-08 09:58:11,740 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-04-08 09:58:11,740 - bootstrapping - INFO - Configuring wal-e
2024-04-08 09:58:11,740 - bootstrapping - INFO - Configuring pgbouncer
2024-04-08 09:58:11,741 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-04-08 09:58:11,741 - bootstrapping - INFO - Configuring certificate
2024-04-08 09:58:11,741 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-04-08 09:58:12,361 - bootstrapping - INFO - Configuring patroni
2024-04-08 09:58:12,379 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-04-08 09:58:12,380 - bootstrapping - INFO - Configuring pgqd
2024-04-08 09:58:12,380 - bootstrapping - INFO - Configuring bootstrap
2024-04-08 09:58:12,380 - bootstrapping - INFO - Configuring standby-cluster
2024-04-08 09:58:12,380 - bootstrapping - INFO - Configuring log
2024-04-08 09:58:12,380 - bootstrapping - INFO - Configuring crontab
2024-04-08 09:58:12,380 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-04-08 09:58:13,672 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 09:58:13,710 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 09:58:13,716 INFO: Lock owner: None; I am pg-cluster-restore-postgresql-0
2024-04-08 09:58:13,804 INFO: trying to bootstrap a new cluster
2024-04-08 09:58:13,805 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
2024-04-08 09:58:14 GMT [105]: [1-1] 6613bfb6.69 0     LOG:  Auto detecting pg_stat_kcache.linux_hz parameter...
2024-04-08 09:58:14 GMT [105]: [2-1] 6613bfb6.69 0     LOG:  pg_stat_kcache.linux_hz is set to 500000
2024-04-08 09:58:14 GMT [105]: [3-1] 6613bfb6.69 0     LOG:  pgaudit extension initialized
2024-04-08 09:58:14 GMT [105]: [4-1] 6613bfb6.69 0     LOG:  starting PostgreSQL 12.18 (Ubuntu 12.18-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-04-08 09:58:14 GMT [105]: [5-1] 6613bfb6.69 0     LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-04-08 09:58:14 GMT [105]: [6-1] 6613bfb6.69 0     LOG:  listening on IPv6 address "::", port 5432
2024-04-08 09:58:14 GMT [105]: [7-1] 6613bfb6.69 0     LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-08 09:58:14,109 INFO: postmaster pid=105
2024-04-08 09:58:14 GMT [105]: [8-1] 6613bfb6.69 0     LOG:  redirecting log output to logging collector process
2024-04-08 09:58:14 GMT [105]: [9-1] 6613bfb6.69 0     HINT:  Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - no response
2024-04-08 09:58:21,271 INFO: removing initialize key after failed attempt to bootstrap the cluster
2024-04-08 09:58:21,282 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data.failed
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
2024-04-08 09:58:52,091 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 09:58:52,137 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 09:58:52,141 INFO: Lock owner: None; I am pg-cluster-restore-postgresql-0
2024-04-08 09:58:52,285 INFO: trying to bootstrap a new cluster
2024-04-08 09:58:52,285 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 09:58:52,304 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 09:58:52,307 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 60 seconds
2024-04-08 09:59:53,034 INFO: Selected new K8s API server endpoint https://10.128.0.53:443
2024-04-08 09:59:53,086 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-04-08 09:59:53,091 INFO: Lock owner: None; I am pg-cluster-restore-postgresql-0
2024-04-08 09:59:53,109 INFO: trying to bootstrap a new cluster
2024-04-08 09:59:53,109 INFO: Running custom bootstrap script: bash /home/postgres/pgdata/kb_restore/kb_restore.sh
mv: cannot stat '/home/postgres/pgdata/pgroot/data.old/*': No such file or directory
2024-04-08 09:59:53,124 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 292, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/bootstrap.py", line 114, in _custom_bootstrap
    self._postgresql.config.write_recovery_conf(config['recovery_conf'])
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 791, in write_recovery_conf
    with ConfigWriter(self._recovery_conf) as f:
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/config.py", line 228, in __enter__
    self._fd = open(self._filename, 'w')
FileNotFoundError: [Errno 2] No such file or directory: '/home/postgres/pgdata/pgroot/data/recovery.conf'
2024-04-08 09:59:53,126 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 144, in main
    return patroni_main()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 136, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 181, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 106, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.10/dist-packages/patroni/daemon.py", line 126, in run
    self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/__main__.py", line 109, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1770, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1592, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1483, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1476, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 90 seconds

@wangyelei
Contributor

It may be necessary to wait for the relevant WAL segments to finish uploading to the backup repository before starting the restore.
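
A minimal way to check this before retrying might be to list what wal-g can actually see in the repository (a sketch only; the pod name is taken from the log above, and it assumes the restore-env directory referenced in the recovery_conf is already populated in that pod):

# List base backups and WAL segments visible to wal-g in the backup repository.
kubectl exec -it pg-cluster-restore-postgresql-0 -c postgresql -- \
  envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g backup-list
kubectl exec -it pg-cluster-restore-postgresql-0 -c postgresql -- \
  envdir /home/postgres/pgdata/wal-g/restore-env /home/postgres/pgdata/wal-g/wal-g wal-show

If the segments needed to reach the requested recovery target have not been uploaded yet, wal-fetch will stall or fail during recovery, which would match the behaviour seen above.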

@JashBook
Collaborator Author

Duplicate of #8956.
