[BUG]start starrocks cluster failed after stopping it #9021

Open
tianyue86 opened this issue Mar 7, 2025 · 0 comments
Describe the bug

Kubernetes: v1.31.1-aliyun.1
KubeBlocks: 1.0.0-beta.32
kbcli: 1.0.0-beta.15

To Reproduce
Steps to reproduce the behavior:

  1. Create a StarRocks cluster with the YAML below and wait until it is Running
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: strsce-yioztk
  namespace: default
spec:
  clusterDef: starrocks-ce
  topology: shared-nothing
  terminationPolicy: DoNotTerminate
  componentSpecs:
    - name: fe
      serviceVersion: 3.2.2
      disableExporter: true
      replicas: 2
      resources:
        requests:
          cpu: 1000m
          memory: 1Gi
        limits:
          cpu: 1000m
          memory: 1Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
    - name: be
      serviceVersion: 3.2.2
      replicas: 2
      resources:
        requests:
          cpu: 1000m
          memory: 1Gi
        limits:
          cpu: 1000m
          memory: 1Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
  2. Stop it
kbcli cluster list-instances strsce-yioztk --namespace default
NAME                 NAMESPACE   CLUSTER         COMPONENT   STATUS    ROLE     ACCESSMODE   AZ                 CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE     NODE                                   CREATED-TIME                 
strsce-yioztk-be-0   default     strsce-yioztk   be          Running   <none>                cn-zhangjiakou-c   1 / 1                1Gi / 1Gi               data:20Gi   cn-zhangjiakou.10.0.0.144/10.0.0.144   Mar 07,2025 15:39 UTC+0800   
strsce-yioztk-be-1   default     strsce-yioztk   be          Running   <none>                cn-zhangjiakou-c   1 / 1                1Gi / 1Gi               data:20Gi   cn-zhangjiakou.10.0.0.145/10.0.0.145   Mar 07,2025 15:40 UTC+0800   
strsce-yioztk-fe-0   default     strsce-yioztk   fe          Running   <none>                cn-zhangjiakou-c   1 / 1                1Gi / 1Gi               data:20Gi   cn-zhangjiakou.10.0.0.144/10.0.0.144   Mar 07,2025 15:37 UTC+0800   
strsce-yioztk-fe-1   default     strsce-yioztk   fe          Running   <none>                cn-zhangjiakou-c   1 / 1                1Gi / 1Gi               data:20Gi   cn-zhangjiakou.10.0.0.144/10.0.0.144   Mar 07,2025 15:38 UTC+0800   
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % kbcli cluster stop strsce-yioztk --auto-approve --force=true  --namespace default
OpsRequest strsce-yioztk-stop-vrxr2 created successfully, you can view the progress:
	kbcli cluster describe-ops strsce-yioztk-stop-vrxr2 -n default
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % kbcli cluster list-ops strsce-yioztk --status all  --namespace default
NAME                       NAMESPACE   TYPE   CLUSTER         COMPONENT   STATUS    PROGRESS   CREATED-TIME                 
strsce-yioztk-stop-vrxr2   default     Stop   strsce-yioztk   be,fe       Running   2/4        Mar 07,2025 16:45 UTC+0800   
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % k get cluster | grep str
strsce-yioztk     starrocks-ce         DoNotTerminate       Stopping   68m
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % k get cluster | grep str
strsce-yioztk     starrocks-ce         DoNotTerminate       Stopped    68m
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % kbcli cluster list-ops strsce-yioztk --status all  --namespace default
NAME                       NAMESPACE   TYPE   CLUSTER         COMPONENT   STATUS    PROGRESS   CREATED-TIME                 
strsce-yioztk-stop-vrxr2   default     Stop   strsce-yioztk   be,fe       Succeed   4/4        Mar 07,2025 16:45 UTC+0800   
  3. Start it
kbcli cluster start strsce-yioztk --force=true --namespace default
OpsRequest strsce-yioztk-start-b6h92 created successfully, you can view the progress:
	kbcli cluster describe-ops strsce-yioztk-start-b6h92 -n default
tianyue@apeclouds-MacBook-Pro kubeblocks-addons % kbcli cluster list-ops strsce-yioztk --status all  --namespace default
NAME                        NAMESPACE   TYPE    CLUSTER         COMPONENT   STATUS    PROGRESS   CREATED-TIME                 
strsce-yioztk-stop-vrxr2    default     Stop    strsce-yioztk   be,fe       Succeed   4/4        Mar 07,2025 16:45 UTC+0800   
strsce-yioztk-start-b6h92   default     Start   strsce-yioztk   be,fe       Running   0/4        Mar 07,2025 16:46 UTC+0800 
  4. Check the cluster status
k get pod|grep str
strsce-yioztk-be-0                  0/1     CrashLoopBackOff    11 (2m27s ago)   39m
strsce-yioztk-fe-0                  0/1     ContainerCreating   0                39m

k describe pod strsce-yioztk-be-0
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               39m                    default-scheduler        Successfully assigned default/strsce-yioztk-be-0 to cn-zhangjiakou.10.0.0.144
  Normal   SuccessfulAttachVolume  39m                    attachdetach-controller  AttachVolume.Attach succeeded for volume "d-8vb23m26wqssi0fnw5jx"
  Normal   AllocIPSucceed          39m                    terway-daemon            Alloc IP 10.0.0.116/24 took 33.567815ms
  Normal   Pulled                  38m (x2 over 39m)      kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/be-ubuntu:3.2.2" already present on machine
  Normal   Created                 38m (x2 over 39m)      kubelet                  Created container be
  Normal   Started                 38m (x2 over 39m)      kubelet                  Started container be
  Warning  Unhealthy               9m17s (x127 over 39m)  kubelet                  Startup probe failed: Get "http://10.0.0.116:8040/api/health": dial tcp 10.0.0.116:8040: connect: connection refused
  Warning  BackOff                 4m11s (x125 over 37m)  kubelet                  Back-off restarting failed container be in pod strsce-yioztk-be-0_default(b6583351-eca9-491b-b8f8-eefe9dc04ad7)
  5. See the error in the be container log
[Fri Mar  7 17:22:39 CST 2025] /etc/starrocks/be/conf not exist or not a directory, ignore ...
[Fri Mar  7 17:22:39 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
[Fri Mar  7 17:22:41 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
[Fri Mar  7 17:22:43 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
[Fri Mar  7 17:22:45 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
[Fri Mar  7 17:22:47 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
ERROR 2003 (HY000): Can't connect to MySQL server on 'strsce-yioztk-fe-fe:9030' (111)
[Fri Mar  7 17:22:49 CST 2025] Add myself (strsce-yioztk-be-0.strsce-yioztk-be-headless.default.svc.cluster.local:9050) into FE ...
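
The `(111)` in the MySQL client error above is the Linux errno for ECONNREFUSED. That suggests the name `strsce-yioztk-fe-fe` resolved but nothing accepted the connection on port 9030, which is consistent with `strsce-yioztk-fe-0` still being in `ContainerCreating`. A minimal Python sketch for telling that case apart from a DNS failure or a timeout could look like the following. The `probe` helper is hypothetical and is not part of KubeBlocks or StarRocks; it would need to run from a pod in the same namespace to see the same service name.

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify one TCP connection attempt to host:port.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "unresolved"  # the name did not resolve at all
    except ConnectionRefusedError:
        return "refused"  # ECONNREFUSED, errno 111 on Linux
    except socket.timeout:
        return "timeout"  # no answer within the timeout
    except OSError as exc:
        # Any other socket error, reported by its symbolic errno name.
        return f"error:{errno.errorcode.get(exc.errno, exc.errno)}"
```

Calling `probe("strsce-yioztk-fe-fe", 9030)` while the be container is retrying would be expected to return `"refused"` if the diagnosis above holds, and `"open"` once an fe instance is serving queries.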

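The manifest asks for two replicas each of fe and be, yet the pod listing after the start shows only `strsce-yioztk-be-0` and `strsce-yioztk-fe-0`, neither of them ready. A small Python sketch that extracts both observations from the `get pod` output is given below. The function names are hypothetical, and the `<cluster>-<component>-<ordinal>` naming is assumed from the `list-instances` output earlier in this report.

```python
def unready_pods(listing: str) -> dict[str, str]:
    """Map pod name to STATUS for pods whose READY column is not n/n."""
    result = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, whose second column has no "/".
        if len(fields) < 3 or "/" not in fields[1]:
            continue
        ready, total = fields[1].split("/", 1)
        if ready != total:
            result[fields[0]] = fields[2]
    return result


def missing_instances(listing: str, cluster: str, replicas: dict[str, int]) -> list[str]:
    """List expected pod names that do not appear in the listing."""
    present = {line.split()[0] for line in listing.splitlines() if line.split()}
    # Assumed naming scheme: <cluster>-<component>-<ordinal>.
    expected = [
        f"{cluster}-{component}-{ordinal}"
        for component, count in replicas.items()
        for ordinal in range(count)
    ]
    return [name for name in expected if name not in present]
```

Fed the two lines from step 4 together with `{"fe": 2, "be": 2}`, the second function reports `strsce-yioztk-fe-1` and `strsce-yioztk-be-1` as absent, which may help narrow down whether the start operation is blocked on the first instance of each component.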

@tianyue86 tianyue86 added the kind/bug Something isn't working label Mar 7, 2025
@tianyue86 tianyue86 added this to the Release 1.0.0 milestone Mar 7, 2025