[Bug] Hugegraph isn't responding after Cassandra restarted. #2740

Open
1 task done
mkj-git opened this issue Feb 28, 2025 · 4 comments
Labels
bug Something isn't working cassandra Cassandra backend

Comments


mkj-git commented Feb 28, 2025

Bug Type

server status (startup / runtime exception)

Before submit

  • I have searched the existing Issues and FAQ and confirmed that there is no identical or duplicate issue.

Environment

  • Server Version: Latest Code Build
  • Backend: Cassandra
  • OS: Ubuntu
  • Data Size: Very little data

Expected & Actual behavior

I did the following steps:

  1. Started Cassandra (v5.0.3).
  2. Started HugeGraph:
./start-hugegraph.sh 
Starting HugeGraphServer in daemon mode...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK
Started [pid 1252693]

  3. Created one vertex from Swagger UI, and it was created successfully.
  4. Stopped Cassandra.
  5. Tried again from Swagger UI, and it returned this error:
Failed to fetch.
Possible Reasons:

CORS
Network Failure
URL scheme must be "http" or "https" for CORS request.
  6. Restarted Cassandra and tried again from Swagger UI, but got the same error:
incubator-hugegraph-master/target/apache-hugegraph-incubating-1.5.0/apache-hugegraph-server-incubating-1.5.0/bin$ ./start-hugegraph.sh 
Starting HugeGraphServer in daemon mode...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK
Started [pid 1254569]
incubator-hugegraph-master/target/apache-hugegraph-incubating-1.5.0/apache-hugegraph-server-incubating-1.5.0/bin$ ./stop-hugegraph.sh 
no crontab for manish
The HugeGraphServer monitor has been closed
Dev/incubator-hugegraph-master/target/apache-hugegraph-incubating-1.5.0/apache-hugegraph-server-incubating-1.5.0/bin/util.sh: line 375: kill: (1254569) - No such process
Killing HugeGraphServer(pid 1254569).OK

I don't see anything relevant in the log. Here are the last few lines:

2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/vnd.gremlin-v2.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - application/json already has org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0 configured - it will not be replaced by org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, change order of serialization configuration if this is not desired.
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/vnd.gremlin-v3.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - application/json already has org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0 configured - it will not be replaced by org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, change order of serialization configuration if this is not desired.
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/vnd.graphbinary-v1.0 with org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/vnd.gremlin-v2.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - application/json already has org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0 configured - it will not be replaced by org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, change order of serialization configuration if this is not desired.
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - Configured application/vnd.gremlin-v3.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
2025-02-28 14:43:04 [main] [INFO] o.a.t.g.s.AbstractChannelizer - application/json already has org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0 configured - it will not be replaced by org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, change order of serialization configuration if this is not desired.
2025-02-28 14:43:04 [gremlin-server-boss-1] [INFO] o.a.t.g.s.GremlinServer - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
2025-02-28 14:43:04 [gremlin-server-boss-1] [INFO] o.a.t.g.s.GremlinServer - Channel started at port 8182.
2025-02-28 14:43:04 [main] [INFO] o.a.h.d.MemoryMonitor - Memory monitoring started.

Vertex/Edge example

Schema [VertexLabel, EdgeLabel, IndexLabel]

@mkj-git mkj-git added the bug Something isn't working label Feb 28, 2025

dosubot bot commented Feb 28, 2025

I found a similar closed issue related to HugeGraph not responding after a Cassandra restart, but no solution was found in the comments [1].

Regarding your issue, it seems that HugeGraph is not handling the reconnection to Cassandra after it has been restarted. This is a known challenge, as HugeGraph may not automatically reconnect to the backend once the connection is lost.

Here are a few things you might consider checking or trying:

  1. Logs for Connection Errors: Check the logs for any specific error messages related to the connection to Cassandra. Look for messages like "Failed to connect keyspace" or NoHostAvailableException, which indicate connection issues [2].

  2. Configuration for Reconnection: Ensure that your HugeGraph configuration is set up to handle reconnections. While I couldn't find specific configurations in the repository, you might want to check if there are any settings related to connection retries or timeouts.

  3. Restart HugeGraph: If HugeGraph does not automatically reconnect, you might need to restart the HugeGraph server after Cassandra is back online. This can help re-establish the connection.

  4. Graceful Handling of Restarts: HugeGraph has mechanisms to handle backend restarts gracefully, such as the PdConfigureListener, which listens for restart messages and updates configurations [3]. Ensure that your setup is using the latest configurations.

If these steps do not resolve the issue, you might need to look into more detailed logs or consider reaching out to other developers who have faced similar issues for potential workarounds.
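The log check from item 1 can be sketched in a few lines of Python. This is only an illustration, not part of HugeGraph: the function names `find_connection_errors` and `scan_log_file` are hypothetical, and the marker strings are just the two quoted above, so other driver versions may log different text.

```python
from typing import Iterable, List

# Markers quoted in the comment above. Treat this list as an assumption:
# other backends or driver versions may log different messages.
CONNECTION_ERROR_MARKERS = (
    "Failed to connect keyspace",
    "NoHostAvailableException",
)


def find_connection_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain any known connection-error marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in CONNECTION_ERROR_MARKERS)
    ]


def scan_log_file(path: str) -> List[str]:
    """Scan a log file on disk, replacing bytes that cannot be decoded."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return find_connection_errors(handle)
```

Calling `scan_log_file` on the server's `hugegraph-server.log` would list any matching lines. An empty result only means these two markers were not logged, not that the connection is healthy.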


@dosubot dosubot bot added the cassandra Cassandra backend label Feb 28, 2025

mkj-git commented Feb 28, 2025

I found the cause after enabling foreground mode: once Cassandra goes down, the server itself terminates. This is bad; it should retry and reconnect. Does HugeGraph have anything like this?

./start-hugegraph.sh 
Starting HugeGraphServer in foreground mode...
./start-hugegraph.sh: line 113: 1271256 Killed                  "${BIN}"/hugegraph-server.sh "${CONF}/${GREMLIN_SERVER_CONF}" "${CONF}"/rest-server.properties "${OPEN_SECURITY_CHECK}" "${USER_OPTION}" "${GC_OPTION}" "${OPEN_TELEMETRY}" >> "${LOGS}"/hugegraph-server.log 2>&1
Connecting to http://127.0.0.1:8080/graphs (30).error: process ID list syntax error

Usage:
 ps [options]

 Try 'ps --help <simple|list|output|threads|misc|all>'
  or 'ps --help <s|l|o|t|m|a>'
 for additional help text.

For more details see ps(1).
Starting http://127.0.0.1:8080/graphs failed
See /home/manish/Dev/incubator-hugegraph-master/target/apache-hugegraph-incubating-1.5.0/apache-hugegraph-server-incubating-1.5.0/logs/hugegraph-server.log for HugeGraphServer log output.


dosubot bot commented Feb 28, 2025

HugeGraph has a mechanism to automatically retry and reconnect to Cassandra when it goes down. The InitStore class in the HugeGraph server code specifies retry logic with up to 10 retries and a retry interval of 5000 milliseconds. This is designed to handle exceptions like OperationTimedOutException and NoHostAvailableException, which occur when Cassandra is temporarily unavailable [1].

If the server is terminating instead of retrying, ensure that your configuration is correctly set up to utilize this retry mechanism. You might also want to check the logs for any specific error messages that could provide more insight into why the server is terminating.
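For readers unfamiliar with this kind of policy, here is a generic Python sketch of "retry up to 10 times, 5000 ms apart". It is not the InitStore code and says nothing about how HugeGraph behaves at runtime. The name `call_with_retry` is hypothetical, and Python's built-in `ConnectionError` and `TimeoutError` stand in for the driver exceptions named above.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retries: int = 10,                 # value quoted in the comment above
    interval_seconds: float = 5.0,     # 5000 ms, as quoted above
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying up to `retries` times after the first failure."""
    attempt = 0
    while True:
        try:
            return operation()
        except retry_on:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(interval_seconds)
```

With the defaults, a backend that stays down is attempted 11 times in total (one initial call plus 10 retries) over roughly 50 seconds before the last exception propagates. The `sleep` parameter exists only so the wait can be replaced in tests.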



mkj-git commented Mar 1, 2025

This looks like a real problem.
If Cassandra goes down, or is started after HugeGraph, HugeGraph does not work.

{
  "exception": "class org.apache.hugegraph.exception.NotFoundException",
  "message": "Graph 'hugegraph' does not exist",
  "cause": ""
}

I get this error when I start Cassandra after HugeGraph has already started.
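A possible stopgap for the start-order case, sketched in Python under stated assumptions: wait until Cassandra's port accepts TCP connections, and only then run `./start-hugegraph.sh`. The helper names `port_is_open` and `wait_for_port` are hypothetical. Port 9042 is Cassandra's default CQL native-transport port, so adjust it if your cluster uses another one. This does not cover the case where Cassandra goes down after HugeGraph is already running.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 2.0) -> bool:
    """Probe host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if port_is_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_for_port("127.0.0.1", 9042)` returns `True` once the port accepts connections. An open port is only a rough readiness signal, since Cassandra may still be initializing behind it.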
