Can't send messages in persistent group chat #961
Comments
I could reproduce this bug and noticed that I can still receive messages of some people in the group. |
Closing because we got rid of the old PGC PR. |
I just hit this issue running qTox with toxcore v0.2.9. I loaded a persistent group that I've loaded a few times before and was unable to send messages or set title. I still receive peer messages. Restarting qTox doesn't fix the issue. Other groups still work fine. There's no log on message send or title set. Since the group seems to be stuck this way, I'd be happy to dig into what's going on with gdb if a core dev can give me some guidance. |
We hit the same issue with a new group member, running Toxcore 0.2.9. We have a group chat with 3 members: A, B, and C. All three were online, but A's messages weren't delivered to B or C, and A didn't receive their own messages either. This persisted across multiple client restarts of A. During the chat, A saw either B or C disconnect and reconnect in the group, and B and C saw each other disconnect and reconnect in their 1-1 chat. After that point, A's messages were delivered to A and B, but still not to C. After a while, C closed and re-opened their client, and then A, B, and C could all see everyone's messages. @zugz this is one of the cases you asked me about on IRC IIRC. I'll reopen this issue since it's reproducible on tip and sounds like it's being looked into. |
Thanks for reporting this. I'm currently wholly mystified, and haven't
managed to reproduce the bug.
Some questions to narrow down where the problem could be:
Did any of the members join or leave any other groups?
Which pairs of A,B,C were tox friends?
Am I right to understand that during the period when A's messages
weren't being sent to everyone, A nonetheless saw both B and C in the
peer list for the group (except during brief disconnections)?
|
I hit this again, now on v0.2.10. Sorry, will follow up on your questions now for the latest repro case. In this case, the group had 4 members. A was in the call, crashed, then started back up and rejoined the call. After rejoining, none of the peers could hear A, but A could hear all peers; likewise, none of the peers received A's text messages, but A saw their own.
A was friends with B and C
No. All 4 were in a group audio call playing some games - none were doing anything tox related.
Yes. B, C, D in this case all showed in the peer list, and A was receiving audio for all of them. A leaving and being re-invited to the group did not fix the issue, and A restarting their client did not fix the issue. All 4 members needed to move to a new group, where things then worked. A was using qTox which saves the tox profile pretty rarely - only on friend add/remove basically, so I don't see how the qTox crash during the call could somehow corrupt tox profile state - but possibly the crash is an important part of the repro. |
A was in the call, crashed,
Thanks. That was the hint I needed. I believe I see the problem now:
messages from a particular peer come with a "message number",
incremented with each message from that peer, and messages with too old
a message number are ignored by other peers. The current message number
is saved in the savedata, but if there's a crash and the peer restarts
from an old save file, they'll start from an old message number and so
other peers will ignore them. In particular they ignore the kill message
that the peer sends if it leaves the group, which explains why leaving
and rejoining doesn't help - the other peers don't even realise the peer
left.
All this goes for audio too (there's a separate message number for lossy
packets).
This may take some thought to fix properly.
|
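The check described in the comment above can be sketched roughly as follows. This is a minimal illustration, not toxcore's actual code: the `Peer_State` struct, the `accept_message` function, and the half-range comparison rule are all assumptions made for the example. It shows why a peer that restarts from an old save file, and therefore sends an old message number, gets silently ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-peer receive state; not toxcore's real struct. */
typedef struct Peer_State {
    uint32_t last_message_number; /* highest number accepted from this peer */
} Peer_State;

/* Accept a message only if its number is strictly newer than the last one
 * accepted from this peer. The unsigned subtraction wraps, so a number that
 * is "behind" shows up as a very large difference and is rejected.
 * (Assumed rule for illustration; the real acceptance window may differ.) */
static bool accept_message(Peer_State *peer, uint32_t message_number)
{
    uint32_t ahead = message_number - peer->last_message_number;

    if (ahead == 0 || ahead > UINT32_MAX / 2) {
        return false; /* duplicate or too old: ignored */
    }

    peer->last_message_number = message_number;
    return true;
}
```

Under this sketch, if the receiver last accepted number 100 from A and A restarts from a save file holding 50, every message A sends is dropped until A's counter passes 100 again. That includes the kill message A sends when leaving the group.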
I think this ticket was accidentally closed. The root cause discovered by zugz was a few months after #1321 was opened, and based on chats this issue is still unresolved. Reopening. |
Ah, whoops. Yeah, the commit message in #1321 had "possibly fixes x" and "fixes x" is some kind of magic GitHub thing that auto-closes the issue mentioned when merged. |
Put a future message number into the save file. Peers require the message numbers of messages we send to increase monotonically. If we save the current message number, then send further messages, then quit without saving (e.g. due to a crash), and then resume from the old save data, then monotonicity will fail. This commit works around this problem by introducing an offset when saving the current message number, so that even in the above circumstance, as long as fewer messages than the offset were sent between saving and reloading, the sent message numbers will increase monotonically. The choice of offset is a balance between wanting it to be large enough that there is room for plenty of messages to be sent in the above scenario, and wanting to avoid the following potential problem: if we repeatedly save and reload without sending any further messages, then the message number may increase so far that peers will interpret an eventual message as being old. This is not conceivably a practical issue for the 32-bit lossless message numbers, but is a concern for the 16-bit lossy message numbers.
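The workaround in the commit message can be illustrated with a small sketch. The offset value and the function names below are assumptions for the example; the commit message does not state the actual offset, and this is not the code from the commit.

```c
#include <stdint.h>

/* Assumed offset, for illustration only; the real value is not given above. */
#define SAVED_NUMBER_OFFSET 1024u

/* Value written to the save file: a "future" message number rather than
 * the current one. */
static uint32_t number_for_save(uint32_t current)
{
    return current + SAVED_NUMBER_OFFSET;
}

/* Advance the counter and return the number to attach to the next
 * outgoing message. */
static uint32_t next_message_number(uint32_t *current)
{
    *current += 1;
    return *current;
}
```

For example, with the counter at 100 the save file would hold 1124. If 10 more messages are then sent (reaching 110) and the client crashes, reloading the save resumes at 1124, so the next message is numbered 1125 and still looks newer to peers. The guarantee only holds if fewer than `SAVED_NUMBER_OFFSET` messages were sent between the save and the crash.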
With persistent groups it sometimes happens that I can't send messages in a group chat I'm connected to. Unfortunately I don't know how to reproduce this bug, and it happens randomly. I appear as online and I can message people, but I am not able to send messages to a group or groups. Normally, if I lost connection to a group, I should just be reconnected, but here that never happens. The only way for me to be able to send messages again is to manually leave the group and join it again, or to restart the client.