-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restart every devices after restart HA #528
Comments
I've noticed this also and I have my pc plugged into a meross plug so that turns off without warning. One time I thought my gpu broke as it was making a terrible noise afterwards but somehow as cable got sucked into one of the fans during the process. :s I hope a fix can be found as I have to remember to turn off my pc before updating or restarting HA. Using mss305 hardware 8.0.0 firmware 8.3.15 |
Same issue |
I'm sorry for this behavior but, at the moment, I have no idea of what's going on especially since the component is not being updated since quite a long time so the issue might lie in the way the latest HA release initializes comnponents. |
The same is happening with me, not sure if it is coincidence but I updated the Firmware on the Meross mss305 to version 8.3.15 this morning. |
I'm using Docker on HA, AU plugs - mss210 (hw 7.0.0, fw 7.3.9) and mss310 (hw 8.0.0.0 and fw 6.3.23) and its working fine - HA 25.2.4. Seems restricted to mss305 devices? |
I didn't expierence this before I updated my firmware on my mss305's but I haven't been using this setup for long. At the time I applied the firmware update and then did a HA update so can't link it to one or the other but it appears to be the firmware thats the issue. I don't know if re-linking the plugs would be of any use or if HA will have a workaround. |
I'll throw my point of view into the issue: I think the switch toggling off/on might be due the device rebooting because of improper querying during HA startup. In general, the devices are 'resilient' to malformed queries but this is not always true. Also, the aforementioned 'general status query' is a really basic one universally accepted by all device types (in fact the issue only arises on this special device/fw). I'd ask if you can collect a 'diagnostic' from any of the misbehaving devices (the start of the diagnostic could trigger the toggling off/on though) Also, after the 'toggling glitch' does the device work correctly in HA/meross_lan (i.e. state update, toggling and so on..)? |
Hopefully here is the diagnostic file from one of my devices, and yes it did toggle the switch off/on. Let me know if you need anything else, and I'll try to get it for you. Once the initial poll happens and the device powers off/on everything works fine with them in HA. The power toggles work and states update correctly. |
Sadly also affected by this with 31 MSS305 power outlets. All of mine already had the latest firmware installed prior to this issue occurring. The issue was not present before the 2025 series of Home Assistant updates, as I had an automation restarting my router, managed LAN switches, and RPi (Home Assistant) at 4:00. Now I need to exclude the RPi and hope the Voice Assistant stuff doesn't get stuck. Similar to jonlicence, everything works as intended after the initial On/Off/OriginalState toggle. Since my Home Assistant is also behind one of these, it sometimes results in an endless loop of On/Off/OriginalState until I get home. As a temporary solution, I put all the important things behind battery power backups now - not efficient (+30W base load) but it offsets the sudden power loss. Considering all of my devices are misbehaving on reboot of Home Assistant, should I grab it from all of them? |
I don't have many tools to inspect this behavior right now but I might supsect (just speculation) that the devices don't like a special query used when coming online (already guessed about this before..) You could help me by trying manually querying any of the 'offended' devices by using the action: meross_lan.request
data:
protocol: auto
method: GET
namespace: Appliance.System.Debug
payload: "{}"
device_id: PUT_THE_HEX_FORMATTED_DEVICE_ID_HERE Where you can recover the aforementioned device_id from the device configuration UI (or use instead the That message (for namespace If that message doesn't trigger the toggling you could try querying for another namespace : |
Hi krahabb, I'm away from home this weekend but will definitely try this on Monday and let you know the results. |
Tried both, neither of them caused the issue. Edit: Enabling debug logs has my log file spammed with this:
Edit: config_entry-meross_lan-01J4Z3FFBK2WTHV76RTGK9T1TK(1).json Trace
Edit: None of the commands above appear to trigger it. |
@Xaymar, As for the toggling issue it leaves us with no apparent solutions at the moment... |
Scanning through the diagnostic data, I found exactly one that had a different trace compared to the rest: config_entry-meross_lan-01J59XDGT8K29MBVMW2K4AKZAN.json. This one has a file 7KB bigger than the average. Trace
I do not see anything immediately obvious here. It seems that the diagnostics start in the middle of work instead of at the beginning, so we might be missing the real issue entirely. |
I think I found it! These chains of commands triggers the reset: Chain 1:
Chain 2:
Chain 3:
Edit: From what it looks like based on Wi-Fi sniffing, the entire device crashes and hard resets if it encounters a malformed request. |
The internal buffer overflow is an high candidate for the issue,to overcome it, we'd need to understand what's causing it. The sequnces of commands triggering the reset don't really make sense since they look pretty uncorrelated though. I might suspect instead that the issue arises when commands are sent back-to-back. This could explain why devices typically reset when HA/meross_lan initializes them since, at that time, there's a a bit of message spamming since the initialization queues out at least 3 queries (likely 4) 'almost' back to back. Now, except patching the software to avoid this edge case, the query for the |
I found an easier way to repro the problem: Disable the "Outlet" Entity, then enable it again. After exactly 30 seconds it will toggle Off and On, just like it would after restarting Home Assistant. That should make debugging the problem much easier, since we no longer have to restart everything. Below are two json diagnostic traces:
Edit: The order of files above is a bit weird. The one without a number should be the one downloaded prior to the entity disable, and the one with (1) and (2) should be after.Edit: Another set from the same device:
Edit: No reset on:
Trace ends here. I followed the chain of requests as best as I could, but as I don't have the same speed as a machine, I don't seem to be triggering the buffer overflow. Either that or the trace is missing requests to the entity itself, which would be pretty bad for us trying to find out what's going on. |
I'm pretty much stuck. I tried running a script that emits these events in their original order as fast as possible, but it did not trigger the issue. There is definitely data missing from the trace, but unfortunately I do not have the capacity to intercept encrypted Wi-Fi traffic fully to figure out what is missing. |
I've just published a 'fresh' release (v5.4.2) fixing some compatibility issues against incoming HA core 2025.3 but I've also added a specific feature in order to try 'experiment' a bit with this issue. In the device configuration there's now an option to 'disable multiple requests'. Leaving the option disabled the software will work as in previous versions but you can try activate it and see if this mitigates the issue. |
Turning on the "Disable multiple request packing" seems to address the issue. I was able to restart Home Assistant w/o entering an infinite loop of Home Assistant restarting. |
Ok, got it. I'll see if it's possible to better set the initial estimated buffer size by reducing it for these devices so that the multiple requests never hits the maximum limit thus avoiding the device reboot at start. |
Incorrect timestamp: 1741548706 seconds behind HA (174154870 on average) That's the error in the protocoll, and all the ms305 restart after a reboot of HA |
@jvanderweken, |
if i restart HA all the Meross Devices, in my case mss305 8.0.0, turn off and on. so the device thats pluged in gets off and on.
The text was updated successfully, but these errors were encountered: