Not able to list logs for download if high number of logs #2162

Closed
shervinkoushan opened this issue Oct 20, 2023 · 22 comments

Comments

@shervinkoushan

shervinkoushan commented Oct 20, 2023

The logfile_download example works fine on the Pixhawk 4, but not on the mRo Control Zero H7 OEM.

This is the output:

09:21 $ build/logfile_download udp://:14540
[09:21:55|Info ] MAVSDK version: v1.4.17 (mavsdk_impl.cpp:20)
Waiting to discover system...
[09:21:55|Info ] New system on: 10.10.0.176:14540 (with sysid: 1) (udp_connection.cpp:192)
[09:21:55|Debug] New: System ID: 1 Comp ID: 1 (mavsdk_impl.cpp:496)
[09:21:55|Debug] Component Autopilot (1) added. (system_impl.cpp:377)
[09:21:55|Warn ] Vehicle type changed (new type: 2, old type: 0) (system_impl.cpp:225)
[09:21:55|Debug] Discovered 1 component(s) (system_impl.cpp:578)
[09:21:55|Warn ] No entries received (log_files_impl.cpp:167)
[09:21:56|Warn ] sending again after 0.502617 s, retries to do: 3  (520). (mavlink_command_sender.cpp:287)
[09:21:56|Warn ] sending again after 1.00862 s, retries to do: 2  (520). (mavlink_command_sender.cpp:287)
[09:21:57|Warn ] sending again after 1.5145 s, retries to do: 1  (520). (mavlink_command_sender.cpp:287)
[09:21:57|Error] Retrying failed (520) (mavlink_command_sender.cpp:307)

Any idea why it doesn't work on the mRo?

Edit: Seems to be an issue with the log contents, not the mRo

@shervinkoushan shervinkoushan changed the title from "Not able to donwload logs from mRo" to "Not able to download logs from mRo" Oct 20, 2023
@JonasVautherin JonasVautherin changed the title from "Not able to download logs from mRo" to "Not able to download logs from mRo Control Zero H7 OEM" Oct 20, 2023
@JonasVautherin
Collaborator

Feels like MAVLink messages from MAVSDK don't reach the mRo, though MAVLink messages from the mRo do reach MAVSDK. Could it be that you have an issue there?

@shervinkoushan
Author

> Feels like MAVLink messages from MAVSDK don't reach the mRo, though MAVLink messages from the mRo do reach MAVSDK. Could it be that you have an issue there?

Thanks for the reply. That might be, is there any way I can troubleshoot this?

@JonasVautherin
Collaborator

You can try to send other commands to the autopilot (e.g. arm, or set a parameter, etc) and confirm that none of them work (if my intuition is correct, they will just time out because they never reach the autopilot).

You can also use wireshark to see what happens on the network interface between mavsdk and mRo.

If you want, you can also try to describe your network topology here, so that if there is something obviously wrong we can maybe help 😊.

@shervinkoushan
Author

> You can try to send other commands to the autopilot (e.g. arm, or set a parameter, etc) and confirm that none of them work (if my intuition is correct, they will just time out because they never reach the autopilot).
>
> You can also use wireshark to see what happens on the network interface between mavsdk and mRo.
>
> If you want, you can also try to describe your network topology here, so that if there is something obviously wrong we can maybe help 😊.

Interestingly, with the pip package mavsdk 1.4.9 I am able to set parameters using the upload_params script.
The line await get_entries(drone) in the logfile_download example still hangs though.
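
For anyone hitting the same hang: a minimal sketch, not taken from the example itself, of how a listing call could be wrapped so the script fails fast instead of blocking forever. `hanging_get_entries` and `list_with_deadline` are hypothetical names, and the stand-in coroutine only imitates a call that never returns; it does not talk to a vehicle.

```python
import asyncio


async def hanging_get_entries():
    """Hypothetical stand-in for a listing call that never returns."""
    await asyncio.Event().wait()  # the event is never set, so this waits forever


async def list_with_deadline(list_call, deadline_s):
    """Await list_call(), returning None if it has not finished after deadline_s seconds."""
    try:
        return await asyncio.wait_for(list_call(), timeout=deadline_s)
    except asyncio.TimeoutError:
        return None


result = asyncio.run(list_with_deadline(hanging_get_entries, 0.1))
print("entries:", result)
```

A `None` result here only says that the call did not finish in time. It does not say whether the request or the reply got lost.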

@JonasVautherin
Collaborator

Is the mRo running the same version of PX4 as the Pixhawk? Does the mRo support logs? Like can QGC list and download logs, for instance?

@shervinkoushan
Author

> Is the mRo running the same version of PX4 as the Pixhawk? Does the mRo support logs? Like can QGC list and download logs, for instance?

Yes to both.

  • The same PX4 code is running (a modified version of 1.13.3)
  • I can see the logs in QGC

@JonasVautherin
Collaborator

Maybe there is a bug then. Can you check what happens with MAVLink on the mRo side when MAVSDK requests the logs?

@shervinkoushan
Author

> Maybe there is a bug then. Can you check what happens with MAVLink on the mRo side when MAVSDK requests the logs?

Can you please be more specific? Should I check a specific topic?

@JonasVautherin
Collaborator

I would look at the MAVLink messages being used by the log_files implementation: https://github.com/mavlink/MAVSDK/blob/main/src/mavsdk/plugins/log_files/log_files_impl.cpp

@shervinkoushan
Author

Looking at this file, the output is as follows:

  1. LOG_REQUEST_END
  2. LOG_REQUEST_LIST
  3. Then this line is called https://github.com/PX4/PX4-Autopilot/blob/f68f88b97c77d38054d54696fca5ef1a3e91822b/src/modules/mavlink/mavlink_log_handler.cpp#L175

This behavior is the same on both the Pixhawk and the mRo, but after step 3 the Pixhawk returns the logs, whereas the request to the mRo just hangs.

@shervinkoushan
Author

shervinkoushan commented Oct 25, 2023

It seems like I was wrong about the root cause of the issue. I tested another SD card now and it worked on the mRo.
Earlier I tested two mRos with two different SD cards, so I am trying to see what is different. The folder structure is pretty similar.

A card that does not work to the left, a card that works to the right:
[two screenshots: SD card folder listings, side by side]

I tried removing the -ulogs folder from the left card but it did not help.

On the PX4 side I can see that the logs are detected (165 in the left case, 23 in the right case)

Is the issue that it struggles to handle 100+ logs, or large logs in general?

I copied the contents of the card that worked to the card that didn't, and that worked fine. So the contents of the SD card are the root problem, not the card itself.

But important to note here is that there is not an issue with the mRo, sorry for the confusion.

@JonasVautherin
Collaborator

It is weird that it was working with QGC and not with MAVSDK, if that was an sdcard issue 🤔

@shervinkoushan
Author

shervinkoushan commented Oct 25, 2023

> It is weird that it was working with QGC and not with MAVSDK, if that was an sdcard issue 🤔

Could be an issue on the PX4 side as well, not sure. If you want I can send you the SD-card contents (by email) so you can debug it further.

@shervinkoushan shervinkoushan changed the title from "Not able to download logs from mRo Control Zero H7 OEM" to "Not able to list logs for download if high number of logs" Oct 25, 2023
@shervinkoushan
Author

shervinkoushan commented Oct 25, 2023

I can see that QGC triggers LOG_REQUEST_LIST two times. The second time it requests the range 50-167.

MAVSDK first triggers LOG_REQUEST_END, then LOG_REQUEST_LIST in the range 0-167. Maybe MAVSDK simply requests too much data at a time?

It could also be that list_timeout gets called too quickly:

// This first step can take a moment, on PX4 with 100+ logs I see about 2-3s.
_system_impl->register_timeout_handler(
    [this]() { list_timeout(); }, _system_impl->timeout_s() * 10.0, &_entries.cookie);

@JonasVautherin
Collaborator

Oh right, maybe there is some tuning to do there. The message definition doesn't say anything about a limit for LOG_REQUEST_LIST, so I would assume that the autopilot should do it in a way that works...

> It could also be that list_timeout gets called too quickly:

Hmm, I don't know. If I understand the code correctly, the timeout is refreshed every time a LOG_ENTRY is received, and even then there is some retry logic...

Would be nice to check what PX4 does when it receives LOG_REQUEST_LIST and if it could be a problem if too many entries are requested...
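
To make the "refreshed on every LOG_ENTRY" reasoning concrete, here is a rough simulation under stated assumptions. It is not MAVSDK's code. It assumes a single timeout duration that is re-armed each time an entry arrives, and it leaves out the retry logic entirely. `listing_times_out` and its parameters are hypothetical names.

```python
def listing_times_out(arrival_times_s, timeout_s, expected_count):
    """Simulate a watchdog that is re-armed each time an entry arrives.

    arrival_times_s: sorted arrival times, in seconds since the request was sent.
    Returns True if the watchdog would fire before expected_count entries arrived.
    """
    deadline = timeout_s  # armed when the request is sent
    received = 0
    for t in arrival_times_s:
        if t > deadline:
            return True  # the gap since the last entry exceeded the timeout
        received += 1
        deadline = t + timeout_s  # refresh on every received entry
    # If entries stop arriving before the list is complete, the watchdog fires eventually.
    return received < expected_count


# Slow but steady entries: 12 s in total, yet no single gap exceeds 5 s.
print(listing_times_out([4.0, 8.0, 12.0], timeout_s=5.0, expected_count=3))
# No entry within the first 5 s: the watchdog fires.
print(listing_times_out([6.0], timeout_s=5.0, expected_count=1))
```

Under this model the total listing time does not matter, only the longest silence. If that matches the real behavior, a slow first response would be the thing to look at rather than the number of entries as such.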

@julianoes
Collaborator

@shervinkoushan
Author

> You could try to increase the timeout: https://github.com/mavlink/MAVSDK/blob/v1.4/src/mavsdk/core/include/mavsdk/mavsdk.h#L267-L275

Unfortunately this did not help.
I think the problem is that too many logs are requested at the same time. Should probably try to do it in batches of 100?

Have you been able to reproduce this issue on your side?
Steps to reproduce:

  1. Have 150+ ulogs on a Pixhawk/mRo/whatever
  2. Run the logfile_download example (I tested both the .cpp and .py examples)
  3. Code execution stops on get_entries()
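
The "batches of 100" idea above could look roughly like the sketch below. This is an illustration only, not how MAVSDK or QGC splits requests. `batch_ranges` is a hypothetical helper, and it assumes the `start` and `end` indices of LOG_REQUEST_LIST are inclusive.

```python
def batch_ranges(first, last, batch_size):
    """Split the inclusive index range [first, last] into inclusive batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ranges = []
    start = first
    while start <= last:
        end = min(start + batch_size - 1, last)
        ranges.append((start, end))
        start = end + 1
    return ranges


# The range from this issue (0-167), requested 100 entries at a time.
for start, end in batch_ranges(0, 167, 100):
    print(f"LOG_REQUEST_LIST start={start} end={end}")
```

Each batch would be sent only after the previous one has been fully received, so the autopilot never has to stream more than `batch_size` entries per request.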

@JonasVautherin
Collaborator

Is PX4 having an issue when there are "too many" logs? If that's the case, shouldn't that be fixed on the PX4 side?

@M4ster00gway

Any updates on this issue?

@julianoes
Collaborator

No, sorry, it's on my long todo list of things I ought to look into. Consider GitHub sponsoring if this is important and urgent for you.

@julianoes
Collaborator

Actually, it might have been fixed in the meantime: #2234

Try again with v2.4.0.

See also: mavlink/MAVSDK-Python#662

@julianoes
Collaborator

I'm closing this. Comment if this is not fixed in the meantime.


4 participants