
Random simulator crash during training #2126

Closed
sillycornvalley opened this issue Aug 8, 2019 · 3 comments

@sillycornvalley

Has anyone successfully completed the DRL training for following cables in the LandscapeMountains environment? I'm currently trying to run it and I've been running into issues like random simulator crashes. The reward after about 10,000 steps still seems very low, and I'm not sure whether training is actually happening.
Could someone help me with this?

msb336 (Contributor) commented Aug 8, 2019

This is a race condition bug in UpdateableObject::update() with a low repro rate. There is a PR currently under review that solves this problem:
#1970

If you use that branch, you should be able to run long training sessions without crashing for this reason.
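
For context, the crash described here stems from unsynchronized concurrent access to shared state in an update loop. Below is a minimal illustrative sketch in Python (not AirSim's actual C++ code; the class and field names are made up) showing how an unguarded read-modify-write can race, and how a lock removes the race:

```python
import threading

class Updateable:
    """Toy stand-in for an object whose update() is called from multiple threads."""
    def __init__(self):
        self.ticks = 0
        self._lock = threading.Lock()

    def update_unsafe(self):
        # Unsynchronized read-modify-write: two threads can read the same
        # value and both write back value + 1, losing an increment.
        self.ticks += 1

    def update_safe(self):
        # Guarding the shared state with a lock serializes the updates.
        with self._lock:
            self.ticks += 1

def hammer(update, n=100_000, workers=4):
    threads = [threading.Thread(target=lambda: [update() for _ in range(n)])
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

unsafe, safe = Updateable(), Updateable()
hammer(unsafe.update_unsafe)
hammer(safe.update_safe)
print(unsafe.ticks)  # may be less than 400000 when increments are lost
print(safe.ticks)    # always 400000
```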

@sillycornvalley (Author)

Okay, thanks. I'll try using that branch.
By the way, do you have a rough idea of what the maximum reward after training should look like? Just an approximation so I can check that my algorithm is working as intended, because even after 10,000 steps the reward still seems very low.

msb336 (Contributor) commented Aug 12, 2019

@sillycornvalley
That depends heavily on your environment and reward function. If you are providing sparse rewards (e.g. only rewarding at the end of an episode), and especially if your episodes are long and often do not end in the desired behavior, it can take a very long time to converge on something useful.

Here is an intro-level article on how to properly shape reward functions for an RL training simulation.
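
To illustrate the difference, here is a minimal sketch of a sparse versus a shaped reward for a hypothetical goal-reaching task (the function and parameter names are assumptions for illustration, not part of the AirSim example):

```python
def sparse_reward(reached_goal: bool) -> float:
    # Signal only at the end of an episode; most steps give the agent no feedback.
    return 1.0 if reached_goal else 0.0

def shaped_reward(prev_dist: float, dist: float,
                  reached_goal: bool, collided: bool) -> float:
    # Dense feedback every step: positive when the agent moves toward the goal.
    reward = prev_dist - dist
    if reached_goal:
        reward += 10.0   # terminal bonus
    if collided:
        reward -= 10.0   # terminal penalty
    return reward
```

With the shaped version the agent receives a learning signal on every step instead of only at episode end, which usually speeds up convergence considerably.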

msb336 closed this as completed Aug 22, 2019