Implemented methods to save and restore PyBullet states. #33
Conversation
Great!
Could you also add brief documentation by creating a new file in docs/usage/? Something like "Save and restore states", with a short piece of code that explains how to use save_state and restore_state (in my opinion, explaining remove_state is not necessary).
```python
import gym
import panda_gym

env = gym.make("PandaReach-v2")
obs = env.reset()
# [Interact]
valuable_state = env.save_state()
# [Try a sequence of actions]
env.restore_state(valuable_state)  # Restore the valuable state
# [Try an alternative sequence of actions.]
env.close()
```
(No need to replace the bracketed comments with actual code.)
I have added an example of a greedy random search to the documentation. PTAL.
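For reference, a greedy random search on top of these methods could look roughly like the sketch below. This is an illustrative sketch, not the code that was added to the docs; only `save_state`, `restore_state`, and `remove_state` come from this PR, and it assumes the gym v0.21-style 4-tuple `step` API used elsewhere in this thread.

```python
import gym
import panda_gym

env = gym.make("PandaReach-v2")
env.reset()

# Greedy random search: try several random actions from the same
# checkpointed state, then commit the one with the highest reward.
best_reward, best_action = -float("inf"), None
state_id = env.save_state()  # checkpoint the current simulation state
for _ in range(10):
    action = env.action_space.sample()
    _, reward, _, _ = env.step(action)
    if reward > best_reward:
        best_reward, best_action = reward, action
    env.restore_state(state_id)  # rewind before trying the next action
env.remove_state(state_id)  # free the checkpoint once the search is over

obs, reward, done, info = env.step(best_action)  # commit the best action found
env.close()
```

Restoring the same state id any number of times is what makes this kind of rollback cheap compared to replaying the whole episode.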
This looks great. Now you need to update the tests. If you want some help, feel free to ask.
Thanks! I have added the unit tests. However, I'm not super familiar with pytest.
To use pytest, install it in your virtual env with `pip install pytest`. Then just run `pytest` from the repository root.
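For a rough idea of shape, a test for the new methods could look like the sketch below. The test name and layout are made up; only `save_state`, `restore_state`, and `remove_state` (with `save_state` returning an `int`, per the suggestion later in this thread) come from this PR.

```python
import gym
import panda_gym

def test_save_restore_remove():
    env = gym.make("PandaReach-v2")
    env.reset()
    state_id = env.save_state()
    assert isinstance(state_id, int)  # save_state returns a PyBullet state id
    env.restore_state(state_id)       # restoring a saved state should not raise
    env.remove_state(state_id)        # neither should removing it
    env.close()
```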
Thanks! It seems like a bunch of tests are failing. I reproduced this by cloning a fresh copy of the repo. Here is the error message:
Yes, these errors come from the latest version of gym. I solved the problem yesterday on the master branch and just included these changes in your branch. Pull the changes and force-reinstall gym (`pip install --force-reinstall gym`).
Awesome, thanks! I fixed some small errors in the tests, but everything should be good now. Everything is green locally.
I just thought of something: the desired goal is not saved with the PyBullet state.
Is the desired goal not an object whose state is captured in PyBullet?
No, it is the opposite: a target position is sampled, and a fake object (just for rendering; the agent can't interact with it) is placed in the simulation.
In my opinion, this should work:

```python
import numpy as np
from panda_gym.envs import PandaReachEnv

env = PandaReachEnv()
env.reset()
state_id = env.save_state()

# Perform the action
action = env.action_space.sample()
next_obs1, reward, done, info = env.step(action)

# Restore and perform the same action
env.reset()
env.restore_state(state_id)
next_obs2, reward, done, info = env.step(action)

# The observations in both cases should be equal
assert np.all(next_obs1["achieved_goal"] == next_obs2["achieved_goal"])
assert np.all(next_obs1["observation"] == next_obs2["observation"])
assert np.all(next_obs1["desired_goal"] == next_obs2["desired_goal"])
```
I see what you mean. I didn't add an assertion for the desired goal since it cannot change during an episode, but I could add that.
Done!
I think I explained it wrong: this can be done by storing the goals in a dictionary that associates each saved state id with its goal.
Maybe something like:

```python
def save_state(self) -> int:
    # Save the simulation state and remember the goal associated with it
    state_id = self.sim.save_state()
    self._saved_goal[state_id] = self.task.goal
    return state_id

def restore_state(self, state_id: int) -> None:
    # Restore both the simulation state and the matching goal
    self.sim.restore_state(state_id)
    self.task.goal = self._saved_goal[state_id]

def remove_state(self, state_id: int) -> None:
    # Drop the stored goal before discarding the simulation state
    self._saved_goal.pop(state_id)
    self.sim.remove_state(state_id)
```
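One detail the snippet above leaves implicit: `self._saved_goal` must exist before the first call to `save_state`. A minimal sketch, assuming the mapping is created in the environment's constructor (the thread does not show this part):

```python
def __init__(self) -> None:
    # Assumed initialization (not shown in the thread):
    # maps PyBullet state ids to the task goals active when they were saved.
    self._saved_goal = {}
```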
I see! Just pushed the change. |
Useful trick to help you format your code: install the project's code formatter and run it on the files you changed before committing.
Thanks!
Thank you for contributing, your changes have been included in version 2.0.4 :)
This PR addresses the feature discussed in #32.