[Enhancement]: Wrong gains for weight initialization #1559

Open · 4 tasks done
OliEfr opened this issue Jun 16, 2023 · 2 comments
Assignees: araffin
Labels: enhancement (New feature or request), help wanted (Help from contributors is welcomed)

Comments

OliEfr commented Jun 16, 2023

Enhancement

The recommended gains for weight initialization depend on the activation function used, see the torch docs. However, the gains currently used in ActorCriticPolicy are hard-coded and always the same, regardless of the activation function. See here.

I recommend making the gains depend on the activation function used (in practice, mainly ReLU and tanh).
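For illustration, a minimal sketch of what this could look like (the helper below is hypothetical, not existing SB3 code), using the gains PyTorch recommends via torch.nn.init.calculate_gain:

```python
from typing import Type

import torch.nn as nn


def gain_for_activation(activation_fn: Type[nn.Module]) -> float:
    """Hypothetical helper: pick the orthogonal-init gain based on the
    activation function instead of always using a hard-coded value."""
    gains = {
        nn.ReLU: nn.init.calculate_gain("relu"),
        nn.Tanh: nn.init.calculate_gain("tanh"),
        nn.LeakyReLU: nn.init.calculate_gain("leaky_relu"),
    }
    # Fall back to 1.0 for activations without a recommended gain.
    return gains.get(activation_fn, 1.0)


# Hidden layers could then be initialized with, e.g.:
# nn.init.orthogonal_(layer.weight, gain=gain_for_activation(nn.Tanh))
```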

If you agree, I would be happy to implement this myself and open a PR.

Thanks and a good day!

To Reproduce

--

Relevant log output / Error message

--

System Info

--

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.
OliEfr added the bug (Something isn't working) label on Jun 16, 2023
OliEfr changed the title from [Bug]: wrong gains for weight initialization to [Bug]: Wrong gains for weight initialization on Jun 16, 2023
araffin added the enhancement (New feature or request) label and removed the bug (Something isn't working) label on Jun 16, 2023
OliEfr changed the title from [Bug]: Wrong gains for weight initialization to [Enhancement]: Wrong gains for weight initialization on Jun 17, 2023
araffin self-assigned this on Jul 20, 2023
araffin (Member) commented Jul 20, 2023

Hello,
those gains are only used for the orthogonal initialization (https://pytorch.org/docs/stable/_modules/torch/nn/init.html#orthogonal_); when it is not used, the default PyTorch initialization applies.

The gains come from OpenAI Baselines, to keep results consistent. So far I haven't seen any investigation of the effect of the gain compared to other initializations (that alone would already be a good contribution), or at least of whether a constant gain with tanh/ReLU makes a difference.
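For context, here is roughly what the behaviour described above looks like (a sketch under the assumption that ortho_init is enabled, not the exact SB3 code): each linear/conv layer of the policy and value networks gets an orthogonal init with a fixed gain taken from the OpenAI Baselines constants.

```python
import numpy as np
import torch.nn as nn


def init_weights(module: nn.Module, gain: float = 1.0) -> None:
    """Orthogonal init with a fixed gain, applied regardless of the activation."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.orthogonal_(module.weight, gain=gain)
        if module.bias is not None:
            module.bias.data.fill_(0.0)


# Gains in the spirit of OpenAI Baselines: sqrt(2) for hidden layers,
# 1.0 for the value head, 0.01 for the policy head.
mlp = nn.Sequential(nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh())
mlp.apply(lambda m: init_weights(m, gain=np.sqrt(2)))
```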

araffin added the help wanted (Help from contributors is welcomed) label on Jul 20, 2023
OliEfr (Author) commented Jul 21, 2023

Yes, I am talking about orthogonal init. I agree that it is useful to keep it consistent with OpenAI Baselines. A study on the effect of the gain on convergence would be useful.

It may be a coincidence (?) that the standard gain listed for ReLU for any initialization is also sqrt(2) (Link). The gain implemented in OpenAI Baselines and SB3 is likewise sqrt(2); maybe they just used ReLU by default and never investigated the gain?
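This is easy to check with PyTorch directly (the values below come from calculate_gain itself):

```python
from torch.nn import init

print(init.calculate_gain("relu"))  # 1.4142... = sqrt(2), matches the hard-coded constant
print(init.calculate_gain("tanh"))  # 1.6666... = 5/3, differs from sqrt(2)
```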

One study that partly investigates the impact of weight initialization is this. They find that:

  • initializing the policy MLP with smaller weights in the last layer helps, and
  • the network initialization scheme (C56) does not matter too much.
