questions on v_grads = torch.autograd.grad(loss,v_alphas+v_weights) #27

cholihao · 2019-11-28T10:31:50Z

In architect.py, I'm confused about the following three lines of code:
v_grads = torch.autograd.grad(loss, v_alphas + v_weights)
dalpha = v_grads[:len(v_alphas)]
dw = v_grads[len(v_alphas):]
Why is the gradient computed w.r.t. (v_alphas + v_weights), with dalpha then retrieved from v_grads[:len(v_alphas)]? Based on equation (7), I thought it should be computed w.r.t. v_alphas only.
My other question: why can dalpha and dw be taken from v_grads directly, instead of calling autograd separately for each?
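
To make the second question concrete, here is a minimal sketch of the slicing pattern. It is not the code from architect.py. The tensors and the toy `compute_loss` function are hypothetical stand-ins, and it assumes v_alphas and v_weights are tuples, so that `+` concatenates them into one tuple of inputs. torch.autograd.grad returns one gradient per input, in the same order as the inputs, so a single call followed by slicing should give the same values as two separate calls.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the architecture parameters and network weights.
v_alphas = tuple(torch.randn(3, requires_grad=True) for _ in range(2))
v_weights = tuple(torch.randn(3, requires_grad=True) for _ in range(3))


def compute_loss():
    # Toy loss (an assumption for illustration) that depends on every tensor.
    alpha_term = sum((a ** 2).sum() for a in v_alphas)
    weight_term = sum((w ** 3).sum() for w in v_weights)
    cross_term = (v_alphas[0] * v_weights[0]).sum()
    return alpha_term + weight_term + cross_term


# One call: the inputs are the concatenated tuple, and the returned
# gradients come back in the same order as the inputs.
v_grads = torch.autograd.grad(compute_loss(), v_alphas + v_weights)
dalpha = v_grads[:len(v_alphas)]
dw = v_grads[len(v_alphas):]

# Two separate calls, each on a freshly built graph, for comparison.
dalpha_sep = torch.autograd.grad(compute_loss(), v_alphas)
dw_sep = torch.autograd.grad(compute_loss(), v_weights)

same_alpha = all(torch.allclose(a, b) for a, b in zip(dalpha, dalpha_sep))
same_w = all(torch.allclose(a, b) for a, b in zip(dw, dw_sep))
print(same_alpha, same_w)
```

In this sketch the single call differs from the separate calls only in cost: it needs one forward and one backward pass instead of two.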
