cifar-10 inception example #89

Closed · wants to merge 1 commit
Conversation

@ieee8023 commented Nov 19, 2016

Hi, I made an example of an Inception model ported from MXNet to Lasagne and run on the CIFAR-10 dataset. It is simple and self-contained, with instructions to run it.

@f0k (Member) commented Nov 23, 2016

Thank you for porting this example! But how useful is it to have a baseline at 80% test accuracy? The DenseNet example is also self-contained and achieves 94%. It would also need some more cleaning: for example, the _grayscale function is missing (it should be added or not referenced at all), and you're using batch normalization incorrectly (after the rectifier instead of before it, and with a redundant bias; you should use batch_norm instead of BatchNormLayer). But even then, I'm not sure it provides any advantage over the DenseNet example, in which you could simply exchange the model for yours if you wanted to. Thank you anyway!

PS: I quite like the idea of your https://github.com/ieee8023/NeuralNetwork-Examples repository, "the same small networks implemented in different frameworks" -- with a little more attention to the pesky details and differences between frameworks, this could become a nice collection!

@f0k closed this Nov 23, 2016
@ieee8023 (Author) commented
I find that having a single notebook is easier to understand than a few Python files. Intermediate results can be shown, so it is easier than checking out and running the code.

Can you show the correct way to use Batch Norm? I was looking online and I believed that to be the correct way.

@benanne (Member) commented Nov 27, 2016

Actually, there has been some discussion lately about whether to put batchnorm before or after the nonlinearity. Although the original paper puts it before (because that is where the distribution of activations is most like a Gaussian), some people have found that putting it after can work better. I can't recall where I read this originally, but it's discussed here as well: gcr/torch-residual-networks#5

@f0k (Member) commented Nov 28, 2016

Can you show the correct way to use Batch Norm? I was looking online and I believed that to be the correct way.

The documentation for BatchNormLayer says:

This layer should be inserted between a linear transformation (such as a DenseLayer, or Conv2DLayer) and its nonlinearity. The convenience function batch_norm() modifies an existing layer to insert batch normalization in front of its nonlinearity.

So as I said:

you should use batch_norm instead of BatchNormLayer

In your code, you would literally just do a search & replace of BatchNormLayer with batch_norm.
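A minimal sketch of what that replacement amounts to, assuming Lasagne's standard `Conv2DLayer`, `BatchNormLayer`, and `batch_norm` API; the filter counts and sizes are illustrative, not taken from the PR:

```python
from lasagne.layers import InputLayer, Conv2DLayer, BatchNormLayer, batch_norm
from lasagne.nonlinearities import rectify

net = InputLayer((None, 3, 32, 32))

# Before: normalization applied after the rectifier, with the convolution
# still learning a bias that batch normalization makes redundant.
wrong = BatchNormLayer(Conv2DLayer(net, num_filters=64, filter_size=3,
                                   pad=1, nonlinearity=rectify))

# After: batch_norm() removes the redundant bias and inserts the
# normalization between the convolution and its rectifier.
right = batch_norm(Conv2DLayer(net, num_filters=64, filter_size=3,
                               pad=1, nonlinearity=rectify))
```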

Actually there has been some discussion about whether to put batchnorm before or after the nonlinearity lately.

True, we might want to weaken the formulation in the documentation a bit! But this PR was about porting a model from MXNet, and that model places batch normalization before the rectifier (see ConvFactory in https://github.com/dmlc/mxnet/blob/master/example/notebooks/cifar10-recipe.ipynb). Furthermore, placing batch normalization before the nonlinearity seems to be the more common practice, so for now we should encourage batch_norm() in our examples.
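For reference, a hedged sketch of how that ConvFactory pattern (convolution, then batch normalization, then rectifier) maps onto Lasagne; the helper name and parameters are illustrative, not taken from the original notebook:

```python
from lasagne.layers import Conv2DLayer, batch_norm
from lasagne.nonlinearities import rectify

def conv_factory(incoming, num_filters, filter_size, stride=1, pad=0):
    """Convolution -> batch normalization -> rectifier, as in the MXNet recipe."""
    layer = Conv2DLayer(incoming, num_filters=num_filters,
                        filter_size=filter_size, stride=stride, pad=pad,
                        nonlinearity=rectify)
    # batch_norm() takes over the layer's nonlinearity, drops the redundant
    # bias, and applies the normalization before the rectifier.
    return batch_norm(layer)
```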
