Wednesday, April 22, 2015

Practical SGD: The Backprop Book of Tricks (and the TOC of Neural Networks Tricks of the Trade)

Today I started reading Yann LeCun's (with Bottou, Orr, and Müller) Efficient Backprop article, which I found after doing a web search for Léon Bottou's SGD tricks paper that I mentioned yesterday. Both papers are relevant to anyone who trains neural nets with supervised backpropagation in practice.

Leafing through Efficient Backprop, I noted a discussion of the usual issues of setting learning rates and convergence, but also, and more interestingly for a "beginner" in a way, advice on:

  • emphasizing schemes,
  • choosing the sigmoid function,
  • normalising inputs,
  • choosing target values,
  • initialising weights.
I am going to spend a few days working through Efficient Backprop, and applying these refinements to my MNIST recognizer, inasmuch as I can do so without too many mods to FANN.
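
To get a feel for four of these refinements before touching FANN, here is a rough sketch in plain Python. It is not FANN code, and the function names are my own, purely for illustration. The constants are the ones the paper recommends as far as I can tell: the scaled sigmoid f(x) = 1.7159 tanh(2x/3), targets of ±1 rather than the sigmoid's asymptotes, zero-mean unit-variance inputs, and initial weights with standard deviation fan_in^(-1/2). The emphasizing scheme is left out.

```python
import math
import random


def lecun_tanh(x):
    """Scaled tanh from Efficient Backprop: f(x) = 1.7159 * tanh(2x/3)."""
    return 1.7159 * math.tanh(2.0 * x / 3.0)


def normalise_columns(rows, eps=1e-8):
    """Shift each input variable to zero mean and scale it to unit variance.

    eps (my own addition) guards against constant columns, such as the
    always-black border pixels in MNIST.
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return [[(v - m) / (s + eps) for v, m, s in zip(row, means, stds)]
            for row in rows]


def encode_target(label, n_classes):
    """One target per class: +1 for the true class, -1 for the others."""
    return [1.0 if k == label else -1.0 for k in range(n_classes)]


def init_weights(fan_in, fan_out, rng):
    """Zero-mean Gaussian weights with standard deviation fan_in ** -0.5."""
    sigma = fan_in ** -0.5
    return [[rng.gauss(0.0, sigma) for _ in range(fan_in)]
            for _ in range(fan_out)]
```

The point of the odd-looking constants is that f(1) is roughly 1 and f(-1) is roughly -1, so ±1 targets sit inside the sigmoid's range rather than on its asymptotes, and normalised inputs combined with the fan-in scaled weights keep the units in that roughly linear region at the start of training.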

Essentially this paper represents much of the "accepted wisdom" of the backpropagation community. Both papers, and quite a few more, are part of the book Neural Networks Tricks of the Trade, 2nd Edition. This compendium seems to be a major reference, so I will be adding it to my resource page.

Here is the table of contents of the Neural Networks Tricks of the Trade book.

