
Tuesday, April 14, 2015

MNIST with FANN, It's 98.8% alive!

I spent a few hours every day this week rewriting Andrea Burratin's FANN wrapper code for MNIST, to try to establish a baseline for a shallow MLP network on this problem. The rewrite is now functional but not quite complete: research software is always wire and gum.

After three days with a text editor, I now have this dirty but functional suite of my own. Following Andrea's organisation, there are 3 programs:

  1. edconvert, which reads Yann LeCun's preprocessed MNIST data, extracts whatever fields I need, and writes out a file in FANN's training format.
  2. edtrain, which trains a shallow three-layer MLP with stochastic gradient descent.
  3. edbulktest, which computes recognition statistics on the 60K-sample training set and the 10K-sample validation set.
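For anyone curious what edconvert has to produce: FANN's plain-text training format is just a header line giving the number of samples, inputs and outputs, followed by one line of input values and one line of target values per sample. Here is a minimal Python sketch of that idea — the pixel scaling and the 1-of-10 target encoding are my own illustrative choices, not necessarily what Andrea's converter does:

```python
def write_fann_file(images, labels, path, num_classes=10):
    """Write samples in FANN's plain-text training format:
    a 'num_samples num_inputs num_outputs' header, then for each
    sample one line of inputs and one line of target outputs."""
    with open(path, "w") as f:
        f.write(f"{len(images)} {len(images[0])} {num_classes}\n")
        for pixels, label in zip(images, labels):
            # scale 0..255 pixel bytes down to 0..1 inputs
            f.write(" ".join(f"{p / 255:.4f}" for p in pixels) + "\n")
            # 1-of-10 target vector for the digit class
            f.write(" ".join("1" if i == label else "0"
                             for i in range(num_classes)) + "\n")

# toy demo with two fake 4-pixel "images" labelled 3 and 0
write_fann_file([[0, 128, 255, 64], [255, 0, 0, 0]], [3, 0], "toy.data")
```

For the real thing you would of course feed in the 784-pixel images from the MNIST IDX files instead of toy lists.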
Learning is very fast, does not seem to overfit, and very quickly produces recognition rates above 95%, which would have been considered decent back in the late 1990s ...

Training run for 100 epochs:
Epoch 1, current mse: 0.0106737
Epoch 2, current mse: 0.0071762
Epoch 3, current mse: 0.00638741
Epoch 4, current mse: 0.00570363
Epoch 5, current mse: 0.00530059
Epoch 6, current mse: 0.00501892
Epoch 7, current mse: 0.004785
...
Epoch 98, current mse: 0.00197636
Epoch 99, current mse: 0.00194285
Epoch 100, current mse: 0.00197295
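The per-sample weight updates behind a log like this are plain stochastic (incremental) gradient descent on a sigmoid MLP. Stripped down to a toy pure-Python 2-2-1 net learning OR — this is just the algorithm, not the FANN code or the MNIST setup:

```python
import math, random

random.seed(0)
# toy task: learn OR with a tiny 2-2-1 sigmoid net
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]

def sig(x):
    return 1 / (1 + math.exp(-x))

# weights[to][from]; the last slot of each row is the bias weight
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w2 = [[random.uniform(-1, 1) for _ in range(3)]]

def forward(x):
    h = [sig(sum(w * v for w, v in zip(row, x + [1]))) for row in w1]
    o = [sig(sum(w * v for w, v in zip(row, h + [1]))) for row in w2]
    return h, o

def epoch(lr=0.5):
    mse = 0.0
    for x, t in data:
        h, o = forward(x)
        # backprop deltas (sigmoid derivative is y * (1 - y))
        do = [(o[0] - t[0]) * o[0] * (1 - o[0])]
        dh = [h[j] * (1 - h[j]) * do[0] * w2[0][j] for j in range(2)]
        # incremental mode: weights move after every single sample
        for j in range(3):
            w2[0][j] -= lr * do[0] * (h + [1])[j]
        for j in range(2):
            for i in range(3):
                w1[j][i] -= lr * dh[j] * (x + [1])[i]
        mse += (o[0] - t[0]) ** 2
    return mse / len(data)

first = epoch()
for _ in range(2000):
    last = epoch()
print(first, last)  # the per-epoch MSE falls, as in the log above
```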

Stats on 60K sample training set yield 98.8% correct!
Edmunds-MBP:src edmundronald$ ./edbulktest --stopitem 60000 > shit
NCORRECT = 59309

Edmunds-MBP:src edmundronald$ python
>>> 59309./60000
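Completing that quick sanity check (a plain division using the count edbulktest printed, nothing FANN-specific):

```python
ncorrect, total = 59309, 60000  # counts reported by edbulktest
accuracy = ncorrect / total
print(f"{accuracy:.1%}")  # 98.8%
```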

Stats on 10K validation set are 95.4% correct. 
Edmunds-MBP:src edmundronald$ ./edbulktest --stopitem 10000 > shit

On balance, an interesting experiment which confirms that the ancient 3-layer MLP techniques work as well as advertised in the literature back in the 90s, and that the FANN library is solid.



  1. Do you have the code available, or is this just your word?

  2. No, I didn't actually write any code or run it; I routinely photoshop screenshots, and this blog exists to inflate my ego by fractions of a percentage point.



Hey, let me know what you think of my blog, and what material I should add!