====== A set of interesting references ======

  * John Duchi, Elad Hazan, and Yoram Singer. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Journal of Machine Learning Research, accepted pending minor revision. [ [[http://www.cs.berkeley.edu/%7Ejduchi/projects/DuchiHaSi10.pdf|pdf]] ]
  * John Duchi and Yoram Singer. Efficient Online and Batch Learning using Forward Backward Splitting. Journal of Machine Learning Research (JMLR 2009) and Neural Information Processing Systems (NIPS 2009). [ [[http://www.cs.berkeley.edu/~jduchi/projects/DuchiSi09_folos.html|pdf]] ]
  * Hu, Kwok, and Pan. Accelerated Gradient Methods for Stochastic Optimization and Online Learning. NIPS 2009. [ [[http://www.cse.ust.hk/~weikep/papers/NIPS-09-Stochastic.pdf|pdf]] ]
  * Nesterov. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems. [ [[http://www.optimization-online.org/DB_FILE/2010/01/2527.pdf|pdf]] ] [ [[http://www.optimization-online.org/DB_HTML/2010/01/2527.html|html]] ]
  * Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. In Proceedings of the 24th International Conference on Machine Learning (ICML '07, Corvallis, Oregon, June 20-24, 2007). ACM, New York, NY, 807-814. [ [[http://www.cs.huji.ac.il/~shais/papers/ShalevSiSr07.pdf|pdf]] ]