Reading List

Machine Learning

Bayesian decision theory

Tenenbaum, J. B.; Griffiths, T. L. & Kemp, C.
Theory-based Bayesian models of inductive learning and reasoning
Trends in Cognitive Sciences,
2006, 10, 309-318
Dietterich, T. G.
Machine Learning for Sequential Data: A Review
Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshops SSPR 2002 and SPR 2002, Windsor, Ontario, Canada, August 6-9, 2002: Proceedings
Friedman, N.; Getoor, L.; Koller, D. & Pfeffer, A.
Learning probabilistic relational models
Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence,
1999, 1300-1309
Goldenberg, A. & Moore, A.
Bayes Net Graphs to Understand Coauthorship Networks
KDD Workshop on Link Discovery: Issues, Approaches and Applications

Kernel methods

Grauman, K., and T. Darrell.
Pyramid match kernel: Discriminative classification with sets of image features.
MIT Computer Science and Artificial Intelligence Laboratory Technical Report, MIT-CSAIL-TR-2005-017
Leslie, C., E. Eskin, and W. S. Noble. 
The spectrum kernel: A string kernel for SVM protein classification.
In Proceedings of the 2002 Pacific Symposium on Biocomputing
Lodhi, H., C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. 
Text classification using string kernels. 
The Journal of Machine Learning Research 2:419-444.

Regularization and model complexity

Lawrence, Steve, C. Lee Giles, and A.C. Tsoi. 
What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation. 
UM Computer Science Department. University of Maryland, UMIACS-TR-96-22
Mehta, M., J. Rissanen, and R. Agrawal. 
MDL-based decision tree pruning.
In Proceedings of KDD95
Roberts, S., and H. Pashler. 
How persuasive is a good fit? A comment on theory testing.
Psychological Review 107, no. 2:358-367

Performance evaluation

Domingos, P. 
MetaCost: a general method for making classifiers cost-sensitive. 
In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 155-164
Hand, David J., and Robert J. Till. 
A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. 
Machine Learning 45, no. 2:171-186.
Japkowicz, N. 
The class imbalance problem: A systematic study. 
Intelligent Data Analysis, 6(5), pp. 429-449.

Combining multiple classifiers

Dietterich, T. G. 
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization.
Machine Learning 40, no. 2:139-157.
Mason, L., J. Baxter, P. Bartlett, and M. Frean. 
Boosting algorithms as gradient descent. 
In Advances in Neural Information Processing Systems, 12:512-518. 
Oza, N.  
Online bagging and boosting. 
In 2005 IEEE International Conference on Systems, Man and Cybernetics  

Clustering and density estimation