Machine Learning Spring 2017
 
 
The Sackler Institute of Graduate Biomedical Sciences at NYU School of Medicine
 
Machine Learning Spring 2017 (BMSC-GA 4439 and BMIN-GA 1004)

Course Director:
David Fenyö (David@FenyoLab.org)

Learning objectives

The student will learn and understand the most commonly used machine learning methods.

Course Material

Required Reading:

  • An Introduction to Statistical Learning: with Applications in R. James G, Witten D, Hastie T, Tibshirani R. Springer 2013.

Recommended Reading:

  • Pattern Classification, 2nd Edition. Richard O. Duda, Peter E. Hart, David G. Stork. ISBN: 978-0-471-05669-0
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Hastie T, Tibshirani R, Friedman J. Springer 2011.
  • Pattern Recognition and Machine Learning (Information Science and Statistics). Christopher Bishop. ISBN-10: 0387310738

General Policies

Late/missed work: You must adhere to the due dates for all required submissions. If you miss a deadline, then you will not get credit for that assignment/post.

Incompletes: No "Incompletes" will be assigned for this course unless you have an emergency at the very end of the course.

Responding to Messages: I will check email daily during the week, and I will respond to course-related questions within 48 hours.

Announcements: I will make announcements throughout the semester by email.

Make sure that your email address is up to date; otherwise you may miss important emails from me.

Safeguards: Always back up your work in a safe place (an electronic file with a backup is recommended) and make a hard copy. Do not wait until the last minute to do your work. Allow enough time to meet deadlines.

Plagiarism: Plagiarism, the presentation of someone else's words or ideas as your own, is a serious offense and will not be tolerated in this class. The first time you plagiarize someone else's work, you will receive a zero for that assignment. The second time you plagiarize, you will fail the course with a notation of academic dishonesty on your official record.

Course Assessment

  • Weekly Problem Sets (50%)
  • Discussions (20%)
  • Final Project (30%)

Lectures

Lecture 1 Course Overview (January 27, 2017 Alexandria West 508 3pm)
Lecturer: David Fenyö

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapters 1-2
  • DREAM Challenges

Additional Reading
  • Coursera: Machine Learning


Lecture 2 Unsupervised Learning: Clustering (January 31, 2017 Alexandria West 629 2pm)
Lecturer: Wenke Liu

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 10
  • The Elements of Statistical Learning by Hastie et al. Chapter 14

Additional Reading
  • Cluster Analysis (5th ed.). Everitt BS, Landau S, Leese M, Stahl D. Wiley 2011.


Lecture 3 Unsupervised Learning: Dimension Reduction (February 3, 2017 Alexandria West 508 3pm)
Lecturer: Wenke Liu


Lecture 4 Unsupervised Learning: Clustering and Dimension Reduction Lab (February 7, 2017 Alexandria West 629 2pm)
Lecturer: Xuya Wang


Lecture 5 Unsupervised Learning: Trajectory Analysis (February 10, 2017 Alexandria West 508 3pm)
Lecturer: Isaac Galatzer-Levy


Lecture 6 Supervised Learning: Regression (February 14, 2017 Alexandria West 629 2pm)
Lecturer: David Fenyö

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 3

Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 3


Lecture 7 Supervised Learning: Regression Lab (February 17, 2017 Alexandria West 629 3pm)
Lecturer: Jennifer Teubl


Lecture 8 Supervised Learning: Classification (February 21, 2017 Alexandria West 629 2pm)
Lecturer: David Fenyö

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 4

Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 4


Lecture 9 Supervised Learning: Classification Lab (February 24, 2017 Alexandria West 508 3pm)
Lecturer: Jennifer Teubl


Lecture 10 Student Project Plan Presentation (February 28, 2017 Alexandria West 629 2pm)


Lecture 11 Supervised Learning: Performance Estimation and Regularization (March 7, 2017 Alexandria West 629 2pm)
Lecturer: David Fenyö

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapters 5 and 6

Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 7


Lecture 12 Supervised Learning: Performance Estimation and Regularization Lab (March 10, 2017 Alexandria West 508 3pm)
Lecturer: Hua Zhou


Lecture 13 Neural Networks (March 24, 2017 Alexandria West 508 3pm)
Lecturer: David Fenyö

Reading List
  • The Elements of Statistical Learning by Hastie et al. Chapter 11
  • Alipanahi, Babak, et al. "Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning." Nature Biotechnology 33.8 (2015): 831-838.
  • Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.
  • Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).

Additional Reading
  • Neural Networks and Deep Learning by Michael Nielsen


Lecture 14 Neural Networks Lab (March 28, 2017 Alexandria West 629 2pm)
Lecturer: Xuya Wang


Lecture 15 Tree-Based Methods (March 31, 2017 Alexandria West 508 3pm)
Lecturer: Kasthuri Kannan

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 8
  • Carter H, Chen S, Isik L, et al. Cancer-specific High-throughput Annotation of Somatic Mutations: computational prediction of driver missense mutations. Cancer Research. 2009.
  • Waks Z, Weissbrod O, Carmeli B, Norel R, Utro F, Goldschmidt Y. Driver gene classification reveals a substantial overrepresentation of tumor suppressors among very large chromatin-regulating proteins. Scientific Reports. 2016.


Lecture 16 Support Vector Machines (April 4, 2017 Alexandria West 629 2pm)
Lecturer: Kasthuri Kannan

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 9
  • Hyeran Byun and Seong-Whan Lee, Applications of Support Vector Machines for Pattern Recognition: A Survey, SVM 2002, LNCS 2388, pp. 213-236, 2002.
  • Mao Y, Chen H, Liang H, Meric-Bernstam F, Mills GB, Chen K. CanDrA: Cancer-Specific Driver Missense Mutation Annotation with Optimized Features. Adamovic T, ed. PLoS ONE. 2013.

Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 10


Lecture 17 Tree-Based Methods and Support Vector Machines Lab (April 11, 2017 Alexandria West 629 2pm)
Lecturer: Emily Kawaler


Lecture 18 Probabilistic Graphical Models (April 14, 2017 Alexandria West 508 3pm)
Lecturer: Narges Razavian

Reading List
  • Zhang L, Kim S. Learning gene networks under SNP perturbations using eQTL datasets. PLoS Comput Biol. 2014 Feb 27.
  • Dobra A, Hans C, Jones B, Nevins JR, Yao G, West M. Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis. 2004 Jul 1.


Lecture 19 Machine Learning Applied to Text Data (April 18, 2017 Alexandria West 629 2pm)
Lecturer: Yindalon Aphinyanaphongs


Lecture 20 Machine Learning Applied to Clinical Data (April 21, 2017 Alexandria West 508 3pm)
Lecturer: Yindalon Aphinyanaphongs


Lecture 21 Machine Learning Applied to Omics Data (April 25, 2017 Alexandria West 629 2pm)
Lecturer: Kelly Ruggles

Reading List
  • Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. 16 (2015) 321-32.
  • Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 46 (2014) 310-5.


Lecture 22 Student Project Presentation (May 5, 2017 Alexandria West 629 3pm)


Lecture 23 Student Project Presentation (May 9, 2017 Alexandria West 508 2pm)