Computer Science 25300 / 35300 & Statistics 27700
This course is an introduction to key mathematical concepts at the heart of machine learning. It focuses on matrix methods and statistical models, featuring real-world applications ranging from classification and clustering to denoising and recommender systems. Mathematical topics covered include linear equations, matrix rank, subspaces, regression, regularization, the singular value decomposition, and iterative optimization algorithms. Machine learning topics include least squares classification and regression, ridge regression, principal components analysis, principal components regression, kernel methods, matrix completion, support vector machines, clustering, stochastic gradient descent, neural networks, and deep learning. Knowledge of linear algebra and statistics is not assumed.
Appropriate for graduate students or advanced undergraduates. This course could be used as a precursor to TTIC 31020, “Introduction to Machine Learning,” or CMSC 35400.
Prerequisites:
Students are expected to have taken a course in calculus and have exposure to numerical computing (e.g., Matlab, Python, Julia, or R).
Textbooks:
- Mathematical Methods in Data Science by Sebastien Roch
- Optional additional reading:
  - Matrix Methods in Data Mining and Pattern Recognition by Lars Eldén
  - The Elements of Statistical Learning (12th printing, Jan. 2017) by Hastie, Tibshirani, and Friedman
  - Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares by Stephen Boyd and Lieven Vandenberghe
  - Pattern Recognition and Machine Learning by Christopher Bishop
The textbooks will be supplemented with additional notes and readings.
Other resources:
- 3Blue1Brown — great videos and animations of key concepts in linear algebra and machine learning
- Sebastian Raschka's Introduction to Machine Learning
Fall 2025
All course videos are being posted to a YouTube channel. The videos linked below are Panopto videos with (potentially faulty) captions.
- Lecture 1, Introduction notes, video part I, video part II
- Lecture 2, Vectors and matrices notes, video
- Lecture 3, Least squares notes, video
- Lecture 4, Least squares and optimization notes, video
- Lecture 5, Subspaces and bases notes, video
- Lecture 6, Finding orthogonal bases notes, video
- Lecture 7, Introduction to the Singular Value Decomposition (SVD) notes, video
- Lecture 8, The SVD notes, video
- Lecture 9, Principal Components Analysis notes, video
- Lecture 10, Data leakage and matrix completion notes, video
- Lecture 11, PageRank and ridge regression notes, video
- Lecture 12, Pseudoinverse and kernel ridge regression notes, video
- Lecture 13, Support Vector Machines notes, video
- Lecture 14, Gradient descent and stochastic gradient descent notes, video
- Lecture 15, Backpropagation notes, video
- Lecture 16, Clustering and k-means notes, video
- Lecture 17, Gaussian mixture models and the EM algorithm notes, video
Lectures from past quarters:
Written lecture notes from Fall 2023
- Lecture 1: Introduction
- Lecture 2: Vectors and Matrices
- Lecture 3: Least Squares and Geometry
- Lecture 4: Least Squares and Optimization
- Lecture 5: Subspaces and Bases
- Lecture 6: Orthogonal Bases
- Lecture 7: Introduction to the Singular Value Decomposition
- Lecture 8: The Singular Value Decomposition
- Lecture 9: SVD in Machine Learning
- Lecture 10: SVD in Least Squares
- Lecture 11: Kernel Methods
- Lecture 12: Support Vector Machines
- Lecture 13: Stochastic Gradient Descent
- Lectures 14–15: Backpropagation
- Lecture 16: Clustering and K-means
- Lecture 17: The Expectation-Maximization Algorithm
Videos of past lectures (from 2019–2021, imperfectly aligned with the most recent class notes)
- Lecture 1: Introduction video
- Lecture 2: Vectors and Matrices video 2019, video 2021
- Lecture 3: Least Squares and Geometry video 2019, video 2021
- Lecture 4: Least Squares and Optimization video 2019, video 2021
- Lecture 5: Subspaces and Bases video 2021
- Lecture 6: Subspaces, Bases, and Projections video 2019, video 2021
- Lecture 7: Finding Orthogonal Bases video 2019, video 2021
- Lecture 8: Introduction to the Singular Value Decomposition video, video 2021
- Lecture 9: The Singular Value Decomposition video, video 2021
- Lecture 10: SVD, PCA, and Dimensionality Reduction video, video 2021
- Lecture 11: PCR & Ridge Regression video, video 2021
- Lecture 12: Bias in ML and Matrix Completion (with notes on PageRank) video, video 2021, video on matrix completion 2021
- Lecture 13: Kernel Ridge Regression video, video 2021
- Lecture 14: Support Vector Machines video, video 2021
- Lecture 15: Stochastic Gradient Descent video, video 2021
- Lecture 16: Deeper Neural Networks video 2021
- Lecture 17: Backpropagation video, video 2021
- Lecture 18: Clustering and K-means video, video 2021
Topics:
Intro and Linear Models
- What is ML, and how is it related to other disciplines?
- Learning goals and course objectives
- Vectors and matrices in machine learning models
- Features and models
- Least squares, linear independence, and orthogonality (see the sketch after this list)
- Linear classifiers
- Loss, risk, generalization
- Applications: bioinformatics, face recognition
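To make the least squares item above concrete, here is a minimal NumPy sketch, not part of the course materials; the synthetic data and variable names are illustrative assumptions. It fits a linear model by minimizing ||Xw - y||^2:

```python
import numpy as np

# Synthetic regression data: n samples, d features (illustrative assumption).
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.standard_normal((n, d))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Least squares: find w minimizing ||Xw - y||^2.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)  # close to w_true
```

Note that np.linalg.lstsq solves the problem via the SVD, which avoids explicitly forming the normal equations X^T X w = X^T y and is more numerically stable.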
Singular Value Decomposition and Principal Components Analysis
- Dimensionality reduction (see the sketch after this list)
- Applications: recommender systems, PageRank
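A minimal sketch of PCA via the SVD, using synthetic inputs as an illustrative assumption rather than course code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))  # 200 samples, 5 features (assumption)

# Center the data, then take the SVD; the right singular vectors are
# the principal component directions, ordered by singular value.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                                         # components to keep
scores = Xc @ Vt[:k].T                        # project onto top-k directions
X_approx = scores @ Vt[:k] + X.mean(axis=0)   # best rank-k reconstruction
```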
Overfitting and Regularization
- Ridge regression (see the sketch after this list)
- Model selection, cross-validation
- Applications: image deblurring
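A minimal sketch of ridge regression using its closed-form solution; the `ridge` helper and the synthetic data are illustrative assumptions:

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
    Closed form: w = (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Illustrative use with synthetic data (assumption).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X[:, 0] + 0.1 * rng.standard_normal(50)
w = ridge(X, y, lam=1.0)
```

The regularization weight lam trades off data fit against the size of w, which is exactly what cross-validation is used to select.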
Beyond Least Squares: Alternate Loss Functions
- Hinge loss
- Logistic regression
- Feature functions and nonlinear regression and classification
- Kernel methods and support vector machines (see the kernel ridge sketch after this list)
- Application: Handwritten digit classification
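As a taste of kernel methods, here is a sketch of kernel ridge regression with a Gaussian (RBF) kernel; the helper names, parameters, and synthetic data are illustrative assumptions, not course code:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam, gamma=1.0):
    """Solve (K + lam I) alpha = y; predictions are K(x_test, X) @ alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

# Illustrative: fit a nonlinear function from noisy samples (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (80, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(80)
alpha = kernel_ridge_fit(X, y, lam=0.1)
X_test = np.linspace(-3, 3, 5)[:, None]
y_pred = rbf_kernel(X_test, X, 1.0) @ alpha
```

The kernel lets a linear method fit nonlinear functions: the model is linear in the (implicit) feature space, but nonlinear in the inputs.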
Iterative Methods
- Stochastic Gradient Descent (SGD) (see the sketch after this list)
- Neural networks and backpropagation
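A minimal sketch of stochastic gradient descent on the least squares loss; the function name, step size, and epoch count are illustrative assumptions:

```python
import numpy as np

def sgd_least_squares(X, y, step=0.01, epochs=50, seed=0):
    """SGD on f(w) = (1/n) * sum_i (x_i^T w - y_i)^2.
    Per-sample gradient of (x_i^T w - y_i)^2 is 2 * (x_i^T w - y_i) * x_i."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):          # one pass over shuffled samples
            grad = 2.0 * (X[i] @ w - y[i]) * X[i]
            w -= step * grad
    return w

# Illustrative data (assumption): SGD approaches the least squares solution.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
w_hat = sgd_least_squares(X, y)
```

Each update uses the gradient at a single sample rather than the full dataset, which is what makes SGD practical for training neural networks at scale.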
Statistical Models
- Density estimation and maximum likelihood estimation
- Gaussian mixture models and Expectation Maximization
- Unsupervised learning and clustering (see the k-means sketch after this list)
- Application: text classification
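Finally, a minimal k-means sketch illustrating the clustering topic above; the `kmeans` helper and synthetic data are illustrative assumptions, not course code:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain k-means: alternate assigning points to the nearest center
    and recomputing each center as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:              # keep old center if cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers): # converged
            break
        centers = new_centers
    return centers, labels

# Illustrative: three well-separated synthetic clusters (assumption).
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((50, 2)) + mu
               for mu in ([0, 0], [5, 5], [0, 5])])
centers, labels = kmeans(X, k=3)
```

K-means can be viewed as a hard-assignment special case of the EM algorithm for Gaussian mixture models covered in the same unit.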