Mathematical Foundations of Machine Learning

Computer Science 25300 / 35300 & Statistics 27700

This course is an introduction to key mathematical concepts at the heart of machine learning. It focuses on matrix methods and statistical models, with real-world applications ranging from classification and clustering to denoising and recommender systems. Mathematical topics include linear equations, matrix rank, subspaces, regression, regularization, the singular value decomposition, and iterative optimization algorithms. Machine learning topics include least squares classification and regression, ridge regression, principal components analysis, principal components regression, kernel methods, matrix completion, support vector machines, clustering, stochastic gradient descent, neural networks, and deep learning. Knowledge of linear algebra and statistics is not assumed.

Appropriate for graduate students or advanced undergraduates. This course can serve as a precursor to TTIC 31020, “Introduction to Machine Learning,” or CMSC 35400.

Prerequisites:

Students are expected to have taken a course in calculus and have exposure to numerical computing (e.g., Matlab, Python, Julia, or R).

Textbooks:

Other resources:

Fall 2025

All course videos are being posted to a YouTube channel. The videos linked below are Panopto videos with (potentially faulty) captions.

Lectures from past quarters:

Written lecture notes from Fall 2023

Videos of past lectures (from 2020 and 2021; imperfectly aligned with the most recent class notes)

Topics:

Intro and Linear Models

  • What is ML, and how is it related to other disciplines?
  • Learning goals and course objectives.
  • Vectors and matrices in machine learning models
  • Features and models
  • Least squares, linear independence and orthogonality
  • Linear classifiers
  • Loss, risk, generalization
  • Applications: bioinformatics, face recognition
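As a taste of the least-squares material above, here is a minimal sketch, using hypothetical NumPy data, of fitting a linear model by solving the least-squares problem:

```python
import numpy as np

# Hypothetical noiseless data: 100 samples, 3 features (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

# Solve min_w ||Xw - y||^2; with linearly independent columns and no noise,
# least squares recovers the true weights exactly.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In the course, this solution is connected to linear independence and orthogonality: the fitted residual `X @ w_hat - y` is orthogonal to the column space of `X`.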

Singular Value Decomposition (Principal Component Analysis)

  • Dimensionality reduction
  • Applications: recommender systems, PageRank
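The dimensionality-reduction idea can be sketched with a truncated SVD; the data matrix below is hypothetical, constructed to have exact rank-2 structure:

```python
import numpy as np

# Hypothetical 50 x 10 data matrix with exact rank-2 structure (for illustration).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 10))

# Keeping the top k singular triples gives the best rank-k approximation
# of A in the least-squares sense (Eckart-Young).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k]
```

Because `A` was built with rank 2, the rank-2 truncation reproduces it; on real data the truncation discards the smallest singular values, which is the basis of PCA.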

Overfitting and Regularization

  • Ridge regression
  • Model selection, cross-validation
  • Applications: image deblurring
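The ridge regression topic has a one-line closed form; a minimal sketch on hypothetical data, showing how the regularization strength shrinks the weights:

```python
import numpy as np

# Hypothetical data (assumed for illustration).
rng = np.random.default_rng(2)
X = rng.standard_normal((30, 5))
y = rng.standard_normal(30)

def ridge(X, y, lam):
    # Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_small = ridge(X, y, lam=0.1)
w_large = ridge(X, y, lam=100.0)
# Heavier regularization pulls the weights toward zero.
```

In practice the value of `lam` is chosen by cross-validation, which is the model-selection topic in this unit.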

Beyond Least Squares: Alternate Loss Functions

  • Hinge loss
  • Logistic regression
  • Feature functions and nonlinear regression and classification
  • Kernel methods and support vector machines
  • Application: Handwritten digit classification
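The two alternate loss functions in this unit can be written in a few lines each; a sketch on a tiny hypothetical separable dataset:

```python
import numpy as np

def hinge_loss(w, X, y):
    # Labels y in {-1, +1}; mean of max(0, 1 - y * x^T w).
    return np.mean(np.maximum(0.0, 1.0 - y * (X @ w)))

def logistic_loss(w, X, y):
    # Mean of log(1 + exp(-y * x^T w)), computed stably via logaddexp.
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))

# Hypothetical 1-D linearly separable toy data.
X = np.array([[2.0], [3.0], [-2.0], [-3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0])
```

With this `w`, every point has margin at least 1, so the hinge loss is exactly zero, while the logistic loss is positive but small; this difference in behavior past the margin is one contrast drawn between SVMs and logistic regression.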

Iterative Methods

  • Stochastic Gradient Descent (SGD)
  • Neural networks and backpropagation
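Stochastic gradient descent can be sketched in a few lines on the squared loss; the data below are hypothetical, and the step size and epoch count are assumed values:

```python
import numpy as np

# Hypothetical regression data with small noise (assumed for illustration).
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 4))
w_true = np.array([2.0, -1.0, 0.0, 3.0])
y = X @ w_true + 0.01 * rng.standard_normal(200)

# SGD: one randomly ordered pass over the data per epoch, updating with
# the gradient of a single example at a time.
w = np.zeros(4)
step = 0.01
for epoch in range(50):
    for i in rng.permutation(len(y)):
        g = (X[i] @ w - y[i]) * X[i]  # gradient of (x_i^T w - y_i)^2 / 2
        w -= step * g
```

The same per-example update, applied through the chain rule layer by layer, is backpropagation for neural networks.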

Statistical Models

  • Density estimation and maximum likelihood estimation
  • Gaussian mixture models and expectation-maximization (EM)
  • Unsupervised learning and clustering
  • Application: text classification
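The maximum likelihood idea in this unit has a simple closed form in the Gaussian case; a sketch on hypothetical data drawn with assumed parameters (mean 5, standard deviation 2):

```python
import numpy as np

# Hypothetical 1-D Gaussian sample (assumed parameters for illustration).
rng = np.random.default_rng(4)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

# For a Gaussian, the maximum likelihood estimates are the sample mean
# and the (biased, 1/n) sample variance.
mu_hat = data.mean()
sigma2_hat = ((data - mu_hat) ** 2).mean()
```

When the data instead come from a mixture of Gaussians, these closed forms no longer apply directly, which motivates the expectation-maximization algorithm covered in this unit.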