Course ML Software Engineering: Interview Concept Review Chapter 11 Difficulty intermediate Estimated Time 900 min

Chapter 11: Unsupervised Learning — Interview Deep Review

Learning Objectives

By the end of this chapter, you will be able to:

  • Relate unsupervised learning concepts to common ML software engineering interview questions and trade-offs.
  • Explain when this topic deserves a deeper pass through another tutorial on this site versus staying at recap depth.
  • Surface assumptions, pitfalls, and follow-up probes an interviewer is likely to use.

PCA as optimal linear compression (under variance lens)

Given a centered data matrix X, PCA finds orthogonal directions that maximize variance. The leading eigenvectors of the covariance matrix Σ are the principal axes, and each eigenvalue is the variance explained along its axis.

Eigenpair story: eigenvectors point along the directions in which the data is stretched, and eigenvalues say by how much. Saying this aloud is usually more useful than reciting the SVD triple product, unless the round goes deeper into linear algebra.

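Relation to SVD: for centered X with n rows, the SVD X = U S Vᵀ gives a numerically stable route to PCA because it never forms XᵀX explicitly. The columns of V are the principal axes, and with the 1/(n − 1) covariance convention each eigenvalue is λ_i = s_i² / (n − 1), so singular values are proportional to the square roots of the eigenvalues.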
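To make the eigenvalue/singular-value link concrete, here is a minimal NumPy sketch that computes PCA both ways on synthetic data and compares the results. The function names `pca_eig` and `pca_svd`, the mixing matrix, and the random data are illustrative assumptions, not part of the course material.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)      # sample covariance (1/(n-1) convention)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    return eigvals[order], eigvecs[:, order]

def pca_svd(X):
    """PCA via SVD of the centered data; never forms X^T X."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_var = s ** 2 / (Xc.shape[0] - 1)   # lambda_i = s_i^2 / (n - 1)
    return explained_var, Vt.T                   # columns are principal axes

# Synthetic data (assumption): isotropic noise pushed through a mixing matrix.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

eigvals, eigvecs = pca_eig(X)
svd_vals, svd_vecs = pca_svd(X)

print("explained variance (eig):", eigvals)
print("explained variance (svd):", svd_vals)
print("axes orthonormal:", np.allclose(eigvecs.T @ eigvecs, np.eye(3)))
```

The two routes agree on the explained variances, and the axes match up to sign, since an eigenvector and its negation describe the same direction.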

Nonlinear embeddings (interview caution)

t-SNE preserves local neighborhoods and is meant for visualization. Distances between clusters are not globally meaningful, and the perplexity hyperparameter (roughly the effective number of neighbors per point) changes how clusters appear.

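Isomap approximates geodesic distances with shortest paths on a kNN graph, then embeds those distances with classical MDS. It works better when the data lie on a manifold, but it costs more because of the all-pairs shortest-path and eigendecomposition steps.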
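The geodesic step can be illustrated with a small NumPy sketch: build a symmetric kNN graph, then run Floyd-Warshall for all-pairs shortest paths. This covers only the distance approximation, not the embedding step, and the helper name `knn_geodesics` and the half-circle data are assumptions chosen for illustration.

```python
import numpy as np

def knn_geodesics(X, k):
    """Approximate geodesic distances by shortest paths on a kNN graph."""
    n = len(X)
    euclid = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Start with no edges (inf), zero distance to self.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)

    # Connect each point to its k nearest neighbors (column 0 is the point itself).
    neighbors = np.argsort(euclid, axis=1)[:, 1:k + 1]
    for i in range(n):
        graph[i, neighbors[i]] = euclid[i, neighbors[i]]
    graph = np.minimum(graph, graph.T)   # make the graph undirected

    # Floyd-Warshall: allow paths through each intermediate node m in turn.
    for m in range(n):
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    return euclid, graph

# Points on a unit half-circle: a 1-D manifold embedded in 2-D.
theta = np.linspace(0.0, np.pi, 50)
X = np.column_stack([np.cos(theta), np.sin(theta)])

euclid, geo = knn_geodesics(X, k=2)
print("straight-line distance between endpoints:", euclid[0, -1])
print("graph geodesic between endpoints:", geo[0, -1])
```

The straight-line distance between the endpoints is the diameter, 2, while the graph distance follows the arc and lands close to π. The O(n³) shortest-path loop is also a reminder of why Isomap is the costlier option.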

K-means vs Gaussian mixture models

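k-means hard-assigns each point to its nearest centroid and minimizes the within-cluster sum of squared distances, which implicitly assumes roughly spherical clusters of similar variance. Treat the elbow and silhouette heuristics for choosing k with caution.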
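A minimal sketch of Lloyd's algorithm, the usual k-means iteration, shows the hard assignment and the objective being minimized. The function names, the random-point initialization, and the two synthetic blobs are assumptions for illustration; production code would normally use a library implementation with k-means++ seeding.

```python
import numpy as np

def assign(X, centroids):
    """Hard assignment: index of the nearest centroid for each point."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids at k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Update step: each centroid moves to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    labels = assign(X, centroids)
    inertia = ((X - centroids[labels]) ** 2).sum()   # within-cluster sum of squares
    return centroids, labels, inertia

# Two well-separated spherical blobs: the setting k-means is designed for.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(100, 2)),
])
centroids, labels, inertia = kmeans(X, k=2)
print("centroids:\n", centroids)
print("inertia:", inertia)
```

Because the objective is a sum of squared Euclidean distances, elongated or unequal-variance clusters can be split along the wrong boundary even when the algorithm converges.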

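A GMM fitted with EM soft-assigns points through posterior responsibilities and handles overlapping elliptical clusters well. EM alternates an E-step (compute posterior responsibilities) with an M-step (update means, covariances, and mixing weights). Be ready to explain that EM only reaches a local optimum, so the result depends on initialization, which is commonly taken from k-means.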
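The E-step/M-step alternation can be sketched for a one-dimensional mixture, where covariances reduce to scalar variances. This is a simplified illustration under stated assumptions: the function name `gmm_em_1d` is hypothetical, the data are synthetic, and quantile-based starting means stand in for the k-means initialization mentioned above.

```python
import numpy as np

def gmm_em_1d(x, k=2, n_iter=100):
    """EM for a 1-D Gaussian mixture; returns parameters and responsibilities."""
    n = len(x)
    # Assumption: spread the initial means over quantiles instead of running k-means.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibilities are the posterior p(component | point).
        dens = (weights
                * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
                / np.sqrt(2.0 * np.pi * variances))           # shape (n, k)
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        log_liks.append(np.log(total).sum())                  # log-likelihood before update

        # M-step: responsibility-weighted updates of weights, means, variances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6  # small floor
    return weights, means, variances, resp, log_liks

# Synthetic data (assumption): 60% from N(0, 1), 40% from N(6, 1).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.0, 200)])

weights, means, variances, resp, log_liks = gmm_em_1d(x)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

Each row of `resp` sums to 1, which is the soft counterpart of a k-means label, and the recorded log-likelihood does not decrease across iterations (up to the small variance floor), which is the property worth stating in an interview.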

Factorization machines (elevator)

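An FM models pairwise feature interactions through low-rank embeddings: the weight on the pair x_i x_j is the inner product of two latent vectors, ⟨v_i, v_j⟩. This suits CTR prediction with sparse categorical fields and avoids the parameter blow-up of explicit cross-product features.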
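As a sketch of the standard second-order FM scoring function, the snippet below computes the pairwise term with the well-known linear-time identity and checks it against the naive double loop. The function names and the random parameters are illustrative assumptions; no training is shown.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM score using the O(k * n) reformulation of the pairwise term."""
    linear = w0 + w @ x
    xv = V.T @ x                      # per-factor sum of v_if * x_i, shape (k,)
    x2v2 = (V ** 2).T @ (x ** 2)      # per-factor sum of v_if^2 * x_i^2, shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score with the explicit O(k * n^2) sum over feature pairs i < j."""
    n = len(x)
    out = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            out += (V[i] @ V[j]) * x[i] * x[j]
    return out

# Random parameters (assumption): n features, k latent factors per feature.
rng = np.random.default_rng(0)
n, k = 8, 3
x = rng.normal(size=n)
w0 = 0.1
w = rng.normal(size=n)
V = rng.normal(size=(n, k))

fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
print("fast:", fast, "naive:", slow)
```

The model stores n × k latent parameters instead of one weight per feature pair, which is why it stays tractable when one-hot encoded categorical fields make n very large.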

Interview prompts

  • Covariance matrix interpretation?
  • Orthogonal principal components rationale?
  • Choosing k pragmatically?
  • k-means vs GMM failure modes?

Go deeper on this site

Comprehensive Clustering Analysis · Linear geometry intuition → Matrix–Vector Multiplication

1. PCA components orthogonal because: