Essential Deep Learning Topics for Interviews

This post lists the topics we feel one should be comfortable with before appearing for a deep learning interview. The list is by no means exhaustive, as the field is vast and ever-growing.

Mathematics

  1. Linear Algebra (notes)
  • Linear Dependence and Span
  • Eigendecomposition
    • Eigenvalues and Eigenvectors
  • Singular Value Decomposition
  2. Probability and Statistics
  • Expectation, Variance and Covariance
  • Distributions
  • Bias and Variance
    • Bias Variance Trade-off
  • Estimators
    • Biased and Unbiased
  • Maximum Likelihood Estimation
  • Maximum A Posteriori (MAP) Estimation
  3. Information Theory
  • (Shannon) Entropy
  • Cross Entropy
  • KL Divergence (see the identities after this list)
    • Not a distance metric (it is asymmetric)
    • Derivation from likelihood ratio (Blog)
    • Always non-negative, and zero only when the two distributions are equal
      • Proof by Jensen’s Inequality
    • Relation with Entropy (Explanation)
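
For the KL identities referenced above, a quick sketch of the standard definitions for discrete distributions P and Q:

```latex
D_{\mathrm{KL}}(P \,\|\, Q)
  = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
  = \underbrace{-\sum_{x} P(x) \log Q(x)}_{H(P,\,Q)\ \text{(cross entropy)}}
    \;-\; \underbrace{\Big(-\sum_{x} P(x) \log P(x)\Big)}_{H(P)\ \text{(entropy)}}
  \;\ge\; 0
```

Non-negativity follows from Jensen's inequality applied to the convex function -log, and the asymmetry (D_KL(P||Q) != D_KL(Q||P) in general) is why KL divergence is not a distance metric.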

Basics

  1. Backpropagation (see the from-scratch sketch after this list)
  • Vanilla (blog)
  • Backprop in CNNs
    • Gradients in Convolution and Deconvolution Layers
  • Backprop through time
  2. Loss Functions
  • MSE Loss
    • Derivation by MLE and MAP
  • Cross Entropy Loss
    • Binary Cross Entropy
    • Categorical Cross Entropy
  3. Activation Functions (Sigmoid, Tanh, ReLU and variants) (blog)
  4. Optimizers
  5. Regularization
  • Early Stopping
  • Noise Injection
  • Dataset Augmentation
  • Ensembling
  • Parameter Norm Penalties
    • L1 (sparsity)
    • L2 (smaller parameter values)
  • BatchNorm (Paper)
    • Internal Covariate Shift
    • BatchNorm in CNNs (Link)
    • Backprop through BatchNorm Layer (Explanation)
  • Dropout (Paper) (Notes)
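
As a warm-up for the backpropagation items above, here is a minimal from-scratch sketch (NumPy only; sizes and names are illustrative): one hidden layer with a sigmoid activation, trained on MSE loss, with the chain rule written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))          # 8 samples, 3 features
y = rng.normal(size=(8, 1))          # regression targets

W1 = 0.1 * rng.normal(size=(3, 4))   # input -> hidden
W2 = 0.1 * rng.normal(size=(4, 1))   # hidden -> output
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(200):
    # Forward pass
    h = sigmoid(X @ W1)                   # hidden activations, (8, 4)
    y_hat = h @ W2                        # predictions, (8, 1)
    loss = np.mean((y_hat - y) ** 2)      # MSE

    # Backward pass (chain rule by hand)
    d_yhat = 2.0 * (y_hat - y) / len(X)   # dL/dy_hat
    dW2 = h.T @ d_yhat                    # dL/dW2
    d_h = d_yhat @ W2.T                   # dL/dh
    d_z = d_h * h * (1.0 - h)             # sigmoid'(z) = h * (1 - h)
    dW1 = X.T @ d_z                       # dL/dW1

    # Gradient descent step
    W1 -= lr * dW1
    W2 -= lr * dW2

print(f"final MSE: {loss:.4f}")
```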

Computer Vision

  1. ILSVRC (the ImageNet Large Scale Visual Recognition Challenge)
  • AlexNet
  • ZFNet
  • VGGNet (Notes)
  • InceptionNet (Notes)
  • ResNet (Notes)
  • DenseNet
  • SENet
  2. Object Detection (Blog)
  • RCNN (Notes)
  • Fast RCNN
  • Faster RCNN (Notes)
  • Mask RCNN
  • YOLOv3 (real-time object detection)
  3. Convolution
  • Cross-correlation
  • Pooling (Average, Max Pool)
  • Strides and Padding
  • Output volume dimension calculation (see the formula after this list)
  • Deconvolution (transposed convolution), Upsampling, Reverse Pooling (Visualization)
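
For the output volume calculation mentioned above, each spatial dimension follows out = floor((n + 2*pad - kernel) / stride) + 1. A tiny sketch:

```python
def conv_output_size(n, kernel, stride=1, pad=0):
    """Output size along one spatial dimension of a conv/pool layer."""
    return (n + 2 * pad - kernel) // stride + 1

# The commonly quoted AlexNet conv1 numbers: a 227x227 input with an
# 11x11 kernel, stride 4 and no padding gives a 55x55 output.
assert conv_output_size(227, kernel=11, stride=4, pad=0) == 55
```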

Natural Language Processing

  1. Recurrent Neural Networks
  • Architectures (limitations and inspiration behind each model) (Blog 1) (Blog 2)
    • Vanilla
    • GRU
    • LSTM
    • Bidirectional
  • Vanishing and Exploding Gradients
  2. Word Embeddings
  • Word2Vec
  • CBOW
  • GloVe
  • FastText
  • Skip-gram, N-gram
  • ELMo
  • OpenAI GPT
  • BERT (Blog)
  3. Transformers (Paper) (Code) (Blog) (see the attention sketch after this list)
  • BERT (Paper)
  • Universal Sentence Encoder
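
Since Transformers come up constantly, here is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, from the "Attention Is All You Need" paper (shapes and names are illustrative; no masking or multiple heads):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (seq_q, seq_k)
    return softmax(scores) @ V                      # (seq_q, d_v)

# Toy example: 5 query positions, 7 key/value positions, d_k = d_v = 16.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=s) for s in [(5, 16), (7, 16), (7, 16)])
print(attention(Q, K, V).shape)  # (5, 16)
```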

Generative Models

  1. Generative Adversarial Networks (GANs)
  • Basic Idea
  • Variants
    • Vanilla GAN (Paper)
    • DCGAN
    • Wasserstein GAN (Paper)
    • Conditional GAN (Paper)
  • Mode Collapse
  • GAN Hacks (Link)
  2. Variational Autoencoders (VAEs)
  • Variational Inference (tutorial paper)
  • ELBO and Loss Function derivation (see the identity after this list)
  3. Normalizing Flows
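
For the ELBO item above, the one-line decomposition worth memorizing (a sketch: q(z|x) is the approximate posterior/encoder, p(x|z) the decoder, p(z) the prior):

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q(z|x)}\big[\log p(x|z)\big]
      - D_{\mathrm{KL}}\big(q(z|x) \,\|\, p(z)\big)}_{\text{ELBO}}
  + \underbrace{D_{\mathrm{KL}}\big(q(z|x) \,\|\, p(z|x)\big)}_{\ge\, 0}
```

Because the last KL term is non-negative, the ELBO lower-bounds log p(x); the VAE training loss is the negative ELBO, i.e. a reconstruction term plus a KL regularizer against the prior.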

Misc

  1. Triplet Loss
  2. BLEU Score
  3. Maxout Networks
  4. Support Vector Machines
  • Maximal-Margin Classifier
  • Kernel Trick
  5. PCA (Explanation) (see the sketch after this list)
  • PCA using neural network
    • Architecture
    • Loss Function
  6. Spatial Transformer Networks
  7. Gaussian Mixture Models (GMMs)
  8. Expectation Maximization (EM)
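
For the PCA item above, a minimal NumPy sketch (illustrative, not optimized): center the data, eigendecompose the sample covariance, and project onto the top-k eigenvectors.

```python
import numpy as np

def pca(X, k):
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]           # top-k principal directions
    return Xc @ top                         # projected data, shape (n, k)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
print(pca(X, k=2).shape)  # (100, 2)
```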

More Resources

  1. Stanford’s CS231n Lecture Notes
  2. Deep Learning Book (Goodfellow et al.)

Contributing

We welcome contributions that add resources such as notes, blog posts, or papers for a topic. Feel free to open a pull request!
