Roles#
Data Analysis
Model Development
- AI Researcher
- AI Specialist
- AI Engineer
Service (Production)
AI Product Development Cycle#
- Clearly define the problem and goals
  - Target audience, key stakeholders, data, and resources
- Collect and analyze data
  - Acquire high-quality and relevant data.
  - Perform exploratory data analysis (EDA) to understand the distribution, patterns, and anomalies in the data.
  - Determine the data split between training, validation, and testing (a minimal split sketch follows this list).
- Prepare the dataset
  - Clean, preprocess, and transform the data; handle missing or incomplete values and class imbalance.
  - Ensure that the data is properly formatted and normalized.
- Choose and train an appropriate model
  - Consider the trade-off between model complexity and interpretability.
  - Ensure that the model is scalable and can handle large datasets.
  - Choose the loss function and optimization algorithm.
- Monitor and adjust the model
  - Evaluate the model’s performance with appropriate metrics.
  - Refine and optimize the model.
  - Compare the model’s performance on the test data with its performance on the validation data.
- Deploy the model
  - Make sure that it is integrated with other systems and processes as needed.
- Monitor and maintain the model’s performance
  - Monitor the model’s performance in real-world conditions, and adjust as necessary.
  - Continuously evaluate the model’s performance and make improvements as needed.
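
A minimal sketch of the data-split step above, assuming a generic tabular dataset and scikit-learn; the arrays and the 80/10/10 ratio are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative data; in practice X, y come from the collected dataset.
X = np.random.rand(1000, 10)
y = np.random.randint(0, 2, size=1000)

# First carve out the test set, then split the rest into train/validation (80/10/10).
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=1/9, random_state=42)
```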
Main Challenges#
Data Preprocessing#
- Insufficient Quantity of Data
- Nonrepresentative Data
- Poor-Quality Data
- Imbalanced data (see the oversampling sketch after this list)
  - Oversampling
  - Undersampling
  - Generating synthetic samples
- Irrelevant Features
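
A minimal sketch of random oversampling for the imbalanced-data item above; the synthetic data and the ~5% positive rate are illustrative. Generating synthetic samples (e.g., SMOTE via the imbalanced-learn package) is the usual alternative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.05).astype(int)  # ~5% positive class: imbalanced

# Random oversampling: repeat minority-class rows until both classes are the same size.
minority = np.flatnonzero(y == 1)
majority = np.flatnonzero(y == 0)
resampled = rng.choice(minority, size=len(majority), replace=True)
idx = np.concatenate([majority, resampled])
X_bal, y_bal = X[idx], y[idx]
print(np.bincount(y_bal))  # roughly equal class counts
```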
Training#
- Overfitting (several remedies from this list are combined in the sketch after it)
- Dropout
- Monte Carlo (MC) Dropout
- Regularization
- Underfitting
- The Vanishing/Exploding Gradients Problems
- Glorot and He Initialization
- Better Activation Functions
- Batch Normalization
- Gradient Clipping
- Hyperparameter tuning
- Grid search
- Random search
- Bayesian optimization
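
A minimal PyTorch sketch combining several of the remedies above (dropout, batch normalization, L2 weight decay, gradient clipping); the toy data, architecture, and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

# Illustrative regression data.
X = torch.randn(256, 16)
y = torch.randn(256, 1)

model = nn.Sequential(
    nn.Linear(16, 64),
    nn.BatchNorm1d(64),   # batch normalization stabilizes activations
    nn.ReLU(),
    nn.Dropout(p=0.2),    # dropout fights overfitting
    nn.Linear(64, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)  # L2 regularization
loss_fn = nn.MSELoss()

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
    opt.step()
```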
Evaluation#
- Explainable AI (XAI)
- Data Mismatch in Testing and Validating
Math#
Linear Algebra#
Vector, Matrix (Tensor)
Probability#
Statistics#
Framework#
Dataset and Dataloader
Optimizer
Multi-GPU
Monitoring
- Weights & Biases
- TensorBoard
Neural Network
- PyTorch; PyG
- JAX; Flax, Jraph
- TensorFlow
Accelerator
- TPU: XLA, ?
- GPU: CUDA, Triton
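
A minimal sketch of the Dataset/DataLoader pattern in PyTorch; the in-memory toy tensors and batch size are illustrative.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Wraps in-memory tensors so DataLoader can batch and shuffle them."""
    def __init__(self, n=1000, dim=8):
        self.x = torch.randn(n, dim)
        self.y = (self.x.sum(dim=1) > 0).long()  # toy binary labels

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

loader = DataLoader(ToyDataset(), batch_size=32, shuffle=True, num_workers=0)
xb, yb = next(iter(loader))      # each iteration yields one shuffled mini-batch
print(xb.shape, yb.shape)        # torch.Size([32, 8]) torch.Size([32])
```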
Base of ML#
Gradient Descent
- $\Theta^{*} \leftarrow \Theta - \eta \nabla_{\Theta} L$
- Constraints?
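
A worked NumPy sketch of the update rule above, applied to linear regression with an MSE loss; the data, learning rate, and step count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_theta = np.array([2.0, -1.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=100)

theta = np.zeros(3)   # Θ
eta = 0.1             # learning rate η
for _ in range(500):
    grad = 2 / len(X) * X.T @ (X @ theta - y)  # ∇_Θ L for the MSE loss
    theta = theta - eta * grad                 # Θ ← Θ − η ∇_Θ L
print(theta)  # approaches true_theta
```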
Activation Functions
Weight Initializers
Metric
- Confusion Matrices
- Precision and Recall, and their Trade-off
- ROC Curve
- F1
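
A minimal scikit-learn sketch of these metrics; the labels and predicted scores are illustrative.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score, roc_auc_score

# Illustrative labels and predictions for a binary classifier.
y_true  = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]
y_pred  = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.2, 0.7, 0.3, 0.95, 0.05]  # predicted probabilities

print(confusion_matrix(y_true, y_pred))   # rows: actual, columns: predicted
print(precision_score(y_true, y_pred))    # TP / (TP + FP)
print(recall_score(y_true, y_pred))       # TP / (TP + FN)
print(f1_score(y_true, y_pred))           # harmonic mean of precision and recall
print(roc_auc_score(y_true, y_score))     # area under the ROC curve
```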
Loss (Cost)
Backpropagation
Optimizers
- Momentum, Nesterov Accelerated Gradient, AdaGrad, RMSProp, Adam, AdaMax, Nadam, AdamW
Learning Rate Scheduler
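
A minimal PyTorch sketch pairing an optimizer (AdamW) with a learning-rate scheduler (cosine annealing); the placeholder model, mini-batches, and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # placeholder model
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=100)

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)  # illustrative mini-batch
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()  # decay the learning rate once per step (or per epoch)
```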
Computational Complexity
Regularization
Architectures#
1-Stage Detector
2-Stage Detector
AutoML
Layers#
Techniques#
Tasks#
Supervised, Unsupervised, and Semi-supervised Learning
Instance-Based vs. Model-Based Learning
Classification
Regression
Annotation
Computer Vision (CV)
- Classification and Localization
- Object Detection
- Object Tracking
- Semantic Segmentation
- Optical Character Recognition (OCR)
Natural Language Processing (NLP)
- Bag of Words & Word Embedding
- Forecasting a Time Series
- Handling Long Sequences
  - Fighting the Unstable Gradients Problem
  - Tackling the Short-Term Memory Problem
- Sentiment Analysis
- Masking
- Reusing Pretrained Embeddings and Language Models
- An Encoder-Decoder Network for Neural Machine Translation
- Bidirectional RNNs
- Beam Search
- KLUE
- MRC (Machine Reading Comprehension)
- Summarization
- Generation
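
A minimal bag-of-words sketch with scikit-learn's CountVectorizer; the two example sentences are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog chased the cat"]
bow = CountVectorizer()
X = bow.fit_transform(docs)           # sparse document-term count matrix
print(bow.get_feature_names_out())    # vocabulary (one column per word)
print(X.toarray())                    # word counts per document
```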
RecSys
Multi-modal Learning
Models#
Support Vector Machines#
- Linear SVM Classification
- Soft Margin Classification
- Nonlinear SVM Classification
- Polynomial Kernel
- Similarity Features
- Gaussian RBF Kernel
- SVM Classes and Computational Complexity
- SVM Regression
- Under the Hood of Linear SVM Classifiers
- The Dual Problem
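
A minimal scikit-learn sketch of a soft-margin, RBF-kernel SVM classifier; the moons dataset and the C/gamma values are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

# Soft-margin, nonlinear SVM: C controls margin violations, gamma the RBF kernel width.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X, y)
print(clf.score(X, y))
```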
Decision Trees#
- The CART Training Algorithm
- Gini Impurity or Entropy?
- Regularization Hyperparameters
- Sensitivity to Axis Orientation
- Decision Trees Have a High Variance
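
A minimal scikit-learn sketch of a CART decision tree with regularization hyperparameters; the iris dataset and the chosen depth are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth and min_samples_leaf act as regularization:
# shallower trees reduce the high variance noted above.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, criterion="gini", random_state=42)
tree.fit(X, y)
print(tree.score(X, y))
```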
Ensemble Learning and Random Forests#
- Voting Classifiers
- Bagging and Pasting
- Random Forests
- Boosting
- Stacking
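
A minimal scikit-learn sketch of a hard-voting ensemble that includes a random forest; the dataset and estimator choices are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)

# Hard-voting ensemble of diverse classifiers, including a bagged tree ensemble (random forest).
voting = VotingClassifier(estimators=[
    ("lr", LogisticRegression()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
    ("svc", SVC()),
], voting="hard")
voting.fit(X, y)
print(voting.score(X, y))
```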
Dimensionality Reduction#
- Projection
- Manifold Learning
- PCA
- LLE
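
A minimal scikit-learn PCA sketch that keeps enough components to explain 95% of the variance; the synthetic low-rank data and the 95% threshold are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Roughly 5-dimensional data embedded in 50 dimensions, plus a little noise.
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 50)) + 0.01 * rng.normal(size=(500, 50))

pca = PCA(n_components=0.95)   # keep enough components for 95% of the variance
X_reduced = pca.fit_transform(X)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```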
Clustering#
- k-means and DBSCAN
- Gaussian Mixtures
- Using Gaussian Mixtures for Anomaly Detection
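
A minimal scikit-learn sketch of Gaussian-mixture-based anomaly detection; the synthetic clusters and the 2% density threshold are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(5, 1, (300, 2))])

gm = GaussianMixture(n_components=2, random_state=42).fit(X)
densities = gm.score_samples(X)          # log-likelihood of each point
threshold = np.percentile(densities, 2)  # flag the 2% least likely points as anomalies
anomalies = X[densities < threshold]
print(len(anomalies))
```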
Autoencoder#
- Efficient Data Representations
- Performing PCA with an Undercomplete Linear Autoencoder
- Stacked Autoencoders
- Convolutional Autoencoders
- Denoising Autoencoders
- Sparse Autoencoders
- Variational Autoencoders
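
A minimal PyTorch sketch of an undercomplete (stacked) autoencoder trained on a reconstruction loss; the input size, bottleneck width, and training data are illustrative.

```python
import torch
import torch.nn as nn

# Undercomplete autoencoder: a 2-D bottleneck forces a compressed representation.
encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 2))
decoder = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 28 * 28))
autoencoder = nn.Sequential(encoder, decoder)

opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
X = torch.rand(256, 28 * 28)  # illustrative flattened images

for _ in range(100):
    recon = autoencoder(X)
    loss = nn.functional.mse_loss(recon, X)  # reconstruction loss: output vs. input
    opt.zero_grad()
    loss.backward()
    opt.step()
```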
Generative Adversarial Networks#
- The Difficulties of Training GANs
- Deep Convolutional GANs
- Progressive Growing of GANs
- StyleGANs
Reinforcement Learning#
- Rewards
- Policy Search
- Neural Network Policies
- Evaluating Actions: The Credit Assignment Problem
- Policy Gradients
- Markov Decision Processes
- Temporal Difference Learning
- Q-Learning
- Exploration Policies
- Approximate Q-Learning and Deep Q-Learning
- Implementing Deep Q-Learning
- Deep Q-Learning Variants
- Fixed Q-value Targets
- Double DQN
- Prioritized Experience Replay
- Dueling DQN
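
A minimal NumPy sketch of tabular Q-learning with an ε-greedy exploration policy; the toy corridor environment and hyperparameters are illustrative.

```python
import numpy as np

# Toy 1-D corridor: states 0..4, actions 0 = left / 1 = right, reward 1 for reaching state 4.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.3   # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

for episode in range(300):
    s = 0
    for _ in range(100):  # cap episode length
        # epsilon-greedy exploration policy
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

print(Q.argmax(axis=1)[:-1])  # greedy policy for states 0..3: should be all 1s (move right)
```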
Diffusion Models#
Service#
Cloud Service#
Drug Discovery Service#
Reference:
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (HOML), 2nd ed., Aurélien Géron