πŸ“Š AI Model Fundamentals

Training, evaluation, and optimization of AI models

⏱️ Estimated reading time: 22 minutes

ML Model Lifecycle

Phases

The four phases below run in sequence; a minimal code sketch follows the list.

1. Data Preparation
- Collection
- Cleaning
- Labeling
- Feature engineering

2. Training
- Algorithm selection
- Train/test split
- Iterative training

3. Evaluation
- Performance metrics
- Cross-validation
- Overfitting/underfitting detection

4. Deployment
- Inference
- Monitoring
- Updates
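
To make the phases concrete, here is a minimal sketch of the training and evaluation steps, assuming scikit-learn. The synthetic dataset stands in for the output of the data-preparation phase.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

# Data preparation (stand-in): 1,000 labeled examples with 20 features
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Training: hold out a test set, then fit the chosen algorithm
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42  # fixed seed for reproducibility
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluation: cross-validation on the training set plus a held-out test score
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(f"CV accuracy  : {cv_scores.mean():.3f}")
print(f"Test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```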

🎯 Key Points

  • βœ“ Clear phases: data prep, training, evaluation and deployment
  • βœ“ Data quality largely determines model success
  • βœ“ Versioning and reproducibility (model registry, seeds, pipelines) are critical
  • βœ“ Post-deployment monitoring for data drift and performance degradation (a drift check is sketched after this list)
  • βœ“ Automated pipelines (CI/CD) reduce errors and speed iterations
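
The monitoring point above can be made concrete with a simple distribution check. This is a hedged sketch using SciPy's two-sample Kolmogorov-Smirnov test; the window sizes and alert threshold are illustrative choices, not a standard.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 5000)  # feature distribution at training time
prod_feature = rng.normal(0.3, 1.0, 1000)   # recent production window (shifted here)

# A small p-value suggests the production sample follows a different distribution
stat, p_value = ks_2samp(train_feature, prod_feature)
if p_value < 0.01:  # illustrative alert threshold
    print(f"Possible data drift (KS={stat:.3f}, p={p_value:.4g})")
```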

Model Customization

Fine-Tuning

Further training a pre-trained model on domain-specific data (a code sketch follows the lists below).

Advantages:
- Better performance on specific tasks
- Requires less data than training from scratch
- Faster than full training

Disadvantages:
- Requires training data
- Can be expensive
- Risk of overfitting
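
A hedged sketch of what fine-tuning looks like with the Hugging Face Trainer API. The base model and dataset names are placeholders for illustration; in practice you would supply your own domain-specific labeled examples.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "distilbert-base-uncased"  # example pre-trained model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# Small labeled dataset (illustrative), tokenized for the model
data = load_dataset("imdb", split="train[:2000]").train_test_split(test_size=0.2)
data = data.map(
    lambda batch: tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=256
    ),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
print(trainer.evaluate())  # watch the eval loss to catch overfitting early
```

Holding out an evaluation split and keeping the number of epochs low are the simplest guards against the overfitting risk listed above.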

Prompt Engineering

Designing effective instructions to guide model responses (an example prompt follows the list below).

Techniques:
- Zero-shot prompting
- Few-shot prompting
- Chain-of-thought
- System prompts
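
These techniques compose naturally. Below is an illustrative few-shot prompt with a chain-of-thought cue; the task, examples, and labels are made up, and the resulting string would be sent to whichever model you use.

```python
system_prompt = "You are a concise assistant that classifies support tickets."

few_shot_examples = [  # few-shot: show the model solved examples
    ("App crashes when I open settings.", "bug"),
    ("Please add dark mode.", "feature_request"),
]

def build_prompt(ticket: str) -> str:
    lines = [system_prompt, ""]
    for text, label in few_shot_examples:
        lines.append(f"Ticket: {text}\nLabel: {label}\n")
    lines.append(f"Ticket: {ticket}")
    lines.append("Think step by step, then give the label.")  # chain-of-thought cue
    return "\n".join(lines)

print(build_prompt("I can't log in since the last update."))
```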

🎯 Key Points

  • βœ“ Fine-tuning improves task-specific performance but requires labeled data
  • βœ“ Choose between fine-tuning and prompt engineering based on cost, data and control needs
  • βœ“ Watch for overfitting: use validation and regularization
  • βœ“ Prompt engineering is fast and low-cost for many cases but offers less absolute control
  • βœ“ Assess safety and bias when customizing models

Metrics and Evaluation

Classification Metrics

- Accuracy: Correct predictions / Total predictions
- Precision: Correct positives / Total predicted positives
- Recall: Correct positives / Total actual positives
- F1-Score: Harmonic mean of precision and recall (all four computed in the sketch below)
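
A quick numeric check of the four metrics, assuming scikit-learn; the labels here are made up.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))   # correct / total
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("f1       :", f1_score(y_true, y_pred))         # harmonic mean of the two
```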

Regression Metrics

- MAE: Mean Absolute Error
- MSE: Mean Squared Error
- RMSE: Root Mean Squared Error
- RΒ²: Coefficient of determination (all four computed in the sketch below)
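
The same kind of check for the regression metrics, again with made-up values; note that RMSE is simply the square root of MSE.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.0, 6.5])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)            # root of MSE, in the same units as the target
r2 = r2_score(y_true, y_pred)  # 1.0 would be a perfect fit
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R^2={r2:.3f}")
```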

LLM Evaluation

- BLEU: n-gram overlap with reference text (originally for machine translation)
- ROUGE: Recall-oriented overlap with references, common for summarization
- Perplexity: How well the model predicts held-out text; lower is better (computed in the sketch below)
- Human evaluation: Judgments of quality, helpfulness, and safety
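
Perplexity is the exponential of the average negative log-likelihood the model assigns to held-out tokens. A tiny worked example, with made-up token probabilities standing in for a real model's outputs:

```python
import math

token_probs = [0.25, 0.10, 0.50, 0.05]  # P(token_i | preceding tokens), made up
nll = [-math.log(p) for p in token_probs]
perplexity = math.exp(sum(nll) / len(nll))
print(f"perplexity = {perplexity:.2f}")  # lower means better next-token prediction
```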

🎯 Key Points

  • βœ“ Choose metrics aligned with business goals (e.g., recall for fraud detection)
  • βœ“ Account for class imbalance and use robust metrics (precision/recall, AUC)
  • βœ“ Tune thresholds and calibrate probabilities for operational decisions (see the sketch at the end of this section)
  • βœ“ For LLMs, combine automated metrics with human evaluation and safety testing
  • βœ“ Monitor metrics in production and review frequently
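
To illustrate the threshold-tuning point, the sketch below moves the decision cut on predicted probabilities and shows the precision/recall trade-off. The probabilities are made up; in practice they would come from a method such as predict_proba on a fitted classifier.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.2, 0.4, 0.45, 0.6, 0.85, 0.1, 0.3, 0.55])

for threshold in (0.5, 0.3):  # default cut vs a recall-favoring cut
    y_pred = (y_prob >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```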