Data Analytics and Business Intelligence
Senior Analytics Leader with 15+ years of experience architecting high-impact data solutions and risk frameworks in banking and insurance. Expert at weaponizing complex datasets to drive executive strategy, having secured over $6M in NPV savings through rigorous cost-benefit modeling and advanced automation.
Bellevue University | In Progress
U.P. Technical University, India
SAS Base 9 Certified | ABA Certification in Deposit Compliance
In this post, we’ll explore the fundamentals of ETL pipeline optimization and how to achieve a 40% runtime reduction.
ETL (Extract, Transform, Load) is the backbone of modern data engineering. Let’s break down the optimization strategies:
By implementing these strategies at Citi Group, we achieved:
Effective ETL optimization requires:
Stay tuned for more insights on data engineering best practices!
Selecting the right evaluation metrics is crucial for building effective machine learning models.
Accuracy: The percentage of correct predictions
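Precision & Recall: Precision is the share of predicted positives that are actually positive; recall is the share of actual positives that the model finds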
F1-Score: Harmonic mean of precision and recall
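ROC-AUC: Area under the Receiver Operating Characteristic curve, which measures how well the model ranks positives above negatives across all thresholds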
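The classification metrics above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function names (`classification_metrics`, `roc_auc`) and the toy labels and scores are assumptions made up for the example, and the label 1 is assumed to mark the positive class. ROC-AUC is computed here through its pairwise-ranking interpretation, the fraction of (positive, negative) pairs in which the positive gets the higher score, with ties counted as half.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch but too slow for large datasets, where a sort-based computation or a library routine would be the usual choice. Note that the first four metrics depend on hard predictions at a fixed threshold, while ROC-AUC depends only on how the scores rank the examples.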
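Mean Absolute Error (MAE): The average absolute difference between predicted and actual values
Root Mean Squared Error (RMSE): The square root of the average squared difference, which penalizes large errors more heavily than MAE
R-Squared (R²): The proportion of variance in the target that the model explains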
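The three regression metrics can be sketched the same way. Again, this is only an illustration under stated assumptions: the function names (`mae`, `rmse`, `r_squared`) and the sample values are hypothetical, and R² is computed as one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the actual values.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Squared Error: square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """R-squared: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, invented purely for illustration.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]

print(mae(actual, predicted))
print(rmse(actual, predicted))
print(r_squared(actual, predicted))
```

On this toy data RMSE comes out larger than MAE because the single error of 2 is squared before averaging, which is the usual reason to prefer RMSE when large misses are especially costly.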
At Citi Group, we used ROC-AUC as the primary metric for fraud detection because:
Result: Achieved 99.9% classification accuracy with 0.99 ROC-AUC
Remember: The best metric is the one that aligns with your business problem!