Actuarial Data Science Tutorials and Data Sets
Data Sets
The following file provides actuaries with links to publicly available data sets that can be used to learn and apply data science methods. The list of data sets is maintained by the Data Science Working Group of the Swiss Actuarial Association.
Actuarial Data Science Tutorials
Below are all the tutorials that the working party has prepared so far. Additional tutorials are in preparation.
Each tutorial consists of an article and the corresponding code. The article describes the methodology and the statistical model. The code lets you replicate the analysis and test it on your own data.
Case Study 15: Privacy-preserving Machine Learning
Case Study 14: SHAP for Actuaries: Explain any Model
Case Study 13: Gini Index and Friends
Data Simulator: Individual Claims Generator for Claims Reserving Studies: data simulation.R
Case Study 12: Actuarial Applications of Natural Language Processing Using Transformers: Case Studies for Using Text Features in an Actuarial Context
Article on arXiv ; Article in British Actuarial Journal
Code on GitHub ; Notebook (Part 1) ; Notebook (Part 2) ; Notebook (Part 3)
Case Study 11: Model Comparison and Calibration Assessment: User Guide for Consistent Scoring Functions in Machine Learning and Actuarial Practice
Case Study 10: LocalGLMnet: a deep learning architecture for actuaries
Case Study 9: Convolutional neural network studies: (1) anomalies in mortality rates (2) image recognition
Code on GitHub ; Notebook (Mortality) ; Notebook (Digits) ; Notebook (Image)
Case Study 8: Peeking into the Black Box: An Actuarial Case Study for Interpretable Machine Learning
Case Study 7: The Art of Natural Language Processing: Classical, Modern and Contemporary Approaches to Text Document Classification
Code on GitHub ; Notebook (Pip) ; Notebook (ML) ; Notebook (RNN)
Case Study 6: Lee and Carter go Machine Learning: Recurrent Neural Networks
Case Study 5: Unsupervised Learning: What is a Sports Car?
Case Study 4: On Boosting: Theory and Applications
Case Study 3: Nesting Classical Actuarial Models into Neural Networks
Case Study 2: Insights from Inside Neural Networks
Case Study 1: French Motor Third-Party Liability Claims
Code on GitHub ; R Notebook (descriptive) ; R Notebook (GLM) ; Python Notebook (descriptive) ; Python Notebook (GLM)