Artificial Intelligence and Deep Learning Models for Actuarial Applications

Lecture slides from UNSW’s ACTL3143 & ACTL5111 courses

Author

Dr Patrick Laub

These are the lecture slides from the “Artificial Intelligence and Deep Learning Models for Actuarial Applications” courses (coded ACTL3143 & ACTL5111) at UNSW.

Lecture Materials

Readings

The readings will come mainly from Géron (2022), which is available through the UNSW Library’s access to O’Reilly Media texts. I’ll give references to the 3rd edition, but a copy of the 2nd edition is also fine. Some readings will be from James et al. (2021) (or equivalently the Python version, James et al. (2023)), which is available online; you’ll need the 2nd edition for this (the deep learning chapter is not in the 1st edition). Note that if I say “read from A up to B”, that means to read A but stop at B (without reading B).

Week Readings
0 Géron (2022): Chapter 1 “The Machine Learning Landscape”, Chapter 2 “End-to-End Machine Learning Project” (up to “Handling Text and Categorical Attributes”)
1
  • James et al. (2021): Sections 10.1 “Single Layer Neural Networks” & 10.2 “Multilayer Neural Networks”
  • Géron (2022): Chapter 2 “End-to-End Machine Learning Project” (up to “Better Evaluation Using Cross-Validation”), Chapter 10 “Introduction to Artificial Neural Networks With Keras” (up to “Building Complex Models Using the Functional API”)
2
  • Géron (2022): Chapter 3 “Classification” (up to “Multilabel Classification”), Chapter 10 Section “Building Complex Models Using the Functional API”, Chapter 13 Section “Encoding Categorical Features Using Embeddings”
  • James et al. (2021): Section 10.4 “Document Classification”
  • Vajjala et al. (2020): Chapters 1 and 2 (up to “Modeling”), Chapter 3 “Text Representation”
3
  • James et al. (2021): Section 10.3 “Convolutional Neural Networks”
  • Géron (2022): Chapter 14 “Deep Computer Vision Using Convolutional Neural Networks” (just skim through the specific historical architectures, like InceptionNet etc.)
4
  • James et al. (2021): Section 10.5 “Recurrent Neural Networks”
  • Géron (2022): Chapter 15 “Processing Sequences Using RNNs and CNNs”
  • Hyndman & Athanasopoulos (2018): Sections 5.1–5.3 and 5.8
5 Schelldorfer & Wüthrich (2019)
7
8 Chollet (2021): Chapter 14 “Conclusions”

Other useful resources include the Actuaries Institute’s Actuaries’ Analytical Cookbook and the Swiss Association of Actuaries’ Actuarial Data Science Tutorials.

Contributors

  • Eric Tian Dong
  • Michael Jacinto
  • Marcus Lautier
  • Sam Luo
  • Hang Nguyen
  • Melissa Renard
  • Gayani Thalagoda

References

Agarwal, R., Melnick, L., Frosst, N., Zhang, X., Lengerich, B., Caruana, R., & Hinton, G. E. (2021). Neural Additive Models: Interpretable Machine Learning with Neural Nets. Advances in Neural Information Processing Systems, 34, 4699–4711.
Avanzi, B., Taylor, G., Wang, M., & Wong, B. (2024). Machine learning with high-cardinality categorical features in actuarial applications. ASTIN Bulletin: The Journal of the IAA, 54(2), 213–238. https://doi.org/10.1017/asb.2024.7
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. 3rd International Conference on Learning Representations (ICLR).
Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and machine learning: Limitations and opportunities. MIT Press.
Benidis, K., Rangapuram, S. S., Flunkert, V., Wang, Y., Maddix, D., Turkmen, C., Gasthaus, J., Bohlke-Schneider, M., Salinas, D., Stella, L., et al. (2022). Deep learning for time series forecasting: Tutorial and literature survey. ACM Computing Surveys, 55(6), 1–36.
Ben-Zion, Z., Witte, K., Jagadish, A. K., Duek, O., Harpaz-Rotem, I., Khorsandian, M.-C., Burrer, A., Seifritz, E., Homan, P., Schulz, E., et al. (2025). Assessing and alleviating state anxiety in large language models. npj Digital Medicine, 8(1), 132.
Bergstra, J., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for hyper-parameter optimization. Advances in Neural Information Processing Systems, 24.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).
Biecek, P., & Burzykowski, T. (2021). Explanatory Model Analysis. Chapman and Hall/CRC, New York. https://pbiecek.github.io/ema/
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Advances in Neural Information Processing Systems, 29.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Carreira-Perpiñán, M. A., & Tavallali, P. (2018). Alternating optimization of decision trees, with application to learning sparse oblique trees. Advances in Neural Information Processing Systems, 31.
Charpentier, A. (2024). Insurance, biases, discrimination and fairness. Springer.
Chen, Z., Lu, Y., Zhang, J., & Zhu, W. (2024). Managing weather risk with a neural network-based index insurance. Management Science, 70(7), 4306–4327.
Chollet, F. (2021). Deep learning with Python. Simon and Schuster.
Delcaillau, D., Ly, A., Papp, A., & Vermet, F. (2022). Model transparency and interpretability: Survey and application to the insurance industry. European Actuarial Journal, 12(2), 443–484.
Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning.
Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. In Automated machine learning: Methods, systems, challenges (pp. 3–33). Springer.
Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20(177), 1–81.
Géron, A. (2022). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow (3rd ed.). O’Reilly Media.
Glassner, A. (2021). Deep learning: A visual approach. No Starch Press.
Goodfellow, I. J., Bulatov, Y., Ibarz, J., Arnoud, S., & Shet, V. (2014). Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv Preprint arXiv:1312.6082.
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv Preprint arXiv:1412.6572.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27.
Guo, C., & Berkhahn, F. (2016). Entity embeddings of categorical variables. arXiv Preprint arXiv:1604.06737.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice. OTexts.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R. Springer.
James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning: with Applications in Python. Springer.
Kelley Pace, R., & Barry, R. (1997). Sparse spatial autoregressions. Statistics & Probability Letters, 33(3), 291–297.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097–1105.
Li, C., Wang, J., Zhang, Y., Zhu, K., Hou, W., Lian, J., Luo, F., Yang, Q., & Xie, X. (2023). Large language models understand and can be enhanced by emotional stimuli. arXiv Preprint arXiv:2307.11760.
Liu, C.-L., Yin, F., Wang, D.-H., & Wang, Q.-F. (2011). CASIA online and offline Chinese handwriting databases. 2011 International Conference on Document Analysis and Recognition, 37–41.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv Preprint arXiv:1301.3781.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
Molnar, C. (2020). Interpretable machine learning.
Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.
O’Neil, C. (2017). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144.
Richman, R., & Wüthrich, M. V. (2019). Lee and Carter go machine learning: Recurrent neural networks. Available at SSRN 3441030.
Richman, R., & Wüthrich, M. V. (2023). LocalGLMnet: Interpretable deep learning for tabular data. Scandinavian Actuarial Journal, 2023(1), 71–95.
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215.
Rudin, C., Chen, C., Chen, Z., Huang, H., Semenova, L., & Zhong, C. (2022). Interpretable machine learning: Fundamental principles and 10 grand challenges. Statistics Surveys, 16, 1–85.
Russell, S., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed.). Pearson.
Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3), 210–229.
Schelldorfer, J., & Wüthrich, M. V. (2019). Nesting classical actuarial models into neural networks. Available at SSRN 3320525.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
The TensorFlow Team. (2017). Introducing TensorFlow feature columns. Google Developers Blog. https://developers.googleblog.com/introducing-tensorflow-feature-columns/
Trask, A. W. (2019). Grokking deep learning. Manning Publications.
Vajjala, S., Majumder, B., Gupta, A., & Surana, H. (2020). Practical natural language processing: a comprehensive guide to building real-world NLP systems. O’Reilly Media.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.