ACTL3143 & ACTL5111 Deep Learning for Actuaries
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
Try out https://www.whichfaceisreal.com.
Source: https://thispersondoesnotexist.com.
Source: Jeff Heaton (2021), Training a GAN from your Own Images: StyleGAN2.
A schematic of a generative adversarial network.
Source: Thales Silva (2018), An intuitive introduction to Generative Adversarial Networks (GANs), freeCodeCamp.
Source: Google Developers, Overview of GAN Structure, Google Machine Learning Education.
How they best each other:
First step: Training discriminator:
Second step: Training generator:
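A minimal Keras sketch of these two alternating steps is below. It assumes a binary cross-entropy loss on logits and illustrative generator/discriminator/optimizer names; it is a conceptual sketch, not the exact demo code.
import tensorflow as tf
from tensorflow import keras

bce = keras.losses.BinaryCrossentropy(from_logits=True)

def gan_train_step(generator, discriminator, g_optimizer, d_optimizer,
                   real_images, latent_dim=128):
    batch_size = tf.shape(real_images)[0]

    # First step: update the discriminator to score real images high and fakes low.
    noise = tf.random.normal((batch_size, latent_dim))
    fake_images = generator(noise, training=False)
    with tf.GradientTape() as tape:
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake_images, training=True)
        d_loss = bce(tf.ones_like(real_logits), real_logits) \
               + bce(tf.zeros_like(fake_logits), fake_logits)
    grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables))

    # Second step: update the generator so the (fixed) discriminator labels its fakes as real.
    noise = tf.random.normal((batch_size, latent_dim))
    with tf.GradientTape() as tape:
        fake_logits = discriminator(generator(noise, training=True), training=False)
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_optimizer.apply_gradients(zip(grads, generator.trainable_variables))

    return d_loss, g_loss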
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
An analogy for unconditional vs conditional GANs
Original data
Initial fakes
Fakes after 1 iteration
Fakes after 100 kimg
Fakes after 200 kimg
Fakes after 1000 kimg
Fakes after 3700 kimg
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
A deoldified version of the famous “Migrant Mother” photograph.
Source: DeOldify package.
A deoldified Golden Gate Bridge under construction.
Source: DeOldify package.
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
StyleGAN2-ADA training times on V100s (1024x1024):
| GPUs | 1000 kimg | 25000 kimg | sec / kimg | GPU mem | CPU mem |
|------|-----------|------------|------------|---------|---------|
| 1    | 1d 20h    | 46d 03h    | 158        | 8.1 GB  | 5.3 GB  |
| 2    | 23h 09m   | 24d 02h    | 83         | 8.6 GB  | 11.9 GB |
| 4    | 11h 36m   | 12d 02h    | 40         | 8.4 GB  | 21.9 GB |
| 8    | 5h 54m    | 6d 03h     | 20         | 8.3 GB  | 44.7 GB |
Source: NVIDIA’s GitHub, StyleGAN2-ADA — Official PyTorch implementation.
Source: Metz et al. (2017), Unrolled Generative Adversarial Networks and Randall Munroe (2007), xkcd #221: Random Number.
A schematic of a generative adversarial network.
from tensorflow import keras

# Separate optimisers for the discriminator and the generator.
d_optimizer = keras.optimizers.Adam(learning_rate=0.0003)
g_optimizer = keras.optimizers.Adam(learning_rate=0.0004)
Source: Thales Silva (2018), An intuitive introduction to Generative Adversarial Networks (GANs), freeCodeCamp.
Conv2D
GlobalMaxPool2D
Conv2DTranspose
Sources: Pröve (2017), An Introduction to different Types of Convolutions in Deep Learning, and Peltarion Knowledge Center, Global max pooling 2D.
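A minimal sketch of where these layers typically sit in a GAN: Conv2D and GlobalMaxPool2D downsample in the discriminator, Conv2DTranspose upsamples in the generator. All shapes and filter counts below are illustrative, not the demo’s actual architecture.
from tensorflow import keras
from tensorflow.keras import layers

# Discriminator-style network: Conv2D downsamples, GlobalMaxPool2D collapses to a vector.
discriminator = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2D(128, 4, strides=2, padding="same", activation="relu"),
    layers.GlobalMaxPool2D(),
    layers.Dense(1),
])

# Generator-style network: Conv2DTranspose upsamples a latent vector into an image.
generator = keras.Sequential([
    keras.Input(shape=(128,)),
    layers.Dense(16 * 16 * 128),
    layers.Reshape((16, 16, 128)),
    layers.Conv2DTranspose(128, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="sigmoid"),
])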
Generating synthetic user information with differential privacy and Wasserstein GANs.
Source: Côté et al. (2020), Synthesizing Property & Casualty Ratemaking Datasets using Generative Adversarial Networks, arXiv.
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
Generating sequential data is the closest computers get to dreaming.
Source: Alex Graves (2013), Generating Sequences With Recurrent Neural Networks
Diagram of a word-level language model.
Source: Marcus Lautier (2022).
Diagram of a character-level language model (Char-RNN)
Source: Tensorflow tutorial, Text generation with an RNN.
| RNN output | Decoded Transcription |
|------------|-----------------------|
| what is the weather like in bostin right now | what is the weather like in boston right now |
| prime miniter nerenr modi | prime minister narendra modi |
| arther n tickets for the game | are there any tickets for the game |
Source: Hannun et al. (2014), Deep Speech: Scaling up end-to-end speech recognition, arXiv:1412.5567, Table 1.
ROMEO:
Why, sir, what think you, sir?
AUTOLYCUS:
A dozen; shall I be deceased.
The enemy is parting with your general,
As bias should still combit them offend
That Montague is as devotions that did satisfied;
But not they are put your pleasure.
Source: Tensorflow tutorial, Text generation with an RNN.
DUKE OF YORK:
Peace, sing! do you must be all the law;
And overmuting Mercutio slain;
And stand betide that blows which wretched shame;
Which, I, that have been complaints me older hours.
LUCENTIO:
What, marry, may shame, the forish priest-lay estimest you, sir,
Whom I will purchase with green limits o’ the commons’ ears!
Source: Tensorflow tutorial, Text generation with an RNN.
ANTIGONUS:
To be by oath enjoin’d to this. Farewell!
The day frowns more and more: thou’rt like to have
A lullaby too rough: I never saw
The heavens so dim by day. A savage clamour!
[Exit, pursued by a bear]
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
Idea inspired by Mehta (2023), The need for sampling temperature and differences between Whisper, GPT-3, and probabilistic model’s temperature.
In today’s lecture we will be different situation. So, next one is what they rective that each commit to be able to learn some relationships from the course, and that is part of the image that it’s very clese and black problems that you’re trying to fit the neural network to do there instead of like a specific though shef series of layers mean about full of the chosen the baseline of car was in the right, but that’s an important facts and it’s a very small summary with very scrort by the beginning of the sentence.
In today’s lecture we will decreas before model that we that we have to think about it, this mightsks better, for chattely the same project, because you might use the test set because it’s to be picked up the things that I wanted to heard of things that I like that even real you and you’re using the same thing again now because we need to understand what it’s doing the same thing but instead of putting it in particular week, and we can say that’s a thing I mainly link it’s three columns.
In today’s lecture we will probably the adw n wait lots of ngobs teulagedation to calculate the gradient and then I’ll be less than one layer the next slide will br input over and over the threshow you ampaigey the one that we want to apply them quickly. So, here this is the screen here the main top kecw onct three thing to told them, and the output is a vertical variables and Marceparase of things that you’re moving the blurring and that just data set is to maybe kind of categorical variants here but there’s more efficiently not basically replace that with respect to the best and be the same thing.
In today’s lecture we will put it different shates to touch on last week, so I want to ask what are you object frod current. They don’t have any zero into it, things like that which mistakes. 10 claims that the average version was relden distever ditgs and Python for the whole term wo long right to really. The name of these two options. There are in that seems to be modified version. If you look at when you’re putting numbers into your, that that’s over. And I went backwards, up, if they’rina functional pricing working with.
In today’s lecture we will put it could be bedinnth. Lowerstoriage nruron. So rochain the everything that I just sGiming. If there was a large. It’s gonua draltionation. Tow many, up, would that black and 53% that’s girter thankAty will get you jast typically stickK thing. But maybe. Anyway, I’m going to work on this libry two, past, at shit citcs jast pleming to memorize overcamples like pre pysing, why wareed to smart a one in this reportbryeccuriay.
This is (probably) just the ‘temperature’ knob under the hood.
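A sketch of what that knob does: the logits are divided by the temperature before the softmax, so low temperatures concentrate probability on the most likely token and high temperatures flatten the distribution. The function below is purely illustrative, not the code that produced the samples above.
import numpy as np

rng = np.random.default_rng(1)

def sample_next_token(logits, temperature=1.0):
    # Divide logits by the temperature, softmax, then sample one token index.
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract the max for numerical stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.5, -1.0]
for t in (0.2, 1.0, 2.0):
    print(t, [sample_next_token(logits, t) for _ in range(10)])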
An example sequence-to-sequence chatbot model.
Source: Payne (2021), What is beam search, Width.ai blog.
Illustration of a beam search.
Source: Doshi (2021), Foundations of NLP Explained Visually: Beam Search, How It Works, towardsdatascience.com.
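A rough sketch of beam search: at each step, every surviving sequence is extended by every token, and only the beam_width highest-scoring partial sequences are kept. The toy next-token model here is purely illustrative.
import numpy as np

def beam_search(next_log_probs, vocab_size, length, beam_width=3):
    # Keep only the `beam_width` highest-scoring partial sequences at each step.
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            log_probs = next_log_probs(seq)  # vector of length vocab_size
            for token in range(vocab_size):
                candidates.append((seq + [token], score + log_probs[token]))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

# Toy "model": the same next-token distribution regardless of context.
rng = np.random.default_rng(0)
fixed_log_probs = np.log(rng.dirichlet(np.ones(5)))
print(beam_search(lambda seq: fixed_log_probs, vocab_size=5, length=3))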
Lecture Outline
Generative Adversarial Networks
Conditional GANs
Image-to-image translation
Problems with GANs
Language Models
Sampling strategy
Transformers
GPT makes use of a mechanism known as attention, which removes the need for recurrent layers (e.g., LSTMs). It works like an information retrieval system, utilizing queries, keys, and values to decide how much information it wants to extract from each input token.
Attention heads can be grouped together to form what is known as a multihead attention layer. These are then wrapped up inside a Transformer block, which includes layer normalization and skip connections around the attention layer. Transformer blocks can be stacked to create very deep neural networks.
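A minimal NumPy sketch of the scaled dot-product attention at the heart of this mechanism; the shapes and values are illustrative.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Each query is compared to every key; the values are mixed by the softmax weights.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_queries, n_keys)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # (n_queries, d_v)

rng = np.random.default_rng(42)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 16))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 16)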
Highly recommended viewing: Iulia Turk (2021), Transfer learning and Transformer models, ML Tech Talks.
Source: David Foster (2023), Generative Deep Learning, 2nd Edition, O’Reilly Media, Chapter 9.
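The generator object used in the next snippets is not defined in this excerpt; presumably it is a Hugging Face text-generation pipeline along the following lines (the GPT-2 model choice is an assumption based on the style of the sampled text).
import transformers

generator = transformers.pipeline("text-generation", model="gpt2")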
transformers.set_seed(123)
print(generator("It's the holidays so I'm going to enjoy")[0]["generated_text"])
It's the holidays so I'm going to enjoy this," he says.
The three are all in the same boat. "We're so excited to see what the fans will be like," he says.
The first two days of the season are a big hit. The team still has to play without their top line and center Dwight Howard, but there's still a lot of action ahead.
"There's a lot of new players coming in and I think it's going to be a great tournament for everybody, but it's going to be a very busy time," says Howard.
The team is excited about the opportunity to see their new teammates play.
"It's going to be a great tournament for us but it's going to be a great experience for everyone," he says.
But there's no way around it.
"It's a tough game," says Howard. "We lose one of our best players and we lose one of our best players for the last five years."
transformers.set_seed(234)
print(generator("It's the holidays so I'm going to enjoy")[0]["generated_text"])
It's the holidays so I'm going to enjoy this for a long time."
The full report is expected to be released to the public at the end of the year.
The report also says that the government is "working with private sector stakeholders" to develop a "new approach for addressing and combating cyber attacks on our networks".
In response to the report, Chief Secretary G.K. Chidambaram said the government would ensure that all the data it holds about the state is used in a way that is consistent with its values and priorities.
"We will take steps to ensure that this data is used in a way that is consistent with the values and priorities of the state. We will continue to provide the state with information that makes the state more secure, more efficient and more resilient. We will ensure that our people have access to the most sensitive data. We will ensure that our government provides effective services to the people of Bangladesh and to the world," he told reporters here.
"Today's report is an important step in enhancing our security. The first thing I want to tell you is the government is working with private sector stakeholders and it is important that we provide the system with the tools and the capacity that is necessary to protect our people. It is now in a position to
context = """
StoryWall Formative Discussions: An initial StoryWall, worth 2%, is due by noon on June 3. The following StoryWalls are worth 4% each (taking the best 7 of 9) and are due at noon on the following dates:
The project will be submitted in stages: draft due at noon on July 1 (10%), recorded presentation due at noon on July 22 (15%), final report due at noon on August 1 (15%).
As a student at UNSW you are expected to display academic integrity in your work and interactions. Where a student breaches the UNSW Student Code with respect to academic integrity, the University may take disciplinary action under the Student Misconduct Procedure. To assure academic integrity, you may be required to demonstrate reasoning, research and the process of constructing work submitted for assessment.
To assist you in understanding what academic integrity means, and how to ensure that you do comply with the UNSW Student Code, it is strongly recommended that you complete the Working with Academic Integrity module before submitting your first assessment task. It is a free, online self-paced Moodle module that should take about one hour to complete.
StoryWall (30%)
The StoryWall format will be used for small weekly questions. Each week of questions will be released on a Monday, and most of them will be due the following Monday at midday (see assessment table for exact dates). Students will upload their responses to the question sets, and give comments on another student's submission. Each week will be worth 4%, and the grading is pass/fail, with the best 7 of 9 being counted. The first week's basic 'introduction' StoryWall post is counted separately and is worth 2%.
Project (40%)
Over the term, students will complete an individual project. There will be a selection of deep learning topics to choose from (this will be outlined during Week 1).
The deliverables for the project will include: a draft/progress report mid-way through the term, a presentation (recorded), a final report including a written summary of the project and the relevant Python code (Jupyter notebook).
Exam (30%)
The exam will test the concepts presented in the lectures. For example, students will be expected to: provide definitions for various deep learning terminology, suggest neural network designs to solve risk and actuarial problems, give advice to mock deep learning engineers whose projects have hit common roadblocks, find/explain common bugs in deep learning Python code.
"""
{'score': 0.5019664764404297, 'start': 2092, 'end': 2095, 'answer': '30%'}
{'score': 0.21276013553142548,
'start': 1778,
'end': 1791,
'answer': 'deep learning'}
{'score': 0.5296490788459778,
'start': 1319,
'end': 1335,
'answer': 'Monday at midday'}
“… there is no official paper that describes how ChatGPT works in detail, but … we know that it uses a technique called reinforcement learning from human feedback (RLHF) to fine-tune the GPT-3.5 model. While ChatGPT still has many limitations (such as sometimes “hallucinating” factually incorrect information), it is a powerful example of how Transformers can be used to build generative models that can produce complex, long-ranging, and novel output that is often indistinguishable from human-generated text. The progress made thus far by models like ChatGPT serves as a testament to the potential of AI and its transformative impact on the world.”
Source: David Foster (2023), Generative Deep Learning, 2nd Edition, O’Reilly Media, Chapter 9.
Two new courses starting in 2026:
ACTL4306 “Quantitative Ethical AI for Risk & Actuarial Applications”
ACTL4307 “Generative AI for Actuaries”