Lab: Distributional Regression
ACTL3143 & ACTL5111 Deep Learning for Actuaries
CANN
- Find the coefficients \boldsymbol{\beta}_{\text{GLM}} of the GLM with a link function g(\cdot).
- Find the weights \boldsymbol{w}_{\text{CANN}} of a neural network \mathcal{M}_{\text{CANN}}:\mathbb{R}^{d_{\boldsymbol{x}}}\to\mathbb{R}.
- Given a new instance \boldsymbol{x}, we have \mathbb{E}[Y|\boldsymbol{x}] = g^{-1}\Big( \langle\boldsymbol{\beta}_{\text{GLM}}, \boldsymbol{x}\rangle + \mathcal{M}_{\text{CANN}}(\boldsymbol{x};\boldsymbol{w}_{\text{CANN}})\Big).
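A minimal Keras sketch of this architecture, assuming a log link; the layer sizes and the variable names (d_x, glm_logmu, cann_logmu, cann_model) are illustrative, and the lecture code may instead combine the two predictors inside the loss function rather than inside the model:

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Add, Activation
from tensorflow.keras.models import Model

d_x = 10  # hypothetical number of features; adjust to the data

features = Input(shape=(d_x,), name="features")
glm_logmu = Input(shape=(1,), name="glm_logmu")  # precomputed <beta_GLM, x>

# Neural-network adjustment M_CANN(x; w_CANN)
hidden = Dense(64, activation="relu")(features)
cann_logmu = Dense(1, name="cann_logmu")(hidden)

# Skip connection on the linear-predictor scale, then apply g^{-1} = exp
logmu = Add()([glm_logmu, cann_logmu])
mu = Activation("exponential")(logmu)

cann_model = Model(inputs=[features, glm_logmu], outputs=mu)
```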
MDN
Exercises
CANN
Train a CANN model that predicts the mean as follows: \mathbb{E}[Y|\boldsymbol{x}] = g^{-1}\Big( \textcolor{orange}{0.9} \cdot\langle\boldsymbol{\beta}_{\text{GLM}}, \boldsymbol{x}\rangle + \textcolor{orange}{0.1} \cdot \mathcal{M}_{\text{CANN}}(\boldsymbol{x};\boldsymbol{w}_{\text{CANN}})\Big), where g^{-1}(\cdot)=\exp(\cdot). Hint: Check slides 20, 23, 24, and 25, and change the following line of code on slide 24.
def CANN_negative_log_likelihood(y_true, y_pred):
    ...
    mu = tf.math.exp(CANN_logmu + GLM_logmu)
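With the 0.9/0.1 weighting from the exercise, that line could plausibly become (keeping the variable names of the original snippet):

```python
mu = tf.math.exp(0.9 * GLM_logmu + 0.1 * CANN_logmu)
```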
Recompute the dispersion parameter using the adjusted model. Hint: use the code from slide 25 and change the following line of code.
mus = np.exp(np.sum(CANN.predict(X_train), axis=1))
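For the dispersion step itself, one common choice is the Pearson estimator; the sketch below assumes a gamma response (variance function V(\mu)=\mu^2), with mus and y_train as above and the number of estimated coefficients proxied by the number of features. The estimator used on slide 25 may differ:

```python
import numpy as np

# Pearson estimate of the dispersion parameter for a gamma response:
# phi_hat = (1 / (n - p)) * sum_i (y_i - mu_i)^2 / mu_i^2
n, p = len(y_train), X_train.shape[1]
phi_hat = np.sum((y_train - mus) ** 2 / mus ** 2) / (n - p)
```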
MDN
Increase the number of mixture components to 5. You can use the code from slide 33.
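One way the output layers could look with five components, assuming the mixture weights come from a softmax and the gamma parameters from softplus activations (x denotes the last hidden layer; names are illustrative):

```python
from tensorflow.keras.layers import Dense

K = 5  # number of mixture components

pis = Dense(K, activation="softmax")(x)      # mixture weights
alphas = Dense(K, activation="softplus")(x)  # gamma shape parameters
betas = Dense(K, activation="softplus")(x)   # gamma rate parameters
```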
Change the distributional assumption from gamma to inverse gamma for the mixture density network model. Hint: adjust the following code, referring to the TensorFlow Probability documentation for tfd.InverseGamma.
mixture_distribution = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=pis),
    components_distribution=tfd.Gamma(alphas, betas))
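TensorFlow Probability parameterises tfd.InverseGamma by a concentration and a scale, so one possible swap (reusing the variable names above, now interpreted as concentration and scale parameters) is:

```python
mixture_distribution = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=pis),
    components_distribution=tfd.InverseGamma(concentration=alphas, scale=betas))
```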
Report the average negative log-likelihood loss (test data) using the new MDN. Hint: slides 36 and 39.
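Since the network is compiled with the negative log-likelihood as its loss, the average test NLL can be read directly from evaluate; a one-line sketch assuming the fitted model is called MDN:

```python
avg_nll = MDN.evaluate(X_test, y_test, verbose=0)  # average NLL on the test set
```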
Extension
- Compute the CRPS for the models trained in the exercises above (a sample-based CRPS sketch follows this list).
- Build a Mixture Density Network (MDN), where the first component is a gamma distribution, the second component is a log-normal distribution, and the third component is an inverse gamma distribution.
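For the CRPS, one option is the sample-based estimator \text{CRPS}(F, y) \approx \mathbb{E}|X - y| - \frac{1}{2}\mathbb{E}|X - X'|, with X and X' independent draws from the predictive distribution. A sketch assuming dist is a tfd distribution for a single instance and y_obs is the observed value:

```python
import numpy as np

def crps_mc(dist, y_obs, n_samples=10_000):
    """Monte Carlo estimate of the CRPS for one predictive distribution."""
    x1 = dist.sample(n_samples).numpy().flatten()
    x2 = dist.sample(n_samples).numpy().flatten()
    return np.mean(np.abs(x1 - y_obs)) - 0.5 * np.mean(np.abs(x1 - x2))
```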
Monte Carlo Dropout
For Monte Carlo (MC) dropout, we intentionally leave the dropout on when making predictions.
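In Keras this amounts to calling the fitted model with training=True, so the Dropout layers keep sampling random masks at prediction time; a minimal sketch with a hypothetical fitted model:

```python
# Each call with training=True applies a fresh dropout mask,
# so repeated calls on the same input give different predictions.
pred_1 = model(X_test[:1], training=True)
pred_2 = model(X_test[:1], training=True)  # generally differs from pred_1
```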
Deep Ensembles
- Train D neural networks with different random weight initialisations, independently and in parallel. The trained weights are \boldsymbol{w}^{(1)}, \ldots, \boldsymbol{w}^{(D)}.
- Ensemble the outputs when making predictions, i.e., take the average of the outputs of the individual neural networks (a minimal sketch follows below).
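A minimal sketch of this procedure, assuming a hypothetical build_model() helper that constructs and compiles a fresh network:

```python
import numpy as np
import tensorflow as tf

D = 5  # ensemble size
ensemble = []
for d in range(D):
    tf.random.set_seed(d)      # different random weight initialisation per member
    model = build_model()      # hypothetical helper returning a compiled model
    model.fit(X_train, y_train, epochs=50, verbose=0)
    ensemble.append(model)

# Ensemble prediction: average the D individual predictions
preds = np.mean([m.predict(X_test, verbose=0) for m in ensemble], axis=0)
```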
Exercises
Monte Carlo Dropout
Construct a neural network MCDropout_LN that outputs the parameters of a gamma distribution with the following structure and specification:
- Use random.seed(1); tf.random.set_seed(1),
- Adam optimiser with the default learning rate,
- validation split of 0.2 while training,
- two hidden layers with 64 neurons in each layer, and
- a constant dropout rate of 0.2.
Hint: the following code can be helpful
# Output the parameters of the gamma distribution
outputs = Dense(2, activation='softplus')(x)

# Construct the Gamma distribution on the last layer
distributions = tfp.layers.DistributionLambda(
    lambda t: tfd.Gamma(concentration=t[..., 0:1],
                        rate=t[..., 1:2]))(outputs)

# Model
MCDropout_LN = Model(inputs, distributions)

# Loss Function
def gamma_loss(y_true, y_pred):
    return -y_pred.log_prob(y_true)

# Then use the loss function when compiling the model
MCDropout_LN.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                     loss=gamma_loss)
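The hidden layers are not shown in the hint; one possible way to build them to the exercise's specification (two hidden layers of 64 units, each followed by dropout at rate 0.2; names are illustrative):

```python
from tensorflow.keras.layers import Input, Dense, Dropout

inputs = Input(shape=(X_train.shape[1],))
x = Dense(64, activation="relu")(inputs)
x = Dropout(0.2)(x)
x = Dense(64, activation="relu")(x)
x = Dropout(0.2)(x)
# x then feeds into the `outputs` layer from the hint above
```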
Apply MC dropout 2000 times and store the parameter estimates for the first instance in the test dataset using the model MCDropout_LN. Hint: slide 61, and replace

predicted_distributions = gamma_bnn(X_test[9:10].values)

with

predicted_distributions = MCDropout_LN(X_test[:1].values, training=True)
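The 2000 stochastic forward passes can then be collected in a loop; a sketch storing the estimated gamma parameters for the first test instance (array names are illustrative and assume the parameterisation from the hint above):

```python
import numpy as np

n_passes = 2000
concentrations, rates = [], []
for _ in range(n_passes):
    dist = MCDropout_LN(X_test[:1].values, training=True)
    concentrations.append(dist.concentration.numpy().squeeze())
    rates.append(dist.rate.numpy().squeeze())

concentrations = np.array(concentrations)
rates = np.array(rates)
```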
Calculate the aleatoric and epistemic uncertainty for this instance, using the uncertainty decomposition equations from the lecture. Hint: slide 64.
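One common convention is the law-of-total-variance decomposition: the aleatoric part is the average of the per-pass distribution variances and the epistemic part is the variance of the per-pass means. A sketch continuing from the stored parameters above; the lecture's exact equations may be arranged differently:

```python
# Per-pass gamma moments: mean = alpha / beta, variance = alpha / beta^2
means = concentrations / rates
variances = concentrations / rates ** 2

aleatoric = np.mean(variances)  # expected within-pass variance
epistemic = np.var(means)       # variance of the per-pass means
```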
Deep Ensembles
- Reuse the code demonstrated in the lecture to calculate the aleatoric and epistemic uncertainty for the first instance in the test dataset, using the same uncertainty decomposition as above. Hint: slides 66, 67, and 68.
Extension
- Prove the result on slide 55.
- Replace the variational distribution with a mixture of Gaussians for the BNN introduced in the exercises above.