ACTL3143 & ACTL5111 Deep Learning for Actuaries
A recording covering (most of) this Python content:
Lecture Outline
Data Science & Python
Python Data Types
Collections
Control Flow
Python Functions
Import syntax
Lambda functions
It is a general-purpose language.
Python powers:
Python is on Mars.
Source: Kaggle (2021), State of Machine Learning and Data Science.
…[T]he entire machine learning and data science industry has been dominated by these two approaches: deep learning and gradient boosted trees… Users of gradient boosted trees tend to use Scikit-learn, XGBoost, or LightGBM. Meanwhile, most practitioners of deep learning use Keras, often in combination with its parent framework TensorFlow. The common point of these tools is they’re all Python libraries: Python is by far the most widely used language for machine learning and data science.
Source: François Chollet (2021), Deep Learning with Python, Second Edition, Section 1.2.7.
Python Data Types
If we want to add 2 to a variable x, we can use the shorthand x += 2.
Same for:
x -= 2: take 2 from the current value of x,
x *= 2: double the current value of x,
x /= 2: halve the current value of x.
and & or
Use help to get more details.
What is the output of:
True and False
What would you add before line 3 to get “True and True”?
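The shorthand assignments and the and/or operators from this section can be tried out with a short sketch. The starting value 5 is an arbitrary choice for illustration and does not come from the slides.

```python
x = 5      # arbitrary starting value (an assumption for this sketch)
x += 2     # same as x = x + 2, so x is now 7
x -= 2     # take 2 away again, back to 5
x *= 2     # double it: 10
x /= 2     # halve it: 5.0 (true division always returns a float)

# Boolean operators combine truth values.
both = True and False    # False: "and" needs both sides to be True
either = True or False   # True: "or" needs at least one side to be True

print(x, both, either)
```

Note that x ends up as a float even though it started as an int, because /= uses true division.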
Collections
TypeError: slice indices must be integers or None or have an __index__ method
['Coffee', 'Cake', 'Sleep', 'Gadget']
None
<class 'tuple'>
3
Rainy
4 and 5
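The code that produced the outputs above is not shown here. As a hedged sketch, the TypeError and the tuple type can be reproduced as follows; the list contents are taken from the output above, while the particular slice and the tuple (4, 5) are assumptions for illustration.

```python
desires = ["Coffee", "Cake", "Sleep", "Gadget"]

# Slicing with integer indices works as expected.
print(desires[1:3])

# Slicing with a float index raises the TypeError shown above.
try:
    desires[1.0:3]
except TypeError as err:
    message = str(err)
    print(f"TypeError: {message}")

# Parentheses (or just commas) create a tuple.
pair = (4, 5)
print(type(pair))
```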
Control Flow
if and else
IndentationError: expected an indented block after 'else' statement on line 3 (2212277638.py, line 4)
Warning
Watch out for mixing tabs and spaces!
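The IndentationError above is what Python reports when the body under else is not indented. A minimal correctly indented sketch, using a hypothetical variable age that is not from the slides:

```python
age = 20  # hypothetical value for illustration

if age < 18:
    status = "minor"   # runs only when the condition is True
else:
    status = "adult"   # the body under else must be indented too

print(status)
```

Using four spaces per level everywhere avoids the tabs-versus-spaces problem mentioned in the warning.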
elif
for loops
Patrick wants a coffee, it is priority #1.
Patrick wants a cake, it is priority #2.
Patrick wants a sleep, it is priority #3.
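The loop that printed the three lines above is not shown. One way to produce them, assuming the original used enumerate to number the items (this is a guess at the original code, not a copy of it):

```python
desires = ["coffee", "cake", "sleep"]

lines = []
# enumerate yields (index, item) pairs; start=1 makes the count begin at 1.
for priority, desire in enumerate(desires, start=1):
    line = f"Patrick wants a {desire}, it is priority #{priority}."
    lines.append(line)
    print(line)
```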
desires = ["coffee", "cake", "nap"]
times = ["in the morning", "at lunch", "during a boring lecture"]
for desire, time in zip(desires, times):
    print(f"Patrick enjoys a {desire} {time}.")
Patrick enjoys a coffee in the morning.
Patrick enjoys a cake at lunch.
Patrick enjoys a nap during a boring lecture.
List comprehensions can get more complicated:
[[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 4, 6], [0, 3, 6, 9]]
but I’d recommend just using for loops at that point.
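The nested output above looks like a 4 × 4 multiplication table. A sketch of a nested list comprehension that produces it, alongside the equivalent for loops (the variable names are assumptions):

```python
# Nested list comprehension: one inner list per value of x.
table = [[x * y for y in range(4)] for x in range(4)]
print(table)

# The same result with plain for loops, which is longer but easier to read.
table_loops = []
for x in range(4):
    row = []
    for y in range(4):
        row.append(x * y)
    table_loops.append(row)
```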
Say that we want to simulate $(X \mid X \ge 100)$ where $X \sim \mathrm{Pareto}(1)$. Assuming we have simulate_pareto, a function to generate $\mathrm{Pareto}(1)$ variables:
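A minimal sketch of this conditional simulation with a while loop. Here simulate_pareto is a stand-in built on the standard library's random.paretovariate, which is an assumption; the course's own function may differ.

```python
import random

def simulate_pareto():
    # Stand-in for the assumed helper: a Pareto variable with shape 1.
    return random.paretovariate(1.0)

random.seed(1)  # fixed seed so the sketch is repeatable

# Keep drawing until the sample satisfies the condition X >= 100.
x = simulate_pareto()
while x < 100:
    x = simulate_pareto()

print(x)
```

Since P(X ≥ 100) = 1/100 for this distribution, the loop needs about 100 draws on average.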
>> What would you like to do? order cake
Here's your cake! 🎂
>> What would you like to do? order coffee
Here's your coffee! ☕️
>> What would you like to do? order cake
Here's your cake! 🎂
>> What would you like to do? quit
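The session above suggests a loop that keeps serving until it sees quit. A hedged sketch using break follows; to keep it runnable without keyboard input it reads commands from a list rather than calling input, and both the function name serve and the fallback reply are hypothetical.

```python
def serve(commands):
    """Reply to each command in turn, stopping at 'quit'."""
    replies = []
    for command in commands:
        if command == "quit":
            break  # leave the loop immediately
        elif command == "order cake":
            replies.append("Here's your cake! 🎂")
        elif command == "order coffee":
            replies.append("Here's your coffee! ☕️")
        else:
            replies.append("Sorry, I don't understand.")  # hypothetical fallback
    return replies

replies = serve(["order cake", "order coffee", "order cake", "quit", "order cake"])
for reply in replies:
    print(reply)
```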
What does this print out?
Math sometimes works..
What does this print out?
10
Python Functions
Here, name is a parameter and the value supplied is an argument.
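The parameter/argument distinction can be illustrated with a tiny function. The function greet below is hypothetical and is not necessarily the one used in the lecture.

```python
def greet(name):            # name is the parameter
    return f"Hello, {name}!"

greeting = greet("Patrick")  # "Patrick" is the argument
print(greeting)
```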
Assuming we have simulate_standard_normal, a function to generate $\mathrm{Normal}(0, 1)$ variables:
Note
We’ll cover random numbers next week (using numpy).
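A sketch of how default arguments could be used to build a general normal simulator on top of simulate_standard_normal. Both functions here are assumptions: the stand-in uses the standard library's random.gauss, and simulate_normal is a hypothetical name.

```python
import random

def simulate_standard_normal():
    # Stand-in for the assumed helper: one Normal(0, 1) draw.
    return random.gauss(0.0, 1.0)

def simulate_normal(mean=0.0, std=1.0):
    # Shift and scale a standard normal; the defaults give Normal(0, 1).
    return mean + std * simulate_standard_normal()

random.seed(42)
z = simulate_normal()               # uses both defaults
x = simulate_normal(1_000, 100)     # positional: mean=1000, std=100
y = simulate_normal(std=100)        # keyword: keep default mean, change std
print(z, x, y)
```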
E.g. to fit a Keras model, we use the .fit method:
model.fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto',
          callbacks=None, validation_split=0.0, validation_data=None,
          shuffle=True, class_weight=None, sample_weight=None,
          initial_epoch=0, steps_per_epoch=None, validation_steps=None,
          validation_batch_size=None, validation_freq=1,
          max_queue_size=10, workers=1, use_multiprocessing=False)
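Say we want all the defaults except changing use_multiprocessing=True. We could pass every argument positionally, in order, but it is much nicer to pass just that one keyword argument: model.fit(use_multiprocessing=True).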
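To try the positional-versus-keyword contrast without a Keras model, here is a sketch with a hypothetical stand-in function fit that copies only a few parameters from the signature above. It is not the Keras method and simply returns its settings.

```python
def fit(x=None, y=None, batch_size=None, epochs=1, verbose="auto",
        workers=1, use_multiprocessing=False):
    # Hypothetical stand-in: report the settings instead of training anything.
    return {
        "epochs": epochs,
        "workers": workers,
        "use_multiprocessing": use_multiprocessing,
    }

# Positional: every earlier argument must be repeated, in the right order.
positional = fit(None, None, None, 1, "auto", 1, True)

# Keyword: name only the argument we want to change.
keyword = fit(use_multiprocessing=True)

print(positional == keyword)
```

The keyword form is shorter and also keeps working if the order of the other parameters changes.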
What does the following print out?
[4]
lims = limits([1, 2, 3, 4, 5])
smallest_num = lims[0]
largest_num = lims[1]
print(f"The numbers are between {smallest_num} and {largest_num}.")
The numbers are between 1 and 5.
smallest_num, largest_num = limits([1, 2, 3, 4, 5])
print(f"The numbers are between {smallest_num} and {largest_num}.")
The numbers are between 1 and 5.
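The definition of limits is not shown in the code above. A minimal sketch, assuming it returns the smallest and largest values together as a tuple:

```python
def limits(numbers):
    # Returning two comma-separated values packs them into one tuple.
    return min(numbers), max(numbers)

lims = limits([1, 2, 3, 4, 5])
print(type(lims))

# Tuple unpacking assigns each element to its own variable.
smallest_num, largest_num = lims
```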
This doesn’t just work for functions with multiple return values:
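For instance, unpacking works on any tuple or list, not only on return values. The values in this sketch are arbitrary illustrations:

```python
# Unpack a list of two elements.
first, second = [10, 20]

# Assign two variables at once, then swap them without a temporary variable.
a, b = 1, 2
a, b = b, a

print(first, second, a, b)
```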
Import syntax
Source: Learnbay.co, Python libraries for data analysis and modeling in Data science, Medium.
as
Want keras.models.Sequential().
Alternatives using from:
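The import variants can be sketched with a standard-library module so that the example runs without Keras installed. The comments show the analogous Keras forms, which are not executed here, and the file name is a hypothetical example.

```python
# Plain import: use the full dotted name.
import os.path                 # cf. import keras -> keras.models.Sequential()

# Import with an alias to shorten the name.
import os.path as osp          # cf. import keras.models as km -> km.Sequential()

# Import a submodule from a package.
from os import path            # cf. from keras import models -> models.Sequential()

# Import one name directly.
from os.path import basename   # cf. from keras.models import Sequential -> Sequential()

filename = "data/claims.csv"   # hypothetical file name
results = [
    os.path.basename(filename),
    osp.basename(filename),
    path.basename(filename),
    basename(filename),
]
print(results)
```

All four calls reach the same function; only the name used to reach it differs.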
Lambda functions
Example: how to sort strings by their second letter?
If you try help(sorted) you’ll find the key parameter.
The length of 'Josephine' is 9.
The length of 'Patrick' is 7.
The length of 'Bert' is 4.
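The printed lengths above suggest a function passed as key that reports each length as it is called. A sketch under that assumption, with the hypothetical function name get_length:

```python
names = ["Josephine", "Patrick", "Bert"]

def get_length(name):
    # sorted calls this once per element and sorts by the returned value.
    print(f"The length of '{name}' is {len(name)}.")
    return len(name)

by_length = sorted(names, key=get_length)
print(by_length)
```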
The second letter of 'Josephine' is 'o'.
The second letter of 'Patrick' is 'a'.
The second letter of 'Bert' is 'e'.
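Sorting by the second letter can be sketched with either a named function or a lambda as the key. The function name second_letter is an assumption.

```python
names = ["Josephine", "Patrick", "Bert"]

def second_letter(name):
    return name[1]

# A named function as the key...
by_second = sorted(names, key=second_letter)

# ...or the same thing as a one-off anonymous (lambda) function.
by_second_lambda = sorted(names, key=lambda name: name[1])

print(by_second_lambda)
```

The lambda avoids defining a function that is only used once.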
Caution
Don’t use lambda as a variable name! You commonly see lambd or lambda_ or λ.
Example, opening a file:
Most basic way is:
Haikus from http://www.libertybasicuniversity.com/lbnews/nl107/haiku.htm
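The file-opening code itself is not shown above. A hedged sketch contrasting the basic open/close pattern with a with block; the temporary file and its placeholder text are assumptions made so that the example is self-contained.

```python
import os
import tempfile

# Create a small placeholder file to read (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), "haiku.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\nthird line\n")

# Most basic way: open, read, and remember to close.
f = open(path, "r")
text_basic = f.read()
f.close()

# With a context manager the file is closed automatically,
# even if an error occurs inside the block.
with open(path, "r") as f:
    text_with = f.read()

print(text_with)
```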
from watermark import watermark
print(watermark(python=True, packages="keras,matplotlib,numpy,pandas,seaborn,scipy,torch,tensorflow,tf_keras"))
Python implementation: CPython
Python version : 3.11.9
IPython version : 8.24.0
keras : 3.3.3
matplotlib: 3.9.0
numpy : 1.26.4
pandas : 2.2.2
seaborn : 0.13.2
scipy : 1.11.0
torch : 2.3.1
tensorflow: 2.16.1
tf_keras : 2.16.0
If you come from C (e.g. you are a joint computer science student) and are interested in Python’s internals, you may like this “How variables work in Python” video.
help
pip install ...
range
type