What is Python?¶
Python is one of the most widely used programming languages today. It is a high-level, interpreted programming language that emphasizes readability. Python's readability makes it often the language of choice for both those beginning to learn programming and large collaborative projects.
Unlike compiled languages like C++ and Rust, Python is an interpreted language. This means that the Python interpreter translates and executes the code at runtime (the standard CPython interpreter first compiles it to bytecode), rather than compiling it down to a native binary prior to running. This has the benefit of making Python an excellent language for quickly developing and debugging code.
Interactive environments like Jupyter Notebooks provide a clear framework for developing, testing, sharing, and presenting code.
Installing Python and working with Environments¶
We will use virtual environments in this tutorial. I recommend that you use an environment manager such as conda or mamba/micromamba.
Mamba/Micromamba is a "fast, robust, and cross-platform package manager" that offers significant performance advantages over conda when installing and resolving packages.
We can create a new environment using the following command:
mamba/conda create -n my_environment python=3.9 numpy matplotlib
Here, we are creating a new environment called my_environment, which installs Python 3.9 and the packages numpy and matplotlib. We can activate this environment using:
mamba/conda activate my_environment
Running which python confirms that we are using the Python installed within our environment.
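The same check can be made from inside Python, which is handy in a notebook where a shell is not at hand. This is a minimal sketch using only the standard library; the printed path and version depend on your own installation:

```python
import sys

# Path of the interpreter running this code. Inside an activated
# environment it should point into that environment's directory.
print(sys.executable)

# Version of the running interpreter as (major, minor, micro)
version = tuple(sys.version_info[:3])
print(version)
```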
Environments enable us to install different, otherwise conflicting, package versions for different projects. For instance, suppose we need to run older code that depends on Python 2.7, which is incompatible with modern packages. In that scenario, we can create an environment with Python 2.7 and install versions of packages compatible with it. This won't affect any other environment we've created.
For this workshop, we will utilize the following environment:
mamba/conda create -n workshop -c conda-forge python=3.10 numpy matplotlib scipy pandas jupyter jupyterlab ipykernel
Installing additional packages¶
Once in the environment, we can install additional packages using the install command:
mamba/conda install scipy
This installs the package scipy, which is a scientific computing package (optimization, integration, statistics, and more) compatible with numpy data types. We can also remove a package using:
mamba/conda remove scipy
which would remove the package scipy. Some packages will require installation from a specific collection of packages (a "channel"):
mamba/conda install -c conda-forge astroquery
This installs the package astroquery, a package that allows querying astronomical databases like Simbad, which is part of the collection conda-forge.
If we ever want to see which packages are currently installed, we can use something like:
mamba/conda list
which gives a list of installed packages and their versions. We can output this to a machine-readable file using:
mamba/conda list -e > requirements.txt
Ensuring that users are using a standard environment can help debug version-specific bugs.
Hello World¶
Python is a dynamically typed language, which means that the type of a variable does not need to be known until that variable is used. This also means that we can change the type of a variable at any stage of the code.
We can define variables like:
my_string = "Hello"
We can also overwrite variables like:
# Defining a bool
my_variable = True
# Redefining as a float
my_variable = 4.23
# Redefining as a string
my_variable = "goodbye"
In Python we can print output using the print() function:
# Defining as a string
my_variable = "Hello, world"
print (my_variable)
# Redefining a bool
my_variable = True
print (my_variable)
# Redefining as a float
my_variable = 4.23
print (my_variable)
print ("Goodbye, World!")
Hello, world True 4.23 Goodbye, World!
When printing, we can format strings using f-strings:
pi = 3.14159265359
print (f"Pi to 5 digits = {pi:0.5f}")
print (f"Pi to 3 digits = {pi:0.3f}")
print (f"Pi to 4 digits in scientific notation = {pi:0.4e}")
print (f"Pi as an integer = {pi:0.0f}, minus pi to 1 digit {-pi:0.1f}")
Pi to 5 digits = 3.14159 Pi to 3 digits = 3.142 Pi to 4 digits in scientific notation = 3.1416e+00 Pi as an integer = 3, minus pi to 1 digit -3.1
We can also define strings to format later:
my_string = "{name}'s favorite number is {number}"
formatted = my_string.format( name = "Ste", number = 42)
print(formatted)
Ste's favorite number is 42
And we can add strings together and take slices of strings:
# Adding to a string
extended = formatted + " and he likes python"
print (extended)
# Dropping the last 6 characters ("python") and adding "pi"
print (extended[:-6] + "pi")
Ste's favorite number is 42 and he likes python Ste's favorite number is 42 and he likes pi
Basic Operations¶
- Addition: a + b
- Multiplication: a * b
- Division: a / b
- Integer division: a // b
- Modulus: a % b
- Power: a ** b
- Equal: a == b
- Not equal: a != b
- Less than: a < b
- Less than or equal to: a <= b
- Greater than: a > b
- Greater than or equal to: a >= b
Other logical statements:¶
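- Or:
  - a or b (logical or)
  - a | b (bitwise or; gives the same result as or when a and b are bools)
- And:
  - a and b (logical and)
  - a & b (bitwise and; gives the same result as and when a and b are bools)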
For example, if a multiplied by b is less than c divided by d, and e is greater than 10:
(a * b < c / d) and (e > 10)
a = 7
b = 2.2
# Normal division
c = a/b
# Integer division
d = a//b
# Modulus (remainder)
e = a % b
print(f"{a} / {b} = {c}")
print(f"{a} // {b} = {d}")
print(f"{a} % {b} = {e}")
print (f"a > 2: {a > 2}")
7 / 2.2 = 3.1818181818181817 7 // 2.2 = 3.0 7 % 2.2 = 0.39999999999999947 a > 2: True
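The two spellings of "and" and "or" listed above are not interchangeable in general. The sketch below (the values are chosen only for illustration) shows that they agree on bools but differ on integers:

```python
a = 7
e = 12

# `and`/`or` work on truth values and short-circuit:
# the right-hand side is only evaluated if it is needed.
logical = (a > 2) and (e > 10)

# `&`/`|` are bitwise operators. On bools they agree with `and`/`or` ...
bitwise = (a > 2) & (e > 10)

# ... but on integers they combine the bits: 6 is 0b110 and 3 is 0b011.
bits_and = 6 & 3       # 0b010
bits_or = 6 | 3        # 0b111
logical_and = 6 and 3  # returns the last value that was evaluated

print(logical, bitwise, bits_and, bits_or, logical_and)
```

The parentheses around each comparison matter when using & and |, because these operators bind more tightly than comparisons such as > and <.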
Basic Data Types¶
Python has several basic data types:
- int: integers: -3, -2, -1, 0, 1, 2, 3, etc.
- float: floating-point numbers: 3.14, 42.0, etc.
- bool: boolean. Note in Python, True/False start with a capital letter. x = false will give an error, while x = False will not.
- str: strings of characters. In Python, strings are wrapped in single ('') or double ("") quotation marks. The two kinds can be mixed, as long as each string opens and closes with the same kind: my_str = "hello", my_str = 'apple', answer = 'Computer says "no"' - all of these will work just fine. my_str = "Goodbye' will not work since we need to match the quotation marks properly.
We can cast from one data type to another using the format:
x = 1.3
y = int(x)
Here, the value of x is converted to an int and stored in y. We can also determine the type of a variable using the type function:
type(x)
x = 10
y = float(x)
z = bool(x)
v = str(x)
print (x,y,z,v)
print (type(x), type(y), type(z), type(v))
10 10.0 True 10 <class 'int'> <class 'float'> <class 'bool'> <class 'str'>
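Casting can change the value as well as the type, so it is worth checking a few edge cases. A short sketch with illustrative values:

```python
# int() drops the fractional part (it truncates toward zero)
truncated_pos = int(1.9)
truncated_neg = int(-1.9)

# round() rounds to the nearest integer instead
rounded = round(1.9)

# Strings can be converted when they hold a valid number of that type
from_string = float("3.5")

# ... otherwise a ValueError is raised
try:
    int("3.5")
    parsed = True
except ValueError:
    parsed = False

print(truncated_pos, truncated_neg, rounded, from_string, parsed)
```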
Basic Collections of Data¶
Python offers several ways to organize and store data efficiently. These data structures play a vital role in managing and manipulating information within a program.
Lists¶
A list in Python is a versatile and mutable collection of items, ordered and enclosed within square brackets []. It allows storing various data types, including integers, strings, or even other lists. Lists are dynamic, meaning elements can be added, removed, or changed after creation using methods like append(), insert(), remove(), or by directly assigning values to specific indices.
my_list = [1, 2, 3, 'apple', 'banana', 'cherry']
# Create a list
my_list = [4,5.2,-1.3]
print (my_list)
# Add a new element to the end
my_list.append(21)
print (my_list)
# "Pop" out the element at index 1 (the 2nd element, since indexing starts at 0)
element = my_list.pop(1)
print (element, my_list)
# Reassign a value
my_list[0] = -999
print (my_list)
# Lists can include multiple data types
my_list[-2] = "Hello"
print(my_list)
[4, 5.2, -1.3] [4, 5.2, -1.3, 21] 5.2 [4, -1.3, 21] [-999, -1.3, 21] [-999, 'Hello', 21]
List slicing¶
In Python we can slice lists (and strings and arrays, more on arrays later) to access subsections of the list. We use the syntax:
my_list[start:stop]
where start and stop are the range that we want to access, with stop being exclusive. We can also access the last element with:
my_list[-1]
with -1 being the last element (-2 being the second last... etc.). To slice from the third element (index 2) up to, but not including, the second-last element we would do:
my_list[2:-2]
# Define a new list
my_list = [1,2,3,4,5,6,7,8,9, 1,2,3,4]
print (my_list)
# Create a slice excluding the first and last
my_sub_list = my_list[1:-1]
print(my_sub_list)
# get the length of the list
print (f"The list is {len(my_list)} elements long")
[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4] [2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3] The list is 13 elements long
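Slices also accept omitted bounds and an optional third value, the step. A brief sketch of the common variants, using a fresh list for illustration:

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = my_list[:3]      # start omitted: from the beginning
last_three = my_list[-3:]      # stop omitted: through to the end
every_second = my_list[::2]    # third value is the step
reversed_list = my_list[::-1]  # a negative step walks backwards

# Slicing a list returns a new list, so the original is unchanged
first_three[0] = -999
print(my_list[0], first_three, last_three, every_second, reversed_list)
```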
Sets¶
Sets are a useful collection in Python. They are an unordered and mutable collection of unique elements. Sets are enclosed in curly braces {} and support set operations like union, intersection, and difference. They are efficient for tasks requiring unique elements and membership testing.
# Create a set using {}
first_set = {1,2,3,4,4,5}
print (first_set)
# create a set from the list
my_set = set(my_list)
print (my_set)
# Add values to the set
my_set.add(13)
my_set.add(1)
print (my_set)
# Sets can have multiple data types
my_set.add("Hello")
print (my_set)
{1, 2, 3, 4, 5} {1, 2, 3, 4, 5, 6, 7, 8, 9} {1, 2, 3, 4, 5, 6, 7, 8, 9, 13} {1, 2, 3, 4, 5, 6, 7, 8, 9, 13, 'Hello'}
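The set operations mentioned above each have an operator form. A small sketch with two illustrative sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b          # elements in either set
intersection = a & b   # elements in both sets
difference = a - b     # elements in a but not in b
symmetric = a ^ b      # elements in exactly one of the sets

# Membership testing
has_three = 3 in a
print(union, intersection, difference, symmetric, has_three)
```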
Tuples¶
A tuple is similar to a list but is immutable once created, denoted by parentheses (). Tuples are often used to store related pieces of information together and can be slightly faster and more memory-efficient than lists because they cannot change size. They are commonly utilized for items that shouldn't be changed, such as coordinates or configuration settings.
my_tuple = (4, 5, 6, 'dog', 'cat', 'rabbit')
my_tup = (1,2,3,4, "Apple")
print (my_tup)
my_tup[0] = -2
print (my_tup)
(1, 2, 3, 4, 'Apple')
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[12], line 3 1 my_tup = (1,2,3,4, "Apple") 2 print (my_tup) ----> 3 my_tup[0] = -2 4 print (my_tup) TypeError: 'tuple' object does not support item assignment
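Since a tuple cannot be modified in place, the usual pattern is to unpack it and build a new one. A minimal sketch using a made-up coordinate:

```python
# A coordinate stored as a tuple
point = (2.5, -1.0, 4.0)

# Unpack the tuple into separate variables
x, y, z = point

# "Changing" a tuple means building a new one
moved = (x + 1.0, y, z)

# Because they are immutable, tuples can be used as dictionary keys
labels = {(0, 0): "origin", (1, 0): "unit x"}
print(moved, labels[(0, 0)])
```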
Dictionaries¶
A dictionary is a collection of key-value pairs enclosed in curly braces {}. Each element in a dictionary is accessed by its associated key rather than an index (since Python 3.7, dictionaries also remember the order in which keys were inserted). Dictionaries are suitable for storing data where retrieval by a specific key is a priority. They are flexible and allow storing various data types as values.
my_dict = {'name': 'Alice', 'age': 25, 'country': 'USA'}
my_dict = {}
fmt_string = "entry_{entry}"
for i in range(10):
    my_dict[fmt_string.format(entry=i)] = -1
print (my_dict["entry_9"])
if "new_key" in my_dict:
    print ("Key exists")
if "entry_1" in my_dict:
    print ("Key exists: ", my_dict["entry_1"])
-1 Key exists: -1
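Beyond indexing by key, a few dictionary methods cover most day-to-day use. A short sketch reusing the example dictionary from above (the 'email' key is hypothetical and deliberately missing):

```python
my_dict = {'name': 'Alice', 'age': 25, 'country': 'USA'}

# Loop over key-value pairs
for key, value in my_dict.items():
    print(f"{key}: {value}")

# get() returns a default instead of raising KeyError for a missing key
email = my_dict.get('email', 'unknown')

# Update an entry, then remove one (pop returns the removed value)
my_dict['age'] = 26
country = my_dict.pop('country')

keys = list(my_dict.keys())
print(email, country, keys)
```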
Looping¶
In Python, there are two primary methods of looping: for loops and while loops.
For Loops¶
for loops use the syntax for variable in iterable, where iterable is some sequence-like object, and variable represents the current element within the loop. The block of code to be executed within the loop is designated by indentation. Python's standard is to use 4 spaces for indentation, but using tabs (consistently) is also common (avoid mixing spaces and tabs). For example:
my_list = [1, 2, 3, 4, 5]
for num in my_list:
    print(num)
This loop iterates through the elements of my_list, assigning each element to the variable num, and then prints each element.
The range() function is often used with for loops to generate a sequence of numbers. It allows iterating a specific number of times or generating a sequence within a range.
# Create an empty list
x = []
# range(10) returns an iterable that goes from 0 up to, but not including, 10 (0,1,...,9)
for i in range(10):
    # add i to our list
    x.append(i)
print (x)
# The list x is also iterable
for ix in x:
    # Print the value squared
    print (ix**2)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 0 1 4 9 16 25 36 49 64 81
We can also use "list comprehension" to generate a list from simple loops. This takes the format:
array = [ some_function(i) for i in some_loop() ]
We can also use list comprehension to do some filtering:
array = [ some_function(i) for i in some_loop() if some_condition(i) ]
A filtering if must come after the for clause. An if placed before the for is a conditional expression, which chooses between two values and therefore requires an else:
array = [ some_function(i) if some_condition(i) else 0 for i in some_loop() ]
# Create a list with numbers between 0 and 100
my_list = [ x for x in range(100)]
# Create a list with only even numbers between 0 and 100
my_even_list = [x for x in range(100) if x % 2 == 0]
# Create a list with 0 for even numbers and 1 for odd numbers
# Between 0 and 100
my_conditional_list = [ 0 if x %2 == 0 else 1 for x in range(100) ]
print (my_list[:5])
print (my_even_list[:5])
print (my_conditional_list[:5])
[0, 1, 2, 3, 4] [0, 2, 4, 6, 8] [0, 1, 0, 1, 0]
While¶
while loops execute a block of code as long as a specified condition is True. Care should be taken to avoid infinite loops where the condition always remains True. The syntax for a while loop is while condition: followed by an indented block of code.
The break statement can be used to exit a loop prematurely based on a condition, while continue skips the rest of the current iteration and proceeds to the next one.
When evaluating the condition, anything that isn't False, None, zero (0 or 0.0), or an empty string or collection ("", [], {}, etc.) is considered to be True.
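A quick way to check how a value will be treated in a condition is to pass it to bool(). A small sketch; note that besides False, 0 and None, empty strings and empty collections also count as false:

```python
# Values that count as False in a condition
falsy = [False, None, 0, 0.0, "", [], {}, ()]
# Values that count as True (the string "0" is not empty, so it is True)
truthy = [True, 1, -1, 0.1, "0", [0], {"a": 1}]

falsy_results = [bool(v) for v in falsy]
truthy_results = [bool(v) for v in truthy]
print(falsy_results)
print(truthy_results)
```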
i = 0
# This will not run
while None:
    i+=1
    print (i)
    if i > 5:
        break
i = 0
# Use while loop to print the numbers up to (but not including) 10
while i < 10:
    print (i)
    i+=1
0 1 2 3 4 5 6 7 8 9
i = 0
# Use an if condition to break this infinite loop after the 5th iteration
while "hello":
    print (i)
    i+=1
    if i >= 5:
        break
0 1 2 3 4
Without the if statement here we would have an infinite loop! We can exit out of a loop with a break command or we can skip to the end of the current iteration using the continue command.
Let's use if statements to see how these work.
if-elif-else Statements¶
if-elif-else statements allow us to control the flow of the code based on conditions. They take the syntax:
if condition1:
    # condition 1 code
elif condition2:
    # condition 2 code
elif condition3:
    # condition 3 code
else:
    # default code
Notice that the if and elif statements take logical expressions, while else does not. You can have any number of elif branches but only one if branch and at most one else branch.
This construct allows for branching based on multiple conditions. Python evaluates each condition sequentially. If condition1 is true, it executes the code block under condition1 and skips the remaining branches. If condition1 is false, it checks condition2, and so on. If none of the conditions are true, the code block under else (if provided) is executed as the default action.
even_sum = 0
odd_sum = 0
for i in range(100):
    # Exit the loop when i goes above or equal to 10
    if i >= 10:
        break
    # Skip the i = 3 or the i = 0 iteration
    elif (i == 3) or (i == 0):
        continue
    elif i % 2 == 0:
        even_sum += i
        print (f"{i} -> Even Sum: {even_sum}")
    else:
        # Add i itself (not 1) so that this is the sum of the odd numbers
        odd_sum += i
        print (f"{i} -> Odd Sum: {odd_sum}")
1 -> Odd Sum: 1 2 -> Even Sum: 2 4 -> Even Sum: 6 5 -> Odd Sum: 6 6 -> Even Sum: 12 7 -> Odd Sum: 13 8 -> Even Sum: 20 9 -> Odd Sum: 22
Functions¶
Creating functions is an effective method to enhance code reusability and streamline debugging. When there's a block of code intended to be executed multiple times, encapsulating it within a function proves beneficial. This practice minimizes human error by necessitating modifications in only one location. Moreover, employing functions to execute smaller code segments can significantly enhance code readability and simplify the debugging process.
In Python, we define a function using the def keyword:
# Simple function to print hello
def print_hello():
    print("Hello")
# We can take in arguments
def print_message(msg):
    print(msg)
# Simple function to take in two arguments, add them together and return the sum
def add_numbers(a,b):
    c = a + b
    # Use the print message function to print c
    print_message(c)
    return c
# We can also pass a function to a function
def repeat(func, n, args):
    for i in range(n):
        # Using the *args will unwrap the tuple and pass to the function
        func(*args)
print_hello()
print_message(42)
c = add_numbers(1.3, 5)
# We can specify which variable is which by specifying the argument name
repeat(func = add_numbers, n = 5, args=(1.3,2.1))
Hello 42 6.3 3.4000000000000004 3.4000000000000004 3.4000000000000004 3.4000000000000004 3.4000000000000004
We can also write lambda functions, which are short inline functions:
# Lambda function to get square root
my_function = lambda x : x**0.5
# A larger function that we will wrap
def larger_function(x, y):
    return x**2 / y**3
# Lambda function acting as a wrapper, calling larger_function with y = 1.5
my_wrapper = lambda x : larger_function(x, 1.5)
print (my_function(4))
print (my_wrapper(3))
print (larger_function(3,4))
2.0 2.6666666666666665 0.140625
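A common place to use a lambda is as a short throwaway function passed to another function. The sketch below uses the built-in sorted() with its key argument; the word and score lists are made up for illustration:

```python
words = ["banana", "fig", "cherry", "kiwi"]

# Sort by word length instead of alphabetically
by_length = sorted(words, key=lambda w: len(w))

# Sort (name, score) pairs by score, highest first
scores = [("Ann", 82), ("Bob", 95), ("Cy", 78)]
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)

print(by_length)
print(ranked)
```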
Functions, Naming Conventions and Documentation¶
When writing functions and classes (more on this later), we should conform to a consistent convention. This helps both users and developers to better understand the code, improving the ability to use and develop the code.
The convention we'll follow in this example is the Google Python Style Guide. Let's look at some examples of why this is useful.
Consider the following. We have a function calc which takes three arguments (x, b and i).
def calc(x, b, i):
    x[i] = x[i] / b
    return x
The function scales an element in x by 1/b. We can choose a name to better describe what the function does.
def scale_element(x, b, i):
    x[i] = x[i] / b
    return x
The user now knows that the function will scale an element of the array, but the user still doesn't know what the arguments are or what is returned. We can add a docstring to help with this.
help(scale_element)
Help on function scale_element in module __main__: scale_element(x, b, i)
This help message is automatically generated from the "docstring" of the function. The docstring is a short description of a function that we write at the start of the function body. Without one, help() can only show the function signature, as above.
Following the Google Python Style Guide, a good template for your docstring is:
# The r prefix (raw string) keeps the backslashes in the math expression intact
def function(x,y,z):
    r"""One line summary of my function

    More detailed description of my function, potentially showing
    some math relation:
    $\frac{dy}{dx} = x^2$

    Args:
        x: description of x
        y: description of y
        z: description of z

    Returns:
        description of what is returned

    Examples:
        Some example
        >>> function (x,y,z)
        return_value

    Raises:
        Error: Error raised and description of that error
    """
This is quite verbose but has huge benefits for the user and allows us to self-document our code.
def scale_element(x, b, i):
    """Scale an element of the input array

    Scale an element of the array by a constant

    Args:
        x: list of values
        b: constant to scale by
        i: index of element to scale

    Returns:
        Scaled a list with the element i scaled by 1/b

    Examples:
        >>> scale_element([1,2,3,4], 2, 1)
        [1, 1.0, 3, 4]

    """
    x[i] = x[i] / b
    return x
Let's break down the sections here:
"""Scale an element of the input array
We start off with a short one-sentence description of the function.
Scale an element of the array by a constant
We then give a more detailed description of the function and how to use it.
Args:
    x: list of values
    b: constant to scale by
    i: index of element to scale
We list the arguments by name and what they are.
Returns:
    Scaled a list with the element i scaled by 1/b
We describe what is returned by the function.
Examples:
    >>> scale_element([1,2,3,4], 2, 1)
    [1, 1.0, 3, 4]
"""
We give a usage example of the function and the expected output. The user can then access this helpful message at any time using:
help(scale_element)
Help on function scale_element in module __main__: scale_element(x, b, i) Scale an element of the input array Scale an element of the array by a constant Args: x: list of values b: constant to scale by i: index of element to scale Returns: Scaled a list with the element i scaled by 1/b Examples: >>> scale_element([1,2,3,4], 2, 1) [1, 1.0, 3, 4]
We can do better still. The user doesn't know what type of data to pass. For example, if x is just a float rather than a list, this will fail. We can address this with "type hinting". Type hinting is an optional feature in Python where we state the types of data that a function expects and returns.
For example:
def multiply(a, b):
    return a * b
# what happens when we pass two floats to this function?
multiply(1.5, 2.3)
3.4499999999999997
# What happens when we pass a string and an int?
multiply("apple", 6)
'appleappleappleappleappleapple'
We originally wrote our function with ints or floats in mind, but we got behaviour we weren't expecting when passing a string. We can use type hinting to be explicit about what should be passed to the function.
def multiply(a : float, b : float ) -> float:
    return a * b
# what happens when we pass two floats to this function?
multiply(1.5, 2.3)
3.4499999999999997
# What happens when we pass a string and an int?
multiply("apple", 6)
'appleappleappleappleappleapple'
This doesn't stop us from calling the function with a string and int, but it does provide additional information to the help() function.
help(multiply)
Help on function multiply in module __main__: multiply(a: float, b: float) -> float
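Type hints are stored on the function object but are not checked when the function is called. A minimal sketch that inspects them through the function's __annotations__ attribute (external static checkers such as mypy can use the same hints to flag mismatches before the code runs):

```python
def multiply(a: float, b: float) -> float:
    return a * b

# The hints are available as a plain dictionary
hints = multiply.__annotations__
print(hints)

# The call still succeeds with types that do not match the hints
result = multiply("ab", 3)
print(result)
```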
def scale_element(x : list, b : float, i : int) -> list:
    """Scale an element of the input array

    Scale an element of the array by a constant

    Args:
        x: list of values
        b: constant to scale by
        i: index of element to scale

    Returns:
        Scaled a list with the element i scaled by 1/b

    Examples:
        >>> scale_element([1,2,3,4], 2, 1)
        [1, 1.0, 3, 4]

    """
    x[i] = x[i] / b
    return x
From the first line we can see:
def scale_element(x : list, b : float, i : int) -> list:
that x is expected to be a list, b is expected to be a float, i is expected to be an int and that the function will return a list.
help(scale_element)
Help on function scale_element in module __main__: scale_element(x: list, b: float, i: int) -> list Scale an element of the input array Scale an element of the array by a constant Args: x: list of values b: constant to scale by i: index of element to scale Returns: Scaled a list with the element i scaled by 1/b Examples: >>> scale_element([1,2,3,4], 2, 1) [1, 1.0, 3, 4]
This is better, but let's try to anticipate potential errors:
scale_element(3, 2, 1)
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[40], line 1 ----> 1 scale_element(3, 2, 1) Cell In[38], line 19, in scale_element(x, b, i) 1 def scale_element(x : list, b : float, i : int) -> list: 2 """Scale an element of the input array 3 4 Scale an element of the array by a constant (...) 17 18 """ ---> 19 x[i] = x[i] / b 20 return x TypeError: 'int' object is not subscriptable
This error message isn't too helpful... Let's write our own
def scale_element(x : list, b : float, i : int) -> list:
    """Scale an element of the input array

    Scale an element of the array by a constant

    Args:
        x: list of values
        b: constant to scale by
        i: index of element to scale

    Returns:
        Scaled a list with the element i scaled by 1/b

    Examples:
        >>> scale_element([1,2,3,4], 2, 1)
        [1, 1.0, 3, 4]

    Raises:
        ValueError: If x is not a list, i is not an int or b is neither an int nor a float
        IndexError: If i is out of bounds in list x

    """
    if not isinstance(x, list):
        raise ValueError(f'x is expected to be a list, received {type(x)}')
    if not isinstance(i, int):
        raise ValueError(f'i is expected to be an int, received {type(i)}')
    if not (isinstance(b, int) or isinstance(b, float)):
        raise ValueError(f'b is expected to be an int or float, received {type(b)}')

    if i >= len(x):
        raise IndexError(f'Index i ({i}) is out of bounds of array x (with len {len(x)})')
    x[i] = x[i] / b
    return x
scale_element(3, 2, 1)
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[43], line 1 ----> 1 scale_element(3, 2, 1) Cell In[42], line 24, in scale_element(x, b, i) 2 """Scale an element of the input array 3 4 Scale an element of the array by a constant (...) 21 22 """ 23 if not isinstance(x, list): ---> 24 raise ValueError(f'x is expected to be a list, received {type(x)}') 25 if not isinstance(i, int): 26 raise ValueError(f'i is expected to be an int, received {type(i)}') ValueError: x is expected to be a list, received <class 'int'>
scale_element([3,2,2], 2, 5)
--------------------------------------------------------------------------- IndexError Traceback (most recent call last) Cell In[44], line 1 ----> 1 scale_element([3,2,2], 2, 5) Cell In[42], line 31, in scale_element(x, b, i) 28 raise ValueError(f'b is expected to be an int or float, received {type(b)}') 30 if i >= len(x): ---> 31 raise IndexError(f'Index i ({i}) is out of bounds of array x (with len {len(x)})') 32 x[i] = x[i] / b 33 return x IndexError: Index i (5) is out of bounds of array x (with len 3)
help(scale_element)
Help on function scale_element in module __main__: scale_element(x: list, b: float, i: int) -> list Scale an element of the input array Scale an element of the array by a constant Args: x: list of values b: constant to scale by i: index of element to scale Returns: Scaled a list with the element i scaled by 1/b Examples: >>> scale_element([1,2,3,4], 2, 1) [1, 1.0, 3, 4] Raises: ValueError: If x is not a list, i is not an int or b is neither an int nor a float IndexError: If i is out of bounds in list x
import doctest
doctest.testmod(verbose=True)
Trying: scale_element([1,2,3,4], 2, 1) Expecting: [1, 1.0, 3, 4] ok 3 items had no tests: __main__ __main__.calc __main__.multiply 1 items passed all tests: 1 tests in __main__.scale_element 1 tests in 4 items. 1 passed and 0 failed. Test passed.
TestResults(failed=0, attempted=1)
Packages¶
Python boasts an extensive array of packages developed by the community. In Python, we use the import statement to bring in packages or specific sections of packages into our code.
import package as p
In the example above, we import a package named package. The as p statement allows us to assign an alias, p, to the imported package. This aliasing technique proves beneficial when accessing objects from within a package that might share a common name with objects in other packages. For instance:
import numpy as np
import math as m
print(np.sin(0))
print(m.sin(0))
0.0 0.0
Here we have imported numpy using the alias np and math using the alias m. Then, we call the sin function from both packages, specifying which version of the sin function we want to invoke.
Numpy is a crucial package in scientific programming, and we'll delve deeper into its functionalities shortly.
We can also import only a section of a package. For example:
import matplotlib.pyplot as plt
from scipy.stats import chi
Here, we import the pyplot sub-package from the larger matplotlib package and assign it the alias plt. Additionally, we import chi from the stats sub-package of the scipy package.
Both packages hold significant importance:
Matplotlib is an extensive library enabling the creation of static, animated, and interactive visualizations in Python. It offers a plethora of tools for various types of plots, charts, and graphical representations.
SciPy encompasses a wide range of scientific computing tools, providing algorithms for optimization, integration, interpolation, solving eigenvalue problems, handling algebraic and differential equations, statistical computations, and more. It's a fundamental package for scientific and technical computing in Python.
Working with Numpy¶
Numpy offers highly optimized functionality for typical matrix and vector operations, with the cornerstone being the numpy array. Arrays resemble lists in their mutability but differ in that they can only contain a single data type.
# Define an array
x = np.array([0,1,2,3,4])
y = x**2
print (x)
print (y)
# np.ndarray is the array type (np.array is the function that creates one)
def add_2(arr : np.ndarray ) -> np.ndarray:
    """Add 2 to the value of the array

    Args:
        arr: array of values

    Returns:
        arr + 2
    """
    return arr + 2
z = add_2(x)
print(z)
print ( (y - x) / z )
[0 1 2 3 4] [ 0 1 4 9 16] [2 3 4 5 6] [0. 0. 0.5 1.2 2. ]
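def subtract_and_add(x : np.ndarray) -> np.ndarray:
    """Subtract and add to an array

    Subtract 5 from the array and return the array + 10

    Args:
        x : input numpy array

    Returns:
        z : x - 5 + 10
    """
    # z is just another name for the same array as x, not a copy
    z = x
    # so this in-place subtraction also modifies the caller's array
    z -= 5
    return z + 10
x_data = np.arange(0,10,1)
print (x_data)
y_data = subtract_and_add(x_data)
# What happened here? x_data itself was changed by the function!
print (x_data)
print (y_data)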
[0 1 2 3 4 5 6 7 8 9] [-5 -4 -3 -2 -1 0 1 2 3 4] [ 5 6 7 8 9 10 11 12 13 14]
def subtract_and_add_copy(x):
    """Subtract and add to an array

    Subtract 5 from the array and return the array + 10.
    A copy is used to prevent modifying the input data

    Args:
        x : input numpy array

    Returns:
        z : x - 5 + 10
    """
    z = x.copy()
    z -= 5
    return z + 10
x_data = np.arange(0,10,1)
print (x_data)
y_data = subtract_and_add_copy(x_data)
# This time x_data is unchanged, since the function worked on a copy
print (x_data)
print (y_data)
[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4 5 6 7 8 9] [ 5 6 7 8 9 10 11 12 13 14]
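The difference between the two functions comes down to whether a second name refers to the same array or to a copy. A stripped-down sketch of that behavior (the variable names alias and independent are only illustrative):

```python
import numpy as np

x = np.arange(5)

# Plain assignment does not copy: both names refer to the same array
alias = x
alias -= 1                 # in-place, so x changes as well
same_object = alias is x

# An explicit copy is independent of the original
y = np.arange(5)
independent = y.copy()
independent -= 1           # y is untouched

print(x, y, same_object)
```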
Numpy arrays allow us to filter them using an array mask. We can index an array with a boolean array of the same size to select the items we want.
x = np.array([1,2,3,4])
x_mask = np.array([True, False, True, True])
# We can select by indexing by the mask
print (x[x_mask])
# We can invert the selection using ~
print (x[~x_mask])
[1 3 4] [2]
# We can mask and filter numpy arrays too
print (x_data)
# Get the odd numbers greater than 4
mask = (x_data > 4) & (x_data %2 == 1)
print (mask)
print (x_data[mask])
print (x_data[mask].sum())
[0 1 2 3 4 5 6 7 8 9] [False False False False False True False True False True] [5 7 9] 21
# use numpy's random number generator to get normal random numbers:
x_rnd = np.random.normal(loc = 0, scale = 1, size = 1000)
greater_than_0 = x_rnd > 0
# Use plt.hist to create histograms of the values
plt.hist(x_rnd)
plt.hist(x_rnd[greater_than_0])
plt.hist(x_rnd[~greater_than_0])
# alpha = transparency of the histogram
# color = color of the histogram
# bins = binning to use
# linspace gives linearly spaced numbers
# min, max, n
binning = np.linspace(-5,5, 20)
plt.hist(x_rnd, bins= binning,
         alpha = 0.5, color = "magenta", label = "All", hatch = "/")
plt.hist(x_rnd[greater_than_0], bins= binning,
         alpha = 0.5, color = "black", label = "X>0", hatch = "o")
# A raw string (r"...") keeps the backslash in the LaTeX label intact
plt.hist(x_rnd[~greater_than_0], bins= binning,
         alpha = 0.5, color = "darkorange", label = r"$X \leq 0$",hatch = "*")
plt.xlabel("X Value")
plt.ylabel("dN/dX")
plt.grid()
plt.legend()
We can define functions which operate on numpy arrays:
def sqrt(x):
    return np.sqrt(x)
my_values = np.linspace(0,100)
my_sqrts = sqrt(my_values)
print (my_sqrts[:5])
[0. 1.42857143 2.02030509 2.4743583 2.85714286]
However we do need to be careful on how we write our functions:
def capped_sqrt(x):
    if x > 0:
        return np.sqrt(x)
    else:
        return 0.
my_values = np.linspace(-10,10)
my_sqrts = capped_sqrt(my_values)
print (my_sqrts[:5])
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[139], line 2 1 my_values = np.linspace(-10,10) ----> 2 my_sqrts = capped_sqrt(my_values) 3 print (my_sqrts[:5]) Cell In[138], line 2, in capped_sqrt(x) 1 def capped_sqrt(x): ----> 2 if x > 0: 3 return np.sqrt(x) 4 else: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
We get around this by vectorizing functions. This allows us to call the function on the entire array without needing to write the loop over the elements ourselves. Note that np.vectorize is provided for convenience rather than performance; internally it still loops over the elements.
@np.vectorize
def capped_sqrt(x):
    if x > 0:
        return np.sqrt(x)
    else:
        return 0.
my_values = np.arange(-1,10)
my_sqrts = capped_sqrt(my_values)
print (my_sqrts[:5])
[0. 0. 1. 1.41421356 1.73205081]
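For a simple condition like this one, an alternative is to express the branch with array operations, so the whole computation stays inside numpy. The sketch below uses np.where to stand in for the if/else; it is one possible approach under the same inputs, not the only one:

```python
import numpy as np

my_values = np.arange(-1, 10)

# Replace non-positive values by 0 first, so np.sqrt never sees a
# negative number, then take the square root of the whole array at once.
clipped = np.where(my_values > 0, my_values, 0)
my_sqrts = np.sqrt(clipped)

print(my_sqrts[:5])
```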
Decorators¶
Decorators allow us to modify the behavior of a function. A decorator is essentially a function that takes another function as an argument and returns a modified version of it.
Let's define a logging decorator:
def my_logger(func):
    def wrapper(*args, **kwargs):
        print( f"Running {func.__name__} with:\n\t args = {args}\n\t kwargs = {kwargs}")
        ret = func(*args, **kwargs)
        print( f"Returning {ret}")
        return ret
    return wrapper
@my_logger
def sqrt(x):
    return np.sqrt(x)
sqrt(4)
Running sqrt with: args = (4,) kwargs = {} Returning 2.0
2.0
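One side effect of wrapping a function is that the decorated function takes on the wrapper's name and docstring, which affects help(). The standard library's functools.wraps copies them across. A minimal sketch of the same logging idea, using a hypothetical square function for illustration:

```python
import functools

def my_logger(func):
    # functools.wraps copies the name and docstring of func onto wrapper
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Running {func.__name__} with args = {args}, kwargs = {kwargs}")
        return func(*args, **kwargs)
    return wrapper

@my_logger
def square(x):
    """Return x squared."""
    return x ** 2

result = square(3)
print(result, square.__name__, square.__doc__)
```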
Working with Pandas¶
Pandas is an open-source data manipulation and analysis library in Python that's built on top of NumPy. It provides high-level data structures and a variety of tools for working with structured data.
The core data structure in Pandas is the DataFrame, which is essentially a two-dimensional array with labeled axes (rows and columns). This DataFrame object is built upon NumPy's ndarray, utilizing its efficient operations and functions.
import pandas as pd
# Creating a Pandas DataFrame
data = {
'A': np.random.randn(5), # Creating a NumPy array for column 'A'
'B': np.random.rand(5) # Creating a NumPy array for column 'B'
}
df = pd.DataFrame(data)
print("Pandas DataFrame:")
print(df)
Pandas DataFrame:
          A         B
0 -1.591723  0.883364
1 -1.550968  0.513569
2  0.753827  0.700477
3  0.557346  0.619088
4  1.900134  0.677126
help(np.random.randn)
Help on built-in function randn:

randn(...) method of numpy.random.mtrand.RandomState instance
    randn(d0, d1, ..., dn)

    Return a sample (or samples) from the "standard normal" distribution.

    .. note::
        This is a convenience function for users porting code from Matlab,
        and wraps `standard_normal`. That function takes a
        tuple to specify the size of the output, which is consistent with
        other NumPy functions like `numpy.zeros` and `numpy.ones`.

    .. note::
        New code should use the
        `~numpy.random.Generator.standard_normal`
        method of a `~numpy.random.Generator` instance instead;
        please see the :ref:`random-quick-start`.

    If positive int_like arguments are provided, `randn` generates an array
    of shape ``(d0, d1, ..., dn)``, filled
    with random floats sampled from a univariate "normal" (Gaussian)
    distribution of mean 0 and variance 1. A single float randomly sampled
    from the distribution is returned if no argument is provided.

    Parameters
    ----------
    d0, d1, ..., dn : int, optional
        The dimensions of the returned array, must be non-negative.
        If no argument is given a single Python float is returned.

    Returns
    -------
    Z : ndarray or float
        A ``(d0, d1, ..., dn)``-shaped array of floating-point samples from
        the standard normal distribution, or a single such float if
        no parameters were supplied.

    See Also
    --------
    standard_normal : Similar, but takes a tuple as its argument.
    normal : Also accepts mu and sigma arguments.
    random.Generator.standard_normal: which should be used for new code.

    Notes
    -----
    For random samples from the normal distribution with mean ``mu`` and
    standard deviation ``sigma``, use::

        sigma * np.random.randn(...) + mu

    Examples
    --------
    >>> np.random.randn()
    2.1923875335537315  # random

    Two-by-four array of samples from the normal distribution with
    mean 3 and standard deviation 2.5:

    >>> 3 + 2.5 * np.random.randn(2, 4)
    array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
           [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
# Accessing the underlying NumPy array of column 'A'
numpy_array = df['A'].values
print("Numpy array from Pandas DataFrame:")
print(numpy_array)
Numpy array from Pandas DataFrame:
[-1.59172312 -1.55096775  0.75382695  0.55734567  1.90013409]
# Loading a csv file using pandas
url="https://r2.datahub.io/clt98lqg6000el708ja5zbtz0/master/raw/data/monthly.csv"
df=pd.read_csv(url)
df.head()
 | Source | Date | Mean |
---|---|---|---|
0 | GCAG | 2016-12 | 0.7895 |
1 | GISTEMP | 2016-12 | 0.8100 |
2 | GCAG | 2016-11 | 0.7504 |
3 | GISTEMP | 2016-11 | 0.9300 |
4 | GCAG | 2016-10 | 0.7292 |
df.tail()
 | Source | Date | Mean |
---|---|---|---|
3283 | GISTEMP | 1880-03 | -0.1800 |
3284 | GCAG | 1880-02 | -0.1229 |
3285 | GISTEMP | 1880-02 | -0.2100 |
3286 | GCAG | 1880-01 | 0.0009 |
3287 | GISTEMP | 1880-01 | -0.3000 |
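Once the data is in a DataFrame, selecting and summarizing it takes only a few lines. The sketch below does not download the file; it retypes the five rows shown by df.head() above into a small frame called sample (a name chosen here for illustration) and then filters by source, parses the dates, and averages per source.

```python
import pandas as pd

# The five rows displayed by df.head(), typed in by hand for a self-contained example
sample = pd.DataFrame({
    "Source": ["GCAG", "GISTEMP", "GCAG", "GISTEMP", "GCAG"],
    "Date": ["2016-12", "2016-12", "2016-11", "2016-11", "2016-10"],
    "Mean": [0.7895, 0.8100, 0.7504, 0.9300, 0.7292],
})

# Convert the year-month strings into proper datetime values
sample["Date"] = pd.to_datetime(sample["Date"], format="%Y-%m")

# Boolean mask: keep only the rows from one source
gcag = sample[sample["Source"] == "GCAG"]

# Group by source and average the "Mean" column
mean_by_source = sample.groupby("Source")["Mean"].mean()

print(gcag)
print(mean_by_source)
```

The same three operations should apply unchanged to the full df loaded from the CSV, assuming its columns are named as shown in the table.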
Data Analysis with Python¶
Python is a great language for high-level data analysis, and Jupyter notebooks provide a convenient "analysis notebook" for documenting the analysis and displaying results.
Let's look at how we might reduce and analyze data using Python and extract some meaningful results.
Fitting a model to data¶
- scipy optimize package
- numpy polyfit
- Error propagation
- bootstrapping
Let's start by creating a data set using numpy.
# Let's define the true model
def model(x, p0, p1, p2):
    return p0 * x**2 + p1 * x + p2
# Set the true parameters
p_true = [0.02, 0.1, -2.5]
# Let the x points be random floats between 0-10
x = 10*np.random.rand(100)
y = model(x, p_true[0], p_true[1], p_true[2])
# let's add some gaussian noise
y_noisey = y + np.random.normal(loc = 0, scale = 0.2, size = x.shape)
# define our y error as 0.2
y_err = 0.2 * np.ones(x.shape)
# Plot the data
plt.errorbar(x, y_noisey, yerr = y_err , fmt = "C0o", label = "Measured")
plt.ylabel("Y Values")
plt.xlabel("X Values")
plt.grid()
Let's use scipy.optimize.curve_fit.
curve_fit will perform a least-squares minimization of:
$$ \sum_i (y_i - y_{model}(x_i, \theta))^2 $$
If errors are provided, then it will perform a $\chi^2$-minimization of:
$$ \sum_i \frac{(y_i - y_{model}(x_i, \theta))^2}{\Delta y_i^2} $$
curve_fit returns the optimal parameters and the covariance matrix of those parameters, allowing us to easily extract an uncertainty from the square root of its diagonal.
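The quantity being minimized can also be written out by hand. The sketch below is an illustration under stated assumptions: it uses a hypothetical straight-line model (line) and made-up data rather than the quadratic data set of this section, computes $\chi^2$ for a set of parameters, and checks that the parameters returned by curve_fit give a smaller value than a slightly shifted set.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, c):
    """Hypothetical straight-line model used only for this illustration."""
    return m * x + c

def chi2(params, x, y, y_err):
    """Sum of squared residuals, each weighted by its error."""
    residuals = (y - line(x, *params)) / y_err
    return np.sum(residuals**2)

rng = np.random.default_rng(0)  # seeded so the example is reproducible
x_demo = np.linspace(0, 10, 50)
y_err_demo = 0.5 * np.ones_like(x_demo)
y_demo = line(x_demo, 2.0, 1.0) + rng.normal(0, 0.5, size=x_demo.shape)

popt_demo, pcov_demo = curve_fit(line, x_demo, y_demo, sigma=y_err_demo)

chi2_best = chi2(popt_demo, x_demo, y_demo, y_err_demo)
chi2_shifted = chi2(popt_demo + np.array([0.1, 0.0]), x_demo, y_demo, y_err_demo)
print(f"chi2 at best fit: {chi2_best:.2f}")
print(f"chi2 with the slope shifted by 0.1: {chi2_shifted:.2f}")
```

Moving any parameter away from the best-fit value increases $\chi^2$, which is what makes the best fit "best".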
# Use scipy curve_fit to perform a fit
from scipy.optimize import curve_fit
# Returns the optimal parameters (popt) and their covariance matrix (pcov)
popt, pcov = curve_fit(
    model, # Function we want to fit
    x, # x data
    y_noisey, # y data
    p0 = [-0.2, 1, 5], # initial guess
    sigma=y_err # error on y
)
x_plot = np.linspace(0,10)
plt.errorbar(x, y_noisey, yerr = y_err , fmt = "C0o", label = "Measured")
plt.plot(x_plot, model(x_plot, *popt), "r--", label = "Best fit")
plt.ylabel("Y Values")
plt.xlabel("X Values")
plt.legend()
plt.grid()
parameter_errors = np.sqrt(np.diag(pcov))
for p, perr in zip(popt, parameter_errors):
    print (f"{p:0.3f} +/- {perr:0.3f}")
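Because the model here is a polynomial, np.polyfit (listed at the start of this section) offers a second route to the same kind of fit. The sketch below regenerates a comparable data set with a seeded generator, so the names ending in _demo are placeholders rather than the arrays used above. Two conventions are worth noting: np.polyfit expects weights w = 1/sigma, and it returns coefficients ordered from the highest power to the lowest.

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded so the example is reproducible
p_true_demo = [0.02, 0.1, -2.5]  # same true parameters as in this section
x_demo = 10 * rng.random(100)
sigma_demo = 0.2 * np.ones_like(x_demo)
# np.polyval evaluates p[0]*x**2 + p[1]*x + p[2], matching the quadratic model
y_demo = np.polyval(p_true_demo, x_demo) + rng.normal(0, 0.2, size=x_demo.shape)

# Degree-2 fit weighted by 1/sigma; cov=True also returns the covariance matrix
coeffs, cov = np.polyfit(x_demo, y_demo, deg=2, w=1 / sigma_demo, cov=True)
coeff_errors = np.sqrt(np.diag(cov))

for c, cerr in zip(coeffs, coeff_errors):
    print(f"{c:0.3f} +/- {cerr:0.3f}")
```

For models that are not polynomials, curve_fit remains the more general tool.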
Are we confident in our uncertainty?¶
It can often be difficult to quantify our uncertainties. Bootstrapping is a useful method for estimating them.
Assuming we have independent data points, we can randomly resample our data with replacement, apply our fit to the resampled data, and repeat this a number of times to estimate the distribution of best-fit values.
# bootstrapping
samples = []
for i in range(100):
    # Get random indices
    # replace = True allows us to reuse indices
    # So we could be drawing an estimate from the [0th, 11th, 81st, 0th] elements of our array
    rnd_int = np.random.choice(np.arange(len(x)), size=len(x), replace=True)
    # Extract the corresponding values
    x_samp = x[rnd_int]
    y_samp = y_noisey[rnd_int]
    y_samp_err = y_err[rnd_int]
    # Apply fit
    p, _ = curve_fit(model, x_samp, y_samp, sigma = y_samp_err)
    # Store Result
    samples.append(p)
samples = np.array(samples)
fig, axs = plt.subplots(1,3, figsize = (18,6))
for i in range(3):
    mean = np.mean(samples[:,i])
    std = np.std(samples[:,i])
    axs[i].hist(samples[:,i], alpha = 0.5)
    axs[i].axvline(popt[i], color = "r", label = "From Fit")
    axs[i].axvline(popt[i] - parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(popt[i] + parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(mean, color = "C4", label = "Bootstrap")
    axs[i].axvline(mean - std, ls = "--", color = "C4")
    axs[i].axvline(mean + std, ls = "--", color = "C4")
    axs[i].axvline(p_true[i], color = "k", label = "True")
    axs[i].grid()
    axs[i].legend()
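Since the resample-and-refit loop is reused several times in this section, it can be convenient to wrap it in a function. The following is one possible sketch, not the only way to organize it: the names bootstrap_fit and quadratic, the n_boot and seed parameters, and the regenerated _demo data are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def bootstrap_fit(func, x, y, y_err, n_boot=200, seed=None):
    """Refit func on resampled data; return an (n_boot, n_params) array of best fits."""
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_boot):
        # Draw len(x) indices with replacement, so points can repeat
        idx = rng.integers(0, len(x), size=len(x))
        p = curve_fit(func, x[idx], y[idx], sigma=y_err[idx])[0]
        samples.append(p)
    return np.array(samples)

def quadratic(x, p0, p1, p2):
    return p0 * x**2 + p1 * x + p2

rng = np.random.default_rng(2)
x_demo = 10 * rng.random(100)
y_err_demo = 0.2 * np.ones_like(x_demo)
y_demo = quadratic(x_demo, 0.02, 0.1, -2.5) + rng.normal(0, 0.2, size=x_demo.shape)

boot = bootstrap_fit(quadratic, x_demo, y_demo, y_err_demo, n_boot=200, seed=3)
print("bootstrap mean:", boot.mean(axis=0))
print("bootstrap std: ", boot.std(axis=0))
```

Passing a seed makes the resampling reproducible, which helps when comparing runs.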
### What if we underestimate our errors?
# let's add some gaussian noise
# Increase spread to 0.3
y_noisey = y + np.random.normal(loc = 0, scale = 0.3, size = x.shape)
# define our y error as 0.1 (decreasing error)
y_err = 0.1 * np.ones(x.shape)
# Returns the optimal parameters (popt) and their covariance matrix (pcov)
popt, pcov = curve_fit(
    model, # Function we want to fit
    x, # x data
    y_noisey, # y data
    p0 = [-0.2, 1, 5], # initial guess
    sigma=y_err # error on y
)
x_plot = np.linspace(0,10)
plt.errorbar(x, y_noisey, yerr = y_err , fmt = "C0o", label = "Measured")
plt.plot(x_plot, model(x_plot, *popt), "r--", label = "Best fit")
plt.ylabel("Y Values")
plt.xlabel("X Values")
plt.legend()
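plt.grid()
# Recompute the parameter errors from the new covariance matrix,
# otherwise the plots below would show the errors from the previous fit
parameter_errors = np.sqrt(np.diag(pcov))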
# bootstrapping
samples = []
for i in range(100):
    rnd_int = np.random.choice(np.arange(len(x)), size=len(x), replace=True)
    x_samp = x[rnd_int]
    y_samp = y_noisey[rnd_int]
    y_samp_err = y_err[rnd_int]
    p, _ = curve_fit(model, x_samp, y_samp, sigma = y_samp_err)
    samples.append(p)
samples = np.array(samples)
fig, axs = plt.subplots(1,3, figsize = (18,6))
for i in range(3):
    mean = np.mean(samples[:,i])
    std = np.std(samples[:,i])
    axs[i].hist(samples[:,i], alpha = 0.5)
    axs[i].axvline(popt[i], color = "r", label = "From Fit")
    axs[i].axvline(popt[i] - parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(popt[i] + parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(mean, color = "C4", label = "Bootstrap")
    axs[i].axvline(mean - std, ls = "--", color = "C4")
    axs[i].axvline(mean + std, ls = "--", color = "C4")
    axs[i].axvline(p_true[i], color = "k", label = "True")
    axs[i].grid()
    axs[i].legend()
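When reading this comparison, it helps to know how curve_fit treats sigma. By default it uses absolute_sigma=False, which treats sigma as relative weights and rescales pcov so that the reduced $\chi^2$ of the fit equals one. Under that default, multiplying every error bar by the same constant leaves the reported parameter errors unchanged. Passing absolute_sigma=True uses sigma exactly as given. The sketch below illustrates the difference on regenerated data; the names quadratic, fit_errors, and the _demo arrays are assumptions for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def quadratic(x, p0, p1, p2):
    return p0 * x**2 + p1 * x + p2

rng = np.random.default_rng(4)
x_demo = 10 * rng.random(100)
# True scatter is 0.3, as in the cell above
y_demo = quadratic(x_demo, 0.02, 0.1, -2.5) + rng.normal(0, 0.3, size=x_demo.shape)
sigma_small = 0.1 * np.ones_like(x_demo)  # underestimated errors
sigma_large = 2 * sigma_small             # the same errors, doubled

def fit_errors(sigma, absolute):
    """Return the 1-sigma parameter errors reported by curve_fit."""
    _, pcov = curve_fit(quadratic, x_demo, y_demo, sigma=sigma, absolute_sigma=absolute)
    return np.sqrt(np.diag(pcov))

rel_small = fit_errors(sigma_small, absolute=False)
rel_large = fit_errors(sigma_large, absolute=False)
abs_small = fit_errors(sigma_small, absolute=True)
abs_large = fit_errors(sigma_large, absolute=True)

print("absolute_sigma=False:", rel_small, rel_large)  # unchanged by the rescaling
print("absolute_sigma=True: ", abs_small, abs_large)  # doubles with sigma
```

With underestimated error bars, absolute_sigma=True reports parameter errors that are too small, while the default quietly compensates by rescaling. A bootstrap comparison is one way to notice which situation applies.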
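### How to use bootstrapping to handle non-Gaussian errors
# Note: despite its name, exp_model is a power law, n * x**(-tau)
def exp_model(x, n, tau):
    return n*x**-tau
p_true = [10, 0.5]
# Start slightly above 0, since the model diverges at x = 0
x_plot = np.linspace(0.1,10)
plt.plot(x_plot, exp_model(x_plot, *p_true))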
Poisson Distribution¶
$$p(X = k ; \lambda) = \frac{e^{-\lambda}\lambda^{k}}{k!}$$
where $k$ is the observed number of counts and $\lambda$ is the mean number of counts. The mean is $\lambda$ and the standard deviation is $\sqrt{\lambda}$. In counting experiments we typically say that if $f = N$, then $\Delta f = \sqrt{N}$.
Does this mean that it's appropriate to use $\sqrt{N}$ in a $\chi^2$ fit?
lam = np.arange(6)
fig, axs = plt.subplots(2,3, figsize = (11,6))
for l, ax in zip(lam, axs.ravel()):
    rnd_x = np.random.poisson(lam = l, size = 1000)
    ax.hist(rnd_x, alpha = 0.5, bins = -0.5 + np.arange(0,15))
    ax.axvline(l)
    ax.axvline(l - np.sqrt(l))
    ax.axvline(l + np.sqrt(l))
    # Raw string so that "\l" is not treated as an escape sequence
    ax.set_title(r"$\lambda$ = " + f"{l}")
    ax.grid()
fig.tight_layout()
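The histograms can be backed up with a few numbers. The sketch below, using a seeded generator and illustrative values of $\lambda$, checks that the sample mean and standard deviation approach $\lambda$ and $\sqrt{\lambda}$, and also records how often a bin contains zero counts. That last number matters for the question above: a bin with $N = 0$ would get an error of $\sqrt{0} = 0$, which cannot be used as a weight in a $\chi^2$ sum.

```python
import numpy as np

rng = np.random.default_rng(5)  # seeded so the numbers are reproducible
n_draws = 100_000
poisson_stats = {}
for lam_demo in [1, 4, 25]:
    counts = rng.poisson(lam=lam_demo, size=n_draws)
    poisson_stats[lam_demo] = {
        "mean": counts.mean(),
        "std": counts.std(),
        "zero_fraction": np.mean(counts == 0),
    }
    s = poisson_stats[lam_demo]
    print(f"lambda = {lam_demo:>2}: mean = {s['mean']:.2f}, std = {s['std']:.2f}, "
          f"sqrt(lambda) = {np.sqrt(lam_demo):.2f}, zero fraction = {s['zero_fraction']:.3f}")
```

The expected fraction of zeros is $e^{-\lambda}$, so it is substantial at low $\lambda$ and negligible once $\lambda$ is large, where the Gaussian approximation with $\sqrt{N}$ errors becomes reasonable.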
x = 10*np.random.random(100)
# y = np.array([
# np.random.poisson(lam = exp_model(x_i, *p_true), size = 1)
# for x_i in x
# ])[:,0]
y = np.random.poisson(lam = exp_model(x, *p_true))
# Start slightly above 0, since the model diverges at x = 0
x_plot = np.linspace(0.1,10)
y_err = np.sqrt(y)
# popt, pcov = curve_fit(exp_model, x, y, sigma = y_err)
popt, pcov = curve_fit(exp_model, x, y)
plt.plot(x_plot, exp_model(x_plot, *p_true))
plt.errorbar(x, y, yerr = y_err, fmt = "C0o")
plt.plot(x_plot, exp_model(x_plot, *popt))
plt.grid()
parameter_errors = np.sqrt(np.diag(pcov))
for pt, p, perr in zip(p_true, popt, parameter_errors):
    print (f"{pt:0.3f} -> {p:0.3f} +/- {perr:0.3f}")
# bootstrapping
samples = []
for i in range(100):
    rnd_int = np.random.choice(np.arange(len(x)), size=len(x), replace=True)
    x_samp = x[rnd_int]
    y_samp = y[rnd_int]
    y_samp_err = y_err[rnd_int]
    p, _ = curve_fit(exp_model, x_samp, y_samp)
    samples.append(p)
samples = np.array(samples)
fig, axs = plt.subplots(1,2, figsize = (18,6))
for i in range(2):
    mean = np.mean(samples[:,i])
    std = np.std(samples[:,i])
    axs[i].hist(samples[:,i], alpha = 0.5)
    axs[i].axvline(popt[i], color = "r", label = "From Fit")
    axs[i].axvline(popt[i] - parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(popt[i] + parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(mean, color = "C4", label = "Bootstrap")
    axs[i].axvline(mean - std, ls = "--", color = "C4")
    axs[i].axvline(mean + std, ls = "--", color = "C4")
    axs[i].axvline(p_true[i], color = "k", label = "True")
    axs[i].grid()
    axs[i].legend()
We can see that the bootstrapped distributions are highly non-Gaussian. It might not make sense to report the uncertainty as a 1-sigma error on the fit parameters. Instead, we might report bootstrapped quantiles. A common way to represent the uncertainty is to report the 90% confidence/credibility interval. This says that:
- If we were to repeat this experiment 100 times, the measured value would fall within this interval about 90 times
fig, axs = plt.subplots(1,2, figsize = (18,6))
for i, pt in enumerate(p_true):
    axs[i].hist(samples[:,i], alpha = 0.5)
    axs[i].axvline(popt[i], color = "r", label = "From Fit")
    axs[i].axvline(popt[i] - parameter_errors[i], ls = "--", color = "r")
    axs[i].axvline(popt[i] + parameter_errors[i], ls = "--", color = "r")
    quan = np.quantile(samples[:,i], [0.05, 0.5, 0.95])
    axs[i].axvline(quan[1], color = "C4", label = "Bootstrap")
    axs[i].axvline(quan[0], ls = "--", color = "C4")
    axs[i].axvline(quan[2], ls = "--", color = "C4")
    axs[i].axvline(p_true[i], color = "k", label = "True")
    axs[i].grid()
    axs[i].legend()
    print (f"{pt:0.3f} -> {quan[1]:0.3f} [{quan[0]:0.3f}, {quan[2]:0.3f}]")
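The quantile step generalizes to any interval width. As a small sketch, the helper below (central_interval, a name introduced here for illustration) returns the lower bound, median, and upper bound of a central interval for a chosen level. It is demonstrated on the integers 0 to 100 so that the result is easy to check by eye.

```python
import numpy as np

def central_interval(samples, level=0.90):
    """Return (lower, median, upper) of the central interval holding `level` of the samples."""
    tail = (1 - level) / 2  # probability left out on each side
    lower, median, upper = np.quantile(samples, [tail, 0.5, 1 - tail])
    return lower, median, upper

demo_samples = np.arange(101)  # 0, 1, ..., 100
low, med, high = central_interval(demo_samples, level=0.90)
print(low, med, high)  # approximately 5, 50, 95
```

Applied to one column of the bootstrap samples array, this would give the same three numbers as the np.quantile call above when level is 0.90.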
Classes in Python¶
Classes in Python serve as templates or blueprints defining the attributes (data) and behaviors (methods) of objects. They encapsulate both data and methods that operate on that data within a single structure, promoting code organization and reusability.
To create a class in Python, you use the class keyword, allowing you to define properties (attributes) and behaviors (methods) within it.
Attributes and Methods¶
Attributes represent the data associated with a class, while methods are functions defined within the class that can access and manipulate this data. These methods can perform various operations on the attributes, thereby altering or providing access to the data encapsulated within the class.
Example of a Simple Class¶
Consider the following example of a basic class in Python:
class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year
    def get_details(self):
        return f"{self.year} {self.make} {self.model}"
The __init__ method is the "initialization" method, or constructor, of the class. It is called when an object is created.
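To see the class in action, we can create an instance and call its method. The sketch below repeats the Car definition so that it runs on its own; the make, model, and year values are arbitrary examples.

```python
class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year

    def get_details(self):
        return f"{self.year} {self.make} {self.model}"

# Creating an instance calls __init__ with the arguments given here
my_car = Car("Toyota", "Corolla", 2020)
print(my_car.get_details())  # 2020 Toyota Corolla

# Attributes can be read and changed through the instance
my_car.year = 2021
print(my_car.get_details())  # 2021 Toyota Corolla
```

Each instance carries its own copy of the attributes, so changing my_car.year does not affect any other Car object.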
class Data():
    """Data Class

    Class for holding x/y data with methods to calculate the properties
    and some plotting functionalities.
    """
    # "self" refers to the instance itself and is used to access data belonging to it
    def __init__(self, x_data : np.ndarray, y_data : np.ndarray) -> None:
        """Initialization function

        Copy x_data and y_data
        Args:
            x_data : data on the x axis
            y_data : data on the y axis
        Returns:
            None
        """
        self.x_data = x_data.copy()
        self.y_data = y_data.copy()

    # Member functions take "self" as the first argument
    def calculate_properties(self) -> None:
        """Calculate properties of X and Y data

        Determine the mean and standard deviation of x_data and y_data.
        Mean and standard deviation are stored as attributes within Data class
        Args:
            None
        Returns:
            None
        """
        self.x_mean = np.mean(self.x_data)
        self.y_mean = np.mean(self.y_data)
        self.x_std = np.std(self.x_data)
        self.y_std = np.std(self.y_data)

    def plot_data(self, x_label : str = None, y_label : str = None) -> plt.Figure:
        """Plot X and Y data

        Plot the X and Y data and return the figure. Add optional x/y labels.
        Lines are added for the mean x/y and their standard deviations,
        so calculate_properties() must be called before plotting.
        Args:
            x_label : (optional) string for the label of the x axis. (Default None)
            y_label : (optional) string for the label of the y axis. (Default None)
        Returns:
            figure with the plot of y(x)
        """
        fig = plt.figure(figsize = (11,6))
        plt.plot(self.x_data, self.y_data)
        # Plotting the means as solid lines and the +/- 1 sigma as dashed lines
        plt.axhline(self.y_mean, color = "C1", ls = "-", label = r"$\mu_{y}$")
        plt.axhline(self.y_mean + self.y_std, color = "C1", ls = "--")
        plt.axhline(self.y_mean - self.y_std, color = "C1", ls = "--", label = r"$\mu_{y} \pm \sigma_{y}$")
        plt.axvline(self.x_mean, color = "C2", ls = "-", label = r"$\mu_{x}$")
        plt.axvline(self.x_mean + self.x_std, color = "C2", ls = "--", )
        plt.axvline(self.x_mean - self.x_std, color = "C2", ls = "--", label = r"$\mu_{x} \pm \sigma_{x}$")
        if x_label is not None:
            plt.xlabel(x_label)
        if y_label is not None:
            plt.ylabel(y_label)
        plt.legend()
        plt.grid()
        return fig
x = np.linspace(-3 * np.pi, 3 * np.pi, 100 )
y = np.sin(x)
my_data = Data(x, y)
my_data.calculate_properties()
fig = my_data.plot_data(y_label="Sin(x)")
Inheritance in Python¶
Inheritance is a fundamental concept in object-oriented programming that allows a new class to inherit properties and behaviors (attributes and methods) from an existing class. This concept promotes code reuse, enhances readability, and enables the creation of more specialized classes.
Basics of Inheritance¶
In Python, inheritance is achieved by specifying the name of the parent class(es) inside the definition of a new class. The new class, also known as the child class or subclass, inherits all attributes and methods from its parent class or classes, referred to as the base class or superclass.
Syntax for Inheriting Classes¶
The syntax for creating a subclass that inherits from a superclass involves passing the name of the superclass inside parentheses when defining the subclass:
class ParentClass:
    # Parent class attributes and methods
    pass
class ChildClass(ParentClass):
    # Child class attributes and methods
    pass
Here, ChildClass is inheriting from ParentClass, which means ChildClass will inherit all attributes and methods defined in ParentClass.
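A small runnable version of this pattern may make the behavior clearer. In the sketch below, the methods describe and kind are hypothetical examples: ChildClass inherits __init__ and describe unchanged, overrides kind, and uses super() to reach the parent's version.

```python
class ParentClass:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"I am {self.name}"

    def kind(self):
        return "parent"

class ChildClass(ParentClass):
    # kind() is overridden; __init__ and describe() are inherited unchanged
    def kind(self):
        return "child of a " + super().kind()

child = ChildClass("an example object")
print(child.describe())                 # inherited method
print(child.kind())                     # overridden method
print(isinstance(child, ParentClass))   # a child instance is also a parent instance
```

The isinstance check shows that any code written for ParentClass objects will also accept ChildClass objects.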
class TimeSeries(Data):
    """Time Series Data Class

    Class for holding x/y data with methods to calculate the properties
    and some plotting functionalities.
    The X data is assumed to be time
    """
    def __init__(self, x_data : np.ndarray, y_data: np.ndarray) -> None:
        """Initialization function

        Copy x_data and y_data
        Args:
            x_data : data on the x axis
            y_data : data on the y axis
        Returns:
            None
        """
        # We use the built-in super() function to call parent class methods
        super().__init__(x_data, y_data)

    # We can override methods of the parent class
    def plot_data(self) -> plt.Figure:
        """Plot X and Y data

        Plot the X and Y data and return the figure, using fixed x/y labels.
        Lines are added for the mean x/y and their standard deviations.
        Args:
            None
        Returns:
            figure with the plot of y(x)
        """
        return super().plot_data(x_label = "Time", y_label = "AU")

    # We can also define new methods
    def add_to_data(self, y : float) -> None:
        """Add a constant offset to y data.

        A constant offset is added to self.y_data.
        The properties (mean and std) are calculated for the adjusted dataset
        Args:
            y : constant offset to be added to self.y_data
        Returns:
            None
        """
        self.y_data += y
        # Recalculate the properties
        super().calculate_properties()
my_time_series = TimeSeries(x, y)
# Calling a function from the parent class
my_time_series.calculate_properties()
# Call a function that only exists in the child class
my_time_series.add_to_data(10)
# Call overridden function
fig = my_time_series.plot_data()