Functions in Python#
Functions are a fundamental building block in Python programming, allowing you to organize your code into reusable, logical blocks. Whether you’re writing simple scripts or complex applications, functions help make your code cleaner, more modular, and easier to maintain. Once a function is defined, you can call it whenever you need it, avoiding repetitive code. Functions can accept inputs (parameters), perform operations, and return outputs.
By using functions, we can:
Avoid repetitive code
Break complex tasks into smaller, manageable pieces
Improve readability and maintainability of your code
Defining and Calling Functions#
In Python, you define a function using the def keyword followed by the function name, parentheses (), and a colon. Inside the parentheses, you can define optional parameters. The indented lines that follow form the body of the function.
def greet():
    """This is a simple function that prints a greeting."""
    print("Hello, welcome to the Geospatial Analysis in Python!")
To execute the code inside a function, you simply call it by its name followed by parentheses.
# Call the function
greet()
Hello, welcome to the Geospatial Analysis in Python!
Functions with Parameters#
Functions can accept inputs called parameters, which are named in the function definition. The values you pass when calling the function are called arguments.
def greet_user(name):
    """This function greets the user by their name."""
    print(f"Hello, {name}! Welcome to the Geospatial Analysis in Python!")
# Call the function with an argument
greet_user("Modric")
Hello, Modric! Welcome to the Geospatial Analysis in Python!
Functions with Return Values#
Sometimes, you want a function to compute and return a result.
def add_numbers(a, b):
    """This function returns the sum of two numbers."""
    return a + b
# Call the function and store the result
result = add_numbers(4, 9)
print(f"The sum is: {result}")
The sum is: 13
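A function can also return several values at once by separating them with commas; Python packs them into a tuple that the caller can unpack. The sketch below is illustrative only: the function name bbox_center and the sample coordinates are hypothetical and not part of the tutorial's data.

```python
def bbox_center(min_x, min_y, max_x, max_y):
    """Return the center point of a bounding box as an (x, y) tuple."""
    center_x = (min_x + max_x) / 2
    center_y = (min_y + max_y) / 2
    return center_x, center_y


# Unpack the returned tuple into two variables
x, y = bbox_center(80.0, 28.5, 80.5, 29.0)
print(f"Center: ({x}, {y})")
```

Unpacking into x and y keeps the calling code readable; the result could equally be stored in a single variable and indexed like any other tuple.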
Default Parameters
You can set default values for parameters. If no value is provided when calling the function, the default is used.
def greet(name="Guest"):
    """This function greets the user, with a default name of 'Guest'."""
    print(f"Hello, {name}!")
# Call with and without an argument
greet("Luka")
greet()
Hello, Luka!
Hello, Guest!
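One caveat worth knowing: default values are evaluated once, when the function is defined, so a mutable default such as a list is shared between calls. A common workaround is to default to None and create the list inside the function. The function names below are hypothetical and exist only to illustrate this behavior.

```python
def add_layer_shared(layer, layers=[]):
    """Pitfall: the same default list is reused across calls."""
    layers.append(layer)
    return layers


def add_layer(layer, layers=None):
    """Safer: create a fresh list whenever no list is passed in."""
    if layers is None:
        layers = []
    layers.append(layer)
    return layers


print(add_layer_shared("roads"))   # first call
print(add_layer_shared("rivers"))  # the default list still holds "roads"
print(add_layer("roads"))
print(add_layer("rivers"))         # starts from an empty list each time
```

Immutable defaults such as strings, numbers, None, or tuples do not have this problem, which is why the "Guest" default above is safe.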
Keyword Arguments
Python allows you to call functions using keyword arguments, which makes your code more readable.
def favorite_IDE(programming, IDE="RStudio"):
    """This function describes your favorite programming language and IDE."""
    print(f"I love {programming} and enjoy scripting with {IDE}.")
# Call with positional and keyword arguments
favorite_IDE("R")
favorite_IDE(programming="Python", IDE="VS Code")
favorite_IDE(programming="JavaScript", IDE="Jupyter Notebook")
I love R and enjoy scripting with RStudio.
I love Python and enjoy scripting with VS Code.
I love JavaScript and enjoy scripting with Jupyter Notebook.
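Keyword arguments can be supplied in any order, and they can be mixed with positional arguments as long as the positional ones come first. A small sketch of both cases; describe_layer and its parameters are hypothetical names chosen for illustration.

```python
def describe_layer(name, crs="EPSG:4326", file_format="GeoPackage"):
    """Build a one-line description of a (hypothetical) spatial layer."""
    return f"{name} ({file_format}, {crs})"


# Keyword arguments can be given in any order
print(describe_layer(file_format="Shapefile", name="districts"))

# Positional arguments must come before keyword arguments
print(describe_layer("roads", crs="EPSG:32644"))
```

Naming the arguments at the call site also protects the call from breaking if the order of optional parameters changes later.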
Variable Scope
Variables defined inside a function are local to that function and cannot be accessed outside it.
def demo_function():
    local_var = 10  # This variable exists only inside the function
    print(f"Inside function: {local_var}")
demo_function()
# print(local_var)  # This would raise a NameError because local_var is not defined outside the function
Inside function: 10
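To see the scoping rule in action without stopping the script, the failed lookup can be wrapped in try-except. This is a minimal sketch; compute_area and its variables are hypothetical.

```python
def compute_area():
    width = 4           # local to compute_area
    height = 5          # local to compute_area
    return width * height


area = compute_area()
print(f"Area: {area}")

try:
    print(width)        # width was never defined outside the function
except NameError as error:
    print(f"NameError: {error}")
```

The usual way to get a value out of a function is to return it, as compute_area does, rather than relying on variables defined inside it.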
Anonymous Functions
Anonymous functions, also referred to as lambda functions, are one-line functions defined using the lambda keyword. They are often used for short, simple operations.
# A simple lambda function to multiply two numbers
multiply = lambda x, y: x * y
print(multiply(12, 19))
228
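Lambda functions are most useful when passed directly to another function, for example as the key argument of sorted. The data below is made up for illustration.

```python
# (name, population) pairs; the numbers are invented for this example
districts = [("A", 1200), ("B", 450), ("C", 980)]

# Sort by the second element of each pair, largest first
by_population = sorted(districts, key=lambda item: item[1], reverse=True)
print(by_population)
```

When the logic grows beyond a single expression, a regular def function is usually easier to read and to document.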
Downloading Geospatial Data#
Downloading geospatial data is a common step in geospatial analysis workflows. Third-party Python libraries such as requests make it straightforward to download files from the web and automate data acquisition. For instance, you can fetch GeoPackage or GeoTIFF files from online repositories like GADM or ESA. Once downloaded, these datasets can be used for spatial analyses and visualizations. Scripting the download keeps data acquisition consistent and reproducible, and it can be combined with preprocessing steps in the same workflow.
import requests
# URL of the GADM data of San Marino
url = "https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg/gadm41_SMR.gpkg"
# Destination path to save the file
destfile = "data/downloads/gadm41_SMR.gpkg"
# Download the file
response = requests.get(url)
# Check if the request was successful
if response.status_code == 200:
    # Write the content to a file
    with open(destfile, 'wb') as file:
        file.write(response.content)
    print(f"File downloaded successfully and saved to {destfile}")
else:
    print(f"Failed to download file. Status code: {response.status_code}")
File downloaded successfully and saved to data/downloads/gadm41_SMR.gpkg
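The script above assumes that the destination directory already exists; if it does not, open() raises FileNotFoundError. A minimal sketch of preparing the destination path first, using only the standard library. The helper name prepare_destination is hypothetical, and a temporary directory stands in for the project folder so the example leaves no files behind.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def prepare_destination(url, folder):
    """Create `folder` if needed and return the path the file would be saved to."""
    folder_path = Path(folder)
    # parents=True creates intermediate directories; exist_ok=True avoids an
    # error when the directory is already there
    folder_path.mkdir(parents=True, exist_ok=True)
    # Use the last component of the URL path as the file name
    file_name = Path(urlparse(url).path).name
    return folder_path / file_name


url = "https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg/gadm41_SMR.gpkg"

with tempfile.TemporaryDirectory() as tmp:
    destfile = prepare_destination(url, Path(tmp) / "data" / "downloads")
    print(destfile.name)
    print(destfile.parent.is_dir())
```

In a real project, the folder argument would simply be "data/downloads", and the returned path could be passed to open() as before.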
Automating Data Download#
To simplify the process, we can wrap this functionality into a custom function that takes a country code and a destination directory as arguments. This is particularly useful when managing multiple datasets.
Download one file at a time#
In this subsection, we will create a function to download country-specific GeoPackage data from the GADM website.
import os
def get_gadm(iso3=None, path=None):
    if not iso3 or not path:
        raise ValueError("Both 'iso3' and 'path' parameters must be provided.")
    # Construct the URL of the GADM data
    url = f"https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg/gadm41_{iso3}.gpkg"
    # Destination path to save the file
    destfile = os.path.join(path, f"{iso3}.gpkg")
    # Download the file
    response = requests.get(url)
    if response.status_code == 200:
        with open(destfile, 'wb') as file:
            file.write(response.content)
        print("Download process completed successfully.")
    else:
        print(f"Failed to download file. HTTP status code: {response.status_code}")
Parameters:
iso3: str, ISO3 country code (e.g., “SMR” for San Marino).
path: str, Directory path where the file will be saved.
Returns:
None, but saves the downloaded file to the specified location.
# Example 1: Luxembourg
iso3 = "LUX"
path = "data/downloads/"
get_gadm(iso3, path)
Download process completed successfully.
# Example 2: Cyprus
get_gadm(iso3="CYP", path="data/downloads/")
Download process completed successfully.
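Because get_gadm inserts iso3 directly into the URL, a mistyped code only surfaces as an HTTP error after the request has been sent. One option is to check the shape of the code up front. The sketch below assumes that a valid ISO3 code is exactly three ASCII letters; build_gadm_url is a hypothetical helper, and it does not verify that the code belongs to an actual country.

```python
GADM_BASE_URL = "https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg"


def build_gadm_url(iso3):
    """Return the GADM 4.1 GeoPackage URL for a three-letter country code."""
    code = iso3.strip().upper()
    # Assumption: a valid ISO3 code is exactly three ASCII letters
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"Invalid ISO3 code: {iso3!r}")
    return f"{GADM_BASE_URL}/gadm41_{code}.gpkg"


print(build_gadm_url("lux"))
```

Normalizing to upper case means that "lux" and "LUX" resolve to the same file, matching the URL pattern used in get_gadm.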
Download multiple files in one call#
In this subsection, we will create a function to download GADM data for multiple countries with a single call. The files are fetched one after another in a loop, not in parallel.
def gadm_downloader(iso3_list=None, path=None):
    if not iso3_list or not path:
        raise ValueError("Both 'iso3_list' and 'path' parameters must be provided.")
    for iso3 in iso3_list:
        # Construct the URL
        url = f"https://geodata.ucdavis.edu/gadm/gadm4.1/gpkg/gadm41_{iso3}.gpkg"
        # Destination path to save the file
        destfile = os.path.join(path, f"{iso3}.gpkg")
        print(f"Downloading GADM data for country: {iso3}...")
        try:
            # Download the file
            response = requests.get(url)
            # Check if the download was successful
            if response.status_code == 200:
                with open(destfile, 'wb') as file:
                    file.write(response.content)
                print(f"File for {iso3} downloaded successfully.")
            else:
                print(f"Failed to download file for {iso3}. HTTP status code: {response.status_code}")
        except Exception as e:
            print(f"An error occurred while downloading file for {iso3}: {e}")
    print("Download process completed.")
Key features:
Looping through ISO3 codes
Error Handling with try-except
Dynamic File Naming
Feedback messages
# Example 3: Cyprus, San Marino, Luxembourg
gadm_downloader(iso3_list=["LUX", "SMR", "CYP"],
                path="data/downloads")
Downloading GADM data for country: LUX...
File for LUX downloaded successfully.
Downloading GADM data for country: SMR...
File for SMR downloaded successfully.
Downloading GADM data for country: CYP...
File for CYP downloaded successfully.
Download process completed.
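gadm_downloader reports progress only through print, so the calling code cannot tell which downloads failed. A sketch of one alternative that returns a summary dictionary instead. To keep the example runnable offline, the download step is passed in as a function; download_all and fake_download are hypothetical names, and fake_download merely simulates a failure for a made-up code.

```python
def download_all(iso3_list, download_one):
    """Call download_one(iso3) for each code and record the outcome."""
    results = {}
    for iso3 in iso3_list:
        try:
            download_one(iso3)
            results[iso3] = "ok"
        except Exception as error:
            # Keep going: one failed country should not stop the batch
            results[iso3] = f"failed: {error}"
    return results


def fake_download(iso3):
    """Stand-in for a real download; fails for one made-up code."""
    if iso3 == "XXX":
        raise RuntimeError("HTTP status code: 404")


summary = download_all(["LUX", "XXX", "CYP"], fake_download)
print(summary)
```

In practice, download_one could be any function that raises an exception on failure, for instance one that calls response.raise_for_status() from requests before writing the file.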
Advantages of Automating Downloads:
Efficiency: Automates the tedious process of manually downloading files.
Reproducibility: Ensures that analyses can be reproduced with the same datasets.
Scalability: Handles multiple datasets with ease, particularly useful in batch processing.
By leveraging these techniques, we can integrate data acquisition seamlessly into our Python scripts, ensuring an efficient and organized workflow for geospatial projects.
Computation Functions#
In this section, we will create a function, compute_zonal_statistics, that calculates zonal statistics for a raster dataset within the polygons of a given Area of Interest (AOI).
import geopandas as gpd
import rasterio
import rasterstats
import pandas as pd
def compute_zonal_statistics(aoi_path, raster_path, stats=["mean", "sum"]):
    # Load the AOI polygons
    aoi = gpd.read_file(aoi_path)
    # Open the raster file to confirm it exists and is readable
    with rasterio.open(raster_path):
        # Compute zonal statistics using rasterstats
        try:
            zonal_stats = rasterstats.zonal_stats(
                aoi,
                raster_path,
                stats=stats,
                geojson_out=True,
                nodata=None  # you can set nodata value here if needed
            )
        except Exception as e:
            # Chain the original exception so the full traceback is preserved
            raise RuntimeError(f"Error during zonal statistics calculation: {e}") from e
    # Convert the results to a GeoDataFrame
    results_gdf = gpd.GeoDataFrame.from_features(zonal_stats)
    # Return only the AOI attributes and statistics as a DataFrame
    stats_column = [col for col in results_gdf.columns if col in stats]
    output_df = results_gdf[aoi.columns.tolist() + stats_column]
    return output_df
Parameters:
aoi_path: str, Path to the shapefile or GeoPackage containing the AOI polygons.
raster_path: str, Path to the raster file.
stats: list of str, Statistics to calculate (e.g., [“mean”, “sum”, “min”, “max”]).
Returns:
pandas.DataFrame containing the AOI attributes and computed statistics.
# Example usage:
aoi_path = "data/vector/kanchanpur.gpkg"
raster_path = "data/raster/popCount_2020.tif"
zonal_stats_df = compute_zonal_statistics(aoi_path, raster_path, stats=["sum", "mean"])
print(zonal_stats_df[["NAME", "sum", "mean"]])
                  NAME            sum       mean
0         BaisiBichawa   37900.042969   3.872093
1             Beldandi   46565.355469  14.149303
2             Chandani   72813.015625  17.694536
3               Daijee   49223.929688   3.624737
4          Dekhatbhuli   52221.738281   4.353989
5              Dodhara   64436.566406  20.553929
6              Jhalari   40823.878906   2.517972
7               Kalika   97216.468750  28.830507
8           Krishnapur   36368.550781   1.653792
9             Laxmipur  277108.937500  72.295575
10   MahendranagarN.P.   51835.031250   2.278062
11             Parasan   38199.867188   7.639973
12            Pipaladi   46773.808594   8.258088
13     RaikawarBichawa   44571.746094   4.150456
14      RampurBilaspur   48838.773438   9.415611
15      RauteliBichawa   11581.357422   2.904052
16  Royal Shuklaphanta     131.641449   0.002766
17          Shankarpur   19211.746094   4.094575
18             Sreepur   56755.710938   7.707185
19                Suda   58257.640625   7.586618
20      Tribhuwanbasti   37573.945312  14.512918
Key Features:
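Flexible Statistics: the stats parameter controls which statistics are computed.
Nodata Handling: a nodata value can be set in the zonal_stats call if the raster needs one.
Output Format: the result is a table with the AOI attributes followed by the requested statistics.
Error Handling: failures during the calculation are re-raised as a RuntimeError with a descriptive message.
The same function can be reused with other AOI files and raster datasets.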
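To make the idea of zonal statistics concrete, the sketch below computes the sum and mean per zone on a tiny grid using plain Python. It illustrates the concept only and is not how rasterstats works internally; the grids and the function name zonal_sum_mean are made up for this example.

```python
from collections import defaultdict


def zonal_sum_mean(values, zones):
    """Return {zone: {"sum": ..., "mean": ...}} for two equally shaped grids."""
    cells = defaultdict(list)
    # Walk both grids in parallel and group cell values by zone label
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            cells[zone].append(value)
    return {
        zone: {"sum": sum(vals), "mean": sum(vals) / len(vals)}
        for zone, vals in cells.items()
    }


# A 3x3 "raster" of cell values and a matching grid of zone labels
values = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zones = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["C", "C", "B"],
]

stats = zonal_sum_mean(values, zones)
print(stats)
```

Each zone label plays the role of one AOI polygon: every raster cell is assigned to a zone, and the statistics are computed over the cells in that zone.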
In conclusion, functions are essential for writing efficient, reusable, and organized Python code. They let you break complex tasks into smaller pieces, eliminate redundancy, and improve the readability and maintainability of your scripts. As this tutorial showed, the same ideas scale from simple greetings to automating data downloads and computing zonal statistics. Combined with Python's built-in functions and third-party libraries, custom functions make a data analysis workflow more modular and easier to adapt.
This concludes the tutorial's coverage of functions in Python. To explore additional functions or examples, or to ask for clarification on any of the steps covered, please visit the GitHub repository Python_tutorial and feel free to open an issue.