Installing the environment
Create a project folder
Create a project folder for EE 508/DS 537.
Across EE 508, I will refer to this project folder as ~/ee508/. Replace as needed. In Python, ~ can be expanded to your user folder via os.path.expanduser.
Make sure you have write access. We will edit the files using QGIS, Python, and R.
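As a minimal sketch of the ~ expansion mentioned above, the snippet below resolves the project folder and checks for write access. The helper name is_writable is hypothetical, chosen for illustration; the folder name ~/ee508 follows the convention used on this page.

```python
import os

# Expand "~" to the current user's home folder.
project_dir = os.path.expanduser("~/ee508")
print(project_dir)


def is_writable(folder):
    """Return True if the folder exists and the current user may write to it.

    Hypothetical helper for illustration; it does not create the folder.
    """
    return os.path.isdir(folder) and os.access(folder, os.W_OK)


# The parent of the project folder (your home folder) should be writable.
print(is_writable(os.path.expanduser("~")))
```

If the last line prints False, pick a different location for your project folder and replace ~/ee508 accordingly.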
Tip
Directory and file structures in this course will (loosely) follow the convention of the Cookiecutter Data Science project, a community template maintained by DrivenData.
Note for Windows users
Windows users: EE 508 uses OSX/Linux notation for directories and files.
Python usually handles the conversion from slashes / to backslashes \ in filepaths, but errors can still occur (e.g. if you inadvertently mix both in a filepath).
Throughout this course:
When using a filepath provided on this course companion website, think \ instead of /. If you replace the slashes, note that the backslash is the default string escape character, so you have to use double backslashes \\. A better-looking alternative is to add an r in front of the string. This allows you to use single backslashes \ and saves you editing time when copy & pasting a filepath from somewhere else (e.g. your File Explorer).
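The following sketch illustrates the escaping rules described above. The path C:\Users\me\ee508\notebooks is a made-up example, not a path you need to have. PureWindowsPath is used so the snippet behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# Two ways to write the same (hypothetical) Windows path.
escaped = "C:\\Users\\me\\ee508\\notebooks"  # double backslashes
raw = r"C:\Users\me\ee508\notebooks"  # raw string: backslashes stay as typed
print(escaped == raw)

# Without the r prefix, "\n" is read as a single newline character, and
# "\U" starts a Unicode escape, so the unprefixed path would not even parse.
print(len("\n"), len(r"\n"))

# Forward slashes are converted to backslashes for Windows paths.
as_windows = PureWindowsPath("C:/Users/me/ee508/notebooks")
print(str(as_windows) == raw)
```

In everyday code, pathlib.Path picks the right separator for the operating system it runs on, which avoids most of these issues.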
Install Miniconda
Miniconda is a lightweight Python package and environment manager.
Using a package manager ensures that the versions of the dozens of Python packages we use in EE 508 will be compatible with each other. It also helps keep our EE 508 Python environment separate from our system’s Python, avoiding interference.
You’re welcome to use a different tool for the job (such as the heavier Anaconda or the lightweight micromamba), as long as you take care of any environment-related troubleshooting.
Download the -latest- Miniconda version for your operating system and CPU:
Mac
Make sure to pick the right file for your CPU (Intel Macs: x86_64, Apple silicon Macs with M-series chips: arm64).
The simplest way is to use the self-installing .pkg files.
Alternatively, download the .sh files, open Terminal, navigate to the folder that contains the file, and run this command, using the name of the file you downloaded:
bash Miniconda3-latest-MacOSX-x86_64.sh
Windows
Find the executable in the File Explorer and run it (double-click):
Miniconda3-latest-Windows-x86_64.exe
Open Quick search to find your freshly installed Anaconda Prompt. Consider Pin to Start or Pin to taskbar, as we will use this application every time we start working on a lab (instead of Command Prompt or PowerShell).
For the remainder of this course, whenever I suggest commands for the Terminal, use the Anaconda Prompt.
Linux
Open Terminal and run this command, using the name of the file you downloaded:
bash Miniconda3-latest-Linux-x86_64.sh
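If a Python interpreter is already available on your machine (this is an assumption, it is not required), the sketch below suggests which installer matches your operating system and CPU. The function name miniconda_installer is hypothetical, and the filenames other than the three shown on this page are assumed to follow the same naming pattern; check them against the download page.

```python
import platform

# Assumed installer names, following the pattern Miniconda3-latest-<OS>-<CPU>.
INSTALLERS = {
    ("Darwin", "x86_64"): "Miniconda3-latest-MacOSX-x86_64.sh",
    ("Darwin", "arm64"): "Miniconda3-latest-MacOSX-arm64.sh",
    ("Linux", "x86_64"): "Miniconda3-latest-Linux-x86_64.sh",
    ("Linux", "aarch64"): "Miniconda3-latest-Linux-aarch64.sh",
    ("Windows", "AMD64"): "Miniconda3-latest-Windows-x86_64.exe",
}


def miniconda_installer(system, machine):
    """Return the suggested installer filename, or None if unknown."""
    return INSTALLERS.get((system, machine))


# platform.system() and platform.machine() describe the running interpreter.
print(platform.system(), platform.machine())
print(miniconda_installer(platform.system(), platform.machine()))
```

Note that a Python built for Intel and running under emulation on an Apple silicon Mac may report x86_64, so treat the result as a hint rather than a verdict.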
Terms of Service
In Terminal, read and accept the Terms of Service:
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/msys2
Create an empty environment
In Terminal, create your Anaconda environment for EE 508:
conda create -n ee508 -c conda-forge -y mamba
-n ee508 sets the name of your Anaconda environment. Give it any name you like.
-c conda-forge defines the Anaconda channel from which to pull the packages.
-y automatically confirms the list of packages (remove to review and confirm the package list before installing everything).
We request only one package at this stage: mamba. This solver is much faster than Anaconda’s default.
Activate the environment
Still in Terminal, activate your environment:
conda activate ee508
Replace ee508 with the name of your environment, if you picked a different name.
Important
Remember this command! Activating the environment will be the first thing you’ll do every time you work with Python in this course.
The prompt should now start with (ee508). This tells you that the environment is active.
Install packages
Meet your required packages for EE 508, written in the YAML syntax:
name: ee508
channels:
- conda-forge
dependencies:
# Jupyter ecosystem
- jupyter # Jupyter notebooks
- jupyterlab_code_formatter # Auto-format code in JupyterLab
- jupyterlab_execute_time # Show cell execution times
# Code quality tools
- black # Code formatter
- isort # Import sorter
- ruff # Fast linter
# Core data science libraries
- numpy # Arrays
- pandas # Tabular data
- pyarrow # Fast columnar data format
# Geospatial analysis
- geopandas # Vector data
- shapely # Geometric operations
- rtree # Spatial indexing
- pyproj # Coordinate transformations
- rasterio # Raster data I/O
- rioxarray # Xarray integration for rasterio
- rio-cogeo # Cloud Optimized GeoTIFF tools
- rasterstats # Zonal statistics
- pyogrio # Fast vector I/O
# Machine learning
- statsmodels # Statistical modeling
- scikit-learn # General ML library
- xgboost # Gradient boosting
- lightgbm # Microsoft's gradient boosting
- catboost # Yandex's gradient boosting
# Visualization
- matplotlib # Basic plotting
- plotly # Interactive plots
- altair # Grammar of graphics
- folium # Interactive maps
# File I/O
- openpyxl # Excel files
In Terminal, use this single-line install command to install all packages with mamba. Your environment (ee508) must be active.
mamba install -c conda-forge --override-channels jupyter jupyterlab_code_formatter jupyterlab_execute_time black isort ruff numpy pandas pyarrow geopandas shapely rtree pyproj rasterio rioxarray rio-cogeo rasterstats pyogrio statsmodels scikit-learn xgboost lightgbm catboost matplotlib plotly altair folium openpyxl
After resolving package dependencies, mamba will ask you whether you agree with the list. Confirm with y (yes) and Enter, or skip the prompt by adding -y to the command.
Installation of the packages can take a while, as mamba downloads and installs about 5.7 GB.
Once you have installed your packages, I recommend saving your fully resolved environment as another YAML file:
conda env export > ~/ee508/environment.yml
Keep this file in your project folder. It will allow you to re-create the exact same environment, e.g. if you break yours, or get a new machine:
conda env create -f ~/ee508/environment.yml
If you want to use the modern code formatter ruff (recommended), you also need to pip-install jupyter-ruff, which makes ruff accessible in Jupyter.
pip install jupyter-ruff
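To confirm that the installation worked, one option is to ask Python which modules it can find, without importing them. This is a minimal sketch: the helper name missing_modules is hypothetical, and the list uses import names, which differ from the package names in a few cases (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# Import names for a selection of the packages listed above.
MODULES = [
    "numpy", "pandas", "pyarrow", "geopandas", "shapely", "rtree", "pyproj",
    "rasterio", "rioxarray", "rasterstats", "pyogrio", "statsmodels",
    "sklearn", "xgboost", "lightgbm", "catboost", "matplotlib", "plotly",
    "altair", "folium", "openpyxl",
]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


# An empty list means every module was found in the active environment.
print(missing_modules(MODULES))
```

Run it with your ee508 environment active. Any names it prints point to packages that did not install.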
Launch and sanity‑check Jupyter
Create a folder where you’d like to keep all of your Jupyter notebooks for EE 508, e.g.:
~/ee508/notebooks
In Terminal, navigate to the folder (e.g., cd ~/ee508/notebooks).
With your ee508 environment active, launch Jupyter notebook:
jupyter notebook
After a little wait, Terminal indicates that a Jupyter notebook is active and shares its URLs (you can copy & paste them into your browser). Your default browser will open, showing a directory listing of the folder from which you just called the Jupyter notebook (if you just created the folder, there’s not much to see).
If you lose the tab or close the browser, find the URL in Terminal and paste it in your browser.
In the upper right corner of the Jupyter directory listing, select New > Python 3 (ipykernel). A new browser tab will open, showing an empty and “Untitled” notebook.
The notebook you see in the browser is also a file. Or rather, you are seeing an (interactive) HTML website generated from instructions in a JSON (text) file saved with the file extension .ipynb (IPython notebook; IPython is the Python kernel behind Jupyter). Look for it in the folder from which you called Jupyter (~/ee508/notebooks). The title of the notebook in your website is the filename: you change one, and the other changes, too.
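Because a .ipynb file is plain text, it can be read with Python's json module. The sketch below builds a simplified, hand-written notebook structure (real notebooks carry more metadata, such as cell ids and kernel information) and extracts the code it contains. The helper name code_sources is hypothetical.

```python
import json

# A simplified notebook with one code cell, written by hand for illustration.
minimal_notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import geopandas as gpd\n", "gpd.__version__"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# Round trip through text, as if the notebook were saved and reopened.
text = json.dumps(minimal_notebook, indent=1)
loaded = json.loads(text)


def code_sources(notebook):
    """Return the source of each code cell as a single string."""
    return [
        "".join(cell["source"])
        for cell in notebook["cells"]
        if cell["cell_type"] == "code"
    ]


print(code_sources(loaded))
```

To inspect one of your own notebooks, open the file and pass it to json.load in place of the hand-written dictionary.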
As you open the notebook, Jupyter also starts a new Python process (kernel) in the background that should now be active and is waiting for your input.
Type the following code into the first cell and run it:
import geopandas as gpd
gpd.__version__
The text to the left of the cell should change from [ ] to [*] while Python executes your command. The first time you run this code will take a bit, as Python initializes your environment. After a short wait, the text should change to [1] and print the version of geopandas you installed. Your environment is now ready for your input.
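A second sanity check, sketched below, is to confirm that the notebook kernel runs from your course environment rather than from your system's Python. It assumes the environment lives in a folder named after it (for example .../envs/ee508), and the helper name runs_in_env is hypothetical.

```python
import sys
from pathlib import Path


def runs_in_env(env_name, prefix=sys.prefix):
    """Return True if the interpreter's prefix folder is named env_name."""
    return Path(prefix).name == env_name


# Path of the Python executable behind this kernel.
print(sys.executable)
# Should print True if the kernel was started from the ee508 environment.
print(runs_in_env("ee508"))
```

If this prints False, close Jupyter, activate the environment in Terminal, and launch Jupyter again.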
Fine-tune Jupyter
Return to the directory view (Home) by selecting its browser tab (if still open) or by clicking the Jupyter icon in any notebook.
In the menu, find Settings > Settings Editor.
The Settings Editor is where you can fine-tune how the Jupyter interface in your browser appears and reacts.
Code formatting: your environment comes with several code formatters, including isort and black. We will use both code formatters religiously throughout the course: it makes both code and diffs (code comparisons) more readable.
Click Jupyterlab Code Formatter. You should see both isort and black listed as default_formatter.
Let’s leave the settings as they are. We’ll accept black’s opinion on 88-character line length, which breaks PEP 8 convention (79 characters).
Rulers: I like to have two rulers in my code cells, so I can see how much space I have before the line break (72 characters for comments / docstrings, 88 for code).
Click Notebook in the left sidebar menu.
In the main window, find Rulers. Add one at 72 for comments, one at 88 for code.
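The two rulers correspond to a simple rule that can also be checked in code. The sketch below flags lines longer than 72 characters for comments and 88 for code. It is a simplification: it only treats lines starting with # as comments and does not recognize docstrings. The function name overlong_lines is hypothetical.

```python
def overlong_lines(source, code_limit=88, comment_limit=72):
    """Return (line_number, length, limit) for each line exceeding its limit."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Simplification: only full-line comments count as comments.
        is_comment = line.lstrip().startswith("#")
        limit = comment_limit if is_comment else code_limit
        if len(line) > limit:
            problems.append((number, len(line), limit))
    return problems


sample = "x = 1\n" + "# " + "a" * 71 + "\n" + "y = " + "1" * 85
print(overlong_lines(sample))
```

In practice, black takes care of the 88-character limit for code, so the rulers mostly help with comments and docstrings.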