1.1. Countries and threatened species with QGIS
In this lab, we explore how the average “richness” (i.e., the count of overlapping ranges) of threatened species varies across countries and administrative regions. Along the way, we will learn how to create maps and process data using QGIS, a free and open-source geographic information system.
Deliverable
A map showing the global average threatened species richness (for birds, amphibians, and mammals) by first-level country subdivisions (states, provinces, etc.)
Due date
Tuesday, September 9, 2025, by 6:00 p.m.
1.1.1. Get the vector and raster data
1.1.1.1. Political boundaries
Natural Earth provides high-quality vector data of political boundaries at coarse scales (best suited for global maps).
On their 1:10m Cultural Vectors page, find Admin 0 – Countries and pick the version “without boundary lakes” (4.8 MB):
ne_10m_admin_0_countries_lakes.zip
Find Admin 1 – States, Provinces and pick the version “without large lakes” (14 MB):
ne_10m_admin_1_states_provinces_lakes.zip
Tip
If you need more precise vector data for fine-scale projects: GADM (the Database of Global Administrative Areas) has high-quality vector data of country and country subdivision boundaries down to four levels of subdivision (e.g., country > states > counties > towns in the US). The full layer is 537 MB. GADM does not exclude large inland lakes (e.g., Great Lakes).
Many countries provide higher-quality vector data for national mapping purposes. In the United States, an excellent source for administrative and census-level subdivisions is NHGIS, where you can download full-country shapefiles for counties (230 MB), subdivisions (580 MB), census tracts (723 MB) and more, with oceans and Great Lakes excluded.
1.1.1.2. Richness of threatened species
Clinton Jenkins maintains BiodiversityMapping.org, where you can download raster data (GeoTIFFs) of aggregate species richness for different groups of species.
We will use his global terrestrial maps in the GeoTIFF format. The original data is here:
BiodiversityMapping_TIFFs_2019_03d14_revised.zip
You can download the original file here. However, the data contains empty values and has convoluted file names. To save you the renaming and zero-filling (this lab is long), I provide zero-filled data on the EE 508 drive. Download here:
Rasters for threatened species have the suffix _thr.
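To make "richness" concrete, here is a toy sketch in Python. It is not how the BiodiversityMapping.org rasters were built; it only illustrates the idea of counting overlapping ranges. The species names and grid cells are made up.

```python
from collections import Counter


def richness(ranges):
    """Count how many species ranges overlap each grid cell.

    `ranges` maps a species name to the set of (row, col) cells it occupies.
    Cells not covered by any range are simply absent from the result.
    """
    counts = Counter()
    for cells in ranges.values():
        counts.update(cells)  # each species adds 1 to every cell in its range
    return dict(counts)


# Hypothetical ranges on a tiny grid (illustrative only).
toy_ranges = {
    "species_a": {(0, 0), (0, 1)},
    "species_b": {(0, 1), (1, 1)},
    "species_c": {(0, 1)},
}
toy_richness = richness(toy_ranges)
```

Cell (0, 1) is covered by all three toy ranges, so its richness is 3.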
Tip
Directory structures in this course are inspired by the Cookiecutter Data Science project, a community template maintained by DrivenData.
1.1.1.3. Download and unzip
Download all data files into a folder for this lab, e.g. ~/ee508/data/external/lab1/part1/.
Unzip the .zip files from Natural Earth.
Write access and unzipping are important, because we will use QGIS to edit the files and add new columns to the vector data.
Attention
If this is the first time you are working with vector data in the ESRI shapefile format (.shp), note that a .shp file is useless without its companion files of the same name (.cpg, .dbf, .prj, .shx - a caboodle?). You have to keep these files in the same folder, copy them as a group, etc. If you don’t plan to edit them, it’s easiest to leave them in their .zip file, which QGIS and Python’s geopandas can read as is.
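If you want to check that a shapefile's companions survived a copy or unzip, a small standard-library sketch like the following can help. The function name `missing_companions` is hypothetical, and the list of extensions is simply the one named in the note above.

```python
import tempfile
from pathlib import Path

# Companion extensions named in the note above.
COMPANIONS = (".cpg", ".dbf", ".prj", ".shx")


def missing_companions(shp_path):
    """Return the companion extensions that are missing next to a .shp file."""
    shp = Path(shp_path).expanduser()
    return [ext for ext in COMPANIONS if not shp.with_suffix(ext).exists()]


# Demonstration on empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_shp = Path(tmp) / "demo.shp"
    for ext in (".shp", ".dbf", ".shx"):
        demo_shp.with_suffix(ext).touch()
    demo_missing = missing_companions(demo_shp)
```

In the demonstration, `demo_missing` lists the two extensions that were not created. An empty list means the group is complete.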
1.1.2. Visualize the data
Open QGIS and start a New Empty Project.
Save it, e.g. ~/ee508/reports/lab1/part1/world.qgz.
Drag-and-drop the two political boundary layers into your QGIS project.
Important
QGIS automatically assigns the coordinate reference system (CRS) of the first imported layer to the entire project.
In essence, the CRS defines how we go from a 3D world (reality) to a 2D map. Because many of our operations end up happening in 2D (screens, geometric operations, etc.), knowing, choosing, and tracking CRS are essential tasks for geospatial data specialists. Read more: What is a CRS?.
Natural Earth uses the most common CRS for global mapping: latitude and longitude based on the WGS84 ellipsoid. This CRS is used by Google Maps, GPS, etc. and identified by its unique EPSG code 4326. It is now also the CRS for your QGIS project. You can change the project CRS anytime via the menu Project > Properties… > CRS.
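As an illustration of how a CRS turns angles on a globe into flat map coordinates, the sketch below converts EPSG:4326 longitude/latitude into spherical Web Mercator (EPSG:3857), the projection most web tile basemaps use. It is not a lab step, since QGIS does such transformations for you, and the function name is hypothetical.

```python
import math

# WGS84 semi-major axis in meters, used as the sphere radius by EPSG:3857.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


x_origin, y_origin = lonlat_to_web_mercator(0.0, 0.0)
x_edge, _ = lonlat_to_web_mercator(180.0, 0.0)
```

Note how `y` grows without bound toward the poles. This is why Web Mercator maps stretch high latitudes, and why the choice of CRS matters for area-based comparisons.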
Make the fill color of the political boundary layers transparent, so only outlines remain:
Layers panel > right click on the vector layer > Properties…
In Symbology > click on the Color bar > set Opacity to 0.00%. The color bar will show a grey-white checkerboard in the right half.
To better distinguish countries from subdivisions, you can give the latter a thinner outline or increase its transparency:
In Symbology > click Simple Fill > reduce Stroke Width to 0.05.
Alternative: click Simple Fill > click on the Stroke color > set Opacity to 25%.
Alternative: click Layer Rendering > set Opacity to 25%. This makes the entire layer more transparent (including potential fill colors, labels, symbols, etc.).
Drag-and-drop the threatened species richness layers (_thr) into your QGIS project.
Give all layers short and meaningful names:
Layers panel > right-click on layer > Rename Layer.
Alternative (macOS): Layers panel > left-click on layer > press Enter.
Make sure the political boundary layers are shown on top of the rasters by dragging and dropping them in the Layers panel until both political boundary layers are listed first.
Change the color mapping of the species raster data:
Right-click on species raster layer listed on top > Properties…
Under Symbology > Band Rendering > Render type, choose Singleband pseudocolor.
To make sure the visualization includes all values, open Min / Max Value Settings, make sure the Min / max option is selected and the Statistics extent is Whole Raster. Set Accuracy to Actual (slower).
Pick a Color ramp you like.
In Mode, choose Equal Interval (with 5 classes). Click Classify.
If you do not like the shape of the slope of the color gradient, you can manually change the Value of each color step in the color ramp.
Close the dialog box with OK.
Repeat for each raster layer with a different color gradient.
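For intuition about what Equal Interval does, here is a minimal Python sketch that splits the range between a raster's minimum and maximum into classes of equal width. The function name is hypothetical, and how QGIS treats values exactly on a class boundary is not covered here.

```python
def equal_interval_breaks(vmin, vmax, n_classes=5):
    """Return the upper bound of each of n_classes equally wide classes."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    width = (vmax - vmin) / n_classes
    return [vmin + width * i for i in range(1, n_classes + 1)]


# A raster with values from 0 to 10, split into 5 classes.
example_breaks = equal_interval_breaks(0, 10, 5)
```

With skewed data, most cells can end up in one or two of these classes. That is one reason you may want to adjust the values of the color steps by hand.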
1.1.3. Add basemaps
In the Browser panel, right-click on XYZ Tiles and choose New Connection…
If the Browser panel is not visible: Menu > View > Panels > Browser
In Name, enter Google Satellite.
In URL, copy and paste the following URL, then click OK:
https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}
Drag-and-drop the Google Satellite layer from Browser into the panel Layers. You should see the familiar Google Satellite view. (If the view appears to be incorrectly projected, zoom in a bit, and it will self-correct).
Many other layers are made available by different providers in that way (e.g. Google Road, OpenStreetMap, etc.). See this post for the corresponding links.
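The placeholders {x}, {y}, and {z} in the URL are filled in by QGIS for every tile it requests. The sketch below shows the commonly used XYZ ("slippy map") tile formula and how it plugs into the template from this section. The function names are hypothetical, and you never have to do this by hand.

```python
import math

GOOGLE_SATELLITE = "https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}"


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge stay inside the valid tile range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(template, lon_deg, lat_deg, zoom):
    """Fill an XYZ URL template for the tile containing a lon/lat point."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return template.format(x=x, y=y, z=zoom)


example_url = tile_url(GOOGLE_SATELLITE, 0.0, 0.0, 1)
```

Each zoom level doubles the number of tiles along each axis, so QGIS fetches sharper tiles as you zoom in.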
Explore
Zoom in a bit. Switch the various layers on and off.
How do global distributions of threatened species ranges differ?
In absolute counts and in spatial patterns?
What might be an explanation for that?
1.1.4. Calculate species richness stats by subdivision and species class
Open the Processing Toolbox: Menu > Processing > Toolbox.
Search for zonal.
Open the tool Raster analysis > Zonal statistics (double-click).
In Input layer, choose the subdivisions (states/provinces) polygon layer.
In Raster layer, choose one of your species raster layers.
Choose an Output column prefix that identifies the species group you picked (e.g., amph_, bird_, or mamm_).
Next to Statistics to calculate, click the … button, uncheck everything except Mean, and close.
Run the algorithm.
Oh no! You’re getting an error (if you used the original files from Natural Earth)
Feature (439) […] from ne_10m_admin_1_states_provinces_lakes has invalid geometry. Please fix the geometry or change the “Invalid features filtering” option for this input or globally in Processing settings.
This means that the polygon layer has errors. This happens surprisingly often in practice.
Fix it with the Fix geometries tool:
Menu > Processing > Toolbox > search for fix > Fix geometries
Pick your layer and save to a new file. Run.
Use the fixed vector dataset from here onwards.
My recommendation is to delete the old one, so you don’t mix them up.
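What does "invalid geometry" mean? One common case is a polygon outline that crosses itself (a "bowtie"). The sketch below detects that case in plain Python. It covers only this one kind of invalidity, it is not what Fix geometries does internally, and I am not claiming it is the problem with feature 439. All names are hypothetical.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left, -1 right, 0 collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def _segments_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 cross at a single interior point."""
    return (
        _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
        and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
    )


def ring_self_intersects(ring):
    """Check a ring, given as a list of (x, y) without a repeated closing vertex."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edges that only share an endpoint do not count as crossing.
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bowtie = [(0, 0), (1, 1), (1, 0), (0, 1)]  # same corners, visited in a crossing order
```

Here `ring_self_intersects(square)` is False and `ring_self_intersects(bowtie)` is True. Operations such as zonal statistics need an unambiguous inside and outside, which a self-crossing outline does not provide.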
QGIS will create a new temporary vector layer, named Zonal Statistics, that has the same geometries and data as the original vector layer (subdivisions), with a new data column that contains the results from Zonal Statistics.
You can examine the data by right-clicking on the layer > Open Attribute Table.
Rename the layer Zonal Statistics, so that it is uniquely identified (we will run the algorithm two more times, and it tends to use the same name for the output layer).
Create one vector layer with three columns of species richness, one for each species group, by repeating the above steps. You can accumulate columns by using the output vector layer of each Zonal Statistics run as the Input layer for the next run.
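Conceptually, the Mean statistic averages all raster cells that fall inside each polygon. The sketch below mimics this on a tiny grid. It assumes the zones are already rasterized as one zone id per cell, a simplification of what QGIS does with polygon geometries. The function name and data are hypothetical.

```python
def zonal_mean(raster, zones, nodata=None):
    """Mean raster value per zone id; None for zones without any valid cell."""
    sums, counts = {}, {}
    for raster_row, zone_row in zip(raster, zones):
        for value, zone in zip(raster_row, zone_row):
            # Register the zone even if this cell holds no data.
            sums.setdefault(zone, 0.0)
            counts.setdefault(zone, 0)
            if value is None or value == nodata:
                continue  # cells without data do not enter the mean
            sums[zone] += value
            counts[zone] += 1
    return {
        zone: (sums[zone] / counts[zone] if counts[zone] else None)
        for zone in sums
    }


toy_raster = [[1, 2, None],
              [3, 4, None]]
toy_zones = [["A", "A", "B"],
             ["A", "C", "B"]]
toy_means = zonal_mean(toy_raster, toy_zones)
```

Zone "A" averages three cells, zone "C" a single cell, and zone "B" has no valid cells at all, so its mean is missing rather than zero.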
Once you’re done, visualize the zonal statistics results with a choropleth map:
Layer panel > right-click on the vector layer > Properties…
In Symbology, choose Graduated from the top-level dropdown menu
In Value, pick the data column you’d like to visualize. The ones you just generated should be at the bottom of the list, with the suffix _mean.
In Mode, choose Natural Breaks (Jenks). Set Classes to 10.
Click the Classify button, then OK.
“Choro… what?”
“Choro-” derives from the Ancient Greek χώρα (chōra), meaning “place, region, or area.”
“-pleth” comes from the Ancient Greek πληθής (plēthēs), meaning “multitude” or “full, filled”, and is related to πλῆθος (plēthos), meaning “a multitude, quantity, or crowd.”
A Choropleth map is a map where regions (chōra) are filled (plēth-) with colors or patterns according to the values of a particular variable (like population density, temperature, election results, etc.).
The term was coined in English in the early 20th century by geographers and cartographers, based on these Greek roots, to describe this specific type of thematic map.
Repeat the above steps for the other two species groups so you end up with all three columns (amph_thr, bird_thr, and mamm_thr) in the attribute table.
Question
Why do you think some state polygons get dropped when you visualize the amphibian species richness?
Short questions like this one will appear throughout EE 508 labs. Their purpose is to invite you to pause and reflect on what you’re doing. Unless otherwise noted, you don’t have to write up an answer to these questions! Write-up instructions usually appear at the end of labs and tend to address bigger-picture items.
1.1.5. Calculate average threatened species richness across all classes
Layers panel > right-click on the state/provinces vector layer > Open Attribute Table.
Open the Field Calculator with Ctrl + I (Command + I on macOS) or by clicking the Field Calculator button in the attribute table toolbar.
Choose an Output field name (e.g., thr_mean).
In Output field type, choose the correct variable type (you have to know this).
The middle window contains a list of functions and field names you can use.
Click on Fields and Values, then click on one of the three fields you just calculated. Notice how helpful information is displayed on the right-hand side.
Double click on each field name to add it to the Expression window. Notice that you need to use quotation marks to refer to your fields in the Field Calculator. Also notice that an error notice (Expression is invalid) appears at the bottom.
Still in the Expression window, add + signs between the three field names in quotes to signal that you would like to sum up the values. When your formula is done, press OK.
Inspect the values. What happened to the values in the subdivisions for which we had no data for amphibians?
Let’s re-compute threatened species richness across groups - but set empty input values to 0.
Re-open the Field Calculator. You can find your previous formula at the bottom of the middle window, under Recent (fieldcalc). Double click on it.
Replace each field name in the summation by
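if("fieldname" IS NULL, 0, "fieldname")
where "fieldname" is the name of the field you would like to add. Use straight double quotes around field names; curly quotes pasted from a document will not be recognized.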
This will set any empty values to zero.
Your final expression should be a sum of the three if(…) statements.
Choose an Output field name and Output field type and press OK.
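The difference between the two expressions can be sketched in Python, with None standing in for NULL. The function names are hypothetical. The first mimics a plain sum in which a missing input makes the whole result missing; the second mimics the zero-filled version.

```python
def strict_sum(*values):
    """Plain sum with NULL propagation: any missing input gives a missing result."""
    if any(v is None for v in values):
        return None
    return sum(values)


def zero_filled_sum(*values):
    """Sum that treats missing inputs as 0, like if("field" IS NULL, 0, "field")."""
    return sum(0 if v is None else v for v in values)


# A subdivision with bird and mammal values but no amphibian value.
amph, bird, mamm = None, 1.5, 2.0
plain_result = strict_sum(amph, bird, mamm)
filled_result = zero_filled_sum(amph, bird, mamm)
```

Keep in mind that zero-filling is a modeling choice: it treats "no data" as "no threatened species", which may or may not be appropriate for a given region.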
1.1.6. Make a map
Visualize the new column. Choose a Natural Breaks (Jenks) classification with 10 classes.
Caution
Make sure that the new classification is based on your new variable and covers its full range. The easiest way to guarantee that is to click the Classify button after switching the variable you want to classify. If you don’t do this and erroneously keep the classes of another variable, polygons with values outside the range of classes might not get displayed.
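To see why stale classes can make polygons disappear, consider this sketch of assigning a value to a class. It assumes classes defined by a lower limit and a sorted list of upper bounds, with upper bounds inclusive. The exact boundary handling in QGIS may differ, and the function name is hypothetical.

```python
from bisect import bisect_left


def class_index(value, lower, upper_bounds):
    """Index of the class containing value, or None if it lies outside all classes."""
    if value is None or value < lower or value > upper_bounds[-1]:
        return None  # nothing to draw: the value falls outside the classified range
    return bisect_left(upper_bounds, value)


# Classes computed for one variable with values between 0 and 6 ...
old_upper_bounds = [2, 4, 6]
inside = class_index(3, 0, old_upper_bounds)
# ... applied to a different variable that reaches higher values.
outside = class_index(7, 0, old_upper_bounds)
```

The value 3 lands in the second class (index 1), while 7 matches no class at all. Clicking Classify again recomputes the bounds from the new variable.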
Take some time to inspect the results:
Compare with the columns for individual species groups. Does one species group dominate the overall pattern? Why might that be the case?
Do you consider this computation appropriate to visualize differences in threatened species richness across administrative units?
Save your map:
Leave only the state/provinces layer visible, displaying average threatened species richness by state/province, summed across all three groups.
Export your map as a PNG image to your class folder.
Menu Project > Import/Export > Export Map to Image…
In Extent, open the Calculate from Layer dropdown, and select the layer.
Increase the Resolution to 200 dpi.
Click Save to save it as:
~/ee508/reports/lab1/1_threatened_species_richness_by_state.png
Attention
Please use this exact directory structure and filename here and for the rest of the course (replacing ~/ee508/ with your project folder). This helps me automate some of the evaluation process, which in turn allows me to offer this class to so many students with individualized feedback.
Nice job!
You have already learned a few useful skills in QGIS:
Visualizing raster and vector layers.
Finding tools in the Processing toolbox and using them.
Computing new values in vector data.
Saving a map.
1.1.7. Reflect
In a short paragraph (approx. half a page), consider this question:
Now that we have this raster data on species richness, (how) can we use it to identify priority areas where more (or less) conservation actions should occur, e.g. new regulations, land acquisitions, or conservation payments?
This is not about being correct. Use this opportunity to think about it and share your own thoughts (not those of an AI). Spend no more than ~10min on it.
Save your writeup:
~/ee508/reports/lab1/1_writeup.docx
1.1.8. Wrap up
Your folder ~/ee508/reports/lab1 should now contain both deliverables:
1_threatened_species_richness_by_state.png
1_writeup.docx
Compress the two files into a single .zip archive.
Find the Google Assignment with the lab title on the Blackboard course website:
Upload your .zip archive.
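If you prefer to create the archive with a script instead of your file manager, a standard-library sketch could look like this. The archive name lab1.zip is an assumption (the lab does not prescribe one), and the demonstration runs on empty placeholder files in a temporary folder.

```python
import tempfile
import zipfile
from pathlib import Path

DELIVERABLES = ("1_threatened_species_richness_by_state.png", "1_writeup.docx")


def zip_deliverables(folder, archive_name="lab1.zip"):
    """Bundle the two deliverables found in `folder` into a single archive."""
    folder = Path(folder).expanduser()
    archive = folder / archive_name  # hypothetical archive name
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in DELIVERABLES:
            # arcname stores only the file name, not the full folder path.
            zf.write(folder / name, arcname=name)
    return archive


# Demonstration on empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in DELIVERABLES:
        (Path(tmp) / name).touch()
    with zipfile.ZipFile(zip_deliverables(tmp)) as zf:
        archived_names = sorted(zf.namelist())
```

For the real deliverables, you would call `zip_deliverables("~/ee508/reports/lab1")`, replacing ~/ee508/ with your project folder.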
You’re done!
That’s it.
I hope you enjoyed your first steps in QGIS.
Bring your questions and opinions to class!
We’ll continue in Python. Is your environment ready? If not, see Installing the environment.