.. EE 508 / DS 537: Data Science for Conservation Decisions

.. only:: instructor

   Clarify more explicitly, including in the language, that Scenario 1 uses conditional means as predictors, and that Scenario 6 is a required minimum.
   Let the WTA of the landowner vary as much as the observed prediction error in the cross-validation.
   Consider using `mask` instead of `i` for the selection.

.. _lab3-2:

Optimality of policy targeting: simulating incentives to avoid carbon loss
==========================================================================

In this lab, we explore how spatial heterogeneity in conservation values, threat, and costs can help us understand how different policy targeting strategies might differ in terms of their cost-effectiveness (how much of an outcome you can achieve for a given budget).

Using estimates from the previous exercise, we will explore how design features of a voluntary land protection program can affect how many avoided carbon emissions we can obtain for a given budget.

.. admonition:: Deliverables

   - Map of the estimated protection costs per ton of expected avoided CO\ :sub:`2` emissions.
   - Graph of the supply curve of CO\ :sub:`2` emissions through avoided forest loss in Massachusetts.
   - One :file:`.csv` table with expected outcome indicators of 9 (or more) policy scenarios.
   - Maps of parcels selected under the different scenarios.
   - Write-up including estimates of the relative cost and benefits (in terms of avoided carbon emissions) from different land acquisition programs aiming to reduce forest loss in Massachusetts.

.. admonition:: Due date

   Friday, Nov 7, 2025, by 6:00 p.m.

Understand how targeting can affect policy outcomes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Our first few policy targeting strategies will follow principles similar to those formulated in our reading: `Newburn et al.
(2006) Habitat and Open Space at Risk of Land-Use Conversion: Targeting Strategies for Land Conservation `_

You already reviewed the empirical estimation sections in :ref:`Lab 3-1 `. This time, remind yourself of the problem setup, then read the sections on targeting strategies, simulations, and main findings.

.. list-table::
   :header-rows: 1
   :widths: 40, 60

   * - Chapter
     - Instructions
   * - Introduction
     - Re-read to refresh your memory.
   * - Modeling Framework for Prioritizing Conservation Easements
     - Revisit the sections where the targeting strategies (EBC, BC, and EB) are defined for the first time. You don't have to get all of the algebra, but make sure you understand the basic targeting principles.
   * - Empirical Procedure

       - Research Study Area
       - Environmental Benefits Index
       - Land-Use Change Model
       - Valuation of Development Rights Model
     - Re-read to refresh your memory.
   * - - Targeting Scenarios and Assessment
     - Read only for the framing assumptions. Our simulations will be simpler (single-time, not dynamic).
   * - Results and Discussion on Targeting Simulations
     - **Figure 1** and **Table 5** contain the key findings. Continue the lab when you feel ready to explain **Figure 1** and **Table 5**.

.. admonition:: Questions

   How are *Cost*, *Benefit*, and *Expected Benefit* defined? How do they map to the notions of *Value* and *Threat* used in other parts of EE 508?

   Which targeting strategy is found to generate the highest overall benefits for a given budget?

.. _lab3-2-problem-framing:

Frame the problem and analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's keep things as simple as possible and introduce a number of **simplifying assumptions** in our problem framing. You can relax them later, e.g., as part of your :ref:`project `.

.. admonition:: Problem framing

   Suppose a group of donors (e.g., Massachusetts state agencies, companies subject to California's cap-and-trade scheme, philanthropic donors, a student-driven initiative, ...)
   were interested in avoiding carbon emissions from expected forest loss in Massachusetts.

   - **Choice**: the donors wish to reach their goal exclusively by placing conservation easements on properties that contain forests at risk of conversion (but that are not yet protected with an easement or park).
   - **Objective function**:

     - The donors' goal is to minimize carbon emissions over a 100-year horizon.
     - Their discount rate is zero (i.e. they value a ton of carbon emitted in the future the same as if it were emitted today).
     - All purchases occur in the year for which we estimated sales prices (so we don't have to adjust for inflation, price growth, etc.).

.. admonition:: Estimation assumptions

   - We assume that the expected parcel-level **rate of forest change is constant** over a 100-year time horizon. In other words, we assume that the estimate of forest change for 1990-2010 can be linearly extrapolated into the future until all forest on a property is lost. We do not consider parcels that are expected to gain forest cover.
   - Average aboveground carbon stocks in Massachusetts are about 110t per hectare of forest (Lucy Hutyra, personal communication). We assume that all of it is lost when forest is converted. We ignore changes in other stocks. Converting C stocks to CO\ :sub:`2` (``* 44 / 12``), we get about **400t of CO**\ :sub:`2` **emissions per hectare of converted forest**.
   - Our land cover change dataset **underestimates forest loss by a factor of 2**, as estimated against a statistical sample drawn for much of New England [1]_. We assume we can apply the same correction factor to parcels located in Massachusetts without biasing the analysis.
   - A property can be protected in perpetuity with a conservation easement at **40% of the fair market value** of acquiring the property. This is a very rough estimate. It reflects the average % reduction in property value from ~200 conservation easement appraisals from a different state (Colorado).
     The value of easements (i.e. the reduction in property value due to the additional restrictions) will vary for each property and state. Properly estimating their price effect from empirical data in the presence of many non-linear confounders is really difficult (a real-life estimation issue that has been actively exploited to create illegal tax havens). Assessing the value of conservation easements might become easier as data on easements improve; the quasi-experimental methods that we will learn in Lab 4 might help with the estimation of causal differences.
   - Once a property is protected (acquired or encumbered with a conservation easement), the expected **risk** of forest loss is reduced to **zero**.
   - There is **no leakage**. In other words, protecting any number of parcels will not increase the risk that other parcels will be converted. This is a **big** assumption (not really true) and worth relaxing in a project. The reading, Newburn et al. 2006, provides some ideas for corrections.

.. [1] *Olofsson P, Holden CE, Bullock EL, Woodcock CE (2016)* `Time series analysis of satellite data reveals continuous deforestation of New England since the 1980s `_. *Environ Res Lett 11(6):064002*.

Estimate parcel-level cost-benefit ratio
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To make sure that results are comparable across students, we will all work with the same estimates (predictions) of expected forest change and land acquisition cost. Find them at:

:file:`USMA_parcel_predictions.parquet`

Import the predictions and join them to the ``parcels``.

- Given the above assumptions, compute what percentage of each parcel is expected to change from forest to non-forest over the next 100 years (i.e. five twenty-year periods).
- Account for the fact that we are likely to be underestimating forest change by a factor of 2.
- Switch the sign of the estimate (make one of the multipliers negative), so forest loss is measured in positive numbers.
- Clip (censor) the resulting numbers to the plausible range (i.e. no more than the forest existing in 2010 can be lost, and we ignore forest growth).

Write out the code first (on a piece of scratch paper, a note, or in your head). Then reveal the solution. Do this for all of the following steps.

*Don't skip ahead by just looking it up. If you do, you miss this first opportunity to memorize the relationships between the variables. Remembering them will be helpful when we play with policy design options.*

.. dropdown:: Reveal solution

   ::

      # 5 twenty-year periods x correction factor 2 x sign flip = -10
      parcels['p_f_loss_pred_100yr'] = (
          parcels['forestchange_pred']
          .mul(-10)
          .clip(upper=parcels['perc_forest_2010'], lower=0)
      )

- How many hectares of forest do we expect to be lost on each parcel? We need to divide the percentage change by 100 (to get a value between 0 and 1) and multiply it by the parcel size:

.. dropdown:: Reveal solution

   ::

      parcels['ha_f_loss_pred_100yr'] = parcels['p_f_loss_pred_100yr'] * parcels['ha'] / 100

- Compute how many tons of CO\ :sub:`2` we expect to be lost on each parcel due to conversion:

.. dropdown:: Reveal solution

   ::

      parcels['tco2_loss_pred_100yr'] = parcels['ha_f_loss_pred_100yr'] * 400

- Convert the predicted logged per-hectare sales price back to total parcel value in USD:

.. dropdown:: Reveal solution

   ::

      parcels['acq_cost_pred_usd'] = np.exp(parcels['price_pred']) * parcels['ha']

- Estimate the conservation easement cost (40% of parcel value):

.. dropdown:: Reveal solution

   ::

      parcels['easement_cost_pred_usd'] = parcels['acq_cost_pred_usd'] * 0.4

- Compute the conservation easement cost per ton of expected avoided CO\ :sub:`2` emissions.

.. dropdown:: Reveal solution

   ::

      parcels['easement_cost_usd_per_tco2'] = parcels['easement_cost_pred_usd'] / parcels['tco2_loss_pred_100yr']

Map the conservation easement cost per ton of expected avoided CO\ :sub:`2` emissions.
- Call ``plot`` with the argument ``scheme='quantiles'``, preferably with ``k=10`` or larger (the default equal-interval scheme will not look very informative).
- Exclude parcels with infinite costs (usually because of zero threat / no forest).
- Remove the box and add the outline of Massachusetts.

Save the map with a meaningful title as:

:file:`~/ee508/reports/lab3/2_easement_cost_usd_per_tco2.png`

.. admonition:: Take some time to explore the map

   Note how the price of saving carbon emissions by avoiding forest loss varies across the state. In some areas, the price is high because the land is expensive. In others, the price is high because the risk of losing forest is low.

Draw the supply curve for avoided emissions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assuming that our estimates are correct, we can construct an estimated supply curve for avoided CO\ :sub:`2` emissions from avoided forest conversion through easements in Massachusetts.

A supply curve indicates how much of a good or service will be offered / produced at any given price. By convention, the y-axis shows the price (here: $/tCO\ :sub:`2`), and the x-axis shows the total quantity of a good obtained at that price (here: total avoided CO\ :sub:`2` emissions in tons). We can construct a supply curve by sorting all offers from the lowest to the highest price per unit.

Go ahead and create the estimated supply curve for a price range of 0-1,000 $/tCO\ :sub:`2`:

- Sort the parcels from low to high prices per ton of avoided carbon.
- Use ``.cumsum()`` to create a new column that aggregates the tons of carbon avoided from the cheapest to the most expensive ton.
- Select all parcels that offer CO\ :sub:`2` savings at less than 1,000 $/tCO\ :sub:`2`.
- Plot the graph. You can plot the relationship of two variables with the column names ``'col1'`` and ``'col2'`` in a DataFrame ``parcels`` as a line with ``parcels.plot('col1', 'col2')``.
  Add a title and meaningful axis labels (``ax.set_xlabel()``, ``ax.set_ylabel()``).

Save the plot as:

:file:`~/ee508/reports/lab3/2_tco2_supply_curve.png`

Find out which parcels would be reasonable for the donors to protect with conservation easements if the donors valued a ton of CO\ :sub:`2` emissions at:

- The clearing price of auctions of greenhouse gas allowances of the Regional Greenhouse Gas Initiative (RGGI), a cap-and-trade program that includes Massachusetts: about **$22**/tCO\ :sub:`2eq`: `RGGI Auction Results, Sep 3, 2025 `_
- The Swedish carbon tax, one of the highest in the world: **$155**/tCO\ :sub:`2eq`: `Carbon Taxes in Europe, 2025 `_
- The Environmental Protection Agency's last estimate of the "social cost of carbon" (SCC) under Biden, for the year 2020: **$190**/tCO\ :sub:`2eq`: `Davenport (2023) Biden Administration Unleashes Powerful Regulatory Tool Aimed at Climate. New York Times. `_

For the sake of simplicity, **ignore all inflation** between estimated sales prices and the carbon tax years. (If you want to correct for that, you're welcome to. Just make it replicable and flag it in your writeup.)

For each carbon price, compute and note the following outcome variables:

1. how many parcels (count) would be reasonable to protect (``n``)
2. the total area of those parcels in hectares (``ha_pc``)
3. the total area of avoided forest loss in hectares (``ha_loss_avoided``)
4. the total CO\ :sub:`2` emissions saved in tCO\ :sub:`2` (``tco2_avoided``)
5. the total cost for the conservation easements in $, assuming that the indicated price is paid (``budget``)
6. the average cost per tCO\ :sub:`2` saved in $/tCO\ :sub:`2` (``avg_usd_per_tco2``)

.. tip::

   For this exercise and the following, it will be useful to write a code snippet that prints (or writes to :file:`.csv`) these six values automatically for any selection of parcels. That will save you time when you change the selection.
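One possible shape for such a snippet is sketched below on a tiny made-up table, so it is not a solution for the real data. The function name ``summarize_selection``, its ``payment_col`` argument, and all numbers are hypothetical. The sketch also assumes that "reasonable to protect" means a predicted cost per ton at or below the carbon price, and that the budget is the sum of the predicted easement costs; adapt both if you read the task differently.

```python
import numpy as np
import pandas as pd


def summarize_selection(parcels, is_selected, payment_col="easement_cost_pred_usd"):
    """Return the six outcome indicators for the parcels flagged in is_selected.

    payment_col names the column holding what is actually paid per parcel
    (assumption: the predicted easement cost unless stated otherwise).
    """
    sel = parcels.loc[is_selected]
    budget = sel[payment_col].sum()
    tco2 = sel["tco2_loss_pred_100yr"].sum()
    return {
        "n": int(len(sel)),
        "ha_pc": sel["ha"].sum(),
        "ha_loss_avoided": sel["ha_f_loss_pred_100yr"].sum(),
        "tco2_avoided": tco2,
        "budget": budget,
        # Guard against an empty selection (no avoided emissions)
        "avg_usd_per_tco2": budget / tco2 if tco2 > 0 else np.nan,
    }


# Toy data: four invented parcels, for illustration only
toy = pd.DataFrame({
    "ha": [10.0, 20.0, 5.0, 8.0],
    "ha_f_loss_pred_100yr": [1.0, 4.0, 0.0, 0.5],
    "easement_cost_pred_usd": [8000.0, 40000.0, 10000.0, 60000.0],
})
toy["tco2_loss_pred_100yr"] = toy["ha_f_loss_pred_100yr"] * 400
# Parcels without expected loss get an infinite cost per ton
toy["easement_cost_usd_per_tco2"] = (
    toy["easement_cost_pred_usd"] / toy["tco2_loss_pred_100yr"]
)

carbon_price = 22  # $/tCO2, e.g. the RGGI clearing price
is_selected = toy["easement_cost_usd_per_tco2"].le(carbon_price)
outcomes = summarize_selection(toy, is_selected)

# One row per scenario; rows like this can be stacked and written with .to_csv()
print(pd.DataFrame([outcomes], index=[f"${carbon_price}/t"]))
```

Returning a dictionary keeps the column names of the outcome table in one place, so every later scenario only has to supply a different ``is_selected`` (and, where payments differ from predicted costs, a different payment column).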
Summarize the values for each of the three prices in a :file:`.csv` file with the following structure. We will add more scenarios to it later. Save it as:

:file:`~/ee508/reports/lab3/2_policy_scenario_outcomes.csv`

.. only:: todo

   I've also added a few values so you know when you are on the right track.

.. list-table::
   :header-rows: 1
   :widths: 20 7 8 13 10 8 14

   * - Scenario
     - n
     - ha_pc
     - ha_loss_avoided
     - tco2_avoided
     - budget
     - avg_usd_per_tco2
   * - $22/t
     -
     -
     -
     -
     -
     -
   * - $155/t
     -
     -
     -
     -
     -
     -
   * - $190/t
     -
     -
     -
     -
     -
     -
   * - scenario_1
     -
     -
     -
     -
     -
     -
   * - ...
     -
     -
     -
     -
     -
     -

Model the implications of policy design choices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We will now explore the implications of different policy design choices on potential policy outcomes. Specifically, we will calculate how many tons of carbon dioxide equivalents we can avoid with a fixed budget under different policy design choices.

We keep our simplifying assumptions (including the assumption that our estimates are correct!), but add a few more important ones:

- Let's assume the policy maker has a budget of $125 million. They cannot spend a cent more.

  ::

     BUDGET = 125000000

- The policy maker can reach every single landowner and inform them about the policy.
- There are no transaction costs. *Scenario 5 will relax this assumption.*
- Every landowner is willing to put an easement on their property if offered the indicated price. *Scenario 6 will relax this assumption.*
- Only entire parcels can be protected (not only certain parts, e.g. forest).

An efficient way to go through this exercise is to create, for each scenario, a boolean Series - ``is_selected`` - that identifies all parcels that will be protected in that scenario (i.e., it is ``True`` for all parcels that will be selected for protection). The task thus becomes to find that ``Series`` for each scenario. Once you have found ``is_selected``, you can compute the six outcome values listed above very quickly with the same code snippet.
For each scenario:

- Save the six outcome indicators listed above in your table (under the name ``scenario_1``, ``scenario_2``, etc.)

  :file:`~/ee508/reports/lab3/2_policy_scenario_outcomes.csv`

- Create a map of the selected parcels. A quick way to do this is to plot the points (``parcel_centroids``) with ``markersize='ms_large'`` (the points are on the small side if you use ``'ms'``). Plot all parcels with the same color and ``alpha=0.5`` to see overlapping parcels. Add the outline of Massachusetts. Save the map for each scenario as:

  :file:`~/ee508/reports/lab3/2_selected_parcels_scenario_X.png`

  where :file:`X` is the respective scenario number.

Ready to roll? Let's go. Here are our six policy targeting scenarios and modeling approaches:

Scenario 1: Optimal allocation
------------------------------

This scenario assumes that all of our estimates are correct, and that the landowner will accept any deal at (or above) the estimated easement cost (i.e., we perfectly estimated the price at which they're willing to accept the offer). The policy maker can therefore offer exactly that amount to each landowner whose forests they want to preserve, and every landowner will voluntarily accept such a deal.

The policy maker selects the parcels with the lowest cost-benefit ratio (``easement_cost_usd_per_tco2``) - or highest benefit-cost ratio - until the budget is exhausted.

You can implement this as follows:

- Sort the parcels by their cost-benefit ratio (lowest on top).
- Use ``.cumsum()`` (cumulative sum) to create a new column that gives you the cumulative expense, starting with the top row: ``'easement_cost_pred_usd_sum'``.
- Use ``is_selected`` to identify all parcels whose values in ``'easement_cost_pred_usd_sum'`` do not exceed the fixed budget:

.. dropdown:: Reveal solution

   ::

      is_selected = parcels['easement_cost_pred_usd_sum'].le(BUDGET)

- Go ahead, compute & save the values, and create the map.
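To see how the three steps fit together, here is a minimal sketch on four invented parcels with a toy budget. The numbers are made up and the budget is deliberately tiny; only the column names follow the lab.

```python
import pandas as pd

BUDGET = 100_000  # toy budget for this illustration; the lab uses $125 million

# Four invented parcels
toy = pd.DataFrame({
    "easement_cost_pred_usd": [8000.0, 40000.0, 60000.0, 30000.0],
    "tco2_loss_pred_100yr": [400.0, 1600.0, 200.0, 600.0],
})
toy["easement_cost_usd_per_tco2"] = (
    toy["easement_cost_pred_usd"] / toy["tco2_loss_pred_100yr"]
)

# 1. Cheapest tons first (parcels with infinite cost per ton end up last)
toy = toy.sort_values("easement_cost_usd_per_tco2")

# 2. Running total of what has been spent down to each row
toy["easement_cost_pred_usd_sum"] = toy["easement_cost_pred_usd"].cumsum()

# 3. Keep every parcel whose running total still fits the budget
is_selected = toy["easement_cost_pred_usd_sum"].le(BUDGET)

spent = toy.loc[is_selected, "easement_cost_pred_usd"].sum()
tco2_avoided = toy.loc[is_selected, "tco2_loss_pred_100yr"].sum()
print(toy[["easement_cost_usd_per_tco2", "easement_cost_pred_usd_sum"]])
print(f"selected {is_selected.sum()} parcels, spent ${spent:,.0f}, "
      f"avoided {tco2_avoided:,.0f} tCO2")
```

Note that this cutoff stops at the first parcel that no longer fits. It does not go looking for a smaller, cheaper parcel further down the list that might still fit into the remaining budget. With a large budget and many parcels, the difference is usually small.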
Scenario 2: Panic buying
------------------------

The policy maker offers payments only to the **most threatened** parcels (starting with the lowest ``forestchange_pred`` value) until the budget is exhausted. Only parcels with forest cover in 2010 are eligible. As before, payment is based on the estimated easement costs.

- Note that this means you'll have to remove ineligible parcels *before* applying ``.cumsum()``.

Scenario 3: Forest flat rate
----------------------------

The policy maker offers every landowner a fixed payment for each *hectare of forest* protected by an easement. Landowners will participate if the offered payment is higher than the cost of the easement. For simplicity, only entire parcels (all forests) can be protected, and the forest area on a parcel in 2010 is used as the basis of payment.

You will need to find the per-hectare price at which enough landowners participate to stay just under the budget. For simplicity, find the multiple of $10/ha (e.g., $2,860/ha, $2,870/ha, etc.) for which this condition holds true. You can do this iteratively: set a value, predict enrollment and budget, and adapt your price depending on the result. Start with large steps (e.g. $1,000 vs. $5,000), then narrow down.

.. tip::

   ``perc_forest_2010`` is given as a percentage (0-100).

The payment offered to the landowners is now based on the per-hectare price, *not* on the cost of the easement!

Scenario 4: Carbon flat rate
----------------------------

The policy maker offers landowners a *fixed payment* for each ton of estimated avoided carbon emissions. Landowners participate if the offered payment is higher than the cost of the easement. As before, only entire parcels can be protected.

Find the level of the fixed payment per ton of estimated carbon emissions (highest integer value, no fractions) for which enrollment stays just under the budget. Once more, payments to landowners are **not** determined by the estimated easement cost.
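The price search in the two flat-rate scenarios can be automated with a simple loop. The sketch below does this for a carbon flat rate on four invented parcels with a toy budget. The helper name ``enroll``, the search range, and all numbers are hypothetical; the forest flat rate works the same way with hectares of forest in place of tons of CO\ :sub:`2`.

```python
import pandas as pd

BUDGET = 110_000  # toy budget for this illustration; the lab uses $125 million

# Four invented parcels
toy = pd.DataFrame({
    "easement_cost_pred_usd": [8000.0, 40000.0, 60000.0, 30000.0],
    "tco2_loss_pred_100yr": [400.0, 1600.0, 200.0, 600.0],
})


def enroll(price_per_tco2):
    """Return (is_selected, total payments) for a given flat rate in $/tCO2."""
    payment = toy["tco2_loss_pred_100yr"] * price_per_tco2
    # Landowners join only if the offer is higher than their easement cost
    is_selected = payment.gt(toy["easement_cost_pred_usd"])
    # The flat-rate payment, not the easement cost, is what gets spent
    return is_selected, payment.where(is_selected, 0.0).sum()


best_price = None
for price in range(1, 1001):  # integer prices only; the upper bound is arbitrary
    _, spent = enroll(price)
    if spent > BUDGET:
        break  # spending only grows with the price, so we can stop here
    best_price = price

is_selected, spent = enroll(best_price)
print(f"price: ${best_price}/tCO2, enrolled: {is_selected.sum()}, spent: ${spent:,.0f}")
```

Stopping at the first price that breaks the budget is safe because a higher rate never lowers total spending: every enrolled landowner is paid more, and nobody drops out. On the full dataset, a coarse-to-fine search (large steps first, then smaller ones) gets to the same answer with fewer evaluations.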
Scenario 5: Transaction costs
-----------------------------

Based on Scenario 1 and its assumptions, consider the fact that protection transactions are not without cost. Instead, assume that each easement transaction costs $45,000 per parcel.

- I found this to be the average self-reported transaction cost for 600 conservation easement transactions conducted by Great Outdoors Colorado, the state-financed easement acquisition program.

How do outcomes change?

.. tip::

   Make sure to include the transaction costs in the total budget computation and the average cost / tCO\ :sub:`2eq`.

Scenario 6: Prediction errors
-----------------------------

Start with Scenario 1, but assume that the policy maker made mistakes in predicting the minimum price at which landowners would be willing to conserve their land (willingness to accept, or **WTA** for short). This could be for multiple reasons: many landowners might be unwilling to protect their land at fair market value because they have other plans for it, or simply don't want a land trust or government to monitor them. On the other hand, some landowners might be willing to protect their land for less.

Given our data, we don't know the landowners' preferences, but we can model what implications such errors in the prediction can have for policy results.

Create a new column, ``'price_pred_wta'``, which adds random noise to the predicted mean (with a standard deviation of 0.25) and increases the average a little (which gives the average landowner a slight preference *against* sharing their property rights with someone else).

::

   random_noise = np.random.normal(scale=0.25, size=len(parcels))
   parcels['price_pred_wta'] = parcels['price_pred'] + random_noise + 0.1

Suppose the policy maker sends an offer to each landowner, starting with those who will be (according to the knowledge of the policy maker) the most cost-efficient in providing carbon emission reductions.
The policy maker offers payment in exchange for protection based on the estimated cost of the conservation easements. So far, that's just Scenario 1. However, the landowner will only accept if their *actual* (here: random) *WTA* is lower than the offer. In this case, the transaction occurs at the price proposed by the policy maker. Otherwise, they will reject and the parcel will not get protected. This continues until the budget is exhausted.

Make sure all your results are saved in :file:`2_policy_scenario_outcomes.csv` and that you have a map for each scenario.

Write up your findings
~~~~~~~~~~~~~~~~~~~~~~

Write up a short summary (800-1200 words), addressing these questions:

- How do Scenarios 1-6 **differ** in how much carbon emissions we estimate we can reduce for a given budget?
- Why do you think we observe these differences? Explain each case.
- Overall, what insights (if any) did you have about policy design, specifically about avoiding carbon emissions through voluntary long-term forest protection (assuming, of course, that our assumptions did not lead us astray too much)? Or, more pointedly, what *should* policy makers do if they wanted to maximize avoided carbon emissions for a given budget? Make sure to distinguish between:

  1. modeling decisions that are policy design choices
  2. modeling decisions that attempt to improve our model of reality (measures or modeling approach)

- How realistic and feasible do you think each of the stylized policy scenarios is (consider implementing them in practice)?
- What caveats do you think there are to this analysis?
- How would you improve the analysis if you were hired by a donor to do so?

There's no need to overthink this. Don't spend more than 30 minutes on this.
Save the document as:

:file:`~/ee508/reports/lab3/2_policy_scenario_reflections.docx`

Improve the scenarios (optional)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above scenarios, like any model, are simplified and idealized versions of reality, developed to illustrate the relevance of different pieces of the policy design puzzle. Making these scenarios more realistic is an important task for those interested in informing public policy decisions.

You can gain up to 15% of credits on this lab if you create a new, more realistic simulation of a proposed policy intervention that you could actually conceive of being implemented in Massachusetts.

Some of the options to add more realism / complexity include:

- Combining the different issues showcased above into a new scenario.
- Thinking more rigorously about how implementation would actually happen in practice and what practical problems public policy makers might encounter, and modeling how that might affect cost and participation (e.g. how to reach landowners, how to make offers, etc.).
- Relaxing any other assumptions. For inspiration, re-read the long list of assumptions in the :ref:`problem framing `.
- Students who develop additional scenarios and include them in their writeup can gain up to 15% of credits. Credit will be allocated as a function of the extent to which the innovation enhances the realism of the scenarios or highlights policy design issues.

Last but not least: to my knowledge, there is no published work on this topic. If you are interested in using this approach and analysis to develop a peer-reviewed journal article (the currency of the academic world), feel free to mention it in your writeup.
Wrap up
~~~~~~~

Make sure your :file:`~/ee508/reports/lab3/` directory includes all of the following:

| :file:`2_easement_cost_usd_per_tco2.png`
| :file:`2_policy_scenario_outcomes.csv`
| :file:`2_policy_scenario_reflections.docx`
| :file:`2_selected_parcels_scenario_1.png`
| :file:`2_selected_parcels_scenario_2.png`
| :file:`2_selected_parcels_scenario_3.png`
| :file:`2_selected_parcels_scenario_4.png`
| :file:`2_selected_parcels_scenario_5.png`
| :file:`2_selected_parcels_scenario_6.png`
| :file:`2_tco2_supply_curve.png`

**Congratulations!** After learning how to use linear regression and machine learning models to explain and predict outcomes of interest, you used (part of) that information to examine how different policy design choices lead to different results in terms of the cost-effectiveness of conservation projects.

But we've only scratched the surface. You can go into more detail as part of your :ref:`Individual project `.

.. admonition:: How Newburn led to ``openplaces``

   In 2016, when I was a postdoc at Stanford University, I wrote to David Newburn and asked whether he'd be open to sharing his Sonoma County data with me. Unfortunately, the data was not recoverable at the time.

   "I have not even looked at the Sonoma dataset for at least 5-10 years after publishing the papers. It’s not something I could just send because much of it lives on a GIS server at UC Berkeley, and I do not even have access to it. It is also somewhat out of date since it had development from 1990 to 2002. I would recommend that you try to build the same type of data set elsewhere. [...] There are several data sets that would need to be integrated, many of which I’m not even sure where they are located on the server.
   **I suggest you create your own data elsewhere.** [It] is not as simple as sending you a single data set."

   You could say that this experience - the observation that all the labor poured into developing feature-rich parcel datasets can be easily lost to time - sparked the idea for `places `_ and then ``openplaces``: open-source software systems to create, maintain, and archive parcel datasets for the benefit of replicable research.

.. admonition:: Project idea

   Would you perhaps be interested in updating Newburn's analysis for Sonoma County for the 2002 - 2024 timeframe? You could explore where and how targeting ultimately occurred, and whether the recommendations and models held up.