.. EE 508 / DS 537: Data Science for Conservation Decisions

.. _group-project:

Directed group project
======================

Fall 2025 introduces the option of students joining a directed group project as an alternative to the :doc:`individual_project` (the default). The directed group project allows students to skip the ideation phase, to jump right into data exploration and coding, and to contribute directly to empirical analyses for present-day policy making.

Principles
~~~~~~~~~~

The instructor proposes a scalable research project.

- *Scalable* means that the project consists of mutually independent (parallelizable) tasks: if a student drops out, it affects coverage, but doesn't break the project.

Each student spends approximately:

- 50% of project time on the scalable activity to create a shared resource for the group (e.g., data preparation, ingestion, documentation).
- 50% of project time using the shared resource (e.g., dataset) to answer a research or policy question of their own choosing.

Project governance is participatory: interdependent project decisions (those affecting several participants, e.g., standards and references for data structure and quality) are made by the group, with equal voting power for each group member (e.g., ranked choice voting).

- Students reserve ~1h each week for a live group meeting (in-person or Zoom) to discuss and make decisions.
- Depending on group size and student interests, we can experiment with algorithmic participatory project governance, e.g. coming up with a list of tasks that would create the highest added value to the project, ranking them democratically, assigning deadlines, and picking and implementing a "fair" task allocation procedure.

Participants complete their tasks in accordance with group deadlines and submit their code to the group, alongside a short individual project report.

Group project results will be presented to the EE/CDS community in early Spring 2026 (attendance optional).

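
To make the ranked choice voting mentioned under project governance concrete, here is a minimal sketch of one common variant (instant-runoff). It is an illustration only, not a procedure the group has adopted: the function name ``instant_runoff``, the alphabetical tie-break, and the example options are all assumptions made for this sketch.

```python
def instant_runoff(ballots):
    """Pick a winner by instant-runoff (one form of ranked choice voting).

    Each ballot is a list of options, ordered from most to least preferred.
    Returns the winning option, or None if no ballots name any option.
    """
    remaining = sorted({option for ballot in ballots for option in ballot})
    while len(remaining) > 1:
        # Count each ballot for its highest-ranked option still in the race.
        tallies = {option: 0 for option in remaining}
        active = 0
        for ballot in ballots:
            top = next((o for o in ballot if o in tallies), None)
            if top is not None:
                tallies[top] += 1
                active += 1
        leader = max(remaining, key=lambda o: tallies[o])
        # A strict majority of the still-active ballots wins outright.
        if tallies[leader] * 2 > active:
            return leader
        # Otherwise drop the weakest option and recount.
        # Assumption: ties for last place are broken alphabetically.
        loser = min(remaining, key=lambda o: (tallies[o], o))
        remaining.remove(loser)
    return remaining[0] if remaining else None


# Hypothetical vote by four group members on a shared file format.
example_ballots = [
    ["geoparquet", "geopackage", "shapefile"],
    ["geopackage", "geoparquet", "shapefile"],
    ["shapefile", "geopackage", "geoparquet"],
    ["geopackage", "shapefile", "geoparquet"],
]
```

Calling ``instant_runoff(example_ballots)`` finds no majority in the first round, eliminates one of the two options tied for last place, and transfers that ballot to its next preference. The group would still need to agree on its own tie-breaking rule.
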

Fall 2025 project
~~~~~~~~~~~~~~~~~

Meeting time
------------

The directed study group currently has 4 students and meets **Fridays, 3:15pm-4:15pm, on Zoom** (https://bostonu.zoom.us/j/9191896749).

Project goal
------------

Let's create a map of parcel-level estimates of land value for 2025, using only publicly available parcel, sales & satellite data. Publicly available sales data exists in several states. How do the 2025 estimates compare to estimates from 2010? Have things shifted since the pandemic?

Step 1: Data synthesis
----------------------

Create a linked parcel-sales dataset that follows a shared standard. Begin by picking a state from the list below. Try to understand the sales data and prepare the parcel data so that the two can be linked. We will meet to define the standard. I can assist you with establishing the linkage based on tax assessor parcel numbers (APNs).

Picked so far:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - State
     - Parcel boundaries
     - Sale prices & dates
   * - Connecticut
     - `2024 Connecticut Parcel and CAMA `_
     - `Real Estate Sales 2001-2023 `_, State of Connecticut
   * - Florida
     - `Florida Statewide Parcel Map `_
     - The two most recent sale prices & dates are part of the parcel data. `Assessment Roll and GIS Data `_
   * - Massachusetts
     - `Massachusetts Property Tax Parcels `_

       Ask me for 2018 versions of the database to filter out likely intra-family transfers based on name similarity.
     - The last sale price & date are part of the parcel data.
   * - Wisconsin
     - `Statewide Parcel Map Initiative `_
     - `Real Estate Transfer Data `_, Department of Revenue, State of Wisconsin

There are more states with data available:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - State
     - Parcel boundaries
     - Sale prices & dates
   * - Illinois
     - Available
     - `PTAX-203, -203-A, and -203-B records `_
   * - Indiana
     - Available
     - `Property Sales Disclosures `_ StatsIndiana
   * - Minnesota
     - Available
     - `Minnesota Department of Revenue real estate transaction data `_
   * - New Jersey
     - Available
     - `SR1A Sales Files, Division of Taxation, NJ Treasury `_
   * - Vermont
     - Available
     - `VT Property Transfers `_ (ArcGIS REST service)

Step 2: Individual research question
------------------------------------

Pick a question to answer based on the synthesized dataset - or make up your own! You can pick one question with multiple tasks as a group, as long as every group member has unique responsibilities for clearly delineated and fairly distributed tasks.

Emphasizing empirical economics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- How have land values - and thus, the cost of making conservation happen - changed over the past decade, and in particular since the pandemic?
- In this Special Issue of the *Journal of Housing Economics*, seven teams of econometricians attempted to estimate the value of land under houses, all using the same dataset from one large county.

  Zabel J & McMillen D (eds.) (2022) `Land Values. Special Issue. `_ *Journal of Housing Economics*

  Perhaps you pick one of these methods and see whether you can reproduce their findings?

Emphasizing data science
^^^^^^^^^^^^^^^^^^^^^^^^

- How can predictive skill be improved with better features?

  - Building permits?
  - Zoning?
  - Image recognition on free high-res imagery (e.g. `NAIP `_)?
  - Viewshed computation?

- How can predictive skill be improved with advanced methods (e.g., deep learning)?

Emphasizing justice
^^^^^^^^^^^^^^^^^^^

- How "fair" is tax assessment as compared to estimated market values? Can we detect systematic biases in assessment?

Emphasizing spatial policy
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Do the changes to land values affect spatial priorities, e.g.
  for conservation planning?

Spring 2026 opportunities
~~~~~~~~~~~~~~~~~~~~~~~~~

This research will have synergies with the release of `PLACES-FMV CONUS `_, a parcel-level land value dataset, created with NSF support, to be published in 2026. Students with successful projects will have the option of co-authoring a peer-reviewed publication, developed over the course of Spring 2026.

Students who liked EE 508 / DS 537 and did well (or have similar skills) are invited to join a Spring 2026 effort to develop and use ``openplaces``, a new open-source data and analytics platform for integrating parcel boundaries, environmental indicators, and socio-economic data at scale.

Directed study
--------------

I will be offering a **directed study** (minimum 2, up to 4 credits) for students interested in *using* ``openplaces`` to conduct an applied analysis or research project that is of interest to them. Students join with a research idea, e.g.:

- "I want to estimate:

  - [the effects of private reserves in Chile on deforestation]
  - [optimal cost-effective solar energy siting in North Carolina]
  - [how changing lake water quality affects housing prices in Wisconsin]

- I will learn to use ``openplaces`` software, infrastructure, and community support to synthesize my datasets and run the analysis.
- I will write custom scripts to accomplish my research goals (e.g., geography- and theme-specific data ingestion and analytics). I will benefit from group support and instructor guidance to develop these scripts.
- Upon completion, my code will be reviewed and, if appropriate, integrated into the open-source codebase. I will receive credit as a named collaborator on the `openplaces GitHub `_ page in addition to authorship of my research."

Examples of countries for which parcel data is available:

- Brazil, Canada, Colombia, Denmark, United States.

Research assistantships
-----------------------

I will also be hiring 2-3 **research assistants**, selected based on merit and skills, to fulfill essential roles in developing and maintaining the open-source codebase and data, including:

- Documentation
- Code review
- Assisting other students with technical issues

Paid roles will be distinct from those of students joining the directed study to ensure compliance with BU employment classifications.