Show packages used in this analysis
library(sf)
library(tidyverse)
library(here)
library(scales)
library(knitr)December 20, 2025

California’s San Joaquin Valley (SJV) faces mounting pressures from decades of unsustainable groundwater use. Before 2014, agricultural production in the SJV was driven by unrestricted groundwater pumping, with extractions exceeding replenishment by 2 million acre-feet per year over the preceding three decades (Butterfield et al., 2025). The Sustainable Groundwater Management Act (SGMA), passed in 2014, requires that regions where groundwater is pumped more quickly than it is replenished set limits on current and future groundwater use (Butterfield et al., 2025). Local agencies were required to form groundwater sustainability agencies (GSAs) and develop groundwater sustainability plans (GSPs) to achieve sustainable use over the next two decades (Leahy, 2016).
To meet SGMA-mandated reductions in groundwater use, it’s estimated that over 500,000 acres of irrigated agricultural land will need to be retired throughout the SJV (Hanak et al., 2023). While daunting, this transition opens the door for promising alternative land uses, including habitat restoration, water-limited cropping, solar development, and recharge basins (Hanak et al., 2023).
This analysis is part of a larger research project at the Earth Research Institute at UC Santa Barbara, led by Principal Investigator Dr. Ashley Larsen (Assistant Professor, Bren School of Environmental Science & Management). Building on the approaches of Bryant et al. (2020) and Hyon et al. (2025), our team is developing spatial optimization models using prioritizr in R to evaluate strategic agricultural land retirement scenarios across 500,000+ acres in the San Joaquin Valley. The optimization framework assesses economic, ecological, and hydrologic outcomes of phased land retirement to identify solutions that balance competing objectives—maximizing habitat creation, water savings, and solar development potential while minimizing costs to the agricultural economy.
A critical input to the spatial optimization model is the economic cost associated with retiring each parcel of agricultural land. For fields that are currently in production, this cost can be estimated from the crop being grown. However, for fields classified as idle or fallow in our base year (2022), we need to look backwards in time to determine what was last cultivated—this provides a reasonable proxy for the land’s productive potential and thus its opportunity cost of retirement.
This analysis uses Land IQ crop mapping data to trace the agricultural history of idle and fallow fields, identifying the most recent cultivated crop from historical datasets spanning 2014-2021. The resulting last_comm (last commodity) attribute will be used to assign economic values to idle fields in the prioritizr optimization framework.
Land IQ has worked in cooperation with the Department of Water Resources (DWR) to complete the statewide crop mapping for 2014, 2016, and then for each hydrologic year beginning in 2018. At the time of this analysis, the 2023 water year dataset was provisional and the 2024 water year was in the classification process. As stated on the Land IQ website, “the mapping categorizes nearly 15.4 million acres of land use into more than 50 crop and land use types with accuracies exceeding 98%. Field-by-field classification resulted in over 446,000 individually classified polygons with a minimum field size of 0.5-2.0 acres depending on crop type.”
Their datasets classify agricultural fields by crop type using a combination of satellite imagery, aerial photography, and ground-truth verification. For this analysis, we used Land IQ data from multiple years:
UniqueID field identifiers allowing for direct joins. Original data included up to four crop rotations, peak date of vegetative production and percentage of field cropped.Land IQ uses several classifications to indicate that a field is not currently in production:
Unclassified Fallow: Land not cropped during the current water year only.Idle - Long Term: Land that has been fallow or idle for 3 consecutive water years.Idle - Short Term: Land not cropped the current or previous water year, but cropped within the past three years.Idle: General idle classification in 2014 and 2016 datasets.The first step is to load the Land IQ datasets from each year and define which crop classifications indicate idle or fallow land.
# define file paths to cleaned LandIQ plots
path_2022 <- here("posts/2025-12-20-last-cultivated-crop/data/5_cropRotation/sjvYearRotation/sjvYearRotation.shp")
path_2021 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2021/5_cropRotation_2021/sjvYearRotation/sjvYearRotation_2021.shp")
path_2020 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2020/5_cropRotation_2020/sjvYearRotation/sjvYearRotation_2020.shp")
path_2019 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2019/5_cropRotation_2019/sjvYearRotation/sjvYearRotation_2019.shp")
path_2018 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2018/5_cropRotation_2018/sjvYearRotation/sjvYearRotation_2018.shp")
path_2016 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2016/5_cropRotation_2016/sjvYearRotation/sjvYearRotation_2016.shp")
path_2014 <- here("posts/2025-12-20-last-cultivated-crop/data/LandIQ_processing/2014/5_cropRotation_2014/sjvYearRotation/sjvYearRotation_2014.shp")
# read in all cleaned LandIQ plots
landiq_2022 <- read_sf(path_2022)
landiq_2021 <- read_sf(path_2021)
landiq_2020 <- read_sf(path_2020)
landiq_2019 <- read_sf(path_2019)
landiq_2018 <- read_sf(path_2018)
landiq_2016 <- read_sf(path_2016)
landiq_2014 <- read_sf(path_2014)
# define the crop classes that indicate fallow/idle land
idle_comms <- c("Unclassified Fallow", "Idle - Long Term", "Idle - Short Term",
"Idle")To efficiently join historical crop data to the 2022 idle fields, we create lookup tables containing just the field identifier (uniqu_d) and crop classification (comm) from each year.
# extract just UniqueID and comm from each year
lookup_2021 <- landiq_2021 %>%
st_drop_geometry %>%
select(uniqu_d, comm) %>%
rename(comm_2021 = comm)
lookup_2020 <- landiq_2020 %>%
st_drop_geometry %>%
select(uniqu_d, comm) %>%
rename(comm_2020 = comm)
lookup_2019 <- landiq_2019 %>%
st_drop_geometry %>%
select(uniqu_d, comm) %>%
rename(comm_2019 = comm)
lookup_2018 <- landiq_2018 %>%
st_drop_geometry %>%
select(uniqu_d, comm) %>%
rename(comm_2018 = comm)We separate the 2022 data into idle/fallow fields and cultivated fields, then join the historical crop data to the idle fields. The 2018-2021 data can be joined directly using the unique field identifier (uniqu_d), while the 2014 and 2016 data require spatial joins based on geometry overlap.
# separate idle/fallow fields from cultivated fields
landiq_2022_idle <- landiq_2022 %>%
filter(comm %in% idle_comms)
landiq_2022_cultivated <- landiq_2022 %>%
filter(!(comm %in% idle_comms))
# join historical data (2018-2021) to 2022 idle/fallow fields using UniqueID
landiq_2022_idle_joined <- landiq_2022_idle %>%
left_join(lookup_2021, by = c("uniqu_d")) %>%
left_join(lookup_2020, by = c("uniqu_d")) %>%
left_join(lookup_2019, by = c("uniqu_d")) %>%
left_join(lookup_2018, by = c("uniqu_d"))
# 2014 and 2016 don't have a Unique ID column like 2018-2021 data
# we need to do a spatial join based on geometry to get those years
# Ensure all layers have the same CRS
if (st_crs(landiq_2022_idle_joined) != st_crs(landiq_2016)) {
landiq_2016 <- st_transform(landiq_2016, st_crs(landiq_2022_idle_joined))
}
if (st_crs(landiq_2022_idle_joined) != st_crs(landiq_2014)) {
landiq_2014 <- st_transform(landiq_2014, st_crs(landiq_2022_idle_joined))
}
# prepare 2014 and 2016 data for join -- keep just the crop variable
landiq_2016_slim <- landiq_2016 %>%
select(comm_2016 = crp2016)
landiq_2014_slim <- landiq_2014 %>%
select(comm_2014 = crp2014)
# spatial join 2016 data to idle fields (using largest overlap)
landiq_2022_idle_joined <- landiq_2022_idle_joined %>%
st_join(landiq_2016_slim, join = st_intersects, largest = TRUE)
# spatial join 2014 data to idle fields
landiq_2022_idle_joined <- landiq_2022_idle_joined %>%
st_join(landiq_2014_slim, join = st_intersects, largest = TRUE)For each idle/fallow field, we search backwards through time to find the most recent year when the field had a cultivated crop. The function checks each year in sequence (2021, 2020, 2019, 2018, 2016, 2014) and returns the first non-idle (i.e., cultivated) crop classification found.
# create a function to find the last cultivated crop for each idle/fallow field
find_last_crop <- function(comm_2021, comm_2020, comm_2019, comm_2018, comm_2016,
comm_2014, idle_comms) {
# Check 2021 first
if (!is.na(comm_2021) && !(comm_2021 %in% idle_comms)) {
return(comm_2021)
}
# Check 2020
if (!is.na(comm_2020) && !(comm_2020 %in% idle_comms)) {
return(comm_2020)
}
# Check 2019
if (!is.na(comm_2019) && !(comm_2019 %in% idle_comms)) {
return(comm_2019)
}
# Check 2018
if (!is.na(comm_2018) && !(comm_2018 %in% idle_comms)) {
return(comm_2018)
}
# Check 2016
if (!is.na(comm_2016) && !(comm_2016 %in% idle_comms)) {
return(comm_2016)
}
# Check 2014
if (!is.na(comm_2014) && !(comm_2014 %in% idle_comms)) {
return(comm_2014)
}
# No cultivated crop found
return(NA_character_)
}
# apply the function to determine last cultivated crop for idle/fallow fields
landiq_2022_idle_final <- landiq_2022_idle_joined %>%
mutate(
last_comm = mapply(
find_last_crop,
comm_2021, comm_2020, comm_2019, comm_2018, comm_2016, comm_2014,
MoreArgs = list(idle_comms = idle_comms)
)
)After determining the last cultivated crop for each idle field, we can analyze how successful we were in finding historical crop data and visualize the results.
# Count of idle/fallow fields by classification type
idle_summary <- landiq_2022_idle_final %>%
st_drop_geometry() %>%
group_by(comm) %>%
summarise(
total_fields = n(),
fields_with_last_crop = sum(!is.na(last_comm)),
fields_without_last_crop = sum(is.na(last_comm)),
.groups = "drop"
) %>%
mutate(
percent_with_last_crop = round((fields_with_last_crop / total_fields) * 100, 1),
percent_without_last_crop = round((fields_without_last_crop / total_fields) * 100, 1)
)# create a formatted summary table
idle_summary %>%
select(
`Idle classification` = comm,
`Total fields` = total_fields,
`With last crop` = fields_with_last_crop,
`% found` = percent_with_last_crop,
`Without last crop` = fields_without_last_crop,
`% not found` = percent_without_last_crop
) %>%
kable()| Idle classification | Total fields | With last crop | % found | Without last crop | % not found |
|---|---|---|---|---|---|
| Idle - Long Term | 7173 | 2498 | 34.8 | 4675 | 65.2 |
| Idle - Short Term | 7751 | 6845 | 88.3 | 906 | 11.7 |
| Unclassified Fallow | 7987 | 7435 | 93.1 | 552 | 6.9 |
# reshape data for stacked bar chart
idle_summary_long <- idle_summary %>%
select(comm, fields_with_last_crop, fields_without_last_crop) %>%
pivot_longer(
cols = c(fields_with_last_crop, fields_without_last_crop),
names_to = "status",
values_to = "count"
) %>%
mutate(
status = case_when(
status == "fields_with_last_crop" ~ "Last Crop Found",
status == "fields_without_last_crop" ~ "No Previous Crop Found"
),
status = factor(status, levels = c("No Previous Crop Found", "Last Crop Found"))
)
# create stacked bar chart
ggplot(idle_summary_long, aes(x = reorder(comm, -count), y = count, fill = status)) +
geom_bar(stat = "identity", position = "stack") +
geom_text(aes(label = comma(count)),
position = position_stack(vjust = 0.5),
size = 3.5, color = "white", fontface = "bold") +
scale_fill_manual(
values = c("Last Crop Found" = "#2E7D32",
"No Previous Crop Found" = "tan"),
name = "Historical Data Status"
) +
scale_y_continuous(labels = comma) +
labs(
title = "Success rate in identifying last cultivated crop",
subtitle = "For idle/fallow Land IQ fields in 2022, searching historical Land IQ data (2014-2021)",
x = "2022 idle classification",
y = "Number of fields"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
plot.subtitle = element_text(size = 11, color = "gray40"),
axis.text.x = element_text(angle = 15, hjust = 1),
legend.position = "bottom",
panel.grid.minor = element_blank()
)
# create data for pie chart
overall_data <- data.frame(
status = c("Last Crop Found", "No Previous Crop Found"),
count = c(total_with_crop, total_without_crop)
) %>%
mutate(
percentage = round((count / sum(count)) * 100, 1),
label = paste0(status, "\n", comma(count), " fields\n(", percentage, "%)")
)
# create pie chart
ggplot(overall_data, aes(x = "", y = count, fill = status)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y", start = 0) +
geom_text(aes(label = label),
position = position_stack(vjust = 0.5),
color = "white", fontface = "bold", size = 4) +
scale_fill_manual(
values = c("Last Crop Found" = "#2E7D32",
"No Previous Crop Found" = "tan")
) +
labs(
title = "Overall success rate: finding last cultivated crop",
subtitle = paste0("Total idle/fallow fields analyzed: ", comma(total_idle_fields))
) +
theme_void() +
theme(
plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
plot.subtitle = element_text(size = 11, color = "gray40", hjust = 0.5),
legend.position = "none"
)
Finally, we clean up the dataset by removing the intermediate historical crop columns, add the last_comm column to the cultivated fields (as NA, since they don’t need historical lookup), and combine everything back into a single dataset.
# remove historical comm columns from idle fields
landiq_2022_idle_final_clean <- landiq_2022_idle_final %>%
select(-comm_2021, -comm_2020, -comm_2019, -comm_2018, -comm_2016, -comm_2014)
# add last_comm column (as NA) to cultivated fields for consistency
landiq_2022_cultivated <- landiq_2022_cultivated %>%
mutate(last_comm = NA_character_)
# combine idle and cultivated fields back together
landiq_2022_final <- bind_rows(landiq_2022_cultivated, landiq_2022_idle_final_clean)This analysis successfully traced the agricultural history of idle and fallow fields in California’s San Joaquin Valley, providing a critical input for our spatial optimization models. By combining multiple years of Land IQ crop mapping data using both attribute-based joins (for 2018-2021 data with consistent field IDs) and spatial joins (for 2014 and 2016 data), we were able to determine the last cultivated crop for a significant portion of the idle fields.
The last_comm attribute generated by this analysis will be used to assign economic costs to idle fields in our prioritizr spatial optimization framework. Fields with identified last crops can be assigned costs based on the crop-specific economic value of that commodity (using the USDA NASS annual crop report dataset), while fields without historical crop data may require alternative cost estimation approaches (e.g., taking the mean value of the 8 nearest fields in production, the mean or median value of all idle fields in 2022, etc.).
Following the approach of Bryant et al. (2020), who implemented retirement optimization as a cost-minimization problem using agricultural opportunity costs as the cost layer, we can now assign crop-specific economic values to both actively cultivated and recently-idled fields. This economic cost layer is essential for identifying retirement scenarios that achieve conservation and water-saving targets while minimizing disruption to the agricultural economy.
Several limitations should be considered when interpreting these results. First, the available historical data only extends back to 2014, meaning fields that became idle before this date cannot have their last crop identified. Second, the spatial join methodology for 2014 and 2016 data assumes that field boundaries have remained relatively stable over time, which may not always be the case. Third, the Land IQ classifications themselves have some inherent uncertainty, particularly for fields that may be transitioning between uses. Finally, the last cultivated crop may not perfectly reflect a field’s future productive potential, as factors like water rights, soil degradation, or infrastructure changes may have altered the land’s suitability for agriculture.
This analysis was conducted as part of research on strategic agricultural land retirement in California’s San Joaquin Valley. This work is supported by the Earth Research Institute at UC Santa Barbara.
Land IQ crop mapping data was obtained through the land use mapping data portal on their website. Special thanks to the Land IQ team for their detailed agricultural land use mapping that makes this type of historical analysis possible.
@online{pepperdine2025,
author = {Pepperdine, Maxwell},
title = {Tracking Agricultural Land Use Change: Finding the Last
Cultivated Crop},
date = {2025-12-20},
url = {https://maxpepperdine.github.io/posts/2025-12-20-last-cultivated-crop/},
langid = {en}
}