Adaptation and Evaluation of The CROPGRO-soybean Model To Predict Regional Yield and Production

Agriculture, Ecosystems and Environment 93 (2002) 73–85
Shrikant S. Jagtap∗ , James W. Jones
Department of Agricultural and Biological Engineering, University of Florida, 104 Rogers Hall, Gainesville, FL 32611-0570, USA
Received 18 April 2001; received in revised form 12 December 2001; accepted 18 December 2001
Abstract
In spite of the availability of numerous crop-growth models, there has been limited experience in applying them to predict
regional production and its variability. The main difficulty is a substantial mismatch between spatial and temporal scales of
available data and crop simulation model input requirements. This study developed and tested an operational procedure to
predict soybean (Glycine max L. Merrill) yield and production by linking the CROPGRO-soybean model with a low resolution
regional database of weather, soils, management, and varieties. Historically observed census yields were detrended to remove
effects of changes in technology and then aggregated to a scale of 0.5° cell (about a 50 km grid cell) (ĝ) using an area-weighting
approach. Spatial yield variability within a grid cell was simulated (ý) using nine input combinations (3 varieties, 3 planting
dates, 1 soil and 1 initial condition) which were averaged for comparison with aggregated census yield, ĝ in each cell and
year. Yield bias was estimated by minimizing the root mean squared error (RMSE) between corrected ŷ and ĝ. The yield
correction factor needs to be site-specific to account for spatial variations in constraints and management. The yield correction
factor ranged from 0.40 to 0.50 in more than 75% of grid cells. After correction, the fit between ŷ and ĝ was significant in
nearly 100% of grid cells, and yield variances did not differ significantly in about 80% of grid cells, at the 95% confidence level. The 17-year mean of actual yield was accurately predicted
with a slope of 0.95, small intercept (−0.025) and R2 of 0.95. When validated, the prescribed factor test error was 14%,
within the 16% guideline set by the Environmental Protection Agency (1982) as an acceptable criterion for a model to qualify
for management application. Median RMSE values were 15, 8, 32, 7 and 85% for 1991, 1992, 1993, 1994 and 1995, respectively.
Years 1993 and 1995 were dominated by high water stress. We conclude that the grid-specific yield correction approach can
effectively correct bias in simulated yields and accurately predict interannual variability using readily available inputs. Future
steps are needed to incorporate procedures that account dynamically for yield susceptibility to pests and diseases. Testing and
improvement of the model should continue to realize its potential.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Scaling up simulations; Regional yield predictions; CROPGRO-soybean; Spatial analysis; Yield variability
Florida Agricultural Experiment Station, Journal Series No. R-07981.
∗ Corresponding author. Tel.: +1-352-392-1864x261; fax: +1-352-392-8476. E-mail address: ssjagtap@mail.ifas.ufl.edu (S.S. Jagtap).
PII: S0167-8809(01)00358-9
1. Introduction
Much progress has been made in developing models that simulate the growth and production of crops in many environments. There are clear advantages in adopting these plot-scale crop simulation models for analyzing regional production, because agricultural recommendations and policies are generally implemented at this scale (Moen et al., 1994; Chipanshi et al., 1999). This interest has been additionally fueled by climate change research that seeks to explore impacts on regional food productivity and options for adaptation (Haskett et al., 1995; Rosenzweig and Hillel, 1998). However, there remain several challenges in using crop simulation models for large area yield predictions. A dynamic crop model is typically designed to simulate plant growth, yield and resource utilization at a point in space over time, where central tendency, variances and trends of underlying inputs can be specified with certainty. As long as the inputs are provided over time, crop models with some site-validation and calibration have proved to be useful tools to characterize time dependence and interannual responses of yields and other variables (Boote et al., 1997).
This conventional approach of first validating the models at each field in which they are to be used is time consuming, prohibitively expensive, and impractical to apply at a regional scale. Applying field scale models to regional yield assessment may require perfect aggregation of effects of all possible input combinations over geographic and probability space. Detailed data collection, which is possible at a field level, is not currently realistic over large areas. Therefore, large-area modeling cannot take into account the level of detail found at the field or farm scale. Errors are introduced whenever models are used at a scale for which they were not developed. Consequently, the models may not perform adequately across all possible production environments and must be tested across broad agricultural areas to be useful in large-area yield predictions. Therefore, a reasonable approach is needed to capture heterogeneity in inputs over the production environment to minimize sources of variability and errors in yield prediction.
The regional assessment process may be facilitated by increasing the number of weather data networks, geostatistics, and new GIS-based methods for improving availability of spatial digital databases such as the Vegetation/Ecosystem Modeling and Analysis Project (VEMAP) databases for daily weather and soil texture (Kittel et al., 1995). The resolution of the VEMAP database is 0.5°, equivalent to an area of approximately 250,000 ha (617,500 acres) in Georgia. The National Agricultural Statistics Service (NASS) also provides annual digital data on yield, production, area planted and harvested, and a range of planting dates aggregated at a county scale (NASS, 1997). Combining these two data sources makes it possible to simulate, evaluate, and improve the accuracy of the estimated yields by calibration.
A handful of studies (Kunkel and Hollinger, 1991; Van Lanen et al., 1992; Moen et al., 1994; Haskett et al., 1995) have used crop simulation models for regional yield simulations with region-specific representative soil(s), crop varieties, and planting times. Weather inputs to the models were obtained from a station representative of the region. Soil water variables required for the simulation were obtained using relative area of each soil type. Yield aggregation was achieved by arithmetic averaging or by rule-based approaches such as area weighting and probability of plantings. Simulated yield and its variability generally exceeded the spatially averaged regional yield and its variability. The most often mentioned causes were lack of crop model sensitivity to environmental stresses such as temperatures (Kunkel and Hollinger, 1991), drought, flooding, extreme weather events (Kunkel and Hollinger, 1991; Chipanshi et al., 1999), pests, and harvest losses (Moen et al., 1994; Chipanshi et al., 1999). These studies focused on either one crop-reporting district (Moen et al., 1994; Haskett et al., 1995) or on a few very large regions across states (Kunkel and Hollinger, 1991; Chipanshi et al., 1999) and used various data sources and methods for yield simulations.
The objective of this work was to develop and test a methodology to use CROPGRO-soybean (Jones et al., 1989; Boote et al., 1998; Tsuji et al., 1998), a dynamic process based crop model, for yield estimation in an area of about 0.25 million ha. Three problems arise when adapting CROPGRO for regional scale predictions. The first is how to derive the inputs needed to run the model at the new scale, given spatial variability in production systems, input use, skills of farmers, and multiple stresses not accounted for in the model. The second is how best to mimic decisions made by farmers who are attempting to reduce risk by varying cultivar, crops, planting dates and other management decisions to control stresses. The third is how to account for the effect on yield of stresses not included in CROPGRO, such as pests and diseases.
Georgia was selected for this case study because it is at the southern fringe of the soybean production zone where soybean production is undergoing substantial changes. Yields in this region are considerably lower than in the major soybean growing region, partly due to heterogeneity of soils, their general poor quality, and less intensive management practices. Viable soybean simulations of this area are thus particularly relevant.
2. Materials and methods
2.1. Study area
The case study covered soybean production in the state of Georgia over the 1974–1995 time period. The area devoted to soybean increased from 0.8 million ha in 1972 to a peak of 2.3 million ha in 1982 and has been steadily declining since then to about 0.3 million ha in 1995. As of 1994, about 82% of eligible producers voluntarily participated in the government program aimed at limiting production in return for payments. This program ended in 1996 and soybean production is again increasing.
Dominant soils in the study area are sandy loam soils in the flat southern areas and loamy soils in the northern higher elevation areas. According to the VEMAP database (Kittel et al., 1995), potential extractable soil water varies spatially from 143 to 189 mm due to variations in depth and texture. Mean growing season (May–October) rainfall increases from west to east and north to south, with relatively high rainfall (>450 mm/season) in the northeastern mountain areas, along the coast, and in areas bordering Florida in the south (Fig. 1). Soybean is cultivated throughout the state, but 85% of the total soybean production is concentrated in the southeastern part of the state, which is characterized by high rainfall (450–550 mm/season) and deep sandy loam soils with potential extractable soil water above 180 mm. Soybeans are grown in rotation with non-legumes and also in succession.
Fig. 1. Map of Georgia showing county boundaries along with: (a) average soybean area harvested; (b) growing season (May–October) rainfall during the 1990–1995 period.

2.2. Simulation of soybean yields
Yields were simulated for each year using the CROPGRO-soybean (Jones et al., 1989; Hoogenboom et al., 1994; Boote et al., 1998; Tsuji et al., 1998), a dynamic process based crop model that simulates how soybean cultivars respond to soil, weather, water stress, and management. Using site-specific input data, it calculates development, growth, and partitioning processes on a daily basis, starting at planting and ending when harvest maturity is predicted. As a result,
the response of the soybean plant to different soils, weather, and management conditions can be predicted well (Boote et al., 1998). The CROPGRO-soybean model has been used to simulate soybean growth and yield in response to management (Egli and Bruening, 1992), environmental conditions (Curry et al., 1995), and genetic yield potential (Boote and Tollenaar, 1994; Mavromatis et al., 2001), to diagnose within field yield variability (Paz et al., 1998), and to estimate yield changes and management adaptation associated with climate change (Jones et al., 1999).
2.3. Weather data inputs
The weather data were obtained from the VEMAP (Kittel et al., 1995) database, consisting of daily solar radiation (MJ m−2/day), maximum and minimum air temperatures (°C), and rainfall (mm/day) for each half-degree grid cell (∼250,000 ha in the study region). The weather data for each grid cell represent aggregated weather from existing weather stations in which monthly totals of the daily values match the long-term monthly climatology but the daily series has variances and covariances characteristic of a station's weather records. It was created using a stochastic daily weather generator parameterized with daily records from 870 stations (Richardson, 1981). In VEMAP, daily solar radiation was generated using CLIMSIM (Running et al., 1987) from daily maximum and minimum temperatures and precipitation. Similar weather data aggregation methods are used routinely by the NASS for predicting crop yield (McCormick Myers, 1994) as well as in regional applications (Kunkel and Hollinger, 1991; Haskett et al., 1995).
2.4. Soil parameter inputs
For each grid cell, the most common soil was identified from the VEMAP database. VEMAP soil data is based on the 10 km soil database developed by Kern (1994, 1995). CROPGRO requires soil water inputs (lower, upper, and saturated water holding limits), bulk density, total carbon by soil layer, pH, and nitrate and ammonium nitrogen. None of the soil water inputs were available in the VEMAP database. VEMAP included the soil physical properties (sand, clay, rock fractions, organic carbon) up to 1.5 m soil depth, which were used to estimate drained upper limit according to Ritchie et al. (1999). The lower limit of plant extractable water was estimated as the difference between the drained upper limit and water holding capacity contained in the VEMAP database (Kern, 1995). Root distributions (Jones et al., 1991) and upper limit of stage-1 evaporation were calculated using the Ritchie and Crum (1989) methods. Additional soil parameters, including the soil albedo, a soil water drainage rate constant and runoff curve number, were based on soil texture from the generic soil database provided with the DSSAT-models (Tsuji et al., 1998).
2.5. Management inputs
Regional yield variability is a consequence of the variability of cultivars, cultural practices (such as planting date, plant density, row spacing, and planting depths), management skills of farmers, soil properties, types and abundance of pests and diseases, and weather. These determinants of yield vary both spatially and temporally and consequently may result in significant variations in yield over time and space. To capture the regional variability of the soybean production systems, a total of nine combinations consisting of three commonly grown varieties, three planting windows (early, normal and late), the most dominant soil and one initial condition (see Table 1) were simulated for each grid cell each year. These combinations were based on the published agricultural census (NASS, 1997), extension publications, and individual researchers familiar with the state and reflect typical late 1990s technology and growing conditions. A mix of early to late maturing and high yielding cultivars from maturity group V, VI or VII (Raymer et al., 1990, 1993) with a row spacing of 0.91 m and a plant population of 270,000 plants ha−1 were simulated. Sensitivity analysis showed that simulated yields were not very sensitive to variations in plant population (150,000–337,000 plants ha−1).
2.6. Simulation runs
CROPGRO-soybean was run for each grid cell for non-irrigated conditions. The daily soil water balance was initialized with a soil profile at the lower limit of plant extractable soil water on January 1 each year. Thereafter, planting dates were automatically determined by the model within each planting window and
season on the first day whenever soil water content of the top 0.3 m was at least 40% of the available water and soil temperature was at least 10 °C. The earliest planting date allowed was between May 5 and 24 (NASS, 1997). Most farmers, however, plant during the optimum-planting window of May 25–June 20 and a few may plant late from June 21 to July 5. In most years, about 17.5, 65, and 17.5% of area is planted during the early, optimum and late planting windows. For abnormal seasons such as the dry years of 1980, 1983, 1986, 1988, 1990, 1993 and 1995, we assumed a delayed planting distribution of 17.5, 17.5 and 65% due to delayed rains during the first two planting windows. CROPGRO-soybean does not explicitly simulate disease and pest problems, but includes an empirical soil fertility and disease factor to account for potential soil fertility limitations or soil-based pests such as nematodes. According to Boote et al. (1997) and Mavromatis et al. (2001) the value of this factor was 0.92. This factor reduces canopy photosynthesis by 8% daily throughout the growing season and therefore biomass growth rate.

Table 1
Summary of inputs for crop simulation runs
Management cases | No. of runs | Description
Variety | 3 | Most commonly grown varieties in the state belonging to maturity groups V, VI and VII were simulated for each grid cell
Planting date | 3 | These planting opportunities were defined over a planting period characterized as early, normal and late according to NASS (1997)
Soils | 1 | This was the most dominant soil occupying largest portion of the grid cell area
Initial condition | 1 | Initial soil water content was set at lower limit of plant extractable soil water on January 1 each year which is before the onset of rainy season

Model simulations were run assuming the same nine combinations of inputs throughout the test period. There were no carry over effects from one growing season to a following season. The three simulated yields by variety, soil type and initial conditions were averaged for each planting date, which were then weighted by planting percentages to give a mean simulated rain-fed yield (ý) for each grid cell, soil and weather year.
2.7. Historical yields
Historical county-average soybean yields were obtained between 1974 and 1995 for counties in Georgia (NASS, 1997). The simulated and observed time series yields were not comparable, because the simulated yield assumed a constant level of technology throughout the test period. In contrast, due to numerous factors (e.g. technology changes such as fertilizer use, pest management, and improved seeds), the observed yield series sometimes shows a low-frequency trend. Higher-frequency deviations are primarily attributed to weather variability and the lower frequency trend to technology. A linear trend analysis technique (Swanson and Nyankori, 1970; Hollinger et al., 2001) separated the technology trend from the effects of weather on interannual variability of yields. Thus, using the estimated linear trend, county yield for each year was recomputed by adding the yield gain due to technology changes between production year and the last year of test period, 1995, to the observed difference between observed yield and detrended yield for the year of interest. Yields during the 1974–1990 period were used for calibration, while yields for the 1991–1995 period were used for validation.
A typical grid cell covered several counties, and yields varied among counties during and between years. To resolve the difference in scale between the simulation runs (i.e. 0.5° grid cell) and the individual county yield data from NASS, county yields were aggregated to grid level yields using an area-weighting approach. Grid cell yield in a year (ĝ) was the sum of weighted county yields (i.e. county yield times relative grid cell area occupied by the county) for all counties in a grid cell. Standard deviation of yields (σ) was also calculated from county yields in a grid cell. Aggregated grid cell yields can therefore be compared with simulated yields using grid cell weather, soil and management inputs. Trend analysis revealed that, in spite of technological advances such as Round-up Ready soybeans, new varieties, narrow-rows, and mechanization, yields did not increase appreciably over the study period.
2.8. Sources of yield bias
One of the most important criteria for regional applications of a crop model is that the model should simulate annual and long-term yields similar to those achieved under realistic farming situations. CROPGRO-soybean simulates the direct effects of weather, soil and management on yield, but does not account for pests, diseases, weeds, or lodging and harvest losses. For well-managed, high input farms, yield is indeed largely a function of temperature, radiation and rainfall. Elsewhere, yields can fail to reach their potential because of poor timing of farming operations, inappropriate cultivar selection or yield losses caused by competition from weeds and damage by pests. A number of studies (Kunkel and Hollinger, 1991; Moen et al., 1994; Haskett et al., 1995; Supit, 1997; Rosenthal et al., 1998; Hansen and Jones, 2000) have shown the necessity of model calibration for estimating large area yields to account for aggregation errors, lack of specificity in crop management inputs, spatial resolution of soil and weather data and stresses not explicitly simulated. Also, given that the extent of grid cell/county data overlap is spatially variable due to differences in county areas, simulated yield (ý) should be corrected to reported grid cell yield (ĝ) during the same year.
2.9. Yield bias correction
Yields were corrected using a yield correction factor (Yc) such that

$$\hat{g} = \acute{y}\,Y_c, \qquad 0 < Y_c < 1 \qquad (1)$$

This was done by selecting the Yc that minimized the root mean squared error (RMSE) between ĝ and the corrected yield Yc ý. The computation of Yc was a systematic stepwise procedure in which: (1) Yc was gradually increased until RMSE was minimized over the 17 year (1974–1990) time (T) period; (2) the Yc that produced the lowest RMSE was adopted as the yield correction factor.

$$\mathrm{RMSE}_{\min} = \left[ T^{-1} \sum_{t=1}^{T} \left( \hat{g}_t - Y_c\,\acute{y}_t \right)^2 \right]^{1/2} \qquad (2)$$

Yield correction is expected to reduce bias or systematic errors between simulated and observed grid cell yield due to site-specific constraints not specifically accounted for by CROPGRO-soybean, differences in the actual and estimated environmental and management conditions, as well as errors due to the method used to aggregate county yields to grid cell yields. The availability of sufficiently long historical countywide yield data is essential for calibration and subsequently for testing the predictions with an independent data set. The time series ideally should cover low, average and high yielding periods to reliably estimate Yc. Since the study region is heterogeneous with large variations in soil–climatic conditions, the yield correction factor needs to be grid cell specific to account for spatial variations in constraints and management.
2.10. Validation
Since the model usually overpredicts yields without bias correction, our objective was to demonstrate an improvement in the prediction (ŷ = Yc ý) relative to ĝ in both the annual pattern of yield over time and its variance for each grid cell. The ability of the corrected model to predict the annual pattern of ĝ in each grid cell over the 17 years was examined using a t-test of the regression line of ŷ versus ĝ and an index of agreement (d, Willmott, 1982). For a set of paired ŷ and ĝ, d is an index of the overall relative degree to which ŷ approaches ĝ. It varies between 0 and 1, where a value of 1 expresses perfect agreement between ŷ and ĝ and 0 describes complete disagreement. An F-test was used to determine if there were significant differences between observed and predicted yield variance for each grid cell over the 17-year period.
Additionally, viable long-term soybean yield simulations are particularly relevant for climate change studies. We regressed the 17-year mean of ŷ (Ŷ) against the 17-year mean of ĝ (Ĝ) for the 58 grid cells and the slope and intercept were tested at the 95% level of confidence to determine whether Ŷ and Ĝ were the same throughout the state.
Validation was carried out by comparing the predictions for the 1991–1995 period, made with the Yc estimated from the 1974–1990 data, to the reported 1991–1995 yields. The criteria were RMSE and the prescribed factor test (PFT, Parrish and Smith, 1990; Haskett et al., 1995). The PF test value was calculated using Eq. (3) for each grid cell and year to determine whether there is an overlap of a confidence interval of observed yields with a bound about the
simulated mean:

$$\mathrm{PF} = \begin{cases} L/\hat{y}, & \hat{y} < L \\ 1, & L \le \hat{y} \le U \\ \hat{y}/U, & \hat{y} > U \end{cases} \qquad (3)$$

where L and U were the lower and upper bounds of observed grid cell yield at the 95% confidence level, and σĝ is the standard error of county yields in a grid cell:

$$L = \hat{g} - t_{1-\alpha/2,\,T-1}\,\sigma_{\hat{g}} \quad \text{and} \quad U = \hat{g} + t_{1-\alpha/2,\,T-1}\,\sigma_{\hat{g}} \qquad (4)$$

The PF test examines whether the bounds of this interval fall within a certain prescribed factor of the calibrated yield. A successful test was any year in which PF < 1.16, corresponding to a 16% error level. The same statistical rules were used to determine the degree of state production prediction each year. Production was computed as a product of yield and area of harvest.
3. Results and discussion
3.1. Simulated yield
Simulated yields were generally higher (Fig. 2) on average by 45% (range 34–58%) than observed yields for the 1974–1995 period at grid cell 48 (R48), one of the agriculturally dominant areas in the state of Georgia. When ŷ were compared to variety trial yields from 1987 to 1992 at the agricultural station in Tifton, GA (located within the grid cell R48), yields were 7% higher than observed station yields in 4 of the 6 years. When the experimental fields suffered heavy insect losses in 1989, ý was 34% more than the 1989 station yield. During the same year, ŷ was 48% more than the observed census yield ĝ. Although station yields are point yields and both ŷ and ĝ are spatially averaged yields, this comparison shows that when assumptions in the model were nearly met, the yields compared well. One of the major differences between practices on farms in the region and the agricultural experiment station is that pesticides were used (Raymer et al., 1990, 1993) at the station to control pests and
diseases every year. Thus, there is a large gap between what was simulated by the CROPGRO-soybean or obtained at the research station and what is actually obtained on farms in the region in most of the years. For example, summer droughts in years 1980, 1990, 1993 and 1995 were devastating to dryland soybean production and resulted in very low yields. Note from Fig. 2 that in these years, which also had lower levels of observed yields, the simulated yields were about the same or even less (i.e. 1995). This was also the case in the comparison between yields at the experiment station and the model in 1987 and 1990. This may indicate that the model overestimated yield loss in response to water stress. It is also possible that the model is correct, but the estimates of county yield had errors (i.e. observations or aggregation) in these years. The anomaly between ý and ĝ was relatively greater in those years where average or higher than average ĝ were observed. Disease, insects or operational problems (Table 2) may well be responsible for lower ĝ and higher variances (Fig. 2, ±1σ vertical bars) in these years, and are not accounted for in the model.

Fig. 2. Comparison of simulated yield, observed yields and their standard deviation (S.D.), and experiment station yields at Tifton, GA. Predicted yields were computed by correcting simulated yields using a yield correction factor determined by minimizing the RMSE between observed yield and simulated yield at a grid cell.

Table 2
Major causes of soybean yield variations in Georgia, 1988–1995 (Raymer et al., 1990, 1993, 1996)
Year | Causes of yield loss
1988 | Insect population built to damaging level during seed fill; best yield since 1982
1989 | Excessive rain increased diseases and weeds
1990 | Insects were major problem; heavy rains from a tropical storm damaged crop and delayed harvesting
1991 | Wet conditions caused more weed and diseases; excessive vegetative growth; foliar feeding insects pressure increasing during seed fill
1992 | Wet conditions caused more weed and diseases; excessive vegetative growth; foliar feeding insects pressure increasing during seed fill; yield less than 1991
1993 | Dry conditions delayed and reduced planting; driest and hottest summer leading to rapid decline in crop conditions; weed pressure was high, but disease pressure was low; one of the worst growing seasons with 25-year low production
1994 | Favorable weather condition throughout the year
1995 | Soybean planting declined; dry conditions delayed planting; entire growing season was drier. An extended hot and dry period in August reduced yield. Soybean harvest was delayed. Yield and production was substantially lower than 1994

3.2. Yield calibration
Regional soybean yield simulations must match yields achieved by farmers year after year under realistic production conditions. Several of the field problems (Table 2) were not explicitly accounted for in the CROPGRO-soybean. Nevertheless, ý and ĝ were highly correlated (R > 0.75) over the 17-year period of analysis. The slope of regression line ý versus ĝ was greater than 1 everywhere, and it was significantly larger than 1 in 35 of the 58 grid cells at the 95% level of confidence. These 35 grid cells are located in the major soybean producing southern part of the state. The yield bias between simulated and census yields was adjusted using a yield correction factor defined by Eq. (2). Yc varied from 0.35 to 0.60 (Fig. 3a), with 0.41–0.45 in about 50% of the area and between 0.45 and 0.50 in another 25% of the area. Generally Yc was higher in the southern part of the state, where intensive cultivation, warmer temperatures, and high rainfall favor disease and pests relative to the cooler mountain climate in the north. Linear regression of Yc against ĝ showed (data not shown) that the intercept and slope were positive and highly significant at the 95% level of confidence. Yc was weakly but positively correlated (R = 0.3) with ĝ, indicating presence of a higher degree of farm management skill in high yielding regions (Fig. 3a). However, there was only a weak relationship between Yc and the area devoted to cultivation.
Yc improved the goodness of fit between ý and ĝ for grid cell R48 (Fig. 2) with RMSE of 9% relative to Ŷ and d of 0.85. Across the state and over the 17-year period, the goodness of fit between ŷ and ĝ was significant in 56 out of the 58 regions at the 95% level of confidence using a t-test of the regression line. The CROPGRO-soybean simulated yield variances were larger than observed, which is expected since the observed yields are spatial averages of numerous farm yields within a grid cell and the model simulations, by definition, are nine single point estimates. An F-test showed that no significant differences were detected between predicted and observed census yield variances over the 17-year period in 78% of grids (i.e. 45 of 58) at the 95% confidence limit. Moen et al. (1994) and Chipanshi et al. (1999) observed that even after calibration, a simulation model overpredicted reported census yields in years where average or higher than average yields were obtained and underpredicted
yields during drought years. Such a trend was not observed in this study. RMSE relative to Ŷ was less than 16% (Fig. 3b). The Willmott index of agreement was >0.5 everywhere and >0.7 in 75% of the regions.

Fig. 4. Comparison of 17-year mean observed (Ĝ) and predicted (Ŷ) soybean yields in Georgia. The 1:1 line and the regression line between predicted and reported yield is also shown.
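The two goodness-of-fit measures reported here, RMSE relative to the mean predicted yield and the Willmott index of agreement d, can be computed as in the sketch below. The formula for d is the standard one from Willmott's index of agreement; the function names and the example series are hypothetical and are not data from this study.

```python
import math


def relative_rmse_percent(predicted, observed):
    """RMSE between predicted and observed yields, as % of mean predicted."""
    n = len(observed)
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    mean_predicted = sum(predicted) / n
    return 100.0 * math.sqrt(mse) / mean_predicted


def willmott_d(predicted, observed):
    """Willmott index of agreement: 1 is perfect agreement, 0 is none."""
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    potential_error = sum(
        (abs(p - mean_observed) + abs(o - mean_observed)) ** 2
        for p, o in zip(predicted, observed)
    )
    if potential_error == 0:
        # Every value equals the observed mean, so agreement is exact.
        return 1.0
    return 1.0 - squared_error / potential_error


if __name__ == "__main__":
    # Hypothetical corrected (predicted) and census (observed) yields, Mg/ha.
    predicted = [1.9, 2.1, 1.4, 2.3, 1.7]
    observed = [2.0, 2.0, 1.5, 2.2, 1.6]
    print(relative_rmse_percent(predicted, observed))
    print(willmott_d(predicted, observed))
```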
Predicted yields Ŷ accounted for 95% of the variance (Fig. 4) in observed census yields Ĝ, with a highly significant slope (0.95) and a small intercept (−0.025). The 5% bias resulted in Ŷ being regularly 0.072 Mg ha⁻¹ less than Ĝ. According to the paired t-test, the hypothesis that Ĝ and Ŷ were the same could not be rejected at the 95% confidence level in 57 of the 58 grid cells. The prescribed factor test was successful at the 1.04 level (i.e. an error level of 4%) during the 17-year calibration period.
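
The goodness-of-fit statistics named in this section can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes each statistic is computed per grid cell from paired series of predicted and observed yields, that "RMSE relative to Ŷ" means RMSE divided by the mean predicted yield, and that the index of agreement follows the usual form of Willmott (1982). The function names and the sample series are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def relative_rmse(pred, obs):
    """RMSE between predicted and observed yields, as a fraction of mean predicted yield."""
    rmse = math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))
    return rmse / mean(pred)


def willmott_d(pred, obs):
    """Willmott index of agreement: 1 means perfect agreement (obs must not be constant)."""
    o_bar = mean(obs)
    num = sum((p - o) ** 2 for p, o in zip(pred, obs))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for p, o in zip(pred, obs))
    return 1.0 - num / den


def variance_ratio(pred, obs):
    """F statistic for two sample variances (larger over smaller); compare with an F table."""
    v_pred, v_obs = variance(pred), variance(obs)
    return max(v_pred, v_obs) / min(v_pred, v_obs)


def paired_t(pred, obs):
    """Paired t statistic for the mean difference pred - obs; compare with t at n - 1 df."""
    diffs = [p - o for p, o in zip(pred, obs)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical 5-year yield series (Mg/ha) for one grid cell; not data from the study.
pred = [1.8, 2.1, 1.2, 2.4, 1.6]
obs = [1.9, 2.0, 1.4, 2.3, 1.7]
print(relative_rmse(pred, obs), willmott_d(pred, obs))
print(variance_ratio(pred, obs), paired_t(pred, obs))
```

The critical F and t values are deliberately left to a statistical table or library, since the degrees of freedom depend on how many years enter each comparison.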
3.3. Validation
Fig. 3. Spatial distribution of: (a) yield correction factor; (b) 17-year RMSE between predicted and reported yield.

The grid-specific correction factor (Yc) computed using 1974–1990 aggregated grid cell yield ĝ was used to correct simulated yields during the 1991–1995 test period. The prescribed factor test error increased to 14% during the validation period due to significantly higher errors in 1993 and 1995. Mean (median) RMSE values over the entire state were 24 (15), 9 (8), 44 (32), 10 (7) and 91 (85)% for 1991, 1992, 1993, 1994 and 1995, respectively. Median RMSE was variable over the state, and in most areas it was much lower than the mean RMSE. The years 1993 and 1995 were two exceptional years dominated by water stress. In 1995, the area devoted to soybeans was the lowest since 1974 (0.3 million ha) and thus production was spotty. Modeled yield was either the same as or less than observed grid cell yield, and should not have been adjusted. As illustrated in Fig. 5a for 1993, higher RMSE values were localized in areas with marginal soybean production (see Fig. 1). There was a positive and highly significant spatial correlation between RMSE and the amount of area reported as lost or abandoned after planting (Fig. 5b). Further investigation revealed that in the simulations, soybeans were planted whenever there was sufficient moisture to plant, with no replanting decision built into the simulation approach, even when early growth conditions were very dry. In reality, replanting may occur or planting may be delayed when early rainfall is erratic. Failure of the simulations to account for these decisions was a major reason for the errors in 1995. Further downward correction of 1995 simulated yields caused large errors.
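
The calibrate-then-validate step described above can be illustrated as follows. This is a sketch under stated assumptions, not the authors' procedure: it assumes the correction is a single multiplicative factor per grid cell, in which case the factor that minimizes RMSE between corrected simulated yield and census yield has a closed least-squares form. All function names and yield values are hypothetical.

```python
import math


def fit_correction_factor(simulated, census):
    """Multiplicative factor Yc that minimizes RMSE between Yc * simulated and census."""
    return sum(s * c for s, c in zip(simulated, census)) / sum(s * s for s in simulated)


def relative_rmse_percent(corrected, census):
    """RMSE of corrected yields, as a percentage of the mean corrected yield."""
    n = len(census)
    rmse = math.sqrt(sum((y - g) ** 2 for y, g in zip(corrected, census)) / n)
    return 100.0 * rmse / (sum(corrected) / n)


# Hypothetical yields (Mg/ha) for one grid cell; not values from the study.
calib_sim = [4.0, 5.0, 3.0, 4.5]      # uncorrected simulations, calibration years
calib_census = [1.8, 2.3, 1.3, 2.0]   # aggregated census yields, calibration years
test_sim = [4.2, 2.0]                 # uncorrected simulations, validation years
test_census = [1.9, 1.2]              # aggregated census yields, validation years

yc = fit_correction_factor(calib_sim, calib_census)
# The factor is held fixed during validation, so a dry year whose simulated
# yield is already low is still scaled down further.
test_corrected = [yc * s for s in test_sim]
print(round(yc, 3), round(relative_rmse_percent(test_corrected, test_census), 1))
```

In this toy case the second validation year mimics a drought year: the uncorrected simulation is already close to the census yield, and applying the fixed factor pushes it well below, which is the kind of error the text reports for 1995.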
3.4. Annual state productions and interannual trend
The calibrated CROPGRO-soybean model predicted year-to-year state production over the 22-year period (state production = 1.11 × CROPGRO production − 13.6, R² = 0.95, Table 3). During the 22 years, observed production was underestimated by the model in 12 years. The highest reported and predicted productions, 1.5 and 1.3 million Mg, respectively, were recorded in 1982. Georgia's soybean production in 1995 was the lowest (0.198 million Mg) since 1974; predicted production was also lowest in 1995 (0.1 million Mg). Deviation in production, defined as the predicted (observed) production minus the mean predicted (observed) production over the test period, was less than 20% in 18 out of the 22 years. The mean absolute difference between deviations was 71 thousand Mg (Table 3), which was not statistically significant (p = 0.05). The model correctly predicted the sign of the observed deviation from mean production (positive or negative) in 20 of 22 years.
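
The deviation columns and the regression line quoted above can be recomputed from the production columns of Table 3. The sketch below is only an illustration of that arithmetic: the helper names are hypothetical, and the direction of the regression (observed state production regressed on CROPGRO production) is an assumption read from the form of the reported equation.

```python
# Table 3 values (x1000 Mg): year -> (predicted, observed) state production.
PRODUCTION = {
    1974: (634, 574), 1975: (766, 736), 1976: (508, 518), 1977: (624, 556),
    1978: (599, 757), 1979: (1238, 1352), 1980: (554, 653), 1981: (975, 1076),
    1982: (1341, 1574), 1983: (767, 1059), 1984: (892, 1011), 1985: (938, 929),
    1986: (423, 388), 1987: (356, 404), 1988: (502, 555), 1989: (736, 706),
    1990: (217, 246), 1991: (350, 382), 1992: (453, 450), 1993: (159, 198),
    1994: (392, 373), 1995: (103, 198),
}


def mean(values):
    return sum(values) / len(values)


def deviations(values):
    """Deviation of each year's production from the period mean."""
    m = mean(values)
    return [v - m for v in values]


def linear_fit(x, y):
    """Ordinary least squares y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    dx, dy = deviations(x), deviations(y)
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    sxy = sum(a * b for a, b in zip(dx, dy))
    slope = sxy / sxx
    return slope, mean(y) - slope * mean(x), sxy * sxy / (sxx * syy)


predicted = [p for p, _ in PRODUCTION.values()]
observed = [o for _, o in PRODUCTION.values()]

pred_dev = deviations(predicted)
obs_dev = deviations(observed)
mean_abs_diff = mean([abs(p - o) for p, o in zip(pred_dev, obs_dev)])

# Regress observed (state) production on predicted (CROPGRO) production.
slope, intercept, r_squared = linear_fit(predicted, observed)

print(round(mean(predicted)), round(mean(observed)), round(mean_abs_diff))  # 615 668 71
print(round(slope, 2), round(intercept, 1), round(r_squared, 2))
```

The first printed line reproduces the "Average" row of Table 3. The fitted slope and R² should land close to the reported 1.11 and 0.95; small differences in the intercept are expected because the table values are rounded.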
In commercial applications, it is often important for commodity traders to determine how a particular year will compare with previous years when developing purchasing or selling decisions. There may thus be an advantage in using models for yield/production forecasting to complement statistical techniques. Models can integrate the effects of weather and stresses on growth processes during the season, thus accounting for within-season impacts of stresses.

Fig. 5. Mean RMSE in 1993 (a) and percentage of planted area abandoned (b) in different parts of Georgia.

Table 3
Annual predicted and observed soybean production, deviations from the 1974–1995 mean production, and absolute deviation

Year      Production (×1000 Mg)    Deviation (×1000 Mg)     Absolute deviation
          Predicted   Observed     Predicted   Observed     (×1000 Mg)
1974      634         574          19          −94          113
1975      766         736          151         68           83
1976      508         518          −107        −150         43
1977      624         556          9           −112         121
1978      599         757          −16         89           105
1979      1238        1352         623         684          61
1980      554         653          −61         −15          46
1981      975         1076         360         408          48
1982      1341        1574         726         906          180
1983      767         1059         152         391          239
1984      892         1011         277         343          66
1985      938         929          323         261          62
1986      423         388          −192        −280         88
1987      356         404          −259        −264         5
1988      502         555          −113        −113         0
1989      736         706          121         38           83
1990      217         246          −398        −422         24
1991      350         382          −265        −286         21
1992      453         450          −162        −218         56
1993      159         198          −456        −470         14
1994      392         373          −223        −295         72
1995      103         198          −512        −470         42
Average   615         668                                   71

4. Conclusions

An approach based on CROPGRO-soybean simulated regional yield was developed using the VEMAP digital spatial database. The approach was evaluated using historical county data in Georgia aggregated to 0.5° grid cells compatible with the VEMAP database. Yield bias correction was required to account for the effect of stresses not included in CROPGRO as well as aggregation in inputs and outputs due to the substantial mismatch between spatial and temporal scales of available data and model input needs. Yield correction factors reduced bias between simulated and aggregated census yield from 57 to 11%. The yield correction factors indicate the extent of the yield gap between actual and achievable yields currently present due to uncontrolled stresses. The yield correction factor was uniform over a large part of the state. This may imply that although the conditions simulated may not consistently (year-to-year or region-to-region) describe the actual combination of inputs over time and space, the procedure did find the optimal "regional" yield correction factor. Variations in Yc across the state showed that farms in highly cultivated areas were better managed than those in low production areas. Error analysis revealed that RMSE values were sensitive to errors in individual years. A better yield correction procedure and/or accurate information on planting dates could reduce errors. The CROPGRO-soybean model accounted for year-to-year yield variability as well as long-term mean yields, which are important in climate change research. This was confirmed by the prescribed factor test error of 14%. The calibrated model also predicted relative yield trends with more than 70% precision.

The procedure developed could be used to adjust simulated yields in regional climate change studies by taking into account regional differences in yield reducing factors. This would give yields that are in line
with those obtained by farmers in different parts of the region. The procedure developed could also be used to realistically explore the value of seasonal climate forecasting on decisions at a regional scale.

Further model improvements are needed to account for other loss factors (e.g. disease, insects, weeds, harvesting, flooding). Research should be concentrated on building dynamic crop simulation models that account for these losses during the season, minimizing the need for calibration, and also on innovative cost-effective technologies to provide accurate and timely model inputs (soils, weather, pest outbreaks). Currently, historical yields are available for entire counties, but not for individual soils or soil associations. Thus, there is a mismatch between data available for crop yields and inputs needed to account for the spatial variability that occurs over large areas.

References

Boote, K.J., Tollenaar, M., 1994. Modeling genetic yield potential. In: Boote, K.J., Bennett, J.M., Sinclair, T.R., Paulsen, G.M. (Eds.), Physiology and Determination of Crop Yield. ASA/CSSA/SSSA, Madison, WI, pp. 533–565.
Boote, K.J., Jones, J.W., Hoogenboom, G., Wilkerson, G.G., 1997. Evaluation of the CROPGRO-soybean model over a wide range of experiments. In: Kropff, M.J., Teng, P.S., Agarwal, P.K., Bouma, J., Bouman, B.A.M., Jones, J.W., van Laar, H.H. (Eds.), Systems Approaches for Sustainable Agricultural Development: Applications of Systems Approaches at the Field Level. Kluwer Academic Publishers, Boston, pp. 113–133.
Boote, K.J., Jones, J.W., Hoogenboom, G., 1998. Simulation of crop growth: CROPGRO model. In: Peart, R.M., Curry, R.B. (Eds.), Agricultural Systems Modeling and Simulation. Marcel Dekker, New York, pp. 651–692.
Chipanshi, A.C., Ripley, E.A., Lawford, R.G., 1999. Large-scale simulation of wheat yield in semi-arid environments using a crop-growth model. Agron. Syst. 59, 57–66.
Curry, R.B., Jones, J.W., Boote, K.J., Peart, R.M., Allen Jr., L.H., 1995. Response of soybeans to predicted climate change in the USA. In: Rosenzweig, C., Allen Jr., L.H., Jones, J.W., Tsuji, G.Y., Hildebrand, P. (Eds.), Climate Change and Agriculture: Analysis of Potential International Impacts. ASA Special Publication No. 59. American Society of Agronomy, Madison, WI, pp. 163–181.
Egli, D.B., Bruening, L.H., 1992. Planting date and soybean yield: evaluation of environmental effects with a crop simulation model: SOYGRO. Agric. For. Meteorol. 62, 19–29.
Hansen, J.W., Jones, J.W., 2000. Scaling-up crop models for climate variability applications. Agron. Syst. 65, 43–72.
Haskett, J.D., Pachepsky, Y.A., Acock, B., 1995. Estimation of soybean yield at county and state level using GLYCIM: a case study of Iowa. Agron. J. 87, 926–931.
Hollinger, S.E., Ehler, E.J., Carlson, R.E., 2001. ENSO midwestern United States corn and soybean yield responses to changing El Niño-southern oscillation conditions during the growing season. In: Rosenzweig, C., Boote, K.J., Hollinger, S., Iglesias, A., Phillips, J. (Eds.), Impacts of El Niño and Climate Variability on Agriculture. ASA Special Publication No. 63. American Society of Agronomy, Madison, WI, pp. 33–56.
Hoogenboom, G., Jones, J.W., Wilkens, P.W., Batchelor, W.D., Bowen, W.T., Hunt, L.A., Pickering, N.B., Singh, U., Godwin, D.C., Baer, B., Boote, K.J., Ritchie, J.T., White, J.W., 1994. Crop models. In: Tsuji, G.Y., Uehara, G., Balas, S. (Eds.), DSSAT V3. University of Hawaii, Honolulu, Hawaii, pp. 95–244.
Jones, J.W., Boote, K.J., Jagtap, S.S., Hoogenboom, G., Wilkerson, G.G., 1989. SOYGRO V5.42 Soybean Crop Growth Simulation Model: Users Guide. Florida Agricultural Experiment Station Journal No. 8304. Agricultural Engineering Department and Agronomy Department University of Florida, Gainesville, FL 32611. IBSNAT Project. Department of Agronomy and Soil Science, University of Hawaii, Honolulu, Hawaii.
Jones, C.A., Bland, W.L., Ritchie, J.T., Williams, J.R., 1991. Simulation of root growth. In: Hanks, R.J., Ritchie, J.T. (Eds.), Modeling Plant and Soil Systems. ASA/CSSA/SSSA, Madison, WI, pp. 91–123.
Jones, J.W., Jagtap, S.S., Boote, K.J., 1999. Climate change: implications for soybean yield and management in the USA. In: Proceedings of the World Soybean Research Conference VI, August 4–7, 1999, Chicago, IL.
Kern, S.J., 1994. Spatial pattern of soil organic carbon in the contiguous United States. Soil Sci. Soc. Am. J. 58, 439–455.
Kern, S.J., 1995. Geographic pattern of soil water holding capacity in the contiguous United States. Soil Sci. Soc. Am. J. 59, 1126–1133.
Kittel, T.G.F., Rosenbloom, N.A., Painter, T.H., Schimel, D.S., VEMAP Modeling Participants, 1995. The VEMAP integrated database for modeling United States ecosystem/vegetation sensitivity to climate change. J. Biogeogr. 22 (4–5), 857–862.
Kunkel, K.E., Hollinger, S.E., 1991. Operational large area corn and soybean yield estimation. Preprints. In: Proceedings of the 20th Conference on Agricultural and Forest Meteorology. American Meteorological Society, Boston, 4 pp.
Mavromatis, T., Boote, K.J., Jones, J.W., Irmak, A., Shinde, D., Hoogenboom, G., 2001. Developing genetic coefficients from crop simulation models using data from crop performance trials. Crop Sci. 41, 40–51.
McCormick Myers, M.D., 1994. A quality assessment of near real-time precipitation data used to make crop yield forecasts. USDA–NASS Research Division Research Report No. SRB-9403, US Government Printing Office, Washington, DC.
Moen, T.N., Kaiser, H.M., Riha, S.J., 1994. Regional yield estimation using a crop simulation model: concepts, methods, and validation. Agron. Syst. 46, 79–92.
NASS, 1997. Usual Planting and Harvesting Dates for the US Field Crops. Agricultural Handbook No. 628. National Agricultural Statistics Service, Washington, DC.
Parrish, R.S., Smith, C.N., 1990. A method for testing whether model predictions fall within a prescribed factor of true values,
with an application to pesticide leaching. Ecol. Model. 51, 69–72.
Paz, J.O., Batchelor, W.D., Colvin, T.S., Logsdon, S.D., Kaspar, T.C., Karlen, D.L., 1998. Calibration of a crop growth model to predict spatial yield variability. Trans. ASAE 41, 1527–1534.
Raymer, P.L., Day, J.L., Gipson, R.D., 1990. 1989 Field crops performance tests. Research Report No. 589. University of Georgia, Athens.
Raymer, P.L., Day, J.L., Bennet, R.B., Baker, S.H., Branch, W.D., Stephenson, M.G., 1993. 1992 Field crops performance tests: soybeans, peanuts, cotton, tobacco, sorghum, and summer annual forages. Research Report No. 618. University of Georgia, Athens.
Raymer, P.L., Day, J.L., Coy, A.E., Baker, S.H., Branch, W.D., Stephenson, M.G., 1996. 1995 Field crops performance tests: soybean, peanut, cotton, tobacco, sorghum, grain millet, and summer annual forages. Research Report No. 639. University of Georgia, Athens.
Richardson, C.W., 1981. Stochastic simulation of daily precipitation, temperature and solar radiation. Water Resourc. Res. 17, 182–190.
Ritchie, J.T., Crum, J., 1989. Converting soil survey characterization into IBSNAT crop model input. In: Bouma, J., Bregt, A.K. (Eds.), Land Qualities in Space and Time. Pudoc, Wageningen, The Netherlands, pp. 155–167.
Ritchie, J.T., Gerakis, A., Sileiman, A., 1999. Simple model to estimate field-measured soil water limits. Trans. ASAE 42, 1609–1614.
Rosenthal, W.D., Hammer, G.L., Butler, D., 1998. Predicting regional sorghum production in Australia using spatial data and crop simulation modelling. Agric. For. Meteorol. 91, 263–274.
Rosenzweig, C., Hillel, D., 1998. Climate Change and the Global Harvest. Oxford University Press, New York, 324 pp.
Running, S.W., Nemani, R.R., Hungerford, R.D., 1987. Extrapolation of synoptic meteorological data in mountainous terrain and its use for simulating forest evapotranspiration and photosynthesis. Can. J. For. Res. 17, 472–483.
Supit, I., 1997. Predicting national wheat yields using a crop simulation and trend models. Agric. For. Meteorol. 88, 199–214.
Swanson, E.R., Nyankori, J.C., 1970. Influence of weather and technology on corn and soybean trends. Agric. For. Meteorol. 20, 327–342.
Tsuji, G., Hoogenboom, G., Thornton, P. (Eds.), 1998. Understanding Options for Agricultural Production. Kluwer Academic Publishers, Boston, 399 pp.
US Environmental Protection Agency, 1982. Testing for the Field applicability of chemical exposure models. In: Proceedings of the Workshop on field Applicability Testing, March 15–18, 1982. Exposure Modeling Committee Report, USEPA, Athens, GA.
Van Lanen, H.A.J., van Diepen, C.A., Reinds, G.J., de Koning, G.H.J., Bulens, J.D., Bregt, A.K., 1992. Physical land evaluation methods and GIS to explore the crop growth potential and its effects within European communities. Agron. Syst. 39, 307–328.
Willmott, C.J., 1982. Some comments on the evaluation of model performance. Bull. Am. Meteorol. Soc. 63, 1309–1313.
