Suppose that the following 2-stage sampling schedule is carried out to estimate a multiple regression model for predicting pest density in a given region: first, several experimental fields to be surveyed are selected at random in the region; second, several plots (or quadrats) to be surveyed are set at random in each of the selected fields. Under this sampling schedule, a multiple regression of the dependent variable $y$ (pest density) on the independent variables $x_1, x_2, \ldots, x_q$ (factors concerning pest population development) is to be determined. Here, each of the first $p$ independent variables takes the same value for all plots within a field but different values between fields. Each of the remaining $q-p$ variables takes different values even among plots in the same field. By selecting a set of $r$ variables from the independent variables $x_1, x_2, \ldots, x_p$ so as to minimize the prediction sum of squares or the residual sum of squares, the following model is determined:
$y = \beta_0 + \beta_1 x_{(1)} + \beta_2 x_{(2)} + \cdots + \beta_r x_{(r)} + z$, $r \le p$, (1), where the $\beta_i$'s denote partial regression coefficients, the $x_{(i)}$'s indicate the $r$ variables selected from $x_1, x_2, \ldots, x_p$, and $z$ denotes the residual, which follows a normal distribution with mean 0 and variance $\sigma_1^2$. In the next step, the following regression is determined by selecting variables from the remaining $q-p$ variables with the same criterion as in the first step:
$z = \beta_0' + \beta_{r+1} x_{(r+1)} + \beta_{r+2} x_{(r+2)} + \cdots + \beta_{r+s} x_{(r+s)} + \varepsilon$, $s \le q-p$, (2), where $\varepsilon$ denotes the residual, which follows a normal distribution with mean 0 and variance $\sigma_2^2$. The estimates of Eq. (1) and Eq. (2) are statistically tested using the estimates of the residual variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Predictions can be made by
$b_0 + b_1 x_{(1)} + \cdots + b_r x_{(r)} + b_0' + b_{r+1} x_{(r+1)} + \cdots + b_{r+s} x_{(r+s)}$, (3), where the $b$'s are estimates of the $\beta$'s. An application of the proposed multiple regression method under a 2-stage sampling schedule for predicting arrowhead scale density is illustrated.
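
The two-step fit and the combined predictor of Eq. (3) can be sketched numerically as follows. This is only a minimal illustration under stated assumptions, not the procedure used in the paper: it uses NumPy and synthetic data, it omits the variable-selection step (all candidate variables are treated as already selected), and the function names `fit_ols`, `predict_ols`, `fit_two_step`, and `predict_two_step` are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Evaluate b0 + sum_i b_i * x_i for each row of X."""
    return coef[0] + X @ coef[1:]


def fit_two_step(X_field, X_plot, y):
    """Step 1: regress y on field-level variables (Eq. 1).
    Step 2: regress the step-1 residuals z on plot-level variables (Eq. 2).
    Variable selection is omitted here (assumption)."""
    b_field = fit_ols(X_field, y)
    z = y - predict_ols(b_field, X_field)
    b_plot = fit_ols(X_plot, z)
    return b_field, b_plot


def predict_two_step(b_field, b_plot, X_field, X_plot):
    """Combined prediction: sum of the two fitted regressions (Eq. 3)."""
    return predict_ols(b_field, X_field) + predict_ols(b_plot, X_plot)


# Synthetic 2-stage sample (illustrative values only, not data from the paper).
rng = np.random.default_rng(0)
n_fields, n_plots = 8, 5

# Field-level variables: one value per field, repeated for every plot in it.
field_values = rng.normal(size=(n_fields, 2))
X_field = np.repeat(field_values, n_plots, axis=0)

# Plot-level variable: varies among plots within the same field.
X_plot = rng.normal(size=(n_fields * n_plots, 1))

y = (
    1.0
    + X_field @ np.array([2.0, -1.0])
    + 0.5 * X_plot[:, 0]
    + rng.normal(scale=0.1, size=n_fields * n_plots)
)

b_field, b_plot = fit_two_step(X_field, X_plot, y)
y_hat = predict_two_step(b_field, b_plot, X_field, X_plot)
```

Because the second regression is fitted to the residuals of the first, the residual sum of squares of the combined predictor cannot exceed that of the first step alone. In practice each step would be preceded by variable selection, for example by the prediction sum of squares, which this sketch leaves out.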