The data used for this design is from the article 'Screening and Sequential Experimentation: Simulations and
Flame Atomic Absorption Spectrometry Experiments', J. Chem. Ed., 74, 216 (Feb 1997)
' available for download as a pdf file.
ms such as Minitab and Statistica.
ve access to Excel, this spreadsheet
4. Choice of Design
The aim of the experiment is to carry out a screening, i.e. to determine which variables significantly affect the
response. If a variable doesn't significantly affect the result then it can be 'screened out'.
This means the variable is set at its mid-point value and not varied in subsequent experiments.
This is often a necessary step before a full optimization study, to reduce the number of variables to
a manageable number (preferably 2-4).
Factorial designs are commonly used for screening. In this case, with 6 variables, to carry out a full factorial
design, i.e. all combinations of each variable at the two levels, would require 2^6 = 64 experiments.
The full factorial design, in coded form, is shown on the next sheet.
In coded form the low settings for each variable are shown as -1 and the high settings as +1.
within achievable
A B C D E F
-1 -1 -1 -1 -1 -1
1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1
1 1 -1 -1 -1 -1
-1 -1 1 -1 -1 -1
1 -1 1 -1 -1 -1
-1 1 1 -1 -1 -1
1 1 1 -1 -1 -1
-1 -1 -1 1 -1 -1
1 -1 -1 1 -1 -1
-1 1 -1 1 -1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 -1 -1
1 -1 1 1 -1 -1
-1 1 1 1 -1 -1
1 1 1 1 -1 -1
-1 -1 -1 -1 1 -1
1 -1 -1 -1 1 -1
-1 1 -1 -1 1 -1
1 1 -1 -1 1 -1
-1 -1 1 -1 1 -1
1 -1 1 -1 1 -1
-1 1 1 -1 1 -1
1 1 1 -1 1 -1
-1 -1 -1 1 1 -1
1 -1 -1 1 1 -1
-1 1 -1 1 1 -1
1 1 -1 1 1 -1
-1 -1 1 1 1 -1
1 -1 1 1 1 -1
-1 1 1 1 1 -1
1 1 1 1 1 -1
-1 -1 -1 -1 -1 1
1 -1 -1 -1 -1 1
-1 1 -1 -1 -1 1
1 1 -1 -1 -1 1
-1 -1 1 -1 -1 1
1 -1 1 -1 -1 1
-1 1 1 -1 -1 1
1 1 1 -1 -1 1
-1 -1 -1 1 -1 1
1 -1 -1 1 -1 1
-1 1 -1 1 -1 1
1 1 -1 1 -1 1
-1 -1 1 1 -1 1
1 -1 1 1 -1 1
-1 1 1 1 -1 1
1 1 1 1 -1 1
-1 -1 -1 -1 1 1
1 -1 -1 -1 1 1
-1 1 -1 -1 1 1
1 1 -1 -1 1 1
-1 -1 1 -1 1 1
1 -1 1 -1 1 1
-1 1 1 -1 1 1
1 1 1 -1 1 1
-1 -1 -1 1 1 1
1 -1 -1 1 1 1
-1 1 -1 1 1 1
1 1 -1 1 1 1
-1 -1 1 1 1 1
1 -1 1 1 1 1
-1 1 1 1 1 1
1 1 1 1 1 1
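The 64-run table above can also be generated programmatically. The following is a minimal sketch, assuming the same run order as the table (variable A alternating fastest); the function name full_factorial is illustrative and not part of the spreadsheet.

```python
from itertools import product

FACTORS = ["A", "B", "C", "D", "E", "F"]


def full_factorial(n_factors):
    """Return all 2**n_factors runs in coded units (-1 = low, +1 = high).

    The first factor alternates fastest, matching the table above.
    """
    runs = []
    # product() varies its last element fastest, so reverse each tuple
    for levels in product((-1, 1), repeat=n_factors):
        runs.append(list(reversed(levels)))
    return runs


design = full_factorial(len(FACTORS))
print(len(design), "runs")
for run in design[:4]:
    print(run)
```

Changing the argument to 4 gives the 16-run full factorial in A-D that the fractional design builds on.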
Fractional Factorial Design
It might be decided that the previous design contains too many experiments. The fractional factorial design below uses 16 experiments instead.
A B C D E F
-1 -1 -1 -1 -1 -1
1 -1 -1 -1 1 -1
-1 1 -1 -1 1 1
1 1 -1 -1 -1 1
-1 -1 1 -1 1 1
1 -1 1 -1 -1 1
-1 1 1 -1 -1 -1
1 1 1 -1 1 -1
-1 -1 -1 1 -1 1
1 -1 -1 1 1 1
-1 1 -1 1 1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 1 -1
1 -1 1 1 -1 -1
-1 1 1 1 -1 1
1 1 1 1 1 1
Columns A-D contain a full factorial design in these 4 variables, i.e. all combinations of the two levels.
Column E was created by multiplying the coefficients in columns A, B and C row-wise,
i.e. E = ABC; e.g. for row 10, -1 (cell F10) = -1 (B10) * -1 (C10) * -1 (D10).
Similarly, column F was created by B*C*D, i.e. F = BCD.
This creates a resolution 4 design, since the defining words I = ABCE and I = BCDF (and their product I = ADEF) each contain four letters.
A fuller explanation is contained in the document 'Factorial Designs'
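The construction described above (a full factorial in A-D with generated columns E = ABC and F = BCD) can be sketched as follows. The function name is hypothetical; the run order is assumed to match the table above.

```python
from itertools import product


def fractional_factorial_6_2():
    """Build the 16-run design: full factorial in A-D, E = ABC, F = BCD."""
    runs = []
    # product() varies its last element fastest, so A alternates fastest
    for d, c, b, a in product((-1, 1), repeat=4):
        e = a * b * c  # generator E = ABC
        f = b * c * d  # generator F = BCD
        runs.append([a, b, c, d, e, f])
    return runs


design = fractional_factorial_6_2()
for run in design:
    print(" ".join(f"{v:2d}" for v in run))
```

Multiplying the columns of any defining word row-wise (for example A*B*C*E) gives +1 in every run, which is a quick check that the generators were applied correctly.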
On the next sheet the above design is displayed in actual levels. The responses were measured for the
16 experiments and the results displayed in the results column.
To determine this, the main effects for each variable are calculated.
To do this we average the responses for the variable at the high level and subtract from it the average response
at the low level.
This is equivalent to multiplying the response column (H) by the column of coefficients for the variable
(e.g. column B for variable A), summing the products and dividing by half the number of experiments (8).
The main effects give the relative importance of each variable. The (numerically) largest effect is for
wavelength (variable E), followed by % acetic acid (variable C) and flame height (variable A).
The sign of the effect also gives information. A negative effect means that the response is higher at the low setting.
In this case, for example, the absorbance is higher at the low wavelength setting of 328.1 nm.
From these experiments we could definitely 'screen out' flame stoichiometry and lamp current from further
experiments, i.e. set them at mid-point values (stoichiometry between lean and rich and current of 6 mA).
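The main-effect calculation described above (multiply the response by the coded column, sum, divide by 8) can be sketched in a few lines. The responses are the 16 signals from the results column; the coded design is rebuilt from E = ABC and F = BCD rather than typed in, and the function name main_effect is illustrative.

```python
from itertools import product

FACTORS = "ABCDEF"
# Responses for the 16 runs, in the order of the fractional factorial table
RESPONSES = [95, 41, 63, 83, 59, 114, 121, 59,
             107, 38, 44, 73, 60, 97, 105, 53]

# Rebuild the coded design: full factorial in A-D, E = ABC, F = BCD
DESIGN = [[a, b, c, d, a * b * c, b * c * d]
          for d, c, b, a in product((-1, 1), repeat=4)]


def main_effect(column, responses):
    """Sum of coefficient * response, divided by half the number of runs."""
    contrast = sum(x * y for x, y in zip(column, responses))
    return contrast / (len(responses) / 2)


effects = {}
for j, name in enumerate(FACTORS):
    column = [run[j] for run in DESIGN]
    effects[name] = main_effect(column, RESPONSES)

# List the variables from largest to smallest absolute effect
for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.2f}")
```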
C D E F
-95 -95 -95 -95
-41 -41 41 -41
-63 -63 63 63
-83 -83 -83 83
59 -59 59 59
114 -114 -114 114
121 -121 -121 -121
59 -59 59 -59
-107 107 -107 107
-38 38 38 38
-44 44 44 -44
-73 73 -73 -73
60 60 60 -60
97 97 -97 -97
105 105 -105 105
53 53 53 53
Main effects: 15.5 -7.25 -47.25 4
A B C D E F
-121 -114 -107 -121 -121 -121
-107 -107 -95 -114 -114 -97
-105 -97 -83 -95 -107 -95
-95 -95 -73 -83 -105 -73
-63 -60 -63 -63 -97 -60
-60 -59 -44 -59 -95 -59
-59 -41 -41 -59 -83 -44
-44 -38 -38 -41 -73 -41
38 44 53 38 38 38
41 53 59 44 41 53
53 59 59 53 44 59
59 63 60 60 53 63
73 73 97 73 59 83
83 83 105 97 59 105
97 105 114 105 60 107
114 121 121 107 63 114
Step 2:
In the above table responses at the low (-1) settings have a negative sign and responses at the high (+1) settings are positive.
We need to get the average of the absolute values at each setting. A main effects plot compares these two averages graphically.
A B C D E F
low 81.75 76.375 68 79.375 99.375 73.75
high 69.75 75.125 83.5 72.125 52.125 77.75
Step 3: Plot the data
[Main effects plot: average response (axis 0 to 120) at the low and high settings, one line per variable A-F]
The variable with the biggest difference between the 'high' and 'low' values will be the most significant,
i.e. E, followed by C and A. These are shown by the steepest slopes in the above graph.
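The low/high averages used for the main effects plot can be reproduced as below. This is a sketch under the same assumptions as the design table (E = ABC, F = BCD, runs in table order); level_means is a hypothetical helper name.

```python
from itertools import product

FACTORS = "ABCDEF"
RESPONSES = [95, 41, 63, 83, 59, 114, 121, 59,
             107, 38, 44, 73, 60, 97, 105, 53]
DESIGN = [[a, b, c, d, a * b * c, b * c * d]
          for d, c, b, a in product((-1, 1), repeat=4)]


def level_means(column, responses):
    """Average response at the low (-1) and high (+1) setting of one variable."""
    low = [y for x, y in zip(column, responses) if x == -1]
    high = [y for x, y in zip(column, responses) if x == 1]
    return sum(low) / len(low), sum(high) / len(high)


means = {name: level_means([run[j] for run in DESIGN], RESPONSES)
         for j, name in enumerate(FACTORS)}

# Rank variables by the size of the low-to-high change (the slope in the plot)
ranked = sorted(FACTORS, key=lambda n: abs(means[n][1] - means[n][0]),
                reverse=True)

for name in FACTORS:
    low, high = means[name]
    print(f"{name}: low = {low:.3f}, high = {high:.3f}")
print("Steepest first:", " ".join(ranked))
```

The two values per variable are the end points of each line in a main effects plot; passing them to any charting tool reproduces the graph.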
Interactions
The interactions between variables can also be calculated. The column of coded coefficients for each interaction
is calculated by multiplying the columns of coefficients of the corresponding variables.
The interaction effect is then found by multiplying the response column by this column of coefficients,
summing the column and dividing by 8.
Response
A B C D E F Signal A*B A*C
-1 -1 -1 -1 -1 -1 95 1 1
1 -1 -1 -1 1 -1 41 -1 -1
-1 1 -1 -1 1 1 63 -1 1
1 1 -1 -1 -1 1 83 1 -1
-1 -1 1 -1 1 1 59 1 -1
1 -1 1 -1 -1 1 114 -1 1
-1 1 1 -1 -1 -1 121 -1 -1
1 1 1 -1 1 -1 59 1 1
-1 -1 -1 1 -1 1 107 1 1
1 -1 -1 1 1 1 38 -1 -1
-1 1 -1 1 1 -1 44 -1 1
1 1 -1 1 -1 -1 73 1 -1
-1 -1 1 1 1 -1 60 1 -1
1 -1 1 1 -1 -1 97 -1 1
-1 1 1 1 -1 1 105 -1 -1
1 1 1 1 1 1 53 1 1
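The interaction columns and effects in the table above can be computed as sketched below. The helper names are illustrative; an interaction is written as a string of factor letters, e.g. "AB" for A*B.

```python
from itertools import product

RESPONSES = [95, 41, 63, 83, 59, 114, 121, 59,
             107, 38, 44, 73, 60, 97, 105, 53]
# Coded design as one dict per run: full factorial in A-D, E = ABC, F = BCD
DESIGN = [dict(A=a, B=b, C=c, D=d, E=a * b * c, F=b * c * d)
          for d, c, b, a in product((-1, 1), repeat=4)]


def interaction_column(factors):
    """Row-wise product of the coded columns of the named factors."""
    col = []
    for run in DESIGN:
        value = 1
        for name in factors:
            value *= run[name]
        col.append(value)
    return col


def interaction_effect(factors):
    """Sum of (interaction coefficient * response), divided by 8."""
    col = interaction_column(factors)
    return sum(x * y for x, y in zip(col, RESPONSES)) / (len(RESPONSES) / 2)


ab = interaction_effect("AB")
ac = interaction_effect("AC")
print("A*B effect:", ab)
print("A*C effect:", ac)
```

Because of confounding in this design, each value is the combined effect of the named interaction and its aliases.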
95 95 95 95
41 -41 41 -41
-63 -63 -63 63
-83 83 -83 -83
59 -59 -59 59
114 114 -114 -114
-121 121 121 121
-59 -59 59 -59
-107 -107 107 -107
-38 38 38 38
44 44 -44 -44
73 -73 -73 73
-60 60 -60 -60
-97 -97 -97 97
105 -105 105 -105
53 53 53 53
(1) What are the best settings for each variable? To determine this we would need to carry out a full
optimization design such as the Central Composite Design. Optimization designs need at least three settings
for each variable - this is why we carry out screening first.
(2) Is there curvature in the design? Consider variable B - although the main effect is small perhaps we are
missing something - perhaps the response is significantly higher (or lower) in the range between 4 - 8?
This would mean there is curvature in the design.
(3) The above analysis tells us the relative effect of each variable but it does not tell us
whether the variable has a significant effect.
(2) and (3) can be tested for by modifying the design to include centre points. These are
experiments with variables set at their mid-points, and given codes of 0.
The mid-point values are: 9 mm (A), lean/rich (B), 2.5% (C), 6 mA (D), 333.2 nm (E) and 0.45 nm (F).
The extended design, in coded form, is shown below, with the responses.
Response
A B C D E F Signal
-1 -1 -1 -1 -1 -1 95
1 -1 -1 -1 1 -1 41
-1 1 -1 -1 1 1 63
1 1 -1 -1 -1 1 83
-1 -1 1 -1 1 1 59
1 -1 1 -1 -1 1 114
-1 1 1 -1 -1 -1 121
1 1 1 -1 1 -1 59
-1 -1 -1 1 -1 1 107
1 -1 -1 1 1 1 38
-1 1 -1 1 1 -1 44
1 1 -1 1 -1 -1 73
-1 -1 1 1 1 -1 60
1 -1 1 1 -1 -1 97
-1 1 1 1 -1 1 105
1 1 1 1 1 1 53
0 0 0 0 0 0 79
0 0 0 0 0 0 74
0 0 0 0 0 0 77
We will illustrate an alternative analysis on the next sheet. The data will be fitted to a polynomial, with linear and
interaction terms, as follows:
constant                  y = b0
first order terms         + b1*A + b2*B + b3*C + b4*D + b5*E + b6*F
two way interactions      + b12*A*B + b13*A*C + b14*A*D + b15*A*E + b16*A*F + b24*B*D + b26*B*F
three way interactions    + b134*A*B*D + b126*A*B*F
Note: not all possible terms can be included due to confounding (see alias table on previous sheet).
The coefficients are then determined using least squares regression. This can be carried out in Excel
using the array function LINEST (consult the Excel help files for use of this function and using array functions).
However, before performing the regression, columns corresponding to the interaction coefficients need to be constructed
as previously shown.
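As an alternative to LINEST, the fit can be sketched in plain Python. This is not the spreadsheet's method: it relies on the assumption, checked in the code, that all coded columns of this design are mutually orthogonal, in which case each least-squares coefficient reduces to sum(x*y)/sum(x*x). Term names such as "AB" and "ABD" are illustrative labels for the columns A*B and A*B*D.

```python
from itertools import product
from math import sqrt

# 16 factorial responses followed by the 3 centre-point responses
Y = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53,
     79, 74, 77]

# Coded settings: full factorial in A-D with E = ABC, F = BCD,
# then three centre points coded 0
RUNS = [dict(A=a, B=b, C=c, D=d, E=a * b * c, F=b * c * d)
        for d, c, b, a in product((-1, 1), repeat=4)]
RUNS += [dict.fromkeys("ABCDEF", 0) for _ in range(3)]

# Model terms as listed in the polynomial above
TERMS = ["A", "B", "C", "D", "E", "F",
         "AB", "AC", "AD", "AE", "AF", "BD", "BF", "ABD", "ABF"]


def column(term):
    """Coded column for a term: the row-wise product of its factors."""
    col = []
    for run in RUNS:
        value = 1
        for factor in term:
            value *= run[factor]
        col.append(value)
    return col


X = {"const": [1] * len(Y)}
X.update({t: column(t) for t in TERMS})

# The shortcut below is only valid if every pair of columns is orthogonal
names = list(X)
for i, p in enumerate(names):
    for q in names[i + 1:]:
        assert sum(u * v for u, v in zip(X[p], X[q])) == 0

# For orthogonal columns, least squares gives b = sum(x*y) / sum(x*x)
coef = {t: sum(x * y for x, y in zip(X[t], Y)) / sum(x * x for x in X[t])
        for t in names}

# Residual standard deviation and standard error of each coefficient
fitted = [sum(coef[t] * X[t][i] for t in names) for i in range(len(Y))]
rss = sum((y - f) ** 2 for y, f in zip(Y, fitted))
dof = len(Y) - len(names)
s = sqrt(rss / dof)
se = {t: s / sqrt(sum(x * x for x in X[t])) for t in names}

for t in names:
    print(f"{t:5s} {coef[t]:9.4f}  se = {se[t]:.4f}")
```

For a design whose columns are not orthogonal, this shortcut does not apply and a general least-squares routine is needed.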
The first line of the LINEST output contains the parameters. The second line is the standard errors, the third line is R^2 and the standard error of y,
the fourth line is the F statistic and degrees of freedom, and the fifth line is the regression and residual sums of squares.
The confidence interval for each coefficient is b +/- t*se, where se is the standard error in the second line of the output.
2.386845 2.386845 2.386845 2.386845 2.386845 2.386845 2.386845 2.386845 2.386845 2.386845
[Chart: Regression coefficients; x-axis b126, b134, b26, b24, b16, b15, b14, b13, b12, b6, b5, b4, b3, b2, b1; y-axis from -30 to 10]
Viewing this graph now gives us an answer to which effects are significant.
An effect is significant (at the 95% confidence level) if its regression coefficient is significantly non-zero,
i.e. its confidence interval does not include zero. This applies to b5, b4, b1, b3, b13, b24 and b134.
This means the variables E (wavelength), A (flame height), C (% acetic acid) and D (lamp current)
are significant. There are also significant interactions, but due to confounding we cannot definitely say
which are significant. The only way to remove the confounding is to do more experiments. However, since we can
screen out variables B and F, we could do a full factorial on 4 variables = 16 experiments and determine all
interactions.
Note that it is still possible B and F may have significant interactions even though their main effects are not significant.
Note: You may see a connection between the regression coefficients and the main effects. The coefficients are
half the size of the main effects, so both give the same information.
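The significance rule above (a confidence interval b +/- t*se that excludes zero) can be checked numerically. This sketch uses only the first-order coefficients and standard error shown in the LINEST output, and assumes that the repeated value 2.386845 on the sheet is the half-width t*se; the function names are illustrative.

```python
# Coefficients and standard error copied from the LINEST output
COEF = {"b1": -6.0, "b2": -0.625, "b3": 7.75, "b4": -3.625, "b5": -23.625}
SE = 0.55508
# Assumption: the repeated value 2.386845 on the sheet is t*se
HALF_WIDTH = 2.386845


def confidence_interval(b, half_width=HALF_WIDTH):
    """Return the interval (b - t*se, b + t*se)."""
    return b - half_width, b + half_width


def is_significant(b, half_width=HALF_WIDTH):
    """A coefficient is significant if its interval does not include zero."""
    low, high = confidence_interval(b, half_width)
    return not (low <= 0 <= high)


significant = [name for name, b in sorted(COEF.items()) if is_significant(b)]
implied_t = HALF_WIDTH / SE  # t value implied by the half-width and se

# Each main effect is twice the corresponding regression coefficient
main_effects = {name: 2 * b for name, b in COEF.items()}

print("Significant:", significant)
print("Implied t:", round(implied_t, 2))
print("Main effects:", main_effects)
```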
Curvature? Compare the average of the response for the factorial points (first 16), 75.75, and the centre points, 76.67.
Since these averages are very similar there is little curvature in the model.
Note that if significant curvature is indicated these experiments cannot tell which variable causes the
curvature; a design such as a central composite design is needed to determine this.
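The comparison of averages can be sketched as below. The yardstick used to judge the difference (a standard error based on the spread of the three centre points) is an assumption added for illustration and is not taken from the spreadsheet.

```python
from math import sqrt
from statistics import mean, stdev

FACTORIAL = [95, 41, 63, 83, 59, 114, 121, 59,
             107, 38, 44, 73, 60, 97, 105, 53]
CENTRE = [79, 74, 77]

factorial_mean = mean(FACTORIAL)
centre_mean = mean(CENTRE)
curvature = factorial_mean - centre_mean

# Assumed yardstick: standard error of the difference between the two
# averages, using the centre-point replicates to estimate the noise
s_centre = stdev(CENTRE)
se_diff = s_centre * sqrt(1 / len(FACTORIAL) + 1 / len(CENTRE))

print(f"factorial mean = {factorial_mean:.2f}")
print(f"centre mean    = {centre_mean:.2f}")
print(f"difference     = {curvature:.2f} (se about {se_diff:.2f})")
```

A difference that is small relative to this standard error is consistent with the conclusion that there is little curvature.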
Response
A*F B*D B*F A*B*D A*B*F Signal
1 1 1 -1 -1 95
-1 1 1 1 1 41
-1 -1 1 1 -1 63
1 -1 1 -1 1 83
-1 1 -1 -1 1 59
1 1 -1 1 -1 114
1 -1 -1 1 1 121
-1 -1 -1 -1 -1 59
-1 -1 -1 1 1 107
1 -1 -1 -1 -1 38
1 1 -1 -1 1 44
-1 1 -1 1 -1 73
1 -1 1 1 -1 60
-1 -1 1 -1 1 97      75.75 (average of the 16 factorial points)
-1 1 1 -1 -1 105     76.66667 (average of the 3 centre points)
1 1 1 1 1 53
0 0 0 0 0 79
0 0 0 0 0 74
0 0 0 0 0 77
b5 b4 b3 b2 b1 b0
-23.625 -3.625 7.75 -0.625 -6 75.89474
0.55508 0.55508 0.55508 0.55508 0.55508 0.509377
#N/A #N/A #N/A #N/A #N/A #N/A
#N/A #N/A #N/A #N/A #N/A #N/A
#N/A #N/A #N/A #N/A #N/A #N/A