Statistical Engineering is becoming popular and Statistically Designed Experiments play a big role in the Discovery Process.
From time to time there will be a number of factors a Six Sigma Black Belt or Quality Engineer will need to study. In this case, the Six Sigma Practitioner or Quality Engineer may choose to conduct a statistically designed experiment. So what makes a statistically designed experiment useful in the first place? It has much to do with its inherent properties. Let’s take a closer look at these properties using an example.
Suppose we have a factorial experiment with two factors: X1 and X2. From such an experiment, we can estimate two single effects (X1, X2) and one two-way interaction (X1X2). The data from such an experiment can be fit with the following model.
y = β0 + β1X1 + β2X2 + β12X1X2 + ε
In this expression, β0, β1, β2, and β12 are regression parameters: β0 measures the population mean, and β1 and β2 measure the single effects of factors X1 and X2. The effect of the two-way interaction, X1X2, is measured by β12. The last term, ε, is an error term that represents the variation in y not explained by the following regression, or prediction, model.
ŷ = b0 + b1X1 + b2X2 + b12X1X2
The error, ε, is estimated by the residual, e = y − ŷ, which is computed by subtracting the predicted value, ŷ, from the actual value, y.
Using matrix algebra, we can compute the regression coefficients (b’s), which estimate the regression parameters (β’s). In matrix notation, we can solve for these least squares estimates using the following expression.
b = (XTX)-1XTy
In this expression, b is the vector of regression coefficients that estimate the regression parameters, β; X is our design matrix; XT is the transpose of the design matrix; and y is the response vector. Using the following data set, we’ll compute the regression coefficients by solving for b.
In Table 1 there are two complete replicates of measurements on a physical property of interest, y. Here we examined the effects of two factors, X1 and X2, each at two levels, on y. With four factor-level combinations and two replicates, there are eight observations in total.
We can express the information shown in Table 1 in matrix notation as follows.
The first column of 1’s is used to compute the overall average of y. The remaining columns represent the levels for factors X1, X2, and the two-way interaction, X1X2.
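As a rough illustration of how such a design matrix is laid out, the sketch below builds X for a two-factor, two-level experiment with two replicates. It assumes the levels are coded as -1 (low) and +1 (high) and picks an arbitrary run order; neither detail is confirmed by Table 1, and the variable names are only illustrative.

```python
import itertools
import numpy as np

# Assumed coding: low level = -1, high level = +1 (not taken from Table 1).
levels = [-1, 1]
replicates = 2

# One row per run: every X1/X2 combination, repeated once per replicate.
runs = [(x1, x2)
        for _ in range(replicates)
        for x1, x2 in itertools.product(levels, levels)]

# Columns: intercept (all 1's), X1, X2, and the interaction X1X2,
# where the interaction column is the product of the X1 and X2 columns.
X = np.array([[1, x1, x2, x1 * x2] for x1, x2 in runs])
print(X)
```

Each row is one observation, so two replicates of the four factor combinations give an 8 x 4 matrix.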
Using these data, we will fit the following regression model.
ŷ = b0 + b1X1 + b2X2 + b12X1X2
The XTX matrix is.
The (XTX)-1 matrix is the inverse of the XTX matrix.
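The two matrices can be reproduced numerically. The following is a minimal sketch, again assuming levels coded as -1 and +1 and an arbitrary run order, so the printed values illustrate the structure rather than reproduce the original figures.

```python
import numpy as np

# One replicate of the 2x2 factorial; columns are intercept, X1, X2, X1X2.
# Coding (-1/+1) and row order are assumptions.
base = np.array([[1, -1, -1,  1],
                 [1,  1, -1, -1],
                 [1, -1,  1, -1],
                 [1,  1,  1,  1]])
X = np.vstack([base, base])  # two replicates, eight runs

XtX = X.T @ X                 # 4 x 4 matrix
XtX_inv = np.linalg.inv(XtX)  # its inverse

print(XtX)
print(XtX_inv)
```

Under these assumptions every diagonal entry of XTX equals the number of runs, and every off-diagonal entry is zero.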
Solving for XTy we have.
Solving for the least squares estimates, we have the regression coefficients b.
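The whole calculation, b = (XTX)-1XTy, can be sketched in a few lines. The response values below are hypothetical and are not the data from Table 1; the -1/+1 coding and the run order are likewise assumptions. The result is cross-checked against NumPy's general least squares solver.

```python
import numpy as np

# Design matrix: intercept, X1, X2, X1X2 (assumed -1/+1 coding, two replicates).
base = np.array([[1, -1, -1,  1],
                 [1,  1, -1, -1],
                 [1, -1,  1, -1],
                 [1,  1,  1,  1]])
X = np.vstack([base, base])

# Hypothetical responses, NOT the values from Table 1.
y = np.array([10.0, 14.0, 12.0, 20.0, 11.0, 15.0, 13.0, 19.0])

Xty = X.T @ y                      # the XTy vector
b = np.linalg.inv(X.T @ X) @ Xty   # b = (XTX)^-1 XTy

# Independent check using NumPy's least squares routine.
b_check, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ b              # actual minus predicted
print(b)
print(residuals)
```

In practice a solver such as `np.linalg.lstsq` is usually preferred to forming the inverse explicitly; the explicit inverse is used here only to mirror the steps in the text.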
At this point, there are some interesting things we should discuss. First of all, the (XTX) matrix is a diagonal matrix: all of its off-diagonal elements equal zero. When this is the case, the columns of the design matrix are orthogonal. This just means the levels chosen for the factors in our experiment are not correlated or confounded with each other. As such, the sum of the products of the levels for any two factors will equal zero. This is shown in the table below. Notice that the sum of the products of the levels for factors X1 and X2 is zero. This means the factors X1 and X2 are not correlated. When we analyze the effect of X1 on y, the estimate will not be tainted by the changes in X2. This is also true for the effect of X2 on y.
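The sum-of-products check is easy to verify directly. The sketch below assumes the -1/+1 coding and an arbitrary run order for the eight observations; it is an illustration, not the original table.

```python
import numpy as np

# Assumed coded levels for the eight runs (two replicates of a 2x2 factorial).
x1 = np.array([-1,  1, -1,  1, -1,  1, -1,  1])
x2 = np.array([-1, -1,  1,  1, -1, -1,  1,  1])
x1x2 = x1 * x2  # interaction column

# Sum of the products of the levels for each pair of columns.
sum_x1_x2 = np.sum(x1 * x2)
sum_x1_int = np.sum(x1 * x1x2)
sum_x2_int = np.sum(x2 * x1x2)

# Correlation between the two factor columns.
corr_x1_x2 = np.corrcoef(x1, x2)[0, 1]

print(sum_x1_x2, sum_x1_int, sum_x2_int, corr_x1_x2)
```

These pairwise sums are exactly the off-diagonal elements of the XTX matrix, which is why that matrix is diagonal for an orthogonal design.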
Secondly, notice that the diagonal values of the inverse matrix, (XTX)-1, are the reciprocals of the diagonal values found in the (XTX) matrix. Thus, the XTX matrix of an orthogonal design is easily inverted, which in turn simplifies the computation of the regression coefficients.
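One way to see the simplification: when XTX is diagonal with the number of runs, n, on the diagonal, each coefficient reduces to the sum of products of a column with y, divided by n. The sketch below assumes -1/+1 coding and uses hypothetical response values that are not from Table 1.

```python
import numpy as np

# Assumed coded design (intercept, X1, X2, X1X2) for eight runs.
x1 = np.array([-1,  1, -1,  1, -1,  1, -1,  1])
x2 = np.array([-1, -1,  1,  1, -1, -1,  1,  1])
X = np.column_stack([np.ones(8), x1, x2, x1 * x2])

# Hypothetical responses, NOT the values from Table 1.
y = np.array([10.0, 14.0, 12.0, 20.0, 11.0, 15.0, 13.0, 19.0])

n = len(y)

# Shortcut valid only because XTX = n * I for this orthogonal design.
b_shortcut = X.T @ y / n

# General solution of the normal equations, for comparison.
b_general = np.linalg.solve(X.T @ X, X.T @ y)

print(b_shortcut)
print(b_general)
```

The shortcut needs no matrix inversion at all, but it holds only for designs whose columns are orthogonal and coded this way.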
Orthogonality is a powerful property inherent to statistical experiments for the following reasons.
1. When the levels of a factor are orthogonal to those of all other factors, we can compute the regression estimates with ease.
2. Since the effect of a factor, estimated by b, is not influenced by any off-diagonal elements in the (XTX) matrix, the conclusions we draw about a specific factor effect are independent of all other factor effects.
3. We can estimate the effect of interactions between factors.
Too often, experiments are designed without any consideration of orthogonality. When this is the case, the factors may be correlated. How, then, can we draw any conclusions about a single factor when it is confounded with another factor? We may also lose the opportunity to estimate the interaction effect between factors. As a result, we understand less about the system under investigation, and this impedes our ability to improve it.
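To make the contrast concrete, the sketch below uses a hypothetical set of eight unplanned runs in which X2 tends to move together with X1, and then an extreme case in which the two factors always change together. The settings are invented for illustration only.

```python
import numpy as np

# Hypothetical unplanned runs: X1 and X2 agree in six of the eight runs.
x1 = np.array([-1, -1, -1,  1,  1,  1,  1, -1])
x2 = np.array([-1, -1, -1,  1,  1,  1, -1,  1])

X = np.column_stack([np.ones(8), x1, x2])
XtX = X.T @ X                       # off-diagonal X1/X2 entry is no longer zero
corr = np.corrcoef(x1, x2)[0, 1]    # the factors are correlated

# Extreme case: X2 is changed in lockstep with X1 (complete confounding).
X_conf = np.column_stack([np.ones(8), x1, x1])
rank_conf = np.linalg.matrix_rank(X_conf)

print(XtX)
print(corr)
print(rank_conf)  # fewer independent columns than model terms
```

In the first case the estimate for X1 depends on what X2 was doing; in the second, XTX is singular, so the two effects cannot be separated at all.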
In summary, statistically designed experiments are the most efficient way to conduct experiments. They collect the required information at the least cost and with the least use of resources.
To learn more about this topic I recommend the text: Quality by Experimental Design. It is an applied text in Statistical Experimental Design and Analysis.