EDITORIAL
Mar 1, 2005

If You’ve Ever Used Engineering Judgment, You Just Might Be a Bayesian

Publication: Journal of Environmental Engineering
Volume 131, Issue 3
Chances are that you subscribe to a view of the world not taken by the stats book on your shelf. Further, chances are you don’t even know it.
Consider the following simplified problem. You are confronted with an important design decision; millions of dollars are at stake. You must select either conventional equipment or new equipment. The conventional equipment has an established record of reliability. The new equipment, which is based on the latest technology, is much less tested. However, the new equipment is significantly less expensive.
To weigh the decision more carefully, you feel you need a better idea of the failure rate of the new equipment, where the failure rate is defined as the number that fail out of the total installed. You obtain the available data on the performance of the conventional and new equipment. The conventional equipment has been installed at 10,000 plants, where a failure rate of 1% was observed. In contrast, the new equipment has been installed at only 30 plants. No failure was observed at the first 29 plants, but a failure was observed at the 30th plant. This represents a failure rate of 1/30, or 3.3%.
However, you question whether the observed failure rate of the new equipment, 3.3%, is indicative of the true failure rate that will be observed over a long period of time. In fact, when first having learned about the equipment, you came to the conclusion that it probably would be more reliable than the conventional equipment, i.e., have a failure rate less than 1%. The basis of your reasoning was that the technology underlying the new equipment was less sensitive to variations in the construction process, which was known to be the predominant cause of failure for the standard equipment. Further, the technology had been used successfully in other related applications for which it proved to be reliable. Therefore you wonder if the record of performance is simply too short to draw any meaningful conclusion as to the true failure rate.
To get a better idea of what the true failure rate might be, you turn to the textbook from your course in “Probability and Statistics for Engineers.” After scanning the table of contents, you soon narrow in on the chapter on distributions and the chapter on estimation. You find that a discrete distribution with two outcomes is applicable to your situation; there is a probability p of failure and a probability 1 − p of no failure (success). Further, in the chapter on estimation, you read that the best estimate of the mean is the sample mean. As the failure rate is defined as the number that fail out of the total number installed, the sample mean is simply the average outcome, where failure is represented by “1” and success by “0.” So the best estimate of the failure rate is indeed 1/30, or 3.3%.
But this is the best estimate. What about the effect of chance? Fortunately, a couple of pages later, you find the procedure for estimating a confidence interval. You find that it depends on the sample size n, in this case 30, and σ, the true standard deviation. Earlier, you recall seeing that the unbiased estimate of σ is the standard deviation of the sample, which you calculate to be 18.26% (again using 1 for failure and 0 for success). As per instructions, you calculate a 95% confidence interval: it is estimated as the mean plus or minus 1.96σ/√n, which evaluates to a range of 3.1 to 3.5%. So, you conclude, it does appear that the failure rate is above 3% and not likely to be below 1%, as you had originally surmised.
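The point estimate and the sample standard deviation above are easy to verify. Here is a minimal sketch (not from the editorial) that recomputes both from the 0/1-coded observations:

```python
# 30 observations of the new equipment: 29 successes (0) followed by 1 failure (1)
data = [0] * 29 + [1]
n = len(data)

# Sample mean = observed failure rate
mean = sum(data) / n  # 1/30, about 3.3%

# Unbiased sample standard deviation (divide by n - 1)
var = sum((x - mean) ** 2 for x in data) / (n - 1)
s = var ** 0.5  # about 0.1826, i.e., 18.26%

print(f"failure rate = {mean:.2%}, sample standard deviation = {s:.2%}")
```

Both values match those used in the text: a 3.33% point estimate and an 18.26% sample standard deviation.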
However, you read the text a little closer. You find that the sample mean is the best estimate of the mean for “most applications.” Further, you read that “for small samples from nonnormal populations, we cannot expect our degree of confidence to be accurate. However, for samples of size n ≥ 30, regardless of the shape of most populations, sampling theory guarantees good results” (Walpole et al. 2002). After further reading, you come to realize that the analysis and derivations in the sections you read (and in virtually the entire text, you later find) start with an important assumption: that the true distribution is known (in your case, that the true value of p is known). The analysis is confined to evaluating the properties of a sample from that known distribution, e.g., the best estimate of the mean and associated confidence intervals. The logic that allows you to infer something about the true mean given the sample mean depends on a limiting argument (the central limit theorem) that places the sample mean in a distribution about the true mean as the sample size becomes large.
You are not satisfied with the limitations of the analysis. At least from your text, there seems to be an area of inapplicability, when the sample size is too small for the conclusions to be drawn (too small for the central limit theorem to apply). That your sample size, 30, exactly matches the low end of the range for which you should rely on the confidence interval assessment hardly inspires confidence. And, you think to yourself, if you had a much larger sample size, you would not have had the question anyway. Further, you notice that there was no room in the analysis for your engineering judgment. After all, you have reason to believe that the system should be just as reliable, if not better.
Discouraged, you step back and view the data in a new light. Eventually, you realize that the data on the new equipment (29 successes followed by one failure) could have occurred given any value of the failure rate between 0 and 1. Certainly, though, you reason, the data would have been very unlikely given a failure rate of 0.01%. Similarly, the data would have been very unlikely given a failure rate of 50%. But for failure rates of 1, 2, 3, and 4%, it would not seem to be unlikely. Then you realize that you can actually calculate the probability of the data for any given failure rate. This probability is generally referred to as the “likelihood.” The likelihood of 29 successes followed by one failure is (1 − p)^29 × p. So you set out to calculate the probability of the data for the failure rates shown in Table 1.
Table 1. Likelihood of Data Given Different Failure Rates

  "True" failure rate:           0.01%    1%       2%       3%       50%
  Probability of 29 successes
  followed by one failure,
  given "true" failure rate:     0.01%    0.75%    1.11%    1.24%    0.0000001%
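The entries in Table 1 can be reproduced in a few lines; this sketch simply evaluates the likelihood (1 − p)^29 × p at the candidate failure rates from the table:

```python
# Likelihood of observing 29 successes followed by one failure,
# given a "true" failure rate p
def likelihood(p):
    return (1 - p) ** 29 * p

# Candidate failure rates from Table 1
for p in (0.0001, 0.01, 0.02, 0.03, 0.50):
    print(f"true failure rate {p:.4%}: likelihood of the data = {likelihood(p):.7%}")
```

The printed likelihoods round to the 0.01%, 0.75%, 1.11%, 1.24%, and 0.0000001% entries in the table.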
When you notice the similar likelihoods of the data given failure rates of 2 and 3%, which the table indicates are 1.11 and 1.24%, respectively, you come to the following realization: if the likelihood of the data given a failure rate of 2% is very close to that given 3%, why should my initial belief, that a failure rate of 2% is more probable than 3%, change?
Moreover, why should a failure rate of 1%, effectively ruled out in the classical confidence interval procedure, not be possible? After all, the likelihood of the data given a failure rate of 1% is three-fifths that given a failure rate of 3%. Your initial belief, based upon your engineering judgment, was that a failure rate of 1% was many times more probable than 3%.
Essentially, if the above line of logic seems reasonable, then you just might subscribe to the Bayesian view of probability—that probability relates to degree of belief—and the Bayesian view of science—that alternative hypotheses should be viewed in probabilistic terms. Likely, this was not the view that you were exposed to in your probability and statistics course. Using Bayesian statistical analysis, you would approach the above problem in a way that is suggested by the final observations above. Using engineering judgment, for each hypothesis, i.e., failure rate, you would first assign a “prior” probability that reflects your belief in the hypothesis. After collecting the information that one out of 30 plants failed, you would then update these probabilities according to Bayes' rule, a rule of probability. For each hypothesis, e.g., the failure rate = 1%, the “posterior” probability would be calculated as the product of the likelihood of the data (given in row two in Table 1) and the prior probability, divided by this product summed across all hypotheses. In essence, with application of Bayes' rule, the relative merit of a hypothesis is adjusted by the likelihood of observing the data given that it is true.
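The update just described can be sketched concretely. In the sketch below, both the candidate failure rates and the prior probabilities are hypothetical illustrations (any prior reflecting your engineering judgment could be substituted); the likelihood is the (1 − p)^29 × p expression from the text:

```python
# Hypothesis space: candidate "true" failure rates (illustrative values)
rates = [0.0001, 0.005, 0.01, 0.02, 0.03]

# Hypothetical prior probabilities from engineering judgment,
# favoring rates at or below the conventional equipment's 1%
prior = [0.10, 0.30, 0.40, 0.15, 0.05]  # sums to 1

# Likelihood of the data (29 successes, then 1 failure) under each hypothesis
like = [(1 - p) ** 29 * p for p in rates]

# Bayes' rule: posterior = prior x likelihood, normalized across all hypotheses
joint = [pr * lk for pr, lk in zip(prior, like)]
total = sum(joint)
posterior = [j / total for j in joint]

for p, post in zip(rates, posterior):
    print(f"failure rate {p:.2%}: posterior probability {post:.3f}")
```

With this illustrative prior, the posterior still favors a 1% failure rate after observing the single failure, echoing the point made above: the data are not unlikely enough under 1% to overturn a strong prior belief.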
Bayesian statistics shares a long and sometimes clashing history with its more prominent cousin, classical statistics. Though it is creeping into the engineering curriculum, it should be brought in more speedily. In many ways, it is more relevant to engineering and decision making than classical statistics, and arguably more theoretically appealing (Howson and Urbach 1993). In a recent edition of a probability and statistics text, Bayesian analysis occupies seven pages in an optional section (Walpole et al. 2002). At a minimum, engineers should know the following about the differences between classical statistics and Bayesian statistics. At the top level, the view of probability is different: the classical view is that the probability of an event is its long-run frequency of occurrence; the Bayesian view is that the probability of an event is a person’s assessment of the chance of the event occurring. The analysis of the classical approach starts from the condition that there is no uncertainty in the state of knowledge, only randomness in the sampling. The analysis of the Bayesian approach starts from the condition that there is uncertainty in the state of knowledge. The classical approach strives to remove subjectivity from the analysis. The Bayesian approach insists that it should not be removed (and cannot be anyway). The classical approach seeks to validate or invalidate individual hypotheses. The Bayesian approach seeks to assign and keep track of the probabilities of competing hypotheses. The Bayesian approach is directly applicable to the type of engineering decision making presented here.

References

Howson, C., and Urbach, P. (1993). Scientific reasoning: The Bayesian approach, 2nd Ed., Open Court, Peru, Ill.
Walpole, R. E., Myers, R. H., Myers, S. L., and Ye, K. (2002). Probability and statistics for engineers and scientists, 7th Ed., Prentice-Hall, Upper Saddle River, N.J.

Pages: 333–334

Authors

Kenneth W. Harrison
Dept. of Civil and Environmental Engineering, Univ. of South Carolina, 300 Main St., Columbia, SC 29208. E-mail: [email protected]
