This is a follow-up to my previous post, “Using PERT for
estimating tasks”. The idea is not to present any mathematical proofs for
deriving the PERT distribution but just to give some background on it.
**Graphs will be replaced.
The PERT distribution is a probabilistic model based on the Beta
distribution, and it derives its estimates from the probability of
occurrence, i.e., the chance that an event happens. If an event is likely to
happen, we say it is probable; if it is not likely to happen, we say it is
improbable. Directly or indirectly, probability of occurrence plays a role in
all activities. The probability of occurrence of any event is a number
between 0 and 1: events that are unlikely have a probability near 0, and
events that are likely to happen have a probability near 1.
The Beta distribution is a flexible, continuous distribution, defined on
the interval (0, 1), that places value on the event itself and the interval
between events. Because it accounts for a degree of randomness, it is highly
useful in determining likely event durations (time estimates for tasks). In
other words, PERT can derive a task duration with a high probability of
accuracy from three values provided by experts (optimistic, most likely and
pessimistic). But it is still an estimate, because it is only as good as the
expert estimates used in the computation.
Probability Density Function Graphical Profiles
The Beta distribution is parameterized by two positive shape
parameters, denoted by alpha and beta. It can take on different shapes depending on
the values of the two parameters. Because its domain is the interval from 0 to 1,
the Beta distribution can be used to describe the distribution of an unknown
probability value. The range of the density depends on the chosen shape
parameters.
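To get a feel for how the two shape parameters change the profile, here is a minimal sketch that evaluates the standard Beta density at a few points. The function name `beta_pdf` and the chosen (alpha, beta) pairs are illustrative assumptions, not taken from the original graphs.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of the standard Beta(alpha, beta) distribution at x.

    Returns 0.0 outside the open interval (0, 1).
    """
    if not 0.0 < x < 1.0:
        return 0.0
    # Normalizing constant B(alpha, beta) = Gamma(alpha) * Gamma(beta) / Gamma(alpha + beta)
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / norm

# Illustrative shape pairs: flat, symmetric hump, skewed right, skewed left, U-shaped
shapes = [(1, 1), (2, 2), (2, 5), (5, 2), (0.5, 0.5)]
for alpha, beta in shapes:
    values = [round(beta_pdf(x, alpha, beta), 3) for x in (0.1, 0.5, 0.9)]
    print(f"alpha={alpha}, beta={beta}: density at 0.1, 0.5, 0.9 = {values}")
```

Swapping alpha and beta mirrors the curve around 0.5, which is why (2, 5) and (5, 2) give reversed values.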
Typically, the general form of a distribution model is given in
terms of location and scale parameters. The Beta distribution is different in that
its general form is defined in terms of the lower and upper bounds, a and b. However,
the location and scale parameters can be defined in terms of the lower and
upper bounds as follows:
location = a

scale = b – a

Location and scale parameters
These parameters, location and scale, are used in modeling
applications.
The effect of a location parameter of 10 is to translate the graph,
relative to the standard normal distribution, 10 units to the right on the
horizontal axis. A location parameter of -10 would have shifted the graph 10
units to the left on the horizontal axis. That is, a location parameter simply
shifts the graph left or right on the horizontal axis.
The effect of a scale parameter greater than one is to stretch the Probability
Density Function. The greater the magnitude, the greater the stretching. The
effect of a scale parameter less than one is to compress the function.
The compressing approaches a spike as the scale parameter goes to
zero. A scale parameter of 1 leaves the function unchanged (if the scale
parameter is 1 to begin with) and nonpositive scale parameters are not
allowed.
The standard form of any distribution is the form that has location
parameter zero and scale parameter one.
**Graphs will be replaced.
Ex.1: the following graph has a location
of 0 and scale of 1.
Ex.2: the following graph has a location
of 10 and scale of 1.
Ex.3: the next plot has a scale
parameter of 3 (and a location parameter of zero). The effect of the scale
parameter is to stretch out the graph. The maximum y value is approximately
0.13 as opposed to 0.4 in
the previous graphs. The y value, i.e., the vertical axis value, approaches
zero at about ±9 as opposed to ±3 with the first graph.
Ex.4: In contrast, the next graph has a
scale parameter of 1/3 (=0.333). The effect of this scale parameter is to
squeeze the function. That is, the maximum y value is approximately 1.2 as
opposed to 0.4 and the y value is near zero at ±1 as opposed to ±3.
Ex.5: The following graph shows the
effect of both a location and a scale parameter. The plot has been shifted
right 10 units and stretched by a factor of 3.
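Until the graphs are back, the five examples can be checked numerically. The sketch below assumes, as the text above states, that the plots are relative to the standard normal distribution. It applies the usual location-scale rule, f(x) = g((x - location) / scale) / scale, and the function names are illustrative.

```python
import math

def standard_normal_pdf(z):
    """Density of the standard normal distribution (location 0, scale 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def shifted_scaled_pdf(x, location=0.0, scale=1.0):
    """Normal density after shifting by `location` and stretching by `scale`."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Dividing by scale keeps the total area under the curve equal to 1,
    # which is why stretching the curve also lowers its peak.
    return standard_normal_pdf((x - location) / scale) / scale

# (location, scale) pairs matching Ex.1 to Ex.5
examples = [(0, 1), (10, 1), (0, 3), (0, 1 / 3), (10, 3)]
for number, (location, scale) in enumerate(examples, start=1):
    peak = shifted_scaled_pdf(location, location, scale)
    print(f"Ex.{number}: location={location}, scale={scale:.3f}, "
          f"peak height {peak:.2f} at x={location}")
```

The peak heights come out near 0.40, 0.13 and 1.20 for scales of 1, 3 and 1/3, in line with the values quoted in the examples.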
The Beta distribution in its standard form ranges from zero to one and
takes a wide range of shapes (see the Probability Density Function Graphical
Profiles graph above). Moreover, it can also be rescaled and shifted to
create distributions with a wide range of shapes over any finite range
for various applications. For example, it is used to model expert opinion in the
form of the PERT distribution.
The PERT distribution came out of the need to describe the
uncertainty in tasks during the development of a complex project with thousands
of tasks. Estimates needed to be made intuitively, quickly and with a consistent
approach.
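The Beta distribution can be rescaled to model a variable that runs
from a to b by using the following formula, where Beta(alpha, beta) is a
standard Beta variable with shape parameters alpha and beta:
x = a + Beta(alpha, beta) * (b - a)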
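As a rough illustration of this rescaling, the sketch below draws standard Beta values with Python's `random.betavariate` and maps them onto the interval from a to b. The shape parameters (2, 4) and the bounds (5, 20) are arbitrary values picked for the example, and `rescaled_beta_sample` is a hypothetical helper name.

```python
import random

def rescaled_beta_sample(alpha, beta, a, b, rng=random):
    """Draw one value from a Beta(alpha, beta) variable rescaled to run from a to b."""
    if b <= a:
        raise ValueError("upper bound b must be greater than lower bound a")
    # betavariate returns a value between 0 and 1; shift by a and stretch by (b - a)
    return a + rng.betavariate(alpha, beta) * (b - a)

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [rescaled_beta_sample(2.0, 4.0, 5.0, 20.0, rng) for _ in range(10_000)]

sample_mean = sum(samples) / len(samples)
# Mean of the standard Beta is alpha / (alpha + beta), rescaled the same way
theoretical_mean = 5.0 + (20.0 - 5.0) * 2.0 / (2.0 + 4.0)
print(f"smallest sample: {min(samples):.2f}, largest sample: {max(samples):.2f}")
print(f"sample mean: {sample_mean:.2f}, theoretical mean: {theoretical_mean:.2f}")
```

Every sample stays inside the chosen bounds, and the sample mean settles near the rescaled Beta mean.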
This is the four-parameter version of the Beta distribution. A version
of this four-parameter Beta distribution is called a PERT distribution and makes
the assumption that the mean = (minimum + 4*most likely + maximum) / 6. The
default value of 4 scales the height of the distribution. This extra equation
allows the four parameters to be determined from three input values: the
minimum, most likely and maximum, which makes it ideal for modeling expert
opinion of a variable's uncertainty.
Mean= (Min+Max+4*Mode)/(4+2)
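One common way to turn the three inputs into the two Beta shape parameters is sketched below. The formulas for alpha and beta are the usual PERT parameterization and are an assumption here, since the post does not spell them out. The check at the end confirms that they reproduce the mean formula above, and the sample values 2, 4 and 12 are made up.

```python
def pert_shape_parameters(minimum, mode, maximum, weight=4.0):
    """Shape parameters (alpha, beta) of a PERT distribution.

    Assumes the common parameterization in which `weight` (default 4)
    controls how strongly the most likely value pulls the mean.
    """
    if not minimum <= mode <= maximum or minimum == maximum:
        raise ValueError("need minimum <= mode <= maximum and minimum < maximum")
    span = maximum - minimum
    alpha = 1.0 + weight * (mode - minimum) / span
    beta = 1.0 + weight * (maximum - mode) / span
    return alpha, beta

def pert_mean(minimum, mode, maximum, weight=4.0):
    """Mean = (Min + Max + weight * Mode) / (weight + 2)."""
    return (minimum + maximum + weight * mode) / (weight + 2.0)

# Made-up estimates for illustration only
low, most_likely, high = 2.0, 4.0, 12.0
alpha, beta = pert_shape_parameters(low, most_likely, high)

# Mean of the rescaled Beta: low + (high - low) * alpha / (alpha + beta)
mean_from_beta = low + (high - low) * alpha / (alpha + beta)
print(f"alpha={alpha:.2f}, beta={beta:.2f}")
print(f"mean from Beta: {mean_from_beta:.4f}, mean from formula: "
      f"{pert_mean(low, most_likely, high):.4f}")
```

With a weight of 4 the two shape parameters always sum to 6, which is where the divisor (4 + 2) in the mean comes from.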

Using three-point estimation, estimates of the mean, standard
deviation and variance can be obtained:
Expected Time (ET) = (Optimistic + 4 x Most Likely + Pessimistic) / 6
or, writing Min for the optimistic and Max for the pessimistic estimate:
Expected Time (ET) = (Min + 4 x Most Likely + Max) / 6
Standard Deviation (SD) = (Max - Min) / 6
Variance (V) = SD**2 (Standard Deviation squared)
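These three formulas are easy to wrap in a small helper. The sketch below is one possible layout, not a reference implementation: the class name `ThreePointEstimate` is hypothetical and the input values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class ThreePointEstimate:
    """Three-point estimate for one task (all values in the same time unit)."""
    optimistic: float    # Min
    most_likely: float   # MLT
    pessimistic: float   # Max

    @property
    def expected_time(self):
        # ET = (Min + 4 x Most Likely + Max) / 6
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    @property
    def standard_deviation(self):
        # SD = (Max - Min) / 6
        return (self.pessimistic - self.optimistic) / 6

    @property
    def variance(self):
        # V = SD squared
        return self.standard_deviation ** 2

# Invented example: a task estimated at 2 days best case, 4 likely, 12 worst case
task = ThreePointEstimate(optimistic=2, most_likely=4, pessimistic=12)
print(f"ET={task.expected_time:.3f} SD={task.standard_deviation:.3f} "
      f"V={task.variance:.3f}")
```

For this invented task the expected time is 5 days, noticeably above the most likely value of 4, because the pessimistic estimate sits much further from the mode than the optimistic one.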
Table Example
List of all tasks lying on the critical path of a project

Tasks  | Min | MLT | Max | ET | SD | V
Task A |     |     |     |    |    |
Task B |     |     |     |    |    |
Task C |     |     |     |    |    |

(All time durations are estimates.)
Legend
Min = Optimistic time: generally the shortest time in which the
activity can be completed.
MLT = Most Likely Time: the completion time having the highest
probability.
Max = Pessimistic time: the longest time that an activity might
require.
SD = Standard Deviation: the average deviation from the estimated
time (as a general rule, the higher the SD, the greater the amount of
uncertainty).
V = Variance: reflects the spread of a value over a normal
distribution (the SD and V will be useful in determining the probability of the
project meeting a desired completion date).
ET = Expected Time
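To show how SD and V feed into the probability of meeting a completion date, here is a minimal sketch of the usual PERT approach. It rests on two assumptions that the post does not state: the tasks on the critical path are independent, and their total duration is approximately normal. The task values are hypothetical and are not the contents of the table above.

```python
import math

def completion_probability(tasks, deadline):
    """Approximate probability that the critical path finishes by `deadline`.

    `tasks` is a list of (minimum, most_likely, maximum) tuples.
    Assumes independent tasks and a normally distributed total duration.
    """
    total_expected = 0.0
    total_variance = 0.0
    for minimum, most_likely, maximum in tasks:
        total_expected += (minimum + 4 * most_likely + maximum) / 6  # ET
        total_variance += ((maximum - minimum) / 6) ** 2             # V = SD**2
    if total_variance == 0:
        return 1.0 if deadline >= total_expected else 0.0
    # Standard score of the deadline, then the normal cumulative probability
    z = (deadline - total_expected) / math.sqrt(total_variance)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical critical path: (Min, MLT, Max) for Task A, Task B, Task C
critical_path = [(2, 4, 12), (3, 5, 7), (1, 2, 9)]
for deadline in (13, 15, 18):
    probability = completion_probability(critical_path, deadline)
    print(f"deadline {deadline}: probability of finishing on time {probability:.2f}")
```

Variances are summed rather than standard deviations, so the project SD is the square root of the total variance. With these made-up numbers the expected total is 13, which gives an even chance of finishing by that date.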