
Follow-Up: Estimating using PERT based on Beta Distribution Model

This is a follow-up to my previous post about “Using PERT for estimating tasks”. The idea is not to illustrate any mathematical proofs for deriving the PERT Distribution but just to give a hint on its background.

The PERT distribution is a probabilistic model based on the Beta distribution, and it derives its estimates from the probability of occurrence. Probability here means chance or possibility: if an event is likely to happen, we say it is probable; if it is not likely to happen, we say it is improbable. Directly or indirectly, probability of occurrence plays a role in all activities. The probability of any event is a number between 0 and 1. Events that are unlikely have a probability near 0, and events that are likely to happen have a probability near 1.

The Beta model is a flexible, continuous distribution, defined on the interval (0, 1), that places value on the event itself and the interval between events. Because it accounts for a degree of randomness, it is useful in determining likely event durations (time estimates for tasks). In other words, PERT can derive a task duration from three values provided by expert estimates (pessimistic, optimistic and most likely). It is still an estimate, though, because it is only as good as the estimates used in the computation.

Probability Density Function Graphical Profiles
The Beta distribution is parameterized by two positive shape parameters, denoted by alpha and beta. It can take on different shapes depending on the values of the two parameters. Because its domain is the interval from 0 to 1, the Beta distribution can be used to describe the distribution of an unknown probability value. The range of the density values depends on the chosen shape parameters.
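To make the effect of the two shape parameters concrete, here is a minimal sketch that evaluates the standard Beta density at a few points. The function name beta_pdf and the parameter pairs are chosen for illustration only; they do not come from the original graph.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of the standard Beta(alpha, beta) distribution at x in (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    # Normalising constant B(alpha, beta) = Gamma(alpha) * Gamma(beta) / Gamma(alpha + beta)
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / norm

# Illustrative shape parameters (assumed for this example only)
for alpha, beta in [(2, 2), (2, 5), (5, 2), (0.5, 0.5)]:
    values = [round(beta_pdf(x, alpha, beta), 3) for x in (0.1, 0.5, 0.9)]
    print(f"alpha={alpha}, beta={beta}: density at 0.1, 0.5, 0.9 = {values}")
```

Equal parameters give a symmetric curve, alpha smaller than beta pushes the mass toward 0, and alpha larger than beta pushes it toward 1.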

Typically, the general form of a distribution model is given in terms of location and scale parameters. The Beta distribution is different in that its general form is defined in terms of the lower and upper bounds, a and b. However, the location and scale parameters can be defined in terms of the lower and upper limits as follows:

location = a
scale = b - a

Location and scale parameters

These parameters, location and scale, are used in modeling applications.
A location parameter of 10 translates the graph 10 units to the right on the horizontal axis, relative to the standard normal distribution. A location parameter of -10 would shift the graph 10 units to the left. That is, a location parameter simply shifts the graph left or right on the horizontal axis.
The effect of a scale parameter greater than one is to stretch the Probability Density Function. The greater the magnitude, the greater the stretching. The effect of a scale parameter less than one is to compress the function.
The compression approaches a spike as the scale parameter goes to zero. A scale parameter of 1 leaves the standard function unchanged, and non-positive scale parameters are not allowed.
The standard form of any distribution is the form that has location parameter zero and scale parameter one.

**Graphs will be replaced.

Ex.1: the following graph has a location of 0 and scale of 1.

Ex.2: the following graph has a location of 10 and scale of 1.

Ex.3: the next plot has a scale parameter of 3 (and a location parameter of zero). The effect of the scale parameter is to stretch out the graph. The maximum y value is approximately 0.13 as opposed to 0.4 in the previous graphs. The y value, i.e., the vertical axis value, approaches zero at about (+/-) 9 as opposed to (+/-) 3 with the first graph.

Ex.4: In contrast, the next graph has a scale parameter of 1/3 (=0.333). The effect of this scale parameter is to squeeze the function. That is, the maximum y value is approximately 1.2 as opposed to 0.4 and the y value is near zero at (+/-) 1 as opposed to (+/-) 3.

Ex.5: The following graph shows the effect of both a location and a scale parameter. The plot has been shifted right 10 units and stretched by a factor of 3.
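The peak heights quoted in the examples above can be checked with a short sketch. It assumes the graphs show a normal density, as the reference to the standard normal distribution suggests, and uses the general rule that a density with location and scale equals the standard density evaluated at (x - location) / scale, divided by scale. The function name normal_pdf is illustrative.

```python
import math

def normal_pdf(x, location=0.0, scale=1.0):
    """Normal density obtained by shifting and scaling the standard normal."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = (x - location) / scale
    standard = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    # Dividing by scale keeps the total area under the curve equal to 1
    return standard / scale

# Location and scale pairs from Ex.1 to Ex.5
for location, scale in [(0, 1), (10, 1), (0, 3), (0, 1 / 3), (10, 3)]:
    peak = normal_pdf(location, location, scale)
    print(f"location={location}, scale={scale:.3f}: peak height {peak:.3f} at x={location}")
```

Stretching by a factor of 3 divides the peak height by 3, and squeezing by a factor of 3 multiplies it by 3, which matches the values of about 0.13 and 1.2 quoted above.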

The Beta distribution in its standard form ranges from zero to one and takes a wide range of shapes (see the Probability Density Function Graphical Profiles graph on top). Moreover, it can also be rescaled and shifted to create distributions with a wide range of shapes over any finite range for various applications. For example, it’s used to model expert opinion in the form of the PERT distribution.
The PERT distribution comes out of the need to describe the uncertainty in tasks during the development of a complex project having thousands of tasks. Estimates needed to be made intuitively, quickly and with a consistent approach.
The Beta distribution can be rescaled to model a variable that runs from a to b by using the following formula:
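x = a + Beta(alpha, beta) * (b - a)
Here Beta(alpha, beta) is a standard Beta variable on the interval (0, 1) with shape parameters alpha and beta, and a and b are the lower and upper bounds.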
This is the four-parameter version of the Beta distribution. A version of this four-parameter Beta distribution is called a PERT distribution and makes the assumption that the mean = (minimum + 4*most likely + maximum) / 6. The default value of 4 scales the height of the distribution. This extra equation allows the four parameters to be determined from three input values: the minimum, most likely and maximum, which makes it ideal for modeling expert opinion of a variable's uncertainty.

Mean = (Min + 4 * Mode + Max) / (4 + 2)
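As a sketch of how the three inputs can determine the shape parameters, the following uses the parameterization commonly given for the PERT distribution: alpha = 1 + 4 * (Mode - Min) / (Max - Min) and beta = 1 + 4 * (Max - Mode) / (Max - Min). This parameterization is an assumption brought in for illustration (it is not stated in this post), as are the function names and the sample values. With it, the mean of the rescaled Beta distribution equals the PERT mean formula.

```python
import random

def pert_shape_parameters(minimum, mode, maximum, weight=4.0):
    """Shape parameters of the Beta distribution behind a PERT estimate (assumed form)."""
    if not minimum <= mode <= maximum or minimum == maximum:
        raise ValueError("need minimum <= mode <= maximum and minimum < maximum")
    span = maximum - minimum
    alpha = 1 + weight * (mode - minimum) / span
    beta = 1 + weight * (maximum - mode) / span
    return alpha, beta

def pert_sample(minimum, mode, maximum, rng):
    """Draw one value: x = a + Beta(alpha, beta) * (b - a)."""
    alpha, beta = pert_shape_parameters(minimum, mode, maximum)
    return minimum + rng.betavariate(alpha, beta) * (maximum - minimum)

# Hypothetical three-point estimate, in days
minimum, mode, maximum = 2.0, 4.0, 12.0
alpha, beta = pert_shape_parameters(minimum, mode, maximum)
beta_mean = minimum + (maximum - minimum) * alpha / (alpha + beta)
pert_mean = (minimum + 4 * mode + maximum) / 6

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [pert_sample(minimum, mode, maximum, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(f"alpha={alpha:.2f}, beta={beta:.2f}")
print(f"mean from Beta parameters: {beta_mean:.3f}, mean from PERT formula: {pert_mean:.3f}")
print(f"mean of simulated samples: {sample_mean:.3f}")
```

The simulated mean should land close to the formula value, and every sample stays between the minimum and the maximum.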

Using three-point estimation, estimates of the mean, standard deviation and variance can be obtained:
Expected Time (ET) = (Optimistic + 4 x Most likely + Pessimistic) / 6
or, equivalently, Expected Time (ET) = (Min + 4 x Most likely + Max) / 6
Standard Deviation (SD) = (Max - Min) / 6
Variance (V) = SD**2 (Standard Deviation squared)
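The three formulas translate directly into a small helper. The function name and the sample values (in days) are hypothetical.

```python
def three_point_estimate(optimistic, most_likely, pessimistic):
    """Return (expected time, standard deviation, variance) for one task."""
    expected_time = (optimistic + 4 * most_likely + pessimistic) / 6
    standard_deviation = (pessimistic - optimistic) / 6
    variance = standard_deviation ** 2
    return expected_time, standard_deviation, variance

# Hypothetical task: optimistic 3, most likely 6, pessimistic 15 days
et, sd, v = three_point_estimate(3, 6, 15)
print(f"ET = {et}, SD = {sd}, V = {v}")  # ET = 7.0, SD = 2.0, V = 4.0
```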

Table Example

List of all tasks lying on the critical path of a project
Task A

Task B

Task C

(All time durations are estimates.)
-Min = Optimistic time is generally the shortest time in which the activity can be completed.
-Most Likely Time (MLT) is the completion time having the highest probability.
-Max = Pessimistic time is the longest time that an activity might require.
-SD (Standard Deviation) is the average deviation from the estimated time (as a general rule, the higher the SD, the greater the uncertainty).
-V (Variance) reflects the spread of a value over a normal distribution (the SD and V will be useful in determining the probability of the project meeting a desired completion date).
-ET = Expected Time
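One common way to use SD and V for a completion-date probability is to add up the expected times and variances of the tasks on the critical path and treat the total duration as approximately normal. The sketch below follows that textbook approach, which assumes the tasks are independent. The estimates for Task A, Task B and Task C are hypothetical, because the figures from the table are not available here.

```python
from statistics import NormalDist

# Hypothetical (optimistic, most likely, pessimistic) estimates in days
critical_path = {
    "Task A": (3, 6, 15),
    "Task B": (2, 4, 6),
    "Task C": (4, 5, 12),
}

def completion_probability(tasks, target):
    """Probability of finishing the critical path within `target` days (normal approximation)."""
    total_et = 0.0
    total_variance = 0.0
    for optimistic, most_likely, pessimistic in tasks.values():
        total_et += (optimistic + 4 * most_likely + pessimistic) / 6
        total_variance += ((pessimistic - optimistic) / 6) ** 2
    # Variances add for independent tasks; standard deviations do not
    project = NormalDist(mu=total_et, sigma=total_variance ** 0.5)
    return total_et, total_variance, project.cdf(target)

total_et, total_variance, probability = completion_probability(critical_path, target=20)
print(f"Expected project time: {total_et:.2f} days")
print(f"Project variance: {total_variance:.2f}")
print(f"Probability of finishing within 20 days: {probability:.1%}")
```

A target equal to the expected project time gives a probability of 50%. Later targets raise it and earlier targets lower it.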
