Fortunately, there is software to compute the best-fitting straight line (hence "linear") that expresses the past relationship between the dependent and independent variables. Continuing our example, you will enter 1) the amounts of the past monthly electricity bills, and 2) the number of machine hours occurring during the period covered by each bill. Next, the software will likely use the least squares method to produce the formula for the best-fitting line. The line will appear in the form y = a + bx, where y is the dependent variable (the electricity bill), x is the independent variable (machine hours), a is the intercept (the estimated bill at zero machine hours), and b is the slope (the estimated change in the bill for each additional machine hour). In addition, the software will provide statistics regarding the correlation, confidence, dispersion around the line, and more.
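To make the least squares step concrete, here is a minimal sketch in Python that computes a and b for y = a + bx, along with R-squared (the share of the variation in y explained by the line). The function name fit_line and the six monthly observations are hypothetical, made up for illustration only. Real software will typically report many more statistics than this.

```python
def fit_line(x, y):
    """Return (a, b, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so no slope can be estimated")
    b = sxy / sxx              # slope: change in y per unit of x
    a = mean_y - b * mean_x    # intercept: estimated y when x is zero
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return a, b, r_squared


# Hypothetical data: machine hours and the electricity bill for the same months.
machine_hours = [100, 150, 200, 250, 300, 350]
electricity_bill = [1250, 1480, 1790, 2010, 2240, 2530]

a, b, r2 = fit_line(machine_hours, electricity_bill)
print(f"bill = {a:.2f} + {b:.4f} * machine hours (R-squared = {r2:.4f})")
```

Under these assumptions, a can be read as the estimated portion of the bill that does not change with machine hours, and b as the estimated cost per machine hour.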
(In all likelihood, many independent variables cause the amount of the dependent variable to change. Therefore, you should not expect a single independent variable to explain a high percentage of the change in the dependent variable. To increase the percentage explained, think of the other independent variables, or drivers, that could cause the dependent variable to change. Then test the combined effect of these drivers by using multiple regression analysis software.)
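As a rough illustration of what multiple regression software does, the sketch below fits y = a + b1*x1 + b2*x2 by solving the least squares normal equations with Gaussian elimination. The function name fit_multiple, the second driver (cooling degree days), and all of the figures are hypothetical. Dedicated statistical packages use more robust numerical methods and also report significance statistics, which this sketch omits.

```python
def fit_multiple(rows, y):
    """Return least-squares coefficients [a, b1, b2, ...] for
    y = a + b1*x1 + b2*x2 + ..., where each row holds one period's drivers."""
    # Design matrix: a leading 1 in each row stands for the intercept a.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("drivers are collinear; coefficients are not unique")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data: (machine hours, cooling degree days) for each month,
# and the electricity bill for the same month.
drivers = [(100, 20), (150, 35), (200, 10), (250, 40), (300, 25), (350, 30)]
bills = [1300, 1620, 1750, 2150, 2270, 2560]

a, b_hours, b_cooling = fit_multiple(drivers, bills)
print(f"bill = {a:.2f} + {b_hours:.4f} * hours + {b_cooling:.4f} * cooling days")
```

Each coefficient estimates the effect of one driver while the other is held constant, which is why adding a relevant driver can raise the percentage of the change that is explained.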
Before using simple linear regression analysis, it is important to follow these preliminary steps:
- seek an independent variable that is likely to cause or drive the change in the dependent variable
- make certain that each past amount of the independent variable covers exactly the same period as the corresponding amount of the dependent variable
- plot the past observations on a graph using the y-axis for the cost (monthly electricity bill) and the x-axis for the activity (machine hours used during the exact period of the electricity bill)
- review the plotted observations for a linear pattern and for any outliers
- keep in mind that there can be correlation without cause and effect
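
Two of the steps above, matching the periods and reviewing for outliers, can be partly supported with a few lines of Python. The sketch below is a numeric complement to the graph, not a replacement for looking at it. The function names, the period labels, the data, and the cutoff of two residual standard deviations are all assumptions chosen for illustration. The cutoff is a common rule of thumb, not something the steps above prescribe.

```python
from math import sqrt


def align_by_period(bills, hours):
    """Keep only the periods that appear in both records, in sorted order."""
    periods = sorted(set(bills) & set(hours))
    return periods, [hours[p] for p in periods], [bills[p] for p in periods]


def flag_outliers(periods, x, y, threshold=2.0):
    """Return the periods whose distance from the fitted line exceeds
    `threshold` residual standard deviations (an assumed cutoff)."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    spread = sqrt(sum(r * r for r in residuals) / (n - 2))
    if spread == 0:
        return []  # every point lies exactly on the line
    return [p for p, r in zip(periods, residuals) if abs(r) > threshold * spread]


# Hypothetical records keyed by billing period. The bill for 2023-09 has no
# matching machine-hour record, so it is dropped before any analysis.
bills = {"2023-01": 700, "2023-02": 900, "2023-03": 1100, "2023-04": 1700,
         "2023-05": 1500, "2023-06": 1700, "2023-07": 1900, "2023-08": 2100,
         "2023-09": 2300}
hours = {"2023-01": 100, "2023-02": 200, "2023-03": 300, "2023-04": 400,
         "2023-05": 500, "2023-06": 600, "2023-07": 700, "2023-08": 800}

periods, x, y = align_by_period(bills, hours)
print("periods used:", periods)
print("possible outliers:", flag_outliers(periods, x, y))
```

A flagged period is only a prompt to investigate, for example to check for a billing error or an unusual event. It is not proof that the observation should be removed.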