Correlation for Uncertainty Tool Inputs  

Correlation means that two distribution samples are not generated independently; instead, there is some relationship between the values being generated. OSCAM Ship v8.4 introduces the ability to define correlation relationships between pairs of inputs in the Uncertainty Tool.

OSCAM uses rank correlation, which means that the ranks of the values in the two samples (their positions when ordered from smallest to largest) agree to some specified degree. Because it depends only on ranks, rank correlation allows completely different probability distributions to be correlated effectively.
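The following minimal Python sketch illustrates why rank correlation works across different distributions. It is not OSCAM code: the function names (`ranks`, `pearson`, `rank_correlation`) and the sample data are assumptions made purely for illustration. It computes rank correlation as the ordinary correlation coefficient of the ranks, and compares it with the correlation of the raw values for two samples that have very different shapes but the same ordering.

```python
import math
import random


def ranks(values):
    """Return the rank of each value (0 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def pearson(a, b):
    """Ordinary (linear) correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_correlation(a, b):
    """Rank correlation: the linear correlation of the ranks, not the values."""
    return pearson(ranks(a), ranks(b))


# Illustrative data only: a uniform sample and a strongly skewed
# transformation of it that preserves the ordering of the values.
rng = random.Random(1)
a = [rng.uniform(0.0, 1.0) for _ in range(1000)]
b = [math.exp(5.0 * x) for x in a]

print(rank_correlation(a, b))  # the ranks agree completely
print(pearson(a, b))           # the raw values are less than perfectly correlated
```

Because the two samples share the same ordering, their rank correlation is perfect even though their shapes differ, which is the property that lets inputs with unrelated distribution types be correlated.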

Technical explanation of the approach used by OSCAM:
When defining correlation between two uncertainty inputs, one input is the Independent input and the other is the Dependent input. The Independent input sample is generated first, and the Dependent sample is then generated so that it displays the required degree of rank correlation. Note that the heuristic used will not generate the target rank correlation exactly (unless the target is 1 or -1), but the result will be close to the target value.

The approach uses a Cholesky decomposition with Monte Carlo sampling on two simple distributions. A sample from a simple distribution, X, is sorted into the same rank order as the Independent sample. X is then used to generate a sample from a second simple distribution, Y, that has approximately the correct rank correlation with X (it is a sample from a population with the required correlation). The Dependent input sample is then generated using the distribution and parameters that are defined in OSCAM, and that sample is sorted into the same rank order as Y. This means that the Independent and Dependent samples have the same rank correlation as the simple distributions X and Y.

This approach is commonly used by statistical analysis packages and Excel add-ins, and it very quickly generates samples with close to the required degree of correlation (generally within +/- 0.05 of the required correlation). OSCAM can generate two correlated samples of 10,000 values each in a fraction of a second.
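The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not OSCAM's actual implementation: the text does not say which "simple distribution" is used, so standard normal samples are assumed here, and the function names (`reorder_to_match`, `correlate`, `rank_correlation`) and the example input distributions are hypothetical.

```python
import math
import random


def rank_order(values):
    """Return the rank of each value (0 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def reorder_to_match(sample, reference):
    """Rearrange sample so that it has the same rank order as reference."""
    sorted_sample = sorted(sample)
    return [sorted_sample[r] for r in rank_order(reference)]


def correlate(independent, dependent, target, rng):
    """Reorder the dependent sample to approximate the target rank correlation."""
    n = len(independent)
    # Simple distribution X (assumed standard normal), sorted into the
    # same rank order as the Independent sample.
    x = reorder_to_match([rng.gauss(0.0, 1.0) for _ in range(n)], independent)
    # The Cholesky factor of [[1, t], [t, 1]] is [[1, 0], [t, sqrt(1 - t^2)]],
    # so Y = t*X + sqrt(1 - t^2)*Z is drawn from a population whose
    # correlation with X is t.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(1.0 - target ** 2)
    y = [target * xi + scale * zi for xi, zi in zip(x, z)]
    # Sort the Dependent sample into the same rank order as Y.
    return reorder_to_match(dependent, y)


def rank_correlation(a, b):
    """Linear correlation of the ranks of a and b."""
    ra, rb = rank_order(a), rank_order(b)
    mean = (len(a) - 1) / 2.0
    cov = sum((p - mean) * (q - mean) for p, q in zip(ra, rb))
    var = sum((p - mean) ** 2 for p in ra)
    return cov / var


# Example with two unrelated distribution types (illustrative parameters).
rng = random.Random(42)
independent = [rng.triangular(10.0, 30.0, 15.0) for _ in range(10000)]
dependent = [rng.lognormvariate(0.0, 0.5) for _ in range(10000)]
correlated = correlate(independent, dependent, 0.7, rng)
print(rank_correlation(independent, correlated))
```

The reordering changes only the order of the Dependent sample, never its values, so the Dependent input keeps exactly the distribution that was defined for it. With a target of 1 or -1 the Z term vanishes and the rank orders match (or mirror) exactly, which is consistent with the note that only those targets are reproduced exactly.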

Defining Correlation Between Two Uncertainty Inputs

Any existing correlation relationships are shown on the Uncertainty Input form in the Uncertainty Input Selection List. The Correlation column shows the input identifier (sector code and input number) of the Independent uncertainty input for any correlation relationship, and the target correlation coefficient value.

Uncertainty Input form showing correlated inputs

Uncertainty Input form showing correlations defined between inputs

Correlation between two uncertainty inputs can be defined via the Set Sampling Distribution dialog for the Dependent input (i.e. the input whose sample will be sorted to exhibit the required rank correlation). You can open the Set Sampling Distribution dialog by double-clicking the Distribution cell, or by clicking the ellipsis button for the appropriate input.

The related input to be correlated with is shown at the bottom of the Set Sampling Distribution dialog. If no correlation relationship is defined then this will be shown as "<None>".

Set Sampling Distribution dialog with correlation defined

Set Sampling Distribution dialog showing a defined correlation relationship

Click the ellipsis button next to the Correlated Input box to select the Independent input to be correlated with. The Correlated Input Selection form will be displayed. Note that only uncertainty inputs that you have already selected from the Uncertainty Input tree, and that are shown in the Uncertainty Input List, will be available for selection.

Correlated Input Selection Form

Correlated Input Selection form with correlated input selected

The name of the Dependent input will be shown in the dialog caption at the top. All uncertainty inputs are shown in the table. You can sort inputs by sector order (i.e. the order in which they are displayed in the Uncertainty Tree View), or by selection order (the order in which they are displayed in the Uncertainty Input List). You can view all uncertainty inputs or filter so that only inputs from a specified sector are displayed. The "<None>" option will always be shown at the top of the list.

The Independent input currently selected for the Dependent input is marked by an arrow in the left-hand column of the table. The Dependent input itself is shown in gray text and cannot be selected (you cannot select an input to be correlated with itself). Any inputs that are ineligible for selection are shown in red. These cannot be selected because they would cause circular correlation relationships (i.e. a chain of correlation relationships that leads back to the Dependent input). In the example above, PS4 WO Crew is shown in red because PS3 Officer Crew (the input that we are defining the correlation relationship for) is already the Independent input for PS4. This can be seen in the "Correlation" and "Independent Correlation Variable For ..." columns. You can see more information by hovering the mouse cursor over any of the cells and viewing the expanded comments and explanations that are shown in the Long Description panel at the bottom of the main OSCAM application form. See below for more information on correlation chains and circular references.

The Correlation Value is entered at the bottom of the Correlated Input Selection form, or it can be entered in the Set Sampling Distribution dialog.

Correlation Chains and Circular References

An uncertainty input can be the Independent input for several Dependent inputs. For example, the number of Officer crew and the number of Warrant Officer crew could both reference the number of Enlisted crew as the Independent input in a correlation relationship. It is also possible to select as the Independent input an uncertainty input that is itself the Dependent input in another correlation relationship. For example, the number of Officer crew may be correlated with the number of Enlisted crew as the Independent input, and the number of Warrant Officer crew correlated with the number of Officer crew as the Independent input. In this example there is an indirect correlation between Warrant Officer crew and Enlisted crew through correlation relationship chaining. This is perfectly acceptable provided that the same uncertainty input does not appear twice in the same chain. If an input did appear twice in the chain, it would be indirectly correlated with itself, which is not permitted.

Correlation chain examples, with and without circular references

Examples of correlation chains with and without circular references

OSCAM prevents circular correlation chains in the Correlated Input Selection form by checking the chains, displaying in red any input that would create a circular chain, and preventing the selection of those inputs. If the mouse cursor is hovered over a cell in the Correlation column, the full correlation chain will be shown in the Long Description panel of the main OSCAM application form.

Selecting input PS4 would cause a circular reference so it is displayed in red

Correlated Input Selection form showing circular reference in red text
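
The circular-reference check described in this section can be sketched in Python as follows. This is a hypothetical illustration, not OSCAM's code: it assumes each Dependent input has exactly one Independent input, stored in a dictionary named `independent_of`, and the function name `would_create_cycle` and the short input identifiers are invented for the example. The check follows the chain upward from the candidate Independent input and reports a circular reference if the chain leads back to the Dependent input.

```python
def would_create_cycle(independent_of, dependent, candidate):
    """Return True if making candidate the Independent input of dependent
    would produce a chain that leads back to dependent.

    independent_of maps each Dependent input to its Independent input
    (an assumed representation of the existing correlation relationships).
    """
    seen = set()
    current = candidate
    while current is not None and current not in seen:
        if current == dependent:
            return True   # the chain returns to the Dependent input
        seen.add(current)
        # Move up the chain; get() returns None when the chain ends.
        current = independent_of.get(current)
    return False


# PS3 is already the Independent input for PS4, so PS4 cannot become
# the Independent input for PS3.
existing = {"PS4": "PS3"}
print(would_create_cycle(existing, "PS3", "PS4"))  # True
print(would_create_cycle(existing, "PS3", "PS2"))  # False
```

An input selected as its own Independent input is rejected by the same test, since the chain starts at the Dependent input itself.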