21 Reports
Predicted Classification Report
Predicted Classification Section
Row  Actual     Predicted  Pcnt1  Pcnt2  Pcnt3
1    Setosa     Setosa     100.0    0.0    0.0
2    Virginica  Virginica    0.0    0.0  100.0
3    Versicolo  Versicolo    0.0   99.6    0.4
4    Virginica  Virginica    0.0    0.0  100.0
5    Virginica  Versicolo    0.0   72.9   27.1
6    Setosa     Setosa     100.0    0.0    0.0
7    Virginica  Virginica    0.0    0.0  100.0
8    Versicolo  Versicolo    0.0   96.0    4.0
This report shows the actual group, the predicted group, and the percentage probabilities of each row. The definitions are given above in the Misclassified Rows Report.
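As a minimal sketch of how the Predicted column relates to the percentages, the snippet below assigns each row to the group with the largest percentage and lists the rows where that disagrees with the Actual column. It assumes that Pcnt1, Pcnt2, and Pcnt3 correspond to Setosa, Versicolo, and Virginica, in that order; this ordering is inferred from the report values and is not stated in the report itself. The function names are hypothetical.

```python
# Group order for Pcnt1..Pcnt3 is an assumption inferred from the report values.
GROUPS = ["Setosa", "Versicolo", "Virginica"]

# (row, actual group, [Pcnt1, Pcnt2, Pcnt3]) copied from the report above.
ROWS = [
    (1, "Setosa", [100.0, 0.0, 0.0]),
    (3, "Versicolo", [0.0, 99.6, 0.4]),
    (5, "Virginica", [0.0, 72.9, 27.1]),
    (8, "Versicolo", [0.0, 96.0, 4.0]),
]


def predict(percents):
    """Return the group whose percentage is largest."""
    best = max(range(len(percents)), key=lambda i: percents[i])
    return GROUPS[best]


def misclassified(rows):
    """Return the row numbers whose predicted group differs from the actual group."""
    return [row for row, actual, pcts in rows if predict(pcts) != actual]


for row, actual, pcts in ROWS:
    print(row, actual, predict(pcts))
print("Misclassified rows:", misclassified(ROWS))
```

Row 5 is the only row in this subset where the predicted group differs from the actual group.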
Canonical Variate Analysis Report
This report provides a canonical correlation analysis of the discriminant problem. Recall that canonical correlation analysis is used when you want to study the correlation between two sets of variables. In this case, the two sets of variables are defined in the following way. The independent variables comprise the first set. The group variable defines another set, which is generated by creating an indicator variable for each group except the last one.
Inv(W)B Eigenvalue
The eigenvalues of the matrix W⁻¹B. These values indicate how much of the total variation explained is accounted for by the various discriminant functions. Hence, the first discriminant function corresponds to the first eigenvalue, and so on. Note that the number of eigenvalues is the minimum of the number of variables and K - 1, where K is the number of groups.
Ind’l Prcnt
The percent that this eigenvalue is of the total.
Total Prcnt
The cumulative percent of this and all previous eigenvalues.
Canon Corr
The canonical correlation coefficient.
Canon Corr²
The square of the canonical correlation. This is similar to R-Squared in multiple regression.
F-Value
The value of the approximate F-ratio for testing the significance of the Wilks’ lambda corresponding to this row and those below it. Hence, in this example, the first F-value tests the significance of both the first and second canonical correlations, while the second F-value tests the significance of the second correlation only.
Num DF
The numerator degrees of freedom for this F-test.
Denom DF
The denominator degrees of freedom for this F-test.
Prob Level
The significance level of the F-test. This is the area under the F-distribution to the right of the F-value. Usually, a value less than 0.05 is considered significant.
Wilks’ Lambda
The value of Wilks’ lambda for this row. This Wilks’ lambda is used to test the significance of the discriminant function corresponding to this row and those below it. Recall that Wilks’ lambda is a multivariate generalization of 1 – R². The above F-value is an approximate test of this Wilks’ lambda.
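The columns defined above can all be derived from the eigenvalues. The sketch below uses the standard relationships: each eigenvalue's share of the total gives Ind’l Prcnt, the squared canonical correlation is eigenvalue / (1 + eigenvalue), and Wilks’ lambda for a row is the product of 1 / (1 + eigenvalue) over that row and those below it. The function name and the eigenvalues passed to it are illustrative assumptions; they are not taken from the report.

```python
def canonical_summary(eigenvalues):
    """Derive the report columns from the eigenvalues of Inv(W)B (largest first)."""
    total = sum(eigenvalues)
    rows, cumulative = [], 0.0
    for i, lam in enumerate(eigenvalues):
        individual = 100.0 * lam / total
        cumulative += individual
        # Wilks' lambda for this row covers this eigenvalue and all later ones.
        wilks = 1.0
        for later in eigenvalues[i:]:
            wilks /= 1.0 + later
        rows.append({
            "eigenvalue": lam,
            "indl_prcnt": individual,
            "total_prcnt": cumulative,
            "canon_corr2": lam / (1.0 + lam),
            "canon_corr": (lam / (1.0 + lam)) ** 0.5,
            "wilks_lambda": wilks,
        })
    return rows


# Illustrative eigenvalues only; they are not taken from the report.
for row in canonical_summary([32.19, 0.285]):
    print({name: round(value, 4) for name, value in row.items()})
```

The F-Value, degrees of freedom, and Prob Level columns require an F approximation to Wilks’ lambda and are not reproduced in this sketch.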
Source: “Cluster & Discriminant Analysis, Rural & International Marketing Research” by NPTEL is licensed under CC BY-NC-SA 4.0
Canonical Coefficients Report
Canonical Coefficients Section
               Canonical Variate
Variable       Variate1   Variate2
Constant       2.105106   6.661473
Sepal Length   0.082938   0.002410
Sepal Width    0.153447   0.216452
Petal Length   0.220121   0.093192
Petal Width    0.281046   0.283919
This report gives the coefficients used to create the canonical scores. The canonical scores are weighted averages of the observations, and these coefficients are the weights (with the constant term added).
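To make the weighting concrete, the following sketch computes the two canonical scores for one observation as the constant plus the sum of coefficient times value. The coefficients are copied as printed in the Canonical Coefficients Section; the flower measurements and the function name are hypothetical and do not come from the report.

```python
# Constants and coefficients copied as printed in the Canonical Coefficients Section.
CONSTANT = (2.105106, 6.661473)
COEFFICIENTS = {
    "Sepal Length": (0.082938, 0.002410),
    "Sepal Width": (0.153447, 0.216452),
    "Petal Length": (0.220121, 0.093192),
    "Petal Width": (0.281046, 0.283919),
}


def canonical_scores(observation):
    """Return (Variate1, Variate2): constant plus the weighted sum of the values."""
    scores = list(CONSTANT)
    for name, weights in COEFFICIENTS.items():
        for k, weight in enumerate(weights):
            scores[k] += weight * observation[name]
    return tuple(scores)


# Hypothetical measurements for one flower (not a row from the report).
flower = {"Sepal Length": 50, "Sepal Width": 33, "Petal Length": 14, "Petal Width": 2}
print(canonical_scores(flower))
```

Applying the same function to a dictionary of group means would give the kind of values listed in a canonical-variates-at-group-means table.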
Canonical Variates at Group Means Report
This report gives the results of applying the canonical coefficients to the means of each of the groups.
Std. Canonical Coefficients Report
Variable-Variate Correlations Report
This report gives the loadings (correlations) of the variables on the canonical variates. That is, each entry is the correlation between the canonical variate and the independent variable. This report can help you interpret a particular canonical variate.
Linear Discriminant Scores Report
This report gives the individual values of the linear discriminant scores. Note that this information may be stored on the database using the Data Storage options.
Regression Scores Report
This report gives the individual values of the predicted scores based on the regression coefficients. Even though these values are predicting indicator variables, it is possible for a value to be less than zero or greater than one. Note that this information may be stored on the database using the Data Storage options.
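The following simplified sketch shows why a regression score can fall outside the range zero to one. It fits an ordinary least squares line to a 0/1 indicator variable using a single predictor; the procedure itself uses all of the independent variables, so the one-predictor setup, the data, and the function name are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: the indicator is 1 for rows in the first group, 0 otherwise.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
indicator = [1, 1, 1, 0, 0, 0]

intercept, slope = fit_line(x, indicator)
scores = [intercept + slope * xi for xi in x]
print([round(s, 3) for s in scores])
```

With these data the fitted score for the first row is above one and the score for the last row is below zero, even though the indicator itself only takes the values zero and one.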
Canonical Scores Report
This report gives the scores of the canonical variates for each row. Note that this information may be stored on the database using the Data Storage options.
Scores Plot(s)
You may select plots of the linear discriminant scores, regression scores, or canonical scores to aid in your interpretation. These plots are usually used to give a visual impression of how well the discriminant functions are classifying the data. (Several charts are displayed on the output. Only one of these is displayed here.)
This chart plots the values of the first and second linear discriminant scores. By looking at this plot you can see what the classification rule would be. Also, it is obvious from this plot that the first two linear discriminant functions are necessary in discriminating among the varieties of iris since the groups can be separated along diagonal lines.
Example 2 – Automatic Variable Selection (Brief Report)
The tutorial we have just concluded was based on all four of the independent variables. A common task in discriminant analysis is variable selection. Often you have a large pool of possible independent variables from which you want to select a smaller set (up to about eight variables) which will do almost as well at discriminating as the complete set. NCSS provides an automatic procedure for doing this, which will be described next.
The automatic variable selection is run by changing the Variable Selection option to Stepwise. The program will conduct a stepwise variable selection. It will first find the best discriminator and then the second best. After it has found two, it checks whether the discrimination would be almost as good if one were removed. This stepping process of adding the best remaining variable and then checking if one of the active variables could be removed continues until no new variable can be found whose F-value has a probability level smaller than the Probability Enter value.
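The stepping process can be sketched in a few functions. This is an illustration under stated assumptions, not the NCSS implementation: it uses fixed F-value cutoffs (the hypothetical parameters f_enter and f_remove) in place of the Probability Enter value, because converting an F-value to a probability needs the F-distribution; it computes Wilks’ lambda as det(W) / det(T); and it uses the usual partial F statistic for adding or removing one variable. The data are hypothetical.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def scatter(rows, cols):
    """Sums of squares and cross-products of the chosen columns about their means."""
    means = [sum(r[c] for r in rows) / len(rows) for c in cols]
    return [[sum((r[ci] - mi) * (r[cj] - mj) for r in rows)
             for cj, mj in zip(cols, means)]
            for ci, mi in zip(cols, means)]


def wilks_lambda(rows, groups, cols):
    """det(W) / det(T) for the chosen columns; 1.0 when no column is active."""
    if not cols:
        return 1.0
    within = [[0.0] * len(cols) for _ in cols]
    for g in sorted(set(groups)):
        members = [r for r, label in zip(rows, groups) if label == g]
        for i, row in enumerate(scatter(members, cols)):
            for j, value in enumerate(row):
                within[i][j] += value
    return determinant(within) / determinant(scatter(rows, cols))


def partial_f(lam_without, lam_with, n, k, p_without):
    """Partial F for one variable, given p_without variables already active."""
    ratio = lam_with / lam_without
    return (n - k - p_without) / (k - 1) * (1.0 - ratio) / ratio


def stepwise(rows, groups, f_enter=4.0, f_remove=3.9, max_steps=50):
    """Return the indices of the selected columns, in order of entry."""
    n, k = len(rows), len(set(groups))
    active = []
    for _ in range(max_steps):
        lam = wilks_lambda(rows, groups, active)
        # Enter the remaining variable with the largest F-to-enter, if any qualifies.
        best, best_f = None, f_enter
        for c in range(len(rows[0])):
            if c not in active:
                f = partial_f(lam, wilks_lambda(rows, groups, active + [c]),
                              n, k, len(active))
                if f > best_f:
                    best, best_f = c, f
        if best is None:
            break
        active.append(best)
        # Drop the first earlier variable whose F-to-remove is below the cutoff.
        lam = wilks_lambda(rows, groups, active)
        for c in active[:-1]:
            rest = [v for v in active if v != c]
            f = partial_f(wilks_lambda(rows, groups, rest), lam, n, k, len(rest))
            if f < f_remove:
                active = rest
                break
    return active


# Hypothetical data: column 0 separates the groups, column 1 does not.
rows = [[1, 3], [2, 1], [3, 4], [4, 2],
        [11, 1], [12, 4], [13, 2], [14, 3],
        [21, 4], [22, 2], [23, 3], [24, 1]]
groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
print(stepwise(rows, groups))
```

Keeping f_remove slightly below f_enter is a common guard against a variable being entered and removed repeatedly; max_steps is an additional safety limit.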
An alternative procedure is to use the Multivariate Variable Selection procedure described elsewhere in this manual. If you have more than two groups, you must create a set of dummy (indicator) variables, one for each group. You ignore the last dummy variable, so if there are K groups, you analyze K - 1 dummy variables. The Multivariate Variable Selection program will always find a subset of your independent variables that is at least as good as (and usually better than) the subset found by the stepwise procedure described in this section. Once a subset of independent variables has been found, they can then be analyzed using the Discriminant Analysis program described here.
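Creating the dummy variables can be sketched as follows. The function name is hypothetical, and treating the alphabetically last group as the one to ignore is an assumption; any one group can serve as the omitted group.

```python
def indicator_variables(groups):
    """Return one 0/1 column per group, omitting the last group (sorted order)."""
    levels = sorted(set(groups))
    kept = levels[:-1]  # K groups give K - 1 indicator variables
    return {level: [1 if g == level else 0 for g in groups] for level in kept}


# Hypothetical group column with K = 3 groups.
iris = ["Setosa", "Virginica", "Versicolor", "Virginica", "Setosa"]
for name, column in indicator_variables(iris).items():
    print(name, column)
```

A row that belongs to the omitted group has zeros in every indicator column, so no information is lost by dropping the last dummy variable.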
Once the variable selection has been made, the program provides the reports that were described in the previous tutorial. Note that two report formats may be called for during the variable selection phase: brief and verbose. We will now provide an example of each type of report.
You may follow along here by making the appropriate entries or load the completed template Example 2 by clicking on Open Example Template from the File menu of the Discriminant Analysis window.
1 Open the Fisher dataset.
 From the File menu of the NCSS Data window, select Open Example Data.
 Click on the file Fisher.NCSS.
 Click Open.
2 Open the Discriminant Analysis window.
 Using the Analysis menu or the Procedure Navigator, find and select the Discriminant Analysis procedure.
3 Specify the variables.
 On the Discriminant Analysis window, select the Variables tab.
 Double-click in the Y: Group Variable box. This will bring up the variable selection window.
 Select Iris from the list of variables and then click Ok. “Iris” will appear in the Y: Group Variable box.
 Double-click in the X’s: Independent Variables text box. This will bring up the variable selection window.
 Select SepalLength through PetalWidth from the list of variables and then click Ok. “SepalLength-PetalWidth” will appear in the X’s: Independent Variables box.
 Enter Stepwise in the Variable Selection box.
4 Specify the reports.
 Select the Reports tab.
 Uncheck all reports and plots. We will only view the Variable Selection Report.
 Enter Labels in the Variable Names box.
 Enter Value Labels in the Value Labels box.
 Enter Brief in the Output box.
5 Run the procedure.
 From the Run menu, select Run Procedure. Alternatively, just click the green Run button.
Variable-Selection Summary Report
This report shows what action was taken at each step.
Iteration
This gives the number of this step.
Action This Step
This tells what action (if any) was taken during this step. “Entered” means that the variable was entered into the set of active variables. “Removed” means that the variable was removed from the set of active variables.
Pct Chg In Lambda
This is the percentage decrease in lambda that resulted from this step. Note that Wilks’ lambda is analogous to 1 – R-Squared in multiple regression. Hence, we want to decrease Wilks’ lambda to improve our model. For example, going from iteration 2 to iteration 3 results in lambda decreasing from 0.036884 to 0.024976. This is a 32.29% decrease in lambda.
F-Value
This is the F-ratio for testing the significance of this variable. If the variable was “Entered,” this tests the hypothesis that the variable should be added. If the variable was “Removed,” this tests whether the variable should be removed.
Prob Level
The significance level of the above F-Value.
Wilks’ Lambda
The multivariate counterpart of R-Squared. Wilks’ lambda reduces to 1 - (R-Squared) in the two-group case. It is interpreted in the opposite direction from R-Squared: it ranges from zero to one, where values near one imply low predictability and values close to zero imply high predictability. Note that this Wilks’ lambda value corresponds to the currently active variables.
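The percentage quoted for iteration 3 can be checked directly from the two lambda values given in the text. The sketch below also computes a partial F-value for the step, using the usual partial F statistic for stepwise discriminant analysis; that formula, the function names, and the values N = 150 rows and K = 3 groups (the usual size of the Fisher iris data) are assumptions rather than details taken from the report.

```python
def pct_change_in_lambda(before, after):
    """Percentage decrease in Wilks' lambda produced by one step."""
    return 100.0 * (before - after) / before


def partial_f(before, after, n_rows, n_groups, n_active_before):
    """Partial F for the variable entered at this step (assumed standard formula)."""
    ratio = after / before
    df1 = n_groups - 1
    df2 = n_rows - n_groups - n_active_before
    return (df2 / df1) * (1.0 - ratio) / ratio


# Lambda values for iterations 2 and 3, as quoted in the text.
before, after = 0.036884, 0.024976
print(round(pct_change_in_lambda(before, after), 2))

# Assumed: 150 rows, 3 groups, and 2 variables already active before iteration 3.
print(round(partial_f(before, after, n_rows=150, n_groups=3, n_active_before=2), 2))
```

The first printed value reproduces the 32.29% decrease described above.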
Example 3 – Automatic Variable Selection (Verbose Report)
We will now rerun this example with the “verbose” option. We assume that the Fisher dataset is available and you are in the Discriminant Analysis procedure.
You may follow along here by making the appropriate entries or load the completed template Example 3 by clicking on Open Example Template from the File menu of the Discriminant Analysis window.
1 Open the Fisher dataset.
 From the File menu of the NCSS Data window, select Open Example Data.
 Click on the file Fisher.NCSS.
 Click Open.
2 Open the Discriminant Analysis window.
 Using the Analysis menu or the Procedure Navigator, find and select the Discriminant Analysis procedure.
3 Specify the variables.
 On the Discriminant Analysis window, select the Variables tab.
 Double-click in the Y: Group Variable box. This will bring up the variable selection window.
 Select Iris from the list of variables and then click Ok. “Iris” will appear in the Y: Group Variable box.
 Double-click in the X’s: Independent Variables text box. This will bring up the variable selection window.
 Select SepalLength through PetalWidth from the list of variables and then click Ok. “SepalLength-PetalWidth” will appear in the X’s: Independent Variables box.
 Enter Stepwise in the Variable Selection box.
4 Specify the reports.
 Select the Reports tab.
 Uncheck all reports and plots. We will only view the Variable Selection Report.
 Enter Verbose in the Output box.
 Enter Labels in the Variable Names box.
 Enter Value Labels in the Value Labels box.
5 Run the procedure.
 From the Run menu, select Run Procedure. Alternatively, just click the green Run button.
Variable-Selection Detail Report
This report shows the details of each step.
Step
This gives the number of this step (iteration).
Status
This tells whether the variable is “in” or “out” of the set of active variables.
Pct Chg In Lambda
This is the percentage decrease in lambda that would result if the status of this variable were reversed.
F-Value
This is the F-ratio for testing the significance of changing the status of this variable.
Prob Level
The significance level of the above F-Value.
R-Squared Other X’s
This is the R-Squared that would result if this variable were regressed on the other independent variables that are active (status = “In”). This provides a check for multicollinearity in the active independent variables.
Overall Wilks’ Lambda
This is the value of Wilks’ lambda for all active independent variables. A value near zero indicates an accurate model; a value near one indicates a poor model.
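As a simplified illustration of the R-Squared Other X’s check, the sketch below handles the case where only one other variable is active, in which the R-Squared from the regression equals the squared correlation between the two variables. With several other active variables a multiple regression would be needed. The function name and the measurements are hypothetical.

```python
def r_squared_one_predictor(x, other):
    """R-Squared from regressing x on a single other variable (squared correlation)."""
    n = len(x)
    mean_x, mean_o = sum(x) / n, sum(other) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    soo = sum((v - mean_o) ** 2 for v in other)
    sxo = sum((a - mean_x) * (b - mean_o) for a, b in zip(x, other))
    return (sxo * sxo) / (sxx * soo)


# Hypothetical measurements for two active variables.
petal_length = [14, 47, 60, 13, 45, 51]
petal_width = [2, 14, 25, 2, 15, 19]
print(round(r_squared_one_predictor(petal_length, petal_width), 3))
```

A value close to one suggests that the variable adds little information beyond the other active variable, which is the multicollinearity concern this column is meant to flag.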