to fault-proneness. The path (b) depicts a positive causal relationship between size and fault-proneness.
The path (c) depicts a positive association between product metrics and size.
If this path diagram is concordant with reality, then size distorts the relationship between product metrics
and fault-proneness. Confounding can result in considerable bias in the estimate of the magnitude of the
association. Size is a positive confounder, which means that ignoring size will always make the
association between, say, coupling and fault-proneness appear more positive than it really is.
The potential confounding effect of size can be demonstrated through an example (adapted from [12]).
Consider Table 1, which gave an odds ratio of 22.9. As mentioned earlier, this is representative
of the current univariate analyses used in the object-oriented product metrics validation literature (which
neither include size as a covariate nor employ a stratification on size).
Now suppose that we analyze the data separately for small and large classes, and obtain the data in Table 2 for the large classes and the data in Table 3 for the small classes (see footnote 15).
| Fault Proneness | Coupling: HC | Coupling: LC |
|---|---|---|
| Faulty | 90 | 10 |
| Not Faulty | 9 | 1 |
Table 2: A contingency table showing the results for only large classes of a hypothetical validation study.
| Fault Proneness | Coupling: HC | Coupling: LC |
|---|---|---|
| Faulty | 1 | 9 |
| Not Faulty | 10 | 90 |
Table 3: A contingency table showing the results for only small classes of a hypothetical validation study.
In both of the above tables the odds ratio is one. By stratifying on size (i.e., controlling for the effect of
size), the association between coupling and fault-proneness has been reduced dramatically. This is
because size was the reason why there was an association between coupling and fault-proneness in the
first place. Once the influence of size is removed, the example shows that the impact of the coupling
metric disappears.
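The arithmetic behind this example can be checked with a minimal sketch. The cell counts come from Table 2 and Table 3; the helper names `odds_ratio` and `pool` are hypothetical, and the sketch assumes that the unstratified table is simply the cell-by-cell sum of the two strata, which is consistent with the odds ratio of 22.9 quoted for Table 1.

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 table given as ((a, b), (c, d)).

    Rows are Faulty / Not Faulty; columns are HC / LC (high / low coupling).
    """
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def pool(*tables):
    """Sum 2x2 tables cell by cell, i.e. ignore the stratification variable."""
    return tuple(
        tuple(sum(t[r][k] for t in tables) for k in range(2))
        for r in range(2)
    )


# Cell counts taken from Table 2 (large classes) and Table 3 (small classes).
large = ((90, 10), (9, 1))
small = ((1, 9), (10, 90))

# Assumption: the unstratified table is the cell-wise sum of the two strata.
pooled = pool(large, small)

# Size versus fault-proneness: rows Faulty / Not Faulty, columns Large / Small.
size_vs_fault = (
    (sum(large[0]), sum(small[0])),
    (sum(large[1]), sum(small[1])),
)

if __name__ == "__main__":
    print("large classes only:", odds_ratio(large))
    print("small classes only:", odds_ratio(small))
    print("size ignored:", round(odds_ratio(pooled), 1))
    print("size vs. fault-proneness:", odds_ratio(size_vs_fault))
```

Within each stratum the odds ratio is one, whereas pooling the strata yields a large odds ratio: the apparent effect of coupling arises only because large classes are both more often highly coupled and more often faulty.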
Therefore, an important improvement in the conduct of validation studies of object-oriented metrics is to
control for the effect of size; otherwise one may get the illusion that the product metric is strongly
associated with fault-proneness, when in reality the association is much weaker or non-existent.
2.2.3 Evidence of a Confounding Effect
Now we must consider whether the path diagram in Figure 2 can be supported in reality.
There is evidence that object-oriented product metrics are associated with size. For example, in [22] the
Spearman rho correlation coefficients reach 0.43 for associations between some coupling and
cohesion metrics and size, and 0.397 for inheritance metrics; both are statistically significant (at an
alpha level of, say, 0.1). Similar patterns emerge in the study reported in [19], where relatively large
correlations are shown. In another study [27] the authors display the correlation matrix showing the
Spearman correlation between a set of object-oriented metrics that can be collected from Shlaer-Mellor
designs and C++ LOC. The correlations range from 0.563 to 0.968, all statistically significant at an alpha
level of 0.05. This also indicates very strong correlations with size.
Footnote 15: Note that in this example the odds ratio of the size to fault-proneness association is 100, and that of the size to coupling association is
81.3. Therefore, it follows the model in Figure 2.