associated with fault-proneness. The NMO and NMA metrics were found to be associated with fault-
proneness, but the evidence for the SIX metric is more equivocal. The LCOM cohesion metric also has
equivocal evidence supporting its validity.
It should be noted that the differences in the results obtained across studies may be a consequence of
the measurement of different dependent variables. For instance, some treat the dependent variable as
the (continuous) number of defects found. Other studies use a binary value of incidence of a fault during
testing or in the field, or both. It is plausible that the effects of product metrics may be different for each of
these.
An optimistic observer would conclude that the evidence as to the predictive validity of most of these
metrics is good enough to recommend their practical usage.
2.2 The Confounding Effect of Size
In this section we take as a starting point the stance of an optimistic observer and assume that there is
sufficient empirical evidence demonstrating the relationship between the object-oriented metrics that we
study and fault-proneness. We already showed that previous empirical studies drew their conclusions
from univariate analyses. Below we make the argument that univariate analyses ignore the potential
confounding effects of class size. We show that if there is indeed a size confounding effect, then
previous empirical studies could have harbored a large positive bias.
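The bias argument can be illustrated numerically. The sketch below is not taken from any of the cited studies: the counts are hypothetical, and the odds ratio is used only as a standard measure of association for a 2x2 table. The counts are chosen so that, within small classes and within large classes, a dichotomized metric has no association with faults, yet the pooled (size-ignoring) analysis shows a strong one.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = high-metric classes with faults, b = high-metric classes without faults,
    c = low-metric classes with faults,  d = low-metric classes without faults.
    """
    return (a * d) / (b * c)

# Hypothetical counts per size stratum, in the order (a, b, c, d).
# Large classes are both more often "high" on the metric and more often faulty.
strata = {
    "small": (2, 18, 8, 72),
    "large": (40, 40, 10, 10),
}

# Association within each size stratum (size held roughly constant).
stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Association when size is ignored: add the strata cell by cell.
crude_cells = tuple(sum(col) for col in zip(*strata.values()))
crude_or = odds_ratio(*crude_cells)

print(stratum_or)          # odds ratio of 1.0 in both strata
print(round(crude_or, 2))  # about 3.3 when the strata are pooled
```

Under these assumed counts the metric appears strongly related to faults only because size drives both quantities, which is the kind of positive bias a univariate analysis could harbor.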
For ease of presentation we take as a running example a coupling metric as the main metric that we are
trying to validate. For our purposes, a validation study is designed to determine whether there is an
association between coupling and fault-proneness. Furthermore, we assume that this coupling metric is
appropriately dichotomized: Low Coupling (LC) and High Coupling (HC). This dichotomization
assumption simplifies the presentation, but the conclusions can be directly generalized to a continuous
metric.
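As a minimal sketch of the dichotomization step, the snippet below splits a list of coupling values into LC and HC. The text does not say where the cut point lies, so the use of the sample median as the default threshold, the function name `dichotomize`, and the sample values are all assumptions made for illustration.

```python
from statistics import median

def dichotomize(values, threshold=None):
    """Label each coupling value 'HC' if it exceeds the threshold, else 'LC'.

    The median default is an assumption; the source does not name a cut point.
    """
    if threshold is None:
        threshold = median(values)
    return ["HC" if v > threshold else "LC" for v in values]

# Hypothetical coupling values for six classes.
coupling = [1, 3, 4, 7, 9, 12]
labels = dichotomize(coupling)
print(labels)
```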
2.2.1 The Case Control Analogy
An object-oriented metrics validation study can easily be viewed as an unmatched case-control study.
Case-control studies are frequently used in epidemiology to, for example, study the effect of exposure to
carcinogens on the incidence of cancers [95][12].^12 The reason for using case-control studies as opposed
to randomized experiments in certain instances is that it would not be ethically and legally defensible to
do otherwise. For example, it would not be possible to have deliberately composed ‘exposed’ and
‘unexposed’ groups in a randomized experiment when the exposure is a suspected carcinogen or toxic
substance. Randomized experiments are more appropriately used to evaluate treatments or preventative
measures [52].
In applying the case-control design to the validation of an object-oriented product metric, one
would first identify classes that have faults in them (the cases). Then, for the purpose of
comparison, another group of classes without faults in them is identified (the controls). We determine
the proportion of cases that have, say, High Coupling and the proportion with Low Coupling. Similarly, we
determine the proportion of controls with High Coupling, and the proportion with Low Coupling. If there is
an association of coupling with fault-proneness then the prevalence of High Coupling classes would be
higher in the cases than in the controls. Effectively then, a case-control study follows a paradigm that
proceeds from effect to cause, attempting to find antecedents that lead to faults [99]. In a case-control
study, the control group provides an estimate of the frequency of High Coupling that would be expected
among the classes that do not have faults in them.
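The comparison described above can be sketched as follows. The function name `case_control_summary`, the record layout, and the sample data are hypothetical. The odds ratio is included as a standard summary for case-control data and is not a quantity the passage itself names.

```python
def case_control_summary(classes):
    """Summarize a list of (has_fault, coupling) records.

    has_fault is a bool; coupling is 'HC' or 'LC'.
    Cases are classes with faults; controls are classes without.
    """
    cases = [coupling for has_fault, coupling in classes if has_fault]
    controls = [coupling for has_fault, coupling in classes if not has_fault]

    hc_cases = cases.count("HC")
    lc_cases = len(cases) - hc_cases
    hc_controls = controls.count("HC")
    lc_controls = len(controls) - hc_controls

    return {
        # Prevalence of High Coupling among cases and among controls.
        "p_hc_cases": hc_cases / len(cases),
        "p_hc_controls": hc_controls / len(controls),
        # Odds of High Coupling in cases relative to controls.
        "odds_ratio": (hc_cases * lc_controls) / (lc_cases * hc_controls),
    }

# Hypothetical sample: 10 faulty classes (6 HC, 4 LC), 20 fault-free (5 HC, 15 LC).
sample = (
    [(True, "HC")] * 6 + [(True, "LC")] * 4
    + [(False, "HC")] * 5 + [(False, "LC")] * 15
)
summary = case_control_summary(sample)
print(summary)
```

In this made-up sample, High Coupling is more prevalent among the cases (0.6) than among the controls (0.25), which is the pattern one would expect if coupling were associated with fault-proneness.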
In an epidemiological context, it is common to have ‘hospital-based cases’ [52][95]. For example, a
subset or all patients that have been admitted to a hospital with a particular disease can be considered as
cases.^13 Controls can also be selected from the same hospital or clinic. The selection of controls is not
necessarily a simple affair. For example, one can match the cases with controls on some confounding

^12 Other types of studies that are used are cohort-studies [52], but we will not consider these here.
^13 This raises the issue of generalizability of the results. However, as noted by Breslow and Day [12], generalization from the sample
in a case-control study depends on non-statistical arguments. The concern with the design of the study is to maximize internal
validity. In general, replication of results establishes generalizability [79].