Health & Environmental Research Online (HERO)


1238838 
Journal Article 
An evaluation of Salmonella (Ames) test data in the published literature: Application of statistical procedures and analysis of mutagenic potency 
Mccann, J; Horn, L; Kaldor, J 
1984 
Mutation Research
ISSN: 0027-5107
EISSN: 1873-135X 
134 
1-47 
English 
We searched the published literature for Salmonella test data on some 450 chemicals. Only 137 of more than 400 articles containing original data satisfied minimum criteria for a quantitative analysis [1751 experiments, comprising data on 152 chemicals (Table 1)]. Many of these papers did not report basic information about the test protocol (Table 2). We used previously described statistical procedures (Bernstein et al., 1982) to estimate the initial slopes of the dose-response curves and corresponding standard errors. We also applied tests for significance and linear goodness-of-fit. We then used the results of these analyses to examine several issues:

(1) Linearity of the low dose region of the dose-response curve. We found that the overwhelming majority of curves were linear, though the ability to detect non-linearity of dose-response curves in the standard plate test is limited. 7% of all experiments to which the goodness-of-fit test was applied were curves of increasing slope, and with a few possible exceptions, these were not obviously associated with any particular mutagens, even those generally considered to produce non-linear effects such as MNNG and EMS (Table 3).

(2) Performance of the statistical test for significance. Results of the statistical test for significance of the dose-response were compared with authors' opinions as to positivity. In almost all cases (94%) results of the statistical test and authors' opinions were the same. In the examples of conflicting opinions, the reasons were: (a) the statistical test places more weight than do most authors on the presence of a linear dose-response; (b) most authors tend to require at least a 2-fold increase over the spontaneous background for ‘significance’, and (c) when the number of spontaneous revertants is small (e.g., TA1537), authors tend to require a larger increase in induced revertants than when the spontaneous background is large, whereas the statistical procedure makes no such distinction. These factors result in the statistical test tending to identify more experiments as positive than do authors, provided there is a linear dose-response, and authors tending to judge more experiments as positive when the dose-response is not linear.

(3) Reproducibility. Among the 1751 experiments there were 122 data-sets (a total of 333 experiments) in which the same chemical was tested by two or more different laboratories under the same protocol. 21 of the 122 data-sets had some disagreement between experiments as to whether results were positive or negative (Table 4). In 7 of these cases, both the statistical test and the authors disagreed on particular experiments; in 2 cases, only authors disagreed; and in 12 cases, only the statistical test was in disagreement. Based on examination of each of these cases, we conclude there is a high degree of reproducibility. In the majority of cases, there are reasons that suggest differences in the way the experiments were done that might explain the discrepancies (different criteria for positivity used by different authors; mutagenic impurities in chemicals from different sources; or the use of doses that were too low to detect an effect). In the minority of cases, the difference appears due either to extremely weak effects occurring at the limit of sensitivity of the assay so that they are detected as positive in some but not all tests, or to the random fluctuation of negative data.

To analyze agreement among potencies (initial slopes of the dose-response curves) estimated from replicated experiments we selected, among the 122 data-sets, those that contained 3 or more positive experiments, and we further restricted the analysis to data-sets containing no negative experiments (Figs. 6–8, Table 5). The results of this analysis indicated that estimates of mutagenic potency from different experiments are roughly clustered. However, marked between-experiment variability was also apparent. Thus, the within-laboratory precision of the potency estimates did not account for the variation observed across laboratories.

(4) Comparison of results obtained in the published literature and in a standardized testing program. 44 pairs of data-sets were compared from the literature and an NCI/NTP-sponsored testing program. Each pair consisted of 2 or more experiments from each source of data, tested under the same protocol. For each data-set, summary measures of positivity, potency, and variability were obtained (Table 6), and then compared for the 2 sources of data (Table 7). In most cases average slopes were roughly similar (more than half were within a factor of 3, and all but 2 were within a factor of 10). Variability was compared using the coefficient of variation (CV) (the standard deviation divided by the mean). There did not appear to be a great difference in reproducibility between the two sources of data (the average CV was 78% for the literature and 104% for the standardized testing program). We conclude this is most likely due to two factors: (a) Authors tend to publish results that have been verified by replication, whereas, in the standardized testing program, chemicals were, in general, tested only once, using predetermined doses; (b) In contrast to the standardized testing program, most testing done for publication is not done blind, and authors are aware of previously published results, which could conceivably bias their choice of which experimental results to publish.

(5) Comparison of potencies across Salmonella tester strains. We compared the mutagenic potency of a number of chemicals in two pairs of tester strains: TA1535, which detects base-pair substitution mutations, and its isogenic pair, TA100, which contains the R-factor plasmid pKM101; and TA1538 (or TA1537), which detect frameshift mutations, and TA1535. The results of this analysis (Figs. 9, 10) indicated that lack of sufficient replication to adequately estimate between-experiment variance limits the analysis, but if certain assumptions are made, it is possible, using statistical criteria, to roughly classify chemicals according to their potencies in the different strains.

We discuss these results with regard to adequacy of the statistical procedures, variability of the assay, and the importance of quantitation of the results of the Salmonella assay in determining its potential utility in cancer risk assessment. 
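
The two summary comparisons described in item (4), the coefficient of variation and the "within a factor of 3" check on average slopes, can be illustrated with a minimal sketch. The slope values below are invented for illustration and are not taken from the paper. The function names are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(slopes):
    """Standard deviation divided by the mean (sample SD assumed)."""
    m = mean(slopes)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(slopes) / m


def fold_difference(mean_a, mean_b):
    """Ratio of the larger to the smaller mean slope (always >= 1)."""
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("fold difference requires positive slopes")
    return max(mean_a, mean_b) / min(mean_a, mean_b)


# Hypothetical initial slopes for one chemical from the two sources;
# these numbers are made up and do not come from the paper.
literature_slopes = [10.0, 20.0, 30.0]
program_slopes = [40.0, 50.0, 60.0]

cv_literature = coefficient_of_variation(literature_slopes)
cv_program = coefficient_of_variation(program_slopes)
fold = fold_difference(mean(literature_slopes), mean(program_slopes))
within_factor_of_3 = fold <= 3.0

print(f"CV (literature): {cv_literature:.0%}")
print(f"CV (program):    {cv_program:.0%}")
print(f"Fold difference in mean slope: {fold:.1f}")
```

A CV is only meaningful when the mean slope is clearly non-zero, which is presumably one reason an analysis of this kind would be restricted to data-sets with replicated positive experiments.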
60-35-5; 6098-44-8; 53-96-3; 1162-65-8; 915-67-3; 61-82-5; 60-09-3; 97-56-3; 92-67-1; 153-78-6; 62-53-3; 90-04-0; 104-94-9; 11097-69-1; 1332-21-4; 492-80-8; 86-50-0; 103-33-3; 56-55-3; 71-43-2; 92-87-5; 50-32-8; 192-97-2; 13510-49-1; 57-57-8; 2650-18-2; 860-22-0; 3817-11-6; 128-37-0; 10108-64-2; 14239-68-0; 58-08-2; 133-06-2; 56-23-5; 95-69-2; 97-00-7; 106-47-8; 94-20-2; 7789-00-6; 14901-08-7; 50-18-0; 72-55-9; 50-29-3; 2303-16-4; 615-05-4; 95-80-7; 53-70-3; 96-12-8; 106-93-4; 91-94-1; 107-06-2; 10588-01-9; 60-57-1; 64-67-5; 55-18-5; 56-53-1; 60-11-7; 57-97-6; 540-73-8; 62-75-9; 140-79-4; 121-14-2; 1937-37-7; 115-29-7; 72-20-8; 106-89-8; 13073-35-3; 62-50-0; 151-56-4; 140-56-7; 3688-53-7; 126-07-8; 67-72-1; 3105-97-3; 23255-93-8; 10034-93-2; 53-95-2; 4637-56-3 
• Chromium VI
     Considered
          Potentially Relevant Supplemental Material
               Mechanistic
• 1,2-Dibromo-3-chloropropane
     Litsearch 2018
          Toxline
• PCBs
     Litsearches
               ToxLine
OPPT REs
• OPPT_Asbestos, Part I: Chrysotile_F. Human Health
     Total – title/abstract screening
          Off topic