ANOFA vs. hierarchical log-linear modeling

Some have noticed similarities between ANOFA and hierarchical log-linear models. Indeed, the two techniques share the same starting point: a G statistic, which computes the log-likelihood ratio of two competing models, one being a restricted version of the other (Laurencelle & Cousineau, 2023).
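As a minimal sketch of such a G statistic, the Python snippet below compares observed frequencies with the frequencies expected under a restricted model, using the usual form G = 2 Σ n ln(n / e). The function name `g_statistic`, the three counts, and the choice of an equiprobable restricted model are assumptions made for illustration only.

```python
import math

def g_statistic(observed, expected):
    """G = 2 * sum(n * ln(n / e)); empty cells contribute 0 by convention."""
    return 2.0 * sum(
        n * math.log(n / e) for n, e in zip(observed, expected) if n > 0
    )

# Hypothetical one-way classification with three categories.
observed = [10, 20, 30]
n_total = sum(observed)

# Restricted model (assumed here): all categories are equally likely.
expected = [n_total / len(observed)] * len(observed)

g = g_statistic(observed, expected)
print(round(g, 3))
```

When the observed frequencies coincide with the expected ones, every logarithm is zero and so is G; the further the data depart from the restricted model, the larger G becomes.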

Sharpe (2015) provided an excellent overview of frequency analyses. He ends by citing Delucchi (1983): “It is not difficult to argue that log-linear models will eventually supersede the use of Pearson’s chi-square in the future because of their similarity to analysis of variance (ANOVA) procedures and their extension to higher order tables.”

However, a major schism occurred in 1940, and from that moment on, hierarchical log-linear models took an unexpected direction, away from ANOFA.

What happened in 1940?

Deming & Stephan (1940) published an article related to the U.S. census. In it, working with a classification table having more than two factors, they noted that the expected cell frequencies computed as products of maximum likelihood estimators (MLE) did not sum to the number of observations. As it turns out, this happens only when there are three or more classification dimensions.
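The snippet below gives one possible numerical reading of this observation. The text above does not state the formula, so the product form used here (two-way marginal proportions multiplied together and divided by the one-way marginal proportions) is an assumption of ours, as are the helper `margin` and the 2 × 2 × 2 table. With two factors, the familiar row-by-column product sums exactly to N; with three factors, the assumed product form does not.

```python
from itertools import product

# Hypothetical 2 x 2 x 2 frequency table, keyed by (i, j, k).
counts = [10, 20, 30, 40, 50, 60, 70, 80]
observed = dict(zip(product(range(2), repeat=3), counts))
n_total = sum(observed.values())

def margin(table, axes):
    """Sum the table over every axis not listed in `axes`."""
    out = {}
    for cell, n in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0) + n
    return out

one_way = [margin(observed, (a,)) for a in range(3)]
ab, ac, bc = (margin(observed, axes) for axes in [(0, 1), (0, 2), (1, 2)])

# Two factors: expected = n_i. * n_.j / N, which sums to N.
two_way_total = sum(
    one_way[0][(i,)] * one_way[1][(j,)] / n_total for (i, j) in ab
)

# Three factors, assumed product form:
# N * p_ij. * p_i.k * p_.jk / (p_i.. * p_.j. * p_..k)
expected = {
    (i, j, k): n_total * ab[(i, j)] * ac[(i, k)] * bc[(j, k)]
    / (one_way[0][(i,)] * one_way[1][(j,)] * one_way[2][(k,)])
    for (i, j, k) in observed
}
three_way_total = sum(expected.values())

print(n_total, two_way_total, three_way_total)  # the last value is close to, but not exactly, N
```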

As a solution, they proposed to generate the expected marginals using an iterative algorithm (SPSS calls it the iterative proportional-fitting algorithm). S. E. Fienberg (1970) described the algorithm in more detail, showing that it always converges, and does so in just a few iterations, which makes it very practical. Stephen E. Fienberg (1970) also claimed that the marginals so estimated were suitable for log-linear models. Previous work had shown that estimates obtained in that way maximize the likelihood of a model with fixed marginal totals (a product-multinomial model, which can be pictured as a multinomial with sub-multinomial layers) (Stephen E. Fienberg, 2007, p. 168), which is not the adequate model when sampling is totally free across a multidimensional classification.
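The iterative scheme can be sketched as follows. This is a minimal illustration in Python, not the implementation found in SPSS or in any other package; the function names `margin` and `ipf`, the dictionary layout, the stopping rule, and the 2 × 2 × 2 table are all assumptions of ours. Starting from a table of ones, each pass rescales the fitted cells so that they reproduce one observed two-way marginal table at a time, and the passes are repeated until the cells stop changing.

```python
from itertools import product

def margin(table, axes):
    """Sum the table over every axis not listed in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out

def ipf(observed, margins, max_iter=1000, tol=1e-10):
    """Iterative proportional fitting of the listed marginal tables."""
    fitted = {cell: 1.0 for cell in observed}          # neutral starting table
    targets = [margin(observed, axes) for axes in margins]
    for _ in range(max_iter):
        largest_change = 0.0
        for axes, target in zip(margins, targets):
            current = margin(fitted, axes)
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                if current[key] > 0:                   # leave all-zero margins untouched
                    new_value = fitted[cell] * target[key] / current[key]
                    largest_change = max(largest_change, abs(new_value - fitted[cell]))
                    fitted[cell] = new_value
        if largest_change < tol:                       # cells stopped moving
            break
    return fitted

# Hypothetical 2 x 2 x 2 frequency table, keyed by (i, j, k).
counts = [10, 20, 30, 40, 50, 60, 70, 80]
observed = dict(zip(product(range(2), repeat=3), counts))

two_way = [(0, 1), (0, 2), (1, 2)]                     # fit all two-way margins
fitted = ipf(observed, two_way)
print(sum(fitted.values()))                            # matches N up to rounding
```

In this sketch the fitted cells sum to the number of observations and reproduce each observed two-way marginal table, which is the property the algorithm was designed to restore.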

This iterative proportional-fitting algorithm became the norm and is implemented in most software that fits log-linear models. Fienberg was an influential advocate of this algorithm (see his 1980 book, reissued in 2007; Stephen E. Fienberg (2007), chapter 3).

What are the pros and cons of using the iterative proportional-fitting algorithm?

Pros:

  • When the predicted cell frequencies are added, they sum to the total number of observations;

  • The G statistic is never negative.

Cons:

  • The expected marginal frequencies computed with this algorithm are not MLEs;

  • Consequently, the Wilks (1938) theorem, which says that asymptotically, the G statistic (a likelihood ratio of MLEs) follows a chi-square distribution, is no longer applicable to hierarchical log-linear models;

  • Also, the Williams (1976) correction to the chi-square distribution for small samples is likewise no longer valid in hierarchical log-linear models;

  • Neyman & Pearson (1933) showed that tests based on the likelihood ratio of MLEs are the most powerful tests of a statistical hypothesis, an optimality that is lost when the estimates are not MLEs;

  • The G statistics are no longer additive: they no longer sum to \(G_{\rm{total}}\);

  • Consequently, it is no longer possible to decompose the total G statistic into main effects and interactions, or into simple effects, or into orthogonal contrasts.
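To illustrate what additivity means when closed-form estimates are used, the sketch below decomposes the total G of a two-way table into two main effects and an interaction. It assumes, as we understand the ANOFA framework, that the total G and the main effects are tested against equiprobable frequencies and that the interaction uses the row-by-column product as expected frequencies. The table and the names `g_uniform`, `g_a`, `g_b` and `g_ab` are hypothetical.

```python
import math

def g_uniform(counts):
    """G of the counts against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return 2.0 * sum(n * math.log(n / expected) for n in counts if n > 0)

# Hypothetical 2 x 3 frequency table (factor A in rows, factor B in columns).
table = [[10, 20, 30],
         [40, 25, 15]]
n_total = sum(map(sum, table))
rows = [sum(r) for r in table]
cols = [sum(c) for c in zip(*table)]

g_total = g_uniform([n for r in table for n in r])   # all cells vs. equiprobable
g_a = g_uniform(rows)                                # main effect of A
g_b = g_uniform(cols)                                # main effect of B
g_ab = 2.0 * sum(                                    # interaction: cells vs. n_i. * n_.j / N
    n * math.log(n * n_total / (rows[i] * cols[j]))
    for i, r in enumerate(table)
    for j, n in enumerate(r)
    if n > 0
)

print(g_total, g_a + g_b + g_ab)                     # the two values agree up to rounding
```

The equality holds algebraically for any table, because the marginal terms cancel when the three components are added. It is this exact bookkeeping that is lost once the expected frequencies come from an iterative fit.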

In our opinion, the list of disadvantages of using the iterative algorithm far exceeds the advantages it offers. Wilks’ and Williams’ theorems are the foundations that make ANOFA a sound technique, and Neyman and Pearson’s theorem implies that ANOFA is statistically the most powerful test.

References

Delucchi, K. L. (1983). The use and misuse of chi-square: Lewis and Burke revisited. Psychological Bulletin, 94, 166–176. https://doi.org/10.1037/0033-2909.94.1.166
Deming, W. E., & Stephan, F. F. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics, 11, 427–444. https://doi.org/10.1214/aoms/1177731829
Fienberg, S. E. (1970). An iterative procedure for estimation in contingency tables. Annals of Mathematical Statistics, 41, 907–917. https://doi.org/10.1214/aoms/1177696968
Fienberg, Stephen E. (1970). The analysis of multidimensional contingency tables. Ecology, 51, 419–433. https://doi.org/10.2307/1935377
Fienberg, Stephen E. (2007). The analysis of cross-classified categorical data (p. 198). New York: Springer. https://doi.org/10.1007/978-0-387-72825-4
Laurencelle, L., & Cousineau, D. (2023). Analysis of frequency tables: The ANOFA framework. The Quantitative Methods for Psychology, 19, 173–193. https://doi.org/10.20982/tqmp.19.2.p173
Neyman, J., & Pearson, E. S. (1933). IX. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, 231(694-706), 289–337. https://doi.org/10.1098/rsta.1933.0009
Sharpe, D. (2015). Chi-square test is statistically significant: Now what? Practical Assessment, Research, and Evaluation, 20(1), 8.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9(1), 60–62. https://doi.org/10.1214/aoms/1177732360