A strict maximum likelihood explanation of MaxEnt, and some implications for distribution modelling

Rune Halvorsen

Open Access

Published online: Apr 18, 2013

Sommerfeltia

Volume 36 (2013): Issue 1 (April 2013)

Editor: Vegar Bakkestuen

Page range: 1 - 132

DOI: https://doi.org/10.2478/v10208-011-0016-2

Keywords
Distribution modelling, F-ratio test, Gradient analysis, MaxEnt, Maximum likelihood, Model calibration, Model evaluation, Model selection, Regularisation

© 2013 Rune Halvorsen, published by Versita

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Distribution modelling, research with the purpose of modelling the distribution of observable objects of a specific type, has become established as an independent branch of ecological science, with a strong proliferation of approaches and methods in recent years. Since it was first made available to distribution modellers in 2004, the maximum entropy modelling method (MaxEnt) has established itself as a state-of-the-art method for distribution modelling. The default options and settings in the user-friendly Maxent software have become established as standard practice for distribution modelling by MaxEnt.

A mini-review of 87 recent publications in which MaxEnt was used with empirical data to model distributions showed that the ‘standard MaxEnt practice’ is followed by a large majority of users and questioned by few. However, the review also indicates that MaxEnt models obtained by the standard practice are sometimes overfitted to the data used to parameterise the model; there are examples of cases in which simpler MaxEnt models have better predictive performance. The results of the review strongly motivate a better understanding of the ecological implications of the maximum entropy principle, as a basis for choosing MaxEnt options and settings.

This paper provides a thorough explanation of MaxEnt for ecologists, ending with a set of suggestions for improvements to the current practice of distribution modelling by MaxEnt. The explanation of MaxEnt given in the paper differs from previous explanations in being based on the maximum likelihood principle and on a gradient analytic perspective on distribution modelling. Four new findings are particularly emphasised: (1) that a strict maximum likelihood explanation of MaxEnt is possible, which places MaxEnt among regression methods in the widest sense; (2) that the true degrees of freedom for the residuals of a MaxEnt null model is N - n, the difference between the number of background observations and the number of presence observations used in the modelling; (3) that likelihood-ratio and F-ratio tests can be used to compare nested MaxEnt models; and (4) that subset selection methods are likely to be preferable to shrinkage methods for model selection in MaxEnt. Methods for internal model performance assessment, model comparison, and interpretation of MaxEnt model predictions (MaxEnt output) are described and discussed. Two simulated data sets are used to explore and illustrate important issues relating to MaxEnt methodology.
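Finding (3) can be illustrated with a generic likelihood-ratio test for two nested models. The sketch below is a textbook version, assuming that twice the gain in log-likelihood is compared with a chi-square distribution whose degrees of freedom equal the number of added parameters; the abstract does not confirm that this is the exact formulation used in the paper. The function names `chi2_sf` and `likelihood_ratio_test` and the example log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the finite closed-form sums that exist for integer degrees of
    freedom, so only the standard library is needed.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term = 1.0
        total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail term plus a finite correction sum.
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    term = chi  # first term of the sum: chi^1 / 1
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)  # builds chi^(2r-1) / (1*3*...*(2r-1))
        total += term
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(loglik_small: float, loglik_large: float,
                          extra_params: int) -> tuple[float, float]:
    """Compare a smaller model with a larger model that nests it.

    Returns the test statistic 2 * (loglik_large - loglik_small) and its
    p-value under the assumed chi-square reference distribution.
    """
    stat = 2.0 * (loglik_large - loglik_small)
    return stat, chi2_sf(stat, extra_params)


# Hypothetical example: adding two derived variables raises the
# log-likelihood from -105.0 to -100.0.
stat, p_value = likelihood_ratio_test(-105.0, -100.0, extra_params=2)
print(f"statistic = {stat:.3f}, p = {p_value:.5f}")
```

A small p-value would suggest that the added variables improve the fit by more than chance alone would explain. The F-ratio test mentioned alongside it is not sketched here, because its form depends on the residual degrees of freedom derived in the paper.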

Arguments for development of a generally applicable ‘consensus MaxEnt practice’ for spatial prediction modelling are given, and elements of such a practice are discussed. Five main additions or amendments to the ‘standard MaxEnt practice’ are suggested: (1) flexible, interactive tools to assist in deriving variables from raw explanatory variables; (2) interactive tools that allow the user to freely combine model selection methods, methods and approaches for internal model performance assessment, and model improvement criteria into a data-driven modelling procedure; (3) integration of independent presence/absence data into the modelling process, for external model performance assessment, for model calibration, and for model evaluation; (4) new output formats, notably a probability-ratio output format which directly expresses the ‘relative suitability of one place vs. another’ for the modelled target; and (5) development of options for discriminative use of MaxEnt, i.e., use of MaxEnt with presence/absence data. The most important research needs are considered to be: (1) comparative studies of strategies for construction of parsimonious sets of derived variables for use in MaxEnt modelling; and (2) comparative tests on independent presence/absence data of the predictive performance of MaxEnt models obtained with different model selection strategies, different approaches for internal model performance assessment, and different model improvement criteria.

eISSN: 2084-0098
Language: English

Publication timeframe: Volume Open
Journal Subjects: Life Sciences, Plant Science, Ecology, other
