Using Data-Display Networks for Exploratory Data Analysis in Phylogenetic Studies

Item Type Journal Article
Author David A Morrison
URL http://www.ncbi.nlm.nih.gov/pubmed/20034996
Publication Molecular Biology and Evolution
ISSN 1537-1719
Date Dec 24, 2009
Extra PMID: 20034996
Journal Abbr Mol Biol Evol
DOI 10.1093/molbev/msp309
Accessed 2010-01-01 14:13:05
Library Catalog NCBI PubMed
Abstract Exploratory data analysis (EDA) is a frequently under-valued part of data analysis in biology. It involves evaluating the characteristics of the data before proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a data-display network. This type of network is designed to display any character (or tree) conflict in a dataset, without prior assumptions about the causes of those conflicts. The conflicts might be caused by (a) methodological issues in data collection or analysis, (b) homoplasy, or (c) horizontal gene flow of some sort. Here, I explore 13 published datasets using splits networks, as examples of using data-display networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. In each case, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has done.
Title Using Data-Display Networks for Exploratory Data Analysis in Phylogenetic Studies
Date Added 2009-01-01 09:13
Date Modified 2009-01-01 09:13