[2003.00381] Statistical power for cluster analysis

Cluster algorithms are gaining in popularity due to their compelling ability to identify discrete subgroups in data, and their increasing accessibility in mainstream programming languages and statistical software. While researchers can follow guidelines to choose the right algorithms, and to determine what constitutes convincing clustering, there are no firmly established ways of computing a priori statistical power for cluster analysis. Here, we take a simulation approach to estimate power and classification accuracy for popular analysis pipelines. We systematically varied cluster size, number of clusters, number of different features between clusters, effect size within each different feature, and cluster covariance structure in generated datasets. We then subjected these datasets to common dimensionality reduction approaches (none, multi-dimensional scaling, or uniform manifold approximation and projection) and cluster algorithms (k-means, hierarchical agglomerative clustering with
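The simulation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes two equally sized clusters with spherical (identity) covariance, uses only k-means with no dimensionality reduction, and counts a simulated dataset as a "detection" when its silhouette score reaches 0.5. That threshold, the function names `simulate_dataset` and `estimate_power`, and all default parameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def simulate_dataset(rng, n_per_cluster=50, n_features=4, n_diff=4, effect_size=2.0):
    """Draw two Gaussian clusters with unit variance on every feature.

    The cluster means differ by `effect_size` standard deviations on the
    first `n_diff` features and are identical on the remaining ones.
    """
    shift = np.zeros(n_features)
    shift[:n_diff] = effect_size
    cluster_a = rng.normal(size=(n_per_cluster, n_features))
    cluster_b = rng.normal(size=(n_per_cluster, n_features)) + shift
    return np.vstack([cluster_a, cluster_b])


def estimate_power(effect_size, n_sims=50, threshold=0.5, seed=0, **kwargs):
    """Fraction of simulated datasets whose k-means solution reaches
    a silhouette score of at least `threshold` (an assumed cut-off)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for i in range(n_sims):
        X = simulate_dataset(rng, effect_size=effect_size, **kwargs)
        labels = KMeans(n_clusters=2, n_init=10, random_state=i).fit_predict(X)
        hits += int(silhouette_score(X, labels) >= threshold)
    return hits / n_sims


if __name__ == "__main__":
    # Sweep the per-feature effect size while holding everything else fixed.
    for d in (0.0, 1.0, 2.0, 4.0):
        print(f"effect size {d}: estimated power {estimate_power(d, n_sims=20):.2f}")
```

Extending the sweep to cluster size, number of clusters, number of differing features, or covariance structure would follow the same pattern: change one generator parameter at a time and record the detection rate.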

6 mentions: @esdalmaijer
Date: 2020/03/03 23:20

Referring Tweets

@esdalmaijer Do you use cluster analysis? Wonder about its statistical power, and the effects of dimensionality reduction and clustering algorithm choice? WORRY NO MORE! We figured it out for you! With @camillalnord and @DuncanAstle, summary thread below, preprint at t.co/obhQEgSNli t.co/sPB2mw5YVt
@esdalmaijer Our manuscript on @arxiv shows all the simulations and analyses, and has practical tips. Can't find your preferred clustering tool? Our code is freely available on GitHub (link below), and easily adjusted! Pre-print: t.co/obhQEgSNli Code: t.co/NKnzphFI0U
