Title: | 'ggplot2' Based Plots with Statistical Details |
---|---|
Description: | Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>. |
Authors: | Indrajeet Patil [cre, aut, cph], Chuck Powell [ctb] |
Maintainer: | Indrajeet Patil <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 0.12.5.9000 |
Built: | 2024-11-11 22:18:07 UTC |
Source: | https://github.com/indrajeetpatil/ggstatsplot |
Tidy version of the "Bugs" dataset.
bugs_long
A data frame with 372 rows and 6 variables
subject. Dummy identity number for each participant.
gender. Participant's gender (Female, Male).
region. Region of the world the participant was from.
education. Level of education.
condition. Condition of the experiment the participant gave a rating for (LDLF: low frighteningness and low disgustingness; LFHD: low frighteningness and high disgustingness; HFLD: high frighteningness and low disgustingness; HFHD: high frighteningness and high disgustingness).
desire. The desire to kill an arthropod was indicated on a scale from 0 to 10.
This data set, "Bugs", provides the extent to which men and women want to kill arthropods that vary in frighteningness (low, high) and disgustingness (low, high). Each participant rates their attitudes towards all arthropods. Subset of the data reported by Ryan et al. (2013).
Ryan, R. S., Wilde, M., & Crist, S. (2013). Compared to a small, supervised lab experiment, a large, unsupervised web-based experiment on a previously unknown effect has benefits that outweigh its potential costs. Computers in Human Behavior, 29(4), 1295-1301.
dim(bugs_long)
head(bugs_long)
dplyr::glimpse(bugs_long)
Wrapper around patchwork::wrap_plots() that will return a combined grid of plots with annotations. In case you want to create a grid of plots, it is highly recommended that you use the {patchwork} package directly and not this wrapper around it, which is mostly useful with {ggstatsplot} plots. It is exported only for backward compatibility.
combine_plots(
  plotlist,
  plotgrid.args = list(),
  annotation.args = list(),
  guides = "collect",
  ...
)
plotlist |
A list containing |
plotgrid.args |
A |
annotation.args |
A |
guides |
A string specifying how guides should be treated in the layout.
|
... |
Currently ignored. |
A combined plot with annotation labels.
library(ggplot2)

# first plot
p1 <- ggplot(
  data = subset(iris, iris$Species == "setosa"),
  aes(x = Sepal.Length, y = Sepal.Width)
) +
  geom_point() +
  labs(title = "setosa")

# second plot
p2 <- ggplot(
  data = subset(iris, iris$Species == "versicolor"),
  aes(x = Sepal.Length, y = Sepal.Width)
) +
  geom_point() +
  labs(title = "versicolor")

# combining the plot with a title and a caption
combine_plots(
  plotlist = list(p1, p2),
  plotgrid.args = list(nrow = 1),
  annotation.args = list(
    tag_levels = "a",
    title = "Dataset: Iris Flower dataset",
    subtitle = "Edgar Anderson collected this data",
    caption = "Note: Only two species of flower are displayed",
    theme = theme(
      plot.subtitle = element_text(size = 20),
      plot.title = element_text(size = 30)
    )
  )
)
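Since the description of combine_plots() recommends using {patchwork} directly, the following is a minimal sketch of that route. It assumes {ggplot2} and {patchwork} are installed; make_panel() is a hypothetical helper introduced only for this illustration and is not part of either package.

```r
# Sketch: a two-panel grid built with {patchwork} directly.
# Runs only if both packages are available.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("patchwork", quietly = TRUE)) {
  library(ggplot2)
  library(patchwork)

  # hypothetical helper: one scatterplot per iris species
  make_panel <- function(species) {
    ggplot(
      iris[iris$Species == species, ],
      aes(x = Sepal.Length, y = Sepal.Width)
    ) +
      geom_point() +
      labs(title = species)
  }
  plots <- lapply(c("setosa", "versicolor"), make_panel)

  # wrap_plots() lays out the list; plot_annotation() adds tags and titles
  combined <- wrap_plots(plots, nrow = 1, guides = "collect") +
    plot_annotation(
      tag_levels = "a",
      title = "Dataset: Iris Flower dataset",
      caption = "Note: Only two species of flower are displayed"
    )
  combined
}
```

Here nrow plays the role of an entry in plotgrid.args, and the plot_annotation() arguments play the role of annotation.args.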
Extracting data frames or expressions from {ggstatsplot} plots
extract_stats(p)
extract_subtitle(p)
extract_caption(p)
p |
A plot from |
These are convenience functions to extract data frames or expressions with statistical details that are used to create expressions displayed in {ggstatsplot} plots as subtitle, caption, etc. Note that all of this analysis is carried out by the {statsExpressions} package. So if you are using these functions only to extract data frames, you are better off using that package.
The only exception is the ggcorrmat() function. But if a data frame is what you want, you shouldn't be using ggcorrmat() anyway. You can use the correlation::correlation() function, which provides tidy data frames by default.
A list of tibbles containing summaries of various statistical analyses. The exact details included will depend on the function.
set.seed(123)

# non-grouped plot
p1 <- ggbetweenstats(mtcars, cyl, mpg)

# grouped plot
p2 <- grouped_ggbarstats(Titanic_full, Survived, Sex, grouping.var = Age)

# extracting expressions -----------------------------
extract_subtitle(p1)
extract_caption(p1)
extract_subtitle(p2)
extract_caption(p2)

# extracting data frames -----------------------------
extract_stats(p1)
extract_stats(p2)
Bar charts for categorical data with statistical details included in the plot as a subtitle.
ggbarstats(
  data,
  x,
  y,
  counts = NULL,
  type = "parametric",
  paired = FALSE,
  results.subtitle = TRUE,
  label = "percentage",
  label.args = list(alpha = 1, fill = "white"),
  sample.size.label.args = list(size = 4),
  digits = 2L,
  proportion.test = results.subtitle,
  digits.perc = 0L,
  bf.message = TRUE,
  ratio = NULL,
  conf.level = 0.95,
  sampling.plan = "indepMulti",
  fixed.margin = "rows",
  prior.concentration = 1,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  legend.title = NULL,
  xlab = NULL,
  ylab = NULL,
  ggtheme = ggstatsplot::theme_ggstatsplot(),
  package = "RColorBrewer",
  palette = "Dark2",
  ggplot.component = NULL,
  ...
)
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix,table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The variable to use as the rows in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped. |
y |
The variable to use as the columns in the contingency table.
Please note that if there are empty factor levels in your variable, they
will be dropped. Default is |
counts |
The variable in data containing counts, or |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
paired |
Logical indicating whether data came from a within-subjects or
repeated measures design study (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
label |
Character decides what information needs to be displayed
on the label in each pie slice. Possible options are |
label.args |
Additional aesthetic arguments that will be passed to
|
sample.size.label.args |
Additional aesthetic arguments that will be
passed to |
digits |
Number of digits for rounding or significant figures. May also
be |
proportion.test |
Decides whether proportion test for |
digits.perc |
Numeric that decides number of decimal places for
percentage labels (Default: |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
ratio |
A vector of proportions: the expected proportions for the
proportion test (should sum to |
conf.level |
Scalar between |
sampling.plan |
Character describing the sampling plan. Possible options:
|
fixed.margin |
For the independent multinomial sampling plan, which
margin is fixed ( |
prior.concentration |
Specifies the prior concentration parameter, set
to |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
legend.title |
Title text for the legend. |
xlab |
Label for |
ylab |
Labels for |
ggtheme |
A |
package , palette
|
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggpiestats.html
graphical element | geom used | argument for further modification |
bars | ggplot2::geom_bar() | NA |
descriptive labels | ggplot2::geom_label() | label.args |
sample size labels | ggplot2::geom_text() | sample.size.label.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Design | Test | Function used |
Parametric/Non-parametric | Unpaired | Pearson's chi-squared test | stats::chisq.test() |
Bayesian | Unpaired | Bayesian Pearson's chi-squared test | BayesFactor::contingencyTableBF() |
Parametric/Non-parametric | Paired | McNemar's chi-squared test | stats::mcnemar.test() |
Bayesian | Paired | No | No |
Effect size estimation
Type | Design | Effect size | CI available? | Function used |
Parametric/Non-parametric | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
Bayesian | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
Parametric/Non-parametric | Paired | Cohen's g | Yes | effectsize::cohens_g() |
Bayesian | Paired | No | No | No |
Hypothesis testing
Type | Test | Function used |
Parametric/Non-parametric | Goodness of fit chi-squared test | stats::chisq.test() |
Bayesian | Bayesian Goodness of fit chi-squared test | (custom) |
Effect size estimation
Type | Effect size | CI available? | Function used |
Parametric/Non-parametric | Pearson's C | Yes | effectsize::pearsons_c() |
Bayesian | No | No | No |
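The frequentist tests named in the tables above can be run directly in base R. The sketch below is only an illustration of those tests, not the code path {ggstatsplot} uses internally. Cramer's V is computed by hand without any bias correction, so it may differ from the value effectsize::cramers_v() reports. The 2 x 2 table for the paired case is hypothetical data made up for this example.

```r
# Two-way table, unpaired design: Pearson's chi-squared test of independence
tab <- table(vs = mtcars$vs, cyl = mtcars$cyl)
# some expected counts are below 5, so R warns about the approximation
chi <- suppressWarnings(stats::chisq.test(tab))

# Cramer's V, uncorrected: sqrt(chi^2 / (n * (min(rows, cols) - 1)))
cramers_v <- sqrt(unname(chi$statistic) / (sum(tab) * (min(dim(tab)) - 1)))

# One-way table: goodness-of-fit test against equal expected proportions
# (p corresponds to what the `ratio` argument describes)
gof <- stats::chisq.test(table(mtcars$cyl), p = rep(1 / 3, 3))

# Two-way table, paired design: McNemar's test on hypothetical counts
paired_tab <- matrix(
  c(20, 5, 12, 23),
  nrow = 2,
  dimnames = list(before = c("yes", "no"), after = c("yes", "no"))
)
mcn <- stats::mcnemar.test(paired_tab)

chi
gof
mcn
```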
grouped_ggbarstats, ggpiestats, grouped_ggpiestats
# for reproducibility
set.seed(123)

# creating a plot
p <- ggbarstats(mtcars, x = vs, y = cyl)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)
A combination of box and violin plots along with jittered data points for between-subjects designs with statistical details included in the plot as a subtitle.
ggbetweenstats(
  data,
  x,
  y,
  type = "parametric",
  pairwise.display = "significant",
  p.adjust.method = "holm",
  effsize.type = "unbiased",
  bf.prior = 0.707,
  bf.message = TRUE,
  results.subtitle = TRUE,
  xlab = NULL,
  ylab = NULL,
  caption = NULL,
  title = NULL,
  subtitle = NULL,
  digits = 2L,
  var.equal = FALSE,
  conf.level = 0.95,
  nboot = 100L,
  tr = 0.2,
  centrality.plotting = TRUE,
  centrality.type = type,
  centrality.point.args = list(size = 5, color = "darkred"),
  centrality.label.args = list(size = 3, nudge_x = 0.4, segment.linetype = 4,
    min.segment.length = 0),
  point.args = list(position = ggplot2::position_jitterdodge(dodge.width = 0.6),
    alpha = 0.4, size = 3, stroke = 0, na.rm = TRUE),
  boxplot.args = list(width = 0.3, alpha = 0.2, na.rm = TRUE),
  violin.args = list(width = 0.5, alpha = 0.2, na.rm = TRUE),
  ggsignif.args = list(textsize = 3, tip_length = 0.01, na.rm = TRUE),
  ggtheme = ggstatsplot::theme_ggstatsplot(),
  package = "RColorBrewer",
  palette = "Dark2",
  ggplot.component = NULL,
  ...
)
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix,table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The grouping (or independent) variable from |
y |
The response (or outcome or dependent) variable from |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
pairwise.display |
Decides which pairwise comparisons to display. Available options are:
You can use this argument to make sure that your plot is not uber-cluttered
when you have multiple groups being compared and scores of pairwise
comparisons being displayed. If set to |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
effsize.type |
Type of effect size needed for parametric tests. The
argument can be |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
xlab |
Label for |
ylab |
Labels for |
caption |
The text for the plot caption. This argument is relevant only
if |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
digits |
Number of digits for rounding or significant figures. May also
be |
var.equal |
a logical variable indicating whether to treat the
two variances as being equal. If |
conf.level |
Scalar between |
nboot |
Number of bootstrap samples for computing confidence interval
for the effect size (Default: |
tr |
Trim level for the mean when carrying out |
centrality.plotting |
Logical that decides whether centrality tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.point.args , centrality.label.args
|
A list of additional aesthetic
arguments to be passed to |
point.args |
A list of additional aesthetic arguments to be passed to
the |
boxplot.args |
A list of additional aesthetic arguments passed on to
|
violin.args |
A list of additional aesthetic arguments to be passed to
the |
ggsignif.args |
A list of additional aesthetic
arguments to be passed to |
ggtheme |
A |
package , palette
|
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggbetweenstats.html
graphical element | geom used | argument for further modification |
raw data | ggplot2::geom_point() | point.args |
box plot | ggplot2::geom_boxplot() | boxplot.args |
density plot | ggplot2::geom_violin() | violin.args |
centrality measure point | ggplot2::geom_point() | centrality.point.args |
centrality measure label | ggrepel::geom_label_repel() | centrality.label.args |
pairwise comparisons | ggsignif::geom_signif() | ggsignif.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Type | Measure | Function used |
Parametric | mean | datawizard::describe_distribution() |
Non-parametric | median | datawizard::describe_distribution() |
Robust | trimmed mean | datawizard::describe_distribution() |
Bayesian | MAP | datawizard::describe_distribution() |
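Three of the centrality measures in the table above have direct base R counterparts, sketched below for illustration. This is not the datawizard::describe_distribution() call the package makes. The trim level of 0.2 mirrors the default of the tr argument. The MAP estimate is omitted because base R has no built-in function for it.

```r
x <- mtcars$mpg

centrality <- c(
  mean = mean(x),                     # parametric
  median = median(x),                 # non-parametric
  trimmed_mean = mean(x, trim = 0.2)  # robust: drops 20% of values at each end
)
round(centrality, 2)

# Trimming on a toy vector: with 5 values and trim = 0.2, one value is
# dropped from each end, so the outlier 100 no longer affects the result
mean(c(1, 2, 3, 4, 100), trim = 0.2)  # 3
```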
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | 2 | Student's or Welch's t-test | stats::t.test() |
Non-parametric | 2 | Mann-Whitney U test | stats::wilcox.test() |
Robust | 2 | Yuen's test for trimmed means | WRS2::yuen() |
Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d() , effectsize::hedges_g() |
Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::akp.effect() |
Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | 2 | Student's t-test | stats::t.test() |
Non-parametric | 2 | Wilcoxon signed-rank test | stats::wilcox.test() |
Robust | 2 | Yuen's test on trimmed means for dependent samples | WRS2::yuend() |
Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d() , effectsize::hedges_g() |
Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::wmcpAKP() |
Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
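As an illustration of the two-group tests listed above, the base R sketch below runs the Student's and Welch's t-tests (switched by var.equal, as in the function's argument of the same name), the Mann-Whitney U test, and a paired t-test. Cohen's d is computed by hand using the pooled standard deviation, which is an assumption of this sketch and need not match the defaults of effectsize::cohens_d().

```r
# Two independent groups: mpg by transmission type (am: 0 = automatic, 1 = manual)
welch <- stats::t.test(mpg ~ am, data = mtcars, var.equal = FALSE)
student <- stats::t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# Mann-Whitney U test; ties in mpg trigger a warning about exact p-values
mann_whitney <- suppressWarnings(stats::wilcox.test(mpg ~ am, data = mtcars))

# Cohen's d by hand, using the pooled standard deviation
g0 <- mtcars$mpg[mtcars$am == 0]
g1 <- mtcars$mpg[mtcars$am == 1]
pooled_sd <- sqrt(
  ((length(g0) - 1) * var(g0) + (length(g1) - 1) * var(g1)) /
    (length(g0) + length(g1) - 2)
)
cohens_d <- (mean(g0) - mean(g1)) / pooled_sd

# Two dependent measurements: paired t-test on the built-in sleep data
paired <- with(
  sleep,
  stats::t.test(extra[group == "1"], extra[group == "2"], paired = TRUE)
)

# Welch's test uses fractional degrees of freedom; Student's uses n - 2
c(welch_df = unname(welch$parameter), student_df = unname(student$parameter))
```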
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | > 2 | Fisher's or Welch's one-way ANOVA | stats::oneway.test() |
Non-parametric | > 2 | Kruskal-Wallis one-way ANOVA | stats::kruskal.test() |
Robust | > 2 | Heteroscedastic one-way ANOVA for trimmed means | WRS2::t1way() |
Bayes Factor | > 2 | Fisher's ANOVA | BayesFactor::anovaBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared() , effectsize::eta_squared() |
Non-parametric | > 2 | rank epsilon squared | Yes | effectsize::rank_epsilon_squared() |
Robust | > 2 | Explanatory measure of effect size | Yes | WRS2::t1way() |
Bayes Factor | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | > 2 | One-way repeated measures ANOVA | afex::aov_ez() |
Non-parametric | > 2 | Friedman rank sum test | stats::friedman.test() |
Robust | > 2 | Heteroscedastic one-way repeated measures ANOVA for trimmed means | WRS2::rmanova() |
Bayes Factor | > 2 | One-way repeated measures ANOVA | BayesFactor::anovaBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared() , effectsize::eta_squared() |
Non-parametric | > 2 | Kendall's coefficient of concordance | Yes | effectsize::kendalls_w() |
Robust | > 2 | Algina-Keselman-Penfield robust standardized difference average | Yes | WRS2::wmcpAKP() |
Bayes Factor | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
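For more than two independent groups, the tests in the tables above can be sketched with base R as follows. This is illustrative only and not the package's internal code. Eta-squared is computed by hand from the ANOVA sums of squares; in a one-way design it coincides with partial eta-squared. The data frame name cars_f is introduced just for this example.

```r
# cyl has three levels (4, 6, 8); convert it to a factor first
cars_f <- transform(mtcars, cyl = factor(cyl))

# Fisher's (equal variances) and Welch's (unequal variances) one-way ANOVA
fisher_anova <- stats::oneway.test(mpg ~ cyl, data = cars_f, var.equal = TRUE)
welch_anova <- stats::oneway.test(mpg ~ cyl, data = cars_f, var.equal = FALSE)

# Kruskal-Wallis one-way ANOVA on ranks
kruskal <- stats::kruskal.test(mpg ~ cyl, data = cars_f)

# Eta-squared by hand: SS_between / SS_total
ss <- summary(stats::aov(mpg ~ cyl, data = cars_f))[[1]][["Sum Sq"]]
eta_squared <- ss[1] / sum(ss)

fisher_anova
welch_anova
kruskal
round(eta_squared, 2)
```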
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Equal variance? | Test | p-value adjustment? | Function used |
Parametric | No | Games-Howell test | Yes | PMCMRplus::gamesHowellTest() |
Parametric | Yes | Student's t-test | Yes | stats::pairwise.t.test() |
Non-parametric | No | Dunn test | Yes | PMCMRplus::kwAllPairsDunnTest() |
Robust | No | Yuen's trimmed means test | Yes | WRS2::lincon() |
Bayesian | NA | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
Hypothesis testing
Type | Test | p-value adjustment? | Function used |
Parametric | Student's t-test | Yes | stats::pairwise.t.test() |
Non-parametric | Durbin-Conover test | Yes | PMCMRplus::durbinAllPairsTest() |
Robust | Yuen's trimmed means test | Yes | WRS2::rmmcp() |
Bayesian | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
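To make the "p-value adjustment" column above concrete, here is a small base R sketch using stats::pairwise.t.test(), one of the functions named in the tables, together with stats::p.adjust(). The "holm" method matches the default of p.adjust.method. The three raw p-values in the last step are made-up numbers chosen only to show the arithmetic.

```r
# Pairwise Student's t-tests between the three cyl groups,
# first without and then with Holm adjustment
pw_raw <- stats::pairwise.t.test(mtcars$mpg, mtcars$cyl, p.adjust.method = "none")
pw_holm <- stats::pairwise.t.test(mtcars$mpg, mtcars$cyl, p.adjust.method = "holm")

pw_raw$p.value
pw_holm$p.value  # adjusted p-values are never smaller than the raw ones

# Holm adjustment on hypothetical raw p-values: the smallest is multiplied
# by 3, the next by 2, the largest by 1, then monotonicity is enforced
stats::p.adjust(c(0.01, 0.02, 0.03), method = "holm")  # 0.03 0.04 0.04
```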
grouped_ggbetweenstats, ggwithinstats, grouped_ggwithinstats
# for reproducibility
set.seed(123)

p <- ggbetweenstats(mtcars, am, mpg)
p

# extracting details from statistical tests
extract_stats(p)

# modifying defaults
ggbetweenstats(
  morley,
  x = Expt,
  y = Speed,
  type = "robust",
  xlab = "The experiment number",
  ylab = "Speed-of-light measurement"
)

# you can remove a specific geom to reduce complexity of the plot
ggbetweenstats(
  mtcars,
  am,
  wt,
  # to remove violin plot
  violin.args = list(width = 0, linewidth = 0),
  # to remove boxplot
  boxplot.args = list(width = 0),
  # to remove points
  point.args = list(alpha = 0)
)
Plot with the regression coefficients' point estimates as dots with confidence interval whiskers and other statistical details included as labels.
Although the statistical models displayed in the plot may differ based on the class of models being investigated, there are a few aspects of the plot that will be invariant across models:
The dot-whisker plot contains dots representing the estimates and whiskers representing their confidence intervals (95% is the default). The estimates can either be effect sizes (for tests that depend on the F-statistic) or regression coefficients (for tests with t-, chi^2-, and z-statistic), etc. The function will, by default, display a helpful x-axis label that should clear up what estimates are being displayed. The confidence intervals can sometimes be asymmetric if bootstrapping was used.
The label attached to each dot will provide more details from the statistical test carried out and it will typically contain the estimate, statistic, and p-value.
The caption will contain diagnostic information, if available, about models that can be useful for model selection: The smaller the Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC) values, the "better" the model is.
The output of this function will be a {ggplot2} object and, thus, it can be further modified (e.g. change themes) with {ggplot2}.
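The quantities described above (point estimates, their confidence intervals, and AIC/BIC for model comparison) can be inspected with base R before plotting. The sketch below is only an illustration with two arbitrarily chosen models and is not how ggcoefstats() tidies a model internally.

```r
# Two candidate models for the same outcome
mod_small <- lm(mpg ~ wt, data = mtcars)
mod_large <- lm(mpg ~ wt + hp, data = mtcars)

# Point estimates with 95% confidence intervals: the dots and whiskers
estimates <- cbind(
  estimate = coef(mod_large),
  confint(mod_large, level = 0.95)
)
estimates

# Information criteria: smaller values indicate the "better" model
fit <- data.frame(
  model = c("mpg ~ wt", "mpg ~ wt + hp"),
  AIC = c(AIC(mod_small), AIC(mod_large)),
  BIC = c(BIC(mod_small), BIC(mod_large))
)
fit
```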
ggcoefstats(
  x,
  statistic = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  digits = 2L,
  exclude.intercept = FALSE,
  effectsize.type = "eta",
  meta.analytic.effect = FALSE,
  meta.type = "parametric",
  bf.message = TRUE,
  sort = "none",
  xlab = NULL,
  ylab = NULL,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  only.significant = FALSE,
  point.args = list(size = 3, color = "blue", na.rm = TRUE),
  errorbar.args = list(height = 0, na.rm = TRUE),
  vline = TRUE,
  vline.args = list(linewidth = 1, linetype = "dashed"),
  stats.labels = TRUE,
  stats.label.color = NULL,
  stats.label.args = list(size = 3, direction = "y", min.segment.length = 0,
    na.rm = TRUE),
  package = "RColorBrewer",
  palette = "Dark2",
  ggtheme = ggstatsplot::theme_ggstatsplot(),
  ...
)
x |
A model object to be tidied, or a tidy data frame from a regression
model. Function internally uses |
statistic |
Relevant statistic for the model ( |
conf.int |
Logical. Decides whether to display confidence intervals as
error bars (Default: |
conf.level |
Numeric deciding level of confidence or credible intervals
(Default: |
digits |
Number of digits for rounding or significant figures. May also
be |
exclude.intercept |
Logical that decides whether the intercept should be
excluded from the plot (Default: |
effectsize.type |
This is the same as |
meta.analytic.effect |
Logical that decides whether subtitle for
meta-analysis via linear (mixed-effects) models (default: |
meta.type |
Type of statistics used to carry out random-effects
meta-analysis. If |
bf.message |
Logical that decides whether results from running a
Bayesian meta-analysis assuming that the effect size d varies across
studies with standard deviation t (i.e., a random-effects analysis)
should be displayed in caption. Defaults to |
sort |
If |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. The input to this argument
will be ignored if |
caption |
The text for the plot caption. This argument is relevant only
if |
only.significant |
If |
point.args |
A list of additional aesthetic arguments to be passed to
the |
errorbar.args |
Additional arguments that will be passed to
|
vline |
Decides whether to display a vertical line (Default: |
vline.args |
Additional arguments that will be passed to
|
stats.labels |
Logical. Decides whether the statistic and p-values for
each coefficient are to be attached to each dot as a text label using
|
stats.label.color |
Color for the labels. If set to |
stats.label.args |
Additional arguments that will be passed to
|
package , palette
|
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
ggtheme |
A |
... |
Additional arguments to tidying method. For more, see
|
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggcoefstats.html
graphical element | geom used | argument for further modification |
regression estimate | ggplot2::geom_point() | point.args |
error bars | ggplot2::geom_errorbarh() | errorbar.args |
vertical line | ggplot2::geom_vline() | vline.args |
label with statistical details | ggrepel::geom_label_repel() | stats.label.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing and Effect size estimation
Type | Test | CI available? | Function used |
Parametric | Pearson's correlation coefficient | Yes | correlation::correlation() |
Non-parametric | Spearman's rank correlation coefficient | Yes | correlation::correlation() |
Robust | Winsorized Pearson's correlation coefficient | Yes | correlation::correlation() |
Bayesian | Bayesian Pearson's correlation coefficient | Yes | correlation::correlation() |
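The Pearson and Spearman coefficients in the table above can also be obtained with base R's stats::cor.test(). This sketch is an illustration with two arbitrarily chosen mtcars variables. The table lists correlation::correlation() as the function used by the package, so output formats will differ.

```r
# Pearson's correlation with a 95% confidence interval
pearson <- stats::cor.test(
  mtcars$mpg, mtcars$wt,
  method = "pearson", conf.level = 0.95
)

# Spearman's rank correlation; ties trigger a warning about exact p-values
spearman <- suppressWarnings(
  stats::cor.test(mtcars$mpg, mtcars$wt, method = "spearman")
)

c(
  pearson_r = unname(pearson$estimate),
  ci_low = pearson$conf.int[1],
  ci_high = pearson$conf.int[2],
  spearman_rho = unname(spearman$estimate)
)
```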
In case you want to carry out meta-analysis, you will be asked to install the needed packages ({metafor}, {metaplus}, or {metaBMA}) if they are unavailable.
All rows of regression estimates where any of the following quantities is NA will be removed if labels are requested: estimate, statistic, p.value.
Given the rapid pace at which new methods are added to these packages, it is recommended that you install development versions of {easystats} packages using the install_latest() function from {easystats}.
# for reproducibility
set.seed(123)
library(lme4)

# model object
mod <- lm(formula = mpg ~ cyl * am, data = mtcars)

# creating a plot
p <- ggcoefstats(mod)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)

# further arguments can be passed to `parameters::model_parameters()`
ggcoefstats(lmer(Reaction ~ Days + (Days | Subject), sleepstudy), effects = "fixed")
Correlation matrix containing results from pairwise correlation tests.
If you want a data frame of (grouped) correlation matrix, use correlation::correlation() instead. It can also do grouped analysis when used with output from dplyr::group_by().
ggcorrmat(
  data,
  cor.vars = NULL,
  cor.vars.names = NULL,
  matrix.type = "upper",
  type = "parametric",
  tr = 0.2,
  partial = FALSE,
  digits = 2L,
  sig.level = 0.05,
  conf.level = 0.95,
  bf.prior = 0.707,
  p.adjust.method = "holm",
  pch = "cross",
  ggcorrplot.args = list(method = "square", outline.color = "black", pch.cex = 14),
  package = "RColorBrewer",
  palette = "Dark2",
  colors = c("#E69F00", "white", "#009E73"),
  ggtheme = ggstatsplot::theme_ggstatsplot(),
  ggplot.component = NULL,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  ...
)
data |
A data frame from which variables specified are to be taken. |
cor.vars |
List of variables for which the correlation matrix is to be
computed and visualized. If |
cor.vars.names |
Optional list of names to be used for |
matrix.type |
Character, |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
tr |
Trim level for the mean when carrying out |
partial |
Can be |
digits |
Number of digits for rounding or significant figures. May also
be |
sig.level |
Significance level (Default: |
conf.level |
Scalar between |
bf.prior |
A number between |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
pch |
Decides the point shape to be used for insignificant correlation
coefficients (only valid when |
ggcorrplot.args |
A list of additional (mostly aesthetic) arguments that
will be passed to |
package, palette |
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
colors |
A vector of 3 colors for low, mid, and high correlation values.
If set to |
ggtheme |
A |
ggplot.component |
A |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggcorrmat.html
graphical element | geom used | argument for further modification |
correlation matrix | ggcorrplot::ggcorrplot() | ggcorrplot.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing and Effect size estimation
Type | Test | CI available? | Function used |
Parametric | Pearson's correlation coefficient | Yes | correlation::correlation() |
Non-parametric | Spearman's rank correlation coefficient | Yes | correlation::correlation() |
Robust | Winsorized Pearson's correlation coefficient | Yes | correlation::correlation() |
Bayesian | Bayesian Pearson's correlation coefficient | Yes | correlation::correlation() |
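The parametric and non-parametric rows above can be approximated with base R alone. The sketch below uses stats::cor.test() as a stand-in; it is an assumption made for illustration and is not the call that correlation::correlation() makes internally. The choice of iris columns is also only illustrative.

```r
# Pearson (parametric) and Spearman (non-parametric) correlations in base R.
x <- iris$Sepal.Width
y <- iris$Petal.Length

pearson <- stats::cor.test(x, y, method = "pearson", conf.level = 0.95)

# exact = FALSE avoids the warning about ties when computing Spearman's rho
spearman <- stats::cor.test(x, y, method = "spearman", exact = FALSE)

pearson$estimate   # Pearson's r
pearson$conf.int   # confidence interval for r
spearman$estimate  # Spearman's rho
```

Running the same pair of variables through both methods is a quick way to see how sensitive a single cell of the matrix is to the choice of `type`.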
grouped_ggcorrmat, ggscatterstats, grouped_ggscatterstats
set.seed(123)
library(ggcorrplot)
ggcorrmat(iris)
A dot chart (as described by William S. Cleveland) with statistical details from one-sample test.
ggdotplotstats( data, x, y, xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, type = "parametric", test.value = 0, bf.prior = 0.707, bf.message = TRUE, effsize.type = "g", conf.level = 0.95, tr = 0.2, digits = 2L, results.subtitle = TRUE, point.args = list(color = "black", size = 3, shape = 16), centrality.plotting = TRUE, centrality.type = type, centrality.line.args = list(color = "blue", linewidth = 1, linetype = "dashed"), ggplot.component = NULL, ggtheme = ggstatsplot::theme_ggstatsplot(), ... )
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
A numeric variable from the data frame |
y |
Label or grouping variable. |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
test.value |
A number indicating the true value of the mean (Default:
|
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
effsize.type |
Type of effect size needed for parametric tests. The
argument can be |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
point.args |
A list of additional aesthetic arguments to be passed to
the |
centrality.plotting |
Logical that decides whether central tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.line.args |
A list of additional aesthetic arguments to be
passed to the |
ggplot.component |
A |
ggtheme |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggdotplotstats.html
graphical element | geom used | argument for further modification |
raw data | ggplot2::geom_point() | point.args |
centrality measure line | ggplot2::geom_vline() | centrality.line.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Test | Function used |
Parametric | One-sample Student's t-test | stats::t.test() |
Non-parametric | One-sample Wilcoxon test | stats::wilcox.test() |
Robust | Bootstrap-t method for one-sample test | WRS2::trimcibt() |
Bayesian | One-sample Student's t-test | BayesFactor::ttestBF() |
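The first two rows of this table map onto base R functions that can be called directly. The following is a minimal sketch, assuming mtcars$mpg as the sample and a hypothetical reference value of 20 standing in for `test.value`; it shows the underlying tests, not the package's own wrapper code.

```r
# One-sample tests against a reference value.
x <- mtcars$mpg
test_value <- 20  # hypothetical value, plays the role of `test.value`

# Parametric: one-sample Student's t-test
t_res <- stats::t.test(x, mu = test_value, conf.level = 0.95)

# Non-parametric: one-sample Wilcoxon signed-rank test
# (exact = FALSE avoids the warning caused by tied values)
w_res <- stats::wilcox.test(x, mu = test_value, exact = FALSE)

t_res$statistic  # t value
t_res$parameter  # degrees of freedom (n - 1)
t_res$p.value
w_res$p.value
```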
Effect size estimation
Type | Effect size | CI available? | Function used |
Parametric | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
Non-parametric | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | trimmed mean | Yes | WRS2::trimcibt() |
Bayes Factor | difference | Yes | bayestestR::describe_posterior() |
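To make the parametric effect sizes concrete, the one-sample versions can be worked out by hand. This sketch uses the textbook definition of Cohen's d and the common approximate small-sample correction for Hedges' g; effectsize::hedges_g() may use a different (exact) correction, so values can differ slightly. The reference value of 20 is hypothetical.

```r
# One-sample Cohen's d and an approximate Hedges' g, computed by hand.
x <- mtcars$mpg
test_value <- 20  # hypothetical reference value
n <- length(x)

# Cohen's d: standardized distance between the sample mean and the reference value
d <- (mean(x) - test_value) / stats::sd(x)

# Approximate small-sample correction factor (df = n - 1)
J <- 1 - 3 / (4 * (n - 1) - 1)
g <- J * d

c(cohens_d = d, hedges_g = g)
```

Because the correction factor is below 1, g is always slightly closer to zero than d, and the gap shrinks as the sample grows.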
grouped_gghistostats, gghistostats, grouped_ggdotplotstats
# for reproducibility
set.seed(123)

# creating a plot
p <- ggdotplotstats(
  data = ggplot2::mpg,
  x = cty,
  y = manufacturer,
  title = "Fuel economy data",
  xlab = "city miles per gallon"
)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)
Histogram with statistical details from one-sample test included in the plot as a subtitle.
gghistostats( data, x, binwidth = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, type = "parametric", test.value = 0, bf.prior = 0.707, bf.message = TRUE, effsize.type = "g", conf.level = 0.95, tr = 0.2, digits = 2L, ggtheme = ggstatsplot::theme_ggstatsplot(), results.subtitle = TRUE, bin.args = list(color = "black", fill = "grey50", alpha = 0.7), centrality.plotting = TRUE, centrality.type = type, centrality.line.args = list(color = "blue", linewidth = 1, linetype = "dashed"), ggplot.component = NULL, ... )
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
A numeric variable from the data frame |
binwidth |
The width of the histogram bins. Can be specified as a
numeric value, or a function that calculates width from |
xlab |
Label for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
test.value |
A number indicating the true value of the mean (Default:
|
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
effsize.type |
Type of effect size needed for parametric tests. The
argument can be |
conf.level |
Scalar between |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
ggtheme |
A |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
bin.args |
A list of additional aesthetic arguments to be passed to the
|
centrality.plotting |
Logical that decides whether central tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.line.args |
A list of additional aesthetic arguments to be
passed to the |
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/gghistostats.html
graphical element | geom used | argument for further modification |
histogram bin | ggplot2::stat_bin() | bin.args |
centrality measure line | ggplot2::geom_vline() | centrality.line.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Test | Function used |
Parametric | One-sample Student's t-test | stats::t.test() |
Non-parametric | One-sample Wilcoxon test | stats::wilcox.test() |
Robust | Bootstrap-t method for one-sample test | WRS2::trimcibt() |
Bayesian | One-sample Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
Type | Effect size | CI available? | Function used |
Parametric | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
Non-parametric | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | trimmed mean | Yes | WRS2::trimcibt() |
Bayes Factor | difference | Yes | bayestestR::describe_posterior() |
grouped_gghistostats, ggdotplotstats, grouped_ggdotplotstats
# for reproducibility
set.seed(123)

# creating a plot
p <- gghistostats(
  data = ToothGrowth,
  x = len,
  xlab = "Tooth length",
  centrality.type = "np"
)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)
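The `binwidth` argument is described as accepting either a number or a function that calculates the width from the data. The sketch below defines one such function using the Freedman-Diaconis rule. The helper name `fd_binwidth` is hypothetical, and the commented-out call assumes the function is forwarded unchanged to the histogram layer, which this page does not confirm.

```r
# Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)
fd_binwidth <- function(x) {
  2 * stats::IQR(x) / length(x)^(1 / 3)
}

# Width the rule suggests for the tooth length data
fd_binwidth(ToothGrowth$len)

# Assumed usage, if ggstatsplot is installed:
# gghistostats(data = ToothGrowth, x = len, binwidth = fd_binwidth)
```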
Pie charts for categorical data with statistical details included in the plot as a subtitle.
ggpiestats( data, x, y = NULL, counts = NULL, type = "parametric", paired = FALSE, results.subtitle = TRUE, label = "percentage", label.args = list(direction = "both"), label.repel = FALSE, digits = 2L, proportion.test = results.subtitle, digits.perc = 0L, bf.message = TRUE, ratio = NULL, conf.level = 0.95, sampling.plan = "indepMulti", fixed.margin = "rows", prior.concentration = 1, title = NULL, subtitle = NULL, caption = NULL, legend.title = NULL, ggtheme = ggstatsplot::theme_ggstatsplot(), package = "RColorBrewer", palette = "Dark2", ggplot.component = NULL, ... )
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The variable to use as the rows in the contingency table. Please note that if there are empty factor levels in your variable, they will be dropped. |
y |
The variable to use as the columns in the contingency table.
Please note that if there are empty factor levels in your variable, they
will be dropped. Default is |
counts |
The variable in data containing counts, or |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
paired |
Logical indicating whether data came from a within-subjects or
repeated measures design study (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
label |
Character decides what information needs to be displayed
on the label in each pie slice. Possible options are |
label.args |
Additional aesthetic arguments that will be passed to
|
label.repel |
Whether labels should be repelled using |
digits |
Number of digits for rounding or significant figures. May also
be |
proportion.test |
Decides whether proportion test for |
digits.perc |
Numeric that decides number of decimal places for
percentage labels (Default: |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
ratio |
A vector of proportions: the expected proportions for the
proportion test (should sum to |
conf.level |
Scalar between |
sampling.plan |
Character describing the sampling plan. Possible options:
|
fixed.margin |
For the independent multinomial sampling plan, which
margin is fixed ( |
prior.concentration |
Specifies the prior concentration parameter, set
to |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
legend.title |
Title text for the legend. |
ggtheme |
A |
package, palette |
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggpiestats.html
graphical element | geom used | argument for further modification |
pie slices | ggplot2::geom_col() | NA |
labels | ggplot2::geom_label() / ggrepel::geom_label_repel() | label.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Design | Test | Function used |
Parametric/Non-parametric | Unpaired | Pearson's chi-squared test | stats::chisq.test() |
Bayesian | Unpaired | Bayesian Pearson's chi-squared test | BayesFactor::contingencyTableBF() |
Parametric/Non-parametric | Paired | McNemar's chi-squared test | stats::mcnemar.test() |
Bayesian | Paired | No | No |
Effect size estimation
Type | Design | Effect size | CI available? | Function used |
Parametric/Non-parametric | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
Bayesian | Unpaired | Cramer's V | Yes | effectsize::cramers_v() |
Parametric/Non-parametric | Paired | Cohen's g | Yes | effectsize::cohens_g() |
Bayesian | Paired | No | No | No |
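For the unpaired design, the test and effect size in these tables can be reproduced roughly in base R. This is a sketch under stated assumptions: it uses the uncorrected textbook formula for Cramér's V, whereas effectsize::cramers_v() may apply a bias adjustment, so the numbers need not match the plot subtitle exactly.

```r
# Contingency table analysis: Pearson's chi-squared test plus Cramer's V.
tab <- table(vs = mtcars$vs, cyl = mtcars$cyl)

# Small expected counts trigger an approximation warning; silenced for the sketch
chi <- suppressWarnings(stats::chisq.test(tab))

n <- sum(tab)            # total number of observations
k <- min(dim(tab)) - 1   # min(rows, columns) - 1

# Uncorrected Cramer's V
v <- sqrt(unname(chi$statistic) / (n * k))

chi$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
chi$p.value
v
```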
Hypothesis testing
Type | Test | Function used |
Parametric/Non-parametric | Goodness of fit chi-squared test | stats::chisq.test() |
Bayesian | Bayesian Goodness of fit chi-squared test | (custom) |
Effect size estimation
Type | Effect size | CI available? | Function used |
Parametric/Non-parametric | Pearson's C | Yes | effectsize::pearsons_c() |
Bayesian | No | No | No |
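The goodness-of-fit case can be illustrated the same way. In this sketch the vector `ratio` plays the role of the `ratio` argument (expected proportions); the equal split is an assumption chosen for illustration, and Pearson's C is computed from its textbook formula rather than via effectsize::pearsons_c().

```r
# One-sample goodness-of-fit test on the counts of `vs` in mtcars.
counts <- table(mtcars$vs)
ratio <- c(0.5, 0.5)  # hypothetical expected proportions, must sum to 1

gof <- stats::chisq.test(counts, p = ratio)

# Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))
chi2 <- unname(gof$statistic)
pearsons_c <- sqrt(chi2 / (chi2 + sum(counts)))

gof$statistic
gof$p.value
pearsons_c
```

Changing `ratio` to unequal proportions tests the observed counts against that expectation instead of an even split.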
grouped_ggpiestats, ggbarstats, grouped_ggbarstats
# for reproducibility
set.seed(123)

# one sample goodness of fit proportion test
p <- ggpiestats(mtcars, vs)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)

# association test (or contingency table analysis)
ggpiestats(mtcars, vs, cyl)
Scatterplots from {ggplot2} combined with marginal distribution plots and statistical details.
ggscatterstats( data, x, y, type = "parametric", conf.level = 0.95, bf.prior = 0.707, bf.message = TRUE, tr = 0.2, digits = 2L, results.subtitle = TRUE, label.var = NULL, label.expression = NULL, marginal = TRUE, point.args = list(size = 3, alpha = 0.4, stroke = 0), point.width.jitter = 0, point.height.jitter = 0, point.label.args = list(size = 3, max.overlaps = 1e+06), smooth.line.args = list(linewidth = 1.5, color = "blue", method = "lm", formula = y ~ x), xsidehistogram.args = list(fill = "#009E73", color = "black", na.rm = TRUE), ysidehistogram.args = list(fill = "#D55E00", color = "black", na.rm = TRUE), xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, ggtheme = ggstatsplot::theme_ggstatsplot(), ggplot.component = NULL, ... )
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The column in |
y |
The column in |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
conf.level |
Scalar between |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
tr |
Trim level for the mean when carrying out |
digits |
Number of digits for rounding or significant figures. May also
be |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
label.var |
Variable to use for points labels entered as a symbol (e.g.
|
label.expression |
An expression evaluating to a logical vector that
determines the subset of data points to label (e.g. |
marginal |
Decides whether marginal distributions will be plotted on
axes using |
point.args |
A list of additional aesthetic arguments to be passed to
the |
point.width.jitter, point.height.jitter |
Degree of jitter in |
point.label.args |
A list of additional aesthetic arguments to be passed
to |
smooth.line.args |
A list of additional aesthetic arguments to be passed
to |
xsidehistogram.args, ysidehistogram.args |
A list of arguments passed to
respective |
xlab |
Label for |
ylab |
Labels for |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
caption |
The text for the plot caption. This argument is relevant only
if |
ggtheme |
A |
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggscatterstats.html
graphical element | geom used | argument for further modification |
raw data | ggplot2::geom_point() | point.args |
labels for raw data | ggrepel::geom_label_repel() | point.label.args |
smooth line | ggplot2::geom_smooth() | smooth.line.args |
marginal histograms | ggside::geom_xsidehistogram(), ggside::geom_ysidehistogram() | xsidehistogram.args, ysidehistogram.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing and Effect size estimation
Type | Test | CI available? | Function used |
Parametric | Pearson's correlation coefficient | Yes | correlation::correlation() |
Non-parametric | Spearman's rank correlation coefficient | Yes | correlation::correlation() |
Robust | Winsorized Pearson's correlation coefficient | Yes | correlation::correlation() |
Bayesian | Bayesian Pearson's correlation coefficient | Yes | correlation::correlation() |
The plot uses ggrepel::geom_label_repel() to keep labels from overlapping as much as possible. As a consequence, plotting will slow down considerably (and the plot file will grow in size) if you have many overlapping labels.
grouped_ggscatterstats, ggcorrmat, grouped_ggcorrmat
set.seed(123)

# creating a plot
p <- ggscatterstats(
  iris,
  x = Sepal.Width,
  y = Petal.Length,
  label.var = Species,
  label.expression = Sepal.Length > 7.6
) +
  ggplot2::geom_rug(sides = "b")

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)
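To preview which points an expression such as the one passed to `label.expression` would select, the same logical condition can be evaluated against the data in base R. This is only an illustration of the filtering idea and makes no claim about how the function evaluates the expression internally.

```r
# Evaluate the labelling condition from the example above against iris.
keep <- with(iris, Sepal.Length > 7.6)

# Rows that satisfy the condition, with the plotted and label variables
labelled <- iris[keep, c("Sepal.Width", "Petal.Length", "Species")]
labelled

# Number of points that would receive a label
sum(keep)
```

Checking the count first is a cheap way to avoid the slowdown noted above when many labels overlap.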
A combination of box and violin plots along with raw (unjittered) data points for within-subjects designs with statistical details included in the plot as a subtitle.
ggwithinstats( data, x, y, type = "parametric", pairwise.display = "significant", p.adjust.method = "holm", effsize.type = "unbiased", bf.prior = 0.707, bf.message = TRUE, results.subtitle = TRUE, xlab = NULL, ylab = NULL, caption = NULL, title = NULL, subtitle = NULL, digits = 2L, conf.level = 0.95, nboot = 100L, tr = 0.2, centrality.plotting = TRUE, centrality.type = type, centrality.point.args = list(size = 5, color = "darkred"), centrality.label.args = list(size = 3, nudge_x = 0.4, segment.linetype = 4), centrality.path = TRUE, centrality.path.args = list(linewidth = 1, color = "red", alpha = 0.5), point.args = list(size = 3, alpha = 0.5, na.rm = TRUE), point.path = TRUE, point.path.args = list(alpha = 0.5, linetype = "dashed"), boxplot.args = list(width = 0.2, alpha = 0.5, na.rm = TRUE), violin.args = list(width = 0.5, alpha = 0.2, na.rm = TRUE), ggsignif.args = list(textsize = 3, tip_length = 0.01, na.rm = TRUE), ggtheme = ggstatsplot::theme_ggstatsplot(), package = "RColorBrewer", palette = "Dark2", ggplot.component = NULL, ... )
data |
A data frame (or a tibble) from which variables specified are to
be taken. Other data types (e.g., matrix, table, array, etc.) will not
be accepted. Additionally, grouped data frames from |
x |
The grouping (or independent) variable from |
y |
The response (or outcome or dependent) variable from |
type |
A character specifying the type of statistical approach:
You can specify just the initial letter. |
pairwise.display |
Decides which pairwise comparisons to display. Available options are:
You can use this argument to make sure that your plot is not overly cluttered
when you have multiple groups being compared and scores of pairwise
comparisons being displayed. If set to |
p.adjust.method |
Adjustment method for p-values for multiple
comparisons. Possible methods are: |
effsize.type |
Type of effect size needed for parametric tests. The
argument can be |
bf.prior |
A number between |
bf.message |
Logical that decides whether to display Bayes Factor in
favor of the null hypothesis. This argument is relevant only for
parametric test (Default: |
results.subtitle |
Decides whether the results of statistical tests are
to be displayed as a subtitle (Default: |
xlab |
Label for |
ylab |
Labels for |
caption |
The text for the plot caption. This argument is relevant only
if |
title |
The text for the plot title. |
subtitle |
The text for the plot subtitle. Will work only if
|
digits |
Number of digits for rounding or significant figures. May also
be |
conf.level |
Scalar between |
nboot |
Number of bootstrap samples for computing confidence interval
for the effect size (Default: |
tr |
Trim level for the mean when carrying out |
centrality.plotting |
Logical that decides whether central tendency
measure is to be displayed as a point with a label (Default:
If you want default centrality parameter, you can specify this using
|
centrality.type |
Decides which centrality parameter is to be displayed.
The default is to choose the same as
Just as |
centrality.point.args, centrality.label.args |
A list of additional aesthetic
arguments to be passed to |
centrality.path.args, point.path.args |
A list of additional aesthetic
arguments passed on to |
point.args |
A list of additional aesthetic arguments to be passed to
the |
point.path, centrality.path |
Logical that decides whether individual
data points and means, respectively, should be connected using
|
boxplot.args |
A list of additional aesthetic arguments passed on to
|
violin.args |
A list of additional aesthetic arguments to be passed to
the |
ggsignif.args |
A list of additional aesthetic
arguments to be passed to |
ggtheme |
A |
package, palette |
Name of the package from which the given palette is to
be extracted. The available palettes and packages can be checked by running
|
ggplot.component |
A |
... |
Currently ignored. |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggwithinstats.html
graphical element | geom used | argument for further modification |
raw data | ggplot2::geom_point() | point.args |
point path | ggplot2::geom_path() | point.path.args |
box plot | ggplot2::geom_boxplot() | boxplot.args |
density plot | ggplot2::geom_violin() | violin.args |
centrality measure point | ggplot2::geom_point() | centrality.point.args |
centrality measure point path | ggplot2::geom_path() | centrality.path.args |
centrality measure label | ggrepel::geom_label_repel() | centrality.label.args |
pairwise comparisons | ggsignif::geom_signif() | ggsignif.args |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Type | Measure | Function used |
Parametric | mean | datawizard::describe_distribution() |
Non-parametric | median | datawizard::describe_distribution() |
Robust | trimmed mean | datawizard::describe_distribution() |
Bayesian | MAP | datawizard::describe_distribution() |
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | 2 | Student's or Welch's t-test | stats::t.test() |
Non-parametric | 2 | Mann-Whitney U test | stats::wilcox.test() |
Robust | 2 | Yuen's test for trimmed means | WRS2::yuen() |
Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::akp.effect() |
Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
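The two-group tests with independent samples listed above can be tried directly in base R. The sketch below compares mpg across the two levels of am in mtcars; the dataset and variables are chosen only for illustration.

```r
# Two independent groups: Welch's, Student's, and Mann-Whitney tests.

# Welch's t-test (the default: variances are not assumed equal)
welch <- stats::t.test(mpg ~ am, data = mtcars)

# Student's t-test (equal variances assumed)
student <- stats::t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# Mann-Whitney U test (exact = FALSE avoids the ties warning)
mwu <- stats::wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)

c(welch = welch$p.value, student = student$p.value, mann_whitney = mwu$p.value)
```

Welch's version reports fractional degrees of freedom, which is the visible difference from Student's test in the output.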
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | 2 | Student's t-test | stats::t.test() |
Non-parametric | 2 | Wilcoxon signed-rank test | stats::wilcox.test() |
Robust | 2 | Yuen's test on trimmed means for dependent samples | WRS2::yuend() |
Bayesian | 2 | Student's t-test | BayesFactor::ttestBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | 2 | Cohen's d, Hedges' g | Yes | effectsize::cohens_d(), effectsize::hedges_g() |
Non-parametric | 2 | r (rank-biserial correlation) | Yes | effectsize::rank_biserial() |
Robust | 2 | Algina-Keselman-Penfield robust standardized difference | Yes | WRS2::wmcpAKP() |
Bayesian | 2 | difference | Yes | bayestestR::describe_posterior() |
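For dependent samples, the same base R functions are used with `paired = TRUE`. This is a minimal sketch using the built-in sleep data, where each of the ten subjects was measured under both drugs. The vectors are split by hand, which relies on the rows being in the same subject order within each group.

```r
# Two dependent groups: paired t-test and Wilcoxon signed-rank test.
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

# Paired Student's t-test on the within-subject differences
paired_t <- stats::t.test(drug1, drug2, paired = TRUE)

# Wilcoxon signed-rank test (exact = FALSE avoids warnings from ties and zeros)
signed_rank <- stats::wilcox.test(drug1, drug2, paired = TRUE, exact = FALSE)

paired_t$parameter  # degrees of freedom: number of pairs - 1
paired_t$p.value
signed_rank$p.value
```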
The table below provides a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | > 2 | Fisher's or Welch's one-way ANOVA | stats::oneway.test() |
Non-parametric | > 2 | Kruskal-Wallis one-way ANOVA | stats::kruskal.test() |
Robust | > 2 | Heteroscedastic one-way ANOVA for trimmed means | WRS2::t1way() |
Bayes Factor | > 2 | Fisher's ANOVA | BayesFactor::anovaBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared() , effectsize::eta_squared() |
Non-parametric | > 2 | rank epsilon squared | Yes | effectsize::rank_epsilon_squared() |
Robust | > 2 | Explanatory measure of effect size | Yes | WRS2::t1way() |
Bayes Factor | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
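To make the "Fisher's or Welch's" distinction in the table above concrete, here is a minimal sketch that calls `stats::oneway.test()` both ways, followed by `stats::kruskal.test()`. The built-in `iris` data and the choice of `Sepal.Length ~ Species` are assumptions made for illustration only.

```r
# Fisher's one-way ANOVA: assumes equal variances across groups
fisher <- stats::oneway.test(Sepal.Length ~ Species, data = iris, var.equal = TRUE)

# Welch's one-way ANOVA: does not assume equal variances
welch <- stats::oneway.test(Sepal.Length ~ Species, data = iris, var.equal = FALSE)

# Non-parametric alternative: Kruskal-Wallis rank sum test
kw <- stats::kruskal.test(Sepal.Length ~ Species, data = iris)

# Degrees of freedom differ between the Fisher and Welch versions
fisher$parameter
welch$parameter
kw$parameter
```

The Welch version adjusts the denominator degrees of freedom, which is why its `parameter` element is generally not an integer.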
Hypothesis testing
Type | No. of groups | Test | Function used |
Parametric | > 2 | One-way repeated measures ANOVA | afex::aov_ez() |
Non-parametric | > 2 | Friedman rank sum test | stats::friedman.test() |
Robust | > 2 | Heteroscedastic one-way repeated measures ANOVA for trimmed means | WRS2::rmanova() |
Bayes Factor | > 2 | One-way repeated measures ANOVA | BayesFactor::anovaBF() |
Effect size estimation
Type | No. of groups | Effect size | CI available? | Function used |
Parametric | > 2 | partial eta-squared, partial omega-squared | Yes | effectsize::omega_squared() , effectsize::eta_squared() |
Non-parametric | > 2 | Kendall's coefficient of concordance | Yes | effectsize::kendalls_w() |
Robust | > 2 | Algina-Keselman-Penfield robust standardized difference average | Yes | WRS2::wmcpAKP() |
Bayes Factor | > 2 | Bayesian R-squared | Yes | performance::r2_bayes() |
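The Friedman rank sum test listed above can be sketched in base R as follows. The example treats each flower in the built-in `iris` data as a block and its four measurements as the repeated conditions; that framing is an assumption chosen for illustration, not something this documentation prescribes.

```r
# Rows are blocks (flowers), columns are the repeated conditions
measurements <- as.matrix(iris[, 1:4])

# Friedman rank sum test across the four conditions
res_f <- stats::friedman.test(measurements)

# Degrees of freedom equal the number of conditions minus one
res_f$parameter
res_f$p.value
```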
The tables below provide a summary of:
statistical test carried out for inferential statistics
type of effect size estimate and a measure of uncertainty for this estimate
functions used internally to compute these details
Hypothesis testing
Type | Equal variance? | Test | p-value adjustment? | Function used |
Parametric | No | Games-Howell test | Yes | PMCMRplus::gamesHowellTest() |
Parametric | Yes | Student's t-test | Yes | stats::pairwise.t.test() |
Non-parametric | No | Dunn test | Yes | PMCMRplus::kwAllPairsDunnTest() |
Robust | No | Yuen's trimmed means test | Yes | WRS2::lincon() |
Bayesian | NA | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
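The "p-value adjustment?" column can be illustrated with `stats::pairwise.t.test()`, the function listed for the equal-variance parametric case. This is a sketch on the built-in `iris` data (an assumption for illustration); it compares unadjusted p-values with Holm-adjusted ones.

```r
# Pairwise Student's t-tests with pooled SD (the default), no adjustment
raw <- stats::pairwise.t.test(
  iris$Sepal.Length, iris$Species,
  p.adjust.method = "none"
)

# Same comparisons with Holm's correction for multiple testing
holm <- stats::pairwise.t.test(
  iris$Sepal.Length, iris$Species,
  p.adjust.method = "holm"
)

# Lower-triangular matrices of p-values; unused cells are NA
raw$p.value
holm$p.value
```

Adjusted p-values are never smaller than the unadjusted ones, which is the point of the correction.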
Hypothesis testing
Type | Test | p-value adjustment? | Function used |
Parametric | Student's t-test | Yes | stats::pairwise.t.test() |
Non-parametric | Durbin-Conover test | Yes | PMCMRplus::durbinAllPairsTest() |
Robust | Yuen's trimmed means test | Yes | WRS2::rmmcp() |
Bayesian | Student's t-test | NA | BayesFactor::ttestBF() |
Effect size estimation
Not supported.
grouped_ggbetweenstats, ggbetweenstats, grouped_ggwithinstats
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)

# create a plot
p <- ggwithinstats(
  data = filter(bugs_long, condition %in% c("HDHF", "HDLF")),
  x = condition,
  y = desire,
  type = "np"
)

# looking at the plot
p

# extracting details from statistical tests
extract_stats(p)

# modifying defaults
ggwithinstats(
  data = bugs_long,
  x = condition,
  y = desire,
  type = "robust"
)

# you can remove a specific geom by setting `width` to `0` for that geom
ggbetweenstats(
  data = bugs_long,
  x = condition,
  y = desire,
  # to remove violin plot
  violin.args = list(width = 0, linewidth = 0),
  # to remove boxplot
  boxplot.args = list(width = 0),
  # to remove points
  point.args = list(alpha = 0)
)
Helper function for ggstatsplot::ggbarstats() to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots().
grouped_ggbarstats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggpiestats.html
ggbarstats, ggpiestats, grouped_ggpiestats
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)

# let's create a smaller data frame first
diamonds_short <- ggplot2::diamonds %>%
  filter(cut %in% c("Very Good", "Ideal")) %>%
  filter(clarity %in% c("SI1", "SI2", "VS1", "VS2")) %>%
  sample_frac(size = 0.05)

grouped_ggbarstats(
  data = diamonds_short,
  x = color,
  y = clarity,
  grouping.var = cut,
  plotgrid.args = list(nrow = 2)
)
Helper function for ggstatsplot::ggbetweenstats to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots.
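The idea of "apply across levels, then combine" can be sketched by hand. The code below is not the package's internal implementation; it is a simplified illustration that splits the data by a grouping variable, calls `ggbetweenstats()` once per level, and passes the resulting list to `combine_plots()`. The variable names `df`, `df_by_level`, `plots`, and `combined` are hypothetical, and the data subset mirrors the one used in this page's example.

```r
library(ggstatsplot)

# for reproducibility
set.seed(123)

# drop four-wheel-drive cars, leaving two levels of `drv`
df <- subset(ggplot2::mpg, drv != "4")

# one data frame per level of the grouping variable
df_by_level <- split(df, df$drv)

# one plot per level, titled with the level name
plots <- lapply(names(df_by_level), function(lvl) {
  ggbetweenstats(
    data = df_by_level[[lvl]],
    x = year,
    y = hwy,
    title = lvl
  )
})

# arrange the individual plots in a single row
combined <- combine_plots(
  plotlist = plots,
  plotgrid.args = list(nrow = 1)
)
```

In practice `grouped_ggbetweenstats()` with `grouping.var = drv` covers this use case in a single call; the manual version is mainly useful for seeing what the helper saves you from writing.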
grouped_ggbetweenstats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
ggbetweenstats, ggwithinstats, grouped_ggwithinstats
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

grouped_ggbetweenstats(
  data = filter(ggplot2::mpg, drv != "4"),
  x = year,
  y = hwy,
  grouping.var = drv
)

# modifying individual plots using `ggplot.component` argument
grouped_ggbetweenstats(
  data = filter(
    movies_long,
    genre %in% c("Action", "Comedy"),
    mpaa %in% c("R", "PG")
  ),
  x = genre,
  y = rating,
  grouping.var = mpaa,
  ggplot.component = scale_y_continuous(
    breaks = seq(1, 9, 1),
    limits = c(1, 9)
  )
)
Helper function for ggstatsplot::ggcorrmat() to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots().
grouped_ggcorrmat( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame from which variables specified are to be taken. |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggcorrmat.html
ggcorrmat, ggscatterstats, grouped_ggscatterstats
set.seed(123)

grouped_ggcorrmat(
  data = iris,
  grouping.var = Species,
  type = "robust",
  p.adjust.method = "holm",
  plotgrid.args = list(ncol = 1L),
  annotation.args = list(tag_levels = "i")
)
Helper function for ggstatsplot::ggdotplotstats to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots.
grouped_ggdotplotstats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggdotplotstats.html
grouped_gghistostats, ggdotplotstats, gghistostats
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)

# removing factor level with very few no. of observations
df <- filter(ggplot2::mpg, cyl %in% c("4", "6", "8"))

# plot
grouped_ggdotplotstats(
  data = df,
  x = cty,
  y = manufacturer,
  grouping.var = cyl,
  test.value = 15.5
)
Helper function for ggstatsplot::gghistostats to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots.
grouped_gghistostats( data, x, grouping.var, binwidth = NULL, plotgrid.args = list(), annotation.args = list(), ... )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
x | A numeric variable from the data frame |
grouping.var | A single grouping variable. |
binwidth | The width of the histogram bins. Can be specified as a numeric value, or a function that calculates width from |
plotgrid.args | A |
annotation.args | A |
... | Arguments passed on to |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/gghistostats.html
gghistostats, ggdotplotstats, grouped_ggdotplotstats
# for reproducibility
set.seed(123)

# plot
grouped_gghistostats(
  data = iris,
  x = Sepal.Length,
  test.value = 5,
  grouping.var = Species,
  plotgrid.args = list(nrow = 1),
  annotation.args = list(tag_levels = "i")
)
Helper function for ggstatsplot::ggpiestats to apply this function across multiple levels of a given factor and combine the resulting plots using ggstatsplot::combine_plots.
grouped_ggpiestats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggpiestats.html
ggbarstats, ggpiestats, grouped_ggbarstats
set.seed(123)

# grouped one-sample proportion test
grouped_ggpiestats(mtcars, x = cyl, grouping.var = am)
Grouped scatterplots from {ggplot2} combined with marginal distribution plots with statistical details added as a subtitle.
grouped_ggscatterstats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
For details, see: https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggscatterstats.html
ggscatterstats, ggcorrmat, grouped_ggcorrmat
# to ensure reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

grouped_ggscatterstats(
  data = filter(movies_long, genre == "Comedy" | genre == "Drama"),
  x = length,
  y = rating,
  type = "robust",
  grouping.var = genre,
  ggplot.component = list(geom_rug(sides = "b"))
)

# using labeling
# (also show how to modify basic plot from within function call)
grouped_ggscatterstats(
  data = filter(ggplot2::mpg, cyl != 5),
  x = displ,
  y = hwy,
  grouping.var = cyl,
  type = "robust",
  label.var = manufacturer,
  label.expression = hwy > 25 & displ > 2.5,
  ggplot.component = scale_y_continuous(sec.axis = dup_axis())
)

# labeling without expression
grouped_ggscatterstats(
  data = filter(movies_long, rating == 7, genre %in% c("Drama", "Comedy")),
  x = budget,
  y = length,
  grouping.var = genre,
  bf.message = FALSE,
  label.var = "title",
  annotation.args = list(tag_levels = "a")
)
A combined plot of comparison plots, one created for each level of a grouping variable.
grouped_ggwithinstats( data, ..., grouping.var, plotgrid.args = list(), annotation.args = list() )
data | A data frame (or a tibble) from which variables specified are to be taken. Other data types (e.g., matrix, table, array, etc.) will not be accepted. Additionally, grouped data frames from |
... | Arguments passed on to |
grouping.var | A single grouping variable. |
plotgrid.args | A |
annotation.args | A |
ggwithinstats, ggbetweenstats, grouped_ggbetweenstats
# for reproducibility
set.seed(123)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# the most basic function call
grouped_ggwithinstats(
  data = filter(bugs_long, condition %in% c("HDHF", "HDLF")),
  x = condition,
  y = desire,
  grouping.var = gender,
  type = "np",
  # additional modifications for **each** plot using `{ggplot2}` functions
  ggplot.component = scale_y_continuous(breaks = seq(0, 10, 1), limits = c(0, 10))
)
Edgar Anderson's Iris Data in long format.
iris_long
A data frame with 600 rows and 6 variables
id. Dummy identity number for each flower (150 flowers in total).
Species. The species are Iris setosa, versicolor, and virginica.
condition. Factor giving a detailed description of the attribute (four levels: "Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width").
attribute. What attribute is being measured ("Sepal" or "Petal").
measure. What aspect of the attribute is being measured ("Length" or "Width").
value. Value of the measurement.
This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.
This is a modified dataset from the {datasets} package.
dim(iris_long)
head(iris_long)
dplyr::glimpse(iris_long)
Movie information and user ratings from IMDB.com (long format).
movies_long
A data frame with 1,579 rows and 8 variables
title. Title of the movie.
year. Year of release.
budget. Total budget (if known) in US dollars.
length. Length in minutes.
rating. Average IMDB user rating.
votes. Number of IMDB users who rated this movie.
mpaa. MPAA rating.
genre. Different genres of movies (action, animation, comedy, drama, documentary, romance, short).
Modified dataset from the {ggplot2movies} package.
The Internet Movie Database (IMDB) is a website devoted to collecting movie data supplied by studios and fans. It claims to be the biggest movie database on the web and is run by Amazon.
https://CRAN.R-project.org/package=ggplot2movies
dim(movies_long)
head(movies_long)
dplyr::glimpse(movies_long)
{ggstatsplot}
Common theme used across all plots generated in {ggstatsplot} and assumed by the author to be aesthetically pleasing to the user. The theme is a wrapper around ggplot2::theme_bw().
All {ggstatsplot} functions have a ggtheme parameter that lets you choose a different theme.
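As a small sketch of the ggtheme parameter mentioned above, the call below swaps the default theme for `ggplot2::theme_minimal()` in a histogram. The choice of `gghistostats()`, the `iris` data, and `theme_minimal()` are illustrative assumptions; any complete {ggplot2} theme should be usable in the same position.

```r
library(ggstatsplot)

# for reproducibility
set.seed(123)

# replace the default theme with a different {ggplot2} theme
p <- gghistostats(
  data = iris,
  x = Sepal.Length,
  ggtheme = ggplot2::theme_minimal()
)

# display the plot
p
```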
theme_ggstatsplot()
A ggplot object.
library(ggplot2)

ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  theme_ggstatsplot()
Titanic dataset.
Titanic_full
A data frame with 2201 rows and 5 variables
id. Dummy identity number for each person.
Class. 1st, 2nd, 3rd, Crew.
Sex. Male, Female.
Age. Child, Adult.
Survived. No, Yes.
This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner 'Titanic', summarized according to economic status (class), sex, age and survival.
This is a modified dataset from the {datasets} package.
dim(Titanic_full)
head(Titanic_full)
dplyr::glimpse(Titanic_full)