diff --git a/NEWS.md b/NEWS.md
index 50e0a0b8..b3e8922c 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,6 @@
 # bayesplot (development version)
 
+* Documentation added for all exported `*_data()` functions (#209)
 * Improved documentation for `binwidth`, `bins`, and `breaks` arguments to clarify they are passed to `ggplot2::geom_area()` and `ggdist::stat_dots()` in addition to `ggplot2::geom_histogram()`
 * Improved documentation for `freq` argument to clarify it applies to frequency polygons in addition to histograms
 * Fixed test in `test-ppc-distributions.R` that incorrectly used `ppc_dens()` instead of `ppd_dens()` when testing PPD functions
diff --git a/R/mcmc-diagnostics.R b/R/mcmc-diagnostics.R
index 3dc1b324..2727928f 100644
--- a/R/mcmc-diagnostics.R
+++ b/R/mcmc-diagnostics.R
@@ -18,6 +18,13 @@
 #'
 #' @section Plot Descriptions:
 #' \describe{
+#' \item{`mcmc_rhat_data()`, `mcmc_neff_data()`}{
+#' Data-preparation back ends for the R-hat and effective sample size plots.
+#' Users can call these functions directly to obtain the data frame of
+#' diagnostic values with rating labels and create custom diagnostic
+#' visualizations with **ggplot2**.
+#' }
+#'
 #' \item{`mcmc_rhat()`, `mcmc_rhat_hist()`}{
 #' Rhat values as either points or a histogram. Values are colored using
 #' different shades (lighter is better). The chosen thresholds are somewhat
diff --git a/R/mcmc-distributions.R b/R/mcmc-distributions.R
index aa0eff84..fbb34ad2 100644
--- a/R/mcmc-distributions.R
+++ b/R/mcmc-distributions.R
@@ -15,7 +15,7 @@
 #' @param ... For dot plots, optional additional arguments to pass to [ggdist::stat_dots()].
 #' @param alpha Passed to the geom to control the transparency.
 #'
-#' @template return-ggplot
+#' @template return-ggplot-or-data
 #'
 #' @section Plot Descriptions:
 #' \describe{
@@ -48,6 +48,12 @@
 #' appear in separate facets; in `mcmc_dens_chains()` they appear in the
 #' same panel and can overlap vertically.
 #' }
+#' \item{`mcmc_dens_chains_data()`}{
+#' Data-preparation back end for `mcmc_dens_chains()`. Users can call this
+#' function directly to obtain the prepared long-format data frame of MCMC
+#' draws (with chain information retained) and create custom visualizations
+#' with **ggplot2**.
+#' }
 #' }
 #'
 #' @examples
diff --git a/R/mcmc-intervals.R b/R/mcmc-intervals.R
index 13687846..1ab9298a 100644
--- a/R/mcmc-intervals.R
+++ b/R/mcmc-intervals.R
@@ -51,6 +51,13 @@
 #' ridgelines. This plot provides a compact display of (hierarchically)
 #' related distributions.
 #' }
+#' \item{`mcmc_intervals_data()`, `mcmc_areas_data()`, `mcmc_areas_ridges_data()`}{
+#' Data-preparation back ends for `mcmc_intervals()`, `mcmc_areas()`, and
+#' `mcmc_areas_ridges()`, respectively. Users can call these functions
+#' directly to obtain the prepared data frames of posterior interval
+#' summaries and create custom interval or density-area visualizations
+#' with **ggplot2**.
+#' }
 #' }
 #'
 #' @examples
diff --git a/R/mcmc-parcoord.R b/R/mcmc-parcoord.R
index f3718e18..6891ed46 100644
--- a/R/mcmc-parcoord.R
+++ b/R/mcmc-parcoord.R
@@ -48,6 +48,12 @@
 #' when specifying the `transformations` argument to
 #' `mcmc_parcoord`. See the **Examples** section for how to do this.
 #' }
+#' \item{`mcmc_parcoord_data()`}{
+#' Data-preparation back end for `mcmc_parcoord()`. Users can call
+#' `mcmc_parcoord_data()` directly to obtain the prepared long-format data
+#' frame of MCMC draws (with optional NUTS diagnostic information) and
+#' create custom visualizations with **ggplot2**.
+#' }
 #' }
 #'
 #' @template reference-vis-paper
diff --git a/R/mcmc-traces.R b/R/mcmc-traces.R
index 57449883..cad857ae 100644
--- a/R/mcmc-traces.R
+++ b/R/mcmc-traces.R
@@ -72,6 +72,13 @@
 #' ECDFs and the theoretical expectation for samples originating from the
 #' same distribution is drawn. See Säilynoja et al. (2021) for details.
 #' }
+#' \item{`mcmc_trace_data()`}{
+#' Data-preparation back end for `mcmc_trace()`, `mcmc_trace_highlight()`,
+#' `mcmc_rank_hist()`, `mcmc_rank_overlay()`, and `mcmc_rank_ecdf()`. The
+#' returned data frame contains columns for both the original draw values
+#' and their within-parameter ranks, so it can be used to build both trace
+#' and rank-based visualizations with **ggplot2**.
+#' }
 #' }
 #'
 #' @template reference-improved-rhat
diff --git a/R/ppc-discrete.R b/R/ppc-discrete.R
index c346247f..2c3f54a9 100644
--- a/R/ppc-discrete.R
+++ b/R/ppc-discrete.R
@@ -45,6 +45,11 @@
 #' Same as `ppc_bars()` but a separate plot (facet) is generated for each
 #' level of a grouping variable.
 #' }
+#' \item{`ppc_bars_data()`}{
+#' Data-preparation back end for `ppc_bars()` and `ppc_bars_grouped()`.
+#' Users can call `ppc_bars_data()` directly to obtain the prepared data
+#' frame and create custom visualizations with **ggplot2**.
+#' }
 #' \item{`ppc_rootogram()`}{
 #' Rootograms allow for diagnosing problems in count data models such as
 #' overdispersion or excess zeros. In `standing`, `hanging`, and `suspended`
diff --git a/R/ppc-distributions.R b/R/ppc-distributions.R
index d86a8b4c..d32a2e1a 100644
--- a/R/ppc-distributions.R
+++ b/R/ppc-distributions.R
@@ -31,7 +31,7 @@
 #' \item{`ppc_dots()`}{
 #' A dot plot plot is displayed for `y` and each dataset (row) in `yrep`.
 #' For these plots `yrep` should therefore contain only a small number of rows.
-#' See the **Examples** section. This function requires [ggdist::stat_dots] to be installed.
+#' See the **Examples** section.
 #' }
 #' \item{`ppc_freqpoly_grouped()`}{
 #' A separate frequency polygon is plotted for each level of a grouping
@@ -59,7 +59,16 @@
 #' confidence intervals are provided to asses if `y` and `yrep` originate
 #' from the same distribution. The PIT values can also be provided directly
 #' as `pit`.
-#' See Säilynoja et al. (2021) for more details.}
+#' See Säilynoja et al. (2021) for more details.
+#' }
+#' \item{`ppc_data()`}{
+#' This function prepares data for plotting with **ggplot2** and doesn't
+#' itself make any plots. Users can call it directly to obtain the underlying
+#' data frame that (in most cases) is passed to **ggplot2**. This is useful
+#' when you want to customize the appearance of PPC plots beyond what the
+#' built-in plotting functions allow, or when you want to construct new types
+#' of PPC visualizations based on the same underlying data.
+#' }
 #' }
 #'
 #' @template reference-vis-paper
diff --git a/R/ppc-errors.R b/R/ppc-errors.R
index 31ecfb2b..f0012f24 100644
--- a/R/ppc-errors.R
+++ b/R/ppc-errors.R
@@ -67,9 +67,15 @@
 #' this plot `y` and `yrep` should contain proportions rather than counts,
 #' and `yrep` should have only a small number of rows.
 #' }
+#' \item{`ppc_error_data()`}{
+#' Data-preparation back end for the `ppc_error_*()` family of plotting
+#' functions. Users can call `ppc_error_data()` directly to obtain the
+#' data frame of predictive errors (`y - yrep`) and create custom error
+#' visualizations with **ggplot2**.
+#' }
 #' }
 #'
-#' @template return-ggplot
+#' @template return-ggplot-or-data
 #'
 #' @templateVar bdaRef (Ch. 6)
 #' @template reference-bda
diff --git a/R/ppc-intervals.R b/R/ppc-intervals.R
index d41bf90f..b41be3b0 100644
--- a/R/ppc-intervals.R
+++ b/R/ppc-intervals.R
@@ -50,6 +50,13 @@
 #' Same as `ppc_intervals()` and `ppc_ribbon()`, respectively, but a
 #' separate plot (facet) is generated for each level of a grouping variable.
 #' }
+#' \item{`ppc_intervals_data()`, `ppc_ribbon_data()`}{
+#' Data-preparation back end for `ppc_intervals()`, `ppc_ribbon()`, and
+#' their grouped variants. `ppc_ribbon_data()` is an alias for
+#' `ppc_intervals_data()`. Users can call either function directly to
+#' obtain the prepared data frame and create custom interval or ribbon
+#' visualizations with **ggplot2**.
+#' }
 #' }
 #'
 #' @examples
diff --git a/R/ppc-loo.R b/R/ppc-loo.R
index 88a6692c..e0f93039 100644
--- a/R/ppc-loo.R
+++ b/R/ppc-loo.R
@@ -23,10 +23,21 @@
 #' `ppc_loo_ribbon()`, `alpha` and `size` are passed to
 #' [ggplot2::geom_ribbon()].
 #'
-#' @template return-ggplot
+#' @template return-ggplot-or-data
 #'
 #' @section Plot Descriptions:
 #' \describe{
+#' \item{`ppc_loo_pit_data()`}{
+#' This function prepares LOO-PIT data for plotting with **ggplot2**. It is
+#' the data-preparation back end for the LOO-PIT plotting functions
+#' (`ppc_loo_pit_overlay()`, `ppc_loo_pit_qq()`, and `ppc_loo_pit_ecdf()`),
+#' and users can call it directly to create custom LOO-PIT plots using
+#' ggplot2. The function computes the leave-one-out probability integral
+#' transform (LOO-PIT) values and returns a data frame that can be used to
+#' build ggplot objects. This is useful when you want to create custom
+#' visualizations of LOO-PIT values beyond what the built-in plotting
+#' functions provide.
+#' }
 #' \item{`ppc_loo_pit_overlay()`, `ppc_loo_pit_qq()`, `ppc_loo_pit_ecdf()`}{
 #' The calibration of marginal predictions can be assessed using probability
 #' integral transformation (PIT) checks. LOO improves the check by avoiding the
diff --git a/R/ppc-scatterplots.R b/R/ppc-scatterplots.R
index 50802f82..c6fb13a8 100644
--- a/R/ppc-scatterplots.R
+++ b/R/ppc-scatterplots.R
@@ -45,6 +45,14 @@
 #' The same as `ppc_scatter_avg()`, but a separate plot is generated for
 #' each level of a grouping variable.
 #' }
+#' \item{`ppc_scatter_data()`, `ppc_scatter_avg_data()`}{
+#' Data-preparation back ends for the `ppc_scatter*()` family of plotting
+#' functions. `ppc_scatter_data()` returns a data frame with one row per
+#' observation per `yrep` draw, while `ppc_scatter_avg_data()` returns a
+#' data frame with one row per observation summarising `yrep` draws with
+#' the chosen `stat`. Users can call these functions directly to create
+#' custom visualizations with **ggplot2**.
+#' }
 #' }
 #'
 #' @examples
diff --git a/R/ppc-test-statistics.R b/R/ppc-test-statistics.R
index ecaabfd4..b4d2043a 100644
--- a/R/ppc-test-statistics.R
+++ b/R/ppc-test-statistics.R
@@ -55,6 +55,13 @@
 #' computed over the datasets (rows) in `yrep`. The value of the
 #' statistics in the observed data is overlaid as large point.
 #' }
+#' \item{`ppc_stat_data()`}{
+#' Data-preparation back end for `ppc_stat()`, `ppc_stat_freqpoly()`, and
+#' their grouped variants. Users can call `ppc_stat_data()` directly to
+#' obtain the data frame of test-statistic values computed from `y` and
+#' each row of `yrep`, enabling custom test-statistic visualizations with
+#' **ggplot2**.
+#' }
 #' }
 #'
 #' @examples
diff --git a/man/MCMC-diagnostics.Rd b/man/MCMC-diagnostics.Rd
index b396717e..35e589dc 100644
--- a/man/MCMC-diagnostics.Rd
+++ b/man/MCMC-diagnostics.Rd
@@ -111,6 +111,13 @@ also \link{MCMC-nuts} for additional MCMC diagnostic plots.
 \section{Plot Descriptions}{
 
 \describe{
+\item{\code{mcmc_rhat_data()}, \code{mcmc_neff_data()}}{
+Data-preparation back ends for the R-hat and effective sample size plots.
+Users can call these functions directly to obtain the data frame of
+diagnostic values with rating labels and create custom diagnostic
+visualizations with \strong{ggplot2}.
+}
+
 \item{\code{mcmc_rhat()}, \code{mcmc_rhat_hist()}}{
 Rhat values as either points or a histogram. Values are colored using
 different shades (lighter is better). The chosen thresholds are somewhat
diff --git a/man/MCMC-distributions.Rd b/man/MCMC-distributions.Rd
index ec69c799..71be98d7 100644
--- a/man/MCMC-distributions.Rd
+++ b/man/MCMC-distributions.Rd
@@ -224,7 +224,10 @@ The default is \code{quantiles=100} so that each dot represent one percent of
 posterior mass.}
 }
 \value{
-A ggplot object that can be further customized using the \strong{ggplot2} package.
+The plotting functions return a ggplot object that can be further
+customized using the \strong{ggplot2} package. The functions with suffix
+\verb{_data()} return the data that would have been drawn by the plotting
+function.
 }
 \description{
 Various types of histograms and kernel density plots of MCMC draws. See the
@@ -262,6 +265,12 @@ but overlaid on a single plot. In \code{mcmc_dens_overlay()} parameters
 appear in separate facets; in \code{mcmc_dens_chains()} they appear in the
 same panel and can overlap vertically.
 }
+\item{\code{mcmc_dens_chains_data()}}{
+Data-preparation back end for \code{mcmc_dens_chains()}. Users can call this
+function directly to obtain the prepared long-format data frame of MCMC
+draws (with chain information retained) and create custom visualizations
+with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/MCMC-intervals.Rd b/man/MCMC-intervals.Rd
index c0c1605a..b98f835e 100644
--- a/man/MCMC-intervals.Rd
+++ b/man/MCMC-intervals.Rd
@@ -214,6 +214,13 @@ Density plot, as in \code{mcmc_areas()}, but drawn with overlapping
 ridgelines. This plot provides a compact display of (hierarchically)
 related distributions.
 }
+\item{\code{mcmc_intervals_data()}, \code{mcmc_areas_data()}, \code{mcmc_areas_ridges_data()}}{
+Data-preparation back ends for \code{mcmc_intervals()}, \code{mcmc_areas()}, and
+\code{mcmc_areas_ridges()}, respectively. Users can call these functions
+directly to obtain the prepared data frames of posterior interval
+summaries and create custom interval or density-area visualizations
+with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/MCMC-parcoord.Rd b/man/MCMC-parcoord.Rd
index 3ea32731..cbb7d294 100644
--- a/man/MCMC-parcoord.Rd
+++ b/man/MCMC-parcoord.Rd
@@ -131,6 +131,12 @@ all variables before plotting you could use function \code{(x - mean(x))/sd(x)}
 when specifying the \code{transformations} argument to
 \code{mcmc_parcoord}. See the \strong{Examples} section for how to do this.
 }
+\item{\code{mcmc_parcoord_data()}}{
+Data-preparation back end for \code{mcmc_parcoord()}. Users can call
+\code{mcmc_parcoord_data()} directly to obtain the prepared long-format data
+frame of MCMC draws (with optional NUTS diagnostic information) and
+create custom visualizations with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/MCMC-traces.Rd b/man/MCMC-traces.Rd
index 8d578352..9a280361 100644
--- a/man/MCMC-traces.Rd
+++ b/man/MCMC-traces.Rd
@@ -258,6 +258,13 @@ is, bands that completely cover all of the rank ECDFs with the probability
 ECDFs and the theoretical expectation for samples originating from the
 same distribution is drawn. See Säilynoja et al. (2021) for details.
 }
+\item{\code{mcmc_trace_data()}}{
+Data-preparation back end for \code{mcmc_trace()}, \code{mcmc_trace_highlight()},
+\code{mcmc_rank_hist()}, \code{mcmc_rank_overlay()}, and \code{mcmc_rank_ecdf()}. The
+returned data frame contains columns for both the original draw values
+and their within-parameter ranks, so it can be used to build both trace
+and rank-based visualizations with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/PPC-discrete.Rd b/man/PPC-discrete.Rd
index d996cd95..9da2fa17 100644
--- a/man/PPC-discrete.Rd
+++ b/man/PPC-discrete.Rd
@@ -123,6 +123,11 @@ superimposed on the bars.
 Same as \code{ppc_bars()} but a separate plot (facet) is generated for each
 level of a grouping variable.
 }
+\item{\code{ppc_bars_data()}}{
+Data-preparation back end for \code{ppc_bars()} and \code{ppc_bars_grouped()}.
+Users can call \code{ppc_bars_data()} directly to obtain the prepared data
+frame and create custom visualizations with \strong{ggplot2}.
+}
 \item{\code{ppc_rootogram()}}{
 Rootograms allow for diagnosing problems in count data models such as
 overdispersion or excess zeros. In \code{standing}, \code{hanging}, and \code{suspended}
diff --git a/man/PPC-distributions.Rd b/man/PPC-distributions.Rd
index ce0f264a..b0099f2d 100644
--- a/man/PPC-distributions.Rd
+++ b/man/PPC-distributions.Rd
@@ -273,7 +273,7 @@ contain only a small number of rows. See the \strong{Examples} section.
 \item{\code{ppc_dots()}}{
 A dot plot plot is displayed for \code{y} and each dataset (row) in \code{yrep}.
 For these plots \code{yrep} should therefore contain only a small number of rows.
-See the \strong{Examples} section. This function requires \link[ggdist:stat_dots]{ggdist::stat_dots} to be installed.
+See the \strong{Examples} section.
 }
 \item{\code{ppc_freqpoly_grouped()}}{
 A separate frequency polygon is plotted for each level of a grouping
@@ -301,7 +301,16 @@ the corresponding \code{yrep} values. \code{100 * prob}\% central simultaneous
 confidence intervals are provided to asses if \code{y} and \code{yrep} originate
 from the same distribution. The PIT values can also be provided directly
 as \code{pit}.
-See Säilynoja et al. (2021) for more details.}
+See Säilynoja et al. (2021) for more details.
+}
+\item{\code{ppc_data()}}{
+This function prepares data for plotting with \strong{ggplot2} and doesn't
+itself make any plots. Users can call it directly to obtain the underlying
+data frame that (in most cases) is passed to \strong{ggplot2}. This is useful
+when you want to customize the appearance of PPC plots beyond what the
+built-in plotting functions allow, or when you want to construct new types
+of PPC visualizations based on the same underlying data.
+}
 }
 }
 
diff --git a/man/PPC-errors.Rd b/man/PPC-errors.Rd
index 8e3d28c7..e23b29e9 100644
--- a/man/PPC-errors.Rd
+++ b/man/PPC-errors.Rd
@@ -132,7 +132,10 @@ return a scalar statistic. The function name is displayed in the
 axis-label. Defaults to \code{"mean"}.}
 }
 \value{
-A ggplot object that can be further customized using the \strong{ggplot2} package.
+The plotting functions return a ggplot object that can be further
+customized using the \strong{ggplot2} package. The functions with suffix
+\verb{_data()} return the data that would have been drawn by the plotting
+function.
 }
 \description{
 Various plots of predictive errors \code{y - yrep}. See the
@@ -186,6 +189,12 @@ to \code{arm::binnedplot()}) is generated for each dataset (row) in \code{yrep}.
 this plot \code{y} and \code{yrep} should contain proportions rather than counts,
 and \code{yrep} should have only a small number of rows.
 }
+\item{\code{ppc_error_data()}}{
+Data-preparation back end for the \verb{ppc_error_*()} family of plotting
+functions. Users can call \code{ppc_error_data()} directly to obtain the
+data frame of predictive errors (\code{y - yrep}) and create custom error
+visualizations with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/PPC-intervals.Rd b/man/PPC-intervals.Rd
index ddac2ebc..6ad715e0 100644
--- a/man/PPC-intervals.Rd
+++ b/man/PPC-intervals.Rd
@@ -158,6 +158,13 @@ to read than the other.
 Same as \code{ppc_intervals()} and \code{ppc_ribbon()}, respectively, but a
 separate plot (facet) is generated for each level of a grouping variable.
 }
+\item{\code{ppc_intervals_data()}, \code{ppc_ribbon_data()}}{
+Data-preparation back end for \code{ppc_intervals()}, \code{ppc_ribbon()}, and
+their grouped variants. \code{ppc_ribbon_data()} is an alias for
+\code{ppc_intervals_data()}. Users can call either function directly to
+obtain the prepared data frame and create custom interval or ribbon
+visualizations with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/PPC-loo.Rd b/man/PPC-loo.Rd
index 555898d8..7e9ebe02 100644
--- a/man/PPC-loo.Rd
+++ b/man/PPC-loo.Rd
@@ -223,7 +223,10 @@ order of the observations. The alternative (\code{"median"}) arranges them
 by median value from smallest (left) to largest (right).}
 }
 \value{
-A ggplot object that can be further customized using the \strong{ggplot2} package.
+The plotting functions return a ggplot object that can be further
+customized using the \strong{ggplot2} package. The functions with suffix
+\verb{_data()} return the data that would have been drawn by the plotting
+function.
 }
 \description{
 Leave-One-Out (LOO) predictive checks. See the \strong{Plot Descriptions} section,
@@ -233,6 +236,17 @@ for details.
 \section{Plot Descriptions}{
 
 \describe{
+\item{\code{ppc_loo_pit_data()}}{
+This function prepares LOO-PIT data for plotting with \strong{ggplot2}. It is
+the data-preparation back end for the LOO-PIT plotting functions
+(\code{ppc_loo_pit_overlay()}, \code{ppc_loo_pit_qq()}, and \code{ppc_loo_pit_ecdf()}),
+and users can call it directly to create custom LOO-PIT plots using
+ggplot2. The function computes the leave-one-out probability integral
+transform (LOO-PIT) values and returns a data frame that can be used to
+build ggplot objects. This is useful when you want to create custom
+visualizations of LOO-PIT values beyond what the built-in plotting
+functions provide.
+}
 \item{\code{ppc_loo_pit_overlay()}, \code{ppc_loo_pit_qq()}, \code{ppc_loo_pit_ecdf()}}{
 The calibration of marginal predictions can be assessed using probability
 integral transformation (PIT) checks. LOO improves the check by avoiding the
diff --git a/man/PPC-scatterplots.Rd b/man/PPC-scatterplots.Rd
index 2c22bbd3..2727b38c 100644
--- a/man/PPC-scatterplots.Rd
+++ b/man/PPC-scatterplots.Rd
@@ -116,6 +116,14 @@ is a summary statistic. Unlike for \code{ppc_scatter()}, for
 The same as \code{ppc_scatter_avg()}, but a separate plot is generated for
 each level of a grouping variable.
 }
+\item{\code{ppc_scatter_data()}, \code{ppc_scatter_avg_data()}}{
+Data-preparation back ends for the \verb{ppc_scatter*()} family of plotting
+functions. \code{ppc_scatter_data()} returns a data frame with one row per
+observation per \code{yrep} draw, while \code{ppc_scatter_avg_data()} returns a
+data frame with one row per observation summarising \code{yrep} draws with
+the chosen \code{stat}. Users can call these functions directly to create
+custom visualizations with \strong{ggplot2}.
+}
 }
 }
 
diff --git a/man/PPC-test-statistics.Rd b/man/PPC-test-statistics.Rd
index 3f0bf3e0..4c39620b 100644
--- a/man/PPC-test-statistics.Rd
+++ b/man/PPC-test-statistics.Rd
@@ -159,6 +159,13 @@ A scatterplot showing the joint distribution of two statistics
 computed over the datasets (rows) in \code{yrep}. The value of the
 statistics in the observed data is overlaid as large point.
 }
+\item{\code{ppc_stat_data()}}{
+Data-preparation back end for \code{ppc_stat()}, \code{ppc_stat_freqpoly()}, and
+their grouped variants. Users can call \code{ppc_stat_data()} directly to
+obtain the data frame of test-statistic values computed from \code{y} and
+each row of \code{yrep}, enabling custom test-statistic visualizations with
+\strong{ggplot2}.
+}
 }
 }
 
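
Reviewer note (not part of the patch): the documentation added in this diff describes the `*_data()` functions as back ends whose output can feed custom **ggplot2** plots. A minimal sketch of that workflow with `mcmc_intervals_data()` might look like the following. It assumes the package's bundled `example_mcmc_draws()` helper for input, the interval columns `ll`, `l`, `m`, `h`, `hh` in the returned data frame, and ggplot2 >= 3.4.0 for the `linewidth` argument; the plot styling itself is purely illustrative and is not how `mcmc_intervals()` draws its output.

```r
library(bayesplot)
library(ggplot2)

# Example draws shipped with bayesplot (iterations x chains x parameters).
x <- example_mcmc_draws(chains = 4, params = 4)

# Interval summaries only; no plot is produced at this step.
d <- mcmc_intervals_data(x, prob = 0.5, prob_outer = 0.9)

# Hand-rolled interval plot built from the returned data frame:
# thin segment = outer interval, thick segment = inner interval,
# point = point estimate (the median by default).
p <- ggplot(d, aes(y = parameter)) +
  geom_segment(aes(x = ll, xend = hh, yend = parameter)) +
  geom_segment(aes(x = l, xend = h, yend = parameter), linewidth = 2) +
  geom_point(aes(x = m), size = 3) +
  labs(x = NULL, y = NULL)

print(p)
```

The same pattern should carry over to the other `*_data()` functions listed in the diff, although their returned columns differ, so it is worth inspecting the data frame (for example with `str(d)`) before mapping aesthetics.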