The empirical cumulative distribution function (ECDF) provides an alternative
visualisation of distribution. Compared to other visualisations that rely on
density (like geom_histogram()
), the ECDF doesn't require any
tuning parameters and handles both continuous and categorical variables.
The downside is that it requires more training to accurately interpret,
and the underlying visual tasks are somewhat more challenging.
Usage
stat_ecdf(
mapping = NULL,
data = NULL,
geom = "step",
position = "identity",
...,
n = NULL,
pad = TRUE,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE
)
Arguments
- mapping
Set of aesthetic mappings created by
aes()
. If specified andinherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot. You must supplymapping
if there is no plot mapping.- data
The data to be displayed in this layer. There are three options:
If
NULL
, the default, the data is inherited from the plot data as specified in the call toggplot()
.A
data.frame
, or other object, will override the plot data. All objects will be fortified to produce a data frame. Seefortify()
for which variables will be created.A
function
will be called with a single argument, the plot data. The return value must be adata.frame
, and will be used as the layer data. Afunction
can be created from aformula
(e.g.~ head(.x, 10)
).- geom
The geometric object to use to display the data, either as a
ggproto
Geom
subclass or as a string naming the geom stripped of thegeom_
prefix (e.g."point"
rather than"geom_point"
)- position
Position adjustment, either as a string naming the adjustment (e.g.
"jitter"
to useposition_jitter
), or the result of a call to a position adjustment function. Use the latter if you need to change the settings of the adjustment.- ...
Other arguments passed on to
layer()
. These are often aesthetics, used to set an aesthetic to a fixed value, likecolour = "red"
orsize = 3
. They may also be parameters to the paired geom/stat.- n
if NULL, do not interpolate. If not NULL, this is the number of points to interpolate with.
- pad
If
TRUE
, pad the ecdf with additional points (-Inf, 0) and (Inf, 1)- na.rm
If
FALSE
(the default), removes missing values with a warning. IfTRUE
silently removes missing values.- show.legend
logical. Should this layer be included in the legends?
NA
, the default, includes if any aesthetics are mapped.FALSE
never includes, andTRUE
always includes. It can also be a named logical vector to finely select the aesthetics to display.- inherit.aes
If
FALSE
, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g.borders()
.
Details
The statistic relies on the aesthetics assignment to guess which variable to use as the input and which to use as the output. Either x or y must be provided and one of them must be unused. The ECDF will be calculated on the given aesthetic and will be output on the unused one.
Computed variables
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.