Reorder or Group layout based on hierarchical clustering
Source: R/align-dendrogram.R
align_dendro.Rd
Usage
align_dendro(
  mapping = aes(),
  ...,
  distance = "euclidean",
  method = "complete",
  use_missing = "pairwise.complete.obs",
  reorder_group = FALSE,
  k = NULL,
  h = NULL,
  plot_dendrogram = TRUE,
  plot_cut_height = NULL,
  root = NULL,
  center = FALSE,
  type = "rectangle",
  size = NULL,
  data = NULL,
  free_labs = waiver(),
  free_spaces = waiver(),
  plot_data = waiver(),
  set_context = TRUE,
  order = NULL,
  name = NULL
)
Arguments
- mapping
Additional default list of aesthetic mappings to use for plot.
- ...
Additional arguments passed to geom_segment().
- distance
A string naming the distance measure to use. This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". A correlation coefficient can also be used: "pearson", "spearman" or "kendall". In that case, 1 - cor is used as the distance. You can also provide a dist object directly, or a function that returns a dist object.
- method
A string naming the agglomeration method to use. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). You can also provide a function that returns a hclust object.
- use_missing
An optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". Only used when distance is a correlation coefficient string.
- reorder_group
A single boolean value indicating whether to perform hierarchical clustering between groups. Only used when previous groups have been established.
- k
An integer scalar giving the desired number of groups.
- h
A numeric scalar giving the height at which the tree should be cut.
- plot_dendrogram
A boolean value indicating whether to plot the dendrogram tree.
- plot_cut_height
A boolean value indicating whether to plot the cut height.
- root
A length-one string or number indicating the root branch.
- center
A boolean value. If TRUE, nodes are plotted centered with respect to the leaves in the branch. Otherwise (default), they are plotted in the middle of all direct child nodes.
- type
A string indicating the plot type, "rectangle" or "triangle".
- size
Plot size; can be a unit object.
- data
A matrix, a data frame, or even a simple vector that will be converted into a one-column matrix. If the data argument is set to NULL, the align_* function will use the layout data. The data argument can also accept a function (a purrr-like lambda is also okay), which will be applied to the layout data.
It is important to note that all align_* functions treat rows as observations. This means NROW(data) must match the size of the parallel layout axis.
  - layout_heatmap: for column annotation, the layout data will be transposed before use (if data is a function, it will be applied to the transposed matrix). This is necessary because column annotation uses heatmap columns as observations, but we need rows.
  - layout_stack: the layout data will be used as is, since all plots are placed along a single axis.
- free_labs
A boolean value or a string containing one or more of "t", "l", "b", and "r", indicating which axis titles should be free from alignment. If NULL, all axis titles will be aligned. Default: "tlbr".
- free_spaces
A boolean value or a string containing one or more of "t", "l", "b", and "r", indicating which border spaces should be removed. If NULL (default), no space will be removed.
- plot_data
A function used to transform the plot data before rendering. By default, it is inherited from the parent layout. If there is no parent layout, the default is NULL, which means nothing is modified. It is used to modify the data after the layout has been created, but before the data is handed off to ggplot2 for rendering. Use this hook if you need to change the default data for all geoms.
- set_context
A single boolean value indicating whether to set the active context to the current plot. If TRUE, all subsequent ggplot elements will be added to this plot.
- order
A single integer giving the layout order.
- name
A string giving the plot name. Used to switch the active context in hmanno() or stack_active().
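The distance, method, use_missing, k and h arguments read like thin wrappers around base R clustering tools. The sketch below is not the package's internal code; it is a base R illustration, under that assumption, of what each setting would correspond to. The matrix mat and all variable names are made up for the example.

```r
# Toy data (hypothetical): 6 observations in rows, 4 variables in columns.
set.seed(42)
mat <- matrix(rnorm(24), nrow = 6,
              dimnames = list(paste0("obs", 1:6), NULL))

# distance = "euclidean": pairwise distances between rows.
d_euclid <- dist(mat, method = "euclidean")

# distance = "pearson": cor() correlates columns, so transpose to correlate
# rows, then convert to a distance with 1 - cor. The `use` argument plays
# the role described for use_missing.
d_pearson <- as.dist(
  1 - cor(t(mat), method = "pearson", use = "pairwise.complete.obs")
)

# distance given as a function that returns a dist object.
manhattan_fun <- function(x) dist(x, method = "manhattan")
d_custom <- manhattan_fun(mat)

# method = "complete": agglomeration method handed to hclust().
tree <- hclust(d_euclid, method = "complete")

# k: cut the tree into a fixed number of groups.
groups_k <- cutree(tree, k = 3)

# h: cut the tree at a height; a height above the root yields one group.
groups_h <- cutree(tree, h = max(tree$height) + 1)
```

Supplying k fixes the number of groups, while h fixes the cut height and lets the number of groups follow from the tree.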
ggplot2 specification
align_dendro initializes the ggplot data and mapping.
Internally, a default mapping of aes(x = .data$x, y = .data$y) is always used.
The default ggplot data is the node coordinates. In addition, a geom_segment layer is added whose data is the edge coordinates of the tree segments.
The node and edge coordinates contain the following columns:
- index: the original index in the tree for the current node.
- label: node label text.
- x and y: x-axis and y-axis coordinates of the current node, or of the start node of the current edge.
- xend and yend: x-axis and y-axis coordinates of the terminal node of the current edge.
- branch: which branch the current node or edge belongs to. You can use this column to color different groups.
- panel: which panel the current node is in. If the plot is split into panels using facet_grid, this column shows which panel the current node or edge comes from. Note: some nodes may fall outside any panel (between two panels), so this column may contain NA values. A .panel column is also provided, which always gives the right branch for use with ggplot facets.
- .panel: see panel; this is the one most often used.
- panel1 and panel2: these have the same function as panel, but they are specific to the edge data and correspond to the two nodes of each edge.
- leaf: a logical value indicating whether the current node is a leaf.