This collection of functions computes the distance between two persistence diagrams of the same homology dimension. The diagrams must be represented as 2-column matrices. The first column of the matrix contains the birth times and the second column contains the death times of the points.
Arguments
- x
Either a matrix of shape \(n \times 2\) or an object of class persistence specifying the first persistence diagram.
- y
Either a matrix of shape \(m \times 2\) or an object of class persistence specifying the second persistence diagram.
- tol
A numeric value specifying the relative error. Defaults to
sqrt(.Machine$double.eps)
. For the Bottleneck distance, it can be set to0.0
in which case the exact Bottleneck distance is computed, while an approximate Bottleneck distance is computed iftol > 0.0
. For the Wasserstein distance, it must be strictly positive.- validate
A boolean value specifying whether to validate the input persistence diagrams. Defaults to
TRUE
. IfFALSE
, the function will not check if the input persistence diagrams are valid. This can be useful for performance reasons, but it is recommended to keep itTRUE
for safety.- dimension
An integer value specifying the homology dimension for which to compute the distance. Defaults to
0L
. This is only used ifx
andy
are objects of class persistence.- p
A numeric value specifying the power for the Wasserstein distance. Defaults to
1.0
.
Value
A numeric value storing either the Bottleneck or the Wasserstein distance between the two persistence diagrams.
Details
A matching \(\varphi : D_1 \to D_2\) between persistence diagrams is a bijection of multisets, where both diagrams are assumed to have all points on the diagonal with infinite multiplicity. The \(p\)-Wasserstein distance between \(D_1\) and \(D_2\) is defined as the infimum over all matchings of the expression
$$W_p(D_1,D_2) = \inf_{\varphi: D_1 \to D_2} \left( \sum_{x \in D_1}{\lVert x - \varphi(x) \rVert^p} \right)^{\frac{1}{p}}$$
that can be thought of as the Minkowski distance between the diagrams viewed as vectors on the shared coordinates defined by the matching \(\varphi\). The norm \(\lVert \cdot \rVert\) can be arbitrary; as implemented here, it is the infinity norm \(\lVert (x_1,x_2) \rVert_\infty = \max(x_1,x_2)\). In the limit \(p \to \infty\), the Wasserstein distance becomes the bottleneck distance:
$$B(D_1,D_2) = \inf_{\varphi: D_1 \to D_2} \sup_{x \in D_1}{\lVert x - \varphi(x) \rVert}.$$
The Wasserstein metric is also called the Kantorovich metric in recognition of the originator of the metric.
Examples
bottleneck_distance(
persistence_sample[[1]]$pairs[[1]],
persistence_sample[[2]]$pairs[[1]]
)
#> [1] 0.06197453
bottleneck_distance(
persistence_sample[[1]],
persistence_sample[[2]]
)
#> [1] 0.06197453
wasserstein_distance(
persistence_sample[[1]]$pairs[[1]],
persistence_sample[[2]]$pairs[[1]]
)
#> [1] 1.53403
wasserstein_distance(
persistence_sample[[1]],
persistence_sample[[2]]
)
#> [1] 1.53403