| Title: | Partially Observed Integrated Functional Depth |
|---|---|
| Description: | Integrated Functional Depth for Partially Observed Functional Data and applications to visualization, outlier detection and classification. It implements the methods proposed in: Elías, A., Jiménez, R., Paganoni, A. M. and Sangalli, L. M., (2023), "Integrated Depth for Partially Observed Functional Data", Journal of Computational and Graphical Statistics, <doi:10.1080/10618600.2022.2070171>. Elías, A., Jiménez, R., & Shang, H. L. (2023), "Depth-based reconstruction method for incomplete functional data", Computational Statistics, <doi:10.1007/s00180-022-01282-9>. Elías, A., Nagy, S. (2024), "Statistical properties of partially observed integrated functional depths", TEST, <doi:10.1007/s11749-024-00954-6>. |
| Authors: | Antonio Elías [aut, cre], Raúl Jiménez [ctb], Anna M. Paganoni [ctb], Laura M. Sangalli [ctb], Stanislav Nagy [ctb], Maximiliam Ofner [ctb] |
| Maintainer: | Antonio Elías <[email protected]> |
| License: | GPL-3 |
| Version: | 2.0.1 |
| Built: | 2026-06-01 09:25:32 UTC |
| Source: | https://github.com/aefdz/fdapoifd |
Plots the functional boxplot for PoFD and returns the magnitude and domain outliers.
Magnitude outliers are drawn in blue. A dotted red line indicates that the outlying behaviour occurs
in a region where less than a proportion fdom of the central region is observed.
boxplotPOFD(data, depth, centralRegion = 0.5, fmag = 1.5, fdom = 0, plot = TRUE)
| data | Matrix p by n, where n is the number of functions and p the number of grid points. |
|---|---|
| depth | Depth used to build the functional boxplot. Default is MBD; see the POIFD function for other definitions. |
| centralRegion | Number between 0 and 1 giving the proportion of the deepest functions that build the central region. |
| fmag | Factor used to inflate the functional central region and determine the functional whiskers. Default is 1.5. The whiskers provide the rule to unmask magnitude outliers. |
| fdom | Factor giving the maximum proportion of observed functions in the central region for a magnitude outlier to be considered a domain outlier as well. A value equal to 0 means that domain outliers are those functions that are observed on a part of the domain where none of the functions building the central region is observed. A value equal to 1 flags as a domain outlier any magnitude outlier lying outside the region where the central region is completely observed. |
| plot | Whether the plot is shown or not. |
A list with the functional boxplot for PoFD, the magnitude outliers and the domain outliers.
Sun, Y. and Genton, M. G. (2011). Functional boxplots. Journal of Computational & Graphical Statistics, 20(2):316–334.
boxplotPOFD(exampleData$PoFDextremes_outliers, depth = "MBD", centralRegion = 0.5, fmag = 1.5, fdom = 0)
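As a rough illustration of the whisker rule behind `centralRegion` and `fmag`, the base R sketch below builds a central region from the deepest half of a toy sample and flags the curves that leave the inflated envelope. The toy data, the simple rank-based depth used for ordering, and the helper name `flag_magnitude` are assumptions made for illustration; this is not the package's internal code and it does not cover the domain outlier rule.

```r
set.seed(1)
grid <- seq(0, 1, length.out = 50)
# Toy sample: 20 curves by columns, the last one shifted upward
x <- sapply(1:20, function(j) sin(2 * pi * grid) + rnorm(1, sd = 0.2))
x[, 20] <- x[, 20] + 5
# Remove the first 10 grid points of curve 3 to mimic partial observation
x[1:10, 3] <- NA

# Stand-in depth (assumption): pointwise rank depth averaged over observed points
pointwise <- apply(x, 1, function(v) {
  r <- rank(v, na.last = "keep")
  m <- sum(!is.na(v))
  pmin(r, m + 1 - r) / m
})
depth <- rowMeans(pointwise, na.rm = TRUE)  # apply() returns curves by rows here

# Hypothetical helper, not part of the package
flag_magnitude <- function(x, depth, centralRegion = 0.5, fmag = 1.5) {
  n <- ncol(x)
  deepest <- order(depth, decreasing = TRUE)[seq_len(ceiling(centralRegion * n))]
  # Pointwise envelope of the central region, ignoring unobserved values
  lower <- apply(x[, deepest, drop = FALSE], 1, min, na.rm = TRUE)
  upper <- apply(x[, deepest, drop = FALSE], 1, max, na.rm = TRUE)
  width <- upper - lower
  # A curve is flagged if it crosses either whisker at any observed point
  out <- sweep(x, 1, upper + fmag * width, ">") |
    sweep(x, 1, lower - fmag * width, "<")
  which(colSums(out, na.rm = TRUE) > 0)
}

outliers <- flag_magnitude(x, depth)
```

Curve 20 is far above the upper whisker, so it is among the flagged indices in `outliers`.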
Generates samples of functions observed on a common domain located in the central part of the domain. See Elías et al. (2020).
commondomainPOFD(data, observability = NULL, pIncomplete = NULL)
| data | Completely observed functional data. A p by n matrix, where n is the number of curves and p the number of evaluation points. |
|---|---|
| observability | Mean proportion of the domain on which each function is observed. |
| pIncomplete | Number between 0 and 1 giving the proportion of curves that are partially observed. The default is 1, meaning that all the sample curves are partially observed. |
A list containing two elements: 1) the functional sample and 2) the same sample of functions, partially observed on a common domain.
Elías, Antonio, Jiménez, Raúl, Paganoni, Anna M. and Sangalli, Laura M. (2020). Integrated Depths for Partially Observed Functional Data.
data <- sapply(1:100, function(x) runif(1)*sin(seq(0, 2*pi, length.out = 200)) + runif(1)*cos(seq(0, 2*pi, length.out = 200)))
data_pofd <- commondomainPOFD(data, observability = 0.5, pIncomplete = 1)
This function implements the reconstruction procedure [1] which is based on the depth measure [2] for partially observed functional data. Missing trajectories are imputed by the mean of the k nearest neighbors within the envelope. The parameter k is tuned minimizing the Mean Squared Error of the reconstruction in the observed part of the curve.
depthbasedreconstructionPOFD(data, id_recons = 1:dim(data)[2])
| data | Data matrix 'p' by 'n', being 'n' the number of functions and 'p' the number of grid points. The row names of the matrix should be the common evaluation grid and the column names the identifiers of each functional data. |
|---|---|
| id_recons | Vector indicating functions to be reconstructed. By default, all functions are reconstructed. |
[1] Elías, A., Jiménez, R., & Shang, H. L. (2023). Depth-based reconstruction method for incomplete functional data. Computational Statistics, 38(3), 1507-1535.
[2] Elías, A., Jiménez, R., Paganoni, A. M., & Sangalli, L. M. (2023). Integrated depths for partially observed functional data. Journal of Computational and Graphical Statistics, 32(2), 341-352.
The reconstructed data matrix 'recons_data'.
data <- exampleData$PoFDintervals
recons_data <- depthbasedreconstructionPOFD(data, id_recons = 1:2)
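The idea of imputing the missing part of a curve with the mean of its nearest curves can be sketched in base R as follows. This is a simplified stand-in under stated assumptions: neighbours are chosen by plain squared distance on the observed part rather than through the depth-based envelope of [1], `k` is fixed instead of tuned, and `impute_knn` is a hypothetical helper that is not part of the package.

```r
set.seed(3)
grid <- seq(0, 1, length.out = 100)
# Toy sample: 30 sine curves with random amplitudes, stored by columns
x <- sapply(1:30, function(j) rnorm(1, 1, 0.1) * sin(2 * pi * grid))
truth <- x[, 1]
x[61:100, 1] <- NA  # curve 1 is missing on the last 40 grid points

# Hypothetical helper: impute curve i with the mean of its k nearest curves,
# measuring distance only where curve i is observed
impute_knn <- function(x, i, k = 3) {
  obs <- !is.na(x[, i])
  others <- setdiff(seq_len(ncol(x)), i)
  d <- colMeans((x[obs, others, drop = FALSE] - x[obs, i])^2, na.rm = TRUE)
  nearest <- others[order(d)[seq_len(k)]]
  out <- x[, i]
  out[!obs] <- rowMeans(x[!obs, nearest, drop = FALSE], na.rm = TRUE)
  out
}

recons <- impute_knn(x, i = 1, k = 3)
max(abs(recons - truth))  # reconstruction error on this toy sample
```

The observed part of the curve is left untouched; only the missing grid points are filled in.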
Obtains the envelope J_i of the curve specified by the index i. The implementation is based on Algorithm 1 in [1].
envelope(data, i, max_iter = 10)
| data | Data matrix. |
|---|---|
| i | Index of curve. |
| max_iter | Maximum number of nearest curves considered in the for loop. By default, max_iter = 10. |
[1] Elías, A., Jiménez, R., & Shang, H. L. (2023). Depth-based reconstruction method for incomplete functional data. Computational Statistics, 38(3), 1507-1535.
Envelope (set of indices).
Illustrative functional Gaussian processes with different partial observation patterns, with and without outliers.
exampleData
A list with four data sets (functions by columns):
PoFDintervals: Partially observed functional data in intervals.
PoFDextremes: Partially observed functional data with missing intervals at the extremes.
PoFDextremes_outliers: Same as above but including two magnitude and shape outliers.
PoFDintervals_list: Partially observed data without a common domain. Each function is one element of the list, containing the evaluation points (x) and the evaluated function (y).
Elías, Antonio, Jiménez, Raúl, Paganoni, Anna M. and Sangalli, Laura M. (2020). Integrated Depths for Partially Observed Functional Data.
data(exampleData)
plotPOFD(exampleData$PoFDintervals)
Generates samples of functions observed in different intervals. See Elías et al. (2020).
intervalPOFD(data, observability = NULL, ninterval = NULL, pIncomplete = NULL)
| data | Completely observed functional data. A p by n matrix, where n is the number of curves and p the number of evaluation points. |
|---|---|
| observability | Mean proportion of the domain on which each function is observed. |
| ninterval | Integer giving the number of observed intervals (1, 2, 3, ...). Large values of this parameter require a large p to guarantee the observability level. |
| pIncomplete | Number between 0 and 1 giving the proportion of curves that are partially observed. The default is 1, meaning that all the sample curves are partially observed. |
A list containing two elements: 1) the functional sample and 2) the same sample of functions, partially observed in intervals.
Elías, Antonio, Jiménez, Raúl, Paganoni, Anna M. and Sangalli, Laura M. (2020). Integrated Depths for Partially Observed Functional Data.
data <- sapply(1:100, function(x) runif(1)*sin(seq(0, 2*pi, length.out = 200)) + runif(1)*cos(seq(0, 2*pi, length.out = 200)))
data_pofd <- intervalPOFD(data, observability = 0.5, ninterval = 2, pIncomplete = 1)
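To make the roles of `observability` and `ninterval` concrete, here is a minimal base R sketch of one possible interval-masking scheme. It is an assumption for illustration, not the package's sampling procedure: the grid is split into `ninterval` equal blocks and each curve keeps one randomly placed window per block, so every curve has exactly the requested observed proportion. `mask_intervals` is a hypothetical helper name.

```r
set.seed(8)
grid <- seq(0, 2 * pi, length.out = 200)
full <- sapply(1:100, function(j) runif(1) * sin(grid) + runif(1) * cos(grid))

# Hypothetical helper: one observed window per block of the grid, per curve
mask_intervals <- function(full, observability = 0.5, ninterval = 2) {
  p <- nrow(full)
  # Split the grid indices into ninterval consecutive blocks
  block <- split(seq_len(p), ceiling(seq_len(p) * ninterval / p))
  observed <- matrix(NA_real_, p, ncol(full))
  for (j in seq_len(ncol(full))) {
    for (b in block) {
      len <- round(observability * length(b))   # window length inside the block
      start <- sample(length(b) - len + 1, 1)   # random position of the window
      keep <- b[start:(start + len - 1)]
      observed[keep, j] <- full[keep, j]
    }
  }
  observed
}

partial <- mask_intervals(full, observability = 0.5, ninterval = 2)
range(colMeans(!is.na(partial)))  # observed proportion per curve
```

Windows in neighbouring blocks can touch and merge into a single interval, which is one reason a larger `p` helps when `ninterval` grows.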
This function chooses an optimal parameter k (denoting the number of curves within the envelope that are considered for reconstruction).
learningK(data, i, J)
| data | Data matrix. |
|---|---|
| i | Index of curve. |
| J | Envelope. |
Optimal parameter k.
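The tuning idea, choosing k by the mean squared error on the observed part of the curve, can be sketched in base R as below. The distance used to rank the candidate curves, the toy data, and the helper name `choose_k` are assumptions for illustration; the package's `learningK()` may differ in its details.

```r
set.seed(4)
grid <- seq(0, 1, length.out = 100)
x <- sapply(1:30, function(j) rnorm(1, 1, 0.1) * sin(2 * pi * grid))
x[61:100, 1] <- NA   # curve 1 is partially observed
J <- 2:30            # candidate curves; stands in for the envelope of curve 1

# Hypothetical helper, not the package's learningK()
choose_k <- function(x, i, J) {
  obs <- !is.na(x[, i])
  # Rank the candidates by squared distance on the observed part of curve i
  d <- colMeans((x[obs, J, drop = FALSE] - x[obs, i])^2, na.rm = TRUE)
  ranked <- J[order(d)]
  # MSE on the observed part when averaging the k nearest candidates
  mse <- sapply(seq_along(ranked), function(k) {
    fit <- rowMeans(x[obs, ranked[seq_len(k)], drop = FALSE], na.rm = TRUE)
    mean((fit - x[obs, i])^2)
  })
  which.min(mse)
}

k_opt <- choose_k(x, i = 1, J = J)
```

The returned value is an integer between 1 and the size of `J`.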
Plots the Outliergram for PoFD and returns the shape outliers.
outliergramPOFD(data, fshape = 1.5, p1 = 1, p2 = 0, plot = TRUE)
| data | Matrix p by n, where n is the number of functions and p the number of grid points. |
|---|---|
| fshape | Inflation factor of the outliergram that determines the shape outlier rule. |
| p1 | Parameter of the outliergram for the resampling method. Default = 1. |
| p2 | Parameter of the outliergram for the resampling method. Default = 0. |
| plot | Whether the plot is shown or not. |
A list with the functional outliergram for PoFD and the shape outliers.
Arribas-Gil, A. and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram. Biostatistics, 15(4):603–619.
outliergramPOFD(exampleData$PoFDextremes_outliers, fshape = 1.5, p1 = 1, p2 = 0)
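For orientation, the outliergram of Arribas-Gil and Romo (2014) relates the modified band depth (MBD) to the modified epigraph index (MEI) through a parabola, and flags curves lying far below it. The base R sketch below reproduces that relationship for fully observed toy curves only. It is an illustration under assumptions (toy data, no ties, no missing values) and does not show how the package handles partial observation or the `p1` and `p2` parameters.

```r
set.seed(5)
grid <- seq(0, 1, length.out = 100)
x <- sapply(1:40, function(j) sin(2 * pi * grid) + rnorm(1, sd = 0.3))
x[, 40] <- sin(6 * pi * grid)   # different shape, similar magnitude
n <- ncol(x)

# prop[t, j]: proportion of curves lying on or above curve j at grid point t
prop <- t(apply(x, 1, function(v) (n + 1 - rank(v)) / n))
mei <- colMeans(prop)
# Pointwise share of pairs of curves whose band contains curve j, averaged over t
mbd <- colMeans(2 * (prop * n * (n + 1) - (prop * n)^2 - 1) / (n * (n - 1)))

# Parabola bounding MBD from above as a function of MEI
a0 <- a2 <- -2 / (n * (n - 1))
a1 <- 2 * (n + 1) / (n - 1)
parabola <- a0 + a1 * mei + a2 * n^2 * mei^2

# Distance below the parabola and boxplot-type rule inflated by fshape
d <- parabola - mbd
fshape <- 1.5
shape_outliers <- which(d > quantile(d, 0.75) + fshape * IQR(d))
```

Curve 40 crosses most of the sample, so its distance `d` to the parabola is large and it appears in `shape_outliers`.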
Plot the sample of partially observed curves and the proportion of observed functions.
plotPOFD(data)
| data | Matrix p by n, where n is the number of functions and p the number of grid points. |
|---|---|
Plot of the partially observed functional data and the proportion of observed functions at each time point.
plotPOFD(exampleData$PoFDextremes)
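The proportion of observed functions at each time point can be computed directly from the data matrix when unobserved values are stored as `NA`. The short base R sketch below assumes that encoding and uses a toy matrix; it is not the plotting code of the package.

```r
set.seed(6)
# Toy sample: 10 curves by columns on a grid of 50 points
x <- matrix(rnorm(50 * 10), nrow = 50, ncol = 10)
x[1:20, 1:4] <- NA   # four curves are missing on the first 20 grid points

# Proportion of curves observed at each grid point
prop_observed <- rowMeans(!is.na(x))
range(prop_observed)
```

The curves and this proportion could then be drawn with base graphics, for example `matplot(x, type = "l")` and `plot(prop_observed, type = "s")`.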
Compute the depth measure of a partially observed functional data set proposed in [1]. If the functions are not observed in a common partially observed domain, the code first estimates the observation domain using the proposal in [2].
POIFD(data, type = c("HD", "FMD", "MBD", "MHRD"), phi, t = NULL)
| data | If functions are observed on a partially observed common grid, 'data' is a matrix 'p' by 'n', being 'n' the number of functions and 'p' the number of grid points. The row names of the matrix should be the common evaluation grid and the column names the identifiers of each functional data. If functions do not have a common grid, 'data' must be a list of length 'n' where each element contains the values and evaluation points of each function, i.e. each list element must contain two vectors: 'x', including the evaluation points, and 'y', the evaluated function values. For functions without a common grid, the function |
|---|---|
| type | Chosen depth measure. Halfspace depth ( |
| phi | Phi function of weights for the POIFD. The default value is as in [1]: the proportion of observed functions at each time point. |
| t | If functions do not have a common grid, 't' represents the final common grid of evaluation points to apply the procedure to estimate the observation domain proposed in [2]. |
[1] Elías, A., Jiménez, R., Paganoni, A. M., & Sangalli, L. M. (2023). Integrated depths for partially observed functional data. Journal of Computational and Graphical Statistics, 32(2), 341-352.
[2] Elías, A., Nagy, S. (2025). Statistical Properties of Partially Observed Integrated Functional Depths. TEST, 34, 125-150.
Ordered vector of depths. The names are the function names (if provided) or the column positions.
data <- exampleData$PoFDintervals
poifd <- POIFD(data, type = c("FMD"))
data <- exampleData$PoFDintervals_list
poifd <- POIFD(data, type = c("FMD"), t = seq(0, 1, length.out = 100))
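The way the weights `phi` enter an integrated depth can be sketched in base R. The example below is one reading of the idea in [1], written under assumptions: it uses a Fraiman-Muniz type univariate depth computed among the curves observed at each grid point, and averages it over each curve's own observed domain with weights proportional to the proportion of observed curves. The helper `poifd_fm_sketch` is hypothetical and the package's `POIFD()` may normalise or compute things differently.

```r
set.seed(7)
grid <- seq(0, 1, length.out = 100)
x <- sapply(1:25, function(j) sin(2 * pi * grid) + rnorm(1, sd = 0.5))
x[1:30, 1:8] <- NA   # eight curves unobserved on the first 30 grid points

# Hypothetical helper: Fraiman-Muniz type integrated depth with NA handling
poifd_fm_sketch <- function(x) {
  phi <- rowMeans(!is.na(x))   # proportion of observed curves at each grid point
  # Univariate depth 1 - |1/2 - F_t(x(t))|, F_t being the empirical cdf
  # of the values observed at grid point t
  pointwise <- t(apply(x, 1, function(v) {
    m <- sum(!is.na(v))
    1 - abs(0.5 - rank(v, na.last = "keep", ties.method = "max") / m)
  }))
  # Weighted average over each curve's observed domain, weights proportional to phi
  w <- phi * (!is.na(x))
  colSums(pointwise * w, na.rm = TRUE) / colSums(w)
}

depth <- poifd_fm_sketch(x)
sort(depth, decreasing = TRUE)[1:3]   # the three deepest curves
```

With this univariate depth every value lies between 0.5 and 1, and larger values indicate more central curves.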
Generates samples of sparse functions. See Elías et al. (2020).
sparsePOFD(data, observability = NULL, pIncomplete = NULL)
| data | Completely observed functional data. A p by n matrix, where n is the number of curves and p the number of evaluation points. |
|---|---|
| observability | Proportion of the domain on which each function is observed. |
| pIncomplete | Number between 0 and 1 giving the proportion of curves that are partially observed. The default is 1, meaning that all the sample curves are partially observed. |
A list containing two elements: 1) the functional sample and 2) the same sample of functions, sparsely observed.
Elías, Antonio, Jiménez, Raúl, Paganoni, Anna M. and Sangalli, Laura M. (2020). Integrated Depths for Partially Observed Functional Data.
data <- sapply(1:100, function(x) runif(1)*sin(seq(0, 2*pi, length.out = 200)) + runif(1)*cos(seq(0, 2*pi, length.out = 200)))
data_pofd <- sparsePOFD(data, observability = 0.5, pIncomplete = 1)
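A sparse observation pattern can be mimicked in base R by dropping grid points independently at random. The sketch below is an assumption about what such a scheme may look like, with a hypothetical helper `mask_sparse`; it is not the package's sampling code, and the observed proportion per curve is only approximately equal to `observability`.

```r
set.seed(9)
grid <- seq(0, 2 * pi, length.out = 200)
full <- sapply(1:100, function(j) runif(1) * sin(grid) + runif(1) * cos(grid))

# Hypothetical helper: keep each value independently with probability observability
mask_sparse <- function(full, observability = 0.5) {
  keep <- matrix(runif(length(full)) < observability, nrow(full), ncol(full))
  observed <- full
  observed[!keep] <- NA
  observed
}

partial <- mask_sparse(full, observability = 0.5)
mean(!is.na(partial))   # overall share of observed values, close to 0.5
```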