Title: | Renewal Method for Extreme Values Extrapolation |
---|---|
Description: | Peaks Over Threshold (POT) or 'methode du renouvellement'. The distribution for the excesses can be chosen, and heterogeneous data (including historical data or block data) can be used in a Maximum-Likelihood framework. |
Authors: | Yves Deville <[email protected]>, Lise Bardet <[email protected]> |
Maintainer: | Yann Richet <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.1-4 |
Built: | 2025-01-21 03:09:03 UTC |
Source: | https://github.com/irsn/renext |
This package proposes fits and diagnostics for the so-called méthode du renouvellement, an alternative to other "Peaks Over Threshold" (POT) methods. The méthode du renouvellement generalises the classical POT by allowing the excesses over the threshold to follow a probability distribution which can differ from the Generalised Pareto Distribution (GPD). Weibull or gamma excesses are sometimes preferred to GPD excesses. The special case of exponential excesses (which falls in the three families: GPD, Weibull and gamma) is of special interest since it allows exact inference for the (scalar) parameter and for the quantiles from OT data (only).
The package allows the joint use of possibly three kinds of data or information. The first kind is classical excesses, or "OT data". It can be completed with two kinds of data resulting from a temporal aggregation, as is often the case for historical data. Both types are optional, and concern periods or blocks that must not overlap nor cross the OT period.
MAX data correspond to the case where one knows the r largest observations over each block. The number r may vary across blocks. This kind of data is often called 'r largest', or "r Largest Order Statistics" (rLOS).
OTS data (for OT Supplementary data) correspond to the case where one knows for each block all the observations that exceeded a threshold u_s which is greater (usually much greater) than the main threshold u. The number of such observations can be zero, in which case we may say that u_s is an unobserved level. A threshold u_s is sometimes called a perception threshold.
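As an illustration, a fit combining the three kinds of data might look as follows. This is only a sketch with made-up values; the MAX.* and OTS.* argument names mirror those appearing elsewhere in this documentation (see fGEV.MAX and rRendata) and should be checked against the Renouv help page.

## sketch only: made-up data; argument names to be checked against Renouv
x <- 30 + rexp(100, rate = 1 / 20)            ## OT data: levels over u = 30
fit <- Renouv(x = x, threshold = 30, effDuration = 50,
              distname.y = "GPD",
              MAX.data = list(c(213)),        ## one block: its largest obs.
              MAX.effDuration = 250,          ## duration of that block (years)
              OTS.data = list(c(180, 165)),   ## one block: all obs. above 150
              OTS.threshold = 150,            ## perception threshold
              OTS.effDuration = 100)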
Historical data are often available in hydrology (e.g. for river flood discharges, for sea-levels or sea surges) and can concern large periods such as past centuries. An unobserved level can typically be related to a material benchmark.
Maximum likelihood estimation is made possible in this context of heterogeneous data. Inference is based on the asymptotic normality of the parameter vector estimate and on linearisation ("delta method") for quantiles or functions of the parameters.
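For concreteness, here is a minimal self-contained sketch of the delta method for the T-year return level of a POT model with exponential excesses; the threshold, estimates and covariance matrix are made up, and numDeriv (among the package imports) is used for the gradient.

u <- 30                                     ## threshold
theta <- c(lambda = 2, rate = 0.05)         ## event rate, rate of the excesses
Sigma <- diag(c(0.04, 1e-5))                ## assumed covariance of the estimates
qFun <- function(th, T) u + log(th["lambda"] * T) / th["rate"]
g <- numDeriv::grad(qFun, theta, T = 100)   ## gradient of the 100-year level
sdQ <- sqrt(t(g) %*% Sigma %*% g)           ## linearised standard deviation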
The package allows the use of "marked-process observations" data (datetime of event and level) where an interevent analysis can be useful. It also allows the event dates to be unknown and replaced by a much broader block indication, e.g. a year number. The key point is then that the "effective duration" (total duration of observation periods) is known. Event counts for blocks can be used to check the assumption of Poisson-distributed events.
The package development was initiated, directed and financed by the French Institut de Radioprotection et de Sûreté Nucléaire (IRSN). The package is a non-academic tool designed for applied analysis on case studies and investigations or comparisons on classical probabilistic models.
The DESCRIPTION file:
Package: | Renext |
Type: | Package |
Title: | Renewal Method for Extreme Values Extrapolation |
Version: | 3.1-4 |
Date: | 2023-08-29 |
Author: | Yves Deville <[email protected]>, Lise Bardet <[email protected]> |
Maintainer: | Yann Richet <[email protected]> |
URL: | https://github.com/IRSN/Renext, https://irsn.github.io/Renext/ |
Depends: | R (>= 2.8.0), stats, graphics, evd |
Imports: | numDeriv, splines, methods |
Suggests: | MASS, ismev, XML |
Description: | Peaks Over Threshold (POT) or 'methode du renouvellement'. The distribution for the excesses can be chosen, and heterogeneous data (including historical data or block data) can be used in a Maximum-Likelihood framework. |
License: | GPL (>= 2) |
LazyData: | yes |
Repository: | https://irsn.r-universe.dev |
RemoteUrl: | https://github.com/irsn/renext |
RemoteRef: | HEAD |
RemoteSha: | d1df900464ef9894bfbf52145012b317d9d84a8d |
Index of help topics:
Brest | Surge heights at Brest |
Brest.years | Surge heights at Brest, partial data |
Brest.years.missing | Years with missing periods in the 'Brest.years' dataset |
CV2 | Squared Coefficient of Variation |
CV2.test | CV2 test of exponentiality |
Dunkerque | Surge heights at Dunkerque |
EM.mixexp | Expectation-Maximisation for a mixture of exponential distributions |
GPD | Generalised Pareto Distribution |
Garonne | Flow of the French river La Garonne |
Hpoints | Plotting positions for exponential return levels |
Jackson | Jackson's statistic |
Jackson.test | Jackson's test of exponentiality |
LRExp | Likelihood Ratio statistic for exponential vs. GPD |
LRExp.test | Likelihood Ratio test of exponentiality vs. GPD |
LRGumbel | Likelihood Ratio statistic for Gumbel vs. GEV |
LRGumbel.test | Likelihood Ratio test for the Gumbel distribution |
Lomax | Lomax distribution |
Maxlo | 'maxlo' distribution |
MixExp2 | Mixture of two exponential distributions |
NBlevy | Negative Binomial Levy process |
OT2MAX | Temporal aggregation of a Marked Process |
OTjitter | Add a small amount of noise to a numeric vector |
PPplot | Diagnostic plots for Renouv objects |
RLlegend | Legend management for return level plots |
RLpar | Graphical parameters for Return Level plots |
RLplot | Return level plot |
Ren2gev | Translate a vector of coefficients from a Renewal-POT model with Pareto excesses into a vector of GEV parameters |
Ren2gumbel | Translate a vector of coefficients from a Renewal-POT model with exponential excesses to a vector of Gumbel parameters |
Renext-package | Renewal Method for Extreme Values Extrapolation |
Renouv | Fit a 'Renouvellement' model |
RenouvNoEst | Define a 'renouvellement' model without estimation |
SLTW | Shifted Left Truncated Weibull (SLTW) distribution |
SandT | Compute empirical survivals (S) and return periods (T) |
anova.Renouv | Compute an analysis of deviance table for two nested Renouv objects |
barplotRenouv | Barplot for Renouv "Over Threshold" counts |
expplot | Classical "exponential distribution" plot |
fGEV.MAX | Fit a GEV distribution from block maxima or r largest order statistics using an aggregated Renewal POT process |
fGPD | Fit a two-parameter Generalised Pareto Distribution from a sample |
fgamma | ML estimation of the Gamma distribution |
flomax | ML estimation of the Lomax distribution |
fmaxlo | ML estimation of a 'maxlo' distribution |
fweibull | ML estimation of the classical Weibull distribution |
gev2Ren | Translate a vector of GEV parameters into a renewal model |
gof.date | Goodness-of-fit for the distribution of dates |
gofExp.test | Goodness-of-fit test for exponential distribution |
gumbel2Ren | Translate a vector of Gumbel parameters into a vector of parameters for a renewal model |
ini.mixexp2 | Simple estimation for the mixture of two exponential distributions |
interevt | Interevents (or interarrivals) from event dates |
logLik.Renouv | Log-likelihood of a "Renouv" object |
mom.mixexp2 | Moment estimation for the mixture of two exponential distributions |
mom2par | Parameters from moments |
pGreenwood1 | Probability that the Greenwood's statistic is smaller than one |
parDeriv | Derivation of probability functions with respect to the parameters |
parIni.MAX | Initial estimation of GPD parameters for an aggregated renewal model |
plot.Rendata | Plot a Rendata object |
plot.Renouv | Plot an object of class "Renouv" |
predict.Renouv | Compute return levels and confidence limits for a "Renouv" object |
qStat | Quantiles of a test statistic |
rRendata | Simulate a random Rendata object |
readXML | Read data using an XML index file |
roundPred | Round quantiles in a pseudo-prediction table |
skip2noskip | Fix non-skipped periods from skipped ones |
spacings | Methods computing spacings between Largest Order Statistics |
summary.Rendata | Summary and print methods for "Rendata" objects |
summary.Renouv | Summary and print methods for "Renouv" objects |
translude | Make translucent colors |
vcov.Renouv | Variance-covariance matrix of the estimates of a "Renouv" object |
weibplot | Classical Weibull distribution plot |
This package contains a function Renouv to fit "renouvellement" models.
Yves Deville <[email protected]>, Lise Bardet <[email protected]>
Maintainer: Yann Richet <[email protected]>
Miquel J. (1984) Guide pratique d'estimation des probabilités de crues, Eyrolles (coll. EDF DER).
Coles S. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer.
Embrechts P., Klüppelberg C. and Mikosch T. (1997) Modelling Extremal Events for Insurance and Finance. Springer.
The CRAN packages evd, ismev, extRemes, POT.
## 'Garonne' data set
summary(Garonne)
plot(Garonne)
## Weibull excesses
fG <- Renouv(x = Garonne, threshold = 3000, distname.y = "weibull",
             main = "Weibull fit for 'Garonne'")
coef(fG)
vcov(fG)
summary(fG)
logLik(fG)
## Re-plot if needed
plot(fG)
## Classical 'predict' method with usual formal args
predict(fG, newdata = c(100, 150, 200), level = c(0.8, 0.9))
Compute an analysis of deviance table for two nested Renouv objects
## S3 method for class 'Renouv'
anova(object, object1, trace = 1L, ...)
object | An object of class "Renouv", using a distribution of the excesses nested in that of object1. |
object1 | A second object of class "Renouv". |
trace | Level of verbosity. |
... | Not used yet. |
Of special interest is the case when the distribution of the excesses used in object is exponential while object1 uses a two-parameter alternative in the GPD family. We know then that the convergence to the asymptotic distribution is slow, and a numerical approximation of the true distribution of the test statistic is used when possible, i.e. when the objects do not use MAX or OTS data and the number of exceedances is between 8 and 500.
An object of class "anova" inheriting from class "data.frame".
The deviance of the models cannot be interpreted on its own: only the difference of the deviances is meaningful.
## test using historical data
fit1Exp <- Renouv(Garonne, distname.y = "exponential", plot = FALSE)
fit1GPD <- Renouv(Garonne, distname.y = "GPD", plot = FALSE)
anova(fit1Exp, fit1GPD)
## test without using historical data
x <- Garonne$OTdata$Flow
dur <- Garonne$OTinfo$effDuration
fit2Exp <- Renouv(x, threshold = 2700, effDuration = dur,
                  distname.y = "exponential", plot = FALSE)
fit2GPD <- Renouv(x, threshold = 2700, effDuration = dur,
                  distname.y = "GPD", plot = FALSE)
anova(fit2Exp, fit2GPD)
Barplot for "Over Threshold" counts in time blocks (usually years)
barplotRenouv(data, blockname = colnames(data)[1],
              varname = colnames(data)[2],
              threshold = quantile(data[, varname], 0.2),
              na.block = NULL, plot = TRUE,
              main = NULL, xlab = NULL, ylab = NULL,
              mono = FALSE, prob.theo = 0.999, ...)
data | A data frame containing the variables. |
blockname | Name of the "block" variable (a column of data). |
varname | Name of the variable of interest (a column of data), e.g. the surge height. |
threshold | Only observations for which the variable exceeds threshold are considered. |
na.block | Values of blocks containing missing values. See the Details section. |
plot | If TRUE, a plot is produced. |
main | Character giving the main title, or NULL. |
xlab | Character giving the x axis label, or NULL. |
ylab | Character giving the y axis label, or NULL. |
mono | If TRUE, a monochrome plot is produced. |
prob.theo | The total theoretical probability corresponding to the plotted (theoretical) bars. |
... | Further arguments to be passed to barplot. |
Blocks described in the na.block argument are omitted in the determination of counts. The object given in na.block is coerced to character, and the same is done for the values of the block variable before comparing them to the na.block values. If the block variable is of class factor with levels representing years (e.g. 1980, 1981, etc.), missing blocks can be specified either as c("1980", "1981") or as the numeric c(1980, 1981).
For the chi-square test, counts for neighbouring frequency classes are grouped in order to reach a minimum (expected) frequency of 5 in each group. E.g. if we expect respectively 1.0, 3.8 and 7.0 blocks with frequency 0, 1 and 2 for events, the three counts are grouped into one group with expected frequency 1.0 + 3.8 + 7.0 = 11.8. Note that this strategy of grouping is not unique and is likely to weaken the power of the test. Before grouping, the theoretical probability of the highest class is computed as the probability of obtaining a count equal to or greater than the max value.
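The grouping rule can be sketched as follows; this mirrors the description above, not the package's internal code.

groupCounts <- function(e, minFreq = 5) {
    g <- integer(length(e)); gi <- 1L; s <- 0
    for (i in seq_along(e)) {
        g[i] <- gi
        s <- s + e[i]
        if (s >= minFreq) { gi <- gi + 1L; s <- 0 }
    }
    ## fold an unfinished trailing group into the previous one
    if (s > 0 && gi > 1L) g[g == gi] <- gi - 1L
    g
}
groupCounts(c(1.0, 3.8, 7.0))   ## -> 1 1 1: a single group, 11.8 expected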
A list with the following objects.
freq | Frequency table (matrix) giving observed and theoretical (Poisson) frequencies as well as a group number for the chi-square test. |
overdispersion | The overdispersion coefficient (variance/mean ratio). |
disp.test | A list giving the results of the (over)dispersion test. See the reference Yagouti et al. in the References section. |
chisq.test | A list giving the results for the chi-square test of goodness-of-fit to the Poisson distribution. |
tests | A matrix with the two tests displayed in two rows. |
For both tests, the statistic follows a chi-square distribution under the null hypothesis. The list of results contains the statistic statistic, the number of degrees of freedom df and the p-value p.value.
The two tests, (over-)dispersion and chi-square, have a one-sided (upper tail) p-value. In other words, we do not intend to reject when the statistics take "abnormally small" values, but only when abnormally large values are met.
Yves Deville
See Yagouti A., Abi-Zeid I., Ouarda, T.B.M.J. and Bobée, B. (2001), "Revue de processus ponctuels et synthèse de tests statistiques pour le choix d'un type de processus", Revue des Sciences de l'Eau, 1, pp. 323-361.
## na.block influence for Brest data
opar <- par(mfrow = c(2, 2))
bp1 <- barplotRenouv(data = Brest.years, threshold = 30,
                     main = "missing periods ignored")
bp2 <- barplotRenouv(data = Brest.years, threshold = 30,
                     na.block = 1992, main = "1992 missing")
bp3 <- barplotRenouv(data = Brest.years, threshold = 30,
                     na.block = 1991:1993, main = "1991:1993 missing")
bp4 <- barplotRenouv(data = Brest.years, threshold = 30,
                     na.block = Brest.years.missing,
                     main = "all missing periods")
par(opar)
## threshold influence
opar <- par(mfrow = c(2, 2))
thresh <- c(30, 35, 40, 50)
for (i in 1:length(thresh)) {
  bp <- barplotRenouv(data = Brest.years, threshold = thresh[i],
                      na.block = Brest.years.missing,
                      main = paste("threshold =", thresh[i], "cm at Brest"))
}
par(opar)
Surge heights near high tide at Brest tide gauge station (France), detailed version
Brest
The format is: List of 5
 $ info : List of 6
   $ name : chr "Brest"
   $ shortLab : chr "Surge Heights at Brest (France)"
   $ longLab : chr "Surge Heights near high tide, Brest (France)"
   $ varName : chr "Surge"
   $ varShortLab : chr "Surge"
   $ varUnit : chr "cm"
 $ describe : chr "High tide sea surge over 30 cm at Brest (France)..."
 $ OTinfo : List of 4
   $ start : POSIXct[1:1], format: "1846-01-01"
   $ end : POSIXct[1:1], format: "2009-01-01"
   $ effDuration : num 148
   $ threshold : num 30
 $ OTdata : 'data.frame': 1289 obs. of 2 variables:
   $ date : POSIXct[1:1289], format: "1846-01-14" "1846-01-21" ...
   $ Surge : num [1:1289] 36 60 46 40 33 ...
 $ OTmissing : 'data.frame': 43 obs. of 3 variables:
   $ start : POSIXct[1:43], format: "1846-01-01" "1847-01-01" ...
   $ end : POSIXct[1:43], format: "1846-01-04" "1847-01-21" ...
   $ comment : chr [1:43] "" "" "" "" ...
 - attr(*, "class") = chr "Rendata"
Data are provided as a list.
info | Gives general information about the data. |
OTinfo | Gives general information about the Over the Threshold part of the data. The effective duration (effDuration element) is the total duration of the periods with effective measurements. |
OTdata | Gives the OT measurements. |
OTmissing | Gives the start and end of the missing periods for OT measurements. |
Data come from hourly sea levels measured and predicted by the French Service Hydrographique et Océanographique de la Marine (SHOM). Observed sea levels are available as REFMAR data at https://data.shom.fr/. Data were processed (declustered) by IRSN in order to provide a series of independent surge heights at high tide. Surge height at high tide is defined as the difference between the observed and the predicted maximal sea levels near high tide. A correction was applied to account for the trend in the sea level over the observation period.
The effective duration given in years is defined up to a small fraction of a year due to leap years and leap seconds.
str(Brest) Brest$OTinfo$start plot(Brest)
Surge heights at Brest (France)
Brest.years
A data frame with 954 observations on the following 2 variables.
year | Year, e.g. 1980. |
Surge | Surge heights above the threshold of 30 cm. |
These data are a simplified version of Brest. For each surge event only the year is retained as timestamp. Years with missing periods are available as the vector Brest.years.missing.
This dataset is useful for testing since similar data are sometimes met in the analyses.
names(Brest.years)
Years with missing periods in the 'Brest.years' dataset
Brest.years.missing
The format is: int [1:49] 1846 1847 1852 1857 1858 1859 1860 1861 1862 1863 ...
Vector of the years containing missing periods in the Brest.years dataset. These years should be ignored when computing yearly statistics such as event rates, since time records are lost.
print(Brest.years.missing)
Squared Coefficient of Variation.
CV2(x)
x | Numeric vector or matrix. |
Compute the squared Coefficient of Variation of one or several samples provided as a numeric vector or matrix.
Numeric vector of the squared coefficients of variation.
The squared coefficient of variation is the ratio s^2 / xbar^2 where xbar and s^2 are the sample mean and the sample variance. The variance is computed using the sample size n as denominator, rather than the usual n - 1.
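Written out as code, the definition reads (a sketch; the package provides CV2 itself):

cv2 <- function(x) mean((x - mean(x))^2) / mean(x)^2
cv2(rexp(10000))   ## close to 1 for an exponential sample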
n <- 30; nSamp <- 500
X <- matrix(rexp(n * nSamp), nrow = nSamp, ncol = n)
W <- CV2(X)
plot(density(W), main = "CV2 of exponential samples")
Test of exponentiality based on the squared coefficient of variation.
CV2.test(x, method = c("num", "sim", "asymp"), nSamp = 15000)
x | Numeric vector giving the sample. |
method | Method used to compute the p-value. |
nSamp | Number of samples used to compute the p-value when method is "sim". |
The distribution of CV2 is that of Greenwood's statistic up to normalising constants. It is approximately normal with expectation 1 and standard deviation 2 / sqrt(n) for a large sample size n. Yet the convergence to the normal is known to be very slow.
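The "asymp" choice can thus be sketched as follows; this is an assumption about the implementation, based directly on the normal approximation just stated.

n <- 200
x <- rexp(n)
w <- CV2(x)
pnorm((w - 1) / (2 / sqrt(n)), lower.tail = FALSE)   ## upper-tail p-value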
A list of test results.
statistic, p.value | The test statistic (i.e. the squared coefficient of variation) and the p-value. |
df | The sample size. |
method | Description of the test method. |
This test is sometimes referred to as Wilk's exponentiality test or as the WE1 test. It works quite well for a Lomax alternative (i.e. a GPD with positive shape), and hence can be compared to Jackson's test and the Likelihood-Ratio (LR) test of exponentiality. However, this test has lower power than the two others while having a comparable computation cost due to the evaluation of the distribution of Greenwood's statistic.
Yves Deville
S. Ascher (1990) "A Survey of Tests for Exponentiality" Commun. Statist. Theory Methods, 19(5), pp. 1811-1525.
The function CV2 that computes the statistic, and LRExp.test or Jackson.test for functions implementing comparable tests of exponentiality with the same arguments.
n <- 30; nSamp <- 500
X <- matrix(rexp(n * nSamp), nrow = nSamp, ncol = n)
pVals <- apply(X, 1, function(x) CV2.test(x)$p.value)
plot(pVals) ## should be uniform on (0, 1)
Surge heights near high tide at Dunkerque tide gauge station (France)
Dunkerque
The format is: List of 7
 $ info : List of 6
   $ name : chr "Dunkerque"
   $ shortLab : chr "Surge Heights at Dunkerque (France)"
   $ longLab : chr "Surge Heights near high tide, Dunkerque (France)"
   $ varName : chr "Surge"
   $ varShortLab : chr "Surge"
   $ varUnit : chr "cm"
 $ describe : chr "High tide sea surge over 30 cm at Dunkerque... "
 $ OTinfo : List of 4
   $ start : POSIXct[1:1], format: "1956-01-01"
   $ end : POSIXct[1:1], format: "2009-01-01"
   $ effDuration : num 38.8
   $ threshold : num 30
 $ OTdata : 'data.frame': 740 obs. of 3 variables:
   $ date : POSIXct[1:740], format: "1956-11-27" "1956-12-03" ...
   $ Surge : num [1:740] 67.9 30.9 51.8 30.8 39.8 ...
   $ comment : chr [1:740] "" "" "" "" ...
 $ OTmissing : 'data.frame': 83 obs. of 3 variables:
   $ start : POSIXct[1:83], format: "1956-01-01" "1956-08-08" ...
   $ end : POSIXct[1:83], format: "1956-06-07" "1956-11-03" ...
   $ comment : chr [1:83] "" "" "" "" ...
 $ MAXinfo : 'data.frame': 1 obs. of 3 variables:
   $ start : POSIXct[1:1], format: "1706-01-01"
   $ end : POSIXct[1:1], format: "1956-01-01"
   $ duration : num 250
 $ MAXdata : 'data.frame': 1 obs. of 4 variables:
   $ block : int 1
   $ date : POSIXct[1:1], format: "1953-02-01"
   $ Surge : num 213
   $ comment : chr "1"
 - attr(*, "class") = chr "Rendata"
See the Brest and Garonne datasets, which have the same list structure.
An 'historical' surge of 213 cm was observed on 1953-02-01 and is considered by experts as having a return period of 250 years.
Dunkerque$info
plot(Dunkerque)
Experimental function for Expectation-Maximisation (EM) estimation
EM.mixexp(x, m = 2)
x | Sample vector of positive values. |
m | Number of mixture components. |
The EM algorithm is very simple for exponential mixtures (as well as for many other mixture models).
According to a general feature of EM, this iterative method leads to successive estimates with increasing likelihood but which may converge to a local maximum of the likelihood.
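A minimal sketch of one EM iteration for m = 2 components may clarify the scheme; this illustrates the updates described above, not the EM.mixexp code itself.

emStep <- function(x, p, rate1, rate2) {
    d1 <- p * dexp(x, rate1)
    d2 <- (1 - p) * dexp(x, rate2)
    g <- d1 / (d1 + d2)                      ## E-step: responsibilities
    c(p = mean(g),                           ## M-step: updated weight
      rate1 = sum(g) / sum(g * x),           ## and component rates
      rate2 = sum(1 - g) / sum((1 - g) * x))
}
x <- c(rexp(50, rate = 1), rexp(50, rate = 4))
emStep(x, p = 0.5, rate1 = 0.5, rate2 = 2)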
List with
estimate | Estimated values as a named vector. |
logL | Vector giving the log-likelihood for the successive iterations. |
Alpha | Matrix with m columns giving the successive values of the mixture weights. |
Theta | Matrix with m columns giving the successive values of the expectations (inverse rates). |
The estimation is done for the expectations (inverse rates) but the estimate vector in the result contains rates for compatibility reasons (e.g. with exponential).
Yves Deville
mom.mixexp2 and ini.mixexp2 for "cheap" estimators when m = 2.
set.seed(1234)
x <- rmixexp2(n = 100, prob1 = 0.5, rate2 = 4)
res <- EM.mixexp(x)
res$estimate
matplot(res$Theta, type = "l", lwd = 2,
        xlab = "iteration", ylab = "theta",
        main = "exponential inverse rates")
Plot a vector using "exponential distribution" scales
expplot(x, plot.pos = "exp", rate = NULL, labels = NULL, mono = TRUE, ...)
x | The vector to be plotted. |
plot.pos | Plotting position for points: either "exp" for expected ranks or "med" for a median rank approximation (see Details below). |
rate | Rate parameter for one or several "exponential distribution" lines to be plotted. |
labels | Text to display in the legend when "exponential distribution" lines are specified. |
mono | Monochrome graph? |
... | Arguments to be passed to plot. |
This plot shows -log(1 - p_i) against the order statistics of x, where the plotting position p_i at point number i is taken as i / (n + 1) (expected ranks) if plot.pos is "exp", or as a "median rank" approximation if plot.pos is "med".
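The coordinates underlying the plot can be written out as follows; the median-rank formula (i - 0.3) / (n + 0.4) is the usual approximation and is an assumption about the exact formula used here.

x <- sort(rexp(50))
n <- length(x)
pExp <- (1:n) / (n + 1)               ## "exp": expected ranks
pMed <- ((1:n) - 0.3) / (n + 0.4)     ## "med": median-rank approximation
plot(x, -log(1 - pExp))               ## roughly linear, slope = rate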
If the data in x is a sample from an exponential distribution, the points should be roughly aligned. However the largest order statistics have a high sampling dispersion.
The log scale for y is emulated via the construction of suitable graduations. So be careful when adding graphical material (points, etc.) to this graph with functions of the "add to plot" family (points, lines, ...).
The ML estimate of the rate parameter is the inverse of the sample mean.
Yves Deville
The weibplot function for a classical "Weibull" plot. The interevt function is useful to compute interevents (or "interarrivals"), which should follow an exponential distribution in the homogeneous Poisson process context.
x <- rexp(200)
expplot(x, rate = 1 / mean(x), labels = "fitted")
Fast Maximum Likelihood estimation of the Gamma distribution.
fgamma(x, check.loglik = FALSE)
x | Sample vector to be fitted. Should contain only positive non-NA values. |
check.loglik | If TRUE, the maximised log-likelihood is recomputed in a direct manner as a check. |
The likelihood is concentrated with respect to the scale parameter. The concentrated log-likelihood is a strictly concave function of the shape parameter, which can easily be maximised numerically.
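A sketch of this concentration (not the package source): for a fixed shape a, the ML scale is mean(x) / a, and substituting it back leaves a one-dimensional log-likelihood in a.

cloglik <- function(a, x) {
    n <- length(x)
    (a - 1) * sum(log(x)) - n * a - n * lgamma(a) - n * a * log(mean(x) / a)
}
x <- rgamma(100, shape = 2, scale = 3)
aHat <- optimize(cloglik, c(1e-3, 1e3), x = x, maximum = TRUE)$maximum
c(shape = aHat, scale = mean(x) / aHat)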
A list with the following elements
estimate | Parameter ML estimates. |
sd | Vector of (asymptotic) standard deviations for the estimates. |
loglik | The maximised log-likelihood. |
check.loglik | The checked log-likelihood. |
cov | The (asymptotic) covariance matrix computed from the theoretical or observed information matrix. |
info | The information matrix. |
The distribution is fitted by using the scale parameter rather than rate (the inverse of scale).
Yves Deville
GammaDist in the stats package.
set.seed(9876)
alpha <- 0.06
beta <- rexp(1)
n <- 30
x <- rgamma(n, shape = alpha, scale = beta)
fit <- fgamma(x, check.loglik = TRUE)
## compare with MASS results
if (require(MASS)) {
  fit.MASS <- fitdistr(x, densfun = "gamma")
  rate <- 1 / fit$estimate["scale"]
  est <- c(fit$estimate, rate = rate)
  der <- rate * rate ## derivative of rate w.r.t scale
  sdest <- c(fit$sd, rate = der * fit$sd["scale"])
  tab <- rbind(sprintf(" %10.8f ", est),
               sprintf("(%10.8f)", sdest))
  colnames(tab) <- c("shape", "scale", "rate")
  rownames(tab) <- c("est", "sd")
  noquote(tab)
}
Fit a GEV distribution from block maxima or r largest order statistics using an aggregated Renewal POT process.
fGEV.MAX(MAX.data, MAX.effDuration, MAX = NULL,
         control = list(maxit = 300, fnscale = -1),
         scaleData = TRUE, cov = TRUE, info.observed = TRUE,
         trace = 0)
MAX.data | A list of numeric vectors, one by block, containing the block maxima or the r largest order statistics. |
MAX.effDuration | A vector of block durations, as in Renouv. |
MAX | A compact representation of the needed data as a list. This is typically created by using the (non exported) makeMAXdata function. |
control | A list to control the optimisation in optim. |
scaleData | Logical. If TRUE, the data are scaled before the estimation. |
cov | Logical. If TRUE, the covariance matrix of the estimates is computed. |
info.observed | Logical. If TRUE, the observed information matrix is used rather than the expected one. |
trace | Integer level of verbosity during execution. With the value 0, nothing is printed. |
The data are assumed to provide maxima or r largest statistics arising from an aggregated renewal POT model with unknown event rate and unknown two-parameter Generalised Pareto Distribution for the excesses. A threshold is fixed below the given data and the three unknown parameters lambda, scale and shape of the POT model are found by maximising the likelihood. Then a vector of the three parameters for the GEV distribution is computed by transformation. The covariance matrix and standard deviations are computed as well, using the Jacobian matrix of the transformation.
The maximisation is for the log-likelihood with the rate lambda concentrated out, so it is a two-parameter optimisation.
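The transformation in question is the standard POT-to-GEV identity, sketched below with u the threshold and w the block duration; see Ren2gev for the package's own version.

pot2gev <- function(lambda, scale, shape, u, w = 1) {
    c(loc   = u + scale * ((lambda * w)^shape - 1) / shape,
      scale = scale * (lambda * w)^shape,
      shape = shape)
}
pot2gev(lambda = 2, scale = 100, shape = 0.1, u = 30)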
A list
estimate | Named vector of the estimated parameters for the GEV distribution of maxima. |
opt | Result of the optimisation. |
loglik | The maximised log-likelihood (identical to the value found in opt). |
sd | Standard deviations of the estimates (approximation based on ML theory). |
cov | Covariance matrix of the estimates (approximation based on ML theory). |
Whatever be the data, the log-likelihood is infinite (hence maximal) for any vector of GEV parameters with shape < -1 and positive scale. Hence the log-likelihood should be maximised with a constraint on the shape, while an unconstrained optimisation is used here. In practice, for numerical reasons, the estimate usually remains inside the shape > -1 region. An estimation leading to shape <= -1 must be considered as meaningless. An estimation with shape < -0.5 should be considered with care.
This function could get more arguments in the future.
Yves Deville
The Renext Computing Details document.
Prescott P. and Walden A.T. (1980) Maximum Likelihood Estimation of the Parameters of the Generalized Extreme-Value Distribution. Biometrika 67(3), 723-724.
##====================================================================
## block maxima: simulated data and comparison with the 'fgev'
## function from the 'evd' package
##====================================================================
set.seed(1234)
u <- 10
nBlocks <- 30
nSim <- 100 ## number of samples
Par <- array(NA, dim = c(nSim, 3, 2),
             dimnames = list(NULL, c("loc", "scale", "shape"),
                             c("MAX", "evd")))
LL <- array(NA, dim = c(nSim, 2),
            dimnames = list(NULL, c("MAX", "evd")))
for (i in 1:nSim) {
  rd <- rRendata(threshold = u, effDuration = 1, lambda = 12,
                 MAX.effDuration = rep(1, nBlocks),
                 MAX.r = rep(1, nBlocks),
                 distname.y = "exp", par.y = c(rate = 1 / 100))
  MAX <- Renext:::makeMAXdata(rd)
  fit.MAX <- fGEV.MAX(MAX = MAX)
  fit.evd <- fgev(x = unlist(MAX$data))
  Par[i, , "MAX"] <- fit.MAX$estimate
  Par[i, , "evd"] <- fit.evd$estimate
  LL[i, "MAX"] <- fit.MAX$loglik
  LL[i, "evd"] <- logLik(fit.evd)
}
##====================================================================
## r largest: use 'ismev::rlarg.fit' on the venice data set.
## NB 'venice' is taken from the 'evd' package here.
##====================================================================
## Not run:
require(ismev)
fit1 <- ismev::rlarg.fit(venice)
## transform data: each row is a block
MAX.data <- as.list(as.data.frame(t(venice)))
## remove the NA imposed by the rectangular matrix format
MAX.data <- lapply(MAX.data, function(x) x[!is.na(x)])
MAX.effDuration <- rep(1, length(MAX.data))
fit2 <- fGEV.MAX(MAX.data = MAX.data,
                 MAX.effDuration = MAX.effDuration)
## estimates
est <- cbind(ismev = fit1$mle, RenextLab = fit2$estimate)
print(est)
## covariance
covs <- array(dim = c(2, 3, 3),
              dimnames = list(c("ismev", "RenextLab"),
                              colnames(fit2$cov), colnames(fit2$cov)))
covs["ismev", , ] <- fit1$cov
covs["RenextLab", , ] <- fit2$cov
print(covs)
## End(Not run)
Fit a two-parameter Generalised Pareto Distribution from a sample.
fGPD(x, info.observed = TRUE, shapeMin = -0.8, dCV = 1e-04, cov = TRUE, trace = 0)
x | The sample data vector. |
info.observed | Logical. The observed information matrix is used when TRUE, the expected one otherwise. |
shapeMin | Lower bound on the shape parameter. |
dCV | Half-length of a small interval centered on 1.0. When the Coefficient of Variation (CV) falls in this interval, an exponential distribution is fitted, see Details. |
cov | Logical. If TRUE, the covariance matrix of the estimates is computed. |
trace | Integer level of verbosity. |
This function mainly relies on the flomax and fmaxlo functions. When CV is larger than 1.0 + dCV, a Lomax distribution is fitted by Maximum Likelihood using a concentrated log-likelihood. When instead CV is smaller than 1.0 - dCV, a maxlo distribution is fitted. Finally, when CV - 1.0 has absolute value <= dCV, an exponential distribution is fitted. In all cases, the result is translated into a parameter vector for the GPD.
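The dispatch just described can be sketched as follows; this mirrors the documented logic, not the actual fGPD source.

chooseExcessDist <- function(x, dCV = 1e-4) {
    CV <- sqrt(mean((x - mean(x))^2)) / mean(x)   ## 'n' denominator
    if (CV > 1 + dCV) "lomax"                     ## GPD with positive shape
    else if (CV < 1 - dCV) "maxlo"                ## GPD with negative shape
    else "exponential"                            ## shape taken as zero
}
chooseExcessDist(rexp(1000))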
Note that when CV is close to 1.0, fitting a Lomax or a maxlo distribution can lead to problems because the estimated values of the shape and scale parameters are large, and because the concentrated log-likelihood is a flat function of the scale parameter.
A list
estimate | Vector containing the estimated values of the unknown parameters. |
CV | The coefficient of variation of x. |
logLik, dlogLik | The maximised value of the log-likelihood and the vector of its first order derivatives, which should be close to zero. |
sd, cov | Vector of approximated standard deviations and covariance matrix for the estimated parameters, based on the inversion of the expected or observed information matrix depending on the info.observed argument. |
cvg | Logical: was convergence reached? |
It may happen that the estimated shape parameter is < -0.5, in which case the expected information matrix cannot be computed, nor can the covariance matrix and the standard deviations. In this case, the cov and sd objects contain NA values. This problem can also arise when the shape parameter is greater than but close to the value -0.5. Even when info.observed is TRUE, the information matrix, covariance and standard deviations are then set to NA.
When the true (unknown) value is < -0.5, the regularity conditions required in the ML approximated inference do not hold.
The default value of info.observed was set to TRUE from version 3.0-1 because the standard deviations obtained with this choice are usually better.
Yves Deville
See the Renext Computing Details document.
The fmaxlo and flomax functions.
## Not run:
set.seed(123456)
n <- 500
ns <- 1000
xi <- runif(ns, min = -0.5, max = 0.5)
X <- matrix(nrow = n, ncol = ns)
for (i in 1:length(xi)) {
  Xi <- rgpd(n, scale = 1, shape = xi[i])
  X[ , i] <- Xi
  res1 <- fGPD(Xi)
  res2 <- try(fpot(Xi, threshold = 0.0))
  if (inherits(res2, "try-error")) {
    cat(res2, "\n")
    break
  }
  logLik1 <- res1$loglik; logLik2 <- logLik(res2)
  if (abs(logLik1 - logLik2) > 0.001) {
    cat(sprintf("i = %d, xi = %7.4f\n", i, xi[i]))
    mat <- rbind(c(res1$estimate[1:2], logLik = logLik1),
                 c(res2$estimate[1:2], logLik = logLik2))
    rownames(mat) <- c("fGPD", "fpot")
    print(mat)
  }
}
## End(Not run)
Fast Maximum Likelihood estimation of the Lomax distribution.
flomax(x, info.observed = TRUE, plot = FALSE, scaleData = TRUE, cov = TRUE)
x | Sample vector to be fitted. Should contain only positive non-NA values. |
info.observed | Should the observed information matrix be used rather than the expected one? |
plot | Logical. If TRUE, a plot of the concentrated log-likelihood is shown. |
scaleData | Logical. If TRUE, the data are scaled before the estimation. |
cov | Logical. If TRUE, the covariance matrix of the estimates is computed. |
The likelihood is concentrated with respect to the shape parameter. The concentrated log-likelihood, a function of the scale parameter beta, is increasing for small values of beta. For large beta, its derivative tends to zero, and its sign is that of 1 - CV where CV is the coefficient of variation, computed using n as denominator in the formula for the standard deviation.
The ML estimate does not exist when the sample has a coefficient of variation CV less than 1 and it may fail to be found when CV is greater than yet close to 1.
A list with the following elements
estimate | Parameter ML estimates. |
sd | Vector of (asymptotic) standard deviations for the estimates. |
loglik | The maximised log-likelihood. |
dloglik | Gradient of the log-likelihood at the optimum. Its two elements should normally be close to zero. |
cov | The (asymptotic) covariance matrix computed from the theoretical or observed information matrix. |
info | The information matrix. |
The estimates are biased for small or medium sized samples. The bias is positive for the shape parameter, thus the estimated shape tends to be larger than the true unknown value.
Fitting a Lomax distribution to an exponential sample might lead to a divergence since the exponential is the limit of a Lomax distribution with large shape and large scale with constant ratio shape/scale. Fitting this distribution to a sample having a coefficient of variation smaller than 1 is not allowed since it should lead to divergence of the estimation.
The default value of info.observed was set to TRUE from version 3.0-1 because the standard deviations obtained with this choice are usually better.
Yves Deville
J. del Castillo and J. Daoudi (2009) "Estimation of the Generalized Pareto Distribution", Statist. Probab. Lett. 79(5), pp. 684-688.
D.E. Giles, H. Feng & R.T. Godwin (2013) "On the Bias of the Maximum Likelihood Estimator for the Two-Parameter Lomax Distribution" Comm. Statist. Theory Methods. Vol. 42, n. 11, pp. 1934-1950.
N. Johnson, S. Kotz and N. Balakrishnan Continuous Univariate Distributions vol. 1, Wiley 1994.
Lomax for the Lomax distribution.
## generate sample
set.seed(1234)
n <- 200
alpha <- 2 + rexp(1)
beta <- 1 + rexp(1)
x <- rlomax(n, scale = beta, shape = alpha)
res <- flomax(x, plot = TRUE)
## compare with a GPD with shape 'xi' and scale 'sigma'
xi <- 1 / alpha; sigma <- beta * xi
res.evd <- evd::fpot(x, threshold = 0, model = "gpd")
xi.evd <- res.evd$estimate["shape"]
sigma.evd <- res.evd$estimate["scale"]
beta.evd <- sigma.evd / xi.evd
alpha.evd <- 1 / xi.evd
cbind(Renext = res$estimate,
      evd = c(alpha = alpha.evd, beta = beta.evd))
Fast Maximum Likelihood estimation of a 'maxlo' distribution.
fmaxlo(x, shapeMin = 1.25, info.observed = TRUE, plot = FALSE, scaleData = TRUE, cov = TRUE)
x | Sample vector to be fitted. Should contain only positive non-NA values. |
shapeMin | Lower bound on the shape parameter. |
info.observed | Should the observed information matrix be used rather than the expected one? |
plot | Logical. If TRUE, a plot of the concentrated log-likelihood is shown. |
scaleData | Logical. If TRUE, the data are scaled before the estimation. |
cov | Logical. If TRUE, the covariance matrix of the estimates is computed. |
The 'maxlo' likelihood is concentrated with respect to the shape parameter, thus the function to be maximised has only one scalar argument: the scale parameter beta. For a large scale beta, the derivative of the concentrated log-likelihood tends to zero, and its sign is that of CV - 1 where CV is the coefficient of variation, computed using n as denominator in the formula for the standard deviation.
The ML estimate does not exist when the sample has a coefficient of variation CV greater than 1.0 and it may fail to be found when CV is smaller than yet close to 1.0.
The expected information matrix can be obtained by noticing that when the r.v. Y follows the 'maxlo' distribution with shape alpha and scale beta, the r.v. Z = beta / (beta - Y) follows a Pareto distribution with minimum 1 and shape parameter alpha. The information matrix involves second order moments of the corresponding Pareto variable.
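A quick Monte-Carlo check of this relation (a sketch; rmaxlo is the package's own simulator):

alpha <- 3; beta <- 2
y <- rmaxlo(10000, shape = alpha, scale = beta)
z <- beta / (beta - y)
mean(z > 1.5)     ## empirical P(Z > 1.5)
1.5^(-alpha)      ## Pareto survival with minimum 1 and shape alpha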
The default value of info.observed was set to TRUE from version 3.0-1 because the standard deviations obtained with this choice are usually better.
A list with the following elements
estimate | Parameter ML estimates. |
sd | Vector of (asymptotic) standard deviations for the estimates. |
loglik | The maximised log-likelihood. |
dloglik | Gradient of the log-likelihood at the optimum. Its two elements should normally be close to zero. |
cov | The (asymptotic) covariance matrix computed from the theoretical or observed information matrix. |
info | The information matrix. |
The name of the distribution hence also that of the fitting function are still experimental and might be changed.
Yves Deville
Maxlo for the description of the distribution.
## generate sample
set.seed(1234)
n <- 200
alpha <- 2 + rexp(1)
beta <- 1 + rexp(1)
x <- rmaxlo(n, scale = beta, shape = alpha)
res <- fmaxlo(x, plot = TRUE)
## compare with a GPD with shape 'xi' and scale 'sigma'
xi <- -1 / alpha; sigma <- -beta * xi
res.evd <- evd::fpot(x, threshold = 0, model = "gpd")
xi.evd <- res.evd$estimate["shape"]
sigma.evd <- res.evd$estimate["scale"]
beta.evd <- -sigma.evd / xi.evd
alpha.evd <- -1 / xi.evd
cbind(Renext = res$estimate,
      evd = c(alpha = alpha.evd, beta = beta.evd))
Fast Maximum Likelihood estimation of the classical two-parameter Weibull distribution.
fweibull(x, info.observed = TRUE, scaleData = TRUE, cov = TRUE, check.loglik = FALSE)
x | Sample vector to be fitted. Should contain only positive non-NA values. |
info.observed | Should the observed information matrix be used rather than the expected one? |
scaleData | Should the data be scaled before the estimation? |
cov | Should the covariance of the estimates be computed? |
check.loglik | If TRUE, the maximised log-likelihood is recomputed in a direct manner as a check. |
The ML estimates are obtained thanks to a reparameterisation using a parameter eta in place of the scale. This allows the maximisation of a one-dimensional likelihood since the eta parameter can be concentrated out of the likelihood. This also allows the determination of the expected information matrix rather than the usual observed information.
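The idea can be sketched as a standard profile likelihood, assuming here that eta is the 'rate-like' parameter 1 / scale^shape (an assumption: the exact definition of eta is not spelled out above). For a fixed shape b, the ML value of eta is n / sum(x^b).

profW <- function(b, x) {
    n <- length(x)
    eta <- n / sum(x^b)                      ## concentrated-out parameter
    n * log(b) + n * log(eta) + (b - 1) * sum(log(x)) - eta * sum(x^b)
}
x <- rweibull(200, shape = 1.5, scale = 2)
b <- optimize(profW, c(0.01, 100), x = x, maximum = TRUE)$maximum
c(shape = b, scale = (sum(x^b) / length(x))^(1 / b))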
A list
estimate | Parameter ML estimates. |
sd | The (asymptotic) standard deviations for the estimates. |
cov | The (asymptotic) covariance matrix computed from the theoretical or observed information matrix. |
eta | The estimated value for eta. |
The default value of info.observed was set to TRUE from version 3.0-1 because the standard deviations obtained with this choice are usually better.
Yves Deville
weibplot for Weibull plots.
n <- 1000
set.seed(1234)
shape <- 2 * runif(1)
x <- 100 * rweibull(n, shape = 0.8, scale = 1)
res <- fweibull(x)
## compare with MASS
if (require(MASS)) {
  res2 <- fitdistr(x, "weibull")
  est <- cbind(res$estimate, res2$estimate)
  colnames(est) <- c("Renext", "MASS")
  loglik <- c(res$loglik, res2$loglik)
  est <- rbind(est, loglik)
  est
}
## Weibull plot
weibplot(x,
         shape = c(res$estimate["shape"], res2$estimate["shape"]),
         scale = c(res$estimate["scale"], res2$estimate["scale"]),
         labels = c("Renext 'fweibull'", "MASS 'fitdistr'"),
         mono = TRUE)
Flow of the French river La Garonne at Le Mas d'Agenais
Garonne
The format is: List of 7
 $ info : List of 6
   $ name : chr "Garonne"
   $ shortLab : chr "La Garonne at Le Mas d'Agenais"
   $ longLab : chr "River flow of La Garonne at Le Mas d'Agenais"
   $ varName : chr "Flow"
   $ varShortLab : chr "Flow"
   $ varUnit : chr "m3/s"
 $ describe : chr "Flow of the french river La Garonne ..."
 $ OTinfo : List of 4
   $ start : POSIXct[1:1], format: "1913-01-01"
   $ end : POSIXct[1:1], format: "1978-01-01"
   $ effDuration : num 65
   $ threshold : num 2500
 $ OTdata : 'data.frame': 151 obs. of 3 variables:
   $ date : POSIXct[1:151], format: "1913-04-08" "1913-04-25" ...
   $ Flow : num [1:151] 2600 2800 2700 4579 3400 ...
   $ comment : chr [1:151] "" "" "" "" ...
 $ OTmissing : NULL
 $ MAXinfo : 'data.frame': 1 obs. of 3 variables:
   $ start : POSIXct[1:1], format: "1770-01-01"
   $ end : POSIXct[1:1], format: "1913-01-01"
   $ duration : num 143
 $ MAXdata : 'data.frame': 12 obs. of 4 variables:
   $ block : num [1:12] 1 1 1 1 1 1 1 1 1 1 ...
   $ date : POSIXct[1:12], format: NA NA ...
   $ Flow : num [1:12] 7500 7400 7000 7000 7000 6600 6500 6500 6400 6300 ...
   $ comment : chr [1:12] "1 (1875)" "2 (1770)" "3 (1783)" "4 (1855)" ...
 - attr(*, "class") = chr "Rendata"
The data concern the French river La Garonne at the gauging station named Le Mas d'Agenais where many floods occurred during the past centuries.
The data consist in OT data and historical data. The variable is the river flow in cubic metres per second (m3/s) as estimated from the river level using a rating curve. The precision is limited and many ties are present among the flow values.
The OT data or "OTdata" contain the flow values over the threshold of 2500 m3/s for the 65 years of the period 1913-1977.
The historical data or "MAXdata" are simply the 12 largest flows for the period of 143 years 1770-1912. The exact dates of these events are not known with precision but the years are known and given as comments.
The data were taken from the book by Miquel.
Miquel J. (1984) Guide pratique d'estimation des probabilités de crues, Eyrolles (coll. EDF DER).
Parent E. and Bernier J. (2003) Bayesian POT modeling for Historical data. Journal of Hydrology vol. 274, pp. 95-108.
plot(Garonne)
Translate a (named) vector of GEV parameters into a renewal model.
gev2Ren(parGev, threshold = NULL, lambda = NULL, w = 1,
        distname.y = c("gpd", "GPD", "lomax", "maxlo"),
        vcovGev = NULL, jacobian = TRUE, plot = FALSE)
parGev | Named vector of GEV coefficients. This must be a vector of length 3, with names "loc", "scale" and "shape". |
threshold | The threshold of the renewal model. |
lambda | The rate of the renewal model. |
w | The duration corresponding to the GEV parameters. Positive numeric scalar assumed to be in years. |
distname.y | The name of the distribution for the excesses in the renewal model. |
vcovGev | A (co)variance matrix for the parameters. This must be a symmetric positive matrix with 3 rows and 3 columns, with rownames in correspondence with the parameter names. |
jacobian | Logical. If TRUE, the Jacobian matrix of the transformation is computed and returned as an attribute of the result. |
plot | If TRUE, a plot is shown. |
A vector of parameters similar to the coefficient vector of an object with class "Renouv". This vector has an element named "lambda" corresponding to the rate of the Homogeneous Poisson Process. The other elements are the parameters for the distribution of the POT excesses.
The result has attributes named "distname.y" and "threshold" which can be used to build a Renouv object using the RenouvNoEst function. The result may as well have attributes "jacobian" and "vcov" according to the argument values. These objects should be used with attention to their element names, since the parameter order may not be the one you expect.
Yves Deville
The Ren2gev function provides a reciprocal transformation.
## GEV parameters and block duration
set.seed(1234)
muGev <- rnorm(1); sigmaGev <- rgamma(1, shape = 5)
xiGev <- runif(1, min = -0.2, max = 0.3)
parGev <- c("loc" = muGev, "scale" = sigmaGev, "shape" = xiGev)
parRen <- gev2Ren(parGev, lambda = 1, jacobian = TRUE, plot = TRUE)
## check by reverse transform
parGevCheck <- Ren2gev(parRen, threshold = attr(parRen, "threshold"))
rbind(parGev, parGevCheck)
##=======================================================================
## slightly positive shape: convert to "gpd" and "lomax" distributions
##=======================================================================
x <- rgev(n = 100, loc = 3500, scale = 1000, shape = 0.1)
fit.gev <- fgev(x, start = list("loc" = 3000, "scale" = 1000, "shape" = 0.1))
distNames <- c("gpd", "lomax")
namesRen <- c("lambda", "scale", "shape") # works for the 2 target dists
fitNew <- list()
opar <- par(mfrow = c(3, 1))
for (nm in distNames) {
  parRen <- gev2Ren(parGev = fit.gev$estimate, threshold = 2800,
                    vcov = fit.gev$var.cov, distname.y = nm)
  myVcov <- attr(parRen, "vcov")[namesRen, namesRen]
  fitNew[[nm]] <- RenouvNoEst(threshold = attr(parRen, "threshold"),
                              estimate = parRen,
                              distname.y = attr(parRen, "distname.y"),
                              cov = myVcov)
  plot(fitNew[[nm]], Tlim = c(1, 200))
}
plot(fit.gev, which = 4)
par(opar)
##=======================================================================
## slightly negative shape: convert to "gpd" and "maxlo" distributions
##=======================================================================
x <- rgev(n = 100, loc = 3500, scale = 1000, shape = -0.2)
fit.gev <- fgev(x, start = list("loc" = 3000, "scale" = 1000, "shape" = 0.1))
distNames <- c("gpd", "maxlo")
fitNew <- list()
opar <- par(mfrow = c(3, 1))
for (nm in distNames) {
  parRen <- gev2Ren(parGev = fit.gev$estimate, threshold = 2800,
                    vcov = fit.gev$var.cov, distname.y = nm)
  myVcov <- attr(parRen, "vcov")[namesRen, namesRen]
  fitNew[[nm]] <- RenouvNoEst(threshold = attr(parRen, "threshold"),
                              estimate = parRen,
                              distname.y = attr(parRen, "distname.y"),
                              cov = myVcov)
  plot(fitNew[[nm]], Tlim = c(1, 200))
}
plot(fit.gev, which = 4)
par(opar)
Goodness-of-fit diagnostics for the distribution of event dates in an (assumed) Poisson process
gof.date(date, start = NULL, end = NULL, plot = TRUE, main = NULL,
         skip = NULL, plot.type = "skip")
date | Object of class "POSIXct" giving the event dates. |
start | The beginning of the observation interval, a "POSIXct" object. |
end | Object of class "POSIXct" giving the end of the observation interval. |
plot | Should a plot be shown? |
main | Character giving the main title of the plot. |
skip | Optional data.frame with columns start and end, giving the skipped (e.g. missing) periods. |
plot.type | Character indicating the type of plot to produce when a skip data.frame is given. |
In the homogeneous Poisson process, events occur on a time interval in a uniform fashion. More precisely, for a given time interval the distribution of the event dates conditional to their number n is the distribution of the order statistics of a sample of size n of the uniform distribution on this interval.
When the interval has limits taken at events, the uniformity statement remains true, but only for the inner events. This behaviour is met when start and end are not given and are taken as the first and last events in date.
A list
effKS.statistic , KS.statistic
|
Kolmogorov-Smirnov global test statistic for uniformity (bilateral test) omitting slipped periods or not. |
effKS.pvalue , KS.pavalue
|
Critical probability in the KS test omitting skipped periods or not. |
effnevt , nevt
|
Number of events omitting skipped periods or not. |
effduration , duration
|
Effective duration, i.e. total duration in years, omitting skipped periods or not. |
effrate , rate
|
Occurrence rate in number of events by year, omitting skipped periods or not. |
noskip |
Data.frame object giving indications on the periods that are NOT
skipped over (hence usually non-missing periods). These are :
|
When the number of events corresponding to the indications of the arguments is 0, the function returns NULL with a warning. When the number of events is less than 6, a warning is shown.

When skipped periods exist, the number of events, the duration, the rate and the global KS test must be computed by omitting the skipped periods from the duration and retaining only valid interevents. The values given in nevt, rate and duration should therefore be used only when no skipped period exists (skip = NULL on input), and be replaced by effnevt, effrate and effduration otherwise.
In practice, datasets often contain missing periods. The diagnostic should therefore be applied to every period without missing data. Even if the event dates seem reasonably uniform, it is a good idea to check that the rates do not differ significantly over intervals.
When some events are missing and no suitable information is given via
the skip
argument, the global rate
, KS.statistic
and KS.pvalue
are of little interest. Yet the graph might be
instructive.
Yves Deville
interevt function for the determination of interevents and subsequent diagnostics.
## Use "Brest" dataset ## simple plot. Kolmogorov-Smirnov is not useful gof1 <- gof.date(date = Brest$OTdata$date) ## consider missing periods. Much better! gof2 <- gof.date(date = Brest$OTdata$date, skip = Brest$OTmissing, start = Brest$OTinfo$start, end = Brest$OTinfo$end) print(gof2$noskip) ## Second type of graph gof3 <- gof.date(date = Brest$OTdata$date, skip = Brest$OTmissing, start = Brest$OTinfo$start, end = Brest$OTinfo$end, plot.type = "omit") ## non-skipped periods at Brest ns <- skip2noskip(skip = Brest$OTmissing, start = Brest$OTinfo$start, end = Brest$OTinfo$end) ## say 9 plots/diagnostics oldpar <- par(mar = c(3, 4, 3, 2), mfcol = c(3, 3)) for (i in 1:9) { GOF <- gof.date(date = Brest$OTdata$date, start = ns$start[i], end = ns$end[i]) } par(oldpar)
Bartlett's goodness-of-fit test for exponential distribution
gofExp.test(x)
x |
Sample with positive values. |
A list with elements
statistic |
Statistic. |
p.value |
Critical probability. |
Yves Deville
See
Yagouti A., Abi-Zeid I., Ouarda T.B.M.J. and Bobée B. (2001), "Revue de processus ponctuels et synthèse de tests statistiques pour le choix d'un type de processus", Revue des Sciences de l'Eau, 1, pp. 323-361.
Among other goodness-of-fit tests ks.test
in the
stats
package. See expplot
for a graphical
diagnostic.
## a sample of size 30 x <- rexp(30) res <- gofExp.test(x) ## ns samples: p.values should look as uniform on (0, 1) ns <- 100 xmat <- matrix(rexp(30*ns), nrow = ns, ncol = 30) p.values <- apply(xmat, 1, function(x) gofExp.test(x)$p.value) plot(sort(p.values), type = "p", pch = 16)
Density function, distribution function, quantile function, random generation, hazard and cumulative hazard functions for the Generalised Pareto Distribution.
dGPD(x, loc = 0.0, scale = 1.0, shape = 0.0, log = FALSE) pGPD(q, loc = 0.0, scale = 1.0, shape = 0.0, lower.tail = TRUE) qGPD(p, loc = 0.0, scale = 1.0, shape = 0.0, lower.tail = TRUE) rGPD(n, loc = 0.0, scale = 1.0, shape = 0.0) hGPD(x, loc = 0.0, scale = 1.0, shape = 0.0) HGPD(x, loc = 0.0, scale = 1.0, shape = 0.0)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc |
Location parameter |
scale |
Scale parameter |
shape |
Shape parameter |
log |
Logical; if |
lower.tail |
Logical; if |
Let $\mu$, $\sigma$ and $\xi$ denote loc, scale and shape. The distribution values are $x \geq \mu$ (with $x \leq \mu - \sigma/\xi$ when $\xi < 0$).

When $\xi \neq 0$, the survival function value for $x \geq \mu$ is given by
$$S(x) = \left[ 1 + \xi\, \frac{x - \mu}{\sigma} \right]^{-1/\xi},$$
where the upper end-point is $\mu - \sigma/\xi$ for $\xi < 0$ and $\infty$ for $\xi > 0$.

When $\xi = 0$, the distribution is exponential with survival
$$S(x) = \exp\left[ -\frac{x - \mu}{\sigma} \right].$$
dGPD
gives the density function, pGPD
gives the
distribution function, qGPD
gives the quantile function, and
rGPD
generates random deviates. The functions
hGPD
and HGPD
return the hazard rate and the cumulative
hazard.
The functions are slight adaptations of the [r,d,p,q]gpd
functions in the evd package. The main difference is that
these functions return NaN
when shape
is negative, as
it might be needed in unconstrained optimisation. The quantile function
can be used with p=0
and p=1
, then returning the lower and
upper end-point.
fGPD
to fit such a distribution by Maximum Likelihood.
qGPD(p = c(0, 1), shape = -0.2) shape <- -0.3 xlim <- qGPD(p = c(0, 1), shape = shape) x <- seq(from = xlim[1], to = xlim[2], length.out = 100) h <- hGPD(x, shape = shape) plot(x, h, type = "o", main = "hazard rate for shape < 0") shape <- 0.2 xlim <- qGPD(p = c(0, 1 - 1e-5), shape = shape) x <- seq(from = xlim[1], to = xlim[2], length.out = 100) h <- hGPD(x, shape = shape) plot(x, h, type = "o", main = "hazard rate shape > 0 ")
Translate a (named) vector of Gumbel parameters into a vector of parameters for a renewal model.
gumbel2Ren(parGumbel, threshold = NULL, lambda = NULL, w = 1, distname.y = c("exponential"), vcovGumbel = NULL, jacobian = TRUE, plot = FALSE)
parGumbel |
Named vector of Gumbel parameters, with name |
threshold |
The threshold for the target Renouv parameters. |
lambda |
The rate for the target Renouv parameters. |
w |
A block duration for the conversion. |
distname.y |
The distribution of the excesses. |
vcovGumbel |
A covariance matrix for the Gumbel parameters. |
jacobian |
Logical. If |
plot |
Logical. If |
Given a vector of Gumbel parameters and a block duration, there exists an infinity of Renouv models with exponential excesses leading to the prescribed Gumbel distribution for the maximum of the marks over a block with duration w. One of these models can be chosen by specifying either a threshold or a rate lambda.
A vector of Renouv parameters, which can be used with
RenouvNoEst
.
All Renouv models lead to the same return level curve whatever the choice of threshold or lambda. However, when a covariance matrix is given, the covariance matrix for the Renouv parameters, and consequently the confidence bounds, depend on the threshold or rate. So the computed covariance matrix must in general be considered as putative.
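A minimal sketch of the conversion (the parameter values are illustrative, and the threshold attribute is assumed to behave as in gev2Ren):

```r
## sketch: choose the Renouv model by fixing the rate 'lambda'
parGumbel <- c(loc = 5.0, scale = 2.0)
parRen <- gumbel2Ren(parGumbel, lambda = 1.0, plot = FALSE)
parRen                       ## Renouv parameters, usable with RenouvNoEst
attr(parRen, "threshold")    ## the implied threshold
```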
Yves Deville
gev2Ren
for a similar translation from the GEV
distribution.
Plotting positions for exponential return level plots.
Hpoints(n)
n |
Sample size. |
The plotting positions are numeric values to use as the abscissae corresponding to the order statistics in an exponential return level plot. They range from $1/n$ to about $\log n + \gamma$. They can be related to the plotting positions given by ppoints.
The returned vector has elements
$$H_i = \frac{1}{n} + \frac{1}{n-1} + \dots + \frac{1}{n-i+1}, \qquad 1 \leq i \leq n.$$
This is the expectation of the $i$-th order statistic for a sample of size $n$ of the standard exponential distribution, see e.g. chap. 4 of Embrechts et al.

Numeric vector of plotting positions with length n.
For $n$ large enough, the largest value $H_n$ is approximately $\log n + \gamma$, where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant, and $e^{H_n}$ is about $e^{\gamma} n \approx 1.78\, n$. Thus if the Hpoints are used as plotting positions on a return level plot, the largest observation has a return period of about $1.78\, n$ years.
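The approximation can be checked numerically; a minimal sketch, assuming the formula for $H_i$ given above:

```r
## sketch: large-sample behaviour of the largest plotting position
n <- 200
H <- Hpoints(n)
c(largest = H[n], approx = log(n) + 0.5772157)  ## gamma = Euler-Mascheroni
exp(H[n]) / n    ## should be close to exp(gamma), about 1.78
```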
Yves Deville
Embrechts P., Klüppelberg C. and Mikosch T. (1997) Modelling Extremal Events for Insurance and Finance. Springer.
n <- 30 set.seed(1234) x <- rGPD(n, shape = 0.2) plot(exp(Hpoints(n)), sort(x), log = "x", main = "Basic return level plot")
Compute a simple (preliminary) estimation for the three parameters of the mixture of two exponential distributions
ini.mixexp2(x, plot = FALSE)
x |
Sample: numerical vector with elements |
plot |
Should a graphic be displayed? |
This function gives estimators using several methods if necessary. The
goal is to find the rates rate1
, rate2
and the mixing
probability prob1
with the 'feasibility' constraints 0 <
rate1
< rate2
and 0 < prob1 < 1
.
First the method of moments is used. If the estimates are feasible
they are returned with method
=
"moments"
. If
not, the estimates are derived using two linear regressions. A
regression without constant using only the smallest values gives an
estimator of the mean rate. A regression using only the largest
values gives rate1
and prob1
. Yet the constraints must
be fulfilled. If they are, the estimates are returned (together with
method =
"Hreg"
suggesting a cumulative hazard
regression). If not, a (poor) default estimate is returned with
method =
"arbitrary"
.
A list
estimate |
A vector with named elements |
method |
The method that really produced the estimators. |
The method of moments is implemented in mom.mixexp2
. Further
investigations are needed to compare the estimators (moments or Hreg)
and select the best strategy.
Note that this function returns the estimate within a list and no longer as a vector with named elements as was the case before.
Yves Deville
See MixExp2
, mom.mixexp2
.
set.seed(1234) x <- rmixexp2(n = 100, prob1 = 0.5, rate2 = 4) res <- ini.mixexp2(x, plot = TRUE)
Compute interevent durations from event dates
interevt(date, skip = NULL, noskip = NULL)
date |
A |
skip |
A data.frame containing two |
noskip |
A data.frame like |
Interevents are the time differences between successive dates. When the date argument contains occurrence times of successive events of a homogeneous Poisson process, the interevents are mutually independent and share the same exponential distribution. When some time intervals are skipped independently of the event point process, we may consider the interevents between two non-skipped events such that the time interval between them does not contain any skipped interval. These interevents are still mutually independent with the same exponential distribution. When skip or noskip is not NULL, the computation therefore only retains pairs of successive datetimes "falling" in the same non-skipped period, whose number can then be associated with the interevent.
A list mainly containing an interevt data.frame.
interevt |
Data.frame. Each row describes a retained interevent through a
|
noskip |
Only when |
axis |
When needed, a list with some material to build an axis with uneven
ticks as in the |
Only one of the two arguments skip
and noskip
should be
given in the call. In each case, the rows of the returned data.frame
objects describe periods in chronological order. That is: start
at row 2
must be after the end
value of row 1
and
so on.
Note that there are usually fewer interevents than dates, since two successive dates are retained for an interevent only when they are not separated by a missing period. As a limit case, there can be no interevents if the noskip periods contain only one date from the date vector.
Yves Deville
gof.date for goodness-of-fit diagnostics for the dates of events, and expplot for diagnostics concerning the exponential distribution.
## Use Brest data ie <- interevt(date = Brest$OTdata$date, skip = Brest$OTmissing) expplot(ie$interevt$duration, rate = 1 / mean(ie$interevt$duration), main = "No threshold") ## keep only data over a threshold ind1 <- Brest$OTdata$Surge >= 35 ie1 <- interevt(Brest$OTdata$date[ind1], skip = Brest$OTmissing) expplot(ie1$interevt$duration, main = "Threshold = 35") ## increase threshold ind2 <- Brest$OTdata$Surge >= 55 ie2 <- interevt(date = Brest$OTdata$date[ind2], skip = Brest$OTmissing) expplot(ie2$interevt$duration, main = "Threshold = 55 cm")
Jackson's statistic for the exponentiality test.
Jackson(x, norm = FALSE)
x |
Numeric vector or matrix. In the second case, rows are considered as samples. |
norm |
Logical: if |
The value(s) of the statistic are the ratio of two weighted means of the order statistics.
A numeric vector of length 1
when x
is a
vector, or with length nrow(x)
when x
is a
matrix.
J. Beirlant, T. de Wet and Y. Goegebeur (2006) "A Goodness-of-fit Statistic for Pareto-Type Behaviour", J. Comp. Appl. Math., 186(1), pp. 99-116.
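A minimal sketch of the statistic in both vector and matrix form (the value close to $2$ for exponential data corresponds to the centering used in Jackson.test):

```r
## sketch: Jackson's statistic on a vector and on the rows of a matrix
set.seed(123)
x <- rexp(50)
Jackson(x)                           ## close to 2 for exponential data
X <- matrix(rexp(4 * 50), nrow = 4)  ## rows are samples
Jackson(X)                           ## one value per row
```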
Jackson's test of exponentiality
Jackson.test(x, method = c("num", "sim", "asymp"), nSamp = 15000)
x |
numeric vector or matrix. |
method |
Character: choice of the method used to
compute the |
nSamp |
Number of samples used to compute the
|
Compute Jackson's test of exponentiality. The test statistic is the ratio of two weighted sums of the order statistics. Both sums can also be written as weighted sums of the spacings.
Jackson's statistic $J_n$ for a sample of size $n$ of the exponential distribution can be shown to be approximately normal. More precisely, $\sqrt{n}\,(J_n - 2)$ has approximately a standard normal distribution.
This distribution is used to compute the $p$-value when method is "asymp". When method is "num", a numerical approximation of the distribution is used. Finally, when method is "sim", the $p$-value is computed by simulating nSamp samples of size length(x) and estimating the probability that their Jackson's statistic is larger than that of the 'observed' x.
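A minimal simulation sketch of the normal approximation, assuming the centering and scaling stated above:

```r
## sketch: sqrt(n) * (J_n - 2) is roughly standard normal under exponentiality
set.seed(42)
n <- 100
J <- replicate(2000, Jackson(rexp(n)))
Z <- sqrt(n) * (J - 2)
c(mean = mean(Z), sd = sd(Z))  ## should be roughly 0 and 1
```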
A list of results.
statistic , p.value
|
The statistic and |
df |
Number |
method |
Description of the test implemented, regardless of how the
|
Jackson's test of exponentiality works fine for a Lomax alternative (GPD with heavy tail). It then reaches nearly the same power as a Likelihood Ratio (LR) test, see Kozubowski et al. It can be implemented more easily than the LR test because simulated values of the test statistic can be obtained quickly enough to compute the $p$-value by simulation.
Yves Deville
J. Beirlant, T. de Wet and Y. Goegebeur (2006) "A Goodness-of-fit Statistic for Pareto-Type Behaviour", J. Comp. Appl. Math., 186(1), pp. 99-116.
T.J. Kozubowski, A. K. Panorska, F. Qeadan, A. Gershunov and D. Rominger (2009) "Testing Exponentiality Versus Pareto Distribution via Likelihood Ratio" Comm. Statist. Simulation Comput. 38(1), pp. 118-139.
The Jackson
function computing the statistic and the
LRExp.test
function.
set.seed(1234) x <- rGPD(n = 50, loc = 0, scale = 1, shape = 0.1) Jackson.test(x, method = "num")$p.value Jackson.test(x, method = "asymp")$p.value Jackson.test(x, method = "sim")$p.value
Log-likelihood, AIC, BIC and number of observations of an object of class "Renouv".
## S3 method for class 'Renouv' AIC(object, ..., k = 2) ## S3 method for class 'Renouv' BIC(object, ...) ## S3 method for class 'Renouv' logLik(object, ...) ## S3 method for class 'Renouv' nobs(object, ...)
object |
Object of class |
k |
See |
... |
Not used yet. |
Comparing log-likelihoods, AIC or BIC for different Renouv
objects makes sense only when these share the same data and the same
threshold.
logLik
, AIC
and BIC
can be used with an object of
class "Renouv"
which makes use of historical data. In this
case, the number of observations may be misleading since a single
historical observation may concern dozens of years and thus have a
much greater impact on the estimation of the tail than an "ordinary"
observation.
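As a sketch of such a comparison (the call shown is illustrative: the Garonne dataset and the threshold value are assumptions, and the exact formals are those documented for the Renouv function):

```r
## sketch: compare two excess distributions fitted on the same data
## and the same threshold (illustrative threshold value)
fitE <- Renouv(x = Garonne, threshold = 2900, distname.y = "exponential",
               plot = FALSE)
fitG <- Renouv(x = Garonne, threshold = 2900, distname.y = "gpd",
               plot = FALSE)
c(exponential = AIC(fitE), gpd = AIC(fitG))
```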
Yves Deville
The AIC
, nobs
generic functions.
Density function, distribution function, quantile function and random generation for the Lomax distribution.
dlomax(x, scale = 1.0, shape = 4.0, log = FALSE) plomax(q, scale = 1.0, shape = 4.0, lower.tail = TRUE) qlomax(p, scale = 1.0, shape = 4.0) rlomax(n, scale = 1.0, shape = 4.0)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
scale , shape
|
Scale and shape parameters. Vectors of length > 1 are not accepted. |
log |
Logical; if |
lower.tail |
Logical; if |
The Lomax distribution function with shape $\alpha$ and scale $\beta$ has survival function
$$S(y) = \left[ 1 + \frac{y}{\beta} \right]^{-\alpha}, \qquad y > 0.$$
This distribution has decreasing hazard and increasing mean residual life (MRL). The coefficient of variation decreases with $\alpha$, and tends to $1$ for large $\alpha$. The default value $\alpha = 4$ corresponds to $\mathrm{CV} = \sqrt{2}$.
dlomax
gives the density function, plomax
gives the
distribution function, qlomax
gives the quantile function, and
rlomax
generates random deviates.
This distribution is sometimes called log-exponential. It is a special case of the Generalised Pareto Distribution (GPD) with positive shape $\xi = 1/\alpha$, scale $\sigma = \beta/\alpha$ and location zero. The Lomax and GPD parameters are related according to
$$\alpha = \frac{1}{\xi}, \qquad \beta = \frac{\sigma}{\xi}.$$
The Lomax distribution can be used in POT to describe excesses following a GPD with shape $\xi > 0$, thus with decreasing hazard and increasing Mean Residual Life.

Note that the exponential distribution with rate $\lambda$ is the limit of a Lomax distribution having large scale $\beta$ and large shape $\alpha$, with the constraint $\alpha/\beta = \lambda$ on the shape/scale ratio.
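A minimal numerical check of this correspondence, using the pGPD function documented above:

```r
## sketch: Lomax(shape = alpha, scale = beta) is a GPD with
## shape xi = 1 / alpha and scale sigma = beta / alpha, location zero
alpha <- 4; beta <- 10
q <- c(1, 5, 20)
plomax(q, scale = beta, shape = alpha)
pGPD(q, loc = 0, scale = beta / alpha, shape = 1 / alpha)  ## same values
```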
Johnson N. Kotz S. and N. Balakrishnan Continuous Univariate Distributions vol. 1, Wiley 1994.
Lomax distribution in Wikipedia
flomax
to fit the Lomax distribution by Maximum
Likelihood.
shape <- 5; scale <- 10 xl <- qlomax(c(0.00, 0.99), scale = scale, shape = shape) x <- seq(from = xl[1], to = xl[2], length.out = 200) f <- dlomax(x, scale = scale, shape = shape) plot(x, f, type = "l", main = "Lomax density") F <- plomax(x, scale = scale, shape = shape) plot(x, F, type ="l", main ="Lomax distribution function")
Likelihood Ratio statistic for the exponential distribution vs. GPD.
LRExp(x, alternative = c("lomax", "GPD", "gpd", "maxlo"))
x |
Numeric vector of positive sample values. For the POT context, this should be the vector of excesses over the threshold. |
alternative |
Character string describing the alternative hypothesis |
The Likelihood-Ratio statistic is actually $-2 \log \mathrm{LR}$, where $\mathrm{LR}$ is the ratio of the likelihoods, exponential to alternative distribution.
The LR statistic value.
When the alternative is "lomax" or "maxlo", the statistic has a distribution of mixed type under the null hypothesis of exponentiality: a mixture of a distribution of continuous type (with positive values) and of a Dirac mass at $0$. The probability mass can be computed using the pGreenwood1 function. More precisely, the probability mass is pGreenwood1(n) for the Lomax alternative and 1.0 - pGreenwood1(n) for the maxlo alternative, where n is the sample size length(x).
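A minimal simulation sketch of this probability mass, using only functions documented in this package:

```r
## sketch: under exponentiality, the LR statistic for the "lomax"
## alternative is zero with probability approximately pGreenwood1(n)
set.seed(1)
n <- 40
W <- replicate(500, LRExp(rexp(n), alternative = "lomax"))
c(simulated = mean(W < 1e-6), approx = pGreenwood1(n))
```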
LRExp.test
for the related LR test of exponentiality.
Likelihood Ratio test of exponentiality vs. GPD.
LRExp.test(x, alternative = c("lomax", "GPD", "gpd", "maxlo"), method = c("num", "sim", "asymp"), nSamp = 15000, simW = FALSE)
x |
Numeric vector of positive sample values. For the POT context this should be the vector of excesses over the threshold. |
alternative |
Character string describing the alternative distribution. |
method |
Method used to compute the |
nSamp |
Number of samples for a simulation, if |
simW |
Logical. If this is set to |
The Lomax and maxlo alternatives correspond to a GPD alternative with positive shape parameter $\xi > 0$ (Lomax) and to a GPD with negative shape $\xi < 0$ (maxlo).

The asymptotic distribution of the Likelihood-Ratio statistic is known. For the GPD alternative, this is a chi-square distribution with one df. For the Lomax alternative, this is the distribution of a product $B\,W$ where $B$ and $W$ are two independent random variables, $B$ following a Bernoulli distribution with probability parameter $p = 0.5$ and $W$ a chi-square distribution with one df.
When method is "num", a numerical approximation of the distribution is used. This method is not unlike that used by Kozubowski et al., but relies on a different approximation. However, if x has a large length, the method is switched to "asymp".
When method is "sim", nSamp samples of the exponential distribution with the same size as x are drawn, and the LR statistic is computed for each sample. The $p$-value is simply the estimated probability that a simulated LR is greater than the observed LR.

Finally, when method is "asymp", the asymptotic distribution is used.
A list of results with elements statistic
, p.value
and method
. Other elements are
alternative |
Character describing the alternative hypothesis. |
W |
If |
For the Lomax alternative, the distribution of the test statistic has mixed type: it can take any positive value as well as the value $0$ with a positive probability mass. The probability mass is the probability that the ML estimate of the GPD shape parameter is negative, and a good approximation of it is provided by the pGreenwood1 function. Note that this probability converges to its limit $0.5$ very slowly, which suggests that the asymptotic distribution provides poor results for medium sample sizes.

Similarly, for a maxlo alternative, the distribution of the test statistic has mixed type: it can take any positive value as well as the value $0$ with a positive probability mass approximately given by 1 - pGreenwood1(n), where $n$ is the sample size.
Yves Deville
T.J. Kozubowski, A. K. Panorska, F. Qeadan, A. Gershunov and D. Rominger (2009) "Testing Exponentiality Versus Pareto Distribution via Likelihood Ratio" Comm. Statist. Simulation Comput. 38(1), pp. 118-139.
The approximation method used is described in the Renext Computing Details report.
Lomax
, Maxlo
, GPD
for the
alternatives used here.
set.seed(1234) x <- rGPD(n = 50, loc = 0, scale = 1, shape = 0.1) LRExp.test(x, method = "num")$p.value LRExp.test(x, method = "asymp")$p.value ## Not run: ## requires much time LRExp.test(x, method = "sim")$p.value ## End(Not run)
Likelihood Ratio statistic for the Gumbel distribution vs. GEV.
LRGumbel(x, alternative = c("frechet", "GEV"))
x |
Numeric vector of sample values. |
alternative |
Character string describing the alternative. |
The Likelihood-Ratio statistic is actually $-2 \log \mathrm{LR}$, where $\mathrm{LR}$ is the ratio of the likelihoods, Gumbel to alternative distribution.
The LR statistic value.
When the alternative is "frechet"
, the statistic has a
distribution of mixed type under the null hypothesis of a Gumbel
distribution.
Yves Deville
LRGumbel.test
for the related LR test of Gumbelity.
Likelihood Ratio test of Gumbel vs. GEV
LRGumbel.test(x, alternative = c("frechet", "GEV"), method = c("num", "sim", "asymp"), nSamp = 1500, simW = FALSE)
x |
Numeric vector of sample values. |
alternative |
Character string describing the alternative distribution. |
method |
Method used to compute the |
nSamp |
Number of samples for a simulation, if |
simW |
Logical. If this is set to |
The asymptotic distribution of the Likelihood-Ratio statistic is known. For the GEV alternative, this is a chi-square distribution with one df. For the Fréchet alternative, this is the distribution of a product $B\,W$ where $B$ and $W$ are two independent random variables, $B$ following a Bernoulli distribution with probability parameter $p = 0.5$ and $W$ a chi-square distribution with one df.
When method
is "num"
, a numerical approximation
of the distribution is used.
When method is "sim", nSamp samples of the Gumbel distribution with the same size as x are drawn, and the LR statistic is computed for each sample. The $p$-value is simply the estimated probability that a simulated LR is greater than the observed LR. This method requires more computation time than the two others.
Finally when method
is "asymp"
, the asymptotic
distribution is used.
A list of results with elements statistic
, p.value
and method
. Other elements are
alternative |
Character describing the alternative hypothesis. |
W |
If |
For the Fréchet alternative, the distribution of the test statistic has mixed type: it can take any positive value as well as the value $0$ with a positive probability mass. The probability mass is the probability that the ML estimate of the GEV shape parameter is negative.
When method is "sim", the computation can be slow because each of the nSamp simulated values requires two optimisations. The "asymp" method provides an acceptable precision for moderate to large sample sizes.
Yves Deville
set.seed(1234) x <- rgumbel(60) res <- LRGumbel.test(x)
Density function, distribution function, quantile function and random generation for the 'maxlo' distribution.
dmaxlo(x, scale = 1.0, shape = 4.0, log = FALSE) pmaxlo(q, scale = 1.0, shape = 4.0, lower.tail = TRUE) qmaxlo(p, scale = 1.0, shape = 4.0) rmaxlo(n, scale = 1.0, shape = 4.0)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
scale , shape
|
Scale and shape parameters. Vectors of length > 1 are not accepted. |
log |
Logical; if |
lower.tail |
Logical; if |
The 'maxlo' distribution function with shape $\alpha$ and scale $\beta$ is a special case of the Generalised Pareto (GPD) with negative shape $\xi = -1/\alpha$ and location at zero. This is the finite upper end-point case of the GPD. Its name is nonstandard and was chosen to suggest some form of symmetry with respect to the Lomax distribution.

The survival function is
$$S(y) = \left[ 1 - \frac{y}{\beta} \right]^{\alpha}, \qquad 0 < y < \beta.$$
This distribution has a coefficient of variation smaller than $1$.
dmaxlo
gives the density function, pmaxlo
gives the
distribution function, qmaxlo
gives the quantile function, and
rmaxlo
generates random deviates.
The 'maxlo' and GPD parameters are related according to
$$\alpha = -\frac{1}{\xi}, \qquad \beta = -\frac{\sigma}{\xi},$$
where $\sigma$ is the scale parameter of the GPD. Since only GPD with $\xi > -0.5$ seem to be used in practice, this distribution should be used with $\alpha > 2$.
This distribution can be used in POT to describe bounded excesses following a GPD with shape $\xi < 0$. The scale parameter $\beta$ then represents the upper end-point of the excesses, implying the finite upper end-point $u + \beta$ for the levels, where $u$ is the threshold. It can be used in Renouv with a fixed scale parameter, thus allowing a control of the upper end-point.
This distribution is simply a rescaled version of a beta distribution, and also a rescaled version of a Kumaraswamy distribution. The name "maxlo" is used here to suggest a form of symmetry with the Lomax distribution.
fmaxlo
to fit such a distribution by Maximum Likelihood.
xs <- rmaxlo(500, shape = 2.2, scale = 1000) hist(xs, main = "'maxlo' distribution"); rug(xs) xs <- rmaxlo(500, shape = 4, scale = 1000) hist(xs, main = "'maxlo' distribution"); rug(xs) x <- seq(from = -10, to = 1010, by = 2) plot(x = x, y = dmaxlo(x, shape = 4, scale = 1000), type = "l", ylab = "dens", col = "orangered", main = "dmaxlo and dgpd") abline(h = 0) lines(x = x, y = dgpd(x, shape = -1/4, scale = 250), type = "l", col = "SpringGreen3", lty = "dashed")
Probability functions associated to the mixture of two exponential distributions.
dmixexp2(x, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta, log = FALSE) pmixexp2(q, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta, log = FALSE) qmixexp2(p, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta) rmixexp2(n, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta) hmixexp2(x, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta) Hmixexp2(x, prob1, rate1 = 1.0, rate2 = rate1 + delta, delta)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
log |
Logical; if |
prob1 |
Probability weight for the "number 1" exponential density. |
rate1 |
Rate (inverse expectation) for the "number 1" exponential density. |
rate2 |
Rate (inverse expectation) for the "number 2" exponential
density. Should in most cases be |
delta |
Alternative parameterisation: delta = rate2 - rate1. |
The density function is the mixture of two exponential densities
$$f(x) = p_1\, \lambda_1 e^{-\lambda_1 x} + (1 - p_1)\, \lambda_2 e^{-\lambda_2 x}, \qquad x > 0,$$
where $p_1$ is the probability given in prob1 while $\lambda_1$ and $\lambda_2$ are the two rates given in rate1 and rate2.

A 'naive' identifiability constraint is $\lambda_1 < \lambda_2$, i.e. rate1 < rate2, corresponding to the simple constraint delta > 0. The parameter delta can be given instead of rate2.
The mixture distribution has a decreasing hazard, increasing Mean Residual Life (MRL) and has a thicker tail than the usual exponential. However the hazard, MRL have a finite non zero limit and the distribution behaves as an exponential for large return levels/periods.
The quantile function is not available in closed form and is computed using a dedicated numerical method.
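A minimal sketch checking dmixexp2 against the mixture density written out by hand:

```r
## sketch: the mixexp2 density is the weighted sum of two exponential densities
p1 <- 0.7; r1 <- 1.0; r2 <- 5.0
x <- c(0.1, 1.0, 3.0)
dmixexp2(x, prob1 = p1, rate1 = r1, rate2 = r2)
p1 * dexp(x, rate = r1) + (1 - p1) * dexp(x, rate = r2)  ## same values
```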
dmixexp2, pmixexp2 and qmixexp2 evaluate the density, the distribution and the quantile functions. rmixexp2 generates a vector of n random draws from the distribution. hmixexp2 gives the hazard rate and Hmixexp2 gives the cumulative hazard.
rate1 <- 1.0 rate2 <- 4.0 prob1 <- 0.8 qs <- qmixexp2(p = c(0.99, 0.999), prob1 = prob1, rate1 = rate1, rate2 = rate2) x <- seq(from = 0, to = qs[2], length.out = 200) F <- pmixexp2(x, prob1 = prob1, rate1 = rate1, rate2 = rate2) plot(x, F, type = "l", col = "orangered", lwd = 2, main = "Mixexp2 distribution and quantile for p = 0.99") abline(v = qs[1]) abline(h = 0.99)
Compute the moment estimation for the three parameters of the mixture of two exponential distributions
mom.mixexp2(x)
x |
Sample. Vector containing values |
The three parameters (probability and the two rates) are computed from the first three moments (theoretical and sample). It can be shown that the inverse rates are obtained by solving a quadratic equation. However, the roots can be negative or complex, in which case the estimates are not valid.
A list with elements
estimate |
A vector with named elements |
method |
Character |
The theoretical coefficient of variation (CV) of a mixture of two exponential distributions always exceeds 100%. When the sample CV is smaller than 100%, no valid estimates exist since the two first moments cannot be matched.
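A minimal sketch of the corresponding sanity check on a sample:

```r
## sketch: the sample CV should exceed 1 for the moment method to apply
set.seed(1234)
x <- rmixexp2(n = 200, prob1 = 0.3, rate1 = 1.0, rate2 = 6.0)
sd(x) / mean(x)   ## usually > 1 for such a mixture
```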
Yves Deville
Paul R. Rider. The Method of Moments Applied to a Mixture of Two Exponential Distributions. Ann. Math. Statist. Vol. 32, Number 1 (1961), 143-147.
See ini.mixexp2
for a more versatile initial estimation.
x <- rmixexp2(n = 100, prob1 = 0.5, rate1 = 1.0, rate2 = 3.0) est <- mom.mixexp2(x)
Compute parameters from (theoretical) moments
mom2par(densfun = "exponential", mean, sd = NA)
densfun |
Name of the distribution. This can be at present time:
|
mean |
Theoretical mean (expectation) of the distribution. Can be a vector, in which case the parameters will be vectors. |
sd |
Standard deviation. |
For some distributions like Weibull, it is necessary to find a numerical solution since the parameters have no closed form expression involving the moments.
A named list containing the parameters values e.g. with names shape
and scale
. When mean
or sd
is vector the list
contains vectors.
The name of the formal argument densfun is for compatibility with fitdistr from the MASS package. However, unlike in fitdistr, this formal argument cannot be given a density function, i.e. an object of class "function" such as dnorm.
Yves Deville
## Weibull mom2par(densfun = "weibull", mean = 1, sd = 2) ## Generalised Pareto mom2par(densfun = "gpd", mean = 1, sd = 2) ## Gamma mom2par(densfun = "gamma", mean = 1:10, sd = 1)
Negative Binomial Lévy process estimation from partial observations (counts)
NBlevy(N, gamma = NA, prob = NA, w = rep(1, length(N)), sum.w = sum(w), interval = c(0.01, 1000), optim = TRUE, plot = FALSE, ...)
N |
Vector of counts, one count by time period. |
gamma |
The |
prob |
The |
w |
Vector of time length (durations). |
sum.w |
NOT IMPLEMENTED YET. The effective duration. If |
interval |
Interval giving min and max values for |
optim |
If |
plot |
Should a plot be drawn? May be removed in the future. |
... |
Arguments passed to |
The vector $N$ contains counts of events occurring on non-overlapping time periods with lengths given in $w$. Under the NB Lévy process assumptions, the observed counts (i.e. the elements $N_i$ of $N$) are independent random variables, each following a negative binomial distribution. The size parameter for $N_i$ is $\gamma w_i$ and the probability parameter is prob, say $p$. The vector $\mu$ of the expected counts has elements
$$\mu_i = \gamma w_i\, \frac{1 - p}{p}.$$
The parameters $\gamma$ and $p$ are estimated by Maximum Likelihood, using the likelihood concentrated with respect to the prob parameter.
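A minimal sketch of the count distribution implied by these assumptions, using the standard moments of the negative binomial distribution:

```r
## sketch: a count with size gamma * w and probability parameter p has
## mean gamma * w * (1 - p) / p and index of dispersion 1 / p
gam <- 6; p <- 0.2; w <- 2
N <- rnbinom(10000, size = gam * w, prob = p)
c(empirical = mean(N), theoretical = gam * w * (1 - p) / p)
c(empirical = var(N) / mean(N), theoretical = 1 / p)
```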
A list with the results
estimate |
Parameter estimates. |
sd |
Standard deviation for the estimate. |
score |
Score vector at the estimated parameter vector. |
info |
Observed information matrix. |
cov |
Covariance matrix (approx.). |
The Negative Binomial Lévy process is an alternative to
the Homogeneous Poisson Process when counts are subject to
overdispersion. In the NB process, all counts share the same index of
dispersion (variance/expectation ratio), namely 1/prob
. When
prob
is close to 1, the counts are nearly Poisson-distributed.
Yves Deville
Kozubowski T.J. and Podgórsky K. (2009) "Distributional properties of the negative binomial Lévy process". Probability and Mathematical Statistics 29, pp. 43-71. Lund University Publications.
NegBinomial
for the negative binomial distribution,
glm.nb
from the MASS package for fitting Generalised
Linear Model of the negative binomial family.
## known parameters nint <- 100 gam <- 6; prob <- 0.20 ## draw random w, then the counts N w <- rgamma(nint, shape = 3, scale = 1/5) N <- rnbinom(nint, size = w * gam, prob = prob) mu <- w * gam * (1 - prob) / prob Res <- NBlevy(N = N, w = w) ## Use example data 'Brest' ## compute the number of event and duration of the non-skipped periods gof1 <- gof.date(date = Brest$OTdata$date, skip = Brest$OTmissing, start = Brest$OTinfo$start, end = Brest$OTinfo$end, plot.type = "omit") ns1 <- gof1$noskip ## fit the NBlevy fit1 <- NBlevy(N = ns1$nevt, w = ns1$duration) ## use a higher threshold OT2 <- subset(Brest$OTdata, Surge > 50) gof2 <- gof.date(date = OT2$date, skip = Brest$OTmissing, start = Brest$OTinfo$start, end = Brest$OTinfo$end, plot.type = "omit") ns2 <- gof2$noskip ## the NBlevy prob is now closer to 1 fit2 <- NBlevy(N = ns2$nevt, w = ns2$duration) c(fit1$prob, fit2$prob)
Temporal aggregation of a Marked Process, leading to block maxima or $r$-largest observations.
OT2MAX(OTdata, OTmissing = NULL, start = NULL, end = NULL, MAX.r = 1L, blockDuration = "year", monthGapStat = TRUE, maxMissingFrac = 0.05, dataFrames = FALSE, infMAX = FALSE, plot = TRUE, plotType = c("max", "gaps"), jitterSeed = 123, trace = 0L, ...)
OTdata |
Data frame containing a |
OTmissing |
Optional data frame with columns |
start |
An object coerced to |
end |
An object indicating the end of the reliable/usable information. Unless this is an end of block (1st of January for years), the last block will end before |
MAX.r |
Target number of observations in the blocks. Can be of length
one (same number of observations for all blocks) or of length equal
to the number of blocks, the values being then for the blocks in the
same order. In both cases, the target number may be impossible to
reach because of a smaller number of events in the block. If
|
blockDuration |
Duration of the blocks. Can only be |
monthGapStat |
Logical. Setting it to |
maxMissingFrac |
Maximal fraction of a block duration (between 0 and 1) that can be
missing without leading to a |
dataFrames |
If |
infMAX |
If |
plot |
If |
plotType |
Character controlling the plot. With |
jitterSeed |
Random seed for jittering. Used only when |
trace |
Integer level of verbosity. |
... |
Other arguments to be passed to |
The data frame given in OTdata contains the events (or arrivals) given by the date column, as well as one mark column. Depending on the argument MAX.r, the maxima or the $r$-largest observations of the marks are computed for each time block. When known gaps exist in the data and are given in OTmissing, a block for which the total duration of gaps is too large will be omitted.
A list, the content of which depends on the value of
dataFrames
. If this value is TRUE
, the
following elements are returned.
MAXdata |
A data frame of largest values by block with one row for
each observation. The largest values are given as columns
with names equal to those in the |
MAXinfo |
A data frame describing the blocks, with one row by
block. The two ( |
probMissing |
A vector with values corresponding to the days in a block (year). Each value is an estimate of the probability that the day falls in a gap. If |
effDuration , r
|
Vectors containing the effective duration (in years) and the number of values for the blocks. |
data |
List of maxima or |
monthGapStat , monthGapTS
|
Summary information concerning gaps, if |
Note that even when maxMissingFrac is set to its maximum value 1.0, there can still be blocks with no data. When the result is intended to be used in the Renouv function, the formal dataFrames should be FALSE; the elements data and effDuration can then be passed as MAX.data and MAX.effDuration. For now, infMAX should then also be set to FALSE, since -Inf values are not yet allowed in the $r$-largest values.
## use Dunkerque data OTdata <- Dunkerque$OTdata; OTmissing <- Dunkerque$OTmissing ## allow up to 50\% gap in a block, or only 5\% MAX1 <- OT2MAX(OTdata = OTdata, OTmissing = OTmissing, maxMissingFrac = 0.5, main = "impact of the 'maxMissingFrac' formal") MAX2 <- OT2MAX(OTdata = OTdata, OTmissing = OTmissing, dataFrames = TRUE, prefix = "Max", maxMissingFrac = 0.05, plot = FALSE) lines(MAX2$MAXdata$date, MAX2$MAXdata$Surge, type = "h", col = "red", lwd = 3) legend("topleft", lw = c(1, 3), col = c("black", "orangered"), legend = c("50\% max", " 5\% max")) ## r-largest obs for r = 4 MAX3 <- OT2MAX(OTdata, OTmissing = OTmissing, MAX.r = 4, maxMissingFrac = 0.9, dataFrames = FALSE, trace = TRUE, main = "r-largest with r = 4") ## restrict the period MAX4 <- OT2MAX(OTdata, OTmissing = OTmissing, MAX.r = 4, start = "1962-01-01", end = "1990-01-01", maxMissingFrac = 0.9, dataFrames = FALSE, trace = TRUE, main = "r-largest with r = 4 with given 'start' and 'end'") ## Not run: ## use in a block maxima analysis, as if there were no gaps. fit <- fGEV.MAX(MAX.data = MAX3$data, MAX.effDuration = rep(1, length(MAX3$effDuration))) ## End(Not run) ## plot the gap rate MAX5 <- OT2MAX(OTdata = OTdata, OTmissing = OTmissing, maxMissingFrac = 0.5, main = "probability of being in a gap", plotType = "gap") ## time series plot (only <= 10 months) plot(MAX5$monthGapTS[ , c(1:4)], main = "gap rate by month") ## much better with lattice. ## Not run: require(lattice) xyplot(MAX5$monthGapTS) ## End(Not run)
Add a small amount of noise to a numeric vector keeping all the values above the given threshold.
OTjitter(x, threshold = NULL)
x |
The numeric vector to which jitter should be added. |
threshold |
A threshold above which all elements of the modified vector must stay. |
A vector with the same length and nearly the same values as x. As in jitter, a small amount of noise is added to each value of x. The noise level is adjusted so that every noisy value remains above the specified threshold. When a value is very close to the threshold, only a very small amount of negative noise can be added.
The aim of this function is to remove possible ties in experimental OT data. Ties cause problems or warnings in some goodness-of-fit tests such as Kolmogorov-Smirnov.
Yves Deville
## Garonne data (heavily rounded) x <- Garonne$OTdata$Flow min(x) xmod <- OTjitter(x, threshold = 2500) length(x) nlevels(as.factor(x)) nlevels(as.factor(xmod)) max(abs(x-xmod))
Derivation of probability functions with respect to the parameters by using closed forms.
parDeriv(par, x, distname, sum = TRUE)
par |
Vector of parameter values. |
x |
Observations or data at which the derivatives are to be computed. |
distname |
Name of the distribution. See Details. |
sum |
Logical. If |
Only a few distributions are and will be available. For now, these are:
the two-parameter Weibull c("shape", "scale")
, the
two-parameter Generalised Pareto, c("scale", "shape")
and
the two-parameter Lomax and maxlo distributions.
A list of arrays containing the first and second order derivatives.
derLogdens , der2Logdens
|
Derivatives of the log-density
|
derSurv , der2Surv
|
Derivatives of
the survival function |
When x has length $n$ and the distribution depends on $p$ parameters, the arrays of first and second order derivatives have dimensions $n \times p$ and $n \times p \times p$ when sum is FALSE. If sum is TRUE, the summation drops the first dimension and the arrays have dimensions $p$ and $p \times p$.
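A minimal sketch checking these dimensions in the Lomax case:

```r
## sketch: dimensions of the derivative arrays, with and without summation
set.seed(1)
x <- rlomax(n = 20, scale = 100, shape = 3)
D <- parDeriv(par = c(shape = 3, scale = 100), x = x,
              distname = "lomax", sum = FALSE)
dim(D$der2Logdens)    ## expected: 20 x 2 x 2
Ds <- parDeriv(par = c(shape = 3, scale = 100), x = x,
               distname = "lomax", sum = TRUE)
dim(Ds$der2Logdens)   ## expected: 2 x 2
```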
Yves Deville
See the Renext Computing Details document.
set.seed(1234) distname <- "maxlo" if (distname == "weibull") { logL <- function(par) { sum(dweibull(x, shape = par["shape"], scale = par["scale"], log = TRUE)) } sumS <- function(par) { sum(pweibull(x, shape = par["shape"], scale = par["scale"], lower.tail = FALSE)) } pars <- c("shape" = rexp(1), "scale" = 1000 * rexp(1)) x <- rweibull(n = 100, shape = pars["shape"], scale = pars["scale"]) Der <- parDeriv(par = pars, x = x, distname = "weibull") } else if (distname == "gpd") { require(evd) logL <- function(par) { sum(dgpd(x, loc = 0, shape = par["shape"], scale = par["scale"], log = TRUE)) } sumS <- function(par) { sum(pgpd(x, loc = 0, shape = par["shape"], scale = par["scale"], lower.tail = FALSE)) } pars <- c("scale" = 1000 * rexp(1), "shape" = runif(1, min = -0.4, max = 0.4)) x <- rgpd(n = 100, loc = 0, shape = pars["shape"], scale = pars["scale"]) Der <- parDeriv(par = pars, x = x, distname = "gpd") } else if (distname == "lomax") { logL <- function(par) { sum(dlomax(x, shape = par["shape"], scale = par["scale"], log = TRUE)) } sumS <- function(par) { sum(plomax(x, shape = par["shape"], scale = par["scale"], lower.tail = FALSE)) } pars <- c( "shape" = 1 + rexp(1), "scale" = 1000 * rexp(1)) x <- rlomax(n = 100, shape = pars["shape"], scale = pars["scale"]) Der <- parDeriv(par = pars, x = x, distname = "lomax") } else if (distname == "maxlo") { logL <- function(par) { sum(dmaxlo(x, shape = par["shape"], scale = par["scale"], log = TRUE)) } sumS <- function(par) { sum(pmaxlo(x, shape = par["shape"], scale = par["scale"], lower.tail = FALSE)) } pars <- c( "shape" = 2.5 + runif(1), "scale" = 100 * rexp(1)) x <- rmaxlo(n = 100, shape = pars["shape"], scale = pars["scale"]) Der <- parDeriv(par = pars, x = x, distname = "maxlo") } ## check logdens H <- numDeriv::hessian(func = logL, x = pars) colnames(H) <- names(pars) Grad <- numDeriv::grad(func = logL, x = pars) cat("gradient for log density\n") print(cbind(parDeriv = Der$derLogdens, num = Grad)) cat("hessian for log density\n") print(cbind(exact = Der$der2Logdens, num = H)) ## check survival HS <- numDeriv::hessian(func = sumS, x = pars) HS <- (HS + t(HS))/2 colnames(HS) <- names(pars) GradS <- numDeriv::grad(func = sumS, x = pars) cat("gradient for Survival\n") print(cbind(parDeriv = Der$derSurv, num = GradS)) cat("hessian for Survival\n") print(cbind(exact = Der$der2Surv, num = HS))
Initial estimation for an aggregated renewal model with GPD marks.
parIni.MAX(MAX, threshold, distname.y = "exp") parIni.OTS(OTS, threshold, distname.y = "exp")
MAX |
A list describing partial observations of |
OTS |
A list describing partial observations of |
threshold |
The threshold of the POT-renewal model. This is the location parameter of the marks from which Largest Order Statistics are observed. |
distname.y |
The name of the distribution. For now this can be |
The functions estimate the Poisson rate lambda along with a scale parameter, say sigma, for the exponential excesses. If the target distribution is GPD, then the initial shape parameter is taken to be 0.
In the "MAX" case, the estimates are obtained by regressing the maxima or $r$-largest order statistics within the blocks on the log-duration of the blocks. The response for a block is the minimum of the available largest order statistics within that block, and their number $r$ will generally vary across blocks. When some blocks contain more than one largest order statistic, the corresponding spacings are used to compute a spacings-based estimate of sigma. This estimator is independent of the regression estimator for sigma, and the two estimators can be combined in a weighted mean.
In the "OTS" case, the estimate of lambda
is obtained by a
Poisson regression using the log durations of the blocks as an
offset. The estimate of sigma
is simply the mean of all the
available excesses, which by assumption share the same exponential
distribution.
A vector containing the estimate of the Poisson rate lambda, and the estimates of the parameters for the target distribution of the excesses. For exponential excesses, this is simply a rate parameter. For GPD excesses, these are the scale and shape parameters, the second taken as zero.
In the MAX case, the estimation is possible only when the number of blocks is large enough, since otherwise no information about lambda can be gained from the data; recall that the times at which the events occurred within a block are not known or used.
Yves Deville
See the document Renext Computing Details.
The spacings
methods for the spacings used in the
estimation.
set.seed(1234) ## initialisation for 'MAX' data u <- 10 nBlocks <- 30 nSim <- 100 ParMAX <- matrix(NA, nrow = nSim, ncol = 2) colnames(ParMAX) <- c("lambda", "sigma") for (i in 1:nSim) { rd <- rRendata(threshold = u, effDuration = 1, lambda = 12, MAX.effDuration = c(60, rexp(nBlocks)), MAX.r = c(5, 2 + rpois(nBlocks, lambda = 1)), distname.y = "exp", par.y = c(rate = 1 / 100)) MAX <- Renext:::makeMAXdata(rd) pari <- parIni.MAX(MAX = MAX, threshold = u) ParMAX[i, ] <- pari } ## the same for OTS data u <- 10 nBlocks <- 10 nSim <- 100 ParOTS <- matrix(NA, nrow = nSim, ncol = 2) colnames(ParOTS) <- c("lambda", "sigma") rds <- list() for (i in 1:nSim) { rd <- rRendata(threshold = u, effDuration = 1, lambda = 12, OTS.effDuration = rexp(nBlocks, rate = 1 / 10), OTS.threshold = u + rexp(nBlocks, rate = 1 / 10), distname.y = "exp", par.y = c(rate = 1 / 100)) rds[[i]] <- rd OTS <- Renext:::makeOTSdata(rd) pari <- parIni.OTS(OTS = OTS, threshold = u) ParOTS[i, ] <- pari }
Probability that Greenwood's statistic is smaller than one.
pGreenwood1(n)
n |
Sample size. |
The probability was computed by using the approximation of the
quantile function of Greenwood's statistic returned by
qStat. The result is found by interpolating the
distribution function at the value 1.
Probability that Greenwood's statistic is smaller than one. For a
random sample of an exponential distribution with size n, this
is the probability that the coefficient of variation is less than one,
or the probability that the ML estimate of the GPD shape parameter
is negative.
Yves Deville
n <- 8:500
plot(n, pGreenwood1(n), type = "l", col = "orangered", lwd = 2,
     log = "x", ylim = c(0.5, 0.7), main = "slow convergence to 0.5")
grid()
abline(h = 0.5, col = "SpringGreen")
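As an illustrative check (a sketch, with an arbitrary sample size and
number of replicates), pGreenwood1 can be compared with a Monte-Carlo
estimate based on the package's CV2 function:

set.seed(42)
n <- 30
## Monte-Carlo estimate of the probability that the squared
## coefficient of variation of an exponential sample of size n is < 1
pMC <- mean(replicate(5000, CV2(rexp(n)) < 1))
c(MC = pMC, pGreenwood1 = pGreenwood1(n))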
Plot 'Rendata' datasets with OT and historical data
## S3 method for class 'Rendata'
plot(x, textOver = quantile(x$OTdata[, x$info$varName], probs = 0.99),
     showHist = TRUE, ...)
x |
Rendata object, i.e. a list object as read with the readXML function. |
textOver |
Mark values of the variable in the |
showHist |
If |
... |
further args to be passed to |
The plot shows the main data of the object x
(the OTdata part) as well as the historical data
(MAXdata or OTSdata) if any. Different colours are used in the
background. This function is not intended to produce
publication-quality plots.
This function is mainly a companion function of readXML
. Its
goal is to check the content of the data read.
Yves Deville
if (require(XML)) {
    ## use 'index.xml' file shipped with Renext
    dir1 <- system.file("Rendata", package = "Renext")
    BrestNew <- readXML(name = "Brest", dir = dir1)
    plot(BrestNew)
    GaronneNew <- readXML(name = "Garonne", dir = dir1)
    plot(GaronneNew)
    test1 <- readXML(name = "test1", dir = dir1)
    plot(test1)
}
Plot an object of class "Renouv". The plot is a return level plot with some supplementary elements to display historical data.
## S3 method for class 'Renouv'
plot(x, pct.conf = x$pct.conf,
     show = list(OT = TRUE, quant = TRUE, conf = TRUE, MAX = TRUE, OTS = TRUE),
     mono = TRUE, predict = FALSE, par = NULL, legend = TRUE, label = NULL,
     problim = NULL, Tlim = NULL, main = NULL, xlab = "periods",
     ylab = "level", posOptions = NULL, byBlockStyle = NULL, ...)

## S3 method for class 'Renouv'
lines(x, pct.conf = x$pct.conf, show = NULL, mono = TRUE, predict = FALSE,
      par = NULL, legend = FALSE, label = NULL, posOptions = NULL,
      byBlockStyle = NULL, ...)
x |
Object of class "Renouv". |
pct.conf |
Percents for confidence limits (lower and
upper). These levels should be found within those computed in the
object |
show |
A list with named elements specifying which parts of the
return level plot must be drawn. Element |
mono |
Logical, |
predict |
Logical. When |
par |
A list such as returned by the |
legend |
Logical. If |
label |
A character label used to build the labels used in the legend. The
default is to use the name of the |
problim |
Limits for the x-axis in probability scale. Can be used as an
alternative to |
Tlim |
Limits for the x-axis in return period scale. The values are given
as a numeric vector of length 2, containing values |
xlab |
Label of the x-axis (time periods, with log scale). |
main |
Main title (character). |
ylab |
Label of the y-axis (levels). |
posOptions |
A pair list to be passed as list of formals to the |
byBlockStyle |
Logical list (or named logical vector) with elements |
... |
Other arguments passed to the default |
Historical data blocks (MAX or OTS) embedded in the x
object (if any)
can be plotted or not depending on the value of the corresponding
element in show
.
If the MAX
element is TRUE
and if x
embeds historical data of type MAX
, then these will be
shown with a symbol differing from the one for ordinary points.
If the OTS element is TRUE and if x embeds
historical data of type OTS, then these will be shown with
a symbol differing from the one used for ordinary points. An exception
is when one or several OTS blocks have no data. Then each such
block is shown as a horizontal segment; its right end-point shows
the effective duration of the block and the ordinate shows the OTS
threshold for this block. No data exceeded the threshold within
the block.
This function acts on a list variable named .RLlegend
and
stored in a special environment bound to the package. This variable is
used to build legends for plots produced with multiple commands. See
the RLlegend
help page.
Examples of possible combined uses of the arguments of the plot
and lines methods together with the RLlegend* functions
are given in the "Renext Graphics" chapter of the Renext
Guide document shipped with this package.
No value returned.
Note that the methods plot
and lines
may change the
value of the variable .RLlegend
in the environment
legendEnvir
. This variable describes the material to be used in
the legend at the next call of RLlegend.show
.
The return level plot is of exponential type, i.e. it uses a log-scale for return periods. This contrasts with the Gumbel plot which is also used in similar contexts.
Yves Deville
The RLlegend
page for the legend construction and
RLpar
to specify the graphical parameters (colors, line
types, ...) of the elements.
## two fits for the Garonne data
fit.exp <- Renouv(x = Garonne, plot = FALSE)
fit.gpd <- Renouv(x = Garonne, distname.y = "gpd", plot = FALSE)
## simple plot (legend is TRUE here)
plot(fit.exp, main = "Two fits overlapped", label = "",
     ## Tlim = c(1, 5000),
     predict = TRUE)
## Now try 'lines' and RLlegend.xxx functions
plot(fit.exp, main = "Fancy legend",
     show = list(OT = FALSE, quant = FALSE, conf = FALSE,
                 OTS = FALSE, MAX = FALSE),
     legend = FALSE, Tlim = c(1, 5000))
RLlegend.ini(x = "bottomright", bg = "lightyellow")  ## initialise legend
lines(fit.exp,
      show = list(quant = FALSE, conf = FALSE, OT = TRUE, MAX = TRUE),
      label = "expon",
      par = RLpar(quant.col = "orange", OT.pch = 21, OT.cex = 1.2,
                  OT.col = "SeaGreen", OT.bg = "yellow",
                  MAX.block1.col = "purple", MAX.block1.bg = "mistyrose",
                  MAX.block1.lwd = 1.4))
lines(fit.gpd, pct.conf = c(95, 70),
      show = list(quant = TRUE, conf = TRUE),
      label = "GPD",
      par = RLpar(quant.col = "darkcyan", conf.conf1.col = "red"))
RLlegend.show()  ## now draw legend
Diagnostic plots for Renouv
objects.
PPplot(x, ...)
QQplot(x, ...)
## S3 method for class 'Renouv'
PPplot(x, showHist = FALSE, legend = FALSE, par = NULL, posOptions = NULL, ...)
## S3 method for class 'Renouv'
QQplot(x, showHist = FALSE, legend = FALSE, par = NULL, posOptions = NULL, ...)
x |
Object containing a fitted model. |
legend |
Should a legend be shown to identify historical blocks? NOT IMPLEMENTED YET. |
par |
A list of graphical parameters as returned by |
posOptions |
A pair list to be passed as list of formals to the |
showHist |
If |
... |
Other arguments to be passed to |
No value returned.
Yves Deville
SandT
for the computation of the plotting
positions used with historical data.
Compute return levels and confidence limits for an object of class "Renouv".
## S3 method for class 'Renouv'
predict(object, newdata = c(10, 20, 50, 100, 200, 500, 1000),
        cov.rate = TRUE, level = c(0.95, 0.7), prob = FALSE,
        trace = 1, eps = 1e-06, ...)
object |
An object of class |
newdata |
The return periods at which return levels and confidence bounds are wanted. |
cov.rate |
If |
level |
Confidence levels as in other 'predict' methods (not percentages). |
prob |
If |
trace |
Some details are printed when |
eps |
Level of perturbation used to compute the numerical derivatives in the delta method. |
... |
Further arguments passed to or from other methods. |
Except in some very special cases, the confidence limits are approximate ones, computed by using the delta method with numerical derivatives.
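In the usual delta-method form (notation introduced here, not from this
page: q(T; theta) is the return level at period T and Sigma-hat the
estimated covariance matrix of the parameter vector, the gradient being
computed by finite differences with step eps):

\[
  \operatorname{Var}\bigl[\hat{q}(T)\bigr] \;\approx\;
  \nabla_{\theta}\, q(T; \hat{\theta})^{\top}\,
  \widehat{\Sigma}\,
  \nabla_{\theta}\, q(T; \hat{\theta}) .
\]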
A data frame with the expected return levels (col. named
"quant"
) at the given return periods, and confidence
limits. The returned object has an infer.method
attribute
describing the method used to compute the confidence limits.
Despite its name, this method does not compute true predictions. A
return period is to be interpreted as an average interevent time
rather than the duration of a specific period of time. For instance,
the expected return level for a given return period with length 100
years is the level that would be on average exceeded once every 100
years (assuming that the model description in object
is
correct).
Yves Deville
Coles S. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer.
Renouv
to fit a Renouv
model.
## Use Brest data
fit <- Renouv(Brest)
pred <- predict(fit, newdata = c(100, 125, 150, 175, 200),
                level = c(0.99, 0.95))
Quantile of a test statistic.
qStat(p, n,
      type = c("Greenwood", "Jackson", "logLRGPD", "logLRLomax",
               "logLRGEV", "logLRFrechet"),
      outNorm = FALSE)
p |
Numeric vector of probabilities. Very small values ( |
n |
Sample size. |
type |
The type of statistic, see Details. |
outNorm |
Logical. If |
The function provides an approximation of the distribution for several statistics.
For "Greenwood"
, the statistic is Greenwood's
statistic. The distribution is that of the squared coefficient of
variation of a sample of size
n
from the exponential distribution as computed by CV2
.
For "Jackson"
, the statistic is Jackson's
statistic, see Jackson
.
For "logLRGPD"
and "logLRLomax"
, the
statistic is the log of the likelihood ratio of a sample
from the exponential distribution. The log-likelihoods are
for an exponential distribution compared to a GPD with
non-zero shape, or to a GPD with positive shape
(equivalently, a Lomax distribution).
For "logLRGEV"
and "logLRFrechet"
, the
statistic is the log of the likelihood ratio of a sample
from the Gumbel distribution. The log-likelihoods are for a
Gumbel distribution compared to a GEV with non-zero shape,
or to a GEV with positive shape (equivalently, a
Fréchet distribution).
The logs of the likelihood ratios are multiplied by 2, so that they
are comparable to a chi-square statistic with one degree of freedom.
A vector of quantiles.
The precision of the result is limited to about two digits. This function is not really intended for direct use; it is provided for information.
Yves Deville
res <- qStat(n = 40, type = "Greenwood")
plot(res$q, res$p, type = "o")
Read one or several dataset(s) using an XML index file specifying the data sets to read and the structure of each.
readXML(name, dir, index = "index.xml", TZ = "GMT", trace = 0)
name |
Name for the dataset that will be matched against the |
dir |
Path to the directory where the index file and all data files should be found. |
index |
Name (short) of the index file. This file must be in the directory
given by |
TZ |
A time zone as in |
trace |
Level of verbosity (integer). Can be used to trace the successive steps by short indications. |
The XML index file is parsed within R. Then according to the
indications within the index file, other files are read (e.g. csv
files). In this way, data returned as a list can contain
heterogeneous data: Over Threshold (OT) data, missing periods, MAX
data, etc. Various pieces of information are also stored in list
elements with name containing the "info"
string.
This function requires the CRAN package XML.
A list with the data read.
info |
General information about the data: |
OTinfo |
Information for the Over the Threshold (OT). |
OTdata |
Over the Threshold (OT) data. |
OTmissing |
Missing periods within the OTdata period. |
MAXinfo |
Information for the MAX ( |
MAXdata |
MAX supplement data. |
OTSinfo |
Information for the Over the Threshold Supplement (OTS) data. |
OTSdata |
Over the Threshold (OT) supplement data. |
The flat files (usually .csv files) can also be read in a more
conventional way, e.g. through read.table. However, refer to
the index.xml examples or to the index.xsd schema to see
how to adjust reading parameters such as sep, etc.
Yves Deville
See Brest
for an example of such a list.
## Not run:
## Examples of datasets that can be read
## Browse should work for browsers with xslt support
browseURL(file.path(system.file("Rendata", package = "Renext"), "index.xml"))
if (require(XML)) {
    ## use 'index.xml' file shipped with Renext
    dir1 <- system.file("Rendata", package = "Renext")
    BrestNew1 <- readXML(name = "Brest", dir = dir1)
    test1 <- readXML(name = "test1", dir = dir1)
}
## End(Not run)
Translate a vector of coefficients from a Renewal-POT model with Pareto excesses into a vector of GEV parameters.
Ren2gev(object, threshold = NULL, w = 1,
        distname.y = c("gpd", "GPD", "lomax", "maxlo"),
        jacobian = (length(w) == 1L), vcovRen = NULL)
object |
A named vector of parameters or an object of class |
threshold |
The threshold associated with the renewal-POT model. This must be provided and be a finite, non-NA numeric value. It is the location parameter of the GPD. |
w |
The duration of the blocks. |
distname.y |
The distribution of the excesses in the renewal-POT model. This
is normally a |
jacobian |
Logical. If |
vcovRen |
A covariance matrix for the "Ren" vector of parameters. If
|
Given Renewal-POT parameters, it is possible to compute the
distribution of block maxima. When the distribution of the excesses is
in the Pareto family, the marginal distribution of the maxima is GEV.
The location and scale GEV parameters depend on the block duration
w, while the GEV shape parameter is identical to that of the GPD
input distribution.
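This is the standard correspondence for a Poisson process with rate
lambda and GPD(sigma, xi) excesses over the threshold u (notation
introduced here): apart from the atom at the threshold, the maximum over
a block of duration w is GEV with

\[
  \xi_w = \xi,
  \qquad
  \sigma_w = \sigma\,(\lambda w)^{\xi},
  \qquad
  \mu_w = u + \sigma\,\frac{(\lambda w)^{\xi} - 1}{\xi},
\]

the Gumbel case being obtained as the limit when xi tends to 0.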
When w
has length 1
, a named vector of GEV parameters as
the one estimated by fgev
. This vector has
elements named "loc"
, "scale"
and "shape"
.
When w
has length > 1
, a matrix with length(w)
rows, each representing a vector of GEV parameters as before.
The returned object has attributes named "threshold" and
"distname.y" to recall how it was built.
Yves Deville
The gev2Ren
function provides a reciprocal
transformation.
fit1 <- Renouv(Garonne, distname.y = "maxlo")
Ren2gev(fit1)
fit2 <- Renouv(Garonne, distname.y = "gpd")
Ren2gev(fit2)
Translate a vector of coefficients from a Renewal-POT model with exponential excesses to a vector of Gumbel parameters.
Ren2gumbel(object, threshold = NULL, w = 1,
           distname.y = c("exponential", "exp"),
           jacobian = (length(w) == 1L), vcovRen = NULL)
object |
A named vector of parameters or an object of class |
threshold |
A threshold associated with the parameters. If |
w |
A block duration or a vector of block durations. |
distname.y |
The name of the distribution for the excesses. Can be either
|
jacobian |
Logical. If |
vcovRen |
A covariance matrix for the Renouv parameters. |
A vector of GEV parameters if w
has length 1, and a matrix if
w
has length > 1
. The returned object has attributes.
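For reference, the underlying standard correspondence can be written
explicitly (notation introduced here: u the threshold, lambda the
Poisson rate, sigma = 1/rate the exponential scale): apart from the atom
at u, the maximum over a block of duration w is Gumbel with

\[
  \mu_w \;=\; u + \sigma \log(\lambda w),
  \qquad
  \sigma_w \;=\; \sigma .
\]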
Yves Deville
Ren2gev
for the translation of Renouv parameters
corresponding to GPD excesses.
## Fit a Renouv model with exponential excesses (default)
fit <- Renouv(Garonne)
## Convert to gumbel (usable for one-year block maxima)
parGumbel <- Ren2gumbel(fit)
## Retrieve the 'Renouv' model by giving the right threshold
parRen <- gumbel2Ren(parGumbel,
                     threshold = 2500,
                     vcovGumbel = attr(parGumbel, "vcov"),
                     plot = TRUE)
## Build a compatible model under the assumption of one event by year
parRen2 <- gumbel2Ren(parGumbel,
                      lambda = 1.00,
                      vcovGumbel = attr(parGumbel, "vcov"),
                      plot = TRUE)
parRenNames <- c("lambda", "rate")
## Build a 'Renouv' object without estimation
myVcov <- attr(parRen, "vcov")[parRenNames, parRenNames]
fitNew <- RenouvNoEst(threshold = attr(parRen, "threshold"),
                      estimate = parRen,
                      distname.y = "exp",
                      cov = myVcov)
## Compare return levels
cbind(roundPred(fit$pred)[ , -2], roundPred(fitNew$pred)[ , -2])
## idem for the putative 'Renouv' with rate 1
myVcov2 <- attr(parRen2, "vcov")[parRenNames, parRenNames]
fitNew2 <- RenouvNoEst(threshold = attr(parRen2, "threshold"),
                       estimate = parRen2,
                       distname.y = "exp",
                       cov = myVcov2)
cbind(roundPred(fit$pred)[ , -2], roundPred(fitNew2$pred)[ , -2])
Fit a 'renouvellement' POT model using Over the Threshold data and possibly historical data of two kinds.
Renouv(x, threshold = NULL, effDuration = NULL,
       distname.y = "exponential",
       MAX.data = NULL, MAX.effDuration = NULL,
       OTS.data = NULL, OTS.effDuration = NULL, OTS.threshold = NULL,
       fixed.par.y = NULL, start.par.y = NULL, force.start.H = FALSE,
       numDeriv = TRUE, trans.y = NULL, jitter.KS = TRUE,
       pct.conf = c(95, 70), rl.prob = NULL, prob.max = 1.0 - 1e-04,
       pred.period = NULL, suspend.warnings = TRUE,
       control = list(maxit = 300, fnscale = -1),
       control.H = list(maxit = 300, fnscale = -1),
       trace = 0, plot = TRUE, label = "", ...)
x |
Can be a numeric vector, an object of the class "Rendata", or NULL. |
threshold |
Value of the threshold for the OT data. |
effDuration |
Effective duration, i.e. duration of the OT period. |
distname.y |
Name of the distribution for the excesses over the threshold. See Details below. |
MAX.data |
Either a numeric vector or a list of numeric vectors representing
historical data |
MAX.effDuration |
Vector of (effective) durations, one by block MAX data. |
OTS.data |
A numeric vector or a list of numeric vectors representing
supplementary Over Threshold data in blocks. When a vector is
given, there is only one block, and the data contain all the
'historical' levels over the corresponding threshold given in
|
OTS.effDuration |
A numeric vector giving the (effective) durations for the OTS blocks. |
OTS.threshold |
A vector giving the thresholds for the different OTS blocks. The
given values must be greater than or equal to the value of
|
fixed.par.y |
Named list of known (or fixed) parameter values for the
|
start.par.y |
Named list of parameter initial values for the
|
force.start.H |
Logical. When |
numDeriv |
Logical: should the hessian be computed using the |
trans.y |
Transformation of the levels before thresholding (if not
|
jitter.KS |
Logical. When set to |
pct.conf |
Character or numeric vector specifying the percentages for the confidence (bilateral) limits on quantiles. |
rl.prob |
Vector of probabilities for the computation of return levels. These
are used in plots (hence must be dense enough) and appear on output
in the data.frame |
prob.max |
Max value of probability for return level and confidence limits
evaluations. This argument 'shortens' the default |
pred.period |
A vector of "pretty" periods at which return level and probability
will be evaluated and returned in the |
suspend.warnings |
Logical. If |
control |
A named list used in |
control.H |
A named list used in |
trace |
Level of verbosity. Value |
plot |
Draw a return level plot? |
label |
Label to be used in the legend when |
... |
Arguments passed to |
The model is fitted using Maximum Likelihood (ML).
Some distributions listed below and here called "special" are considered in a special manner. For these distributions, it is not necessary to give starting values nor parameter names which are unambiguous.
distribution | parameters |
exponential | rate |
weibull | shape, scale |
GPD | scale, shape |
gpd | scale, shape |
lomax | scale, shape |
maxlo | scale, shape |
log-normal | meanlog, sdlog |
gamma | shape, scale |
mixexp2 | prob1, rate1, delta |
Other distributions can be used. Because the probability functions are then used in a "black-box" fashion, these distributions should respect the following formal requirements:
The names of the density, distribution and quantile functions must
obey the classical "prefixing convention": the prefixes must be
"d", "p" and "q" respectively. This rule applies for the
distributions of the stats package and those of many other
packages such as evd.
The first (main) argument must be vectorisable in all
three functions, i.e. a vector of x
, q
or p
must be accepted by the density, the distribution and the quantile
functions.
The density must have a log
formal
argument. When log
is TRUE
, the log-density is
returned instead of the density.
For such a distribution, it is necessary to give argument names in
start.par.y. The arguments list must have exactly the required
number of parameters for the family (e.g. 2 for gamma).
Some parameters can be fixed (known); then the parameter set will be
the union of those appearing in start.par.y and those in
fixed.par.y. Note that in the present version, at least one
parameter must be unknown for the y part of the model.
Mathematical requirements exist for a correct use of ML. They are referred to as "regularity conditions" in ML theory. Note that the support of the distribution must be the set of non-negative real numbers.
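To make these formal requirements concrete, here is a minimal sketch of
a user-defined distribution; the name "myweib" and its wrapper functions
are invented for the example, and the commented Renouv call only
suggests how it could be used:

## density, distribution and quantile functions obeying the
## "prefixing convention", vectorised in their first argument,
## with a 'log' argument for the density and support on [0, Inf)
dmyweib <- function(x, shape, scale = 1, log = FALSE) {
    dweibull(x, shape = shape, scale = scale, log = log)
}
pmyweib <- function(q, shape, scale = 1) {
    pweibull(q, shape = shape, scale = scale)
}
qmyweib <- function(p, shape, scale = 1) {
    qweibull(p, shape = shape, scale = scale)
}
## hypothetical call: starting values must then be named explicitly
## fit <- Renouv(x = Garonne, distname.y = "myweib",
##               start.par.y = list(shape = 1.5, scale = 1000))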
The estimation procedure differs according to the existence of the different types of data: main sample, MAX and OTS.
When no historical data is given, the whole set of parameters
contains orthogonal subsets: a "point process" part
concerning the process of events, and an "observation" part
concerning the excesses over the threshold. The parameters can in this
case be estimated separately. The rate of the Poisson process is estimated
by the empirical rate, i.e. the number of events divided by the total
duration given in effDuration
. The "Over the Threshold"
parameters are estimated from the excesses computed as x.OT
minus the threshold.
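In symbols (notation introduced here), with n the number of events
observed over the effective duration w:

\[
  \hat{\lambda} \;=\; \frac{n}{w},
  \qquad
  y_i \;=\; x_i - u, \quad i = 1, \dots, n,
\]

the excesses y_i being fitted by ML separately from lambda.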
When historical data are given, the two parameter vectors must be
handled together in maximising the global likelihood. In this case,
we begin the estimation ignoring the historical data and then use the
estimates as starting values for the maximisation of the global
likelihood. In some circumstances, the estimates obtained in the first
stage cannot be used with historical data because some of these data
fall outside the support of the fitted distribution. This can happen
e.g. with distname.y = "gpd" when historical data exceed
threshold - scale/shape for the values of
shape and scale computed in the first stage.
From version 2.1-1 on, it is possible to use OTS and/or
MAX data with no OT data by specifying x = NULL.
Yet for now this is only possible when distname.y takes
one of the two values: "exp" or "gpd". The initial
values for the parameters are then obtained by using the
parIni.OTS and parIni.MAX functions, and
possibly by combining the two resulting initial parameter
vectors. This possibility can be used to fit a model from block
maxima or r-largest classical data, but with more
flexibility since the durations of the blocks may here not be constant.
The returned Renouv
object contains a MAX
element
concerning the distribution of block maxima in the two following
cases.
When distname.y
is "exponential"
or "exp"
,
the distribution of the maximum is Gumbel. The estimated parameters
can be used with the gumbel
function of the evd package.
When distname.y
is "gpd"
, "lomax"
,
"maxlo"
or "GPD"
the distribution of the maximum
is a Generalised Extreme Values distribution. The estimated parameters
can be used with the gev
function of the evd package.
An object with class "Renouv"
. This is mainly a list with the
various results.
est.N |
Estimate(s) for the count |
est.y |
Estimate(s) for the excess |
cov.N , cov.y
|
The (co-)variances for the estimates above. |
estimate |
Estimate(s) for the whole set of parameters based on OT data and on historical data if available. |
ks.test |
Kolmogorov-Smirnov goodness-of-fit test. |
ret.lev |
A data frame containing return levels and confidence limits. The corresponding probabilities are either provided by user or taken as default values. |
pred |
A data frame similar to |
MAX |
A list providing the estimated distribution of the maximum of the
marks over a block of unit duration. This list element only exists when
this distribution can be deduced from the fit, which
is the case when |
Other results are available. Use names(result)
to see their
list.
Except in the special case where distname.y is
"exponential" and where no historical data are used, the
inference on quantiles is obtained with the delta method and
using numerical derivatives. Confidence limits are unreliable for
return levels much greater than the observation-historical period.
Due to the presence of estimated parameters, the Kolmogorov-Smirnov test is unreliable when less than 30 observations are available.
With some distributions or in presence of historical data, the estimation can fail due to some problem during the optimisation. Even when the optimisation converges, the determination of the (numerical) hessian can be impossible: this can happen if one or more parameters are too small to allow a finite-difference approximation of the gradient. For instance the 'rate' parameter of the exponential distribution (= inverse mean) will be small when the mean of the excesses is large.
A possible solution is then to rescale the data e.g. dividing
them by 10 or 100. As a rule of thumb, an acceptable scaling leads to
data (excesses) of a few units to a few hundreds, but an
order of magnitude of thousands or more should be avoided and reduced
by scaling. The rescaling is recommended for the square exponential
distribution (obtained with trans =
"square"
) since the
observations are squared.
Another possible way to solve the problem is to change the
numDeriv
value.
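A hedged sketch of the rescaling workaround (the inflation by 10 and
the division by 100 are arbitrary; the rescaled fit refers to the
scaled variable, so return levels must be scaled back):

## hypothetical data with an awkwardly large order of magnitude
xBig <- Garonne$OTdata$Flow * 10      ## artificially inflated flows
## fit on the rescaled data: excesses now have a moderate magnitude
fitScaled <- Renouv(x = xBig / 100, threshold = 25000 / 100,
                    effDuration = 65, distname.y = "exponential",
                    plot = FALSE)
## return levels expressed back on the inflated scale
100 * fitScaled$pred$quant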
The model only concerns the "Over the Threshold" part of the distribution of the observations. When historical data is used, observations should all be larger than the threshold.
The name of the elements in the returned list is indicative, and is
likely to be changed in future versions. At the time, the effect of
historical data on estimation (when such data exist) can be evaluated
by comparing c(res$est.N, res$est.y)
and res$estimate
where res
is the results list.
Some warnings may indicate that missing values are met during the optimisation process. This is due to the evaluation of the density at tail values. At the time the ML estimates are computed using an unconstrained optimisation, so invalid parameter values can be met during the maximisation or even be returned as (invalid) estimates.
Yves Deville
Miquel J. (1984) Guide pratique d'estimation des probabilités de crues, Eyrolles (coll. EDF DER).
Coles S. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer.
Embrechts P., Klüppelberg C. and Mikosch T. (1997) Modelling Extremal Events for Insurance and Finance. Springer.
RLplot
for the return level plot. See
optim
for the tuning of the optimisation. The
RenouvNoEst
function can be used to create an object with S3
class "Renouv"
from known parameters.
## Garonne data. Use a "Rendata" object as 'x'. Historical data are used!
fit <- Renouv(x = Garonne, distname = "weibull", trace = 1,
              main = "'Garonne' data")
summary(fit)
## generates a warning because of the ties
fit2 <- Renouv(x = Garonne, distname = "GPD", jitter.KS = FALSE,
               threshold = 2800, trace = 1,
               main = "'Garonne' data with threshold = 2800 and GPD fit")
## use a numeric vector as 'x'
fit3 <- Renouv(x = Garonne$OTdata$Flow, threshold = 2500,
               effDuration = 100, distname = "GPD",
               OTS.data = list(numeric(0), c(6800, 7200)),
               OTS.effDuration = c(100, 150),
               OTS.threshold = c(7000, 6000),
               trace = 1,
               main = "'Garonne' data with artificial \"OTS\" data")
## Add historical (fictive) data
fit4 <- Renouv(x = Garonne$OTdata$Flow, threshold = 2500,
               effDuration = 100, distname = "weibull",
               fixed.par.y = list(shape = 1.1),
               OTS.data = list(numeric(0), c(6800, 7200)),
               OTS.effDuration = c(100, 150),
               OTS.threshold = c(7000, 6000),
               trace = 0,
               main = "'Garonne' data with artificial \"OTS\" data")
##============================================================================
## use the 'venice' dataset in a r-largest fit from the 'evd' package
##============================================================================
## transform data: each row is a block
MAX.data <- as.list(as.data.frame(t(venice)))
## remove the NA imposed by the rectangular matrix format
MAX.data <- lapply(MAX.data, function(x) x[!is.na(x)])
MAX.effDuration <- rep(1, length(MAX.data))
## fit a Renouv model with no OT data. The threshold
## must be in the support of the gev distribution
u <- 66
fit.gpd <- Renouv(x = NULL,
                  MAX.data = MAX.data,
                  MAX.effDuration = MAX.effDuration,
                  distname.y = "GPD", threshold = u,
                  numDeriv = FALSE, trace = 0, plot = FALSE)
## Not run:
require(ismev)
## compare with results from the ismev package
fit.gev <- rlarg.fit(venice)
est.gev <- fit.gev$mle
names(est.gev) <- c("loc", "scale", "shape")
## transform the 'gev' fit into a Ren parameter set.
cov.gev <- fit.gev$cov
rownames(cov.gev) <- colnames(cov.gev) <- c("loc", "scale", "shape")
trans <- gev2Ren(est.gev, threshold = u, vcovGev = cov.gev)
est <- cbind(ismev = trans, RenextLab = coef(fit.gpd))
colnames(est) <- c("ismev", "RenextLab")
print(est)
## fill a 3d array with the two gpd covariance matrices
cov2 <- attr(trans, "vcov")[c(1, 3, 4), c(1, 3, 4)]  ## covariance
covs <- array(dim = c(2, 3, 3),
              dimnames = list(c("ismev", "RenextLab"),
                              colnames(fit.gpd$cov), colnames(fit.gpd$cov)))
covs["ismev", , ] <- cov2
covs["RenextLab", , ] <- fit.gpd$cov
print(covs)
## End(Not run)
Build a 'renouvellement' model using parameters given by the user.
RenouvNoEst(threshold, estimate = NULL, distname.y = "exponential",
            fixed.par.y = NULL, trans.y = NULL, pct.conf = c(95, 70),
            rl.prob = NULL, prob.max = 1 - 1e-04, pred.period = NULL,
            cov = NULL, nb.OT = NULL, infer.method = NULL)
threshold |
The threshold. |
estimate |
Numeric named vector containing the estimates for the parameters. It
must be compatible with the distribution chosen, and must contain in
first position an element named |
distname.y |
Character giving the name of the distribution. |
fixed.par.y |
Numeric named vector containing values for parameters which are considered as fixed (and not estimated). |
trans.y |
Transformation as in |
pct.conf |
Vector of percents for confidence limits. |
rl.prob |
Probabilities used in the return level computations. These values are
used for instance in return level plots produced with the
|
prob.max |
Maximal probability for which computations are done. |
pred.period |
Vector of periods for which predicted return levels will be computed. |
cov |
Covariance matrix for the provided estimated parameters. Must have
rownames and colnames in accordance with those of
|
nb.OT |
Number of data over the threshold used for estimation. This will be
used only when |
infer.method |
Inference method. Will normally be the delta method. |
This function is used for plotting or comparing models with known parameter estimates but with no data available.
The parameter estimates should be accompanied by a covariance matrix, assuming an approximately normal joint distribution of the estimates. This matrix is usually obtained by computing the second-order numerical derivatives of the log-likelihood at the estimates. This covariance matrix is used to compute approximate confidence limits for the return levels of the unknown true distribution that was estimated.
An object of class "Renouv"
representing a 'renouvellement'
model similar to those built with Renouv
. This is mainly
a list. Note however that some list elements found in Renouv
objects built by Renouv cannot be found here. For instance,
the returned object embeds no goodness-of-fit results, since the
object is created without making use of any data.
Yves Deville
Renouv
to estimate such models.
##======================================================================
## Example from S. Coles' book, page 86 'rainfall data'.
## Note that the first parameter is here the rate 'lambda', and not the
## probability of exceedance as in Coles' book.
##======================================================================
estimate <- c(lambda = 152 / 48, scale = 7.44, shape = 0.184)
cov <- matrix(c(4.9e-7 * (17531 / 48)^2, 0.0000, 0.0000,
                0.0000, 0.9180, -0.0655,
                0.0000, -0.0655, 0.0102),
              nrow = 3)
colnames(cov) <- rownames(cov) <- names(estimate)
renNE <- RenouvNoEst(threshold = 30, distname.y = "gpd",
                     pct.conf = c(95, 70),
                     estimate = estimate, nb.OT = 152, cov = cov)
summary(renNE)
plot(renNE, main = "Daily rainfall data SW England", ylim = c(0, 400))
Legend management for return level plots produced with the plot
and lines
methods of the "Renouv"
class.
RLlegend.ini(x = "topleft", bty = "n", ...)
RLlegend.show()
x |
A possible value for the |
bty |
As in |
... |
Other arguments to be kept in the list and passed later to
|
This function is to be used in conjunction with
plot.Renouv
and lines.Renouv
methods. It
allows the construction of a legend in a semi-automatic fashion, using
the value of the par
argument of the plot
and
lines
methods to specify the legend construction.
Each call to the plot.Renouv
or
lines.Renouv
changes the content of a list variable
named .RLlegend
in a special environment bound to the
package. This list is re-created when RLlegend.ini
is called,
and is used later to draw a legend on the active device when
RLlegend.show
is called. Between these two calls, the
plot
and lines
methods should be used with their argument
legend
set to FALSE
.
RLlegend.ini
returns a copy of the variable which is set.
RLlegend.show
returns nothing.
The size of symbols (i.e. plotting characters) can be set by
using the RLpar function and the par
argument of the methods plot.Renouv and
lines.Renouv. However it cannot be changed in the
legend.
Yves Deville
plot.Renouv and lines.Renouv, and
the RLpar function to change the graphical
parameters of the plot and the legend by using the par
argument.
## use Garonne data
xG <- Garonne$OTdata$Flow
## use special "exponential" distribution
fit1 <- Renouv(x = xG, threshold = 2500, distname.y = "exponential",
               effDuration = 65, plot = FALSE)
## use 'exp' in black box fashion, hence with delta method
fit2 <- Renouv(x = xG, threshold = 2500, distname.y = "exp",
               effDuration = 65, start.par.y = c(rate = 1), plot = FALSE)
RLlegend.ini()  ## initialise legend
## sample points only
plot(fit1, main = "Two types of confidence lims",
     show = list(OT = TRUE, quant = FALSE, conf = FALSE),
     label = "", legend = FALSE)
## quant and confidence lims
lines(fit1, show = list(OT = FALSE, quant = TRUE, conf = TRUE),
      label = "exact", legend = FALSE)
## quant (overplot) and confidence lims
lines(fit2, show = list(OT = FALSE, quant = TRUE, conf = TRUE),
      par = RLpar(quant.lty = 2, quant.col = "SpringGreen2",
                  conf.conf1.col = "orangered", conf.conf1.lwd = 3,
                  conf.conf2.col = "orangered", conf.conf2.lwd = 3),
      label = "delta", legend = FALSE)
RLlegend.show()  ## now draw legend
Build a hierarchical list of graphical parameters that can
be used in the methods plot or lines for the class "Renouv"
.
RLpar(mono = TRUE, trace = 0L, ...)
mono |
Logical. The default |
trace |
Integer level of verbosity. The default value |
... |
Arguments with names corresponding to the hierarchical structure and the graphical parameter to be changed. |
The formals are in correspondence with the list hierarchy, using a
dot "." as separator to define the tree. Thus a
quant.col formal argument can be used to specify the colour of
the quantile (or return level) curve, while conf.conf1.col will
be used for the first confidence limits (lower and upper).
A list containing lists in a hierarchical fashion. At the root level,
an element concerns a single curve (e.g. the return level curve), a
single scatterplot (e.g. sample used in POT), a group of curves
(e.g. the confidence limits) or a group of scatterplots (e.g. the
collection of MAX
historical blocks). For single elements
(curve or scatterplot) the list contains graphical elements with
values as they would be given in plot
or lines
calls. For group elements, each element is a list of such lists.
A list of default parameter values is built first, using the model
suitable for the mono value. Then the values provided by the
user overwrite the existing ones. Thus a curve can be coloured even if
mono = TRUE, provided that a colour specification is given for the
corresponding element.
When the same parameter name is used several times in RLpar
, a
warning is thrown.
Yves Deville
plot.Renouv
and lines.Renouv
with which
RLpar
is to be used.
## change colour for the quantile curve and line type for confidence
## limits #1 (with largest confidence level).
newRLpar <- RLpar(quant.col = "red", conf.conf1.lty = "dashed")
newRLpar$quant
## show the names of all possible editable parameters
names(unlist(RLpar()))
Return level plot for "Renouvellement" data.
RLplot(data, x = NULL, duration = 1, lambda, conf.pct = 95,
       mono = TRUE, mark.rl = 100, mark.labels = mark.rl, mark.col = NULL,
       main = NULL, ylim = NULL, ...)
data |
A data.frame object with a column named |
x |
Optional vector of observed levels. |
duration |
The (effective) duration corresponding to |
lambda |
Rate, with unit inverse of that used for |
conf.pct |
Vector (character or integer) giving confidence levels. See Details below. |
mono |
If |
mark.rl |
Return levels to be marked on the plot. |
mark.labels |
Labels shown at positions in |
mark.col |
Colours for marked levels. |
main |
Main title for the return level plot (defaults to empty title). |
ylim |
Limits for the y axis (defaults to values computed from the data). |
... |
Further args to be passed to |
Percents should match column names in the data.frame as follows. The
upper and lower limits are expected to be U.95
and L.95
respectively. For a 70%
confidence percentage, columns should
have names "U.70"
and "L.70"
.
The plot is comparable to the return level plot described in Coles'
book and related packages, but the return level is here in log-scale
while Coles uses a loglog-scale. A line corresponds here to a
one-parameter exponential distribution, while Coles' plot corresponds
to Gumbel; however the two plots differ only for small return periods.
This plot is identical to an expplot but with x and y scales changed:
only axis tick-marks differ. The convexity of the scatter plot is
therefore opposed in the two plots.
Confidence limits correspond to two-sided symmetrical
intervals. This means that the (random) confidence interval may be
under or above the true unknown value with the same
probabilities. E.g. the probability that the unknown quantile falls
above U.95
is 2.5%
. The two bounds are however generally
not symmetrical with respect to quant; such behaviour follows
from the use of the "delta" method for approximate intervals.
It is possible to add graphical material (points, lines) to this plot
using log(returnlev)
and quantile
coordinates. See
Examples section.
Yves Deville
Coles S. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer.
See expplot
for a classical exponential plot. See also
Renouv
to fit "Renouvellement" models. The
return.level
function in the extRemes
package.
## Typical probability vector
prob <- c(0.0001,
          seq(from = 0.01, to = 0.09, by = 0.01),
          seq(from = 0.10, to = 0.80, by = 0.10),
          seq(from = 0.85, to = 0.99, by = 0.01),
          0.995, 0.996, 0.997, 0.998, 0.999, 0.9995)
## Model parameters: rate = #evts by year, over nyear
lambda <- 4
nyear <- 30
theta.x <- 4
## draw points
n.x <- rpois(1, lambda = lambda * nyear)
x <- rexp(n.x, rate = 1 / theta.x)
## ML estimation (exponential)
lambda.hat <- n.x / nyear
theta.x.hat <- mean(x)
## Compute bounds (here exact)
alpha <- 0.05
quant <- qexp(p = prob, rate = 1 / theta.x.hat)
theta.L <- 2 * n.x * theta.x.hat / qchisq(1 - alpha / 2, df = 2 * n.x)
theta.U <- 2 * n.x * theta.x.hat / qchisq(alpha / 2, df = 2 * n.x)
L.95 <- qexp(p = prob, rate = 1 / theta.L)
U.95 <- qexp(p = prob, rate = 1 / theta.U)
## store in data.frame object
data <- data.frame(prob = prob, quant = quant, L.95 = L.95, U.95 = U.95)
RLplot(data = data, x = x, lambda = lambda.hat, duration = nyear,
       main = "Poisson-exponential return levels")
RLplot(data = data, x = x, lambda = lambda.hat, duration = nyear,
       mark.rl = 10, mark.labels = "10 ans", mono = FALSE,
       mark.col = "SeaGreen",
       main = "Poisson-exponential return levels")
points(x = log(50), y = 25, pch = 18, cex = 1.4, col = "purple")
text(x = log(50), y = 25, col = "purple", pos = 4,
     labels = "special event")
Round the quantiles of a pseudo prediction table such as that
computed by predict.Renouv.
roundPred(pred, dig.quant = NA)
pred |
The data.frame containing the predicted quantiles and return levels. |
dig.quant |
Number of digits. Guessed if not provided. |
Only the columns that can be considered as quantiles are
rounded. These are assumed to have names "quant" for
the expected return level, and "L." or "U." followed by a
percentage for lower and upper confidence limits (e.g. "L.95"
and "U.95" for 95% confidence limits).
The guessing of the number of digits is experimental.
A data.frame with the same structure as that given, but with some columns rounded.
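A small usage sketch, reusing the Garonne fit from the other examples:

fit <- Renouv(x = Garonne, plot = FALSE)
## round the return levels and confidence limits in the 'pred' table
roundPred(fit$pred)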
Simulate a random Rendata
object that can be used within the
Renouv
function for tests.
rRendata(lambda = 1, threshold = 0, effDuration = 100,
         distname.y = "exp", par.y = c(rate = 1),
         start = "1913-01-01", name = NULL,
         varName = "X", varUnit = "?",
         simDate = TRUE, roundDate = FALSE,
         MAX.effDuration = NULL,
         MAX.r = rep(1L, length(MAX.effDuration)),
         OTS.effDuration = NULL, OTS.threshold = NULL)
lambda |
The rate of the Homogeneous Poisson Process. |
threshold |
The threshold for the exceedances. |
effDuration |
The effective duration of the main Over Threshold (OT) period. This must be a positive value. |
distname.y |
Name of the distribution for the excesses to be simulated. See Details. |
par.y |
A named vector or list giving the parameters values for the distribution. The name must conform to the chosen distribution. |
start |
A |
name |
A name for the dataset which will be attached to it and be used by
some methods for |
varName |
Name of the simulated variable. |
varUnit |
Unit for the simulated variable (is used by plot). |
simDate |
Logical. If |
roundDate |
Logical. If |
MAX.effDuration |
Vector of the durations for the |
MAX.r |
Vector of the (positive) numbers of observations for |
OTS.effDuration |
Vector of durations for the |
OTS.threshold |
Vector of numerical thresholds for the observations in |
The distribution of the excesses named in distname.y
can be
any known distribution, provided that when prefixed with the usual
"r"
letter, the name gives the wanted simulation function. For
example, with distname.y = "exp"
, the rexp
function is
used and par.y
must thus contain an element with name
"rate"
.
When a suitable numeric threshold is given, the simulated marks of the
marked process are the sum of the threshold and of a random excess
drawn from distname.y
. When the threshold is not a finite
numeric value, the observed marks are the simulated values themselves.
The main OT sample is assumed to begin at start
. Historical MAX
blocks (if any) are assumed to be just before start
, and OTS
are just before start
or just before the beginning of the MAX
blocks when there are some. The dates are computed without taking
into consideration the problems of leap years or leap seconds.
An object with S3 class "Rendata"
. This class currently has
plot
and summary
methods.
By construction, each MAX block contains at least one
observation, while a random period of the same duration might have
none. The simulated number of events on a MAX block is generated using
a censored Poisson distribution. Care must be taken when estimations
are made on such data, since creating MAX blocks obviously
creates a positive bias on lambda. Such bias then also affects
the other parameters concerning the excesses, because these
parameters are no longer orthogonal to the rate parameter
lambda when historical data are used. The bias can be severe if
MAX blocks with small durations are used, or if large numbers of
events are chosen in MAX.r.
When effDuration is small relative to the inverse of
lambda, the number of simulated marks in the OT sample may be
zero, which can cause problems for some uses of the created data.
Yves Deville
plot.Rendata
, summary.Rendata
.
set.seed(1234)
rd <- rRendata(effDuration = 60,
               MAX.effDuration = rep(3, 6), MAX.r = rep(4, 6),
               distname.y = "exp", par.y = c(rate = 1/100))
plot(rd)
summary(rd)
rd2 <- rRendata(effDuration = 10,
                MAX.effDuration = rep(60, 2), MAX.r = rep(3, 2),
                simDate = FALSE,
                distname.y = "gpd", par.y = c(scale = 20, shape = 0.16))
plot(rd2)
rd3 <- rRendata(effDuration = 10,
                OTS.effDuration = rep(60, 2), OTS.threshold = rep(80, 2),
                simDate = FALSE,
                distname.y = "gpd", par.y = c(scale = 20, shape = 0.16))
plot(rd3)
## Renouv fit with historical data
fit <- Renouv(rd)
summary(fit)
Compute the empirical survival values and the empirical return periods at the observations of an object. These are used as plotting positions in several plots.
SandT(object, points = c("p", "H"), a = 0, naive = FALSE)
object | The object containing the data, with class "Rendata" or "Renouv". |
points | Option for the computation of the plotting positions: "p" (the default) or "H", see Details. |
a | Parameter used in the interpolation formula for the inverse return periods as in Hirsch and Stedinger (1987). |
naive | Logical. When TRUE, a naive computation is used, ignoring the historical (MAX or OTS) information. |
When the object contains historical information (MAX or OTS), the computation is an adaptation of Hirsch and Stedinger (1987) for the Marked Process (MP) context. The original method is devoted to block maxima and interpolates the survival values at the thresholds, which are computed first. For MP, the interpolation is done for the inverse return periods, and the survival values are deduced from those of the inverse return periods.
Nelson's formula provides unbiased estimates for the values of the cumulative hazard at the order statistics, and can thus be used to estimate the log-return periods as required on the return level plot.
A list with the following elements
x | Numeric vector containing the ordered values from all the available sources in the object: main sample, historical periods either 'MAX' or 'OTS'. |
group, groupNames | Integer and character vectors giving the source of the values in x. |
S, T | Numeric vectors of the same length as x, giving the estimated survival values and return periods. |
thresh, lambda.thresh, S.thresh, T.thresh | Vectors of thresholds and the corresponding estimations for the event rate, survival and return period. All the estimations are for the threshold values. |
The ppoints and Hpoints functions.
When using points = "H", the estimated values of the survival returned in S and those for the return period T no longer verify T = 1/(lambda * S), where lambda is the estimated rate. In this case, the values in T should be used in the return level plot, while the values in S should be used in the PP-plot.
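A short sketch contrasting the two options on the Garonne data, using only the components documented above; the axis labels are arbitrary.

## sketch: compare the plotting positions given by the two options
STp <- SandT(object = Garonne, points = "p")
STH <- SandT(object = Garonne, points = "H")
plot(STp$T, STp$x, col = STp$group, log = "x",
     xlab = "return period", ylab = "level")
points(STH$T, STH$x, pch = 3)   ## "H" positions: close but not identical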
Yves Deville
The original method for block maxima is described in
Hirsch R.M. and Stedinger J.R. (1987) "Plotting Positions for Historical Floods and their Precision". Water Resources Research, vol. 23, no. 4, pp. 715-727.

Millard S. and Neerchal N. (2001). Environmental Statistics with S-Plus. CRC Press.
The adaptation for the Marked Process context is described in the Renext Computing Details document.
## use an object with class "Rendata"
ST1 <- SandT(object = Garonne)
## basic return level plot
plot(ST1$T, ST1$x, col = ST1$group, log = "x")
## use an object with class "Renouv"
fit <- Renouv(x = Garonne, plot = FALSE)
ST2 <- SandT(object = fit)
plot(ST2$T, ST2$x, col = ST2$group, log = "x")
Compute non-skipped periods from start and end of skipped periods.
skip2noskip(skip = NULL, start = NULL, end = NULL)
skip | A data.frame object with columns start and end describing the skipped periods. |
start | Beginning of the whole period; when given, it must be before the beginning of the first skipped period. |
end | End of the whole period; when given, it must be after the end of the last skipped period. |
In a 'normal' use of this function, start and end are given, and are respectively before the beginning of the first skip period and after the end of the last skip period. Thus the returned data frame will have nrow(skip) + 1 rows. However, start and end can be NULL, in which case only the nrow(skip) - 1 "inner" non-skipped periods will be returned. If start and end are NULL and skip has only one row, the returned result is NULL.
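These edge cases can be checked on a toy example; the dates below are made up for the sketch.

## two skipped periods
skip <- data.frame(start = as.POSIXct(c("2000-01-01", "2005-01-01")),
                   end   = as.POSIXct(c("2000-06-01", "2005-06-01")))
## 'start' and 'end' given: nrow(skip) + 1 = 3 non-skipped periods
skip2noskip(skip = skip,
            start = as.POSIXct("1990-01-01"),
            end = as.POSIXct("2010-01-01"))
## 'start' and 'end' NULL: only the nrow(skip) - 1 = 1 inner period
skip2noskip(skip = skip)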
A data.frame object with two POSIXct columns named start and end. Each row corresponds to a non-skipped period.
Yves Deville
readXML for reading data from XML and csv files.
## Brest data embeds a description of the gaps
ns <- skip2noskip(skip = Brest$OTmissing)
ns2 <- skip2noskip(skip = Brest$OTmissing,
                   start = Brest$OTinfo$start,
                   end = Brest$OTinfo$end)
## check durations. dur2 should be equal to the effective
## duration (with an error of a fraction of day)
dur <- as.numeric(sum(ns$end - ns$start)) / 365.25
dur2 <- as.numeric(sum(ns2$end - ns2$start)) / 365.25
Density function, distribution function, quantile function and random generation for the Shifted Left Truncated Weibull distribution.
dSLTW(x, delta = 1.0, shape = 1.0, scale = 1.0, log = FALSE)
pSLTW(q, delta = 1.0, shape = 1.0, scale = 1.0, lower.tail = FALSE)
qSLTW(p, delta = 1.0, shape = 1.0, scale = 1.0)
rSLTW(n, delta = 1.0, shape = 1.0, scale = 1.0)
x, q | Vector of quantiles. |
p | Vector of probabilities. |
n | Number of observations. |
delta, shape, scale | Shift, shape and scale parameters. Vectors of length > 1 are not accepted. |
log | Logical; if TRUE, the log-density is returned by dSLTW. |
lower.tail | Logical; if TRUE, probabilities are P[X <= x]; otherwise P[X > x]. Note the unusual default FALSE. |
The SLTW distribution with shape $\beta$, scale $\sigma$ and shift $\delta$ has survival function

$$S(y) = \exp\left\{\left(\frac{\delta}{\sigma}\right)^\beta - \left(\frac{y}{\sigma}\right)^\beta\right\}, \qquad y > \delta.$$

This distribution is that of $Y$ conditional on $Y > \delta$, where $Y$ follows a Weibull distribution with shape $\beta$ and scale $\sigma$.
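The truncation relation can be verified numerically against pweibull; this is a sketch only, and recall that lower.tail = FALSE is the default of pSLTW.

## sketch: SLTW survival = conditional Weibull survival S(y) / S(delta)
delta <- 10; beta <- 1.5; sigma <- 4
y <- delta + c(0.5, 1, 5)
pSLTW(y, delta = delta, shape = beta, scale = sigma)
pweibull(y, shape = beta, scale = sigma, lower.tail = FALSE) /
    pweibull(delta, shape = beta, scale = sigma, lower.tail = FALSE)
## the two vectors should match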
The hazard and mean residual life (MRL) are monotone functions with the same monotonicity as their Weibull counterparts (with the same shape and scale). The moments, and even the expectation, do not have a simple expression.
This distribution is sometimes called power exponential. It is occasionally used in POT with the shift delta taken as the threshold, as it should be when the distribution for the level (and not for the exceedance) is known to be the standard Weibull distribution.
dSLTW gives the density function, pSLTW gives the distribution function, qSLTW gives the quantile function, and rSLTW generates random deviates.
Lomax for the Lomax distribution, which is a limit case of SLTW.
shape <- rexp(1) + 1
delta <- 10
xl <- qSLTW(c(0.001, 0.99), delta = delta, shape = shape)
x <- seq(from = xl[1], to = xl[2], length.out = 200)
f <- dSLTW(x, delta = delta, shape = shape)
plot(x, f, type = "l", main = "SLTW density")
F <- pSLTW(x, delta = delta, shape = shape)
plot(x, F, type = "l", main = "SLTW distribution")
X <- rSLTW(5000, delta = delta, shape = shape)
## Should be close to the uniform distribution
plot(ecdf(pSLTW(X, delta = delta, shape = shape)))
Methods computing the random spacings for the Largest Order Statistics of the marks in a POT renewal.
spacings(object, ...)

## S3 method for class 'numeric'
spacings(object, wExp = TRUE, ...)

## S3 method for class 'data.frame'
spacings(object, varName, wExp = TRUE, ...)

## S3 method for class 'Rendata'
spacings(object, type = c("MAX", "OTS", "OT"), wExp = TRUE, ...)
object | An object containing the marks. |
varName | Character vector of length 1 giving the name of the variable when object is a data.frame. |
wExp | Logical. If TRUE, the spacings are weighted by their rank, as explained in Details. |
type | Character specifying the data.frame to be used when object has class "Rendata": one of "MAX", "OTS" or "OT". |
... | Not used yet. |
The spacings are the differences between the Largest Order Statistics. They are useful for some estimation tasks or diagnostics. Given random variables $X_1$, $X_2$, ..., the $r$-th spacing $\Delta_r$ is the difference $\Delta_r = X_{(r)} - X_{(r+1)}$ between the $r$-th largest order statistic $X_{(r)}$ and the next one in decreasing order, $X_{(r+1)}$.
When the random variables form a sample of an exponential or Gumbel distribution, the spacings associated with the largest order statistics are, or tend to be, independent and exponentially distributed. More precisely, the weighted spacings $r\,\Delta_r$ have, or tend to have, the same exponential distribution. This can be used to estimate the shape parameter of the underlying distribution using only the largest order statistics. Moreover, the $r$ spacings $\Delta_1$, $\Delta_2$, ..., $\Delta_r$ built from the $r + 1$ largest order statistics are, or tend to be, independent from the $(r+1)$-th largest order statistic $X_{(r+1)}$.
When wExp is TRUE, the returned values are the weighted spacings $r\,\Delta_r$.
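A quick sketch of this property for an exponential sample, using the numeric method and the expplot diagnostic already used in the Examples below.

## sketch: weighted spacings of an exponential sample are again
## exponentially distributed with the same rate
set.seed(1)
xs <- rexp(1000, rate = 2)
sp <- spacings(xs)   ## weighted, since wExp = TRUE is the default
expplot(sp)          ## should look close to a straight line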
A list or vector containing the spacings. When the data are structured in blocks, as in the MAXdata slot of an object of class "Rendata", the spacings are computed from the order statistics within each block, to maintain independence with the next order statistic in decreasing order.
By default, the spacings are scaled as explained above, thus assuming that the marks are exponentially distributed.
Yves Deville
Embrechts P., Klüppelberg C. and Mikosch T. (1997) Modelling Extremal Events for Insurance and Finance. Springer.
sp <- spacings(rgumbel(200, loc = 0, scale = 1))
expplot(sp)
sp1 <- spacings(rgev(200, loc = 0, scale = 1, shape = 0.3))
expplot(sp1)
## spacings are computed by block
sp2 <- spacings(object = Garonne$MAXdata,
                varName = Garonne$info$varName)
expplot(unlist(sp2))
sp3 <- spacings(object = Garonne, type = "OT")
expplot(sp3)
Summary method for "Rendata" objects representing data to be used in renouvellement models.
## S3 method for class 'Rendata'
print(x, ...)

## S3 method for class 'Rendata'
summary(object, ...)

## S3 method for class 'summary.Rendata'
print(x, ...)
object | An object with class "Rendata". |
x | An object of class "Rendata" or "summary.Rendata" to be printed. |
... | Further arguments passed to or from other methods. |
## Brest example: no historical data
summary(Brest)
## Garonne example: with historical data
summary(Garonne)
Summary method for "Renouv" objects representing 'Renouvellement' (POT) fitted models.
## S3 method for class 'Renouv'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'Renouv'
summary(object, correlation = FALSE, symbolic.cor = FALSE, ...)

## S3 method for class 'summary.Renouv'
print(x, coef = TRUE, pred = TRUE, probT = FALSE,
      digits = max(3, getOption("digits") - 3),
      symbolic.cor = x$symbolic.cor,
      signif.stars = getOption("show.signif.stars"), ...)

## S3 method for class 'summary.Renouv'
format(x, ...)
object | An object with class "Renouv". |
x | An object of class "Renouv" or "summary.Renouv" to be printed. |
correlation | Logical; if TRUE, the correlation matrix of the estimated parameters is returned and printed. |
coef | Logical. If TRUE, the table of coefficients is printed. |
pred | Logical. If TRUE, the table of return levels (predictions) is printed. |
probT | If TRUE, p-values for the t statistic are printed in the table of coefficients. |
digits | The number of significant digits to use when printing. |
symbolic.cor | Logical. If TRUE, the correlations are printed in a symbolic form rather than as numbers. |
signif.stars | Logical. If TRUE, 'significance stars' are printed for each coefficient. |
... | Further arguments passed to or from other methods. |
print.summary.Renouv tries to be smart about formatting the coefficients, standard errors, return levels, etc. format.summary.Renouv returns a limited content as a character string; it does not embed the values of the coefficients nor the predictions.
The function summary.Renouv computes and returns a list of summary statistics concerning the object of class "Renouv" given in object. The returned list is an object with class "summary.Renouv". The function print.summary.Renouv does not return anything.
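For instance, the returned object can be stored and printed later with non-default options; this sketch uses only the arguments documented above.

fit <- Renouv(Brest)
s <- summary(fit)       ## an object of class "summary.Renouv"
print(s, pred = FALSE)  ## omit the table of return levels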
The model fitting function Renouv (used to build "Renouv" model objects), and summary.
## use Brest data
fit <- Renouv(Brest)
summary(fit)
Make translucent colors.
translude(colors, alpha = 0.6)
colors | A vector of colors in a format that can be understood by col2rgb. |
alpha | Vector of level(s) of opacity, between 0 (fully transparent) and 1 (fully opaque). |
A vector of translucent (or semi-transparent) colors.
Using the RColorBrewer package might be a better option than transluding chosen colors!
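A minimal usage sketch; the color names and the scatter of random points are arbitrary.

## semi-transparent points to show overplotting
cols <- translude(c("red", "steelblue"), alpha = 0.4)
plot(rnorm(200), rnorm(200), pch = 16, cex = 2,
     col = rep(cols, length.out = 200))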
Variance-covariance matrix of the estimates of a "Renouv" object.
## S3 method for class 'Renouv' vcov(object, ...)
object | Object of class "Renouv". |
... | Not used at this time. |
A variance-covariance matrix. The rows and columns correspond to the parameters of the Renouv object. These are the rate "lambda" for the Poisson process and the parameters of the distribution for the excesses over the threshold, with names depending on the chosen distribution.
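A typical use is to extract the standard errors of the estimates; this sketch reuses the Garonne fit from the SandT examples.

fit <- Renouv(x = Garonne, plot = FALSE)
V <- vcov(fit)
sqrt(diag(V))   ## standard errors of "lambda" and the excess parameters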
Yves Deville
The vcov generic.
Plots a vector using Weibull distribution scales.
weibplot(x, plot.pos = "exp", shape = NULL, scale = NULL,
         labels = NULL, mono = TRUE, ...)
x | The vector to be plotted. |
plot.pos | Plotting position for the points: either "exp" for expected ranks or "med" for a median rank approximation (see Details below). |
shape | Shape parameter for one or several Weibull lines to be plotted. |
scale | Scale parameter for one or several Weibull lines to be plotted. |
labels | Text to display in the legend when Weibull lines are specified. |
mono | Logical. If TRUE, a monochrome graph is drawn. |
... | Arguments to be passed to plot. |
This plot shows $\log[-\log(1 - F_i)]$ against $\log(x_{(i)})$, where $x_{(i)}$ is the $i$-th smallest of the $n$ observations and the plotting position $F_i$ is taken as $i/(n+1)$ if plot.pos is "exp", or as the "median rank" approximation $(i - 0.3)/(n + 0.4)$ if plot.pos is "med".
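The plotting positions can be reproduced by hand; this sketch uses the "exp" positions, and a roughly straight line indicates a good Weibull fit.

## sketch: manual "exp" plotting positions on a Weibull sample
xw <- sort(rweibull(50, shape = 2, scale = 3))
n <- length(xw)
Fhat <- seq_len(n) / (n + 1)        ## plot.pos = "exp"
plot(log(xw), log(-log(1 - Fhat)))  ## should be roughly linear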
The graph displayed uses a log scale for x. The log-log scale for y is emulated via the construction of suitable graduations. So be careful when adding graphical material (points, etc.) to this graph with functions of the "add to plot" family (points, lines, ...).
Yves Deville
The expplot function for an "exponential distribution" plot (dedicated to the shape = 1 case), and the fweibull function for ML estimation of the parameters.
x <- rweibull(200, shape = 1.2, scale = 1)
weibplot(x, main = "Classical Weibull plot")
## Weibull lines
weibplot(x, shape = c(0.9, 1.3), scale = 1)
weibplot(x, shape = c(0.9, 1.3), scale = 1,
         labels = c("before", "after"))
weibplot(x, shape = c(0.9, 1.3), scale = 1,
         labels = c("before", "after"), mono = TRUE)