Papers by Piet Groeneboom
It has often been noticed that, in observing the number of incidents that nurses experience during their shifts, there is a large variation between nurses. We propose a simple statistical model to explain this phenomenon and apply it to the Lucia de Berk case. * Mathematical Institute, Leiden University; http://www.math.leidenuniv.nl/~gill † DIAM, Delft University; http://ssor.twi.tudelft.nl/~pietg
We study weighted least squares estimators for the distribution function of observations which are only visible via interval censoring, i.e., in the situation where one only has information about an interval to which the variable of interest belongs and where one cannot observe it directly. The least squares estimators are shown to be closely related to nonparametric maximum likelihood estimators (NPMLEs) and to coincide with these in certain cases. New algorithms for computing the estimators are presented and it is shown that they converge from any starting point (in contrast with the EM algorithm in this situation). Finally, the estimation of non-smooth and smooth functionals of the model is considered; for the latter case, we discuss $\sqrt{n}$-consistency and efficiency of the NPMLE. * AMS 1991 subject classifications. 60F17, 62E20, 62G05, 62G20, 45A05.
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 1976
Probability Theory and Related Fields, 1994
Summary. In [4] a central limit theorem for the number of vertices of the convex hull of a uniform sample from the interior of a convex polygon is derived. This is done by approximating the process of vertices of the convex hull by the process of extreme points of a Poisson point process and by considering the latter process of
Statistica Neerlandica, 1977
We study nonparametric estimation of the sub-distribution functions for current status data with competing risks. Our main interest is in the nonparametric maximum likelihood estimator (MLE), and for comparison we also consider a simpler "naive estimator." Both types of estimators were studied by Jewell, van der Laan and Henneman [Biometrika (2003) 90 183-197], but little was known about their large sample properties. We have started to fill this gap, by proving that the estimators are consistent and converge globally and locally at rate $n^{1/3}$. We also show that this local rate of convergence is optimal in a minimax sense. The proof of the local rate of convergence of the MLE uses new methods, and relies on a rate result for the sum of the MLEs of the sub-distribution functions which holds uniformly on a fixed neighborhood of a point. Our results are used in Groeneboom, Maathuis and Wellner [Ann. Statist. (2008) 36 1064-1089] to obtain the local limiting distributions of the estimators. (P. Groeneboom, M. H. Maathuis and J. A. Wellner)
The Annals of Statistics, Oct 1, 2014
We study the maximum smoothed likelihood estimator (MSLE) for interval censoring, case 2. Characterizations in terms of convex duality conditions are given and strong consistency is proved. Moreover, we show that, under smoothness conditions on the underlying distributions and using the usual bandwidth choice in density estimation, the local convergence rate is $n^{-2/5}$, in contrast with the rate $n^{-1/3}$ of the ordinary maximum likelihood estimator. In this situation the MSLE is asymptotically equivalent to the solution of a non-linear integral equation, which we solve using the implicit function theorem in Banach spaces. It is shown that, again under appropriate conditions, the MSLE has a normal limit distribution of which the expectation and variance are explicitly determined as a function of the underlying distributions. This is done by showing that the solution of the non-linear integral equation is sufficiently close to the solution of a linear integral equation, which, in turn, is sufficiently close to a "toy estimator", depending on the underlying distributions, for which we can compute the asymptotic bias and variance explicitly.
We consider the problem of estimating a probability density function based on data that are corrupted by noise from a uniform distribution. The (nonparametric) maximum likelihood estimator for the corresponding distribution function is well defined. For the density function this is not the case. We study two nonparametric estimators for this density. The first is a type of kernel density estimate based on the empirical distribution function of the observable data. The second is a kernel density estimate based on the MLE of the distribution function of the unobservable (uncorrupted) data.
Ann Statist, Aug 3, 2001
A process associated with integrated Brownian motion is introduced that characterizes the limit behavior of nonparametric least squares and maximum likelihood estimators of convex functions and convex densities, respectively. We call this process "the invelope" and show that it is an almost surely uniquely defined function of integrated Brownian motion. Its role is comparable to the role of the greatest convex minorant of Brownian motion plus a parabolic drift in the problem of estimating monotone functions. An iterative cubic spline algorithm is introduced that solves the constrained least squares problem in the limit situation and some results, obtained by applying this algorithm, are shown to illustrate the theory.
For a version of the interval censoring model, case 2, in which the observation intervals are allowed to be arbitrarily small, we consider estimation of functionals that are differentiable along Hellinger differentiable paths. The asymptotic information lower bound for such functionals can be represented as the squared $L_2$-norm of the canonical gradient in the observation space. This canonical gradient has an implicit expression as a solution of an integral equation that does not belong to one of the standard types. We study an extended version of the integral equation that can also be used for discrete distribution functions like the nonparametric maximum likelihood estimator (NPMLE), and derive the asymptotic normality and efficiency of the NPMLE from properties of the solutions of the integral equations.
We study nonparametric isotonic confidence intervals for monotone functions. In Banerjee and Wellner (2001) pointwise confidence intervals, based on likelihood ratio tests for the restricted and unrestricted MLE in the current status model, are introduced. We extend the method to the treatment of other models with monotone functions, and demonstrate our method by a new proof of the results in Banerjee and Wellner (2001) and also by constructing confidence intervals for monotone densities, for which the theory still had to be developed. For the latter model we prove that the limit distribution of the LR test under the null hypothesis is the same as in the current status model. We compare the confidence intervals so obtained with confidence intervals using the smoothed maximum likelihood estimator (SMLE), using bootstrap methods. The "Lagrange-modified" cusum diagrams, developed here, are an essential tool both for the computation of the restricted MLEs and for the development of the theory for the confidence intervals, based on the LR tests.
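As a small illustration of the unrestricted MLE in the plain current status model mentioned above: with distinct inspection times, it is a standard fact that this MLE equals the isotonic (nondecreasing) least squares fit of the event indicators, ordered by inspection time. The sketch below computes it with the pool-adjacent-violators algorithm. The function names `pava` and `current_status_mle` are hypothetical, the code assumes no ties among the inspection times, and it does not implement the restricted MLE or the "Lagrange-modified" cusum diagrams of the paper.

```python
def pava(y, w=None):
    """Weighted least squares nondecreasing fit via pool-adjacent-violators."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    # Expand the blocks back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def current_status_mle(times, deltas):
    """Return the sorted inspection times and the MLE of F at those times.

    times  : inspection times (assumed distinct)
    deltas : 1 if the event happened before the inspection time, else 0
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    t_sorted = [times[i] for i in order]
    d_sorted = [deltas[i] for i in order]
    return t_sorted, pava(d_sorted)


if __name__ == "__main__":
    t, F = current_status_mle([3, 1, 2, 4], [1, 1, 0, 1])
    print(list(zip(t, F)))
```

The fitted values are the left derivatives of the greatest convex minorant of the cusum diagram of the ordered indicators, which is why cusum diagrams appear as a computational tool in this setting.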
We study the two-dimensional process of integrated Brownian motion and Brownian motion, where integrated Brownian motion is conditioned to be positive. The transition density of this process is derived from the asymptotic behavior of hitting times of the unconditioned process. Explicit expressions for the transition density in terms of confluent hypergeometric functions are derived, and it is shown how our results on the hitting time distributions imply previous results of Isozaki-Watanabe and Goldman. The conditioned process is characterized by a system of stochastic differential equations (SDEs) for which we prove an existence and unicity result. Some sample path properties are derived from the SDEs and it is shown that $t \mapsto t^{9/10}$ is a "critical curve" for the conditioned process in the sense that the expected time that the integral part of the conditioned process spends below any curve $t \mapsto t^{\alpha}$ is finite for $\alpha < 9/10$ and infinite for $\alpha \ge 9/10$.
J Comput Graph Stat, 2001
A distribution which arises in problems of estimation of monotone functions is that of the location of the maximum of two-sided Brownian motion minus a parabola. Using results of , (1989), we present algorithms and programs for computation of this distribution and its quantiles. We also present some comparisons with earlier computations
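To make the object in this abstract concrete, the following is a crude Monte Carlo sketch of the location of the maximum of two-sided Brownian motion minus a parabola. It is not the analytic method of the paper: it simply simulates $W(t) - t^2$ on a finite grid and records the argmax. The function names, the truncation window `half_width`, and the grid size `step` are all assumptions made for illustration, and both the truncation and the discretization introduce error.

```python
import math
import random


def argmax_bm_minus_parabola(rng, half_width=3.0, step=0.001):
    """Approximate the argmax of W(t) - t^2 over a grid on [-half_width, half_width].

    W is two-sided Brownian motion with W(0) = 0, built from two independent
    random walks with Gaussian increments, one for t > 0 and one for t < 0.
    """
    n = int(round(half_width / step))
    sd = math.sqrt(step)
    best_t, best_val = 0.0, 0.0  # value at t = 0 is W(0) - 0 = 0
    for sign in (1.0, -1.0):
        w = 0.0
        for k in range(1, n + 1):
            w += rng.gauss(0.0, sd)
            t = sign * k * step
            val = w - t * t
            if val > best_val:
                best_t, best_val = t, val
    return best_t


def simulate(n_samples=200, seed=0, half_width=3.0, step=0.001):
    """Draw approximate samples of the argmax, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [argmax_bm_minus_parabola(rng, half_width, step) for _ in range(n_samples)]


if __name__ == "__main__":
    sample = sorted(simulate())
    # Empirical quantiles of the simulated argmax (rough, simulation-based).
    for p in (0.5, 0.9, 0.975):
        print(p, sample[int(p * (len(sample) - 1))])
```

Simulation of this kind is slow and imprecise in the tails, which is one reason dedicated algorithms for the distribution and its quantiles are useful.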
Let $\{W(t) : t \in \mathbb{R}\}$ be two-sided Brownian motion, originating from zero, and let $V(a)$ be defined by $V(a) = \sup\{t \in \mathbb{R} : W(t) - (t-a)^2 \text{ is maximal}\}$. Then $\{V(a) : a \in \mathbb{R}\}$ is a Markovian jump process, running through the locations of maxima of two-sided Brownian motion with respect to the parabolas $f_a(t) = (t-a)^2$. We give an analytic expression for the infinitesimal generators of the processes $\{(a+t, V(a+t)) : t \ge 0\}$, $a \in \mathbb{R}$, in terms of Airy functions in Theorem 4.1. This makes it possible to develop asymptotics for the global behavior of a large class of isotonic estimators (i.e. estimators derived under order restrictions). An example of this is given in , where the asymptotic distribution of the (standardized) $L_1$-distance between a decreasing density and the Grenander maximum likelihood estimator of this density is determined. On our way to Theorem 4.1 we derive some other results. For example, we give an analytic expression for the joint density of the maximum and the location of the maximum of the process $\{W(t) - ct^2 : t \in \mathbb{R}\}$, where $c$ is an arbitrary positive constant. We also determine the Laplace transform of the integral over a Brownian excursion. These last results also have recently been derived by several other authors, using a variety of methods. * This paper was awarded the Rollo Davidson prize 1985 (Cambridge, UK). Theorem 2.1. Let, for $c > 0$, $s, x \in \mathbb{R}$, $Q^{(s,x)}$ be the probability measure on the Borel $\sigma$-field of $C([s, \infty); \mathbb{R})$, corresponding to the process $\{X(t) : t > s\}$, where $X(t) = W(t) - ct^2$ and $\{W(t) : t \ge s\}$ is Brownian motion, starting at $x + cs^2$ at time $s$. Let the first passage time $\tau_a$ of the process $X$ be defined by (2.1), where, as usual, we define $\tau_a = \infty$ if $\{t > s : X(t) = a\} = \emptyset$. Then
Bernoulli, Nov 1, 2013
Two new test statistics are introduced to test the null hypothesis that the sampling distribution has an increasing hazard rate on a specified interval $[0, a]$. These statistics are empirical $L_1$-type distances between the isotonic estimates, which use the monotonicity constraint, and either the empirical distribution function or the empirical cumulative hazard. They measure the excursions of the empirical estimates with respect to the isotonic estimates, due to local nonmonotonicity. Asymptotic normality of the test statistics, if the hazard is strictly increasing on $[0, a]$, is established under mild conditions. This is done by first approximating the global empirical distance by a distance with respect to the underlying distribution function. The resulting integral is treated as a sum of increasingly many local integrals to which a CLT can be applied. The behavior of the local integrals is determined by a canonical process: the difference between the stochastic process $x \mapsto W(x) + x^2$, where $W$ is standard two-sided Brownian motion, and its greatest convex minorant.
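The canonical process above is the gap between a path and its greatest convex minorant. As a minimal sketch of that operation on discrete data, the code below computes the greatest convex minorant of a finite set of points with a single stack pass (the lower convex hull) and then the nonnegative excursions above it. The function names are hypothetical, the x-values are assumed strictly increasing, and the example uses a small deterministic data set, not a simulated Brownian path.

```python
def greatest_convex_minorant(xs, ys):
    """Values at xs of the greatest convex minorant of the points (xs[i], ys[i]).

    Assumes xs is strictly increasing.
    """
    hull = []  # indices of the vertices of the lower convex hull
    for i in range(len(xs)):
        # Drop the last vertex while it lies on or above the chord
        # from the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    # Interpolate linearly between consecutive hull vertices.
    gcm = [0.0] * len(xs)
    if len(hull) == 1:
        gcm[hull[0]] = float(ys[hull[0]])
    for a, b in zip(hull, hull[1:]):
        slope = (ys[b] - ys[a]) / (xs[b] - xs[a])
        for j in range(a, b + 1):
            gcm[j] = ys[a] + slope * (xs[j] - xs[a])
    return gcm


def excursions(xs, ys):
    """Pointwise difference between the data and their greatest convex minorant."""
    return [y - g for y, g in zip(ys, greatest_convex_minorant(xs, ys))]


if __name__ == "__main__":
    xs = [0, 1, 2, 3]
    ys = [0, 2, 1, 3]
    print(greatest_convex_minorant(xs, ys))
    print(excursions(xs, ys))
```

Applied to a cusum diagram, the slopes of the minorant give the isotonic estimate, and the excursions measure local departures from monotonicity, which is the quantity the test statistics aggregate.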
J Nonparametr Stat, Sep 6, 2011
We consider the problem of estimating the joint distribution function of the event time and a continuous mark variable based on censored data. More specifically, the event time is subject to current status censoring and the continuous mark is only observed in case inspection takes place after the event time. The nonparametric maximum likelihood estimator (MLE) in this model is known to be inconsistent. We propose and study an alternative likelihood based estimator, maximizing a smoothed log-likelihood, hence called a maximum smoothed likelihood estimator (MSLE). This estimator is shown to be well defined and consistent, and a simple algorithm is described that can be used to compute it. The MSLE is compared with other estimators in a small simulation study.
Let $L_n$ be the length of the longest increasing subsequence of a random permutation of the numbers $1, \dots, n$, for the uniform distribution on the set of permutations. We discuss the "hydrodynamical approach" to the analysis of the limit behavior, which probably started with , and was subsequently further developed by several authors. We also give two proofs of an exact (non-asymptotic) result, announced in Rains .
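For readers who want to experiment with $L_n$, here is a small sketch that computes the length of the longest increasing subsequence by patience sorting and compares it with $2\sqrt{n}$, the well-known first-order limit of $L_n$. This is only a numerical illustration, not the hydrodynamical argument discussed in the paper; the function names `lis_length` and `simulate_ratio` are hypothetical.

```python
import bisect
import random


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    # tails[k] is the smallest possible last element of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for x in seq:
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)


def simulate_ratio(n, seed=0):
    """Return L_n / (2 * sqrt(n)) for one uniform random permutation of 1..n."""
    rng = random.Random(seed)
    perm = list(range(1, n + 1))
    rng.shuffle(perm)
    return lis_length(perm) / (2.0 * n ** 0.5)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, simulate_ratio(n))
```

The ratio approaches 1 slowly from below, since the expectation of $L_n$ falls short of $2\sqrt{n}$ by a correction of order $n^{1/6}$.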
Electronic Journal of Statistics 5, Sep 28, 2011
We study three estimators for the interval censoring case 2 problem: a histogram-type estimator, proposed in Birgé (1999), the maximum likelihood estimator (MLE), and the smoothed MLE, using a smoothing kernel. Our focus is on the asymptotic distribution of the estimators at a fixed point. The estimators are compared in a simulation study.