# Example of using the Lexis prior: “Nonparametric Bayesian Intensity Model: Exploring Time-to-Event Data on Two Time Scales”

Our article (Härkänen, But and Haukka, 2017, Nonparametric Bayesian Intensity Model: Exploring Time-to-Event Data on Two Time Scales) will appear soon in the Scandinavian Journal of Statistics. The Lexis example, as well as the example on reading the output using the simulated data, can be found in the Download section.

The Lexis prior incorporates multidimensional smoothing of the hazard surface. We demonstrated in our paper that this approach provides more accurate results than some common unidimensional methods, such as Poisson regression with splines, because the Lexis prior borrows strength in more than one dimension. The method is useful not only for analyzing multiple time scales but also when ordinal covariates define a stratification in a hazard model. In the latter case one can assume that the hazard functions of neighbouring covariate categories are similar, so that there is some continuity across the hazard functions.

The smoothing is especially useful when the number of observations per stratum is relatively small, as it reduces the risk of false positive findings. With other existing methods, the common approach is to merge strata in order to obtain a larger number of observations, but merging can hide the actual change points. Our approach avoids this problem.

# Flexible event history analyses using the Bite software

Introduction: As data sets with follow-up data containing event times, such as dates of diagnosis, treatment, recovery and death, are becoming more commonly available, the need for more detailed statistical analyses that accommodate these additional event history data has also been increasing. Each time scale defined by an event can be assumed to influence the hazard of future events through a function of the time elapsed on that scale. For example, consider death as the outcome. Time since birth (age) is an important determinant of the hazard of death, and in the general population this hazard is monotonically increasing. After a diagnosis of cancer (without information on the lethality of the cancer) the hazard of death may not be monotonic, as patients with a benign tumor have a risk close to that of the healthy population whereas other patients have a higher risk. Therefore the estimates of the additional hazard may be much higher during the first couple of years after the diagnosis, but close to the risk of the general population after that. As for the other time scales, the introduction of new treatments can decrease the hazard for all patients regardless of their age, so the hazard of death can decrease sharply after that point in calendar time.

Standard statistical software that incorporates different time scales and flexible nonparametric methods has been limited, although during the past decades we have seen rapid improvement in both numerical methods and computational resources, allowing applications of a wider range of statistical methodology than before.

Method: Intensity processes are a particularly useful family of models for incorporating past event times into a model that predicts future events. Several approaches have been introduced to combine the effects of different risk factors into a hazard model. The most common are based on the assumption of multiplicative hazards; the alternative is based on additivity. These assumptions can also be applied to combine the hazard rates of different time scales.
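As a rough sketch of the two assumptions, consider the hazard of death on two time scales, age and time since diagnosis. The notation here is illustrative and not taken from the paper: $t$ is calendar time, $b_i$ and $d_i$ are the birth and diagnosis dates of individual $i$, and $f$ and $g$ are nonnegative functions of age and of time since diagnosis. For $t > d_i$ the two forms could be written as:

```latex
% Multiplicative hazards: the two time scales act as factors
\lambda_i(t) = f(t - b_i)\, g(t - d_i)

% Additive hazards: the diagnosis adds an excess hazard on top of the age effect
\lambda_i(t) = f(t - b_i) + g(t - d_i)
```

In the multiplicative form $g$ reads as a hazard ratio relative to the age-specific hazard, whereas in the additive form $g$ reads as an excess hazard.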

Motivation: When I started working on my PhD thesis (Härkänen 2001) in the mid-90s, theoretical work and the first applications of nonparametric Bayesian methods for intensity processes had just been published. In my own applications I noticed that models based on intensity processes were intuitively easy to construct, but an efficient implementation required plenty of coding. Markov chain Monte Carlo (MCMC) methods are computationally intensive, so the C language was the only viable choice 20 years ago. When optimizing the code for updating a parameter of a hazard function, it was necessary to avoid calculating the excess Poisson likelihood terms, which cancel out in the Hastings ratio. In a multiplicative or additive hazards model this is not straightforward, so I decided to write a program that removes the need to do this optimization by hand for each model separately. The result is the Bite software, which can be downloaded from this site.
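The cancellation mentioned above can be illustrated with a minimal Python sketch. It is not code from Bite: it assumes a single time scale, a piecewise constant hazard, follow-up starting at time 0, and made-up data, and all function and variable names are hypothetical. It only covers the likelihood part of the Hastings ratio (the prior and proposal terms are left out). When one hazard level is updated, only the Poisson likelihood terms of that interval change, so the local computation gives the same log ratio as evaluating the full likelihood twice.

```python
import math

def exposure_and_events(exit_times, events, cuts):
    """Per-interval exposure time and event counts (follow-up starts at 0)."""
    n = len(cuts) - 1
    exposure = [0.0] * n
    counts = [0] * n
    for t, d in zip(exit_times, events):
        for k in range(n):
            lo, hi = cuts[k], cuts[k + 1]
            if t > lo:
                # Time this individual spent at risk in interval k
                exposure[k] += min(t, hi) - lo
            if d and lo < t <= hi:
                counts[k] += 1
    return exposure, counts

def full_loglik(levels, exposure, counts):
    """Poisson log-likelihood summed over all intervals."""
    return sum(c * math.log(h) - h * e
               for h, e, c in zip(levels, exposure, counts))

def local_log_ratio(old, new, exposure_j, count_j):
    """Log-likelihood ratio using only the interval whose level changed."""
    return count_j * (math.log(new) - math.log(old)) - (new - old) * exposure_j

# Made-up data: exit times and event indicators (1 = death, 0 = censored)
cuts = [0.0, 1.0, 2.0, 5.0]
exit_times = [0.5, 1.5, 3.0, 4.2]
events = [1, 0, 1, 1]
exposure, counts = exposure_and_events(exit_times, events, cuts)

# Propose a new level for the last interval only
levels = [0.2, 0.3, 0.4]
proposal = [0.2, 0.3, 0.5]
full_diff = (full_loglik(proposal, exposure, counts)
             - full_loglik(levels, exposure, counts))
local_diff = local_log_ratio(levels[2], proposal[2], exposure[2], counts[2])
print(full_diff, local_diff)  # the two values agree up to rounding
```

With several time scales combined multiplicatively or additively, the set of likelihood terms affected by one update depends on how the scales overlap for each individual, which is why doing this bookkeeping by hand for each model is tedious.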

Example: To illustrate a multiplicative and an additive hazards model for the hazard of death after a cancer diagnosis, one can specify these in Bite using the syntax

```
## Choose a multiplicative hazards model:
model death = f(birth) * g(diagnosis);
## ... or ...