
How to Fix Left Truncation, Conceptually

3/17/2015 1:33:00 AM
Figure 1 shows several customers over time, on the calendar time line. Two dates are highlighted: the earlier is the left truncation date and the later is the cutoff date. Only customers who are active on or after the left truncation date are in the database.
Figure 1: Customers who stop before the left truncation date are not included in the database.

Customer #5 started and stopped before the left truncation date. This customer is simply missing from the data. We do not even know to look for the customer, because the customer is not there. Customer #2 started at about the same time yet appears in the data, because this customer survived to the left truncation date. The fact that one customer is present and another absent is a property of the data, as opposed to a property of any particular record.

How can the hazard probabilities be calculated without the biases introduced by missing data? Answering this question requires a detailed look at the hazard calculation itself. Remember, the hazard probability at a particular tenure is the number of customers who have an observed stop at that tenure divided by the number of customers who are at risk of stopping. The population at risk is everyone who was active at that tenure who could have stopped, regardless of whether they stopped.
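As a point of reference, the basic hazard calculation described above (before any adjustment for left truncation) might be sketched as follows. This is only an illustration: the function name `hazard_probabilities`, the `(tenure, stopped)` record layout, and the sample data are all assumptions made for the example, not something taken from the original text.

```python
from collections import Counter

def hazard_probabilities(customers):
    """Hazard at each tenure = observed stops / population at risk.

    customers: iterable of (tenure, stopped) pairs (hypothetical layout),
    where tenure is a whole number of days and stopped is True when a
    stop was observed at that tenure.
    """
    stops = Counter()
    at_risk = Counter()
    for tenure, stopped in customers:
        # A customer with this tenure was active, and so at risk,
        # at every tenure from 0 up to and including it.
        for t in range(tenure + 1):
            at_risk[t] += 1
        if stopped:
            stops[tenure] += 1
    return {t: stops[t] / at_risk[t] for t in sorted(at_risk)}

# Made-up sample: four customers, three observed stops.
sample = [(0, True), (1, True), (2, False), (2, True)]
hazards = hazard_probabilities(sample)
print(hazards)
```

In this sample, four customers are at risk at tenure zero and one of them stops there, so the hazard at tenure zero is 1/4. The customer with `stopped` set to False still counts in the population at risk, because that customer could have stopped.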

Left truncation adds a twist. Consider the at‐risk population for customers at tenure zero in left truncated data. If a customer started before the left truncation date, the customer is not in the at‐risk pool for tenure zero. Customers who started before the left truncation date and would have a tenure of zero are simply not available in the data. So, the at‐risk population at tenure zero consists only of customers who started since the left truncation date.

Consider the at‐risk population for customers at tenure one. These customers have to be at risk of stopping at tenure one, and that stop would need to occur on or after the left truncation date. So, tenure one needs to occur on or after the left truncation date. In other words, the customer must start between one day before the left truncation date and one day before the cutoff date.

The general rule is that a customer is in the population at risk at tenure t when that tenure occurs on or after the left truncation date and before the cutoff date. The following two rules for membership in the population at risk encapsulate this observation:

Customers start in the time period from the left truncation date minus t to the cutoff date minus t; and,

Customers are active at tenure t.

Together, these two rules imply that the customer is active at that tenure in the period after the left truncation date.
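The two membership rules can be sketched in code as follows. This is a minimal illustration under stated assumptions: the function name `at_risk_and_stops`, the `(start, stop)` record layout, the dates, and the choice to treat both ends of the start window as inclusive are all assumptions made for the example. The text does not say whether the cutoff end is inclusive.

```python
from datetime import date, timedelta

def at_risk_and_stops(customers, left_truncation, cutoff, t):
    """Return (population at risk, observed stops) at tenure t, in days.

    customers: iterable of (start, stop) date pairs (hypothetical layout);
    stop is None when the customer is still active at the cutoff date.
    """
    at_risk = stops = 0
    for start, stop in customers:
        observed_stop = stop is not None and stop <= cutoff
        end = stop if observed_stop else cutoff
        tenure = (end - start).days
        # Rule 1: the customer started between (left truncation - t) and
        # (cutoff - t), so tenure t falls inside the observed window.
        # Inclusive endpoints are an assumption of this sketch.
        tenure_date = start + timedelta(days=t)
        in_window = left_truncation <= tenure_date <= cutoff
        # Rule 2: the customer is active at tenure t.
        active = tenure >= t
        if in_window and active:
            at_risk += 1
            if observed_stop and tenure == t:
                stops += 1
    return at_risk, stops

# Made-up dates and customers, for illustration only.
left_truncation = date(2015, 1, 10)
cutoff = date(2015, 1, 20)
customers = [
    (date(2015, 1, 1), None),               # started before truncation, still active
    (date(2015, 1, 5), date(2015, 1, 12)),  # started before truncation, stopped after it
    (date(2015, 1, 12), date(2015, 1, 12)), # started and stopped inside the window
    (date(2015, 1, 15), None),              # started inside the window, still active
]
print(at_risk_and_stops(customers, left_truncation, cutoff, 0))   # (2, 1)
print(at_risk_and_stops(customers, left_truncation, cutoff, 7))   # (1, 1)
print(at_risk_and_stops(customers, left_truncation, cutoff, 10))  # (1, 0)
```

At tenure zero only the two customers who started on or after the left truncation date count toward the population at risk, even though the other two are present in the data. Dividing the stops by the population at risk at each tenure then gives the hazard probability for that tenure.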
