Figure 1 shows several customers over time, on a calendar timeline. Two dates are highlighted: the earlier is the left truncation date and the later is the cutoff date. Only customers who are active after the left truncation date are in the database.
Customer #5 started and stopped before the left truncation date. This customer
is simply missing from the data. We do not even know to look for the
customer, because the customer is not there. Customer #2 started at
about the same time yet appears in the data, because this customer
survived to the left truncation date. The fact that one customer is
present and another absent is a property of the data, as opposed to a
property of any particular record.
How can the hazard probabilities be calculated without the biases introduced by missing data?
Answering this question requires a detailed look at the hazard
calculation itself. Remember, the hazard probability at a particular
tenure is the number of customers who have an observed stop at that
tenure divided by the number of customers who are at risk of stopping.
The population at risk is everyone who was active at that tenure who
could have stopped, regardless of whether they stopped.
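The calculation just described can be sketched in a few lines of Python. This is a minimal illustration that ignores left truncation, not a definitive implementation. The function name `hazard_probabilities` and the `(tenure, stopped)` input layout are assumptions made for the example, and tenures are assumed to be whole numbers of days.

```python
from collections import Counter

def hazard_probabilities(customers):
    """Return {tenure: hazard} for a list of (tenure, stopped) pairs.

    tenure  -- whole number of days the customer was observed (assumed layout)
    stopped -- True if the customer stopped at that tenure, False if still active
    """
    if not customers:
        return {}
    # Number of observed stops at each tenure.
    stops_at = Counter(tenure for tenure, stopped in customers if stopped)
    max_tenure = max(tenure for tenure, _ in customers)
    hazards = {}
    for t in range(max_tenure + 1):
        # At risk at t: everyone observed for at least t days,
        # whether or not they went on to stop.
        at_risk = sum(1 for tenure, _ in customers if tenure >= t)
        hazards[t] = stops_at[t] / at_risk
    return hazards

# Four hypothetical customers: two stop early, one stops at tenure 2,
# and one is still active at tenure 2.
sample = [(0, True), (1, True), (2, False), (2, True)]
print(hazard_probabilities(sample))
```

With this sample, four customers are at risk at tenure zero and one of them stops, so the hazard there is 1/4. Three remain at risk at tenure one, and two at tenure two.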
Left truncation adds a twist. Consider the at‐risk population for customers
at tenure zero in left truncated data. If a customer started before the
left truncation date, the customer is not in the at‐risk pool for
tenure zero. Customers who started before the left truncation date and
would have a tenure of zero are simply not available in the data. So,
the at‐risk population at tenure zero consists only of customers who
started since the left truncation date.
Consider the at‐risk population for customers at tenure one. These customers
have to be at risk of stopping at tenure one, and the stop needs to
occur on or after the left truncation date. So, tenure one needs to occur on
or after the left truncation date and by the cutoff date. In other words,
the customer must start between one day before the left truncation date
and one day before the cutoff date.
The general rule is that a customer is in the population at risk at tenure t
when that tenure occurs on or after the left truncation date and before
the cutoff date. The following two rules for membership in the
population at risk encapsulate this observation:
Customers start in the time period from the left truncation date minus t to the cutoff date minus t; and,
Customers are active at tenure t.
Together, these two rules imply that the customer is active at that tenure in the period after the left truncation date.
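The two membership rules can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the names `is_at_risk` and `hazard_at` are hypothetical, each customer is represented as a `(start, stop)` pair of dates with `stop` set to `None` for customers still active, tenure is measured in days, and both ends of the start window are treated as inclusive.

```python
from datetime import date, timedelta

def is_at_risk(start, stop, t, left_trunc, cutoff):
    """True if the customer belongs to the at-risk population at tenure t."""
    # Rule 1: tenure t falls on a calendar date inside the observed window,
    # i.e. start lies between (left_trunc - t) and (cutoff - t).
    tenure_date = start + timedelta(days=t)
    in_window = left_trunc <= tenure_date <= cutoff
    # Rule 2: the customer is active at tenure t.
    end = stop if stop is not None else cutoff
    active_at_t = (end - start).days >= t
    return in_window and active_at_t

def hazard_at(customers, t, left_trunc, cutoff):
    """Hazard at tenure t for (start, stop) pairs; None if nobody is at risk."""
    pool = [(s, e) for s, e in customers if is_at_risk(s, e, t, left_trunc, cutoff)]
    if not pool:
        return None
    stops = sum(1 for s, e in pool if e is not None and (e - s).days == t)
    return stops / len(pool)

# Hypothetical data: one customer who started before the left truncation date
# and survived past it, and two who started afterwards.
left_trunc, cutoff = date(2020, 1, 1), date(2020, 12, 31)
customers = [
    (date(2019, 12, 1), None),             # started 31 days before truncation
    (date(2020, 2, 1), date(2020, 2, 11)), # stopped at tenure 10
    (date(2020, 3, 1), None),              # still active
]
print(hazard_at(customers, 10, left_trunc, cutoff))
```

In this example the first customer is excluded from the at-risk pool for every tenure below 31, because those tenures occurred before the left truncation date. Only the other two customers count at tenure 10, and one of them stops there.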