Insurance earned premium calculation in R

Simon

2019/04/15

Requirement

In many instances for indications and modelling work, premium records need to be earned over calendar periods in order to better match premium with losses. Typically in SAS this is done by a loop that increments between the effective and expiry date of a policy and outputting the results at each step.

In R this can be done in a much easier way with both data.table and dplyr, although the data.table solution is much faster. The code below will show how to split an entire policy period over a user-selected period as a fraction of 1.000. This earned_factor can be applied against any written measures to get an earned result.

References

Required packges

Inputs

Outputs

Initiate a sample problemset

# define data
policy_data <-  data.frame( policy = c("a","b","c"),
                            pol_eff = as.Date(c("2018-01-01","2018-07-03","2018-06-30")),
                            pol_exp = as.Date(c("2018-07-03","2019-10-16","2019-10-15")))
policy_data_by <- c("policy", "pol_eff", "pol_exp")
# increment_period <- "1 quarter"
reporting_date <- as.Date("2019-07-25")

Solution 1: data.table

library(data.table)
library(zoo)

# convert the data.frame to a data.table
# keep records where the pol_eff is before the reporting_date
# create increment_date as a vector of quarterly dates in each row of data up to earlier of pol_exp and reporting_date
# transposed for each combination of policy_data_by
policy_data_dt <- setDT(policy_data)[pol_eff <= reporting_date, 
                                     .(increment_date = seq(from = as.Date(as.yearqtr(pol_eff), frac = 0),
                                                            to = pmin(pol_exp, reporting_date), 
                                                            by = "1 quarter")),
                                     by = c(policy_data_by)]
# use the increment_date to define the first and last date of the increment_period
policy_data_dt[, `:=` (increment_start = pmax(as.Date(as.yearqtr(increment_date), frac = 0), pol_eff),
                       increment_end   = pmin(as.Date(as.yearqtr(increment_date), frac = 1), pmin(pol_exp, reporting_date))
                       )]
# calculate the earned_period
# remove the increment_date variable
policy_data_dt[, `:=` (earned_period  = round(  as.numeric(increment_end - increment_start + 1) / 
                                                as.numeric(pol_exp - pol_eff + 1), 3),
                       increment_date = NULL
                       )]
##     policy    pol_eff    pol_exp increment_start increment_end earned_period
##  1:      a 2018-01-01 2018-07-03      2018-01-01    2018-03-31         0.489
##  2:      a 2018-01-01 2018-07-03      2018-04-01    2018-06-30         0.495
##  3:      a 2018-01-01 2018-07-03      2018-07-01    2018-07-03         0.016
##  4:      b 2018-07-03 2019-10-16      2018-07-03    2018-09-30         0.191
##  5:      b 2018-07-03 2019-10-16      2018-10-01    2018-12-31         0.195
##  6:      b 2018-07-03 2019-10-16      2019-01-01    2019-03-31         0.191
##  7:      b 2018-07-03 2019-10-16      2019-04-01    2019-06-30         0.193
##  8:      b 2018-07-03 2019-10-16      2019-07-01    2019-07-25         0.053
##  9:      c 2018-06-30 2019-10-15      2018-06-30    2018-06-30         0.002
## 10:      c 2018-06-30 2019-10-15      2018-07-01    2018-09-30         0.195
## 11:      c 2018-06-30 2019-10-15      2018-10-01    2018-12-31         0.195
## 12:      c 2018-06-30 2019-10-15      2019-01-01    2019-03-31         0.190
## 13:      c 2018-06-30 2019-10-15      2019-04-01    2019-06-30         0.192
## 14:      c 2018-06-30 2019-10-15      2019-07-01    2019-07-25         0.053

Solution 2: dplyr

library(dplyr)
library(tidyr)
library(zoo)

policy_data_dp <- policy_data %>%
  # turn the data.frame into a tibble which is more efficent
  as_tibble() %>%
  # keep records where the pol_eff is before the reporting_date
  filter(pol_eff <= reporting_date) %>%
  # set the processing to be done on each row
  rowwise() %>%
  # create increment_date as a vector of quarterly dates in each row of data up to earlier of pol_exp and reporting_date
  mutate(increment_date = list(seq(from = as.Date(as.yearqtr(pol_eff), frac = 0),
                                   to = pmin(pol_exp, reporting_date),
                                   by = "1 quarter"))) %>%
  # transposed for each row of data
  unnest(increment_date) %>%
  # reset grouping
  ungroup() %>%
  # use the increment_date to define the first and last date of the increment_period
  # calculate the earned_period
  mutate( increment_start = pmax(as.Date(as.yearqtr(increment_date), frac = 0), pol_eff),
          increment_end   = pmin(as.Date(as.yearqtr(increment_date), frac = 1), pmin(pol_exp, reporting_date)),
          earned_period   = round(  as.numeric(increment_end - increment_start + 1) / 
                                    as.numeric(pol_exp - pol_eff + 1), 3)
          ) %>%
  # remove the increment_date variable
  select(-increment_date) %>%
  # identity
  identity()
## # A tibble: 14 x 6
##    policy pol_eff    pol_exp    increment_start increment_end earned_period
##    <chr>  <date>     <date>     <date>          <date>                <dbl>
##  1 a      2018-01-01 2018-07-03 2018-01-01      2018-03-31            0.489
##  2 a      2018-01-01 2018-07-03 2018-04-01      2018-06-30            0.495
##  3 a      2018-01-01 2018-07-03 2018-07-01      2018-07-03            0.016
##  4 b      2018-07-03 2019-10-16 2018-07-03      2018-09-30            0.191
##  5 b      2018-07-03 2019-10-16 2018-10-01      2018-12-31            0.195
##  6 b      2018-07-03 2019-10-16 2019-01-01      2019-03-31            0.191
##  7 b      2018-07-03 2019-10-16 2019-04-01      2019-06-30            0.193
##  8 b      2018-07-03 2019-10-16 2019-07-01      2019-07-25            0.053
##  9 c      2018-06-30 2019-10-15 2018-06-30      2018-06-30            0.002
## 10 c      2018-06-30 2019-10-15 2018-07-01      2018-09-30            0.195
## 11 c      2018-06-30 2019-10-15 2018-10-01      2018-12-31            0.195
## 12 c      2018-06-30 2019-10-15 2019-01-01      2019-03-31            0.19 
## 13 c      2018-06-30 2019-10-15 2019-04-01      2019-06-30            0.192
## 14 c      2018-06-30 2019-10-15 2019-07-01      2019-07-25            0.053