Requirement

In many instances for indications and modelling work, premium records need to be earned over calendar periods in order to better match premium with losses. Typically in SAS this is done by a loop that increments between the effective and expiry date of a policy and outputting the results at each step.

In R this can be done in a much easier way with both data.table and dplyr, although the data.table solution is much faster. The code below will show how to split an entire policy period over a user-selected period as a fraction of 1.000. This earned_factor can be applied against any written measures to get an earned result.

References

Required packges

data.table - Extension of data.frame
dplyr - A Grammar of Data Manipulation
tidyr - Easily Tidy Data with spread() and gather() Functions
zoo - S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations)

Inputs

policy_data - Sample data.frame containing:
- policy - Policy number
- policy_eff - Policy effective date
- policy_exp - Policy expiry date
- NOTE: There is an assumption that the combinations of the three values above are unique
increment_period - Length of the earning period, e.g. day, week, month, quarter or year, see seq.Date. For this example the increment_period is 1 quarter to match the as.yearqtr function used. If using other periods, then the as.yearqtr function needs to be changed.
reporting_date - Date the earning will stop

Outputs

increment_start - Beginning of the earning period
increment_end - End of the earning period
earned_period - The amount of time (as a fraction of 1) earned in the period

Initiate a sample problemset

# define data
policy_data <-  data.frame( policy = c("a","b","c"),
                            pol_eff = as.Date(c("2018-01-01","2018-07-03","2018-06-30")),
                            pol_exp = as.Date(c("2018-07-03","2019-10-16","2019-10-15")))
policy_data_by <- c("policy", "pol_eff", "pol_exp")
# increment_period <- "1 quarter"
reporting_date <- as.Date("2019-07-25")

Solution 1: `data.table`

library(data.table)
library(zoo)

# convert the data.frame to a data.table
# keep records where the pol_eff is before the reporting_date
# create increment_date as a vector of quarterly dates in each row of data up to earlier of pol_exp and reporting_date
# transposed for each combination of policy_data_by
policy_data_dt <- setDT(policy_data)[pol_eff <= reporting_date, 
                                     .(increment_date = seq(from = as.Date(as.yearqtr(pol_eff), frac = 0),
                                                            to = pmin(pol_exp, reporting_date), 
                                                            by = "1 quarter")),
                                     by = c(policy_data_by)]
# use the increment_date to define the first and last date of the increment_period
policy_data_dt[, `:=` (increment_start = pmax(as.Date(as.yearqtr(increment_date), frac = 0), pol_eff),
                       increment_end   = pmin(as.Date(as.yearqtr(increment_date), frac = 1), pmin(pol_exp, reporting_date))
                       )]
# calculate the earned_period
# remove the increment_date variable
policy_data_dt[, `:=` (earned_period  = round(  as.numeric(increment_end - increment_start + 1) / 
                                                as.numeric(pol_exp - pol_eff + 1), 3),
                       increment_date = NULL
                       )]

##     policy    pol_eff    pol_exp increment_start increment_end earned_period
##  1:      a 2018-01-01 2018-07-03      2018-01-01    2018-03-31         0.489
##  2:      a 2018-01-01 2018-07-03      2018-04-01    2018-06-30         0.495
##  3:      a 2018-01-01 2018-07-03      2018-07-01    2018-07-03         0.016
##  4:      b 2018-07-03 2019-10-16      2018-07-03    2018-09-30         0.191
##  5:      b 2018-07-03 2019-10-16      2018-10-01    2018-12-31         0.195
##  6:      b 2018-07-03 2019-10-16      2019-01-01    2019-03-31         0.191
##  7:      b 2018-07-03 2019-10-16      2019-04-01    2019-06-30         0.193
##  8:      b 2018-07-03 2019-10-16      2019-07-01    2019-07-25         0.053
##  9:      c 2018-06-30 2019-10-15      2018-06-30    2018-06-30         0.002
## 10:      c 2018-06-30 2019-10-15      2018-07-01    2018-09-30         0.195
## 11:      c 2018-06-30 2019-10-15      2018-10-01    2018-12-31         0.195
## 12:      c 2018-06-30 2019-10-15      2019-01-01    2019-03-31         0.190
## 13:      c 2018-06-30 2019-10-15      2019-04-01    2019-06-30         0.192
## 14:      c 2018-06-30 2019-10-15      2019-07-01    2019-07-25         0.053

Solution 2: `dplyr`

library(dplyr)
library(tidyr)
library(zoo)

policy_data_dp <- policy_data %>%
  # turn the data.frame into a tibble which is more efficent
  as_tibble() %>%
  # keep records where the pol_eff is before the reporting_date
  filter(pol_eff <= reporting_date) %>%
  # set the processing to be done on each row
  rowwise() %>%
  # create increment_date as a vector of quarterly dates in each row of data up to earlier of pol_exp and reporting_date
  mutate(increment_date = list(seq(from = as.Date(as.yearqtr(pol_eff), frac = 0),
                                   to = pmin(pol_exp, reporting_date),
                                   by = "1 quarter"))) %>%
  # transposed for each row of data
  unnest(increment_date) %>%
  # reset grouping
  ungroup() %>%
  # use the increment_date to define the first and last date of the increment_period
  # calculate the earned_period
  mutate( increment_start = pmax(as.Date(as.yearqtr(increment_date), frac = 0), pol_eff),
          increment_end   = pmin(as.Date(as.yearqtr(increment_date), frac = 1), pmin(pol_exp, reporting_date)),
          earned_period   = round(  as.numeric(increment_end - increment_start + 1) / 
                                    as.numeric(pol_exp - pol_eff + 1), 3)
          ) %>%
  # remove the increment_date variable
  select(-increment_date) %>%
  # identity
  identity()

## # A tibble: 14 x 6
##    policy pol_eff    pol_exp    increment_start increment_end earned_period
##    <chr>  <date>     <date>     <date>          <date>                <dbl>
##  1 a      2018-01-01 2018-07-03 2018-01-01      2018-03-31            0.489
##  2 a      2018-01-01 2018-07-03 2018-04-01      2018-06-30            0.495
##  3 a      2018-01-01 2018-07-03 2018-07-01      2018-07-03            0.016
##  4 b      2018-07-03 2019-10-16 2018-07-03      2018-09-30            0.191
##  5 b      2018-07-03 2019-10-16 2018-10-01      2018-12-31            0.195
##  6 b      2018-07-03 2019-10-16 2019-01-01      2019-03-31            0.191
##  7 b      2018-07-03 2019-10-16 2019-04-01      2019-06-30            0.193
##  8 b      2018-07-03 2019-10-16 2019-07-01      2019-07-25            0.053
##  9 c      2018-06-30 2019-10-15 2018-06-30      2018-06-30            0.002
## 10 c      2018-06-30 2019-10-15 2018-07-01      2018-09-30            0.195
## 11 c      2018-06-30 2019-10-15 2018-10-01      2018-12-31            0.195
## 12 c      2018-06-30 2019-10-15 2019-01-01      2019-03-31            0.19 
## 13 c      2018-06-30 2019-10-15 2019-04-01      2019-06-30            0.192
## 14 c      2018-06-30 2019-10-15 2019-07-01      2019-07-25            0.053

Insurance earned premium calculation in R

Simon

2019/04/15

Requirement

References

Required packges

Inputs

Outputs

Initiate a sample problemset

Solution 1: `data.table`

Solution 2: `dplyr`

Insurance earned premium calculation in R

Simon

2019/04/15

Requirement

References

Required packges

Inputs

Outputs

Initiate a sample problemset

Solution 1: data.table

Solution 2: dplyr

Solution 1: `data.table`

Solution 2: `dplyr`