Python Insurance Data Analysis

Evaluation Criteria:

• We are looking for you to demonstrate your proficiency in coding languages and your ability to produce legible and maintainable code in an analytical setting.

• Your submission will be graded on your familiarity with a command-line programming language. We prefer submissions in Python because it is the primary language used for Claims Analytics. Other acceptable languages include R, Julia, or SQL. We will not accept submissions made with GUI-based packages or Excel.

• Your submission will be evaluated on your chosen method, design, cohesion, and appropriate analysis.

• It will not be evaluated on the existence of null results, complexity of response, or code style.

For this exercise, consider that you are a Claims Analyst tasked with reporting on Bodily Injury Severity. Of primary interest to your Claims Management team is the relationship between the time a claim takes to close (measured from the date it opened) and the total settlement amount. Claims processing time can affect the settlement amount, so the cycle time of the claims process is often tracked. Other variables also affect the settlement amount, so feel free to include them as you see fit.
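As a rough illustration of the cycle-time measure described above, the sketch below computes the number of days from open to close for a few made-up claims. The field names (`opened`, `closed`, `settlement`) and the sample values are assumptions for illustration only; they are not taken from the Closed2009 file, whose actual column names should be read from the field definition file.

```python
from datetime import date


def cycle_time_days(opened: date, closed: date) -> int:
    """Return the number of days from claim open to claim close."""
    if closed < opened:
        # A close date before the open date is treated as a data error here.
        raise ValueError("close date precedes open date")
    return (closed - opened).days


# Hypothetical claims; field names and values are invented for this sketch.
claims = [
    {"opened": date(2009, 1, 1), "closed": date(2009, 1, 31), "settlement": 5000.0},
    {"opened": date(2008, 6, 1), "closed": date(2009, 6, 1), "settlement": 40000.0},
]

# Attach the cycle time to each claim record.
for claim in claims:
    claim["cycle_days"] = cycle_time_days(claim["opened"], claim["closed"])
```

Rejecting claims whose close date precedes the open date is one possible choice; an analyst might instead flag such rows and report how many were excluded.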

Use the "Closed2009" and "ClosedClaimDataFieldDefinition2009" files from [login to view URL]:10.7910/DVN/AQMV0O as your dataset.


Your assignment is to report on the settlement amounts of claims and the length of time to close the claim (from open to close). Analyze the relationship between these two metrics and generate a summary analysis.
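One possible way to summarize the relationship between the two metrics is a rank correlation plus a comparison of median settlements for faster and slower claims. The sketch below uses only the Python standard library and invented numbers; the helper names (`ranks`, `spearman`, `median_by_cutoff`) and the 365-day cutoff are assumptions for illustration, not a prescribed method. Spearman's rank correlation is used here because settlement amounts tend to be heavily skewed, but other measures may suit the data equally well.

```python
from statistics import mean, median


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def median_by_cutoff(days, amounts, cutoff=365):
    """Median settlement for claims closed within `cutoff` days versus later."""
    fast = [a for d, a in zip(days, amounts) if d <= cutoff]
    slow = [a for d, a in zip(days, amounts) if d > cutoff]
    summary = {}
    if fast:
        summary["within_cutoff"] = median(fast)
    if slow:
        summary["beyond_cutoff"] = median(slow)
    return summary


# Invented example values, not drawn from the Closed2009 data.
cycle_days = [30, 90, 200, 400, 800]
settlements = [5000.0, 12000.0, 9000.0, 40000.0, 150000.0]

rho = spearman(cycle_days, settlements)
summary = median_by_cutoff(cycle_days, settlements)
```

For these made-up values the rank correlation comes out at about 0.9, and the median settlement is higher for claims that took more than a year to close. A correlation of this kind describes association only; it does not show that longer cycle times cause larger settlements.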

When you are done, submit the following:

• A summary of your analysis in PDF format, outlined as would be suitable for a brief company presentation

• A copy of the source code and any supporting documentation

