Project Name: Time Series Forecast of Spending Habits
The overall objective of this project is to process user spending data to predict future spending habits. Submissions are required to be in Python version 3.7 (or later).
Our data consists of 90 days of regularly sampled user data, with the following fields for each sample:
• Anonymous User ID
• Total money spent by the User since the previous timestamp
Each user may have a different sample frequency, some users may have missing samples, and some spending values may be outliers relative to the user's usual habits. It is acceptable to interpolate across missing samples, and to ignore or adjust spending values considered erroneous or abnormal.
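As an illustration of the cleanup described above, the following is a minimal sketch, not a required method. It assumes one user's samples have already been placed on a regular grid with missing samples marked as None. The function names, the linear interpolation, and the median/MAD clipping rule with threshold k are all illustrative assumptions; other cleaning strategies are equally acceptable.

```python
from statistics import median


def fill_missing(values):
    """Linearly interpolate None entries; gaps at either end copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known samples")
    filled = list(values)
    # Leading and trailing gaps cannot be interpolated, so reuse the nearest sample.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the two surrounding known samples.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def clip_outliers(values, k=5.0):
    """Clip values lying more than k scaled MADs from the median (k is an assumed threshold)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread to measure against; leave the data unchanged
    limit = k * 1.4826 * mad  # 1.4826 scales the MAD to match a normal standard deviation
    return [min(max(v, med - limit), med + limit) for v in values]


cleaned = clip_outliers(fill_missing([10.0, None, 30.0, None]))
```

Clipping keeps an abnormal sample in the series at a bounded value rather than dropping it, which preserves the regular sampling grid.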
Algorithms should predict spending values for each user at the same frequency as that user's samples. Predictions should extend 30 days beyond the most recent sample used. Projects should start with the first 5 days of data to generate the first prediction, and then iteratively add one day of data at a time to generate or update each prediction. To summarize, we would like to see 30-day predictions based on 5 days of data, 6 days, 7 days, and so on, up to 60 days of data. The remaining 30 days are to be used for self-scoring and testing.
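The iteration described above could be organized as an expanding-window loop. The sketch below is illustrative only: it assumes a cleaned series with a fixed number of samples per day, and naive_forecast is a hypothetical placeholder (it repeats the historical mean) standing in for whatever model a submission actually uses.

```python
def naive_forecast(history, horizon):
    """Placeholder model (an assumption): repeat the mean of the history for every future step."""
    mean = sum(history) / len(history)
    return [mean] * horizon


def expanding_window_forecasts(series, samples_per_day=1, first_day=5, last_day=60,
                               horizon_days=30, model=naive_forecast):
    """Return {days_of_data: forecast} for each training length from first_day to last_day."""
    horizon = horizon_days * samples_per_day
    forecasts = {}
    for days in range(first_day, last_day + 1):
        # Train only on the first `days` days; later samples stay hidden from the model.
        history = series[: days * samples_per_day]
        forecasts[days] = model(history, horizon)
    return forecasts


# Synthetic 90-day series with one sample per day, used only to exercise the loop.
series = [float(day % 7) for day in range(90)]
forecasts = expanding_window_forecasts(series)
```

With the default settings this yields 56 forecasts (5 through 60 days of data), each covering the 30 days after its last training sample, so every forecast can be compared against data held within the 90-day set.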
We will score the accuracy of each algorithm based on how closely it predicts the cumulative amount spent in a validation dataset from the same users. The validation dataset will not be made available to the developers. Accuracy scoring will place greater importance on the near-term predictions and less importance on the far-term predictions. It is also desired that the algorithms quantify their own confidence for each predicted future data-point. Prediction algorithms will also be evaluated for their stability as they progress from using 5 days of data to 30 days of data.
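For self-scoring against the held-out 30 days, something like the following sketch may be useful. The official scoring formula is not given here, so everything below is an assumption: the geometric weighting with a decay parameter, the use of absolute error on cumulative spend, and the stability measure based on changes in the forecast total are illustrative choices only.

```python
from itertools import accumulate


def weighted_cumulative_error(predicted, actual, decay=0.95):
    """Weighted mean absolute error between cumulative-spend curves.

    Step h (0-based) gets weight decay**h, so near-term steps count more.
    The decay value is an assumed parameter, not an official one.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    cum_pred = list(accumulate(predicted))
    cum_true = list(accumulate(actual))
    weights = [decay ** h for h in range(len(predicted))]
    total = sum(w * abs(p - t) for w, p, t in zip(weights, cum_pred, cum_true))
    return total / sum(weights)


def forecast_stability(forecasts):
    """Mean absolute change in the forecast total between consecutive training lengths."""
    days = sorted(forecasts)
    totals = [sum(forecasts[d]) for d in days]
    changes = [abs(b - a) for a, b in zip(totals, totals[1:])]
    return sum(changes) / len(changes) if changes else 0.0


error = weighted_cumulative_error([2.0, 0.0], [1.0, 1.0], decay=0.5)
stability = forecast_stability({5: [1.0, 1.0], 6: [2.0, 2.0], 7: [2.0, 2.0]})
```

Lower values are better for both measures. Note that consecutive forecasts cover windows shifted by one day, so the stability figure is only a rough indicator of how much the predictions move as data is added.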