I need a probability estimator running as a microservice, with high availability and a response time measured in milliseconds (or less).
The training set has 150 million records and is unbalanced. All features are categorical, some of them with high cardinality. The model needs to be updated daily.
I'm thinking CatBoost + FastAPI + Docker / Kubernetes, but I'm open to suggestions.
The dataset will be imported from MySQL, but I can provide a sample CSV.
The project should include feature analysis, hyperparameter tuning, and assistance with installing the microservice on our servers, i.e. everything between providing the data and getting the service up and running.