This paper presents first steps toward robust models for crisis prediction. We conduct a horse race of conventional statistical methods and more recent machine learning methods as early-warning models. Because individual models in the literature are most often built in isolation from other methods, the exercise is highly relevant for assessing the relative performance of a wide variety of methods. Further, we test various ensemble approaches to aggregating the outputs of the built models, providing a more robust basis for measuring country-level vulnerabilities. Finally, we provide approaches to estimating model uncertainty in early-warning exercises, particularly model performance uncertainty and model output uncertainty. The approaches put forward in this paper are illustrated with an application to Europe. Generally, our results show that conventional statistical approaches are outperformed by more advanced machine learning methods, such as k-nearest neighbors and neural networks, and particularly by model aggregation through ensemble learning.