Flu prediction isn’t getting any easier as the season winds down (or fails to). The incidence profiles (technically, %ILI) are still showing relatively complex variations, at least in some of the regions.
Here we’ll focus on our two principal models. The first incorporates both specific humidity and schools. The second, based on our recent PLoS Computational Biology paper, allows R(t) to take two values, with the change in R(t) fit by the model. (We also ran a relatively large ensemble of other model types, such as ones incorporating priors, but we’ll discuss those results another time.)
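To make the idea of a two-valued R(t) concrete, here is a minimal sketch in Python. It is not the model from the paper: the simple SIR structure, the Euler time stepping, and every name and default value (`two_value_r`, `simulate_weekly_incidence`, `t_change`, `infectious_days`, the population size) are assumptions chosen only for illustration.

```python
import numpy as np


def two_value_r(t, r_a, r_b, t_change):
    """Piecewise-constant R(t): r_a before t_change (in days), r_b from then on."""
    return np.where(np.asarray(t) < t_change, r_a, r_b)


def simulate_weekly_incidence(r_a, r_b, t_change, n_weeks=30, pop=1e6,
                              i0=10.0, infectious_days=3.0, steps_per_day=10):
    """Weekly new infections from a basic SIR model with a two-valued R(t).

    All defaults are illustrative assumptions, not fitted values.
    """
    dt = 1.0 / steps_per_day          # Euler step, in days
    gamma = 1.0 / infectious_days     # recovery rate per day
    s, i = pop - i0, i0
    weekly = np.zeros(n_weeks)
    for week in range(n_weeks):
        for step in range(7 * steps_per_day):
            t = week * 7 + step * dt
            beta = float(two_value_r(t, r_a, r_b, t_change)) * gamma
            new_infections = beta * s * i / pop * dt
            s -= new_infections
            i += new_infections - gamma * i * dt
            weekly[week] += new_infections
    return weekly


# Same early epidemic, but R(t) steps up from 1.3 to 1.6 on day 70.
baseline = simulate_weekly_incidence(1.3, 1.3, t_change=70)
stepped = simulate_weekly_incidence(1.3, 1.6, t_change=70)
```

In a fit, `r_a`, `r_b`, and `t_change` would all be free parameters, which is why such a model can follow a sharp change in incidence and also why its extrapolation depends heavily on the last fitted value of R(t).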
Generally, both models do reasonably well in fitting the data up to the present point in time. Typically, when strong changes in specific humidity or school holidays occur at a crucial point in the profile, the SH+school model performs well, and better than the second model (which we’ll call the paper 2 model). For the same reason, if we believe that the historical variations in specific humidity are likely to hold for the remainder of this season, these regions can also be predicted well forward in time. In most cases, these profiles suggest that the season either has peaked or will peak during the next week or two.
The paper 2 model, on the other hand, is not constrained in the same way. The model is allowed to place variations in R(t) at points where large changes in incidence occur. But this can mean that future extrapolation (“prediction”) may not be well constrained. If we consider region 2, for example, the paper 2 model results suggest that incidence will continue to climb through week 26, which is very unlikely (although if it does, we will certainly learn something from that). In other cases, such as region 7, the paper 2 model does a markedly better job of capturing the variability in %ILI up through the present time. Thus, if we use region 7’s paper 2 results, we will forecast a continued increase for the next four weeks (the four-week CDC prediction interval is marked by the grey shaded box in each panel). Both because the AICc scores are significantly lower for the paper 2 model, and because it seems to have captured the strong increase over the last six weeks, this is the model that will be used to make our four-week prediction for region 7. We are aware, however, that the most recent data point, while falling close to the paper 2 curve, may be significantly revised next week, and we do not know in which direction.
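For readers unfamiliar with AICc, the comparison can be sketched as follows. We do not describe here how our scores are computed, so this sketch assumes the common least-squares form with Gaussian errors, AIC = n ln(RSS/n) + 2k, plus the small-sample correction 2k(k+1)/(n-k-1). The function name and the numbers in the usage lines are hypothetical, not values from our fits.

```python
import math


def aicc_least_squares(rss, n_points, n_params):
    """AICc for a least-squares fit, assuming Gaussian errors.

    rss      : residual sum of squares between model and data
    n_points : number of data points fitted (n)
    n_params : number of fitted parameters (k)
    """
    if rss <= 0:
        raise ValueError("rss must be positive")
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICc needs n_points > n_params + 1")
    aic = n_points * math.log(rss / n_points) + 2 * n_params
    correction = 2 * n_params * (n_params + 1) / (n_points - n_params - 1)
    return aic + correction


# Hypothetical numbers: the lower score is preferred, so the extra
# parameters only pay off if they reduce the residuals enough.
score_simple = aicc_least_squares(rss=4.0, n_points=20, n_params=4)
score_flexible = aicc_least_squares(rss=1.0, n_points=20, n_params=6)
preferred = "flexible" if score_flexible < score_simple else "simple"
```

A lower AICc only says that one model describes the data so far more efficiently. It says nothing about whether that model extrapolates sensibly, which is why the score is weighed together with the shape of the fitted curve.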
We are keeping records of all predictions as well as the data provided by the CDC, which is adjusted from week to week, so that we can perform a retrospective analysis of our effort after the season is over. By mining the model predictions this way, we should be able to better gauge our confidence in each model’s predictions, and how that confidence evolves through the course of the season.
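One simple form such a retrospective analysis could take is sketched below: archived forecasts are compared with the final (revised) %ILI values and the error is averaged by forecast horizon. The data layout, the function name, and the example values are all assumptions made for illustration, and weeks are assumed to be indexed so that they increase monotonically through the season.

```python
from collections import defaultdict


def mean_abs_error_by_horizon(forecasts, final_ili):
    """Mean absolute forecast error, grouped by weeks ahead.

    forecasts : {(issue_week, target_week): predicted %ILI}
    final_ili : {week: final revised %ILI}
    Forecasts whose target week has no final value yet are skipped.
    """
    errors = defaultdict(list)
    for (issue_week, target_week), predicted in forecasts.items():
        if target_week in final_ili:
            horizon = target_week - issue_week
            errors[horizon].append(abs(predicted - final_ili[target_week]))
    return {h: sum(v) / len(v) for h, v in sorted(errors.items())}


# Made-up values, purely to show the shape of the inputs.
example_forecasts = {(10, 11): 2.0, (10, 12): 3.0, (11, 12): 2.5}
example_final = {11: 2.5, 12: 2.0}
error_by_horizon = mean_abs_error_by_horizon(example_forecasts, example_final)
```

Running the same calculation separately for each model and region would show how quickly each model's error grows with the forecast horizon.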