We'll know your choice of K value once you update to the second generation of the teams' Elo ratings. Every week, maybe, or will it be daily? It looks like the mean rating is 1560ish and you use 400 in your probability calculation. The combination of 400 and such a tight range of initial values results in win probabilities that hover closer to 50%, thereby reducing accuracy early in the season, i.e., the logistic model is dampened too much to represent the variability in D1 & D2 men's volleyball outcomes. (Maybe OK for NFL football given its parity, but probably not in this landscape?) If keeping the 400, then I would suggest Morehouse closer to 1000 and Hawaii at about 2100, because I am nearly certain Hawaii would win closer to 499 out of 500 contests against Morehouse than the 490 out of 500 the model currently forecasts. Either that, or maybe reduce the 400 by 25% (to 300) in the p calculation. Over time, Elo will certainly find its way. My concern is that your K will either be too low, so that it takes too long to find its way, or too high, thereby getting the distribution "better quicker" at the price of a recency bias later in the season, when single-game adjustments will be too large. Even if you choose the K which is "juuuuuust right," as Goldilocks might put it, your model might not be as precise as you'd want come time for conference play. Just some observations from a guy who does a modified Elo to predict all the games, too. I wish I had found your schedule before last weekend!
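To make the numbers in this comment concrete, here is a minimal sketch of the standard logistic Elo win probability with the 400 exposed as a `scale` parameter. The function name and the 676-point gap are assumptions for illustration: 676 is simply the gap that yields roughly 490 wins in 500 at scale 400, not the site's actual ratings.

```python
def win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that team A beats team B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Assumed gap: about 676 points reproduces a 490-in-500 forecast at scale 400.
gap = 676.0
for scale in (400.0, 300.0):  # 300 is the "reduce the 400 by 25%" option
    p = win_probability(1500.0 + gap, 1500.0, scale)
    print(f"scale={scale:.0f}: p={p:.4f}, about {500 * p:.0f} wins in 500")

# The ratings suggested in the comment (1000 vs 2100), keeping scale 400.
p_suggested = win_probability(2100.0, 1000.0)
print(f"2100 vs 1000: p={p_suggested:.4f}, about {500 * p_suggested:.0f} wins in 500")
```

Under these assumptions, shrinking the scale and widening the initial spread both push the favorite's probability toward the 499-in-500 range; which lever fits the site's model better is a judgment call.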
One other observation: I would weight each conference's preseason coaches' poll no less than any single metric. My experience is that the "wisdom of the expert crowd" outperforms a single metric more often than not, even the cool one you created to measure the movement of individual talent. Merging both equally probably gets closer to the truth, so the model should approach its equilibrium much sooner. Very cool site.
Thanks for the great thoughts. Right now, K=30. I have played around with this a bit and have landed here for the time being (always open to adjusting it, though). Over the course of the season, I think it is somewhere in that Goldilocks zone: teams can move up and down, but without huge jumps from a single match. Because of general Elo inflation in the model, I would expect the distance between Hawai'i and Morehouse to be even larger at the end of the season. This is only Morehouse's second season, so their dataset is very limited.
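For a sense of what K=30 means per match, here is a sketch of the plain textbook Elo update. This is an assumption, not the site's code: the textbook update is zero-sum, whereas the reply mentions rating inflation, so the real model evidently does something more. Function names are illustrative.

```python
K = 30.0  # the value quoted in the reply


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected result for team A (1 = win, 0 = loss) on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update(winner: float, loser: float, k: float = K, scale: float = 400.0):
    """Return (new_winner, new_loser) after one match, textbook zero-sum version."""
    delta = k * (1.0 - expected_score(winner, loser, scale))
    return winner + delta, loser - delta


# Evenly matched teams: the winner gains K/2 points.
print(update(1560.0, 1560.0))

# Upset: a team rated 300 points lower wins and gains most of K.
print(update(1400.0, 1700.0))
```

The largest possible single-match swing is K itself, which is the trade-off under discussion: a larger K corrects poor initial ratings faster but lets one late-season result move a team further.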
I love the idea of adding in preseason polls. I know there is good evidence from other college sports of their usefulness in modeling. Right now, not every conference has a preseason poll (at least not one that I can find). I would have to figure out how to adjust for that, but it might be worth the effort.
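One possible way to merge a poll with an existing metric while tolerating conferences that publish no poll is sketched below. Everything here is hypothetical: the linear rank-to-rating mapping, the `spread` of 150 points, and the function names are illustrative choices, and the 1560 center is only the approximate mean mentioned earlier in the thread.

```python
from typing import Optional


def poll_rank_to_rating(rank: int, n_teams: int,
                        center: float = 1560.0, spread: float = 150.0) -> float:
    """Map a poll rank (1 = best) linearly onto [center - spread, center + spread]."""
    if n_teams < 2:
        return center
    frac = (rank - 1) / (n_teams - 1)  # 0.0 for first place, 1.0 for last
    return center + spread * (1.0 - 2.0 * frac)


def blended_prior(metric_rating: float, poll_rating: Optional[float] = None,
                  poll_weight: float = 0.5) -> float:
    """Blend the two sources; fall back to the metric alone when no poll exists."""
    if poll_rating is None:
        return metric_rating
    return poll_weight * poll_rating + (1.0 - poll_weight) * metric_rating


# Team picked first of eight in its conference poll, metric rating 1600.
print(blended_prior(1600.0, poll_rank_to_rating(1, 8)))

# Team in a conference with no poll: the metric rating passes through unchanged.
print(blended_prior(1600.0))
```

A fallback like this treats polled and unpolled conferences differently, so the poll-derived ratings would likely need calibrating across conferences before the blend is trusted.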
Also, I'd love to see your model if you are open to sharing it. Always looking to learn more.
This is so great, TJ! Thanks so much for all of this! The composite schedule is such a nice thing to have on tap. Thanks again!
No non-D1/D2 matches in the schedule? How do you factor those in?
Right now I'm only tracking matches that include at least one D-I/II team. When one of those teams plays a D-III or NAIA opponent, the model uses a static Elo rating for that opponent to calculate projections.
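As a rough illustration of the static-rating approach, the sketch below looks up a tracked team's current rating and substitutes a fixed value for any other opponent. The 1300 figure, the team names, and the function names are placeholders, not values from the site.

```python
from typing import Dict

TRACKED_DIVISIONS = {"D-I", "D-II"}
STATIC_RATING = 1300.0  # placeholder; the site's actual static value is not stated


def win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that team A beats team B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def opponent_rating(name: str, division: str, ratings: Dict[str, float]) -> float:
    """Use the tracked rating for D-I/II teams and a fixed rating for everyone else."""
    if division in TRACKED_DIVISIONS:
        return ratings[name]
    return STATIC_RATING


ratings = {"Team A": 1650.0, "Team B": 1500.0}  # made-up ratings

# Tracked opponent: both ratings come from the table.
print(win_probability(ratings["Team A"], opponent_rating("Team B", "D-II", ratings)))

# Untracked NAIA opponent: the static rating stands in for the missing one.
print(win_probability(ratings["Team A"], opponent_rating("Team C", "NAIA", ratings)))
```

Whether the tracked team's own rating should still move after such a match is a separate design choice that this sketch leaves open.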