I have previously mentioned that the variables in my Sectional Forecasting model put things in terms of a winning percentage, whether actual or expected. I have also mentioned that two of the variables rely on a comparison of enrollments between the team in question and its opponents or its opponents' opponents. They are all derived from a database of games played. What I have failed to mention is that there is more than one database. So far, the stats I have referenced have come from the Public vs. Public database because it is much larger and should be a little more stable. There are also databases for Public vs. Private and Private vs. Private.
Before I go any further with the other databases, there is one other thing I have failed to mention. The model does not take into account any games played against an opponent that resides outside of Section 2. The reason for this is pretty straightforward: I have no information on these schools and I really just don't have the time to try to find it. Maintaining all these databases is pretty labor intensive as it is, and we only have roughly 90 schools in Section 2. To do it for the whole state, and in some cases Vermont and New Jersey, would just be way too time consuming. So, just to be clear moving forward, any records I mention or stats I give are only for games against Section 2 schools unless otherwise stated.
One other bit to avoid confusion: when I refer to Private schools I am referring to both Parochial and Charter schools. That is because both are allowed to accept applicants and are either not restricted at all or only partially restricted by which school district those applicants come from. It's also cleaner to use one name instead of two. In either case, comparing their enrollments as if they were Public schools doesn't work, but when you compare them to each other you get more of an apples-to-apples comparison. And when you compare them as a group to the Public schools, you can get a bit of a read on what the relationship between their enrollments is.
First off, let's compare Private schools vs. Private schools. If the theory holds true that two schools who live by the same rules get better as their enrollment increases, we should see an effect similar to the one in the Public vs. Public database. In the largest Public vs. Public category, with an enrollment variance greater than 900, the larger school wins 85% of the time. In the largest category for Private vs. Private schools, a variance of greater than 275, the larger school wins 87.5% of the time. That is certainly comparable when you take into account the significantly smaller enrollments of the Private schools. The trend is comparable as well: the larger school's winning percentage goes from .441 when the variance is less than 81, to .632 between 81 and 274, and up to .875 over 275.
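For anyone who wants to see the mechanics, here is a minimal Python sketch of how a breakdown like this could be computed. It is an illustration, not the actual model code: the game tuple layout, the function name `larger_school_win_pct`, and the decision to put a variance of exactly 275 in the top bucket are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def larger_school_win_pct(games, edges):
    """Winning percentage of the larger school, grouped by enrollment variance.

    games: iterable of (enrollment_a, enrollment_b, a_won) tuples
           (a hypothetical layout, not the real database schema).
    edges: ascending bucket boundaries, e.g. [81, 275] gives three buckets:
           0 -> variance < 81, 1 -> 81..274, 2 -> 275 and up (assumed).
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for enr_a, enr_b, a_won in games:
        if enr_a == enr_b:
            continue  # no "larger" school to credit, so the game is skipped
        variance = abs(enr_a - enr_b)
        bucket = bisect_right(edges, variance)
        # The larger school won if A is larger and won, or B is larger and A lost.
        larger_won = a_won if enr_a > enr_b else not a_won
        totals[bucket] += 1
        wins[bucket] += int(larger_won)
    return {b: wins[b] / totals[b] for b in sorted(totals)}
```

Calling `larger_school_win_pct(games, [81, 275])` on the Private vs. Private games would return one winning percentage per bucket, which is the kind of table the numbers above come from.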
Now that we know the theory holds true, at least in Section 2 over the past three years, we can compare the Public schools to the Private schools. When I first started trying to figure this out, I had this thought that if I could come up with a multiple for Private schools, I could then compare Public and Private on the same level. Unfortunately, it failed miserably every time. There just isn't enough common ground between the Private schools to use one multiple. I also tried using several different multiples based on factors like the size of the city where the school is located, but it became too muddled and I ended up with schools needing to change class to make it work.
But if you separate out the Private schools and compare them as a group to the Public schools, you can use that information in the same way. You can take each game they play and get an expected winning percentage just as you do with the Public schools. So, how exactly do they compare? Well, not as cleanly as you might expect, but when you look more closely it does make sense. When a Public school plays a Private school and the variance in enrollment is less than 122, the larger school wins 54% of the games. When the variance is greater than 1899, the larger school wins 66%. Now here is where it gets interesting: between 122 and 1899, the smaller school wins 60%.
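To make the idea of a per-game expected winning percentage concrete, here is a rough Python sketch that turns the three Public vs. Private rates above into a lookup. The rates and cutoffs come straight from this post; the function names, the 50/50 treatment of equal enrollments, and the simple averaging across a schedule are assumptions for illustration only.

```python
# Probability that the LARGER school wins a Public vs. Private game,
# keyed by the exclusive upper bound of the enrollment variance bucket.
PUBLIC_PRIVATE_BUCKETS = [
    (122, 0.54),           # variance below 122: larger school wins 54%
    (1900, 0.40),          # 122 through 1899: smaller school wins 60%
    (float("inf"), 0.66),  # above 1899: larger school wins 66%
]


def expected_win_pct(team_enrollment, opponent_enrollment):
    """Expected winning percentage for one Public vs. Private game."""
    if team_enrollment == opponent_enrollment:
        return 0.5  # assumption: the post does not cover equal enrollments
    variance = abs(team_enrollment - opponent_enrollment)
    for upper_bound, p_larger in PUBLIC_PRIVATE_BUCKETS:
        if variance < upper_bound:
            break
    # Flip the probability when the team in question is the smaller school.
    return p_larger if team_enrollment > opponent_enrollment else 1 - p_larger


def expected_schedule_win_pct(team_enrollment, opponent_enrollments):
    """Assumed aggregation: the plain average of the per-game expectations."""
    per_game = [expected_win_pct(team_enrollment, opp) for opp in opponent_enrollments]
    return sum(per_game) / len(per_game)
```

For example, a school of 300 students facing a school of 1,000 falls in the middle bucket as the smaller school, so `expected_win_pct(300, 1000)` returns the smaller school's 60% rate.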
The driving force behind this is which teams are playing these games. For the most part, in the bookend categories where the larger school is winning, the Private school in question is one of the class D schools like Doane Stuart or Hawthorne Valley. Some of them have very small enrollments. Oftentimes these schools aren't focused on basketball but rather on academics (how dare they!), are located far away from the larger cities in the area (so they don't have a drawing card), and don't always field competitive teams.
The middle group, however, is mostly made up of the larger, bigger-city schools like CBA, Albany Academy or LaSalle. These types of schools are able to pull talent away from the larger Public schools, which accomplishes two things: it makes them better and their opponents worse. In this case, CBA is still expected to win a majority of its games, as one would think. The other schools seem to line up pretty well too. I anticipate that as I obtain more years of data I can probably break this group down a bit further, but at the moment there wouldn't be enough games to make it statistically viable.
Now you know how the model works and how the pieces fit together, so you can sit back, watch the scores, and let me handle the math. My next post will cover the other model I have designed, which is more of a predictive model that includes a different variable altogether. Beyond that, I hope to get a page up with all the teams and their enrollments for this season and their 5-year averages. Once I get full schedules from the schools, I hope to put up the hardest and easiest schedules in Section 2. And, of course, the games are almost here.