Friday, November 2, 2012

The Variables of Sectional Forecasting, part 1

"Let's start with something simple, like one and one ain't three." Jimmy Buffett had it right when he wrote those lyrics. If you're trying to figure out which variables matter most in predicting anything, you should probably start with something simple. Winning percentage is the quickest and easiest way to figure out who is good and who isn't. The Section 2 committee follows this number pretty closely in its seeding order. But just because it's quick and easy doesn't mean it's always right, and just because it's not always right doesn't mean it's not valuable in predicting sectional winners. That is why it is my first variable.

Over the past three seasons, 8 of the 15 teams with the best winning percentage in their class won sectionals. It's by no means perfect, but if a single variable picks the winner 53% of the time, I have a pretty good starting point for my statistical model. What I have tried to do with this model is use winning percentage as my basis and use other variables to help explain why the other 47% of teams with the best record don't win sectionals.

A lot of the theory behind this model is based on observations from years of watching high school basketball. Some of it comes from intuition and some from trial and error. One thing I had always assumed was that larger schools were incrementally better than smaller schools. If you have 1,000 students to choose from to form a basketball team, you will probably have better odds of finding talent than if you have 100 students. The NYSPHSAA apparently feels the same way, as it has split its schools into five classes based on enrollment. Large schools compete against large schools, small against small. The questions I always had were whether enrollment really did matter, and if so, how much, and how I could incorporate that into my formula.

Now I can answer those questions, and I can incorporate enrollment as my second variable. Over the past three seasons, the team with the greater enrollment has won 58% of the games. Over an 18-game schedule that only amounts to about 3 games over .500, but over an 82-game season like the NBA plays, that team would be 13 games over .500 and certainly in the playoffs. But even that doesn't tell the whole story. Teams with a 200-student enrollment advantage win 66% of the time. That is 6 games over .500 for a high school schedule, or about 26 games over .500 in the NBA, and that will have you in the hunt for a division title. This can have a tremendous effect on a team in Section 2. Broadalbin-Perth plays 8 conference games against teams with an enrollment advantage of more than 200 students. With only a 34% chance of winning each of those games, they are on average starting the season 3-5.
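For anyone who wants to check the arithmetic, here is a rough Python sketch. It only uses the rounded percentages quoted above (58% and 66%), so the results are approximate, and the function names are mine for illustration rather than part of the actual model.

```python
def games_over_500(win_pct, games):
    """Expected wins minus expected losses over a schedule."""
    wins = win_pct * games
    losses = games - wins
    return wins - losses


def expected_record(win_pct, games):
    """Expected record rounded to whole games, as (wins, losses)."""
    wins = round(win_pct * games)
    return wins, games - wins


# Rounded percentages quoted in the post.
LARGER_SCHOOL = 0.58
ADVANTAGE_200 = 0.66

for label, pct in [("larger school", LARGER_SCHOOL),
                   ("200-student advantage", ADVANTAGE_200)]:
    for games in (18, 82):
        over = games_over_500(pct, games)
        print(f"{label}, {games} games: {over:.1f} games over .500")

# A team on the wrong side of a 200-student gap wins about 34% of the time.
wins, losses = expected_record(1 - ADVANTAGE_200, 8)
print(f"Expected record in 8 such games: {wins}-{losses}")
```

"Games over .500" here means wins minus losses, which is why a 58% team over 18 games comes out near 3 rather than near 1.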

This is the basis for the second variable in the model. If Broadalbin-Perth wins 4 of those 8 games, they have performed better than the average team under those circumstances and have therefore increased their likelihood of winning sectionals. For every team, I will go through the full schedule to calculate an expected winning percentage and compare it to how the team actually performed. That difference, which I call the variance, is the second variable.
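Here is a minimal sketch of how that comparison could be computed. It is not the actual model: the enrollment numbers are made up, and the win probabilities are a crude assumption that reuses the two percentages from this post (66% with an advantage of 200 or more students, 58% with any smaller advantage, and the mirror image for the smaller school). The real calculation may bucket enrollment differences quite differently.

```python
def win_probability(own_enrollment, opp_enrollment):
    """Assumed chance of winning one game, based only on enrollment gap."""
    diff = own_enrollment - opp_enrollment
    if diff >= 200:
        return 0.66
    if diff > 0:
        return 0.58
    if diff == 0:
        return 0.50
    if diff > -200:
        return 0.42  # mirror of 58%
    return 0.34      # mirror of 66%


def schedule_variance(own_enrollment, opp_enrollments, actual_wins):
    """Actual winning percentage minus expected winning percentage."""
    games = len(opp_enrollments)
    expected_wins = sum(win_probability(own_enrollment, opp)
                        for opp in opp_enrollments)
    expected_pct = expected_wins / games
    actual_pct = actual_wins / games
    return actual_pct - expected_pct


# Hypothetical example: a 500-student school plays 8 games against
# 750-student schools and wins 4 of them.
variance = schedule_variance(500, [750] * 8, actual_wins=4)
print(f"Variance: {variance:+.2f}")
```

A positive number means the team beat what its schedule predicted, and a negative number means it fell short.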

In the following posts leading up to the start of the regular season, I'll be discussing the other two variables: the variance of your opponents' winning percentage compared to their expected winning percentage, and point differential. I'm also hoping to show how private schools fit into the equation. They create their own issues and need to be handled differently. Finally, I'll be discussing my prediction model and how it differs from the forecasting model. There are three weeks until the Thanksgiving tournaments, and hopefully I'll be able to get these three posts out before then.
