March Mathness is fast approaching, and it's time for me to try to beat the Rasch model. Last year gave me a hint: an interesting problem popped up because the University of South Carolina Gamecock Women were undefeated. The Rasch model does not like "perfect" or maximum scores, because a team that never loses has no finite maximum-likelihood estimate of its ability. It looks like Rasch may have another perfect team this year, the UCLA Bruins Women. To get around this, I will build a propensity score model. My plan has two stages.
First off, we will have to create the model. My preference is to start with very few but very strong predictor variables (e.g., win-loss record, points for and against) and grow the model until our predictions stop improving. In this case, our model will be a logistic regression equation that produces estimates just like the odds models used in radiobiology when something's gone wrong.
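For reference, here is the general shape of a logistic regression equation. The symbols are my own shorthand and not a fitted model: p is the probability that a team wins, the x's are the predictors (say, win-loss record and point margin), and the betas are the coefficients to be estimated.

```latex
\[
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}
\]
```

The left-hand form is the log-odds of winning, which is where the connection to odds models comes from. Growing the model just means adding more x terms for as long as the predictions keep improving.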
After building the logistic model on a random subset of the total data, I will score all the data with the equation. That score is called a propensity score, and it will allow us to compare each team's propensity to win against a given competitor. You may recognize this because it is very similar to what the Rasch model does. However, the Rasch model is best used on dichotomous (True/False) or small polytomous (i.e., points awarded at intervals) items and, more importantly, uses only one parameter. The logistic model can use mixed dichotomous and ratio data, which means many predictors.
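Here is a minimal sketch of the two stages in plain Python, assuming made-up data. Everything in it is hypothetical: the predictor names (win_pct_diff, margin_diff), the simulated games, the 70% training split, and the helper functions fit_logistic and propensity. A real run would use actual season data and most likely a statistics library instead of hand-rolled gradient descent.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit an intercept plus one weight per predictor by batch gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity(w, x):
    """Propensity score: the modeled probability of a win given predictors x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Simulated games (NOT real results). Each row describes a team relative to
# its opponent: difference in win percentage and in standardized point margin.
rng = random.Random(0)
games = []
for _ in range(400):
    win_pct_diff = rng.uniform(-1, 1)
    margin_diff = rng.gauss(0, 1)
    p_win = sigmoid(2.0 * win_pct_diff + 1.0 * margin_diff)
    won = 1 if rng.random() < p_win else 0
    games.append(((win_pct_diff, margin_diff), won))

# Stage 1: build the model on a random subset of the data.
train = rng.sample(games, k=int(0.7 * len(games)))
weights = fit_logistic([g[0] for g in train], [g[1] for g in train])

# Stage 2: score every game, training rows included, with the fitted equation.
scores = [propensity(weights, x) for x, _ in games]
```

Each entry in scores is a number between 0 and 1, so two matchups can be compared directly: the higher score marks the team the model thinks is more likely to win.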
Let's hope we can take down Rasch in basketball. Go logit! Ra, Ra, Ra!