Here's my problem with tracking a judge's performance: you would have to evaluate the judge against every table they have judged to see whether they are actually a statistical outlier. I doubt that anyone in KCBS could handle the stats for that. (You could use the I/O psychology program at MSU to do it.)
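To make the idea concrete, here's a minimal sketch of that kind of check: compare each judge's score at a table to the average of their tablemates, then average those deviations across every table the judge has worked. All data, names, and thresholds here are invented for illustration; a real analysis would need far more care (and far more data).

```python
# Hypothetical sketch: flag judges whose scores consistently deviate from
# their tablemates'. The records and judge/table IDs are made up.
from statistics import mean

# Each record: (judge_id, table_id, score) -- one score per judge per table.
scores = [
    ("J1", "T1", 8), ("J2", "T1", 9), ("J3", "T1", 8), ("J4", "T1", 9),
    ("J1", "T2", 6), ("J2", "T2", 9), ("J5", "T2", 8), ("J6", "T2", 9),
    ("J1", "T3", 5), ("J3", "T3", 8), ("J5", "T3", 9), ("J6", "T3", 8),
]

def deviations_by_judge(records):
    """For each judge, collect (own score - mean of tablemates' scores)."""
    by_table = {}
    for judge, table, score in records:
        by_table.setdefault(table, []).append((judge, score))
    devs = {}
    for table, entries in by_table.items():
        for judge, score in entries:
            others = [s for j, s in entries if j != judge]
            if others:
                devs.setdefault(judge, []).append(score - mean(others))
    return devs

devs = deviations_by_judge(scores)
for judge, ds in sorted(devs.items()):
    # A large negative or positive average deviation, sustained over many
    # tables, is what an outlier check would look for.
    print(judge, round(mean(ds), 2))
```

Note that even a judge with a big average deviation isn't necessarily wrong; as argued below, that can be exactly the "good variance" an accurate system needs. The point of the sketch is only that the arithmetic is simple; deciding what counts as a problem is not.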
The entire beauty of the current program is that you have "samples" of judges that resemble random assignment. Sure, you have some systematic variance and some silos, but it is better than many medical "random samples".
If they start a BBQ inquisition, they may rid the system of GOOD variance. Believe it or not, a LOT of these scores that are all over the place are accurate and would hold true if the same sample were presented to a larger audience. Isn't that the point: to give an accurate score?
I think you are probably ahead of the curve at this point.
The software will have the ability to collect the data necessary to do the analysis, and just about everybody has an opinion about what it will show. Having worked with statistics quite a bit, I've got no doubt that some trends will emerge quickly. But based on that experience, I'd suggest not making any decisions until a year of data has been captured and used to test various algorithms.
I don't know if anyone currently employed or serving on the board has the experience to do the analysis or consult, but I know of at least one member who contacted me while I served on the board who does. I don't see any hurdles that can't be negotiated.