Here's my problem with tracking a judge's performance. KCBS would have to evaluate the judge against every table that judge has sat at to see whether the judge is actually a statistical outlier. I doubt that anyone in KCBS could handle the stats for that. (You could use the I/O psychology program at MSU to do it.)
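For what it's worth, here is a rough Python sketch of the kind of check I mean: compare each judge to the other judges at every table they sat at, average that difference, and then see who sits far from the rest of the judging pool. The data layout, the function names, and the 2.0 cutoff are all my own assumptions for illustration, not anything KCBS actually uses.

```python
from collections import defaultdict
from statistics import mean, stdev


def judge_bias(scores):
    """Average gap between each judge and their tablemates.

    scores: list of (table_id, judge_id, score) tuples (hypothetical layout).
    Returns {judge_id: mean of (own score - mean of other judges at that table)}.
    """
    by_table = defaultdict(dict)
    for table, judge, score in scores:
        by_table[table][judge] = score

    deviations = defaultdict(list)
    for entries in by_table.values():
        if len(entries) < 2:
            continue  # nobody to compare against at this table
        for judge, score in entries.items():
            others = [s for j, s in entries.items() if j != judge]
            deviations[judge].append(score - mean(others))

    return {judge: mean(devs) for judge, devs in deviations.items()}


def flag_outliers(bias, z_cutoff=2.0):
    """Judges whose average gap is more than z_cutoff standard deviations
    from the pool average. The cutoff is an arbitrary assumption."""
    values = list(bias.values())
    if len(values) < 3:
        return []  # too few judges to say anything
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # everyone scores identically relative to tablemates
    return sorted(j for j, b in bias.items() if abs(b - mu) / sigma > z_cutoff)
```

Even this simple version needs a judge's scores from many tables before the average means anything, and a flagged judge is only unusual, not necessarily wrong.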
The entire beauty of the current program is that the "samples" of judges you get are close to random. Sure, you have some systematic variance and some silos, but it is better than the "random samples" you see in medical studies.
If they start a BBQ inquisition, they may rid the system of GOOD variance. Believe it or not, a LOT of these scores that are all over the place are accurate and would hold true if the same sample were provided to a larger audience. Is that not the point, to give an accurate score?