Is the KCBS BOD schizophrenic?


billygbob
10-16-2010, 09:23 PM
From the Quick Notes from the Board 10-13-10 on KCBS web site:

Rules Committee - Candy Weaver

"...due to a number of different factors, KCBS is evolving to a 3 or 4 number scoring system (i.e., 6-7-8-9), which leads to higher scores and more 180s..."

New Ideas Committee - Merl Whitebook

"...when it appears that a CBJ statistically is inconsistent in scoring (+/- 2 from the mean of the overall contest results,) at two or more contest, in a 12 month period, the CBJ will be mentored by the CBJ Chairperson. ... The motion was seconded by Ed Roith."

So here is one BOD member stating that judges don't use all the numbers and another BOD member motioning to effectively penalize members that do use all the numbers (seconded by the CBJ Committee - Ed Roith: No Report member).

Dollar-to-your-dime that judging will become a 7-8-9 score contest because you are penalized for actually giving the correct score. In nine years as a CBJ and six as a CTC I know for a fact that many judges do not give less than a 6 because "that's low enough they won't win" and too many won't give less than a 7 or 8 because "the cooks work hard and spend a lot of money". When I score a 3 or 4 for taste and tenderness on a burnt, dry piece of chicken and everyone else gives a 6 or 7 or 8, I'm the one who is considered the "outlier" and needs a talking to, or gets banned.

<RANT>
I think drbbq on The BBQ Forum has it nailed: "This sounds like my 7yo girls soccer games. Give everyone 9's and a ribbon... and a box of juice".

If the BOD wants to track something, check for the judges that never score out of the same two numbers. I guarantee that over five contests (120 entries give or take - 5 contests x 4 meats x 6 entries) that if that judge never got a piece of crap, or something he'd like to have more of, then you've identified the real problem.

The real issue is that some judges are not scoring "honestly", either out of best intent or attitude. That doesn't eliminate the fact every judge has a different perspective, or the fact every piece of meat is not identical; there will be differences in scores. But if every judge gave an honest score then much of the outcry could be addressed.
</RANT>
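The check suggested in the rant above (look for judges who never score outside the same two numbers) could be sketched roughly like this. It is only an illustration: the function name, the two-value cutoff as a parameter, and the made-up score lists are all assumptions, and it uses one score per entry for simplicity.

```python
def uses_narrow_range(scores, max_distinct=2):
    """Return True if all of a judge's scores fall on at most `max_distinct` values."""
    return len(set(scores)) <= max_distinct


# 5 contests x 4 meats x 6 entries, as estimated in the post above.
entries_per_judge = 5 * 4 * 6

# Hypothetical judges, invented purely for illustration.
narrow_judge = [8, 9] * (entries_per_judge // 2)            # only ever gives 8s and 9s
varied_judge = [9, 8, 7, 6, 8, 4] * (entries_per_judge // 6)  # uses more of the scale

print(entries_per_judge)                # 120
print(uses_narrow_range(narrow_judge))  # True
print(uses_narrow_range(varied_judge))  # False
```

A check like this only says a judge's numbers are bunched together; it cannot say whether the entries really were that uniform.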

ThomEmery
10-16-2010, 09:39 PM
If a CBJ consistently scores well outside the norm of his tables
Maybe "honest" is not what he is

Elegant Bear
10-16-2010, 09:53 PM
I am not a competitor. I took a CBJ class recently and the person who was supposed to cook for the class had a last minute personal emergency and replacement cooks were recruited from throughout the community. The replacements did not have to be competition cooks. Every piece of meat that I tasted that day, I would have sent back to the kitchen had it come from a restaurant. It was the worst BBQ I had tasted in my life. The class leader acknowledged the problem and said that it would be valuable to us to know what the bad stuff really is. The scores on each sample were in the full range just at my table. There were 9's and 2's.

If there is a scoring problem, it is in the training of the new judges because there are more than 70 CBJ's that came out of my class that have not yet seen, felt or tasted a 7, 8 or 9. Further, from the comments above, some judges have their own "sub criteria" for scoring that are not represented in the KCBS rules.

billygbob
10-16-2010, 09:59 PM
If a CBJ consistently scores well outside the norm of his tables
Maybe "honest" is not what he is

The primary point I'm trying to make is that judges not scoring honestly is the issue, and that it is not an uncommon occurrence; they skew the "norm". And I'll add that the proposed evaluation period of two contests is ludicrous from a statistical standpoint.
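For anyone who wants to see the motion's rule in concrete terms, here is a minimal sketch of one possible reading. The Quick Notes do not say how the comparison is calculated, so the details are assumptions: a judge is compared by average score against the contest-wide average, "+/- 2" is taken as "at least 2 points away", "12 month period" is taken as 365 days, and all names and numbers are hypothetical.

```python
from datetime import date
from statistics import mean

THRESHOLD = 2.0    # assumed reading of "+/- 2 from the mean"
WINDOW_DAYS = 365  # assumed reading of "a 12 month period"


def is_outlier(judge_scores, contest_scores, threshold=THRESHOLD):
    """True if the judge's average is at least `threshold` away from the contest average."""
    return abs(mean(judge_scores) - mean(contest_scores)) >= threshold


def needs_mentoring(flagged_dates, window_days=WINDOW_DAYS):
    """True if any two flagged contests fall within `window_days` of each other."""
    ordered = sorted(flagged_dates)
    return any(
        (later - earlier).days <= window_days
        for earlier, later in zip(ordered, ordered[1:])
    )


# Made-up numbers for illustration only.
contest_scores = [8, 8, 9, 7, 8, 9, 8, 7]  # contest-wide average is 8
low_judge = [5, 6, 6]                      # one bad rib and one bad chicken will do it
typical_judge = [7, 8, 8]

print(is_outlier(low_judge, contest_scores))      # True
print(is_outlier(typical_judge, contest_scores))  # False
print(needs_mentoring([date(2010, 10, 16), date(2011, 3, 5)]))  # True
print(needs_mentoring([date(2010, 10, 16)]))                    # False
```

Under this reading two flagged contests are enough to trigger mentoring, which is the small sample size objected to above.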

SaucyWench
10-16-2010, 10:23 PM
Thom, I judged today, and there were swings in the scoring. 4 of us got sauced fat in the brisket box, 2 got nice crispy chunks. One rib box had 5 very nice ribs, yet the judge sitting beside me got a rib with barely enough meat to taste. His score was significantly lower than the rest of ours. Theoretically, with the tracking that should have started today, if he scores anything lower than the rest of the table within the next 12 months, he will be subject to evaluation. Should this judge be penalized for getting the one bad piece of chicken in a box on top of a rib with no meat?

Bunny
10-16-2010, 11:06 PM
Thom, I judged today, and there were swings in the scoring. 4 of us got sauced fat in the brisket box, 2 got nice crispy chunks. One rib box had 5 very nice ribs, yet the judge sitting beside me got a rib with barely enough meat to taste. His score was significantly lower than the rest of ours. Theoretically, with the tracking that should have started today, if he scores anything lower than the rest of the table within the next 12 months, he will be subject to evaluation. Should this judge be penalized for getting the one bad piece of chicken in a box on top of a rib with no meat?

And that happens a lot more than you know. If a judge gets a really bad rib compared to the others, why are they out of the norm? They are talked to if they are out of the same realm as the other judges, but if they have a legitimate reason, therein lies a true score. We are not out there to sway judges' opinions, just making sure they understand our procedures.

Skinny Cook
10-16-2010, 11:19 PM
I see judge number ident theft coming? :crazy:

I judged the same contest as SaucyWench. I too had one rib that was not fit to feed to a dog. Couldn't have cut it off the bone with a razor blade. They got a low score compared to the rest of the box. The ribs in the box obviously came from several different slabs as the other judges at my table didn't have a like rib. The others were also a different color.
Sure not my inconsistency as a judge that caused that. :hand:

ThomEmery
10-16-2010, 11:20 PM
Judges will not be penalized
Education is not a penalty

The Rib example given would be noted by the Rep
If this same CBJ were given a bad entry at another contest
it would be noted once more
There are times when one judge will get a substandard piece
and perhaps it will be discovered that 2 times outside table norm
is too frequent a trigger for the action step

SaucyWench
10-16-2010, 11:31 PM
The problem I see with computerized judge tracking (which by the way, failed this week due to programming issues) is, how do we prove that a judge has been asked why a score, be it low or high in comparison with others at the table, is not within the norm? If a rep says it's a valid reason, will that override the tracking program? Since computers cannot know our reasons for scoring as we do, I fully expect to get a talking to next year, because I will not give a 6, 7 or 8 to something that deserves a 5 or less.

Spydermike72
10-17-2010, 07:13 AM
The problem I see with computerized judge tracking (which by the way, failed this week due to programming issues) is, how do we prove that a judge has been asked why a score, be it low or high in comparison with others at the table, is not within the norm? If a rep says it's a valid reason, will that override the tracking program? Since computers cannot know our reasons for scoring as we do, I fully expect to get a talking to next year, because I will not give a 6, 7 or 8 to something that deserves a 5 or less.

I would hope the program would have a way to note a Rep's comments. I don't know that for sure but I am taking a wild guess here...

Smoke'n Ice
10-17-2010, 07:47 AM
At a contest this weekend the following scores on ribs were posted for the top 3 finishers:

1 34.8572 23.4286 30.2856 36.0000 35.4286 34.8572
2 34.8572 24.0000 33.7142 30.2856 33.7142 36.0000
3 35.8572 25.1428 35.4286 32.0000 29.7144 32.0000

I have looked at all of the results and suspect I can pick out the additional entries judged by these judges based on the score pattern. The question is, "Which judge should be re-educated?" Judge 1 for consistently having no variation, or judge 2 for consistently being well below the rest of the table? Or did judge 2 perchance get the one bad rib in each of the boxes he judged? It might be that he needs new glasses :confused:

billygbob
10-17-2010, 08:18 AM
Judges will not be penalized
Education is not a penalty

Further in the same motion "...Should the problems continue then the matter shall be brought to the Board for removal or further action by the board. ..."

Removal or "further action" is not a penalty?

ThomEmery
10-17-2010, 08:37 AM
Implementing the nuts and bolts of this will take time
Yes there are more questions that need to be answered here
I can only comment on the concept
Maybe we can get a current BoD member to address these points

Thanks for taking the time

BoD action to remove a CBJ is a last resort, I would hope

SaucyWench
10-17-2010, 08:52 AM
At a contest this weekend the following scores on ribs were posted for the top 3 finishers:

1 34.8572 23.4286 30.2856 36.0000 35.4286 34.8572
2 34.8572 24.0000 33.7142 30.2856 33.7142 36.0000
3 35.8572 25.1428 35.4286 32.0000 29.7144 32.0000

I have looked at all of the results and suspect I can pick out the additional entries judged by these judges based on the score pattern. The question is, "Which judge should be re-educated?" Judge 1 for consistently having no variation, or judge 2 for consistently being well below the rest of the table? Or did judge 2 perchance get the one bad rib in each of the boxes he judged? It might be that he needs new glasses :confused:

There's that dang judge #2 again!!! :-D

Seriously though, there is little chance that the top 3 were at the same table, and according to the "2 points +/- from the mean" plan, judge #2 is not the only one out of line; some of the high scorers are too.

Smoke'n Ice
10-17-2010, 09:06 AM
Seriously though, there is little chance that the top 3 were at the same table

I would not be willing to bet on this :tape:

Lake Dogs
10-17-2010, 09:45 AM
Having been a judge (CBJ) for 6 years at 70+- comps, and a competitor for 3 years
or so, and having experience with other sanctioned cooking events (chili for example),
I think I've seen it all a time or two.

Scoring systems with a wide range (1-9, 1-10, etc) allow the possibility of a wide
numerical variance (mathematics for a deep pile of feces if not monitored and
governed closely). Numerical variance, in and of itself, isn't a bad thing; it allows for
finer delineation between one that's pretty good and another that's perhaps very
good. It also allows (as described above) for variance in score between the piece of bbq
that one judge gets and the piece another judge gets.

HOWEVER, it also allows everything else. These "everything else" issues come to
light more clearly when watching judges scoring on items that are in the same pot
(ie not a different piece of meat, but part of the same), like in chili judging. I've
seen it all. Even with really good instruction, some judges still think (and I've heard
this said more than one time, directly out of the mouth of judges) "if they're in
a competition they must be great [or trying to prove something]", so their average
score is *average*-great; resulting in a truly good product ending up with very low
scores. It's like they're judging this entry in front of them against every entry
they've ever judged or eaten at a restaurant or made at home. Out of the same
bowls of chili, I've literally seen 1's to 9's. With better guidance I've seen 4-9's.
I'm not saying which is the correct score (the 4's, or the 9's). I am saying that the
product either sucked or it was pretty darned good, but both? No.

There is a very good argument that KCBS, by throwing out the lowest score, takes
care of this. It's a good argument, but honestly perhaps the lowest score was the
more correct score. I don't know. I'll also tell you that IF it were the result of a
rogue, uneducated, whatever judge, there are many tables that have 2 or 3 of
such folks (entry variance [as described wonderfully above] notwithstanding).

By removing or reducing numerical variance you virtually eliminate the problem. To
the folks against it, I highly suggest working with a system with a very small numerical
variance for a while and seeing first hand whether it works or not. I'll tell you, as
a MIM/MBN judge, I cannot think of a time where the best BBQ didn't win. Not one
single time. I ask that you ask the teams and judges who either compete in or judge
both systems for their opinions rather than making assumptions about the
other without any experience.

By the way, I'm not anti-KCBS. I found out yesterday, for example, that I've pretty
much convinced an organizer to change her competition from an MBN format to a
KCBS format. I love judging and competing in both. It doesn't mean that any of them
are perfect or without opportunity for improvement.

watertowerbbq
10-17-2010, 09:58 AM
I see judge number ident theft coming? :crazy:

I judged the same contest as SaucyWench. I too had one rib that was not fit to feed to a dog. Couldn't have cut it off the bone with a razor blade. They got a low score compared to the rest of the box. The ribs in the box obviously came from several different slabs as the other judges at my table didn't have a like rib. The others were also a different color.
Sure not my inconsistency as a judge that caused that. :hand:

Just for clarification, did you judge down because it was tough or did you judge down because it was a different color or from a different slab?

dmprantz
10-17-2010, 10:03 AM
Just a thought from little ole me: How often are the 2-3-4-5 scores used, and do they really make much of a difference among them? From what I've seen, scores start out at about 7 for all categories and can go up one or two places for good and great. They can also go down for "not good" and "Raw / inedible." Does it really matter whether you get a 2 or a 5? Both of them are lousy, with the 1 of course reserved for DQ. I for one would appreciate removal of any scores above about five total values plus special values (DQ and raw?). That's just me though.

This reminds me of one of my pet peeves about grading, which is very similar: In college there is no realistic difference between a D and an F. Sure a D is passing and an F not, but D's don't count toward a degree or qualify as a pre-req, so why even give the grade? I don't think a 1.0 GPA even qualifies for graduation at most schools. Some schools don't even allow you to graduate with anything below a 3.0. Why stick with labels like "average" when average isn't good enough to graduate? The grades should be relabeled and D dropped totally. Sure this is partially a rant, but it's also related: Extra grades or scores can really confuse and complicate matters, and I don't think "because they've always been there" is a good enough reason to keep them.

dmp

QN
10-17-2010, 10:08 AM
I was a table captain this weekend. We had a rib entry that scored this;
9-7-7
9-7-7
9-9-9
9-9-9
8-8-8
8-8-8
Do you think that is too much variance between the judges?

By the way, there were 12 ribs in the box so I did get to try one. Obviously, with that many ribs they did not come from the same slab. The one I had would have been 9-9-9 if I had been judging, but apparently I got one of the good ones...

Lake Dogs
10-17-2010, 10:31 AM
I was a table captain this weekend. We had a rib entry that scored this;
9-7-7
9-7-7
9-9-9
9-9-9
8-8-8
8-8-8
Do you think that is too much variance between the judges?

By the way, there were 12 ribs in the box so I did get to try one. Obviously, with that many ribs they did not come from the same slab. The one I had would have been 9-9-9 if I had been judging, but apparently I got one of the good ones...

No, pretty much dead on. They all pretty much agree that
the appearance was appealing and appetizing (perhaps more
appealing to a few, which is normal and expected). They had
a slight variance in tenderness and taste, which is easily explained
by different ribs on different slabs, different cuts, but also allows
for different taste preferences. This is how it should look. It's
when you get something like the scores below that work is needed:

977
678
999
688
989
788
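One rough way to put a number on the difference between the two tables above is to look at the spread (highest minus lowest) in each column. This is just a sketch: the function name is made up, and it assumes each row is one judge's card with appearance as the first number, which the thread does not state outright.

```python
def spread_by_criterion(cards):
    """Return max minus min for each scoring column across all judges' cards."""
    return [max(column) - min(column) for column in zip(*cards)]


# The table QN posted.
qn_table = [(9, 7, 7), (9, 7, 7), (9, 9, 9), (9, 9, 9), (8, 8, 8), (8, 8, 8)]

# The "needs work" example from the post above.
needs_work_table = [(9, 7, 7), (6, 7, 8), (9, 9, 9), (6, 8, 8), (9, 8, 9), (7, 8, 8)]

print(spread_by_criterion(qn_table))          # [1, 2, 2]
print(spread_by_criterion(needs_work_table))  # [3, 2, 2]
```

The second and third columns spread the same in both tables. The first column is where they differ: a 3 point swing on a score where, if it is appearance, every judge was looking at the same box.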

Skinny Cook
10-17-2010, 10:36 AM
Just for clarification, did you judge down because it was tough or did you judge down because it was a different color or from a different slab?

I judged down due to its toughness. My Choc Lab would have turned down that rib.

Skinny Cook
10-17-2010, 10:41 AM
Let it be said that personally I am against this action by the BOD. IF implemented it will most likely be the end of me as a KCBS judge, organizer, and competing team.

Jorge
10-17-2010, 10:48 AM
Perhaps KCBS could use the tracking software to gather data for a contest year and then analyze that to see if they are able to gather usable data and determine what, if any, trends are present before moving forward.

My day job has me writing computer code, and some of it has involved some fairly advanced statistical analysis. Advanced enough that I had some regular conference calls with folks at MIT and Stanford. I'd be VERY surprised if some interesting trends didn't emerge.

Interesting discussion.

Buster Dog BBQ
10-17-2010, 11:04 AM
Perhaps KCBS could use the tracking software to gather data for a contest year and then analyze that to see if they are able to gather usable data and determine what, if any, trends are present before moving forward.

My day job has me writing computer code, and some of it has involved some fairly advanced statistical analysis. Advanced enough that I had some regular conference calls with folks at MIT and Stanford. I'd be VERY surprised if some interesting trends didn't emerge.

Interesting discussion.

I think you hit the nail on the head. As expensive as it may be, you enter your judge number, category and the three scores for taste, tenderness and appearance. I am sure there are some judges out there that score the same no matter what.

I also understand that you can go from a 9 to a 5 in a box if a piece of meat is dry. But on appearance? We have seen a 3 point difference where only one judge gave a 6. They all looked at the same box. But if there is a big difference, like more than 3 points, that is where a mandatory comment card should come in play.

Someone mentioned not much variance in scores. I would guess with the popularity of BBQ comp classes this may hold true at some contests in certain regions. With all the Pellet Envy, ISS, Myron Mixon classes and others, the flavor profiles continue to drift to the same rubs, sauces, etc.

Jorge
10-17-2010, 01:40 PM
I think you hit the nail on the head. As expensive as it may be, you enter your judge number, category and the three scores for taste, tenderness and appearance. I am sure there are some judges out there that score the same no matter what.

I also understand that you can go from a 9 to a 5 in a box if a piece of meat is dry. But on appearance? We have seen a 3 point difference where only one judge gave a 6. They all looked at the same box. But if there is a big difference, like more than 3 points, that is where a mandatory comment card should come in play.

Someone mentioned not much variance in scores. I would guess with the popularity of BBQ comp classes this may hold true at some contests in certain regions. With all the Pellet Envy, ISS, Myron Mixon classes and others, the flavor profiles continue to drift to the same rubs, sauces, etc.

I don't know that it would be terribly expensive unless some serious analysis was desired. KCBS has an RFP out now, unless I'm mistaken, for new software.

butt head
10-17-2010, 01:57 PM
And that happens a lot more than you know. If a judge gets a really bad rib compared to the others, why are they out of the norm? They are talked to if they are out of the same realm as the other judges, but if they have a legitimate reason, therein lies a true score. We are not out there to sway judges' opinions, just making sure they understand our procedures.

How can they do that if you keep trying to change the procedures?

tmcmaster
10-17-2010, 02:01 PM
This strikes me as a bit loony. Do they want HONEST scores, or, like the NFL, do they want more high scores? Can't be both.

LGHT
10-17-2010, 02:13 PM
If there is a scoring problem, it is in the training of the new judges because there are more than 70 CBJ's that came out of my class that have not yet seen, felt or tasted a 7, 8 or 9. Further, from the comments above, some judges have their own "sub criteria" for scoring that are not represented in the KCBS rules.

I'm pretty sure I was at the same training class and I can say that was problem #1. Due to the HUGE number of people who took the class it just seemed like a CBJ factory that was more concerned with getting as many people in and out and certified than with really training them to be the best possible judges they can be. I can say the Q was pretty off, but if you have never cooked or tasted comp Q then it was pretty good. I mean even as bad as it was it was just as good as most restaurants so of course it still got 6's, 7's and 8's as a result. I don't think the judges, the scores, or the rules are the problem at all. I think if you take a good hard look at the certification process, limit the number of people per class and truly focus on "training" new judges correctly, all the future problems will be eliminated. However if you just give them a quick 2 hour process and teach them how to fill out the paper then of course judges will be off as they are actually learning how to judge DURING a real competition.

Lake Dogs
10-17-2010, 03:54 PM
I'm pretty sure I was at the same training class and I can say that was problem #1. Due to the HUGE number of people who took the class it just seemed like a CBJ factory that was more concerned with getting as many people in and out and certified than with really training them to be the best possible judges they can be. I can say the Q was pretty off, but if you have never cooked or tasted comp Q then it was pretty good. I mean even as bad as it was it was just as good as most restaurants so of course it still got 6's, 7's and 8's as a result. I don't think the judges, the scores, or the rules are the problem at all. I think if you take a good hard look at the certification process, limit the number of people per class and truly focus on "training" new judges correctly, all the future problems will be eliminated. However if you just give them a quick 2 hour process and teach them how to fill out the paper then of course judges will be off as they are actually learning how to judge DURING a real competition.

You've hit the core of the problem (where we've been discussing the
result of the problem). Contrast that to other sanctioning bodies who
have all day training, tests, and then once a judge passes they don't
become a CBJ until they've successfully judged two sanctioned contests.
I do think KCBS is a bit of a victim of their own success/growth, but that
doesn't make the problem any smaller or excuse it.

drbbq
10-17-2010, 04:14 PM
The problem I see with computerized judge tracking (which by the way, failed this week due to programming issues) is, how do we prove that a judge has been asked why a score, be it low or high in comparison with others at the table, is not within the norm? If a rep says it's a valid reason, will that override the tracking program? Since computers cannot know our reasons for scoring as we do, I fully expect to get a talking to next year, because I will not give a 6, 7 or 8 to something that deserves a 5 or less.

I think it's really screwed up to have a system of 2-9, but question anyone who dares use the full scale.

Podge
10-17-2010, 04:18 PM
I tell y'all what I'd like to do, is cook for a small CBJ class (less than 20 perhaps). I'd do my damnedest to produce what I would for a contest, and see how the new judges would take to it. It'd be really interesting!

LindaM
10-17-2010, 04:32 PM
I'm pretty sure I was at the same training class and I can say that was problem #1. Due to the HUGE number of people who took the class it just seemed like a CBJ factory that was more concerned with getting as many people in and out and certified than with really training them to be the best possible judges they can be. I can say the Q was pretty off, but if you have never cooked or tasted comp Q then it was pretty good. I mean even as bad as it was it was just as good as most restaurants so of course it still got 6's, 7's and 8's as a result. I don't think the judges, the scores, or the rules are the problem at all. I think if you take a good hard look at the certification process, limit the number of people per class and truly focus on "training" new judges correctly, all the future problems will be eliminated. However if you just give them a quick 2 hour process and teach them how to fill out the paper then of course judges will be off as they are actually learning how to judge DURING a real competition.

Which proves my point that if a class is taught on a Friday night and those judges judge on Sat at a competition the scores are higher than the norm because those judges NEVER had competition BBQ which is by far better than any BBQ you can get in a restaurant. EVEN MINE....:)

drbbq
10-17-2010, 06:33 PM
The CBJ program is about 15 years old and about 50,000 prospects have taken the class. Not one of them has failed.
How real can this be? It's a fun program and promotes BBQ and KCBS, but if everyone passes what value does it really have?

I'm pretty sure I was at the same training class and I can say that was problem #1. Due to the HUGE number of people who took the class it just seemed like a CBJ factory that was more concerned with getting as many people in and out and certified than with really training them to be the best possible judges they can be. I can say the Q was pretty off, but if you have never cooked or tasted comp Q then it was pretty good. I mean even as bad as it was it was just as good as most restaurants so of course it still got 6's, 7's and 8's as a result. I don't think the judges, the scores, or the rules are the problem at all. I think if you take a good hard look at the certification process, limit the number of people per class and truly focus on "training" new judges correctly, all the future problems will be eliminated. However if you just give them a quick 2 hour process and teach them how to fill out the paper then of course judges will be off as they are actually learning how to judge DURING a real competition.

yelonutz
10-17-2010, 07:37 PM
The CBJ program is about 15 years old and about 50,000 prospects have taken the class. Not one of them has failed.
How real can this be? It's a fun program and promotes BBQ and KCBS, but if everyone passes what value does it really have?

I learned about greens, Kale, Red Leaf Lettuce, Cilantro...:thumb:

NUTZ

Skinny Cook
10-17-2010, 07:38 PM
They can just develop a Telefunkin U 49 BBQ tester. Have the Rep pull it onsite, plug it in, then just feed in each sample and the computer can spit out the results. No human error needed. :thumb:

SaucyWench
10-17-2010, 08:28 PM
The CBJ program is about 15 years old and about 50,000 prospects have taken the class. Not one of them has failed.
How real can this be? It's a fun program and promotes BBQ and KCBS, but if everyone passes what value does it really have?

I audited a CBJ class last year, and pretty much nothing had changed from my class 8 years before, except that in 2001 we were instructed to start with the assumption that all entries would be 9s, and score down as need be. That resulted in a boatload of 180s, and changes in how to score, first with the 6 as average and up or down from there, and now the definitions of the numbers. Unfortunately, while there are tips in the CD, there is no definition for the definitions: just what is excellent/average/bad competition bbq anyway? We get tips in the judges' CD, but it is still up to us to decide what those words mean and score accordingly, even writing a 3 or 4 if it is deserved, no matter if the rest of the table is filled with judges who state openly that they will never give a score lower than a 7. (Thankfully, I've never had a 2, inedible.)

I don't see how suspending a judge until he can be "re-educated" in a class that no one fails is going to help this situation.

Scottie
10-17-2010, 08:37 PM
I tell y'all what I'd like to do, is cook for a small CBJ class (less than 20 perhaps). I'd do my damnedest to produce what I would for a contest, and see how the new judges would take to it. It'd be really interesting!

I'd do it with you. But can I cook chicken? I sure don't get any feedback from the current crop of CBJ's of what sucks with it now....

INmitch
10-17-2010, 08:55 PM
I've come to accept the difference in each judge's opinion & taste. I understand that on chicken no 2 pieces are the same (even if I spend 3 hrs trying to get them identical). I put 8 ribs in the box, 4 from each of 2 racks (that are not going to be identical). A pork box full of chunks from 2 butts & different parts of the butt. I can see a 2-3 point fluctuation in taste & tenderness. BUT.........when I turn in my 7-8 slices of brisket and see scores from 5's to 9's I have a problem with that 'cause I know they are almost identical. :boxing: Just my 2C

Slamdunkpro
10-17-2010, 08:56 PM
I'd do it with you. But can I cook chicken? I sure don't get any feedback from the current crop of CBJ's of what sucks with it now....
Heh, I cooked for one of Linda Mullane's CBJ classes and didn't hit my timing very well so the chicken was late going in. We were literally dumping chicken breasts on sheet pans, dousing them with whatever rub was at hand and tossing the whole pan full into the cooker which was set somewhere around "inferno". One of the more experienced cooks helping out with the class came out after the chicken went into the class and asked "how did you cook the chicken? That was some of the best chicken I ever had"

Go figure:noidea::noidea::-D

INmitch
10-17-2010, 08:59 PM
:-D ^^^^^^Roll with it.:-D

SaucyWench
10-17-2010, 09:08 PM
I've come to accept the difference in each judge's opinion & taste. I understand that on chicken no 2 pieces are the same (even if I spend 3 hrs trying to get them identical). I put 8 ribs in the box, 4 from each of 2 racks (that are not going to be identical). A pork box full of chunks from 2 butts & different parts of the butt. I can see a 2-3 point fluctuation in taste & tenderness. BUT.........when I turn in my 7-8 slices of brisket and see scores from 5's to 9's I have a problem with that 'cause I know they are almost identical. :boxing: Just my 2C

I admit, I don't understand that either. I love good brisket, can't stand "mom's pot roast" brisket. So...yesterday, we had a brisket that I thought was quite tasty, while another judge deemed it pot roast. All I could say is if my mom had ever made pot roast that tasted as good as that brisket, I would love pot roast to this day!

Divemaster
10-18-2010, 09:51 AM
I guess I don't know what the answer is... To be honest, I'm not sure if it is the fact that we have a large judging scale or that some people can't write...

I for one would like to have met the judge that gave us a 3 for appearance when all but one of the other judges gave us 9's and the remaining judge gave us an 8.

I've had the discussion on mandatory comment cards for any score 5 and down with others including Mike Lake and the response was always, "Then you would never have a score below a 6." Maybe the answer is to require comment cards for all scores.

LGHT
10-18-2010, 11:28 AM
You've hit the core of the problem (where we've been discussing the
result of the problem). Contrast that to other sanctioning bodies who
have all day training, tests, and then once a judge passes they don't
become a CBJ until they've successfully judged two sanctioned contests.
I do think KCBS is a bit of a victim of their own success/growth, but that
doesn't make the problem any smaller or excuse it.

I don't know if you need to spend all day, but I do think they should have backyard Q and Comp Q for tasting and actually review EACH score submitted and have the judges clarify exactly WHY they gave the score they did, good or bad. The table captains at the training classes should be CBJ's and not just volunteers passing out boxes and reading numbers. If they had a CBJ at every table they could take ownership of the table and say ok here is what you should look for when you judge appearance, then move on to taste, and move on to tenderness. Taste is and always will be subjective, but when a guy gets 8-9's on appearance and 1 judge gives that same box a 3 then something is wrong. Also as mentioned, 1 guy got a comment card saying his brisket was over cooked and another saying it was under cooked. I thought spending 30+ minutes on what greens were acceptable was just a waste of time. I think it should be up to the table captain to determine if a box is acceptable or not. Then you could take those 30 minutes and teach the judges about how chicken skin should be cooked properly, what texture a perfectly cooked rib should have and how to do a bend and pull test on brisket. Then as suggested have the first 2 comps that a new CBJ judges be reviewed by the KCBS rep. If in either of the first 2 contests they don't judge consistently then don't certify them and have them RETAKE the class.

I don't want to go on and on, but I just find it odd that the most important part of being a judge, "training", gets the least amount of attention.

The Pickled Pig
10-18-2010, 04:05 PM
New Ideas Committee - Merl Whitebook

"...when it appears that a CBJ statistically is inconsistent in scoring (+/- 2 from the mean of the overall contest results,) at two or more contest, in a 12 month period, the CBJ will be mentored by the CBJ Chairperson. ... The motion was seconded by Ed Roith."


I like the idea and think it's overdue. What's wrong with monitoring and mentoring judges? Judging is an area that I think has nearly unanimous support for improvement among cooks and judges alike. We don't necessarily agree on how to improve but most of us agree that improvement would be favorable.

The organization should track judges' scores versus their peers and identify those who are consistent outliers. As has been mentioned there are valid reasons for being an outlier on occasion, but a rigorous tracking system should easily distinguish between someone who occasionally scores lower than the rest of the table and someone who consistently scores lower than the rest of their table. I would think that judges who take their craft seriously would want this kind of feedback.

I also think it's a mistake to assume that this type of scoring review would contribute to or lead to grade inflation. If that's going on, it's a separate issue but one that will be easier to fix with judges that score consistently on the same scale.
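
For what it's worth, the "occasional vs. consistent outlier" distinction could be sketched roughly like this. Everything in it is assumed for illustration: the record layout, the judge IDs, the function name consistent_outliers, and the reading of the motion as "average deviation from the peers' mean of 2 or more points, at two or more contests". All records are assumed to fall inside one 12 month window. It is a sketch, not how any KCBS software works.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (contest, judge_id, judge's score, mean of the other
# judges' scores for the same entry). None of this is real contest data.
RECORDS = [
    ("A", "J1", 4, 8.0), ("A", "J1", 5, 7.5),
    ("B", "J1", 5, 8.0), ("B", "J1", 6, 8.0),
    ("A", "J2", 4, 8.0), ("A", "J2", 8, 8.0),
    ("B", "J2", 8, 7.5), ("B", "J2", 7, 7.5),
    ("A", "J3", 8, 8.0), ("B", "J3", 9, 8.5),
]

def consistent_outliers(records, threshold=2.0, min_contests=2):
    """Return judge ids whose average deviation from their peers is at least
    `threshold` points at `min_contests` or more contests."""
    deviations = defaultdict(list)  # (judge, contest) -> signed deviations
    for contest, judge, score, peer_mean in records:
        deviations[(judge, contest)].append(score - peer_mean)

    flagged = defaultdict(int)  # judge -> number of contests over the threshold
    for (judge, _contest), devs in deviations.items():
        if abs(mean(devs)) >= threshold:
            flagged[judge] += 1

    return sorted(j for j, n in flagged.items() if n >= min_contests)

print(consistent_outliers(RECORDS))                  # ['J1']
print(consistent_outliers(RECORDS, min_contests=1))  # ['J1', 'J2']
```

In this made-up data J2 is off at one contest only and is left alone under the two-contest rule, while J1 is low at both and gets flagged. Note that the rule as sketched says nothing about who is right, only about who is different.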

Scottie
10-18-2010, 04:31 PM
heck Paul.... We can't even get the full TOY points published for the members to see. Do you think we can actually track judges? :becky:


*** This is no slight to the wonderful ladies at the KCBS that slave over doing the TOY program for our benefit.

Elegant Bear
10-18-2010, 05:57 PM
Mark Twain, noted BBQ enthusiast and author once wrote, "There are three kinds of lies: lies, damned lies, and statistics." We need to be careful about the nature of the problem that we are trying to solve by statistical proof. The underlying problem is the competence of judges which brings into question the certification process. Fix the certification process and you will fix the statistical deviations.

comfrank
10-18-2010, 06:46 PM
If almost no one gives out scores below 6, then we have a Lake Wobegon effect, where all of the children are above average. 6 becomes more like a floor rather than an average.

If nearly all of the scores are 6, 7, 8, or 9, then we have a 4-point scale. I have a colleague whose research specialty is measurement issues. He once did a study where he had people rate the size of objects on a 4-point scale (1 = very small, 2 = small, 3 = big, 4 = very big). The particular objects rated were a penny, a nickel, the moon, and the sun. He demonstrated that with a 4-point scale, there were no statistical differences in the sizes of the objects!!! If a 4-point scale can't distinguish between a nickel and the sun, then we ought to at least entertain alternatives to the current KCBS practice.

Someone earlier pointed out that using the full 2-9 scale would cause a lot of variance. But I can tell you, as a professor who teaches graduate level statistics, that variance is exactly what you need. I am *not* talking about judge-to-judge variance of a given entry. If the appearance scores are, say, 9,9,8,8,8,4, that is a different issue (a question of what in statistics is called "reliability"--I also think that this might be a problem with the KCBS system--but it is a different problem than the one I am addressing). No, I am talking about entry-to-entry variation rather than judge-to-judge variation. If there is little to no variance among entries--which can happen when you have 4-point scales--then we have a "restriction of variance" problem, which makes it very hard to find meaningful differences in the quality of the entries based upon their scores. As an extreme example, suppose there were no variance at all (everyone receives 888, say). Then it is impossible to distinguish good from bad cue. There will obviously be more variance than that in a 4-point scale, but, as my colleague's study shows, you still have problems picking out the large from the small or, equivalently, excellent cue from average. Not impossible, and not always, but certainly full-range scoring would be a statistical improvement.


--frank in Wilson, NY
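
A toy illustration of the restricted-range point above, under loud assumptions: eleven hypothetical entries spaced evenly from worst to best, one idealized judge who maps quality straight onto the scale, and helper names (to_scale, tied_pairs) invented for this sketch. It is not the colleague's study and not real contest data; it only shows how squeezing the same entries into 6-9 produces more ties than spreading them over 2-9.

```python
from collections import Counter

def to_scale(quality_pct, low, high):
    """Map a quality from 0..100 onto the integer scale low..high (round half up)."""
    return low + (quality_pct * (high - low) + 50) // 100

def tied_pairs(scores):
    """Count pairs of entries that received exactly the same score."""
    return sum(n * (n - 1) // 2 for n in Counter(scores).values())

qualities = list(range(0, 101, 10))  # 11 hypothetical entries, worst to best

full = [to_scale(q, 2, 9) for q in qualities]        # full 2-9 range
compressed = [to_scale(q, 6, 9) for q in qualities]  # de facto 6-9 range

print(full)        # [2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 9]
print(compressed)  # [6, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9]
print(len(set(full)), tied_pairs(full))              # 8 3
print(len(set(compressed)), tied_pairs(compressed))  # 4 11
```

With six judges and three criteria per entry the real totals have more resolution than this single-judge toy, so read it as a direction of effect rather than a size.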

Bentley
10-18-2010, 06:57 PM
I took a CBJ class recently and the person who was supposed to cook for the class had a last minute personal emergency and replacement cooks were recruited from throughout the community. The replacements did not have to be competition cooks. Every piece of meat that I tasted that day, I would have sent back to the kitchen had it come from a restaurant. It was the worst BBQ I had tasted in my life.

I'm pretty sure I was at the same training class and I can say that was problem #1.


San Diego, August 7th?

If so you can blame me for the chicken, at least the breast...I did all the breast and about, 60 drumsticks and 7-8 thighs...Sorry, did not think they were that bad...I tried the ribs, they were a 4, the butt was way over cooked, but had a decent flavor I thought, did you like the pork balls...the brisket, well your comments are very kind for them...

Ford
10-19-2010, 04:58 AM
Scoring in KCBS is not comparative. Each entry is scored on its own merits and should not be compared with another entry. So how does all this statistics crap come into play? That can only happen if there is comparison.

So if Lotta Bull, Smokin Triggers, Cool Smoke, Quau, Pellet Envy and Habitual Smokers all hit the same table and they all think they cooked real good BBQ I'd say that no score would be below 8 and that's as it should be.

Now if 6 newbies all hit the same table and one slices brisket with the grain, one cooks it to 165, well you get the idea, then what scores would you see at that table. Probably 3-6 depending on the judges. That is to be expected.

What's important to me is that scores are consistent for an entry, especially in appearance. So in the first example if the average is 8.33 and one judge gave a 5 in appearance then I think it's something to look at. First the table captain should ask the judge about it and maybe the rep should also talk to the judge. If it happens consistently the judge is just not in tune with the norm and maybe needs to be educated. And I don't mean another CBJ class as it doesn't teach what's good.

We need to revisit the class and have pictures of all 9 boxes, pics of boxes with flaws and ask them to identify the flaws, etc. Also need books or a proxima projector to show examples of different boxes. Just some of my thoughts.

And for the record I have talked to many judges and there are some that have very preconceived ideas of what they want to see in appearance. I've been told that if there are not 2 layers of ribs then the box won't score a 9 ever. As a single layer guy, not what I want to hear.

Lake Dogs
10-19-2010, 06:47 AM
If almost no one gives out scores below 6, then we have a Lake Wobegon effect, where all of the children are above average. 6 becomes more like a floor rather than an average.

If nearly all of the scores are 6, 7, 8, or 9, then we have a 4-point scale. I have a colleague whose research specialty is measurement issues. He once did a study where he had people rate the size of objects on a 4-point scale (1 = very small, 2 = small, 3 = big, 4 = very big). The particular objects rated were a penny, a nickel, the moon, and the sun. He demonstrated that with a 4-point scale, there were no statistical differences in the sizes of the objects!!! If a 4-point scale can't distinguish between a nickel and the sun, then we ought to at least entertain alternatives to the current KCBS practice.

Someone earlier pointed out that using the full 2-9 scale would cause a lot of variance. But I can tell you, as a professor who teaches graduate level statistics, that variance is exactly what you need. I am *not* talking about judge-to-judge variance of a given entry. If the appearance scores are, say, 9,9,8,8,8,4, that is a different issue (a question of what in statistics is called "reliability"--I also think that this might be a problem with the KCBS system--but it is a different problem than the one I am addressing). No, I am talking about entry-to-entry variation rather than judge-to-judge variation. If there is little to no variance among entries--which can happen when you have 4-point scales--then we have a "restriction of variance" problem, which makes it very hard to find meaningful differences in the quality of the entries based upon their scores. As an extreme example, suppose there were no variance at all (everyone receives 888, say). Then it is impossible to distinguish good from bad cue. There will obviously be more variance than that in a 4-point scale, but, as my colleague's study shows, you still have problems picking out the large from the small or, equivalently, excellent cue from average. Not impossible, and not always, but certainly full-range scoring would be a statistical improvement.


--frank in Wilson, NY

Frank, thanks, and yes, I was talking about the judge-to-judge variance.

To the above, if KCBS went with it, they'd need to do much closer to
an MBN style scoring, where after the scores are in, they then rank order
the entries at the table, with 1 and only 1 of the entries getting their
10 (or in this case, their 9). The others would get a fractional point, so
if the 2nd best that day was extremely close to the best one, it might
get the 8.9 and the third best, not being as close, might get an 8.5, etc.

Ultimately, ties happen. They do now, and they will with any system.
I do like KCBS's tie-breaker rules. They're probably the best out there,
frankly.

tmcmaster
10-19-2010, 07:08 AM
Mark Twain, noted BBQ enthusiast and author once wrote, "There are three kinds of lies: lies, damned lies, and statistics." We need to be careful about the nature of the problem that we are trying to solve by statistical proof. The underlying problem is the competence of judges which brings into question the certification process. Fix the certification process and you will fix the statistical deviations.
(UNRELATED) One of my all time favorite quotes!

Podge
10-19-2010, 07:25 AM
I had a judge come up to me this weekend, and he told me that I need to become a CBJ and judge some. He said it would help me with ideas on how to make the boxes look, etc. He told me that I'd pick up a lot of ideas too on what people were turning in.. ( i didn't ask for any of this advice, btw).. I guess when you have a small rusty cooker and no banners, a very unassuming site, people make assumptions. With these shoot-from-the-hip assumptions about me, I wonder how he judges BBQ ?

Jorge
10-19-2010, 07:27 AM
I had a judge come up to me this weekend, and he told me that I need to become a CBJ and judge some. He said it would help me with ideas on how to make the boxes look, etc. He told me that I'd pick up a lot of ideas too on what people were turning in.. ( i didn't ask for any of this advice, btw).. I guess when you have a small rusty cooker and no banners, a very unassuming site, people make assumptions. With these shoot-from-the-hip assumptions about me, I wonder how he judges BBQ ?

You need to hire that guy from Chicago as your media relations specialist:-P

Hawg Father of Seoul
10-19-2010, 07:41 AM
Statistics are very interesting. I agree with the gather info for a year and see what you have, post hoc.

My pork scored- 666 777 989 998 979 988

For me it was an 888. I get where people are pizzed about the outlying scores. Often "honest" people are really just jerks. Wish the outliers had to qualify the quantification.
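
As a rough sketch of what "outlier" could mean for the six cards above: compare each score with the table mean for that criterion and flag anything 2 or more points away. Assumptions, all mine: each three-digit card is read as appearance, taste, tenderness in that order; the 2-point threshold is borrowed from the board motion; the table mean includes the judge's own score; and the names parse_cards and flag_outliers are made up for the example.

```python
from statistics import mean

# Assumed order of the three digits on each card.
CRITERIA = ("appearance", "taste", "tenderness")

def parse_cards(text):
    """Turn '666 777 989' into [(6, 6, 6), (7, 7, 7), (9, 8, 9)]."""
    return [tuple(int(ch) for ch in card) for card in text.split()]

def flag_outliers(cards, threshold=2.0):
    """Return (judge_number, criterion, score, table_mean) for every score at
    least `threshold` points away from the table mean for that criterion."""
    flags = []
    for idx, name in enumerate(CRITERIA):
        table_mean = mean(card[idx] for card in cards)
        for judge, card in enumerate(cards, start=1):
            if abs(card[idx] - table_mean) >= threshold:
                flags.append((judge, name, card[idx], round(table_mean, 2)))
    return flags

cards = parse_cards("666 777 989 998 979 988")
for flag in flag_outliers(cards):
    print(flag)
```

On these numbers only the first judge's 6 in appearance crosses the 2-point line; the 6s in taste and tenderness sit about 1.5 and 1.8 below the table mean, so a fixed cutoff would not catch them.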

Lake Dogs
10-19-2010, 07:53 AM
Statistics are very interesting. I agree with the gather info for a year and see what you have, post hoc.

My pork scored- 666 777 989 998 979 988

For me it was an 888. I get where people are pizzed about the outlying scores. Often "honest" people are really just jerks. Wish the outliers had to qualify the quantification.

See, how on earth can you have appearance scores vary from 6 to 9?
That is absolutely, without any shadow of a doubt, a judging problem going
to training and oversight. Still, taste scores (in pork, not chicken or
ribs that can be from a different chicken/pig and taste different) shouldn't
vary this much either. HUGE problem. Taste, certainly is subjective, but
a six is basically next akin to *yuck* where 2 other judges say it's
fan-damn-tastic? 7's, 8's, I get it. Either the 6 or the 9's were wrong,
IMHO.


I am reminded of the time where in MIM scoring we had the first 4 judges
turn in their cards; everything looked great. Then the 5th card was
turned in, rating this one particular entry 7's in taste (which is the lowest
you can get). When asked why, he replied "it smells and tastes like
lighter fluid". We went back to the remaining pieces, and sure enough
there was a very distinct smell and taste of lighter fluid. Go figure.
I suppose some judges just like the taste of lighter fluid.... Scoring
systems can't fix this.

Jorge
10-19-2010, 07:59 AM
Mark Twain, noted BBQ enthusiast and author once wrote, "There are three kinds of lies: lies, damned lies, and statistics." We need to be careful about the nature of the problem that we are trying to solve by statistical proof. The underlying problem is the competence of judges which brings into question the certification process. Fix the certification process and you will fix the statistical deviations.

A favorite quote of mine as well. That being said, a study of data gathered over the course of a year would be the most objective way I can think of to truly determine what problems, if any, are present.

I'd like to know how many judges score high or low, often enough to stand out. I'd like to see how many judges routinely score everything virtually the same, etc....

At this point I think all we have is suspicion, anecdotal evidence, etc... If we have factual evidence of the problem, we can then determine the cause and proper corrective action.

While I understand your point, I wouldn't want to consider trying to recertify every CBJ to some new standard unless it was the only option available. The chaos it could potentially wreak on contests is pretty ominous, especially if it wasn't done properly. I'm convinced that some of the scoring issues we have today are a result of different generations of CBJs, including some that are still trying to apply either the old start at 9 or start at 6 systems etc.... If KCBS records are intact and accurate, it would be pretty simple to determine if that's the case once a good set of data is available.

I'm not discounting your proposed solution, but I don't think now is the time to implement ANY solution until we really understand what problem(s) we need to solve.

Hawg Father of Seoul
10-19-2010, 08:18 AM
If almost no one gives out scores below 6, then we have a Lake Wobegon effect, where all of the children are above average. 6 becomes more like a floor rather than an average.

If nearly all of the scores are 6, 7, 8, or 9, then we have a 4-point scale. I have a colleague whose research specialty is measurement issues. He once did a study where he had people rate the size of objects on a 4-point scale (1 = very small, 2 = small, 3 = big, 4 = very big). The particular objects rated were a penny, a nickel, the moon, and the sun. He demonstrated that with a 4-point scale, there were no statistical differences in the sizes of the objects!!! If a 4-point scale can't distinguish between a nickel and the sun, then we ought to at least entertain alternatives to the current KCBS practice.



--frank in Wilson, NY

As a student of Psychometrics .... I would love to see what they listed as possible confounds and mediators.

Without adequate training, no scale will measure with any real precision.

goodsmokebbq
10-19-2010, 09:15 AM
The tracking and "penalization" of outlier judges will skew the judging pool. No judge will want to be "labeled" an outlier and will worry about how his fellow judges are going to score and may hide his/her true opinion.

The very strong fact remains in all of this: There are very familiar faces at the top of every contest.

Scottie
10-19-2010, 09:33 AM
You need to hire that guy from Chicago as your media relations specialist:-P


I keep trying to get Dr. BBQ to help me as well. Or at least treat me to a Boston's Italian Beef the next time he is in town... :becky:

CBQ
10-19-2010, 09:38 AM
I think it would be interesting to see judges' performance across contests. If the judge that gave a "4" in the 9,9,9,8,8,4 scenario is handing out 4s and 5s left and right across multiple contests, maybe the problem is the judge and not the Q.

I have mentioned in other threads that we got a couple of fours in pork at one event last year, and had two comment cards saying the pork was cold. Every CBJ class tells people not to score on temp, so when two judges out of six put that down in writing, you have to wonder about the training and CBJ mix of the judges.

Lowest score dropped helps, but what if you have a collection of newly trained judges? Many states in the northeast have one or two contests a year. There are a few judges really dedicated to Q that drive hundreds of miles to contests, but a lot of the time you get new judges or judges that judge one contest a year. Statistical analysis won't fix this.

Podge
10-19-2010, 10:29 AM
I keep trying to get Dr. BBQ to help me as well. Or at least treat me to a Boston's Italian Beef the next time he is in town... :becky:

I like for people to not know me. Keeps them from asking for samples, shigging, etc..

Bigdog
10-19-2010, 10:47 AM
I like for people to not know me. Keeps them from asking for samples, shigging, etc..

Podge is changing his team name to "The Pube." :-P:-P:-P

This is really not a bad idea, esp. at the Royal where many people don't know BBQ etiquette.

The Pickled Pig
10-19-2010, 11:12 AM
The tracking and "penalization" of outlier judges will skew the judging pool. No judge will want to be "labeled" an outlier and will worry about how his fellow judges are going to score and may hide his/her true opinion.

To some degree, that is the desired outcome. If a judge consistently gives outlier scores based upon their "true" opinion, then I'd hope that they would conform to the norm with some positive mentoring.

I don't get the sense that anyone believes there is a rampant problem of rogue judges. I do believe there are a small number of judges that consistently score entries using a different scale than their peers use. Left on their own, those judges might as well stay home since their scores are likely being thrown out anyway.

As cooks we get feedback at every contest about things that could use improvement. Don't we owe it to our judges to give them the same type of feedback about their craft? How cool would it be for our judges to get a score sheet after every contest they judge that showed them how their scores compared with others at their same table? I think judges would get as much benefit out of that as I do getting cook score sheets.


The very strong fact remains in all of this: There are very familiar faces at the top of every contest.

The judging at contests of 40 teams or less is consistent. While the judging may be flawed at times, it is still fair. We all have the same probability of hitting a "bad" table. And tossing the low score out goes a long way to overcoming an outlier judge. Just because the system is fair doesn't mean it can't be improved.
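
To put a rough number on "tossing the low score out", here is a sketch using the pork cards posted earlier in the thread (666 777 989 998 979 988). Assumptions: each card is simply summed with no weighting of the criteria (real contest scoring weights them, and the weights are left out here on purpose), the function name table_total is made up, and the "what if" table where the 666 judge scores 888 instead is purely hypothetical.

```python
def table_total(cards, drop_lowest=True):
    """Sum each judge's card, optionally discard the lowest card total,
    and add up the rest. Unweighted on purpose (an assumption)."""
    totals = sorted(sum(card) for card in cards)
    if drop_lowest:
        totals = totals[1:]
    return sum(totals)

# Pork cards as posted earlier in the thread.
posted = [(6, 6, 6), (7, 7, 7), (9, 8, 9), (9, 9, 8), (9, 7, 9), (9, 8, 8)]
# Hypothetical: same table if the 666 judge had scored 888 instead.
what_if = [(8, 8, 8)] + posted[1:]

print(table_total(posted, drop_lowest=False), table_total(what_if, drop_lowest=False))  # 141 147
print(table_total(posted), table_total(what_if))                                        # 123 126
```

In this example the low card costs 6 raw points with all six cards counted and 3 with the lowest dropped, because once the 666 is gone the 777 becomes the card that counts against you. So the drop helps, but it only absorbs one low judge per table.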

LGHT
10-19-2010, 11:20 AM
I'd like to know how many judges score high or low, often enough to stand out. I'd like to see how many judges routinely score everything virtually the same, etc....

I don't think it's possible mainly because I doubt that the software KCBS is using now has a feature to keep track of all the CBJ's scores. Although it would be beneficial entering that much data would also be a huge task.

At this point I think all we have is suspicion, anecdotal evidence, etc... If we have factual evidence of the problem, we can then determine the cause and proper corrective action.

This is a very valid point. It may be 1-2 judges out of every comp that isn't on point with the other judges or may not realize how the judging has changed over the years. If a team gets 9-9-8-9-9-5 in appearance does the team have a way to protest the odd score? If not that may be the quickest and easiest way to single out the odd scores. If a judge is that off it will show up on ALL the scoring and then the KCBS rep can review and handle as needed. I mean it will be very hard to review ALL judges scores, but it shouldn't be that hard to review a few "questionable" scores if pointed out by a team.

Scottie
10-19-2010, 11:58 AM
I like for people to not know me. Keeps them from asking for samples, shigging, etc..


Me too. Except if they are passing out bottles of Jack... :becky:

ThomEmery
10-19-2010, 12:57 PM
As cooks we get feedback at every contest about things that could use improvement. Don't we owe it to our judges to give them the same type of feedback about their craft? How cool would it be for our judges to get a score sheet after every contest they judge that showed them how their scores compared with others at their same table? I think judges would get as much benefit out of that as I do getting cook score sheets.

Great idea.... Paul

CBQ
10-19-2010, 12:57 PM
I don't think it's possible mainly because I doubt that the software KCBS is using now has a feature to keep track of all the CBJ's scores. Although it would be beneficial entering that much data would also be a huge task.

I think the data is already there. They capture the scores, and when you are a judge, they ask you for your CBJ number. I think there would be an issue of privacy though. KCBS would probably want to review the data itself for action, and not make it public. Whether or not they are staffed to do that kind of analysis is uncertain.

Certainly others are - like what The Pickled Pig is able to do with published results data - but if KCBS does do this kind of evaluation, I don't see it being made widely available.

goodsmokebbq
10-19-2010, 02:00 PM
To some degree, that is the desired outcome. If a judge consistently gives outlier scores based upon their "true" opinion, then I'd hope that they would conform to the norm with some positive mentoring.

I don't get the sense that anyone believes there is a rampant problem of rogue judges. I do believe there are a small number of judges that consistently score entries using a different scale than their peers use. Left on their own, those judges might as well stay home since their scores are likely being thrown out anyway.

As cooks we get feedback at every contest about things that could use improvement. Don't we owe it to our judges to give them the same type of feedback about their craft? How cool would it be for our judges to get a score sheet after every contest they judge that showed them how their scores compared with others at their same table? I think judges would get as much benefit out of that as I do getting cook score sheets.




The judging at contests of 40 teams or less is consistent. While the judging may be flawed at times, it is still fair. We all have the same probability of hitting a "bad" table. And tossing the low score out goes a long way to overcoming an outlier judge. Just because the system is fair doesn't mean it can't be improved.


Judging feedback in the form of a score sheet could be productive in allowing judges a way to "hone" in their scoring. This sounds like "learning" and that's what we all want.

I do however think the judges are going to be very wary of being "put on a list" and "identified for further training". I am not sure if I understand how this would bias a judge's decision. Not to mention my fleeting confidence that KCBS can execute a program like this.

I really don't think the judging is flawed, I just think you have different individuals making decisions based on their own opinion and distinct experience. There is a level of variance that will never go away when humans are involved (Computer judging Mod :-D).

LGHT
10-19-2010, 04:27 PM
I agree. I don't think the judging is flawed; I would chalk it up to the lack of proper and in-depth training of the judges. But of course you can't re-train all current CBJ's, so you still have to do something.

I used to do a bit of programming back in the 80's and the one quote that always made sense is Trash in = Trash out.