Grading using artificial intelligence

Snowman · #19 08-11-2021, 02:46 AM

Part 3 of 3...

Practical limitations:

Classification models such as these would come with extremely high error rates. These types of ML algorithms are simply going to get it wrong far too often, both because of the natural variation that exists across different card types and card images and because of the limitations of the input data (high-def scans only show so much). The confidence intervals that would accompany an ML model's predictions would be so wide that they would be borderline useless in practice. As Rick-Rarecards pointed out earlier, a model that outputs a 51% chance of your card being authentic just isn't very useful, and it's precisely the type of output that these types of models produce.

The data warehouse required just to support these types of projects is extremely expensive to build and even more expensive to maintain. Data scientists (or "AI/ML engineers", there are many names for these types of jobs) are also not cheap. Neither are the systems architects and database admins necessary for supporting them. Most skilled data scientists' salaries could probably cover 3 or 4 of the most experienced graders on a TPG's payroll, and they'll need an entire team of them to work on these problems. Also, the problems themselves just aren't all that interesting to work on from a problem-solving perspective. Good data scientists are often more motivated by working on something novel and exciting than by salary alone, since any good data science job pays well and there's no shortage of companies with fun and interesting problems competing for their talents. Keeping the good ones around won't be easy, especially in California. Overall, the scale of the investment in such a project is difficult to exaggerate. The money PSA spent to acquire Genamint is just the tip of the iceberg. The juice will not be worth the squeeze. I'd wager everything I own on that.

TPGs already have enough boxes to check in their pipeline before a card gets returned to the customer: receiving, research, grading, QC, and shipping. If they were to add taking high-definition scans and running numerous ML algorithms for each card to that pipeline, it could easily triple the time spent on every card. And for what benefit? Worse predictions than humans can already make? Maybe they hope to flag cards to potentially examine more closely? And what percentage of those would be false flags? A lot, that's for sure. Also, I guarantee it would just turn into a running joke with the graders. They'll just roll their eyes every time a math nerd comes up to them with a "questionable card" report. They'll realize on day one that they are better at detecting this stuff than the algorithms are. It's just not practical. It's not the solution they were hoping for.
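To put rough numbers on the false-flag question, here is a minimal sketch that applies Bayes' rule to a flagging model. All three inputs (how common problem cards are, how often the model catches them, and how often it flags a clean card) are made-up values for illustration only, not figures from any TPG or any real model.

```python
def flag_precision(prevalence, sensitivity, false_positive_rate):
    """Share of flagged cards that really are problem cards (Bayes' rule)."""
    true_flags = prevalence * sensitivity                   # problem cards that get flagged
    false_flags = (1 - prevalence) * false_positive_rate    # clean cards that get flagged
    return true_flags / (true_flags + false_flags)


# Hypothetical inputs, chosen only to illustrate the arithmetic:
# 2% of submissions have a problem, the model catches 90% of them,
# and it wrongly flags 10% of clean cards.
precision = flag_precision(prevalence=0.02, sensitivity=0.90, false_positive_rate=0.10)
false_flag_share = 1 - precision

print(f"Flags that are real problems: {precision:.1%}")
print(f"Flags that are false alarms:  {false_flag_share:.1%}")
```

Under those assumed inputs, roughly 84% of the flagged cards would be false alarms, even though the model looks decent on paper. The point is only that a low base rate of problem cards makes most flags false ones; the real share depends entirely on the real rates.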

Also, who would be responsible for interpreting the models' outputs? Normally, this would be a data scientist who interprets model results for executives at other companies. Their expertise is needed to explain some of the anomalies, which there will be no shortage of. But to pay someone with that skill set just to interpret model results on every single card that comes through a TPG like PSA? Yikes. That's a pretty big ask. And if you had a non-skilled worker doing it, then you might as well just scrap the entire project.

What it could do well:

ML could be used to grade centering on all cards with a clearly defined border. It would be a fairly straightforward model to build and one that would be expected to perform well. Yay, I guess? How big of a win is this really though? Do you really need a machine learning model to tell you that a bordered card is off-centered?
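For what it's worth, on a bordered card you may not even need a trained model. The sketch below measures the four border widths directly from pixel values and reports them as the usual ratio. It assumes a grayscale scan already cropped to the card edges, a border that is brighter than the printed image, and a hypothetical brightness cutoff (`border_min`); real scans would need cropping, deskewing, and a smarter border test.

```python
import numpy as np


def border_widths(gray, border_min=200):
    """Return (left, right, top, bottom) border widths in pixels.

    Assumes the scan is cropped to the card edges and that border pixels
    are at least border_min bright while the printed image is darker.
    """
    inner = gray < border_min                  # True where the printed image is
    cols = np.where(inner.any(axis=0))[0]      # columns that contain image pixels
    rows = np.where(inner.any(axis=1))[0]      # rows that contain image pixels
    height, width = gray.shape
    return (int(cols[0]), int(width - 1 - cols[-1]),
            int(rows[0]), int(height - 1 - rows[-1]))


def centering(a, b):
    """Express two opposing border widths as a ratio such as '60/40'."""
    larger = int(round(100 * max(a, b) / (a + b)))   # assumes a + b > 0
    return f"{larger}/{100 - larger}"


# Synthetic 100x70 "card": white border with a dark image shifted right and down.
card = np.full((100, 70), 255, dtype=np.uint8)
card[12:92, 9:64] = 50

left, right, top, bottom = border_widths(card)
lr = centering(left, right)   # left/right centering
tb = centering(top, bottom)   # top/bottom centering
print(lr, tb)
```

On this synthetic card the borders come out at 9 and 6 pixels left/right and 12 and 8 pixels top/bottom, so both directions read 60/40.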

However, non-bordered cards pose a much more challenging problem. You could train a model to learn centering by having it pay attention to how far from the edges the Topps logo is for a particular set, but then you're running into the problem again of having to build an entire dataset with tens of thousands of cards from just one particular set (something they rarely have) in order to create the training data it needs to learn how to identify what a well-centered card looks like from say 2022 Topps Chrome (or any other new set that it hasn't seen yet). This is just not practical. And you can't combine different sets with different logo locations into the same training data, because one logo might be well centered 3/16" from the top and left edges whereas a different logo for a different set might be well centered 1/2" from the top and left edges. Having non-uniform distances both being "well-centered" would confuse the algorithm.

Someone mentioned that ML could be used to identify which set a card was from, perhaps to help in the research stage of a TPG's pipeline. In theory, this is possible, but I'm not so sure this is an "ML" problem per se. You certainly wouldn't build a multiclass classification model with tens of thousands of different classes (here a 'class' would be a set of cards like 1987 Topps or 2019 Topps Chrome Sapphire Edition, etc.) because that's just way too many classes for a problem like this. I suppose you could try a different approach, but it seems like more of a matching algorithm type problem, not really a machine learning one that would require a training set of data to learn from.

Fingerprinting cards - Again, this isn't really "AI" or "ML". This is just a matching algorithm, just like when the FBI "runs someone's prints" for a fingerprint match or compares dental records. You're basically comparing the numeric values in the RGB (or similar) matrix I mentioned earlier against those of other image files in a database and calculating something like the Euclidean distance between the two sets of values to come up with a similarity score. When two images have an extremely low or near-zero distance, they're probably a match. But again, this is just math and some basic coding skills. It is not machine learning and certainly not "AI" (although I suppose they are somewhat related fields).
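A minimal sketch of that kind of matching, assuming every image has already been aligned and resized to the same shape (which is the hard part in practice). The function names, the tiny random "database", and the noise level are all made up for illustration; this is not how any TPG actually does it.

```python
import numpy as np


def distance(img_a, img_b):
    """Euclidean distance between two same-shaped RGB arrays."""
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))


def best_match(query, database):
    """Return (card_id, distance) for the stored image closest to the query."""
    scores = {card_id: distance(query, img) for card_id, img in database.items()}
    card_id = min(scores, key=scores.get)
    return card_id, scores[card_id]


# Hypothetical database: five random 32x32 RGB "scans" standing in for real images.
rng = np.random.default_rng(0)
database = {
    f"card_{i}": rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    for i in range(5)
}

# Simulate a rescan of card_3 by adding a little pixel noise to the stored image.
noise = rng.integers(-2, 3, size=(32, 32, 3))
rescan = np.clip(database["card_3"].astype(int) + noise, 0, 255).astype(np.uint8)

match_id, match_dist = best_match(rescan, database)
print(match_id, round(match_dist, 1))
```

The rescan lands far closer to its own stored image than to any other, so it comes back as card_3. There is no training step anywhere in this, which is the point: it's a lookup with a distance threshold, not a learned model.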

Additional challenges that I didn't address:
- Autographs on cards
- Memorabilia cards
- Short printed cards that don't have enough copies to be able to create training datasets from
- Some cards are bowed, others are flat, this could distort the "edge detection" locations in scanned images
- Crossover submissions with cards currently in other slabs
- And a whole lot more...
