
Net54baseball.com
Welcome to Net54baseball.com. These forums are devoted to both pre- and post-war baseball cards and vintage memorabilia, as well as other sports. There is a separate section for Buying, Selling and Trading - the B/S/T area!! If you write anything concerning a person or company, your full name needs to be in your post or obtainable from it. Contact the moderator at leon@net54baseball.com should you have any questions or concerns. When you click on links to eBay on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network. Enjoy!


 
 
  #19  
08-11-2021, 02:46 AM
Snowman
Travis
Tra,vis Tr,ail
 
Join Date: Jul 2021
Posts: 2,442

Part 3 of 3...


Practical limitations:

Classification models like these would come with extremely high error rates. These types of ML algorithms are simply going to get it wrong far too often, both because of the natural variation that exists across different card types and card images and because of the limitations of the input data (high-def scans only show so much). The confidence intervals that would accompany an ML model's predictions would be so wide that they would be borderline useless in practice. As Rick-Rarecards pointed out earlier, a model that outputs a 51% chance of your card being authentic just isn't very useful, and that's precisely the type of output these models produce.

The data warehouse required just to support these types of projects is extremely expensive to build and even more expensive to maintain. Data scientists (or "AI/ML engineers"; there are many names for these types of jobs) are also not cheap. Neither are the systems architects and database admins necessary for supporting them. Most skilled data scientists' salaries could probably cover 3 or 4 of the most experienced graders on a TPG's payroll, and they'll need an entire team of them to work on these problems. Also, the problems themselves just aren't all that interesting to work on from a problem-solving perspective. Good data scientists are often more motivated by working on something novel and exciting than by salary alone, since any good data science job pays well and there's no shortage of companies with fun and interesting problems competing for their talents. Keeping the good ones around won't be easy, especially in California. Overall, the scale of the investment in such a project is difficult to exaggerate. The money PSA spent to acquire Genamint is just the tip of the iceberg. The juice will not be worth the squeeze. I'd wager everything I own on that.

TPGs already have enough boxes to check in their pipeline before a card gets returned to the customer: receiving, research, grading, QC, and shipping. If they were to add taking high-definition scans and running numerous ML algorithms for each card to that pipeline, it could easily triple the time spent on every card. And for what benefit? Worse predictions than humans can already make? Maybe they hope to flag cards to potentially examine more closely? And what percentage of those would be false flags? A lot, that's for sure. Also, I guarantee it would just turn into a running joke with the graders. They'll just roll their eyes every time a math nerd comes up to them with a "questionable card" report. They'll realize on day one that they are better at detecting this stuff than the algorithms are. It's just not practical. It's not the solution they were hoping for.
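To put rough numbers on the false-flag question, here is a back-of-the-envelope sketch using Bayes' rule. Every number in it is made up for illustration (the share of altered submissions, how many of those the model catches, and how often it wrongly flags a clean card); none of them come from PSA or any other TPG.

```python
def flagged_precision(base_rate, sensitivity, false_positive_rate):
    """Fraction of flagged cards that really are altered (Bayes' rule)."""
    true_flags = base_rate * sensitivity
    false_flags = (1.0 - base_rate) * false_positive_rate
    return true_flags / (true_flags + false_flags)


# Hypothetical inputs, chosen only for illustration:
# 2% of submissions are altered, the model catches 90% of them,
# and it wrongly flags 10% of the clean cards.
precision = flagged_precision(
    base_rate=0.02, sensitivity=0.90, false_positive_rate=0.10
)
false_flag_share = 1.0 - precision

print(f"{false_flag_share:.0%} of flagged cards would be false flags")
# prints "84% of flagged cards would be false flags"
```

Under those assumed rates, roughly five out of every six "questionable card" reports would point at a perfectly clean card, simply because clean cards vastly outnumber altered ones.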

Also, who would be responsible for interpreting the models' outputs? At other companies, this would normally be a data scientist who interprets model results for executives. Their expertise is needed to explain the anomalies, of which there will be no shortage. But to pay someone with that skill set just to interpret model results on every single card that comes through a TPG like PSA? Yikes. That's a pretty big ask. And if you had a non-skilled worker doing it, then you might as well just scrap the entire project.


What it could do well:

ML could be used to grade centering on all cards with a clearly defined border. It would be a fairly straightforward model to build and one that would be expected to perform well. Yay, I guess? How big of a win is this really though? Do you really need a machine learning model to tell you that a bordered card is off-centered?
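For a sense of how simple the bordered case is, here is a toy sketch that measures centering with no learning at all. It assumes a clean grayscale scan where the border is pure white (255) and the picture area is not; the function names and the tiny test image are invented for illustration, and a real scan would need thresholding and deskewing first.

```python
def border_widths(image, border_value=255):
    """Return (left, right, top, bottom) border widths in pixels.

    `image` is a list of rows of grayscale values. A row or column counts
    as border only when every pixel in it equals `border_value`.
    """
    height, width = len(image), len(image[0])

    def row_is_border(r):
        return all(p == border_value for p in image[r])

    def col_is_border(c):
        return all(image[r][c] == border_value for r in range(height))

    top = 0
    while top < height and row_is_border(top):
        top += 1
    bottom = 0
    while bottom < height - top and row_is_border(height - 1 - bottom):
        bottom += 1
    left = 0
    while left < width and col_is_border(left):
        left += 1
    right = 0
    while right < width - left and col_is_border(width - 1 - right):
        right += 1
    return left, right, top, bottom


def centering_split(a, b):
    """Express two opposing border widths as a percentage split, e.g. (60, 40)."""
    total = a + b
    if total == 0:
        raise ValueError("no border found")
    bigger = round(100 * max(a, b) / total)
    return bigger, 100 - bigger


# Toy 10x8 "scan": white border with a dark picture area in the middle.
card = [[255] * 10 for _ in range(8)]
for r in range(2, 6):
    for c in range(3, 8):
        card[r][c] = 0

left, right, top, bottom = border_widths(card)
print(centering_split(left, right))   # left/right split: (60, 40)
print(centering_split(top, bottom))   # top/bottom split: (50, 50)
```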

However, non-bordered cards pose a much more challenging problem. You could train a model to learn centering by having it pay attention to how far from the edges the Topps logo is for a particular set, but then you're running into the problem again of having to build an entire dataset with tens of thousands of cards from just one particular set (something they rarely have) in order to create the training data it needs to learn what a well-centered card looks like from, say, 2022 Topps Chrome (or any other new set that it hasn't seen yet). This is just not practical. And you can't combine different sets with different logo locations into the same training data, because one logo might be well centered 3/16" from the top and left edges, whereas a logo for a different set might be well centered 1/2" from the top and left edges. Having two different distances both labeled "well-centered" would confuse the algorithm.
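A small sketch of why the sets can't be pooled: the same measured logo position is perfect for one set and badly off for another, so the target only makes sense per set. The set names and the lookup table are hypothetical; the 3/16" and 1/2" offsets are just the example figures above.

```python
# Hypothetical reference positions (inches from the left and top edges)
# of the logo on a perfectly centered card, one entry per set.
EXPECTED_LOGO_OFFSET_IN = {
    "Set A": (3 / 16, 3 / 16),
    "Set B": (1 / 2, 1 / 2),
}


def logo_offset_error(set_name, measured):
    """Return (dx, dy): how far the measured logo sits from its set's reference."""
    expected_x, expected_y = EXPECTED_LOGO_OFFSET_IN[set_name]
    measured_x, measured_y = measured
    return measured_x - expected_x, measured_y - expected_y


measured = (0.5, 0.5)  # logo found 1/2" from the left and top edges

print(logo_offset_error("Set B", measured))  # (0.0, 0.0): well centered
print(logo_offset_error("Set A", measured))  # (0.3125, 0.3125): way off
```

Without the per-set reference (and enough examples to establish it for each new set), a single model has no way to know which of those two readings is the right one.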

Someone mentioned that ML could be used to identify which set a card was from, perhaps to help in the research stage of a TPG's pipeline. In theory, this is possible, but I'm not so sure this is an "ML" problem per se. You certainly wouldn't build a multiclass classification model with tens of thousands of different classes (here a 'class' would be a set of cards like 1987 Topps or 2019 Topps Chrome Sapphire Edition, etc.) because that's just way too many classes for a problem like this. I suppose you could try a different approach, but it seems like more of a matching-algorithm type of problem, not really a machine learning one that would require a training set of data to learn from.

Fingerprinting cards - Again, this isn't really "AI" or "ML". This is just a matching algorithm, just like when the FBI "runs someone's prints" for a fingerprint match or checks dental records. You're basically comparing the numeric values in the RGB (or similar) matrix I mentioned earlier against other image files in a database and calculating something like the Euclidean distance between the corresponding pixel values to come up with a similarity score. When two images have an extremely low or near-zero distance, they're probably a match. But again, this is just math and some basic coding skills. This is not machine learning and certainly not "AI" (although I suppose they are somewhat related fields).
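Here is a minimal sketch of that kind of matching, using only the Python standard library. It assumes the scans are already aligned and the same size, which real scans would not be without extra preprocessing; the card IDs, the 2x2 "images", and the threshold are all made up for illustration.

```python
import math


def flatten(image):
    """Flatten rows of (R, G, B) pixels into one long list of numbers."""
    return [channel for row in image for pixel in row for channel in pixel]


def distance(image_a, image_b):
    """Euclidean distance between two same-sized RGB images."""
    a, b = flatten(image_a), flatten(image_b)
    if len(a) != len(b):
        raise ValueError("images must have the same dimensions")
    return math.dist(a, b)


def best_match(query, database, threshold):
    """Return (card_id, distance) for the closest stored scan,
    or None if nothing in the database is within `threshold`."""
    if not database:
        return None
    card_id, d = min(
        ((cid, distance(query, img)) for cid, img in database.items()),
        key=lambda pair: pair[1],
    )
    return (card_id, d) if d <= threshold else None


# Hypothetical database of previously scanned cards (tiny 2x2 images).
database = {
    "card-001": [[(10, 10, 10), (200, 200, 200)], [(50, 60, 70), (0, 0, 0)]],
    "card-002": [[(250, 250, 250), (5, 5, 5)], [(90, 20, 30), (120, 120, 120)]],
}

# A new scan of card-001 with one channel off by a single unit of scanner noise.
query = [[(11, 10, 10), (200, 200, 200)], [(50, 60, 70), (0, 0, 0)]]

print(best_match(query, database, threshold=5.0))  # ('card-001', 1.0)
```

No training data is involved anywhere: the only tunable piece is the distance threshold that decides when two scans count as the same card.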


Additional challenges that I didn't address:
- Autographs on cards
- Memorabilia cards
- Short printed cards that don't have enough copies to be able to create training datasets from
- Some cards are bowed and others are flat, which could distort the "edge detection" locations in scanned images
- Crossover submissions with cards currently in other slabs
- And a whole lot more...
 







All times are GMT -6.

