u/Commercial_Many_909

What data gap frustrates you most in trading card investing?

Whether you're into Pokemon, Magic, sports cards, or anything else, current data tools have similar gaps. Curious what actually matters to people who invest seriously vs. what's just a nice-to-have feature.

The gaps I keep noticing:

- Fair value comparisons that control for rarity, condition, pop counts

- Liquidity metrics (how fast can you actually sell at a given price?)

- Reprint/supply risk for modern issues

- Outlier filtering on marketplace data (eBay especially)

- Grading EV with full outcome distribution

- Cross-platform price reconciliation

For the serious investors here: which of these actually affects your decisions, and which sound smart but you wouldn't change behavior over?

(Background: building tools in the Pokemon space, expanding thinking to other TCGs. Not pitching here — asking what's actually broken vs what's just academically interesting.)
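To make the "outlier filtering on marketplace data" item concrete, here is a minimal sketch of one common approach, a modified z-score based on the median absolute deviation (MAD). The sold prices are made up for illustration, and the 3.5 cutoff is a conventional default rather than anything tuned on card data.

```python
from statistics import median

def filter_outliers(prices, threshold=3.5):
    """Drop sold prices whose modified z-score exceeds `threshold`."""
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # All prices (nearly) identical: nothing to flag.
        return list(prices)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the underlying data is roughly normal.
    return [p for p in prices if abs(0.6745 * (p - med) / mad) <= threshold]

# Hypothetical eBay sold prices for one card, including a damaged copy
# and a mislabeled graded sale.
sold = [102.0, 98.5, 105.0, 99.0, 101.5, 12.0, 450.0]
clean = filter_outliers(sold)
print(clean)
```

Median-based filters are a natural fit here because a single extreme sale moves the mean and standard deviation a lot but barely moves the median.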

u/Commercial_Many_909 — 2 days ago

Curious what serious collectors think: what data would actually help your collection decisions?

Most of us check eBay sold listings, Cardmarket, PriceCharting when buying or selling significant cards. That's standard. But if you're managing a collection worth tens of thousands, the data feels surprisingly thin in some ways:

- How do you actually know if a card is "fairly priced" relative to its peers? Comparable analysis exists for houses and cars but not really for cards.

- How do you assess reprint risk on modern cards before pulling the trigger on a $1000+ purchase?

- For raw cards, how do you decide grading EV with any rigor? PSA's published pop reports tell part of the story but not the whole.

- How do you spot when prices are being manipulated (wash sales, dealer collusion on small markets)?

What gaps in available data have you hit in your collecting? What would you actually use if it existed? (I'm a finance student building some tools in this space, but this is genuinely a question, not a pitch. I'd rather know what's missing than guess.)
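For the grading EV question above, a rough sketch of what "with rigor" could look like: weight every grade outcome by its probability instead of only pricing the PSA 10 case. All probabilities, prices, and fees below are hypothetical placeholders, and `grading_ev` is an illustrative helper, not an existing tool.

```python
def grading_ev(grade_probs, grade_prices, raw_price, grading_cost):
    """Return (expected profit, probability of a loss, profit per grade)."""
    if abs(sum(grade_probs.values()) - 1.0) > 1e-9:
        raise ValueError("grade probabilities must sum to 1")
    # Profit for each grade = graded sale price - what the raw card cost - fees.
    outcomes = {g: grade_prices[g] - raw_price - grading_cost for g in grade_probs}
    ev = sum(grade_probs[g] * outcomes[g] for g in grade_probs)
    p_loss = sum(p for g, p in grade_probs.items() if outcomes[g] < 0)
    return ev, p_loss, outcomes

# Hypothetical inputs: probabilities might be estimated from pop reports,
# prices from recent graded sales.
probs = {10: 0.30, 9: 0.50, 8: 0.20}
prices = {10: 400.0, 9: 150.0, 8: 90.0}
ev, p_loss, outcomes = grading_ev(probs, prices, raw_price=120.0, grading_cost=25.0)
print(f"EV: {ev:.2f}, chance of losing money: {p_loss:.0%}")
```

Reporting the loss probability next to the EV matters because a positive EV can hide a sizeable chance of ending up below cost.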

u/Commercial_Many_909 — 2 days ago

Anachronism-free backtest on a hedonic model: card-level coinflip but cohort-level alpha. Methodology question.

Hi all. Earlier I posted about my hedonic regression model for graded Pokémon cards (R² 0.87 LOSO on n=2,622). I ran a proper out-of-sample forward backtest and the result raised a methodological question I'd value input on.

Setup

Trained on 2025-05 data only, scored predictions against actual 2026-05 prices. 2,311 cards eligible.

Results

  • Card-level hit rate (sign of predicted spread = sign of realized return): 49%.
  • Quintile-level: Q5 (top model discount) median 1y return +54%, Q1 (top premium) +22%. Mann-Whitney U test p = 3e-6.
  • Live long-only Q5 index: +60.2% vs broad market +41.7% over 12 months (+18.5 percentage points of excess return, out-of-sample).

So the model has essentially no directional predictive power on individual cards (a 49% hit rate is indistinguishable from a coin flip) but shows a statistically significant, economically large factor premium at the quintile level. The pattern is familiar from equity factor research (single-stock alpha ≠ portfolio factor alpha), but I haven't seen it cleanly documented for a hedonic regression on an illiquid collectibles market.
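One candidate mechanism, offered as a toy illustration rather than an explanation of these specific results: if each card's return is a weak signal plus large idiosyncratic noise, the sign of any single prediction is barely better than chance, while the noise largely cancels inside a quintile. The simulation below uses purely synthetic data; `beta`, `n`, and the noise scale are arbitrary assumptions.

```python
import random
from statistics import median

def simulate(n=5000, beta=0.1, seed=0):
    """Weak signal + heavy noise: card-level hit rate vs quintile medians."""
    rng = random.Random(seed)
    # `signal` stands in for the model's predicted discount per card.
    signal = [rng.gauss(0, 1) for _ in range(n)]
    # Realized return = small loading on the signal + dominant noise.
    ret = [beta * s + rng.gauss(0, 1) for s in signal]

    # Card level: how often does the sign of the signal match the return?
    hits = sum((s > 0) == (r > 0) for s, r in zip(signal, ret))
    hit_rate = hits / n

    # Cohort level: median return of bottom vs top signal quintile.
    ranked = sorted(zip(signal, ret))
    k = n // 5
    q1 = median(r for _, r in ranked[:k])
    q5 = median(r for _, r in ranked[-k:])
    return hit_rate, q1, q5

hit_rate, q1, q5 = simulate()
print(f"hit rate: {hit_rate:.3f}, Q1 median: {q1:.3f}, Q5 median: {q5:.3f}")
```

In this setup the hit rate stays close to 50% while the Q5 and Q1 medians separate clearly, which is the same qualitative shape as the backtest, though it says nothing about whether the real data is generated this way.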

My question

Why does individual-level predictive power collapse to coinflip while portfolio-level signal survives? Has anyone seen this pattern formalized?

Thanks for reading.

https://preview.redd.it/a1gkdhksoj0h1.png?width=1128&format=png&auto=webp&s=86e8c5092d2eca506f31be8e1b3472d0bc7f4730

u/Commercial_Many_909 — 11 days ago

I built a quantitative model to find the fair value of raw Pokémon cards (Hedonix H6 raw engine update)

Hey guys, I'm back with another Hedonix update for you.

After implementing the first H6 engine predicting PSA 10 prices and improving it with pop counts and gem rates, I wanted to build a new model that predicts raw card prices. This one was quite difficult since it does not factor in any price as an input (like the graded model does with raw prices).

The whole project started from a YouTuber's video, in which he claimed to have built a model doing exactly the same thing with an R² of 0.88. My version started at an R² of 0.31.

Why his R² looked so good: His sample was around 30 hand-picked chase cards. With 4-5 regressors on 30 data points, you get an R² > 0.85 in-sample almost mechanically. Unfortunately, no cross-validation was shown in the video. When I rebuilt his architecture on 358 cards with an honest leave-one-set-out CV, it dropped to 0.31. That's not a knock on his work, just what happens when you scale a small in-sample model to a real out-of-sample test.
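For anyone unfamiliar with leave-one-set-out CV, here is a minimal sketch of how the splits can be built: every card from one set is held out together, so the model never sees a set it is scored on. The card records and the `"set"` key are hypothetical, and this is not the actual Hedonix pipeline.

```python
from collections import defaultdict

def leave_one_set_out(cards):
    """Yield (held_out_set, train_cards, test_cards), one fold per set."""
    by_set = defaultdict(list)
    for card in cards:
        by_set[card["set"]].append(card)
    for held_out in by_set:
        # Train on every other set, test on the held-out set only.
        train = [c for c in cards if c["set"] != held_out]
        test = by_set[held_out]
        yield held_out, train, test

# Hypothetical toy panel.
cards = [
    {"name": "card_a", "set": "set_1"},
    {"name": "card_b", "set": "set_1"},
    {"name": "card_c", "set": "set_2"},
    {"name": "card_d", "set": "set_3"},
]
folds = list(leave_one_set_out(cards))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Grouping by set matters because cards from the same set share release timing and print run, so a random per-card split would leak that information into the test fold.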

How I got from 0.31 to a usable model:

  • Bigger panel + era flags (358 SV cards → 2,622 across SM/SWSH/SV): +0.12 R².
  • Adding graded data as features (pop count, gem rate): +0.05 R².
  • eBay daily volume time-series (730 days of daily sales counts per card): +0.28 R².
  • XGBoost over Linear Regression: +0.07 R².
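As a quick sanity check, the increments listed above can be summed from the 0.31 baseline. This is plain arithmetic on the numbers in the post; how much each step appears to add would generally depend on the order in which the features were introduced.

```python
baseline = 0.31
steps = [
    ("bigger panel + era flags", 0.12),
    ("graded features (pop, gem rate)", 0.05),
    ("eBay daily volume series", 0.28),
    ("XGBoost over linear regression", 0.07),
]

running = baseline
for name, gain in steps:
    running += gain
    print(f"{name:<34} +{gain:.2f} -> {running:.2f}")

# Round to two decimals to avoid floating-point noise in the total.
final_r2 = round(running, 2)
```

The total comes to 0.83.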

Features that surprised me by having zero impact:

  • LLM artwork scoring (composition, pose, color).
  • Google Trends per character.
  • Manual character tier tags (Eeveelutions, starters, legendaries).

Final result: I'm proud to say that the new raw model achieves an out-of-sample R² of 0.83 and a median error of 34% on 2,622 cards. For comparison, my graded H6 v2 lands at an R² of 0.87 with a 20% median error. Keep in mind that raw data will always be noisier than graded data because of bulk listings, casual sellers, and the lack of a PSA arbiter to standardize condition.
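For reference, a minimal sketch of how the two headline metrics can be computed from held-out predictions. It assumes "median error" means median absolute percentage error and that R² is taken on raw prices; the post does not specify either (a log-price R² would be computed the same way on logged values). The sample numbers are invented.

```python
from statistics import mean, median

def r_squared(actual, predicted):
    """1 - residual sum of squares / total sum of squares."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def median_abs_pct_error(actual, predicted):
    """Median of |predicted - actual| / actual across cards."""
    return median(abs(p - a) / a for a, p in zip(actual, predicted))

# Hypothetical out-of-sample prices and model predictions.
actual = [100.0, 200.0, 50.0, 400.0]
predicted = [120.0, 150.0, 55.0, 380.0]
r2 = r_squared(actual, predicted)
mape = median_abs_pct_error(actual, predicted)
print(f"R²: {r2:.3f}, median error: {mape:.0%}")
```

The two metrics answer different questions: R² is dominated by expensive cards, while the median percentage error describes the typical miss on an ordinary card.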

Thanks for reading. As always, I'm still looking for beta testers, so let me know if you want to test Hedonix.

https://preview.redd.it/d5kwu3346xzg1.png?width=1080&format=png&auto=webp&s=88480c8d0ffd369d37d2a55f9216a57d95fadd1f

https://preview.redd.it/bfej2xo46xzg1.png?width=1080&format=png&auto=webp&s=ae40902ff443e8242086c9e985e17d0d08cc9885

u/Commercial_Many_909 — 14 days ago



Hey guys, some days have passed since I initially posted about my quantitative approach and my platform Hedonix.

Many of you told me that my model would be useless without factoring in pop count and gem rate. I took your feedback to heart and prioritized it for this update. I'm happy to announce that the model now weights pop count and gem rate properly, which increased R² to 0.91 (in-sample) and 0.87 (out-of-sample) while reducing median error by 23%. I am incredibly proud of these improvements.

While doing the math on this I've had some insightful findings that I wanted to share with you:

  1. Pop count is a demand signal, not supply: Contrary to what many think, pop count correlates positively with price. A massive pop count is actually an indicator of high liquidity and market attention, which drives a premium.
  2. Gem rate is the true scarcity signal: This is the actual supply metric. Prices correlate negatively with the gem rate. If a card is easy to grade, the market floods and prices drop.

What does this mean? If you are seeking long-term stability you should absolutely factor in both metrics, as one alone does not paint the whole picture. The structurally strongest cards are those with high pop count but low gem rate.
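A small sketch of how the two metrics can be derived from a pop report and combined into the "high pop, low gem rate" screen described above. It assumes gem rate means PSA 10 population divided by total graded population and that "pop count" is the total graded population; the `min_pop` and `max_gem_rate` thresholds are arbitrary placeholders, not values from the model.

```python
def gem_rate(psa10_pop, total_pop):
    """Share of graded copies that received a PSA 10."""
    if total_pop <= 0:
        raise ValueError("total_pop must be positive")
    return psa10_pop / total_pop

def is_structurally_strong(psa10_pop, total_pop, min_pop=5000, max_gem_rate=0.35):
    """High pop (demand/liquidity proxy) combined with a low gem rate (scarce 10s)."""
    return total_pop >= min_pop and gem_rate(psa10_pop, total_pop) <= max_gem_rate

# Hypothetical pop report rows: (PSA 10 count, total graded).
print(is_structurally_strong(2000, 10000))  # heavily graded, hard to gem
print(is_structurally_strong(800, 1000))    # thinly graded, easy to gem
```

In a regression these would enter as continuous features rather than hard cutoffs; the boolean screen is only meant to show the direction of each effect.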

H6 v2 in action:
After the update, the market premium for the two most overvalued cards right now (in my 350-card database), Magikarp IR from Paldea Evolved and Sunbreon, is cut in half. While both still sit at a high premium (150% at $3,175 for Magikarp and 80% at $5,325 for Sunbreon), the model now factors in high demand (pop count) alongside relatively limited supply (gem rate).
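To put those premiums in dollar terms, here is a back-of-the-envelope sketch. It assumes the premium is defined as market price divided by model fair value, minus one, and that the quoted dollar figures are market prices; neither is stated explicitly, so treat the outputs as illustrative.

```python
def implied_fair_value(market_price, premium):
    """Invert premium = market / fair - 1 to recover the model's fair value."""
    return market_price / (1 + premium)

magikarp_fair = implied_fair_value(3175, 1.50)
sunbreon_fair = implied_fair_value(5325, 0.80)
print(f"Magikarp implied fair value: ${magikarp_fair:,.2f}")
print(f"Sunbreon implied fair value: ${sunbreon_fair:,.2f}")
```

Under those assumptions the model would be valuing Magikarp at roughly $1,270 and Sunbreon at roughly $2,958.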

Now you can decide: Is the updated market premium justified by having an outstanding artwork (which the model still can't value correctly) or is there just a high amount of hype priced in?

Thanks for reading! Remember I'm still looking for beta testers, so sign up if you want to use Hedonix.

https://preview.redd.it/2x9brku4nczg1.png?width=1684&format=png&auto=webp&s=57cab968c97b8fd22c0a9c363e63045107d9eadf

https://preview.redd.it/brnjnwd7nczg1.png?width=1047&format=png&auto=webp&s=4dab35b2c2c8460bbe8be5c5dd1056415edd59ad

https://preview.redd.it/m46lr778nczg1.png?width=1034&format=png&auto=webp&s=48560648f7bda69f7ddb13e0bbb2fec836e94356

u/Commercial_Many_909 — 17 days ago