u/divyanshu_gupta007 — reddlx

▲ 3 r/datasets+1 crossposts

Looking for tools to enrich 3,800 licensed property manager names (Ontario, Canada) — need emails. What actually works?

I’m building a lead enrichment pipeline for my friend in Canada and hitting a wall. Looking for advice from anyone who’s done similar work.

The data I have:

• 3,800 licensed property managers from Ontario’s official CMRAO registry
• Name only — no employer, no domain, no address
• These are real licensed professionals, not residential contacts

What I’ve already tested (with results):
• Apollo.io free tier → blocked on Search API, needs paid plan
• Hunter.io → needs company domain to work, useless without it
• PeopleDataLabs → blocked signup, requires work email
• Prospeo → B2B only, 0% hit on Canadian residential-style data
• Spokeo/BeenVerified → US database only, no Canada coverage
• Canada411 via Apify → works but returns phone numbers only, no emails

What I’m trying to figure out:

1. Is Apollo Basic ($49) actually worth it for Canadian property managers? Has anyone tested it for Canada specifically?
2. Is there any people-search or enrichment tool with decent Canadian professional coverage?
3. Has anyone successfully enriched name-only Canadian professional contacts at scale?

What I’ve already ruled out:

• US-only people search tools (Spokeo, BeenVerified, TruthFinder)
• Tools that need a company domain as input
• Residential Canadian data (confirmed it basically doesn’t exist)

These are licensed professionals so they should have LinkedIn profiles and company affiliations — just need the right tool to match name → email efficiently.

Any real-world experience appreciated. Happy to share results once I find something that works.

u/divyanshu_gupta007 — 6 days ago

▲ 55 r/n8n_on_server+3 crossposts

Excited to share that out of all April submissions, my NPM Package Intelligence Agent was selected as a spotlight winner! 🚀

Built an AI-powered agent that analyzes any npm package and delivers data-driven recommendations using:

— GitHub & npm APIs for real-time metrics

— Firecrawl for web intelligence

— Gemini for structured insights

Helps you avoid risky dependencies before they hit production.
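
As an illustration of the kind of metric the npm API can supply, here is a minimal sketch that computes how long ago a package's latest version was published. It is not taken from the workflow. It assumes the input is the package metadata document served at https://registry.npmjs.org/<package>, which carries `dist-tags` and `time` fields; the function name and the choice of "staleness" as a signal are hypothetical.

```python
from datetime import datetime, timezone


def days_since_latest_release(packument, now=None):
    """Return whole days since the version tagged 'latest' was published.

    packument: dict parsed from the npm registry metadata for one package.
    now: optional timezone-aware datetime, injectable for testing.
    """
    latest = packument["dist-tags"]["latest"]
    # Registry timestamps look like 2024-01-01T00:00:00.000Z; swap the
    # trailing Z for an explicit offset so older Pythons can parse it.
    published = datetime.fromisoformat(
        packument["time"][latest].replace("Z", "+00:00")
    )
    now = now or datetime.now(timezone.utc)
    return (now - published).days
```

A package whose latest release is several years old is not automatically risky, so a number like this would only be one input among several.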

Huge thanks to the n8n team for running such an incredible challenge — this community pushes builders to go deeper and build smarter.

Try the workflow here: https://n8n.io/workflows/14911

u/divyanshu_gupta007 — 21 days ago

▲ 15 r/n8n_on_server+5 crossposts

We've all been there.

A vendor agreement. A freelance contract. An NDA.

20 pages land in your inbox.

You spend 2–3 hours reading it.

Half of it doesn't make sense without a law degree.

And at the end — you sign anyway and hope for the best.

The real problem isn't the legal jargon you notice.

It's the plain-language clause that looks harmless — but isn't.

"The Client accepts full responsibility for all third-party claims"

No alarm bells. No red flags.

But you just agreed to something that could cost you everything.

And no keyword search will ever catch it.

So I built something to fix this.

A web app that reads any PDF contract and tells you exactly what to worry about — clause by clause.

Upload PDF → AI analyses every clause → full risk report in under 5 minutes.

Each clause gets:

✅ Risk level — HIGH, MEDIUM, or LOW

✅ What it actually means — no legal jargon

✅ A safer alternative — rewritten to protect you better
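
The three per-clause fields above map naturally onto a small record type. The sketch below is an assumption about how such a result could be modelled, not the app's actual schema; the names `ClauseFinding` and `sort_by_risk` are hypothetical.

```python
from dataclasses import dataclass

# Risk levels as named in the post, ordered most to least severe.
RISK_LEVELS = ("HIGH", "MEDIUM", "LOW")


@dataclass(frozen=True)
class ClauseFinding:
    clause_text: str        # the original clause from the contract
    risk_level: str         # one of RISK_LEVELS
    plain_meaning: str      # jargon-free explanation
    safer_alternative: str  # suggested rewrite

    def __post_init__(self):
        # Reject anything outside the three documented levels.
        if self.risk_level not in RISK_LEVELS:
            raise ValueError(f"unknown risk level: {self.risk_level!r}")


def sort_by_risk(findings):
    """Return findings ordered HIGH first, then MEDIUM, then LOW."""
    return sorted(findings, key=lambda f: RISK_LEVELS.index(f.risk_level))
```

Sorting HIGH first keeps the clauses that need attention at the top of the report.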

No legal background needed.

No expensive lawyer for a first-pass review.

The interesting technical part — Hybrid RAG:

The core challenge was catching risky clauses written in plain language.

Simple keyword search fails completely here.

The solution: Hybrid RAG — two search methods combined:

→ Vector Search (Supabase pgvector): semantic similarity search against a knowledge base of known risky clause patterns. Catches dangerous clauses even when the wording looks innocent.

→ BM25 Keyword Search: full-text matching against explicit legal red flag terms.

→ RRF Reranking: Reciprocal Rank Fusion merges both result sets with clause-type boost multipliers (indemnification gets 1.8x, IP gets 1.3x, etc.).

This combination catches what neither method alone would find.
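
The fusion step can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the smoothing constant `k=60` is the commonly used default and an assumption here, the function name `rrf_fuse` is hypothetical, and only the two boost values quoted above are filled in (every other clause type is assumed to get 1.0).

```python
from collections import defaultdict

# Boosts quoted in the post; unlisted clause types default to 1.0 (assumption).
CLAUSE_BOOSTS = {"indemnification": 1.8, "ip": 1.3}


def rrf_fuse(rankings, clause_types, k=60, boosts=CLAUSE_BOOSTS):
    """Merge ranked id lists with Reciprocal Rank Fusion, then apply boosts.

    rankings: list of lists of ids, each ordered best-first.
    clause_types: dict mapping id -> clause type label.
    Returns (id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            scores[doc_id] += 1.0 / (k + rank)
    for doc_id in scores:
        scores[doc_id] *= boosts.get(clause_types.get(doc_id), 1.0)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["a", "b", "c"]   # ids from semantic search, best first
keyword_hits = ["b", "d"]       # ids from keyword search, best first
types = {"a": "payment", "b": "ip", "c": "indemnification", "d": "payment"}
fused = rrf_fuse([vector_hits, keyword_hits], types)
```

In this toy input, `b` ranks first because both lists return it, and the 1.8x boost lifts the indemnification pattern `c` above `a` even though `a` ranked higher in the vector results.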

Tech Stack:

→ n8n — backend workflow orchestration (the brain)

→ Google Gemini Flash — clause classification + AI risk annotation

→ Supabase + pgvector — vector knowledge base + result storage

→ BM25 — keyword search via PostgreSQL full-text

→ HTML / CSS / JS — frontend web app

→ Netlify — hosting and deployment
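
For readers unfamiliar with BM25, here is a minimal in-memory sketch of the scoring formula. It is an illustration only: the post does not show how BM25 is computed on the PostgreSQL side, and PostgreSQL's built-in ranking functions (`ts_rank`, `ts_rank_cd`) use a different formula. The tokenizer (lowercase whitespace split) and the parameters `k1=1.5`, `b=0.75` are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25.

    docs must be a non-empty list of strings. Returns one score per doc.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many docs each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average docs are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


patterns = [
    "client shall indemnify vendor",
    "payment due in thirty days",
    "vendor may terminate",
]
scores = bm25_scores("indemnify vendor", patterns)
```

The first pattern scores highest because it matches both query terms, and the rarer term ("indemnify") carries more weight than the one that appears in two patterns ("vendor").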

Architecture note:

The AI pipeline takes 3–5 minutes to complete.

So the system is fully async: the frontend fires the request, n8n processes it in the background and saves the result to Supabase, and the frontend polls every 5 seconds until the report is ready.

No timeouts. No CORS issues. No UI stuck loading.
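
The polling loop can be sketched like this. The real frontend is JavaScript, so this Python version only illustrates the pattern; `fetch_status`, the `status` and `report` field names, and the 10-minute cap are hypothetical.

```python
import time


def poll_until_ready(fetch_status, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch_status() every interval_s seconds until the report is done.

    fetch_status: hypothetical callable returning a dict such as
        {"status": "processing"} or {"status": "done", "report": {...}}.
    sleep: injectable so the loop can be tested without real waiting.
    """
    waited = 0.0
    while waited <= timeout_s:
        result = fetch_status()
        if result.get("status") == "done":
            return result["report"]
        if result.get("status") == "error":
            raise RuntimeError(result.get("message", "analysis failed"))
        sleep(interval_s)
        waited += interval_s
    raise TimeoutError(f"report not ready after {timeout_s} seconds")
```

Because each poll is a short request that returns immediately, no single HTTP call has to stay open for the full 3–5 minutes.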

🔗 Free template on n8n hub: https://n8n.io/workflows/

Happy to answer questions about the architecture, the Hybrid RAG setup, or the Supabase schema. 👇

u/divyanshu_gupta007 — 23 days ago