r/dataengineer

▲ 1 r/dataengineer+1 crossposts

A lightweight, resource-aware data extractor

Hi everyone,

I want to share my project, which is aimed at data extraction for teams that don't yet have a data engineer, or don't have the budget for one.

At its core, the project is a single binary that, in a few commands, exports either the whole database or one large, wide, or heavy table.

You could of course point out that these days you can just vibe-code something in Python, or pick up something like Airbyte, whose images alone weigh as much as an entire database, or some other data platform.

But even once you have all that, the fun begins: you need to keep state somewhere for every export, or the network blips and you need a retry.
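To make the state-and-retry problem above concrete, here is a minimal Python sketch. It is not how the project itself works; every name in it (`load_checkpoint`, `save_checkpoint`, `with_retry`, `export`, and the `fetch_batch`/`write_batch` callbacks) is hypothetical. It assumes the source can be read in batches from a cursor and that network failures surface as `ConnectionError`.

```python
import json
import os
import tempfile
import time


def load_checkpoint(path):
    """Return the last saved cursor for this export, or None on a first run."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["cursor"]


def save_checkpoint(path, cursor):
    """Write the cursor atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"cursor": cursor}, f)
    os.replace(tmp, path)


def with_retry(fn, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** attempt)


def export(fetch_batch, write_batch, checkpoint_path):
    """Resume from the saved cursor and export batch by batch.

    fetch_batch(cursor) -> (rows, next_cursor); write_batch(rows) stores them.
    The checkpoint is saved only after a batch has been written.
    """
    cursor = load_checkpoint(checkpoint_path)
    while True:
        rows, next_cursor = with_retry(lambda: fetch_batch(cursor))
        if not rows:
            break
        write_batch(rows)
        save_checkpoint(checkpoint_path, next_cursor)
        cursor = next_cursor
```

Because the checkpoint is written after the batch, a crash between the two steps replays one batch on restart, so the writer has to tolerate duplicates (at-least-once delivery).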

The first goal of my product is to be careful not only with the worker's resources but also with the database's. Even if you don't have a replica, the safe preset will not read aggressively.
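One common way to read without hammering a primary database is keyset pagination with a pause between batches. The sketch below illustrates that general idea and makes no claim about what the safe preset actually does. It uses SQLite as a stand-in, and `read_gently`, `batch_size`, and `pause_s` are hypothetical names and values.

```python
import sqlite3
import time


def read_gently(conn, table, key, batch_size=1000, pause_s=0.2, sleep=time.sleep):
    """Yield rows in primary-key order, one small batch at a time.

    Keyset pagination (WHERE key > last_seen) keeps each query a short
    index range scan, and the pause between batches caps the read rate.
    `table` and `key` are interpolated into SQL, so they must be trusted
    identifiers, never user input.
    """
    last_key = None
    while True:
        if last_key is None:
            sql = f"SELECT * FROM {table} ORDER BY {key} LIMIT ?"
            params = (batch_size,)
        else:
            sql = f"SELECT * FROM {table} WHERE {key} > ? ORDER BY {key} LIMIT ?"
            params = (last_key, batch_size)
        cur = conn.execute(sql, params)
        columns = [c[0] for c in cur.description]
        rows = cur.fetchall()
        if not rows:
            return
        yield from rows
        # Remember where this batch ended so the next query starts after it.
        last_key = rows[-1][columns.index(key)]
        sleep(pause_s)
```

Compared with `OFFSET`-based paging, the database never has to re-scan rows that were already read, and the pause gives other queries room between batches.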

I'd be very grateful for any feedback.

github.com
u/Mundane_Let_8090 — 2 days ago
▲ 2 r/dataengineer+1 crossposts

Need Advice

Hey,
I currently have 6 YOE at a service-based company with 25 LPA (all fixed), and I'm planning to switch to a good product-based company.
Tech stack: Azure, Databricks
Is something in the range of 38-40 LPA possible in the current market?
I'm getting calls through Naukri.com, but they all want immediate-to-30-day joiners, and the CTC they offer is close to what I have right now.
Any suggestions on which companies to target and how to optimize my job search?

reddit.com
u/Fit_Mixture515 — 6 days ago
▲ 19 r/dataengineer+1 crossposts

Need some serious help

What is wrong with my resume? I have applied to 200+ positions, for roles ranging from data engineer to data analyst, and have not received a single response.
Please help.

u/undefined06 — 8 days ago
▲ 11 r/dataengineer+1 crossposts

Unlimited Context for AI Agents? How to scale context on Snowflake using platform-native tools 🚀

How do you give an AI agent a memory that is both durable and governed?

We just published a guide to building stateful agent memory on Snowflake using Cortex features and relational primitives to model a knowledge graph. This provides agents with durable, trust-aware recall without adding a dedicated graph database.  

The end-to-end stack:

  • Pipeline: Streams + Tasks + AI_EXTRACT. It’s declarative and runs under the same Snowflake Horizon primitives as the rest of our warehouse.
  • Memory: Instead of a specialized graph database, we used Relational Tables + Vector columns. Traversal is handled by Recursive CTEs.
  • Discovery: Cortex Search provides hybrid retrieval (vector + keyword) with RRF (Reciprocal Rank Fusion).
  • Orchestration: We’ve replaced custom orchestration logic with Cortex Agents used as declarative tools.
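
As a rough illustration of the Memory bullet above (graph traversal over plain relational tables with a recursive CTE), here is a small sketch. It runs on SQLite through Python's `sqlite3` rather than on Snowflake, so the exact syntax may differ there. The `edges(src, dst)` table, its rows, and `neighbors_within` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("alice", "acme"), ("acme", "berlin"), ("berlin", "germany"), ("bob", "acme")],
)

# Walk outward from a start node, one hop per recursion step.
# The depth cap bounds the walk even if the graph contains cycles.
REACHABLE = """
WITH RECURSIVE reachable(node, depth) AS (
    SELECT ?, 0
    UNION
    SELECT e.dst, r.depth + 1
    FROM edges AS e
    JOIN reachable AS r ON e.src = r.node
    WHERE r.depth < ?
)
SELECT node, MIN(depth) AS depth
FROM reachable
GROUP BY node
ORDER BY depth, node
"""


def neighbors_within(conn, start, max_depth):
    """Return (node, hops) for every node within max_depth hops of start."""
    return conn.execute(REACHABLE, (start, max_depth)).fetchall()


print(neighbors_within(conn, "alice", 2))
# [('alice', 0), ('acme', 1), ('berlin', 2)]
```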

The result: agent recall is durable and, more importantly, auditable.
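
For readers unfamiliar with the Reciprocal Rank Fusion mentioned under Discovery, the textbook formula is short enough to sketch. The post does not say how Cortex Search applies it internally, so treat this as the generic algorithm only. The function name and the sample document ids are hypothetical, and `k=60` is the constant commonly used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain
    it, with rank starting at 1. Ties are broken by document id.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]   # e.g. results from a vector search
keyword_hits = ["b", "d", "a"]  # e.g. results from a keyword search
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['b', 'a', 'd', 'c']
```

Documents that appear in both lists ("a" and "b") rise above those found by only one retriever, which is the point of fusing by rank instead of by raw score.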

https://www.capitalone.com/software/blog/scaling-agent-context-snowflake-knowledge-graphs/?utm_campaign=scaling_context_ns&utm_source=reddit&utm_medium=social-organic

u/noasync — 9 days ago
▲ 3 r/dataengineer+1 crossposts

Asking for advice

I will try to keep this as succinct as possible.

I am an American in Europe, looking to switch careers to data engineering because I want a job with more flexibility and a future. I can stay here on my student visa for now, but my goal is to stay permanently.

During my studies I had the luck to meet a gentleman, a senior level data engineer, who was willing to guide me towards that career.
He recommended the Udacity Data Engineering Nanodegree, and I completed it. He then recommended a basic Python course, and I earned that certificate too. I completed a HackerRank SQL certification. Then I got to dbt, one monster of a tool. I wanted to complete the certification for it, but without any practical experience I honestly do not know how that would have been possible.

My mentor then gave me a student position to learn the ropes, and I was immediately struck by the theory/practice divide. Things that I knew should have been simple felt agonisingly slow. I was crawling through Astronomer CI/CD pipeline setups. I struggled to write DAGs in Python, and I even struggled to write SQL queries without the use of AI. Even now I am not fully confident in these things.
I noticed that, with so little experience in the field, my problem solving meant casting a net across AI tools and the web to find a solution, and at times I was just running in circles.

This continued for a couple of months. In the meantime I did get familiar with basic concepts in containerization (Docker), git commands, Apache Airflow and GCP.
Then one day my boss asks me to migrate pipelines from one orchestrator to another, and I was stuck just moving one because I couldn't figure my way through one bug in the DAG code that literally took him 5 mins. He then proceeded to do all of them.

He was kind to tell me "not to take it personally," but of course, I felt like I had failed him. He then communicated that he does not know what other route to take me on since I am not ready to take on a client on my own. And I agree. I have not done any projects in data cleaning and analysis, or written entire SQL scripts on my own or carried out a project from ingestion through transformation and to reporting (basically the analysis and engineering).

Two days later he recommended that I apply for Junior Data Engineering positions, since there I would have someone who is paid to oversee and teach me, and I would not have the pressure of the kinds of clients he gets. According to him I am qualified for an entry-level position. I have been applying ever since, but my confidence has taken some hits through this whole ordeal.

I looked online for what to do in the meantime, but as always, the internet gives you too much, and I do not know exactly what to do or who to listen to. Do I pursue data analytics self-training and do some projects on my own, adding data engineering to the mix? Do I learn agentic AI in the meantime? Or do I focus on getting certified in Astronomer, GCP, Azure or dbt without any real-life experience?

I am willing to sacrifice my free time to train myself, but I have no idea where exactly to go, and frankly I am tired and quite scared of not making it.

Any words of guidance from experienced data engineers would be welcomed.

reddit.com
u/Silent-Verstand-3377 — 13 days ago