r/MLQuestions

Need Help Improving Accuracy - Signature Matching Model Trained on Top of an EfficientNet-B3 Base

I'm working on a signature verification project to detect fraudulent signatures on cheques. Here is how my pipeline looks so far:

1. Extraction (Done)

I'm using a fine-tuned YOLOv8 model (from Tech4Humans) to detect and crop out the signatures from the cheque images. It’s working pretty great at isolating the signature area.

2. Preprocessing (The tricky part)

Since cheques come from all kinds of different scanners and phone cameras, the backgrounds and lighting are all over the place. To clean them up before feeding them to the model, I'm doing:

Ink isolation: Removing noise near the boundaries by thresholding pixels based on background brightness.

Line removal: Scanning for long horizontal bars (like the signature lines on the cheque), wiping them out, and filling the gaps based on the surrounding texture.

Contrast stretching: Using percentile contrast stretching to fix dark backgrounds caused by bad phone/scanner scans.

Smoothing: Applying a bilateral filter to smooth out background noise while keeping the signature lines sharp.

CLAHE: Using Contrast Limited Adaptive Histogram Equalization to boost the contrast of the ink against tough backgrounds.

Padding: Adding final padding so the aspect ratio doesn't get warped.
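Two of the steps above, percentile contrast stretching and aspect-preserving padding, can be sketched in a few lines. This is only a rough illustration, assuming a single-channel grayscale crop stored as a NumPy array; the function names, the 2/98 percentile cut-offs, and the white fill value are assumptions for the example, not the values used in the actual pipeline.

```python
import numpy as np

def percentile_stretch(gray, low_pct=2.0, high_pct=98.0):
    """Linearly rescale so the chosen percentiles map to 0 and 255 (assumed cut-offs)."""
    gray = gray.astype(np.float32)
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch, just return it as 8-bit
        return np.clip(gray, 0, 255).astype(np.uint8)
    stretched = (gray - lo) / (hi - lo)
    return (np.clip(stretched, 0.0, 1.0) * 255.0).astype(np.uint8)

def pad_to_square(gray, fill=255):
    """Center the crop on a square canvas so a later resize does not warp it."""
    h, w = gray.shape
    side = max(h, w)
    canvas = np.full((side, side), fill, dtype=gray.dtype)
    top = (side - h) // 2
    left = (side - w) // 2
    canvas[top:top + h, left:left + w] = gray
    return canvas
```

Padding to a square before resizing keeps stroke proportions intact, which is the point of the padding step described above.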

The Model

Once preprocessed, I’m feeding the images into EfficientNet-B3 to train it to spot the differences between genuine and forged signatures.

The problem is that my validation accuracy is not improving beyond 55-57%.

What else can I do to improve this?

Is the preprocessing enough, or should I improve it further? I'm still getting some horizontal lines (like the signature line) and, in some cases, some text.


u/the_MadMax — 1 hour ago

▲ 4 r/MLQuestions+1 crossposts

Learning ML with “Hands-On ML with Scikit-Learn” by O’Reilly, 3rd edition

Guys, I am trying to learn ML and want to become an ML engineer. Should I study this book, or can you suggest other books?


u/_Harshan — 13 hours ago

▲ 4 r/MLQuestions

How to approach deep learning from a mathematical perspective? (with the goal of becoming a researcher)

Hi everyone, I'm in high school (I failed twice, embarrassing I know but I had various problems) and I would like to be a research scientist in the field of deep learning, I decided to do mathematics instead of taking the standard path with computer science at university,

so at the moment, in the summer of the fourth year of high school (in Italy we have 5 years of high school) I'm reading books on rigorous mathematics (real analysis, proof-based linear algebra) (struggling a lot), and since I hate the classic machine learning courses found online (I prefer books in general),

I was wondering what was the most rigorous way to approach this field (maybe after I have tackled multivariable calculus), I know that most of the knowledge comes from papers but, a general book would be handy, is "deep learning" by Goodfellow a valid choice or is it now out of date?

I'm not interested in DL libraries at all, and I'd like a source as formal as possible. In fact, I'll probably only use numpy and cupy (or even pure CUDA) until I start my PhD. (In general, I have good experience with programming and neural networks; I even created transformers with only numpy [a sort of tensor micrograd], but without fully understanding what was going on.)

Do you have any other general advice? Does this kind of atypical path make sense?


u/New_Discipline_775 — 23 hours ago

▲ 6 r/MLQuestions+1 crossposts

Personal AI Project

Right now I am working on a YouTube chatbot, where a user can paste the URL of a video and ask questions about it. I have followed the classic RAG approach. The design looks like this:

Initial design :

(query, url) → YouTube Transcript API → Translate to English (Gemini 3.5 Flash) → Chunk → Vector store (Chroma) → Similarity search → Augment context with query → LLM → Output

Upgraded design :

(query, url) → YouTube Transcript API → Chunk raw transcript → Translate to English (Gemini 3.5 Flash) asynchronously each chunk → Vector store (Chroma) → Similarity search → Augment context with query → LLM → Output

I also have some intermediate steps: for example, if the video ID is already present in the vector store, I point directly to the vector store and retrieve the relevant context.

There is Langsmith integration.

My main doubt here is this :

I used the free gemini-3.5-flash model, which limited me to only 5 requests per minute. The problem is that one particular video was 1 hour long, and it took approximately 126 seconds to translate with this model.
I upgraded to Tier 1 and changed the translation step to be asynchronous, i.e., the chunks are translated in parallel, and I saw the latency drop to 15 seconds.

I am thinking of mentioning this project on my resume. Will I face any backlash because I upgraded the model tier? (I think the model itself takes basically the same time per call; it is the asynchronous logic that brought the latency down to 15 seconds, and to make these calls happen I had to move up a tier to get the extra calls per minute.)
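For anyone reading along, the asynchronous translation step can be sketched roughly like this. It is a minimal illustration using only the standard library: `translate_chunk` is a hypothetical stand-in for the real API call, and `max_in_flight` is an assumed knob. Note that a semaphore caps how many requests are open at once; it does not by itself enforce a requests-per-minute quota.

```python
import asyncio

async def translate_chunk(chunk: str) -> str:
    """Hypothetical stand-in for one translation request; swap in the real call."""
    await asyncio.sleep(0.01)  # simulates network latency
    return chunk.upper()       # placeholder "translation"

async def translate_all(chunks, max_in_flight=5):
    """Translate chunks concurrently with at most max_in_flight open requests."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded(chunk):
        async with semaphore:
            return await translate_chunk(chunk)

    # gather preserves input order, so chunk i maps to result i
    return await asyncio.gather(*(bounded(c) for c in chunks))
```

Calling `asyncio.run(translate_all(chunks))` returns the translated chunks in their original order. The total wall time is then driven by the slowest batch of concurrent calls rather than the sum of all calls, which is consistent with the latency drop described above.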


u/SignificanceQuick963 — 1 day ago

▲ 9 r/MLQuestions+1 crossposts

Is it ever correct to train a Ridge Regression model on the test set?

I was reading an ML book on Ridge Regression and came across the following code:

rr = Ridge(alpha=0.5)

rr.fit(w_test, h_test)

rr.score(w_test, h_test)

The book explicitly mentions fitting the Ridge model on the test data.

My understanding is that a model should always be trained using the training set

(fit(X_train, y_train))

and evaluated on the test set

(score(X_test, y_test) or predict(X_test)).

Am I missing some context here, or is this simply an error in the book? I'd love to hear how experienced ML practitioners interpret this example.
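For comparison, here is a minimal sketch of the workflow I expected to see, where the model is fitted on the training split and scored on the held-out test split. The data is synthetic and the variable names only mirror the book's snippet; both are assumptions, since the book's dataset is not shown here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: one feature w, one target h)
rng = np.random.default_rng(0)
w = rng.uniform(40, 100, size=(200, 1))
h = 0.8 * w[:, 0] + 100 + rng.normal(0, 5, 200)

w_train, w_test, h_train, h_test = train_test_split(
    w, h, test_size=0.25, random_state=0
)

rr = Ridge(alpha=0.5)
rr.fit(w_train, h_train)               # learn coefficients from the training split only
train_r2 = rr.score(w_train, h_train)  # R^2 on data the model has seen
test_r2 = rr.score(w_test, h_test)     # R^2 on data the model has not seen
print(f"train R^2 = {train_r2:.3f}, test R^2 = {test_r2:.3f}")
```

Scoring on the same data that was used for fitting, as in the book's snippet, only reports how well the model fits that data and says nothing about generalization.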

u/Timely-Alternative53 — 1 day ago

▲ 2 r/MLQuestions+1 crossposts

Any doctors on reddit

So last night at around 11pm I noticed I felt weak and tired and so I went to bed and slept horribly. My stomach was aching all night, I was nauseous and felt cold and off. I wake up at 7:40 to my dog barking and I immediately feel horrible. I check my temperature and it's 102 and so I go back to bed. I wake up again at 10 am and take Tylenol and my temperature before Tylenol is 101. It got down to 98.5 after Tylenol and went back up 4 hours later and got to 101. Then I take another nap for 30 minutes and wake up and take my temperature and it's 100 and I take Tylenol and then chill. My temperature goes back down to 98. My symptoms are: stomach ache, diarrhea, fever below 102, headache, weakness, body aches, fatigue.


u/Mission-Account-4527 — 1 day ago

▲ 6 r/MLQuestions+4 crossposts

AI Questionnaire for School Project

Hello to whom it may concern,

I'm tasked with doing a high school project on any real-world ethical issue, in my case, AI Education Systems. I would love for anyone to answer a couple of questions on the use of Artificial Intelligence within Education Systems. Your authentic opinion is sought after!

Note: This is NOT to push any anti-AI propaganda but rather to gather diverse opinions on the topic. (Some of the questions may feel "iffy", but these are questions my school requires me to include, so please answer them anyway.) Feel free to answer in as much or as little detail as you'd like.

Here is a Google form:

https://forms.gle/VypKe4Wb84wtXYs87

Or you can answer in the comments directly:

What is your qualification/level of study in regards to AI or Education? (anything is accepted)
What ethical issue do you think is most visible or important in our community or daily life right now?
I am researching ethics in the following area: Artificial Intelligence within the Education System. How have you personally experienced or been affected by this issue?
How did you deal with / manage the issue personally?
From your own understanding, how do you understand ethical versus unethical behaviour in this area?
Do you think people are always aware that there is an ethical problem in this situation? Why or why not?
Do you think some people justify unethical behaviour in this area? If yes, how do they justify it?
Do you think the situation is fair for everyone involved? Explain your answer.
Who do you think should be responsible for addressing or fixing the issue (individuals / companies / government / schools / families etc.)?
What do you think would be a realistic and ethical way to improve or reduce this problem?
Before you became aware of the conversation around AI in schools, did you ever stop to think of it as an ethical issue — or did it just seem like a practical problem?
Do you think the people building and selling AI tools to schools have a genuine interest in students' wellbeing, or are there other motivations at play?
If a student used AI to complete an assignment because they were overwhelmed or under-supported, would you consider that morally wrong — and does the reason behind it change anything for you?
Do you think your institution has handled this issue in a way that is honest and transparent with students, or has there been a degree of avoidance around it?
Looking forward, do you feel optimistic or concerned about the role AI will play in education — and what would need to change for you to feel differently?
Do you have any further thoughts on this matter?

Thanks for taking this into consideration and feel free to ask any questions!

u/CocoBark24 — 1 day ago

▲ 19 r/MLQuestions+11 crossposts

PROJECT REVIEW

Hello everyone! I just completed a BIG project that I have been working on for a month, and I want your opinion on it.

It's a SpaceX Launch Predictor & Cost Optimizer (A full end-to-end ML system that predicts the probability of a SpaceX Falcon 9 booster landing successfully, enriches launch data with real weather conditions, and exposes the results through an interactive Streamlit web application with a business ROI calculator.)

It includes a data pipeline, advanced machine learning algorithms (with hyperparameter tuning), explainable AI (SHAP), MLOps (AWS S3, Docker), and business value (ROI calculator = financial results).

FUN FACT: For this project I used my own evaluation metric library (it standardizes supervised and unsupervised model diagnostics into a single, consistent API), which is also verified and published on PyPI.

Project Info: https://github.com/Alkiviadisss/SpaceX


u/Senior-Neck499 — 2 days ago

▲ 11 r/MLQuestions+1 crossposts

Neural network projects

Can you suggest some neural network projects for a beginner?

I recently built a handwritten digit predictor.

Now I want to start a new project. Can you give me some suggestions?


u/Ok_Second2105 — 2 days ago

▲ 3 r/MLQuestions+3 crossposts

Should I do more training for the Number guessing model?

I did a project on making and training a number-guessing reinforcement learning model.

I ran 140k episodes, and it started to show degradation in success rate due to the model being a standard DQN and not a Double DQN. Should I train it more to see the maximum success rate the model can achieve? What do you think, and how long should I train it for? Number Guessing RL Model
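For context, the difference between the two variants is in how the bootstrap target is computed. The sketch below is a simplified single-transition illustration, assuming the Q-values for the next state are already available as arrays; the function and argument names are made up for the example and are not taken from the project.

```python
import numpy as np

def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_target))

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = int(np.argmax(next_q_online))
    return reward + gamma * float(next_q_target[best_action])
```

Taking the max over the same noisy estimates that are being evaluated tends to overestimate values, which is the issue Double DQN was designed to reduce. Whether that explains the degradation seen here is something only an experiment with both targets would show.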

u/Kooky_Golf2367 — 2 days ago

▲ 17 r/MLQuestions

Tesla ML Interview Prep

I have an interview for the Tesla Optimus team as an intern specifically doing machine learning and reinforcement learning stuff. I've not been told what the interview will be about, only that I will be programming in Python. I've been preparing for it through a number of different ways:

Implementing various algorithms (MLP, various optimizers and regularization methods, CNN, forward pass, backward pass, etc.) from scratch using just NumPy and PyTorch, with a heavy emphasis on vectorizing everything
Going over the math for all the major ML architectures (MLP, CNN, RNN, Transformer, etc)
Going over the math for all popular RL algorithms (DQN, PPO, SAC)
Making sure I know everything on my resume

Is there anything else that I should be doing or looking at? I haven't really done any LeetCode as I assumed it wouldn't focus on my LeetCode skills, should I brush up on that as well? Any tips would be greatly appreciated!


u/Legal_Low_285 — 2 days ago

▲ 23 r/MLQuestions+2 crossposts

My ML project: Stellar Object Classification (Star, Galaxy, Quasar)

Hello, I'm Shrushti!

I recently completed a machine learning project that classifies astronomical objects as Stars, Galaxies, or Quasars using the Sloan Digital Sky Survey (SDSS) dataset.

Github: https://github.com/sharmashrushti/stellar-object-classification

I'd really appreciate any feedback for improving the project. Thank you!

u/5BeautifulSoup — 3 days ago

▲ 0 r/MLQuestions

I wrote a dataset by hand and trained 500 examples to have a simple conversation .. im starting to think that u can write everything a language model knows and make it do specific tasks without ever having to scrape the internet .. is this possible?

Here is the full convo below

user: wassup bro?

assistant: nothing much .. what u up to?

user: im chillin .. what u thinking about?

assistant: im thinking about the universe .. what about you .. what are you thinking about?

user: im thinking about u bro im tryna undertstand how u can talk

assistant: about those topics but im still bro.

user: what topics do u know

assistant: i know about space, animals, plants, humans, and ai.

user: tell me about plants and animals

assistant: i know a little about animals. Animals usually get energy by eating plants, other animals, or both. Many animals have senses that help them find food, avoid danger, and communicate. The fun part is how many survival strategies animals have.

u/Wvy_World — 3 days ago

▲ 2 r/MLQuestions

As a total newbie - What AI should I use to create the best images?

I asked Google this question, and of course the Google AI told me that Google was the best...

I want to take a video game character and recolor/redesign some of the armor sets in different poses.


u/hulubuloo — 3 days ago

▲ 3 r/MLQuestions

High school student looking for feedback on my NLP mental health sentiment analysis project

Hi everyone, I am a high school student currently exploring NLP, and I recently did a project surrounding mental health sentiment analysis of textual input. I designed a model to analyze text data and evaluate the underlying emotional sentiment, and would like to get some professional feedback from this community.

As I'm still learning the ropes of NLP & ML, I would love to get any kind of input surrounding how I can improve the project (in terms of model selection, use cases, etc.) and how I could possibly expand on it. Any advice or constructive criticism you can offer would be incredibly valuable to me. Thanks!

The code & implementation details are on GitHub at https://github.com/JaukG9/mental-health-sentiment-analysis, and the site is live at https://jaukg9.github.io/mental-health-sentiment-analysis.

u/Time_Perception5834 — 3 days ago

▲ 5 r/MLQuestions+3 crossposts

Knowledge distillation for time series forecasting

I was wondering if there is a proven technique that works for knowledge distillation in the context of time series forecasting.

I have been trying alignment in the latent space with the Frobenius norm of Gram matrices as alignment loss, but results are not that impressive so far.

Any recommendations? Thanks!
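To make the setup concrete, the alignment loss I am describing looks roughly like the sketch below. It assumes student and teacher latents of shape (batch, dim) and L2-normalizes each row before building the Gram matrices; that normalization and the function name are assumptions for illustration, not a claim about the best formulation.

```python
import numpy as np

def gram_alignment_loss(student_feats, teacher_feats):
    """Frobenius norm of the difference between student and teacher Gram matrices.

    Inputs have shape (batch, dim). The two dims may differ, because each
    Gram matrix has shape (batch, batch).
    """
    def normalized_gram(feats):
        # L2-normalize rows so the loss ignores feature scale (an assumption)
        feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
        return feats @ feats.T

    diff = normalized_gram(student_feats) - normalized_gram(teacher_feats)
    return float(np.linalg.norm(diff, ord="fro"))
```

Because the Gram matrices compare samples with each other, the student's latent dimension does not need to match the teacher's, so no projection layer is needed for this term.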


u/Pazigoo36 — 3 days ago

▲ 6 r/MLQuestions

Medical student looking to break into ML for translational medicine research

Hi everyone,

I'm currently a medical student with a long-term goal of pursuing a PhD in a top lab working on machine learning applications in translational medicine and healthcare.

Right now, I know the basics of ML. I've completed a few Coursera courses, implemented some personal projects, and have basic Python experience. However, I'm struggling to figure out how to take the next step. I want to build the kind of skills and portfolio that would make me competitive for world-class research labs.

For those of you working in ML for healthcare, computational biology, or related fields, what would you recommend focusing on? Should I prioritize open source contributions, reproducing papers, Kaggle, research internships, reading papers, or something else?

Also, if anyone here works in this space, I'd love to connect, learn from your experience, and see if there might be opportunities to collaborate on research or open source projects.

Thanks in advance!


u/Slight-Tap-7344 — 3 days ago

▲ 2 r/MLQuestions

Feedback wanted: an adaptive "learner model" for SWE→ML transitions, built on existing content instead of a new curriculum

Hey everyone!

I am a secondary student working on an AI-driven, dynamic learning platform for software engineers upskilling to AI/ML roles.

It has two main features, adapted to this specific task.

Through diagnostics (such as novel problems, asking the user to explain concepts, and other techniques that you might see in a job interview for example) it develops a detailed learner model of the depth of user’s understanding on a topic-by-topic basis, visualised in a colour-coded graph so that the user can aggressively attack their weaknesses and develop proper skill and understanding.
World-class content is already publicly available online. Instead of investing 100s of hours into experts authoring new content, the platform curates tried-and-tested content made by the very best in the field to form a curriculum. My impression is that AI/ML roles require ever-changing skills, and this architecture would allow the curriculum to be able to adapt extremely quickly, with comparable or sometimes even higher quality content than what would be available with static curriculums.

I thought that this wonderful community of developers would be a great place to validate the idea, so for those who:

Have transitioned from software engineering to AI/ML
Are currently transitioning
Are planning to switch roles
Or if you’ve used upskilling services whatsoever

Would this help you?

Any feedback would be greatly appreciated; Thanks in advance.

(P.S. I am planning to make it subscription based, something around €20 / month. )


u/Hungry-Sign5037 — 3 days ago

▲ 7 r/MLQuestions

How to break into AI/ML/Data roles with a Bachelor’s degree?

I'm an AI-focused ICE (Informatics and Communication Engineering) Bachelor’s student. I've been seeing a lot of AI/ML/data science roles, even internships, requiring a master's degree or higher.

What roles in the AI field usually hire Bachelor’s degree holders?


u/gracekebbe — 4 days ago

▲ 4 r/MLQuestions+1 crossposts

Improving machine-translated novels via style transfer — looking for advice on the faithfulness/fluency tradeoff [P]

Hey all.

I recently started working on a project to improve machine-translated webnovels via style transfer. The basic idea is to take the clunky translated prose and rewrite it to something that reads like it was written by a professional author, while remaining as faithful as possible to the original text.

The source material is mostly amateur/MTL output full of direct sentence structure translations carried over from Chinese, awkward honorifics, over-translated idioms, that kind of thing. The goal isn't retranslation from the source but a cleanup of the English output.

The tricky part is that I have no clean paired data for supervised approaches.

I've been looking at a few directions:

Fine-tuning on target-style prose — collect high-quality English novels, fine-tune a small LLM to rewrite in that register.
Just use a local LLM — run a local LLM and provide it with guidelines on what to rewrite and leave the same. No fine-tuning or anything needed, just hoping the transformer can handle it.

A few things I'm stuck on:

Is the faithfulness/fluency tradeoff actually manageable at the sentence level, or do I need paragraph-level context or more to preserve narrative coherence?
How do people handle domain-specific terms like terminology and catchphrase-type things that need to survive the rewrite unchanged? Hard constraints during decoding, or just hope the model learns to leave them alone?
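Whichever approach ends up doing the rewriting, a simple post-hoc check can at least flag outputs where a protected term was dropped or altered. The sketch below is a plain string-count comparison, not constrained decoding; the function name and the idea of a hand-maintained glossary are assumptions for illustration.

```python
def missing_terms(source: str, rewrite: str, protected_terms):
    """Return protected terms that occur fewer times in the rewrite than in the source."""
    missing = {}
    for term in protected_terms:
        lost = source.count(term) - rewrite.count(term)
        if lost > 0:
            missing[term] = lost  # number of occurrences that did not survive
    return missing
```

A rewrite that returns a non-empty dict could be retried or sent back with the missing terms listed explicitly in the prompt.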

Happy to hear about similar projects, relevant papers I might have missed, or just general lessons from working in this space. Thanks.


u/Divine_Invictus — 4 days ago