ยท5 min read

Two Years as an ML Engineer: From Research to Production

Tags: personal, career, ml

It's been about two years since I joined Myelin Foundry as an ML engineer. January 2020, fresh out of SRM in Chennai, landing in Bangalore with a suitcase and the confidence that only comes from never having deployed a model to production. Looking back now, mid-2021, the gap between who I was then and who I am now is kind of wild.

I want to write down what I've actually learned. Not the stuff you read in blog posts about "top 10 ML career tips." The real stuff.

Research is Not Production

This one hit me in the first month. At college, success meant: train model, get good metrics on test set, write report, done. At Myelin, getting good metrics on the test set was maybe 15% of the job.

The other 85% was:

  • Making the model small enough to run on a phone
  • Handling edge cases in real-world data that your clean dataset didn't have
  • Building data pipelines that don't break at 3am
  • Writing inference code that doesn't leak memory over 72 hours of continuous running
  • Explaining to non-technical stakeholders why the model is "only" 92% accurate

That last one is underrated. I spent more time in my second year explaining ML trade-offs to product managers than I spent training models. And honestly, that communication skill has been more valuable than knowing how to tune hyperparameters.

Deployment is 80% of the Work

My first project was a super-resolution model. Getting the model to work took about two weeks. Getting it deployed to mobile, optimized, tested across different devices, handling failures gracefully, and shipping it to users? Three months.

Quantization, TFLite conversion, WebGL compatibility, memory profiling, battery impact testing, A/B testing the output quality. Each of these is a rabbit hole. And each one can block your entire release.
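To make the quantization step concrete, here's a minimal sketch of post-training dynamic-range quantization with the TFLite converter. The tiny Keras model is a hypothetical stand-in for a real super-resolution network; the conversion calls themselves are standard TensorFlow 2 API.

```python
import tensorflow as tf

# Hypothetical toy model standing in for the real super-resolution network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
])

# Post-training dynamic-range quantization: weights are stored as int8
# in the flatbuffer, shrinking the model for on-device deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()  # serialized .tflite model as bytes
```

Writing `tflite_bytes` to a `.tflite` file is all it takes to hand off to the mobile runtime; the hard part, as the paragraph above says, is everything that comes after, profiling it on real devices.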

The thing is, nobody teaches you this in college. ML courses end at model.evaluate(). The real world starts after that.

The Tools That Actually Matter

In my first year, I thought the important tools were PyTorch, TensorFlow, and Jupyter notebooks. Now I'd rank my daily toolkit differently:

  1. Git (version control for everything, including data pipelines)
  2. Docker (because "works on my machine" doesn't fly)
  3. TFLite/ONNX (model deployment formats)
  4. Grafana + Prometheus (monitoring deployed models)
  5. SSH (because you will be debugging on remote devices at odd hours)
  6. TensorFlow/PyTorch (for actual model work)

The model framework is #6. Not because it's unimportant, but because everything above it is what separates a working prototype from a working product.

What Changed Personally

Moving from Chennai to Bangalore was a shift. The first few months in the PG were lonely, honestly. New city, no college friends around, and the pandemic hit two months after I started. I worked from a 10x10 room, with Swiggy deliveries as my main social interaction for a while.

But Bangalore grew on me. The tech culture, the conversations at chai stalls about startups and career plans. My roommate and I would stay up talking about whether to go for MS or keep working. (I eventually decided MS, which is a story for another time.)

The Myelin office, when we could go in, was great. Small team, everyone knew everyone. You could walk up to the CTO and ask about architecture decisions. That kind of access accelerated my learning more than any online course.

Lessons I'd Tell Past Me

Read production code, not just papers. Papers tell you what's possible. Production code tells you what's practical. The gap is enormous.

Debug systematically. Early on, I'd change three things at once when something broke and have no idea what fixed it. Now I change one thing, test, repeat. Boring but effective.

Write things down. I started maintaining a work journal in my second year. Just a few bullet points each day about what I worked on, what broke, what I learned. Looking back at it now, it's the most valuable thing I own professionally.

Invest in your deployment skills early. The ML engineer who can also deploy, monitor, and debug in production is 10x more valuable than one who can only train models. This isn't just my opinion. I've seen the hiring conversations.

What's Next

I'm wrapping up my time at Myelin later this year. Two years of building real ML systems, from super-resolution to anomaly detection to browser inference. It's been the best possible start to a career.

I'm headed to grad school next year. And the funny thing is, I think I'll get way more out of it now than I would have straight out of undergrad. Because now I know what problems actually matter in practice. I know what questions to ask. And I know that the model is never the hard part.