How COVID Is Accelerating On-Device AI
It's been about two weeks since India went into full lockdown. I'm working from my PG in Bangalore, juggling Zoom calls on choppy WiFi, ordering everything on Swiggy, and trying to pretend this is all normal. Amid the chaos, I've been thinking about how this pandemic is going to accelerate certain trends in tech, particularly on-device AI.
The Bandwidth Crunch
With half the world suddenly working from home, internet infrastructure is under serious stress. Streaming services are reducing quality. Video calls drop constantly. Cloud API response times are noticeably up.
If your product relies on shipping data to the cloud for inference, this is a problem. Users with shaky connections (which is basically everyone on Indian broadband right now) are getting degraded experiences or no experience at all.
Models that run on-device don't care about your WiFi situation.
Privacy Got Real Overnight
Health data is suddenly everywhere. Temperature screenings, contact tracing apps, symptom checkers: all of them collect sensitive information, and people are rightly worried about where it goes. Aarogya Setu launched and the privacy debates started immediately.
On-device inference means the data never leaves the phone. Apple and Google's contact tracing API was designed with this in mind. I think we'll see more products adopt this pattern, not just for health but across the board.
The Products That Are Winning Right Now
Look at what's actually working during lockdown:
- Google Translate's offline mode, on-device NMT models
- Face filters on Instagram/Snapchat, on-device face detection and rendering
- Keyboard predictions, federated learning with local models
- Camera night mode, computational photography running entirely on-device
None of these need the cloud. They work in airplane mode. That resilience is going from "nice to have" to "essential."
What This Means for ML Engineers
If you're building ML products, start thinking about the deployment target early:
- Can this model run on-device? If so, default to that.
- What's the minimum model size? Quantization, pruning, knowledge distillation. Learn these.
- What's the fallback? If cloud is unavailable, does your app still function?
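On the model-size question, a minimal sketch of what shrinking a model can look like in practice, using TensorFlow Lite's post-training dynamic-range quantization (the toy two-layer network here is a stand-in for whatever you've actually trained):

```python
import tensorflow as tf

# Toy stand-in model; in practice this would be your trained network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Baseline float32 conversion.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
float_model = converter.convert()

# Dynamic-range quantization: weights stored as int8,
# roughly 4x smaller for dense layers, no calibration data needed.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quant_model = converter.convert()

print(len(float_model), len(quant_model))
```

Dynamic-range quantization is the lowest-effort option; full integer quantization buys more speed on mobile hardware but needs a representative dataset for calibration.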
I've been spending most of my time recently making models smaller and faster -- quantization, pruning, and distillation are becoming core skills, not nice-to-haves. And honestly, the constraints make the engineering more interesting, not less.
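Of the three, distillation is the least obvious from the name alone, so here's a rough sketch of the standard loss: the small student model is trained to match both the true labels and the teacher's softened output distribution. The temperature `T` and mixing weight `alpha` are illustrative defaults, not recommendations:

```python
import tensorflow as tf

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: student mimics the teacher's temperature-softened
    # distribution; the T**2 factor rescales the softened gradients.
    soft = tf.keras.losses.KLDivergence()(
        tf.nn.softmax(teacher_logits / T),
        tf.nn.softmax(student_logits / T),
    ) * (T ** 2)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(
        labels, student_logits)
    return alpha * soft + (1 - alpha) * hard
```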
Silver Lining
Terrible as this pandemic is, it's forcing the industry to build more resilient, privacy-respecting, bandwidth-efficient systems. The models we build now will be better because of these constraints. I wrote about why edge ML is the future a couple months ago, and COVID is making that argument for me faster than any roadmap could.
Stay safe out there. Wash your hands. And maybe learn TFLite while you're stuck at home.
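If you want a first TFLite exercise for the lockdown, here's a self-contained sketch of the whole loop: convert a (toy, placeholder) model and run it with the interpreter, no network required. On a phone you'd load a `.tflite` file instead of in-memory bytes:

```python
import numpy as np
import tensorflow as tf

# Build and convert a toy model so the example is self-contained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Run inference entirely locally with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(1, 8).astype(np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
y = interpreter.get_tensor(out["index"])
print(y.shape)  # (1, 4)
```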