GitHub Copilot Is Wild: First Impressions from a Working ML Engineer

So I got into the GitHub Copilot technical preview last month. Installed the VS Code extension, opened a Python file, and within thirty seconds it autocompleted an entire function I was about to write. I just sat there staring at it like it had read my mind.

I've been using it for about two weeks now, mostly on Python ML code, and I have thoughts.

The Good Stuff

It's genuinely useful for boilerplate. Writing a PyTorch Dataset class? Start typing class CustomDataset and Copilot fills in __init__, __len__, and __getitem__ with reasonable defaults. Data loading, transforms, the whole thing. You still need to customize it, but the skeleton saves real time.
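
For reference, a skeleton of that shape looks roughly like this. It's a minimal sketch for illustration, not Copilot's literal output; the samples, labels, and transform arguments are placeholder names assumed here, and the random tensors at the bottom are stand-in data.

```python
import torch
from torch.utils.data import Dataset


class CustomDataset(Dataset):
    """Map-style dataset over in-memory samples (illustrative skeleton)."""

    def __init__(self, samples, labels, transform=None):
        # samples, labels, and transform are hypothetical placeholders.
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = samples
        self.labels = labels
        self.transform = transform

    def __len__(self):
        # Number of items the DataLoader can index into.
        return len(self.samples)

    def __getitem__(self, idx):
        # Fetch one sample, apply the optional transform, return (input, label).
        x = self.samples[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[idx]


# Stand-in data: 8 random feature vectors with integer class labels.
dataset = CustomDataset(torch.randn(8, 3), torch.zeros(8, dtype=torch.long))
```

The part you actually customize is usually __getitem__: that's where file reads, decoding, and augmentation end up.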

It understands context shockingly well. I was writing a training loop and had already defined my model, optimizer, and loss function above. Copilot autocompleted the loop with the correct variable names, .zero_grad(), .backward(), .step(), all in the right order. It picked up on my naming conventions and style.
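
The loop in question is the standard PyTorch pattern. Here's a minimal sketch of it, where the tiny linear model, the random data, and the names model, optimizer, and loss_fn are all assumptions for illustration, not the code from that session.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-ins: a tiny model, an optimizer, and a loss function defined "above".
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

# Made-up regression data: the target is the sum of the input features.
inputs = torch.randn(32, 4)
targets = inputs.sum(dim=1, keepdim=True)

losses = []
for epoch in range(20):
    optimizer.zero_grad()                    # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)   # forward pass and loss
    loss.backward()                          # backpropagate
    optimizer.step()                         # update the weights
    losses.append(loss.item())
```

The order matters: calling .step() before .backward(), or forgetting .zero_grad(), doesn't raise an error either, it just trains on the wrong gradients.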

Docstrings to code actually works. I wrote a docstring explaining "load a YAML config file and return a dictionary with model hyperparameters" and it generated the implementation. Correct imports and everything. I showed this to a couple of people at Myelin and everyone's reaction was the same wide-eyed silence.
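
For a docstring like that, an implementation along these lines is what you'd hope to get back. This is a sketch assuming PyYAML is installed; the function name load_config and the extra validation are illustrative choices, not what Copilot actually generated.

```python
from pathlib import Path

import yaml  # PyYAML


def load_config(path):
    """Load a YAML config file and return a dictionary with model hyperparameters."""
    with Path(path).open("r", encoding="utf-8") as f:
        # safe_load avoids constructing arbitrary Python objects from the file.
        config = yaml.safe_load(f)
    # An empty file parses to None; normalize that to an empty dict.
    if config is None:
        return {}
    if not isinstance(config, dict):
        raise ValueError(f"expected a mapping at the top level of {path}")
    return config
```

Calling load_config on a file containing "lr: 0.001" and "batch_size: 32" on separate lines would give back a plain dict with those two keys.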

The Not So Good Stuff

It confidently writes wrong code. I was working on a custom augmentation pipeline and Copilot suggested a function that looked perfect but had a subtle bug in how it handled image channels: it mixed up BGR and RGB ordering. That's the kind of thing that doesn't throw an error but silently produces garbage results. If you're a junior dev trusting Copilot blindly, that's dangerous.
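
This failure mode is easy to reproduce. The sketch below is not the function Copilot suggested; it uses a made-up 2x2 image and a hypothetical helper, red_channel_mean, to show how an array in BGR order (the order OpenCV's cv2.imread returns) passes straight through code that expects RGB without any error.

```python
import numpy as np

# Hypothetical 2x2 pure-blue image stored in BGR channel order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR


def red_channel_mean(rgb_image):
    """Mean of the red channel, assuming RGB channel order."""
    return float(rgb_image[..., 0].mean())


# Passing BGR data where RGB is expected raises nothing; blue is read as red.
wrong = red_channel_mean(bgr)

# Reversing the last axis converts BGR to RGB before the call.
right = red_channel_mean(bgr[..., ::-1])
```

Here wrong comes out as 255.0 and right as 0.0: same image, no exception, opposite answers. The shapes and dtypes are identical either way, so nothing downstream complains.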

ML-specific code is hit or miss. For standard PyTorch patterns, it's great. For anything custom or research-y, it drifts back to the most common version of whatever you're writing. I was implementing a modified loss function and Copilot kept suggesting the standard version, ignoring the modifications I clearly needed.

It slows you down sometimes. When Copilot suggests something wrong and you have to read through the suggestion, understand why it's wrong, and then write what you actually wanted, that's slower than just typing it yourself. There's a cognitive cost to evaluating suggestions.

The Meta Irony

Look, the thing that keeps making me laugh is the irony of the situation. I'm an ML engineer. I build and deploy neural networks for a living. And now a neural network is writing my neural network code. It's built on Codex, a descendant of GPT-3 (the 175B-parameter transformer) that was fine-tuned on a huge pile of public code from GitHub.

My day job involves squeezing models onto Raspberry Pis and into browsers. Copilot's underlying model probably needs a small data center to run. There's something poetic about that gap.

Honest Assessment

After two weeks, I'd say Copilot saves me maybe 15-20 minutes a day. Not life-changing, but noticeable. It's best for boilerplate, repetitive patterns, and standard implementations. It's worst for novel code, edge cases, and anything that requires domain-specific reasoning.

The real question isn't whether Copilot is useful today. It is. The question is what happens when the next version is 10x better. Because looking at the trajectory from GPT-2 to GPT-3 to Codex, that's not a hypothetical. It's a timeline. (Update: the future arrived. The jump from autocomplete to full vibe coding and autonomous agents like Claude Code was faster than I expected.)

I'm keeping it installed. But I'm also keeping my brain plugged in.
