AI tools are the most powerful thing to hit software development in a decade. They're also the easiest way to build a career on a foundation of sand. Here's what I cover when I talk to a group of junior developers — honest, practical, and no hype.
I recently sat down with a group of junior developers to talk through AI in development — how to use it, how not to use it, and what nobody in a sales pitch will tell you.
This post covers the same ground in writing.
TLDR: AI is no longer just autocomplete — it's actively replacing engineers who execute tasks without thinking. The developers who stay relevant are the ones who bring ideas, judgment, and ownership to the table, not just working hands. Use AI to amplify your thinking, not substitute for it. Understand everything you ship. Guard security-sensitive code. And start building the habit of being the person in the room with the ideas — because that's the job that's actually hard to automate.
First, let's be honest about what AI tools actually are
Here's the framing I held onto in my own head for too long: AI is just a fancy autocomplete.
It was a useful mental model early on. It kept people from over-trusting AI output. But it's no longer accurate, and holding onto it is starting to hurt more than it helps.
Today's AI tools — GitHub Copilot, ChatGPT, Claude, and especially agentic tools like Claude Cowork with Computer use — don't just complete lines of code. They write entire features, explain architectures, debug errors, manage files, fill forms, browse the web, and execute multi-step workflows on your machine autonomously. That's not autocomplete. That's a junior engineer who never sleeps and doesn't ask for a salary.
And they are getting better, fast.
My honest view: AI is on a path to replace engineers who primarily execute tasks — developers who take a ticket, look up how to do it, write the code, and ship it. That workflow is increasingly something a well-prompted AI can do end-to-end. If your value is in the execution, that value is compressing.
What AI cannot replace — at least not yet, and not easily — is the engineer who thinks. The one who asks why the ticket exists before writing a line. Who spots the architectural problem three sprints ahead. Who disagrees with the product decision in the right meeting. Who takes ownership of a system and actually cares what happens to it.
That's the version of the job worth building toward. And it starts with how you use AI right now.
That distinction between executing and thinking matters more than anything else in this entire post.
When you use a calculator, you understand that it computes arithmetic correctly and that the responsibility for setting up the right equation is yours. AI coding tools demand the same discipline — except they look far more confident, produce far more complete-looking output, and fail in ways that aren't immediately obvious. Often in exactly the ways that matter most.
That's not a reason to avoid them. It's a reason to use them with your brain fully on.
The trap that catches most junior engineers
Here's a pattern I've seen dozens of times. A junior developer gets stuck on something — a tricky SQL query, an unfamiliar API, a confusing error message. They paste it into ChatGPT, get a complete-looking answer, paste it into their code, it works locally, they ship it.
Two weeks later there's a production incident. The query has an N+1 problem that only shows up at scale. Or the error handling swallows exceptions silently. Or — worst case — there's a security vulnerability.
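To make that N+1 failure concrete, here's a sketch with hypothetical in-memory "queries" standing in for database round-trips — the names and data are invented, but the shape of the bug is exactly what a real ORM produces:

```typescript
// Hypothetical tables, standing in for a real database.
type Post = { id: number; authorId: number };
type Author = { id: number; name: string };

const posts: Post[] = [
  { id: 1, authorId: 10 },
  { id: 2, authorId: 11 },
  { id: 3, authorId: 10 },
];
const authors: Author[] = [
  { id: 10, name: "Ada" },
  { id: 11, name: "Grace" },
];

let queryCount = 0; // counts simulated round-trips to the database

function queryAuthorById(id: number): Author | undefined {
  queryCount++; // one round-trip per call
  return authors.find((a) => a.id === id);
}

function queryAuthorsByIds(ids: number[]): Author[] {
  queryCount++; // one round-trip total, e.g. WHERE id IN (...)
  return authors.filter((a) => ids.includes(a.id));
}

// The N+1 shape: one query for the posts, then one more per post.
function namesNPlusOne(): string[] {
  return posts.map((p) => queryAuthorById(p.authorId)?.name ?? "unknown");
}

// The batched shape: collect the ids, fetch them all in one query.
function namesBatched(): string[] {
  const ids = [...new Set(posts.map((p) => p.authorId))];
  const byId = new Map(
    queryAuthorsByIds(ids).map((a) => [a.id, a] as [number, Author])
  );
  return posts.map((p) => byId.get(p.authorId)?.name ?? "unknown");
}
```

With three posts the naive version issues three author queries; with ten thousand posts it issues ten thousand. The batched version issues one either way — which is exactly why this bug passes local testing and surfaces only at scale.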
The problem isn't that they used AI. The problem is that they used AI as a replacement for understanding rather than as an accelerant for understanding.
The difference is subtle but consequential. If you ask AI to solve a problem and you don't understand the solution, you now have two problems: the original problem, plus the fact that you've deployed code you can't reason about.
What AI is actually good for
Let me be specific, because "use it as a tool, not a crutch" is advice so vague it's useless.
Boilerplate and scaffolding. Writing the third CRUD endpoint of the day, setting up a new project, generating TypeScript interfaces from a JSON shape — AI is genuinely great here. This is low-risk, low-thinking work that doesn't require deep understanding to validate.
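For a sense of what that scaffolding looks like: given a JSON payload, AI will generate the matching interface and a runtime guard in seconds. The `User` shape here is invented for illustration:

```typescript
// A JSON payload you might get back from an API:
// { "id": 7, "name": "Sam", "email": "sam@example.com", "tags": ["admin"] }

// The interface AI tools will happily generate from that shape:
interface User {
  id: number;
  name: string;
  email: string;
  tags: string[];
}

// ...plus a runtime guard, since interfaces vanish at compile time:
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "number" &&
    typeof v.name === "string" &&
    typeof v.email === "string" &&
    Array.isArray(v.tags) &&
    v.tags.every((t) => typeof t === "string")
  );
}
```

This is the sweet spot: mechanical, verifiable at a glance, and tedious to type by hand.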
First drafts to react to. Staring at a blank file is cognitively expensive. Getting a rough draft from AI — even a mediocre one — gives you something to critique and improve. This is faster than starting from scratch and forces active thinking rather than passive copying.
Explaining unfamiliar code. Paste a piece of code you don't understand and ask it to explain what it does. This is one of the highest-value uses, especially when reading a new codebase. Combine it with your own reading — don't replace one with the other.
Writing tests. AI is quite good at generating unit tests for functions you've already written. The tests still need reviewing, but the tedium of writing expect(fn(2)).toBe(4) across fifteen cases is real, and AI handles it well.
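As a sketch of what that generation looks like — a trivial function and the kind of case table AI produces for it. This version is framework-free so it runs anywhere; the same shape works in Jest with `expect(...).toBe(...)`:

```typescript
// The function under test — deliberately trivial.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, "");    // strip leading/trailing dashes
}

// The case table AI writes in seconds — which you still review,
// because a wrong expected value here just locks in a bug.
const cases: Array<[input: string, expected: string]> = [
  ["Hello World", "hello-world"],
  ["  padded  ", "padded"],
  ["Already-slugged", "already-slugged"],
  ["Symbols & Spaces!", "symbols-spaces"],
  ["", ""],
];

for (const [input, expected] of cases) {
  const actual = slugify(input);
  if (actual !== expected) {
    throw new Error(
      `slugify(${JSON.stringify(input)}): got "${actual}", want "${expected}"`
    );
  }
}
```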
Documentation and commit messages. Summarizing what a function does, writing a PR description, drafting a README section. Low-stakes, high-friction work that AI handles well.
Rubber duck debugging. Explaining your problem to AI often helps you spot the answer yourself before the response even comes back — except this is a rubber duck that occasionally says something useful.
Agentic workflows for real desktop tasks. This one surprised me when I first used it seriously. Tools like Claude Cowork go beyond answering questions — they can take action on your behalf. The Computer use capability lets Claude actually see your screen, interact with applications, and complete multi-step tasks: reorganising files, filling in forms, navigating UIs, running terminal commands. I've used it to automate the kind of tedious setup and file management work that used to eat 20–30 minutes out of a focused work block.
The key distinction here is agency — the AI isn't just returning text, it's acting. That makes the trust model different. You want to stay in the loop, review what it's doing, and only hand off tasks you understand well enough to catch mistakes. But for repetitive, well-defined work on your own machine, it's genuinely powerful.
Open source AI: the most interesting space right now
If the commercial tools are the well-lit highway, open source AI is the dirt road that goes somewhere nobody else has been yet. And right now, that dirt road is where a lot of the genuinely interesting stuff is happening.
Tools like Ollama, LM Studio, Open WebUI, LocalAI, and models like Llama, Mistral, and DeepSeek have put serious inference capability on your own hardware — no API key, no usage cost, no data leaving your machine. That last point matters enormously for professional work. You can run a capable model locally against your actual codebase, your internal documentation, your proprietary data — things you'd never paste into a public API.
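Here's a minimal sketch of what "no API key, no data leaving your machine" looks like in practice, against Ollama's local REST API. Ollama listens on `localhost:11434` by default; the model name `llama3` is an assumption — use whatever you've actually pulled:

```typescript
// Build a request body for Ollama's local /api/generate endpoint.
// Model name is an assumption — substitute whatever you've pulled.
function ollamaRequest(
  model: string,
  prompt: string
): { url: string; body: string } {
  return {
    url: "http://localhost:11434/api/generate", // Ollama's default port
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

const req = ollamaRequest("llama3", "Explain what this regex does: ^-+|-+$");

// With Ollama running locally, send it — nothing leaves your machine:
// const res = await fetch(req.url, { method: "POST", body: req.body });
// const { response } = await res.json();
```

That's the whole integration surface: a local HTTP endpoint you can point any internal tool at, with your code and prompts never touching a third-party server.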
The freedom this unlocks for experimentation is real. Want to fine-tune a model on your codebase's patterns? Run inference in a completely air-gapped environment? Build a custom tool that integrates a model directly into your internal workflow without routing anything through a third-party server? Open source is the only path to most of that.
But open source AI comes with a specific set of caveats that matter more here than in most other domains.
Models can be misconfigured silently. A quantized model running at 4-bit precision behaves differently from the full-precision version. Prompt formats vary between model families and even between versions of the same model — using a Llama 2 prompt format with a Llama 3 model will give you degraded, sometimes bizarre output. These failures don't throw errors. They just produce worse results quietly.
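To make the prompt-format point concrete, here's roughly what the two families expect — templates paraphrased from the model cards, so check the card for the exact version and tokens you're running:

```typescript
// Llama 2 chat expects [INST] ... [/INST] markers with an inline
// <<SYS>> block (approximate template — verify against the model card).
function llama2Prompt(system: string, user: string): string {
  return `<s>[INST] <<SYS>>\n${system}\n<</SYS>>\n\n${user} [/INST]`;
}

// Llama 3 instruct uses header-delimited turns instead
// (again approximate — verify against the model card).
function llama3Prompt(system: string, user: string): string {
  return (
    `<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n` +
    `${system}<|eot_id|>` +
    `<|start_header_id|>user<|end_header_id|>\n\n${user}<|eot_id|>` +
    `<|start_header_id|>assistant<|end_header_id|>\n\n`
  );
}
```

Feed the wrong template and nothing errors — the model just sees stray tokens it was never trained on, and quality quietly degrades. That's the failure mode to watch for.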
Safety guardrails vary wildly. Commercial models have been extensively red-teamed and tuned for safe outputs. Open source models range from carefully fine-tuned to essentially unfiltered. Know what you're running. Know how it behaves on adversarial inputs — especially if it's going anywhere near user-facing code.
Documentation is often incomplete or out of date. This is the nature of fast-moving open source projects. The README was accurate three months ago. The model card might not reflect the latest checkpoint. The config option you're relying on might have been deprecated in the version you're actually running.
This is why community is non-negotiable. The official documentation is your starting point, not your ending point.
Go to the dedicated Discord servers for whatever tool you're using. Read the #announcements and #troubleshooting channels before you ask a question — the answer is often already there. Join the Slack communities for enterprise-adjacent open source projects like LangChain, LlamaIndex, and Hugging Face. Spend time in the relevant Reddit threads — r/LocalLLaMA in particular has a high signal-to-noise ratio for practical local inference questions.
Read, then read again. Then participate. File a bug when you find one. Ask a specific, well-formed question when you're stuck. Answer someone else's question when you know the answer. Open source AI is a community resource, and communities work when people treat them that way.
And when you find something that works — a prompt pattern, a configuration trick, a workflow that improves your output quality — contribute it back. Write it up. Post it. Open a PR to improve the docs. The ecosystem moves fast partly because practitioners share what they learn.
The engineers I've seen get the most out of open source AI are the ones who treat it as a craft with a community around it, not just a free alternative to the paid tools. They know the GitHub issues for their core tools. They've read the model cards. They follow the researchers. They understand what they're actually running.
That depth pays off — not just in better results, but in the ability to debug when things go wrong, and to innovate when everyone else is still waiting for a feature to land in a commercial product.
OpenClaw: the open source AI agent worth paying attention to
If you want one concrete example of where open source AI innovation is outpacing commercial tooling right now, look at OpenClaw.
Built by Austrian developer Peter Steinberger and first published in late 2025, OpenClaw hit 247,000 GitHub stars and nearly 48,000 forks in roughly 60 days. For context, React took a decade to reach comparable numbers. That kind of velocity isn't hype — it means tens of thousands of developers found it immediately useful and told other developers about it.
So what is it? OpenClaw is a local-first AI agent gateway that turns any large language model into a persistent, autonomous personal assistant — reachable through the messaging platforms you already use. WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams, Matrix — it supports all of them. You configure it once, run it on your own hardware, point it at your model of choice (Claude, DeepSeek, Ollama, GPT), and your AI assistant lives in your existing communication layer. No browser tab. No separate app. No data routed through a cloud you don't control.
For developers, this opens up genuinely interesting territory. You can run it against a local Ollama model for zero API cost. You can give it tools — browser, cron jobs, canvas, multi-agent routing. You can hook it into your own workflow without any of it touching a third-party server. There's an active ecosystem of community-built agent templates (awesome-openclaw-agents on GitHub has 160+ production-ready configs) and the multi-agent orchestration tooling is maturing fast.
The caveats are real though, and worth naming:
It moves fast. The project was renamed twice in its first three months (from Clawdbot to Moltbot to OpenClaw). The creator joined OpenAI in February 2026 and a non-profit foundation is now taking over stewardship. That's not a red flag — but it is a reminder that governance and long-term maintenance of open source projects at this growth rate is genuinely hard. Read the GitHub discussions, not just the README.
Setup has sharp edges. Node 24 is recommended; Node 22.14+ is the minimum. Running on an older version produces degraded behaviour without obvious errors. The documentation is good but hasn't always kept up with releases — the AGENTS.md file in the repo is often more current than the official docs.
Understand what you're running before you run it. OpenClaw has access to your messaging platforms, your tools, potentially your files. It executes on your behalf. That's powerful. It also means misconfiguration has real consequences. Read the architecture docs. Understand the permission model. Know what each plugin you enable actually does.
The community is the right place for all of this. The OpenClaw Discord and GitHub discussions are active and high-quality. r/LocalLLaMA regularly has threads on OpenClaw setups and configurations. If something is broken or confusing, someone in those spaces has probably already hit it — and if they haven't, filing a clear issue or writing up what you found is a contribution the whole community benefits from.
OpenClaw is a good example of the kind of open source AI project worth investing time in right now. Not because it's perfect, but because it's exploring a real idea — AI agents embedded in your existing communication layer, running on your own infrastructure — faster than anyone else. The engineers who understand these tools deeply, including their limits, will be the ones who know how to build with them when everyone else is still reading the announcement post.
What AI is bad at (and why it matters)
Security-sensitive code. Authentication flows, authorization checks, cryptographic operations, input sanitization — in these areas, AI confidently generates code with real vulnerabilities. I have personally reviewed AI-generated authentication middleware that would have allowed token forgery. The code looked fine at a glance.
The rule I give engineers on my team: if the code you're writing touches auth, permissions, payments, or user data — you write it yourself, you review it against the actual spec, and you get a second pair of eyes. AI can suggest; you own every line.
Architecture decisions. AI will answer "how should I structure this?" but it has no context about your team's skills, your system's actual bottlenecks, your deployment constraints, or what's already there. The answer will be technically coherent and often wrong for your situation. This kind of judgment only comes from understanding the full picture.
Debugging complex system interactions. AI can suggest reasons why something might be failing, but it can't observe your running system. Distributed systems bugs, race conditions, memory leaks — these require tooling, observability, and methodical elimination. AI is a hypothesis generator at best here, not a debugger.
Novel or domain-specific problems. AI training data skews heavily toward common patterns. If your problem is unusual — a niche library, an unusual performance constraint, a specific regulatory requirement — AI's suggestions will often be generic solutions that don't fit.
The rules I actually follow
After seven years building production systems, here's how I use these tools day-to-day:
I never paste code I haven't read into production. Doesn't matter where it came from. Every line in a pull request is code I'm personally vouching for. If I used AI to write something, I still read it, understand it, and am prepared to explain it in a code review.
I never paste secrets, API keys, or private code into public AI tools. This should be obvious but it isn't, apparently. Several high-profile data leaks have come from developers pasting internal code into ChatGPT. If your company has a policy on AI tools, read it. If it doesn't, treat public AI services as genuinely public.
I verify claims about libraries and APIs. AI confidently cites non-existent functions, deprecated APIs, and incorrect method signatures. Always cross-reference against the actual documentation. This takes 30 seconds and saves hours.
I use AI more for the how than the what. Deciding what to build, what the architecture should be, what the right abstraction is — these require judgment, context, and often conversation with teammates. AI is useful for the how: once I know what I'm building, AI can help me write it faster.
A note specifically for people starting out
There is a version of the next two years where you use AI to get things done quickly, you develop a reputation for shipping, and you never really build the mental models that make a senior engineer actually valuable.
This is a real risk. I've seen it.
The engineers who grow fastest are the ones who use AI to go deeper — when AI generates code they don't understand, they stop and figure out why it works before moving on. When AI suggests a pattern, they ask themselves whether they'd have known to reach for that pattern without the suggestion.
The goal isn't to prove you can write everything from scratch. It's to make sure that when something breaks at 2am in production — and it will — you can reason about what's happening. That skill doesn't come from having shipped a lot of AI-generated code. It comes from having understood what you shipped.
Where this is all going
I'll be direct about where I think this lands: the engineers who will struggle in the next five years are the ones whose primary contribution is task execution. If your job is "take requirement, implement it, open PR" — that loop is getting automated. Not hypothetically. It's happening now.
The engineers who will thrive are the ones who operate at a level AI genuinely can't reach yet: defining what to build, understanding why something should exist, making trade-offs under uncertainty, taking real ownership of outcomes, and thinking beyond the ticket in front of them.
The best engineers I know have already made this shift. They use AI to handle the mechanical, repeatable parts of their work — the boilerplate, the test scaffolding, the routine file management — and they use the freed-up space to think harder about the things that actually matter. The architecture. The user. The long-term consequences of today's decisions.
That's not "AI replaces developers." That's "AI raises the floor, and the ceiling belongs to whoever thinks the hardest."
The habits you build right now determine which side of that line you're on.
Written after a conversation with a group of junior developers entering the industry. If you found it useful and want to talk about anything here, my inbox is open.
Prefer this live? I'm running a session on this exact topic for junior engineers and developers. If you'd like to join, reserve your spot here.