How to Build Psychological Safety in Engineering Teams: What Actually Works
A junior engineer watched a deployment go sideways. The logs showed warnings. Metrics spiked. Something felt wrong. But she didn’t say anything—she’d been told to “be confident” and “own your decisions.” Stopping the deployment would mean admitting she wasn’t sure.
Three hours later, customers were calling support. The incident cost the company $50,000 and a week of trust rebuilding exercises that felt more like HR theater than real change.
I’ve seen this pattern repeat across multiple companies. The worst part? Most teams think they have psychological safety. They have posters about “no blame culture” and “fail fast.” But when production breaks, people still run away from the problem instead of toward it.
The Gap Between Slogans and Reality
Here’s what I kept seeing:
┌─────────────────────────────────────────────────────────────────┐│ SLIDE DECK ACTUAL BEHAVIOR ││ ────────── ─────────────── ││ "We have blameless culture" → "Who touched this?" ││ ││ "Everyone can speak up" → *Juniors stay quiet in mtgs* ││ ││ "We celebrate failure" → Postmortems assign owners ││ to "action items" ││ ││ "Trust your team" → Every PR needs 3 approvals ││ from staff engineers │└─────────────────────────────────────────────────────────────────┘The disconnect comes from treating psychological safety as a vibe instead of concrete practices.
What Actually Works: Four Concrete Practices
After years of trial and error, here are the specific practices that transformed teams I worked with:
Practice 1: The Deployment Stop Rule
The rule: Anyone can stop a deployment. No explanation required beyond “something doesn’t feel right.”
I was skeptical at first. Won’t this create chaos? Won’t developers abuse it?
Turns out, the opposite happened:
Before Stop Rule: After Stop Rule:────────────────────────────────────────────Month 1-6: 23 incidents Month 1-18: 4 incidents
Why? People raised Why? Issues caughtconcerns AFTER BEFORE reachingproduction production
Avg time to detect: Avg time to detect:4.2 hours 12 minutesA senior engineer on Reddit described this perfectly:
“Anyone could stop a deployment if something felt off, junior or senior. No explanation needed beyond ‘something doesn’t feel right.’ We had fewer incidents in 18 months than the previous team had in 6.”
The key insight: intuition is valid data. By the time someone can articulate exactly why something feels wrong, the deployment might already be in production. Trusting gut feelings catches problems earlier.
Practice 2: Blameless Postmortems That Are Actually Blameless
The problem with most postmortem templates:
┌────────────────────────────────────────────┐│ ❌ "Who was responsible for this change?" ││ ❌ "What did the engineer miss?" ││ ❌ "How do we prevent this person from ││ making this mistake again?" │└────────────────────────────────────────────┘Even with “blameless” in the title, these questions create fear. Engineers start optimizing for “not being the one blamed” instead of “solving the problem.”
Here’s the reframed version:
┌────────────────────────────────────────────┐│ ✓ "What system gaps allowed this?" ││ ✓ "What information was missing?" ││ ✓ "How do we make this impossible ││ or at least visible earlier?" ││ ✓ "What would need to change for ││ someone to catch this automatically?" │└────────────────────────────────────────────┘The shift from “who” to “what” changes everything. As one engineer noted:
“When something breaks in prod you want people running toward the problem not hiding from it.”
Practice 3: Normalizing “I Don’t Know”
I used to think senior engineers should always have answers. Then I watched a staff engineer say this in a meeting:
“I don’t know. That’s a great question. Let me look into it and get back to you.”
The room didn’t collapse. The junior engineer who asked the question sat up straighter. Later, that same junior told me: “If he can admit he doesn’t know, maybe I can too.”
Fear-Based Team Safe Team───────────────── ────────────────"I'll Google it later" "Can someone explain this?"*Pretends to understand* "I haven't worked with that yet"Avoids asking for help Asks questions in meetingsFakes expertise Learns out loud───────────────── ────────────────Hidden knowledge gaps Visible knowledge gaps= production incidents = collaborative solutionsThe highest-voted comment in that Reddit thread:
“It was very acceptable to be wrong as long as it was handled gracefully… They legitimately wanted feedback and insight, and weren’t upset if it contradicted their previous position.”
Practice 4: Disagreement Without Ego
Technical debates can get heated. The difference between healthy and toxic:
TOXIC: HEALTHY:─────────────────────────── ───────────────────────────"That's a stupid approach" "What problem does this solve?""We've always done it "The tradeoff I see is..."this way""You clearly don't "Help me understand yourunderstand the domain" reasoning"
Makes it PERSONAL Stays TECHNICAL─────────────────────────── ───────────────────────────Outcome: Person wins Outcome: Best solution winsOne engineer described the best culture they’d seen:
“People could disagree hard on technical decisions without making it personal, and senior engineers were expected to unblock others, not just produce individually.”
The “disagree and commit” principle matters here. After a decision is made, everyone commits—even those who argued against it. But during the debate, dissent is encouraged, not punished.
Why This Matters: The Research Backs It Up
Google spent years studying team effectiveness. Project Aristotle’s finding:
Psychological safety is the #1 predictor of high-performing teams.
Not technical skill. Not experience. Not process.
Safety.
┌─────────────────────────────────────────────────────┐│ TEAM EFFECTIVENESS FACTORS ││ ││ ████████████████████████████ Psychological Safety ││ ████████████████ Dependability ││ ████████████ Structure & Clarity ││ ██████████ Meaning ││ ████████ Impact ││ ││ Source: Google's Project Aristotle (2016) │└─────────────────────────────────────────────────────┘Teams with psychological safety:
- Have fewer production incidents (problems surface early)
- Ship faster (no fear of making changes)
- Innovate more (calculated risk-taking is safe)
- Retain talent (people don’t burn out hiding mistakes)
Common Mistakes I’ve Seen
Mistake 1: Declaring Safety Without Practices
“We have psychological safety” means nothing without rituals. It’s like declaring “we have good security” without any actual security measures.
Mistake 2: Retroactive Blaming
The worst is the team that claims to be blameless, then quietly marks people as “responsible” in spreadsheets. Trust me—engineers notice.
Mistake 3: Safety For Some
If only senior engineers can stop deployments or raise concerns, you don’t have psychological safety. You have a hierarchy with better marketing.
Mistake 4: Confusing Safety With Avoiding Accountability
Psychological safety doesn’t mean no one is accountable. It means accountability focuses on learning and improvement, not punishment.
Low Accountability High Accountability ──────────────────── ───────────────────Low Safety Comfortable but Anxious and no improvement defensive
High Safety Learning culture High performance but slow action with real growth (rare) (target state)How I Implemented This
When I joined a team with low trust, I started small:
Week 1: Added “what allowed this to happen?” to our incident template.
Week 2: Publicly said “I don’t know” in a meeting and followed up later.
Week 3: Proposed the deployment stop rule. Got pushback from management about “developer velocity.”
Week 4: Stopped a deployment myself. Caught a config issue that would have taken down auth.
Month 2: Team started stopping deployments. Management noticed the incident drop.
Month 3: A junior engineer stopped a deployment in her first week. She found a database migration issue. She’s now the go-to person for deployment questions.
The change wasn’t instant. But each concrete practice built on the previous one.
Measuring Success
How do you know if psychological safety is improving?
BEFORE: AFTER:────────────────────────────────────────────────"I'll just push this, "Can someone review this? it's probably fine" I'm not sure about the error handling"
*Hides incident until Reports potential issue customer complaints* before deployment
"I'll just say I knew "I didn't catch that—how that in the retro" can we make this visible earlier?"Concrete signals:
- Time from problem occurrence to problem report
- Number of deployment stops and whether they caught real issues
- “I don’t know” frequency in meetings (higher is better)
- Junior engineer participation in technical discussions
Related Concepts
Psychological Safety vs. Trust:
Psychological Safety Trust─────────────────────────── ───────────────────────────"I can take risks "I believe my teammate without being punished" will deliver as promised"
Team-level feeling Interpersonal relationship
Created by leaders Built through interactionsAs one person noted: “Trust flowed naturally. Because they trusted the boss, and the boss trusted person X, they trusted person X.”
Psychological Safety vs. Accountability:
They’re not opposites. Safety creates the conditions for meaningful accountability. When people aren’t afraid, they actually want to improve.
References
- Google’s Project Aristotle - Original research on team effectiveness
- Google SRE Book: Postmortem Culture - How Google handles incidents
- Harvard Research on Psychological Safety - Academic foundation
The Takeaway
Psychological safety isn’t a feeling. It’s a set of concrete practices:
- Anyone can stop deployments — intuition is valid
- Postmortems never blame individuals — systems, not people
- “I don’t know” is always acceptable — knowledge gaps are learning opportunities
- Technical disagreement never gets personal — argue about code, not character
The measure of success: when production breaks, do people run toward the problem or away from it?
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments