avid learner and developer[Ben.randomThoughts,GReader.shared,Delicious.public].tweet

Architects, Anti-Patterns, and Organizational Fuckery

1 Comment and 2 Shares

I recently wrote a Twitter thread on the proper role of architects, or as I put it, tongue-in-cheek-ily, whether or not architect is a “bullshit role”.

It got a LOT of reactions (2.5 weeks later, the thread is still going!!), which I would sort into roughly three camps:

  1. “OMG this resonates; this matches my experiences working with architects SO MUCH”,
  2. “I’m an architect, and you’re not wrong”, and
  3. “I’m an architect and I hate you.”

Some of your responses (in all three categories!) were truly excellent and thought-provoking. THANK YOU — I learned a ton. I figured I should write up a longer, more readable, somewhat less bombastic version of my original thread, featuring some of my favorite responses.

Where I’m Coming From

Just to be clear, I don’t hate architects! Many of the most brilliant engineers I have ever met are architects.

Nor do I categorically believe that architects should not exist, especially after reading all of your replies. I received some interesting and compelling arguments for the architect role at larger enterprises, and I have no reason to believe they are not true.

Also, please note that I personally have never worked at a company with “architect” as a role. I have also never worked anywhere but Silicon Valley, or at any company larger than Facebook. My experiences are far from universal. I know this.

Let me get suuuuuper specific here about what I’m reacting to:

  • When I meet a new “architect”, they tend toward the extremes: either world class and amazing or useless and out of touch, with precious little middle ground.
  • When I am interviewing someone whose last job title was “architect”, they often come from long tenured positions, and their engineering skills are usually very, very rusty. They often have a lot of detailed expertise about how their last company worked, but not a lot of relevant, up-to-date experience.
  • Because of 👆, when I see “architect” on a job ladder, I tend to feel dubious about that org in a way I do not when I see “staff engineer” or “principal engineer” on the ladder.

What I have observed is that the architect role tends to be the locus of a whole mess of antipatterns and organizational fuckery. The role itself can also be one that does not set up the people who hold it for a successful career in the long run, if they are not careful. It can be a one-way street to being obsolete.

I think that a lot of companies are using some of their best, most brilliant senior engineers as glorified project manager/politicians to paper over a huge amount of organizational dysfunction, while bribing them with money and prestige, and that honestly makes me pretty angry. 😡

But title is not destiny. And if you are feeling mad because none of what I’ve written applies to you, then I’m not writing about you! Live long and prosper. 🖖

Architect Anti-patterns and fuckery

There is no one right way to structure your org and configure your titles, any more than there is any one right way to architect your systems and deploy your services. And there is an eternal tension between centralization and specialization, in roles as well as in systems.

Most of the pathologies associated with architects seem to flow from one of two originating causes:

  1. unbundling decision-making authority from responsibility for results, and
  2. design becoming too untethered from execution (the “Frank Gehry” syndrome)

But it’s only when being an architect brings more money and prestige than engineering that these problems really tend to solidify and become entrenched.

Skin In The Game

When that happens, you often run into the same fucking problem with architects and devs as we have traditionally seen with devs and ops. Only instead of “No, I can’t be on call or get woken up, my time is far too valuable, too busy writing important software”, the refrain is, “No, I can’t write software or review code, my time is far too valuable, I’m much too busy telling other people how to do their jobs.”

This is also why I think calling the role “architect” instead of “staff engineer” or “principal engineer” may itself be kind of an anti-pattern. A completely different title implies that it’s a completely different job, when what you really want, at least most of the time, is an engineer performing a slightly different (but substantially overlapping) set of functions from a senior engineer.

My core principle here is simple: only the people responsible for building software systems get to make decisions about how those systems get built. I can opine all I want on your architecture or ours, but if I’m not carrying a pager for you, you should probably just smile politely and move along.

Technical decisions should ultimately be made by the people who have to live with the consequences. But good architects will listen to those people, and help co-create architectural decisions that take into account local, domain, and enterprise perspectives (a Katy Allred quote).

Architecture is a core engineering skill

When you make architecture “someone else’s problem” and scrap the expectation that it is a core skill, you get weaker engineers and worse systems.

Learning to see the forest as well as the trees, and to factor in security, maintainability, data integrity and scale, performance, etc., is a *critical* part of growing up as an engineer into senior roles.

The story of QA is relevant here. Once upon a time, every technical company had a QA department to test their code and ensure quality. Software engineers weren’t expected to write tests for their code — that was QA’s job. Eventually we realized that we wrote better software when engineers were held responsible for writing their own tests and testing their own code.

Developers howled and complained: they didn’t have time! they would never get anything built! But it gradually became clear that while it may take more time up front to write and test code, it saved immensely more time and pain in the longer run because the code got so much better and problems got found so much earlier.

It’s not like we got rid of QA  — QA departments still exist, especially in some industries, but they are more like consulting experts. They write test suites and test software, but more importantly they are a resource to make sure that everybody is writing good tests and shipping quality software.

This was long enough ago that most people writing code today probably don’t remember this. (It was mostly before my own time as well.) But you hear echoes of the same arguments today when engineers are complaining about having to be on call for their code, or write instrumentation and operate their code in production.

The point is not that every engineer has to do everything. It’s that there are elements of testing, operations, and architecture that every software engineer needs to know in order to write quality code — in order to not make mistakes that will cost you dearly down the line.

Specialists are not here to do the job for you; they’re here to help you do the job better.

“Architect” Done Right

If you must have architects at all, I suggest:

  1. Grow your architects from within. The best high-level thinkers are the ones with a thorough grounding in the context and the particulars.
  2. Be clear about who gets to have opinions vs who gets to make decisions. Having architects who consult, educate, and support is terrific. Having “pigeon architects” who “swoop and poop” — er, make technical decisions for engineers to implement — is a recipe for resentment and weak architectures.
  3. Pay them the same as your staff or principal engineers, not dramatically more. Create an org structure that encourages pendulum swings between (eng, mgr, arch) roles, not one with major barriers in the form of pay or level disparities.
  4. Consider adopting one of the following patterns, which do a decent job of evading the two main traps we described above.

If your architects don’t have the technical skills, street cred, or time to spend growing baby engineers into great engineers, or mentoring senior engineers in architecture, they are probably also crappy architects. (another Katy Allred quote)

The “Embedded Architect” (aka Staff+ Engineer)

The most reliable way I know to align architecture and engineering goals is for them to be done by the same team. When one team is responsible for designing, developing, maintaining, and operating a service, you tend to have short, tight, feedback loops that let you ship products and iterate swiftly.

Here is one useful measure of your system’s complexity and the overhead involved in making changes:

“How long does it take you to ship a one-character fix?”

There are many other measures, of course, but this is one of the most important. It gets to the heart of why so many engineers get fed up with working at big companies, where the overhead for change is SO high, and the threshold for having an impact is SO long and laborious.

The more teams have to be involved in designing, reviewing, and making changes, the slower you will grind. People seem to accept this as an inevitability of working in large and complex systems far more than I think they should.

Embedding architecture and operations expertise in every engineering team is a good way to show that these are skills and responsibilities we expect every engineer to develop.

This is the model that Facebook had. It is often paired with,

The “Architecture Group” of Practicing Engineers

Every company eventually needs a certain amount of standardization and coordination work. Sometimes this means building out a “Golden Path” of supported software for the organization. Sometimes this looks like a platform engineering team. Sometimes it looks like capacity planning years’ worth of hardware requirements across hundreds of teams.

I’ve seen this function fulfilled by super-senior engineers who come together informally to discuss upcoming projects at a very high level. I’ve seen it fulfilled by teams that are spun up by leadership to address a specific problem, then spun down again. I’ve seen it fulfilled by guilds and other formal meetings.

These conversations need to happen, absolutely no question about it. The question is whether it’s some people’s full time job, or one of many part-time roles played by your most senior engineers.

I’m more accustomed to the latter. Pro: it keeps the conversations grounded in reality. Con: engineers don’t have a lot of time to spend interfacing with other groups and doing “project management” or “stakeholder management”, which may be a sizable amount of work at some companies.

The “architect-engineer” pendulum

The architect-to-engineer pendulum is, in my opinion, the only strategy short of embedded architects / shared ownership that seems likely to yield consistently good results.

The reasoning behind this is similar to the reasons for saying that engineering managers should probably spend some time doing hands-on work every few years. You need to be a pretty good engineer before you can be a good engineering manager or a good architect, and 5+ years after doing any hands-on work, you probably aren’t one anymore.

If you’re the type of architect that is part of an engineering team, partly responsible for a product, shipping code for that product, or on call for that product, this may not apply to you. But if you’re the type of architect that spends little if any time debugging/understanding or building the systems you architect, you should probably make a point of swinging back and forth every few years.

The “Time-Share Architect”

This one has aspects of both the “Architecture Working Group” and the “Architect-Engineer Pendulum”. It treats architecture as a job to be done, not a role to be occupied. Thinking of it like a “really extended pager rotation” is an interesting idea.

Somewhat relatedly — at Honeycomb, “lead engineer” is a title attached to a particular project, and refers to a set of actions and responsibilities for that project. It isn’t a title that’s attached to a particular person. Every engineer gets the opportunity to lead projects (if they want to), and everybody gets a break from doing the project management stuff from time to time. The beautiful thing about this is that everybody develops key leadership skills, instead of embodying them in a single person.

The important thing is that someone is performing the coordination activities, but the people building the system have final say on architecture decisions.

The “Advisor Architect”

I honestly have no problem with architects who are not seen as senior to, and do not have opinions overriding those of, the senior engineers who are building and maintaining the system.

Engineers who are making architectural decisions should consult lots of sources and get lots of opinions. If architects provide educated opinions and a high level view of the systems, and the engineers make use of their expertise, well, that’s fan fucking tastic.

If architects are handing them assignments, or overriding their technical decisions and walking off, leaving a mess behind … fuck that shit. That’s the opposite of empowerment and ownership.

The “skin in the game” rule of thumb still holds, though. The less an architect is exposed to the maintenance and operational consequences of decisions, the less sway their opinion should hold with the group. It doesn’t mean it doesn’t bring value. But the limitations of opinions at a distance should be made clear.

The Threat to Architects’ Careers

It’s super flattering to be told you are just too important, your time is too valuable for you to fritter it away on the mundane acts of debugging and reviewing PRs. (I know! It feels great!!!) But I don’t think it serves you well. Not you, or your team, your company, customers, or the tech itself.

And not *every* architect role falls into this trap. But there’s a definite correlation between orgs that stop calling you “engineers” and orgs that encourage (or outright expect) you to stop engineering at that level. In my experience.

But your credibility, your expertise, your moral authority to impose costs on the team are all grounded in your fluency and expertise with this codebase and this production system — and your willingness to shoulder those costs alongside them. (All the baby engineers want to grow up to be a principal engineer like this.)

But if you aren’t grounded in the tech, if you don’t share the burden, your direction is going to be received with some (or a LOT of) cynicism and resentment. Your technical work will also be lower quality.

Furthermore, you’re only hurting yourself in the long run. Some of the most useless people I’ve ever met were engineers who were “promoted” to architect many, many years ago, and have barely touched an editor or production shell since. They can’t get a job anywhere else, certainly not with comparable status or pay, and they know it. 🤒

They may know EVERYTHING about the company where they work, but those aren’t transferable skills. They have become a super highly paid project manager.

And as a result … they often become the single biggest obstacle to progress. They are just plain terrified of being automated out of a job. It is frustrating to work with, and heartbreaking to watch. 💔

Don’t become that sad architect. Be an engineer. Own your own code in production. This is the way.

Coda: On “Solutions Architects”

You might note that I didn’t include solutions architects in this thread. There is absolutely a real and vibrant use for architects who advise. The distinction in my mind is: who has the last word, the engineers or the architect? Good engineering teams will seek advice from all kinds of expert sources, be they managers or architects or vendors.

My complaint is only with “architects” who are perceived to be superior to, and are capable of overruling the judgments of, the engineering team.

Exceptions abound; the title is not the person. My observations do not obviate your existence as a skilled technologist.  You obviously know your own role better than I do. 🙃

charity

seriousben
12 days ago
Amazing blog post. Lots of exploration of the different architect archetypes, staff and principal engineers.

The title is a bit strong. But the post itself takes a more nuanced approach.

I really enjoyed it.
Canada

Let It Fail

1 Comment and 2 Shares

Some time ago I found myself leading half of engineering at a young startup. My group had been formed around a philosophy of platform value and as such had taken on a large project to migrate our application services to a new architecture. In parallel the business was evolving too and many new features were already being planned, each deeply integrated into the legacy system we were frantically trying to move on from. Quickly I realized the business and engineering teams were on a collision course.

During my next one-on-one with my manager I raised my concerns. "We're mired in technical debt and dealing with outages nearly every day," I said, voicing the frustrations of the team. These symptoms were well-known and we were sure that the root cause would be addressed by our platform work. There was only one catch: not only was this platform work not delivering incremental value, it was creating more cognitive overhead, because two systems would effectively exist in parallel until it shipped. With the business asking for more engineering bandwidth, we were in danger of being in limbo indefinitely.

"We must stay focused on the platform work," I declared with confidence. My manager, nodding, agreed unequivocally: "You're absolutely right, Max." As I got ready to pitch my plan to keep us on course, he surprised me by continuing, "And we're going to let it fail." I was sure I had misheard, but he went on, "It will fail too. Exactly in the way you've predicted." Realizing I had not misheard, I paused for a moment to recover. Now thoroughly confused, I could only ask, "But shouldn't we intervene?" My manager smiled. "Oh, you certainly could, and I know you would address the acute problem, but it wouldn't do anything about the chronic ailment."

As he explained I was beginning to understand.

Letting Things Go Sideways

What I had failed to see as I assessed the situation was that, while things would certainly break, our work would be delayed, and the team would endure a longer period of maintaining both a new and an old system, the fallout would be relatively limited and, given our scale and working process, quickly corrected. All this meant that the cost of allowing things to go sideways for a bit was relatively low.[1] Moreover, it represented an important learning opportunity for the broader business, which would generate broader buy-in and allow us to dramatically improve our process.

In fact, articulating the perils of forgoing the platform work and building on technical debt was crucial: while both my manager and I had succeeded in selling the idea in theory, in practice the business was still struggling to map this to its day-to-day needs. The fact that the work had little to no incremental value made this situation more challenging. Perhaps failure could be a helpful illustration.

Back Pressure

In networking and distributed systems, there's a concept known as back pressure. This idea revolves around the notion that different connected points within a system can prevent themselves from becoming overwhelmed and failing completely by controlling the amount of inbound data they accept. Essentially, back pressure enables a kind of controlled failure, and software systems can be designed around this principle to achieve greater scalability and resilience.
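As a rough illustration of that idea, here is a minimal sketch in Python that uses a bounded queue from the standard library. The names `offer` and `run_demo`, the capacity of 3, and the five incoming items are all made up for the example; the original post does not describe any particular implementation. The receiving side accepts work only while it has room, and the sender learns immediately when an item is refused instead of piling up work until something falls over.

```python
import queue


def offer(inbox, item):
    """Try to enqueue without blocking; return False when the inbox is full."""
    try:
        inbox.put_nowait(item)
        return True
    except queue.Full:
        # The refusal itself is the back pressure signal sent upstream.
        return False


def run_demo(capacity=3, incoming=5):
    """Send `incoming` items at an inbox that holds at most `capacity` of them."""
    inbox = queue.Queue(maxsize=capacity)
    accepted, rejected = [], []
    for item in range(incoming):
        if offer(inbox, item):
            accepted.append(item)
        else:
            rejected.append(item)
    return accepted, rejected


accepted, rejected = run_demo()
print(accepted, rejected)  # [0, 1, 2] [3, 4]
```

A real system would usually have the sender slow down, retry later, or shed load when it sees a refusal. The point of the sketch is only that the limit is explicit and the failure is controlled.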

Similarly, organizational processes can be thought of as systems that can also benefit from feedback loops based on the principles of back pressure. By implementing such feedback loops, organizations can limit the amount of work they take on, and thus reduce the risk of becoming overwhelmed and failing. This can help improve overall process resilience and scalability, resulting in a more efficient and effective workflow.

Although it's often possible to iteratively optimize software systems, this alone may not result in a more resilient system.[2] The same is true when an individual intervenes and prevents a minor catastrophe from occurring: while this may address the immediate issue, it doesn't necessarily prepare us for similar future disasters. Instead, we can enhance our processes by incorporating appropriate back pressure, allowing upstream parts of the system to become more aware of potential issues and adapt accordingly. By doing so, we can increase our overall preparedness and improve the system's resiliency, rather than simply addressing acute problems as they arise.

Heroism

Interventions suffer from another problem as well: they often rely on the heroic efforts of an individual who identifies a problem or opportunity and decides to take action. While it may be difficult to see the potential drawbacks of this approach in the short term, a longer time horizon reveals that heroism is often an anti-pattern.

Returning to the notion of systems, we can draw on the idea of single points of failure. When a system relies on one node in a graph and that node disappears or becomes degraded in some way, the entire system is necessarily impacted.[3] Similarly, our human systems can form single points of failure as they become dependent on individuals. Consider our hero disappearing on a months-long backpacking adventure in Europe: if, without her to save the day, our velocity slows to a crawl, we've stumbled upon a systemic reliance that reveals deeper fragility.
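To make the graph framing concrete, here is a small Python sketch that finds single points of failure by brute force: remove each node in turn and check whether the remaining nodes can still reach one another. The `services` graph and the function names are hypothetical, chosen only for illustration, and the graph is assumed to be undirected (every link is listed in both directions). The same check works whether the nodes stand for services or for people.

```python
from collections import deque


def reachable(graph, start, removed):
    """Return every node reachable from `start` when `removed` is taken out."""
    seen = {start}
    todo = deque([start])
    while todo:
        node = todo.popleft()
        for neighbor in graph[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                todo.append(neighbor)
    return seen


def single_points_of_failure(graph):
    """List nodes whose removal disconnects the remaining nodes."""
    nodes = set(graph)
    found = []
    for candidate in sorted(nodes):
        rest = nodes - {candidate}
        if not rest:
            continue
        start = min(rest)
        if reachable(graph, start, candidate) != rest:
            found.append(candidate)
    return found


# Hypothetical system: everything talks to everything else through "api".
services = {
    "web": ["api"],
    "api": ["web", "db", "cache"],
    "db": ["api"],
    "cache": ["api"],
}

print(single_points_of_failure(services))  # ['api']
```

This brute-force version is fine for a handful of nodes. For large graphs, a linear-time articulation-point algorithm would be the usual choice.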

As such, we generally want to discourage a culture of heroism and seek to build systems and processes that don't require it.

How I Learned To Stop Worrying

Things did indeed break. But as they did something else happened: our product team began to see the legacy system would not support the business goals and they went from somewhat passive admirers of the theory to active evangelists of the platform work. As the legacy system buckled under new demands, the conversation quickly evolved from, "How do I prioritize this new feature?" to "How do we create space for holistic system work such that we can build better features?" When it became evident the platform work supported the net-new work, the product and engineering teams led prioritization together.

I must admit, it wasn't easy for me to resist the urge to intervene. I have a natural inclination to jump in, get my hands dirty, and help. However, by creating room for some controlled failure, we gained a broader appreciation, outside engineering, for the limitations of the legacy system, as well as the need for stronger technical foundations to support our business. The organic back pressure created by the legacy system falling over was enough to steer us back on course, and at a relatively low cost. Moreover, it provided us with the opportunity to focus on developing resilient processes that didn't require individual heroics to avert disasters.

Ultimately we completed the platform work with the enthusiastic support of our business stakeholders. This formed the basis of future work and exceeded expectations in virtually every dimension.[4] Not only had we solved the worst pain of the legacy system, we had also unlocked increased velocity, which allowed the company to move more quickly with net-new feature work, bringing value to our customers sooner and helping the business reach ever-increasing levels of growth.

As leaders, it's tempting to think we need to act. Isn't it our job after all? Sometimes it is. But whether we need to jump in or not is derivative of a more fundamental directive.[5]

Sometimes the most powerful action we can take is no action at all.

Footnotes

  1. My biggest concern was that the team would burn out on incidents. However, we shared incident work equally between individual contributors and people managers, myself included. So while the cost was real, the team was not left to bear it alone after management had made the decision.

  2. At least not if we haven't made that an explicit goal. This is in contrast to considering the broader systemic pieces and how they fit together in the context of failure.

  3. This is why single points of failure are often one of the first things engineers attempt to identify when planning for resilience.

  4. The team had been right about the root cause of instability all along.

  5. So much of leadership happens before any decision or action. It's easy to conflate the result with the process but they are distinct and separate. The artifacts of leadership, like the code of a program, are mere shadows cast off the work itself.

seriousben
19 days ago
Interesting take on changing systems without relying on heroism and without trying to get projects prioritized that don't seem to have business value. Letting things fail (in a controlled way) can help align and change the systems. Being proactive is not always the best approach.

> Things did indeed break. But as they did something else happened: our product team began to see the legacy system would not support the business goals and they went from somewhat passive admirers of the theory to active evangelists of the platform work. As the legacy system buckled under new demands, the conversation quickly evolved from, "How do I prioritize this new feature?" to "How do we create space for holistic system work such that we can build better features?" When it became evident the platform work supported the net-new work, the product and engineering teams led prioritization together.
Canada

“Yes, if”: Iterating on our RFC Process

1 Comment

Article URL: https://engineering.squarespace.com/blog/2019/the-power-of-yes-if

Comments URL: https://news.ycombinator.com/item?id=34946844

Points: 123

# Comments: 60

seriousben
22 days ago
Interesting take on an RFC process. At Hashi, RFCs are used to give teams more decision-making power, as opposed to centralizing review and decisions in a council. Trade-offs.

HN comments provide other opinions: https://news.ycombinator.com/item?id=34946844
Canada

SQL should be the default choice for data engineering pipelines

1 Comment

Article URL: https://www.robinlinacre.com/recommend_sql/

Comments URL: https://news.ycombinator.com/item?id=34578324

Points: 123

# Comments: 62

seriousben
49 days ago
Interesting take on data pipelines.
Canada

We invested 10% to pay back tech debt

1 Comment

Article URL: https://blog.alexewerlof.com/p/tech-debt-day

Comments URL: https://news.ycombinator.com/item?id=34394351

Points: 254

# Comments: 177

seriousben
64 days ago
From HN (https://news.ycombinator.com/item?id=34394351):

> If you assign a block of time to quality, you risk people taking that as an excuse to not focus on quality outside that block.

I feel like this has been my worry and experience for a while. Good teams I have been on shipped quality by default. They did not need a bug blitz or anything similar, except for a well-defined area of complexity or an integration with another team.
Canada

What happens when babies are left to cry it out?

1 Comment

Article URL: https://www.bbc.com/future/article/20220322-how-sleep-training-affects-babies

Comments URL: https://news.ycombinator.com/item?id=34173006

Points: 101

# Comments: 247

seriousben
81 days ago
Interesting view of the studies done in that domain. Summary and opinions available on HN https://news.ycombinator.com/item?id=34173006
Canada