
Agile Code Reviews

Facebook recently changed their mantra from

Move fast and break things.

to

Move fast with stable infrastructure.

Mark Zuckerberg went on to say, “We were willing to tolerate a few bugs to move fast. But having to slow down and fix things was slowing us down.” Working quickly while engineering things the right way is a balance every team has to find, especially at startups, where speed is necessary for survival.

The code review is one activity that, when done correctly, can help your team achieve both objectives – shipping code quickly and keeping systems stable.

The problem is that code reviews are very easy to get wrong, mostly because they are cultural and driven by human biases. Usually one or more of these things happen:

  1. The engineering team pays lip service to code reviews, doesn’t prioritize them, and simply doesn’t find the time.
  2. A bug finds its way into production causing bad behavior or worse. Management gets involved and code reviews come up as a solution. Engineers reluctantly agree but no framework is discussed for when and how code reviews should be implemented and who should be involved.
  3. The engineering team gets around to doing a code review. The entire team piles into a conference room, because hey, it would benefit everyone, and one lucky engineer walks everyone through his or her code, standing trial for every line and justifying a variety of details, down to coding conventions.
  4. The code review process is so painful that other engineers would rather not bring it up again, and engineering leaders likewise don’t want to put their team through the wringer. The time-suck argument often resurfaces.
  5. Code reviews are implemented as a variant of pair-programming, which is less intense but still faces scheduling issues. Code reviews continue to fall through the cracks.

In the name of moving quickly, code reviews are often painful or swept under the rug. It doesn’t have to be this way and it shouldn’t.

Code reviews are one of the few ways to actually have confidence in your team’s production code and avoid having to revisit deployed code or re-design a component. This approach is a good example of “slowing down to go fast”.

While every team is unique, here is a framework for thinking about the code review process.

  1. Be Respectful. Leaders must foster a culture of respectful, positive feedback. Finding bugs at this stage is a good thing and a learning experience for everyone involved.
  2. Be Automated. Use a tool to automate as much as you can around code reviews. Here is a list of popular code review tools. Discuss with your team the tradeoffs and pick one. It matters most that your team uses it.
  3. Be Low Overhead. Code reviews should be asynchronous. Don’t hold meetings.
  4. Be Consistent. Make code reviews habitual. The code review is part of the deployment process.
  5. Be Reliable. Any piece of code that goes into production should be code reviewed.
  6. Be Timely. Follow up during stand-ups. If you’re being blocked by a pending code review, bring it up right away.
  7. Be Thorough. Take the time to understand the code you are reviewing. Make sure that once a review is given, the feedback is incorporated. Require a follow-up review if necessary, until the “ship it” thumbs-up is granted.

Code reviews are by no means a magic bullet for perfect code. However, done correctly they can improve your team’s velocity while simultaneously improving the quality of your team’s production code base. As an added bonus, code reviews expedite knowledge sharing across the team. This has the advantage of enabling more people to contribute to more pieces of the code base over time, allowing your organization to move faster.

Twitter’s Opportunity

While social messaging applications have experienced unprecedented growth in the past two years, it is no secret that user growth at Twitter has not reached its forecasted potential. Growth was 126% year over year in 2011 and sits at around 25% in 2014. Obviously this percentage will go down over time, but even so, taken alone, the latest number is encouraging, and fans of the service hope to see this kind of growth continue. Nevertheless, Twitter has watched a handful of companies pass them by and make waves in the industry. With the global ubiquity of mobile messaging apps, one can’t help but wonder if Twitter missed a huge opportunity.

Of course, Twitter is more than just a messaging app. It’s a unique platform that has a lot going for it. Twitter already has great partners, amazing presence across all media channels, lots of content and a trusted brand. These are actually some of the hardest things to achieve, pieces other companies would love to have. The right combination of product decisions could potentially propel Twitter to join the ranks of the billion-user club. Execution is everything of course, but here are some things Twitter could do a lot better with their product and in the process drive even greater user growth and engagement.

1. Communication is life’s killer app. And communication is entirely about conversations. As it stands, tweets are presented to stand on their own, requiring multiple clicks to get to entire conversations. The mobile experience in particular should be consistent in presenting conversations in a manner that is easily digestible, retrievable and searchable. Another manifestation of this should take place within the lists feature: users should be able to tweet to a list. There would need to be constraints around this to prevent spammy behavior, but group messaging is a naturally social behavior. Fred Wilson suggests Twitter should make tweetstorms a product. I think this is an interesting idea, but it takes one step too far, too soon. Twitter should get public mobile conversations right, and organic behaviors like tweetstorms will spread far more quickly. The major behavioral challenge Twitter faces compared to other mobile messaging apps is public vs private messaging. Twitter’s unique culture and value proposition is all about open, public messaging. They will need to balance this approach with the potential upside of building around private (direct) messaging.

2. Television viewing behavior is increasingly out of band. People are watching shows where and when they want to, the one or two exceptions being live sports and live talent competitions. Using Twitter while watching live programs is an amazing experience, allowing users to interact with each other and with the personalities on the screen. I can count on one hand the companies that have their logo beamed to millions of active viewers during a program. Twitter is actually incorporating new metrics that try to capture this reality. This is an advantage unique to Twitter that they should be capitalizing on. Instant replays, extra content not shown on TV, and even re-broadcasts of entire shows (think social DVR) are content plays Twitter should be thinking about. Millions of conversations are already happening around a specific piece of content using hashtags and mentions. Twitter should build product around this natural behavior.

3. One knock against Twitter is the noisy timeline. Every tweet is displayed in your timeline. Rather than break this model, I would like to see Twitter improve its Discover feature. While the timeline represents a stream, as it should, Discover should represent my feed, one that captures the ideas of my broader network, is smart, and is constantly evolving to show me what I want to see. Currently the Discover feature seems to be based more on popularity than personalization, which means I’ve probably already seen the same tweet, or some version of the same story. This just adds to the noise. I rarely check the Discover feed. Content these days is all about capturing attention, and Twitter is precisely positioned to do a great job at this, both in breadth and depth. The bottom line is that this feed needs to be smarter — make it my go-to news source that I check every morning.

4. Photos and photo sharing are probably the next biggest driver of social interaction, second only to messaging. People love taking photos and sharing them with whomever they want. And yet after more than seven years, Twitter is still not strongly associated with photos. Only recently did they start supporting tagging in photos, a feature which helped Facebook explode in popularity during its early years. In addition, not getting a deal done with Instagram to display photos inline was and continues to be a major blow for Twitter. Regardless, Twitter has failed to drive innovation in this space and has suffered for it. The Vine acquisition was a strong move on its own. But as a broader social experience, the photo and video features on Twitter should have a greater product presence and be positioned to attract more users.

5. Twitter made a bold and strategic move when it restricted use of its public API. Developers were basically told they should build atop Twitter at their own risk. I’m not sure Twitter has fully recovered from this decision. While a completely open API may not be necessary, partnerships with key application developers would have bolstered Twitter’s role as a social platform. Rather than build their own Music product, which did not fare well, Twitter could have worked with the likes of Pandora, Spotify, and even iTunes to build those experiences into Twitter. Gaming is another growing vertical Twitter has failed to take advantage of, one that is ripe for innovation. One vertical in which Twitter has shown some progress is commerce and payments, including their integration with Amazon Cart and their recent acquisition of CardSpring. The development of Twitter Cards is a step in the right direction on this front, supporting a variety of user interactions such as subscriptions and, eventually, an expanded roll-out of commerce capabilities. Still, by now Twitter should have grown beyond the borders of its application and positioned itself as a social platform, powering public communications across applications and the web.

6. This point is more of a personal request, and admittedly one I’m still thinking through. There seem to be two camps when it comes to the Favorites feature: those who use it as a “Like” button, which appears to be the majority, and those who use it as a way to bookmark content. I’m in the latter camp. Either way, I think there is a better way for Twitter to build product around all this favorited content: make it searchable and retrievable so that users keep coming back to their favorite posts and photos. My personal library.

Twitter has already made some significant changes to their interface, targeting both the on-boarding process and the read-only experience. I think some of the points above would go even further to make Twitter an amazing consumer experience.

Twitter may never need to join the billion-user club. Despite my ambivalence toward Promoted Tweets, Twitter’s advertising units have been performing incredibly well lately. It’s quite possible Twitter becomes extremely profitable with fewer than 500M users globally.

And of course, as a vehicle for social movements around the world, Twitter has already left a significant, immeasurable mark on history.

And yet. Given the ongoing global expansion of mobile devices and mobile platforms toward some 7 billion users, and given Twitter’s successes to this point, it would be disappointing not to see Twitter grow its user base significantly and become a social platform used regularly by a majority of the world’s population.

The Social Animal

Technology has changed the meaning of relationships. Whether for better or for worse remains to be seen — what we do know is that our relationships are now different.

Given our new technology-based forms of communication, are we any better at measuring our most precious resource, human capital?

David Brooks recently published his book The Social Animal: The Hidden Sources of Love, Character, and Achievement. Put broadly, it examines the human mind and how we can better measure success.

Brooks uses politicians as an extreme example to highlight the two sides of the human mind: On one hand politicians are capable of immense emotional intelligence, able to connect with thousands if not millions of people in person or through media; on the other hand, when it comes to policy, their actions seem almost wholly devoid of this same emotional intelligence, replaced and justified by an indifferent and exacting form of logic and reasoning.

Research over the past fifty years has shed light on the reason-versus-emotion debate in the context of what it means to be human. Synthesizing the research of economists, sociologists, psychologists, and scientists, we come to appreciate a deeper view of humanism.

Brooks gleans three key insights. First, while the conscious mind writes our day-to-day autobiography, it is the unconscious that does most of the work. We are bombarded with a million data points per day. We can actively recall only a small percentage of those data points; the unconscious mind, however, creates an image of the world that we can call upon to make what feel like “gut calls”.

Second, studies of patients with brain damage have shown that emotions are not separate from reason and wisdom; in fact, emotion is a foundation of reason. Emotions tell us what to value. Interpreting and educating our emotions is one of the most important things we do, and it is something we can get better at.

Third, we are not self-contained individuals. More than anything we are social, not rational, animals. Our sense of self emerges out of relationships and deep interconnections with others. In our minds, we re-enact what others see in their minds.

These findings impact our appreciation, understanding, and management of human capital. When we think about how we measure human capital we think about the number of hours spent behind a desk, number of tasks completed, grades, SAT scores, degrees, and so on. Human capital may be our most precious resource, and we need to look deeper than superficial numbers to measure and determine success.

The aforementioned research points to six aspects that are not typically measured but are more indicative of success and a far richer life.

1. Mindsight. The ability to enter other people’s minds and understand what they have to offer or empathize with their circumstances.

2. Equipoise. The ability to read the biases and failures in your own mind. Men in particular lean towards overconfidence. Those with high equipoise are self-aware, open-minded, curious, comfortable with the unknown, and able to adjust the strength of their conclusions to the strength of the evidence.

3. Metis. Derived from Greek mythology, metis is the embodiment of prudence, wisdom, or “street smarts”. It describes a sensitivity to your environment and the ability to pick out patterns within it.

4. Sympathy. The ability to work within groups. Groups are smarter than individuals, and groups that work face to face are smarter than groups that communicate electronically, because 90% of communication is non-verbal. The effectiveness of a group is rarely linked to the group’s IQ; rather, it is linked to softer attributes such as the capacity to work together and to take turns speaking.

5. Blending. The ability to mix concepts. This is very difficult to measure but is an important source of innovation. Picasso blended western art and African masks, not only the geometry of the art, but the underlying moral systems as well.

6. Limerence. This does not describe an ability but rather a drive or motivation. The conscious mind hungers for success and prestige while the unconscious mind seeks out moments of transcendence. When we are lost in a challenging problem, when a craftsman is lost in his task, when a naturalist is at one with nature, he or she experiences this transcendence. Our skull line disappears.

In Brooks’ view, the research of the latter part of the last century, and of the century to come, will have a profound impact on our culture and on how we interpret what it means to be human.

Ben Horowitz On Conflict

In movies, conflict makes things interesting. Conflict is what draws us into the story. In day-to-day life, however, it’s the opposite. Most people won’t claim to enjoy a boring life, but they do tend to avoid conflict in their own lives.

Overall this is probably a good thing. Most conflicts are a waste of energy.

In the workplace, though, the right type of conflict is a good thing. The following is an excerpt from an interview with Ben Horowitz that highlights this perfectly.

The book uses a lot of war terminology; there are boxing photographs here on the wall. At one point you lament that you can’t award promotions based on executives’ fighting ability. Are you drawn to conflict as a person?

I don’t know that I’m drawn to conflict; you don’t necessarily in these businesses want conflict with other companies, though you get it a fair amount. But, and this is one of the best management pieces of advice I ever got from Marc Andreessen: he was quoting Lenin, who was quoting Karl Marx, who said: “sharpen the contradictions.” Marx was talking about labor and capital, which is not generally what you’re talking about when you’re running a company. But the conflict is where the truth is. And so when there’s a conflict in the organization, you do not want to smooth it over. You want to sharpen the contradictions, heat up both opinions, and resolve it. Good CEOs are really good at doing that. And it’s miserable to work for someone who tries to smooth things over. “Oh no, it’s a miscommunication.” Miscommunication? I don’t agree with that, motherfucker!

When managed correctly, conflict draws out strong, valid opinions from a variety of sources. Leaders should encourage employees to voice their opinions. This builds strong character and a strong team.

Accidental Complexity

When people think about software engineering the first thing they think about is programming. This is a reasonable thought. But programming is really just the beginning of what it means to understand and be great at software engineering.

Engineering at its highest abstraction is about building and maintaining systems to solve problems. Systems rarely comprise a single program. Systems are constructed by connecting components together and engineers make some guarantees about how these components will work and interact in the face of various loads.

Software engineering involves requirements, architecture, design, algorithms, denoising, development, abstracting, testing (of which there are many forms), debugging, benchmarking, optimization, scaling, configuration, tooling, security, maintenance, refactoring, and workflow processes. (Let’s leave relationships with team members, bosses, and customers out of this discussion for now.)

In all aspects of engineering, complexity will arise.

Accidental complexity is caused not by the inherent nature of the problem being solved – that is essential complexity – but by your or your team’s approach to solving the problem.

Accidental complexity can be created by your choice of tool, or algorithm, or a poorly designed abstraction. Your approach may not be inherently bad or invalid, but for the purposes of the problem being solved it can introduce unwarranted complexity.

Here are some strategies for battling accidental complexity.

Reuse code. Integrating code that has already been written, tested, and run in production is a very good thing. Production-quality code is more apt to have been tested to handle various edge cases. Re-writing code from scratch, while sometimes necessary, is inherently prone to errors. The new code base is likely to start out simple but will accrue new forms of complexity over time that an existing piece of code already addressed. Joel Spolsky does a fine job talking about code reuse here and here.

Automation. Tools that allow any type of automation should be incorporated into the team’s workflow as soon as possible. Automation can be applied to unit testing (junit, pyunit), development builds (maven, paver), configuration management (chef, puppet), application testing (selenium, cucumber), continuous integration (jenkins), and pretty much every other function. Automation brings consistency across the team. Consistency maintains simplicity — each team member doesn’t have his or her own way of doing things — and automation frees engineers to focus on problems and not on tooling.
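
Here is a minimal sketch of what the automated-testing piece can look like, using Python’s built-in unittest module (the “pyunit” mentioned above). The slugify helper and its expected behavior are hypothetical, just small enough to show the shape of an automated check:

    import re
    import unittest


    def slugify(title):
        # Hypothetical helper: turn a title into a URL-safe slug.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")


    class SlugifyTest(unittest.TestCase):
        def test_basic_title(self):
            self.assertEqual(slugify("Agile Code Reviews!"), "agile-code-reviews")

        def test_empty_title(self):
            self.assertEqual(slugify(""), "")


    if __name__ == "__main__":
        unittest.main()

Wired into a continuous integration server such as Jenkins, a suite like this runs on every commit, so consistency no longer depends on anyone remembering to run it by hand.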

Good abstractions. If a piece of code can’t safely be reused or automated, we’re likely getting to the place where custom code is required to solve the problem. This, however, doesn’t mean you can’t build towards future code reuse or automation. Good abstractions enable these things. Good abstractions empower you to build a system using architectural patterns such as layering and the builder pattern. Good abstractions mean appropriate interfaces and APIs to your components and being able to reuse these abstractions across your stack where applicable. Components built using SOA reduce complexity by increasing composability: they can be debugged and refactored without being muddied by the logic of other components, they can be swapped out and replaced with a shinier version, moved upstream to a different part of your data pipeline, or re-used as part of a new backend service. An interface is a contract between the component and the rest of the world. Abide by it and your system as a whole benefits from increased maintainability and simplicity.
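
To make that concrete, here is a minimal sketch built around a hypothetical EventStore interface; the names and the signup-counting logic are invented for illustration:

    from abc import ABC, abstractmethod


    class EventStore(ABC):
        """Hypothetical interface: the contract the rest of the system codes against."""

        @abstractmethod
        def append(self, event: dict) -> None:
            ...

        @abstractmethod
        def all_events(self) -> list:
            ...


    class InMemoryEventStore(EventStore):
        """One concrete implementation; a production version could sit on a database."""

        def __init__(self):
            self._events = []

        def append(self, event: dict) -> None:
            self._events.append(event)

        def all_events(self) -> list:
            return list(self._events)


    def count_signups(store: EventStore) -> int:
        # Business logic depends only on the interface, not on any one backend.
        return sum(1 for e in store.all_events() if e.get("type") == "signup")

Because count_signups only knows about the interface, the in-memory store can be swapped for a shinier version without touching the callers, which is exactly the kind of composability described above.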

When you have a toolbox, not everything looks like a nail. This one is simple — use the right tool for the job. As mentioned above, consistency is important. But taking shortcuts or trying to retrofit a tool for a task it’s not built for will only increase the team’s complexity and technical debt. Continuously evaluate your toolbox and ask whether there is a new utility or library that’s better for the job.

Code reviews. Code reviews are cultural. They require that all teammates come to the table committed, with respect, open-mindedness, and pragmatism. A code review is not meant to catch every bug, but it is helpful to 1) shed light on best practices and 2) increase shared knowledge of the code base across the group. Both of these are very effective ways to reduce accidental complexity. Code reviews foster collaboration and help mitigate the impact of coding in isolation. Knowing what each teammate is working on helps each engineer think more holistically about their code and determine whether the class they are working on should be part of a more generic package that all engineers would find useful.

 

There is no magic bullet. Complexity is very easy to build and simplicity is very hard. It takes discipline and committed thinking and execution on the team’s part to engineer better systems and focus on solving real problems.

 

Big Data and Human Fault-Tolerance

The startup world at once fosters and excoriates the latest buzz phrases in tech: cloud computing, the pivot, NoSQL, crunches and bubbles, and big data.

“Big data” may currently be one of the most over-used terms but in practice it can refer to solid principles appropriate for building data systems that handle, well, a lot of data.

Nathan Marz describes these approaches in his book Big Data. Go buy this book!

It’s available in eBook form through the Manning Early Access Program (MEAP). Manning Publishing has built up a great library over the past couple years and Big Data looks like another promising read. The first six chapters are available online. Nathan is putting the final touches on these chapters before making the remaining chapters available. Looks like the book in its entirety is due out this summer.

I had a chance to read Big Data. It’s full of great ideas, and I realized that we’ve implemented a core data pipeline using an approach similar to the one Nathan describes.

In the technical community there is a lot of discussion around storage frameworks, but in my opinion there really isn’t enough discussion around frameworks for both storage and computation. The book addresses both these aspects.

The central idea of the book is the lambda architecture, which provides a practical way to implement an arbitrary function on arbitrary data.  The function is supported using three components: the batch layer, the serving layer, and the speed layer. These components alone would make for a great talk or blog post.

But that’s not exactly what I want to focus on here.

The first chapter of the book discusses the desired properties of a big data system. The list includes low-latency reads and updates, horizontal scalability, extensibility, ad hoc query support, minimal maintenance and debug time, and human fault-tolerance. These are true, and most would agree they are important.

That last point stood out to me. Human fault-tolerance is easily overlooked when building big data systems, and it’s related to the point about minimal maintenance.

A lot of things can go wrong when building data systems. I’ve been there. Work long enough on engineering systems and you will see and make mistakes, some stupid, some unlucky, some mind-boggling. As Nathan points out, mistakes can include deploying incorrect code that corrupts a potentially large amount of data, mistakenly deleting data, and incorrect job scheduling that causes a job to overwrite or corrupt data.

Hand-in-hand with human fault-tolerance goes maintenance. Maintenance can include debugging code, deploying code (if you’re deploying code manually you likely have bigger problems), adding nodes to a distributed setup, and generally keeping production systems running smoothly. The more maintenance a production system requires, the more manual human intervention it demands. Setting aside the fact that resources are pulled away from working on actual features, this effort only increases opportunities for further mistakes. I’ve seen small teams lose multiple days dealing with not only the initial problem but also the cascading errors caused by the efforts to fix the original bug.

Machines are good at automation. People are not. We should build our systems with this knowledge front and center. If the objective function is quality of life then systems should be built to minimize the maintenance parameters and maximize the automation parameters.

When dealing with “small” data, making a backup, downloading a backup, fixing specific pieces of corrupted data, or re-running a job is usually bounded in time, on the order of minutes or a few hours. Performing these operations with big data is prohibitively time-consuming and expensive. In the limit, these operations become intractable.

A question that then arises: how do we design and build big data systems for human fault-tolerance and minimal maintenance? The book addresses these aspects, which I’ve summarized here.

Simplicity. The more complex a system or component, the greater the risk of something going wrong. The lambda architecture pushes complexity out of the core batch components and into the transient pieces of the system, whose outputs are discardable after a short time. This includes distributed databases – rather than relying on the serving layer to hold state, they are continuously overwritten by the batch processes.

Immutability. Building data immutability into the core of your system imparts to it an inherent capacity to be human fault-tolerant. Keeping data immutable enables fault-tolerance in two big ways. First, it empowers you to work from an immutable master data set that represents a more raw state of your system. If data downstream is corrupted or lost, the master data set can be retrieved and computations can be re-run on these data. This has a profound impact on the way you engineer and maintain your data pipeline. Second, it minimizes over-engineering. If updates and deletes were supported you would need to build and maintain an index over all these data to retrieve and update specific objects. In contrast, immutable data is read- and append-only. The master data set can be as simple as a flat file. HDFS is a good example of storage that supports immutable data.
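
Here is a toy-scale sketch of the idea, with a flat file standing in for something like HDFS; the record schema is invented for illustration:

    import json
    import time

    MASTER = "master_dataset.jsonl"


    def append_event(event):
        # Append-only: new facts are added; existing lines are never updated or deleted.
        record = {"received_at": time.time(), "data": event}
        with open(MASTER, "a") as f:
            f.write(json.dumps(record) + "\n")


    def read_master():
        # The full, raw history is always available for recomputation.
        with open(MASTER) as f:
            return [json.loads(line) for line in f if line.strip()]


    append_event({"type": "page_view", "user": "alice", "url": "/home"})
    append_event({"type": "page_view", "user": "bob", "url": "/home"})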

Recomputation. Rather than building computations that incrementally update different data, build them instead as “recomputations” that can be run in batch across the entire master data set. Even better, make these recomputations idempotent so that they can be run any number of times over your data. These types of processes are better suited for human fault-tolerance. If any intermediate data are corrupted, this approach allows you to simply re-deploy code and re-process. An incremental-update approach would require you to determine which data were corrupted, which were okay, and figure out a way to fix just the right pieces of the data. This only increases the dependency on human intervention, which, as we know, only increases the possibility of more errors. Using recomputations also improves the generalizability of your algorithms. Many algorithms, especially in machine learning, are easier to implement in “batch” mode.
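
Continuing the toy example above, here is a sketch of the contrast between a recomputation and an incremental update; the page-view schema is the same invented one:

    from collections import Counter


    def recompute_page_views(master_events):
        # Recomputation: derive view counts from scratch over the whole master
        # data set. Running it twice, or re-running it after a bad deploy is
        # fixed, yields the same correct result (it is idempotent).
        counts = Counter()
        for record in master_events:
            event = record["data"]
            if event["type"] == "page_view":
                counts[event["url"]] += 1
        return dict(counts)


    def incremental_update(existing_counts, record):
        # Incremental update: mutates existing state. If corrupt data or buggy
        # code slips in, you are left figuring out which counts to repair.
        event = record["data"]
        if event["type"] == "page_view":
            existing_counts[event["url"]] = existing_counts.get(event["url"], 0) + 1
        return existing_counts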

These strategies may sound counter-intuitive. They demand increased storage and processing time, two things a startup team is apt to lack. These disadvantages, however, are outweighed in the long run by the simplicity, stability, and human fault-tolerance of your system.

While reading the book, I found that a lot of the ideas rang true. They reflect some of our concrete implementations. One of our core pipelines embodies the ideas of the lambda architecture and of simplicity, immutability, and recomputation. We build daily snapshots of our master data set, allowing us to recover and redeploy full computations, while processing new data in a low-latency workflow (a speed layer).

Complex systems break down. Complexity is difficult to extend, modify, maintain, and reason about. Incorporating the strategies discussed in Big Data can allow you to spend more time building features and less time stuck maintaining the status quo. Go buy the book for a lot more coverage and detail on this topic.

Update: Just after finishing this post I came across Nathan’s keynote page on the Strata Conf website, a talk he happens to be giving tomorrow morning! It’s a total coincidence that his talk is titled “Human Fault Tolerance”. Glad to know the author of the book really appreciates this aspect of big data system design.

Update II: Feb 28th’s Manning Deal of the Day also happens to be Big Data.

Update III: Here is a link to Nathan Marz’s StrataConf talk on human fault-tolerance.

100 Years

Friday, February 1st marked the centennial of New York’s Grand Central Terminal. The terminal in its current form was completed in 1913.

This image shows the southwest corner of Grand Central in 1918, and it gives you just a little bit of a sense of the lives people lived back then. I walk through this intersection every morning, and this picture makes me think about the history of this city and the things that have been built here by its people.

It also got me thinking about building things that last. What does it take to build something that will last 100 years? What are we building today that will be around 100 years from now? 1000 years?

The lifetime of something is often a consequence of use, maintenance, and disaster. The more rigorously something is used the more maintenance it requires. And in the case of disasters, natural or human, often nothing can be done but rebuild.

There are plenty of such examples. The pyramids benefit from a highly stable structure; the Great Pyramid of Giza has stood for 4,553 years. Jeff Bezos’ team is working on the 10,000-year clock – a testament to his view on long-term thinking – deep in the caves of west Texas. Other structures such as the Millau Viaduct represent marvels of modern-day engineering.

Organizations have the potential to last. They have the benefit of being maintained and replenished with human capital, but they require a vision and the courage to change with the times. Nokia was founded in the late 1800s as a paper mill company and eventually expanded into rubber and other industrials. In the 1960s, its CEO at the time had the boldness to turn the company’s efforts toward electronics and telecommunications equipment. Unfortunately, in the past 20 years Nokia has lacked this same bold leadership and has failed to effectively replenish its human capital.

What about software? Which software company will be the first to reach the century mark? Google and Amazon come to mind. They were founded in the mold of a company built for long-term, 10X bets.

Evernote is a young company that has released some excellent products and intends to be a 100-year company.

“We don’t think a billion dollars is all that cool, either. You know what’s really cool? Making a hundred-year company.” — Phil Libin, CEO of Evernote

Given the rapid pace of innovation and change in the technology sector, reaching this mark may prove more difficult than in other industries.

One commonality among these centenarian projects is interdependence – teams of people coming together to build something that will outlast their own individual lives.

At 100 years old, Grand Central is youthful compared with some of history’s longer-lasting monuments. But it is a reminder of what we are capable of completing as a team and of the goals toward which we should strive.

Image credit: Wikipedia

The Goal of Abstraction

Abstractions are a beautiful thing. The goal of an abstraction is to reduce or factor out details that you don’t care about. By removing details, an abstraction allows you to focus on the problem at hand and not the underlying implementation.

An abstraction fails when it doesn’t remove the details.
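
A small sketch of the difference; the helper names are hypothetical:

    import json
    from urllib.request import urlopen


    def fetch_json(url):
        # An abstraction that removes details: callers get a parsed result and
        # never touch sockets, headers, or decoding.
        with urlopen(url) as resp:
            return json.loads(resp.read().decode("utf-8"))


    def fetch_raw(url):
        # A failed abstraction: the wrapper exists, but callers still receive raw
        # bytes and must know the encoding and parsing rules themselves, so the
        # details it was meant to hide leak right back out.
        with urlopen(url) as resp:
            return resp.read()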

HAML, The Unforgivable Sin

Holding Lots Of State In Your Head

There are several distinguishing traits of a great engineer. Characteristics such as curiosity, objectivity, discipline, skepticism, tenacity, humility, and experience make up such a list, depending on which list you’re reading.

One characteristic I haven’t come across very often, however, is holding state. Specifically, an engineer’s ability to hold lots of state in his or her head. Facebook has mentioned that they look for this ability during their interviews.

Admittedly, this quality probably falls further down an ordered list, but on a day-to-day basis this ability is very important: it separates the good engineers from the great ones and can be the difference between a system that just works and a weekend spent debugging the new release.

Holding State In Your Head

So what exactly do I mean by holding state in your head? If you’re good at holding lots of state in your head, you’re good at keeping track of the multiple software and hardware components that make up your service and their corresponding interactions and dependencies. You’re good at sequencing events from development through the release process and tracking concurrent components so that processes run at the optimal time, productivity remains high, and your team works efficiently.

Some might say that this is multi-tasking. I disagree for at least three reasons.

First, building software is more complex than working on a to-do list. While your aim should be to build decoupled software components, dependencies among the components will undoubtedly exist. Keeping track of these dependencies adds another dimension of complexity.

Second, software evolves over time. What you built 6 months ago will be interacting with the software you build today. You need to not only remember that fact, but also think about how your new changes will impact previously deployed components. Keeping track of the changes to your system over time adds yet another dimension of complexity.

Third, in practice, decisions are usually undocumented. Just as important as keeping track of your various components and their dependencies is keeping track of the decisions you didn’t make. There was a reason that component you built 6 months ago was deployed the way it was. Why? Why was it deployed as a service instead of a cron job? There needs to be logic behind decisions, and you need to remember that logic. You need to be good at remembering prior events and decisions and taking those into account when making new decisions about your system.

Obviously if all the intricacies and idiosyncrasies of your system could be documented, they would be. In practice this is never the case, especially at a startup. Even with documentation it’s challenging to capture all the various interactions/processes running concurrently. Most of all, if you’re doing things right, your pace of productivity and development will be too fast to maintain full documentation.

Thus, your ability to hold lots of state in your head is even more important at a startup.

From Junior Developer To VP Engineering 

Holding lots of state in your head means different things for different members of your engineering people stack. For the junior developer, holding lots of state means keeping track of your data structures, memory usage, and how your classes are expected to interface with others. Higher up the people stack, holding lots of state in your head means things like managing systems and people, keeping productivity high, and knowing the risks within your system and possible points of failure and making sure those are dealt with accordingly.

Holding lots of state in your head is probably a quality that would be useful in many professions. To build productively and scale a robust system, engineers definitely have to be mindful of this characteristic, learn it, and foster it.

How Do You Get Better At Holding Lots Of State In Your Head?

Write it down. Start by scribbling down notes to yourself as soon as you remember something important that you don’t want to forget. Usually it’s not that an engineer hasn’t thought about some looming problem caused by a new dependency (for example); it’s that he or she forgets later on after something more urgent comes up. Writing it down starts to train your brain to keep track of multiple issues at the same time.

Communicate. It’s near impossible for one person to know everything. But that doesn’t mean one person should be the only one to know how a given component works or interacts with the system. Shared knowledge makes for a stronger team. Talk openly about your decisions with your colleagues. They might have some feedback, and if not, at the very least they may help remind you to revisit that issue you needed to deal with. You’re on a team. Act like it.

Be skeptical. Be more skeptical about what you and others have built. Question whether you’ve missed anything, and then question yourself again. Question your colleagues to make sure they’ve done what they said they were going to do (don’t be confrontational, just check in). Question your decisions and openly talk about the decisions your colleagues have made. You’d be surprised how often what you thought you communicated was not what your colleague understood. Don’t take anything for granted. I’m usually a happy-go-lucky type of person. However, when it comes to building software, I’ve learned that a healthy dose of skepticism can save your ass and your team’s.

Practice. Get in the habit of reviewing your various components and how they interact. The more time you spend familiarizing yourself with your software and the more time you spend working through your release process (for example) the better you’ll get at identifying assumptions and holes in your decision making.

Hopefully this stream of words I’ve put down is useful in some way. I’d love to hear your feedback or thoughts in the comments section.

Referly and Gumroad: Cash, Money, HTTP

In his recent retrospective In Memoriam: Even In Losing, How Digg Won, Om Malik waxes endearingly about Digg and the young team that “put the ‘me’ in media”.

One statement in particular was especially salient to me: “Links are and will always be the atomic unit of the web”.

It wasn’t the first time I heard this sentiment but it struck a chord. Taken alone, there is something aged and faded about the sentence, describing a technological relic. Taken in light of recent trends however, the link has become increasingly powerful.

A link used to be just an endpoint: a file on someone’s server that you could traverse to if you knew how to get there. Google made the blue links universal, and for a while they were a commodity.

Digg, Reddit, Slashdot, Flickr and others rejuvenated the link. It became associated with status and ownership over pieces of content. If yours got upvoted, in effect you were being praised. Facebook, Twitter, and Pinterest attached identity to the link, and the sharing of links blossomed even further as a new, widely accepted and encouraged societal behavior. Each link represents some part of you that exists within the context of all statuses, pictures and comments — each itself linkable.

Recently the link has started to take on increased purpose. Two young startups in particular are using links in different ways: Referly and Gumroad. Both are using links to drive financial transactions and commerce.

Referly is a service that brings together brands and their fans (“brand champions”). Let’s say Alice shares a link to a new book she loves on Amazon.com. If one of her friends clicks on that link and ends up buying the book, Alice gets a percentage of the purchase. Just like a referral fee. Get paid by Amazon! Cool.

The details of how this transaction goes down are where it gets interesting. Alice first gets a unique link from Amazon that contains information about the transaction, like her UUID, the product UUID, and the desired end-user action (click/purchase/sign-in). If Alice’s friend performs the desired action, the aforementioned percentage is automatically deducted from Amazon’s Referly account and deposited into Alice’s Referly account. All of this is tracked by the unique URL. I think Referly’s on-boarding process needs to become even more seamless, but this is a great start.
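
To make the mechanics concrete, here is a toy sketch of how a single link can carry that tracking information. The parameter names and URL format below are invented for illustration and are not Referly’s or Amazon’s actual scheme:

    from urllib.parse import urlencode
    from uuid import uuid4


    def build_referral_link(product_url, referrer_id, action="purchase"):
        # Hypothetical: everything needed to attribute the transaction rides
        # along in the query string of one unique URL.
        params = {
            "ref": referrer_id,        # identifies Alice, the referrer
            "action": action,          # the end-user action that triggers the payout
            "click_id": str(uuid4()),  # makes each shared link individually trackable
        }
        return product_url + "?" + urlencode(params)


    link = build_referral_link("https://example.com/books/123", referrer_id="alice-42")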

This is the next level of social sharing: Actual commerce taking place based on identity, trust, influence, relationships and networks, made possible by powerful links.

Gumroad uses links in another way. Instead of using links to get paid for referring a product, Gumroad uses links to let you get paid for selling your own product. It’s a service that enables anyone to sell any digital work (image, video, blog post, e-book, etc.) using a uniquely identified URL. Their tagline is “Sell like you share”. This line may evolve over time, but it effectively introduces new users to the product, drawing an analogy to an activity that has become very common and accepted. Recently they’ve introduced a way to accept shipping information so you can sell physical products as well. All through one URL.

As the tagline implies, just as anyone can share, anyone should be able to make money from the things they create — the democratization of the storefront. I think big box stores like Walmart and Best Buy will always have their place. But Gumroad has eliminated the need for sellers to set up an online storefront or any type of secure financial infrastructure, thereby empowering the long tail of creators to distribute their products around the world with a single click*.

I’m not sure if Gumroad will become wildly successful. I hope they do. But I do believe this is part of the future of the web — making financial transactions and commerce as seamless as possible so that people can distribute their creations around the world.

 

* I could go on about the grassroots, small-biz economies and startups encouraging this sector: Etsy, Kickstarter, Square, Louis CK, etc. Etsy’s CEO Chad Dickerson recently gave a great talk at the US Senate that underlined the importance of supporting the me-conomy**. I’ll leave this as the subject of yet another post.

**me-conomy. Self-explanatory term I think. This just came to my head while writing this post. Clearly I’m not the first though.