How not to interview engineers
Updated on October 24, 2020
This is a response to Slava’s “How to interview engineers” article. I initially thought it was satire, as did others, but he has doubled down on it:
(…) Some parts are slightly exaggerated for sure, but the essay isn’t meant as a joke.
That being so, he completely misses the point on how to improve hiring, and proposes an alternative that is worse in many respects. It doesn’t qualify as provocative; it is just wrong.
I was comfortable taking it as satire, and had it not been satire I would have simply ignored the whole thing (except for the technical memo part), but friends of mine considered it somewhat reasonable. This is an adapted version of parts of the discussions we had, at the risk of becoming a gigantic showcase of Poe’s law.
In this piece, I will argue against his view, and propose an alternative approach to improving hiring.
It is common to find people saying how broken technical hiring is, as a phrase in this comment puts it well:
Everyone loves to read and write about how developer interviewing is flawed, but no one wants to go out on a limb and make suggestions about how to improve it.
I guess Slava was trying to avoid falling into this trap by making a suggestion on how to improve things instead, but it all went terribly wrong.
Timing the candidate shows up in the “talent” and “judgment” sections, and it is a bad idea in both for the same reason: programming is not a performance.
What do e-sports players, musicians, actors and athletes have in common? Performance psychologists.
For a pianist, the state of mind during a concert is crucial: they must not only be able to deal with stage anxiety, but to become really successful they have to learn how to exploit it. The time window of the concert is what people practice thousands of hours for, and it is what defines one’s career, since how well all the practice went is irrelevant to the nature of the profession. Being able to leverage stage anxiety is an actual goal of theirs.
The same applies to athletes, whose execution during a competition makes them sink or swim, regardless of how all the training went.
The same cannot be said about composers, though. They are more like book writers, where the value is not in a very few high-adrenaline moments, but in the aggregate over hours, days, weeks, months and years. A composer may have a deadline to finish a song in five weeks, but it doesn’t really matter if it is done in a single night, every morning between 6 and 9, in the very last week, or any other way. No rigid time structure applies, only whatever suits the composer best.
Programming is more like composing than doing a concert, which is another way of saying that programming is not a performance. People don’t practice algorithms for months to keep them at their fingertips, so that finally, in a single afternoon, they can sit down and write everything at once in a rigid 4-hour window, and launch it immediately after.
Instead, software is built iteratively: by making small additions, then refactoring the implementation, fixing bugs, writing a lot at once, etc., all while the programmers get a firmer grasp of the problem, stop to think about it, come up with new ideas, and so on.
Some specifically plan for spaced pauses, and call it “Hammock Driven Development”, which is just the artist’s “creative idleness” for hackers.
Unless you’re hiring for a live coding group, a competitive programming team, or a professional live demoer, timing the candidate that way is more harmful than useful. This type of timing doesn’t find good programmers, it finds performant programmers, which isn’t the same thing. You’ll end up with people who can do great work on small problems but who might be unable to deal with big problems, and you’ll lose those who can handle huge problems very well, slowly. If you are lucky you’ll get performant people who can also handle big problems in the long term, but maybe not.
An incident is the closest thing to a “performance” there is, and yet it is still dramatically different. Surely it is a high-stress scenario, but while people are trying to find the root cause and solve the problem, only the downtime itself is visible to the outside. It is like being part of the support staff backstage during a play: even though execution matters, you’re still not in the spotlight. During an incident you’re doing debugging in anger rather than live coding.
Although asking a candidate to write a “technical memo” has the potential to measure their written communication skills, doing so within a hard time window also misses the point, for the same reasons.
Typing speed is never the bottleneck for a programmer, no matter how great they are.
As Dijkstra said:
But programming, when stripped of all its circumstantial irrelevancies, boils down to no more and no less than very effective thinking so as to avoid unmastered complexity, to very vigorous separation of your many different concerns.
In other words, programming is not about typing, it is about thinking.
Otherwise, the way to give those star programmers who can’t type fast enough a huge productivity boost would be a touch typing course. If they are so productive with typing speed as a limitation, imagine what they could accomplish with razor-sharp touch typing skills!
Also, why stop there? A good touch typist can do 90 WPM (words per minute), and a great one can do 120 WPM, but with a stenography keyboard they get to 200 WPM+. That is double the productivity! Why not try speech-to-text? Make them all use J so they all need to type less! How come nobody thought of that?
And what if someone couldn’t solve the programming puzzle in the given time window, but could come back the following day with an implementation that was not only faster, but also used less memory, and was simpler to understand and easier to read than anybody else’s? You’d be losing that person too.
For “building an extraordinary team at a hard technology startup”, intelligence is not the most important thing, determination is.
And talent isn’t “IQ specialized for engineers”. IQ itself isn’t a measure of how intelligent someone is. When Alfred Binet and Théodore Simon started to formalize what would later become IQ tests, they already acknowledged the technique’s limitations for measuring intelligence, and those limitations still hold today.
So having a high IQ only tells how smart people are in one particular aspect of intelligence, which is not representative of programming. There are numerous aspects of programming that are not covered by IQ measurement: how to name variables and functions, how to create models that are compatible with schema evolution, how to make the system dynamic for runtime parameterization without making it fragile, how to measure and observe performance and availability, how to pick between acquiring and paying technical debt, etc.
Not to mention everything else a programmer does that is not purely programming. Saying high IQ correlates with great programming is a stretch, at best.
Slava tangentially picks on HR, and I will digress on that a bit:
A good rule of thumb is that if a question could be asked by an intern in HR, it’s a non-differential signaling question.
Stretching it, this is a rather snobbish view of HR. Why is it that an intern in HR can’t ask signaling questions? Could the same be said of an intern in engineering?
In other words: is the question not signaling because the one asking is from HR, or because the one asking is an intern? If the latter, then he’s just arguing that interns have no place in interviewing, but if the former, then he was picking on HR.
Extrapolating that, it is common to find people who don’t value HR’s work, and only see them as inferiors doing unpleasant work, and who aren’t capable enough (or smart enough) to learn programming.
This is equivalent to people who work primarily on backend, see others working on frontend struggling, and say: “isn’t it just building views and showing them on the browser? How could it possibly be that hard? I bet I could do it better, with 20% of the code”. As you already know, the answer to that is “well, why don’t you go do it, then?”.
This sense of superiority ignores the fact that HR has actual professionals doing actual hard work, not unlike programmers. If HR is inferior and so easy, why not automate everything away and get rid of a whole department?
I don’t attribute this world view to Slava; this is only an extrapolation of a snippet of the article.
If I found out that people employed theatrics in my interview so that I could feel I’ve “earned the privilege to work at your company”, I would quit.
If your moral compass is so broken that you are comfortable mistreating me while I’m a candidate, I immediately assume you will also mistreat me as an employee, and that the company is not a good place to work, as evil begets stupidity:
But the other reason programmers are fussy, I think, is that evil begets stupidity. An organization that wins by exercising power starts to lose the ability to win by doing better work. And it’s not fun for a smart person to work in a place where the best ideas aren’t the ones that win. I think the reason Google embraced “Don’t be evil” so eagerly was not so much to impress the outside world as to inoculate themselves against arrogance.
Paul Graham goes beyond “don’t be evil” with a better motto: “be good”.
Abusing the asymmetric nature of an interview to increase the chance that the candidate will accept the offer is, well, abusive. I doubt a solid team can actually be built on such poor foundations, surrounded by such evil measures.
And if you really want to give engineers “the measure of whoever they’re going to be working with”, there are plenty of reasonable ways of doing it that don’t include performing fake interviews.
Personality tests around the world need to be a) translated, b) adapted and c) validated. Even though a given test may be applicable and useful in a country, this doesn’t imply it will work for other countries.
Not only do tests usually come with translation guidelines, but their applicability also needs to be validated again after translation and adaptation, to see whether the test still measures what it is supposed to.
That is also true within the same language. If a test is shown to work in England, it may not work in New Zealand, in spite of both speaking English. The difference in cultural context can be influential enough to invalidate a test.
Regardless of the validity of the proposed “big five” personality test, saying “just use attributes x, y and z of this test and you’ll be fine” is a gross simplification, much like saying “just use Raft for distributed systems, after all it has been proven to work”. It throws all of that background away.
Even applying personality tests is not a trivial task, and psychologists need special training to be able to apply one effectively.
He calls the ill-defined “industry standard” cargo-culting, but his proposal isn’t sound enough to avoid becoming one too.
Even if the ideas were good, they aren’t solid enough, or based on solid enough things to make them stand out by themselves. Why is it that talent, judgment and personality are required to determine the fitness of a good candidate? Why not 2, 5, or 20 things? Why those specific 3? Why is talent defined like that? Is it just because he found talent to be like that?
Isn’t that definitionally also cargo-culting? Isn’t he just repeating whatever he found to work for him, without understanding why?
What Feynman proposes is actually the opposite:
In summary, the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another.
What Slava did was just another form of cargo culting, but this was one that he believed to work.
I will not give you a list of things that “worked for me, thus they are correct”. Nor will I critique the current “industry standard”, or recount what I’ve learned from interviewing engineers.
Instead, I’d like to invite you to learn from history, and from what other professionals have to teach us.
Programming isn’t an odd profession, where everything about it is different from anything else. It is just another episode in the “technology” series, which has had seasons since before recorded history. It may be an episode where things move a bit faster, but it is fundamentally the same.
So here is the key idea: what did people do before software engineering?
What is hiring like for engineers in other areas? Haven’t civil, electrical and other types of engineering existed for much, much longer than software engineering has? What have those centuries of accumulated experience taught the world about technical hiring?
What studies were performed on the success rates of different interviewing strategies? What have they done right and what have they done wrong?
What is the purpose of HR? Why do they even exist? Do we need them, and if so, what for? What is the value they bring, since everybody insists on building an HR department in their companies? Is the existence of HR another form of cargo culting?
What is industrial and organizational psychology? What is that field of study? What do they specialize in? What have they learned since the discipline appeared? What have they done right and wrong over history? What is the current academic consensus in that area? What is a hot debate topic in academia in that area? What is the current bleeding edge of research? What can they teach us about hiring? What can they teach us about technical hiring?
If all I’ve said makes me a “no hire” in the proposed framework, I’m really glad.
This says less about my programming skills, and more about the employer’s world view, and I hope not to be fooled into applying to a company that adopts this one.
Claiming to be selecting “extraordinary engineers” isn’t an excuse to reinvent the wheel, poorly.