Categories
General

Retiring a Great Interview Problem

Interviewing software engineers is hard. Jeff Atwood bemoans how difficult it is to find candidates who can write code. The tech press sporadically publishes “best” interview questions that make me cringe — though I love the IKEA question. Startups like Codility and Interview Street see this challenge as an opportunity, offering hiring managers the prospect of outsourcing their coding interviews. Meanwhile, Diego Basch and others are urging us to stop subjecting candidates to whiteboard coding exercises.

I don’t have a silver bullet to offer. I agree that IQ tests and gotcha questions are a terrible way to assess software engineering candidates. At best, they test only one desirable attribute; at worst, they are a crapshoot as to whether a candidate has seen a similar problem or stumbles into the key insight. Coding questions are a much better tool for assessing people whose day job will be coding, but conventional interviews — whether by phone or in person — are a suboptimal way to test coding strength. Also, it’s not clear whether a coding question should assess problem-solving, pure translation of a solution into working code, or both.

In the face of all of these challenges, I came up with an interview problem that has served me and others well for a few years at Endeca, Google, and LinkedIn. It is with a heavy heart that I retire it, for reasons I’ll discuss at the end of the post. But first let me describe the problem and explain why it has been so effective.

The Problem

I call it the “word break” problem and describe it as follows:

Given an input string and a dictionary of words,
segment the input string into a space-separated
sequence of dictionary words if possible. For
example, if the input string is "applepie" and
dictionary contains a standard set of English words,
then we would return the string "apple pie" as output.

Note that I’ve deliberately left some aspects of this problem vague or underspecified, giving the candidate an opportunity to flesh them out. Here are examples of questions a candidate might ask, and how I would answer them:

Q: What if the input string is already a word in the
   dictionary?
A: A single word is a special case of a space-separated
   sequence of words.

Q: Should I only consider segmentations into two words?
A: No, but start with that case if it's easier.

Q: What if the input string cannot be segmented into a
   sequence of words in the dictionary?
A: Then return null or something equivalent.

Q: What about stemming, spelling correction, etc.?
A: Just segment the exact input string into a sequence
   of exact words in the dictionary.

Q: What if there are multiple valid segmentations?
A: Just return any valid segmentation if there is one.

Q: I'm thinking of implementing the dictionary as a
   trie, suffix treeFibonacci heap, ...
A: You don't need to implement the dictionary. Just
   assume access to a reasonable implementation.

Q: What operations does the dictionary support?
A: Exact string lookup. That's all you need.

Q: How big is the dictionary?
A: Assume it's much bigger than the input string,
   but that it fits in memory.

Seeing how a candidate negotiates these details is instructive: it offers you a sense of the candidate’s communication skills and attention to detail, not to mention the candidate’s basic understanding of data structures and algorithms.

A FizzBuzz Solution

Enough with the problem specification and on to the solution. Some candidates start with the simplified version of the problem that only considers segmentations into two words. I consider this a FizzBuzz problem, and I expect any competent software engineer to produce the equivalent of the following in their programming language of choice. I’ll use Java in my example solutions.

String SegmentString(String input, Set<String> dict) {
  int len = input.length();
  for (int i = 1; i < len; i++) {
    String prefix = input.substring(0, i);
    if (dict.contains(prefix)) {
      String suffix = input.substring(i, len);
      if (dict.contains(suffix)) {
        return prefix + " " + suffix;
      }
    }
  }
  return null;
}

I have interviewed candidates who could not produce the above — including candidates who had passed a technical phone screen at Google. As Jeff Atwood says, FizzBuzz problems are a great way to keep interviewers from wasting their time interviewing programmers who can’t program.

A General Solution

Of course, the more interesting problem is the general case, where the input string may be segmented into any number of dictionary words. There are a number of ways to approach this problem, but the most straightforward is recursive backtracking. Here is a typical solution that builds on the previous one:

String SegmentString(String input, Set<String> dict) {
  if (dict.contains(input)) return input;
  int len = input.length();
  for (int i = 1; i < len; i++) {
    String prefix = input.substring(0, i);
    if (dict.contains(prefix)) {
      String suffix = input.substring(i, len);
      String segSuffix = SegmentString(suffix, dict);
      if (segSuffix != null) {
        return prefix + " " + segSuffix;
      }
    }
  }
  return null;
}

Many candidates for software engineering positions cannot come up with the above or an equivalent (e.g., a solution that uses an explicit stack) in half an hour. I’m sure that many of them are competent and productive. But I would not hire them to work on information retrieval or machine learning problems, especially at a company that delivers search functionality on a massive scale.

Analyzing the Running Time

But wait, there’s more! When a candidate does arrive at a solution like the above, I ask for an big O analysis of its worst-case running time as a function of n, the length of the input string. I’ve heard candidates respond with everything from O(n) to O(n!).

I typically offer the following hint:

Consider a pathological dictionary containing the words
"a", "aa", "aaa", ..., i.e., words composed solely of
the letter 'a'. What happens when the input string is a
sequence of n-1 'a's followed by a 'b'?

Hopefully the candidate can figure out that the recursive backtracking solution will explore every possible segmentation of this input string, which reduces the analysis to determine the number of possible segmentations. I leave it as an exercise to the reader (with this hint) to determine that this number is O(2n).

An Efficient Solution

If a candidate gets this far, I ask if it is possible to do better than O(2n). Most candidates realize this is a loaded question, and strong ones recognize the opportunity to apply dynamic programming or memoization. Here is a solution using memoization:

Map<String, String> memoized;

String SegmentString(String input, Set<String> dict) {
  if (dict.contains(input)) return input;
  if (memoized.containsKey(input) {
    return memoized.get(input);
  }
  int len = input.length();
  for (int i = 1; i < len; i++) {
    String prefix = input.substring(0, i);
    if (dict.contains(prefix)) {
      String suffix = input.substring(i, len);
      String segSuffix = SegmentString(suffix, dict);
      if (segSuffix != null) {
        return prefix + " " + segSuffix;
      }
  }
  memoized.put(input, null);
  return null;
}

Again the candidate should be able to perform the worst-case analysis. The key insight is that SegmentString is only called on suffixes of the original input string, and that there are only O(n) suffixes. I leave as an exercise to the reader to determine that the worst-case running time of the memoized solution above is O(n2), assuming that the substring operation only requires constant time (a discussion which itself makes for an interesting tangent).

Why I Love This Problem

There are lots of reasons I love this problem. I’ll enumerate a few:

  • It is a real problem that came up in the course of developing production software. I developed Endeca’s original implementation for rewriting search queries, and this problem came up in the context of spelling correction and thesaurus expansion.
  • It does not require any specialized knowledge — just strings, sets, maps, recursion, and a simple application of dynamic programming / memoization. Basics that are covered in a first- or second-year undergraduate course in computer science.
  • The code is non-trivial but compact enough to use under the tight conditions of a 45-minute interview, whether in person or over the phone using a tool like Collabedit.
  • The problem is challenging, but it isn’t a gotcha problem. Rather, it requires a methodical analysis of the problem and the application of basic computer science tools.
  • The candidate’s performance on the problem isn’t binary. The worst candidates don’t even manage to implement the fizzbuzz solution in 45 minutes. The best implement a memoized solution in 10 minutes, allowing you to make the problem even more interesting, e.g., asking how they would handle a dictionary too large to fit in main memory. Most candidates perform somewhere in the middle.

Happy Retirement

Unfortunately, all good things come to an end. I recently discovered that a candidate posted this problem on Glassdoor. The solution posted there hardly goes into the level of detail I’ve provided in this post, but I decided that a problem this good deserved to retire in style.

It’s hard to come up with good interview problems, and it’s also hard to keep secrets. The secret may be to keep fewer secrets. An ideal interview question is one for which advance knowledge has limited value. I’m working with my colleagues on such an approach. Naturally, I’ll share more if and when we deploy it.

In the meantime, I hope that everyone who experienced the word break problem appreciated it as a worthy test of their skills. No problem is perfect, nor can performance on a single interview question ever be a perfect predictor of how well a candidate will perform as an engineer. Still, this one was pretty good, and I know that a bunch of us will miss it.

ps. Check out this post by Behdad Esfahbod that does the problem justice! He also notes that looking up a string in a dictionary isn’t O(1) but has a cost proportional to the string length, which motivates exploring a trie rather than a dictionary to store the strings.

By Daniel Tunkelang

High-Class Consultant.

105 replies on “Retiring a Great Interview Problem”

Will, I read your response. A short one here: assessment of candidates on a problem like this isn’t binary. Rather, the point is to get as holistic a picture as is possible in the time constraints of how well a candidate solves an algorithmic problem and implements it. It’s not a perfect tool, but no tool is. I’m always on the lookout for better ones — don’t forget that I’m retiring this problem.

And you allude to a candidate’s nervousness under interview conditions. Part of the interviewer’s job is to put the candidate at ease. That isn’t unique to questions that involve coding, and it doesn’t always work out. No interviewer or hiring process is perfect.

Finally, while I share your concerns about using whiteboard coding in interviews (see my link to Diego’s post), I disagree that it discriminates against older candidates. That very hypothesis strikes me as ageist, at least in the absence of supporting data.

Like

As rocketsurgeon mentioned, there is one bug and also one typo in the code. There is a parenthesis missing in this line:

if (dict.contains(input) return input;

And the loop condition never reaches the next to last character in the input string:

for (int i = 1; i < len – 1; i++)

Like

I won’t write this out in Java/whatever syntax as I am too lazy, also not a programmer so not interesting in memorizing syntax of this or that language, but as stated I am lazy so I would want a ‘good’ segmentation, not just any. (I don’t want a lot of “a’s” if there’s an “aaaa” available). So I would check the longest-length word in the dictionary, say that length is k (of course cap this at input-string length and discard all dictionary words greater than this – I’ll assume the dictionary has easy capability both of this max-length-check and to ‘discard’/ignore all >k), then starting from i=1 search all (i,i+k-1) substrings of the input unless/till I find a match, if none reduce k (discard more dictionary!) & try again, if so parse out & into ‘match’, prefix and suffix (as applicable) and recurse onto both of the latter. Details too boring/obvious to spec out.

Personally I dislike cutesy ‘interview questions’ and am instinctively distrustful of the filter they implicitly apply to candidates. When I’m interviewing someone I do what is known as ‘talk to’ the person, one might even say I ‘have a conversation’ with them. That may/may not be better but my way at least I don’t think someone could slide through the interview filter by memorizing some stuff off websites.

Like

Many skills go into a good software engineer, and different interviewers will probe different skills. This problem emphasizes algorithmic thinking and coding — and I ask a similar interview question myself. But algorithmic coding is becoming surprisingly uncommon in many environments because the non-trivial algorithms are wrapped in libraries. Of course, there are places where understanding all this is crucial — someone has to write those libraries, and some people really do have Big Problems that the libraries don’t address — but it doesn’t seem to be a central skill that all programmers need to master.

Is this good or bad? Consider A.N. Whitehead: “Civilization advances by extending the number of important operations which we can perform without thinking about them.”

Like

One thing I’ve learned about interview questions is that you have to lead up to them in steps if you want to determine where a candidate’s understanding trails off — which may be very soon.

For example, many candidates claim on their resumes that they know SQL. I used to ask such candidates how they would determine if person A was an ancestor of person B given a table of parent-child relations. This requires the (advanced) SQL feature of recursive queries (and I’d actually be happy if they could explain why it couldn’t be done in SQL, as it can’t in basic SQL). Now, I ask the question in stages:

* In SQL, how would you represent parent-child relations?
* How would you find X’s parents?
* How would you find X’s grandparents?
* How would you find all of X’s ancestors?
* What if you wanted to do all this high-volume genealogy Web site — would you change your table design or queries? or use some technology other than SQL?

I was shocked to discover that many candidates who listed SQL on their resumes couldn’t do *any* of this, and many required considerable coaching to do it. One candidate didn’t even remember that SQL queries start with SELECT — I would have forgiven this if he’d had conceptual understanding but had just forgotten the keyword, but he had zero conceptual understanding as well.

All this to say that you can’t really trust the self-reporting on a resume and you’ve got to probe to understand what the candidate actually knows.

Like

@Sonic Charmer

Your sketch is actually a reasonable start towards what I’d expect of a candidate. In fact, I think I could persuade you that your assumptions would have to be general enough to at least solve the interview problem as a special case where the min word length is one and the max is large enough to be the length of the input string. Please bear in mind that this “cutesy” problem is a simplification of one I had to solve to deliver software that has been deployed to hundreds of millions of people!

@Stavros

You’re right that not all software engineers needs to have strong command of algorithms. But I do require that strength of the folks I hire, given the problems that my team solves. Same applied at Google and Endeca. And yes, self-reporting on a resume is always subject to the maxim of “trust but verify”.

Like

Excellent article \ topic. We are currently recruiting for a half dozen C++ \ Web UI positions and for many companies it is extremely difficult to determine whom is talking the talk and whom can write VERY clean code and think logically when faced with difficult programming requests. As an example a Sales Director will hand a candidate for a sales position a phone and say “make this call and pitch them” just so you can see what they have, this is partly what interviewing has become like in the IT world.

Like

Raymond, thanks. The problem with all interviewing — and with interviewing software engineers in particular — is that it’s hard to extract a reliable signal under interview conditions.

Ultimately the best solution may be to change the interview conditions. But the approach has to be both effective and efficient. It’s a great research problem — and I’ll let readers here know what my colleagues and I come up with.

Of course, I’d love to hear what others are doing.

Like

As a side note, there is a discussion of a slightly more general problem in Peter Norvig’s excellent chapter from O’Reilly book “Beautiful Data” – http://norvig.com/ngrams/ch14.pdf (see section “Word Segmentation”). His remark that the memoized solution can be thought of as the Viterbi algorithm from the HMM theory is nicely illuminating (and of course obvious upon a moment’s thought).

Like

Hi Daniel:
how could you change your code so that it can find all valid segmentation for the whole string?
E.X. string “aaa”, dict{‘a’, ‘aa’} to be segmentation{“a a a”, ‘a aa’, “aa a”}

Like

William, interesting question. For starters, the number of valid segmentations may be exponential in the length — in fact, that will be the case for a string of n a’s if every sequence of a’s is a dictionary word. Could still use memoization / dynamic programming to avoid repeating work, but storing sets of sequences rather than a single one.

Like

It can be done in O(n), n=length of the string, with some pretty relaxed assumption given the nature of the problem: we just need a preprocessing step on the dictionary.

The idea is to build a set of rolling hashing function using the Rabin-Karp algorithm, for each word length in the dictionary, and the corresponding hash value for each word.

To segment the string, we loop over it once, updating the rolling hash values (for each length) and if we find a match in our set of hash values from the dictionary, we have a potential match. We still have to check the actual dictionary to confirm the match, avoiding false positives.

This design has the added advantage that the dictionary can be larger than what can fit into memory. We only need to store the hash values for each word in memory.

Like

Roberto, one person I presented the problems to did suggested an approach along these lines: since dictionary membership is a regular language, just build a finite state machine. I was impressed by the ingenuity, but I then enforced the constraint that the dictionary only supported a constant-time membership test.

By the way, I’ve since moved on to less exciting coding problems that require less ingenuity and are more a test of basics (though not quite as elementary as fizzbuzz). I’m still surprised at how many candidates with strong resumes fail at these.

Like

Maybe stronger candidates tend to overcomplicate problems, instead of solving them in the simplest way they search for a clever one and get lost.

“[T]he stupider one is, the closer one is to reality. The stupid one is, the clearer one. Stupidity is brief and artless, while intelligence wriggles and hides itself. Intelligence is a knave, but stupidity is honest and straight forward.” — Dostoevsky (The Brothers Karamazov)

Like

Some strong candidates assume that an easy solution must be too naive and therefore wrong. For that reason, it’s important to set expectations at the beginning. If a problem is basic, I tell the candidate as much — which also helps avoid a candidate feeling insulted or worrying that the bar is too low. And for all problems, I urge candidates to come up with a working solution before optimizing it.

Like

At first, it sounds depressing. I’m writing compilers and OSes, among other things, but don’t think I’d pass this interview.

However, the more I read the more I realize I’m a different kind of programmer than who is sought here. I don’t cite CLR by heart, solve real problems in real manner (i.e., including asking others, not to mention using books, Internet etc.) and generally not good at “coding”. I guess my skills aren’t very marketable, in this approach. Might be a better choice not to program for somebody else.

Like

Alex, I’ve actually switched to using problems that are a bit less CLR-ish. I still think it’s reasonable to expect someone to be able to real problems like this one, though I realize it’s unnatural to solve problems under interview conditions.

An alternative approach would be to put less emphasis on the accuracy of the interviewing process and treat the first few months as a trial period. Unfortunately, that’s not the cultural norm, so instead we try to squeeze all the risk out of the hiring process.

Anyway, if you’re able to write a compiler or OS, you should have no problem finding work you’re great at and enjoy.

Like

Alex, I have heard that lament before. I work at a place that uses similarly algorithmic questions (and other questions too — algorithms aren’t everything, but they’re important) and I sometimes hear my coworkers lament that they couldn’t get hired with the standards we have today. Which is, of course, nonsense. 🙂

What’s important to realize about these questions is that in an hour, you have time to make an attempt at an answer, get feedback, and improve your solution. It is a conversation, and not just a blank whiteboard, an interviewer in the corner tapping a ruler on the table every time you make a mistake, and a disappointing early trip home.

Like

Tapping a ruler on the table? More like rapping you on the knuckles! OK, the nuns didn’t really do that to us. 🙂

Seriously, dbt is right. Good interviews are a conversation. Otherwise, it would be better to make them non-interactive tests.

Like

if (segSuffix != null) {
memoized.put(input, prefix + ” ” + segSuffix);
return prefix + ” ” + segSuffix;
}

this if should be added with else as mention below, as to show words fetched before non-dictionary word ::

if (segSuffix != null) {
memoized.put(input, prefix + ” ” + segSuffix);
return prefix + ” ” + segSuffix;
}else{
return prefix ;
}

Like

Would using human cycles with captcha’s get you closer to O(n)? Or does big O analysis only apply when we actually understand the details of the algorithm?

Like

I was thinking about the same problem. Quite surprised to see the same thing appear here and asked for interviews :).
shouldn’t “aaaaab” be
“aaaaa” + b ? if a, aa, aaa, aaaa,aaaaa are in dict ?
Step 1: check for “aaaaab”, not in dictionary
Step 2: check for “aaaaa”, found in dictionary
word segments are “aaaaa” + “b”
I wonder why the need all combinations in a “real-world” problem!

Like

I love the problem…but I can’t help to notice that the interviewer who has used this question to filter many fine candidates over the years can’t even write out a working solution without bugs given unlimited time, years of experience asking others this question and do so in the context of writing an in-depth analysis teaching the unwashed masses about how great an interview problem this is. Can it be such a great question if you can’t even get it right?

The biggest problem with questions like this is that this type of programming is a highly perishable skill. I have played the guitar for 10 years but lately have spent more time singing acoustically – trying to fit my fingers to a Bach piece that used to come to me written on the back of my eyelids is now impossible.

Now if I were playing Bach every day it would be a different story. Same with this problem. You are only going to hire the guy who happened to solve several problems like this last week because they ran into some issue where it was relavent.

Like

Criticism noted. But bear in mind that the question isn’t a binary filter. It’s a test of algorithmic thinking and even of working out the problem requirements.

I’m curious what you mean by “this type of programming” being highly perishable. If you mean solving a basic algorithmic problem that comes up in the course of real work, then I strongly object. As I said in the post, the problem isn’t from a textbook — it’s a simplified version of a real issue that has come up for me and others in the course of writing production software.

But I grant that people don’t always have opportunities to use dynamic programming and perhaps even recursion. With that in mind, I’ve switched to interview problems that are less reliant on these. But I still expect the people I hire to be able to apply these fundamental computer science techniques with confidence.

Of course there’s a risk with this and any interview problem of overfitting to someone’s recent experience. That’s why it’s good to use a diverse set of interview questions. If someone aces the interviews because she solved all of those problems last week, I’ll take my chances and hire her!

Finally, if you have suggestions about how to interview more effectively, I’m all ears.

Like

One problem with this is that good programmers tend to avoid recursion (no data to support it though).

So while you are expecting a 10 min recursive version, the good programmer is trying to bring out a non recursive version and might fail. (Of course, non recursive version for this can be done in an interview).

btw, this is a texbook exercise problem.

I believe Sedgewick’s book (or perhaps CLR) has it.

Like

I concede that a lot of good programmers may instinctively avoid recursion, although in this case I’d say that makes them worse programmers. The whole point of learning a set of software engineering tools is to apply the right one to the right problem, and this problem is mostly naturally defined and solved recursively.

As for it being a textbook exercise, good to know. As I said in the original post, I first encountered this problem in the course of writing production software.

Like

The problem (pun intended) is that the problem is still incompletely defined. For instance, we have no idea about the expected target hardware. Does it have limited memory? Limited stack space? Does the language we are to use support recursion? etc Ok, maybe the last one (or any of them) is not relevant these days, but you get my point.

Basically we have no idea what we need to try and optimize for. Granted, a good candidate might and probably should try and clarify that, but in the face of ambiguity, good programmers follow “(instinctive) good practices” which they gained through experience etc.

For instance, you will use quadratic memory and linear stack in the recursive version. A good programmer might instinctively try avoid the cost of the stack.

But, if the goal is to optimize the time to write the code, then a recursive version will be faster (and I suppose is an implicit goal in the interviews, but almost never the case in critical production code).

So calling them worse programmers for not using recursion is not right, IMO.

btw, you seem to have ignored the cost of looking up the memoized structure. Since you are looking up the string your recursive version is cubic, inspite of the substring and dictionary lookup being O(1). Of course, that can be avoided if we represent the string for lookup by the end point indexes (i,j), rather than the string itself.

Sorry for the long post.

Like

No need to apologize — I appreciate the discussion!

And several of the points you’ve raised have come up when I’ve used this problem in interviews. I’ve seen candidates implement a stack-based approach without recursion. I’ve also had discussions about nuances of scale and performance, including whether the limited memory requires the dictionary to be stored out of code (a great motivation for using a Bloom filter) and whether the cost of creating a read-only copy of a substring is constant (i.e., represented by the end-point indexes) or linear in the length of the substring. (because the substring is actually created as a string).

So you’re right that not all good programmers will jump to recursion — though I do think that is the simplest path for most. And in an interview I urge candidates to start with the simplest solution that works. That’s not only a good idea during an interview, but a good idea in practice, to avoid premature optimization.

Regardless of the choice of interview problem, the interviewer has to be competent and flexible. I’m sure a interviewer can butcher an interview with even the best problem. But some interview problems are better than others. And I strongly favor interview problems that are based on real problem, don’t require specialized knowledge, and provide candidates options to succeed without depending entirely on the candidate arriving at a single insight.

Like

Due to this question I had interview with whitepages.com and this is why I couldn’t get the job… Great answer, hopefully I won’t see this question again in my interviews. LOL

Like

Why do topmost tech companies give more priority to algorithms during the recruitment process?…

Not all top tech companies. 🙂 At LinkedIn, we put a heavy emphasis on the ability to think through the problems we work on. For example, if someone claims expertise in machine learning, we ask them to apply it to one of our recommendation problems. A…

Like

I have a question about the memoized solution. I understand the advantage of saving the results of a dead end computation, where the code reads:

memoized.put(input, null);

But I don’t understand the advantage to memoizing here:

memoized.put(input, prefix + ” ” + segSuffix);

Since segSuffix has already been found to be not null, that means we have reached the end of the input string somewhere deeper in the call stack, and are now just unwinding back to the top? Maybe I’ve missed something, it’s late at night for me, but I can’t see it any other way right now. Thanks.

Like

Comments are closed.