Preparation + tips for the System design interviews

The system design interview is standard for Senior+ software engineer candidates. It often carries even more weight than the coding interview.

How does one do well in a system design round? A lot of sweat and failures went into figuring out what really matters and what doesn’t. My current preparation strategy, successfully applied in my 2023 job search cycle, included the following elements:

(A) Work through a great System Design course,
(B) Sharpen the story-telling skills,
(C) Do lots of practice,
(D) Omit what doesn’t help.

Let’s discuss them in detail.

(A) Several paid system design courses are currently available, but not all of them are equally good. Common pitfalls are going into too much detail and reproducing the actual system instead of making generic design choices. I loved the first iteration of the “Grokking the System Design Interview” course, but the current 2023 iteration suffers from the above pitfalls. “System Design Interview” by Alex Xu, on the other hand, is concise and to the point.

(B) The system design interview is all about telling a story. First, you zero in on a story line by asking clarifying questions. Then, you methodically add the necessary components, tying them to the story line. For example: “We send notifications to users based on their subscriptions. Therefore, we need a subscription service to capture user preferences.” Don’t forget character development: “Red Riding Hood didn’t have much money, but she didn’t need to maintain foreign key relationships and had a simple schema. Therefore, she went for MongoDB instead of MariaDB to store data about her trips through the forest.” Do write your story down alongside drawing the diagram.

(C) A good storyteller is made, not born. It helps to some extent to listen to others explaining their designs. However, the main benefit comes from doing it yourself. Good news: there are only ~30 distinct system design problems. Given sufficient time, one can work through them all and write a story for each. Do maintain your own library of those designs, e.g. with diagrams stored in Google Drive. Bad news: storytelling skills atrophy without practice. Redo at least some of those designs before each interview cycle.

(D) I found some pieces traditionally included in the study guides to be almost never needed. First, you don’t need to draw many boxes on the path from the client to the target microservice. A single box encapsulating Load Balancer + API gateway + GraphQL API is sufficient. Going into too much detail early on will easily derail the interview.

Second, doing scale estimates is often counterproductive to telling a story. Just make sure to create a scalable design + mention the scalability.

I started omitting the API design altogether. The list of actions supported by the target system is more general (e.g., a poll against a third-party service is not an API) and covers the APIs very well too.

How to prepare for the Coding interviews.

How does one do well in the coding interviews? The best results are achieved with a systematic approach:

(A) learn and review the fundamentals,
(B) work through common algorithms and their variations,
(C) do targeted practice,
(D) nail the other pieces.

What do these involve? Here is my take from my coding interviews in early 2023.

(A) How does one learn the fundamentals? 
Learn “Algorithms” by Sedgewick and Wayne. A combination of the classic book and the Coursera courses (Part I and II) is perfect for building a solid foundation from scratch.
A shorter and very applied resource is the Data Structures and Algorithms (DSA) course on Leetcode. Well worth the money.

(B) How does one source and work through common algorithms and their variations?
The aforementioned DSA course on Leetcode is truly amazing. Its text format is the best for reviewing right before the interviews. The variations of the common algorithms are mostly sourced by working through reference problems. Do write your own implementations for several reference problems per topic to have a very clear picture of how you will implement them in an interview. Figure out all decision points, e.g. recursive DFS for backtracking, BFS for shortest path, return type vs parameters for tree traversal, where and how you mark the visited graph nodes, TreeSet vs PriorityQueue usage in Java, your implementation for Tries and Disjoint Sets. Identify easy-to-implement solutions that don’t sacrifice time complexity.
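To make one of those decision points concrete, here is a minimal recursive-DFS backtracking template in Java, shown on subset generation. The choose / explore / un-choose skeleton is the part that transfers to permutations and other constraint problems; the class and method names are just illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Backtracking reference template: recursive DFS generating all subsets.
public class Subsets {
    public static List<List<Integer>> subsets(int[] nums) {
        List<List<Integer>> result = new ArrayList<>();
        dfs(nums, 0, new ArrayList<>(), result);
        return result;
    }

    private static void dfs(int[] nums, int start, List<Integer> path,
                            List<List<Integer>> result) {
        result.add(new ArrayList<>(path));  // record the current partial subset
        for (int i = start; i < nums.length; i++) {
            path.add(nums[i]);              // choose
            dfs(nums, i + 1, path, result); // explore
            path.remove(path.size() - 1);   // un-choose (backtrack)
        }
    }
}
```

Writing such a template once per topic is exactly the kind of decision (recursion vs iteration, where to copy the path) worth settling before the interview.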

Mastery of the collections API for your language of choice really helps with correctness / clarity. computeIfAbsent, merge, remove(key, value), and Map.of for Maps in Java save you time and debugging headaches.
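As a sketch of why these shortcuts matter, here are two common interview patterns (character counting and index grouping, chosen here only as illustrations) each collapsed to one call:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Java Map API shortcuts that replace verbose get/put boilerplate.
public class MapShortcuts {
    // Frequency count: merge() initializes and accumulates in one call.
    public static Map<Character, Integer> charCounts(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    // Grouping: computeIfAbsent() creates the list on first sight of a key.
    public static Map<Character, List<Integer>> charPositions(String s) {
        Map<Character, List<Integer>> positions = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            positions.computeIfAbsent(s.charAt(i), k -> new ArrayList<>()).add(i);
        }
        return positions;
    }
}
```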

Ironically, the array-based problems can be both the simplest and the hardest. Learn array-specific algorithms like dynamically changing the iteration limits (LC 1326) and the forward + backward pass technique (LC 42).

(C) Solve at least 2 new medium / hard problems on Leetcode every day. Read through solutions to more problems. Leetcode Premium enables access to lists of problems tagged by company. Glassdoor provides some insight into coding problems at smaller companies. I was “lucky” with both tools. Consistency is your friend here: don’t stop preparing till the day / time of your interview.

(D) The algorithm / implementation is only 50-70% of the score. Ever wondered why you wrote a working solution in 10 minutes and still got poor feedback? You probably missed out on the remaining 30-50%, which includes:

  • asking many clarifying questions, identifying and handling the edge cases,
  • talking through your intentions and major implementation parts,
  • correct time / space complexity,
  • modularity = utility methods / classes, no repeated code,
  • clean code = all constants defined, lines split into shorter pieces, cohesive parts split into methods (do read “Clean Code” book),
  • overall near production quality code.

Last but not least, you need to enjoy it, at least somewhat! Otherwise you’re going to be miserable in the process, give up easily, and not realize your full potential.

How to prepare for the behavioral interview.

Almost every company hiring software engineers conducts a dedicated assessment of candidate soft skills known as a behavioral interview. Unlike coding and design interviews, the behavioral interview goes by many names: “leadership skills”, “culture fit”, “just talking to the hiring manager – no need to prepare”. Except… you always need to prepare. Talking to Netflix too early in my 2023 interview cycle got me rejected even before the technical phone screen!

How does one prepare? One needs to:

(A) Internalize the STAR method,
(B) Find a representative set of behavioral questions + leadership principles,
(C) Write down stories from professional life,
(D) Illustrate the questions with the stories.

(A) Whether I am interviewing candidates or asking hiring managers questions as a candidate myself, I can immediately see the stark difference between people who have internalized the STAR method and those who haven’t. The STAR method (Situation, Task, Action, Result) turns every example into a mini-story by setting the stage, describing what you did, and showing how it influenced the team / project / company. If the outcome is negative, then the learnings must be included as well.

(B) Where do you get those behavioral + leadership questions? The behavioral questions are quite generic and can be sourced even from non-technical prep guides. Over the years, I used the “Knock ’em Dead” series by Martin Yate. In turn, the most extensive set of leadership principles I’ve seen was written by Amazon. A simple Google search also turns up quite a few behavioral questions. My personal set contains ~100 in total.

(C) Remembering the stories and writing them down with the STAR method is the most labor-intensive part. But the more stories, the better. With stories spanning a substantial breadth of situations, you are more likely to find answers to curveball questions like “Tell me about a time when you made a product impact through others without being directly assigned to it”. I personally identified 20 stories from Twitter, 14 from Bandwidth, and 10 more from other employers and personal projects. Not all stories are good for the interview, as they might not project the target image (maybe you were burnt out and couldn’t finish the project), but I found the stories with a negative outcome and substantial learnings to be the most powerful. After all, the path to success is paved with failures.

(D) After the stories are written and the questions are sourced, it’s time to tie the two together with a brief explanation of how each story illustrates the question. In his books, Martin Yate also describes the image one needs to project. Make sure to select the stories supporting that image.

Maintaining a many-to-many relationship between the stories and the questions makes it easy to move between them as needed and to illustrate multiple questions with the same story. (A relational schema in real life!)

Good luck with your interviews!

Coding interviews as a proxy for SWE job performance

Coding interviews are administered by companies as a proxy for the future job performance of software engineering candidates. However, it is highly uncommon for a software engineer to write tree traversals and graph algorithms from scratch on the day job. The day job focuses more on correct business logic, good software design, and clean code, often via incremental additions and improvements. So, are coding interviews a good proxy for job performance? My answer is still “yes”.

First, the coding interviews deeply test project management and execution skills, which are core to any software job. Preparation for the coding interviews is a project that needs to be handled with care. The topics covered by this project: how to learn and practice the set of coding problems per company, how to gain a working knowledge of the fundamentals and the muscle memory to apply them, how to be efficient with practice and review, and how to be consistent in your efforts. Great execution of this project invariably leads to stellar interview performance and shows that you can get things done.

Second, there are the elements of clean code and good software design showcased and judged in the coding interviews. Executing the preparation project well often gives a candidate the confidence and the extra time to write clean, well-encapsulated code. Doing so always impresses both your interviewer and your code reviewers on the day job: separating constants, splitting long lines, splitting independent pieces of logic into their own methods, proper method naming.

Third, a new wave of coding interview practices and tools encourages reaching correct business logic, which tests the debugging skills. Debugging is a critical part of a coder’s day job. Almost all companies in 2023 (apart from Google) want you to run your code and pass a basic set of test cases. It is very important to learn to quickly reverse engineer from an observed failure to a bug in your business logic, especially on the day job.

Fourth, software engineering is a team effort, with collaboration and communication skills being as important as coding itself. Those soft skills are easily assessed during the interview. A structured approach of writing down the functional requirements, checking your algorithm against the interviewer’s expectations, and discussing the implementation waypoints + the tradeoffs shows, especially for senior candidates, clarity of thought, flexibility, stakeholder communication, and mentorship skills. You need to be great to work with.

So much to unpack! Did the text make you rethink your preparation strategy? Please, leave your thoughts in the comments.

Smart Video Filter project

After 2 years and 3 months of effort, I finally released the Smart Video Filter project. It started as a long-held hope of mine from several years ago: as a YouTube viewer, I should be able to filter only universally liked videos with a high ratio of likes to dislikes. The YouTube search results and recommendations often contain objectively bad videos, e.g. ones that are low quality, take 10% of the screen, contain static pictures, are offensive/cruel, or advertise some product with an unrelated video. All those videos get a lot of dislikes and relatively few likes, so their ratio of likes to dislikes is low. The videos I would really like to watch are the ones with a very high likes-to-dislikes ratio. These videos are objectively amazing and out of this world. Hence the idea of filtered search, where for any combination of search terms the user can filter only the videos with a high enough “ratio”. The idea can be trivially extended to ratios of likes to views, comments to views, dislikes to views, etc.

A small concern in the above approach is that videos of similar quality with fewer views tend to naturally have higher ratios of likes to dislikes (or likes to views). The videos with fewer views tend to be watched by subscribers/aficionados/lovers. As a video gains popularity, it gets exposed to a wider audience, which typically likes it less, and the ratio goes down. The primary goal is to find the best of all videos. So, for each number of views, we can check how many videos have a higher ratio of, e.g., likes to views, and only return the results in the top x% of that ratio. A uniform sample of videos across the whole range of views is then readily obtained by selecting videos in the top x% (of some ratio) without specifying the target number of views. That is precisely the idea behind Smart Video Filter.
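The top-x% idea can be sketched as follows. This is an illustrative in-memory version with a hypothetical Video record; the production service computes such percentiles inside its datastore rather than in application code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the "top x% by like/view ratio" filter.
public class RatioFilter {
    record Video(String id, long likes, long views) {
        double ratio() { return views == 0 ? 0.0 : (double) likes / views; }
    }

    // Keep only the videos whose like/view ratio is in the top `percent` %.
    static List<Video> topPercent(List<Video> videos, double percent) {
        List<Video> sorted = videos.stream()
                .sorted(Comparator.comparingDouble(Video::ratio).reversed())
                .collect(Collectors.toList());
        int keep = (int) Math.ceil(sorted.size() * percent / 100.0);
        return sorted.subList(0, keep);
    }
}
```

Because the cut-off is relative (a percentile) rather than an absolute ratio threshold, low-view and high-view videos are compared fairly within the same query.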

Though the idea appears simple, the implementation took many evenings over a long period of time. The project is quite rich in required expertise: UX design, architecture, back-end engineering, front-end engineering, and DevOps skills. The technologies include the NoSQL databases MongoDB and Elasticsearch, the front-end technologies Angular 5 and Angular Material from Google, and CentOS 7 administration and cluster administration. The project went through several stages. The initial UI was written in AngularJS and then rewritten in Angular 5. The initial backend datastore included PostgreSQL, which was unable to support the required I/O loads and was switched to MongoDB. Initially the project used Elasticsearch 2.1 and then gradually migrated up to Elasticsearch 6.2. Retrieval and refresh of video and channel metadata need to rely on non-trivial algorithms to be efficient and not use up the daily quota before noon. All that relies on a heavily multi-threaded and fault-tolerant Java backend. The underlying cluster implements high availability: after a complete loss of one machine, all systems still work. YouTube’s terms of service and developer policies are quite strict and took a while to comply with. Recently, I got an audit review from the YouTube compliance team, and, well, they aren’t shutting me down yet.

I’m greatly enjoying the final implementation myself, searching for the best videos and filtering over duration in a fine-grained way. My long-term hope came true! Even now I got heavily distracted using the service instead of writing the post. The 46:1 ratio video of Donald Trump “singing” Shake It Off by Taylor Swift is an amazing composition! The project has obvious ways to improve to substitute the YouTube original search even more: provide recommendations and play videos on the site without redirecting to YouTube. However, Smart Video Filter is a “small market share” application. If the “market maker” YouTube itself were to implement it, then lots of videos would not be regularly shown in search results/recommendations, which would discourage the content creators. Hope you enjoy this niche service as much as I enjoy it myself!

HA (high-availability) setup for Smart Video Filter

With high expectations of website and service availability in 2018, it is especially important to ensure that redundant DR (disaster recovery) copies of the service are running at all times and are ready to take on the full PROD load within seconds. Hosting companies like Amazon have long solved this problem for standard services, e.g. for an Elasticsearch cluster. Since the cluster always runs with 1 or more replicas, a replica node is ready to take over for a short period of time, till a new primary is spun up and synced after the primary failure. A level of abstraction such as Kubernetes also allows creating a high-availability service.

With all available options, what should we use in the real world? It depends on the budget and the available hardware. My recently released service, Smart Video Filter, is a low-budget solution working on 2 physical machines running CentOS 7. Two enterprise-grade SSDs with a large TBW (terabytes written) resource are substantially cheaper than AWS in terms of storage cost and provisioned IOPS cost. It is recommended to run HA setups with 3 machines, but 2 machines (PROD and DR) provide enough reliability and redundancy in most cases. Four different services on those machines needed to seamlessly switch between PROD and DR: ElasticSearch, MongoDB, the Mining Service, and the Search Service.

The ElasticSearch setup over 2 machines includes creating a cluster with 1 primary and 1 replica. ElasticSearch reads and writes happily proceed even if one of those nodes is down. No special setup is necessary.

The MongoDB setup on 2 nodes is trickier. MongoDB has protection against the split-brain condition: the cluster does not allow writes if no primary is chosen, a primary can only be chosen by a majority of nodes, and there is no majority with 1 out of 2 nodes down. Adding an Arbiter instance is recommended in such cases. However, a simple arbiter setup isn’t going to work if the arbiter is deployed on one of the data nodes: if the entire node goes down, it takes the arbiter down with it. What I ended up implementing is a workaround of the split-brain protection, where the MongoDB config is overwritten by the mining service. The mining service provides an independent confirmation that one of the data nodes is dead, adds an Arbiter on a different machine to the cluster, and removes the arbiter running on the same machine as the failed data node. Node health detection by the mining service is described below.

The Search Service makes use of a health API. One instance of the service is deployed on each of the PROD and DR nodes. Each instance exposes a RESTful endpoint with a predictable response, which consists simply of the string “alive”. Each instance also runs a client reading this status from itself and from the other node. When both nodes are alive, the PROD node serves the traffic. When the DR node detects that it is alive but the PROD node is not, it takes over. Each node is self-aware: it detects its role by comparing its static IP (within the LAN) to the defined IPs of the PROD and DR nodes. When a node takes over, it uses port triggering on the router to direct future external requests to itself. Testing showed that port triggering can switch the routed node within seconds.
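The takeover rule above can be sketched as follows. This is an illustrative Java sketch, not the service's actual code; the URL scheme and class names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of the PROD/DR health check: each node serves GET /health
// returning the literal string "alive" and polls itself and its peer.
public class FailoverMonitor {
    static boolean isAlive(HttpClient client, String url) {
        try {
            HttpResponse<String> resp = client.send(
                    HttpRequest.newBuilder(URI.create(url)).build(),
                    HttpResponse.BodyHandlers.ofString());
            return resp.statusCode() == 200 && "alive".equals(resp.body());
        } catch (IOException | InterruptedException e) {
            return false;   // unreachable or slow node counts as dead
        }
    }

    // The DR node takes over only when it is alive and PROD is not;
    // while PROD is alive, PROD always serves the traffic.
    static boolean drShouldTakeOver(boolean prodAlive, boolean drAlive) {
        return drAlive && !prodAlive;
    }
}
```

In the real setup the positive decision then triggers the router's port-triggering switch described above.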

The Mining Service employs the same health API plus another external API, which reports whether a job is running. When the PROD or DR node is ready to take over, it lets the current job finish on the other node before scheduling jobs on itself. Jobs should not run on the PROD and DR nodes simultaneously. The health detection also helps to switch the MongoDB Arbiter between nodes to ensure MongoDB can elect a primary.

After the full setup is implemented, the system keeps functioning correctly, with disruptions of only several seconds, when one of the machines goes down entirely. This was readily demonstrated in testing, when all services remained highly available throughout a rolling restart of the 2 machines!

Specialization Review – Leading People and Teams (University of Michigan)

Here is my review of the “Leading People and Teams” specialization, taken on Coursera from Aug 2017 till Jan 2018. Courses in this specialization are rated very highly, between 4.5 and 5.0. I passed with an average grade of 99.0%.
The specialization consists of 3 courses focusing on leadership and teamwork, 1 course emphasizing the human resources (HR) side of management, and a capstone project.

The first course, “Inspiring and Motivating Individuals“, is quite inspirational indeed. Surprising research evidence suggests that most employees around the world are not engaged/motivated at work, and many of them are even actively disengaged. The course outlines the origins of the meaning of work, the importance of company vision and engagement, the drivers of people’s motivation, and the ways to align the employees with the company’s goals.

The second course, “Managing Talent“, is aimed primarily at managers conducting onboarding, managing performance and evaluations, coaching team members, and maintaining continuity of talent. Research shows that managers play a crucial role in personnel turnover. A variety of organizational behavior effects and biases are discussed, such as the Dunning-Kruger effect, availability error, racial bias, and gender bias. Knowledge from this course might, just like CSM and PMP certifications, backfire in startups or companies without a rigid structure, where many of the standard techniques are not followed.

The third course, “Influencing People“, is probably the most practical of the specialization. It outlines the bases of power and the bases of strong relationships with people, and goes into great depth with examples. The course offers practical advice on how to positively interact with colleagues, how to build relationships, and how to gain influence, while protecting oneself from unwanted influence. Expert knowledge, information power, and referent power are presented as means of influencing without formal authority. The material assumes a workplace in the US, which provides great insight into the informal expectations for immigrant workers. E.g., the expected level of socializing at the workplace differs around the world, and is somewhat higher than average in the US.

The fourth course, “Leading Teams”, takes it to the higher level of team dynamics. It provides practical advice for improving teamwork, coordination, output, and overall happiness. The course discusses topics such as team structure, team size, and subteams and splits based on demographics/similarity. Coordination problems and common decision-making flaws are emphasized, and prevention methods are presented. Psychological safety is presented as a cornerstone of team performance. Team charters and team norms are discussed. Performance-oriented vs. learning-oriented mindsets are shown to produce different outcomes.

The final capstone project, “Leading People and Teams Capstone”, is automatically graded as a pass. It offers 3 options for improving leadership skills: (1) solve a real-world leadership business case, (2) take on a leadership challenge at work, or (3) interview a business leader to gain insight into their practices. Option (2) is probably best aligned with the main goal of the course: to improve the learner’s leadership skills.

Overall, I had a great experience taking the specialization. It emphasizes that leadership skills are not something a person is born with; they should be, and readily are, acquired through systematic work. A lot of the material is focused on leading without formal authority, which is especially helpful to members of self-organizing Scrum teams in the software industry. The courses are filled with real-life stories and interviews with people from the industry, which help solidify the concepts. Many pieces of homework are peer-graded. The assignments of others provide insight into the ideas, styles, and techniques of people at various stages of the career ladder. Those techniques summarize the real-life experiences of people managing their subordinates, resolving conflicts, and influencing their teams, which might not otherwise be accessible to the learners.

The specialization is taught by instructors from the University of Michigan, Ross School of Business: Scott DeRue, Full Professor, business school Dean; Maxym Sych, Associate Professor; Cheri Alexander, Chief Innovation Officer. All three are charismatic, knowledgeable, and great presenters. The material is delivered very coherently and to the point. The lecture slides are very detailed and are great for returning to the material in the future.

Spring Boot + Angular 4

Modern web applications must rely on the best services frameworks and the best user interface frameworks to be reliable, versatile, and easy to develop and maintain. That is why many software development teams choose Spring Boot for the services layer and Angular for the UI layer. Sustainable practices for continuous integration and development of these layers are key to the productivity of your team. Two more choices the team needs to make are the Integrated Development Environment (IDE) and a build automation tool. The de facto leading choices are, respectively, IntelliJ IDEA, with top-of-the-line support for both backend and UI, and Maven, traditionally used for the backend, with multiple plugins for the UI.

Let me describe my latest setup for a project with Spring Boot 2.0.0, Angular 4.2, and Maven 3.3.9 using IntelliJ IDEA 2017.2.3. The backend code resides in its own (server) Maven module, while the UI code resides in a separate (client) Maven module. A module, parent to both, provides easy means of building the entire application. An Angular 4 project has its own dependency management with npm, but it can readily be integrated with Maven using the frontend-maven-plugin. I develop on Windows 10, but the steps should be practically the same for other OSes.

Part 1. Front-end module.

  1. Choose a distribution of NodeJS and install it on your machine. I installed v6.11.2.
  2. The installed NodeJS contains the “npm” executable. Check its version with “npm -v”. My version is 3.10.10.
  3. Install Angular command line interface package globally by running as Administrator “npm install -g @angular/cli”. The previous version of this package exists under “angular-cli” name – do NOT install that one, as it only supports Angular 2.
  4. Create a maven module in IntelliJ for the UI (mine is named “search-client”) with a “pom.xml” file, but without any other files or directories.
  5. Populate the “search-client” module with a UI template by executing “ng new search-client --skip-git” in the folder parent to the “search-client” folder. I have a separate version control repository and prefer to skip the provided git integration.
  6. Merge existing Angular 4 files into “search-client” project or write your UI from scratch, integrate with version control of choice.
  7. Open “package.json” and define a command for prod compilation:
    "scripts": {
      "prod": "ng build --prod --env=prod"
    }
  8. Use the following setup in “pom.xml” for search-client in <build><plugins> section. Match NodeJS and npm versions to the ones discovered above. “npm install” execution can be commented out after the first run to save time.
          <id>install node and npm</id>
          <id>npm install</id>
            <arguments>run-script prod</arguments>
  9. Take note of the compilation output directory specified in the “.angular-cli.json” file under the option “apps -> outDir” and use it in the pom.xml <build> section as a resource.
  10. Execution of “mvn clean install” produces a jar file “search-client-1.0-SNAPSHOT.jar” in the local repository containing the compiled frontend code. The command takes ~20 seconds for a new project on the second and subsequent runs.
  11. Define a new “npm” Run/Debug Configuration in IntelliJ to run UI code in development mode: Run -> Edit Configurations -> “+”; then choose correct path to “package.json” file; Command -> run; Scripts -> ng; Arguments -> serve; choose a node interpreter, global one is fine.
  12. Run this configuration and open http://localhost:4200 in a browser. Try modifying Typescript, JS, CSS, or HTML files and observe how the displayed pages change.
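For reference, the abbreviated snippet in step 8 might expand to a frontend-maven-plugin configuration along these lines. The plugin version and exact execution layout are assumptions to check against the plugin’s documentation; the node/npm versions match the ones installed above.

```xml
<plugin>
  <groupId>com.github.eirslett</groupId>
  <artifactId>frontend-maven-plugin</artifactId>
  <version>1.6</version>
  <configuration>
    <nodeVersion>v6.11.2</nodeVersion>
    <npmVersion>3.10.10</npmVersion>
  </configuration>
  <executions>
    <execution>
      <id>install node and npm</id>
      <goals><goal>install-node-and-npm</goal></goals>
    </execution>
    <execution>
      <id>npm install</id>
      <goals><goal>npm</goal></goals>
      <!-- default arguments are "install"; comment out after the first run -->
    </execution>
    <execution>
      <id>npm run prod</id>
      <goals><goal>npm</goal></goals>
      <configuration>
        <arguments>run-script prod</arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```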

Part 2. Integration with backend module.

  1. In our backend module (named “search-server”), declare a dependency on the frontend module in the “pom.xml” file. Apart from providing access to the UI code, this ensures that the UI code builds before the backend code.
  2. Here the property ${} is defined in a parent module (“search-parent”)
  3. Define a path to the UI files in the “target” folder of compiled code in the “search-server” module.
  4. Use “maven-dependency-plugin” to unpack UI files into the target resources folder. Here the version of the plugin is managed by “spring-boot-starter-parent” project, which is the parent of “search-parent”.
  5. Create a Run configuration in IntelliJ to execute the main application class, e.g. annotated with @SpringBootApplication or @ComponentScan or @EnableAutoConfiguration. After running “mvn install” and starting this configuration, the UI entry point, e.g. “index.html”, will be accessible at the specified application root, port (and host).
  6. An executable “jar” or “war” file is readily produced with the “spring-boot-maven-plugin” used in the <build><plugins> section, where the jar file of the search-client dependency is excluded.
  7. An additional binding for “maven-clean-package” helps to refresh the UI of a running application kicked off by starting the main application class configuration from step 5. For that, run “mvn install” on the client module and “mvn install” on the server module (or run “mvn install” on the parent module instead of both). The new UI will load upon refreshing the browser page. The application doesn’t need to be stopped, and no “clean” goal needs to be issued.
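Step 6’s exclusion might look roughly like the following spring-boot-maven-plugin configuration. The group id “com.example” is a placeholder; match it to your actual project’s group id.

```xml
<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>
        <!-- placeholder group id: use your project's own -->
        <groupId>com.example</groupId>
        <artifactId>search-client</artifactId>
      </exclude>
    </excludes>
  </configuration>
</plugin>
```

The UI files still reach the final artifact through the maven-dependency-plugin unpack step; only the raw client jar itself is kept out of the executable archive.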

The provided instructions are optimized for developer experience and compilation time, e.g. one doesn’t have to “clean” each time. One can find similar setups online, though a notable discussion I found operates with an old version of the Angular CLI plugin. While a global installation of “angular-cli” is not necessary, some suggest using only a locally installed NodeJS to do the compilation. Then, however, one can’t readily generate the UI project template. Another GitHub project uses a setup very similar to mine.