Make your code readable

In this video I'm talking about readability of the code: why it matters and how to make code more readable. I'm presenting 5 small techniques which should make code easier to understand.

📚Resources:

when preparing this episode I started from learning how we, humans, read in general. I've always known  that reading is a complicated process, and reading about reading (yup!) helped me to understand both the visual and semantic aspects of readable code.

  • https://en.wikipedia.org/wiki/Subvocalization - most of us have inner monologue (kind of a voice in our head) and when we read, we vocalize the text in our mind; this process is called subvocalization
  • https://en.wikipedia.org/wiki/Saccade - when reading, our eyes do not move smoothly from one word to another, but rather jump and stop between them in so called saccadic movements
  • https://medium.com/@matryer/line-of-sight-in-code-186dd7cdea88 - post by Mat Ryer where he explains the idea of aligning the happy path to the left. This concept helped me understand better why early returns improve readability
  • https://www.poodr.com/ - "Practical Object-Oriented Design in Ruby", a book by Sandi Metz. This book is about much more than just readable code, but I recommend it to every developer, even if you do not use Ruby at all

Notes

Why is it important?

So first, why is it important to make code as easy to understand as possible? If you're working in a team, I think you already know the answer - the code that you write will be read by other people. They will read it in order to understand how the whole application works and in order to introduce some changes. As I said in the other video, we can't change what we don't understand, therefore in order to keep our team efficient, we should make it easy as possible for them to learn what we meant when we wrote our code.

But even if you work on your own project, and you get back to a certain file after a few weeks, you lose the context in which you wrote that file, you won't remember why you made certain decisions. So making your code self-explanatory or providing some context in the comments, will be very helpful!

Remember, a single line of code might be written once, and it might be read hundreds or thousands of times, so by making it easy to understand, you might be saving yourself and others a lot of time!

Techniques

Use white space

Let's start with the visual part, because it's shorter and easier.

In short: when reading anything, a newspaper, an article, a piece of code, we don't really read it in a smooth, linear fashion. Our eyes jump between words in so called saccadic movements. We also tend to stop at parts that bring our attention. For example, in an article we see headers and subheaders, we see bold font, we see some emphasized quotes.

When writing code we don't have so many ways to emphasize important parts, we can't make our code bolder or larger, we can't even set the colours, cause they depend on the reader's text editor.

What we can do is to make a wise use of white space. Proper indentation, line breaks and empty lines will let you structure your code in a way that separates different pieces from each other and emphasises which lines should be certainly looked at. Looked at a following example:

<?php
public function updateAction($slug) {
  $team = $this->getTeam($slug);
  $em = $this->get('doctrine')->getEntityManager();
  $form = $this->createForm(new TeamType(), $team);
  $request = $this->get('request');
  $this->checkUser($team);
  if ($request->getMethod() == 'POST') {
    $form->bindRequest($request);
  if ($form->isValid()) {
    $team = $form->getData();
    $em->persist($team);
    $em->flush();
    return $this->redirect($this->generateUrl('team_show', array("slug" => $slug)));
  }
  }
  return $this->render('MyApp:Teams:edit.html.twig', array("form" => $form->createView(), "team" => $team));
}
<?php

public function updateAction($slug)
{
  $team = $this->getTeam($slug);
  $em = $this->get('doctrine')->getEntityManager();
  $form = $this->createForm(new TeamType(), $team);
  $request = $this->get('request');

  $this->checkUser($team);

  if ($request->getMethod() == 'POST') {
    $form->bindRequest($request);

    if ($form->isValid()) {
      $team = $form->getData();

      $em->persist($team);
      $em->flush();

      return $this->redirect(
        $this->generateUrl('team_show', array("slug"), $slug))
      );
    }
  }

  return $this->render(
    'MyApp:Teams:edit.html.twig',
    array("form" => $form->createView(), "team" => $team)
  );
}

This is some old code I wrote years ago. These 2 functions do exactly the same thing, the only difference between them is use of white space - spaces, tabs and new lines, and it makes all the difference. The function on the right is well structured. The first 4 lines are declarations where I set up variables for future use. The next line is surrounded by empty lines which emphasises its importance - it says "look here". Further down we have proper indentation, I know that this piece of code is only executed in case of POST request , and this piece of code here is only executed when the form is valid (look at the same line on the left, the indentation suggests that 2 if statements are on the same level, but they're actually not!). Finally the remaining lines are shorter than in the other case, which also makes them easier to read (my eyes don't have to move so much from left to right).

Check for positive conditions

Ok, now we're moving more to semantic part.

When you use if/else, 2 paths might be considered equal, or one might be considered a happy path (or a success path), while the other is a failure path. In such case, the happy path should be first.

The reason why the happy path should be first is that it is the more common case and that's why it should take a priority in the flow of the code. If we call "if error then ... else ..." then we think about the "else" part as "not error". But error is already a negation of what we want, it's a negation of success, therefore we have kind of a double negation ("not not success"). This is visible especially when the if/else conditional is long and you need to spend some time reading each block of code in order to better understand it.

async function fetchNews(url) {

	let cache = new Cache();
	let today = new Date();
	let output = null;

	let latestNews = await fetch(url).then((resp) => resp.json());
	
	if(!latestNews) { // try fetching older news
		let cachedNews;
	  let cacheDate = today;
		let daysAgo = 0;
		while (daysAgo <= 4) {
			output = cache.get(`news-${cacheDate.toLocaleDataString('en-UK')}`);
			if (output) break;
			cacheDate.setDate(today.getDate() - daysAgo);
			daysAgo +=1 ;
		}
	} else {
		output = latestNews.filter((news) => news.isImportant);
	  cache.save(`news-${today.toLocaleDataString('en-UK')}`, output);
	}

	return output;

}

Of course this function has more issues than just the one that I want to highlight. But have a look at the if/else section. When I read it in my mind, I read it as "if not latest news then X else Y", but the "else" part means "not the if part" which means I need to think about "not not latest news" in my head. Just swapping these 2 parts changes a lot:

async function fetchNews(url) {

	let cache = new Cache();
	let today = new Date();
	let output = null;

	let latestNews = await fetch(url).then((resp) => resp.json());
	
	if(latestNews) { 
		output = latestNews.filter((news) => news.isImportant);
	  cache.save(`news-${today.toLocaleDataString('en-UK')}`, output);		
	} else { // try fetching older news
		let cachedNews;
	  let cacheDate = today;
		let daysAgo = 0;
		while (daysAgo <= 4) {
			output = cache.get(`news-${cacheDate.toLocaleDataString('en-UK')}`);
			if (output) break;
			cacheDate.setDate(today.getDate() - daysAgo);
			daysAgo +=1 ;
		}
	}

	return output;

}

It has the same number of lines, it has the same code, the only change is swapping the if statement and it becomes clearer. And to make it even clearer you should split it into smaller functions:

async function fetchNews(url) {

	let cache = new Cache();
	let today = new Date();
	let output = null;

	let latestNews = await fetch(url).then((resp) => resp.json());
	
	if(latestNews) { 
		output = latestNews.filter((news) => news.isImportant);
	  cache.save(`news-${today.toLocaleDataString('en-UK')}`, output);		
	} else { // try fetching older news
		output = this.getCachedNews(cache, today);
	}

	return output;

}

Return early to handle failures

A good alternative for if/else where you handle an error, is to use early return. In such case instead of doing if success then ... else ... end you can just do if error then return value end. Why does it work well? Because the if part is just a short distraction, and does not disrupt the main flow of the function. Let's have a look at example!

We have a checkout class that has 1 public method, which performs a checkout in some kind of imaginary e-commerce setting. The method has 3 if/else conditionals:

  • first we check whether user has enough balance (some kind of credits or points)
  • then we check whether the items are still available
  • finally we create an order and see if it was successful

In the first version of this class, the happy path is represented in 3 nested ifs and it's quite difficult to understand which else refers to which if.

class Status {
  public success: boolean;
  public errorMessage: string;

  constructor(success: boolean, errorMessage: string) {
    this.success = success;
    this.errorMessage = errorMessage;
  }
}

class Checkout {

  public run(currentUser: any, cart: any): Status {
    const totalPrice: number = cart.calculateTotalprice();
    let orderStatus: Status = null;

    if (currentUser.balance >= totalPrice) {
      if (cart.allProductsAvailable()) {
        if (this.createOrder(currentUser, cart)) {
          currentUser.deductBalance(totalPrice);
          orderStatus = new Status(true, null);
        } else {
          orderStatus = new Status(false, "order creating failed");
        }
      } else {
        orderStatus = new Status(false, "some products are unavailable");
      }
    } else {
      orderStatus = new Status(false, "not enough money");
    }

    return orderStatus;
  }

  private createOrder(_currentUser: any, _cart: any): boolean {
    return true;
  }

}

The second version is rewritten to return as soon as we notice a failure. If user does not have enough balance, we return a failure. Then if some product is not available, we return failure, etc.

So you can see that the flow of the function is much clearer, and the happy path is in the main flow, it's not hidden in some nested conditional, it's there at the bottom of the function. After we handled the possible failures, we just continue with the happy path.

class Status {
  public success: boolean;
  public errorMessage: string;

  constructor(success: boolean, errorMessage: string) {
    this.success = success;
    this.errorMessage = errorMessage;
  }
}

class Checkout {

  public run(currentUser: any, cart: any): Status {
    const totalPrice: number = cart.calculateTotalprice();

    if (currentUser.balance < totalPrice) {
      return new Status(false, "not enough money");
    }

    if (!cart.allProductsAvailable()) {
      return new Status(false, "some products are unavailable");
    }

    if (!this.createOrder(currentUser, cart)) {
      return new Status(false, "order creating failed");
    }

    currentUser.deductBalance(totalPrice);

    return new Status(true, null);
  }

  private createOrder(_currentUser: any, _cart: any): boolean {
    return true;
  }

}

Avoid reassigning variables

Reassigning variables is extremely common in programming and I don't want to say that it's always a bad idea. I want to say though that usually it is a bad idea.

What is reassigning?

Let's say you have a variable called "categories" where you store information about all the product categories in your store

let categories = ["electronics", "household", "food"]

Reassigning variable means that you change it's value, for example you want to have only categories that have at least one product

let categories = ["electronics", "household", "food"]
categories = categories.filter((cat) => Product.withCategory(cat).length > 0)

Why do I recommend avoiding reassigning variables then?

The main reason is that in many cases, once you assign certain value to a variable, it represents some idea. If you later change the value of that variable, the idea that it represents might change as well.

In the example above, in the first line, categories variable represents all the categories. However in the next line, the same variables represents a different concept - categories with available products.

If your function has just 2 lines, that's fine, it's easy to understand. But what if there are more lines in between? When you want to modify such code, you need to be aware what currently is represents this variable: is it a list of all categories, or a list of active categories?

There are 2 simple solutions to this issue:

  1. Create 2 variables
let categories = ["electronics", "household", "food"]
let activeCategories = categories.filter((cat) => Product.withCategory(cat).length > 0)
  1. Create variable once and do not change
let categories = ["electronics", "household", "food"]
	.filter((cat) => Product.withCategory(cat).length > 0)

The first option is great when you need to use both values, the 2nd one works better when you use just one value

A more extreme example would look like this:

query = "SELECT p.name, p.price, c.name as category_name FROM products p"
query += " LEFT JOIN categories c"
query += " ON c.id = p.category_id"
query += " ORDER BY p.price ASC"
query += " LIMIT 20"

In this case you want to construct a query. In the first line you have a variable that represents a query, but not the full query, it's just some incomplete SQL. So why do you need a variable to represent such value at all? Instead just combine all these strings together:

query = "SELECT p.name, p.price, c.name as category_name FROM products p" +
  " LEFT JOIN categories c" +
  " ON c.id = p.category_id" +
  " ORDER BY p.price ASC" +
  " LIMIT 20"

There are situation when reassigning variable is a good idea though, for example when you implement a game of chess and you have a variable called currentPlayer.  Initially you can set it to 0 to say that first player is making a move, then change to 1 when it's 2nd player's turn, then to 0 again, 1 again etc.

This works because even though you change the value of the variable, it still represents the same idea - it always represents the current player. The reality of the game changes, therefore the value of the variable changes

function playChess(board, currentPlayer) {
  while(true) {
    makeMove(board, currentPlayer);
    if (checkmate(board)) {
      return `player ${currentPlayer + 1} won!`;
    }
    currentPlayer = (currentPlayer === 0) ? 1 : 0;
  }
}

Never call variable "data"

Naming is one of the hardest technical problems in programming, one day I'll record a whole episode about naming, but for a simple trick: if you ever see a variable named "data", rename it. "Data" is such a generic term that it could mean anything. Literally every variable represents some data. Using this name makes it very confusing.

Seriously, you look at a variable called "currentPlayer" - you get what it means, you look at the variable called "userCart" - you can imagine what it means, you look at a variable called "data" and you have no damn idea, it can mean literally anything.