Sort it out, JS πŸ˜’

December 12, 2019

4 min read β˜•οΈ

Photo by Sophie Elvis on Unsplash

Photo by Sophie Elvis on Unsplash

Fell for the ol' default JavaScript sorting algorithm trap when working with numbers. 😲


It's been a little while since my last post (about a month and a half). This wasn't intentional since I have a lot of things in flight right now. That being said, I do have a lot of things in the pipeline for upcoming posts.

Tonight I made a script to automate the boring stuff when creating a blog post. It's a small-ish node driven command line script that scaffolds out a new blog post using a template. In my website's repo, like most (if not all) Gatsby driven static sites, my posts are located under content/blog. The directory names for each are prefixed with a zero-padded index, followed by a kebab-cased path (which is generally similar to the title and happens to be what makes up the slug).

For example, 0005-git-commit-m-write-more, which maps to this post.

For each new post I make, I simply increment that prefix number for superficial sorting within my content/blog directory. While I do order them by date on my site, I like seeing my directory listed in posted order, since without the number prefix, the order is alphabetical. While alphabetical is fine, I'm gaming the system since numbers are "alphabetically before" letters.

Sorted blog directory

Sorted blog directory

Given the above context, my next blog post will be prefixed with "0009".

Who Wants to be a JavaScriptonaire?

So, what did I fall for? Well, to explain, let's play a game. The stakes are high. The prize... knawwwledge. Let's consider the following code snippet:

const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];

const sortedList = list.sort();
Sorting a list of numbers in JavaScript.

What will the result of the sortedList variable be?

  • (A): [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  • (B): [1, 10, 2, 3, 4, 5, 6, 7, 8, 9]
  • (C): "Bush did 911"
  • (D): None of the above

I'll wait... No peeking πŸ™ˆ. Or, you can cheat and scroll down to the answer πŸ˜‰

countdown from 60 seconds

Countdown from 60 seconds ⏳

Array.prototype.sort(of?)

Let's go to my favorite site south of Reddit, MDN:

The sort() method sorts the elements of an array in place and returns the sorted array. The default sort order is ascending, built upon converting the elements into strings, then comparing their sequences of UTF-16 code units values.

Emphasis mine. Did you catch it? While there are plenty of things to talk about here (like sorting in place), let's focus on what happens when you don't provide your comparator function. Let's .reverse() and replay that back:

The default sort order is ascending, built upon converting the elements into strings, then comparing their sequences of UTF-16 code unit values.

So, the following list of numbers from ten down to one will be converted into strings when sorted. Let's pretend we can see pseudo-intermediary steps, so that:

const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];

const sortedList = list.sort();
Sorting a list of numbers in JavaScript.

turns into something like:

const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];

// convert list of numbers to strings
const listAsStrings = ['10', '9', '8', '7', '6', '5', '4', '3', '2', '1'];

// sort the list, convert back into numbers
const sortedList = listAsStrings.sort().map((number) => parseInt(number, 10));
Sorting a list of numbers in JavaScript, but with pretend intermediary steps shown.

Again, the default .sort() behavior is:

comparing their sequences of UTF-16 code units values

which means that a comparison between two strings in the list is done character by character (really, code point by code point) until one of them is considered alphabetically before the other, e.g. "abc" comes alphabetically before "xyz".


The Correct Answer Is...

If you answered (A), you're...

...INCORRECT!

If you answered (B), you're...

...CORRECT! sadly... 😒

Surely you can't be serious.

I am serious and don't call me Shirley.

I am serious and don't call me Shirley.

The Problematic Code

The reason why this bit me is that I initially wrote the code to calculate the next prefix number using .sort().

Here's the old way I wrote it:

function newPostNumber() {
  const postNumbers = fs
    .readdirSync(blogsPath) // read the content/blog directory, like `ls`
    .map((filename) => parseInt(filename.split('-')[0], 10)) // get the prefix, as a number
    .filter(Boolean) // remove any falsey things, like things that don't match our pattern
    .sort() // sort the list in order
    .reverse(); // reverse the list so I can pick the first one (last number)

  // increment the last number used by one
  const postNumber = postNumbers[0] + 1;

  // pad the post number with up to 3 zeros (4 number slots)
  return `${postNumber}`.padStart(4, '0');
}
Problematic code

I used .sort().

So, the list of numbers was converted to strings and then... well, you know the rest. After "0009", I would have been stuck generating "0010-..." directories! 😳

The Fix

To fix the above code, all I needed to do was provide my own comparator function:

 - .sort()
 + .sort((lhs, rhs) => lhs - rhs)
Custom comparator function

This comparator function returns true when the first argument is less than the second, forcing a swap between the two. In other words, we're incrementing in ascending order, what we originally wanted.

function newPostNumber() {
  const postNumbers = fs
    .readdirSync(blogsPath)
    .map((filename) => parseInt(filename.split('-')[0], 10))
    .filter(Boolean)
    .sort((l, r) => l - r)
    .reverse();

  const postNumber = postNumbers[0] + 1;

  return `${postNumber}`.padStart(4, '0');
}
The working code using a custom comparator function.

Conclusion

Look, I get it. It works for strings. The default sorting algorithm is pretty generic. But maybe that's the problem?

In the wise words of the Letterkenny gang:

JavaScript, you figure it out.

Figure it out.

Figure it out.

Comments