Pattern Matching in Ruby

Pattern matching is one of the bigger features added to Ruby in recent years. It arrived in Ruby 2.7 and has been improved since then. Much has been said about other recent additions to Ruby, but not so much about pattern matching, which is a shame in my opinion, since it adds so much to our options as Rubyists. So, let’s change this. In this blog post we will discuss what pattern matching is and when it can be used, and we will run a small benchmark that compares it to an equivalent if statement.

What is Pattern Matching

Wikipedia explains pattern matching as “the act of checking a given sequence of tokens for the presence of the constituents of some pattern”. For me this didn’t explain a lot, so let’s look at an actual example.

Many Rubyists, including myself, first got to know pattern matching through the rise of Elixir. With this feature Elixir lets us define several clauses of the same function, each matching a different shape of argument. This way each clause stays very small and handles just one case, while another clause handles other input. It also makes it possible to write very readable error handling. What does this look like? Let’s look at such an overloaded function:

def valid_password?(%{hashed_password: hashed_password}, password) do # <-- Pattern matching happens here!
  Bcrypt.verify_pass(password, hashed_password)
end

def valid_password?(_, _) do
  Bcrypt.no_user_verify()
  false
end

The argument of the function valid_password? is destructured and pattern matched. The BEAM (the VM that Elixir runs on) checks what shape the arguments have and then calls the matching clause. So, if the first argument is a map AND has the key hashed_password, the first clause is called; otherwise the second one is.

What does it look like for error handling in Elixir?

case Customer.update_company(company, company_params) do
  {:ok, company} ->
    conn
    |> put_flash(:info, "Company updated successfully.")
    |> redirect(to: Routes.company_path(conn, :show, company))

  {:error, %Ecto.Changeset{} = changeset} ->
    render(conn, "edit.html", company: company, changeset: changeset)
end

Here the function update_company returns a tuple with either an :ok or an :error atom as its first element. Depending on this, the runtime executes the matching branch of the case statement. This makes it easy to write code where each branch is concerned with only one thing.

These examples make clear what pattern matching is and show how code quality benefits from it. So, what does pattern matching look like in Ruby?

Pattern Matching in Ruby

First, let’s see what the general syntax for pattern matching in Ruby looks like:

case <expression>
in <pattern1>
  ...
in <pattern2>
  ...
in <pattern3>
  ...
else
  ...
end

We can pattern match against anything that is an expression, and in Ruby quite a lot is. So, we can call a method, write an if statement or just put in a variable.
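
As a small illustration of matching against a method call, the sketch below matches on the return value of Kernel#Integer, which returns nil instead of raising when it is called with exception: false. The method name describe_number is made up for this example.

```ruby
# Hypothetical helper: matches directly on the result of a method call.
def describe_number(input)
  case Integer(input, exception: false)
  in Integer => number   # parsing worked, bind the value to number
    "parsed #{number}"
  in nil                 # Integer(...) returned nil: input was not a number
    "not a number"
  end
end

puts describe_number("42")
puts describe_number("abc")
```
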
Now let’s experiment a little bit with what we can do with this:

data_structure = {shows: {name: "One Piece", characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"], favorite_character: "Zoro"} }

puts case data_structure
in {shows: {name: "One Piece", favorite_character: "Zoro"}}
  "Absolutely!"                                     # <-- This will be printed
in {shows: {name: "One Piece", favorite_character: "Luffy"}}
  "Good choice too"
else
  "Interesting"
end

Here we are using pattern matching instead of an if statement, and we don’t gain much. Let’s do something more useful and extract a value from the data structure into a variable:

data_structure = {
  shows: [
    {name: "Superman", characters: ["Clark Kent", "Lois Lane", "Lex Luthor"], favorite_character: "Clark Kent"},
    {name: "One Piece", characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"], favorite_character: "Zoro"},
    {name: "Spiderman", characters: ["Peter Parker", "Green Goblin", "Dr. Octopus"], favorite_character: "Peter Parker"}
    ]
  }

puts case data_structure
in {shows: [*,{name: "One Piece", favorite_character: favorite_character}, *]}
  "Your favorite One Piece character is: #{favorite_character}" # <-- This will be executed
else
  "Wrong Input"                                               # We can handle an error too!
end

Notice how we don’t have to specify all of the keys in the hash, only the ones we are interested in. Also, with the * on both sides we tell Ruby that there may be more elements before and after the one we are looking for, and that we don’t care about them. (This form, the find pattern, was added in Ruby 3.0.)
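
To make the "only the keys we are interested in" behaviour more concrete, here is a minimal sketch. The method describe_show is hypothetical. It contrasts **nil, which rejects any additional keys, with **rest, which collects them.

```ruby
# Hypothetical helper showing how strict a hash pattern is about extra keys.
def describe_show(show)
  case show
  in {name: String => name, **nil}    # **nil: no keys besides :name allowed
    "only a name: #{name}"
  in {name: String => name, **rest}   # **rest: collect the remaining keys
    "#{name} with extra keys #{rest.keys.inspect}"
  else
    "no name given"
  end
end

puts describe_show({name: "One Piece"})
puts describe_show({name: "One Piece", favorite_character: "Zoro"})
```

Without **nil or **rest, a hash pattern simply ignores any keys it does not mention, which is what the example above relies on.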

If we had used an if statement to extract this value and handle the error case as well, the code would look a lot messier:

data_structure = {
  shows: [
    {name: "Superman", characters: ["Clark Kent", "Lois Lane", "Lex Luthor"], favorite_character: "Clark Kent"},
    {name: "One Piece", characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"], favorite_character: "Zoro"},
    {name: "Spiderman", characters: ["Peter Parker", "Green Goblin", "Dr. Octopus"], favorite_character: "Peter Parker"}
    ]
  }

if data_structure[:shows]
  one_piece = data_structure[:shows].find { |show| show[:name] == "One Piece" }
  # find returns nil when the show is missing, so guard before reading from it
  if one_piece && one_piece[:favorite_character]
    puts "Your favorite One Piece character is #{one_piece[:favorite_character]}"
  else
    puts "Wrong Input"
  end
else
  puts "Wrong Input"
end

Not only does the pattern matching version save us a few lines of code, it is also much more readable and therefore more maintainable.

This shows us a great use case: extracting something from a data structure, i.e. destructuring an object. But pattern matching can do even more! As mentioned above, the expression we are matching against can be a method call as well. This way we can implement error handling similar to what we did in Elixir:

def character_for_show(show_name)
  data_for_show = {
    "One Piece" => { characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"] },
    "Superman" => { characters: ["Clark Kent", "Lois Lane", "Lex Luthor"] },
  }[show_name]
  # data_for_show is nil for an unknown show, so guard before reading from it
  if data_for_show && data_for_show[:characters]
    [:ok, data_for_show[:characters]]
  else
    [:error, nil]
  end
end

case character_for_show("One Piece")
in [:ok, characters]
  puts "One Piece characters: #{characters.join(",")}"
in [:error, _]
  puts "Wrong Input"
end

Although we can implement error handling this way, it is not idiomatic Ruby. The standard library signals errors differently, by raising exceptions, so we would have to write wrappers around the important classes and methods ourselves. That said, it might be a good option for handling errors in our own business logic. But honestly I would use something else, like dry-rb monads, which play nicely with pattern matching. A great post about how to implement error handling with monads can be found here.
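
To give a rough idea of how result objects and pattern matching fit together, here is a minimal sketch that uses plain Structs. Success, Failure, find_characters and report are all made-up names for this illustration. This is not the dry-monads API, only an approximation of the idea. It works because Struct implements deconstruct, which is what array-style patterns call under the hood.

```ruby
# Hypothetical result types; Struct provides #deconstruct for pattern matching.
Success = Struct.new(:value)
Failure = Struct.new(:error)

# Hypothetical lookup that wraps its outcome in a result object.
def find_characters(show_name)
  shows = {"One Piece" => ["Luffy", "Zoro"]}
  characters = shows[show_name]
  characters ? Success.new(characters) : Failure.new(:unknown_show)
end

def report(result)
  case result
  in Success[characters]   # matches a Success and binds its single member
    "Characters: #{characters.join(", ")}"
  in Failure[error]        # matches a Failure and binds its error
    "Error: #{error}"
  end
end

puts report(find_characters("One Piece"))
puts report(find_characters("Naruto"))
```

Compared to the [:ok, value] arrays above, the class name in the pattern makes each branch self-describing, at the cost of defining the two result types.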

Pattern matching in the wild

There is a suite of gems out there that uses pattern matching quite extensively: the dry-rb gems. For example, the dry-validation gem can pattern match on a successful validation like so (taken right from the documentation):

case contract.('first_name' => 'John', 'last_name' => 'Doe')  # contract is the validation
in { first_name:, last_name: } => result if result.success?
  puts "Hello #{first_name} #{last_name}"
in _ => result
  puts "Invalid input: #{result.errors.to_h}"
end

Here the if after the first pattern is a guard clause: the branch is only taken when the pattern matches and result.success? is true. When you take a look into the dry-rb documentation you will find other ways to use pattern matching with this excellent library as well.
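
Guard clauses also work without any gem. The following sketch uses a hypothetical method, describe_cast, with the show hashes from earlier. The first branch is only chosen when the pattern matches and the condition after if holds. Otherwise Ruby moves on to the next pattern.

```ruby
# Hypothetical helper demonstrating a guard clause on a hash pattern.
def describe_cast(show)
  case show
  in {characters: Array => characters} if characters.size > 3
    "large cast (#{characters.size})"
  in {characters: Array => characters}   # same pattern, no guard: the fallback
    "small cast (#{characters.size})"
  else
    "no cast information"
  end
end

puts describe_cast({characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"]})
puts describe_cast({characters: ["Clark Kent", "Lois Lane"]})
```
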

Benchmark pattern matching vs if statement

Now, how fast is pattern matching compared to an imperative search? Let’s find out!

require 'benchmark/ips'

def pattern_match(data_structure)
  case data_structure
  in {shows: [*,{name: "One Piece", favorite_character: favorite_character}, *]}
    "Your favorite One Piece character is: #{favorite_character}" # <-- This will be executed
  else
    "Wrong Input"                                               # We can handle an error too!
  end
end

def imperative_search(data_structure)
  if data_structure[:shows]
    one_piece = data_structure[:shows].find { |show| show[:name] == "One Piece" }
    if one_piece[:favorite_character]
      "Your favorite One Piece character is #{one_piece[:favorite_character]}"
    else
      "Wrong Input"
    end
  else
    "Wrong Input"
  end
end

Benchmark.ips do |b|
  data_structure = {
    shows: [
      {name: "Superman", characters: ["Clark Kent", "Lois Lane", "Lex Luthor"], favorite_character: "Clark Kent"},
      {name: "One Piece", characters: ["Luffy", "Zoro", "Nami", "Sanji", "Usopp"], favorite_character: "Zoro"},
      {name: "Spiderman", characters: ["Peter Parker", "Green Goblin", "Dr. Octopus"], favorite_character: "Peter Parker"}
    ]
  }
  b.time = 10
  b.warmup = 2
  b.report("pattern matching:") do
    pattern_match(data_structure)
  end

  b.report("imperative search:") do
    imperative_search(data_structure)
  end

  b.compare!
end

Warming up --------------------------------------
   pattern matching:    99.516k i/100ms
  imperative search:   134.871k i/100ms
Calculating -------------------------------------
   pattern matching:    969.440k (± 0.9%) i/s -      9.753M in  10.060889s
  imperative search:      1.319M (± 1.0%) i/s -     13.217M in  10.024136s

Comparison: 
  imperative search::  1318684.6 i/s
   pattern matching::   969439.8 i/s - 1.36x  (± 0.00) slower

As you can see, pattern matching is still noticeably slower than the imperative search, about 1.36x in this benchmark. But does that mean we shouldn’t use this great feature? Absolutely not! As long as we use it in code that isn’t a hot loop running over and over again, the difference will not matter to us. The maintainability benefit is still there, so we should use it then.

Future of Pattern Matching in Ruby

The story of pattern matching in Ruby isn’t finished yet. Recently, I came across a tweet from Koichi Sasada in which he hints at possible method overloading with pattern matching. It is not confirmed that this will happen, but it looks really interesting to me!

Conclusion

As we have seen, pattern matching is a very powerful tool in our tool belt. It lets us write declarative code that is easy to maintain. The performance is good, but it is probably best avoided in tight loops because there is a small performance penalty to pay.