Data Structures in Ruby

Arrays and Hashes

When first approaching a programming language, the classes and tutorials out there ask for a lot of weird and deeply hypothetical stuff, like "Make the console print 'foobar' 5 times." It's easy to wonder how writing lines of code like that can eventually add up to huge, powerful applications. This week, Phase 0 of Dev Bootcamp introduced Ruby's tools for handling groups of data, and they offer a post-trivial glimpse of how developers use bits of code in the real world.

The two basic data structures in Ruby are arrays and hashes. Both serve essentially the same purpose: to store and organize a set of data in a meaningful way. Where they differ is in how the data is stored and organized. But first, why do these even exist? Can't you store data in a variable? As we've seen in our introduction to Ruby, if you declare a variable, you can use it to store just about any type of information: words, numbers, even the results of a method call. You can then use that information in new custom methods. Isn't that enough?

Well, it's pretty obvious, but the programs we use every day are a little more complicated than a "foobarify" app for your phone (which does, guess what? ...No! The new version prints it six times! Update, fool.), and as the complexity scales up, so does the amount of data. How can we organize it so that it can be easily and consistently found, then specifically selected for use or modification? Arrays! Hashes!

Let's start with arrays. You can take pieces of information in the code, like strings (sets of characters) or integers, and arrange them in a cozy group, like this:


        ["popsicles", "rocket fuel", "self-respect", 25]
      

That's pretty fun, but not so useful - until you do this:


        grocery_list = ["popsicles", "rocket fuel", "self-respect", 25]
      

Now we've taken that list of information and assigned it to a variable called grocery_list, which can be used anywhere in your code. The array as a whole can be treated as one item:


        print grocery_list
        # prints ["popsicles", "rocket fuel", "self-respect", 25]
      

...or you can access a single item within the array by its index, which starts counting at 0:


        print grocery_list[1]
        # prints rocket fuel
      

...and a whole lot else besides.
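To give a taste of that "whole lot else," here is a minimal sketch of a few everyday array operations. It reuses the grocery_list from above; the extra "duct tape" item and the variable names first_item, last_item, and item_count are made up for illustration.

```ruby
grocery_list = ["popsicles", "rocket fuel", "self-respect", 25]

# Indexes start at 0, so the first item lives at index 0.
first_item = grocery_list[0]

# Negative indexes count backwards from the end of the array.
last_item = grocery_list[-1]

# Add a (hypothetical) new item to the end of the list.
grocery_list << "duct tape"

# Ask the array how many items it holds now.
item_count = grocery_list.length

# Walk through every item, in order.
grocery_list.each do |item|
  puts "Don't forget: #{item}"
end
```

Here first_item is "popsicles", last_item is 25 (it was read before the new item was added), and item_count is 5.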

Hashes are a little different, in that the information is stored associatively. The structure is a set of key-value pairs, like this:

        hash = {key => value}

Accessing values in a hash is similar to accessing them in an array, except that instead of a numerical index, you supply the key. For example, if my hash looks like:


        fav_food = {:dave => "tacos", :dog => "tacos"}
      

I could find the dog's favorite food by typing fav_food[:dog]. Just because we both love tacos doesn't mean I can't store that information in a hash: hash values can be identical, as long as the keys are unique.
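Here's a small sketch of reading from and writing to that hash. The :cat key and its "tuna" value are hypothetical additions, and the variable names are just for illustration.

```ruby
fav_food = {:dave => "tacos", :dog => "tacos"}

# Look up a value by supplying its key (note the colon: the key is a symbol).
dogs_pick = fav_food[:dog]

# Asking for a key that isn't in the hash gives back nil rather than an error.
cats_pick = fav_food[:cat]

# Add a new key-value pair (hypothetical data).
fav_food[:cat] = "tuna"

# Two keys can share a value, but each key appears only once.
# Assigning to an existing key replaces its old value.
fav_food[:dave] = "burritos"

# List all the keys currently in the hash.
eaters = fav_food.keys
```

After this runs, dogs_pick is "tacos", cats_pick is nil (the lookup happened before :cat was added), and eaters is [:dave, :dog, :cat].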

The main difference between these two structures is how the information is stored and accessed. Arrays are better for situations where the data is naturally sequential and its order is what matters, like this:


        time_machine_trials = ["failure","failure","failure"]
      

Making up keys for this array wouldn't be helpful in accessing the information later. However, in a case like:


        student_grades = {:marcus => "B", :wanda => "A", :waldo => "F"}
      

you'd really want to be able to look up the information by student name rather than guess until you found the right index. As we incorporate more and more information into our programs, we'll need the power to manipulate it in meaningful ways, and that comes down to good organization. Knowing when to use arrays and when to use hashes is a crucial first step toward a solid overall architecture.
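To make the trade-off concrete, here is a rough sketch comparing the two lookups. The grade_pairs array is a hypothetical alternative layout of the same data, invented purely to show what searching by name looks like without a hash.

```ruby
# With a hash, the student's name takes us straight to the grade.
student_grades = {:marcus => "B", :wanda => "A", :waldo => "F"}
waldo_grade = student_grades[:waldo]

# The same data crammed into an array of [name, grade] pairs
# (a hypothetical layout, for comparison only).
grade_pairs = [[:marcus, "B"], [:wanda, "A"], [:waldo, "F"]]

# Now we have to search through the pairs until the name matches.
waldos_pair = grade_pairs.find { |name, _grade| name == :waldo }
waldo_grade_from_array = waldos_pair[1]

# For purely sequential data, an array is the natural fit:
# position alone tells us which trial is which.
time_machine_trials = ["failure", "failure", "failure"]
third_trial = time_machine_trials[2]
trial_count = time_machine_trials.length
```

Both lookups arrive at "F" for Waldo, but the hash version says what it means in a single step.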

Thanks for reading, and check back soon for more adventures from a Ruby n00bie.