The Ruby Tutorial that I wish I had
I've moved to a new team recently. Some of the infrastructure definitions were written in Ruby. Ruby isn't a language I'm familiar with but I know a handful of programming languages, including Python, so I thought it would be trivial to pick up. I was very wrong.
Whenever I read Ruby, I felt lost. I genuinely had no idea how to interpret most of the program I was looking at. The code snippets just looked magical to me. I found it even more confusing than C++, which I had been programming for the last 2 years and has its own reputation for complexity.
I spent several frustrating nights studying to get to a point where I could understand relatively simple Ruby code. I quickly went through the official docs, starting with To Ruby from Python and combed through the FAQ. Still I felt I didn't really understand the language. I couldn't find answers to basic things like when I can/cannot omit brackets when calling a method.
I don't want other experienced programmers to go through the frustration I had so I want to share what I've learned to help others get started with Ruby. Here's a tutorial that I would have found useful 2 weeks ago.
Since it's a long collection, here's the table of contents for your convenience:
- Ruby is a lot more Object-Oriented
- Fun with Modules
- Diversity of Method definition/call Syntax
- Syntactic Sugar for Setters
- Blocks
- yield
- procs
- Percent Strings
- 3 Ways to Write a Hash
- instance_eval for that Magic DSL look
- Conclusion
Ruby is a lot more object-oriented
Ruby is more object-oriented than many other mainstream programming languages. For example, in Ruby it is a lot more idiomatic to use methods attached to basic classes like Integer than to use a free function. Take a look at how to count from 0 to 4.
5.times {|x| puts x}
Compare this to what I'd do in Python:
for x in range(5): print(x)
As far as I can tell, there is no obvious distinction between primitives and objects. Java has a fairly strict division across the two types, like how an int doesn't have any methods. In Python, built-in types like int are a bit more object-like.
1.__add__(2) # this is SyntaxError
(1).__add__(2) # This is OK - 3
The Python built-ins are still special in a sense that they cannot be overridden.
>>> int.__add__ = lambda x, y: y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'int'
In Ruby, extending/overriding core classes is possible. The following code adds a method named add1 to Integer.
# In Ruby, this reopens Integer and adds add1 to the existing definition.
class Integer
  def add1
    self + 1
  end
end
puts 2.add1 # prints 3
I'll leave it up to you to decide if it's a good thing or not 😉
In addition, there are no free functions. That's just like Java, except that Ruby lets you define methods outside of any class. So where do they go? The answer is that they are attached to the class Object. You can inspect this yourself by running the following script:
def test; 42 end
puts method(:test).owner
# output: Object
Since every object in Ruby derives from Object, does this mean these functions are effectively global functions that are in every single class? The answer is yes. Check out the following example:
class B
  def answer
    puts "fun_in_main owned by #{method(:fun_in_main).owner}"
    fun_in_main
  end
end
def fun_in_main; 42 end
puts B.new.answer
# output:
# fun_in_main owned by Object
# 42
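One detail that helped me here: as far as I can tell, when the code runs as a script file (irb behaves differently), a method defined at the top level becomes a private method of Object. That would explain why it can be called with an implicit receiver inside any class, but not with an explicit one. A small sketch, where the names shout and Speaker are made up for illustration:

```ruby
# A method defined at the top level of a script (hypothetical name).
def shout(text)
  text.upcase + "!"
end

class Speaker
  def speak
    shout("hello") # implicit receiver: allowed, even for a private method
  end
end

puts Speaker.new.speak                      # HELLO!
puts method(:shout).owner                   # Object
puts Object.private_method_defined?(:shout) # true when run as a script

# With an explicit receiver, the call is rejected if the method is private.
begin
  Speaker.new.shout("hello")
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```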
Fun with Modules
Ruby modules have two purposes. First, they can organize classes and methods into a namespace. In that respect, a module is a lot like a Python package. Interestingly, Ruby modules are also used as templates for mixing methods into a class. What I found confusing about this was that the module itself is what gets mixed in, rather than a class inside the module. To me it makes more sense to have a class mix into another class than to have a module mix into a class. Then I realized that the syntax for creating "free functions" in a module looks like a static class method. So I started wondering: are modules and classes the same? To investigate this, I ran the following experiment:
module Quacks
  # effectively a free function under Quacks namespace
  def self.static_quack
    puts "static_quack"
  end
  # for use as a mixin
  def quack
    puts "quack"
  end
end
class Duck
  include Quacks # now I can use all methods from Quacks
end
Quacks.static_quack # => prints static_quack
Duck.new.quack # => prints quack
In this code snippet, static_quack is a static method of the module, so the module is being used to emulate a free function. On the other hand, quack is meant to be mixed into the class Duck when include Quacks runs. Next, I tried to instantiate the module as if it were a class:
irb(main):009:0> Quacks.new
Traceback (most recent call last):
        2: from /usr/bin/irb:11:in `<main>'
1: from (irb):82
NoMethodError (undefined method `new' for Quacks:Module)
It's not quite a class since it doesn't have the new method. But it does kind of look like a class because it has all the class-like methods:
irb(main):010:0> Quacks.instance_methods
=> [:quack]
irb(main):011:0> Quacks.methods false
=> [:static_quack]
Answer to my question: they are similar but not the same thing.
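The relationship can also be checked directly by asking Ruby about the objects involved. The following sketch redefines a trimmed-down Quacks and Duck so that it runs on its own:

```ruby
module Quacks
  def quack
    "quack"
  end
end

class Duck
  include Quacks
end

puts Quacks.class                    # Module
puts Duck.class                      # Class
puts Class.superclass                # Module: every class is itself a kind of module
puts Duck.ancestors.include?(Quacks) # true: include puts Quacks into Duck's ancestor chain
puts Duck.new.is_a?(Quacks)          # true
puts Quacks.respond_to?(:new)        # false: a plain module cannot be instantiated
```

So a class is a module with the extra ability to create instances, which matches what the experiment above suggested.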
Diversity of Method definition/call Syntax
In Ruby, there is no attribute/method distinction. Everything is a method by default, but methods can look like attributes. That's good for encapsulation, but I found this one of the most confusing parts of the Ruby syntax. Consider the following class:
class Sample
  def x
    3
  end
end
The class Sample has a method/attribute named x, so you can access it like the following:
s = Sample.new
puts s.x()
But you can also call x like this:
puts s.x
For any zero-argument method, you may omit the usual function-call brackets.
The next question I had was, how would I get the reference to the method itself, if the method name invokes the method right away? The answer is to use the method named method and pass in the name of the target method as a symbol.
m = s.method(:x)
m.call # calls s.x
This Method object can then be called using call(), like in the example. Note that it is bound to the object it was taken from, which can be retrieved by calling m.receiver.
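Here is a slightly longer sketch of what a Method object knows about itself, reusing the Sample class from above:

```ruby
class Sample
  def x
    3
  end
end

s = Sample.new
m = s.method(:x)          # a Method object; x has not been called yet

puts m.call               # 3
puts m.receiver.equal?(s) # true: the Method remembers the object it came from
puts m.owner              # Sample: the class that defines x
puts m.arity              # 0: x takes no arguments
```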
This terse method call syntax also extends to single-argument calls. In the following example, f is a method that takes a single argument and adds 1 to it.
class AddOne
  def f x
    x + 1
  end
end
But it's also valid to put the brackets around formal arguments like this:
def f(x)
  ...
end
The same applies when calling the method. Both styles are valid:
a = AddOne.new
a.f 1 # => 2
a.f(2) # => 3
When the method has two or more arguments, the brackets around the call are still optional, but the arguments must be separated by commas. Separating them with spaces alone does not work.
def add_two(a, b)
  a + b
end
add_two(1, 2) # => 3
add_two 1,2 # => 3
add_two 1 2 # => not OK
I found this kind of inconsistent, considering that languages like F#, which have a similar function application syntax, allow the space-separated form (with currying).
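One practical case where the brackets matter, as far as I can tell, is when a call is chained onto the result. Without brackets, the chained method attaches to the last argument instead of to the call. A minimal sketch using the same add_two:

```ruby
def add_two(a, b)
  a + b
end

with_brackets = add_two(1, 2).to_s # to_s is applied to the result
puts with_brackets                 # 3

# Without brackets, .to_s attaches to the last argument instead,
# so this is add_two(1, "2"), and Integer#+ rejects the String.
begin
  add_two 1, 2.to_s
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```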
Syntactic Sugar for Setters
class Holder
  def initialize
    @x = 3
  end
  attr_accessor :x
end
h = Holder.new
h.x= 1 # Ok this makes sense, it's a short-hand for h.x=(1)
What the tutorials didn't tell me is why code like the following works:
h.x = 1 # Why does this work? and what does it even do?
At a glance, it parses in my head like (h.x) EQUALS ONE. It took me a while to find the answer. It's syntactic sugar: Ruby converts that into a call to the method x=. In other words, all of the following are the same:
h.x=(1)
h.x= 1
h.x = 1
We can deduce from this syntactic sugar that the "get_x/set_x"-style method naming convention doesn't make too much sense in Ruby. When an attribute-like method name ends with =, we know it's a setter, and otherwise it's a getter.
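Since a setter is just a method whose name ends with =, it can also be written by hand instead of being generated by attr_accessor. A minimal sketch, where the Thermostat class and its 10..30 range are invented for illustration:

```ruby
class Thermostat
  attr_reader :celsius # generates only the getter

  def initialize
    @celsius = 20
  end

  # Hand-written setter: any method whose name ends in = can be
  # called with assignment syntax.
  def celsius=(value)
    @celsius = value.clamp(10, 30) # keep the value within 10..30
  end
end

t = Thermostat.new
t.celsius = 45              # sugar for t.celsius=(45)
puts t.celsius              # 30
t.public_send(:celsius=, 5) # the same setter, called by name
puts t.celsius              # 10
```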
Blocks
Ruby has blocks, which are kind of like lambdas in Python in that you can pass in a block of code to be executed by the method. Here is an example:
5.times {|x| puts x} # prints 0 1 2 3 4
5.times do |x| puts x end # same as above
Of course, in Ruby, there are two ways to write the same thing, but that's fine, I am used to that by now. What I found complicated was how to actually use them and how they interact with other method parameters. First, every method in Ruby accepts an implicit block after the last parameter. In the following example, it's okay to call f with a block because every method accepts an implicit block; f just doesn't use it.
def f a
  puts "f is called with #{a}"
end
def f_no_argument; end
f(5) {|x| puts "block called" } # this block is unused.
# Output
# f is called with 5
Note that a block is not exactly the same as the last argument to the call. It must be specified outside the brackets for the arguments (if there are any).
f(5) {|x| puts "block called" } # OK
f 5, {|x| puts "block called" } # not OK
# No-argument examples
f_no_argument {|x| puts "block called" } # OK
f_no_argument() {|x| puts "block called" } # OK
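One caveat about the two block spellings, braces and do/end: they bind differently when the brackets around the arguments are omitted. Braces attach to the nearest call, while do/end attaches to the outermost one. A small sketch, where show is a made-up helper that reports whether it received a block itself:

```ruby
# Hypothetical helper that reports whether it received a block itself.
def show(value)
  block_given? ? "show got the block" : "show got #{value.inspect}"
end

nums = [1, 2]

# Braces bind to the nearest call, so the block goes to map.
r1 = show nums.map { |x| x * 2 }

# do/end binds to the outermost call, so the block goes to show,
# and map is called without a block.
r2 = show nums.map do |x|
  x * 2
end

puts r1 # show got [2, 4]
puts r2 # show got the block
```

Putting brackets around the arguments removes the ambiguity in both cases.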
Once inside a method, calling the passed-in block requires the keyword yield, which means something very different from what it means in Python.
yield
yield in Ruby executes the block passed in. yield is a bit special compared to regular function calls because Ruby doesn't seem to validate the number of arguments passed to the block. For example, calling the following method f without any argument will give you ArgumentError:
def f x; puts x end
f 1 # ok
f # ArgumentError (wrong number of arguments (given 0, expected 1))
But calling a block with a wrong number of arguments is fine.
def f
  yield
  yield 1
  yield 1, 2
end
f {|x| puts x} # not a problem
The missing arguments are substituted with nils, and extra arguments are silently dropped.
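To make the argument handling visible, this sketch records what the block actually receives on each yield. The method names are made up for illustration:

```ruby
def call_three_ways
  yield
  yield 1
  yield 1, 2
end

one_param = []
call_three_ways { |x| one_param << x }
p one_param  # [nil, 1, 1]

two_params = []
call_three_ways { |a, b| two_params << [a, b] }
p two_params # [[nil, nil], [1, nil], [1, 2]]

# Calling yield when no block was given raises LocalJumpError;
# block_given? lets a method check first.
def maybe_yield
  return "no block" unless block_given?
  yield
end

puts maybe_yield          # no block
puts maybe_yield { "hi" } # hi
```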
procs
Unlike lambdas, blocks are not really assigned to a variable. In order to actually grab the block and do the normal variable-like things (e.g., storing it, or forwarding it), you can accept it as the last parameter prefixed with & to auto-convert it to a proc, which is then bound to a normal variable.
def add_one(x, &p)
  # p is a Proc. Note that p(x + 1) would call Kernel#p instead,
  # so the proc has to be invoked with call.
  p.call(x + 1)
  yield x + 1
end
add_one(1) {|x| puts x}
# output:
# 2
# 2
In this example, p refers to the block that prints. Note that yield also continues to work.
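Storing the block for later is where this becomes useful. The following is a sketch of a callback registry, with the Button class and its method names invented for illustration:

```ruby
class Button
  def initialize
    @handlers = []
  end

  # &handler turns the attached block into a Proc that can be stored.
  def on_click(&handler)
    @handlers << handler
    self
  end

  # Later, each stored Proc is invoked with call.
  def click
    @handlers.map { |handler| handler.call(:click) }
  end
end

button = Button.new
button.on_click { |event| "first saw #{event}" }
button.on_click { |event| "second saw #{event}" }
p button.click # ["first saw click", "second saw click"]
```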
Procs can be converted back into a block argument to another function by prefixing & again. In the following example, forward takes a block as a proc, then converts it back to a block, to be passed into Integer#times.
def forward(&p)
  2.times(&p)
end
forward { |x| puts x }
# output:
# 0
# 1
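The & prefix is not limited to procs that came from a block. As I understand it, & calls to_proc on whatever follows it, which is why the common map(&:name) idiom works with a symbol. A short sketch:

```ruby
words = %w(apple banana cherry)

# :upcase.to_proc behaves like { |s| s.upcase }.
shouted = words.map(&:upcase)
p shouted # ["APPLE", "BANANA", "CHERRY"]

# A proc stored in a variable can be passed the same way.
doubler = proc { |n| n * 2 }
doubled = [1, 2, 3].map(&doubler)
p doubled # [2, 4, 6]

# A Method object also responds to to_proc.
roots = [1, 4, 9].map(&Math.method(:sqrt))
p roots   # [1.0, 2.0, 3.0]
```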
Percent Strings
Percent Strings are another type of syntactic sugar that makes it easy to write certain constructs like symbol arrays. But if you have never seen them before, you can't really guess what they mean. Here are some of them:
# %i for symbol arrays (i stands for what?)
%i(a b c) # => [:a, :b, :c]
# %w is like %i except it gives you a string array (w for words?).
%w(a b c) # => ["a", "b", "c"]
# %q for a string (q for quotes?)
%q(a b c) # => "a b c"
# %r for a regex pattern (r for regex?)
%r(a b c) # => /a b c/
# %x is a subshell call (x for.. eXecute?).
%x(echo hi) # => "hi\n"
`echo hi` # just one more way to do it
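Two more details I ran into: the lowercase and uppercase forms differ in whether they interpolate, and the delimiter does not have to be round brackets. A sketch of both, where lang is just an example variable:

```ruby
lang = "Ruby"

single = %q(Hello, #{lang}) # no interpolation, like single quotes
double = %Q(Hello, #{lang}) # interpolation, like double quotes
puts single                 # Hello, #{lang}
puts double                 # Hello, Ruby

plain_words  = %w[a b c]          # any bracket pair works as a delimiter
interpolated = %W(#{lang} rocks)  # %W interpolates, %w does not
symbols      = %i{x y}
p plain_words  # ["a", "b", "c"]
p interpolated # ["Ruby", "rocks"]
p symbols      # [:x, :y]

nested = %q(balanced (nested) brackets are fine)
puts nested    # balanced (nested) brackets are fine
```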
3 Ways to Write a Hash
Most tutorials cover 2 different ways to write a Hash (i.e., dict in Python). The first is the most verbose way, listing each key and value:
x = {"a" => 1, "b" => 2}
The second way is a short hand, if you want the keys to be symbols:
x = {a:1, b: :b}
x = {:a => 1, :b => :b} # equivalent to line above
What tutorials often don't cover is the third shorthand-form, which can be used only as the last argument to a method call.
puts a:1, b:2 # prints {:a=>1, :b=>2}
In this case, a and b are symbols. Again, this only works if the hash is the last argument to a function call.
puts 1, a:1, b:1
Curiously, this does not work for assignment, or an assignment-like method call. Check out the following:
class Test
  attr_accessor :member
end
t = Test.new
t.member = a:1 # does not work
t.member= a:1 # does not work
t.member=(a:1) # does not work
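This brace-less form shows up most often with methods that take an options hash as their last parameter. A minimal sketch, assuming a made-up connect method with hypothetical :port and :tls options:

```ruby
# Hypothetical method: an optional options hash as the last parameter.
def connect(host, options = {})
  "#{host}:#{options.fetch(:port, 80)} tls=#{options.fetch(:tls, false)}"
end

puts connect("example.com")                       # example.com:80 tls=false
puts connect("example.com", port: 443, tls: true) # example.com:443 tls=true
puts connect "example.com", port: 8080            # example.com:8080 tls=false

# For assignment, the braces have to be written out.
settings = { port: 443, tls: true }
puts connect("example.com", settings)             # example.com:443 tls=true
```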
instance_eval for that magic DSL look
The last core ingredient for understanding Ruby is instance_eval. instance_eval takes a block and runs the block in the context of that instance. Effectively it just swaps the self of the block. The following demonstrates something that resembles a typical Ruby DSL. It will let you configure a Hash in a cool-looking way.
class DSLTest
  def initialize
    @config = Hash.new
  end
  def configure
    yield @config
  end
  def run(&p)
    instance_eval(&p) # this means to convert the proc p back into a block
    puts "Configuration is #{@config}"
  end
end
x = 9
DSLTest.new.run do
  configure do |c|
    c[:key] = x
  end
end
# prints Configuration is {:key=>9}
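A second, smaller sketch of the same idea, focusing only on the self swap. The Recipe class and its methods are invented for illustration:

```ruby
class Recipe
  attr_reader :steps

  def initialize
    @steps = []
  end

  def step(text)
    @steps << text
  end

  # Runs the block with self set to a fresh Recipe, so bare calls
  # to step inside the block reach that instance.
  def self.build(&block)
    recipe = new
    recipe.instance_eval(&block)
    recipe
  end
end

minutes = 10 # local variables are still visible inside the block
recipe = Recipe.build do
  step "boil water"
  step "cook pasta for #{minutes} minutes"
end

p recipe.steps # ["boil water", "cook pasta for 10 minutes"]
```

The block can call step without a receiver because, while it runs, self is the Recipe instance rather than the top-level object.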
Conclusion
Matz, the creator of Ruby, wanted “[…] a scripting language that was more powerful than Perl, and more object-oriented than Python”. And I can certainly agree that Ruby has achieved both. It is more object-oriented than Python. It is also Perl-like, in both good and bad ways. Ruby can be concise and powerful, but I can't help feeling bothered by how there is always more than one way to do something. I don't like it, but at least I can now read Ruby code without being completely intimidated. I hope this post is helpful to those who struggle to understand Ruby.