Expanding Shortened URLs in a Ruby String

May 07, 2009

Everyone and their dog uses some sort of URL shortening service these days. Shortening is handy for cramming a link into short messages like those on Twitter, but it isn't always considered good practice: a short link hides where it leads, and it stops working if the shortening service goes away.

Since plenty of applications pull content from Twitter feeds and similar services, it would be great to expand those shortened URLs and undo the damage. So I built a little module that does exactly that.

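Borrowing heavily from a Ruby-based Twitter client, I extracted a module you can mix into String. The idea is simple: for each known shortening service, find its URLs in the string, send a HEAD request to the service, and swap in the URL from the redirect's Location header. If the service doesn't answer with a redirect, the short URL stays as it is.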

require 'net/http'

module BarkingIguana
  module ExpandUrl
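    # Replaces every recognised shortened URL in this string, in place.
    # Returns self so calls can be chained.
    def expand_urls!
      ExpandUrl.services.each do |service|
        # $1 is the whole short URL, $2 is its path; keep $1 if expansion fails.
        gsub!(service[:pattern]) {
          ExpandUrl.expand($2, service[:host]) || $1
        }
      end
      self
    end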

    def expand_urls
      s = dup
      s.expand_urls!
      s
    end

    def ExpandUrl.services
      [
        { :host => "tinyurl.com", :pattern => %r'(http://tinyurl\.com(/[\w/]+))' },
        { :host => "is.gd", :pattern => %r'(http://is\.gd(/[\w/]+))' },
        { :host => "bit.ly", :pattern => %r'(http://bit\.ly(/[\w/]+))' },
        { :host => "ff.im", :pattern => %r'(http://ff\.im(/[\w/]+))' },
      ]
    end

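    # Sends a HEAD request for path to host and returns the redirect target,
    # or nil when the response is not a redirect or the request fails.
    def ExpandUrl.expand(path, host)
      result = ::Net::HTTP.new(host).head(path)
      case result
      when ::Net::HTTPRedirection
        result['Location']
      end
    rescue SocketError, SystemCallError, Timeout::Error
      nil # network trouble: the caller falls back to the short URL
    end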
  end
end
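
If the two capture groups in each pattern look cryptic, this small offline check may help. It runs the tinyurl pattern against a made-up sentence and makes no network request. The helper first_short_url is hypothetical and exists only for this illustration.

```ruby
# Same pattern as the tinyurl entry in ExpandUrl.services.
TINYURL_PATTERN = %r'(http://tinyurl\.com(/[\w/]+))'

# Hypothetical helper for illustration: returns [full_url, path] for the
# first shortened URL in text, or nil when nothing matches.
def first_short_url(text)
  match = TINYURL_PATTERN.match(text)
  match && [match[1], match[2]]
end

# The sample sentence is made up.
full_url, path = first_short_url("Look at http://tinyurl.com/asdf, it is great")
puts full_url # http://tinyurl.com/asdf
puts path     # /asdf
```

Group 1 is what gets put back if the lookup fails, and group 2 is the path handed to ExpandUrl.expand. The trailing comma is not part of the match because the character class only allows word characters and slashes.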

To use it, mix the module into String:

class String
  include BarkingIguana::ExpandUrl
end

Then call expand_urls or expand_urls! on any text containing shortened URLs. The bang method modifies the string in place; the regular method returns a new string and leaves the original untouched.

s = "http://tinyurl.com/asdf"
s.expand_urls!
puts s.inspect
# => "http://support.microsoft.com/default.aspx?scid=kb;EN-US;158122"
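
To play with the difference between the two methods without hitting the network, something like the following sketch should do. FakeExpandUrl, its LOOKUP table, DemoString, and the example.com target are all hypothetical stand-ins, not part of the module above; the lookup table simply takes the place of the HEAD request.

```ruby
# Hypothetical offline stand-in for BarkingIguana::ExpandUrl: a fixed lookup
# table replaces the HTTP HEAD request so the example runs without a network.
module FakeExpandUrl
  PATTERN = %r'(http://tinyurl\.com(/[\w/]+))'
  LOOKUP  = { "/asdf" => "http://example.com/long/page" } # made-up mapping

  # Replaces shortened URLs in place, keeping the short URL when it is unknown.
  def expand_urls!
    gsub!(PATTERN) { LOOKUP[$2] || $1 }
    self
  end

  # Works on a copy and leaves the receiver untouched.
  def expand_urls
    dup.expand_urls!
  end
end

# A String subclass keeps the demo from touching String itself.
class DemoString < String
  include FakeExpandUrl
end

s = DemoString.new("see http://tinyurl.com/asdf")
copy = s.expand_urls
puts copy # see http://example.com/long/page
puts s    # see http://tinyurl.com/asdf

s.expand_urls!
puts s    # see http://example.com/long/page
```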

It currently supports ff.im, is.gd, bit.ly, and tinyurl.com. If you know of other services that should be included, I would love to hear about them. This code, like the original implementation, is released under the MIT licence. The full code including licence and RDoc can be found at http://pastie.org/471016.
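
Since every entry in ExpandUrl.services has the same shape, adding a service could be reduced to adding a host name. This is only a sketch of one way to do it; build_services is a hypothetical helper, not something in the pastie.

```ruby
# Hypothetical helper: builds entries in the same shape as ExpandUrl.services
# from a plain list of host names.
def build_services(hosts)
  hosts.map do |host|
    # Regexp.escape turns "bit.ly" into "bit\.ly" so the dot matches literally.
    { :host => host, :pattern => %r{(http://#{Regexp.escape(host)}(/[\w/]+))} }
  end
end

services = build_services(%w[tinyurl.com is.gd bit.ly ff.im])

# Check one generated pattern against a made-up URL.
bitly = services.find { |service| service[:host] == "bit.ly" }
puts bitly[:pattern].match("via http://bit.ly/abc123")[2] # /abc123
```

Supporting another shortener would then mean appending its host to the list, assuming it also answers a HEAD request with a redirect.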