Overkill Email Obfuscation with Ruby and Javascript 7

Posted by unixmonkey on March 12, 2008

Robot Spiders from RunawayThe web is a generally free and open place for all types of communication, but if you put your email address on 1 website, you can expect an email-harvesting robot spider to find that address and send it to its spammer overlords.

Once on a spammer’s list, you can expect to get all kinds of interesting stock tips, products to enhance your manhood, and friendly letters from Nigerian diplomats.

If you simply have too little to do in the day, this can be a great way to meet new people and start a career in day trading. However, some of us are just too darn busy to stop what we are doing every 2/3rds of a second to check our email; but still need it for keeping in contact with friends, family, and business contacts.

From a few tips pulled from the web, I set to create a nice link helper for Ruby / Rails intended to display email links that work indistinguishably from regular mailto: links, and even gracefully downgrade for users without javascript.

Lets not even display the email address on the page at all, and use a little javascript to render the email address after the fact by breaking it up and putting it back together with javascript.

# Takes in an email address and (optionally) anchor text,
# its purpose is to obfuscate email addresses so spiders and
# spammers can't harvest them.
def js_antispam_email_link(email, linktext=email)
    user, domain = email.split('@')
    # if linktext wasn't specified, throw email address builder into js document.write statement
    linktext = "'+'#{user}'+'@'+'#{domain}'+'" if linktext == email 
    out =  "<noscript>#{linktext} #{user}(at)#{domain}</noscript>n"
    out += "<script language='javascript'>n"
    out += "  <!--n"
    out += "    string = '#{user}'+'@'+'#{domain}';n"
    out += "    document.write('<a href='+'ma'+'il'+'to:'+ string +'>#{linktext}</a>'); n"
    out += "  //-->n"
    out += "</script>n"
    return out
end

This is probably good enough for 90% of those robots, but you know if one spammer gets your address, he will likely share (or sell) your email to all his friends. The weak spot in this looks like the noscript version, lets fuzz that up a bit by converting to HTML character entities.

One of the earliest and simplest ways to obfuscate an email address is by converting each character into its HTML equivalent. This makes the source look nasty, but will be correctly rendered by the browser that the end-user is none the wiser.

An address like abc@example.com will look like this in the source:

&#097;&#098;&#099;&#064;&#101;&#120;&#097;&#109;&#112;&#108;&#101;&#046;&#099;&#111;&#109;

Let’s build a simple method to convert a plaintext string into something like the above. I’m going to cheat and only convert a-z and A-Z and leave @ signs, dots, dashes, etc. alone.

# HTML encodes ASCII chars a-z, useful for obfuscating
# an email address from spiders and spammers
def html_obfuscate(string)
  output_array = []
  lower = %w(a b c d e f g h i j k l m n o p q r s t u v w x y z)
  upper = %w(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
  char_array = string.split('')
  char_array.each do |char|  
    output = lower.index(char) + 97 if lower.include?(char)
    output = upper.index(char) + 65 if upper.include?(char)
    if output
      output_array << "&##{output};"
    else 
      output_array << char
    end
  end
  return output_array.join
end

now in our js_antispam_email_link method we can “encrypt” the user and domain before sending to the browser like so:

def js_antispam_email_link(email, linktext=email)
  user, domain = email.split('@')
  user = html_obfuscate(user)
  domain = html_obfuscate(domain)
  ...

Not bad, but many spiders these days can still decode HTML entities and get at that address, so lets build up our defenses a bit more by adding some methods to really screw with those spiders.

We’ll write a method that encrypts a string with ROT13 and puts that on the webpage, and use some javascript to decrypt that on page display. ROT13 is a really simple cipher where you take characters a-z and shift them by half the alphabet.

This is a really simple one-liner borrowed from Jay Komineck

# Rot13 encodes a string
def rot13(string)
  string.tr "A-Za-z", "N-ZA-Mn-za-m"
end

Lets use this to really beef up our link helper by using some javascript that can decipher this. JS code taken from Allan Odgaard

string = '#{email}'.replace(/[a-zA-Z]/g, 
  function(c){ 
    return String.fromCharCode(
      (c <= 'Z' ? 90 : 122) >= (c = c.charCodeAt(0) + 13) ? c : c - 26
    );
  });

Now we’ve got some pretty strong defense against those pesky robots and by using simple HTML character encoding and lightweight ROT13 ciphering it shouldn’t be too taxing on your webserver to spit out a page with a few emails on it. Less sophisticated browsers still get the contact info and everyone is a little bit happier to come home to a (relatively) clean inbox.

Here’s the whole shebang put together, put this in application_helper.rb if using rails:

# Rot13 encodes a string
def rot13(string)
  string.tr "A-Za-z", "N-ZA-Mn-za-m"
end
 
# HTML encodes ASCII chars a-z, useful for obfuscating
# an email address from spiders and spammers
def html_obfuscate(string)
  output_array = []
  lower = %w(a b c d e f g h i j k l m n o p q r s t u v w x y z)
  upper = %w(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
  char_array = string.split('')
  char_array.each do |char|  
    output = lower.index(char) + 97 if lower.include?(char)
    output = upper.index(char) + 65 if upper.include?(char)
    if output
      output_array << "&##{output};"
    else 
      output_array << char
    end
  end
  return output_array.join
end
 
# Takes in an email address and (optionally) anchor text,
# its purpose is to obfuscate email addresses so spiders and
# spammers can't harvest them.
def js_antispam_email_link(email, linktext=email)
  user, domain = email.split('@')
  user   = html_obfuscate(user)
  domain = html_obfuscate(domain)
  # if linktext wasn't specified, throw encoded email address builder into js document.write statement
  linktext = "'+'#{user}'+'@'+'#{domain}'+'" if linktext == email 
  rot13_encoded_email = rot13(email) # obfuscate email address as rot13
  out =  "<noscript>#{linktext}<br/><small>#{user}(at)#{domain}</small></noscript>n" # js disabled browsers see this
  out += "<script language='javascript'>n"
  out += "  <!--n"
  out += "    string = '#{rot13_encoded_email}'.replace(/[a-zA-Z]/g, function(c){ return String.fromCharCode((c <= 'Z' ? 90 : 122) >= (c = c.charCodeAt(0) + 13) ? c : c - 26);});n"
  out += "    document.write('<a href='+'ma'+'il'+'to:'+ string +'>#{linktext}</a>'); n"
  out += "  //-->n"
  out += "</script>n"
  return out
end

I hope this helps out somebody out there, please leave a comment if you have any suggestions.

Getting attachment_fu to play nice with acts_as_versioned

Posted by unixmonkey on February 09, 2008

If you’ve ever wanted to keep track of revisions to document files or images in your Rails app, you are likely to want to use Acts_as_versioned, which is the authority on versioning database records, and Attachment_fu, which is the authority on uploading files with Rails.

The problem is that they don’t know about each other and will step on each other’s toes without some changes. This article serves as a quick introduction to each, and shows how to make the two plugins get along like best friends.

Acts_as_versioned was written by Rails Core Team member Rick Olsen (who also wrote attachment_fu and Restful_authentication among others) that essentially makes a mirror table of the one you want to version, and keeps every version of the record you are updating.

Say I have a document table with fields like this:

id title description
1 rep08 2008 report

Acts_as_versioned will add a column “version”, and a separate table “document_versions”.

id title description version
1 rep08 2008 report 1

The document_versions table will look a bit like this

id document_id title description version
1 1 rep08 2008 report 1

Setting up acts_as_versioned is pretty simple, I got most of my introduction to it from urbanhonking.com

Now every time you update the original document, the changes are saved in your main documents table, and the version column is incremented by 1.

After a few edits of the document, you’ll see the versioning information in the Document_versions table add up.

id document_id title description version
1 1 rep08 2008 report 1
2 1 rep08 2008 report changed 2
3 1 rep08 chgd 2008 report changed 3

Great! We can now use some of acts_as_versioned’s built-in methods for determining if there are older versions, and be able to view or even revert to them.

Now lets add the ability to upload a file to attach to a document record with attachment_fu.

Attachment_fu is another plugin that makes uploading files and keeping track of them in the database relatively simple.

A good intro to attachment_fu can be found on Mike Clark’s blog

Attachment_fu would require a few changes to our documents table:

id title description version filename content_type size
1 rep08 2008 report 1 rep08.jpg image/jpeg 2854

Don’t forget to add the same fields to your documents_versions table, too.

Once we’ve added the right file fields to the new and edit forms, and image_tag or download link on the show view, we’ve got working file uploads. Nice.

Try to edit a record by attaching a new file, the new file is displayed and the record is preserved as an older version in the versioned table. But if you try to view the old version…wait a minute? Where did my version 1 file go!

That’s right, attachment_fu deletes the old file when you add a new one (as it should if you aren’t versioning your data). Attachment_fu’s rename_file method is the one responsible for deleting (or renaming) the old file when a new one is added, so lets monkeypatch that in our model to not do anything.

def rename_file
end

Now, it will only overwrite the file if the filename is the same. Lets store each version in its own folder to keep them from clobbering each other by monkey-patching the path files get written to in our model also:

def attachment_path_id
  "/#{id}/v#{version}/"
end
def partitioned_path(*args)
  attachment_path_id + args.to_s
end

This changes the public path from /0000/0001/rep08.jpg to /1/v1/rep08.jpg

Now, if we want to display the image, we cannot use the ‘public_filename’ method, because it is only given to the Document model, and not the Document_Version model.

That’s okay, because with our new path arrangement, we can reliably predict where the old versions of the files will be kept. You can show them with some code similar to this in your views:

<% for version in @document.versions %>
 
  Version <%= version.version %>
  <%= image_tag("/documents/#{@document.id}/v#{version.version/" + version.filename) %>
  <hr />
 
<% end %>

Now, when we delete a record, attachment_fu only knows about the current document, and will leave behind orphaned files and folders from the old versions. Lets fix that by having it get rid of the document id folder.

Rails reserves some special methods (callbacks) for performing actions before or after other major actions, lets tap into that by defining a method that will magically get called every time we delete a record.

def after_destroy
  FileUtils.rm_rf(RAILS_ROOT + "/public/documents/#{id}/")
end

This translates into the shell command rm -rf and deletes our ID directory and everything inside it.

Hooray!

As a wrap up, lets look at our complete Document model:

class Document < ActiveRecord::Base
  acts_as_versioned
  has_attachment :storage => :file_system
 
  def rename_file
  end
 
  def attachment_path_id
    "/#{id}/v#{version}/"
  end
 
  def partitioned_path(*args)
    attachment_path_id + args.to_s
  end
 
  def after_destroy
    FileUtils.rm_rf(RAILS_ROOT + "/public/documents/#{id}/") if id
  end
 
end

I’ve whipped up a sample Rails app demonstrating the points and code in this article. It uses Rails 2.0.2 with the sqlite3 database.

Download it here: Attachments_versioned (240kb .zip)

I hope this saves some work for someone who wants to leverage these two excellent plugins by Rick Olsen (technoweenie) on the same model without having them fight too much.

Autotest with custom growl notifications in Leopard 5

Posted by unixmonkey on January 17, 2008

Autotest is part of the ZenTest suite for testing ruby and rails apps. Instead of running ‘rake test’ every time you want to run your tests, simply open another tab in your terminal, browse to your rails project directory and run ‘autotest’. It will run your test suite and sit there waiting for any file to be modified. When you edit any file in your rails project, autotest will automatically run your tests again.

To install ZenTest and autotest, open a terminal and run ‘sudo gem install ZenTest’.

This is great, awesome even. I don’t want to understate how useful this is, its like breathing when doing test-driven development, but when I’m coding, I’ll often have a different terminal up front with a script/console shell or tail-ing the development log. I want my tests to get in my face when something goes wrong.

The way to do this is with Growl on OSX, Snarl for Windows, or several other similar pop-up notification apps.

As of this writing, the latest version of Growl is 1.1.2, the ZenTest Gem is 3.8.0, and OSX Leopard is 10.5.1. I installed Leopard as a clean install.

After downloading and installing Growl, while the .dmg is still mounted, open a terminal and run ‘/Volumes/Growl 1.1.2/Extras/growlnotify/install.sh’.

This will put a commandline tool called ‘growlnotify’ in your /usr/local/bin dir that is used to pop up those messages. Now you can use a text editor to create a ~/.autotest file.

Lets start simple and check that growl is working with autotest, add this line to your .autotest file:

require ‘autotest/growl’

Then save, and go into your Rails project and run ‘autotest’

If all the pieces were installed right, you should get some nice popups when you run autotest and each time you modify a file in your rails project directory.

Lets pretty this up a bit by over-riding the Autotest::Growl module in our .autotest file. I’ve pretty much copied this verbatim from /Library/Ruby/Gems/1.8/gems/ZenTest-3.8.0/lib/autotest/growl.rb
only I’ve added an image directive to the autotest hooks, a nice red rails logo for test failures, and a green one for test passes, then changed the growl method to substitute my images into the growlnotify command. Be sure the modify the image paths for your images directory.

# ~.autotest
 
require 'autotest/redgreen'
require 'autotest/growl'
 
module Autotest::Growl
 
  def self.growl title, msg, img="/Applications/Mail.app/Contents/Resources/Caution.tiff", pri=0
    title += " in #{Dir.pwd}"
    msg += " at #{Time.now.strftime("%I:%M %p")}"
    system "growlnotify -n autotest --image #{img} -p #{pri} -m #{msg.inspect} #{title}"
  end
 
  Autotest.add_hook :run do  |at|
     growl "autotest running", "Started"
   end
 
   Autotest.add_hook :red do |at|
     img = "/Users/djones/.autotest_images/rails_fail.png"
     growl "Tests Failed", "#{at.files_to_test.size} tests failed", img, 2
   end
 
   Autotest.add_hook :green do |at|
     img = "/Users/djones/.autotest_images/rails_ok.png"
     growl "Tests Passed", "Tests passed", img, -2 if at.tainted
   end
 
   Autotest.add_hook :all_good do |at|
     img = "/Users/djones/.autotest_images/rails_fail.png"
     growl "Tests Passed", "All tests passed", img, -2 if at.tainted
   end
 
end

Now create a .autotest_images folder in your home directory and put these images in there (I got them from here), or you can use your own.

rails_ok.pngrails_fail.png

Thin: slimmer and faster than mongrel

Posted by unixmonkey on January 05, 2008

We all know mongrel is the bees knees when it comes to serving Rails or Merb apps, even if its creator had a meltdown.

But there is a new kid on the block named thin that claims to run Rails apps almost twice as fast as mongrel. Check the graph.

Thin vs mongrel vs webrick comparison chart

A simple ‘gem install thin’ and going into your rails app and issuing ‘thin start’ is enough to get you shedding weight.

This is definitely something I’ll be playing with for the next few days.

Setting up a production rails server step-by-step

Posted by unixmonkey on December 27, 2007

Pushing your app to the real world with Rails can be a very daunting task to someone who is used to just uploading flat html or php files.

If you’ve never dealt with Apache, mod_rewrite, or proxy servers, prepare to spend a lot of time hammering out a solution. Worried about performance, or your need to scale out at a later date? Pick a solution that won’t leave you wondering if your site will be able to take a spike in traffic.

For Rails, there are lots of deployment strategies, some are tuned for compatibility with shared hosting (fastcgi), and some are built for speed and minimum configuration (Litespeed Nginx and Lighttpd), and some are built to leverage the strength and flexibility of stable and established server software (Apache+Mongrel).

If you want to scale, you’re going to need a load balancing proxy. The choices here are pound, pen, and apache’s mod_proxy_balancer.

An abundance of choice is a double-edged sword. Competition is good in any arena, but it makes it very hard for someone who hasn’t tried them all to choose one.

I’ve chosen to setup my Rails server using the Apache+Mongrel+mod_proxy_balancer combination.

For the operating system, I’ve chosen the newest Ubuntu 7.10 (Gusty) server. Ubuntu uses the Debian-style apt package management, but with more current packages than Debian stable, and is the current cool kid on the block for Linux systems. I have been using Ubuntu for years and can attest to its stability and cutting edge (but not bleeding edge) packages.

I’ve gone through and built a Rails server by hand before, but poorly documenting what I had done, and not securing it properly. When I stumbled upon Slicehost’s server setup articles, I knew I had found exactly what I needed to build a server configured like a pro, and all the documentation I should have written in the first place.

Here is the step-by-step setup for a production Rails server. I’ve tested this setup and can vouch for its awesomeness. I encourage you to make some changes specific to your setup where appropriate. You should be able to skip unneccesary stuff like php and virtual hosts if you don’t need them.

Ubunty Setup – part 1
Ubunty Setup – part 2
Myql and Rails
Apache and PHP
Apache Virtual Hosts
Vhosts and permissions
Apache Vhosts, Rails and Mongrels
Mongrel Clustering
Subversion intro
Setting up svnserve
Securing svnserve with ssh
Setting up Capistrano
Setting up Piston to manage plugins

As you can see, most of these are from Slicehost’s documentation articles. They also detail setup for several other major linux distributions and other deployment strategies like nginx. Major props to slicehost for putting such excellent documentation together.

Rails exceptions and debug trace even in production mode? 11

Posted by unixmonkey on December 17, 2007

During my recent adventure of setting up a production server for my rails applications, I stumbled upon some strange behavior where if I typed in a non-existent route or caused rails to barf all over itself, it would still show me a debug stack trace instead of the 400.html or 500.html living in /public

I uncommented the ENV[‘RAILS_ENV’] ||= ‘production’ line in /config/environment.rb, and the behavior persisted. I created a controller and view to echo out the environment, and did confirm that the app is running in production mode.

Googling only led to minor success where it was alluded in 1 or 2 posts that rails might think it is still running locally.

I started to eye my Apache mod_proxy_balancer + mongrel_cluster arrangement, where the cluster is set to spawn instances on 127.0.0.1 (localhost). After changing that local address to the outside facing address (192.168.x.x) and accessing it at that address in my vhost conf. Things were working as expected.

Now why would rails act this way? Searching the Rails API led me to the rescue_action and rescue_action_in_public methods, which led me to the local_request? method of determining which of the previous methods get called.

Great, now I’ve found the culprit. The code looks like the below:

# from rescue_action
if consider_all_requests_local || local_request?
  rescue_action_locally(exception)
 else
  rescue_action_in_public(exception)
end
 
#local_request?
def local_request?
  request.remote_addr == LOCALHOST and request.remote_ip == LOCALHOST
end

An easy fix is to drop a method that overrides this in application.rb

def local_request?
  return false if RAILS_ENV == 'production'
end

So, if you are having trouble making your app behave like its in production mode, give it a little spanking and remind it not to trust strangers with its debug trace.

Rapid Rails Deployment with Heroku Beta

Posted by unixmonkey on November 21, 2007

Today I got an email from heroku.com to beta test their rails hosting. Its pretty cool as it provides a web interface to setting up and editing your ruby on rails applications.

Within seconds of my getting the email, I had logged in and created a new rails app. Heroku automatically added a database.yml file pre-populated with the correct database connection info.

I wanted to see if I could get mephisto running quickly, as my host for unixmonkey.net had gone down last week and shows no sign of returning. I opened their interactive rake terminal and ran db:schema:load and db:bootstrap and my app was instantly available to me to log in and start configuring.

Everything was working fine for several hours while I was setting up to make my first huge post, but them I went to check it out at 6pm, it was down with a 500 error (ngnix), even the main heroku.com was down with a “zero sized reply”. But I checked again 2 hours or so later, and it was back up.

This is hands-down the easiest deployment I’ve ever seen. Either start with a fresh rails app, or import yours as a gzipped tar file. The only downsides so far:

* There is is a 10mb limit per app, so you can’t freeze rails in your vendor directory.
* There is a bandwidth limit I hit almost immediately after making my app public, and you can’t admin your site at all when overlimit, best you can do is export a dump of your app.
* Your app is always running in development mode. (production and other are planned for later release).

All these are likely just part of growing pains, and will be addressed eventually.

I sent an email to their support staff and my questions were addressed within hours, even late on a Saturday. You can tell its a labor of love.

Sign up for the beta. Its really cool.

Big Nerd Ranch Ruby on Rails Retreat

Posted by unixmonkey on November 19, 2007

After poking around with Ruby on Rails and reading every book and following every tutorial I could get my hands on for over 5 months, I decided to get real and sign up for the Big Nerd Ranch Ruby on Rails course being taught by Charles Brian Quinn, consultant for Highgroove Studios and co-founder of Slingshot Hosting.

I hadn’t previously heard of Big Nerd Ranch, so I was a little worried about the quality of the course, but with the Pragmatic Programmers Rails course being only two days, I felt I needed more time to be completely absorbed in the course material. The Big Nerd Ranch approach to removing you from from all distractions in a quiet, comfortable setting with a very small class size seemed to be just what I needed to focus and regroup.

Continue reading…

Skinning your entire Rails app

Posted by unixmonkey on September 18, 2007

Say I want to deploy my app as a service to several good customers, but I want my app to share the exact layout as each customer site, so it appears to be an internal app to their company and its own clients. I take thier CSS and HTML markup, and make it into a template to wrap around my app’s content.

When I took to task to replicate that functionality in Ruby on Rails, I wasn’t certain would be easily accomplished. All the tutorials I’ve read list template switching as purely CSS affairs. Using CSS to change the presentation of your site is a fine thing, but isn’t enough if you are looking to completely copy the look and feel of a client’s website with your app dropped in.

It took some thought and a little help from the community.

Previously, the system I was used to accepted a site variable passed with the login form that told which template to render. like http://myapp.com/?site=ibm

That’s ugly and really obvious to what it is doing, but what other way is there to know which template to render?

I recall setting up accounts at a handful of sites that were in the format of: http://myusername.myapp.com

This is known as using a subdomain as an account key. Luckily, there is a very simple plugin to set that up in Rails http://wiki.rubyonrails.org/rails/pages/HowToUseSubdomainsAsAccountKeys

After setting up the plugin, I’m able to access the subdomain name anywhere in the app. Good thing, because I need to access it in application.rhtml (or .haml)

 
<% # if there's a subdomain, render the partial of the same name -%>
<% if account_subdomain -%>
    <% # first check to make sure account_subdomain is valid and in accounts table  -%>
    <% @account = Account.find_by_username(account_subdomain) -%>
    <% unless @account.nil? -%>
        <%= render :partial => "layouts/"+account_subdomain %>
    <% else -%>
        <% # There's a subdomain, but it isn't valid. Render default template -%>
        <%= render :partial => "layouts/default" %>
    <% end -%>
<% else -%>
    <%= render :partial => "layouts/default" %>
<% end -%>

Notice in the above, that there are no :yield statements. That’s because :yield can live in a partial too. I’ve set up a partial for each subdomain listed in my database in app/views/layouts, and added some logic to render the default layout when a subdomain either isn’t present or is invalid.