Tuning Resque workers' memory usage

One of the problems we had to deal with was that we have only one server with 2GB of memory, and we needed to spawn the largest possible number of workers.
To achieve that, we must avoid loading the Rails environment in the Resque worker. Start by changing your Rakefile to look like this:

unless ENV['RESQUE_WORKER'] == 'true'
  # Regular case: load the full Rails application
  require File.expand_path('../config/application', __FILE__)
  require 'rake'
else
  # Worker case: skip Rails and set up only what the worker needs
  ROOT_PATH = File.expand_path("..", __FILE__)
  require 'bundler'
  require 'rake'
  require File.join(ROOT_PATH, 'config/initializers/_config')
  ENV['BUNDLE_GEMFILE'] = File.expand_path('Gemfile', File.dirname(__FILE__))
end

In your Gemfile you should create a group containing only the gems that are used in your worker.
Next, change the rake task definitions for this stripped-down environment. Here is an example of our lib/tasks/resque.rake:

# -*- coding: UTF-8 -*-
require 'yaml'
require "resque_scheduler"
require 'resque_scheduler/tasks'
require 'resque/tasks'

namespace :resque do
  # resque:setup runs before the worker starts
  task :setup => :anemic_environment
end

task :anemic_environment do
  raise "Please set your RESQUE_WORKER variable to true" unless ENV['RESQUE_WORKER'] == "true"
  # Make models, consumers and lib files reachable by name
  $:.unshift File.join(ROOT_PATH, "app/models")
  $:.unshift File.join(ROOT_PATH, "app/consumers")
  $:.unshift File.join(ROOT_PATH, "lib")
  # Connect to the database by hand, since Rails is not loaded
  db = YAML.load_file File.join(ROOT_PATH, "config/database.yml")
  puts db[RAILS_ENV] # debug output; note that this prints the connection settings
  ActiveRecord::Base.establish_connection db[RAILS_ENV]
  # Register every file for lazy loading under its camelized constant name
  Dir[
      File.join(ROOT_PATH, "app/consumers/*.rb"),
      File.join(ROOT_PATH, "app/models/*.rb"),
      File.join(ROOT_PATH, "lib/*.rb")
  ].each do |f|
    name = f.split("/").last.gsub(/\.rb$/, "")
    autoload name.camelize, name
  end
end


As you can see in the "anemic_environment" rake task, we add our models, workers (which we call consumers) and the files from the lib directory to the load path. This is only a convenience for me; you don't need to do it. We also open the database connection manually, since we aren't loading the Rails environment (Rails really saves us a lot of work :P )
We also used autoload (a Ruby built-in; the camelize helper comes from activesupport), so we don't have to require our files manually. This approach is also convenient because the worker won't require a file until it is needed (very useful when performance is not an issue, but memory is). If you have performance issues, it's better to require the files before starting the worker (in this "anemic_environment" task, for example).
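A minimal sketch of the lazy loading described above, using only Ruby's built-in autoload. The Greeter class and the temporary directory are hypothetical stand-ins for a file under app/models; they are not part of our application.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a file such as app/models/greeter.rb.
dir = Dir.mktmpdir
File.write(File.join(dir, "greeter.rb"), <<~RUBY)
  class Greeter
    def self.hello
      "hello from Greeter"
    end
  end
RUBY

$:.unshift dir                        # same trick as the rake task: extend the load path
Object.autoload(:Greeter, "greeter")  # registers the constant, loads nothing yet

loaded_before = $LOADED_FEATURES.any? { |f| f.end_with?("greeter.rb") }
message       = Greeter.hello         # the first reference triggers the require
loaded_after  = $LOADED_FEATURES.any? { |f| f.end_with?("greeter.rb") }

puts loaded_before  # false
puts message        # hello from Greeter
puts loaded_after   # true
```

The file is read only when the constant is first touched, which is why a worker that never uses a given model never pays for it.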
Doing this, we were able to reduce the memory of a single consumer from 90MB to 54MB (40% less memory).
But we can still tune the worker a little more by not requiring the gems we depend on until we need them. This can be done by adding ":require => false" in the Gemfile:

group :default do
  gem 'rails', '3.0.9'
  gem 'authlogic'
end

group :default, :worker do
  gem 'mysql2', '~> 0.2.7'
  gem 'attr_encrypted'

  gem 'resque'
  gem 'resque-scheduler'
end

group :worker do
  gem 'activerecord', :require => 'active_record'
  gem 'activesupport', :require => 'active_support/all'

  # Installed, but not loaded until the worker requires them explicitly
  gem 'actionmailer', :require => false
  gem 'httpclient', :require => false
  gem 'twitter'   , :require => false
end

Finally we were able to reduce the memory usage to 36MB (60% less than the default way). That is a really great improvement and now we can spawn twice the original number of workers in our application.
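With ":require => false", the gem is installed but never loaded automatically, so the code that uses it has to require it itself. Here is a rough sketch of how a consumer might do that. TweetConsumer is a hypothetical name, and the standard library's json stands in for a gem such as twitter or httpclient so the snippet runs on its own.

```ruby
# Hypothetical consumer, following the usual Resque job shape
# (a @queue ivar and a self.perform class method).
class TweetConsumer
  @queue = :tweets

  def self.perform(payload)
    # 'json' stands in for a gem declared with :require => false.
    # The first call loads the library; later calls are cheap no-ops.
    require 'json'
    JSON.parse(payload).fetch("text")
  end
end

puts TweetConsumer.perform('{"text": "hello"}')  # hello
```

Workers that never process this queue never load the library, which is where the extra memory savings come from.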


Securing your LAMR a.k.a. Being a devops but not such a lame one

When deploying stuff in a production environment, security is always a concern.
Here is a summary of some basic guidelines for making a LAMR (Linux, Apache, MySQL, Ruby) server more secure.

1) Linux

Disable root account
To disable password logins for the root account, run the following command:
sudo passwd -l root

User password strength
Having a strong password is recommended to avoid dictionary attacks.
Always set a long, randomly generated password.
Avoid using dictionary words and passwords related to your personal life.
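One way to get a long random password is to let Ruby's standard library generate it. This is only a sketch: the helper name and the length of 24 characters are arbitrary choices of mine.

```ruby
require 'securerandom'

# Returns a random password made of A-Z, a-z and 0-9.
# The default length is an assumption; longer is better.
def random_password(length = 24)
  SecureRandom.alphanumeric(length)
end

puts random_password
```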

Use ssh keys
Allow access only with ssh keys and only from trusted hosts, since passwords are more easily broken. To do so, add your public key to the ~/.ssh/authorized_keys file on the server, restricted to a trusted host:
from="trusted_host" ssh-rsa public_key Comment describing key

Secure your ssh server
To secure your ssh server, you must make some changes in sshd_config:
- Do not allow root login

PermitRootLogin no

- Change the listen port

Port 666

- Allow only certain users to have access via ssh

AllowUsers username username@trusted_host

- Disable the protocol version 1

Protocol 2

- Add a banner to users connecting to the ssh server (legal protection)

Banner /etc/some_file

- Don't allow password authentication

PasswordAuthentication no

- Disable unused authentication modules (even if disabled by default)
- Restrict access to file transfer

Keep your system and packages updated
Once a vulnerability is found, software updates are released to fix it. Old software versions may have a lot of vulnerabilities that are well known to the hacker community.

Run the minimum number of services required and install the minimum amount of software required
More software means more potential vulnerabilities, so keep your software to the necessary minimum.

Don't install C compilers and other development utilities in your production server
Development tools may help an attacker compromise the system even further, for example by making privilege escalation easier.

Always run daemons under a new user and group
An attack on software running as a certain user (nobody, for example) may compromise other software running as that same user.
Also, remove shell access for these users.

2) Apache

Hide the apache version and other sensitive information
Attackers can use this information to exploit the vulnerabilities of that specific version.
It is also an indicator that you left most defaults alone.
Change your apache configuration file to look like this:

ServerTokens Prod
ServerSignature Off
TraceEnable Off

Run using its own user and group
Make sure apache is running as its own user (sometimes it runs as nobody).

Files outside the web root directory should not be served

Turn off file browsing
This can be achieved by setting:

Options -Indexes

Turn off server side includes
SSI-enabled files can execute any CGI script or program under the permissions of the apache user, using the exec cmd element. Thus, this is a security risk.

Turn off support for .htaccess files
.htaccess files override the default configuration of the directory they are placed in. To avoid this, you need to set your configuration as follows:

AllowOverride None

Disable unused modules
The more modules you have enabled, the greater the chance that you have a security vulnerability.

Don't allow apache to follow symbolic links
Following symlinks can make apache serve files outside your web root directory.
To avoid this, set the following in your config file:

Options -FollowSymLinks

3) MySQL

Restrict access to the user table

Multiple applications should access the mysql server with different users

Grant only the necessary privileges to your application user in database

Secure root account with a strong password

Do not use plain text passwords in the database

Secure access to mysql port from untrusted hosts

Always escape SQL statement inputs in your application code
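For the point about not keeping plain text passwords in the database, here is a minimal sketch of storing a salted hash instead, using PBKDF2 from Ruby's bundled openssl library. The helper names, the iteration count and the key length are assumptions made for illustration; a dedicated library such as bcrypt is a common alternative.

```ruby
require 'openssl'
require 'securerandom'

ITERATIONS = 100_000  # assumed work factor; tune it for your hardware
KEY_LENGTH = 32       # bytes

# Returns the salt and the derived hash, both safe to store in the database.
def hash_password(password, salt = SecureRandom.hex(16))
  raw = OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, KEY_LENGTH,
                                   OpenSSL::Digest.new("SHA256"))
  { salt: salt, hash: raw.unpack1("H*") }
end

# Recomputes the hash with the stored salt and compares without
# stopping at the first differing byte.
def password_matches?(password, stored)
  candidate = hash_password(password, stored[:salt])[:hash]
  return false unless candidate.bytesize == stored[:hash].bytesize
  candidate.bytes.zip(stored[:hash].bytes).reduce(0) { |acc, (a, b)| acc | (a ^ b) }.zero?
end

record = hash_password("correct horse battery staple")
puts password_matches?("correct horse battery staple", record)  # true
puts password_matches?("wrong guess", record)                   # false
```

Only the salt and the hash reach the database, so a leaked table does not directly reveal the passwords.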

4) Ruby

Use SSL
SSL provides a secure connection, where the data is encrypted. This helps avoid session hijacking through cookie sniffing.

Expire sessions on inactivity
Sessions that never expire are a danger that can lead to attacks (CSRF and session hijacking, for example). Expiring sessions leaves a smaller time frame in which attacks can happen.
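A rough sketch of an inactivity check in plain Ruby. The session here is an ordinary Hash standing in for the Rails session, and the 30 minute timeout, the :last_seen_at key and the helper names are all assumptions.

```ruby
SESSION_TIMEOUT = 30 * 60  # seconds; assumed value

# A session with no timestamp is treated as expired.
def session_expired?(session, now = Time.now)
  last_seen = session[:last_seen_at]
  last_seen.nil? || (now - last_seen) > SESSION_TIMEOUT
end

# Call this on every authenticated request to record activity.
def touch_session(session, now = Time.now)
  session[:last_seen_at] = now
end

session = {}
touch_session(session, Time.at(0))
puts session_expired?(session, Time.at(60))      # false
puts session_expired?(session, Time.at(31 * 60)) # true
```

In a Rails application a check like this would typically live in a before filter that calls reset_session when the session has expired.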

Reset session on every successful login
This way you can avoid session fixation attacks (an attacker using a fixed session identifier to access a user's session).

Use protect_from_forgery when possible
protect_from_forgery adds a security token as hidden input in your forms, that helps prevent CSRF attacks.

Do not allow user to supply parts of the URL to be redirected to
Allowing users to provide parts of the URL can redirect someone to a completely different site (sometimes one that looks like yours, as in phishing attacks).
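If you really need a user-supplied return address, one option is to accept only paths on your own site. The helper below is a sketch under that assumption; its name and the "/" fallback are mine.

```ruby
require 'uri'

# Returns target only when it is a plain path on the current site
# (no scheme, no host, starts with "/"); otherwise returns fallback.
def safe_redirect_path(target, fallback = "/")
  uri = URI.parse(target.to_s)
  local = uri.scheme.nil? && uri.host.nil? && uri.path.to_s.start_with?("/")
  local ? target : fallback
rescue URI::InvalidURIError
  fallback
end

puts safe_redirect_path("/projects/1")                # /projects/1
puts safe_redirect_path("http://evil.example/login")  # /
puts safe_redirect_path("//evil.example/login")       # /
```

The third case matters because a URL starting with "//" keeps the current scheme but points to another host.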

Use ip restriction to access admin interface
You usually won't need your admin interface open to all the world, so restricting access by ip address would narrow the range of attacks to your admin pages.
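A small sketch of such a check using IPAddr from the standard library. The networks listed are made-up examples and admin_ip_allowed? is a hypothetical helper; the same restriction can also be enforced at the web server or firewall level.

```ruby
require 'ipaddr'

# Assumed example networks; replace them with your own office or VPN ranges.
ADMIN_NETWORKS = [
  IPAddr.new("10.0.0.0/8"),
  IPAddr.new("192.168.1.0/24")
].freeze

# True only when remote_ip parses and falls inside an allowed network.
def admin_ip_allowed?(remote_ip)
  ip = IPAddr.new(remote_ip)
  ADMIN_NETWORKS.any? { |network| network.include?(ip) }
rescue IPAddr::Error
  false
end

puts admin_ip_allowed?("192.168.1.20")  # true
puts admin_ip_allowed?("8.8.8.8")       # false
```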

Use separate credentials for admin login
Having separate admin credentials can prevent a user from achieving privilege escalation by changing some parameters of their profile.

Use attr_protected in sensitive model information
Mass assignment allows users to change any column related to the model being edited/created.
If you have any sensitive model information in a column, you should use attr_protected method to avoid mass assignment of this attribute.
But as in any security guideline, it is preferable to use whitelist rather than blacklist, so it's better to use attr_accessible.
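The whitelist idea can be illustrated in plain Ruby, without Rails. This is not the attr_accessible implementation, just a sketch of the principle; the attribute names are hypothetical.

```ruby
# Hypothetical attribute names; only these may be set from user input.
ACCESSIBLE_ATTRIBUTES = %w[name email].freeze

# Keeps the whitelisted keys and silently drops everything else.
def permitted_attributes(params)
  params.select { |key, _value| ACCESSIBLE_ATTRIBUTES.include?(key.to_s) }
end

submitted = { "name" => "Ana", "email" => "ana@example.com", "admin" => true }
p permitted_attributes(submitted)
```

The "admin" key never reaches the model, and any column added later stays protected until someone adds it to the list on purpose.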

Use captcha after some failed login attempts
Using a captcha after some unsuccessful login attempts helps prevent brute force attacks against a user account.
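A minimal sketch of the bookkeeping behind this rule. LoginAttempts is a hypothetical class that keeps its counters in memory; a real application would store them somewhere shared (the database or Redis, for example), and the threshold of 3 is an assumption.

```ruby
MAX_FREE_ATTEMPTS = 3  # assumed threshold before the captcha appears

class LoginAttempts
  def initialize
    @failures = Hash.new(0)  # login => consecutive failed attempts
  end

  def record_failure(login)
    @failures[login] += 1
  end

  # A successful login clears the counter.
  def record_success(login)
    @failures.delete(login)
  end

  def captcha_required?(login)
    @failures[login] >= MAX_FREE_ATTEMPTS
  end
end

attempts = LoginAttempts.new
3.times { attempts.record_failure("ana") }
puts attempts.captcha_required?("ana")  # true
puts attempts.captcha_required?("bob")  # false
```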

Be careful not to log sensitive information
Be careful not to log stuff like passwords, credit card numbers and other sensitive information. The filter_parameters setting helps you with that, but it is not 100% effective, so be aware of what you are logging.
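For anything you log by hand, outside the request parameters, a small filter along the same lines can help. This is a plain Ruby sketch, not the Rails implementation; the key pattern and the filter_for_log name are assumptions.

```ruby
SENSITIVE_KEYS = /password|card_number|secret/i  # assumed key names

# Returns a copy of the data with sensitive values masked,
# walking nested hashes and arrays.
def filter_for_log(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), out|
      out[key] = key.to_s.match?(SENSITIVE_KEYS) ? "[FILTERED]" : filter_for_log(inner)
    end
  when Array
    value.map { |item| filter_for_log(item) }
  else
    value
  end
end

params = { "login" => "ana", "password" => "hunter2",
           "payment" => { "card_number" => "4111111111111111" } }
p filter_for_log(params)
```

The original hash is left untouched, so the application can keep using the real values while the log only sees the masked copy.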

Design your controllers restricting resources access
If some resources belong to a single user, design your controllers accordingly, so that users cannot access other users' resources simply by changing the url.

# wrong
@project = Project.find(params[:id])
# right
@project = @current_user.projects.find(params[:id])

Escape all inputs (especially client ones)
Never trust inputs in your system; always escape them.
- To avoid SQL injection use placeholders

Project.where(["name in (?)", names])

- To avoid command injection use methods with sanity check

system("/bin/echo","hello; rm *")

- To avoid HTML/JavaScript Injection escape inputs (default in Rails 3)
- To avoid Header Injection, don't let user input set parts of a header.
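The last two points can be sketched with the standard library alone. Rails 3 already escapes HTML in views by default, so this is mostly useful for text built outside of them; safe_html and safe_header_value are hypothetical helper names.

```ruby
require 'erb'

# HTML: escape user text before embedding it in a page.
def safe_html(text)
  ERB::Util.html_escape(text)
end

# Headers: refuse values containing CR or LF, which could be used
# to start a new header line.
def safe_header_value(value)
  raise ArgumentError, "line breaks are not allowed in header values" if value =~ /[\r\n]/
  value
end

puts safe_html("<b>hi</b>")        # &lt;b&gt;hi&lt;/b&gt;
puts safe_header_value("/home")    # /home
```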

Run your daemons with a new user and group
As mentioned before, you should avoid running multiple pieces of software under the same user.

There is a lot to improve yet, but I hope this small guideline is useful to you guys. Feel free to ask any questions.



Big Refactoring: when two became one a.k.a. why do I need tests and test environments

We had an issue these days that is very common in the dev world: some of our models didn't represent our business very well. Wrong naming and wrong modeling always drive me crazy. This time we had two models that didn't really stand as separate concepts, and we thought it was better to merge them into a new, better-named model. The real challenge was that they were core models in our system, so there were a lot of changes involved.

So what is the best way to refactor this code?
- We could make lots of small changes, transferring data gradually from one model to another (less risky and with less impact, but taking a lot of time)
- We could make one big change at once, transferring all data from both models to a completely new one (more risky and with more impact, but taking considerably less time)

We went for the second choice, but with a slight difference: we killed only one model instead of both, leaving the creation of the new model to a second step.
Since we have lots of tests in our system and a complete test environment, the risks and impact of a big refactoring are dramatically reduced. So it took only a couple of hours to create the db migration, merge all the logic into a single model and use sed to replace its occurrences in the code.

Some details here that will help make your refactoring easy to roll back:
- when refactoring models that have relationship with other models, keep the old column in the database
- in case of killing some models, keep the old tables instead of deleting them
- always have a rollback plan
- make your deploy/rollback tasks as automated as possible

Well, it's done. Let's run the tests. 59% failed. We forgot to change the test factories to represent the new model. Changes made, let's try again... 100% passed.

The next step is to write a rake task to migrate all data into a single model. Why not do this inside the migration files? Two reasons:
1) You should never use model objects inside migrations because, as in this case, you may need to rename them and your migrations will break
2) I'm not good enough at writing SQL to use the execute method for populating the table and fixing relations between the models

Once the rake task was done, we deployed this code to the test environment and ran some smoke tests.
With everything going fine, we felt more confident about using this in production. \o/

Now we just need to deploy in our production environment.
A core refactoring with not much pain, thanks to tests and a good test environment for running smoke tests.



I will try to use this tool to expose some of the issues and solutions I found while coding.

class Morellon
   def say
      "hello world"
   end
end

See you around...