SMS PIN Verification with Twilio and Django

This is a quick example of SMS PIN verification using Twilio cloud services and Django which I did for a small site recently.

SMS PIN form

The page template contains the following HTML snippet:

<input type="text" id="mobile_number" name="mobile_number" placeholder="111-111-1111" required>
<button class="btn" type="button" onClick="send_pin()"><i class="icon-share"></i> Get PIN</button>

and a JavaScript function utilizing jQuery:

function send_pin() {
    $.ajax({
        url: "{% url 'ajax_send_pin' %}",
        type: "POST",
        data: { mobile_number: $("#mobile_number").val() }
    })
    .done(function(data) {
        alert("PIN sent via SMS!");
    })
    .fail(function(jqXHR, textStatus, errorThrown) {
        alert(errorThrown + ' : ' + jqXHR.responseText);
    });
}

Then create the following views in Django:

import random

from django.conf import settings
from django.contrib import messages
from django.core.cache import cache
from django.http import HttpResponse
from django.shortcuts import redirect, render

from twilio.rest import TwilioRestClient

from .forms import OrderForm  # adjust to wherever OrderForm lives in your project


def _get_pin(length=5):
    """ Return a numeric PIN with length digits """
    return random.sample(range(10**(length-1), 10**length), 1)[0]


def _verify_pin(mobile_number, pin):
    """ Verify a PIN is correct """
    return pin == cache.get(mobile_number)


def ajax_send_pin(request):
    """ Sends SMS PIN to the specified number """
    mobile_number = request.POST.get('mobile_number', "")
    if not mobile_number:
        return HttpResponse("No mobile number", mimetype='text/plain', status=403)

    pin = _get_pin()

    # store the PIN in the cache for later verification.
    cache.set(mobile_number, pin, 24*3600) # valid for 24 hrs

    client = TwilioRestClient(settings.TWILIO_ACCOUNT_SID, settings.TWILIO_AUTH_TOKEN)
    message = client.messages.create(
                        body="%s" % pin,
                        to=mobile_number,
                        from_=settings.TWILIO_FROM_NUMBER,
                    )
    return HttpResponse("Message %s sent" % message.sid, mimetype='text/plain', status=200)

def process_order(request):
    """ Process orders made via web form and verified by SMS PIN. """
    form = OrderForm(request.POST or None)

    if form.is_valid():
        pin = int(request.POST.get("pin", "0"))
        mobile_number = request.POST.get("mobile_number", "")

        if _verify_pin(mobile_number, pin):
            form.save()
            return redirect('transaction_complete')
        else:
            messages.error(request, "Invalid PIN!")

    # fall through: initial GET, invalid form or invalid PIN
    return render(
                request,
                'order.html',
                {
                    'form': form
                }
            )

PINs are numeric only and are stored in the cache instead of the database, which I think is simpler and less expensive in terms of I/O operations.
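For reference, the cache configuration is standard Django; a minimal sketch, assuming a local Memcached instance (the backend and location below are placeholders, not the site's actual settings):

# settings.py - a minimal sketch, assuming a local Memcached instance;
# the LOCATION value is a placeholder, not the actual production setting
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
        'TIMEOUT': 24*3600,  # match the PIN validity period
    }
}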

There are comments.

Reducing AWS Cloud Costs - Real Money Example

Last month Amazon reduced EBS storage prices by 50%. This, combined with shrinking the EBS root volume size and moving /tmp to instance storage, allowed me to reduce the EBS related costs behind Difio by around 50%. Following are the real figures from my AWS bill.

EBS costs for Difio were gradually rising with every new node added to the cluster and with increased package processing (resulting in more I/O):

  • November 2013 - $7.38
  • December 2013 - $10.55
  • January 2014 - $11.97

January 2014:

EBS
$0.095 per GB-Month of snapshot data stored     9.052 GB-Mo     $0.86
$0.10  per GB-month of provisioned storage      101.656 GB-Mo  $10.17
$0.10  per 1 million I/O requests               9,405,243 IOs   $0.94
                                                        Total: $11.97

In February there was one new system added to process additional requests (cluster nodes run as spot instances) and an increased number of temporary instances (although I haven't counted them) while I was restructuring the AMI internals to accommodate the open source version of Difio. My assumption (based on historical data) is that this would have driven the costs up to around $15 per month for EBS alone.

After implementing the minimal improvements described above, and with Amazon's 50% price reduction, the bill looks like this:

February 2014:

EBS
$0.095 per GB-Month of snapshot data stored     8.668 GB-Mo     $0.82
$0.05  per GB-month of provisioned storage      58.012 GB-Mo    $2.90
$0.05  per 1 million I/O requests               5,704,482 IOs   $0.29
                                                        Total:  $4.01

Explanation

Snapshot data stored is the volume of snapshots (AMIs, backups, etc) which I have. This is fairly constant.

Provisioned storage is the volume of EBS storage provisioned for running instances (e.g. root file system, data partitions, etc.). This was reduced mainly because of shrinking the root volumes. (Previously I've used larger root volumes for a bigger /tmp).

I/O requests is the number of I/O requests associated with your EBS volumes. As far as I understand Amazon doesn't charge for I/O related to ephemeral storage. Moving /tmp from EBS to instance storage is the reason this was reduced roughly by half.

Where To Next

I've reduced the root volumes back to the 8GB defaults but there is still room for improvement b/c the AMI is quite minimal. This is where the largest savings will come from. Another thing is the still relatively high I/O rate that touches the EBS volumes. I haven't investigated where this comes from yet though.

There are comments.

Moving /tmp from EBS to Instance Storage

I've seen a fair number of stories about moving away from Amazon's EBS volumes to ephemeral instance storage. I've decided to give it a try, starting with the /tmp directory where Difio operates.

It should be noted that although instance storage may be available for some instance types, it may not be attached by default. Use this command to check:

$ curl http://169.254.169.254/latest/meta-data/block-device-mapping/
ami
root
swap

In the above example there is no instance storage present.

You can attach one either when launching the EC2 instance or when creating a customized AMI (instance storage devices are pre-defined in the AMI). When creating an AMI you can define more ephemeral devices, but they will not all be available when the instance is launched. The maximum number of available instance storage devices per instance type can be found in the docs. That is to say, if you have an AMI which defines 2 ephemeral devices and launch a standard m1.small instance, there will be only one ephemeral device present.

Also note that for M3 instances, you must specify instance store volumes in the block device mapping for the instance. When you launch an M3 instance, Amazon ignores any instance store volumes specified in the block device mapping for the AMI.

As far as I can see the AWS Console doesn't indicate whether instance storage is attached or not. For an instance with 1 ephemeral volume:

$ curl http://169.254.169.254/latest/meta-data/block-device-mapping/
ami
ephemeral0
root
swap

$ curl http://169.254.169.254/latest/meta-data/block-device-mapping/ephemeral0
sdb

Ephemeral devices can be mounted under /media/ephemeralX/, but not all of them are mounted automatically. I've found that usually only ephemeral0 is.

$ curl http://169.254.169.254/latest/meta-data/block-device-mapping/
ami
ephemeral0
ephemeral1
root

$ ls -l /media/
drwxr-xr-x 3 root root 4096 21 Nov  2009 ephemeral0

For Difio I have an init.d script which executes when the system boots. To enable /tmp on ephemeral storage I just added the following snippet:

echo $"Mounting /tmp on ephemeral storage:"
for ef in `curl http://169.254.169.254/latest/meta-data/block-device-mapping/ 2>/dev/null | grep ephemeral`; do
    disk=`curl http://169.254.169.254/latest/meta-data/block-device-mapping/$ef 2>/dev/null`
    echo $"Unmounting /dev/$disk"
    umount /dev/$disk

    echo $"mkfs /dev/$disk"
    mkfs.ext4 -q /dev/$disk

    echo $"Mounting /dev/$disk"
    mount -t ext4 /dev/$disk /tmp && chmod 1777 /tmp && success || failure
done

NB: success and failure are from /etc/rc.d/init.d/functions. If you are using LVM or RAID you need to reconstruct your block devices accordingly!

If everything goes right I should be able to reduce my AWS costs by saving on provisioned storage and I/O requests. I'll keep you posted on this after a month or two.

There are comments.

AWS Tip: Shrinking EBS Root Volume Size

Amazon's Elastic Block Store volumes are easy to use and expand but notoriously hard to shrink once their size has grown. Here is my tip for shrinking EBS size and saving some money from over-provisioned storage. I'm assuming that you want to shrink the root volume which is on EBS.

  • Write down the block device name for the root volume (/dev/sda1) - from AWS console: Instances; Select instance; Look at Details tab; See Root device or Block devices;
  • Write down the availability zone of your instance - from AWS console: Instances; column Availability Zone;
  • Stop instance;
  • Create snapshot of the root volume;
  • From the snapshot, create a second volume, in the same availability zone as your instance (you will have to attach it later). This will be your pristine source;
  • Create new empty EBS volume (not based on a snapshot), with smaller size, in the same availability zone - from AWS console: Volumes; Create Volume; Snapshot == No Snapshot; IMPORTANT - size should be large enough to hold all the files from the source file system (try df -h on the source first);
  • Attach both volumes to instance while taking note of the block devices names you assign for them in the AWS console;

For example: In my case /dev/sdc1 is the source volume (created from the snapshot) and /dev/sdd1 is the empty target.

  • Start instance;
  • Optionally check the source file system with e2fsck -f /dev/sdc1;
  • Create a file system for the empty volume - mkfs.ext4 /dev/sdd1;
  • Mount volumes at /source and /target respectively;
  • Now sync the files: rsync -aHAXxSP /source/ /target. Note the trailing slash (/) after /source/ and none after /target. If you sync /source without the trailing slash you will end up with files inside /target/source/ which you don't want;
  • Quickly verify the new directory structure with ls -l /target;
  • Unmount /target;
  • Optionally check the new file system for consistency e2fsck -f /dev/sdd1;
  • IMPORTANT - check how /boot/grub/grub.conf specifies the root volume - by UUID, by LABEL, by device name, etc. You will have to duplicate the same for the new smaller volume or update /target/boot/grub/grub.conf to match the new volume. Check /target/etc/fstab as well!

In my case I had to e2label /dev/sdd1 / because both grub.conf and fstab were using the device label.

  • Shutdown the instance;
  • Detach all volumes;
  • IMPORTANT - attach the new smaller volume to the instance using the same block device name from the first step (e.g. /dev/sda1);
  • Start the instance and verify it is working correctly;
  • DELETE auxiliary volumes and snapshots so they don't take space and accumulate costs!

There are comments.

Idempotent Django Email Sender with Amazon SQS and Memcache

Recently I wrote about my problem with duplicate Amazon SQS messages causing multiple emails for Difio. After considering several options and feedback from @Answers4AWS I wrote a small decorator to fix this.

It uses the cache backend to prevent the task from executing twice during the specified time frame. The code is available at https://djangosnippets.org/snippets/3010/.
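The snippet itself is linked above; the general idea looks roughly like the sketch below. This is a minimal illustration, not the published snippet - the decorator name and cache alias are made up:

# a minimal illustration of the idea, not the published snippet;
# the decorator name and cache alias are made up
from functools import wraps
from django.core import cache as cache_module

def run_only_once(timeout=3600):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            exec_cache = cache_module.get_cache('default')
            key = 'executed-%s-%s' % (func.__name__, repr(args))
            if exec_cache.get(key) is not None:
                return  # already executed within the time frame
            exec_cache.set(key, 1, timeout)
            return func(*args, **kwargs)
        return wrapper
    return decorator

The race condition mentioned below sits between the get() and the set() calls.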

As stated on Twitter you should use Memcache (or ElastiCache) for this. If using Amazon S3 with my django-s3-cache don't use the us-east-1 region because it is eventually consistent.

The solution is fast and simple on the development side and uses my existing cache infrastructure so it doesn't cost anything more!

There is still a race condition between marking the message as processed and the second check but nevertheless this should minimize the possibility of receiving duplicate emails to an accepted level. Only time will tell though!

There are comments.

Duplicate Amazon SQS Messages Cause Multiple Emails

Beware if using Amazon Simple Queue Service to send email messages! Sometimes SQS messages are duplicated, which results in multiple copies of the messages being sent. This happened today at Difio and is really annoying to users. In this post I will explain why there is no easy way of fixing it.

Q: Can a deleted message be received again?

Yes, under rare circumstances you might receive a previously deleted message again. This can occur in the rare situation in which a DeleteMessage operation doesn't delete all copies of a message because one of the servers in the distributed Amazon SQS system isn't available at the time of the deletion. That message copy can then be delivered again. You should design your application so that no errors or inconsistencies occur if you receive a deleted message again.

Amazon FAQ

In my case the cron scheduler logs say:

>>> <AsyncResult: a9e5a73a-4d4a-4995-a91c-90295e27100a>

While on the worker nodes the logs say:

[2013-12-06 10:13:06,229: INFO/MainProcess] Got task from broker: tasks.cron_monthly_email_reminder[a9e5a73a-4d4a-4995-a91c-90295e27100a]
[2013-12-06 10:18:09,456: INFO/MainProcess] Got task from broker: tasks.cron_monthly_email_reminder[a9e5a73a-4d4a-4995-a91c-90295e27100a]

This clearly shows the same message (see the UUID) has been processed twice! This resulted in hundreds of duplicate emails :(.

Why This Is Hard To Fix

There are two basic approaches to solve this issue:

  • Check some log files or database for previous record of the message having been processed;
  • Use idempotent operations so that if you process the message again you get the same results, and those results don't create duplicate files/records.

The problem with checking for duplicate messages is:

  • There is a race condition between marking the message as processed and the second check;
  • You need to use some sort of locking mechanism to safe-guard against the race condition;
  • If the log/DB is eventually consistent you can't guarantee that the previous attempt will show up, and so you can't guarantee that you won't process the message twice.

None of the above works well for distributed applications, not to mention that Difio processes millions of messages per month per node and the logs are quite big.

The second option is to have control of the Message-Id or some other email header so that the second message will be discarded either at the server (Amazon SES in my case) or at the receiving MUA. I like this better but I don't think it is technically possible with the current environment. Need to check though.

I've asked AWS support to look into this thread and hopefully they will have some more hints. If you have any other ideas please post in the comments! Thanks!

There are comments.

Bug in Python URLGrabber/cURL on Fedora and Amazon Linux

I have accidentally discovered a bug in Python's URLGrabber module which has to do with a change of behavior in libcurl.

>>> from urlgrabber.grabber import URLGrabber
>>> g = URLGrabber(reget=None)
>>> g.urlgrab('https://s3.amazonaws.com/production.s3.rubygems.org/gems/columnize-0.3.6.gem', '/tmp/columnize.gem')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 976, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 880, in _retry
    r = apply(func, (opts,) + args, {})
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 962, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1056, in __init__
    self._do_open()
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1307, in _do_open
    self._set_opts()
  File "/home/celeryd/.virtualenvs/difio/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1161, in _set_opts
    self.curl_obj.setopt(pycurl.SSL_VERIFYHOST, opts.ssl_verify_host)
error: (43, '')
>>>

The code above works fine with curl-7.27 or older while it breaks with curl-7.29 and newer. As explained by Zdenek Pavlas the reason is an internal change in libcurl which no longer accepts a value of 1 for SSL_VERIFYHOST!

The bug is reproducible with a newer libcurl version and a vanilla urlgrabber==3.9.1 from PyPI (e.g. inside a virtualenv). The latest python-urlgrabber RPM packages in both Fedora and Amazon Linux already have the fix.

I have tested the patch proposed by Zdenek and it works for me. I still have no idea why there aren't any updates released on PyPI though!
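If you are stuck with the vanilla PyPI version, one possible workaround - an assumption on my part, not the official patch - is to pass an ssl_verify_host value which newer libcurl still accepts:

# a possible workaround for vanilla urlgrabber==3.9.1 - an assumption on my part,
# not the official patch; it simply avoids passing the rejected value of 1
from urlgrabber.grabber import URLGrabber

g = URLGrabber(reget=None, ssl_verify_host=2)
g.urlgrab('https://s3.amazonaws.com/production.s3.rubygems.org/gems/columnize-0.3.6.gem',
          '/tmp/columnize.gem')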

There are comments.

This Week: Python Testing, Chris DiBona on Open Source and OpenShift ENV Variables

Here is a random collection of links I came across this week which appear interesting to me but which I don't have time to blog about in detail.

Making a Multi-branch Test Server for Python Apps

If you are wondering how to test different feature branches of your Python application but don't have the resources to create separate test servers this is for you: http://depressedoptimism.com/blog/2013/10/8/making-a-multi-branch-test-server!

Kudos to the python-django-bulgaria Google group for finding this link!

OpenSource.com Interview with Chris DiBona

Just read it at http://opensource.com/business/13/10/interview-chris-dibona.

I particularly like the part where he called open source "brutal".

You once called open source “brutal”. What did you mean by that?

...

I think that it is because open source projects are able to only work with the productive people and ignore everyone else. That behavior can come across as very harsh or exclusionary, and that's because it is that: brutally harsh and exclusionary of anyone who isn't contributing.

...

So, I guess what I'm saying is that survival of the fittest as practiced in the open source world is a pretty brutal mechanism, but it works very very well for producing quality software. Boy is it hard on newcomers though...

OpenShift Finally Introduces Environment Variables

Yes! Finally!

    rhc set-env VARIABLE1=VALUE1 -a myapp

No need for my workaround anymore! I will give the new feature a go very soon.

Read more about it at the OpenShift blog: https://www.openshift.com/blogs/taking-advantage-of-environment-variables-in-openshift-php-apps.

Have you found anything interesting this week? Please share in the comments below! Thanks!

There are comments.

Tip: Setting Secure ENV variables on Red Hat OpenShift

OpenShift is still missing the client side tools to set environment variables without exposing the values in source code but there is a way to do it. Here is how.

First ssh into your application and navigate to the $OPENSHIFT_DATA_DIR. Create a file to define your environment.

$ rhc ssh -a difio
Password: ***

[difio-otb.rhcloud.com 51d32a854382ecf7a9000116]\> cd $OPENSHIFT_DATA_DIR
[difio-otb.rhcloud.com data]\> vi myenv.sh
[difio-otb.rhcloud.com data]\> cat myenv.sh
#!/bin/bash
export MYENV="hello"

[difio-otb.rhcloud.com data]\> chmod a+x myenv.sh 
[difio-otb.rhcloud.com data]\> exit
Connection to difio-otb.rhcloud.com closed.

Now modify your code and git push to OpenShift. Then ssh into the app once again to verify that your configuration is still in place.

[atodorov@redbull difio]$ rhc ssh -a difio
Password: ***

[difio-otb.rhcloud.com 51d32a854382ecf7a9000116]\> cd $OPENSHIFT_DATA_DIR
[difio-otb.rhcloud.com data]\> ls -l
total 4
-rwxr-xr-x. 1 51d32a854382ecf7a9000116 51d32a854382ecf7a9000116 34  8 jul 14,33 myenv.sh
[difio-otb.rhcloud.com data]\>

Use the defined variables as you wish.
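Inside a Python application the values are then read from the environment, assuming myenv.sh has been sourced before the process starts (for example from a start action hook); a trivial sketch:

# a trivial sketch - assumes myenv.sh has been sourced before Python starts,
# for example from a start action hook
import os

MYENV = os.environ.get('MYENV', '')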

There are comments.

Performance test: Amazon ElastiCache vs Amazon S3

Which Django cache backend is faster? Amazon ElastiCache or Amazon S3?

Previously I've mentioned using Django's cache to keep state between HTTP requests. In the demo described there I was using django-s3-cache. It is time to move to production, so I decided to measure the performance difference between the two cache options available at Amazon Web Services.

Update 2013-07-01: my initial test may have been false since I had not configured ElastiCache access properly. I saw no errors but discovered the issue today on another system which was failing to store the cache keys but didn't show any errors either. I've re-run the tests and updated times are shown below.

Test infrastructure

  • One Amazon S3 bucket, located in US Standard (aka US East) region;
  • One Amazon ElastiCache cluster with one Small Cache Node (cache.m1.small) with Moderate I/O capacity;
  • One Amazon ElastiCache cluster with one Large Cache Node (cache.m1.large) with High I/O Capacity;
  • Update: I've tested both python-memcached and pylibmc client libraries for Django;
  • Update: Test is executed from an EC2 node in the us-east-1a availability zone;
  • Update: Cache clusters are in the us-east-1a availability zone.

Test Scenario

The test platform is Django. I've created a skeleton project with only CACHES settings defined and the necessary dependencies installed. A file called test.py holds the test cases, which use the standard timeit module. The object which is stored in cache is very small - it holds phone/address identifiers and a couple of user-made selections. The code looks like this:

import timeit

s3_set = timeit.Timer(
"""
for i in range(1000):
    my_cache.set(i, MyObject)
"""
,
"""
from django.core import cache

my_cache = cache.get_cache('default')

MyObject = {
    'from' : '359123456789',
    'address' : '6afce9f7-acff-49c5-9fbe-14e238f73190',
    'hour' : '12:30',
    'weight' : 5,
    'type' : 1,
}
"""
)

s3_get = timeit.Timer(
"""
for i in range(1000):
    MyObject = my_cache.get(i)
"""
,
"""
from django.core import cache

my_cache = cache.get_cache('default')
"""
)
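For the ElastiCache runs the cache settings are roughly the standard Django memcached backends pointed at the cluster endpoints. A sketch with placeholder cache names and endpoints (the django-s3-cache backend was configured as 'default' following that package's own README):

# settings.py - a sketch only; the cache names and endpoints are placeholders,
# not the real cluster addresses. The 'default' cache used django-s3-cache,
# configured as described in that package's README.
CACHES = {
    'pymc': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': 'small-node.example.cache.amazonaws.com:11211',
    },
    'libmc': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': 'small-node.example.cache.amazonaws.com:11211',
    },
    'libmc_large': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': 'large-node.example.cache.amazonaws.com:11211',
    },
}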

Tests were executed from the Django shell on an EC2 instance in the us-east-1a availability zone (initially from my laptop, see the update above). ElastiCache nodes were freshly created/rebooted before test execution. The S3 bucket had no objects.

$ ./manage.py shell
Python 2.6.8 (unknown, Mar 14 2013, 09:31:22) 
[GCC 4.6.2 20111027 (Red Hat 4.6.2-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from test import *
>>> 
>>> 
>>> 
>>> s3_set.repeat(repeat=3, number=1)
[68.089607000350952, 70.806712865829468, 72.49261999130249]
>>> 
>>> 
>>> s3_get.repeat(repeat=3, number=1)
[43.778793096542358, 43.054368019104004, 36.19232702255249]
>>> 
>>> 
>>> pymc_set.repeat(repeat=3, number=1)
[0.40637087821960449, 0.3568730354309082, 0.35815882682800293]
>>> 
>>> 
>>> pymc_get.repeat(repeat=3, number=1)
[0.35759496688842773, 0.35180497169494629, 0.39198613166809082]
>>> 
>>> 
>>> libmc_set.repeat(repeat=3, number=1)
[0.3902890682220459, 0.30157709121704102, 0.30596804618835449]
>>> 
>>> 
>>> libmc_get.repeat(repeat=3, number=1)
[0.28874802589416504, 0.30520200729370117, 0.29050207138061523]
>>> 
>>> 
>>> libmc_large_set.repeat(repeat=3, number=1)
[1.0291709899902344, 0.31709098815917969, 0.32010698318481445]
>>> 
>>> 
>>> libmc_large_get.repeat(repeat=3, number=1)
[0.2957158088684082, 0.29067802429199219, 0.29692888259887695]
>>>

Results

As expected ElastiCache is much faster (10x) compared to S3. However the difference between the two ElastiCache node types is subtle. I will stay with the smallest possible node to minimize costs. Also as seen, pylibmc is a bit faster compared to the pure Python implementation.

Depending on your objects size or how many set/get operations you perform per second you may need to go with the larger nodes. Just test it!

It surprised me how slow django-s3-cache is. The false test showed django-s3-cache to be 100x slower but new results are better. 10x decrease in performance sounds about right for a filesystem backed cache.

A quick look at the code of the two backends shows some differences. The one I immediately see is that for every cache key django-s3-cache creates an sha1 hash which is used as the storage file name. This was modeled after the filesystem backend but I think the design is wrong - the memcached backends don't do this.

Another one is that django-s3-cache time-stamps all objects and uses pickle to serialize them. I wonder if it can't just write them as binary blobs directly. There's definitely lots of room for improvement of django-s3-cache. I will let you know my findings once I get to it.

There are comments.

Twilio is Located in Amazon Web Services US East

Where do I store my audio files in order to minimize download and call wait time?

Twilio is a cloud vendor that provides telephony services. It can download and <Play> arbitrary audio files and will cache the files for better performance.

Twilio support told me they are not disclosing the location of their servers, so I did some digging on my own. These are the addresses from which Twilio fetched content, taken from the access logs of my web application hosted in AWS US East:

[ivr-otb.rhcloud.com logs]\> grep TwilioProxy access_log-* | cut -f 1 -d '-' | sort | uniq 
10.125.90.172 
10.214.183.239 
10.215.187.220 
10.245.155.18 
10.255.119.159 
10.31.197.102

Now let's map these addresses to host names. From another EC2 system, also in Amazon US East:

[ec2-user@ip-10-29-206-86 ~]$ dig -x 10.125.90.172 -x 10.214.183.239 -x 10.215.187.220 -x 10.245.155.18 -x 10.255.119.159 -x 10.31.197.102

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.17.rc1.29.amzn1 <<>> -x 10.125.90.172 -x 10.214.183.239 -x 10.215.187.220 -x 10.245.155.18 -x 10.255.119.159 -x 10.31.197.102
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43245
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;172.90.125.10.in-addr.arpa.    IN      PTR

;; ANSWER SECTION:
172.90.125.10.in-addr.arpa. 113 IN      PTR     ip-10-125-90-172.ec2.internal.

;; Query time: 1 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 87

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52693
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;239.183.214.10.in-addr.arpa.   IN      PTR

;; ANSWER SECTION:
239.183.214.10.in-addr.arpa. 42619 IN   PTR     domU-12-31-39-0B-B0-01.compute-1.internal.

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 100

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25255
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;220.187.215.10.in-addr.arpa.   IN      PTR

;; ANSWER SECTION:
220.187.215.10.in-addr.arpa. 43140 IN   PTR     domU-12-31-39-0C-B8-2E.compute-1.internal.

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 100

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15099
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;18.155.245.10.in-addr.arpa.    IN      PTR

;; ANSWER SECTION:
18.155.245.10.in-addr.arpa. 840 IN      PTR     ip-10-245-155-18.ec2.internal.

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 87

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28878
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;159.119.255.10.in-addr.arpa.   IN      PTR

;; ANSWER SECTION:
159.119.255.10.in-addr.arpa. 43140 IN   PTR     domU-12-31-39-01-70-51.compute-1.internal.

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 100

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28727
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;102.197.31.10.in-addr.arpa.    IN      PTR

;; ANSWER SECTION:
102.197.31.10.in-addr.arpa. 840 IN      PTR     ip-10-31-197-102.ec2.internal.

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Mon Jun 24 20:48:21 2013
;; MSG SIZE  rcvd: 87

In short:

ip-10-125-90-172.ec2.internal.
ip-10-245-155-18.ec2.internal.
ip-10-31-197-102.ec2.internal.
domU-12-31-39-01-70-51.compute-1.internal.
domU-12-31-39-0B-B0-01.compute-1.internal.
domU-12-31-39-0C-B8-2E.compute-1.internal.

The ip-*.ec2.internal hosts are clearly in US East. The domU-*.compute-1.internal ones also look like US East, although I'm not 100% sure what the difference between the two is. The latter look like HVM guests while the former are para-virtualized.

For comparison here are some internal addresses from my own EC2 systems:

  • ip-10-228-237-207.eu-west-1.compute.internal - EU Ireland
  • ip-10-248-19-46.us-west-2.compute.internal - US West Oregon
  • ip-10-160-58-141.us-west-1.compute.internal - US West N. California

After relocating my audio files to an S3 bucket in US East the average call length dropped from 2:30 min to 2:00 min for the same IVR choices. This also minimizes the costs since Twilio charges per minute of incoming/outgoing calls. I think the audio quality has improved as well.

There are comments.

Tip: Caching Large Objects for Celery and Amazon SQS

Some time ago a guy called Matt asked about passing large objects through their messaging queue. They were switching from RabbitMQ to Amazon SQS which has a limit of 64K total message size.

Recently I've made some changes in Difio which require passing larger objects as parameters to a Celery task. Since Difio is also using SQS I faced the same problem. Here is the solution using a cache back-end:

from celery.task import task
from django.core import cache as cache_module

def some_method():
    ... skip ...

    task_cache = cache_module.get_cache('taskq')
    task_cache.set(uuid, data, 3600)

    handle_data.delay(uuid)

    ... skip ...

@task
def handle_data(uuid):
    task_cache = cache_module.get_cache('taskq')
    data = task_cache.get(uuid)

    if data is None:
        return

    ... do stuff ...

Objects are persisted in a secondary cache back-end, not the default one, to avoid accidental destruction. The uuid parameter is a string.
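For illustration only, the key could be generated with the standard uuid module - this is an assumption on my part, not necessarily how Difio builds it:

# illustration only - an assumption on my part, not necessarily how Difio builds the key
import uuid
from django.core import cache as cache_module
# handle_data is the Celery task defined above

def queue_data(data):
    task_cache = cache_module.get_cache('taskq')
    key = str(uuid.uuid4())           # any unique string will do
    task_cache.set(key, data, 3600)   # the large payload stays in the cache
    handle_data.delay(key)            # only the small key travels through SQS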

Although the objects passed are smaller than 64K I haven't seen any issues with this solution so far. Let me know if you are using something similar in your code and how it works for you.

There are comments.

Django Tips: Using Cache for Stateful HTTP

How do you keep state when working with a stateless protocol like HTTP? One possible answer is to use a cache back-end.

I'm working on an IVR application demo with Django and Twilio. The caller will make multiple choices using the phone keyboard. All of this needs to be put together and sent back to another server for processing. In my views I've added simple cache get/set lines to preserve the selection.

Here's how it looks with the actual application logic omitted:

def incoming_call(request):
    call_sid = request.GET.get('CallSid', '')
    caller_id = request.GET.get('From', '')

    state = {'from' : caller_id}
    cache.set(call_sid, state)

    return render(request, 'step2.xml')

def step2(request):
    call_sid = request.GET.get('CallSid', '')
    selection = int(request.GET.get('Digits', 0))

    state = cache.get(call_sid, {})
    state['step2_selection'] = selection
    cache.set(call_sid, state)

    return render(request, 'final_step.xml')


def final_step(request):
    call_sid = request.GET.get('CallSid', '')
    selection = int(request.GET.get('Digits', 1))

    state = cache.get(call_sid, {})
    state['final_step_selection'] = selection

    Backend.process_user_selections(state)

    return render(request, 'thank_you.xml')

At each step Django will update the current state associated with this call and return a TwiML XML response. CallSid is a handy unique identifier provided by Twilio.

I am using the latest django-s3-cache version which properly works with directories. When going to production I will likely switch to Amazon ElastiCache.

There are comments.

How to Deploy Python Hotfix on RedHat OpenShift Cloud

In this article I will show you how to deploy hotfix versions for Python packages on the RedHat OpenShift PaaS cloud.

Background

You are already running a Python application on your OpenShift instance. You are using some 3rd party dependencies when you find a bug in one of them. You go forward, fix the bug and submit a pull request. You don't want to wait for upstream to release a new version but rather build a hotfix package yourself and deploy to production immediately.

Solution

There are two basic approaches to solving this problem:

  1. Include the hotfix package source code in your application, i.e. add it to your git tree or;
  2. Build the hotfix separately and deploy as a dependency. Don't include it in your git tree, just add a requirement on the hotfix version.

I will talk about the latter. The tricky part here is to instruct the cloud environment to use your package (including the proper location) and not upstream or their local mirror.

Python applications hosted at OpenShift don't support requirements.txt which can point to various package sources and even install packages directly from GitHub. They support setup.py which fetches packages from http://pypi.python.org but it is flexible enough to support other locations.

Building the hotfix

First of all we'd like to build a hotfix package. This will be the upstream version that we are currently using plus the patch for our critical issue:

$ wget https://pypi.python.org/packages/source/p/python-magic/python-magic-0.4.3.tar.gz
$ tar -xzvf python-magic-0.4.3.tar.gz 
$ cd python-magic-0.4.3
$ curl https://github.com/ahupp/python-magic/pull/31.patch | patch

Verify the patch has been applied correctly and then modify setup.py to increase the version string. In this case I will set it to version='0.4.3.1'.

Then build the new package using python setup.py sdist and upload it to a web server.

Deploying to OpenShift

Modify setup.py and specify the hotfix version. Because this version is not on PyPI and will not be on OpenShift's local mirror you need to provide the location where it can be found. This is done with the dependency_links parameter to setup(). Here's how it looks:

diff --git a/setup.py b/setup.py
index c6e837c..2daa2a9 100644
--- a/setup.py
+++ b/setup.py
@@ -6,5 +6,6 @@ setup(name='YourAppName',
       author='Your Name',
       author_email='example@example.com',
       url='http://www.python.org/sigs/distutils-sig/',
-      install_requires=['python-magic==0.4.3'],
+      dependency_links=['https://s3.amazonaws.com/atodorov/blog/python-magic-0.4.3.1.tar.gz'],
+      install_requires=['python-magic==0.4.3.1'],
      )

Now just git push to OpenShift and observe the console output:

remote: Processing dependencies for YourAppName==1.0
remote: Searching for python-magic==0.4.3.1
remote: Best match: python-magic 0.4.3.1
remote: Downloading https://s3.amazonaws.com/atodorov/blog/python-magic-0.4.3.1.tar.gz
remote: Processing python-magic-0.4.3.1.tar.gz
remote: Running python-magic-0.4.3.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-ZRVMBg/python-magic-0.4.3.1/egg-dist-tmp-R_Nxie
remote: zip_safe flag not set; analyzing archive contents...
remote: Removing python-magic 0.4.3 from easy-install.pth file
remote: Adding python-magic 0.4.3.1 to easy-install.pth file

Congratulations! Your hotfix package has just been deployed.

This approach should work for other cloud providers and other programming languages as well. Let me know if you have any experience with that.

There are comments.

Email Logging for Django on RedHat OpenShift with Amazon SES

Sending email in the cloud can be tricky. IPs of cloud providers are often blacklisted because of frequent abuse. For that reason I use Amazon SES as my email backend. Here is how to configure Django to send emails to site admins when something goes wrong.

settings.py
# Valid addresses only.
ADMINS = (
    ('Alexander Todorov', 'atodorov@example.com'),
)

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'mail_admins': {
            'level': 'ERROR',
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django.request': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
    }
}
 
# Used as the From: address when reporting errors to admins
# Needs to be verified in Amazon SES as a valid sender
SERVER_EMAIL = 'django@example.com'

# Amazon Simple Email Service settings
AWS_SES_ACCESS_KEY_ID = 'xxxxxxxxxxxx'
AWS_SES_SECRET_ACCESS_KEY = 'xxxxxxxx'
EMAIL_BACKEND = 'django_ses.SESBackend'

You also need the django-ses dependency.

See http://docs.djangoproject.com/en/dev/topics/logging for more details on how to customize your logging configuration.
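A quick way to verify the whole chain (logging configuration, SES credentials, verified sender address) is to log an error by hand from the Django shell; a small sketch:

# a quick sanity check from the Django shell - assumes the LOGGING and SES
# settings above are in place and the sender address is verified in SES
import logging

logger = logging.getLogger('django.request')
logger.error('Test error notification for ADMINS')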

I am using this configuration successfully at RedHat's OpenShift PaaS environment. Other users have reported it works for them too. Should work with any other PaaS provider.

There are comments.

Performance Test: Amazon EBS vs. Instance Storage, Pt.1

I'm exploring the possibility of speeding up my cloud database so I've run some basic tests against the storage options available to Amazon EC2 instances. The instance was m1.large with High I/O performance and two additional disks of the same size:

  • /dev/xvdb - type EBS
  • /dev/xvdc - type instance storage

Both are Xen para-virtual disks. The difference is that EBS is persistent across reboots while instance storage is ephemeral.

hdparm

For a quick test I used hdparm. The manual says:

-T  Perform timings of cache reads for benchmark and comparison purposes.
    This displays the speed of reading directly from the Linux buffer cache
    without disk access. This measurement is essentially an indication of
    the throughput of the processor, cache, and memory of the system under test.

-t  Perform timings of device reads for benchmark and comparison purposes.
    This displays the speed of reading through the buffer cache to the disk
    without any prior caching of data. This measurement is an indication of how
    fast the drive can sustain sequential data reads under Linux, without any
    filesystem overhead.

The results of 3 runs of hdparm are shown below:

# hdparm -tT /dev/xvdb /dev/xvdc

/dev/xvdb:
 Timing cached reads:   11984 MB in  1.98 seconds = 6038.36 MB/sec
 Timing buffered disk reads:  158 MB in  3.01 seconds =  52.52 MB/sec

/dev/xvdc:
 Timing cached reads:   11988 MB in  1.98 seconds = 6040.01 MB/sec
 Timing buffered disk reads:  1810 MB in  3.00 seconds = 603.12 MB/sec


# hdparm -tT /dev/xvdb /dev/xvdc

/dev/xvdb:
 Timing cached reads:   11892 MB in  1.98 seconds = 5991.51 MB/sec
 Timing buffered disk reads:  172 MB in  3.00 seconds =  57.33 MB/sec

/dev/xvdc:
 Timing cached reads:   12056 MB in  1.98 seconds = 6075.29 MB/sec
 Timing buffered disk reads:  1972 MB in  3.00 seconds = 657.11 MB/sec


# hdparm -tT /dev/xvdb /dev/xvdc

/dev/xvdb:
 Timing cached reads:   11994 MB in  1.98 seconds = 6042.39 MB/sec
 Timing buffered disk reads:  254 MB in  3.02 seconds =  84.14 MB/sec

/dev/xvdc:
 Timing cached reads:   11890 MB in  1.99 seconds = 5989.70 MB/sec
 Timing buffered disk reads:  1962 MB in  3.00 seconds = 653.65 MB/sec

Result: Sequential reads from instance storage are 10x faster compared to EBS on average.

IOzone

I'm running MySQL and sequential data reads are probably an over-idealistic scenario. So I found another benchmark suite, called IOzone. I used the 3-414 version built from the official SRPM.

IOzone performs multiple tests. I'm interested in read/re-read, random-read/write, read-backwards and stride-read.

For this round of testing I used an ext4 filesystem, with and without a journal, on both types of disks. I also experimented with running IOzone inside a ramfs mounted directory. However, I didn't have the time to run the test suite multiple times.

Then I used iozone-results-comparator to visualize the results. (I had to do a minor fix to the code to run inside virtualenv and install all missing dependencies).

Raw IOzone output, data visualization and the modified tools are available in the aws_disk_benchmark_w_iozone.tar.bz2 file (size 51M).

Graphics

EBS without journal (Baseline) vs. Instance Storage without journal (Set1)

Instance Storage without journal (Baseline) vs. Ramfs (Set1)

Results

  • The ext4 journal has no effect on reads and causes a slow-down when writing to disk. This is expected;
  • Instance storage is faster compared to EBS but not much. If I understand the results correctly, read performance is similar in some cases;
  • Ramfs is definitely the fastest but read performance compared to instance storage is not two-fold (or more) as I expected;

Conclusion

Instance storage appears to be faster (and this is expected) but I'm still not sure whether my application will gain any speed improvement, or how much, if migrated to read from instance storage (or ramfs) instead of EBS. I will be performing a more real-world test next time, comparing execution times for some of my largest SQL queries.

If you have other ideas how to adequately measure I/O performance in the AWS cloud, please use the comments below.

There are comments.

Click Tracking without MailChimp

Here is a standard notification message that users at Difio receive. It is plain text, no HTML crap, short and URLs are clean and descriptive. As the project lead developer I wanted to track when people click on these links and visit the website but also keep existing functionality.

"Email with links"

Standard approach

A pretty common approach when sending huge volumes of email is to use an external service, such as MailChimp. This is one of many email marketing services which come with a lot of features. The most important to me were analytics and reports.

The downside is that MailChimp (and I guess others) use HTML formatted emails extensively. I don't like that and I'm sure my users will not like it as well. They are all developers. Not to mention that MailChimp is much more expensive than Amazon SES which I use currently. No MailChimp for me!

Another common approach, used by Feedburner by the way, is to use shortened URLs which redirect to the original ones and measure clicks in between. I also didn't like this for two reasons: 1) the shortened URLs look ugly and are not at all descriptive and 2) I need to generate them automatically and maintain all the mappings. Why bother?

How I did it?

So I needed something which would redirect to a predefined URL, measure how many redirects there were (essentially clicks on the link) and look nice. The solution is very simple, if you have not recognized it by now from the picture above.

I opted for a custom redirect engine, which will add tracking information to the destination URL so I can track it in Google Analytics.

Previous URLs were of the form http://www.dif.io/updates/haml-3.1.2/haml-3.2.0.rc.3/11765/. I've added the humble /daily/? prefix before the URL path so it becomes http://www.dif.io/daily/?/updates/haml-3.1.2/haml-3.2.0.rc.3/11765/

Now /updates/haml-3.1.2/haml-3.2.0.rc.3/11765/ becomes a query string parameter which the /daily/index.html page uses as its destination. Before doing the redirect a script adds tracking parameters so that Google Analytics will properly report this visit. Here is the code:

<html>
<head>
<script type="text/javascript">
var uri = window.location.toString();
var question = uri.indexOf("?");
var param = uri.substring(question + 1, uri.length)
if (question > 0) {
    window.location.href = param + '?utm_source=email&utm_medium=email&utm_campaign=Daily_Notification';
}
</script>
</head>
<body></body>
</html>

Previously Google Analytics was reporting these visits as direct hits while now it lists them under campaigns like so:

"Difio Analytics"

Because all visitors of Difio use JavaScript enabled browsers I combined this approach with another one, to remove query string with JavaScript and present clean URLs to the visitor.

Why JavaScript?

You may be asking why the hell I am using JavaScript and not Apache's wonderful mod_rewrite module. This is because the destination URLs are hosted in Amazon S3 and I'm planning to integrate with Amazon CloudFront. Neither of them supports .htaccess rules nor anything else similar to mod_rewrite.

As always I'd love to hear your thoughts and feedback. Please use the comment form below.

There are comments.

Cross-domain AJAX Served From CDN

This week Amazon announced support for dynamic content in their CDN solution, Amazon CloudFront. The announcement coincided with my efforts to migrate more pieces of Difio's website to CloudFront.

In this article I will not talk about hosting static files on CDN. This is easy and I've already written about it here. I will show how to cache AJAX (JSONP actually) responses and serve them directly from Amazon CloudFront.

Background

For those of you who may not be familiar (are there any?), CDN stands for Content Delivery Network. In short this employs numerous servers with identical content. The requests from the browser are served from the location which gives the best performance for the user. This is used by all major websites to speed up static content like images, video, CSS and JavaScript files.

AJAX means Asynchronous JavaScript and XML. This is what Google uses to create dynamic user interfaces which don't require reloading the page.

Architecture

Difio has two web interfaces. The primary one is a static HTML website which employs JavaScript for the dynamic areas. It is hosted on the dif.io domain. The other one is powered by Django and provides the same interface plus the applications dashboard and several API functions which don't have a visible user interface. This is under the *.rhcloud.com domain b/c it is hosted on OpenShift.

The present state of the website is the result of rapid development using conventional methods - HTML templates and server-side processing. This is migrating to modern web technology like static HTML and JavaScript while the server side will remain pure API service.

For this migration to happen I need the HTML pages at dif.io to execute JavaScript and load information which comes from the rhcloud.com domain. Unfortunately this is not easily doable with AJAX because of the Same origin policy in browsers.

I'm using the Dojo Toolkit JavaScript framework which has a solution. It's called JSONP. Here's how it works:

     dif.io ------ JSONP request --> abc.rhcloud.com --v
        ^                                              |
        |                                              |
    JavaScript processing                              |
        |                                              |
        +------------------ JSONP response ------------+

This is pretty standard configuration for a web service.

Going to the clouds

The way Dojo implements JSONP is through the dojo.io.script module. It works by appending a query string parameter of the form ?callback=funcName which the server uses to generate the JSONP response. This callback name is dynamically generated by Dojo based on the order in which your call to dojo.io.script is executed.

Until recently Amazon CloudFront ignored all query string parameters when requesting the content from the origin server. Without the query string it was not possible to generate the JSONP response. Luckily Amazon resolved the issue only one day after I asked about it on their forums.

Now Amazon CloudFront will use the URL path and the query string parameters to identify the objects in cache. To enable this edit the CloudFront distribution behavior(s) and set Forward Query Strings to Yes.

When a visitor of the website requests the data Amazon CloudFront will use exactly the same url path and query strings to fetch the content from the origin server. All that I had to do is switch the domain of the JSONP service to point to the cloudfront.net domain. It became like this:

                                                        | Everything on this side is handled by Amazon.
                                                        | No code required!
                                                        |
     dif.io ------ JSONP request --> xyz.cloudfront.net -- JSONP request if cache miss --> abc.rhcloud.com --v
        ^                              |                ^                                                    |
        |                              |                |                                                    |
    JavaScript processing              |                +---------- JSONP response --------------------------+
        |                              |
        +---- cached JSONP response ---+

As you can see the website structure and code didn't change at all. All that changed was a single domain name.

Controlling the cache

Amazon CloudFront will keep the contents in cache based on the origin headers if present or the manual configuration from the AWS Console. To work around frequent requests to the origin server it is considered best practice to set the Expires header to a value far in the future, like 1 year. However if the content changes you need some way to tell CloudFront about it. The most commonly used method is through using different URLs to access the same content. This will cause CloudFront to cache the content under the new location while keeping the old content until it expires.

Dojo makes this very easy:

require(["dojo/io/script"],
    function(script) {
        script.get({
            url: "https://xyz.cloudfront.net/api/json/updates/1234",
            callbackParamName: "callback",
            content: {t: timeStamp},
            load: function(jsonData) {
                ....
            },
        });
    }
);

The content property allows additional key/value pairs to be sent in the query string. The timeStamp parameter serves only to control Amazon CloudFront cache. It's not processed server side.

On the server-side we have:

response['Cache-Control'] = 'max-age=31536000'
response['Expires'] = (datetime.now()+timedelta(seconds=31536000)).strftime('%a, %d %b %Y %H:%M:%S GMT')
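For completeness, the view behind this could look roughly like the sketch below; the function name and payload are illustrative, not Difio's actual code:

# a rough sketch of a JSONP view with far-future caching headers;
# the function name and payload are illustrative, not Difio's actual code
import json
from datetime import datetime, timedelta
from django.http import HttpResponse

def updates_json(request, object_id):
    callback = request.GET.get('callback', 'callback')
    data = {'id': object_id, 'updates': []}  # placeholder payload
    payload = '%s(%s)' % (callback, json.dumps(data))
    response = HttpResponse(payload, content_type='application/javascript')
    response['Cache-Control'] = 'max-age=31536000'
    response['Expires'] = (datetime.now() + timedelta(seconds=31536000)).strftime('%a, %d %b %Y %H:%M:%S GMT')
    return response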

Benefits

There were two immediate benefits:

  • Reduced page load time. Combined with serving static files from CDN this greatly improves the user experience;
  • Reduced server load. Content is requested only once if it is missing from the cache and then served from CloudFront. The server isn't so busy serving content so it can be used to do more computations or simply reduce the bill.

The presented method works well for Difio because of two things:

  • The content which Difio serves usually doesn't change at all once made public. In rare occasions, for example an error has been published, we have to regenerate new content and publish it under the same URL.
  • Before content is made public it is inspected for errors and this also pre-seeds the cache.

There are comments.

Using OpenShift as Amazon CloudFront Origin Server

It's been several months since the start of Difio and I have started migrating various parts of the platform to a CDN. The first to go are static files like CSS, JavaScript, images and such. In this article I will show you how to get started with Amazon CloudFront and OpenShift. It is very easy once you understand how it works.

Why CloudFront and OpenShift

Amazon CloudFront is cheap and easy to set up, with virtually no maintenance. The most important feature is that it can fetch content from any public website. Integrating it with OpenShift gives some nice benefits:

  • All static assets are managed with Git and stored in the same place where the application code and HTML is - easy to develop and deploy;
  • No need for external service to host the static files;
  • CloudFront will be serving the files so network load on OpenShift is minimal;
  • Easy to manage versioned URLs because HTML and static assets are in the same repo - more on this later;

Object expiration

CloudFront will cache your objects for a certain period and then expire them. Frequently used objects are expired less often. Depending on the content you may want to update the cache more or less frequently. In my case CSS and JavaScript files change rarely so I wanted to tell CloudFront to not expire the files quickly. I did this by telling Apache to send a custom value for the Expires header.

    $ curl http://d71ktrt2emu2j.cloudfront.net/static/v1/css/style.css -D headers.txt
    $ cat headers.txt 
    HTTP/1.0 200 OK
    Date: Mon, 16 Apr 2012 19:02:16 GMT
    Server: Apache/2.2.15 (Red Hat)
    Last-Modified: Mon, 16 Apr 2012 19:00:33 GMT
    ETag: "120577-1b2d-4bdd06fc6f640"
    Accept-Ranges: bytes
    Content-Length: 6957
    Cache-Control: max-age=31536000
    Expires: Tue, 16 Apr 2013 19:02:16 GMT
    Content-Type: text/css
    Strict-Transport-Security: max-age=15768000, includeSubDomains
    Age: 73090
    X-Cache: Hit from cloudfront
    X-Amz-Cf-Id: X558vcEOsQkVQn5V9fbrWNTdo543v8VStxdb7LXIcUWAIbLKuIvp-w==,e8Dipk5FSNej3e0Y7c5ro-9mmn7OK8kWfbaRGwi1ww8ihwVzSab24A==
    Via: 1.0 d6343f267c91f2f0e78ef0a7d0b7921d.cloudfront.net (CloudFront)
    Connection: close

All headers before Strict-Transport-Security come from the origin server.

Versioning

Sometimes however you need to update the files and force CloudFront to update the content. The recommended way to do this is to use URL versioning and update the path to the files which changed. This will force CloudFront to cache and serve the content under the new path while keeping the old content available until it expires. This way your visitors will not be viewing your site with the new CSS and old JavaScript.

There are many ways to do this and there are some nice frameworks as well. For Python there is webassets. I don't have many static files so I opted for no additional dependencies. Instead I will be updating the versions by hand.

What comes to mind is using mod_rewrite to redirect the versioned URLs back to non-versioned ones. However there's a catch. If you do this CloudFront will cache the redirect itself, not the content. The next time visitors hit CloudFront they will receive the cached redirect and follow it back to your origin server, which defeats the purpose of having a CDN.

To do it properly you have to rewrite the URLs but still return a 200 response code and the content which needs to be cached. This is done with mod_proxy:

    RewriteEngine on
    RewriteRule ^VERSION-(\d+)/(.*)$ http://%{ENV:OPENSHIFT_INTERNAL_IP}:%{ENV:OPENSHIFT_INTERNAL_PORT}/static/$2 [P,L]

This .htaccess trick doesn't work on OpenShift though. mod_proxy is not enabled at the moment. See bug 812389 for more info.

Luckily I was able to use symlinks to point to the content. Here's how it looks:

    $ pwd
    /home/atodorov/difio/wsgi/static
    
    $ cat .htaccess
    ExpiresActive On
    ExpiresDefault "access plus 1 year"
    
    $ ls -l
    drwxrwxr-x. 6 atodorov atodorov 4096 16 Apr 21,31 o
    lrwxrwxrwx. 1 atodorov atodorov    1 16 Apr 21,47 v1 -> o
    
    settings.py:
    STATIC_URL = '//d71ktrt2emu2j.cloudfront.net/static/v1/'
    
    HTML template:
    <link type="text/css" rel="stylesheet" media="screen" href="{{ STATIC_URL }}css/style.css" />

How to implement it

First you need to split all CSS and JavaScript from your HTML if you haven't done so already.

Then place everything under your git repo so that OpenShift will serve the files. For Python applications place the files under wsgi/static/ directory in your git repo.

Point all of your HTML templates to the static location on OpenShift and test if everything works as expected. This is best done if you're using some sort of template language and store the location in a single variable which you can change later. Difio uses Django and the STATIC_URL variable of course.

Create your CloudFront distribution - don't use Amazon S3, instead configure a custom origin server. Write down your CloudFront URL. It will be something like 1234xyz.cloudfront.net.

Every time a request hits CloudFront it will check if the object is present in the cache. If not present CloudFront will fetch the object from the origin server and populate the cache. Then the object is sent to the user.

Update your templates to point to the new cloudfront.net URL and redeploy your website!

There are comments.

OpenShift Cron Takes Over Celerybeat

Celery is an asynchronous task queue/job queue based on distributed message passing. You can define tasks as Python functions, execute them in the background and in a periodic fashion. Difio uses Celery for virtually everything. Some of the tasks are scheduled after some event takes place (like user pressed a button) or scheduled periodically.

Celery provides several components of which celerybeat is the periodic task scheduler. When combined with Django it gives you a very nice admin interface which allows periodic tasks to be added to the scheduler.
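As a reminder of what such a task looks like, here is a trivial sketch - the task name and body are illustrative:

# a trivial sketch of a Celery task - the name and body are illustrative
from celery.task import task

@task
def cron_example_reminder():
    # the actual work goes here: query the database, send emails, etc.
    pass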

Why change

Difio has relied on celerybeat for a couple of months. Back then, when Difio launched, there was no cron support for OpenShift so running celerybeat sounded reasonable. It used to run on a dedicated virtual server and for most of the time that was fine.

There were a number of issues which Difio faced during its first months:

  • celerybeat would sometimes die due to no free memory on the virtual instance. When that happened no new tasks were scheduled and data was left unprocessed. Not to mention that a higher memory instance, and the processing power which comes with it, costs extra money.

  • Difio is split into several components which need to have the same code base locally - the most important are the database settings and the periodic tasks code. On at least one occasion celerybeat failed to start because of buggy task code. The offending code was fixed in the application server on OpenShift but not properly synced to the celerybeat instance. Keeping code in sync is a priority for distributed projects which rely on Celery.

  • Celery and django-celery seem to be updated quite often. This poses a significant risk of ending up with different versions on the scheduler, worker nodes and the app server. This will bring the whole application to a halt if at some point a backward incompatible change is introduced and not properly tested and updated. Keeping infrastructure components in sync can be a big challenge and I try to minimize this effort as much as possible.

  • Having to navigate to the admin pages every time I add a new task or want to change the execution frequency doesn't feel very natural for a console user like myself and IMHO is less productive. For the record I primarily use mcedit. I wanted to have something closer to the write, commit and push work-flow.

The take over

It's been some time since OpenShift introduced the cron cartridge and I decided to give it a try.

The first thing I did was to write a simple script which can execute any task from the difio.tasks module by piping it to the Django shell (a Python shell actually).

run_celery_task
#!/bin/bash
#
# Copyright (c) 2012, Alexander Todorov <atodorov@nospam.otb.bg>
#
# This script is symlinked to from the hourly/minutely, etc. directories
#
# SYNOPSIS
#
# ./run_celery_task cron_search_dates
#
# OR
#
# ln -s run_celery_task cron_search_dates
# ./cron_search_dates
#

TASK_NAME=$1
[ -z "$TASK_NAME" ] && TASK_NAME=$(basename $0)

if [ -n "$OPENSHIFT_APP_DIR" ]; then
    source $OPENSHIFT_APP_DIR/virtenv/bin/activate
    export PYTHON_EGG_CACHE=$OPENSHIFT_DATA_DIR/.python-eggs
    REPO_DIR=$OPENSHIFT_REPO_DIR
else
    REPO_DIR=$(dirname $0)"/../../.."
fi

echo "import difio.tasks; difio.tasks.$TASK_NAME.delay()" | $REPO_DIR/wsgi/difio/manage.py shell

This is a multicall script which allows symlinks with different names to point to it. Thus to add a new task to cron I just need to make a symlink to the script from one of the hourly/, minutely/, daily/, etc. directories under cron/.

The script accepts a parameter as well which allows me to execute it locally for debugging purposes or to schedule some tasks out of band. This is how it looks like on the file system:

$ ls -l .openshift/cron/hourly/
some_task_name -> ../tasks/run_celery_task
another_task -> ../tasks/run_celery_task

After having done these preparations I only had to embed the cron cartridge and git push to OpenShift:

rhc-ctl-app -a difio -e add-cron-1.4 && git push

What's next

At present OpenShift can schedule your jobs every minute, hour, day, week or month and does so using the run-parts script. You can't schedule a script to execute at 4:30 every Monday or every 45 minutes for example. See rhbz #803485 if you want to follow the progress. Luckily Difio doesn't use this sort of job scheduling for the moment.

Difio is scheduling periodic tasks from OpenShift cron for a few days already. It seems to work reliably and with no issues. One less component to maintain and worry about. More time to write code.

There are comments.

