Set up a tranSMART development environment

In an earlier post, I shared my experiences with installing tranSMART on Amazon, but there are now AMIs available for the tranSMART 1.0 GA version, which makes that post obsolete. However, I like to be able to code on the road when I travel, so I wanted a local development environment on my MacBook. Since Oracle XE is not supported on Mac OS X, this is not trivial, but with a virtual machine it is possible to build a transparent local tranSMART development setup.

First of all, we need virtualization software. You could just install VirtualBox if you haven't already; Oracle provides it for free, and in my experience it works well. Next, we need a virtual machine. I hate install wizards, so I'm happy to re-use one of the ready-to-go VirtualBox images of CentOS from the virtualboxes.org team (http://virtualboxes.org/images/centos) and tweak it. I used the following CentOS image:

http://sourceforge.net/projects/virtualboximage/files/CentOS/5.6/Centos-x86_64.7z/download

Just extract the downloaded image, import it into VirtualBox and you are almost ready to go. Before you start the image, configure the networking options for the network card in the virtual machine. I chose NAT, because I want all the tranSMART stuff to run as if it were on my local machine. To achieve that, we have to forward all the ports needed by the tranSMART software (see screenshot below). We also expose the virtual machine's SSH port as 2222 on the MacBook, so we can use Terminal instead of the crappy VirtualBox console window to manage the virtual machine.
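If you prefer the command line over the settings dialog, the same kind of forwarding rules can be added with VBoxManage. The following is only a sketch: the VM name and the port list (SSH on 2222, Oracle's default listener port 1521, and 9090 for the i2b2 services) are my assumptions, so adjust them to whatever your installation actually uses. The script only prints the commands so you can review them first.

```shell
#!/bin/sh
# Sketch: print the VBoxManage commands that add NAT port-forwarding rules.
# VM_NAME and the port list below are assumptions; adjust them to your setup.
VM_NAME="${VM_NAME:-Centos 5.6 64}"

natpf_commands() {
    # Each input line holds: rule name, host port, guest port.
    while read -r name host_port guest_port; do
        # Rule format: name,protocol,host ip,host port,guest ip,guest port
        # (empty ip fields mean "any").
        printf 'VBoxManage modifyvm "%s" --natpf1 "%s,tcp,,%s,,%s"\n' \
            "$VM_NAME" "$name" "$host_port" "$guest_port"
    done <<EOF
ssh 2222 22
oracle 1521 1521
i2b2 9090 9090
EOF
}

natpf_commands
```

Once the printed commands look right, pipe the output to sh to apply them. Note that modifyvm only works while the virtual machine is powered off.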

You can now start the machine, and you should be able to connect via ssh:

mbpkees:~ kees$ ssh root@localhost -p 2222

The root password of the above-mentioned image is 'reverse'; if you wish, you can change it using passwd.

The next steps are to install Oracle and all the other tranSMART applications, which is documented in the Install Guide. However, at some point you will notice that this image has a total size of only 8GB, which is not enough for any serious tranSMART installation, even taking into account the 11GB limit of Oracle XE 11g. Luckily, we can easily modify the size of the virtual disk. Shut down the virtual machine and find the path to the virtual disk image (right click on the virtual machine, go to Settings > Storage, click on the disk, and find the path on the right). Then fire up a terminal and navigate to that folder. VBoxManage should be in your path if VirtualBox was installed properly. With the following command, you can resize the disk to about 20GB:

mbpkees:Centos 5.6 64 kees$ VBoxManage modifyhd Centos64.vdi --resize 20000

0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%

Next, we need to start the virtual machine and add the newly created space to the actual file system. Since the image is using LVM, this is a simple task. To be safe, I prefer to just make a new LVM partition and then extend the volume group and logical volumes with that, rather than resizing the partition itself. You can do this as follows:

[root@localhost ~]# fdisk /dev/sda

...

Command (m for help): p

Disk /dev/sda: 20.9 GB, 20971520000 bytes
255 heads, 63 sectors/track, 2549 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1044 8281507+ 8e Linux LVM

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (1045-2549, default 1045):
Using default value 1045
Last cylinder or +size or +sizeM or +sizeK (1045-2549, default 2549):
Using default value 2549

Let's review the partition table to see how it looks:

Command (m for help): p

Disk /dev/sda: 20.9 GB, 20971520000 bytes
255 heads, 63 sectors/track, 2549 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1044 8281507+ 8e Linux LVM
/dev/sda3 1045 2549 12088912+ 83 Linux
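
As a quick sanity check, the numbers fdisk reports can be reproduced with plain shell arithmetic. This is a minimal sketch with illustrative variable names; it assumes that --resize takes its argument in units of 1024*1024 bytes, which matches the byte count shown above.

```shell
#!/bin/sh
# Reproduce the fdisk figures with integer shell arithmetic.
RESIZE_MB=20000          # value passed to VBoxManage --resize
HEADS=255                # geometry as reported by fdisk
SECTORS_PER_TRACK=63
SECTOR_BYTES=512
FIRST_NEW_CYLINDER=1045  # first cylinder of /dev/sda3

disk_bytes=$((RESIZE_MB * 1024 * 1024))
cyl_bytes=$((HEADS * SECTORS_PER_TRACK * SECTOR_BYTES))
cylinders=$((disk_bytes / cyl_bytes))
new_part_kb=$(( (cylinders - FIRST_NEW_CYLINDER + 1) * cyl_bytes / 1024 ))
new_part_gb=$((new_part_kb / 1024 / 1024))

echo "disk size:     $disk_bytes bytes"     # 20971520000
echo "cylinder size: $cyl_bytes bytes"      # 8225280
echo "cylinders:     $cylinders"            # 2549
echo "new partition: $new_part_kb blocks"   # 12088912
echo "whole GB free: $new_part_gb"          # 11
```

The last figure shows that the new partition holds 11 whole gigabytes plus a remainder of roughly half a gigabyte.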

It looks good, but we need to change the id of the new partition to Linux LVM, so that LVM is comfortable extending the volume group with it:

Command (m for help): t

Partition number (1-4): 3

Hex code (type L to list codes): 8e

Changed system type of partition 3 to 8e (Linux LVM)

Command (m for help): p

Disk /dev/sda: 20.9 GB, 20971520000 bytes
255 heads, 63 sectors/track, 2549 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1044 8281507+ 8e Linux LVM
/dev/sda3 1045 2549 12088912+ 8e Linux LVM

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

Now reboot the system by issuing 'reboot'. After the system is back up, reconnect via SSH. We can now add the new partition /dev/sda3 to the volume group:

[root@localhost ~]# vgextend /dev/VolGroup00 /dev/sda3

No physical volume label read from /dev/sda3

Physical volume "/dev/sda3" successfully created

Volume group "VolGroup00" successfully extended

Let's add those extra gigs to the main logical volume:

[root@localhost ~]# lvextend -L+11G /dev/VolGroup00/LogVol00 /dev/sda3

Extending logical volume LogVol00 to 17.97 GB

Logical volume LogVol00 successfully resized

And while we're at it, let's take advantage of the space that's left and add some extra swap space too, to make sure Oracle doesn't run out of memory:

[root@localhost ~]# swapoff /dev/VolGroup00/LogVol01

[root@localhost ~]# lvresize -l +100%FREE /dev/VolGroup00/LogVol01

Extending logical volume LogVol01 to 1.38 GB

Logical volume LogVol01 successfully resized

[root@localhost ~]# mkswap /dev/VolGroup00/LogVol01

Setting up swapspace version 1, size = 1476390 kB

[root@localhost ~]# swapon /dev/VolGroup00/LogVol01

Now we should see an increase in available swap space:

[root@localhost ~]# free

total used free shared buffers cached

Mem: 2058840 613740 1445100 0 14576 442012

-/+ buffers/cache: 157152 1901688

Swap: 1441784 0 1441784

And, of course, as a final step, we need to expand the filesystem to make use of the newly available space:

[root@localhost ~]# resize2fs -p /dev/VolGroup00/LogVol00

resize2fs 1.39 (29-May-2006)

Filesystem at /dev/VolGroup00/LogVol00 is mounted on /; on-line resizing required

Performing an on-line resize of /dev/VolGroup00/LogVol00 to 4718592 (4k) blocks.

The filesystem on /dev/VolGroup00/LogVol00 is now 4718592 blocks long.

Indeed, the root filesystem now shows the expanded size:

[root@localhost ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/VolGroup00-LogVol00

18G 6.1G 11G 37% /

/dev/sda1 99M 26M 68M 28% /boot

tmpfs 1006M 156M 850M 16% /dev/shm

We're ready to rock now.

The next step is to do a local installation of the necessary tranSMART components on the virtual machine: i2b2, Solr, R and, if you like, GenePattern. The install guide on the tranSMART project wiki is pretty solid (and I updated some parts on the way), so I won't repeat all the steps here. Just remember to download and unzip the applications using the same account that you intend to run the services with, so that the user account has the rights to modify the files in the folder. If you forget that (as I did :-)), you can of course always change the owner using 'chown -R'.

Finally, we want to configure the system so that we can run a local tranSMART instance in Grails development mode while it uses Oracle, i2b2, Solr, R and GenePattern from the virtual machine, preferably without violating the JavaScript same-origin policy (if we used different ports for the i2b2 services and tranSMART, a browser with a standard security configuration would see cross-domain XHR requests and deny them). The trick here is to use the local Apache server on your MacBook to serve the requests for both i2b2 and tranSMART. If you enabled file sharing in the system preferences, there is already an Apache server running, and you can change its configuration by opening the .conf file in /etc/apache2/users. In my case, I can do this from the command line as follows:

sudo nano /etc/apache2/users/kees.conf

Now let's add the necessary proxy rules:
[code]

<Location /i2b2/services/OntologyService>
ProxyPass http://localhost:9090/i2b2/services/OntologyService
ProxyPassReverse http://localhost:9090/i2b2/services/OntologyService
</Location>

<Location /i2b2/services/QueryToolService>
ProxyPass http://localhost:9090/i2b2/services/QueryToolService
ProxyPassReverse http://localhost:9090/i2b2/services/QueryToolService
</Location>

<Location /i2b2/services/PMService>
ProxyPass http://localhost:9090/i2b2/services/PMService
ProxyPassReverse http://localhost:9090/i2b2/services/PMService
</Location>

<Location /transmartApp>
ProxyPass http://localhost:8080/transmartApp
ProxyPassReverse http://localhost:8080/transmartApp
</Location>

[/code]
If you use another Apache instance, make sure mod_proxy and mod_proxy_http are loaded. Then reload or restart Apache; the command to restart the server built into Mac OS X is:

sudo apachectl restart
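
To see whether the proxy rules respond, you could request each proxied URL and look only at the HTTP status code. The sketch below prints one curl command per URL for review. The base URL and the local i2b2 paths are assumptions on my part (I assume they mirror the backend paths in the proxy rules), and the transmartApp URL will only answer once the Grails application is actually running.

```shell
#!/bin/sh
# Sketch: print curl commands that report the HTTP status of each proxied URL.
# BASE_URL and the i2b2 paths are assumptions; adjust them to your proxy rules.
BASE_URL="${BASE_URL:-http://localhost}"
STATUS_FORMAT='%{http_code}\n'   # curl --write-out format: status code only

check_commands() {
    for path in \
        /transmartApp \
        /i2b2/services/OntologyService \
        /i2b2/services/QueryToolService \
        /i2b2/services/PMService
    do
        # -s: no progress output, -o /dev/null: discard the response body
        printf 'curl -s -o /dev/null -w "%s" %s%s\n' \
            "$STATUS_FORMAT" "$BASE_URL" "$path"
    done
}

check_commands
```

Pipe the output to sh to run the checks. A status of 000 means curl could not connect at all, which usually points at Apache or the port forwarding rather than at tranSMART itself.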

Now you can run a local tranSMART installation in the Grails development environment on port 8080 and connect to it at http://localhost/transmartApp! This allows you to write code and test the changes on the fly, and even use your IDE's debugger to dive right into what's happening. I recommend IntelliJ IDEA, my favourite IDE for all Grails/Groovy related projects.
