Sunday, March 13, 2016

Using AWS for Research


What is Amazon Web Services (AWS)? 


AWS is a cloud computing platform provided by Amazon. This post covers Amazon EC2, the service that lets you launch your own virtual machines (instances) running a variety of operating systems. Amazon offers these computational resources to both the public and private sectors at a low, pay-as-you-go price. Below I list the advantages and disadvantages of using cloud computing resources. For me the advantages definitely outweigh the disadvantages :) AWS is generally much more flexible than your university's local cluster: you choose the size, type, and number of machines, and you have the rights to install and run whatever you need, whenever you need it.

 Advantages 

  1. None of the maintenance burden of hosting your own server
  2. Access to high-performance machines
  3. Access to highly parallel computing setups (like clusters running Hadoop)
  4. The cost is low
  5. Sudo rights (you can install whatever you want without asking your sysadmin)
  6. The machines available to you come in a variety of shapes and sizes

 Disadvantages 

  1. There might not be enough machines for everyone, so there may be a wait time (this has yet to happen to me)
  2. It costs money (but Amazon provides educational grants)
  3. An AWS instance is a blank slate, so you need to install everything and copy your files over (Amazon makes this easier with services like S3 for storing files, and lets you create an image of an instance so you can relaunch it with your software pre-installed)

 AWS Getting Started 

Using EC2 is easy. Once you've launched your instance, you can ssh into it using your ssh key. If your machine runs Ubuntu:

ssh -i location_to_pem_file.pem ubuntu@ec2-54-153-7-122.us-west-1.compute.amazonaws.com

All inputs and outputs should be saved in your /mnt/ directory (this is where all of the system storage is).
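For the login step above, it helps to know that ssh ignores a private key file that other users can read, so tighten its permissions before the first login. Here is a minimal sketch of a small wrapper, assuming a key stored at ~/.ssh/my-key.pem (a hypothetical path) and the example hostname used in this post; the function name ec2_ssh is made up for illustration.

```shell
# Hypothetical values: replace with your own key file and instance hostname.
KEY_FILE="$HOME/.ssh/my-key.pem"
EC2_HOST="ec2-54-153-7-122.us-west-1.compute.amazonaws.com"

# Open a session (or run one remote command) as the default "ubuntu" user.
ec2_ssh() {
    ssh -i "$KEY_FILE" "ubuntu@$EC2_HOST" "$@"
}

# Usage:
#   chmod 400 "$KEY_FILE"   # make the key readable only by you
#   ec2_ssh                 # interactive shell on the instance
#   ec2_ssh df -h /mnt      # run a single command remotely
```

Keeping the key path and hostname in variables means you only edit two lines when you launch a new instance.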

By default you are not the owner of /mnt/, so you need to take ownership of it before you can write there.

cd /mnt/
sudo chown ubuntu:ubuntu .
mkdir data
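Once /mnt/data exists and belongs to the ubuntu user, you can copy files to and from it with scp, run from your local machine. This is a rough sketch, assuming the same hypothetical key path and the example hostname from this post; the helper names push_to_ec2 and pull_from_ec2 are invented for illustration.

```shell
# Hypothetical values: replace with your own key file and instance hostname.
KEY_FILE="$HOME/.ssh/my-key.pem"
EC2_HOST="ec2-54-153-7-122.us-west-1.compute.amazonaws.com"

# Copy local files or directories into /mnt/data on the instance.
push_to_ec2() {
    scp -i "$KEY_FILE" -r "$@" "ubuntu@$EC2_HOST:/mnt/data/"
}

# Copy one file or directory from /mnt/data on the instance
# into the current local directory.
pull_from_ec2() {
    scp -i "$KEY_FILE" -r "ubuntu@$EC2_HOST:/mnt/data/$1" .
}

# Usage (run on your local machine):
#   push_to_ec2 reads.fastq config.yaml
#   pull_from_ec2 results
```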

 If you need to mount more storage on your EC2 machine: 

  1. Go to the AWS Console
  2. Create an EBS volume of the size needed (in the same availability zone as your instance)
  3. Attach this volume to your running EC2 machine
  4. Then format and mount the volume on your EC2 machine using the following commands

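# Format the new volume (this erases it, so only do this for a brand-new volume)
# /dev/xvdf might be different depending on where the volume was attached
sudo mkfs -t ext4 /dev/xvdf
sudo mount /dev/xvdf /mnt/data
cd /mnt/data/
# List mounted filesystems to check that /dev/xvdf is mounted on /mnt/data
mount
sudo chown ubuntu:ubuntu .
# df to confirm that your volume was successfully added
df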
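The console steps for creating and attaching the volume can also be scripted with the AWS CLI. The sketch below assumes the CLI is installed and configured with credentials; the function name, the gp2 volume type, and the /dev/sdf device name are my choices for illustration, not requirements. A volume attached as /dev/sdf often shows up inside the instance as /dev/xvdf, which is why the device name in the mount commands can differ.

```shell
# Assumes the AWS CLI is installed and configured (an assumption, not shown here).
# Create a volume of the given size (GiB) in the given availability zone,
# wait until it is available, then attach it to the instance as /dev/sdf.
create_and_attach_volume() {
    local size_gib="$1" zone="$2" instance_id="$3"
    local volume_id

    # --query/--output make the CLI print only the new volume's ID.
    volume_id=$(aws ec2 create-volume \
        --size "$size_gib" \
        --availability-zone "$zone" \
        --volume-type gp2 \
        --query VolumeId --output text) || return 1

    aws ec2 wait volume-available --volume-ids "$volume_id" || return 1

    aws ec2 attach-volume \
        --volume-id "$volume_id" \
        --instance-id "$instance_id" \
        --device /dev/sdf
}

# Usage (the zone and instance ID are placeholders):
#   create_and_attach_volume 100 us-west-1a i-0123456789abcdef0
```

The zone you pass must match the zone your instance runs in, otherwise the attach step fails.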


All data can easily be saved to S3, and instances with pre-installed packages can be saved and reused as images.
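Both of these can be done from the command line with the AWS CLI. This is only a sketch, assuming the CLI is installed and configured, and that you already have a bucket; the bucket name, instance ID, image name, and the three helper names are all placeholders I made up.

```shell
# Assumes the AWS CLI is installed and configured (an assumption, not shown here).

# Upload everything under /mnt/data to s3://<bucket>/data.
backup_to_s3() {
    aws s3 sync /mnt/data "s3://$1/data"
}

# Download s3://<bucket>/data back into /mnt/data on a fresh instance.
restore_from_s3() {
    aws s3 sync "s3://$1/data" /mnt/data
}

# Save an instance, with its installed software, as a reusable image.
# Note: by default this reboots the instance while the image is taken.
save_image() {
    aws ec2 create-image --instance-id "$1" --name "$2"
}

# Usage (all names are placeholders):
#   backup_to_s3 my-research-bucket
#   save_image i-0123456789abcdef0 my-analysis-image
```

Syncing to S3 before you terminate an instance is a cheap safeguard, since anything left only on the machine is gone once it is terminated.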