Setting up RStudio server on the cloud?

r
rstudio
cloud

#1

I want to set up RStudio in the cloud to create a common working environment with one of my clients. They will upload the data to the server, and I will then build the models on it in a secure environment.

These are the options I have found so far:

  1. Setting up RStudio Server: http://www.rstudio.com/products/rstudio/#Server
  2. Setting up a droplet on Digital Ocean: https://github.com/sckott/analogsea
  3. Creating an AWS instance and then running RStudio on it

Are there any other options people can suggest? Also, can anyone highlight the pros and cons of each of the options above, specifically in relation to:

  • Ease of setup
  • Servers available across the globe (in Europe and in the U.S.)
  • Cost
  • Ability to upload and build models on large datasets

Any suggestions - R experts?

J


#2

Jon,

Here is a brief comparison of Digital Ocean (DO) and Amazon Web Services (AWS) based on my experience (though not from hosting an RStudio server):

  • Digital Ocean droplets are very easy to set up. They are high-performance machines aimed at developers. The cost would be far lower than Amazon's, but you will not have all the facilities Amazon has. For example, you will not have a load balancer available on DO. On the plus side, you get SSD drives at a low cost, so you can expect good performance from these machines. Your client, however, would most likely be more comfortable with AWS because of its established presence.

  • Amazon, on the other hand, has a wider range of services and a higher cost, but a far bigger brand that clients are comfortable with.

  • RStudio professional server looks to have some good options as well. You can do the math for the cost you will incur on AWS, DO and RStudio for your data size and get to an answer.
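
If it helps, the cost math can be sketched in a few lines. This is only a rough illustration in Python: the provider names, hourly rates and storage rates below are made-up placeholders (not real AWS, DO or RStudio prices), and the `monthly_cost` and `compare` helpers are hypothetical. Plug in the actual figures from each provider's pricing page.

```python
# Rough monthly cost comparison.
# All rates below are made-up placeholders, NOT real provider prices.
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_cost(hourly_rate, hours_used, storage_gb,
                 storage_rate_per_gb, licence_per_month=0.0):
    """Estimated monthly bill: compute + storage + optional licence fee."""
    compute = hourly_rate * hours_used
    storage = storage_gb * storage_rate_per_gb
    return compute + storage + licence_per_month


# Hypothetical providers with placeholder rates.
options = {
    "provider_a": {"hourly_rate": 0.06, "storage_rate_per_gb": 0.10},
    "provider_b": {"hourly_rate": 0.10, "storage_rate_per_gb": 0.10},
}


def compare(options, hours_used, storage_gb, licence_per_month=0.0):
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: round(
            monthly_cost(p["hourly_rate"], hours_used, storage_gb,
                         p["storage_rate_per_gb"], licence_per_month),
            2,
        )
        for name, p in options.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Example: server running all month with 200 GB of data.
    for name, cost in compare(options, hours_used=HOURS_PER_MONTH, storage_gb=200):
        print(f"{name}: {cost:.2f} per month")
```

If the server only needs to run while you are building models, pass a smaller `hours_used`. With hourly billing, that can change the ranking considerably.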

Additionally, you can also look at http://www.revolutionanalytics.com/revolution-r-cloud

Regards,
Kunal


#3

You can try the Bioconductor cloud AMI:
http://www.bioconductor.org/help/bioconductor-cloud-ami/

It is an Amazon Machine Image (AMI) that is optimized for running Bioconductor in the Amazon Elastic Compute Cloud (EC2) for sequencing tasks.

Here are a few reasons you could use it:

  • You do not want to install Bioconductor on your own machine.
  • You have a long-running task and you don't want it to tie up the CPU on your own machine.
  • You have a parallelizable task and would like to run it (either on multiple CPUs on a single machine, or in a cluster of many machines).
  • You want to run R in your web browser (using RStudio Server).
  • The AMI contains many packages which can be very difficult to install and configure.

See below for more specific scenarios.

Preloaded AMI

The AMI comes pre-loaded with the latest release version of R, and the following Bioconductor packages (and all their CRAN dependencies):

affxparser
affy
affyio
affylmGUI
annaffy
annotate
AnnotationDbi
aroma.light
BayesPeak
baySeq
Biobase
BiocInstaller
biomaRt
Biostrings
BSgenome
Category
ChIPpeakAnno
chipseq
ChIPseqR
ChIPsim
CSAR
cummeRbund
DESeq
DEXSeq
DiffBind
DNAcopy
DynDoc
EDASeq
edgeR
ensemblVEP
gage
genefilter
geneplotter
GenomeGraphs
genomeIntervals
GenomicFeatures
GenomicRanges
Genominator
GEOquery
GGBase
GGtools
girafe
goseq
GOstats
graph
GSEABase
HilbertVis
impute
IRanges
limma
MEDIPS
multtest
oneChannelGUI
PAnnBuilder
preprocessCore
qpgraph
qrqc
R453Plus1Toolbox
RBGL
Repitools
rGADEM
Rgraphviz
Ringo
Rolexa
Rsamtools
Rsubread
rtracklayer
segmentSeq
seqbias
seqLogo
ShortRead
snpStats
splots
SRAdb
tkWidgets
VariantAnnotation
vsn
widgetTools
zlibbioc

Plus the following categories of annotation package:

org.*
BSgenome.*
PolyPhen.*
SIFT.*
TxDb.*