How to build an awesome Machine Learning computer to run your hefty (tidymodels) R code.

Hint: Don't.

Have you seen the prices of computer parts these days? It doesn't help that YouTube flings the latest computer build video at us every second day (maybe I should unsubscribe from that channel). It's a recession, people; we're supposed to be holding onto our pennies pretty tightly, right?

My solution was to create a "chonky VM" on Digital Ocean (drawing on my experience of setting up this blog and its accompanying shiny server). Once it's set up, I just git pull the repo, run it, then git push the result back up (nice one!).

The Steps

Here's how to set up RStudio Server, at your own domain, with SSL (https) encryption.

  1. Go buy a cheap domain. Something like your_name.xyz will be ~AU$2 from godaddy.com (don’t buy the extra “protection” they try to up-sell you). Also, make sure you're only buying 1 year (not 2). Turn off auto-renewals, if you can be bothered.
  2. In the DNS settings (in godaddy), change the nameservers to point to digital ocean (ns1.digitalocean.com, ns2.digitalocean.com, ns3.digitalocean.com). They say it may take up to ~1 hour for these changes to propagate.
  3. Create a digital ocean account (use someone’s referral link, for two months' worth of freebie credits)
  4. Download and install Putty
  5. Add an SSH key to Digital Ocean (we could just use a plain old password… but IT security people will laugh at us)
  6. In digital ocean, create an Ubuntu droplet / instance. I like to start with a $5 instance and then resize it to one with more CPUs, to speed up the install (we also need a machine with more than 1 GB of memory to install the packages we want).
  7. Go to "Networking" and add your_domain.xyz
  8. In this domain view, add an A record: @ and have this point to your newly created droplet.
  9. SSH into your droplet with Putty.
  10. Enter the following commands, swapping my specific details (user name, domain, email) for your own.

Linux Bash Commands

# Create a non-root user with sudo rights, then switch to it
adduser julian
gpasswd -a julian sudo

su - julian

# Install nginx, gdebi, git and the system libraries that R packages need
sudo apt-get update
sudo apt-get -y install nginx gdebi-core libcurl4-gnutls-dev libxml2-dev libssl-dev git

# Add the CRAN repository and its signing key, then install R
sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu bionic-cran35/" >> /etc/apt/sources.list'
gpg --keyserver keyserver.ubuntu.com --recv-key E298A3A825C0D65DFD57CBB651716619E084DAB9
gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add -

sudo apt-get update
sudo apt-get -y install r-base

# Download and install RStudio Server
wget https://download2.rstudio.org/server/xenial/amd64/rstudio-server-1.3.959-amd64.deb
sudo gdebi rstudio-server-1.3.959-amd64.deb

# Install certbot (for the SSL certificate)
sudo add-apt-repository ppa:certbot/certbot
sudo apt install python-certbot-nginx

# Create an nginx site that forwards your domain to RStudio Server (port 8787)
sudo nano /etc/nginx/sites-available/rstudio.conf

### enter the following into the rstudio.conf file:

server {
        server_name tadge-apps.xyz;
        location / {
                proxy_pass http://localhost:8787/;
        }
}

# Enable the site and reload nginx
sudo ln -s /etc/nginx/sites-available/rstudio.conf /etc/nginx/sites-enabled/rstudio.conf

sudo systemctl reload nginx

# Request the certificate and let certbot update the nginx config (it will ask a few questions)
sudo certbot --nginx

# Tell git who you are, for commits made from this machine
git config --global user.email "julian@tadge-analytics.com.au"
git config --global user.name "Julian Tagell"

Nice one! RStudio Server will now be available at your domain. Log in with the Ubuntu user and password that you created in the first of the commands above.
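
If the domain doesn't load, the DNS changes from steps 2 and 8 may not have propagated yet. Here's a rough sketch of a check you could run from any machine that has dig (on Ubuntu it comes with the dnsutils package). The function names, the domain and the IP address are all placeholders of mine, not part of the setup above.

```shell
# Sketch only: compare what DNS returns for a domain against the droplet's IP.

# Returns 0 if expected_ip appears as a whole line in the list of records.
ip_in_records() {
    local expected_ip="$1"
    local records="$2"   # newline-separated list of IP addresses
    printf '%s\n' "$records" | grep -Fxq "$expected_ip"
}

# Looks up the A records with dig and reports whether they match.
check_dns() {
    local domain="$1"
    local expected_ip="$2"
    local records
    records="$(dig +short "$domain" A)"
    if ip_in_records "$expected_ip" "$records"; then
        echo "OK: $domain points to $expected_ip"
    else
        echo "Not yet: $domain resolves to: ${records:-nothing}"
        return 1
    fi
}

# Usage (placeholder domain and IP, swap in your own):
#   check_dns your_domain.xyz 203.0.113.10
```

If it reports "Not yet", waiting a while and trying again is usually all that's needed.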

It's now up to you to install all the R packages that you'll be needing. You can also do this from the Linux command line, if you want to.

sudo su - -c "R -e \"install.packages(c('tidyverse', 'tidymodels', 'lubridate', 'remotes', 'drake'), repos='http://cran.rstudio.com/')\""

This takes time.
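
Since installing these packages needs more than 1 GB of memory (see step 6), it can be worth checking what the droplet actually has before kicking the install off. A minimal sketch, reading MemTotal from /proc/meminfo; the function name and the 1500 MB threshold in the usage line are my own assumptions, not hard numbers.

```shell
# Returns 0 if total RAM is at least min_mb megabytes.
# The second argument is optional and only there so the function can be
# pointed at a different file; it defaults to /proc/meminfo.
has_enough_memory() {
    local min_mb="$1"
    local meminfo="${2:-/proc/meminfo}"
    local total_kb
    # MemTotal is reported in kB, e.g. "MemTotal:  2035480 kB"
    total_kb="$(awk '/^MemTotal:/ {print $2}' "$meminfo")"
    [ "$(( total_kb / 1024 ))" -ge "$min_mb" ]
}

# Usage (1500 is an arbitrary "comfortably more than 1 GB" threshold):
#   has_enough_memory 1500 || echo "Resize the droplet before installing"
```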


Now it's up to you to resize the instance as you like. Be aware that, when you're done with the heavy lifting, you need to either resize it back to something cheaper or destroy the instance. You are charged for as long as the instance exists.

Do your own research or risk cloud service bill shock.
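
To get a feel for the numbers, here's a back-of-the-envelope calculation, assuming the instance is billed by the hour. The function name is mine and the rate is a made-up placeholder; look up the real figure on your provider's pricing page.

```shell
# estimate_cost HOURLY_RATE HOURS
# Prints rate * hours, rounded to two decimal places.
estimate_cost() {
    awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

# Hypothetical rate of 0.50 per hour:
estimate_cost 0.50 6     # a six-hour modelling session  -> 3.00
estimate_cost 0.50 720   # left running for 30 days      -> 360.00
```

Same machine, same rate; the only difference is remembering to resize or destroy it.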

So there you have it. You're all set to git pull your repo and run the taxing steps in a much beefier environment than your local machine.
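
The pull, run, push loop could look something like this. It's only a sketch: the function name, repo path, script name and commit message are placeholders, and it assumes the repo is already cloned on the droplet with push access set up.

```shell
# run_job REPO_DIR SCRIPT
# Pulls the latest code, runs the R script, then commits and pushes the results.
# Each step only runs if the previous one succeeded.
run_job() {
    local repo_dir="$1"
    local script="$2"
    (
        # Subshell, so the caller's working directory is left alone
        cd "$repo_dir" &&
        git pull &&
        Rscript "$script" &&
        git add -A &&
        git commit -m "Add results from cloud run" &&
        git push
    )
}

# Usage (placeholder path and script name):
#   run_job ~/my_project analysis.R
```

Note that git commit exits with an error when there is nothing new to commit, in which case the push is skipped.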

I'll be posting an example application that I have used this for, shortly.