Once you start working with APIs, you will find yourself working with keys and tokens and shared secrets (shhhhh!!!) and the like. You should never put these sorts of things directly in your code. The reason is quite simple: depending on the platform and its API, those alphanumeric identifiers may very well unlock access to your accounts or your data for anyone who comes into possession of them. Or, if not unlocking access to your data, they can enable others to use cloud-based resources (like the Google Cloud platform) that can cost you money!
R has a good way to drastically reduce the risk of this occurring. It’s up to you to take advantage of that mechanism, though.
R has a couple of “startup” files:
.Rprofilegets executed every time R starts up, so, if you always want to run some specific script, you can put it in the
.Renvironalso gets evaluated every time an R session starts, but its sole purpose is to set environment variables.
We’re not going to go into detail on
.Rprofile here, as it’s not used for the API key protection that we’re covering.
So, there’s a
.Renviron file. File that away for now.
These startup files can be located in three different locations, but only one version of any file will be read and used for any given R session. So, which one gets used? The file that gets used is the one that exists at the lowest level of the hierarchy. We’ll start from the top of the hierarchy.
R_HOME is the directory in which R is installed. Enter
R.home() to find out that location.
There may be cases where you want to put the
.Renviron file here…but generally not. It’s just cleaner to keep “the program itself” separate from “the configuration of the program.”
HOME is actually the user’s home directory. You can find out what your
HOME directory is with
Sys.getenv("HOME") (or, in RStudio, in the Files pane, click on the word Home at the top of the pane; you can get back to your working directory by selecting More > Go to Working Directory).
This is where we generally recommend that you locate your
We touched on this earlier, and it’s generally set as the current project’s directory. You can find the current working directory with
getwd(). If you have different credentials that you use with different projects, then you might want to place your
.Renviron file in this directory.
But, if you go this route, you can find yourself with a slew of
.Renviron files scattered across your system and, since only a single
.Renviron file can be used at a time, you may actually find yourself replicating some of the same auth credentials across multiple locations. Plus, if you’re being careful to not check these files in to any network locations, you can be in a world of hurt if you have a hardware failure, as you will have to recreate each individual
.Renviron file for each project before you are able to use it.
Technically, all environment variables set in
.Renviron are the same, in that they are simple key-value pair (which we’ll get to later). But, it is still useful to think of there being two types of variables:
GARGLE_EMAILfor authentication if they exist.
AW_CLIENT_SECRETfor authentication if they exist. And, many of the functions in the package will use
AW_REPORTSUITE_IDfor the company ID and report suite ID if they exist as environment variables (in both cases, though, these values can be specified or overridden within the functions themselves using the
GITHUB_PATenvironment variable as the auth token for accessing GitHub if it exists.
GA_VIEW_ID_Z. If you develop your scripts cleanly, you can have a single line of code to set the view ID to use throughout the script (e.g., “
ga_view <- Sys.getenv("GA_VIEW_ID_X")”–more on that specific syntax below) and then simply update that one line of code to reference the appropriate client whenever you want to use the script with a different account.
The structure of the
.Renviron file is quite simple. It’s just a text file with one row for each variable you want to define. As an example, the following
.Renviron file establishes two environment variables:
GA_VIEW_ID_Y. As we discussed above, the first of these is an environment variable that the googleAnalyticsR checks for when performing authentication, while the second is just a convenience variable that is the view ID for our client, “Y”:
# Google Analytics Values GAR_CLIENT_JSON = "/Users/gilligan/Documents/R Projects/ga-client.json" GA_VIEW_ID_Y = "ga:XXXX1474"
Note that you can also include comments in the
.Renviron file just as you would include comments in R script–by starting the line with a
Environment variables are accessed using the
Consider the above example. Let’s say that we’re writing a script that uses
googleAnaltyicsR. In the script, we could create an object called
ga_view_di that establishes what we’ll use for the
viewId argument throughout the function calls used in our script:
ga_view_id <- Sys.getenv("GA_VIEW_ID_Y")
We’ll cover Github later, but, if you’re using Github to manage your projects (and there are lots of reasons to do that), you do not want to include your
.Renviron file when you commit code to the repository. The way to bake this into your process is to add a couple of lines (one line, really, but comments are nice to use):
# Environment file .Renviron
.gitignore file specifies which files to not include when checking updates into a repository. You may also want to include data files in the
.gitignore file…but that’s a topic for another page.
This doesn’t mean that you shouldn’t be backing up your
.Renviron file(s). You absolutely should be! But, this is a key reason that we recommend using a single, master
.Renviron file that lives in your home directory (the second of the three options described above). While that directory may only contain locally-stored files, any time you update your
.Renviron, you can get in the habit of backing it up to a secure, but non-local location: be that Goole Drive, OneDrive, Dropbox, or even as a document within your password manager.