As I mentioned in my last post, I chose to look at the EPA Toxics Release Inventory this month. The full dataset was way too large for my little netbook to handle, so I filtered the results to my home state of Massachusetts and downloaded them in CSV format. I used this helpful PDF to identify the variables I was most interested in: year, total release of toxic substances, and whether the substance released was a carcinogen.
I just finished taking a short introductory R course via Coursera, so I thought I’d do my analysis in R. Consequently, there’s not much to it, and I’m honestly not sure what I have is reliable. But! I have something, which is progress.
I decided I wanted to plot the total releases in Massachusetts over time, as well as the total release of carcinogens. I’m not terribly familiar with time series. We used them a lot in neuroimaging analysis, but always with the a priori expectation that they’d fit a specific model, usually a hemodynamic response curve or a top hat function. So I’m just plotting these, not analyzing them. Here we go:
My expert opinion: pretty! Also, I wonder what’s up with 1997.
Code is here.
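If you'd rather not click through, the general shape of this kind of analysis (sum releases per year, once for everything and once for carcinogens only, then draw both lines) can be sketched in a few lines of base R. Treat this as an illustrative sketch, not the script linked above: the column names `YEAR`, `TOTAL_RELEASES`, and `CARCINOGEN`, the `"YES"` coding, and the file name are all placeholders I'm assuming for the example, and the real TRI headers may differ.

```r
# Sketch only: column names, the "YES" coding, and the file name below are
# placeholders, not the actual TRI headers.

# Sum releases per year, for all chemicals and for carcinogens only.
# Assumes at least one row is flagged as a carcinogen.
yearly_totals <- function(tri) {
  all_chem <- aggregate(TOTAL_RELEASES ~ YEAR, data = tri, FUN = sum)
  carc <- aggregate(TOTAL_RELEASES ~ YEAR,
                    data = tri[tri$CARCINOGEN == "YES", ],
                    FUN = sum)
  names(all_chem)[2] <- "all_releases"
  names(carc)[2] <- "carcinogen_releases"
  # all.x = TRUE keeps years with no carcinogen rows (they show up as NA)
  merge(all_chem, carc, by = "YEAR", all.x = TRUE)
}

# Draw both yearly series on one set of axes.
plot_totals <- function(totals) {
  plot(totals$YEAR, totals$all_releases, type = "l",
       xlab = "Year", ylab = "Total releases",
       ylim = c(0, max(totals$all_releases, na.rm = TRUE)))
  lines(totals$YEAR, totals$carcinogen_releases, lty = 2)
  legend("topright", legend = c("All chemicals", "Carcinogens"),
         lty = c(1, 2))
}

# Usage (hypothetical file name):
# tri <- read.csv("tri_massachusetts.csv")
# plot_totals(yearly_totals(tri))
```

Sticking to `aggregate` and `merge` keeps this to base R with no extra packages, which seemed right for a netbook. Rows with missing release values are dropped by `aggregate`'s default handling, so a sketch like this would quietly undercount any year with gaps.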