Code can be found here
The heroes we need and deserve.
Street trees provide huge benefits to cities, and so far have never lodged a complaint about anything. Here’s a brief list of benefits, as estimated by Dan Burden, with data from the US Forest Service. Just a heads up: the dollar figures used here are in 2006 US dollars.
- For an estimated planting and 3-year maintenance cost of between $250 and $600, a street tree provides up to $90,000 over its lifetime in direct benefits.
- Businesses on tree-lined streets showed 12% higher income streams, giving them a possible edge over non-treed competition.
- Houses in neighbourhoods with solid tree cover from urban street trees spent 15-35% less on energy bills.
- The article above mentions another 19 benefits!
Okay, go outside and hug your local tree right now. This page will still be here when you get back.
Here in Ottawa, over 150,000 trees have been inventoried, and put into this dataset. Let’s take a look!
Data Cleaning
The cleanup of this dataset looks like it will be relatively straightforward. After a brief glance, I note that several columns whose values should be numeric contain the string “None”. Another column, however, contains the string “N/A”. For the purpose of analysis in Python, not to mention if you ever want to normalize this dataset, this will need to be fixed.
For dealing with these null values, there are two routes we can choose. If I am certain of the maximum size of the number, I could assign a unique code (think 9999), that is higher than the maximum value, and fills the same amount of space. If I know the number can only be positive, I could also assign it a code like -1.
You can also set all of these values to a missing-value marker (NaN in a pandas dataframe, NA in R), which dataframes in Python or R will treat as “Not Available”, or blank values. For reading data to MySQL, we may need to work around this, and let it know that any “N/A” string is a blank, or null value.
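To make the two options concrete, here is a minimal sketch using pandas. The sample rows and the OBJECTID and HEIGHT column names are made up for illustration; only DBH is a column that comes up later in this post.

```python
import io
import pandas as pd

# Hypothetical sample rows. OBJECTID and HEIGHT are assumed column names.
raw = io.StringIO(
    "OBJECTID,DBH,HEIGHT\n"
    "1,35,None\n"
    "2,None,12\n"
    "3,20,N/A\n"
)

# Option 1: treat both placeholder strings as missing while parsing.
trees = pd.read_csv(raw, na_values=["None", "N/A"])

# Option 2: a sentinel code. -1 keeps the column integer-typed, but it has
# to be filtered out before any aggregation.
dbh_sentinel = trees["DBH"].fillna(-1).astype(int)

print(trees.dtypes)            # DBH and HEIGHT are parsed as float64
print(trees["DBH"].mean())     # 27.5, missing values are skipped
print(dbh_sentinel.mean())     # 18.0, the sentinel drags the mean down
```

The last two lines show why the missing-value route is usually safer for analysis: a sentinel silently skews every summary statistic unless you remember to exclude it.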
After spending a few frustrating hours trying to load this data into MySQL, I have realized how spoiled using SAS has made me. The point-and-click load for the dataset took almost an hour. If you want to load it via query (much faster, I will admit), you need to identify all the columns to load.
Now, there are workarounds to this, but I don’t want to use them, seeing as my table is relatively easy to load. However, if you were working with a CSV file with tens, hundreds, or thousands of columns, those workarounds would definitely be necessary.
Because I am only using SQL for verification purposes, I will work around my current issues by removing columns from the CSV datasets and loading the remaining columns manually.
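As a rough sketch of that kind of manual load, the snippet below keeps a chosen subset of CSV columns and stores the placeholder strings as SQL NULL. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The sample rows, and the OBJECTID and NOTES columns, are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV excerpt. ADDSTR and STREET appear in the dataset;
# OBJECTID and NOTES are made up for this example.
raw = io.StringIO(
    "OBJECTID,ADDSTR,STREET,NOTES\n"
    "1,101,BANK ST,n/a\n"
    "2,102,N/A,\n"
    "3,None,QUEEN ST,\n"
)

KEEP = ["OBJECTID", "ADDSTR", "STREET"]   # columns to load; NOTES is dropped
NULL_STRINGS = {"", "None", "N/A"}        # placeholders to store as NULL


def clean(value):
    """Map placeholder strings to None so they are stored as SQL NULL."""
    return None if value in NULL_STRINGS else value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trees (OBJECTID INTEGER, ADDSTR INTEGER, STREET TEXT)")

rows = [tuple(clean(rec[col]) for col in KEEP) for rec in csv.DictReader(raw)]
conn.executemany("INSERT INTO trees VALUES (?, ?, ?)", rows)

null_streets = conn.execute(
    "SELECT COUNT(*) FROM trees WHERE STREET IS NULL"
).fetchone()[0]
print(null_streets)  # 1
```

Listing the columns once in `KEEP` also scales to wide files, since the column list can be generated from the CSV header instead of typed by hand.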
Looking at the Tree-Inventory dataset, the first thing that jumped out at me was the missing street names. Though I am not sure how important these will be, I want to know if the ADDSTR column, which appears to have a unique identifying number for a street, can be joined with the Roads dataset to access all street names.
After loading the smaller datasets, I have used an inner join to keep only the unique road numbers listed on the street trees dataset. To verify that this returns accurate results, I can take a subset of the data (the records that list an actual street name in the STREET column) and compare it against the newly added ROAD_NAME_FULL column. If this subset all matches, then I would consider the remainder of the data accurate.
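A minimal sketch of that check, again using sqlite3 in place of MySQL. The table names and the rows are hypothetical; the column names ADDSTR, STREET, ROAD_ID and ROAD_NAME_FULL are the ones from the two datasets. The sample rows deliberately include one disagreement so the check has something to catch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trees (OBJECTID INTEGER, ADDSTR INTEGER, STREET TEXT);
CREATE TABLE roads (ROAD_ID INTEGER, ROAD_NAME_FULL TEXT);
INSERT INTO trees VALUES (1, 101, 'BANK ST'), (2, 101, NULL), (3, 102, 'QUEEN ST');
INSERT INTO roads VALUES (101, 'BANK ST'), (102, 'SMYTH RD');
""")

# Among trees that already carry a street name, count how often the joined
# road name agrees and how often it disagrees.
matches, mismatches = conn.execute("""
    SELECT SUM(t.STREET = r.ROAD_NAME_FULL),
           SUM(t.STREET <> r.ROAD_NAME_FULL)
    FROM trees AS t
    INNER JOIN roads AS r ON t.ADDSTR = r.ROAD_ID
    WHERE t.STREET IS NOT NULL
""").fetchone()

print(matches, mismatches)  # 1 1
```

Any non-zero mismatch count means the ID match alone cannot be trusted to supply names for the rows where STREET is missing.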
Well then. As you can see, the numbers of ADDSTR and ROAD_ID match up, but the names do not!
Guess I will need to try plan B. Note that we have over 150,000 records in the tree-inventory dataset, and many of them contain a real street name in the STREET column. Plan B is to extract all records that hold a valid street name, as well as the ADDSTR identifier number.
Next, I will rejoin this new table back to the original table, using a left outer join, to keep any records that can’t be joined in place.
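Plan B can be sketched in the same way. The table names, rows, and the STREET_NAME and STREET_FILLED column names are hypothetical. Taking `MIN(STREET)` per ADDSTR is my own simplification to guarantee one name per identifier, so that the left join cannot duplicate tree records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trees (OBJECTID INTEGER, ADDSTR INTEGER, STREET TEXT);
INSERT INTO trees VALUES
    (1, 101, 'BANK ST'),
    (2, 101, NULL),
    (3, 102, 'QUEEN ST'),
    (4, 103, NULL);

-- Lookup table: one street name per ADDSTR, built only from named rows.
CREATE TABLE street_lookup AS
    SELECT ADDSTR, MIN(STREET) AS STREET_NAME
    FROM trees
    WHERE STREET IS NOT NULL
    GROUP BY ADDSTR;
""")

# Left outer join keeps every tree, even when no name can be found.
filled = conn.execute("""
    SELECT t.OBJECTID, COALESCE(t.STREET, s.STREET_NAME) AS STREET_FILLED
    FROM trees AS t
    LEFT OUTER JOIN street_lookup AS s ON t.ADDSTR = s.ADDSTR
    ORDER BY t.OBJECTID
""").fetchall()

print(filled)
# [(1, 'BANK ST'), (2, 'BANK ST'), (3, 'QUEEN ST'), (4, None)]
```

Tree 2 picks up its name from tree 1, which shares the same ADDSTR, while tree 4 stays unnamed because no record with ADDSTR 103 has a street name.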
Hey, that worked!
This is obviously not perfect, as we still have not populated all of the street names, but it is a start. Checking the counts, we have 124,691 records returned where the new Street column is not null.
For the analysis, I will use the original file for most portions, but we may use this new dataset for some checks. Output to CSV and let’s move on.
Data Analysis
With relatively clean data, I can now proceed to analysing it.
First, let’s see what kind of trees, and how many, we have in Ottawa.
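A tally like this takes only a few lines. The sketch below assumes the species names have been read into a plain list; the values are made up.

```python
from collections import Counter

# Hypothetical species values, one entry per tree record.
species = ["Maple", "Ash", "Maple", "Elm", "Maple", "Ash"]

counts = Counter(species)
n_species = len(counts)           # how many distinct kinds of tree
top_two = counts.most_common(2)   # most frequent first

print(n_species)  # 3
print(top_two)    # [('Maple', 3), ('Ash', 2)]
```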
Interesting.
I didn't know half of these tree species existed. I probably wouldn't recognize them if I saw them. But it's neat to see just how many different types of trees we have growing here.
Next, let’s see how they are divvied up between wards. Fittingly, we will use a tree map to check this out. The tree map was created using a Python library called Squarify.
As we can see from this map, Ward 5 (West Carleton-March) has the lowest number of identified street trees. One would assume that a predominantly rural ward would be well treed, but it is also possible that most trees are on private lands, or on federally owned lands.
By contrast, the ward with the most identified street trees (14,297) is Alta Vista, a ward within the Greenbelt. If you've ever spent any time around Alta Vista, you may remember seeing a lot of trees. With smaller lot lines than a rural ward, and plenty of space close to public streets, it would make sense that a large share of the trees in the ward would be public trees.
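For anyone who wants to try a tree map of their own, here is a rough sketch of the general recipe, not the exact code behind the chart. The ward labels and counts are made up, and `draw_treemap` is a hypothetical helper name. It needs the squarify and matplotlib packages installed, which is why the imports live inside the function.

```python
from collections import Counter

# Hypothetical ward labels, one entry per tree record.
wards = ["Alta Vista"] * 5 + ["Other ward"] * 3 + ["Ward 5"] * 1

# Largest ward first, with the count shown under each name.
ordered = Counter(wards).most_common()
sizes = [n for _, n in ordered]
labels = [f"{ward}\n{n}" for ward, n in ordered]


def draw_treemap(sizes, labels, path):
    """Save a tree map with one rectangle per ward, sized by tree count."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt
    import squarify

    fig, ax = plt.subplots(figsize=(8, 5))
    squarify.plot(sizes=sizes, label=labels, ax=ax, alpha=0.8)
    ax.axis("off")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

Calling `draw_treemap(sizes, labels, "ward_treemap.png")` writes the image to disk.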
Alta Vista Drive, from Google Maps
One of Ottawa’s most prominent tree issues is bugs. Specifically, the Emerald Ash Borer, which likes to bore through our Ash trees (as the name implies). Another problem is Dutch Elm Disease, which was released by the Dutch in an effort to make their trees look better (I am joking, ha ha). This disease is actually a fungus, spread by elm bark beetles.
This dataset contains all of the trees that have been tagged for injection programs relating to each of these diseases. Let’s see how many.
EAB is for Ash trees in the EAB (Emerald Ash Borer) injection program.
DED1 and DED2 are for Elm trees in the DED (Dutch Elm Disease) injection program.
Is there any difference in the location of treated trees and non-treated trees?
Well, ward 5 looks like a serious outlier, having a percentage far higher than any other ward. However, the fact that ward 5 doesn’t even contain 100 public trees means that this ward is an outlier in more ways than one.
Ottawa's strategy to combat the spread of invasive species involves treatment of infected trees, and proactive removal of doomed trees, followed by replanting.
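One way to keep small wards from misleading you is to report the tree count next to the percentage. A minimal sketch, assuming each record carries a ward and a treatment code (EAB, DED1, DED2, or nothing). The rows and their layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical (ward, treatment) pairs. None means the tree is untreated.
rows = [
    ("Ward 5", "EAB"), ("Ward 5", None),
    ("Alta Vista", "EAB"), ("Alta Vista", "DED1"),
    ("Alta Vista", None), ("Alta Vista", None), ("Alta Vista", None),
    ("Alta Vista", None), ("Alta Vista", None), ("Alta Vista", None),
]

totals = defaultdict(int)
treated = defaultdict(int)
for ward, treatment in rows:
    totals[ward] += 1
    if treatment is not None:
        treated[ward] += 1

# Keep the denominator beside the percentage so tiny wards stand out.
summary = {
    ward: {"trees": totals[ward], "treated_pct": 100 * treated[ward] / totals[ward]}
    for ward in totals
}

for ward, stats in summary.items():
    print(ward, stats["trees"], stats["treated_pct"])
# Ward 5 2 50.0
# Alta Vista 8 25.0
```

In this toy data the smaller ward has the higher percentage, but with only two trees a single treated tree is enough to swing it.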
With such a strategy in place, there might be a difference in tree diameters between Ash, Elm, and all other public trees. I will use a KDE (Kernel Density Estimator) plot to better map out the density of tree diameters for each tree type (Ash, Elm, Other), by Ward.
In this chart, Ash Trees are represented by red, Elm trees by the purple lines, and all Other species of trees by the green.
What this chart reveals is that the typical diameter (DBH) of Ash and Elm trees seems to be much lower than that of all the other trees. Some part of this could be explained by natural tree growth. However, the fact that such a program exists would indicate that there is more to it than that.
Let’s go back to the ward differences for a second. Remember how one of the benefits of street trees was lowered anxiety and increased calm? I’m wondering if there is a way to actually verify this. If we had a database collecting people’s general feelings of stress and anxiety, perhaps we could perform some analyses on the aggregate data to see whether or not the number of trees in their ward has an impact on this level. But I digress.
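If you are curious what a KDE plot is actually computing, here is a hand-rolled sketch: every tree contributes a small Gaussian bump centred on its diameter, and the curve is the average of those bumps. This is an illustration of the idea rather than the plotting code behind the chart. The diameters are made up, `gaussian_kde` is a hypothetical helper, and the default bandwidth is a common normal-reference rule of thumb.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Rule-of-thumb bandwidth: 1.06 * standard deviation * n^(-1/5)
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one per sample, scaled to integrate to 1.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical diameters for two groups of trees.
ash_dbh = [10, 12, 15, 18, 20, 22, 25]
other_dbh = [20, 30, 35, 40, 45, 55, 60]

ash_density = gaussian_kde(ash_dbh)
other_density = gaussian_kde(other_dbh)

# Evaluate both curves on a grid from 0 to 100 and find where each peaks.
grid = [x * 0.5 for x in range(0, 201)]
ash_peak = max(grid, key=ash_density)
other_peak = max(grid, key=other_density)
print(ash_peak, other_peak)
```

A curve whose peak sits further left, like the Ash one in this toy data, means that group is dominated by thinner trees.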
Let’s take a gander at that roads dataset we spent all that time cleaning earlier. These are the 10 roads with the most trees.
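A ranking like this boils down to a GROUP BY with a LIMIT. Here is a minimal sketch in sqlite3, with a hypothetical table name, column name, and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trees_filled (OBJECTID INTEGER, STREET_FILLED TEXT);
INSERT INTO trees_filled VALUES
    (1, 'BANK ST'), (2, 'BANK ST'), (3, 'BANK ST'),
    (4, 'QUEEN ST'), (5, 'QUEEN ST'),
    (6, 'SMYTH RD'), (7, NULL);
""")

# Count trees per road, skip unnamed rows, and keep the ten largest.
top_roads = conn.execute("""
    SELECT STREET_FILLED, COUNT(*) AS n_trees
    FROM trees_filled
    WHERE STREET_FILLED IS NOT NULL
    GROUP BY STREET_FILLED
    ORDER BY n_trees DESC, STREET_FILLED
    LIMIT 10
""").fetchall()

print(top_roads)
# [('BANK ST', 3), ('QUEEN ST', 2), ('SMYTH RD', 1)]
```

Keep in mind that a raw count favours long roads; dividing by road length would give trees per kilometre instead, if the roads data includes a length.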
Obviously, most of these roads are quite long, and therefore it makes sense that they would have so many trees. Alta Vista, Bank Street, Smyth, and Pleasant Park also all run through the most treed ward, Alta Vista.
One pleasant surprise is Queen Street, however. A relatively short road straight through Ottawa's downtown core, it's nice to know that there is so much greenery growing on it, beneath the tall buildings.
Trees on Queen Street, from Google Maps