4 Throughputs

This chapter is very similar to the previous chapter (“Emissions”). If you haven’t read that one yet, it would probably be a good idea to read it first!

The first section in this chapter, Obtaining throughput data (below), is a bit of a digression. If you like, you can just skip ahead to Charting annual throughputs.

4.1 Obtaining throughput data

In the Emissions chapter, we had a nice “published” dataset ready at hand: BY2011_annual_emission_data.

Unfortunately, there are no analogously “published” area-source throughputs for BY2011. However, we can fairly easily (re-)produce CY1990-2030 area-source estimates from the information contained in the BY2011 DataBank t-tables, and then extract throughputs from those. That’s what we’ll do in the subsections below. As noted above, you can skip ahead to Charting annual throughputs and come back to this later, if you like.

4.1.1 Recreating BY2011 area-source projections

In the chunk of code below, we recreate the entire BY2011 area-source inventory, from CY1990 through CY2030. (It’s actually pretty fast.)

#
# Step 1: `BY_area_source_projections()` recreates an entire inventory. For
# each category, it assembles `tput_qty`, `ef_qty`, and `cf_qty`, and then
# calculates `ems_qty` from those.
#
# It can also do much more --- see `help("BY_area_source_projections")` to learn
# how you can "swap in" alternative throughputs, emission factors, etc.
#
# **NOTE:** For BY2011, in some cases, the calculated emission estimates may
# differ from the published emission estimates. See the Appendix for an
# explanation and a specific example.
# 
BY2011_area_source_projection_data <-
  BY(2011) %>%
  BY_area_source_projections(
    years = CY(1990:2030),
    verbose = TRUE)

head(BY2011_area_source_projection_data)
year    cat_id  cnty_abbr  tput_qty  tput_unit      pol_id  pol_abbr  ef_qty  ef_unit  cf_qty  ems_qty   ems_unit
CY1990  1576    CC         1727880   1000 CUBIC FT  990     TOG       0.134   lb/tput  0.28    32.41502  tons/yr
CY1990  1576    CC         1727880   1000 CUBIC FT  6970    CH4       0.030   lb/tput  1.00    25.91819  tons/yr
CY1991  1576    CC         3277889   1000 CUBIC FT  990     TOG       0.134   lb/tput  0.28    61.49320  tons/yr
CY1991  1576    CC         3277889   1000 CUBIC FT  6970    CH4       0.030   lb/tput  1.00    49.16833  tons/yr
CY1992  1576    CC         3138134   1000 CUBIC FT  990     TOG       0.134   lb/tput  0.28    58.87140  tons/yr
CY1992  1576    CC         3138134   1000 CUBIC FT  6970    CH4       0.030   lb/tput  1.00    47.07201  tons/yr

The resulting BY2011_area_source_projection_data now contains columns for:

  • ems_qty and ems_unit (emissions);
  • ef_qty (emission factors);
  • cf_qty (uncontrolled fractions); and
  • tput_qty and tput_unit (throughputs).
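
As a rough sanity check on how these columns relate, the sketch below recomputes ems_qty by hand for the first two rows of the head() output above. It assumes that emissions are the product of throughput, emission factor, and cf_qty, converted from pounds to tons at 2000 lb per ton. That conversion factor and the name `LB_PER_TON` are assumptions inferred from the units shown (lb/tput and tons/yr), not taken from the package itself.

```r
# Values copied from the first two rows of the head() output (CY1990, category 1576, county CC).
tput_qty <- 1727880                        # throughput, in "1000 CUBIC FT"
ef_qty   <- c(TOG = 0.134, CH4 = 0.030)    # emission factors, in lb/tput
cf_qty   <- c(TOG = 0.28,  CH4 = 1.00)     # cf_qty, as printed

# Assumption: ems_qty is in short tons, at 2000 lb per ton.
LB_PER_TON <- 2000

# Assumed relationship: emissions = throughput x emission factor x cf_qty.
ems_qty <- tput_qty * ef_qty * cf_qty / LB_PER_TON
print(ems_qty)
```

The results come out at roughly 32.415 (TOG) and 25.918 (CH4) tons/yr, within about 0.0001 of the printed ems_qty values. Small differences are plausible if the printed factors are rounded for display.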

4.1.2 De-duplicating recreated BY2011 throughputs

Next, we derive BY2011_area_source_throughput_data from BY2011_area_source_projection_data.

The fact that BY2011_area_source_projection_data contains tput_qty and tput_unit makes it almost suitable for use with chart_annual_throughputs_by(). However, BY2011_area_source_projection_data is in long (“tidy”) format, with one row per pollutant, which is what chart_annual_emissions_by() expects. As a result, each value in the tput_qty column is repeated once per pollutant. If we simply summed that column, we would overestimate throughput (by a factor of two for a category with two pollutants, for example).

We’ll use distinct() to solve the problem.

#
# We can use `distinct()` to de-duplicate throughputs.
# 
BY2011_area_source_throughput_data <-
  BY2011_area_source_projection_data %>%
  distinct(
    year, 
    cat_id,
    cnty_abbr,
    tput_qty,
    tput_unit)

Now we have a clean BY2011_area_source_throughput_data.
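
To make the over-counting concrete, here is a minimal sketch using a toy data frame built from the CY1990 and CY1991 rows of the head() output shown earlier. The object names (`toy`, `naive_total`, `deduped`, `deduped_total`) are illustrative only, and `year` is stored as plain text here, which may differ from how the package represents it.

```r
library(dplyr)

# Toy copy of four rows from the head() output: each throughput value
# appears once per pollutant (TOG and CH4).
toy <- data.frame(
  year      = c("CY1990", "CY1990", "CY1991", "CY1991"),
  cat_id    = 1576L,
  cnty_abbr = "CC",
  tput_qty  = c(1727880, 1727880, 3277889, 3277889),
  tput_unit = "1000 CUBIC FT",
  pol_abbr  = c("TOG", "CH4", "TOG", "CH4"),
  stringsAsFactors = FALSE)

# Summing the raw column counts every throughput twice.
naive_total <- sum(toy$tput_qty)

# Keeping only the distinct throughput rows removes the repeats.
deduped <- distinct(toy, year, cat_id, cnty_abbr, tput_qty, tput_unit)
deduped_total <- sum(deduped$tput_qty)

print(c(naive = naive_total, deduped = deduped_total))
```

Here the naive total is exactly twice the de-duplicated total, because each throughput is listed under two pollutants.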

Again, if you’re reading in throughput data from a .CSV or .XLSX file, you won’t have to worry at all about the steps above. Just skip ahead to the next section!

4.2 Charting annual throughputs

Let’s examine a category where the growth in throughput varies by county over time. A good example is category #761 Sanitary Sewers.

#
# Even though the relevant throughputs appear twice in these data --- once for
# `ROG` rows, and once for `CH4` --- `chart_annual_throughputs_by()` tries not
# to double-count. See `help("chart_annual_throughputs_by")` for details.
# 
BY2011_area_source_projection_data %>%
  filter_categories(
    "#761 Sanitary Sewers" = 761) %>%
  chart_annual_throughputs_by(
    flag_years = CY(2011),
    color = cnty_abbr,
    title = "Sanitary Sewers",
    subtitle = str_c(
      "Sanitary sewers (category #761) tracks with household population.",
      "The relevant growth profile (#657, in DataBank) also varies by county.",
      sep = "\n"))

As you can see, the syntax for chart_annual_throughputs_by() is identical to that for chart_annual_emissions_by().

Above, we’ve colored by cnty_abbr rather than cat_id. This often makes sense for throughput data, because different categories are often estimated in different throughput units. Displaying throughputs for several categories on one scale is only meaningful when their units match.
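
One way to check whether several categories can share a scale is to count the distinct units among them before charting. The sketch below uses a toy data frame. The pairing of category 761 with “1000 Pounds” is an assumption based on the unit mentioned at the end of this chapter, and `toy`, `units_in_chart`, and `same_scale_ok` are illustrative names.

```r
# Toy data: two categories, each with its own throughput unit.
# (The unit shown for category 761 is an assumption.)
toy <- data.frame(
  cat_id    = c(1576L, 1576L, 761L, 761L),
  tput_unit = c("1000 CUBIC FT", "1000 CUBIC FT", "1000 Pounds", "1000 Pounds"),
  stringsAsFactors = FALSE)

# Which units would end up on the same axis?
units_in_chart <- unique(toy$tput_unit)

# A shared scale is only meaningful when there is exactly one unit.
same_scale_ok <- length(units_in_chart) == 1
print(same_scale_ok)
```

With two different units present, the check comes back FALSE, which suggests charting these categories separately.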

If you use chart_annual_throughputs() and don’t supply color = cnty_abbr, you will see regional totals instead (see below).

#
# Same data and filter as above, but without `color = cnty_abbr`, so the
# chart shows regional totals. As before, the charting function tries not to
# double-count throughputs that are repeated across pollutant rows.
# 
BY2011_area_source_projection_data %>%
  filter_categories(
    "#761 Sanitary Sewers" = 761) %>%
  chart_annual_throughputs(
    flag_years = CY(2011),
    title = "Sanitary Sewers",
    subtitle = str_c(
      "Sanitary sewers (category #761) tracks with household population.",
      "The relevant growth profile (#657, in DataBank) also varies by county.",
      sep = "\n"))
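
For intuition about how a regional total relates to the county-level values, here is a minimal sketch that sums de-duplicated throughputs across counties for each year. This is an assumption about what “regional totals” means here, not a description of the charting function’s internals. All quantities and the county abbreviations other than CC are hypothetical placeholders.

```r
library(dplyr)

# Hypothetical, already de-duplicated throughputs for three counties.
toy <- data.frame(
  year      = rep(c("CY2010", "CY2011"), each = 3),
  cnty_abbr = rep(c("ALA", "CC", "SF"), times = 2),
  tput_qty  = c(10, 20, 30, 11, 22, 33),   # placeholder values
  stringsAsFactors = FALSE)

# Regional total per year: sum over counties.
regional <- toy %>%
  group_by(year) %>%
  summarise(tput_qty = sum(tput_qty), .groups = "drop")

print(regional)
```

With real data, the de-duplication step from section 4.1.2 would need to come first, so that throughputs repeated across pollutant rows are not summed more than once.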

What if we wanted to understand these throughputs in terms of percent change (%), rather than absolute units (“1000 Pounds”)? Let’s move on to the next two chapters, Relative Changes and Growth Profiles.