
DIY Big Year: A Geeky Look At Data

Oct 2, 2017 | by Greg Miller
Scenic Bixby Creek Bridge in Big Sur California – photo by Greg Miller

If you know me, then you know I like numbers. A lot. Actually, I love data. It can be so powerful. But it can also be misleading and confusing.

In an early DIY Big Year post I told you about some eBird data that I have been wrangling for over a year now. In this post I want to give you a peek at some different ways to look at the data I have collected.

Let’s have some fun with Top 10 lists. I have data rolled up for 299 Counties in the United States. The Counties are found all over. All 50 States are represented. Checklist data is from eBird (http://ebird.org) from 2006-2016 as of September 2016. Area and population data come from census.gov.

10 Largest Counties
Rank | County | Area (sq mi)
1 | San Bernardino County, CA | 20,057
2 | Coconino County, AZ | 18,619
3 | Nye County, NV | 18,182
4 | Kenai Peninsula County, AK | 16,075
5 | Mohave County, AZ | 13,311
6 | Inyo County, CA | 10,181
7 | Maricopa County, AZ | 9,200
8 | Pima County, AZ | 9,187
9 | Kern County, CA | 8,132
10 | Yavapai County, AZ | 8,124

10 Smallest Counties
Rank | County | Area (sq mi)
299 | New York County, NY | 23
298 | San Francisco County, CA | 47
297 | Suffolk County, MA | 58
296 | Richmond County, NY | 58
295 | Kings County, NY | 71
294 | Newport County, RI | 102
293 | Queens County, NY | 109
292 | Los Alamos County, NM | 109
291 | Clarke County, GA | 119
290 | Philadelphia County, PA | 134

10 Counties with Largest Population
Rank | County | 2016 Population Estimate
1 | Los Angeles County, CA | 10,137,915
2 | Cook County, IL | 5,203,499
3 | Harris County, TX | 4,589,928
4 | Maricopa County, AZ | 4,242,997
5 | San Diego County, CA | 3,317,749
6 | Orange County, CA | 3,172,532
7 | Miami-Dade County, FL | 2,712,945
8 | Kings County, NY | 2,629,150
9 | Dallas County, TX | 2,574,984
10 | Riverside County, CA | 2,387,741

10 Counties with Smallest Population
Rank | County | 2016 Population Estimate
299 | Cameron Parish, LA | 6,882
298 | Custer County, SD | 8,596
297 | Brewster County, TX | 9,200
296 | Mono County, CA | 13,981
295 | San Juan County, WA | 16,339
294 | Socorro County, NM | 17,027
293 | Mariposa County, CA | 17,410
292 | Inyo County, CA | 18,144
291 | Los Alamos County, NM | 18,147
290 | Teton County, WY | 23,191

10 Most Densely Populated Counties
Rank | County | Population per sq mi
1 | New York County, NY | 71,999
2 | Kings County, NY | 37,124
3 | Queens County, NY | 21,497
4 | San Francisco County, CA | 18,581
5 | Suffolk County, MA | 13,486
6 | Philadelphia County, PA | 11,692
7 | Richmond County, NY | 8,155
8 | Cook County, IL | 5,504
9 | Nassau County, NY | 4,782
10 | Bergen County, NJ | 4,031

10 Least Densely Populated Counties
Rank | County | Population per sq mi
299 | Brewster County, TX | 1.5
298 | Inyo County, CA | 1.8
297 | Nye County, NV | 2.4
296 | Socorro County, NM | 2.6
295 | Kenai Peninsula County, AK | 3.6
294 | Mono County, CA | 4.6
293 | Cameron Parish, LA | 5.4
292 | Custer County, SD | 5.5
291 | Teton County, WY | 5.8
290 | Coconino County, AZ | 7.6

10 Counties with Highest Number of Checklists
Rank | County | Total Checklists
1 | Los Angeles County, CA | 124,721
2 | Cook County, IL | 110,781
3 | Pima County, AZ | 104,968
4 | Tompkins County, NY | 89,995
5 | San Diego County, CA | 87,942
6 | Middlesex County, MA | 75,238
7 | King County, WA | 73,768
8 | Essex County, MA | 72,725
9 | Harris County, TX | 69,955
10 | St. Louis County, MN | 69,352

10 Counties with Lowest Number of Checklists
Rank | County | Total Checklists
299 | Custer County, SD | 2,034
298 | Hancock County, MS | 2,097
297 | Ward County, ND | 2,238
296 | Pulaski County, KY | 3,218
295 | Cass County, ND | 3,324
294 | Harrison County, MS | 3,527
293 | Dodge County, NE | 4,349
292 | Nye County, NV | 4,450
291 | Benton County, AR | 4,618
290 | Washington County, AR | 4,621

10 Counties with Highest Number of Species
Rank | County | Total Species
1 | Los Angeles County, CA | 494
2 | San Diego County, CA | 488
3 | Santa Barbara County, CA | 448
4 | Cochise County, AZ | 440
5 | San Francisco County, CA | 439
5 | Ventura County, CA | 439
7 | Cameron County, TX | 434
8 | Pima County, AZ | 431
9 | Orange County, CA | 429
10 | Humboldt County, CA | 428

10 Counties with Lowest Number of Species
Rank | County | Total Species
299 | Kauai County, HI | 141
298 | Hawaii County, HI | 155
297 | Honolulu County, HI | 171
296 | Spartanburg County, SC | 204
295 | Kanawha County, WV | 222
294 | Fulton County, GA | 231
293 | Chemung County, NY | 236
292 | Anchorage County, AK | 236
291 | Greenville County, SC | 237
290 | Herkimer County, NY | 241

10 Counties with Highest Number of Checklists per capita
Rank | County | Checklists per capita
1 | Brewster County, TX | 1.24
2 | Cameron Parish, LA | 1.09
3 | Los Alamos County, NM | 1.03
4 | Mariposa County, CA | 0.98
5 | Addison County, VT | 0.91
6 | San Juan County, WA | 0.89
7 | Tompkins County, NY | 0.86
8 | Santa Cruz County, AZ | 0.85
9 | Mono County, CA | 0.76
10 | Inyo County, CA | 0.75

10 Counties with Lowest Number of Checklists per capita
Rank | County | Checklists per capita
299 | Clark County, NV | 0.00602
298 | Dallas County, TX | 0.00734
297 | Shelby County, TN | 0.00743
296 | Wayne County, MI | 0.00849
295 | Queens County, NY | 0.00867
294 | Honolulu County, HI | 0.00958
293 | Broward County, FL | 0.00963
292 | Tarrant County, TX | 0.00980
291 | Tulsa County, OK | 0.01016
290 | Providence County, RI | 0.01097

 

There you have it—a preliminary look at the data I am using. The cross section of data is pretty diverse. It has highly populated areas as well as those that are sparsely populated. Some are large in area. Some are small.
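The density and per-capita figures in the lists above are simple ratios of the other columns. Here is a minimal sketch in Python, using values copied from those lists. The function names are made up for illustration and are not part of eBird or census.gov.

```python
def population_density(population: int, area_sq_mi: float) -> float:
    """People per square mile."""
    return population / area_sq_mi


def checklists_per_capita(total_checklists: int, population: int) -> float:
    """eBird checklists submitted per resident."""
    return total_checklists / population


# Inyo County, CA: 18,144 people spread over 10,181 sq mi
inyo_density = population_density(18_144, 10_181)
print(round(inyo_density, 1))  # 1.8, the value shown in the density list

# Los Angeles County, CA: 124,721 checklists for 10,137,915 people
la_per_capita = checklists_per_capita(124_721, 10_137_915)
print(round(la_per_capita, 4))  # 0.0123
```

Los Angeles County leads in raw checklists, yet on a per-person basis it sits near the bottom of the pack, which is one reason a single ranking can mislead.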

But all of these Counties have one thing in common: they have the most checklists submitted in the United States (or, for some States, the most checklists in the State; see the criteria in my previous posts).

Do the most populated Counties have the most checklists? Not always. Do the Counties with the most checklists have the highest number of species? Not always. Is there a significant correlation between population and total checklists submitted? Nope. How about between checklists submitted and number of species? No again.
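One common way to put a number on questions like these is the Pearson correlation coefficient. The sketch below only illustrates the calculation; it is not the analysis behind the answers above. It uses the five Counties that happen to appear in both the population and checklist lists in this post, a sample far too small and lopsided to stand in for all 299 Counties. The `pearson` helper is a hypothetical name.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (2016 population estimate, total checklists), copied from the lists above
counties = {
    "Los Angeles County, CA": (10_137_915, 124_721),
    "Cook County, IL": (5_203_499, 110_781),
    "Harris County, TX": (4_589_928, 69_955),
    "San Diego County, CA": (3_317_749, 87_942),
    "Custer County, SD": (8_596, 2_034),
}
populations = [pop for pop, _ in counties.values()]
checklists = [count for _, count in counties.values()]

r = pearson(populations, checklists)
print(f"r = {r:.2f} for this five-County sample")
```

A value near 1 or -1 means the two columns rise and fall together; a value near 0 means they do not. Because this sample is drawn from the extremes of the lists, its result says little about the full data set.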

So how does this all figure into planning a Big Year? I’m glad you asked. Because I am going to tell you anyway. We live in an age with dizzying amounts of data. Having data can be powerful. But only if you are able to harness the information to help you in a way that makes sense.

What am I talking about? Our goal for a Big Year is to see as many unique species in one calendar year as possible. And remember, I hope to make this affordable and efficient, too. A bigger population means a larger number of checklists submitted, but only up to a point. And a larger number of checklists means a larger number of species, but again, only up to a point.

All of the data above is interesting. But it does not yet answer the questions about where and when to go birding for the greatest number of unique species in the shortest amount of time.

The total number of species listed above is for the whole period from 2006-2016. It encompasses all seasons. So that number really is not a great indicator of where to go and a poor indicator of when to go.

There are 299 Counties. And each County has 4 weeks of data per month. A month is always 4 weeks in eBird. The first week is the 1st through the 7th. The second week is the 8th through the 14th. The third week is the 15th through the 21st. And the fourth (and last) week is the 22nd through the end of the month. So 299 Counties x 48 weeks of data gives one a large number of possibilities of where to go and when. In fact, that number of possibilities is 14,352. (Have I told you how much I like numbers?)
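The week scheme described above is easy to express in code. This is a minimal sketch in Python; the function and constant names are invented for illustration and do not come from eBird.

```python
from datetime import date

WEEKS_PER_MONTH = 4
COUNTIES = 299


def week_of_month(day: int) -> int:
    """Week 1-4: days 1-7, 8-14, 15-21, then 22 through the end of the month."""
    if not 1 <= day <= 31:
        raise ValueError(f"day out of range: {day}")
    return min((day - 1) // 7 + 1, WEEKS_PER_MONTH)


def week_of_year(d: date) -> int:
    """Week 1-48 under the four-weeks-per-month scheme."""
    return (d.month - 1) * WEEKS_PER_MONTH + week_of_month(d.day)


print(week_of_year(date(2016, 1, 1)))    # 1
print(week_of_year(date(2016, 5, 31)))   # 20 (the 29th-31st fold into week 4)
print(week_of_year(date(2016, 12, 22)))  # 48
print(COUNTIES * 12 * WEEKS_PER_MONTH)   # 14352 County-week combinations
```

Folding the last days of the month into the fourth week is what keeps every year at exactly 48 weeks, so the same week number always lines up with the same calendar dates.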

Now you may be asking, out of 14,352 possibilities, where does one even begin to guess where to go and when? Oh, you thought that was complicated. Throw 984 species into the mix. Yep. You will need more than a hand calculator. You could do it in a spreadsheet if you could build a pivot table of 7 million rows and then sort it. Good luck with that.

A database is a perfect solution. It is unparalleled in its power to perform on problems like this. It can make calculations on mind-boggling amounts of data and retrieve the information in a matter of seconds (OK, minutes for some of our questions). And this is what I did. I took all those spreadsheets of downloaded data and loaded them into a database. (I used SQL Server Express.)
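To make the idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for SQL Server Express. The table name, the column names, and the handful of rows are invented purely for illustration; they are not the schema or the data behind this post. The query counts distinct species for each County and week, the kind of roll-up that gets unwieldy in a spreadsheet.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sightings (
        county  TEXT    NOT NULL,
        week    INTEGER NOT NULL,  -- 1..48, four weeks per month
        species TEXT    NOT NULL
    )
    """
)

# Placeholder rows, made up for this example (not real eBird records).
rows = [
    ("Pima County, AZ", 17, "Gambel's Quail"),
    ("Pima County, AZ", 17, "Cactus Wren"),
    ("Pima County, AZ", 17, "Cactus Wren"),  # repeat sighting, counted once
    ("Pima County, AZ", 17, "Elf Owl"),
    ("Cameron County, TX", 17, "Green Jay"),
    ("Cameron County, TX", 17, "Plain Chachalaca"),
    ("Cameron County, TX", 2, "Green Jay"),
]
conn.executemany("INSERT INTO sightings VALUES (?, ?, ?)", rows)

# Distinct species per County-week, best combinations first.
query = """
    SELECT county, week, COUNT(DISTINCT species) AS species_count
    FROM sightings
    GROUP BY county, week
    ORDER BY species_count DESC, county, week
"""
results = conn.execute(query).fetchall()
for county, week, species_count in results:
    print(f"{county}, week {week}: {species_count} species")
```

The same `GROUP BY` pattern works whether the table holds seven rows or seven million; the database engine does the sorting and counting that a pivot table would struggle with.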

In my next post I will tell you how I used this data to arrive at some answers to the questions of the best places to go at the best times of year to maximize the total number of species on each trip. You won’t want to miss that one!
