Sifting through data on 2.4bn rides for insights

As the number of jurisdictions regulating the ride-hailing industry grows, a natural question for investors is what effect, if any, this will have on the use of ride-hailing apps.

Almost surely, new rules and fees will raise the price of rides, so this comes down to an elasticity calculation – how sensitive is ride-hailing demand to changes in price? Adding to the relevance of this question is the limited ability of the biggest ride-hailing companies to continue to absorb losses now that they are public.
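As an illustration of the kind of elasticity calculation described above, the following is a minimal sketch of a midpoint (arc) price elasticity of demand. The function name and the numbers are hypothetical and are not taken from the TLC data or from our analysis; they only show the arithmetic.

```python
def arc_elasticity(q0, q1, p0, p1):
    """Midpoint (arc) price elasticity of demand between two observations.

    q0, q1: quantity demanded (e.g. rides) before and after the price change
    p0, p1: price before and after the change
    """
    pct_change_q = (q1 - q0) / ((q0 + q1) / 2)
    pct_change_p = (p1 - p0) / ((p0 + p1) / 2)
    return pct_change_q / pct_change_p


# Hypothetical example: the average fare rises from 10.00 to 11.00
# and ride volume falls from 1,000 to 900.
elasticity = arc_elasticity(q0=1000, q1=900, p0=10.0, p1=11.0)
print(round(elasticity, 2))  # -1.11
```

A value below -1, as in this made-up example, would mean demand falls proportionally more than price rises; a value between -1 and 0 would mean demand is relatively insensitive to price.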

To provide insight into these issues, our Investment Sciences team downloaded data from the NYC Taxi and Limousine Commission (TLC) on circa 2.4 billion rides, taken from 2010 to 2019, across all major providers of taxi services. The TLC is the agency responsible for licensing and regulating New York City's medallion (yellow) taxis, street hail livery (green) taxis, for-hire vehicles, commuter vans and paratransit (wheelchair-accessible) vehicles.

The TLC collects trip record information for each taxi and for-hire vehicle trip completed by TLC-licensed drivers and vehicles. Broadly speaking, the data cover pick-up and, more recently, drop-off location; duration of the ride; fare (for yellow and green taxis only); and whether or not the ride was shared (for for-hire vehicles only, capturing pooled versus individual rides).

To understand the drivers of ridership across different parts of the city, we merged the TLC data with demographic data from the US Census, the IRS and New York City Open Data. This required mapping pick-up and drop-off locations into the Neighborhood Tabulation Areas (NTAs) used by New York City, which are built from the census tracts defined by the US Census Bureau and which can themselves be mapped to zip codes.
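The merge described above can be pictured as a chain of keyed lookups. The sketch below is illustrative only: the NTA codes, zip codes, field names and values are all made up, and it assumes rides have already been aggregated to the NTA level and that each NTA maps to a single zip code, which is a simplification.

```python
# Hypothetical, made-up inputs; these are not TLC, Census or IRS schemas.
rides_by_nta = {"NTA_A": 12000, "NTA_B": 4500}
population_by_nta = {"NTA_A": 40000, "NTA_B": 30000}
nta_to_zip = {"NTA_A": "10001", "NTA_B": "10002"}
income_by_zip = {"10001": 85000, "10002": 52000}


def build_features(rides, population, nta_zip, income):
    """Join ride counts to population (keyed on NTA) and income (keyed on zip)."""
    rows = []
    for nta, n_rides in sorted(rides.items()):
        pop = population.get(nta)
        zip_code = nta_zip.get(nta)
        rows.append({
            "nta": nta,
            "rides": n_rides,
            "population": pop,
            # Leave the ratio empty when no population is recorded for the area.
            "rides_per_capita": n_rides / pop if pop else None,
            "income": income.get(zip_code),
        })
    return rows


features = build_features(rides_by_nta, population_by_nta, nta_to_zip, income_by_zip)
for row in features:
    print(row)
```

Keeping the lookups separate mirrors the fact that the population and income sources are keyed on different geographies.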

The result is a comprehensive ride-hailing dataset that can be used to answer important questions about how the introduction of app-based ride hailing has changed transportation in New York City.

Some key notes about working with the data:

  • Some observations prior to 2018 are missing values for pick-up and drop-off locations. To avoid introducing any bias related to this, we use observations from 2018 forward for most of our analysis.
  • The locations of pick-ups and drop-offs are provided as latitude/longitude coordinates prior to 2015, but in later data are coded to one of 263 taxi zones.
    • In order to add information about the populations of these areas, we mapped them as closely as possible to Neighborhood Tabulation Areas (NTAs), which are aggregations of census tracts used by New York City government in providing population aggregates on its Open Data portal. We also mapped NTAs and taxi zones to zip code boundaries to join other data, such as income data from the IRS.
    • We started by aggregating ride volumes at the NTA level (we use neighbourhood and NTA interchangeably in this text). Because the boundaries of NTAs and taxi zones do not match exactly, for each NTA we first summed all the rides for the taxi zones that lie entirely within that NTA. Then, for any taxi zone that spans more than one NTA, we split its rides between the overlapping NTAs in proportion to the amount of overlap.
    • Some taxi zones do not correspond to population centers, and therefore do not have meaningful population features. These areas include: JFK and LaGuardia airports, parks, cemeteries, Rikers Island, and rides that originated or concluded outside New York City.
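The allocation of taxi-zone rides to NTAs described in the notes above can be sketched as follows. This is a minimal illustration rather than our production code: the zone and NTA names and the ride counts are made up, and it assumes the "amount of overlap" has already been computed as the share of each taxi zone falling in each NTA (summing to 1 per zone). Zones with no NTA mapping, such as airports, simply drop out.

```python
from collections import defaultdict

# Hypothetical inputs. overlap_share[zone][nta] is the assumed fraction of
# the zone that lies inside that NTA.
rides_by_zone = {"zone_1": 1000, "zone_2": 600, "zone_airport": 300}
overlap_share = {
    "zone_1": {"NTA_A": 1.0},                 # entirely within one NTA
    "zone_2": {"NTA_A": 0.25, "NTA_B": 0.75},  # spans two NTAs
    # "zone_airport" has no entry: it does not correspond to a population centre
}


def allocate_rides(rides_by_zone, overlap_share):
    """Split each zone's rides across NTAs in proportion to the overlap share."""
    totals = defaultdict(float)
    for zone, rides in rides_by_zone.items():
        for nta, share in overlap_share.get(zone, {}).items():
            totals[nta] += rides * share
    return dict(totals)


nta_rides = allocate_rides(rides_by_zone, overlap_share)
print(nta_rides)  # {'NTA_A': 1150.0, 'NTA_B': 450.0}
```

A zone wholly inside one NTA is just the special case of a share of 1.0, so the same loop handles both steps.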


About the analysts

Jeff Meli is Head of Research within the Investment Bank at Barclays. Jeff joined Barclays in 2005 as Head of US Credit Strategy Research. He later became Head of Credit Research. He was most recently Co-Head of FICC Research and Co-Head of Research before being named Head of Research globally. Previously, he worked at Deutsche Bank and JP Morgan, with a focus on structured credit. Jeff has a PhD in Finance from the University of Chicago and an AB in Mathematics from Princeton.

Adam Kelleher is Head of Research Data Science at Barclays, based in New York. He is responsible for developing alternative data capabilities for research. Adam joined Barclays from BuzzFeed in May 2018. His previous work focused on large-scale machine learning for content recommendation and observational causal inference to guide data analytics.

He was recognised as one of Fast Company’s “Most Creative People” for his work on the POUND project and was an early advocate of causal inference in data science, which he now teaches at Columbia University’s Data Science Institute. He received his PhD in theoretical physics from The University of North Carolina at Chapel Hill in 2013.

Ryan Preclaw is the head of Investment Sciences, a group that creates investment insights by combining alternative data, data science, and traditional research. Previously, he was a Director in Credit Strategy, where he focused on special situations, event-driven strategies, and industries facing fundamental transitions.

Ryan has also worked as a coverage banker in Barclays' Communications and Media group. Prior to joining Barclays, Ryan worked as an economist at NERA Economic Consulting and London Economics International. Ryan received his M.B.A. from the University of Chicago in 2008, his M.A. from Western University in 2001, and his B.A. from the University of Alberta in 2000.
