Theory/How far do we drive?/6. Methodology


To characterize the driving behavior of the U.S. population, the 2009 National Household Travel Survey dataset (released Jan. 2010) was used. The National Household Travel Survey (NHTS) provides information to assist transportation planners and others who need comprehensive data on travel and transportation patterns in the United States. Between March 2008 and May 2009, 150,147 households completed the NHTS and contributed data on four levels: household, person, vehicle and travel day. The data is organized in four corresponding tables, although key variables occur in multiple tables to simplify analysis. Especially important to this study are the travel day and person tables, which include the distance covered on trips and the distance people commute to work.

The NHTS data was collected with Computer Assisted Telephone Interviewing (CATI) technology that randomly dialed telephone numbers from a list of registered landline phone numbers. The computer also assigned each responding household a specific date as its ‘Travel Day’, on which the household members had to report all trips made using any mode of transportation. Participants were asked to record trips in a Travel Diary that was sent to the household prior to its Travel Day. Each household received a reminder call the day before the assigned Travel Day. More information about the survey process can be found in the 2009 NHTS User’s Guide (NHTS, 2010).

The dataset was downloaded from the NHTS website and imported into SPSS 18 (IBM, 2011). Before the analyses were run, the data tables were preprocessed as described in the following paragraph.

Data description & preprocessing

The two primary NHTS tables used in this study are the Travel Day Dataset and the Person Dataset. A schematic overview of the preprocessing steps is shown in Figure 10:


Figure 10: Schematic overview of data selection and preparation before analyses were run on distances driven.

In total, 150,147 households comprising 308,901 people participated in the survey. Together they owned 294,409 cars, or 1.96 cars per household. Of these, 179,484 cars (61%) were actually used on the ‘Travel Day’, for a total of 748,918 individual trips.
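
The two ratios quoted above follow directly from the reported counts. As a quick check, a minimal sketch in Python (the variable names are chosen here for illustration and are not NHTS variable names):

```python
# Counts as reported in the text above.
households = 150_147
cars_owned = 294_409
cars_used_on_travel_day = 179_484

# Average number of cars per participating household.
cars_per_household = cars_owned / households

# Share of owned cars that were driven on the assigned travel day.
share_of_cars_used = cars_used_on_travel_day / cars_owned

print(f"{cars_per_household:.2f} cars per household")
print(f"{share_of_cars_used:.0%} of cars used on the travel day")
```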

The Travel Day and Person datasets contain 103 and 111 variables respectively, of which 19 and 6 were used for analyses. The selected variable names and descriptions are shown in the tables below. TDCASEID, HOUSEID, PERSONID and VEHID are the identifiers for trips, households, persons and vehicles.

Travel Day Dataset:

Variable    Description
TDCASEID    Trip Number
HOUSEID     HH eight-digit ID number
HHVEHCNT    Count of HH vehicles
HHSTATE*    State HH location
MSACAT*     MSA category for the HH home address
URBRUR*     Household in urban/rural area
URBANSIZE*  Size of urban area in which home address is located
URBAN*      Home address in urbanized area
PERSONID    Person ID number
VEHID       Vehicle ID number
TRIPPURP*   General Trip Purpose
TRAVDAY*    Travel day - day of week
TRPTRANS    Transportation mode used on trip
WHYFROM*    Trip purpose for previous trip
WHYTO*      Travel day purpose of trip
WHYTRP1S*   Trip purpose summary
PUBTRANS    Respondent Used Public Transportation on trip
STRTTIME    Trip START time in military
ENDTIME     Trip END time in military
TRPMILES    Calculated Trip distance converted into miles
WTTRDFIN    Final travel day weight

Person Dataset:

Variable    Description
HOUSEID     HH eight-digit ID number
PERSONID    Person ID number
TIMETOWK    Minutes to go from home to work last week
WRKTRANS    Transportation mode to work last week
HH_RACE*    Race of HH respondent
WTPERFIN    Final person weight

Independent control variables
Some control variables were identified and their effect on the dependent variables was quantified. For example, the average trip distance differs significantly between households in urban and rural areas. Similarly, car usage patterns depend strongly on whether the travel day falls on a weekday or on the weekend.

Weighting factors
The weighting factors WTTRDFIN, WTHHFIN and WTPERFIN correct for sample representativeness in the Travel Day, Household and Person datasets respectively. Cases that are rare in the dataset relative to their occurrence in the Census are weighted more heavily than cases that are overrepresented in the dataset. For example, wealthier families may be less likely to take part in the NHTS and could therefore be underrepresented. To compensate for this, participating wealthy families are given a higher WTHHFIN than others. Similarly, WTPERFIN in the Person dataset may compensate for an age group that is underrepresented. For all weighting factors, Census data is used as the baseline.
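
To illustrate how such a weight changes an estimate, here is a minimal sketch in Python. The trip distances and weights are made-up illustrative values, not NHTS data, and `weighted_mean` is a hypothetical helper; the study itself applied the weights in SPSS.

```python
def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Illustrative values only (not NHTS data): trip distances in miles and
# the weight attached to each trip, in the role WTTRDFIN plays.
trip_miles = [2.0, 10.0, 40.0]
trip_weights = [3000.0, 1000.0, 1000.0]

unweighted = sum(trip_miles) / len(trip_miles)
weighted = weighted_mean(trip_miles, trip_weights)

print(f"unweighted mean: {unweighted:.1f} miles")
print(f"weighted mean:   {weighted:.1f} miles")
```

In this toy case the short trip stands for more trips in the population than the long ones, so the weighted mean is lower than the unweighted mean.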


Cars per household
As was shown in Paragraph 3.2, the large majority of Mini-E drivers (94%) used their gasoline car when ‘EV limitations’ had to be overcome. With 100% of the participants agreeing that “EVs are suitable for daily use”, it seems that they learned to deal with the vehicle’s range limitations by simply taking their gasoline car for longer trips. Thus, an important factor in the integration of EVs is the presence of an alternative conventional car in the household that can be used for long-range trips. It is hypothesized that adopting EVs with a range below 100 miles is more difficult in households that do not also own one or more gasoline cars.
So how many cars do U.S. households typically own? This was covered by the NHTS 2009. In the Vehicles dataset, cars, SUVs, vans and pickup trucks were counted per household ID and weighted with WTHHFIN. The number of households without cars was calculated as the total number of households included in the NHTS minus the number of unique household IDs found in the Vehicles dataset (households that owned at least one car). The results were plotted in a pie chart.
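
The counting step can be sketched as follows. This is a minimal illustration in Python with made-up records, not the SPSS procedure used in the study; the vehicle type labels and the `COUNTED_TYPES` set are assumptions for the example.

```python
from collections import Counter

# Toy data, not NHTS records.
# HOUSEID -> household weight (the role WTHHFIN plays).
household_weights = {"H1": 100.0, "H2": 250.0, "H3": 150.0, "H4": 500.0}

# One row per vehicle: (HOUSEID, vehicle type). Type labels are hypothetical.
vehicle_rows = [
    ("H1", "car"), ("H1", "SUV"), ("H1", "motorcycle"),
    ("H2", "pickup"),
    ("H4", "car"), ("H4", "van"), ("H4", "car"),
]

# Only these vehicle types are counted, following the text.
COUNTED_TYPES = {"car", "SUV", "van", "pickup"}

cars_per_household = Counter(
    house_id for house_id, vehicle_type in vehicle_rows
    if vehicle_type in COUNTED_TYPES
)

# Households without any vehicle row are the zero-car households.
weight_by_car_count = Counter()
for house_id, weight in household_weights.items():
    weight_by_car_count[cars_per_household.get(house_id, 0)] += weight

total_weight = sum(household_weights.values())
share_by_car_count = {
    n_cars: weight / total_weight
    for n_cars, weight in sorted(weight_by_car_count.items())
}

for n_cars, share in share_by_car_count.items():
    print(f"{n_cars} cars: {share:.0%} of households")
```

The resulting shares are the slices that a pie chart of cars per household would show.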

Driven distance per vehicle
To find the driven distance per vehicle, some entries in the travel day table had to be refined. Since every person-trip is listed in that table, duplicate entries occur when two or more household members were on a trip in the same vehicle. Only one of the duplicates was kept in the dataset; the others were deleted.
Then, the entries with the same vehicle ID were aggregated (summed) on trip distance to find the daily driven distance per car. The daily total matters because it is very likely that cars will only be charged at home, at least until EV charging infrastructure is more established at destinations like workplaces and grocery stores. Because a significant difference in driven distance was found between urban and rural households, results are displayed for both groups separately.
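
A minimal sketch of these two steps in Python, using made-up records. It assumes that duplicate person-trips can be recognized by an identical household, vehicle, start time and end time, and that a vehicle is identified by the combination of HOUSEID and VEHID; the text does not state the exact rule applied in SPSS.

```python
from collections import defaultdict

# Toy person-trip records, not NHTS data:
# (HOUSEID, PERSONID, VEHID, STRTTIME, ENDTIME, TRPMILES)
person_trips = [
    ("H1", 1, 1, "0710", "0745", 12.0),
    ("H1", 2, 1, "0710", "0745", 12.0),  # passenger on the same vehicle trip
    ("H1", 1, 1, "1700", "1735", 12.5),
    ("H1", 2, 2, "0900", "0915", 3.0),
    ("H2", 1, 1, "0800", "0830", 20.0),
]

seen_vehicle_trips = set()
daily_miles = defaultdict(float)  # (HOUSEID, VEHID) -> miles on the travel day

for house_id, person_id, veh_id, start, end, miles in person_trips:
    # Assumed duplicate rule: same household, vehicle, start and end time.
    trip_key = (house_id, veh_id, start, end)
    if trip_key in seen_vehicle_trips:
        continue  # skip the extra entry from another occupant
    seen_vehicle_trips.add(trip_key)
    daily_miles[(house_id, veh_id)] += miles

for vehicle, miles in sorted(daily_miles.items()):
    print(vehicle, f"{miles:.1f} miles")
```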

Commuting distance
Because daily commutes cover a fixed distance, and because charging stations are likely to be installed at employee parking places, it is useful to know the distribution of commuting distances. Of all people surveyed, 106,681 commute to work by car. The distribution is derived from the variable DISTTOWK in the Person dataset.
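
One way to summarize such a distribution is the share of commuters whose one-way distance falls within a given limit. The sketch below is a hedged illustration in Python: the records and the distance thresholds are made up, and applying the person weight WTPERFIN at this step is an assumption rather than something the text states.

```python
# Toy commuter records, not NHTS data: (DISTTOWK in miles, person weight).
commuters = [
    (3.0, 1200.0),
    (8.0, 800.0),
    (15.0, 1000.0),
    (42.0, 500.0),
    (60.0, 500.0),
]

# Hypothetical distance limits in miles, chosen for illustration only.
thresholds = [5, 10, 20, 50]

total_weight = sum(weight for _, weight in commuters)

# Weighted share of commuters with a commute no longer than each limit.
share_within = {
    limit: sum(w for dist, w in commuters if dist <= limit) / total_weight
    for limit in thresholds
}

for limit, share in share_within.items():
    print(f"within {limit} miles: {share:.1%}")
```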

Car usage patterns
The variables STRTTIME, ENDTIME and TRAVDAY were used to create a car usage distribution. For each vehicle, 96 time bins of 15 minutes were made, with a ‘1’ denoting that the vehicle was used at any time during that 15-minute increment. The resulting usage patterns can help electric grid planners anticipate the rise in system load caused by electric vehicles. Then, Taylor polynomials were fit to the curves.
Because usage patterns clearly differ between weekdays and weekends, the analysis was done separately for 1) weekdays and 2) weekends. In terms of data management, some things had to be taken into account before the graphs could be produced:

  • The total number of cars is defined as: the sum of HHVEHCNT (Household Vehicle Count) for households that had their travel day in one of the above categories. This includes cars of households that did not use any of their cars on the travel day. This number is determined from the HH dataset.
  • 96 bins were created, one for every 15-min time interval, coded as fractions of hours: ‘bin_2375’ stands for the time interval 11:45-11:59pm. The bins were applied to the Travel Day dataset, where only trips with cars were selected (N=748,918). A bin has a value of 1 when a car trip took place at any point in time during that 15-min interval. For example, if a car trip has STRTTIME = 7:10 and ENDTIME = 7:45, bin_0700, bin_0725, bin_0750 and bin_0775 will have value = 1. Because participants are likely to round the start and end times of their trips, the car usage pattern is likely to be somewhat overestimated.
  • In up to 10% of travel days where cars were used, a single car is used in two consecutive trips that end and begin in the same 15-min bin (for instance when somebody is dropped off at the train station). Because these are two separate trips, this would lead to a ‘double count’. This was prevented by counting only unique vehicle IDs for each bin.
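
The binning and the unique-vehicle count described in the list above can be sketched as follows. This is a minimal illustration in Python with made-up trips, not the SPSS syntax used in the study. It assumes times are given as four-digit military-time strings, that no trip crosses midnight, and that a vehicle is identified by HOUSEID plus VEHID; the helper names are hypothetical.

```python
def to_minutes(hhmm):
    """Convert an 'HHMM' military-time string to minutes after midnight."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])


def bin_label(index):
    """Label for 15-minute bin `index` (0..95), coded as fractions of hours."""
    hour, quarter = divmod(index, 4)
    return f"bin_{hour:02d}{quarter * 25:02d}"


def bins_for_trip(start, end):
    """All bins in which the trip was under way, end bin included."""
    first = to_minutes(start) // 15
    last = to_minutes(end) // 15
    return [bin_label(i) for i in range(first, last + 1)]


# Toy car trips, not NHTS data: (HOUSEID, VEHID, STRTTIME, ENDTIME)
car_trips = [
    ("H1", 1, "0710", "0745"),
    ("H1", 1, "0750", "0810"),  # same car, starts in the bin the first trip ended in
    ("H2", 1, "0740", "0755"),
]

# Collect vehicles per bin in a set so each car counts once per bin.
vehicles_in_bin = {}
for house_id, veh_id, start, end in car_trips:
    for label in bins_for_trip(start, end):
        vehicles_in_bin.setdefault(label, set()).add((house_id, veh_id))

cars_in_use = {label: len(v) for label, v in sorted(vehicles_in_bin.items())}
print(cars_in_use)
```

Dividing each bin count by the total number of cars (the sum of HHVEHCNT for the selected households) would give the fraction of cars in use during that interval.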

Results can be found on the next page!



U.S. Department of Transportation. (2010). National Household Travel Survey of 2009. 2009 NHTS. Retrieved August 2011, from

U.S. Department of Transportation. (2011). National Household Travel Survey User’s Guide. Transportation (Vol. 2011). Retrieved from

IBM Analytics - SPSS. (2010). SPSS. Retrieved from


© 2017