Chicago bike crash data

Since the data comes from IDOT, questions and other resources posted here will probably be relevant to the entire state and not just Chicago.

My take on the data

in progress

The information in the data is relevant to many kinds of analysis, and the database is well structured. The data is divided into three tables: CRASH (the occurrence itself), VEHICLES (the vehicles involved), and PERSONS (the people involved).

What the data does poorly is accommodate the inherent differences of a crash involving a bicyclist. Bicycles do not appear in the data as vehicles. Instead of noting whether bicyclists carried legally required lighting devices, the data notes whether they were wearing “contrasting clothing.” The data also does not include dooring (although this will change for 2011 data).

Selected statistics

  • In 2007-2009, 1,239 of 4,931 crashes occurred during DUSK or DARKNESS periods, as coded in the IDOT data.


  • Is it important for police to note if a pedestrian or pedalcyclist involved in a crash was wearing contrasting clothing? What guidance do police officers investigating crashes and writing reports have on determining the status of the clothing? At least for pedalcyclist crashes, I believe it's more important to note whether or not the pedalcyclist was in compliance with local laws on bicycle lighting.
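
As a quick check on the dusk/darkness statistic in the list above, the share of crashes can be worked out from the two counts quoted there. A minimal sketch in Python; the variable names are mine, and only the two numbers come from the data:

```python
# Counts quoted in the dusk/darkness statistic (2007-2009).
dusk_or_dark = 1239
total_crashes = 4931

# Share of crashes coded as DUSK or DARKNESS.
share = dusk_or_dark / total_crashes
print(f"{share:.1%}")  # prints 25.1%
```

So roughly one in four of these crashes was coded as happening at dusk or in darkness.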

Most important crash data fields

Crash table

  • Casenum, f1
  • Crash year, f4
  • Crash month, f5
  • Crash day, f6
  • Day of week, f8 (although this can easily be calculated from the crash date, for example with PHP)
  • Collision type code, f13 (this is what defines pedestrian, pedalcyclist, rear end collision)
  • Total killed, f14
  • No injuries, f16
  • Traffic control device code, f26
  • Road surface condition code, f27
  • Road defects code, f28
  • Light condition code, f29
  • Weather code, f30
  • Cause 1 code, f31
  • Cause 2 code, f32
  • Time of crash, f34
  • Traffic control condition code, f35
  • Intersection related, f36 (boolean)
  • Hit and run, f37 (boolean)
  • Crash latitude, f47
  • Crash longitude, f48
  • Property description 1, f69
  • Property description 2, f70
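
The day of week field (f8) is redundant with crash year, month, and day (f4-f6), so it could be derived instead of stored. The page mentions PHP for this; the sketch below shows the same idea in Python. The function name is hypothetical, and how IDOT itself codes f8 (names or numbers) is not known here.

```python
from datetime import date

# English weekday names, indexed the way date.weekday() counts (Monday = 0).
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def day_of_week(year, month, day):
    """Return the weekday name for a crash date built from f4, f5, and f6."""
    return WEEKDAYS[date(year, month, day).weekday()]

print(day_of_week(2009, 7, 4))  # prints Saturday
```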

Person table

Fields 8, 10, 11, 14, 19-21 will need linked tables to list their possible values.

  • Casenum, f1 (links with the Casenum field of the other tables)
  • Person type, f2 - describes the person as a driver, pedestrian, pedalcyclist, passenger, etc.
  • UnitNo, f3 - the person's Unit Number on the crash report, SR-1050. It links with the vehicle table.
  • Age, f5 - I don't think date of birth is important, plus it gets into privacy issues. Age is kind of important, though, for classification.
  • Sex, f6
  • DRAC, f8
  • BAC, f9
  • VIS, f10
  • DRVA, f11
  • INJ, f13
  • SAFT, f14
  • AIR, f15
  • EJCT, f16
  • PPA, f19 (numeric)
  • PPL, f20 (numeric)
  • PEDV, f21 (numeric)
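
The linked tables mentioned above for the coded fields could be as simple as a code-to-label mapping per field. A minimal sketch in Python, assuming integer codes; the codes and labels shown are placeholders and not IDOT's real values, and `decode` is a hypothetical helper:

```python
# Hypothetical lookup table for one coded field (SAFT, f14).
# The codes and labels below are placeholders, not IDOT's real values.
SAFT_LABELS = {
    1: "placeholder label one",
    2: "placeholder label two",
}

def decode(code, labels):
    """Return the label for a code, or a marker for codes missing from the table."""
    return labels.get(code, f"unknown code {code}")

print(decode(1, SAFT_LABELS))   # prints placeholder label one
print(decode(99, SAFT_LABELS))  # prints unknown code 99
```

Returning a marker for unknown codes, rather than raising an error, keeps rows with unexpected values visible during analysis.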

Vehicle table

Fields 5-8, 17-23 will need linked tables to list their possible values.

  • Casenum, f1
  • UnitNo, f2
  • NoOccupants, f4 - the number of occupants
  • VEHT, f5 (numeric) - vehicle type
  • VEHU, f6 (numeric) - vehicle use (like passenger versus taxi)
  • VEHD, f7 (numeric) - vehicle defects
  • MANV, f8 (numeric) - vehicle maneuver
  • DIRP, f9 (numeric)
  • CV IND, f13 (boolean) - whether or not it's a commercial vehicle
  • EVNT1, f17
  • EVNT2, f18
  • EVNT3, f19
  • LOC1, f20
  • LOC2, f21
  • LOC3, f22
  • FirstContact, f23 (fields 17-23 are numeric)
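
Since Casenum and UnitNo appear in both the person and vehicle tables, a person can be matched to a vehicle on that pair of fields. A minimal sketch in Python with made-up sample rows; the `PersonType` key and the values are illustrative only. It assumes, based on the note that bicycles do not appear as vehicles, that a pedalcyclist has no matching vehicle row.

```python
# Made-up sample rows; only the Casenum and UnitNo keys come from the tables above.
persons = [
    {"Casenum": "A1", "UnitNo": 1, "PersonType": "driver"},
    {"Casenum": "A1", "UnitNo": 2, "PersonType": "pedalcyclist"},
]
vehicles = [
    {"Casenum": "A1", "UnitNo": 1, "VEHT": 1},
]

# Index vehicles by (Casenum, UnitNo) so each person lookup is a single dict access.
vehicles_by_key = {(v["Casenum"], v["UnitNo"]): v for v in vehicles}

def vehicle_for(person):
    """Return the vehicle row for a person, or None if no vehicle matches."""
    return vehicles_by_key.get((person["Casenum"], person["UnitNo"]))

for p in persons:
    v = vehicle_for(p)
    print(p["PersonType"], "->", "no vehicle row" if v is None else v["VEHT"])
```

In SQL terms this is a left join from the person table to the vehicle table, which keeps people who have no vehicle row.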

I don't understand these fields:

  • MostHarmfulEvent
  • LocationOfMostHarmful
  • MostHarmfulEventNo

Not sure if these are important or not:

  • Vehicle Model Year
  • Vehicle Make
  • Vehicle Model


/home/stevevance/ · Last modified: 2011/05/11 22:11 by stevevance
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported