Since the data comes from IDOT, questions and other resources posted here will probably be relevant to the entire state and not just Chicago.
in progress
The information it contains is significant and relevant to many kinds and directions of analysis. The database is well structured, too. The data is divided into three tables: one with information on the CRASH (its occurrence), one on the VEHICLES involved, and one on the PERSONS involved.
Where the data does a poor job is in accommodating the inherent differences in a crash with a bicyclist. Bicycles do not appear in the data as vehicles, and instead of noting whether or not bicyclists carried legally required lighting devices, it notes whether they were wearing “contrasting clothing.” The data does not include dooring (although this will change for 2011 data).
In 2007-2009, 1,239 of 4,931 crashes (about 25%) occurred during DUSK or DARKNESS periods, as coded in the IDOT data.
Is it important for police to note if a pedestrian or pedalcyclist involved in a crash was wearing contrasting clothing? What guidance do police officers investigating crashes and writing reports have on determining the status of the clothing? At least for pedalcyclist crashes, I believe it's more important to note whether or not the pedalcyclist was in compliance with local laws on bicycle lighting.
CRASH table fields:
Casenum, f1
Crash year, f4
Crash month, f5
Crash day, f6
Day of week, f8 (although this can be easily calculated with PHP)
Collision type code, f13 (this is what defines pedestrian, pedalcyclist, rear end collision)
Total killed, f14
No injuries, f16
Traffic control device code, f26
Road surface condition code, f27
Road defects code, f28
Light condition code, f29
Weather code, f30
Cause 1 code, f31
Cause 2 code, f32
Time of crash, f34
Traffic control condition code, f35
Intersection related, f36 (boolean)
Hit and run, f37 (boolean)
Crash latitude, f47
Crash longitude, f48
Property description 1, f69
Property description 2, f70
Fields 8, 10, 11, 14, 19-21 will need linked tables to list their possible values.
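The crash field list above notes that day of week can be calculated rather than stored. The note mentions PHP; purely for illustration, here is a minimal sketch of the same calculation in Python, using the crash year, month, and day fields. The function name `day_of_week` is hypothetical and not part of the IDOT data.

```python
from datetime import date

# Fixed English names, so the result does not depend on the system locale.
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week(crash_year: int, crash_month: int, crash_day: int) -> str:
    """Derive the weekday name from the crash year, month, and day fields."""
    return WEEKDAY_NAMES[date(crash_year, crash_month, crash_day).weekday()]


print(day_of_week(2009, 7, 4))  # Saturday
```

If the day-of-week field is kept anyway, a calculation like this could serve as a cross-check on the stored value.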
PERSONS table fields:
Casenum, f1 (links with the Casenum field of the other tables)
Person type, f2 - this describes the person as driver, pedestrian, pedalcyclist, passenger, etc…
UnitNo, f3 - this field gives the person's Unit Number on the crash report, SR-1050. It links with the vehicle table.
Age, f5 - I don't think date of birth is important, plus it gets into privacy issues. Age is kind of important, though, for classification.
Sex, f6
DRAC, f8
BAC, f9
VIS, f10
DRVA, f11
INJ, f13
SAFT, f14
AIR, f15
EJCT, f16
PPA, f19 (numeric)
PPL, f20 (numeric)
PEDV, f21 (numeric)
Fields 5-8, 17-23 will need linked tables to list their possible values.
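A "linked table" for a coded field could look something like the sketch below, which uses an in-memory SQLite database. Everything here is an assumption for illustration: the table and column names (`person_type_codes`, `persons`) are made up, and the code numbers 1 to 3 are placeholders, not the actual IDOT code values.

```python
import sqlite3


def label_person_types():
    """Join a coded field to a lookup table so results show readable labels."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Lookup ("linked") table: one row per possible code value.
        CREATE TABLE person_type_codes (
            code  INTEGER PRIMARY KEY,
            label TEXT NOT NULL
        );
        -- Trimmed-down persons table that stores only the code.
        CREATE TABLE persons (
            casenum     TEXT,
            unit_no     INTEGER,
            person_type INTEGER REFERENCES person_type_codes(code)
        );
    """)
    # Placeholder codes, NOT the real IDOT values.
    conn.executemany(
        "INSERT INTO person_type_codes VALUES (?, ?)",
        [(1, "driver"), (2, "pedestrian"), (3, "pedalcyclist")],
    )
    # Made-up sample rows.
    conn.executemany(
        "INSERT INTO persons VALUES (?, ?, ?)",
        [("A001", 1, 1), ("A001", 2, 3)],
    )
    rows = conn.execute("""
        SELECT p.casenum, p.unit_no, c.label
        FROM persons AS p
        JOIN person_type_codes AS c ON c.code = p.person_type
        ORDER BY p.unit_no
    """).fetchall()
    conn.close()
    return rows


print(label_person_types())
```

The same pattern would apply to each coded field in the crash and vehicle tables, with one lookup table per field.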
VEHICLES table fields:
Casenum, f1
UnitNo, f2
NoOccupants, f4 - this is number of occupants
VEHT, f5 (numeric) - vehicle type
VEHU, f6 (numeric) - vehicle use (like passenger versus taxi)
VEHD, f7 (numeric) - vehicle defects
MANV, f8 (numeric) - vehicle maneuver
DIRP, f9 (numeric)
CV IND, f13 (boolean) - whether or not it's a commercial vehicle
EVNT1, f17
EVNT2, f18
EVNT3, f19
LOC1, f20
LOC2, f21
LOC3, f22
FirstContact, f23 (fields 17-23 are numeric)
I don't understand these fields:
MostHarmfulEvent
LocationOfMostHarmful
MostHarmfulEventNo
Not sure if these are important or not:
Vehicle Model Year
Vehicle Make
Vehicle Model
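To show how the three tables fit together, here is a rough sketch that links them through Casenum and UnitNo in an in-memory SQLite database. The table names, column names, and sample rows are all hypothetical, and each table is trimmed to a few fields. The LEFT JOIN is a design choice based on the point made earlier that bicycles do not appear as vehicles: it keeps a person in the result even when no vehicle row matches.

```python
import sqlite3


def build_demo_db():
    """Create tiny stand-ins for the crash, vehicle, and person tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE crash   (casenum TEXT PRIMARY KEY, crash_year INTEGER);
        CREATE TABLE vehicle (casenum TEXT, unit_no INTEGER, veht INTEGER);
        CREATE TABLE person  (casenum TEXT, unit_no INTEGER, person_type TEXT);
    """)
    # Made-up sample data: one crash, one vehicle, two persons.
    conn.execute("INSERT INTO crash VALUES ('X1', 2009)")
    conn.execute("INSERT INTO vehicle VALUES ('X1', 1, 1)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [("X1", 1, "driver"), ("X1", 2, "pedalcyclist")],
    )
    return conn


def persons_with_vehicles(conn):
    """List every person with their crash and, where one exists, their vehicle."""
    return conn.execute("""
        SELECT c.casenum, c.crash_year, p.unit_no, p.person_type, v.veht
        FROM person AS p
        JOIN crash AS c
          ON c.casenum = p.casenum
        LEFT JOIN vehicle AS v
          ON v.casenum = p.casenum AND v.unit_no = p.unit_no
        ORDER BY p.unit_no
    """).fetchall()


conn = build_demo_db()
for row in persons_with_vehicles(conn):
    print(row)
```

In this made-up data the pedalcyclist has no matching vehicle row, so the vehicle type column comes back empty (None) for that person.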