HIGH RISK CRASH ANALYSIS
Final Report 558
Prepared by:
Simon Washington and Wen Cheng
Department of Civil Engineering & Engineering Mechanics
University of Arizona
Tucson, AZ 85721
December 2005
Prepared for:
Arizona Department of Transportation
206 South 17th Avenue
Phoenix, Arizona 85007
in cooperation with
U.S. Department of Transportation
Federal Highway Administration
DISCLAIMER
The contents of this report reflect the views of the authors who are responsible for the facts and the
accuracy of the data presented herein. The contents do not necessarily reflect the official views or
policies of the Arizona Department of Transportation or the Federal Highway Administration. This
report does not constitute a standard, specification, or regulation. Trade or manufacturers' names
which may appear herein are cited only because they are considered essential to the objectives of
the report. The U.S. Government and the State of Arizona do not endorse products or
manufacturers.
Technical Report Documentation Page
1. Report No.
FHWA-AZ-05-558
2. Government Accession No.
3. Recipient's Catalog No.
4. Title and Subtitle
High Risk Crash Analysis
5. Report Date
December 2005
6. Performing Organization Code
7. Author
Dr. Simon Washington and Wen Cheng
8. Performing Organization Report No.
9. Performing Organization Name and Address
University of Arizona
Tucson, AZ 85721
10. Work Unit No.
11. Contract or Grant No.
SPR-PL-1-(63)558
12. Sponsoring Agency Name and Address
Arizona Department of Transportation
206 S. 17th Avenue
Phoenix, Arizona 85007
13. Type of Report & Period Covered
Final Report
14. Sponsoring Agency Code
15. Supplementary Notes
Prepared in cooperation with the U.S. Department of Transportation, Federal Highway Administration
16. Abstract
In agencies with jurisdiction over extensive road infrastructure, it is common practice to select and rectify hazardous locations.
The need to improve hazardous locations may arise during safety management activities, during maintenance activities, or as a
result of political pressures and/or public attention. Commonly a two-stage process is used. In the first stage, the past accident
history of all sites is reviewed to screen a limited number of high-risk locations for further examination. In the second stage,
the selected sites are studied in greater detail to devise cost-effective remedial actions or countermeasures for a subset of
correctable sites. Because of limited time and resource constraints and the extensive number of candidate sites typically
considered in such endeavors, it is impractical for agencies to examine all sites in detail. The current Arizona Local
Government Safety Project Analysis Model (ALGSP) is intended to facilitate these procedures by providing an automated
method for the analysis and evaluation of motor vehicle crashes and the subsequent remediation of ‘hot spot’ or ‘high risk’
locations. The software is user friendly and can save considerable time for local jurisdictions and governments such as
Metropolitan Planning Organizations (MPOs), counties, cities, and towns. Some analytical improvements are possible,
however.
The objective of this study was to provide recommendations that will improve the accuracy and reliability of the ALGSP
software for identifying true ‘hot spots’ within the Arizona transportation network, be they road segments, ramps, or
intersections.
The research resulted in 1) a survey of past and current hot spot identification (HSID) approaches, 2) evaluation of HSID
methods and exploration of the optimum duration of before-period crash data under simulated scenarios, 3) development of
safety performance functions (SPFs) for various functional road sections within Arizona, 4) extended comparisons of
alternative HSID methods based on SPFs using real crash data, and 5) recommendations for improving the identification
ability of the current ALGSP model.
17. Key Words
Hot Spot Identification, High Risk Sites, Sites with
Promise, Safety, Motor Vehicle Crashes
18. Distribution Statement
Document is available to the U.S. public through the National Technical Information Service, Springfield, Virginia, 22161
19. Security Classification
Unclassified
20. Security Classification
Unclassified
21. No. of Pages
154
22. Price
23. Registrant's Seal
SI* (MODERN METRIC) CONVERSION FACTORS

APPROXIMATE CONVERSIONS TO SI UNITS
Symbol    When You Know                Multiply By              To Find                       Symbol
LENGTH
in        inches                       25.4                     millimeters                   mm
ft        feet                         0.305                    meters                        m
yd        yards                        0.914                    meters                        m
mi        miles                        1.61                     kilometers                    km
AREA
in2       square inches                645.2                    square millimeters            mm2
ft2       square feet                  0.093                    square meters                 m2
yd2       square yards                 0.836                    square meters                 m2
ac        acres                        0.405                    hectares                      ha
mi2       square miles                 2.59                     square kilometers             km2
VOLUME
fl oz     fluid ounces                 29.57                    milliliters                   mL
gal       gallons                      3.785                    liters                        L
ft3       cubic feet                   0.028                    cubic meters                  m3
yd3       cubic yards                  0.765                    cubic meters                  m3
NOTE: Volumes greater than 1000 L shall be shown in m3.
MASS
oz        ounces                       28.35                    grams                         g
lb        pounds                       0.454                    kilograms                     kg
T         short tons (2000 lb)         0.907                    megagrams (or “metric ton”)   Mg (or “t”)
TEMPERATURE (exact)
ºF        Fahrenheit temperature       5(F-32)/9 or (F-32)/1.8  Celsius temperature           ºC
ILLUMINATION
fc        foot-candles                 10.76                    lux                           lx
fl        foot-Lamberts                3.426                    candela/m2                    cd/m2
FORCE AND PRESSURE OR STRESS
lbf       poundforce                   4.45                     newtons                       N
lbf/in2   poundforce per square inch   6.89                     kilopascals                   kPa

APPROXIMATE CONVERSIONS FROM SI UNITS
Symbol       When You Know                 Multiply By    To Find                      Symbol
LENGTH
mm           millimeters                   0.039          inches                       in
m            meters                        3.28           feet                         ft
m            meters                        1.09           yards                        yd
km           kilometers                    0.621          miles                        mi
AREA
mm2          square millimeters            0.0016         square inches                in2
m2           square meters                 10.764         square feet                  ft2
m2           square meters                 1.195          square yards                 yd2
ha           hectares                      2.47           acres                        ac
km2          square kilometers             0.386          square miles                 mi2
VOLUME
mL           milliliters                   0.034          fluid ounces                 fl oz
L            liters                        0.264          gallons                      gal
m3           cubic meters                  35.315         cubic feet                   ft3
m3           cubic meters                  1.308          cubic yards                  yd3
MASS
g            grams                         0.035          ounces                       oz
kg           kilograms                     2.205          pounds                       lb
Mg (or “t”)  megagrams (or “metric ton”)   1.102          short tons (2000 lb)         T
TEMPERATURE (exact)
ºC           Celsius temperature           1.8C + 32      Fahrenheit temperature       ºF
ILLUMINATION
lx           lux                           0.0929         foot-candles                 fc
cd/m2        candela/m2                    0.2919         foot-Lamberts                fl
FORCE AND PRESSURE OR STRESS
N            newtons                       0.225          poundforce                   lbf
kPa          kilopascals                   0.145          poundforce per square inch   lbf/in2

SI is the symbol for the International System of Units. Appropriate rounding should be made to comply with Section 4 of ASTM E380.
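The factor-based conversions tabulated above can be applied directly in code. The sketch below is illustrative only (the helper function names are not part of the report) and uses the multipliers exactly as listed in the table:

```python
# Illustrative helpers applying selected factors from the conversion table.
# Function names are this sketch's own; only the factors come from the table.

def miles_to_km(mi):
    """Length: miles -> kilometers (factor 1.61)."""
    return mi * 1.61

def fahrenheit_to_celsius(f):
    """Temperature (exact): degrees F -> degrees C, 5(F-32)/9."""
    return 5.0 * (f - 32.0) / 9.0

def short_tons_to_megagrams(t):
    """Mass: short tons (2000 lb) -> megagrams (factor 0.907)."""
    return t * 0.907

print(round(miles_to_km(100), 2))        # 161.0
print(fahrenheit_to_celsius(212.0))      # 100.0
```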
TABLE OF CONTENTS
EXECUTIVE SUMMARY ................................................................................................ 1
CHAPTER I - INTRODUCTION ...................................................................................... 3
CHAPTER II - LITERATURE REVIEW OF HSID METHODS ..................................... 5
HOT-SPOT IDENTIFICATION PROBLEM BACKGROUND............................... 5
BAYESIAN TECHNIQUES TO IDENTIFY HAZARDOUS LOCATIONS ......... 11
Bayesian Techniques Based on Accident Frequencies..................................... 11
Bayesian Techniques Based on Accident Rates ............................................... 13
CHAPTER III - EXPERIMENT DESIGN FOR EVALUATION OF HSID METHODS
AND EXPLORATION OF ACCIDENT HISTORY....................................................... 17
EXPERIMENT FOR EVALUATING HSID METHOD PERFORMANCE .......... 17
Hot Spot Identification Methods....................................................................... 17
Ground Rules for Simulation Experiment ........................................................ 19
Generating Mean Crash Frequencies from Real Data ...................................... 20
Generation of Random Poisson Samples from TPMs ...................................... 21
Performance Evaluation Results for HSID Methods ........................................ 26
EXPERIMENT FOR OPTIMIZING DURATION OF CRASH HISTORY ........... 30
RESULTS ................................................................................................................. 32
CONCLUSIONS AND RECOMMENDATIONS ................................................... 38
CHAPTER IV - SAFETY PERFORMANCE FUNCTIONS FOR ARIZONA ROAD
SEGMENTS ..................................................................................................................... 39
DATA DESCRIPTION ............................................................................................ 39
HOW TO CREATE SPFS? ...................................................................................... 40
RESULTS OF SPFS ................................................................................................. 41
CONCLUSIONS....................................................................................................... 42
CHAPTER V - COMPARISON OF HSID METHODS BASED ON REAL CRASH
DATA OF ARIZONA ROAD SEGMENTS.................................................................... 43
HSID METHODS BASED ON SPFS ...................................................................... 43
The EB Approach Based on SPFs .................................................................... 43
Accident Reduction Potential Method Based on SPFs ..................................... 44
Numerical Examples to Show the HSID Methods Based on SPFs .................. 44
DATA DESCRIPTION ............................................................................................ 46
TESTS FOR COMPARISON OF HSID METHODS.............................................. 46
Site Consistency Test........................................................................................ 47
Method Consistency Test.................................................................................. 48
Total Ranking Differences Test ........................................................................ 48
False Identification Test.................................................................................... 49
COMPARISON RESULTS...................................................................................... 51
Site Consistency Test Result............................................................................. 51
Method Consistency Test Result ...................................................................... 52
Total Ranking Differences Test Result............................................................. 53
False Identification Test Result ........................................................................ 54
False True Poisson Means Differences Test Result.......................................... 55
Result of Similarity of Alternative HSID Identification Methods.................... 56
CONCLUSIONS AND RECOMMENDATIONS ................................................... 57
CHAPTER VI - HSID IN CURRENT ALGSP MODEL AND RECOMMENDED
SOFTWARE CHANGES ................................................................................................. 59
HSID IN CURRENT ALGSP MODEL ................................................................... 59
RECOMMENDED SOFTWARE CHANGES......................................................... 61
Incorporating the Functional Classification as an Additional User Selection
Parameter .......................................................................................................... 61
Data Interface Improvement ............................................................................. 61
Exploring the Relationship between Exposure and Safety as Employed in the
ALGSP.............................................................................................................. 62
Incorporation of the EB Techniques to Calculate the Expected Crash Number ... 62
Incorporation of Accident Reduction Potential Method................................... 63
Incorporation of the EB Techniques to Calculate the Expected Crash Costs... 64
Recommended Period of Analysis for Software Users..................................... 64
REFERENCES ................................................................................................................. 67
APPENDIX A: REAL ARIZONA CRASH DATA USED FOR THE DEVELOPMENT
OF SIMULATED CRASH DATA................................................................................... 71
APPENDIX B: THE IDENTIFICATION ERROR RATES ASSOCIATED WITH
VARIOUS HSID METHODS, CONFIDENCE LEVELS, AND GROUPS ................... 80
APPENDIX C: SAFETY PERFORMANCE FUNCTIONS OF VARIOUS
FUNCTIONAL CLASSIFICATIONS OF ARIZONA ROAD SEGMENTS................ 107
APPENDIX D: COMPARISON TESTS RESULTS AND SIMILARITY OF
ALTERNATIVE HSID METHODS FOR VARIOUS CLASSIFICATIONS OF
HIGHWAY SECTIONS................................................................................................. 117
LIST OF TABLES
Table 1: Summary of Gamma Fittings of Six Datasets .................................................... 24
Table 2: Simulated Data for 30 Sites and 16 Observation Periods................................... 25
Table 3: Percent Errors for Low Heterogeneity in Crash Counts..................................... 29
Table 4: Percent Errors for High Heterogeneity in Crash Counts .................................... 29
Table 5: Snapshot of the Simulated Data.......................................................................... 31
Table 6: The Number of t-year Which is the “Knee” of the Curve for Group 1 ................ 33
Table 7: The Number of t-year Which is the “Knee” of the Curve for Group 2 ................ 33
Table 8: The Number of t-year Which is the “Knee” of the Curve for Group 3 ................ 33
Table 9: Percent Errors for Low Heterogeneity in Crash Counts (3 Years Data) ............. 37
Table 10: Percent Errors for High Heterogeneity in Crash Counts (3 Years Data)........... 37
Table 11: Functional Classification Codes ...................................................................... 39
Table 12: Statistics for Roads of Various Functional Classifications............................... 40
Table 13: Crash Information of a Sample of 20 Principal Arterial Road Sections............ 47
Table 14: Results of Site Consistency Test of Various Methods for All Classifications of
Highways: Accumulated Crashes for Hot Spot Sites for Various Methods ............. 51
Table 15: Results of Method Consistency Test of Various Methods for All Classifications
of Highways: Number of Sites Commonly Identified across Periods ...................... 52
Table 16: Results of Total Ranking Differences Test of Various Methods for All
Classifications of Highways: Cumulative Ranking Differences of Hot Spot Sites .. 53
Table 17: Results of False Identification Test of Various Methods for All Classifications
of Highways: Frequency of Errors............................................................................ 54
Table 18: Results of False True Poisson Means Differences Test of Various Methods for
All Classifications of Highways: Cumulative Difference in TPMs.......................... 55
Table 19: Accumulated Similarity of Various Methods for All Classifications of
Highways (δ = 0.90).................................................................................................. 56
Table 20: Accumulated Similarity of Various Methods for All Classifications of
Highways (δ = 0.95).................................................................................................. 56
Table 21: Observed Data from Apache (E1) ..................................................................... 71
Table 22: Observed Data from Gila (E2)........................................................................... 71
Table 23: Observed Data from Graham (L1)..................................................................... 72
Table 24: Observed Data from La Paz (L2)....................................................................... 72
Table 25: Observed Data from Pima (S1).......................................................................... 72
Table 26: Observed Data from Santa Cruz (S2) ................................................................ 73
Table 27: The Identification Error Rates of SR Method for Group 1 (δ = 0.90)............... 80
Table 28: The Identification Error Rates of EB Method for Group 1 (δ = 0.90)............... 81
Table 29: The Identification Error Rates of CI Method for Group 1 (δ = 0.90)................ 82
Table 30: The Identification Error Rates of SR Method for Group 1 (δ = 0.95)............... 83
Table 31: The Identification Error Rates of EB Method for Group 1 (δ = 0.95)............... 84
Table 32: The Identification Error Rates of CI Method for Group 1 (δ = 0.95)................ 85
Table 33: The Identification Error Rates of SR Method for Group 1 (δ = 0.99)............... 86
Table 34: The Identification Error Rates of EB Method for Group 1 (δ = 0.99)............... 87
Table 35: The Identification Error Rates of CI Method for Group 1 (δ = 0.99)................ 88
Table 36: The Identification Error Rates of SR Method for Group 2 (δ = 0.90)............... 89
Table 37: The Identification Error Rates of EB Method for Group 2 (δ = 0.90)............... 90
Table 38: The Identification Error Rates of CI Method for Group 2 (δ = 0.90)................ 91
Table 39: The Identification Error Rates of SR Method for Group 2 (δ = 0.95)............... 92
Table 40: The Identification Error Rates of EB Method for Group 2 (δ = 0.95)............... 93
Table 41: The Identification Error Rates of CI Method for Group 2 (δ = 0.95)................ 94
Table 42: The Identification Error Rates of SR Method for Group 2 (δ = 0.99)............... 95
Table 43: The Identification Error Rates of EB Method for Group 2 (δ = 0.99)............... 96
Table 44: The Identification Error Rates of CI Method for Group 2 (δ = 0.99)................ 97
Table 45: The Identification Error Rates of SR Method for Group 3 (δ = 0.90)............... 98
Table 46: The Identification Error Rates of EB Method for Group 3 (δ = 0.90)............... 99
Table 47: The Identification Error Rates of CI Method for Group 3 (δ = 0.90).............. 100
Table 48: The Identification Error Rates of SR Method for Group 3 (δ = 0.95)............. 101
Table 49: The Identification Error Rates of EB Method for Group 3 (δ = 0.95)............. 102
Table 50: The Identification Error Rates of CI Method for Group 3 (δ = 0.95).............. 103
Table 51: The Identification Error Rates of SR Method for Group 3 (δ = 0.99)............. 104
Table 52: The Identification Error Rates of EB Method for Group 3 (δ = 0.99)............. 105
Table 53: The Identification Error Rates of CI Method for Group 3 (δ = 0.99).............. 106
Table 54: Estimation Results for SPF of Rural Interstate Principal Arterials (Functional
Code: 1)................................................................................................................... 108
Table 55: Estimation Results for SPF of Rural Other Principal Arterials ....................... 109
Table 56: Estimation Results for SPF of Rural Minor Arterials..................................... 110
Table 57: Estimation Results for SPF of Rural Major Collectors (Functional Code: 7). 111
Table 58: Estimation Results for SPF of Rural Minor Collectors (Functional Code: 8). 112
Table 59: Estimation Results for SPF of Urban Interstate Principal Arterials (Functional
Code: 11)................................................................................................................. 113
Table 60: Estimation Results for SPF of Urban Freeways ............................................. 114
Table 61: Estimation Results for SPF of Urban Other Principal Arterials (Functional
Code: 14)................................................................................................................. 115
Table 62: Estimation Results for SPF of Urban Minor Arterials.................................... 116
Table 63: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 1)................................................................................................................... 117
Table 64: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 1)................................................................................................................... 117
Table 65: Results of Site Consistency Test of Various Methods.................................... 118
Table 66: Results of Method Consistency Test of Various Methods ............................. 118
Table 67: Results of Total Ranking Differences Test of Various Methods.................... 118
Table 68: Results of False Identification Test of Various Methods ............................... 119
Table 69: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 1) ............................................................................................... 119
Table 70: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 2)................................................................................................................... 120
Table 71: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 2)................................................................................................................... 120
Table 72: Results of Site Consistency Test of Various Methods.................................... 120
Table 73: Results of Method Consistency Test of Various Methods ............................. 121
Table 74: Results of Total Ranking Differences Test of Various Methods.................... 121
Table 75: Results of False Identification Test of Various Methods ............................... 121
Table 76: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 2) ............................................................................................... 122
Table 77: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 6)................................................................................................................... 123
Table 78: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 6)................................................................................................................... 123
Table 79: Results of Site Consistency Test of Various Methods.................................... 123
Table 80: Results of Method Consistency Test of Various Methods ............................. 124
Table 81: Results of Total Ranking Differences Test of Various Methods.................... 124
Table 82: Results of False Identification Test of Various Methods ............................... 124
Table 83: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 6) ............................................................................................... 125
Table 84: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 7)................................................................................................................... 126
Table 85: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 7)................................................................................................................... 126
Table 86: Results of Site Consistency Test of Various Methods.................................... 126
Table 87: Results of Method Consistency Test of Various Methods ............................. 127
Table 88: Results of Total Ranking Differences Test of Various Methods.................... 127
Table 89: Results of False Identification Test of Various Methods ............................... 127
Table 90: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 7) ............................................................................................... 128
Table 91: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 8)................................................................................................................... 129
Table 92: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 8)................................................................................................................... 129
Table 93: Results of Site Consistency Test of Various Methods.................................... 129
Table 94: Results of Method Consistency Test of Various Methods ............................. 130
Table 95: Results of Total Ranking Differences Test of Various Methods.................... 130
Table 96: Results of False Identification Test of Various Methods ............................... 130
Table 97: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 8) ............................................................................................... 131
Table 98: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 11)................................................................................................................. 132
Table 99: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 11)................................................................................................................. 132
Table 100: Results of Site Consistency Test of Various Methods.................................. 132
Table 101: Results of Method Consistency Test of Various Methods ........................... 133
Table 102: Results of Total Ranking Differences Test of Various Methods.................. 133
Table 103: Results of False Identification Test of Various Methods ............................. 133
Table 104: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 11) ............................................................................................. 134
Table 105: Similarity of Identification Results (δ = 0.90) of Various Methods............. 135
Table 106: Similarity of Identification Results (δ = 0.95) of Various Methods............. 135
Table 107: Results of Site Consistency Test of Various Methods.................................. 135
Table 108: Results of Method Consistency Test of Various Methods ........................... 136
Table 109: Results of Total Ranking Differences Test of Various Methods (Functional
Code: 12)................................................................................................................. 136
Table 110: Results of False Identification Test of Various Methods ............................. 136
Table 111: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 12) ............................................................................................. 137
Table 112: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 14)................................................................................................................. 138
Table 113: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 14)................................................................................................................. 138
Table 114: Results of Site Consistency Test of Various Methods.................................. 138
Table 115: Results of Method Consistency Test of Various Methods ........................... 139
Table 116: Results of Total Ranking Differences Test of Various Methods (Functional
Code: 14)................................................................................................................. 139
Table 117: Results of False Identification Test of Various Methods ............................. 139
Table 118: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 14) ............................................................................................. 140
Table 119: Similarity of Identification Results (δ = 0.90) of Various Methods (Functional
Code: 16)................................................................................................................. 141
Table 120: Similarity of Identification Results (δ = 0.95) of Various Methods (Functional
Code: 16)................................................................................................................. 141
Table 121: Results of Site Consistency Test of Various Methods.................................. 141
Table 122: Results of Method Consistency Test of Various Methods ........................... 142
Table 123: Results of Total Ranking Differences Test of Various Methods (Functional
Code: 16)................................................................................................................. 142
Table 124: Results of False Identification Test of Various Methods ............................. 142
Table 125: Results of False True Poisson Means Differences Test of Various Methods
(Functional Code: 16) ............................................................................................. 143
LIST OF FIGURES
Figure 1: Observed and Fitted PDF of E1 Crash Data and Fit Summary Statistics......... 23
Figure 2: Fitted and Empirical CDF of E1........................................................................ 24
Figure 3: Moving Averages vs. Original Statistic............................................................. 32
Figure 4: The Number of t-year Which is the “Knee” of the Curve for 90% Confidence
Level ......................................................................................................................... 34
Figure 5: The Number of t-year Which is the “Knee” of the Curve for 95% Confidence
Level ......................................................................................................................... 34
Figure 6: The Number of t-year Which is the “Knee” of the Curve for 99% Confidence
Level ......................................................................................................................... 35
Figure 7: The Number of t-year Which is the “Knee” of the Curve for All Confidence
Levels........................................................................................................................ 35
Figure 8: The Cumulative Percent Distribution of Various t-years.................................. 36
Figure 9: Key Steps of ALGSP Model ............................................................................. 59
Figure 10: The Flowchart of Conducting EB Analysis .................................................... 63
Figure 11: The Flowchart of Computing Accident Reduction Potential .......................... 64
Figure 12: Empirical Cumulative Distribution of Dataset One (E1) ................................ 74
Figure 13: Empirical Cumulative Distribution of Dataset Two (E2) ............................... 75
Figure 14: Empirical Cumulative Distribution of Dataset Three (L1) ............................. 76
Figure 15: Empirical Cumulative Distribution of Dataset Four (L2) ............................... 77
Figure 16: Empirical Cumulative Distribution of Dataset Five (S1)................................ 78
Figure 17: Empirical Cumulative Distribution of Dataset Six (S2).................................. 79
Figure 18: Relation of AADT and Crashes/year-km for Rural Interstate Principal
Arterials (Functional Code: 1, year: 2000) ............................................................. 108
Figure 19: Relation of AADT and Crashes/year-km for Rural Other Principal Arterials
(Functional Code: 2, year: 2000) ............................................................................ 109
Figure 20: Relation of AADT and Crashes/year-km for Rural Minor Arterials (Functional
Code: 6, year: 2000)................................................................................................ 110
Figure 21: Relation of AADT and Crashes/year-km for Rural Major Collectors
(Functional Code: 7, year: 2000) ............................................................................ 111
Figure 22: Relation of AADT and Crashes/year-km for Rural Minor Collectors
(Functional Code: 8, year: 2000) ............................................................................ 112
Figure 23: Relation of AADT and Crashes/year-km for Urban Interstate Principal
Arterials (Functional Code: 11, year: 2000) ........................................................... 113
Figure 24: Relation of AADT and Crashes/year-km for Urban Freeways ..................... 114
Figure 25: Relation of AADT and Crashes/year-km for Urban Other Principal Arterials
(Functional Code: 14, year: 2000) .......................................................................... 115
Figure 26: Relation of AADT and Crashes/year-km for Urban Minor Arterials
(Functional Code: 16, year: 2000) .......................................................................... 116
EXECUTIVE SUMMARY
In many agencies with jurisdiction over extensive road infrastructure, it is common
practice to select and rectify hazardous locations. Improving hazardous locations may
arise during safety management activities, during maintenance activities, or as a result of
political pressures and/or public attention. Commonly a two-stage process is used. In the
first stage, the past accident history of all sites is reviewed to screen a limited number of
high risk locations for further examination. In the second stage, the selected sites are
studied in greater detail to devise cost-effective remedial actions or countermeasures for a
subset of correctable sites.
Due to limited time, resource constraints, and the extensive number of candidate sites
typically considered in such endeavors, it is impractical for agencies to examine all
sites in detail. The current Arizona Local Government Safety Project (ALGSP) Analysis
Model, which was developed by Carey (2001) with funding from the Arizona
Department of Transportation (ADOT), is intended to facilitate these procedures by
providing an automated method for analysis and evaluation of motor vehicle crashes and
subsequent remediation of 'hot spot' or 'high risk' locations. The software is user friendly
and can save large amounts of time for local jurisdictions and governments such as
Metropolitan Planning Organizations (MPOs), counties, cities, and towns. However, its
analytical core is based on simple ranking of crash statistics, where the user is offered a
choice of crash frequency, crash rate, crash severity, or crash cost (severities weighted by
average cost per crash severity type). Although this method has the benefit of
straightforwardness, its efficiency at identifying truly high-risk sites leaves some room
for improvement. This research, funded by ADOT, aims to justify and recommend
improvements to the analytical algorithms within the ALGSP model, thus enhancing its
ability to accurately identify high risk sites.
The results of this research include a survey of past and current hot spot identification
(HSID) approaches; an evaluation of HSID methods and an exploration of the optimum
duration of before-period crash data under simulated scenarios; the development of
safety performance functions (SPFs) for various functional road sections within Arizona;
extended comparisons of alternative HSID methods based on SPFs using real crash
data; and recommendations for improving the identification ability of the current ALGSP
model. These results are divided into the following sections:
• Literature review of HSID methods (chapter II): By tracing the historical and
conceptual development of various HSID techniques, the strengths and weaknesses
of alternative approaches are assessed, and appropriate directions for future
research on HSID methods are explored and proposed. A detailed description
of Bayesian approaches is also provided.
• Experimental design for evaluation of HSID methods and exploration of accident
history (chapter III): In this experiment, "sites with promise" are known a priori.
Real intersection crash data from six counties within Arizona are used to simulate
crash frequency distributions at hypothetical sites. A range of real conditions is
manipulated to quantify their effects, and various confidence levels are explored.
False positives (labeling a safe site as high risk) and false negatives (labeling a high
risk site as safe) are compared across three methods: the simple ranking method,
the confidence interval method, and the Empirical Bayes (EB) method.
Finally, the effect of crash history duration in these approaches is quantified.
• Safety performance functions for Arizona road segments (chapter IV): SPFs
for nine functional classifications of road sections in Arizona are created based on
year 2000 crash data provided by ADOT. Because the accident counts are
overdispersed, Negative Binomial models are used to develop these SPFs.
• Comparison of HSID methods based on real crash data of Arizona road segments
(chapter V): On the basis of the SPFs for Arizona road sections, five tests are
implemented to evaluate the performance of the EB, accident reduction potential,
accident frequency, and accident rate methods. Two confidence levels are
explored under each test. In addition, the similarity of the identification results of
the alternative HSID methods is explored.
• HSID in the current ALGSP model and recommended software changes (chapter VI):
The algorithms for conducting HSID in the current ALGSP model are first
reviewed and software changes are then recommended. These recommendations
include incorporating functional classification as an additional selection
parameter, data interface improvements, accident history requirements, embedding
the relationships between exposure and safety for the various roadway functional
classes, incorporation of EB techniques to compute the expected crash count,
incorporation of accident reduction potential as an additional weighting method,
and incorporation of EB techniques to calculate expected crash costs.
Based on both real and simulated data, the results in this report show significant
advantages of the EB methods over other HSID methods across various confidence levels
and different statistical tests. Specifically, the research found that:
• A higher percentage of truly high risk sites are identified as 'high risk.'
• A higher percentage of truly safe sites are identified as 'safe.'
• Overall misclassifications are reduced using a Bayesian approach compared to
alternative methodologies.
• The Bayesian approach shows the best site consistency and method consistency
among the alternative methodologies.
Although it is shown that incorporation of Bayesian techniques into the ALGSP will
provide model users with more accurate prediction of hot spots, these improvements are
contingent upon accurate safety performance functions, which are currently unavailable
in the ALGSP. Safety performance functions, i.e., the relationships among traffic
volumes, road section lengths, and crashes, are provided in Appendix C for various
roadway functional classifications in the state of Arizona. These safety performance
functions enable the software enhancements needed to improve the ALGSP and
accommodate Empirical Bayes procedures.
CHAPTER I - INTRODUCTION
Hot spot identification is a critical contemporary transportation issue. The Intermodal
Surface Transportation Efficiency Act (ISTEA) of 1991, along with the subsequent
Transportation Efficiency Act for the 21st Century (TEA-21), brought HSID squarely into
transportation planning activities. In particular, ISTEA requires each state to develop a
work plan outlining strategies to implement Safety Management Systems (NCHRP,
2003). The objectives outlined in this management system require that several activities
be undertaken by MPOs and/or DOTs:
1) The development and maintenance of a regional safety database so that safety
investments can be evaluated regionally and over time.
2) The adoption of a defensible (i.e., state-of-practice) methodology for identifying
safety deficiencies within a region.
3) A maintained and updated record of 'sites with promise,' including intersections,
segments, interchanges, ramps, curves, etc.
4) A defensible methodology for evaluating the effectiveness of safety countermeasures.
Besides this mandate to spend safety funds wisely, there is professional pressure to
conduct rigorous analyses and be held accountable for 'good number crunching.' Due to
both public and professional pressures and the importance attached to motor vehicle
injuries and fatalities, transportation safety professionals desire analytical tools to cope
with HSID.
As a powerful tool for local governments and jurisdictions, the current ALGSP model can
be used to facilitate the selection of hazardous roadway locations in local jurisdictions
and to aid in the evaluation of potential spot treatments of safety hazards. Its
identification method simply ranks the crash statistics in descending order; the top-ranked
sites are then selected subject to the available budget. Because of random "up"
fluctuations in crash counts during the observation period, this simple ranking method is
subject to regression-to-the-mean bias, which decreases identification accuracy. By
contrast, Bayesian methods have been proposed to obviate this bias and have been shown
in a considerable body of literature to be superior for accurately identifying 'sites with
promise.' However, much of that research was conducted on real crash data (where
hazardous sites are not truly known), and comparisons across various scenarios have not
been conducted. In addition, real crash data specific to Arizona regions have not been
used to examine the performance of Bayesian analyses. By designing an experiment that
simulates various scenarios and by using real crash data from Arizona, this research
effort evaluates and compares alternative HSID methods. All the results show the
consistent superiority of Bayesian techniques for accurately identifying 'sites with
promise,' laying a solid foundation for the future incorporation of Bayesian approaches
into the current ALGSP model. Moreover, safety performance functions for various
classifications of road sections within Arizona are also provided in this report to
facilitate the integration procedure.
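The ranking-and-select step just described can be sketched in a few lines of Python. The site names, crash counts, and budget figure below are hypothetical, and crash frequency is used for illustration only (the ALGSP also offers rate, severity, and cost as ranking statistics):

```python
# Hypothetical site records; crash frequency is the ranking statistic here.
sites = [
    {"id": "Main St & 5th Ave", "crashes": 51},
    {"id": "I-10 @ Ray Rd",     "crashes": 34},
    {"id": "SR-87 MP 212",      "crashes": 18},
    {"id": "Oracle & Ina",      "crashes": 29},
]

budget_slots = 2  # how many sites the budget allows to be studied in detail

# Rank in descending order of the chosen crash statistic, then take the top.
ranked = sorted(sites, key=lambda s: s["crashes"], reverse=True)
selected = [s["id"] for s in ranked[:budget_slots]]
print(selected)  # ['Main St & 5th Ave', 'I-10 @ Ray Rd']
```

Because the ranking uses raw observed counts, a site with a randomly high year will be selected over a truly hazardous site with a randomly low year, which is exactly the regression-to-the-mean weakness noted above.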
This report is divided into five primary sections. In the second section of this report,
Literature Review of HSID Methods, the historical and conceptual development of HSID
procedures is reviewed chronologically; to aid understanding of the more complicated
computational procedures, a detailed description of two types of Bayesian techniques is
also provided.
In the third section, an experimental approach is taken to evaluate the performance of
simple ranking, classical confidence intervals, and the EB techniques in terms of the
percentages of false negatives and false positives. Several practical empirical crash
distributions from the state of Arizona are selected to represent a realistic range of 'base'
crash data, and several degrees of crash heterogeneity are examined in the simulation.
The results demonstrate that the EB methods in general outperform the two more
conventional methods, especially in low-heterogeneity situations. In addition, the effect
of the crash history duration employed in the three HSID methods is explored in this
experiment. The moving average method is used to smooth the trend across the various
durations and to find the "knee" of the curve. Using 3 years of crash history data results
in significant improvements in error rates for all three methods, and durations of 3
through 6 years account for almost 90% of all the optimum durations.
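The moving-average smoothing used to locate the "knee" can be sketched as follows; the error-rate values for t = 1 through 8 years are hypothetical, not results from the report:

```python
def moving_average(values, window=3):
    """Simple trailing moving average, used here to smooth the error-rate
    curve over candidate crash-history durations before locating the knee."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical misclassification rates for t = 1..8 years of crash history.
error_rates = [0.42, 0.31, 0.22, 0.20, 0.19, 0.185, 0.183, 0.182]
smoothed = moving_average(error_rates)
# The knee is where the smoothed curve flattens out (around t = 3-4 here).
print([round(v, 3) for v in smoothed])  # [0.317, 0.243, 0.203, 0.192, 0.186, 0.183]
```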
The major focus of the fourth section is on developing safety performance functions for
road sections. Since design criteria and level of service vary according to the function of
the highway facility, a safety performance function is created for each of nine types of
road sections within Arizona. The data for modeling include accident counts, Annual
Average Daily Traffic (AADT), and road section length. Graphs showing the
relationships among the variables, the model forms, and measures of goodness-of-fit are
provided as well. It is expected that these SPFs will facilitate the incorporation of
Bayesian techniques into the future ALGSP.
The fifth section contains a comprehensive comparison of the identification performance
of the EB, accident reduction potential, accident frequency, and accident rate methods
using crash data from Arizona and the SPFs developed in the previous section. Five
evaluation tests are conducted: the site consistency test, the method consistency test, the
total ranking differences test, the false identification test, and the false/true Poisson mean
differences test. Both the top 10% and the top 5% of locations (in terms of accident
frequency) are considered as hot spots. The results across the nine types of road sections
show the consistent advantage of the EB method, and the disadvantage of the accident
rate method, in conducting HSID.
The final section provides recommended software changes to improve the ability of the
ALGSP to select truly hazardous locations from the road network. It is proposed that
traffic volume information be incorporated into the software: as one of the factors
significantly affecting road safety, it should be included in the safety performance
function, which is the basis for conducting EB analysis. Both the experimental design
results based on simulated data and the results of the evaluation tests based on Arizona
crash data support the incorporation of Bayesian techniques in the software. The accident
reduction potential method is also recommended for inclusion as an additional weighting
method. Finally, a recommended length for the crash analysis period is provided.
CHAPTER II - LITERATURE REVIEW OF HSID METHODS
Identifying 'sites with promise,' also known as black spots, hot spots, or high-risk
locations, has received considerable attention in the literature. This is not surprising,
since there is public and professional pressure to allocate safety investment resources
efficiently across the transportation system and to invest in sites that will yield safety
benefits for relatively modest cost. In addition, US federal legislation requires the
practice of remediating high risk locations.
It is intended that this identification stage act as an effective sieve that allows sites that do
not require remedial action to pass through, while retaining sites that require remediation.
This is difficult to accomplish, however, because an individual site's safety performance
(i.e., number of crashes) varies from year to year as a result of natural variation, causing
two potential errors: false positives and false negatives. False positives are sites
identified as needing remediation when in fact they are safe, while false negatives are
sites identified as being safe when in fact they require remediation.
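When the true status of each site is known, as in a simulation experiment, these two error types can be counted directly. The function and the site identifiers below are purely illustrative:

```python
def fp_fn_counts(flagged, truly_unsafe):
    """Count the two HSID error types for one screening run.

    flagged      -- set of site ids a method labeled 'high risk'
    truly_unsafe -- set of site ids that actually require remediation
    """
    false_positives = len(flagged - truly_unsafe)  # safe sites flagged
    false_negatives = len(truly_unsafe - flagged)  # unsafe sites missed
    return false_positives, false_negatives

# Site 1 is a false positive; sites 4 and 5 are false negatives.
print(fp_fn_counts({1, 2, 3}, {2, 3, 4, 5}))  # (1, 2)
```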
The following literature review comprehensively examines hot spot identification
methods. It is intended to support ongoing work for the Arizona Department of
Transportation aimed at improving the current ALGSP Model. It is the first of several
steps toward ultimately improving the software that enables jurisdictions in the state of
Arizona to identify sites for potential improvement, such as road segments, intersections,
ramps, etc. This literature review is divided into two sections: the historical and
conceptual development of hot-spot identification methods, and a detailed description of
Bayesian techniques, the current state of the art.
HOT- SPOT IDENTIFICATION PROBLEM BACKGROUND
Due to the significant importance of identifying sites with promise, a large number of
techniques have been employed to improve detection accuracy. The historical and
conceptual development of such procedures is reviewed chronologically in this section to
help familiarize the reader with the hot-spot identification problem.
The following notation will be useful in the discussions that follow:
X = observed accident count for a road section/site and period;
λ = expected accident count (E{X}) for the road section/site and period;
E{λ} = mean of the λ's for similar road sections/sites;
D = length of the road section;
Q = number of vehicles passing the road section/site during the period to which X pertains;
R = observed accident rate (e.g., crashes/vehicle-kilometer or crashes/million entering
vehicles);
R_EB = accident rate estimated by the EB method;
R̄ = average value of R for similar road sections and sites;
UCL_X = upper control limit for observed accident counts (X);
UCL_R = upper control limit for observed accident rates (R);
t = number of years of accident data to be analyzed;
α, β = parameters.
Perhaps the simplest way to identify sites with promise is to rank them in descending
order of their accident frequencies and/or accident rates. Although this method has the
benefit of straightforwardness, its efficiency at identifying truly high-risk sites leaves
considerable room for improvement. To overcome this deficiency, a substantial body of
research has been devoted to providing more efficient and justifiable site identification
techniques. For example, Norden et al. (1956) proposed a method to analyze accident
data for highway sections based on statistical quality control techniques. Using an
approximation of the Poisson distribution for crash counts and a 0.5 percent probability
level, they developed the equations for UCL_X and UCL_R used to identify critical
thresholds. When X exceeds UCL_X (or R exceeds UCL_R), a site is identified as
deviant with regard to safety. This approach drew much attention at the time, and some
similar methods (with relatively minor differences) based on this procedure were
proposed in subsequent years.
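Norden et al.'s exact equations are not reproduced in this chapter, but the widely cited rate-quality control form of the critical rate is R_c = R̄ + k·sqrt(R̄/m) + 1/(2m), where m is the site's exposure (in million vehicle-kilometers) and k is the standard normal multiplier for the chosen probability level. A sketch under those assumptions, with hypothetical numbers:

```python
import math

def critical_rate(avg_rate, exposure, k=2.576):
    """Upper control limit on the accident rate (rate-quality control form).

    avg_rate -- mean rate R-bar for similar sites (crashes per million veh-km)
    exposure -- site exposure m, in million vehicle-kilometers
    k        -- standard normal multiplier (2.576 for a 0.5% level)
    """
    return avg_rate + k * math.sqrt(avg_rate / exposure) + 1.0 / (2.0 * exposure)

observed_rate = 4.5                              # hypothetical site rate
ucl = critical_rate(avg_rate=1.8, exposure=2.5)  # about 4.19
print(observed_rate > ucl)                       # True: flag site as deviant
```

Note how the sqrt(R̄/m) term widens the limit for low-exposure sites, so a high rate built on little traffic is less likely to trigger a flag.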
Researchers then began to ponder how many years (t) of accident data are necessary to
conduct a defensible analysis. Having found that a 13-year average could be adequately
estimated from 3 years of accident counts, May (1964) first offered the conclusion,
"There is little to be gained by using a longer study period than three years." It is
reasonable to use current data instead of old data that no longer reflect the current
situation. However, considering that a sensible choice of t must depend on the magnitude
of the average being estimated and on some knowledge of what makes past accident
counts obsolete, this influential practice seems somewhat arbitrary.
Crash severity became the next issue of importance for HSID methods. Common sense
suggests that a site with more severe crashes (all else being equal) should receive higher
priority in remediation efforts. The safety index was first introduced by Tamburri and
Smith (1970) and later incorporated into the practice of HSID. In essence, they observed
that each road type (for example, rural two-lane roads and urban freeways) had a
characteristic mix (distribution) of accident severities among fatal, injury, and property
damage only (PDO) crashes. On the basis of accident severity and road type, accident
costs were used to weight crashes. They also suggested that all crashes be expressed in
terms of PDO-equivalent accidents (for example, a certain injury crash may be equivalent
to 5 PDO crashes).
Deacon et al. (1975) considered the difference between identifying hazardous spots and
hazardous sections and explored how long analysis sections should be. They also
presented an analysis of a sensible t, in comparison to that provided earlier by May
(1964). Their conclusions suggested that a balance be sought between the reliability of
the crash data (longer periods being more reliable) and the need to detect adverse change
quickly (shorter periods being more able to reveal adverse safety changes), and that a
single t should be determined on this basis. They also recommended 9.5 as the weight
for fatal and A-injury crashes, and 3.5 for B- and C-injury crashes, when using a safety
index.
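This weighting scheme can be sketched as a PDO-equivalent score. The severity weights below are the ones Deacon et al. recommend; the crash counts and the input format are hypothetical:

```python
# Severity weights from Deacon et al. (1975); a PDO crash counts as 1.
WEIGHTS = {"fatal": 9.5, "a_injury": 9.5, "b_injury": 3.5, "c_injury": 3.5,
           "pdo": 1.0}

def safety_index(crash_counts):
    """PDO-equivalent score for one site (higher = higher priority)."""
    return sum(WEIGHTS[severity] * n for severity, n in crash_counts.items())

site = {"fatal": 1, "a_injury": 2, "b_injury": 4, "c_injury": 3, "pdo": 10}
print(safety_index(site))  # 9.5 + 19.0 + 14.0 + 10.5 + 10.0 = 63.0
```

Ranking on this score rather than raw frequency pushes sites with severe crashes up the list even when their total crash counts are modest.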
Laughland et al. (1975) first described the ranking procedure using both the number and
rate methods. The proposed method identifies hazardous locations only when X exceeds
some predetermined value UCL_X and R exceeds UCL_R. The claimed advantage of
this procedure is that it excludes so-called hazardous locations that would be identified
merely because R is large as a result of low exposure.
Renshaw et al. (1980) argued that questions about the length of sections, duration of
accident history, amount of traffic, and detection accuracy must all be considered jointly,
and that reliable detection is often not practical.
Hakkert and Mahalel (1978) first proposed that blackspots be defined as those sites
whose accident frequency is significantly higher than expected at some prescribed level
of significance. This view was then favored by McGuigan (1981; 1982), who put
forward the concept of potential accident reduction (PAR), defined as the difference
between the observed accident count and the expected number for similar sites. He
stated, with some justification, that PAR should be a better basis on which to rank sites
than annual accident totals (AAT), which tend to identify high-flow sites that do not
necessarily have potential for accident reduction. This method is similar to the quality
control method to some extent: the former represents the magnitude of the problem, that
is, how many accidents could be avoided under normal conditions, while the latter
represents the probability that the site is abnormal at a given level of confidence.
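McGuigan's PAR criterion can be sketched as follows; the observed counts and the expected values (which in practice would come from similar-site averages or a model) are hypothetical:

```python
def par(observed, expected_similar):
    """Potential accident reduction: observed count minus the expected
    count at similar sites; a larger PAR means more room for improvement."""
    return observed - expected_similar

# Hypothetical sites: (observed crashes, expected crashes at similar sites).
sites = {"A": (12, 5.0), "B": (20, 18.0), "C": (7, 2.5)}
ranked = sorted(sites, key=lambda name: par(*sites[name]), reverse=True)
print(ranked)  # ['A', 'C', 'B']
```

Site B has the highest accident total (20) but ranks last, because its count is close to what similar high-flow sites would be expected to experience; this is exactly McGuigan's objection to ranking on annual accident totals.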
Estimating E{λ} using a multivariate model was suggested by Mahalel et al. (1982). By
using E{λ} as the mean, they deemed a location deviant if the probability of observing X
or more accidents was smaller than some predetermined value.
Flak et al. (1982) recommended that crashes be categorized according to specific road
conditions (weather, pavement material, etc.) and by accident type (turning, side-swipe,
rear-end, etc.). This concept differed from previous ones in that it seeks to identify
deviant locations with regard to very specific conditions. Although appealing from an
experimental design point of view, it is likely to produce sample sizes too small to detect
significant differences for all but the largest of databases.
Hauer and Persaud (1984) proposed a concept of sieve efficiency in which the number of
sites to be inspected and the expected numbers of correct positives, false positives, and
false negatives serve as measures of performance. They examined the performance of
various HSID techniques on the basis of performance measures that are easy to
understand. They argued that the quality-control approach to HSID does not give the
analyst clues about how well or how poorly the sieve is working. They also suggested
that numerical methods are needed to free the procedure from reliance on the assumption
that λ obeys the gamma distribution.
Regression-to-the-mean (RTM) bias associated with typical methods of site selection has
been identified in the literature, and some research dealing with RTM has been
developed. Persaud and Hauer (1984) compared and evaluated the performance of an EB
method and a nonparametric method for debiasing before-and-after analyses. The results
for several data sets show that the Bayesian method in most cases yields better estimates
than the nonparametric one. Wright et al. (1988) surveyed previous research dealing with
the RTM effect. They examined the validity of the assumptions associated with those
methods, evaluated the robustness of the results based on those assumptions, and
provided some suggestions for improving the quality of the results.
Mak et al. (1985) developed a procedure to conduct an automated analysis of hazardous
locations. The procedure consists of (a) a mainframe computer program to identify and
rank black spots, (b) a microcomputer program to identify factors overrepresented in
accident occurrence at these locations relative to the average for similar highways in the
area, (c) a multidisciplinary approach to identify accident causative factors and to devise
appropriate remedial measures, and (d) evaluation of remedial measures actually
implemented. The procedure is based on accident rate (number of injury and fatal
accidents per 100 million vehicle miles of travel).
Higle and Witkowski (1988) developed a Bayesian model for HSID that uses accident
rate data rather than accident counts and has identification criteria analogous to those
used in the classical identification scheme. Comparisons between the Bayesian analysis
and classical statistical analyses suggest that there is an appreciable difference among the
various identification techniques in terms of HSID performance, and that some
classically based statistical techniques are prone to err in the direction of excess false
negatives.
Based on data from 145 intersections in Metropolitan Toronto, Hauer et al. (1988)
provided Bayesian models to estimate the safety of a signalized intersection on the basis
of information about its traffic flows and accident history. For each of 15 accident
patterns (categorized by the movements of the vehicles), an equation is given to estimate
the expected number of accidents and its variance using the relevant traffic flows. When
data about past accidents are available, the estimates based on traffic flow are revised
with a simple equation. By applying these Bayesian models, one can estimate safety
when both flows and accident history are given and, on this basis, judge whether an
intersection is unusually hazardous. This method of estimation is also recommended for
accident warrants in the Manual on Uniform Traffic Control Devices.
Through a simulation experiment, Higle and Hecht (1989) evaluated and compared
various techniques for the identification of hazardous locations, based on classical and
Bayesian statistical analyses respectively, in terms of their ability to identify hazardous
locations correctly. The results reveal that the two classically based techniques suffer
from some shortcomings, and that the Bayesian method based on accident rate tends to
perform well, producing lower numbers of both false negative and false positive errors.
By 1990 it was generally accepted in academic circles that the Empirical Bayes approach
to unsafety estimation was superior to previous HSID methods. The Bayesian approach
generally makes use of two kinds of clues about an entity: its traits (such as traffic,
geometry, age, or gender) and its historical crash record. It requires information about
the mean and the variance of the unsafety in a "reference population" of similar entities.
This method suffers from several shortcomings: first, a very large reference population is
required; second, the choice of reference population is to some extent arbitrary; and
third, entities in the reference population usually cannot match the traits of the entity for
which the unsafety is estimated. Hauer (1992) alleviated these shortcomings by offering
a multivariate regression method for estimating the mean and the variance of unsafety in
the reference population. By describing its logical foundations and illustrating some
numerical examples, Hauer showed how the multivariate method makes the Empirical
Bayes approach to unsafety estimation applicable to a wider range of circumstances and
yields better estimates of unsafety than previous methods.
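The EB estimate just described can be sketched as a weighted average of the site's own count and the reference-population mean, with the weight in the standard form w = 1/(1 + Var{λ}/E{λ}). The negative binomial variance relation and all numbers below are illustrative assumptions, not values from this report:

```python
def eb_expected_crashes(x_observed, mu_ref, phi):
    """Empirical Bayes estimate of a site's long-term expected crash count.

    x_observed -- crash count recorded at the site
    mu_ref     -- E{lambda}: mean for the reference population (e.g., an
                  SPF prediction for similar sites)
    phi        -- negative binomial overdispersion parameter, giving
                  Var{lambda} = mu_ref**2 / phi

    The weight w = 1 / (1 + Var{lambda} / E{lambda}) pulls the observed
    count toward the reference mean, which corrects for RTM bias.
    """
    var = mu_ref ** 2 / phi
    w = 1.0 / (1.0 + var / mu_ref)
    return w * mu_ref + (1.0 - w) * x_observed

# A site with 12 observed crashes against a reference mean of 4.0:
print(round(eb_expected_crashes(x_observed=12, mu_ref=4.0, phi=2.0), 3))  # 9.333
```

The estimate (about 9.3) lies between the raw count of 12 and the reference mean of 4, which is exactly the shrinkage that protects against flagging sites whose counts were inflated by a random "up" fluctuation.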
Persaud (1991) presented a method for estimating the underlying accident potential of
Ontario road sections using accident and road-related data. The comparative results
indicate that the EB estimates are superior to those based on the accident counts or the
regression predictions by themselves, particularly for sections that might be of interest in
a program to identify and treat unsafe road locations.
Brown et al. (1992) presented the convergence of HSID by police-reported data, by
highway inventory, and by community reporting. Weighted injury frequencies per unit
distance and weighted injury rates per 100 million vehicle-km are presented for all sites
and for all numbered highway segments. Priority sites are then ranked considering both
injury frequencies and injury rates.
Hauer et al. (1993) explored the probabilistic properties of the process of identifying
entities, such as drivers or intersections, for some form of remedial action when they
experience N crashes within D units of time, the N-D "trigger." On the basis of the
probability distribution of the "time-to-trigger," they concluded that in road safety the
problem of false positives is severe, and therefore entities identified on the basis of
accident or conviction counts should be subjected to further safety diagnosis. Moreover,
they found that the longer the N-D trigger is applied to a population, the less useful it
becomes.
Tarko et al. (1996) presented a methodology for area-wide safety analyses to detect areas
(states, counties, townships, etc.) that should be considered for safety treatment. The
method was implemented for Indiana at the county level and uses regression models to
estimate the normal number of crashes in individual counties. The counties are priority
ranked using a combined criterion that includes both the above-norm number of crashes
and the confidence level. This combined criterion helps select counties where the
excessive number of crashes is not caused solely by the randomness of the process. This
application differs from previous ones in that the HSID was conducted at the planning or
county level, instead of at the intersection or road segment level.
Stokes and Mutabazi (1996) traced the evolution of the formulas used in the rate-quality
control method from their origin in the late 1950s to their present form, and they also
presented and discussed the derivation of the basic formulas used in the method. They
suggested that, contrary to assertions in the literature, the accuracy of the equations used
in the rate-quality method is not improved by eliminating the normal approximation
correction factor from the original equations, and that the need for a correction factor is
particularly apparent at higher probability levels.
On the basis of his review of previous procedures for black-spot identification, Hauer
(1996) attempted to create some order in the thinking and made suggestions to improve
identification. In comparison with the identification stage, he pointed out that the stage
of site safety diagnosis and remediation is somewhat underdeveloped.
Persaud et al. (1999) put forward a concept similar to potential accident reduction,
termed potential-for-safety-improvement (PSI). To correct for the RTM bias, they
replaced the observed accident count in the PAR described above with the long-term
mean of accident counts.
Davis and Yang (2001) made use of Hierarchical Bayes methods combined with an
induced exposure model to identify intersections where the crash risk for a given driver
subgroup is relatively higher than that for other groups. They carried out the necessary
computations using Gibbs sampling, producing point and interval estimates of relative
crash risk for the specified driver group at each site in a sample. The methods can also be
extended to identify hazardous locations for a specified accident type. This method of
HSID requires sophisticated modeling skill and software, and is currently beyond the
level of most DOT staff expertise.
Kononov et al. ( 2002) presented the direct diagnostics method to conduct HSID and
develop appropriate countermeasures. The underlying principle is that a site should be
identified for further examination if there is overrepresentation of specific accidents
relative to similar sites.
With empirical Bayes gradually becoming the standard and staple of professional
practice, Hauer et al. ( 2002) presented a tutorial on safety estimation using the EB
method. This tutorial contains comprehensive illustration of using the EB procedures and
can be viewed as the bridge between theory and practice for the EB application.
The above mentioned research represents only a small portion of the extensive past and
current HSID research. In summary, the large body of techniques for HSID generally
includes simple ranking of accident frequencies and/ or accident rates, rate- quality control
methods, site identification using the notion of a safety index, number- and- rate methods,
accident pattern recognition method, and various applications of Bayesian approaches on
both crash frequencies and crash rates. In comparison with other techniques, Bayesian
techniques have been shown to offer improved ability to identify black- spots by
accounting for both history and expected crashes for similar sites, which can obviate the
“ regression- to- the- mean” problem that simpler methods fail to correct.
This literature review summary clearly indicates that opportunities exist for
enhancements leading to improved HSID within the recently released ALGSP model,
which currently performs a simple ranking based on accident frequencies. However, as
one might expect, the incorporation of Bayesian methods will increase the data collection
burden: additional information about site crash histories and reference populations will
need to be collected. The following section is devoted to describing the Bayesian
techniques in greater detail.
BAYESIAN TECHNIQUES TO IDENTIFY HAZARDOUS LOCATIONS
An underlying characteristic of crash occurrence is the random fluctuation from year to
year of crash counts under constant and unchanging traffic, weather, and roadside
conditions ( which of course in reality does not occur). This characteristic significantly
reduces the ability to detect truly hazardous locations in the sense that a crash site may
appear to represent a relatively high risk in a given year when in fact the site’s
underlying, inherent risk level is average or low ( Hauer, 1997). A site that reveals a high
observed risk in one year is on average followed by a crash count in the subsequent year
that is closer to the mean— a phenomenon known as regression to the mean. However, it
was shown in the previous section that Bayesian approaches, by utilizing two kinds of
clues of an entity ( its traits and its historical accident record), involve corrections for
RTM and can improve significantly the efficiency of site identification. Incorporation of
such techniques into the ALGSP model will offer improvements in the performance of
HSID. Unfortunately, in contrast to other approaches, which are relatively
straightforward, the Bayesian techniques require a greater quantity of information
associated with locations inspected and also involve relatively more complicated
computations – albeit trivial for a computer.
Noting that a large portion of this research is to test the performance of various HSID
methods (including the typical methods and Bayesian techniques), this section
describes in detail the analytical aspects of various Bayesian techniques generally
accepted as ‘state of the art.’ The reviews are divided into two groups: Bayesian
techniques based on accident frequencies and Bayesian techniques based on accident rates.
Bayesian Techniques Based on Accident Frequencies
To alleviate the RTM bias associated with other site identification techniques, Hauer et
al. ( 1984; 1988; 1992) discussed numerous aspects of HSID to derive what is known as
the EB method. EB methods differ technically from Bayes’ methods in that the former
relies on empirical data as “ subjective” information while the latter relies on truly
subjective information ( e. g. expert opinions, judgment, etc.).
The EB method rests on the following logic. Two assumptions are first needed, which
can be traced back to those of Morin ( 1967) and Norden et al. ( 1956):
Assumption 1: At any given location, accident occurrence obeys the Poisson
probability law. That is, P(x|λ) denotes the probability of recording x accidents at a
site where their expected number is λ, with
P(x|λ) = λ^x e^(−λ) / x! (1)
Assumption 2: The probability distribution of λ over the population of sites is
gamma distributed, where g(λ) denotes the gamma probability density function.
Estimation of the long-term safety of an entity is obtained by using both kinds of
clues, that is, the traits of the entity (such as gender, age, traffic, or geometry) and the
historical accident record of the entity. If the count of crashes (x) obeys the Poisson
probability law and the distribution of the λ’ s in the reference population is approximated
by a Gamma probability density function, a good estimator of the λ for a specific entity
is:
αE{λ} + (1 − α)x, with α = E{λ}/[E{λ} + VAR{λ}]. (2)
From the above equation, we know estimates of E { λ} and VAR { λ} which pertain to the
λ’ s of the reference population are needed. There are two methods to estimate the E { λ}
and VAR { λ}. One of them is the method of sample moments, the other is the
multivariate regression method.
To describe the method of sample moments, let us first consider a reference population of
n entities of which n( x) entities have recorded X= 0, 1, 2,… accidents during a specified
period. With this notation, the sample mean and the sample variance are, respectively:
μ = Σ x n(x) / Σ n(x) (3)
s² = [Σ (x − μ)² n(x)] / Σ n(x) (4)
In the method of sample moments, the estimators of E{λ} and VAR{λ} are equal to μ
and s² − μ, respectively. The larger the reference population, the more accurate these
estimates are.
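To make Equations 2 through 4 concrete, the method of sample moments can be sketched in a few lines of Python. The crash counts below are invented for illustration; note that substituting the moment estimates into Equation 2 reduces the weight to α = μ/s².

```python
# Method-of-sample-moments EB estimate (Equations 2-4), a minimal sketch.
counts = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]  # hypothetical crashes at 10 reference sites

n = len(counts)
mu = sum(counts) / n                         # sample mean, Eq. 3
s2 = sum((x - mu) ** 2 for x in counts) / n  # sample variance, Eq. 4

E_lam = mu          # estimate of E{lambda}
VAR_lam = s2 - mu   # estimate of VAR{lambda} (requires s2 > mu)
alpha = E_lam / (E_lam + VAR_lam)            # EB weight; equals mu / s2

def eb_estimate(x):
    """Long-term safety estimate for a site with x observed crashes (Eq. 2)."""
    return alpha * E_lam + (1 - alpha) * x

print(mu, s2, alpha, eb_estimate(9))  # 3.0 6.0 0.5 6.0
```

With these counts, the site that recorded 9 crashes is pulled down to an EB estimate of 6.0, toward the reference mean of 3.0; this shrinkage is exactly the regression-to-the-mean correction.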
The primary attraction of the method is that its validity rests on a single assumption: that
if λi remained constant, the occurrence of accidents would be well described by the
Poisson probability law. However, there remain two practical difficulties: ( 1) It is rare
that a sufficiently large data set can be found to allow for adequately accurate estimation
of E { λ} and VAR { λ}; ( 2) Even with very large data sets, one cannot find adequate
reference populations when entities are described by several traits ( e. g. geometric
conditions, etc.). In order to obviate these difficulties, Hauer ( 1992) provided the
multivariate regression method. With this correction, a multivariate model is fitted to
accident data to estimate the E { λ} as a function of independent variables, and the
residuals ( i. e., the difference between an accident count on some specific entity that
served as “ datum” for model fitting and the estimate E { λ} calculated from the fitted
model equation) are viewed as coming from a family of compound Poisson distributions:
VAR{ x}= VAR{ λ}+ E{ λ} ( 5)
The E { λ} of the reference population is estimated using the model equation; VAR{ x} is
estimated using the squared residuals. Therefore, based on equation ( 5), the difference
[ squared residual – estimate of E { λ}] can be used to estimate VAR { λ} for the imaginary
reference population to which this datum point belongs.
As mentioned previously, it is easy to note that the primary difference between the
method of sample moments and multivariate regression method is that the estimates of E
{ λ} and VAR { λ} are obtained using different analytical procedures. The method of
sample moments is straightforward, while the latter one yields more precise results.
Once the estimates of E{λ} and VAR{λ} are obtained, the expected safety of an entity is
obtained using Equation 2. However, the truly hazardous locations cannot be screened
based solely on the long-term safety associated with each entity; a model of the entire
distribution function of λ given X is required.
On the basis of the assumptions stated previously, the probability that a site selected
randomly has x accidents is approximated by the negative binomial ( NB) probability
distribution. Thus, the parameters of g ( λ) are estimated using EB logic according to the
following sequence of steps:
Step 1: The sample mean and variance are computed across sites. The notation n(x) is used
to denote the number of sites that had x crashes. The estimated mean and variance are
computed using:
μ = Σ x n(x) / Σ n(x) (6)
s² = [Σ (x − μ)² n(x)] / Σ n(x) (7)
Step 2: The EB weighting parameters α and β are then obtained using:
α = μ / (s² − μ) (8)
β = μ · α (9)
Step 3: With the two weighting parameters obtained, the parameters of the gamma
distribution are obtained such that:
g(λ) = α^β λ^(β−1) e^(−αλ) / Γ(β) (10)
The subpopulation of sites that had x accidents also follows a gamma probability
distribution, and its gamma probability density function is given by:
g(λ|x) = (1 + α)^(β+x) λ^(β+x−1) e^(−(1+α)λ) / Γ(β + x) (11)
With the probability density functions defined, the selection of hazardous locations is
now straightforward. Suppose that λ* is the “acceptable” upper limit of accident counts;
then a site i is identified as hazardous if the probability that λ exceeds λ* is relatively
large. Specifically, if:
P(λ > λ* | x) > δ (12)
where δ is the tolerance level that is contingent upon the choice of safety specialists (i.e.,
the level of acceptable risk) and takes into account conditions in the local jurisdiction, then
site i is identified as a truly hazardous location.
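The three steps and the screening rule of Equation 12 can be sketched as follows. The crash counts, λ*, and δ below are invented for illustration, and the posterior tail probability P(λ > λ*|x) is estimated by Monte Carlo sampling from the gamma posterior of Equation 11 (shape β + x, rate 1 + α) rather than by a closed-form incomplete-gamma evaluation.

```python
import random

random.seed(42)

# Hypothetical crash counts for a reference population (one period per site).
counts = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9, 2, 1, 0, 3, 12]

n = len(counts)
mu = sum(counts) / n                          # Eq. 6
s2 = sum((x - mu) ** 2 for x in counts) / n   # Eq. 7
alpha = mu / (s2 - mu)                        # Eq. 8 (requires s2 > mu)
beta = mu * alpha                             # Eq. 9

def tail_prob(x, lam_star, draws=20000):
    """Monte Carlo estimate of P(lambda > lam_star | x) under the
    gamma posterior of Eq. 11: shape beta + x, rate 1 + alpha."""
    shape, scale = beta + x, 1.0 / (1.0 + alpha)
    hits = sum(random.gammavariate(shape, scale) > lam_star for _ in range(draws))
    return hits / draws

lam_star, delta = 6.0, 0.5   # analyst-chosen threshold and tolerance (arbitrary here)
hazardous = [i for i, x in enumerate(counts) if tail_prob(x, lam_star) > delta]
print(hazardous)
```

Sampling the posterior keeps the sketch dependency-free; in practice the tail probability would be computed directly from the gamma distribution function.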
Bayesian Techniques Based on Accident Rates
In contrast to earlier papers regarding EB techniques, which were concerned with
predicting the number of crashes that will occur at a particular location, Higle and
Witkowski ( 1988) investigated using Bayesian analysis of crashes for the identification
of hazardous locations based on accident rates rather than frequencies. It should be noted
that a growing body of literature strongly discourages the use of crash rates (Hauer,
1997). Due to the similar assumptions
and procedures, the research can be viewed as a complement to the previous research
relying on EB approaches. Using empirical comparisons of performance between
Bayesian and classical statistical analyses, Higle et al. found that there is an appreciable
difference among the various identification techniques, and that some classically based
statistical techniques may be prone to err in the direction of excessive false negatives.
Higle and Witkowski divided the Bayesian analysis into two steps. In the first step, crash
histories are aggregated across a number of sites to get a gross estimation of the
probability distribution of the accident rates across the region. In the second step, the
regional distribution and the accident history at a particular site are used to obtain a
refined estimation of the probability distribution associated with the accident rate at that
particular site.
In performing the analysis, Higle and Witkowski made two assumptions that are similar
to those made by previous researchers:
Assumption 1: At any given location, when the accident rate is known (i.e., the site's
accident rate, treated as a random variable, takes the value Ri), the actual number of
accidents follows a Poisson distribution with expected value Ri(DQ)i. That is:
P{Xi = X | Ri} = [Ri(DQ)i]^X e^(−Ri(DQ)i) / X! (13)
Assumption 2: The probability distribution of the regional accident rate, fR(R), is the
gamma distribution. That is:
fR(R) = β^α R^(α−1) e^(−βR) / Γ(α) (14)
Higle and Witkowski recommended that, for each computation, it may be preferable to
use the MME (method of moments estimates) values rather than the MLE (maximum
likelihood estimates) values of α and β. Within the framework of Bayesian analysis, the
site-specific parameters are αi = α + Xi and βi = β + (DQ)i. Based on αi and βi, the
site-specific probability density functions are then obtained. The steps to identify the truly
hazardous locations are as follows:
Step 1: Estimate the sample mean and variance of the observed accident rates of the
population of locations:
μ = (1/m) Σi=1..m [Xi / (DQ)i] (15)
s² = [1/(m − 1)] Σi=1..m [Xi / (DQ)i − μ]² (16)
Step 2: Estimate parameters α and β, where:
β = μ / s² (17)
α = μ · β (18)
With the two parameters,
fR(R) = β^α R^(α−1) e^(−βR) / Γ(α) (19)
Step 3: Obtain f(Ri | Xi, (DQ)i).
The subpopulation of sites that had Xi accidents also follows a gamma distribution, and
its gamma probability density function is as follows:
f(Ri | Xi, (DQ)i) = βi^αi Ri^(αi−1) e^(−βi Ri) / Γ(αi) (20)
where:
αi = α + Xi (21)
βi = β + (DQ)i (22)
With these probability density functions, the selection of hazardous locations is now
straightforward. Suppose that R* is the “acceptable” upper limit on the accident rate;
then site i is deemed hazardous if the probability that Ri exceeds R* is relatively
large. That is, if:
P(Ri > R* | Xi, (DQ)i) > δ (23)
where δ is the tolerance level, which is contingent upon the choice of safety specialists
and the actual situation of the local jurisdiction. Sites above the critical threshold are then
identified as truly hazardous locations.
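Under the same caveats (invented counts Xi and exposures (DQ)i, and Monte Carlo sampling in place of a closed-form gamma tail), the rate-based procedure of Steps 1 through 3 can be sketched as:

```python
import random

random.seed(7)

# Hypothetical data: crash counts X_i and exposures (DQ)_i.
X = [3, 1, 0, 8, 2, 15, 4, 2]
DQ = [10.0, 8.0, 6.0, 12.0, 9.0, 11.0, 10.0, 7.0]
m = len(X)

rates = [x / dq for x, dq in zip(X, DQ)]
rate_mu = sum(rates) / m                                   # Eq. 15
rate_s2 = sum((r - rate_mu) ** 2 for r in rates) / (m - 1) # Eq. 16
beta = rate_mu / rate_s2                                   # Eq. 17
alpha = rate_mu * beta                                     # Eq. 18

def tail_prob(i, r_star, draws=20000):
    """Monte Carlo estimate of P(R_i > r_star | X_i, (DQ)_i) under the
    posterior of Eq. 20: shape alpha + X_i, rate beta + (DQ)_i."""
    shape, scale = alpha + X[i], 1.0 / (beta + DQ[i])
    hits = sum(random.gammavariate(shape, scale) > r_star for _ in range(draws))
    return hits / draws

r_star, delta = 0.5, 0.5  # arbitrary illustration values
hazardous = [i for i in range(m) if tail_prob(i, r_star) > delta]
print(hazardous)
```

Note how the exposure (DQ)i enters the posterior rate parameter directly, so sites with high counts but high exposure are not automatically flagged.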
To summarize, Bayesian techniques, by accounting for both crash history and expected
crashes for similar sites, have been shown to offer improved ability to identify truly
hazardous locations. The next section quantifies the differences between Bayesian
techniques and other typical approaches.
CHAPTER III - EXPERIMENT DESIGN FOR EVALUATION OF
HSID METHODS AND EXPLORATION OF ACCIDENT HISTORY
On the basis of the previous literature review for HSID methods, Bayesian methods
revealed themselves as superior for accurately identifying sites with promise. However,
much of the research was conducted on real crash data ( where hazardous sites are not
truly known) and comparisons across various Bayesian methods have not been
conducted. This chapter focuses on examining the performance of the EB and
alternative typical methods within various environments, and on exploring the duration
of accident history that minimizes false identifications.
The chapter is divided into sections as follows. Section 1, “ Experiment for Evaluating
HSID Method Performance,” discusses the steps of an experiment designed to evaluate
the performance of HSID methods. Section 2, “ Experiment for Optimizing Duration of
Crash History” presents the steps with regard to the optimum duration of before- period
crash data. Both real data and simulated crash data are utilized in the experiments. The
real data were obtained from current ALGSP users in Arizona. The simulated data
correspond with a designed experiment that varies factors such as the degree (or
percentage) of difference between “correctable” and “average” sites, the variability in
the data, and the crash distributions. The final section provides the conclusions and
recommendations that arise from the two experiments performed to evaluate HSID
methods for use in the ALGSP, and translates the analytical results into practical
recommendations.
EXPERIMENT FOR EVALUATING HSID METHOD PERFORMANCE
The main objective of this first experiment is to quantify and assess the predictive
performance of various HSID methods, such as the simple ranking method, the method
based on classical statistical confidence intervals, and the EB method, in order to identify
the best one for inclusion in the ALGSP model. Of course there are many aspects of the
simulation experiment that desire careful attention, such as sample sizes, nature of crash
data, reliability of tests, etc. Prior to describing the detailed aspects of the experiment,
HSID methods are first reviewed.
Hot Spot Identification Methods
A site ( series of sites, etc.) may experience relatively high numbers of crashes due to: 1)
an underlying safety problem; or 2) a random “ up” fluctuation in crash counts during the
observation period. Simply observing unusually high crash counts does not indicate
which of the two conditions prevails at the site. It is possible to articulate the objective of
HSID as follows:
The objective of hot spot identification is to identify transportation system
locations ( road segments, intersections, interchanges, ramps, etc.) that
possess underlying correctable safety problems, and whose effect will be
revealed through elevated crash frequencies relative to similar locations.
Two aspects of the previous statement are noteworthy. First, it is possible to have truly
unsafe sites that do not reveal elevated crash frequencies— these are termed ‘ false
negatives.’ It is also possible to have elevated crash frequencies, which do not result from
underlying safety problems— these are termed ‘ false positives.’ False positives, if acted
upon, lead to investment of public funds with few safety benefits. False negatives lead to
missed opportunities for effective safety investments. As one might expect, correct
determinations include identifying a safe site as “ safe” and an unsafe site as “ high risk.”
When considering the seriousness of errors ( false positives and false negatives) with
respect to safety management, one generally concludes that false negatives are the least
desirable result, since a jurisdiction will fail to make wise investments and reduce
fatalities, injuries ( serious and minor), and property damage crashes.
For evaluation purposes, an HSID method is sought that produces the smallest proportion
of false negatives and false positives. Hence, the percentages of false negatives, false
positives, and overall misidentifications ( false positives plus false negatives) are used to
compare the performance of three commonly implemented HSID methods: 1) simple
ranking of sites; 2) classically based confidence intervals; and 3) the EB methods. These
three methods are now described.
The simple ranking method ( denoted SR in experiments) is the most straightforward
HSID method. Applying this method, a set of locations ( e. g. all 4- lane signalized
intersections in a jurisdiction) is ranked in descending order of crash frequencies ( or
counts, X), and then the top sites are identified as high- risk locations for further
inspections. Typically, resources are invested to improve correctable sites from the top
down until allocated funds are expended. This method, for example, is one analysis
option available in the current version of the ALGSP model.
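As a minimal sketch with invented counts, simple ranking amounts to sorting sites by observed frequency and truncating at the available budget:

```python
# Simple ranking (SR): flag the top-ranked sites by observed crash count.
# Counts are hypothetical illustration data.
counts = [4, 9, 2, 15, 7, 3, 11, 5, 6, 8]
top_fraction = 0.2  # e.g. funds suffice to treat the top 20% of sites

k = max(1, int(round(top_fraction * len(counts))))
ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
flagged = ranked[:k]
print(flagged)  # site indices in descending order of crash frequency -> [3, 6]
```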
A second method for HSID is based on classical statistical confidence intervals ( denoted
CI in experiments) ( 1975). Location i is identified as unsafe if the observed accident
count Xi exceeds the observed average of counts of comparison ( similar) locations, μ,
with level of confidence equal to δ; that is, Xi > μ + KδS, where S denotes the
standard deviation of the comparison locations and Kδ is the corresponding critical
value. In practice δ is typically 0.90, 0.95, or 0.99, and depends upon the actual situation
and considerations such as the number of sites, amount of safety investment resources,
etc. These values serve as approximations, since they are borrowed from the normal
distribution function and thus have no special meaning in terms of the distribution of true
accident counts, which typically follow Poisson or negative binomial distributions. This
method is commonly used in the sense that it is inferred from the classical statistics and
can be performed conveniently.
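A minimal sketch of the CI screen under these conventions (counts are invented, and Kδ = 1.645 is the borrowed normal critical value for δ = 0.95):

```python
import math

# Classical confidence-interval (CI) screening, a minimal sketch.
# Counts are hypothetical; K approximates the delta = 0.95 normal critical value.
counts = [4, 9, 2, 15, 7, 3, 11, 5, 6, 8]
K = 1.645

n = len(counts)
mu = sum(counts) / n
S = math.sqrt(sum((x - mu) ** 2 for x in counts) / (n - 1))  # comparison-group std. dev.

threshold = mu + K * S
flagged = [i for i, x in enumerate(counts) if x > threshold]
print(round(threshold, 2), flagged)  # 13.49 [3]
```

Only the site with 15 observed crashes exceeds μ + KδS here; note that the same invented counts under simple ranking with a top-20% budget would flag two sites, which previews the disagreement among methods that the experiment quantifies.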
Critical in the SR and CI methods is the notion of ‘ comparison sites.’ Comparison sites
are used to obtain an estimate of ‘ expected crashes’ for similar sites. When sites are
ranked using simple ranking, it is assumed that the sites being ranked have similar
geometric and traffic conditions. Geometrics and traffic play a significant role in crash
potential and thus must be treated carefully. Often jurisdictions will group to the extent
possible ‘ similar’ sites together in the ranking; however, it is often the case that sites with
different geometric and traffic conditions ( i. e. exposure) are compared in the ranking
method. In the confidence interval method, it is assumed that the group or set of
comparison sites are similar to the site being compared. Critical to the outcome of any
HSID method is the level of sophistication employed to identify comparison sites.
For the EB technique, the previous section has given a detailed description. It is
noteworthy that only the EB method based on accident counts is used herein. Equation 24
is followed to compute the long-term accidents of each site:
λi = αE{λ} + (1 − α)xi, with α = E{λ}/[E{λ} + VAR{λ}]. (24)
The weight parameter α is obtained by using the method of sample moments, in which
the estimators of E{λ} and VAR{λ} are equal to μ and s² − μ, respectively (μ denotes the
sample mean and s² denotes the sample variance). From the above expressions, it is
known that the second of the two clues, crash history, significantly affects the estimate of
λ, since longer crash histories tend to be more stable ( in crashes per year) than shorter
crash histories. Thus, different historical accident records yield different estimators of
E{ λ} and VAR{ λ}, and subsequently different identification error rates ( false positives
and false negatives). Similarly, these different identification error rates are also supposed
to be obtained under simple ranking and confidence analysis methods when utilizing
various historical accident records. Because of its importance, the optimum crash history
is examined in the second experiment described in this chapter.
Ground Rules for Simulation Experiment
To accomplish the evaluation of HSID methods, a simulation experiment was designed to
test a variety of conditions. The simulation experiment consists of the following specific
steps:
1) Generate mean crash frequencies from real data. Crash datasets from Arizona ( and
users of the ALGSP) which represent a range of in- situ crash conditions ( i. e.,
intersections, road segments, etc.) are first obtained. These data are used to determine
various shapes of distributions of crash means ( λ’ s). Gamma distributions are fit to
the observed data to reflect heterogeneity in site crash means. These gamma
distributed means are meant to reflect TRUTH, that is, the true state of underlying
safety at various locations on a transportation network ( note that in practice we do not
know TRUTH— and herein lies the power of simulation). The gamma distributed
means are denoted true Poisson means ( TPMs), and represent the means of crashes
across sites.
2) From TPMs, generate random Poisson samples. Thirty independent random numbers
for each simulated site are generated. For each of the 1000 sites, the TPM is used to
generate 30 crash counts that represent OBSERVED data for 30 different observation
periods, which are assumed to represent years in the analysis.
3) Evaluate HSID performance. By knowing the true state of safety for sites ( the TPMs),
and having observed data ( the randomly generated Poisson numbers), the
performance of HSID methods can be tested. The following steps are used to set up
the evaluation:
a) SR, CI, and EB are applied in separate simulation runs to rank sites for
improvement. These are applied by columns ( a single observation period, which
represents what an analyst might see in reality).
b) For the Bayesian runs, it is assumed that rows ( data across observation periods for
the same site) can also be used to represent the comparison group in order to
calculate E( x) and VAR( x). This implies that the analyst has accommodated for
covariates and is able to estimate an expected value for a site that accounts for
things such as exposure, geometrics, etc.
c) For the various hot spot thresholds, false positives, false negatives, and total
misidentifications are computed in percent. The percent of false positives will
always be larger than the percent of false negatives, since false positives are safe
sites identified as hazardous, drawn from a much larger candidate pool than the
hazardous sites; false negatives are hazardous sites identified as non-hazardous, a
relatively small pool of sites.
4) Evaluate effect of length of history. In the SR, CI, and EB methods the analyst must
decide how long a history to use for calculations. In this experiment the effect of
various accident histories (from 1 to 10 years of data) on performance is evaluated
based on the corresponding identification rates.
5) Make practical recommendations. The results of the previous steps are discussed and
translated into practical recommendations for improving the ALGSP software.
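The five steps above can be sketched end to end, scaled down: invented gamma parameters stand in for the fitted Arizona distributions, a simple Knuth sampler generates the Poisson counts, and simple ranking serves as the HSID method under test.

```python
import math
import random

random.seed(1)

def poisson(lam):
    """Knuth's multiplicative Poisson sampler (adequate for the moderate
    means used here; not suited to very large lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

# Step 1: true Poisson means (TPMs) drawn from a gamma distribution.
# The shape/scale values are invented; the report fits them to Arizona data.
n_sites = 200
tpms = [random.gammavariate(2.27, 13.4) for _ in range(n_sites)]

# Step 2: one observation period of "observed" crash counts per site.
observed = [poisson(lam) for lam in tpms]

# Step 3: apply simple ranking (SR) and score it against known TRUTH.
k = n_sites // 20  # flag the top 5% of sites
truth = set(sorted(range(n_sites), key=lambda i: tpms[i], reverse=True)[:k])
flagged = set(sorted(range(n_sites), key=lambda i: observed[i], reverse=True)[:k])

false_pos = len(flagged - truth)  # safe sites flagged as hazardous
false_neg = len(truth - flagged)  # truly hazardous sites missed
print(false_pos, false_neg, false_pos + false_neg)
```

In this scaled-down sketch false_pos equals false_neg by construction, because the flagged set and the truth set have the same size k; the report's percentages differ because they are normalized by pools of different sizes.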
Various aspects of the simulation experiment previously listed need to be discussed, as
the quality and design of the simulated data directly impacts the quality and
generalizability of the analysis results.
Generating Mean Crash Frequencies from Real Data
To support the development of simulated crash data, 6 years ( January 1995 through
December 2000) of crash counts from intersections in Apache, Gila, Graham, La Paz,
Pima, and Santa Cruz counties in the state of Arizona are used. These data and their
corresponding cumulative distributions are shown in Appendix A. Three types of
characteristically different underlying cumulative distributions of TPMs were observed in
the Arizona crash data: an exponential shape ( denoted E), a linear shape ( denoted L), and
an s- shape ( denoted S). In addition, two levels of heterogeneity in crash counts were
observed: low heterogeneity ( denoted 1) where the range in observed crash counts is less
than 20 crashes, and high heterogeneity ( denoted 2) where the range is in excess of 50
crashes.
Recall that the empirical distributions will be used to generate TRUTH, or the means of
Poisson counts of sites with varying underlying means. In this simulation study these are
denoted as TPMs. Since the data represent the true underlying safety of a site, crash
counts are Poisson distributed at an individual site, and the statistic is the mean.
The cumulative distributions used to represent the TPMs are labeled as E1, E2, L1, L2,
S1, and S2, respectively. For example, E2 represents an exponential shaped distribution
with high heterogeneity in TPMs. These six data sets were selected from various
jurisdictions within Arizona to try to represent the range of underlying characteristics
related to true accident count distributions, with the intent of making the results gained
from this experiment applicable across a variety of typical situations.
As stated previously, the observed data are used to inform the simulation of the TPMs. In
this experiment three reasonable assumptions are required to establish the foundation for
a successful simulation of crash data:
Assumption 1: The empirical cumulative distributions shown in Figures 12
through 17 ( see Appendix A) represent the TPMs of the underlying crash
process— thus the true safety of all sites in the collection of sites is known.
These data in reality are unknowable, since it is not known a priori which
sites are “ hazardous.”
Assumption 2: The theoretical distribution of the TPMs of the population of
sites follows a gamma distribution, and the probability that a randomly
selected site has a given number of accidents is approximated by the negative
binomial distribution.
Assumption 3: The TPMs provide the basis for generating observed crash
count data. Thus, for example, the median ranked site in Figure 12 ( E1)
that has an underlying Poisson mean of around six crashes ( per
observation period) is used to randomly generate a crash outcome, which
could be 0, 1, 2, 3, …. etc. in any given observation period.
The result of assumptions 1 and 2 is that for each simulated site the underlying TPM
( expected crash count) is known, which is then used to randomly generate the observed
crash count.
Generation of Random Poisson Samples from TPMs
The empirical cumulative TPMs shown in Figures 12 through 17 (see Appendix A) represent the
data required to meet Assumption 1 discussed previously. Using these data, observed
crash counts are generated to represent observed data for a given observation period.
However, due to the relatively small observed sample sizes (fewer than 200 sites in all six
datasets) and the corresponding dispersion of crash counts, in some cases no sites would
be identified as hazardous using the three HSID methods stated previously. For
example, if the top 1% of sites are identified as high risk (δ = 0.99), all the sites in the
datasets labeled L1, S1, and L2 would be identified as safe under the classical
confidence interval method and the Bayesian method, thus leading to zero false negatives
in these scenarios and weakening the reliability of the results to some degree.
To solve this problem and provide sufficient sample sizes for statistical comparisons,
theoretical distributions of TPMs are fitted to the six datasets. Then the sample sizes are
enlarged by randomly generating the required number of sites under these gamma
distributions ( site specific crash means are gamma distributed whereas within- site crashes
are Poisson distributed). In this experiment, 1,000 sites are simulated. Fitting specific
gamma distributions to a given sequence of data can be implemented through various
software packages, such as MINITAB, SAS 8.1 ( 1998), and Arena 7.0 ( Kelton, 2003).
Herein, Arena 7.0 is employed. Within the context of Arena, the curve fitting is based
on the use of maximum likelihood estimators, and the quality of a curve fit is based
primarily on the square error criterion. The fitting of the probability density function (PDF)
of a gamma distribution to the observed data is based on the histogram plot of these data.
The distribution summary report also presents the expression of the fitted probability
density function, the corresponding p-value of the Chi-Square test, the square error, etc.
Figure 1 shows one example of fitting a gamma distribution to a dataset. To show the fitting
effect, the corresponding theoretical cumulative distribution function ( CDF) is also
plotted in the same graph of empirical CDF ( Figure 2 shows the distribution of dataset
E1). The figures show that the gamma distribution fits well to the observed data. The
summary of all six fittings is shown in Table 1.
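Arena fits the gamma parameters by maximum likelihood; as a rough stand-in, a method-of-moments fit (shape = mean²/variance, scale = variance/mean) can be sketched in Python on synthetic data, ignoring the location shift (e.g., the 3.5 offset in Table 1):

```python
import random

random.seed(3)

# Synthetic "observed" values drawn from a known gamma so the fit can be
# checked; in practice the input would be the Arizona crash data.
true_shape, true_scale = 2.27, 13.4
data = [random.gammavariate(true_shape, true_scale) for _ in range(5000)]

n = len(data)
mean = sum(data) / n
var = sum((x - mean) ** 2 for x in data) / (n - 1)

shape_hat = mean * mean / var  # method-of-moments gamma shape
scale_hat = var / mean         # method-of-moments gamma scale
print(round(shape_hat, 2), round(scale_hat, 2))
```

With 5,000 draws the recovered parameters land close to the true (2.27, 13.4); a maximum-likelihood fit, as Arena performs, would typically be somewhat more efficient.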
Distribution Summary
Distribution: Gamma
Expression: 3.5 + GAMM ( 13.4, 2.27)
Square Error: 0.020052
Chi Square Test Results
Number of intervals = 8
Degrees of freedom = 5
Test Statistic = 8.2
Corresponding p- value = 0.16
Data Summary
Number of Data Points = 94
Min Data Value = 4
Max Data Value = 70
Sample Mean = 33.8
Sample Std Dev = 16.7
Histogram Summary
Histogram Range = 3.5 to 70.5
Number of Intervals = 67
Figure 1: Observed and Fitted PDF of E1 Crash Data and Fit Summary Statistics
[Figure: empirical and fitted cumulative distributions (percent) versus accident counts for dataset E1]
Figure 2: Fitted and Empirical CDF of E1
Table 1: Summary of Gamma Fittings of Six Datasets
Data set Fitting Expression Square Error Test Statistic p- value
E1 0.5+ Gamm( 3.79,1.75) 0.022344 26 < 0.005
E2 1.5+ Gamm( 15.9,1.7) 0.011836 13.4 0.0385
L1 0.5+ Gamm( 4.31,1.71) 0.038173 11.1 0.0119
L2 3.5+ Gamm( 13.4,2.27) 0.020052 8.2 0.16
S1 0.5+ Gamm( 2,4.3) 0.014903 33.5 < 0.005
S2 0.5+ Gamm( 9.06,2.57) 0.013211 23 < 0.005
Note: E— Exponential shape; L— Linear shape; S— Sigmoidal shape; 1— Low heterogeneity of crash
counts; 2— High heterogeneity of crash counts.
After the TPMs have been simulated (the crash means across sites, which reflect the true
and typically unknown state of nature), the next step is to generate observed crash counts
for the sites. These counts represent the observed crash counts across observation
periods for a particular site (whose true safety is known). It is well established that
crash counts fluctuate across observation periods as a result of the randomness inherent
in the underlying crash process, and that this process is well approximated by a Poisson
process. In other words, the count of crashes changes from one period to another even if
driver demography, traffic flow, road conditions, weather, and the like remain unchanged.
To represent this natural fluctuation, a random sample of 30 observation periods (which
could be months, years, etc.) is generated for each location using a random number
generator and the underlying TPMs defined by the fitted distributions in Figures 12
through 17 (see Appendix A). A small snapshot of the data obtained from this simulation
is shown in Table 2.
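The two-stage data generation just described, drawing a true Poisson mean per site from a fitted TPM distribution and then drawing Poisson counts for each observation period, can be sketched as follows. This is an illustrative reconstruction, not the authors' code; it uses the L2 expression 3.5 + GAMM(13.4, 2.27) from Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_periods = 1000, 30

# Stage 1: true Poisson means (TPMs), one per site, drawn from a fitted
# shifted gamma distribution (here the L2 expression from Table 1).
tpm = 3.5 + rng.gamma(shape=2.27, scale=13.4, size=n_sites)

# Stage 2: observed crash counts fluctuate around each site's TPM as a
# Poisson process across the 30 observation periods.
counts = rng.poisson(lam=tpm[:, None], size=(n_sites, n_periods))
print(counts.shape)  # one row of simulated counts per site
```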
Table 2: Simulated Data for 30 Sites and 16 Observation Periods
SITE TPM PERIOD
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 4 5 1 4 1 2 7 4 3 4 4 2 1 1 5 5 6
2 8 5 9 8 6 8 4 9 9 5 4 8 8 9 9 13 8
3 8 12 7 10 5 5 7 11 8 8 8 11 6 6 7 8 7
4 9 12 9 10 16 8 12 7 9 11 8 10 8 16 11 6 8
5 9 10 13 12 8 9 6 12 10 9 9 4 5 12 11 11 4
6 10 15 4 6 10 4 17 6 11 12 7 10 10 15 6 17 10
7 10 8 5 10 8 13 10 11 7 12 10 8 9 9 6 9 10
8 10 7 8 11 14 10 12 7 11 12 11 12 13 7 7 7 11
9 12 13 17 8 14 12 10 16 10 7 15 17 9 11 15 14 15
10 12 10 9 13 13 6 12 18 11 15 12 12 12 13 12 13 9
11 12 9 10 10 14 15 12 7 14 6 12 11 19 9 17 10 18
12 12 11 14 14 9 16 7 15 3 10 13 9 11 7 2 12 14
13 12 15 15 16 13 8 12 13 16 16 12 15 11 15 12 14 9
14 12 14 10 10 11 15 15 12 13 14 15 13 14 11 13 17 19
15 12 11 12 12 8 12 13 12 7 9 11 9 9 9 12 4 9
16 13 8 17 13 8 12 11 17 15 16 13 12 15 16 12 14 19
17 13 9 13 16 16 11 8 6 18 12 8 7 11 12 12 17 15
18 13 10 18 15 16 10 15 10 16 17 10 6 8 8 10 13 6
19 13 14 13 17 11 6 11 18 15 11 17 16 19 13 11 15 14
20 13 7 4 13 11 12 10 17 19 6 7 12 15 7 15 14 12
21 14 16 17 12 18 13 17 12 11 7 13 15 10 18 14 17 19
22 15 15 18 21 15 15 14 13 21 14 13 20 13 12 19 16 16
23 15 11 13 16 12 12 16 10 16 19 20 21 16 13 19 11 16
24 15 9 16 16 11 14 12 15 18 11 16 14 29 11 12 19 14
25 16 18 12 15 9 19 18 14 11 19 15 18 14 18 18 14 20
26 17 22 10 19 12 15 19 18 10 11 17 20 16 15 11 10 15
27 18 14 21 9 19 16 17 19 18 18 14 16 28 19 18 19 10
28 18 8 20 19 5 16 18 20 28 16 17 19 14 15 14 18 15
29 19 26 19 18 21 17 29 12 22 25 15 23 11 19 20 15 24
30 20 22 18 23 21 23 19 26 22 16 20 19 15 14 19 13 15
Note: SITE = site number, e.g. intersection, road segment, etc.; TPM = true underlying safety of the site,
or Poisson mean; SIMULATED DATA = observed crash count in each observation period; shaded cells
represent the 'truly hazardous' locations (the sites with TPMs of 19 and 20).
Table 2 shows 16 simulated observation periods for 30 sites, with TPMs given in the
second column from the left. For example, the two sites with TPMs of 19 or more crashes
per observation period may be identified a priori as hazardous, since the TPMs reflect the
true underlying state of nature. The two sites in the shaded cells are hot spots, whereas
the 28 sites above the shaded area are 'safe.' In any given observation period, such as
observation period 5, two of the 30 sites recorded 19 or more crashes: one was a truly
hazardous site (the site with TPM 20) and one was not (the site with TPM 16, a false
positive). In observation period 5 there was also a false negative, since the truly
hazardous site with TPM 19 revealed only 17 crashes.
Thus, by simulating a large number of observation periods (30), characterized by
different TPM cumulative distribution shapes, and a large number of sites (1,000) for
each of the six observed crash distributions, the numbers of false negatives and false
positives (their sum is called false identifications) can be counted for each of the three
HSID methods described previously.
Performance Evaluation Results for HSID Methods
Given knowledge of the three HSID methods, the ground rules for the simulation
experiment, and an explanation of how the data were simulated, the three HSID methods
were applied to the simulated data to evaluate their relative effectiveness at identifying
hot spots. Establishing fair comparisons among the different HSID methods is
paramount. To compare the performance of the HSID methods objectively, equivalent
evaluation criteria must be used. One consideration in this regard is the choice of δ, the
cutoff level used to establish hazardous locations. Three values of δ are employed in the
evaluations, 0.90, 0.95, and 0.99, corresponding to the top 10%, 5%, and 1% of all sites
respectively. In practice, this corresponds with the amount of resources available for
remediation and the number of similar sites being compared. For example, a local
government wanting to remediate hot spot signalized intersections (where 75 such
intersections exist) might fix 7 intersections, or the top 10% (δ = 0.90).
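The mapping from δ to the number of flagged sites is simple arithmetic, the top (1 − δ) fraction of the ranked list; the following helper (illustrative only, not from the report) reproduces the example:

```python
import math

def sites_to_treat(n_sites: int, delta: float) -> int:
    """Number of top-ranked sites flagged at cutoff level delta."""
    return math.floor(n_sites * (1.0 - delta))

print(sites_to_treat(75, 0.90))  # the report's example: 7 of 75 intersections
```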
All parameters of the simulation experiment have now been described. They include the
shapes of the TPMs (E, S, and L), the levels of heterogeneity in the TPMs (1 and 2), and
the levels of δ (0.90, 0.95, and 0.99). Three HSID methods are assessed: SR, CI, and EB.
Evaluation criteria include the percent of false positives (FP), the percent of false
negatives (FN), and the combined percentage of FPs and FNs, called false identifications
(FI). For all of the simulations, sample sizes were 1,000 for TPMs and 30 for observation
periods.
To conduct the simulation experiment with these parameters, the following steps were
undertaken:
1. All the TPM cumulative distributions are divided into truly hazardous locations and
non-hazardous locations, using thresholds of 0.90, 0.95, and 0.99 to represent
different data separation thresholds. This step yields three "critical" crash count
threshold values, CC0.90, CC0.95, and CC0.99, for each combination of cumulative TPM
shape and heterogeneity level. These values distinguish the known truly hazardous
locations from the safe locations.
2. The three different HSID methods are used to identify hot spots in the simulated
data. Specifically, the SR method simply ranks observed frequencies as shown in
Table 2, the CI method uses the overall sample mean and standard deviation to
determine confidence intervals for ranking, and the EB method ranks sites by a
weighted average of crash history and observed frequency computed from gamma
distribution parameters.
3. The simulated crash data are then compared to the values CC0.90, CC0.95, and CC0.99.
For the truly hazardous sites, an FN is produced whenever a randomly generated
crash count falls below the critical value; that is, a truly hazardous site generated an
observed crash count lower than the critical crash count. Similarly, for the collection
of non-hazardous locations, an FP is produced whenever the simulated count exceeds
the critical value. FPs and FNs are simply counted for each simulation run, and the
number of FIs is the sum of the numbers of false negatives and false positives.
4. To make the three performance metrics comparable across simulations, the
percentages of FNs, FPs, and FIs are calculated. Because FNs are truly hazardous
observations mistaken as "safe," their percentage is the number of simulated FNs
divided by the number of truly safe observations; similarly, the FP percentage is the
number of FPs divided by the number of truly hazardous observations. Finally, the FI
percentage is obtained by dividing the sum of FNs and FPs by the total number of
randomly generated observations. For example, suppose there are 20 sites under
inspection, with the top five identified as hot spots according to their TPMs. With 30
simulated observations per site, there are 150 truly hazardous observations and 450
truly safe ones. If 45 of the 150 truly hazardous observations are wrongly viewed as
safe, the percent FN would be 45/450 * 100% = 10%.
5. Finally, the percentages of FPs, FNs, and FIs across the simulation conditions are
tallied and reported.
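Steps 1 through 4 can be sketched as follows. This is an illustrative reconstruction, not the report's code: a quantile of the simulated TPMs stands in for the critical count CC, a simple threshold on observed counts stands in for the ranking step, and the error percentages follow the denominators defined in step 4.

```python
import numpy as np

def error_rates(counts, tpm, cc):
    """Percent FN, FP, and FI for one simulation run.

    counts: (n_sites, n_periods) simulated crash counts
    tpm:    (n_sites,) true Poisson means
    cc:     critical crash count separating hazardous from safe
    Denominators follow step 4: FN% over truly safe observations,
    FP% over truly hazardous observations, FI% over all observations.
    """
    hazardous = tpm >= cc                 # truly hazardous sites
    flagged = counts >= cc                # observations at or above cc
    fn = np.sum(~flagged[hazardous])      # hazardous site, count looked safe
    fp = np.sum(flagged[~hazardous])      # safe site, count looked hazardous
    n_periods = counts.shape[1]
    n_haz = hazardous.sum() * n_periods
    n_safe = (~hazardous).sum() * n_periods
    return (100.0 * fn / n_safe,
            100.0 * fp / n_haz,
            100.0 * (fn + fp) / counts.size)

# Illustrative run: 20 sites, 30 periods, cc taken as the 0.90 quantile
# of the TPMs (a stand-in for CC0.90).
rng = np.random.default_rng(1)
tpm = 3.5 + rng.gamma(2.27, 13.4, size=20)
counts = rng.poisson(tpm[:, None], size=(20, 30))
fn_pct, fp_pct, fi_pct = error_rates(counts, tpm, cc=np.quantile(tpm, 0.90))
print(fn_pct, fp_pct, fi_pct)
```

Note that under these definitions the FP percentage can exceed 100%, which is exactly the behavior seen for the CI method in Tables 3 and 4.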
Tables 3 and 4 summarize the errors (FNs, FPs, and FIs) produced under the variety of
simulation conditions. Table 3 presents the results when heterogeneity of crash counts is
relatively low, while Table 4 presents the results when heterogeneity is relatively high.
Critical crash count threshold values increase from left to right in both tables. The runs
labeled CI, SR, and EB refer to the classical confidence interval, simple ranking, and
empirical Bayes methods of HSID respectively. Finally, L, S, and E refer to the
characteristic shapes of the cumulative distributions of the TPMs: linear, s-shaped, and
exponential respectively.
For the low and high heterogeneity simulations, the trends of the percent errors with
increasing δ agree with each other; however, the percent errors for low heterogeneity are
much higher than those for high heterogeneity. The most likely reason is that the low
heterogeneity datasets have relatively small standard deviations compared with the other
datasets. The small range of crash counts in a dataset makes it more difficult to identify
hazardous locations. Conversely, it is easier to identify hot spots when the crash counts
are widely dispersed, particularly when the dispersion is large in the uppermost crash
count deciles.
Another prominent characteristic of both tables is that the percentage of false negatives
decreases as δ increases for all three HSID methods. In most cases the percentage of
false negatives is substantially reduced by the EB method. The explanation is somewhat
involved. The threshold divides the top 'outlying' crash counts, the top 10%, 5%, or 1%
of observed counts, from the remainder of the data. By definition these counts are more
likely to suffer from regression to the mean in a subsequent observation period than
counts near the TPM. The crash history of the top x% of sites therefore tempers the
influence of the current crash count when ranking those sites. As a result, sites that
suffer less from regression to the mean are ranked higher in the list, including sites that
would otherwise have become false negatives.
Conversely, the decrease in the percentage of false negatives is accompanied by an
increase in the percentage of false positives (except at δ of 0.95 for L1 and L2, where the
FP percent error under the confidence interval method is the smallest among the three
threshold values). This shows that stricter identification criteria select fewer
non-hazardous sites for remedy, although they may leave a larger number of truly
hazardous locations undetected. Notably, the false identifications move in the same
direction as the false negatives as δ increases. The most plausible explanation for this is
that the relatively small number of false negatives is offset by additional false positives,
which reduce the efficiency of local governments' investment. In summary, the percent
of false positives increases as the threshold rises, whereas the percents of false negatives
and false identifications decrease; nearly all simulation scenarios share these trends.
Table 3: Percent Errors for Low Heterogeneity in Crash Counts

Percent Errors: Low Heterogeneity
            δ = 0.90                 δ = 0.95                 δ = 0.99
Method   CI      SR     EB        CI       SR     EB        CI      SR     EB
E1  FN   2.49    3.55   2.40      1.54     2.09   1.41      0.63    0.55   0.38
    FP   62.76   31.97  21.63     82.47    39.73  26.87     114.32  54.00  37.67
    FI   7.17    6.39   4.33      5.31     3.97   2.69      2.46    1.08   0.75
L1  FN   2.21    4.44   2.91      1.39     2.40   1.73      0.15    0.62   0.45
    FP   106.14  39.97  26.20     65.24    45.67  32.80     431.62  61.00  45.00
    FI   8.75    7.99   5.24      3.62     4.57   3.28      2.10    1.22   0.90
S1  FN   0.54    6.53   5.28      0.21     3.48   2.90      0.00    0.81   0.73
    FP   753.44  58.73  47.50     1251.33  66.20  55.13     NA      80.33  72.33
    FI   10.03   11.75  9.50      6.46     6.62   5.51      1.91    1.61   1.45

Note: 1. FN = False Negatives; FP = False Positives; FI = False Identifications.
2. Some FPs exceed 100% because of the non-normality of the distribution and the setting of the
threshold; in these cases the CI method identifies more hazardous locations than truly exist. For the same
reason, the "NA" entry reflects zero truly hazardous locations identified by confidence analysis.
3. The shaded cells show the lowest identification error rate.
Table 4: Percent Errors for High Heterogeneity in Crash Counts

Percent Errors: High Heterogeneity
            δ = 0.90               δ = 0.95               δ = 0.99
Method   CI     SR     EB       CI     SR     EB       CI     SR     EB
E2  FN   1.78   2.09   1.13     1.33   1.33   0.86     0.39   0.26   0.17
    FP   24.37  18.77  10.13    32.56  25.33  16.40    57.07  26.00  16.67
    FI   4.13   3.75   2.03     3.34   2.53   1.64     1.54   0.52   0.33
L2  FN   1.89   2.55   1.57     1.50   1.43   0.91     0.44   0.37   0.23
    FP   36.33  22.93  14.13    32.20  27.20  17.33    45.22  36.67  22.67
    FI   5.14   4.59   2.83     3.40   2.72   1.73     1.29   0.73   0.45
S2  FN   2.16   2.73   1.74     1.17   1.31   0.71     0.47   0.26   0.12
    FP   34.80  24.53  15.67    41.08  24.87  13.47    38.37  25.33  12.33
    FI   5.16   4.91   3.13     3.31   2.49   1.35     1.32   0.51   0.25

Note: 1. FN = False Negatives; FP = False Positives; FI = False Identifications.
2. The shaded cells show the lowest identification error rate.
There are also some differences among the percent errors produced by the three
identification methods. Compared with the two traditional methods, the Bayesian
technique yields fewer false negatives in most cases in both tables. That is, the Bayesian
technique is more efficient at flagging the sites that require further analysis. This higher
efficiency, however, comes at the cost of a substantial number of false positives, which
reduce the efficiency of local governments' investment. Only under budgetary
constraints might the false positives not result in unneeded repairs of locations that are
not truly hazardous. As for the confidence interval method and the simple ranking
method, there is little difference between them. Both generally produce higher
identification error rates than the Bayesian method, indicating relatively worse
performance in identifying hazardous locations.
EXPERIMENT FOR OPTIMIZING DURATION OF CRASH HISTORY
May (1964) first addressed the question of how many years of accident data should be
analyzed when determining accident-prone locations. He explored the differences among
averages of accident counts as "t" increased up to 13 years. The results showed that the
difference diminishes as "t" increases and that the marginal benefit of increasing "t"
declines. The "knee" of the curve is said to occur at t = 3 years. Based on that
information, May concluded that "there is little to be gained by using a longer study
period than three years."
In this experiment, a different logic is employed to explore the best study duration for
accident data analysis. Instead of using the simple accident counts of the method
presented by May, this experiment uses the identification error rate as the indicator: the
identification error rates associated with various "t" years are compared to obtain the
optimum study period. When conducting the history analysis, the three identification
methods are again employed, and the corresponding processes remain the same. The
only difference lies in how the different periods of data are used. To show the logic
clearly, another small snapshot is presented (Table 5). First, the ith column of data is
assumed to represent the accident data of the ith current year. For example, for site 9, the
first four data points represent the accident counts during the four current years, and the
remaining data in the first four columns can be viewed as the accident counts of other
similar sites during the same period. Consider conducting the Bayesian analysis. For a
given t-year period, Equation 24 is used at each site to compute the corresponding
expected accident count. However, since the TPM represents the long-term number of
accidents per year, the average accident count per year over the t-year period should be
used in this equation. At the end of the fourth year, the "x" for site 10 is 14 accidents (the
average of the first 4 data points), E{λ} = 12.88 accidents (the row average), VAR{λ} =
5.18 accidents² (the row variance), and α = 0.713; thus the expected accident count for
site 10 using the first 4 years of data is 13.2 accidents. Across the 16 observation
periods, 13 such Bayesian expected values can be generated for site 10 using a 4-year
history record. Based on these Bayesian expected accident counts for the various sites,
the previously stated Bayesian process can then be employed to compute the percents of
false negatives, false positives, and false identifications for different "t" years. The same
history analysis logic also applies to the other two identification methods. Because of the
large amount of iterative computation in this experiment, special computer code was
written to calculate the identification error rates associated with the different periods of
accident data.
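The worked example for site 10 can be checked numerically. Equation 24 itself is not reproduced in this section, so the sketch below assumes the standard empirical Bayes form E = α·E{λ} + (1 − α)·x with α = E{λ}/(E{λ} + VAR{λ}); this assumed form does reproduce the report's figures.

```python
import statistics

# Site 10's sixteen simulated observation periods from Table 5.
site10 = [16, 14, 15, 11, 10, 12, 15, 9, 15, 15, 13, 11, 11, 16, 12, 11]

e_lambda = statistics.mean(site10)          # row mean: 12.88 accidents
var_lambda = statistics.variance(site10)    # sample variance: 5.18 accidents^2
alpha = e_lambda / (e_lambda + var_lambda)  # weight on the prior mean: 0.713

# Average count per year over the first four "current" years.
x = statistics.mean(site10[:4])             # 14 accidents

eb_estimate = alpha * e_lambda + (1 - alpha) * x
print(round(eb_estimate, 1))  # 13.2, matching the text
```

Sliding this 4-year window across the 16 periods yields the 13 Bayesian expected values mentioned above.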
Table 5: Snapshot of the Simulated Data
Site TPM Simulated data
1 3 7 3 4 3 2 1 2 3 3 4 2 3 3 4 3 2
2 3 3 5 5 2 3 1 1 4 2 2 1 2 2 7 4 5
3 5 5 7 6 5 5 6 4 4 3 4 7 4 2 4 7 2
4 7 4 6 5 9 4 6 7 4 8 10 13 6 9 7 7 3
5 8 8 6 8 6 9 9 12 7 2 3 8 11 7 5 7 7
6 9 15 10 16 12 12 8 8 6 9 12 18 15 9 7 12 8
7 9 9 10 12 8 11 5 8 9 13 9 10 12 7 7 8 5
8 12 12 5 11 18 12 12 16 12 7 10 13 10 9 11 9 13
9 13 13 13 12 10 12 12 13 14 11 7 14 13 7 16 18 7
10 14 16 14 15 11 10 12 15 9 15 15 13 11 11 16 12 11
11 15 17 15 13 15 13 13 16 16 13 11 18 14 9 12 22 18
12 16 18 19 20 11 7 14 12 10 16 18 14 17 9 15 19 18
In theory, as "t" increases, the expected accident count of each site, computed from the
simulated data, converges to its TPM (because in the experiment each row of simulated
data strictly follows a Poisson distribution), and the corresponding identification error
rate converges to zero. In a real situation, however, as "t" increases each site is subject to
more influential factors, so a long period of data generally cannot represent the current
situation. On the other hand, if a short period of data is used, much information is lost
and it is difficult to obtain the true long-term accident counts. Consequently, a trade-off
must be made to find a study period short enough to represent the current condition yet
long enough to recover the true expected accident counts. In this experiment, the various
identification rates are plotted against the different "t" years, and the "knee" of such a
curve is taken as the optimum study period.
Data older than 10 years is considered to no longer reflect the current situation. In the
experiment, the 30 columns of simulated data are therefore divided evenly into 3 groups:
the first 10 columns belong to group 1, the eleventh through twentieth columns to group
2, and the last 10 columns to group 3. The common characteristics shared by the three
groups are assumed to reflect the true relation between identification error rate and "t"
years. For each group, the three common confidence levels, 90%, 95%, and 99%, are
used for the three analyses.
In the diagram of identification error rate vs. "t" years, some fluctuation remains along
the curve, although in general the identification error rate decreases as "t" increases. To
determine and eliminate the initial "warm-up" period (i.e., the period before the knee of
the curve), Welch's moving average method (Kelton, 2003) is utilized. Through the
moving average, this method filters out the statistical fluctuations in the observations (yi)
and clearly reveals the "warm-up" period. As shown in Figure 3, series 1 represents the
original FN rates associated with different "t." Because of two outliers (the points at t = 4
and t = 6), it is difficult to locate the "knee" of the curve directly. From series 2 (the
curve of moving averages), however, it is readily seen that a 5-year range is the best
study period.
[Plot: FN(%) on the vertical axis (8 to 18) vs. t = 1 to 10 on the horizontal axis; Series 1 = original FN rates, Series 2 = moving averages]
Figure 3: Moving Averages vs. Original Statistic
The moving average Ȳi(w) (where w is the window size and m the number of
observations) of the random observations yi is defined as follows:

  Ȳi(w) = (y_(i-w) + ... + y_i + ... + y_(i+w)) / (2w + 1),   for i = w+1, ..., m-w
  Ȳi(w) = (y_1 + ... + y_i + ... + y_(2i-1)) / (2i - 1),      for i = 1, ..., w      (25)
In this experiment, the window size is selected as 1.
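Equation 25 can be implemented directly. The FN-rate series below is hypothetical (the actual values plotted in Figure 3 are not recoverable from the text); it simply illustrates the smoothing with w = 1.

```python
def welch_moving_average(y, w):
    """Welch's moving average: a centered window of half-width w, with the
    window shrunk near the start so point i averages over 2i-1 observations."""
    m = len(y)
    out = []
    for i in range(1, m - w + 1):         # 1-based index i, as in Equation 25
        if i <= w:
            window = y[:2 * i - 1]        # observations 1 .. 2i-1
        else:
            window = y[i - 1 - w:i + w]   # observations i-w .. i+w
        out.append(sum(window) / len(window))
    return out

# Hypothetical FN(%) values for t = 1..10, echoing the shape of Figure 3.
fn_rates = [17, 12, 11, 16, 9, 14, 9, 10, 9, 8]
smoothed = welch_moving_average(fn_rates, w=1)
print(smoothed)
```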
RESULTS
As in the previous experiment, the three HSID methods are applied to explore the
optimal duration of accident history. The frequencies of the various optimal "t" values
across the three confidence levels and three groups are shown in Tables 6 through 8. For
convenience, the frequencies of the various t-periods for the different confidence levels
and groups are plotted in Figures 4 through 6, and the cumulative results over all
confidence levels and groups are shown in Figures 7 and 8. Readers interested in the
detailed identification error rates associated with the various HSID methods, confidence
levels, and groups are referred to Appendix B.
Table 6: The Number of t-years at the "Knee" of the Curve for Group 1

Year   1   2   3   4   5   6   7   8   9  10
90%    0   1  22  13   6   8   2   2   0   0
95%    1   1  23  10   8   7   2   2   0   0
99%    0   2  20   8  10   6   4   3   1   0
SUM    1   4  65  31  24  21   8   7   1   0

Note: In this group there are 162 scenarios (3 identification methods, 3 kinds of shapes, low and high
heterogeneity of crash counts, 3 threshold values for truly hazardous locations, and 3 kinds of false
identifications: FN, FP, FI).
Table 7: The Number of t-years at the "Knee" of the Curve for Group 2

Year   1   2   3   4   5   6   7   8   9  10
90%    2   0  28  10   4   5   3   1   1   0
95%    0   3  21  11   7   6   4   2   0   0
99%    0   1  27   9   5   7   2   3   0   0
SUM    2   4  76  30  16  18   9   6   1   0

Note: In this group there are 162 scenarios (3 identification methods, 3 kinds of shapes, low and high
heterogeneity of crash counts, 3 threshold values for truly hazardous locations, and 3 kinds of false
identifications: FN, FP, FI).
Table 8: The Number of t-years at the "Knee" of the Curve for Group 3

Year   1   2   3   4   5   6   7   8   9  10
90%    0   1  22  14   6   5   2   1   1   0
95%    2   2  20   7   7   8   3   4   1   0
99%    0   3  27  11   5   5   4   1   0   0
SUM    2   6  69  32  18  18   9   6   2   0

Note: In this group there are 162 scenarios (3 identification methods, 3 kinds of shapes, low and high
heterogeneity of crash counts, 3 threshold values for truly hazardous locations, and 3 kinds of false
identifications: FN, FP, FI).
[Chart: number of scenarios (0-80) vs. t-year (1-9) for Groups 1, 2, and 3]
Figure 4: The Number of t-years at the "Knee" of the Curve for the 90% Confidence Level
[Chart: number of scenarios (0-70) vs. t-year (1-9) for Groups 1, 2, and 3]
Figure 5: The Number of t-years at the "Knee" of the Curve for the 95% Confidence Level
[Chart: number of scenarios (0-80) vs. t-year (1-9) for Groups 1, 2, and 3]
Figure 6: The Number of t-years at the "Knee" of the Curve for the 99% Confidence Level
[Chart: number of scenarios (0-250) vs. t-year (1-9) for Groups 1, 2, and 3]
Figure 7: The Number of t-years at the "Knee" of the Curve for All Confidence Levels
[Chart: cumulative percent (0-120) vs. t-year (0-9)]
Figure 8: The Cumulative Percent Distribution of Various t-years
Figures 7 and 8 show that, across all the simulation scenarios, a 3-year crash history
represents the largest share of "best" study periods, and 3 through 6 years make up
almost 90% of all the optimum t-years. Hence, as the trade-off between long and short
history records, if there is no significant physical change at the location under scrutiny
and a long history record can be obtained, the most recent 6 years of crash records are
suggested as sufficient to capture the majority of the beneficial effect of crash history. In
contrast, 3 years of crash history data represents the shortest period that should be used
while still achieving a significant benefit from crash history (under most general
conditions). Crash histories of 1 and 2 years provide relatively little benefit for the
methods and under the range of conditions assessed.

To illustrate the improvement in identification performance resulting from the use of
3 years of history data, Tables 9 and 10 are provided (in contrast to Tables 3 and 4). The
difference is that Tables 3 and 4 use 1 year of crash data, with the percent identification
rates computed from the last 30 years of data, whereas Tables 9 and 10 use 3 years of
data, with the corresponding percent identification rates calculated from the current 10
years of data.
Table 9: Percent Errors for Low Heterogeneity in Crash Counts (3 Years Data)

Percent Errors: Low Heterogeneity
            δ = 0.90                δ = 0.95                δ = 0.99
Method   CI      SR     EB       CI      SR     EB       CI      SR     EB
E   FN   2.02    2.32   1.53     1.36    1.34   0.82     0.89    0.40   0.25
    FP   28.06   20.88  13.75    38.60   25.50  15.50    48.56   40.00  25.00
    FI   4.68    4.18   2.75     3.69    2.55   1.55     2.13    0.80   0.50
L   FN   2.56    2.75   2.13     1.69    1.72   1.25     0.47    0.51   0.40
    FP   33.16   24.75  19.13    50.00   32.75  23.75    91.07   50.00  40.00
    FI   5.56    4.95   3.83     4.33    3.28   2.54     0.14    0.67   0.53
S   FN   1.10    4.88   4.33     0.68    2.88   2.54     0.14    0.67   0.53
    FP   228.21  43.88  39.00    239.38  54.75  48.25    362.16  66.25  52.50
    FI   9.05    8.78   7.80     5.45    5.48   4.83     1.81    1.33   1.05

Note: 1. FN = False Negatives; FP = False Positives; FI = False Identifications; CI = Confidence Interval;
SR = Simple Ranking; EB = Empirical Bayes; E = Exponential Shape; L = Linear Shape; S = Sigmoidal
Shape.
2. Some FPs exceed 100% because of the non-normality of the distribution and the setting of the
threshold; in these cases the CI method identifies more hazardous locations than truly exist. For the same
reason, an "NA" entry reflects zero truly hazardous locations identified by confidence analysis.
3. The shaded cells show the lowest identification error rate.
Table 10: Percent Errors for High Heterogeneity in Crash Counts (3 Years Data)

Percent Errors: High Heterogeneity
            δ = 0.90               δ = 0.95               δ = 0.99
Method   CI     SR     EB       CI     SR     EB       CI     SR     EB
E   FN   1.08   1.28   0.67     0.96   0.95   0.71     0.24   0.14   0.10
    FP   13.96  11.50  6.00     15.32  18.00  13.50    34.66  13.75  10.00
    FI   2.51   2.30   1.20     1.98   1.80   1.35     1.00   0.28   0.20
L   FN   1.72   1.63   1.36     1.19   0.96   0.87     0.41   0.21   0.20
    FP   14.37  14.63  12.25    15.07  18.25  16.50    20.11  21.25  18.25
    FI   3.08   2.93   2.45     2.14   1.83   1.65     0.86   0.43   0.38
S   FN   2.10   2.04   1.65     0.70   0.66   0.55     0.40   0.15   0.10
    FP   18.01  18.38  14.88    20.83  12.50  10.50    21.03  15.00  10.00
    FI   3.73   3.68   2.98     1.85   1.25   1.05     0.90   0.30   0.20

Note: 1. FN = False Negatives; FP = False Positives; FI = False Identifications; CI = Confidence Interval;
SR = Simple Ranking; EB = Empirical Bayes; E = Exponential Shape; L = Linear Shape; S = Sigmoidal
Shape.
2. The shaded cells show the lowest identification error rate.
A comparison of these tables shows that using 3 years of crash history data results in
significant improvements in error rates for all three methods, CI, SR,