Methodology for Computing GTG Performance Statistics
The GTG is the operational version of the Integrated Turbulence
Forecasting Algorithm (ITFA), which has been verified over several
years. Long-term statistics on ITFA's performance are available on the Real-Time
Verification System (RTVS). The statistics presented on the GTG
performance page were obtained from the RTVS.
ITFA diagnoses are verified using Yes and No pilot reports
(PIREPs) of turbulence conditions. Only forecasts and PIREPs located at
altitudes of 20,000 ft and above are considered in the performance
statistics, since the current version of GTG is only intended to forecast
Clear-Air Turbulence (CAT) at these altitudes. These reports and the GTG
forecasts are used to compute three basic statistics: PODy
(probability of detection of Yes PIREPs), PODn (probability of
detection of No PIREPs), and the % Volume covered by
a Yes forecast. Only PIREPs indicating moderate or greater turbulence severity
are used to compute PODy and only PIREPs that explicitly state "turbulence
negative" or smooth are used to compute PODn.
PODy can be interpreted as the proportion of Yes reports
that are correctly classified as having turbulence conditions. PODn
is the proportion of No reports that are correctly classified as not having
turbulence conditions. Thus, 1-PODn can be interpreted as the
proportion of negative reports that are incorrectly classified. The
% Volume is the percentage of the airspace at 20,000 ft
and above covered with a Yes turbulence forecast.
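The three statistics defined above can be sketched in a few lines of Python. This is only an illustration, not the RTVS implementation: the function names are hypothetical, each PIREP is assumed to have already been matched to the Yes/No forecast at its location, and % Volume is approximated by assuming that every grid point represents an equal volume of airspace.

```python
from typing import Sequence


def pod_yes(forecast_at_yes_pireps: Sequence[bool]) -> float:
    """PODy: fraction of Yes PIREPs that fall inside a Yes forecast."""
    return sum(forecast_at_yes_pireps) / len(forecast_at_yes_pireps)


def pod_no(forecast_at_no_pireps: Sequence[bool]) -> float:
    """PODn: fraction of No PIREPs that fall inside a No forecast."""
    correct = sum(not f for f in forecast_at_no_pireps)
    return correct / len(forecast_at_no_pireps)


def percent_volume(forecast_grid: Sequence[bool]) -> float:
    """% Volume: percentage of grid points carrying a Yes forecast.

    Assumption: all grid points represent equal volumes of airspace.
    """
    return 100.0 * sum(forecast_grid) / len(forecast_grid)


# Made-up data: True means the forecast at that location is "Yes".
yes_reports = [True, True, False, True]          # forecasts at Yes PIREPs
no_reports = [False, False, True, False, False]  # forecasts at No PIREPs
grid = [True] * 3 + [False] * 7                  # forecasts at grid points

print(pod_yes(yes_reports))      # PODy
print(1.0 - pod_no(no_reports))  # 1-PODn
print(percent_volume(grid))      # % Volume
```

In this sketch only the moderate-or-greater reports would be passed to pod_yes and only the explicit negative or smooth reports to pod_no, following the rule stated above.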
To create the discrimination and
airspace coverage plots, GTG is converted from a turbulence
severity indicator to a Yes/No forecast, by using a variety of threshold values
(the threshold value for each point is shown on the
plots). Grid points with GTG values greater than the threshold are classified
as "Yes" forecasts; smaller GTG values are classified as "No"
forecasts. The statistics are then computed for each threshold, and the resulting
pairs are plotted to create the diagrams. The discrimination plot shows PODy versus
1-PODn, while the airspace coverage plot shows PODy versus % Volume.
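The threshold sweep described above might look like the following minimal sketch. The function name, argument names, and sample values are hypothetical. The text does not say how a GTG value exactly equal to the threshold is classified; this sketch assumes it counts as a "No" forecast. Grid points are again assumed to represent equal volumes.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    gtg_at_yes: Sequence[float],
    gtg_at_no: Sequence[float],
    gtg_grid: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float, float]]:
    """Return (threshold, PODy, 1-PODn, % Volume) for each threshold."""
    rows = []
    for t in thresholds:
        # A GTG value above the threshold is a "Yes" forecast.
        pody = sum(v > t for v in gtg_at_yes) / len(gtg_at_yes)
        # Assumption: a value equal to the threshold is a "No" forecast.
        podn = sum(v <= t for v in gtg_at_no) / len(gtg_at_no)
        pct_volume = 100.0 * sum(v > t for v in gtg_grid) / len(gtg_grid)
        rows.append((t, pody, 1.0 - podn, pct_volume))
    return rows


# Made-up GTG values at Yes PIREPs, at No PIREPs, and at grid points.
rows = sweep_thresholds(
    gtg_at_yes=[0.2, 0.5, 0.7, 0.9],
    gtg_at_no=[0.1, 0.3, 0.4, 0.6],
    gtg_grid=[0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    thresholds=[0.25, 0.55],
)
for row in rows:
    print(row)
```

Each row supplies one point on both diagrams: (1-PODn, PODy) for the discrimination plot and (% Volume, PODy) for the airspace coverage plot.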
The discrimination diagram is essentially a "relative
operating characteristic" (ROC) plot, a tool from the field of
signal detection theory (SDT; Mason 1982). This plot measures the ability of
a forecasting system to discriminate between Yes and No observations. It measures
the trade-off between correctly classifying Yes observations and incorrectly
classifying No observations. For forecasts that are skillful, the ROC curve should
lie above the 45-degree line (i.e., curves for better forecasts lie
further toward the upper left corner in the diagram). In fact, the area under
the ROC curve is a measure of skill, and is called the skill index (SI) on the
GTG performance plots. This index ranges from 0 to 100. SI values greater than
50 indicate the forecasts have some skill. Larger values indicate greater skill.
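As an illustration of how an area-based index on a 0 to 100 scale could be obtained from the plotted points, the sketch below integrates the ROC curve with the trapezoidal rule. Both the trapezoidal rule and the closing of the curve at the corners (0, 0) and (1, 1) are assumptions made for this example; the text does not state how the area is computed for the GTG plots. The function name is hypothetical.

```python
from typing import Iterable, Tuple


def skill_index(points: Iterable[Tuple[float, float]]) -> float:
    """Area under the ROC curve, scaled to the range 0 to 100.

    `points` holds (1-PODn, PODy) pairs, one per threshold.
    Assumption: the curve is closed with the corner points (0, 0) and
    (1, 1) and integrated with the trapezoidal rule.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return 100.0 * area


# A single point on the 45-degree line yields the no-skill value.
print(skill_index([(0.5, 0.5)]))  # 50.0

# Points above the 45-degree line yield a value greater than 50.
print(skill_index([(0.25, 0.5), (0.75, 0.75)]))  # 59.375
```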
The airspace coverage plot measures the trade-off
between correctly classifying Yes observations and covering a large amount of
airspace with a Yes forecast. Unfortunately, due to the nature of PIREPs, it
is inappropriate to compute standard measures of over-warning such as the
False Alarm Ratio (FAR; Brown and Young 2000). Thus, the airspace coverage
plot provides an alternative measure of over-warning. Better forecasts
are indicated by curves that are closer to the upper left corner. Together,
the discrimination plot and the airspace coverage plot provide a relatively
complete picture of ITFA performance.
For more information about GTG performance, see Brown et al.
(2002), which is the quality assessment report for GTG. Additional information
about the verification approach is included in Brown et al. (1997). Additional
information about GTG performance is also presented in Brown et al. (2000).
References
- Brown, B.G., G. Thompson, R.T. Bruintjes, R. Bullock, and T. Kane, 1997:
Intercomparison of in-flight icing algorithms. Part II: Statistical
verification results. Wea. Forecasting, 12, 890-914.
- Brown, B.G., and G.S. Young, 2000: Verification of icing and turbulence forecasts:
Why some verification statistics can't be computed using PIREPs. Preprints,
9th Conference on Aviation, Range, and Aerospace Meteorology, Orlando, FL,
11-15 September, American Meteorological Society (Boston), 393-398.
- Brown, B.G., J.L. Mahoney, J. Henderson, T.L. Kane, R. Bullock, and J.E. Hart,
2000: The turbulence algorithm intercomparison exercise: Statistical verification
results. Preprints, 9th Conference on Aviation, Range, and Aerospace
Meteorology, Orlando, FL, 11-15 September, American Meteorological Society
(Boston), 466-471.
- Brown, B.G., J.L. Mahoney, R. Bullock, M.B. Chapman, C. Fischer, T.L. Fowler,
J.E. Hart, and J.K. Henderson, 2002: Integrated Turbulence Forecasting Algorithm
(ITFA): Quality Assessment Report. Report to the FAA Aviation Weather Research
Program.
- Mason, I., 1982: A model for assessment of weather forecasts. Australian
Meteorological Magazine, 30, 291-303.