GxP-Blog

Data evaluation during GMP validation

Too much precision not good enough? Applying statistical methods with sound judgement

The mathematician Abraham Wald described statistics as a way of organizing data that allows reasonable decisions to be made in the face of uncertainty. The word "allows" implies that applying statistical methods will not necessarily lead to reasonable decisions. Through our choice of statistical methods and of a methodical approach, we influence how unerring or uncertain the journey will be. I am deliberately leaving out the aspect of "ignorance" here. Only this much: GxP inspectors are rarely pleased when ignorance is the cornerstone of a validation.

In this blog post, I will use a case study to show how important statistical procedures are for a target-oriented evaluation and, at the same time, how treacherous they can be in the validation and verification of analytical methods. The fact is: very good precision (i.e., very low scatter) of the measurement results is, viewed objectively, something to be evaluated positively. In this case study, you will be confronted with the thankless situation in which statistical tests "punish" excellent precision. You will find answers to the question: how can these unnecessary complications be avoided already in the choice of acceptance criteria?

Case study: Verification of the content determination of anhydrous citric acid according to USP (GMP)

Let us look at an example of how preventable complications can arise when choosing acceptance criteria: the verification of the content determination of anhydrous citric acid according to the USP. The content determination is an acid-base titration, and the product specification is 99.5% to 100.5% (calculated on the anhydrous substance). The following acceptance criteria were defined for accuracy and precision when verifying the testing method according to the USP in two laboratories:

Precision: The relative standard deviation (RSD, n=6) must not exceed 0.5%.

Accuracy: No significant difference between nominal value and mean value (t-test).

Precision describes the extent to which the measured results scatter among themselves, whereas accuracy describes how far the mean of the measured results deviates from the true value. In this case, the true value is the declaration of the USP reference standard of 1.000 mg/mg (= 100.0%). Analysts A and B, from different laboratories, conduct the analyses. The results of the accuracy test are shown in Figure 1.

Analyst B:

  • Analyst B's analysis data meets the acceptance criterion for the accuracy test.
  • The scatter of this data also meets the acceptance criterion for the precision test.

Analyst A:

  • Analyst A's analysis data does not meet the acceptance criterion for the accuracy test.
  • The scatter of this data meets the acceptance criterion for the precision test.

Comparing the two sets of data, it is noticeable that Analyst A was both more accurate (mean closer to the "true value") and more precise. Nevertheless, unlike Analyst B, Analyst A must deal with the stigma of a failed acceptance criterion in the verification.

Conclusions for data assessment in GMP validation

What can we deduce from this example? When conducting validations, is it best to work deliberately sloppily, producing a larger scatter so as not to fall into the same trap as Analyst A? Is it better to forgo statistical tests and instead use superficially plausible acceptance criteria, as was done here for precision? After all, for many analytical methods an RSD of 0.5% is not bad at all.

Let us first look at how plausible the acceptance criterion for precision really is. The scatter of Analyst B's data almost completely covers the specification range of the product, yet has not reached the acceptance limit (0.5%) defined here. The probability of analytical values outside of specification (OOS) when testing the quality of incoming citric acid is no longer low, and there is little room left for product variability. To describe the situation more visually: if you imagine the width of the specification as the entrance to a garage and the scatter of the measured data as the width of an incoming automobile, Analyst B will not get much use out of the car's exterior mirrors. A complete renunciation of a statistical basis for assessing the data is definitely the wrong approach, and the deliberate creation of a larger scatter would be an act of desperation that has nothing in common with GMP. Instead, quantitative acceptance criteria must be based on sound statistics.

Let’s Do it Better

Let us look first at the acceptance criterion for the precision of the testing method. The requirement is that the confidence range (twice the measurement uncertainty) should not be greater than half the tolerance range given by the specification limits.¹ If we multiply the standard uncertainty (s/√n) by the corresponding t-value from the t-table, we obtain the measurement uncertainty. By rearranging this formula, we can calculate how large s may be if twice the measurement uncertainty is to correspond to at most half the tolerance range of the given specification. We want to specify the relative standard deviation, also referred to as the coefficient of variation, rather than the absolute standard deviation; it corresponds to the standard deviation divided by the expected value. Rearranging these formulas yields the maximum permitted coefficient of variation (see Figure 2).

In the example, it is 0.24% for the precision test (n=6). Analyst B would not pass a precision test given the available data. Between you and me, I have always been suspicious of data provided by people who drive without exterior mirrors. Analyst B would have to establish precision-enhancing measures before robust routine application of the testing method is possible.
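The calculation described above can be sketched in a few lines of Python, assuming a two-sided 95% confidence level (the function name and parameterization are my own, not taken from the article):

```python
from math import sqrt
from scipy.stats import t as t_dist

def max_rsd(spec_lower, spec_upper, n, expected=100.0, alpha=0.05):
    """Maximum permitted relative standard deviation (coefficient of
    variation, in %) such that twice the measurement uncertainty
    t * s / sqrt(n) does not exceed half the tolerance range."""
    half_tolerance = (spec_upper - spec_lower) / 2   # here: 0.5 %
    t_val = t_dist.ppf(1 - alpha / 2, df=n - 1)      # two-sided t-value, df = n - 1
    s_max = half_tolerance * sqrt(n) / (2 * t_val)   # max permitted standard deviation
    return 100 * s_max / expected                    # as coefficient of variation in %

print(round(max_rsd(99.5, 100.5, n=6), 2))  # 0.24
```

With the specification 99.5–100.5% and n = 6, this reproduces the 0.24% limit stated above.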

That still leaves the accuracy assessment. In principle, a t-test is an objective statistical method that may be regarded as scientifically sound. But does it make sense to assess the accuracy of the method using the scatter of the actual data? The weak point is that every improvement in the precision of my method simultaneously raises the bar for accuracy, while the bar drops the more imprecisely I work. That cannot be the goal when evaluating method accuracy.
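This weak point is easy to demonstrate with a one-sample t-test on two hypothetical data sets (illustrative values, not the article's actual figures): a precise analyst with a small bias fails the test, while an imprecise analyst with a larger bias passes it.

```python
from scipy.stats import ttest_1samp

TRUE_VALUE = 100.0  # declaration of the reference standard, in %

# Hypothetical data sets: small scatter with a small bias vs.
# large scatter with a larger bias
precise = [100.04, 100.05, 100.06, 100.05, 100.04, 100.06]    # mean 100.05
imprecise = [99.8, 100.4, 99.7, 100.5, 99.9, 100.3]           # mean 100.10

for name, data in [("precise", precise), ("imprecise", imprecise)]:
    stat, p = ttest_1samp(data, TRUE_VALUE)
    verdict = "fails (significant difference)" if p < 0.05 else "passes"
    mean_dev = sum(data) / len(data) - TRUE_VALUE
    print(f"{name}: mean deviation {mean_dev:+.2f}%, p = {p:.3f} -> {verdict}")
```

The more accurate data set is "punished" solely because its scatter is small, which is exactly the paradox Analyst A runs into.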


A few lines above, we calculated the maximum allowed scatter of the measured data in a scientifically well-founded way. So let us use this maximum allowed scatter and the t-table value to calculate the maximum allowed difference between the mean of the measurement results and the "true value" (the declaration of the reference standard). In other words, we regard a t-test that just passes with data whose scatter hits the maximum allowed value on the dot as acceptable; this is exactly where the boundary for the accuracy assessment should be drawn. The calculation requires rearranging the formula shown in Figure 2. I do not want to deprive the interested among you of this little bit of fiddling, and therefore omit a formula illustration here. Just make sure that you calculate consistently with the correct sample size.
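For those who prefer to check their own fiddling, one possible rearrangement can be sketched as follows, again assuming a two-sided 95% confidence level and n = 6 (function name and parameterization are my own). Note that the t-value and √n cancel algebraically, so the limit collapses to a quarter of the tolerance range:

```python
from math import sqrt
from scipy.stats import t as t_dist

def max_bias(spec_lower, spec_upper, n, alpha=0.05):
    """Maximum allowed |mean - true value| for a t-test that still
    passes when the scatter equals the maximum allowed standard
    deviation derived from the precision criterion."""
    half_tolerance = (spec_upper - spec_lower) / 2
    t_val = t_dist.ppf(1 - alpha / 2, df=n - 1)
    s_max = half_tolerance * sqrt(n) / (2 * t_val)  # from the precision criterion
    return t_val * s_max / sqrt(n)                  # simplifies to half_tolerance / 2

print(round(max_bias(99.5, 100.5, n=6), 2))  # 0.25
```

For the citric acid specification (99.5–100.5%), this gives a maximum allowed deviation of the mean from the true value of 0.25%.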

Pragmatism at its Best

Using a well thought-out risk-based approach is a good way to focus validation projects on key risks and thus reduce costs. Improved quality and reduced costs result when tried and true concepts are relied upon. Pragmatism in its best form is a constructive way to further develop processes. In this spirit, I hope that you will: “Live the continuous process of improvement.” We will be happy to use our expertise to support you.

¹ Handbuch Validierung in der Analytik, 2. überarbeitete Auflage, S. Kromidas (Wiley-VCH Verlag)
