<- read_excel(here("posts/20220927/Gage_RnR_template_V4.xlsx"), range = "A3:E78")
data <- data %>%
data_long pivot_longer(cols = c(Op1, Op2, Op3),
names_to = "Operator", values_to = "measure")
<- data_long %>%
measure_long mutate(across(.cols = c(Part, Operator), as_factor))
Measurement uncertainty estimation with Linear Mixed Models in R
Objective
Assess the feasibility and benefits of using linear mixed models in measurement uncertainty studies (e.g. enlarge the number in input factors. This document provides a comparison of measurement uncertainty studies with lm and anova, with the R SixSigma package and with the R lme4 package.
Introduction
The estimation of a measurement method uncertainty is a key output of a method validation. It provides users of the method a way to communicate the measurement results taking in consideration the variability introduced by the method itself as described in The beginers guide to uncertainty of measurement pag 17 by S.Bell Bell (2001)
Many different ways to estimate uncertainty have been listed by J E Muelaner (2015). In my professional activity I’ve been working to improve the way to estimate this uncertainty by adopting some of these different statistical approaches and the related software tools. In particular I’ve started by using the Minitab gage rnR following the AIAG2010 guidelines and later adopted the R package SixSigma by E.Cano Emilio L. Cano (2012). This last approach provided an effective way to treat large datasets but still has limitations in the modelling of random effects, overestimating errors and sometimes leading to negative variations, as explained by D.Montgomery in Design and analysis of experiments page 573 Montgomery (2012)
In this article we explore linear mixed models with the R package lme4, comparing it with the previous approaches and suggest routes to integrate as a new operational tool in our toolset for method validation.
Methods
Data loading
The dataset corresponds to a gage rnR study performed on pod volume measured with a vision system. It consists of 5 parts and 3 operators that measured each part 15 times. This has resulted in 225 individuals measurements.
We’re showing below the characteristics of this dataset:
%>% summary() measure_long
Part Replicate Operator measure
P1:45 Length:225 Op1:75 Min. :1799
P2:45 Class :character Op2:75 1st Qu.:1803
P3:45 Mode :character Op3:75 Median :1805
P4:45 Mean :1805
P5:45 3rd Qu.:1808
Max. :1813
Results
Linear model
The first and most direct way to assess the effect of the input factors is to establish a simple linear model and perform an anova.
<- lm(measure ~ Part + Operator + Part:Operator, measure_long)
measure_lm summary(measure_lm)
Call:
lm(formula = measure ~ Part + Operator + Part:Operator, data = measure_long)
Residuals:
Min 1Q Median 3Q Max
-4.2925 -0.7783 -0.0317 0.7431 3.6451
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1803.3953 0.3237 5570.485 < 2e-16 ***
PartP2 -1.9339 0.4578 -4.224 3.58e-05 ***
PartP3 5.8067 0.4578 12.683 < 2e-16 ***
PartP4 3.3965 0.4578 7.419 2.89e-12 ***
PartP5 1.0892 0.4578 2.379 0.0183 *
OperatorOp2 0.5223 0.4578 1.141 0.2553
OperatorOp3 -0.2854 0.4578 -0.623 0.5337
PartP2:OperatorOp2 0.1732 0.6475 0.267 0.7893
PartP3:OperatorOp2 -0.4307 0.6475 -0.665 0.5066
PartP4:OperatorOp2 0.4805 0.6475 0.742 0.4589
PartP5:OperatorOp2 0.1041 0.6475 0.161 0.8725
PartP2:OperatorOp3 0.3327 0.6475 0.514 0.6079
PartP3:OperatorOp3 0.9127 0.6475 1.410 0.1601
PartP4:OperatorOp3 0.9753 0.6475 1.506 0.1335
PartP5:OperatorOp3 0.3723 0.6475 0.575 0.5659
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.254 on 210 degrees of freedom
Multiple R-squared: 0.8399, Adjusted R-squared: 0.8292
F-statistic: 78.67 on 14 and 210 DF, p-value: < 2.2e-16
<- aov(measure_lm)
measure_lm_aov summary(measure_lm_aov)
Df Sum Sq Mean Sq F value Pr(>F)
Part 4 1707.1 426.8 271.457 <2e-16 ***
Operator 2 13.1 6.6 4.177 0.0166 *
Part:Operator 8 11.2 1.4 0.892 0.5237
Residuals 210 330.1 1.6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In this case the anova indicates that the part has a significative effect (as expected) and the operator too.
Six Sigma
<- ss.rr(
measure_rr data = measure_long,
var = measure,
part = Part,
appr = Operator,
alphaLim = 1,
errorTerm = "repeatability",
main = "Micrometer FTR600\nr&R for tablet thickness",
sub = "Tablet L",
lsl = 17750,
usl = 18250,
print_plot = FALSE
)
Complete model (with interaction):
Df Sum Sq Mean Sq F value Pr(>F)
Part 4 1707.1 426.8 271.457 <2e-16
Operator 2 13.1 6.6 4.177 0.0166
Part:Operator 8 11.2 1.4 0.892 0.5237
Repeatability 210 330.1 1.6
Total 224 2061.6
alpha for removing interaction: 1
Gage R&R
VarComp %Contrib
Total Gage R&R 1.64098201 14.79
Repeatability 1.57212495 14.17
Reproducibility 0.06885705 0.62
Operator 0.06885705 0.62
Part:Operator 0.00000000 0.00
Part-To-Part 9.45247729 85.21
Total Variation 11.09345930 100.00
VarComp StdDev StudyVar %StudyVar %Tolerance
Total Gage R&R 1.64098201 1.2810082 7.686049 38.46 1.54
Repeatability 1.57212495 1.2538441 7.523064 37.65 1.50
Reproducibility 0.06885705 0.2624063 1.574438 7.88 0.31
Operator 0.06885705 0.2624063 1.574438 7.88 0.31
Part:Operator 0.00000000 0.0000000 0.000000 0.00 0.00
Part-To-Part 9.45247729 3.0744881 18.446929 92.31 3.69
Total Variation 11.09345930 3.3306845 19.984107 100.00 4.00
Number of Distinct Categories = 3
There are many outputs that are provided by the SixSigma package which are discussed in full details in the Precision chapter of our book industRial statistics (in particular the results produced by R have been confirmed by us with a manual calculation in a spreadsheet using the same dataset).
Mixed Model (2 factors)
https://therbootcamp.github.io/SwR_2019Apr/_sessions/MixedModels/MixedModels.html#23
https://therbootcamp.github.io/SwR_2019Apr/_sessions/MixedModels/MixedModels_practical.html
<- lmer(measure ~ 1 +
measure_lme 1|Operator) + (1|Part) + (1|Part:Operator),
(
measure_long)
summary(measure_lme)
Linear mixed model fit by REML ['lmerMod']
Formula: measure ~ 1 + (1 | Operator) + (1 | Part) + (1 | Part:Operator)
Data: measure_long
REML criterion at convergence: 766.9
Scaled residuals:
Min 1Q Median 3Q Max
-3.2650 -0.6427 0.0242 0.5793 2.8999
Random effects:
Groups Name Variance Std.Dev.
Part:Operator (Intercept) 0.00000 0.0000
Part (Intercept) 9.44894 3.0739
Operator (Intercept) 0.06668 0.2582
Residual 1.56592 1.2514
Number of obs: 225, groups: Part:Operator, 15; Part, 5; Operator, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 1805.341 1.385 1303
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
This approach with linear mixed models using the lme4 R package produces very similar results (the cause for the small differences is still to be investigated and should in principle be related with parameters of the model and calculation functions).
Mixed Model (3 factors)
<- data %>%
data2 mutate(Op1d2 = Op1*1.001, Op2d2 = Op2*1.002, Op3d2 = Op3*1.003)
<- data2 %>%
data2_long pivot_longer(cols = c(Op1, Op2, Op3, Op1d2, Op2d2, Op3d2),
names_to = "Operator", values_to = "measure") %>%
mutate(day = str_sub(Operator, 4,5)) %>%
mutate(day = if_else(day == "", "d1", day)) %>%
mutate(Operator = str_sub(Operator, 1, 3))
<- data2_long %>%
measure2_long mutate(across(.cols = c(Part, Operator, day), as_factor))
<- lmer(measure ~ 1 +
measure2_lme 1|Operator) + (1|Part) + (1|day) +
(1|Part:Operator) + (1|day:Operator),
(
measure2_long)summary(measure2_lme)
Linear mixed model fit by REML ['lmerMod']
Formula: measure ~ 1 + (1 | Operator) + (1 | Part) + (1 | day) + (1 |
Part:Operator) + (1 | day:Operator)
Data: measure2_long
REML criterion at convergence: 1526.9
Scaled residuals:
Min 1Q Median 3Q Max
-3.4125 -0.6233 0.0105 0.5731 2.9625
Random effects:
Groups Name Variance Std.Dev.
Part:Operator (Intercept) 0.04326 0.2080
day:Operator (Intercept) 1.61265 1.2699
Part (Intercept) 9.46258 3.0761
Operator (Intercept) 0.28599 0.5348
day (Intercept) 5.95525 2.4403
Residual 1.53155 1.2376
Number of obs: 450, groups:
Part:Operator, 15; day:Operator, 6; Part, 5; Operator, 3; day, 2
Fixed effects:
Estimate Std. Error t value
(Intercept) 1807.146 2.289 789.4
optimizer (nloptwrap) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.00307362 (tol = 0.002, component 1)
This approach with the lme4 package extends to 3 factors, in our view demonstrating its interest for the cases where its needed to go beyond the SixSigma package and as here add for instance the day as a random input factor.
Conclusions
This document shows that results obtained with the R packages SixSigma and lme4 are comparable. As lme4 allows to extend the number of input factors thus introducing for example the day, we recommend its application in a new concrete case to further assess its application in practice and enlarge the Method validation toolbox.
References