Metrology in quality management processes is much more advanced today than it was a few decades ago, and this has placed a higher emphasis on empirical data (Messina, 1987; Miller and Freund, 1965). Organisations worldwide rely on sophisticated measurement equipment to collect such data for key characteristics of a product or process. However, these characteristics cannot be measured with perfect certainty. Every measurement carries some error, so measuring the same characteristic again will generally give a different value. In this context, Measurement Systems Analysis (MSA), a collection of statistical methods, is a popular approach for analysing measurement system capability (Automotive Industry Action Group [AIAG], 2002; Smith et al., 2007). In particular, Gauge R&R is an industry-standard technique for evaluating the precision of measurement equipment. However, achieving reproducible results is more challenging in destructive testing, as parts are destroyed during measurement and cannot be measured again. The current study aims to evaluate the existing Gauge R&R methods for quantifying measurement variation and defining validation criteria for measurement equipment that uses destructive testing. A number of applied research studies have been conducted on Gauge R&R for non-destructive testing (e.g. Barbosa et al., 2014; Peruchi et al., 2013; Liangxing et al., 2014; Hoffa and Laux, 2007); however, there is a dearth of research into applications of Gauge R&R to destructive measurement systems. One exception is Han and He (2007), who used the Gauge R&R model to validate a rip-off force measurement system. A single study in the last 20 years is too limited in scope to support general conclusions, so the current study aimed to provide further insight into the application of the Gauge R&R model to destructive testing.
Further, a Crossed Gauge R&R technique, rather than a Nested one, was applied to validate burst strength test equipment (a destructive test). This was a new measurement device for a plastic-welded thermostatic cartridge sub-assembly that is used in mixer showers and subjected to high-pressure water. Such an application has not been tested before.
This research study was conducted in a UK-based, global plumbing company, renowned for its bathroom products. The sub-assembly in focus was an ultrasonically welded part, and the key process output variable was burst strength, which also served as a key performance indicator. The burst strength of an ultrasonic weld is measured by slowly increasing the water pressure through the part until it fractures. Well-documented validation criteria existed for most measurement equipment, but not for the burst strength test equipment, as it involved destructive testing. The research study aimed to evaluate the precision of the measurement system using Gauge R&R. This article begins with a literature review of the key findings in this area, which provides further justification for the current study, followed by the research design and methodology section. Next, a section on research results provides detailed information and metrics on how the Gauge R&R technique performed in destructive testing. The results of the current study are then compared to previous research findings in the discussion section, which leads to the conclusion and references at the end. Limitations of the current research and possible directions for future research are also discussed in the conclusion section.
1 Literature review
Six Sigma is a popular quality improvement methodology that aims to significantly improve the quality of a manufacturing process and reduce costs by minimising process variation and reducing defects (Breyfogle and Meadows, 2001; Breyfogle et al., 2001; Sujová et al., 2019). In a manufacturing or assembly process, all sub-processes inherently exhibit variation (Juran and Godfrey, 1999; Mckay and Steiner, 1997). Reducing process variation depends on understanding the relative contributions of the various input variables to the key performance indicators of the process. Equally important is the ability to discriminate between process variation and measurement variation (Ishikawa, 1982; Juran, 1990; Persijn and Nuland, 1996).
Measurement System Analysis (MSA) attempts to quantify the measurement error relative to process tolerance and variation using statistical techniques (Mast and Trip, 2005). If measurement system variation is found to be high (>10%), it should be reduced first, before any attempt is made to reduce process variation. This prioritisation is evident in the Six Sigma methodology, which treats the monitoring of measurements as a significant analysis activity during the Measure phase, prior to the data collection and Analyse phases (Pande et al., 2002).
A measurement system is defined as “a collection of instruments or gauges, standards, operations, methods, fixtures, software, personnel, environment, and assumptions used to quantify a unit of measure or the complete process used to obtain measurements” (AIAG, 2002, p. 5). Quantification of measurement error through close scrutiny of diverse variation sources, including the measurement system, the operators, and the parts, is central to MSA. Variation in a measurement system consists of four distinct components: (i) bias, which refers to the difference between the measured values and reference values; (ii) stability, which serves as a quantifiable indicator of fluctuations in bias over time; (iii) repeatability, which accounts for measurement variation caused by inherent errors in the instrument, also referred to as precision; and (iv) reproducibility, which captures variation arising from the differing setups and techniques of external sources, such as operators (Engel and De Vries, 1997; Smith et al., 2007). The appropriateness of a gauge for its intended application is best assessed using the repeatability and reproducibility components of a Gauge Repeatability & Reproducibility (GR&R) study (Burdick et al., 2005).
Abundant literature is available on the procedures and relevance of gauge reproducibility and repeatability studies (e.g., Burdick et al., 2003; Dolezal et al., 1998; Goffnet, 2004; Pan, 2004, 2006; Persijn and Nuland, 1996; Smith et al., 2007; Vardeman and Job, 1999). For example, Wesff (2012) published data on how a significant defect reduction in the measurement system, saving about $130,000 per year, was achieved by replacing continental automotive systems and reprogramming of the equipment’s software language. Similarly, Bhakhri and Belokar (2017) reported that conducting a Gauge R&R study helped reduce measurement variation from 37.6% to 14.2%.
Repeatability is assessed through repeated measurements of the same part by the same operator, and so captures the ‘within operator’ variability that results from the gauge and measurement system itself. Reproducibility, on the other hand, accounts for ‘between operator’ variation, arising from the use of the gauge by different operators and from external factors such as time (Smith et al., 2007; Pan, 2004). It is assessed from the variability between several operators making repeated measurements of a specific component (Pan, 2006; Tsai, 1989).
A combined estimate of both reproducibility and repeatability variations is referred to as Total Gauge R&R. In addition, the total variation observed in a study is the sum of the part-to-part variation and the total Gauge R&R (AIAG, 2002; Pan, 2006). Assessing the suitability of the examined measurement system using a Gauge R&R study was the primary goal of this paper.
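The additive relationships described above hold for variances rather than standard deviations. A minimal sketch of the decomposition follows; the symbols are introduced here for illustration and are not the source's own notation.

```latex
\begin{align}
\sigma^2_{\text{R\&R}} &= \sigma^2_{\text{repeatability}} + \sigma^2_{\text{reproducibility}} \\
\sigma^2_{\text{total}} &= \sigma^2_{\text{part}} + \sigma^2_{\text{R\&R}} \\
\%\,\text{Study Variation} &= 100 \times \frac{\sigma_{\text{R\&R}}}{\sigma_{\text{total}}}
\end{align}
```

Because the last ratio is formed from standard deviations, the percentage contributions of the individual sources do not generally add up to 100%.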
2 Research methods
The overall aim of this research was to evaluate the effectiveness of Crossed Gauge R&R (ANOVA) in quantifying measurement error for a destructive measurement system. The manufacturing and assembly process of a mixer shower cartridge that controls the outlet flow and temperature was selected. One of the key performance indicators for the product and process is the burst strength of the assembled cartridge. The burst strength of a cartridge is measured by slowly ramping up the water pressure until the cartridge fails. The specific research objectives were to evaluate the utility of the Crossed Gauge R&R technique in:
- Quantifying measurement variation coming from the burst strength test rig;
- Identifying the source of measurement variation in the burst strength test rig;
- Defining a validation procedure for measurement equipment that uses destructive testing.
The manufacturing company in which the current study was conducted had a strong focus on quality systems, including MSA studies for all measurement equipment used in the laboratory or on the shop floor. At the time of the study, there was no qualification method for the burst strength test rig and, hence, the quality of the product could not be confirmed reliably. The burst strength test confirms the structural integrity of the product; without a satisfactory measurement system, defective product could reach customers and create safety and operational hazards.
For a Gauge R&R study to be viable in a destructive measurement process, homogeneity of batches and component parts is essential (Montgomery, 2001, 2013). A homogeneous batch is a collection of similar parts or specimens that are likely to yield similar results. This inherent similarity allows different parts from the same batch to stand in for repeated measurements of a single part, which is destroyed by the test. An action-oriented, quantitative research methodology was used. Reliable statistical techniques are available for calculating the repeatability and reproducibility of a destructive measurement process. The statistical software Minitab (2002) was used to conduct the statistical analyses. Minitab provides two standard, built-in methods for Gauge R&R: Crossed and Nested. For a crossed design, batches of homogeneous parts must be large enough for each operator to measure at least two parts from each batch (Box et al., 1978). An example of a crossed experimental design is shown in Fig. 1.
In a crossed experimental model, batches are large enough for every operator to test parts from every batch, so the operator-by-batch interaction can be estimated. However, when the homogeneous batches are relatively small, so that multiple parts from a batch cannot be allocated to several operators, a nested or hierarchical model is more appropriate, as shown in Fig. 2.
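The difference between the two designs can be made concrete by writing out the usual random-effects models. The following is a sketch in notation introduced here for illustration (batch effect B, operator effect O, residual ε), not notation taken from the source.

```latex
\begin{align}
\text{Crossed:} \quad y_{ijk} &= \mu + B_i + O_j + (BO)_{ij} + \varepsilon_{ijk} \\
\text{Nested:}  \quad y_{ijk} &= \mu + O_j + B_{i(j)} + \varepsilon_{ijk}
\end{align}
```

Here $i$ indexes batches, $j$ operators and $k$ the parts tested within a batch-operator cell. In the nested model each batch is seen by only one operator, so no interaction term can be estimated. In both models the residual $\varepsilon_{ijk}$ contains the within-batch part variation as well as the gauge repeatability, because a destroyed part cannot be re-measured.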
Validation of burst strength measurement equipment was undertaken in the current research study. The burst strength of ultrasonically welded plastic components was measured using destructive testing. Batches were created with different burst strengths covering the operational range of the product. As multiple parts within each batch could be manufactured and provided to randomly selected operators, a crossed Gauge R&R model was deemed suitable.
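As an illustration of how such a crossed plan might be laid out, the sketch below assigns the parts of each homogeneous batch to operators and randomises the run order. The function name, the seed and the default sizes (five batches, three operators, three parts per operator per batch) are assumptions chosen for the example, not the procedure used in the study.

```python
import random


def crossed_allocation(n_batches=5, n_operators=3, n_repeats=3, seed=0):
    """Build a crossed test plan: every operator tests parts from every batch.

    Hypothetical helper for illustration only. Each part is used once,
    because the test destroys it.
    """
    rng = random.Random(seed)
    runs = []
    for batch in range(1, n_batches + 1):
        # Parts in a batch are treated as interchangeable, so deal them out
        # to operators in random order.
        part_ids = list(range(1, n_operators * n_repeats + 1))
        rng.shuffle(part_ids)
        for op in range(n_operators):
            for rep in range(n_repeats):
                part = part_ids[op * n_repeats + rep]
                runs.append({"batch": batch, "part": part, "operator": op + 1})
    # Randomise the order in which the tests are carried out.
    rng.shuffle(runs)
    return runs


plan = crossed_allocation()
for run in plan[:5]:
    print(run)
```

With the default sizes the plan contains 45 runs, and each operator receives three distinct parts from every batch.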
3 Research results
This section covers the key findings from the current study against key objectives, which are shown below:
- Quantifying measurement variation coming from the burst strength test rig;
- Identifying the source of measurement variation in the burst strength test rig;
- Defining a validation procedure for measurement equipment that uses destructive testing.
Further information on key results from each stage of the research is discussed below.
As already discussed, creating homogeneous batches with a reasonable sample size is critical to the success of a Crossed Gauge R&R study. Further, these batches should cover the entire operational range of the product or measurement equipment. The key manufacturing process that defined the burst strength of the product was the ultrasonic welding of two plastic components. Before the MSA could start, the first step was to identify settings for the key input variables that would generate the desired homogeneous batches covering the range of burst strength measurements. A design of experiment (DoE) was conducted to characterise the key input variables against the burst strength of the welded part. Fig. 3 shows the CNX diagram for the DoE.
A 2-level 6-factor ½-fractional factorial design of experiments was conducted, and the results were analysed using the Minitab software. As an example, Fig. 4 shows all the statistically significant factors and interactions relative to their impact on the burst strength.
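A half-fraction of a two-level, six-factor design has 2^(6-1) = 32 runs. The sketch below generates one such design in coded units. The factor names are taken from the regression equation reported in this section, but their order and the generator (last factor equal to the product of the other five) are assumptions; the source does not state which generator was used.

```python
from itertools import product

# Factor names as they appear in the regression equation; the order is assumed.
FACTORS = [
    "Housing",
    "Cover",
    "Energy",
    "Working Pressure",
    "Standstill Delay",
    "Amplitude",
]


def half_fraction(factors):
    """Return the runs of a 2^(k-1) design in coded units (-1 / +1)."""
    *base, last = factors
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        # Assumed generator: the last factor is the product of all the others.
        sign = 1
        for level in levels:
            sign *= level
        run[last] = sign
        runs.append(run)
    return runs


design = half_fraction(FACTORS)
print(len(design))
print(design[0])
```

Each factor appears at its low and high level equally often, which is what allows main effects and low-order interactions to be estimated from half the runs of the full factorial.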
A summary of the DoE analysis results from the Minitab software is presented in Tab. 1. The high R-sq value of 99.58% means that the regression model fitted to the DoE data explains 99.58% of the total variation seen in the process, which is extremely good. Factors or interactions with a p-value < 0.05 have a statistically significant impact on the burst strength of the product. The analysis also provides a regression equation for the burst strength.
Minitab results from DoE Model Summary
|TERM||EFFECT||COEF||SE COEF||T-VALUE||P-VALUE||VIF|
|Housing*Standstill Delay||10.9000||5.4500||0.0368||148.10||0.000||1.00|
|Energy*Standstill Delay||-3.6750||-1.8375||0.0368||-49.93||0.000||1.00|
|Housing*Energy*Standstill Delay||0.3250||0.1625||0.0368||4.42||0.022||1.00|
|Cover*Energy*Standstill Delay||-21.2750||-10.6375||0.0823||-129.28||0.000||5.00|
|Housing*Cover*Energy*Standstill Delay||-1.3750||-0.6875||0.0823||-8.36||0.004||1.00|
Regression Equation in Coded Units
Burst Strength = 15.2750 + 12.7750 Housing - 0.6250 Energy - 0.2500 Working Pressure
+ 2.6250 Standstill Delay + 0.9375 Amplitude + 3.3500 Housing*Energy
- 2.3000 Housing*Working Pressure + 5.4500 Housing*Standstill Delay
+ 2.8625 Housing*Amplitude - 4.8625 Cover*Amplitude
- 1.8375 Energy*Standstill Delay + 2.6500 Housing*Cover*Working Pressure
+ 3.3625 Housing*Cover*Amplitude + 0.1625 Housing*Energy*Standstill Delay
- 10.6375 Cover*Energy*Standstill Delay
- 0.6875 Housing*Cover*Energy*Standstill Delay
Uncoded coefficients are not available with non-hierarchical model.
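To show how the coded-unit equation can be used to pick settings for a target burst strength, the sketch below evaluates it for a given combination of factor levels. The coefficients are copied from the equation above; the function name and the dictionary layout are illustrative choices, and inputs are assumed to be coded levels (-1 for low, +1 for high).

```python
# Coefficients copied from the coded-unit regression equation above.
INTERCEPT = 15.2750
TERMS = {
    ("Housing",): 12.7750,
    ("Energy",): -0.6250,
    ("Working Pressure",): -0.2500,
    ("Standstill Delay",): 2.6250,
    ("Amplitude",): 0.9375,
    ("Housing", "Energy"): 3.3500,
    ("Housing", "Working Pressure"): -2.3000,
    ("Housing", "Standstill Delay"): 5.4500,
    ("Housing", "Amplitude"): 2.8625,
    ("Cover", "Amplitude"): -4.8625,
    ("Energy", "Standstill Delay"): -1.8375,
    ("Housing", "Cover", "Working Pressure"): 2.6500,
    ("Housing", "Cover", "Amplitude"): 3.3625,
    ("Housing", "Energy", "Standstill Delay"): 0.1625,
    ("Cover", "Energy", "Standstill Delay"): -10.6375,
    ("Housing", "Cover", "Energy", "Standstill Delay"): -0.6875,
}


def predict_burst_strength(coded):
    """Evaluate the coded-unit model for a dict of factor name -> coded level."""
    total = INTERCEPT
    for factors, coefficient in TERMS.items():
        term = coefficient
        for name in factors:
            term *= coded[name]  # an interaction is the product of its factor levels
        total += term
    return total


all_high = {
    "Housing": 1,
    "Cover": 1,
    "Energy": 1,
    "Working Pressure": 1,
    "Standstill Delay": 1,
    "Amplitude": 1,
}
print(round(predict_burst_strength(all_high), 4))
```

Scanning the 64 corner combinations with this function is one way to shortlist settings whose predicted burst strengths are spread across the operating range.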
From the analysis above, various burst strength settings can be achieved by modifying the key input factors. One of the challenges was to create parts covering the entire operational range of the equipment. Five batches of nine parts each were manufactured, so that each of three operators could test three parts from every batch. The measurement results were then analysed using the Minitab Gauge R&R crossed study.
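The calculation behind a crossed Gauge R&R study of this kind can be sketched as a balanced two-way ANOVA with interaction, from which variance components are estimated via the expected mean squares. The code below is an illustrative implementation using NumPy, not Minitab's own routine: it does not pool a non-significant interaction into the error term as Minitab may do, and the data are synthetic values generated for the example rather than measurements from the study.

```python
import numpy as np


def crossed_grr_variance_components(y):
    """Estimate variance components from a balanced crossed study.

    y has shape (batches, operators, repeats). Illustrative sketch only.
    """
    y = np.asarray(y, dtype=float)
    p, o, r = y.shape
    grand = y.mean()
    batch_means = y.mean(axis=(1, 2))
    oper_means = y.mean(axis=(0, 2))
    cell_means = y.mean(axis=2)

    # Sums of squares for the two-way layout with interaction.
    ss_batch = o * r * np.sum((batch_means - grand) ** 2)
    ss_oper = p * r * np.sum((oper_means - grand) ** 2)
    ss_cells = r * np.sum((cell_means - grand) ** 2)
    ss_inter = ss_cells - ss_batch - ss_oper
    ss_error = np.sum((y - cell_means[:, :, None]) ** 2)

    ms_batch = ss_batch / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Expected-mean-square estimators; negative estimates are set to zero.
    repeatability = float(ms_error)
    interaction = max(0.0, float(ms_inter - ms_error) / r)
    operator = max(0.0, float(ms_oper - ms_inter) / (p * r))
    batch = max(0.0, float(ms_batch - ms_inter) / (o * r))
    reproducibility = operator + interaction
    return {
        "repeatability": repeatability,
        "reproducibility": reproducibility,
        "total_grr": repeatability + reproducibility,
        "batch": batch,
    }


# Synthetic example: five batch levels, three operators, three parts each.
rng = np.random.default_rng(0)
batch_levels = np.array([8.0, 11.0, 14.0, 17.0, 20.0])
demo = batch_levels[:, None, None] + rng.normal(0.0, 0.3, size=(5, 3, 3))
components = crossed_grr_variance_components(demo)
print({name: round(value, 4) for name, value in components.items()})
```

In a destructive study the "repeatability" component returned here also absorbs any within-batch part variation, since the three values in each cell come from three different parts.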
As demonstrated in Tab. 2, the total Gauge R&R is only 2.88% of the total study variation, suggesting that only a small share of the observed variation comes from the measurement system and the rest from the manufacturing process. This is well below the requirement of 10%.
Gauge Evaluation Results from Crossed Gauge R&R Analysis
|SOURCE||STDDEV (SD)||STUDY VAR (6 × SD)||%STUDY VAR (%SV)||%TOLERANCE (SV/TOLER)|
|Total Gage R&R||0.28774||1.7265||2.88||8.63|
Furthermore, total Gauge R&R is 8.63% of the total tolerance for the burst strength. This is below the requirement of 10%, suggesting that measurement equipment can differentiate between good and bad products. The number of distinct categories was greater than five suggesting the resolution of measurement to be suitable for the application.
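The metrics quoted above follow from a few simple ratios. The sketch below reproduces them from the reported Gauge R&R standard deviation of 0.28774. The total standard deviation (about 9.99) and the tolerance width (about 20) are not given in the excerpted table; they are back-calculated here from the reported percentages and should be read as assumptions. The number of distinct categories uses the common convention of truncating √2 × (part SD / gauge SD).

```python
import math


def grr_metrics(sd_grr, sd_total, tolerance, k=6.0):
    """Compute standard Gauge R&R summary metrics from standard deviations."""
    study_var = k * sd_grr                      # spread covered by k standard deviations
    pct_study_var = 100.0 * sd_grr / sd_total   # ratio of SDs, not of variances
    pct_tolerance = 100.0 * study_var / tolerance
    sd_part = math.sqrt(max(sd_total ** 2 - sd_grr ** 2, 0.0))
    ndc = int(math.sqrt(2.0) * sd_part / sd_grr)  # truncated, not rounded
    return {
        "study_var": study_var,
        "pct_study_var": pct_study_var,
        "pct_tolerance": pct_tolerance,
        "ndc": ndc,
    }


# sd_grr is from Tab. 2; sd_total and tolerance are back-calculated assumptions.
metrics = grr_metrics(sd_grr=0.28774, sd_total=9.99, tolerance=20.0)
print({name: round(value, 2) for name, value in metrics.items()})
```

Under these assumed inputs the percentages land on the reported 2.88% and 8.63%, and the number of distinct categories is well above five.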
The R-chart by operator in Fig. 5 shows all points within the control limits, suggesting no special cause and all operators performing at similar levels. The Xbar chart in Fig. 5 shows that the mean burst strengths of the test specimens vary beyond the control limits. This is a desirable result, as the control limits are based on the combined repeatability and within-part variations: it indicates that differences in burst strength between batches are likely to be detected above the repeatability error. The remaining charts present the burst strength by part and by operator. They show a similar trend and distribution, suggesting that no special variation is induced by the operators and that the batches were homogeneous. Based on the above results, it was inferred that the burst strength test rig had passed all the internal validation requirements.
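The control limits used in charts of this kind can be sketched as follows, treating the three results in each batch-operator cell as one subgroup. The constants A2 = 1.023, D3 = 0 and D4 = 2.574 are the standard Shewhart values for subgroups of size three; the function name and the example data are hypothetical, and Minitab's chart may differ in detail.

```python
# Standard Shewhart control chart constants for subgroups of size 3.
A2, D3, D4 = 1.023, 0.0, 2.574


def xbar_r_chart(subgroups):
    """Compute Xbar and R chart limits and flag points outside them."""
    means = [sum(s) / len(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    x_lcl, x_ucl = grand_mean - A2 * r_bar, grand_mean + A2 * r_bar
    r_lcl, r_ucl = D3 * r_bar, D4 * r_bar
    return {
        "xbar_limits": (x_lcl, x_ucl),
        "range_limits": (r_lcl, r_ucl),
        "means_outside": [i for i, m in enumerate(means) if not x_lcl <= m <= x_ucl],
        "ranges_outside": [i for i, r in enumerate(ranges) if not r_lcl <= r <= r_ucl],
    }


# Hypothetical subgroups: two cells from a low batch and a high batch.
chart = xbar_r_chart([[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]])
print(chart)
```

In a Gauge R&R study, means falling outside the Xbar limits are the desired outcome, while any range outside the R limits points to a repeatability problem worth investigating.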
4 Discussion of the results
Key quality management systems (ISO 9001) (e.g. Zimon, 2017) and quality improvement initiatives (Six Sigma) rely on the accuracy and precision of data collection. This calls for the validation of measurement equipment using the industry-standard Gauge R&R technique, which calculates measurement variation relative to process variation and process tolerance. Limited research evidence is available on the viability and relevance of the technique for measurement system analysis of burst strength measurement as a destructive test (e.g. Bhakhri and Belokar, 2017; Gorman and Bower, 2002; Hoffa and Laux, 2007; Wheeler, 1990). The current study thus provided novel insights into the various challenges and benefits of conducting such a study.
Test structure or plan. With destructive testing, it is often difficult to obtain a large number of test specimens. It is extremely important to develop a test plan that can capture both the process variation and the measurement system variation. For Crossed Gauge R&R studies, 90 measurements (ten parts, three operators and three repeats; 10 × 3 × 3) are considered to be a good number. However, considering the cost of the parts being destroyed and the effort involved, this was deemed impractical for the purpose of this study. Hence, the sample size was halved to 45, with five parts, three operators and three repeats (5 × 3 × 3). The current research found this to be adequate for capturing the necessary measurement equipment and process variations. The design and findings are comparable to those of previous research (Han and He, 2007), which used a larger sample of 72 measurements; the current study reached the same results with only 45 measurements, a reduction driven by cost restrictions. This is a strength and novelty of this research, as industries can use this knowledge to conduct cost-effective testing.
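One way to see what is given up by halving the plan is to compare the degrees of freedom available to each variance estimate in a balanced crossed ANOVA. The short sketch below does this for the two layouts mentioned above; the function name is illustrative.

```python
def study_size(parts, operators, repeats):
    """Run count and ANOVA degrees of freedom for a balanced crossed study."""
    return {
        "runs": parts * operators * repeats,
        "df_parts": parts - 1,
        "df_operators": operators - 1,
        "df_interaction": (parts - 1) * (operators - 1),
        "df_repeatability": parts * operators * (repeats - 1),
    }


for layout in [(10, 3, 3), (5, 3, 3)]:
    print(layout, study_size(*layout))
```

Moving from 10 × 3 × 3 to 5 × 3 × 3 halves the runs from 90 to 45 and the repeatability degrees of freedom from 60 to 30, while the degrees of freedom for part-to-part variation drop from 9 to 4.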
Homogeneous batches. The homogeneity of batches is absolutely critical for the successful implementation of a Crossed Gauge R&R study, as has been consistently emphasised in previous literature sources (e.g. Gorman and Bower, 2002; Mast and Trip, 2005) and research studies (Han and He, 2007; Phillips et al., 1997). The findings of the current study supported those inferences relating to the homogeneity of batches; without it, validation efforts are likely to fail. It is important to know that repeatability includes the within-batch variation, so if batches are not homogeneous, the measurement equipment is likely to fail the validation. In the current research, the repeatability percentage was 8.63, which was close to the acceptability limit of 10%. Based on prior experience with the product and controlled laboratory testing, the within-batch variation was deemed to be approximately 2% out of the 8.63% repeatability observed in the crossed Gauge R&R. It is advisable not to base any significant interpretation solely on repeatability numbers. Furthermore, in line with Han and He’s research, it was found that the within-batch variation must be examined with caution in cases where the repeatability exceeds the acceptable limit.
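The confounding described above can be stated compactly. In the sketch below the symbols are introduced for illustration only; the decomposition assumes that gauge error and within-batch part differences are independent.

```latex
\begin{align}
\sigma^2_{\text{repeatability, observed}} &= \sigma^2_{\text{gauge}} + \sigma^2_{\text{within-batch}} \\
\sigma_{\text{gauge}} &= \sqrt{\sigma^2_{\text{repeatability, observed}} - \sigma^2_{\text{within-batch}}}
\end{align}
```

Because the components add as variances, contributions expressed as percentages of tolerance combine in quadrature rather than by simple subtraction. For example, if the 8.63% and 2% figures were both standard-deviation-based percentages on the same scale (an assumption, since the source does not say how the 2% was expressed), the gauge-only contribution would be about $\sqrt{8.63^2 - 2^2} \approx 8.40\%$.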
Process knowledge. The initial DoE exercise was extremely useful in developing knowledge about the ultrasonic welding process and characterising the key input variables that affect the burst strength of the product. This proved to be effective in creating homogeneous batches across the entire operational range. It also helped in critically analysing the results of the Gauge R&R study rather than accepting them at face value. A novel contribution of the current research is that it goes beyond establishing the relevance of homogeneous batches: it also provides evidence on tools and techniques that can be used to create them.
The utility of the range chart. The first iteration of the Gauge study failed the repeatability condition. Out-of-control points in the range control chart at high burst strength (> 16 bar) indicated that something unusual was occurring at high pressure. Further investigation revealed that the test housing that held the cartridge was slightly oversized, resulting in leaks at higher pressure before the cartridge could break. Correct burst strength results were achieved once the cartridge housing was modified. The current study supported the utility of the range chart, as reported by other researchers (Bhakhri and Belokar, 2017; Diering et al., 2015).
Observations must be recorded. In line with previous literature (AIAG, 2002), the recording of observations was found to be of high relevance as tests were destructive, and measurements could not be repeated. On several occasions, anomalies were found in data, where a seal was found to be split after the test, resulting in a lower than normal value of the burst strength. Based on the observation, these values were safely ignored, and the test could be repeated, with the same settings.
The need to examine the experiment execution in destructive testing. Similar to analysis studies of the non-destructive measurement system, it was observed that extra vigilance was required for executing the Gauge R&R study. There was a need to randomise the runs with a clear statement of the purpose for the execution of the measurements. These findings corroborated the results found in the previously published literature (AIAG, 2002).
Cost vs value. In the current study, the cost of a part could be as little as 1/10 000 of the cost of poor quality. So, it was reasonably easy to justify the cost of destroying 45 parts to validate the measurement system and prevent the cost of poor quality. However, this may not be possible in every case, and for those cases, the value of destructive testing in the first place should be questioned and, if possible, alternative testing or measurement methods should be found.
5 Conclusions
The following conclusions can be drawn from this study:
- The use of the Crossed Gauge R&R analytical tool was found to be useful as a means of validating the burst strength test equipment that uses destructive testing. However, it should be used as an aid rather than the solution for the validation of a measurement system.
- Well documented validation criteria existed for most measurement equipment, except for the burst strength test equipment as it involved destructive testing (Mast and Trip, 2005).
It should be noted that the biggest challenge in a destructive Gauge R&R study is the estimation of the within-batch variation, which was also encountered in the current study. However, in-depth knowledge of the manufacturing processes and measurement equipment was helpful in overcoming this limitation in the current research. The applicability of this research is thus limited to similar settings only, as it may be equally challenging to conduct Gauge R&R for a new measurement system or a manufacturing process in a different context. It will, therefore, be useful for future research to develop a more general statistical method for calculating the within-batch variation for measurement systems that use destructive testing.
References
Automotive Industry Action Group (AIAG). (2002). Measurement System Analysis, Reference Manual, 3rd ed.
Barbosa, G. F., Peres, G. F., & Hermosilla, J. L. G. (2014). R&R (repeatability and reproducibility) gage study applied on gaps’ measurements of aircraft assemblies made by a laser technology device. Production Engineering - Research and Development 8(4), 477-489. doi: 10.1007/s11740-014-0553-z
Bhakhri, R., & Belokar, R.M. (2017). Quality Improvement Using GR&R: A Case Study. International Research Journal of Engineering and Technology, 4(6), 3018-3023.
Box, G. E. P., Hunter, W. G., & Hunter, S. J. (1978). Statistics for Experimenters. New York, United States: Wiley.
Breyfogle, F. W., & Meadows, B. (2001). Bottom-Line Success with Six Sigma. Milwaukee, United States: Quality Progress, ASQ.
Breyfogle, F. W., Cupello, J. M., & Meadows, B. (2001). Managing Six Sigma: A Practical Guide to Understanding, Assessing, and Implementing the Strategy That Yields Bottom-Line Success. New York, United States: Wiley.
Burdick, R. K., Borror, C. M., & Montgomery, D. C. (2003). A review of measurement systems capability analysis. Journal of Quality Technology 35(4), 342-354.
Burdick, R. K., Park, Y. J., & Montgomery, D. C. (2005). Confidence intervals for misclassification rates in a gauge R&R study. Journal of Quality Technology 37(4), 294-303.
Diering, M., Hamrol, A., & Kujawińska, A. (2015). Measurement System Analysis Combined with Shewhart’s Approach. Key Engineering Materials 637, 7-11.
Dolezal, K. K., Burdick, R. K., & Birch, N. J. (1998). Analysis of a two-factor R&R study with fixed operators. Journal of Quality Technology 30(2), 163-170.
Engel, J., & De Vries, B. (1997). Evaluating a well-known criterion for measurement precision. Journal of Quality Technology 29(4), 469-476.
Gorman, D., & Bower, K.M. (2002). Measurement System Analysis And Destructive Testing, Six Sigma Forum Magazine. American Society for Quality 1(4), 16-19.
Han, Y., & He, Z. (2007). An Applied Study of Destructive Measurement System Analysis. Second IEEE Conference on Industrial Electronics and Applications.
Hoffa, D. W., & Laux, C. M. (2007). Gauge R&R: An Effective Methodology for Determining the Adequacy of a New Measurement System for Micron-level Metrology. Journal of Industrial Technology 23(4).
Ishikawa, K. (1982). Guide to quality control. White Plains, United States: Quality Resources.
Juran, J. M. (1990). Quality control handbook (4th ed.). New York, United States: McGraw Hill.
Juran, J. M., & Godfrey, A. B. (1999). Juran’s Quality Control Handbook (5th ed.). New York, United States: McGraw-Hill.
Liangxing, S., Wei, C., & Liang, F. L. (2014). An Approach for Simple Linear Profile Gauge R&R Studies. Discrete Dynamics in Nature and Society 2014, 816980.
Mast, D. J., & Trip, R. (2005). Gage R&R studies for destructive measurements. Journal of Quality Technology, 37(1), 40-49.
Measurement Systems Analysis (MSA) Work Group. (2010).
Messina, W. S. (1987). Statistical Quality Control for Manufacturing Managers, New York, United States: Wiley.
Miller, I., & Freund, J. (1965). Probability and Statistics for Engineers. New Jersey, United States: Prentice-Hall, Englewood Cliffs.
Minitab. (2002). Minitab Statistical Software. Release 13. Pennsylvania, United States: State College.
Montgomery, D. C. (2001). Design and Analysis of Experiments (5th ed.). New York, United States: John Wiley & Sons.
Montgomery, D. C. (2013). Introduction to Statistical Quality Control (7th ed.). New York, United States: John Wiley & Sons.
Pan, J. H. (2004). Determination of the optimal allocation of parameters for gauge repeatability and reproducibility study. International Journal of Quality and Reliability Management 21(6), 672-682.
Pan, J. H. (2006). Evaluating the gauge repeatability and reproducibility for different industries. Quality and Quantity 40(4), 499-518.
Pande, P., Neuman, R., & Cavanagh, R. (2002). The Six Sigma Way: Team Field Book. New York, United States: McGraw-Hill.
Persijn, M., & Nuland, Y. V. (1996). Relation between measurement system capability and process capability. Quality Engineering, 9(1), 95-97.
Peruchi, R. S., Balestrassi, P. P., Paiva, A. P., Ferreira, J. R., & Carmelossi, M. D. (2013). A new multivariate gage R&R method for correlated characteristics. International Journal of Production Economics 144(1), 301-315.
Phillips, A. R., Jeffries, R., Schneider, J., & Frankoski, S. P. (1997). Using Repeatability and Reproducibility Studies to Evaluate a Destructive Test Method. Journal of Quality Engineering 10(2), 283-290.
Smith, R. R., McCrary, S. W., & Callahan, R. N. (2007). Gauge repeatability and reproducibility studies and measurement system analysis: A multimethod exploration of the state of practice. Journal of Industrial Technology 23(1), 1-11.
Sujová, A., Marcineková, K., & Simanová, Ľ. (2019). Influence of Modern Process Performance Indicators on Corporate Performance — the Empirical Study. Engineering Management in Production and Services 11(2), 119-129. doi: 10.2478/emj-2019-0015
Tsai, P. (1989). Variable gauge repeatability and reproducibility study using the analysis of variance method. Quality Engineering 1(1), 107-115.
Vardeman, S. B., & Job, J. M. (1999). Statistical quality assurance methods for engineers. New York, United States: John Wiley & Sons, Inc.
Wesff, E. (2012). Chinese OEM reduces returns with improved product testing. The Global Voice of Quality 4(2), 1-6.
Wheeler, D. J. (1990). Evaluating the Measurement Process when Testing is Destructive. TAPPI Polymers and Laminations Conference. Boston, United States: TAPPI Press.
Zimon, D. (2017). The Influence of Quality Management Systems for Improvement of Logistics Supply in Poland. Oeconomia Copernicana 8(4), 643-655.