How would a company use risk index numbers/values to determine sample size for protocols (e.g. validation)? If you are in industry, perhaps you can give examples of what occurs in your company?
For example, in my company, a high risk index number means the sample size needed for testing has to be greater. My company deals with IABs, where leaks are considered a major failure, so the sample size for leak testing needs to be greater. However, something like a cosmetic defect is considered a minor risk/failure and would not need a large sample size.
Hi Abhishek,
After doing some research, I found that a power analysis for a hypothesis test of effects/relationships would be ideal for determining sample size. Studies that attempt to detect an effect, such as a difference between two treatments or the presence/absence of a certain risk factor, need accurate sample size calculations to ensure that the study has enough power to detect a statistically significant effect. If the sample size is insufficient, it will be hard to prove that any differences observed are meaningful, because they could just be due to sampling variation. Some factors that can affect sample size calculations are:
1. Objective of research - is the research based on an estimation, hypothesis or equivalence testing problem?
2. Are there any covariates or factors for which to control?
3. Research design - is it a simple randomized controlled trial (RCT), cluster randomized trial, equivalence trial, etc.?
4. Is there a desired statistical significance level?
5. Test statistic analysis - will it be a one- or two-tailed test?
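To make a few of these factors concrete, here is a rough sketch of a per-group sample size for comparing two means, using the standard normal approximation. The function name and the example numbers (effect size 0.5, standard deviation 1.0, 80% power) are my own assumptions for illustration, not values from any particular study.

```python
from math import ceil
from statistics import NormalDist


def two_sample_n(delta, sigma, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate per-group n for detecting a mean difference `delta`.

    Normal approximation: n = 2 * ((z_alpha + z_beta) * sigma / delta)^2.
    All inputs are illustrative assumptions chosen by the analyst.
    """
    z = NormalDist()  # standard normal distribution
    # A two-tailed test splits alpha across both tails.
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


n_two = two_sample_n(delta=0.5, sigma=1.0)                    # two-tailed
n_one = two_sample_n(delta=0.5, sigma=1.0, two_tailed=False)  # one-tailed
print(n_two, n_one)
```

A one-tailed test needs fewer samples than a two-tailed test at the same significance level, which is why factor 5 matters. A smaller effect size or a larger standard deviation pushes the number up quickly.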
Selecting the sample size for a clinical study is always a critical choice. Too few samples do not represent the target population. On the other hand, more samples than required expose more individuals to the risk of the intervention. Additionally, it is a waste of time and resources.
An optimum number of participants is required to arrive at scientifically sound results. Risk index is undoubtedly one of the major factors considered in sample size determination, though I believe it would not be fair to judge by only one factor. Many other factors affect sample size. For example,
•	Level of significance
•	Underlying event rate in the population
•	Standard deviation in the population.
-Hetal
I agree that sample size is critical to any project. I don't have any industry experience with sample size related to risk analysis; however, I was able to find an article about sample size estimation, which was a good read.
http://www.sciencedirect.com/science/article/pii/S1532046414000501
I do not have experience in the industry, but I found that confidence, reliability, and acceptance quality limits (AQLs) are used to determine sample sizes for process validation. They help ensure that validation activities yield valid results based on an organization’s risk acceptance threshold, industry practice, guidance documents, and regulatory requirements. The Bayes Success-Run Theorem is one way to determine an appropriate risk-based sample size for process validations. It is implemented as follows:
R = (1-C) ^ (1/n)
where: R = Reliability (or probability of success)
C = confidence level
n = sample size for "0" failures allowed on test
Transposed, the formula becomes n = ln(1-C)/ln(R)
For example, if we want to be 95% confident that a process is 95% reliable, how many parts do we need to produce that are defect-free?
n = ln(1-0.95)/ln(0.95) = 58.4, rounded up to 59 parts with "0" failures allowed on test
The confidence and reliability levels to determine the appropriate sample size will depend on the risks associated with the product.
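As a quick sketch of the formula above, the calculation can be scripted so that different confidence/reliability pairs are easy to compare. The function names are mine, and the extra pairs (95%/99% and 90%/90%) are just example inputs, not levels taken from the sources.

```python
from math import ceil, log


def success_run_n(confidence, reliability):
    """Sample size with zero failures allowed: n = ln(1-C)/ln(R), rounded up."""
    return ceil(log(1 - confidence) / log(reliability))


def demonstrated_reliability(confidence, n):
    """Reliability shown by n passing units: R = (1-C)^(1/n)."""
    return (1 - confidence) ** (1 / n)


n_95_95 = success_run_n(0.95, 0.95)  # the worked example above
n_95_99 = success_run_n(0.95, 0.99)  # a stricter, higher-risk requirement
n_90_90 = success_run_n(0.90, 0.90)  # a looser, lower-risk requirement
print(n_95_95, n_95_99, n_90_90)
```

Tightening reliability from 95% to 99% at the same confidence raises the sample size roughly fivefold, which is how a higher risk level ends up driving a larger sample.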
Even though I only included a mathematical example, I think it really helps in understanding how we can determine an appropriate risk-based sample size for an upcoming project.
sources:  http://www.pharmaceuticalonline.com/doc/risk-based-approaches-to-establishing-sample-sizes-for-process-validation-0001
http://www.qualitymag.com/articles/91991-sample-sizes-how-many-do-i-need
From my experience and in my opinion, the risk associated with a specific product performance (or lack thereof) should not directly have a major impact on the sample size that you use for testing. Sample size is usually driven by the confidence and reliability associated with a specific product requirement, in conjunction with the test evaluation method (variable or attribute). For a variable test, confidence and reliability are usually evaluated by means of process capability (Cpk or Ppk), and in order to conduct a capability analysis you need to test a statistically significant sample size. There are often debates about what is considered a statistically significant sample size, but that does not really have anything to do with the risk associated with a specific product performance. Variable-based product testing generally requires low sample sizes because of the statistical power that comes with continuous variable data, and it does not warrant driving sample sizes up based on risk. As for attribute testing, confidence and reliability are also the driving factors for the number of samples needed. Based on the confidence and reliability levels, the number of samples needed for testing, along with the number of acceptable failures for that given population, can be determined. I do not see why the risk associated with a specific product performance would make you deviate from the sample size plans associated with the confidence and reliability levels. However, with all of that being said, I can see how the risk associated with different product performances could change the confidence and reliability levels that are required, which would then impact the number of samples needed for testing (at least for attribute testing).
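For the attribute case, here is a minimal sketch of how confidence, reliability, and the number of acceptable failures together fix the sample size. It assumes a simple binomial model with independent units; the function name and the search limit are my own choices for illustration.

```python
from math import comb


def attribute_n(confidence, reliability, max_failures=0, n_limit=100_000):
    """Smallest n such that seeing <= max_failures failures demonstrates
    `reliability` at `confidence`, under a binomial model (an assumption)."""
    p = 1 - reliability    # worst-case defect rate being ruled out
    beta = 1 - confidence  # allowed chance of passing at that defect rate
    for n in range(max_failures + 1, n_limit + 1):
        # Probability of max_failures or fewer failures in n units.
        cdf = sum(comb(n, k) * p**k * (1 - p) ** (n - k)
                  for k in range(max_failures + 1))
        if cdf <= beta:
            return n
    raise ValueError("no sample size found below n_limit")


for c in (0, 1, 2):
    print(c, attribute_n(0.95, 0.95, max_failures=c))
```

Allowing more failures on test requires more samples to demonstrate the same confidence and reliability. If risk changes the required levels, rerunning the same search gives the new plan.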
Before beginning, an organization must establish its definitions of risk and the confidence level and reliability value associated with each. These definitions can and should vary based on organizational needs. A good place to determine the risk level is failure mode and effects analysis (FMEA), a systematic group of activities designed to recognize, document, and evaluate the potential failure of a product or process and its effects. FMEA uses a risk priority number (RPN), which is composed of frequency, detection, and severity. The higher the RPN, the higher the risk. However, a low probability of occurrence in conjunction with high severity and a high probability of detection may still necessitate the appropriate controls for high risk.
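A rough sketch of how an RPN could be tied to confidence and reliability levels is below. The 1 to 10 scoring scales, the RPN cutoffs (80 and 200), the severity override at 9, and the tier table are all hypothetical values I made up for illustration; each organization would define its own.

```python
from math import ceil, log

# Hypothetical tiers: risk level -> (confidence, reliability).
TIERS = {"high": (0.95, 0.99), "medium": (0.95, 0.95), "low": (0.90, 0.90)}


def risk_tier(severity, occurrence, detection):
    """Classify risk from 1-10 FMEA scores (cutoffs are assumptions)."""
    rpn = severity * occurrence * detection
    # High severity is treated as high risk even when the RPN is low.
    if severity >= 9 or rpn >= 200:
        return "high"
    if rpn >= 80:
        return "medium"
    return "low"


def sample_size(severity, occurrence, detection):
    """Zero-failure sample size for the tier, n = ln(1-C)/ln(R) rounded up."""
    confidence, reliability = TIERS[risk_tier(severity, occurrence, detection)]
    return ceil(log(1 - confidence) / log(reliability))


# Severe but rare and easy to detect: RPN is only 40, yet the tier is high.
print(risk_tier(10, 2, 2), sample_size(10, 2, 2))
print(risk_tier(3, 4, 5), sample_size(3, 4, 5))
```

The severity override reflects the point that a low RPN can hide a failure mode that still needs high-risk controls.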
As a Quality Engineer, I work with AQL levels that are specified in our drawings for the Quality Inspectors to use when inspecting incoming product that is to be used on the manufacturing floor. As neb2 stated, AQLs are Acceptance Quality Limits, which indicate what sample size can be used depending on the requirement being tested on the drawing specification. Our AQL levels are 0.65, 1.0, and 4.0. The 0.65 level is for the highest risk, which typically calls for a larger sample size. A minor cosmetic defect is not an issue on its own; however, when the occurrence is frequent it does become an issue because of our supplier specs.
At Getinge, the sample size is determined mostly from the risk index. Since Getinge’s main acute therapy product is intra-aortic balloons (IABs), the risk index significantly changes the sample size when running a protocol. Recently, I was helping a senior engineer deal with a CAPA on IAB leaks within the tip seal. We wrote up a protocol which had to test a sample size of 300 IABs with a leak test and a pull test. If even one of the 300 IABs failed either of the tests, it would be a major problem. The risk of having a leak on a Class III medical device is not something that is taken lightly. The possibility of more than one leak is what calls for a greater sample size to account for the high risk index.
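To show why a sample as large as 300 helps with a high-risk failure like a leak, here is a small sketch of the chance that the test catches at least one failure for a given true failure rate. It assumes independent units and a zero-failures-allowed test; the failure rates in the loop are hypothetical, not actual product data.

```python
def detection_probability(n, defect_rate):
    """Chance that a test of n independent units shows at least one failure."""
    return 1 - (1 - defect_rate) ** n


# Hypothetical true leak rates, for illustration only.
for rate in (0.001, 0.005, 0.01):
    p_detect = detection_probability(300, rate)
    print(f"true rate {rate:.1%}: P(at least one failure in 300) = {p_detect:.3f}")
```

With 300 units, a true failure rate of 1% would be caught about 95% of the time, while a much smaller sample would often pass the same process.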
I agree with this post. AQL levels are, for most companies, the guideline for identifying what the sample size should be depending on the risk. However, further analysis such as power level, significance level, confidence intervals, etc., can be applied to potentially explain why that sample size isn't necessary for that level of risk. Going against the AQL levels is usually frowned upon; however, it is up to Quality whether they are willing to accept fewer samples if the risk isn't as severe.
The type of risk plays a role in what statistical analysis you need when testing. Most companies here just use a higher sample size when testing high-risk hazards, but I believe it can go deeper than that. Each risk has an associated frequency and severity. For high-frequency risks, increasing the sample size does not seem to make as much logical sense as increasing the power of the test. If you know the risk frequently happens, increasing the sample size will not change the proportion of failures. Decreasing the standard deviation is the way to go. For high-severity risks, increasing the sample size makes sense. For very severe risks, it makes more sense to make sure the failure never happens, rather than trying to lessen the effect of it happening.
As hc225 mentioned, from experience in the industry, AQL levels are in place for the critical dimensions from the technical drawings of the product. Quality Control (QC) measures are conducted for every batch of product. In my experience with a product that is a Class I medical device in the US but a Class II medical device in the EU, the sample size per QC test is also dependent on risk, as mentioned by hc225. The acceptance rate of each QC test is also dependent on risk. Large sample sizes for these tests can be very costly, since they are conducted per batch of product during its entire lifespan in the marketplace. If QC tests are added or removed due to the presence of failures, cost, or necessity due to risk, it will require registration notifications and possibly re-registration, depending on the type of changes and the affected markets.
Determining the sample size for validation protocols seems to be a somewhat neglected, but very important, task. The biggest reason this task proves to be difficult is that it depends on properly analyzing risk and quantifying how many employees are required. If this number is underestimated, then there could be long delays in the validation process, or the job could be done to an unsatisfactory extent. On the other hand, overestimating the amount of risk involved would result in a waste of manpower and resources. Therefore, properly classifying the amount of risk involved is an essential task in product development.

