Understanding shrinkage and how to circumvent it (2024)

Introduction
Conditional distribution
Conditional mode (EBEs) and conditional mean
Shrinkage
Consequences of shrinkage
How to circumvent shrinkage
Example
Conclusion

Introduction

Shrinkage is a phenomenon that appears when the data is insufficient to precisely estimate the individual parameters (EBEs). In that case, the EBEs “shrink” towards the center of the population distribution and do not properly represent the inter-individual variability. This leads to diagnostic plots that may be misleading, either hiding true relationships or inducing wrong ones.

In the diagnostic plots, Monolix uses samples from the conditional distribution as individual parameters, which leads to reliable plots even when shrinkage is present in the model [1]. This method relies on the calculation of the conditional distribution.

Conditional distribution

The conditional distribution is defined for each individual. It represents the uncertainty of the individual’s parameter value, taking into account the information at hand for this individual:

the observed data for that individual,
the covariate values for that individual,
the fact that the individual belongs to the population for which we have already estimated the typical parameter value (fixed effects) and the inter-individual variability (standard deviation of the random effects).

In a mathematical formalism, the conditional distribution is written $p(\psi_i|y_i;\hat{\theta})$ with $\psi_i$ the individual parameters for individual $i$, $\hat{\theta}$ the estimated population parameters, and $y_i$ the data (observations) for individual $i$.

It is not possible to directly calculate the probability for a given $\psi_i$ (no closed form), but it is possible to obtain samples from the distribution using a Markov-Chain Monte-Carlo procedure (MCMC). This is what is done in the Conditional distribution task.

With the following conditional distribution for the volume V of individual i, we see that the most probable value is around 25 L, but there is considerable uncertainty: the value could also be 15 L or 40 L, for instance. For visual purposes, we have drawn the distribution as a smooth curve, but remember that the conditional distribution has no explicit expression. One can only obtain samples from this distribution using MCMC.
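As a rough illustration of how such samples can be obtained, the sketch below runs a random-walk Metropolis sampler for a single individual. Everything in it is an assumption made for the example: the one-compartment bolus model, the values of `V_pop`, `omega_V`, `k`, `sigma` and `dose`, and the two made-up observations. It is not the algorithm implemented in Monolix, only a toy instance of MCMC sampling from $p(\psi_i|y_i;\hat{\theta})$.

```r
# Toy random-walk Metropolis sampler for the conditional distribution of V
# for one individual. All numerical values are hypothetical.
set.seed(1)

V_pop   <- 25           # typical volume (L), assumed
omega_V <- 0.4          # sd of the random effect on log(V), assumed
k       <- 0.1          # elimination rate (1/h), treated as known here
sigma   <- 0.5          # sd of the additive residual error (mg/L), assumed
dose    <- 100          # bolus dose (mg), assumed
t_obs   <- c(1, 6)      # sparse sampling times (h)
y_obs   <- c(3.4, 2.3)  # made-up observations (mg/L)

# Unnormalised log-density of eta given the data: log-likelihood + log-prior
log_post <- function(eta) {
  V    <- V_pop * exp(eta)
  pred <- dose / V * exp(-k * t_obs)
  sum(dnorm(y_obs, mean = pred, sd = sigma, log = TRUE)) +
    dnorm(eta, mean = 0, sd = omega_V, log = TRUE)
}

n_iter <- 5000
eta    <- numeric(n_iter)  # the chain starts at eta = 0 (the population value)
for (i in 2:n_iter) {
  proposal  <- eta[i - 1] + rnorm(1, sd = 0.3)
  log_ratio <- log_post(proposal) - log_post(eta[i - 1])
  # Accept the proposal with probability min(1, exp(log_ratio))
  eta[i] <- if (log(runif(1)) < log_ratio) proposal else eta[i - 1]
}

# Discard a burn-in and convert back to the parameter scale
V_samples <- V_pop * exp(eta[-(1:500)])
quantile(V_samples, c(0.05, 0.5, 0.95))
```

The spread of the quantiles gives a feel for the uncertainty on V for this individual; adding more observations to `t_obs` and `y_obs` would narrow it.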

Conditional mode (EBEs) and conditional mean

It is often convenient to work with a single value for the individual parameters (called an estimator), instead of a probability distribution. Several “summary” values can be used, such as the mode or the mean of the conditional distribution.

The mode is also called the maximum a posteriori or EBE (for empirical Bayes estimate). It is often preferred over the mean, because the mode represents the most likely value, i.e. the value which has the highest probability.
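To make the difference between the two summaries concrete, here is a small sketch that computes both from a set of draws. The draws come from a log-normal distribution chosen purely as a stand-in for conditional samples of V (the values 25 and 0.4 are assumptions), and the mode is approximated with a kernel density estimate, which is not how Monolix computes the EBEs.

```r
set.seed(7)
# Stand-in for samples from a right-skewed conditional distribution of V (L)
V_samples <- rlnorm(10000, meanlog = log(25), sdlog = 0.4)

# Conditional mean: plain average of the samples
cond_mean <- mean(V_samples)

# Conditional mode: location of the peak of a kernel density estimate
dens      <- density(V_samples)
cond_mode <- dens$x[which.max(dens$y)]

c(mode = cond_mode, mean = cond_mean)
```

For a right-skewed distribution such as this one, the mode lies below the mean; for a symmetric distribution the two coincide.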

Shrinkage

When the individual data brings only little information about the individual parameter value, the conditional distribution is wide, reflecting the uncertainty of the individual parameter value. In that case, the mode of the conditional distribution is close to (or “shrinks” to) the mode of the population distribution. If this is the case for all or most of the individuals, all individual parameters end up concentrated around the mode of the population distribution and do not correctly represent the inter-individual variability which has been estimated via the standard deviation parameters (omega parameters in Monolix). This is the shrinkage phenomenon. Shrinkage typically occurs when the data is sparse.

Below we present the example of a parameter V which has a lot of shrinkage and a parameter k with almost no shrinkage. We consider a data set with 10 individuals. In the upper plots, the conditional distributions of each of the 10 individuals are shown. For the volume V, the individual parameter values are uncertain and their conditional distributions are wide. When reporting the modes (closed circles) of the conditional distributions on the population distribution (black curve, bottom plots), the modes appear shrunk compared to the population distribution. In contrast, for k, the conditional distributions are narrow and the modes are well spread over the population distribution. There is shrinkage for V, but not for k.

Pooling the individual parameters of all individuals together, one can overlay the population distribution (black line) with the histogram of individual parameters, i.e. the conditional modes (blue bars). This is displayed in the Distribution of the individual parameters plot in Monolix:

The shrinkage phenomenon can be quantified via a shrinkage value for each parameter. In Monolix, the formula for shrinkage has been updated in version 2024 to use the standard deviation instead of the variance in accordance with industry standards.

Starting with Monolix version 2024, shrinkage is calculated from the empirical standard deviation of the random effects $\textrm{sd}(\eta_i)$ and the estimated standard deviation (the omega population parameter $\omega$). The random effects $\eta_i$ can be calculated from the EBEs, the conditional mean, or samples from the conditional distribution. Typically, the shrinkage is reported using the EBEs.

$$\eta\textrm{-sh}=1-\frac{\textrm{sd}(\eta_i)}{\omega}$$

In the case of inter-occasion variability, shrinkage includes both the inter-individual and the inter-occasion variability (the gamma population parameter $\gamma$):

$$\eta\textrm{-sh}=1-\frac{\textrm{sd}(\eta_i)}{\sqrt{\omega^2 + \gamma^2}}$$
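The two standard-deviation-based formulas above can be written as a single small R function. This is only a sketch: the function name `eta_shrinkage` and the example values are hypothetical, and R's `sd()` uses the n-1 denominator, which may differ from the convention used internally by Monolix.

```r
# Eta-shrinkage based on standard deviations (formulas above).
#   eta   : vector of individual random effects (e.g. derived from the EBEs)
#   omega : estimated standard deviation of the random effects
#   gamma : estimated inter-occasion standard deviation (0 when there is no IOV)
eta_shrinkage <- function(eta, omega, gamma = 0) {
  1 - sd(eta) / sqrt(omega^2 + gamma^2)
}

# Hypothetical random effects for 10 individuals and a hypothetical omega
eta_V   <- c(-0.10, 0.05, 0.12, -0.03, 0.08, -0.15, 0.02, -0.06, 0.11, -0.04)
omega_V <- 0.4

shrinkage_V <- eta_shrinkage(eta_V, omega_V)
round(100 * shrinkage_V, 1)  # shrinkage in percent
```

With these made-up values the random effects are much less dispersed than `omega_V` suggests, so the function returns a shrinkage of roughly 77%.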

In Monolix versions 2023 and earlier, shrinkage is calculated using the ratio of the empirical variance and the estimated variance as:

$$\eta\textrm{-sh}=1-\frac{\textrm{var}(\eta_i)}{\omega^2}$$

Results

In Monolix versions 2023 and earlier, the shrinkage can be displayed in the Distribution of the individual parameters plot, by selecting the “information” toggle.

Starting in Monolix version 2024, shrinkage information is available in several ways:

Results / Indiv. Results / Cond. Mean [summary] – shrinkage is reported if the Conditional Distribution task has been run
Results / Indiv. Results / Cond. Mode [summary] – shrinkage is reported if the EBEs task has been run
In the IndividualParameters folder inside the results folder, there is a file shrinkage.txt
In reports, using the metric SHRINKAGE in the population or individual parameters table placeholder
In the plot Distribution of the individual parameters by switching on “information”
In the plot Distribution of the standardized random effects by switching on “information”

Calculating the shrinkage in R

Starting with Monolix version 2024, there is a method available to directly return the shrinkage information: getEtaShrinkage().

In Monolix version 2023 and earlier, there is no function in the lixoftConnectors to directly get the shrinkage, but you can calculate it from the estimated parameters, for example for parameter V:

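    # Read the estimated population parameters and the estimated random effects
    population_params <- read.csv("monolix_project/populationParameters.txt")
    eta_estimated <- read.csv("monolix_project/IndividualParameters/estimatedRandomEffects.txt")
    # Standard deviation of the random effects of V
    # (row name assumed to follow the omega_<parameter> naming of the output file)
    omega_V <- population_params$value[population_params$parameter == "omega_V"]
    # Random effects of V derived from the conditional mode (EBEs)
    eta_V_estimated_mode <- eta_estimated$eta_V_mode
    # Variance-based shrinkage (Monolix 2023 and earlier), in percent
    shrinkage_eta_V_estimated_mode <- (1 - var(eta_V_estimated_mode) / omega_V^2) * 100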

Comparison to Nonmem: the Nonmem definition of shrinkage is based on a ratio of standard deviations. This is also the case in Monolix 2024 and above. However, Monolix 2023 and below uses a ratio of variances (which is more common in statistics). Below we provide a “conversion table” which should be read in the following way: a situation that leads to a shrinkage of 30% in Nonmem and Monolix 2024 leads to a shrinkage of around 50% in Monolix 2023. The Nonmem and Monolix 2024 shrinkage can be calculated from the Monolix 2023 shrinkage using the following formula:

$$\eta\textrm{-shNM}=\eta\textrm{-shMLX24}=1-\sqrt{1-\eta\textrm{-shMLX23}}$$
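The formula above, and its inverse, can be applied to any shrinkage value with a few lines of R. The function names `sh_var_to_sd` and `sh_sd_to_var` are hypothetical helpers introduced for this sketch; shrinkage values are expressed as fractions rather than percentages.

```r
# Variance-based shrinkage (Monolix <= 2023) -> sd-based (Nonmem, Monolix >= 2024)
sh_var_to_sd <- function(sh_var) 1 - sqrt(1 - sh_var)

# Inverse conversion, obtained by solving the formula above for the
# variance-based shrinkage
sh_sd_to_var <- function(sh_sd) 1 - (1 - sh_sd)^2

# Tabulate both definitions side by side for a grid of sd-based values
sh_sd <- seq(0.1, 0.9, by = 0.1)
conversion <- data.frame(
  shrinkage_sd_pct  = 100 * sh_sd,
  shrinkage_var_pct = round(100 * sh_sd_to_var(sh_sd), 1)
)
print(conversion)
```

For an sd-based shrinkage of 30%, the second column gives 51%, which matches the "around 50%" quoted above.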

Is it OK to get a negative shrinkage? Yes. In case of no shrinkage, $var(\eta_i)= \omega^2$ when $var(\eta_i)$ is calculated on an infinitely large sample. In practice, $var(\eta_i)$ is calculated on a limited sample related to the number of individuals. Its value can by chance be a little bigger than $\omega^2$, leading to a slightly negative shrinkage.
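A quick simulation shows how often this happens by chance. In the sketch below, the random effects are drawn directly from the population distribution (so there is no shrinkage by construction), and the standard-deviation-based shrinkage is computed for many small data sets. The values of `omega`, `n_id` and `n_rep` are arbitrary assumptions.

```r
set.seed(42)
omega <- 0.3   # true standard deviation of the random effects (assumed)
n_id  <- 20    # number of individuals per simulated data set (assumed)
n_rep <- 2000  # number of simulated data sets

# For each replicate, draw the true random effects of n_id individuals
# and compute the sd-based shrinkage against the true omega
shrinkage <- replicate(n_rep, 1 - sd(rnorm(n_id, mean = 0, sd = omega)) / omega)

mean(shrinkage < 0)  # fraction of replicates with a negative shrinkage
range(shrinkage)     # smallest and largest shrinkage observed
```

With so few individuals, a sizeable fraction of the replicates gives a negative value even though nothing is wrong with the model.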

Consequences of shrinkage

In case of shrinkage the individual parameters (conditional mode/EBEs or conditional mean) are biased because they do not correctly reflect the population distribution.

As these individual parameters are used in diagnostic plots (in particular the Correlation between random effects and the Individual parameters versus covariates plots), the diagnostic plots can become misleading in the presence of shrinkage, either hiding relationships or suggesting wrong ones. This complicates the identification of mis-specifications and burdens the modeling process.

Note that the shrinkage of the EBEs has no consequences on the population parameter estimation via SAEM (which doesn’t use EBEs, contrary to FOCE for instance). However, the lack of informative data may lead to large standard errors for the population parameters and a slower convergence.

How to circumvent shrinkage

Monolix provides a very efficient solution to circumvent the shrinkage problem, i.e. the bias in the diagnostic plots induced by the use of shrunk individual parameters. Instead of using the shrunk conditional mode/EBEs or conditional mean, Monolix uses parameter values randomly sampled from the conditional distribution:

Pooling the random samples of the conditional distributions of all individuals allows us to look at them as if they were sampled from the population distribution. And this is exactly what we want: to have individual parameter values (i.e. the samples) that correctly reflect the population distribution.

From a mathematical point of view, one can show that the random samples are an unbiased estimator:

$$p(\psi_i)=\int p(\psi_i|y_i)p(y_i)dy_i=\mathbb{E}_{y_i}(p(\psi_i|y_i))$$
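The identity above can be checked numerically on a toy model where the conditional distribution is known in closed form. The sketch below assumes a linear Gaussian model with one observation per individual ($y_i = \eta_i + \varepsilon_i$), which is much simpler than the nonlinear models handled by Monolix; the values of `omega` and `sigma` are arbitrary, with a large `sigma` chosen to make the data uninformative.

```r
set.seed(123)
n_id  <- 5000
omega <- 1.0  # sd of the random effects (assumed)
sigma <- 2.0  # sd of the residual error (assumed; large => uninformative data)

eta_true <- rnorm(n_id, mean = 0, sd = omega)
y        <- eta_true + rnorm(n_id, mean = 0, sd = sigma)  # one observation each

# In this toy model the conditional distribution is Gaussian:
# eta_i | y_i ~ N(w * y_i, w * sigma^2), with w = omega^2 / (omega^2 + sigma^2)
w         <- omega^2 / (omega^2 + sigma^2)
cond_mode <- w * y  # conditional mode (equal to the conditional mean here)
cond_draw <- rnorm(n_id, mean = cond_mode, sd = sqrt(w) * sigma)  # one sample each

# Empirical spread of both estimators compared with omega
c(sd_mode = sd(cond_mode), sd_draw = sd(cond_draw), omega = omega)

# sd-based shrinkage for both estimators
c(shrinkage_mode = 1 - sd(cond_mode) / omega,
  shrinkage_draw = 1 - sd(cond_draw) / omega)
```

The conditional modes are strongly shrunk, while the pooled random draws have a standard deviation close to `omega`, so their shrinkage is close to zero.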

The improvement brought by the random samples from the conditional distributions can be visualized in the following way: while the modes (closed circles) are shrunk, the random samples (stars) spread over the entire population distribution (in black). One can even draw several random samples per individual to increase the informativeness of the diagnostic plots. This is what is done in the MonolixSuite2018R1 version (while MonolixSuite2016R1 uses one sample per individual).

In [1], the authors justify the use of sampled individual parameters. They demonstrate their usefulness in diagnostic plots via numerical experiments with simulated data. They also show that statistical tests based on these sampled individual parameters are unbiased: the type I error rate is the desired significance level of the test, and the probability of detecting a mis-specification in the model increases with the magnitude of this mis-specification.

Example

Fitting the sparseTobramycin data with a (V,k) model leads to a high shrinkage (75%) of the volume V when using the EBEs. In contrast, when using samples from the conditional distributions of each individual, the shrinkage disappears.

The usefulness of the samples from the conditional distribution can be seen in the Correlation between the random effects plot. Using the EBEs, the plot suggests a positive correlation of about 30% between the volume and the elimination rate. Using the random samples, the plot no longer suggests this correlation. If the correlation is added to the model, its estimate is small and not significant.

Another example of shrinkage can be seen for the parameter ka in the warfarin data set. In this example, the data is sparse during the absorption phase leading to a large uncertainty of the individual parameter values.

Conclusion

The use of samples from the conditional distribution is a powerful way to avoid the bias due to the shrinkage in the diagnostic plots. This method has been validated mathematically and with numerical experiments.

In Monolix, the random samples are used by default in all diagnostic plots, if the Conditional distribution task has been run. The choice of the estimator for the individual parameters can be changed in the Settings tab:
