
Computes a group-aware zero-inflation (ZI) index for each gene, using a negative-binomial (NB) baseline fitted with edgeR. For each sample group (e.g., drug condition), the function:

  1. estimates gene-wise tagwise dispersions with edgeR (using all selected groups),

  2. builds NB-expected zero probabilities from TMMwsp-scaled means, and

  3. returns per-gene ZI (observed zero fraction minus NB-expected zero fraction) and per-group summaries (e.g., the share of genes with ZI above a cutoff). The ZI cutoffs are user-defined.
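The ZI definition in step 3 can be illustrated for a single gene in one group. This is a minimal base-R sketch, not the package's internal code: the counts, per-well means `mu`, and dispersion `phi` are hypothetical values, and averaging the per-well zero probabilities is an assumption about how the group-level expectation is formed.

```r
# Hypothetical counts for one gene across six wells of a single group
counts <- c(0, 0, 3, 0, 1, 0)

# Hypothetical NB means per well (e.g., a group mean scaled by each well's
# effective library size) and a hypothetical gene-wise dispersion (phi > 0)
mu  <- c(0.5, 0.6, 0.9, 0.4, 0.7, 0.5)
phi <- 0.2

# Observed fraction of zeros
p0_obs <- mean(counts == 0)

# NB-expected zero probability per well, averaged over wells.
# With variance mu + phi * mu^2, P(Y = 0) = (1 + phi * mu)^(-1 / phi),
# which is what dnbinom() returns for size = 1 / phi.
p0_nb <- mean(dnbinom(0, mu = mu, size = 1 / phi))

# ZI index: positive values mean more zeros than the NB baseline predicts
zi <- p0_obs - p0_nb

# Expected number of zero wells under the NB baseline
expected_zeros_num <- p0_nb * length(counts)
```

A ZI near zero suggests the NB baseline already accounts for the observed zeros; a clearly positive ZI flags a gene for closer inspection.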

This is intended as a fast screening diagnostic to decide whether standard NB GLM methods (edgeR/DESeq2) are adequate or whether a zero-aware workflow (e.g., ZINB-WaVE) might be warranted.

This function relies on edgeR to estimate dispersion. The current implementation requires ≥2 groups in the design so that edgeR can stabilize gene-wise dispersions across groups. If you only have a single group and still want a design-aware baseline for expected zeros, fit a Gamma–Poisson/NB GLM and compute the expected zero probabilities from its fitted means and over-dispersion.
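For the single-group case, the GLM-based baseline might look like the sketch below. It uses `MASS::glm.nb` on one simulated gene as a stand-in for a Gamma-Poisson GLM fitter such as glmGamPoi; the simulated counts, the size factors `sf`, and the intercept-only formula are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical single-group data: 200 wells with varying sequencing depth
n  <- 200
sf <- runif(n, 0.5, 1.5)                    # hypothetical size factors
y  <- rnbinom(n, mu = 2 * sf, size = 2)     # simulated NB counts for one gene

# Intercept-only NB GLM with a log size-factor offset
fit <- MASS::glm.nb(y ~ 1 + offset(log(sf)))

# Expected zero probability from the fitted means and the estimated
# over-dispersion (theta is the NB size parameter, i.e. 1 / dispersion)
p0_nb  <- mean(dnbinom(0, mu = fitted(fit), size = fit$theta))
p0_obs <- mean(y == 0)

# Same ZI definition as in the multi-group case
zi <- p0_obs - p0_nb
```

Because the counts here are simulated from an NB model, the resulting ZI should be close to zero; real data with excess zeros would push it upward.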

Usage

check_zeroinflation(
  data = NULL,
  group_by = NULL,
  samples = NULL,
  batch = 1,
  cutoffs = c(0.1, 0.2)
)

Arguments

data

Seurat object.

group_by

Character, column in data@meta.data that defines groups (default: "combined_id").

samples

Character vector of group labels/patterns to include. If NULL or if none match, all groups in group_by are used.

batch

Optional batch indicator; if length 1, an intercept-free design is used with group dummies.

cutoffs

Numeric vector of user-supplied ZI thresholds used for the summary statistics (default: c(0.1, 0.2)).
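The intercept-free design described for `batch` can be sketched with base R's `model.matrix`. The group labels below are taken from the example on this page; the well layout, the batch labels, and the way a multi-level batch would enter the design are assumptions, not the function's confirmed behavior.

```r
# Hypothetical group labels for five wells
group <- factor(c("DMSO_0", "DMSO_0", "DMSO_0",
                  "Staurosporine_10", "Staurosporine_10"))

# Intercept-free design: one indicator (dummy) column per group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Assumed extension when a batch factor with several levels is supplied:
# group dummies plus batch effects relative to the first batch level
batch <- factor(c("A", "B", "A", "A", "B"))
design_batch <- model.matrix(~ 0 + group + batch)
```

With the intercept dropped, each group column estimates that group's own mean level directly, which is convenient when per-group expected zeros are needed.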

Value

A list with:

  • gene_metrics_by_group: long data frame (group × gene) with p0_obs, p0_nb, ZI, and counts.

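  • summary_by_group: one row per group with the number of genes and wells, mean p0_obs, mean p0_nb, mean ZI, observed and expected zero counts for the group, and the share of genes whose ZI exceeds each cutoff (columns pct_ZI_gt_<cutoff>).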
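One way to work with the per-gene table is to flag genes above a cutoff and recompute the per-cutoff summary. The sketch below builds a small stand-in for `gene_metrics_by_group` from four DMSO_0 rows of the example output on this page (only the `group`, `gene`, and `ZI` columns); treating the `pct_ZI_gt_*` columns as proportions of genes is an inference from that output rather than documented behavior.

```r
# Stand-in for res$gene_metrics_by_group, using four rows from the example
gene_metrics <- data.frame(
  group = "DMSO_0",
  gene  = c("NAMPT", "ENSG00000235609", "WDR31", "MRGPRX3"),
  ZI    = c(-0.000564, 0.149924, 0.026492, -0.096737),
  stringsAsFactors = FALSE
)

cutoffs <- c(0.1, 0.2)

# Share of genes whose ZI exceeds each cutoff
frac_above <- vapply(cutoffs, function(k) mean(gene_metrics$ZI > k), numeric(1))
names(frac_above) <- paste0("pct_ZI_gt_", cutoffs)

# Genes above the most lenient cutoff, strongest excess of zeros first
flagged <- gene_metrics[gene_metrics$ZI > min(cutoffs), ]
flagged <- flagged[order(-flagged$ZI), ]
```

On real output, the same filtering would be applied per group, for example after splitting the data frame by its `group` column.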

Note

  • This is a screening tool; it is not a replacement for fitting a full GLM with your actual design. If strong covariates exist, a GLM baseline (e.g., glmGamPoi::glm_gp) will yield more faithful expected-zero rates.

  • For single-group experiments, consider either adding a reference group or switching to a GLM-based baseline that does not require multiple groups.

Examples

data(mini_mac)
check_zeroinflation(mini_mac, group_by = "combined_id",
                     samples = c("DMSO_0","Staurosporine_10"))
#> $gene_metrics_by_group
#>                  group            gene mean_count_group dispersion p0_obs
#> NAMPT           DMSO_0           NAMPT           8.3158  0.0076845  0.000
#> ENSG00000278869 DMSO_0 ENSG00000278869           0.0000  0.0000977  1.000
#> CABP7-DT        DMSO_0        CABP7-DT           0.0000  0.0000977  1.000
#> NBEAP4          DMSO_0          NBEAP4           0.0000  0.0000977  1.000
#> FMO2            DMSO_0            FMO2           0.0000  0.0000977  1.000
#> NDUFA4P2        DMSO_0        NDUFA4P2           0.0000  0.0000977  1.000
#> DPY19L4P2       DMSO_0       DPY19L4P2           0.0000  0.0000977  1.000
#> ENSG00000286114 DMSO_0 ENSG00000286114           0.0000  0.0000977  1.000
#> ENSG00000265935 DMSO_0 ENSG00000265935           0.0000  0.0000977  1.000
#> Y-RNA           DMSO_0           Y-RNA           0.0000  0.0000977  1.000
#> FAM201B         DMSO_0         FAM201B           0.0000  0.0000977  1.000
#> ENSG00000243018 DMSO_0 ENSG00000243018           0.0000  0.0000977  1.000
#> TRBC1           DMSO_0           TRBC1           0.0526  0.0000977  0.947
#> FAM20C          DMSO_0          FAM20C           5.1053  0.0067377  0.000
#> ENSG00000251536 DMSO_0 ENSG00000251536           0.0000  0.0000977  1.000
#> CLDN18          DMSO_0          CLDN18           0.0000  0.0000977  1.000
#> ENSG00000259688 DMSO_0 ENSG00000259688           0.0000  0.0000977  1.000
#> RBM7P1          DMSO_0          RBM7P1           0.0000  0.0000977  1.000
#> UBE2HP1         DMSO_0         UBE2HP1           0.0000  0.0000977  1.000
#> ENSG00000258419 DMSO_0 ENSG00000258419           0.0000  0.0000977  1.000
#> ENSG00000231698 DMSO_0 ENSG00000231698           0.0000  0.0000977  1.000
#> NF1P5           DMSO_0           NF1P5           0.0000  0.0000977  1.000
#> PTPN20          DMSO_0          PTPN20           0.0000  0.0000977  1.000
#> ENSG00000280048 DMSO_0 ENSG00000280048           0.0000  0.0000977  1.000
#> AHCY            DMSO_0            AHCY           5.1053  0.0075781  0.000
#> CCDC127         DMSO_0         CCDC127           7.7368  0.0077213  0.000
#> FSHR            DMSO_0            FSHR           0.0000  0.0000977  1.000
#> TRIM64C         DMSO_0         TRIM64C           0.0000  0.0000977  1.000
#> ENSG00000285971 DMSO_0 ENSG00000285971           0.0000  0.0000977  1.000
#> ENSG00000217239 DMSO_0 ENSG00000217239           0.6842  0.0000977  0.421
#> ENSG00000236366 DMSO_0 ENSG00000236366           0.0000  0.0000977  1.000
#> ENSG00000286853 DMSO_0 ENSG00000286853           0.0000  0.0000977  1.000
#> RN7SL270P       DMSO_0       RN7SL270P           0.0000  0.0000977  1.000
#> CYCSP41         DMSO_0         CYCSP41           0.0000  0.0000977  1.000
#> MIR150          DMSO_0          MIR150           0.0000  0.0000977  1.000
#> ENSG00000289359 DMSO_0 ENSG00000289359           0.0000  0.0000977  1.000
#> Metazoa-SRP     DMSO_0     Metazoa-SRP           0.0000  0.0000977  1.000
#> ENSG00000284620 DMSO_0 ENSG00000284620           0.0000  0.0000977  1.000
#> Y-RNA.1         DMSO_0         Y-RNA.1           0.0000  0.0000977  1.000
#> CNN2P9          DMSO_0          CNN2P9           0.0000  0.0000977  1.000
#> ENSG00000273375 DMSO_0 ENSG00000273375           0.0526  0.0000977  0.947
#> ENSG00000287871 DMSO_0 ENSG00000287871           0.0000  0.0000977  1.000
#> LINC02862       DMSO_0       LINC02862           0.0000  0.0000977  1.000
#> MIR556          DMSO_0          MIR556           0.0000  0.0000977  1.000
#> ENSG00000235609 DMSO_0 ENSG00000235609           0.7368  0.0000977  0.632
#> MBD2            DMSO_0            MBD2           6.7368  0.0077760  0.000
#> HIGD1AP6        DMSO_0        HIGD1AP6           0.0000  0.0000977  1.000
#> ENSG00000276958 DMSO_0 ENSG00000276958           0.0000  0.0000977  1.000
#> ENSG00000275295 DMSO_0 ENSG00000275295           0.0000  0.0000977  1.000
#> ENSG00000285454 DMSO_0 ENSG00000285454           0.0000  0.0000977  1.000
#> C10orf71-AS1    DMSO_0    C10orf71-AS1           0.0000  0.0000977  1.000
#> ENSG00000256001 DMSO_0 ENSG00000256001           0.0000  0.0000977  1.000
#> ENSG00000279294 DMSO_0 ENSG00000279294           0.0000  0.0000977  1.000
#> IFNWP5          DMSO_0          IFNWP5           0.0000  0.0000977  1.000
#> MAN1C1          DMSO_0          MAN1C1           0.0526  0.0000977  0.947
#> RN7SL211P       DMSO_0       RN7SL211P           0.0000  0.0000977  1.000
#> GNRHR2P1        DMSO_0        GNRHR2P1           0.0000  0.0000977  1.000
#> ENSG00000273904 DMSO_0 ENSG00000273904           0.0000  0.0000977  1.000
#> ENSG00000241593 DMSO_0 ENSG00000241593           0.0000  0.0000977  1.000
#> WDR31           DMSO_0           WDR31           0.4211  0.0000977  0.684
#> DRD5            DMSO_0            DRD5           0.0000  0.0000977  1.000
#> ENSG00000256569 DMSO_0 ENSG00000256569           0.0000  0.0000977  1.000
#> EPHX1           DMSO_0           EPHX1           6.2105  0.0077643  0.000
#> ACTN1           DMSO_0           ACTN1           3.4737  0.0051444  0.000
#> MIR5188         DMSO_0         MIR5188           0.0000  0.0000977  1.000
#> RNU6-118P       DMSO_0       RNU6-118P           0.0000  0.0000977  1.000
#> ENSG00000271758 DMSO_0 ENSG00000271758           0.0000  0.0000977  1.000
#> ZNF84-DT        DMSO_0        ZNF84-DT           0.0000  0.0000977  1.000
#> ENSG00000248733 DMSO_0 ENSG00000248733           0.0000  0.0000977  1.000
#> ACTL7A          DMSO_0          ACTL7A           0.0000  0.0000977  1.000
#> GID4            DMSO_0            GID4           3.1053  0.0023841  0.000
#> Y-RNA.2         DMSO_0         Y-RNA.2           0.0000  0.0000977  1.000
#> MIR200C         DMSO_0         MIR200C           0.0000  0.0000977  1.000
#> ENSG00000224644 DMSO_0 ENSG00000224644           0.0000  0.0000977  1.000
#> CSTA            DMSO_0            CSTA           4.4737  0.0071847  0.000
#> MIR664A         DMSO_0         MIR664A           0.0000  0.0000977  1.000
#> MIR4802         DMSO_0         MIR4802           0.0000  0.0000977  1.000
#> ENSG00000278655 DMSO_0 ENSG00000278655           0.0000  0.0000977  1.000
#> ENSG00000280122 DMSO_0 ENSG00000280122           0.0000  0.0000977  1.000
#> ENSG00000254180 DMSO_0 ENSG00000254180           0.0000  0.0000977  1.000
#> RNU6-896P       DMSO_0       RNU6-896P           0.0000  0.0000977  1.000
#> ENSG00000286805 DMSO_0 ENSG00000286805           0.0000  0.0000977  1.000
#> SHANK1          DMSO_0          SHANK1           0.0000  0.0000977  1.000
#> ENSG00000291048 DMSO_0 ENSG00000291048           0.0000  0.0000977  1.000
#> RN7SL268P       DMSO_0       RN7SL268P           0.0000  0.0000977  1.000
#> NLGN2           DMSO_0           NLGN2           0.5263  0.0000977  0.579
#> DMC1            DMSO_0            DMC1           0.2632  0.0000977  0.737
#> KCNAB1-AS1      DMSO_0      KCNAB1-AS1           0.0000  0.0000977  1.000
#> ENSG00000276015 DMSO_0 ENSG00000276015           0.0000  0.0000977  1.000
#> WWTR1-IT1       DMSO_0       WWTR1-IT1           0.0000  0.0000977  1.000
#> ENSG00000260465 DMSO_0 ENSG00000260465           0.0000  0.0000977  1.000
#> RPL5P30         DMSO_0         RPL5P30           0.1053  0.0000977  0.895
#> ENSG00000270988 DMSO_0 ENSG00000270988           0.0000  0.0000977  1.000
#> MIR545          DMSO_0          MIR545           0.0000  0.0000977  1.000
#> ENSG00000257548 DMSO_0 ENSG00000257548           0.0000  0.0000977  1.000
#> ENSG00000289950 DMSO_0 ENSG00000289950           0.0000  0.0000977  1.000
#> ENSG00000262413 DMSO_0 ENSG00000262413           0.1579  0.0000977  0.842
#> ENSG00000249890 DMSO_0 ENSG00000249890           0.0000  0.0000977  1.000
#> RN7SL255P       DMSO_0       RN7SL255P           0.0000  0.0000977  1.000
#> TRIM53CP        DMSO_0        TRIM53CP           0.0000  0.0000977  1.000
#> RNA5SP107       DMSO_0       RNA5SP107           0.0000  0.0000977  1.000
#> RNU6-845P       DMSO_0       RNU6-845P           0.0000  0.0000977  1.000
#> ENSG00000241114 DMSO_0 ENSG00000241114           0.0000  0.0000977  1.000
#> SERBP1P2        DMSO_0        SERBP1P2           0.0000  0.0000977  1.000
#> RPS10-NUDT3     DMSO_0     RPS10-NUDT3           0.0526  0.0000977  0.947
#> CDY12P          DMSO_0          CDY12P           0.0000  0.0000977  1.000
#> MIR4644         DMSO_0         MIR4644           0.0000  0.0000977  1.000
#> ENSG00000223343 DMSO_0 ENSG00000223343           0.0000  0.0000977  1.000
#> MORF4L1P3       DMSO_0       MORF4L1P3           0.0000  0.0000977  1.000
#> MRGPRX3         DMSO_0         MRGPRX3           0.8947  0.0000977  0.316
#> CD160           DMSO_0           CD160           0.0000  0.0000977  1.000
#>                 obs_zeros_num    p0_nb expected_zeros_num        ZI
#> NAMPT                       0 0.000564             0.0107 -0.000564
#> ENSG00000278869            19 1.000000            19.0000  0.000000
#> CABP7-DT                   19 1.000000            19.0000  0.000000
#> NBEAP4                     19 1.000000            19.0000  0.000000
#> FMO2                       19 1.000000            19.0000  0.000000
#> NDUFA4P2                   19 1.000000            19.0000  0.000000
#> DPY19L4P2                  19 1.000000            19.0000  0.000000
#> ENSG00000286114            19 1.000000            19.0000  0.000000
#> ENSG00000265935            19 1.000000            19.0000  0.000000
#> Y-RNA                      19 1.000000            19.0000  0.000000
#> FAM201B                    19 1.000000            19.0000  0.000000
#> ENSG00000243018            19 1.000000            19.0000  0.000000
#> TRBC1                      18 0.948760            18.0264 -0.001392
#> FAM20C                      0 0.008519             0.1619 -0.008519
#> ENSG00000251536            19 1.000000            19.0000  0.000000
#> CLDN18                     19 1.000000            19.0000  0.000000
#> ENSG00000259688            19 1.000000            19.0000  0.000000
#> RBM7P1                     19 1.000000            19.0000  0.000000
#> UBE2HP1                    19 1.000000            19.0000  0.000000
#> ENSG00000258419            19 1.000000            19.0000  0.000000
#> ENSG00000231698            19 1.000000            19.0000  0.000000
#> NF1P5                      19 1.000000            19.0000  0.000000
#> PTPN20                     19 1.000000            19.0000  0.000000
#> ENSG00000280048            19 1.000000            19.0000  0.000000
#> AHCY                        0 0.008594             0.1633 -0.008594
#> CCDC127                     0 0.000913             0.0173 -0.000913
#> FSHR                       19 1.000000            19.0000  0.000000
#> TRIM64C                    19 1.000000            19.0000  0.000000
#> ENSG00000285971            19 1.000000            19.0000  0.000000
#> ENSG00000217239             8 0.507246             9.6377 -0.086193
#> ENSG00000236366            19 1.000000            19.0000  0.000000
#> ENSG00000286853            19 1.000000            19.0000  0.000000
#> RN7SL270P                  19 1.000000            19.0000  0.000000
#> CYCSP41                    19 1.000000            19.0000  0.000000
#> MIR150                     19 1.000000            19.0000  0.000000
#> ENSG00000289359            19 1.000000            19.0000  0.000000
#> Metazoa-SRP                19 1.000000            19.0000  0.000000
#> ENSG00000284620            19 1.000000            19.0000  0.000000
#> Y-RNA.1                    19 1.000000            19.0000  0.000000
#> CNN2P9                     19 1.000000            19.0000  0.000000
#> ENSG00000273375            18 0.948760            18.0264 -0.001392
#> ENSG00000287871            19 1.000000            19.0000  0.000000
#> LINC02862                  19 1.000000            19.0000  0.000000
#> MIR556                     19 1.000000            19.0000  0.000000
#> ENSG00000235609            12 0.481655             9.1514  0.149924
#> MBD2                        0 0.002116             0.0402 -0.002116
#> HIGD1AP6                   19 1.000000            19.0000  0.000000
#> ENSG00000276958            19 1.000000            19.0000  0.000000
#> ENSG00000275295            19 1.000000            19.0000  0.000000
#> ENSG00000285454            19 1.000000            19.0000  0.000000
#> C10orf71-AS1               19 1.000000            19.0000  0.000000
#> ENSG00000256001            19 1.000000            19.0000  0.000000
#> ENSG00000279294            19 1.000000            19.0000  0.000000
#> IFNWP5                     19 1.000000            19.0000  0.000000
#> MAN1C1                     18 0.948760            18.0264 -0.001392
#> RN7SL211P                  19 1.000000            19.0000  0.000000
#> GNRHR2P1                   19 1.000000            19.0000  0.000000
#> ENSG00000273904            19 1.000000            19.0000  0.000000
#> ENSG00000241593            19 1.000000            19.0000  0.000000
#> WDR31                      13 0.657719            12.4967  0.026492
#> DRD5                       19 1.000000            19.0000  0.000000
#> ENSG00000256569            19 1.000000            19.0000  0.000000
#> EPHX1                       0 0.003312             0.0629 -0.003312
#> ACTN1                       0 0.036295             0.6896 -0.036295
#> MIR5188                    19 1.000000            19.0000  0.000000
#> RNU6-118P                  19 1.000000            19.0000  0.000000
#> ENSG00000271758            19 1.000000            19.0000  0.000000
#> ZNF84-DT                   19 1.000000            19.0000  0.000000
#> ENSG00000248733            19 1.000000            19.0000  0.000000
#> ACTL7A                     19 1.000000            19.0000  0.000000
#> GID4                        0 0.050308             0.9559 -0.050308
#> Y-RNA.2                    19 1.000000            19.0000  0.000000
#> MIR200C                    19 1.000000            19.0000  0.000000
#> ENSG00000224644            19 1.000000            19.0000  0.000000
#> CSTA                        0 0.014942             0.2839 -0.014942
#> MIR664A                    19 1.000000            19.0000  0.000000
#> MIR4802                    19 1.000000            19.0000  0.000000
#> ENSG00000278655            19 1.000000            19.0000  0.000000
#> ENSG00000280122            19 1.000000            19.0000  0.000000
#> ENSG00000254180            19 1.000000            19.0000  0.000000
#> RNU6-896P                  19 1.000000            19.0000  0.000000
#> ENSG00000286805            19 1.000000            19.0000  0.000000
#> SHANK1                     19 1.000000            19.0000  0.000000
#> ENSG00000291048            19 1.000000            19.0000  0.000000
#> RN7SL268P                  19 1.000000            19.0000  0.000000
#> NLGN2                      11 0.592692            11.2611 -0.013745
#> DMC1                       14 0.769245            14.6157 -0.032403
#> KCNAB1-AS1                 19 1.000000            19.0000  0.000000
#> ENSG00000276015            19 1.000000            19.0000  0.000000
#> WWTR1-IT1                  19 1.000000            19.0000  0.000000
#> ENSG00000260465            19 1.000000            19.0000  0.000000
#> RPL5P30                    17 0.900205            17.1039 -0.005468
#> ENSG00000270988            19 1.000000            19.0000  0.000000
#> MIR545                     19 1.000000            19.0000  0.000000
#> ENSG00000257548            19 1.000000            19.0000  0.000000
#> ENSG00000289950            19 1.000000            19.0000  0.000000
#> ENSG00000262413            16 0.854190            16.2296 -0.012085
#> ENSG00000249890            19 1.000000            19.0000  0.000000
#> RN7SL255P                  19 1.000000            19.0000  0.000000
#> TRIM53CP                   19 1.000000            19.0000  0.000000
#> RNA5SP107                  19 1.000000            19.0000  0.000000
#> RNU6-845P                  19 1.000000            19.0000  0.000000
#> ENSG00000241114            19 1.000000            19.0000  0.000000
#> SERBP1P2                   19 1.000000            19.0000  0.000000
#> RPS10-NUDT3                18 0.948760            18.0264 -0.001392
#> CDY12P                     19 1.000000            19.0000  0.000000
#> MIR4644                    19 1.000000            19.0000  0.000000
#> ENSG00000223343            19 1.000000            19.0000  0.000000
#> MORF4L1P3                  19 1.000000            19.0000  0.000000
#> MRGPRX3                     6 0.412527             7.8380 -0.096737
#> CD160                      19 1.000000            19.0000  0.000000
#>  [ reached 'max' / getOption("max.print") -- omitted 889 rows ]
#> 
#> $summary_by_group
#>                             group n_genes n_wells mean_p0_obs mean_p0_nb
#> DMSO_0                     DMSO_0     500      19       0.819      0.820
#> Staurosporine_10 Staurosporine_10     500       3       0.888      0.872
#>                    mean_ZI observed_zeros_num expected_zeros_num pct_ZI_gt_0.1
#> DMSO_0           -0.000968               7780               7789         0.004
#> Staurosporine_10  0.016115               1332               1308         0.080
#>                  pct_ZI_gt_0.2
#> DMSO_0                   0.000
#> Staurosporine_10         0.052
#>