
Computes a group-aware zero-inflation (ZI) index for each gene against a negative-binomial (NB) baseline fitted with edgeR. Groups are sets of samples (e.g., a drug condition). The function:

  1. estimates gene-wise tagwise dispersions with edgeR (using all selected groups),

  2. builds NB-expected zero probabilities from TMMwsp-scaled means, and

  3. returns, for each group, per-gene ZI (observed zero fraction minus NB-expected zero fraction) and per-group summaries (e.g., % genes with ZI above a cutoff). The ZI cutoffs are user-defined via cutoffs.
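Steps 2 and 3 can be sketched in base R for a single gene. This is an illustration under stated assumptions, not the package's internal code: the helper nb_zero_prob(), the toy counts, the size factors (standing in for TMMwsp-scaled library sizes), and the dispersion value are all made up for the example. It uses the standard NB zero probability (1 + phi * mu)^(-1/phi), which reduces to the Poisson value exp(-mu) as phi approaches 0.

```r
# NB probability of observing a zero, for mean `mu` and dispersion `phi`
# (NB variance = mu + phi * mu^2). Hypothetical helper for illustration.
nb_zero_prob <- function(mu, phi) {
  n   <- max(length(mu), length(phi))
  mu  <- rep_len(mu, n)
  phi <- rep_len(phi, n)
  out <- exp(-mu)                      # Poisson limit, used where phi == 0
  pos <- phi > 0
  out[pos] <- (1 + phi[pos] * mu[pos])^(-1 / phi[pos])
  out
}

# Toy data: one gene across 8 wells of a single group (made-up numbers)
counts       <- c(0, 0, 3, 0, 1, 0, 0, 2)
size_factors <- c(0.8, 1.1, 1.3, 0.9, 1.0, 0.7, 1.2, 1.0)
phi          <- 0.5                    # assumed tagwise dispersion

# Expected mean per well: gene-level rate scaled by each well's size factor
rate <- sum(counts) / sum(size_factors)
mu   <- rate * size_factors

p0_obs <- mean(counts == 0)            # observed fraction of zeros
p0_nb  <- mean(nb_zero_prob(mu, phi))  # NB-expected fraction of zeros
ZI     <- p0_obs - p0_nb               # > 0: more zeros than the NB predicts
```

Averaging the per-well zero probabilities, rather than plugging the group mean into the formula once, is a design choice of this sketch; it lets wells with different sequencing depth contribute different expected zero rates.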

This is intended as a fast screening diagnostic to decide whether standard NB GLM methods (edgeR/DESeq2) are adequate or whether a zero-aware workflow (e.g., ZINB-WaVE) might be warranted.

This function relies on edgeR to estimate dispersion. The current implementation requires ≥2 groups in the design so that edgeR can stabilize gene-wise dispersions across groups. If you only have a single group and still want a design-aware baseline for expected zeros, fit a Gamma–Poisson/NB GLM and compute the expected zero probabilities from its fitted means and over-dispersion.
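For the single-group case, the GLM route described above might look like the following. This is a minimal sketch, assuming the glmGamPoi package is installed; expected_zeros_glm() is a hypothetical helper and not part of this package, and the simulated counts are for illustration only. It reads the fitted means (Mu) and gene-wise overdispersions from the glm_gp() fit and converts them to NB zero probabilities.

```r
# Hypothetical helper: NB-expected zero fraction per gene from a glmGamPoi fit.
# `counts` is a genes x samples count matrix.
expected_zeros_glm <- function(counts, design = ~ 1, col_data = NULL) {
  fit <- glmGamPoi::glm_gp(counts, design = design, col_data = col_data,
                           size_factors = "normed_sum")
  mu  <- fit$Mu               # genes x samples matrix of fitted means
  phi <- fit$overdispersions  # one overdispersion estimate per gene

  # NB zero probability per gene and sample; Poisson limit where phi == 0
  p0  <- exp(-mu)
  pos <- phi > 0
  p0[pos, ] <- (1 + phi[pos] * mu[pos, , drop = FALSE])^(-1 / phi[pos])

  p0_obs <- rowMeans(counts == 0)
  p0_nb  <- rowMeans(p0)
  data.frame(gene = rownames(counts), p0_obs = p0_obs,
             p0_nb = p0_nb, ZI = p0_obs - p0_nb)
}

# Usage on simulated single-group data (runs only if glmGamPoi is available)
if (requireNamespace("glmGamPoi", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rnbinom(200, mu = 2, size = 2), nrow = 20,
                   dimnames = list(paste0("gene", 1:20), paste0("well", 1:10)))
  zi <- expected_zeros_glm(counts)
  head(zi)
}
```

Because the design defaults to an intercept-only model, no second group is needed; passing a richer design and col_data would give a covariate-aware baseline.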

Usage

check_zeroinflation(
  data = NULL,
  group_by = NULL,
  samples = NULL,
  batch = 1,
  cutoffs = c(0.1, 0.2)
)

Arguments

data

Seurat object.

group_by

Character, column in data@meta.data that defines groups (default: "combined_id").

samples

Character vector of group labels/patterns to include. If NULL or if none match, all groups in group_by are used.

batch

Optional batch indicator; if length 1, an intercept-free design is used with group dummies.
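As an illustration of what an intercept-free design with group dummies looks like, the base R sketch below builds one with model.matrix(). The group and batch labels are made up, and the second design (with a batch term) is only an assumption about how a per-sample batch vector could be added; the page does not document that case.

```r
# One label per sample (illustrative values)
group <- factor(c("DMSO_0", "DMSO_0", "Staurosporine_10", "Staurosporine_10"))

# Intercept-free design: one indicator column per group
design_no_batch <- model.matrix(~ 0 + group)
colnames(design_no_batch)

# Assumed extension: per-sample batch labels added as an extra term
batch <- factor(c("plate1", "plate2", "plate1", "plate2"))
design_with_batch <- model.matrix(~ 0 + group + batch)
colnames(design_with_batch)
```

With no intercept, each group column estimates that group's own mean level, which is convenient when expected zeros are computed group by group.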

cutoffs

Numeric vector of user-supplied ZI thresholds for the summary statistics.

Value

A list with:

  • gene_metrics_by_group: long data frame (group × gene) with p0_obs, p0_nb, ZI, and counts.

  • summary_by_group: one row per group with the number of genes and wells, medians of p0_obs, p0_nb, and ZI, observed/expected zero counts for the group, and one pct_ZI_gt_<cutoff> column per ZI cutoff.
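A short sketch of how the long per-gene table can be used downstream. The data frame here is a made-up stand-in that borrows only the group, gene, and ZI column names from gene_metrics_by_group; the values are invented, and whether the package computes its summary columns exactly this way is an assumption.

```r
# Mock of the long (group x gene) table; ZI values are invented
gene_metrics <- data.frame(
  group = rep(c("DMSO_0", "Staurosporine_10"), each = 4),
  gene  = rep(c("g1", "g2", "g3", "g4"), times = 2),
  ZI    = c(0.00, 0.15, -0.02, 0.03, 0.25, 0.12, 0.00, -0.01)
)

cutoffs <- c(0.1, 0.2)

# Share of genes per group whose ZI exceeds each cutoff
share_above <- sapply(cutoffs, function(k) {
  tapply(gene_metrics$ZI > k, gene_metrics$group, mean)
})
colnames(share_above) <- paste0("ZI_gt_", cutoffs)
share_above

# Genes flagged at the first cutoff, for closer inspection
flagged <- subset(gene_metrics, ZI > cutoffs[1])
flagged
```

Listing the flagged genes alongside the per-group share helps tell apart a few strongly zero-inflated genes from a broad excess of zeros across the group.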

Note

  • This is a screening tool; it is not a replacement for fitting a full GLM with your actual design. If strong covariates exist, a GLM baseline (e.g., glmGamPoi::glm_gp) will yield more faithful expected-zero rates.

  • For single-group experiments, consider either adding a reference group or switching to a GLM-based baseline that does not require multiple groups.

Examples

data(mini_mac)
check_zeroinflation(mini_mac, group_by = "combined_id",
                     samples = c("DMSO_0","Staurosporine_10"))
#> $gene_metrics_by_group
#>                  group            gene mean_count_group   dispersion    p0_obs
#> NAMPT           DMSO_0           NAMPT       8.31578947 7.684511e-03 0.0000000
#> ENSG00000278869 DMSO_0 ENSG00000278869       0.00000000 9.765625e-05 1.0000000
#> CABP7-DT        DMSO_0        CABP7-DT       0.00000000 9.765625e-05 1.0000000
#> NBEAP4          DMSO_0          NBEAP4       0.00000000 9.765625e-05 1.0000000
#> FMO2            DMSO_0            FMO2       0.00000000 9.765625e-05 1.0000000
#> NDUFA4P2        DMSO_0        NDUFA4P2       0.00000000 9.765625e-05 1.0000000
#> DPY19L4P2       DMSO_0       DPY19L4P2       0.00000000 9.765625e-05 1.0000000
#> ENSG00000286114 DMSO_0 ENSG00000286114       0.00000000 9.765625e-05 1.0000000
#> ENSG00000265935 DMSO_0 ENSG00000265935       0.00000000 9.765625e-05 1.0000000
#> Y-RNA           DMSO_0           Y-RNA       0.00000000 9.765625e-05 1.0000000
#> FAM201B         DMSO_0         FAM201B       0.00000000 9.765625e-05 1.0000000
#> ENSG00000243018 DMSO_0 ENSG00000243018       0.00000000 9.765625e-05 1.0000000
#> TRBC1           DMSO_0           TRBC1       0.05263158 9.765625e-05 0.9473684
#> FAM20C          DMSO_0          FAM20C       5.10526316 6.737674e-03 0.0000000
#> ENSG00000251536 DMSO_0 ENSG00000251536       0.00000000 9.765625e-05 1.0000000
#> CLDN18          DMSO_0          CLDN18       0.00000000 9.765625e-05 1.0000000
#> ENSG00000259688 DMSO_0 ENSG00000259688       0.00000000 9.765625e-05 1.0000000
#> RBM7P1          DMSO_0          RBM7P1       0.00000000 9.765625e-05 1.0000000
#> UBE2HP1         DMSO_0         UBE2HP1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000258419 DMSO_0 ENSG00000258419       0.00000000 9.765625e-05 1.0000000
#> ENSG00000231698 DMSO_0 ENSG00000231698       0.00000000 9.765625e-05 1.0000000
#> NF1P5           DMSO_0           NF1P5       0.00000000 9.765625e-05 1.0000000
#> PTPN20          DMSO_0          PTPN20       0.00000000 9.765625e-05 1.0000000
#> ENSG00000280048 DMSO_0 ENSG00000280048       0.00000000 9.765625e-05 1.0000000
#> AHCY            DMSO_0            AHCY       5.10526316 7.578144e-03 0.0000000
#> CCDC127         DMSO_0         CCDC127       7.73684211 7.721302e-03 0.0000000
#> FSHR            DMSO_0            FSHR       0.00000000 9.765625e-05 1.0000000
#> TRIM64C         DMSO_0         TRIM64C       0.00000000 9.765625e-05 1.0000000
#> ENSG00000285971 DMSO_0 ENSG00000285971       0.00000000 9.765625e-05 1.0000000
#> ENSG00000217239 DMSO_0 ENSG00000217239       0.68421053 9.765625e-05 0.4210526
#> ENSG00000236366 DMSO_0 ENSG00000236366       0.00000000 9.765625e-05 1.0000000
#> ENSG00000286853 DMSO_0 ENSG00000286853       0.00000000 9.765625e-05 1.0000000
#> RN7SL270P       DMSO_0       RN7SL270P       0.00000000 9.765625e-05 1.0000000
#> CYCSP41         DMSO_0         CYCSP41       0.00000000 9.765625e-05 1.0000000
#> MIR150          DMSO_0          MIR150       0.00000000 9.765625e-05 1.0000000
#> ENSG00000289359 DMSO_0 ENSG00000289359       0.00000000 9.765625e-05 1.0000000
#> Metazoa-SRP     DMSO_0     Metazoa-SRP       0.00000000 9.765625e-05 1.0000000
#> ENSG00000284620 DMSO_0 ENSG00000284620       0.00000000 9.765625e-05 1.0000000
#> Y-RNA.1         DMSO_0         Y-RNA.1       0.00000000 9.765625e-05 1.0000000
#> CNN2P9          DMSO_0          CNN2P9       0.00000000 9.765625e-05 1.0000000
#> ENSG00000273375 DMSO_0 ENSG00000273375       0.05263158 9.765625e-05 0.9473684
#> ENSG00000287871 DMSO_0 ENSG00000287871       0.00000000 9.765625e-05 1.0000000
#> LINC02862       DMSO_0       LINC02862       0.00000000 9.765625e-05 1.0000000
#> MIR556          DMSO_0          MIR556       0.00000000 9.765625e-05 1.0000000
#> ENSG00000235609 DMSO_0 ENSG00000235609       0.73684211 9.765625e-05 0.6315789
#> MBD2            DMSO_0            MBD2       6.73684211 7.776025e-03 0.0000000
#> HIGD1AP6        DMSO_0        HIGD1AP6       0.00000000 9.765625e-05 1.0000000
#> ENSG00000276958 DMSO_0 ENSG00000276958       0.00000000 9.765625e-05 1.0000000
#> ENSG00000275295 DMSO_0 ENSG00000275295       0.00000000 9.765625e-05 1.0000000
#> ENSG00000285454 DMSO_0 ENSG00000285454       0.00000000 9.765625e-05 1.0000000
#> C10orf71-AS1    DMSO_0    C10orf71-AS1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000256001 DMSO_0 ENSG00000256001       0.00000000 9.765625e-05 1.0000000
#> ENSG00000279294 DMSO_0 ENSG00000279294       0.00000000 9.765625e-05 1.0000000
#> IFNWP5          DMSO_0          IFNWP5       0.00000000 9.765625e-05 1.0000000
#> MAN1C1          DMSO_0          MAN1C1       0.05263158 9.765625e-05 0.9473684
#> RN7SL211P       DMSO_0       RN7SL211P       0.00000000 9.765625e-05 1.0000000
#> GNRHR2P1        DMSO_0        GNRHR2P1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000273904 DMSO_0 ENSG00000273904       0.00000000 9.765625e-05 1.0000000
#> ENSG00000241593 DMSO_0 ENSG00000241593       0.00000000 9.765625e-05 1.0000000
#> WDR31           DMSO_0           WDR31       0.42105263 9.765625e-05 0.6842105
#> DRD5            DMSO_0            DRD5       0.00000000 9.765625e-05 1.0000000
#> ENSG00000256569 DMSO_0 ENSG00000256569       0.00000000 9.765625e-05 1.0000000
#> EPHX1           DMSO_0           EPHX1       6.21052632 7.764337e-03 0.0000000
#> ACTN1           DMSO_0           ACTN1       3.47368421 5.144421e-03 0.0000000
#> MIR5188         DMSO_0         MIR5188       0.00000000 9.765625e-05 1.0000000
#> RNU6-118P       DMSO_0       RNU6-118P       0.00000000 9.765625e-05 1.0000000
#> ENSG00000271758 DMSO_0 ENSG00000271758       0.00000000 9.765625e-05 1.0000000
#> ZNF84-DT        DMSO_0        ZNF84-DT       0.00000000 9.765625e-05 1.0000000
#> ENSG00000248733 DMSO_0 ENSG00000248733       0.00000000 9.765625e-05 1.0000000
#> ACTL7A          DMSO_0          ACTL7A       0.00000000 9.765625e-05 1.0000000
#> GID4            DMSO_0            GID4       3.10526316 2.384059e-03 0.0000000
#> Y-RNA.2         DMSO_0         Y-RNA.2       0.00000000 9.765625e-05 1.0000000
#> MIR200C         DMSO_0         MIR200C       0.00000000 9.765625e-05 1.0000000
#> ENSG00000224644 DMSO_0 ENSG00000224644       0.00000000 9.765625e-05 1.0000000
#> CSTA            DMSO_0            CSTA       4.47368421 7.184685e-03 0.0000000
#> MIR664A         DMSO_0         MIR664A       0.00000000 9.765625e-05 1.0000000
#> MIR4802         DMSO_0         MIR4802       0.00000000 9.765625e-05 1.0000000
#> ENSG00000278655 DMSO_0 ENSG00000278655       0.00000000 9.765625e-05 1.0000000
#> ENSG00000280122 DMSO_0 ENSG00000280122       0.00000000 9.765625e-05 1.0000000
#> ENSG00000254180 DMSO_0 ENSG00000254180       0.00000000 9.765625e-05 1.0000000
#> RNU6-896P       DMSO_0       RNU6-896P       0.00000000 9.765625e-05 1.0000000
#> ENSG00000286805 DMSO_0 ENSG00000286805       0.00000000 9.765625e-05 1.0000000
#> SHANK1          DMSO_0          SHANK1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000291048 DMSO_0 ENSG00000291048       0.00000000 9.765625e-05 1.0000000
#> RN7SL268P       DMSO_0       RN7SL268P       0.00000000 9.765625e-05 1.0000000
#> NLGN2           DMSO_0           NLGN2       0.52631579 9.765625e-05 0.5789474
#> DMC1            DMSO_0            DMC1       0.26315789 9.765625e-05 0.7368421
#> KCNAB1-AS1      DMSO_0      KCNAB1-AS1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000276015 DMSO_0 ENSG00000276015       0.00000000 9.765625e-05 1.0000000
#> WWTR1-IT1       DMSO_0       WWTR1-IT1       0.00000000 9.765625e-05 1.0000000
#> ENSG00000260465 DMSO_0 ENSG00000260465       0.00000000 9.765625e-05 1.0000000
#> RPL5P30         DMSO_0         RPL5P30       0.10526316 9.765625e-05 0.8947368
#> ENSG00000270988 DMSO_0 ENSG00000270988       0.00000000 9.765625e-05 1.0000000
#> MIR545          DMSO_0          MIR545       0.00000000 9.765625e-05 1.0000000
#> ENSG00000257548 DMSO_0 ENSG00000257548       0.00000000 9.765625e-05 1.0000000
#> ENSG00000289950 DMSO_0 ENSG00000289950       0.00000000 9.765625e-05 1.0000000
#> ENSG00000262413 DMSO_0 ENSG00000262413       0.15789474 9.765625e-05 0.8421053
#> ENSG00000249890 DMSO_0 ENSG00000249890       0.00000000 9.765625e-05 1.0000000
#> RN7SL255P       DMSO_0       RN7SL255P       0.00000000 9.765625e-05 1.0000000
#> TRIM53CP        DMSO_0        TRIM53CP       0.00000000 9.765625e-05 1.0000000
#> RNA5SP107       DMSO_0       RNA5SP107       0.00000000 9.765625e-05 1.0000000
#> RNU6-845P       DMSO_0       RNU6-845P       0.00000000 9.765625e-05 1.0000000
#> ENSG00000241114 DMSO_0 ENSG00000241114       0.00000000 9.765625e-05 1.0000000
#> SERBP1P2        DMSO_0        SERBP1P2       0.00000000 9.765625e-05 1.0000000
#> RPS10-NUDT3     DMSO_0     RPS10-NUDT3       0.05263158 9.765625e-05 0.9473684
#> CDY12P          DMSO_0          CDY12P       0.00000000 9.765625e-05 1.0000000
#> MIR4644         DMSO_0         MIR4644       0.00000000 9.765625e-05 1.0000000
#> ENSG00000223343 DMSO_0 ENSG00000223343       0.00000000 9.765625e-05 1.0000000
#> MORF4L1P3       DMSO_0       MORF4L1P3       0.00000000 9.765625e-05 1.0000000
#> MRGPRX3         DMSO_0         MRGPRX3       0.89473684 9.765625e-05 0.3157895
#> CD160           DMSO_0           CD160       0.00000000 9.765625e-05 1.0000000
#>                 obs_zeros_num        p0_nb expected_zeros_num            ZI
#> NAMPT                       0 0.0005640631         0.01071720 -0.0005640631
#> ENSG00000278869            19 1.0000000000        19.00000000  0.0000000000
#> CABP7-DT                   19 1.0000000000        19.00000000  0.0000000000
#> NBEAP4                     19 1.0000000000        19.00000000  0.0000000000
#> FMO2                       19 1.0000000000        19.00000000  0.0000000000
#> NDUFA4P2                   19 1.0000000000        19.00000000  0.0000000000
#> DPY19L4P2                  19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000286114            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000265935            19 1.0000000000        19.00000000  0.0000000000
#> Y-RNA                      19 1.0000000000        19.00000000  0.0000000000
#> FAM201B                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000243018            19 1.0000000000        19.00000000  0.0000000000
#> TRBC1                      18 0.9487604337        18.02644824 -0.0013920127
#> FAM20C                      0 0.0085194504         0.16186956 -0.0085194504
#> ENSG00000251536            19 1.0000000000        19.00000000  0.0000000000
#> CLDN18                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000259688            19 1.0000000000        19.00000000  0.0000000000
#> RBM7P1                     19 1.0000000000        19.00000000  0.0000000000
#> UBE2HP1                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000258419            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000231698            19 1.0000000000        19.00000000  0.0000000000
#> NF1P5                      19 1.0000000000        19.00000000  0.0000000000
#> PTPN20                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000280048            19 1.0000000000        19.00000000  0.0000000000
#> AHCY                        0 0.0085942501         0.16329075 -0.0085942501
#> CCDC127                     0 0.0009125321         0.01733811 -0.0009125321
#> FSHR                       19 1.0000000000        19.00000000  0.0000000000
#> TRIM64C                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000285971            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000217239             8 0.5072457703         9.63766964 -0.0861931387
#> ENSG00000236366            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000286853            19 1.0000000000        19.00000000  0.0000000000
#> RN7SL270P                  19 1.0000000000        19.00000000  0.0000000000
#> CYCSP41                    19 1.0000000000        19.00000000  0.0000000000
#> MIR150                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000289359            19 1.0000000000        19.00000000  0.0000000000
#> Metazoa-SRP                19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000284620            19 1.0000000000        19.00000000  0.0000000000
#> Y-RNA.1                    19 1.0000000000        19.00000000  0.0000000000
#> CNN2P9                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000273375            18 0.9487604337        18.02644824 -0.0013920127
#> ENSG00000287871            19 1.0000000000        19.00000000  0.0000000000
#> LINC02862                  19 1.0000000000        19.00000000  0.0000000000
#> MIR556                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000235609            12 0.4816551325         9.15144752  0.1499238148
#> MBD2                        0 0.0021160863         0.04020564 -0.0021160863
#> HIGD1AP6                   19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000276958            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000275295            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000285454            19 1.0000000000        19.00000000  0.0000000000
#> C10orf71-AS1               19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000256001            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000279294            19 1.0000000000        19.00000000  0.0000000000
#> IFNWP5                     19 1.0000000000        19.00000000  0.0000000000
#> MAN1C1                     18 0.9487604337        18.02644824 -0.0013920127
#> RN7SL211P                  19 1.0000000000        19.00000000  0.0000000000
#> GNRHR2P1                   19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000273904            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000241593            19 1.0000000000        19.00000000  0.0000000000
#> WDR31                      13 0.6577186357        12.49665408  0.0264918906
#> DRD5                       19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000256569            19 1.0000000000        19.00000000  0.0000000000
#> EPHX1                       0 0.0033115123         0.06291873 -0.0033115123
#> ACTN1                       0 0.0362952319         0.68960941 -0.0362952319
#> MIR5188                    19 1.0000000000        19.00000000  0.0000000000
#> RNU6-118P                  19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000271758            19 1.0000000000        19.00000000  0.0000000000
#> ZNF84-DT                   19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000248733            19 1.0000000000        19.00000000  0.0000000000
#> ACTL7A                     19 1.0000000000        19.00000000  0.0000000000
#> GID4                        0 0.0503084282         0.95586014 -0.0503084282
#> Y-RNA.2                    19 1.0000000000        19.00000000  0.0000000000
#> MIR200C                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000224644            19 1.0000000000        19.00000000  0.0000000000
#> CSTA                        0 0.0149415657         0.28388975 -0.0149415657
#> MIR664A                    19 1.0000000000        19.00000000  0.0000000000
#> MIR4802                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000278655            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000280122            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000254180            19 1.0000000000        19.00000000  0.0000000000
#> RNU6-896P                  19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000286805            19 1.0000000000        19.00000000  0.0000000000
#> SHANK1                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000291048            19 1.0000000000        19.00000000  0.0000000000
#> RN7SL268P                  19 1.0000000000        19.00000000  0.0000000000
#> NLGN2                      11 0.5926918838        11.26114579 -0.0137445154
#> DMC1                       14 0.7692454377        14.61566332 -0.0324033325
#> KCNAB1-AS1                 19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000276015            19 1.0000000000        19.00000000  0.0000000000
#> WWTR1-IT1                  19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000260465            19 1.0000000000        19.00000000  0.0000000000
#> RPL5P30                    17 0.9002049947        17.10389490 -0.0054681526
#> ENSG00000270988            19 1.0000000000        19.00000000  0.0000000000
#> MIR545                     19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000257548            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000289950            19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000262413            16 0.8541900003        16.22961001 -0.0120847371
#> ENSG00000249890            19 1.0000000000        19.00000000  0.0000000000
#> RN7SL255P                  19 1.0000000000        19.00000000  0.0000000000
#> TRIM53CP                   19 1.0000000000        19.00000000  0.0000000000
#> RNA5SP107                  19 1.0000000000        19.00000000  0.0000000000
#> RNU6-845P                  19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000241114            19 1.0000000000        19.00000000  0.0000000000
#> SERBP1P2                   19 1.0000000000        19.00000000  0.0000000000
#> RPS10-NUDT3                18 0.9487604337        18.02644824 -0.0013920127
#> CDY12P                     19 1.0000000000        19.00000000  0.0000000000
#> MIR4644                    19 1.0000000000        19.00000000  0.0000000000
#> ENSG00000223343            19 1.0000000000        19.00000000  0.0000000000
#> MORF4L1P3                  19 1.0000000000        19.00000000  0.0000000000
#> MRGPRX3                     6 0.4125265457         7.83800437 -0.0967370721
#> CD160                      19 1.0000000000        19.00000000  0.0000000000
#>  [ reached 'max' / getOption("max.print") -- omitted 889 rows ]
#> 
#> $summary_by_group
#>                             group n_genes n_wells median_p0_obs median_p0_nb
#> DMSO_0                     DMSO_0     500      19             1            1
#> Staurosporine_10 Staurosporine_10     500       3             1            1
#>                  median_ZI observed_zeros_num expected_zeros_num pct_ZI_gt_0.1
#> DMSO_0                   0               7780           7789.191         0.004
#> Staurosporine_10         0               1332           1307.828         0.080
#>                  pct_ZI_gt_0.2
#> DMSO_0                   0.000
#> Staurosporine_10         0.052
#>