Subsample a specified number of genes from a Seurat object and return a smaller Seurat object that contains only the selected features and keeps all original wells/samples. This is a lightweight convenience wrapper around seqgendiff::select_counts(), intended for creating a small working object on which check_zeroinflation() (or similar diagnostics) runs quickly.

Usage

subsample_genes(data, ngene = 100, gselect = "random", seed = 1)

Arguments

data

A Seurat object (v4 or v5) with counts in assay "RNA".

ngene

Integer. Number of genes to keep (must not exceed the total number of genes in data).

gselect

Character. Gene-selection strategy, as used by seqgendiff::select_counts(); see that function's documentation for the accepted values. Defaults to "random".

seed

Integer random seed for reproducibility.

Value

A Seurat object containing the subsampled genes and all original wells/samples.

Examples

data(mini_mac)
subsample_genes(mini_mac, ngene = 50, gselect = "random", seed = 1)
#> # A Seurat-tibble abstraction: 308 × 21
#> # Features=50 | Cells=308 | Active assay=RNA | Assays=RNA
#>    .cell      orig.ident   nCount_RNA nFeature_RNA Plate_ID Well_ID Row   Column
#>    <chr>      <fct>             <dbl>        <int> <chr>    <chr>   <chr>  <int>
#>  1 AACAGGCAAT PMMSq033_mi…         65           29 PMMSq033 B02     B          2
#>  2 AACCAGCCAG PMMSq033_mi…        522           97 PMMSq033 C02     C          2
#>  3 AACCAGTTGA PMMSq033_mi…        415           82 PMMSq033 D02     D          2
#>  4 AACCGGCGTA PMMSq033_mi…        578           93 PMMSq033 E02     E          2
#>  5 AACCTAGTCC PMMSq033_mi…        286           72 PMMSq033 F02     F          2
#>  6 AACTCTACAC PMMSq033_mi…        515           96 PMMSq033 G02     G          2
#>  7 AACTGTGTCA PMMSq033_mi…        408           87 PMMSq033 H02     H          2
#>  8 AAGATGTCCA PMMSq033_mi…        332           78 PMMSq033 I02     I          2
#>  9 AAGCATATGG PMMSq033_mi…        498           92 PMMSq033 J02     J          2
#> 10 AAGCTCACCT PMMSq033_mi…        539          102 PMMSq033 K02     K          2
#> # ℹ 298 more rows
#> # ℹ 13 more variables: Species <chr>, Cell_type <chr>, Model_type <chr>,
#> #   Time <fct>, Unit <chr>, Treatment_1 <chr>, Concentration_1 <fct>,
#> #   Unit_1 <chr>, Sample_type <chr>, Project <chr>, combined_id <chr>,
#> #   percent.mt <dbl>, percent.ribo <dbl>
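
The returned object can be checked against the description above. The following is a minimal sketch, not part of the original examples. It assumes that the package providing subsample_genes() and mini_mac is already attached and that Seurat is installed, so that nrow(), ncol() and rownames() dispatch on Seurat objects. The names small and small_again are illustrative only.

```r
# Assumption: the package providing subsample_genes() and mini_mac is attached.
data(mini_mac)

small <- subsample_genes(mini_mac, ngene = 50, gselect = "random", seed = 1)

# For a Seurat object, rows are features (genes) and columns are wells/samples.
nrow(small)                      # number of genes kept; expected to equal ngene
ncol(small) == ncol(mini_mac)    # all original wells/samples should be retained

# Every kept gene should come from the original object.
all(rownames(small) %in% rownames(mini_mac))

# Repeating the call with the same seed should select the same genes,
# provided the seed fully determines the random selection.
small_again <- subsample_genes(mini_mac, ngene = 50, gselect = "random", seed = 1)
identical(sort(rownames(small)), sort(rownames(small_again)))
```

If any of these checks returns FALSE, inspect the input object (for example, whether counts are stored in assay "RNA") before passing the result on to downstream diagnostics.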