inmoose.sim.sim_rnaseq
- inmoose.sim.sim_rnaseq(nb_genes, nb_samples, batch=None, group=None, single_cell=False, alpha=0.6, beta=0.3, outlier_prob=0.05, outlier_location=4, outlier_scale=0.5, libsize_loc=11, libsize_scale=0.2, phi=0.1, bcv_df=60, x0=0, k=-1, random_state=None)
Simulate (sc)RNASeq data.
For a precise description of the parameters, refer to https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1305-0.
- Parameters:
nb_genes (int) – number of genes
nb_samples (int) – number of samples
batch (array-like, optional) – batch indices. Must have as many elements as nb_samples.
group (array-like, optional) – vector/factor for biological condition of interest. Must have as many elements as nb_samples.
single_cell (bool, optional) – if True, simulate scRNASeq data by adding dropouts. Defaults to False.
alpha (float, optional) – shape of the gamma distribution to draw the initial means from
beta (float, optional) – rate of the gamma distribution to draw the initial means from
outlier_prob (float, optional) – proportion of outliers
outlier_location (float, optional) – location parameter for the log-normal distribution to draw the outlier inflation factors from
outlier_scale (float, optional) – scale parameter for the log-normal distribution to draw the outlier inflation factors from
libsize_loc (float, optional) – mean of the normal distribution to draw the log of the library sizes from
libsize_scale (float, optional) – standard deviation of the normal distribution to draw the log of the library sizes from
phi (float, optional) – common dispersion
bcv_df (int, optional) – degrees of freedom in the inverse chi-square distribution used to draw the biological coefficient of variation parameters
x0 (float, optional) – dropout midpoint, used to draw the dropouts. Unused if single_cell=False.
k (float, optional) – dropout shape, used to draw the dropouts. Unused if single_cell=False.
random_state ({int, numpy.random.Generator or numpy.random.RandomState}, optional) – If random_state is None (or np.random), the numpy.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a Generator or RandomState instance, that instance is used.
- Returns:
simulated count matrix, of size nb_genes x nb_samples
- Return type:
ndarray
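Two of the parameter pairs above are easy to misread: alpha and beta are a shape and a rate (not a shape and a scale), and x0 and k describe a logistic curve. The following is a minimal standard-library sketch of what those parameters could mean, assuming the simulation follows the parameterisation of the paper linked above, where the dropout probability is a logistic function of the log mean expression. The helper names draw_initial_means and dropout_probability are hypothetical and are not part of inmoose; this is an illustration, not the library's implementation.

```python
import math
import random


def draw_initial_means(nb_genes, alpha=0.6, beta=0.3, seed=None):
    """Hypothetical helper: one mean per gene from Gamma(shape=alpha, rate=beta)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); the scale is the inverse of the rate.
    return [rng.gammavariate(alpha, 1.0 / beta) for _ in range(nb_genes)]


def dropout_probability(log_mean, x0=0.0, k=-1.0):
    """Hypothetical helper: logistic curve with midpoint x0 and shape k (assumed form)."""
    return 1.0 / (1.0 + math.exp(-k * (log_mean - x0)))


means = draw_initial_means(1000, seed=0)

# At the midpoint x0 the dropout probability is one half.
p_mid = dropout_probability(0.0)

# With a negative k, lowly expressed genes drop out more often than highly expressed ones.
p_low = dropout_probability(-3.0)
p_high = dropout_probability(3.0)
```

Under these assumptions, the default k=-1 makes the dropout probability decrease as expression increases, and x0 sets the log mean at which half of the counts are dropped. NumPy's gamma samplers also take a scale rather than a rate, so the same 1/beta conversion would apply there.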