inmoose.sim.sim_rnaseq

inmoose.sim.sim_rnaseq(nb_genes, nb_samples, batch=None, group=None, single_cell=False, alpha=0.6, beta=0.3, outlier_prob=0.05, outlier_location=4, outlier_scale=0.5, libsize_loc=11, libsize_scale=0.2, phi=0.1, bcv_df=60, x0=0, k=-1, random_state=None)

Simulate (sc)RNASeq data.

For a precise description of the parameters, see the Splatter paper: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1305-0.

Parameters:
  • nb_genes (int) – number of genes

  • nb_samples (int) – number of samples

  • batch (array-like, optional) – batch indices. Must have as many elements as nb_samples.

  • group (array-like, optional) – vector/factor for biological condition of interest. Must have as many elements as nb_samples.

  • single_cell (bool, optional) – if True, simulate scRNASeq data by adding drop-outs. Defaults to False.

  • alpha (float, optional) – shape of the gamma distribution to draw the initial means from

  • beta (float, optional) – rate of the gamma distribution to draw the initial means from

  • outlier_prob (float, optional) – proportion of outliers

  • outlier_location (float, optional) – location parameter for the log-normal distribution to draw the outlier inflation factors from

  • outlier_scale (float, optional) – scale parameter for the log-normal distribution to draw the outlier inflation factors from

  • libsize_loc (float, optional) – mean of the normal distribution to draw the log of the library sizes from

  • libsize_scale (float, optional) – standard deviation of the normal distribution to draw the log of the library sizes from

  • phi (float, optional) – common dispersion

  • bcv_df (int, optional) – degrees of freedom in the inverse chi-square distribution used to draw the biological coefficient of variation parameters

  • x0 (float, optional) – dropout midpoint, used to draw the dropouts. Unused if single_cell=False.

  • k (float, optional) – dropout shape, used to draw the dropouts. Unused if single_cell=False.

  • random_state ({int, numpy.random.Generator or numpy.random.RandomState}, optional) – If random_state is None (or np.random), the numpy.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a Generator or RandomState instance, that instance is used.
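
To make the distribution parameters above more concrete, here is a minimal NumPy sketch of how alpha/beta, libsize_loc/libsize_scale and x0/k could map onto random draws. It is not inmoose's implementation. The variable names are illustrative, and the logistic dropout curve on the log of the means is an assumption taken from the model described in the paper linked above.

```python
import numpy as np

rng = np.random.default_rng(0)
nb_genes, nb_samples = 1000, 6

# Default values copied from the function signature.
alpha, beta = 0.6, 0.3
libsize_loc, libsize_scale = 11, 0.2
x0, k = 0, -1

# Gamma with shape `alpha` and rate `beta`.
# NumPy parametrizes the gamma by scale, which is 1 / rate.
gene_means = rng.gamma(shape=alpha, scale=1.0 / beta, size=nb_genes)

# The log of the library sizes is normal(libsize_loc, libsize_scale),
# so the library sizes themselves are log-normal.
lib_sizes = rng.lognormal(mean=libsize_loc, sigma=libsize_scale, size=nb_samples)

# Assumption: dropout probability is a logistic function of the log mean,
# with midpoint `x0` and shape `k`. A negative `k` makes highly expressed
# genes less likely to drop out.
log_means = np.log(gene_means)
dropout_prob = 1.0 / (1.0 + np.exp(-k * (log_means - x0)))

print(gene_means.mean(), lib_sizes.round(), dropout_prob.mean())
```

Note the rate-to-scale conversion: with the defaults, the expected initial mean is alpha / beta = 2.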

Returns:

simulated count matrix, of size nb_genes x nb_samples

Return type:

ndarray
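
A possible call, based only on the signature documented above, is sketched below. The batch and group values are made up for illustration. The import is guarded so the snippet still runs where inmoose is not installed, and the shape of the result is printed rather than assumed.

```python
import numpy as np

try:
    from inmoose.sim import sim_rnaseq
except ImportError:  # inmoose is not installed: the call below is skipped
    sim_rnaseq = None

nb_genes, nb_samples = 100, 6

# Illustrative design: two batches of three samples, two alternating conditions.
# Both vectors must have exactly nb_samples elements.
batch = np.array([0, 0, 0, 1, 1, 1])
group = np.array([0, 1, 0, 1, 0, 1])

counts = None
if sim_rnaseq is not None:
    # An integer random_state seeds a new RandomState, making the run reproducible.
    counts = sim_rnaseq(
        nb_genes, nb_samples, batch=batch, group=group, random_state=42
    )
    print(counts.shape)
```

Passing single_cell=True to the same call would additionally apply drop-outs, controlled by x0 and k.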