test_gene_overrepresentation {tidybulk}    R Documentation

Analyse gene over-representation with GSEA

Description

test_gene_overrepresentation() takes as input a 'tbl' formatted as | <SAMPLE> | <ENSEMBL_ID> | <COUNT> | <...> | and returns a 'tbl' with the GSEA statistics.

Usage

test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'spec_tbl_df'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'tbl_df'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'tidybulk'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

Arguments

.data

A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |

.sample

The name of the sample column

.entrez

The name of the column holding the ENTREZ IDs of the transcripts/genes

.do_test

The name (as a symbol) of a boolean column. It flags the transcripts to test for over-representation

species

A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens")

Details

Maturing lifecycle

This wrapper executes gene enrichment analyses of the dataset using a list of transcripts and GSEA. It uses clusterProfiler on the backend.
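
To give an intuition for what an over-representation test measures, here is a minimal base-R sketch of the hypergeometric test that is commonly used for this kind of analysis. It is not the tidybulk or clusterProfiler implementation: all numbers are hypothetical toy values, and the choice of background (universe) is an assumption made only for illustration.

```r
# Hypothetical toy numbers, not taken from tidybulk, clusterProfiler or MSigDB.
universe_size <- 1000  # genes in the assumed background
set_size      <- 50    # background genes that belong to one gene set
flagged       <- 40    # genes flagged for testing (e.g. do_test == TRUE)
overlap       <- 8     # flagged genes that also belong to the gene set

# One-sided hypergeometric p-value: probability of seeing at least
# `overlap` gene-set members among `flagged` genes drawn at random
# from the background. phyper(q, ..., lower.tail = FALSE) gives
# P(X > q), hence the `overlap - 1`.
p_value <- phyper(
  overlap - 1,
  set_size,
  universe_size - set_size,
  flagged,
  lower.tail = FALSE
)

p_value
```

A small p-value suggests that the gene set contains more flagged genes than expected by chance. In practice one such test is run per gene set, and the resulting p-values are adjusted for multiple testing.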

Value

A 'tbl' object

Examples


library(tidybulk)
library(dplyr)  # for mutate()

# Map gene symbols to ENTREZ IDs
df_entrez = symbol_to_entrez(tidybulk::counts_mini, .transcript = transcript, .sample = sample)
# Sum the counts of rows that share the same ENTREZ ID within a sample
df_entrez = aggregate_duplicates(df_entrez, aggregation_function = sum, .sample = sample, .transcript = entrez, .abundance = count)
# Flag the genes to test for over-representation
df_entrez = mutate(df_entrez, do_test = transcript %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))

test_gene_overrepresentation(
  df_entrez,
  .sample = sample,
  .entrez = entrez,
  .do_test = do_test,
  species = "Homo sapiens"
)



[Package tidybulk version 1.0.2 Index]