Package 'uem915'

Title: Omic Analysis uem915
Description: Omic Analysis uem915.
Authors: Florent Dumont [aut, cre] (ORCID: <https://orcid.org/0000-0002-4439-5070>)
Maintainer: Florent Dumont <[email protected]>
License: GPL-3
Version: 1.0.2
Built: 2026-05-11 09:20:38 UTC
Source: https://github.com/fdumbioinfo/uem915

Help Index


Principal Component Analysis

Description

Principal Component Analysis

Usage

acp(
  dat,
  factor = NULL,
  samplename = NULL,
  pc1 = 1,
  pc2 = 2,
  center = TRUE,
  scale = TRUE,
  title = "ACP",
  legendtitle = "TREATMENT"
)

Arguments

dat

matrix numeric

factor

factor

samplename

character

pc1

numeric

pc2

numeric

center

logical TRUE

scale

logical TRUE

title

character

legendtitle

character

Value

no values

Author(s)

Florent Dumont [email protected]

Examples

# not run
# acp( mat1 , sif1 )
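
As a rough illustration of what the center, scale, pc1 and pc2 arguments control, the sketch below runs a PCA with base R's prcomp on simulated data. It assumes samples are stored in columns, so the matrix is transposed first. The objects mat, grp, pca and pctvar are hypothetical, and acp() itself may be implemented differently.

```r
# Hypothetical data: 100 features (rows) x 6 samples (columns)
set.seed(1)
mat <- matrix(rnorm(600), nrow = 100, ncol = 6,
              dimnames = list(NULL, paste0("S", 1:6)))
grp <- factor(rep(c("A", "B"), each = 3))

# prcomp expects observations in rows, so samples are transposed into rows
pca <- prcomp(t(mat), center = TRUE, scale. = TRUE)

# Percentage of variance carried by each component
pctvar <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Plot the two chosen components (here 1 and 2), colored by group
plot(pca$x[, 1], pca$x[, 2], col = as.integer(grp), pch = 19,
     xlab = sprintf("PC1 (%.1f%%)", pctvar[1]),
     ylab = sprintf("PC2 (%.1f%%)", pctvar[2]), main = "ACP")
legend("topright", legend = levels(grp), col = seq_along(levels(grp)),
       pch = 19, title = "TREATMENT")
```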

Annotation

Description

Annotate a list of symbols or IDs

Usage

annot(
  symbollist,
  species = NULL,
  ortholog = F,
  dboutput = "ncbi",
  idtype = NULL
)

Arguments

symbollist

character list of IDs or Symbols

species

character species code: hs, mm, rn or dr

ortholog

logical; if TRUE, return the Homo sapiens ortholog for the given species

dboutput

character database used for Symbol annotation: "ncbi" or "ebi"

idtype

character ID type of the annotation database, among SYMBOL (by default), GENE, ENST, ENSG, ENSP, UNIPROT

Details

Supported ID types: symbol, NCBI gene, Ensembl gene, transcript and protein, UniProt Swiss-Prot, UniProt TrEMBL. Supported species: hs (Homo sapiens), mm (Mus musculus), rn (Rattus norvegicus), dr (Danio rerio).

Value

data.frame

Author(s)

Florent Dumont [email protected]

Examples

# not run
# annot(SymbolList)

Boxplot

Description

Boxplot

Usage

boxplot(
  dat,
  factor,
  outline = FALSE,
  title = "Boxplot",
  legendtitle = "TREATMENT",
  outlier = T,
  coefiqr = 1.5,
  ggplot = FALSE
)

Arguments

dat

matrix numeric

factor

factor

outline

logical; display outliers (FALSE by default)

title

character

legendtitle

character

outlier

logical

coefiqr

numeric

ggplot

logical; if TRUE use ggplot2, otherwise the graphics package (FALSE by default)

Details

Draws a boxplot from a numeric matrix.

Value

plot

Author(s)

Florent Dumont [email protected]

Examples

# not run
# mat1 %>% boxplot( factor = sif1$F3 )

Boxplot for one variable

Description

Boxplot for one variable

Usage

boxplot1(dat, ylab = "y", xlab = "TREATMENT", log = T)

Arguments

dat

data.frame

ylab

character

xlab

character

log

logical; if TRUE, data are back-transformed from log2 (delogged)

Value

plot

Author(s)

Florent Dumont [email protected]

Examples

# not run

MSigDB enrichment analysis

Description

MSigDB enrichment analysis

Usage

ena(
  SymbolList = NULL,
  geneannot = NULL,
  species = "hs",
  bg = 25000,
  filtergeneset = "all",
  overlapmin = 2,
  enaScoremin = 1,
  top = 80,
  labsize = 11,
  dpibarplot = "screen",
  path = ".",
  dirname = NULL
)

Arguments

SymbolList

character Symbol or NCBI gene ID

geneannot

data.frame

species

character species code: hs, mm, rn, dr or ss

bg

numeric

filtergeneset

regexp to filter geneset database

overlapmin

numeric minimum overlap between a geneset and the list

enaScoremin

numeric minimum ena score ratio

top

numeric number of top features to plot

labsize

numeric label size of functions in the barplot

dpibarplot

character barplot resolution

path

character for relative path of output directory

dirname

character name for output

Value

file with enrichment analysis results

Author(s)

Florent Dumont [email protected]

Examples

# not run
# ena( Symbollist , filtergeneset = "reactome")
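
To make the roles of bg and overlapmin more concrete, here is a minimal sketch of an over-representation test for a single gene set, written with base R only. It assumes a hypergeometric test and an observed/expected ratio. The manual does not state which test or score ena() actually uses, so both are assumptions, and genelist, geneset, pval and ratio are hypothetical names.

```r
# Hypothetical inputs: a gene list, one gene set and a background size
genelist <- paste0("G", 1:50)
geneset  <- paste0("G", c(1:10, 101:140))  # 50 genes, 10 shared with genelist
bg       <- 25000                          # same value as the bg default

overlap <- length(intersect(genelist, geneset))

# P(X >= overlap) when drawing length(genelist) genes out of bg genes,
# of which length(geneset) belong to the gene set (assumed test)
pval <- phyper(overlap - 1, m = length(geneset), n = bg - length(geneset),
               k = length(genelist), lower.tail = FALSE)

# Observed over expected overlap: one possible enrichment ratio
expected <- length(genelist) * length(geneset) / bg
ratio    <- overlap / expected

# A filter in the spirit of overlapmin = 2 and enaScoremin = 1
keep <- overlap >= 2 && ratio >= 1
```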

Loading regular libraries

Description

Load magrittr, dplyr, gplots, ggplot2, foreach, parallel and doParallel

Usage

env()

Hierarchical clustering classification

Description

Perform a hierarchical clustering classification

Usage

hc(
  dat,
  factor = NULL,
  title = "Hierarchical Clustering",
  plot = TRUE,
  method = "complete",
  legendtitle = "TREATMENT",
  cexlabel = 0.6
)

Arguments

dat

matrix numeric

factor

factor

title

character

plot

logical

method

character agglomeration method used to build the clustering dendrogram

legendtitle

character

cexlabel

numeric

Details

The available agglomeration methods are those of the hclust function; "complete" is the default.

Value

a dendrogram

Author(s)

Florent Dumont [email protected]

Examples

# not run
# hc(dat)
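
The effect of the method argument can be sketched with base R's hclust. The example below clusters samples (columns) of a simulated matrix using a correlation-based distance. That distance is an assumption: the manual only mentions Pearson correlation for row clustering in omic(), and hc() may compute distances differently. The names mat, d and tree are hypothetical.

```r
# Hypothetical data: 100 features (rows) x 6 samples (columns)
set.seed(1)
mat <- matrix(rnorm(600), nrow = 100,
              dimnames = list(NULL, paste0("S", 1:6)))

# Distance between samples: 1 - Pearson correlation (assumed here)
d <- as.dist(1 - cor(mat, method = "pearson"))

# Any agglomeration method accepted by hclust can be used
tree <- hclust(d, method = "complete")

# Draw the dendrogram with small labels
plot(tree, main = "Hierarchical Clustering", cex = 0.6)
```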

Heatmap

Description

To make a heatmap

Usage

heatmap(
  dat,
  factor,
  method = "complete",
  dendrogram = "both",
  k = NULL,
  labCol = "",
  cexCol = 0.85,
  labRow = "",
  cexRow = 0.35,
  cexlegend = 0.65,
  keysize = 0.9,
  keycolor = c("darkgreen", "orange", "darkred"),
  parmar = c(5, 4, 5, 6)
)

Arguments

dat

matrix numeric

factor

factor

method

character

dendrogram

character to display 'none', 'row', 'column' or 'both' (by default) dendrograms

k

numeric number of clusters to colorize for rows

labCol

character

cexCol

numeric

labRow

character

cexRow

numeric

cexlegend

numeric

keysize

numeric

keycolor

character vector of 3 colors for the low, mid and high values of the key

parmar

numeric vector of 4 values for margin sizes

Details

To make a heatmap from a matrix or a data.frame

Value

no returned value

Author(s)

Florent Dumont [email protected]

Examples

# not run
# library(magrittr)
# data(sif1)
# data(mat1)
# mat1 %>% heatmap(sif1$F3)

Import a tab-delimited file into a data.frame

Description

Import a tab-delimited file into a data.frame

Usage

input(filename, sep = "\t", quote = "")

Arguments

filename

character path to the file to read

sep

character for field separator

quote

character for field quote

Details

Wrapper around the read.table function for tab-separated files

Value

data.frame

Author(s)

Florent Dumont [email protected]

Examples

# not run
# input( "filename" ) -> dt
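
Since input() is described as a wrapper of read.table, the call it makes can be approximated as below. This is only a sketch: header = TRUE, stringsAsFactors = FALSE and check.names = FALSE are assumptions that the manual does not confirm, and the temporary file is created just for the demonstration.

```r
# Write a small tab-separated file to a temporary location
tmp <- tempfile(fileext = ".tsv")
writeLines(c("rowID\tS1\tS2", "g1\t1.5\t2.5", "g2\t3.0\t4.0"), tmp)

# Roughly what a wrapper with sep = "\t" and quote = "" could call;
# the remaining arguments are assumptions
dt <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                 stringsAsFactors = FALSE, check.names = FALSE)

str(dt)
```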

Normalization

Description

Quantile normalization and log2 transformation

Usage

norm(dat, method = NULL, log = TRUE)

Arguments

dat

data.frame

method

character normalization method; quantile normalization by default (see Details)

log

logical; apply log base 2

Details

For the method argument, see the method argument of limma's normalizeBetweenArrays

Value

data.frame

Author(s)

Florent Dumont [email protected]

Examples

# not run
# norm(dt)
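
For readers unfamiliar with quantile normalization, the following base R sketch shows the idea on a simulated matrix. It is not the limma implementation that norm() relies on: ties are broken by order of appearance, missing values are not handled, and applying log2 before normalization is an assumption. The function quantile_norm is hypothetical.

```r
# Minimal quantile normalization: every column gets the same distribution
quantile_norm <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")  # per-column ranks
  target <- rowMeans(apply(m, 2, sort))               # mean sorted profile
  out <- apply(ranks, 2, function(r) target[r])       # map ranks to target
  dimnames(out) <- dimnames(m)
  out
}

# Hypothetical raw intensities: 10 features x 4 samples
set.seed(1)
raw <- matrix(rexp(40, rate = 0.01) + 1, nrow = 10,
              dimnames = list(paste0("g", 1:10), paste0("S", 1:4)))

# log2 first, then normalize (the order is an assumption)
normed <- quantile_norm(log2(raw))

# After normalization all columns share the same sorted values
apply(normed, 2, sort)
```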

Omic bioanalysis workflow

Description

Biostatistics analysis: QCs, ANOVA, threshold filtering, Venn diagram, cluster analysis for significant rows, pattern search

Graphics: volcanoplots, heatmaps, lineplots, boxplots (with Kruskal-Wallis test)

Functional analysis: MSigDB enrichment analysis, stringDB protein interaction network, topGO analysis (gene ontology)

See example section to test workflow with internal data.

Usage

omic(
  dat,
  sif,
  annot = NULL,
  species = "hs",
  model = NULL,
  paired = NULL,
  nested = NULL,
  batch = NULL,
  addfactor = NULL,
  qcs = TRUE,
  threshold = c(c(2, 6, 11), c(2, 6, 11) + 120),
  padj = "none",
  pattern = TRUE,
  venn = TRUE,
  cluster = TRUE,
  nc = c(2, 3, 6, 12),
  heatmap = TRUE,
  maxheatmap = NULL,
  volcanoplot = TRUE,
  lineplot = TRUE,
  boxplotrow = TRUE,
  ena = TRUE,
  enamin = 2,
  filtergeneset = "all",
  enaScoremin = 1.1,
  bg = 25000,
  sample = NULL,
  dopar = NULL,
  path = ".",
  dirname = NULL,
  zip = FALSE,
  remove = FALSE
)

Arguments

dat

data.frame normalized data table

sif

data.frame sample information file including model factors

annot

data.frame annotation with Symbol column for functional analysis

species

character available species: bt ce dr dm gg hs mm pt rn ss xt

model

character anova model factors (see details)

paired

character factor for paired design

nested

character factor for nested design

batch

character factor for batch effect design

addfactor

character additional factors

qcs

logical quality controls

threshold

numeric vector from 1 to 160 (see details)

padj

character p-value adjustment method; "none" by default, use "fdr" for Benjamini-Hochberg false discovery rate correction

pattern

logical search relevant patterns across comparisons (see details)

venn

logical Venn diagram

cluster

logical row hierarchical clustering using Pearson correlation

nc

numeric number of clusters to cut in the dendrogram

heatmap

logical do heatmaps for all lists

maxheatmap

numeric max rows for heatmap

volcanoplot

logical make volcanoplot for each threshold

lineplot

logical do lineplot for significant features

boxplotrow

logical do boxplot for significant features with Kruskal-Wallis test

ena

logical MSigDB enrichment analysis (over-representation analysis)

enamin

numeric min list size for functional analysis

filtergeneset

character regular expression to filter collection geneset (e.g. "reactome|tft")

enaScoremin

numeric minimum ena score ratio

bg

numeric background used for over-representation test

sample

numeric subset analysis

dopar

numeric core number

path

character results directory path

dirname

character results directory name

zip

logical compress results directory if TRUE

remove

logical remove the uncompressed results directory if TRUE

Details

Use uem915::env() to load required libraries before uem915::omic() (see example)

Accepted values for threshold param (1 to 150): see list -> uem915:::thresholdlist %>% lapply("[",c(1,2)) %>% unlist %>% matrix(ncol=2,byrow = T) %>% data.frame %>% setNames(c("pval","fc"))

Pattern: search relevant profiles among up and down comparison combinations

filtergeneset param: see geneset collections -> moalannotgene::genesetdb %>% lapply(names)

Experimental design examples:

  • model = "TREATMENT" for 1-way anova

  • model = "TREATMENT+TIME+TREATMENT*TIME" for 3-way anova with interaction

  • model = "TREATMENT", paired = "CASE" for 2-way paired anova

  • model = "TREATMENT", batch = "BATCH" for 2-way anova with batch effect removal (<=> paired anova)

  • model = "TREATMENT+PHENOTYPE" for 2-way anova

  • model = "TREATMENT", addfactor = "PHENOTYPE" for 2-way anova where venn, cluster and pattern are not applied to addfactor

  • model = "TREATMENT", nested = "TREATMENTinCASE" for 2-way nested design

  • model = "TREATMENT", paired = "CASE", batch = "BATCH" for 3-way paired anova with batch effect removal
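
To show how such a design translates into a linear model, the sketch below fits the model = "TREATMENT", batch = "BATCH" case for a single simulated feature with base R. How omic() builds and fits its models internally is not documented here, so the formula construction and the use of lm and anova are assumptions, and all object names are hypothetical.

```r
# Hypothetical sample information: 2 treatments x 3 batches, balanced
set.seed(1)
sif <- data.frame(
  TREATMENT = factor(rep(c("ctrl", "trt"), each = 6)),
  BATCH     = factor(rep(c("b1", "b2", "b3"), times = 4))
)

# One simulated feature (one row of the data table) with a treatment effect
sif$y <- rnorm(12) + ifelse(sif$TREATMENT == "trt", 2, 0)

# Build the formula from the model string plus the batch term
model <- "TREATMENT"
batch <- "BATCH"
f <- as.formula(paste("y ~", paste(c(model, batch), collapse = " + ")))

# ANOVA table; the p-value of interest is the one of the model factor
fit  <- anova(lm(f, data = sif))
pval <- fit[model, "Pr(>F)"]

# With many rows, p-values could then be adjusted, e.g. with
# p.adjust(pvals, method = "fdr") for Benjamini-Hochberg
```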

Limitations:

  • Complete block designs only

  • Use dopar = 2 to decrease computing resources

  • The sample param subsets random rows of dat to decrease analysis time.

Annotation updates: 05112023 for gene and ensembl, MSigDB 7.5.1, StringDB 12.0

Input format:

  • Use uem915::input() to load external data from tsv files

  • norm data table must contain IDs in first column

  • norm data (in columns) and sample information (in rows) must have same order

  • norm data and annotation rows must have same order

  • use uem915::annot() function to annotate IDs (Symbols, ensembl and gene ids accepted)

Value

omic results directory

Author(s)

Florent Dumont [email protected]

Examples

# # test omic() with internal dataset GSE65055:
# # loading libraries
# library(uem915)
# uem915::env()
# # loading norm data
# moal:::GSE65055normdata -> normdata
# normdata %>% head
# # loading sample information file
# moal:::GSE65055sampledata -> sampledata
# sampledata %>% head
# # ordering factor levels
# sampledata$ANEUPLOIDY %>% ordered( c("Control","T13","T18","T21") ) -> sampledata$ANEUPLOIDY
# sampledata$TISSUE %>% as.factor -> sampledata$TISSUE
# # annotation
# normdata$rowID %>% uem915::annot( species = "hs", idtype = "GENE" ) -> annotdata
# # omic analysis
# uem915::omic(
#  dat = normdata, sif = sampledata, annot = annotdata, species = "hs",
#  model = "ANEUPLOIDY", batch = "TISSUE", threshold = c(6,126),
#  heatmap = T, lineplot = T, boxplotrow = T,
#  venn = F, cluster = F, pattern = F,
#  ena = T,
#  sample = NULL, dopar = NULL, zip = F,
#  dirname = "test", path = "." )

Export a data.frame to a tab-delimited file

Description

Export a data.frame to a tab-delimited file

Usage

output(dt, filename)

Arguments

dt

data.frame

filename

character

Author(s)

Florent Dumont [email protected]

Examples

# not run
# output( dt , "filename" )

Quality Controls

Description

Compute descriptive statistics for columns: histogram, boxplot, hierarchical clustering and PCA

Usage

qc(
  dat,
  sif = NULL,
  inputdata = F,
  histo = TRUE,
  boxplot = TRUE,
  hc = TRUE,
  acp = TRUE,
  dirname = NULL,
  path = "."
)

Arguments

dat

matrix numeric

sif

data.frame

inputdata

logical to export input data or not

histo

logical do histogram if TRUE (by default)

boxplot

logical do boxplot if TRUE (by default)

hc

logical do hierarchical clustering if TRUE (by default)

acp

logical do principal component analysis if TRUE (by default)

dirname

character

path

character

Details

Returns the p-value for each factor of the anova model; the function uses doParallel

Value

data.frame

Author(s)

Florent Dumont [email protected]

Examples

# not run
# qc(dat, metadata)

Replace value by group median

Description

Replace a given value by the group median

Usage

replacegroupmed(dat, value = 0, factor)

Arguments

dat

character vector of mzxml file

value

numeric value to substitute

factor

character

Details

The value is replaced by the column median if the group size is one, and by the row group median if the group size is greater than one.

Value

data.frame

Author(s)

Florent Dumont [email protected]

Examples

# not run
# replacegroupmed(dat)
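
The replacement rule described in Details can be sketched as follows. This is one possible reading, not the package code: it assumes samples are in columns, that the medians are computed after excluding the value being replaced, and that there are no missing values. The function replace_by_group_median is hypothetical.

```r
replace_by_group_median <- function(m, value = 0, groups) {
  groups <- as.factor(groups)
  orig <- m                                   # always read from the input
  for (g in levels(groups)) {
    cols <- which(groups == g)
    for (j in cols) {
      hit <- which(orig[, j] == value)
      if (length(hit) == 0) next
      if (length(cols) == 1) {
        # Single-sample group: median of the other values in that column
        m[hit, j] <- median(orig[orig[, j] != value, j])
      } else {
        # Larger group: per row, median of the group's other values
        for (i in hit) {
          vals <- orig[i, cols]
          m[i, j] <- median(vals[vals != value])
        }
      }
    }
  }
  m
}

# 2 features x 4 samples; group A has 3 samples, group B has 1
m <- matrix(c(0, 1,  4, 2,  6, 3,  0, 5), nrow = 2)
res <- replace_by_group_median(m, value = 0, groups = c("A", "A", "A", "B"))
res[1, ]  # 5 4 6 5
```

If every value of a group equals value in a given row, this sketch returns NA for that cell.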

Venn diagram

Description

To make a Venn diagram of 2, 3 or 4 lists

Usage

venn(
  list = NULL,
  listnames = NULL,
  returnlist = F,
  title = "Venn Diagram",
  plot = T,
  export = F,
  path = ".",
  dirname = "venn"
)

Arguments

list

list of 2, 3 or 4 character vectors, or list of two data.frames to compare

listnames

character list names to display on graph

returnlist

logical

title

character title to display on graph

plot

logical to display the plot or not

export

logical export list in file

path

character

dirname

character name of the directory created when export = T

Details

Supports up to 4 lists

Value

venn plot and new lists generated by venn.

Author(s)

Florent Dumont [email protected]

Examples

# not run
# library(magrittr)
# list(
#   c(letters[6:20] , letters[25] ) ,
#   letters[1:15] ,
#   c( letters[2:5] , letters[8:23] ) ) %>% venn
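
The lists that a Venn diagram summarizes can be previewed with base R set operations. The sketch below reuses the three vectors of the example above. It only computes two of the regions and does not reproduce the output format of venn(); the object names are hypothetical.

```r
a  <- c(letters[6:20], letters[25])
b  <- letters[1:15]
c3 <- c(letters[2:5], letters[8:23])

# Elements shared by the three lists (center of the diagram)
common <- Reduce(intersect, list(a, b, c3))
common  # "h" "i" "j" "k" "l" "m" "n" "o"

# Elements found only in the first list
only_a <- setdiff(a, union(b, c3))
only_a  # "y"
```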