Inspect problems in aggregated FastQC reports.

qc_fails(object, element = c("sample", "module"), compact = TRUE)

qc_warns(object, element = c("sample", "module"), compact = TRUE)

qc_problems(object, element = c("sample", "module"), name = NULL,
  status = c("FAIL", "WARN"), compact = TRUE)

Arguments

object

an object of class qc_aggregate.

element

character string specifying which element to inspect for problems. Must be one of "sample" or "module". Default is "sample".

  • If "sample", shows the samples with the most failed and/or warned modules.

  • If "module", shows the modules that failed and/or warned in the most samples.

compact

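logical value. If TRUE, returns a compact output format, with one row per sample (or module) and the associated modules (or samples) collapsed into a comma-separated string; otherwise, returns a stretched format, with one row per sample-module pair and an additional status column.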

name

character vector containing the names of modules and/or samples of interest. See qc_read for valid module names. If name is specified, a stretched output format is returned by default unless you explicitly set compact = TRUE.

status

character vector specifying the module status. Allowed values are one or both of c("FAIL", "WARN"). If status = "FAIL", only modules with failed status are returned.

Value

  • qc_problems(), qc_fails(), qc_warns(): return a tibble (data frame) containing the samples that had one or more modules with a failure or warning. The format and the interpretation of the results depend on the argument 'element', whose value is one of c("sample", "module").

    • If element = "sample" (default), results are samples with failed and/or warned modules. The results contain the following columns: sample (sample names), nb_problems (the number of modules with problems), module (the name of modules with problems).

    • If element = "module", results are modules that failed and/or warned in the most samples. The results contain the following columns: module (the name of the module with problems), nb_problems (the number of samples with problems), sample (the names of samples with problems).

Functions

  • qc_fails: Displays which samples had one or more failed modules. Use qc_fails(qc, "module") to see which modules failed in the most samples.

  • qc_warns: Displays which samples had one or more warned modules. Use qc_warns(qc, "module") to see which modules warned in the most samples.

  • qc_problems: Union of qc_fails() and qc_warns(). Displays which samples or modules failed or warned.

Examples

# Demo QC dir
qc.dir <- system.file("fastqc_results", package = "fastqcr")
qc.dir
#> [1] "/Users/kassambara/Documents/R/MyPackages/fastqcr/inst/fastqc_results"

# List of files in the directory
list.files(qc.dir)
#> [1] "S1_fastqc.zip" "S2_fastqc.zip" "S3_fastqc.zip" "S4_fastqc.zip"
#> [5] "S5_fastqc.zip"

# Aggregate the report
qc <- qc_aggregate(qc.dir, progressbar = FALSE)

# Display samples with failed modules
qc_fails(qc)
#> # A tibble: 5 x 3
#>   sample nb_problems module
#>   <chr>        <int> <chr>
#> 1 S3               2 Per base sequence content, Per sequence GC content
#> 2 S4               2 Per base sequence content, Per sequence GC content
#> 3 S1               1 Per base sequence content
#> 4 S2               1 Per base sequence content
#> 5 S5               1 Per base sequence content

qc_fails(qc, compact = FALSE)
#> # A tibble: 7 x 4
#>   sample nb_problems module                    status
#>   <chr>        <int> <chr>                     <chr>
#> 1 S3               2 Per base sequence content FAIL
#> 2 S3               2 Per sequence GC content   FAIL
#> 3 S4               2 Per base sequence content FAIL
#> 4 S4               2 Per sequence GC content   FAIL
#> 5 S1               1 Per base sequence content FAIL
#> 6 S2               1 Per base sequence content FAIL
#> 7 S5               1 Per base sequence content FAIL

# Display samples with warned modules
qc_warns(qc)
#> # A tibble: 5 x 3
#>   sample nb_problems module
#>   <chr>        <int> <chr>
#> 1 S1               2 Per sequence GC content, Sequence Length Distribution
#> 2 S2               2 Per sequence GC content, Sequence Length Distribution
#> 3 S5               2 Per sequence GC content, Sequence Length Distribution
#> 4 S3               1 Sequence Length Distribution
#> 5 S4               1 Sequence Length Distribution

# Module failed in the most samples
qc_fails(qc, "module")
#> # A tibble: 2 x 3
#>   module                    nb_problems sample
#>   <chr>                           <int> <chr>
#> 1 Per base sequence content           5 S1, S2, S3, S4, S5
#> 2 Per sequence GC content             2 S3, S4

qc_fails(qc, "module", compact = FALSE)
#> # A tibble: 7 x 4
#>   module                    nb_problems sample status
#>   <chr>                           <int> <chr>  <chr>
#> 1 Per base sequence content           5 S1     FAIL
#> 2 Per base sequence content           5 S2     FAIL
#> 3 Per base sequence content           5 S3     FAIL
#> 4 Per base sequence content           5 S4     FAIL
#> 5 Per base sequence content           5 S5     FAIL
#> 6 Per sequence GC content             2 S3     FAIL
#> 7 Per sequence GC content             2 S4     FAIL

# Specify a module of interest
qc_problems(qc, "module", name = "Per sequence GC content")
#> # A tibble: 5 x 4
#>   module                  nb_problems sample status
#>   <chr>                         <int> <chr>  <chr>
#> 1 Per sequence GC content           5 S3     FAIL
#> 2 Per sequence GC content           5 S4     FAIL
#> 3 Per sequence GC content           5 S1     WARN
#> 4 Per sequence GC content           5 S2     WARN
#> 5 Per sequence GC content           5 S5     WARN
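
The relationship between the stretched and compact formats can be illustrated in base R, without fastqcr. The sketch below is only an illustration and is not the package's implementation: it copies the sample, module and status values from the qc_fails(qc, compact = FALSE) output above into a plain data frame, then collapses them to one row per sample. The names stretched, modules_by_sample and compact are hypothetical, and the tie-breaking rule (alphabetical by sample) is an assumption chosen to match the displayed order.

```r
# Stretched rows copied from the qc_fails(qc, compact = FALSE) output above
# (a plain data.frame stands in for the tibble returned by the package)
stretched <- data.frame(
  sample = c("S3", "S3", "S4", "S4", "S1", "S2", "S5"),
  module = c("Per base sequence content", "Per sequence GC content",
             "Per base sequence content", "Per sequence GC content",
             "Per base sequence content", "Per base sequence content",
             "Per base sequence content"),
  status = "FAIL",
  stringsAsFactors = FALSE
)

# Group the module names by sample
modules_by_sample <- split(stretched$module, stretched$sample)

# One row per sample: count the modules and join their names with ", "
compact <- data.frame(
  sample = names(modules_by_sample),
  nb_problems = unname(lengths(modules_by_sample)),
  module = unname(vapply(modules_by_sample, paste, character(1),
                         collapse = ", ")),
  stringsAsFactors = FALSE
)

# Samples with the most problems first; ties ordered by sample name
# (assumed tie-breaking rule, chosen to match the output shown above)
compact <- compact[order(-compact$nb_problems, compact$sample), ]
rownames(compact) <- NULL
compact
```

The collapsed rows carry the same information as the stretched ones except for the status column, which is why the compact format is convenient for a quick overview and the stretched format for further filtering.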