Inspect problems in aggregated FastQC reports.
an object of class qc_aggregate.
character vector specifying which element to inspect for problems. Allowed values are one of c("sample", "module"). Default is "sample".
If "sample", shows the samples with the most failed and/or warned modules.
If "module", shows the modules that failed and/or warned in the most samples.
logical value. If TRUE, returns a compact output format; otherwise, returns a stretched format.
character vector containing the names of modules and/or samples of interest. See qc_read for valid module names. If name is specified, a stretched output format is returned by default unless you explicitly set compact = TRUE.
character vector specifying the module status. Allowed values include one or a combination of c("FAIL", "WARN"). If status = "FAIL", only modules with failed status are returned.
qc_problems(), qc_fails(), qc_warns(): return a tibble (data frame) listing the samples that had one or more failed or warned modules. The format and the interpretation of the results depend on the argument 'element', whose value is one of c("sample", "module").
If element = "sample" (default), results are samples with failed and/or warned modules. The results contain the following columns: sample (sample names), nb_problems (the number of modules with problems), module (the name of modules with problems).
If element = "module", results are modules that failed and/or warned in the most samples. The results contain the following columns: module (the name of the module with problems), nb_problems (the number of samples with problems), sample (the names of the samples with problems).
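The relationship between the stretched and compact formats described above can be sketched with a toy table in base R. This is only an illustration under stated assumptions: the data frame, its values, and the variable names `stretched` and `compact` are made up here, and the code is not how fastqcr builds its output.

```r
# Toy "stretched" table: one row per (sample, module) problem.
# The values are invented for illustration; they are not fastqcr output.
stretched <- data.frame(
  sample = c("A", "A", "B"),
  module = c("Per base sequence content", "Per sequence GC content",
             "Per base sequence content"),
  status = c("FAIL", "WARN", "FAIL"),
  stringsAsFactors = FALSE
)

# Collapse to a "compact" table: one row per sample, modules joined by ", ".
compact <- aggregate(module ~ sample, data = stretched,
                     FUN = function(m) paste(m, collapse = ", "))

# nb_problems counts the problem modules recorded for each sample.
counts <- table(stretched$sample)
compact$nb_problems <- as.integer(counts[as.character(compact$sample)])

# Put the samples with the most problems first and order the columns
# as sample, nb_problems, module.
compact <- compact[order(-compact$nb_problems),
                   c("sample", "nb_problems", "module")]
compact
```

Swapping the roles of `sample` and `module` in the same steps gives the layout described for element = "module".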
qc_fails(): Displays which samples had one or more failed modules. Use qc_fails(qc, "module") to see which modules failed in the most samples.
qc_warns(): Displays which samples had one or more warned modules. Use qc_warns(qc, "module") to see which modules warned in the most samples.
qc_problems(): Union of qc_fails() and qc_warns().
Displays which samples or modules failed or warned.
# Demo QC dir
qc.dir <- system.file("fastqc_results", package = "fastqcr")
qc.dir
#> [1] "/private/var/folders/xm/8p6yj4bj6s57n4v_51714lwm0000gp/T/RtmpT6jSz8/temp_libpatha9b37e9f6eab/fastqcr/fastqc_results"
# List of files in the directory
list.files(qc.dir)
#> [1] "S1_fastqc.zip" "S2_fastqc.zip" "S3_fastqc.zip" "S4_fastqc.zip"
#> [5] "S5_fastqc.zip"
# Aggregate the report
qc <- qc_aggregate(qc.dir, progressbar = FALSE)
# Display samples with failed modules
qc_fails(qc)
#> # A tibble: 5 × 3
#>   sample nb_problems module
#>   <chr>        <int> <chr>
#> 1 S3               2 Per base sequence content, Per sequence GC content
#> 2 S4               2 Per base sequence content, Per sequence GC content
#> 3 S1               1 Per base sequence content
#> 4 S2               1 Per base sequence content
#> 5 S5               1 Per base sequence content
qc_fails(qc, compact = FALSE)
#> # A tibble: 7 × 4
#>   sample nb_problems module                    status
#>   <chr>        <int> <chr>                     <chr>
#> 1 S3               2 Per base sequence content FAIL
#> 2 S3               2 Per sequence GC content   FAIL
#> 3 S4               2 Per base sequence content FAIL
#> 4 S4               2 Per sequence GC content   FAIL
#> 5 S1               1 Per base sequence content FAIL
#> 6 S2               1 Per base sequence content FAIL
#> 7 S5               1 Per base sequence content FAIL
# Display samples with warned modules
qc_warns(qc)
#> # A tibble: 5 × 3
#>   sample nb_problems module
#>   <chr>        <int> <chr>
#> 1 S1               2 Per sequence GC content, Sequence Length Distribution
#> 2 S2               2 Per sequence GC content, Sequence Length Distribution
#> 3 S5               2 Per sequence GC content, Sequence Length Distribution
#> 4 S3               1 Sequence Length Distribution
#> 5 S4               1 Sequence Length Distribution
# Modules that failed in the most samples
qc_fails(qc, "module")
#> # A tibble: 2 × 3
#>   module                    nb_problems sample
#>   <chr>                           <int> <chr>
#> 1 Per base sequence content           5 S1, S2, S3, S4, S5
#> 2 Per sequence GC content             2 S3, S4
qc_fails(qc, "module", compact = FALSE)
#> # A tibble: 7 × 4
#>   module                    nb_problems sample status
#>   <chr>                           <int> <chr>  <chr>
#> 1 Per base sequence content           5 S1     FAIL
#> 2 Per base sequence content           5 S2     FAIL
#> 3 Per base sequence content           5 S3     FAIL
#> 4 Per base sequence content           5 S4     FAIL
#> 5 Per base sequence content           5 S5     FAIL
#> 6 Per sequence GC content             2 S3     FAIL
#> 7 Per sequence GC content             2 S4     FAIL
# Specify a module of interest
qc_problems(qc, "module", name = "Per sequence GC content")
#> # A tibble: 5 × 4
#>   module                  nb_problems sample status
#>   <chr>                         <int> <chr>  <chr>
#> 1 Per sequence GC content           5 S3     FAIL
#> 2 Per sequence GC content           5 S4     FAIL
#> 3 Per sequence GC content           5 S1     WARN
#> 4 Per sequence GC content           5 S2     WARN
#> 5 Per sequence GC content           5 S5     WARN
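
The status and name arguments documented above can be combined with element in the same way. The following is a minimal sketch, assuming the fastqcr package is installed and ships the demo reports used in the examples; the variable names `warn_long` and `s1` are illustrative, and no output is shown because it was not taken from a real run.

```r
# Assumes fastqcr and its bundled demo FastQC reports are available.
library(fastqcr)

qc.dir <- system.file("fastqc_results", package = "fastqcr")
qc <- qc_aggregate(qc.dir, progressbar = FALSE)

# Restrict qc_problems() to warnings only, in the stretched format
# (one row per sample/module pair, with a status column).
warn_long <- qc_problems(qc, "sample", status = "WARN", compact = FALSE)
warn_long

# Restrict the result to one sample of interest; because name is given,
# the stretched format is returned unless compact = TRUE is set.
s1 <- qc_problems(qc, "sample", name = "S1")
s1
```

Based on the argument descriptions, the first call is expected to list the same problems as qc_warns(qc, compact = FALSE), and the second should list only the failed and warned modules of sample S1.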