Inspect problems in aggregated FastQC reports.

qc_fails(object, element = c("sample", "module"), compact = TRUE)

qc_warns(object, element = c("sample", "module"), compact = TRUE)

qc_problems(
  object,
  element = c("sample", "module"),
  name = NULL,
  status = c("FAIL", "WARN"),
  compact = TRUE
)

Arguments

object

an object of class qc_aggregate.

element

character string specifying which element to inspect for problems. Allowed values are "sample" or "module". Default is "sample".

  • If "sample", shows the samples with the most failed and/or warned modules.

  • If "module", shows the modules that failed and/or warned in the most samples.

compact

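logical value. If TRUE, returns a compact output format (one row per sample or module, with the affected modules or samples collapsed into a single comma-separated string); otherwise, returns a stretched format (one row per sample-module pair, with an additional status column).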

name

character vector containing the names of modules and/or samples of interest. See qc_read for valid module names. If name is specified, a stretched output format is returned by default unless you explicitly set compact = TRUE.

status

character vector specifying the module status. Allowed values are one or both of c("FAIL", "WARN"). If status = "FAIL", only modules with failed status are returned.

Value

  • qc_problems(), qc_fails(), qc_warns(): return a tibble (data frame) listing the samples or modules that had at least one failure or warning. The format and the interpretation of the results depend on the argument element, whose value is one of c("sample", "module").

    • If element = "sample" (default), results are samples with failed and/or warned modules. The results contain the following columns: sample (sample names), nb_problems (the number of modules with problems), module (the names of the modules with problems).

    • If element = "module", results are modules that failed and/or warned in the most samples. The results contain the following columns: module (the name of the module with problems), nb_problems (the number of samples with problems), sample (the names of the samples with problems).
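The relationship between the stretched and compact formats can be illustrated with a small base R sketch. This is not fastqcr's implementation: collapse_problems() is a hypothetical helper written for illustration only, and the toy table simply copies the sample, module and status values shown for qc_fails(qc, compact = FALSE) in the Examples.

```r
# Toy stretched table: one row per sample-module pair
# (values copied from the qc_fails(qc, compact = FALSE) example output).
stretched <- data.frame(
  sample = c("S3", "S3", "S4", "S4", "S1", "S2", "S5"),
  module = c("Per base sequence content", "Per sequence GC content",
             "Per base sequence content", "Per sequence GC content",
             "Per base sequence content", "Per base sequence content",
             "Per base sequence content"),
  status = "FAIL",
  stringsAsFactors = FALSE
)

# Hypothetical helper (NOT part of fastqcr): collapse the stretched table
# to one row per sample or per module, counting the problems in each group.
collapse_problems <- function(x, by = c("sample", "module")) {
  by <- match.arg(by)
  other <- setdiff(c("sample", "module"), by)
  # Group the "other" column by the element of interest
  groups <- split(x[[other]], x[[by]])
  out <- data.frame(
    key = names(groups),
    nb_problems = unname(lengths(groups)),
    members = unname(vapply(
      groups,
      function(v) paste(sort(v), collapse = ", "),
      character(1)
    )),
    stringsAsFactors = FALSE
  )
  names(out) <- c(by, "nb_problems", other)
  # Most problems first; ties are ordered alphabetically by name
  out <- out[order(-out$nb_problems, out[[by]]), ]
  rownames(out) <- NULL
  out
}

by_sample <- collapse_problems(stretched, "sample")
by_module <- collapse_problems(stretched, "module")
print(by_sample)
print(by_module)
```

With element = "sample" the counts are modules per sample; with element = "module" they are samples per module, which is why nb_problems has a different meaning in the two layouts.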

Functions

  • qc_fails(): Displays which samples had one or more failed modules. Use qc_fails(qc, "module") to see which modules failed in the most samples.

  • qc_warns(): Displays which samples had one or more warned modules. Use qc_warns(qc, "module") to see which modules warned in the most samples.

  • qc_problems(): Union of qc_fails() and qc_warns(). Displays which samples or modules failed or warned.

Examples

# Demo QC dir
qc.dir <- system.file("fastqc_results", package = "fastqcr")
qc.dir
#> [1] "/private/var/folders/xm/8p6yj4bj6s57n4v_51714lwm0000gp/T/RtmpT6jSz8/temp_libpatha9b37e9f6eab/fastqcr/fastqc_results"
# List of files in the directory
list.files(qc.dir)
#> [1] "S1_fastqc.zip" "S2_fastqc.zip" "S3_fastqc.zip" "S4_fastqc.zip"
#> [5] "S5_fastqc.zip"

# Aggregate the report
qc <- qc_aggregate(qc.dir, progressbar = FALSE)

# Display samples with failed modules
qc_fails(qc)
#> # A tibble: 5 × 3
#>   sample nb_problems module                                            
#>   <chr>        <int> <chr>                                             
#> 1 S3               2 Per base sequence content, Per sequence GC content
#> 2 S4               2 Per base sequence content, Per sequence GC content
#> 3 S1               1 Per base sequence content                         
#> 4 S2               1 Per base sequence content                         
#> 5 S5               1 Per base sequence content                         
qc_fails(qc, compact = FALSE)
#> # A tibble: 7 × 4
#>   sample nb_problems module                    status
#>   <chr>        <int> <chr>                     <chr> 
#> 1 S3               2 Per base sequence content FAIL  
#> 2 S3               2 Per sequence GC content   FAIL  
#> 3 S4               2 Per base sequence content FAIL  
#> 4 S4               2 Per sequence GC content   FAIL  
#> 5 S1               1 Per base sequence content FAIL  
#> 6 S2               1 Per base sequence content FAIL  
#> 7 S5               1 Per base sequence content FAIL  

# Display samples with warned modules
qc_warns(qc)
#> # A tibble: 5 × 3
#>   sample nb_problems module                                               
#>   <chr>        <int> <chr>                                                
#> 1 S1               2 Per sequence GC content, Sequence Length Distribution
#> 2 S2               2 Per sequence GC content, Sequence Length Distribution
#> 3 S5               2 Per sequence GC content, Sequence Length Distribution
#> 4 S3               1 Sequence Length Distribution                         
#> 5 S4               1 Sequence Length Distribution                         

# Module failed in the most samples
qc_fails(qc, "module")
#> # A tibble: 2 × 3
#>   module                    nb_problems sample            
#>   <chr>                           <int> <chr>             
#> 1 Per base sequence content           5 S1, S2, S3, S4, S5
#> 2 Per sequence GC content             2 S3, S4            
qc_fails(qc, "module", compact = FALSE)
#> # A tibble: 7 × 4
#>   module                    nb_problems sample status
#>   <chr>                           <int> <chr>  <chr> 
#> 1 Per base sequence content           5 S1     FAIL  
#> 2 Per base sequence content           5 S2     FAIL  
#> 3 Per base sequence content           5 S3     FAIL  
#> 4 Per base sequence content           5 S4     FAIL  
#> 5 Per base sequence content           5 S5     FAIL  
#> 6 Per sequence GC content             2 S3     FAIL  
#> 7 Per sequence GC content             2 S4     FAIL  

# Specify a module of interest
qc_problems(qc, "module", name = "Per sequence GC content")
#> # A tibble: 5 × 4
#>   module                  nb_problems sample status
#>   <chr>                         <int> <chr>  <chr> 
#> 1 Per sequence GC content           5 S3     FAIL  
#> 2 Per sequence GC content           5 S4     FAIL  
#> 3 Per sequence GC content           5 S1     WARN  
#> 4 Per sequence GC content           5 S2     WARN  
#> 5 Per sequence GC content           5 S5     WARN
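
The examples above do not exercise the status argument, nor the combination of name with compact = TRUE. The following sketch shows both, using only the arguments documented on this page. It assumes the fastqcr package and its bundled demo reports are installed; no output is shown because it was not run against the demo data here.

```r
# Assumption: fastqcr and its demo FastQC reports are available locally.
warn_by_module <- NULL
gc_compact <- NULL

if (requireNamespace("fastqcr", quietly = TRUE)) {
  qc.dir <- system.file("fastqc_results", package = "fastqcr")
  qc <- fastqcr::qc_aggregate(qc.dir, progressbar = FALSE)

  # Restrict qc_problems() to warnings only, grouped by module
  warn_by_module <- fastqcr::qc_problems(qc, "module", status = "WARN")
  print(warn_by_module)

  # Request the compact layout explicitly when `name` is given
  # (a stretched layout is the default in that case)
  gc_compact <- fastqcr::qc_problems(
    qc, "module",
    name = "Per sequence GC content",
    compact = TRUE
  )
  print(gc_compact)
}
```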