Sample Loader Module

API reference for sample discovery and file loading

1 Sample Loader Module

1.1 Overview

The sample_loader.R module handles sample discovery and multi-file loading for IMPACT-VIS. It automatically discovers samples by scanning for GDS files and loads associated SV and CNV files with graceful partial-failure tolerance.

Location: app/logic/sample_loader.R

1.2 Exported Functions

1.2.1 get_sample_names()

Discovers all available samples in a directory by scanning for *_SNV_IMPACT.gds files.

Parameters:
  - dir (character): Directory path to scan (e.g., “app/data”)

Returns: Character vector of sample IDs

Details: Searches the directory for files matching the pattern *_SNV_IMPACT.gds and extracts each sample ID by removing the _SNV_IMPACT.gds suffix. The UI uses the result to populate the sample selector dropdown.

Example:

box::use(app/logic/sample_loader[get_sample_names])

samples <- get_sample_names("app/data")
# Returns: c("1113506", "1142012", "1357986", ...)
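
The discovery step described above can be sketched with base R alone. This is a minimal illustration, not the module's actual code: discover_samples() is a hypothetical name, and the sketch assumes that sample IDs are simply whatever precedes the suffix.

```r
# Sketch only: discover_samples() is a hypothetical stand-in for the
# discovery logic, not the implementation in sample_loader.R.
discover_samples <- function(dir) {
  suffix <- "_SNV_IMPACT\\.gds$"
  # The trailing `$` keeps companion files such as
  # "<id>_SNV_IMPACT.gds_variant_states.rds" out of the match.
  files <- list.files(dir, pattern = suffix)
  sort(unique(sub(suffix, "", files)))
}

# Usage against a throwaway directory of empty files
demo_dir <- file.path(tempdir(), "discover_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c(
  "1113506_SNV_IMPACT.gds",
  "1113506_SNV_IMPACT.gds_variant_states.rds",
  "1142012_SNV_IMPACT.gds",
  "1142012_SV_IMPACT.tsv"
)))
samples <- discover_samples(demo_dir)
print(samples)
```

Only the two .gds files contribute a sample ID here; the SV file and the state file are ignored by the pattern.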

1.2.2 expected_sample_paths()

Constructs expected file paths for a given sample ID based on naming conventions.

Parameters:
  - sample (character): Sample ID (e.g., “1113506”)
  - dir (character): Directory path (e.g., “app/data”)

Returns: Named list of paths:
  - $snv: Path to the GDS file, or NULL if the file does not exist
  - $sv: Path to the SV TSV file, or NULL if the file does not exist
  - $cnv: Path to the CNV TXT file, or NULL if the file does not exist
  - $gda: Path to GeneList.txt, or NULL if the file does not exist

Details: Builds each path from the naming conventions {sample}_SNV_IMPACT.gds, {sample}_SV_IMPACT.tsv, and {sample}_CNV_IMPACT.txt. Any path that does not exist on the filesystem is returned as NULL, which lets callers handle partial datasets gracefully.

Example:

box::use(app/logic/sample_loader[expected_sample_paths])

paths <- expected_sample_paths("1113506", "app/data")
# Returns list with paths like:
# $snv: "app/data/1113506_SNV_IMPACT.gds"
# $sv: "app/data/1113506_SV_IMPACT.tsv"
# $cnv: "app/data/1113506_CNV_IMPACT.txt"
# $gda: NULL
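
One way such a lookup could work is shown below. It is a sketch under assumptions: build_sample_paths() is a hypothetical name, and it covers only the three per-sample naming conventions listed above. The $gda entry is left out because this section does not state where GeneList.txt is expected to live.

```r
# Sketch only: build_sample_paths() is hypothetical and handles just the
# three documented per-sample file names.
build_sample_paths <- function(sample, dir) {
  candidates <- list(
    snv = file.path(dir, paste0(sample, "_SNV_IMPACT.gds")),
    sv  = file.path(dir, paste0(sample, "_SV_IMPACT.tsv")),
    cnv = file.path(dir, paste0(sample, "_CNV_IMPACT.txt"))
  )
  # lapply() keeps NULL results as list elements, so every name is
  # present in the output even when its file is missing.
  lapply(candidates, function(path) if (file.exists(path)) path else NULL)
}

# Usage: only the GDS file exists for this sample
demo_dir <- file.path(tempdir(), "paths_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "1113506_SNV_IMPACT.gds"))
paths <- build_sample_paths("1113506", demo_dir)
str(paths)
```

Callers can then test each entry with is.null() instead of calling file.exists() again.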

1.2.3 load_sample_files()

Loads all available data files for a sample (GDS, SV, CNV) with partial-failure tolerance.

Parameters:
  - sample (character): Sample ID
  - dir (character): Directory path (default: “app/data”)

Returns: List with:
  - $gds: Loaded GDS data (data.frame) or NULL
  - $sv: Loaded SV data (data.frame) or NULL
  - $cnv: Loaded CNV data (data.frame) or NULL
  - $errors: Character vector of error messages
  - $warnings: Character vector of warnings
  - $any_data_loaded: Logical; TRUE if at least one file loaded successfully
  - $partial_failure: Logical; TRUE if some files loaded but others failed

Details: Attempts to load each file type independently, so a failure in one does not prevent the others from loading. The structured result lets the UI:
  - Display successfully loaded data
  - Show warnings or errors for failed files
  - Keep working with partial data (e.g., show SNV but not SV if the SV file is missing)

Example:

box::use(app/logic/sample_loader[load_sample_files])

result <- load_sample_files("1113506", "app/data")

if (result$any_data_loaded) {
  # Display SNV data if loaded
  if (!is.null(result$gds)) {
    cat("Loaded", nrow(result$gds), "SNVs\n")
  }
  
  # Report the files that failed while others loaded
  if (result$partial_failure) {
    cat("Some files failed to load:\n", paste(result$errors, collapse = "\n"), "\n", sep = "")
  }
} else {
  cat("No data loaded. Errors:\n", paste(result$errors, collapse = "\n"), "\n", sep = "")
}
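
The independent-loading behaviour can be illustrated with tryCatch() around each reader. The sketch below is an assumption about the general technique, not the module's implementation: load_independently() and the reader functions passed to it are hypothetical, and the $warnings field is omitted for brevity.

```r
# Sketch only: load_independently() is hypothetical. `readers` is a named
# list of zero-argument functions, one per file type.
load_independently <- function(readers) {
  data <- list()
  errors <- character(0)
  for (type in names(readers)) {
    # Each reader runs in its own tryCatch(), so one failure cannot
    # abort the remaining loads.
    outcome <- tryCatch(
      list(value = readers[[type]](), error = NULL),
      error = function(e) list(value = NULL, error = conditionMessage(e))
    )
    data[type] <- list(outcome$value)  # single bracket keeps NULL entries
    if (!is.null(outcome$error)) {
      errors <- c(errors, paste0(type, ": ", outcome$error))
    }
  }
  loaded <- !vapply(data, is.null, logical(1))
  c(data, list(
    errors = errors,
    any_data_loaded = any(loaded),
    partial_failure = any(loaded) && length(errors) > 0
  ))
}

# Usage: the SV reader fails, the other two succeed
result <- load_independently(list(
  gds = function() data.frame(chr = c("1", "2"), pos = c(100L, 200L)),
  sv  = function() stop("SV file not found"),
  cnv = function() data.frame(chr = "3", log2 = 0.8)
))
str(result[c("errors", "any_data_loaded", "partial_failure")])
```

Collecting messages in a vector instead of stopping at the first error is what allows the caller to show both the loaded data and the list of failures.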

1.2.4 close_gds_handle()

Safely closes an open GDS file handle and clears reactive references.

Parameters:
  - gds_handle (reactive): Reactive object holding the GDS handle
  - gds_path (reactive): Reactive object holding the GDS path

Returns: NULL (invisibly)

Details: Used internally for cleanup when switching samples or closing the app. Closes the SeqArray GDS handle without raising an error, then sets both reactive values to NULL.

Example:

box::use(app/logic/sample_loader[close_gds_handle])

# In a Shiny module
close_gds_handle(gds_handle_reactive, gds_path_reactive)
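
The close-then-clear pattern can be sketched without Shiny or SeqArray. Everything named here is hypothetical: make_holder() imitates the call interface of a Shiny reactiveVal (call with no argument to read, with an argument to set), and close_fn stands in for whatever call actually closes the GDS handle in the real module.

```r
# Sketch only: make_holder() and safe_close() are hypothetical helpers.
make_holder <- function(value = NULL) {
  function(new_value) {
    if (missing(new_value)) {
      value
    } else {
      value <<- new_value
      invisible(NULL)
    }
  }
}

safe_close <- function(handle, path, close_fn) {
  current <- handle()
  if (!is.null(current)) {
    # A failed close is reported as a message and never rethrown
    tryCatch(
      close_fn(current),
      error = function(e) message("Could not close handle: ", conditionMessage(e))
    )
  }
  # Clear both references whether or not the close succeeded
  handle(NULL)
  path(NULL)
  invisible(NULL)
}

# Usage: even a close function that throws leaves both holders cleared
handle <- make_holder("open-handle")
path <- make_holder("app/data/1113506_SNV_IMPACT.gds")
safe_close(handle, path, close_fn = function(h) stop("already closed"))
```

Clearing the references after the attempt, rather than only on success, avoids leaving a stale handle behind when the file was already closed.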

1.3 Implementation Details

1.3.1 Sample Discovery Flow

  1. User opens app → get_sample_names("app/data") → Find all available samples
  2. User selects sample from dropdown
  3. expected_sample_paths(sample, "app/data") → Get expected file locations
  4. load_sample_files(sample, "app/data") → Load all available file types
  5. Display loaded data (partial or complete)

1.3.2 File Naming Requirements

For sample “1113506”, the module expects these files:

app/data/1113506_SNV_IMPACT.gds           (required for sample discovery)
app/data/1113506_SV_IMPACT.tsv            (optional)
app/data/1113506_CNV_IMPACT.txt           (optional)
app/data/1113506_SNV_IMPACT.gds_variant_states.rds (optional, state persistence)
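
To check which file names follow these conventions, a regular expression over the names is enough. The sketch below is illustrative only: classify_impact_files() is a hypothetical helper that is not part of the module, and it deliberately ignores the optional state file.

```r
# Sketch only: classify_impact_files() is a hypothetical helper.
classify_impact_files <- function(filenames) {
  pattern <- "^(.+)_(SNV_IMPACT\\.gds|SV_IMPACT\\.tsv|CNV_IMPACT\\.txt)$"
  hits <- filenames[grepl(pattern, filenames)]
  data.frame(
    file   = hits,
    sample = sub(pattern, "\\1", hits),
    # Reduce e.g. "SNV_IMPACT.gds" to the bare type label "SNV"
    type   = sub("_IMPACT.*$", "", sub(pattern, "\\2", hits)),
    stringsAsFactors = FALSE
  )
}

# Usage: the state file and the unrelated file are not classified
inventory <- classify_impact_files(c(
  "1113506_SNV_IMPACT.gds",
  "1113506_SV_IMPACT.tsv",
  "1113506_CNV_IMPACT.txt",
  "1113506_SNV_IMPACT.gds_variant_states.rds",
  "notes.txt"
))
print(inventory)
```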

1.3.3 Partial Failure Tolerance

The module is designed to handle incomplete datasets:
  - Missing SV file → SNV and CNV still load
  - Missing CNV file → SNV and SV still load
  - Missing SV and CNV → SNV alone displays
  - All files missing → Error returned with specific guidance

1.4 Common Workflows

1.4.1 Workflow 1: Populate Sample Dropdown

box::use(
  shiny[selectInput],
  app/logic/sample_loader[get_sample_names]
)

# In UI
selectInput(
  "sample_selector",
  "Select Sample:",
  choices = get_sample_names("app/data")
)

1.4.2 Workflow 2: Load Sample on Selection Change

box::use(
  shiny[observeEvent],
  app/logic/sample_loader[load_sample_files],
  app/logic/error_handler[show_error, show_warning]
)

# In server
observeEvent(input$sample_selector, {
  sample_data <- load_sample_files(input$sample_selector, "app/data")
  
  if (sample_data$any_data_loaded) {
    # Update reactive values with loaded data
    snv_reactive(sample_data$gds)
    sv_reactive(sample_data$sv)
    cnv_reactive(sample_data$cnv)
    
    if (sample_data$partial_failure) {
      show_warning(session, "Partial Data Load",
        paste("Some files failed to load:", 
              paste(sample_data$errors, collapse="; ")))
    }
  } else {
    show_error(session, "Load Failed",
      "Could not load any data for this sample",
      paste(sample_data$errors, collapse="\n"))
  }
})

1.5 See Also