This function preprocesses GWAS summary statistics to prepare them for use with MRcare. It standardizes column names, performs quality control, and saves the processed data to a file for later use.

Usage

preprocess_gwas_data(
  gwas_data,
  output_file,
  data_type = c("exposure", "outcome"),
  A1 = NULL,
  A2 = NULL,
  SNP = NULL,
  CHR = NULL,
  POS = NULL,
  BETA = NULL,
  SE = NULL,
  Pval = NULL,
  MAF = NULL,
  N = NULL,
  Zscore = NULL,
  Ninput = NULL,
  maf_filter = 0.01,
  remove_ambiguous = TRUE,
  verbose = TRUE
)

Arguments

gwas_data

Path to a GWAS summary statistics file or a data frame containing GWAS summary statistics

output_file

Path where the processed GWAS data will be saved

data_type

Either "exposure" or "outcome"; selects which processing logic is applied to the data

A1

Column name for effect allele (default: NULL, will try to detect)

A2

Column name for other allele (default: NULL, will try to detect)

SNP

Column name for SNP ID (default: NULL, will try to detect)

CHR

Column name for chromosome (default: NULL, will try to detect)

POS

Column name for position (default: NULL, will try to detect)

BETA

Column name for effect size (default: NULL, will try to detect)

SE

Column name for standard error (default: NULL, will try to detect)

Pval

Column name for p-value (default: NULL, will try to detect)

MAF

Column name for minor allele frequency (default: NULL, will try to detect)

N

Column name for sample size (default: NULL, will try to detect)

Zscore

Column name for Z-score (default: NULL, will try to detect)

Ninput

Sample size to use when the data does not contain a sample size column (default: NULL)

maf_filter

Minimum MAF threshold to include SNPs (default: 0.01)

remove_ambiguous

Whether to remove strand-ambiguous SNPs (default: TRUE)

verbose

Whether to print progress messages (default: TRUE)
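
The two quality-control arguments, maf_filter and remove_ambiguous, can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy data frame, the inclusive ">=" comparison, and the definition of strand-ambiguous SNPs as A/T or C/G allele pairs are assumptions made for illustration only.

```r
# Toy summary statistics (hypothetical values); column names mirror the
# argument names documented above.
gwas <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  A1  = c("A", "A", "C", "G"),
  A2  = c("G", "T", "G", "T"),
  MAF = c(0.20, 0.30, 0.05, 0.005),
  stringsAsFactors = FALSE
)

# Assumed behaviour of maf_filter: keep SNPs at or above the threshold.
maf_filter <- 0.01
keep_maf <- gwas$MAF >= maf_filter

# Assumed behaviour of remove_ambiguous: a SNP is strand-ambiguous when its
# two alleles are complementary bases (A/T or C/G), so the strand cannot be
# inferred from the alleles alone.
allele_pair <- paste0(toupper(gwas$A1), toupper(gwas$A2))
is_ambiguous <- allele_pair %in% c("AT", "TA", "CG", "GC")

# Apply both filters together.
filtered <- gwas[keep_maf & !is_ambiguous, ]
print(filtered)
```

In this toy data, rs2 (A/T) and rs3 (C/G) are dropped as strand-ambiguous and rs4 is dropped for low MAF, so only rs1 remains.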

Value

Path to the processed GWAS data file
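
Because the function returns a path rather than a data frame, the processed data has to be read back from disk before use. The sketch below assumes the file is an RDS file, which the ".rds" paths in the Examples suggest but this page does not state. It uses a hypothetical stand-in data frame and a temporary file so that it runs without MRcare installed.

```r
# Stand-in for processed GWAS data (hypothetical values and columns).
processed <- data.frame(
  SNP  = c("rs1", "rs2"),
  BETA = c(0.10, -0.20),
  SE   = c(0.02, 0.05),
  stringsAsFactors = FALSE
)

# Stand-in for the path returned by preprocess_gwas_data().
output_file <- tempfile(fileext = ".rds")
saveRDS(processed, output_file)

# Assuming RDS format, the returned path can be read back with readRDS().
reloaded <- readRDS(output_file)
str(reloaded)
```

If the file is written in a different format, a matching reader would be needed in place of readRDS().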

Examples

if (FALSE) { # \dontrun{
# Process exposure data
exposure_processed <- preprocess_gwas_data(
  gwas_data = "path/to/raw_exposure_gwas.txt",
  output_file = "path/to/processed_exposure_gwas.rds",
  data_type = "exposure",
  Ninput = 10000
)

# Process outcome data
outcome_processed <- preprocess_gwas_data(
  gwas_data = "path/to/raw_outcome_gwas.txt",
  output_file = "path/to/processed_outcome_gwas.rds",
  data_type = "outcome",
  Ninput = 15000
)
} # }