---
title: "Getting Started with meddra.read"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with meddra.read}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(meddra.read)
```

## What is MedDRA?

MedDRA (Medical Dictionary for Regulatory Activities) is a standardized medical terminology used in clinical trials and regulatory submissions to classify adverse events. It is organized as a five-level hierarchy:

| Level | Abbreviation | Description |
|-------|-------------|-------------|
| 1 (broadest) | SOC | System Organ Class |
| 2 | HLGT | High Level Group Term |
| 3 | HLT | High Level Term |
| 4 | PT | Preferred Term |
| 5 (most specific) | LLT | Lowest Level Term |

MedDRA data is proprietary and requires a license from the [MedDRA MSSO](https://www.meddra.org/). This package helps you load and work with your licensed MedDRA data. The examples below use a small, clearly fictional dataset bundled with the package for illustration purposes.

## MedDRA Distribution File Structure

When you download a licensed MedDRA release, it contains two subdirectories:

```
my_meddra_release/
├── MedAscii/                  # Main MedDRA data files (.asc)
│   ├── soc.asc                # System Organ Classes
│   ├── hlgt.asc               # High Level Group Terms
│   ├── hlt.asc                # High Level Terms
│   ├── pt.asc                 # Preferred Terms
│   ├── llt.asc                # Lowest Level Terms
│   ├── hlt_pt.asc             # HLT to PT linking table
│   ├── hlgt_hlt.asc           # HLGT to HLT linking table
│   ├── soc_hlgt.asc           # SOC to HLGT linking table
│   ├── mdhier.asc             # Denormalized hierarchy
│   ├── meddra_release.asc     # Version information
│   └── ...                    # Additional files (SMQ, specialties, etc.)
└── SeqAscii/                  # Sequential update files (.seq)
    ├── soc.seq
    ├── pt.seq
    └── ...
```

All files use the `$` character as a field separator.

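
To make the `$`-separated layout concrete, here is a minimal base R sketch that writes a tiny file and reads it back with `read.table()`. The rows, the three-field layout, and the trailing `$` at the end of each record are assumptions made for illustration; this is not necessarily how `read_meddra()` parses files, and real release files may carry different fields.

```{r dollar-sketch}
# Hypothetical two-row file in a `$`-separated layout (fictional codes/names).
# Assumption: each record ends with a trailing `$`.
tmp <- tempfile(fileext = ".asc")
writeLines(
  c("90000001$Example Nervous System Disorders$Nerv$",
    "90000002$Example Cardiac Disorders$Card$"),
  tmp
)

soc_sketch <- read.table(
  tmp,
  sep = "$",
  quote = "",              # do not treat quotes/apostrophes in terms specially
  comment.char = "",       # do not treat `#` as a comment marker
  header = FALSE,
  colClasses = "character" # keep codes as text
)

# A trailing separator produces an empty final column; drop it if present.
last <- soc_sketch[[ncol(soc_sketch)]]
if (all(is.na(last) | last == "")) {
  soc_sketch <- soc_sketch[-ncol(soc_sketch)]
}

# Column names chosen for this sketch only.
names(soc_sketch) <- c("soc_code", "soc_name", "soc_abbrev")
soc_sketch
```

Reading every column as character avoids numeric conversion of the codes, and disabling quote and comment handling keeps punctuation inside term names from splitting or truncating a record.
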
## Reading MedDRA Data

Use `read_meddra()` pointing to the parent directory that contains `MedAscii` and `SeqAscii` (or `MedSeq`) subdirectories. It returns a named list of data.frames, one per file.

```{r read}
# For your licensed data, replace this path with your actual MedDRA directory:
# example_dir <- "/path/to/your/meddra/release"

# The package includes a small fictional dataset for illustration:
example_dir <- system.file("example_meddra", package = "meddra.read")
meddra_raw <- read_meddra(example_dir)
```

The result is a named list with one data.frame per MedDRA file:

```{r list-names}
names(meddra_raw)
```

Each data.frame corresponds to one of the MedDRA source files. For example, the System Organ Class data:

```{r soc}
meddra_raw$soc.asc
```

The Preferred Terms:

```{r pt}
meddra_raw$pt.asc
```

The Lowest Level Terms (note: `llt_currency = "Y"` means the term is current; `"N"` means it is a non-current synonym):

```{r llt}
meddra_raw$llt.asc
```

## Joining into a Single Data Frame

`join_meddra()` merges all the hierarchy tables into a single flat data.frame, making it easy to look up or filter by any level of the hierarchy:

```{r join}
meddra_df <- join_meddra(meddra_raw)
meddra_df
```

The resulting data.frame has one row per LLT (Lowest Level Term) and includes all parent hierarchy levels.

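
The idea of flattening a hierarchy can be sketched with base R's `merge()` on two hand-made tables. This is only an illustration of the technique under stated assumptions: the rows are invented, only two levels are shown, and it is not necessarily how `join_meddra()` is implemented.

```{r merge-sketch}
# Fictional, hand-made tables for illustration only.
pt_sketch <- data.frame(
  pt_code = c("90000001", "90000002"),
  pt_name = c("Example Headache", "Example Dizziness"),
  stringsAsFactors = FALSE
)
llt_sketch <- data.frame(
  llt_code = c("90000101", "90000102", "90000103"),
  llt_name = c("Example Headache", "Example Head Pain", "Example Dizziness"),
  pt_code  = c("90000001", "90000001", "90000002"),
  stringsAsFactors = FALSE
)

# Attach the parent PT to each LLT via the shared `pt_code` key.
# In this sketch every LLT points to exactly one PT, so the result
# keeps one row per LLT.
flat_sketch <- merge(llt_sketch, pt_sketch, by = "pt_code", all.x = TRUE)
flat_sketch[, c("llt_code", "llt_name", "pt_code", "pt_name")]
```

The `meddra_df` returned by `join_meddra()` above carries every hierarchy level, not just the two shown in this sketch.
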
The columns are:

| Column | Description |
|--------|-------------|
| `soc_code`, `soc_name`, `soc_abbrev` | System Organ Class |
| `hlgt_code`, `hlgt_name` | High Level Group Term |
| `hlt_code`, `hlt_name` | High Level Term |
| `pt_code`, `pt_name`, `pt_soc_code` | Preferred Term |
| `llt_code`, `llt_name`, `llt_currency` | Lowest Level Term |
| `primary_soc_fg` | `"Y"` if this SOC is the primary (preferred) SOC for the PT |

## Common Use Cases

### Filter by System Organ Class

To work with terms from a specific SOC:

```{r filter-soc}
subset(meddra_df, soc_name == "Example Nervous System Disorders")
```

### Find all LLTs for a Preferred Term

To find all Lowest Level Terms (including non-current synonyms) for a given PT:

```{r find-llts}
subset(meddra_df, pt_name == "Example Headache",
       select = c(llt_code, llt_name, llt_currency))
```

### Keep only current LLTs

Non-current LLTs (`llt_currency = "N"`) are historical synonyms. In most analyses you will want to keep only current terms:

```{r current-only}
current <- subset(meddra_df, llt_currency == "Y")
current[, c("llt_name", "pt_name", "soc_abbrev")]
```

### Check the MedDRA version

```{r version}
meddra_raw$meddra_release.asc
```

## Working with SMQ Data

Standardized MedDRA Queries (SMQs) are pre-defined sets of terms used to search for adverse events. The SMQ data is available in `smq_list.asc` and `smq_content.asc`:

```{r smq}
meddra_raw$smq_list.asc
meddra_raw$smq_content.asc
```
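
A common next step is to keep only the terms that belong to one query. The sketch below shows the idea with base R on invented data. The column names `smq_code` and `term_code`, along with every code and name, are assumptions for illustration; check the columns of `meddra_raw$smq_content.asc` in your own data before adapting it.

```{r smq-sketch}
# Hypothetical stand-ins; the column names in your data may differ.
smq_content_sketch <- data.frame(
  smq_code  = c("80000001", "80000001", "80000002"),
  term_code = c("90000001", "90000002", "90000003"),
  stringsAsFactors = FALSE
)
terms_sketch <- data.frame(
  pt_code = c("90000001", "90000002", "90000003", "90000004"),
  pt_name = c("Example Headache", "Example Dizziness",
              "Example Palpitations", "Example Rash"),
  stringsAsFactors = FALSE
)

# Collect the term codes listed under one query, then keep the matching terms.
wanted <- smq_content_sketch$term_code[smq_content_sketch$smq_code == "80000001"]
smq_terms <- terms_sketch[terms_sketch$pt_code %in% wanted, ]
smq_terms
```

Matching with `%in%` keeps the filter independent of row order in either table.
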