Obtain US Census-adjusted NMDP Allelic Frequencies by Region

Usage

obtain_census_adjusted_nmdp_hla_frequencies_by_region(
  in_region,
  region_level = "state"
)

Arguments

in_region: A string: either 'us', 'all states' (not useful), or a valid state name such as 'Alaska'
region_level: A string, either 'state' or 'county'; the default is 'state'

Value

Data frame of US Census-adjusted HLA allelic frequencies

loci - the HLA locus of the allele (A, B, and C appear in the example output)
allele - the allele in question
is_g - flags the trailing 'g' on some allele names, which indicates that the name is inclusive of multiple alleles (see Maiers2007 for details)
nmdp_race_code - value used by NMDP to denote races in their studies
nmdp_af - allelic frequency measured by NMDP within a race
nmdp_calc_gf - genotypic frequency calculated directly from the allelic frequency using Hardy-Weinberg
region - used in downstream tables to denote regions; for this function it is always 'us'
fips - FIPS code of the census region ('02' for Alaska in the example output)
census_region - the region used to look up US 2020 Census data; if a state is chosen, it is recorded here
total_2020_pop - total population in a given region
us_2020_percent_pop - proportion of the overall population with the given NMDP race code in a given region. Should sum close to 1 for a specific HLA allele
us_2020_nmdp_gf - the US Census race-adjusted genotypic frequency value; used in downstream applications
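
The relationship between nmdp_af and nmdp_calc_gf can be sketched as follows. This is a minimal illustration, not the package's own code: it assumes the genotypic frequency is the Hardy-Weinberg probability of carrying at least one copy of the allele, 1 - (1 - af)^2, which agrees with the rounded values in the example output. The helper name hw_carrier_frequency is hypothetical.

```r
# Hypothetical helper (not part of the package): probability of carrying at
# least one copy of an allele with frequency `af` under Hardy-Weinberg
# equilibrium, p^2 + 2 * p * (1 - p), which simplifies to 1 - (1 - p)^2.
hw_carrier_frequency <- function(af) {
  stopifnot(is.numeric(af), all(af >= 0 & af <= 1))
  1 - (1 - af)^2
}

# Allelic frequencies as displayed (rounded) in the example output
nmdp_af <- c("A*02:01" = 0.272, "A*01:01" = 0.159, "C*07:01" = 0.155)

# Compare against the nmdp_calc_gf column shown for the same rows
gf <- hw_carrier_frequency(nmdp_af)
print(round(gf, 3))
```

Because the inputs are the rounded values printed in the tibble, the results may differ from nmdp_calc_gf in the third decimal place.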

Note

Relies on preprocessing performed in 'preprocessing/nmdpFrequencies.Rmd'.

Examples

obtain_census_adjusted_nmdp_hla_frequencies_by_region(in_region = 'Alaska')
#> # A tibble: 11,705 × 12
#>    region loci  allele   is_g nmdp_race_code nmdp_af nmdp_calc_gf fips 
#>    <chr>  <chr> <chr>   <dbl> <chr>            <dbl>        <dbl> <chr>
#>  1 us     A     A*02:01     1 CAU             0.272         0.469 02   
#>  2 us     A     A*01:01     1 CAU             0.159         0.292 02   
#>  3 us     C     C*07:01     1 CAU             0.155         0.286 02   
#>  4 us     A     A*03:01     1 CAU             0.140         0.260 02   
#>  5 us     C     C*07:02     1 CAU             0.137         0.254 02   
#>  6 us     B     B*07:02     1 CAU             0.125         0.234 02   
#>  7 us     C     C*04:01     1 CAU             0.115         0.217 02   
#>  8 us     B     B*08:01     1 CAU             0.106         0.200 02   
#>  9 us     C     C*06:02     1 CAU             0.0951        0.181 02   
#> 10 us     A     A*24:02     1 CAU             0.0891        0.170 02   
#> # ℹ 11,695 more rows
#> # ℹ 4 more variables: census_region <chr>, total_2020_pop <dbl>,
#> #   us_2020_percent_pop <dbl>, us_2020_nmdp_gf <dbl>
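
A quick sanity check on the returned data frame might look like the sketch below. It uses a small made-up data frame with the documented column names rather than real output, so every number is hypothetical. The race-weighted sum at the end is only one assumed reading of how a census-adjusted frequency could be formed; this page does not state the formula behind us_2020_nmdp_gf.

```r
# Toy stand-in for the returned data frame; all values are made up and the
# race codes are used for illustration only
toy <- data.frame(
  allele = c("A*02:01", "A*02:01", "A*01:01", "A*01:01"),
  nmdp_race_code = c("CAU", "AFA", "CAU", "AFA"),
  nmdp_calc_gf = c(0.47, 0.23, 0.29, 0.09),
  us_2020_percent_pop = c(0.7, 0.3, 0.7, 0.3),
  stringsAsFactors = FALSE
)

# Population shares should sum to roughly 1 for each allele
share_by_allele <- tapply(toy$us_2020_percent_pop, toy$allele, sum)
print(share_by_allele)

# Assumed weighting (not confirmed by the documentation): multiply each
# within-race genotypic frequency by that race's population share, then
# sum per allele
weighted_gf <- tapply(
  toy$nmdp_calc_gf * toy$us_2020_percent_pop,
  toy$allele,
  sum
)
print(weighted_gf)
```

With real output, the same tapply() check on us_2020_percent_pop is a simple way to spot alleles whose race codes do not cover the full population of the chosen region.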