Obtain US Census-adjusted NMDP Allelic Frequencies by Region
Source: R/obtain_census_adjusted_nmdp_hla_frequencies_by_region.R
Obtain US Census-adjusted NMDP allelic frequencies by region.
Value
Data frame of US Census-adjusted HLA allelic frequencies
allele - the allele in question
is_g - flags the trailing g on some of the alleles, which indicates that they are inclusive of multiple alleles (see Maiers2007 for details)
nmdp_race_code - value used by NMDP to denote races in their studies
nmdp_af - allelic frequency measured by NMDP within a race
nmdp_calc_gf - genotypic frequency calculated directly from the allelic frequency using Hardy-Weinberg equilibrium
region - used in downstream tables to denote the region; always us (United States) here
census_region - the US 2020 census lookup used by this function can be restricted to a state; if a state is chosen, it is recorded here
total_2020_pop - total population in a given region
us_2020_percent_pop - proportion of the overall population in a given region that falls under the given NMDP race code. Should sum to approximately 1 across race codes for a specific HLA allele
us_2020_nmdp_gf - The US-Census race-adjusted genotypic frequency value - used in downstream applications
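The Hardy-Weinberg step behind nmdp_calc_gf can be sketched as follows. This is a minimal illustration, not the package's source: the helper name allele_to_carrier_frequency is hypothetical, and the formula 1 - (1 - p)^2 (the frequency of carrying at least one copy of an allele with frequency p) is an assumption inferred from the rounded values printed in the Examples output.

```r
# Hypothetical helper, not part of the package.
# Under Hardy-Weinberg equilibrium, the share of individuals carrying at
# least one copy of an allele with frequency p is
#   p^2 + 2 * p * (1 - p) = 1 - (1 - p)^2
allele_to_carrier_frequency <- function(af) {
  stopifnot(is.numeric(af), all(af >= 0 & af <= 1))
  1 - (1 - af)^2
}

# Rounded nmdp_af values for three CAU rows shown in the Examples output
af <- c("A*02:01" = 0.272, "A*01:01" = 0.159, "C*07:01" = 0.155)

# Compare against the nmdp_calc_gf column printed in the Examples
round(allele_to_carrier_frequency(af), 2)
```

Because the printed nmdp_af values are themselves rounded, the results can only be expected to agree with nmdp_calc_gf to about two decimal places.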
Examples
obtain_census_adjusted_nmdp_hla_frequencies_by_region(in_region = 'Alaska')
#> # A tibble: 11,705 × 12
#>    region loci  allele   is_g nmdp_race_code nmdp_af nmdp_calc_gf fips
#>    <chr>  <chr> <chr>   <dbl> <chr>            <dbl>        <dbl> <chr>
#>  1 us     A     A*02:01     1 CAU             0.272         0.469 02
#>  2 us     A     A*01:01     1 CAU             0.159         0.292 02
#>  3 us     C     C*07:01     1 CAU             0.155         0.286 02
#>  4 us     A     A*03:01     1 CAU             0.140         0.260 02
#>  5 us     C     C*07:02     1 CAU             0.137         0.254 02
#>  6 us     B     B*07:02     1 CAU             0.125         0.234 02
#>  7 us     C     C*04:01     1 CAU             0.115         0.217 02
#>  8 us     B     B*08:01     1 CAU             0.106         0.200 02
#>  9 us     C     C*06:02     1 CAU             0.0951        0.181 02
#> 10 us     A     A*24:02     1 CAU             0.0891        0.170 02
#> # ℹ 11,695 more rows
#> # ℹ 4 more variables: census_region <chr>, total_2020_pop <dbl>,
#> #   us_2020_percent_pop <dbl>, us_2020_nmdp_gf <dbl>