This readme.txt describes and summarizes the files in the vt resource bundle.
Files are based on hs38DH.fa made by Heng Li.
Most reference sets are lifted over from hs37d5.

data set (bcf)                samples  @snps/indels/complex/sv/trs      description
#1000G.v5                     0        81316694/3296894/66806/59426/0   1000G v5. [1000G 2015?]
dbsnp138                      0        10588965/2488793/69749/0/0       derived from GATK's resource bundle that excludes 1000G variants.
1000G.omni.chip               2141     2432554/5/0/0/0                  1000G individuals typed on the omni chip [1000G 2015?]
mills                         0        0/208753/0/0/0                   indels from [Mills 2006]
mills.chip                    158      0/8904/0/0/0                     indels from [Mills 2011]
affy.exome.chip               2122     281875/34389/0/0/0               1000G individuals and others typed on the affymetrix exome chip [1000G 2015?]
NA12878.broad.kb              1        281345/87389/152/0/0             from GATK's NA12878 knowledgebase.
NA12878.v7.illumina.platinum  1        3702969/650764/13751/0/0         Illumina's platinum genomes version 7
NA12878.v8.illumina.platinum  1        3776096/692955/3910/0/0          Illumina's platinum genomes version 8
NA12878.nist.giab.v2.19       1        2787294/363036/2098/0/0          NIST Genome In a Bottle v2.19
UK10K.20140722                0        42415506/4198841/0/0             UK10K variants
#codis                        0        0/0/0/0/15                       CODIS STRs plus 2 Pentanucleotide repeat STRs from Promega

#cannot liftover structural like variants, thus SVs or any variants that use a symbolic allele are omitted
@statistics not updated

file (bed.gz)      description
mdust              regions of low complexity annotated using mdust [Morgulis 2006]
rmsk               repeat regions from repeat masker obtained from UCSC genome browser HG38 database
gencode.v19.cds    coding sequence regions based on GENCODE v27 annotations [Harrow 2012]
trf.lobstr         tandem repeat finder STRs (motif length 1 to 6) from lobSTR's resource bundle [Gymrek 2012]
trf.vntrseek       tandem repeat finder STRs (motif length 7 or more) from VNTRseek's resource bundle [Gelfand 2014]

Note: Please let me know if I did not cite a resource properly.

maintained by: Adrian Tan (atks@umich.edu)