This guide supports the Galter Library class called Genetic Variation and Human Disease. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule, it is still available to you or your group by request.
Two types of genetic variation events are the sources of most human genetic variation:
Example: Velocardiofacial syndrome (VCFS), characterized by features such as cleft palate, cardiac anomalies and learning disabilities, is associated with a deletion on chromosome 22q11.2.
Very few polymorphisms directly produce deleterious phenotypes. However, non-disease-causing polymorphisms, when mapped to the genome, may serve as markers to identify and map other genes that do cause disease when mutated. If such a variation is found to be inherited together with a particular trait without causing it, the variation may indicate where the trait's gene is located in the genome.
Terminology
NCBI's single nucleotide polymorphism database dbSNP is the most-used SNP database worldwide.
The direct URL for dbSNP is:
http://www.ncbi.nlm.nih.gov/SNP/index.html
but most often SNP records are accessed through links from other databases, such as NCBI's Gene or OMIM databases, or from European Bioinformatics Institute (EBI) databases. The URL above is useful for searching by submitted SNP identifier or for searching batches of SNPs or experiments.
In addition to SNPs, dbSNP contains data from small-scale multi-base deletions or insertions (also called deletion insertion polymorphisms or DIPs) and microsatellite repeat variations (also called short tandem repeats or STRs).
You can view summary statistics for the current build of dbSNP or for past builds. dbSNP also has extensive documentation and a submission system for researchers to submit SNP data from their own experiments. These features can be accessed from the menu on the left side of the dbSNP home page.

When a researcher uploads SNP data, each SNP is assigned a submitted SNP identifier (ss#). Submitted SNPs are then checked against the current contents of dbSNP. SNPs that are redundant (i.e., match SNPs already in the database) are assigned to the appropriate reference SNP (rs#) cluster. Unique SNPs are assigned new rs numbers in the next build of dbSNP. Submitted SNP records contain information on the experimental procedures used to identify the SNP, the research project in which the SNP was identified and additional data about the SNP.
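The relationship between ss and rs identifiers can be pictured with a small toy model. The sketch below is only an illustration: it assumes that two submissions are "the same SNP" when they share a chromosome and position, whereas the real dbSNP build process is more involved. The class name, function name and identifiers are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One submitted SNP (hypothetical, simplified record)."""
    ss_id: str      # submitted SNP identifier, e.g. "ss1"
    chrom: str      # chromosome name
    position: int   # position on the chromosome


def cluster_submissions(submissions, existing, next_rs):
    """Map each ss id to an rs id.

    existing: dict mapping (chrom, position) -> rs id already in the database.
    next_rs:  first number to use when a new rs cluster has to be created.
    """
    clusters = dict(existing)  # copy so the caller's dict is not modified
    assignment = {}
    for sub in submissions:
        key = (sub.chrom, sub.position)
        if key not in clusters:
            # Unique SNP: open a new reference cluster.
            clusters[key] = f"rs{next_rs}"
            next_rs += 1
        # Redundant SNPs fall through and reuse the existing cluster.
        assignment[sub.ss_id] = clusters[key]
    return assignment


if __name__ == "__main__":
    subs = [
        Submission("ss1", "17", 100),
        Submission("ss2", "17", 100),  # same location as ss1
        Submission("ss3", "17", 200),  # not yet in the database
    ]
    print(cluster_submissions(subs, {("17", 100): "rs5"}, next_rs=900))
```

In this toy run, ss1 and ss2 join the existing cluster and ss3 opens a new one, which mirrors the many-to-one relationship between submitted and reference SNPs described above.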

A different searchable interface for NCBI's dbSNP is found at the URL:
http://www.ncbi.nlm.nih.gov/snp/
This page allows you to search more efficiently for SNPs associated with a specific gene or disease in specific organisms or in a particular region of a chromosome. You can do this by using the Limits tab on this page.
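A gene-restricted search of this kind can also be scripted through NCBI's E-utilities instead of the web form. The following is a minimal sketch, not an official recipe: it only builds an ESearch URL for the snp database. The helper name build_snp_search_url is hypothetical, and the [GENE] field tag is an assumption that should be checked against the field list NCBI reports for the snp database before relying on it.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_search_url(gene, extra_terms=(), retmax=20):
    """Build an ESearch URL that looks for SNP records tied to a gene.

    gene:        gene symbol, e.g. "BRCA1"
    extra_terms: further already-tagged search terms to AND onto the query
                 (the caller is responsible for using valid field tags)
    retmax:      maximum number of record ids to return
    """
    # "[GENE]" is assumed to be the gene-name field tag for db=snp.
    terms = [f"{gene}[GENE]", *extra_terms]
    params = {
        "db": "snp",
        "term": " AND ".join(terms),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_snp_search_url("BRCA1"))
```

Opening the printed URL in a browser, or fetching it from a script, should return the matching record count and a list of identifiers. Restrictions equivalent to the Limits tab can be passed in through extra_terms once the correct field tags are known.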
Example: Mutations in the BRCA1 gene have been reported to be associated with early-onset breast cancer. Retrieve all validated, non-synonymous coding refSNPs for human BRCA1 from dbSNP.
Solution:

Limits in dbSNP Searches
dbSNP Search Results

You can access specific links from the dbSNP search results by using the colored graphic representation bar, or view the full record by clicking the rs number.
From the full record, you can view SNPs in Sequence Viewer to see their genomic orientation, plus mapping to mRNA and protein products. You can also view population diversity for the alleles and the FASTA sequence of the polymorphism.
By default, a record reached from a dbSNP search shows only coding SNPs. To view all SNPs in a gene region, select the radio button next to "in gene region" in the GeneView section of the SNP record, then click Go.

You can also view SNPs in a chromosomal context by using NCBI's MapViewer.
Example: Mutations in the dopamine receptor D5 (DRD5) gene have been observed in patients with various neurological disorders. How many refSNP records can you find for DRD5, and how many of them are in its coding region? Show all refSNPs in the context of the chromosome.
Solution:


You will now see the SNPs that align to the gene region of interest, with links to details on each SNP in dbSNP. NCBI provides a descriptive legend for all of the refSNP symbols shown in Map Viewer, so you can quickly learn to tell, for example, which SNPs fall in transcript and coding regions.
Online Mendelian Inheritance in Man (OMIM) is a database of human gene-phenotype correlations that is authored and edited at Johns Hopkins University and searchable through the NCBI. Records in OMIM are presented as summary reviews of the current state of knowledge on known Mendelian disorders and over 12,000 genes. OMIM records are good starting points for learning about disease-gene links, because a great deal of information can be accessed from each record.

Accessing Allelic Variants in OMIM
You can view a list of allelic variants for a gene from its record in OMIM.
Example: Search OMIM for information on the gene glucose-6-phosphate dehydrogenase (G6PD). What known disorders are caused by allelic variants in this gene?
Solution:

Limitations of Allelic Variants in OMIM
Not all possible allelic variants will be listed in OMIM. The Gene View in dbSNP will give you a better listing of all polymorphisms for a gene. However, the allelic variants in OMIM are usually selected because they have some significance. Reasons for allelic variants to be included in OMIM include:
NCBI's dbSNP includes data from both the International HapMap Project and the 1000 Genomes Project. When viewing a GeneView in dbSNP, you can see which SNPs have been sequenced or mapped by either project in the Validation column of the gene report.

You can also browse data from the International HapMap Project directly on the HapMap Project website. Links in the left menu will take you to a search interface for the most current data from the project.
Haplotypes are contiguous, linear sets of SNP alleles along a genome that are inherited as a block. Haplotypes can give information on what markers are inherited together and thus can be used to find markers for disease trait inheritance.
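To make the idea of a haplotype concrete, the sketch below treats each chromosome as a set of already-phased SNP alleles and counts how often each combination occurs. It is an illustration under simple assumptions: the rs identifiers and alleles are invented, the function name is hypothetical, and real haplotype inference from genotype data requires phasing methods that are not shown here.

```python
from collections import Counter


def haplotype_counts(chromosomes, snp_order):
    """Count haplotypes across phased chromosomes.

    chromosomes: list of dicts mapping an rs id to the allele carried
                 on that chromosome
    snp_order:   rs ids in their order along the genome
    """
    # Join the alleles in genomic order so each chromosome becomes one string.
    return Counter(
        "".join(chrom[rs] for rs in snp_order) for chrom in chromosomes
    )


if __name__ == "__main__":
    order = ["rsA", "rsB", "rsC"]  # made-up identifiers
    sample = [
        {"rsA": "A", "rsB": "C", "rsC": "T"},
        {"rsA": "A", "rsB": "C", "rsC": "T"},
        {"rsA": "G", "rsB": "T", "rsC": "C"},
        {"rsA": "A", "rsB": "C", "rsC": "T"},
    ]
    for haplotype, count in haplotype_counts(sample, order).most_common():
        print(haplotype, count)
```

In this made-up sample the alleles A, C and T always travel together, so observing any one of them predicts the other two. That is the property that lets a haplotype serve as a marker for a nearby disease trait.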


Data from the 1000 Genomes Project does not currently have its own separate search interface, but the 1000 Genomes website has links through which you can download entire sequence datasets for your own research use.
Other genome variation resources and servers are available on the Web.