Genes Related to Melanoma Displaying the Ultraviolet Signature Mutation
Judy Qin1,2, Igor F. Tsigelny4,5, Valentina L. Kouznetsova3,4 1MAP program SDSC UCSD, 2REHS program SDSC UCSD, 3Moores Cancer Center UCSD, 4San Diego Supercomputer Center UCSD, 5CureMatch, Inc.
Though melanoma accounts for less than 1% of all skin cancer cases, it causes the majority of skin cancer-related deaths. Certain human genes that are prone to mutation by ultraviolet (UV) radiation carry a unique signature in their DNA after exposure. One of the defining characteristics of this signature is a cytosine-to-thymine (C>T) nucleotide substitution, in which a cytosine is replaced by a thymine. Other substitutions (C>A, C>G, T>A, T>C, and T>G) can occur as well. Through close analysis of the signature graph, I determined that there are 11 triplets that are most frequently mutated by UV radiation as a C>T substitution. For example, CDK4, a gene related to melanoma, has 56 possible nucleotide substitution mutations. I looked at the mutation frequencies in the signature graph of CDK4 and compared them to the generic skin cancer signature graph to determine whether the mutations corresponded to the UV mutational signature. The ViSANT program elucidated clusters of genes in compact network modules from my list of melanoma-related genes. Ingenuity Pathway Analysis (IPA) showed the network connections within these modules. I specifically looked at the canonical and network pathways and checked that they were UV linked. The p-values were also analyzed to make sure the results were statistically significant. These results can shed light on a possible new approach to elucidating the molecular mechanisms of UV-related mutations and to identifying new biomarkers.
Ultraviolet rays, the leading cause of skin cancers including melanoma, basal cell carcinoma, and squamous cell carcinoma, can be especially damaging to humans. 86% of the cases are attributed to radiation from the sun. Overexposure to these rays damages skin cells by mutating the DNA of genes that control cell growth. There are three main types of UV rays: UVA, UVB, and UVC. We are most commonly exposed to UVA rays, which are most often linked to long-term damage and penetrate deeper into the skin than UVB rays do. UVB rays, the leading cause of most skin cancers, are more powerful than UVA rays and cause more mutations directly in the DNA of skin cells. Though UVC rays carry the most energy of the three, they do not get through the atmosphere as easily as the other two. The amount of UV exposure we get depends on the time of day and how long we spend in the sun [1, 2, 3].
As Figure 1 depicts, UVA rays are able to penetrate through the skin’s dermis, whereas UVB rays are only able to penetrate through the epidermis. However, since UVB rays have a higher intensity than UVA, they are ultimately able to cause more damage. (Figure 1)
When UV light hits the DNA in skin cells, it causes adjacent pyrimidines (cytosine or thymine) to form a dimer. A dimer is a molecular complex consisting of two identical molecules linked together. The bond between the two pyrimidines is indicated by a red line. When that part of the DNA is replicated, one copy is normal, while in the other copy adenine is incorporated opposite the dimerized cytosines instead of guanine (which normally pairs with cytosine). In the next round of replication, the strand carrying the cytosine dimer is again paired with adenine, while the strand carrying those adenines serves as a template for thymines, which normally pair with adenine, in place of the two original cytosines. This produces a CC to TT mutation, which is what is typically displayed in the UV signature. (Figure 2)
The signature chart for ultraviolet cancers shows the mutation type probability for each triplet. By convention, each mutation is reported at the pyrimidine (cytosine or thymine) of the mutated base pair. As a result, only 6 substitution types are possible, and combining them with the 4 possible bases on either side of the mutated base gives 6 × 4 × 4 = 96 possible triplet changes. My findings show that C>T is the most likely mutation.
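The count of 96 triplet changes can be checked by enumeration. The following is a minimal sketch, not part of the original analysis; the names `SUBSTITUTIONS` and `mutation_channels` are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "ACGT"
# Substitutions are written from the pyrimidine (C or T) of the base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def mutation_channels():
    """Return every (triplet, substitution) pair, e.g. ('TCC', 'C>T').

    The mutated base sits in the middle of the triplet; the bases on
    either side can each be A, C, G, or T.
    """
    channels = []
    for sub in SUBSTITUTIONS:
        ref = sub[0]  # the original pyrimidine, C or T
        for five_prime, three_prime in product(BASES, repeat=2):
            channels.append((five_prime + ref + three_prime, sub))
    return channels


channels = mutation_channels()
print(len(channels))  # 6 substitutions x 4 x 4 flanking bases
```

Each of the eleven triplets discussed in this project, such as TCC with a C>T change, is one entry in this list.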
1. Looking at the UV signature graph above (Graph 1), I can see that the most common mutation in the DNA is a C>T substitution. I identified the most commonly mutated triplets: TCC, TCA, TCG, CCC, TCT, CCT, CCA, ACC, CCG, GCC, and GCT. Using each triplet’s probability of mutation, I determined which of the eleven had the highest probability and which had the lowest.
2. Published research shows a correlation between seven genes and skin cancer, mainly melanoma. Since UV rays are the leading cause of skin cancer, we decided to further investigate the DNA of the seven genes listed above in Table 1.
3. Using a DNA codon to amino acid chart, I determined the main amino acid substitution for each of the eleven triplets. I noticed a significant trend in my results: the change was usually from serine or proline to phenylalanine or leucine. Further literature research confirmed that the proteins most commonly mutated under ultraviolet radiation contain regions rich in proline, glutamic acid, serine, and threonine.
4. Searching for each gene and protein on GeneCards and UniProt, I was able to confirm that, in addition to the previously discovered potential amino acid substitution mutations, there were many more mutations for each protein. This suggested that the protein encoded by each gene is susceptible to mutation.
5. The NCBI Reference Sequence allowed us to input each individual gene and observe its entire DNA sequence. I was able to calculate the percentage of each triplet in each gene and compare it to that in the average genome. From this, I came to the conclusion that the higher the percentage, the more likely the gene is to mutate from UV rays and display the UV signature.
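The codon chart lookup in step 3 can be sketched in a few lines. This is only an illustration under a stated assumption: each triplet is read as an in-frame codon of the standard genetic code, and the C>T change is applied to the middle base. The names `CODON_TABLE`, `UV_TRIPLETS`, and `amino_acid_changes` are hypothetical.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# The eleven triplets most frequently mutated as C>T (from step 1).
UV_TRIPLETS = ["TCC", "TCA", "TCG", "CCC", "TCT", "CCT",
               "CCA", "ACC", "CCG", "GCC", "GCT"]


def amino_acid_changes(triplets):
    """Apply C>T to the middle base and look up both amino acids.

    Assumption: the triplet is an in-frame codon; real mutations can
    fall at any codon position.
    """
    changes = []
    for triplet in triplets:
        mutated = triplet[0] + "T" + triplet[2]
        changes.append((triplet, mutated,
                        CODON_TABLE[triplet], CODON_TABLE[mutated]))
    return changes


for before, after, aa_before, aa_after in amino_acid_changes(UV_TRIPLETS):
    print(f"{before} -> {after}: {aa_before} -> {aa_after}")
```

The output uses one-letter amino acid codes (S serine, P proline, F phenylalanine, L leucine), so the serine or proline to phenylalanine or leucine trend can be read directly from the printed list.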
Graph 1 represents the percentage of triplets that are commonly mutated by UV radiation in the average human’s DNA and in the gene CDK4. The blue shows how often these triplets come up in the normal human genome, and the purple shows how often these triplets are present in CDK4, one of the genes I found related to melanoma. I repeated this with the other six genes found through literature. All of them exhibited a much higher percentage of each of the triplets within their DNA, which allowed us to confirm that these genes have a good chance of being mutated by UV. Having more of these commonly mutated triplets means more chances of mutation.
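The triplet percentages behind this comparison can be computed along the following lines. This is a minimal sketch, assuming overlapping triplets are counted along a single strand of the sequence; the function name `triplet_percentages` and the short test sequence are hypothetical and do not come from the CDK4 data.

```python
from collections import Counter

UV_TRIPLETS = ["TCC", "TCA", "TCG", "CCC", "TCT", "CCT",
               "CCA", "ACC", "CCG", "GCC", "GCT"]


def triplet_percentages(sequence, triplets=UV_TRIPLETS):
    """Return the percentage of all triplets in `sequence` that match
    each triplet of interest.

    Assumption: overlapping windows of three bases on one strand;
    windows containing ambiguous bases such as N are skipped.
    """
    seq = sequence.upper()
    windows = [seq[i:i + 3] for i in range(len(seq) - 2)]
    windows = [w for w in windows if set(w) <= set("ACGT")]
    counts = Counter(windows)
    total = len(windows)
    if total == 0:
        return {t: 0.0 for t in triplets}
    return {t: 100.0 * counts[t] / total for t in triplets}


# Hypothetical toy sequence, only to show the call.
toy_percentages = triplet_percentages("TCCTCAGCT")
for triplet, pct in toy_percentages.items():
    print(f"{triplet}: {pct:.1f}%")
```

Running the same function on a gene sequence and on a genome-wide reference would give the two sets of bars compared in the graph.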
Graph 3: Substitution mutations in CDK4 
Graph 3 shows the breakdown of observed nucleotide changes among the 56 potential substitution mutations found in CDK4, one of the melanoma-linked genes. The largest percentage of substitution mutations is C>T, which matches the generic signature graph. In triplets, cytosine is most likely to mutate to thymine, making it reasonable to conclude that this gene exhibits the UV signature and is likely to be mutated. (Graph 4)
Graph 4 represents the percentage of mutated triplets in the DNA of CDK4. It shows what percent of the CDK4 triplets are mutated through these 6 nucleotide changes. The most common mutation in the signature graph is the C>T substitution.
6. COSMIC (Catalogue Of Somatic Mutations In Cancer) gave us all the potential mutations for all of my genes. There were 56 possible nucleotide substitution mutations presented for CDK4. After looking at all 56 mutations and the triplets they corresponded to, I created my own signature graph modeled after the original. When a mutation occurred at adenine or guanine, I looked at the opposite strand of DNA, where cytosine and thymine were present.
7. COSMIC also provided us with a massive list of genes and the diseases they were related to. I searched through the list for melanoma-related genes. Along with the original seven genes from literature, I was able to find 25 more. With a list of 35 genes, I entered them into ViSANT to create clusters of these genes and related genes. Then, I put them into IPA to observe the pathways between these clusters.
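The strand handling in step 6 and the tally behind the signature graph can be sketched as follows. This is an illustration only: it assumes each mutation is given as a reference triplet with the mutated base in the middle plus the new base, and the names `to_pyrimidine_context` and `signature_counts` and the example mutations are hypothetical, not COSMIC data or a COSMIC API.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand read in its own 5' to 3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def to_pyrimidine_context(triplet, alt):
    """Express a substitution at the pyrimidine of the base pair.

    `triplet` is the reference triplet with the mutated base in the
    middle; `alt` is the new base. If the mutated base is A or G, the
    opposite strand is used, where C or T is present.
    """
    triplet, alt = triplet.upper(), alt.upper()
    if triplet[1] == alt:
        raise ValueError("alt base must differ from the reference base")
    if triplet[1] in "AG":
        triplet = reverse_complement(triplet)
        alt = alt.translate(COMPLEMENT)
    return triplet, f"{triplet[1]}>{alt}"


def signature_counts(mutations):
    """Count mutations per (triplet, substitution) channel."""
    return Counter(to_pyrimidine_context(t, a) for t, a in mutations)


# Hypothetical example mutations, only to show the call.
example = [("TCC", "T"), ("GGA", "A"), ("ATG", "C")]
counts = signature_counts(example)
for (triplet, substitution), n in counts.items():
    print(triplet, substitution, n)
```

Here the G>A change in GGA is counted as C>T in TCC, because that is how it reads on the opposite strand. Dividing each count by the total number of mutations would give the percentages plotted in a signature graph.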
After I entered the 35 genes into ViSANT, it produced the model shown in Figure 4. The genes were grouped into clusters along with many more related genes. The most significant of these 16 clusters were 2, 5, 9, and 15, which I identified after putting them all into Ingenuity Pathway Analysis.
Figure 5 depicts a melanoma signaling pathway and shows genes and potential mutations related to this disease. It shows how certain mutations can cause genes to drive cell proliferation or cell death. The different melanoma-related pathways are clearly shown as well. The AKT pathway is also included, as it is a very important signal transduction pathway that degrades genes, resulting in many forms of cancer. The p-value for this pathway was 8.46E-3. The lower the p-value, the more statistically significant the results are.
Figure 6: Canonical and network pathways for Cluster 5
Figure 6 shows the canonical and network pathways for Cluster 5. The p-value for the canonical DNA Damage Checkpoint Regulation pathway is 1.17E-5. UV rays also mark the start of this pathway. The orange genes on the right are the genes in the network that are significant to my project.
Figure 7: Canonical and network pathways for Cluster 15
Figure 7 shows the canonical and network pathways for Cluster 15. The left figure is a UVC-Induced MAPK Signaling Pathway. The p-value is 3.45E-17. Important enzymes, genes, and growth factors are in orange.
Figure 8: Canonical and network pathways for Cluster 9
The left diagram in Figure 8 shows the canonical and network pathways for Cluster 9. It shows a very important Sonic Hedgehog (SHH) Signaling pathway. Abnormal activation of this pathway leads to a variety of cancers including melanoma because it plays a large role in regulating cell differentiation, cell proliferation, and tissue polarity. The p-value is 2.8E-12. Important genes and growth factors are in orange.
From the data, I can conclude that many of the genes identified as related to skin cancer have a greater potential to be mutated by UV rays and to display the UV signature. Each of the eleven triplets occurred at a much higher frequency in the DNA of the CDK4 gene than in the average human genome without cancer. Because there are more of these commonly mutated triplets, UV rays are more likely to affect the DNA of this gene, and of similar genes found through literature or COSMIC, and cause nucleotide substitution mutations. The following substitution mutations would occur: TCC → TTC, TCA → TTA, TCG → TTG, CCC → CTC, TCT → TTT, CCT → CTT, CCA → CTA, ACC → ATC, CCG → CTG, GCC → GTC, and GCT → GTT. Once the gene’s DNA mutates, the gene may be unable to perform its normal functions, such as suppressing tumors. Amino acid substitutions also occur in the proteins that the genes normally produce, resulting in various abnormalities. Given all of the changes these genes can potentially undergo, based on the frequency of these eleven most commonly mutated triplets, they appear susceptible to displaying the UV signature and contributing to skin cancer.
Using ViSANT, I was able to find genes similar to the original 35. Several of the 16 clusters from ViSANT, namely 2, 5, 9, and 15, showed very promising results, as they all had low p-values and pathways related to melanoma. Cluster 2 showed a specific melanoma pathway, while Cluster 5 showed one related directly to UV. Cluster 9 represented a very important pathway that regulates cells, the Sonic Hedgehog pathway, and Cluster 15 was also directly related to UV light. I can conclude that the genes found do have a relatively high correlation to melanoma and various other skin cancers. Therefore, my hypothesis was generally correct.
I was surprised by the similarity between the signature graph I created from the potential mutations of CDK4 and the general UV signature graph. As I predicted, there were a large number of potential C>T mutations. However, to create my graph I had to take the data for CDK4 and manually count the C>T, C>A, C>G, T>A, T>C, and T>G mutations, so I am working on a way to automatically generate a signature graph for any inputted gene and its potential mutations. A program that does this automatically would allow me to obtain signature graphs for any gene. In the future, signature graphs will hopefully play a significant role in determining the effects of different genes that lead to a variety of cancers.
 What Is Ultraviolet (UV) Radiation? American Cancer Society. http://www.cancer.org/cancer/skin-cancer/prevention-and-early-detection/what-is-uv-radiation.html.
 Ananthaswamy HN. Sunlight and skin cancer. Journal of Biomedicine and Biotechnology. 2001; 1(2): 49 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC113773/
 Alexandrov LB, Nik-Zainal S, Wedge DC, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415-21.
 Amino Acid and Codon Table. http://www.cbs.dtu.dk/courses/27619/codon.html
 Chan P. Home. GtRNAdb: Genomic tRNA Database. http://gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi19/
 Cosmic. CDK4 Gene - COSMIC. CDK4 Gene - Somatic Mutations in Cancer. http://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=CDK4
 Giglia-Mari G, Sarasin A. TP53 mutations in human skin cancers. Hum Mutat. 2003 Mar;21(3):217-28 www.ncbi.nlm.nih.gov/pubmed/12619107.
 Grifantini, Kristina. “How Does Sunscreen Work?” LiveScience, Purch, 25 June 2010, www.livescience.com/32666-how-does-sunscreen-work.html
 Genetics of Skin Cancer. National Cancer Institute. https://www.cancer.gov/types/skin/hp/skin-genetics-pdq
 Pfeifer GP, You YH, Besaratinia A. Mutations induced by ultraviolet light. Mutat Res. 2005 Apr 1;571(1-2):19-31 https://www.ncbi.nlm.nih.gov/pubmed/15748635
Sharpless E, Chin L. The INK4a/ARF locus and melanoma. Oncogene. 2003 May 19;22(20):3092-8 www.ncbi.nlm.nih.gov/pubmed/12789286
 Alberts B, Johnson A, Lewis J, et al. Molecular Biology of the Cell. 4th edition. New York: Garland Science; 2002. Programmed Cell Death (Apoptosis)
Thank you so much to Dr. Igor Tsigelny and Dr. Valentina Kouznetsova for providing me with the support to carry out this project. You both have not only always been there to guide me in the right direction, but have also inspired my love for scientific research.