How to Read a Loss of Heterozygosity Plot

  • Technical advance
  • Open Access

A new method to detect loss of heterozygosity using cohort heterozygosity comparisons

Abstract

Background

Loss of heterozygosity (LOH) is an important marker for one of the 'two-hits' required for tumor suppressor gene inactivation. Traditional methods for mapping LOH regions require the comparison of both tumor and patient-matched normal DNA samples. However, for many archival samples, patient-matched normal DNA is not available, leading to the under-utilization of this important resource in LOH studies. Here we describe a new method for LOH analysis that relies on the genome-wide comparison of heterozygosity of single nucleotide polymorphisms (SNPs) between cohorts of cases and un-matched healthy control samples. Regions of LOH are defined by consistent decreases in heterozygosity across a genetic region in the case cohort compared to the control cohort.

Methods

DNA was collected from 20 Follicular Lymphoma (FL) tumor samples, 20 Diffuse Large B-cell Lymphoma (DLBCL) tumor samples, neoplastic B-cells of 10 B-cell Chronic Lymphocytic Leukemia (B-CLL) patients and buccal cell samples matched to four of these B-CLL patients. The cohort heterozygosity comparison method was developed and validated using LOH derived in a small cohort of B-CLL by traditional comparisons of tumor and normal DNA samples, and compared to the only alternative method for LOH analysis without patient-matched controls. LOH candidate regions were then generated for enlarged cohorts of B-CLL, FL and DLBCL samples using our cohort heterozygosity comparison method in order to evaluate potential LOH candidate regions in these non-Hodgkin's lymphoma tumor subtypes.

Results

Using a small cohort of B-CLL samples with patient-matched normal DNA we have validated the utility of this method and shown that it displays more accuracy and sensitivity in detecting LOH candidate regions compared to the only alternative method, the Hidden Markov Model (HMM) method. Subsequently, using B-CLL, FL and DLBCL tumor samples we have utilised cohort heterozygosity comparisons to localise LOH candidate regions in these subtypes of non-Hodgkin's lymphoma. Detected LOH regions included both previously described regions of LOH as well as novel genomic candidate regions.

Conclusions

We have demonstrated the efficacy of cohort heterozygosity comparisons for genome-wide mapping of LOH and shown the approach to be in many ways superior to the HMM method. Additionally, the use of this method to analyse SNP microarray data from three common forms of non-Hodgkin's lymphoma yielded interesting tumor suppressor gene candidates, including the ETV3 gene that was highlighted in both B-CLL and FL.

Background

The elimination of tumor suppressor gene (TSG) function contributes to carcinogenesis and cancer progression. Early work on the RB1 gene locus suggested that two hits in TSGs were required to disrupt TSG function [1, 2]. That is, both alleles of a TSG must be interrupted by mutation or allelic loss in order to render it inactive. Loss of heterozygosity (LOH), the transition from germ-line heterozygosity at a polymorphic locus to somatic homozygosity, is a hallmark of allelic loss and thus represents one of the two hits required for TSG inactivation.

Analysis of LOH is therefore important in cancer research in order to localize potential TSGs that may have a role in disease genesis and progression [3]. The introduction of high-density single nucleotide polymorphism (SNP) arrays has allowed high-resolution mapping of LOH and the delineation of minimally lost regions that indicate the presence of important TSGs [4]. The conventional method for LOH analysis relies on the comparison of SNP genotypes between tumor DNA samples and patient-matched control (germ-line) DNA samples obtained from normal (non-tumor) tissue. LOH determined in this manner will henceforth be referred to as 'conventional LOH'. The limitation of conventional methods of LOH analysis is that many archived tumor samples are not accompanied by patient-matched control tissue, resulting in the under-utilization of these potentially valuable resources for interrogation of LOH.

Analysis of allelic copy number, using Hidden Markov Model (HMM)-based approaches such as QuantiSNP [5] and PennCNV [6], allows detection of LOH resulting from hemizygous deletion of single alleles, but lacks the capacity to detect copy-number neutral LOH. Recently, an alternative method was described that utilizes a HMM to infer the presence of LOH from SNP microarray data of non-matched tumor samples by the absence of heterozygosity [7], thus allowing detection of copy-number neutral LOH. This method utilizes long strings of SNPs with homozygous calls to infer LOH and generates high numbers of candidate regions spanning large genetic distances. In order to overcome this limitation and increase the resolution of LOH mapping, we have developed a method to infer the presence of LOH regions through the comparison of SNP heterozygosity values between case and reference cohorts (Cohort Heterozygosity Comparison; CHC). The CHC method infers LOH candidate regions by the presence of short strings of SNPs which exhibit consistently lower prevalence of heterozygosity in the case cohort compared to the control cohort. The method is based on the hypothesis that decreased heterozygosity of closely spaced SNPs in the case cohort compared to the control cohort indicates a region of LOH. The CHC method calculates heterozygosity values for SNPs covering the entire genome in case and control cohorts, and then identifies and recovers the information from informative SNPs. By directly comparing heterozygosity of informative SNPs, groups of SNPs with significant differences in heterozygosity between the cohorts can be identified.

In this study, the CHC method was employed using SNP data from Affymetrix 250K SNP arrays to infer LOH regions in tumor samples from B-cell Chronic Lymphocytic Leukemia (B-CLL), Diffuse Large B-cell Lymphoma (DLBCL) and Follicular Lymphoma (FL) patients. SNP array data from patient-matched buccal samples from 4 of the B-CLL patients was initially used to definitively map LOH by conventional methods, giving the 'conventional LOH' regions. These 'conventional LOH' regions were used in order to validate 'inferred LOH' regions generated by the CHC method. Validation of 'conventional LOH' was additionally performed by high-density microsatellite analysis. This work, as well as direct comparison of the efficacy of the CHC method with that of the only alternative (HMM) method, highlighted the utility of cohort heterozygosity comparisons for mapping LOH candidate regions without the need for patient-matched control samples. We therefore performed CHC analysis on larger numbers of FL, DLBCL and B-CLL samples in order to elucidate LOH candidate regions with potential roles in disease pathogenesis.

Methods

Patient Samples and DNA Extraction

Peripheral blood samples were obtained from B-CLL patients (n = 10) through the Prince Charles and Gold Coast Hospitals. Lymphocytes were enriched from 3 mL of whole blood using ACCUSPIN™ system HISTOPAQUE®-1077 columns (Sigma-Aldrich) according to the manufacturer's protocol, and B-cells were isolated from lymphocyte enrichments using Dynabeads® CD19 (Invitrogen) and a magnetic particle concentrator according to the manufacturer's protocol. Following isolation of B-cells, the cells were lysed by the addition of 600 μL of lysis buffer (Qiagen) and passing through a 20-gauge needle. DNA was purified from lysed cells using QIAamp DNA Tissue Mini Kits according to the manufacturer's protocol. Buccal cell samples were collected from 6 of the B-CLL patients using Catch-All sample collection swabs (Epicentre Biotechnologies), and DNA isolated using BuccalAmp DNA extraction kits (Epicentre Biotechnologies) according to the manufacturer's protocol.

FL (n = 20) and DLBCL (n = 20) tumor specimens were obtained through the Australian Leukemia and Lymphoma Group (ALLG) Tissue Bank (Princess Alexandra Hospital, Queensland, Australia) and BioOptions BioRepository Service http://www.biooptions.com/. DNA was extracted from tumor tissue using a modified column extraction protocol. 10 mg of tissue was homogenized in 1 mL of phosphate-buffered saline (pH 7.4) using a rotor-stator homogenizer. Protein was digested by incubation with Proteinase-K at 70°C for 30 min. DNA was purified from the milieu using the QIAamp DNA Blood Midi Kit (Qiagen) according to the manufacturer's protocol, and all samples that were below a concentration of 100 ng/μL were precipitated in ethanol and re-eluted in a smaller volume.

Single Nucleotide Polymorphism Microarrays

DNA samples were amplified, fragmented, labelled and hybridized to Affymetrix 250K Sty SNP microarrays in accordance with the manufacturer's protocol. Raw data was extracted from image files using GeneChip Operating System software (GCOS; Affymetrix). SNP genotyping and HMM-based LOH analysis were performed using GeneChip Genotyping Software (GTYPE; Affymetrix). It should be noted that the HMM method does not utilize comparison to any normal sample, and generates LOH candidate regions based upon the genotypes generated from the tumor samples only. 'Conventional LOH' was detected in 4 B-CLL patients by comparison of SNP microarray genotypes from buccal and tumor samples. This was performed using dChipSNP software as previously described [8].

Microsatellite Analysis of LOH

Case tumor and buccal DNA samples were amplified using the Illustra Genomiphi V2 DNA amplification kit (GE Healthcare) according to the manufacturer's protocol. Each sample was genotyped at the Australian Genome Research Facility (AGRF, Sydney, Australia) using a standardized set of 63 microsatellite markers for chromosome 1. These markers span chromosome 1 with an average resolution of 5 cM http://appliedbiosystems.com/.

Evaluation and Statistical Analysis of 'Inferred LOH' Regions

In order to evaluate the 'inferred LOH' regions generated by the CHC and HMM methods, annotation data from the Affymetrix GeneChip® Genotyping Software was analyzed in Microsoft Excel. Each inferred LOH region was then plotted on the spreadsheet and evaluated for size and distance from SNPs showing 'conventional LOH'. Size was calculated using the physical positions of the SNPs at the ends of each region. The distance from 'conventional LOH' was calculated using the physical positions of the SNPs at the ends of each region and the physical position of the closest SNP showing 'conventional LOH'. If a candidate region encompassed SNPs showing 'conventional LOH', the distance to the closest 'conventional LOH' was taken to be 0.
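The size and distance rules described above were applied in a spreadsheet; the same logic can be sketched in a few lines of Python. The function names and the use of base-pair integer positions are assumptions made for illustration, not part of the published workflow.

```python
def region_size(start_bp, end_bp):
    """Size of a region from the physical positions of its two end SNPs."""
    return end_bp - start_bp


def distance_to_conventional(start_bp, end_bp, conventional_positions):
    """Distance from a region to the closest SNP showing 'conventional LOH'.

    Returns 0 if any such SNP lies inside the region, otherwise the gap
    between the nearer region end and the closest SNP. Returns None when
    no 'conventional LOH' positions are supplied.
    """
    best = None
    for pos in conventional_positions:
        if start_bp <= pos <= end_bp:
            return 0
        gap = start_bp - pos if pos < start_bp else pos - end_bp
        if best is None or gap < best:
            best = gap
    return best


# Hypothetical positions, for illustration only.
size = region_size(1000, 5000)
inside = distance_to_conventional(1000, 5000, [3000])
outside = distance_to_conventional(1000, 5000, [200, 7000])
```

Positions are assumed to lie on the same chromosome as the region; a real analysis would need to group SNPs by chromosome first.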

Normality of the distribution of delta values was tested using Blom's formula in order to ensure the correct model was being employed for power analysis. To establish differences in sensitivity and specificity of the CHC method with different contiguous point thresholds (CPT), correlations between CPT and the percentage of regions encompassing 'conventional LOH' or the distance of regions from 'conventional LOH' were derived using Pearson's correlation and the total data set for each CPT. In order to illustrate that the primary determining factor for the distance of LOH regions inferred by CHC from the 'conventional LOH' was the resolution of the SNPs within the region, Pearson's correlation was used to compare the distance from the closest SNP showing 'conventional LOH' and the distance between SNPs. Only the 'inferred LOH' regions derived with a CPT of 5 that did not encompass markers showing 'conventional LOH' (n = 159) were employed for this analysis. To demonstrate that the CHC method needs high heterozygosity frequencies at SNPs showing 'conventional LOH' in order to map LOH candidate regions over them when employing small cohort sample sizes, heterozygosity frequencies of SNPs showing 'conventional LOH' found within candidate regions were compared to those adjacent to candidate regions. This was achieved through the use of an independent-samples Student's t-test of all regions generated with a CPT of 5. In order to highlight the utility of the enrichment score (ES) generated from the sum of delta values, Pearson's correlation was used to correlate the ES values of all 'inferred LOH' regions generated with a CPT of 5 with the distance of each region from the closest SNP showing 'conventional LOH'.
To evaluate the efficacy of the HMM method, the percentage of HMM candidate regions that encompassed SNPs showing 'conventional LOH' was compared between patients by one-way ANOVA with Bonferroni post-hoc analysis.

Cohort Heterozygosity Comparison Analysis of B-CLL, FL and DLBCL Samples

SNP microarray data for DLBCL, FL and B-CLL samples was analysed using the CHC method with reference to data generated from Caucasian HapMap samples. This data is freely available from http://www.affymetrix.com. It should be noted that different control samples were used for each NHL disease sub-category in order to avoid biasing the results. In order to ensure accuracy of LOH regions, CHC analysis was performed using a delta threshold of 0.4 and a CPT of 5. Regions of 'inferred LOH' were ranked by enrichment score and the top 2 regions for each subtype are discussed.

Implementation

Selection of Informative Markers

Selection of informative SNPs (SNPi) is the first important aspect of the CHC method. As heterozygous genotypes are required to infer LOH using both the conventional and CHC methods of analysis, it is important that there be a likelihood of obtaining heterozygous genotypes within a case cohort. In order to perform cohort comparisons, only those markers that were predicted to give a heterozygous call within the cohort were deemed to be informative; that is, only those markers with a population heterozygosity value (HV) greater than or equal to the reciprocal of the cohort size. HV for the given ethnicity is as determined by the SNP microarray annotation data. In the initial validation set of four samples in this investigation, 55% of the SNPs on the array were selected as informative markers; this translates to >125,000 SNPs. By increasing the sample size in the secondary test sets of B-CLL, FL and DLBCL samples, larger numbers of the SNPs were selected as informative. Thus, the resolution of this method remains high even with low sample sizes, but is improved with increasing sample size due to inclusion of SNPs with lower heterozygosity frequencies.
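The selection rule (HV greater than or equal to the reciprocal of the cohort size) can be sketched as a simple filter. The function name, the dictionary layout and the SNP identifiers and HV values below are hypothetical and chosen only to show how the cutoff moves with cohort size.

```python
def select_informative(population_hv, cohort_size):
    """Return the SNP IDs whose population heterozygosity value (HV)
    is at least 1 / cohort_size."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    cutoff = 1.0 / cohort_size
    return [snp for snp, hv in population_hv.items() if hv >= cutoff]


# Hypothetical annotation values, for illustration only.
example_hv = {"snp_a": 0.48, "snp_b": 0.25, "snp_c": 0.12, "snp_d": 0.05}

informative_n4 = select_informative(example_hv, 4)    # cutoff 0.25
informative_n10 = select_informative(example_hv, 10)  # cutoff 0.10
```

With four samples only the two most heterozygous markers pass, while with ten samples a third marker becomes informative, which mirrors the gain in resolution with larger cohorts described above.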

Calculation of Delta Values

Because of the potential to skew case and control heterozygosity values, all markers with absent genotype calls in any of the samples are removed from the analysis. Heterozygosity values are then calculated for all remaining SNPs for each cohort. A delta value (Δ), representing the difference in heterozygosity value between the cohorts, is then calculated for each SNP.
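A minimal sketch of this step follows. It assumes genotypes are coded as "AA", "AB", "BB" or "NoCall" (a hypothetical encoding), and it takes delta as control heterozygosity minus case heterozygosity so that a positive delta corresponds to a decrease in the case cohort, in line with the sign convention used in the Discussion.

```python
def heterozygosity(calls):
    """Fraction of heterozygous ('AB') calls at one SNP in one cohort."""
    return sum(1 for call in calls if call == "AB") / len(calls)


def delta_values(case_calls, control_calls, no_call="NoCall"):
    """Map SNP ID -> control heterozygosity minus case heterozygosity.

    SNPs with a missing call in any sample of either cohort are dropped.
    """
    deltas = {}
    for snp in case_calls.keys() & control_calls.keys():
        case, control = case_calls[snp], control_calls[snp]
        if not case or not control:
            continue  # no genotypes available for this SNP
        if no_call in case or no_call in control:
            continue  # absent call: remove the marker from the analysis
        deltas[snp] = heterozygosity(control) - heterozygosity(case)
    return deltas


# Hypothetical genotype calls, for illustration only.
case = {
    "snp_a": ["AA", "AA", "BB", "AA"],
    "snp_b": ["AB", "NoCall", "AA", "BB"],
    "snp_c": ["AB", "AB", "AA", "BB"],
}
control = {
    "snp_a": ["AB", "AB", "AA", "AB"],
    "snp_b": ["AB", "AB", "AA", "BB"],
    "snp_c": ["AB", "AA", "AB", "BB"],
}
example_deltas = delta_values(case, control)
```

Here `snp_b` is excluded because one case sample has no call, `snp_a` shows a large positive delta, and `snp_c` shows no difference between cohorts.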

Calculation of Delta Threshold

Different cohort sample sizes between investigations also make it important to calculate the delta threshold for each application of the CHC method. The statistical power of different investigations is dependent upon their respective sample sizes. With increasing sample size, a smaller effect size (Δ) is needed in order to obtain statistically significant results whilst maintaining the desired minimum power (80%). The delta threshold is therefore set to maintain a statistical power of 80% (α = 0.05). For example, with the initial validation cohort (n = 4, σ = 0.3) the delta threshold was calculated to be 0.5, whereas by increasing the sample size to 10 in the secondary cohort of B-CLL samples while maintaining the same standard deviation, the delta threshold required to obtain 80% power was lowered to 0.3. This means that with a sample size of 4 a 50% decrease in heterozygosity frequency must be observed in the case cohort compared to the reference cohort in order for the marker to be deemed informative, whereas with a sample size of 10 only a 30% decrease in case cohort heterozygosity frequency is required for a marker to be deemed informative. Increasing the cohort sample size when using the CHC method of LOH analysis is therefore expected to increase the specificity of the method and hence decrease the number of type I errors. Statistical power analysis was employed to calculate the delta value required (i.e. effect size) to reach a minimum power of 80% with the specified cohort sample size (α = 0.05). The following formula was utilized to calculate the delta threshold (ΔT), which was used to determine which delta values were suggestive of LOH (ΔS). Within this formula, σ represents the standard deviation, Z represents the Z-score for a given value, α represents the level of significance (set to 0.05), β represents the type II error rate and n represents the sample size.
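As a rough illustration of how such a threshold depends on n, σ, α and β, the sketch below uses a textbook minimum-detectable-effect expression for a one-sided, two-sample comparison with n samples per cohort: ΔT = (Z(1−α) + Z(1−β)) · σ · √(2/n). This specific form is an assumption and should not be read as the authors' exact formula. Under it, the thresholds come out at roughly 0.53 for n = 4 and 0.33 for n = 10 with σ = 0.3, which is close to the reported 0.5 and 0.3, although that agreement does not confirm the form used in the paper.

```python
from math import sqrt
from statistics import NormalDist


def delta_threshold(n, sigma, alpha=0.05, power=0.80):
    """Smallest effect size detectable at the given alpha and power.

    Assumed form: one-sided, two-sample z-approximation with n samples
    per cohort. This is an illustrative choice, not the published formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # Z-score for significance
    z_beta = NormalDist().inv_cdf(power)         # Z-score for 1 - beta
    return (z_alpha + z_beta) * sigma * sqrt(2.0 / n)


threshold_n4 = delta_threshold(4, 0.3)
threshold_n10 = delta_threshold(10, 0.3)
```

Whatever the exact form, the 1/√n dependence is what drives the behaviour described above: a larger cohort lowers the effect size needed to reach 80% power.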

Contiguous Point Threshold and Inferring LOH

The number of contiguous SNPs required to infer LOH within a region is referred to as the contiguous point threshold (CPT). Strings of contiguous SNPs larger than or equal to the CPT and yielding informative delta values are highlighted as 'inferred LOH' regions. By altering this threshold users are able to change the sensitivity and specificity of the CHC method. Using a low CPT, a comparatively higher number of 'inferred LOH' regions can be localized compared to using a high CPT. However, this increased sensitivity is accompanied by decreased specificity. Setting the CPT is therefore a trade-off between sensitivity and specificity, and should be considered for each application of this method. However, it is recommended that a minimum CPT of 5 be employed in order to ensure sufficient specificity.
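The run-detection step can be sketched as a single pass over position-ordered delta values. The function name, the use of list indices for region boundaries, and the choice to treat a delta equal to the threshold as informative are assumptions made for this example.

```python
def infer_loh_regions(deltas, delta_threshold, cpt=5):
    """Return (start, end) index pairs (inclusive) for runs of at least
    `cpt` consecutive SNPs whose delta is >= delta_threshold.

    `deltas` must be ordered by genomic position on a single chromosome.
    """
    regions = []
    run_start = None
    for i, delta in enumerate(deltas):
        if delta >= delta_threshold:
            if run_start is None:
                run_start = i  # a new run of informative SNPs begins
        else:
            if run_start is not None and i - run_start >= cpt:
                regions.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and len(deltas) - run_start >= cpt:
        regions.append((run_start, len(deltas) - 1))
    return regions


# Hypothetical delta values, for illustration only.
example = [0.1, 0.5, 0.6, 0.5, 0.7, 0.5, 0.0, 0.5, 0.5, 0.5, 0.2]
regions_cpt5 = infer_loh_regions(example, 0.5, cpt=5)
regions_cpt3 = infer_loh_regions(example, 0.5, cpt=3)
```

In this toy input the five-SNP run is reported at both settings, while the three-SNP run appears only at the lower CPT, which illustrates the sensitivity and specificity trade-off described above.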

Enrichment Score

In order to predict the proximity of 'inferred LOH' regions to markers showing 'conventional LOH', an enrichment score can be generated by summing the delta values generated within an 'inferred LOH' region. This enrichment score negatively correlates with the distance to the closest 'conventional LOH' marker, indicating that the higher the enrichment score, the closer the 'inferred LOH' region will lie to markers showing 'conventional LOH'. It is therefore recommended that the sequence immediately flanking CHC candidate regions with low enrichment scores also be considered when searching for candidate TSGs.
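Since the enrichment score is simply the sum of delta values within a region, it can be sketched together with the ranking used later to pick top candidates. The function names and the index-based region format are assumptions carried over for illustration.

```python
def enrichment_score(deltas, region):
    """Sum of delta values over an inclusive (start, end) index range."""
    start, end = region
    return sum(deltas[start:end + 1])


def rank_regions(deltas, regions):
    """Return (score, region) pairs sorted from highest to lowest score."""
    scored = [(enrichment_score(deltas, region), region) for region in regions]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical delta values and regions, for illustration only.
example_deltas = [0.1, 0.5, 0.6, 0.5, 0.7, 0.5, 0.0, 0.5, 0.5, 0.5, 0.2]
example_regions = [(7, 9), (1, 5)]
ranked = rank_regions(example_deltas, example_regions)
```

Because the score is a sum rather than a mean, longer regions tend to score higher than shorter ones with similar per-SNP deltas.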

Automated Analysis using Cohort Heterozygosity Comparison

A platform for automated CHC analysis (Additional file 1), instructions for use (Additional file 2), and the case and control input files (Additional files 3 and 4) utilized for analysis of B-CLL, FL and DLBCL cohorts in this manuscript are available from the journal web-site.

Results

Validation of the Cohort Heterozygosity Comparison Method

Delta values were calculated for each informative SNP as described above. These values demonstrated a positively skewed normal distribution, as would be anticipated with decreased heterozygosity frequency in the case cohort. Calculation of the delta threshold is an important facet of the CHC method. With increasing cohort sample size, a smaller effect size (Δ) is required to obtain significant results whilst maintaining 80% power. For the initial validation set of B-CLL samples (n = 4, σ = 0.3) the delta threshold was calculated to be 0.5; for the subsequent test set (n = 10, σ = 0.3) the delta threshold was calculated to be 0.3.

The number of inferred regions for each CPT can be seen in Table 1. Increasing the CPT decreased the number of LOH candidate regions generated via the CHC method. A significant positive correlation was found between CPT and the percentage of regions that spanned SNPs showing 'conventional LOH' (Pearson's correlation coefficient = 0.998; p = 0.002), and a significant negative correlation was found between CPT and the mean distance of 'inferred LOH' regions from the closest SNP showing 'conventional LOH' (Pearson's correlation coefficient = -0.996; p = 0.004). It is therefore predicted that raising the CPT increases the selectivity of the CHC method (i.e. decreases the chance of a type I error), but in return may also decrease its sensitivity (i.e. increase the chance of a type II error).

Table 1 Validation of 'Inferred LOH' Regions Generated by the CHC Method

A striking trend was also observed whereby those regions that did not encompass SNPs showing 'conventional LOH' mapped directly adjacent to them. When evaluating the LOH regions inferred with a CPT of 5, those SNPs showing 'conventional LOH' that mapped directly adjacent to regions of 'inferred LOH' were found to have a mean heterozygosity of 0.19 (n = 159). The mean population heterozygosity of SNPs showing 'conventional LOH' that mapped within 'inferred LOH' regions was found to be 0.32 (n = 140), and was significantly higher than that of those mapping next to candidate regions (p < 0.001). A positive correlation was found between the distance of candidate regions from the closest SNP showing 'conventional LOH' and the distance between SNPs (Pearson's correlation coefficient = 0.290; p = 0.022). A significant negative correlation was found between enrichment score and the distance to the closest SNP showing 'conventional LOH' (Pearson's correlation coefficient = -0.143; p < 0.001).

Evaluation of the Hidden Markov Model Method

The HMM method for inferring LOH regions, associated with the Affymetrix GeneChip® Genotyping Software CNAT 4.0 tool, generates scores of either 1 or 0 relating to 'inferred LOH' or retention of heterozygosity respectively. Regions with scores of 1 were classified as 'inferred LOH' regions and are summarized in Table 2. This method was found to generate large numbers of regions that spanned large genomic distances and that had a variable rate of success in encompassing 'conventional LOH' in the CLL patients. In total for the 4 cases, 699 regions were selected as regions of 'inferred LOH' by the HMM method. Of these, 82 regions overlapped in two patients, 22 candidate regions overlapped in three patients, and 2 candidate regions overlapped in all four patients, resulting in 567 independent regions of 'inferred LOH' for follow-up analysis from four cases. A significant difference in the percentage of HMM candidate regions encompassing SNPs showing 'conventional LOH' was also noted between cases (p = 0.01). No feature of individual case data was found to predict the percentage of regions encompassing SNPs showing 'conventional LOH'.

Table 2 Validation of 'Inferred LOH' Regions Generated by the HMM Method

Cohort Heterozygosity Comparison Analysis of NHL Subtypes

After validation of the CHC method, SNP array data for DLBCL, FL and B-CLL samples was analysed using the CHC method. This generated 9 LOH candidate regions in DLBCL with enrichment scores ranging from 2.1 to 5.45, 10 LOH candidate regions in FL with enrichment scores ranging between 2.2 and 4.65, and 65 candidate regions in B-CLL with enrichment scores ranging from 2 to 5.8. The top 2 candidates for each NHL subtype, as assessed by enrichment score, are displayed in Table 3. Figure 1 shows a moving window analysis of the delta scores over the 1q32.1 candidate locus for all 3 of the NHL subtypes, showing its implication in CLL and FL but not DLBCL. The 1q32.1 region highlighted by the CHC method in CLL and FL cohorts, and the 3p25.3 region highlighted in the CLL cohort, were not detected by the HMM method in any of these samples. However, the HMM method did infer LOH over areas exceeding 500 Mbp in two FL samples that overlapped the comparatively smaller 5q11.2 candidate region highlighted by the CHC method. The HMM method also inferred LOH over a 965 Kbp region in one DLBCL sample that overlapped the smaller 4q32.2 candidate region highlighted by the CHC method. Interestingly, the most notable overlap in candidate prediction between the HMM and CHC methods occurred at the 3q11.2 candidate region, in which the HMM method inferred LOH in seven DLBCL samples with sizes ranging from 681 Kbp up to all of chromosome 3.

Figure 1

Rolling window representation of delta values across the 1q23.1 cytoband for CLL, FL and DLBCL cohorts. The position of each SNP over this region is shown by a black vertical bar. It can be seen that high delta values are obtained for contiguous SNPs over the ETV3 gene locus, marked with a red box, in CLL and FL cohorts but not in the DLBCL cohort.
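A rolling window view of this kind can be approximated by averaging delta values over a fixed number of consecutive SNPs. The window size and the use of a mean are assumptions here, since neither is stated for the figure.

```python
def rolling_mean(values, window=3):
    """Mean of each run of `window` consecutive values, in order.

    Returns an empty list when there are fewer values than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical position-ordered delta values, for illustration only.
smoothed = rolling_mean([0.0, 0.3, 0.6, 0.6, 0.0], window=3)
```

Smoothing in this way makes runs of consistently high delta values stand out from isolated high values at single SNPs, which is the pattern the CHC method looks for.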

Table 3 Top two Regions of 'Inferred LOH' for each NHL Subtype

Discussion

We have developed a method for inferring regions of LOH by analyzing cohort heterozygosity values using SNP arrays, without the need for patient-matched samples. Initially, LOH regions were inferred in a small cohort of B-CLL patients using the Cohort Heterozygosity Comparison method. In order to investigate the efficacy of this method, regions of 'conventional LOH' were mapped in these patients through the use of SNP array data from patient-matched buccal samples, which was further validated by analysis of microsatellite markers. Following validation of this novel method of LOH analysis, CHC analysis was used to infer LOH in an extended cohort of B-CLL samples as well as larger cohorts of FL and DLBCL samples.

The CHC method employs genotype data generated by SNP microarray in order to detect regions of decreased heterozygosity frequency in the case cohort compared to the control cohort. Due to the decreasing heterozygosity of markers on modern SNP array platforms, the CHC method only utilizes markers with a predicted capacity to generate at least one heterozygous genotype within each of the case and control cohorts. For each of these SNPs, delta values are calculated that correspond to the difference in heterozygosity between cases and controls. Decreases in heterozygosity frequency, as demonstrated by a positive delta score, are considered to be suggestive of LOH if they exceed a set threshold. This threshold (ΔT) is one of the modulatory parameters of the CHC method that can be used to alter the balance between sensitivity and specificity, but is usually set to maintain sufficient statistical power for the analysis. The second modulatory parameter is the contiguous point threshold (CPT), which is used to highlight regions as inferring LOH only if the number of contiguous markers exceeding ΔT is equal to or above the CPT. The CPT is applied in order to minimize false positives, and the ability of this parameter to modulate the sensitivity and specificity of this method is aptly shown by the decreasing number of inferred LOH regions and more accurate inference of LOH with increasing CPT, respectively (Table 1).

A caveat of the CHC method when employing small sample sizes is the increased requirement for high heterozygosity frequencies in order to detect LOH. This led to a portion of the regions within the validation cohort mapping adjacent to, rather than encompassing, SNPs showing 'conventional LOH'. In this investigation the validation cohort sample size was small, meaning that the effect size must be large in order to define regions with significantly decreased heterozygosity in the case cohort compared to the control cohort. In the validation cohort of four samples, when employing a CPT of 5 and a delta threshold of 0.5, there must be 5 consecutive SNPs with a 50% decrease in heterozygosity between the case and control cohorts in order for LOH to be inferred. This means the CHC method can only detect LOH in small sample sizes if the region of LOH has a high enough initial rate of heterozygosity. In instances in which SNPs showing 'conventional LOH' had low population heterozygosity, the 'inferred LOH' region often mapped directly adjacent to them rather than encompassing them. This is supported by the fact that the SNPs showing 'conventional LOH' that mapped adjacent to regions of 'inferred LOH' derived by the CHC method had a significantly lower population heterozygosity than those that mapped within the 'inferred LOH' regions. Furthermore, the distance of these adjacently positioned regions of 'inferred LOH' from 'conventional LOH' was determined primarily by the resolution of markers within each respective genomic region. The need for high population heterozygosity frequencies in order to map SNPs showing 'conventional LOH' within regions of 'inferred LOH' generated by the CHC method can be alleviated by increasing the cohort sample size. This would augment the power of the investigation and decrease the effect size required to define SNPs as being suggestive of LOH.
However, even with the small sample size in the validation cohort, the CHC method was shown to successfully infer LOH regions close to SNPs showing 'conventional LOH', including those with low population heterozygosities. Furthermore, utilization of the CHC method with small sample sizes is aided by the use of enrichment scores for each 'inferred LOH' region. This score is a sum of the heterozygosity frequency delta values, and was shown to significantly and negatively correlate with the distance of 'inferred LOH' from 'conventional LOH'.

The simply current culling to the CHC method for high resolution mapping of LOH using SNP microarray data without the demand for patient-matched control samples is a method based on an HMM algorithm [7]. This method relies on identification of long strings of homozygous SNP genotypes in guild to infer LOH. The HMM method therefore primarily differs from the CHC algorithm due to its style of analysis being directed linearly within a unmarried sample, while the CHC algorithm applies cross-sectional interrogation of genomic regions beyond unabridged cohorts of samples. This means that, while the HMM algorithm is predisposed to generating imitation-positive LOH calls as a result of consanguinity in individual patient'due south family unit copse and the subsequent enrichment of homozygosity, the cross-sectional arroyo adopted by the CHC algorithm is not every bit heavily affected past private genetic background. We employed this method to analyze each of the four initial validation case samples in this investigation. It yielded 699 candidate regions with an boilerplate size of over 95 kb betwixt the four cases. Only 19% of these candidate regions overlapped in two or more patients, resulting in 567 independent genomic regions that were suggestive of LOH. The HMM method does not provide a cumulative index or probability of LOH across all samples, and thus differences betwixt this method and the CHC method are to exist expected. However, when comparing the accuracy of the HMM-generated regions to the regions inferred past the CHC method with the recommended CPT, it can be seen that the CHC method delineated regions that were on average over 70% smaller, and more ofttimes mapped within the immediate vicinity of 'conventional LOH'. It should exist noted that, although the size of HMM-inferred regions of LOH could be decreased by altering the state change parameter, this would too further subtract the specificity of the method. 
Nonetheless, the CHC method was found to infer LOH more specifically and within smaller genomic regions than the HMM method. This suggests that not only is the CHC method more accurate in inferring LOH, but its increased precision may also allow more specific delineation of smaller regions of LOH. Although there are clear advantages in combining cases into a cohort as part of the CHC method, this approach inhibits the ability to infer LOH regions within individual samples. Instead, the CHC method is more appropriately employed in generating candidate LOH regions implicating genes with a broad role in disease pathogenesis rather than a variable importance on a case-by-case basis.
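The contrast between the two modes of analysis can be made concrete with a deliberately simplified sketch of a within-sample scan. This is a plain run-length search for consecutive homozygous calls, not the HMM algorithm of reference [7]; the function name, the genotype coding and the `min_length` threshold are hypothetical.

```python
from typing import List, Sequence, Tuple


def homozygous_runs(genotypes: Sequence[str],
                    min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for runs of at least
    min_length consecutive homozygous calls ('AA' or 'BB') in ONE sample.

    Heterozygous calls and no-calls both terminate a run. This is a
    simplified stand-in for illustration, not a hidden Markov model.
    """
    runs = []
    run_start = None
    for i, g in enumerate(genotypes):
        if g in ("AA", "BB"):
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and len(genotypes) - run_start >= min_length:
        runs.append((run_start, len(genotypes)))
    return runs


sample = ["AB", "AA", "BB", "AA", "AB", "AA", "AA", "AB",
          "BB", "BB", "BB", "AA"]
print(homozygous_runs(sample))
```

Because such a scan only ever sees one sample, any long homozygous stretch inherited through consanguinity is reported in the same way as a tumor-derived one, which is the source of false positives discussed above. A cohort-level comparison of heterozygosity frequencies dilutes the contribution of any single individual's background.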

In order to investigate whether automated analysis of larger samples of NHL patients could derive hypothetically important candidates, the CHC method was used to analyze larger cohorts of B-CLL, FL and DLBCL samples. Using the enrichment score, the top two LOH candidate regions for each disease subtype were investigated for candidate genes that may possess a hypothetical tumor-suppressor role in NHL. Although one of the regions of 'inferred LOH' in DLBCL mapped over a genetic region with no closely linked coding loci, this region (3q11.2) has been shown to be lost in a range of cancers including acute lymphoblastic leukemia and mantle cell lymphoma [9, 10], and the remaining five candidate regions were closely linked to attractive TSG candidates. Among these were two DNA repair genes (FANCD2, ERCC8), an inhibitor of RAS-mediated transformation (ANP32C), and a repressor of NFκB activity that was highlighted in both B-CLL and FL cohorts (ETV3).

The association between DNA repair genes and the pathogenesis of NHL is demonstrated by the predisposition to NHL of immunodeficiency cases carrying mutations in genes mediating DNA damage repair [11], as well as by the numerous associations between polymorphisms in DNA repair genes and genetic susceptibility to NHL [12–14]. It is therefore not surprising that DNA repair genes may play a key tumor suppressor role in NHL. The FANCD2 gene was linked with a region of 'inferred LOH' highlighted in the B-CLL patient cohort, and functions by forming complexes with BRCA1 or RAD51 and mediating repair of transcriptionally active genes [15]. Furthermore, this gene has also been shown to be essential for maintaining the G2 cell-cycle checkpoint [16]. The ERCC8 gene is also involved in DNA repair of transcriptionally active genes [17], and was linked with a region of inferred LOH in the FL patient cohort. Further support for this gene as a potential TSG is provided by the increased frequency of cancer in Cockayne's syndrome patients, in whom the ERCC8 gene is mutated [18]. The targeting of the FANCD2 and ERCC8 genes by LOH, as inferred by the CHC method, indicates that knock-out of their function may allow the accumulation of mutations within transcriptionally active genes. Furthermore, this may also be compounded by subsequent unchecked progression through the G2 checkpoint without FANCD2-associated cell-cycle arrest or apoptotic induction.

The ANP32C gene was highlighted as a potential TSG by CHC analysis of the DLBCL patient cohort. This gene has been previously described as a tumor suppressor gene, and elucidation of its function revealed that this role may be due to repression of RAS-mediated tumorigenesis [18]. However, our implication of this gene in the pathogenesis of NHL is a completely novel finding. The most interesting finding within the CHC analysis results was the revelation of ETV3 as a potential TSG in both B-CLL and FL patient cohorts. This gene is part of the ETS family of tumor suppressors and functions in repression of NFκB-activated transcription [19]. LOH of the ETV3 locus at 1q23.1 has been previously demonstrated in B-CLL [20], and its potential as a TSG is supported by the importance of NFκB signaling in promoting proliferation and inhibiting apoptosis within lymphocytes [21]. The implication of genes with well-defined tumor suppressor roles by cohort heterozygosity comparison of B-CLL, FL and DLBCL provides evidence that this novel method of analysis infers LOH within plausible TSGs. Although the implication of these candidate genes is only a preliminary finding, and validation by microsatellite analysis in a larger cohort of samples would be required in order to definitively link them with the pathogenesis of NHL, they clearly illustrate the utility of the CHC method for derivation of LOH candidates.

Conclusions

We have developed a method of LOH analysis using case and unmatched control samples that differs from that which is currently available by directing the analysis cross-sectionally across multiple genomes within a cohort. Analysis of SNP array data with the CHC method was shown to infer LOH regions that linked closely to SNPs showing 'conventional LOH'. This method can be adopted for investigation of varied sample sizes, and the specificity and sensitivity of the method can be tailored to suit the objectives of each investigation. The candidate regions generated by the CHC method were considerably smaller and mapped significantly closer to SNPs showing 'conventional LOH' than the candidate regions generated by its only current alternative, the HMM method. The utility of this method was further highlighted by its use to elucidate hypothetical, but potentially important, regions of LOH in three subtypes of NHL, including the ETV3 gene that was highlighted in both B-CLL and FL patient cohorts.

Abbreviations

CHC: Cohort Heterozygosity Comparison

HMM: Hidden Markov Model

LOH: Loss of Heterozygosity

TSG: Tumor Suppressor Gene.

References

  1. Comings D: A general theory of carcinogenesis. Proc Natl Acad Sci. 1973, 70: 3324-3328. 10.1073/pnas.70.12.3324.

  2. Knudson A: Two genetic hits (more or less) to cancer. Nat Rev Cancer. 2001, 1: 157-163. 10.1038/35101031.

  3. Thiagalingam S, Foy R, Cheng K, Lee H, Thiagalingam A, Ponte J: Loss of heterozygosity as a predictor to map tumor suppressor genes in cancer: molecular basis of its occurrence. Curr Opin Oncol. 2002, 14: 65-72. 10.1097/00001622-200201000-00012.

  4. Lipshutz R, Fodor S, Gingeras T, Lockhart D: High density synthetic oligonucleotide arrays. Nat Genet. 1999, 21: 20-24. 10.1038/4447.

  5. Colella S, Yau C, Taylor J, Mirza G, Butler H, Clouston P, Bassett A, Seller A, Holmes C, Ragoussis J: QuantiSNP: an objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. Nucleic Acids Res. 2007, 35: 2013-2015. 10.1093/nar/gkm076.

  6. Wang K, Li M, Hadley D, Liu R, Glessner J, Grant S, Hakonarson H, Bucan M: PennCNV: an integrated Hidden-Markov Model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 17: 1665-1674. 10.1101/gr.6861907.

  7. Beroukhim R, Lin M, Park Y, Hao K, Zhao X, Garraway L, Fox E, Hochberg E, Mellinghoff I, Hofer M, Descazeaud A, Rubin M, Meyerson M, Wong W, Sellers W, Li C: Inferring loss-of-heterozygosity from unpaired tumors using high-density oligonucleotide SNP arrays. PLOS Comput Biol. 2009, 2: 323-332.

  8. Lin M, Wei L, Sellers W, Lieberfarb M, Wong W, Li C: dChipSNP: significance curve and clustering of SNP-array-based loss-of-heterozygosity data. Bioinformatics. 2004, 20: 1233-1240. 10.1093/bioinformatics/bth069.

  9. Tsuzuki S, Karnan S, Horibe K, Matsumoto K, Kato K, Inukai T, Goi K, Sugita K, Nakazawa S, Kasugai Y, Ueda R, Seto M: Genetic abnormalities involved in t(12;21) TEL-AML1 acute lymphoblastic leukemia: Analysis by means of array-based comparative genomic hybridization. Cancer Sci. 2007, 98: 698-706. 10.1111/j.1349-7006.2007.00443.x.

  10. Wlodarska I, Pittaluga S, Hagemeijer H, De Wolf-Peeters C, Berghe Van Den H: Secondary chromosomal changes in mantle cell lymphoma. Haematologica. 1999, 84: 594-599.

  11. Tran H, Nourse J, Hall S, Green M, Griffiths L, Gandhi M: Immunodeficiency-associated lymphomas. Blood Rev. 2008, 22: 261-281. 10.1016/j.blre.2008.03.009.

  12. Hill D, Wang S, Cerhan J, Davis S, Cozen W, Severson R, Hartge P, Wacholder S, Yeager M, Chanock S, Rothman N: Risk of non-Hodgkin's lymphoma (NHL) in relation to germline variation in DNA repair and related genes. Blood. 2006, 108: 3161-3167. 10.1182/blood-2005-01-026690.

  13. Shen M, Zheng T, Lan Q, Zhang Y, Zahm S, Wang S, Holford T, Leaderer B, Yeager M, Welch R, Kang D, Boyle P, Zhang B, Zou K, Zhu Y, Chanock S, Rothman N: Polymorphisms in DNA repair genes and risk of non-Hodgkin's lymphoma among women in Connecticut. Hum Genet. 2006, 119: 659-668. 10.1007/s00439-006-0177-2.

  14. Ekstrom-Smedby K, Lindgren C, Hjalgrim H, Humphreys K, Schollkopf C, Chang E, Roos G, Ryder L, Falk K, Palmgren J, Kere J, Melbye M, Glimelius B, Adami H: Variation in DNA repair genes ERCC2, XRCC1, and XRCC3 and risk of Follicular Lymphoma. Cancer Epidemiol Biomarkers Prev. 2006, 15: 258-265. 10.1158/1055-9965.EPI-05-0583.

  15. Taniguchi T, Garcia-Higuera I, Andreassen P, Gregory R, Grompe M, D'Andrea A: S-phase-specific interaction of the Fanconi anemia protein, FANCD2, with BRCA1 and RAD51. Blood. 2002, 100: 2414-2420. 10.1182/blood-2002-01-0278.

  16. Freie B, Ciccone S, Li X, Plett P, Orschell C, Srour E, Hanenberg H, Shindler D, Lee S, Clapp W: A role for the Fanconi anemia C protein in maintaining the DNA damage-induced G2 checkpoint. J Biol Chem. 2004, 279: 50986-50993. 10.1074/jbc.M407160200.

  17. Hakem R: DNA-damage repair; the good, the bad, and the ugly. EMBO J. 2008, 27: 589-605. 10.1038/emboj.2008.15.

  18. Bai J, Brody J, Kadkol S, Pasternack G: Tumor suppression and potentiation by manipulation of pp32 expression. Oncogene. 2001, 20: 2153-2160. 10.1038/sj.onc.1204294.

  19. El Kasmi K, Smith A, Williams L, Neale G, Panopolous A, Watowich S, Haker H, Foxwell B, Murray P: A transcriptional repressor and corepressor induced by the STAT3-regulated anti-inflammatory signaling pathway. J Immunol. 2007, 179: 7215-7219.

  20. Pfiefer D, Pantic M, Skatulla I, Rawluk J, Kreutz C, Martenz U, Fisch P, Timmer J, Veelken H: Genome-wide analysis of DNA copy number and LOH in CLL using high-density SNP arrays. Blood. 2007, 109: 1202-1210. 10.1182/blood-2006-07-034256.

  21. Baeuerle P, Henkle T: Function and activation of NFkappa-B in the immune system. Ann Rev Immunol. 1994, 12: 141-179.

Pre-publication history

  • The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2407/10/195/prepub

Acknowledgements

This work was supported by funding from the Herbert Family and the Griffith Medical Research College.

Author information

Corresponding author

Correspondence to Lyn R Griffiths.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

MRG participated in DNA extractions, methodological development and data analysis. RAL participated in methodological development. PJ participated in methodological development and software design. PW, JW and PM participated in patient recruitment and sample processing. LRG participated in methodological development and manuscript writing. All authors read and approved the final manuscript.

Electronic supplementary material

12885_2009_1994_MOESM1_ESM.PL

Additional file 1: CHC Program. Perl script for automated analysis using the cohort heterozygosity comparison method. (PL 11 KB)

12885_2009_1994_MOESM2_ESM.TXT

Additional file 2: Instructions for use of CHC Program. Text File containing instructions for the use of the CHC Program. (TXT 1 KB)

12885_2009_1994_MOESM3_ESM.TXT

Additional file 3: Example Case Data. Text File containing genotype and annotation information for the extended B-CLL case cohort. (TXT 17 MB)

12885_2009_1994_MOESM4_ESM.TXT

Additional file 4: Example Control Data. Text File containing genotype and annotation data for the extended B-CLL control cohort. (TXT 17 MB)

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Green, M.R., Jardine, P., Wood, P. et al. A new method to detect loss of heterozygosity using cohort heterozygosity comparisons. BMC Cancer 10, 195 (2010). https://doi.org/10.1186/1471-2407-10-195

  • DOI: https://doi.org/10.1186/1471-2407-10-195

Keywords

  • Hidden Markov Model
  • Follicular Lymphoma
  • Enrichment Score
  • Single Nucleotide Polymorphism Array
  • Hidden Markov Model Method

Source: https://bmccancer.biomedcentral.com/articles/10.1186/1471-2407-10-195
