Introduction

Patients from a Swedish family have been reported to suffer from weakness and atrophy of the distal limb and axial trunk muscles.1 Three patients (III:1, III:5 and IV:14) in addition developed a myopathy of the right ventricle, dilatation and arrhythmias in conjunction with an atrioventricular block or a sick sinus syndrome and fulfilled the diagnostic criteria of arrhythmogenic right ventricular cardiomyopathy (ARVC).2 First clinical symptoms of muscle weakness were noticed during the third to sixth decade of life. Subsequent manifestations and the clinical course of the disease varied between affected members of the family. As the family encompassed 21 members at various ages, a definite diagnosis of myofibrillar myopathy (MFM) could only be established in 10 patients.1

The histopathological studies of affected skeletal muscle biopsies from seven patients revealed focal disorganization of myofibrils, accumulation of granulofilamentous material and/or deposition of multiple proteins such as desmin, dystrophin and γ-sarcoglycan in myofibrils. Although rimmed vacuoles were frequently seen in most biopsies, the 15–21 nm long filaments typical for inclusion body myositis were not identified. One patient died of right-sided heart failure at the age of 59 and was examined postmortem. The endomyocardium of the right ventricle showed degeneration of cardiomyocytes and severe fibrofatty tissue replacement without inflammation.

Since the pathological muscle phenotype was detected only in late adulthood, but before the age of 60, most healthy family members have been assigned an unknown status. Considering a genetic defect in 18 previously mapped candidate regions, the authors found co-segregation with one of these loci on 10q22.3 and obtained pairwise LOD scores of 2.73 and 2.76 for the markers D10S2327 (80.4 Mbp (million base pairs)) and D10S1752 (78.0 Mbp), respectively, and a multipoint LOD score of 3.06 at locus D10S2327. One affected member of the family, III:2, showed recombination at D10S605 (78.9 Mbp), another affected member, IV:15, showed recombination at D10S215 (89.5 Mbp). These two markers were, therefore, reported as the proximal and distal boundaries of the disease interval, which measures 10.6 Mbp on the current sequence map of the human genome.

More than 10 years ago, the 10q21-q23 region was already inferred as a candidate region for cardiomyopathies, because it harbored a high number of genes related to the structure and functions of striated muscles.3 Later, a 4.6-Mbp long interval between D10S1730 (78.6 Mbp) and D10S1696 (83.2 Mbp) was reported to co-segregate with autosomal dominant dilated cardiomyopathy (CMD1C) in a three-generation family with 12 affected members. On the basis of the current sequence map4 for human chromosome 10 (see Deloukas et al4), the co-segregating interval for the CMD1C family appeared to overlap with the MFM/ARVC7 locus. Despite considerable efforts and sequencing of candidate genes from the 4.6-Mbp interval including DLG5, PPIF and ANXA11, no significant changes in the coding regions were so far identified in members of the CMD1C family.5

In this study, we re-examined the previous linkage data for ARVC7 by genome-wide single nucleotide polymorphism (SNP) analyses to evaluate the proposed linkage to D10S2327 and D10S1752. We narrowed the critical region on 10q22.3 to a 4.27-Mbp segment carrying only 18 protein-encoding candidate genes. Seventeen of the 18 candidate genes are shared with the CMD1C region, strongly suggesting that mutations in a single gene are responsible for relatively diverse clinical phenotypes.

Materials and methods

DNA preparation

The 13 family members included in our study underwent neurological and cardiological investigations at Uppsala University Hospital as described. Informed consent was obtained from each person before participating in this study. Peripheral blood samples were previously taken to extract genomic DNA from nucleated blood cells. We generated an EBV-transformed B-cell line from peripheral blood mononuclear cells by standard procedures. Briefly, 10 × 10⁶ peripheral blood mononuclear cells were suspended in 2.5 ml medium (RPMI-1640 supplemented with 10% FCS) and 2.5 ml EBV-containing supernatant of the cell line B95-8 was added. After 2 h, 5 ml medium containing cyclosporine A (1 μg/ml) was added. The cell line obtained was stimulated with lipopolysaccharide for 24 h, or phorbol ester (PMA) for 24 h. Cell lines that contained subsets of human chromosomes for patient IV:14 were generated by cell fusion with an engineered murine recipient cell line.6 Stable hybrid cell lines carrying only one of the two chromosome 10 homologs of patient IV:14 were selected and expanded using chromosome 10 specific polymorphic markers (GMP Genetics Inc., Waltham, MA, USA).

Genotyping

We genotyped DNA samples from 10 affected and 3 nonaffected family members using the Affymetrix GeneChip Human Mapping 10K 2.0 Array (version Xba 142). This version of the 10K array comprises a total of 10 204 SNPs with a mean intermarker distance of 258 kb, equivalent to 0.36 cM. Genotypes were called by the GeneChip DNA Analysis Software (GDAS v2.0, Affymetrix). We verified sample genders by counting heterozygous SNPs on the X chromosome. Relationship errors were evaluated with the help of the program Graphical Relationship Representation.7 The program PedCheck was applied to detect Mendelian errors8 and data for SNPs with such errors were removed from the data set.
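The two simplest quality checks in this paragraph can be illustrated with a minimal sketch. It is not the algorithm used by GDAS or PedCheck; the genotype encoding, the SNP names and the example calls are all hypothetical, and the Mendelian check as written applies to autosomal SNPs in a father-mother-child trio only.

```python
from typing import Dict, Tuple

Genotype = Tuple[str, str]  # two alleles per SNP; order carries no meaning


def count_heterozygous(genotypes: Dict[str, Genotype]) -> int:
    """Count SNPs whose two alleles differ."""
    return sum(1 for a, b in genotypes.values() if a != b)


def is_mendelian_consistent(father: Genotype, mother: Genotype,
                            child: Genotype) -> bool:
    """True if the child can carry one allele from each parent (autosomal SNP)."""
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


# Hypothetical X-chromosome calls for one sample (SNP names are placeholders).
x_calls = {"snp_x1": ("A", "A"), "snp_x2": ("B", "B"), "snp_x3": ("A", "B")}
het_x = count_heterozygous(x_calls)

# Hypothetical autosomal trios: the second child carries an allele
# that neither parent has, which flags a Mendelian error.
ok_trio = is_mendelian_consistent(("A", "B"), ("A", "A"), ("A", "B"))
bad_trio = is_mendelian_consistent(("A", "A"), ("A", "A"), ("A", "B"))
```

A male sample would be expected to show few or no heterozygous calls on the X chromosome outside the pseudoautosomal regions, so a high count in a sample recorded as male would point to a sample mix-up.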

Nonmendelian errors were identified by using the program MERLIN8 and unlikely genotypes for related samples were deleted. Nonparametric linkage analysis using all genotypes from a chromosome simultaneously was carried out with MERLIN. Parametric linkage analysis was performed by a modified version9, 10 of the program GENEHUNTER 2.1 through stepwise use of a sliding window with sets of 90 SNPs. Haplotypes were reconstructed with GENEHUNTER 2.1 and presented graphically with HaploPainter.11 This latter program also reveals informative SNP markers as points of recombination between parental haplotypes. All data handling was performed using the graphical user interface ALOHOMORA12 developed at the Berlin Gene Mapping Center to facilitate linkage analysis with chip data. PCR primers for dinucleotide and tetranucleotide repeat markers were taken from the Genome Database (www.gdb.org). Product sizes of the PCR assays were determined by capillary electrophoresis on a MegaBACE 1000 system (Amersham). The following markers were used to determine haplotypes and recombination break points on 10q22.3 and 2q37: D10S1704, D10S1645, D10S1151, AFM094yh3, D10S517 (UT920), D10S1437 (GATA91D01), AFMa344yc5, D10S201, D10S1777, D10S1761, D10S1696, D2S341, D2S2340, D2S427, D2S2193, D2S1765, D2S2344, D2S206, D2S2176, D2S1279, D2S331 and D2S2348.
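The stepwise analysis in sets of 90 markers can be pictured as splitting the ordered SNP list of a chromosome into consecutive blocks. The sketch below assumes nonoverlapping blocks, as stated in the legend of Figure 2; how the modified GENEHUNTER version treats block boundaries is not described here, and the function and marker names are placeholders.

```python
from typing import List, Sequence


def snp_windows(markers: Sequence[str], size: int = 90) -> List[List[str]]:
    """Split an ordered marker list into consecutive, nonoverlapping sets."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(markers[i:i + size]) for i in range(0, len(markers), size)]


# Hypothetical chromosome with 200 ordered SNPs.
markers = [f"snp{i:03d}" for i in range(200)]
windows = snp_windows(markers)
window_sizes = [len(w) for w in windows]  # two full sets and one partial set
```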

Bioinformatics of the MFM/ARVC7 region

The genomic sequence of the critical region became available during the course of this study.4 Genes were annotated manually on the basis of various mammalian cDNA sequences (ESTs) matching the working draft sequences of region-specific BAC clones in BLASTN searches. Expression patterns of the annotated genes were examined using Unigene EST collections and GNF SymAtlas version 1.0.3.

Mutation screening

PCR primers were designed to amplify all coding exons and adjacent intron–exon borders of the 17 candidate genes and the first exon of the neuregulin 3 gene with the help of ExonPrimer (http://ihg.gsf.de/ihg/ExonPrimer.html) or Primer3 (http://frodo.wi.mit.edu/). The PCR primers used to amplify genomic DNA from MFM/ARVC7 patients for each exon and adjacent intron boundaries are available on request. Amplified exons from at least one patient were sequenced directly in both directions using the BIG DYE dideoxy-terminator chemistry (Perkin Elmer) on an ABI 377 DNA sequencer (PE Applied Biosystems). Chromas 1.5 software (Technelysium) and GCG package computer programmes (Wisconsin) were used to examine and compare raw sequence data. To exclude intragenic microdeletions, exons were also amplified from hybrid cell DNA carrying only the affected chromosome 10 of patient IV:14 (cell line no. 38) and the normal maternally inherited chromosome 10 (cell line no. 1). To verify the presence of two large tandem repeat copies spanning RP11-506M13 (GenBank accession AC068139) and RP11-589B3 (GenBank accession BX248123), respectively, on both chromosome 10 homologs, the primers DJ3044 (5′-GGGAAATCCACAGYGACTTA-3′) and DJ3045 (5′-GTGCTTCACATCCCCTCATC-3′) were used to amplify two almost identical subfragments of these duplicons that differed by 28 bp in length. Amplification of the 16 exons of the ZASP gene,13 exon 2 of myotilin14 and the metavinculin-specific exon of vinculin,15 which are known myopathy genes, was performed as described. The promoter region of the PPIF gene spanning the putative TATA box at nucleotides −105 to −110 upstream of the ATG initiation codon16 was amplified with the primers DJ3063 (5′-ATCATGGGAGGTGTCTCAGC-3′) and DJ3064 (5′-CGGGGAGAACACAGAACTCA-3′) between positions −646 and −29 and sequenced in 5′–3′ direction.
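Primer DJ3044 contains the ambiguity code Y, which in the standard IUPAC nomenclature stands for C or T, so the oligonucleotide is a mixture of two sequences. The following sketch simply expands that code; the helper name is illustrative and the lookup table is restricted to the symbols needed here.

```python
from itertools import product
from typing import List

# Subset of the IUPAC nucleotide codes: plain bases plus Y (pyrimidine) and R (purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_primer(seq: str) -> List[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


dj3044 = "GGGAAATCCACAGYGACTTA"  # primer sequence as given in the text
variants = expand_primer(dj3044)
```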

Results

Genotyping and linkage analyses

Only those family members whose clinical status had been determined with high reliability have been selected and are shown in Figure 1. Hence, our genome-wide linkage analysis includes only eight clinically affected individuals with positive muscle histology (III:2, III:3, III:5, III:9, IV:2, IV:9, IV:14 and IV:15), two clinically affected individuals with only myopathic changes in the electromyograms (EMGs) (III:6 and IV:7) and one healthy individual, III:4, last examined by a neurologist in 2005 at the age of 69. This latter individual did not show signs of distal or axial muscle weakness, signs of ARVC on the electrocardiograms or a pathological EMG. In addition, we analyzed the genotypes of two healthy spouses, III:1b and III:5b, related to the affected family members by marriage. We did not evaluate the genotype of the healthy individual II:9, since her deceased sister II:8 may well be the first affected person in this pedigree. In total, the disease locus of II:8 was passed through 11 meioses to 10 affected offspring, who were available for analyses.

Figure 1

Partial family pedigree with 13 family members used for genome-wide SNP typing. Black squares and circles indicate affected men or women, respectively, and slashes indicate deceased individuals for whom no DNA samples were available. Only one healthy individual showing no signs of myopathy beyond the age of 69 (III:4) was included in our analyses. Individuals marked by dashed squares were only diagnosed on the basis of clinical signs and electromyographic findings; histomorphological findings were unavailable for one affected member (III:6) and normal in one patient (IV:7) with axial weakness.

Figure 2 shows the parametric (LOD, upper panel) and nonparametric (Z-all) scores over the entire genome for the reduced, but most informative family subtree. The maximum LOD score of 3.01 that is predicted for this subtree is only obtained for the chromosome 10q22.3 region. Of further interest are two other regions showing high LOD scores on chromosome 2q37 and 14q22. Between rs699662 and rs1113193 on chromosome 2 the genotypes of all family members co-segregate with the disease, but the SNP genotyping data were not informative for patient IV:2. We therefore analyzed several dinucleotide repeat markers between D2S341 and D2S2344 for eight family members and found that the haplotype of patient IV:2 for this interval was not shared with the other affected family members. The SNP genotyping data for the 14q22 region between rs2356915 (chr.14: 50.99 Mbp) and rs1886460 (chr.14: 54.17 Mbp) indeed agreed for all affected family members, but the only healthy member of this family, III:4, also carried the same haplotype. Full co-segregation between the disease and genotypes of the 13 family members was only observed for the chromosome 10q22.3 locus as previously suggested. According to SNP genotyping, the disease-linked interval extends from rs625587 (chr.10: 78.81 Mbp, close to D10S605: 78.92 Mbp) to rs951203 (chr.10: 84.04 Mbp, close to D10S1786: 83.92 Mbp). This interval is contained within the region D10S605-D10S215 (chr.10: 78.92–89.46 Mbp), which was previously reported to harbor the genetic defect for MFM. Using additional dinucleotide repeat markers, we were able to further pinpoint the proximal and distal boundaries of the disease interval. The proximal boundary is defined by D10S1645 (chr.10: 79.65 Mbp), the distal boundary by D10S1786 (chr.10: 83.92 Mbp). The most proximal nonrecombinant markers were D10S1677 (chr.10: 79.88 Mbp) and AFM094yh3 (chr.10: 79.98 Mbp) and the most distal nonrecombinant marker was rs2207782 (chr.10: 83.75 Mbp).
All genotypes determined for additional markers in this interval (UT920, D10S1437, D10S219 and D10S1761) fully support our SNP linkage data and previous linkage results with D10S201 (chr.10: 80.70 Mbp) and D10S2327 (chr.10: 80.38 Mbp). Previous interpretation of genotyping data for D10S1432 (chr.10: 74.33 Mbp) and D10S1752 (chr.10: 78.00 Mbp) was incorrect with regard to marker placements (D10S1752) and phase assignments (D10S1432 and D10S1752). The critical region for MFM/ARVC7 is 4.27 Mbp long and is almost completely contained within the CMD1C interval. It shares the distal 3.57 Mbp of the CMD1C interval, but does not contain ZASP (LDB3) and metavinculin (VCL).
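The interval sizes quoted above follow directly from the marker positions. As a small check, the sketch below recomputes them from the NCBI35 coordinates given in the text, taking the CMD1C boundaries as D10S1730 (78.60 Mbp) and D10S1696 (83.22 Mbp); the helper names are illustrative.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (proximal, distal) position in Mbp


def length(interval: Interval) -> float:
    """Interval length in Mbp, rounded to the precision used in the text."""
    return round(interval[1] - interval[0], 2)


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Shared segment of two intervals, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


# Boundaries on chromosome 10 as quoted in the text (NCBI35).
mfm_arvc7 = (79.65, 83.92)  # D10S1645 to D10S1786
cmd1c = (78.60, 83.22)      # D10S1730 to D10S1696

critical_len = length(mfm_arvc7)
shared = overlap(mfm_arvc7, cmd1c)
assert shared is not None
shared_len = length(shared)
unshared_len = round(critical_len - shared_len, 2)  # distal part outside CMD1C
```

The part of the MFM/ARVC7 region that lies outside the CMD1C interval is thus the distal 0.70 Mbp between D10S1696 and D10S1786.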

Figure 2

Nonparametric Z-all values and multipoint LOD scores for the MFM/ARVC7 pedigree using GENEHUNTER 2.1. The y axis depicts the respective scores for nonoverlapping windows spanning 90 contiguous SNPs. The high nonparametric Z-all value on chromosome 14 is not confirmed by the corresponding LOD score, as the healthy individual III:4 shares the same haplotype with all affected relatives.

Physical map of the MFM/ARVC7 locus

We analyzed the nascent BAC sequences that became available during the course of this study, constructed physical maps with these sequences and genetic markers and extracted the coding regions of all genes by comparisons with the mouse, rat and dog genomes and with ESTs from various vertebrates. The sequence map released recently for chromosome 10 (May 2004 assembly, NCBI35) still possesses a sequence gap (Figure 3) in the middle of the disease interval.4

Figure 3

Linkage of MFM/ARVC7 to 10q22.3. (a) Section of the linked region with a 300-kb tandem duplication containing the surfactant protein genes A1 and A2 and a member of the FAM22 genes. Length variation within these duplicons (horizontal arrows) permitted us (b) to exclude a potential intrachromosomal deletion by paralogous recombination within these duplicons using IV:14-derived DNA from the two monochromosomal hybrid cell lines no. 1 (nonaffected haplotype inherited from III:5b) and no. 38 (disease-associated haplotype inherited from III:5). The sequence gap is indicated by a vertical arrow and the position of the amplimers by an open and a filled diamond in panels a and b. (c) Comparative marker map of the CMD1C and ARVC7 region showing the overlap between both disease loci. (d) Simple repeat and SNP markers (left column) referenced in the paper and their location on chromosome 10 in million base pairs (Mbp) according to the NCBI35 assembly of the human genome.

While we also did not manage to close it, this gap between AL132656 and AC068139 appears to be relatively small. Its estimated size is about 10 kb,4 and it represents the junction between two highly conserved tandem repeats of more than 300 kb. Each repeat contains a functional copy of the surfactant protein A1 and A2 genes. The latter highly conserved genes evolved by local tandem duplication and form a local inverted repeat within these larger 300 kb duplicons. In contrast to the human region, the corresponding syntenic segment of the dog genome is completely sequenced, displays a much simpler arrangement of genes and fully supports the number and location of all orthologous human genes.

Candidate genes for MFM/ARVC7

The repertoire of human candidate genes for MFM/ARVC7 comprises 17 genes and the first exon of neuregulin 3 (NRG3) (see Table 1). This set of genes and physical arrangement is also found in the dog genome with minor, but striking differences. The PLAC9, ANXA11, C10ORF57, SFTPD and SFTPA segment is inverted in the dog genome. Although not yet annotated, the PLAC9 gene is clearly conserved in the dog genome. The SFTPA gene family is represented by a single-copy gene in the dog, and hence the four human copies of the SFTPA family must be the result of recent duplications in the primate lineage. In a first step, the SFTPA ancestor was most likely duplicated by a segmental inversion (ancestral SFTPA1 and SFTPA2 genes). This inverted tandem repeat was then copied as a whole to form the present-day four-gene cluster (SFTPA1, SFTPA2, SFTPA1B and SFTPA2B) of the human genome.

Table 1 Functional candidate genes for ARVC7/MFM

Mutation screening of all candidate genes

All coding exons of the 17 genes in the MFM/ARVC7 critical region and the first exon of the neuregulin 3 (NRG3) gene were subjected to sequence analysis. In total, we amplified and sequenced 111 exons, including the exons of the identical SFTPA1/SFTPA1B and SFTPA2/SFTPA2B gene pairs, which were co-amplified and sequenced together. No family-specific sequence alteration was discovered.

Since mutations in the ZASP gene and in exon 2 of the myotilin gene were previously detected in 16 and 9% of MFM patients, respectively (n=68),17 we also searched for mutations in these sequences and in another gene from chromosome 10, called metavinculin15 which is associated with dilated cardiomyopathy. Disease-associated mutations in these three genes, however, were not found. These negative results are corroborated by our genome-wide linkage data, which excluded metavinculin and ZASP at 10q22 and myotilin at 5q31.

Exclusion of microdeletions

We considered the possibility of intragenic microdeletions of one or more adjacent exons in the candidate region. To this end, we generated monochromosomal hybrid cell lines carrying only one of the two chromosome 10 homologs for patient IV:14. Amplification of each exon from both hybrid cell lines (no. 38 carrying the disease haplotype and no. 1) was achieved with the same number of PCR cycles indicating that deletions of the amplified segments were not present in our family. This approach clearly has limitations and is, for example, not able to exclude deletions in noncoding regulatory segments, local inversions or tandem duplications.

Discussion

Some years ago, linkage of ARVC7 (OMIM entry 609160) to 10q22.3 in a Swedish family was inferred1 with the assumption that ARVC7 was allelic with cardiomyopathy, dilated 1C (CMD1C).5 Here we present the genome-wide linkage results for this family, which suffers from both ARVC and MFM. Linkage calculations were based on genotype data from 11 family members: one healthy individual beyond the age of 69 (III:4), eight patients with a positive muscle histology and a pathological EMG, and two patients who only had a pathological EMG (III:6 and IV:7) and were either not biopsied (III:6) or had no histopathological findings of the biopsied muscle (IV:7). Although we had no reason to doubt the clinical disease status and EMG diagnosis of the latter two patients, we re-analyzed our SNP genotyping data with the assumption that one of these two family members developed similar clinical symptoms, but was not affected by the genuine, inherited form of MFM. With the assumption that III:6 and IV:7 were wrongly classified, the highest peak of genome-wide LOD scores was still obtained for 10q22.3, but did not reach the maximum of 3.01, since both individuals, III:6 and IV:7, shared the same genotype with other affected members in this region. The remaining smaller LOD score peaks were below 2.2 and were not consistent with the affection status of at least two other family members.
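The maximum of 3.01 is consistent with what a simple two-point calculation gives for ten fully informative meioses without recombination, since 10 × log10(2) ≈ 3.01. The sketch below illustrates this under strong simplifying assumptions: phase-known meioses, full penetrance and no phenocopies. It is not the multipoint calculation performed by GENEHUNTER, whose penetrance and allele-frequency settings are not given here, and the count of eight meioses in the second call is purely illustrative.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    n, k = n_meioses, n_recombinants
    if not 0 <= k <= n:
        raise ValueError("recombinants must lie between 0 and n_meioses")
    if theta == 0.0:
        if k > 0:
            return float("-inf")  # a recombinant excludes theta = 0
        log_l_theta = 0.0
    else:
        log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination
    return log_l_theta - log_l_null


# Ten informative meioses, no recombinants, theta = 0.
max_lod = round(two_point_lod(10, 0, 0.0), 2)

# Illustrative only: with two meioses treated as uninformative the ceiling drops.
reduced_lod = round(two_point_lod(8, 0, 0.0), 2)
```

Each informative, nonrecombinant meiosis adds about 0.30 to the score, which is why reclassifying even one or two affected individuals keeps the peak at the same locus but lowers its height.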

While we already excluded the most intriguing candidate genes on chromosome 10q22, namely ZASP (cypher)13, 17, 18 and metavinculin,15 by this genome-wide linkage analysis, we confirmed our negative results by sequencing all ZASP exons and the metavinculin-specific exon of the vinculin gene. With the exception of CMD1C, our genome-wide screen thus excluded all known skeletal myopathy- and cardiomyopathy-associated loci for the Swedish family. The critical MFM/ARVC7 region almost completely overlapped with the previously reported CMD1C interval. The only candidate exon not shared with the CMD1C region is exon 1 of neuregulin 3 (NRG3) at the distal boundary. In comparison to our MFM family, the critical disease interval for the CMD1C family is significantly larger in the centromeric direction and ranges from D10S1730 (position 78.60 Mbp, NCBI35) to D10S1696 (position 83.22 Mbp). Therefore, four additional genes, KCNMA1, DLG5, POLR3A and RPS24, have to be considered as a potential cause of CMD1C besides the 17 candidate genes shared by both disease regions.

Several circumstances support our hypothesis that CMD1C and ARVC7 are caused by allelic defects in the same gene. Clinical manifestations in both families show a relatively narrow tissue distribution and are restricted to muscular tissues, to cardiomyocytes in one family and to heart and skeletal muscles in the second family. Functional defects are relatively mild and unfold in these families in late adulthood. A growing number of examples have been published indicating that mutations in the same gene (eg desmin, myotilin, ZASP) can lead to different clinical phenotypes and manifestations, to desmin-related myopathies, dilated cardiomyopathies or ARVC. The fact that some negative data on PPIF and ANXA11 (ref. 5) and no positive findings have been published for CMD1C over the last years supports our suspicion that unusual mutations of regulatory sequences in one of the 17 shared candidate genes are the cause for the slow degeneration of myocytes in both families.

Five candidate genes encoding surfactant proteins, SFTPA1, SFTPA1B, SFTPA2, SFTPA2B and SFTPD,4 are expressed exclusively in the lung and respiratory tract as secretory proteins and components of the lung surfactant proteolipid complex and also serve an antimicrobial function of the innate immune system. Particular alleles of the SFTPA1 gene are susceptibility factors for the respiratory distress syndrome (OMIM no. 267450) and idiopathic pulmonary fibrosis (OMIM no. 178500). Two additional proteins, encoded by MAT1A and DYDC1, are also restricted to non-muscle tissues. MAT1A is exclusively expressed in the liver, DYDC1 in the testis (GNF SymAtlas, http://www.gnf.org). MAT1A mutations have been reported in humans and result in plasma hypermethioninemia, which, however, does not damage cardiac and skeletal muscles. On the basis of our current knowledge, it is unlikely that mutations in one of these seven genes can generate a muscle-restricted disease as observed in our family. Additional families with ARVC, DCM or MFM should be screened for linkage to this relatively small genomic segment and, once identified, such additional patients may help to unravel pathological mutations in one of the 10 most likely candidates.