OBJECTIVE: Cocaine use disorders (CUDs) represent a major public health problem in many countries.
To better understand the interaction between the environmental modulations and phenotype,
the aim of the present study was to investigate the DNA methylation pattern of CUD
patients, who had concomitant cocaine and crack dependence, and healthy controls.
METHODS: We studied DNA methylation profiles in the peripheral blood of 23 CUD patients and 24 healthy control subjects using the Illumina Infinium HumanMethylation450 BeadChip arrays.
RESULTS: Comparison between CUD patients and controls revealed 186 differentially methylated positions (DMPs; adjusted p-value [adjP] < 10-5) related to 152 genes, with a subset of CpGs confirmed by pyrosequencing. DNA methylation patterns discriminated CUD patients and control groups. A gene network approach showed that the EHMT1, EHMT2, MAPK1, MAPK3, MAP2K1, and HDAC5 genes, which are involved in transcription and chromatin regulation cellular signaling pathways, were also associated with cocaine dependence.
CONCLUSION: The investigation of DNA methylation patterns may contribute to a better understanding of the biological mechanisms involved in CUD.
Keywords: Cocaine; crack cocaine; DNA methylation; drug dependence; epigenetics.
|Received: March 13 2018; Accepted: October 23 2018|
Cocaine is an alkaloid extracted from Erythroxylon coca and Erythroxylon novogranatense, two plant species that grow mainly in South America and whose use has been reported for over 5,000 years.1 However, over the last 30 years, abuse of and dependence on cocaine and its smoked form, known as crack, emerged as a major public health problem in many countries.2 Brazil is one of the greatest consumer markets for cocaine worldwide (3.2 million people), and the world's largest consumer of crack (1.8 million people).3,4
Crack has become a drug of choice due to its low price and more intense effects compared to other drugs; its onset of action is immediate and it provides higher cocaine concentration levels (40-85%) compared to other routes, such as snorted, injected, and inhaled.5,6 Since crack and other similar products simply represent different formulations of the same substance (cocaine), they exert similar effects in the body, with differences arising from the route of administration. The main effects of cocaine and crack in the brain involve modulation of neurotransmitters from the mesolimbic-mesocortical system by blocking dopamine reuptake in the synaptic cleft. This results in euphoria and a feeling of wellbeing, which may lead the user to repeat the experience and, therefore, increase vulnerability to drug dependence.1,6
Drug dependence behavior results from complex interactions between genetic and environmental factors, and the study of epigenetics has been suggested as a potential method of better understanding these mechanisms.7 Epigenetics modulate gene expression throughout life in response to environmental exposures, such as drug consumption. Different epigenetic approaches have been used to elucidate this interaction, such as DNA methylation and histone modifications, which are associated with synaptic plasticity and memory formation, as well as with behavioral responses to drugs of abuse.8,9 DNA methylation machinery in brain regions related to reward undergo changes after cocaine exposure. These include the enzymes that catalyze DNA methylation (DNMTs) and demethylation (TETs), both able to induce stable changes in neuronal function and behavior, which are directly linked to chromatin remodeling and epigenetic modulation.9,10
Changes in genome-wide DNA methylation have been explored successfully in other psychiatric disorders, such as bipolar disorder, schizophrenia, major depression, and posttraumatic stress disorder,11,12 as well as to evaluate the correspondence between DNA methylation in brain tissue and accessible peripheral tissues (e.g., blood).13,14 Within this context, we researched DNA methylation patterns in patients with cocaine use disorder (CUD) and controls, aiming to explore this biological mechanism potentially involved with drug dependence by in silico analyses.
Samples were selected from the DNA Bank of the Departamento e Instituto de Psiquiatria, Faculdade de Medicina da Universidade de São Paulo (FMUSP), São Paulo, Brazil. Genomic DNA extracted from peripheral blood was selected from 23 CUD patients (originally, 24 patients were selected, but one of them was identified as being 47, XXY during the analysis and therefore excluded from the study) and 24 healthy controls, all males ranging between 23 and 29 years of age. Cases were recruited and evaluated as inpatients and outpatients from seven drug-dependence treatment clinics in São Paulo. Individuals with another psychiatric diagnosis (such as psychosis) or any chronic organic illness, such as diabetes or other metabolic disorders, were excluded. All current cocaine users then took part in a structured interview to collect data on sociodemographic characteristics, sexual behaviors, and a drug-use profile. Interviews were conducted by three researchers (experienced psychiatrists or psychologists) who had been trained to use the questionnaire. All cases met ICD-10 criteria for diagnosis of cocaine dependence and had 8-to-10-year histories of concomitant cocaine and crack dependence as well as use/abuse of other drugs, such as cannabis, solvents, heroin, and tobacco. The control group consisted of volunteer blood donors who completed a short questionnaire designed to screen for contagious diseases and use of any kind of drug. Volunteers with a history of abuse or recent contact with any potential drugs of abuse, contagious diseases, or who had any known current psychiatric condition or lifetime history of psychiatric hospitalization were not selected for this study (Table 1). A detailed description of the above groups can be found elsewhere.15,16 The study was approved by the local research ethics committee (protocol 143/13) and conducted in accordance with the Declaration of Helsinki.
|Cases (n=23)||Controls (n=24)||p-value*|
|23||7 (30.4)||7 (29.2)||0.9999|
|24||4 (17.4)||4 (16.7)|
|25||2 (8.7)||2 (8.3)|
|26||1 (4.3)||1 (4.2)|
|27||2 (8.7)||3 (12.5)|
|28||3 (13.0)||3 (12.5)|
|29||4 (17.4)||4 (16.7)|
|White||13 (56.5)||13 (54.2)||0.8710|
|African||10 (43.5)||11 (45.8)|
|Primary education or less||19 (82.6)||10 (41.7)||0.0039|
|Secondary education||4 (17.4)||14 (58.3)|
|Other drugs used|
Data presented as n (%).
* Chi-square test (99% confidence interval).
Genome-wide DNA methylation
Genomic DNA was extracted from peripheral blood using the "salting out" method. A total of 500 ng of DNA was bisulfite-converted using the EZ DNA methylation kit (Zymo Research, Irvine, California, USA), following the manufacturer's instructions. Converted DNA was denatured, neutralized, and amplified by isothermal polymerase chain reaction (PCR) for 16 hours (16 cycles: 95 °C, 30 seconds; 50 °C, 60 minutes; 4 °C, 10 minutes). DNA was hybridized using the Illumina Infinium HumanMethylation450 BeadChip arrays (450K) (Illumina Inc, San Diego, California, USA), following the Illumina Infinium HD methylation protocol. Arrays were then scanned with the iScan Microarray Scanner (Illumina), and IDAT files were extracted using GenomeStudio software (version 1.8). Data were submitted to the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) (GSE77056).
DNA methylation analyses
Methylation data was analyzed in ChAMP (Chip Methylation Analysis Pipeline)17 and RnBeads18 packages, both available in Bioconductor (https://www.bioconductor.org/). ChAMP quality control steps used a threshold of p > 0.01 to remove 1,261 probes with poor intensity detection values. Next, we removed 281 probes represented by less than three beads and 37,301 probes located in single nucleotide polymorphisms (SNPs) (dbSNP database) or that aligned at multiple locations.19 These pre-processing steps left 446,669 probes to be analyzed in 47 samples. To account for the different contributions of DNA methylation levels due to the heterogeneity underlying the cell composition of the blood, we applied cellular correction using the estimateCellCounts function, which inferred the distribution of the white blood cells (CD8+ and CD4+ T cells, B cells, monocytes, granulocytes, and natural killer cells).20,21 The beta-mixture quantile (BMIQ) normalization method was used to adjust the intensity distribution of the Infinium II probes to fit that of the Infinium I probes.22
For RnBeads, the Greedycut algorithm removed unreliable measurements for 813 probes (p > 0.05), probes that were associated with SNPs (4,713 probes), as well as those located in specific contexts (non-CpG positions, 3,140 probes). The background was corrected with the noob method, which is based on a normal-exponential convolution using out-of-band probes,23 resulting in 476,911 probes in 47 samples. Signal intensities from type I and II probes were normalized using the SWAN method, which adjusts intensities based on a quantile approach.24
Comparison between cases and controls using the ChAMP and RnBeads packages was applied to M-values (logit of B-values) employing an empirical bayesian framework linear model from limma.25,26 Positions with an adjusted p-value (adjP) < 10-5 as given by the Benjamini & Hochberg method, presenting the same methylation direction, were considered as differentially methylated positions (DMPs). We also used RnBeads to identify methylation differences at predefined genomic regions; to do so, we combined the uncorrected CpG-specific p-values within a given region using an extension of Fisher's method that resulted in a single aggregated p-value for each region, also adjusted for multiple testing correction using the false discovery rate (FDR).18 We considered differentially methylated regions (DMRs) as those presenting an adjP of < 0.05.
To verify the methylation pattern in cases and controls, a hierarchical cluster analysis was applied to the M-values from DMPs using Ward's method, Euclidean distance with 1,000 resampling, and z-score normalization as parameters, as implemented by the fpc package.27
Gene-specific CpG methylation was evaluated by pyrosequencing. Primers were designed using PyroMark Assay Design SW 2.0 software (Qiagen, Hilden, Germany) to amplify the bisulfite-treated DNA of six DMPs located in the C11orf58 (cg10633981), HDAC5 (cg11049075), MAP2K1 (cg24449302), MPV17L (cg09844907), LOC401431 (cg17479131) genes and in an intergenic region (IGR) (cg19933985). Primers and PCR conditions are shown in Table 2. PCR was performed using the PyroMark PCR Kit (Qiagen) and PyroMark Q24 system (Qiagen), following the manufacturer's instructions. For each assay, the software included at least one control dispensation to adjust signal over background noise and verify the efficiency of bisulfite conversion. Methylated and unmethylated bisulfite-converted human control DNA (EpiTect PCR control DNA, Qiagen) was used as positive and negative controls.
|Reverse: 5´-ACTTCCTAATCTATCAATTCAATATCATAA-3´ - Biotin||58|
|cg19933985||IGR||Forward: 5´-TTGGAATTTAGGTGAAGATGTT-3´ - Biotin||56|
|Reverse: 5´-CCCAAATAACTACAAAAACAAAAATTTACT-3´ - Biotin||59|
|Reverse: 5´-CTTTCCACACTCCCACTC-3´ - Biotin||57|
|Reverse: 5´-CCTTCTACCTCTATAACTTTACCTATTCT-3´ - Biotin||60|
|Reverse: 5´-ATATAAACCTCTTAAAACAAACTACACT-3´ - Biotin||57|
Tm = temperature (annealing).
For each CpG position, methylation percentages provided by pyrosequencing were calculated as the ratio of C to C + T, as implemented in the PyroMark Q24 2.0.7 software (Qiagen) and normalized by positive controls. Mann-Whitney U tests were applied to identify differences between case and control groups, considering significant those CpG positions with p < 0.05. Correlations between genome-wide and pyrosequencing data were analyzed with linear regression using GraphPad Prism version 7.03 (GraphPad for Science, San Diego, CA, USA).
Genes associated with DMPs were submitted to WebGestalt28 to identify enriched protein-protein interaction (PPI) modules using the Biological General Repository for Interaction Datasets29 (BioGRID), considering significant modules with adjP < 0.05 as given by the Benjamini-Hochberg test. The random walk method and Gene Ontology (GO) enrichment analysis based on the retrieved sub-network (FDR < 0.05) were used to identify hyperrepresented PPI modules.28 We also used MetaCoreT software with standard parameters (https://portal.genego.com/) (p < 0.05, hypergeometric distribution; z score > 45) to further explore enriched functional annotation of DMPs in a gene network context.
Identification of changes in DNA methylation in the peripheral blood of cocaine use disorder (CUD) patients
Genome-wide DNA methylation differences between 23 CUD patients and 24 healthy controls were identified using the ChAMP & RnBeads package analysis. ChAMP revealed 249 DMPs, with 108 hypermethylated and 141 hypomethylated (adjP < 10-5), and RnBeads showed 361 DMPs, with 69 hypermethylated and 292 hypomethylated (adjP < 10-5). The overlap between both methods resulted in 186 DMPs related to 152 genes, all in the same direction, with 61 hypermethylated and 125 hypomethylated CpG positions. According to the distribution of the positions in relation to their genomic location and CpG characterization, most of the DMPs are located in promoter regions (5´UTR, TSS200, and TSS1500) (82-44%) and in a CpG island (83-45%) (Table S1, available as online-only supplementary material). Thus, we verified the existence of DMRs, identifying 118 DMRs: four located in CpG islands, 48 in promoters, 37 in gene bodies, and 29 in genome-wide tiling regions (Table S2, available as online-only supplementary material). Five genes were common to DMP and DMR analyses: KLHL9, WDR61, OSMR, SNORD94, and ARF4. A hierarchical clustering analysis applied to M-values showed that methylation patterns from the 186 DMPs discriminated two groups of samples (cluster stability [boot mean]), which completely separated CUD patients from healthy controls (Figure 1).
Correlations between genome-wide and pyrosequencing data
To confirm the methylation levels of selected DMPs, pyrosequencing was carried out for six CpG positions: five located next to genes and one in the IGR. Four CpGs were chosen, as they presented the highest methylation differences: cg10633981 (C11orf58), cg09844907 (MPV17L), cg17479131 (LOC401431), and cg19933985 (IGR), as well as two additional CpGs located at genes previously related to cocaine dependence (cg11049075 - HDAC5, and cg24449302 - MAP2K1).30,31 Methylation levels of three CpG positions were correlated between B-values and pyrosequencing data (HDAC5, MAP2K1, and cg19933985) (Table 3).
|?ß||Pyrosequencing||Mann-Whitney U test||Linear regression|
|cg10633981||C11orf58||-0.167||-0.166||0.043*||< 0.0001||0.319||< 0.0001|
|cg19933985||IGR||-0.145||-0.148||-0.072||< 0.0001||0.675||< 0.0001|
Characterization of genes associated with differentially methylated CpG positions
The 152 genes related to DMPs were submitted to PPI networks modules enrichment analysis, which identified a module of genes involved with G1/S transition of the mitotic cell cycle, histone deacetylase complex, and RNA polymerase II core promoter proximal region sequence-specific DNA binding (C = 1329; O = 29; E = 10.48; R = 2.77; p = 3.35E-07; FDR = 2.91E-04). To better understand the importance of the biological process associated with these genes, 137 out of the 152 were mapped on BIOGRID, with the network prioritization showing 28 genes directly connected (Figure 2A). We noted that, considering gene methylation status, the distribution was not random (p = 0.087): 23 genes were hypomethylated and five were hypermethylated in CUD patients compared to controls. There was enrichment for 138 GO biological process categories (Table S3, available as online-only supplementary material). Ninety-one genes were used as seeds in the expanded sub-network composed of 101 genes (Figure 2B). Using the expanded network, 59 GO biological process categories were enriched (Table S3). Both prioritization and expanded networks presented biological processes related to neurogenesis, histone modification, regulation of signal transduction, regulation of transcription, and regulation of the ERK1/ERK2 cascade and MAPK pathway. In addition, enriched networks identified by MetaCoreT contained genes mainly involved with cell differentiation (p = 1.68E-21; z score: 47.86), macromolecule biosynthesis (p = 3.91E-26; z score: 55.11), and metabolic process (p-value: 1.36E-23; z score: 50.5). CHD7, SLC6A20, ASCL1, PAX5, and TULP3 are also related to neurogenesis and regulation of nervous system development (Figure 3). The 28 genes associated with the 48 DMRs located in promoter regions were enriched mainly for the PI3K-Akt signaling pathway (C = 341; O = 4; E = 0.38; R = 10.52; p = 3.01E-04; FDR = 2.62E-02).
Chronic exposure to cocaine and crack interferes with the brain reward system by several mechanisms, including disruption of DNA methylation dynamics.10 We explored changes in DNA methylation in the peripheral blood of CUD patients, aiming to understand the potential contribution of this mechanism to this disorder. Hierarchical clustering based on the methylation patterns of DNA extracted from peripheral blood discriminated cases from controls. Pyrosequencing analysis confirmed the methylation levels identified on the genome-wide DNA methylation in three out of six (50%) genes analyzed. Our results showed that most CpGs were hypomethylated in CUD patients compared to controls. A similar pattern was previously observed in the nucleus accumbens (NAc) in rodent models, with global DNA hypomethylation after cocaine administration.32 It is important to note that, among the 28 most connected genes, 23 (82.1%) were hypomethylated and only five (17.9%) were hypermethylated; however, two CpGs (CBPT1, MAPK3) were located in the gene body, and one CpG (HDAC5) was located in the 5´UTR, within the shore region.
Cocaine use results in long-term adaptive effects in the central nervous system by changing neuronal structure and function. This neural activity may modulate DNA methylation patterns.8,33 Wright et al. observed upregulation of DNMT3A/B after cocaine administration and drug-seeking behaviors in rodents, thus connecting (at least in part) cocaine use to the DNA methylation process.32 Here, enrichment of the network resulted mainly in biological pathways already known to be related to cocaine use, such as transcription and chromatin regulation, histone activity, and cell cycle (neurogenesis and neuronal differentiation). These findings are consistent with the literature, but here we highlight DNA methylation as one of the mechanisms involved in these biological pathways.
Regarding the genes related to chromatin regulation and histone activity pathways, EHMT1 and EHMT2 encode histone methyltransferases involved in transcriptional repression and, in the mouse brain, are associated with prolonged cocaine exposure.34,35HDAC5 encodes a histone deacetylase that alters the chromosome structure, affecting the access of transcription factors to DNA. It maintains brain homeostasis, which is disrupted in several disease conditions.33,36 Animal-model studies have reported an association of HDAC5 with adaptations to cocaine exposure, due its involvement in behavioral transition from drug experimentation to compulsive drug use.30,33MEF2 is a possible mediator of HDAC5 function, as it reduces cocaine reward sensitivity after repeated cocaine exposures,30 and MEF2D, a survival-related transcription factor,37 is connected to HDAC5 in gene-gene interaction analysis.
The extracellular signal-related kinase (ERK) pathway was also enriched in the network analysis, represented by CTBP1, TAL1, and FBXW7. ERK pathway activity mediates changes in neuronal activities as a consequence of cocaine use, which could lead to epigenomic changes.10CTBP1 and TAL1 are related to cellular and metabolic processes. CTBP1 is downregulated in cerebellar granule neurons that induces apoptosis,38 whereas TAL1 is a transcription factor necessary for the specification of the blood formation during development,39 with a role in the MEK/ERK pathway.40FBXW7, a gene involved in cell proliferation and differentiation, is also involved in the regulation of ERK1/2 MAP kinase signaling.41
We also identified methylation changes in CpG positions from MAPK1, MAPK3, MAP2K1, and OSMR, all involved in the MAPK pathway. This pathway is activated by increases in extracellular dopamine in the striatum in response to cocaine abuse, and plays an important role in gene regulation by direct activation of transcription factors and chromatin remodeling.42 These data corroborate our analysis, as we identified hypomethylation in the promoter region of MAPK1, which might increase expression in drug-dependent individuals. MAPK3 showed a hypermethylated CpG in the gene and higher expression in dorsolateral prefrontal cortex.31MAP2K1 had a hypomethylated CpG located in the CpG island at the gene promoter, suggesting that this CpG could be associated with the higher expression levels found in the prefrontal cortex of cocaine and crack dependents.31 Moreover, STMN1 and UBE2V2 presented CpG hypermethylation in the gene body and in the gene promoter, respectively, and are known to be downregulated in the hippocampus of cocaine dependents. HIPK3 and LMTK2 reported hypomethylation in a CpG located in the CpG island and gene body, respectively, and are upregulated in cocaine dependents.43
Comparing our data with those of other DNA methylation studies that evaluated use/abuse of other substances, such as tobacco, we found similarities in 30 differentially methylated genes (adjP < 10-5): STX1A, ATP11A, ZNRF3, MYO10, UNG, PCDH9, TGFBR3, CARS2, CTBP1, PSMB8, and SNX21 from Zeilinger et al.44; PDLIM7, ATP11A, ADCK4, MPI, CARS2, VWF, RCAN3, PSMB8, PRKAG2, PHF17, SDK1, CBX7, PCDH9, CDC42BPB, ABCC1, and PAX5 from Joehanes et al.45; and PCDH9, TRIO, and ATP11A from Wilson et al.46 Among these genes, we highlight CTBP1 (described above), PSMB8, and PCDH9. PSMB8 is a gene involved in the ERK1/2 and PI3K/AKT signaling pathways,47 which can be activated by cocaine intake and have enhanced activity in reward-related brain areas.48PCDH9, which is a gene found in all of the aforementioned studies, has a role in cell adhesion and recognition and is associated with protective effects against cocaine-induced cytotoxicity in the mouse brain.49
We identified two important genes by both DMP and DMR analyses: OSMR, involved in the MAPK pathway as described above50; and ARF4, involved in protein activity and required for mammalian development. Both have been related to cocaine exposure response in the NAc of rats.51 Four genes associated to 48 DMRs located in promoters were enriched for Pi3K signaling pathway: PIK3CB, CCNE2, OSMR, and CDKN1B. Of these, CDKN1B has been previously reported as involved with gene regulation in the NAc after chronic cocaine exposure.51,52 Comparison of genes associated with 48 DMRs located in promoters with tobacco DNA methylation studies showed overlap for three genes: EHD1, PITPNB,44 and NRARP.45
Altogether, genes related to the ERK1/2 cascade, MAP kinase pathway, Pi3K signaling pathway, neurogenesis, and neuronal differentiation - already known to be related to cocaine and crack dependence - were found to be differentially methylated in peripheral blood. Moreover, the gene network analysis also suggested that these genes could be involved with the underlying pathology of CUD.
This study has some methodological limitations. Drug dependence classification was recently updated in the newest version of the DSM (DSM-5). Unlike in the DSM-IV and ICD system (ICD-10 and its updated version, ICD-11), which assume different diagnostic criteria for substance abuse and substance dependence, the DSM-5 includes both substance dependence and substance abuse in a dimensional classification spectrum.53 Although cases had to meet the ICD-10 criteria for cocaine and crack dependence to be included in this study, we used the current nomenclature proposed by DSM-5, i.e., cocaine use disorder (CUD), throughout the paper. Cases also used and abused other substances (e.g., tobacco, cannabis, solvents, and heroin); however, cocaine/crack was the only drug abused by all participants in the CUD group. Although we controlled the methylation analysis for associations with traits and co-variables, there is a possibility that substance comorbidities among the CUD group interfered with methylation evaluation, because of the difficulty of controlling these variables since controls did not use any substance. Another limitation is the small sample size; to mitigate this, our sample selection criteria (including years of dependence, age group, only males) sought to ensure the best possible homogeneity in each group, reducing biases and individual interferences. Considering these limitations, our findings should be explored in other groups, such as women, other age ranges, and other substances, with proper controls for these are properly controlled in the analysis.
In conclusion, the present study found significant differences in peripheral-blood DNA methylation patterns between CUD patients and controls, suggesting this approach is relevant and can contribute to a better understanding of the biological mechanisms involved in CUD. In addition, our data suggest that DNA methylation changes in the peripheral blood may be used to identify the CUD vulnerability phenotype. However, future investigations using this approach in independent samples are needed to explore these findings, both in peripheral tissues and, particularly, in brain regions involved with CUD.
The authors thank Bianca Lisboa and Thaís Chile for technical assistance. Funding was provided by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; grant 2013/00659-2) and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES; PROEX 1229245 to CC). MM and ACT were supported by FAPESP grants 2015/06281-7 and 2014/00041-1, respectively.
The authors report no conflicts of interest.