Microarray detects differences between hESC and hEB derived from them
To assess alterations in gene expression between hESC and hEBs, cells were cultured and induced to form EBs (Fig. 1). Total RNA was harvested from undifferentiated hESCs and from hEBs derived from them, and a time course of gene expression was performed to assess downregulation of known ES cell specific genes. Day 13 of differentiation was chosen as the time point for subsequent analysis because it showed a clear downregulation of known hESC markers and an upregulation of some early markers of differentiation (Fig. 2). Known undifferentiated hESC markers, including Oct4, Nanog and Esg1, showed reduced expression in hEB at day 13, while known markers of differentiation (SOX1, Nestin, and GATA4) showed a marked increase in expression. Several genes (Sox2, TERT, BCRP, Cx43, and Rex-1) did not show a detectable change between hESC and hEB cells. For microarray studies, each sample was compared to human universal reference RNA (huURNA; a mixture of total RNAs from a collection of adult human cell lines, chosen to represent a broad range of expressed genes and including both male and female donors) to maintain uniformity and allow comparisons across samples. cDNA from BG02 and pooled samples of hESC was labeled with Cy5 and huURNA with Cy3, and oligonucleotide arrays of ~17,000 features were hybridized. Similarly, cDNA from hEB derived from differentiated BG02-ESC (day 13 BG02 EB, day 21 BG02 EB) and from pooled WiCell EB (day 13) was labeled with Cy5 and huURNA with Cy3, and arrays were hybridized and the data analyzed. Because the process of differentiation is relatively stochastic and cell lines may behave differently as they differentiate, data from different experiments (with different samples) were not pooled but are reported as the expression from a single hybridization. Each sample was analyzed in duplicate, with the duplicates obtained from two independent cultures.
Overall microarray results between technical replicates were similar and representative images from each experiment are available (see Additional file-1, 2, 3, 4, 5). Microarray analysis showed that 11,000 of the ~17,000 features present on the array were detectable above background at a ≥ 150 minimum intensity and target pixels of one standard deviation (SD) above background (≥30) cutoffs. hEB and hESC expressed approximately 2400 to 3000 genes. Pooled WiCell ESC and BG02 ESC showed over expression of 2471 and 2843 genes at ≥ 2 fold respectively, compared to huURNA (Supplementary Table-1Sa and 1Sb, see additional file-6 and 7). As these cells differentiated to hEB, the number of total genes that were detected remained similar to hESC (supplementary Table 2S, see additional file-8).
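The detection and overexpression cutoffs described above can be sketched as a simple filter. This is a minimal illustration, not the analysis software used in the study: the `Feature` fields, the use of the Cy5 (sample) channel for the intensity cutoff, and the reading of the "≥30" criterion as a percentage of target pixels are all assumptions, and the example values are invented.

```python
from dataclasses import dataclass

MIN_INTENSITY = 150   # minimum signal intensity (from the text)
MIN_PIXEL_PCT = 30    # assumed meaning: % of target pixels >= 1 SD above background
MIN_FOLD = 2.0        # sample (Cy5) versus huURNA (Cy3) ratio (from the text)


@dataclass
class Feature:
    gene: str
    cy5: float        # sample channel intensity (hypothetical field)
    cy3: float        # reference channel intensity (hypothetical field)
    pixel_pct: float  # % of target pixels 1 SD above background (hypothetical field)


def is_detectable(f: Feature) -> bool:
    """A feature counts as detected when it passes both cutoffs."""
    return f.cy5 >= MIN_INTENSITY and f.pixel_pct >= MIN_PIXEL_PCT


def is_overexpressed(f: Feature) -> bool:
    """Detected and at least MIN_FOLD higher in the sample than in huURNA."""
    return is_detectable(f) and f.cy3 > 0 and f.cy5 / f.cy3 >= MIN_FOLD


# Invented example features, for illustration only.
features = [
    Feature("GENE_A", cy5=1200, cy3=300, pixel_pct=85),
    Feature("GENE_B", cy5=400, cy3=350, pixel_pct=60),
    Feature("GENE_C", cy5=120, cy3=20, pixel_pct=70),
    Feature("GENE_D", cy5=900, cy3=100, pixel_pct=10),
]

detected = [f.gene for f in features if is_detectable(f)]
overexpressed = [f.gene for f in features if is_overexpressed(f)]
print(detected, overexpressed)
```

Applying the detection filter before the fold cutoff keeps low-intensity features from producing large but unreliable ratios.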
For some experiments, ESC samples were pooled because we were interested in identifying differences between ESC and EBs that would be common to multiple lines. Further, as EB formation is variable, we reasoned that pooling would allow us to focus on large differences that would not be lost in the averaging process. Once we obtained results, we tested whether this was true by using a cell line obtained from a different provider and propagated under different culture conditions. In addition, since the purpose of the day 21 differentiation was to study which genes persisted and which were modulated as a result of further differentiation, we did not use day 21 pooled EBs but focused on results from a single line.
We have previously shown that six hESC lines (BG02, BG01, GE01, GE09, TE06 and PES cell lines from GE01, GE07 and GE09) express 92 genes in common at ≥ 3 fold levels when compared with huURNA [28]. We therefore examined the expression of these genes and, in addition, all genes overexpressed in BG02 and pooled hESCs (Table 1/see additional file-9) (supplementary Table 1Sa and 1Sb, see additional file 6 and 7). Eighty-seven of the 92 genes were also detected in the present samples, confirming the quality of the array hybridization and the fidelity of the samples used. In addition, the previous study identified several early markers of differentiation that were present at low levels in hESC and were upregulated when hESC were differentiated; this upregulation was confirmed by the EST enumeration technique [28]. The present study confirmed these earlier observations by microarray experiments (Table 2/see additional file 9). These differentiation genes included Keratin 8, Keratin 18, ACTC, and TUBB5. When we examined our current results, we found that these genes were also upregulated in hEB cells derived from BG02 and WiCell lines by microarray studies (Table 2/ see additional file 9). These results confirm the EST enumeration data and further support the quality of the hEBs used.
Overall, the technical replicates, the testing of the expression of known genes, and the ability to detect expected changes in the current samples confirmed the suitability of the current data set for additional analysis.
Modulation of "stemness" genes by ES cell differentiation
Previously, we identified a set of 92 genes that are expressed at high levels in most hESC and can be detected by microarray and EST enumeration [28]. To test whether changes in their expression could be used to monitor differentiation, we examined the relative levels of these genes as ES cells differentiated. Eighty-seven of the 92 genes expressed in all six hESC lines were also overexpressed in day 13 BG02-EB and day 13 pooled WiCell EB compared to huURNA (data not shown). However, as a result of differentiation of the BG02-ESC line, 77 of the 92 genes were down modulated in BG02 EB cells relative to undifferentiated BG02-ESC (Table 1/ see additional file 9). Among these, known ES cell markers, e.g., POU5F1, GTCM-1, LEFTB, Galanin, GJA1, TDGF1, SFRP2, FABP5 and Lin-28, showed a marked decline in expression compared to undifferentiated ES cells. Similarly, other ES cell specific markers such as CER1 and DNMT3B also showed a marked decline in expression in day 13 BG02 EB compared to ESC (Table 1/ see additional file 9). However, three zinc finger proteins did not show any significant change, and down modulation of Numatrin, C20orf129 and Laminin receptor was modest (Table 1/ see additional file 9).
Twelve novel genes that were overexpressed in all six hESC lines tested [28] were all down modulated by differentiation. Among them, MGC27165 (IFITM1), GSH1, PPAT, KIAA1573, TD-60, C20orf168 and ARL8 showed a marked decline in fold expression in BG02 EB compared to undifferentiated BG02 (Table 1/ see additional file 9), while changes in the other genes were modest.
Pooled EB samples of ES cells (WA01, WA07 and WA09) showed a gene expression profile similar to that observed in day 13 BG02-EB. Fifty-three of the 92 "stemness" genes were down modulated compared to undifferentiated WiCell ESC (Table 1/ see additional file 9). These 53 genes were a subset of the 77 that were markedly down modulated in BG02 derived EB compared to undifferentiated BG02-ESC (see above and Table 1/ see additional file 9). The ES specific genes POU5F1, LEFTB, FABP5, Galanin, GJA1, SFRP2, Nanog and TDGF1 showed ≥ 3 times lower expression in pooled WiCell EB compared to pooled WiCell ESC (Table 1/ see additional file 9). In addition, other ES cell specific markers such as DNMT3B, CER1, and SOX2 also showed down modulation in pooled WiCell EB. Novel genes such as PPAT, MGC27165, GSH1, ARL8, and KIAA1573 showed marked down modulation in pooled EB compared to pooled ES cells. However, zinc finger protein Znf257, laminin receptor, C20orf1 and C20orf129 showed a modest decline (Table 1/ see additional file 9).
Both pooled WiCell EB and day 13 BG02-EB showed an upregulation of genes that we had previously identified as being markers of differentiation that were present in detectable levels in undifferentiated hESC lines (Table 2/ see additional file 9). These included early differentiation markers e.g., KRT8, KRT18, TUBB5 and ACTC. Overall, the pattern of differentiation appeared similar with BG02 when compared to embryoid bodies derived from WiCell ESC.
Among the 92 genes, thirteen additional novel genes have been identified previously in hESC [28]. However, only 3 of the thirteen genes were present on this array, and their expression was analyzed. Our results show that Dppa4 was expressed in BG02 ES and pooled WiCell ES; its expression level was eight fold higher in day 13 BG02 EB compared to huURNA and was down modulated to 2 fold in day 21 BG02-EB. Similarly, pooled WiCell hESC showed 10 fold overexpression of Dppa4 compared to huURNA; however, upon differentiation to EB the expression decreased. In sharp contrast, claudin-6 was down modulated by about 10 fold from pooled WiCell ESC to EB but was not present at detectable levels in either BG02 ES or BG02 EBs.
In pooled EB (PEB), some genes that showed down modulation in BG02 EBs were in fact upregulated compared to ESC. The reason for this reverse pattern is not known. We did not check whether these genes were also upregulated in day 21 pooled EBs; our future studies are focused on addressing these issues. It is possible that the three hESC lines behaved differently when they were mixed and differentiated.
Overall, these data confirm that seventy-seven of the ninety-two genes we had identified as hESC specific could be used to assess the differentiation state of ES cells as they differentiate. A large number of these genes show a dramatic down modulation as early as Day 13 and this is seen in both WiCell and Bresagen (BG02) cell lines and can be readily assessed by microarray analysis.
Genes that are upregulated as ES cells differentiated to form embryoid bodies
To analyze genes that were upregulated as hESC differentiated, embryoid bodies were prepared as previously described and their quality analyzed as described above. Samples, which showed a clear downmodulation of ES cell specific genes, were subjected to microarray studies and their overall gene expression pattern compared with that of the undifferentiated population. Each individual experiment was analyzed separately and differences in gene expression observed were then compared. When BG02 hESC were compared with embryoid bodies derived from them a total of 333 new genes were expressed at ≥ 3 fold higher levels in day 13 BG02-EB compared to undifferentiated BG02 ESC (Supplementary Table 3S, see additional file-8). These genes included many genes known to be upregulated in EB cells and confirmed the quality of the array and the hybridization (see supplementary data). In addition, we identified numerous additional genes whose expression in hEBs had not been documented previously.
Since we were interested in identifying a core set of genes that are upregulated as ES cells differentiate, we compared the pattern of gene upregulation in a second, distinct population of ES cells (WiCell lines) grown under different conditions using protocols described previously [7, 31, 32]. Of the 333 genes that were upregulated in day 13 BG02 EB, 194 were also upregulated at ≥ 3 fold in WiCell derived EB (pooled EB) (Table 3/ see additional file 9). The remaining 139 genes were also upregulated but did not meet the rigorous three fold cutoff. Expression of these 139 genes will be examined by other techniques in future studies.
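The comparison described above amounts to intersecting two lists of genes that pass the same fold cutoff. The sketch below illustrates the idea with invented fold values. The dictionaries, the helper name `upregulated`, and the placeholder genes `GENE_X` and `GENE_Y` are hypothetical, and only the ≥ 3 fold cutoff comes from the text.

```python
FOLD_CUTOFF = 3.0  # EB versus undifferentiated ESC, as stated in the text


def upregulated(fold_by_gene, cutoff=FOLD_CUTOFF):
    """Return the set of genes whose fold change meets the cutoff."""
    return {gene for gene, fold in fold_by_gene.items() if fold >= cutoff}


# Invented fold changes (EB / ESC), for illustration only.
bg02_eb_vs_esc = {"HAND1": 12.0, "KRT19": 6.5, "COL1A2": 8.0,
                  "GENE_X": 4.1, "GENE_Y": 1.2}
wicell_eb_vs_esc = {"HAND1": 9.0, "KRT19": 3.2, "COL1A2": 5.0,
                    "GENE_X": 2.4, "GENE_Y": 1.1}

bg02_up = upregulated(bg02_eb_vs_esc)
wicell_up = upregulated(wicell_eb_vs_esc)

common = bg02_up & wicell_up      # passes the cutoff in both lines
bg02_only = bg02_up - wicell_up   # passes in BG02 EB, below cutoff in WiCell EB

print(sorted(common), sorted(bg02_only))
```

In the study's numbers, the first set corresponds to the 194 shared genes and the second to the remaining 139 (333 − 194) that were upregulated but fell below the three fold cutoff in the pooled WiCell EB.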
The common subset of genes, which showed upregulation at ≥ 3 fold in both BG02 EB and pooled WiCell EBs were classified by hierarchical clustering. As shown in Fig 3, all listed genes (a snap shot of 194 genes) were upregulated in EBs derived either from BG02 or WiCell lines and the expression levels can be readily distinguished from those in undifferentiated ESC. These results clearly show a marked difference in gene expression profile in differentiated compared to undifferentiated ES.
More detailed examination of the genes that were differentially expressed in ESC and EBs indicated that multiple signaling pathways are altered. BG02 EB showed overexpression of Keratin 19, Profilin-1, Fibronectin 1, HAND1, COL1A2, ZAK, COL4A2, BIRC7, NID2, TUBB5, TMSB4X, PLP2, ENO1 and COL5A2 (Supplementary table 4S, see additional file-8). Pooled WiCell EB also showed upregulation of similar genes, including HAND1, KRT19, Fibronectin 1, Profilin1, TMSB4X, Vimentin, Enolase, PLP2, COL4A2, CAPN1, NID2, IVL, ZAK, SPTA1 and COL1A2, which are related to cell differentiation or the cytoskeleton (Supplementary Table 5S/see additional file 8). Several genes related to cell signaling, cell growth, cell cycle and metabolic activities were uniquely identified (see supplementary table 3S or see additional file-8) and (Table 3/ see additional file 9). For example, Glypican-3 was overexpressed; it is a member of the glypican-related integral membrane proteoglycan family (GRIPS), contains a core protein anchored to the cytoplasmic membrane via a glycosyl phosphatidylinositol linkage, and plays a role in the control of cell division and growth regulation. Calreticulin (CALR), which can inhibit androgen receptor and retinoic acid receptor transcriptional activities in vivo as well as retinoic acid-induced neuronal differentiation, was overexpressed by 5 fold. Cyclin-dependent kinase inhibitor 1C (CDKN1C), an inhibitor of several G1 cyclin/CDK complexes (Cyclin-E-CDK2, Cyclin-D2-CDK4 and Cyclin-A-CDK2) and, to a lesser extent, of the mitotic cyclin-B-CDC2, was overexpressed in EBs. Finally, 26 hypothetical proteins, 2 zinc finger proteins and 9 unknown genes whose functions are still to be determined were also overexpressed in EBs (Supplementary Table 3S and 6S or see additional file-8) and (Table 4/ see additional file 9).
Overall, these results indicate that 194 genes identified as upregulated in two different EB cell lines may serve as early and sensitive markers to monitor ES cell differentiation.
Confirmation of gene expression profile by microarray, MPSS, EST-enumeration, RT-PCR and immunohistochemical analyses
To provide independent verification of the results, we utilized three different strategies. First, we compared gene expression patterns obtained using microarray studies with the MPSS data set generated by our laboratory using the pooled ES and EBs derived from them [30]. Second, we prepared duplicate samples of BG02 ES and BG02 derived embryoid bodies and subjected them to microarray analysis using a large scale oligonucleotide array based on a different, commercially available set of oligonucleotides (Agilent, Foster City, CA). Finally, we examined by RT-PCR a subset of genes that were not validated by either of these methods. Of the 194 genes that were uniquely overexpressed at ≥ 3 fold in both BG02 and WiCell derived EBs, 148 showed higher expression in an MPSS analysis of WiCell samples (data not shown and reference [33]). Overall, comparison of microarray results with MPSS showed a high concordance in gene expression profile. For example, known ES cell specific markers such as POU5F1, Galanin, DNMT3B, GJA1, LEFTB and TDGF1 showed higher expression in undifferentiated BG02 and PES cells compared to differentiated BG02 or PEB by both microarray and MPSS analyses (Table 5/ see additional file 9). Similarly, early ES differentiation markers, e.g., KRT8, KRT18, KRT19 and ACTC, and EB specific genes such as Vimentin, AFP, HAND1, and COL1A2 showed higher expression in EBs compared to ESC by both microarray and EST enumeration (Table 6 and Table 7/ see additional file 9).
When fold expression of 8 out of 9 unknown genes identified by microarray was compared with tpm level detected by MPSS, a similar pattern of gene expression was observed (Table 8/ see additional file 9). All of these genes showed higher expression in BG02-EB and WiCell derived EB compared to BG02-ES and WiCell ESC samples. Six of these genes were also confirmed by RT-PCR analysis (Fig. 5). The reliability of microarray results was further confirmed by RT-PCR analysis of 10 genes for hypothetical proteins, 2 for zinc finger proteins, 3 for unknown proteins and 7 genes that were highly expressed in embryoid bodies (Fig. 4, 5, 6). This similarity in the results confirmed the reliability of the microarray studies.
As forty-six of the 194 genes (Table 9/ see additional file 9) identified by microarray showed no significant difference by MPSS analysis, it was concluded that microarray and MPSS assays may be different in sensitivity or there was variability in the production of embryoid bodies. To address this issue and to determine if the differences observed by microarray were reliable, we used a second microarray platform to examine gene expression in the same samples. Interestingly 33 of the 46 genes showed overexpression in WiCell derived EBs using Agilent human 22 k oligo-arrays (Table 9/ see additional file 9). This suggested that different methods have different sensitivities and it is important to use multiple methods to confirm expression.
The expression of the remaining 13 genes, which were detected as overexpressed by the custom built microarray but not by MPSS or by Agilent microarrays, was tested by RT-PCR analysis (Fig 4). Of the thirteen genes, 9 were confirmed by RT-PCR analysis, further confirming the differential sensitivity of various arrays and other large-scale analytical methods.
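The stepwise validation of the 194 genes can be followed as a running tally. The short sketch below only restates the counts given in the text. It assumes, as the text describes, that each method was applied to the genes left unconfirmed by the previous one, and the variable names are illustrative.

```python
# Counts taken from the text: genes confirmed at each successive step.
steps = [
    ("MPSS", 148),
    ("Agilent 22k oligo-array", 33),
    ("RT-PCR", 9),
]

remaining = 194        # genes upregulated >= 3 fold in both EB populations
confirmed_total = 0
unconfirmed_after = []  # genes still unconfirmed after each step

for method, confirmed in steps:
    remaining -= confirmed
    confirmed_total += confirmed
    unconfirmed_after.append(remaining)
    print(f"{method}: {confirmed} confirmed, {remaining} still unconfirmed")

print(f"Confirmed by at least one method: {confirmed_total} of 194")
```

Under this reading, 46 genes remain after MPSS, 13 after the Agilent arrays, and 4 after RT-PCR, so 190 of the 194 genes were supported by at least one independent method.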
Gene expression profile of Day 21 BG02-EB
Our results identified a large set of genes that are differentially modulated as cells differentiate to form embryoid bodies over a period of two weeks. To examine whether the same set of genes could be used to assess differentiation of hESC at day 21 of differentiation, we examined gene expression using RNA prepared from day 21 embryoid bodies. For these studies, undifferentiated hESC (BG02-ESC) were used to generate EBs by a brief exposure to collagenase IV, and small clusters of cells were obtained by scraping with a pipette. ES cells were differentiated for 13 and 21 days, harvested to prepare total RNA, and analyzed by hybridization to microarrays. We found that a majority of the ESC specific genes that were down modulated after thirteen days of differentiation were further down modulated in day 21 BG02-EB (data not shown). Among the 194 genes overexpressed in day 13 BG02-EB and PEB, thirty-three showed a decrease in fold expression in day 21 BG02-EB. However, other than COL1A2, which was down modulated from 27 fold to 3 fold, none showed a marked decrease. The remaining genes were either not expressed or did not show any change in expression (data not shown).
We previously reported that eight early differentiation marker genes were expressed in hESC and were further upregulated in hEBs, as determined by EST enumeration [28]. Therefore, in this study we examined their expression at day 21 of differentiation. For this experiment, gene expression in BG02 ES was compared with that in day 13 BG02-EB and day 21 BG02-EB. We found that six of these genes were markedly down modulated in day 21 BG02-EB (Table 10/ see additional file 9) compared to day 13 EB and PEB. The other two genes showed minor or no significant change.
In addition, expression of 11 known EB specific genes that were shown to be overexpressed in both day 13 BG02-EB and pooled EB (supplementary table 4S and 5S or see additional file-8) was also examined in day 21 BG02-EB. Interestingly, all of them were down modulated in day 21 BG02-EB (Supplementary table 4S or see additional file-8).
Additionally, we examined the status of genes that were upregulated in day 21 EB but not in ES or day 13 BG02-EB. Genes related to cytokeratin and hair keratin (e.g., KRT17, KRT20, and IVL), which are responsible for structural integrity, showed higher expression in day 21 EB compared to BG02 ES or day 13 EB. In addition, genes related to mature tissues were exclusively upregulated in day 21 EB but not in ES and day 13 BG02-EB (Supplementary Table 4S or see additional file-8), indicating that additional differentiation markers can be used to distinguish early vs. late EBs.
While most genes followed an expected pattern of change in expression in day 21 EB, Nanog showed a reverse pattern of expression. On day 13 its expression level was significantly reduced compared to BG02-ES; on day 21, however, expression was increased compared to day 13 BG02-EB, and levels were almost similar to those in BG02-ESC samples (data not shown). Down modulation of Nanog in day 13 EB was confirmed by RT-PCR analysis (Fig 7), though its upregulation in day 21 samples did not match the RT-PCR results (see discussion). The downregulation of Nanog followed the downregulation of Oct3/4, TERT and other ES specific genes (Figure 7 and data not shown). In addition, although Sox2 showed a decrease in expression in pooled EB compared to pooled ESC, it did not show any change in expression pattern from BG02 ES to day 13 BG02-EB and day 21 BG02-EB (data not shown). Expression of Sox2 was also confirmed by RT-PCR and immunocytochemistry analyses (Fig. 7 and Fig. 8). Both analyses demonstrated that Sox2 is expressed in day 13 and day 21 EBs. Oct3/4 protein expression was used as a control; Oct3/4 was not detected in Sox2-expressing cells, confirming that the Sox2 expression represented induction in a newly differentiated population rather than in persistent ESC.
These results indicate that KRT8, KRT18, TUBB5, ACTC, SERPINH1, TUBB4, KRT19, HAND1, FN1, ENO1, COL1A2, COL5A2, COL4A2 and other identified markers may serve as indicators of early and late stages of differentiation, as they were further markedly down regulated on day 21 compared to day 13 of differentiation. In contrast, genes such as KRT17, KRT20, IVL, NPHP3, CAPN1, and CNTN6 are markers of a later stage of differentiation, as they are overexpressed on day 21 compared to day 13 of differentiation. Thus, microarray studies can distinguish between ES cells and embryoid bodies as well as between early and late stage embryoid bodies.
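The distinction drawn above between genes that decline from day 13 to day 21 and genes that rise by day 21 can be expressed as a simple comparison of fold values at the two time points. The following is a minimal sketch under stated assumptions: the function name, the two fold ratio threshold (`min_change`), and the example values for KRT17 and `GENE_Z` are hypothetical. Only the COL1A2 values (27 fold at day 13, 3 fold at day 21) come from the text.

```python
def stage_pattern(fold_d13, fold_d21, min_change=2.0):
    """Label a gene by how its fold expression changes from day 13 to day 21.

    min_change is a hypothetical ratio threshold, not a value from the study.
    """
    if fold_d13 <= 0 or fold_d21 <= 0:
        raise ValueError("fold values must be positive")
    ratio = fold_d21 / fold_d13
    if ratio <= 1 / min_change:
        return "higher at day 13"
    if ratio >= min_change:
        return "higher at day 21"
    return "no marked change"


# (day 13 fold, day 21 fold); only the COL1A2 pair is taken from the text.
examples = {
    "COL1A2": (27.0, 3.0),
    "KRT17": (1.0, 6.0),    # invented values
    "GENE_Z": (4.0, 3.5),   # invented placeholder gene
}

labels = {gene: stage_pattern(d13, d21) for gene, (d13, d21) in examples.items()}
for gene, label in labels.items():
    print(gene, label)
```

Using a ratio between the two time points, rather than two separate cutoffs, keeps genes with small fluctuations in the "no marked change" group.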