The hESC lines H1, H7 and H9 (WiCell, Madison, WI) were cultured on feeder layers derived from mitotically inactivated HS27 human fibroblast cells (ATCC) or mouse embryonic fibroblasts, or under feeder-free conditions on Matrigel-coated plates (BD, Franklin Lakes, NJ), for at least 10 passages. Culture medium for all cultures was composed of DMEM/F12-Glutamax 1:1, 20% Knockout Serum Replacement, 2 mM nonessential amino acids, 100 μM beta-mercaptoethanol, 50 μg/ml Pen-Strep (all from Invitrogen, Carlsbad, CA), and 4 ng/ml human recombinant basic fibroblast growth factor (bFGF/FGF2; PeproTech Inc., Rocky Hill, NJ). Feeder-free cultures were prepared for gene expression analysis by manually harvesting individual colonies with uniform, typical undifferentiated ESC morphology.
BG01 (46, XY), BG02 (46, XY), BG03 (46, XX), I6 (46, XY) and BG01V (BG01 karyotypic variant: 49, XXY, +12, +17): Cells were maintained for 3 (BG01V), 7 (BG02), 8 (BG01), or 21 (BG03) passages under feeder-free conditions on fibronectin-coated plates in medium that had been conditioned by mouse embryonic fibroblasts for 24 hours. Culture medium was DMEM/F12 (1:1) supplemented with 20% Knockout Serum Replacement, 2 mM non-essential amino acids, 2 mM L-glutamine, 50 μg/ml Pen-Strep, 100 μM beta-mercaptoethanol, and 4 ng/ml bFGF.
Different hESC lines were grown under slightly different culture conditions, as described above. H lines were grown on Matrigel-coated dishes, whereas BG lines were grown on fibronectin-coated dishes. Both coating substrata supported hESC growth similarly, as evaluated by colony morphology, immunocytochemistry and proliferation rate (data not shown).
Embryoid bodies (EBs) were prepared from BG lines as described in . Cells were aggregated and cultured on non-adherent substrata for fourteen days.
NTera2 cells were purchased from ATCC and cultured in parallel with hESC samples using protocols described previously. HS27 embryonic human newborn foreskin cells (ATCC CRL-1634) were grown in DMEM with 10% FBS.
All samples included in this study can be found in Additional file 6.
Bead array gene expression analysis
RNA was isolated from cultured cells using the Qiagen RNeasy kit (Qiagen, Inc., Valencia, CA). Samples were amplified from 100 ng of total RNA as input material by the method of Van Gelder et al., using the Illumina RNA Amplification kit (Ambion, Inc., Austin, TX) following the manufacturer's instructions. Labeling was achieved by incorporation of biotin-16-UTP (Perkin Elmer Life and Analytical Sciences, Boston, MA) present at a ratio of 1:1 with unlabeled UTP. Labeled, amplified material (700 ng per array) was hybridized to a pilot version of the Illumina HumanRef-8 BeadChip according to the manufacturer's instructions (Illumina, Inc., San Diego, CA). Hybridization signal was detected with Amersham Fluorolink streptavidin-Cy3 (GE Healthcare Bio-Sciences, Little Chalfont, UK) following the BeadChip manual. Arrays were scanned with an Illumina BeadArray Reader confocal scanner according to the manufacturer's instructions. Array data processing and analysis were performed using Illumina BeadStudio software.
Identification of differentially expressed genes and clustering analysis
Genes differentially expressed between ES and EB samples were identified by ANOVA at a p-value threshold of 0.05 using Bioconductor. Unsupervised hierarchical clustering analysis and principal component analysis (PCA) were conducted using the software packages Pcluster and TreeView.
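The per-gene ANOVA screen described above can be sketched as follows. This is a minimal standard-library illustration, not the Bioconductor code used in the study: the function names, the input layout, and the example values are hypothetical, and no multiple-testing correction is applied because none is stated in the text.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def one_way_anova(groups):
    """Return (F, p) for a one-way ANOVA over a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df1, df2 = k - 1, n - k
    if ss_within == 0.0:
        raise ValueError("zero within-group variance; F is undefined")
    f = (ss_between / df1) / (ss_within / df2)
    # Upper tail of the F distribution expressed through I_x(a, b).
    p = betainc(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))
    return f, p

def differentially_expressed(expression, alpha=0.05):
    """expression maps gene -> {class label -> list of expression values}."""
    return [gene for gene, by_class in expression.items()
            if one_way_anova(list(by_class.values()))[1] < alpha]

# Hypothetical values, for illustration only.
example = {
    "GENE_A": {"ES": [1.0, 2.0, 3.0], "EB": [11.0, 12.0, 13.0]},
    "GENE_B": {"ES": [1.0, 2.0, 3.0], "EB": [2.0, 3.0, 4.0]},
}
print(differentially_expressed(example))
```

With only two classes (ES and EB), this F test is equivalent to a two-sided pooled-variance t test on the same gene.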
Identification of diagnostic markers
PAM (prediction analysis of microarrays) was used to identify diagnostic markers among insulin pathway genes, using the Bioconductor software package. PAM is a class prediction method for expression data mining that provides a list of significant genes whose expression characterizes each diagnostic class. For each gene, the average expression level in each class (such as ES, EB, NS, and FB) was divided by the within-class standard deviation for that gene to give a standardized class centroid. The nearest-centroid classification computed by PAM takes the gene expression profile of a new sample and compares it to each of these class centroids; the class whose centroid is closest is the predicted class for that sample.
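The nearest-centroid idea behind PAM can be sketched in a few lines. This is an illustrative standard-library sketch and not the package used in the study. It assumes the commonly described nearest shrunken centroid formulation, with a pooled within-class standard deviation, the median of those deviations as the offset `s0`, and a class scaling factor of sqrt(1/n_k - 1/n). The function names, the threshold `delta`, and the example data are all hypothetical.

```python
import math
from statistics import median

def fit_shrunken_centroids(samples, labels, delta):
    """samples: list of per-sample expression lists; labels: class of each sample.
    Requires at least two classes and nonzero within-class variance."""
    classes = sorted(set(labels))
    n, p = len(samples), len(samples[0])
    members = {k: [s for s, y in zip(samples, labels) if y == k] for k in classes}
    overall = [sum(s[i] for s in samples) / n for i in range(p)]
    centroids = {k: [sum(s[i] for s in rows) / len(rows) for i in range(p)]
                 for k, rows in members.items()}
    # Pooled within-class standard deviation for each gene.
    sd = [math.sqrt(sum((s[i] - centroids[y][i]) ** 2
                        for s, y in zip(samples, labels)) / (n - len(classes)))
          for i in range(p)]
    s0 = median(sd)
    shrunken, shrunk_d = {}, {}
    for k in classes:
        m_k = math.sqrt(1.0 / len(members[k]) - 1.0 / n)
        shrunken[k], shrunk_d[k] = [], []
        for i in range(p):
            scale = m_k * (sd[i] + s0)
            d = (centroids[k][i] - overall[i]) / scale
            # Soft thresholding: move d toward zero by delta, clipping at zero.
            d_new = math.copysign(max(abs(d) - delta, 0.0), d)
            shrunk_d[k].append(d_new)
            shrunken[k].append(overall[i] + scale * d_new)
    priors = {k: len(members[k]) / n for k in classes}
    return {"centroids": shrunken, "d": shrunk_d, "sd": sd, "s0": s0, "priors": priors}

def predict(model, profile):
    """Assign a new expression profile to the class with the nearest centroid."""
    def score(k):
        dist = sum(((x - c) / (s + model["s0"])) ** 2
                   for x, c, s in zip(profile, model["centroids"][k], model["sd"]))
        return dist - 2.0 * math.log(model["priors"][k])
    return min(model["centroids"], key=score)

def selected_genes(model):
    """Indices of genes that survive shrinkage in at least one class."""
    p = len(model["sd"])
    return [i for i in range(p) if any(d[i] != 0.0 for d in model["d"].values())]

# Hypothetical data: three genes measured in three ES and three EB samples.
samples = [[10.0, 5.0, 7.1], [10.2, 5.2, 6.9], [9.8, 4.9, 7.0],
           [6.0, 5.1, 7.0], [6.2, 4.8, 7.2], [5.8, 5.1, 6.8]]
labels = ["ES", "ES", "ES", "EB", "EB", "EB"]
model = fit_shrunken_centroids(samples, labels, delta=1.0)
print(selected_genes(model), predict(model, [9.5, 5.0, 7.0]))
```

Genes whose shrunken differences are zero in every class drop out of the classifier, which is how a method of this kind yields a short list of candidate marker genes. In practice the threshold would be chosen by cross-validation rather than fixed by hand.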
RT-PCR and quantitative real-time PCR analysis
Total RNA was isolated with TRIzol (Invitrogen). cDNA was synthesized using 2.5 μg total RNA in a 20-μl reaction with Superscript II (Invitrogen) and oligo(dT)12–18 (Promega, Madison, WI). One microliter of RNase H (Invitrogen) was added to each tube, and the tubes were incubated for 20 minutes at 37°C before proceeding to RT-PCR analysis. The PCR primers were: RPS4Y-forward, 5' AGATTCTCTTCCGTCGCAG 3'; RPS4Y-reverse, 5' CTCCACCAATCACCATACAC 3'; EIFAY-forward, 5' CTGCTGCATCTTAGTTCAGTC 3'; EIFAY-reverse, 5' CTTCCAATCGTCCATTTCCC 3'. For quantitative real-time PCR, gene-specific primer pairs and probes were purchased from Applied Biosystems (Foster City, CA) for the following genes: MMP3 (Hs00233962_m1), TNFRSF11B (Hs00171068_m1), THBS1 (Hs00170236_m1), KRTHA4 (Hs00606019_gH), and, as an internal control, β-actin (ACTB, Hs99999903_m1).
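The text does not state how the quantitative PCR data were converted to relative expression. One common approach when an internal control such as ACTB is measured alongside each target is the comparative Ct (2^-ΔΔCt) calculation, sketched below. The function name and all Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both target and control.

```python
def relative_expression(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Fold change of a target gene in a sample relative to a calibrator sample,
    each normalized to an internal control gene (comparative Ct method)."""
    delta_ct_sample = ct_target - ct_control              # normalize the sample
    delta_ct_calibrator = ct_target_cal - ct_control_cal  # normalize the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: a target gene and ACTB in a test sample and a calibrator.
fold = relative_expression(ct_target=24.0, ct_control=18.0,
                           ct_target_cal=26.0, ct_control_cal=18.0)
print(fold)
```

A lower Ct means more template, so a sample whose normalized Ct is two cycles below that of the calibrator corresponds to a fourfold higher relative expression under these assumptions.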