@HD VN:1.4 SO:unsorted
@RG ID:18548_4#3 PL:ILLUMINA PU:160102_HS32_18548_A_C80JPANXX_4#3 LB:15400314 DS:EGAS00001001625: Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will Consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ DT:2016-01-02T00:00:00+0000 SM:EGAN00001350268 PG:BamIndexDecoder CN:SC
@PG ID:SCS PN:HiSeq Control Software DS:Controlling software on instrument VN:2.2.68
@PG ID:basecalling PN:RTA PP:SCS DS:Basecalling Package VN:1.18.66.3
@PG ID:Illumina2bam PN:Illumina2bam PP:basecalling DS:Convert Illumina BCL to BAM or SAM file VN:V1.17 CL:uk.ac.sanger.npg.illumina.Illumina2bam INTENSITY_DIR=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities BASECALLS_DIR=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BaseCalls LANE=4 OUTPUT=/dev/stdout READ_GROUP_ID=18548_4 SAMPLE_ALIAS=EGAN00001350266,EGAN00001350267,EGAN00001350268,EGAN00001350269,EGAN00001350270,EGAN00001350271,EGAN00001350272,EGAN00001350273,EGAN00001350274,EGAN00001350275 STUDY_NAME=EGAS00001001625: Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will Consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ PLATFORM_UNIT=160102_HS32_18548_A_C80JPANXX_4 COMPRESSION_LEVEL=0 GENERATE_SECONDARY_BASE_CALLS=false PF_FILTER=true LIBRARY_NAME=unknown SEQUENCING_CENTER=SC PLATFORM=ILLUMINA BARCODE_SEQUENCE_TAG_NAME=BC BARCODE_QUALITY_TAG_NAME=QT ADD_CLUSTER_INDEX_TAG=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
@PG ID:bamadapterfind PN:bamadapterfind PP:Illumina2bam VN:2.0.19 CL:bamadapterfind level=0
@PG ID:BamIndexDecoder PN:BamIndexDecoder PP:bamadapterfind DS:A command-line tool to decode multiplexed bam file VN:V1.17 CL:uk.ac.sanger.npg.picard.BamIndexDecoder INPUT=/dev/stdin OUTPUT=/dev/stdout BARCODE_FILE=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/metadata_cache_18548/lane_4.taglist METRICS_FILE=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/18548_4.bam.tag_decode.metrics VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=false BARCODE_TAG_NAME=BC BARCODE_QUALITY_TAG_NAME=QT MAX_MISMATCHES=1 MIN_MISMATCH_DELTA=1 MAX_NO_CALLS=2 CONVERT_LOW_QUALITY_TO_NO_CALL=false MAX_LOW_QUALITY_TO_CONVERT=15 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false
@PG ID:spf PN:spatial_filter PP:BamIndexDecoder DS:A program to apply a spatial filter VN:v10.23 CL:/software/solexa/pkg/pb_calibration/v10.23/bin/spatial_filter -c -F pb_align_18548_4.bam.filter -t /nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/qc/tileviz/18548_4 --region_size 200 --region_mismatch_threshold 0.0160 --region_insertion_threshold 0.0160 --region_deletion_threshold 0.0160 pb_align_18548_4.bam ; /software/solexa/pkg/pb_calibration/v10.23/bin/spatial_filter -a -f -u -F pb_align_18548_4.bam.filter -
@PG ID:bwa PN:bwa PP:spf VN:0.5.10-tpx
@PG ID:BamMerger PN:BamMerger PP:bwa DS:A command-line tool to merge BAM/SAM alignment info in the first input file with the data in an unmapped BAM file, producing a third BAM file that has alignment data and all the additional data from the unmapped BAM VN:V1.17 CL:uk.ac.sanger.npg.picard.BamMerger ALIGNED_BAM=pb_align_18548_4.bam INPUT=/dev/stdin OUTPUT=18548_4.bam KEEP_EXTRA_UNMAPPED_READS=true REPLACE_ALIGNED_BASE_QUALITY=true VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true ALIGNMENT_PROGRAM_ID=bwa KEEP_ALL_PG=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false
@PG ID:SplitBamByReadGroup PN:SplitBamByReadGroup PP:BamMerger DS:Split a BAM file into multiple BAM files based on ReadGroup. Headers are a copy of the original file, removing @RGs where IDs match with the other ReadGroup IDs VN:V1.17 CL:uk.ac.sanger.npg.picard.SplitBamByReadGroup INPUT=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/18548_4.bam OUTPUT_PREFIX=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/lane4/ VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false
@PG ID:AlignmentFilter PN:AlignmentFilter DS:Give a list of SAM/BAM files with the same set of records and in the same order but aligned with different references, split reads into different files according to alignments. You have option to put unaligned reads into one of output files or a separate file VN:V1.17 CL:uk.ac.sanger.npg.picard.AlignmentFilter INPUT_ALIGNMENT=[/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/lane4/18548_4#3.bam] OUTPUT_ALIGNMENT=[/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/18548_4#3_phix.bam] OUTPUT_UNALIGNED=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/18548_4#3.bam METRICS_FILE=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/18548_4#3.bam_alignment_filter_metrics.json VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false
@PG ID:bammarkduplicates2 PN:bammarkduplicates2 CL:/software/solexa/pkg/biobambam/2.0.19/bin/bammarkduplicates2 I=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/18548_4#3.bam O=/dev/stdout tmpfile=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/8J3mjpf77y/ M=/nfs/sf46/ILorHSany_sf46/analysis/160102_HS32_18548_A_C80JPANXX/Data/Intensities/BAM_basecalls_20160107-153944/no_cal/archive/lane4/18548_4#3_mk.markdups_metrics.txt PP:AlignmentFilter VN:2.0.19
@PG ID:scramble PN:scramble PP:SplitBamByReadGroup VN:1.14.6 CL:/software/solexa/pkg/scramble/1.14.6/bin/scramble -I bam -O cram
@PG ID:scramble.1 PN:scramble PP:bammarkduplicates2 VN:1.14.6 CL:/software/solexa/pkg/scramble/1.14.6/bin/scramble -I bam -O cram