@HD VN:1.4 SO:coordinate @RG ID:1#83 PL:ILLUMINA PU:130623_HS15_10138_B_D21KJACXX_8#83 LB:7488496 DS:EGAS00001000407: Multifocality or multicentricity in breast cancer may be defined as the presence of two or more tumor foci within a single quadrant of the breast or within different quadrants of the same breast, respectively. This original classification of the breast cancer as multicentric or multifocal was based on the assumption that cancers arising in the same quadrant were more likely to arise from the same ductal structures than those occurring in separate areas of the breast. The problem with these definitions is that the "quadrants" of the breast are arbitrary external designations, as no internal boundaries do exist. This project will therefore focus both on synchronous multifocal and multicentric tumors. The incidence of multifocal and multicentric breast cancers was reported to be between 13 and 75% depending on the definition used, the extent of the pathologic sampling of the breast and whether in situ disease is considered evidence of multicentricity (1). Although this incidence is variable, those figures show that it is a frequent phenomenon. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic management of the cancer. Multifocality or multicentricity has been associated with a number of more aggressive features including an increased rate of regional lymph node metastases and adverse patient outcome when compared with unifocal tumors (2-3), and a possible increased risk of local recurrence following breast conserving surgery (4). For the moment, the literature is divided on whether there is a corresponding impact on survival outcomes. 
Today, the current convention to stage and to treat multifocal and multicentric tumors is the classical tumor-node-metastasis (TNM) staging guidelines with which tumor size is assessed by the largest tumor focus without taking other foci of disease into consideration. If some papers, as the recent one from Lynch and colleagues, support the current staging convention (3), others, however, as Boyages et al. suggested that aggregate size and not the size of the largest lesion should be considered in order to refine the prognostic assessment of those tumors (5). On the top of that, the question whether multifocal/multicentric carcinomas are due to the spread of a single carcinoma throughout the breast or is due to multiple carcinomas arising simultaneously has been a matter of debate. Some studies suggested that multifocal breast cancer may result from either intramammary spread from a single primary tumor or multiple synchronous primary tumors; whereas others suggest that multiple breast carcinomas always arise from the same clone (6-8). Recently, Pietri and colleagues analyzed the biological characterization of a series of 113 multifocal/multicentric breast cancers (8) which were diagnosed over a 5-year period. The expression of estrogen (ER) and progesterone (PgR) receptors, Ki-67 proliferative index, expression of HER2 and tumor grading were prospectively determined in each tumor focus, and mismatches among foci were recorded. Mismatches in ER status were present in 5 (4.4%) cases and PgR in 18 (15.9%) cases. Mismatches in tumor grading were present in 21 cases (18.6%), proliferative index (Ki-67) in 17 (15%) cases and HER2 status in 11 (9.7%) cases. Interestingly, this heterogeneity among foci has led to 14 (12.4%) patients receiving different adjuvant treatments compared with what would have been indicated if we had only taken into account the biologic status of the primary tumor. 
This study therefore showed that differences in biological characteristics of multifocal/multicentric lesions play a crucial role in the adjuvant treatment decision making process. In this study, we will concentrate on a larger series of patients with multifocal invasive ductal breast cancer lesions. We aim at: 1. Evaluating the incidence of multifocality according to the different breast cancer molecular subtypes (ER-/HER2-, HER2+, ER+/HER2-). 2. Evaluating the incidence of multifocality in patients with hereditary breast cancer disease (presence of germline BRCA1 or BRCA2 mutations). Moreover, we would like to investigate if multifocal lesions with BRCA1 or BRCA2 mutations exhibit a characteristic combination of substitution mutation signatures and a distinctive profile of deletions as demonstrated recently by Nik-Zainal and colleagues (9). 3. Correlating multifocality with clinical information in order to define its influence on patients' survival (DFS and OS). 4. Carrying high coverage targeted gene sequencing of driver cancer genes and genes whose mutation is of therapeutic importance in order to compare clinically-relevant genetic differences between several multifocal breast cancer lesions. 5. Evaluating the impact of the distance between the different lesions on the clinical outcome but also on the genetic differences. 6. Comparing gene expression patterns between several multifocal breast cancer lesions and correlate them with the results of the targeted genes screen. 7. Characterizing the genomic and transcriptomic status of cancer related genes in metastatic lesions (local recurrence, positive lymph node or distant metastatic sites) from the same multifocal invasive ductal breast cancer patients in order to evaluate the consequence of genomic and transcriptomic heterogeneity of multifocal lesions on metastatic lesions. 
Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic choice. This project has the potential to identify genetic/transcriptomic differences existing between several lesions constituting multifocal breast cancers, which in the routine clinical practice are usually considered to be homogeneous among them. We foresee validating significant results in a larger series of patients and this, in turn, could have a remarkable impact on the treatment and clinical management of multifocal breast cancers. Indeed, we hope to provide some evidence whether or not each focus matters in multifocal and multicentric breast cancer to define the adequate therapeutic approach, especially in the context of targeted therapies. The work to be done at Sanger will be target gene screen pooling of 1400 samples. DT:2013-06-23T00:00:00+0100 SM:EGAN00001105088 CN:SC @PG ID:SCS PN:HiSeq Control Software DS:Controlling software on instrument VN:1.5.15.1 @PG ID:basecalling PN:RTA PP:SCS DS:Basecalling Package VN:1.13.57.0 @PG ID:Illumina2bam PN:Illumina2bam PP:basecalling DS:Convert Illumina BCL to BAM or SAM file VN:V1.10 CL:uk.ac.sanger.npg.illumina.Illumina2bam INTENSITY_DIR=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities BASECALLS_DIR=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BaseCalls LANE=8 OUTPUT=/dev/stdout 
SAMPLE_ALIAS=EGAN00001105006,EGAN00001105007,EGAN00001105008,EGAN00001105009,EGAN00001105010,EGAN00001105011,EGAN00001105012,EGAN00001105013,EGAN00001105014,EGAN00001105015,EGAN00001105016,EGAN00001105017,EGAN00001105018,EGAN00001105019,EGAN00001105020,EGAN00001105021,EGAN00001105022,EGAN00001105023,EGAN00001105024,EGAN00001105025,EGAN00001105026,EGAN00001105027,EGAN00001105028,EGAN00001105029,EGAN00001105030,EGAN00001105031,EGAN00001105032,EGAN00001105033,EGAN00001105034,EGAN00001105035,EGAN00001105036,EGAN00001105037,EGAN00001105038,EGAN00001105039,EGAN00001105040,EGAN00001105041,EGAN00001105042,EGAN00001105043,EGAN00001105044,EGAN00001105045,EGAN00001105046,EGAN00001105047,EGAN00001105048,EGAN00001105049,EGAN00001105050,EGAN00001105051,EGAN00001105052,EGAN00001105053,EGAN00001105054,EGAN00001105055,EGAN00001105056,EGAN00001105057,EGAN00001105058,EGAN00001105059,EGAN00001105060,EGAN00001105061,EGAN00001105062,EGAN00001105063,EGAN00001105064,EGAN00001105065,EGAN00001105066,EGAN00001105067,EGAN00001105068,EGAN00001105069,EGAN00001105070,EGAN00001105071,EGAN00001105072,EGAN00001105073,EGAN00001105074,EGAN00001105075,EGAN00001105076,EGAN00001105077,EGAN00001105078,EGAN00001105079,EGAN00001105080,EGAN00001105081,EGAN00001105082,EGAN00001105083,EGAN00001105084,EGAN00001105085,EGAN00001105086,EGAN00001105087,EGAN00001105088,EGAN00001105089,EGAN00001105090,EGAN00001105091,EGAN00001105092,EGAN00001105093,EGAN00001105094,EGAN00001105095,EGAN00001105096,EGAN00001105097,EGAN00001105098,EGAN00001105099,EGAN00001105100,EGAN00001105101,phiX_for_spiked_buffers LIBRARY_NAME=7476627 STUDY_NAME=EGAS00001000407: Multifocality or multicentricity in breast cancer may be defined as the presence of two or more tumor foci within a single quadrant of the breast or within different quadrants of the same breast, respectively. 
This original classification of the breast cancer as multicentric or multifocal was based on the assumption that cancers arising in the same quadrant were more likely to arise from the same ductal structures than those occurring in separate areas of the breast. The problem with these definitions is that the ?quadrants? of the breast are arbitrary external designations, as no internal boundaries do exist. This project will therefore focus both on synchronous multifocal and multicentric tumors. The incidence of multifocal and multicentric breast cancers was reported to be between 13 and 75% depending on the definition used, the extent of the pathologic sampling of the breast and whether in situ disease is considered evidence of multicentricity (1). Although this incidence is variable, those figures show that it is a frequent phenomenon. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic management of the cancer. Multifocality or multicentricity has been associated with a number of more aggressive features including an increased rate of regional lymph node metastases and adverse patient outcome when compared with unifocal tumors (2-3), and a possible increased risk of local recurrence following breast conserving surgery (4). For the moment, the literature is divided on whether there is a corresponding impact on survival outcomes. Today, the current convention to stage and to treat multifocal and multicentric tumors is the classical tumor-node-metastasis (TNM) staging guidelines with which tumor size is assessed by the largest tumor focus without taking other foci of disease into consideration. If some papers, as the recent one from Lynch and colleagues, support the current staging convention (3), others, however, as Boyages et al. 
suggested that aggregate size and not the size of the largest lesion should be considered in order to refine the prognostic assessment of those tumors (5). On the top of that, the question whether multifocal/multicentric carcinomas are due to the spread of a single carcinoma throughout the breast or is due to multiple carcinomas arising simultaneously has been a matter of debate. Some studies suggested that multifocal breast cancer may result from either intramammary spread from a single primary tumor or multiple synchronous primary tumors; whereas others suggest that multiple breast carcinomas always arise from the same clone (6-8). Recently, Pietri and colleagues analyzed the biological characterization of a series of 113 multifocal/multicentric breast cancers (8) which were diagnosed over a 5-year period. The expression of estrogen (ER) and progesterone (PgR) receptors, Ki-67 proliferative index, expression of HER2 and tumor grading were prospectively determined in each tumor focus, and mismatches among foci were recorded. Mismatches in ER status were present in 5 (4.4%) cases and PgR in 18 (15.9%) cases. Mismatches in tumor grading were present in 21 cases (18.6%), proliferative index (Ki-67) in 17 (15%) cases and HER2 status in 11 (9.7%) cases. Interestingly, this heterogeneity among foci has led to 14 (12.4%) patients receiving different adjuvant treatments compared with what would have been indicated if we had only taken into account the biologic status of the primary tumor. This study therefore showed that differences in biological characteristics of multifocal/multicentric lesions play a crucial role in the adjuvant treatment decision making process. In this study, we will concentrate on a larger series of patients with multifocal invasive ductal breast cancer lesions. We aim at: 1. Evaluating the incidence of multifocality according to the different breast cancer molecular subtypes (ER-/HER2-, HER2+, ER+/HER2-). 2. 
Evaluating the incidence of multifocality in patients with hereditary breast cancer disease (presence of germline BRCA1 or BRCA2 mutations). Moreover, we would like to investigate if multifocal lesions with BRCA1 or BRCA2 mutations exhibit a characteristic combination of substitution mutation signatures and a distinctive profile of deletions as demonstrated recently by Nik-Zainal and colleagues (9). 3. Correlating multifocality with clinical information in order to define its influence on patients? survival (DFS and OS). 4. Carrying high coverage targeted gene sequencing of driver cancer genes and genes whose mutation is of therapeutic importance in order to compare clinically-relevant genetic differences between several multifocal breast cancer lesions. 5. Evaluating the impact of the distance between the different lesions on the clinical outcome but also on the genetic differences. 6. Comparing gene expression patterns between several multifocal breast cancer lesions and correlate them with the results of the targeted genes screen. 7. Characterizing the genomic and transcriptomic status of cancer related genes in metastatic lesions (local recurrence, positive lymph node or distant metastatic sites) from the same multifocal invasive ductal breast cancer patients in order to evaluate the consequence of genomic and transcriptomic heterogeneity of multifocal lesions on metastatic lesions. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic choice. This project has the potential to identify genetic/transcriptomic differences existing between several lesions constituting multifocal breast cancers, which in the routine clinical practice are usually considered to be homogeneous among them. 
We foresee validating significant results in a larger series of patients and this, in turn, could have a remarkable impact on the treatment and clinical management of multifocal breast cancers. Indeed, we hope to provide some evidence whether or not each focus matters in multifocal and multicentric breast cancer to define the adequate therapeutic approach, especially in the context of targeted therapies. The work to be done at Sanger will be target gene screen pooling of 1400 samples.,Illumina Controls: SPIKED_CONTROL COMPRESSION_LEVEL=0 CREATE_MD5_FILE=true GENERATE_SECONDARY_BASE_CALLS=false PF_FILTER=true READ_GROUP_ID=1 SEQUENCING_CENTER=SC PLATFORM=ILLUMINA BARCODE_SEQUENCE_TAG_NAME=BC BARCODE_QUALITY_TAG_NAME=QT VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:BamAdapterFinder PN:BamAdapterFinder PP:Illumina2bam DS:Find short inserts by finding overlapping forward/reverse reads. Note position with a tag. VN:V1.10 CL:uk.ac.sanger.npg.picard.BamAdapterFinder INPUT=/dev/stdin OUTPUT=/dev/stdout ADAPTER_LENGTH_TAG=a3 ADAPTER_MATCH_TAG=ah VALIDATION_STRINGENCY=SILENT COMPRESSION_LEVEL=0 CREATE_MD5_FILE=true MIN_OVERLAP=32 PCT_MISMATCHES=10.0 ADAPTER_MATCH=12 VERBOSITY=INFO QUIET=false MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:BamIndexDecoder PN:BamIndexDecoder PP:BamAdapterFinder DS:A command-line tool to decode multiplexed bam file VN:V1.10 CL:uk.ac.sanger.npg.picard.BamIndexDecoder INPUT=/dev/stdin OUTPUT=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/10138_8.bam BARCODE_FILE=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/lane_8.taglist METRICS_FILE=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/10138_8.bam.tag_decode.metrics VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true BARCODE_TAG_NAME=BC BARCODE_QUALITY_TAG_NAME=QT MAX_MISMATCHES=1 
MIN_MISMATCH_DELTA=1 MAX_NO_CALLS=2 CONVERT_LOW_QUALITY_TO_NO_CALL=false MAX_LOW_QUALITY_TO_CONVERT=15 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:pb_cal PN:predictor_pu PP:BamIndexDecoder DS:A program to apply a calibration table VN:v10.12 CL:/software/solexa/bin/pb_calibration/v10.12/predictor_pu -ct 10138_8_purity_cycle_caltable.txt -intensity-dir /nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities -cstart1 1 -cstart2 84 -u ../10138_8.bam @PG ID:spf PN:spatial_filter PP:pb_cal DS:A program to apply a spatial filter VN:v10.12 CL:/software/solexa/bin/pb_calibration/v10.12/spatial_filter -c -F pb_align_10138_8.bam.filter --region_size 700 --region_min_count 122 --region_mismatch_threshold 0.0160 --region_insertion_threshold 0.0160 --region_deletion_threshold 0.0160 pb_align_10138_8.bam ; /software/solexa/bin/pb_calibration/v10.12/spatial_filter -a -u -F pb_align_10138_8.bam.filter - @PG ID:bwa PN:bwa PP:spf VN:0.5.10-tpx @PG ID:BamMerger PN:BamMerger PP:bwa DS:A command-line tool to merge BAM/SAM alignment info in the first input file with the data in an unmapped BAM file, producing a third BAM file that has alignment data and all the additional data from the unmapped BAM VN:V1.10 CL:uk.ac.sanger.npg.picard.BamMerger ALIGNED_BAM=pb_align_10138_8.bam INPUT=/dev/stdin OUTPUT=10138_8.bam KEEP_EXTRA_UNMAPPED_READS=true REPLACE_ALIGNED_BASE_QUALITY=true VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true ALIGNMENT_PROGRAM_ID=bwa KEEP_ALL_PG=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:SplitBamByReadGroup PN:SplitBamByReadGroup PP:BamMerger DS:Split a BAM file into multiple BAM files based on ReadGroup. 
Headers are a copy of the original file, removing @RGs where IDs match with the other ReadGroup IDs VN:V1.10 CL:uk.ac.sanger.npg.picard.SplitBamByReadGroup INPUT=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/10138_8.bam OUTPUT_PREFIX=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8 OUTPUT_COMMON_RG_HEAD_TO_TRIM=1 VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:bwa_aln PN:bwa PP:SplitBamByReadGroup VN:0.5.10-tpx CL:/software/solexa/bin/aligners/bwa/bwa-0.5.10-mt/bwa aln -t 12 /lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/bwa/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa -b1 /nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam > /tmp/0NWgnhyOP2/1.sai @PG ID:bwa_aln_1 PN:bwa PP:bwa_aln VN:0.5.10-tpx CL:/software/solexa/bin/aligners/bwa/bwa-0.5.10-mt/bwa aln -t 12 /lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/bwa/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa -b2 /nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam > /tmp/0NWgnhyOP2/2.sai @PG ID:bwa_sam PN:bwa PP:bwa_aln_1 VN:0.5.10-tpx CL:/software/solexa/bin/aligners/bwa/bwa-0.5.10-mt/bwa sampe -t 6 /lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/bwa/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa /tmp/0NWgnhyOP2/1.sai /tmp/0NWgnhyOP2/2.sai /nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam 
/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam @PG ID:Picard_SamFormatConverter PN:SamFormatConverter PP:bwa_sam VN:1.72(1230) CL:/software/java/bin/java -Xmx1000m -jar /software/solexa/bin/aligners/picard/picard-tools-1.72/SamFormatConverter.jar VALIDATION_STRINGENCY=SILENT INPUT=/dev/stdin OUTPUT=/dev/stdout COMPRESSION_LEVEL=0 @PG ID:samtools_fixmate PN:samtools PP:Picard_SamFormatConverter VN:0.1.18 (r982:295) CL:/software/solexa/bin/aligners/samtools/samtools-0.1.18/samtools fixmate - - @PG ID:BamMerger_1 PN:BamMerger PP:samtools_fixmate VN:V1.10 DS:A command-line tool to merge BAM/SAM alignment info in the first input file with the data in an unmapped BAM file, producing a third BAM file that has alignment data and all the additional data from the unmapped BAM CL:uk.ac.sanger.npg.picard.BamMerger ALIGNED_BAM=/dev/stdin ALIGNMENT_PROGRAM_ID=NULL KEEP_ALL_PG=true INPUT=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam OUTPUT=/tmp/0NWgnhyOP2/output_fifo.bam KEEP_EXTRA_UNMAPPED_READS=true VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true REPLACE_ALIGNED_BASE_QUALITY=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:AlignmentFilter PN:AlignmentFilter PP:BamMerger_1 DS:Give a list of SAM/BAM files with the same set of records and in the same order but aligned with different references, split reads into different files according to alignments. 
You have option to put unaligned reads into one of output files or a separate file VN:V1.10 CL:uk.ac.sanger.npg.picard.AlignmentFilter INPUT_ALIGNMENT=[/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/lane8/10138_8#83.bam, /tmp/0NWgnhyOP2/output_fifo.bam] OUTPUT_ALIGNMENT=[/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/10138_8#83_phix.bam, /nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/10138_8#83.bam] METRICS_FILE=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/10138_8#83.bam_alignment_filter_metrics.json VALIDATION_STRINGENCY=SILENT CREATE_MD5_FILE=true VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false @PG ID:bamsort PN:bamsort PP:AlignmentFilter VN:0.0.64 CL:/software/hpag/biobambam/0.0.64/bin/bamsort tmpfile=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/oc0Q7yqRd6/ @PG ID:bammarkduplicates PN:bammarkduplicates PP:bamsort VN:0.0.64 CL:/software/hpag/biobambam/0.0.64/bin/bammarkduplicates I=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/oc0Q7yqRd6/sorted.bam O=/dev/stdout tmpfile=/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/oc0Q7yqRd6/ M=/tmp/kfTIOJVaXG level=0 @PG ID:BamTagStripper PN:BamTagStripper PP:bammarkduplicates DS:Strip a list of tags in bam/sam record. By default, any tags containing lowercase letters will be stripped and other tags will be kept. 
A list of tags can be given to keep or strip VN:V1.10 CL:uk.ac.sanger.npg.picard.BamTagStripper INPUT=/dev/stdin OUTPUT=/dev/stdout TAG_TO_KEEP=[a3, ah, br, qr, tq, tr] TAG_TO_STRIP=[OQ] TMP_DIR=[/nfs/sf40/ILorHSany_sf40/analysis/130623_HS15_10138_B_D21KJACXX/Data/Intensities/BAM_basecalls_20130703-183032/PB_cal_bam/archive/lane8/oc0Q7yqRd6] VERBOSITY=INFO VALIDATION_STRINGENCY=SILENT CREATE_INDEX=false CREATE_MD5_FILE=false QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 @PG ID:scramble PN:scramble PP:BamTagStripper VN:1.13.2 CL:/software/badger/bin/scramble -I bam -O cram -r /lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa @SQ SN:1 LN:249250621 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:1b22b98cdeb4a9304cb5d48026a85128 SP:Homo sapiens @SQ SN:2 LN:243199373 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:a0d9851da00400dec1098a9255ac712e SP:Homo sapiens @SQ SN:3 LN:198022430 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:fdfd811849cc2fadebc929bb925902e5 SP:Homo sapiens @SQ SN:4 LN:191154276 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:23dccd106897542ad87d2765d28a19a1 SP:Homo sapiens @SQ SN:5 LN:180915260 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:0740173db9ffd264d728f32784845cd7 SP:Homo sapiens @SQ SN:6 LN:171115067 
UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:1d3a93a248d92a729ee764823acbbc6b SP:Homo sapiens @SQ SN:7 LN:159138663 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:618366e953d6aaad97dbe4777c29375e SP:Homo sapiens @SQ SN:8 LN:146364022 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:96f514a9929e410c6651697bded59aec SP:Homo sapiens @SQ SN:9 LN:141213431 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:3e273117f15e0a400f01055d9f393768 SP:Homo sapiens @SQ SN:10 LN:135534747 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:988c28e000e84c26d552359af1ea2e1d SP:Homo sapiens @SQ SN:11 LN:135006516 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:98c59049a2df285c76ffb1c6db8f8b96 SP:Homo sapiens @SQ SN:12 LN:133851895 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:51851ac0e1a115847ad36449b0015864 SP:Homo sapiens @SQ SN:13 LN:115169878 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:283f8d7892baa81b510a015719ca7b0b SP:Homo sapiens @SQ SN:14 LN:107349540 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP 
GRCh37.NCBI.allchr_MT M5:98f3cae32b2a2e9524bc19813927542e SP:Homo sapiens @SQ SN:15 LN:102531392 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:e5645a794a8238215b2cd77acb95a078 SP:Homo sapiens @SQ SN:16 LN:90354753 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:fc9b1a7b42b97a864f56b348b06095e6 SP:Homo sapiens @SQ SN:17 LN:81195210 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:351f64d4f4f9ddd45b35336ad97aa6de SP:Homo sapiens @SQ SN:18 LN:78077248 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c SP:Homo sapiens @SQ SN:19 LN:59128983 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:1aacd71f30db8e561810913e0b72636d SP:Homo sapiens @SQ SN:20 LN:63025520 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:0dec9660ec1efaaf33281c0d5ea2560f SP:Homo sapiens @SQ SN:21 LN:48129895 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:2979a6085bfe28e3ad6f552f361ed74d SP:Homo sapiens @SQ SN:22 LN:51304566 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:a718acaa6135fdca8357d5bfe94211dd SP:Homo sapiens @SQ SN:X LN:155270560 
UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:7e0e2e580297b7764e31dbc80c2540dd SP:Homo sapiens @SQ SN:Y LN:59373566 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:1e86411d73e6f00a10590f976be01623 SP:Homo sapiens @SQ SN:MT LN:16569 UR:/lustre/scratch109/srpipe/references/Homo_sapiens/CGP_GRCh37.NCBI.allchr_MT/all/fasta/Homo_sapiens.GRCh37.NCBI.allchr_MT.fa AS:CGP GRCh37.NCBI.allchr_MT M5:c68f52674c9fb33aef52dcf399755519 SP:Homo sapiens