The amount of data generated by the life sciences is growing exponentially; data from next-generation sequencing platforms, together with data from imaging efforts, have created a large-scale informatics challenge. IT infrastructure and informatics continue to evolve to keep pace with the storage, analysis, visualization and sharing needs of data being generated at the terabyte level. This track will feature case studies from leading IT, IS and informatics experts in pharmaceutical drug discovery and development, and will discuss current approaches, platforms and workflows used to handle data on such a large scale.
9:00 Conference Registration and Morning Coffee
9:30 Chairperson’s Opening Remarks
Derek Burke, Director, International Marketing, Panasas
9:35 Next-Generation Sequencing Data: Next-Generation Management Problems
Guy Coates, Ph.D., Informatics System Group, The Wellcome Trust Sanger
Sequencing centers are now collecting and storing large amounts of data from their sequencing machines. However, as datasets get ever larger, simply keeping track of the data becomes a challenge; datasets are lost or needlessly duplicated. The situation is further complicated as datasets become dispersed over collaborative and cloud infrastructures. The talk will explore some of the technologies we are using to help people manage and share their data in the next-gen sequencing era.
10:05 Keeping Pace—Scalable IT Infrastructures to Support Data Intensive Science
Rupert Lueck, Ph.D., Head, IT Services, EMBL Heidelberg
The IT infrastructure required to support science at EMBL is seriously challenged by the enormous amounts of data generated by technologies such as NGS and high-throughput microscopy that are used in large-scale and interdisciplinary systems biology projects. This talk highlights our strategies to ensure a scalable, reliable and cost-efficient IT environment to keep pace with the rapidly growing demands for high-performance storage and compute power.
10:35 Coffee Break
11:00 Omics Data - Serving Sequencing at the National Level
Francois Artiguenave, Ph.D., Head, Bioinformatics Laboratory, Genoscope
Life science has been profoundly impacted by technological advances allowing faster and cheaper DNA sequencing. Opening a wide range of applications, the latest sequencing platforms have raised new challenges in processing, analysing and interpreting massive data. The growing role of informatics and bioinformatics will be illustrated with figures on genome sequencing and other applications aimed at unraveling biological mechanisms.
11:30 Speaker to be Announced
12:00 Data Management Challenges in Translational Collaborative Research and Clinical Care at Erasmus MC
Bert Eussen, Clinical Genetics, Erasmus MC
Peter Walgemoed, Director, Carelliance
Translational medicine is driving an evolution in research and personalized patient care, particularly in next-generation sequencing research. This evolution also requires a new approach in the data center to architect new data access and reporting policies, set service levels and manage the explosive data growth associated with this new model for research and clinical care. Erasmus MC, in partnership with HP, is implementing a storage and computational data services concept. The speakers will share their approach to managing the Cell Biology/Clinical Genetics data sprawl and information management challenges.
12:30 Lunch for Purchase in the Exhibit Hall
13:45 Dedicated Poster Viewing in the Exhibit Hall
14:30 Chairperson’s Remarks
14:35 Featured Presentation: Sequencing Data Storage
Chris Dagdigian, Founding Partner & Director, Technology, BioTeam, Inc.
Next-Generation Sequencing (NGS) instruments are forcing evolutionary and revolutionary changes in research IT architectures & infrastructures. Chemistry and lab protocols are advancing faster than the underlying IT systems and methods, leading to a crisis of capability in many organizations. This presentation will focus on “science-centric storage” for life science informatics, with specific attention on requirements, trends, data management methods and emerging best practices.
15:05 Experiences from the European Bioinformatics Institute’s Data Resources, Storage, and Management
Guy Cochrane, Ph.D., European Nucleotide Archive Team Leader, European Bioinformatics Institute
15:35 Refreshment Break
16:00 Sponsored Presentation
16:30 How to Overcome the 100 Miles between Petabases and Petabytes
Jürgen Eils, Bioinformatics Database Group Leader, German Cancer Research Center
Recently, Heidelberg University received a grant to build the largest data storage facility in Germany at 5-10 petabytes. From a management and logistical perspective, the massive throughput of next-generation sequencing requires new concepts and strategies. One problem is the long-distance transport of data from the sequencing machine to the data storage facility. We will present strategies and concepts with emphasis on reusability and sustainability for storing and retrieving the comprehensive collection of sequencing data in combination with associated clinical and histopathological annotation data – all in accordance with the International Cancer Genome Consortium (ICGC) guidelines.
17:00 Enhanced Scalability, Large Data Volumes Management, Integrated Analysis, and NGS Informatics Support in a Medical Setting
Andrew Stubbs, Ph.D., Assistant Professor, Department of Bioinformatics, Erasmus Medical Center
17:30 The First Success Stories after the Swedish Buildup of Computational Power and Large-Scale Storage for Gene-Sequence Data
Ingela Nystrom, Ph.D., Director, UPPMAX, Center for Image Analysis, Uppsala University
Last year, we reported on the buildup of a system at Uppsala University, Sweden, intended for researchers who deal with the large-scale data from modern gene-sequencing technology. The system has 1200 cores, 4 TB RAM, and 500 TB storage. Now, we report our first success stories, e.g., the whole-genome resequencing project revealing signatures of selection during chicken domestication.
18:00 Sponsored Presentation (Opportunity Available)
18:30 Interactive Breakout Discussion Groups:
Repositories of Metagenomic Data and Tools for Academic and Commercial Users
Moderator: Oleg Reva, Ph.D., Senior Lecturer, Biochemistry, University of Pretoria
Data formats and supplementary information requirements
Clustering and binning of environmental sequences
Modeling metabolic pathways and ecological interactions: facts and artifacts
Storage of Omic Data - The Cloud & Beyond
Moderator: Chris Dagdigian, Founding Partner & Director of Technology, BioTeam, Inc.
Science-centric storage
Data management issues
Best practices
Web Services
Moderator: Christian Hauck, Ph.D., Knowledge Management & Competitive Intelligence, Novartis Pharma AG
Analyzing and Storing Gene Sequence Data
Moderator: Ingela Nystrom, Ph.D., Director, UPPMAX, Center for Image Analysis, Uppsala University
User support
Data security issues
What will come next?
Semantic Web and Ontologies
Moderator: Martin Gollery, Senior Bioinformatics Scientist, Tahoe Informatics
19:15 – 21:00 CHI Networking Reception
9:00 Conference Registration and Morning Coffee
9:30 Chairperson’s Opening Remarks
Yuriy Gankin, Ph.D., Chief Scientific Officer, GGA Software Services LLC
9:35 Implementation & Use of PHAEDRA, a Standards-Based System for High-Content Image Analysis and Evaluation
Frans Cornelissen, IT Manager, Janssen Pharmaceutical
High-content image analysis-based screening (HTS-HCA) technology is an important drug discovery tool for identifying biological probes and drug leads by screening large-volume, diverse biochemical and cell-based assays using image capture and analysis. The architecture and usage of the PHAEDRA environment will be described, using examples such as measurement of 3D tumor colony size on brightfield image stacks and quantification of dendritic length, spine density and spine diameter in 3D fluorescent image stacks.
9:55 Lessons Learned in Imaging Informatics for Drug Discovery
Gudrun Zahlmann, Ph.D., Manager, Imaging Infrastructure, pRED, F. Hoffmann-La Roche Ltd.
10:15 Quantitative Image Analysis Tools for Biological Research
Ewert Bengtsson, Ph.D., Professor, Center for Image Analysis, Uppsala University
Purely visual analysis of microscopy images is limited in its ability to provide objective quantitative information. Here, computerized image analysis can provide automated, quantitative tools enabling accurate and high-throughput analysis. We will describe methods for improved microscope image analysis, ranging from better utilization of the color information via robust modeling and segmentation to quantification and classification methods.
10:35 Coffee Break
11:00 Sponsored Presentation
11:30 Next-Generation Interfaces and Interaction with Complex Information Landscapes
Bryn Roberts, Ph.D., Global Head, Informatics, Pharma Research and Early Development, F. Hoffmann-La Roche AG
Presentation of background and proof-of-concept projects on two main themes: potential approaches to enabling scientists to navigate complex, heterogeneous information using semantic integration technologies and a next generation of user interfaces; and enabling teams to generate, explore and progress hypotheses together using collaborative computer interfaces.
12:00 Informatics for Data Driven Drug Discovery: How to Win the "War" against Data Complexity and Silos
Jacob de Vlieg, Ph.D., Professor & Global Head, Molecular Design and Informatics, Merck
Modern drug discovery and development is based on a highly data-intensive and complex multidisciplinary research process, with the capacity to perform data-driven research as potentially the biggest differentiator in drug discovery and development. Working smarter is an absolute requirement to win the “war” against data complexity and silos. However, there are many unmet scientific, technological and business process challenges that need to be addressed before we can truly make full use of powerful informatics and omics-inspired technologies within pharma R&D. All pharmaceutical scientists in the 21st century will need to be knowledge workers supported by experts. New broad-oriented in silico drug hunters, able to work at the interface of chemistry, biology and informatics and employing enhanced scientific (eScience) concepts, are required to connect the “inhuman” scale of data and to present the data and information at a “human” scale of understanding. It is all about how we can effectively apply (modern or existing) technologies in the discovery business process at the right maturity level to support business-driven molecular innovation. Data-driven technologies are becoming increasingly interwoven with each other and often require significant modifications of the pharma R&D business process to have full impact or meaning. For example, this can mean that the focus is not (only) to understand the underlying biology but to compute probabilities and to spot valuable patterns in heterogeneous and noisy experimental data sets from several internal or external scientific disciplines, e.g. to identify correlations between the safety and efficacy fingerprints of a reference compound or drug in humans (e.g. pharmacogenomics data, post-market safety profiles reported by consumers, etc.) and the fingerprints of compound series in the design phase (e.g. structural, computed or omics-derived data).
These types of pragmatic pattern recognition approaches have the potential to deliver output in business terms or to boost R&D innovation and improve the business bottom line. In the presentation, I will give some real-life examples of how informatics, modeling and simulation technologies are used to make better use of internal and external (experimental) data sets for biomarker discovery and compound (library) design or prioritization, and are delivering output in business terms.
12:30 Lunch for Purchase in the Exhibit Hall
13:00 Dedicated Poster Viewing in the Exhibit Hall
13:30 Close of Conference