http://bioinfo-core.org/api.php?action=feedcontributions&user=Bgrichter&feedformat=atomBioWiki - User contributions [en]2024-03-28T16:48:45ZUser contributionsMediaWiki 1.34.0http://bioinfo-core.org/index.php?title=User:Bpospisil&diff=10416User:Bpospisil2022-05-04T17:19:08Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Bryan has more than 9 years of experience in bioinformatics software sales, with a wide array of offerings spanning early discovery, development, and manufacturing. Some of the largest accounts range from 1 to 4 billion in annual revenue. Bryan is known for his personable nature and is based in San Diego, California.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Bpospisil&diff=10417User talk:Bpospisil2022-05-04T17:19:08Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 10:19, 4 May 2022 (PDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Karakach&diff=10414User:Karakach2020-02-11T19:56:07Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Dr. Tobias Karakach received his PhD in Chemometrics from Dalhousie University in Halifax, NS, Canada, with a focus on the analysis of the then-emergent high-throughput gene expression microarray data. He developed methods for the analysis of these data, starting from quantification of signals from raw images, through pre-processing the data, to ultimately analyzing them using various statistical approaches. The focus of his thesis was on modeling microarray data designed to answer ordinal biological observations. He followed his PhD work with post-doctoral experience at the National Research Council of Canada (NRC), focusing on the analysis of NMR- and MS-based untargeted metabolomics and proteomics data. He then proceeded to conduct independent work as a research officer, focusing on analyzing data derived from bioanalytical technologies such as magnetic resonance, mass spectrometry, hyperspectral imaging, fluorescence, infrared, and other vibrational spectroscopic tools.<br />
In 2017, Dr. Karakach moved to the Vlaams Instituut voor Biotechnologie (VIB) in Leuven, Belgium, and took a position as a staff scientist in charge of bioinformatics at the laboratory for Angiogenesis and Vascular Metabolism. During this time, he expanded his repertoire of bioinformatics skills, specifically as they apply to single cell RNA/DNA sequencing, and added valuable biological knowledge to his computational expertise. He then moved to Winnipeg, MB, Canada, to set up a bioinformatics core laboratory at the Children's Hospital Research Institute of Manitoba (CHRIM). He splits his time between traditional core-facility activities, teaching, and developing new computational methods for the analysis of temporal multi-omics data. He has over 25 high impact publications in the diverse field of Bioinformatics. You can find more about his facility at www.chrimbioinformatics.org</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Karakach&diff=10415User talk:Karakach2020-02-11T19:56:07Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 13:56, 11 February 2020 (CST)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:KrisDavie&diff=10408User:KrisDavie2019-09-02T12:20:39Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>I am mainly interested in the development and automation of pipelines to facilitate researchers in their bioinformatic needs, as well as making bioinformatics more accessible to those researchers.<br />
<br />
Head of the VIB-CBD Bioinformatics Expertise Unit at KU Leuven, Belgium.<br />
Co-developer of SCope (https://github.com/aertslab/SCope)<br />
<br />
<br />
Performed PhD Studies in the Lab of Computational Biology (Stein Aerts) at KU Leuven, Belgium<br />
Obtained BSc(Hons) in Forensic Biology at Teesside University, UK</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:KrisDavie&diff=10409User talk:KrisDavie2019-09-02T12:20:39Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 07:20, 2 September 2019 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2019:_BioinfoCoreWorkshop&diff=10397ISMB 2019: BioinfoCoreWorkshop2019-07-23T12:20:58Z<p>Bgrichter: /* 175 total people over the 2.5 hours (over capacity within room). 55 people participated for the full 2.5 hours including participation in the breakout sessions and discussions. 75 people for final NextFlow Demo */</p>
<hr />
<div>=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Monday, July 22, 2019, from 10:15 am to 12:40 pm at the Congress Center in Basel.<br />
<br />
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 16th year of bringing the Core community together at ISMB, we will explore topics relevant to bioinformatics core facilities through lightning talks and demos followed by small-group break out discussions with insights brought back to the full audience for further discussion and knowledge sharing.<br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, CRUK-MI, University of Manchester, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
Social Events:<br />
<br />
* ISCB Markthalle event, Tuesday, July 23rd, 8pm (look for bioinfo-core signs)<br />
* Wednesday night dinner, Veranda Pelicano, 8pm (meet outside congress center at 7:30pm to walk over), email mcm@stowers.org to RSVP<br />
<br />
Additional related opportunity:<br />
* [http://www.aebc2.eu/ AEBC2 Workshop] - Friday, July 26th.<br />
<br />
==Part A: Technologies and Analytical Methods==<br />
<br />
Machine Learning, AI, single cell RNA-seq analysis, and conda/bioconda.<br />
<br />
==Part B: Communication and Training==<br />
<br />
Communication and project management tools and training offered by cores.<br />
<br />
==Part C: Small group discussion==<br />
<br />
During this hour-long session, audience members will divide into groups based on their own interests. Groups will come up with their main take away points and bring them back to the main audience for knowledge sharing and for further discussion. Topics may include all previous presentation areas as well as other areas of interest to running or working within a bioinformatics core facility.<br />
<br />
==Part D: Pipeline Demo==<br />
<br />
Demo of Nextflow<br />
<br />
==Schedule==<br />
<br />
{|class="wikitable"<br />
|-<br />
|Time<br />
|Title<br />
|Authors<br />
|-<br />
|10:20 - 10:30 AM<br />
|Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
|Yang Fann, NIH, United States<br />
|-<br />
|10:30 - 10:40 AM<br />
|Supporting single cell RNA-seq analysis: A Core's Perspective<br />
|Shannan Ho Sui, Harvard School of Public Health, United States<br />
|-<br />
|10:40 - 10:50 AM<br />
|Conda and Bioconda, the best thing since sliced bread<br />
|Devon Ryan, Max Planck Institute, Germany<br />
|-<br />
|10:50 - 11:00 AM<br />
|Improving project management and tracking with Asana and Toggl<br />
|Sara Brin Rosenthal, UCSD, United States<br />
|-<br />
|11:00 - 11:10 AM<br />
|Bioinformatics training (in the context of a core)<br />
|Radhika Khetani, Harvard School of Public Health, United States<br />
|-<br />
|11:10 - 11:20 AM<br />
|Development of bioinformatics workshop by a core facility<br />
|Alberto Riva, University of Florida, United States<br />
|-<br />
|11:20 - 11:55 AM<br />
|Small Group Discussions<br />
|<br />
|-<br />
|11:55 AM - 12:20 PM<br />
|Small Group Reports<br />
|<br />
|-<br />
|12:20 PM - 12:35 PM<br />
|nf-core - A community effort to collect a curated set of pipelines built using Nextflow (https://nf-co.re/).<br />
|Harshil Patel, The Francis Crick Institute, United Kingdom<br />
|-<br />
|}<br />
<br />
<br />
== Workshop Discussion ==<br />
===175 total people over the 2.5 hours (over capacity within room). 55 people participated for the full 2.5 hours including participation in the breakout sessions and discussions. 75 people for final Nextflow demo===<br />
<br />
*Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
**Large, diverse datasets from multiple sources both private and public from around the world.<br />
*Supporting single cell RNA-seq analysis: A Core's Perspective<br />
**Single-cell work has grown in demand over the last 5 years, and data analysis is becoming the bottleneck. Taking a community-based approach by collaborating with other HSPH teams and other schools (HMS) to tackle the problem: sequencing core (de-multiplexing), labs (iterative; requires research input--is cell cycling part or mitochondria?), training, etc. <br />
**built out the bcbio Python toolkit with 62 international contributors.<br />
**settled on the Seurat suite of tools but also uses many others such as MultiCCA<br />
*Conda and Bioconda, the best thing since sliced bread<br />
**Installing software: cores get asked to help install all kinds of software, particularly packages that carry many dependencies.<br />
**With Conda, root access is never needed, and dependencies are handled for you.<br />
**Free, and you can add your own packages.<br />
**module load activates a conda environment behind the scenes for users<br />
**Bioconductor packages are in Bioconda. For every package they also build a Singularity and a Docker container (BioContainers).<br />
**CoreOS--Quay.<br />
**1700 packages upgraded over a week, with Bioconductor upgrading behind the scenes.<br />
**Bioconda has 700+ contributors: release your tools using Bioconda.<br />
*Improving project management and tracking with Asana and Toggl<br />
**Fee for Service Center with up to 324 projects over the last 4 years.<br />
**Need to track projects within the team as they transition from one team member to another while the project cycles through the experts<br />
**Analysis can be punctuated with long periods of time while investigator writes papers and grants. Needs to pick up history sometimes a year later.<br />
**Asana: have defined a workflow within Asana that includes intake, waiting periods, in progress systems, close out and bill<br />
**archive data to S3.<br />
**implemented Toggl to track time on each project and subtask. Integrates with Asana for project management components.<br />
**allows for giving people better estimates. Have found that, in general, they underestimate the work.<br />
*Bioinformatics training (in the context of a core)<br />
**Funders provide FTEs dedicated to training (Harvard Catalyst, HMS)<br />
**interplay between training and consulting: surge in single cell analysis highlights need for training in this technology<br />
**2/3 of time spent on training, the remainder on consulting and understanding best practices<br />
**partner with faculty on teaching for credit--e.g. an R component for their course <br />
**10:1 student to instructor ratios, 25 per class. Use local resources such as their HPC system. Publish materials on GitHub<br />
*Development of bioinformatics workshop by a core facility<br />
**being asked to provide practical bioinformatics training<br />
**challenges: a large and diverse audience, which makes it hard to develop a suitable curriculum; limited to 8 x 1-hour courses; need to find a source of support<br />
**partnered with the cancer center for admin support, the library for 5-seat lab, faculty for some lectures and research computing for the HiPerGator cluster with a dedicated allocation of cores.<br />
**successful: filled 50 spots in just a few days, with over ½ attending all lectures. Lectures were video-recorded and are publicly available.<br />
<br />
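The Toggl notes above say that tracked time supports better estimates and that work tends to be underestimated. A minimal sketch of one way such records could be used, assuming (hypothetically) that every finished project has an up-front estimate and a tracked total in hours. The function names and sample numbers are illustrative only and do not come from the talk.

```python
from statistics import median


def calibration_factor(history):
    """Median ratio of tracked hours to estimated hours over past projects.

    `history` is an iterable of (estimated_hours, tracked_hours) pairs.
    A value above 1.0 means past work was underestimated on the whole.
    """
    ratios = [tracked / estimated for estimated, tracked in history if estimated > 0]
    if not ratios:
        raise ValueError("need at least one project with a positive estimate")
    return median(ratios)


def calibrated_estimate(raw_estimate_hours, history):
    """Scale a new gut-feeling estimate by the historical calibration factor."""
    return raw_estimate_hours * calibration_factor(history)


# Hypothetical (estimated_hours, tracked_hours) pairs, not data from the workshop.
past_projects = [(10, 15), (20, 24), (8, 16), (40, 50)]

factor = calibration_factor(past_projects)
print(f"median tracked/estimated ratio: {factor:.3f}")
print(f"calibrated estimate for a 12 h guess: {calibrated_estimate(12, past_projects):.1f} h")
```

The median is used rather than the mean so that a single runaway project does not dominate the factor; that is a design choice of this sketch, not something reported by the presenters.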
==Breakout sessions==<br />
*Training<br />
**chunk out training and repackage and create efficiency<br />
**signups: undersubscription vs. oversubscription. Charging a fee gives attendees some skin in the game.<br />
**Access to compute.<br />
**Google and AWS are used and are cost-effective; Jupyter notebooks are particularly cheap to run.<br />
*Single Cell<br />
**help people help themselves.<br />
**shiny apps<br />
**what lets you know it worked properly? primer dimers, Cell Ranger, but the '''Seurat''' R package is the main thing that came out of it.<br />
**need to talk about the standard set of thresholds<br />
*Project Management<br />
**from Excel to Google Docs<br />
**Asana, Trello, Jira<br />
**time tracking with Toggl and Harvest (app on phone, laptop, etc)<br />
**Wants: Confluence to integrate project management together with documentation?<br />
**fees help manage demand and help finance pipeline development<br />
*Conda/bioconda reproducibility<br />
<br />
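The Conda/Bioconda reproducibility topic above can be illustrated with a pinned environment file. A minimal sketch, assuming the usual conda-forge plus bioconda channel order; the environment name, the helper function, and the package versions are hypothetical placeholders rather than recommendations from the session.

```python
def render_environment(name, pinned, channels=("conda-forge", "bioconda")):
    """Return the text of a conda environment file with exact version pins.

    `pinned` maps package name -> version string. Packages are sorted so
    that the same pins always produce the same file.
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}={version}" for package, version in sorted(pinned.items())]
    return "\n".join(lines) + "\n"


# Illustrative pins only; choose the versions your pipeline was validated with.
pins = {"samtools": "1.9", "bwa": "0.7.17"}
env_yaml = render_environment("rnaseq-demo", pins)
print(env_yaml)
```

Saving the printed text as environment.yml and running `conda env create -f environment.yml` would recreate the environment without root access. Pinning only package versions does not fix build strings or transitive dependencies, so this is a starting point for reproducibility rather than a guarantee.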
==Demos==<br />
*Nextflow<br />
**manages reproducibility. integrates with many other schedulers <br />
**uses Conda<br />
**AWS iGenomes<br />
**git repo at nf-core/configs and test datasets at nf-core/test-datasets</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2019:_BioinfoCoreWorkshop&diff=10395ISMB 2019: BioinfoCoreWorkshop2019-07-22T10:47:13Z<p>Bgrichter: </p>
<hr />
<div>=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Monday, July 22, 2019, from 10:15 am to 12:40 pm at the Congress Center in Basel.<br />
<br />
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 16th year of bringing the Core community together at ISMB, we will explore topics relevant to bioinformatics core facilities through lightning talks and demos followed by small-group break out discussions with insights brought back to the full audience for further discussion and knowledge sharing.<br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, CRUK-MI, University of Manchester, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
==Part A: Technologies and Analytical Methods==<br />
<br />
Machine Learning, AI, single cell RNA-seq analysis, and conda/bioconda.<br />
<br />
==Part B: Communication and Training==<br />
<br />
Communication and project management tools and training offered by cores.<br />
<br />
==Part C: Small group discussion==<br />
<br />
During this hour-long session, audience members will divide into groups based on their own interests. Groups will come up with their main take away points and bring them back to the main audience for knowledge sharing and for further discussion. Topics may include all previous presentation areas as well as other areas of interest to running or working within a bioinformatics core facility.<br />
<br />
==Part D: Pipeline Demo==<br />
<br />
Demo of Nextflow<br />
<br />
==Schedule==<br />
<br />
{|class="wikitable"<br />
|-<br />
|Time<br />
|Title<br />
|Authors<br />
|-<br />
|10:20 - 10:30 AM<br />
|Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
|Yang Fann, NIH, United States<br />
|-<br />
|10:30 - 10:40 AM<br />
|Supporting single cell RNA-seq analysis: A Core's Perspective<br />
|Shannan Ho Sui, Harvard School of Public Health, United States<br />
|-<br />
|10:40 - 10:50 AM<br />
|Conda and Bioconda, the best thing since sliced bread<br />
|Devon Ryan, Max Planck Institute, Germany<br />
|-<br />
|10:50 - 11:00 AM<br />
|Improving project management and tracking with Asana and Toggl<br />
|Sara Brin Rosenthal, UCSD, United States<br />
|-<br />
|11:00 - 11:10 AM<br />
|Bioinformatics training (in the context of a core)<br />
|Radhika Khetani, Harvard School of Public Health, United States<br />
|-<br />
|11:10 - 11:20 AM<br />
|Development of bioinformatics workshop by a core facility<br />
|Alberto Riva, University of Florida, United States<br />
|-<br />
|11:20 - 11:55 AM<br />
|Small Group Discussions<br />
|<br />
|-<br />
|11:55 AM - 12:20 PM<br />
|Small Group Reports<br />
|<br />
|-<br />
|12:20 PM - 12:35 PM<br />
|nf-core - A community effort to collect a curated set of pipelines built using Nextflow (https://nf-co.re/).<br />
|Harshil Patel, The Francis Crick Institute, United Kingdom<br />
|-<br />
|}<br />
<br />
<br />
== Workshop Discussion ==<br />
===175 total people over the 2.5 hours (over capacity within room). 55 people participated for the full 2.5 hours including participation in the breakout sessions and discussions. 75 people for final Nextflow demo===<br />
<br />
*Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
**Large, diverse datasets from multiple sources both private and public from around the world.<br />
*Supporting single cell RNA-seq analysis: A Core's Perspective<br />
**Single-cell work has grown in demand over the last 5 years, and data analysis is becoming the bottleneck. Taking a community-based approach by collaborating with other HSPH teams and other schools (HMS) to tackle the problem: sequencing core (de-multiplexing), labs (iterative; requires research input--is cell cycling part or mitochondria?), training, etc. <br />
**built out the bcbio Python toolkit with 62 international contributors.<br />
**settled on the Seurat suite of tools but also uses many others such as MultiCCA<br />
*Conda and Bioconda, the best thing since sliced bread<br />
**Installing software: cores get asked to help install software, particularly packages that need dependencies.<br />
**No root access needed, ever. Dependencies are handled for you.<br />
**Free, and you can add your own packages.<br />
**module load activates a conda environment behind the scenes<br />
**Bioconductor packages are in Bioconda. For every package, a Singularity and a Docker container are built as well (BioContainers).<br />
**CoreOS--Quay.<br />
**1700 packages upgraded over a week, with Bioconductor upgrading behind the scenes.<br />
**Bioconda has 700+ contributors: release your tools.<br />
*Improving project management and tracking with Asana and Toggl<br />
**Fee for Service Center with up to 324 projects over the last 4 years.<br />
**Tracking projects in the transition from 1 team member to another as the project cycles through the experts<br />
**Analysis can be punctuated with long periods of time while investigator writes papers and grants. Needs to pick up history sometimes a year later.<br />
**Asana: have defined a workflow within Asana that includes intake, waiting periods, in progress systems, close out and bill<br />
**archive data to S3.<br />
**implemented Toggl to track time on each project and subtask. Integrates with Asana for project management components.<br />
**allows for giving people better estimates. Have found that, in general, they underestimate the work.<br />
*Bioinformatics training (in the context of a core)<br />
**Funders provide FTEs dedicated to training (Harvard Catalyst, HMS)<br />
**interplay between training and consulting: surge in single cell analysis highlights need for training in this technology<br />
**2/3 of time spent on training, the remainder on consulting and understanding best practices<br />
**partner with faculty on teaching for credit--e.g. an R component for their course <br />
**10:1 student to instructor ratios, 25 per class. Use local resources such as their HPC system. Publish materials on GitHub<br />
*Development of bioinformatics workshop by a core facility<br />
**being asked to provide practical bioinformatics training<br />
**challenges: a large and diverse audience, which makes it hard to develop a suitable curriculum; limited to 8 x 1-hour courses; need to find a source of support<br />
**partnered with the cancer center for admin support, the library for 5-seat lab, faculty for some lectures and research computing for the HiPerGator cluster with a dedicated allocation of cores.<br />
**successful: filled 50 spots in just a few days, with over ½ attending all lectures. Lectures were video-recorded and are publicly available.<br />
<br />
==Breakout sessions==<br />
*Training<br />
**chunk out training and repackage and create efficiency<br />
**signups: undersubscription vs. oversubscription. Charging a fee gives attendees some skin in the game.<br />
**Access to compute.<br />
**Google and AWS are used and are cost-effective; Jupyter notebooks are particularly cheap to run.<br />
*Single Cell<br />
**help people help themselves.<br />
**shiny apps<br />
**what lets you know it worked properly? primer dimers, Cell Ranger, but the '''Seurat''' R package is the main thing that came out of it.<br />
**need to talk about the standard set of thresholds<br />
*Project Management<br />
**from Excel to Google Docs<br />
**Asana, Trello, Jira<br />
**time tracking with Toggl and Harvest (app on phone, laptop, etc)<br />
**Wants: Confluence to integrate project management together with documentation?<br />
**fees help manage demand and help finance pipeline development<br />
*Conda/bioconda reproducibility<br />
<br />
==Demos==<br />
*Nextflow<br />
**manages reproducibility. integrates with many other schedulers <br />
**uses Conda<br />
**AWS iGenomes<br />
**git repo at nf-core/configs and test datasets at nf-core/test-datasets</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2019:_BioinfoCoreWorkshop&diff=10394ISMB 2019: BioinfoCoreWorkshop2019-07-22T10:20:44Z<p>Bgrichter: /* Breakout sessions */</p>
<hr />
<div>=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Monday, July 22, 2019, from 10:15 am to 12:40 pm at the Congress Center in Basel.<br />
<br />
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 16th year of bringing the Core community together at ISMB, we will explore topics relevant to bioinformatics core facilities through lightning talks and demos followed by small-group break out discussions with insights brought back to the full audience for further discussion and knowledge sharing.<br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, CRUK-MI, University of Manchester, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
==Part A: Technologies and Analytical Methods==<br />
<br />
Machine Learning, AI, single cell RNA-seq analysis, and conda/bioconda.<br />
<br />
==Part B: Communication and Training==<br />
<br />
Communication and project management tools and training offered by cores.<br />
<br />
==Part C: Small group discussion==<br />
<br />
During this hour-long session, audience members will divide into groups based on their own interests. Groups will come up with their main take away points and bring them back to the main audience for knowledge sharing and for further discussion. Topics may include all previous presentation areas as well as other areas of interest to running or working within a bioinformatics core facility.<br />
<br />
==Part D: Pipeline Demo==<br />
<br />
Demo of Nextflow<br />
<br />
==Schedule==<br />
<br />
{|class="wikitable"<br />
|-<br />
|Time<br />
|Title<br />
|Authors<br />
|-<br />
|10:20 - 10:30 AM<br />
|Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
|Yang Fann, NIH, United States<br />
|-<br />
|10:30 - 10:40 AM<br />
|Supporting single cell RNA-seq analysis: A Core's Perspective<br />
|Shannan Ho Sui, Harvard School of Public Health, United States<br />
|-<br />
|10:40 - 10:50 AM<br />
|Conda and Bioconda, the best thing since sliced bread<br />
|Devon Ryan, Max Planck Institute, Germany<br />
|-<br />
|10:50 - 11:00 AM<br />
|Improving project management and tracking with Asana and Toggl<br />
|Sara Brin Rosenthal, UCSD, United States<br />
|-<br />
|11:00 - 11:10 AM<br />
|Bioinformatics training (in the context of a core)<br />
|Radhika Khetani, Harvard School of Public Health, United States<br />
|-<br />
|11:10 - 11:20 AM<br />
|Development of bioinformatics workshop by a core facility<br />
|Alberto Riva, University of Florida, United States<br />
|-<br />
|11:20 - 11:55 AM<br />
|Small Group Discussions<br />
|<br />
|-<br />
|11:55 AM - 12:20 PM<br />
|Small Group Reports<br />
|<br />
|-<br />
|12:20 PM - 12:35 PM<br />
|nf-core - A community effort to collect a curated set of pipelines built using Nextflow (https://nf-co.re/).<br />
|Harshil Patel, The Francis Crick Institute, United Kingdom<br />
|-<br />
|}<br />
<br />
<br />
== Workshop Discussion ==<br />
<br />
*Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
**Large, diverse datasets from multiple sources both private and public from around the world.<br />
*Supporting single cell RNA-seq analysis: A Core's Perspective<br />
**Single-cell work has grown in demand over the last 5 years, and data analysis is becoming the bottleneck. Taking a community-based approach by collaborating with other HSPH teams and other schools (HMS) to tackle the problem: sequencing core (de-multiplexing), labs (iterative; requires research input--is cell cycling part or mitochondria?), training, etc. <br />
**built out the bcbio Python toolkit with 62 international contributors.<br />
**settled on the Seurat suite of tools but also uses many others such as MultiCCA<br />
*Conda and Bioconda, the best thing since sliced bread<br />
**Installing software: cores get asked to help install software, particularly packages that need dependencies.<br />
**No root access needed, ever. Dependencies are handled for you.<br />
**Free, and you can add your own packages.<br />
**module load activates a conda environment behind the scenes<br />
**Bioconductor packages are in Bioconda. For every package, a Singularity and a Docker container are built as well (BioContainers).<br />
**CoreOS--Quay.<br />
**1700 packages upgraded over a week, with Bioconductor upgrading behind the scenes.<br />
**Bioconda has 700+ contributors: release your tools.<br />
*Improving project management and tracking with Asana and Toggl<br />
**Fee for Service Center with up to 324 projects over the last 4 years.<br />
**Tracking projects in the transition from 1 team member to another as the project cycles through the experts<br />
**Analysis can be punctuated with long periods of time while investigator writes papers and grants. Needs to pick up history sometimes a year later.<br />
**Asana: have defined a workflow within Asana that includes intake, waiting periods, in progress systems, close out and bill<br />
**archive data to S3.<br />
**implemented Toggl to track time on each project and subtask. Integrates with Asana for project management components.<br />
**allows for giving people better estimates. Have found that, in general, they underestimate the work.<br />
*Bioinformatics training (in the context of a core)<br />
**Funders provide FTEs dedicated to training (Harvard Catalyst, HMS)<br />
**interplay between training and consulting: surge in single cell analysis highlights need for training in this technology<br />
**2/3 of time spent on training, the remainder on consulting and understanding best practices<br />
**partner with faculty on teaching for credit--e.g. an R component for their course <br />
**10:1 student to instructor ratios, 25 per class. Use local resources such as their HPC system. Publish materials on GitHub<br />
*Development of bioinformatics workshop by a core facility<br />
**being asked to provide practical bioinformatics training<br />
**challenges: a large and diverse audience, which makes it hard to develop a suitable curriculum; limited to 8 x 1-hour courses; need to find a source of support<br />
**partnered with the cancer center for admin support, the library for 5-seat lab, faculty for some lectures and research computing for the HiPerGator cluster with a dedicated allocation of cores.<br />
**successful: filled 50 spots in just a few days, with over ½ attending all lectures. Lectures were video-recorded and are publicly available.<br />
<br />
==Breakout sessions==<br />
*Training<br />
**chunk out training and repackage and create efficiency<br />
**signups: undersubscription vs. oversubscription. Charging a fee gives attendees some skin in the game.<br />
**Access to compute.<br />
**Google and AWS are used and are cost-effective; Jupyter notebooks are particularly cheap to run.<br />
*Single Cell<br />
**help people help themselves.<br />
**shiny apps<br />
**what lets you know it worked properly? primer dimers, Cell Ranger, but the '''Seurat''' R package is the main thing that came out of it.<br />
**need to talk about the standard set of thresholds<br />
*Project Management<br />
**from Excel to Google Docs<br />
**Asana, Trello, Jira<br />
**time tracking with Toggl and Harvest (app on phone, laptop, etc)<br />
**Wants: Confluence to integrate project management together with documentation?<br />
**fees help manage demand and help finance pipeline development<br />
*Conda/bioconda reproducibility</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2019:_BioinfoCoreWorkshop&diff=10393ISMB 2019: BioinfoCoreWorkshop2019-07-22T10:18:56Z<p>Bgrichter: </p>
<hr />
<div>=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Monday, July 22, 2019, from 10:15 am to 12:40 pm at the Congress Center in Basel.<br />
<br />
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 16th year of bringing the Core community together at ISMB, we will explore topics relevant to bioinformatics core facilities through lightning talks and demos followed by small-group break out discussions with insights brought back to the full audience for further discussion and knowledge sharing.<br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, CRUK-MI, University of Manchester, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
==Part A: Technologies and Analytical Methods==<br />
<br />
Machine Learning, AI, single cell RNA-seq analysis, and conda/bioconda.<br />
<br />
==Part B: Communication and Training==<br />
<br />
Communication and project management tools and training offered by cores.<br />
<br />
==Part C: Small group discussion==<br />
<br />
During this hour-long session, audience members will divide into groups based on their own interests. Groups will come up with their main take away points and bring them back to the main audience for knowledge sharing and for further discussion. Topics may include all previous presentation areas as well as other areas of interest to running or working within a bioinformatics core facility.<br />
<br />
==Part D: Pipeline Demo==<br />
<br />
Demo of Nextflow<br />
<br />
==Schedule==<br />
<br />
{|class="wikitable"<br />
|-<br />
|Time<br />
|Title<br />
|Authors<br />
|-<br />
|10:20 - 10:30 AM<br />
|Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
|Yang Fann, NIH, United States<br />
|-<br />
|10:30 - 10:40 AM<br />
|Supporting single cell RNA-seq analysis: A Core's Perspective<br />
|Shannan Ho Sui, Harvard School of Public Health, United States<br />
|-<br />
|10:40 - 10:50 AM<br />
|Conda and Bioconda, the best thing since sliced bread<br />
|Devon Ryan, Max Planck Institute, Germany<br />
|-<br />
|10:50 - 11:00 AM<br />
|Improving project management and tracking with Asana and Toggl<br />
|Sara Brin Rosenthal, UCSD, United States<br />
|-<br />
|11:00 - 11:10 AM<br />
|Bioinformatics training (in the context of a core)<br />
|Radhika Khetani, Harvard School of Public Health, United States<br />
|-<br />
|11:10 - 11:20 AM<br />
|Development of bioinformatics workshop by a core facility<br />
|Alberto Riva, University of Florida, United States<br />
|-<br />
|11:20 - 11:55 AM<br />
|Small Group Discussions<br />
|<br />
|-<br />
|11:55 AM - 12:20 PM<br />
|Small Group Reports<br />
|<br />
|-<br />
|12:20 PM - 12:35 PM<br />
|nf-core - A community effort to collect a curated set of pipelines built using Nextflow (https://nf-co.re/).<br />
|Harshil Patel, The Francis Crick Institute, United Kingdom<br />
|-<br />
|}<br />
<br />
<br />
== Workshop Discussion ==<br />
<br />
*Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
**Large, diverse datasets from multiple sources both private and public from around the world.<br />
*Supporting single cell RNA-seq analysis: A Core's Perspective<br />
**Single cell has been growing in demand over the last 5 years, and data analysis is becoming the bottleneck. Taking a community-based approach by collaborating with other HSPH teams and other schools (HMS) to tackle the problem: sequencing core (de-multiplexing), labs (iterative, requires research input--is cell cycling part or mitochondria?), training, etc.<br />
**Built out the bcbio Python toolkit with 62 international contributors.<br />
**Settled on the Seurat suite of tools but also uses many others, such as MultiCCA.<br />
*Conda and Bioconda, the best thing since sliced bread<br />
**Installing Software--get asked for help to install, particularly ones that need dependencies.<br />
**No root needed, ever. Dependencies are handled for you.<br />
**Free, and you can add your own packages.<br />
**"module load" activates a conda environment behind the scenes.<br />
**Bioconductor packages are in Bioconda. For every package, a Singularity and a Docker container are built as well (BioContainers).<br />
**CoreOS--Quay.<br />
**1700 packages via can over a week. behind the scene bioconductor upgrading<br />
**Bioconda has 700+ contributors; release your tools through it.<br />
*Improving project management and tracking with Asana and Toggl<br />
**Fee for Service Center with up to 324 projects over the last 4 years.<br />
**Tracking projects in the transition from 1 team member to another as the project cycles through the experts<br />
**Analysis can be punctuated with long periods of time while investigator writes papers and grants. Needs to pick up history sometimes a year later.<br />
**Asana: have defined a workflow within Asana that includes intake, waiting periods, in progress systems, close out and bill<br />
**archive data to S3.<br />
**Implemented Toggl to track time on each project and subtask; it integrates with Asana for the project management components.<br />
**Time tracking allows giving people better estimates. Have found that, in general, they underestimate the work.<br />
*Bioinformatics training (in the context of a core)<br />
**Funders provide FTEs dedicated to training (Harvard Catalyst, HMS)<br />
**interplay between training and consulting: surge in single cell analysis highlights need for training in this technology<br />
**2/3 time spent on training, the remainder on consulting and understanding best practices<br />
**partner with faculty on teaching for credit--e.g. an R component for their course<br />
**10:1 student to instructor ratios, 25 per class. Use local resources such as their HPC system. Publish materials on GitHub<br />
*Development of bioinformatics workshop by a core facility<br />
**being asked to provide practical bioinformatics training<br />
**challenges: a large and diverse audience, which makes it hard to develop a suitable curriculum; limited to 8 x 1-hour courses; need to find a source of support<br />
**partnered with the cancer center for admin support, the library for 5-seat lab, faculty for some lectures and research computing for the HiPerGator cluster with a dedicated allocation of cores.<br />
**Successful: filled 50 spots in just a few days, with over ½ attending all lectures. Video-recorded and publicly available.<br />
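
The Conda and Bioconda notes above say that dependencies are handled for you. One common way to make that reproducible is to keep an environment file per project. The following is only a minimal sketch in Python of writing such a file: the helper name <code>render_environment</code>, the environment name and the package pins are illustrative assumptions, not something presented at the workshop. The channel order (conda-forge, bioconda, defaults) follows the Bioconda recommendation.

```python
def render_environment(name, packages,
                       channels=("conda-forge", "bioconda", "defaults")):
    """Return the text of a conda environment file.

    name     -- environment name (hypothetical, chosen by the user)
    packages -- package specs such as "samtools=1.9"
    channels -- channel search order, highest priority first
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative environment; the packages are examples only.
    with open("environment.yml", "w") as handle:
        handle.write(render_environment("rnaseq", ["samtools=1.9", "salmon"]))
```

The resulting file can then be passed to <code>conda env create -f environment.yml</code>, so that the same environment can be recreated on another machine.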
<br />
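The project management notes above report that tracked time tends to show work being underestimated. A minimal sketch of how that comparison could be computed is given below. It assumes time entries are available as simple (project, hours) pairs and estimates as a project-to-hours mapping; both formats and the function name <code>estimate_ratios</code> are hypothetical and are not Toggl's or Asana's actual export format.

```python
from collections import defaultdict


def estimate_ratios(entries, estimates):
    """Compare tracked hours with estimated hours per project.

    entries   -- iterable of (project, hours) pairs (hypothetical format)
    estimates -- dict mapping project -> estimated hours
    Returns a dict mapping project -> actual / estimated; a value above 1
    means the project took longer than estimated.
    """
    actual = defaultdict(float)
    for project, hours in entries:
        actual[project] += hours
    # Skip projects without a positive estimate to avoid division by zero.
    return {project: actual[project] / estimated
            for project, estimated in estimates.items() if estimated > 0}


if __name__ == "__main__":
    tracked = [("A", 3.0), ("A", 5.0), ("B", 2.0)]
    print(estimate_ratios(tracked, {"A": 4.0, "B": 4.0}))
```

Ratios collected over many closed projects could then be used to scale future quotes.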
==Breakout sessions==<br />
*Training<br />
**chunk out training and repackage and create efficiency<br />
**signups--under subscription vs. over. Charging to put some skin in the game<br />
**Access to compute.<br />
**Google and AWS are in use and cost-effective. Use of Jupyter notebooks is particularly cheap.<br />
*Single Cell<br />
**help people help themselves.<br />
**shiny apps<br />
**What lets you know it worked properly? Primer dimers, Cell Ranger, but the '''Seurat''' R package is the main thing that came out of it.<br />
**need to talk about the standard set of thresholds<br />
*Project Management<br />
**from Excel to Google Docs<br />
**Asana, Trello, Jira<br />
**time tracking with Toggl and Harvest (app on phone, laptop, etc)<br />
**Wants: Confluence?<br />
**fees help manage demand and help finance pipeline development<br />
*Conda/bioconda reproducibility</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2019:_BioinfoCoreWorkshop&diff=10392ISMB 2019: BioinfoCoreWorkshop2019-07-22T10:18:27Z<p>Bgrichter: </p>
<hr />
<div>=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Monday, July 22, 2019, from 10:15 to 12:40 pm at the Congress Center in Basel.<br />
<br />
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 16th year of bringing the Core community together at ISMB, we will explore topics relevant to bioinformatics core facilities through lightning talks and demos followed by small-group break out discussions with insights brought back to the full audience for further discussion and knowledge sharing.<br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, CRUK-MI, University of Manchester, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
==Part A: Technologies and Analytical Methods==<br />
<br />
Machine Learning, AI, single cell RNA-seq analysis, and conda/bioconda.<br />
<br />
==Part B: Communication and Training==<br />
<br />
Communication and project management tools and training offered by cores.<br />
<br />
==Part C: Small group discussion==<br />
<br />
During this hour-long session, audience members will divide into groups based on their own interests. Groups will come up with their main take away points and bring them back to the main audience for knowledge sharing and for further discussion. Topics may include all previous presentation areas as well as other areas of interest to running or working within a bioinformatics core facility.<br />
<br />
==Part D: Pipeline Demo==<br />
<br />
Demo of Nextflow<br />
<br />
==Schedule==<br />
<br />
{|class="wikitable"<br />
|-<br />
|Time<br />
|Title<br />
|Authors<br />
|-<br />
|10:20 - 10:30 AM<br />
|Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
|Yang Fann, NIH, United States<br />
|-<br />
|10:30 - 10:40 AM<br />
|Supporting single cell RNA-seq analysis: A Core's Perspective<br />
|Shannan Ho Sui, Harvard School of Public Health, United States<br />
|-<br />
|10:40 - 10:50 AM<br />
|Conda and Bioconda, the best thing since sliced bread<br />
|Devon Ryan, Max Planck Institute, Germany<br />
|-<br />
|10:50 - 11:00 AM<br />
|Improving project management and tracking with Asana and Toggl<br />
|Sara Brin Rosenthal, UCSD, United States<br />
|-<br />
|11:00 - 11:10 AM<br />
|Bioinformatics training (in the context of a core)<br />
|Radhika Khetani, Harvard School of Public Health, United States<br />
|-<br />
|11:10 - 11:20 AM<br />
|Development of bioinformatics workshop by a core facility<br />
|Alberto Riva, University of Florida, United States<br />
|-<br />
|11:20 - 11:55 AM<br />
|Small Group Discussions<br />
|<br />
|-<br />
|11:55 AM - 12:20 PM<br />
|Small Group Reports<br />
|<br />
|-<br />
|12:20 PM - 12:35 PM<br />
|nf-core - A community effort to collect a curated set of pipelines built using Nextflow (https://nf-co.re/).<br />
|Harshil Patel, The Francis Crick Institute, United Kingdom<br />
|-<br />
|}<br />
<br />
<br />
== Workshop Discussion ==<br />
<br />
*Transitioning bioinformatics core to support biomedical AI/ML research - lessons learned<br />
**Large, diverse datasets from multiple sources both private and public from around the world.<br />
*Supporting single cell RNA-seq analysis: A Core's Perspective<br />
**Single cell has been growing in demand over the last 5 years, and data analysis is becoming the bottleneck. Taking a community-based approach by collaborating with other HSPH teams and other schools (HMS) to tackle the problem: sequencing core (de-multiplexing), labs (iterative, requires research input--is cell cycling part or mitochondria?), training, etc.<br />
**Built out the bcbio Python toolkit with 62 international contributors.<br />
**Settled on the Seurat suite of tools but also uses many others, such as MultiCCA.<br />
*Conda and Bioconda, the best thing since sliced bread<br />
**Installing Software--get asked for help to install, particularly ones that need dependencies.<br />
**No root needed, ever. Dependencies are handled for you.<br />
**Free, and you can add your own packages.<br />
**"module load" activates a conda environment behind the scenes.<br />
**Bioconductor packages are in Bioconda. For every package, a Singularity and a Docker container are built as well (BioContainers).<br />
**CoreOS--Quay.<br />
**1700 packages via can over a week. behind the scene bioconductor upgrading<br />
**Bioconda has 700+ contributors; release your tools through it.<br />
*Improving project management and tracking with Asana and Toggl<br />
**Fee for Service Center with up to 324 projects over the last 4 years.<br />
**Tracking projects in the transition from 1 team member to another as the project cycles through the experts<br />
**Analysis can be punctuated with long periods of time while investigator writes papers and grants. Needs to pick up history sometimes a year later.<br />
**Asana: have defined a workflow within Asana that includes intake, waiting periods, in progress systems, close out and bill<br />
**archive data to S3.<br />
**Implemented Toggl to track time on each project and subtask; it integrates with Asana for the project management components.<br />
**Time tracking allows giving people better estimates. Have found that, in general, they underestimate the work.<br />
*Bioinformatics training (in the context of a core)<br />
**Funders provide FTEs dedicated to training (Harvard Catalyst, HMS)<br />
**interplay between training and consulting: surge in single cell analysis highlights need for training in this technology<br />
**2/3 time spent on training, the remainder on consulting and understanding best practices<br />
**partner with faculty on teaching for credit--e.g. an R component for their course<br />
**10:1 student to instructor ratios, 25 per class. Use local resources such as their HPC system. Publish materials on GitHub<br />
*Development of bioinformatics workshop by a core facility<br />
**being asked to provide practical bioinformatics training<br />
**challenges: a large and diverse audience, which makes it hard to develop a suitable curriculum; limited to 8 x 1-hour courses; need to find a source of support<br />
**partnered with the cancer center for admin support, the library for 5-seat lab, faculty for some lectures and research computing for the HiPerGator cluster with a dedicated allocation of cores.<br />
**Successful: filled 50 spots in just a few days, with over ½ attending all lectures. Video-recorded and publicly available.<br />
<br />
==Breakout sessions==<br />
*Training<br />
**chunk out training and repackage and create efficiency<br />
**signups--under subscription vs. over. Charging to put some skin in the game<br />
**Access to compute.<br />
**Google and AWS are in use and cost-effective. Use of Jupyter notebooks is particularly cheap.<br />
*Single Cell<br />
**help people help themselves.<br />
**shiny apps<br />
**What lets you know it worked properly? Primer dimers, Cell Ranger, but the '''Seurat''' R package is the main thing that came out of it.<br />
**need to talk about the standard set of thresholds<br />
*Project Management<br />
**from Excel to Google Docs<br />
**Asana, Trello, Jira<br />
**time tracking with Toggl and Harvest (app on phone, laptop, etc)<br />
**Wants: Confluence?<br />
**fees help manage demand and help finance pipeline development<br />
*Conda/bioconda reproducibility</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Gjain&diff=10370User:Gjain2019-03-20T21:51:22Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Bioinformatics, Computational Biology and Data Scientist with extensive interdisciplinary and international experience <br />
<br />
10+ years' experience in big data analysis, pipeline development, data visualization, web development using computational biology, statistics and machine learning methodology.<br />
<br />
Specialties: <br />
• Languages: Python, Perl, R, HTML/HTML5, PHP, Javascript, AJAX, Shell scripting, C/C++, Octave <br />
<br />
• Big Data: Next generation sequencing, Statistical and bioinformatics data analysis, Machine learning algorithms<br />
<br />
• High-throughput Assays: Chromosome Conformation Capture (3C)-based techniques, ChIPseq, RNAseq, Single-cell RNAseq<br />
<br />
• Software systems and databases: MySQL, Oracle, SVN, Git, Django<br />
<br />
• Other: Unix/Linux, Vector Graphics/illustrations, High performance cluster computing (HPCC), cloud computing (AWS)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Gjain&diff=10371User talk:Gjain2019-03-20T21:51:22Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 16:51, 20 March 2019 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Aswin_sjc&diff=10364User:Aswin sjc2019-02-25T13:43:38Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>I am Aswin S Soman. I completed my 12th grade at Govt. V&HSS Vellanad. I am currently pursuing a BS-MS dual degree in the CEG (Computational & Evolutionary Genetics) Lab, Indian Institute of Science Education and Research Bhopal.<br />
As a master's scholar, I am involved in the processing and analysis of single-cell RNA-sequencing data.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Aswin_sjc&diff=10365User talk:Aswin sjc2019-02-25T13:43:38Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 07:43, 25 February 2019 (CST)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Ameynert&diff=10362User:Ameynert2019-02-25T13:41:44Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Dr Meynert is the Bioinformatics Analysis Core (BAC) Manager at the MRC Institute of Genetics and Molecular Medicine (Human Genetics Unit) at the University of Edinburgh. She has been with the MRC HGU since 2010, when she joined as a Career Development Fellow in Prof Martin Taylor’s Evolutionary Genomics group. Subsequently in 2014, she moved to Prof Colin Semple’s Medical and Regulatory Genomics group as a Bioinformatician post-doctoral staff member, and was promoted to Medical Genomics Team Leader in 2016, and to BAC Manager in 2018. Dr Meynert completed a research MSc in Bioinformatics at Simon Fraser University (Vancouver, Canada) in 2005, and began her PhD studies at the European Bioinformatics Institute, University of Cambridge (Cambridge, UK) immediately afterwards, completing her degree in 2009. As BAC Manager, Dr Meynert supervises and directs a small team of post-doctoral level researchers and staff undertaking bioinformatics analyses of high throughput sequencing data for various biomedical research projects across the IGMM.<br />
<br />
During her research career, Dr Meynert has focused primarily on exploring the applications and limitations of high throughput sequencing in the context of human genetics. She has developed methods for investigating the sensitivity of variant detection using different sequencing techniques and platforms, and is currently investigating the replicability of variant detection in formalin-fixed paraffin embedded (FFPE) tumour samples. Dr Meynert is a sought after collaborator within the IGMM and internationally, as evidenced by her strong publication record as a co-author on developmental disorder and cancer genetics projects. She has also participated in the FANTOM5 consortium producing a mammalian promoter atlas. Her current collaborations include the analysis of the largest study to date of whole genome sequencing of ovarian cancer patients, co-funded by the Scottish Genomes Partnership and AstraZeneca.<br />
<br />
Dr Meynert was a section editor of the journal Biomolecular Detection and Quantification from 2013-2018. She has organized and hosted two Scottish NextGen Bioinformatics User Group (NextGenBUG) meetings at the IGMM in 2015 and 2017, and since 2017 has been the primary organizer of thrice-yearly Edinburgh Bioinformatics meetings. Since 2016 she has been responsible for teaching introductory high throughput sequencing analysis workshops for the Molecular Pathology MSc and Human Genetics Unit PhD students at the IGMM, and ad hoc workshops for other researchers and industrial partners.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Ameynert&diff=10363User talk:Ameynert2019-02-25T13:41:44Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 07:41, 25 February 2019 (CST)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Danrobertson87&diff=10354User:Danrobertson872018-11-03T17:54:06Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Bioinformatics support officer - Wellcome Centre for Cell Biology, Edinburgh University<br />
<br />
For the last three years I’ve been working in a bioinformatics core facility in a team supporting twenty research groups within the Wellcome Centre for Cell Biology. Prior to this I worked as a bioinformatician for three years at Anthony Nolan.<br />
<br />
MRes Bioinformatics, University of Glasgow</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Danrobertson87&diff=10355User talk:Danrobertson872018-11-03T17:54:06Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 12:54, 3 November 2018 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Apeltzer&diff=10352User:Apeltzer2018-11-03T17:53:08Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>I’m Alexander Peltzer, a bioinformatician working in bioinformatics on liver cancer analysis at the Quantitative Biology Center Tuebingen. While pursuing my Ph.D, I worked at the Max-Planck-Institute for the Science of Human History in Jena/Germany and the Eberhard-Karls-Universität in Tübingen/Germany. My work centers around developing novel methods for the scalable and reproducible analysis of multi-omics liver cancer data.<br />
<br />
Formerly, I worked on developing novel methods for the analysis and handling of ancient DNA (aDNA), with a specific focus on making my methods more user-friendly and thus suitable for archaeologists and biologists. My main PhD project was about making state of the art methods available to a larger (and growing!) community that wants to leverage aDNA to gain insights into evolutionary events all over the globe.<br />
<br />
Another aspect of my work is to make my research reproducible: making use of projects such as Docker, Nextflow and other tools to “sandbox” genomics applications to make them portable between different types of infrastructure, such as cloud instances, local workstations and HPC environments. In general, user-friendly applications for genomics can be seen as one of the thriving aspects of my work. I also made some smaller tools, e.g. in R/Shiny, to help users choose better parameters for their DNA mapping analysis with BWA-Mismatches.<br />
<br />
I received my BSc and MSc degrees as well as my Ph.D in Bioinformatics from the Eberhard-Karls-Universität Tübingen, while also studying abroad at the San Francisco State University, CA (USA) during my Master’s.<br />
<br />
Check out my GitHub page if you’re looking for some of my tools and methods. Most of them are open source and licenced under GPLv3 and I’m very happy to receive feedback or pull requests.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Apeltzer&diff=10353User talk:Apeltzer2018-11-03T17:53:08Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 12:53, 3 November 2018 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2018:_BioinfoCoreWorkshop&diff=10334ISMB 2018: BioinfoCoreWorkshop2018-05-09T15:40:28Z<p>Bgrichter: /* Workshop Overview */</p>
<hr />
<div><br />
<br />
=Workshop Overview=<br />
<br />
The bioinfo-core workshop is scheduled for Saturday, July 7, 2018, from 2:00-4:00 pm.<br />
<br />
The workshop will explore three topics relevant to bioinformatics core facilities. Members of core facilities will share their experience and insights in lightning talks to introduce the topics and some of the issues followed by topical small group discussions to deepen and expand upon the topics. <br />
<br />
Organizers:<br />
<br />
* Madelaine Gogol, Stowers Institute, United States<br />
* Hemant Kelkar, University of North Carolina, United States<br />
* Alastair Kerr, University of Edinburgh, United Kingdom<br />
* Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States<br />
* Alberto Riva, University of Florida, United States<br />
<br />
==Part A: Strategies for Hiring, Recruiting, and Interviewing new bioinformaticians==<br />
* finding and hiring people<br />
* interview techniques and questions<br />
* best practices for recruiting candidates<br />
<br />
==Part B: Containerization, Clouds, and Workflows==<br />
* cloud infrastructure limitations and recommendations<br />
* Key datasets in Clouds <br />
* containerization<br />
* workflow development and results of a favorite-tool survey<br />
<br />
==Part C: When good experiments go bad: Negotiating experiment quality failures==<br />
* detecting failure<br />
* guidelines for terminating bad projects<br />
<br />
==Part D: Small group discussion==<br />
During this session, audience members will divide into groups based on their own interests. Groups will discuss the areas introduced (or other issues, concerns and ideas), then come up with their main take away points and bring them back to share with the larger group. Topics may include:<br />
* interviewing / hiring / etc.<br />
* containers, clouds, workflows<br />
* experiment failure / quality control<br />
* single cell analysis<br />
* nanopore</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Dboernigen&diff=10322User talk:Dboernigen2018-02-07T16:02:26Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 10:02, 7 February 2018 (CST)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Dboernigen&diff=10321User:Dboernigen2018-02-07T16:02:25Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Since 2015 - Scientist at University Hospital Hamburg Eppendorf, Germany<br />
2013-2015 - Sr. Postdoc at The University of Chicago, USA<br />
2011-2013 - Postdoc at Harvard University, USA<br />
2007-2011 - PhD Student at Katholieke Universiteit Leuven, Belgium<br />
<br />
Research interests: <br />
Gene Expression and Regulation, Network Biology, Data Integration, Machine Learning, Candidate Gene Prioritization, Comparative Genomics, Human Genetics and Cancer Genetics</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Mfernandes61&diff=10320User talk:Mfernandes612018-02-07T16:02:01Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 10:02, 7 February 2018 (CST)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Mfernandes61&diff=10319User:Mfernandes612018-02-07T16:02:00Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Bioinformatics Training Developer at CRUK Cambridge Institute 2018-<br />
Bioinformatics Training Developer at Institute of Food Research/Quadram Institute 2015-2017<br />
Research Scientist (Bayesian belief networks) at Institute of Food Research 2000-2015<br />
Scientific IT support role at Institute of Food Research 1987-2000<br />
Research Scientist (Image Analysis) at Institute of Food Research 1984-1987</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Udsami&diff=10309User talk:Udsami2017-10-26T20:29:14Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 15:29, 26 October 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Udsami&diff=10308User:Udsami2017-10-26T20:29:13Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Umit Sami has over fifteen years of experience in various sub-fields of Computer and System Engineering. His scientific interests are mostly in design, development and deployment of secure, scalable and intelligent computational systems and AI applications. He has earned a B.S. in Computer Science from Binghamton University and M.S. in Management from Stony Brook University.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ERIS&diff=10307ERIS2017-10-24T21:46:25Z<p>Bgrichter: </p>
<hr />
<div>'''Location''': Boston and Cambridge MA, USA<br />
<br />
Partners HealthCare System<br />
*Consisting of Massachusetts General Hospital<br />
*Brigham and Women's Hospital<br />
*Dana Farber Cancer Institute<br />
*McLean Hospital<br />
*Spaulding Rehab Hospital<br />
* + 7 regional hospitals<br />
<br />
'''Name/Title''': Brent G. Richter, Director:<br><br />
*Enterprise Research Infrastructure & Systems (ERIS)<br />
*Massachusetts General Hospital & Brigham and Women's Hospital Site Research IT Services<br />
*Bioinformatics, Harvard Medical School/Partners HealthCare Center for Genetics and Genomics<br />
<br />
'''Links:''' https://rc.partners.org<br />
<br />
'''Group size:''' Diverse:<br />
*3 bioinformatician developers<br />
*3 Systems Engineers/administrators<br />
*4 Scientific, Data Analytics & parallel computing Engineers/Specialists/Scientists<br />
*1 DBA/ DB developer<br />
*4 Site (MGH, BWH and McLean) senior support specialists<br />
*8 Site Infrastructure Technicians<br />
<br />
'''Environment:''' Diverse:<br />
*Linux, HPC clusters (Windows and Linux), Massive Storage, Internal Cloud, Oracle/Postgres/MySQL, Application support, web hosting<br />
<br />
'''Tools:''' <br />
Sequencing, next-gen instruments, proteomics, image analysis, R, Jupyter, Python, SAS, everything in public domain.<br />
<br />
'''Pertinent hardware info:''' Mainly HP, some Dell, vanilla boxes, Cisco, IB<br />
<br />
Training: <br />
<br />
Bioinformatics support model: Research-grade SLAs; supports pipelines and tools as they can be applied to the HPC environment (Linux and Windows clusters, large shared-memory machines, web-hosted pipelines, etc.).</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Cprietos&diff=10300User talk:Cprietos2017-10-17T15:07:01Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 10:07, 17 October 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Cprietos&diff=10299User:Cprietos2017-10-17T15:06:59Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Carlos Prieto obtained his PhD in Computational Biology at the Cancer Research Center of Salamanca (CIC USAL/CSIC) in 2009. His PhD research focused on the study of molecular cancer processes. He developed new algorithms, software and methods for the analysis of transcriptomic and protein interaction data.<br />
His postdoctoral professional experience has spanned the corporate and academic sectors. He worked as a Bioinformatic Analyst for Icinetic, carrying out a project for the pharmaceutical company PharmaMar, and also worked for the biopharmaceutical company Celgene as a computational biologist. In the academic sector, he held two postdoctoral positions, at INBIOTEC (Biotechnological Institute of Leon) and at the Biomedical and Biotechnological Institute of Cantabria (IBBTEC CSIC). In these positions, he developed new bioinformatics skills in the biotechnology and human health research areas. He is currently working at the University of Salamanca (USAL), where he is applying his bioinformatics skills to the creation of a new bioinformatics core facility.<br />
His experience in research projects and peer-reviewed articles in recent years (ORCID: 0000-0001-8178-9768) includes data visualization, protein-protein interaction networks, RNA-Seq analysis, reverse engineering, microarray analysis, pattern recognition, machine learning, genome sequencing, algorithm design, software development and integrative bioinformatics.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Wilma&diff=10296User talk:Wilma2017-09-15T13:11:43Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 08:11, 15 September 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Wilma&diff=10295User:Wilma2017-09-15T13:11:42Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>I received my PhD in Biology from the Heinrich-Heine University Dusseldorf, Germany in 2006. Afterwards I started a postdoc with Des Higgins at University College Dublin, Ireland. During that time I authored or co-authored multiple sequence alignment programs such as ClustalW2, Clustal-Omega and R-Coffee. I then started a Research Associate position at the Genome Institute of Singapore, where I currently lead the Research Pipeline Development team, part of the Scientific Research Computing division.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Ewels&diff=10285User talk:Ewels2017-07-25T05:28:51Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 00:28, 25 July 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Ewels&diff=10284User:Ewels2017-07-25T05:28:50Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Deputy Head of Facility at the National Genomics Infrastructure in Stockholm, Sweden. I work towards developing new analysis pipelines and quality control procedures, with a focus on data visualisation and user-friendly interfaces. Previously I did a PhD and postdoc at the Babraham Institute in Cambridge, UK, working in epigenetics with Hi-C and bisulfite-sequencing data.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Rshamilton&diff=10283User talk:Rshamilton2017-07-25T05:28:30Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 00:28, 25 July 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Rshamilton&diff=10282User:Rshamilton2017-07-25T05:28:29Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Russell Hamilton heads the bioinformatics core facility at the Centre for Trophoblast Research (CTR), University of Cambridge. The CTR focuses on the study of the placenta and maternal-fetal interactions during pregnancy and brings together over 25 Principal Investigators based in different departments within the University of Cambridge and the Babraham Institute.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=Main_Page&diff=10280Main Page2017-07-24T12:48:01Z<p>Bgrichter: </p>
<hr />
<div>'''Welcome to the bioinfo-core's wiki!''' <br />
<br><br />
<br><br />
*'''[http://lists.open-bio.org/mailman/listinfo/bioinfo-core Sign up to the listserv and participate in the discussion!]'''<br />
*'''[[BioWiki:Community_portal | Add your core to the wiki]]'''<br />
<br><br />
<br><br />
We thank [http://www.iscb.org ISCB] for hosting and maintaining this wiki.<br />
<br><br />
<br><br />
===Newest Content===<br />
<br />
* [[ISMB_2017:_BioinfoCoreWorkshop | ISMB 2017 Workshop]]<br />
* [[ISMB_2016:_BioinfoCoreWorkshop | ISMB 2016 Workshop]]<br />
* [[18th_Discussion-16_Oct_2015 | 18th Discussion - ISMB2015 follow up]]<br />
* [[Interesting NGS failures]]<br />
* [[ISMB_2015:_BioinfoCoreWorkshop|ISMB 2015 Workshop - The evolving relationship between core facilities and researchers]]<br />
* [[17th_Discussion-27_Feb_2015 | 17th Discussion - Best practices for bioinformatics training]]<br />
* [[ISMB_2014:_InfrastructureForNewCores|16th Discussion - ISMB2014 follow up: Infrastructure for new Cores]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshopWriteUp|ISMB 2014 Workshop Write Up]]<br />
* [[15th_Discussion-24_Feb_2014 | 15th Discussion - The biologist is the analyst]]<br />
* [[ISCB_COSI_Proposal | Proposal to make bioinfo-core an ISCB community of special interest]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshop|ISMB 2014 Workshop Proposal]]<br />
* [[14th_Discussion-7_November_2013| 14th Discussion - Evaluating software]]<br />
* [[ISMB_2013:_BioinfoCoreWorkshop|ISMB 2013 Workshop]]<br />
* [[13th_Discussion-5_November_2012| 13th Discussion - Embedded bioinformaticians and Integrative analysis]]<br />
* [[ISMB_2012:_Workshop_Proposal|ISMB 2012 Workshop]]<br />
* [[12th_Discussion-21_May_2012|12th Discussion - Managing Storage in a Core Facility]]<br />
* [[11th_Discussion-7_November_2011|11th Discussion - Measuring the output of a Core and Tracking Software Versions]]<br />
* ISMB 2012: Bioinfo Core Workshop - Long Beach CA - July 16, 2012 [http://www.iscb.org/ismb2012-program/ismb2012-workshops#w3 ISMB Workshop]<br />
* [[ISMB 2011: Workshop on Analysis Pipelines for High Throughput Sequencing]]<br />
* [[ISMB 2011: Workshop on Practical Aspects of Running a Core Facility]]<br />
* [[ISMB 2011 Workshop Call]]<br />
* ISMB 2010 Workshop [[Call Minutes]] page<br />
* Include [http://twitter.com/#search?q=%23BioInfoCore #BioInfoCore] in your [http://twitter.com/ tweets] for the Core community. <br />
* Numerous new additions to the community portal<br />
<br />
*[[BioWiki:Community_portal | Community Portal]]<br />
<br />
= Introduction =<br />
Bioinfo-core is a worldwide body of people who manage or staff bioinformatics facilities within organizations of all types, including academia, academic medical centers, medical schools, biotechs and pharmas. Through this wiki and our online [http://lists.open-bio.org/mailman/listinfo/bioinfo-core discussion lists] we discuss many topics that challenge bioinformatics cores worldwide, from IT, new instrumentation, staffing and training bioinformaticians, tools and software to services for biologists and MDs.<br />
<BR><BR><br />
We hold several events throughout the year, including quarterly conference calls (with published [[Call Minutes]]) and a yearly set of informal presentations and dinners at the annual meeting, Intelligent Systems for Molecular Biology ([http://www.iscb.org/iscb-conferences ISMB]), the official conference of [http://www.iscb.org/ ISCB].<br />
<br><br><br />
Please browse, add and participate in the wiki and the discussion lists. To edit the wiki, create a New Account and then edit the [[BioWiki:Community_portal | Community Portal]] to add a link for your core facility and its description.<br />
<br />
= Wiki page links =<br />
*[[Call Minutes]]: Annual meetings at ISMB with presentations; detailed minutes from quarterly conference calls on selected and pertinent topics. <br />
*[[BioWiki:Community_portal | Community Portal]]: list your organization!<br />
*[[Ongoing Discussions]]: discussion forums including lists of software, tools, etc.<br />
*[[Special:Categories]]: find pages using categories such as Tools, Presentations, NextGenSequencing, Meetings etc.<br />
<br />
=Bioinfo-core Member Publications relevant to core facilities=<br />
*[http://collections.plos.org/ploscompbiol/corefacilities.php PLoS Computational Biology Journal--CORE facilities: editorial and perspectives]<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000372 The Need for Centralization of Computational Biology Resources] Lewitter F, Rebhan M, Richter B, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000369 Managing and Analyzing Next-Generation Sequence Data] Richter BG, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000368 Establishing a Successful Bioinformatics Core Facility Team] Lewitter F, Rebhan M</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User_talk:Roland.Krause&diff=10277User talk:Roland.Krause2017-07-24T12:30:04Z<p>Bgrichter: Welcome!</p>
<hr />
<div>'''Welcome to ''BioWiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents help pages].<br />
Again, welcome and have fun! [[User:Bgrichter|Bgrichter]] ([[User talk:Bgrichter|talk]]) 07:30, 24 July 2017 (CDT)</div>Bgrichterhttp://bioinfo-core.org/index.php?title=User:Roland.Krause&diff=10276User:Roland.Krause2017-07-24T12:30:03Z<p>Bgrichter: Creating user page for new user.</p>
<hr />
<div>Currently at the Luxembourg Centre for Systems Biomedicine, University of Luxembourg, in the bioinformatics core facility with Reinhard Schneider. ELIXIR-Luxembourg training coordinator. Data manager for European and international epilepsy genetics projects. R user. Previously at the Max Planck Institute for Molecular Genetics, Berlin, Germany and Cellzome AG, Heidelberg, Germany. Can write 50 word bio totally easily but wondering that so many people still signed up</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2017:_BioinfoCoreWorkshop&diff=10275ISMB 2017: BioinfoCoreWorkshop2017-07-24T12:20:45Z<p>Bgrichter: /* Ensuring Reproducibility */</p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismbeccb2017 ISMB/ECCB meeting] in Prague. We have been provided a [https://www.iscb.org/cms_addon/conferences/ismbeccb2017/workshops.php#WK02 half-day slot in the program] on Monday, July 24, 2017, 10:00 am – 12:30 pm.<br />
<br />
=Standing on Two Legs: Managing Operations in a Core and Ensuring Scientific Reproducibility=<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into two sessions with a required break between them. Each session has two 15-minute talks followed by a discussion. <br />
<br />
* The first session has two 15-minute talks on Managing Core Facilities, followed by a 40-minute audience discussion (10:30-11:10). After the break, two 15-minute talks on Ensuring Scientific Reproducibility are followed by a 30-minute panel discussion (12:00-12:30).<br />
<br />
== Workshop topics ==<br />
<br />
===Managing a Core Facility===<br />
<br />
<br />
'''Setting up a new bioinformatics core facility: a first year review'''<br><br />
Speaker: '''Russell Hamilton''', University of Cambridge<br><br />
Time: 10:00-10:15<br><br />
<br />
<br />
'''Managing people in a core facility'''<br><br />
Speaker: '''Annette McGrath''', CSIRO The University of Queensland<br><br />
Time: 10:15-10:30<br />
<br />
<br />
One constant of running a core facility is dealing with people at all levels: funders, staff, investigators/clients, etc.<br />
Differences among your clients affect what kind of staff, in terms of skills and personalities, a core needs to maintain. Consider the need for pipeline redundancy and how it translates to staffing, and whether staff need to interact directly with clients; customer service is important, for example.<br />
Managers have to take care of their people: Do they have what they need in computers, tools and training? Can they take pride in their work? Are time and priority conflicts worked out? Do they have the big picture of the mission of the organization, the core and their clients? Talk to your staff about what motivates them and how they can stay engaged. Assign projects that stretch their abilities and instill a sense of ownership of their projects. Celebrate successes.<br />
Annette has always advocated for the need for bioinformatics within her organizations; it is a priority. <br />
<br />
Some Questions:<br />
What's your biggest horror story?<br />
What are some of the traits and abilities you look for in a candidate?<br />
What's the size of your team(s) and the reporting structure?<br />
<br />
'''Management Discussion: 10:30-11:10'''<br />
<br />
Balance of operating a core and research: research is secondary, but how are you judged? How do you manage PhD students through the core? They are co-supervised. <br />
A fair number of people in the audience maintain both research and service. How are these balanced? The administrative part of running the core comes first.<br />
How many people are hired into academic tracks vs. professional tracks?<br />
The problem of career progression: there is none at all; people are hired into a position and remain at that level. This was highlighted at a recent Cores conference in the UK. Interestingly, funders (Wellcome, etc.) were present and discussed putting together a professional development track to parallel the academic track. What is rewarded? <br />
What is the balance between providing training and owning the expertise in a specific analytical domain? Train collaborators? How much of the analysis does a core do itself, and how much does it empower others to perform? Self-help is important for getting started and for answering low-level questions. However, developing this body of knowledge in how-to articles or knowledge bases is challenging because it takes time and effort.<br />
Another potential topic, raised a couple of times in discussion: training and education. How much do you train, and on what? At what level? Does it cut into your business?<br />
Training at CSIRO: develop a mission and stay focused. Aim to give others literacy so they understand some of the basic information. "Data internships", where a customer interns with a bioinformatician for a week or two.<br />
How do you stay organized as your team grows? Is it manageable: are you all working like mad, or is it the same number of projects, now more widely shared (greater bandwidth)? <br />
The "embedded" bioinformatician: how to keep them engaged in the core. On the dynamic of research bioinformaticians embedded in the core: these folks generally feel most isolated, so reach out and engage them as a community.<br />
<br />
Different organizational structures are a study/survey opportunity: reporting to an academic? Working inside institutional funding?<br />
<br />
<br />
===Ensuring Reproducibility===<br />
<br />
<br />
'''Developing Reliable QC at the Swedish National Genomics Infrastructure'''<br><br />
Speaker: '''Phil Ewels''', SciLifeLab (Sweden)<br><br />
Time: 11:30-11:45<br><br />
[[File:Phil_Ewels_ISMB_BioinfoCore_2017.pdf]]<br />
<br />
Core facility for all of Sweden. Maintains a continuum of QC, from fully automated and rigorous QC using MultiQC to occasional QC for development. Uses continuous integration: GitHub, Docker and Travis CI. Uses [https://github.com/search?q=topic%3Anextflow&type=Repositories Nextflow] for NGI-RNAseq and other pipelines.<br />
Everything is on GitHub: http://opensource.scilifelab.se, http://multiqc.info, http://GitHub.com/SciLifeLab.<br />
<br />
'''Reproducible and fully documented data analyses at the Functional Genomics Center Zurich'''<br><br />
Speaker: '''Lennart Opitz''', University of Zurich<br><br />
Time: 11:45-12:00<br><br />
<br />
Has leveraged an Open Technology Platform since 2002.<br />
<br />
<br />
'''Reproducibility Discussion: 12:00-12:30'''<br />
<br />
Opitz: How do you deal with external clients and data backup? Data is backed up by policy. Users sign up for retention of their data for a period of time; industry and external users understand that their data is captured and kept for 3-6 months. QC results are kept forever, but original data is purged.<br />
<br />
What is everyone's definition of reproducibility? Is it that you can run the data again in 10 years, and are you assured of getting the same results? Ewels has everything on GitHub and can re-run data using specific versions of tools branched on GitHub. <br />
[http://singularity.lbl.gov Singularity] is used to run containers on HPC. <br />
ISO certification: validation steps, everything is audited, and a complete documentation system covers the IT and informatics systems. <br />
<br />
Do you have a sense of whether your users understand the investment you have made to set up such a rigorous system? A lot of people couldn't reproduce their old pipelines and didn't trust them. They've written the QC measures to demonstrate quality and value and to improve reproducibility and trust.<br />
Try to develop interactive tools with Shiny.<br />
Training and coaching on experimental design, involving the core in experimental design. It's an advantage to work in a center of core labs, working together with the sequencing lab, informatics lab, etc., so the kickoff is all done together. It is also an advantage to have the wet lab right next door.<br />
Idea: an interactive tool that exports a reproducible script.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2017:_BioinfoCoreWorkshop&diff=10274ISMB 2017: BioinfoCoreWorkshop2017-07-24T12:19:47Z<p>Bgrichter: </p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismbeccb2017 ISMB/ECCB meeting] in Prague. We have been provided a [https://www.iscb.org/cms_addon/conferences/ismbeccb2017/workshops.php#WK02 half-day slot in the program] on Monday, July 24, 2017, 10:00 am – 12:30 pm.<br />
<br />
=Standing on Two Legs: Managing Operations in a Core and Ensuring Scientific Reproducibility=<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into two sessions with a required break between them. Each session has two 15-minute talks followed by a discussion. <br />
<br />
* The first session has two 15-minute talks on Managing Core Facilities, followed by a 40-minute audience discussion (10:30-11:10). After the break, two 15-minute talks on Ensuring Scientific Reproducibility are followed by a 30-minute panel discussion (12:00-12:30).<br />
<br />
== Workshop topics ==<br />
<br />
===Managing a Core Facility===<br />
<br />
<br />
'''Setting up a new bioinformatics core facility: a first year review'''<br><br />
Speaker: '''Russell Hamilton''', University of Cambridge<br><br />
Time: 10:00-10:15<br><br />
<br />
<br />
'''Managing people in a core facility'''<br><br />
Speaker: '''Annette McGrath''', CSIRO The University of Queensland<br><br />
Time: 10:15-10:30<br />
<br />
<br />
One constant of running a core facility is dealing with people at all levels: funders, staff, investigators/clients, etc.<br />
Differences among your clients affect what kind of staff, in terms of skills and personalities, a core needs to maintain. Consider the need for pipeline redundancy and how it translates to staffing, and whether staff need to interact directly with clients; customer service is important, for example.<br />
Managers have to take care of their people: Do they have what they need in computers, tools and training? Can they take pride in their work? Are time and priority conflicts worked out? Do they have the big picture of the mission of the organization, the core and their clients? Talk to your staff about what motivates them and how they can stay engaged. Assign projects that stretch their abilities and instill a sense of ownership of their projects. Celebrate successes.<br />
Annette has always advocated for the need for bioinformatics within her organizations; it is a priority. <br />
<br />
Some Questions:<br />
What's your biggest horror story?<br />
What are some of the traits and abilities you look for in a candidate?<br />
What's the size of your team(s) and the reporting structure?<br />
<br />
'''Management Discussion: 10:30-11:10'''<br />
<br />
Balance of operating a core and research: research is secondary, but how are you judged? How do you manage PhD students through the core? They are co-supervised. <br />
A fair number of people in the audience maintain both research and service. How are these balanced? The administrative part of running the core comes first.<br />
How many people are hired into academic tracks vs. professional tracks?<br />
The problem of career progression: there is none at all; people are hired into a position and remain at that level. This was highlighted at a recent Cores conference in the UK. Interestingly, funders (Wellcome, etc.) were present and discussed putting together a professional development track to parallel the academic track. What is rewarded? <br />
What is the balance between providing training and owning the expertise in a specific analytical domain? Train collaborators? How much of the analysis does a core do itself, and how much does it empower others to perform? Self-help is important for getting started and for answering low-level questions. However, developing this body of knowledge in how-to articles or knowledge bases is challenging because it takes time and effort.<br />
Another potential topic, raised a couple of times in discussion: training and education. How much do you train, and on what? At what level? Does it cut into your business?<br />
Training at CSIRO: develop a mission and stay focused. Aim to give others literacy so they understand some of the basic information. "Data internships", where a customer interns with a bioinformatician for a week or two.<br />
How do you stay organized as your team grows? Is it manageable: are you all working like mad, or is it the same number of projects, now more widely shared (greater bandwidth)? <br />
The "embedded" bioinformatician: how to keep them engaged in the core. On the dynamic of research bioinformaticians embedded in the core: these folks generally feel most isolated, so reach out and engage them as a community.<br />
<br />
Different organizational structures: study/survey opportunity--reporting to an academic? working inside institutional funding?<br />
<br />
<br />
===Ensuring Reproducibility===<br />
<br />
<br />
'''Developing Reliable QC at the Swedish National Genomics Infrastructure'''<br><br />
Speaker: '''Phil Ewels''', SciLifeLab (Sweden)<br><br />
Time: 11:30-11:45<br><br />
[[File:Phil_Ewels_ISMB_BioinfoCore_2017.pdf]]<br />
<br />
Core facility for all of Sweden. Maintains a continuum of QC, from fully automated and rigorous QC using MultiQC to occasional QC for development. Uses continuous integration: GitHub, Docker and Travis CI. Uses [https://github.com/search?q=topic%3Anextflow&type=Repositories Nextflow] for NGI-RNAseq and others.<br />
Everything is on GitHub: http://opensource.scilifelab.se, http://multiqc.info, http://GitHub.com/SciLifeLab.<br />
<br />
'''Reproducible and fully documented data analyses at the Functional Genomics Center Zurich'''<br><br />
Speaker: '''Lennart Opitz''', University of Zurich<br><br />
Time: 11:45-12<br><br />
<br />
Leverages an Open Technology Platform since 2002<br />
<br />
'''Reproducibility Discussion: 12-12:30'''<br />
<br />
Opitz: How do you deal with external clients and data backup? Data is backed up by policy. Users sign up for retention of their data for a period of time; industry and external users understand that their data is captured and kept for 3-6 months. QC results are kept forever, but the original data is purged.<br />
<br />
What is everyone's definition of reproducibility? Is it being able to run the data again in 10 years, and are you assured of getting the same results? Ewels has everything on GitHub and can re-run data using specific versions of tools branched on GitHub.<br />
[http://singularity.lbl.gov Singularity] is used to run containers on HPC.<br />
ISO certification: validation steps, everything is audited, complete documentation system around the IT and informatics systems. <br />
<br />
Do you have a sense of whether your users understand the investment you have made to set up such a rigorous system? A lot of people couldn't reproduce their old pipelines and didn't trust them. They've written the QC measures to demonstrate quality and value, and to improve reproducibility and trust.<br />
They try to develop interactive tools with Shiny.<br />
Training on experimental design and coaching, involving the core in experimental design. It's an advantage to work in a center of core labs and to work together with the sequencing lab, informatics lab, etc., so the kick-off is all done together. It's also an advantage to have the wet lab right next door.<br />
idea of an interactive tool that exports a reproducible script.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2017:_BioinfoCoreWorkshop&diff=10273ISMB 2017: BioinfoCoreWorkshop2017-07-24T12:15:55Z<p>Bgrichter: </p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismbeccb2017 ISMB/ECCB meeting] in Prague. We have been provided a [https://www.iscb.org/cms_addon/conferences/ismbeccb2017/workshops.php#WK02 half-day slot in the program] on Monday, July 24, 2017, 10:00 am – 12:30 pm<br />
<br />
=Standing on Two Legs: Managing Operations in a Core and Ensuring Scientific Reproducibility=<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into two sessions with a required break between. Each session will have two 15-minute talks followed by a discussion.<br />
<br />
* The first session will have two 15-minute talks on the topic of Managing Core Facilities followed by a 40-minute audience discussion. After the break we will have two 15-minute talks about Ensuring Scientific Reproducibility, followed by a 30-minute panel discussion.<br />
<br />
== Workshop topics ==<br />
<br />
'''Managing a Core Facility'''<br />
<br />
<br />
'''Setting up a new bioinformatics core facility: a first year review'''<br><br />
Speaker: '''Russell Hamilton''', University of Cambridge<br><br />
Time: 10:00-10:15<br><br />
<br />
<br />
'''Managing people in a core facility'''<br><br />
Speaker: '''Annette McGrath''', CSIRO The University of Queensland<br><br />
Time: 10:15-10:30<br />
<br />
<br />
One constant of running a core facility: you have to deal with people at all levels: funders, staff, investigators/clients, etc.<br />
Differences in your clients affect what kind of staff, in terms of skills and personalities, need to be maintained in a core: the need for pipeline redundancy and how it translates to staffing, and whether staff need to interact directly with clients (customer service is important, for example).<br />
Managers have to take care of their people: do they have what they need in computers, tools and training? Ensure they can take pride in their work, work out time/priority conflicts, and check that they have the big picture of the mission of the organization, the core and their clients. Talk to your folks about what motivates them and ways to stay engaged. Assign projects that stretch their abilities and instill a sense of ownership of their projects. Celebrate successes.<br />
Annette has always advocated for the need for bioinformatics within her organizations, as a priority.<br />
<br />
Some Questions:<br />
What's your biggest horror story?<br />
What are some of the traits you look for in a candidate?<br />
What's the size of your team(s) and the reporting structure?<br />
<br />
Management Panel: 10:30-11:10<br />
<br />
Balance of operating a core and research: research is secondary, but how are you judged? How do you manage PhD students through the core? They are co-supervised.<br />
A fair number of people in the audience maintain both research and service. How are these balanced? The administrative part of running the core comes first.<br />
How many people are hired into academic tracks vs. professional tracks?<br />
The problem of career progression: there is none at all. People are hired into a position and remain in that position. This was highlighted at a recent Cores conference in the UK. What is interesting is that funders (Wellcome, etc.) were present and discussed putting together a professional development track to parallel the academic track. What is rewarded?<br />
What is the balance between providing training and owning the expertise in a specific analytical domain? Train collaborators? How much of the analysis does a core do itself, and how much does it empower others to perform? Self-help is important to get started and to answer low-level questions. However, developing this body of knowledge as how-to articles or knowledge bases is challenging because it takes time and effort.<br />
Another potential topic, raised in a couple of discussions: training and education. How much do you train, and in what? At what level? Does it cut into your business?<br />
Training at CSIRO: develop a mission and stay focused. Aim to give others enough literacy to understand some of the basic information. "Data internships", where a customer interns with a bioinformatician for a week or two.<br />
How do you stay organized as you grow in people? Is it manageable: are you all working like mad, or is it the same number of projects, now more widely shared (greater bandwidth)?<br />
The "embedded" bioinformatician: how to keep them engaged in the core. On the dynamic of research bioinformaticians embedded in the core: these folks generally feel most isolated, so reach out and engage them as a community.<br />
<br />
Different organizational structures: study/survey opportunity--reporting to an academic? working inside institutional funding?<br />
<br />
'''Ensuring Reproducibility'''<br />
<br />
<br />
'''Developing Reliable QC at the Swedish National Genomics Infrastructure'''<br><br />
Speaker: '''Phil Ewels''', SciLifeLab (Sweden)<br><br />
Time: 11:30-11:45<br><br />
[[File:Phil_Ewels_ISMB_BioinfoCore_2017.pdf]]<br />
<br />
Core facility for all of Sweden. Maintains a continuum of QC, from fully automated and rigorous QC using MultiQC to occasional QC for development. Uses continuous integration: GitHub, Docker and Travis CI. Uses [https://github.com/search?q=topic%3Anextflow&type=Repositories Nextflow] for NGI-RNAseq and others.<br />
Everything is on GitHub: http://opensource.scilifelab.se, http://multiqc.info, http://GitHub.com/SciLifeLab.<br />
<br />
'''Reproducible and fully documented data analyses at the Functional Genomics Center Zurich'''<br><br />
Speaker: '''Lennart Opitz''', University of Zurich<br><br />
Time: 11:45-12<br><br />
<br />
Leverages an Open Technology Platform since 2002<br />
<br />
Reproducibility Panel: 12-12:30<br />
<br />
Opitz: How do you deal with external clients and data backup? Data is backed up by policy. Users sign up for retention of their data for a period of time; industry and external users understand that their data is captured and kept for 3-6 months. QC results are kept forever, but the original data is purged.<br />
<br />
What is everyone's definition of reproducibility? Is it being able to run the data again in 10 years, and are you assured of getting the same results? Ewels has everything on GitHub and can re-run data using specific versions of tools branched on GitHub.<br />
[http://singularity.lbl.gov Singularity] is used to run containers on HPC.<br />
ISO certification: validation steps, everything is audited, complete documentation system around the IT and informatics systems. <br />
<br />
Do you have a sense of whether your users understand the investment you have made to set up such a rigorous system? A lot of people couldn't reproduce their old pipelines and didn't trust them. They've written the QC measures to demonstrate quality and value, and to improve reproducibility and trust.<br />
They try to develop interactive tools with Shiny.<br />
Training on experimental design and coaching, involving the core in experimental design. It's an advantage to work in a center of core labs and to work together with the sequencing lab, informatics lab, etc., so the kick-off is all done together. It's also an advantage to have the wet lab right next door.<br />
idea of an interactive tool that exports a reproducible script.<br />
<br />
</div>Bgrichterhttp://bioinfo-core.org/index.php?title=Main_Page&diff=10272Main Page2017-07-24T05:49:47Z<p>Bgrichter: </p>
<hr />
<div>'''Welcome to the bioinfo-core's wiki!''' <br />
<br />
*[http://lists.open-bio.org/mailman/listinfo/bioinfo-core Participate in the discussion]<br />
*[[BioWiki:Community_portal | Add your core to the wiki]]<br />
<br><br />
We thank [http://www.iscb.org ISCB] for hosting and maintaining this wiki.<br />
<br><br />
<br><br />
===Newest Content===<br />
<br />
* [[ISMB_2017:_BioinfoCoreWorkshop | ISMB 2017 Workshop]]<br />
* [[ISMB_2016:_BioinfoCoreWorkshop | ISMB 2016 Workshop]]<br />
* [[18th_Discussion-16_Oct_2015 | 18th Discussion - ISMB2015 follow up]]<br />
* [[Interesting NGS failures]]<br />
* [[ISMB_2015:_BioinfoCoreWorkshop|ISMB 2015 Workshop - The evolving relationship between core facilities and researchers]]<br />
* [[17th_Discussion-27_Feb_2015 | 17th Discussion - Best practices for bioinformatics training]]<br />
* [[ISMB_2014:_InfrastructureForNewCores|16th Discussion - ISMB2014 follow up: Infrastructure for new Cores]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshopWriteUp|ISMB 2014 Workshop Write Up]]<br />
* [[15th_Discussion-24_Feb_2014 | 15th Discussion - The biologist is the analyst]]<br />
* [[ISCB_COSI_Proposal | Proposal to make bioinfo-core an ISCB community of special interest]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshop|ISMB 2014 Workshop Proposal]]<br />
* [[14th_Discussion-7_November_2013| 14th Discussion - Evaluating software]]<br />
* [[ISMB_2013:_BioinfoCoreWorkshop|ISMB 2013 Workshop]]<br />
* [[13th_Discussion-5_November_2012| 13th Discussion - Embedded bioinformaticians and Integrative analysis]]<br />
* [[ISMB_2012:_Workshop_Proposal|ISMB 2012 Workshop]]<br />
* [[12th_Discussion-21_May_2012|12th Discussion - Managing Storage in a Core Facility]]<br />
* [[11th_Discussion-7_November_2011|11th Discussion - Measuring the output of a Core and Tracking Software Versions]]<br />
* ISMB 2012: Bioinfo Core Workshop - Long Beach CA - July 16, 2012 [http://www.iscb.org/ismb2012-program/ismb2012-workshops#w3 ISMB Workshop]<br />
* [[ISMB 2011: Workshop on Analysis Pipelines for High Throughput Sequencing]]<br />
* [[ISMB 2011: Workshop on Practical Aspects of Running a Core Facility]]<br />
* [[ISMB 2011 Workshop Call]]<br />
* ISMB 2010 Workshop [[Call Minutes]] page<br />
* Include [http://twitter.com/#search?q=%23BioInfoCore #BioInfoCore] in your [http://twitter.com/ tweets] for the Core community. <br />
* Numerous new additions to the community portal<br />
<br />
*[[BioWiki:Community_portal | Community Portal]]<br />
<br />
= Introduction =<br />
Bioinfo-core is a worldwide body of people who manage or staff bioinformatics facilities within organizations of all types, including academia, academic medical centers, medical schools, biotechs and pharmas. Through this wiki and our online [http://lists.open-bio.org/mailman/listinfo/bioinfo-core discussion lists] we discuss many topics that challenge bioinformatics cores worldwide: from IT, new instrumentation, staffing and training bioinformaticians, tools and software, to services for biologists and MDs.<br />
<BR><BR><br />
We hold several events throughout the year, including quarterly conference calls (with published [[Call Minutes]]) and a yearly set of informal presentations and dinners at the annual meeting, Intelligent Systems for Molecular Biology ([http://www.iscb.org/iscb-conferences ISMB]), the official conference of [http://www.iscb.org/ ISCB].<br />
<br><br><br />
Please browse, add and participate in the wiki and the discussion lists. To edit the wiki, create a New Account and then edit the [[BioWiki:Community_portal | Community Portal]] to add a link for your core facility and its description.<br />
<br />
= Wiki page links =<br />
*[[Call Minutes]]: Annual meetings at ISMB with presentations; detailed minutes from quarterly conference calls on selected and pertinent topics.<br />
*[[BioWiki:Community_portal | Community Portal]]: list your organization!<br />
*[[Ongoing Discussions]]: discussion forums including lists of software, tools, etc.<br />
*[[Special:Categories]]: find pages using categories such as Tools, Presentations, NextGenSequencing, Meetings etc.<br />
<br />
=Bioinfo-core Member Publications relevant to core facilities=<br />
*[http://collections.plos.org/ploscompbiol/corefacilities.php PLoS Computational Biology Journal--CORE facilities: editorial and perspectives]<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000372 The Need for Centralization of Computational Biology Resources] Lewitter F, Rebhan M, Richter B, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000369 Managing and Analyzing Next-Generation Sequence Data] Richter BG, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000368 Establishing a Successful Bioinformatics Core Facility Team] Lewitter F, Rebhan M</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2016:_BioinfoCoreWorkshop&diff=10227ISMB 2016: BioinfoCoreWorkshop2016-07-12T19:09:24Z<p>Bgrichter: /* Big Data */</p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismb2016 2016 ISMB meeting] in Orlando, Florida. We have been given a half-day [https://www.iscb.org/ismb2016/2700 workshop track] slot in the program on Monday, July 11th from 2:00-4:30 PM.<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into 4 sessions of ~30 mins each with a required break between the first and second half of the meeting (3-3:30).<br />
<br />
* The first slot will have two 15-minute talks on the topic of Big Data followed by a 30-minute panel discussion. After the break we will have two 15-minute talks about Big Compute, followed by a 30-minute panel discussion.<br />
<br />
== Workshop topics ==<br />
<br />
The workshop will address "The practical experience of big data and big compute". Members of core facilities will share their experience and insights via presentation and panel discussion.<br />
<br />
=== Big data ===<br />
<br />
Speaker: '''Yury Bukhman''', Great Lakes Bioenergy Research Center<br />
Time: 2:00 pm – 2:15 pm<br />
<br />
Presentation Overview:<br />
<br />
The Computational Biology Core of the Great Lakes Bioenergy Research Center supports mostly academic labs at the University of Wisconsin, Michigan State University and other universities. With a variety of experiment types, they are challenged to manage and analyze disparate data and metadata in a diverse academic environment. Details of these data challenges and solutions will be discussed.<br />
<br />
[[File:yury.pdf]]<br />
<br />
Speaker: '''Alberto Riva''', University of Florida <br />
Time: 2:15 pm – 2:30 pm<br />
<br />
Presentation Overview:<br />
<br />
The Bioinformatics Core of the ICBR provides bioinformatics services to the large and diverse scientific community of the University of Florida. Routine handling of projects covering a vast spectrum of biological and biomedical research requires a flexible and powerful data infrastructure. Implementation details of a software development environment (Actor) for reliable, reusable, reproducible analysis pipelines will be discussed, as well as insights on managing big data projects in a core setting.<br />
<br />
[[File:alberto.pdf]]<br />
<br />
<br />
'''Big Data Panel'''<br />
Time: 2:30 pm – 3:00 pm<br />
<br />
Moderator: Madelaine Gogol, Stowers Institute for Medical Research<br />
* Panel Speaker: Yury Bukhman, Great Lakes Bioenergy Research Center<br />
* Panel Speaker: Alberto Riva, University of Florida<br />
* Panel Speaker: Hua Li, Stowers Institute for Medical Research<br />
* Panel Speaker: Jyothi Thimmapuram, Purdue University<br />
<br />
The presenters, panelists, and attendees will explore practical experience with “big data” as well as use of public datasets in a panel discussion. Topics may include accuracy of annotation, trust of data, raw versus processed, data validation, and QC.<br />
<br />
=== Big Compute ===<br />
<br />
Speaker: '''Sergi Sayols Puig''', Institute of Molecular Biology Mainz<br />
Time: 3:30 pm – 3:45 pm<br />
<br />
Presentation Overview:<br />
With a variety of computing infrastructures available, building robust, transferable pipelines can increase utilization of compute resources. NGS analysis pipelines implemented as Docker containers and deployed on a variety of compute platforms (cluster, supercomputer, or workstation) will be discussed.<br />
<br />
<br />
Speaker: '''Jingzhi Zhu''', The Koch Institute at MIT <br />
Time: 3:45 pm – 4:00 pm<br />
<br />
Experiences transitioning a Bioinformatics core from a local to a cloud-based compute solution will be discussed, including the motivation, performance, cost, and issues with deploying bioinformatics pipelines to Amazon EC2 instances.<br />
<br />
[[File:Jingzhi.pdf]]<br />
<br />
<br />
'''Big Compute Panel''' <br />
Time: 4:00 pm – 4:30 pm<br />
<br />
Moderator: Brent Richter, Partners HealthCare <br />
* Panel Speaker: Sergi Sayols Puig, Institute of Molecular Biology Mainz <br />
* Panel Speaker: Jingzhi Zhu, The Koch Institute at MIT <br />
* Panel Speaker: Sara Grimm, NIEHS<br />
<br />
The presenters, panelists, and attendees will discuss how people manage to stay on top of compute requirements for their own sites in a panel discussion. Major hurdles to overcome and the compromises needed for success will be discussed. We may also touch on experiences with containers and portable computing.<br />
<br />
We will have a bioinfo-core dinner the night of the workshop, Monday, at 6:30 PM. The dinner will be at [http://www.swandolphinrestaurants.com/gardengrove/index.html Garden Grove], a restaurant in the Swan Hotel.<br />
<br />
==Discussion with notes==<br />
===Big Data===<br />
Yury Bukhman: The GLBRC consortium consists mainly of people at the University of Wisconsin and a small group at Michigan State University. The consortium is mainly involved in agriculture and sustainability. Very practical: looking to develop biofuels and biochemicals.<br />
All groups in the consortium are mandated to have a data management plan that's reviewed by the bioinformatics core on a yearly basis. This provides a consulting opportunity for bioinformatics planning and research IT: both on-prem resources and cloud services.<br />
The core has developed a metadata database called GLOW.<br />
Alberto Riva: The bioinformatics core facility sits within a closely organized group of core facilities that can be used for Life and Health Science. Additionally, they have access to large, shared IT resources with segments set up for their specific use cases (a private area of the large 10,000-core cluster, for example).<br />
<br />
What fraction of the University of Florida system uses the core and how is work paid for?<br />
The core is currently fee-for-service, but is moving to a model where work is charged/allocated by level of effort for a resource, with longer-term projects handled through full resource allocation to a grant.<br />
Regarding the percentage of the university using the core, there is no good measure. Not everyone knows about the core; they do some outreach, but overall it's hard to quantify.<br />
<br />
Discussion of overall cost that includes data analysis and storage. <br />
One view was that storage is getting cheaper; however, the data itself is still a problem: data is growing faster than storage is getting cheaper. HMS, for example, has hired a data manager who works solely with people to put their data in the appropriate places: cheap archive storage vs. more expensive on-line high-performance storage.<br />
<br />
At Purdue, there is no single large dataset, but thousands of small datasets. The Purdue core works with users who have varying levels of analytic and IT knowledge. They find that they have to spend time working on datasets in order to adapt/format/clean them for analysis, as well as on understanding the experimental parameters. Not everyone knows what goes on behind the scenes of the core in performing this work. Users expect the work to be quick, but without prior involvement in developing the experiment, it takes days to get the dataset to a state where it can be run through their analysis! Educating students and users about the data, the dataset and the analysis is important.<br />
<br />
Collecting metadata for small and large datasets is a big problem, particularly if one wants to combine data across experiments or reuse it in the future. Metadata is required to compare different datasets. Additionally, when submitting new data to public repositories, the repositories require a long list of metadata. GLBRC maintains a spreadsheet, focused specifically on the metadata, that is required to be filled out. This forces investigators to think about the metadata.<br />
<br />
The biggest challenge for Riva is educating users on how to generate the data: you may have all the big data you want, but if the experiment is not designed properly, there's quite a lot of cruft.<br />
<br />
The evolving technology in big data, NGS, life science is really an evolution in what "big" means. We've always dealt with challenging datasets but "big data" involves additional or more challenging work on the actual analysis and management processes--elaborate.<br />
The biggest problem is the complexity of the projects. An even larger problem is working with faculty who don't have a lot of money.<br />
Most cores are willing to devote part of their time, pro bono, to generating results for a grant submission. The investigator will then include the data, and the cost/effort for the analysis services, in the grant.<br />
<br />
How do you deal with privacy and security of the data? When thinking about a pipeline, do you take into account what's public vs. private?<br />
Purdue: they download all data into their local environment.<br />
Florida: they have the largest southern Florida health center, which works with patient data. To comply with regulations, the research computing group has created a secure area of their cluster for working with this data. It's walled off from external and internal access, i.e., controlled access.<br />
<br />
Bottom line, last thought: a core and the personnel within it have to be adaptable in order to understand what is brought to them. No two experiments are alike, and needs continuously change. The trends, technology capabilities and tools change. Cores need to remain flexible and adapt pipelines, processes and people.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2016:_BioinfoCoreWorkshop&diff=10226ISMB 2016: BioinfoCoreWorkshop2016-07-12T19:08:39Z<p>Bgrichter: /* Big Data */</p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismb2016 2016 ISMB meeting] in Orlando, Florida. We have been given a half-day [https://www.iscb.org/ismb2016/2700 workshop track] slot in the program on Monday, July 11th from 2:00-4:30 PM.<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into 4 sessions of ~30 mins each with a required break between the first and second half of the meeting (3-3:30).<br />
<br />
* The first slot will have two 15-minute talks on the topic of Big Data followed by a 30-minute panel discussion. After the break we will have two 15-minute talks about Big Compute, followed by a 30-minute panel discussion.<br />
<br />
== Workshop topics ==<br />
<br />
The workshop will address "The practical experience of big data and big compute". Members of core facilities will share their experience and insights via presentation and panel discussion.<br />
<br />
=== Big data ===<br />
<br />
Speaker: '''Yury Bukhman''', Great Lakes Bioenergy Research Center<br />
Time: 2:00 pm – 2:15 pm<br />
<br />
Presentation Overview:<br />
<br />
The Computational Biology Core of the Great Lakes Bioenergy Research Center supports mostly academic labs at the University of Wisconsin, Michigan State University and other universities. With a variety of experiment types, they are challenged to manage and analyze disparate data and metadata in a diverse academic environment. Details of these data challenges and solutions will be discussed.<br />
<br />
[[File:yury.pdf]]<br />
<br />
Speaker: '''Alberto Riva''', University of Florida <br />
Time: 2:15 pm – 2:30 pm<br />
<br />
Presentation Overview:<br />
<br />
The Bioinformatics Core of the ICBR provides bioinformatics services to the large and diverse scientific community of the University of Florida. Routine handling of projects covering a vast spectrum of biological and biomedical research requires a flexible and powerful data infrastructure. Implementation details of a software development environment (Actor) for reliable, reusable, reproducible analysis pipelines will be discussed, as well as insights on managing big data projects in a core setting.<br />
<br />
[[File:alberto.pdf]]<br />
<br />
<br />
'''Big Data Panel'''<br />
Time: 2:30 pm – 3:00 pm<br />
<br />
Moderator: Madelaine Gogol, Stowers Institute for Medical Research<br />
* Panel Speaker: Yury Bukhman, Great Lakes Bioenergy Research Center<br />
* Panel Speaker: Alberto Riva, University of Florida<br />
* Panel Speaker: Hua Li, Stowers Institute for Medical Research<br />
* Panel Speaker: Jyothi Thimmapuram, Purdue University<br />
<br />
The presenters, panelists, and attendees will explore practical experience with “big data” as well as use of public datasets in a panel discussion. Topics may include accuracy of annotation, trust of data, raw versus processed, data validation, and QC.<br />
<br />
=== Big Compute ===<br />
<br />
Speaker: '''Sergi Sayols Puig''', Institute of Molecular Biology Mainz<br />
Time: 3:30 pm – 3:45 pm<br />
<br />
Presentation Overview:<br />
With a variety of computing infrastructures available, building robust, transferable pipelines can increase utilization of compute resources. NGS analysis pipelines implemented as Docker containers and deployed on a variety of compute platforms (cluster, supercomputer, or workstation) will be discussed.<br />
<br />
<br />
Speaker: '''Jingzhi Zhu''', The Koch Institute at MIT <br />
Time: 3:45 pm – 4:00 pm<br />
<br />
Experiences transitioning a Bioinformatics core from a local to a cloud-based compute solution will be discussed, including the motivation, performance, cost, and issues with deploying bioinformatics pipelines to Amazon EC2 instances.<br />
<br />
[[File:Jingzhi.pdf]]<br />
<br />
<br />
'''Big Compute Panel''' <br />
Time: 4:00 pm – 4:30 pm<br />
<br />
Moderator: Brent Richter, Partners HealthCare <br />
* Panel Speaker: Sergi Sayols Puig, Institute of Molecular Biology Mainz <br />
* Panel Speaker: Jingzhi Zhu, The Koch Institute at MIT <br />
* Panel Speaker: Sara Grimm, NIEHS<br />
<br />
The presenters, panelists, and attendees will discuss how people manage to stay on top of compute requirements for their own sites in a panel discussion. Major hurdles to overcome and the compromises needed for success will be discussed. We may also touch on experiences with containers and portable computing.<br />
<br />
We will have a bioinfo-core dinner the night of the workshop, Monday, at 6:30 PM. The dinner will be at [http://www.swandolphinrestaurants.com/gardengrove/index.html Garden Grove], a restaurant in the Swan Hotel.<br />
<br />
==Discussion with notes==<br />
===Big Data===<br />
Yury Bukhman: The GLBRC consortium consists mainly of people at the University of Wisconsin and a small group at Michigan State University. The consortium is mainly involved in agriculture and sustainability. Very practical: looking to develop biofuels and biochemicals.<br />
All groups in the consortium are mandated to have a data management plan that's reviewed by the bioinformatics core on a yearly basis. This provides a consulting opportunity for bioinformatics planning and research IT: both on-prem resources and cloud services.<br />
The core has developed a metadata database called GLOW.<br />
Alberto Riva: The bioinformatics core facility sits within a closely organized group of core facilities that can be used for Life and Health Science. Additionally, they have access to large, shared IT resources with segments set up for their specific use cases (a private area of the large 10,000-core cluster, for example).<br />
<br />
What fraction of the University of Florida system uses the core and how is work paid for?<br />
The core is currently fee-for-service, but is moving to a model where work is charged/allocated by level of effort for a resource, with longer-term projects handled through full resource allocation to a grant.<br />
Regarding the percentage of the university using the core, there is no good measure. Not everyone knows about the core; they do some outreach, but overall it's hard to quantify.<br />
<br />
Discussion of overall cost that includes data analysis and storage. <br />
One view was that storage is getting cheaper; however, the data itself is still a problem: data is growing faster than storage is getting cheaper. HMS, for example, has hired a data manager who works solely with people to put their data in the appropriate places: cheap archive storage vs. more expensive on-line high-performance storage.<br />
<br />
At Purdue, there is no single large dataset, but thousands of small ones. The Purdue core works with users who have varying levels of analytic and IT knowledge. They find that they have to spend time working on datasets to adapt, format and clean them for analysis, as well as to understand the experimental parameters. Not everyone knows what goes on behind the scenes in the core to perform this work. Users expect the work to be quick, but without prior involvement in developing the experiment, it takes days to get a dataset to a state where it can be run through their analysis. Educating students and users about the data and the analysis is important.<br />
<br />
Collecting metadata for small and large datasets is a big problem, particularly if one wants to combine data across experiments, now or in the future. Metadata is required to compare different datasets. Additionally, when submitting new data to public repositories, the repositories require a long list of metadata. GLBRC maintains a required spreadsheet that focuses specifically on the metadata. This forces investigators to think about the metadata.<br />
<br />
The biggest challenge for Riva is educating users on how to generate the data--you may have all the big data you want, but if the experiment is not designed properly, there's quite a lot of cruft.<br />
<br />
The evolving technology in big data, NGS and life science is really an evolution in what "big" means. We've always dealt with challenging datasets, but "big data" involves additional or more challenging work on the actual analysis and management processes--elaborate.<br />
A big problem is the complexity of the projects, but a larger problem is working with faculty who don't have a lot of money.<br />
Most cores are willing to devote part of their time, pro bono, to generate results for grant submission. The investigator then includes the data, along with the cost and effort for the analysis services, in the grant.<br />
<br />
How do you deal with privacy and security of the data? When thinking about a pipeline, do you take into account what's public vs. private?<br />
Purdue: they download all data into their local environment.<br />
Florida: they have the largest southern Florida health center, which works with patient data. To comply with regulations, the research computing group has created a secure area of their cluster to work with this data. It's walled off from external and internal access, i.e., access is controlled.<br />
<br />
Bottom line, last thought: a core and the personnel within it have to be adaptable in order to understand what is brought to them. No two experiments are alike, and needs change continuously. Trends, technology capabilities and tools change. Cores need to remain flexible and adapt pipelines, processes and people.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=ISMB_2016:_BioinfoCoreWorkshop&diff=10225ISMB 2016: BioinfoCoreWorkshop2016-07-12T19:06:50Z<p>Bgrichter: </p>
<hr />
<div>We are holding a Bioinfo-core workshop at the [https://www.iscb.org/ismb2016 2016 ISMB meeting] in Orlando, Florida. We have been given a half-day [https://www.iscb.org/ismb2016/2700 workshop track] slot in the program on Monday, July 11th from 2:00-4:30 PM.<br />
<br />
== Workshop Structure ==<br />
<br />
The workshop is split into 4 sessions of ~30 mins each with a required break between the first and second half of the meeting (3-3:30).<br />
<br />
* The first slot will have two 15-minute talks on the topic of Big Data followed by a 30-minute panel discussion. After the break we will have two 15-minute talks about Big Compute, followed by a 30-minute panel discussion.<br />
<br />
== Workshop topics ==<br />
<br />
The workshop will address "The practical experience of big data and big compute". Members of core facilities will share their experience and insights via presentation and panel discussion.<br />
<br />
=== Big data ===<br />
<br />
Speaker: '''Yury Bukhman''', Great Lakes Bioenergy Research Center<br />
Time: 2:00 pm – 2:15 pm<br />
<br />
Presentation Overview:<br />
<br />
The Computational Biology Core of the Great Lakes Bioenergy Research Center supports mostly academic labs at the University of Wisconsin, Michigan State University and other universities. With a variety of experiment types, they are challenged to manage and analyze disparate data and metadata in a diverse academic environment. Details of these data challenges and solutions will be discussed.<br />
<br />
[[File:yury.pdf]]<br />
<br />
Speaker: '''Alberto Riva''', University of Florida <br />
Time: 2:15 pm – 2:30 pm<br />
<br />
Presentation Overview:<br />
<br />
The Bioinformatics Core of the ICBR provides bioinformatics services to the large and diverse scientific community of the University of Florida. Routine handling of projects covering a vast spectrum of biological and biomedical research requires a flexible and powerful data infrastructure. Implementation details of a software development environment (Actor) for reliable, reusable, reproducible analysis pipelines will be discussed, as well as insights on managing big data projects in a core setting.<br />
<br />
[[File:alberto.pdf]]<br />
<br />
<br />
'''Big Data Panel'''<br />
Time: 2:30 pm – 3:00 pm<br />
<br />
Moderator: Madelaine Gogol, Stowers Institute for Medical Research<br />
* Panel Speaker: Yury Bukhman, Great Lakes Bioenergy Research Center<br />
* Panel Speaker: Alberto Riva, University of Florida<br />
* Panel Speaker: Hua Li, Stowers Institute for Medical Research<br />
* Panel Speaker: Jyothi Thimmapuram, Purdue University<br />
<br />
The presenters, panelists, and attendees will explore practical experience with “big data” as well as use of public datasets in a panel discussion. Topics may include accuracy of annotation, trust of data, raw versus processed, data validation, and QC.<br />
<br />
=== Big Compute ===<br />
<br />
Speaker: '''Sergi Sayols Puig''', Institute of Molecular Biology Mainz<br />
Time: 3:30 pm – 3:45 pm<br />
<br />
Presentation Overview:<br />
With a variety of computing infrastructures available, building robust, transferable pipelines can increase utilization of compute resources. NGS analysis pipelines implemented as Docker containers and deployed on a variety of compute platforms (cluster, supercomputer, or workstation) will be discussed.<br />
<br />
<br />
Speaker: '''Jingzhi Zhu''', The Koch Institute at MIT <br />
Time: 3:45 pm – 4:00 pm<br />
<br />
Experiences transitioning a Bioinformatics core from a local to a cloud-based compute solution will be discussed, including the motivation, performance, cost, and issues with deploying bioinformatics pipelines to Amazon EC2 instances.<br />
<br />
[[File:Jingzhi.pdf]]<br />
<br />
<br />
'''Big Compute Panel''' <br />
Time: 4:00 pm – 4:30 pm<br />
<br />
Moderator: Brent Richter, Partners HealthCare <br />
* Panel Speaker: Sergi Sayols Puig, Institute of Molecular Biology Mainz <br />
* Panel Speaker: Jingzhi Zhu, The Koch Institute at MIT <br />
* Panel Speaker: Sara Grimm, NIEHS<br />
<br />
The presenters, panelists, and attendees will discuss how people manage to stay on top of compute requirements for their own sites in a panel discussion. Major hurdles to overcome and the compromises needed for success will be discussed. We may also touch on experiences with containers and portable computing.<br />
<br />
We will have a bioinfo-core dinner the night of the workshop, Monday, at 6:30 PM. The dinner will be at [http://www.swandolphinrestaurants.com/gardengrove/index.html Garden Grove], a restaurant in the Swan Hotel.<br />
<br />
==Discussion with notes==<br />
===Big Data===<br />
Yury Bukhman: The GLBRC consortium consists mainly of people at the University of Wisconsin and a small group at Michigan State University. The consortium is mainly involved in agriculture and sustainability. Very practical--looking to develop biofuels and biochemicals.<br />
All groups in the consortium are required to have a data management plan that is reviewed by the bioinformatics core yearly. This provides a consulting opportunity for bioinformatics planning and research IT, covering both on-premises resources and cloud services.<br />
The core has developed a metadata database called GLOW.<br />
Alberto Riva: The bioinformatics core facility sits within a closely organized group of core facilities that can be used for Life and Health Science. Additionally, they have access to large, shared IT resources with segments set up for their specific use cases (a private area of the large 10,000-core cluster, for example).<br />
<br />
What fraction of the University of Florida system uses the core and how is work paid for?<br />
The core is fee-for-service, but it is moving to a model where work is charged by level of effort for a resource, with longer-term projects covered through full allocation of a resource to a grant.<br />
As for the percentage of the university using the core, there is no good measure. Not everyone knows about the core; they do some outreach, but overall usage is hard to quantify.<br />
<br />
Discussion of overall cost that includes data analysis and storage. One view was that storage is getting cheaper; however, the data itself is still a problem: data is growing faster than storage is getting cheaper. HMS, for example, has hired a data manager who works solely with people to put their data in the appropriate places--cheap archive storage vs. more expensive online high-performance storage.<br />
<br />
At Purdue, there is no single large dataset, but thousands of small ones. The Purdue core works with users who have varying levels of analytic and IT knowledge. They find that they have to spend time working on datasets to adapt, format and clean them for analysis, as well as to understand the experimental parameters. Not everyone knows what goes on behind the scenes in the core to perform this work. Users expect the work to be quick, but without prior involvement in developing the experiment, it takes days to get a dataset to a state where it can be run through their analysis. Educating students and users about the data and the analysis is important.<br />
<br />
Collecting metadata for small and large datasets is a big problem, particularly if one wants to combine data across experiments, now or in the future. Metadata is required to compare different datasets. Additionally, when submitting new data to public repositories, the repositories require a long list of metadata. GLBRC maintains a required spreadsheet that focuses specifically on the metadata. This forces investigators to think about the metadata.<br />
<br />
The biggest challenge for Riva is educating users on how to generate the data--you may have all the big data you want, but if the experiment is not designed properly, there's quite a lot of cruft.<br />
<br />
The evolving technology in big data, NGS and life science is really an evolution in what "big" means. We've always dealt with challenging datasets, but "big data" involves additional or more challenging work on the actual analysis and management processes--elaborate.<br />
A big problem is the complexity of the projects, but a larger problem is working with faculty who don't have a lot of money.<br />
Most cores are willing to devote part of their time, pro bono, to generate results for grant submission. The investigator then includes the data, along with the cost and effort for the analysis services, in the grant.<br />
<br />
How do you deal with privacy and security of the data? When thinking about a pipeline, do you take into account what's public vs. private?<br />
Purdue: they download all data into their local environment.<br />
Florida: they have the largest southern Florida health center, which works with patient data. To comply with regulations, the research computing group has created a secure area of their cluster to work with this data. It's walled off from external and internal access, i.e., access is controlled.<br />
<br />
Bottom line, last thought: a core and the personnel within it have to be adaptable in order to understand what is brought to them. No two experiments are alike, and needs change continuously. Trends, technology capabilities and tools change. Cores need to remain flexible and adapt pipelines, processes and people.</div>Bgrichterhttp://bioinfo-core.org/index.php?title=Main_Page&diff=10217Main Page2016-07-11T11:21:25Z<p>Bgrichter: </p>
<hr />
<div>'''Welcome to the bioinfo-core's wiki!''' <br />
<br />
*[http://lists.open-bio.org/mailman/listinfo/bioinfo-core Participate in the discussion]<br />
*[[BioWiki:Community_portal | Add your core to the wiki]]<br />
<br><br />
We thank [http://www.iscb.org ISCB] for hosting and maintaining this wiki.<br />
<br><br />
<br><br />
===Newest Content===<br />
<br />
* [[ISMB_2016:_BioinfoCoreWorkshop | ISMB 2016 Workshop Proposal]]<br />
* [[18th_Discussion-16_Oct_2015 | 18th Discussion - ISMB2015 follow up]]<br />
* [[Interesting NGS failures]]<br />
* [[ISMB_2015:_BioinfoCoreWorkshop|ISMB 2015 Workshop - The evolving relationship between core facilities and researchers]]<br />
* [[17th_Discussion-27_Feb_2015 | 17th Discussion - Best practices for bioinformatics training]]<br />
* [[ISMB_2014:_InfrastructureForNewCores|16th Discussion - ISMB2014 follow up: Infrastructure for new Cores]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshopWriteUp|ISMB 2014 Workshop Write Up]]<br />
* [[15th_Discussion-24_Feb_2014 | 15th Discussion - The biologist is the analyst]]<br />
* [[ISCB_COSI_Proposal | Proposal to make bioinfo-core an ISCB community of special interest]]<br />
* [[ISMB_2014:_BioinfoCoreWorkshop|ISMB 2014 Workshop Proposal]]<br />
* [[14th_Discussion-7_November_2013| 14th Discussion - Evaluating software]]<br />
* [[ISMB_2013:_BioinfoCoreWorkshop|ISMB 2013 Workshop]]<br />
* [[13th_Discussion-5_November_2012| 13th Discussion - Embedded bioinformaticians and Integrative analysis]]<br />
* [[ISMB_2012:_Workshop_Proposal|ISMB 2012 Workshop]]<br />
* [[12th_Discussion-21_May_2012|12th Discussion - Managing Storage in a Core Facility]]<br />
* [[11th_Discussion-7_November_2011|11th Discussion - Measuring the output of a Core and Tracking Software Versions]]<br />
* ISMB 2012: Bioinfo Core Workshop - Long Beach CA - July 16, 2012 [http://www.iscb.org/ismb2012-program/ismb2012-workshops#w3 ISMB Workshop]<br />
* [[ISMB 2011: Workshop on Analysis Pipelines for High Throughput Sequencing]]<br />
* [[ISMB 2011: Workshop on Practical Aspects of Running a Core Facility]]<br />
* [[ISMB 2011 Workshop Call]]<br />
* ISMB 2010 Workshop [[Call Minutes]] page<br />
* Include [http://twitter.com/#search?q=%23BICORE #BICORE] in your [http://twitter.com/ tweets] for the Core community. <br />
* Numerous new additions to the community portal<br />
<br />
*[[BioWiki:Community_portal | Community Portal]]<br />
<br />
= Introduction =<br />
Bioinfo-core is a worldwide body of people who manage or staff bioinformatics facilities within organizations of all types, including academia, academic medical centers, medical schools, biotechs and pharmas. Through this wiki and our online [http://lists.open-bio.org/mailman/listinfo/bioinfo-core discussion lists] we discuss many topics that challenge bioinformatics cores worldwide: from IT, new instrumentation, staffing and training bioinformaticians, tools and software, to services for biologists and MDs.<br />
<BR><BR><br />
We hold several events throughout the year, including quarterly conference calls (with published [[Call Minutes]]) and a yearly set of informal presentations and dinners at the annual Intelligent Systems for Molecular Biology meeting ([http://www.iscb.org/iscb-conferences ISMB]), the official conference of [http://www.iscb.org/ ISCB].<br />
<br><br><br />
Please browse, add and participate in the wiki and the discussion lists. To edit the wiki, create a New Account and then edit the [[BioWiki:Community_portal | Community Portal]] to add a link for your core facility and its description.<br />
<br />
= Wiki page links =<br />
*[[Call Minutes]]: Annual meetings at ISMB with presentations; detailed minutes from quarterly conference calls on selected and pertinent topics.<br />
*[[BioWiki:Community_portal | Community Portal]]: list your organization!<br />
*[[Ongoing Discussions]]: discussion forums including lists of software, tools, etc.<br />
*[[Special:Categories]]: find pages using categories such as Tools, Presentations, NextGenSequencing, Meetings etc.<br />
<br />
=Bioinfo-core Member Publications relevant to core facilities=<br />
*[http://collections.plos.org/ploscompbiol/corefacilities.php PLoS Computational Biology Journal--CORE facilities: editorial and perspectives]<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000372 The Need for Centralization of Computational Biology Resources] Lewitter F, Rebhan M, Richter B, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000369 Managing and Analyzing Next-Generation Sequence Data] Richter BG, Sexton DP<br />
*[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000368 Establishing a Successful Bioinformatics Core Facility Team] Lewitter F, Rebhan M</div>Bgrichter