SteinBlog

Opening for a postdoc in my group

We now have an opening for a postdoc position in my group in collaboration with the group of Dietrich Rebholz-Schuhmann at the EBI. The position is funded for three years by the EIPOD scheme at EMBL.

The proposed work combines methods from image recognition (OSRA, Filippov2009), cheminformatics (CDK, Steinbeck2003), chemometrics (Gkoutos2003) and text mining (OSCAR3, Corbett2006, Grego2009) to extract information relevant to small molecules from the primary literature. The project will deliver methods to discover information about chemical entities linked to their chemical structures and their assigned spectra. The research focuses on cross-validating the extracted information against cheminformatics prediction methods, both to compensate for error propagation and to benchmark prediction methods on published data.

You’ll find a one-page project description on the EIPOD page, together with information on how to apply. If you are interested and have questions, feel free to contact me at steinbeck [at] ebi.ac.uk.

References
Corbett and Murray-Rust. High-throughput identification of chemistry in life science texts. LNCS (2006), 1611-3349
Filippov and Nicklaus. Optical Structure Recognition Software (OSRA). J. Chem. Inf. Model (2009), 49(3), 740–743
Gkoutos GV et al. Chemical Machine Vision. (2003) 43:1342–1355.
Grego T et al. Identification of Chemical Entities in Patent Documents. In: LNCS (2009) 5518:942-949
Guha et al. The Blue Obelisk – Interoperability in Chemical Informatics. J Chem Inform Model (2005) 46(3):991-998
Steinbeck et al. The Chemistry Development Kit (CDK). J chem inform comp sci (2003) 43(2): 493-500


Therapeutic Applications of Computational Biology and Chemistry 2012 (TACBAC)

Courtesy of Andres Rueda, Flickr

I’m co-organizing the 2012 conference on Therapeutic Applications of Computational Biology and Chemistry (TACBAC), 12-14 March 2012, at the Wellcome Trust Conference Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

This conference will bring together leading researchers investigating computational chemistry and biology techniques as applied to advancing our ability to predict, diagnose and modulate human disease. This broad and multidisciplinary meeting will explore the major challenges in drug discovery and development where innovation in computational approaches and tools can make a significant and tangible contribution towards novel treatments. You should attend this conference if you are a researcher interested in drug discovery, if you develop or use computational approaches to the development of therapeutics, or if you are a key decision maker in a pharmaceutical or biotechnology company.

Each of the sessions, which progress from identifying disease mechanisms to implementing new therapeutic and diagnostic approaches in the clinic, will bring together experts in both the biomedical and the computational aspects of the topic under discussion. Session chairs will encourage discussion and contribution from the attendees.

Session topics
• Clinical implications of individual genomes
• Metabolism and biomarkers
• Computational systems biology
• Discovery of chemical probes
• Modelling xenobiotic metabolism

Registration and submission of abstracts are open now.

More here.


Microsoft Windows is pathetic

Courtesy of peru, lili eta marije

I know this title is going to bore the hell out of most of you. We have had this religious fight about Linux, Windows, Mac OS and other OS’s for the last 20 years and longer 🙂

I’m sitting in a conference right now, like almost every week, and, as so often in the past months, the presenter is busy clicking away pop-up messages on the presentation laptop running Windows. “Process X has experienced problems and needs to close. Do you want to send a report?”. “Process Y has experienced problems and needs to close. Do you want to send a report?”. “Adobe Reader is outdated but there is an update available. Do you want to update now?”. No, I don’t. OK, the latter is not a Windows problem but a problem of the spirit that Windows exudes.

Dear Microsoft: Computers are for doing work or having fun (playing games, watching videos, listening to music). In all of these cases, the user has made a decision to perform a particular activity, such as presenting a talk. It is not up to you to point out that he or she should now be doing something else, such as updating your outdated operating system (or should I say: User Interface?). What is the consequence? Use MacOS or Linux for presentations (the latter only if you know how to use xrandr :-)).


IntEnz Release 69 and Rhea Release 22 are out

Rafael Alcantara in my group, together with our collaborators at the Swiss Institute of Bioinformatics, has released version 69 of our enzyme nomenclature database IntEnz and version 22 of our biochemical reaction database Rhea.

News:

  • 11 new sub-subclasses and 49 EC numbers have been added to the enzyme classification.
  • Over sixteen thousand unique reaction identifiers.
  • Rhea now includes cross-references to EcoCyc.
  • UniProt 2011_05 and Reactome 36 have been used for cross-references.

Please consider subscribing to the RSS news feed for IntEnz <https://sourceforge.net/export/rss2_projnews.php?group_id=94642> and/or Rhea <https://sourceforge.net/export/rss2_projnews.php?group_id=255417>.

Next release (70/23) is flexibly scheduled for June 13th.

 


ChEBI release 79

Choline, ChEBI's entity of the month in May 2011

Chemical Entities of Biological Interest (ChEBI) is one of the free, yet fully curated databases in chemistry. It is developed by my team at the European Bioinformatics Institute in Hinxton, Cambridge, UK. As of 09 May 2011, ChEBI release 79 is online, with 25,238 fully annotated three-star molecular entities.

The term ‘molecular entity’ refers to any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity. The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms.

ChEBI incorporates an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified.

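ChEBI uses nomenclature, symbolism and terminology endorsed by the following international scientific bodies:

  • International Union of Pure and Applied Chemistry (IUPAC)
  • Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB)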

Molecules directly encoded by the genome (e.g. nucleic acids, proteins and peptides derived from proteins by cleavage) are not as a rule included in ChEBI.

All data in the database is non-proprietary or is derived from a non-proprietary source. It is thus freely accessible and available to anyone. In addition, each data item is fully traceable and explicitly referenced to the original source.

The text on the ChEBI website is available under the Creative Commons License.

 


Un-creative re-use of J. Chem. Inf. editorial

Courtesy of Breakfast for Dinner

I was recently alerted to chunks of text copied from an editorial by my colleague David Wild in the Journal of Cheminformatics [1] appearing in another article by Nutan Prakash and Dinta A. Gareja in the Journal of Proteomics and Bioinformatics [2]. While David’s article is cited as a reference, those larger chunks of text are not identified as being his words.

A closer investigation reveals that further large chunks of the text in [2] are copied from other sources. The introduction is copied from the Wikipedia article on Cheminformatics as well as from an article by Aktar and Murmu and one by Karthikeyan and Krishnan. The latter is cited but the first two sources are not. Generally, citations for these questionable cases appear only as references at the end of the text and are not marked in the text.

It is a common misconception that text from Wikipedia can be reused without any restrictions. The license says:

Re-use of text:

  • Attribution: To re-distribute a text page in any form, provide credit to the authors either by including a) a hyperlink (where possible) or URL to the page or pages you are re-using, b) a hyperlink (where possible) or URL to an alternative, stable online copy which is freely accessible, which conforms with the license, and which provides credit to the authors in a manner equivalent to the credit given on this website, or c) a list of all authors.

David’s original article is cited, so I think the original license was formally not violated. Still, this feels a little disturbing, and I would appreciate your comments on this.

In my opinion, this text [2] is clearly a patchwork of unaltered text from other sources without original contributions from the authors and should not have been published.

References:

[1] David J Wild, Grand challenges for cheminformatics, Journal of Cheminformatics 2009, 1:1

[2] Prakash N, Gareja DA (2010) Cheminformatics. J Proteomics Bioinform 3: 249-252.


Open Position: Small Molecule Information Mining

Courtesy of archangeldeb

We have an open position for an interdisciplinary postdoctoral fellow as part of the EMBL Interdisciplinary Postdoc (EIPOD) scheme. The successful candidate will work in my group (cheminformatics and metabolism) in collaboration with the text mining group headed by Dietrich Rebholz-Schuhmann. We are going to exploit text mining, image recognition and cheminformatics to extract chemical knowledge and data from the chemical and biological literature. If you are interested, please apply via the online application form. The application deadline is March 20th, 2011. Feel free to email me at steinbeck[at]ebi.ac.uk in case of questions.


Job ad: Bioinformatician/Cheminformatician

I’m currently seeking a scientist (bioinformatics, cheminformatics) for my group. We focus on cheminformatics and metabolism research and services with a particular emphasis on the analysis and understanding of metabolism. The bioinformatician/cheminformatician will pursue his or her own research project, assist me in managing and supervising our PhD research projects, specifically in the areas of structure elucidation of biological metabolites and metabolic reconstruction, and help write grant applications to further develop the group. The post-holder will present group activities at conferences, including of course his or her own research, and cooperate closely with other members of the institute.

The European Bioinformatics Institute (EBI) is a world-leading bioinformatics centre providing biological data to the scientific community, with expertise in data storage, analysis and representation.

Applicants should have a PhD in a relevant field and at least one completed year of postdoctoral work that has resulted in significant peer-reviewed publications. Work experience and theoretical knowledge in bio- or cheminformatics as well as molecular biology are required.
Project management skills and the ability to supervise and collaborate with other research teams are essential. Fluency in English is mandatory.

An initial contract of 3 years will be offered to the successful candidate. This can be renewed, depending on circumstances at the time of the review.

The closing date for this application is July 25, 2010.

Please consult http://www.embl.de/aboutus/jobs/jobs_embl_ebi_hinxton/2010/w_10_055_ebi/ for the official ad.

To apply, please send a CV (including names and addresses of referees) and covering letter, by email, quoting ref. no. W/10/055/EBI in the subject line, to: applications@ebi.ac.uk.