References

Revised Cambridge NCI database

Published
Authors: Yong Zhang, Henry S. Rzepa, James J. P. Stewart, Peter Murray-Rust, Matthew J. Harvey, Nicholas Mason, Andrew McLean, Imperial College High Performance Computing Service, Imperial College London

This is a reprocessed version of the library described in 'A global resource for computational chemistry', P. Murray-Rust et al. (2004), held in the Cambridge Repository (www.repository.cam.ac.uk/handle/1810/724). Each record contains a copy of the original Cambridge entry's CML files (NCI.xml and PM5.xml) and a new PM7 structure (PM7.xml), along with the input and output files of the MOPAC run used to compute it. Entries for which the InChIs of the original NCI structure and of the PM7-optimised structure did not match have been dropped from the collection.
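The InChI-based filtering described above can be illustrated with a minimal sketch. It assumes the two InChI strings for each entry have already been extracted; the `Record` class, its field names, and the example entries are hypothetical and are not taken from the dataset or its tooling.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical container: the field names are illustrative only.
    entry_id: str
    nci_inchi: str  # InChI of the original NCI structure
    pm7_inchi: str  # InChI of the PM7-optimised structure


def keep_matching(records):
    """Return only the records whose two InChI strings are identical."""
    return [r for r in records if r.nci_inchi.strip() == r.pm7_inchi.strip()]


# Made-up example entries: the first keeps its connectivity (methane),
# the second changes from ethanol to dimethyl ether and would be dropped.
records = [
    Record("entry-1", "InChI=1S/CH4/h1H4", "InChI=1S/CH4/h1H4"),
    Record(
        "entry-2",
        "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
        "InChI=1S/C2H6O/c1-3-2/h1-2H3",
    ),
]

kept = keep_matching(records)
print([r.entry_id for r in kept])
```

A plain string comparison is used here because a standard InChI is a canonical identifier; how the collection's authors actually performed the comparison is not stated in the record.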

Library and Information Studies; FOS: Media and communications; Information Systems; FOS: Computer and information sciences

RDM workflows and integrations for HEIs using hosted services

Published

This report investigates some of the workflows and processes involved in Research Data Management (RDM). It was created as part of the JISC Research Data Spring supported project “Small and Specialist: A consortial approach to building an integrated RDM system” (CREST RDMS).

Computational Chemistry

Quantum chemistry structures and properties of 134 kilo molecules

Published
Authors: Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, O. Anatole von Lilienfeld

Computational de novo design of new drugs and materials requires rigorous and unbiased exploration of chemical compound space. However, large uncharted territories persist due to its size scaling combinatorially with molecular size. We report computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of CHONF. These molecules correspond to the subset of all 133,885 species with up to nine heavy atoms (CONF) out of the GDB-17 chemical universe of 166 billion organic molecules. We report geometries minimal in energy, corresponding harmonic frequencies, dipole moments, polarizabilities, along with energies, enthalpies, and free energies of atomization. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. Furthermore, for the predominant stoichiometry, C7H10O2, there are 6,095 constitutional isomers among the 134k molecules. We report energies, enthalpies, and free energies of atomization at the more accurate G4MP2 level of theory for all of them. As such, this data set provides quantum chemical properties for a relevant, consistent, and comprehensive chemical space of small organic molecules. This database may serve the benchmarking of existing methods, development of new methods, such as hybrid quantum mechanics/machine learning, and systematic identification of structure-property relationships.

Version history: In an earlier version of this dataset the titles of the files "Data for 133885 GDB-9 molecules" and "Data for 6095 constitutional isomers of C7H10O2" were swapped. This has now been corrected.