The collaboration between the MRC and AstraZeneca to give UK academic researchers access to development compounds has been acknowledged as a precedent for the compilation of the analogous NCATS 58-set from multiple companies. A snapshot of the MRC list of 22 is shown below.
There is plenty of background information about both sets, and about repurposing exercises in general, available on the web. The associated issue of public code names with very-difficult-to-dig-out (VDTODO) or completely blinded structures has also generated blog posts (e.g. at CollabChem and Chembl-og).
I have performed the same exercise for these as for the NCATS 56 small molecules: a) map the code names to a structure, b) assign a PubChem CID and c) search SureChemOpen for matches to early patent filings. The summary list is pasted below, a more extended table has been deposited at Figshare, and a set of links for the 12 CIDs is available as a public MyNCBI collection (http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/1HKfLlxQ0OICuWKEPOFU48tky/).
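For anyone wanting to script step b), here is a minimal sketch of a code-name lookup against PubChem's PUG REST service. It was not the route used for this post (the mapping here was done by hand), the helper names are made up for illustration, and it only finds a CID when the code name is already recorded as a PubChem synonym, which is exactly what fails for the hard-to-dig-out cases.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import List

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name: str) -> str:
    """Build the PUG REST URL that resolves a name or synonym to CIDs."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_BASE}/compound/name/{quoted}/cids/JSON"


def parse_cids(payload: str) -> List[int]:
    """Extract the CID list from a PUG REST JSON response body."""
    data = json.loads(payload)
    return data.get("IdentifierList", {}).get("CID", [])


def lookup_cids(name: str, timeout: float = 30.0) -> List[int]:
    """Query PubChem for a name; a name with no match yields an empty list."""
    try:
        with urllib.request.urlopen(name_to_cids_url(name), timeout=timeout) as resp:
            return parse_cids(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # PubChem answers 404 when the name is unknown
            return []
        raise
```

Calling `lookup_cids("AZD6703")` needs network access, and an empty list means only that the synonym is not in PubChem, not that the structure is absent.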
I managed to dig out CIDs for 12 of the 21 codes, although 5 compounds are common to the MRC and NCATS sets. Note also that we have a patent-mapping full house for the 12. These posts are about picking out the quirky details, so let's see what we have...
AZD6703 will be a bit of a system test because the publication is recent and has been neither MeSH processed nor picked up by ChEMBL yet. It was also a doozy in the Goldilocks school of abstract drafting for lead compounds, as you can see below, where we see no fewer than the IUPAC, code name, target and indication all in the title (if the abstract had included some inhibition data and the term "arthritis" this would have been an almost perfect pay-wall bypass!).
IUPACs in titles (where there are no direct MeSH > PubChem links) can be easily processed by chemicalize.org (the result is in the picture insert). This gives an exact match to CID 11373432 and a patent whack back to WO-2005042502-A1. However, what is odd is that opening out the MMDB > CID links for the PDB structures gives the set of four below, which does not include 11373432.
What I think has happened is the not uncommon story of the ligand that went into the crystal structure (i.e. was dropped into the tube) not being exactly the same as the structure the software has pulled back out of the electron density data (see below).
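One way to probe this kind of near-miss, offered as a sketch rather than something done for this post, is to compare standard InChIKeys. The first 14-character block hashes the connectivity, so two records that share it but differ later in the key have the same skeleton and differ in stereochemistry or a similar layer. The function name and the category labels are hypothetical, and the post does not say how the two structures actually differ.

```python
def compare_inchikeys(key_a: str, key_b: str) -> str:
    """Classify two InChIKeys as 'exact', 'same-skeleton' or 'different'."""
    a = key_a.strip().upper()
    b = key_b.strip().upper()
    if a == b:
        return "exact"
    # The block before the first hyphen encodes the connectivity skeleton.
    if a.split("-")[0] == b.split("-")[0]:
        return "same-skeleton"
    return "different"
```

Fed the InChIKey of CID 11373432 and those of the four PDB-linked CIDs, a "same-skeleton" result would support the ligand-versus-density explanation, while "different" would point to a genuinely distinct compound.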
The text is slightly cryptic in that the candidate IUPAC (3-[(2R)-tetrahydrofuran-2-ylmethyl]-2-thioxo-1,2,3,7-tetrahydro-6H-purin-6-one), converted by chemicalize.org, is not explicitly juxtaposed to AZD5904. However, it is the only one in the paper and it whacks WO-2009025618-A1 "MIPO inhibitors for the treatment of huntington's disease and multiple system atrophy", which makes it a good bet.