In life sciences papers, citations are widely used and typically consist of two parts: a) a list of references at the end of the citing paper that provides full bibliographic information for each source; and b) reference markers in the text that are linked to the references. The text surrounding a reference marker is defined as the citation context. In citation contexts, researchers must clarify the relationship between their paper and the papers it cites. However, this can be difficult for some researchers, especially non-native English speakers. Such researchers can learn to write citation contexts more efficiently by observing the citation contexts that other researchers have written about a paper. For this purpose, we developed the Colil (Comments on Literature in Literature) database and a web-based search service, also called Colil, for citation contexts in the life sciences domain. The data for the Colil database have been extracted from open access papers deposited in the PMC Open Access Subset (PMC-OAS). Colil searches for a cited paper in the Colil database and returns a list of the citation contexts for it, together with relevant papers identified through co-citations. We also expect that the set of citation contexts retrieved from Colil for a given paper will help researchers to comprehend that paper more efficiently. Users can also query and browse the Colil database using the common query language (SPARQL) through our SPARQL endpoint. Furthermore, complete dumps of the Colil database are downloadable from the FTP site.
If a pair of papers was cited together by at least two papers, we counted it as a co-citation. The relevance score of the pair equals the number of papers that cite both. For example, when the papers (A, B) are co-cited by three different papers (X, Y, Z), the relevance score is 3.
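The counting rule above can be illustrated with a minimal sketch. This is not the Colil implementation; the function name `cocitation_scores`, the `min_citing` parameter, and the toy input are hypothetical, and the threshold of two citing papers is taken from the description above.

```python
from collections import Counter
from itertools import combinations


def cocitation_scores(citations, min_citing=2):
    """Return {(paper_a, paper_b): relevance_score} for co-cited pairs.

    `citations` maps each citing paper ID to the IDs of the papers it cites.
    A pair counts as a co-citation only if at least `min_citing` papers
    cite both members (assumed threshold, per the text above).
    """
    counts = Counter()
    for cited in citations.values():
        # De-duplicate and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_citing}


# Toy data (hypothetical): X, Y and Z all cite both A and B.
citations = {
    "X": ["A", "B"],
    "Y": ["A", "B", "C"],
    "Z": ["B", "A"],
}
scores = cocitation_scores(citations)
print(scores[("A", "B")])  # 3
```

The pairs (A, C) and (B, C) are each cited together only by Y, so they fall below the threshold and are not reported.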
This figure shows the number of PubMed-indexed papers published each year in PMC-OAS (as of March 2014). The number has grown over the past 10 years, with the most recent years showing exponential growth.
As of March 2014, there were 545,147 PubMed-indexed PMC-OAS papers that cited at least one PubMed-indexed paper; these papers were distributed across 3,171 journals. Together they contained 24,684,765 citation contexts, and each paper cited an average of 41.5 PubMed-indexed papers. Conversely, 5,136,741 PubMed-indexed papers had been cited by at least one PMC-OAS paper; the cited papers correspond to approximately one-quarter of all PubMed entries and are distributed across 11,588 journals.
The URL of the Colil SPARQL endpoint is "http://colil.dbcls.jp/sparql".
Please refer to this page for sample queries.
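As a rough illustration of programmatic access, the sketch below builds and sends a query to the endpoint using only the Python standard library. It assumes the endpoint follows the standard SPARQL protocol (a `query` parameter on a GET request) and can return JSON results; neither is confirmed here. The query is a generic probe that makes no assumption about the Colil schema, and the function names `build_request` and `run_query` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://colil.dbcls.jp/sparql"

# Generic probe: fetch any five triples, without assuming Colil's vocabulary.
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint, query):
    """Build a GET request carrying the query (standard SPARQL protocol)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the endpoint honours the SPARQL JSON results media type.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(endpoint, query, timeout=30):
    """Send the query and return the result rows as a list of dicts."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(ENDPOINT, QUERY)` requires network access; if the assumptions hold, each returned row maps a variable name (`s`, `p`, `o`) to a dictionary whose `"value"` key holds the bound term.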
Here is the contact point: support AT dbcls DOT rois DOT ac DOT jp.