This HTML5 document contains 888 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix  IRI
rdfs    http://www.w3.org/2000/01/rdf-schema#
bibo    http://purl.org/ontology/bibo/
n7      http://pubannotation.org/docs/sourcedb/PMC/sourceid/
n9      http://togows.dbcls.jp/entry/pubmed/
xsdh    http://www.w3.org/2001/XMLSchema#
n3      http://purl.jp/bio/10/colil/id/
n11     http://dx.doi.org/
n2      http://purl.jp/bio/10/colil/ontology/201303#
n10     http://www.ncbi.nlm.nih.gov/pmc/articles/
n5      http://purl.org/spar/doco/
dc      http://purl.org/dc/elements/1.1/
rdf     http://www.w3.org/1999/02/22-rdf-syntax-ns#
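
The statements below use prefixed names such as n3:19483094 and rdf:type. As a minimal sketch of how these might be expanded to full IRIs, the following Python maps each prefix from the table above to its IRI. The function name expand and the pass-through behaviour for blank nodes and plain literals are assumptions made for illustration; they are not part of the published data.

```python
# Prefix-to-IRI map copied from the namespace table in this document.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "bibo": "http://purl.org/ontology/bibo/",
    "n7": "http://pubannotation.org/docs/sourcedb/PMC/sourceid/",
    "n9": "http://togows.dbcls.jp/entry/pubmed/",
    "xsdh": "http://www.w3.org/2001/XMLSchema#",
    "n3": "http://purl.jp/bio/10/colil/id/",
    "n11": "http://dx.doi.org/",
    "n2": "http://purl.jp/bio/10/colil/ontology/201303#",
    "n10": "http://www.ncbi.nlm.nih.gov/pmc/articles/",
    "n5": "http://purl.org/spar/doco/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(term: str) -> str:
    """Expand a prefixed name to a full IRI (hypothetical helper).

    Blank nodes (_:...) and terms without a known prefix are returned as-is.
    """
    if term.startswith("_:"):
        return term
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term
```

For example, expand("n3:19483094") yields http://purl.jp/bio/10/colil/id/19483094, while a blank node such as _:vb5356972 is left unchanged.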

Statements

Subject Item
n3:19483094
rdf:type
n2:CitationPaper n2:RelevantPaper n2:ReferencePaper
rdfs:seeAlso
n7:0 n9:19483094 n10:0 n11:10.1093%2Fnar%2Fgkp423
bibo:cites
n3:11447254 n3:16645617 n3:2158081 n3:1968407 n3:9917388 n3:12466850 n3:6383204 n3:7518774 n3:2576026 n3:3822832 n3:16928770 n3:15084257 n3:17200232 n3:10222198 n3:10404615 n3:10571058 n3:14731391 n3:8617811 n3:7929383 n3:16916456 n3:12824371 n3:17123746 n3:12520026 n3:16809387 n3:2119171 n3:12777501 n3:17238282 n3:8105373 n3:7565750 n3:19208466 n3:9409661 n3:10592200 n3:1333034 n3:10391217 n3:9634693 n3:15166027 n3:15735639 n3:11547334 n3:9719638 n3:18802470 n3:14534164 n3:11751219 n3:16962967 n3:2329577 n3:12364607 n3:7926770 n3:11237011 n3:10944202 n3:12217908 n3:15256515 n3:15462673 n3:12908068 n3:18461991 n3:19092803 n3:1313968 n3:10077607 n3:17959722 n3:15961489 n3:2185240 n3:17567998 n3:18367472 n3:15215380 n3:15637633 n3:17478497 n3:16806065 n3:8887666 n3:15989968 n3:9788350 n3:14681465 n3:9271374 n3:16827941 n3:15253150 n3:10698627 n3:17452354 n3:10064699 n3:11438706 n3:11175784 n3:11125071 n3:10734201 n3:16108719 n3:2184437 n3:10835359 n3:12120892 n3:12727897 n3:11500490
n2:cocitationWith
n3:19906716 n3:18006571 n3:9847208 n3:16381825 n3:19890324 n3:15806101 n3:18411406 n3:24194598 n3:19357200 n3:17408486 n3:12775844 n3:12824371 n3:18523729 n3:9581503 n3:12777501 n3:17284674 n3:17964271 n3:17571346 n3:19095804 n3:17921353 n3:12855472 n3:11917018 n3:21430782 n3:18172436 n3:15735639 n3:10785665 n3:15908603 n3:17376166 n3:16524982 n3:10802651 n3:15256515 n3:15883375 n3:19029883 n3:18078513 n3:15961489 n3:20211142 n3:14530449 n3:18364237 n3:18367472 n3:20363979 n3:15637633 n3:17324271 n3:7584402 n3:17953483 n3:18533028 n3:17994088 n3:10698627 n3:17452354 n3:17360592 n3:11244049 n3:19850720
n2:hasRelevantBibliographicResourceOf
_:vb405628940 _:vb405628941 _:vb405628942 _:vb405628943 _:vb405628936 _:vb405628937 _:vb405628938 _:vb405628939 _:vb405628932 _:vb405628933 _:vb405628934 _:vb405628935 _:vb405628956 _:vb405628957 _:vb405628958 _:vb405628959 _:vb405628952 _:vb405628953 _:vb405628954 _:vb405628955 _:vb405628948 _:vb405628949 _:vb405628950 _:vb405628951 _:vb405628944 _:vb405628945 _:vb405628946 _:vb405628947 _:vb405628972 _:vb405628973 _:vb405628974 _:vb405628975 _:vb405628968 _:vb405628969 _:vb405628970 _:vb405628971 _:vb405628964 _:vb405628965 _:vb405628966 _:vb405628967 _:vb405628960 _:vb405628961 _:vb405628962 _:vb405628963 _:vb405628980 _:vb405628981 _:vb405628982 _:vb405628976 _:vb405628977 _:vb405628978 _:vb405628979
n2:pmcid
PMC0
bibo:doi
10.1093%2Fnar%2Fgkp423
n5:contains
_:vb5357084 _:vb5357044 _:vb5357046 _:vb5356983 _:vb5356972
Subject Item
_:vb5356972
rdf:type
n5:Section
dc:title
introduction
n5:contains
_:vb5356980 _:vb5356981 _:vb5356982 _:vb5356976 _:vb5356977 _:vb5356978 _:vb5356979 _:vb5356973 _:vb5356974 _:vb5356975
Subject Item
_:vb5356973
rdf:type
n2:Context
rdf:value
Transcriptional initiation is a major point of control for gene expression (1–>>3<<), and considerable effort has been devoted to deciphering the code by which transcriptional regulation occurs.
n2:mentions
n3:12777501 n3:2119171 n3:10404615
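
Each n2:Context value marks the citation it describes by wrapping the reference number in >> and <<, as in (1–>>3<<) above. The following Python is a rough sketch of how that marker might be read. The helper names are hypothetical, and treating a marked range end as covering the whole range is an assumption: it agrees with the three n2:mentions values listed here, but the counts need not match in every item (a range can contain references that are not listed).

```python
import re

# The highlighted citation number, e.g. ">>3<<".
MARKER = re.compile(r">>(\d+)<<")
# A numeric range whose upper end is highlighted, e.g. "1–>>3<<" (en dash or hyphen).
RANGE = re.compile(r"(\d+)[–-]>>(\d+)<<")


def highlighted_citation(context: str):
    """Return the highlighted reference number, or None if no marker is present."""
    match = MARKER.search(context)
    return int(match.group(1)) if match else None


def highlighted_range(context: str):
    """Return the reference numbers covered by the highlighted citation.

    Assumption: when the marker closes a range such as 1–>>3<<, the context
    stands for every reference in that range; otherwise only the marked number.
    """
    match = RANGE.search(context)
    if match:
        return list(range(int(match.group(1)), int(match.group(2)) + 1))
    number = highlighted_citation(context)
    return [] if number is None else [number]
```

Applied to the context above, highlighted_range returns the reference numbers 1, 2 and 3; for a single marked citation such as (>>21<<) it returns only 21.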
Subject Item
_:vb5356974
rdf:type
n2:Context
rdf:value
Several methods have been proposed to predict individual transcription factor-binding sites by detecting statistically overrepresented motifs within the promoter (4–>>15<<). In the absence of functional data, however, overrepresentation of DNA sequence elements is not a sufficient criterion for functionality for two basic reasons. First, binding sites are generally short (5–10 bp), which means that many
n2:mentions
n3:17238282 n3:15215380 n3:12217908 n3:12824371 n3:11751219 n3:2184437 n3:9719638 n3:10734201 n3:11175784 n3:10944202 n3:9788350
Subject Item
_:vb5356975
rdf:type
n2:Context
rdf:value
Several recent studies have used spatial preferences as a criterion to predict cis-regulatory elements (16–>>20<<). With one notable exception (21), most previous studies have taken the ‘sliding window’ approach, which measures position-specific overrepresentation within several independent windows of pre-determined width (e.g. 20–25 bp).
n2:mentions
n3:15256515 n3:12364607 n3:16806065 n3:17452354 n3:16827941
Subject Item
_:vb5356976
rdf:type
n2:Context
rdf:value
With one notable exception (>>21<<), most previous studies have taken the ‘sliding window’ approach, which measures position-specific overrepresentation within several independent windows of pre-determined width (e.g. 20–25 bp).
n2:mentions
n3:18367472
Subject Item
_:vb5356977
rdf:type
n2:Context
rdf:value
A second difficulty not addressed by previous studies is that the dinucleotide composition fluctuates dramatically near the start of transcription (>>19<<). This can greatly affect the frequency of motif occurrence in a position-specific manner, and raises the concern that some of the motifs predicted to exhibit positional specificity are not true cis-regulatory elements.
n2:mentions
n3:15256515
Subject Item
_:vb5356978
rdf:type
n2:Context
rdf:value
Transcription is not driven by individual proteins working in isolation, but is instead produced by cooperative interactions between multiple protein factors (>>3<<,22–24). Previous studies have shown that mutual relationships exist between various motifs, such as paired co-occurrences and relative orientations to the TSS (25–28). Such relationships have been effectively utilized in a variety of
n2:mentions
n3:12777501
Subject Item
_:vb5356979
rdf:type
n2:Context
rdf:value
Transcription is not driven by individual proteins working in isolation, but is instead produced by cooperative interactions between multiple protein factors (3,22–>>24<<). Previous studies have shown that mutual relationships exist between various motifs, such as paired co-occurrences and relative orientations to the TSS (25–28). Such relationships have been effectively utilized in a variety of
n2:mentions
n3:9409661 n3:12727897 n3:8105373
Subject Item
_:vb5356980
rdf:type
n2:Context
rdf:value
Previous studies have shown that mutual relationships exist between various motifs, such as paired co-occurrences and relative orientations to the TSS (25–>>28<<). Such relationships have been effectively utilized in a variety of applications, such as the study of condition-specific and time-dependent gene expression patterns, gene network analyses, and promoter region detection (25,29–31).
n2:mentions
n3:16809387 n3:15166027 n3:14731391 n3:11547334
Subject Item
_:vb5356981
rdf:type
n2:Context
rdf:value
Such relationships have been effectively utilized in a variety of applications, such as the study of condition-specific and time-dependent gene expression patterns, gene network analyses, and promoter region detection (>>25<<,29–31). However, such studies are frequently limited to analyzing sequence element relationships between either known binding site motifs or those predicted using standard motif overrepresentation methods. The study presented here
n2:mentions
n3:11547334
Subject Item
_:vb5356982
rdf:type
n2:Context
rdf:value
Such relationships have been effectively utilized in a variety of applications, such as the study of condition-specific and time-dependent gene expression patterns, gene network analyses, and promoter region detection (25,29–>>31<<). However, such studies are frequently limited to analyzing sequence element relationships between either known binding site motifs or those predicted using standard motif overrepresentation methods. The study presented here effectively
n2:mentions
n3:15084257 n3:10391217 n3:15253150
Subject Item
_:vb5356983
rdf:type
n5:Section
dc:title
results
n5:contains
_:vb5356984 _:vb5356985 _:vb5356986 _:vb5356987 _:vb5356988 _:vb5356989 _:vb5356990 _:vb5356991 _:vb5357040 _:vb5357041 _:vb5357042 _:vb5357043 _:vb5357032 _:vb5357033 _:vb5357034 _:vb5357035 _:vb5357036 _:vb5357037 _:vb5357038 _:vb5357039 _:vb5357024 _:vb5357025 _:vb5357026 _:vb5357027 _:vb5357028 _:vb5357029 _:vb5357030 _:vb5357031 _:vb5357016 _:vb5357017 _:vb5357018 _:vb5357019 _:vb5357020 _:vb5357021 _:vb5357022 _:vb5357023 _:vb5357008 _:vb5357009 _:vb5357010 _:vb5357011 _:vb5357012 _:vb5357013 _:vb5357014 _:vb5357015 _:vb5357000 _:vb5357001 _:vb5357002 _:vb5357003 _:vb5357004 _:vb5357005 _:vb5357006 _:vb5357007 _:vb5356992 _:vb5356993 _:vb5356994 _:vb5356995 _:vb5356996 _:vb5356997 _:vb5356998 _:vb5356999
Subject Item
_:vb5356984
rdf:type
n2:Context
rdf:value
In order to determine which motifs exhibit positional enrichment within human promoters, we analyzed spatial enrichment for all 6-mers on a set of non-redundant RefSeq human promoters (>>44<<,45) collected from the UCSC Genome Browser (http://genome.ucsc.edu) (41,42). The data set consisted of 20 609 sequences, each comprising the region 500-bp upstream and 100-bp downstream of a known TSS.
n2:mentions
n3:10592200
Subject Item
_:vb5356985
rdf:type
n2:Context
rdf:value
In order to determine which motifs exhibit positional enrichment within human promoters, we analyzed spatial enrichment for all 6-mers on a set of non-redundant RefSeq human promoters (44,>>45<<) collected from the UCSC Genome Browser (http://genome.ucsc.edu) (41,42). The data set consisted of 20 609 sequences, each comprising the region 500-bp upstream and 100-bp downstream of a known TSS.
n2:mentions
n3:11125071
Subject Item
_:vb5356986
rdf:type
n2:Context
rdf:value
which motifs exhibit positional enrichment within human promoters, we analyzed spatial enrichment for all 6-mers on a set of non-redundant RefSeq human promoters (44,45) collected from the UCSC Genome Browser (http://genome.ucsc.edu) (>>41<<,42). The data set consisted of 20 609 sequences, each comprising the region 500-bp upstream and 100-bp downstream of a known TSS.
n2:mentions
n3:14681465
Subject Item
_:vb5356987
rdf:type
n2:Context
rdf:value
edu) (41,>>42<<). The data set consisted of 20 609 sequences, each comprising the region 500-bp upstream and 100-bp downstream of a known TSS.
n2:mentions
n3:11237011
Subject Item
_:vb5356988
rdf:type
n2:Context
rdf:value
In order to test whether the predicted motifs overlapped with known regulatory elements, we compared our results to known TF-binding sites in the TRANSFAC database (>>39<<) using STAMP (40).
n2:mentions
n3:12520026
Subject Item
_:vb5356989
rdf:type
n2:Context
rdf:value
In order to test whether the predicted motifs overlapped with known regulatory elements, we compared our results to known TF-binding sites in the TRANSFAC database (39) using STAMP (>>40<<). Thirty-four of the motif clusters matched known cis-regulatory elements, comprising a total of twenty known binding sites within TRANSFAC as well as the Inr sequence element (Table 1). Several of the motifs predicted were previously
n2:mentions
n3:17478497
Subject Item
_:vb5356990
rdf:type
n2:Context
rdf:value
Several of the motifs predicted were previously known to exhibit position-specific over-representation, including the TBP, SP1, NFY, CREB, ETS, NRF1 and MYC factor-binding sites (>>19<<,46–48). We also predicted several additional motifs whose positional enrichment had not been previously documented, including fourteen novel regulatory motif candidates, denoted as d1-d14. The location of enrichment for each of the
n2:mentions
n3:15256515
Subject Item
_:vb5356991
rdf:type
n2:Context
rdf:value
Several of the motifs predicted were previously known to exhibit position-specific over-representation, including the TBP, SP1, NFY, CREB, ETS, NRF1 and MYC factor-binding sites (19,46–>>48<<). We also predicted several additional motifs whose positional enrichment had not been previously documented, including fourteen novel regulatory motif candidates, denoted as d1-d14. The location of enrichment for each of the predicted
n2:mentions
n3:17567998 n3:2329577 n3:15735639
Subject Item
_:vb5356992
rdf:type
n2:Context
rdf:value
The third column shows factor names-binding to the known regulatory elements in TRANSFAC (>>39<<); putatively novel motifs are labeled d1–d14.
n2:mentions
n3:12520026
Subject Item
_:vb5356993
rdf:type
n2:Context
rdf:value
The right columns show comparisons to previous studies using the ‘sliding window method’ (>>18<<,19,47,48).
n2:mentions
n3:17452354
Subject Item
_:vb5356994
rdf:type
n2:Context
rdf:value
The right columns show comparisons to previous studies using the ‘sliding window method’ (18,>>19<<,47,48). Asterisks denote matches to non-redundant consensus motifs produced by these studies after k-mer clustering; only motifs predicted to be enriched at approximately the same location were considered matches. All sequence matches to
n2:mentions
n3:15256515
Subject Item
_:vb5356995
rdf:type
n2:Context
rdf:value
The right columns show comparisons to previous studies using the ‘sliding window method’ (18,19,>>47<<,48). Asterisks denote matches to non-redundant consensus motifs produced by these studies after k-mer clustering; only motifs predicted to be enriched at approximately the same location were considered matches. All sequence matches to
n2:mentions
n3:15735639
Subject Item
_:vb5356996
rdf:type
n2:Context
rdf:value
The right columns show comparisons to previous studies using the ‘sliding window method’ (18,19,47,>>48<<). Asterisks denote matches to non-redundant consensus motifs produced by these studies after k-mer clustering; only motifs predicted to be enriched at approximately the same location were considered matches. All sequence matches to
n2:mentions
n3:17567998
Subject Item
_:vb5356997
rdf:type
n2:Context
rdf:value
All sequence matches to TRANSFAC, mouse motif predictions, and those of previous studies were conducted using STAMP (>>40<<) (E-value threshold:
n2:mentions
n3:17478497
Subject Item
_:vb5356998
rdf:type
n2:Context
rdf:value
Such poly(T) sequences are known to alter DNA conformation, thereby affecting transcriptional regulation by displacing the nucleosome from the DNA molecule (49–>>51<<). Similarly, the novel d10 motif, comprising a CA-dinucleotide repeat, promotes left-handed Z-DNA conformations (52–54). The positional biases of these motifs may therefore reflect a functional role for each motif at these locations. The
n2:mentions
n3:19208466 n3:2185240 n3:11438706
Subject Item
_:vb5356999
rdf:type
n2:Context
rdf:value
Similarly, the novel d10 motif, comprising a CA-dinucleotide repeat, promotes left-handed Z-DNA conformations (52–>>54<<). The positional biases of these motifs may therefore reflect a functional role for each motif at these locations. The MLFs of the novel reverse complement motifs d2 and d9 are shown at the bottom of Figure 4; each orientation of this
n2:mentions
n3:6383204 n3:2158081 n3:11447254
Subject Item
_:vb5357000
rdf:type
n2:Context
rdf:value
We conducted a second comprehensive MLF analysis using a sequence data set of 18 354 non-redundant mouse promoters in RefSeq (43–>>45<<). We then compared the motif predictions between the two species according to sequence similarity as well as the location of positional overrepresentation.
n2:mentions
n3:11125071 n3:12466850 n3:10592200
Subject Item
_:vb5357001
rdf:type
n2:Context
rdf:value
Several previous studies have analyzed spatial preferences of potential regulatory motifs within the promoter (17–>>21<<,47,48). Most previous analyses, with one exception (21), have used the ‘sliding window’ approach.
n2:mentions
n3:15256515 n3:18367472 n3:17452354 n3:16827941 n3:16806065
Subject Item
_:vb5357002
rdf:type
n2:Context
rdf:value
Several previous studies have analyzed spatial preferences of potential regulatory motifs within the promoter (17–21,>>47<<,48). Most previous analyses, with one exception (21), have used the ‘sliding window’ approach.
n2:mentions
n3:15735639
Subject Item
_:vb5357003
rdf:type
n2:Context
rdf:value
Several previous studies have analyzed spatial preferences of potential regulatory motifs within the promoter (17–21,47,>>48<<). Most previous analyses, with one exception (21), have used the ‘sliding window’ approach.
n2:mentions
n3:17567998
Subject Item
_:vb5357004
rdf:type
n2:Context
rdf:value
Most previous analyses, with one exception (>>21<<), have used the ‘sliding window’ approach.
n2:mentions
n3:18367472
Subject Item
_:vb5357005
rdf:type
n2:Context
rdf:value
A previous study conducted by FitzGerald et al. (>>19<<) used the sliding window approach, considering motif occurrences within separate windows of 20 bp.
n2:mentions
n3:15256515
Subject Item
_:vb5357006
rdf:type
n2:Context
rdf:value
Spatially enriched k-mers are compared between studies conducted by FitzGerald et al. (>>19<<), Tharakaraman et al. (21) and Vardhanabhuti et al. (18) as well as the MPF model.
n2:mentions
n3:15256515
Subject Item
_:vb5357007
rdf:type
n2:Context
rdf:value
Spatially enriched k-mers are compared between studies conducted by FitzGerald et al. (19), Tharakaraman et al. (>>21<<) and Vardhanabhuti et al. (18) as well as the MPF model. The total number of (unclustered) k-mer predictions are shown in the top row.
n2:mentions
n3:18367472
Subject Item
_:vb5357008
rdf:type
n2:Context
rdf:value
Spatially enriched k-mers are compared between studies conducted by FitzGerald et al. (19), Tharakaraman et al. (21) and Vardhanabhuti et al. (>>18<<) as well as the MPF model. The total number of (unclustered) k-mer predictions are shown in the top row.
n2:mentions
n3:17452354
Subject Item
_:vb5357009
rdf:type
n2:Context
rdf:value
sequences were detected by FitzGerald et al. Table 1 contains comparisons between our regulatory motif predictions to those of FitzGerald et al. as well as three other studies providing non-redundant motifs with spatial enrichment (>>18<<,47,48). We found that many of our motifs predicted with wider ranges of positional enrichment could not be detected using the sliding window approach.
n2:mentions
n3:17452354
Subject Item
_:vb5357010
rdf:type
n2:Context
rdf:value
sequences were detected by FitzGerald et al. Table 1 contains comparisons between our regulatory motif predictions to those of FitzGerald et al. as well as three other studies providing non-redundant motifs with spatial enrichment (18,>>47<<,48). We found that many of our motifs predicted with wider ranges of positional enrichment could not be detected using the sliding window approach.
n2:mentions
n3:15735639
Subject Item
_:vb5357011
rdf:type
n2:Context
rdf:value
were detected by FitzGerald et al. Table 1 contains comparisons between our regulatory motif predictions to those of FitzGerald et al. as well as three other studies providing non-redundant motifs with spatial enrichment (18,47,>>48<<). We found that many of our motifs predicted with wider ranges of positional enrichment could not be detected using the sliding window approach.
n2:mentions
n3:17567998
Subject Item
_:vb5357012
rdf:type
n2:Context
rdf:value
The Inr sequence has been previously characterized by the consensus motif YYAnWYY (>>55<<). This element is known to function specifically at a single nucleotide site at the start of transcription (55,56), and therefore it is difficult to detect using low resolution approaches. Out of 156 8-mer predictions made by FitzGerald
n2:mentions
n3:17123746
Subject Item
_:vb5357013
rdf:type
n2:Context
rdf:value
This element is known to function specifically at a single nucleotide site at the start of transcription (>>55<<,56), and therefore it is difficult to detect using low resolution approaches.
n2:mentions
n3:17123746
Subject Item
_:vb5357014
rdf:type
n2:Context
rdf:value
This element is known to function specifically at a single nucleotide site at the start of transcription (55,>>56<<), and therefore it is difficult to detect using low resolution approaches.
n2:mentions
n3:16916456
Subject Item
_:vb5357015
rdf:type
n2:Context
rdf:value
Despite the highly significant amount of positional overrepresentation exhibited by this motif, none of the studies using the sliding window approach detected any motifs containing this 5-mer (>>18<<,19,47,48).
n2:mentions
n3:17452354
Subject Item
_:vb5357016
rdf:type
n2:Context
rdf:value
Despite the highly significant amount of positional overrepresentation exhibited by this motif, none of the studies using the sliding window approach detected any motifs containing this 5-mer (18,>>19<<,47,48). Figure 5 shows the occurrence data of this motif using 20 bp windows and using single-site resolution; we note the significant decrease of the signal when considering the data using windows of 20 bp. Figure 5.
n2:mentions
n3:15256515
Subject Item
_:vb5357017
rdf:type
n2:Context
rdf:value
Despite the highly significant amount of positional overrepresentation exhibited by this motif, none of the studies using the sliding window approach detected any motifs containing this 5-mer (18,19,>>47<<,48). Figure 5 shows the occurrence data of this motif using 20 bp windows and using single-site resolution; we note the significant decrease of the signal when considering the data using windows of 20 bp. Figure 5.
n2:mentions
n3:15735639
Subject Item
_:vb5357018
rdf:type
n2:Context
rdf:value
Despite the highly significant amount of positional overrepresentation exhibited by this motif, none of the studies using the sliding window approach detected any motifs containing this 5-mer (18,19,47,>>48<<). Figure 5 shows the occurrence data of this motif using 20 bp windows and using single-site resolution; we note the significant decrease of the signal when considering the data using windows of 20 bp. Figure 5.
n2:mentions
n3:17567998
Subject Item
_:vb5357019
rdf:type
n2:Context
rdf:value
Tharakaraman et al. (>>21<<,57) also scanned for positional biases within human promoters. However, their methodology allowed for varying window sizes, improving sensitivity of spatial enrichment considerably.
n2:mentions
n3:18367472
Subject Item
_:vb5357020
rdf:type
n2:Context
rdf:value
Tharakaraman et al. (21,>>57<<) also scanned for positional biases within human promoters. However, their methodology allowed for varying window sizes, improving sensitivity of spatial enrichment considerably.
n2:mentions
n3:15961489
Subject Item
_:vb5357021
rdf:type
n2:Context
rdf:value
Since GC mono- and di-nucleotide composition rises substantially near the start of transcription (>>19<<), about a third of the 8-mers predicted by Tharakaraman et al. were highly GC-rich, containing at least seven out of eight G/C consensus sites.
n2:mentions
n3:15256515
Subject Item
_:vb5357022
rdf:type
n2:Context
rdf:value
In contrast to the analysis of Tharakaraman et al., Vardhanabhuti et al. (>>18<<) controlled for changes in basepair composition across the promoter. In this analysis, the observed number of occurrences of a given motif was compared to an expected number of occurrences in each window of 20 bp.
n2:mentions
n3:17452354
Subject Item
_:vb5357023
rdf:type
n2:Context
rdf:value
This motif was predicted by Vardhanabhuti et al. to be enriched 45 bp prior to the TSS, although it is known to function at a very specific location 30-bp upstream of the TSS (>>55<<,56,58–60).
n2:mentions
n3:17123746
Subject Item
_:vb5357024
rdf:type
n2:Context
rdf:value
This motif was predicted by Vardhanabhuti et al. to be enriched 45 bp prior to the TSS, although it is known to function at a very specific location 30-bp upstream of the TSS (55,>>56<<,58–60). The authors attribute this discrepancy to an increase of A/T nucleotide composition at this location, increasing the ‘expected’ number of occurrences within this window and therefore decreasing the observed/expected ratio.
n2:mentions
n3:16916456
Subject Item
_:vb5357025
rdf:type
n2:Context
rdf:value
This motif was predicted by Vardhanabhuti et al. to be enriched 45 bp prior to the TSS, although it is known to function at a very specific location 30-bp upstream of the TSS (55,56,58–>>60<<). The authors attribute this discrepancy to an increase of A/T nucleotide composition at this location, increasing the ‘expected’ number of occurrences within this window and therefore decreasing the observed/expected ratio. However, the
n2:mentions
n3:7929383 n3:7926770 n3:7518774
Subject Item
_:vb5357026
rdf:type
n2:Context
rdf:value
Periodic distributions have been associated with DNA sequence features attributed to the structural conformations of the nucleosome (>>61<<), and TF-pair interactions are often known to occur at phased intervals around the histone complex or the winding of the DNA double-helix (3,62,63).
n2:mentions
n3:10077607
Subject Item
_:vb5357027
rdf:type
n2:Context
rdf:value
with DNA sequence features attributed to the structural conformations of the nucleosome (61), and TF-pair interactions are often known to occur at phased intervals around the histone complex or the winding of the DNA double-helix (>>3<<,62,63). This scheme is shown in Figure 6, which illustrates a potential preference for protein–protein interactions to occur in a specific orientation in relation to the turn of the double-helix.
n2:mentions
n3:12777501
Subject Item
_:vb5357028
rdf:type
n2:Context
rdf:value
Proteins must often be positioned in a particular orientation with respect to the DNA molecule to induce potential interactions (>>3<<,62,63). Interactions between protein A and protein B occur when the latter is positioned at B1. The same interaction frequently occurs one turn of the double-helix away from B1 (i.e. at B2), since the orientation of protein B is
n2:mentions
n3:12777501
Subject Item
_:vb5357029
rdf:type
n2:Context
rdf:value
Figure 7 shows two MRFs which were both generated by motif-pairs that bind TFs with known interactions (>>64<<,65), namely, the NFY-NFY and NFY-SP1-binding motif pairs.
n2:mentions
n3:9917388
Subject Item
_:vb5357030
rdf:type
n2:Context
rdf:value
Figure 7 shows two MRFs which were both generated by motif-pairs that bind TFs with known interactions (64,>>65<<), namely, the NFY-NFY and NFY-SP1-binding motif pairs.
n2:mentions
n3:15462673
Subject Item
_:vb5357031
rdf:type
n2:Context
rdf:value
MRFs of two motif-pairs-binding interacting TFs (>>64<<,65). A known occurrence of the reverse-strand NFY-binding site defines the position x = 0; x-axis values denote the position of the (a) plus-strand NFY-binding site and the (b) minus-strand SP1-binding site.
n2:mentions
n3:9917388
Subject Item
_:vb5357032
rdf:type
n2:Context
rdf:value
MRFs of two motif-pairs-binding interacting TFs (64,>>65<<). A known occurrence of the reverse-strand NFY-binding site defines the position x = 0; x-axis values denote the position of the (a) plus-strand NFY-binding site and the (b) minus-strand SP1-binding site.
n2:mentions
n3:15462673
Subject Item
_:vb5357033
rdf:type
n2:Context
rdf:value
TFs binding to known motifs in TRANSFAC (>>39<<) are shown in the third column [STAMP (40) E-value threshold:
n2:mentions
n3:12520026
Subject Item
_:vb5357034
rdf:type
n2:Context
rdf:value
TFs binding to known motifs in TRANSFAC (39) are shown in the third column [STAMP (>>40<<) E-value threshold:
n2:mentions
n3:17478497
Subject Item
_:vb5357035
rdf:type
n2:Context
rdf:value
Both factors are known to have direct interactions with NFY (>>64<<,65). Table 4.
n2:mentions
n3:9917388
Subject Item
_:vb5357036
rdf:type
n2:Context
rdf:value
Both factors are known to have direct interactions with NFY (64,>>65<<). Table 4.
n2:mentions
n3:15462673
Subject Item
_:vb5357037
rdf:type
n2:Context
rdf:value
Each partner motif binds either the NFY or SP1 factors (left column). Both NFY-NFY and NFY-SP1 factor-pairs exhibit known interactions (>>64<<,65).
n2:mentions
n3:9917388
Subject Item
_:vb5357038
rdf:type
n2:Context
rdf:value
Each partner motif binds either the NFY or SP1 factors (left column). Both NFY-NFY and NFY-SP1 factor-pairs exhibit known interactions (64,>>65<<).
n2:mentions
n3:15462673
Subject Item
_:vb5357039
rdf:type
n2:Context
rdf:value
TFs binding to the known cis-regulatory elements in TRANSFAC (>>39<<) are shown in the third columns. Binding factors with known direct interactions to SRF, a MADS-box family member, are labeled with asterisks.
n2:mentions
n3:12520026
Subject Item
_:vb5357040
rdf:type
n2:Context
rdf:value
These include binding motifs of TCF3 (>>66<<), CEBP (67), NFY (68) and ATF6 (69). We found that partner motifs pairing with the MADS-box-binding site were frequently predicted in both orientations.
n2:mentions
n3:8617811
Subject Item
_:vb5357041
rdf:type
n2:Context
rdf:value
These include binding motifs of TCF3 (66), CEBP (>>67<<), NFY (68) and ATF6 (69). We found that partner motifs pairing with the MADS-box-binding site were frequently predicted in both orientations.
n2:mentions
n3:11500490
Subject Item
_:vb5357042
rdf:type
n2:Context
rdf:value
These include binding motifs of TCF3 (66), CEBP (67), NFY (>>68<<) and ATF6 (69). We found that partner motifs pairing with the MADS-box-binding site were frequently predicted in both orientations.
n2:mentions
n3:10571058
Subject Item
_:vb5357043
rdf:type
n2:Context
rdf:value
These include binding motifs of TCF3 (66), CEBP (67), NFY (68) and ATF6 (>>69<<). We found that partner motifs pairing with the MADS-box-binding site were frequently predicted in both orientations.
n2:mentions
n3:9271374
Subject Item
_:vb5357044
rdf:type
n5:Section
dc:title
methods
n5:contains
_:vb5357045
Subject Item
_:vb5357045
rdf:type
n2:Context
rdf:value
Formally, Rw(x) for a motif w of length lw, i.e. w = w(1)…w(lw), is given by a position-specific 1st order Markov-dependency model as described in Karlin et al. (>>32<<). The expected frequency Rw(x) of w at position x is given by Equation (7), where Rw(i)w(i + 1)(x) gives the observed frequency of the dinucleotide w(i)w(i + 1) at position x; Rw(i)(x) represents the analogous mono-nucleotide frequency.
n2:mentions
n3:1313968
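
The context above refers to an expected-frequency equation that this excerpt does not reproduce. As a hedged illustration only, a common form of a first-order Markov estimate multiplies the dinucleotide frequencies along the motif and divides by the mono-nucleotide frequencies of its interior positions. The Python below sketches that general form; it is an assumption that the source equation follows it, and the function and parameter names are hypothetical. Position dependence is handled by passing in frequencies already observed at the position of interest.

```python
from math import prod


def expected_frequency(w: str, di_freq: dict, mono_freq: dict) -> float:
    """First-order Markov estimate of the expected frequency of motif w.

    di_freq maps dinucleotides (e.g. "CA") to observed frequencies and
    mono_freq maps single nucleotides to observed frequencies, both taken
    at the position of interest. This is a generic sketch, not the exact
    equation of the cited study.
    """
    if len(w) < 2:
        raise ValueError("motif must contain at least two nucleotides")
    # Product over all overlapping dinucleotides w(i)w(i+1).
    numerator = prod(di_freq[w[i:i + 2]] for i in range(len(w) - 1))
    # Product over interior nucleotides w(2)..w(lw-1); empty product is 1.
    denominator = prod(mono_freq[w[i]] for i in range(1, len(w) - 1))
    return numerator / denominator
```

Under uniform composition (every nucleotide at 0.25 and every dinucleotide at 0.0625), the estimate for a 6-mer reduces to 0.25 to the sixth power, which is the value expected when there is no dinucleotide bias.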
Subject Item
_:vb5357046
rdf:type
n5:Section
dc:title
discussion
n5:contains
_:vb5357080 _:vb5357081 _:vb5357082 _:vb5357083 _:vb5357076 _:vb5357077 _:vb5357078 _:vb5357079 _:vb5357072 _:vb5357073 _:vb5357074 _:vb5357075 _:vb5357068 _:vb5357069 _:vb5357070 _:vb5357071 _:vb5357064 _:vb5357065 _:vb5357066 _:vb5357067 _:vb5357060 _:vb5357061 _:vb5357062 _:vb5357063 _:vb5357056 _:vb5357057 _:vb5357058 _:vb5357059 _:vb5357052 _:vb5357053 _:vb5357054 _:vb5357055 _:vb5357048 _:vb5357049 _:vb5357050 _:vb5357051 _:vb5357047
Subject Item
_:vb5357047
rdf:type
n2:Context
rdf:value
Although such motif finders have been useful for certain applications, assessments of these methods have shown that the efficacy of such approaches is somewhat limited (>>70<<). The method presented here is designed to predict either individual regulatory elements or functional motif-pair relationships using spatial biases, rather than the overall frequency of occurrence, as a criterion for functionality. The
n2:mentions
n3:15637633
Subject Item
_:vb5357048
rdf:type
n2:Context
rdf:value
While most previous studies predict spatial preferences by counting motif occurrences within multiple independent windows of ∼20–25 bp (18–>>20<<,47,48), the MLF model considers the data collectively at single-site resolution, estimating the underlying frequency of occurrence according to position.
n2:mentions
n3:15256515 n3:17452354 n3:16827941
Subject Item
_:vb5357049
rdf:type
n2:Context
rdf:value
While most previous studies predict spatial preferences by counting motif occurrences within multiple independent windows of ∼20–25 bp (18–20,>>47<<,48), the MLF model considers the data collectively at single-site resolution, estimating the underlying frequency of occurrence according to position.
n2:mentions
n3:15735639
Subject Item
_:vb5357050
rdf:type
n2:Context
rdf:value
While most previous studies predict spatial preferences by counting motif occurrences within multiple independent windows of ∼20–25 bp (18–20,47,>>48<<), the MLF model considers the data collectively at single-site resolution, estimating the underlying frequency of occurrence according to position.
n2:mentions
n3:17567998
Subject Item
_:vb5357051
rdf:type
n2:Context
rdf:value
Inspection of the results of FitzGerald et al. (>>19<<) and Vardhanabhuti et al. (18) showed that the most common Inr 5-mer consensus (TCAGT) did not appear in the predictions of either study.
n2:mentions
n3:15256515
Subject Item
_:vb5357052
rdf:type
n2:Context
rdf:value
Inspection of the results of FitzGerald et al. (19) and Vardhanabhuti et al. (>>18<<) showed that the most common Inr 5-mer consensus (TCAGT) did not appear in the predictions of either study.
n2:mentions
n3:17452354
Subject Item
_:vb5357053
rdf:type
n2:Context
rdf:value
For instance, Vardhanabhuti et al. (>>18<<) estimated the expected frequency of occurrence separately at each location within the promoter.
n2:mentions
n3:17452354
Subject Item
_:vb5357054
rdf:type
n2:Context
rdf:value
As discussed by the authors of this study, their method was unable to predict positional specificity of the TATA-box at its correct functional location (>>48<<,58–60); it is highly likely that many other existing signals went undetected using this model.
n2:mentions
n3:17567998
Subject Item
_:vb5357055
rdf:type
n2:Context
rdf:value
As discussed by the authors of this study, their method was unable to predict positional specificity of the TATA-box at its correct functional location (48,58–>>60<<); it is highly likely that many other existing signals went undetected using this model.
n2:mentions
n3:7929383 n3:7926770 n3:7518774
Subject Item
_:vb5357056
rdf:type
n2:Context
rdf:value
For instance, functional occurrences of the SRF-binding motif are not limited to its predicted location of enrichment (>>71<<). However, SRF binding is known to be facilitated by the YY1 factor, with each protein occupying the same binding location on the DNA, albeit on opposite strands of the molecule (72,73). As YY1 has been shown to function at specific
n2:mentions
n3:17200232
Subject Item
_:vb5357057
rdf:type
n2:Context
rdf:value
However, SRF binding is known to be facilitated by the YY1 factor, with each protein occupying the same binding location on the DNA, albeit on opposite strands of the molecule (>>72<<,73). As YY1 has been shown to function at specific locations within the promoter (48), it is not surprising that the SRF-binding motif also exhibits positional enrichment near that of YY1. Thus, although functional occurrences of the
n2:mentions
n3:7565750
Subject Item
_:vb5357058
rdf:type
n2:Context
rdf:value
However, SRF binding is known to be facilitated by the YY1 factor, with each protein occupying the same binding location on the DNA, albeit on opposite strands of the molecule (72,>>73<<). As YY1 has been shown to function at specific locations within the promoter (48), it is not surprising that the SRF-binding motif also exhibits positional enrichment near that of YY1. Thus, although functional occurrences of the
n2:mentions
n3:8887666
Subject Item
_:vb5357059
rdf:type
n2:Context
rdf:value
As YY1 has been shown to function at specific locations within the promoter (>>48<<), it is not surprising that the SRF-binding motif also exhibits positional enrichment near that of YY1.
n2:mentions
n3:17567998
Subject Item
_:vb5357060
rdf:type
n2:Context
rdf:value
These sequence elements are known to promote left-handed Z-DNA structures that affect DNA supercoiling, and subsequently transcription (52–>>54<<). Similarly, the d3 motif consists of a homopolymeric thymine [poly(T)] tract. Such poly(T) tracts are known to alter the conformation of the DNA molecule, thereby disrupting nucleosome positioning (51). Both d3 and d10 show a significant
n2:mentions
n3:2158081 n3:6383204 n3:11447254
Subject Item
_:vb5357061
rdf:type
n2:Context
rdf:value
Such poly(T) tracts are known to alter the conformation of the DNA molecule, thereby disrupting nucleosome positioning (>>51<<). Both d3 and d10 show a significant overrepresentation peak centering near the start of the transcription bubble (74). A recent study conducted by Kaplan et al. (75) has shown that nucleosome occupancy is significantly decreased at this
n2:mentions
n3:19208466
Subject Item
_:vb5357062
rdf:type
n2:Context
rdf:value
Both d3 and d10 show a significant overrepresentation peak centering near the start of the transcription bubble (>>74<<). A recent study conducted by Kaplan et al. (75) has shown that nucleosome occupancy is significantly decreased at this location. It is therefore likely that occurrences of these motifs in this region of the promoter may make the area
n2:mentions
n3:15989968
Subject Item
_:vb5357063
rdf:type
n2:Context
rdf:value
A recent study conducted by Kaplan et al. (>>75<<) has shown that nucleosome occupancy is significantly decreased at this location.
n2:mentions
n3:19092803
Subject Item
_:vb5357064
rdf:type
n2:Context
rdf:value
This analysis was conducted on 1354 mouse promoters generated using CAGE-tag data (>>56<<,76), significantly fewer sequences than were used in our previous analyses.
n2:mentions
n3:16916456
Subject Item
_:vb5357065
rdf:type
n2:Context
rdf:value
This analysis was conducted on 1354 mouse promoters generated using CAGE-tag data (56,>>76<<), significantly fewer sequences than were used in our previous analyses.
n2:mentions
n3:16645617
Subject Item
_:vb5357066
rdf:type
n2:Context
rdf:value
bias of some infrequently occurring motifs could not be detected using the smaller RIKEN data set, including several regulatory elements whose positional bias is well documented, such as the CREB, MYC, NRF1 and ETS-binding sites (>>19<<,47,48). Thus, we note that larger data sets are usually preferable to smaller ones, because they minimize the amount of random noise relative to the background frequency.
n2:mentions
n3:15256515
Subject Item
_:vb5357067
rdf:type
n2:Context
rdf:value
bias of some infrequently occurring motifs could not be detected using the smaller RIKEN data set, including several regulatory elements whose positional bias is well documented, such as the CREB, MYC, NRF1 and ETS-binding sites (19,>>47<<,48). Thus, we note that larger data sets are usually preferable to smaller ones, because they minimize the amount of random noise relative to the background frequency.
n2:mentions
n3:15735639
Subject Item
_:vb5357068
rdf:type
n2:Context
rdf:value
bias of some infrequently occurring motifs could not be detected using the smaller RIKEN data set, including several regulatory elements whose positional bias is well documented, such as the CREB, MYC, NRF1 and ETS-binding sites (19,47,>>48<<). Thus, we note that larger data sets are usually preferable to smaller ones, because they minimize the amount of random noise relative to the background frequency.
n2:mentions
n3:17567998
Subject Item
_:vb5357069
rdf:type
n2:Context
rdf:value
uses multi-modal characteristics of inter-motif distance frequencies, and is therefore inherently different from previous uni-modal models, which have generally relied on the sliding window method or maximum-distance approaches (16–>>18<<,31,77–79). We have found that individual instances of spatial preferences are generally constrained to widths of only ∼2–3 bp, and that single overrepresentation peaks often exhibit only minimal amounts of significance.
n2:mentions
n3:12364607 n3:16806065 n3:17452354
Subject Item
_:vb5357070
rdf:type
n2:Context
rdf:value
uses multi-modal characteristics of inter-motif distance frequencies, and is therefore inherently different from previous uni-modal models, which have generally relied on the sliding window method or maximum-distance approaches (16–18,>>31<<,77–79). We have found that individual instances of spatial preferences are generally constrained to widths of only ∼2–3 bp, and that single overrepresentation peaks often exhibit only minimal amounts of significance.
n2:mentions
n3:15084257
Subject Item
_:vb5357071
rdf:type
n2:Context
rdf:value
multi-modal characteristics of inter-motif distance frequencies, and is therefore inherently different from previous uni-modal models, which have generally relied on the sliding window method or maximum-distance approaches (16–18,31,77–>>79<<). We have found that individual instances of spatial preferences are generally constrained to widths of only ∼2–3 bp, and that single overrepresentation peaks often exhibit only minimal amounts of significance.
n2:mentions
n3:10698627 n3:14534164 n3:16108719
Subject Item
_:vb5357072
rdf:type
n2:Context
rdf:value
Structural analyses have shown that protein binding, and the binding of multi-protein complexes in particular, distorts the conformation of the DNA (80–>>83<<), thus affecting the helical characteristics of the DNA.
n2:mentions
n3:16962967 n3:18802470 n3:10222198 n3:18461991
Subject Item
_:vb5357073
rdf:type
n2:Context
rdf:value
Selective binding of proteins to DNA involves not only sequence-specific elements within the DNA, but also topological characteristics of the DNA molecule (84–>>86<<); this is known to be particularly true during the recruitment of multiple interacting proteins to the DNA (85,86).
n2:mentions
n3:10064699 n3:1333034 n3:9634693
Subject Item
_:vb5357074
rdf:type
n2:Context
rdf:value
DNA involves not only sequence-specific elements within the DNA, but also topological characteristics of the DNA molecule (84–86); this is known to be particularly true during the recruitment of multiple interacting proteins to the DNA (>>85<<,86). Thus, although the biological explanation of the observed pattern remains unclear, these results are not inconsistent with our current knowledge of protein–protein and protein–DNA interactions.
n2:mentions
n3:9634693
Subject Item
_:vb5357075
rdf:type
n2:Context
rdf:value
involves not only sequence-specific elements within the DNA, but also topological characteristics of the DNA molecule (84–86); this is known to be particularly true during the recruitment of multiple interacting proteins to the DNA (85,>>86<<). Thus, although the biological explanation of the observed pattern remains unclear, these results are not inconsistent with our current knowledge of protein–protein and protein–DNA interactions.
n2:mentions
n3:10064699
Subject Item
_:vb5357076
rdf:type
n2:Context
rdf:value
All partners predicted to pair with the NFY-binding motif correspond to either the NFY or SP1-binding elements; both NFY–NFY and NFY–SP1 factor-pair interactions are documented in the literature (>>64<<,65). The MADS-box family consensus sequences predicted during the analysis bind both the myocyte enhancer factor 2 (MEF2) and the serum response factor (SRF) (87). These two factors are known to be involved in complex extra-cellular
n2:mentions
n3:9917388
Subject Item
_:vb5357077
rdf:type
n2:Context
rdf:value
All partners predicted to pair with the NFY-binding motif correspond to either the NFY or SP1-binding elements; both NFY–NFY and NFY–SP1 factor-pair interactions are documented in the literature (64,>>65<<). The MADS-box family consensus sequences predicted during the analysis bind both the myocyte enhancer factor 2 (MEF2) and the serum response factor (SRF) (87). These two factors are known to be involved in complex extra-cellular
n2:mentions
n3:15462673
Subject Item
_:vb5357078
rdf:type
n2:Context
rdf:value
The MADS-box family consensus sequences predicted during the analysis bind both the myocyte enhancer factor 2 (MEF2) and the serum response factor (SRF) (>>87<<). These two factors are known to be involved in complex extra-cellular signaling pathways, playing multiple roles involving cell differentiation and development (88–90). Both MEF2 and SRF regulate gene expression through the recruitment
n2:mentions
n3:10835359
Subject Item
_:vb5357079
rdf:type
n2:Context
rdf:value
These two factors are known to be involved in complex extra-cellular signaling pathways, playing multiple roles involving cell differentiation and development (88–>>90<<). Both MEF2 and SRF regulate gene expression through the recruitment of multiple accessory co-factors whose presence or absence within the complex causes differential expression of their target genes (66,67,86), and therefore we would
n2:mentions
n3:17959722 n3:12120892 n3:16928770
Subject Item
_:vb5357080
rdf:type
n2:Context
rdf:value
Both MEF2 and SRF regulate gene expression through the recruitment of multiple accessory co-factors whose presence or absence within the complex causes differential expression of their target genes (>>66<<,67,86), and therefore we would expect a large number of partner motifs to pair with their binding elements.
n2:mentions
n3:8617811
Subject Item
_:vb5357081
rdf:type
n2:Context
rdf:value
Both MEF2 and SRF regulate gene expression through the recruitment of multiple accessory co-factors whose presence or absence within the complex causes differential expression of their target genes (66,>>67<<,86), and therefore we would expect a large number of partner motifs to pair with their binding elements.
n2:mentions
n3:11500490
Subject Item
_:vb5357082
rdf:type
n2:Context
rdf:value
Both MEF2 and SRF regulate gene expression through the recruitment of multiple accessory co-factors whose presence or absence within the complex causes differential expression of their target genes (66,67,>>86<<), and therefore we would expect a large number of partner motifs to pair with their binding elements.
n2:mentions
n3:10064699
Subject Item
_:vb5357083
rdf:type
n2:Context
rdf:value
Three such factors belong to the homeobox family, whose members play a crucial role in early development (91–>>93<<). Many of the remaining partner motifs may play unknown functional roles in concert with one of the MADS-box protein factors.
n2:mentions
n3:12908068 n3:1968407 n3:2576026
Subject Item
_:vb5357084
rdf:type
n5:Section
dc:title
model selection
n5:contains
_:vb5357085 _:vb5357086 _:vb5357087 _:vb5357092 _:vb5357093 _:vb5357088 _:vb5357089 _:vb5357090 _:vb5357091
Subject Item
_:vb5357085
rdf:type
n2:Context
rdf:value
Motif clusters are condensed into a single consensus sequence according to the criteria derived from (>>38<<) and (39). Namely, each aligned site is assigned a single residue consensus if it comprises 50% of the aligned k-mers and occurs at least twice as frequently as every other nucleotide type.
n2:mentions
n3:3822832
Subject Item
_:vb5357086
rdf:type
n2:Context
rdf:value
Motif clusters are condensed into a single consensus sequence according to the criteria derived from (38) and (>>39<<). Namely, each aligned site is assigned a single residue consensus if it comprises 50% of the aligned k-mers and occurs at least twice as frequently as every other nucleotide type.
n2:mentions
n3:12520026
Subject Item
_:vb5357087
rdf:type
n2:Context
rdf:value
During our analyses, comparisons to known regulatory elements in TRANSFAC v11.3 (>>39<<) were conducted using STAMP (40); only binding motifs found in humans were considered.
n2:mentions
n3:12520026
Subject Item
_:vb5357088
rdf:type
n2:Context
rdf:value
During our analyses, comparisons to known regulatory elements in TRANSFAC v11.3 (39) were conducted using STAMP (>>40<<); only binding motifs found in humans were considered.
n2:mentions
n3:17478497
Subject Item
_:vb5357089
rdf:type
n2:Context
rdf:value
DNA sequences used during our analyses were taken from the UCSC Table Browser (http://genome.ucsc.edu) (>>41<<). Human analyses were conducted using the promoter sequences from the hg18, Build 36.1 assembly (42); mouse promoter data were taken from the mm9, NCBI Build 37 assembly (43).
n2:mentions
n3:14681465
Subject Item
_:vb5357090
rdf:type
n2:Context
rdf:value
Human analyses were conducted using the promoter sequences from the hg18, Build 36.1 assembly (>>42<<); mouse promoter data were taken from the mm9, NCBI Build 37 assembly (43).
n2:mentions
n3:11237011
Subject Item
_:vb5357091
rdf:type
n2:Context
rdf:value
Human analyses were conducted using the promoter sequences from the hg18, Build 36.1 assembly (42); mouse promoter data were taken from the mm9, NCBI Build 37 assembly (>>43<<). Both data sets contained sequences comprising 500 bp upstream and 100 bp downstream of a known TSS in RefSeq (44,45). Sequence-pairs with at least 500 matching sites were filtered from the data sets. Genes without 5′ UTR annotations
n2:mentions
n3:12466850
Subject Item
_:vb5357092
rdf:type
n2:Context
rdf:value
Both data sets contained sequences comprising 500 bp upstream and 100 bp downstream of a known TSS in RefSeq (>>44<<,45). Sequence-pairs with at least 500 matching sites were filtered from the data sets. Genes without 5′ UTR annotations were excluded in order to eliminate TSS annotations caused by incomplete mRNA transcripts. The final data sets
n2:mentions
n3:10592200
Subject Item
_:vb5357093
rdf:type
n2:Context
rdf:value
Both data sets contained sequences comprising 500 bp upstream and 100 bp downstream of a known TSS in RefSeq (44,>>45<<). Sequence-pairs with at least 500 matching sites were filtered from the data sets. Genes without 5′ UTR annotations were excluded in order to eliminate TSS annotations caused by incomplete mRNA transcripts. The final data sets comprised
n2:mentions
n3:11125071
Subject Item
_:vb405628932
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
7
n2:hasRelevantPaperId
n3:17964271
Subject Item
_:vb405628933
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
5
n2:hasRelevantPaperId
n3:15256515
Subject Item
_:vb405628934
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
5
n2:hasRelevantPaperId
n3:17452354
Subject Item
_:vb405628935
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
4
n2:hasRelevantPaperId
n3:15637633
Subject Item
_:vb405628936
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
4
n2:hasRelevantPaperId
n3:18411406
Subject Item
_:vb405628937
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:17994088
Subject Item
_:vb405628938
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:18533028
Subject Item
_:vb405628939
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:24194598
Subject Item
_:vb405628940
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:15735639
Subject Item
_:vb405628941
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:10698627
Subject Item
_:vb405628942
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:18367472
Subject Item
_:vb405628943
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
3
n2:hasRelevantPaperId
n3:16381825
Subject Item
_:vb405628944
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17284674
Subject Item
_:vb405628945
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:10802651
Subject Item
_:vb405628946
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:15806101
Subject Item
_:vb405628947
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19357200
Subject Item
_:vb405628948
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:15961489
Subject Item
_:vb405628949
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:12855472
Subject Item
_:vb405628950
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:18523729
Subject Item
_:vb405628951
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19095804
Subject Item
_:vb405628952
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17324271
Subject Item
_:vb405628953
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:14530449
Subject Item
_:vb405628954
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:15883375
Subject Item
_:vb405628955
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:9847208
Subject Item
_:vb405628956
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:18006571
Subject Item
_:vb405628957
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:15908603
Subject Item
_:vb405628958
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:16524982
Subject Item
_:vb405628959
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17921353
Subject Item
_:vb405628960
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17360592
Subject Item
_:vb405628961
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19850720
Subject Item
_:vb405628962
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:9581503
Subject Item
_:vb405628963
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:20363979
Subject Item
_:vb405628964
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:12775844
Subject Item
_:vb405628965
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17376166
Subject Item
_:vb405628966
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19029883
Subject Item
_:vb405628967
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:7584402
Subject Item
_:vb405628968
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:11917018
Subject Item
_:vb405628969
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:12777501
Subject Item
_:vb405628970
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:18078513
Subject Item
_:vb405628971
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19890324
Subject Item
_:vb405628972
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:21430782
Subject Item
_:vb405628973
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:18364237
Subject Item
_:vb405628974
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17953483
Subject Item
_:vb405628975
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17408486
Subject Item
_:vb405628976
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:18172436
Subject Item
_:vb405628977
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:19906716
Subject Item
_:vb405628978
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:10785665
Subject Item
_:vb405628979
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:12824371
Subject Item
_:vb405628980
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:11244049
Subject Item
_:vb405628981
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:17571346
Subject Item
_:vb405628982
rdf:type
n2:RelevantBibliographicResource
n2:RelevantScore
2
n2:hasRelevantPaperId
n3:20211142