
DNA sequence-specific recognition by a transcriptional regulator requires indirect readout of A-tracts.
MedLine Citation:
PMID:  17452358     Owner:  NLM     Status:  MEDLINE    
The bacteriophage Ø29 transcriptional regulator p4 binds to promoters of different intrinsic activities. The p4-DNA complex contains two identical protomers that make similar interactions with the target sequence 5'-AACTTTTT-15 bp-AAAATGTT-3'. To define how the various elements in the target sequence contribute to p4's affinity, we studied p4 binding to a series of mutated binding sites. The binding specificity depends critically on base pairs of the target sequence through both direct as well as indirect readout. There is only one specific contact between a base and an amino acid residue; other contacts take place with the phosphate backbone. Alteration of direct amino acid-base contacts, or mutation of non-contacted A.T base pairs at A-tracts abolished binding. We generated three 5 ns molecular dynamics (MD) simulations to investigate the basis for the p4-DNA complex specificity. Recognition is controlled by the protein and depends on DNA dynamic properties. MD results on protein-DNA contacts and the divergence of p4 affinity to modified binding sites reveal an inherent asymmetry, which is required for p4-specific binding and may be crucial for transcription regulation.
Jesús Mendieta; Laura Pérez-Lago; Margarita Salas; Ana Camacho
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2007-04-22
Journal Detail:
Title:  Nucleic acids research     Volume:  35     ISSN:  1362-4962     ISO Abbreviation:  Nucleic Acids Res.     Publication Date:  2007  
Date Detail:
Created Date:  2007-06-12     Completed Date:  2007-06-25     Revised Date:  2009-11-18    
Medline Journal Info:
Nlm Unique ID:  0411011     Medline TA:  Nucleic Acids Res     Country:  England    
Other Details:
Languages:  eng     Pagination:  3252-61     Citation Subset:  IM    
Instituto de Biología Molecular Eladio Viñuela (CSIC), Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma, Canto Blanco, 28049 Madrid, Spain.
MeSH Terms
Bacillus Phages / genetics*
Base Sequence
Binding Sites
Computer Simulation
DNA, Viral / chemistry*,  metabolism
Models, Molecular
Nucleic Acid Conformation
Promoter Regions, Genetic*
Protein Binding
Repetitive Sequences, Nucleic Acid
Transcription Factors / chemistry*,  metabolism
Viral Proteins / chemistry*,  metabolism
Reg. No./Substance:
0/DNA, Viral; 0/Transcription Factors; 0/Viral Proteins; 0/p4 protein, Bacteriophage phi 29

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): Nucleic Acids Res
Journal ID (publisher-id): Nucleic Acids Research
ISSN: 0305-1048
ISSN: 1362-4962
Publisher: Oxford University Press
Article Information
© 2007 The Author(s)
open-access: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received Day: 23 Month: 1 Year: 2007
Accepted Day: 13 Month: 3 Year: 2007
collection publication date: Month: 5 Year: 2007
Print publication date: Month: 5 Year: 2007
Electronic publication date: Day: 22 Month: 4 Year: 2007
Volume: 35 Issue: 10
First Page: 3252 Last Page: 3261
ID: 1904284
DOI: 10.1093/nar/gkm180
PubMed Id: 17452358

DNA sequence-specific recognition by a transcriptional regulator requires indirect readout of A-tracts
Jesús Mendieta
Laura Pérez-Lago
Margarita Salas
Ana Camacho*
Instituto de Biología Molecular "Eladio Viñuela" (CSIC), Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Universidad Autónoma, Canto Blanco, 28049 Madrid, Spain
Correspondence: *To whom correspondence should be addressed. Tel: 34-91 497 8435; Fax: 34-91 497 8490; Email:


Phage Φ29 protein p4, in synergy with viral protein p6, effects the transcriptional switch that divides bacteriophage Φ29 infection into early and late phases (1,2). Protein p4 binds to two regions of DNA, each containing two binding sites in tandem, and each binding site (named sites 1–4) consists of imperfect inverted repeats. The consensus binding sequence is 5′-AACTTTTT-15 bp-AAAATGTT-3′ (see Figure 1; 3–6). Site 3 has the highest affinity, followed by sites 1 and 2. Site 4, with the most imperfect inverted repeat sequence, is the lowest-affinity binding site.
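The "imperfect inverted repeat" character of the consensus site can be checked directly from the two half-sites quoted above. The following is a minimal sketch, not part of the original study; the helper names `reverse_complement` and `mismatches` are illustrative.

```python
# Half-sites of the consensus p4 binding sequence quoted in the text.
LEFT_HALF = "AACTTTTT"
RIGHT_HALF = "AAAATGTT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatches(a: str, b: str) -> list:
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# In a perfect inverted repeat, the left half-site read on the opposite
# strand would be identical to the right half-site.
left_on_opposite_strand = reverse_complement(LEFT_HALF)
differences = mismatches(left_on_opposite_strand, RIGHT_HALF)

print(left_on_opposite_strand, RIGHT_HALF, differences)
```

Read on the opposite strand, the left half-site is 5′-AAAAAGTT-3′, which differs from 5′-AAAATGTT-3′ at a single position. These are the two sequences later assigned to monomers A and B.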

Protein p4 crystallizes as a dimer, and each monomer has an α/β fold and a novel N-terminal β-turn substructure, the N-hook, for DNA interaction (7). In the p4–DNA co-crystal, the DNA presents a B conformation with locally narrowed and widened minor grooves. Each p4 monomer hook intrudes into the DNA major groove, where Gln5 and Arg6 establish hydrogen bonds with the DNA bases T ± 15 and G ± 13, respectively (Figure 1). In addition, 14 contacts between the protein dimer and the DNA phosphates are observed. Substitutions of Gln5 and Arg6 with Ala provided evidence that the Arg6–G ± 13 interaction is required for p4-binding site recognition through a direct readout mechanism. Since B-DNA sequences differ in the distribution of hydrogen-bond donor and acceptor groups on their bases but have similar sugar–phosphate backbones, p4–phosphate backbone contacts seemed insufficient, a priori, to explain p4's affinity and sequence specificity. Therefore, recognition of other aspects of the sequence structure (indirect readout) could also account for p4 sequence specificity. It is known that indirect readout of DNA sequences plays a role in determining the stability and/or specificity of many protein–DNA complexes, but precise knowledge of how the sequence modulates DNA structure is limited.

Although DNA–protein crystal structures identify atomic interactions, they do not always reveal the specificity of those interactions, or the induced conformational changes in the protein and DNA. Here, we studied the structural stability of the protein and DNA in the p4–DNA complex, and the relative roles played by direct and indirect readout in sequence-specific recognition by protein p4. Molecular dynamics (MD) simulations (8) represent a highly suitable tool to explore the dynamic behaviour of p4 and its binding site, and the mechanism of these interactions. Therefore, we analysed p4–DNA interactions by MD simulations and by examining the interaction of p4 with sequence-modified binding sites.

Protein and DNAs

Bacteriophage Φ29 p4 protein was expressed and purified as described (9). The oligodeoxyribonucleotides (Isogen) used are shown in Figure S1. To obtain each 60-bp double-stranded DNA, two complementary oligodeoxyribonucleotides were used. One of the oligodeoxyribonucleotides from each pair was 5′-end labelled using [γ-32P]ATP and T4 polynucleotide kinase (10). The labelled strand was purified from unincorporated [γ-32P]ATP through a mini Quick Spin Column (Roche). Complementary oligodeoxyribonucleotides were annealed to yield double-stranded DNA by mixing labelled and unlabelled oligonucleotides in a 1:10 ratio in 80 µl of 25 mM Tris-HCl (pH 7.5), 200 mM NaCl, heating for 2 min at 90°C, and allowing the mixture to cool gradually (14–24 h) to 20°C.

Band-shift assays

Band-shift assays were performed with a fixed amount of DNA and increasing concentrations of p4 for each of the DNA substitutions. From the data obtained, experiments were carried out using those p4 concentrations that give rise to a linear response with each DNA. Binding reactions (20 µl) contained labelled DNA (5 fmol), 25 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 100 mM KCl, 0.5 µg of poly[d(I-C)], 1 µg of bovine serum albumin and protein p4 at the concentrations indicated in each figure. Incubation was for 15 min at 4°C, and the reaction mixture was loaded onto a non-denaturing 6% polyacrylamide gel after addition of 4% (v/v) glycerol. Electrophoresis was performed at 4°C at 20 mA/gel. Gels were dried, and the label present in the free DNA and p4–DNA complex bands was quantified in a GS-710 Imaging Densitometer. The sum of both signals gave the total amount of DNA, and the amount of DNA complexed with the protein was calculated as a fraction of total DNA.
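The quantification described above amounts to a simple ratio per lane. The sketch below illustrates it under stated assumptions: the function names are illustrative and the densitometry readings are invented numbers, not data from the study.

```python
def fraction_bound(free_signal: float, complex_signal: float) -> float:
    """Fraction of total labelled DNA found in the p4-DNA complex band."""
    total = free_signal + complex_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return complex_signal / total


def fold_reduction(reference_fraction: float, variant_fraction: float) -> float:
    """How many times less complex a modified site forms than the reference."""
    if variant_fraction <= 0:
        raise ValueError("no complex detected; fold reduction is undefined")
    return reference_fraction / variant_fraction


# Hypothetical densitometry readings (arbitrary units) for two lanes
# run at the same p4 concentration.
reference = fraction_bound(free_signal=800.0, complex_signal=200.0)
variant = fraction_bound(free_signal=950.0, complex_signal=50.0)

print(reference, variant, fold_reduction(reference, variant))
```

A comparison of this kind is only meaningful within the linear response range mentioned in the text, which is why the p4 concentration is chosen per DNA.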

MD simulations

MD simulations were performed using the PMEMD module of AMBER8 and the parm99 parameter set (11). The X-ray structure (PDB code 2FIO) was used for the MD simulation. The system includes the two p4 monomers (except residues 114–124, corresponding to the flexible helix α4) and the DNA molecule with the following sequence: 5′-TAACTTTTTGCAAGACTTTTTTATAAAATGTTGA-3′. Independent simulations were carried out on the p4–DNA complex (Native complex) and the unbound form of the DNA derived from the structure of the complex but devoid of the protein (free DNA). A third simulation was carried out on the p4–DNA complex in which Arg6 of each protein monomer was substituted with Ala (R6A complex). Na+ ions were added to neutralize the net negative charge of the systems (57 Na+ ions in the native system, 65 Na+ ions in the free DNA system and 59 Na+ ions in the R6A system). The counterions were placed in a shell around the system using a Coulombic potential in a grid. The neutralized complexes were then immersed in a truncated octahedron solvent box keeping a distance of 12 Å between the wall of the box and the closest atom of the solute. The counterions and the solvent were added using the LEAP module of AMBER. Initial relaxation of each complex was achieved by performing 10 000 steps of energy minimization using a cut-off of 10.0 Å. Subsequently, and to start the MD simulations, the temperature was raised from 0 to 298 K in a 200-ps heating phase, and velocities were reassigned at each new temperature according to the Maxwell–Boltzmann distribution. During this period, the positions of the Cα atoms of the solute were restrained with a force constant of 20 kcal mol⁻¹ Å⁻² and the Watson–Crick bonds between all the base pairs of the DNA were constrained. This constraint, with a force constant of 10 kcal mol⁻¹ Å⁻², was maintained during the equilibration steps to prevent spurious disorganization of the structure during the heating of the system from 0 to 298 K. 
During the last 100 ps of the equilibration phase of the MD, the force constant was reduced stepwise to 0, except for the distance corresponding to the Watson–Crick hydrogen bonds between the first and the last base pairs of the DNA molecule, which was maintained in order to mimic the cooperative stabilizing effect of base pairs present at both DNA ends that are not included in our system. This constraint was maintained during the productive phase of the simulations. The SHAKE algorithm was used throughout to constrain all bonds involving hydrogen atoms to their equilibrium values, so that an integration time step of 2 fs could be employed. The list of non-bonded pairs was updated every 25 steps, and coordinates were saved every 2 ps. Periodic boundary conditions were applied and electrostatic interactions were represented using the smooth particle mesh Ewald method with a grid spacing of ~1 Å. The trajectory length was 5 ns for all the complexes. Analysis of the trajectories was performed using the CARNAL module of AMBER 8.
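The numbers in this protocol can be cross-checked with a little arithmetic. The sketch below is an illustration only. It assumes that each Na+ ion carries a charge of +1, that every system is exactly neutralized, and that the 2 fs time step also applies to the heating phase. The variable names are illustrative.

```python
# Values quoted in the simulation protocol.
TIME_STEP_FS = 2
PRODUCTION_NS = 5
SAVE_INTERVAL_PS = 2
PAIRLIST_UPDATE_STEPS = 25
HEATING_PS = 200
NA_IONS = {"native": 57, "free_dna": 65, "r6a": 59}

# Step and frame counts implied by the protocol.
production_steps = PRODUCTION_NS * 1_000_000 // TIME_STEP_FS
saved_frames = PRODUCTION_NS * 1_000 // SAVE_INTERVAL_PS
steps_per_saved_frame = SAVE_INTERVAL_PS * 1_000 // TIME_STEP_FS
pairlist_updates = production_steps // PAIRLIST_UPDATE_STEPS
heating_steps = HEATING_PS * 1_000 // TIME_STEP_FS  # assumes 2 fs here too

# Assumption: exact neutralization, so solute charge = -(number of Na+).
solute_charge = {name: -count for name, count in NA_IONS.items()}

# The protein contribution is the difference between complex and free DNA.
native_dimer_charge = solute_charge["native"] - solute_charge["free_dna"]
r6a_dimer_charge = solute_charge["r6a"] - solute_charge["free_dna"]

print(production_steps, saved_frames, steps_per_saved_frame)
print(pairlist_updates, heating_steps)
print(native_dimer_charge, r6a_dimer_charge)
```

Under these assumptions the native dimer carries a net charge of +8 and the R6A dimer +6. This matches the loss of one positively charged arginine per monomer and agrees with the ion counts reported for the three systems.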

Affinity of p4 for synthetic binding sites with base substitutions at the A-tracts

The fact that only two bases, G13 and T15, on the target site are bonded by protein p4 suggests that sequence-specific recognition by the protein may involve factors other than direct amino acid side chain–base interactions. The protein contacts backbone phosphates neighbouring three A-tracts (Figure 1). A-tracts have special properties, including bifurcated hydrogen bonds, high propeller twist and buckle of base pairs, suitable for protein recognition (12–15).

To elucidate the molecular basis of p4 selectivity for its cognate sequence, we studied the binding affinity of p4 to site 3 sequences containing base modifications at the A-tracts (Figures 2–4). The ability of p4 to bind to modified site 3 bearing C·G base pairs substituting each A·T base pair on the external A-tracts adjacent to G ± 13 was analysed by band-shift assays (Figure 2). Base-pair substitutions at positions ±8 did not affect p4 binding, while the amount of complex formed was reduced by 3- or 4-fold when the base pair was modified at positions ±9 and ±12, respectively. Interestingly, p4 binding was drastically impaired when C·G substituted the A·T base pairs at the centre of the A-tracts (positions ±10 or ±11); a faint band of p4–DNA complex was produced with the DNA modified at positions ±11 and no p4–DNA complex was detected when the modification was at positions ±10. Thus, the affinity decreased 100-fold for substitutions at positions ±11, and >200-fold when the substitution resided at positions ±10. Therefore, the A·T ± 10 and the A·T ± 11 base pairs are critical for p4 binding. Since there are no bases or phosphates contacted by the protein at these positions, the effect should be due to specific sequence recognition by indirect readout.
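The position numbering used here can be made concrete on the 34-mer simulated in the MD section. The sketch below assumes that position 0 is the midpoint between the two contacted guanines (G ± 13) and that the 34-mer represents site 3 numbering. The helper names are illustrative, and the band-shift oligonucleotides themselves (Figure S1) are not reproduced.

```python
# Top strand of the DNA used in the MD simulations, written 5'->3'.
TOP_STRAND = "TAACTTTTTGCAAGACTTTTTTATAAAATGTTGA"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def centre_index(top: str) -> int:
    """0-based index of position 0, midway between the two contacted G's."""
    # Left half-site AACTTTTT: the contacted G lies on the bottom strand,
    # opposite the C at offset 2 of the half-site.
    left = top.index("AACTTTTT") + 2
    # Right half-site AAAATGTT: the contacted G is at offset 5, top strand.
    right = top.index("AAAATGTT") + 5
    if right - left != 26:
        raise ValueError("guanines are not 26 steps apart (positions -13/+13)")
    return (left + right) // 2


def base_pair(top: str, centre: int, position: int) -> str:
    """Base pair at a signed position, written as top:bottom."""
    top_base = top[centre + position]
    return f"{top_base}:{COMPLEMENT[top_base]}"


centre = centre_index(TOP_STRAND)
for pos in range(8, 14):
    print(pos, base_pair(TOP_STRAND, centre, -pos), base_pair(TOP_STRAND, centre, pos))
```

With this numbering, every base pair at positions ±8 to ±12 is an A/T pair in one orientation or the other, and the pairs at ±13 contain the guanines contacted by Arg6.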

The failure of p4 binding to site 3 bearing C·G substitutions at positions ±10 and ±11 could be explained if, when present, the guanine amino group exerts a detrimental effect on p4 binding through the formation of a third hydrogen bond with the pairing cytosine, which renders the base pair less deformable than the A·T base pair. Alternatively, the 2-amino group of guanine in the minor groove may sterically or electrostatically interfere with p4–DNA interaction. If the presence of the amino group in the minor groove and the increase in the number of hydrogen bonds between the bases interfered with p4 binding, the affinity of p4 for DNA would be drastically decreased upon substitution of adenine with the base analogue diaminopurine (DAP). DAP substitution at positions ±10 reduced p4 binding >200-fold (Figure 3). Conversely, removal of the extra hydrogen bond and the amino group using C·I or T·A base pairs at positions ±10 greatly reduced the deleterious effect (Figure 3). These latter substitutions decreased the affinity of p4 for the DNA by only 3-fold for T·A or 6-fold for C·I. The C·I base pair mimics the A·T base pair in the minor groove, while being identical to the C·G base pair in the major groove (16,17).

Protein p4 has ~8-fold lower relative affinity for site 2 than for site 3 (6). There are two main differences at the A-tract level: site 2 has only three adenines in one of the external repeats, and its central A-tract is shorter with respect to the homologous site 3 A-tract (Figure 4). To analyse further the role of the external A-tracts, and to elucidate the relevance of the central A-tract, we studied the binding affinity of p4 to the site 2 sequence modified at its A-tracts. The relative affinity of p4 for site 2 did not substantially increase by substituting the G·C base pair by A·T at position −9 (Figure 4, Site 2B), while enlarging the central A-tract by substitutions at positions 0 and +1 (Figure 4, Site 2A) yielded an amount of p4–DNA complex similar to that formed with the site 3 sequence. Therefore, and in agreement with the data from modified site 3, an A·T base pair is not critical at position ±9. However, when present at positions 0 and +1, it enhances p4 binding.

MD simulations of p4–DNA complexes

In order to gain insight into the solution conformation of the DNA and the induced conformational changes in the protein and DNA upon complex formation, we explored the p4–DNA complex by MD simulations. Simulations were carried out after standard structural equilibration. The three structures studied are referred to as the native complex (Figure 1C), the p4R6A complex where the protein has been modified by changing Arg6 into Ala, and the free DNA dynamics of the site 3 sequence with the conformational modification of the p4–DNA crystal structure but devoid of p4.

Stability of the native complex

Cα-atom root-mean-square deviation (RMSD) values versus simulation time were monitored for the DNA and p4 in the complexes. Figure 5 provides a quantitative view of these parameters. The most remarkable feature in the p4–DNA (native) trajectory along the simulation is the consistency of the protein conformation with the X-ray structure and its stability over the 5 ns simulation. Indeed, the structural fluctuations did not exceed 1.5 Å. The RMSD values of the DNA stabilized after 1.5 ns, with peaks of up to 3.5 Å. These RMSD variations reflect dynamic conformations of the DNA and, since they are not accompanied by alterations of the protein, the DNA must adapt to fit the protein.
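For readers unfamiliar with the measure, RMSD between a reference structure and a trajectory frame can be sketched as follows. This is a minimal illustration, not the CARNAL procedure used in the study. It assumes the frame has already been superposed on the reference, so no fitting step is included, and the coordinates are invented.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation (same units as the input coordinates).

    Both arguments are equal-length sequences of (x, y, z) tuples for the
    same atoms in the same order. No superposition is performed.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_ref, atom_frame in zip(reference, frame)
        for a, b in zip(atom_ref, atom_frame)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom example: every atom is displaced by 1 unit along z.
reference_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_coords = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(reference_coords, frame_coords))
```

Plotting this value for every saved frame against simulation time gives curves of the kind summarized in Figure 5.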

The influence of p4 on the conformational adaptation of the DNA in the complex can be studied by following the time course of the MD simulation of free DNA starting from the structure obtained from crystallography, and allowing the DNA to relax to the solution-free form. Conformational changes from DNA bound to p4 to the unbound form are represented in the free DNA simulation of Figure 5, where initial values lower than 2 Å soon reached values up to 8 Å. These results suggested that the protein imposed the DNA conformation in the crystal. If so, an intermediate situation should be found in the p4R6A–DNA complex, where the Arg6–G13 interactions of both p4 monomers are abolished. The trajectory of the protein in the p4R6A–DNA complex reflects, as was the case for the native complex, the stability of the protein, with RMSD values lower than 1.5 Å (Figure 5). However, the RMSD for the DNA reached values higher than those of the native complex (up to 6 Å), as would be expected if some of the protein-dependent constraints were released. Therefore, we focussed our analysis on the obvious changes of the DNA conformation and the consequences of the protein-imposed structural restriction to the DNA.

Minor groove width and curvature

At the three areas of amino acid–phosphate contacts, the minor groove facing p4 narrowed from the 11.5 Å of a regular B-DNA width, while it widened up to 15 Å on the opposite face of the minor groove (Figure 1C). These effects agree with the final structure of the DNA curved towards the protein. To verify the protein-induced structural features (minor groove narrowing and DNA bending) imposed on the DNA, and to determine how much is intrinsic to the sequence-dependent structure of the free DNA, we analysed the degree of DNA curvature and minor groove width variation along each MD simulation (Figure 6). The curvature of the DNA was calculated using the program Curves (18). During the MD simulation of the free DNA, the bent structure evolved rapidly into a low curvature (with a mean value of 28.1° ± 8.1°). In contrast, the p4-bound form of the DNA (native), with an initial bend angle of 54.2°, reached a mean curvature of 68° ± 7.5° (Figure S2). Since the free DNA relaxes to a significantly different structural form, the DNA conformation in the complex is an energetically strained form rather than a stable substrate, and it is stabilized only by interaction with p4.

Minor groove parameters were determined by measuring the distances between phosphates of T10 and T14, T9 and G13, and T8 and A12 or T12, which correspond to marked width minima. In the native complex, the minor groove width was maintained at 8–9 Å along the 5 ns trajectory at both external A-tracts (for simplicity, we assign monomer A to the sequence 5′-AAAAAGTT-3′ and monomer B to the sequence 5′-AAAATGTT-3′, see Figure 1), the narrowest point being between T9 and G13 (Figure 6, native green lines). In contrast, the free DNA trajectory displayed values consistent with the minor groove width of B-DNA, ~12 Å for the phosphates on the sequence of monomer A and between 10 and 12 Å for the phosphates on the sequence of monomer B (Figure 6, free DNA). Hence, p4 generated local narrowing of the minor groove and this induced conformation may differ upon disruption of the Arg6–G13 interactions. The measurements in the p4R6A trajectory showed variation in minor groove widths depending on the distances and on the sequence measured. On the sequence of monomer B (5′-AAAATGTT-3′), the distances between phosphates 8 and 12 (yellow line) or 9 and 13 (green line) were maintained at 10 Å along the trajectory, while the distance between phosphates 10 and 14 (red line) reached peaks up to 13 Å, which may correlate with the peaks on the DNA RMSD of Figure 5. In contrast, phosphate distances measured on the sequence of monomer A (5′-AAAAAGTT-3′) were uneven. Minor groove width between phosphates 9 and 13 (green line) was maintained at ~9 Å, between phosphates 8 and 12 (yellow line) the width varied between 10 and 12 Å, and the distance between phosphates 10 and 14 (red line) reached up to 16 Å, which followed its DNA RMSD profile (Figure 5). Therefore, Arg6 bonds are required to maintain the DNA conformation, and the data additionally suggest that the asymmetry of the DNA inverted repeats influences differentially its interactions with the 2-fold symmetric protein.
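The groove-width measurement described above is, per frame, a distance between two phosphorus atoms on opposite strands. The sketch below shows the idea with invented coordinates; the variable and function names are illustrative and no trajectory data from the study are used.

```python
import math
from statistics import mean


def groove_widths(p_first, p_second):
    """Per-frame distance between two phosphorus atoms.

    Each argument is a sequence of (x, y, z) coordinates, one per frame,
    for a single phosphorus atom (for example those of T9 and G13).
    """
    if len(p_first) != len(p_second):
        raise ValueError("both atoms need the same number of frames")
    return [math.dist(a, b) for a, b in zip(p_first, p_second)]


# Hypothetical coordinates (in Å) for three frames.
P_T9 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
P_G13 = [(8.0, 0.0, 0.0), (0.0, 9.0, 0.0), (0.0, 0.0, 8.5)]

widths = groove_widths(P_T9, P_G13)
print(min(widths), max(widths), mean(widths))
```

Tracking the same three phosphate pairs in each half-site, as done in the text, makes the asymmetry between the monomer A and monomer B sequences directly comparable.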

Protein–DNA interactions

Since the conformational variation of the DNA is not followed by alterations of the protein, the interactions with the protein must influence the changes in DNA. We addressed this question by measuring, along the MD trajectories, the distances between the protein and DNA atoms involved in interactions in the protein–DNA X-ray structure. Each p4 monomer established two types of protein–DNA contacts: amino acid–base interactions (Gln5–T15 and Arg6–G13), and amino acid–backbone interactions (Thr4 and His10 with the G13 phosphate, Tyr33 with the T8 phosphate, Lys36 of monomer A with the G7 phosphate, and Lys36 of monomer B with the A7 phosphate). Mutation of Gln5 to Ala had no effect on, or even improved, p4 binding to DNA (7). To analyse further the Gln5–T15 interaction, we measured the distance between the atoms involved along the MD trajectory of the native complex (Figure 7). From the beginning of the trajectory, and in most of the snapshots, the distances from the Nε group of Gln5 to the O4 of T15 on both monomers were longer than 6 Å, too large for hydrogen bonding, suggesting that this interaction is not necessary to stabilize the p4–DNA complex. In agreement with this result, we found that changing the G·C base pair at position −15 of binding site 1 to A·T (see Figure 1) did not favour p4 binding (results not shown).

In contrast, the amino acid–base interaction Arg6–G13 and the interaction of Thr4 with the phosphate at position ±13 were maintained at ~3 Å in both monomers along the native complex trajectory (Figure 8A). The hydrogen bonds connecting Tyr33 with phosphate 8 were present in ~50% of the trajectory of both monomers. Furthermore, the overall average interactions between the DNA and His10 and Lys36 appeared in a lower percentage of the snapshots of the trajectory of monomer B compared to those of monomer A, where the interaction with Lys36 fluctuates considerably (Figure 8A). Hydrogen bonding between His10 and O1P at positions ±13, and between Lys36 and O1P at positions ±7, occurred in a high percentage of the trajectory for monomer A, but was only transiently present on monomer B. Inspection of the p4R6A complex (Figure 8B), where the Arg6:G13 contacts were abolished, showed stable and persistent hydrogen bonding between Thr4 and the O2P at positions ±13, while the remaining protein–DNA interactions reached values over 3 Å in a high percentage of the snapshots in both monomers after the stabilization time. The distances measured with O2P indicate that there is no switching of partners (not shown). Since a complete protein–DNA dissociation process lies beyond our timescale, this process would not be detected; the described features may represent a beginning step of the dissociation process. From the results obtained, we conclude that Thr4 is an important stabilizing element of the p4–DNA complex and that, while monomer A–DNA contacts remain more constant, contacts on monomer B present fluctuations.
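Statements such as "present in ~50% of the trajectory" correspond to an occupancy: the fraction of saved snapshots in which a donor–acceptor distance is short enough for a hydrogen bond. The sketch below is illustrative only. The 3.5 Å cutoff is an assumption rather than a value from the study, angle criteria are ignored, and the distances are invented.

```python
def occupancy(distances, cutoff=3.5):
    """Fraction of snapshots with a donor-acceptor distance within the cutoff.

    `cutoff` (in Å) is an assumed hydrogen-bond distance criterion.
    """
    if not distances:
        raise ValueError("at least one snapshot is required")
    within = sum(1 for d in distances if d <= cutoff)
    return within / len(distances)


# Hypothetical distance traces (Å), one value per saved snapshot.
persistent_contact = [2.9, 3.0, 3.1, 2.8]   # stays near 3 Å throughout
transient_contact = [2.9, 3.0, 6.1, 6.4]    # bonded in half of the snapshots
absent_contact = [6.2, 6.8, 7.1, 6.5]       # too far for hydrogen bonding

for name, trace in [
    ("persistent", persistent_contact),
    ("transient", transient_contact),
    ("absent", absent_contact),
]:
    print(name, occupancy(trace))
```

Comparing occupancies of the same contact in monomer A and monomer B gives a compact way to express the asymmetry described in the text.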

Functional asymmetry on protein binding

To study the functional difference of the sequences contacted by each protein monomer, we analysed p4 binding to site 3 with symmetric inverted repeats. Figure 9 shows that protein p4 has higher affinity (4-fold) for site 3 bearing the sequence 5′-AAAATGTT-3′ (B) than for site 3 with the sequence 5′-AACTTTTT-3′ (A) in both external A-tracts. To corroborate these data, we individually substituted each guanine at positions ±13 with adenine. Although p4-binding affinity was clearly diminished when either of these guanines was removed, the amount of complex formed with a site 3 devoid of guanine on the monomer A sequence (position −13; G?A) was reduced 10-fold. In contrast, there was a greater than 30-fold decrease when the guanine was mutated on the monomer B sequence (position +13; G?B). In agreement with this finding, single substitution of DAP at position +10 diminished >10-fold the relative affinity of the protein for DNA, while binding was diminished 5-fold when DAP was present only at position −10 (not shown). These data and the MD simulation trajectories support the conclusion that each protein monomer displays different binding entropies due to the slight asymmetry of the inverted repeats on site 3.


Protein p4 is a transcriptional regulator that binds to four target sites with different relative affinities (Figure 1). In this article, we studied the principles determining p4 binding specificity. A structural study of the p4–DNA complex by X-ray crystallography showed two direct base–amino acid interactions; Arg6 and Gln5 of each protein monomer contact G13 and T15, respectively. However, (i) alanine substitution of each amino acid showed that only the Arg6–G13 interaction is required for p4 binding; (ii) substitution of either guanine at position 13 significantly reduced the relative affinity of p4 for DNA, and simultaneous mutation of both guanines abolished p4 binding (Figures 9 and S4); (iii) MD simulation on the p4–DNA complex showed stable contacts along the trajectory between the Arg6 of both monomers and the guanines at positions ±13, while the distances from the Nε of Gln5 of either monomer to the O4 of T15 are excessively large for hydrogen bonding in >90% of the trajectory. It is conceivable that the interaction of Gln5 with T15 in the X-ray structure resulted from the bend of the DNA toward the protein. Hence, specific sequence recognition by direct readout relies exclusively on Arg6–G13 interactions. Taking into account that two guanines on complementary DNA strands separated by 25 bp is a frequent event along the Φ29 genome, while p4 binds specifically only to the four target sequences depicted in Figure 1, other characteristics of the p4-binding site sequence should contribute, in addition, to the p4-binding specificity.

Based on the X-ray structure, the p4–DNA complex with the sequence of site 3 includes three patches of amino acids and three A-tracts. Those A-tracts are present in the other p4-binding sites, as well as in the binding sites of the p4 homologous protein of Nf, a phage closely related to Φ29 (19). Residues Thr4 and His10 contact the Φ29 site 3 DNA backbone precisely at one border of the two external A-tracts, with Tyr33 and Lys36 contacting the opposite border of the A-tracts. Substitution of A·T with C·G at each base pair of those A-tracts influences or abolishes the affinity of p4 for DNA. Our findings show that placing an amino group in the minor groove of the base pair located at the centre of the A-tract (positions ±10) abolishes p4–DNA complex formation. The A·T base pair exhibits a higher intra-base pair propeller deformation than C·G or DAP·T base pairs, a consequence of having two Watson–Crick hydrogen bonds between the bases rather than three. DNA bearing the amino group in the minor groove is under-wound with respect to sites lacking this group, which may directly impact the DNA helical twist and twisting flexibility by mechanical occlusion. Either way, the main negative effect results in a less deformable base pair and, therefore, an increase in the energy required to distort the DNA. Furthermore, A·T or T·A base pairs at position 10 allow the DNA to be more easily bent in the direction of the minor groove that faces the protein at this position. Base substitutions at the A-tracts support both sequence recognition by indirect readout and p4 over-winding of the minor groove at the centre of the A-tracts.

Examination of the unbound and bound DNA conformations by MD provides clues as to the dynamics of the system. The absolute value of the RMSD for DNA in the p4–DNA complex, lower than 2 Å, can be considered small taking into account the large size of the simulated system. However, the RMSD values for DNA have a higher mean value (up to 8 Å) in the free DNA MD, demonstrating considerable DNA immobilization mediated by p4 binding. The RMSD values of the protein with respect to the starting value have a constant mean value of ~1.5 Å along the trajectory, suggesting high stability of the protein moiety in the complex. On the other hand, comparison of the free and DNA-bound crystal structures of p4 does not reveal significant structural changes upon p4 binding, with a Cα RMSD of 0.665 Å (7). The DNA in the p4–DNA crystal structure is curved towards the protein due to local compression of the minor grooves at the areas where the protein contacts the backbone phosphates. In the MD simulation, the stable form of free DNA released from p4 does not show overall bending or a stable local narrow minor groove. In contrast, the DNA bound to p4 displays a stably narrowed minor groove between T10 and T14, T9 and G13, and T8 and A12 or T12 phosphates. Since in the absence of p4 the free DNA relaxes to a significantly different conformation, the structure of the DNA in the complex is not a stable or meta-stable substrate but a consequence of the induced conformational modification impressed by p4, which in turn does not reveal significant structural changes upon DNA binding. In fact, p4 binding does not require intrinsically bent DNA (6). 
The MD data, the disadvantage of sequences containing disrupted A-tracts (this article) for p4 binding to DNA, and the drastic reduction in binding affinity of p4 Thr4Ala and Tyr33Ala mutants (7) indicate that the sequence-dependent characteristic of A-tracts provides an indirect readout by affecting the optimal complementarity both for amino acid–base hydrogen bonding and for precisely positioned interactions between amino acid and DNA phosphates. Hence, the stability of the p4–DNA complex is a delicate balance of direct and indirect readout.

The collective results of this study indicate that p4–DNA-binding stability is a consequence of p4-induced conformational modification of the DNA from the canonical B form through an indirect readout mechanism, whereas the primary function of the DNA is its ability to acquire a conformation capable of enhancing positive interactions with its cognate protein.

Indirect readout is less well characterized than direct readout. The affinity of a protein for its DNA target by indirect readout relies on the fact that B-DNA exhibits a high degree of sequence-dependent structural variation (20–25), which includes recognition of aspects of DNA structure such as intrinsic curvature, topology of major and minor grooves, local geometry of backbone phosphates and flexibility or deformability. These mechanisms have been used to explain some aspects of the affinity of other prokaryotic transcriptional regulators for their target sequences: CAP seems to discriminate between consensus pyrimidine–purine steps through sequence effects on the energetics of primary-kink formation (26,27), bacteriophage 434 repressor recognizes structural features on the central base pair of its target sequence (28,29), and water-mediated contacts are known as an important recognition tool in the trp-repressor operator system (30,31).

The complexity of the p4 interaction with its target sequence is compounded by the fact that, despite the 2-fold symmetry of the protein dimer, the protein uses pseudo-inverted repeats to interact with DNA, one monomer (monomer A) interacting with the sequence 5′-AAAAAGTT-3′ and the other (monomer B) with the sequence 5′-AAAATGTT-3′. Moreover, p4 is capable of recognizing other sequences with asymmetric inverted repeats (Figure 1). We demonstrated that each inverted repeat makes a different contribution to p4-binding affinity. Additionally, the p4–DNA MD simulation indicates that the hydrogen bonds of Tyr33, His10 and Lys36 with DNA are significantly more variable, both in residence time and in bonding distance, on monomer B than on monomer A, suggesting that interactions on monomer B present higher entropic stability. Since the pyrimidine–purine T/G step is more susceptible to deformation than the A/G step because it has a smaller amount of base overlap, the T/G bases of monomer B may permit a better orientation of G+13 for its interaction with Arg6 (Figure S3).
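The variability in residence time and bonding distance mentioned above can be quantified from a per-frame distance trace. The following is a minimal sketch under stated assumptions: the 3.5 Å cutoff for counting a hydrogen bond as formed is a common convention chosen here for illustration and is not a value given in the article, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def contact_stats(distances, cutoff=3.5):
    """Summarize a donor-acceptor distance trace (one value per MD frame, in Å).

    cutoff is an assumed hydrogen-bond distance threshold, not from the article.
    """
    if not distances:
        raise ValueError("need at least one frame")
    formed = 0    # frames in which the contact is within the cutoff
    longest = 0   # longest uninterrupted run of such frames
    current = 0
    for d in distances:
        if d <= cutoff:
            formed += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return {
        "occupancy": formed / len(distances),  # fraction of frames bonded
        "longest_run": longest,                # proxy for residence time, in frames
        "mean": mean(distances),
        "sd": pstdev(distances),               # spread of the bonding distance
    }
```

Computed for the same contact on monomers A and B, a lower occupancy, shorter runs or a larger standard deviation on one monomer would correspond to the more variable behaviour described in the text.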

Taking into account that DNA deformation is an important component of the driving force for p4–DNA association, the data presented here provide insights into how the DNA sequence may impose a directional binding on p4. We consider that: (1) asymmetry is functionally required for the p4–DNA interaction; (2) the MD simulations suggest a net order of p4 binding to DNA sites, with minor groove narrowing and curving of the helical axis; (3) the distance from G−13 to G+13 is ~90 Å, whereas the distance of 75 Å between the Arg6 residues of the two monomers is too short for both p4 monomers to interact simultaneously with the inverted repeats. Therefore, we propose a zipper-binding model in which one of the p4 monomers interacts first with the sequence of higher entropic stability, 5′-AAAATGTT-3′, followed by local minor groove narrowing. This change in DNA conformation would allow interactions between basic residues of both monomers and the central A-tract, and the progressive bending of the DNA would permit the 5′-AAAAAGTT-3′ inverted repeat to interact with the hook of the second p4 monomer.
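The separation of roughly 90 Å quoted in point (3) can be checked against canonical B-DNA geometry. The sketch below assumes a helical rise of 3.4 Å per base-pair step (a textbook value, not taken from this article), treats the helix as straight and measures along the helix axis only, so it ignores both the bending discussed above and the strand on which each guanine lies. The function and constant names are illustrative.

```python
RISE_PER_BP = 3.4   # Å per base-pair step, canonical B-DNA (assumed textbook value)
ARG6_SPAN = 75.0    # Å, Arg6-to-Arg6 distance quoted in the text

def straight_bdna_distance(pos_a, pos_b, rise=RISE_PER_BP):
    """Axial distance between two positions on a straight B-DNA helix.

    Positions use the site numbering in which the central base pair is 0.
    """
    return abs(pos_b - pos_a) * rise

# Guanines at positions -13 and +13 are separated by 26 base-pair steps.
g_to_g = straight_bdna_distance(-13, 13)
shortfall = g_to_g - ARG6_SPAN
print(f"G-13 to G+13 on straight DNA: {g_to_g:.1f} Å")
print(f"Excess over the Arg6-Arg6 span: {shortfall:.1f} Å")
```

Under these assumptions the straight-helix estimate is close to the ~90 Å given in the text and exceeds the 75 Å Arg6 span, which is consistent with the argument that the DNA must bend before both hooks can engage.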

Phage Ø29 early promoters A2c and A2b and late promoter A3 are coordinately regulated by a multimeric complex of viral proteins p4 and p6, which elicits the switch from early to late transcription by repressing promoters A2c and A2b and simultaneously activating promoter A3. In the multimeric complex, p4 dimers occupy binding sites 1 and 3, and p6 binds the sequence from site 1 to site 3, synergizing p4 binding (2). Since protein p6 polymerizes from the A-tract 5′-AAAAAGTT-3′ of site 1 to the A-tract 5′-AAAAAGTT-3′ of site 3, a second functional implication of the site asymmetry could be to explain the p6-mediated stabilization of p4 binding, with p6 anchoring the p4 monomer to the lower-affinity A-tract. This stabilization is critical for the regulation of the promoters and is therefore required for the transition from early to late transcription during Ø29 development.


Supplementary data are available at NAR Online.


We thank L. Rothman-Denes for stimulating discussions and critical reading of this manuscript and Galo Ramirez for comments on the manuscript. This study was supported by the Ministerio de Educación y Ciencia of Spain (Grant BFU-2005-0733 to M.S.), the Comunidad de Madrid (Grant 08.2/0026.1/2005 to A.C.; Grant S-0505/MAT-0283 to M.S. and a fellowship to L.P.-L.) and an Institutional Grant from Fundación Ramón Areces to the Centro de Biología Molecular 'Severo Ochoa'. We thank Laurentino Villar for protein p4 purification and the CIEMAT for generous allowances of computer time at the Jen50. Funding to pay the Open Access publication charge was provided by the Ministerio de Educación y Ciencia.

Conflict of interest statement. None declared.

1. Elías-Arnanz M, Salas M. Functional interactions between a phage histone-like protein and a transcriptional factor in regulation of Ø29 early-late transcriptional switch. Genes Dev. 1999;13:2502–2513. [pmid: 10521395]
2. Camacho A, Salas M. Mechanism for the switch of Ø29 DNA early to late transcription by regulatory protein p4 and histone-like protein p6. EMBO J. 2001;20:6060–6070. [pmid: 11689446]
3. Barthelemy I, Salas M. Characterization of a new prokaryotic transcriptional activator and its DNA recognition site. J. Mol. Biol. 1989;208:225–232. [pmid: 2504924]
4. Nuez B, Rojo F, Salas M. Requirement for an A-tract structure at the binding site of phage Ø29 transcriptional activator. J. Mol. Biol. 1994;237:175–181. [pmid: 8126731]
5. Camacho A, Salas M. Molecular interplay between RNA polymerase and two transcriptional regulators in promoter switch. J. Mol. Biol. 2004;336:357–368. [pmid: 14757050]
6. Pérez-Lago L, Salas M, Camacho A. A precise DNA bend angle is essential for the function of the phage Ø29 transcriptional regulator. Nucleic Acids Res. 2005;33:126–134. [pmid: 15642698]
7. Badia D, Camacho A, Pérez-Lago L, Escandón C, Salas M, Coll M. The structure of phage Ø29 transcription regulator p4-DNA complex reveals an N-hook motif for DNA. Mol. Cell 2006;22:73–81. [pmid: 16600871]
8. Hansson T, Oostenbrink C, van Gunsteren W. Molecular dynamics simulations. Curr. Opin. Struct. Biol. 2002;12:190–196. [pmid: 11959496]
9. Barthelemy I, Lázaro JM, Méndez E, Mellado RP, Salas M. Purification in an active form of the Ø29 protein p4 that controls the viral late transcription. Nucleic Acids Res. 1989;15:7781–7793. [pmid: 3671066]
10. Sambrook J, Fritsch EF, Maniatis T. Molecular Cloning: A Laboratory Manual. 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 1989.
11. Case DA, Pearlman DA, Caldwell JW, Cheatham TE III, Wang J, Ross W, Simmerling CL, Darden TA, Merz KM, et al. AMBER 7. San Francisco: University of California; 2002.
12. Coll M, Frederick CA, Wang AH, Rich A. A bifurcated hydrogen-bonded conformation in the d(A.T) base pairs of the DNA dodecamer d(CGCAAATTTGCG) and its complex with distamycin. Proc. Natl. Acad. Sci. USA 1987;84:8385–8389. [pmid: 3479798]
13. Nelson HC, Finch JT, Luisi BF, Klug A. The structure of an oligo(dA).oligo(dT) tract and its biological implications. Nature 1987;330:221–226. [pmid: 3670410]
14. Crothers DM. DNA curvature and deformation in protein-DNA complexes: a step in the right direction. Proc. Natl. Acad. Sci. USA 1998;95:15163–15165. [pmid: 9860938]
15. Goodsell DS, Dickerson RE. Bending and curvature calculations in B-DNA. Nucleic Acids Res. 1994;22:5497–5503. [pmid: 7816643]
16. Saenger W. Principles of Nucleic Acids. Advanced Texts in Chemistry. New York: Springer; 1983.
17. Xuan JC, Weber IT. Crystal structure of a B-DNA dodecamer containing inosine, d(CGCIAATTCGCG), at 2.4 Å resolution and its comparison with other B-DNA dodecamers. Nucleic Acids Res. 1992;20:5457–5464. [pmid: 1437563]
18. Lavery R, Sklenar H. Defining the structure of irregular nucleic acids: conventions and principles. J. Biomol. Struct. Dyn. 1989;6:655–667. [pmid: 2619933]
19. Pérez-Lago L, Salas M, Camacho A. Homologies and divergences in the transcription regulatory system of two related Bacillus subtilis phages. J. Bacteriol. 2005;187:6403–6409. [pmid: 16159774]
20. Hogan ME, Austin RH. Importance of DNA stiffness in protein-DNA binding specificity. Nature 1987;329:263–266. [pmid: 3627268]
21. von Hippel PH. Protein-DNA recognition: new perspectives and underlying themes. Science 1994;263:769–770. [pmid: 8303292]
22. Harrington RE, Winicov I. New concepts in protein-DNA recognition: sequence-directed DNA bending and flexibility. Prog. Nucleic Acid Res. Mol. Biol. 1994;47:195–270. [pmid: 8016321]
23. Dickerson RE, Chiu TK. Helix bending as a factor in protein/DNA recognition. Biopolymers 1997;44:361–403. [pmid: 9782776]
24. Kalodimos CG, Biris N, Bonvin AM, Levandoski MM, Guennuegues M, Boelens R, Kaptein R. Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science 2004;305:386–389. [pmid: 15256668]
25. Gromiha MM, Siebers JG, Selvaraj S, Kono H, Sarai A. Role of inter and intramolecular interactions in protein-DNA recognition. Gene 2005;364:108–113. [pmid: 16249059]
26. Gartenberg MR, Crothers DM. DNA sequence determinants of CAP-induced bending and protein binding affinity. Nature 1988;333:824–829. [pmid: 2838756]
27. Napoli AA, Lawson CL, Ebright RH, Berman HM. Indirect readout of DNA sequence at the primary-kink site in the CAP-DNA complex: recognition of pyrimidine-purine and purine-purine steps. J. Mol. Biol. 2006;357:173–183. [pmid: 16427082]
28. Koudelka GB, Carlson P. DNA twisting and the effects of non-contacted bases on affinity of 434 operator for 434 repressor. Nature 1992;355:89–91. [pmid: 1731202]
29. Mauro SA, Pawlowski D, Koudelka GB. The role of the minor groove substituents in indirect readout of DNA sequence by 434 repressor. J. Biol. Chem. 2003;278:12955–12960. [pmid: 12569094]
30. Otwinowski Z, Schevitz RW, Zhang RG, Lawson CL, Joachimiak A, Marmorstein RQ, Luisi BF, Sigler PB. Crystal structure of trp repressor/operator complex at atomic resolution. Nature 1988;335:321–329. [pmid: 3419502]
31. Bareket-Samish A, Cohen I, Haran TE. Direct versus indirect readout in the interaction of the trp repressor with non-canonical binding sites. J. Mol. Biol. 1998;277:1071–1080. [pmid: 9571023]


[Figure ID: F1]
Figure 1. 

(A) Location of the four p4-binding sites with respect to the early (A2b and A2c) and late (A3) bacteriophage Ø29 promoters, with the −10 and −35 elements indicated. Protein p4 monomers are drawn as elliptical objects with their N-hook DNA-recognition motifs. Monomers of the p4 dimer are represented in purple and green and each monomer is distinguished as A or B (see text). (B) Sequence of the p4-binding sites 1–4, aligned, with the central position of each site indicated by a red diamond. Base pairs corresponding to positions 0 and ±15 are indicated. The guanines and thymines referred to in this article are in red. The three A-tracts of the site 3 sequence are enclosed in dashed boxes. (C) Structure of the protein p4 in complex with the 34-bp site 3 DNA sequence used in the MD simulations.

[Figure ID: F2]
Figure 2. 

Band-shift assays of p4 and DNA sequences with modified base pairs in the external A-tracts. The site 3 sequence is shown with the two external A-tracts in boxes and the base pairs at positions ±8 and ±13 indicated. The modified site 3 sequences assayed are summarized below, with the values corresponding to the decrease in relative affinity indicated. Note that a C·G base pair substituted each A·T base pair on both external A-tracts, except at position +12, where C·G substitutes a T·A base pair. The p4–DNA complex was formed and resolved from free DNA in a polyacrylamide gel. The ratio between bound and total DNA was used to calculate the relative affinity of p4 for each modified site 3. The concentration of p4 (in nM dimer) used is indicated on top of the gel.
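The legend states that the ratio of bound to total DNA was used to calculate relative affinity but does not give the formula. One possible reading, shown here only as a sketch, assumes simple 1:1 binding with protein in large excess over DNA, so that an apparent dissociation constant follows from the fraction bound at a known p4 concentration. The function names and this binding model are assumptions, not the authors' stated procedure.

```python
def fraction_bound(bound, free):
    """Fraction of DNA in the shifted band, from band intensities."""
    total = bound + free
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return bound / total

def apparent_kd(theta, protein_nM):
    """Apparent Kd (nM), assuming 1:1 binding and protein in excess over DNA."""
    if not 0 < theta < 1:
        raise ValueError("fraction bound must lie strictly between 0 and 1")
    return protein_nM * (1 - theta) / theta

def relative_affinity(theta_mut, theta_wt, protein_nM):
    """Affinity of a modified site relative to wild type (1.0 means unchanged)."""
    return apparent_kd(theta_wt, protein_nM) / apparent_kd(theta_mut, protein_nM)
```

With these definitions, a modified site that is 20% bound at a p4 concentration where the wild-type site is 50% bound has a relative affinity of 0.25, that is, a 4-fold decrease.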

[Figure ID: F3]
Figure 3. 

Effect of base substitutions by nucleotide analogues at site 3 positions ±10 on the affinity of p4 for site 3. The substitutions at positions ±10 were made symmetrically, such that the modified base appears at identical positions on both strands. The decrease in the relative affinity of p4 for each modified site 3 is indicated.

[Figure ID: F4]
Figure 4. 

Band-shift assays of p4 with DNA sequences with modified base pairs in the A-tracts of site 2. The site 3 sequence is shown with the central A-tract boxed, the central position denoted by a diamond and the base pairs at positions ±13, ±8 and 0 indicated. The site 2 sequence and the modifications of the sequence are shown below. Other conditions as in Figure 2.

[Figure ID: F5]
Figure 5. 

Root-mean-square deviation (RMSD), in ångströms, of MD snapshots of the protein Cα trace (green lines) and the DNA phosphates (red lines) as a function of the simulation time. The panels are, from top to bottom: free DNA, native protein p4 complex and p4R6A complex.

[Figure ID: F6]
Figure 6. 

Minor groove width measurements as a function of time. From top to bottom: dynamics of free DNA, of the native complex and of the p4R6A complex. Left graphs represent measurements between the phosphates of the sequence 5′-AAAAAGTT-3′. Right graphs represent measurements on the sequence 5′-AAAATGTT-3′ (Figure 1). Red lines represent the distances between the T14 and T10 phosphates, green lines show the distances between the G13 and T9 phosphates and yellow lines are the distances between the phosphates of A or T at position 12 and T8.

[Figure ID: F7]
Figure 7. 

Dynamic profiles of the inter-atomic distance between atom Nε of Gln5 and atom O4 of T15 as a function of simulation time (nanoseconds). The red line corresponds to the interaction of protein monomer A; the green line represents the contact of protein monomer B. In the crystal, the bond distances in monomers A and B are 3.32 and 3.04 Å, respectively.

[Figure ID: F8]
Figure 8. 

Protein–DNA bond distances along the MD trajectory of the native and p4R6A complexes. (A) Dynamics of the native p4–DNA complex. (B) Dynamics of the p4R6A–DNA complex. Left, distance measurements between residues of p4 monomer A and the sequence 5′-AAAAAGTT-3′; right, distance measurements between residues of monomer B and the sequence 5′-AAAATGTT-3′. Bond distances measured were: from the NH2 and Nε atoms of Arg6 to atoms O6 and N7 of ±13G (red and green lines, respectively); from atom Oγ1 of Thr4 to O2P of ±13G (marine); from an imidazole N atom of His10 to O1P of ±13G (blue); from O of Tyr33 to O1P of ±8T (light green); from Nζ of Lys36 to O1P of −7G or +7A (dark blue).

[Figure ID: F9]
Figure 9. 

Specificity of p4 for site 3 modified at the inverted repeat sequences. Band-shift experiment of p4 with site 3 sequences with symmetry on the inverted repeats of the external A-tracts (A and B), or site 3 devoid of one or the other guanine at positions 13 of the inverted repeats (G−A and G−B). Mutant A has the sequence 5′-AAAAAGTT-3′ at both inverted repeats, while mutant B has the sequence 5′-AAAATGTT-3′ at both inverted repeats. In mutant G−A the guanine at position −13 is replaced by adenine, and in mutant G−B the guanine replaced by adenine is at position +13. The concentration of p4 (in nM) used is indicated on top of the gel.

Article Categories:
  • Structural Biology
