by Lee Chin Wei
and Andrew Long
Contents
 An Overview
 Different types of Synchronous Counters
 A Comparison between Synchronous and Asynchronous
Counters
 Synchronous Counter Design
 Making Fast Counters
 References
An Overview
The purpose of the survey is to collate information on
Digital Synchronous Counters. Particular emphasis
was placed on the following areas :
 Types of Synchronous Counters and How they work
 Fast Counter Techniques
 Fast Counters from Xilinx
 Implementation of Counters :
Dedicated Hardware and Alternative Devices
The material is presented in a manner suitable for a teaching tool. It seeks
to enlighten and to spark off interest in the design of counters. As
R.S.S Obermann remarks, "....design of counters has, in my experience, always
been an excellent proving ground for anyone who has mastered Boolean algebra..."
Have fun reading !!!!!
Different types of Synchronous Counters
Binary Up Counters
A synchronous binary counter counts from 0 to 2^{N} - 1, where N is
the number of bits/flip-flops in the counter. Each flip-flop is used to represent
one bit. The flip-flop in the lowest-order position is complemented/toggled
with every clock pulse, and a flip-flop in any other position is complemented on
the next clock pulse provided all the bits in the lower-order positions are equal
to 1.
Take for example A_{4} A_{3} A_{2} A_{1} =
0011. On the next count, A_{4} A_{3} A_{2} A_{1}
= 0100. A_{1}, the lowest-order bit, is always complemented. A_{2}
is complemented because all the lower-order positions (A_{1} only in this
case) are 1's. A_{3} is also complemented because all the lower-order
positions, A_{2} and A_{1}, are 1's. But A_{4} is not
complemented because the lower-order positions, A_{3} A_{2} A_{1}
= 011, do not give an all-1 condition.
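The toggle rule can be checked with a short Python sketch. The function name `next_up_state` and the list-of-bits representation (index 0 holds A_{1}) are assumptions made for illustration only; they model the rule, not the gate-level circuit.

```python
def next_up_state(bits):
    """Apply one clock pulse to a synchronous binary up counter.

    bits[0] is the lowest-order bit (A1). A bit toggles when every
    lower-order bit is 1; the lowest-order bit always toggles.
    """
    new_bits = []
    for i, bit in enumerate(bits):
        toggle = all(b == 1 for b in bits[:i])  # all() of an empty list is True
        new_bits.append(bit ^ 1 if toggle else bit)
    return new_bits


# A4 A3 A2 A1 = 0011 is stored lowest-order bit first.
state = [1, 1, 0, 0]
print(next_up_state(state))  # [0, 0, 1, 0], i.e. A4 A3 A2 A1 = 0100
```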
To implement a synchronous counter, we need a flip-flop for every bit and an
AND gate for every bit except the first and the last bit. The diagram below shows
the implementation of a 4-bit synchronous up-counter.
4-bit Synchronous Binary Up-Counter
From the diagram above, we can see that although the counter is synchronous
and is supposed to change simultaneously, the propagation delays through the
AND gates add up to give an overall propagation delay which is proportional
to the number of bits of the counter. To overcome this problem, we can feed the
outputs from the flip-flops directly to a many-input AND gate as follows :
4-bit Synchronous Binary Up Counter using speed-up technique
This method does overcome the problem of additive propagation delay but
introduces some other problems of its own. From the diagram above, we can see that
the third flip-flop gets its JK input from the output of a 2-input AND gate,
the fourth flip-flop gets its input from a 3-input AND gate, and so on. If we
have a 16-bit counter, for example, we will need to have :
1 * 15-input AND gate,
1 * 14-input AND gate,
. . .
1 * 3-input AND gate and
1 * 2-input AND gate.
This method obviously uses a lot more resources than the first method. Not only
that, in the first method, the output from each flip-flop is only used as an
input to one AND gate. In the second method, the output from each flip-flop is
used as an input to the AND gates of all the higher-order bits. If we have a
12-bit counter, the output of the first flip-flop will have to drive 10 gates
(this load is called fan-out). The
output from the flip-flop may not have the power to do this.
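The gate sizes and the fan-out quoted above follow directly from the structure, as this small sketch shows. The helper names are hypothetical, and the count assumes one AND gate per flip-flop from the third flip-flop onwards, as described in the text.

```python
def speedup_and_gates(n_bits):
    """Input counts of the AND gates in an n-bit counter built with the
    speed-up technique: one gate per flip-flop from the third onwards."""
    return list(range(2, n_bits))


def first_flipflop_fanout(n_bits):
    """Number of AND gates driven by the lowest-order flip-flop output."""
    return len(speedup_and_gates(n_bits))


print(speedup_and_gates(16))       # gate sizes from 2 inputs up to 15 inputs
print(first_flipflop_fanout(12))   # 10
```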
The "solution" to this is to use a compromise between the two methods. Say
we have a 12-bit counter; we can organise it into 3 groups of 4. Within each
group of 4, we use the second method, and between the 3 groups, we use the first
method. This way, the overall gate propagation delay is shorter than with the
first method alone, and the maximum fan-out is 3 instead of the 10 required by
the second method alone.
There are many variations to the basic binary counter. The one described
above is the binary up counter (counts upwards). Besides the up counter, there
is the binary down counter, the binary up/down counter, the binary-coded-decimal
(BCD) counter etc. Any counter that counts in binary is called a binary
counter.
Binary Down Counters
In a binary up counter, a particular bit, except for the first bit,
toggles if all the lower-order bits are 1's. The opposite is true for binary
down counters. That is, a particular bit toggles if all the lower-order bits
are 0's and the first bit toggles on every pulse.
Taking an example, A_{4} A_{3} A_{2} A_{1}
= 0100. On the next count, A_{4} A_{3} A_{2} A_{1}
= 0011. A_{1}, the lowest-order bit, is always complemented. A_{2}
is complemented because all the lower-order positions (A_{1} only in this
case) are 0's. A_{3} is also complemented because all the lower-order
positions, A_{2} and A_{1}, are 0's. But A_{4} is not
complemented because the lower-order positions, A_{3} A_{2} A_{1}
= 100, do not give an all-0 condition.
4-bit Synchronous Binary Down Counter
The implementation of a synchronous binary down counter is exactly the
same as that of a synchronous binary up counter except that the inverted
output from each flip-flop is used. All the methods used to improve a binary
up counter can be similarly applied here.
Binary Up/Down Counters
The similarities between the implementation of a binary up counter and
a binary down counter lead to the possibility of a binary up/down counter,
which is a binary up counter and a binary down counter combined into one.
Since the difference is only in which output of the flip-flop to use, the
normal output or the inverted one, we use two AND gates for each flip-flop
to "choose" which of the outputs to use.
3-bit Synchronous Binary Up/Down Counter
From the diagram, we can see that COUNT-UP and COUNT-DOWN are used as
control inputs to determine whether the normal flip-flop outputs or the
inverted ones are fed into the JK inputs of the following flip-flops. If
neither is at logic level 1, the counter doesn't count and if both are at
logic level 1, all the bits of the counter toggle at every clock pulse.
The OR gate allows either of the two outputs which have been enabled to
be fed into the next flip-flop. As with the binary up and binary down
counters, the speed-up techniques apply.
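A behavioural sketch of the up/down selection is given below. The function `next_updown_state` and its bit ordering (index 0 is the lowest-order bit) are assumptions for illustration; the sketch is only meant for the normal case where at most one of the two control inputs is at logic level 1.

```python
def next_updown_state(bits, count_up, count_down):
    """One clock pulse of a synchronous up/down counter.

    bits[0] is the lowest-order bit. A stage toggles when
    (count_up AND all lower bits are 1) OR
    (count_down AND all lower bits are 0).
    """
    new_bits = []
    for i, bit in enumerate(bits):
        up_path = count_up and all(b == 1 for b in bits[:i])
        down_path = count_down and all(b == 0 for b in bits[:i])
        new_bits.append(bit ^ 1 if (up_path or down_path) else bit)
    return new_bits


state = [0, 0, 1]                          # A3 A2 A1 = 100
print(next_updown_state(state, 1, 0))      # [1, 0, 1] -> 101 (count up)
print(next_updown_state(state, 0, 1))      # [1, 1, 0] -> 011 (count down)
print(next_updown_state(state, 0, 0))      # [0, 0, 1] -> unchanged
```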
MOD-N/Divide-by-N Counters
A normal binary counter counts from 0 to 2^{N} - 1, where N is
the number of bits/flip-flops in the counter. In some cases, we want it
to count to numbers other than 2^{N} - 1. This can be done by
allowing the counter to skip states that are normally part of the counting
sequence. There are a few methods of doing this. One of the most common
methods is to use the CLEAR input on the flip-flops.
3-bit Synchronous Binary MOD-6 Counter
In the example above, we have a MOD-6 counter. Without the NAND gate,
it is a MOD-8 counter. With the NAND gate, the output from the NAND
gate is connected to the asynchronous CLEAR inputs of each flip-flop. The
inputs to the NAND gate are the outputs of the B and C flip-flops. So, all
the flip-flops will be cleared when B = C = 1 (110_{2} = 6_{10}).
When the counter goes from state 101 to state 110, the NAND
output will immediately clear the counter to state 000. Once the
flip-flops have been cleared, the B = C = 1 condition no longer exists and
the NAND output goes back to high. The counter will therefore count from
000 to 101, and for a very short period of time, be in state 110 before
the counter is cleared. This state is called the temporary state and the
counter usually only remains in a temporary state for a few nanoseconds.
We can essentially say that the counter skips 110 and 111 so that it goes
through only six different states; thus, it is a MOD-6 counter. We also have to
note that the temporary state causes a spike or glitch on the output
waveform of B. This glitch is very narrow and will not normally be a
problem unless it is used to drive other circuitry outside the counter.
The 111 state is the unused state here. In a state machine with unused
states, we need to make sure that the unused states do not cause the
system to hang, i.e. leave it with no way to get out of the state. We don't have to
worry about this here because even if the system does go to the 111 state,
B = C = 1 holds there as well, so the counter is cleared to state 000 (a valid state).
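The behaviour of the asynchronous clear can be sketched as follows. This is a simplified model, assuming the clear acts instantly; `mod6_clock` is a hypothetical helper that reports the temporary state separately from the settled one.

```python
def mod6_clock(state):
    """Apply one clock pulse to a 3-bit counter whose NAND gate clears it
    asynchronously when B = C = 1 (C is the most significant bit).

    Returns (settled_state, temporary_state); temporary_state is None
    when no clear takes place.
    """
    counted = (state + 1) % 8          # ordinary MOD-8 count
    b = (counted >> 1) & 1
    c = (counted >> 2) & 1
    if b and c:                        # NAND output goes LOW
        return 0, counted              # cleared almost immediately
    return counted, None


state, visited = 0, []
for _ in range(12):
    state, temporary = mod6_clock(state)
    visited.append(state)
print(visited)  # [1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0]
```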
Binary Coded Decimal (BCD) Counters
The BCD counter is just a special case of the MOD-N counter (N = 10).
BCD counters are very commonly used because most human beings count in
decimal. To make a digital clock which can tell the hour, minute and
second, for example, we need 3 BCD counters (for the second digit of the
hour, minute and second), two MOD-6 counters (for the first digit of the
minute and second), and one MOD-2 counter (for the first digit of the
hour).
Ring Counters
Ring counters are implemented using shift registers. A ring counter is essentially
a circulating shift register connected so that the last flip-flop shifts
its value into the first flip-flop. There is usually only a single 1
circulating in the register, as long as clock pulses are applied.
4-bit Synchronous Ring Counter
In the diagram above, assume a starting state of Q_{3} = 1
and Q_{2} = Q_{1} = Q_{0} = 0. At the first
pulse, the 1 shifts from Q_{3} to Q_{2} and the counter
is in the 0100 state. The next pulse produces the 0010 state and the
third, 0001. At the fourth pulse, the 1 at Q_{0} is transferred
back to Q_{3}, resulting in the 1000 state, which is the initial
state. Subsequent pulses will cause the sequence to repeat, hence the
name ring counter.
The ring counter above functions as a MOD-4 counter since it has four
distinct states and each flip-flop output waveform has a frequency equal
to one-fourth of the clock frequency. A ring counter can be constructed
for any MOD number. A MOD-N ring counter will require N flip-flops
connected in the same arrangement as in the diagram above.
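The circulation of the single 1 can be reproduced with a few lines of Python. `ring_step` is a hypothetical name, and the state is written as a list in the order Q_{3} Q_{2} Q_{1} Q_{0}.

```python
def ring_step(state):
    """Shift a ring counter by one clock pulse.

    state is a list written as Q3 Q2 Q1 Q0 (left to right); the last
    flip-flop (Q0) shifts its value back into the first (Q3).
    """
    return [state[-1]] + state[:-1]


state = [1, 0, 0, 0]
sequence = [state]
for _ in range(4):
    state = ring_step(state)
    sequence.append(state)
print(sequence)  # 1000, 0100, 0010, 0001 and back to 1000
```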
A ring counter requires more flip-flops than a binary counter for
the same MOD number. For example, a MOD-8 ring counter requires 8
flip-flops while a MOD-8 binary counter only requires 3 (2^{3}
= 8). So if a ring counter is less efficient in the use of flip-flops
than a binary counter, why do we still need ring counters? One main
reason is that ring counters are much easier to decode. In fact,
ring counters can be decoded without the use of logic gates. The decoding
signal for each state is obtained at the output of its corresponding flip-flop.
For the ring counter to operate properly, it must start with only
one flip-flop in the 1 state and all the others at 0. Since it is not
possible to expect the counter to come up to this state when power is
first applied to the circuit, it is necessary to preset the counter to
the required starting state before the clock pulses are applied. One
way to do this is to apply a pulse to the PRESET input of one of the
flip-flops and the CLEAR inputs of all the others. This will place a
single 1 in the ring counter.
Johnson/Twisted-Ring Counters
The Johnson counter, also known as the twisted-ring counter, is
exactly the same as the ring counter except that the inverted output of
the last flip-flop is connected to the input of the first flip-flop.
4-bit Synchronous Johnson Counter
The Johnson counter works in the following way : Take the initial
state of the counter to be 000. On the first clock pulse, the inverse of
the last flip-flop will be fed into the first flip-flop, producing the
state 100. On the second clock pulse, since the last flip-flop is still at
level 0, another 1 will be fed into the first flip-flop, giving the state
110. On the third clock pulse, the state 111 is produced. On the fourth
clock pulse, the inverse of the last flip-flop, now a 0, will be shifted
to the first flip-flop, giving the state 011. On the fifth and sixth
clock pulses, using the same reasoning, we will get the states 001 and
000, which is the initial state again. Hence, this Johnson counter has six
distinct states : 000, 100, 110, 111, 011 and 001, and the sequence is
repeated as long as clock pulses are applied. Thus this is a MOD-6 Johnson
counter.
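The sequence described above can be generated with a short sketch. The names `johnson_step` and `johnson_sequence` are illustrative only; the state is a list written first flip-flop first.

```python
def johnson_step(state):
    """Shift a Johnson counter by one clock pulse: the inverse of the
    last flip-flop is fed into the first."""
    return [1 - state[-1]] + state[:-1]


def johnson_sequence(n_flipflops):
    """Return the distinct states visited, starting from all zeros."""
    state = [0] * n_flipflops
    seen = []
    while state not in seen:
        seen.append(state)
        state = johnson_step(state)
    return seen


for s in johnson_sequence(3):
    print("".join(map(str, s)))   # 000, 100, 110, 111, 011, 001
```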
The MOD number of a Johnson counter is twice the number of flip-flops.
In the example above, three flip-flops were used to create the MOD-6
Johnson counter. So for a given MOD number, a Johnson counter requires
only half the number of flip-flops needed for a ring counter. However, a
Johnson counter requires decoding gates whereas a ring counter doesn't.
As with the binary counter, one logic gate (AND gate) is required to
decode each state, but with the Johnson counter, each gate requires only
two inputs, regardless of the number of flip-flops in the counter. (Note
that we are comparing with the binary counter using the speed-up
technique discussed above.) The reason for this is that for each state,
two of the N flip-flops used will be in a unique combination of states.
In the example above, the combination Q_{2} = Q_{0} = 0
occurs only once in the counting sequence, at the count of 0, because the state
010 does not occur. Thus, an AND gate with inputs (not Q_{2}) and
(not Q_{0}) can be used to decode this state. The same
characteristic is shared by all the other states in the sequence.
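The two-input decoding property can be checked by brute force. In this sketch (hypothetical helper names), positions are counted from the left of the state as written, so position 0 is the first flip-flop and position n - 1 the last.

```python
from itertools import combinations


def johnson_states(n):
    """States of an n-flip-flop Johnson counter, starting from all zeros."""
    state, seen = [0] * n, []
    while state not in seen:
        seen.append(state)
        state = [1 - state[-1]] + state[:-1]
    return seen


def two_input_decoders(n):
    """For each state, find a pair of flip-flop positions whose values
    occur together in that state only. Returns {state: (i, j)}."""
    states = johnson_states(n)
    decoders = {}
    for s in states:
        for i, j in combinations(range(n), 2):
            matches = [t for t in states if t[i] == s[i] and t[j] == s[j]]
            if matches == [s]:
                decoders[tuple(s)] = (i, j)
                break
    return decoders


print(two_input_decoders(3)[(0, 0, 0)])  # (0, 2): first and last flip-flops
```

Every state of the sequence receives an entry, which is the sense in which two inputs per gate are enough.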
Johnson counters represent a middle ground between ring counters
and binary counters. A Johnson counter requires fewer flip-flops than a
ring counter but generally more than a binary counter; it has more
decoding circuitry than a ring counter but less than a binary counter.
Thus, it sometimes represents a logical choice for certain applications.
Loadable/Presettable Counters
Many synchronous counters available as ICs are designed to be
presettable. This means that they can be preset to any desired starting
value. This can be done either asynchronously (independent of the clock
signal) or synchronously (on the active transition of the clock signal).
This presetting operation is also known as loading, hence the name
loadable counter. The diagram below shows a 3-bit asynchronously
presettable synchronous up counter.
3-bit Synchronous Binary Presettable Counter
In the diagram above, the J, K and CLK inputs are wired the same
way as in a synchronous up counter. The asynchronous PRESET and CLEAR
inputs are used to perform the asynchronous presetting. The counter is
loaded by applying the desired binary number to the inputs P_{2},
P_{1} and P_{0} and applying a LOW pulse to the
PARALLEL LOAD input, not(PL). This will asynchronously transfer
P_{2}, P_{1} and P_{0} into the flip-flops. This
transfer occurs independently of the J, K, and CLK inputs. As long as
not(PL) remains in the LOW state, the CLK input has no effect on the
flip-flops. After not(PL) returns to HIGH, the counter resumes counting,
starting from the number that was loaded into the counter.
For the example above, say that P_{2} = 1, P_{1} =
0, and P_{0} = 1. When not(PL) is HIGH, these inputs have no
effect. The counter will perform normal count-up operations if there
are clock pulses. Now let's say that not(PL) goes LOW at Q_{2}
= 0, Q_{1} = 1 and Q_{0} = 0. This will produce LOW
states at the CLEAR input of Q_{1}, and the PRESET inputs of
Q_{2} and Q_{0}. This will make the counter go to state
101 regardless of what is occurring at the CLK input. The counter will
remain at state 101 until not(PL) goes back to HIGH. The counter will
then continue counting from 101.
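The loading example can be traced with a simplified model. It is only a sketch: the real load is asynchronous, whereas this hypothetical `presettable_clock` function samples not(PL) at each clock pulse, which is enough to show the override and the resumption of counting.

```python
def presettable_clock(q, p, pl_n):
    """One clock pulse of a 3-bit asynchronously presettable up counter.

    q    -- current count (0..7)
    p    -- value on the parallel inputs P2 P1 P0
    pl_n -- level of the active-LOW load input, not(PL)
    """
    if pl_n == 0:
        return p            # PRESET/CLEAR override the clock
    return (q + 1) % 8      # normal count-up


q = 0b010
q = presettable_clock(q, 0b101, pl_n=0)   # load: 101
q = presettable_clock(q, 0b101, pl_n=0)   # still 101 while not(PL) is LOW
q = presettable_clock(q, 0b101, pl_n=1)   # counting resumes: 110
print(format(q, "03b"))                   # 110
```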
A Comparison between Synchronous and Asynchronous Counters
Asynchronous counters, also known as ripple counters, are not
clocked by a common pulse and hence every flip-flop in the counter
changes at a different time. The flip-flops in an asynchronous counter
are usually clocked by the output pulse of the preceding flip-flop.
The first flip-flop is clocked by an external event. A synchronous
counter, however, has an internal clock, and the external event is used
to produce a pulse which is synchronised with this internal clock.
The diagram of a ripple counter is shown below.
4-bit Ripple Counter
It can be seen that a ripple counter requires less circuitry than
a synchronous counter. No logic gates are used at all in the example
above. Although the asynchronous counter is easier to construct, it
has some major disadvantages compared with the synchronous counter.
First of all, the asynchronous counter is slow. In a synchronous
counter, all the flip-flops will change states simultaneously while
for an asynchronous counter, the propagation delays of the flip-flops
add together to produce the overall delay. Hence, the more bits or
number of flip-flops in an asynchronous counter, the slower it will
be.
Secondly, there are certain "risks" when using an asynchronous
counter. In a complex system, many state changes occur on each clock
edge and some ICs respond faster than others. If an external event
is allowed to affect a system whenever it occurs (unsynchronised),
there is a small chance that it will occur near a clock transition,
after some ICs have responded, but before others have. This
intermingling of transitions often causes erroneous operations. And
worst of all, these problems are difficult to foresee and
test for because of the random time difference between the events.
Synchronous Counter Design
A synchronous counter usually consists of two parts: the memory
element and the combinational element. The memory element is
implemented using flip-flops while the combinational element can
be implemented in a number of ways. Using logic gates is the
traditional method of implementing combinational logic and has
been applied for decades. Since this method often results in
minimum component cost for many combinational systems, it is still
a popular approach. However, there are other methods of implementing
combinational logic which offer other advantages. Some of the
alternative methods which are discussed here are: multiplexers
(MUX), read-only memory (ROM) and programmable logic array (PLA).
Multiplexer
The multiplexer, also called the data selector, has n select
inputs, 2^{n} input lines and 1 output line (and usually
also a complement of the output). Each of the 2^{n} possible
combinations of the select inputs connects one of the input lines
to the output. When used as a combinational logic device, the n
select inputs represent n variables and the 2^{n} input
lines represent all the minterms of the n variables.
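As a rough illustration of a multiplexer used as combinational logic, the sketch below ties each of the 2^{n} input lines to the value the function takes for that minterm. The function `mux` and the majority function are examples chosen here, not taken from the original design.

```python
def mux(select_bits, inputs):
    """n select inputs choose one of 2**n input lines."""
    index = int("".join(str(b) for b in select_bits), 2)
    return inputs[index]


# Function of three variables (A, B, C) that is 1 for minterms 3, 5, 6 and 7
# (a majority function, chosen only as an example).
MAJORITY_INPUTS = [0, 0, 0, 1, 0, 1, 1, 1]

print(mux([1, 0, 1], MAJORITY_INPUTS))  # minterm 5 -> 1
```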
Read-Only Memory
The ROM is usually used as a storage unit for fixed programs in a
computer. However, it can also be used to implement combinational
logic. It is useful for systems requiring changeable functions.
When a different function is required, a different ROM producing
this function can be plugged into the circuit. No wiring change
is necessary. The ROM has n input lines pointing to 2^{n}
locations within the ROM that store words of M bits. As with the
MUX, each input line is used to represent a variable and the
2^{n} locations represent the minterms.
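A ROM-based counter can be sketched as a lookup table whose address is the present state and whose stored word is the next state. The table contents below are a hypothetical MOD-6 example; swapping in a different table changes the counting sequence without touching the rest of the code, which mirrors plugging in a different ROM.

```python
# ROM contents: address = present state, stored word = next state.
# Addresses 6 and 7 are unused states and are sent back to 0.
MOD6_ROM = [1, 2, 3, 4, 5, 0, 0, 0]


def rom_counter(rom, state, pulses):
    """Clock a counter whose combinational logic is a ROM lookup."""
    visited = []
    for _ in range(pulses):
        state = rom[state]      # the flip-flops latch the addressed word
        visited.append(state)
    return visited


print(rom_counter(MOD6_ROM, 0, 7))  # [1, 2, 3, 4, 5, 0, 1]
```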
Programmable Logic Array
The PLA is very similar to the ROM. It can be thought of as a ROM
with a large percentage of its locations deleted. A ROM with 16
input address lines must have 2^{16}, or 65,536 storage
locations, and all the words stored in these have to be decoded.
The PLA only decodes a small percentage of the minterms. The PLA
is sometimes used to produce a system with a small number of chips
in a minimum time.
More information on these devices is given in
article 2 of cwl3.
Making Fast Counters
Where speed is a concern....
In certain applications, speed is an important factor affecting the choice
of a counter. For example, counters used in communication and certain instrumentation applications are necessarily fast. We will be looking at some techniques
commonly used to improve the speed of a counter. To reinforce the concepts presented, some commercial counters (by Xilinx) will be considered.
General Structure of a Synchronous Binary Counter
There are two common ways in which a synchronous binary counter is structured,
namely the series carry synchronous counter and the parallel carry synchronous
counter. These two counters are illustrated as follows :
Series Carry Synchronous Counter
Parallel Carry Synchronous Counter
Both counters depicted above are binary up counters.
The T implies a T flip-flop. The
flip-flop complements/toggles its output on the rising edge of a clock pulse
provided its enable (EN) input is high.
From the diagrams, it can be seen that the least significant bit Q0
toggles on every clock pulse, and subsequent bits toggle when all
preceding bits are high. The important distinction between the two counters
is the way the EN signals propagate from Q0 to Q3. This is
illustrated by the highlighted paths. The signals are propagated
serially in the first case and in parallel (to each AND gate) in the second.
The parallel carry scheme results in a much faster counter. This difference
in speed is accounted for by the delay encountered during the propagation
of the EN signals. To illustrate the worst case delay in both cases,
we consider a change in Q0 from 0 to 1 (see diagrams above).
In the series carry scheme, the time to propagate the change in
Q0 must take into account the propagation delays of the 3 AND
gates (A, B, C). In the parallel carry scheme, only the propagation delay of
1 AND gate has to be considered. Therefore, the
minimum clock period of the parallel
scheme is shorter. Thus, the parallel carry synchronous counter operates at a
greater maximum frequency. This structure is believed to be the fastest
synchronous binary counter structure. In applications that require speed,
this scheme is commonly used.
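The comparison can be made concrete with a simple timing model. The delay values below are hypothetical numbers chosen for illustration, and the model (flip-flop delay plus AND gate delays plus optional set-up time) is a simplifying assumption rather than data for any real device.

```python
def min_clock_period(t_flipflop, t_and, and_gates_in_path, t_setup=0.0):
    """Minimum clock period in ns: flip-flop delay, plus the AND gates the
    enable signal passes through, plus the flip-flop set-up time."""
    return t_flipflop + and_gates_in_path * t_and + t_setup


# Hypothetical delays in nanoseconds, for illustration only.
T_FF, T_AND = 10.0, 5.0
series = min_clock_period(T_FF, T_AND, and_gates_in_path=3)     # gates A, B, C
parallel = min_clock_period(T_FF, T_AND, and_gates_in_path=1)
print(series, parallel)                 # 25.0 15.0
print(1000 / series, 1000 / parallel)   # maximum frequencies in MHz
```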
This structure does have limitations. From the diagrams, it can be seen that
a single flip-flop output (consider Q0)
has to drive a number of subsequent AND gates. The output current of a
flip-flop may not be large enough to drive that many gates. It becomes a
problem when the counter gets bigger. To overcome this, a tree of AND gates
is usually used. Exactly what this tree looks like is an engineering
choice. This choice will reflect the tradeoff between speed requirements and
the constraint mentioned above.
Although the series carry scheme is slower, it does not suffer the
same drawback as the parallel carry scheme. This makes it a suitable basis
for making big counters. Its speed can be improved by using some form
of Prescaling. This technique will be considered in subsequent
sections.
Prescaling
" The Concept "
The idea of prescaling is to provide a
"prescaling" stage between the incoming
clock frequency and the counting circuit. The prescaling
stage is sometimes provided by a dedicated prescaling device known as
the Prescaler. This device/circuit is
designed primarily using Emitter Coupled Logic (ECL).
ECL benefits from very fast switching capabilities. This makes
it suitable for high speed counting work.
Despite its suitability to high speed counting work, the prescaler has few or
no counting features since such features would only impede its operating
speed. The reader does not have to concern himself or herself with
the implementation of the prescaler. The reader should, however, understand
the function it performs in the overall counter.
A prescaler generates a "clock" pulse after it has received a number
of input pulses. This "clock" pulse is then fed to the counting
circuit. For example, a divide-by-n
prescaler will generate a pulse when it has received
n input pulses. At present, there are prescalers
that can accept frequencies ranging from a few hundred
Megahertz to a few Gigahertz. The point of the prescaler is to divide
an incoming clock and thereby provide a clock to a larger, slower
counting circuit.
The curious reader would probably be wondering how the actual
(and faster) incoming clock frequency is reflected in the
slower counting circuit. There are a number of ways in which a prescaler
can be used, but one sophisticated setup is the
"pulse swallowing" counter. The characteristic of a
"pulse swallowing" counter is that it stops counting when a predetermined
number of pulses has been received.
The following diagram shows a down-counting Binary Coded Decimal (BCD)
counter in a simplified "pulse swallowing" setup.
BCD Pulse Swallowing Counter
In the above setup, the Tens and Units sections of a BCD
counter are shown. Note that a section stops counting when zero has
been reached. At that point, a carry is also generated (UC and TC).
Both sections are presettable via P3-P0.
The outputs (Q3-Q0) reset to the preset
values when Pe is high. UC is
fed back to the prescaler as the Mode (M)
input signal. When M is high or low, the prescaler divides by
10 or 11 respectively
before generating a "clock" pulse. To demonstrate the principle of
"pulse swallowing", let's consider an example.
Suppose we preset a value of 32 (0011 0010). The outputs will have
values as shown below :
Clock pulses received   Tens   Units   Mode(M)   Decimal Value
0                       0011   0010    0         32
11                      0010   0001    0         21
22                      0001   0000    1         10
32                      0000   0000    1         00
Effectively, a "pulse swallowing" counter "swallows up" fast incoming
clock pulses. This is reflected in the slower counter by simultaneously
driving the Tens and Units section. Therefore the net effect of such a
combination (of prescaler and counter) is a counter operating at a much
higher speed than what it was capable of alone.
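The preset-32 example can be replayed with a small simulation. This is a sketch under stated assumptions: `pulse_swallow` is a hypothetical function, both sections decrement once per prescaler output pulse until they reach zero, and the Tens preset is at least as large as the Units preset.

```python
def pulse_swallow(tens, units):
    """Simulate the down-counting BCD pulse-swallowing set-up.

    Returns a list of (input_pulses, tens, units, mode) rows, where
    mode is the M signal (1 once the Units section has reached zero).
    Assumes tens >= units, as in the preset-32 example.
    """
    pulses = 0
    rows = [(pulses, tens, units, int(units == 0))]
    while tens > 0 or units > 0:
        mode = int(units == 0)
        pulses += 10 if mode else 11    # prescaler divides by 10 or 11
        if units > 0:
            units -= 1                  # a section stops at zero
        if tens > 0:
            tens -= 1
        rows.append((pulses, tens, units, int(units == 0)))
    return rows


for row in pulse_swallow(3, 2):
    print(row)   # (0, 3, 2, 0), (11, 2, 1, 0), (22, 1, 0, 1), (32, 0, 0, 1)
```

With a preset of T tens and U units, the prescaler divides by 11 for U cycles and by 10 for the remaining T - U cycles, so 11U + 10(T - U) = 10T + U input pulses are counted in total.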
Pipelining
Pipelining is a "predict and store"
technique. It "predicts" an event one(usually) clock cycle before it is
to occur. Upon prediction, certain output value(s) (resulting that from that
event) are set. These new value(s) are stored/latched using
flipflops (usually D type). They appear at the outputs on the next clock
pulse when the event actually occurs. How does this actually help in speeding
things up?
Let's say the detection of an event and the setting of the required
outputs take 20 ns. The propagation of the outputs takes another
10 ns.
Consider the two situations where pipelining is used and not used.
If the above actions had to be performed in one clock cycle, the minimum
clock period would be 30 ns (without pipelining). If these two sets of
actions were performed in two separate clock periods, the minimum clock
period would be 20 ns (with pipelining), since the slower of the two sets
determines it. With pipelining, the overall
frequency/speed of the circuit is improved. This is illustrated
schematically as follows :
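The arithmetic of this example can also be written out as a minimal sketch. The helper name `min_period_ns` is an assumption; the two delays are the 20 ns and 10 ns figures from the text.

```python
def min_period_ns(stage_delays_ns, pipelined):
    """With pipelining each stage gets its own clock cycle, so the slowest
    stage sets the period; without it the delays add up."""
    return max(stage_delays_ns) if pipelined else sum(stage_delays_ns)


stages = [20, 10]   # detect event and set outputs, then propagate outputs
print(min_period_ns(stages, pipelined=False))  # 30
print(min_period_ns(stages, pipelined=True))   # 20
```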
" Pipelining speeds things up!! "
Fast Counters From Xilinx
# Synchronous Presettable Counter
(Xilinx Application Notes XAPP 003.002)
Maximum Clock Frequency
8 bits : 71 MHz
16 bits : 55 MHz
This counter demonstrates the parallel carry synchronous counter structure
and the pipelining technique.
Let's consider an upcounting version of this counter.
Presettable Up Counter
Q counter bits
D preset via these inputs
At first sight, it looks complicated but the reader may have noticed
that there are many similar blocks of logic circuitry. Let's take a look
at some of these blocks and see how they work.
Consider block producing Q0
Block Producing Q0 (least significant bit)
TERMINAL COUNT high
As seen below, an inverted version of Q0 is propagated through AND gate A.
D_{0} is not propagated through B because an inverted
version of TERMINAL COUNT is fed into B. Therefore the output of B is low.
With this setup, Q0 toggles on every rising edge of the clock pulse.
when TERMINAL COUNT high
TERMINAL COUNT low
As seen below, the inverted version of Q0 is not propagated through A.
D_{0} is propagated through B because an inverted version of
TERMINAL COUNT is fed into B. The output of the OR gate
will have the value of D_{0}. Therefore, Q0 will have the value of D_{0} on
the next clock pulse. It is noted that the preset value D appears as the output Q
on the next clock pulse (after terminal count). This applies to all bit stages.
when TERMINAL COUNT low
Consider block producing Q3
Block Producing Q3
TERMINAL COUNT high
As seen below, the output of the EXOR gate C will be propagated through A.
The output of C is high when either (but not both) T_{3} or Q3 is high. T_{3}
is the ANDed version of all preceding outputs (Q0-Q2). (Note that in the Q1
stage, the T input is replaced by Q0.) Effectively, Q3 stays the same when T_{3}
is low. When T_{3} is high, Q3 toggles.
Therefore, in all the bit stages, an output bit toggles when all the preceding
bits are high.
when TERMINAL COUNT high
TERMINAL COUNT low
The preset value is loaded on the next clock pulse as before.
Consider the Carry connections
The Carry Connections
Let's focus our attention on the generation of the
T outputs (see above). This counter uses an adapted version of the
parallel carry scheme by employing an AND gate tree. The different outputs
driving the AND gates are summarised schematically as follows :
AND Gate Tree Diagrams
In this setup, Q0 is fed directly to the
next bit stage and in parallel to all the T (T2-T6)
AND gates. This minimises the worst case delay
(compare with the series carry scheme). Subsequent bits feed in
parallel to the relevant AND gates. The additional gate delay
introduced by T_{x} does not affect the critical paths from Q0
to Q7 because of the way the numbers change.
Consider the Pipelining block
Pipelining TERMINAL COUNT
When the counter output is 11111110, the NAND gate output is low. Therefore,
when the value preceding terminal count is detected, the required
TERMINAL COUNT value (low) is
fed to the input of the flip-flop. On the next clock pulse (when it is
terminal count), TERMINAL COUNT is low
("load preset value"). This propagates the preset values (D0-D7) to the inputs of the flip-flops.
Note : when any other values are detected, the NAND gate output is
high. Thus TERMINAL COUNT is high
("do not load preset value").
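The interaction between the bit stages and the pipelined TERMINAL COUNT can be sketched behaviourally. This is not the Xilinx netlist: `presettable_up_counter` is a hypothetical model that assumes the counter powers up at the preset value with TERMINAL COUNT high, and it replaces the per-bit EXOR/T logic with an equivalent increment.

```python
def presettable_up_counter(preset, clocks, width=8):
    """Simulate the counter: each bit either toggles (TERMINAL COUNT high)
    or loads its preset bit (TERMINAL COUNT low). TERMINAL COUNT is a
    registered signal, set low one clock after the value 2**width - 2
    (11111110 for 8 bits) is detected.
    """
    top = (1 << width) - 1
    count, terminal_count = preset, 1      # assumed power-up values
    history = []
    for _ in range(clocks):
        next_tc = 0 if count == top - 1 else 1        # NAND gate output
        if terminal_count:
            next_count = (count + 1) & top            # Q xor T for each bit
        else:
            next_count = preset                       # D propagates
        count, terminal_count = next_count, next_tc
        history.append((count, terminal_count))
    return history


for count, tc in presettable_up_counter(250, 8):
    print(count, tc)   # counts 251 .. 255, reloads 250, then 251, 252
```

TERMINAL COUNT is low only while the count is 255, and the preset value appears on the following clock pulse.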
# High-Speed Synchronous Prescaler Counter
(Xilinx Application Notes 001.002)
Max Clock Frequency
8 bits : 200 MHz
16 bits : 115 MHz
The counter demonstrates prescaling and pipelining.
The counter can be represented in a block diagram as follows :
Non-Loadable Binary Counter
The counter is implemented on a Field Programmable Gate Array (FPGA).
This requires it to be implemented as tribit blocks (TB1 and TB2)
for optimal resource usage. The reader does not need to concern
himself/herself with this.
The counter employs the concept of prescaling
but does not use a dedicated prescaler (ECL device). Instead,
the least significant (LS) tribit (Q0-Q2) provides the prescaling function.
Each tribit responds (increments) to a clock pulse if its Count-Enable inputs
(CE, CEP, CET) are high. The CEO of the LS tribit is high once in
every 8 clock cycles, when all its outputs are high. The
"prescaler" pulse effectively reduces the clock rate to the rest of the
tribits by a factor of 8. Note that there is no change in the
original clock rate. The 7 clock cycles when the LS CEO is low give the
CEO-CET ripple chain
(of subsequent tribits) time to settle. If this prescaling
was not done, the settling time would have to be taken into account
when determining the minimum clock period of the counter. This
would significantly lengthen the minimum clock period, thereby slowing
the counter down. This will become clearer when we examine the actual
implementation of this counter.
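The prescaling role of the LS tribit can be illustrated with a behavioural sketch. The function `prescaled_counter` is hypothetical; it treats the upper tribits as a single block that increments only on clocks where the LS CEO is high, which is the assumption being illustrated.

```python
def prescaled_counter(clocks, upper_bits=5):
    """Count with a 3-bit LS tribit acting as prescaler.

    The LS tribit increments on every clock. Its CEO is high when its
    three outputs are all high; only then do the upper bits increment.
    Returns (value, ceo_high_count).
    """
    ls, upper, ceo_count = 0, 0, 0
    for _ in range(clocks):
        ceo = int(ls == 0b111)
        ceo_count += ceo
        if ceo:
            upper = (upper + 1) % (1 << upper_bits)
        ls = (ls + 1) % 8
    return (upper << 3) | ls, ceo_count


print(prescaled_counter(64))   # (64, 8): CEO was high once in every 8 clocks
```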
TB1
TB2
Note : all clock inputs are assumed to be driven by a common clock.
Qa and Qc represent the LSB and MSB of a tribit
respectively.
Generation of CEO in TB1(A) and TB2(B)
CEOs are high only when CEs (or CETs) and the outputs Q (Q_{a}-Q_{c}) are high.
Generation of Q_{a} in TB1(A) and TB2(B)
When the Count-Enable inputs are high, the EXOR gate complements
Q_{a}. Therefore, the Count-Enable
inputs effectively "enable" or "disable" the complementing function.
Generation of Q_{b} in TB1(A) and TB2(B)
When both Q_{a} and the Count-Enable
inputs are high, the complementing function is enabled.
Generation of Q_{c}
The generation of Q_{c} is similar to the generation of Q_{b} except that
the value of Q_{b} is also fed into the AND gate.
The Ripple Chain
The delay of this ripple chain is the sum of all the gate delays
presented by the chain of AND gates. To appreciate the significance
of this delay, let's consider an example. Suppose the current values
of Q23-Q3 are 01111...(all ones)...110. On the next "prescaler" pulse,
Q3 will become 1. This "information" about the change in Q3
has to be propagated to subsequent tribits before the next "prescaler"
pulse. This is necessary to ensure the correct
changes (on that pulse) to subsequent bits. As evident from
the diagrams, the worst case delay is the propagation of this
"information" from Q3 through the ripple chain and to the flip-flop
input of Q23. This delay is a major and common problem with most
binary counters. The "prescaler" accommodates this delay
by allowing time for the propagation of this "information". This does not
affect the effective operating frequency of the counter because the
"prescaler" (LS tribit) still operates at the faster clock rate.
The speed of the counter can be improved further by
pipelining the LS CEO signal :
Pipelining CEO
We see that when 110 is detected, the CEO value is
set and fed to the flip-flop input. This value appears as CEO on the next clock pulse (when 111 occurs).
The actual implementation of the LS tri-bit with pipelining is seen below :
Implementation of LS tri-bit with Pipeline
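The effect of pipelining CEO can be checked with a few lines of simulation. This is a minimal sketch under the assumption that the LS tri-bit is always enabled and starts at 000 with the pipeline flip-flop cleared. The names simulate and ceo_reg are made up for the illustration.

```python
def simulate(cycles):
    """Return (tri-bit value, registered CEO) for each clock cycle."""
    q = 0          # LS tri-bit value (Q2 Q1 Q0)
    ceo_reg = 0    # pipeline flip-flop that drives CEO
    rows = []
    for _ in range(cycles):
        rows.append((q, ceo_reg))
        d = 1 if q == 0b110 else 0     # decode one count early
        q, ceo_reg = (q + 1) % 8, d    # both update on the same clock pulse
    return rows


if __name__ == "__main__":
    for q, ceo in simulate(10):
        print(format(q, "03b"), ceo)
```

CEO comes straight from a flip-flop and is high exactly while the tri-bit holds 111, so the decoding gate delay no longer sits between the outputs and the CEO line.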
# Ultra-Fast Synchronous Counter
(Xilinx Application Notes XAPP 014.001)
Maximum Clock Frequency
8 bits : 256MHz
16 bits : 108MHz
UltraFast Counter
In this counter, the LS bit Q0 acts as the "prescaler". The effective
"clock" rate provided by this prescaler is 1/2 of the actual clock
rate.
In the previous example, the distribution of the CEO signal (from the LS
to the MS tri-bit) introduces a line transmission delay. This
counter eliminates the delay by replicating Q0 for the bits after Q1.
This is done by the following chain/network of flip-flops :
Network To Replicate Q_{0}
To best describe the function of such a network, let's take a look
at the timing diagram depicting the output values :
The Timing Diagram
It is seen that all QX0 outputs are in sync with Q0 after the initial
delays. The effect is that the bits after Q1 appear to be driven
directly by Q0, without the line transmission delay. This shortens
the minimum clock period.
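The network figure is not reproduced here, so the sketch below does not claim to show the actual Xilinx wiring. It illustrates one possible way to replicate a toggling bit, under the assumption that each replica flip-flop simply loads the complement of the stage before it. The names clock and in_sync_after are made up for the example.

```python
def clock(q0, replicas):
    """One clock pulse: Q0 toggles, each replica loads the complement
    of the stage before it (assumed wiring, see the text above)."""
    stages = [q0] + replicas
    new_replicas = [1 - stages[k] for k in range(len(replicas))]
    return 1 - q0, new_replicas


def in_sync_after(q0, replicas, clocks):
    """True if every replica equals Q0 after the given number of clocks."""
    for _ in range(clocks):
        q0, replicas = clock(q0, replicas)
    return all(r == q0 for r in replicas)


if __name__ == "__main__":
    # Arbitrary power-up values: not yet in sync after 1 clock, in sync after 3.
    print(in_sync_after(0, [1, 0, 1], 1), in_sync_after(0, [1, 0, 1], 3))
```

Under this assumption, the replica at position k follows Q0 from clock k onwards, whatever its power-up value. Each replica is driven by a neighbouring flip-flop rather than by one long line from Q0.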
The Second "Prescaler"
Here Q1 and Q2 act as the second "prescaler". This additional prescaler is
needed to accommodate a larger counter (more bits). The effective "clock"
rate provided by this prescaler to the rest of the counter is 1/8 of
the actual clock rate. This second prescaling stage allows the rest of the
counting circuit to employ the series carry scheme. The use of such a
carry scheme allows a larger counter to be constructed.
From the diagram, CEP2 is pipelined. When the LS three bits (Q2-Q0) are
101, the output of AND gate A is high. Since Q0 is high at
this point, the value of A is selected and appears at the output of
multiplexer B. This value is fed to the flip-flop input. On the
next clock cycle, the value appears as CEP2 (high). At this point, the
LS three bits are 110. Since QY01 is low, CEP2 is selected by the multiplexer.
Thus, on the next clock cycle, CEP2 is high again. This is summarised
below :
Q2  Q1  Q0  A  D-input (flip-flop)  CEP2
1   0   0   1  0                    0
1   0   1   1  1                    0
1   1   0   0  1                    1
1   1   1   0  0                    1
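The table can be reproduced with a short simulation. This is a sketch under two assumptions inferred from the table rather than taken from the diagram: AND gate A computes Q2 AND (NOT Q1), and multiplexer B is steered by Q0 (or a replica of it), passing A when the select is high and the current CEP2 when it is low. The name simulate is made up for the example.

```python
def simulate(cycles):
    """Return rows of (Q2, Q1, Q0, A, D, CEP2), one per clock cycle."""
    count = 0   # value of Q2 Q1 Q0, assumed to start at 000
    cep2 = 0    # output of the pipeline flip-flop, assumed cleared
    rows = []
    for _ in range(cycles):
        q2, q1, q0 = (count >> 2) & 1, (count >> 1) & 1, count & 1
        a = q2 & (1 - q1)          # AND gate A (inferred from the table)
        d = a if q0 else cep2      # multiplexer B (select assumed to be Q0)
        rows.append((q2, q1, q0, a, d, cep2))
        count, cep2 = (count + 1) % 8, d   # clock pulse
    return rows


if __name__ == "__main__":
    for row in simulate(8)[4:]:
        print(*row)
```

Under these assumptions CEP2 is high exactly while Q2 and Q1 are both 1, that is for the counts 110 and 111, and it is supplied directly by a flip-flop.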
References
1. Title: Counting and Counters
   Author(s): R M M Oberman
   Source: The Macmillan Press Ltd

2. Title: Electronic Counters
   Author(s): R M M Oberman
   Source: The Macmillan Press Ltd

3. Title: Logic Design Principles
   Author(s): Edward J. McCluskey
   Source: Prentice Hall International

4. Title: Digital System Design
   Author(s): Barry Wilkinson with Rafic Makki
   Source: Prentice Hall International

5. Title: Digital Design : Principles and Practices
   Author(s): John F. Wakerly
   Source: Prentice Hall International

6. Title: Digital Systems : Principles and Practices
   Author(s): Ronald J. Tocci
   Source: Prentice Hall International

7. Title: Practical Digital Design Using ICs
   Author(s): Joseph D. Greenfield
   Source: John Wiley & Sons

8. Title: Digital Logic Design
   Author(s): Brian Holdsworth
   Source: Butterworth-Heinemann Ltd

9. Title: Digital Design
   Author(s): M. Morris Mano
   Source: Prentice Hall International

10. Title: Logic Design Principles
    Author(s): Edward J. McCluskey
    Source: Prentice Hall International

11. Title: Digital Electronics
    Author(s): Christopher E. Strangio
    Source: Prentice Hall International

12. Title: Digital Logic and State Machine Design
    Author(s): David J. Comer
    Source: Saunders College Publishing

13. Title: Digital Circuits and Microprocessors
    Author(s): Herbert Taub
    Source: Prentice Hall International

14. Title: The Programmable Logic Data Book
    Author(s): Xilinx Inc.
    Source: Xilinx Inc.

15. Title: Digital Logic and Computer Design
    Author(s): M. Morris Mano
    Source: Prentice Hall International

16. Title: ISE Second Year Digital Electronics Notes
    Author(s): Mike Brookes
    Source: Mike Brookes