Fast Counter Techniques

an article by Andrew Long

June 1996


General Structure of a Synchronous Counter

The Pulse Swallowing Technique


Some Practical Counters



This article addresses the issue of counter speed. In certain applications, speed is an important factor in the choice of a counter. For example, counters used in communication applications (e.g. the phase-locked loop generator in a TV) must be fast. Techniques used to increase the speed of a synchronous counter will be discussed, and some practical counters from Xilinx will illustrate how these techniques are used and adapted in commercial counters.


There are two common ways to structure a synchronous counter: the series carry synchronous counter and the parallel carry synchronous counter. These two counters are illustrated as follows:

The T denotes a T Flip-flop. The Flip-flop toggles (complements) its output when the clock pulse (CLK) goes high. The key here is that the Flip-flop only responds to the clock pulse when its enable input (EN) is high. The enable inputs are controlled by the counter enable signal (CNTEN) and the outputs of the preceding (lower-order) Flip-flops. It can be deduced that, while CNTEN is high, the least significant bit Q0 toggles on every clock pulse and each subsequent bit toggles only when all preceding bits are high. Therefore, these are up counters.

The counters are named for the way the EN signals are propagated from the least significant bits to the most significant bits. Although they are functionally the same, the parallel carry scheme results in a much faster counter.
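The claim that the two schemes are functionally the same can be checked with a small behavioural model. The following is only a sketch in Python, not the circuit from the figures: the function names (`series_enables`, `parallel_enables`, `clock`, `value`) and the 4-bit width are assumptions chosen for illustration, and gate delays are ignored entirely.

```python
def series_enables(q, cnten=1):
    """Series carry: EN[i] = EN[i-1] AND Q[i-1], a chain of 2-input AND gates."""
    en = [cnten]
    for i in range(1, len(q)):
        en.append(en[i - 1] & q[i - 1])
    return en


def parallel_enables(q, cnten=1):
    """Parallel carry: EN[i] = CNTEN AND Q[0] AND ... AND Q[i-1], one wide AND per bit."""
    en = []
    for i in range(len(q)):
        e = cnten
        for bit in q[:i]:
            e &= bit
        en.append(e)
    return en


def clock(q, enables):
    """One clock pulse: each T Flip-flop toggles only if its EN input is high."""
    return [bit ^ e for bit, e in zip(q, enables)]


def value(q):
    """Interpret the bit list (Q0 first) as an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(q))


q = [0, 0, 0, 0]  # Q0..Q3, hypothetical 4-bit counter
counts = []
for _ in range(18):
    counts.append(value(q))
    # Both schemes compute the same enable signals for every state.
    assert series_enables(q) == parallel_enables(q)
    q = clock(q, parallel_enables(q))

print(counts)
```

The model counts upward and wraps around after 15. The two schemes differ only in how long the enable signals take to settle, which a purely logical model like this one cannot show.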

The difference in speed arises because the minimum clock period of the series carry scheme is limited by the propagation delay of one Flip-flop plus the sum of the propagation delays of the AND gates in the chain. As the number of bits increases (a larger counter), the overall propagation delay increases, slowing the counter down.

The parallel carry scheme does not suffer from this drawback. Its minimum clock period is limited by the propagation delays of one Flip-flop and one AND gate, regardless of the counter size. This structure is believed to be the fastest binary counter structure, and it is commonly used in applications that require speed. It does, however, have its own limitation. From the diagram, it can be seen that a single Flip-flop output (consider Q0) has to drive a number of subsequent AND gates. This becomes a problem as the counter gets bigger: the output current of a Flip-flop may not be large enough to drive that many gates. On its own, therefore, the usefulness of this scheme for fast applications is limited.
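The trade-off described above can be put into rough numbers. The sketch below is illustrative only: it assumes the series chain contains one AND gate per bit after the first (the exact count depends on the circuit in the figure), that a wide AND gate has the same delay as a narrow one, and that Flip-flop setup time is negligible. The delay values of 10 ns and 5 ns are hypothetical.

```python
def series_min_period(n_bits, t_ff, t_and):
    """Flip-flop delay plus the whole AND chain (assumed n_bits - 1 gates)."""
    return t_ff + max(n_bits - 1, 0) * t_and


def parallel_min_period(n_bits, t_ff, t_and):
    """Flip-flop delay plus a single AND gate, independent of counter size."""
    return t_ff + t_and


def q0_fanout(n_bits):
    """AND gates driven by Q0 in the parallel scheme (one per higher bit)."""
    return max(n_bits - 1, 0)


T_FF, T_AND = 10.0, 5.0  # hypothetical delays in nanoseconds
for n in (4, 8, 16):
    print(n,
          series_min_period(n, T_FF, T_AND),
          parallel_min_period(n, T_FF, T_AND),
          q0_fanout(n))
```

Under these assumptions the series period grows linearly with the number of bits, while the parallel period stays constant and the load on Q0 grows instead.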

Considering the speed differences and drawbacks of both schemes leads us to contemplate using the two schemes to complement each other. The following sections discuss how the two schemes are used together, and in conjunction with other specialist techniques, to make a counter fast.


This technique employs a device known as the prescaler. This device/circuit is designed primarily using Emitter Coupled Logic (ECL). ECL offers very fast switching, which makes it suitable for high-speed counting work. Gates made using ECL are 10 times more expensive than gates made from Transistor-Transistor Logic (TTL) or Complementary Metal-Oxide-Semiconductor (CMOS) logic.

Though the prescaler is designed for high-speed counting work, it has few or no counting features, since such features would only reduce its operating speed. The reader need not be concerned with the implementation of the prescaler, only with the function it performs in the overall counting circuit. Essentially, a prescaler generates a "clock" pulse (to the rest of the counting circuit) after it has received a certain number of input pulses. For example, a divide-by-n prescaler generates one pulse for every n input pulses it receives. At present there are prescalers that accept input frequencies ranging from a few hundred megahertz to a few gigahertz. The point of the prescaler is to divide an incoming clock frequency and thereby provide a clock to a larger, slower counter.
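The divide-by-n behaviour described above can be modelled in a few lines. This is a behavioural sketch only; the class name `Prescaler` and its `pulse` method are hypothetical and say nothing about how a real ECL prescaler is built.

```python
class Prescaler:
    """Behavioural divide-by-n model: one output pulse per n input pulses."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def pulse(self):
        """Register one input pulse; return True when an output pulse is emitted."""
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


prescaler = Prescaler(10)
output_pulses = sum(prescaler.pulse() for _ in range(1000))
print(output_pulses)
```

The slower counter that follows the prescaler only has to keep up with the output pulses, that is, with the input frequency divided by n.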

One might ask how the actual (and faster) incoming clock frequency is reflected in the slower counter. To answer this, let's consider the following example of a BCD (Binary Coded Decimal) counter.

In the above diagram, the two least significant decades, the units counter (UC) and the tens counter (TC), are shown. The "pulse swallowing" counter is used to count up to a predetermined number of pulses. The trick here is that the prescaler drives TC and UC simultaneously.

The circuit is connected in such a way that it operates as an up-counter when the 9's complement of the desired terminal value is preset into the counter via the P inputs. The output Q4 from the prescaler provides the clock pulse (CP) to both counter sections. The carry signal of UC (also labelled TC, not to be confused with the tens counter TC) is fed back to its carry enable input (CE) and to the mode input (M) of the prescaler. UC will stop counting and the prescaler will switch to the divide-by-10 mode when the carry signal TC is low. To illustrate the significance of these connections, let's say we want to count up to 54 pulses. The counter will start with the initial value indicated below and stop when both TCs are low.

                      (9's Complement)

                TC                          UC

            P4 P2 P1 P0             P4 P2 P1 P0

(initial)   0  1  0  0  (4)         0  1  0  1  (5)
            0  1  0  1              0  1  1  0
            0  1  1  0              0  1  1  1
            0  1  1  1              1  0  0  0
            1  0  0  0              1  0  0  1  (9) TC low
            1  0  0  1  (9) TC low

Initially, when the M input is low, the prescaler operates in the divide-by-11 mode. After 4 prescaler pulses, the count value is 89 (equivalent to 44 input pulses). At this point, TC of UC goes low, causing the input to its CE to go high and the M input to go high. This stops UC and switches the prescaler to the divide-by-10 mode. After the next clock cycle, the counter value is 99 (equivalent to 54 input pulses) and output TC of the tens counter goes low. It can be seen that when all counter sections have reached a carry output, the predetermined number of pulses has been received. Note that the counter itself received only 5 pulses to reach the desired value, when in fact 54 pulses (4 x 11 + 1 x 10) were received by the entire counter circuit (including the prescaler). Therefore, the net effect of such a combination is a counter operating at a much higher speed than it is capable of by itself.
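The worked example of 54 pulses can be reproduced with a short behavioural simulation. This is a sketch under stated assumptions, not a model of the actual circuit: the function name `pulse_swallow` is hypothetical, signal polarities are abstracted away, and the model assumes the tens digit of the target is at least as large as the units digit, so that UC reaches 9 no later than TC (as in the example above).

```python
def pulse_swallow(target):
    """Return (input_pulses, prescaler_pulses) for a two-decade counter.

    Assumes 0 <= target <= 99 and tens digit >= units digit.
    """
    tens, units = divmod(target, 10)
    if not 0 <= target <= 99 or tens < units:
        raise ValueError("model requires 0 <= target <= 99 and tens >= units")

    tc = 9 - tens    # tens counter preset (9's complement digit)
    uc = 9 - units   # units counter preset (9's complement digit)
    input_pulses = 0
    prescaler_pulses = 0

    while tc < 9:
        # Divide-by-11 while UC is still counting, divide-by-10 once it has stopped.
        modulus = 11 if uc < 9 else 10
        input_pulses += modulus
        prescaler_pulses += 1
        if uc < 9:
            uc += 1
        tc += 1

    return input_pulses, prescaler_pulses


print(pulse_swallow(54))
```

For a target of 54 the model takes 5 prescaler pulses and 54 input pulses, matching the sequence in the table: four divide-by-11 cycles followed by one divide-by-10 cycle.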


Pipelining is a "prediction" technique. It "predicts" an event (usually) one clock cycle before it is to occur. Upon prediction, certain output values (resulting from that event) are set. These new values do not appear immediately at the outputs. Instead, they are stored (latched) using Flip-flops and appear only on the next clock cycle, when the event is expected to occur. The use of this technique will be illustrated and clarified when we consider some practical counters.


Let's consider some practical counters from Xilinx.

Synchronous Presettable Counter XAPP 003.002

This binary up-counter demonstrates the Parallel Carry Synchronous Counter Structure and the Pipelining Technique.

The counter is presettable via the D inputs. When terminal count (11111111) is reached, the counter will return to the preset value on the next clock pulse. The counter outputs are denoted by Q, with Q0 the least significant bit. Note that preceding output values are fed in parallel to subsequent bit stages. These are seen as inputs to the AND gates producing the T outputs. The output Q0 is fed directly into the next bit stage. It can be deduced that Q0 toggles on every clock cycle only while TERMINAL COUNT is high (indicating that terminal count has not been reached). Subsequent bits toggle when all preceding bits are high and TERMINAL COUNT is high.

Pipelining is used here to predict terminal count one clock cycle before it occurs. This is done by ANDing all the T outputs with the inverted version of Q0 and negating the result, which is then fed to the TERMINAL COUNT Flip-flop. The result is low when all T outputs are high and Q0 is low, which corresponds to the value 11111110 (one short of terminal count). On the next clock cycle (when terminal count occurs), the result appears as TERMINAL COUNT. When TERMINAL COUNT goes low, the D inputs are propagated to the D inputs of the Flip-flops in the bit stages. In effect, pipelining separates the detection of terminal count and the setting of TERMINAL COUNT into two clock periods. In the second clock period, TERMINAL COUNT is set and propagates to all the bit stages. If all of this were done in one clock period, the minimum clock period would have to be significantly longer, slowing the counter down.
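The two-stage behaviour described above can be sketched as a cycle-by-cycle model. This is an illustration under assumptions rather than the XAPP design itself: the function name `simulate`, the variable `tc_n` (the registered, active-low TERMINAL COUNT) and the preset value 250 are hypothetical, and the prediction logic is reduced to a comparison with 11111110.

```python
def simulate(preset, n_clocks, width=8):
    """Return a list of (Q, TERMINAL COUNT) pairs, one per clock cycle."""
    mask = (1 << width) - 1
    q = preset & mask
    tc_n = 1  # registered TERMINAL COUNT, active low, initially inactive
    trace = []
    for _ in range(n_clocks):
        trace.append((q, tc_n))
        # Stage 1 (combinational): predict terminal count one cycle early,
        # i.e. detect the value one short of all ones.
        predict_n = 0 if q == mask - 1 else 1
        # Stage 2: the registered flag from the previous cycle selects
        # between reloading the preset and counting up.
        next_q = (preset & mask) if tc_n == 0 else (q + 1) & mask
        # Clock edge: both registers update together.
        q, tc_n = next_q, predict_n
    return trace


trace = simulate(preset=250, n_clocks=8)
for q, tc_n in trace:
    print(format(q, "08b"), tc_n)
```

In the trace, TERMINAL COUNT is low in exactly the cycle where Q is 11111111, even though the detection was performed in the preceding cycle on 11111110. On the following clock the counter returns to the preset value.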

High-Speed Synchronous Prescaler Counter XAPP 001.002

In this counter, an adapted version of the "pulse swallowing" technique is used.

From the diagram, the least significant tri-bit (TB1) acts as a "prescaler". Its Count Enable Output (CEO) is propagated in parallel to the rest of the tri-bits as shown, and goes high only once in every 8 clock cycles. Note that the tri-bit blocks are only triggered when their Count Enable Parallel (CEP) inputs and their Count Enable (CE) / Count Enable Trickle (CET) inputs are high. The 8 clock periods give the CEO-CET ripple chain ample time to settle (that is, to propagate from TB2 to the terminal tri-bit). If this were not done, the minimum clock period would be limited by the propagation delay of TB2's CEO signal to the terminal tri-bit, slowing the counter down.
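The enable structure described above can be modelled logically as follows. This is a sketch only, with hypothetical names (`step`, `value`) and an assumed four tri-bit blocks. It ignores timing, and simply checks that gating every higher tri-bit with TB1's CEO (as CEP) and with a trickle enable from the lower tri-bits (as CET) still yields an ordinary binary count.

```python
def step(tribits):
    """Advance one clock. tribits[0] is TB1; each entry is a 3-bit value."""
    ceo1 = tribits[0] == 7          # CEO of TB1, high once every 8 clocks
    new = list(tribits)
    new[0] = (tribits[0] + 1) & 7   # TB1 counts on every clock
    cet = True                      # trickle enable entering TB2
    for i in range(1, len(tribits)):
        if ceo1 and cet:            # CEP (parallel) and CET (trickle) both high
            new[i] = (tribits[i] + 1) & 7
        # The trickle enable ripples onward only through tri-bits that are full.
        cet = cet and tribits[i] == 7
    return new


def value(tribits):
    """Combine the tri-bits into a single integer (TB1 least significant)."""
    return sum(tb << (3 * i) for i, tb in enumerate(tribits))


state = [0, 0, 0, 0]  # four tri-bits, i.e. a hypothetical 12-bit counter
ceo_cycles = 0
values = []
for _ in range(600):
    values.append(value(state))
    ceo_cycles += state[0] == 7
    state = step(state)

print(values[:10], ceo_cycles)
```

Because the higher tri-bits can only change on the clock following a high CEO from TB1, the trickle chain between them has 8 clock periods to settle between such events, which is the point made above.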


  1. Counting and Counters
    by R.M.M. Oberman
    The Macmillan Press Ltd
  2. Digital Design
    by M. Morris Mano
    Prentice-Hall International
  3. Digital System Design
    by Barry Wilkinson with Rafic Makki
    Prentice-Hall International
  4. XILINX XAPP Application Notes
  5. ISE Second Year Digital Electronics Notes
    by Mike Brookes
