IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 3, MAY 2005

Reduced Complexity Interleaver Growth Algorithm for Turbo Codes

Fred Daneshgaran, Member, IEEE, and Massimiliano Laddomada, Member, IEEE

Abstract—This paper is focused on the problem of significantly reducing the complexity of the recursive interleaver growth algorithm (IGA) with the goal of extending the range of applicability of the algorithm to significantly larger interleavers for a given CPU time and processor. In particular, we present two novel modifications to IGA changing the complexity order of the algorithm from $O(N_{\max}^4)$ to $O(N_{\max}^2)$, present several further minor modifications reducing the CPU time albeit not fundamentally changing the complexity order, and present a mixed mode strategy that combines the results of complexity reduction techniques that do not alter the algorithm outcome itself, with a novel transposition value set cardinality constrained design that does modify the optimization results. The mixed strategy can be used to further extend the range of interleaver sizes by changing the complexity order from $O(N_{\max}^2)$ to $O(N_{\max})$ (i.e., linear in the interleaver size). Finally, we present optimized variable length interleavers for the Universal Mobile Telecommunications System (UMTS) and Consultative Committee for Space Data Systems (CCSDS) standards outperforming the best interleavers proposed in the literature.

Index Terms—Complexity, interleavers, iterative algorithms, optimization, permutations, turbo codes.

I. INTRODUCTION

PARALLEL concatenated convolutional codes (PCCC) or turbo codes have recently become very popular for forward error correction, largely due to their efficient suboptimal iterative decoding algorithm and their remarkable performance near the capacity limit [1]–[4]. An example of a two-component PCCC composed of two identical memory $\nu = 2$ recursive systematic convolutional (RSC) codes coupled by an interleaver of length $N$, shown by the box labeled $\pi$, is depicted in Fig. 1. For trellis termination of the RSC codes, we use the technique proposed in [4]. After trellis termination, the resulting code resembles a large block code and is denoted by $C_\pi$.

There is now a growing body of literature on interleaver design techniques for turbo codes (see, for instance, [1], [5]–[18]). In [1], we presented a systematic, recursive interleaver design algorithm producing interleavers specifically tailored to the constituent RSC codes of the construction. We may generically characterize such interleavers as distance spectrum optimized (DSO) interleavers obtained via the application of the interleaver growth algorithm (IGA) [1].

Manuscript received January 20, 2002; revised July 24, 2003 and February 18, 2004; accepted March 12, 2004. The editor coordinating the review of this paper and approving it for publication is K. Narayanan. This work was supported in part by Euroconcepts S.r.l., in part by FIRB, and in part by the CAPANINA Project funded by the European Commission within the VI Framework Program.

F. Daneshgaran is with the Department of Electrical and Computer Engineering, California State University, Los Angeles, CA 90032 USA (e-mail: [email protected]).

M. Laddomada is with the Dipartimento di Elettronica, Politecnico di Torino, 10129 Torino, Italy.

Digital Object Identifier 10.1109/TWC.2005.847094

Fig. 1. Example of a rate one-third PCCC employing two identical rate one-half memory $\nu = 2$ RSC codes. The interleaver is represented by the permutation box $\pi$.

The terminology is used to emphasize the fact that the goal of the interleaver design algorithm presented in [1] is to optimize the distance spectrum of the associated block code $C_\pi$. The main impact of such a design philosophy is improvement in the asymptotic performance of the code and lowering of the error floor typically observed in the bit-error rate (BER) and frame-error rate (FER) plots of turbo codes. In effect, the optimization extends the operational region of the code, since it is unlikely that a turbo code would be operated in the error floor region itself. Since the complexity of optimization of the interleaver to produce a code with the best distance spectrum is very large (factorial in the problem size), the IGA uses a greedy approach to produce suboptimal results with a reasonable level of design complexity. Hence, while the optimization problem is well defined, IGA only produces optimal results in the context of a greedy approach.

The following are the main attributes of the IGA for interleaver design.

1) The DSO interleavers obtained using IGA in general result in $C_\pi$'s with better distance spectra compared to the other design techniques reported in the literature [9]. Furthermore, once the algorithm is initialized, the IGA results in a unique solution, unlike the other design techniques which generally lead to a class of interleavers to choose from.

2) The IGA, by virtue of its recursive nature (i.e., an interleaver of length $N$ is constructed from one of length $N - 1$), leads to implicitly prunable interleavers in the sense that the interleaver construction is recursive and there are hardware efficient means of implementing variable length interleavers based on the results.

3) The interleavers designed using IGA are tailored to the specific RSC codes used in the construction of the PCCC. This is in sharp contrast to most other interleaver design



techniques that are not per se tailored to the RSC codes of the construction, if not through a trial-and-test process.

This paper is focused on the problem of significantly reducing the complexity of IGA itself, with the goal of extending the range of applicability of the algorithm to larger interleavers. In its original formulation, even though the algorithm complexity is polynomial, the complexity is so large that it severely limits the maximum interleaver length for a given CPU time and processor. Recently, we proposed a very efficient error pattern feedback for interleaver optimization significantly improving the performance of the resulting interleavers in terms of their distance spectra [19]. The immediate consequence of the use of error pattern feedback in the design is that, for each interleaver of a given target length to be designed, the IGA must be run several times and not just once, further exacerbating the complexity problem.

The rest of the paper is organized as follows. In Section II, we provide a brief overview of the IGA and highlight the complexity order of the algorithm in its original formulation. Section III presents the core theoretical result of the paper, where we present two novel modifications of the IGA changing the order of complexity of the algorithm from $O(N_{\max}^4)$ to $O(N_{\max}^2)$. Furthermore, we present several additional useful modifications of the algorithm that do not change its operational behavior, but do speed up the algorithm although not affecting its complexity order. In Section IV, we provide an overview of the delay constrained approach of limiting design complexity first introduced in [5], [20] and discuss its advantages and limitations in a cohesive framework using empirical observations obtained via simulations. Subsequently, we present a mixed mode strategy using a novel transposition value set cardinality constrained design to further extend the range of applicability of the algorithm to larger interleavers. Various tradeoffs between execution time and performance of the designed interleavers can be obtained by changing the cardinality limit. Finally, we present the results of interleaver design for turbo codes proposed for the Universal Mobile Telecommunications System (UMTS) and Consultative Committee for Space Data Systems (CCSDS) standards and present prunable interleavers for these standards that, to the best of our knowledge, outperform the best interleavers proposed in the literature. Conclusions are presented in Section V.

II. OVERVIEW OF THE IGA AND ITS COMPLEXITY

This section summarizes the main results of [1] on the IGA algorithm. For the sake of brevity, we shall keep the description of the IGA to the minimum, and invite the interested reader to review [1] for many theoretical details that are omitted here. The basic understanding of the IGA algorithm presented below requires representation of the permutations using their transposition vectors introduced in [1]. It is the transposition vector of a permutation, and not directly the permutation itself, that is iteratively grown to the desired size.

A. Transposition Vector of a Permutation

Consider an indexed set of $N$ elements. A given interleaver performs a particular permutation of this set of elements. The permutation $\pi$ acts on the indexes of the elements. Henceforth, the notation $\pi(i) = j$ is used to mean that the $i$th input symbol is carried to the $j$th position at the output. It is a basic result in group theory [21] that any permutation on a set of $N$ elements can be written as a product of disjoint cycles, and the set may be divided into disjoint subsets such that each cycle operates on a different subset. A cycle of length two is called a transposition. It is easy to verify that any finite cycle can be written as a product of transpositions. Hence, we conclude that transpositions represent the elementary constituents of any permutation.
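As a small illustration of this standard fact (our example, not taken from the paper), the following sketch factors a permutation, given as an array whose entry $i$ holds the output position of input $i$, into its disjoint cycles:

```python
def disjoint_cycles(perm):
    """Decompose a permutation (perm[i] = output position of input i)
    into disjoint cycles; each cycle of length l >= 2 can in turn be
    written as a product of l - 1 transpositions."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        if len(cycle) > 1:          # fixed points are omitted
            cycles.append(cycle)
    return cycles

# Example: the permutation 0->2, 1->0, 2->1, 3->3 consists of one 3-cycle.
print(disjoint_cycles([2, 0, 1, 3]))   # [[0, 2, 1]]
```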

The finite state permuter (FSP) introduced in [1] is a realization of an interleaver in the form of a sliding window transposition box of fixed length equal to the delay of the permutation it effectuates on its input sequence, and with the property that the transposition performed at a given time slot is responsible for the generation of the output at the same time slot. The operation of the FSP can be understood by thinking of the sliding window as a queue. To generate any possible sequence of outputs, it is sufficient to exchange the head of the queue with the element that is to be ejected at that time slot.

Any permutation on a finite set of elements can be represented using a unique transposition vector associated with its FSP realization. As an example, consider a permutation $\pi_1$ on five elements. Consider the queue model of the FSP and assume that data enters from left to right. Let us label the transpositions to be performed sequentially with the head of the queue to generate the desired outputs, using positive integers. Let the integer 1 denote the case whereby no transposition is performed with the head of the queue and the element at the head of the queue is simply ejected. Then, $\pi_1$ is fully defined by its transposition vector (to be read from left to right). Take the binary sequence 10110, labeled from left to right. Permutation $\pi_1$ maps this sequence to 00111. Consider now the transposition vector associated with a second permutation $\pi_2$ on six elements. Note that this vector looks quite different from that of $\pi_1$, yet its output corresponding to the shifted sequence 010110 is 001110, which is a shifted version of the output generated by $\pi_1$. In essence, the description of $\pi_1$ and $\pi_2$ using their transposition vectors preserves information about the prefix symbol substitution property of these two permutations on error patterns, whereby the first transposition exchanges a zero with a zero, or a one with a one.
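To make the queue model concrete, here is a minimal sketch (ours; the exact indexing convention is an assumption consistent with the description above, with the vector read left to right and the value 1 meaning that the head is ejected with no exchange):

```python
def apply_transposition_vector(tvec, seq):
    """Queue model of the FSP: at each time slot the head of the queue is
    exchanged with the element indicated by the current transposition value
    (1 = no exchange) and then ejected.  Assumes len(tvec) == len(seq)."""
    queue = list(seq)                 # queue[0] plays the role of the head
    out = []
    for t in tvec:                    # transposition vector read left to right
        if t != 1:
            queue[0], queue[t - 1] = queue[t - 1], queue[0]
        out.append(queue.pop(0))      # eject the (possibly exchanged) head
    return out

# A length-N transposition vector defines a unique permutation of N elements:
print(apply_transposition_vector([3, 1, 2, 1, 1], list("abcde")))  # ['c','b','d','a','e']
```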

Any permutation on $N$ elements uniquely defines a transposition vector of size $N$. Conversely, any transposition vector of size $N$ defines a unique permutation on $N$ elements. Note that when synthesizing a permutation using the transposition vector $\mathbf{t}$, the $i$th element of $\mathbf{t}$, when scanned from right to left, can only assume values in the set $\{1, 2, \ldots, i\}$. In the interleaver design algorithm presented in [1], the interleaver is grown based on the minimization of a cost function related to the asymptotic BER (or FER) performance of the overall code. Hence, we shall


first introduce this cost function, and then present the IGA algorithm.

B. Cost Function and the Interleaver Growth Algorithm

Let $e$ be a terminating error pattern of length $L$ and Hamming weight $w$ associated with the RSC code in a PCCC configuration. By a terminating error pattern, we mean a binary sequence at the input of the RSC that corresponds to a path in the trellis of the code diverging from and reemerging with the all-zero path. Let the interleaver length be $N$, and let $e_j$ denote the $j$th phase of error pattern $e$, for $j = 1, \ldots, N - L + 1$, i.e., the $j$th of the shifts of $e$ that can appear within the interleaver span of length $N$. Then, all the phases of $e$ represent terminating error events prior to permutation and all have equal probability of occurring within the interleaver span.
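For illustration (the helper name and the sample pattern below are ours, not the paper's), the $N - L + 1$ phases of an error pattern within a span of length $N$ can be enumerated as follows:

```python
def phases(e, N):
    """All shifts ("phases") of the length-L error pattern e that fit
    inside an interleaver span of length N; there are N - L + 1 of them."""
    L = len(e)
    return [[0] * j + list(e) + [0] * (N - L - j) for j in range(N - L + 1)]

# Example with an arbitrary weight-2 pattern and a span of length 6:
for p in phases([1, 0, 1], 6):
    print(p)
# [1, 0, 1, 0, 0, 0], [0, 1, 0, 1, 0, 0], [0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 0, 1]
```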

Let $\mathbf{t}_N$ denote a transposition vector of length $N$. Let the elementary cost function for the interleaver design based on just one phase of one error event be a function of the transposition vector $\mathbf{t}_N$ and of the error pattern phase $e_j$, both of which are vectors whose size grows with $N$. The transposition vector permutes the components of $e_j$, generating an error pattern with the same weight. Let $d$ denote the Hamming weight of the RSC output sequence corresponding to this permuted input; $d$ corresponds to the Hamming distance from the all-zero path of the path in the trellis of the RSC corresponding to the permuted input. We similarly define the Hamming weight of the output of the RSC receiving the unpermuted input. A suitable elementary cost function directly related to the upper bound on the pairwise error event probability of the code is given in (1), where a constant related to the code rate and the signal-to-noise ratio (SNR) appears. This elementary cost function can be used for optimization of the asymptotic BER of the code. If the goal is optimization of the asymptotic FER, we can simply drop the corresponding factor from (1).

We assume that the overall cost function is additive in the phases of our error patterns. For a PCCC containing two identical RSCs and one interleaver [4], we define the total cost function as the sum of the cost function associated with the forward permutation and the cost function associated with the inverse permutation. Hence, the global cost function using the $n_e$ error patterns is

(2)
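The structure of (2) can be pictured with a schematic sketch (ours; the elementary cost of (1) is represented by a placeholder callable, since its exact form is not reproduced in this transcript): the global cost adds the forward and inverse elementary costs over every phase of every error pattern.

```python
def global_cost(permute, inv_permute, error_patterns, N, elementary_cost):
    """Schematic form of (2): sum of forward and inverse elementary costs
    over all phases of all error patterns.  `permute` / `inv_permute` map a
    length-N pattern to its (inverse-)permuted version, and
    `elementary_cost` stands in for the elementary cost of (1)."""
    total = 0.0
    for e in error_patterns:
        L = len(e)
        for j in range(N - L + 1):                           # each phase of e
            phase = [0] * j + list(e) + [0] * (N - L - j)
            total += elementary_cost(permute(phase)) + elementary_cost(inv_permute(phase))
    return total
```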

The expression in (1) clearly resembles the upper bound on the asymptotic BER of the code. The following are the operational steps of the IGA algorithm, which aims at iteratively minimizing this cost function using a greedy search paradigm.

1) Set the iteration index, which corresponds to the initial size of the interleaver, to $N_0$. Initialize the recursion by exhaustively solving

$$\mathbf{t}_{N_0} = \arg\min_{\mathbf{t}} C_{N_0}(\mathbf{t}) \qquad (3)$$

for a manageable value of $N_0$, where $C_N(\cdot)$ denotes the global cost function of (2) for interleaver length $N$. Set $N = N_0$.

2) Find $t_{N+1}$ from

$$t_{N+1} = \arg\min_{t \in \{1, \ldots, N+1\}} C_{N+1}\big((t, \mathbf{t}_N)\big) \qquad (4)$$

and form $\mathbf{t}_{N+1} = (t_{N+1}, \mathbf{t}_N)$.

3) Set $N = N + 1$ and iterate step 2 to grow the transposition vector to the desired size $N_{\max}$.
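The greedy recursion of steps 1–3 can be summarized in code. The sketch below is our schematic rendering, with `cost_of` standing in for the global cost function of (2)–(4); it is not the authors' implementation.

```python
def grow_interleaver(initial_tvec, N_max, cost_of):
    """Greedy IGA growth: starting from an exhaustively optimized short
    transposition vector, prepend at each step the transposition value that
    minimizes the global cost of the enlarged interleaver."""
    tvec = list(initial_tvec)              # t_{N0}, found exhaustively
    while len(tvec) < N_max:
        N = len(tvec) + 1
        # candidate first transpositions for an interleaver of length N
        best = min(range(1, N + 1), key=lambda t: cost_of([t] + tvec))
        tvec = [best] + tvec               # t_N = (t_N, t_{N-1})
    return tvec
```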

We note the following.

1) Using IGA, one can neither establish the local nor the global strict optimality of the solution at a given recursion step. To establish local optimality, one needs to define a notion of perturbation about a point in an appropriately sized space. If one takes the definition of a local perturbation of a permutation of length $N$ to be allowing the first element of the transposition vector to assume any of the $N$ possible values, while the rest of the transposition vector is fixed, then IGA is by definition locally optimal. This notion of local perturbation may indeed be justified in light of the prefix symbol substitution property referred to above. Under this notion of local perturbation, IGA is essentially a steepest descent type algorithm.

2) In many optimization algorithms using the greedy approach, the input vector is often of a fixed dimension. In IGA, the dimensionality of the input vectors (i.e., the transposition vectors) actually increases from one recursion to the next. Hence, the search is over an ever growing input space of higher and higher dimensions.

3) The core theoretical justification for the use of IGA is provided by [1, Theorems 3.1 and 3.2]. These theorems demonstrate that the sequence of cost functions with respect to randomization of the first transposition, with the rest of the transposition vector intact, is asymptotically a Martingale [22], meaning



that for a sufficiently large interleaver length, even randomly growing the transposition vector one element at a time is a fair process. IGA does much better in that it does not randomly pick the transposition value at a given recursion step; indeed, the transposition that gives the best value of the cost function at that step is selected. This leads to a deterministic process where, on the average, the value of the cost function should decrease. One cannot say the resulting process is a supermartingale since there is no randomness involved anymore, but the minimum of the cost function at each step beats the average unless the variance is zero, and hence it is anticipated that IGA should perform better than a pure Martingale. This is as good as having a supermartingale. Indeed, simulation results exhibit an almost monotonic and rapid decrease in the value of the cost function as a function of the recursion step.

C. Complexity of IGA in Original Formulation

In [1], it is demonstrated that for a set of single error events, IGA has a polynomial complexity of order $O(N_{\max}^3)$. This complexity measure is associated with the raw algorithm itself and does not account for the time required to obtain the codeword weights. This is because, theoretically, a very large lookup table could be used to store the codeword weights (the input weight of the error patterns is often very limited). However, it is impractical to use huge lookup tables for large interleavers. If we consider actually sequentially encoding the permuted sequence at the input of the RSC (the unpermuted sequence need not be encoded since it represents an error event with a well known weight at the output of the RSC encoder), the algorithm's execution time complexity grows by one order (i.e., it becomes $O(N_{\max}^4)$). Unfortunately, for many interleaver sizes of practical interest (for information block lengths from several hundreds to several thousands), the algorithm complexity can still be quite high.

III. NOVEL MODIFICATIONS CHANGING COMPLEXITY ORDER OF IGA

In this section, we present two novel modifications of IGA that effectively change its complexity order from $O(N_{\max}^4)$ to $O(N_{\max}^2)$. In order to better help understand the first modification of the IGA that changes its complexity order, we shall use a very simple example. In particular, suppose our aim is to design an interleaver for the turbo code of Fig. 1. Suppose we have chosen two error patterns, denoted $e^{(1)}$ and $e^{(2)}$, for the formulation of the cost function to use for interleaver design. It can be easily verified that both these sequences cause a divergence and reemergence from the all-zero path in the trellis of the constituent RSC code of the turbo code presented in Fig. 1.

Suppose the interleaver has been grown to length 6, so what is currently available is the transposition vector $\mathbf{t}_6$. We wish to now enlarge the interleaver and grow it to length 7. To do this, we need to examine all the possible values that the seventh transposition $t_7$ can assume and form $(t_7, \mathbf{t}_6)$, where $t_7 \in \{1, \ldots, 7\}$. For each possible value of $t_7$, a global cost function is formed, and the transposition minimizing this cost function is chosen to form $\mathbf{t}_7 = (t_7^\ast, \mathbf{t}_6)$, where $t_7^\ast$ is the minimizing value of $t_7$. The cost function itself is composed of elementary cost functions associated with how the transpositions affect the error patterns and their various phases. For now, let us concentrate on the forward component of the cost function. The arguments that follow apply equally to the component of the cost function associated with the inverse permutation. Let $E^{(1)}_6$ be the error pattern matrix for error $e^{(1)}$ and $E^{(2)}_6$ the error pattern matrix for error $e^{(2)}$ that was used during the minimization at iteration 6 (we assume that data enters the RSC sequentially starting with its leftmost position)

(5)

(6)

In transition from iteration 6 to iteration 7, one new phase of each error pattern must be additionally examined, so that $E^{(1)}_6$ and $E^{(2)}_6$ become $E^{(1)}_7$ and $E^{(2)}_7$

(7)

(8)

The last row of each matrix represents the new phase (the first phase according to our previous definition) of the error pattern. Associated with each error pattern matrix, there is a cost vector that accounts for the contribution of a particular phase of a particular error pattern to the forward component of the global cost function. Since $\mathbf{t}_6$ is the chosen transposition vector at iteration 6, the cost vectors associated with matrices $E^{(1)}_6$ and $E^{(2)}_6$ and their permuted versions are minimal cost vectors at that iteration. Let us denote these cost vectors as

(9)

(10)

where each entry denotes the elementary forward component of the cost associated with the $j$th phase of error pattern $i$ at iteration 6, and the symbol "$\ast$" is used to highlight the minimal value of the cost, which is associated with the transposition minimizing the global cost function at iteration 6.

Suppose we are at iteration 7 and wish to examine the cost vectors associated with the choice of the seventh transposition $t_7$, resulting in the transposition vector $(t_7, \mathbf{t}_6)$.


Application of the first transposition $t_7$ transforms the matrices $E^{(1)}_7$ and $E^{(2)}_7$ as follows:

(11)

(12)

where the highlighted columns are the columns affected by the transposition $t_7$ (note that in our queue model of the FSP, data enters from the left, and the head of the queue is at the rightmost position). After the application of $t_7$, an element is ejected (i.e., the last columns of matrices $E^{(1)}_7$ and $E^{(2)}_7$ are ejected) and the rest of the transpositions defining $\mathbf{t}_6$ are applied to the remaining columns of the matrices:

(13)

(14)

After the application of $(t_7, \mathbf{t}_6)$, the rows of matrices $E^{(1)}_7$ and $E^{(2)}_7$ and the rows of the permuted matrices associated with the application of $(t_7, \mathbf{t}_6)$ are encoded by the RSCs, and two cost vectors are generated:

(15)

(16)

The total cost associated with the forward component of the cost function for the choice of $t_7$ is then the sum of the entries in the two cost vectors, denoted as in (17), where $\mathbf{1}$ denotes a column vector of ones of appropriate size.

First, consider the raw complexity of this algorithm, which is effectively IGA in its original formulation. There are $N - L + 1$ phases to be examined at iteration $N$ for each error pattern. Let the number of error patterns used for the formulation of the cost function be $n_e$. For large $N$, $N - L + 1 \approx N$. There are $N$ candidate transpositions, for each one of which we must generate the cost vectors. Generation of each element of a cost vector requires encoding $N$ bits at the inputs of the upper and lower RSC. If this encoding is performed sequentially, we have an additional $N$ operations to perform. This brings the complexity of the algorithm at iteration $N$ to $O(n_e N^3)$. The global complexity, if the goal is to grow the interleaver to length $N_{\max}$, is

$$\sum_{N=N_0}^{N_{\max}} O\big(n_e N^3\big) = O\big(n_e N_{\max}^4\big). \qquad (18)$$

The key observation that can be used to reduce the complexity order is as follows. Note that the highlighted rows of matrices $E^{(1)}_7$ and $E^{(2)}_7$ are the same as what is found in matrices $E^{(1)}_6$ and $E^{(2)}_6$. Furthermore, the first element that was ejected after the application of $(t_7, \mathbf{t}_6)$ for these rows was a zero. Since the beginning zeros injected into the RSC do not impact the values of the cost function associated with a given error event aside from a scaling effect (i.e., they just delay the start of an error event), we conclude that the elementary cost values associated with the highlighted rows of $E^{(1)}_7$ and $E^{(2)}_7$ and their permuted versions must equal

(19)

(20)

where the factor appearing in (19) and (20) is a scaling factor that arises from the definition of the elementary cost function. Hence, these values do not need to be recalculated at iteration 7. Indeed, it is noted that for any given error pattern, at iteration $N$, the component of the cost due to the first phase must be computed, and that only a limited number of phases (on the order of the error pattern length) are affected by the application of the first transposition; hence, the component of the cost function for these phases must be recomputed. The component of the cost function associated with the unaffected phases of the error patterns need not be recomputed and may be obtained by scaling the optimal cost values available from iteration $N - 1$. Indeed, when $N$ is large, most phases of the error patterns are unaffected by any given choice of the first transposition. This suggests the following alternate expression:

(21)

where the term appearing in (21) is the forward component of the cost function associated with the best transposition vector of length 6, $\mathbf{t}_6$. This expression shows how the value of the cost function from a previous iteration is used in a new iteration. This modification reduces the complexity order when error events are sequentially encoded from $O(N_{\max}^4)$ to $O(N_{\max}^3)$, or, if table lookup is used (in practice, this is impractical), from $O(N_{\max}^3)$ to $O(N_{\max}^2)$.

Having presented the crux of the argument through a simple example, the following is the formal description of the modified IGA.


1) Set the iteration index, which corresponds to the initial size of the interleaver, to $N_0$. Initialize the recursion by exhaustively solving for

(22)

for a manageable value of $N_0$. Set $N = N_0$ and store the corresponding minimal cost values.

2) For each candidate first transposition $t_{N+1} \in \{1, \ldots, N+1\}$ and for all error patterns, find the sets of indexes of the affected phases of each error pattern (excluding the first phase) at iteration $N + 1$ due to the application of the first transposition $t_{N+1}$. Let the transposition vector of the inverse permutation associated with the permutation induced by $(t_{N+1}, \mathbf{t}_N)$ be given, with its first transposition denoted analogously. For each candidate first transposition and for all error patterns, find the sets of indexes of the affected phases of each error pattern (excluding the first phase) at iteration $N + 1$ due to the application of the first transposition of the inverse permutation. We note in passing that for a fixed set of error patterns, the sets of indexes noted here can be obtained analytically.

3) For each candidate first transposition, compute the components of the cost function for the forward cost for the first phase, and for all the affected phases of all the error patterns

(23)

(24)

Similarly, compute the components of the cost function for the reverse cost for the first phase, and for all the affected phases of all the error patterns

(25)

(26)

4) Compute the total cost associated with transposition $t_{N+1}$ as follows:

Fig. 2. State transition diagram of the RSC encoder of Fig. 1. Transitions are labeled by the input bit and the corresponding parity output bit.

(27)

where the sets appearing in (27) are the complement sets of indexes (i.e., the indexes of the unaffected phases). The following simple manipulations bring to light the iterative nature of the computation of the global cost function and the complexity dependence on the cardinality of the affected sets:

(28)

(29)


TABLE I. CONSTANT $m(r, S_I)$ AS A FUNCTION OF THE STARTING STATE $S_I$ AND REMAINDER $r$. THE TERMINAL STATE AFTER THE ENCODING OF THE ZERO RUN IS ALSO LISTED

5) Find the minimizing transposition $t_{N+1}$ from

(30)

and form $\mathbf{t}_{N+1} = (t_{N+1}, \mathbf{t}_N)$.

6) Set $N = N + 1$ and iterate step 2 to grow the transposition vector to the desired size $N_{\max}$.

Simulation results are provided later in the paper confirming the reduction in the complexity order of the IGA via the application of the proposed modification.
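The bookkeeping behind steps 2–5 can be sketched as follows. This is our own schematic illustration under simplifying assumptions (generic `phase_cost` and `affected_phases` callables, and the scaling of reused costs from (19)–(21) omitted); the essential point is that only the new phase and the few affected phases are re-encoded, while cached costs are reused for all other phases.

```python
def grow_one_step(tvec, cached_costs, phase_cost, affected_phases):
    """One growth step of the modified IGA (schematic sketch).

    cached_costs[p] holds the cost of phase p under the current best vector;
    phase_cost(t, p) re-encodes phase p under candidate first transposition t;
    affected_phases(t) returns the few phase indexes whose cost changes under t.
    """
    N = len(tvec) + 1                      # size of the enlarged interleaver
    new_phase = N                          # index of the newly added (first) phase
    best_t, best_total, best_updates = None, float("inf"), {}
    for t in range(1, N + 1):              # candidate first transpositions
        updates = {p: phase_cost(t, p) for p in affected_phases(t)}
        updates[new_phase] = phase_cost(t, new_phase)
        total = sum(updates.values()) + sum(
            c for p, c in cached_costs.items() if p not in updates)
        if total < best_total:
            best_t, best_total, best_updates = t, total, updates
    cached_costs = {**cached_costs, **best_updates}   # reuse unaffected costs
    return [best_t] + tvec, cached_costs
```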

Having reduced the complexity by one order, we present another novel modification that further reduces the complexity by one order, changing the complexity of IGA to $O(N_{\max}^2)$. We noted earlier that the permuted sequences associated with a given transposition vector at iteration $N$ can be sequentially encoded by the RSC. This requires roughly $N$ operations. The key observation is that in the formulation of the elementary cost functions we do not need the actual sequences at the output of the RSC; we only need their weights. Since the weight $w$ of a given permuted sequence at the input of the RSC is often much smaller than $N$ (i.e., $w \ll N$), the input sequence to the RSC is largely composed of zeros. From an information theoretic point of view, such sequences have a very simple description. In particular, a given input sequence of weight $w$ can be described via the run-lengths of zeros without the need to even consider the run-lengths of ones. The run-length description is extremely useful since it allows us to compute the weight of the corresponding sequence at the output of the RSC in a number of steps that is proportional to the input weight of the sequence, independently of $N$. Once again, to better understand this concept, consider the turbo code of Fig. 1. The constituent RSC of this turbo code has the state transition diagram shown in Fig. 2. The states of the RSC are identified by the contents of the memory cells in Fig. 1, whereby the most significant bit is taken to be associated with the rightmost cell. Since the RSC is systematic, on each transition we have only shown the input bit and the corresponding parity output bit.

In order to determine the weight of the output sequence associated with a given input sequence described using its zero run-length information, we need to know the parity contribution of a run of zeros from a given starting state, denoted $S_I$, and the resulting terminal state, denoted $S_T$. Let $\rho$ denote the length of a run of zeros. Clearly, if the starting state is the all-zero state, the parity contribution is zero regardless of $\rho$, and the terminal state is the all-zero state as well. Hence, consider the other starting states. The formula for the parity weight, denoted $w_p(\rho, S_I)$, that can be directly obtained from the observation of the state transition diagram is

(31)

where $m(r, S_I)$ is a constant that depends on the remainder $r$ (of the run length $\rho$ modulo the period of the zero-input state cycle) and on the starting state $S_I$. Table I summarizes the information about this constant and the resulting terminal state after the run of zeros. As an example of application of this concept to the determination of the output weight of the encoded sequence, suppose we have an input sequence described via its zero run-lengths. What is the weight of the encoded parity bits at the output? The following transitions summarize how we may obtain this weight:

(32)

Each transition is marked by a starting state and a terminal state, which becomes the starting state of the next transition below it. The transition itself is labeled by the input bit and the output weight of the transition. When there are zero runs, (31) in conjunction with Table I is used to calculate the parity weight. Note that the end state associated with this sequence is not the all-zero state. However, at the end of the data block the trellis is terminated, and the weight of the output during termination can be easily obtained from knowledge of the terminal state. It is evident from this simple example that computation of the weight of the output sequence has a complexity that is on the order of the input weight, independent of the sequence length.
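The idea can be sketched in code. The RSC used below, with transfer function $[1, (1+D^2)/(1+D+D^2)]$, is assumed purely for the sake of the example (the paper specifies the code of Fig. 1 only through its figures): the zero-input state trajectory is traced once, after which the parity weight of a run of $\rho$ zeros follows from the per-period weight and the remainder, which is exactly the role of (31) and Table I.

```python
# Assumed memory-2 RSC for illustration: feedback 1 + D + D^2, feedforward 1 + D^2.
def rsc_step(state, u):
    """One trellis transition: returns (next_state, parity_bit).  state = (s1, s2)."""
    s1, s2 = state
    w = u ^ s1 ^ s2                 # feedback bit entering the register
    parity = w ^ s2                 # feedforward taps 1 and D^2
    return (w, s1), parity

def zero_run_weight(state, rho):
    """Parity weight and terminal state of a run of rho zeros from `state`.
    The zero-input trajectory from any state is periodic (the state update is
    invertible), so the answer needs only the per-period weight plus the
    remainder r of rho modulo the period (cf. (31) and Table I)."""
    if state == (0, 0) or rho == 0:
        return 0, state
    cycle_states, cycle_parity, s = [], [], state
    while True:                                     # trace one period of the cycle
        cycle_states.append(s)
        s, p = rsc_step(s, 0)
        cycle_parity.append(p)
        if s == state:
            break
    P = len(cycle_states)                           # period of the zero-input cycle
    q, r = divmod(rho, P)
    weight = q * sum(cycle_parity) + sum(cycle_parity[:r])
    return weight, cycle_states[r]

print(zero_run_weight((1, 0), 4))   # e.g. a run of 4 zeros from state (1, 0)
```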

Next is the question of how we obtain a description of the permuted sequence in terms of the zero run-lengths. We need to demonstrate that this operation too can be performed with a complexity on the order of the sequence weight, independently of its length. To this end, suppose we are at iteration $N + 1$ and are examining the candidate vector $(t_{N+1}, \mathbf{t}_N)$. Consider an affected phase of an error pattern of weight $w$. Since we are looking at the affected phase, by definition the first element ejected by the FSP is a one. Hence, one bit is mapped to position one at the output. There remain $w - 1$ ones whose permuted positions we need to know. Note that after the application of $t_{N+1}$, $\mathbf{t}_N$ is applied to the remainder of the sequence. If at the end of iteration $N$ the transposition vector $\mathbf{t}_N$ is converted to the permutation vector $\pi_N$, then, knowing the locations of the ones at the input, we can simply read off their permuted locations. Upon sorting the permuted positions of the ones at the output, we easily obtain the description of the permuted sequence in terms of the run-lengths of zeros as


in the example above. Once again, given $\pi_N$, all the operations involved have a complexity on the order of the sequence weight, independent of its length. The conversion from $\mathbf{t}_N$ to $\pi_N$, which has a complexity on the order of the sequence length, is performed once per iteration; hence, overall this operation leads to a quadratic complexity in the final interleaver length $N_{\max}$.
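For illustration (our sketch, with the permutation given directly as an array rather than via its transposition vector), the zero run-length description of the permuted pattern follows from the sorted permuted positions of its ones in $O(w \log w)$ operations:

```python
def permuted_zero_runs(perm, one_positions, N):
    """Zero run-length description of the permuted weight-w pattern.
    perm[i] is the output position of input position i (0-based);
    the cost is O(w log w), independent of the block length N."""
    out_positions = sorted(perm[i] for i in one_positions)
    runs, prev = [], -1
    for pos in out_positions:
        runs.append(pos - prev - 1)        # zeros between consecutive ones
        prev = pos
    runs.append(N - 1 - prev)              # trailing zeros up to the block end
    return runs

# Example: three ones at input positions 0, 2, 5 of a length-8 block (identity map).
print(permuted_zero_runs(list(range(8)), [0, 2, 5], 8))   # [0, 1, 2, 2]
```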

Two additional minor modifications that can be used to speed up the algorithm are: 1) using the description of a trellis section in matrix form for encoding, rather than literally sequentially encoding bits based on the RSC implementation model; this trivial modification can significantly enhance the processing speed since it reduces the encoding process to table lookup (i.e., memory access operations); and 2) embedding a stopping criterion in the computation of the sequence weights when they become too large.
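The first of these speed-ups amounts to precomputing one trellis section as lookup tables; a minimal sketch, again based on the illustrative memory-2 RSC assumed earlier rather than the paper's actual constituent codes, is:

```python
from itertools import product

# Trellis section of the assumed memory-2 RSC (feedback 1+D+D^2, feedforward 1+D^2)
# precomputed as lookup tables: next_state[s][u] and parity[s][u].
states = list(product((0, 1), repeat=2))            # states as (s1, s2)
next_state = [[None, None] for _ in states]
parity = [[None, None] for _ in states]
for k, (s1, s2) in enumerate(states):
    for u in (0, 1):
        w = u ^ s1 ^ s2                              # feedback bit
        next_state[k][u] = states.index((w, s1))
        parity[k][u] = w ^ s2                        # feedforward taps 1 and D^2

def output_weight(bits, s=0):
    """Parity weight of an input block using table lookups only."""
    w = 0
    for u in bits:
        w += parity[s][u]
        s = next_state[s][u]
    return w
```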

As an example of application of the modifications and the resulting impact on complexity, consider the interleaver design problem for a turbo code with the following specifications:

1) 16-state RSC constituent codes with the generator $G(D)$ reported in the caption of Table II;

2) 46 single error patterns (i.e., sequences that cause a divergence and reemergence in the trellis of the RSC code exactly once) used for the formulation of the global cost function (i.e., $n_e = 46$).

Fig. 3 depicts the actual CPU time required to design the interleaver using: a) the original IGA algorithm with sequential encoding of the permuted error patterns, whereby the fourth power dependence of the required CPU time on the interleaver length is evident; b) the IGA algorithm using the first modification, whereby the third power dependence of the required CPU time on the interleaver length is evident; and c) the IGA algorithm using both modifications, changing the complexity order and CPU execution time to have a quadratic dependence on the interleaver length. In cases b) and c), we have used the minor modifications for algorithm speed-up referenced earlier. To confirm that the algorithms practically produce identical results, the left-most curve in Fig. 5 depicts the global cost as a function of the interleaver length, whereby the curves of the cost functions for all three cases coincide. We have additionally examined the resulting transposition vectors in all three cases and have confirmed that they too are identical.

For the turbo code specified earlier, we have designed an interleaver of length 1280 using the IGA with both modifications and employing error pattern feedback [19]. Table II provides a summary of the results. The parameters in the table denote the starting and final interleaver lengths during a given feedback iteration; the number of added error patterns used during the design and the total number of error patterns are also listed in the table. The last column provides information about the actual values of the free distance, its multiplicity, and the input information weight for the designed turbo code. To the best of our knowledge, this turbo code has the highest minimum distance with the lowest multiplicity and the lowest Hamming weight of the input associated with the minimum distance ever reported in the literature for the specified interleaver length and constituent code.

Fig. 3. Interleaver growth required CPU time versus length for three cases: original IGA algorithm (left-most curve); IGA with the first modification (second curve from left); and IGA with both modifications (right-most curve).

As a comparison, for the same turbo code configuration, the authors of [23] reported the best free distance and cumulative input weight over 10 000 random interleavers, as well as the mean and variance of the free distance over the test ensemble. In the same paper, the authors also reported on the free distance of the CCSDS 16-state turbo code with interleaver length equal to 1784 at the corresponding code rate. Note that we achieve our results with an interleaver length equal to 1280, which corresponds to a lower end-to-end delay.

As a second example, consider the interleaver design for the UMTS standard. We have designed an interleaver of length 1280 using IGA with both modifications and with error pattern feedback. The interleaver is implicitly prunable and, given its iterative construction, exhibits excellent distance spectra at various block lengths. In particular, in Table III we report the following: 1) the free distance of our pruned interleavers at block lengths of interest for the UMTS standard (second column); 2) to the best of our knowledge, the best free distance and the associated input weight term reported in the literature [24] for the eight-state turbo code at the noted block length (third column); 3) the mean value and variance of the free distance obtained from examination of 10 000 randomly chosen interleavers as reported in [24] (fourth column); and 4) the minimum distance and input Hamming weight of the turbo code for the UMTS standard, and the minimum distance and input Hamming weight of the UMTS turbo code employing the best spread interleaver found over an ensemble of 1000 spread interleavers as reported in [23] (fifth column).

IV. DELAY CONSTRAINED AND TRANSPOSITION VALUE SET CARDINALITY LIMITED DESIGNS

In [5] and [20], the authors presented a simple method of limiting the interleaving delay during the design process. It is shown in [1] that the interleaver delay in its FSP realization is related to the maximum element of the transposition vector of the permutation it implements.


TABLE II. SUMMARY OF THE RESULTS OF INTERLEAVER DESIGN EMPLOYING ERROR PATTERN FEEDBACK FOR THE TURBO CODE EMPLOYING 16-STATE IDENTICAL RSCS WITH GENERATOR G(D) = [1, (1 +D +D +D +D )/(1 +D +D )]

TABLE III. PERFORMANCE COMPARISON OF OUR IMPLICITLY PRUNABLE UMTS INTERLEAVER DESIGNED USING THE MODIFIED IGA WITH ERROR PATTERN FEEDBACK WITH RESULTS REPORTED IN THE LITERATURE. THE UMTS TURBO CODE EMPLOYS 8-STATE CONSTITUENT RSC CODES WITH GENERATOR $G(D) = [1, (1 + D + D^3)/(1 + D^2 + D^3)]$ AND A VARIABLE BLOCK LENGTH INTERLEAVER

In particular, for a transposition vector of size $N$, denoted $\mathbf{t}_N$, the interleaving delay is determined by its maximum element. This observation suggests that a simple method of limiting the delay of an interleaver, independently of its length, is that of putting an upper limit on the maximum value of the transposition that may be examined during the interleaver growth process. In doing so, not only is the delay of the resulting interleavers limited, but the complexity of the growth algorithm itself is obviously affected as well. Here is a summary of the delay constrained design presented in [5] and [20].

1) Grow the transposition vector from some initial length $N_0$ to some desired delay limited threshold $N_D$ using IGA.

2) From iteration $N_D + 1$ to $N_{\max}$, the upper limit on the value of the transposition to be examined at each iteration is set to the threshold $N_D$. Experimental results presented in [5] and [20] suggest that good results in terms of the performance of the resulting interleavers are obtained for suitable choices of $N_D$.
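In code form, the delay constraint only changes the candidate set examined at each growth step; a one-line sketch (ours, using the threshold label $N_D$ introduced above) is:

```python
def delay_limited_candidates(N, N_D):
    """Delay constrained design: beyond the threshold N_D, the first
    transposition may only take values up to N_D, capping the delay."""
    return range(1, min(N, N_D) + 1)
```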

The complexity of the resulting algorithm using IGA in its original formulation is

(33)

Hence, for fixed $N_D$, the algorithm complexity becomes cubic in $N_{\max}$. If, on the other hand, we use the modified IGA presented in this paper, the algorithm complexity becomes

(34)

Fig. 4. Cost function saturation effect observed in delay constrained designs. The cost function asymptotes match very well the contribution of the internal minimum distance terms, which saturate. The straight lines depict this asymptote estimated from the contribution of the internal minimum distance term.

where $w_{\max}$ is the maximum Hamming weight of the error patterns used for the formulation of the cost function. Indeed, the imposition of a delay constraint renders the resulting algorithm complexity ultimately linear in the final interleaver size.

One major drawback of the delay constrained designs, however, is the global cost function saturation effect that is observed very shortly after the threshold $N_D$. An example of this effect, which we have observed in numerous simulations, is depicted in Fig. 4. This example is associated with the turbo code with the specifications given at the end of the previous section. The cost function asymptotes can be very accurately estimated based on the internal minimum distance [1] of the code, which saturates.


Fig. 5. Plots of the global cost as a function of the interleaver length for transposition value set cardinality constrained designs. The left-most curve is for the unconstrained IGA in original and modified formulations (the cost functions for the original and modified versions of IGA coincide). There is a gradual increase in the value of the cost function as the cardinality is reduced. This gradual increase is due to the degradation of the internal distance spectrum.

In effect, after the saturation, the interleaver performance as measured by its internal distance spectrum is not improving. This suggests that the delay constrained designs could have an inferior performance in comparison to the original formulation at lengths larger than the delay threshold. The main problem with the delay constrained designs leading to the saturation effect is that, since the range of action of the transpositions is limited to the beginning of the FSP, error pattern traps are created. By error pattern traps we mean a scenario whereby error patterns leading to low distance codewords at iteration $N_D$ simply remain unaffected by future transpositions that never "hit" the buffer locations where the error patterns may exist.

The observations made in connection with the limiting behavior of the delay constrained designs suggest that it is important that, at iteration $N$, the transposition be able to hit locations throughout the FSP buffer. Given this fact, one method of implementing the modified IGA in order to render the algorithm complexity ultimately linear in the maximum length $N_{\max}$ is that of random selection of a fixed number $K$ of transpositions to be examined at iteration $N$, with a uniform distribution throughout the range of permitted transpositions at that iteration, namely, uniform in the range $[1, N]$. Indeed, in our experimental results this technique has been quite successful, and no saturation effect of the cost function or the internal minimum distance has been observed. An example of application of this technique is depicted in Figs. 5 and 6. The turbo code specification for the designs is the same as that reported above (i.e., 16-state RSCs and 46 error patterns). Fig. 5 depicts the value of the global cost function as a function of the interleaver length for various values of the cardinality limit $K$. It is evident that there is performance degradation in comparison to the original IGA formulation as we use various transposition value set cardinality limits $K$, with a desirable gradual performance degradation as the value of $K$ is decreased. Fig. 6 depicts the plots of actual CPU time required for the interleaver growth as a function of the interleaver length and cardinality

Fig. 6. Complexity as measured by the CPU time versus interleaver length for the IGA with both modifications presented in the paper (left-most curve) and transposition value set cardinality limited designs. For the latter, as the cardinality is reduced, the execution time reduces.

limits $K$. It is evident from the figures that we may obtain a performance versus execution speed tradeoff for the design of interleavers for turbo codes. Note that, regardless of $K$, the modified IGA with transposition value set cardinality limits ultimately exhibits a linear behavior in the interleaver length.
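A schematic rendering of this candidate selection rule (ours; the cardinality limit is labeled $K$ here) simply replaces the full candidate range of the growth loop with a uniform random sample:

```python
import random

def sampled_candidates(N, K, rng=random):
    """Transposition value set cardinality limited design: examine only K
    candidate first transpositions, drawn uniformly from the full range
    [1, N] so that every FSP buffer location can still be "hit"."""
    if K >= N:
        return list(range(1, N + 1))       # below the limit, examine them all
    return rng.sample(range(1, N + 1), K)  # K distinct values, uniform over [1, N]
```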

V. CONCLUSION

In this paper, we have presented two novel modifications of an original algorithm developed in [1] for systematic design of implicitly prunable interleavers tailored to the constituent RSCs of the turbo code. The modifications change the complexity order of the interleaver design from $O(N_{\max}^4)$ to $O(N_{\max}^2)$, where $N_{\max}$ is the target interleaver length to be designed. These modifications, in addition to several minor modifications presented in the paper, significantly extend the range of applicability of IGA to larger interleavers. It can be reasonably assumed that the modified IGA can be used to design interleavers in the range from several hundreds to several thousands within very reasonable CPU times. Finally, we have presented a novel yet very effective modification, inspired by our observations of the behavior of the delay constrained interleaver designs presented in [5], [20], that ultimately changes the complexity of the interleaver design algorithm to linear in the interleaver length.

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and the anonymous reviewers for many useful suggestions that have improved the quality of our paper.

REFERENCES

[1] F. Daneshgaran and M. Mondin, "Design of interleavers for turbo codes: Iterative interleaver growth algorithms of polynomial complexity," IEEE Trans. Inf. Theory, vol. 45, no. 6, pp. 1845–1859, Sep. 1999.

[2] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo-codes," in Proc. IEEE Int. Conf. Communications, Geneva, Switzerland, May 1993, pp. 1064–1070.


[3] S. Benedetto and G. Montorsi, "Unveiling turbo codes: Some results on parallel concatenated coding schemes," IEEE Trans. Inf. Theory, vol. 42, no. 2, pp. 409–428, Mar. 1996.

[4] D. Divsalar and F. Pollara, "Turbo codes for PCS applications," in Proc. IEEE Int. Conf. Communications, Seattle, WA, May 1995, pp. 54–59.

[5] M. Campanella, G. Garbo, and S. Mangione, "Simple method for limiting delay of optimized interleavers for turbo codes," IEE Electron. Lett., vol. 36, no. 14, pp. 1216–1217, Jul. 2000.

[6] J. Hokfelt, O. Edfors, and T. Maseng, "A turbo code interleaver design criterion based on the performance of iterative decoding," IEEE Commun. Lett., vol. 5, no. 2, pp. 52–54, Feb. 2001.

[7] S. Dolinar and D. Divsalar, "Weight distribution of turbo codes using random and nonrandom permutations," Jet Propulsion Lab., Pasadena, CA, JPL TDA Progress Rep. 42-122, 1995.

[8] S. Crozier, "New high-spread high-distance interleavers for turbo codes," in Proc. 20th Biennial Symp. Communications, Kingston, ON, Canada, May 2000, pp. 3–7.

[9] F. Daneshgaran and M. Mondin, "Optimized turbo codes for delay constrained applications," IEEE Trans. Inf. Theory, vol. 48, no. 1, pp. 293–305, Jan. 2002.

[10] F. Said, A. H. Aghvami, and W. G. Chambers, "Improving random interleaver for turbo codes," IEE Electron. Lett., vol. 35, no. 25, pp. 2194–2195, Dec. 1999.

[11] C. Fragouli and R. D. Wesel, "Semi-random interleaver design criteria," in Proc. IEEE Global Communications Conf., vol. 5, Dec. 1999, pp. 2352–2356.

[12] H. Herzberg, "Multilevel turbo coding with a short latency," in Proc. IEEE Int. Symp. Information Theory, Ulm, Germany, Jun. 1997, p. 112.

[13] J. Hokfelt and T. Maseng, "Methodical interleaver design for turbo codes," in Proc. Int. Symp. Turbo Codes Related Topics, Brest, France, Sep. 1997, pp. 212–215.

[14] P. Robertson, "Illuminating the structure of code and decoder of parallel concatenated recursive systematic (turbo) codes," in Proc. IEEE Global Communications Conf., Dec. 1994, pp. 1298–1303.

[15] O. Takeshita and D. Costello, Jr., "New classes of algebraic interleavers for turbo-codes," in Proc. Int. Symp. Information Theory, Cambridge, MA, Aug. 1998, p. 419.

[16] K. Andrews, C. Heegard, and D. Kozen, "Interleaver design methods for turbo codes," in Proc. Int. Symp. Information Theory, Cambridge, MA, Aug. 1998, p. 420.

[17] J. Yuan, B. Vucetic, and W. Feng, "Combined turbo codes and interleaver design," IEEE Trans. Commun., vol. 47, no. 4, pp. 484–487, Apr. 1999.

[18] H. Ogiwara and Y.-J. Wu, "Code matched interleaver for parallel concatenated trellis coded modulation," in Proc. Int. Communications Conf., New York, Apr. 28–May 2, 2002.

[19] F. Daneshgaran, M. Laddomada, and M. Mondin, "Interleaver design for serially concatenated convolutional codes: Theory and application," IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1177–1188, Jun. 2004.

[20] G. Garbo and S. Mangione, "Some results on delay constrained interleavers for turbo codes," in Proc. SoftCOM, Workshop Channel Coding Techniques, Split–Dubrovnik, Croatia, and Ancona–Bari, Italy, Oct. 2001, pp. 17–24.

[21] M. Hall, Jr., The Theory of Groups, 2nd ed. New York: Chelsea, 1976.

[22] P. A. Meyer, Martingales and Stochastic Integrals I. Berlin, Germany: Springer-Verlag, 1972.

[23] R. Garello, P. Pierleoni, and S. Benedetto, "Computing the free distance of turbo codes and serially concatenated codes with interleavers: Algorithms and applications," IEEE J. Sel. Areas Commun., vol. 19, no. 5, pp. 800–812, May 2001.

[24] R. Garello, F. Chiaraluce, P. Pierleoni, M. Scaloni, and S. Benedetto, "On error floor and free distance of turbo codes," in Proc. Int. Conf. Communications, vol. 1, 2001, pp. 45–49.

Fred Daneshgaran (S'84–M'84) received the B.S. degree in electrical and mechanical engineering from California State University, Los Angeles (CSLA), in 1984, the M.S. degree in electrical engineering from CSLA in 1985, and the Ph.D. degree in electrical engineering from the University of California, Los Angeles (UCLA), in 1992.

From 1985 to 1987, he was an Instructor with the Department of Electrical and Computer Engineering (ECE) at CSLA. From 1987 to 1993, he was an Assistant Professor, from 1993 to 1996, he was an Associate Professor, and since 1997, he has been a full Professor with the ECE Department at CSLA. Since 1989, he has been the Chairman of the Communications Group of the ECE Department at CSLA. Additionally, from 1999 to 2001, he acted as the Chief Scientist for TechnoConcepts, Inc., where he directed the development of a prototype software-defined radio system, managed the hardware and software teams, and orchestrated the entire development process. In 2000, he co-founded EuroConcepts s.r.l., an R&D company specializing in the design of advanced communication links and software radio, where he is currently acting as the Chief Executive Officer and the Chief Technology Officer. In 1996, he founded Quantum Bit Communications, LLC, a consulting firm specializing in wireless communications. Since 1992, he has been a Research Consultant to the TLC group of the EE Department of the Politecnico di Torino, Torino, Italy, where he consulted on a variety of projects for both national and European-funded contracts, and conducted joint research on wavelets, coded modulation, channel coding, etc., with members of the Signal Analysis and Simulation (SAS) group.

Dr. Daneshgaran is currently serving as the Associate Editor of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS in the areas of modulation and coding, multirate and multicarrier communications, broadband wireless communications, and software radio.

Massimiliano Laddomada (S'00–M'04) was born in 1973. He received the degree in electronics engineering in 1999 and the Ph.D. degree in communications engineering in 2003 from the Politecnico di Torino, Torino, Italy.

He is currently an Assistant Professor with the Politecnico di Torino. From June 2000 to March 2001, he was a Visiting Researcher at California State University, Los Angeles, and a Consultant Engineer with Technoconcepts, Inc., Los Angeles, CA, a start-up company specializing in software radio. His research is mainly in wireless communications, especially modulation and coding, including turbo codes and, more recently, network coding.

Dr. Laddomada is currently serving as a member of the Editorial Board of IEEE Communications Surveys and Tutorials. He was awarded a five-year open-ended fellowship by the Ente per il Diritto allo Studio Universitario (E.D.S.U.) in recognition of his university career as an electronics engineer. In 2003, he was awarded the Premio Zucca per l'Innovazione nell'ICT from the Unione Industriale of Turin, Italy.

