*** UZURA3: MPEG1/LayerIII Encoder in FORTRAN90 ***

(Draft version : last updated : 2002-7-21)



(2002-7-21) UZURA3 Ver.0.4a source revised: Outer loop is rewritten. ATH modified. Parameters are modified. Fact chunk in wav file is skipped.
(2002-7-14) UZURA3 Ver.0.4a source revised: Added scalefactor_scale, pre-emphasis control. Modified subblock_gain control. Rewrote outer_loop. Sound quality improved.

(2002-7-11) UZURA3 Ver.0.3z source revised: ATH is changed again. Calculation method for the number of frames is modified.
(2002-7- 5) UZURA3 Ver.0.3y source revised: ATH is changed. Definitions for distortion and allowed distortion are modified.
(2002-7- 5) UZURA3 Ver.0.3x source revised: Subblock_gain is introduced. Parameters are modified.
(2002-7- 4) Added notes on subblock_gain and on mixed block.
(2002-7- 2) Added contents and MP1 encoder UZURA1.
(2002-7- 2) Added notes on the VBR algorithm and the RIO500 VBR file-tail-skip bug.
(2002-6-27) UZURA3 Ver.0.3w source revised: VBR is tuned. Added LINKs.
(2002-6-24) UZURA3 Ver.0.3v source revised: several corrections, parameters are tuned
(2002-6-24) English page draft is prepared



    Contents
  1. Source codes
  2. About execution parameters
  3. About program structure
  4. References
  5. LINKs
  6. Acknowledgements
  7. Appendix

*** Source Codes ***

** Download **

Compaq Visual FORTRAN V6.6A: source code for Compaq Visual FORTRAN V6.6A.
(faster I/O, non-standard, uses dfort.lib)
Fortran90 Standard: source code written within the Fortran 90 standard.
(Little/big endian is detected automatically, but I/O is slow. The unit of RECL for direct access files must be 1 byte, i.e. RECL=1 means 1 byte. Input is 'fort.wav', output is 'fort.mp3'.)

** About Source Codes **

Program dependency

crc.f90
    MODULE: mod_crc16
    Contents: CRC16
mpeg.f90
    MODULE: mod_mpg
    Contents: global types, parameters, variables
huffman.f90
    MODULE: mod_huffman
    Contents: Huffman code tables
filter.f90
    MODULE: mod_polyphase
    Contents: polyphase filter bank
mdct.f90
    MODULE: mod_mdct
    Contents: MDCT, anti-alias
psycho.f90
    MODULE: mod_psycho
    USE: mod_mpg
    Contents: ATH, long/short switch
inner_loop.f90
    MODULE: mod_inner_loop
    USE: mod_mpg, mod_huffman
    Contents: inner loop
layer3.f90
    MODULE: mod_layer3
    USE: mod_mpg, mod_psycho, mod_inner_loop
    Contents: alloc_bits, outer loop
standard_f90.f90 / windowsCVF.f90
    MODULE: wav_io, bit_io, arguments
    USE: mod_mpg, mod_crc, mod_layer3, mod_huffman
    Contents: low-level I/O, get_argument
encode.f90
    MODULE: mod_encode
    USE: bit_io, mod_mpg, mod_crc, mod_huffman, mod_layer3
    Contents: write mpg file
uzura3main.f90
    USE: arguments, wav_io, bit_io, mod_mpg, mod_polyphase, mod_psycho, mod_mdct, mod_layer3, mod_huffman, mod_encode
    Contents: main program

** Et cetera **

  1. Reserved words are written in CAPITAL letters, except in debugging parts.
  2. IMPLICIT NONE is always used, but the implicit naming convention is always obeyed.
  3. LOGICAL variable names begin with "q".

*** About Execution Parameters ***

** list of options **

** An example of screen output **


*** Outline of the Program ***

** Basic Structure of the Program **

    Main routine
  1. Pseudo code
    Initialization                 : calculate the number of frames  
    REPEAT
      Read wav file one frame      : read 16bit signed integer PCM data 
      Encode one frame             : transform PCM data to a character string consisting of '0' and '1'.
      Write mpg file one frame     : write bits according to the '01' string.
    UNTIL last frame
    STOP
    

    note: Without the reservoir, a frame becomes independent of other frames, which makes the encoding process far easier. Therefore the reservoir is not used here.

  2. meaning of parts

    Read wav file one frame

    The wav file contains 16bit signed integer PCM (Pulse Code Modulation) data. In this first part, the PCM data for one frame are read.
    In UZURA, the 16bit signed integers are returned as double precision real numbers, because the figures in the tables of the ISO document[1] are given to 9 digits, while single precision reals hold only 7 to 8 significant digits. The reason why the ISO tables are given to 9 digits is perhaps to handle 32bit integer PCM input (2^-31 = 4.66d-10). UZURA expects a 16bit PCM wav file ripped from a CD. The ISO document defines only the Encode part.
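
    The precision argument above can be checked with a few lines. This is only a sketch, written in Python rather than the Fortran 90 of the UZURA sources; the sample coefficient is made up (it is not an entry of the ISO tables), and the scaling of PCM samples by 2^-15 is an assumption, not something taken from the UZURA code.

```python
import struct

def to_single(x):
    """Round a double precision value to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pcm16_to_real(sample):
    """Convert a 16bit signed PCM sample to a real number (assumed scaling: 2**-15)."""
    return sample * 2.0 ** -15

# A made-up 9-digit value standing in for a table entry.
coeff = 0.123456789
single = to_single(coeff)

print(repr(coeff))    # double precision keeps all 9 digits
print(repr(single))   # single precision deviates after 7 to 8 digits
print(2.0 ** -31)     # resolution of 32bit integer PCM
```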

    Encode one frame

    In this encoding part, the PCM data are transformed into the frequency domain by applying the hybrid filter, which consists of a polyphase filter bank and an MDCT (Modified Discrete Cosine Transform) (array x). Next these data are quantized (array ix). Then the quantized data are compressed by Huffman coding.
    In UZURA, the Huffman coded bits are returned in the form of a string of '0' and '1' characters.

    Write mpg file one frame

    In this part, the Huffman coded data for a frame are written to a file together with the MPEG header and side information.
    In UZURA, the string of '0' and '1' characters is packed into bytes, 8 characters per byte, and written out.
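
    The packing of the '0'/'1' string into bytes can be sketched as follows (Python, not the actual UZURA routine). The function name pack_bits, the MSB-first bit order and the zero padding of an incomplete last byte are assumptions made for this illustration.

```python
def pack_bits(bitstring):
    """Pack a string of '0'/'1' characters into bytes, 8 characters per byte, MSB first."""
    # Pad the tail with '0' up to a multiple of 8 characters (an assumption of this
    # sketch; a complete MP3 frame is always a whole number of bytes).
    padded = bitstring + "0" * (-len(bitstring) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))

# 12-bit sync word, then ID = 1 (MPEG1), layer = 01 (Layer III), protection bit = 1.
frame_start = "111111111111" + "1" + "01" + "1"
print(pack_bits(frame_start).hex())   # prints "fffb"
```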

*** Structure of Encoding Part ***

    encode one frame
  1. Pseudo code
    Polyphase filter bank           : Matrix multiplication 
    MDCT                            : FFT
    Bit allocation                  
     Outer loop                     : Minimization problem of a function
       Inner loop                   : Solve an equation by iteration 
    Huffman code                    : Table look up
    RETURN

  2. meaning of parts
    Polyphase filter bank

    PCM data are transformed to the frequency domain. The polyphase filter bank consists of 32 band pass filters of equal width. In the ISO document[1], a prototype filter is given as a table. By shifting it, the 32 band pass filters are obtained. (A shift in the frequency domain corresponds to multiplication by a phase factor in the time domain, which seems to be why it is called a 'polyphase' filter bank.)
    In UZURA, this is done by matrix multiplication[2].
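
    The matrix multiplication step can be sketched as below (Python, for illustration only). The matrix M[i][k] = cos((2i + 1)(k - 16) pi / 64) is the analysis matrix of ISO/IEC 11172-3; the 64-element input y would normally be the windowed and folded PCM data, which needs the prototype filter table and is replaced here by a test cosine. The function name matrixing is an assumption.

```python
import math

# Analysis matrix of the ISO document: 32 subbands x 64 inputs.
M = [[math.cos((2 * i + 1) * (k - 16) * math.pi / 64.0) for k in range(64)]
     for i in range(32)]

def matrixing(y):
    """Return the 32 subband samples s = M y for a 64-element vector y."""
    return [sum(M[i][k] * y[k] for k in range(64)) for i in range(32)]

# Stand-in input: a cosine equal to row 5 of M (not real windowed PCM data).
y = [math.cos(11 * (k - 16) * math.pi / 64.0) for k in range(64)]
s = matrixing(y)

# The rows of M are mutually orthogonal, so only subband 5 responds.
strongest = max(range(32), key=lambda i: abs(s[i]))
print(strongest)   # prints 5
```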

    MDCT

    The outputs of the polyphase filter bank are further subdivided in frequency.
    In UZURA, an MDCT of length N is implemented as an FFT of length N/4[3].(proof) A radix-3 FFT is required here.
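
    For reference, the MDCT can also be evaluated directly from its definition, X(k) = Sum_n x(n) cos( pi/(2N) (2n + 1 + N/2)(2k + 1) ). The sketch below (Python) does exactly that in O(N^2) operations; it is not the FFT-based method used in UZURA, and the window function is left out. N = 36 corresponds to a long block.

```python
import math

def mdct(x):
    """Direct O(N^2) MDCT: N input samples give N/2 coefficients."""
    n = len(x)
    half = n // 2
    return [sum(x[i] * math.cos(math.pi / (2.0 * n) * (2 * i + 1 + half) * (2 * k + 1))
                for i in range(n))
            for k in range(half)]

# Test input for a long block (N = 36): the basis function of coefficient k0 = 4.
N, k0 = 36, 4
x = [math.cos(math.pi / (2.0 * N) * (2 * i + 1 + N // 2) * (2 * k0 + 1)) for i in range(N)]
coeffs = mdct(x)

# The basis functions are orthogonal, so only coefficient 4 is non-zero (value N/2).
peak = max(range(N // 2), key=lambda k: abs(coeffs[k]))
print(peak, round(coeffs[peak], 6))
```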

    Bit allocation

    The data x(576) coming out of the hybrid filter are quantized as ix(576). This routine has a double loop structure, i.e. an outer loop and an inner loop.

    1. Pseudo code
      Initialization                  :
      obtain allowable distortion
      REPEAT                          : [Outer loop] search best scale factor 
       set scale factor                       
       REPEAT                         : [Inner loop] decide quantization step
        set quantization step
        calculate required bits
       UNTIL (required bits < allowed bits)
       calculate distortion
      UNTIL (exit condition is satisfied)
      RETURN

    2. The outer loop is essentially a minimization problem of a function. It reduces to a search in the scale factor space for the set of values that minimizes a quantity called 'distortion'. Here we have to define two things. One is the scalar value 'distortion', which mathematically is the definition of a 'norm' in the x-space. The other is a search algorithm in the scale factor space that is expected to decrease the distortion.
      In UZURA, the example in the ISO document is adopted.

      (Although only the case of the long block is given hereafter, the cases of the short and mixed blocks can be obtained along the same lines.)

      Definition of 'norm' in the 576-dimensional x-space

      The Euclidean norm is chosen. Originally it was renormalized by the width of each scale factor band; since 2002-7-21 it is not weighted.

      
      norm_long = 0.0  
      DO iscfb = 0, 21
       tmp = 0.0
       bw = iend(iscfb) - istart(iscfb) + 1   : band width 
       DO i = istart(iscfb), iend(iscfb)      
        tmp = tmp + | x(i) / scale_factor(iscfb) |^2
       END DO
       norm_long =  norm_long + tmp           : non-weighted Euclidean norm
       (until 2002-7-21: norm_long =  norm_long + tmp / bw, i.e. normalized with band width)
      END DO
      norm_long = SQRT(norm_long)
      (2002-7- 8) distortion is now scaled by 'scale_factor'
      (2002-7-21) norm is now not weighted by the scalefactor band widths (due to the change of the ATH function)

      Searching algorithm in the 21-dimensional scale factor space

      The square sums of the quantization noise and of the allowed noise are calculated for each scale factor band. If the square sum of the quantization noise for a scale factor band is larger than that of the allowed noise, the scale factor of that band is increased by 1.

      
      DO iscfb = 0, 21 
       dx(iscfb) = 0.0
       dt(iscfb) = 0.0
       DO i = istart(iscfb), iend(iscfb)      
        dx(iscfb) = dx(iscfb) + |x(i) - x'(i)|^2 
        dt(iscfb) = dt(iscfb) + |th(i)|^2 
       END DO
       IF ( dx(iscfb) > dt(iscfb) ) THEN ds(iscfb) = 1 ELSE ds(iscfb) = 0
       scale_factor(iscfb) = scale_factor(iscfb) + ds(iscfb)
      END DO
      Here x' is defined as x'(i) = SIGN(ix) * |ix(i)|^(4/3) * 2^( (qquant + quantanf) / 4 ).
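
      One step of this search can be written as a small function. The following is a sketch in Python, not the UZURA code; the function name and the tiny two-band example are invented for illustration, and xq stands for the dequantized spectrum x'.

```python
def update_scale_factors(x, xq, th, istart, iend, scale_factor):
    """Raise by 1 the scale factor of every band whose noise exceeds the allowed noise.

    x: spectrum, xq: dequantized spectrum x', th: allowed noise per spectral line.
    scale_factor is modified in place; the list ds is returned.
    """
    ds = []
    for band in range(len(istart)):
        lines = range(istart[band], iend[band] + 1)
        dx = sum((x[i] - xq[i]) ** 2 for i in lines)   # square sum of quantization noise
        dt = sum(th[i] ** 2 for i in lines)            # square sum of allowed noise
        step = 1 if dx > dt else 0
        scale_factor[band] += step
        ds.append(step)
    return ds

# Invented example: two bands of two lines each; only the second band is too noisy.
x = [1.0, 1.0, 1.0, 1.0]
xq = [1.0, 1.0, 0.5, 0.5]
th = [0.1, 0.1, 0.1, 0.1]
sf = [0, 0]
ds = update_scale_factors(x, xq, th, [0, 2], [1, 3], sf)
print(ds, sf)

# The outer loop has converged when no scale factor was raised.
converged = all(d == 0 for d in ds)
```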

      Exit condition from the outer loop

      Basically there are two exit conditions from the outer loop.
      One is when the s-vector (the set of scale factors) in the scale factor space has converged; that is, when the condition ds(iscfb) = 0 is satisfied for all scale factor bands while applying the above algorithm. This means that the quantization noise has become small enough in all scale factor bands.
      The other is when the s-vector leaves the region of the scale factor space allowed by the ISO document while applying the above algorithm. In this case, the s-vector with the least distortion along the search path is returned.

    3. The inner loop reduces to a problem of solving an equation by iteration. The purpose of this part is to obtain the minimum quantization step (quant) with which the required bits do not exceed the allowed bits. The smaller the quantization step is, the smaller the quantization distortion may become. The problem to be solved can be written as required_bits(quant_min) <= allowed_bits. Here, the function required_bits(quant) is globally a decreasing function. Apart from some exceptional cases, it can be regarded as a monotonically decreasing function. With this assumption, when quant is searched upward from a small value, the above inequality becomes satisfied at some point, and that is the wanted value.
      In UZURA, this is solved by the bisection method. Because the problem is an integer inequality, the employed method is slightly modified from the general form.

    Huffman code

    In this part, the quantized data ix are compressed by Huffman coding. This is essentially a straightforward process of table lookup.
    In UZURA, this is implemented after the example in the ISO document.

    ** Psychoacoustic analysis **

    Because of the physical limits of the human body, there are sounds that are audible in principle and sounds that are not. Because a sound masks other sounds near it in both the frequency and the time domain, sounds that are audible in principle often go unnoticed by our consciousness. On the other hand, physically non-existent sounds are sometimes heard. Psychoacoustic analysis is the study of such effects. By utilizing these characteristics, the required information can be reduced while keeping the sound quality as we perceive it.

    Here, the minimal psychoacoustic effects required for implementing an encoder are described. There are four things that should be decided by psychoacoustic analysis.

    1. Long/Short switching
    2. Masking
    3. NS(Normal-Stereo)/MS(Mid-Side-stereo) switching (joint stereo)
    4. Calculation of allowed distortion
    Usually, an FFT calculation is done for psychoacoustic analysis independent of the hybrid filter.
    In UZURA these things are decided without using an FFT for simplicity. By doing so, the program becomes far simpler and clearer.

    long/short switching

    The main purpose of long/short switching is to prevent 'pre-echo' when the sound rises suddenly. By reducing the block length, the spread of the quantization noise in the time domain is shortened, while the required bits in the frequency domain increase. The selection of the long/short/mixed block must be made before the MDCT.
    In UZURA, it is decided on a subband basis, just after the polyphase filter. A switch to the short block occurs when the square sum of the subband intensities within a granule has increased strongly from the previous granule. If Sum|Subband_present|^2 > Sum|Subband_previous|^2 * switch is satisfied, the short block is chosen.
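
    The decision rule can be sketched in a few lines (Python). The function name and the value 2.0 used for the threshold factor 'switch' are placeholders chosen for the example; the value actually used in UZURA is not given here.

```python
def use_short_block(subband_present, subband_previous, switch):
    """True if the subband energy of this granule jumped by more than the factor 'switch'."""
    energy_present = sum(s * s for s in subband_present)
    energy_previous = sum(s * s for s in subband_previous)
    return energy_present > energy_previous * switch

quiet = [0.1, 0.1, 0.1, 0.1]    # invented subband samples of the previous granule
attack = [1.0, 1.0, 1.0, 1.0]   # invented subband samples of the present granule

print(use_short_block(attack, quiet, 2.0))    # sudden rise: short block
print(use_short_block(attack, attack, 2.0))   # steady signal: long block
```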

    Masking

    In the masking process, sounds that physically exist but are psychoacoustically inaudible are omitted from the data. This may have to be done before the NS/MS switching.
    In UZURA, only masking by the ATH (Absolute Threshold of Hearing) is applied. This corresponds to omitting sounds that are inaudible in principle.

    NS/MS switching

    Generally speaking, more or less the same sounds reach the right and left ears. By utilizing this correlation between the L and R channels, the required information can be reduced. In MPEG/Layer 3, a transformation to the average of the channels (Mid channel) (L+R)/SQRT(2) and to the difference (Side channel) (L-R)/SQRT(2) can be used for that purpose. The choice between normal stereo and MS stereo may have to be made before calculating the allowed distortion (noise).
    In UZURA, this is decided on the data after the MDCT, with reference to annex G of the ISO document. If Sum( ABS(|L|^2-|R|^2) ) < Sum( |L|^2 + |R|^2 ) * xms is satisfied, MS stereo is used.
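
    The criterion and the Mid/Side transformation can be sketched as below (Python). The function names and the value 0.5 for the parameter xms are placeholders for the example, and the spectra are invented.

```python
import math

def use_ms_stereo(left, right, xms):
    """True if the L and R spectra are similar enough for MS stereo."""
    diff = sum(abs(l * l - r * r) for l, r in zip(left, right))
    total = sum(l * l + r * r for l, r in zip(left, right))
    return diff < total * xms

def to_mid_side(left, right):
    """Mid = (L+R)/SQRT(2), Side = (L-R)/SQRT(2)."""
    root2 = math.sqrt(2.0)
    mid = [(l + r) / root2 for l, r in zip(left, right)]
    side = [(l - r) / root2 for l, r in zip(left, right)]
    return mid, side

left = [1.0, 0.5]     # invented spectral lines after the MDCT
right = [0.9, 0.5]
if use_ms_stereo(left, right, 0.5):
    mid, side = to_mid_side(left, right)
    print(mid, side)  # nearly all of the energy ends up in the Mid channel
```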

    allowed distortion

    To search for the best scale factors in the outer loop, the allowed distortion (noise) is required. Therefore, this quantity must be obtained before the outer loop is called. It should essentially be the same as the masking threshold. However, good results cannot be expected by using the ATH alone for this purpose.
    In UZURA, it is assumed that the allowed distortion is linear in the intensity when both are expressed in dB, and that the coefficients are independent of frequency. In short, dX[dB] = A * X[dB] + B. This corresponds to the limiting case where the width of the spreading function of masking is zero. Converting from dB and taking the ATH as a lower bound, it can be rewritten as th(i) = MAX( a * |x(i)|^p, ath(i) ) (a = 10^(B/20), p = A).
    (2002-7- 8) modified to th(i) = a * |x(i)|^p
    (2002-7-21) changed back to the original form th(i) = MAX( a * |x(i)|^p, ath(i) ) (due to new ATH)
    In the program, a and p are chosen as system parameters. Considering the effect of the ATH, the equation th(i) = MAX( a * |x(i)|^p, ath(i) ) is adopted for the estimation of the allowed distortion.
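
    The estimate th(i) = MAX( a * |x(i)|^p, ath(i) ) is easy to write down. In the Python sketch below, the values A = 1 and B = -20 dB are placeholders (the system parameters actually used are not stated here), and the spectrum and ATH values are invented.

```python
def allowed_distortion(x, ath, a, p):
    """th(i) = MAX( a * |x(i)|**p, ath(i) ) for every spectral line."""
    return [max(a * abs(xi) ** p, ti) for xi, ti in zip(x, ath)]

# Placeholder parameters: allowed noise 20 dB below the signal (A = 1, B = -20).
A, B = 1.0, -20.0
a = 10.0 ** (B / 20.0)   # a = 0.1
p = A

x = [1.0, 0.001]         # one strong and one very weak spectral line
ath = [0.01, 0.01]       # invented ATH values in the same units as x
print(allowed_distortion(x, ath, a, p))
# The strong line gets a * |x|, the weak line is limited from below by the ATH.
```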


    ** ET CETERA **

    About not using the 'reservoir'

    Because the use of the reservoir breaks the independence of frames, I don't think it is a good idea.

    About not using the 'scale factor selection information'

    I've never considered it.

    About not using the 'intensity stereo'

    I heard that when the wavelength becomes shorter than the diameter of a head, the diffraction of the wave can be ignored and the stereophonic effect is well determined by the intensity ratio between the right and left ears...but...

    About ATH function

    ATH function is often given as,
    ATH(f[kHz])[dB] = 3.64 f^-0.8 - 6.5 exp(-0.6(f - 3.3)^2) + 0.001 f^4.
    This function rises at the low and high frequency ends as powers of f and has a Gaussian dip around 3.3kHz. (It seems to me that this function was decided by drawing straight lines on log paper at both ends. And the Gaussian is one of the two most famous functions with a symmetrical peak. [The other one is the Lorentzian.]) It is known that this ATH function does not reproduce the actual ATH. The LAME group pointed out that encoder output is improved by replacing this ATH function with a more accurate function. (quality - what's 'athtype 3')

    In UZURA, the ATH function is decided according to the LAME group's results and my own ATH measurement. (The error bar of the measurement is supposed to be over 10dB.)
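
    The commonly quoted formula can be evaluated directly. The Python sketch below implements only that textbook formula, not the modified ATH used in UZURA; the function name ath_db and the sample frequencies are chosen for the example.

```python
import math

def ath_db(f_khz):
    """Commonly quoted ATH formula in dB; f_khz is the frequency in kHz (must be > 0)."""
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 0.001 * f_khz ** 4)

# The curve rises towards both ends and dips around 3.3 kHz.
for f in (0.1, 1.0, 3.3, 10.0, 16.0):
    print(f, round(ath_db(f), 2))
```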

    I am not sure how to decide the absolute value of the ATH function. It is expected that with 16bit PCM data, because of 2^-15 = 3.05d-5 = -90.3dB, the bottom of the ATH is near -90dB[3]. But I am not sure. It seems to be around -90~-120dB from experience.



    About scalefactor_scale, preemphasis, subblock_gain

    Scalefactor_scale shows whether scale factors are given as powers of SQRT(2) [0] or 2 [1].

    x = x * sqrt(2)^( (1 + scalefactor_scale) * scale_factor(scfb) )
    
    When scalefactor_scale = 0, finer tuning is possible, while when scalefactor_scale = 1, a broader dynamic range can be obtained.
    In UZURA, according to ISO document C.1.5.4.3, the scale factor space is first searched with scalefactor_scale = 0; if convergence is not obtained, the scale factor space is searched again with scalefactor_scale = 1.

    Preemphasis is defined only for long blocks. It is a sort of offset of the scale factors for high frequency bands and is defined as,
    xr=SIGN(ix)*|ix|^(4/3)*2^((global_gain[gr] - 210) / 4) * 2^-(scalefac_multiplier*(scalefac_l[gr,ch,sfb] + preflag[gr,?ch?]*pretab[sfb])).
    In UZURA, according to ISO document C.1.5.4.3.4, if all of the scale factor bands 17 to 20 exceed the allowed distortions after the first call of the inner loop, preemphasis is used.

    subblock_gain(3) is defined only for short blocks. It is a sort of scale factor for each window and is defined as,
    xr=SIGN(ix)*|ix|^(4/3)*2^((global_gain[gr] - 210 - 8 * subblock_gain[gr,window]) / 4) * 2^-(scalefac_multiplier*scalefac_s[gr,ch,sfb,window]).
    In UZURA, the subblock_gain is used when the scale_factor has reached its maximum, so that the intensities of the three windows are averaged out.

    These procedures are in SUBROUTINE outer_loop.
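
    The two requantization formulas above can be written out for a single spectral line as follows. This is a Python sketch for illustration, not part of UZURA; the function names and the flat argument lists are assumptions, and scalefac_multiplier is taken as 0.5 for scalefactor_scale = 0 and 1 for scalefactor_scale = 1, matching the powers of SQRT(2) or 2 described above.

```python
def _multiplier(scalefac_scale):
    """0.5 gives steps of SQRT(2), 1.0 gives steps of 2."""
    return 0.5 if scalefac_scale == 0 else 1.0

def requantize_long(ix, global_gain, scalefac_scale, scalefac_l, preflag, pretab_sfb):
    """Long block: xr for one line with quantized value ix."""
    sign = -1.0 if ix < 0 else 1.0
    gain = 2.0 ** ((global_gain - 210) / 4.0)
    scale = 2.0 ** (-_multiplier(scalefac_scale) * (scalefac_l + preflag * pretab_sfb))
    return sign * abs(ix) ** (4.0 / 3.0) * gain * scale

def requantize_short(ix, global_gain, subblock_gain, scalefac_scale, scalefac_s):
    """Short block: subblock_gain lowers the gain of one window in steps of 2**-2."""
    sign = -1.0 if ix < 0 else 1.0
    gain = 2.0 ** ((global_gain - 210 - 8 * subblock_gain) / 4.0)
    scale = 2.0 ** (-_multiplier(scalefac_scale) * scalefac_s)
    return sign * abs(ix) ** (4.0 / 3.0) * gain * scale

print(requantize_long(8, 210, 0, 0, 0, 0))   # 8**(4/3), about 16
print(requantize_short(-8, 210, 1, 0, 0))    # about -16 / 4 = -4
```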

    [OUTER LOOP]
    LOOP1 :DO scalefactor_scale = 0, 1
     LOOP2 : DO subblock_gain = (0,0,0), (7, 7, 7)
      LOOP3 : DO scalefactor = (0...0), (15....7...) ! scale factor loop  (outer loop)
               CALL inner_loop
               CALL calc_distortion
               save best parameters
               IF (converged) EXIT LOOP1
               CALL check_preemphasis_on?
               IF (preemphasis_on) CYCLE LOOP3
               CALL increase_scalefactor
               IF (scale_factor reached max) EXIT LOOP3
      END DO LOOP3
      IF ( subgain reached max.) EXIT LOOP2
     END DO LOOP2
     IF ( scalefactor_scale ) EXIT LOOP1
    END DO LOOP1
    load best parameters
    RETURN
    
    About mixed block

    There are three points to be decided about the switching of the block length.

    1. Selection of window shape (Normal/Start/Short/Stop)
    2. Selection of MDCT length (long/short)
    3. Take or Do not take anti-alias-reduction
    These points must be decided for each granule and subband.

    In section 2.4.2.7 of the ISO document (p26) 'block_type', long block is defined as

    In the case of long blocks (block_type not equal to 2 or in the lower subband of block_type 2 if the mixed_block_flag is set) ....
    And short block is defined as
    In the case of short blocks (in the upper subbands of a type 2 block if the mixed_block_flag is set, or in all subbands of a type 2 block if mixed_block_flag is not set) ....

    On the other hand, in section 2.4.3.4.10.1 'Alias reduction', it is written that

    For long block_type granules (block_type != 2) the input to the synthesis filterbank is processed for alias reduction before processing by the IMDCT.
    ....
    Alias reduction is not applied for granules with block_type == 2 (short block)....
    The definitions of the long block and the short block do not match. If we follow these lines, in the case of mixed_block_flag == 1 && block_type == 2, encoders should not apply anti-alias reduction to subbands 0 and 1.

    For several reasons, however, it is more reasonable to think that mixed blocks were overlooked in section 2.4.3.4.10.1, and that the definition of long/short block should be the first one. Therefore anti-alias reduction should be applied to subbands 0-1 in the case of mixed_block_flag == 1 && block_type == 2. With this definition, the conditions for points 2 (MDCT length) and 3 (anti-alias) become the same as the definition of long/short block.

    Although the definition of long/short block is enough for points 2 and 3, it is not enough for point 1 (window selection). If one considers a switch between a normal long block (window_switching_flag == 0) and a mixed block (block_type == 2 && mixed_block_flag == 1), it is reasonable to apply the normal window to subbands 0-1 even when block_type is start/stop. From the definition of the long/short block, the start and stop blocks are always long blocks regardless of the state of the mixed_block_flag. Considering the combinations of the flags (window_switching_flag, block_type, mixed_block_flag), it seems most reasonable to select windows as follows,

    1. (0, -, -) long block; normal window
    2. (1, 1, 0) long block; start window
    3. (1, 1, 1) long block; normal window (0-1), start window (2-31)
    4. (1, 2, 0) short block; short window
    5. (1, 2, 1) long block (0-1), short block (2-31); normal window (0-1) short (2-31)
    6. (1, 3, 0) long block; stop window
    7. (1, 3, 1) long block; normal window (0-1), stop window (2-31)

    In UZURA, this rule is applied. By the way, because I started writing this WEB page without checking the definition of 'block', the terms 'long block/ short block/ mixed block' are used rather arbitrarily. Please keep this in mind.(^^;
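
    The seven cases of the rule above can be expressed as one small function. The Python sketch below only restates that list; the function name and the representation as 32 (block, window) pairs, one per subband, are choices made for this illustration.

```python
def select_windows(window_switching_flag, block_type, mixed_block_flag):
    """Return 32 (block, window) pairs, one for each subband 0-31."""
    if window_switching_flag == 0:
        return [("long", "normal")] * 32
    window = {1: "start", 2: "short", 3: "stop"}[block_type]
    block = "short" if block_type == 2 else "long"
    result = [(block, window)] * 32
    if mixed_block_flag == 1:
        # Subbands 0-1 always stay long blocks with the normal window.
        result[0] = result[1] = ("long", "normal")
    return result

# Case 5 of the list: mixed block.
mixed = select_windows(1, 2, 1)
print(mixed[0], mixed[1], mixed[2], mixed[31])
```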

    About VBR

    Nothing is written about VBR in the ISO document. However, since the ISO document does not forbid changing the mpeg header parameters within a file, the bit rate may be changed frame by frame as one wishes. And this seems to work.
    In UZURA, the bitrate is decided by the strength of the 'Psychoacoustic Moment'. The 'Psychoacoustic Moment (PM)' is defined as PM = Sum( f * |x(f)|^2 ) / Sum( |x(f)|^2 ), where f is the frequency and x(f) the spectral intensity at f. Encoders often run short of bits when strong peaks rise in the high frequency region. The 'Psychoacoustic Moment' is a quantity I defined myself for the purpose of detecting such situations.
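
    The definition of PM amounts to a power-weighted mean frequency, as the following Python sketch shows. How UZURA maps PM to a bitrate is not described here and is not part of the sketch; the function name, the return value 0 for a silent frame and the two-line example spectra are assumptions.

```python
def psychoacoustic_moment(freqs, x):
    """PM = Sum( f * |x(f)|^2 ) / Sum( |x(f)|^2 ): the power-weighted mean frequency."""
    power = [v * v for v in x]
    total = sum(power)
    if total == 0.0:
        return 0.0   # silent frame: PM is undefined, 0 is an arbitrary choice
    return sum(f * p for f, p in zip(freqs, power)) / total

freqs = [1000.0, 10000.0]                          # invented two-line spectrum, in Hz
print(psychoacoustic_moment(freqs, [1.0, 0.1]))    # energy mostly at low frequency
print(psychoacoustic_moment(freqs, [0.1, 1.0]))    # strong high frequency peak: larger PM
```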

    About RIO500 tail-of-VBR-file skip bug
    It is known that there is a tail-of-VBR-file-skip bug in firmware Ver.2.15 of the MP3 portable player RIO500 (Diamond multimedia Inc./Sonic Blue Inc.). In the course of experiments on VBR files, I found that the amount of the skip is roughly proportional to the bit-rate of the first frame of the MP3 file. By forcing the bit-rate of the first frame to 32kbps, the skip becomes practically negligible. I would like to report on this in the near future (maybe...).

    (added 2002-6-28)
    It was found that, in RIO500 firmware 2.15 and 2.16, the playing time of a file is decided by the bit_rate of the first frame and the size of the mp3 file. Therefore it is not necessary for the bit_rate of the first frame to be 32kbps. It is enough if the bit_rate of the first frame is less than the average bit_rate of the file.

    I uploaded, under this directory, 14 files encoded at 128kbps except for the first frames, whose bit rates range from 32kbps to 320kbps. One can check the above-mentioned behavior by playing these files while displaying "remaining time" on the RIO500.

    About the ISO documents

    Although the ISO document is written in English, numbers are written in the French/German style (the decimal point is a comma, not a period). The ISO document consists of a normative part and an informative part. Section 2 and Annexes A and B are normative and describe the decoder. Annex C describes the implementation of an encoder as a reference example.

    About DIST10

    DIST10 is the sample reference MPEG/Audio encoder/decoder code by ISO. DIST10 can be found on the net easily. Reading the code without the ISO document, MP1 can be understood because of its simplicity, MP2 may not be understood, and MP3 is probably impossible to understand. The main routines of the MP1/MP2 encoders were written by Davis Pan, and the code is not very C-ish, so it is readable for a non-C user. The MP3 encoder was written by many people and is quite C-ish; it is impossible for a non-C user to read.


    • [1] ISO/IEC 11172-3 Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s - Part 3: Audio (First edition) (1993).
    • [2] D.Y.Pan, "Digital Audio Compression", Digital Technical Journal 5(1993) 1.
    • [3] B. Lincoln, "An Experimental High Fidelity Perceptual Audio", (1998)

    *** References ***

      In Japanese
    • CH3 and Shou SHIOKA 'Sound Format MP3: Special', C magazine (Softbank Publishing Co.) 1998-3 p26. : Basic articles in Japanese. Mr. CH3 is the author of 'SCAMP'.
    • Jin MIYAZAKI 'Special: Interface around PC. III', Transistor Technology Special No.72, (CQ Publishing): The WAVE file format is explained in chapter 1-6 and the MP3 file format in chapter 1-7.
    • Atsushi KOSUGI : 'Construction of a system equivalent to MP3 and a design of subband filter bank' Interface 2001/8-2002/2 (CQ publishing): Unfortunately I haven't read the original articles; however, there is a WEB page by the author with equivalent contents.

      In English
    • ISO/IEC 11172-3 Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s - Part 3: Audio (First edition) (1993).: The ISO document. It is sold in printed form and does not exist on the net. But there is a draft version(?) at MP3' Tech.
    • K.Brandenburg and H.Popp, 'An introduction to MPEG Layer-3': Fraunhofer's introductory article.
    • Karlheinz Brandenburg, 'Mp3 And AAC Explained': An introductory paper by Karlheinz Brandenburg, a leading figure at Fraunhofer.
    • Davis Pan, 'Digital Audio Compression': A paper written by Davis Pan. More general than the paper below, but easier.
    • Davis Pan, 'A Tutorial on MPEG/Audio Compression': Intermediate level paper. Quite useful. We love Davis Pan!

    *** LINKS ***

      Japan
    • Common room on compressed music: A WEB page hosted by Mr. Karajan-kyo. The quantitative laboratory evaluation of many music formats is quite interesting. The BBS is active and well mannered.
    • Charge! -- A page for signal processing and C programmer --: There is a minimal implementation of an MP3 encoder/decoder using a single Huffman table, in clean C code, by Mr. A. Kosugi. It is quite enlightening.
    • efu's page: High-level studies and free software on WAV files are presented. The spectral analyzer 'WaveSpectra' and the test signal generator 'WaveGene' are very useful tools.
    • Project9k: A high-quality sound player 'Lilith' is developed here. The original MP3 decode engine of 'Lilith' produces crystal clear sound.
    • The Electronic Lives Manufacturing: A WEB page hosted by Mr. ChaN. There is a fascinating page about a hand-made MP3 player kit under 'Electronics Handicrafts'/'Audio processing'. I built a kit from Wakamatsu Tsusho myself. The sound quality was better than the RIO500. (But I happened to step on it and broke it! ^^;)

      non-Japan
    • ISO (International Organization for Standardization) : ISO documents can be bought from here. MPEG/Audio document is 200 Swiss Franc without shipping charge.
    • MP3' Tech: A WEB page hosted by Mr. Gabriel Bouvigne. Many papers and source codes of encoders/decoders are collected under Programmer's corner. He himself has implemented the minimal MP3 encoder Shine.
    • The LAME Project: The WEB page of the LAME project. No need for explanation. Much useful documentation and information.
    • MAD: MPEG Audio Decoder: A WEB page hosted by Robert Leslie. His high quality MPEG Audio Decoder is presented. A Winamp plugin exists. (A bit softer sound compared to Lilith.)


    *** Acknowledgement ***

    I am indebted to many people for making UZURA.

    I thank the people at the BBS of the site hosted by Mr. Karajan-kyo. (Especially I am obliged to Mr. /|/|, Mr. Karajan-kyo, Mr. Shibata, Mr. Tominaga, and Mr. Nekojiro (in order of A.I.U.E.O.) in many ways.)

    I thank Mr. efu for his programs 'WaveSpectra' and 'WaveGene'. These are quite useful tools for tuning UZURA. Without them I could not debug UZURA at all.

    I thank Mr. Gabriel Bouvigne for uploading UZURA to his site and for many kind responses. UZURA is inspired by his minimal MP3 encoder Shine.

    I thank Mr. Robert Leslie for giving me information about mixed block.


    +++ Appendix +++

    -Me and UZURA and MP3 (in Japanese)

    -MP1 encoder UZURA1
    The relation of 'dX = AX + B' is used for allowed distortion in psychoacoustic routine.
    MP1 Encoder UZURA1 for CVF
    MP1 Encoder UZURA1 (Standard F90)




    This page is link free.

    kitaurawa@lycos.jp
