Vs 1053
VS1053b - Ogg Vorbis/MP3/AAC/WMA/FLAC/MIDI AUDIO CODEC CIRCUIT
Features Description
Contents
VS1053
Table of Contents
List of Figures
1 Licenses
2 Disclaimer
3 Definitions
7 SPI Buses
7.1 SPI Bus Pin Descriptions
7.1.1 VS10xx Native Modes (New Mode, recommended)
7.1.2 VS1001 Compatibility Mode (deprecated, do not use in new designs)
7.2 Data Request Pin DREQ
7.3 Serial Protocol for Serial Data Interface (SPI / SDI)
7.3.1 SDI in VS10xx Native Modes (New Mode, recommended)
7.3.2 SDI Timing Diagram in VS10xx Native Modes (New Mode)
7.3.3 SDI in VS1001 Compatibility Mode (deprecated, do not use in new designs)
7.3.4 Passive SDI Mode (deprecated, do not use in new designs)
7.4 Serial Protocol for Serial Command Interface (SPI / SCI)
7.4.1 SCI Read
7.4.2 SCI Write
7.4.3 SCI Multiple Write
7.4.4 SCI Timing Diagram
7.5 SPI Examples with SM_SDINEW and SM_SDISHARED set
7.5.1 Two SCI Writes
7.5.2 Two SDI Bytes
7.5.3 SCI Operation in Middle of Two SDI Bytes
9 Functional Description
9.1 Main Features
9.2 Data Flow of VS1053b
9.3 EarSpeaker Spatial Processing
9.4 Serial Data Interface (SDI)
9.5 Serial Control Interface (SCI)
9.6 SCI Registers
9.6.1 SCI_MODE (RW)
9.6.2 SCI_STATUS (RW)
9.6.3 SCI_BASS (RW)
9.6.4 SCI_CLOCKF (RW)
9.6.5 SCI_DECODE_TIME (RW)
9.6.6 SCI_AUDATA (RW)
9.6.7 SCI_WRAM (RW)
9.6.8 SCI_WRAMADDR (W)
9.6.9 SCI_HDAT0 and SCI_HDAT1 (R)
9.6.10 SCI_AIADDR (RW)
9.6.11 SCI_VOL (RW)
9.6.12 SCI_AICTRL[x] (RW)
10 Operation
10.1 Clocking
10.2 Hardware Reset
10.3 Software Reset
10.4 Low Power Mode
10.5 Play and Decode
10.5.1 Playing a Whole File
10.5.2 Cancelling Playback
10.5.3 Fast Play
10.5.4 Fast Forward and Rewind without Audio
10.5.5 Maintaining Correct Decode Time
10.6 Feeding PCM Data
10.7 Ogg Vorbis Recording
10.8 PCM / ADPCM Recording
10.8.1 Activating PCM / ADPCM Recording Mode
10.8.2 Reading PCM / IMA ADPCM Data
10.8.3 Adding a PCM RIFF Header
10.8.4 Adding an IMA ADPCM RIFF Header
10.8.5 Playing ADPCM Data
10.8.6 Sample Rate Considerations
10.8.7 Record Monitoring Volume
10.9 SPI Boot
10.10 Real-Time MIDI
11 VS1053b Registers
11.1 Who Needs to Read This Chapter
11.2 The Processor Core
11.3 VS1053b Hardware DAC Audio Paths
11.4 VS1053b Hardware ADC Audio Paths
11.5 VS1053b Memory Map
11.6 SCI Hardware Registers
11.7 Serial Data Interface (SDI) Registers
11.8 DAC Registers
11.9 PLL Controller
11.10 GPIO
11.11 Interrupt Control
11.12 UART (Universal Asynchronous Receiver/Transmitter)
11.12.1 UART Registers
11.12.2 Status UART_STATUS
11.12.3 Data UART_DATA
11.12.4 Data High UART_DATAH
11.12.5 Divider UART_DIV
11.12.6 UART Interrupts and Operation
11.13 Timers
11.13.1 Timer Registers
11.13.2 Configuration TIMER_CONFIG
11.13.3 Configuration TIMER_ENABLE
11.13.4 Timer X Startvalue TIMER_Tx[L/H]
11.13.5 Timer X Counter TIMER_TxCNT[L/H]
11.13.6 Timer Interrupts
11.14 I2S DAC Interface
11.15 Analog-to-Digital Converter (ADC)
11.16 Resampler SampleRate Converter (SRC)
11.17 Sidestream Sigma-Delta Modulator (SDM)
12 Version Changes
12.1 Changes Between VS1033c and VS1053a/b Firmware, 2007-03-08
14 Contact Information
List of Figures
1 Pin configuration, LQFP-48.
2 VS1053b in LQFP-48 packaging.
3 Typical connection diagram using LQFP-48.
4 SDI in VS10xx Native Mode, single-byte transfer
5 SDI in VS10xx Native Mode, multi-byte transfer, X ≥ 1
6 SDI timing diagram
7 SDI in VS1001 Mode - one byte transfer. Do not use in new designs!
8 SDI in VS1001 Mode - two byte transfer. Do not use in new designs!
9 SCI word read
10 SCI word write
11 SCI multiple word write
12 SPI timing diagram
13 Two SCI operations
14 Two SDI bytes
15 Two SDI bytes separated by an SCI operation
16 Data flow of VS1053b.
17 EarSpeaker externalized sound sources vs. normal inside-the-head sound
18 VS1053b ADC and DAC data paths with some data registers
19 VS1053b ADC and DAC data paths with some data registers
20 RS232 serial interface protocol
21 I2S interface, 192 kHz.
1 Licenses
VS1053b contains AAC technology (ISO/IEC 13818-7 and ISO/IEC 14496-3) which cannot be
used without a proper license from Via Licensing Corporation or individual patent holders.
VS1053b contains spectral band replication (SBR) and parametric stereo (PS) technologies
developed by Coding Technologies. Licensing of SBR is handled within MPEG4 through Via
Licensing Corporation. Licensing of PS is handled with Coding Technologies.
See http://www.codingtechnologies.com/licensing/aacplus.htm for more information.
To the best of our knowledge, if the end product does not play a specific format that would
otherwise require a customer license (MPEG 1.0/2.0 layers I and II, WMA, or AAC), the respective
license should not be required. Decoding of MPEG layers I and II is disabled by default,
and WMA and AAC formats can easily be excluded based on the contents of the
SCI_HDAT1 register. PS and SBR decoding can also be disabled separately.
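The format exclusion mentioned above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the function and type names are hypothetical, and the SCI_HDAT1 codes are quoted from memory of the SCI_HDAT1 description (Chapter 9.6.9), so verify them against that table before relying on them.

```c
#include <stdint.h>

/* Coarse format classes used only by this sketch. */
typedef enum { VS_FMT_AAC, VS_FMT_WMA, VS_FMT_OTHER } vs_format_t;

/* Classify an SCI_HDAT1 value. The codes are ASCII pairs and are an
 * assumption to be checked against Chapter 9.6.9. */
static vs_format_t vs_classify_hdat1(uint16_t hdat1)
{
    switch (hdat1) {
    case 0x4154: /* "AT": AAC ADTS */
    case 0x4144: /* "AD": AAC ADIF */
    case 0x4D34: /* "M4": AAC in MP4 */
        return VS_FMT_AAC;
    case 0x574D: /* "WM": WMA */
        return VS_FMT_WMA;
    default:
        return VS_FMT_OTHER;
    }
}

/* Returns 1 if playback may continue, 0 if the host should cancel it. */
static int vs_format_allowed(uint16_t hdat1, int allow_aac, int allow_wma)
{
    switch (vs_classify_hdat1(hdat1)) {
    case VS_FMT_AAC: return allow_aac != 0;
    case VS_FMT_WMA: return allow_wma != 0;
    default:         return 1;
    }
}
```

A host would read SCI_HDAT1 shortly after starting a file and cancel playback (Chapter 10.5.2) when vs_format_allowed() returns 0.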
2 Disclaimer
3 Definitions
B Byte, 8 bits.
b Bit.
W Word. In VS_DSP, instruction words are 32-bit and data words are 16-bit wide.
1 Must be connected together as close to the device as possible for latch-up immunity.
2 Reference voltage can be internally selected between 1.23V and 1.65V, see section 9.6.2.
3 The maximum sample rate that can be played with correct speed is XTALI/256 (or XTALI/512
if SM_CLK_RANGE is set). Thus, XTALI must be at least 12.288 MHz (24.576 MHz) to be able
to play 48 kHz at correct speed.
4 Reset value is 1.0×. Recommended SC_MULT=3.5×, SC_ADD=1.0× (SCI_CLOCKF=0x8800).
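The sample rate limit in note 3 is simple arithmetic. The sketch below uses hypothetical helper names and integer Hz values; it only restates the XTALI/256 and XTALI/512 relations given above.

```c
#include <stdint.h>

/* Highest sample rate that plays at correct speed for a given XTALI.
 * clk_range_set is non-zero when SM_CLK_RANGE is set. */
static uint32_t vs_max_samplerate_hz(uint32_t xtali_hz, int clk_range_set)
{
    return xtali_hz / (clk_range_set ? 512u : 256u);
}

/* Lowest XTALI that still plays fs_hz at correct speed. */
static uint32_t vs_min_xtali_hz(uint32_t fs_hz, int clk_range_set)
{
    return fs_hz * (clk_range_set ? 512u : 256u);
}
```

For example, vs_min_xtali_hz(48000, 0) gives 12288000 (12.288 MHz) and vs_min_xtali_hz(48000, 1) gives 24576000 (24.576 MHz), matching the figures in note 3.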
DAC Characteristics
Parameter Symbol Min Typ Max Unit
DAC Resolution 18 bits
Total Harmonic Distortion, -3 dB of full-scale THD 0.04 %
Third Harmonic Distortion, -3 dB of full-scale 0.01 %
Dynamic Range (DAC unmuted, A-weighted) IDR 100 dB
S/N Ratio (full scale signal) SNR 94 dB
Interchannel Isolation (Cross Talk), 600Ω + GBUF 80 dB
Interchannel Isolation (Cross Talk), 30Ω + GBUF 53 dB
Interchannel Gain Mismatch -0.5 0.5 dB
Frequency Response -0.1 0.1 dB
Full Scale Output Voltage LEVEL16 2750¹ mVpp
Full Scale Output Voltage, VREF = 1.2 V LEVEL12 2050¹ mVpp
Deviation from Linear Phase 5 °
1 double can be achieved with +-to-+ wiring for mono difference sound.
2 AOLR may be much lower, but below Typical distortion performance may be compromised.
ADC Characteristics
Parameter Symbol Min Typ Max Unit
Microphone input amplifier gain MGAIN 26 dB
Microphone input amplitude (differential) MLEV16 64 180¹ mVpp AC
Microphone input amplitude (diff.), VREF = 1.2 V MLEV12 48 140¹ mVpp AC
Microphone Total Harmonic Distortion MTHD 0.03 0.07 %
Microphone S/N Ratio MSNR 60 72 dB
Microphone input impedances, per pin MIMP 45 kΩ
Line input amplitude LLEV16 2500 2800¹ mVpp AC
Line input amplitude, VREF = 1.2 V LLEV12 1900 2100¹ mVpp AC
Line input Total Harmonic Distortion LTHD 0.005 0.014 %
Line input S/N Ratio LSNR 85 90 dB
Line input impedance LIMP 80 kΩ
Internal clock multiplier 3.0×. TA = +25 °C. IOVDD = 2.8 V, AVDD = 2.6 V, CVDD = 1.8 V.
XRESET active
Parameter Min Typ Max Unit
Power Supply Consumption IOVDD 0.3 3.0 µA
Power Supply Consumption AVDD 0.6 5.0 µA
Power Supply Consumption CVDD 18 35.0 µA
5.1 Packages
LQFP-48 is a lead (Pb) free and RoHS-compliant package. RoHS is short for
Directive 2002/95/EC on the restriction of the use of certain hazardous substances in electrical
and electronic equipment.
5.1.1 LQFP-48
Pin types:
Figure Note 1: Connect either Microphone In or Line In, but not both at the same time.
Note: This connection assumes SM_SDINEW is active (see Chapter 9.6.1). If SM_SDISHARE
is also used, xDCS should be tied low or high (see Chapter 7.1.1).
The common buffer GBUF can be used for common voltage (1.23 V) for earphones. This will
eliminate the need for large isolation capacitors on line outputs, and thus the audio output pins
from VS1053b may be connected directly to the earphone connector.
GBUF must NOT be connected to ground under any circumstances. If GBUF is not used,
LEFT and RIGHT must be provided with coupling capacitors. To keep GBUF stable, you should
always have the resistor and capacitor even when GBUF is not used. See application notes for
details.
Unused GPIO pins should have a pull-down resistor. Unused line and microphone inputs should
not be connected.
7 SPI Buses
The SPI Bus - which was originally used in some Motorola devices - has been used for both
VS1053b’s Serial Data Interface SDI (Chapters 7.3 and 9.4) and Serial Control Interface SCI
(Chapters 7.4 and 9.5).
The VS10xx native modes (new mode, recommended) are active on VS1053b when SM_SDINEW is
set to 1 (default at startup). DCLK and SDATA are not used for data transfer and they can be
used as general-purpose I/O pins (GPIO2 and GPIO3). The BSYNC function changes to data
interface chip select (XDCS).
VS1001 compatibility mode (deprecated, do not use in new designs) is active when SM_SDINEW
is set to 0. In this mode, DCLK, SDATA and BSYNC are active.
The DREQ pin/signal is used to signal if VS1053b’s 2048-byte FIFO is capable of receiving
data. If DREQ is high, VS1053b can take at least 32 bytes of SDI data or one SCI command.
DREQ is turned low when the stream buffer is too full and for the duration of an SCI command.
Because of the 32-byte safety area, the sender may send up to 32 bytes of SDI data at a
time without checking the status of DREQ, making controlling VS1053b easier for low-speed
microcontrollers.
Note: DREQ may turn low or high at any time, even during a byte transmission. Thus, DREQ
should only be used to decide whether to send more bytes. A transmission that has already
started doesn’t need to be aborted.
Note: In VS1053b DREQ also goes down while an SCI operation is in progress.
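The 32-byte rule above can be sketched as a send loop that checks DREQ only between bursts. This is an illustration under stated assumptions: vs_port_t and its two callbacks are hypothetical board hooks (they are not part of any VLSI library), and the vs_fake_t port only simulates hardware so the loop can be exercised on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VS_SDI_CHUNK 32u  /* bytes VS1053b can always take when DREQ is high */

/* Hypothetical board hooks: read the DREQ pin, clock bytes out on SDI. */
typedef struct {
    int  (*dreq_is_high)(void *ctx);
    void (*sdi_write)(void *ctx, const uint8_t *data, size_t len);
    void *ctx;
} vs_port_t;

/* Send len bytes in bursts of at most 32, waiting for DREQ before each
 * burst. DREQ is not sampled inside a burst. Returns the burst count. */
static size_t vs_send_sdi(const vs_port_t *port, const uint8_t *data, size_t len)
{
    size_t bursts = 0;
    assert(port != NULL && data != NULL);
    while (len > 0) {
        size_t n = (len < VS_SDI_CHUNK) ? len : VS_SDI_CHUNK;
        while (!port->dreq_is_high(port->ctx)) {
            /* busy-wait; a real host might sleep or yield here */
        }
        port->sdi_write(port->ctx, data, n);
        data += n;
        len -= n;
        bursts++;
    }
    return bursts;
}

/* Simulated port for trying the loop without hardware. */
typedef struct { unsigned polls; size_t bytes; size_t max_burst; } vs_fake_t;

static int fake_dreq(void *ctx)
{
    vs_fake_t *f = (vs_fake_t *)ctx;
    return (++f->polls % 3u) != 0;  /* every third poll reports DREQ low */
}

static void fake_write(void *ctx, const uint8_t *data, size_t len)
{
    vs_fake_t *f = (vs_fake_t *)ctx;
    (void)data;
    f->bytes += len;
    if (len > f->max_burst) f->max_burst = len;
}
```

With the simulated port, sending 100 bytes results in four bursts of 32, 32, 32 and 4 bytes.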
There are cases when you still want to send SCI commands when DREQ is low. Because
DREQ is shared between SDI and SCI, you cannot determine whether an SCI command has been
executed if SDI is not ready to receive data. In this case you need a long enough delay after
every SCI command to make certain none of them are missed. The SCI Registers table in
Chapter 9.6 gives the worst-case handling time for each SCI register write.
Note: The status of DREQ can also be read through SCI with the following code. For details on
SCI registers, see Chapter 7.4.
// This example reads status of DREQ pin through the SPI/SCI register
// interface.
#define SCI_WRAMADDR 7
#define SCI_WRAM 6
while (!endOfFile) {
int dreq;
WriteSciReg(SCI_WRAMADDR, 0xC012); // Send address of DREQ register
dreq = ReadSciReg(SCI_WRAM) & 1; // Read value of DREQ (in bit 0)
if (dreq) {
// DREQ high: send 1-32 bytes audio data
} else {
// DREQ low: wait 5 milliseconds (so that VS10xx doesn't get
// continuous SCI operations)
}
} /* while (!endOfFile) */
The serial data interface operates in slave mode so DCLK signal must be generated by an
external circuit.
Data (SDATA signal) can be clocked in at either the rising or falling edge of DCLK (Chapter 9.6).
VS1053b assumes its data input to be byte-synchronized. SDI bytes may be transmitted either
MSb or LSb first, depending on register SCI_MODE bit SM_SDIORD (Chapter 9.6.1).
The firmware is able to accept the maximum bitrate the SDI supports.
Figure 4: SDI in VS10xx Native Mode, single-byte transfer
Note that when sending data through SDI you have to check the Data Request Pin DREQ at
least after every 32 bytes (Chapter 7.2).
Figure 5: SDI in VS10xx Native Mode, multi-byte transfer, X ≥ 1
If SM_SDISHARE is 1, the XDCS signal is internally generated by inverting the XCS input.
Figure 6: SDI timing diagram
Note: Although the timing is derived from the internal clock CLKI, the system always starts up in
1.0× mode, thus CLKI=XTALI. After you have configured a higher clock through SCI_CLOCKF
and waited for DREQ to rise, you can use a higher SPI speed as well.
7.3.3 SDI in VS1001 Compatibility Mode (deprecated, do not use in new designs)
Figure 7: SDI in VS1001 Mode - one byte transfer. Do not use in new designs!
When VS1053b is running in VS1001 compatibility mode, a BSYNC signal must be generated
to ensure correct bit-alignment of the input bitstream, as shown in Figures 7 and 8.
The first DCLK sampling edge (rising or falling, depending on selected polarity), during which
the BSYNC is high, marks the first bit of a byte (LSB, if LSB-first order is used, MSB, if MSB-first
order is used). If BSYNC is ’1’ when the last bit is received, the receiver stays active and the
next 8 bits are also received.
Figure 8: SDI in VS1001 Mode - two byte transfer. Do not use in new designs!
The serial bus protocol for the Serial Command Interface SCI (Chapter 9.5) consists of an
instruction byte, address byte and one 16-bit data word. Each read or write operation can read
or write a single register. Data bits are read at the rising edge, so the user should update data
at the falling edge. Bytes are always sent MSb first. XCS should be low for the full duration of
the operation, but you can have pauses between bits if needed.
The operation is specified by an 8-bit instruction opcode. The supported instructions are read
and write. See table below.
Instruction name   Opcode        Operation
READ               0b0000 0011   Read data
WRITE              0b0000 0010   Write data
Note: VS1053b sets DREQ low after each SCI operation. The duration depends on the opera-
tion. It is not allowed to finish a new SCI/SDI operation before DREQ is high again.
Figure 9: SCI word read
VS1053b registers are read using the following sequence, as shown in Figure 9. First, the
XCS line is pulled low to select the device. Then the READ opcode (0x3) is transmitted via
the SI line, followed by an 8-bit word address. After the address has been read in, any further
data on SI is ignored by the chip. The 16-bit data corresponding to the received address is
then shifted out onto the SO line.
XCS should be driven high after the data has been shifted out.
The chip drives DREQ low for a short while during a read operation. This time is very short
and doesn’t require special user attention.
Figure 10: SCI word write
VS1053b registers are written using the following sequence, as shown in Figure 10. First, the
XCS line is pulled low to select the device. Then the WRITE opcode (0x2) is transmitted via the
SI line, followed by an 8-bit word address and the 16-bit data word.
After the word has been shifted in and the last clock has been sent, XCS should be pulled high
to end the WRITE sequence.
After the last bit has been sent, DREQ is driven low for the duration of the register update,
marked “execution” in the figure. The time varies depending on the register and its contents
(see table in Chapter 9.6 for details). If the maximum time is longer than the time the
microcontroller takes to feed the next SCI command or SDI byte, the status of DREQ must be
checked before finishing the next SCI/SDI operation.
Figure 11: SCI multiple word write
VS1053b allows the user to send multiple words to the same SCI register, which allows
fast SCI uploads, as shown in Figure 11. The main difference from a single write is that instead
of bringing XCS up after sending the last bit of a data word, the next data word is sent
immediately. After the last data word, XCS is driven high as with a single word write.
After the last bit of a word has been sent, DREQ is driven low for the duration of the register
update, marked “execution” in the figure. The time varies depending on the register and its
contents (see table in Chapter 9.6 for details). If the maximum time is longer than the time the
microcontroller takes to feed the next SCI command or SDI byte, the status of DREQ must be
checked before finishing the next SCI/SDI operation.
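A multiple write can be sketched as one opcode and address followed by several data words. The helper below is hypothetical and only lays out the bytes. Because DREQ goes low after each word, a host that is faster than the register update time would still clock the buffer out two bytes at a time and check DREQ between words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lay out an SCI multiple write: WRITE opcode, register address, then
 * count 16-bit words, each MSB first. Returns the number of bytes
 * written to out, or 0 if count is 0 or out is too small. */
static size_t vs_sci_multi_write_frame(uint8_t reg, const uint16_t *words,
                                       size_t count, uint8_t *out, size_t out_size)
{
    size_t need = 2u + 2u * count;
    size_t i;
    assert(words != NULL && out != NULL);
    if (count == 0 || out_size < need) {
        return 0;
    }
    out[0] = 0x02u;  /* WRITE opcode */
    out[1] = reg;
    for (i = 0; i < count; i++) {
        out[2u + 2u * i] = (uint8_t)(words[i] >> 8);
        out[3u + 2u * i] = (uint8_t)(words[i] & 0xFFu);
    }
    return need;
}
```

XCS stays low for the whole buffer and is raised only after the last word.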
Figure 12: SPI timing diagram
1 25 ns applies when the pin is loaded with 100 pF capacitance. The time is shorter with lower capacitance.
Note: Although the timing is derived from the internal clock CLKI, the system always starts up in
1.0× mode, thus CLKI=XTALI. After you have configured a higher clock through SCI_CLOCKF
and waited for DREQ to rise, you can use a higher SPI speed as well.
Note: Because tWL + tWH + tH is 6×CLKI + 25 ns, the maximum speed for SCI reads is CLKI/7.
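The CLKI/7 limit for SCI reads can be turned into a clock choice as sketched below. The helper names and the assumption that the host SPI clock is its base clock divided by a power of two are hypothetical; check the divider options of your own microcontroller.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum SCK for SCI reads, from the CLKI/7 rule above. */
static uint32_t vs_max_sci_read_hz(uint32_t clki_hz)
{
    return clki_hz / 7u;
}

/* Smallest power-of-two divider (2, 4, 8, ...) that brings host_hz
 * down to limit_hz or below. */
static uint32_t vs_pick_spi_divider(uint32_t host_hz, uint32_t limit_hz)
{
    uint32_t div = 2u;
    assert(limit_hz > 0u);
    while (host_hz / div > limit_hz) {
        div *= 2u;
    }
    return div;
}
```

At startup CLKI equals XTALI, so with a 12.288 MHz crystal the read limit is about 1.755 MHz, and a host with a 16 MHz SPI base clock would pick a divider of 16 (1 MHz).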
Figure 13: Two SCI operations
Figure 13 shows two consecutive SCI operations. Note that xCS must be raised to inactive
state between the writes. Also DREQ must be respected as shown in the figure.
Figure 14: Two SDI bytes
SDI data is synchronized with a rising edge of xCS as shown in Figure 14. However, not every
byte needs separate synchronization.
Figure 15: Two SDI bytes separated by an SCI operation
Figure 15 shows how an SCI operation is embedded in between SDI operations. xCS edges
are used to synchronize both SDI and SCI. Remember to respect DREQ as shown in the figure.
Conventions
Mark Description
+ Format is supported
? Format is supported but not thoroughly tested
- Format exists but is not supported
(blank) Format doesn’t exist
MPEG 1.0¹:
Samplerate / Hz Bitrate / kbit/s
32 40 48 56 64 80 96 112 128 160 192 224 256 320
48000 + + + + + + + + + + + + + +
44100 + + + + + + + + + + + + + +
32000 + + + + + + + + + + + + + +
MPEG 2.0¹:
Samplerate / Hz Bitrate / kbit/s
8 16 24 32 40 48 56 64 80 96 112 128 144 160
24000 + + + + + + + + + + + + + +
22050 + + + + + + + + + + + + + +
16000 + + + + + + + + + + + + + +
MPEG 2.5¹:
Samplerate / Hz Bitrate / kbit/s
8 16 24 32 40 48 56 64 80 96 112 128 144 160
12000 + + + + + + + + + + + + + +
11025 + + + + + + + + + + + + + +
8000 + + + + + + + + + + + + + +
MPEG 1.0:
Samplerate / Hz Bitrate / kbit/s
32 48 56 64 80 96 112 128 160 192 224 256 320 384
48000 + + + + + + + + + + + + + +
44100 + + + + + + + + + + + + + +
32000 + + + + + + + + + + + + + +
MPEG 2.0:
Samplerate / Hz Bitrate / kbit/s
8 16 24 32 40 48 56 64 80 96 112 128 144 160
24000 + + + + + + + + + + + + + +
22050 + + + + + + + + + + + + + +
16000 + + + + + + + + + + + + + +
MPEG 1.0:
Samplerate / Hz Bitrate / kbit/s
32 64 96 128 160 192 224 256 288 320 352 384 416 448
48000 + + + + + + + + + + + + + +
44100 + + + + + + + + + + + + + +
32000 + + + + + + + + + + + + + +
MPEG 2.0:
Samplerate / Hz Bitrate / kbit/s
32 48 56 64 80 96 112 128 144 160 176 192 224 256
24000 ? ? ? ? ? ? ? ? ? ? ? ? ? ?
22050 ? ? ? ? ? ? ? ? ? ? ? ? ? ?
16000 ? ? ? ? ? ? ? ? ? ? ? ? ? ?
Only floor 1 is supported. No known current encoder uses floor 0. All one- and two-channel
Ogg Vorbis files should be playable with this decoder.
Dynamic range control (DRC) is supported and can be controlled by the user to limit or enhance
the dynamic range of the material that contains DRC information.
Both Sine window and Kaiser-Bessel-derived window are supported. For MPEG4, pseudo-random
noise substitution (PNS) is supported. Short frames (120 and 960 samples) are not
supported.
Spectral Band Replication (SBR) level 3, and Parametric Stereo (PS) level 3 are supported (HE-
AAC v2). Level 3 means that maximum of 2 channels, samplerates up to and including 48 kHz
without and with SBR (with or without PS) are supported. Also, both mixing modes (Ra and Rb ),
IPD/OPD synthesis and 34 frequency bands resolution are implemented. The downsampled
synthesis mode (core coder rates > 24 kHz and <= 48 kHz with SBR) is implemented.
SBR and PS decoding can also be disabled. Also different operating modes can be selected.
See config1 and sbrAndPsStatus in section 10.11 : "Extra parameters".
If enabled, the internal clock (CLKI) is automatically increased if AAC decoding needs a higher
clock. PS and SBR operation is automatically switched off if the internal clock is too slow for
correct decoding. Generally HE-AAC v2 files need 4.5× clock to decode both SBR and PS
content. This is why 3.5× + 1.0× clock is the recommended default.
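The recommended 3.5× + 1.0× setting corresponds to SCI_CLOCKF = 0x8800. The sketch below shows how that value could be assembled. The field layout (SC_MULT in bits 15:13, SC_ADD in bits 12:11, SC_FREQ in bits 10:0) and the multiplier steps are assumptions recalled from the SCI_CLOCKF description in Chapter 9.6.4 and should be checked there; the function name is hypothetical.

```c
#include <stdint.h>

/* Encode SCI_CLOCKF from multipliers given in tenths (35 means 3.5x).
 * Returns the register value, or -1 for an unsupported combination.
 * Field positions and step tables are assumptions, see the lead-in. */
static int vs_clockf_encode(unsigned mult_x10, unsigned add_x10, unsigned sc_freq)
{
    static const unsigned mult_tab[8] = {10, 20, 25, 30, 35, 40, 45, 50};
    static const unsigned add_tab[4]  = {0, 10, 15, 20};
    unsigned m = 0;
    unsigned a = 0;

    while (m < 8u && mult_tab[m] != mult_x10) m++;
    while (a < 4u && add_tab[a] != add_x10) a++;
    if (m == 8u || a == 4u || sc_freq > 0x7FFu) {
        return -1;
    }
    return (int)((m << 13) | (a << 11) | sc_freq);
}
```

vs_clockf_encode(35, 10, 0) yields 0x8800, the value recommended for SCI_CLOCKF.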
For AAC the streaming ADTS format is recommended. This format allows easy rewind and fast
forward because resynchronization is easily possible.
In addition to ADTS (.aac), MPEG2 ADIF (.aac) and MPEG4 AUDIO (.mp4 / .m4a) files are
played, but these formats are less suitable for rewind and fast forward operations. You can still
implement these features by using the safe jump points table, or by using the slightly less
robust but much easier automatic resync mechanism (see Section 10.5.4).
Because 3GPP (.3gp) and 3GPPv2 (.3g2) files are just MPEG4 files, those that contain only
HE-AAC or HE-AACv2 content are played.
Note: To be able to play the .3gp, .3g2, .mp4 and .m4a files, the mdat atom must be the
last atom in the MP4 file. Because VS1053b receives all data as a stream, all metadata must
be available before the music data is received. Several MP4 file formatters do not satisfy this
requirement and some kind of conversion is required. This is also why the streamable ADTS
format is recommended.
Programs exist that optimize .mp4 and .m4a files into a so-called streamable format that has
the mdat atom last in the file and is thus suitable for web servers’ audio streaming. You can
use this kind of tool to process files for VS1053b too. For example mp4creator -optimize file.mp4.
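Whether a file already satisfies the mdat-last requirement can be checked on the host before playback. The sketch below walks the top-level atoms of a file held in memory, using the standard MP4 atom header (32-bit big-endian size, 4-character type, size 1 meaning a 64-bit size follows, size 0 meaning the atom runs to the end of the file). The function names are hypothetical, and a real tool would seek through the file instead of loading it whole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 1 if the last top-level atom is "mdat", 0 if it is not,
 * and -1 if the buffer is empty, truncated or malformed. */
static int mp4_mdat_is_last(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int last_is_mdat = -1;
    assert(buf != NULL);
    while (pos < len) {
        uint64_t size;
        size_t hdr = 8;
        if (len - pos < 8) return -1;
        size = be32(buf + pos);
        if (size == 1) {            /* 64-bit size follows the type */
            if (len - pos < 16) return -1;
            size = ((uint64_t)be32(buf + pos + 8) << 32) | be32(buf + pos + 12);
            hdr = 16;
        } else if (size == 0) {     /* atom extends to end of file */
            size = len - pos;
        }
        if (size < hdr || size > len - pos) return -1;
        last_is_mdat = (memcmp(buf + pos + 4, "mdat", 4) == 0);
        pos += (size_t)size;
    }
    return last_is_mdat;
}
```

A result of 0 indicates the file should be converted to streamable form before it is sent to VS1053b.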
AAC¹ ²:
1 64000 Hz, 88200 Hz, and 96000 Hz AAC files are played at the highest possible samplerate
(48000 Hz with 12.288 MHz XTALI).
2 Also all variable bitrate (VBR) formats are supported. Note that the table gives the maximum
bitrate allowed for two channels for a specific samplerate as defined by the AAC specification.
The decoder does not actually have a fixed lower or upper limit.
WMA 7:
Samplerate Bitrate / kbit/s
/ Hz 5 6 8 10 12 16 20 22 32 40 48 64 80 96 128 160 192
8000 + + + +
11025 + +
16000 + + + +
22050 + + + +
32000 + + + +
44100 + + + + + + + +
48000 + +
WMA 8:
Samplerate Bitrate / kbit/s
/ Hz 5 6 8 10 12 16 20 22 32 40 48 64 80 96 128 160 192
8000 + + + +
11025 + +
16000 + + + +
22050 + + + +
32000 + + + +
44100 + + + + + + + +
48000 + + +
WMA 9:
Samplerate Bitrate / kbit/s
/ Hz 5 6 8 10 12 16 20 22 32 40 48 64 80 96 128 160 192 256 320
8000 + + + +
11025 + +
16000 + + + +
22050 + + + +
32000 + + + +
44100 + + + + + + + + + + +
48000 + + + + +
In addition to these expected WMA decoding profiles, all other bitrate and samplerate com-
binations are supported, including variable bitrate WMA streams. Note that WMA does not
consume the bitstream as evenly as MP3, so you need a higher peak transfer capability for
clean playback at the same bitrate.
Up to 48 kHz and 24-bit FLAC files are supported with the VS1053b Patches w/ FLAC Decoder
plugin that is available at http://www.vlsi.fi/en/support/software/vs10xxpatches.html . Read the
accompanying documentation of the plugin for details.
The most common RIFF WAV subformats are supported, with 1 or 2 audio channels.
General MIDI and SP-MIDI format 0 files are played. Format 1 and 2 files must be converted to
format 0 by the user. The maximum polyphony is 64, the maximum sustained polyphony is 40.
Actual polyphony depends on the internal clock rate (which is user-selectable), the instruments
used, whether the reverb effect is enabled, and the possible global postprocessing effects en-
abled, such as bass enhancer, treble control or EarSpeaker spatial processing. The polyphony
restriction algorithm makes use of the SP-MIDI MIP table, if present, and uses smooth note
removal.
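Since only format 0 files are played, a host may want to check the format field before sending a MIDI file. The sketch below reads it from the standard MIDI file header ("MThd", a 32-bit big-endian header length of at least 6, then a 16-bit format word). The function name is hypothetical, and RIFF-wrapped MIDI files are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the Standard MIDI File format (0, 1 or 2), or -1 if the
 * buffer does not start with a valid MThd header. */
static int smf_format(const uint8_t *buf, size_t len)
{
    uint32_t hdr_len;
    unsigned format;
    assert(buf != NULL);
    if (len < 14u || memcmp(buf, "MThd", 4) != 0) {
        return -1;
    }
    hdr_len = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
              ((uint32_t)buf[6] << 8) | (uint32_t)buf[7];
    if (hdr_len < 6u) {
        return -1;
    }
    format = ((unsigned)buf[8] << 8) | buf[9];
    return (format <= 2u) ? (int)format : -1;
}
```

Files for which smf_format() returns 1 or 2 need conversion to format 0 before playback.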
43 MHz (3.5× input clock) achieves 19-31 simultaneous sustained notes. The instantaneous
number of notes can be larger. This is a fair compromise between power consumption and
quality, but higher clocks can be used to increase polyphony.
The reverb effect can be controlled by the user. In addition to reverb automatic and reverb off
modes, 14 different decay times can be selected. These roughly correspond to different room
sizes. Also, each MIDI song decides how much effect each instrument gets. Because the reverb
effect uses about 4 MHz of processing power, the automatic control enables reverb only when
the internal clock is at least 3.0×.
In VS1053b both EarSpeaker and MIDI reverb can be on simultaneously. This is ideal for
listening to MIDI songs with headphones.
New instruments have been implemented in addition to the 36 that are available in VS1003.
VS1053b now has unique instruments in the whole GM1 instrument set and one bank of GM2
percussions.
9 Functional Description
VS1053b is based on a proprietary digital signal processor, VS_DSP. It contains all the code
and data memory needed for Ogg Vorbis, MP3, AAC, WMA and WAV PCM + ADPCM audio
decoding and a MIDI synthesizer, together with serial interfaces, a multirate stereo audio DAC
and analog output amplifiers and filters. Also PCM/ADPCM audio encoding is supported us-
ing a microphone amplifier and/or line-level inputs and a stereo A/D converter. With software
plugins the chip can also decode lossless FLAC as well as record the high-quality Ogg Vorbis
format. A UART is provided for debugging purposes.
Figure 16: Data flow of VS1053b.
First, depending on the audio data, and provided ADPCM encoding mode is not set, Ogg
Vorbis, PCM WAV or IMA ADPCM WAV is received and decoded from the SDI bus.
After decoding, if SCI_AIADDR is non-zero, application code is executed from the address
pointed to by that register. For more details, see Application Notes for VS10XX.
Then data may be sent to the Bass Enhancer and Treble Control depending on the SCI_BASS
register.
After that the data is fed to the Audio FIFO, which holds the data until it is read by the Audio
interrupt and fed to the samplerate converter and DACs. The size of the audio FIFO is 2048
stereo (2×16-bit) samples, or 8 KiB.
The samplerate converter upsamples all different samplerates to XTALI/2, or 128 times the
highest usable samplerate with 18-bit precision. Volume control is performed in the upsampled
domain. New volume settings are loaded only when the upsampled signal crosses the zero
point (or after a timeout). This zero-crossing detection almost completely removes all audible
noise that occurs when volume is suddenly changed.
The samplerate conversion to a common samplerate removes the need for complex PLL-based
clocking schemes and allows almost unlimited sample rate accuracy with one fixed input clock
frequency. With a 12.288 MHz clock, the DA converter operates at 128 × 48 kHz, i.e. 6.144
MHz, and creates a stereo in-phase analog signal. The oversampled output is low-pass filtered
by an on-chip analog filter. This signal is then forwarded to the earphone amplifier.
While listening to headphones the sound has a tendency to be localized inside the head. The
sound field becomes flat and lacks the sensation of dimensions. This is an unnatural, awkward
and sometimes even disturbing situation. This phenomenon is often referred to in the literature
as ‘lateralization’, meaning ‘in-the-head’ localization. Long-term listening to lateralized sound
may lead to listening fatigue.
All real-life sound sources are external, leaving traces in the acoustic wavefront that arrives at
the ear drums. From these traces, the auditory system of the brain is able to judge the distance
and angle of each sound source. In loudspeaker listening the sound is external and these
traces are available. In headphone listening these traces are missing or ambiguous.
EarSpeaker processes sound to make listening via headphones more like listening to the same
music from real loudspeakers or live music. Once EarSpeaker processing is activated, the
instruments are moved from inside to the outside of the head, making it easier to separate
the different instruments (see figure 17). The listening experience becomes more natural and
pleasant, and the stereo image is sharper as the instruments are spread widely in front of the
listener instead of being inside the head.
Figure 17: EarSpeaker externalized sound sources vs. normal inside-the-head sound
Note that EarSpeaker differs from any common spatial processing effects, such as echo, reverb,
or bass boost. EarSpeaker accurately simulates the human auditory model and real listening
environment acoustics. Thus it does not change the tonal character of the music by introducing
artificial effects.
EarSpeaker processing can be parameterized to a few different modes, each simulating a slightly
different type of acoustical situation, suiting different personal preferences and types of
recording. See section 9.6.1 for how to activate different modes.
• Off: Best option when listening through loudspeakers or if the audio to be played contains
binaural preprocessing.
• minimal: Suited for listening to normal musical scores with headphones, very subtle.
• normal: Suited for listening to normal musical scores with headphones, moves sound
source further away than minimal.
• extreme: Suited for old or ’dry’ recordings, or if the audio to be played is artificial, for
example generated MIDI.
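The four modes listed above can be selected by changing two bits of SCI_MODE. The following is a minimal sketch, not the reference implementation: the bit names SM_EARSPEAKER_LO and SM_EARSPEAKER_HI, their positions (bits 4 and 7), and the mode-to-bit mapping are assumptions taken from the SCI_MODE description in section 9.6.1, which is not reproduced in this text. The helper name earspeaker_apply is hypothetical.

```c
#include <stdint.h>

/* Assumed SCI_MODE bit positions (see section 9.6.1 of the datasheet). */
#define SM_EARSPEAKER_LO 0x0010u /* bit 4 */
#define SM_EARSPEAKER_HI 0x0080u /* bit 7 */

enum earspeaker_mode {
    EARSPEAKER_OFF,     /* LO=0, HI=0 */
    EARSPEAKER_MINIMAL, /* LO=1, HI=0 */
    EARSPEAKER_NORMAL,  /* LO=0, HI=1 */
    EARSPEAKER_EXTREME  /* LO=1, HI=1 */
};

/* Return `mode` (a current SCI_MODE value) with the EarSpeaker bits replaced. */
static uint16_t earspeaker_apply(uint16_t mode, enum earspeaker_mode es)
{
    mode &= (uint16_t)~(SM_EARSPEAKER_LO | SM_EARSPEAKER_HI);
    switch (es) {
    case EARSPEAKER_MINIMAL: mode |= SM_EARSPEAKER_LO; break;
    case EARSPEAKER_NORMAL:  mode |= SM_EARSPEAKER_HI; break;
    case EARSPEAKER_EXTREME: mode |= SM_EARSPEAKER_LO | SM_EARSPEAKER_HI; break;
    case EARSPEAKER_OFF:
    default: break;
    }
    return mode;
}
```

A caller would read SCI_MODE, pass the value through earspeaker_apply(), and write the result back, so that all other mode bits stay unchanged.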
The serial data interface is meant for transferring compressed data for the different decoders of
VS1053b.
If the input of the decoder is invalid or it is not received fast enough, analog outputs are auto-
matically muted.
Also several different tests may be activated through SDI as described in Chapter 10.
The serial control interface is compatible with the SPI bus specification. Data transfers are
always 16 bits. VS1053b is controlled by writing and reading the registers of the interface.
VS1053b sets DREQ low when it detects an SCI operation (this delay is 16 to 40 CLKI cycles
depending on whether an interrupt service routine is active) and restores it when it has pro-
cessed the operation. The duration depends on the operation. If DREQ is low when an SCI
operation is performed, it also stays low after SCI operation processing.
If DREQ is high before a SCI operation, do not start a new SCI/SDI operation before DREQ is
high again. If DREQ is low before a SCI operation because the SDI can not accept more data,
make certain there is enough time to complete the operation before sending another.
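Since SCI transfers are always 16 bits, one register access is a short, fixed byte sequence. The sketch below only builds those bytes; it assumes the instruction bytes 0x02 (write) and 0x03 (read) and MSB-first data as described in section 7.4, which is not part of this text. The function names are hypothetical, and chip select handling and DREQ polling are left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed SCI instruction bytes (section 7.4 of the datasheet). */
#define SCI_OP_WRITE 0x02u
#define SCI_OP_READ  0x03u

/* Build the 4 bytes of one SCI write: opcode, register, data MSB, data LSB. */
static size_t sci_write_frame(uint8_t reg, uint16_t value, uint8_t out[4])
{
    out[0] = SCI_OP_WRITE;
    out[1] = reg;
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)(value & 0xFFu);
    return 4;
}

/* Build the 2 header bytes of one SCI read; the caller then clocks in 16 data bits. */
static size_t sci_read_frame(uint8_t reg, uint8_t out[2])
{
    out[0] = SCI_OP_READ;
    out[1] = reg;
    return 2;
}
```

Following the DREQ rule above, a driver would typically wait for DREQ to be high, send the frame, and wait for DREQ to return high before starting the next SCI or SDI operation.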
1 This is the worst-case time that DREQ stays low after writing to this register. The user may
choose to skip the DREQ check for those register writes that take less than 100 clock cycles to
execute and use a fixed delay instead.
2 In addition, the cycles spent in the user application routine must be counted.
3 Firmware changes the value of this register immediately to 0x48 (analog enabled), and after
a short while to 0x40 (analog drivers enabled).
4 When mode register write specifies a software reset the worst-case time is 22000 XTALI
cycles.
5 If the clock multiplier is changed, writing to CLOCKF register may force internal clock to run
at 1.0 × XTALI for a while. Thus it is not a good idea to send SCI or SDI bits while this register
update is in progress.
6 Firmware changes the value of this register immediately to 0x4800.
Reads from all SCI registers complete in under 100 CLKI cycles, except a read from AIADDR
in 200 cycles. In addition the cycles spent in the user application routine must be counted to
the read time of AIADDR, AUDATA, and AICTRL0..3.
When SM_DIFF is set, the player inverts the left channel output. For a stereo input this creates
virtual surround, and for a mono input this creates a differential left/right signal.
SM_LAYER12 enables MPEG 1.0 and 2.0 layer I and II decoding in addition to layer III.
If you want to stop decoding in the middle, set SM_CANCEL, and continue sending data
honouring DREQ. When SM_CANCEL is detected by a codec, it will stop decoding and return
to the main loop. The stream buffer content is discarded and the SM_CANCEL bit cleared.
SCI_HDAT1 will also be cleared. See Chapter 10.5.2 for details.
If SM_TESTS is set, SDI tests are allowed. For more details on SDI tests, look at Chapter 10.12.
SM_STREAM activates VS1053b’s stream mode. In this mode, data should be sent at as
even intervals as possible and preferably in blocks of less than 512 bytes, and VS1053b makes
every attempt to keep its input buffer half full by changing its playback speed up to 5%. For best
quality sound, the average speed error should be within 0.5%, the bitrate should not exceed
160 kbit/s and VBR should not be used. For details, see Application Notes for VS10XX. This
mode only works with MP3 and WAV files.
SM_DACT defines the active edge of data clock for SDI. When ’0’, data is read at the rising
edge, when ’1’, data is read at the falling edge.
When SM_SDIORD is clear, bytes on SDI are sent MSb first. By setting SM_SDIORD, the user
may reverse the bit order for SDI, i.e. bit 0 is received first and bit 7 last. Bytes are, however,
still sent in the default order. This register bit has no effect on the SCI bus.
Setting SM_SDISHARE makes SCI and SDI share the same chip select, as explained in Chap-
ter 7.1, if also SM_SDINEW is set.
Setting SM_SDINEW will activate VS10xx native serial modes as described in Chapters 7.1.1 and 7.3.1.
Note that this bit is set by default when VS1053b is started up.
By activating SM_ADPCM and SM_RESET at the same time, the user will activate IMA ADPCM
recording mode (see section 10.8).
SM_LINE1 is used to select the left-channel input for ADPCM recording. If ’0’, differential
microphone input pins MICP and MICN are used; if ’1’, line-level MICP/LINEIN1 pin is used.
SM_CLK_RANGE activates a clock divider in the XTAL input. When SM_CLK_RANGE is set,
the clock is divided by 2 at the input. From the chip’s point of view e.g. 24 MHz becomes
12 MHz. SM_CLK_RANGE should be set as soon as possible after a chip reset.
SCI_STATUS contains information on the current status of VS1053b. It also controls some
low-level things that the user does not usually have to care about.
Name              Bits   Description
SS_DO_NOT_JUMP    15     Header in decode, do not fast forward/rewind
SS_SWING          14:12  Set swing to +0 dB, +0.5 dB, .., or +3.5 dB
SS_VCM_OVERLOAD   11     GBUF overload indicator, ’1’ = overload
SS_VCM_DISABLE    10     GBUF overload detection, ’1’ = disable
(reserved)        9:8    reserved
SS_VER            7:4    Version
SS_APDOWN2        3      Analog driver powerdown
SS_APDOWN1        2      Analog internal powerdown
SS_AD_CLOCK       1      AD clock select, ’0’ = 6 MHz, ’1’ = 3 MHz
SS_REFERENCE_SEL  0      Reference voltage selection, ’0’ = 1.23 V, ’1’ = 1.65 V
SS_DO_NOT_JUMP is set when a WAV, Ogg Vorbis, WMA, MP4, or AAC-ADIF header is
being decoded and jumping to another location in the file is not allowed. If you use soft reset or
cancel, clear this bit yourself or it can be accidentally left set.
If AVDD is at least 3.3 V, SS_REFERENCE_SEL can be set to select 1.65 V reference voltage
to increase the analog output swing.
SS_AD_CLOCK can be set to divide the AD modulator frequency by 2 if XTALI/2 is too much.
SS_VER is 0 for VS1001, 1 for VS1011, 2 for VS1002, 3 for VS1003, 4 for VS1053 and VS8053,
5 for VS1033, 6 for VS1063, and 7 for VS1103.
If the user wants to powerdown VS1053b with a minimum power-off transient, set SCI_VOL to
0xffff, then wait for at least a few milliseconds before activating reset.
VS1053b contains a GBUF protection circuit which disconnects the GBUF driver when too much
current is drawn, indicating a short-circuit to ground. SS_VCM_OVERLOAD is high while the
overload is detected. SS_VCM_DISABLE can be set to disable the protection feature.
SS_SWING allows you to go above the 0 dB volume setting. Value 0 is normal mode, 1 gives
+0.5 dB, and 2 gives +1.0 dB. Settings from 3 to 7 cause the DAC modulator to be overdriven
and should not be used. You can use SS_SWING with I2S to control the amount of headroom.
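The SS_VER and SS_SWING fields can be extracted with plain shifts and masks. The sketch below uses the bit positions from the SCI_STATUS table and the chip list given for SS_VER; the function names are hypothetical, and clamping SS_SWING to 2 is a design choice made here because settings 3 to 7 are described as overdriving the DAC modulator.

```c
#include <stdint.h>

/* Field positions from the SCI_STATUS table. */
#define SS_VER_SHIFT   4
#define SS_VER_MASK    0x00F0u
#define SS_SWING_SHIFT 12
#define SS_SWING_MASK  0x7000u

/* Extract the 4-bit chip version field. */
static unsigned ss_version(uint16_t status)
{
    return (status & SS_VER_MASK) >> SS_VER_SHIFT;
}

/* Map an SS_VER value to the chip name listed in the text. */
static const char *ss_chip_name(unsigned ver)
{
    static const char *const names[8] = {
        "VS1001", "VS1011", "VS1002", "VS1003",
        "VS1053/VS8053", "VS1033", "VS1063", "VS1103"
    };
    return ver < 8u ? names[ver] : "unknown";
}

/* Extract SS_SWING (one step is +0.5 dB). */
static unsigned ss_swing(uint16_t status)
{
    return (status & SS_SWING_MASK) >> SS_SWING_SHIFT;
}

/* Return `status` with SS_SWING replaced; values above 2 are clamped to 2. */
static uint16_t ss_set_swing(uint16_t status, unsigned half_db_steps)
{
    if (half_db_steps > 2u)
        half_db_steps = 2u;
    return (uint16_t)((status & ~SS_SWING_MASK) | (half_db_steps << SS_SWING_SHIFT));
}
```

For example, a status value of 0x0048 yields version 4, which the list above assigns to VS1053 and VS8053.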
Note: Due to a firmware bug, the volume calculation routine in VS1053b clears the SS_AD_CLOCK
and SS_REFERENCE_SEL bits. Writes to SCI_STATUS or SCI_VOL, and sample rate changes
(if bass enhancer or treble control are active) cause the volume calculation routine to be called.
See the VS1053b Patches w/ FLAC Decoder plugin for a workaround:
http://www.vlsi.fi/en/support/software/vs10xxpatches.html
The Bass Enhancer VSBE is a bass boosting DSP algorithm, which tries to take the most out
of the user’s earphones without causing clipping.
Note: Because VSBE tries to avoid clipping, it gives the best bass boost with dynamic music
material, or when the playback volume is not set to maximum. It also does not create bass: the
source material must have some bass to begin with.
Treble Control VSTC is activated when ST_AMPLITUDE is non-zero. For example setting
SCI_BASS to 0x7a00 will have 10.5 dB treble enhancement at and above 10 kHz.
Bass Enhancer uses about 2.1 MIPS and Treble Control 1.2 MIPS at 44100 Hz samplerate.
Both can be on simultaneously.
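The 0x7a00 example can be reproduced by packing the four SCI_BASS fields. This is a sketch under an assumed field layout, since the SCI_BASS bit table is not reproduced in this text: ST_AMPLITUDE in bits 15:12 (signed, 1.5 dB steps), ST_FREQLIMIT in bits 11:8 (1 kHz steps), SB_AMPLITUDE in bits 7:4 (1 dB steps) and SB_FREQLIMIT in bits 3:0 (10 Hz steps). The function names are hypothetical.

```c
#include <stdint.h>

/* Pack SCI_BASS from its four 4-bit fields (assumed layout, see lead-in).
 * treble_steps is signed (-8..7); the other fields are 0..15. */
static uint16_t sci_bass_pack(int treble_steps, unsigned treble_khz,
                              unsigned bass_db, unsigned bass_10hz)
{
    return (uint16_t)((((unsigned)treble_steps & 0xFu) << 12) |
                      ((treble_khz & 0xFu) << 8) |
                      ((bass_db & 0xFu) << 4) |
                      (bass_10hz & 0xFu));
}

/* Treble enhancement in tenths of a dB for a given ST_AMPLITUDE value. */
static int treble_tenths_db(int treble_steps)
{
    return treble_steps * 15; /* 1.5 dB per step */
}
```

With treble_steps = 7 and treble_khz = 10 this gives 0x7a00 and 10.5 dB, which matches the example in the text.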
In VS1053b bass and treble initialization and volume change is delayed until the next batch of
samples are sent to the audio FIFO. Thus, unlike with earlier VS10XX chips, audio interrupts
can no longer be missed when SCI_BASS or SCI_VOL is written to.
The external clock multiplier SCI register SCI_CLOCKF, which has changed slightly since
VS1003 and VS1033, is presented in the table below.
SCI_CLOCKF bits
Name     Bits   Description
SC_MULT  15:13  Clock multiplier
SC_ADD   12:11  Allowed multiplier addition
SC_FREQ  10:0   Clock frequency
SC_MULT activates the built-in clock multiplier. This will multiply XTALI to create a higher CLKI.
When the multiplier is changed by more than 0.5×, the chip runs at 1.0× clock for a few hundred
clock cycles. The values are as follows:
SC_ADD tells how much the decoder firmware is allowed to add to the multiplier specified by
SC_MULT if more cycles are temporarily needed to decode a WMA or AAC stream. The values
are:
SC_ADD  MASK    Multiplier addition
0       0x0000  No modification is allowed
1       0x0800  XTALI×1.0
2       0x1000  XTALI×1.5
3       0x1800  XTALI×2.0
If SC_FREQ is non-zero, it tells that the input clock XTALI is running at something else than
12.288 MHz. XTALI is set in 4 kHz steps. The formula for calculating the correct value for this
register is $\mathrm{SC\_FREQ} = \frac{\mathrm{XTALI} - 8000000}{4000}$ (XTALI is in Hz).
Note: because maximum samplerate is $\frac{\mathrm{XTALI}}{256}$, all samplerates are not available if XTALI <
12.288 MHz.
Note: Automatic clock change can only happen when decoding WMA and AAC files. Automatic
clock change is done one 0.5× at a time. This does not cause a drop to 1.0× clock and you can
use the same SCI and SDI clock throughout the file.
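The three SCI_CLOCKF fields can be combined as sketched below. The bit positions and the SC_FREQ formula come from the text above. The mapping from SC_MULT to an actual multiplier (0 gives 1.0×, then 2.0× to 5.0× in 0.5× steps) is an assumption, because the SC_MULT value table is not reproduced here. The function names are hypothetical.

```c
#include <stdint.h>

/* SC_FREQ from XTALI in Hz. 0 stands for the default 12.288 MHz.
 * Frequencies below 8 MHz cannot be represented and also return 0 here. */
static uint16_t sc_freq_from_xtali(uint32_t xtali_hz)
{
    if (xtali_hz == 12288000u || xtali_hz < 8000000u)
        return 0;
    return (uint16_t)(((xtali_hz - 8000000u) / 4000u) & 0x07FFu);
}

/* Combine SC_MULT (bits 15:13), SC_ADD (bits 12:11) and SC_FREQ (bits 10:0). */
static uint16_t sci_clockf(unsigned sc_mult, unsigned sc_add, uint32_t xtali_hz)
{
    return (uint16_t)(((sc_mult & 7u) << 13) |
                      ((sc_add & 3u) << 11) |
                      sc_freq_from_xtali(xtali_hz));
}

/* Assumed multiplier for an SC_MULT value, in tenths (35 means 3.5x). */
static unsigned sc_mult_tenths(unsigned sc_mult)
{
    sc_mult &= 7u;
    return sc_mult == 0u ? 10u : 15u + 5u * sc_mult;
}
```

Under these assumptions, SC_MULT = 4 with SC_ADD = 3 and a 12.288 MHz crystal gives 0x9800, and a 13 MHz clock gives SC_FREQ = 1250.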
When decoding correct data, current decoded time is shown in this register in full seconds.
The user may change the value of this register. In that case the new value should be written
twice to make absolutely certain that the change is not overwritten by the firmware. A write to
SCI_DECODE_TIME also resets the byteRate calculation.
With fast playback (see the playSpeed extra parameter) the decode time also counts faster.
Some codecs (WMA and Ogg Vorbis) can also indicate the absolute play position, see the
positionMsec extra parameter in section 10.11.
When decoding correct data, the current samplerate and number of channels can be found
in bits 15:1 and 0 of SCI_AUDATA, respectively. Bits 15:1 contain the samplerate divided by
two, and bit 0 is 0 for mono data and 1 for stereo. Writing to SCI_AUDATA will change the
samplerate directly.
To reduce digital power consumption when idle, you can write a low samplerate to SCI_AUDATA.
Note: Ogg Vorbis decoding overrides AUDATA change. If you want to fine-tune samplerate in
streaming applications with Ogg Vorbis, use SCI_CLOCKF to control the playback rate instead
of AUDATA.
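Because bits 15:1 hold the samplerate divided by two, masking off bit 0 returns the rate in Hz directly. A minimal sketch of the encoding described above, with hypothetical function names:

```c
#include <stdint.h>

/* Build an SCI_AUDATA value. Odd rates lose their lowest bit,
 * because bit 0 is used as the stereo flag. */
static uint16_t audata_pack(uint16_t rate_hz, int stereo)
{
    return (uint16_t)((rate_hz & 0xFFFEu) | (stereo ? 1u : 0u));
}

/* Samplerate in Hz: bits 15:1 are rate/2, so clearing bit 0 gives the rate. */
static uint16_t audata_rate_hz(uint16_t audata)
{
    return (uint16_t)(audata & 0xFFFEu);
}

/* Non-zero for stereo data. */
static int audata_is_stereo(uint16_t audata)
{
    return (int)(audata & 1u);
}
```

For instance, 44100 Hz stereo encodes as 0xAC45, and the low-power value 0x0010 mentioned elsewhere in this document decodes as 16 Hz mono.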
SCI_WRAM is used to upload application programs and data to instruction and data RAMs.
The start address must be initialized by writing to SCI_WRAMADDR prior to the first write/read
of SCI_WRAM. As 16 bits of data can be transferred with one SCI_WRAM write/read, and the
instruction word is 32 bits long, two consecutive writes/reads are needed for each instruction
word. The byte order is big-endian (i.e. most significant words first). After each full-word
write/read, the internal pointer is autoincremented.
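The word order described above can be sketched as a function that turns a list of 32-bit instruction words into a sequence of SCI writes. The register numbers 6 (SCI_WRAM) and 7 (SCI_WRAMADDR) are assumptions not stated in this text, and struct sci_op and wram_build_writes are hypothetical names; sending the writes over the bus is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed SCI register numbers. */
#define SCI_WRAM     0x06u
#define SCI_WRAMADDR 0x07u

struct sci_op {
    uint8_t  reg;
    uint16_t value;
};

/* Fill `ops` with one SCI_WRAMADDR write followed by two SCI_WRAM writes
 * per 32-bit word, most significant 16 bits first.
 * Returns the number of ops written, or 0 if `cap` is too small. */
static size_t wram_build_writes(uint16_t start, const uint32_t *words, size_t n,
                                struct sci_op *ops, size_t cap)
{
    size_t k = 0, i;

    if (1 + 2 * n > cap)
        return 0;
    ops[k].reg = SCI_WRAMADDR;
    ops[k].value = start;
    k++;
    for (i = 0; i < n; i++) {
        ops[k].reg = SCI_WRAM;
        ops[k].value = (uint16_t)(words[i] >> 16);      /* high word first */
        k++;
        ops[k].reg = SCI_WRAM;
        ops[k].value = (uint16_t)(words[i] & 0xFFFFu);  /* then low word */
        k++;
    }
    return k;
}
```

The start address passed in must already include the memory-area offset for the target memory.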
SCI_WRAMADDR is used to set the program address for following SCI_WRAM writes/reads.
Use an address offset from the following table to access X, Y, I or peripheral memory.
Only user areas in X, Y, and instruction memory are listed above. Other areas can be accessed,
but should not be written to unless otherwise specified.
For WAV files, SCI_HDAT1 contains 0x7665 (“ve”). SCI_HDAT0 contains the data rate mea-
sured in bytes per second for all supported RIFF WAVE formats: mono and stereo 8-bit or
16-bit PCM, mono and stereo IMA ADPCM. To get the bitrate of the file, multiply the value by 8.
For AAC ADTS streams, SCI_HDAT1 contains 0x4154 (“AT”). For AAC ADIF files, SCI_HDAT1
contains 0x4144 (“AD”). For AAC .mp4 / .m4a files, SCI_HDAT1 contains 0x4D34 (“M4”).
SCI_HDAT0 contains the average data rate in bytes per second. To get the bitrate of the file,
multiply the value by 8.
For WMA files, SCI_HDAT1 contains 0x574D (“WM”) and SCI_HDAT0 contains the data rate
measured in bytes per second. To get the bitrate of the file, multiply the value by 8.
For MIDI files, SCI_HDAT1 contains 0x4D54 (“MT”) and SCI_HDAT0 contains the average data
rate in bytes per second. To get the bitrate of the file, multiply the value by 8.
For Ogg Vorbis files, SCI_HDAT1 contains 0x4F67 (“Og”). SCI_HDAT0 contains the average
data rate in bytes per second. To get the bitrate of the file, multiply the value by 8.
For MP3 files, SCI_HDAT1 is between 0xFFE0 and 0xFFFF. SCI_HDAT1 / 0 contain the follow-
ing:
When read, SCI_HDAT0 and SCI_HDAT1 contain header information that is extracted from
MP3 stream currently being decoded. After reset both registers are cleared, indicating no data
has been found yet.
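The SCI_HDAT1 values listed above for each format can be folded into a single lookup. The sketch below uses only the constants given in the text; the enum and function names are hypothetical, and the bitrate helper applies only to the formats where SCI_HDAT0 holds a byte rate (that is, not to MP3).

```c
#include <stdint.h>

enum vs_format {
    VS_FMT_NONE,      /* registers cleared, no data found yet */
    VS_FMT_WAV,
    VS_FMT_AAC_ADTS,
    VS_FMT_AAC_ADIF,
    VS_FMT_AAC_MP4,
    VS_FMT_WMA,
    VS_FMT_MIDI,
    VS_FMT_OGG,
    VS_FMT_MP3,
    VS_FMT_UNKNOWN
};

/* Classify the stream being decoded from the SCI_HDAT1 value. */
static enum vs_format vs_format_from_hdat1(uint16_t hdat1)
{
    if (hdat1 >= 0xFFE0u)
        return VS_FMT_MP3;
    switch (hdat1) {
    case 0x0000u: return VS_FMT_NONE;
    case 0x7665u: return VS_FMT_WAV;      /* "ve" */
    case 0x4154u: return VS_FMT_AAC_ADTS; /* "AT" */
    case 0x4144u: return VS_FMT_AAC_ADIF; /* "AD" */
    case 0x4D34u: return VS_FMT_AAC_MP4;  /* "M4" */
    case 0x574Du: return VS_FMT_WMA;      /* "WM" */
    case 0x4D54u: return VS_FMT_MIDI;     /* "MT" */
    case 0x4F67u: return VS_FMT_OGG;      /* "Og" */
    default:      return VS_FMT_UNKNOWN;
    }
}

/* Bitrate in bit/s from an SCI_HDAT0 byte rate (non-MP3 formats). */
static uint32_t vs_bitrate_from_hdat0(uint16_t hdat0)
{
    return (uint32_t)hdat0 * 8u;
}
```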
The “bitrate” field in HDAT0 is read according to the following table. Notice that for variable
bitrate stream the value changes constantly.
The average data rate in bytes per second can be read from memory, see the byteRate extra
parameter. This variable contains the byte rate for all codecs. To get the bitrate of the file,
multiply the value by 8.
The bitrate calculation is not automatically reset between songs, but it can also be reset without
a software or hardware reset by writing to SCI_DECODE_TIME.
SCI_AIADDR indicates the start address of the application code written earlier with SCI_WRAMADDR
and SCI_WRAM registers. If no application code is used, this register should not be initialized,
or it should be initialized to zero. For more details, see Application Notes for VS10XX.
Note: Reading AIADDR is not recommended. It can cause samplerate to be set to a very low
value.
SCI_VOL is a volume control for the player hardware. The most significant byte of the volume
register controls the left channel volume, the low part controls the right channel volume. The
channel volume sets the attenuation from the maximum volume level in 0.5 dB steps. Thus,
maximum volume is 0x0000 and total silence is 0xFEFE.
Note, that after hardware reset the volume is set to full volume. Resetting the software does
not reset the volume setting.
Example: for a volume of -2.0 dB for the left channel and -3.5 dB for the right channel: (2.0/0.5)
= 4, 3.5/0.5 = 7 → SCI_VOL = 0x0407.
Example: SCI_VOL = 0x2424 → both left and right volumes are 0x24 * -0.5 = -18.0 dB.
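Both examples above can be checked with a small helper that takes the attenuation of each channel in 0.5 dB steps. This is a sketch with hypothetical function names; values are clamped to 0xFE per channel so that the result never exceeds the 0xFEFE given as total silence.

```c
#include <stdint.h>

/* Build SCI_VOL from per-channel attenuation in 0.5 dB steps
 * (4 means -2.0 dB). Left goes to the high byte, right to the low byte. */
static uint16_t sci_vol_from_steps(unsigned left_steps, unsigned right_steps)
{
    if (left_steps > 0xFEu)
        left_steps = 0xFEu;
    if (right_steps > 0xFEu)
        right_steps = 0xFEu;
    return (uint16_t)((left_steps << 8) | right_steps);
}

/* Left-channel attenuation in tenths of a dB (180 means -18.0 dB). */
static unsigned sci_vol_left_tenths_db(uint16_t vol)
{
    return ((unsigned)vol >> 8) * 5u;
}

/* Right-channel attenuation in tenths of a dB. */
static unsigned sci_vol_right_tenths_db(uint16_t vol)
{
    return ((unsigned)vol & 0xFFu) * 5u;
}
```

Working in half-dB steps avoids floating point on small microcontrollers: -2.0 dB is 4 steps and -3.5 dB is 7 steps.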
In VS1053b bass and treble initialization and volume change is delayed until the next batch of
samples are sent to the audio FIFO. Thus, audio interrupts can no longer be missed during a
write to SCI_BASS or SCI_VOL.
This delays the volume setting slightly, but because the volume control is now done in the DAC
hardware instead of performing it to the samples going into the audio FIFO, the overall volume
change response is better than before. Also, the actual volume control has zero-cross detec-
tion, which almost completely removes all audible noise that occurs when volume is suddenly
changed.
SCI_AICTRL[x] registers ( x=[0 .. 3] ) can be used to access the user’s application program.
The AICTRL registers are also used with PCM/ADPCM encoding mode.
10 Operation
10.1 Clocking
VS1053b operates on a single, nominally 12.288 MHz fundamental frequency master clock.
This clock can be generated by external circuitry (connected to pin XTALI) or by the internal
clock crystal interface (pins XTALI and XTALO). This clock is used by the analog parts and
determines the highest available samplerate. With 12.288 MHz clock all samplerates up to
48000 Hz are available.
VS1053b can also use 24..26 MHz clocks when SM_CLK_RANGE in the SCI_MODE register is
set to 1. The system clock is then divided by 2 at the clock input and the chip gets a 12..13 MHz
input clock.
When the XRESET signal is driven low, VS1053b is reset and all the control registers and
internal states are set to the initial values. The XRESET signal is asynchronous to any external
clock. The reset mode doubles as a full-powerdown mode, where both digital and analog parts
of VS1053b are in minimum power consumption stage, and where clocks are stopped. Also
XTALO is grounded.
When XRESET is asserted, all output pins go to their default states. All input pins will go to
high-impedance state (to input state), except SO, which is still controlled by the XCS.
After a hardware reset (or at power-up) DREQ will stay down for around 22000 clock cycles,
which means an approximate 1.8 ms delay if VS1053b is run at 12.288 MHz. After this the
user should set such basic software registers as SCI_MODE, SCI_BASS, SCI_CLOCKF, and
SCI_VOL before starting decoding. See section 9.6 for details.
If the input clock is 24..26 MHz, SM_CLK_RANGE should be set as soon as possible after a
chip reset without waiting for DREQ.
Internal clock can be multiplied with a PLL. Supported multipliers through the SCI_CLOCKF
register are 1.0 × . . . 5.0× the input clock. Reset value for Internal Clock Multiplier is 1.0×. If
typical values are wanted, the Internal Clock Multiplier needs to be set to 3.5× after reset. Wait
until DREQ rises, then write value 0x9800 to SCI_CLOCKF (register 3). See section 9.6.4 for
details.
In some cases the decoder software has to be reset. This is done by activating bit SM_RESET
in register SCI_MODE (Chapter 9.6.1). Then wait for at least 2 µs, then look at DREQ. DREQ
will stay down for about 22000 clock cycles, which means an approximate 1.8 ms delay if
VS1053b is run at 12.288 MHz. After DREQ is up, you may continue playback as usual.
As opposed to all earlier VS10XX chips, it is not recommended to do a software reset between
songs. This way the user may be sure that even files with low samplerates or bitrates are played
right to their end.
If you need to keep the system running while not decoding data, but need to lower the power
consumption, you can use the following tricks.
• Select the 1.0× clock by writing 0x0000 to SCI_CLOCKF. This disables the PLL and saves
some power.
• Write a low non-zero value, such as 0x0010 to SCI_AUDATA. This will reduce the sam-
plerate and the number of audio interrupts required. Between audio interrupts the VSDSP
core will just wait for an interrupt, thus saving power.
• If possible for the application, write 0xffff to SCI_VOL to disable the analog drivers.
Note: The low power mode consumes significantly more electricity than hardware reset.
This is the normal operation mode of VS1053b. SDI data is decoded. Decoded samples are
converted to analog domain by the internal DAC. If no decodable data is found, SCI_HDAT0
and SCI_HDAT1 are set to 0.
When there is no input for decoding, VS1053b goes into idle mode (lower power consumption
than during decoding) and actively monitors the serial data input for valid data.
Cancelling playback of a song is a normal operation when the user wants to jump to another
song while doing playback.
VS1053b allows fast audio playback. If your microcontroller can feed data fast enough to the
VS1053b, this is the preferred way to fast forward audio.
To estimate whether or not your microcontroller can feed enough data to VS1053b in fast play
mode, see contents of extra parameter value byteRate (Chapter 10.11). Note that byteRate
contains the data speed of the file played back at nominal speed even when fast play is active.
To do fast forward and rewind you need the capability to do random access to the audio file.
Unfortunately fast forward and rewind isn’t available at all times, like when file headers are
being read.
Note: It is recommended that playback volume is decreased by e.g. 10 dB when fast forward-
ing/rewinding.
Note: MIDI is not suitable for random access. You can implement fast forward using the
playSpeed extra parameter to select 1-128× play speed. SCI_DECODE_TIME also speeds
up. If necessary, rewind can be implemented by restarting decoding of a MIDI file and fast
playing to the appropriate place. SCI_DECODE_TIME can be used to decide when the right
place has been reached.
When fast forward and rewind operations are performed, there is no way to maintain correct
decode time for most files. However, WMA and Ogg Vorbis files offer exact time information in
the file. To use accurate time information whenever possible, use the following algorithm:
VS1053b can be used as a PCM decoder by sending a WAV file header. If the length sent in the
WAV header is 0xFFFFFFFF, VS1053b will stay in PCM mode indefinitely (or until SM_CANCEL
has been set). 8-bit (unsigned) linear and 16-bit (signed, 2’s complement) linear audio is sup-
ported in mono or stereo. A WAV header looks like this:
File Offset  Field Name     Size  Bytes                Description
0            ChunkID        4     "RIFF"
4            ChunkSize      4     0xff 0xff 0xff 0xff
8            Format         4     "WAVE"
12           SubChunk1ID    4     "fmt "
16           SubChunk1Size  4     0x10 0x0 0x0 0x0     16
20           AudioFormat    2     0x1 0x0              Linear PCM
22           NumOfChannels  2     C0 C1                1 for mono, 2 for stereo
24           SampleRate     4     S0 S1 S2 S3          0x1f40 for 8 kHz
28           ByteRate       4     R0 R1 R2 R3          0x3e80 for 8 kHz 16-bit mono
32           BlockAlign     2     A0 A1                0x02 0x00 for mono, 0x04 0x00 for stereo 16-bit
34           BitsPerSample  2     B0 B1                0x10 0x00 for 16-bit data
36           SubChunk2ID    4     "data"
40           SubChunk2Size  4     0xff 0xff 0xff 0xff  Data size
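A header of this kind can be generated at run time. The sketch below assumes that the fields follow each other without gaps, so that "data" starts at byte 36 and the whole header is 44 bytes, and it writes all multi-byte values in little-endian order as usual for RIFF files. The function names are hypothetical. Both size fields are set to 0xFFFFFFFF so that VS1053b stays in PCM mode indefinitely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Write a 44-byte linear PCM WAV header with "endless" length fields.
 * bits must be 8 or 16, channels 1 or 2. Returns the header size. */
static size_t wav_stream_header(uint8_t out[44], uint32_t rate_hz,
                                uint16_t channels, uint16_t bits)
{
    uint16_t block_align = (uint16_t)(channels * (bits / 8u));

    memcpy(out + 0, "RIFF", 4);
    put_le32(out + 4, 0xFFFFFFFFu);            /* ChunkSize */
    memcpy(out + 8, "WAVE", 4);
    memcpy(out + 12, "fmt ", 4);
    put_le32(out + 16, 16u);                   /* SubChunk1Size */
    put_le16(out + 20, 1u);                    /* AudioFormat: linear PCM */
    put_le16(out + 22, channels);
    put_le32(out + 24, rate_hz);
    put_le32(out + 28, rate_hz * block_align); /* ByteRate */
    put_le16(out + 32, block_align);
    put_le16(out + 34, bits);
    memcpy(out + 36, "data", 4);
    put_le32(out + 40, 0xFFFFFFFFu);           /* SubChunk2Size */
    return 44;
}
```

The resulting 44 bytes are sent through SDI first, followed by the raw samples.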
Ogg Vorbis is an open file format that allows for very high sound quality with low to medium
bitrates.
Ogg Vorbis recording is activated by loading the Ogg Vorbis Encoder Application to the 16
KiB program RAM memory of the VS1053b. After activation, encoder results can be read
from registers SCI_HDAT0 and SCI_HDAT1, much like when using PCM/ADPCM recording
(Chapter 10.8).
Three profiles are provided: one for high-quality stereo recording at a bitrate of approx. 140
kbit/s, and two for speech-quality mono recording at bitrates between 15 and 30 kbit/s.
To use the Ogg Vorbis Encoder application, please load the application from VLSI Solution’s
Web page http://www.vlsi.fi/en/support/software/vs10xxapplications.html and read the accom-
panying documentation.
This chapter explains how to record a RIFF/WAV file in PCM or IMA ADPCM format. IMA
ADPCM is a widely supported ADPCM format and many PC audio playback programs can play
it. IMA ADPCM recording gives a compression ratio of almost 4:1 compared to linear, 16-bit
audio. This makes it possible to record for example mono 8 kHz audio at 32.44 kbit/s.
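The 32.44 kbit/s figure follows from the block structure of IMA ADPCM. The sketch below assumes 256 bytes per channel per block and 505 samples per block; 505 (0x01f9) is the samples-per-block value that appears in the example IMA ADPCM header elsewhere in this document. The function name is hypothetical.

```c
#include <stdint.h>

#define IMA_BLOCK_BYTES_PER_CH 256u /* bytes per channel in one block */
#define IMA_SAMPLES_PER_BLOCK  505u /* samples per channel in one block */

/* Approximate IMA ADPCM bitrate in bit/s, rounded down. */
static uint32_t ima_adpcm_bitrate(uint32_t rate_hz, unsigned channels)
{
    uint64_t bits_per_block = (uint64_t)channels * IMA_BLOCK_BYTES_PER_CH * 8u;
    return (uint32_t)((uint64_t)rate_hz * bits_per_block / IMA_SAMPLES_PER_BLOCK);
}
```

For 8 kHz mono this gives 32443 bit/s. The same arithmetic explains the compression ratio: 505 samples of 16-bit audio take 1010 bytes, against 256 bytes per block.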
VS1053 has a stereo ADC, thus also two-channel (separate AGC, if AGC enabled) and stereo
(common AGC, if AGC enabled) modes are available. Mono recording mode selects either the
left or right channel. Left channel is either MIC or LINE1 depending on the SCI_MODE register.
If absolute best quality stereo PCM recording at 48 kHz is required, download and use the
VS1053 WAV PCM Recorder Application plugin, available for download at
http://www.vlsi.fi/en/support/software/vs10xxapplications.html . If you use it, follow the instruc-
tions of the plugin documentation instead of this datasheet.
PCM / IMA ADPCM recording mode is activated by setting bit SM_ADPCM in SCI_MODE
and loading and starting a patch code. Line input 1 is used instead of differential mic in-
put if SM_LINE1 is set. Before activating ADPCM recording, user must write the right val-
ues to SCI_AICTRL0 and SCI_AICTRL3. These values are only read at recording startup.
SCI_AICTRL1 and SCI_AICTRL2 can be altered anytime, but it is preferable to write good init
values before activation.
SCI_AICTRL1 controls linear recording gain. 1024 is equal to digital gain 1, 512 is equal to
digital gain 0.5 and so on. If the user wants to use automatic gain control (AGC), SCI_AICTRL1
should be set to 0. Typical speech applications usually are better off using AGC, as this takes
care of relatively uniform speech loudness in recordings.
SCI_AICTRL2 controls the maximum AGC gain. This can be used to limit the amplification of
noise when there is no signal. If SCI_AICTRL2 is zero, the maximum gain is initialized to 65535
(64×), i.e. whole range is used.
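The gain scale used by SCI_AICTRL1 and SCI_AICTRL2 is a fixed-point value where 1024 equals 1×. A minimal conversion sketch, with a hypothetical function name and the gain given in thousandths to avoid floating point:

```c
#include <stdint.h>

/* Convert a linear gain in thousandths (1000 means 1.0x) to the
 * 1024-based register scale, rounded to nearest and capped at 65535. */
static uint16_t aictrl_gain(uint32_t gain_milli)
{
    uint32_t v;

    if (gain_milli >= 64000u)
        return 65535u; /* largest register value, about 64x */
    v = (gain_milli * 1024u + 500u) / 1000u;
    return (uint16_t)v;
}
```

Note that a result of 0 has a special meaning in both registers: in SCI_AICTRL1 it selects AGC, and in SCI_AICTRL2 it selects the full 64× range.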
For example:
WriteVS10xxRegister(SCI_AICTRL0, 16000U);
WriteVS10xxRegister(SCI_AICTRL1, 0);
WriteVS10xxRegister(SCI_AICTRL2, 4096U);
WriteVS10xxRegister(SCI_AICTRL3, 0);
WriteVS10xxRegister(SCI_MODE, ReadVS10xxRegister(SCI_MODE) |
SM_ADPCM | SM_LINE1);
#ifdef I_HAVE_THE_VS1053B_PATCHES_PACKAGE
/* Strongly recommended to use the VS1053b Patches package. Get it at
http://www.vlsi.fi/en/support/software/vs10xxpatches.html */
LoadVS1053PatchesPackage(); /* Loads patches and starts recording */
#else
/* If not using the VS1053b Patches package, you need to do these steps */
WriteVS10xxPatch();
#endif
selects 16 kHz, stereo mode with automatic gain control and maximum amplification of 4×.
This small and incomplete patch is also available from VLSI Solution’s web page
http://www.vlsi.fi/en/support/software/vs10xxpatches.html by the name of VS1053b IMA AD-
PCM Encoder Fix.
After PCM / IMA ADPCM recording has been activated, registers SCI_HDAT0 and SCI_HDAT1
have new functions.
The PCM / IMA ADPCM sample buffer is 1024 16-bit words. The fill status of the buffer can
be read from SCI_HDAT1. If SCI_HDAT1 is greater than 0, you can read that many 16-bit words
from SCI_HDAT0. If the data is not read fast enough, the buffer overflows and returns to empty
state.
Note: if SCI_HDAT1 ≥ 768, it may be better to wait for the buffer to overflow and clear before
reading samples. That way you may avoid buffer aliasing.
In IMA ADPCM mode each mono IMA ADPCM block is 128 words, i.e. 256 bytes, and stereo
IMA ADPCM block is 256 words, i.e. 512 bytes. If you wish to interrupt reading data and
possibly continue later, please stop at the boundary. This way complete compression blocks
are skipped and the encoded stream stays valid.
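The buffer rules above can be combined into one decision: how many words to read from SCI_HDAT0 right now. This is one possible policy, not the only valid one. It follows the note about fill levels of 768 or more by reading nothing in that case, and otherwise reads only whole blocks. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 16-bit words to read from SCI_HDAT0, given the fill level
 * read from SCI_HDAT1. block_words is 128 (mono) or 256 (stereo) for
 * IMA ADPCM; pass 0 for PCM, where any word count is acceptable. */
static size_t rec_words_to_read(uint16_t fill, size_t block_words)
{
    if (fill >= 768u)
        return 0; /* wait for the buffer to overflow and clear */
    if (block_words == 0)
        return fill;
    return ((size_t)fill / block_words) * block_words; /* whole blocks only */
}
```

A return value of 0 means "poll again later"; the caller does not need to distinguish an almost empty buffer from one that is about to overflow.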
To make your PCM file a RIFF / WAV file, you have to add a header to the data. The following
shows a header for a mono file. Note that 2- and 4-byte values are little-endian (lowest byte
first).
If you know beforehand how much you are going to record, you may fill in the complete header
before any actual data. However, if you don’t know how much you are going to record, you have
to fill in the header size fields F and D after finishing recording.
The PCM data is read from SCI_HDAT0 and written into file as follows. The low 8 bits of
SCI_HDAT0 should be written as the first byte to a file, then the high 8 bits (little-endian order).
Note that this is different from how ADPCM data is written to a file (see Chapter 10.8.4).
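The byte order for PCM data can be made explicit with a small conversion routine, which also keeps the result independent of the host processor's own endianness. A minimal sketch with a hypothetical function name:

```c
#include <stddef.h>
#include <stdint.h>

/* Convert words read from SCI_HDAT0 to PCM file bytes: low byte first.
 * `out` must have room for 2 * n bytes. Returns the number of bytes written. */
static size_t pcm_words_to_le_bytes(const uint16_t *words, size_t n, uint8_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[2 * i]     = (uint8_t)(words[i] & 0xFFu); /* low 8 bits first */
        out[2 * i + 1] = (uint8_t)(words[i] >> 8);    /* then high 8 bits */
    }
    return 2 * n;
}
```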
Below is an example of a valid header for a 48 kHz mono PCM file (sample rate bytes
80 bb 00 00 = 48000) that has a final length of 1798768 (0x1B7270) bytes:
0000 52 49 46 46 68 72 1b 00 57 41 56 45 66 6d 74 20 |RIFFhr..WAVEfmt |
0010 10 00 00 00 01 00 01 00 80 bb 00 00 00 77 01 00 |.............w..|
0020 02 00 10 00 64 61 74 61 44 72 1b 00 |....dataDr..|
To make your IMA ADPCM file a RIFF / WAV file, you have to add a header to the data. The
following shows a header for a mono file. Note that 2- and 4-byte values are little-endian (lowest
byte first).
If you know beforehand how much you are going to record, you may fill in the complete header
before any actual data. However, if you don’t know how much you are going to record, you have
to fill in the header size fields F, S and D after finishing recording.
The 128 words (256 words for stereo) of an ADPCM block are read from SCI_HDAT0 and
written into file as follows. The high 8 bits of SCI_HDAT0 should be written as the first byte to a
file, then the low 8 bits (big-endian order). Note that this is contrary to the native byte order of
some 16-bit microcontrollers, and you may have to take extra care to do this right. Note also,
that this is different from how PCM data is written to a file (see Chapter 10.8.3).
To see if you have written the mono file in the right way check bytes 2 and 3 (the first byte
counts as byte 0) of each 256-byte block. Byte 2 should be in the range 0..88 and byte 3 should
be zero. For stereo you check bytes 2, 3, 6, and 7 of each 512-byte block. Bytes 2 and 6 should
be in the range 0..88. Bytes 3 and 7 should be zero.
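The byte order and the sanity check described above can be sketched together. The function names are hypothetical; the check only inspects the bytes named in the text (2 and 3, plus 6 and 7 for stereo), so it detects swapped byte order but does not validate the rest of the block.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert words read from SCI_HDAT0 to ADPCM file bytes: high byte first.
 * `out` must have room for 2 * n bytes. Returns the number of bytes written. */
static size_t adpcm_words_to_be_bytes(const uint16_t *words, size_t n, uint8_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[2 * i]     = (uint8_t)(words[i] >> 8);    /* high 8 bits first */
        out[2 * i + 1] = (uint8_t)(words[i] & 0xFFu); /* then low 8 bits */
    }
    return 2 * n;
}

/* Check the start of one ADPCM block (256 bytes mono, 512 bytes stereo):
 * byte 2 (and 6 for stereo) must be 0..88, byte 3 (and 7) must be zero.
 * Returns 1 if the block start looks valid, 0 otherwise. */
static int adpcm_block_header_ok(const uint8_t *block, int stereo)
{
    if (block[2] > 88u || block[3] != 0u)
        return 0;
    if (stereo && (block[6] > 88u || block[7] != 0u))
        return 0;
    return 1;
}
```

If the words were written low byte first by mistake, bytes 2 and 3 swap places and the check fails for every block whose byte 2 is non-zero.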
Below is an example of a valid header for a 44.1 kHz stereo IMA ADPCM file that has a final
length of 10038844 (0x992E3C) bytes:
0000 52 49 46 46 34 2e 99 00 57 41 56 45 66 6d 74 20 |RIFF4...WAVEfmt |
0010 14 00 00 00 11 00 02 00 44 ac 00 00 a7 ae 00 00 |........D.......|
0020 00 02 04 00 02 00 f9 01 66 61 63 74 04 00 00 00 |........fact....|
0030 14 15 97 00 64 61 74 61 00 2e 99 00 |....data....|
In order to play back your PCM / IMA ADPCM recordings, you have to have a file with a header
as described in Chapter 10.8.3 or Chapter 10.8.4. If this is the case, all you need to do is to
provide the ADPCM file through SDI as you would with any audio file.
VS10xx chips that support IMA ADPCM playback are capable of playing back ADPCM files with
any sample rate. However, some other programs may expect IMA ADPCM files to have some
exact sample rates, like 8000 or 11025 Hz. Also, some programs or systems do not support
sample rates below 8000 Hz.
If you want better quality at the expense of increased data rate, you can use higher sample
rates, for example 16 kHz.
In VS1053b writing to the SCI_VOL register during IMA ADPCM encoding does not change the
volume. You need to set a suitable volume before activating the IMA ADPCM mode, or you can
use the VS1053 hardware volume control register DAC_VOL directly.
For example:
WriteVS10xxRegister(SCI_WRAMADDR, 0xc045); /*DAC_VOL*/
WriteVS10xxRegister(SCI_WRAM, 0x0101); /*-6.0 dB*/
The hardware volume control DAC_VOL (address 0xc045) allows 0.5 dB steps for both left (high
8 bits) and right channel (low 8 bits). The low 4 bits of both 8-bit values set the attenuation in
6 dB steps (range 0..15), the high 4 bits in 0.5 dB steps (range 0..11). For examples, see table
below.
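A minimal sketch of how a DAC_VOL value could be composed is shown here. The helper names are hypothetical. The level calculation assumes that the fine field adds 0.5 dB per step and the coarse field subtracts 6 dB per step, which is consistent with 0x0101 giving -6.0 dB in the example above.

```c
#include <stdint.h>

/* Pack one channel: fine = 0..11 (0.5 dB steps), coarse = 0..15 (6 dB steps).
 * Out-of-range arguments are clamped. */
uint8_t dac_vol_channel(unsigned fine, unsigned coarse)
{
    if (fine > 11) fine = 11;
    if (coarse > 15) coarse = 15;
    return (uint8_t)((fine << 4) | coarse);
}

/* Combine left (high 8 bits) and right (low 8 bits) into a DAC_VOL value. */
uint16_t dac_vol_pack(uint8_t left, uint8_t right)
{
    return (uint16_t)(((uint16_t)left << 8) | right);
}

/* Level of one channel in units of 0.5 dB (assumed: fine adds, coarse subtracts). */
int dac_vol_half_db(uint8_t channel)
{
    int fine = channel >> 4;
    int coarse = channel & 0x0F;
    return fine - 12 * coarse;
}
```

For example, dac_vol_pack(dac_vol_channel(0, 1), dac_vol_channel(0, 1)) yields 0x0101, the -6.0 dB value used above.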
If GPIO0 is set with a pull-up resistor to 1 at boot time, VS1053b tries to boot from external SPI
memory.
The memory has to be an SPI Bus Serial EEPROM with 16-bit or 24-bit addresses. The serial
speed used by VS1053b is 245 kHz with the nominal 12.288 MHz clock. The first three bytes
in the memory have to be 0x50, 0x26, 0x48.
If GPIO0 is low and GPIO1 is high during boot, real-time MIDI mode is activated. In this mode
the PLL is configured to 4.0×, the UART is configured to the MIDI data rate 31250 bps, and
real-time MIDI data is then read from UART and SDI. Both input methods should not be used
simultaneously. If you use SDI, first send 0x00 and then send the MIDI data byte.
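The SDI framing for real-time MIDI (a 0x00 byte before each MIDI data byte) can be sketched as below. The function name is hypothetical, and sending the resulting buffer over SDI is left to the host's own SPI routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand MIDI bytes into an SDI byte stream: each MIDI byte is preceded by 0x00.
 * out must have room for 2 * n bytes. Returns the number of bytes written. */
size_t midi_to_sdi(const uint8_t *midi, size_t n, uint8_t *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        out[k++] = 0x00;    /* padding byte required before each MIDI byte */
        out[k++] = midi[i]; /* the MIDI data byte itself */
    }
    return k;
}
```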
The EarSpeaker setting can be configured with GPIO2 and GPIO3. The states of GPIO2 and GPIO3
are only read at startup.
Real-Time MIDI can also be started with a small patch code using SCI.
Note: The real-time MIDI parser in VS1053b does not know how to skip SysEx messages. An
improved version can be loaded into IRAM if needed.
The following structure is in X memory at address 0x1e02 (note the different location than in
VS1033) and can be used to change some extra parameters or get useful information.
Notice that reading two-word variables through the SCI_WRAMADDR and SCI_WRAM inter-
face is not protected in any way. The variable can be updated between the read of the low and
high parts. The problem arises when both the low and high parts change values. To determine
if the value is correct, you should read the value twice and compare the results.
The following example shows what happens when bytesLeft is decreased from 0x10000 to
0xffff and the update happens between low and high part reads or after high part read.
You can see that in the invalid read the low part wraps from 0x0000 to 0xffff while the high part
stays the same. In this case the second read gives a valid answer, otherwise always use the
value of the first read. The second read is needed when it is possible that the low part wraps
around, changing the high part, i.e. when the low part is small. bytesLeft is only decreased
by one at a time, so a reread is needed only if the low part is 0.
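The reread rule for bytesLeft can be sketched as follows. The function names are hypothetical, and the sketch assumes that the low word has been read before the high word, as in the description above. It only applies to a value that decreases by one at a time.

```c
#include <stdint.h>

/* Combine separately read low and high 16-bit parts into one 32-bit value. */
uint32_t combine_words(uint16_t lo, uint16_t hi)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Choose between two consecutive reads of a two-word variable.
 * If the low part of the first read is 0, the low part may have wrapped
 * between the low and high reads, so the second read is used instead.
 * Otherwise the first read is used. */
uint32_t pick_valid_read(uint32_t first, uint32_t second)
{
    return ((first & 0xFFFFu) == 0) ? second : first;
}
```

In practice the second read is only needed when the low part of the first read is 0, so a host can skip it in all other cases.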
These parameters are common for all codecs. Other fields are only valid when the correspond-
ing codec is active. The currently active codec can be determined from SCI_HDAT1.
The fuse-programmed ID is read at startup and copied into the chipID field. If not available,
the value will be all zeros. The version field can be used to determine the layout of the rest
of the structure. The version number is changed when the structure is changed. For VS1053b
the structure version is 3.
playSpeed makes it possible to fast forward songs. Decoding of the bitstream is performed,
but only every playSpeed-th frame is played. For example, writing 4 to playSpeed will play
the song four times as fast as normal, if you are able to feed the data at that speed. Write 0
or 1 to return to normal speed. SCI_DECODE_TIME will also count faster. All current codecs
support the playSpeed configuration.
byteRate contains the average bitrate in bytes per second for every codec. The value is updated
once per second and it can be used to calculate an estimate of the remaining playtime. This
value is also available in SCI_HDAT0 for all codecs except MP3, MP2, and MP1.
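A remaining-playtime estimate based on byteRate might look like the sketch below. The function name is hypothetical, and the number of bytes still to be sent is assumed to be tracked by the host.

```c
#include <stdint.h>

/* Estimate remaining playtime in seconds from the bytes still to be sent
 * and the byteRate value read from the chip.
 * Returns -1 if byteRate is 0 (not yet known). */
int32_t remaining_seconds(uint32_t bytes_left_in_file, uint16_t byte_rate)
{
    if (byte_rate == 0) {
        return -1;
    }
    return (int32_t)(bytes_left_in_file / byte_rate);
}
```

Because byteRate is an average that is updated once per second, the estimate can drift for variable-bitrate files.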
endFillByte indicates what byte value to send after the file has been sent, before setting SM_CANCEL.
jumpPoints contain 32-bit file offsets. Each valid (non-zero) entry indicates a start of a packet
for WMA or start of a raw data block for AAC (ADIF, .mp4 / .m4a). latestJump contains the
index of the entry that was updated last. If you only read entry pointed to by latestJump you
do not need to read the entry twice to ensure validity. Jump point information can be used to
implement perfect fast forward and rewind for WMA and AAC (ADIF, .mp4 / .m4a).
positionMsec is a field that gives the current play position in a file in milliseconds, regardless
of rewind and fast forward operations. The value is only available in codecs that can determine
the play position from the stream itself. Currently WMA and Ogg Vorbis provide this information.
If the position is unknown, this field contains -1.
resync field is used to force a resynchronization to the stream for WMA and AAC (ADIF, .mp4
/ .m4a) instead of ending the decode at first error. This field can be used to implement almost
perfect fast forward and rewind for WMA and AAC (ADIF, .mp4 / .m4a). The user should set this
field before performing data seeks if they are not in packet or data block boundaries. The field
value tells how many tries are allowed before giving up. The value 32767 gives infinite tries.
The resync field is set to 32767 after a reset to make resynchronization the default action, but
it can be cleared after reset to restore the old action. When resync is set, every file decode
should always end as described in Chapter 10.5.1.
Seek fields no longer exist. When resync is required, WMA and AAC codecs now enter broad-
cast/stream mode where file size information is ignored. Also, the file size and sample size in-
formation of WAV files are ignored when resync is non-zero. The user must use SM_CANCEL
or software reset to end decoding.
Note: WAV, WMA, ADIF, and .mp4 / .m4a files begin with a metadata or header section, which
must be fully processed before any fast forward or rewind operation. SS_DO_NOT_JUMP
(in SCI_STATUS) is clear when the header information has been processed and jumps are
allowed.
10.11.2 WMA
The ASF header packet size is available in packetSize. With this information and a packet
start offset from jumpPoints you can parse the packet headers and skip packets in ASF files.
WMA decoder can also increase the internal clock automatically when it detects that a file can
not be decoded correctly with the current clock. The maximum allowed clock is configured with
the SCI_CLOCKF register.
10.11.3 AAC
playSelect determines which element to decode if a stream has multiple elements. The value
is set to 0 each time AAC decoding starts, which causes the first element that appears in the
stream to be selected for decoding. Other values are: 0x01 - select first single channel element
(SCE), 0x02 - select first channel pair element (CPE), 0x03 - select first low frequency element
(LFE), S ∗ 16 + 5 - select SCE number S, P ∗ 16 + 6 - select CPE number P, L ∗ 16 + 7 -
select LFE number L. When automatic selection has been performed, playSelect reflects the
selected element.
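The playSelect encodings listed above can be written as small helpers. The names are hypothetical; the values follow the list directly.

```c
#include <stdint.h>

/* Fixed playSelect values from the list above. */
enum {
    AAC_SELECT_AUTO      = 0x00, /* first element that appears in the stream */
    AAC_SELECT_FIRST_SCE = 0x01, /* first single channel element */
    AAC_SELECT_FIRST_CPE = 0x02, /* first channel pair element */
    AAC_SELECT_FIRST_LFE = 0x03  /* first low frequency element */
};

/* Select SCE number s: S * 16 + 5. */
uint16_t aac_select_sce(unsigned s) { return (uint16_t)(s * 16u + 5u); }

/* Select CPE number p: P * 16 + 6. */
uint16_t aac_select_cpe(unsigned p) { return (uint16_t)(p * 16u + 6u); }

/* Select LFE number l: L * 16 + 7. */
uint16_t aac_select_lfe(unsigned l) { return (uint16_t)(l * 16u + 7u); }
```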
sceFoundMask, cpeFoundMask, and lfeFoundMask indicate which elements have been found in
an AAC stream since the variables have last been cleared. The values can be used to present
an element selection menu with only the available elements.
dynCompress and dynBoost change the behavior of the dynamic range control (DRC) that is
present in some AAC streams. These are also initialized when AAC decoding starts.
sbrAndPsStatus indicates spectral band replication (SBR) and parametric stereo (PS) status.
Bit Usage
0 SBR present
1 upsampling active
2 PS present
3 PS active
Bits 7 to 4 in config1 can be used to control the SBR and PS decoding. Bits 5 and 4 select
SBR mode and bits 7 and 6 select PS mode. These configuration bits are useful if your AAC
license does not cover SBR and/or PS.
config1(5:4) Usage
’00’ normal mode, upsample <24 kHz AAC files
’01’ do not automatically upsample <24 kHz AAC files, but
enable upsampling if SBR is encountered
’10’ never upsample
’11’ disable SBR (also disables PS)
config1(7:6) Usage
’00’ normal mode, process PS if it is available
’01’ process PS if it is available, but in downsampled mode
’10’ reserved
’11’ disable PS processing
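The bit fields of sbrAndPsStatus and config1 described above could be handled as in the following sketch. The macro and function names are hypothetical; only the bit positions come from the tables.

```c
#include <stdint.h>

/* sbrAndPsStatus bits */
#define SBRPS_SBR_PRESENT  (1u << 0)
#define SBRPS_UPSAMPLING   (1u << 1)
#define SBRPS_PS_PRESENT   (1u << 2)
#define SBRPS_PS_ACTIVE    (1u << 3)

/* config1(5:4) SBR modes and config1(7:6) PS modes */
enum { SBR_NORMAL = 0, SBR_UPSAMPLE_ON_SBR_ONLY = 1, SBR_NEVER_UPSAMPLE = 2, SBR_DISABLE = 3 };
enum { PS_NORMAL = 0, PS_DOWNSAMPLED = 1, PS_RESERVED = 2, PS_DISABLE = 3 };

/* Return config1 with bits 7:4 replaced by the given PS and SBR modes.
 * All other bits are preserved. */
uint16_t config1_set_sbr_ps(uint16_t config1, unsigned sbr_mode, unsigned ps_mode)
{
    uint16_t v = (uint16_t)(config1 & 0xFF0Fu);   /* clear bits 7:4 */
    v |= (uint16_t)((sbr_mode & 3u) << 4);        /* SBR mode in bits 5:4 */
    v |= (uint16_t)((ps_mode & 3u) << 6);         /* PS mode in bits 7:6 */
    return v;
}
```

A host whose AAC license covers neither SBR nor PS would use config1_set_sbr_ps(old, SBR_DISABLE, PS_DISABLE), which sets bits 7:4 to 1111.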
AAC decoder can also increase the internal clock automatically when it detects that a file can
not be decoded correctly with the current clock. The maximum allowed clock is configured with
the SCI_CLOCKF register.
If even the highest allowed clock is too slow to decode an AAC file with SBR and PS compo-
nents, the advanced decoding features are automatically dropped one by one until the file can
be played. First the parametric stereo processing is dropped (the playback becomes mono).
If that is not enough, the spectral band replication is turned into downsampled mode (reduced
bandwidth). As the last resort the spectral band replication is fully disabled. Dropped features
are restored at each song change.
10.11.4 Midi
10.11.5 Ogg Vorbis
Ogg Vorbis decoding supports Replay Gain technology. The Replay Gain technology is used
to automatically give all songs a matching volume so that the user does not need to adjust the
volume setting between songs.
If the Ogg Vorbis decoder finds a REPLAYGAIN_ALBUM_GAIN tag in the song header, the tag
is parsed and the decoded gain setting is written to the gain parameter.
If REPLAYGAIN_ALBUM_GAIN is not available, REPLAYGAIN_TRACK_GAIN is used.
If even REPLAYGAIN_TRACK_GAIN is not available, a default of -6 dB (gain value -12) is set.
The player software can use the gain value to adjust the volume level. Negative values mean
that the volume should be decreased, positive values mean that the volume should be in-
creased.
For example gain = -11 means that volume should be decreased by 5.5 dB (−11/2 = −5.5),
and left and right attenuation should be increased by 11. When gain = 2 volume should be
increased by 1 dB (2/2 = 1.0), and left and right attenuation should be decreased by 2. Because
volume setting can not go above +0 dB, the value should be saturated.
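Applying the gain value to a per-channel attenuation can be sketched as follows. The function name is hypothetical. The attenuation is assumed to be in 0.5 dB steps with 0 meaning +0 dB, and the upper limit of 254 is an assumption about the largest usable attenuation value.

```c
#include <stdint.h>

/* Adjust an attenuation value (0.5 dB steps, 0 = +0 dB) by a Replay Gain
 * value (also in 0.5 dB steps). Negative gain increases the attenuation,
 * positive gain decreases it. The result is saturated. */
uint8_t apply_replay_gain(uint8_t attenuation, int gain)
{
    int a = (int)attenuation - gain;
    if (a < 0) {
        a = 0;      /* volume cannot go above +0 dB */
    }
    if (a > 254) {
        a = 254;    /* assumed largest attenuation value */
    }
    return (uint8_t)a;
}
```

With gain = -11, an attenuation of 20 becomes 31 (5.5 dB quieter). With gain = 2, an attenuation of 1 saturates at 0.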
There are several test modes in VS1053b, which allow the user to perform memory tests, SCI
bus tests, and several different sine wave tests.
All tests, except for the New Sine and Sweep Tests, are started in a similar way: do a hardware
reset to VS1053b, then set register SM_MODE bit SM_TESTS, and then send a test command
sequence to the SDI bus. Each test is started by sending a 4-byte special command sequence,
followed by 4 zeros. The sequences are described below.
Sine test is initialized with the 8-byte sequence 0x53 0xEF 0x6E n 0 0 0 0, where n defines the
sine test to use. n is defined as follows:
n bits
Name    Bits  Description
FsIdx   7:5   Samplerate index
S       4:0   Sine skip speed
FsIdx  Fs         FsIdx  Fs
0      44100 Hz   4      24000 Hz
1      48000 Hz   5      16000 Hz
2      32000 Hz   6      11025 Hz
3      22050 Hz   7      12000 Hz
The frequency of the sine to be output can now be calculated from F = Fs × S / 128.
Example: Sine test is activated with value 126, which is 0b01111110. Breaking n to its
components, FsIdx = 0b011 = 3 and thus Fs = 22050 Hz. S = 0b11110 = 30, and thus the final sine
frequency F = 22050 Hz × 30 / 128 ≈ 5168 Hz.
To exit the sine test, send the sequence 0x45 0x78 0x69 0x74 0 0 0 0.
Note: Sine test signals go through the digital volume control, so it is possible to test channels
separately.
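The sine test sequences and the frequency calculation can be sketched as below. The function names are hypothetical. The samplerate table holds the eight samplerate index values (44100, 48000, 32000, 22050, 24000, 16000, 11025, 12000 Hz for indices 0 to 7), and sending the sequences over SDI is left to the host.

```c
#include <stdint.h>

/* Samplerate for each samplerate index (bits 7:5 of n). */
static const unsigned sine_fs_hz[8] = {
    44100, 48000, 32000, 22050, 24000, 16000, 11025, 12000
};

/* Fill seq with the 8-byte sequence that starts sine test n. */
void sine_test_start_sequence(uint8_t n, uint8_t seq[8])
{
    const uint8_t head[3] = {0x53, 0xEF, 0x6E};
    for (int i = 0; i < 8; i++) seq[i] = 0;
    for (int i = 0; i < 3; i++) seq[i] = head[i];
    seq[3] = n;
}

/* Fill seq with the 8-byte sequence that exits the sine test. */
void sine_test_exit_sequence(uint8_t seq[8])
{
    const uint8_t head[4] = {0x45, 0x78, 0x69, 0x74};
    for (int i = 0; i < 8; i++) seq[i] = 0;
    for (int i = 0; i < 4; i++) seq[i] = head[i];
}

/* Output frequency in Hz for test value n: F = Fs * S / 128. */
double sine_test_frequency(uint8_t n)
{
    unsigned fs = sine_fs_hz[n >> 5]; /* bits 7:5 */
    unsigned s = n & 0x1Fu;           /* bits 4:0 */
    return (double)fs * (double)s / 128.0;
}
```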
A more frequency-accurate sine test can be started and controlled from SCI. SCI_AICTRL0 and
SCI_AICTRL1 set the sine frequencies for left and right channel, respectively. These registers,
volume (SCI_VOL), and samplerate (SCI_AUDATA) can be set before or during the test. Write
0x4020 to SCI_AIADDR to start the test.
SCI_AICTRLn can be calculated from the desired frequency and DAC samplerate by:
The maximum value for SCI_AICTRLn is 0x8000U. For the best S/N ratio for the generated
sine, three LSb’s of the SCI_AICTRLn should be zero. The resulting frequencies Fsin can be
calculated from the DAC samplerate Fs and SCI_AICTRL0 / SCI_AICTRL1 using the following
equation.
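The two equations are not reproduced in this text. The sketch below assumes the relation SCI_AICTRLn = Fsin × 65536 / Fs and its inverse Fsin = SCI_AICTRLn × Fs / 65536, which agree with the stated maximum of 0x8000 corresponding to Fs / 2. Treat this relation as an assumption to be checked against the original equations. The function names are hypothetical.

```c
#include <stdint.h>

/* Compute an SCI_AICTRLn value for the desired sine frequency, assuming
 * AICTRLn = Fsin * 65536 / Fs. The result is limited to 0x8000 and the
 * three least significant bits are cleared for the best S/N ratio. */
uint16_t sine_aictrl(uint32_t fsin_hz, uint32_t fs_hz)
{
    uint32_t v;
    if (fs_hz == 0) {
        return 0;
    }
    v = (uint32_t)(((uint64_t)fsin_hz * 65536u) / fs_hz);
    if (v > 0x8000u) {
        v = 0x8000u;
    }
    return (uint16_t)(v & ~7u);
}

/* Resulting frequency in Hz under the same assumed relation. */
double sine_actual_hz(uint16_t aictrl, uint32_t fs_hz)
{
    return (double)aictrl * (double)fs_hz / 65536.0;
}
```

Clearing the three LSbs means the resulting frequency can be slightly lower than requested, so sine_actual_hz is useful for reporting what will actually be generated.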
Both these tests use the normal audio path, thus also SCI_BASS, differential output mode, and
EarSpeaker settings have an effect.
Pin test is activated with the 8-byte sequence 0x50 0xED 0x6E 0x54 0 0 0 0. This test is meant
for chip production testing only.
Sci test is initialized with the 8-byte sequence 0x53 0x70 0xEE n 0 0 0 0, where n is the
register number to test. The content of the given register is read and copied to SCI_HDAT0. If
the register to be tested is HDAT0, the result is copied to SCI_HDAT1.
Memory test mode is initialized with the 8-byte sequence 0x4D 0xEA 0x6D 0x54 0 0 0 0. After
this sequence, wait for 1100000 clock cycles. The result can be read from the SCI register
SCI_HDAT0, and ’one’ bits are interpreted as follows:
11 VS1053b Registers
User software is required when a user wishes to add some own functionality like DSP effects
to VS1053b.
However, most users of VS1053b don’t need to worry about writing their own code, or about
this chapter, including those who only download software plug-ins from VLSI Solution’s Web
site.
Note: Also see VS1063 Hardware Guide for more information, because the hardware is com-
patible with VS1053.
VS_DSP is a 16/32-bit DSP processor core that also has extensive all-purpose processor
features. VLSI Solution’s free VSKIT Software Package contains all the tools and documentation
needed to write, simulate and debug Assembly Language or Extended ANSI C programs for the
VS_DSP processor core. VLSI Solution also offers a full Integrated Development Environment
VSIDE for full debug capabilities.
[Block diagram. Labels: registers DAC_FCTLL, DAC_FCTLH, DAC_VOL, DAC_LEFT, DAC_RIGHT; blocks
DAC SRC, Sigma-delta modulator, Analog driver (Left, Right), CBUF, I2S, Resampler SRC,
Sidestream SDM.]
Figure 18: VS1053b ADC and DAC data paths with some data registers
The main audio path starts from the DAC register (Chapter 11.8) to the high-fidelity, fully digital
DAC SRC (Digital-to-Analog Converter SampleRate Converter), which low-pass filters and in-
terpolates the data to the high samplerate of XTALI/2 (nominally 6.144 MHz). This 18-bit data is
then fed to the volume control. It then passes through the sigma-delta modulator to the analog
driver and analog Left and Right signals.
The user may resample and record the data with the Resampler SampleRate Converter (Chap-
ter 11.16). Because there is no automatic low-pass filtering, it is the user’s responsibility to avoid
aliasing distortion.
The user may add a PCM sidestream with the Sidestream Sigma-Delta Modulator input (Chap-
ter 11.17). As is the case with the Resampler SampleRate Converter, hardware doesn’t offer
low-pass filtering, so sufficient aliasing image rejection is the responsibility of the user.
[Block diagram. Labels: inputs MICN, MICP, LINE1, LINE2; blocks Microphone amplifier,
Multiplexer, ADC, ADC decimator; registers ADC_DATA_LEFT, ADC_DATA_RIGHT.]
Figure 19: VS1053b ADC and DAC data paths with some data registers
Analog audio may be fed to up to two channels: one as a differential signal to MICN/MICP or as
a one-sided signal to Line1, and the other as a one-sided signal to Line2.
If microphone input for the left channel has been selected, audio is fed through a microphone
amplifier and that signal is selected by a multiplexer.
Audio is then downsampled to one of four allowed samplerates: XTALI/64, XTALI/128, XTALI/256
or XTALI/512. With the nominal 12.288 MHz crystal, these correspond to 192, 96, 48 or 24 kHz
samplerates, respectively (Chapter 11.17).
If the “3 MHz” option bit SS_AD_CLOCK in register SCI_STATUS has been set to 1, then
samplerates are divided by two, so the nominal samplerates become 96, 48, 24 and 12 kHz.
SCI registers described in Chapter 9.6 can be found here between 0xC000..0xC00F. In addition
to these registers, there is one in address 0xC010, called SCI_CHANGE.
SCI_CHANGE bits
Name Bits Description
SCI_CH_WRITE 4 1 if last access was a write cycle
SCI_CH_ADDR 3:0 SCI address of last access
SCI_CHANGE contains the last SCI register that has been accessed through the SCI bus, as
well as whether the access was a read or write operation.
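Decoding SCI_CHANGE is a matter of two bit fields, as in this sketch (function names are hypothetical):

```c
#include <stdint.h>

/* Returns 1 if the last SCI access was a write cycle (bit 4), else 0. */
int sci_change_was_write(uint16_t sci_change)
{
    return (int)((sci_change >> 4) & 1u);
}

/* Returns the SCI address of the last access (bits 3:0). */
unsigned sci_change_addr(uint16_t sci_change)
{
    return sci_change & 0x0Fu;
}
```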
Whenever two bytes have been written to the SDI bus, an interrupt is generated and the data
can be read as a 16-bit big-endian value from the SDI registers. The user can control the DREQ
pin as if it was a general-purpose output through its own register bit.
The internal 20-bit register DAC_FCTL is calculated from DAC_FCTLH and DAC_FCTLL reg-
isters as follows: DAC_FCTL = (DAC_FCTLH & 15) × 65536 + DAC_FCTLL. Highest supported
value for DAC_FCTL is 0x80000.
If we define C = DAC_FCTL and X = XTALI in Hz, then the resulting samplerate fs of the
associated DAC SampleRate Converter is fs = C × X × 2^-27.
Example:
If C = 0x80000 and X = 12.288 MHz then fs = 524288 × (12.288 × 10^6) × 2^-27 = 48000 (Hz).
Note: FCTLH bits 13:4 are used for the PLL Controller. See Chapter 11.9 for details.
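The DAC_FCTL relations can be sketched with integer arithmetic as follows. The function names are hypothetical, and 64-bit intermediates are used to avoid overflow.

```c
#include <stdint.h>

/* Combine the register pair into the internal 20-bit DAC_FCTL value. */
uint32_t dac_fctl(uint16_t fctlh, uint16_t fctll)
{
    return ((uint32_t)(fctlh & 15u) << 16) | fctll;
}

/* Samplerate in Hz: fs = C * X * 2^-27. */
uint32_t dac_samplerate_hz(uint32_t c, uint32_t xtali_hz)
{
    return (uint32_t)(((uint64_t)c * xtali_hz) >> 27);
}

/* Inverse: DAC_FCTL value for a desired samplerate, limited to the
 * highest supported value 0x80000. The result is rounded down. */
uint32_t dac_fctl_for_rate(uint32_t fs_hz, uint32_t xtali_hz)
{
    uint64_t c;
    if (xtali_hz == 0) {
        return 0;
    }
    c = ((uint64_t)fs_hz << 27) / xtali_hz;
    if (c > 0x80000u) {
        c = 0x80000u;
    }
    return (uint32_t)c;
}
```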
DAC_VOL bits
Name          Bits   Description
LEFT_FINE     15:12  Left channel gain +0.0 dB...+5.5 dB (0 to 11)
LEFT_COARSE   11:8   Left channel attenuation in -6 dB steps
RIGHT_FINE    7:4    Right channel gain +0.0 dB...+5.5 dB (0 to 11)
RIGHT_COARSE  3:0    Right channel attenuation in -6 dB steps
Normally DAC_VOL is handled by the firmware. DAC_VOL depends on SCI_VOL and the bass
and treble settings in SCI_BASS (and optionally SS_SWING bits in SCI_STATUS).
The Phase-Locked Loop (PLL) controller is used to generate clock frequencies that are higher
than the incoming (crystal-based) clock frequency. The PLL output is used by the CPU core
and some peripherals.
• VCO Enable/Disable
• Select VCO or input clock to be output clock
• Route VCO frequency to output pin
At the core of the PLL controller is the VCO, a high frequency oscillator, whose oscillation
frequency is adjusted to be an integer multiple of some input frequency. As the name “Phase-
Locked Loop” suggests, this is done by comparing the phase of the input frequency against the
phase of a signal which is derived from the VCO output through frequency division.
If the system is stable, i.e. the comparison phase difference remains virtually zero, the PLL is
said to be “in lock”. This means that the output frequency of the VCO is stable and reliable.
The PLL is preceded by a division-by-two unit. Thus, with a nominal XTALI = 12.288 MHz, the
internal clock frequency CLKI can be adjusted with an accuracy of XTALI/2 = 6.144 MHz.
PLL control lies in DAC_FCTL bits 13:4. To see what bits 3:0 do, see Chapter 11.8.
The PLL locked status can be checked by generating a high-active pulse (writing first “1” , then
“0”) to FCH_PLL_SET_LOCK and reading FCH_PLL_LOCKED. FCH_PLL_LOCKED is set to
“1” along with the high level of FCH_PLL_SET_LOCK and to “0” whenever the PLL falls out of
lock. So if the “1” remains in FCH_PLL_LOCKED, PLL is in sync.
The PLL controller’s operation is optimized for frequencies around 12. . . 13 MHz. If you use a
24. . . 26 MHz input clock, set the extra clock divider bit SM_CLK_RANGE in register SCI_MODE
to 1 before activating the PLL.
It’s recommended to change the PLL rate in small steps and wait for the PLL to stabilize after
each change. For diagnostic purposes, the PLL clock output (VCO) can be routed to an I/O pin
so it can be scanned with an oscilloscope.
FCH_PLL_RATE (bits 7:4) control PLL multiplication rate. PLL multiplier is (FCH_PLL_RATE
+ 1). When FCH_PLL_RATE is 0, the VCO is powered down and output clock is forced to be
input clock (same as if FCH_PLL_FORCE_PLL = 0).
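The sketch below illustrates the FCH_PLL_RATE field. It assumes that the multiplier (FCH_PLL_RATE + 1) applies to the divided-by-two input, so that the output clock is XTALI / 2 × (FCH_PLL_RATE + 1). That reading follows from the division-by-two unit described above but is an assumption. The function names are hypothetical.

```c
#include <stdint.h>

/* Output clock in Hz for a given FCH_PLL_RATE (0..15).
 * Rate 0 powers down the VCO, so the output is the input clock.
 * Assumption: the multiplier applies to XTALI / 2. */
uint32_t pll_output_hz(uint32_t xtali_hz, unsigned fch_pll_rate)
{
    fch_pll_rate &= 0xFu;
    if (fch_pll_rate == 0) {
        return xtali_hz;
    }
    return (xtali_hz / 2u) * (fch_pll_rate + 1u);
}

/* Return the DAC_FCTLH value with FCH_PLL_RATE (bits 7:4) replaced.
 * All other bits are preserved. */
uint16_t fctlh_set_pll_rate(uint16_t fctlh, unsigned fch_pll_rate)
{
    uint16_t v = (uint16_t)(fctlh & 0xFF0Fu);
    v |= (uint16_t)((fch_pll_rate & 0xFu) << 4);
    return v;
}
```

Under this assumption, a rate of 7 with the nominal 12.288 MHz crystal gives 49.152 MHz, which corresponds to a 4.0× clock.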
11.10 GPIO
GPIO_DIR is used to set the direction of the GPIO pins. 1 means output. GPIO_ODATA
remembers its values even if a GPIO_DIR bit is set to input.
GPIO_IDATA is used to read the pin states. In VS1053 also the SDI and SCI input pins
can be read through GPIO_IDATA: SCLK = GPIO_IDATA[8], XCS = GPIO_IDATA[9], SI =
GPIO_IDATA[10], and XDCS = GPIO_IDATA[11].
In addition to data direction control for GPIO pins 0 to 7, there are two additional control bits in
GPIO_DIR. GPIO_DIR[8] switches SO into software control, so that the value in GPIO_ODATA[8]
is shown on SO whenever xCS is low. When GPIO_DIR[9] is set to ’1’ SCI and SDI are dis-
abled. When using SCLK, xCS, SI and xDCS as general-purpose input, GPIO_DIR[9] prevents
transitions in them from getting random data from SDI and/or SCI.
Note that in VS1053b the VSDSP registers can be read and written through the SCI_WRAMADDR
and SCI_WRAM registers. You can thus use the GPIO pins quite conveniently.
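Once GPIO_IDATA has been read (for example through SCI_WRAMADDR and SCI_WRAM), its bits can be decoded as in this sketch. The macro and function names are hypothetical; the bit positions are those given above.

```c
#include <stdint.h>

/* Bit positions of the SCI/SDI input pins in GPIO_IDATA. */
#define GPIO_IDATA_SCLK_BIT  8
#define GPIO_IDATA_XCS_BIT   9
#define GPIO_IDATA_SI_BIT    10
#define GPIO_IDATA_XDCS_BIT  11

/* Returns the state (0 or 1) of one bit of a GPIO_IDATA value.
 * Bits 0..7 are GPIO0..GPIO7. */
int gpio_idata_bit(uint16_t idata, unsigned bit)
{
    return (int)((idata >> bit) & 1u);
}
```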
INT_ENABLE bits
Name Bits Description
INT_EN_SDM 9 Enable Sigma Delta Modulator interrupt
INT_EN_SRC 8 Enable SampleRate Converter interrupt
INT_EN_TIM1 7 Enable Timer 1 interrupt
INT_EN_TIM0 6 Enable Timer 0 interrupt
INT_EN_RX 5 Enable UART RX interrupt
INT_EN_TX 4 Enable UART TX interrupt
INT_EN_ADC 3 Enable AD modulator interrupt
INT_EN_SDI 2 Enable Data interrupt
INT_EN_SCI 1 Enable SCI interrupt
INT_EN_DAC 0 Enable DAC interrupt
Note: It may take up to 6 clock cycles before changing INT_ENABLE has any effect.
Writing any value to INT_GLOB_DIS adds one to the interrupt counter INT_COUNTER and
effectively disables all interrupts. It may take up to 6 clock cycles before writing to this register
has any effect.
Writing any value to INT_GLOB_ENA subtracts one from the interrupt counter INT_COUNTER,
unless it already was 0, in which case nothing happens. If, after the operation INT_COUNTER
becomes zero, interrupts selected with INT_ENABLE are restored. An interrupt routine should
always write to this register as the last thing it does, because interrupts automatically add one
to the interrupt counter, but subtracting it back to its initial value is the responsibility of the user.
It may take up to 6 clock cycles before writing this register has any effect.
By reading INT_COUNTER the user may check if the interrupt counter is correct or not. If the
register is not 0, interrupts are disabled.
The RS232 UART implements a serial interface using RS232 standard 8N1 (8 data bits, no
parity, 1 stop bit).
Start bit | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | Stop bit
When the line is idling, it stays in logic high state. When a byte is transmitted, the transmission
begins with a start bit (logic zero) and continues with data bits (LSB first) and ends up with a
stop bit (logic high). 10 bits are sent for each 8-bit byte frame.
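The 8N1 frame can be modelled as a 10-bit pattern, as in the sketch below. The function names and the representation (bit 0 of the result is the first bit on the line) are illustrative choices, not part of the hardware interface.

```c
#include <stdint.h>

/* Build the 10-bit 8N1 frame for one byte.
 * Bit 0 = start bit (0), bits 1..8 = D0..D7 (LSB first), bit 9 = stop bit (1).
 * Bit 0 is transmitted first. */
uint16_t uart_frame(uint8_t byte)
{
    return (uint16_t)((1u << 9) | ((unsigned)byte << 1));
}

/* Maximum payload throughput: 10 bits on the line per 8-bit byte. */
uint32_t uart_bytes_per_second(uint32_t bps)
{
    return bps / 10u;
}
```

At the MIDI data rate of 31250 bps this gives 3125 bytes per second.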
UART registers
Reg Type Reset Abbrev[bits] Description
0xC028 r 0 UART_STATUS[4:0] Status
0xC029 r/w 0 UART_DATA[7:0] Data
0xC02A r/w 0 UART_DATAH[15:8] Data High
0xC02B r/w 0 UART_DIV Divider
A read from the status register returns the transmitter and receiver states.
UART_STATUS bits
Name Bits Description
UART_ST_FRAMEERR 4 Framing error (stop bit was 0)
UART_ST_RXORUN 3 Receiver overrun
UART_ST_RXFULL 2 Receiver data register full
UART_ST_TXFULL 1 Transmitter data register full
UART_ST_TXRUNNING 0 Transmitter running
UART_ST_RXORUN is set if a received byte overwrites unread data when it is transferred from
the receiver shift register to the data register, otherwise it is cleared.
UART_ST_TXFULL is set if a write to the data register is not allowed (data register full).
A read from UART_DATA returns the received byte in bits 7:0, bits 15:8 are returned as ’0’. If
there is no more data to be read, the receiver data register full indicator will be cleared.
A receive interrupt will be generated when a byte is moved from the receiver shift register to
the receiver data register.
A write to UART_DATA sets a byte for transmission. The data is taken from bits 7:0, other
bits in the written value are ignored. If the transmitter is idle, the byte is immediately moved
to the transmitter shift register, a transmit interrupt request is generated, and transmission is
started. If the transmitter is busy, the UART_ST_TXFULL will be set and the byte remains in the
transmitter data register until the previous byte has been sent and transmission can proceed.
UART_DIV Bits
Name Bits Description
UART_DIV_D1 15:8 Divider 1 (0..255)
UART_DIV_D2 7:0 Divider 2 (6..255)
The divider is set to 0x0000 in reset. The ROM boot code must initialize it correctly depending
on the master clock frequency to get the correct bit speed. The second divider (D2 ) must be
from 6 to 255.
The communication speed f = fm / ((D1 + 1) × D2), where fm is the master clock frequency, and f is
the TX/RX speed in bps.
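The divider relation f = fm / ((D1 + 1) × D2) can be sketched in C as follows. The function names are hypothetical, and the brute-force search is only one possible way to choose the dividers.

```c
#include <stdint.h>

/* Bit speed for a UART_DIV value (D1 in bits 15:8, D2 in bits 7:0).
 * Returns 0 if D2 is outside the allowed range 6..255. */
uint32_t uart_bps(uint32_t fm_hz, uint16_t uart_div)
{
    uint32_t d1 = (uart_div >> 8) & 0xFFu;
    uint32_t d2 = uart_div & 0xFFu;
    if (d2 < 6u) {
        return 0;
    }
    return fm_hz / ((d1 + 1u) * d2);
}

/* Search all valid divider pairs for the one whose (truncated) bit speed
 * is closest to the target. Returns the packed UART_DIV value. */
uint16_t uart_find_div(uint32_t fm_hz, uint32_t target_bps)
{
    uint16_t best = 0x0006; /* D1 = 0, D2 = 6 */
    uint32_t best_err = UINT32_MAX;
    for (uint32_t d1 = 0; d1 <= 255u; d1++) {
        for (uint32_t d2 = 6u; d2 <= 255u; d2++) {
            uint32_t bps = fm_hz / ((d1 + 1u) * d2);
            uint32_t err = (bps > target_bps) ? bps - target_bps
                                              : target_bps - bps;
            if (err < best_err) {
                best_err = err;
                best = (uint16_t)((d1 << 8) | d2);
            }
        }
    }
    return best;
}
```

The master clock depends on the current clock multiplier, so the divider must be recalculated whenever the clock changes.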
Transmitter operates as follows: After an 8-bit word is written to the transmit data register, it
is transmitted immediately if the transmitter is not busy transmitting the previous byte. When
the transmission begins, a TX_INTR interrupt is sent. Status bit [1] indicates whether the
transmitter data register is full, and bit [0] indicates whether the transmitter (shift register)
is running. A new word must not be written to the transmitter data register while it is full
(bit [1] = ’1’). The transmitter data register becomes empty as soon as its content is moved to
the transmitter shift register and the transmission begins. It is safe to write a new word to the
transmitter data register every time a transmit interrupt is generated.
Receiver operates as follows: It samples the RX signal line, and if it detects a high-to-low
transition, a start bit is found. After this it samples each of the 8 data bits at the middle of
the bit time (using a constant timer), and fills the receiver (shift register) LSB first. Finally
the data in the receiver is moved to the receive data register, the stop bit state is checked
(logic high = ok, logic low = framing error) for status bit [4], the RX_INTR interrupt is sent,
the old state of status bit [2] is copied to bit [3] (receive data overrun), and status bit [2]
(receive data register full) is set. After that the receiver returns to idle state to wait for a
new start bit. Status bit [2] is zeroed when the receiver data register is read.
RS232 communication speed is set using two clock dividers. The base clock is the processor
master clock. Bits 15-8 in these registers are for first divider and bits 7-0 for second divider. RX
sample frequency is the clock frequency that is input for the second divider.
11.13 Timers
There are two 32-bit timers that can be initialized and enabled independently of each other. If
enabled, a timer initializes to its start value, written by a processor, and starts decrementing
every clock cycle. When the value goes past zero, an interrupt is sent, and the timer initializes
to the value in its start value register, and continues downcounting. A timer stays in that loop
as long as it is enabled.
A timer has a 32-bit timer register for down counting and a 32-bit TIMER1_LH register for
holding the timer start value written by the processor. Timers have also a 2-bit TIMER_ENA
register. Each timer is enabled (1) or disabled (0) by a corresponding bit of the enable register.
TIMER_CONFIG Bits
Name Bits Description
TIMER_CF_CLKDIV 7:0 Master clock divider
TIMER_CF_CLKDIV is the master clock divider for all timer clocks. The generated internal
clock frequency fi = fm / (c + 1), where fm is the master clock frequency and c is TIMER_CF_CLKDIV.
Example: With a 12 MHz master clock, TIMER_CF_CLKDIV = 3 divides the master clock by 4, and
the output/sampling clock would thus be fi = 12 MHz / (3 + 1) = 3 MHz.
TIMER_ENABLE Bits
Name Bits Description
TIMER_EN_T1 1 Enable timer 1
TIMER_EN_T0 0 Enable timer 0
The 32-bit start value TIMER_Tx[L/H] sets the initial counter value when the timer is reset. The
timer interrupt frequency ft = fi / (c + 1), where fi is the divided master clock obtained with
the clock divider (see Chapter 11.13.2) and c is TIMER_Tx[L/H].
Example: With a 12 MHz master clock and with TIMER_CF_CLKDIV = 3, the divided clock fi =
3 MHz. If TIMER_TH = 0 and TIMER_TL = 99, then the timer interrupt frequency ft = 3 MHz / (99 + 1) =
30 kHz.
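The two timer relations, fi = fm / (TIMER_CF_CLKDIV + 1) and ft = fi / (start value + 1), can be sketched as follows. The function names are hypothetical.

```c
#include <stdint.h>

/* Divided timer clock: fi = fm / (TIMER_CF_CLKDIV + 1). */
uint32_t timer_clock_hz(uint32_t fm_hz, uint8_t clkdiv)
{
    return fm_hz / ((uint32_t)clkdiv + 1u);
}

/* Timer interrupt frequency: ft = fi / (start + 1). */
uint32_t timer_interrupt_hz(uint32_t fi_hz, uint32_t start)
{
    return (uint32_t)(fi_hz / ((uint64_t)start + 1u));
}

/* Start value for a desired interrupt frequency (rounded down).
 * Returns 0 if the request is zero or faster than the timer clock. */
uint32_t timer_start_for(uint32_t fi_hz, uint32_t ft_hz)
{
    if (ft_hz == 0 || ft_hz > fi_hz) {
        return 0;
    }
    return fi_hz / ft_hz - 1u;
}
```

The 32-bit start value returned by timer_start_for would then be split into its high and low 16-bit halves for the TIMER_Tx[L/H] register pair.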
TIMER_TxCNT[L/H] contains the current counter values. By reading this register pair, the user
may get knowledge of how long it will take before the next timer interrupt. Also, by writing to
this register, a one-shot different length timer interrupt delay may be realized.
Each timer has its own interrupt, which is asserted when the timer counter underflows.
The 16-bit I2S Interface makes it possible to attach an external DAC to the system.
Note: The samplerate of the audio file and the I2S rate are independent. All audio will be
automatically converted to 6.144 MHz for VS1053 DAC and to the configured I2S rate using a
high-quality sample-rate converter.
Note: In VS1053b the I2S pins share different GPIO pins than in VS1033 to be able to use SPI
boot and I2S in the same application.
I2S_CONFIG Bits
Name Bits Description
I2S_CF_MCLK_ENA 3 Enables the MCLK output (12.288 MHz)
I2S_CF_ENA 2 Enables I2S, otherwise pins are GPIO
I2S_CF_SRATE 1:0 I2S rate, "10" = 192, "01" = 96, "00" = 48 kHz
I2S_CF_ENA enables the I2S interface. After reset I2S is disabled and the pins are used for
GPIO inputs.
I2S_CF_MCLK_ENA enables the MCLK output. The frequency is either directly the input clock
(nominal 12.288 MHz), or half the input clock when mode register bit SM_CLK_RANGE is set
to 1 (24-26 MHz input clock).
I2S_CF_SRATE controls the output samplerate. When set to 48 kHz, SCLK is MCLK divided
by 8, when 96 kHz SCLK is MCLK divided by 4, and when 192 kHz SCLK is MCLK divided by
2. I2S_CF_SRATE can only be changed when I2S_CF_ENA is 0.
[I2S timing diagram showing the MCLK, SCLK, and LROUT signals]
To enable I2S first write 0xc017 to SCI_WRAMADDR and 0x00f0 to SCI_WRAM, then write
0xc040 to SCI_WRAMADDR and 0x000c to SCI_WRAM.
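The value 0x000c written above is I2S_CF_MCLK_ENA and I2S_CF_ENA with the 48 kHz rate. A sketch of composing other I2S_CONFIG values follows; the function and enum names are hypothetical, and the bit positions come from the I2S_CONFIG table.

```c
#include <stdint.h>

#define I2S_CF_MCLK_ENA  (1u << 3)
#define I2S_CF_ENA       (1u << 2)

/* I2S_CF_SRATE encodings (bits 1:0). */
enum { I2S_RATE_48K = 0, I2S_RATE_96K = 1, I2S_RATE_192K = 2 };

/* Compose an I2S_CONFIG value. */
uint16_t i2s_config(int enable, int mclk_enable, unsigned srate)
{
    uint16_t v = (uint16_t)(srate & 3u);
    if (enable) {
        v |= I2S_CF_ENA;
    }
    if (mclk_enable) {
        v |= I2S_CF_MCLK_ENA;
    }
    return v;
}

/* Ratio MCLK / SCLK for each rate: 8 at 48 kHz, 4 at 96 kHz, 2 at 192 kHz.
 * Returns 0 for the unused encoding 3. */
unsigned i2s_sclk_divider(unsigned srate)
{
    switch (srate & 3u) {
    case I2S_RATE_48K:  return 8;
    case I2S_RATE_96K:  return 4;
    case I2S_RATE_192K: return 2;
    default:            return 0;
    }
}
```

Because I2S_CF_SRATE can only be changed when I2S_CF_ENA is 0, a rate change would first write the new rate with the enable bit clear and then write it again with the enable bit set.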
ADC_CONTROL Bits
Name Bits Description
ADC_MODU2_PD 4 Right channel powerdown
ADC_MODU1_PD 3 Left channel powerdown
ADC_DECIM_FACTOR 2:1 ADC Decimator factor:
- 3 = downsample to XTALI/512 (nominal 24 kHz)
- 2 = downsample to XTALI/256 (nominal 48 kHz)
- 1 = downsample to XTALI/128 (nominal 96 kHz)
- 0 = downsample to XTALI/64 (nominal 192 kHz)
ADC_ENABLE 0 Set to activate ADC converter and decimator
Note: Setting bit SS_AD_CLOCK in register SCI_STATUS will halve the operation speed of the
A/D unit, and thus halve the resulting samplerate.
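The ADC samplerate and the ADC_CONTROL value can be derived as in this sketch. The function names are hypothetical; the bit positions and divisors are taken from the table above.

```c
#include <stdint.h>

/* ADC samplerate in Hz: XTALI / 64, / 128, / 256 or / 512 for
 * ADC_DECIM_FACTOR 0..3, halved when SS_AD_CLOCK is set. */
uint32_t adc_samplerate_hz(uint32_t xtali_hz, unsigned decim_factor, int ss_ad_clock)
{
    uint32_t fs = xtali_hz / (64u << (decim_factor & 3u));
    return ss_ad_clock ? fs / 2u : fs;
}

/* Compose an ADC_CONTROL value from its fields. */
uint16_t adc_control(int right_powerdown, int left_powerdown,
                     unsigned decim_factor, int enable)
{
    uint16_t v = (uint16_t)((decim_factor & 3u) << 1); /* bits 2:1 */
    if (right_powerdown) v |= (uint16_t)(1u << 4);     /* ADC_MODU2_PD */
    if (left_powerdown)  v |= (uint16_t)(1u << 3);     /* ADC_MODU1_PD */
    if (enable)          v |= (uint16_t)(1u << 0);     /* ADC_ENABLE */
    return v;
}
```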
Each time a new (stereo) sample has been generated, an ADC interrupt is generated.
The resampler SRC makes it possible to catch audio from the DAC path.
Note: hardware makes no attempts at low-pass filtering data. If the SRC samplerate is lower
than the DAC samplerate, aliasing may and will occur.
SRC_CONTROL Bits
Name Bits Description
SRC_ENABLE 12 Set to enable SRC
SRC_DIV 11:0 Set samplerate to XTALI/2/(SRC_DIV+1)
Each time a new (stereo) sample has been generated, an SRC interrupt is generated.
The Sidestream Sigma-Delta Modulator makes it possible to insert a digital side stream on top
of existing audio.
Note: The SDM provides a direct, low-delay side channel to the Sigma-Delta DACs of VS10xx.
It makes no attempts at low-pass filtering data. Thus there will be practically no image rejection.
If using low samplerates, this may cause audible aliasing distortion.
SDM_CONTROL Bits
Name Bits Description
SDM_ENABLE 12 Set to enable SDM
SDM_DIV 11:0 Set samplerate to XTALI/2/(SDM_DIV+1)
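SRC_CONTROL and SDM_CONTROL share the same layout (enable in bit 12, divider in bits 11:0), so one sketch covers both. The function names are hypothetical.

```c
#include <stdint.h>

/* Samplerate in Hz for a divider value: XTALI / 2 / (DIV + 1). */
uint32_t rate_from_div(uint32_t xtali_hz, unsigned div)
{
    return (xtali_hz / 2u) / ((div & 0xFFFu) + 1u);
}

/* Compose a control value with the enable bit set and the divider that
 * gives the requested samplerate (quotient rounded down, clamped to 12 bits).
 * Returns 0 (disabled) if fs_hz is 0. */
uint16_t rate_control(uint32_t xtali_hz, uint32_t fs_hz)
{
    uint32_t div;
    if (fs_hz == 0) {
        return 0;
    }
    div = (xtali_hz / 2u) / fs_hz;
    if (div == 0) {
        div = 1;
    }
    div -= 1u;
    if (div > 0xFFFu) {
        div = 0xFFFu;
    }
    return (uint16_t)((1u << 12) | div);
}
```

With the nominal 12.288 MHz crystal, a 48 kHz samplerate needs a divider of 127, giving the control value 0x107F.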
12 Version Changes
This chapter describes the latest and most important changes made to VS1053b.
• I2S pins are now in GPIO4-GPIO7 and do not overlap with SPI boot pins.
• No software reset required between files when used correctly.
• Ogg Vorbis decoding added. Non-fatal ogg or vorbis decode errors cause automatic
resync. This allows easy rewind and fast forward. Decoding ends if the "last frame" flag
is reached or SM_CANCEL is set.
• HE-AAC v2 Level 3 decoding added. It is possible to disable PS and SBR processing and
control the upsampling modes through parametric_x.control1.
• Like the WMA decoder, the AAC decoder uses the clock adder (see SCI_CLOCKF) if it
needs more clock to decode the file. HE-AAC features are dropped one by one, if the file
can not be decoded correctly even with the highest allowed clock. Parametric stereo is
the first feature to be dropped, then downsampled mode is used, and as the final resort
Spectral Band Replication is disabled. Features are automatically restored for the next
file.
• Completely new volume control with zero-cross detection prevents pops when volume is
changed.
• Audio FIFO underrun detection (with slow fade to zero) instead of looping the audio buffer
content.
• Average bitrate calculation (byteRate) for all codecs.
• All codecs support fast play mode with selectable speeds for the best-quality fast forward
operation. Fast play also advances DECODE_TIME faster.
• WMA and Ogg Vorbis provide an absolute decode position in milliseconds.
• When SM_CANCEL is detected, the firmware also discards the stream buffer contents.
• Bit SCIST_DO_NOT_JUMP in SCI_STATUS is ’1’ when jumps in the file should not be
done: during header processing and with Midi files.
• IMA ADPCM encode now supports stereo encoding and selectable samplerate.
• Delayed volume and bass/treble control calculation reduces the time the corresponding
SCI operations take. This delayed handling and the new volume control hardware pre-
vents audio samples from being missed during volume change.
• SCI_DECODE_TIME only cleared at hardware and software reset to allow files to be
played back-to-back or looped.
This chapter describes the latest and most important changes to this document.
• Added mention of RIFF 8-bit and 16-bit data signedness to Chapter 10.6, Feeding PCM
Data.
• Updated telephone number in Chapter 14, Contact Information.
• Clarified how Ogg Vorbis set Replay Gain parameters in Chapter 10.11.5, Ogg Vorbis.
• Explained that Bass Enhancer handles the bass part of the audio signal in mono in Chap-
ter 9.6.3, SCI_BASS.
• Added GPIO_DDR[9:8] and GPIO_ODATA[8] explanation to Chapter 11.10, GPIO.
• Moved value from tV(min) to tV(max) in Chapter 7.4.4, SPI Timing Diagram.
• Chapter 10.8, PCM / ADPCM Recording, has been updated to better explain how to
record using the VS1053b Patches package.
• Clarified byte endianness when reading PCM or ADPCM audio data in Chapter 10.8.3,
Adding a PCM RIFF Header, and Chapter 10.8.4, Adding an IMA ADPCM RIFF Header.
14 Contact Information
VLSI Solution Oy
Entrance G, 2nd floor
Hermiankatu 8
FI-33720 Tampere
FINLAND
URL: http://www.vlsi.fi/
Phone: +358-50-462-3200
Commercial e-mail: sales@vlsi.fi