<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-codec-opus-update-10"
     ipr="trust200902" updates="6716">
  <front>
    <title abbrev="Opus Update">Updates to the Opus Audio Codec</title>

<author initials="JM" surname="Valin" fullname="Jean-Marc Valin">
<organization>Mozilla Corporation</organization>
<address>
<postal>
<street>331 E. Evelyn Avenue</street>
<city>Mountain View</city>
<region>CA</region>
<code>94041</code>
<country>USA</country>
</postal>
<phone>+1 650 903-0800</phone>
<email>[email protected]</email>
</address>
</author>

<author initials="K." surname="Vos" fullname="Koen Vos">
<organization>vocTone</organization>
<address>
<postal>
<street></street>
<city></city>
<region></region>
<code></code>
<country></country>
</postal>
<phone></phone>
<email>[email protected]</email>
</address>
</author>

    <date day="24" month="August" year="2017" />

    <abstract>
      <t>This document addresses minor issues that were found in the specification
      of the Opus audio codec in RFC 6716. It updates the normative decoder implementation
      included in the appendix of RFC 6716. The changes fix real and potential security-related
      issues, as well as minor quality-related issues.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>This document addresses minor issues that were discovered in the reference
      implementation of the Opus codec. Unlike most IETF specifications, Opus is defined
      in <xref target="RFC6716">RFC 6716</xref> in terms of a normative reference
      decoder implementation rather than by the associated text description.
      That RFC includes the reference decoder implementation as Appendix A.
      For this reason, only issues affecting the decoder are
      listed here. An up-to-date implementation of the Opus encoder can be found at
      <eref target="https://opus-codec.org/"/>.</t>
    <t>
      Some of the changes in this document update normative behavior in a way that requires
      new test vectors. The English text of the specification is unaffected; only
      the C implementation changes. The updated specification remains fully compatible with
      the original specification.
    </t>

    <t>
    Note: due to RFC formatting conventions, lines exceeding the column width
    in the patch are split using a backslash character. The backslashes
    at the end of a line and the white space at the beginning
    of the following line are not part of the patch. A properly formatted patch
    including all changes is available at
    <eref target="https://www.ietf.org/proceedings/98/slides/materials-98-codec-opus-update-00.patch"/>
    and has a SHA-1 hash of 029e3aa88fc342c91e67a21e7bfbc9458661cd5f.
    </t>

    </section>

    <section title="Terminology">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </section>

    <section title="Stereo State Reset in SILK">
      <t>The reference implementation does not reinitialize the stereo state
      during a mode switch. The old stereo memory can produce a brief impulse
      (i.e., a single sample) in the decoded audio. This can be fixed by changing
      silk/dec_API.c at line 72:
    </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
     for( n = 0; n < DECODER_NUM_CHANNELS; n++ ) {
         ret  = silk_init_decoder( &channel_state[ n ] );
     }
+    silk_memset(&((silk_decoder *)decState)->sStereo, 0,
+                sizeof(((silk_decoder *)decState)->sStereo));
+    /* Not strictly needed, but it's cleaner that way */
+    ((silk_decoder *)decState)->prev_decode_only_middle = 0;

     return ret;
 }
<CODE ENDS>
]]></artwork>
</figure>
     <t>
     This change affects the normative output of the decoder, but the
     amount of change is within the tolerance and is too small to make the test vector check fail.
      </t>
    </section>

    <section anchor="padding" title="Parsing of the Opus Packet Padding">
      <t>It was discovered that some invalid packets of very large size could trigger
      an out-of-bounds read in the Opus packet parsing code responsible for padding.
      This is due to an integer overflow if the signaled padding exceeds 2^31-1 bytes
      (the actual packet may be smaller). The code can be fixed by decrementing the
      (signed) len value, instead of incrementing a separate padding counter.
      This is done by applying the following changes at line 596 of src/opus_decoder.c:
    </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
       /* Padding flag is bit 6 */
       if (ch&0x40)
       {
-         int padding=0;
          int p;
          do {
             if (len<=0)
                return OPUS_INVALID_PACKET;
             p = *data++;
             len--;
-            padding += p==255 ? 254: p;
+            len -= p==255 ? 254: p;
          } while (p==255);
-         len -= padding;
       }
<CODE ENDS>
]]></artwork>
</figure>
      <t>This packet parsing issue is limited to reading memory up
         to about 60 kB beyond the compressed buffer. This can only be triggered
         by a compressed packet more than about 16 MB long, so it is not a problem
         for RTP. In theory, it could crash a file
         decoder (e.g., Opus in Ogg) if the memory just after the incoming packet
         is out-of-range, but our attempts to trigger such a crash in a production
         application built using an affected version of the Opus decoder failed.</t>
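      <t>As a non-normative illustration of why decrementing len directly is safe,
         the padding loop can be isolated in a minimal C sketch. The function name
         parse_padding and its -1 return convention are hypothetical and are not part
         of the reference decoder; the final negative-length check is an assumption
         about what the caller does after the loop.</t>
<figure>
<artwork><![CDATA[
```c
/* Hypothetical helper for illustration only; not part of the
   reference decoder. Returns the number of payload bytes left
   once the padding length bytes and the padding itself have been
   subtracted, or -1 if the packet is invalid. */
int parse_padding(const unsigned char *data, int len)
{
    int p;
    do {
        if (len <= 0)
            return -1;
        p = *data++;
        len--;
        /* A length byte of 255 means 254 bytes of padding plus
           another length byte. len drops by at most 255 per pass
           and the next pass rejects len <= 0, so the value
           cannot wrap around. */
        len -= (p == 255) ? 254 : p;
    } while (p == 255);
    /* Assumed caller-side check: padding larger than the packet. */
    return (len < 0) ? -1 : len;
}
```
]]></artwork>
</figure>
      <t>In this sketch, a 10-byte buffer whose first byte is 3 leaves 6 payload bytes,
         while a buffer that signals more padding than it contains is rejected.</t>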
    </section>

    <section anchor="resampler" title="Resampler buffer">
      <t>The SILK resampler had the following issues:
        <list style="numbers">
          <t>The calls to memcpy() were using sizeof(opus_int32), but the type of the
          local buffer was opus_int16.</t>
          <t>Because the size was wrong, this potentially allowed the source
          and destination regions of the memcpy() to overlap on the copy from "buf" to "buf".
          We believe that nSamplesIn (number of input samples) is at least fs_in_khZ (sampling rate in kHz),
          which is at least 8.
          Since RESAMPLER_ORDER_FIR_12 is only 8, that should not be a problem once
          the type size is fixed.</t>
          <t>The size of the buffer used RESAMPLER_MAX_BATCH_SIZE_IN, but the
          data stored in it was actually twice the input batch size
          (nSamplesIn&lt;&lt;1).</t>
        </list></t>
    <t>The code can be fixed by applying the following changes to line 78 of silk/resampler_private_IIR_FIR.c:
    </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
 )
 {
     silk_resampler_state_struct *S = \
(silk_resampler_state_struct *)SS;
     opus_int32 nSamplesIn;
     opus_int32 max_index_Q16, index_increment_Q16;
-    opus_int16 buf[ RESAMPLER_MAX_BATCH_SIZE_IN + \
RESAMPLER_ORDER_FIR_12 ];
+    opus_int16 buf[ 2*RESAMPLER_MAX_BATCH_SIZE_IN + \
RESAMPLER_ORDER_FIR_12 ];

     /* Copy buffered samples to start of buffer */
-    silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \
* sizeof( opus_int32 ) );
+    silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \
* sizeof( opus_int16 ) );

     /* Iterate over blocks of frameSizeIn input samples */
     index_increment_Q16 = S->invRatio_Q16;
     while( 1 ) {
         nSamplesIn = silk_min( inLen, S->batchSize );

         /* Upsample 2x */
         silk_resampler_private_up2_HQ( S->sIIR, &buf[ \
RESAMPLER_ORDER_FIR_12 ], in, nSamplesIn );

         max_index_Q16 = silk_LSHIFT32( nSamplesIn, 16 + 1 \
);         /* + 1 because 2x upsampling */
         out = silk_resampler_private_IIR_FIR_INTERPOL( out, \
buf, max_index_Q16, index_increment_Q16 );
         in += nSamplesIn;
         inLen -= nSamplesIn;

         if( inLen > 0 ) {
             /* More iterations to do; copy last part of \
filtered signal to beginning of buffer */
-            silk_memcpy( buf, &buf[ nSamplesIn << 1 ], \
RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) );
+            silk_memmove( buf, &buf[ nSamplesIn << 1 ], \
RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
         } else {
             break;
         }
     }

     /* Copy last part of filtered signal to the state for \
the next call */
-    silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \
RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) );
+    silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \
RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
 }
<CODE ENDS>
]]></artwork>
</figure>
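    <t>The copy of filter history to the start of the buffer can be sketched in
    isolation as follows. This is a non-normative illustration: the helper name
    carry_history and its parameters are hypothetical. It shows the two ideas
    behind the patch, namely sizing the copy from the buffer's own element type
    and using memmove() where the regions might overlap.</t>
<figure>
<artwork><![CDATA[
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only. Moves the `order`
   samples that start at buf[consumed] to the start of buf so the
   next batch sees them as filter history. */
void carry_history(int16_t *buf, size_t consumed, size_t order)
{
    /* sizeof(*buf) follows the element type automatically, and
       memmove() stays well defined even if consumed < order and
       the two regions overlap. */
    memmove(buf, &buf[consumed], order * sizeof(*buf));
}
```
]]></artwork>
</figure>
    <t>Writing the size as sizeof(*buf) rather than naming a type avoids the kind of
    mismatch described in the first item of the list above.</t>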
    </section>

    <section title="Integer wrap-around in inverse gain computation">
      <t>
        It was discovered through decoder fuzzing that some bitstreams could produce
        integer values exceeding 32 bits in LPC_inverse_pred_gain_QA(), causing
        a wrap-around. The C standard treats signed integer overflow as
        undefined behavior. The following patch to line 87 of silk/LPC_inv_pred_gain.c
        detects values that do not fit in a 32-bit integer and considers the corresponding filters unstable:
      </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
         /* Update AR coefficient */
         for( n = 0; n < k; n++ ) {
-            tmp_QA = Aold_QA[ n ] - MUL32_FRAC_Q( \
Aold_QA[ k - n - 1 ], rc_Q31, 31 );
-            Anew_QA[ n ] = MUL32_FRAC_Q( tmp_QA, rc_mult2 , mult2Q );
+            opus_int64 tmp64;
+            tmp_QA = silk_SUB_SAT32( Aold_QA[ n ], MUL32_FRAC_Q( \
Aold_QA[ k - n - 1 ], rc_Q31, 31 ) );
+            tmp64 = silk_RSHIFT_ROUND64( silk_SMULL( tmp_QA, \
rc_mult2 ), mult2Q);
+            if( tmp64 > silk_int32_MAX || tmp64 < silk_int32_MIN ) {
+               return 0;
+            }
+            Anew_QA[ n ] = ( opus_int32 )tmp64;
         }
<CODE ENDS>
]]></artwork>
</figure>
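      <t>The general pattern used by the patch (compute in 64 bits, then range-check
        before narrowing) can be sketched as follows. This is a non-normative
        illustration: the helper name mul_round_shift_checked is hypothetical and is
        not one of the SILK macros, and the sketch assumes a shift between 1 and 32.</t>
<figure>
<artwork><![CDATA[
```c
#include <stdint.h>

/* Hypothetical helper for illustration only. Computes
   round((a * b) / 2^shift) in 64 bits, for 1 <= shift <= 32.
   Returns 1 and stores the result if it fits in 32 bits,
   otherwise returns 0 so the caller can reject the filter.
   Assumes >> on a negative int64_t is an arithmetic shift, which
   is implementation-defined in C but common in practice. */
int mul_round_shift_checked(int32_t a, int32_t b, int shift,
                            int32_t *out)
{
    int64_t prod = (int64_t)a * (int64_t)b;
    int64_t r = (prod + ((int64_t)1 << (shift - 1))) >> shift;
    if (r > INT32_MAX || r < INT32_MIN)
        return 0;
    *out = (int32_t)r;
    return 1;
}
```
]]></artwork>
</figure>
      <t>Returning a failure code instead of a wrapped value mirrors the patch, which
        returns 0 to signal an unstable filter.</t>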
    </section>

    <section title="Integer wrap-around in LSF decoding" anchor="lsf_overflow">
      <t>
        It was discovered -- also from decoder fuzzing -- that an integer wrap-around could
        occur when decoding bitstreams with extremely large values for the high LSF parameters.
        The end result of the wrap-around is an illegal read access on the stack, which
        the authors do not believe is exploitable but should nonetheless be fixed. The following
        patch to line 137 of silk/NLSF_stabilize.c prevents the problem:
      </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
           /* Keep delta_min distance between the NLSFs */
         for( i = 1; i < L; i++ )
-            NLSF_Q15[i] = silk_max_int( NLSF_Q15[i], \
NLSF_Q15[i-1] + NDeltaMin_Q15[i] );
+            NLSF_Q15[i] = silk_max_int( NLSF_Q15[i], \
silk_ADD_SAT16( NLSF_Q15[i-1], NDeltaMin_Q15[i] ) );

         /* Last NLSF should be no higher than 1 - NDeltaMin[L] */
<CODE ENDS>
]]></artwork>
</figure>
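      <t>A saturating 16-bit addition of the kind the patch relies on can be sketched
        as follows. This is a non-normative illustration: add_sat16 is a hypothetical
        stand-in and is not the actual silk_ADD_SAT16 macro.</t>
<figure>
<artwork><![CDATA[
```c
#include <stdint.h>

/* Hypothetical stand-in for illustration only. Adds two 16-bit
   values in 32 bits and clamps the sum to the int16_t range
   instead of letting it wrap when narrowed. */
int16_t add_sat16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```
]]></artwork>
</figure>
      <t>With this helper, adding 1000 to 32000 yields 32767 rather than a negative
        wrapped value.</t>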

    </section>

    <section title="Cap on Band Energy">
      <t>On extreme bitstreams, it is possible for log-domain band energy levels
        to exceed the maximum single-precision floating point value once converted
        to a linear scale. This would later cause the decoded values to be NaN (not a number),
        possibly causing problems in the software using the PCM values. This can be
        avoided with the following patch to line 552 of celt/quant_bands.c:
      </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
       {
          opus_val16 lg = ADD16(oldEBands[i+c*m->nbEBands],
                          SHL16((opus_val16)eMeans[i],6));
+         lg = MIN32(QCONST32(32.f, 16), lg);
          eBands[i+c*m->nbEBands] = PSHR32(celt_exp2(lg),4);
       }
       for (;i<m->nbEBands;i++)
<CODE ENDS>
]]></artwork>
</figure>
    </section>

    <section title="Hybrid Folding" anchor="folding">
      <t>When encoding in hybrid mode at low bitrate, we sometimes only have
        enough bits to code a single CELT band (8 - 9.6 kHz). When that happens,
        the second band (CELT band 18, from 9.6 to 12 kHz) cannot use folding
        because it is wider than the amount already coded, and falls back to
        white noise. Because this can also happen on transients (e.g., stops), it
        can cause audible pre-echo.
      </t>
      <t>
        To address the issue, we change the folding behavior so that it is
        never forced to fall back to noise from the linear congruential generator
        (LCG) due to the first band not containing
        enough coefficients to fold onto the second band. This
        is achieved by simply repeating part of the first band in the folding
        of the second band. This changes the code in celt/bands.c around line 1237:
      </t>
<figure>
<artwork><![CDATA[
<CODE BEGINS>
          b = 0;
       }

-      if (resynth && M*eBands[i]-N >= M*eBands[start] && \
(update_lowband || lowband_offset==0))
+      if (resynth && (M*eBands[i]-N >= M*eBands[start] || \
i==start+1) && (update_lowband || lowband_offset==0))
             lowband_offset = i;

+      if (i == start+1)
+      {
+         int n1, n2;
+         int offset;
+         n1 = M*(eBands[start+1]-eBands[start]);
+         n2 = M*(eBands[start+2]-eBands[start+1]);
+         offset = M*eBands[start];
+         /* Duplicate enough of the first band folding data to \
be able to fold the second band.
+            Copies no data for CELT-only mode. */
+         OPUS_COPY(&norm[offset+n1], &norm[offset+2*n1 - n2], n2-n1);
+         if (C==2)
+            OPUS_COPY(&norm2[offset+n1], &norm2[offset+2*n1 - n2], \
n2-n1);
+      }
+
       tf_change = tf_res[i];
       if (i>=m->effEBands)
       {
<CODE ENDS>
]]></artwork>
</figure>

      <t>
       as well as line 1260:
      </t>

<figure>
<artwork><![CDATA[
<CODE BEGINS>
          fold_start = lowband_offset;
          while(M*eBands[--fold_start] > effective_lowband);
          fold_end = lowband_offset-1;
-         while(M*eBands[++fold_end] < effective_lowband+N);
+         while(++fold_end < i && M*eBands[fold_end] < \
effective_lowband+N);
          x_cm = y_cm = 0;
          fold_i = fold_start; do {
            x_cm |= collapse_masks[fold_i*C+0];

<CODE ENDS>
]]></artwork>
</figure>
      <t>
        The fix does not impact compatibility, because the improvement does
        not depend on the encoder doing anything special. There is also no
        reasonable way for an encoder to use the original behavior to
        improve quality over the proposed change.
      </t>
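      <t>
        The duplication step in the first patch can be sketched on a plain array as
        follows. This is a non-normative illustration: the helper name
        extend_fold_source is hypothetical, the explicit guard replaces the implicit
        zero-length copy of the patch, and the sketch assumes that n2 is at most
        twice n1 so that the source range stays inside the first band.
      </t>
<figure>
<artwork><![CDATA[
```c
#include <string.h>

/* Hypothetical helper for illustration only. norm[offset ..
   offset+n1) holds the n1 decoded coefficients of the first
   band. If the second band is wider (n1 < n2 <= 2*n1), repeat
   the tail of the first band so that n2 coefficients are
   available as a folding source. */
void extend_fold_source(float *norm, int offset, int n1, int n2)
{
    if (n2 > n1) {
        /* Source [offset+2*n1-n2, offset+n1) and destination
           [offset+n1, offset+n2) do not overlap. */
        memcpy(&norm[offset + n1], &norm[offset + 2 * n1 - n2],
               (size_t)(n2 - n1) * sizeof(*norm));
    }
}
```
]]></artwork>
</figure>
      <t>
        For example, with illustrative widths n1 = 8 and n2 = 12 (the same 2:3 ratio
        as the 1.6 kHz and 2.4 kHz bands above), the last four coefficients of the
        first band are repeated once.
      </t>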
    </section>

    <section title="Downmix to Mono" anchor="stereo">
      <t>The last issue is not strictly a bug, but it is an issue that has been reported
      when downmixing an Opus decoded stream to mono, whether this is done inside the decoder
      or as a post-processing step on the stereo decoder output. Opus intensity stereo allows
      optionally coding the two channels 180 degrees out of phase on a per-band basis.
      This provides better stereo quality than forcing the two channels to be in phase,
      but when the output is downmixed to mono, the energy in the affected bands is cancelled,
      sometimes resulting in audible artifacts.
      </t>
      <t>As a work-around for this issue, the decoder MAY choose not to apply the 180-degree
      phase shift. This can be useful when downmixing to mono inside or
      outside of the decoder (e.g., user-controllable).
      </t>
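      <t>The cancellation can be seen in a minimal time-domain sketch of a mono downmix.
      This is a non-normative illustration: the helper name downmix_to_mono is
      hypothetical, and a plain average of interleaved samples is assumed as the
      downmix rule.
      </t>
<figure>
<artwork><![CDATA[
```c
/* Hypothetical helper for illustration only. Averages interleaved
   stereo samples (L0, R0, L1, R1, ...) into mono. */
void downmix_to_mono(const float *stereo, float *mono, int frames)
{
    int i;
    for (i = 0; i < frames; i++)
        mono[i] = 0.5f * (stereo[2 * i] + stereo[2 * i + 1]);
}
```
]]></artwork>
</figure>
      <t>When the right channel is the exact negative of the left, every mono sample
      produced by this sketch is zero, whereas in-phase channels pass through unchanged.
      </t>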
    </section>

    <section title="New Test Vectors">
      <t>Changes in <xref target="folding"/> and <xref target="stereo"/> have
        sufficient impact on the test vectors to make them fail. For this reason,
        this document also updates the Opus test vectors. The new test vectors now
        include two decoded outputs for the same bitstream. The outputs with
        suffix 'm' do not apply the CELT 180-degree phase shift as allowed in
        <xref target="stereo"/>, while the outputs without the suffix do. An
        implementation is compliant as long as it passes either set of vectors.
      </t>
      <t>
        Any Opus implementation
        that passes either the original test vectors from <xref target="RFC6716">RFC 6716</xref>
        or one of the new sets of test vectors is compliant with the Opus specification. However, newer implementations
        SHOULD be based on the new test vectors rather than the old ones.
      </t>
      <t>The new test vectors are located at
        <eref target="https://www.ietf.org/proceedings/98/slides/materials-98-codec-opus-newvectors-00.tar.gz"/>.
        The SHA-1 hashes of the test vectors are:
<figure>
<artwork>
<![CDATA[
e49b2862ceec7324790ed8019eb9744596d5be01  testvector01.bit
b809795ae1bcd606049d76de4ad24236257135e0  testvector02.bit
e0c4ecaeab44d35a2f5b6575cd996848e5ee2acc  testvector03.bit
a0f870cbe14ebb71fa9066ef3ee96e59c9a75187  testvector04.bit
9b3d92b48b965dfe9edf7b8a85edd4309f8cf7c8  testvector05.bit
28e66769ab17e17f72875283c14b19690cbc4e57  testvector06.bit
bacf467be3215fc7ec288f29e2477de1192947a6  testvector07.bit
ddbe08b688bbf934071f3893cd0030ce48dba12f  testvector08.bit
3932d9d61944dab1201645b8eeaad595d5705ecb  testvector09.bit
521eb2a1e0cc9c31b8b740673307c2d3b10c1900  testvector10.bit
6bc8f3146fcb96450c901b16c3d464ccdf4d5d96  testvector11.bit
338c3f1b4b97226bc60bc41038becbc6de06b28f  testvector12.bit
f5ef93884da6a814d311027918e9afc6f2e5c2c8  testvector01.dec
48ac1ff1995250a756e1e17bd32acefa8cd2b820  testvector02.dec
d15567e919db2d0e818727092c0af8dd9df23c95  testvector03.dec
1249dd28f5bd1e39a66fd6d99449dca7a8316342  testvector04.dec
b85675d81deef84a112c466cdff3b7aaa1d2fc76  testvector05.dec
55f0b191e90bfa6f98b50d01a64b44255cb4813e  testvector06.dec
61e8b357ab090b1801eeb578a28a6ae935e25b7b  testvector07.dec
a58539ee5321453b2ddf4c0f2500e856b3966862  testvector08.dec
bb96aad2cde188555862b7bbb3af6133851ef8f4  testvector09.dec
1b6cdf0413ac9965b16184b1bea129b5c0b2a37a  testvector10.dec
b1fff72b74666e3027801b29dbc48b31f80dee0d  testvector11.dec
98e09bbafed329e341c3b4052e9c4ba5fc83f9b1  testvector12.dec
1e7d984ea3fbb16ba998aea761f4893fbdb30157  testvector01m.dec
48ac1ff1995250a756e1e17bd32acefa8cd2b820  testvector02m.dec
d15567e919db2d0e818727092c0af8dd9df23c95  testvector03m.dec
1249dd28f5bd1e39a66fd6d99449dca7a8316342  testvector04m.dec
d70b0bad431e7d463bc3da49bd2d49f1c6d0a530  testvector05m.dec
6ac1648c3174c95fada565161a6c78bdbe59c77d  testvector06m.dec
fc5e2f709693738324fb4c8bdc0dad6dda04e713  testvector07m.dec
aad2ba397bf1b6a18e8e09b50e4b19627d479f00  testvector08m.dec
6feb7a7b9d7cdc1383baf8d5739e2a514bd0ba08  testvector09m.dec
1b6cdf0413ac9965b16184b1bea129b5c0b2a37a  testvector10m.dec
fd3d3a7b0dfbdab98d37ed9aa04b659b9fefbd18  testvector11m.dec
98e09bbafed329e341c3b4052e9c4ba5fc83f9b1  testvector12m.dec
]]>
</artwork>
</figure>
      Note that the decoder input bitstream files (.bit) are unchanged.
      </t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>This document fixes two security issues reported against Opus that affect the
        reference implementation in <xref target="RFC6716">RFC 6716</xref>: CVE-2013-0899
        <eref target="https://nvd.nist.gov/vuln/detail/CVE-2013-0899"/>
        and CVE-2017-0381 <eref target="https://nvd.nist.gov/vuln/detail/CVE-2017-0381"/>.
        CVE-2013-0899 theoretically could have caused an information leak. The leaked
        information would have gone through the decoder process before being accessible
        to the attacker. It is fixed by <xref target="padding"/>.
        CVE-2017-0381 could have resulted in a 16-bit out-of-bounds read from a fixed
        location. It is fixed by <xref target="lsf_overflow"/>.
        Beyond the two fixed CVEs, this document adds no new security considerations on top of
        <xref target="RFC6716">RFC 6716</xref>.
      </t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>

      <t>Note to RFC Editor: this section may be removed on publication as an
      RFC.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>We would like to thank Juri Aedla for reporting the issue with the parsing of
      the Opus padding. Thanks to Felicia Lim for reporting the LSF integer overflow issue.
      Also, thanks to Tina le Grand, Jonathan Lennox, and Mark Harris for their
      feedback on this document.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml"?>


    </references>
  </back>
</rfc>