GitHub Repository: stenzek/duckstation
Path: blob/master/dep/libchdr/include/dr_libs/dr_flac.h
/*
FLAC audio decoder. Choice of public domain or MIT-0. See license statements at the end of this file.
dr_flac - v0.12.42 - 2023-11-02

David Reid - [email protected]

GitHub: https://github.com/mackron/dr_libs
*/

/*
RELEASE NOTES - v0.12.0
=======================
Version 0.12.0 has breaking API changes including changes to the existing API and the removal of deprecated APIs.


Improved Client-Defined Memory Allocation
-----------------------------------------
The main change with this release is the addition of a more flexible way of implementing custom memory allocation routines. The
existing system of DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE is still in place and will be used by default when no custom
allocation callbacks are specified.

To use the new system, you pass in a pointer to a drflac_allocation_callbacks object to drflac_open() and family, like this:

    void* my_malloc(size_t sz, void* pUserData)
    {
        return malloc(sz);
    }
    void* my_realloc(void* p, size_t sz, void* pUserData)
    {
        return realloc(p, sz);
    }
    void my_free(void* p, void* pUserData)
    {
        free(p);
    }

    ...

    drflac_allocation_callbacks allocationCallbacks;
    allocationCallbacks.pUserData = &myData;
    allocationCallbacks.onMalloc = my_malloc;
    allocationCallbacks.onRealloc = my_realloc;
    allocationCallbacks.onFree = my_free;
    drflac* pFlac = drflac_open_file("my_file.flac", &allocationCallbacks);

The advantage of this new system is that it allows you to specify user data which will be passed in to the allocation routines.

Passing in null for the allocation callbacks object will cause dr_flac to use the defaults, which are DRFLAC_MALLOC,
DRFLAC_REALLOC and DRFLAC_FREE. This is equivalent to how it worked in previous versions.
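
As an illustration of what the user data pointer makes possible, here is a minimal sketch of callbacks that count allocations. The
`my_alloc_stats` struct and the `my_counting_*` names are hypothetical and not part of dr_flac; only the three callback signatures
come from drflac_allocation_callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical bookkeeping struct. The library only ever sees it as an opaque pUserData pointer.
typedef struct
{
    size_t mallocCount;
    size_t reallocCount;
    size_t freeCount;
} my_alloc_stats;

static void* my_counting_malloc(size_t sz, void* pUserData)
{
    my_alloc_stats* pStats = (my_alloc_stats*)pUserData;
    assert(pStats != NULL);
    pStats->mallocCount += 1;
    return malloc(sz);
}

static void* my_counting_realloc(void* p, size_t sz, void* pUserData)
{
    my_alloc_stats* pStats = (my_alloc_stats*)pUserData;
    assert(pStats != NULL);
    pStats->reallocCount += 1;
    return realloc(p, sz);
}

static void my_counting_free(void* p, void* pUserData)
{
    my_alloc_stats* pStats = (my_alloc_stats*)pUserData;
    assert(pStats != NULL);
    pStats->freeCount += 1;
    free(p);
}
```

To use the sketch, point `pUserData` at a `my_alloc_stats` object that outlives the decoder and assign the three functions to
`onMalloc`, `onRealloc` and `onFree`.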

Every API that opens a drflac object now takes this extra parameter. These include the following:

    drflac_open()
    drflac_open_relaxed()
    drflac_open_with_metadata()
    drflac_open_with_metadata_relaxed()
    drflac_open_file()
    drflac_open_file_with_metadata()
    drflac_open_memory()
    drflac_open_memory_with_metadata()
    drflac_open_and_read_pcm_frames_s32()
    drflac_open_and_read_pcm_frames_s16()
    drflac_open_and_read_pcm_frames_f32()
    drflac_open_file_and_read_pcm_frames_s32()
    drflac_open_file_and_read_pcm_frames_s16()
    drflac_open_file_and_read_pcm_frames_f32()
    drflac_open_memory_and_read_pcm_frames_s32()
    drflac_open_memory_and_read_pcm_frames_s16()
    drflac_open_memory_and_read_pcm_frames_f32()



Optimizations
-------------
Seeking performance has been greatly improved. A new binary search based seeking algorithm has been introduced which significantly
improves performance over the brute force method which was used when no seek table was present. Seek table based seeking also takes
advantage of the new binary search seeking system to further improve performance there as well. Note that this depends on CRC which
means it will be disabled when DR_FLAC_NO_CRC is used.
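
To illustrate the general idea of binary searching towards a target PCM frame, here is a minimal sketch that finds the best entry
in a sorted, in-memory table. This is not dr_flac's internal seeking code, which is not reproduced here. The `sketch_seekpoint`
type and `sketch_find_seekpoint()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a seek table entry, sorted by firstPCMFrame.
typedef struct
{
    unsigned long long firstPCMFrame;
    unsigned long long byteOffset;
} sketch_seekpoint;

// Returns the index of the last entry whose firstPCMFrame <= targetFrame, or count if there is none.
static size_t sketch_find_seekpoint(const sketch_seekpoint* pPoints, size_t count, unsigned long long targetFrame)
{
    size_t lo = 0;
    size_t hi = count;      // The search covers the half-open range [lo, hi).
    size_t best = count;

    assert(pPoints != NULL || count == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pPoints[mid].firstPCMFrame <= targetFrame) {
            best = mid;     // A candidate. A later entry may still qualify.
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    return best;
}
```

Each iteration halves the remaining range, so the lookup needs on the order of log2(count) comparisons rather than a full scan.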

The SSE4.1 pipeline has been cleaned up and optimized. You should see some improvements with decoding speed of 24-bit files in
particular. 16-bit streams should also see some improvement.

drflac_read_pcm_frames_s16() has been optimized. Previously this sat on top of drflac_read_pcm_frames_s32() and performed its s32
to s16 conversion in a second pass. This is now all done in a single pass. This includes SSE2 and ARM NEON optimized paths.

A minor optimization has been implemented for drflac_read_pcm_frames_s32(). This will now use an SSE2 optimized pipeline for stereo
channel reconstruction which is the last part of the decoding process.

The ARM build has seen a few improvements. The CLZ (count leading zeroes) and REV (byte swap) instructions are now used when
compiling with GCC and Clang which is achieved using inline assembly. The CLZ instruction requires ARM architecture version 5 at
compile time and the REV instruction requires ARM architecture version 6.

An ARM NEON optimized pipeline has been implemented. To enable this you'll need to add -mfpu=neon to the command line when compiling.


Removed APIs
------------
The following APIs were deprecated in version 0.11.0 and have been completely removed in version 0.12.0:

    drflac_read_s32() -> drflac_read_pcm_frames_s32()
    drflac_read_s16() -> drflac_read_pcm_frames_s16()
    drflac_read_f32() -> drflac_read_pcm_frames_f32()
    drflac_seek_to_sample() -> drflac_seek_to_pcm_frame()
    drflac_open_and_decode_s32() -> drflac_open_and_read_pcm_frames_s32()
    drflac_open_and_decode_s16() -> drflac_open_and_read_pcm_frames_s16()
    drflac_open_and_decode_f32() -> drflac_open_and_read_pcm_frames_f32()
    drflac_open_and_decode_file_s32() -> drflac_open_file_and_read_pcm_frames_s32()
    drflac_open_and_decode_file_s16() -> drflac_open_file_and_read_pcm_frames_s16()
    drflac_open_and_decode_file_f32() -> drflac_open_file_and_read_pcm_frames_f32()
    drflac_open_and_decode_memory_s32() -> drflac_open_memory_and_read_pcm_frames_s32()
    drflac_open_and_decode_memory_s16() -> drflac_open_memory_and_read_pcm_frames_s16()
    drflac_open_and_decode_memory_f32() -> drflac_open_memory_and_read_pcm_frames_f32()

Prior versions of dr_flac operated on a per-sample basis whereas now it operates on PCM frames. The removed APIs all relate
to the old per-sample APIs. You now need to use the "pcm_frame" versions.
*/


/*
Introduction
============
dr_flac is a single file library. To use it, do something like the following in one .c file.

```c
#define DR_FLAC_IMPLEMENTATION
#include "dr_flac.h"
```

You can then #include this file in other parts of the program as you would with any other header file. To decode audio data, do something like the following:

```c
drflac* pFlac = drflac_open_file("MySong.flac", NULL);
if (pFlac == NULL) {
    // Failed to open FLAC file
}

drflac_int32* pSamples = malloc(pFlac->totalPCMFrameCount * pFlac->channels * sizeof(drflac_int32));
drflac_uint64 numberOfInterleavedSamplesActuallyRead = drflac_read_pcm_frames_s32(pFlac, pFlac->totalPCMFrameCount, pSamples);
```

The drflac object represents the decoder. It is a transparent type so all the information you need, such as the number of channels and the bits per sample,
should be directly accessible - just make sure you don't change their values. Samples are always output as interleaved signed 32-bit PCM. In the example above
a native FLAC stream was opened, however dr_flac has seamless support for Ogg encapsulated FLAC streams as well.

You do not need to decode the entire stream in one go - you just specify how many samples you'd like at any given time and the decoder will give you as many
samples as it can, up to the amount requested. Later on when you need the next batch of samples, just call it again. Example:

```c
while (drflac_read_pcm_frames_s32(pFlac, chunkSizeInPCMFrames, pChunkSamples) > 0) {
    do_something();
}
```

You can seek to a specific PCM frame with `drflac_seek_to_pcm_frame()`.

If you just want to quickly decode an entire FLAC file in one go you can do something like this:

```c
unsigned int channels;
unsigned int sampleRate;
drflac_uint64 totalPCMFrameCount;
drflac_int32* pSampleData = drflac_open_file_and_read_pcm_frames_s32("MySong.flac", &channels, &sampleRate, &totalPCMFrameCount, NULL);
if (pSampleData == NULL) {
    // Failed to open and decode FLAC file.
}

...

drflac_free(pSampleData, NULL);
```

You can read samples as signed 16-bit integer and 32-bit floating-point PCM with the *_s16() and *_f32() family of APIs respectively, but note that these
should be considered lossy.
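
To see why a 16-bit read can lose information, consider narrowing 32-bit samples by keeping only their top 16 bits. The sketch below
makes that assumption explicitly and is not a copy of dr_flac's conversion code. `sketch_narrow_s32_to_s16()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumed narrowing rule for this sketch: keep the 16 most significant bits of each sample.
static void sketch_narrow_s32_to_s16(const int32_t* pIn, int16_t* pOut, size_t count)
{
    size_t i;

    assert(pIn != NULL && pOut != NULL);

    for (i = 0; i < count; i += 1) {
        // Right-shifting a negative value is implementation-defined in C. Mainstream compilers shift arithmetically.
        pOut[i] = (int16_t)(pIn[i] >> 16);
    }
}
```

Under this rule the inputs 0x12345600 and 0x1234FF00 both narrow to 0x1234, so the low-order detail of a source deeper than 16 bits
cannot be recovered afterwards.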


If you need access to metadata (album art, etc.), use `drflac_open_with_metadata()`, `drflac_open_file_with_metadata()` or `drflac_open_memory_with_metadata()`.
The rationale for keeping these APIs separate is that they're slightly slower than the normal versions and also just a little bit harder to use. dr_flac
reports metadata to the application through the use of a callback, and every metadata block is reported before `drflac_open_with_metadata()` returns.

The main opening APIs (`drflac_open()`, etc.) will fail if the header is not present. This presents a problem in certain scenarios such as broadcast style
streams or internet radio where the header may not be present because the user has started playback mid-stream. To handle this, use the relaxed APIs:

    `drflac_open_relaxed()`
    `drflac_open_with_metadata_relaxed()`

It is not recommended to use these APIs for file based streams because a missing header would usually indicate a corrupt or perverse file. In addition, these
APIs can take a long time to initialize because they may need to spend a lot of time finding the first frame.



Build Options
=============
#define these options before including this file.

#define DR_FLAC_NO_STDIO
  Disables `drflac_open_file()` and family.

#define DR_FLAC_NO_OGG
  Disables support for Ogg/FLAC streams.

#define DR_FLAC_BUFFER_SIZE <number>
  Defines the size of the internal buffer to store data from onRead(). This buffer is used to reduce the number of calls back to the client for more data.
  Larger values mean more memory, but better performance. My tests show diminishing returns after about 4KB (which is the default). Consider reducing this if
  you have a very efficient implementation of onRead(), or increase it if it's very inefficient. Must be a multiple of 8.

#define DR_FLAC_NO_CRC
  Disables CRC checks. This will offer a performance boost when CRC is unnecessary. This will disable binary search seeking. When seeking, the seek table will
  be used if available. Otherwise the seek will be performed using brute force.

#define DR_FLAC_NO_SIMD
  Disables SIMD optimizations (SSE on x86/x64 architectures, NEON on ARM architectures). Use this if you are having compatibility issues with your compiler.

#define DR_FLAC_NO_WCHAR
  Disables all functions ending with `_w`. Use this if your compiler does not provide wchar.h. Not required if DR_FLAC_NO_STDIO is also defined.



Notes
=====
- dr_flac does not support changing the sample rate or channel count mid stream.
- dr_flac is not thread-safe, but its APIs can be called from any thread so long as you do your own synchronization.
- When using Ogg encapsulation, a corrupted metadata block will result in `drflac_open_with_metadata()` and `drflac_open()` returning inconsistent samples due
  to differences in corrupted stream recovery logic between the two APIs.
*/

#ifndef dr_flac_h
#define dr_flac_h

#ifdef __cplusplus
extern "C" {
#endif

#define DRFLAC_STRINGIFY(x) #x
#define DRFLAC_XSTRINGIFY(x) DRFLAC_STRINGIFY(x)

#define DRFLAC_VERSION_MAJOR 0
#define DRFLAC_VERSION_MINOR 12
#define DRFLAC_VERSION_REVISION 42
#define DRFLAC_VERSION_STRING DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MAJOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MINOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_REVISION)

#include <stddef.h> /* For size_t. */

/* Sized Types */
typedef signed char drflac_int8;
typedef unsigned char drflac_uint8;
typedef signed short drflac_int16;
typedef unsigned short drflac_uint16;
typedef signed int drflac_int32;
typedef unsigned int drflac_uint32;
#if defined(_MSC_VER) && !defined(__clang__)
    typedef signed __int64 drflac_int64;
    typedef unsigned __int64 drflac_uint64;
#else
    #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
        #pragma GCC diagnostic push
        #pragma GCC diagnostic ignored "-Wlong-long"
        #if defined(__clang__)
            #pragma GCC diagnostic ignored "-Wc++11-long-long"
        #endif
    #endif
    typedef signed long long drflac_int64;
    typedef unsigned long long drflac_uint64;
    #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
        #pragma GCC diagnostic pop
    #endif
#endif
#if defined(__LP64__) || defined(_WIN64) || (defined(__x86_64__) && !defined(__ILP32__)) || defined(_M_X64) || defined(__ia64) || defined(_M_IA64) || defined(__aarch64__) || defined(_M_ARM64) || defined(__powerpc64__)
    typedef drflac_uint64 drflac_uintptr;
#else
    typedef drflac_uint32 drflac_uintptr;
#endif
typedef drflac_uint8 drflac_bool8;
typedef drflac_uint32 drflac_bool32;
#define DRFLAC_TRUE 1
#define DRFLAC_FALSE 0
/* End Sized Types */

/* Decorations */
#if !defined(DRFLAC_API)
    #if defined(DRFLAC_DLL)
        #if defined(_WIN32)
            #define DRFLAC_DLL_IMPORT __declspec(dllimport)
            #define DRFLAC_DLL_EXPORT __declspec(dllexport)
            #define DRFLAC_DLL_PRIVATE static
        #else
            #if defined(__GNUC__) && __GNUC__ >= 4
                #define DRFLAC_DLL_IMPORT __attribute__((visibility("default")))
                #define DRFLAC_DLL_EXPORT __attribute__((visibility("default")))
                #define DRFLAC_DLL_PRIVATE __attribute__((visibility("hidden")))
            #else
                #define DRFLAC_DLL_IMPORT
                #define DRFLAC_DLL_EXPORT
                #define DRFLAC_DLL_PRIVATE static
            #endif
        #endif

        #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION)
            #define DRFLAC_API DRFLAC_DLL_EXPORT
        #else
            #define DRFLAC_API DRFLAC_DLL_IMPORT
        #endif
        #define DRFLAC_PRIVATE DRFLAC_DLL_PRIVATE
    #else
        #define DRFLAC_API extern
        #define DRFLAC_PRIVATE static
    #endif
#endif
/* End Decorations */

#if defined(_MSC_VER) && _MSC_VER >= 1700 /* Visual Studio 2012 */
    #define DRFLAC_DEPRECATED __declspec(deprecated)
#elif (defined(__GNUC__) && __GNUC__ >= 4) /* GCC 4 */
    #define DRFLAC_DEPRECATED __attribute__((deprecated))
#elif defined(__has_feature) /* Clang */
    #if __has_feature(attribute_deprecated)
        #define DRFLAC_DEPRECATED __attribute__((deprecated))
    #else
        #define DRFLAC_DEPRECATED
    #endif
#else
    #define DRFLAC_DEPRECATED
#endif

DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision);
DRFLAC_API const char* drflac_version_string(void);

/* Allocation Callbacks */
typedef struct
{
    void* pUserData;
    void* (* onMalloc)(size_t sz, void* pUserData);
    void* (* onRealloc)(void* p, size_t sz, void* pUserData);
    void (* onFree)(void* p, void* pUserData);
} drflac_allocation_callbacks;
/* End Allocation Callbacks */

/*
As data is read from the client it is placed into an internal buffer for fast access. This controls the size of that buffer. Larger values mean more speed,
but also more memory. In my testing there are diminishing returns after about 4KB, but you can fiddle with this to suit your own needs. Must be a multiple of 8.
*/
#ifndef DR_FLAC_BUFFER_SIZE
#define DR_FLAC_BUFFER_SIZE 4096
#endif


/* Architecture Detection */
#if defined(_WIN64) || defined(_LP64) || defined(__LP64__)
#define DRFLAC_64BIT
#endif

#if defined(__x86_64__) || defined(_M_X64)
    #define DRFLAC_X64
#elif defined(__i386) || defined(_M_IX86)
    #define DRFLAC_X86
#elif defined(__arm__) || defined(_M_ARM) || defined(__arm64) || defined(__arm64__) || defined(__aarch64__) || defined(_M_ARM64)
    #define DRFLAC_ARM
#endif
/* End Architecture Detection */


#ifdef DRFLAC_64BIT
typedef drflac_uint64 drflac_cache_t;
#else
typedef drflac_uint32 drflac_cache_t;
#endif

/* The various metadata block types. */
#define DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO 0
#define DRFLAC_METADATA_BLOCK_TYPE_PADDING 1
#define DRFLAC_METADATA_BLOCK_TYPE_APPLICATION 2
#define DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE 3
#define DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT 4
#define DRFLAC_METADATA_BLOCK_TYPE_CUESHEET 5
#define DRFLAC_METADATA_BLOCK_TYPE_PICTURE 6
#define DRFLAC_METADATA_BLOCK_TYPE_INVALID 127

/* The various picture types specified in the PICTURE block. */
#define DRFLAC_PICTURE_TYPE_OTHER 0
#define DRFLAC_PICTURE_TYPE_FILE_ICON 1
#define DRFLAC_PICTURE_TYPE_OTHER_FILE_ICON 2
#define DRFLAC_PICTURE_TYPE_COVER_FRONT 3
#define DRFLAC_PICTURE_TYPE_COVER_BACK 4
#define DRFLAC_PICTURE_TYPE_LEAFLET_PAGE 5
#define DRFLAC_PICTURE_TYPE_MEDIA 6
#define DRFLAC_PICTURE_TYPE_LEAD_ARTIST 7
#define DRFLAC_PICTURE_TYPE_ARTIST 8
#define DRFLAC_PICTURE_TYPE_CONDUCTOR 9
#define DRFLAC_PICTURE_TYPE_BAND 10
#define DRFLAC_PICTURE_TYPE_COMPOSER 11
#define DRFLAC_PICTURE_TYPE_LYRICIST 12
#define DRFLAC_PICTURE_TYPE_RECORDING_LOCATION 13
#define DRFLAC_PICTURE_TYPE_DURING_RECORDING 14
#define DRFLAC_PICTURE_TYPE_DURING_PERFORMANCE 15
#define DRFLAC_PICTURE_TYPE_SCREEN_CAPTURE 16
#define DRFLAC_PICTURE_TYPE_BRIGHT_COLORED_FISH 17
#define DRFLAC_PICTURE_TYPE_ILLUSTRATION 18
#define DRFLAC_PICTURE_TYPE_BAND_LOGOTYPE 19
#define DRFLAC_PICTURE_TYPE_PUBLISHER_LOGOTYPE 20

typedef enum
{
    drflac_container_native,
    drflac_container_ogg,
    drflac_container_unknown
} drflac_container;

typedef enum
{
    drflac_seek_origin_start,
    drflac_seek_origin_current
} drflac_seek_origin;

/* The order of members in this structure is important because we map this directly to the raw data within the SEEKTABLE metadata block. */
typedef struct
{
    drflac_uint64 firstPCMFrame;
    drflac_uint64 flacFrameOffset; /* The offset from the first byte of the header of the first frame. */
    drflac_uint16 pcmFrameCount;
} drflac_seekpoint;
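
/*
The member order above mirrors the order of the fields in a raw seek point. As a minimal sketch of that correspondence, the code
below decodes one raw entry field by field. It assumes the FLAC on-disk layout of an 18-byte seek point: an 8-byte first sample
number, an 8-byte byte offset and a 2-byte sample count, all big-endian. The `sketch_*` names are hypothetical and are not part of
this library.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical mirror of the three seek point fields, using plain C types.
typedef struct
{
    unsigned long long firstPCMFrame;
    unsigned long long flacFrameOffset;
    unsigned short pcmFrameCount;
} sketch_seekpoint_fields;

// Reads byteCount big-endian bytes into an unsigned integer.
static unsigned long long sketch_read_be(const unsigned char* p, size_t byteCount)
{
    unsigned long long value = 0;
    size_t i;

    for (i = 0; i < byteCount; i += 1) {
        value = (value << 8) | p[i];
    }

    return value;
}

// Decodes one 18-byte raw seek point: 8 bytes, 8 bytes, then 2 bytes.
static sketch_seekpoint_fields sketch_parse_seekpoint(const unsigned char* pRaw)
{
    sketch_seekpoint_fields sp;

    assert(pRaw != NULL);

    sp.firstPCMFrame = sketch_read_be(pRaw + 0, 8);
    sp.flacFrameOffset = sketch_read_be(pRaw + 8, 8);
    sp.pcmFrameCount = (unsigned short)sketch_read_be(pRaw + 16, 2);
    return sp;
}
```

Decoding byte by byte like this avoids any dependence on the host's endianness or on struct padding.
*/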

typedef struct
{
    drflac_uint16 minBlockSizeInPCMFrames;
    drflac_uint16 maxBlockSizeInPCMFrames;
    drflac_uint32 minFrameSizeInPCMFrames;
    drflac_uint32 maxFrameSizeInPCMFrames;
    drflac_uint32 sampleRate;
    drflac_uint8 channels;
    drflac_uint8 bitsPerSample;
    drflac_uint64 totalPCMFrameCount;
    drflac_uint8 md5[16];
} drflac_streaminfo;

typedef struct
{
    /*
    The metadata type. Use this to know how to interpret the data below. Will be set to one of the
    DRFLAC_METADATA_BLOCK_TYPE_* tokens.
    */
    drflac_uint32 type;

    /*
    A pointer to the raw data. This points to a temporary buffer so don't hold on to it. It's best to
    not modify the contents of this buffer. Use the structures below for more meaningful and structured
    information about the metadata. It's possible for this to be null.
    */
    const void* pRawData;

    /* The size in bytes of the block and the buffer pointed to by pRawData if it's non-NULL. */
    drflac_uint32 rawDataSize;

    union
    {
        drflac_streaminfo streaminfo;

        struct
        {
            int unused;
        } padding;

        struct
        {
            drflac_uint32 id;
            const void* pData;
            drflac_uint32 dataSize;
        } application;

        struct
        {
            drflac_uint32 seekpointCount;
            const drflac_seekpoint* pSeekpoints;
        } seektable;

        struct
        {
            drflac_uint32 vendorLength;
            const char* vendor;
            drflac_uint32 commentCount;
            const void* pComments;
        } vorbis_comment;

        struct
        {
            char catalog[128];
            drflac_uint64 leadInSampleCount;
            drflac_bool32 isCD;
            drflac_uint8 trackCount;
            const void* pTrackData;
        } cuesheet;

        struct
        {
            drflac_uint32 type;
            drflac_uint32 mimeLength;
            const char* mime;
            drflac_uint32 descriptionLength;
            const char* description;
            drflac_uint32 width;
            drflac_uint32 height;
            drflac_uint32 colorDepth;
            drflac_uint32 indexColorCount;
            drflac_uint32 pictureDataSize;
            const drflac_uint8* pPictureData;
        } picture;
    } data;
} drflac_metadata;


/*
Callback for when data needs to be read from the client.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

pBufferOut (out)
    The output buffer.

bytesToRead (in)
    The number of bytes to read.


Return Value
------------
The number of bytes actually read.


Remarks
-------
A return value of less than bytesToRead indicates the end of the stream. Do _not_ return from this callback until either the entire bytesToRead is filled or
you have reached the end of the stream.
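
A minimal sketch of a read callback that follows this contract for an in-memory buffer is shown below. The `sketch_memory_reader`
struct and `sketch_on_read()` are hypothetical names for illustration. Only the parameter list is taken from the callback type
documented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical user data: a read-only buffer plus a cursor.
typedef struct
{
    const unsigned char* pData;
    size_t dataSize;
    size_t cursor;
} sketch_memory_reader;

// Fills the whole request unless the end of the buffer is reached first.
static size_t sketch_on_read(void* pUserData, void* pBufferOut, size_t bytesToRead)
{
    sketch_memory_reader* pReader = (sketch_memory_reader*)pUserData;
    size_t bytesRemaining;

    assert(pReader != NULL && pReader->cursor <= pReader->dataSize);

    bytesRemaining = pReader->dataSize - pReader->cursor;
    if (bytesToRead > bytesRemaining) {
        bytesToRead = bytesRemaining;   // A short read, which signals the end of the stream.
    }

    if (bytesToRead > 0) {
        memcpy(pBufferOut, pReader->pData + pReader->cursor, bytesToRead);
        pReader->cursor += bytesToRead;
    }

    return bytesToRead;
}
```

A callback reading from a socket or pipe would instead loop until the full request is filled, because a short return is treated
as the end of the stream.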
*/
typedef size_t (* drflac_read_proc)(void* pUserData, void* pBufferOut, size_t bytesToRead);

/*
Callback for when data needs to be seeked.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

offset (in)
    The number of bytes to move, relative to the origin. Will never be negative.

origin (in)
    The origin of the seek - the current position or the start of the stream.


Return Value
------------
Whether or not the seek was successful.


Remarks
-------
The offset will never be negative. Whether or not it is relative to the beginning or current position is determined by the "origin" parameter which will be
either drflac_seek_origin_start or drflac_seek_origin_current.

When seeking to a PCM frame using drflac_seek_to_pcm_frame(), dr_flac may call this with an offset beyond the end of the FLAC stream. This needs to be detected
and handled by returning DRFLAC_FALSE.
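
The bounds check described above can be sketched as follows for a stream of known size. To keep the example compilable on its own,
`sketch_seek_origin` stands in for drflac_seek_origin and the return values 1 and 0 stand in for DRFLAC_TRUE and DRFLAC_FALSE. All
`sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Stand-in for drflac_seek_origin so the sketch compiles without this header.
typedef enum
{
    sketch_seek_origin_start,
    sketch_seek_origin_current
} sketch_seek_origin;

// Hypothetical user data: the total stream size and the current position.
typedef struct
{
    size_t dataSize;
    size_t cursor;
} sketch_seek_state;

// Returns 1 on success and 0 when the target lies beyond the end of the stream.
static unsigned int sketch_on_seek(void* pUserData, int offset, sketch_seek_origin origin)
{
    sketch_seek_state* pState = (sketch_seek_state*)pUserData;
    size_t base;

    assert(pState != NULL && offset >= 0 && pState->cursor <= pState->dataSize);

    base = (origin == sketch_seek_origin_current) ? pState->cursor : 0;

    // Reject an out-of-range target instead of clamping it. The position is left unchanged.
    if ((size_t)offset > pState->dataSize - base) {
        return 0;
    }

    pState->cursor = base + (size_t)offset;
    return 1;
}
```

Comparing against `dataSize - base` rather than computing `base + offset` first avoids any chance of the sum wrapping around.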
*/
typedef drflac_bool32 (* drflac_seek_proc)(void* pUserData, int offset, drflac_seek_origin origin);

/*
Callback for when a metadata block is read.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

pMetadata (in)
    A pointer to a structure containing the data of the metadata block.


Remarks
-------
Use pMetadata->type to determine which metadata block is being handled and how to read the data. This
will be set to one of the DRFLAC_METADATA_BLOCK_TYPE_* tokens.
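
As a small illustration of dispatching on the block type, the sketch below maps a type value to a readable name. The numeric
values are the ones given by the DRFLAC_METADATA_BLOCK_TYPE_* defines in this file. `sketch_block_type_name()` is a hypothetical
helper, not part of the API.

```c
// Maps a metadata block type value to a printable name.
static const char* sketch_block_type_name(unsigned int type)
{
    switch (type) {
        case 0:   return "STREAMINFO";
        case 1:   return "PADDING";
        case 2:   return "APPLICATION";
        case 3:   return "SEEKTABLE";
        case 4:   return "VORBIS_COMMENT";
        case 5:   return "CUESHEET";
        case 6:   return "PICTURE";
        case 127: return "INVALID";
        default:  return "UNKNOWN";
    }
}
```

Inside a real metadata callback the same switch would be written against pMetadata->type using the named tokens, with each case
reading the matching member of the data union.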
*/
typedef void (* drflac_meta_proc)(void* pUserData, drflac_metadata* pMetadata);


/* Structure for internal use. Only used for decoders opened with drflac_open_memory. */
typedef struct
{
    const drflac_uint8* data;
    size_t dataSize;
    size_t currentReadPos;
} drflac__memory_stream;

/* Structure for internal use. Used for bit streaming. */
typedef struct
{
    /* The function to call when more data needs to be read. */
    drflac_read_proc onRead;

    /* The function to call when the current read position needs to be moved. */
    drflac_seek_proc onSeek;

    /* The user data to pass around to onRead and onSeek. */
    void* pUserData;


    /*
    The number of unaligned bytes in the L2 cache. This will always be 0 until the end of the stream is hit. At the end of the
    stream there will be a number of bytes that don't cleanly fit in an L1 cache line, so we use this variable to know whether
    or not the bitstreamer needs to run on a slower path to read those last bytes. This will never be more than sizeof(drflac_cache_t).
    */
    size_t unalignedByteCount;

    /* The content of the unaligned bytes. */
    drflac_cache_t unalignedCache;

    /* The index of the next valid cache line in the "L2" cache. */
    drflac_uint32 nextL2Line;

    /* The number of bits that have been consumed by the cache. This is used to determine how many valid bits are remaining. */
    drflac_uint32 consumedBits;

    /*
    The cached data which was most recently read from the client. There are two levels of cache. Data flows as such:
    Client -> L2 -> L1. The L2 -> L1 movement is aligned and runs on a fast path in just a few instructions.
    */
    drflac_cache_t cacheL2[DR_FLAC_BUFFER_SIZE/sizeof(drflac_cache_t)];
    drflac_cache_t cache;

    /*
    CRC-16. This is updated whenever bits are read from the bit stream. Manually set this to 0 to reset the CRC. For FLAC, this
    is reset to 0 at the beginning of each frame.
    */
    drflac_uint16 crc16;
    drflac_cache_t crc16Cache; /* A cache for optimizing CRC calculations. This is filled when the L1 cache is reloaded. */
    drflac_uint32 crc16CacheIgnoredBytes; /* The number of bytes to ignore when updating the CRC-16 from the CRC-16 cache. */
} drflac_bs;

typedef struct
{
    /* The type of the subframe: SUBFRAME_CONSTANT, SUBFRAME_VERBATIM, SUBFRAME_FIXED or SUBFRAME_LPC. */
    drflac_uint8 subframeType;

    /* The number of wasted bits per sample as specified by the sub-frame header. */
    drflac_uint8 wastedBitsPerSample;

    /* The order to use for the prediction stage for SUBFRAME_FIXED and SUBFRAME_LPC. */
    drflac_uint8 lpcOrder;

    /* A pointer to the buffer containing the decoded samples in the subframe. This pointer is an offset from drflac::pExtraData. */
    drflac_int32* pSamplesS32;
} drflac_subframe;

typedef struct
{
    /*
    If the stream uses variable block sizes, this will be set to the index of the first PCM frame. If fixed block sizes are used, this will
    always be set to 0. This is 64-bit because the decoded PCM frame number will be 36 bits.
    */
    drflac_uint64 pcmFrameNumber;

    /*
    If the stream uses fixed block sizes, this will be set to the frame number. If variable block sizes are used, this will always be 0. This
    is 32-bit because in fixed block sizes, the maximum frame number will be 31 bits.
    */
    drflac_uint32 flacFrameNumber;

    /* The sample rate of this frame. */
    drflac_uint32 sampleRate;

    /* The number of PCM frames in each sub-frame within this frame. */
    drflac_uint16 blockSizeInPCMFrames;

    /*
    The channel assignment of this frame. This is not always set to the channel count. If interchannel decorrelation is being used this
    will be set to DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE, DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE or DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE.
    */
    drflac_uint8 channelAssignment;

    /* The number of bits per sample within this frame. */
    drflac_uint8 bitsPerSample;

    /* The frame's CRC. */
    drflac_uint8 crc8;
} drflac_frame_header;

typedef struct
{
    /* The header. */
    drflac_frame_header header;

    /*
    The number of PCM frames left to be read in this FLAC frame. This is initially set to the block size. As PCM frames are read,
    this will be decremented. When it reaches 0, the decoder will see this frame as fully consumed and load the next frame.
    */
    drflac_uint32 pcmFramesRemaining;

    /* The list of sub-frames within the frame. There is one sub-frame for each channel, and there's a maximum of 8 channels. */
    drflac_subframe subframes[8];
} drflac_frame;

typedef struct
{
    /* The function to call when a metadata block is read. */
    drflac_meta_proc onMeta;

    /* The user data posted to the metadata callback function. */
    void* pUserDataMD;

    /* Memory allocation callbacks. */
    drflac_allocation_callbacks allocationCallbacks;


    /* The sample rate. Will be set to something like 44100. */
    drflac_uint32 sampleRate;

    /*
    The number of channels. This will be set to 1 for monaural streams, 2 for stereo, etc. Maximum 8. This is set based on the
    value specified in the STREAMINFO block.
    */
    drflac_uint8 channels;

    /* The bits per sample. Will be set to something like 16, 24, etc. */
    drflac_uint8 bitsPerSample;

    /* The maximum block size, in samples. This number represents the number of samples in each channel (not combined). */
    drflac_uint16 maxBlockSizeInPCMFrames;

    /*
    The total number of PCM Frames making up the stream. Can be 0 in which case it's still a valid stream, but just means
    the total PCM frame count is unknown. Likely the case with streams like internet radio.
    */
    drflac_uint64 totalPCMFrameCount;


    /* The container type. This is set based on whether or not the decoder was opened from a native or Ogg stream. */
    drflac_container container;

    /* The number of seekpoints in the seektable. */
    drflac_uint32 seekpointCount;


    /* Information about the frame the decoder is currently sitting on. */
    drflac_frame currentFLACFrame;


    /* The index of the PCM frame the decoder is currently sitting on. This is only used for seeking. */
    drflac_uint64 currentPCMFrame;

    /* The position of the first FLAC frame in the stream. This is only ever used for seeking. */
    drflac_uint64 firstFLACFramePosInBytes;


    /* A hack to avoid a malloc() when opening a decoder with drflac_open_memory(). */
    drflac__memory_stream memoryStream;


    /* A pointer to the decoded sample data. This is an offset of pExtraData. */
    drflac_int32* pDecodedSamples;

    /* A pointer to the seek table. This is an offset of pExtraData, or NULL if there is no seek table. */
    drflac_seekpoint* pSeekpoints;

    /* Internal use only. Only used with Ogg containers. Points to a drflac_oggbs object. This is an offset of pExtraData. */
    void* _oggbs;

    /* Internal use only. Used for profiling and testing different seeking modes. */
    drflac_bool32 _noSeekTableSeek : 1;
    drflac_bool32 _noBinarySearchSeek : 1;
    drflac_bool32 _noBruteForceSeek : 1;

    /* The bit streamer. The raw FLAC data is fed through this object. */
    drflac_bs bs;

    /* Variable length extra data. We attach this to the end of the object so we can avoid unnecessary mallocs. */
    drflac_uint8 pExtraData[1];
} drflac;


/*
Opens a FLAC decoder.


Parameters
----------
onRead (in)
    The function to call when data needs to be read from the client.

onSeek (in)
    The function to call when the read position of the client data needs to move.

pUserData (in, optional)
    A pointer to application defined data that will be passed to onRead and onSeek.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
Returns a pointer to an object representing the decoder.


Remarks
-------
Close the decoder with `drflac_close()`.

`pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`.

This function will automatically detect whether or not you are attempting to open a native or Ogg encapsulated FLAC, both of which should work seamlessly
without any manual intervention. Ogg encapsulation also works with multiplexed streams which basically means it can play FLAC encoded audio tracks in videos.

This is the lowest level function for opening a FLAC stream. You can also use `drflac_open_file()` and `drflac_open_memory()` to open the stream from a file or
from a block of memory respectively.

The STREAMINFO block must be present for this to succeed. Use `drflac_open_relaxed()` to open a FLAC stream where the header may not be present.

Use `drflac_open_with_metadata()` if you need access to metadata.


See Also
--------
drflac_open_file()
drflac_open_memory()
drflac_open_with_metadata()
drflac_close()
*/
DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
832
833
/*
834
Opens a FLAC stream with relaxed validation of the header block.
835
836
837
Parameters
838
----------
839
onRead (in)
840
The function to call when data needs to be read from the client.
841
842
onSeek (in)
843
The function to call when the read position of the client data needs to move.
844
845
container (in)
846
Whether or not the FLAC stream is encapsulated using standard FLAC encapsulation or Ogg encapsulation.
847
848
pUserData (in, optional)
849
A pointer to application defined data that will be passed to onRead and onSeek.
850
851
pAllocationCallbacks (in, optional)
852
A pointer to application defined callbacks for managing memory allocations.
853
854
855
Return Value
856
------------
857
A pointer to an object representing the decoder.
858
859
860
Remarks
861
-------
862
The same as drflac_open(), except attempts to open the stream even when a header block is not present.
863
864
Because the header is not necessarily available, the caller must explicitly define the container (Native or Ogg). Do not set this to `drflac_container_unknown`
865
as that is for internal use only.
866
867
Opening in relaxed mode will continue reading data from onRead until it finds a valid frame. If a frame is never found it will continue forever. To abort,
868
force your `onRead` callback to return 0, which dr_flac will use as an indicator that the end of the stream was found.
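
One way to bound the search is to give the read callback a byte budget and return 0 once it is spent. The sketch below assumes the callback has the same shape
as `drflac_read_proc` (user data, output buffer, byte count, returning the number of bytes read). `example_stream` and `example_on_read()` are hypothetical
names, and the in-memory source is only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical user data: an in-memory stream with a limit on how many bytes may be read.
typedef struct
{
    const unsigned char* pData;
    size_t dataSize;
    size_t cursor;
    size_t bytesRemainingBeforeAbort;
} example_stream;

// Returns the number of bytes read. Returns 0 when the data or the budget is exhausted.
static size_t example_on_read(void* pUserData, void* pBufferOut, size_t bytesToRead)
{
    example_stream* pStream = (example_stream*)pUserData;
    size_t bytesAvailable;

    assert(pStream != NULL && pBufferOut != NULL);

    bytesAvailable = pStream->dataSize - pStream->cursor;
    if (bytesToRead > bytesAvailable) {
        bytesToRead = bytesAvailable;
    }
    if (bytesToRead > pStream->bytesRemainingBeforeAbort) {
        bytesToRead = pStream->bytesRemainingBeforeAbort;
    }

    memcpy(pBufferOut, pStream->pData + pStream->cursor, bytesToRead);
    pStream->cursor += bytesToRead;
    pStream->bytesRemainingBeforeAbort -= bytesToRead;

    return bytesToRead;
}
```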
869
870
Use `drflac_open_with_metadata_relaxed()` if you need access to metadata.
871
*/
872
DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
873
874
/*
875
Opens a FLAC decoder and notifies the caller of the metadata chunks (album art, etc.).
876
877
878
Parameters
879
----------
880
onRead (in)
881
The function to call when data needs to be read from the client.
882
883
onSeek (in)
884
The function to call when the read position of the client data needs to move.
885
886
onMeta (in)
887
The function to call for every metadata block.
888
889
pUserData (in, optional)
890
A pointer to application defined data that will be passed to onRead, onSeek and onMeta.
891
892
pAllocationCallbacks (in, optional)
893
A pointer to application defined callbacks for managing memory allocations.
894
895
896
Return Value
897
------------
898
A pointer to an object representing the decoder.
899
900
901
Remarks
902
-------
903
Close the decoder with `drflac_close()`.
904
905
`pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`.
906
907
This is slower than `drflac_open()`, so avoid this one if you don't need metadata. Internally, this will allocate and free memory on the heap for every
908
metadata block except for STREAMINFO and PADDING blocks.
909
910
The caller is notified of the metadata via the `onMeta` callback. All metadata blocks will be handled before the function returns. This callback takes a
911
pointer to a `drflac_metadata` object which is a union containing the data of all relevant metadata blocks. Use the `type` member to discriminate between
912
the different metadata types.
913
914
The STREAMINFO block must be present for this to succeed. Use `drflac_open_with_metadata_relaxed()` to open a FLAC stream where the header may not be present.
915
916
Note that this will behave inconsistently with `drflac_open()` if the stream is an Ogg encapsulated stream and a metadata block is corrupted. This is due to
917
the way the Ogg stream recovers from corrupted pages. When `drflac_open_with_metadata()` is being used, the open routine will try to read the contents of the
918
metadata block, whereas `drflac_open()` will simply seek past it (for the sake of efficiency). This inconsistency can result in different samples being
919
returned depending on whether or not the stream is being opened with metadata.
920
921
922
See Also
923
---------
924
drflac_open_file_with_metadata()
925
drflac_open_memory_with_metadata()
926
drflac_open()
927
drflac_close()
928
*/
929
DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
930
931
/*
932
The same as drflac_open_with_metadata(), except attempts to open the stream even when a header block is not present.
933
934
See Also
935
--------
936
drflac_open_with_metadata()
937
drflac_open_relaxed()
938
*/
939
DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
940
941
/*
942
Closes the given FLAC decoder.
943
944
945
Parameters
946
----------
947
pFlac (in)
948
The decoder to close.
949
950
951
Remarks
952
-------
953
This will destroy the decoder object.
954
955
956
See Also
957
--------
958
drflac_open()
959
drflac_open_with_metadata()
960
drflac_open_file()
961
drflac_open_file_w()
962
drflac_open_file_with_metadata()
963
drflac_open_file_with_metadata_w()
964
drflac_open_memory()
965
drflac_open_memory_with_metadata()
966
*/
967
DRFLAC_API void drflac_close(drflac* pFlac);
968
969
970
/*
971
Reads sample data from the given FLAC decoder, output as interleaved signed 32-bit PCM.
972
973
974
Parameters
975
----------
976
pFlac (in)
977
The decoder.
978
979
framesToRead (in)
980
The number of PCM frames to read.
981
982
pBufferOut (out, optional)
983
A pointer to the buffer that will receive the decoded samples.
984
985
986
Return Value
987
------------
988
Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end.
989
990
991
Remarks
992
-------
993
`pBufferOut` can be NULL, in which case the call acts as a seek and the return value is the number of frames skipped.
994
*/
995
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut);
996
997
998
/*
999
Reads sample data from the given FLAC decoder, output as interleaved signed 16-bit PCM.
1000
1001
1002
Parameters
1003
----------
1004
pFlac (in)
1005
The decoder.
1006
1007
framesToRead (in)
1008
The number of PCM frames to read.
1009
1010
pBufferOut (out, optional)
1011
A pointer to the buffer that will receive the decoded samples.
1012
1013
1014
Return Value
1015
------------
1016
Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end.
1017
1018
1019
Remarks
1020
-------
1021
`pBufferOut` can be NULL, in which case the call acts as a seek and the return value is the number of frames skipped.
1022
1023
Note that this is lossy for streams where the bits per sample is larger than 16.
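
As a sketch of why this is lossy, assume the 32-bit output is scaled to the full 32-bit range and the 16-bit output keeps only the upper 16 bits of it. Both
are assumptions made for illustration rather than a statement of the exact conversion used. Under them, a 24-bit source sample loses its lowest 8 bits. The
helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Scales a 24-bit sample up to the full signed 32-bit range.
static int32_t example_s24_to_s32(int32_t sample24)
{
    assert(sample24 >= -8388608 && sample24 <= 8388607);
    return sample24 * 256;
}

// Keeps only the upper 16 bits of a 32-bit sample (arithmetic right shift assumed).
static int16_t example_s32_to_s16(int32_t sample32)
{
    return (int16_t)(sample32 >> 16);
}
```

Two 24-bit samples that differ only in their lowest 8 bits, such as 0x123456 and 0x1234FF, map to the same 16-bit value.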
1024
*/
1025
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut);
1026
1027
/*
1028
Reads sample data from the given FLAC decoder, output as interleaved 32-bit floating point PCM.
1029
1030
1031
Parameters
1032
----------
1033
pFlac (in)
1034
The decoder.
1035
1036
framesToRead (in)
1037
The number of PCM frames to read.
1038
1039
pBufferOut (out, optional)
1040
A pointer to the buffer that will receive the decoded samples.
1041
1042
1043
Return Value
1044
------------
1045
Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end.
1046
1047
1048
Remarks
1049
-------
1050
`pBufferOut` can be NULL, in which case the call acts as a seek and the return value is the number of frames skipped.
1051
1052
Note that this should be considered lossy due to the nature of floating point numbers not being able to exactly represent every possible number.
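
A single precision float carries a 24-bit significand, so it cannot distinguish every 32-bit integer. The sketch below assumes the common convention of
dividing the signed 32-bit sample by 2^31, which may differ from the exact conversion used here. `example_s32_to_f32()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

// Maps a signed 32-bit sample to the range [-1, 1) by dividing by 2^31.
static float example_s32_to_f32(int32_t sample32)
{
    return (float)(sample32 / 2147483648.0);
}
```

With this conversion, the neighbouring integers 16777216 and 16777217 produce the same float, which is why the output should be treated as lossy.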
1053
*/
1054
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut);
1055
1056
/*
1057
Seeks to the PCM frame at the given index.
1058
1059
1060
Parameters
1061
----------
1062
pFlac (in)
1063
The decoder.
1064
1065
pcmFrameIndex (in)
1066
The index of the PCM frame to seek to.
1067
1068
1069
Return Value
1070
------------
1071
`DRFLAC_TRUE` if successful; `DRFLAC_FALSE` otherwise.
1072
*/
1073
DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex);
1074
1075
1076
1077
#ifndef DR_FLAC_NO_STDIO
1078
/*
1079
Opens a FLAC decoder from the file at the given path.
1080
1081
1082
Parameters
1083
----------
1084
pFileName (in)
1085
The path of the file to open, either absolute or relative to the current directory.
1086
1087
pAllocationCallbacks (in, optional)
1088
A pointer to application defined callbacks for managing memory allocations.
1089
1090
1091
Return Value
1092
------------
1093
A pointer to an object representing the decoder.
1094
1095
1096
Remarks
1097
-------
1098
Close the decoder with drflac_close().
1099
1100
1101
1103
This will hold a handle to the file until the decoder is closed with drflac_close(). Some platforms will restrict the number of files a process can have open
1104
at any given time, so keep this in mind if you have many decoders open at the same time.
1105
1106
1107
See Also
1108
--------
1109
drflac_open_file_with_metadata()
1110
drflac_open()
1111
drflac_close()
1112
*/
1113
DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks);
1114
DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks);
1115
1116
/*
1117
Opens a FLAC decoder from the file at the given path and notifies the caller of the metadata chunks (album art, etc.).
1118
1119
1120
Parameters
1121
----------
1122
pFileName (in)
1123
The path of the file to open, either absolute or relative to the current directory.
1124
1125
1128
onMeta (in)
1129
The callback to fire for each metadata block.
1130
1131
pUserData (in)
1132
A pointer to the user data to pass to the metadata callback.
1133
1134
pAllocationCallbacks (in, optional)
1135
A pointer to application defined callbacks for managing memory allocations.
1136
1137
1138
Remarks
1139
-------
1140
Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled.
1141
1142
1143
See Also
1144
--------
1145
drflac_open_with_metadata()
1146
drflac_open()
1147
drflac_close()
1148
*/
1149
DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
1150
DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
1151
#endif
1152
1153
/*
1154
Opens a FLAC decoder from a pre-allocated block of memory.
1155
1156
1157
Parameters
1158
----------
1159
pData (in)
1160
A pointer to the raw encoded FLAC data.
1161
1162
dataSize (in)
1163
The size in bytes of `pData`.
1164
1165
pAllocationCallbacks (in, optional)
1166
A pointer to application defined callbacks for managing memory allocations.
1167
1168
1169
Return Value
1170
------------
1171
A pointer to an object representing the decoder.
1172
1173
1174
Remarks
1175
-------
1176
This does not create a copy of the data. It is up to the application to ensure the buffer remains valid for the lifetime of the decoder.
1177
1178
1179
See Also
1180
--------
1181
drflac_open()
1182
drflac_close()
1183
*/
1184
DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks);
1185
1186
/*
1187
Opens a FLAC decoder from a pre-allocated block of memory and notifies the caller of the metadata chunks (album art, etc.).
1188
1189
1190
Parameters
1191
----------
1192
pData (in)
1193
A pointer to the raw encoded FLAC data.
1194
1195
dataSize (in)
1196
The size in bytes of `pData`.
1197
1198
onMeta (in)
1199
The callback to fire for each metadata block.
1200
1201
pUserData (in)
1202
A pointer to the user data to pass to the metadata callback.
1203
1204
pAllocationCallbacks (in, optional)
1205
A pointer to application defined callbacks for managing memory allocations.
1206
1207
1208
Remarks
1209
-------
1210
Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled.
1211
1212
1213
See Also
1214
--------
1215
drflac_open_with_metadata()
1216
drflac_open()
1217
drflac_close()
1218
*/
1219
DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
1220
1221
1222
1223
/* High Level APIs */
1224
1225
/*
1226
Opens a FLAC stream from the given callbacks and fully decodes it in a single operation. The return value is a
1227
pointer to the sample data as interleaved signed 32-bit PCM. The returned data must be freed with drflac_free().
1228
1229
You can pass in custom memory allocation callbacks via the pAllocationCallbacks parameter. This can be NULL in which
1230
case it will use DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE.
1231
1232
Sometimes a FLAC file won't keep track of the total sample count. In this situation the function will continuously
1233
read samples into a dynamically sized buffer on the heap until no samples are left.
1234
1235
Do not call this function on a broadcast type of stream (like internet radio streams and whatnot).
1236
*/
1237
DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1238
1239
/* Same as drflac_open_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */
1240
DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1241
1242
/* Same as drflac_open_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
1243
DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1244
1245
#ifndef DR_FLAC_NO_STDIO
1246
/* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a file. */
1247
DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1248
1249
/* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */
1250
DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1251
1252
/* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
1253
DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1254
#endif
1255
1256
/* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a block of memory. */
1257
DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1258
1259
/* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */
1260
DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1261
1262
/* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
1263
DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
1264
1265
/*
1266
Frees memory that was allocated internally by dr_flac.
1267
1268
Set pAllocationCallbacks to the same object that was passed to drflac_open_*_and_read_pcm_frames_*(). If you originally passed in NULL, pass in NULL for this.
1269
*/
1270
DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks);
1271
1272
1273
/* Structure representing an iterator for vorbis comments in a VORBIS_COMMENT metadata block. */
1274
typedef struct
1275
{
1276
drflac_uint32 countRemaining;
1277
const char* pRunningData;
1278
} drflac_vorbis_comment_iterator;
1279
1280
/*
1281
Initializes a vorbis comment iterator. This can be used for iterating over the vorbis comments in a VORBIS_COMMENT
1282
metadata block.
1283
*/
1284
DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments);
1285
1286
/*
1287
Goes to the next vorbis comment in the given iterator. If null is returned it means there are no more comments. The
1288
returned string is NOT null terminated.
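
Because the returned string is not null terminated, it needs to be copied together with its length before being passed to functions that expect a C string.
The following is a minimal sketch of such a copy. `example_copy_comment()` is a hypothetical helper and not part of the dr_flac API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Copies `commentLength` bytes into `pDst` and appends a null terminator, truncating if needed.
// Returns the number of bytes copied, excluding the terminator.
static size_t example_copy_comment(char* pDst, size_t dstCapacity, const char* pComment, size_t commentLength)
{
    assert(pDst != NULL && dstCapacity > 0);
    assert(pComment != NULL || commentLength == 0);

    if (commentLength > dstCapacity - 1) {
        commentLength = dstCapacity - 1;
    }
    if (commentLength > 0) {
        memcpy(pDst, pComment, commentLength);
    }
    pDst[commentLength] = '\0';

    return commentLength;
}
```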
1289
*/
1290
DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut);
1291
1292
1293
/* Structure representing an iterator for cuesheet tracks in a CUESHEET metadata block. */
1294
typedef struct
1295
{
1296
drflac_uint32 countRemaining;
1297
const char* pRunningData;
1298
} drflac_cuesheet_track_iterator;
1299
1300
/* The order of members here is important because we map this directly to the raw data within the CUESHEET metadata block. */
1301
typedef struct
1302
{
1303
drflac_uint64 offset;
1304
drflac_uint8 index;
1305
drflac_uint8 reserved[3];
1306
} drflac_cuesheet_track_index;
1307
1308
typedef struct
1309
{
1310
drflac_uint64 offset;
1311
drflac_uint8 trackNumber;
1312
char ISRC[12];
1313
drflac_bool8 isAudio;
1314
drflac_bool8 preEmphasis;
1315
drflac_uint8 indexCount;
1316
const drflac_cuesheet_track_index* pIndexPoints;
1317
} drflac_cuesheet_track;
1318
1319
/*
1320
Initializes a cuesheet track iterator. This can be used for iterating over the cuesheet tracks in a CUESHEET metadata
1321
block.
1322
*/
1323
DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData);
1324
1325
/* Goes to the next cuesheet track in the given iterator. If DRFLAC_FALSE is returned it means there are no more tracks. */
1326
DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack);
1327
1328
1329
#ifdef __cplusplus
1330
}
1331
#endif
1332
#endif /* dr_flac_h */
1333
1334
1335
/************************************************************************************************************************************************************
1336
************************************************************************************************************************************************************
1337
1338
IMPLEMENTATION
1339
1340
************************************************************************************************************************************************************
1341
************************************************************************************************************************************************************/
1342
#if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION)
1343
#ifndef dr_flac_c
1344
#define dr_flac_c
1345
1346
/* Disable some annoying warnings. */
1347
#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
1348
#pragma GCC diagnostic push
1349
#if __GNUC__ >= 7
1350
#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
1351
#endif
1352
#endif
1353
1354
#ifdef __linux__
1355
#ifndef _BSD_SOURCE
1356
#define _BSD_SOURCE
1357
#endif
1358
#ifndef _DEFAULT_SOURCE
1359
#define _DEFAULT_SOURCE
1360
#endif
1361
#ifndef __USE_BSD
1362
#define __USE_BSD
1363
#endif
1364
#include <endian.h>
1365
#endif
1366
1367
#include <stdlib.h>
1368
#include <string.h>
1369
1370
/* Inline */
1371
#ifdef _MSC_VER
1372
#define DRFLAC_INLINE __forceinline
1373
#elif defined(__GNUC__)
1374
/*
1375
I've had a bug report where GCC is emitting warnings about functions possibly not being inlineable. This warning happens when
1376
the __attribute__((always_inline)) attribute is defined without an "inline" statement. I think therefore there must be some
1377
case where "__inline__" is not always defined, thus the compiler emitting these warnings. When using -std=c89 or -ansi on the
1378
command line, we cannot use the "inline" keyword and instead need to use "__inline__". In an attempt to work around this issue
1379
I am using "__inline__" only when we're compiling in strict ANSI mode.
1380
*/
1381
#if defined(__STRICT_ANSI__)
1382
#define DRFLAC_GNUC_INLINE_HINT __inline__
1383
#else
1384
#define DRFLAC_GNUC_INLINE_HINT inline
1385
#endif
1386
1387
#if (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 2)) || defined(__clang__)
1388
#define DRFLAC_INLINE DRFLAC_GNUC_INLINE_HINT __attribute__((always_inline))
1389
#else
1390
#define DRFLAC_INLINE DRFLAC_GNUC_INLINE_HINT
1391
#endif
1392
#elif defined(__WATCOMC__)
1393
#define DRFLAC_INLINE __inline
1394
#else
1395
#define DRFLAC_INLINE
1396
#endif
1397
/* End Inline */
1398
1399
/*
1400
Intrinsics Support
1401
1402
There's a bug in GCC 4.2.x which results in an incorrect compilation error when using _mm_slli_epi32() where it complains with
1403
1404
"error: shift must be an immediate"
1405
1406
Unfortunately dr_flac depends on this for a few things so we're just going to disable SSE on GCC 4.2 and below.
1407
*/
1408
#if !defined(DR_FLAC_NO_SIMD)
1409
#if defined(DRFLAC_X64) || defined(DRFLAC_X86)
1410
#if defined(_MSC_VER) && !defined(__clang__)
1411
/* MSVC. */
1412
#if _MSC_VER >= 1400 && !defined(DRFLAC_NO_SSE2) /* 2005 */
1413
#define DRFLAC_SUPPORT_SSE2
1414
#endif
1415
#if _MSC_VER >= 1600 && !defined(DRFLAC_NO_SSE41) /* 2010 */
1416
#define DRFLAC_SUPPORT_SSE41
1417
#endif
1418
#elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
1419
/* Assume GNUC-style. */
1420
#if defined(__SSE2__) && !defined(DRFLAC_NO_SSE2)
1421
#define DRFLAC_SUPPORT_SSE2
1422
#endif
1423
#if defined(__SSE4_1__) && !defined(DRFLAC_NO_SSE41)
1424
#define DRFLAC_SUPPORT_SSE41
1425
#endif
1426
#endif
1427
1428
/* If at this point we still haven't determined compiler support for the intrinsics just fall back to __has_include. */
1429
#if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include)
1430
#if !defined(DRFLAC_SUPPORT_SSE2) && !defined(DRFLAC_NO_SSE2) && __has_include(<emmintrin.h>)
1431
#define DRFLAC_SUPPORT_SSE2
1432
#endif
1433
#if !defined(DRFLAC_SUPPORT_SSE41) && !defined(DRFLAC_NO_SSE41) && __has_include(<smmintrin.h>)
1434
#define DRFLAC_SUPPORT_SSE41
1435
#endif
1436
#endif
1437
1438
#if defined(DRFLAC_SUPPORT_SSE41)
1439
#include <smmintrin.h>
1440
#elif defined(DRFLAC_SUPPORT_SSE2)
1441
#include <emmintrin.h>
1442
#endif
1443
#endif
1444
1445
#if defined(DRFLAC_ARM)
1446
#if !defined(DRFLAC_NO_NEON) && (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64))
1447
#define DRFLAC_SUPPORT_NEON
1448
#include <arm_neon.h>
1449
#endif
1450
#endif
1451
#endif
1452
1453
/* Compile-time CPU feature support. */
1454
#if !defined(DR_FLAC_NO_SIMD) && (defined(DRFLAC_X86) || defined(DRFLAC_X64))
1455
#if defined(_MSC_VER) && !defined(__clang__)
1456
#if _MSC_VER >= 1400
1457
#include <intrin.h>
1458
static void drflac__cpuid(int info[4], int fid)
1459
{
1460
__cpuid(info, fid);
1461
}
1462
#else
1463
#define DRFLAC_NO_CPUID
1464
#endif
1465
#else
1466
#if defined(__GNUC__) || defined(__clang__)
1467
static void drflac__cpuid(int info[4], int fid)
1468
{
1469
/*
1470
It looks like the -fPIC option uses the ebx register which GCC complains about. We can work around this by just using a different register, the
1471
specific register of which I'm letting the compiler decide on. The "k" prefix is used to specify a 32-bit register. The {...} syntax is for
1472
supporting different assembly dialects.
1473
1474
What's basically happening is that we're saving and restoring the ebx register manually.
1475
*/
1476
#if defined(DRFLAC_X86) && defined(__PIC__)
1477
__asm__ __volatile__ (
1478
"xchg{l} {%%}ebx, %k1;"
1479
"cpuid;"
1480
"xchg{l} {%%}ebx, %k1;"
1481
: "=a"(info[0]), "=&r"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0)
1482
);
1483
#else
1484
__asm__ __volatile__ (
1485
"cpuid" : "=a"(info[0]), "=b"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0)
1486
);
1487
#endif
1488
}
1489
#else
1490
#define DRFLAC_NO_CPUID
1491
#endif
1492
#endif
1493
#else
1494
#define DRFLAC_NO_CPUID
1495
#endif
1496
1497
static DRFLAC_INLINE drflac_bool32 drflac_has_sse2(void)
1498
{
1499
#if defined(DRFLAC_SUPPORT_SSE2)
1500
#if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE2)
1501
#if defined(DRFLAC_X64)
1502
return DRFLAC_TRUE; /* 64-bit targets always support SSE2. */
1503
#elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE2__)
1504
return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE2 code we can assume support. */
1505
#else
1506
#if defined(DRFLAC_NO_CPUID)
1507
return DRFLAC_FALSE;
1508
#else
1509
int info[4];
1510
drflac__cpuid(info, 1);
1511
return (info[3] & (1 << 26)) != 0;
1512
#endif
1513
#endif
1514
#else
1515
return DRFLAC_FALSE; /* SSE2 is only supported on x86 and x64 architectures. */
1516
#endif
1517
#else
1518
return DRFLAC_FALSE; /* No compiler support. */
1519
#endif
1520
}
1521
1522
static DRFLAC_INLINE drflac_bool32 drflac_has_sse41(void)
1523
{
1524
#if defined(DRFLAC_SUPPORT_SSE41)
1525
#if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE41)
1526
#if defined(__SSE4_1__) || defined(__AVX__)
1527
return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE41 code we can assume support. */
1528
#else
1529
#if defined(DRFLAC_NO_CPUID)
1530
return DRFLAC_FALSE;
1531
#else
1532
int info[4];
1533
drflac__cpuid(info, 1);
1534
return (info[2] & (1 << 19)) != 0;
1535
#endif
1536
#endif
1537
#else
1538
return DRFLAC_FALSE; /* SSE41 is only supported on x86 and x64 architectures. */
1539
#endif
1540
#else
1541
return DRFLAC_FALSE; /* No compiler support. */
1542
#endif
1543
}
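
/*
Example
-------
Both runtime checks above test a single bit of a CPUID output register: bit 26 of EDX (info[3]) for SSE2 and bit 19 of ECX (info[2]) for SSE4.1. The sketch
below isolates that bit test so it can be tried with literal register values. `example_cpuid_bit_is_set()` is a hypothetical helper and does not execute the
CPUID instruction itself.

```c
#include <assert.h>

// Returns 1 if `bit` is set in a 32-bit CPUID output register, 0 otherwise.
static int example_cpuid_bit_is_set(unsigned int reg, unsigned int bit)
{
    assert(bit < 32);
    return (reg & (1u << bit)) != 0;
}
```
*/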
1544
1545
1546
#if defined(_MSC_VER) && _MSC_VER >= 1500 && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) && !defined(__clang__)
1547
#define DRFLAC_HAS_LZCNT_INTRINSIC
1548
#elif (defined(__GNUC__) && ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)))
1549
#define DRFLAC_HAS_LZCNT_INTRINSIC
1550
#elif defined(__clang__)
1551
#if defined(__has_builtin)
1552
#if __has_builtin(__builtin_clzll) || __has_builtin(__builtin_clzl)
1553
#define DRFLAC_HAS_LZCNT_INTRINSIC
1554
#endif
1555
#endif
1556
#endif
1557
1558
#if defined(_MSC_VER) && _MSC_VER >= 1400 && !defined(__clang__)
1559
#define DRFLAC_HAS_BYTESWAP16_INTRINSIC
1560
#define DRFLAC_HAS_BYTESWAP32_INTRINSIC
1561
#define DRFLAC_HAS_BYTESWAP64_INTRINSIC
1562
#elif defined(__clang__)
1563
#if defined(__has_builtin)
1564
#if __has_builtin(__builtin_bswap16)
1565
#define DRFLAC_HAS_BYTESWAP16_INTRINSIC
1566
#endif
1567
#if __has_builtin(__builtin_bswap32)
1568
#define DRFLAC_HAS_BYTESWAP32_INTRINSIC
1569
#endif
1570
#if __has_builtin(__builtin_bswap64)
1571
#define DRFLAC_HAS_BYTESWAP64_INTRINSIC
1572
#endif
1573
#endif
1574
#elif defined(__GNUC__)
1575
#if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
1576
#define DRFLAC_HAS_BYTESWAP32_INTRINSIC
1577
#define DRFLAC_HAS_BYTESWAP64_INTRINSIC
1578
#endif
1579
#if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
1580
#define DRFLAC_HAS_BYTESWAP16_INTRINSIC
1581
#endif
1582
#elif defined(__WATCOMC__) && defined(__386__)
1583
#define DRFLAC_HAS_BYTESWAP16_INTRINSIC
1584
#define DRFLAC_HAS_BYTESWAP32_INTRINSIC
1585
#define DRFLAC_HAS_BYTESWAP64_INTRINSIC
1586
extern __inline drflac_uint16 _watcom_bswap16(drflac_uint16);
1587
extern __inline drflac_uint32 _watcom_bswap32(drflac_uint32);
1588
extern __inline drflac_uint64 _watcom_bswap64(drflac_uint64);
1589
#pragma aux _watcom_bswap16 = \
1590
"xchg al, ah" \
1591
parm [ax] \
1592
value [ax] \
1593
modify nomemory;
1594
#pragma aux _watcom_bswap32 = \
1595
"bswap eax" \
1596
parm [eax] \
1597
value [eax] \
1598
modify nomemory;
1599
#pragma aux _watcom_bswap64 = \
1600
"bswap eax" \
1601
"bswap edx" \
1602
"xchg eax,edx" \
1603
parm [eax edx] \
1604
value [eax edx] \
1605
modify nomemory;
1606
#endif
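
/*
Example
-------
The Watcom 64-bit byte swap above swaps each 32-bit half and then exchanges the two halves. The same idea in portable C, as a sketch only and not the fallback
this library necessarily uses, looks like the following. The `example_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Reverses the byte order of a 32-bit value using shifts and masks.
static uint32_t example_bswap32(uint32_t n)
{
    return ((n & 0xFF000000u) >> 24) |
           ((n & 0x00FF0000u) >>  8) |
           ((n & 0x0000FF00u) <<  8) |
           ((n & 0x000000FFu) << 24);
}

// Swaps each 32-bit half, then exchanges the halves.
static uint64_t example_bswap64(uint64_t n)
{
    uint32_t lo = example_bswap32((uint32_t)(n & 0xFFFFFFFFu));
    uint32_t hi = example_bswap32((uint32_t)(n >> 32));
    return ((uint64_t)lo << 32) | hi;
}
```
*/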
1607
1608
1609
/* Standard library stuff. */
1610
#ifndef DRFLAC_ASSERT
1611
#include <assert.h>
1612
#define DRFLAC_ASSERT(expression) assert(expression)
1613
#endif
1614
#ifndef DRFLAC_MALLOC
1615
#define DRFLAC_MALLOC(sz) malloc((sz))
1616
#endif
1617
#ifndef DRFLAC_REALLOC
1618
#define DRFLAC_REALLOC(p, sz) realloc((p), (sz))
1619
#endif
1620
#ifndef DRFLAC_FREE
1621
#define DRFLAC_FREE(p) free((p))
1622
#endif
1623
#ifndef DRFLAC_COPY_MEMORY
1624
#define DRFLAC_COPY_MEMORY(dst, src, sz) memcpy((dst), (src), (sz))
1625
#endif
1626
#ifndef DRFLAC_ZERO_MEMORY
1627
#define DRFLAC_ZERO_MEMORY(p, sz) memset((p), 0, (sz))
1628
#endif
1629
#ifndef DRFLAC_ZERO_OBJECT
1630
#define DRFLAC_ZERO_OBJECT(p) DRFLAC_ZERO_MEMORY((p), sizeof(*(p)))
1631
#endif
1632
1633
#define DRFLAC_MAX_SIMD_VECTOR_SIZE 64 /* 64 for AVX-512 in the future. */

/* Result Codes */
typedef drflac_int32 drflac_result;
#define DRFLAC_SUCCESS                        0
#define DRFLAC_ERROR                         -1   /* A generic error. */
#define DRFLAC_INVALID_ARGS                  -2
#define DRFLAC_INVALID_OPERATION             -3
#define DRFLAC_OUT_OF_MEMORY                 -4
#define DRFLAC_OUT_OF_RANGE                  -5
#define DRFLAC_ACCESS_DENIED                 -6
#define DRFLAC_DOES_NOT_EXIST                -7
#define DRFLAC_ALREADY_EXISTS                -8
#define DRFLAC_TOO_MANY_OPEN_FILES           -9
#define DRFLAC_INVALID_FILE                  -10
#define DRFLAC_TOO_BIG                       -11
#define DRFLAC_PATH_TOO_LONG                 -12
#define DRFLAC_NAME_TOO_LONG                 -13
#define DRFLAC_NOT_DIRECTORY                 -14
#define DRFLAC_IS_DIRECTORY                  -15
#define DRFLAC_DIRECTORY_NOT_EMPTY           -16
#define DRFLAC_END_OF_FILE                   -17
#define DRFLAC_NO_SPACE                      -18
#define DRFLAC_BUSY                          -19
#define DRFLAC_IO_ERROR                      -20
#define DRFLAC_INTERRUPT                     -21
#define DRFLAC_UNAVAILABLE                   -22
#define DRFLAC_ALREADY_IN_USE                -23
#define DRFLAC_BAD_ADDRESS                   -24
#define DRFLAC_BAD_SEEK                      -25
#define DRFLAC_BAD_PIPE                      -26
#define DRFLAC_DEADLOCK                      -27
#define DRFLAC_TOO_MANY_LINKS                -28
#define DRFLAC_NOT_IMPLEMENTED               -29
#define DRFLAC_NO_MESSAGE                    -30
#define DRFLAC_BAD_MESSAGE                   -31
#define DRFLAC_NO_DATA_AVAILABLE             -32
#define DRFLAC_INVALID_DATA                  -33
#define DRFLAC_TIMEOUT                       -34
#define DRFLAC_NO_NETWORK                    -35
#define DRFLAC_NOT_UNIQUE                    -36
#define DRFLAC_NOT_SOCKET                    -37
#define DRFLAC_NO_ADDRESS                    -38
#define DRFLAC_BAD_PROTOCOL                  -39
#define DRFLAC_PROTOCOL_UNAVAILABLE          -40
#define DRFLAC_PROTOCOL_NOT_SUPPORTED        -41
#define DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED -42
#define DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED  -43
#define DRFLAC_SOCKET_NOT_SUPPORTED          -44
#define DRFLAC_CONNECTION_RESET              -45
#define DRFLAC_ALREADY_CONNECTED             -46
#define DRFLAC_NOT_CONNECTED                 -47
#define DRFLAC_CONNECTION_REFUSED            -48
#define DRFLAC_NO_HOST                       -49
#define DRFLAC_IN_PROGRESS                   -50
#define DRFLAC_CANCELLED                     -51
#define DRFLAC_MEMORY_ALREADY_MAPPED         -52
#define DRFLAC_AT_END                        -53

#define DRFLAC_CRC_MISMATCH                  -100
/* End Result Codes */


#define DRFLAC_SUBFRAME_CONSTANT                        0
#define DRFLAC_SUBFRAME_VERBATIM                        1
#define DRFLAC_SUBFRAME_FIXED                           8
#define DRFLAC_SUBFRAME_LPC                             32
#define DRFLAC_SUBFRAME_RESERVED                        255

#define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE  0
#define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2 1

#define DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT           0
#define DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE             8
#define DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE            9
#define DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE              10

#define DRFLAC_SEEKPOINT_SIZE_IN_BYTES                  18
#define DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES             36
#define DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES       12

#define drflac_align(x, a)                              ((((x) + (a) - 1) / (a)) * (a))


DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision)
{
    if (pMajor) {
        *pMajor = DRFLAC_VERSION_MAJOR;
    }

    if (pMinor) {
        *pMinor = DRFLAC_VERSION_MINOR;
    }

    if (pRevision) {
        *pRevision = DRFLAC_VERSION_REVISION;
    }
}

DRFLAC_API const char* drflac_version_string(void)
{
    return DRFLAC_VERSION_STRING;
}


/* CPU caps. */
#if defined(__has_feature)
    #if __has_feature(thread_sanitizer)
        #define DRFLAC_NO_THREAD_SANITIZE __attribute__((no_sanitize("thread")))
    #else
        #define DRFLAC_NO_THREAD_SANITIZE
    #endif
#else
    #define DRFLAC_NO_THREAD_SANITIZE
#endif

#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)
static drflac_bool32 drflac__gIsLZCNTSupported = DRFLAC_FALSE;
#endif

#ifndef DRFLAC_NO_CPUID
static drflac_bool32 drflac__gIsSSE2Supported  = DRFLAC_FALSE;
static drflac_bool32 drflac__gIsSSE41Supported = DRFLAC_FALSE;

/*
I've had a bug report that Clang's ThreadSanitizer presents a warning in this function. Having reviewed this, this does
actually make sense. However, since CPU caps should never differ for a running process, I don't think the trade-off of
complicating internal APIs by passing around CPU caps versus just disabling the warnings is worthwhile. I'm therefore
just going to disable these warnings. This is disabled via the DRFLAC_NO_THREAD_SANITIZE attribute.
*/
DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void)
{
    static drflac_bool32 isCPUCapsInitialized = DRFLAC_FALSE;

    if (!isCPUCapsInitialized) {
        /* LZCNT */
#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)
        int info[4] = {0};
        drflac__cpuid(info, 0x80000001);
        drflac__gIsLZCNTSupported = (info[2] & (1 << 5)) != 0;
#endif

        /* SSE2 */
        drflac__gIsSSE2Supported = drflac_has_sse2();

        /* SSE4.1 */
        drflac__gIsSSE41Supported = drflac_has_sse41();

        /* Initialized. */
        isCPUCapsInitialized = DRFLAC_TRUE;
    }
}
#else
static drflac_bool32 drflac__gIsNEONSupported = DRFLAC_FALSE;

static DRFLAC_INLINE drflac_bool32 drflac__has_neon(void)
{
#if defined(DRFLAC_SUPPORT_NEON)
    #if defined(DRFLAC_ARM) && !defined(DRFLAC_NO_NEON)
        #if (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64))
            return DRFLAC_TRUE;    /* If the compiler is allowed to freely generate NEON code we can assume support. */
        #else
            /* TODO: Runtime check. */
            return DRFLAC_FALSE;
        #endif
    #else
        return DRFLAC_FALSE;       /* NEON is only supported on ARM architectures. */
    #endif
#else
    return DRFLAC_FALSE;           /* No compiler support. */
#endif
}

DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void)
{
    drflac__gIsNEONSupported = drflac__has_neon();

#if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5)
    drflac__gIsLZCNTSupported = DRFLAC_TRUE;
#endif
}
#endif


/* Endian Management */
static DRFLAC_INLINE drflac_bool32 drflac__is_little_endian(void)
{
#if defined(DRFLAC_X86) || defined(DRFLAC_X64)
    return DRFLAC_TRUE;
#elif defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && __BYTE_ORDER == __LITTLE_ENDIAN
    return DRFLAC_TRUE;
#else
    int n = 1;
    return (*(char*)&n) == 1;
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac__swap_endian_uint16(drflac_uint16 n)
{
#ifdef DRFLAC_HAS_BYTESWAP16_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_ushort(n);
    #elif defined(__GNUC__) || defined(__clang__)
        return __builtin_bswap16(n);
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap16(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    return ((n & 0xFF00) >> 8) |
           ((n & 0x00FF) << 8);
#endif
}

static DRFLAC_INLINE drflac_uint32 drflac__swap_endian_uint32(drflac_uint32 n)
{
#ifdef DRFLAC_HAS_BYTESWAP32_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_ulong(n);
    #elif defined(__GNUC__) || defined(__clang__)
        #if defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 6) && !defined(__ARM_ARCH_6M__) && !defined(DRFLAC_64BIT)   /* <-- 64-bit inline assembly has not been tested, so disabling for now. */
            /* Inline assembly optimized implementation for ARM. In my testing, GCC does not generate optimized code with __builtin_bswap32(). */
            drflac_uint32 r;
            __asm__ __volatile__ (
            #if defined(DRFLAC_64BIT)
                "rev %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(n)   /* <-- This is untested. If someone in the community could test this, that would be appreciated! */
            #else
                "rev %[out], %[in]" : [out]"=r"(r) : [in]"r"(n)
            #endif
            );
            return r;
        #else
            return __builtin_bswap32(n);
        #endif
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap32(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    return ((n & 0xFF000000) >> 24) |
           ((n & 0x00FF0000) >>  8) |
           ((n & 0x0000FF00) <<  8) |
           ((n & 0x000000FF) << 24);
#endif
}

static DRFLAC_INLINE drflac_uint64 drflac__swap_endian_uint64(drflac_uint64 n)
{
#ifdef DRFLAC_HAS_BYTESWAP64_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_uint64(n);
    #elif defined(__GNUC__) || defined(__clang__)
        return __builtin_bswap64(n);
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap64(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler. */
    return ((n & ((drflac_uint64)0xFF000000 << 32)) >> 56) |
           ((n & ((drflac_uint64)0x00FF0000 << 32)) >> 40) |
           ((n & ((drflac_uint64)0x0000FF00 << 32)) >> 24) |
           ((n & ((drflac_uint64)0x000000FF << 32)) >>  8) |
           ((n & ((drflac_uint64)0xFF000000      )) <<  8) |
           ((n & ((drflac_uint64)0x00FF0000      )) << 24) |
           ((n & ((drflac_uint64)0x0000FF00      )) << 40) |
           ((n & ((drflac_uint64)0x000000FF      )) << 56);
#endif
}


static DRFLAC_INLINE drflac_uint16 drflac__be2host_16(drflac_uint16 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint16(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__be2host_32(drflac_uint32 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint32(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__be2host_32_ptr_unaligned(const void* pData)
{
    const drflac_uint8* pNum = (drflac_uint8*)pData;
    return *(pNum) << 24 | *(pNum+1) << 16 | *(pNum+2) << 8 | *(pNum+3);
}

static DRFLAC_INLINE drflac_uint64 drflac__be2host_64(drflac_uint64 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint64(n);
    }

    return n;
}


static DRFLAC_INLINE drflac_uint32 drflac__le2host_32(drflac_uint32 n)
{
    if (!drflac__is_little_endian()) {
        return drflac__swap_endian_uint32(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__le2host_32_ptr_unaligned(const void* pData)
{
    const drflac_uint8* pNum = (drflac_uint8*)pData;
    return *pNum | *(pNum+1) << 8 | *(pNum+2) << 16 | *(pNum+3) << 24;
}


static DRFLAC_INLINE drflac_uint32 drflac__unsynchsafe_32(drflac_uint32 n)
{
    drflac_uint32 result = 0;
    result |= (n & 0x7F000000) >> 3;
    result |= (n & 0x007F0000) >> 2;
    result |= (n & 0x00007F00) >> 1;
    result |= (n & 0x0000007F) >> 0;

    return result;
}
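
/*
Illustrative note, not part of dr_flac. drflac__unsynchsafe_32() above collapses four bytes that each carry 7 payload bits
into a single 28-bit value. This is the "synchsafe" integer layout that ID3v2 uses for its tag size; that this is the reason
the function exists here is an assumption, since the caller is outside this section. The sketch below mirrors the same
shifts with standard types and adds a hypothetical encoder so the round trip can be checked. Both example_* names are made
up for illustration.

```c
#include <stdint.h>

// Hypothetical helper: packs a 28-bit value into synchsafe form,
// 7 payload bits per byte with the top bit of every byte left clear.
static uint32_t example_synchsafe_encode(uint32_t value)
{
    return ((value & 0x0FE00000u) << 3) |
           ((value & 0x001FC000u) << 2) |
           ((value & 0x00003F80u) << 1) |
           ((value & 0x0000007Fu) << 0);
}

// Same shifts as drflac__unsynchsafe_32(), written with <stdint.h> types.
static uint32_t example_synchsafe_decode(uint32_t n)
{
    uint32_t result = 0;
    result |= (n & 0x7F000000u) >> 3;
    result |= (n & 0x007F0000u) >> 2;
    result |= (n & 0x00007F00u) >> 1;
    result |= (n & 0x0000007Fu) >> 0;
    return result;
}
```

For example, the synchsafe bytes 00 00 02 01 decode to 257, and 7F 7F 7F 7F decodes to the largest representable value,
0x0FFFFFFF.
*/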



/* The CRC code below is based on this document: http://zlib.net/crc_v3.txt */
static drflac_uint8 drflac__crc8_table[] = {
    0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15, 0x38, 0x3F, 0x36, 0x31, 0x24, 0x23, 0x2A, 0x2D,
    0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65, 0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D,
    0xE0, 0xE7, 0xEE, 0xE9, 0xFC, 0xFB, 0xF2, 0xF5, 0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD,
    0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85, 0xA8, 0xAF, 0xA6, 0xA1, 0xB4, 0xB3, 0xBA, 0xBD,
    0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2, 0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA,
    0xB7, 0xB0, 0xB9, 0xBE, 0xAB, 0xAC, 0xA5, 0xA2, 0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A,
    0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32, 0x1F, 0x18, 0x11, 0x16, 0x03, 0x04, 0x0D, 0x0A,
    0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42, 0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A,
    0x89, 0x8E, 0x87, 0x80, 0x95, 0x92, 0x9B, 0x9C, 0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4,
    0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC, 0xC1, 0xC6, 0xCF, 0xC8, 0xDD, 0xDA, 0xD3, 0xD4,
    0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C, 0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44,
    0x19, 0x1E, 0x17, 0x10, 0x05, 0x02, 0x0B, 0x0C, 0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34,
    0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B, 0x76, 0x71, 0x78, 0x7F, 0x6A, 0x6D, 0x64, 0x63,
    0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B, 0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13,
    0xAE, 0xA9, 0xA0, 0xA7, 0xB2, 0xB5, 0xBC, 0xBB, 0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83,
    0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB, 0xE6, 0xE1, 0xE8, 0xEF, 0xFA, 0xFD, 0xF4, 0xF3
};

static drflac_uint16 drflac__crc16_table[] = {
    0x0000, 0x8005, 0x800F, 0x000A, 0x801B, 0x001E, 0x0014, 0x8011,
    0x8033, 0x0036, 0x003C, 0x8039, 0x0028, 0x802D, 0x8027, 0x0022,
    0x8063, 0x0066, 0x006C, 0x8069, 0x0078, 0x807D, 0x8077, 0x0072,
    0x0050, 0x8055, 0x805F, 0x005A, 0x804B, 0x004E, 0x0044, 0x8041,
    0x80C3, 0x00C6, 0x00CC, 0x80C9, 0x00D8, 0x80DD, 0x80D7, 0x00D2,
    0x00F0, 0x80F5, 0x80FF, 0x00FA, 0x80EB, 0x00EE, 0x00E4, 0x80E1,
    0x00A0, 0x80A5, 0x80AF, 0x00AA, 0x80BB, 0x00BE, 0x00B4, 0x80B1,
    0x8093, 0x0096, 0x009C, 0x8099, 0x0088, 0x808D, 0x8087, 0x0082,
    0x8183, 0x0186, 0x018C, 0x8189, 0x0198, 0x819D, 0x8197, 0x0192,
    0x01B0, 0x81B5, 0x81BF, 0x01BA, 0x81AB, 0x01AE, 0x01A4, 0x81A1,
    0x01E0, 0x81E5, 0x81EF, 0x01EA, 0x81FB, 0x01FE, 0x01F4, 0x81F1,
    0x81D3, 0x01D6, 0x01DC, 0x81D9, 0x01C8, 0x81CD, 0x81C7, 0x01C2,
    0x0140, 0x8145, 0x814F, 0x014A, 0x815B, 0x015E, 0x0154, 0x8151,
    0x8173, 0x0176, 0x017C, 0x8179, 0x0168, 0x816D, 0x8167, 0x0162,
    0x8123, 0x0126, 0x012C, 0x8129, 0x0138, 0x813D, 0x8137, 0x0132,
    0x0110, 0x8115, 0x811F, 0x011A, 0x810B, 0x010E, 0x0104, 0x8101,
    0x8303, 0x0306, 0x030C, 0x8309, 0x0318, 0x831D, 0x8317, 0x0312,
    0x0330, 0x8335, 0x833F, 0x033A, 0x832B, 0x032E, 0x0324, 0x8321,
    0x0360, 0x8365, 0x836F, 0x036A, 0x837B, 0x037E, 0x0374, 0x8371,
    0x8353, 0x0356, 0x035C, 0x8359, 0x0348, 0x834D, 0x8347, 0x0342,
    0x03C0, 0x83C5, 0x83CF, 0x03CA, 0x83DB, 0x03DE, 0x03D4, 0x83D1,
    0x83F3, 0x03F6, 0x03FC, 0x83F9, 0x03E8, 0x83ED, 0x83E7, 0x03E2,
    0x83A3, 0x03A6, 0x03AC, 0x83A9, 0x03B8, 0x83BD, 0x83B7, 0x03B2,
    0x0390, 0x8395, 0x839F, 0x039A, 0x838B, 0x038E, 0x0384, 0x8381,
    0x0280, 0x8285, 0x828F, 0x028A, 0x829B, 0x029E, 0x0294, 0x8291,
    0x82B3, 0x02B6, 0x02BC, 0x82B9, 0x02A8, 0x82AD, 0x82A7, 0x02A2,
    0x82E3, 0x02E6, 0x02EC, 0x82E9, 0x02F8, 0x82FD, 0x82F7, 0x02F2,
    0x02D0, 0x82D5, 0x82DF, 0x02DA, 0x82CB, 0x02CE, 0x02C4, 0x82C1,
    0x8243, 0x0246, 0x024C, 0x8249, 0x0258, 0x825D, 0x8257, 0x0252,
    0x0270, 0x8275, 0x827F, 0x027A, 0x826B, 0x026E, 0x0264, 0x8261,
    0x0220, 0x8225, 0x822F, 0x022A, 0x823B, 0x023E, 0x0234, 0x8231,
    0x8213, 0x0216, 0x021C, 0x8219, 0x0208, 0x820D, 0x8207, 0x0202
};
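
/*
Illustrative note, not part of dr_flac. The two lookup tables above hold, for every possible byte value, the result of
pushing that byte through 8 steps of MSB-first polynomial division. The polynomials 0x07 (CRC-8) and 0x8005 (CRC-16) are
the ones named in the reference implementations further down in this file. A minimal sketch of how a single entry could be
derived, assuming that construction; the example_* names are hypothetical:

```c
#include <stdint.h>

// Derives entry `index` of a CRC-8 table for polynomial 0x07, MSB first.
static uint8_t example_crc8_table_entry(uint8_t index)
{
    uint8_t crc = index;
    int bit;
    for (bit = 0; bit < 8; ++bit) {
        // When the top bit falls out of the register, fold the polynomial back in.
        crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07) : (uint8_t)(crc << 1);
    }
    return crc;
}

// Derives entry `index` of a CRC-16 table for polynomial 0x8005, MSB first.
static uint16_t example_crc16_table_entry(uint8_t index)
{
    uint16_t crc = (uint16_t)(index << 8);  // The byte enters at the top of the 16-bit register.
    int bit;
    for (bit = 0; bit < 8; ++bit) {
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8005) : (uint16_t)(crc << 1);
    }
    return crc;
}
```

Spot checks against the tables: index 0x80 of the CRC-8 table is 0x89, and indices 1 to 3 of the CRC-16 table are 0x8005,
0x800F and 0x000A.
*/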

static DRFLAC_INLINE drflac_uint8 drflac_crc8_byte(drflac_uint8 crc, drflac_uint8 data)
{
    return drflac__crc8_table[crc ^ data];
}

static DRFLAC_INLINE drflac_uint8 drflac_crc8(drflac_uint8 crc, drflac_uint32 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
#if 0
    /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc8(crc, 0, 8);") */
    drflac_uint8 p = 0x07;
    for (int i = count-1; i >= 0; --i) {
        drflac_uint8 bit = (data & (1 << i)) >> i;
        if (crc & 0x80) {
            crc = ((crc << 1) | bit) ^ p;
        } else {
            crc = ((crc << 1) | bit);
        }
    }
    return crc;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 32);

    wholeBytes = count >> 3;
    leftoverBits = count - (wholeBytes*8);
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    switch (wholeBytes) {
        case 4: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (drflac_uint8)((crc << leftoverBits) ^ drflac__crc8_table[(crc >> (8 - leftoverBits)) ^ (data & leftoverDataMask)]);
    }
    return crc;
#endif
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_byte(drflac_uint16 crc, drflac_uint8 data)
{
    return (crc << 8) ^ drflac__crc16_table[(drflac_uint8)(crc >> 8) ^ data];
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_cache(drflac_uint16 crc, drflac_cache_t data)
{
#ifdef DRFLAC_64BIT
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF));
#endif
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  8) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  0) & 0xFF));

    return crc;
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_bytes(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 byteCount)
{
    switch (byteCount)
    {
#ifdef DRFLAC_64BIT
    case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF));
    case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF));
    case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF));
    case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF));
#endif
    case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF));
    case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF));
    case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  8) & 0xFF));
    case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  0) & 0xFF));
    }

    return crc;
}

#if 0
static DRFLAC_INLINE drflac_uint16 drflac_crc16__32bit(drflac_uint16 crc, drflac_uint32 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
#if 0
    /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc16(crc, 0, 16);") */
    drflac_uint16 p = 0x8005;
    for (int i = count-1; i >= 0; --i) {
        drflac_uint16 bit = (data & (1ULL << i)) >> i;
        if (crc & 0x8000) {
            crc = ((crc << 1) | bit) ^ p;
        } else {
            crc = ((crc << 1) | bit);
        }
    }

    return crc;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 64);

    wholeBytes = count >> 3;
    leftoverBits = count & 7;
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    switch (wholeBytes) {
        default:
        case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)];
    }
    return crc;
#endif
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16__64bit(drflac_uint16 crc, drflac_uint64 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 64);

    wholeBytes = count >> 3;
    leftoverBits = count & 7;
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    switch (wholeBytes) {
        default:
        case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 << 32) << leftoverBits)) >> (56 + leftoverBits)));    /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler. */
        case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 << 32) << leftoverBits)) >> (48 + leftoverBits)));
        case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 << 32) << leftoverBits)) >> (40 + leftoverBits)));
        case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF << 32) << leftoverBits)) >> (32 + leftoverBits)));
        case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000      ) << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000      ) << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00      ) << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF      ) << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)];
    }
    return crc;
#endif
}


static DRFLAC_INLINE drflac_uint16 drflac_crc16(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 count)
{
#ifdef DRFLAC_64BIT
    return drflac_crc16__64bit(crc, data, count);
#else
    return drflac_crc16__32bit(crc, data, count);
#endif
}
#endif


#ifdef DRFLAC_64BIT
#define drflac__be2host__cache_line drflac__be2host_64
#else
#define drflac__be2host__cache_line drflac__be2host_32
#endif

/*
BIT READING ATTEMPT #2

This uses a 32- or 64-bit bit-shifted cache - as bits are read, the cache is shifted such that the first valid bit is sitting
on the most significant bit. It uses the notion of an L1 and L2 cache (borrowed from CPU architecture), where the L1 cache
is a 32- or 64-bit unsigned integer (depending on whether or not a 32- or 64-bit build is being compiled) and the L2 is an
array of "cache lines", with each cache line being the same size as the L1. The L2 is a buffer of about 4KB and is where data
from onRead() is read into.
*/
#define DRFLAC_CACHE_L1_SIZE_BYTES(bs)                      (sizeof((bs)->cache))
#define DRFLAC_CACHE_L1_SIZE_BITS(bs)                       (sizeof((bs)->cache)*8)
#define DRFLAC_CACHE_L1_BITS_REMAINING(bs)                  (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (bs)->consumedBits)
#define DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount)           (~((~(drflac_cache_t)0) >> (_bitCount)))
#define DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, _bitCount)      (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (_bitCount))
#define DRFLAC_CACHE_L1_SELECT(bs, _bitCount)               (((bs)->cache) & DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount))
#define DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, _bitCount)     (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)))
#define DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, _bitCount) (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> (DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)) & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1)))
#define DRFLAC_CACHE_L2_SIZE_BYTES(bs)                      (sizeof((bs)->cacheL2))
#define DRFLAC_CACHE_L2_LINE_COUNT(bs)                      (DRFLAC_CACHE_L2_SIZE_BYTES(bs) / sizeof((bs)->cacheL2[0]))
#define DRFLAC_CACHE_L2_LINES_REMAINING(bs)                 (DRFLAC_CACHE_L2_LINE_COUNT(bs) - (bs)->nextL2Line)
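
/*
Illustrative note, not part of dr_flac. The "bit-shifted cache" described above can be pictured with the L1 cache alone: the
next unread bit always sits in the most significant position, a read takes the top bits, and the cache is then shifted left
by the number of bits consumed. The sketch below is a simplified stand-in that assumes a fixed 64-bit cache and has no L2
buffer, reload or CRC handling. The example_* names are hypothetical and do not exist in the library.

```c
#include <stdint.h>

typedef struct
{
    uint64_t cache;         // Unread bits, left-aligned so the next bit is the MSB.
    uint32_t consumedBits;  // How many bits have been shifted out so far.
} example_bit_cache;

static void example_bit_cache_load(example_bit_cache* bc, uint64_t value)
{
    bc->cache = value;
    bc->consumedBits = 0;
}

// Reads bitCount bits (1..32). Returns 0 if the cache does not hold enough bits;
// a full reader would reload the cache at that point instead of failing.
static int example_bit_cache_read(example_bit_cache* bc, uint32_t bitCount, uint32_t* pResult)
{
    if (bitCount == 0 || bitCount > 32 || bitCount > 64 - bc->consumedBits) {
        return 0;
    }
    *pResult = (uint32_t)(bc->cache >> (64 - bitCount));  // Select the top bits and shift them down.
    bc->cache <<= bitCount;                               // Move the next unread bit up to the MSB.
    bc->consumedBits += bitCount;
    return 1;
}

// Convenience wrapper: skips skipBits, then reads one field from a single 64-bit word.
// Returns 0xFFFFFFFF on failure, which is good enough for a sketch.
static uint32_t example_read_field(uint64_t word, uint32_t skipBits, uint32_t bitCount)
{
    example_bit_cache bc;
    uint32_t value = 0;
    example_bit_cache_load(&bc, word);
    while (skipBits > 0) {
        uint32_t n = (skipBits > 32) ? 32 : skipBits;
        if (!example_bit_cache_read(&bc, n, &value)) {
            return 0xFFFFFFFFu;
        }
        skipBits -= n;
    }
    if (!example_bit_cache_read(&bc, bitCount, &value)) {
        return 0xFFFFFFFFu;
    }
    return value;
}
```

With the word 0xFFF8C40000000000, reading 14 bits from the start yields 0x3FFE, and the 4-bit field at bit offset 16
yields 0xC.
*/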


#ifndef DR_FLAC_NO_CRC
static DRFLAC_INLINE void drflac__reset_crc16(drflac_bs* bs)
{
    bs->crc16 = 0;
    bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
}

static DRFLAC_INLINE void drflac__update_crc16(drflac_bs* bs)
{
    if (bs->crc16CacheIgnoredBytes == 0) {
        bs->crc16 = drflac_crc16_cache(bs->crc16, bs->crc16Cache);
    } else {
        bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache, DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bs->crc16CacheIgnoredBytes);
        bs->crc16CacheIgnoredBytes = 0;
    }
}

static DRFLAC_INLINE drflac_uint16 drflac__flush_crc16(drflac_bs* bs)
{
    /* We should never be flushing in a situation where we are not aligned on a byte boundary. */
    DRFLAC_ASSERT((DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7) == 0);

    /*
    The bits that were read from the L1 cache need to be accumulated. The number of bytes needing to be accumulated is determined
    by the number of bits that have been consumed.
    */
    if (DRFLAC_CACHE_L1_BITS_REMAINING(bs) == 0) {
        drflac__update_crc16(bs);
    } else {
        /* We only accumulate the consumed bits. */
        bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache >> DRFLAC_CACHE_L1_BITS_REMAINING(bs), (bs->consumedBits >> 3) - bs->crc16CacheIgnoredBytes);

        /*
        The bits that we just accumulated should never be accumulated again. We need to keep track of how many bytes were accumulated
        so we can handle that later.
        */
        bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
    }

    return bs->crc16;
}
#endif

static DRFLAC_INLINE drflac_bool32 drflac__reload_l1_cache_from_l2(drflac_bs* bs)
{
    size_t bytesRead;
    size_t alignedL1LineCount;

    /* Fast path. Try loading straight from L2. */
    if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    }

    /*
    If we get here it means we've run out of data in the L2 cache. We'll need to fetch more from the client, if there's
    any left.
    */
    if (bs->unalignedByteCount > 0) {
        return DRFLAC_FALSE;   /* If we have any unaligned bytes it means there are no more aligned bytes left in the client. */
    }

    bytesRead = bs->onRead(bs->pUserData, bs->cacheL2, DRFLAC_CACHE_L2_SIZE_BYTES(bs));

    bs->nextL2Line = 0;
    if (bytesRead == DRFLAC_CACHE_L2_SIZE_BYTES(bs)) {
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    }


    /*
    If we get here it means we were unable to retrieve enough data to fill the entire L2 cache. It probably
    means we've just reached the end of the file. We need to move the valid data down to the end of the buffer
    and adjust the index of the next line accordingly. Also keep in mind that the L2 cache must be aligned to
    the size of the L1 so we'll need to seek backwards by any misaligned bytes.
    */
    alignedL1LineCount = bytesRead / DRFLAC_CACHE_L1_SIZE_BYTES(bs);

    /* We need to keep track of any unaligned bytes for later use. */
    bs->unalignedByteCount = bytesRead - (alignedL1LineCount * DRFLAC_CACHE_L1_SIZE_BYTES(bs));
    if (bs->unalignedByteCount > 0) {
        bs->unalignedCache = bs->cacheL2[alignedL1LineCount];
    }

    if (alignedL1LineCount > 0) {
        size_t offset = DRFLAC_CACHE_L2_LINE_COUNT(bs) - alignedL1LineCount;
        size_t i;
        for (i = alignedL1LineCount; i > 0; --i) {
            bs->cacheL2[i-1 + offset] = bs->cacheL2[i-1];
        }

        bs->nextL2Line = (drflac_uint32)offset;
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    } else {
        /* If we get into this branch it means we weren't able to load any L1-aligned data. */
        bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs);
        return DRFLAC_FALSE;
    }
}
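
/*
Illustrative note, not part of dr_flac. When a read only partially fills the L2 buffer, the function above moves the valid
lines to the end of the buffer so that "next line == line count" keeps meaning "L2 is empty". The copy walks backwards
because the source and destination ranges can overlap. A minimal sketch of that step on a plain array, assuming
validCount <= capacity; the example_* names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

// Moves the first validCount entries to the end of the buffer and returns the
// index of the first valid entry afterwards.
static size_t example_shift_to_end(uint32_t* buffer, size_t capacity, size_t validCount)
{
    size_t offset = capacity - validCount;
    size_t i;
    for (i = validCount; i > 0; --i) {
        buffer[i-1 + offset] = buffer[i-1];  // Last entry first, so overlapping ranges are safe.
    }
    return offset;
}

// Small self-check: 3 valid entries in a buffer of 8 should end up at indices 5..7.
static int example_shift_to_end_demo(void)
{
    uint32_t buffer[8] = { 10, 20, 30, 0, 0, 0, 0, 0 };
    size_t first = example_shift_to_end(buffer, 8, 3);
    return first == 5 && buffer[5] == 10 && buffer[6] == 20 && buffer[7] == 30;
}
```
*/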

static drflac_bool32 drflac__reload_cache(drflac_bs* bs)
{
    size_t bytesRead;

#ifndef DR_FLAC_NO_CRC
    drflac__update_crc16(bs);
#endif

    /* Fast path. Try just moving the next value in the L2 cache to the L1 cache. */
    if (drflac__reload_l1_cache_from_l2(bs)) {
        bs->cache = drflac__be2host__cache_line(bs->cache);
        bs->consumedBits = 0;
#ifndef DR_FLAC_NO_CRC
        bs->crc16Cache = bs->cache;
#endif
        return DRFLAC_TRUE;
    }

    /* Slow path. */

    /*
    If we get here it means we have failed to load the L1 cache from the L2. Likely we've just reached the end of the stream and the last
    few bytes did not meet the alignment requirements for the L2 cache. In this case we need to fall back to a slower path and read the
    data from the unaligned cache.
    */
    bytesRead = bs->unalignedByteCount;
    if (bytesRead == 0) {
        bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);   /* <-- The stream has been exhausted, so mark the bits as consumed. */
        return DRFLAC_FALSE;
    }

    DRFLAC_ASSERT(bytesRead < DRFLAC_CACHE_L1_SIZE_BYTES(bs));
    bs->consumedBits = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bytesRead) * 8;

    bs->cache = drflac__be2host__cache_line(bs->unalignedCache);
    bs->cache &= DRFLAC_CACHE_L1_SELECTION_MASK(DRFLAC_CACHE_L1_BITS_REMAINING(bs));    /* <-- Make sure the consumed bits are always set to zero. Other parts of the library depend on this property. */
    bs->unalignedByteCount = 0;     /* <-- At this point the unaligned bytes have been moved into the cache and we thus have no more unaligned bytes. */

#ifndef DR_FLAC_NO_CRC
    bs->crc16Cache = bs->cache >> bs->consumedBits;
    bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
#endif
    return DRFLAC_TRUE;
}

static void drflac__reset_cache(drflac_bs* bs)
{
    bs->nextL2Line   = DRFLAC_CACHE_L2_LINE_COUNT(bs);  /* <-- This clears the L2 cache. */
    bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);   /* <-- This clears the L1 cache. */
    bs->cache = 0;
    bs->unalignedByteCount = 0;                         /* <-- This clears the trailing unaligned bytes. */
    bs->unalignedCache = 0;

#ifndef DR_FLAC_NO_CRC
    bs->crc16Cache = 0;
    bs->crc16CacheIgnoredBytes = 0;
#endif
}


static DRFLAC_INLINE drflac_bool32 drflac__read_uint32(drflac_bs* bs, unsigned int bitCount, drflac_uint32* pResultOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResultOut != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 32);

    if (bs->consumedBits == DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    if (bitCount <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        /*
        If we want to load all 32 bits from a 32-bit cache we need to do it slightly differently because we can't do
        a 32-bit shift on a 32-bit integer. This will never be the case on 64-bit caches, so we can have a slightly
        more optimal solution for this.
        */
#ifdef DRFLAC_64BIT
        *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount);
        bs->consumedBits += bitCount;
        bs->cache <<= bitCount;
#else
        if (bitCount < DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount);
            bs->consumedBits += bitCount;
            bs->cache <<= bitCount;
        } else {
            /* Cannot shift by 32 bits, so need to do it differently. */
            *pResultOut = (drflac_uint32)bs->cache;
            bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);
            bs->cache = 0;
        }
#endif

        return DRFLAC_TRUE;
    } else {
        /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */
        drflac_uint32 bitCountHi = DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        drflac_uint32 bitCountLo = bitCount - bitCountHi;
        drflac_uint32 resultHi;

        DRFLAC_ASSERT(bitCountHi > 0);
        DRFLAC_ASSERT(bitCountHi < 32);
        resultHi = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountHi);

        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
        if (bitCountLo > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
            /* This happens when we get to the end of the stream. */
            return DRFLAC_FALSE;
        }

        *pResultOut = (resultHi << bitCountLo) | (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountLo);
        bs->consumedBits += bitCountLo;
        bs->cache <<= bitCountLo;
        return DRFLAC_TRUE;
    }
}

static drflac_bool32 drflac__read_int32(drflac_bs* bs, unsigned int bitCount, drflac_int32* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 32);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    /* Do not attempt to shift by 32 as it's undefined. */
    if (bitCount < 32) {
        drflac_uint32 signbit;
        signbit = ((result >> (bitCount-1)) & 0x01);
        result |= (~signbit + 1) << bitCount;
    }

    *pResult = (drflac_int32)result;
    return DRFLAC_TRUE;
}
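
/*
Illustrative aside (not part of dr_flac's API): the sign extension used by drflac__read_int32() can be tried in isolation.
The sketch below assumes <stdint.h> types and a hypothetical helper name, sign_extend_u32(). It relies on the final
unsigned-to-signed conversion wrapping in the usual two's complement way, which is what common compilers do.

```c
#include <stdint.h>

// Hypothetical helper: sign-extends the low bitCount bits of value.
// Assumes 1 <= bitCount <= 32, mirroring the asserts in drflac__read_int32().
static int32_t sign_extend_u32(uint32_t value, unsigned int bitCount)
{
    // A shift by 32 is undefined for a 32-bit type, so the full-width case is skipped.
    if (bitCount < 32) {
        uint32_t signbit = (value >> (bitCount - 1)) & 0x01;

        // (~signbit + 1) is 0xFFFFFFFF when the sign bit is set and 0 otherwise,
        // so this ORs ones into every bit above the value when it is negative.
        value |= (~signbit + 1) << bitCount;
    }

    return (int32_t)value;
}
```

For example, the 4-bit pattern 0x8 has its sign bit set, so the upper 28 bits are filled with ones, giving -8.
*/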

#ifdef DRFLAC_64BIT
static drflac_bool32 drflac__read_uint64(drflac_bs* bs, unsigned int bitCount, drflac_uint64* pResultOut)
{
    drflac_uint32 resultHi;
    drflac_uint32 resultLo;

    DRFLAC_ASSERT(bitCount <= 64);
    DRFLAC_ASSERT(bitCount > 32);

    if (!drflac__read_uint32(bs, bitCount - 32, &resultHi)) {
        return DRFLAC_FALSE;
    }

    if (!drflac__read_uint32(bs, 32, &resultLo)) {
        return DRFLAC_FALSE;
    }

    *pResultOut = (((drflac_uint64)resultHi) << 32) | ((drflac_uint64)resultLo);
    return DRFLAC_TRUE;
}
#endif

/* Function below is unused, but leaving it here in case I need to quickly add it again. */
#if 0
static drflac_bool32 drflac__read_int64(drflac_bs* bs, unsigned int bitCount, drflac_int64* pResultOut)
{
    drflac_uint64 result;
    drflac_uint64 signbit;

    DRFLAC_ASSERT(bitCount <= 64);

    if (!drflac__read_uint64(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    signbit = ((result >> (bitCount-1)) & 0x01);
    result |= (~signbit + 1) << bitCount;

    *pResultOut = (drflac_int64)result;
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__read_uint16(drflac_bs* bs, unsigned int bitCount, drflac_uint16* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 16);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_uint16)result;
    return DRFLAC_TRUE;
}

#if 0
static drflac_bool32 drflac__read_int16(drflac_bs* bs, unsigned int bitCount, drflac_int16* pResult)
{
    drflac_int32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 16);

    if (!drflac__read_int32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_int16)result;
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__read_uint8(drflac_bs* bs, unsigned int bitCount, drflac_uint8* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 8);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_uint8)result;
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__read_int8(drflac_bs* bs, unsigned int bitCount, drflac_int8* pResult)
{
    drflac_int32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 8);

    if (!drflac__read_int32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_int8)result;
    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__seek_bits(drflac_bs* bs, size_t bitsToSeek)
{
    if (bitsToSeek <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        bs->consumedBits += (drflac_uint32)bitsToSeek;
        bs->cache <<= bitsToSeek;
        return DRFLAC_TRUE;
    } else {
        /* It straddles the cached data. This function isn't called too frequently so I'm favouring simplicity here. */
        bitsToSeek       -= DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        bs->consumedBits += DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        bs->cache         = 0;

        /* Simple case. Seek in groups of the same number as bits that fit within a cache line. */
#ifdef DRFLAC_64BIT
        while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            drflac_uint64 bin;
            if (!drflac__read_uint64(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs);
        }
#else
        while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            drflac_uint32 bin;
            if (!drflac__read_uint32(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs);
        }
#endif

        /* Whole leftover bytes. */
        while (bitsToSeek >= 8) {
            drflac_uint8 bin;
            if (!drflac__read_uint8(bs, 8, &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= 8;
        }

        /* Leftover bits. */
        if (bitsToSeek > 0) {
            drflac_uint8 bin;
            if (!drflac__read_uint8(bs, (drflac_uint32)bitsToSeek, &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek = 0; /* <-- Necessary for the assert below. */
        }

        DRFLAC_ASSERT(bitsToSeek == 0);
        return DRFLAC_TRUE;
    }
}


/* This function moves the bit streamer to the first bit after the sync code (bit 15 of the frame header). It will also update the CRC-16. */
static drflac_bool32 drflac__find_and_seek_to_next_sync_code(drflac_bs* bs)
{
    DRFLAC_ASSERT(bs != NULL);

    /*
    The sync code is always aligned to 8 bits. This is convenient for us because it means we can do byte-aligned movements. The first
    thing to do is align to the next byte.
    */
    if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) {
        return DRFLAC_FALSE;
    }

    for (;;) {
        drflac_uint8 hi;

#ifndef DR_FLAC_NO_CRC
        drflac__reset_crc16(bs);
#endif

        if (!drflac__read_uint8(bs, 8, &hi)) {
            return DRFLAC_FALSE;
        }

        if (hi == 0xFF) {
            drflac_uint8 lo;
            if (!drflac__read_uint8(bs, 6, &lo)) {
                return DRFLAC_FALSE;
            }

            if (lo == 0x3E) {
                return DRFLAC_TRUE;
            } else {
                if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) {
                    return DRFLAC_FALSE;
                }
            }
        }
    }

    /* Should never get here. */
    /*return DRFLAC_FALSE;*/
}


#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)
#define DRFLAC_IMPLEMENT_CLZ_LZCNT
#endif
#if  defined(_MSC_VER) && _MSC_VER >= 1400 && (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(__clang__)
#define DRFLAC_IMPLEMENT_CLZ_MSVC
#endif
#if  defined(__WATCOMC__) && defined(__386__)
#define DRFLAC_IMPLEMENT_CLZ_WATCOM
#endif
#ifdef __MRC__
#include <intrinsics.h>
#define DRFLAC_IMPLEMENT_CLZ_MRC
#endif

static DRFLAC_INLINE drflac_uint32 drflac__clz_software(drflac_cache_t x)
{
    drflac_uint32 n;
    static drflac_uint32 clz_table_4[] = {
        0,
        4,
        3, 3,
        2, 2, 2, 2,
        1, 1, 1, 1, 1, 1, 1, 1
    };

    if (x == 0) {
        return sizeof(x)*8;
    }

    n = clz_table_4[x >> (sizeof(x)*8 - 4)];
    if (n == 0) {
#ifdef DRFLAC_64BIT
        if ((x & ((drflac_uint64)0xFFFFFFFF << 32)) == 0) { n  = 32; x <<= 32; }
        if ((x & ((drflac_uint64)0xFFFF0000 << 32)) == 0) { n += 16; x <<= 16; }
        if ((x & ((drflac_uint64)0xFF000000 << 32)) == 0) { n += 8;  x <<= 8;  }
        if ((x & ((drflac_uint64)0xF0000000 << 32)) == 0) { n += 4;  x <<= 4;  }
#else
        if ((x & 0xFFFF0000) == 0) { n  = 16; x <<= 16; }
        if ((x & 0xFF000000) == 0) { n += 8;  x <<= 8;  }
        if ((x & 0xF0000000) == 0) { n += 4;  x <<= 4;  }
#endif
        n += clz_table_4[x >> (sizeof(x)*8 - 4)];
    }

    return n - 1;
}
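
/*
Illustrative aside (not part of dr_flac's API): the table-driven count above is easier to follow with a fixed 32-bit
input. The sketch below assumes <stdint.h> types and a hypothetical name, clz32_table(). Each table entry holds the
leading zero count of a 4-bit value plus one, with 0 meaning the nibble is empty, which is why 1 is subtracted at the end.

```c
#include <stdint.h>

// Hypothetical 32-bit variant of the nibble-table leading zero count.
static uint32_t clz32_table(uint32_t x)
{
    // Entry i is (leading zeros of the 4-bit value i) + 1, or 0 when i == 0.
    static const uint32_t clz_table_4[16] = {
        0,
        4,
        3, 3,
        2, 2, 2, 2,
        1, 1, 1, 1, 1, 1, 1, 1
    };
    uint32_t n;

    if (x == 0) {
        return 32;
    }

    // Fast path: the answer is known if the top nibble is non-zero.
    n = clz_table_4[x >> 28];
    if (n == 0) {
        // Narrow down by halves, shifting the first set bit into the top nibble.
        if ((x & 0xFFFF0000u) == 0) { n  = 16; x <<= 16; }
        if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
        if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
        n += clz_table_4[x >> 28];
    }

    return n - 1;
}
```

Tracing x = 1: the three tests add 16, 8 and 4, the final lookup adds 4, and 32 - 1 gives 31 leading zeros.
*/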

#ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT
static DRFLAC_INLINE drflac_bool32 drflac__is_lzcnt_supported(void)
{
    /* Fast compile time check for ARM. */
#if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5)
    return DRFLAC_TRUE;
#elif defined(__MRC__)
    return DRFLAC_TRUE;
#else
    /* If the compiler itself does not support the intrinsic then we'll need to return false. */
    #ifdef DRFLAC_HAS_LZCNT_INTRINSIC
        return drflac__gIsLZCNTSupported;
    #else
        return DRFLAC_FALSE;
    #endif
#endif
}

static DRFLAC_INLINE drflac_uint32 drflac__clz_lzcnt(drflac_cache_t x)
{
    /*
    It's critical for competitive decoding performance that this function be highly optimal. With MSVC we can use the __lzcnt64() and __lzcnt() intrinsics
    to achieve good performance, however on GCC and Clang it's a little bit more annoying. The __builtin_clzl() and __builtin_clzll() intrinsics leave
    it undefined as to the return value when `x` is 0. We need this to be well defined as returning 32 or 64, depending on whether or not it's a 32- or
    64-bit build. To work around this we would need to add a conditional to check for the x = 0 case, but this creates unnecessary inefficiency. To work
    around this problem I have written some inline assembly to emit the LZCNT (x86) or CLZ (ARM) instruction directly which removes the need to include
    the conditional. This has worked well in the past, but for some reason Clang's MSVC compatible driver, clang-cl, does not seem to be handling this
    in the same way as the normal Clang driver. It seems that `clang-cl` is just outputting the wrong results sometimes, maybe due to some register
    getting clobbered?

    I'm not sure if this is a bug with dr_flac's inlined assembly (most likely), a bug in `clang-cl` or just a misunderstanding on my part with inline
    assembly rules for `clang-cl`. If somebody can identify an error in dr_flac's inlined assembly I'm happy to get that fixed.

    Fortunately there is an easy workaround for this. Clang implements MSVC-specific intrinsics for compatibility. It also defines _MSC_VER for extra
    compatibility. We can therefore just check for _MSC_VER and use the MSVC intrinsic which, fortunately for us, Clang supports. It would still be nice
    to know how to fix the inlined assembly for correctness' sake, however.
    */

#if defined(_MSC_VER) /*&& !defined(__clang__)*/    /* <-- Intentionally wanting Clang to use the MSVC __lzcnt64/__lzcnt intrinsics due to above ^. */
    #ifdef DRFLAC_64BIT
        return (drflac_uint32)__lzcnt64(x);
    #else
        return (drflac_uint32)__lzcnt(x);
    #endif
#else
    #if defined(__GNUC__) || defined(__clang__)
        #if defined(DRFLAC_X64)
            {
                drflac_uint64 r;
                __asm__ __volatile__ (
                    "lzcnt{ %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc"
                );

                return (drflac_uint32)r;
            }
        #elif defined(DRFLAC_X86)
            {
                drflac_uint32 r;
                __asm__ __volatile__ (
                    "lzcnt{l %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc"
                );

                return r;
            }
        #elif defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) && !defined(__ARM_ARCH_6M__) && !defined(DRFLAC_64BIT)   /* <-- I haven't tested 64-bit inline assembly, so only enabling this for the 32-bit build for now. */
            {
                unsigned int r;
                __asm__ __volatile__ (
                #if defined(DRFLAC_64BIT)
                    "clz %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(x)   /* <-- This is untested. If someone in the community could test this, that would be appreciated! */
                #else
                    "clz %[out], %[in]" : [out]"=r"(r) : [in]"r"(x)
                #endif
                );

                return r;
            }
        #else
            if (x == 0) {
                return sizeof(x)*8;
            }
            #ifdef DRFLAC_64BIT
                return (drflac_uint32)__builtin_clzll((drflac_uint64)x);
            #else
                return (drflac_uint32)__builtin_clzl((drflac_uint32)x);
            #endif
        #endif
    #else
        /* Unsupported compiler. */
        #error "This compiler does not support the lzcnt intrinsic."
    #endif
#endif
}
#endif

#ifdef DRFLAC_IMPLEMENT_CLZ_MSVC
#include <intrin.h> /* For BitScanReverse(). */

static DRFLAC_INLINE drflac_uint32 drflac__clz_msvc(drflac_cache_t x)
{
    drflac_uint32 n;

    if (x == 0) {
        return sizeof(x)*8;
    }

#ifdef DRFLAC_64BIT
    _BitScanReverse64((unsigned long*)&n, x);
#else
    _BitScanReverse((unsigned long*)&n, x);
#endif
    return sizeof(x)*8 - n - 1;
}
#endif

#ifdef DRFLAC_IMPLEMENT_CLZ_WATCOM
static __inline drflac_uint32 drflac__clz_watcom (drflac_uint32);
#ifdef DRFLAC_IMPLEMENT_CLZ_WATCOM_LZCNT
/* Use the LZCNT instruction (only available on some processors since the 2010s). */
#pragma aux drflac__clz_watcom_lzcnt = \
    "db 0F3h, 0Fh, 0BDh, 0C0h" /* lzcnt eax, eax */ \
    parm [eax] \
    value [eax] \
    modify nomemory;
#else
/* Use the 386+-compatible implementation. */
#pragma aux drflac__clz_watcom = \
    "bsr eax, eax" \
    "xor eax, 31" \
    parm [eax] nomemory \
    value [eax] \
    modify exact [eax] nomemory;
#endif
#endif

static DRFLAC_INLINE drflac_uint32 drflac__clz(drflac_cache_t x)
{
#ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT
    if (drflac__is_lzcnt_supported()) {
        return drflac__clz_lzcnt(x);
    } else
#endif
    {
#ifdef DRFLAC_IMPLEMENT_CLZ_MSVC
        return drflac__clz_msvc(x);
#elif defined(DRFLAC_IMPLEMENT_CLZ_WATCOM_LZCNT)
        return drflac__clz_watcom_lzcnt(x);
#elif defined(DRFLAC_IMPLEMENT_CLZ_WATCOM)
        return (x == 0) ? sizeof(x)*8 : drflac__clz_watcom(x);
#elif defined(__MRC__)
        return __cntlzw(x);
#else
        return drflac__clz_software(x);
#endif
    }
}


static DRFLAC_INLINE drflac_bool32 drflac__seek_past_next_set_bit(drflac_bs* bs, unsigned int* pOffsetOut)
{
    drflac_uint32 zeroCounter = 0;
    drflac_uint32 setBitOffsetPlus1;

    while (bs->cache == 0) {
        zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    if (bs->cache == 1) {
        /* Not catching this would lead to undefined behaviour: a shift of a 32-bit number by 32 or more is undefined */
        *pOffsetOut = zeroCounter + (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs) - 1;
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }

        return DRFLAC_TRUE;
    }

    setBitOffsetPlus1 = drflac__clz(bs->cache);
    setBitOffsetPlus1 += 1;

    if (setBitOffsetPlus1 > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        /* This happens when we get to end of stream */
        return DRFLAC_FALSE;
    }

    bs->consumedBits += setBitOffsetPlus1;
    bs->cache <<= setBitOffsetPlus1;

    *pOffsetOut = zeroCounter + setBitOffsetPlus1 - 1;
    return DRFLAC_TRUE;
}



static drflac_bool32 drflac__seek_to_byte(drflac_bs* bs, drflac_uint64 offsetFromStart)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(offsetFromStart > 0);

    /*
    Seeking from the start is not quite as trivial as it sounds because the onSeek callback takes a signed 32-bit integer (which
    is intentional because it simplifies the implementation of the onSeek callbacks), however offsetFromStart is unsigned 64-bit.
    To resolve we just need to do an initial seek from the start, and then a series of offset seeks to make up the remainder.
    */
    if (offsetFromStart > 0x7FFFFFFF) {
        drflac_uint64 bytesRemaining = offsetFromStart;
        if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }
        bytesRemaining -= 0x7FFFFFFF;

        while (bytesRemaining > 0x7FFFFFFF) {
            if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            bytesRemaining -= 0x7FFFFFFF;
        }

        if (bytesRemaining > 0) {
            if (!bs->onSeek(bs->pUserData, (int)bytesRemaining, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
        }
    } else {
        if (!bs->onSeek(bs->pUserData, (int)offsetFromStart, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }
    }

    /* The cache should be reset to force a reload of fresh data from the client. */
    drflac__reset_cache(bs);
    return DRFLAC_TRUE;
}


static drflac_result drflac__read_utf8_coded_number(drflac_bs* bs, drflac_uint64* pNumberOut, drflac_uint8* pCRCOut)
{
    drflac_uint8 crc;
    drflac_uint64 result;
    drflac_uint8 utf8[7] = {0};
    int byteCount;
    int i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pNumberOut != NULL);
    DRFLAC_ASSERT(pCRCOut != NULL);

    crc = *pCRCOut;

    if (!drflac__read_uint8(bs, 8, utf8)) {
        *pNumberOut = 0;
        return DRFLAC_AT_END;
    }
    crc = drflac_crc8(crc, utf8[0], 8);

    if ((utf8[0] & 0x80) == 0) {
        *pNumberOut = utf8[0];
        *pCRCOut = crc;
        return DRFLAC_SUCCESS;
    }

    /*byteCount = 1;*/
    if ((utf8[0] & 0xE0) == 0xC0) {
        byteCount = 2;
    } else if ((utf8[0] & 0xF0) == 0xE0) {
        byteCount = 3;
    } else if ((utf8[0] & 0xF8) == 0xF0) {
        byteCount = 4;
    } else if ((utf8[0] & 0xFC) == 0xF8) {
        byteCount = 5;
    } else if ((utf8[0] & 0xFE) == 0xFC) {
        byteCount = 6;
    } else if ((utf8[0] & 0xFF) == 0xFE) {
        byteCount = 7;
    } else {
        *pNumberOut = 0;
        return DRFLAC_CRC_MISMATCH;     /* Bad UTF-8 encoding. */
    }

    /* Read extra bytes. */
    DRFLAC_ASSERT(byteCount > 1);

    result = (drflac_uint64)(utf8[0] & (0xFF >> (byteCount + 1)));
    for (i = 1; i < byteCount; ++i) {
        if (!drflac__read_uint8(bs, 8, utf8 + i)) {
            *pNumberOut = 0;
            return DRFLAC_AT_END;
        }
        crc = drflac_crc8(crc, utf8[i], 8);

        result = (result << 6) | (utf8[i] & 0x3F);
    }

    *pNumberOut = result;
    *pCRCOut = crc;
    return DRFLAC_SUCCESS;
}
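
/*
Illustrative aside (not part of dr_flac's API): the same UTF-8 style decoding can be exercised against a plain byte
buffer instead of a bit stream. The sketch below assumes <stdint.h> and <stddef.h>, uses a hypothetical name,
decode_utf8_coded_number(), and leaves out the CRC-8 update. Like the function above, it does not check that
continuation bytes start with the bits 10.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: decodes one UTF-8 style coded number from data.
// Returns the number of bytes consumed, or 0 if the input is truncated or the lead byte is invalid.
static size_t decode_utf8_coded_number(const uint8_t* data, size_t size, uint64_t* pNumberOut)
{
    int byteCount;
    int i;
    uint64_t result;

    if (size == 0) {
        return 0;
    }

    // Single byte form: 0xxxxxxx.
    if ((data[0] & 0x80) == 0) {
        *pNumberOut = data[0];
        return 1;
    }

    // The count of leading 1 bits in the first byte gives the total length.
    if      ((data[0] & 0xE0) == 0xC0) { byteCount = 2; }
    else if ((data[0] & 0xF0) == 0xE0) { byteCount = 3; }
    else if ((data[0] & 0xF8) == 0xF0) { byteCount = 4; }
    else if ((data[0] & 0xFC) == 0xF8) { byteCount = 5; }
    else if ((data[0] & 0xFE) == 0xFC) { byteCount = 6; }
    else if (data[0] == 0xFE)          { byteCount = 7; }
    else                               { return 0; }

    if (size < (size_t)byteCount) {
        return 0;
    }

    // Payload bits of the lead byte, then 6 bits from each following byte.
    result = (uint64_t)(data[0] & (0xFF >> (byteCount + 1)));
    for (i = 1; i < byteCount; ++i) {
        result = (result << 6) | (uint64_t)(data[i] & 0x3F);
    }

    *pNumberOut = result;
    return (size_t)byteCount;
}
```

For example, the two bytes 0xC2 0x80 carry payload bits 00010 and 000000, which combine to 128.
*/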


static DRFLAC_INLINE drflac_uint32 drflac__ilog2_u32(drflac_uint32 x)
{
#if 1   /* Needs optimizing. */
    drflac_uint32 result = 0;
    while (x > 0) {
        result += 1;
        x >>= 1;
    }

    return result;
#endif
}

static DRFLAC_INLINE drflac_bool32 drflac__use_64_bit_prediction(drflac_uint32 bitsPerSample, drflac_uint32 order, drflac_uint32 precision)
{
    /* https://web.archive.org/web/20220205005724/https://github.com/ietf-wg-cellar/flac-specification/blob/37a49aa48ba4ba12e8757badfc59c0df35435fec/rfc_backmatter.md */
    return bitsPerSample + precision + drflac__ilog2_u32(order) > 32;
}
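
/*
Illustrative aside (not part of dr_flac's API): a few concrete numbers make the 64-bit prediction test easier to
check. The sketch below assumes <stdint.h> types and hypothetical names, bit_length_u32() and
needs_64_bit_prediction(). Note that the loop in drflac__ilog2_u32() counts the bits needed to represent x, so it
yields 4 for an order of 8 rather than 3.

```c
#include <stdint.h>

// Hypothetical helper: number of bits needed to represent x (0 for x == 0).
static uint32_t bit_length_u32(uint32_t x)
{
    uint32_t result = 0;
    while (x > 0) {
        result += 1;
        x >>= 1;
    }

    return result;
}

// Hypothetical helper: non-zero when the accumulated prediction may not fit in 32 bits.
static int needs_64_bit_prediction(uint32_t bitsPerSample, uint32_t order, uint32_t precision)
{
    return bitsPerSample + precision + bit_length_u32(order) > 32;
}
```

With 16 bits per sample, 12-bit coefficients and order 8 the sum is 16 + 12 + 4 = 32, so the 32-bit path is used.
Raising the precision to 13, or decoding 24-bit samples, pushes the sum past 32 and selects the 64-bit path.
*/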


/*
The next two functions are responsible for calculating the prediction.

When the bits per sample is >16 we need to use 64-bit integer arithmetic because otherwise we'll run out of precision. It's
safe to assume this will be slower on 32-bit platforms so we use a more optimal solution when the bits per sample is <=16.
*/
#if defined(__clang__)
__attribute__((no_sanitize("signed-integer-overflow")))
#endif
static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_32(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
{
    drflac_int32 prediction = 0;

    DRFLAC_ASSERT(order <= 32);

    /* 32-bit version. */

    /* VC++ optimizes this to a single jmp. I've not yet verified this for other compilers. */
    switch (order)
    {
    case 32: prediction += coefficients[31] * pDecodedSamples[-32];
    case 31: prediction += coefficients[30] * pDecodedSamples[-31];
    case 30: prediction += coefficients[29] * pDecodedSamples[-30];
    case 29: prediction += coefficients[28] * pDecodedSamples[-29];
    case 28: prediction += coefficients[27] * pDecodedSamples[-28];
    case 27: prediction += coefficients[26] * pDecodedSamples[-27];
    case 26: prediction += coefficients[25] * pDecodedSamples[-26];
    case 25: prediction += coefficients[24] * pDecodedSamples[-25];
    case 24: prediction += coefficients[23] * pDecodedSamples[-24];
    case 23: prediction += coefficients[22] * pDecodedSamples[-23];
    case 22: prediction += coefficients[21] * pDecodedSamples[-22];
    case 21: prediction += coefficients[20] * pDecodedSamples[-21];
    case 20: prediction += coefficients[19] * pDecodedSamples[-20];
    case 19: prediction += coefficients[18] * pDecodedSamples[-19];
    case 18: prediction += coefficients[17] * pDecodedSamples[-18];
    case 17: prediction += coefficients[16] * pDecodedSamples[-17];
    case 16: prediction += coefficients[15] * pDecodedSamples[-16];
    case 15: prediction += coefficients[14] * pDecodedSamples[-15];
    case 14: prediction += coefficients[13] * pDecodedSamples[-14];
    case 13: prediction += coefficients[12] * pDecodedSamples[-13];
    case 12: prediction += coefficients[11] * pDecodedSamples[-12];
    case 11: prediction += coefficients[10] * pDecodedSamples[-11];
    case 10: prediction += coefficients[ 9] * pDecodedSamples[-10];
    case  9: prediction += coefficients[ 8] * pDecodedSamples[- 9];
    case  8: prediction += coefficients[ 7] * pDecodedSamples[- 8];
    case  7: prediction += coefficients[ 6] * pDecodedSamples[- 7];
    case  6: prediction += coefficients[ 5] * pDecodedSamples[- 6];
    case  5: prediction += coefficients[ 4] * pDecodedSamples[- 5];
    case  4: prediction += coefficients[ 3] * pDecodedSamples[- 4];
    case  3: prediction += coefficients[ 2] * pDecodedSamples[- 3];
    case  2: prediction += coefficients[ 1] * pDecodedSamples[- 2];
    case  1: prediction += coefficients[ 0] * pDecodedSamples[- 1];
    }

    return (drflac_int32)(prediction >> shift);
}
3132
3133
static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_64(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
3134
{
3135
drflac_int64 prediction;
3136
3137
DRFLAC_ASSERT(order <= 32);
3138
3139
/* 64-bit version. */
3140
3141
/* This method is faster on the 32-bit build when compiling with VC++. See note below. */
3142
#ifndef DRFLAC_64BIT
3143
if (order == 8)
3144
{
3145
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3146
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3147
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3148
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3149
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3150
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3151
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3152
prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
3153
}
3154
else if (order == 7)
3155
{
3156
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3157
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3158
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3159
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3160
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3161
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3162
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3163
}
3164
else if (order == 3)
3165
{
3166
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3167
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3168
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3169
}
3170
else if (order == 6)
3171
{
3172
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3173
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3174
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3175
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3176
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3177
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3178
}
3179
else if (order == 5)
3180
{
3181
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3182
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3183
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3184
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3185
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3186
}
3187
else if (order == 4)
3188
{
3189
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3190
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3191
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3192
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3193
}
3194
else if (order == 12)
3195
{
3196
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3197
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3198
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3199
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3200
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3201
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3202
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3203
prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
3204
prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9];
3205
prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10];
3206
prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
3207
prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12];
3208
}
3209
else if (order == 2)
3210
{
3211
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3212
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3213
}
3214
else if (order == 1)
3215
{
3216
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3217
}
3218
else if (order == 10)
3219
{
3220
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3221
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3222
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3223
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3224
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3225
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3226
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3227
prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
3228
prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9];
3229
prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10];
3230
}
3231
else if (order == 9)
3232
{
3233
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3234
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3235
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3236
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3237
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3238
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3239
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3240
prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
3241
prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9];
3242
}
3243
else if (order == 11)
3244
{
3245
prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
3246
prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
3247
prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
3248
prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
3249
prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
3250
prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
3251
prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
3252
prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
3253
prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9];
3254
prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10];
3255
prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
3256
}
3257
else
3258
{
3259
int j;
3260
3261
prediction = 0;
3262
for (j = 0; j < (int)order; ++j) {
3263
prediction += coefficients[j] * (drflac_int64)pDecodedSamples[-j-1];
3264
}
3265
}
3266
#endif
3267
3268
/*
3269
VC++ optimizes this to a single jmp instruction, but only the 64-bit build. The 32-bit build generates less efficient code for some
3270
reason. The ugly version above is faster so we'll just switch between the two depending on the target platform.
3271
*/
3272
#ifdef DRFLAC_64BIT
3273
prediction = 0;
3274
switch (order)
3275
{
3276
case 32: prediction += coefficients[31] * (drflac_int64)pDecodedSamples[-32];
3277
case 31: prediction += coefficients[30] * (drflac_int64)pDecodedSamples[-31];
3278
case 30: prediction += coefficients[29] * (drflac_int64)pDecodedSamples[-30];
3279
case 29: prediction += coefficients[28] * (drflac_int64)pDecodedSamples[-29];
3280
case 28: prediction += coefficients[27] * (drflac_int64)pDecodedSamples[-28];
3281
case 27: prediction += coefficients[26] * (drflac_int64)pDecodedSamples[-27];
3282
case 26: prediction += coefficients[25] * (drflac_int64)pDecodedSamples[-26];
3283
    case 25: prediction += coefficients[24] * (drflac_int64)pDecodedSamples[-25];
    case 24: prediction += coefficients[23] * (drflac_int64)pDecodedSamples[-24];
    case 23: prediction += coefficients[22] * (drflac_int64)pDecodedSamples[-23];
    case 22: prediction += coefficients[21] * (drflac_int64)pDecodedSamples[-22];
    case 21: prediction += coefficients[20] * (drflac_int64)pDecodedSamples[-21];
    case 20: prediction += coefficients[19] * (drflac_int64)pDecodedSamples[-20];
    case 19: prediction += coefficients[18] * (drflac_int64)pDecodedSamples[-19];
    case 18: prediction += coefficients[17] * (drflac_int64)pDecodedSamples[-18];
    case 17: prediction += coefficients[16] * (drflac_int64)pDecodedSamples[-17];
    case 16: prediction += coefficients[15] * (drflac_int64)pDecodedSamples[-16];
    case 15: prediction += coefficients[14] * (drflac_int64)pDecodedSamples[-15];
    case 14: prediction += coefficients[13] * (drflac_int64)pDecodedSamples[-14];
    case 13: prediction += coefficients[12] * (drflac_int64)pDecodedSamples[-13];
    case 12: prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12];
    case 11: prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
    case 10: prediction += coefficients[ 9] * (drflac_int64)pDecodedSamples[-10];
    case  9: prediction += coefficients[ 8] * (drflac_int64)pDecodedSamples[- 9];
    case  8: prediction += coefficients[ 7] * (drflac_int64)pDecodedSamples[- 8];
    case  7: prediction += coefficients[ 6] * (drflac_int64)pDecodedSamples[- 7];
    case  6: prediction += coefficients[ 5] * (drflac_int64)pDecodedSamples[- 6];
    case  5: prediction += coefficients[ 4] * (drflac_int64)pDecodedSamples[- 5];
    case  4: prediction += coefficients[ 3] * (drflac_int64)pDecodedSamples[- 4];
    case  3: prediction += coefficients[ 2] * (drflac_int64)pDecodedSamples[- 3];
    case  2: prediction += coefficients[ 1] * (drflac_int64)pDecodedSamples[- 2];
    case  1: prediction += coefficients[ 0] * (drflac_int64)pDecodedSamples[- 1];
    }
#endif

    return (drflac_int32)(prediction >> shift);
}
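
/*
Illustrative sketch only, not part of dr_flac. The fall-through switch above is an unrolled sum, so a plain loop
should produce the same prediction. The name example_lpc_predict_64 and the <stdint.h> types are assumptions made
for this example. Right-shifting a negative value is implementation-defined in C; like the code above, the sketch
assumes an arithmetic shift.
*/

```c
#include <stdint.h>

/* Hypothetical helper: loop form of the unrolled fall-through switch. */
static int32_t example_lpc_predict_64(uint32_t order, int32_t shift, const int32_t* coefficients, const int32_t* pDecodedSamples)
{
    int64_t prediction = 0;
    uint32_t j;

    /* pDecodedSamples points at the sample being predicted. History lives at negative offsets. */
    for (j = 0; j < order; j += 1) {
        prediction += coefficients[j] * (int64_t)pDecodedSamples[-1 - (int)j];
    }

    return (int32_t)(prediction >> shift);
}
```

/*
coefficients[0] always pairs with the most recent sample (offset -1), which is why the switch counts down from
the highest order to case 1.
*/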


#if 0
/*
Reference implementation for reading and decoding samples with residual. This is intentionally left unoptimized for the
sake of readability and should only be used as a reference.
*/
static drflac_bool32 drflac__decode_samples_with_residual__rice__reference(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    for (i = 0; i < count; ++i) {
        drflac_uint32 zeroCounter = 0;
        drflac_uint32 decodedRice;  /* Declared at the top of the block to stay valid C89. */

        for (;;) {
            drflac_uint8 bit;
            if (!drflac__read_uint8(bs, 1, &bit)) {
                return DRFLAC_FALSE;
            }

            if (bit == 0) {
                zeroCounter += 1;
            } else {
                break;
            }
        }

        if (riceParam > 0) {
            if (!drflac__read_uint32(bs, riceParam, &decodedRice)) {
                return DRFLAC_FALSE;
            }
        } else {
            decodedRice = 0;
        }

        decodedRice |= (zeroCounter << riceParam);
        if ((decodedRice & 0x01)) {
            decodedRice = ~(decodedRice >> 1);
        } else {
            decodedRice = (decodedRice >> 1);
        }


        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[i] = decodedRice + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        } else {
            pSamplesOut[i] = decodedRice + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        }
    }

    return DRFLAC_TRUE;
}
#endif
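
/*
Illustrative sketch only, not part of dr_flac. It restates the Rice decoding steps of the reference implementation
above (unary zero count, stop bit, riceParam raw bits, then the odd/even sign unfolding) against a plain byte
buffer. The example_bits reader and both function names are assumptions made for this example; dr_flac's drflac_bs
is considerably more elaborate.
*/

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal MSB-first bit reader. */
typedef struct
{
    const uint8_t* pData;
    size_t sizeInBits;
    size_t bitPos;
} example_bits;

/* Reads one bit. Returns 1 on success, 0 when the data is exhausted. */
static int example_read_bit(example_bits* pBits, uint32_t* pBitOut)
{
    if (pBits->bitPos >= pBits->sizeInBits) {
        return 0;
    }

    *pBitOut = (uint32_t)((pBits->pData[pBits->bitPos >> 3] >> (7 - (pBits->bitPos & 7))) & 1);
    pBits->bitPos += 1;
    return 1;
}

/* Decodes one Rice-coded signed residual. Returns 1 on success, 0 on end of data. */
static int example_decode_rice(example_bits* pBits, uint32_t riceParam, int32_t* pResidualOut)
{
    uint32_t zeroCounter = 0;
    uint32_t lowBits = 0;
    uint32_t folded;
    uint32_t bit;
    uint32_t i;

    /* Unary part: count zero bits up to the terminating 1 bit. */
    for (;;) {
        if (!example_read_bit(pBits, &bit)) {
            return 0;
        }
        if (bit != 0) {
            break;
        }
        zeroCounter += 1;
    }

    /* Binary part: riceParam raw bits, most significant first. */
    for (i = 0; i < riceParam; i += 1) {
        if (!example_read_bit(pBits, &bit)) {
            return 0;
        }
        lowBits = (lowBits << 1) | bit;
    }

    /* Sign unfolding: odd values map to negatives, even values to non-negatives. */
    folded = (zeroCounter << riceParam) | lowBits;
    if (folded & 0x01) {
        *pResidualOut = (int32_t)~(folded >> 1);
    } else {
        *pResidualOut = (int32_t)(folded >> 1);
    }

    return 1;
}
```

/*
With riceParam = 2, the byte 0x54 (bits 0101 0100) holds two codes: "0 1 01" unfolds to -3 and "0 1 00" to 2.
*/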

#if 0
static drflac_bool32 drflac__read_rice_parts__reference(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
{
    drflac_uint32 zeroCounter = 0;
    drflac_uint32 decodedRice;

    for (;;) {
        drflac_uint8 bit;
        if (!drflac__read_uint8(bs, 1, &bit)) {
            return DRFLAC_FALSE;
        }

        if (bit == 0) {
            zeroCounter += 1;
        } else {
            break;
        }
    }

    if (riceParam > 0) {
        if (!drflac__read_uint32(bs, riceParam, &decodedRice)) {
            return DRFLAC_FALSE;
        }
    } else {
        decodedRice = 0;
    }

    *pZeroCounterOut = zeroCounter;
    *pRiceParamPartOut = decodedRice;
    return DRFLAC_TRUE;
}
#endif

#if 0
static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
{
    drflac_cache_t riceParamMask;
    drflac_uint32 zeroCounter;
    drflac_uint32 setBitOffsetPlus1;
    drflac_uint32 riceParamPart;
    drflac_uint32 riceLength;

    DRFLAC_ASSERT(riceParam > 0); /* <-- riceParam should never be 0. drflac__read_rice_parts__param_equals_zero() should be used instead for this case. */

    riceParamMask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParam);

    zeroCounter = 0;
    while (bs->cache == 0) {
        zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    setBitOffsetPlus1 = drflac__clz(bs->cache);
    zeroCounter += setBitOffsetPlus1;
    setBitOffsetPlus1 += 1;

    riceLength = setBitOffsetPlus1 + riceParam;
    if (riceLength < DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        riceParamPart = (drflac_uint32)((bs->cache & (riceParamMask >> setBitOffsetPlus1)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceLength));

        bs->consumedBits += riceLength;
        bs->cache <<= riceLength;
    } else {
        drflac_uint32 bitCountLo;
        drflac_cache_t resultHi;

        bs->consumedBits += riceLength;
        bs->cache <<= setBitOffsetPlus1 & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1); /* <-- Equivalent to "if (setBitOffsetPlus1 < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { bs->cache <<= setBitOffsetPlus1; }" */

        /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */
        bitCountLo = bs->consumedBits - DRFLAC_CACHE_L1_SIZE_BITS(bs);
        resultHi = DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, riceParam); /* <-- Use DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE() if ever this function allows riceParam=0. */

        if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
        #ifndef DR_FLAC_NO_CRC
            drflac__update_crc16(bs);
        #endif
            bs->cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
            bs->consumedBits = 0;
        #ifndef DR_FLAC_NO_CRC
            bs->crc16Cache = bs->cache;
        #endif
        } else {
            /* Slow path. We need to fetch more data from the client. */
            if (!drflac__reload_cache(bs)) {
                return DRFLAC_FALSE;
            }
            if (bitCountLo > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
                /* This happens when we get to end of stream */
                return DRFLAC_FALSE;
            }
        }

        riceParamPart = (drflac_uint32)(resultHi | DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, bitCountLo));

        bs->consumedBits += bitCountLo;
        bs->cache <<= bitCountLo;
    }

    pZeroCounterOut[0] = zeroCounter;
    pRiceParamPartOut[0] = riceParamPart;

    return DRFLAC_TRUE;
}
#endif
3477
3478
static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts_x1(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
3479
{
3480
drflac_uint32 riceParamPlus1 = riceParam + 1;
3481
/*drflac_cache_t riceParamPlus1Mask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParamPlus1);*/
3482
drflac_uint32 riceParamPlus1Shift = DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPlus1);
3483
drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1;
3484
3485
/*
3486
The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have
3487
no idea how this will work in practice...
3488
*/
3489
drflac_cache_t bs_cache = bs->cache;
3490
drflac_uint32 bs_consumedBits = bs->consumedBits;
3491
3492
/* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */
3493
drflac_uint32 lzcount = drflac__clz(bs_cache);
3494
if (lzcount < sizeof(bs_cache)*8) {
3495
pZeroCounterOut[0] = lzcount;
3496
3497
/*
3498
It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting
3499
this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled
3500
outside of this function at a higher level.
3501
*/
3502
extract_rice_param_part:
3503
bs_cache <<= lzcount;
3504
bs_consumedBits += lzcount;
3505
3506
if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) {
3507
/* Getting here means the rice parameter part is wholly contained within the current cache line. */
3508
pRiceParamPartOut[0] = (drflac_uint32)(bs_cache >> riceParamPlus1Shift);
3509
bs_cache <<= riceParamPlus1;
3510
bs_consumedBits += riceParamPlus1;
3511
} else {
3512
drflac_uint32 riceParamPartHi;
3513
drflac_uint32 riceParamPartLo;
3514
drflac_uint32 riceParamPartLoBitCount;
3515
3516
/*
3517
Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache
3518
line, reload the cache, and then combine it with the head of the next cache line.
3519
*/
3520
3521
/* Grab the high part of the rice parameter part. */
3522
riceParamPartHi = (drflac_uint32)(bs_cache >> riceParamPlus1Shift);
3523
3524
/* Before reloading the cache we need to grab the size in bits of the low part. */
3525
riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits;
3526
DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32);
3527
3528
/* Now reload the cache. */
3529
if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
3530
#ifndef DR_FLAC_NO_CRC
3531
drflac__update_crc16(bs);
3532
#endif
3533
bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
3534
bs_consumedBits = riceParamPartLoBitCount;
3535
#ifndef DR_FLAC_NO_CRC
3536
bs->crc16Cache = bs_cache;
3537
#endif
3538
} else {
3539
/* Slow path. We need to fetch more data from the client. */
3540
if (!drflac__reload_cache(bs)) {
3541
return DRFLAC_FALSE;
3542
}
3543
if (riceParamPartLoBitCount > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
3544
/* This happens when we get to end of stream */
3545
return DRFLAC_FALSE;
3546
}
3547
3548
bs_cache = bs->cache;
3549
bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount;
3550
}
3551
3552
/* We should now have enough information to construct the rice parameter part. */
3553
riceParamPartLo = (drflac_uint32)(bs_cache >> (DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPartLoBitCount)));
3554
pRiceParamPartOut[0] = riceParamPartHi | riceParamPartLo;
3555
3556
bs_cache <<= riceParamPartLoBitCount;
3557
}
3558
} else {
3559
/*
3560
Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call
3561
to drflac__clz() and we need to reload the cache.
3562
*/
3563
drflac_uint32 zeroCounter = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BITS(bs) - bs_consumedBits);
3564
for (;;) {
3565
if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
3566
#ifndef DR_FLAC_NO_CRC
3567
drflac__update_crc16(bs);
3568
#endif
3569
bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
3570
bs_consumedBits = 0;
3571
#ifndef DR_FLAC_NO_CRC
3572
bs->crc16Cache = bs_cache;
3573
#endif
3574
} else {
3575
/* Slow path. We need to fetch more data from the client. */
3576
if (!drflac__reload_cache(bs)) {
3577
return DRFLAC_FALSE;
3578
}
3579
3580
bs_cache = bs->cache;
3581
bs_consumedBits = bs->consumedBits;
3582
}
3583
3584
lzcount = drflac__clz(bs_cache);
3585
zeroCounter += lzcount;
3586
3587
if (lzcount < sizeof(bs_cache)*8) {
3588
break;
3589
}
3590
}
3591
3592
pZeroCounterOut[0] = zeroCounter;
3593
goto extract_rice_param_part;
3594
}
3595
3596
/* Make sure the cache is restored at the end of it all. */
3597
bs->cache = bs_cache;
3598
bs->consumedBits = bs_consumedBits;
3599
3600
return DRFLAC_TRUE;
3601
}
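
/*
Illustrative sketch only, not part of dr_flac. It models just the fast path of the function above: the whole code
(zeros, stop bit and riceParam bits) is assumed to sit inside one 64-bit cache word, so no reload is needed. The
names example_clz64 and example_read_rice_parts are assumptions made for this example, and the portable loop in
example_clz64 merely stands in for drflac__clz().
*/

```c
#include <stdint.h>

/* Portable leading-zero count. Returns 64 when x is 0. */
static uint32_t example_clz64(uint64_t x)
{
    uint32_t n = 0;

    if (x == 0) {
        return 64;
    }

    while ((x & ((uint64_t)1 << 63)) == 0) {
        x <<= 1;
        n += 1;
    }

    return n;
}

/*
Returns the number of bits consumed, or 0 if the word holds no stop bit. Assumes riceParam < 32 and that the
code does not straddle the end of the word.
*/
static uint32_t example_read_rice_parts(uint64_t cache, uint32_t riceParam, uint32_t* pZeroCounterOut, uint32_t* pRiceParamPartOut)
{
    uint32_t riceParamPlus1 = riceParam + 1;
    uint32_t lzcount = example_clz64(cache);

    if (lzcount == 64) {
        return 0;   /* The real decoder would reload the cache and keep counting. */
    }

    *pZeroCounterOut = lzcount;

    /* Skip the zeros so the stop bit becomes the top bit, then take the top riceParam+1 bits in one shift. */
    cache <<= lzcount;
    *pRiceParamPartOut = (uint32_t)(cache >> (64 - riceParamPlus1));

    return lzcount + riceParamPlus1;
}
```

/*
The extracted part still contains the stop bit above the riceParam bits, which is why the callers mask it with
riceParamMask before combining it with the zero count.
*/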

static DRFLAC_INLINE drflac_bool32 drflac__seek_rice_parts(drflac_bs* bs, drflac_uint8 riceParam)
{
    drflac_uint32 riceParamPlus1 = riceParam + 1;
    drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1;

    /*
    The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have
    no idea how this will work in practice...
    */
    drflac_cache_t bs_cache = bs->cache;
    drflac_uint32 bs_consumedBits = bs->consumedBits;

    /* The first thing to do is find the first set bit, which terminates the unary-coded run of zeros. Most likely a bit will be set in the current cache line. */
    drflac_uint32 lzcount = drflac__clz(bs_cache);
    if (lzcount < sizeof(bs_cache)*8) {
        /*
        It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When skipping
        past it, we include the set bit from the unary coded part because it simplifies cache management.
        */
    extract_rice_param_part:
        bs_cache <<= lzcount;
        bs_consumedBits += lzcount;

        if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) {
            /* Getting here means the rice parameter part is wholly contained within the current cache line. */
            bs_cache <<= riceParamPlus1;
            bs_consumedBits += riceParamPlus1;
        } else {
            /*
            Getting here means the rice parameter part straddles the cache line. Since we are only seeking, nothing needs to be
            extracted. We just reload the cache and skip the low part at the head of the next cache line.
            */

            /* Before reloading the cache we need to grab the size in bits of the low part. */
            drflac_uint32 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits;
            DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32);

            /* Now reload the cache. */
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = riceParamPartLoBitCount;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }

                if (riceParamPartLoBitCount > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
                    /* This happens when we get to end of stream */
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount;
            }

            bs_cache <<= riceParamPartLoBitCount;
        }
    } else {
        /*
        Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call
        to drflac__clz() and we need to reload the cache.
        */
        for (;;) {
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = 0;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits;
            }

            lzcount = drflac__clz(bs_cache);
            if (lzcount < sizeof(bs_cache)*8) {
                break;
            }
        }

        goto extract_rice_param_part;
    }

    /* Make sure the cache is restored at the end of it all. */
    bs->cache = bs_cache;
    bs->consumedBits = bs_consumedBits;

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar_zeroorder(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};
    drflac_uint32 zeroCountPart0;
    drflac_uint32 riceParamPart0;
    drflac_uint32 riceParamMask;
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    (void)bitsPerSample;
    (void)order;
    (void)shift;
    (void)coefficients;

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);

    i = 0;
    while (i < count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamPart0 &= riceParamMask;
        riceParamPart0 |= (zeroCountPart0 << riceParam);
        riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];

        pSamplesOut[i] = riceParamPart0;

        i += 1;
    }

    return DRFLAC_TRUE;
}
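
/*
Illustrative sketch only, not part of dr_flac. The "Rice reconstruction" step above unfolds the sign with a
two-entry table instead of a branch. The three hypothetical helpers below (names are assumptions made for this
example) show the branchy form used by the reference implementation, the table form used by the scalar decoders,
and the arithmetic form that appears in a commented-out line of the scalar decoder. All three should agree.
*/

```c
#include <stdint.h>

/* Branchy form: odd values map to negatives, even values to non-negatives. */
static uint32_t example_unfold_branch(uint32_t v)
{
    if (v & 0x01) {
        return ~(v >> 1);
    } else {
        return (v >> 1);
    }
}

/* Table form: XOR with all ones is a bitwise NOT, XOR with zero leaves the value unchanged. */
static uint32_t example_unfold_table(uint32_t v)
{
    static const uint32_t t[2] = {0x00000000, 0xFFFFFFFF};
    return (v >> 1) ^ t[v & 0x01];
}

/* Arithmetic form: ~(v & 1) + 1 negates the low bit, giving 0 or all ones without a table. */
static uint32_t example_unfold_negate(uint32_t v)
{
    return (v >> 1) ^ (~(v & 0x01) + 1);
}
```

/*
Interpreted as signed 32-bit values, the inputs 0, 1, 2, 3 unfold to 0, -1, 1, -2.
*/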

static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};
    drflac_uint32 zeroCountPart0 = 0;
    drflac_uint32 zeroCountPart1 = 0;
    drflac_uint32 zeroCountPart2 = 0;
    drflac_uint32 zeroCountPart3 = 0;
    drflac_uint32 riceParamPart0 = 0;
    drflac_uint32 riceParamPart1 = 0;
    drflac_uint32 riceParamPart2 = 0;
    drflac_uint32 riceParamPart3 = 0;
    drflac_uint32 riceParamMask;
    const drflac_int32* pSamplesOutEnd;
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    if (lpcOrder == 0) {
        return drflac__decode_samples_with_residual__rice__scalar_zeroorder(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
    }

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    pSamplesOutEnd = pSamplesOut + (count & ~3);

    if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
        while (pSamplesOut < pSamplesOutEnd) {
            /*
            Rice extraction. It's faster to do this one at a time against local variables than it is to use the x4 version
            against an array. Not sure why, but perhaps it's making more efficient use of registers?
            */
            if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) {
                return DRFLAC_FALSE;
            }

            riceParamPart0 &= riceParamMask;
            riceParamPart1 &= riceParamMask;
            riceParamPart2 &= riceParamMask;
            riceParamPart3 &= riceParamMask;

            riceParamPart0 |= (zeroCountPart0 << riceParam);
            riceParamPart1 |= (zeroCountPart1 << riceParam);
            riceParamPart2 |= (zeroCountPart2 << riceParam);
            riceParamPart3 |= (zeroCountPart3 << riceParam);

            riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
            riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01];
            riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01];
            riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01];

            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
            pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 1);
            pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 2);
            pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 3);

            pSamplesOut += 4;
        }
    } else {
        while (pSamplesOut < pSamplesOutEnd) {
            if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) {
                return DRFLAC_FALSE;
            }

            riceParamPart0 &= riceParamMask;
            riceParamPart1 &= riceParamMask;
            riceParamPart2 &= riceParamMask;
            riceParamPart3 &= riceParamMask;

            riceParamPart0 |= (zeroCountPart0 << riceParam);
            riceParamPart1 |= (zeroCountPart1 << riceParam);
            riceParamPart2 |= (zeroCountPart2 << riceParam);
            riceParamPart3 |= (zeroCountPart3 << riceParam);

            riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
            riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01];
            riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01];
            riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01];

            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
            pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 1);
            pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 2);
            pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 3);

            pSamplesOut += 4;
        }
    }

    i = (count & ~3);
    while (i < count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamPart0 &= riceParamMask;
        riceParamPart0 |= (zeroCountPart0 << riceParam);
        riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
        /*riceParamPart0 = (riceParamPart0 >> 1) ^ (~(riceParamPart0 & 0x01) + 1);*/

        /* Sample reconstruction. */
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
        } else {
            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
        }

        i += 1;
        pSamplesOut += 1;
    }

    return DRFLAC_TRUE;
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE __m128i drflac__mm_packs_interleaved_epi32(__m128i a, __m128i b)
{
    __m128i r;

    /* Pack. */
    r = _mm_packs_epi32(a, b);

    /* a3a2 a1a0 b3b2 b1b0 -> a3a2 b3b2 a1a0 b1b0 */
    r = _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 1, 2, 0));

    /* a3a2 b3b2 a1a0 b1b0 -> a3b3 a2b2 a1b1 a0b0 */
    r = _mm_shufflehi_epi16(r, _MM_SHUFFLE(3, 1, 2, 0));
    r = _mm_shufflelo_epi16(r, _MM_SHUFFLE(3, 1, 2, 0));

    return r;
}
#endif

#if defined(DRFLAC_SUPPORT_SSE41)
static DRFLAC_INLINE __m128i drflac__mm_not_si128(__m128i a)
{
    return _mm_xor_si128(a, _mm_cmpeq_epi32(_mm_setzero_si128(), _mm_setzero_si128()));
}

static DRFLAC_INLINE __m128i drflac__mm_hadd_epi32(__m128i x)
{
    __m128i x64 = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    __m128i x32 = _mm_shufflelo_epi16(x64, _MM_SHUFFLE(1, 0, 3, 2));
    return _mm_add_epi32(x64, x32);
}

static DRFLAC_INLINE __m128i drflac__mm_hadd_epi64(__m128i x)
{
    return _mm_add_epi64(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
}

static DRFLAC_INLINE __m128i drflac__mm_srai_epi64(__m128i x, int count)
{
    /*
    To simplify this we are assuming count < 32. This restriction allows us to work on a low side and a high side. The low side
    is shifted with zero bits, whereas the high side is shifted with sign bits.
    */
    __m128i lo = _mm_srli_epi64(x, count);
    __m128i hi = _mm_srai_epi32(x, count);

    hi = _mm_and_si128(hi, _mm_set_epi32(0xFFFFFFFF, 0, 0xFFFFFFFF, 0)); /* The high part needs to have the low part cleared. */

    return _mm_or_si128(lo, hi);
}
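
/*
Illustrative sketch only, not part of dr_flac. It is a scalar model of one 64-bit lane of the function above, for
readers who want to check the trick without SSE: a logical 64-bit shift supplies every bit except the sign fill,
and an arithmetic 32-bit shift of the upper half supplies the sign fill. The name example_srai64 is an assumption
made for this example. Right-shifting a negative int32_t is implementation-defined in C; the sketch assumes an
arithmetic shift, which is what _mm_srai_epi32 performs.
*/

```c
#include <stdint.h>

/* Scalar model of a 64-bit arithmetic right shift. count must be in the range [0, 31]. */
static int64_t example_srai64(int64_t x, int count)
{
    uint64_t ux = (uint64_t)x;

    /* "lo": logical shift of all 64 bits, so zeros enter at the top. */
    uint64_t lo = ux >> count;

    /* "hi": arithmetic shift of the upper 32 bits only, so sign bits enter at the top. */
    int32_t hiHalf = (int32_t)(ux >> 32);
    uint64_t hi = (uint64_t)(uint32_t)(hiHalf >> count) << 32;

    /* Below the top count bits both halves agree, so the OR only fills in the sign bits. */
    return (int64_t)(lo | hi);
}
```

/*
In the SSE version the AND mask plays the role of the "<< 32" here: _mm_srai_epi32 also shifts the lower 32 bits of
each lane, and those must be cleared before the OR.
*/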

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts0 = 0;
    drflac_uint32 zeroCountParts1 = 0;
    drflac_uint32 zeroCountParts2 = 0;
    drflac_uint32 zeroCountParts3 = 0;
    drflac_uint32 riceParamParts0 = 0;
    drflac_uint32 riceParamParts1 = 0;
    drflac_uint32 riceParamParts2 = 0;
    drflac_uint32 riceParamParts3 = 0;
    __m128i coefficients128_0;
    __m128i coefficients128_4;
    __m128i coefficients128_8;
    __m128i samples128_0;
    __m128i samples128_4;
    __m128i samples128_8;
    __m128i riceParamMask128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = _mm_set1_epi32(riceParamMask);

    /* Pre-load. */
    coefficients128_0 = _mm_setzero_si128();
    coefficients128_4 = _mm_setzero_si128();
    coefficients128_8 = _mm_setzero_si128();

    samples128_0 = _mm_setzero_si128();
    samples128_4 = _mm_setzero_si128();
    samples128_8 = _mm_setzero_si128();

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
#if 1
    {
        int runningOrder = order;

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0));
            samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break;
                case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break;
                case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4));
            samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break;
                case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break;
                case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8));
            samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break;
                case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break;
                case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3));
    }
#else
    /* This causes strict-aliasing warnings with GCC. */
    switch (order)
    {
    case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12];
    case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11];
    case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10];
    case  9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9];
    case  8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8];
    case  7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7];
    case  6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6];
    case  5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5];
    case  4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4];
    case  3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3];
    case  2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2];
    case  1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1];
    }
#endif
4030
4031
/* For this version we are doing one sample at a time. */
4032
while (pDecodedSamples < pDecodedSamplesEnd) {
4033
__m128i prediction128;
4034
__m128i zeroCountPart128;
4035
__m128i riceParamPart128;
4036
4037
if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) ||
4038
!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) ||
4039
!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) ||
4040
!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) {
4041
return DRFLAC_FALSE;
4042
}
4043
4044
        zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0);
        riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0);

        riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128);
        riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam));
        riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01))), _mm_set1_epi32(0x01))); /* <-- SSE2 compatible */
        /*riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_mullo_epi32(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01)), _mm_set1_epi32(0xFFFFFFFF)));*/ /* <-- Only supported from SSE4.1 and is slower in my testing... */

        if (order <= 4) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        } else if (order <= 8) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_4, samples128_4);
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0));

                /* Horizontal add and shift. */
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        } else {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_8, samples128_8);
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_4, samples128_4));
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0));

                /* Horizontal add and shift. */
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4);
                samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        }

        /* We store samples in groups of 4. */
        _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts0 &= riceParamMask;
        riceParamParts0 |= (zeroCountParts0 << riceParam);
        riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}
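
/*
Illustration only, not part of dr_flac: a minimal scalar sketch of the Rice reconstruction step used above. The function
names are hypothetical. It assumes the folded (zigzag) value is assembled from the unary zero count and the low-order
riceParam bits, and shows that the table lookup used in the scalar tail loop and the "~(v & 1) + 1" form used in the
SIMD path should yield the same sign-unfolded residual.

```c
#include <stdint.h>

// Table form, as in the scalar tail loop: XOR with all ones when the low bit is set.
static uint32_t demo_unfold_table(uint32_t v)
{
    static const uint32_t t[2] = {0x00000000, 0xFFFFFFFF};
    return (v >> 1) ^ t[v & 0x01];
}

// Arithmetic form, as in the SIMD path: ~(v & 1) + 1 is the two's complement negation of the low bit.
static uint32_t demo_unfold_arith(uint32_t v)
{
    return (v >> 1) ^ (~(v & 0x01) + 1);
}

// Assemble the folded value from the two Rice parts, then unfold it into a signed residual.
static int32_t demo_rice_residual(uint32_t zeroCount, uint32_t lowBits, uint8_t riceParam)
{
    uint32_t mask = (uint32_t)~((~0UL) << riceParam);
    uint32_t v = (lowBits & mask) | (zeroCount << riceParam);
    return (int32_t)demo_unfold_arith(v);
}
```

With either form, folded values 0, 1, 2, 3, 4 map to residuals 0, -1, 1, -2, 2.
*/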

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts0 = 0;
    drflac_uint32 zeroCountParts1 = 0;
    drflac_uint32 zeroCountParts2 = 0;
    drflac_uint32 zeroCountParts3 = 0;
    drflac_uint32 riceParamParts0 = 0;
    drflac_uint32 riceParamParts1 = 0;
    drflac_uint32 riceParamParts2 = 0;
    drflac_uint32 riceParamParts3 = 0;
    __m128i coefficients128_0;
    __m128i coefficients128_4;
    __m128i coefficients128_8;
    __m128i samples128_0;
    __m128i samples128_4;
    __m128i samples128_8;
    __m128i prediction128;
    __m128i riceParamMask128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    DRFLAC_ASSERT(order <= 12);

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = _mm_set1_epi32(riceParamMask);

    prediction128 = _mm_setzero_si128();

    /* Pre-load. */
    coefficients128_0 = _mm_setzero_si128();
    coefficients128_4 = _mm_setzero_si128();
    coefficients128_8 = _mm_setzero_si128();

    samples128_0 = _mm_setzero_si128();
    samples128_4 = _mm_setzero_si128();
    samples128_8 = _mm_setzero_si128();

#if 1
    {
        int runningOrder = order;

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0));
            samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break;
                case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break;
                case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4));
            samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break;
                case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break;
                case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8));
            samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break;
                case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break;
                case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3));
    }
#else
    switch (order)
    {
    case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12];
    case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11];
    case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10];
    case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9];
    case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8];
    case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7];
    case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6];
    case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5];
    case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4];
    case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3];
    case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2];
    case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1];
    }
#endif

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        __m128i zeroCountPart128;
        __m128i riceParamPart128;

        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0);
        riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0);

        riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128);
        riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam));
        riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(1))), _mm_set1_epi32(1)));

        for (i = 0; i < 4; i += 1) {
            prediction128 = _mm_xor_si128(prediction128, prediction128); /* Reset to 0. */

            /* Each case deliberately falls through, accumulating two taps at a time. */
            switch (order)
            {
            case 12:
            case 11: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(1, 1, 0, 0))));
            case 10:
            case 9: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(3, 3, 2, 2))));
            case 8:
            case 7: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(1, 1, 0, 0))));
            case 6:
            case 5: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(3, 3, 2, 2))));
            case 4:
            case 3: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(1, 1, 0, 0))));
            case 2:
            case 1: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(3, 3, 2, 2))));
            }

            /* Horizontal add and shift. */
            prediction128 = drflac__mm_hadd_epi64(prediction128);
            prediction128 = drflac__mm_srai_epi64(prediction128, shift);
            prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

            /* Our value should be sitting in prediction128[0]. We need to combine this with our SSE samples. */
            samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4);
            samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
            samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);

            /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */
            riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
        }

        /* We store samples in groups of 4. */
        _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts0 &= riceParamMask;
        riceParamParts0 |= (zeroCountParts0 << riceParam);
        riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    /* In my testing the order is rarely > 12, so in this case I'm going to simplify the SSE implementation by only handling order <= 12. */
    if (lpcOrder > 0 && lpcOrder <= 12) {
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            return drflac__decode_samples_with_residual__rice__sse41_64(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        } else {
            return drflac__decode_samples_with_residual__rice__sse41_32(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        }
    } else {
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac__vst2q_s32(drflac_int32* p, int32x4x2_t x)
{
    vst1q_s32(p+0, x.val[0]);
    vst1q_s32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_u32(drflac_uint32* p, uint32x4x2_t x)
{
    vst1q_u32(p+0, x.val[0]);
    vst1q_u32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_f32(float* p, float32x4x2_t x)
{
    vst1q_f32(p+0, x.val[0]);
    vst1q_f32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_s16(drflac_int16* p, int16x4x2_t x)
{
    vst1q_s16(p, vcombine_s16(x.val[0], x.val[1]));
}

static DRFLAC_INLINE void drflac__vst2q_u16(drflac_uint16* p, uint16x4x2_t x)
{
    vst1q_u16(p, vcombine_u16(x.val[0], x.val[1]));
}

static DRFLAC_INLINE int32x4_t drflac__vdupq_n_s32x4(drflac_int32 x3, drflac_int32 x2, drflac_int32 x1, drflac_int32 x0)
{
    drflac_int32 x[4];
    x[3] = x3;
    x[2] = x2;
    x[1] = x1;
    x[0] = x0;
    return vld1q_s32(x);
}

static DRFLAC_INLINE int32x4_t drflac__valignrq_s32_1(int32x4_t a, int32x4_t b)
{
    /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */

    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(a, 0),
        vgetq_lane_s32(b, 3),
        vgetq_lane_s32(b, 2),
        vgetq_lane_s32(b, 1)
    );*/

    return vextq_s32(b, a, 1);
}
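
/*
Illustration only, not part of dr_flac: a scalar model of the lane movement performed by drflac__valignrq_s32_1() above,
written under the assumption that lane 0 is the lowest-addressed element. demo_vec4 and demo_alignr_1 are hypothetical
names. The result drops lane 0 of b, slides the remaining lanes of b down by one and brings lane 0 of a in at the top,
which is how a newly decoded sample is pushed into the window of prior samples.

```c
#include <stdint.h>

typedef struct { int32_t lane[4]; } demo_vec4;

// Scalar model of vextq_s32(b, a, 1), which matches _mm_alignr_epi8(a, b, 4).
static demo_vec4 demo_alignr_1(demo_vec4 a, demo_vec4 b)
{
    demo_vec4 r;
    r.lane[0] = b.lane[1];
    r.lane[1] = b.lane[2];
    r.lane[2] = b.lane[3];
    r.lane[3] = a.lane[0];
    return r;
}
```

Passing a zero vector as a, as the decoding loops do for the Rice parts, simply shifts b down and fills the top lane with 0.
*/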

static DRFLAC_INLINE uint32x4_t drflac__valignrq_u32_1(uint32x4_t a, uint32x4_t b)
{
    /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */

    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(a, 0),
        vgetq_lane_s32(b, 3),
        vgetq_lane_s32(b, 2),
        vgetq_lane_s32(b, 1)
    );*/

    return vextq_u32(b, a, 1);
}

static DRFLAC_INLINE int32x2_t drflac__vhaddq_s32(int32x4_t x)
{
    /* The sum must end up in position 0. */

    /* Reference */
    /*return vdupq_n_s32(
        vgetq_lane_s32(x, 3) +
        vgetq_lane_s32(x, 2) +
        vgetq_lane_s32(x, 1) +
        vgetq_lane_s32(x, 0)
    );*/

    int32x2_t r = vadd_s32(vget_high_s32(x), vget_low_s32(x));
    return vpadd_s32(r, r);
}

static DRFLAC_INLINE int64x1_t drflac__vhaddq_s64(int64x2_t x)
{
    return vadd_s64(vget_high_s64(x), vget_low_s64(x));
}

static DRFLAC_INLINE int32x4_t drflac__vrevq_s32(int32x4_t x)
{
    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(x, 0),
        vgetq_lane_s32(x, 1),
        vgetq_lane_s32(x, 2),
        vgetq_lane_s32(x, 3)
    );*/

    return vrev64q_s32(vcombine_s32(vget_high_s32(x), vget_low_s32(x)));
}
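
/*
Illustration only, not part of dr_flac: a scalar sketch of why the coefficients are reversed before the streaming loops
below. The sample window holds prior samples oldest first, so lane 3 is the most recent sample, whereas coefficients[0]
applies to the most recent sample. Reversing the coefficients lets a plain lane-wise multiply followed by a horizontal
add form the prediction. demo_predict_order4 is a hypothetical name, and the sketch assumes an order of exactly 4 with
32-bit accumulation.

```c
#include <stdint.h>

// window[0..3] holds the four prior samples, oldest first. coefficients[0] pairs with the newest sample.
static int32_t demo_predict_order4(const int32_t coefficients[4], const int32_t window[4], int32_t shift)
{
    int32_t reversed[4];
    int32_t sum = 0;
    int i;

    for (i = 0; i < 4; i += 1) {
        reversed[i] = coefficients[3 - i];   // Same lane order as produced by drflac__vrevq_s32().
    }

    for (i = 0; i < 4; i += 1) {
        sum += reversed[i] * window[i];      // Lane-wise multiply, then horizontal add.
    }

    return sum >> shift;                     // Assumes an arithmetic right shift for negative sums.
}
```

For example, coefficients {2, -1, 0, 0} with window {1, 2, 3, 4} and a shift of 0 predict 2*4 - 3 = 5.
*/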

static DRFLAC_INLINE int32x4_t drflac__vnotq_s32(int32x4_t x)
{
    return veorq_s32(x, vdupq_n_s32(0xFFFFFFFF));
}

static DRFLAC_INLINE uint32x4_t drflac__vnotq_u32(uint32x4_t x)
{
    return veorq_u32(x, vdupq_n_u32(0xFFFFFFFF));
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts[4];
    drflac_uint32 riceParamParts[4];
    int32x4_t coefficients128_0;
    int32x4_t coefficients128_4;
    int32x4_t coefficients128_8;
    int32x4_t samples128_0;
    int32x4_t samples128_4;
    int32x4_t samples128_8;
    uint32x4_t riceParamMask128;
    int32x4_t riceParam128;
    int32x2_t shift64;
    uint32x4_t one128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = vdupq_n_u32(riceParamMask);

    riceParam128 = vdupq_n_s32(riceParam);
    shift64 = vdup_n_s32(-shift); /* Negate the shift because we'll be doing a variable shift using vshl_s32(). */
    one128 = vdupq_n_u32(1);

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
    {
        int runningOrder = order;
        drflac_int32 tempC[4] = {0, 0, 0, 0};
        drflac_int32 tempS[4] = {0, 0, 0, 0};

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = vld1q_s32(coefficients + 0);
            samples128_0 = vld1q_s32(pSamplesOut - 4);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */
                case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */
                case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */
            }

            coefficients128_0 = vld1q_s32(tempC);
            samples128_0 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = vld1q_s32(coefficients + 4);
            samples128_4 = vld1q_s32(pSamplesOut - 8);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */
                case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */
                case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */
            }

            coefficients128_4 = vld1q_s32(tempC);
            samples128_4 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = vld1q_s32(coefficients + 8);
            samples128_8 = vld1q_s32(pSamplesOut - 12);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */
                case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */
                case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */
            }

            coefficients128_8 = vld1q_s32(tempC);
            samples128_8 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = drflac__vrevq_s32(coefficients128_0);
        coefficients128_4 = drflac__vrevq_s32(coefficients128_4);
        coefficients128_8 = drflac__vrevq_s32(coefficients128_8);
    }

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        int32x4_t prediction128;
        int32x2_t prediction64;
        uint32x4_t zeroCountPart128;
        uint32x4_t riceParamPart128;

        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = vld1q_u32(zeroCountParts);
        riceParamPart128 = vld1q_u32(riceParamParts);

        riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128);
        riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128));
        riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128));

        if (order <= 4) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        } else if (order <= 8) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_4, samples128_4);
                prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        } else {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_8, samples128_8);
                prediction128 = vmlaq_s32(prediction128, coefficients128_4, samples128_4);
                prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8);
                samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        }

        /* We store samples in groups of 4. */
        vst1q_s32(pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts[0] &= riceParamMask;
        riceParamParts[0] |= (zeroCountParts[0] << riceParam);
        riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts[4];
    drflac_uint32 riceParamParts[4];
    int32x4_t coefficients128_0;
    int32x4_t coefficients128_4;
    int32x4_t coefficients128_8;
    int32x4_t samples128_0;
    int32x4_t samples128_4;
    int32x4_t samples128_8;
    uint32x4_t riceParamMask128;
    int32x4_t riceParam128;
    int64x1_t shift64;
    uint32x4_t one128;
    int64x2_t prediction128 = { 0 };
    uint32x4_t zeroCountPart128;
    uint32x4_t riceParamPart128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = vdupq_n_u32(riceParamMask);

    riceParam128 = vdupq_n_s32(riceParam);
    shift64 = vdup_n_s64(-shift); /* Negate the shift because we'll be doing a variable shift using vshl_s64(). */
    one128 = vdupq_n_u32(1);

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
    {
        int runningOrder = order;
        drflac_int32 tempC[4] = {0, 0, 0, 0};
        drflac_int32 tempS[4] = {0, 0, 0, 0};

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = vld1q_s32(coefficients + 0);
            samples128_0 = vld1q_s32(pSamplesOut - 4);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */
                case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */
                case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */
            }

            coefficients128_0 = vld1q_s32(tempC);
            samples128_0 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = vld1q_s32(coefficients + 4);
            samples128_4 = vld1q_s32(pSamplesOut - 8);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */
                case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */
                case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */
            }

            coefficients128_4 = vld1q_s32(tempC);
            samples128_4 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = vld1q_s32(coefficients + 8);
            samples128_8 = vld1q_s32(pSamplesOut - 12);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */
                case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */
                case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */
            }

            coefficients128_8 = vld1q_s32(tempC);
            samples128_8 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = drflac__vrevq_s32(coefficients128_0);
        coefficients128_4 = drflac__vrevq_s32(coefficients128_4);
        coefficients128_8 = drflac__vrevq_s32(coefficients128_8);
    }

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = vld1q_u32(zeroCountParts);
        riceParamPart128 = vld1q_u32(riceParamParts);

        riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128);
        riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128));
        riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128));

        for (i = 0; i < 4; i += 1) {
            int64x1_t prediction64;

            prediction128 = veorq_s64(prediction128, prediction128); /* Reset to 0. */
            /* Each case deliberately falls through, accumulating two taps at a time. */
            switch (order)
            {
            case 12:
            case 11: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_8), vget_low_s32(samples128_8)));
            case 10:
            case 9: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_8), vget_high_s32(samples128_8)));
            case 8:
            case 7: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_4), vget_low_s32(samples128_4)));
            case 6:
            case 5: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_4), vget_high_s32(samples128_4)));
            case 4:
            case 3: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_0), vget_low_s32(samples128_0)));
            case 2:
            case 1: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_0), vget_high_s32(samples128_0)));
            }

            /* Horizontal add and shift. */
            prediction64 = drflac__vhaddq_s64(prediction128);
            prediction64 = vshl_s64(prediction64, shift64);
            prediction64 = vadd_s64(prediction64, vdup_n_s64(vgetq_lane_u32(riceParamPart128, 0)));

            /* Our value should be sitting in prediction64[0]. We need to combine this with our NEON samples. */
            samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8);
            samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
            samples128_0 = drflac__valignrq_s32_1(vcombine_s32(vreinterpret_s32_s64(prediction64), vdup_n_s32(0)), samples128_0);

            /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */
            riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
        }

        /* We store samples in groups of 4. */
        vst1q_s32(pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts[0] &= riceParamMask;
        riceParamParts[0] |= (zeroCountParts[0] << riceParam);
        riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    /* In my testing the order is rarely > 12, so in this case I'm going to simplify the NEON implementation by only handling order <= 12. */
    if (lpcOrder > 0 && lpcOrder <= 12) {
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            return drflac__decode_samples_with_residual__rice__neon_64(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        } else {
            return drflac__decode_samples_with_residual__rice__neon_32(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        }
    } else {
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    }
}
#endif

static drflac_bool32 drflac__decode_samples_with_residual__rice(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
#if defined(DRFLAC_SUPPORT_SSE41)
    if (drflac__gIsSSE41Supported) {
        return drflac__decode_samples_with_residual__rice__sse41(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported) {
        return drflac__decode_samples_with_residual__rice__neon(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    } else
#endif
    {
        /* Scalar fallback. */
    #if 0
        return drflac__decode_samples_with_residual__rice__reference(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    #else
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    #endif
    }
}

/* Reads and seeks past a string of residual values as Rice codes. The decoder should be sitting on the first bit of the Rice codes. */
static drflac_bool32 drflac__read_and_seek_residual__rice(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);

    for (i = 0; i < count; ++i) {
        if (!drflac__seek_rice_parts(bs, riceParam)) {
            return DRFLAC_FALSE;
        }
    }

    return DRFLAC_TRUE;
}

#if defined(__clang__)
__attribute__((no_sanitize("signed-integer-overflow")))
#endif
static drflac_bool32 drflac__decode_samples_with_residual__unencoded(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 unencodedBitsPerSample, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(unencodedBitsPerSample <= 31);    /* <-- unencodedBitsPerSample is a 5 bit number, so cannot exceed 31. */
    DRFLAC_ASSERT(pSamplesOut != NULL);

    for (i = 0; i < count; ++i) {
        if (unencodedBitsPerSample > 0) {
            if (!drflac__read_int32(bs, unencodedBitsPerSample, pSamplesOut + i)) {
                return DRFLAC_FALSE;
            }
        } else {
            pSamplesOut[i] = 0;
        }

        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[i] += drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        } else {
            pSamplesOut[i] += drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        }
    }

    return DRFLAC_TRUE;
}


/*
Reads and decodes the residual for the sub-frame the decoder is currently sitting on. This function should be called
when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be ignored. The
<blockSize> and <order> parameters are used to determine how many residual values need to be decoded.
*/
static drflac_bool32 drflac__decode_samples_with_residual(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 blockSize, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
{
    drflac_uint8 residualMethod;
    drflac_uint8 partitionOrder;
    drflac_uint32 samplesInPartition;
    drflac_uint32 partitionsRemaining;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(blockSize != 0);
    DRFLAC_ASSERT(pDecodedSamples != NULL);       /* <-- Should we allow NULL, in which case we just seek past the residual rather than do a full decode? */

    if (!drflac__read_uint8(bs, 2, &residualMethod)) {
        return DRFLAC_FALSE;
    }

    if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
        return DRFLAC_FALSE;    /* Unknown or unsupported residual coding method. */
    }

    /* Ignore the first <order> values. */
    pDecodedSamples += lpcOrder;

    if (!drflac__read_uint8(bs, 4, &partitionOrder)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC spec:
      The Rice partition order in a Rice-coded residual section must be less than or equal to 8.
    */
    if (partitionOrder > 8) {
        return DRFLAC_FALSE;
    }

    /* Validation check. */
    if ((blockSize / (1 << partitionOrder)) < lpcOrder) {
        return DRFLAC_FALSE;
    }

    samplesInPartition = (blockSize / (1 << partitionOrder)) - lpcOrder;
    partitionsRemaining = (1 << partitionOrder);
    for (;;) {
        drflac_uint8 riceParam = 0;
        if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) {
            if (!drflac__read_uint8(bs, 4, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 15) {
                riceParam = 0xFF;
            }
        } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
            if (!drflac__read_uint8(bs, 5, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 31) {
                riceParam = 0xFF;
            }
        }

        if (riceParam != 0xFF) {
            if (!drflac__decode_samples_with_residual__rice(bs, bitsPerSample, samplesInPartition, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
                return DRFLAC_FALSE;
            }
        } else {
            drflac_uint8 unencodedBitsPerSample = 0;
            if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__decode_samples_with_residual__unencoded(bs, bitsPerSample, samplesInPartition, unencodedBitsPerSample, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
                return DRFLAC_FALSE;
            }
        }

        pDecodedSamples += samplesInPartition;

        if (partitionsRemaining == 1) {
            break;
        }

        partitionsRemaining -= 1;

        if (partitionOrder != 0) {
            samplesInPartition = blockSize / (1 << partitionOrder);
        }
    }

    return DRFLAC_TRUE;
}
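
/*
Illustrative sketch, not part of dr_flac. The partition loop above gives every partition blockSize / 2^partitionOrder
residuals, except the first, which is shortened by the predictor order because the warm-up samples carry no residual.
The helper name and its out-parameter are hypothetical. It mirrors the "<" validation check of the decoding path
above (the seeking variant of this function uses "<=" instead).
*/
```c
#include <stdint.h>

/* Hypothetical helper: writes the residual count of partition `partitionIndex` to *pCount.
   Returns 1 on success, 0 when the layout would be rejected. */
static int example_residuals_in_partition(uint32_t blockSize, uint32_t partitionOrder, uint32_t predictorOrder, uint32_t partitionIndex, uint32_t* pCount)
{
    uint32_t partitionCount;
    uint32_t samplesPerPartition;

    if (partitionOrder > 8) {
        return 0;   /* Same upper bound as the check on partitionOrder above. */
    }

    partitionCount      = 1u << partitionOrder;
    samplesPerPartition = blockSize / partitionCount;

    if (partitionIndex >= partitionCount || samplesPerPartition < predictorOrder) {
        return 0;   /* No such partition, or the first partition cannot hold the warm-up samples. */
    }

    /* Only the first partition is shortened by the predictor order. */
    *pCount = (partitionIndex == 0) ? (samplesPerPartition - predictorOrder) : samplesPerPartition;
    return 1;
}
```
/* For example, a 4096-sample block with partitionOrder 2 and predictor order 8 yields 1016 residuals, then 1024 for each later partition. */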

/*
Reads and seeks past the residual for the sub-frame the decoder is currently sitting on. This function should be called
when the decoder is sitting at the very start of the RESIDUAL block. No samples are written. The first partition holds
<order> fewer residuals than the others because the warm-up samples have no residual. The <blockSize> and <order>
parameters are used to determine how many residual values need to be skipped.
*/
static drflac_bool32 drflac__read_and_seek_residual(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 order)
{
    drflac_uint8 residualMethod;
    drflac_uint8 partitionOrder;
    drflac_uint32 samplesInPartition;
    drflac_uint32 partitionsRemaining;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(blockSize != 0);

    if (!drflac__read_uint8(bs, 2, &residualMethod)) {
        return DRFLAC_FALSE;
    }

    if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
        return DRFLAC_FALSE;    /* Unknown or unsupported residual coding method. */
    }

    if (!drflac__read_uint8(bs, 4, &partitionOrder)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC spec:
      The Rice partition order in a Rice-coded residual section must be less than or equal to 8.
    */
    if (partitionOrder > 8) {
        return DRFLAC_FALSE;
    }

    /* Validation check. */
    if ((blockSize / (1 << partitionOrder)) <= order) {
        return DRFLAC_FALSE;
    }

    samplesInPartition = (blockSize / (1 << partitionOrder)) - order;
    partitionsRemaining = (1 << partitionOrder);
    for (;;)
    {
        drflac_uint8 riceParam = 0;
        if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) {
            if (!drflac__read_uint8(bs, 4, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 15) {
                riceParam = 0xFF;
            }
        } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
            if (!drflac__read_uint8(bs, 5, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 31) {
                riceParam = 0xFF;
            }
        }

        if (riceParam != 0xFF) {
            if (!drflac__read_and_seek_residual__rice(bs, samplesInPartition, riceParam)) {
                return DRFLAC_FALSE;
            }
        } else {
            drflac_uint8 unencodedBitsPerSample = 0;
            if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__seek_bits(bs, unencodedBitsPerSample * samplesInPartition)) {
                return DRFLAC_FALSE;
            }
        }


        if (partitionsRemaining == 1) {
            break;
        }

        partitionsRemaining -= 1;
        samplesInPartition = blockSize / (1 << partitionOrder);
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__decode_samples__constant(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    /* Only a single sample needs to be decoded here. */
    drflac_int32 sample;
    if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
        return DRFLAC_FALSE;
    }

    /*
    We don't really need to expand this, but it does simplify the process of reading samples. If this becomes a performance issue (unlikely)
    we'll want to look at a more efficient way.
    */
    for (i = 0; i < blockSize; ++i) {
        pDecodedSamples[i] = sample;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples__verbatim(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    for (i = 0; i < blockSize; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples__fixed(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    static drflac_int32 lpcCoefficientsTable[5][4] = {
        {0,  0, 0,  0},
        {1,  0, 0,  0},
        {2, -1, 0,  0},
        {3, -3, 1,  0},
        {4, -6, 4, -1}
    };

    /* Warm up samples and coefficients. */
    for (i = 0; i < lpcOrder; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    if (!drflac__decode_samples_with_residual(bs, subframeBitsPerSample, blockSize, lpcOrder, 0, 4, lpcCoefficientsTable[lpcOrder], pDecodedSamples)) {
        return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}
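
/*
Illustrative sketch, not part of dr_flac. It shows how the fixed-predictor coefficient table above restores samples
from the warm-up samples and the decoded residuals. The function name, its parameters and the flat residual array
are hypothetical. The real decoder interleaves this step with bitstream reading and may use a 32-bit accumulator.
*/
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: samples[0..order-1] hold the warm-up samples on entry and
   residuals[k] belongs to samples[order + k]. Orders above 4 are ignored. */
static void example_restore_fixed(int32_t* samples, size_t count, unsigned order, const int32_t* residuals)
{
    /* Same rows as lpcCoefficientsTable: row N holds the order-N predictor, newest sample first. */
    static const int32_t table[5][4] = {
        {0,  0, 0,  0},
        {1,  0, 0,  0},
        {2, -1, 0,  0},
        {3, -3, 1,  0},
        {4, -6, 4, -1}
    };
    size_t i;
    unsigned j;

    if (order > 4) {
        return;
    }

    for (i = order; i < count; ++i) {
        int64_t prediction = 0;
        for (j = 0; j < order; ++j) {
            prediction += (int64_t)table[order][j] * samples[i - 1 - j];
        }

        /* The shift is 0 for fixed predictors, so the prediction is used as-is. */
        samples[i] = (int32_t)(prediction + residuals[i - order]);
    }
}
```
/* With order 2, warm-up samples {1, 3} and residuals {0, 0, 1}, the restored samples are {1, 3, 5, 7, 10}. */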

static drflac_bool32 drflac__decode_samples__lpc(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 bitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples)
{
    drflac_uint8 i;
    drflac_uint8 lpcPrecision;
    drflac_int8 lpcShift;
    drflac_int32 coefficients[32];

    /* Warm up samples. */
    for (i = 0; i < lpcOrder; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, bitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    if (!drflac__read_uint8(bs, 4, &lpcPrecision)) {
        return DRFLAC_FALSE;
    }
    if (lpcPrecision == 15) {
        return DRFLAC_FALSE;    /* Invalid. */
    }
    lpcPrecision += 1;

    if (!drflac__read_int8(bs, 5, &lpcShift)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC specification:

        Quantized linear predictor coefficient shift needed in bits (NOTE: this number is signed two's-complement)

    Emphasis on the "signed two's-complement". In practice there do not seem to be any encoders or decoders supporting negative shifts. For now dr_flac is
    not going to support negative shifts as I don't have any reference files. However, when a reference file comes through I will consider adding support.
    */
    if (lpcShift < 0) {
        return DRFLAC_FALSE;
    }

    DRFLAC_ZERO_MEMORY(coefficients, sizeof(coefficients));
    for (i = 0; i < lpcOrder; ++i) {
        if (!drflac__read_int32(bs, lpcPrecision, coefficients + i)) {
            return DRFLAC_FALSE;
        }
    }

    if (!drflac__decode_samples_with_residual(bs, bitsPerSample, blockSize, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
        return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__read_next_flac_frame_header(drflac_bs* bs, drflac_uint8 streaminfoBitsPerSample, drflac_frame_header* header)
{
    const drflac_uint32 sampleRateTable[12]  = {0, 88200, 176400, 192000, 8000, 16000, 22050, 24000, 32000, 44100, 48000, 96000};
    const drflac_uint8 bitsPerSampleTable[8] = {0, 8, 12, (drflac_uint8)-1, 16, 20, 24, (drflac_uint8)-1};   /* -1 = reserved. */

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(header != NULL);

    /* Keep looping until we find a valid sync code. */
    for (;;) {
        drflac_uint8 crc8 = 0xCE; /* 0xCE = drflac_crc8(0, 0x3FFE, 14); */
        drflac_uint8 reserved = 0;
        drflac_uint8 blockingStrategy = 0;
        drflac_uint8 blockSize = 0;
        drflac_uint8 sampleRate = 0;
        drflac_uint8 channelAssignment = 0;
        drflac_uint8 bitsPerSample = 0;
        drflac_bool32 isVariableBlockSize;

        if (!drflac__find_and_seek_to_next_sync_code(bs)) {
            return DRFLAC_FALSE;
        }

        if (!drflac__read_uint8(bs, 1, &reserved)) {
            return DRFLAC_FALSE;
        }
        if (reserved == 1) {
            continue;
        }
        crc8 = drflac_crc8(crc8, reserved, 1);

        if (!drflac__read_uint8(bs, 1, &blockingStrategy)) {
            return DRFLAC_FALSE;
        }
        crc8 = drflac_crc8(crc8, blockingStrategy, 1);

        if (!drflac__read_uint8(bs, 4, &blockSize)) {
            return DRFLAC_FALSE;
        }
        if (blockSize == 0) {
            continue;
        }
        crc8 = drflac_crc8(crc8, blockSize, 4);

        if (!drflac__read_uint8(bs, 4, &sampleRate)) {
            return DRFLAC_FALSE;
        }
        crc8 = drflac_crc8(crc8, sampleRate, 4);

        if (!drflac__read_uint8(bs, 4, &channelAssignment)) {
            return DRFLAC_FALSE;
        }
        if (channelAssignment > 10) {
            continue;
        }
        crc8 = drflac_crc8(crc8, channelAssignment, 4);

        if (!drflac__read_uint8(bs, 3, &bitsPerSample)) {
            return DRFLAC_FALSE;
        }
        if (bitsPerSample == 3 || bitsPerSample == 7) {
            continue;
        }
        crc8 = drflac_crc8(crc8, bitsPerSample, 3);


        if (!drflac__read_uint8(bs, 1, &reserved)) {
            return DRFLAC_FALSE;
        }
        if (reserved == 1) {
            continue;
        }
        crc8 = drflac_crc8(crc8, reserved, 1);


        isVariableBlockSize = blockingStrategy == 1;
        if (isVariableBlockSize) {
            drflac_uint64 pcmFrameNumber;
            drflac_result result = drflac__read_utf8_coded_number(bs, &pcmFrameNumber, &crc8);
            if (result != DRFLAC_SUCCESS) {
                if (result == DRFLAC_AT_END) {
                    return DRFLAC_FALSE;
                } else {
                    continue;
                }
            }
            header->flacFrameNumber = 0;
            header->pcmFrameNumber  = pcmFrameNumber;
        } else {
            drflac_uint64 flacFrameNumber = 0;
            drflac_result result = drflac__read_utf8_coded_number(bs, &flacFrameNumber, &crc8);
            if (result != DRFLAC_SUCCESS) {
                if (result == DRFLAC_AT_END) {
                    return DRFLAC_FALSE;
                } else {
                    continue;
                }
            }
            header->flacFrameNumber = (drflac_uint32)flacFrameNumber;   /* <-- Safe cast. */
            header->pcmFrameNumber  = 0;
        }


        DRFLAC_ASSERT(blockSize > 0);
        if (blockSize == 1) {
            header->blockSizeInPCMFrames = 192;
        } else if (blockSize <= 5) {
            DRFLAC_ASSERT(blockSize >= 2);
            header->blockSizeInPCMFrames = 576 * (1 << (blockSize - 2));
        } else if (blockSize == 6) {
            if (!drflac__read_uint16(bs, 8, &header->blockSizeInPCMFrames)) {
                return DRFLAC_FALSE;
            }
            crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 8);
            header->blockSizeInPCMFrames += 1;
        } else if (blockSize == 7) {
            if (!drflac__read_uint16(bs, 16, &header->blockSizeInPCMFrames)) {
                return DRFLAC_FALSE;
            }
            crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 16);
            if (header->blockSizeInPCMFrames == 0xFFFF) {
                return DRFLAC_FALSE;    /* Frame is too big. This is the size of the frame minus 1. The STREAMINFO block defines the max block size which is 16-bits. Adding one will make it 17 bits and therefore too big. */
            }
            header->blockSizeInPCMFrames += 1;
        } else {
            DRFLAC_ASSERT(blockSize >= 8);
            header->blockSizeInPCMFrames = 256 * (1 << (blockSize - 8));
        }


        if (sampleRate <= 11) {
            header->sampleRate = sampleRateTable[sampleRate];
        } else if (sampleRate == 12) {
            if (!drflac__read_uint32(bs, 8, &header->sampleRate)) {
                return DRFLAC_FALSE;
            }
            crc8 = drflac_crc8(crc8, header->sampleRate, 8);
            header->sampleRate *= 1000;
        } else if (sampleRate == 13) {
            if (!drflac__read_uint32(bs, 16, &header->sampleRate)) {
                return DRFLAC_FALSE;
            }
            crc8 = drflac_crc8(crc8, header->sampleRate, 16);
        } else if (sampleRate == 14) {
            if (!drflac__read_uint32(bs, 16, &header->sampleRate)) {
                return DRFLAC_FALSE;
            }
            crc8 = drflac_crc8(crc8, header->sampleRate, 16);
            header->sampleRate *= 10;
        } else {
            continue;  /* Invalid. Assume an invalid block. */
        }


        header->channelAssignment = channelAssignment;

        header->bitsPerSample = bitsPerSampleTable[bitsPerSample];
        if (header->bitsPerSample == 0) {
            header->bitsPerSample = streaminfoBitsPerSample;
        }

        if (header->bitsPerSample != streaminfoBitsPerSample) {
            /* If this frame has a different bitsPerSample than STREAMINFO or the first frame, reject it. */
            return DRFLAC_FALSE;
        }

        if (!drflac__read_uint8(bs, 8, &header->crc8)) {
            return DRFLAC_FALSE;
        }

#ifndef DR_FLAC_NO_CRC
        if (header->crc8 != crc8) {
            continue;    /* CRC mismatch. Loop back to the top and find the next sync code. */
        }
#endif
        return DRFLAC_TRUE;
    }
}
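
/*
Illustrative sketch, not part of dr_flac. It isolates the mapping from the 4-bit block size code in the frame header
to a block size in PCM frames, as done in the function above. The function name and the `extra` parameter are
hypothetical: `extra` stands in for the 8-bit or 16-bit value the real decoder reads from the bitstream for codes
6 and 7, and it is ignored for every other code.
*/
```c
#include <stdint.h>

/* Hypothetical helper: returns 1 and writes *pBlockSize on success, 0 for reserved or oversized values. */
static int example_block_size_from_code(uint8_t code, uint16_t extra, uint32_t* pBlockSize)
{
    if (code == 0 || code > 15) {
        return 0;                               /* Code 0 is reserved; the field is only 4 bits wide. */
    } else if (code == 1) {
        *pBlockSize = 192;
    } else if (code <= 5) {
        *pBlockSize = 576u << (code - 2);       /* 576, 1152, 2304, 4608 */
    } else if (code == 6) {
        *pBlockSize = (uint32_t)(extra & 0xFF) + 1;
    } else if (code == 7) {
        if (extra == 0xFFFF) {
            return 0;                           /* 65536 does not fit the 16-bit maximum block size. */
        }
        *pBlockSize = (uint32_t)extra + 1;
    } else {
        *pBlockSize = 256u << (code - 8);       /* 256, 512, ..., 32768 */
    }

    return 1;
}
```
/* Codes 6 and 7 store the block size minus one, which is why 1 is added after reading the extra bits. */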

static drflac_bool32 drflac__read_subframe_header(drflac_bs* bs, drflac_subframe* pSubframe)
{
    drflac_uint8 header;
    int type;

    if (!drflac__read_uint8(bs, 8, &header)) {
        return DRFLAC_FALSE;
    }

    /* First bit should always be 0. */
    if ((header & 0x80) != 0) {
        return DRFLAC_FALSE;
    }

    type = (header & 0x7E) >> 1;
    if (type == 0) {
        pSubframe->subframeType = DRFLAC_SUBFRAME_CONSTANT;
    } else if (type == 1) {
        pSubframe->subframeType = DRFLAC_SUBFRAME_VERBATIM;
    } else {
        if ((type & 0x20) != 0) {
            pSubframe->subframeType = DRFLAC_SUBFRAME_LPC;
            pSubframe->lpcOrder = (drflac_uint8)(type & 0x1F) + 1;
        } else if ((type & 0x08) != 0) {
            pSubframe->subframeType = DRFLAC_SUBFRAME_FIXED;
            pSubframe->lpcOrder = (drflac_uint8)(type & 0x07);
            if (pSubframe->lpcOrder > 4) {
                pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED;
                pSubframe->lpcOrder = 0;
            }
        } else {
            pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED;
        }
    }

    if (pSubframe->subframeType == DRFLAC_SUBFRAME_RESERVED) {
        return DRFLAC_FALSE;
    }

    /* Wasted bits per sample. */
    pSubframe->wastedBitsPerSample = 0;
    if ((header & 0x01) == 1) {
        unsigned int wastedBitsPerSample;
        if (!drflac__seek_past_next_set_bit(bs, &wastedBitsPerSample)) {
            return DRFLAC_FALSE;
        }
        pSubframe->wastedBitsPerSample = (drflac_uint8)wastedBitsPerSample + 1;
    }

    return DRFLAC_TRUE;
}
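
/*
Illustrative sketch, not part of dr_flac. It repeats the subframe header classification above as a pure function over
the header byte, which makes the bit layout easier to test: bit 7 is padding and must be 0, bits 6..1 hold the
subframe type, and bit 0 flags wasted bits. The enum and function names are hypothetical.
*/
```c
#include <stdint.h>

typedef enum
{
    EXAMPLE_SUBFRAME_CONSTANT,
    EXAMPLE_SUBFRAME_VERBATIM,
    EXAMPLE_SUBFRAME_FIXED,
    EXAMPLE_SUBFRAME_LPC,
    EXAMPLE_SUBFRAME_RESERVED
} example_subframe_type;

/* Hypothetical helper: classifies the header byte and writes the predictor order (0 when not applicable). */
static example_subframe_type example_classify_subframe(uint8_t header, uint8_t* pOrder)
{
    int type = (header & 0x7E) >> 1;

    *pOrder = 0;

    if ((header & 0x80) != 0) {
        return EXAMPLE_SUBFRAME_RESERVED;       /* The padding bit must be 0. */
    }
    if (type == 0) {
        return EXAMPLE_SUBFRAME_CONSTANT;
    }
    if (type == 1) {
        return EXAMPLE_SUBFRAME_VERBATIM;
    }
    if ((type & 0x20) != 0) {
        *pOrder = (uint8_t)((type & 0x1F) + 1); /* LPC orders 1..32. */
        return EXAMPLE_SUBFRAME_LPC;
    }
    if ((type & 0x08) != 0 && (type & 0x07) <= 4) {
        *pOrder = (uint8_t)(type & 0x07);       /* Fixed predictor orders 0..4. */
        return EXAMPLE_SUBFRAME_FIXED;
    }

    return EXAMPLE_SUBFRAME_RESERVED;
}
```
/* For example, header byte 0x4E has type 0x27, which is an LPC subframe of order 8, and 0x14 is a fixed subframe of order 2. */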

static drflac_bool32 drflac__decode_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex, drflac_int32* pDecodedSamplesOut)
{
    drflac_subframe* pSubframe;
    drflac_uint32 subframeBitsPerSample;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(frame != NULL);

    pSubframe = frame->subframes + subframeIndex;
    if (!drflac__read_subframe_header(bs, pSubframe)) {
        return DRFLAC_FALSE;
    }

    /* Side channels require an extra bit per sample. Took a while to figure that one out... */
    subframeBitsPerSample = frame->header.bitsPerSample;
    if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) {
        subframeBitsPerSample += 1;
    } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) {
        subframeBitsPerSample += 1;
    }

    if (subframeBitsPerSample > 32) {
        /* libFLAC and ffmpeg reject 33-bit subframes as well */
        return DRFLAC_FALSE;
    }

    /* Need to handle wasted bits per sample. */
    if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) {
        return DRFLAC_FALSE;
    }
    subframeBitsPerSample -= pSubframe->wastedBitsPerSample;

    pSubframe->pSamplesS32 = pDecodedSamplesOut;

    switch (pSubframe->subframeType)
    {
        case DRFLAC_SUBFRAME_CONSTANT:
        {
            drflac__decode_samples__constant(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32);
        } break;

        case DRFLAC_SUBFRAME_VERBATIM:
        {
            drflac__decode_samples__verbatim(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32);
        } break;

        case DRFLAC_SUBFRAME_FIXED:
        {
            drflac__decode_samples__fixed(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32);
        } break;

        case DRFLAC_SUBFRAME_LPC:
        {
            drflac__decode_samples__lpc(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32);
        } break;

        default: return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}
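
/*
Illustrative sketch, not part of dr_flac. It isolates the "side channels require an extra bit" rule used above: the
side channel stores a difference of two samples, so it needs one more bit than the frame's nominal bit depth. The
function name and the EXAMPLE_* constants are hypothetical. Their values (8, 9, 10) are assumed from the FLAC frame
header channel assignment codes and are not shown in this part of the file.
*/
```c
#include <stdint.h>

#define EXAMPLE_CHANNEL_ASSIGNMENT_LEFT_SIDE    8   /* Subframe 0 = left,  subframe 1 = side.  */
#define EXAMPLE_CHANNEL_ASSIGNMENT_RIGHT_SIDE   9   /* Subframe 0 = side,  subframe 1 = right. */
#define EXAMPLE_CHANNEL_ASSIGNMENT_MID_SIDE     10  /* Subframe 0 = mid,   subframe 1 = side.  */

/* Hypothetical helper: returns the bit depth of one subframe before wasted bits are subtracted. */
static uint32_t example_subframe_bits(uint32_t frameBitsPerSample, int channelAssignment, int subframeIndex)
{
    int isSide = 0;

    if (channelAssignment == EXAMPLE_CHANNEL_ASSIGNMENT_LEFT_SIDE || channelAssignment == EXAMPLE_CHANNEL_ASSIGNMENT_MID_SIDE) {
        isSide = (subframeIndex == 1);
    } else if (channelAssignment == EXAMPLE_CHANNEL_ASSIGNMENT_RIGHT_SIDE) {
        isSide = (subframeIndex == 0);
    }

    /* Independent channels and non-side subframes keep the frame's bit depth. */
    return frameBitsPerSample + (isSide ? 1u : 0u);
}
```
/* A 32-bit stream with a side channel would need 33 bits here, which is the case the function above rejects. */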

static drflac_bool32 drflac__seek_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex)
{
    drflac_subframe* pSubframe;
    drflac_uint32 subframeBitsPerSample;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(frame != NULL);

    pSubframe = frame->subframes + subframeIndex;
    if (!drflac__read_subframe_header(bs, pSubframe)) {
        return DRFLAC_FALSE;
    }

    /* Side channels require an extra bit per sample. Took a while to figure that one out... */
    subframeBitsPerSample = frame->header.bitsPerSample;
    if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) {
        subframeBitsPerSample += 1;
    } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) {
        subframeBitsPerSample += 1;
    }

    /* Need to handle wasted bits per sample. */
    if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) {
        return DRFLAC_FALSE;
    }
    subframeBitsPerSample -= pSubframe->wastedBitsPerSample;

    pSubframe->pSamplesS32 = NULL;

    switch (pSubframe->subframeType)
    {
        case DRFLAC_SUBFRAME_CONSTANT:
        {
            if (!drflac__seek_bits(bs, subframeBitsPerSample)) {
                return DRFLAC_FALSE;
            }
        } break;

        case DRFLAC_SUBFRAME_VERBATIM:
        {
            unsigned int bitsToSeek = frame->header.blockSizeInPCMFrames * subframeBitsPerSample;
            if (!drflac__seek_bits(bs, bitsToSeek)) {
                return DRFLAC_FALSE;
            }
        } break;

        case DRFLAC_SUBFRAME_FIXED:
        {
            unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample;
            if (!drflac__seek_bits(bs, bitsToSeek)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) {
                return DRFLAC_FALSE;
            }
        } break;

        case DRFLAC_SUBFRAME_LPC:
        {
            drflac_uint8 lpcPrecision;

            unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample;
            if (!drflac__seek_bits(bs, bitsToSeek)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__read_uint8(bs, 4, &lpcPrecision)) {
                return DRFLAC_FALSE;
            }
            if (lpcPrecision == 15) {
                return DRFLAC_FALSE;    /* Invalid. */
            }
            lpcPrecision += 1;


            bitsToSeek = (pSubframe->lpcOrder * lpcPrecision) + 5;    /* +5 for shift. */
            if (!drflac__seek_bits(bs, bitsToSeek)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) {
                return DRFLAC_FALSE;
            }
        } break;

        default: return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}


static DRFLAC_INLINE drflac_uint8 drflac__get_channel_count_from_channel_assignment(drflac_int8 channelAssignment)
{
    drflac_uint8 lookup[] = {1, 2, 3, 4, 5, 6, 7, 8, 2, 2, 2};

    DRFLAC_ASSERT(channelAssignment <= 10);
    return lookup[channelAssignment];
}

static drflac_result drflac__decode_flac_frame(drflac* pFlac)
{
    int channelCount;
    int i;
    drflac_uint8 paddingSizeInBits;
    drflac_uint16 desiredCRC16;
#ifndef DR_FLAC_NO_CRC
    drflac_uint16 actualCRC16;
#endif

    /* This function should be called while the stream is sitting on the first byte after the frame header. */
    DRFLAC_ZERO_MEMORY(pFlac->currentFLACFrame.subframes, sizeof(pFlac->currentFLACFrame.subframes));

    /* The frame block size must never be larger than the maximum block size defined by the FLAC stream. */
    if (pFlac->currentFLACFrame.header.blockSizeInPCMFrames > pFlac->maxBlockSizeInPCMFrames) {
        return DRFLAC_ERROR;
    }

    /* The number of channels in the frame must match the channel count from the STREAMINFO block. */
    channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
    if (channelCount != (int)pFlac->channels) {
        return DRFLAC_ERROR;
    }

    for (i = 0; i < channelCount; ++i) {
        if (!drflac__decode_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i, pFlac->pDecodedSamples + (pFlac->currentFLACFrame.header.blockSizeInPCMFrames * i))) {
            return DRFLAC_ERROR;
        }
    }

    paddingSizeInBits = (drflac_uint8)(DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7);
    if (paddingSizeInBits > 0) {
        drflac_uint8 padding = 0;
        if (!drflac__read_uint8(&pFlac->bs, paddingSizeInBits, &padding)) {
            return DRFLAC_AT_END;
        }
    }

#ifndef DR_FLAC_NO_CRC
    actualCRC16 = drflac__flush_crc16(&pFlac->bs);
#endif
    if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) {
        return DRFLAC_AT_END;
    }

#ifndef DR_FLAC_NO_CRC
    if (actualCRC16 != desiredCRC16) {
        return DRFLAC_CRC_MISMATCH;    /* CRC mismatch. */
    }
#endif

    pFlac->currentFLACFrame.pcmFramesRemaining = pFlac->currentFLACFrame.header.blockSizeInPCMFrames;

    return DRFLAC_SUCCESS;
}

static drflac_result drflac__seek_flac_frame(drflac* pFlac)
{
    int channelCount;
    int i;
    drflac_uint16 desiredCRC16;
#ifndef DR_FLAC_NO_CRC
    drflac_uint16 actualCRC16;
#endif

    channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
    for (i = 0; i < channelCount; ++i) {
        if (!drflac__seek_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i)) {
            return DRFLAC_ERROR;
        }
    }

    /* Padding. */
    if (!drflac__seek_bits(&pFlac->bs, DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7)) {
        return DRFLAC_ERROR;
    }

    /* CRC. */
#ifndef DR_FLAC_NO_CRC
    actualCRC16 = drflac__flush_crc16(&pFlac->bs);
#endif
    if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) {
        return DRFLAC_AT_END;
    }

#ifndef DR_FLAC_NO_CRC
    if (actualCRC16 != desiredCRC16) {
        return DRFLAC_CRC_MISMATCH;    /* CRC mismatch. */
    }
#endif

    return DRFLAC_SUCCESS;
}

static drflac_bool32 drflac__read_and_decode_next_flac_frame(drflac* pFlac)
{
    DRFLAC_ASSERT(pFlac != NULL);

    for (;;) {
        drflac_result result;

        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }

        result = drflac__decode_flac_frame(pFlac);
        if (result != DRFLAC_SUCCESS) {
            if (result == DRFLAC_CRC_MISMATCH) {
                continue;   /* CRC mismatch. Skip to the next frame. */
            } else {
                return DRFLAC_FALSE;
            }
        }

        return DRFLAC_TRUE;
    }
}

static void drflac__get_pcm_frame_range_of_current_flac_frame(drflac* pFlac, drflac_uint64* pFirstPCMFrame, drflac_uint64* pLastPCMFrame)
{
    drflac_uint64 firstPCMFrame;
    drflac_uint64 lastPCMFrame;

    DRFLAC_ASSERT(pFlac != NULL);

    firstPCMFrame = pFlac->currentFLACFrame.header.pcmFrameNumber;
    if (firstPCMFrame == 0) {
        firstPCMFrame = ((drflac_uint64)pFlac->currentFLACFrame.header.flacFrameNumber) * pFlac->maxBlockSizeInPCMFrames;
    }

    lastPCMFrame = firstPCMFrame + pFlac->currentFLACFrame.header.blockSizeInPCMFrames;
    if (lastPCMFrame > 0) {
        lastPCMFrame -= 1; /* Needs to be zero based. */
    }

    if (pFirstPCMFrame) {
        *pFirstPCMFrame = firstPCMFrame;
    }
    if (pLastPCMFrame) {
        *pLastPCMFrame = lastPCMFrame;
    }
}

static drflac_bool32 drflac__seek_to_first_frame(drflac* pFlac)
{
    drflac_bool32 result;

    DRFLAC_ASSERT(pFlac != NULL);

    result = drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes);

    DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame));
    pFlac->currentPCMFrame = 0;

    return result;
}

static DRFLAC_INLINE drflac_result drflac__seek_to_next_flac_frame(drflac* pFlac)
{
    /* This function should only ever be called while the decoder is sitting on the first byte past the FRAME_HEADER section. */
    DRFLAC_ASSERT(pFlac != NULL);
    return drflac__seek_flac_frame(pFlac);
}


static drflac_uint64 drflac__seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 pcmFramesToSeek)
{
    drflac_uint64 pcmFramesRead = 0;
    while (pcmFramesToSeek > 0) {
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return. */
            }
        } else {
            if (pFlac->currentFLACFrame.pcmFramesRemaining > pcmFramesToSeek) {
                pcmFramesRead   += pcmFramesToSeek;
                pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)pcmFramesToSeek;   /* <-- Safe cast. Will always be < currentFrame.pcmFramesRemaining < 65536. */
                pcmFramesToSeek  = 0;
            } else {
                pcmFramesRead   += pFlac->currentFLACFrame.pcmFramesRemaining;
                pcmFramesToSeek -= pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
            }
        }
    }

    pFlac->currentPCMFrame += pcmFramesRead;
    return pcmFramesRead;
}


static drflac_bool32 drflac__seek_to_pcm_frame__brute_force(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_bool32 isMidFrame = DRFLAC_FALSE;
    drflac_uint64 runningPCMFrameCount;

    DRFLAC_ASSERT(pFlac != NULL);

    /* If we are seeking forward we start from the current position. Otherwise we need to start all the way from the start of the file. */
    if (pcmFrameIndex >= pFlac->currentPCMFrame) {
        /* Seeking forward. Need to seek from the current position. */
        runningPCMFrameCount = pFlac->currentPCMFrame;

        /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */
        if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                return DRFLAC_FALSE;
            }
        } else {
            isMidFrame = DRFLAC_TRUE;
        }
    } else {
        /* Seeking backwards. Need to seek from the start of the file. */
        runningPCMFrameCount = 0;

        /* Move back to the start. */
        if (!drflac__seek_to_first_frame(pFlac)) {
            return DRFLAC_FALSE;
        }

        /* Read the header of the first frame in preparation for sample-exact seeking below. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }

    /*
    We need to find the frame that contains the target sample as quickly as possible. To do this, we iterate over each frame and inspect its
    header. If based on the header we can determine that the frame contains the sample, we do a full decode of that frame.
    */
    for (;;) {
        drflac_uint64 pcmFrameCountInThisFLACFrame;
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;
        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) {
            /*
            The sample should be in this frame. We need to fully decode it, but if it's an invalid frame (a CRC mismatch) we need to pretend
            it never existed and keep iterating.
            */
            drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount;

            if (!isMidFrame) {
                drflac_result result = drflac__decode_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */
                    return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* We started seeking mid-frame which means we need to skip the frame decoding part. */
                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this
            frame never existed and leave the running sample count untouched.
            */
            if (!isMidFrame) {
                drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    runningPCMFrameCount += pcmFrameCountInThisFLACFrame;
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /*
                We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with
                drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header.
                */
                runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                isMidFrame = DRFLAC_FALSE;
            }

            /* If we are seeking to the end of the file and we've just hit it, we're done. */
            if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) {
                return DRFLAC_TRUE;
            }
        }

    next_iteration:
        /* Grab the next frame in preparation for the next iteration. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }
}


#if !defined(DR_FLAC_NO_CRC)
/*
We use an average compression ratio to determine our approximate start location. FLAC files are generally about 50%-70% the size of their
uncompressed counterparts so we'll use this as a basis. I'm going to split the middle and use a factor of 0.6 to determine the starting
location.
*/
#define DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO 0.6f

static drflac_bool32 drflac__seek_to_approximate_flac_frame_to_byte(drflac* pFlac, drflac_uint64 targetByte, drflac_uint64 rangeLo, drflac_uint64 rangeHi, drflac_uint64* pLastSuccessfulSeekOffset)
{
    DRFLAC_ASSERT(pFlac != NULL);
    DRFLAC_ASSERT(pLastSuccessfulSeekOffset != NULL);
    DRFLAC_ASSERT(targetByte >= rangeLo);
    DRFLAC_ASSERT(targetByte <= rangeHi);

    *pLastSuccessfulSeekOffset = pFlac->firstFLACFramePosInBytes;

    for (;;) {
        /* After rangeLo == rangeHi == targetByte fails, we need to break out. */
        drflac_uint64 lastTargetByte = targetByte;

        /* When seeking to a byte, failure probably means we've attempted to seek beyond the end of the stream. To counter this we just halve it each attempt. */
        if (!drflac__seek_to_byte(&pFlac->bs, targetByte)) {
            /* If we couldn't even seek to the first byte in the stream we have a problem. Just abandon the whole thing. */
            if (targetByte == 0) {
                drflac__seek_to_first_frame(pFlac); /* Try to recover. */
                return DRFLAC_FALSE;
            }

            /* Halve the byte location and continue. */
            targetByte = rangeLo + ((rangeHi - rangeLo)/2);
            rangeHi = targetByte;
        } else {
            /* Getting here should mean that we have seeked to an appropriate byte. */

            /* Clear the details of the FLAC frame so we don't misreport data. */
            DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame));

            /*
            Now seek to the next FLAC frame. We need to decode the entire frame (not just the header) because it's possible for the header to incorrectly pass the
            CRC check and return bad data. We need to decode the entire frame to be more certain. Although this seems unlikely, this has happened to me in testing
            so it needs to stay this way for now.
            */
#if 1
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                /* Halve the byte location and continue. */
                targetByte = rangeLo + ((rangeHi - rangeLo)/2);
                rangeHi = targetByte;
            } else {
                break;
            }
#else
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                /* Halve the byte location and continue. */
                targetByte = rangeLo + ((rangeHi - rangeLo)/2);
                rangeHi = targetByte;
            } else {
                break;
            }
#endif
        }

        /* We already tried this byte and there are no more to try, break out. */
        if (targetByte == lastTargetByte) {
            return DRFLAC_FALSE;
        }
    }

    /* The current PCM frame needs to be updated based on the frame we just seeked to. */
    drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL);

    DRFLAC_ASSERT(targetByte <= rangeHi);

    *pLastSuccessfulSeekOffset = targetByte;
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 offset)
{
    /* This section of code would be used if we were only decoding the FLAC frame header when calling drflac__seek_to_approximate_flac_frame_to_byte(). */
#if 0
    if (drflac__decode_flac_frame(pFlac) != DRFLAC_SUCCESS) {
        /* We failed to decode this frame which may be due to it being corrupt. We'll just use the next valid FLAC frame. */
        if (drflac__read_and_decode_next_flac_frame(pFlac) == DRFLAC_FALSE) {
            return DRFLAC_FALSE;
        }
    }
#endif

    return drflac__seek_forward_by_pcm_frames(pFlac, offset) == offset;
}


static drflac_bool32 drflac__seek_to_pcm_frame__binary_search_internal(drflac* pFlac, drflac_uint64 pcmFrameIndex, drflac_uint64 byteRangeLo, drflac_uint64 byteRangeHi)
{
    /* This assumes pFlac->currentPCMFrame is sitting on byteRangeLo upon entry. */

    drflac_uint64 targetByte;
    drflac_uint64 pcmRangeLo = pFlac->totalPCMFrameCount;
    drflac_uint64 pcmRangeHi = 0;
    drflac_uint64 lastSuccessfulSeekOffset = (drflac_uint64)-1;
    drflac_uint64 closestSeekOffsetBeforeTargetPCMFrame = byteRangeLo;
    drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096;

    targetByte = byteRangeLo + (drflac_uint64)(((drflac_int64)((pcmFrameIndex - pFlac->currentPCMFrame) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO);
    if (targetByte > byteRangeHi) {
        targetByte = byteRangeHi;
    }

    for (;;) {
        if (drflac__seek_to_approximate_flac_frame_to_byte(pFlac, targetByte, byteRangeLo, byteRangeHi, &lastSuccessfulSeekOffset)) {
            /* We found a FLAC frame. We need to check if it contains the sample we're looking for. */
            drflac_uint64 newPCMRangeLo;
            drflac_uint64 newPCMRangeHi;
            drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &newPCMRangeLo, &newPCMRangeHi);

            /* If we selected the same frame, it means we should be pretty close. Just decode the rest. */
            if (pcmRangeLo == newPCMRangeLo) {
                if (!drflac__seek_to_approximate_flac_frame_to_byte(pFlac, closestSeekOffsetBeforeTargetPCMFrame, closestSeekOffsetBeforeTargetPCMFrame, byteRangeHi, &lastSuccessfulSeekOffset)) {
                    break; /* Failed to seek to closest frame. */
                }

                if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                    return DRFLAC_TRUE;
                } else {
                    break; /* Failed to seek forward. */
                }
            }

            pcmRangeLo = newPCMRangeLo;
            pcmRangeHi = newPCMRangeHi;

            if (pcmRangeLo <= pcmFrameIndex && pcmRangeHi >= pcmFrameIndex) {
                /* The target PCM frame is in this FLAC frame. */
                if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                    return DRFLAC_TRUE;
                } else {
                    break; /* Failed to seek to FLAC frame. */
                }
            } else {
                const float approxCompressionRatio = (drflac_int64)(lastSuccessfulSeekOffset - pFlac->firstFLACFramePosInBytes) / ((drflac_int64)(pcmRangeLo * pFlac->channels * pFlac->bitsPerSample)/8.0f);

                if (pcmRangeLo > pcmFrameIndex) {
                    /* We seeked too far forward. We need to move our target byte backward and try again. */
                    byteRangeHi = lastSuccessfulSeekOffset;
                    if (byteRangeLo > byteRangeHi) {
                        byteRangeLo = byteRangeHi;
                    }

                    targetByte = byteRangeLo + ((byteRangeHi - byteRangeLo) / 2);
                    if (targetByte < byteRangeLo) {
                        targetByte = byteRangeLo;
                    }
                } else /*if (pcmRangeHi < pcmFrameIndex)*/ {
                    /* We didn't seek far enough. We need to move our target byte forward and try again. */

                    /* If we're close enough we can just seek forward. */
                    if ((pcmFrameIndex - pcmRangeLo) < seekForwardThreshold) {
                        if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                            return DRFLAC_TRUE;
                        } else {
                            break; /* Failed to seek to FLAC frame. */
                        }
                    } else {
                        byteRangeLo = lastSuccessfulSeekOffset;
                        if (byteRangeHi < byteRangeLo) {
                            byteRangeHi = byteRangeLo;
                        }

                        targetByte = lastSuccessfulSeekOffset + (drflac_uint64)(((drflac_int64)((pcmFrameIndex-pcmRangeLo) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * approxCompressionRatio);
                        if (targetByte > byteRangeHi) {
                            targetByte = byteRangeHi;
                        }

                        if (closestSeekOffsetBeforeTargetPCMFrame < lastSuccessfulSeekOffset) {
                            closestSeekOffsetBeforeTargetPCMFrame = lastSuccessfulSeekOffset;
                        }
                    }
                }
            }
        } else {
            /* Getting here is really bad. We just recover as best we can by moving to the first frame in the stream, and then abort. */
            break;
        }
    }

    drflac__seek_to_first_frame(pFlac); /* <-- Try to recover. */
    return DRFLAC_FALSE;
}

static drflac_bool32 drflac__seek_to_pcm_frame__binary_search(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_uint64 byteRangeLo;
    drflac_uint64 byteRangeHi;
    drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096;

    /* Our algorithm assumes the FLAC stream is currently sitting at the start. */
    if (drflac__seek_to_first_frame(pFlac) == DRFLAC_FALSE) {
        return DRFLAC_FALSE;
    }

    /* If we're close enough to the start, just move to the start and seek forward. */
    if (pcmFrameIndex < seekForwardThreshold) {
        return drflac__seek_forward_by_pcm_frames(pFlac, pcmFrameIndex) == pcmFrameIndex;
    }

    /*
    Our starting byte range is the byte position of the first FLAC frame and the approximate end of the file as if it were completely uncompressed. This ensures
    the entire file is included, even though most of the time it'll exceed the end of the actual stream. This is OK as the frame searching logic will handle it.
    */
    byteRangeLo = pFlac->firstFLACFramePosInBytes;
    byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f);

    return drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi);
}
#endif /* !DR_FLAC_NO_CRC */
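
/*
Illustrative sketch only, not part of dr_flac. It isolates the first guess made by the binary search above: the number of PCM
frames still to be skipped is converted to uncompressed bytes, scaled by the assumed 0.6 compression ratio, and clamped to the
upper bound of the byte range. The names example_initial_target_byte and EXAMPLE_APPROX_COMPRESSION_RATIO are hypothetical, and
the sketch assumes the same single-precision arithmetic as the code above, so results for very large streams are approximate.
*/

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_APPROX_COMPRESSION_RATIO 0.6f

/* Estimates the byte offset at which the target PCM frame is likely to be found. */
static uint64_t example_initial_target_byte(uint64_t byteRangeLo, uint64_t byteRangeHi, uint64_t pcmFramesToSkip, uint32_t channels, uint32_t bitsPerSample)
{
    float uncompressedBytes;
    uint64_t targetByte;

    assert(byteRangeLo <= byteRangeHi);

    /* Size of the skipped audio if it were stored as raw PCM. */
    uncompressedBytes = (int64_t)(pcmFramesToSkip * channels * bitsPerSample) / 8.0f;

    /* Scale by the assumed compression ratio to get a starting guess. */
    targetByte = byteRangeLo + (uint64_t)(uncompressedBytes * EXAMPLE_APPROX_COMPRESSION_RATIO);

    /* Never aim past the end of the search range. */
    if (targetByte > byteRangeHi) {
        targetByte = byteRangeHi;
    }

    return targetByte;
}
```

/*
For example, skipping 40000 PCM frames of 16-bit stereo audio is 160000 uncompressed bytes, so the first guess lands roughly
96000 bytes past byteRangeLo. The guess only needs to be close, since the search loop refines it from the frames it actually finds.
*/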

static drflac_bool32 drflac__seek_to_pcm_frame__seek_table(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_uint32 iClosestSeekpoint = 0;
    drflac_bool32 isMidFrame = DRFLAC_FALSE;
    drflac_uint64 runningPCMFrameCount;
    drflac_uint32 iSeekpoint;


    DRFLAC_ASSERT(pFlac != NULL);

    if (pFlac->pSeekpoints == NULL || pFlac->seekpointCount == 0) {
        return DRFLAC_FALSE;
    }

    /* Do not use the seektable if pcmFrameIndex is not covered by it. */
    if (pFlac->pSeekpoints[0].firstPCMFrame > pcmFrameIndex) {
        return DRFLAC_FALSE;
    }

    for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) {
        if (pFlac->pSeekpoints[iSeekpoint].firstPCMFrame >= pcmFrameIndex) {
            break;
        }

        iClosestSeekpoint = iSeekpoint;
    }

    /* There have been cases where the seek table contains only zeros. We need to do some basic validation on the closest seekpoint. */
    if (pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount == 0 || pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount > pFlac->maxBlockSizeInPCMFrames) {
        return DRFLAC_FALSE;
    }
    if (pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame > pFlac->totalPCMFrameCount && pFlac->totalPCMFrameCount > 0) {
        return DRFLAC_FALSE;
    }

#if !defined(DR_FLAC_NO_CRC)
    /* At this point we should know the closest seek point. We can use a binary search for this. We need to know the total sample count for this. */
    if (pFlac->totalPCMFrameCount > 0) {
        drflac_uint64 byteRangeLo;
        drflac_uint64 byteRangeHi;

        byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f);
        byteRangeLo = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset;

        /*
        If our closest seek point is not the last one, we only need to search between it and the next one. The section below calculates an appropriate starting
        value for byteRangeHi which will clamp it appropriately.

        Note that the next seekpoint must have an offset greater than the closest seekpoint because otherwise our binary search algorithm will break down. There
        have been cases where a seektable consists of seek points where every byte offset is set to 0 which causes problems. If this happens we need to abort.
        */
        if (iClosestSeekpoint < pFlac->seekpointCount-1) {
            drflac_uint32 iNextSeekpoint = iClosestSeekpoint + 1;

            /* Basic validation on the seekpoints to ensure they're usable. */
            if (pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset >= pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset || pFlac->pSeekpoints[iNextSeekpoint].pcmFrameCount == 0) {
                return DRFLAC_FALSE; /* The next seekpoint doesn't look right. The seek table cannot be trusted from here. Abort. */
            }

            if (pFlac->pSeekpoints[iNextSeekpoint].firstPCMFrame != (((drflac_uint64)0xFFFFFFFF << 32) | 0xFFFFFFFF)) { /* Make sure it's not a placeholder seekpoint. */
                byteRangeHi = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset - 1; /* byteRangeHi must be zero based. */
            }
        }

        if (drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) {
            if (drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL);

                if (drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi)) {
                    return DRFLAC_TRUE;
                }
            }
        }
    }
#endif /* !DR_FLAC_NO_CRC */

    /* Getting here means we need to use a slower algorithm because the binary search method failed or cannot be used. */

    /*
    If we are seeking forward and the closest seekpoint is _before_ the current sample, we just seek forward from where we are. Otherwise we start seeking
    from the seekpoint's first sample.
    */
    if (pcmFrameIndex >= pFlac->currentPCMFrame && pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame <= pFlac->currentPCMFrame) {
        /* Optimized case. Just seek forward from where we are. */
        runningPCMFrameCount = pFlac->currentPCMFrame;

        /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */
        if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                return DRFLAC_FALSE;
            }
        } else {
            isMidFrame = DRFLAC_TRUE;
        }
    } else {
        /* Slower case. Seek to the start of the seekpoint and then seek forward from there. */
        runningPCMFrameCount = pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame;

        if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) {
            return DRFLAC_FALSE;
        }

        /* Grab the frame the seekpoint is sitting on in preparation for the sample-exact seeking below. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }

    for (;;) {
        drflac_uint64 pcmFrameCountInThisFLACFrame;
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;
        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) {
            /*
            The sample should be in this frame. We need to fully decode it, but if it's an invalid frame (a CRC mismatch) we need to pretend
            it never existed and keep iterating.
            */
            drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount;

            if (!isMidFrame) {
                drflac_result result = drflac__decode_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */
                    return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* We started seeking mid-frame which means we need to skip the frame decoding part. */
                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this
            frame never existed and leave the running sample count untouched.
            */
            if (!isMidFrame) {
                drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    runningPCMFrameCount += pcmFrameCountInThisFLACFrame;
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /*
                We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with
                drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header.
                */
                runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                isMidFrame = DRFLAC_FALSE;
            }

            /* If we are seeking to the end of the file and we've just hit it, we're done. */
            if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) {
                return DRFLAC_TRUE;
            }
        }

    next_iteration:
        /* Grab the next frame in preparation for the next iteration. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }
}


#ifndef DR_FLAC_NO_OGG
typedef struct
{
    drflac_uint8 capturePattern[4]; /* Should be "OggS" */
    drflac_uint8 structureVersion; /* Always 0. */
    drflac_uint8 headerType;
    drflac_uint64 granulePosition;
    drflac_uint32 serialNumber;
    drflac_uint32 sequenceNumber;
    drflac_uint32 checksum;
    drflac_uint8 segmentCount;
    drflac_uint8 segmentTable[255];
} drflac_ogg_page_header;
#endif

typedef struct
{
    drflac_read_proc onRead;
    drflac_seek_proc onSeek;
    drflac_meta_proc onMeta;
    drflac_container container;
    void* pUserData;
    void* pUserDataMD;
    drflac_uint32 sampleRate;
    drflac_uint8 channels;
    drflac_uint8 bitsPerSample;
    drflac_uint64 totalPCMFrameCount;
    drflac_uint16 maxBlockSizeInPCMFrames;
    drflac_uint64 runningFilePos;
    drflac_bool32 hasStreamInfoBlock;
    drflac_bool32 hasMetadataBlocks;
    drflac_bs bs; /* <-- A bit streamer is required for loading data during initialization. */
    drflac_frame_header firstFrameHeader; /* <-- The header of the first frame that was read during relaxed initialization. Only set if there is no STREAMINFO block. */

#ifndef DR_FLAC_NO_OGG
    drflac_uint32 oggSerial;
    drflac_uint64 oggFirstBytePos;
    drflac_ogg_page_header oggBosHeader;
#endif
} drflac_init_info;

static DRFLAC_INLINE void drflac__decode_block_header(drflac_uint32 blockHeader, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize)
{
    blockHeader = drflac__be2host_32(blockHeader);
    *isLastBlock = (drflac_uint8)((blockHeader & 0x80000000UL) >> 31);
    *blockType = (drflac_uint8)((blockHeader & 0x7F000000UL) >> 24);
    *blockSize = (blockHeader & 0x00FFFFFFUL);
}
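
/*
Illustrative sketch only, not part of dr_flac. It decodes a 4-byte metadata block header with the same masks as the function
above, but assembles the big-endian value byte by byte so it does not depend on drflac__be2host_32() or on host endianness. The
name example_decode_block_header is hypothetical.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Splits a raw metadata block header into its last-block flag (1 bit), block type (7 bits) and block size (24 bits). */
static void example_decode_block_header(const uint8_t raw[4], uint8_t* pIsLastBlock, uint8_t* pBlockType, uint32_t* pBlockSize)
{
    uint32_t header;

    assert(raw != NULL && pIsLastBlock != NULL && pBlockType != NULL && pBlockSize != NULL);

    /* The header is stored big-endian, so the first byte holds the most significant bits. */
    header = ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) | ((uint32_t)raw[2] << 8) | (uint32_t)raw[3];

    *pIsLastBlock = (uint8_t)((header & 0x80000000UL) >> 31);
    *pBlockType   = (uint8_t)((header & 0x7F000000UL) >> 24);
    *pBlockSize   = (header & 0x00FFFFFFUL);
}
```

/* For example, the bytes 0x84 0x00 0x00 0x28 describe the last metadata block, of type 4, with 40 bytes of payload. */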

static DRFLAC_INLINE drflac_bool32 drflac__read_and_decode_block_header(drflac_read_proc onRead, void* pUserData, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize)
{
    drflac_uint32 blockHeader;

    *blockSize = 0;
    if (onRead(pUserData, &blockHeader, 4) != 4) {
        return DRFLAC_FALSE;
    }

    drflac__decode_block_header(blockHeader, isLastBlock, blockType, blockSize);
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__read_streaminfo(drflac_read_proc onRead, void* pUserData, drflac_streaminfo* pStreamInfo)
{
    drflac_uint32 blockSizes;
    drflac_uint64 frameSizes = 0;
    drflac_uint64 importantProps;
    drflac_uint8 md5[16];

    /* min/max block size. */
    if (onRead(pUserData, &blockSizes, 4) != 4) {
        return DRFLAC_FALSE;
    }

    /* min/max frame size. */
    if (onRead(pUserData, &frameSizes, 6) != 6) {
        return DRFLAC_FALSE;
    }

    /* Sample rate, channels, bits per sample and total sample count. */
    if (onRead(pUserData, &importantProps, 8) != 8) {
        return DRFLAC_FALSE;
    }

    /* MD5 */
    if (onRead(pUserData, md5, sizeof(md5)) != sizeof(md5)) {
        return DRFLAC_FALSE;
    }

    blockSizes = drflac__be2host_32(blockSizes);
    frameSizes = drflac__be2host_64(frameSizes);
    importantProps = drflac__be2host_64(importantProps);

    pStreamInfo->minBlockSizeInPCMFrames = (drflac_uint16)((blockSizes & 0xFFFF0000) >> 16);
    pStreamInfo->maxBlockSizeInPCMFrames = (drflac_uint16) (blockSizes & 0x0000FFFF);
    pStreamInfo->minFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 24)) >> 40);
    pStreamInfo->maxFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 0)) >> 16);
    pStreamInfo->sampleRate = (drflac_uint32)((importantProps & (((drflac_uint64)0x000FFFFF << 16) << 28)) >> 44);
    pStreamInfo->channels = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000000E << 16) << 24)) >> 41) + 1;
    pStreamInfo->bitsPerSample = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000001F << 16) << 20)) >> 36) + 1;
    pStreamInfo->totalPCMFrameCount = ((importantProps & ((((drflac_uint64)0x0000000F << 16) << 16) | 0xFFFFFFFF)));
    DRFLAC_COPY_MEMORY(pStreamInfo->md5, md5, sizeof(md5));

    return DRFLAC_TRUE;
}
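
/*
Illustrative sketch only, not part of dr_flac. It unpacks the 64-bit "important properties" word from the STREAMINFO block using
shift-then-mask, which selects the same bits as the mask-then-shift expressions above: 20 bits of sample rate, 3 bits of
(channels - 1), 5 bits of (bits per sample - 1) and 36 bits of total PCM frame count. The names example_stream_props and
example_unpack_stream_props are hypothetical, and the input is assumed to have already been converted to host byte order.
*/

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four unpacked fields. */
typedef struct
{
    uint32_t sampleRate;
    uint8_t  channels;
    uint8_t  bitsPerSample;
    uint64_t totalPCMFrameCount;
} example_stream_props;

/* Unpacks a host-endian STREAMINFO properties word into its individual fields. */
static example_stream_props example_unpack_stream_props(uint64_t props)
{
    example_stream_props result;

    result.sampleRate         = (uint32_t)((props >> 44) & 0xFFFFF);      /* Bits 44..63. */
    result.channels           = (uint8_t)(((props >> 41) & 0x7) + 1);     /* Bits 41..43, stored minus one. */
    result.bitsPerSample      = (uint8_t)(((props >> 36) & 0x1F) + 1);    /* Bits 36..40, stored minus one. */
    result.totalPCMFrameCount = props & (((uint64_t)0xF << 32) | 0xFFFFFFFF); /* Bits 0..35. */

    return result;
}
```

/*
For example, a 44100 Hz, 2 channel, 16-bit stream of 1000000 PCM frames is stored as
(44100 << 44) | (1 << 41) | (15 << 36) | 1000000, and unpacking that word recovers the original four values.
*/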


static void* drflac__malloc_default(size_t sz, void* pUserData)
{
    (void)pUserData;
    return DRFLAC_MALLOC(sz);
}

static void* drflac__realloc_default(void* p, size_t sz, void* pUserData)
{
    (void)pUserData;
    return DRFLAC_REALLOC(p, sz);
}

static void drflac__free_default(void* p, void* pUserData)
{
    (void)pUserData;
    DRFLAC_FREE(p);
}


static void* drflac__malloc_from_callbacks(size_t sz, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (pAllocationCallbacks == NULL) {
        return NULL;
    }

    if (pAllocationCallbacks->onMalloc != NULL) {
        return pAllocationCallbacks->onMalloc(sz, pAllocationCallbacks->pUserData);
    }

    /* Try using realloc(). */
    if (pAllocationCallbacks->onRealloc != NULL) {
        return pAllocationCallbacks->onRealloc(NULL, sz, pAllocationCallbacks->pUserData);
    }

    return NULL;
}

static void* drflac__realloc_from_callbacks(void* p, size_t szNew, size_t szOld, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (pAllocationCallbacks == NULL) {
        return NULL;
    }

    if (pAllocationCallbacks->onRealloc != NULL) {
        return pAllocationCallbacks->onRealloc(p, szNew, pAllocationCallbacks->pUserData);
    }

    /* Try emulating realloc() in terms of malloc()/free(). */
    if (pAllocationCallbacks->onMalloc != NULL && pAllocationCallbacks->onFree != NULL) {
        void* p2;

        p2 = pAllocationCallbacks->onMalloc(szNew, pAllocationCallbacks->pUserData);
        if (p2 == NULL) {
            return NULL;
        }

        if (p != NULL) {
            DRFLAC_COPY_MEMORY(p2, p, szOld);
            pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData);
        }

        return p2;
    }

    return NULL;
}
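
/*
Illustrative sketch only, not part of dr_flac. It shows the malloc()/free() fallback used above to emulate realloc() when only
those two callbacks are available: allocate the new block, copy the old contents across, then release the old block. The names
example_callbacks, example_realloc_via_malloc, example_counting_malloc and example_counting_free are hypothetical. One deliberate
difference from the code above: the sketch copies the smaller of szOld and szNew, so it also tolerates shrinking, whereas the
code above copies szOld bytes and therefore assumes the block only grows.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down callback set containing only malloc and free. */
typedef struct
{
    void* pUserData;
    void* (*onMalloc)(size_t sz, void* pUserData);
    void  (*onFree)(void* p, void* pUserData);
} example_callbacks;

/* Emulates realloc() using only onMalloc and onFree. Returns NULL on failure, in which case p is left untouched. */
static void* example_realloc_via_malloc(void* p, size_t szNew, size_t szOld, const example_callbacks* pCallbacks)
{
    void* p2;

    if (pCallbacks == NULL || pCallbacks->onMalloc == NULL || pCallbacks->onFree == NULL) {
        return NULL;
    }

    p2 = pCallbacks->onMalloc(szNew, pCallbacks->pUserData);
    if (p2 == NULL) {
        return NULL; /* The old block is still valid and still owned by the caller. */
    }

    if (p != NULL) {
        memcpy(p2, p, (szOld < szNew) ? szOld : szNew);
        pCallbacks->onFree(p, pCallbacks->pUserData);
    }

    return p2;
}

/* Example callbacks that track the number of live allocations through pUserData. */
static void* example_counting_malloc(size_t sz, void* pUserData)
{
    void* p = malloc(sz);
    if (p != NULL) {
        *(int*)pUserData += 1;
    }
    return p;
}

static void example_counting_free(void* p, void* pUserData)
{
    *(int*)pUserData -= 1;
    free(p);
}
```

/*
The counting callbacks illustrate why the user data pointer is useful: after growing a block through the fallback path exactly
one allocation should remain live, because the old block is released as soon as its contents have been copied.
*/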
6469
6470
static void drflac__free_from_callbacks(void* p, const drflac_allocation_callbacks* pAllocationCallbacks)
6471
{
6472
if (p == NULL || pAllocationCallbacks == NULL) {
6473
return;
6474
}
6475
6476
if (pAllocationCallbacks->onFree != NULL) {
6477
pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData);
6478
}
6479
}


static drflac_bool32 drflac__read_and_decode_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_uint64* pFirstFramePos, drflac_uint64* pSeektablePos, drflac_uint32* pSeekpointCount, drflac_allocation_callbacks* pAllocationCallbacks)
{
    /*
    We want to keep track of the byte position in the stream of the seektable. At the time of calling this function we know that
    we'll be sitting on byte 42.
    */
    drflac_uint64 runningFilePos = 42;
    drflac_uint64 seektablePos = 0;
    drflac_uint32 seektableSize = 0;

    for (;;) {
        drflac_metadata metadata;
        drflac_uint8 isLastBlock = 0;
        drflac_uint8 blockType = 0;
        drflac_uint32 blockSize;
        if (drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize) == DRFLAC_FALSE) {
            return DRFLAC_FALSE;
        }
        runningFilePos += 4;

        metadata.type = blockType;
        metadata.pRawData = NULL;
        metadata.rawDataSize = 0;

        switch (blockType)
        {
            case DRFLAC_METADATA_BLOCK_TYPE_APPLICATION:
            {
                if (blockSize < 4) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    metadata.data.application.id = drflac__be2host_32(*(drflac_uint32*)pRawData);
                    metadata.data.application.pData = (const void*)((drflac_uint8*)pRawData + sizeof(drflac_uint32));
                    metadata.data.application.dataSize = blockSize - sizeof(drflac_uint32);
                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE:
            {
                seektablePos = runningFilePos;
                seektableSize = blockSize;

                if (onMeta) {
                    drflac_uint32 seekpointCount;
                    drflac_uint32 iSeekpoint;
                    void* pRawData;

                    seekpointCount = blockSize/DRFLAC_SEEKPOINT_SIZE_IN_BYTES;

                    pRawData = drflac__malloc_from_callbacks(seekpointCount * sizeof(drflac_seekpoint), pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    /* We need to read seekpoint by seekpoint and do some processing. */
                    for (iSeekpoint = 0; iSeekpoint < seekpointCount; ++iSeekpoint) {
                        drflac_seekpoint* pSeekpoint = (drflac_seekpoint*)pRawData + iSeekpoint;

                        if (onRead(pUserData, pSeekpoint, DRFLAC_SEEKPOINT_SIZE_IN_BYTES) != DRFLAC_SEEKPOINT_SIZE_IN_BYTES) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        /* Endian swap. */
                        pSeekpoint->firstPCMFrame = drflac__be2host_64(pSeekpoint->firstPCMFrame);
                        pSeekpoint->flacFrameOffset = drflac__be2host_64(pSeekpoint->flacFrameOffset);
                        pSeekpoint->pcmFrameCount = drflac__be2host_16(pSeekpoint->pcmFrameCount);
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    metadata.data.seektable.seekpointCount = seekpointCount;
                    metadata.data.seektable.pSeekpoints = (const drflac_seekpoint*)pRawData;

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT:
            {
                if (blockSize < 8) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;
                    drflac_uint32 i;

                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    metadata.data.vorbis_comment.vendorLength = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 4 < (drflac_int64)metadata.data.vorbis_comment.vendorLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.vorbis_comment.vendor = pRunningData; pRunningData += metadata.data.vorbis_comment.vendorLength;
                    metadata.data.vorbis_comment.commentCount = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for 'commentCount' comments after the block, which at minimum is a drflac_uint32 per comment */
                    if ((pRunningDataEnd - pRunningData) / sizeof(drflac_uint32) < metadata.data.vorbis_comment.commentCount) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.vorbis_comment.pComments = pRunningData;

                    /* Check that the comments section is valid before passing it to the callback */
                    for (i = 0; i < metadata.data.vorbis_comment.commentCount; ++i) {
                        drflac_uint32 commentLength;

                        if (pRunningDataEnd - pRunningData < 4) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        commentLength = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                        if (pRunningDataEnd - pRunningData < (drflac_int64)commentLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }
                        pRunningData += commentLength;
                    }

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_CUESHEET:
            {
                if (blockSize < 396) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;
                    size_t bufferSize;
                    drflac_uint8 iTrack;
                    drflac_uint8 iIndex;
                    void* pTrackData;

                    /*
                    This needs to be loaded in two passes. The first pass is used to calculate the size of the memory allocation
                    we need for storing the necessary data. The second pass will fill that buffer with usable data.
                    */
                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    DRFLAC_COPY_MEMORY(metadata.data.cuesheet.catalog, pRunningData, 128); pRunningData += 128;
                    metadata.data.cuesheet.leadInSampleCount = drflac__be2host_64(*(const drflac_uint64*)pRunningData); pRunningData += 8;
                    metadata.data.cuesheet.isCD = (pRunningData[0] & 0x80) != 0; pRunningData += 259;
                    metadata.data.cuesheet.trackCount = pRunningData[0]; pRunningData += 1;
                    metadata.data.cuesheet.pTrackData = NULL; /* Will be filled later. */

                    /* Pass 1: Calculate the size of the buffer for the track data. */
                    {
                        const char* pRunningDataSaved = pRunningData; /* Will be restored at the end in preparation for the second pass. */

                        bufferSize = metadata.data.cuesheet.trackCount * DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES;

                        for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) {
                            drflac_uint8 indexCount;
                            drflac_uint32 indexPointSize;

                            if (pRunningDataEnd - pRunningData < DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES) {
                                drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                                return DRFLAC_FALSE;
                            }

                            /* Skip to the index point count */
                            pRunningData += 35;

                            indexCount = pRunningData[0];
                            pRunningData += 1;

                            bufferSize += indexCount * sizeof(drflac_cuesheet_track_index);

                            /* Quick validation check. */
                            indexPointSize = indexCount * DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES;
                            if (pRunningDataEnd - pRunningData < (drflac_int64)indexPointSize) {
                                drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                                return DRFLAC_FALSE;
                            }

                            pRunningData += indexPointSize;
                        }

                        pRunningData = pRunningDataSaved;
                    }

                    /* Pass 2: Allocate a buffer and fill the data. Validation was done in the step above so can be skipped. */
                    {
                        char* pRunningTrackData;

                        pTrackData = drflac__malloc_from_callbacks(bufferSize, pAllocationCallbacks);
                        if (pTrackData == NULL) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        pRunningTrackData = (char*)pTrackData;

                        for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) {
                            drflac_uint8 indexCount;

                            DRFLAC_COPY_MEMORY(pRunningTrackData, pRunningData, DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES);
                            pRunningData += DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES-1; /* Skip forward, but not beyond the last byte in the CUESHEET_TRACK block which is the index count. */
                            pRunningTrackData += DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES-1;

                            /* Grab the index count for the next part. */
                            indexCount = pRunningData[0];
                            pRunningData += 1;
                            pRunningTrackData += 1;

                            /* Extract each track index. */
                            for (iIndex = 0; iIndex < indexCount; ++iIndex) {
                                drflac_cuesheet_track_index* pTrackIndex = (drflac_cuesheet_track_index*)pRunningTrackData;

                                DRFLAC_COPY_MEMORY(pRunningTrackData, pRunningData, DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES);
                                pRunningData += DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES;
                                pRunningTrackData += sizeof(drflac_cuesheet_track_index);

                                pTrackIndex->offset = drflac__be2host_64(pTrackIndex->offset);
                            }
                        }

                        metadata.data.cuesheet.pTrackData = pTrackData;
                    }

                    /* The original data is no longer needed. */
                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                    pRawData = NULL;

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pTrackData, pAllocationCallbacks);
                    pTrackData = NULL;
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_PICTURE:
            {
                if (blockSize < 32) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;

                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    metadata.data.picture.type = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.mimeLength = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 24 < (drflac_int64)metadata.data.picture.mimeLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.picture.mime = pRunningData; pRunningData += metadata.data.picture.mimeLength;
                    metadata.data.picture.descriptionLength = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 20 < (drflac_int64)metadata.data.picture.descriptionLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.picture.description = pRunningData; pRunningData += metadata.data.picture.descriptionLength;
                    metadata.data.picture.width = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.height = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.colorDepth = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.indexColorCount = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.pictureDataSize = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.pPictureData = (const drflac_uint8*)pRunningData;

                    /* Need space for the picture after the block */
                    if (pRunningDataEnd - pRunningData < (drflac_int64)metadata.data.picture.pictureDataSize) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_PADDING:
            {
                if (onMeta) {
                    metadata.data.padding.unused = 0;

                    /* Padding doesn't have anything meaningful in it, so just skip over it, but make sure the caller is aware of it by firing the callback. */
                    if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                        isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */
                    } else {
                        onMeta(pUserDataMD, &metadata);
                    }
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_INVALID:
            {
                /* Invalid chunk. Just skip over this one. */
                if (onMeta) {
                    if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                        isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */
                    }
                }
            } break;

            default:
            {
                /*
                It's an unknown chunk, but not necessarily invalid. There's a chance more metadata blocks might be defined later on, so we
                can at the very least report the chunk to the application and let it look at the raw data.
                */
                if (onMeta) {
                    void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;
        }

        /* If we're not handling metadata, just skip over the block. If we are, it will have been handled earlier in the switch statement above. */
        if (onMeta == NULL && blockSize > 0) {
            if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                isLastBlock = DRFLAC_TRUE;
            }
        }

        runningFilePos += blockSize;
        if (isLastBlock) {
            break;
        }
    }

    *pSeektablePos = seektablePos;
    *pSeekpointCount = seektableSize / DRFLAC_SEEKPOINT_SIZE_IN_BYTES;
    *pFirstFramePos = runningFilePos;

    return DRFLAC_TRUE;
}
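
/*
Illustrative sketch (not part of dr_flac): the Vorbis comment and picture cases above compare the bytes remaining in the
block against an untrusted 32-bit length by subtracting on the trusted side and widening to a signed 64-bit value. The
naive form, "length + reserved > remaining" in 32-bit arithmetic, can wrap: 0xFFFFFFFF + 4 becomes 3 and would pass.
example_has_room() is a hypothetical helper that restates the safe ordering under that assumption.
*/
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when `length` bytes, followed by `reserved` bytes of fixed fields, fit between pCur and pEnd. */
static int example_has_room(const char* pCur, const char* pEnd, uint32_t reserved, uint32_t length)
{
    int64_t remaining = (int64_t)(pEnd - pCur) - (int64_t)reserved;  /* May go negative, but cannot wrap. */
    return remaining >= (int64_t)length;
}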

static drflac_bool32 drflac__init_private__native(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed)
{
    /* Pre Condition: The bit stream should be sitting just past the 4-byte id header. */

    drflac_uint8 isLastBlock;
    drflac_uint8 blockType;
    drflac_uint32 blockSize;

    (void)onSeek;

    pInit->container = drflac_container_native;

    /* The first metadata block should be the STREAMINFO block. */
    if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) {
        return DRFLAC_FALSE;
    }

    if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) {
        if (!relaxed) {
            /* We're opening in strict mode and the first block is not the STREAMINFO block. Error. */
            return DRFLAC_FALSE;
        } else {
            /*
            Relaxed mode. To open from here we need to just find the first frame and set the sample rate, etc. to whatever is defined
            for that frame.
            */
            pInit->hasStreamInfoBlock = DRFLAC_FALSE;
            pInit->hasMetadataBlocks = DRFLAC_FALSE;

            if (!drflac__read_next_flac_frame_header(&pInit->bs, 0, &pInit->firstFrameHeader)) {
                return DRFLAC_FALSE; /* Couldn't find a frame. */
            }

            if (pInit->firstFrameHeader.bitsPerSample == 0) {
                return DRFLAC_FALSE; /* Failed to initialize because the first frame depends on the STREAMINFO block, which does not exist. */
            }

            pInit->sampleRate = pInit->firstFrameHeader.sampleRate;
            pInit->channels = drflac__get_channel_count_from_channel_assignment(pInit->firstFrameHeader.channelAssignment);
            pInit->bitsPerSample = pInit->firstFrameHeader.bitsPerSample;
            pInit->maxBlockSizeInPCMFrames = 65535; /* <-- See notes here: https://xiph.org/flac/format.html#metadata_block_streaminfo */
            return DRFLAC_TRUE;
        }
    } else {
        drflac_streaminfo streaminfo;
        if (!drflac__read_streaminfo(onRead, pUserData, &streaminfo)) {
            return DRFLAC_FALSE;
        }

        pInit->hasStreamInfoBlock = DRFLAC_TRUE;
        pInit->sampleRate = streaminfo.sampleRate;
        pInit->channels = streaminfo.channels;
        pInit->bitsPerSample = streaminfo.bitsPerSample;
        pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount;
        pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; /* Don't care about the min block size - only the max (used for determining the size of the memory allocation). */
        pInit->hasMetadataBlocks = !isLastBlock;

        if (onMeta) {
            drflac_metadata metadata;
            metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO;
            metadata.pRawData = NULL;
            metadata.rawDataSize = 0;
            metadata.data.streaminfo = streaminfo;
            onMeta(pUserDataMD, &metadata);
        }

        return DRFLAC_TRUE;
    }
}
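
/*
Illustrative sketch (not part of dr_flac): a native FLAC metadata block header is 4 bytes, holding a 1-bit last-block
flag, a 7-bit block type and a 24-bit big-endian body size. Adding the 4-byte "fLaC" marker, the 4-byte STREAMINFO
header and its 34-byte body gives 4 + 4 + 34 = 42, which is why drflac__read_and_decode_metadata() starts
runningFilePos at 42. The example_* names are hypothetical and are not the library's own decoding routine.
*/
#include <stdint.h>

/* Splits a 4-byte metadata block header into its three fields. */
static void example_decode_block_header(const uint8_t header[4], uint8_t* pIsLastBlock, uint8_t* pBlockType, uint32_t* pBlockSize)
{
    *pIsLastBlock = (uint8_t)((header[0] & 0x80) >> 7);
    *pBlockType   = (uint8_t)(header[0] & 0x7F);
    *pBlockSize   = ((uint32_t)header[1] << 16) | ((uint32_t)header[2] << 8) | (uint32_t)header[3];
}

/* Byte position of the block that follows STREAMINFO in a native FLAC stream. */
static uint64_t example_pos_after_streaminfo(void)
{
    return 4 /* "fLaC" marker */ + 4 /* STREAMINFO block header */ + 34 /* STREAMINFO body */;
}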

#ifndef DR_FLAC_NO_OGG
#define DRFLAC_OGG_MAX_PAGE_SIZE 65307
#define DRFLAC_OGG_CAPTURE_PATTERN_CRC32 1605413199 /* CRC-32 of "OggS". */

typedef enum
{
    drflac_ogg_recover_on_crc_mismatch,
    drflac_ogg_fail_on_crc_mismatch
} drflac_ogg_crc_mismatch_recovery;

#ifndef DR_FLAC_NO_CRC
static drflac_uint32 drflac__crc32_table[] = {
    0x00000000L, 0x04C11DB7L, 0x09823B6EL, 0x0D4326D9L,
    0x130476DCL, 0x17C56B6BL, 0x1A864DB2L, 0x1E475005L,
    0x2608EDB8L, 0x22C9F00FL, 0x2F8AD6D6L, 0x2B4BCB61L,
    0x350C9B64L, 0x31CD86D3L, 0x3C8EA00AL, 0x384FBDBDL,
    0x4C11DB70L, 0x48D0C6C7L, 0x4593E01EL, 0x4152FDA9L,
    0x5F15ADACL, 0x5BD4B01BL, 0x569796C2L, 0x52568B75L,
    0x6A1936C8L, 0x6ED82B7FL, 0x639B0DA6L, 0x675A1011L,
    0x791D4014L, 0x7DDC5DA3L, 0x709F7B7AL, 0x745E66CDL,
    0x9823B6E0L, 0x9CE2AB57L, 0x91A18D8EL, 0x95609039L,
    0x8B27C03CL, 0x8FE6DD8BL, 0x82A5FB52L, 0x8664E6E5L,
    0xBE2B5B58L, 0xBAEA46EFL, 0xB7A96036L, 0xB3687D81L,
    0xAD2F2D84L, 0xA9EE3033L, 0xA4AD16EAL, 0xA06C0B5DL,
    0xD4326D90L, 0xD0F37027L, 0xDDB056FEL, 0xD9714B49L,
    0xC7361B4CL, 0xC3F706FBL, 0xCEB42022L, 0xCA753D95L,
    0xF23A8028L, 0xF6FB9D9FL, 0xFBB8BB46L, 0xFF79A6F1L,
    0xE13EF6F4L, 0xE5FFEB43L, 0xE8BCCD9AL, 0xEC7DD02DL,
    0x34867077L, 0x30476DC0L, 0x3D044B19L, 0x39C556AEL,
    0x278206ABL, 0x23431B1CL, 0x2E003DC5L, 0x2AC12072L,
    0x128E9DCFL, 0x164F8078L, 0x1B0CA6A1L, 0x1FCDBB16L,
    0x018AEB13L, 0x054BF6A4L, 0x0808D07DL, 0x0CC9CDCAL,
    0x7897AB07L, 0x7C56B6B0L, 0x71159069L, 0x75D48DDEL,
    0x6B93DDDBL, 0x6F52C06CL, 0x6211E6B5L, 0x66D0FB02L,
    0x5E9F46BFL, 0x5A5E5B08L, 0x571D7DD1L, 0x53DC6066L,
    0x4D9B3063L, 0x495A2DD4L, 0x44190B0DL, 0x40D816BAL,
    0xACA5C697L, 0xA864DB20L, 0xA527FDF9L, 0xA1E6E04EL,
    0xBFA1B04BL, 0xBB60ADFCL, 0xB6238B25L, 0xB2E29692L,
    0x8AAD2B2FL, 0x8E6C3698L, 0x832F1041L, 0x87EE0DF6L,
    0x99A95DF3L, 0x9D684044L, 0x902B669DL, 0x94EA7B2AL,
    0xE0B41DE7L, 0xE4750050L, 0xE9362689L, 0xEDF73B3EL,
    0xF3B06B3BL, 0xF771768CL, 0xFA325055L, 0xFEF34DE2L,
    0xC6BCF05FL, 0xC27DEDE8L, 0xCF3ECB31L, 0xCBFFD686L,
    0xD5B88683L, 0xD1799B34L, 0xDC3ABDEDL, 0xD8FBA05AL,
    0x690CE0EEL, 0x6DCDFD59L, 0x608EDB80L, 0x644FC637L,
    0x7A089632L, 0x7EC98B85L, 0x738AAD5CL, 0x774BB0EBL,
    0x4F040D56L, 0x4BC510E1L, 0x46863638L, 0x42472B8FL,
    0x5C007B8AL, 0x58C1663DL, 0x558240E4L, 0x51435D53L,
    0x251D3B9EL, 0x21DC2629L, 0x2C9F00F0L, 0x285E1D47L,
    0x36194D42L, 0x32D850F5L, 0x3F9B762CL, 0x3B5A6B9BL,
    0x0315D626L, 0x07D4CB91L, 0x0A97ED48L, 0x0E56F0FFL,
    0x1011A0FAL, 0x14D0BD4DL, 0x19939B94L, 0x1D528623L,
    0xF12F560EL, 0xF5EE4BB9L, 0xF8AD6D60L, 0xFC6C70D7L,
    0xE22B20D2L, 0xE6EA3D65L, 0xEBA91BBCL, 0xEF68060BL,
    0xD727BBB6L, 0xD3E6A601L, 0xDEA580D8L, 0xDA649D6FL,
    0xC423CD6AL, 0xC0E2D0DDL, 0xCDA1F604L, 0xC960EBB3L,
    0xBD3E8D7EL, 0xB9FF90C9L, 0xB4BCB610L, 0xB07DABA7L,
    0xAE3AFBA2L, 0xAAFBE615L, 0xA7B8C0CCL, 0xA379DD7BL,
    0x9B3660C6L, 0x9FF77D71L, 0x92B45BA8L, 0x9675461FL,
    0x8832161AL, 0x8CF30BADL, 0x81B02D74L, 0x857130C3L,
    0x5D8A9099L, 0x594B8D2EL, 0x5408ABF7L, 0x50C9B640L,
    0x4E8EE645L, 0x4A4FFBF2L, 0x470CDD2BL, 0x43CDC09CL,
    0x7B827D21L, 0x7F436096L, 0x7200464FL, 0x76C15BF8L,
    0x68860BFDL, 0x6C47164AL, 0x61043093L, 0x65C52D24L,
    0x119B4BE9L, 0x155A565EL, 0x18197087L, 0x1CD86D30L,
    0x029F3D35L, 0x065E2082L, 0x0B1D065BL, 0x0FDC1BECL,
    0x3793A651L, 0x3352BBE6L, 0x3E119D3FL, 0x3AD08088L,
    0x2497D08DL, 0x2056CD3AL, 0x2D15EBE3L, 0x29D4F654L,
    0xC5A92679L, 0xC1683BCEL, 0xCC2B1D17L, 0xC8EA00A0L,
    0xD6AD50A5L, 0xD26C4D12L, 0xDF2F6BCBL, 0xDBEE767CL,
    0xE3A1CBC1L, 0xE760D676L, 0xEA23F0AFL, 0xEEE2ED18L,
    0xF0A5BD1DL, 0xF464A0AAL, 0xF9278673L, 0xFDE69BC4L,
    0x89B8FD09L, 0x8D79E0BEL, 0x803AC667L, 0x84FBDBD0L,
    0x9ABC8BD5L, 0x9E7D9662L, 0x933EB0BBL, 0x97FFAD0CL,
    0xAFB010B1L, 0xAB710D06L, 0xA6322BDFL, 0xA2F33668L,
    0xBCB4666DL, 0xB8757BDAL, 0xB5365D03L, 0xB1F740B4L
};
#endif

static DRFLAC_INLINE drflac_uint32 drflac_crc32_byte(drflac_uint32 crc32, drflac_uint8 data)
{
#ifndef DR_FLAC_NO_CRC
    return (crc32 << 8) ^ drflac__crc32_table[(drflac_uint8)((crc32 >> 24) & 0xFF) ^ data];
#else
    (void)data;
    return crc32;
#endif
}

#if 0
static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint32(drflac_uint32 crc32, drflac_uint32 data)
{
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 24) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 16) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 8) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 0) & 0xFF));
    return crc32;
}

static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint64(drflac_uint32 crc32, drflac_uint64 data)
{
    crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 32) & 0xFFFFFFFF));
    crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 0) & 0xFFFFFFFF));
    return crc32;
}
#endif

static DRFLAC_INLINE drflac_uint32 drflac_crc32_buffer(drflac_uint32 crc32, drflac_uint8* pData, drflac_uint32 dataSize)
{
    /* This can be optimized. */
    drflac_uint32 i;
    for (i = 0; i < dataSize; ++i) {
        crc32 = drflac_crc32_byte(crc32, pData[i]);
    }
    return crc32;
}
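
/*
Illustrative sketch (not part of dr_flac): the lookup table above is consistent with an MSB-first CRC-32 using the
polynomial 0x04C11DB7, a zero initial value and no final XOR, which is the checksum Ogg pages carry. The hypothetical
example_* functions below rebuild such a table at run time and apply the same per-byte update as drflac_crc32_byte().
Running the update over the four bytes "OggS" from a zero start should reproduce DRFLAC_OGG_CAPTURE_PATTERN_CRC32.
*/
#include <stddef.h>
#include <stdint.h>

/* Fills table[i] with the CRC of the single byte i, processed most significant bit first. */
static void example_crc32_build_table(uint32_t table[256])
{
    uint32_t i;
    for (i = 0; i < 256; ++i) {
        uint32_t r = i << 24;
        int bit;
        for (bit = 0; bit < 8; ++bit) {
            r = (r & 0x80000000u) ? ((r << 1) ^ 0x04C11DB7u) : (r << 1);
        }
        table[i] = r;
    }
}

/* Continues a running CRC over dataSize bytes. The table is rebuilt on every call to keep the sketch short. */
static uint32_t example_crc32_ogg(uint32_t crc32, const uint8_t* pData, size_t dataSize)
{
    uint32_t table[256];
    size_t i;

    example_crc32_build_table(table);
    for (i = 0; i < dataSize; ++i) {
        crc32 = (crc32 << 8) ^ table[(uint8_t)(crc32 >> 24) ^ pData[i]];
    }
    return crc32;
}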


static DRFLAC_INLINE drflac_bool32 drflac_ogg__is_capture_pattern(drflac_uint8 pattern[4])
{
    return pattern[0] == 'O' && pattern[1] == 'g' && pattern[2] == 'g' && pattern[3] == 'S';
}

static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_header_size(drflac_ogg_page_header* pHeader)
{
    return 27 + pHeader->segmentCount;
}

static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_body_size(drflac_ogg_page_header* pHeader)
{
    drflac_uint32 pageBodySize = 0;
    int i;

    for (i = 0; i < pHeader->segmentCount; ++i) {
        pageBodySize += pHeader->segmentTable[i];
    }

    return pageBodySize;
}
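
/*
Illustrative sketch (not part of dr_flac): an Ogg page body is the sum of the lacing values in its segment table, and a
packet ends at the first lacing value below 255. With at most 255 segments of at most 255 bytes each, the largest page
is 27 + 255 + 255*255 = 65307 bytes, which matches DRFLAC_OGG_MAX_PAGE_SIZE. The example_* helpers are hypothetical and
work on a bare segment table rather than on drflac_ogg_page_header.
*/
#include <stdint.h>

/* Sums every lacing value in the table, as drflac_ogg__get_page_body_size() does for a page header. */
static uint32_t example_ogg_page_body_size(const uint8_t* pSegmentTable, uint8_t segmentCount)
{
    uint32_t size = 0;
    unsigned int i;
    for (i = 0; i < segmentCount; ++i) {
        size += pSegmentTable[i];
    }
    return size;
}

/* Size of the packet starting at segment iFirstSeg. *pIsComplete is 0 when the packet continues on the next page. */
static uint32_t example_ogg_packet_size(const uint8_t* pSegmentTable, uint8_t segmentCount, uint8_t iFirstSeg, int* pIsComplete)
{
    uint32_t size = 0;
    unsigned int i;

    *pIsComplete = 0;
    for (i = iFirstSeg; i < segmentCount; ++i) {
        size += pSegmentTable[i];
        if (pSegmentTable[i] < 255) {
            *pIsComplete = 1;
            break;
        }
    }
    return size;
}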

static drflac_result drflac_ogg__read_page_header_after_capture_pattern(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32)
{
    drflac_uint8 data[23];
    drflac_uint32 i;

    DRFLAC_ASSERT(*pCRC32 == DRFLAC_OGG_CAPTURE_PATTERN_CRC32);

    if (onRead(pUserData, data, 23) != 23) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += 23;

    /*
    It's not actually used, but set the capture pattern to 'OggS' for completeness. Not doing this will cause static analysers to complain about
    us trying to access uninitialized data. We could alternatively just comment out this member of the drflac_ogg_page_header structure, but I
    like to have it map to the structure of the underlying data.
    */
    pHeader->capturePattern[0] = 'O';
    pHeader->capturePattern[1] = 'g';
    pHeader->capturePattern[2] = 'g';
    pHeader->capturePattern[3] = 'S';

    pHeader->structureVersion = data[0];
    pHeader->headerType = data[1];
    DRFLAC_COPY_MEMORY(&pHeader->granulePosition, &data[ 2], 8);
    DRFLAC_COPY_MEMORY(&pHeader->serialNumber, &data[10], 4);
    DRFLAC_COPY_MEMORY(&pHeader->sequenceNumber, &data[14], 4);
    DRFLAC_COPY_MEMORY(&pHeader->checksum, &data[18], 4);
    pHeader->segmentCount = data[22];

    /* Calculate the CRC. Note that for the calculation the checksum part of the page needs to be set to 0. */
    data[18] = 0;
    data[19] = 0;
    data[20] = 0;
    data[21] = 0;

    for (i = 0; i < 23; ++i) {
        *pCRC32 = drflac_crc32_byte(*pCRC32, data[i]);
    }


    if (onRead(pUserData, pHeader->segmentTable, pHeader->segmentCount) != pHeader->segmentCount) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += pHeader->segmentCount;

    for (i = 0; i < pHeader->segmentCount; ++i) {
        *pCRC32 = drflac_crc32_byte(*pCRC32, pHeader->segmentTable[i]);
    }

    return DRFLAC_SUCCESS;
}

static drflac_result drflac_ogg__read_page_header(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32)
{
    drflac_uint8 id[4];

    *pBytesRead = 0;

    if (onRead(pUserData, id, 4) != 4) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += 4;

    /* We need to read byte-by-byte until we find the OggS capture pattern. */
    for (;;) {
        if (drflac_ogg__is_capture_pattern(id)) {
            drflac_result result;

            *pCRC32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32;

            result = drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, pHeader, pBytesRead, pCRC32);
            if (result == DRFLAC_SUCCESS) {
                return DRFLAC_SUCCESS;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue;
                } else {
                    return result;
                }
            }
        } else {
            /* The first 4 bytes did not equal the capture pattern. Read the next byte and try again. */
            id[0] = id[1];
            id[1] = id[2];
            id[2] = id[3];
            if (onRead(pUserData, &id[3], 1) != 1) {
                return DRFLAC_AT_END;
            }
            *pBytesRead += 1;
        }
    }
}
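
/*
Illustrative sketch (not part of dr_flac): drflac_ogg__read_page_header() resynchronizes by sliding a 4-byte window one
byte at a time until it holds "OggS". The hypothetical example_find_capture_pattern() below applies the same window to
an in-memory buffer instead of an onRead callback, which is an assumption made to keep the example self-contained.
*/
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first "OggS" in pData, or dataSize when the pattern is not present. */
static size_t example_find_capture_pattern(const uint8_t* pData, size_t dataSize)
{
    uint8_t id[4];
    size_t pos;  /* Number of bytes consumed so far. */

    if (dataSize < 4) {
        return dataSize;
    }

    id[0] = pData[0];
    id[1] = pData[1];
    id[2] = pData[2];
    id[3] = pData[3];
    pos = 4;

    for (;;) {
        if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') {
            return pos - 4;
        }

        if (pos == dataSize) {
            return dataSize;  /* Ran out of data without finding the pattern. */
        }

        /* Shift the window down by one byte and pull in the next one. */
        id[0] = id[1];
        id[1] = id[2];
        id[2] = id[3];
        id[3] = pData[pos];
        pos += 1;
    }
}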


/*
The main part of the Ogg encapsulation is the conversion from the physical Ogg bitstream to the native FLAC bitstream. It works
in three general stages: Ogg Physical Bitstream -> Ogg/FLAC Logical Bitstream -> FLAC Native Bitstream. dr_flac is designed
in such a way that the core sections assume everything is delivered in native format. Therefore, for each encapsulation type
dr_flac is supporting there needs to be a layer sitting on top of the onRead and onSeek callbacks that ensures the bits read from
the physical Ogg bitstream are converted and delivered in native FLAC format.
*/
typedef struct
{
    drflac_read_proc onRead; /* The original onRead callback from drflac_open() and family. */
    drflac_seek_proc onSeek; /* The original onSeek callback from drflac_open() and family. */
    void* pUserData; /* The user data passed on onRead and onSeek. This is the user data that was passed on drflac_open() and family. */
    drflac_uint64 currentBytePos; /* The position of the byte we are sitting on in the physical byte stream. Used for efficient seeking. */
    drflac_uint64 firstBytePos; /* The position of the first byte in the physical bitstream. Points to the start of the "OggS" identifier of the FLAC bos page. */
    drflac_uint32 serialNumber; /* The serial number of the FLAC audio pages. This is determined by the initial header page that was read during initialization. */
    drflac_ogg_page_header bosPageHeader; /* Used for seeking. */
    drflac_ogg_page_header currentPageHeader;
    drflac_uint32 bytesRemainingInPage;
    drflac_uint32 pageDataSize;
    drflac_uint8 pageData[DRFLAC_OGG_MAX_PAGE_SIZE];
} drflac_oggbs; /* oggbs = Ogg Bitstream */

static size_t drflac_oggbs__read_physical(drflac_oggbs* oggbs, void* bufferOut, size_t bytesToRead)
{
    size_t bytesActuallyRead = oggbs->onRead(oggbs->pUserData, bufferOut, bytesToRead);
    oggbs->currentBytePos += bytesActuallyRead;

    return bytesActuallyRead;
}

static drflac_bool32 drflac_oggbs__seek_physical(drflac_oggbs* oggbs, drflac_uint64 offset, drflac_seek_origin origin)
{
    if (origin == drflac_seek_origin_start) {
        if (offset <= 0x7FFFFFFF) {
            if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_start)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos = offset;

            return DRFLAC_TRUE;
        } else {
            if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos = 0x7FFFFFFF; /* Only the first 0x7FFFFFFF bytes have been covered so far. The relative seek below adds the remainder. */

            return drflac_oggbs__seek_physical(oggbs, offset - 0x7FFFFFFF, drflac_seek_origin_current);
        }
    } else {
        while (offset > 0x7FFFFFFF) {
            if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos += 0x7FFFFFFF;
            offset -= 0x7FFFFFFF;
        }

        if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_current)) { /* <-- Safe cast thanks to the loop above. */
            return DRFLAC_FALSE;
        }
        oggbs->currentBytePos += offset;

        return DRFLAC_TRUE;
    }
}
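
/*
Illustrative sketch (not part of dr_flac): the seek callback takes a signed 32-bit offset, so a 64-bit forward seek is
issued as a run of relative seeks of at most 0x7FFFFFFF bytes each. The hypothetical example_* code below replays that
loop against a stand-in seek target that just records the position and the number of calls. For 5,000,000,000 bytes
that is two full steps of 2,147,483,647 plus a final step of 705,032,706.
*/
#include <stdint.h>

typedef struct
{
    uint64_t position;
    unsigned int callCount;
} example_seek_target;

/* Stand-in for a relative seek callback. Only non-negative offsets are used in this sketch. */
static int example_seek_relative(example_seek_target* pTarget, int offset)
{
    pTarget->position += (uint64_t)offset;
    pTarget->callCount += 1;
    return 1;
}

/* Seeks forward by a 64-bit offset using steps that fit in a signed 32-bit integer. */
static int example_seek_forward_64(example_seek_target* pTarget, uint64_t offset)
{
    while (offset > 0x7FFFFFFF) {
        if (!example_seek_relative(pTarget, 0x7FFFFFFF)) {
            return 0;
        }
        offset -= 0x7FFFFFFF;
    }

    return example_seek_relative(pTarget, (int)offset);  /* Safe cast thanks to the loop above. */
}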

static drflac_bool32 drflac_oggbs__goto_next_page(drflac_oggbs* oggbs, drflac_ogg_crc_mismatch_recovery recoveryMethod)
{
    drflac_ogg_page_header header;
    for (;;) {
        drflac_uint32 crc32 = 0;
        drflac_uint32 bytesRead;
        drflac_uint32 pageBodySize;
#ifndef DR_FLAC_NO_CRC
        drflac_uint32 actualCRC32;
#endif

        if (drflac_ogg__read_page_header(oggbs->onRead, oggbs->pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
            return DRFLAC_FALSE;
        }
        oggbs->currentBytePos += bytesRead;

        pageBodySize = drflac_ogg__get_page_body_size(&header);
        if (pageBodySize > DRFLAC_OGG_MAX_PAGE_SIZE) {
            continue; /* Invalid page size. Assume it's corrupted and just move to the next page. */
        }

        if (header.serialNumber != oggbs->serialNumber) {
            /* It's not a FLAC page. Skip it. */
            if (pageBodySize > 0 && !drflac_oggbs__seek_physical(oggbs, pageBodySize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            continue;
        }


        /* We need to read the entire page and then do a CRC check on it. If there's a CRC mismatch we need to skip this page. */
        if (drflac_oggbs__read_physical(oggbs, oggbs->pageData, pageBodySize) != pageBodySize) {
            return DRFLAC_FALSE;
        }
        oggbs->pageDataSize = pageBodySize;

#ifndef DR_FLAC_NO_CRC
        actualCRC32 = drflac_crc32_buffer(crc32, oggbs->pageData, oggbs->pageDataSize);
        if (actualCRC32 != header.checksum) {
            if (recoveryMethod == drflac_ogg_recover_on_crc_mismatch) {
                continue; /* CRC mismatch. Skip this page. */
            } else {
                /*
                Even though we are failing on a CRC mismatch, we still want our stream to be in a good state. Therefore we
                go to the next valid page to ensure we're in a good state, but return false to let the caller know that the
                seek did not fully complete.
                */
                drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch);
                return DRFLAC_FALSE;
            }
        }
#else
        (void)recoveryMethod; /* <-- Silence a warning. */
#endif

        oggbs->currentPageHeader = header;
        oggbs->bytesRemainingInPage = pageBodySize;
        return DRFLAC_TRUE;
    }
}

/* Function below is unused at the moment, but I might be re-adding it later. */
#if 0
static drflac_uint8 drflac_oggbs__get_current_segment_index(drflac_oggbs* oggbs, drflac_uint8* pBytesRemainingInSeg)
{
    drflac_uint32 bytesConsumedInPage = drflac_ogg__get_page_body_size(&oggbs->currentPageHeader) - oggbs->bytesRemainingInPage;
    drflac_uint8 iSeg = 0;
    drflac_uint32 iByte = 0;
    while (iByte < bytesConsumedInPage) {
        drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg];
        if (iByte + segmentSize > bytesConsumedInPage) {
            break;
        } else {
            iSeg += 1;
            iByte += segmentSize;
        }
    }

    *pBytesRemainingInSeg = oggbs->currentPageHeader.segmentTable[iSeg] - (drflac_uint8)(bytesConsumedInPage - iByte);
    return iSeg;
}

static drflac_bool32 drflac_oggbs__seek_to_next_packet(drflac_oggbs* oggbs)
{
    /* The current packet ends when we get to the segment with a lacing value of < 255 which is not at the end of a page. */
    for (;;) {
        drflac_bool32 atEndOfPage = DRFLAC_FALSE;

        drflac_uint8 bytesRemainingInSeg;
        drflac_uint8 iFirstSeg = drflac_oggbs__get_current_segment_index(oggbs, &bytesRemainingInSeg);

        drflac_uint32 bytesToEndOfPacketOrPage = bytesRemainingInSeg;
        for (drflac_uint8 iSeg = iFirstSeg; iSeg < oggbs->currentPageHeader.segmentCount; ++iSeg) {
            drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg];
            if (segmentSize < 255) {
                if (iSeg == oggbs->currentPageHeader.segmentCount-1) {
                    atEndOfPage = DRFLAC_TRUE;
                }

                break;
            }

            bytesToEndOfPacketOrPage += segmentSize;
        }

        /*
        At this point we will have found either the packet or the end of the page. If we're at the end of the page we'll
        want to load the next page and keep searching for the end of the packet.
        */
        drflac_oggbs__seek_physical(oggbs, bytesToEndOfPacketOrPage, drflac_seek_origin_current);
        oggbs->bytesRemainingInPage -= bytesToEndOfPacketOrPage;

        if (atEndOfPage) {
            /*
            We're potentially at the next packet, but we need to check the next page first to be sure because the packet may
            straddle pages.
            */
            if (!drflac_oggbs__goto_next_page(oggbs)) {
                return DRFLAC_FALSE;
            }

            /* If it's a fresh packet it most likely means we're at the next packet. */
            if ((oggbs->currentPageHeader.headerType & 0x01) == 0) {
                return DRFLAC_TRUE;
            }
        } else {
            /* We're at the next packet. */
            return DRFLAC_TRUE;
        }
    }
}

static drflac_bool32 drflac_oggbs__seek_to_next_frame(drflac_oggbs* oggbs)
{
    /* The bitstream should be sitting on the first byte just after the header of the frame. */

    /* What we're actually doing here is seeking to the start of the next packet. */
    return drflac_oggbs__seek_to_next_packet(oggbs);
}
#endif

static size_t drflac__on_read_ogg(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pUserData;
    drflac_uint8* pRunningBufferOut = (drflac_uint8*)bufferOut;
    size_t bytesRead = 0;

    DRFLAC_ASSERT(oggbs != NULL);
    DRFLAC_ASSERT(pRunningBufferOut != NULL);

    /* Reading is done page-by-page. If we've run out of bytes in the page we need to move to the next one. */
    while (bytesRead < bytesToRead) {
        size_t bytesRemainingToRead = bytesToRead - bytesRead;

        if (oggbs->bytesRemainingInPage >= bytesRemainingToRead) {
            DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), bytesRemainingToRead);
            bytesRead += bytesRemainingToRead;
            oggbs->bytesRemainingInPage -= (drflac_uint32)bytesRemainingToRead;
            break;
        }

        /* If we get here it means some of the requested data is contained in the next pages. */
        if (oggbs->bytesRemainingInPage > 0) {
            DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), oggbs->bytesRemainingInPage);
            bytesRead += oggbs->bytesRemainingInPage;
            pRunningBufferOut += oggbs->bytesRemainingInPage;
            oggbs->bytesRemainingInPage = 0;
        }

        DRFLAC_ASSERT(bytesRemainingToRead > 0);
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
            break;  /* Failed to go to the next page. Might have simply hit the end of the stream. */
        }
    }

    return bytesRead;
}

static drflac_bool32 drflac__on_seek_ogg(void* pUserData, int offset, drflac_seek_origin origin)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pUserData;
    int bytesSeeked = 0;

    DRFLAC_ASSERT(oggbs != NULL);
    DRFLAC_ASSERT(offset >= 0);  /* <-- Never seek backwards. */

    /* Seeking is always forward which makes things a lot simpler. */
    if (origin == drflac_seek_origin_start) {
        if (!drflac_oggbs__seek_physical(oggbs, (int)oggbs->firstBytePos, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }

        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) {
            return DRFLAC_FALSE;
        }

        return drflac__on_seek_ogg(pUserData, offset, drflac_seek_origin_current);
    }

    DRFLAC_ASSERT(origin == drflac_seek_origin_current);

    while (bytesSeeked < offset) {
        int bytesRemainingToSeek = offset - bytesSeeked;
        DRFLAC_ASSERT(bytesRemainingToSeek >= 0);

        if (oggbs->bytesRemainingInPage >= (size_t)bytesRemainingToSeek) {
            bytesSeeked += bytesRemainingToSeek;
            (void)bytesSeeked;  /* <-- Silence a dead store warning emitted by Clang Static Analyzer. */
            oggbs->bytesRemainingInPage -= bytesRemainingToSeek;
            break;
        }

        /* If we get here it means some of the requested data is contained in the next pages. */
        if (oggbs->bytesRemainingInPage > 0) {
            bytesSeeked += (int)oggbs->bytesRemainingInPage;
            oggbs->bytesRemainingInPage = 0;
        }

        DRFLAC_ASSERT(bytesRemainingToSeek > 0);
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) {
            /* Failed to go to the next page. We either hit the end of the stream or had a CRC mismatch. */
            return DRFLAC_FALSE;
        }
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac_ogg__seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
    drflac_uint64 originalBytePos;
    drflac_uint64 runningGranulePosition;
    drflac_uint64 runningFrameBytePos;
    drflac_uint64 runningPCMFrameCount;

    DRFLAC_ASSERT(oggbs != NULL);

    originalBytePos = oggbs->currentBytePos;   /* For recovery. Points to the OggS identifier. */

    /* First seek to the first frame. */
    if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes)) {
        return DRFLAC_FALSE;
    }
    oggbs->bytesRemainingInPage = 0;

    runningGranulePosition = 0;
    for (;;) {
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
            drflac_oggbs__seek_physical(oggbs, originalBytePos, drflac_seek_origin_start);
            return DRFLAC_FALSE;   /* Never did find that sample... */
        }

        runningFrameBytePos = oggbs->currentBytePos - drflac_ogg__get_page_header_size(&oggbs->currentPageHeader) - oggbs->pageDataSize;
        if (oggbs->currentPageHeader.granulePosition >= pcmFrameIndex) {
            break; /* The sample is somewhere in the previous page. */
        }

        /*
        At this point we know the sample is not in the previous page. It could possibly be in this page. For simplicity we
        disregard any pages that do not begin a fresh packet.
        */
        if ((oggbs->currentPageHeader.headerType & 0x01) == 0) {    /* <-- Is it a fresh page? */
            if (oggbs->currentPageHeader.segmentTable[0] >= 2) {
                drflac_uint8 firstBytesInPage[2];
                firstBytesInPage[0] = oggbs->pageData[0];
                firstBytesInPage[1] = oggbs->pageData[1];

                if ((firstBytesInPage[0] == 0xFF) && (firstBytesInPage[1] & 0xFC) == 0xF8) {    /* <-- Does the page begin with a frame's sync code? */
                    runningGranulePosition = oggbs->currentPageHeader.granulePosition;
                }

                continue;
            }
        }
    }

    /*
    We found the page that is closest to the sample, so now we need to find it. The first thing to do is seek to the
    start of that page. In the loop above we checked that it was a fresh page which means this page is also the start of
    a new frame. This property means that after we've seeked to the page we can immediately start looping over frames until
    we find the one containing the target sample.
    */
    if (!drflac_oggbs__seek_physical(oggbs, runningFrameBytePos, drflac_seek_origin_start)) {
        return DRFLAC_FALSE;
    }
    if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
        return DRFLAC_FALSE;
    }

    /*
    At this point we'll be sitting on the first byte of the frame header of the first frame in the page. We just keep
    looping over these frames until we find the one containing the sample we're after.
    */
    runningPCMFrameCount = runningGranulePosition;
    for (;;) {
        /*
        There are two ways to find the sample and seek past irrelevant frames:
          1) Use the native FLAC decoder.
          2) Use Ogg's framing system.

        Both of these options have their own pros and cons. Using the native FLAC decoder is slower because it needs to
        do a full decode of the frame. Using Ogg's framing system is faster, but more complicated and involves some code
        duplication for the decoding of frame headers.

        Another thing to consider is that using the Ogg framing system will perform direct seeking of the physical Ogg
        bitstream. This is important to consider because it means we cannot read data from the drflac_bs object using the
        standard drflac__*() APIs because that will read in extra data for its own internal caching which in turn breaks
        the positioning of the read pointer of the physical Ogg bitstream. Therefore, anything that would normally be read
        using the native FLAC decoding APIs, such as drflac__read_next_flac_frame_header(), needs to be re-implemented so as to
        avoid the use of the drflac_bs object.

        Considering these issues, I have decided to use the slower native FLAC decoding method for the following reasons:
          1) Seeking is already partially accelerated using Ogg's paging system in the code block above.
          2) Seeking in an Ogg encapsulated FLAC stream is probably quite uncommon.
          3) Simplicity.
        */
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;
        drflac_uint64 pcmFrameCountInThisFrame;

        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;

        /* If we are seeking to the end of the file and we've just hit it, we're done. */
        if (pcmFrameIndex == pFlac->totalPCMFrameCount && (runningPCMFrameCount + pcmFrameCountInThisFrame) == pFlac->totalPCMFrameCount) {
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                pFlac->currentPCMFrame = pcmFrameIndex;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                return DRFLAC_TRUE;
            } else {
                return DRFLAC_FALSE;
            }
        }

        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFrame)) {
            /*
            The sample should be in this FLAC frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend
            it never existed and keep iterating.
            */
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */
                drflac_uint64 pcmFramesToDecode = (size_t)(pcmFrameIndex - runningPCMFrameCount);    /* <-- Safe cast because the maximum number of samples in a frame is 65535. */
                if (pcmFramesToDecode == 0) {
                    return DRFLAC_TRUE;
                }

                pFlac->currentPCMFrame = runningPCMFrameCount;

                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;  /* <-- If this fails, something bad has happened (it should never fail). */
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue;   /* CRC mismatch. Pretend this frame never existed. */
                } else {
                    return DRFLAC_FALSE;
                }
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this
            frame never existed and leave the running sample count untouched.
            */
            drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                runningPCMFrameCount += pcmFrameCountInThisFrame;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue;   /* CRC mismatch. Pretend this frame never existed. */
                } else {
                    return DRFLAC_FALSE;
                }
            }
        }
    }
}
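
/*
Illustrative sketch only, not part of dr_flac. The page scan in the seek routine above treats a page as the start of a
FLAC frame when its first two bytes match the frame sync pattern. The standalone helper below restates that byte test.
The name example_starts_with_flac_sync() is hypothetical. Note that the 0xFC mask compares only the top six bits of the
second byte, so the two low bits are ignored by this test.
*/
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the buffer begins with the byte pattern 0xFF followed by 111110xx. */
static int example_starts_with_flac_sync(const uint8_t* data, size_t size)
{
    assert(data != NULL);

    if (size < 2) {
        return 0;   /* Not enough bytes to hold the sync pattern. */
    }

    /* Same comparison as the page scan: first byte all ones, top six bits of the second byte equal to 111110. */
    return (data[0] == 0xFF) && ((data[1] & 0xFC) == 0xF8);
}
```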



static drflac_bool32 drflac__init_private__ogg(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed)
{
    drflac_ogg_page_header header;
    drflac_uint32 crc32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32;
    drflac_uint32 bytesRead = 0;

    /* Pre Condition: The bit stream should be sitting just past the 4-byte OggS capture pattern. */
    (void)relaxed;

    pInit->container = drflac_container_ogg;
    pInit->oggFirstBytePos = 0;

    /*
    We'll get here if the first 4 bytes of the stream were the OggS capture pattern, however it doesn't necessarily mean the
    stream includes FLAC encoded audio. To check for this we need to scan the beginning-of-stream page markers and check if
    any match the FLAC specification. Important to keep in mind that the stream may be multiplexed.
    */
    if (drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
        return DRFLAC_FALSE;
    }
    pInit->runningFilePos += bytesRead;

    for (;;) {
        int pageBodySize;

        /* Fail if we're past the beginning-of-stream pages without having found a FLAC stream. */
        if ((header.headerType & 0x02) == 0) {
            return DRFLAC_FALSE;
        }

        /* Check if it's a FLAC header. */
        pageBodySize = drflac_ogg__get_page_body_size(&header);
        if (pageBodySize == 51) {   /* 51 = the lacing value of the FLAC header packet. */
            /* It could be a FLAC page... */
            drflac_uint32 bytesRemainingInPage = pageBodySize;
            drflac_uint8 packetType;

            if (onRead(pUserData, &packetType, 1) != 1) {
                return DRFLAC_FALSE;
            }

            bytesRemainingInPage -= 1;
            if (packetType == 0x7F) {
                /* Increasingly more likely to be a FLAC page... */
                drflac_uint8 sig[4];
                if (onRead(pUserData, sig, 4) != 4) {
                    return DRFLAC_FALSE;
                }

                bytesRemainingInPage -= 4;
                if (sig[0] == 'F' && sig[1] == 'L' && sig[2] == 'A' && sig[3] == 'C') {
                    /* Almost certainly a FLAC page... */
                    drflac_uint8 mappingVersion[2];
                    if (onRead(pUserData, mappingVersion, 2) != 2) {
                        return DRFLAC_FALSE;
                    }

                    if (mappingVersion[0] != 1) {
                        return DRFLAC_FALSE;   /* Only supporting version 1.x of the Ogg mapping. */
                    }

                    /*
                    The next 2 bytes hold the number of non-audio packets, not including this one. We don't care about this because
                    we're going to be handling it in a generic way based on the serial number and packet types.
                    */
                    if (!onSeek(pUserData, 2, drflac_seek_origin_current)) {
                        return DRFLAC_FALSE;
                    }

                    /* Expecting the native FLAC signature "fLaC". */
                    if (onRead(pUserData, sig, 4) != 4) {
                        return DRFLAC_FALSE;
                    }

                    if (sig[0] == 'f' && sig[1] == 'L' && sig[2] == 'a' && sig[3] == 'C') {
                        /* The remaining data in the page should be the STREAMINFO block. */
                        drflac_streaminfo streaminfo;
                        drflac_uint8 isLastBlock;
                        drflac_uint8 blockType;
                        drflac_uint32 blockSize;
                        if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) {
                            return DRFLAC_FALSE;
                        }

                        if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) {
                            return DRFLAC_FALSE;    /* Invalid block type. First block must be the STREAMINFO block. */
                        }

                        if (drflac__read_streaminfo(onRead, pUserData, &streaminfo)) {
                            /* Success! */
                            pInit->hasStreamInfoBlock      = DRFLAC_TRUE;
                            pInit->sampleRate              = streaminfo.sampleRate;
                            pInit->channels                = streaminfo.channels;
                            pInit->bitsPerSample           = streaminfo.bitsPerSample;
                            pInit->totalPCMFrameCount      = streaminfo.totalPCMFrameCount;
                            pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames;
                            pInit->hasMetadataBlocks       = !isLastBlock;

                            if (onMeta) {
                                drflac_metadata metadata;
                                metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO;
                                metadata.pRawData = NULL;
                                metadata.rawDataSize = 0;
                                metadata.data.streaminfo = streaminfo;
                                onMeta(pUserDataMD, &metadata);
                            }

                            pInit->runningFilePos  += pageBodySize;
                            pInit->oggFirstBytePos  = pInit->runningFilePos - 79;   /* Subtracting 79 will place us right on top of the "OggS" identifier of the FLAC bos page. */
                            pInit->oggSerial        = header.serialNumber;
                            pInit->oggBosHeader     = header;
                            break;
                        } else {
                            /* Failed to read STREAMINFO block. Aww, so close... */
                            return DRFLAC_FALSE;
                        }
                    } else {
                        /* Invalid file. */
                        return DRFLAC_FALSE;
                    }
                } else {
                    /* Not a FLAC header. Skip it. */
                    if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* Not a FLAC header. Seek past the entire page and move on to the next. */
                if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) {
                    return DRFLAC_FALSE;
                }
            }
        } else {
            if (!onSeek(pUserData, pageBodySize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
        }

        pInit->runningFilePos += pageBodySize;


        /* Read the header of the next page. */
        if (drflac_ogg__read_page_header(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
            return DRFLAC_FALSE;
        }
        pInit->runningFilePos += bytesRead;
    }

    /*
    If we get here it means we found a FLAC audio stream. We should be sitting on the first byte of the header of the next page. The next
    packets in the FLAC logical stream contain the metadata. The only thing left to do in the initialization phase for Ogg is to create the
    Ogg bitstream object.
    */
    pInit->hasMetadataBlocks = DRFLAC_TRUE;    /* <-- Always have at least VORBIS_COMMENT metadata block. */
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__init_private(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD)
{
    drflac_bool32 relaxed;
    drflac_uint8 id[4];

    if (pInit == NULL || onRead == NULL || onSeek == NULL) {
        return DRFLAC_FALSE;
    }

    DRFLAC_ZERO_MEMORY(pInit, sizeof(*pInit));
    pInit->onRead       = onRead;
    pInit->onSeek       = onSeek;
    pInit->onMeta       = onMeta;
    pInit->container    = container;
    pInit->pUserData    = pUserData;
    pInit->pUserDataMD  = pUserDataMD;

    pInit->bs.onRead    = onRead;
    pInit->bs.onSeek    = onSeek;
    pInit->bs.pUserData = pUserData;
    drflac__reset_cache(&pInit->bs);


    /* If the container is explicitly defined then we can try opening in relaxed mode. */
    relaxed = container != drflac_container_unknown;

    /* Skip over any ID3 tags. */
    for (;;) {
        if (onRead(pUserData, id, 4) != 4) {
            return DRFLAC_FALSE;    /* Ran out of data. */
        }
        pInit->runningFilePos += 4;

        if (id[0] == 'I' && id[1] == 'D' && id[2] == '3') {
            drflac_uint8 header[6];
            drflac_uint8 flags;
            drflac_uint32 headerSize;

            if (onRead(pUserData, header, 6) != 6) {
                return DRFLAC_FALSE;    /* Ran out of data. */
            }
            pInit->runningFilePos += 6;

            flags = header[1];

            DRFLAC_COPY_MEMORY(&headerSize, header+2, 4);
            headerSize = drflac__unsynchsafe_32(drflac__be2host_32(headerSize));
            if (flags & 0x10) {
                headerSize += 10;
            }

            if (!onSeek(pUserData, headerSize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;    /* Failed to seek past the tag. */
            }
            pInit->runningFilePos += headerSize;
        } else {
            break;
        }
    }

    if (id[0] == 'f' && id[1] == 'L' && id[2] == 'a' && id[3] == 'C') {
        return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
    }
#ifndef DR_FLAC_NO_OGG
    if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') {
        return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
    }
#endif

    /* If we get here it means we likely don't have a header. Try opening in relaxed mode, if applicable. */
    if (relaxed) {
        if (container == drflac_container_native) {
            return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
        }
#ifndef DR_FLAC_NO_OGG
        if (container == drflac_container_ogg) {
            return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
        }
#endif
    }

    /* Unsupported container. */
    return DRFLAC_FALSE;
}
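
/*
Illustrative sketch only, not part of dr_flac. The ID3 skip loop above reads a size field in which each of the four
bytes contributes only its low 7 bits (a "synchsafe" integer), and adds 10 bytes when the footer flag (0x10) is set.
The helper below restates that calculation on the 6 bytes that follow the "ID3" marker and major version byte. It
assumes the standard ID3v2 header layout; the name example_id3v2_bytes_to_skip() is hypothetical.
*/
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
Hypothetical helper. rest[0] is the revision byte, rest[1] the flags byte and rest[2..5] the big-endian synchsafe size.
Returns the number of bytes to skip after these 6 bytes.
*/
static uint32_t example_id3v2_bytes_to_skip(const uint8_t rest[6])
{
    uint32_t size;

    assert(rest != NULL);

    /* Each size byte carries 7 significant bits, most significant byte first. */
    size = ((uint32_t)(rest[2] & 0x7F) << 21) |
           ((uint32_t)(rest[3] & 0x7F) << 14) |
           ((uint32_t)(rest[4] & 0x7F) <<  7) |
           ((uint32_t)(rest[5] & 0x7F));

    /* A set footer flag means a 10-byte footer follows the tag body. */
    if (rest[1] & 0x10) {
        size += 10;
    }

    return size;
}
```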

static void drflac__init_from_info(drflac* pFlac, const drflac_init_info* pInit)
{
    DRFLAC_ASSERT(pFlac != NULL);
    DRFLAC_ASSERT(pInit != NULL);

    DRFLAC_ZERO_MEMORY(pFlac, sizeof(*pFlac));
    pFlac->bs                      = pInit->bs;
    pFlac->onMeta                  = pInit->onMeta;
    pFlac->pUserDataMD             = pInit->pUserDataMD;
    pFlac->maxBlockSizeInPCMFrames = pInit->maxBlockSizeInPCMFrames;
    pFlac->sampleRate              = pInit->sampleRate;
    pFlac->channels                = (drflac_uint8)pInit->channels;
    pFlac->bitsPerSample           = (drflac_uint8)pInit->bitsPerSample;
    pFlac->totalPCMFrameCount      = pInit->totalPCMFrameCount;
    pFlac->container               = pInit->container;
}


static drflac* drflac_open_with_metadata_private(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac_init_info init;
    drflac_uint32 allocationSize;
    drflac_uint32 wholeSIMDVectorCountPerChannel;
    drflac_uint32 decodedSamplesAllocationSize;
#ifndef DR_FLAC_NO_OGG
    drflac_oggbs* pOggbs = NULL;
#endif
    drflac_uint64 firstFramePos;
    drflac_uint64 seektablePos;
    drflac_uint32 seekpointCount;
    drflac_allocation_callbacks allocationCallbacks;
    drflac* pFlac;

    /* CPU support first. */
    drflac__init_cpu_caps();

    if (!drflac__init_private(&init, onRead, onSeek, onMeta, container, pUserData, pUserDataMD)) {
        return NULL;
    }

    if (pAllocationCallbacks != NULL) {
        allocationCallbacks = *pAllocationCallbacks;
        if (allocationCallbacks.onFree == NULL || (allocationCallbacks.onMalloc == NULL && allocationCallbacks.onRealloc == NULL)) {
            return NULL;    /* Invalid allocation callbacks. */
        }
    } else {
        allocationCallbacks.pUserData = NULL;
        allocationCallbacks.onMalloc  = drflac__malloc_default;
        allocationCallbacks.onRealloc = drflac__realloc_default;
        allocationCallbacks.onFree    = drflac__free_default;
    }


    /*
    The size of the allocation for the drflac object needs to be large enough to fit the following:
      1) The main members of the drflac structure
      2) A block of memory large enough to store the decoded samples of the largest frame in the stream
      3) If the container is Ogg, a drflac_oggbs object

    The complicated part of the allocation is making sure there's enough room for the decoded samples, taking into
    consideration the different SIMD instruction sets.
    */
    allocationSize = sizeof(drflac);

    /*
    The allocation size for decoded frames depends on the number of 32-bit integers that fit inside the largest SIMD vector
    we are supporting.
    */
    if ((init.maxBlockSizeInPCMFrames % (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) == 0) {
        wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32)));
    } else {
        wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) + 1;
    }

    decodedSamplesAllocationSize = wholeSIMDVectorCountPerChannel * DRFLAC_MAX_SIMD_VECTOR_SIZE * init.channels;

    allocationSize += decodedSamplesAllocationSize;
    allocationSize += DRFLAC_MAX_SIMD_VECTOR_SIZE;  /* Allocate extra bytes to ensure we have enough for alignment. */

#ifndef DR_FLAC_NO_OGG
    /* There's additional data required for Ogg streams. */
    if (init.container == drflac_container_ogg) {
        allocationSize += sizeof(drflac_oggbs);

        pOggbs = (drflac_oggbs*)drflac__malloc_from_callbacks(sizeof(*pOggbs), &allocationCallbacks);
        if (pOggbs == NULL) {
            return NULL; /*DRFLAC_OUT_OF_MEMORY;*/
        }

        DRFLAC_ZERO_MEMORY(pOggbs, sizeof(*pOggbs));
        pOggbs->onRead = onRead;
        pOggbs->onSeek = onSeek;
        pOggbs->pUserData = pUserData;
        pOggbs->currentBytePos = init.oggFirstBytePos;
        pOggbs->firstBytePos = init.oggFirstBytePos;
        pOggbs->serialNumber = init.oggSerial;
        pOggbs->bosPageHeader = init.oggBosHeader;
        pOggbs->bytesRemainingInPage = 0;
    }
#endif

    /*
    This part is a bit awkward. We need to load the seektable so that it can be referenced in-memory, but I want the drflac object to
    consist of only a single heap allocation. To do this, the size of the seek table needs to be known, which we determine when reading
    and decoding the metadata.
    */
    firstFramePos  = 42;   /* <-- We know we are at byte 42 at this point. */
    seektablePos   = 0;
    seekpointCount = 0;
    if (init.hasMetadataBlocks) {
        drflac_read_proc onReadOverride = onRead;
        drflac_seek_proc onSeekOverride = onSeek;
        void* pUserDataOverride = pUserData;

#ifndef DR_FLAC_NO_OGG
        if (init.container == drflac_container_ogg) {
            onReadOverride = drflac__on_read_ogg;
            onSeekOverride = drflac__on_seek_ogg;
            pUserDataOverride = (void*)pOggbs;
        }
#endif

        if (!drflac__read_and_decode_metadata(onReadOverride, onSeekOverride, onMeta, pUserDataOverride, pUserDataMD, &firstFramePos, &seektablePos, &seekpointCount, &allocationCallbacks)) {
        #ifndef DR_FLAC_NO_OGG
            drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
        #endif
            return NULL;
        }

        allocationSize += seekpointCount * sizeof(drflac_seekpoint);
    }


    pFlac = (drflac*)drflac__malloc_from_callbacks(allocationSize, &allocationCallbacks);
    if (pFlac == NULL) {
    #ifndef DR_FLAC_NO_OGG
        drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
    #endif
        return NULL;
    }

    drflac__init_from_info(pFlac, &init);
    pFlac->allocationCallbacks = allocationCallbacks;
    pFlac->pDecodedSamples = (drflac_int32*)drflac_align((size_t)pFlac->pExtraData, DRFLAC_MAX_SIMD_VECTOR_SIZE);

#ifndef DR_FLAC_NO_OGG
    if (init.container == drflac_container_ogg) {
        drflac_oggbs* pInternalOggbs = (drflac_oggbs*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize + (seekpointCount * sizeof(drflac_seekpoint)));
        DRFLAC_COPY_MEMORY(pInternalOggbs, pOggbs, sizeof(*pOggbs));

        /* At this point the pOggbs object has been handed over to pInternalOggbs and can be freed. */
        drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
        pOggbs = NULL;

        /* The Ogg bitstream needs to be layered on top of the original bitstream. */
        pFlac->bs.onRead = drflac__on_read_ogg;
        pFlac->bs.onSeek = drflac__on_seek_ogg;
        pFlac->bs.pUserData = (void*)pInternalOggbs;
        pFlac->_oggbs = (void*)pInternalOggbs;
    }
#endif

    pFlac->firstFLACFramePosInBytes = firstFramePos;

    /* NOTE: Seektables are not currently compatible with Ogg encapsulation (Ogg has its own accelerated seeking system). I may change this later, so I'm leaving this here for now. */
#ifndef DR_FLAC_NO_OGG
    if (init.container == drflac_container_ogg)
    {
        pFlac->pSeekpoints = NULL;
        pFlac->seekpointCount = 0;
    }
    else
#endif
    {
        /* If we have a seektable we need to load it now, making sure we move back to where we were previously. */
        if (seektablePos != 0) {
            pFlac->seekpointCount = seekpointCount;
            pFlac->pSeekpoints = (drflac_seekpoint*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize);

            DRFLAC_ASSERT(pFlac->bs.onSeek != NULL);
            DRFLAC_ASSERT(pFlac->bs.onRead != NULL);

            /* Seek to the seektable, then just read directly into our seektable buffer. */
            if (pFlac->bs.onSeek(pFlac->bs.pUserData, (int)seektablePos, drflac_seek_origin_start)) {
                drflac_uint32 iSeekpoint;

                for (iSeekpoint = 0; iSeekpoint < seekpointCount; iSeekpoint += 1) {
                    if (pFlac->bs.onRead(pFlac->bs.pUserData, pFlac->pSeekpoints + iSeekpoint, DRFLAC_SEEKPOINT_SIZE_IN_BYTES) == DRFLAC_SEEKPOINT_SIZE_IN_BYTES) {
                        /* Endian swap. */
                        pFlac->pSeekpoints[iSeekpoint].firstPCMFrame   = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].firstPCMFrame);
                        pFlac->pSeekpoints[iSeekpoint].flacFrameOffset = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].flacFrameOffset);
                        pFlac->pSeekpoints[iSeekpoint].pcmFrameCount   = drflac__be2host_16(pFlac->pSeekpoints[iSeekpoint].pcmFrameCount);
                    } else {
                        /* Failed to read the seektable. Pretend we don't have one. */
                        pFlac->pSeekpoints = NULL;
                        pFlac->seekpointCount = 0;
                        break;
                    }
                }

                /* We need to seek back to where we were. If this fails it's a critical error. */
                if (!pFlac->bs.onSeek(pFlac->bs.pUserData, (int)pFlac->firstFLACFramePosInBytes, drflac_seek_origin_start)) {
                    drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                    return NULL;
                }
            } else {
                /* Failed to seek to the seektable. Ominous sign, but for now we can just pretend we don't have one. */
                pFlac->pSeekpoints = NULL;
                pFlac->seekpointCount = 0;
            }
        }
    }


    /*
    If we get here, but don't have a STREAMINFO block, it means we've opened the stream in relaxed mode and need to decode
    the first frame.
    */
    if (!init.hasStreamInfoBlock) {
        pFlac->currentFLACFrame.header = init.firstFrameHeader;
        for (;;) {
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                break;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                        drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                        return NULL;
                    }
                    continue;
                } else {
                    drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                    return NULL;
                }
            }
        }
    }

    return pFlac;
}



#ifndef DR_FLAC_NO_STDIO
#include <stdio.h>
#ifndef DR_FLAC_NO_WCHAR
#include <wchar.h> /* For wcslen(), wcsrtombs() */
#endif

/* Errno */
/* drflac_result_from_errno() is only used for fopen() and _wfopen(), so it lives inside the DR_FLAC_NO_STDIO guard for now. If something else needs it later, it can be moved out. */
#include <errno.h>
static drflac_result drflac_result_from_errno(int e)
{
    switch (e)
    {
        case 0: return DRFLAC_SUCCESS;
    #ifdef EPERM
        case EPERM: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef ENOENT
        case ENOENT: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef ESRCH
        case ESRCH: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef EINTR
        case EINTR: return DRFLAC_INTERRUPT;
    #endif
    #ifdef EIO
        case EIO: return DRFLAC_IO_ERROR;
    #endif
    #ifdef ENXIO
        case ENXIO: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef E2BIG
        case E2BIG: return DRFLAC_INVALID_ARGS;
    #endif
    #ifdef ENOEXEC
        case ENOEXEC: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef EBADF
        case EBADF: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ECHILD
        case ECHILD: return DRFLAC_ERROR;
    #endif
    #ifdef EAGAIN
        case EAGAIN: return DRFLAC_UNAVAILABLE;
    #endif
    #ifdef ENOMEM
        case ENOMEM: return DRFLAC_OUT_OF_MEMORY;
    #endif
    #ifdef EACCES
        case EACCES: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef EFAULT
        case EFAULT: return DRFLAC_BAD_ADDRESS;
    #endif
    #ifdef ENOTBLK
        case ENOTBLK: return DRFLAC_ERROR;
    #endif
    #ifdef EBUSY
        case EBUSY: return DRFLAC_BUSY;
    #endif
    #ifdef EEXIST
        case EEXIST: return DRFLAC_ALREADY_EXISTS;
    #endif
    #ifdef EXDEV
        case EXDEV: return DRFLAC_ERROR;
    #endif
    #ifdef ENODEV
        case ENODEV: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef ENOTDIR
        case ENOTDIR: return DRFLAC_NOT_DIRECTORY;
    #endif
    #ifdef EISDIR
        case EISDIR: return DRFLAC_IS_DIRECTORY;
    #endif
    #ifdef EINVAL
        case EINVAL: return DRFLAC_INVALID_ARGS;
    #endif
    #ifdef ENFILE
        case ENFILE: return DRFLAC_TOO_MANY_OPEN_FILES;
    #endif
    #ifdef EMFILE
        case EMFILE: return DRFLAC_TOO_MANY_OPEN_FILES;
    #endif
    #ifdef ENOTTY
        case ENOTTY: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef ETXTBSY
        case ETXTBSY: return DRFLAC_BUSY;
    #endif
    #ifdef EFBIG
        case EFBIG: return DRFLAC_TOO_BIG;
    #endif
    #ifdef ENOSPC
        case ENOSPC: return DRFLAC_NO_SPACE;
    #endif
    #ifdef ESPIPE
        case ESPIPE: return DRFLAC_BAD_SEEK;
    #endif
    #ifdef EROFS
        case EROFS: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef EMLINK
        case EMLINK: return DRFLAC_TOO_MANY_LINKS;
    #endif
    #ifdef EPIPE
        case EPIPE: return DRFLAC_BAD_PIPE;
    #endif
    #ifdef EDOM
        case EDOM: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef ERANGE
        case ERANGE: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef EDEADLK
        case EDEADLK: return DRFLAC_DEADLOCK;
    #endif
    #ifdef ENAMETOOLONG
        case ENAMETOOLONG: return DRFLAC_PATH_TOO_LONG;
    #endif
    #ifdef ENOLCK
        case ENOLCK: return DRFLAC_ERROR;
    #endif
    #ifdef ENOSYS
        case ENOSYS: return DRFLAC_NOT_IMPLEMENTED;
    #endif
    #ifdef ENOTEMPTY
        case ENOTEMPTY: return DRFLAC_DIRECTORY_NOT_EMPTY;
    #endif
    #ifdef ELOOP
        case ELOOP: return DRFLAC_TOO_MANY_LINKS;
    #endif
    #ifdef ENOMSG
        case ENOMSG: return DRFLAC_NO_MESSAGE;
    #endif
    #ifdef EIDRM
        case EIDRM: return DRFLAC_ERROR;
    #endif
    #ifdef ECHRNG
        case ECHRNG: return DRFLAC_ERROR;
    #endif
    #ifdef EL2NSYNC
        case EL2NSYNC: return DRFLAC_ERROR;
    #endif
    #ifdef EL3HLT
        case EL3HLT: return DRFLAC_ERROR;
    #endif
    #ifdef EL3RST
        case EL3RST: return DRFLAC_ERROR;
    #endif
    #ifdef ELNRNG
        case ELNRNG: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef EUNATCH
        case EUNATCH: return DRFLAC_ERROR;
    #endif
    #ifdef ENOCSI
        case ENOCSI: return DRFLAC_ERROR;
    #endif
    #ifdef EL2HLT
        case EL2HLT: return DRFLAC_ERROR;
    #endif
    #ifdef EBADE
        case EBADE: return DRFLAC_ERROR;
    #endif
    #ifdef EBADR
        case EBADR: return DRFLAC_ERROR;
    #endif
    #ifdef EXFULL
        case EXFULL: return DRFLAC_ERROR;
    #endif
    #ifdef ENOANO
        case ENOANO: return DRFLAC_ERROR;
    #endif
    #ifdef EBADRQC
        case EBADRQC: return DRFLAC_ERROR;
    #endif
    #ifdef EBADSLT
        case EBADSLT: return DRFLAC_ERROR;
    #endif
    #ifdef EBFONT
        case EBFONT: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ENOSTR
        case ENOSTR: return DRFLAC_ERROR;
    #endif
    #ifdef ENODATA
        case ENODATA: return DRFLAC_NO_DATA_AVAILABLE;
    #endif
    #ifdef ETIME
        case ETIME: return DRFLAC_TIMEOUT;
    #endif
    #ifdef ENOSR
        case ENOSR: return DRFLAC_NO_DATA_AVAILABLE;
    #endif
    #ifdef ENONET
        case ENONET: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENOPKG
        case ENOPKG: return DRFLAC_ERROR;
    #endif
    #ifdef EREMOTE
        case EREMOTE: return DRFLAC_ERROR;
    #endif
    #ifdef ENOLINK
        case ENOLINK: return DRFLAC_ERROR;
    #endif
    #ifdef EADV
        case EADV: return DRFLAC_ERROR;
    #endif
    #ifdef ESRMNT
        case ESRMNT: return DRFLAC_ERROR;
    #endif
    #ifdef ECOMM
        case ECOMM: return DRFLAC_ERROR;
    #endif
    #ifdef EPROTO
        case EPROTO: return DRFLAC_ERROR;
    #endif
    #ifdef EMULTIHOP
        case EMULTIHOP: return DRFLAC_ERROR;
    #endif
    #ifdef EDOTDOT
        case EDOTDOT: return DRFLAC_ERROR;
    #endif
    #ifdef EBADMSG
        case EBADMSG: return DRFLAC_BAD_MESSAGE;
    #endif
    #ifdef EOVERFLOW
        case EOVERFLOW: return DRFLAC_TOO_BIG;
    #endif
    #ifdef ENOTUNIQ
        case ENOTUNIQ: return DRFLAC_NOT_UNIQUE;
    #endif
    #ifdef EBADFD
        case EBADFD: return DRFLAC_ERROR;
    #endif
    #ifdef EREMCHG
        case EREMCHG: return DRFLAC_ERROR;
    #endif
    #ifdef ELIBACC
        case ELIBACC: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef ELIBBAD
        case ELIBBAD: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ELIBSCN
        case ELIBSCN: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ELIBMAX
        case ELIBMAX: return DRFLAC_ERROR;
    #endif
    #ifdef ELIBEXEC
        case ELIBEXEC: return DRFLAC_ERROR;
    #endif
    #ifdef EILSEQ
        case EILSEQ: return DRFLAC_INVALID_DATA;
    #endif
    #ifdef ERESTART
        case ERESTART: return DRFLAC_ERROR;
    #endif
    #ifdef ESTRPIPE
        case ESTRPIPE: return DRFLAC_ERROR;
    #endif
    #ifdef EUSERS
        case EUSERS: return DRFLAC_ERROR;
    #endif
    #ifdef ENOTSOCK
        case ENOTSOCK: return DRFLAC_NOT_SOCKET;
    #endif
    #ifdef EDESTADDRREQ
        case EDESTADDRREQ: return DRFLAC_NO_ADDRESS;
    #endif
    #ifdef EMSGSIZE
        case EMSGSIZE: return DRFLAC_TOO_BIG;
    #endif
    #ifdef EPROTOTYPE
        case EPROTOTYPE: return DRFLAC_BAD_PROTOCOL;
    #endif
    #ifdef ENOPROTOOPT
        case ENOPROTOOPT: return DRFLAC_PROTOCOL_UNAVAILABLE;
    #endif
    #ifdef EPROTONOSUPPORT
        case EPROTONOSUPPORT: return DRFLAC_PROTOCOL_NOT_SUPPORTED;
    #endif
    #ifdef ESOCKTNOSUPPORT
        case ESOCKTNOSUPPORT: return DRFLAC_SOCKET_NOT_SUPPORTED;
    #endif
    #ifdef EOPNOTSUPP
        case EOPNOTSUPP: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef EPFNOSUPPORT
        case EPFNOSUPPORT: return DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED;
    #endif
    #ifdef EAFNOSUPPORT
        case EAFNOSUPPORT: return DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED;
    #endif
    #ifdef EADDRINUSE
        case EADDRINUSE: return DRFLAC_ALREADY_IN_USE;
    #endif
    #ifdef EADDRNOTAVAIL
        case EADDRNOTAVAIL: return DRFLAC_ERROR;
    #endif
    #ifdef ENETDOWN
        case ENETDOWN: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENETUNREACH
        case ENETUNREACH: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENETRESET
        case ENETRESET: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ECONNABORTED
        case ECONNABORTED: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ECONNRESET
        case ECONNRESET: return DRFLAC_CONNECTION_RESET;
    #endif
    #ifdef ENOBUFS
        case ENOBUFS: return DRFLAC_NO_SPACE;
    #endif
    #ifdef EISCONN
        case EISCONN: return DRFLAC_ALREADY_CONNECTED;
    #endif
    #ifdef ENOTCONN
        case ENOTCONN: return DRFLAC_NOT_CONNECTED;
    #endif
    #ifdef ESHUTDOWN
        case ESHUTDOWN: return DRFLAC_ERROR;
    #endif
    #ifdef ETOOMANYREFS
        case ETOOMANYREFS: return DRFLAC_ERROR;
    #endif
    #ifdef ETIMEDOUT
        case ETIMEDOUT: return DRFLAC_TIMEOUT;
    #endif
    #ifdef ECONNREFUSED
        case ECONNREFUSED: return DRFLAC_CONNECTION_REFUSED;
    #endif
    #ifdef EHOSTDOWN
        case EHOSTDOWN: return DRFLAC_NO_HOST;
    #endif
    #ifdef EHOSTUNREACH
        case EHOSTUNREACH: return DRFLAC_NO_HOST;
    #endif
    #ifdef EALREADY
        case EALREADY: return DRFLAC_IN_PROGRESS;
    #endif
    #ifdef EINPROGRESS
        case EINPROGRESS: return DRFLAC_IN_PROGRESS;
    #endif
    #ifdef ESTALE
        case ESTALE: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef EUCLEAN
        case EUCLEAN: return DRFLAC_ERROR;
    #endif
    #ifdef ENOTNAM
        case ENOTNAM: return DRFLAC_ERROR;
    #endif
    #ifdef ENAVAIL
        case ENAVAIL: return DRFLAC_ERROR;
    #endif
    #ifdef EISNAM
        case EISNAM: return DRFLAC_ERROR;
    #endif
    #ifdef EREMOTEIO
        case EREMOTEIO: return DRFLAC_IO_ERROR;
    #endif
    #ifdef EDQUOT
        case EDQUOT: return DRFLAC_NO_SPACE;
    #endif
    #ifdef ENOMEDIUM
        case ENOMEDIUM: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef EMEDIUMTYPE
        case EMEDIUMTYPE: return DRFLAC_ERROR;
    #endif
    #ifdef ECANCELED
        case ECANCELED: return DRFLAC_CANCELLED;
    #endif
    #ifdef ENOKEY
        case ENOKEY: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYEXPIRED
        case EKEYEXPIRED: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYREVOKED
        case EKEYREVOKED: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYREJECTED
        case EKEYREJECTED: return DRFLAC_ERROR;
    #endif
    #ifdef EOWNERDEAD
        case EOWNERDEAD: return DRFLAC_ERROR;
    #endif
    #ifdef ENOTRECOVERABLE
        case ENOTRECOVERABLE: return DRFLAC_ERROR;
    #endif
    #ifdef ERFKILL
        case ERFKILL: return DRFLAC_ERROR;
    #endif
    #ifdef EHWPOISON
        case EHWPOISON: return DRFLAC_ERROR;
    #endif
        default: return DRFLAC_ERROR;
    }
}
/* End Errno */

/* fopen */
static drflac_result drflac_fopen(FILE** ppFile, const char* pFilePath, const char* pOpenMode)
{
#if defined(_MSC_VER) && _MSC_VER >= 1400
    errno_t err;
#endif

    if (ppFile != NULL) {
        *ppFile = NULL; /* Safety. */
    }

    if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) {
        return DRFLAC_INVALID_ARGS;
    }

#if defined(_MSC_VER) && _MSC_VER >= 1400
    err = fopen_s(ppFile, pFilePath, pOpenMode);
    if (err != 0) {
        return drflac_result_from_errno(err);
    }
#else
#if defined(_WIN32) || defined(__APPLE__)
    *ppFile = fopen(pFilePath, pOpenMode);
#else
    #if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64 && defined(_LARGEFILE64_SOURCE)
        *ppFile = fopen64(pFilePath, pOpenMode);
    #else
        *ppFile = fopen(pFilePath, pOpenMode);
    #endif
#endif
    if (*ppFile == NULL) {
        drflac_result result = drflac_result_from_errno(errno);
        if (result == DRFLAC_SUCCESS) {
            result = DRFLAC_ERROR; /* Just a safety check to make sure we never ever return success when pFile == NULL. */
        }

        return result;
    }
#endif

    return DRFLAC_SUCCESS;
}

/*
_wfopen() isn't always available in all compilation environments.

    * Windows only.
    * MSVC seems to support it universally as far back as VC6 from what I can tell (haven't checked further back).
    * MinGW-64 (both 32- and 64-bit) seems to support it.
    * MinGW wraps it in !defined(__STRICT_ANSI__).
    * OpenWatcom wraps it in !defined(_NO_EXT_KEYS).

This can be reviewed as compatibility issues arise. The preference is to use _wfopen_s() and _wfopen() as opposed to the wcsrtombs()
fallback, so if you notice your compiler not detecting this properly I'm happy to look at adding support.
*/
#if defined(_WIN32)
    #if defined(_MSC_VER) || defined(__MINGW64__) || (!defined(__STRICT_ANSI__) && !defined(_NO_EXT_KEYS))
        #define DRFLAC_HAS_WFOPEN
    #endif
#endif

#ifndef DR_FLAC_NO_WCHAR
static drflac_result drflac_wfopen(FILE** ppFile, const wchar_t* pFilePath, const wchar_t* pOpenMode, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (ppFile != NULL) {
        *ppFile = NULL; /* Safety. */
    }

    if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) {
        return DRFLAC_INVALID_ARGS;
    }

#if defined(DRFLAC_HAS_WFOPEN)
    {
        /* Use _wfopen() on Windows. */
    #if defined(_MSC_VER) && _MSC_VER >= 1400
        errno_t err = _wfopen_s(ppFile, pFilePath, pOpenMode);
        if (err != 0) {
            return drflac_result_from_errno(err);
        }
    #else
        *ppFile = _wfopen(pFilePath, pOpenMode);
        if (*ppFile == NULL) {
            return drflac_result_from_errno(errno);
        }
    #endif
        (void)pAllocationCallbacks;
    }
#else
    /*
    Use fopen() on anything other than Windows. Requires a conversion. This is annoying because
    fopen() is locale specific. The only real way I can think of to do this is with wcsrtombs(). Note
    that wcstombs() is apparently not thread-safe because it uses a static global mbstate_t object for
    maintaining state. I've checked this with -std=c89 and it works, but if somebody gets a compiler
    error I'll look into improving compatibility.
    */

    /*
    Some compilers don't support wchar_t or wcsrtombs() which we're using below. In this case we just
    need to abort with an error. If you encounter a compiler lacking such support, add it to this list
    and submit a bug report and it'll be added to the library upstream.
    */
    #if defined(__DJGPP__)
    {
        /* Nothing to do here. This will fall through to the error check below. */
    }
    #else
    {
        mbstate_t mbs;
        size_t lenMB;
        const wchar_t* pFilePathTemp = pFilePath;
        char* pFilePathMB = NULL;
        char pOpenModeMB[32] = {0};

        /* Get the length first. */
        DRFLAC_ZERO_OBJECT(&mbs);
        lenMB = wcsrtombs(NULL, &pFilePathTemp, 0, &mbs);
        if (lenMB == (size_t)-1) {
            return drflac_result_from_errno(errno);
        }

        pFilePathMB = (char*)drflac__malloc_from_callbacks(lenMB + 1, pAllocationCallbacks);
        if (pFilePathMB == NULL) {
            return DRFLAC_OUT_OF_MEMORY;
        }

        pFilePathTemp = pFilePath;
        DRFLAC_ZERO_OBJECT(&mbs);
        wcsrtombs(pFilePathMB, &pFilePathTemp, lenMB + 1, &mbs);

        /* The open mode should always consist of ASCII characters so we should be able to do a trivial conversion. */
        {
            size_t i = 0;
            for (;;) {
                if (pOpenMode[i] == 0) {
                    pOpenModeMB[i] = '\0';
                    break;
                }

                pOpenModeMB[i] = (char)pOpenMode[i];
                i += 1;
            }
        }

        *ppFile = fopen(pFilePathMB, pOpenModeMB);

        drflac__free_from_callbacks(pFilePathMB, pAllocationCallbacks);
    }
    #endif

    if (*ppFile == NULL) {
        return DRFLAC_ERROR;
    }
#endif

    return DRFLAC_SUCCESS;
}
#endif
/* End fopen */
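
/*
Illustrative sketch only, not part of the library: the non-Windows fallback in drflac_wfopen() relies on a two-pass
wcsrtombs() conversion (measure first, then convert). The standalone helper below shows that pattern in isolation. The name
example_wide_to_multibyte() is hypothetical, and plain malloc()/free() stand in for the allocation callbacks. Unlike the
fallback above, this sketch also checks the result of the second pass.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

// Hypothetical helper (not part of dr_flac): converts a wide string to a
// newly malloc()'d multibyte string using the current C locale.
// Returns NULL on conversion or allocation failure; the caller frees it.
char* example_wide_to_multibyte(const wchar_t* pWide)
{
    mbstate_t state;
    const wchar_t* pSrc = pWide;
    size_t lenMB;
    char* pMB;

    // Pass 1: a NULL destination makes wcsrtombs() return the required
    // byte count, excluding the null terminator.
    memset(&state, 0, sizeof(state));
    lenMB = wcsrtombs(NULL, &pSrc, 0, &state);
    if (lenMB == (size_t)-1) {
        return NULL;    // A character has no representation in this locale.
    }

    pMB = (char*)malloc(lenMB + 1);
    if (pMB == NULL) {
        return NULL;
    }

    // Pass 2: reset the source pointer and state, then convert for real.
    pSrc = pWide;
    memset(&state, 0, sizeof(state));
    if (wcsrtombs(pMB, &pSrc, lenMB + 1, &state) == (size_t)-1) {
        free(pMB);
        return NULL;
    }

    return pMB;
}
```

Because the conversion depends on the active locale, a path containing characters outside the locale's encoding makes the
first pass fail, which is why the library prefers _wfopen() where it is available.
*/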

static size_t drflac__on_read_stdio(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    return fread(bufferOut, 1, bytesToRead, (FILE*)pUserData);
}

static drflac_bool32 drflac__on_seek_stdio(void* pUserData, int offset, drflac_seek_origin origin)
{
    DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */

    return fseek((FILE*)pUserData, offset, (origin == drflac_seek_origin_current) ? SEEK_CUR : SEEK_SET) == 0;
}


DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return NULL;
    }

    return pFlac;
}

#ifndef DR_FLAC_NO_WCHAR
DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return NULL;
    }

    return pFlac;
}
#endif

DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return pFlac;
    }

    return pFlac;
}

#ifndef DR_FLAC_NO_WCHAR
DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return pFlac;
    }

    return pFlac;
}
#endif
#endif /* DR_FLAC_NO_STDIO */

static size_t drflac__on_read_memory(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData;
    size_t bytesRemaining;

    DRFLAC_ASSERT(memoryStream != NULL);
    DRFLAC_ASSERT(memoryStream->dataSize >= memoryStream->currentReadPos);

    bytesRemaining = memoryStream->dataSize - memoryStream->currentReadPos;
    if (bytesToRead > bytesRemaining) {
        bytesToRead = bytesRemaining;
    }

    if (bytesToRead > 0) {
        DRFLAC_COPY_MEMORY(bufferOut, memoryStream->data + memoryStream->currentReadPos, bytesToRead);
        memoryStream->currentReadPos += bytesToRead;
    }

    return bytesToRead;
}

static drflac_bool32 drflac__on_seek_memory(void* pUserData, int offset, drflac_seek_origin origin)
{
    drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData;

    DRFLAC_ASSERT(memoryStream != NULL);
    DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */

    if (offset > (drflac_int64)memoryStream->dataSize) {
        return DRFLAC_FALSE;
    }

    if (origin == drflac_seek_origin_current) {
        if (memoryStream->currentReadPos + offset <= memoryStream->dataSize) {
            memoryStream->currentReadPos += offset;
        } else {
            return DRFLAC_FALSE; /* Trying to seek too far forward. */
        }
    } else {
        if ((drflac_uint32)offset <= memoryStream->dataSize) {
            memoryStream->currentReadPos = offset;
        } else {
            return DRFLAC_FALSE; /* Trying to seek too far forward. */
        }
    }

    return DRFLAC_TRUE;
}
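
/*
Illustrative sketch only, not part of the library: the two memory callbacks above amount to a cursor over a fixed buffer,
with reads clamped to the bytes remaining and forward seeks rejected when they would pass the end. The standalone version
below shows those semantics without any dr_flac types. Every name in it (example_memory_stream, example_read,
example_seek_forward) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for drflac__memory_stream, for illustration only.
typedef struct {
    const unsigned char* data;
    size_t dataSize;
    size_t currentReadPos;
} example_memory_stream;

// Copies up to bytesToRead bytes and returns how many were actually copied.
size_t example_read(example_memory_stream* pStream, void* pOut, size_t bytesToRead)
{
    size_t bytesRemaining = pStream->dataSize - pStream->currentReadPos;
    if (bytesToRead > bytesRemaining) {
        bytesToRead = bytesRemaining;   // Clamp: a short read signals end of data.
    }
    if (bytesToRead > 0) {
        memcpy(pOut, pStream->data + pStream->currentReadPos, bytesToRead);
        pStream->currentReadPos += bytesToRead;
    }
    return bytesToRead;
}

// Moves the cursor forward by offset bytes; returns 1 on success, 0 if the
// target would land past the end of the buffer (the cursor is left untouched).
int example_seek_forward(example_memory_stream* pStream, size_t offset)
{
    if (offset > pStream->dataSize - pStream->currentReadPos) {
        return 0;
    }
    pStream->currentReadPos += offset;
    return 1;
}
```

Comparing the offset against the bytes remaining, rather than adding it to the cursor first, keeps the bounds check free
of overflow in this sketch.
*/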

DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac__memory_stream memoryStream;
    drflac* pFlac;

    memoryStream.data = (const drflac_uint8*)pData;
    memoryStream.dataSize = dataSize;
    memoryStream.currentReadPos = 0;
    pFlac = drflac_open(drflac__on_read_memory, drflac__on_seek_memory, &memoryStream, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    pFlac->memoryStream = memoryStream;

    /* This is an awful hack... */
#ifndef DR_FLAC_NO_OGG
    if (pFlac->container == drflac_container_ogg)
    {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        oggbs->pUserData = &pFlac->memoryStream;
    }
    else
#endif
    {
        pFlac->bs.pUserData = &pFlac->memoryStream;
    }

    return pFlac;
}

DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac__memory_stream memoryStream;
    drflac* pFlac;

    memoryStream.data = (const drflac_uint8*)pData;
    memoryStream.dataSize = dataSize;
    memoryStream.currentReadPos = 0;
    pFlac = drflac_open_with_metadata_private(drflac__on_read_memory, drflac__on_seek_memory, onMeta, drflac_container_unknown, &memoryStream, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    pFlac->memoryStream = memoryStream;

    /* This is an awful hack... */
#ifndef DR_FLAC_NO_OGG
    if (pFlac->container == drflac_container_ogg)
    {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        oggbs->pUserData = &pFlac->memoryStream;
    }
    else
#endif
    {
        pFlac->bs.pUserData = &pFlac->memoryStream;
    }

    return pFlac;
}



DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, NULL, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks);
}
DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, NULL, container, pUserData, pUserData, pAllocationCallbacks);
}

DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, onMeta, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks);
}
DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, onMeta, container, pUserData, pUserData, pAllocationCallbacks);
}

DRFLAC_API void drflac_close(drflac* pFlac)
{
    if (pFlac == NULL) {
        return;
    }

#ifndef DR_FLAC_NO_STDIO
    /*
    If we opened the file with drflac_open_file() we will want to close the file handle. We can know whether or not drflac_open_file()
    was used by looking at the callbacks.
    */
    if (pFlac->bs.onRead == drflac__on_read_stdio) {
        fclose((FILE*)pFlac->bs.pUserData);
    }

#ifndef DR_FLAC_NO_OGG
    /* Need to clean up Ogg streams a bit differently due to the way the bit streaming is chained. */
    if (pFlac->container == drflac_container_ogg) {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        DRFLAC_ASSERT(pFlac->bs.onRead == drflac__on_read_ogg);

        if (oggbs->onRead == drflac__on_read_stdio) {
            fclose((FILE*)oggbs->pUserData);
        }
    }
#endif
#endif

    drflac__free_from_callbacks(pFlac, &pFlac->allocationCallbacks);
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0;
        pOutputSamples[i*8+1] = (drflac_int32)right0;
        pOutputSamples[i*8+2] = (drflac_int32)left1;
        pOutputSamples[i*8+3] = (drflac_int32)right1;
        pOutputSamples[i*8+4] = (drflac_int32)left2;
        pOutputSamples[i*8+5] = (drflac_int32)right2;
        pOutputSamples[i*8+6] = (drflac_int32)left3;
        pOutputSamples[i*8+7] = (drflac_int32)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);

        drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}
9128
9129
9130
#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0;
        pOutputSamples[i*8+1] = (drflac_int32)right0;
        pOutputSamples[i*8+2] = (drflac_int32)left1;
        pOutputSamples[i*8+3] = (drflac_int32)right1;
        pOutputSamples[i*8+4] = (drflac_int32)left2;
        pOutputSamples[i*8+5] = (drflac_int32)right2;
        pOutputSamples[i*8+6] = (drflac_int32)left3;
        pOutputSamples[i*8+7] = (drflac_int32)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);

        drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        /* Cast the signed inputs directly; this function declares no pInputSamples0U32/pInputSamples1U32 aliases. */
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample);
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;

    if (shift > 0) {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (mid0 + side0) << shift;
            temp1L = (mid1 + side1) << shift;
            temp2L = (mid2 + side2) << shift;
            temp3L = (mid3 + side3) << shift;

            temp0R = (mid0 - side0) << shift;
            temp1R = (mid1 - side1) << shift;
            temp2R = (mid2 - side2) << shift;
            temp3R = (mid3 - side3) << shift;

            pOutputSamples[i*8+0] = (drflac_int32)temp0L;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R;
        }
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1);
            temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1);
            temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1);
            temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1);

            temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1);
            temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1);
            temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1);
            temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1);

            pOutputSamples[i*8+0] = (drflac_int32)temp0L;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample);
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1;
            pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1;
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift);
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift);
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;
    int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */
    int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */
    uint32x4_t one4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
    one4 = vdupq_n_u32(1);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4));

            left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1;
            pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1;
        }
    } else {
        int32x4_t shift4;

        shift -= 1;
        shift4 = vdupq_n_s32(shift);

        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4));

            left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift);
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift);
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample));
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample));
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        pOutputSamples[i*8+0] = (drflac_int32)tempL0;
        pOutputSamples[i*8+1] = (drflac_int32)tempR0;
        pOutputSamples[i*8+2] = (drflac_int32)tempL1;
        pOutputSamples[i*8+3] = (drflac_int32)tempR1;
        pOutputSamples[i*8+4] = (drflac_int32)tempL2;
        pOutputSamples[i*8+5] = (drflac_int32)tempR2;
        pOutputSamples[i*8+6] = (drflac_int32)tempL3;
        pOutputSamples[i*8+7] = (drflac_int32)tempR3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    int32x4_t shift4_0 = vdupq_n_s32(shift0);
    int32x4_t shift4_1 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t left;
        int32x4_t right;

        left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift4_0));
        right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift4_1));

        drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return. */
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_s32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving. */
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        pBufferOut[(i*channelCount)+j] = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                    }
                }
            }

            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration;
        }
    }

    return framesRead;
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

/*
Left/side stereo: channel 0 holds left and channel 1 holds side, so right = left - side. Each sample is shifted
left by its shift amount and the upper 16 bits are kept for the s16 output. Four frames are processed per
iteration, followed by a tail loop for the remainder.
*/
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        left0 >>= 16;
        left1 >>= 16;
        left2 >>= 16;
        left3 >>= 16;

        right0 >>= 16;
        right1 >>= 16;
        right2 >>= 16;
        right3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)left0;
        pOutputSamples[i*8+1] = (drflac_int16)right0;
        pOutputSamples[i*8+2] = (drflac_int16)left1;
        pOutputSamples[i*8+3] = (drflac_int16)right1;
        pOutputSamples[i*8+4] = (drflac_int16)left2;
        pOutputSamples[i*8+5] = (drflac_int16)right2;
        pOutputSamples[i*8+6] = (drflac_int16)left3;
        pOutputSamples[i*8+7] = (drflac_int16)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);

        left = vshrq_n_u32(left, 16);
        right = vshrq_n_u32(right, 16);

        drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

/* Right/side stereo: channel 0 holds side and channel 1 holds right, so left = right + side. */
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        left0 >>= 16;
        left1 >>= 16;
        left2 >>= 16;
        left3 >>= 16;

        right0 >>= 16;
        right1 >>= 16;
        right2 >>= 16;
        right3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)left0;
        pOutputSamples[i*8+1] = (drflac_int16)right0;
        pOutputSamples[i*8+2] = (drflac_int16)left1;
        pOutputSamples[i*8+3] = (drflac_int16)right1;
        pOutputSamples[i*8+4] = (drflac_int16)left2;
        pOutputSamples[i*8+5] = (drflac_int16)right2;
        pOutputSamples[i*8+6] = (drflac_int16)left3;
        pOutputSamples[i*8+7] = (drflac_int16)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);

        left = vshrq_n_u32(left, 16);
        right = vshrq_n_u32(right, 16);

        drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16);
    }
}
#endif

/*
Mid/side stereo: channel 0 holds mid and channel 1 holds side. The low bit of mid is restored from the low bit
of side, then left = (mid + side) >> 1 and right = (mid - side) >> 1.
*/
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;

    if (shift > 0) {
        shift -= 1; /* Folds the ">> 1" of the mid/side reconstruction into the left shift. */
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (mid0 + side0) << shift;
            temp1L = (mid1 + side1) << shift;
            temp2L = (mid2 + side2) << shift;
            temp3L = (mid3 + side3) << shift;

            temp0R = (mid0 - side0) << shift;
            temp1R = (mid1 - side1) << shift;
            temp2R = (mid2 - side2) << shift;
            temp3R = (mid3 - side3) << shift;

            temp0L >>= 16;
            temp1L >>= 16;
            temp2L >>= 16;
            temp3L >>= 16;

            temp0R >>= 16;
            temp1R >>= 16;
            temp2R >>= 16;
            temp3R >>= 16;

            pOutputSamples[i*8+0] = (drflac_int16)temp0L;
            pOutputSamples[i*8+1] = (drflac_int16)temp0R;
            pOutputSamples[i*8+2] = (drflac_int16)temp1L;
            pOutputSamples[i*8+3] = (drflac_int16)temp1R;
            pOutputSamples[i*8+4] = (drflac_int16)temp2L;
            pOutputSamples[i*8+5] = (drflac_int16)temp2R;
            pOutputSamples[i*8+6] = (drflac_int16)temp3L;
            pOutputSamples[i*8+7] = (drflac_int16)temp3R;
        }
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = ((drflac_int32)(mid0 + side0) >> 1);
            temp1L = ((drflac_int32)(mid1 + side1) >> 1);
            temp2L = ((drflac_int32)(mid2 + side2) >> 1);
            temp3L = ((drflac_int32)(mid3 + side3) >> 1);

            temp0R = ((drflac_int32)(mid0 - side0) >> 1);
            temp1R = ((drflac_int32)(mid1 - side1) >> 1);
            temp2R = ((drflac_int32)(mid2 - side2) >> 1);
            temp3R = ((drflac_int32)(mid3 - side3) >> 1);

            temp0L >>= 16;
            temp1L >>= 16;
            temp2L >>= 16;
            temp3L >>= 16;

            temp0R >>= 16;
            temp1R >>= 16;
            temp2R >>= 16;
            temp3R >>= 16;

            pOutputSamples[i*8+0] = (drflac_int16)temp0L;
            pOutputSamples[i*8+1] = (drflac_int16)temp0R;
            pOutputSamples[i*8+2] = (drflac_int16)temp1L;
            pOutputSamples[i*8+3] = (drflac_int16)temp1R;
            pOutputSamples[i*8+4] = (drflac_int16)temp2L;
            pOutputSamples[i*8+5] = (drflac_int16)temp2R;
            pOutputSamples[i*8+6] = (drflac_int16)temp3L;
            pOutputSamples[i*8+7] = (drflac_int16)temp3R;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            left = _mm_srai_epi32(left, 16);
            right = _mm_srai_epi32(right, 16);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16);
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            left = _mm_srai_epi32(left, 16);
            right = _mm_srai_epi32(right, 16);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16);
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;
    int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */
    int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            left = vshrq_n_s32(left, 16);
            right = vshrq_n_s32(right, 16);

            drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16);
        }
    } else {
        int32x4_t shift4;

        shift -= 1;
        shift4 = vdupq_n_s32(shift);

        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            left = vshrq_n_s32(left, 16);
            right = vshrq_n_s32(right, 16);

            drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16);
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
10484
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
10485
{
10486
for (drflac_uint64 i = 0; i < frameCount; ++i) {
10487
pOutputSamples[i*2+0] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) >> 16);
10488
pOutputSamples[i*2+1] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) >> 16);
10489
}
10490
}
10491
#endif

/* Scalar path: left-justifies each sample to 32 bits, keeps the top 16 bits and interleaves left/right, four frames per iteration. */
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        tempL0 >>= 16;
        tempL1 >>= 16;
        tempL2 >>= 16;
        tempL3 >>= 16;

        tempR0 >>= 16;
        tempR1 >>= 16;
        tempR2 >>= 16;
        tempR3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)tempL0;
        pOutputSamples[i*8+1] = (drflac_int16)tempR0;
        pOutputSamples[i*8+2] = (drflac_int16)tempL1;
        pOutputSamples[i*8+3] = (drflac_int16)tempR1;
        pOutputSamples[i*8+4] = (drflac_int16)tempL2;
        pOutputSamples[i*8+5] = (drflac_int16)tempR2;
        pOutputSamples[i*8+6] = (drflac_int16)tempL3;
        pOutputSamples[i*8+7] = (drflac_int16)tempR3;
    }

    /* Remaining frames (fewer than four). */
    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}
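
/*
Illustrative sketch, not part of dr_flac: the scalar kernel above comes down to one shift-and-truncate step per sample. The helper name sketch_sample_to_s16 and its parameters are hypothetical. It assumes the decoded sample is stored sign-extended in 32 bits, and that right-shifting a negative signed value is an arithmetic shift (implementation-defined in C, but the usual behaviour on mainstream compilers).
*/

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: left-justify a sample to 32 bits, then keep the top 16 bits. */
static int16_t sketch_sample_to_s16(int32_t sample, uint32_t bitsPerSample, uint32_t wastedBits)
{
    /* Same shift amount as shift0/shift1 above: unused high bits plus wasted low bits. */
    uint32_t shift = (32 - bitsPerSample) + wastedBits;
    int32_t leftJustified;

    assert(bitsPerSample >= 1 && bitsPerSample <= 32);
    assert(shift < 32);

    /* Shift as unsigned so bits leaving the top of the word are simply discarded. */
    leftJustified = (int32_t)((uint32_t)sample << shift);

    return (int16_t)(leftJustified >> 16);
}
```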

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        /* At this point we have results. We can now pack and interleave these into a single __m128i object and then store them in the output buffer. */
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    int32x4_t shift0_4 = vdupq_n_s32(shift0);
    int32x4_t shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t left;
        int32x4_t right;

        left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4));
        right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4));

        left = vshrq_n_s32(left, 16);
        right = vshrq_n_s32(right, 16);

        drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

/* Reads up to framesToRead PCM frames as interleaved signed 16-bit samples and returns the number of frames actually read. A NULL pBufferOut seeks forward by framesToRead frames instead of decoding into a buffer. */
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return. */
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            /* Never read past the end of the current FLAC frame in a single iteration. */
            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_s16__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving. */
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                        pBufferOut[(i*channelCount)+j] = (drflac_int16)(sampleS32 >> 16);
                    }
                }
            }

            /* Advance the output cursor and the decoder's bookkeeping by what was just written. */
            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration;
        }
    }

    return framesRead;
}
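
/*
Illustrative sketch, not part of dr_flac: the read loop above clamps each step to what is left in the current FLAC frame and returns a short count once no further frame can be decoded. The stand-in below models only that bookkeeping. The sketch_decoder type, its fields and sketch_read_frames are hypothetical names, and "decoding" a block is reduced to loading its size.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decoder: a list of block sizes plus a cursor. */
typedef struct
{
    const uint32_t* pBlockSizes;  /* PCM frames in each block. */
    size_t blockCount;
    size_t nextBlock;             /* Index of the next block to load. */
    uint32_t remainingInBlock;    /* Plays the role of pcmFramesRemaining. */
} sketch_decoder;

/* Returns how many PCM frames were consumed, which is at most framesToRead. */
static uint64_t sketch_read_frames(sketch_decoder* pDecoder, uint64_t framesToRead)
{
    uint64_t framesRead = 0;

    assert(pDecoder != NULL);

    while (framesToRead > 0) {
        if (pDecoder->remainingInBlock == 0) {
            if (pDecoder->nextBlock == pDecoder->blockCount) {
                break;  /* Nothing left to load: return a short count. */
            }
            pDecoder->remainingInBlock = pDecoder->pBlockSizes[pDecoder->nextBlock++];
        } else {
            uint64_t n = framesToRead;
            if (n > pDecoder->remainingInBlock) {
                n = pDecoder->remainingInBlock;  /* Never read past the current block. */
            }

            framesRead += n;
            framesToRead -= n;
            pDecoder->remainingInBlock -= (uint32_t)n;
        }
    }

    return framesRead;
}
```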


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0);
    }
}
#endif

/* Scalar path: rebuilds right as left - side on left-justified 32-bit values, then scales both channels by 1/2^31 to floating point. */
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)right0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)left1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)right1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)left2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)right2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)left3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)right3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left * factor;
        pOutputSamples[i*2+1] = (drflac_int32)right * factor;
    }
}
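
/*
Illustrative sketch, not part of dr_flac: one left/side frame converted the way the scalar kernel above does it. The helper name sketch_left_side_to_f32 and its parameters are hypothetical. The subtraction is done on unsigned values, so a side value that needs one more bit than the stream's bit depth wraps modulo 2^32 and the reconstructed right channel still comes out correct. Converting an out-of-range unsigned value back to int32_t is implementation-defined in C; the sketch assumes the usual two's complement behaviour.
*/

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuilds one stereo frame from left/side and scales it to [-1, 1). */
static void sketch_left_side_to_f32(int32_t leftIn, int32_t sideIn, uint32_t shift0, uint32_t shift1, float* pLeftOut, float* pRightOut)
{
    const float factor = 1.0f / 2147483648.0f;  /* 2^-31 */
    uint32_t left;
    uint32_t side;
    uint32_t right;

    assert(shift0 < 32 && shift1 < 32);

    left = (uint32_t)leftIn << shift0;
    side = (uint32_t)sideIn << shift1;
    right = left - side;  /* Wraps modulo 2^32 rather than overflowing. */

    *pLeftOut = (float)(int32_t)left * factor;
    *pRightOut = (float)(int32_t)right * factor;
}
```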

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    __m128 factor;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = _mm_set1_ps(1.0f / 8388608.0f);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);
        __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor);
        __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif
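
/*
Illustrative sketch, not part of dr_flac: the SSE2 kernel above subtracts 8 from each shift and scales by 1/8388608 (2^-23), where the scalar kernel shifts fully and scales by 1/2147483648 (2^-31). The two hypothetical helpers below compare the scalings on a single sample. They ignore wasted bits for brevity, and they assume at most 24 bits per sample, so that unusedBits is at least 8 and the subtraction cannot underflow.
*/

```c
#include <assert.h>
#include <stdint.h>

/* Scalar-style scaling: left-justify to 32 bits, then multiply by 2^-31. */
static float sketch_scale_via_32bit(int32_t sample, uint32_t unusedBits)
{
    assert(unusedBits < 32);
    return (float)(int32_t)((uint32_t)sample << unusedBits) * (1.0f / 2147483648.0f);
}

/* SIMD-style scaling: left-justify to 24 bits, then multiply by 2^-23. */
static float sketch_scale_via_24bit(int32_t sample, uint32_t unusedBits)
{
    assert(unusedBits >= 8 && unusedBits < 32);
    return (float)(int32_t)((uint32_t)sample << (unusedBits - 8)) * (1.0f / 8388608.0f);
}
```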

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    float32x4_t factor4;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor4 = vdupq_n_f32(1.0f / 8388608.0f);
    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;
        float32x4_t leftf;
        float32x4_t rightf;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);
        leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0);
    }
}
#endif

/* Scalar path: subframe 0 carries side and subframe 1 carries right, so left is rebuilt as right + side before scaling by 1/2^31. */
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)right0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)left1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)right1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)left2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)right2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)left3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)right3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left * factor;
        pOutputSamples[i*2+1] = (drflac_int32)right * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    __m128 factor;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = _mm_set1_ps(1.0f / 8388608.0f);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);
        __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor);
        __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    float32x4_t factor4;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor4 = vdupq_n_f32(1.0f / 8388608.0f);
    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;
        float32x4_t leftf;
        float32x4_t rightf;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);
        leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (float)((((drflac_int32)(mid + side) >> 1) << (unusedBitsPerSample)) / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((((drflac_int32)(mid - side) >> 1) << (unusedBitsPerSample)) / 2147483648.0);
    }
}
#endif
11051
11052
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
11053
{
11054
drflac_uint64 i;
11055
drflac_uint64 frameCount4 = frameCount >> 2;
11056
const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
11057
const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
11058
drflac_uint32 shift = unusedBitsPerSample;
11059
float factor = 1 / 2147483648.0;
11060
11061
if (shift > 0) {
11062
shift -= 1;
11063
for (i = 0; i < frameCount4; ++i) {
11064
drflac_uint32 temp0L;
11065
drflac_uint32 temp1L;
11066
drflac_uint32 temp2L;
11067
drflac_uint32 temp3L;
11068
drflac_uint32 temp0R;
11069
drflac_uint32 temp1R;
11070
drflac_uint32 temp2R;
11071
drflac_uint32 temp3R;
11072
11073
drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
11074
drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
11075
drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
11076
drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
11077
11078
drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
11079
drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
11080
drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
11081
drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
11082
11083
mid0 = (mid0 << 1) | (side0 & 0x01);
11084
mid1 = (mid1 << 1) | (side1 & 0x01);
11085
mid2 = (mid2 << 1) | (side2 & 0x01);
11086
mid3 = (mid3 << 1) | (side3 & 0x01);
11087
11088
temp0L = (mid0 + side0) << shift;
11089
temp1L = (mid1 + side1) << shift;
11090
temp2L = (mid2 + side2) << shift;
11091
temp3L = (mid3 + side3) << shift;
11092
11093
temp0R = (mid0 - side0) << shift;
11094
temp1R = (mid1 - side1) << shift;
11095
temp2R = (mid2 - side2) << shift;
11096
temp3R = (mid3 - side3) << shift;
11097
11098
pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor;
11099
pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor;
11100
pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor;
11101
pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor;
11102
pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor;
11103
pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor;
11104
pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor;
11105
pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor;
11106
}
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1);
            temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1);
            temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1);
            temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1);

            temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1);
            temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1);
            temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1);
            temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1);

            pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) * factor;
    }
}
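
/*
Worked example (an illustrative sketch, not part of dr_flac): the mid/side decoders in this file all apply the same
per-sample arithmetic before scaling to float. The helper name mid_side_to_left_right() is hypothetical, and wasted bits
and output scaling are left out so the reconstruction itself stays visible. Like the decoders, it assumes that >> on a
negative signed value is an arithmetic shift, which holds on common compilers but is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: rebuild one left/right pair from one mid/side pair.
static void mid_side_to_left_right(int32_t mid, int32_t side, int32_t* pLeft, int32_t* pRight)
{
    uint32_t m = (uint32_t)mid;
    uint32_t s = (uint32_t)side;

    // mid lost the low bit of (left + right) when it was halved. That bit has the
    // same parity as side = (left - right), so it can be restored from side.
    m = (m << 1) | (s & 0x01);

    // m is now (left + right) and s is (left - right); both sums below are even.
    *pLeft  = (int32_t)(m + s) >> 1;
    *pRight = (int32_t)(m - s) >> 1;
}
```

For instance, left = 5 and right = -3 correspond to mid = 1 and side = 8, and the helper yields 5 and -3 again.
*/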

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample - 8;
    float factor;
    __m128 factor128;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = 1.0f / 8388608.0f;
    factor128 = _mm_set1_ps(factor);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i tempL;
            __m128i tempR;
            __m128 leftf;
            __m128 rightf;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            tempL = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            tempR = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128);
            rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128);

            _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
            _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor;
            pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor;
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i tempL;
            __m128i tempR;
            __m128 leftf;
            __m128 rightf;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            tempL = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            tempR = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128);
            rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128);

            _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
            _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor;
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor;
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample - 8;
    float factor;
    float32x4_t factor4;
    int32x4_t shift4;
    int32x4_t wbps0_4; /* Wasted Bits Per Sample */
    int32x4_t wbps1_4; /* Wasted Bits Per Sample */

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = 1.0f / 8388608.0f;
    factor4 = vdupq_n_f32(factor);
    wbps0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbps1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            int32x4_t lefti;
            int32x4_t righti;
            float32x4_t leftf;
            float32x4_t rightf;

            uint32x4_t mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4);
            uint32x4_t side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            lefti = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            righti = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
            rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

            drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor;
            pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor;
        }
    } else {
        shift -= 1;
        shift4 = vdupq_n_s32(shift);
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t lefti;
            int32x4_t righti;
            float32x4_t leftf;
            float32x4_t rightf;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            lefti = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            righti = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
            rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

            drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor;
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor;
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (float)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) / 2147483648.0);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        pOutputSamples[i*8+0] = (drflac_int32)tempL0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)tempR0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)tempL1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)tempR1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)tempL2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)tempR2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)tempL3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)tempR3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;

    float factor = 1.0f / 8388608.0f;
    __m128 factor128 = _mm_set1_ps(factor);

    for (i = 0; i < frameCount4; ++i) {
        __m128i lefti;
        __m128i righti;
        __m128 leftf;
        __m128 rightf;

        lefti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        righti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        leftf = _mm_mul_ps(_mm_cvtepi32_ps(lefti), factor128);
        rightf = _mm_mul_ps(_mm_cvtepi32_ps(righti), factor128);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;

    float factor = 1.0f / 8388608.0f;
    float32x4_t factor4 = vdupq_n_f32(factor);
    int32x4_t shift0_4 = vdupq_n_s32(shift0);
    int32x4_t shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t lefti;
        int32x4_t righti;
        float32x4_t leftf;
        float32x4_t rightf;

        lefti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4));
        righti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4));

        leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return. */
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_f32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving. */
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                        pBufferOut[(i*channelCount)+j] = (float)(sampleS32 / 2147483648.0);
                    }
                }
            }

            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (unsigned int)frameCountThisIteration;
        }
    }

    return framesRead;
}
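
/*
Worked example (an illustrative sketch, not part of dr_flac): the generic interleaving path in
drflac_read_pcm_frames_f32() left-justifies each decoded sample in 32 bits and then divides by 2^31, which lands the
result in [-1, 1). The helper name sample_to_f32() is hypothetical, and the sketch assumes the total shift stays below
32. The SIMD paths in this file shift by 8 fewer bits and scale by 1/2^23 instead, which gives the same value when
bitsPerSample is at most 24.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: convert one decoded sample to float the way the generic path does.
static float sample_to_f32(int32_t sample, unsigned int bitsPerSample, unsigned int wastedBitsPerSample)
{
    // Bits unused by the stream's sample format, plus the bits the subframe marked as wasted.
    unsigned int shift = (32 - bitsPerSample) + wastedBitsPerSample;

    // Shift as unsigned to avoid shifting a negative value, then reinterpret as signed.
    int32_t sampleS32 = (int32_t)((uint32_t)sample << shift);

    return (float)(sampleS32 / 2147483648.0);
}
```

With 16-bit audio and no wasted bits, 16384 maps to 0.5 and -32768 maps to -1.0.
*/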


DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    if (pFlac == NULL) {
        return DRFLAC_FALSE;
    }

    /* Don't do anything if we're already on the seek point. */
    if (pFlac->currentPCMFrame == pcmFrameIndex) {
        return DRFLAC_TRUE;
    }

    /*
    If we don't know where the first frame begins then we can't seek. This will happen when the STREAMINFO block was not present
    when the decoder was opened.
    */
    if (pFlac->firstFLACFramePosInBytes == 0) {
        return DRFLAC_FALSE;
    }

    if (pcmFrameIndex == 0) {
        pFlac->currentPCMFrame = 0;
        return drflac__seek_to_first_frame(pFlac);
    } else {
        drflac_bool32 wasSuccessful = DRFLAC_FALSE;
        drflac_uint64 originalPCMFrame = pFlac->currentPCMFrame;

        /* Clamp the sample to the end. */
        if (pcmFrameIndex > pFlac->totalPCMFrameCount) {
            pcmFrameIndex = pFlac->totalPCMFrameCount;
        }

        /* If the target sample and the current sample are in the same frame we just adjust the position within that frame. */
        if (pcmFrameIndex > pFlac->currentPCMFrame) {
            /* Forward. */
            drflac_uint32 offset = (drflac_uint32)(pcmFrameIndex - pFlac->currentPCMFrame);
            if (pFlac->currentFLACFrame.pcmFramesRemaining > offset) {
                pFlac->currentFLACFrame.pcmFramesRemaining -= offset;
                pFlac->currentPCMFrame = pcmFrameIndex;
                return DRFLAC_TRUE;
            }
        } else {
            /* Backward. */
            drflac_uint32 offsetAbs = (drflac_uint32)(pFlac->currentPCMFrame - pcmFrameIndex);
            drflac_uint32 currentFLACFramePCMFrameCount = pFlac->currentFLACFrame.header.blockSizeInPCMFrames;
            drflac_uint32 currentFLACFramePCMFramesConsumed = currentFLACFramePCMFrameCount - pFlac->currentFLACFrame.pcmFramesRemaining;
            if (currentFLACFramePCMFramesConsumed > offsetAbs) {
                pFlac->currentFLACFrame.pcmFramesRemaining += offsetAbs;
                pFlac->currentPCMFrame = pcmFrameIndex;
                return DRFLAC_TRUE;
            }
        }

        /*
        Different techniques depending on encapsulation. Using the native FLAC seektable with Ogg encapsulation is a bit awkward so
        we'll instead use Ogg's natural seeking facility.
        */
#ifndef DR_FLAC_NO_OGG
        if (pFlac->container == drflac_container_ogg)
        {
            wasSuccessful = drflac_ogg__seek_to_pcm_frame(pFlac, pcmFrameIndex);
        }
        else
#endif
        {
            /* First try seeking via the seek table. If this fails, fall back to a brute force seek which is much slower. */
            if (/*!wasSuccessful && */!pFlac->_noSeekTableSeek) {
                wasSuccessful = drflac__seek_to_pcm_frame__seek_table(pFlac, pcmFrameIndex);
            }

#if !defined(DR_FLAC_NO_CRC)
            /* Fall back to binary search if seek table seeking fails. This requires the length of the stream to be known. */
            if (!wasSuccessful && !pFlac->_noBinarySearchSeek && pFlac->totalPCMFrameCount > 0) {
                wasSuccessful = drflac__seek_to_pcm_frame__binary_search(pFlac, pcmFrameIndex);
            }
#endif

            /* Fall back to brute force if all else fails. */
            if (!wasSuccessful && !pFlac->_noBruteForceSeek) {
                wasSuccessful = drflac__seek_to_pcm_frame__brute_force(pFlac, pcmFrameIndex);
            }
        }

        if (wasSuccessful) {
            pFlac->currentPCMFrame = pcmFrameIndex;
        } else {
            /* Seek failed. Try putting the decoder back to its original state. */
            if (drflac_seek_to_pcm_frame(pFlac, originalPCMFrame) == DRFLAC_FALSE) {
                /* Failed to seek back to the original PCM frame. Fall back to 0. */
                drflac_seek_to_pcm_frame(pFlac, 0);
            }
        }

        return wasSuccessful;
    }
}
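
/*
Worked example (an illustrative sketch, not part of dr_flac): before drflac_seek_to_pcm_frame() tries the seek table,
binary search or brute force, it checks whether the target lies inside the FLAC frame that is already decoded, in which
case only two counters change. The frame_cursor type and seek_within_current_frame() below are hypothetical stand-ins
for the three drflac fields involved. Unlike the function above, the sketch compares offsets as 64-bit values instead of
casting them to 32 bits first.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for the fields the fast path touches.
typedef struct
{
    uint64_t currentPCMFrame;       // Absolute read position in the stream.
    uint32_t blockSizeInPCMFrames;  // Size of the currently decoded FLAC frame.
    uint32_t pcmFramesRemaining;    // Unread PCM frames left in that FLAC frame.
} frame_cursor;

// Returns 1 if the target was reachable inside the decoded frame, 0 if a real seek is needed.
static int seek_within_current_frame(frame_cursor* pCursor, uint64_t target)
{
    if (target == pCursor->currentPCMFrame) {
        return 1;   // Already there.
    }

    if (target > pCursor->currentPCMFrame) {
        // Forward: consume some of the frames that are still unread.
        uint64_t offset = target - pCursor->currentPCMFrame;
        if (pCursor->pcmFramesRemaining > offset) {
            pCursor->pcmFramesRemaining -= (uint32_t)offset;
            pCursor->currentPCMFrame = target;
            return 1;
        }
    } else {
        // Backward: hand back some of the frames that were already consumed.
        uint64_t offsetAbs = pCursor->currentPCMFrame - target;
        uint32_t consumed = pCursor->blockSizeInPCMFrames - pCursor->pcmFramesRemaining;
        if (consumed > offsetAbs) {
            pCursor->pcmFramesRemaining += (uint32_t)offsetAbs;
            pCursor->currentPCMFrame = target;
            return 1;
        }
    }

    return 0;
}
```

A return value of 0 corresponds to the point where the real function falls through to its slower seeking strategies.
*/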



/* High Level APIs */

/* SIZE_MAX */
#if defined(SIZE_MAX)
    #define DRFLAC_SIZE_MAX SIZE_MAX
#else
    #if defined(DRFLAC_64BIT)
        #define DRFLAC_SIZE_MAX ((drflac_uint64)0xFFFFFFFFFFFFFFFF)
    #else
        #define DRFLAC_SIZE_MAX 0xFFFFFFFF
    #endif
#endif
/* End SIZE_MAX */


/* Using a macro as the definition of the drflac__full_read_and_close_*() API family. Sue me. */
#define DRFLAC_DEFINE_FULL_READ_AND_CLOSE(extension, type) \
static type* drflac__full_read_and_close_ ## extension (drflac* pFlac, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut)\
{ \
    type* pSampleData = NULL; \
    drflac_uint64 totalPCMFrameCount; \
    \
    DRFLAC_ASSERT(pFlac != NULL); \
    \
    totalPCMFrameCount = pFlac->totalPCMFrameCount; \
    \
    if (totalPCMFrameCount == 0) { \
        type buffer[4096]; \
        drflac_uint64 pcmFramesRead; \
        size_t sampleDataBufferSize = sizeof(buffer); \
        \
        pSampleData = (type*)drflac__malloc_from_callbacks(sampleDataBufferSize, &pFlac->allocationCallbacks); \
        if (pSampleData == NULL) { \
            goto on_error; \
        } \
        \
        while ((pcmFramesRead = (drflac_uint64)drflac_read_pcm_frames_##extension(pFlac, sizeof(buffer)/sizeof(buffer[0])/pFlac->channels, buffer)) > 0) { \
            if (((totalPCMFrameCount + pcmFramesRead) * pFlac->channels * sizeof(type)) > sampleDataBufferSize) { \
                type* pNewSampleData; \
                size_t newSampleDataBufferSize; \
                \
                newSampleDataBufferSize = sampleDataBufferSize * 2; \
                pNewSampleData = (type*)drflac__realloc_from_callbacks(pSampleData, newSampleDataBufferSize, sampleDataBufferSize, &pFlac->allocationCallbacks); \
                if (pNewSampleData == NULL) { \
                    drflac__free_from_callbacks(pSampleData, &pFlac->allocationCallbacks); \
                    goto on_error; \
                } \
                \
                sampleDataBufferSize = newSampleDataBufferSize; \
                pSampleData = pNewSampleData; \
            } \
            \
            DRFLAC_COPY_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), buffer, (size_t)(pcmFramesRead*pFlac->channels*sizeof(type))); \
            totalPCMFrameCount += pcmFramesRead; \
        } \
        \
        /* At this point everything should be decoded, but we just want to fill the unused part of the buffer with silence - need to \
           protect those ears from random noise! */ \
        DRFLAC_ZERO_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), (size_t)(sampleDataBufferSize - totalPCMFrameCount*pFlac->channels*sizeof(type))); \
    } else { \
        drflac_uint64 dataSize = totalPCMFrameCount*pFlac->channels*sizeof(type); \
        if (dataSize > (drflac_uint64)DRFLAC_SIZE_MAX) { \
            goto on_error; /* The decoded data is too big. */ \
        } \
        \
        pSampleData = (type*)drflac__malloc_from_callbacks((size_t)dataSize, &pFlac->allocationCallbacks); /* <-- Safe cast as per the check above. */ \
        if (pSampleData == NULL) { \
            goto on_error; \
        } \
        \
        totalPCMFrameCount = drflac_read_pcm_frames_##extension(pFlac, pFlac->totalPCMFrameCount, pSampleData); \
    } \
    \
    if (sampleRateOut) *sampleRateOut = pFlac->sampleRate; \
    if (channelsOut) *channelsOut = pFlac->channels; \
    if (totalPCMFrameCountOut) *totalPCMFrameCountOut = totalPCMFrameCount; \
    \
    drflac_close(pFlac); \
    return pSampleData; \
    \
on_error: \
    drflac_close(pFlac); \
    return NULL; \
}

DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s32, drflac_int32)
DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s16, drflac_int16)
DRFLAC_DEFINE_FULL_READ_AND_CLOSE(f32, float)

DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s16(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_f32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

#ifndef DR_FLAC_NO_STDIO
DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_file(filename, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount);
}
11861
11862
DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
11863
{
11864
drflac* pFlac;
11865
11866
if (sampleRate) {
11867
*sampleRate = 0;
11868
}
11869
if (channels) {
11870
*channels = 0;
11871
}
11872
if (totalPCMFrameCount) {
11873
*totalPCMFrameCount = 0;
11874
}
11875
11876
pFlac = drflac_open_file(filename, pAllocationCallbacks);
11877
if (pFlac == NULL) {
11878
return NULL;
11879
}
11880
11881
return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount);
11882
}
11883
11884
DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
11885
{
11886
drflac* pFlac;
11887
11888
if (sampleRate) {
11889
*sampleRate = 0;
11890
}
11891
if (channels) {
11892
*channels = 0;
11893
}
11894
if (totalPCMFrameCount) {
11895
*totalPCMFrameCount = 0;
11896
}
11897
11898
pFlac = drflac_open_file(filename, pAllocationCallbacks);
11899
if (pFlac == NULL) {
11900
return NULL;
11901
}
11902
11903
return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount);
11904
}
11905
#endif
11906
DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount);
}

DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount);
}

DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount);
}


DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (pAllocationCallbacks != NULL) {
        drflac__free_from_callbacks(p, pAllocationCallbacks);
    } else {
        drflac__free_default(p, NULL);
    }
}




DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments)
{
    if (pIter == NULL) {
        return;
    }

    pIter->countRemaining = commentCount;
    pIter->pRunningData = (const char*)pComments;
}

DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut)
{
    drflac_int32 length;
    const char* pComment;

    /* Safety. */
    if (pCommentLengthOut) {
        *pCommentLengthOut = 0;
    }

    if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) {
        return NULL;
    }

    /* Each comment is stored as a 32-bit little-endian length followed by that many bytes of comment data. */
    length = drflac__le2host_32_ptr_unaligned(pIter->pRunningData);
    pIter->pRunningData += 4;

    pComment = pIter->pRunningData;
    pIter->pRunningData += length;
    pIter->countRemaining -= 1;

    if (pCommentLengthOut) {
        *pCommentLengthOut = length;
    }

    return pComment;
}
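
/*
The layout walked by drflac_next_vorbis_comment() is a 32-bit little-endian length followed by that many bytes of comment
text, with no null terminator. The sketch below is a standalone illustration of that layout, not dr_flac code: it assumes the
caller knows the size of the buffer and adds explicit bounds checks. The names comment_cursor and comment_cursor_next() are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical cursor over a raw comment list; not a dr_flac type.
typedef struct {
    const unsigned char* pData;  // Next unread byte.
    size_t bytesRemaining;       // Bytes left in the buffer.
    uint32_t countRemaining;     // Comments left to return.
} comment_cursor;

// Returns the next comment and stores its length in *pLength, or returns NULL
// when no comments remain or the data is truncated.
static const char* comment_cursor_next(comment_cursor* pCursor, uint32_t* pLength)
{
    const unsigned char* p;
    uint32_t length;

    if (pLength != NULL) {
        *pLength = 0;
    }
    if (pCursor == NULL || pCursor->pData == NULL || pCursor->countRemaining == 0 || pCursor->bytesRemaining < 4) {
        return NULL;
    }

    // Assemble the little-endian length byte by byte so that neither alignment
    // nor host endianness matters.
    p = pCursor->pData;
    length = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    if (length > pCursor->bytesRemaining - 4) {
        return NULL;  // The declared length runs past the end of the buffer.
    }

    pCursor->pData          += 4 + (size_t)length;
    pCursor->bytesRemaining -= 4 + (size_t)length;
    pCursor->countRemaining -= 1;

    if (pLength != NULL) {
        *pLength = length;
    }
    return (const char*)(p + 4);
}
```

Because the returned pointer is not null-terminated, the length must be used when printing or copying the comment, for
example with a "%.*s" format specifier.
*/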




DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData)
{
    if (pIter == NULL) {
        return;
    }

    pIter->countRemaining = trackCount;
    pIter->pRunningData = (const char*)pTrackData;
}

DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack)
{
    drflac_cuesheet_track cuesheetTrack;
    const char* pRunningData;
    drflac_uint64 offsetHi;
    drflac_uint64 offsetLo;

    if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) {
        return DRFLAC_FALSE;
    }

    pRunningData = pIter->pRunningData;

    /* The 64-bit big-endian track offset is read as two 32-bit halves and then recombined. */
    offsetHi = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4;
    offsetLo = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4;
    cuesheetTrack.offset = offsetLo | (offsetHi << 32);
    cuesheetTrack.trackNumber = pRunningData[0]; pRunningData += 1;
    DRFLAC_COPY_MEMORY(cuesheetTrack.ISRC, pRunningData, sizeof(cuesheetTrack.ISRC)); pRunningData += 12;
    cuesheetTrack.isAudio = (pRunningData[0] & 0x80) != 0;
    cuesheetTrack.preEmphasis = (pRunningData[0] & 0x40) != 0; pRunningData += 14;
    cuesheetTrack.indexCount = pRunningData[0]; pRunningData += 1;
    /* The index points are not copied. The pointer refers directly into the iterator's data. */
    cuesheetTrack.pIndexPoints = (const drflac_cuesheet_track_index*)pRunningData; pRunningData += cuesheetTrack.indexCount * sizeof(drflac_cuesheet_track_index);

    pIter->pRunningData = pRunningData;
    pIter->countRemaining -= 1;

    if (pCuesheetTrack) {
        *pCuesheetTrack = cuesheetTrack;
    }

    return DRFLAC_TRUE;
}
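
/*
The pointer advances in drflac_next_cuesheet_track() imply a fixed 36-byte track header: an 8-byte big-endian offset, a 1-byte
track number, a 12-byte ISRC, 14 bytes of flags and reserved space, and a 1-byte index count. The sketch below is a standalone
illustration of that layout, not dr_flac code. It reads every field byte by byte, so it does not depend on the alignment of
the input. The names track_header, read_be64() and parse_track_header() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACK_HEADER_SIZE 36  // 8 + 1 + 12 + 14 + 1 bytes.

// Hypothetical mirror of the fixed-size part of a cuesheet track.
typedef struct {
    uint64_t offset;
    uint8_t trackNumber;
    char ISRC[12];
    uint8_t flags;       // Raw flags byte; 0x80 and 0x40 are the two bits tested by the iterator.
    uint8_t indexCount;
} track_header;

// Reads a big-endian 64-bit value one byte at a time.
static uint64_t read_be64(const unsigned char* p)
{
    uint64_t value = 0;
    int i;
    for (i = 0; i < 8; i += 1) {
        value = (value << 8) | p[i];
    }
    return value;
}

// Returns 1 on success, or 0 when fewer than TRACK_HEADER_SIZE bytes are available.
static int parse_track_header(const unsigned char* p, size_t size, track_header* pOut)
{
    if (p == NULL || pOut == NULL || size < TRACK_HEADER_SIZE) {
        return 0;
    }

    pOut->offset      = read_be64(p);  // Bytes 0..7.
    pOut->trackNumber = p[8];          // Byte 8.
    memcpy(pOut->ISRC, p + 9, 12);     // Bytes 9..20.
    pOut->flags       = p[21];         // Byte 21, followed by 13 reserved bytes.
    pOut->indexCount  = p[35];         // Byte 35.
    return 1;
}
```

The index points follow immediately after the 36-byte header, which is why the iterator can hand out a pointer to them
without copying.
*/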

#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
    #pragma GCC diagnostic pop
#endif
#endif  /* dr_flac_c */
#endif  /* DR_FLAC_IMPLEMENTATION */


/*
REVISION HISTORY
================
v0.12.42 - 2023-11-02
  - Fix build for ARMv6-M.
  - Fix a compilation warning with GCC.

v0.12.41 - 2023-06-17
  - Fix an incorrect date in revision history. No functional change.

v0.12.40 - 2023-05-22
  - Minor code restructure. No functional change.

v0.12.39 - 2022-09-17
  - Fix compilation with DJGPP.
  - Fix compilation error with Visual Studio 2019 and the ARM build.
  - Fix an error with SSE 4.1 detection.
  - Add support for disabling wchar_t with DR_FLAC_NO_WCHAR.
  - Improve compatibility with compilers which lack support for explicit struct packing.
  - Improve compatibility with low-end and embedded hardware by reducing the amount of stack
    allocation when loading an Ogg encapsulated file.

v0.12.38 - 2022-04-10
  - Fix compilation error on older versions of GCC.

v0.12.37 - 2022-02-12
  - Improve ARM detection.

v0.12.36 - 2022-02-07
  - Fix a compilation error with the ARM build.

v0.12.35 - 2022-02-06
  - Fix a bug due to underestimating the amount of precision required for the prediction stage.
  - Fix some bugs found from fuzz testing.

v0.12.34 - 2022-01-07
  - Fix some misalignment bugs when reading metadata.

v0.12.33 - 2021-12-22
  - Fix a bug with seeking when the seek table does not start at PCM frame 0.

v0.12.32 - 2021-12-11
  - Fix a warning with Clang.

v0.12.31 - 2021-08-16
  - Silence some warnings.

v0.12.30 - 2021-07-31
  - Fix platform detection for ARM64.

v0.12.29 - 2021-04-02
  - Fix a bug where the running PCM frame index is set to an invalid value when over-seeking.
  - Fix a decoding error due to an incorrect validation check.

v0.12.28 - 2021-02-21
  - Fix a warning due to referencing _MSC_VER when it is undefined.

v0.12.27 - 2021-01-31
  - Fix a static analysis warning.

v0.12.26 - 2021-01-17
  - Fix a compilation warning due to _BSD_SOURCE being deprecated.

v0.12.25 - 2020-12-26
  - Update documentation.

v0.12.24 - 2020-11-29
  - Fix ARM64/NEON detection when compiling with MSVC.

v0.12.23 - 2020-11-21
  - Fix compilation with OpenWatcom.

v0.12.22 - 2020-11-01
  - Fix an error with the previous release.

v0.12.21 - 2020-11-01
  - Fix a possible deadlock when seeking.
  - Improve compiler support for older versions of GCC.

v0.12.20 - 2020-09-08
  - Fix a compilation error on older compilers.

v0.12.19 - 2020-08-30
  - Fix a bug due to an undefined 32-bit shift.

v0.12.18 - 2020-08-14
  - Fix a crash when compiling with clang-cl.

v0.12.17 - 2020-08-02
  - Simplify sized types.

v0.12.16 - 2020-07-25
  - Fix a compilation warning.

v0.12.15 - 2020-07-06
  - Check for negative LPC shifts and return an error.

v0.12.14 - 2020-06-23
  - Add include guard for the implementation section.

v0.12.13 - 2020-05-16
  - Add compile-time and run-time version querying.
    - DRFLAC_VERSION_MINOR
    - DRFLAC_VERSION_MAJOR
    - DRFLAC_VERSION_REVISION
    - DRFLAC_VERSION_STRING
    - drflac_version()
    - drflac_version_string()

v0.12.12 - 2020-04-30
  - Fix compilation errors with VC6.

v0.12.11 - 2020-04-19
  - Fix some pedantic warnings.
  - Fix some undefined behaviour warnings.

v0.12.10 - 2020-04-10
  - Fix some bugs when trying to seek with an invalid seek table.

v0.12.9 - 2020-04-05
  - Fix warnings.

v0.12.8 - 2020-04-04
  - Add drflac_open_file_w() and drflac_open_file_with_metadata_w().
  - Fix some static analysis warnings.
  - Minor documentation updates.

v0.12.7 - 2020-03-14
  - Fix compilation errors with VC6.

v0.12.6 - 2020-03-07
  - Fix compilation error with Visual Studio .NET 2003.

v0.12.5 - 2020-01-30
  - Silence some static analysis warnings.

v0.12.4 - 2020-01-29
  - Silence some static analysis warnings.

v0.12.3 - 2019-12-02
  - Fix some warnings when compiling with GCC and the -Og flag.
  - Fix a crash in out-of-memory situations.
  - Fix potential integer overflow bug.
  - Fix some static analysis warnings.
  - Fix a possible crash when using custom memory allocators without a custom realloc() implementation.
  - Fix a bug with binary search seeking where the bits per sample is not a multiple of 8.

v0.12.2 - 2019-10-07
  - Internal code clean up.

v0.12.1 - 2019-09-29
  - Fix some Clang Static Analyzer warnings.
  - Fix an unused variable warning.
12231
v0.12.0 - 2019-09-23
12232
- API CHANGE: Add support for user defined memory allocation routines. This system allows the program to specify their own memory allocation
12233
routines with a user data pointer for client-specific contextual data. This adds an extra parameter to the end of the following APIs:
12234
- drflac_open()
12235
- drflac_open_relaxed()
12236
- drflac_open_with_metadata()
12237
- drflac_open_with_metadata_relaxed()
12238
- drflac_open_file()
12239
- drflac_open_file_with_metadata()
12240
- drflac_open_memory()
12241
- drflac_open_memory_with_metadata()
12242
- drflac_open_and_read_pcm_frames_s32()
12243
- drflac_open_and_read_pcm_frames_s16()
12244
- drflac_open_and_read_pcm_frames_f32()
12245
- drflac_open_file_and_read_pcm_frames_s32()
12246
- drflac_open_file_and_read_pcm_frames_s16()
12247
- drflac_open_file_and_read_pcm_frames_f32()
12248
- drflac_open_memory_and_read_pcm_frames_s32()
12249
- drflac_open_memory_and_read_pcm_frames_s16()
12250
- drflac_open_memory_and_read_pcm_frames_f32()
12251
Set this extra parameter to NULL to use defaults which is the same as the previous behaviour. Setting this NULL will use
12252
DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE.
12253
- Remove deprecated APIs:
12254
- drflac_read_s32()
12255
- drflac_read_s16()
12256
- drflac_read_f32()
12257
- drflac_seek_to_sample()
12258
- drflac_open_and_decode_s32()
12259
- drflac_open_and_decode_s16()
12260
- drflac_open_and_decode_f32()
12261
- drflac_open_and_decode_file_s32()
12262
- drflac_open_and_decode_file_s16()
12263
- drflac_open_and_decode_file_f32()
12264
- drflac_open_and_decode_memory_s32()
12265
- drflac_open_and_decode_memory_s16()
12266
- drflac_open_and_decode_memory_f32()
12267
- Remove drflac.totalSampleCount which is now replaced with drflac.totalPCMFrameCount. You can emulate drflac.totalSampleCount
12268
by doing pFlac->totalPCMFrameCount*pFlac->channels.
12269
- Rename drflac.currentFrame to drflac.currentFLACFrame to remove ambiguity with PCM frames.
12270
- Fix errors when seeking to the end of a stream.
12271
- Optimizations to seeking.
12272
- SSE improvements and optimizations.
12273
- ARM NEON optimizations.
12274
- Optimizations to drflac_read_pcm_frames_s16().
12275
- Optimizations to drflac_read_pcm_frames_s32().
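
As a sketch of how the user data pointer introduced in v0.12.0 can carry client-specific context, the callbacks below count
calls in a small bookkeeping structure. The callback signatures follow the example in the v0.12.0 release notes at the top of
this file. The alloc_stats type and the counting_*() names are assumptions made for illustration and are not part of dr_flac.

```c
#include <stddef.h>
#include <stdlib.h>

// Hypothetical bookkeeping reached through pUserData.
typedef struct {
    size_t mallocCount;
    size_t reallocCount;
    size_t freeCount;
} alloc_stats;

static void* counting_malloc(size_t sz, void* pUserData)
{
    ((alloc_stats*)pUserData)->mallocCount += 1;
    return malloc(sz);
}

static void* counting_realloc(void* p, size_t sz, void* pUserData)
{
    ((alloc_stats*)pUserData)->reallocCount += 1;
    return realloc(p, sz);
}

static void counting_free(void* p, void* pUserData)
{
    ((alloc_stats*)pUserData)->freeCount += 1;
    free(p);
}
```

With dr_flac these would be assigned to the onMalloc, onRealloc and onFree members of a drflac_allocation_callbacks object,
with pUserData pointing at the alloc_stats instance, and the object passed as the last argument to drflac_open_file() and
family.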

v0.11.10 - 2019-06-26
  - Fix a compiler error.

v0.11.9 - 2019-06-16
  - Silence some ThreadSanitizer warnings.

v0.11.8 - 2019-05-21
  - Fix warnings.

v0.11.7 - 2019-05-06
  - C89 fixes.

v0.11.6 - 2019-05-05
  - Add support for C89.
  - Fix a compiler warning when CRC is disabled.
  - Change license to choice of public domain or MIT-0.

v0.11.5 - 2019-04-19
  - Fix a compiler error with GCC.

v0.11.4 - 2019-04-17
  - Fix some warnings with GCC when compiling with -std=c99.

v0.11.3 - 2019-04-07
  - Silence warnings with GCC.

v0.11.2 - 2019-03-10
  - Fix a warning.

v0.11.1 - 2019-02-17
  - Fix a potential bug with seeking.

v0.11.0 - 2018-12-16
  - API CHANGE: Deprecated drflac_read_s32(), drflac_read_s16() and drflac_read_f32() and replaced them with
    drflac_read_pcm_frames_s32(), drflac_read_pcm_frames_s16() and drflac_read_pcm_frames_f32(). The new APIs take
    and return PCM frame counts instead of sample counts. To upgrade you will need to change the input count by
    dividing it by the channel count, and then do the same with the return value.
  - API CHANGE: Deprecated drflac_seek_to_sample() and replaced with drflac_seek_to_pcm_frame(). Same rules as
    the changes to drflac_read_*() apply.
  - API CHANGE: Deprecated drflac_open_and_decode_*() and replaced with drflac_open_*_and_read_*(). Same rules as
    the changes to drflac_read_*() apply.
  - Optimizations.
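
The count conversion described for v0.11.0 is plain arithmetic on the channel count. The helpers below are a minimal sketch
for illustration only. The names samples_to_pcm_frames() and pcm_frames_to_samples() are hypothetical and are not part of
dr_flac, and returning 0 for a channel count of 0 is a choice made for this sketch.

```c
#include <stdint.h>

// Converts an interleaved sample count into a PCM frame count.
// One PCM frame holds one sample per channel.
static uint64_t samples_to_pcm_frames(uint64_t sampleCount, uint32_t channels)
{
    if (channels == 0) {
        return 0;  // Avoid dividing by zero on an invalid channel count.
    }
    return sampleCount / channels;
}

// Converts a PCM frame count back into an interleaved sample count.
static uint64_t pcm_frames_to_samples(uint64_t frameCount, uint32_t channels)
{
    return frameCount * channels;
}
```

Code that used to request 4096 samples from a stereo stream would request 2048 PCM frames instead. Code that still needs a
sample count afterwards can multiply the returned frame count by the channel count.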

v0.10.0 - 2018-09-11
  - Remove the DR_FLAC_NO_WIN32_IO option and the Win32 file IO functionality. If you need to use Win32 file IO you
    need to do it yourself via the callback API.
  - Fix the clang build.
  - Fix undefined behavior.
  - Fix errors with CUESHEET metadata blocks.
  - Add an API for iterating over each cuesheet track in the CUESHEET metadata block. This works the same way as the
    Vorbis comment API.
  - Other miscellaneous bug fixes, mostly relating to invalid FLAC streams.
  - Minor optimizations.

v0.9.11 - 2018-08-29
  - Fix a bug with sample reconstruction.

v0.9.10 - 2018-08-07
  - Improve 64-bit detection.

v0.9.9 - 2018-08-05
  - Fix C++ build on older versions of GCC.

v0.9.8 - 2018-07-24
  - Fix compilation errors.

v0.9.7 - 2018-07-05
  - Fix a warning.

v0.9.6 - 2018-06-29
  - Fix some typos.

v0.9.5 - 2018-06-23
  - Fix some warnings.

v0.9.4 - 2018-06-14
  - Optimizations to seeking.
  - Clean up.

v0.9.3 - 2018-05-22
  - Bug fix.

v0.9.2 - 2018-05-12
  - Fix a compilation error due to a missing break statement.

v0.9.1 - 2018-04-29
  - Fix compilation error with Clang.

v0.9 - 2018-04-24
  - Fix Clang build.
  - Start using major.minor.revision versioning.

v0.8g - 2018-04-19
  - Fix build on non-x86/x64 architectures.

v0.8f - 2018-02-02
  - Stop pretending to support changing rate/channels mid stream.

v0.8e - 2018-02-01
  - Fix a crash when the block size of a frame is larger than the maximum block size defined by the FLAC stream.
  - Fix a crash when the Rice partition order is invalid.

v0.8d - 2017-09-22
  - Add support for decoding streams with ID3 tags. ID3 tags are just skipped.

v0.8c - 2017-09-07
  - Fix warning on non-x86/x64 architectures.

v0.8b - 2017-08-19
  - Fix build on non-x86/x64 architectures.

v0.8a - 2017-08-13
  - A small optimization for the Clang build.

v0.8 - 2017-08-12
  - API CHANGE: Rename dr_* types to drflac_*.
  - Optimizations. This brings dr_flac back to about the same class of efficiency as the reference implementation.
  - Add support for custom implementations of malloc(), realloc(), etc.
  - Add CRC checking to Ogg encapsulated streams.
  - Fix VC++ 6 build. This is only for the C++ compiler. The C compiler is not currently supported.
  - Bug fixes.

v0.7 - 2017-07-23
  - Add support for opening a stream without a header block. To do this, use drflac_open_relaxed() / drflac_open_with_metadata_relaxed().

v0.6 - 2017-07-22
  - Add support for recovering from invalid frames. With this change, dr_flac will simply skip over invalid frames as if they
    never existed. Frames are checked against their sync code, the CRC-8 of the frame header and the CRC-16 of the whole frame.
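
The two checksums mentioned for v0.6 are defined by the FLAC format: CRC-8 uses the polynomial x^8 + x^2 + x + 1 and CRC-16
uses x^16 + x^15 + x^2 + 1, both starting from zero. The bitwise versions below are a minimal sketch for illustration. They
are not taken from dr_flac, and a decoder would more likely use lookup tables. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// CRC-8, polynomial 0x07, initial value 0, most significant bit first.
static uint8_t crc8_bitwise(const unsigned char* pData, size_t size)
{
    uint8_t crc = 0;
    size_t i;
    int bit;

    for (i = 0; i < size; i += 1) {
        crc ^= pData[i];
        for (bit = 0; bit < 8; bit += 1) {
            crc = (uint8_t)((crc & 0x80) ? ((crc << 1) ^ 0x07) : (crc << 1));
        }
    }
    return crc;
}

// CRC-16, polynomial 0x8005, initial value 0, most significant bit first.
static uint16_t crc16_bitwise(const unsigned char* pData, size_t size)
{
    uint16_t crc = 0;
    size_t i;
    int bit;

    for (i = 0; i < size; i += 1) {
        crc ^= (uint16_t)((unsigned int)pData[i] << 8);
        for (bit = 0; bit < 8; bit += 1) {
            crc = (uint16_t)((crc & 0x8000) ? ((crc << 1) ^ 0x8005) : (crc << 1));
        }
    }
    return crc;
}
```

A frame whose stored checksum does not match the computed one is treated as invalid and skipped, as described in the entry
above.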

v0.5 - 2017-07-16
  - Fix typos.
  - Change drflac_bool* types to unsigned.
  - Add CRC checking. This makes dr_flac slower, but can be disabled with #define DR_FLAC_NO_CRC.

v0.4f - 2017-03-10
  - Fix a couple of bugs with the bitstreaming code.

v0.4e - 2017-02-17
  - Fix some warnings.

v0.4d - 2016-12-26
  - Add support for 32-bit floating-point PCM decoding.
  - Use drflac_int* and drflac_uint* sized types to improve compiler support.
  - Minor improvements to documentation.

v0.4c - 2016-12-26
  - Add support for signed 16-bit integer PCM decoding.

v0.4b - 2016-10-23
  - A minor change to drflac_bool8 and drflac_bool32 types.

v0.4a - 2016-10-11
  - Rename drBool32 to drflac_bool32 for styling consistency.

v0.4 - 2016-09-29
  - API/ABI CHANGE: Use fixed size 32-bit booleans instead of the built-in bool type.
  - API CHANGE: Rename drflac_open_and_decode*() to drflac_open_and_decode*_s32().
  - API CHANGE: Swap the order of "channels" and "sampleRate" parameters in drflac_open_and_decode*(). Rationale for this is to
    keep it consistent with drflac_audio.

v0.3f - 2016-09-21
  - Fix a warning with GCC.

v0.3e - 2016-09-18
  - Fixed a bug where GCC 4.3+ was not getting properly identified.
  - Fixed a few typos.
  - Changed date formats to ISO 8601 (YYYY-MM-DD).

v0.3d - 2016-06-11
  - Minor clean up.

v0.3c - 2016-05-28
  - Fixed compilation error.

v0.3b - 2016-05-16
  - Fixed Linux/GCC build.
  - Updated documentation.

v0.3a - 2016-05-15
  - Minor fixes to documentation.

v0.3 - 2016-05-11
  - Optimizations. Now at about parity with the reference implementation on 32-bit builds.
  - Lots of clean up.

v0.2b - 2016-05-10
  - Bug fixes.

v0.2a - 2016-05-10
  - Made drflac_open_and_decode() more robust.
  - Removed an unused debugging variable.

v0.2 - 2016-05-09
  - Added support for Ogg encapsulation.
  - API CHANGE. Have the onSeek callback take a third argument which specifies whether or not the seek
    should be relative to the start or the current position. Also changes the seeking rules such that
    seeking offsets will never be negative.
  - Have drflac_open_and_decode() fail gracefully if the stream has an unknown total sample count.

v0.1b - 2016-05-07
  - Properly close the file handle in drflac_open_file() and family when the decoder fails to initialize.
  - Removed a stale comment.

v0.1a - 2016-05-05
  - Minor formatting changes.
  - Fixed a warning on the GCC build.

v0.1 - 2016-05-03
  - Initial versioned release.
*/

/*
This software is available as a choice of the following licenses. Choose
whichever you prefer.

===============================================================================
ALTERNATIVE 1 - Public Domain (www.unlicense.org)
===============================================================================
This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
software, either in source code form or as a compiled binary, for any purpose,
commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this
software dedicate any and all copyright interest in the software to the public
domain. We make this dedication for the benefit of the public at large and to
the detriment of our heirs and successors. We intend this dedication to be an
overt act of relinquishment in perpetuity of all present and future rights to
this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to <http://unlicense.org/>

===============================================================================
ALTERNATIVE 2 - MIT No Attribution
===============================================================================
Copyright 2023 David Reid

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/