Path: blob/master/dep/libchdr/include/dr_libs/dr_flac.h
/*
FLAC audio decoder. Choice of public domain or MIT-0. See license statements at the end of this file.
dr_flac - v0.12.42 - 2023-11-02

David Reid - [email protected]

GitHub: https://github.com/mackron/dr_libs
*/

/*
RELEASE NOTES - v0.12.0
=======================
Version 0.12.0 has breaking API changes including changes to the existing API and the removal of deprecated APIs.


Improved Client-Defined Memory Allocation
-----------------------------------------
The main change with this release is the addition of a more flexible way of implementing custom memory allocation routines. The
existing system of DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE is still in place and will be used by default when no custom
allocation callbacks are specified.

To use the new system, you pass in a pointer to a drflac_allocation_callbacks object to drflac_open() and family, like this:

    void* my_malloc(size_t sz, void* pUserData)
    {
        return malloc(sz);
    }
    void* my_realloc(void* p, size_t sz, void* pUserData)
    {
        return realloc(p, sz);
    }
    void my_free(void* p, void* pUserData)
    {
        free(p);
    }

    ...

    drflac_allocation_callbacks allocationCallbacks;
    allocationCallbacks.pUserData = &myData;
    allocationCallbacks.onMalloc  = my_malloc;
    allocationCallbacks.onRealloc = my_realloc;
    allocationCallbacks.onFree    = my_free;
    drflac* pFlac = drflac_open_file("my_file.flac", &allocationCallbacks);

The advantage of this new system is that it allows you to specify user data which will be passed in to the allocation routines.

Passing in null for the allocation callbacks object will cause dr_flac to use defaults which is the same as DRFLAC_MALLOC,
DRFLAC_REALLOC and DRFLAC_FREE and the equivalent of how it worked in previous versions.

Every API that opens a drflac object now takes this extra parameter.
These include the following:

    drflac_open()
    drflac_open_relaxed()
    drflac_open_with_metadata()
    drflac_open_with_metadata_relaxed()
    drflac_open_file()
    drflac_open_file_with_metadata()
    drflac_open_memory()
    drflac_open_memory_with_metadata()
    drflac_open_and_read_pcm_frames_s32()
    drflac_open_and_read_pcm_frames_s16()
    drflac_open_and_read_pcm_frames_f32()
    drflac_open_file_and_read_pcm_frames_s32()
    drflac_open_file_and_read_pcm_frames_s16()
    drflac_open_file_and_read_pcm_frames_f32()
    drflac_open_memory_and_read_pcm_frames_s32()
    drflac_open_memory_and_read_pcm_frames_s16()
    drflac_open_memory_and_read_pcm_frames_f32()



Optimizations
-------------
Seeking performance has been greatly improved. A new binary search based seeking algorithm has been introduced which significantly
improves performance over the brute force method which was used when no seek table was present. Seek table based seeking also takes
advantage of the new binary search seeking system to further improve performance there as well. Note that this depends on CRC which
means it will be disabled when DR_FLAC_NO_CRC is used.

The SSE4.1 pipeline has been cleaned up and optimized. You should see some improvements with decoding speed of 24-bit files in
particular. 16-bit streams should also see some improvement.

drflac_read_pcm_frames_s16() has been optimized. Previously this sat on top of drflac_read_pcm_frames_s32() and performed its s32
to s16 conversion in a second pass. This is now all done in a single pass. This includes SSE2 and ARM NEON optimized paths.

A minor optimization has been implemented for drflac_read_pcm_frames_s32(). This will now use an SSE2 optimized pipeline for stereo
channel reconstruction which is the last part of the decoding process.

The ARM build has seen a few improvements. The CLZ (count leading zeroes) and REV (byte swap) instructions are now used when
compiling with GCC and Clang which is achieved using inline assembly.
The CLZ instruction requires ARM architecture version 5 at
compile time and the REV instruction requires ARM architecture version 6.

An ARM NEON optimized pipeline has been implemented. To enable this you'll need to add -mfpu=neon to the command line when compiling.


Removed APIs
------------
The following APIs were deprecated in version 0.11.0 and have been completely removed in version 0.12.0:

    drflac_read_s32()                   -> drflac_read_pcm_frames_s32()
    drflac_read_s16()                   -> drflac_read_pcm_frames_s16()
    drflac_read_f32()                   -> drflac_read_pcm_frames_f32()
    drflac_seek_to_sample()             -> drflac_seek_to_pcm_frame()
    drflac_open_and_decode_s32()        -> drflac_open_and_read_pcm_frames_s32()
    drflac_open_and_decode_s16()        -> drflac_open_and_read_pcm_frames_s16()
    drflac_open_and_decode_f32()        -> drflac_open_and_read_pcm_frames_f32()
    drflac_open_and_decode_file_s32()   -> drflac_open_file_and_read_pcm_frames_s32()
    drflac_open_and_decode_file_s16()   -> drflac_open_file_and_read_pcm_frames_s16()
    drflac_open_and_decode_file_f32()   -> drflac_open_file_and_read_pcm_frames_f32()
    drflac_open_and_decode_memory_s32() -> drflac_open_memory_and_read_pcm_frames_s32()
    drflac_open_and_decode_memory_s16() -> drflac_open_memory_and_read_pcm_frames_s16()
    drflac_open_and_decode_memory_f32() -> drflac_open_memory_and_read_pcm_frames_f32()

Prior versions of dr_flac operated on a per-sample basis whereas now it operates on PCM frames. The removed APIs all relate
to the old per-sample APIs. You now need to use the "pcm_frame" versions.
*/


/*
Introduction
============
dr_flac is a single file library. To use it, do something like the following in one .c file.

```c
#define DR_FLAC_IMPLEMENTATION
#include "dr_flac.h"
```

You can then #include this file in other parts of the program as you would with any other header file.
To decode audio data, do something like the following:

```c
drflac* pFlac = drflac_open_file("MySong.flac", NULL);
if (pFlac == NULL) {
    // Failed to open FLAC file
}

drflac_int32* pSamples = malloc(pFlac->totalPCMFrameCount * pFlac->channels * sizeof(drflac_int32));
drflac_uint64 numberOfInterleavedSamplesActuallyRead = drflac_read_pcm_frames_s32(pFlac, pFlac->totalPCMFrameCount, pSamples);
```

The drflac object represents the decoder. It is a transparent type so all the information you need, such as the number of channels and the bits per sample,
should be directly accessible - just make sure you don't change their values. Samples are always output as interleaved signed 32-bit PCM. In the example above
a native FLAC stream was opened, however dr_flac has seamless support for Ogg encapsulated FLAC streams as well.

You do not need to decode the entire stream in one go - you just specify how many samples you'd like at any given time and the decoder will give you as many
samples as it can, up to the amount requested. Later on when you need the next batch of samples, just call it again.
Example:

```c
while (drflac_read_pcm_frames_s32(pFlac, chunkSizeInPCMFrames, pChunkSamples) > 0) {
    do_something();
}
```

You can seek to a specific PCM frame with `drflac_seek_to_pcm_frame()`.

If you just want to quickly decode an entire FLAC file in one go you can do something like this:

```c
unsigned int channels;
unsigned int sampleRate;
drflac_uint64 totalPCMFrameCount;
drflac_int32* pSampleData = drflac_open_file_and_read_pcm_frames_s32("MySong.flac", &channels, &sampleRate, &totalPCMFrameCount, NULL);
if (pSampleData == NULL) {
    // Failed to open and decode FLAC file.
}

...

drflac_free(pSampleData, NULL);
```

You can read samples as signed 16-bit integer and 32-bit floating-point PCM with the *_s16() and *_f32() family of APIs respectively, but note that these
should be considered lossy.


If you need access to metadata (album art, etc.), use `drflac_open_with_metadata()`, `drflac_open_file_with_metadata()` or `drflac_open_memory_with_metadata()`.
The rationale for keeping these APIs separate is that they're slightly slower than the normal versions and also just a little bit harder to use. dr_flac
reports metadata to the application through the use of a callback, and every metadata block is reported before `drflac_open_with_metadata()` returns.

The main opening APIs (`drflac_open()`, etc.) will fail if the header is not present. This presents a problem in certain scenarios such as broadcast style
streams or internet radio where the header may not be present because the user has started playback mid-stream. To handle this, use the relaxed APIs:

    `drflac_open_relaxed()`
    `drflac_open_with_metadata_relaxed()`

It is not recommended to use these APIs for file based streams because a missing header would usually indicate a corrupt or perverse file.
In addition, these
APIs can take a long time to initialize because they may need to spend a lot of time finding the first frame.



Build Options
=============
#define these options before including this file.

#define DR_FLAC_NO_STDIO
  Disable `drflac_open_file()` and family.

#define DR_FLAC_NO_OGG
  Disables support for Ogg/FLAC streams.

#define DR_FLAC_BUFFER_SIZE <number>
  Defines the size of the internal buffer to store data from onRead(). This buffer is used to reduce the number of calls back to the client for more data.
  Larger values mean more memory, but better performance. My tests show diminishing returns after about 4KB (which is the default). Consider reducing this if
  you have a very efficient implementation of onRead(), or increase it if it's very inefficient. Must be a multiple of 8.

#define DR_FLAC_NO_CRC
  Disables CRC checks. This will offer a performance boost when CRC is unnecessary. This will disable binary search seeking. When seeking, the seek table will
  be used if available. Otherwise the seek will be performed using brute force.

#define DR_FLAC_NO_SIMD
  Disables SIMD optimizations (SSE on x86/x64 architectures, NEON on ARM architectures). Use this if you are having compatibility issues with your compiler.

#define DR_FLAC_NO_WCHAR
  Disables all functions ending with `_w`. Use this if your compiler does not provide wchar.h.
  Not required if DR_FLAC_NO_STDIO is also defined.



Notes
=====
- dr_flac does not support changing the sample rate nor channel count mid stream.
- dr_flac is not thread-safe, but its APIs can be called from any thread so long as you do your own synchronization.
- When using Ogg encapsulation, a corrupted metadata block will result in `drflac_open_with_metadata()` and `drflac_open()` returning inconsistent samples due
  to differences in corrupted stream recovery logic between the two APIs.
*/

#ifndef dr_flac_h
#define dr_flac_h

#ifdef __cplusplus
extern "C" {
#endif

#define DRFLAC_STRINGIFY(x)      #x
#define DRFLAC_XSTRINGIFY(x)     DRFLAC_STRINGIFY(x)

#define DRFLAC_VERSION_MAJOR     0
#define DRFLAC_VERSION_MINOR     12
#define DRFLAC_VERSION_REVISION  42
#define DRFLAC_VERSION_STRING    DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MAJOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MINOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_REVISION)

#include <stddef.h> /* For size_t. */

/* Sized Types */
typedef   signed char           drflac_int8;
typedef unsigned char           drflac_uint8;
typedef   signed short          drflac_int16;
typedef unsigned short          drflac_uint16;
typedef   signed int            drflac_int32;
typedef unsigned int            drflac_uint32;
#if defined(_MSC_VER) && !defined(__clang__)
    typedef   signed __int64    drflac_int64;
    typedef unsigned __int64    drflac_uint64;
#else
    #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
        #pragma GCC diagnostic push
        #pragma GCC diagnostic ignored "-Wlong-long"
        #if defined(__clang__)
            #pragma GCC diagnostic ignored "-Wc++11-long-long"
        #endif
    #endif
    typedef   signed long long  drflac_int64;
    typedef unsigned long long  drflac_uint64;
    #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
        #pragma GCC diagnostic pop
    #endif
#endif
#if defined(__LP64__) || defined(_WIN64) || (defined(__x86_64__) && !defined(__ILP32__)) || defined(_M_X64) || defined(__ia64) || defined(_M_IA64) || defined(__aarch64__) || defined(_M_ARM64) || defined(__powerpc64__)
    typedef drflac_uint64       drflac_uintptr;
#else
    typedef drflac_uint32       drflac_uintptr;
#endif
typedef drflac_uint8            drflac_bool8;
typedef drflac_uint32           drflac_bool32;
#define DRFLAC_TRUE             1
#define DRFLAC_FALSE            0
/* End Sized Types */

/* Decorations */
#if !defined(DRFLAC_API)
    #if defined(DRFLAC_DLL)
        #if defined(_WIN32)
            #define DRFLAC_DLL_IMPORT  __declspec(dllimport)
            #define DRFLAC_DLL_EXPORT  __declspec(dllexport)
            #define DRFLAC_DLL_PRIVATE static
        #else
            #if defined(__GNUC__) && __GNUC__ >= 4
                #define DRFLAC_DLL_IMPORT  __attribute__((visibility("default")))
                #define DRFLAC_DLL_EXPORT  __attribute__((visibility("default")))
                #define DRFLAC_DLL_PRIVATE __attribute__((visibility("hidden")))
            #else
                #define DRFLAC_DLL_IMPORT
                #define DRFLAC_DLL_EXPORT
                #define DRFLAC_DLL_PRIVATE static
            #endif
        #endif

        #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION)
            #define DRFLAC_API  DRFLAC_DLL_EXPORT
        #else
            #define DRFLAC_API  DRFLAC_DLL_IMPORT
        #endif
        #define DRFLAC_PRIVATE DRFLAC_DLL_PRIVATE
    #else
        #define DRFLAC_API extern
        #define DRFLAC_PRIVATE static
    #endif
#endif
/* End Decorations */

#if defined(_MSC_VER) && _MSC_VER >= 1700   /* Visual Studio 2012 */
    #define DRFLAC_DEPRECATED       __declspec(deprecated)
#elif (defined(__GNUC__) && __GNUC__ >= 4)  /* GCC 4 */
    #define DRFLAC_DEPRECATED       __attribute__((deprecated))
#elif defined(__has_feature)                /* Clang */
    #if __has_feature(attribute_deprecated)
        #define DRFLAC_DEPRECATED   __attribute__((deprecated))
    #else
        #define DRFLAC_DEPRECATED
    #endif
#else
    #define DRFLAC_DEPRECATED
#endif

DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision);
DRFLAC_API const char* drflac_version_string(void);

/* Allocation Callbacks */
typedef struct
{
    void* pUserData;
    void* (* onMalloc)(size_t sz, void* pUserData);
    void* (* onRealloc)(void* p, size_t sz, void* pUserData);
    void  (* onFree)(void* p, void* pUserData);
} drflac_allocation_callbacks;
/* End Allocation Callbacks */

/*
As data is read from the client it is placed into an internal buffer for fast access. This controls the size of that buffer. Larger values mean more speed,
but also more memory. In my testing there are diminishing returns after about 4KB, but you can fiddle with this to suit your own needs.
Must be a multiple of 8.
*/
#ifndef DR_FLAC_BUFFER_SIZE
#define DR_FLAC_BUFFER_SIZE   4096
#endif


/* Architecture Detection */
#if defined(_WIN64) || defined(_LP64) || defined(__LP64__)
#define DRFLAC_64BIT
#endif

#if defined(__x86_64__) || defined(_M_X64)
    #define DRFLAC_X64
#elif defined(__i386) || defined(_M_IX86)
    #define DRFLAC_X86
#elif defined(__arm__) || defined(_M_ARM) || defined(__arm64) || defined(__arm64__) || defined(__aarch64__) || defined(_M_ARM64)
    #define DRFLAC_ARM
#endif
/* End Architecture Detection */


#ifdef DRFLAC_64BIT
typedef drflac_uint64 drflac_cache_t;
#else
typedef drflac_uint32 drflac_cache_t;
#endif

/* The various metadata block types. */
#define DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO       0
#define DRFLAC_METADATA_BLOCK_TYPE_PADDING          1
#define DRFLAC_METADATA_BLOCK_TYPE_APPLICATION      2
#define DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE        3
#define DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT   4
#define DRFLAC_METADATA_BLOCK_TYPE_CUESHEET         5
#define DRFLAC_METADATA_BLOCK_TYPE_PICTURE          6
#define DRFLAC_METADATA_BLOCK_TYPE_INVALID          127

/* The various picture types specified in the PICTURE block. */
#define DRFLAC_PICTURE_TYPE_OTHER                   0
#define DRFLAC_PICTURE_TYPE_FILE_ICON               1
#define DRFLAC_PICTURE_TYPE_OTHER_FILE_ICON         2
#define DRFLAC_PICTURE_TYPE_COVER_FRONT             3
#define DRFLAC_PICTURE_TYPE_COVER_BACK              4
#define DRFLAC_PICTURE_TYPE_LEAFLET_PAGE            5
#define DRFLAC_PICTURE_TYPE_MEDIA                   6
#define DRFLAC_PICTURE_TYPE_LEAD_ARTIST             7
#define DRFLAC_PICTURE_TYPE_ARTIST                  8
#define DRFLAC_PICTURE_TYPE_CONDUCTOR               9
#define DRFLAC_PICTURE_TYPE_BAND                    10
#define DRFLAC_PICTURE_TYPE_COMPOSER                11
#define DRFLAC_PICTURE_TYPE_LYRICIST                12
#define DRFLAC_PICTURE_TYPE_RECORDING_LOCATION      13
#define DRFLAC_PICTURE_TYPE_DURING_RECORDING        14
#define DRFLAC_PICTURE_TYPE_DURING_PERFORMANCE      15
#define DRFLAC_PICTURE_TYPE_SCREEN_CAPTURE          16
#define DRFLAC_PICTURE_TYPE_BRIGHT_COLORED_FISH     17
#define DRFLAC_PICTURE_TYPE_ILLUSTRATION            18
#define DRFLAC_PICTURE_TYPE_BAND_LOGOTYPE           19
#define DRFLAC_PICTURE_TYPE_PUBLISHER_LOGOTYPE      20

typedef enum
{
    drflac_container_native,
    drflac_container_ogg,
    drflac_container_unknown
} drflac_container;

typedef enum
{
    drflac_seek_origin_start,
    drflac_seek_origin_current
} drflac_seek_origin;

/* The order of members in this structure is important because we map this directly to the raw data within the SEEKTABLE metadata block. */
typedef struct
{
    drflac_uint64 firstPCMFrame;
    drflac_uint64 flacFrameOffset;   /* The offset from the first byte of the header of the first frame. */
    drflac_uint16 pcmFrameCount;
} drflac_seekpoint;

typedef struct
{
    drflac_uint16 minBlockSizeInPCMFrames;
    drflac_uint16 maxBlockSizeInPCMFrames;
    drflac_uint32 minFrameSizeInPCMFrames;
    drflac_uint32 maxFrameSizeInPCMFrames;
    drflac_uint32 sampleRate;
    drflac_uint8  channels;
    drflac_uint8  bitsPerSample;
    drflac_uint64 totalPCMFrameCount;
    drflac_uint8  md5[16];
} drflac_streaminfo;

typedef struct
{
    /*
    The metadata type. Use this to know how to interpret the data below.
    Will be set to one of the
    DRFLAC_METADATA_BLOCK_TYPE_* tokens.
    */
    drflac_uint32 type;

    /*
    A pointer to the raw data. This points to a temporary buffer so don't hold on to it. It's best to
    not modify the contents of this buffer. Use the structures below for more meaningful and structured
    information about the metadata. It's possible for this to be null.
    */
    const void* pRawData;

    /* The size in bytes of the block and the buffer pointed to by pRawData if it's non-NULL. */
    drflac_uint32 rawDataSize;

    union
    {
        drflac_streaminfo streaminfo;

        struct
        {
            int unused;
        } padding;

        struct
        {
            drflac_uint32 id;
            const void* pData;
            drflac_uint32 dataSize;
        } application;

        struct
        {
            drflac_uint32 seekpointCount;
            const drflac_seekpoint* pSeekpoints;
        } seektable;

        struct
        {
            drflac_uint32 vendorLength;
            const char* vendor;
            drflac_uint32 commentCount;
            const void* pComments;
        } vorbis_comment;

        struct
        {
            char catalog[128];
            drflac_uint64 leadInSampleCount;
            drflac_bool32 isCD;
            drflac_uint8 trackCount;
            const void* pTrackData;
        } cuesheet;

        struct
        {
            drflac_uint32 type;
            drflac_uint32 mimeLength;
            const char* mime;
            drflac_uint32 descriptionLength;
            const char* description;
            drflac_uint32 width;
            drflac_uint32 height;
            drflac_uint32 colorDepth;
            drflac_uint32 indexColorCount;
            drflac_uint32 pictureDataSize;
            const drflac_uint8* pPictureData;
        } picture;
    } data;
} drflac_metadata;


/*
Callback for when data needs to be read from the client.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

pBufferOut (out)
    The output buffer.

bytesToRead (in)
    The number of bytes to read.


Return Value
------------
The number of bytes actually read.


Remarks
-------
A return value of less than bytesToRead indicates the end of the stream.
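
As a hedged illustration of that contract, the sketch below reads from an in-memory buffer and returns a short count only once the data runs out. `example_mem_stream` and `example_mem_read` are hypothetical names that are not part of dr_flac; only the callback signature is taken from `drflac_read_proc`.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical user data for the sketch: a byte buffer and a read cursor.
typedef struct
{
    const unsigned char* pData;
    size_t dataSize;
    size_t readPos;
} example_mem_stream;

// Has the same signature as drflac_read_proc. Copies as many bytes as are
// available, so the result is less than bytesToRead only at the end of the data.
static size_t example_mem_read(void* pUserData, void* pBufferOut, size_t bytesToRead)
{
    example_mem_stream* pStream = (example_mem_stream*)pUserData;
    size_t bytesRemaining = pStream->dataSize - pStream->readPos;

    if (bytesToRead > bytesRemaining) {
        bytesToRead = bytesRemaining;
    }

    if (bytesToRead > 0) {
        memcpy(pBufferOut, pStream->pData + pStream->readPos, bytesToRead);
        pStream->readPos += bytesToRead;
    }

    return bytesToRead;
}
```

A callback of this shape would be passed as `onRead`, with a pointer to the `example_mem_stream` object as `pUserData`.
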
Do _not_ return from this callback until either the entire bytesToRead is filled or
you have reached the end of the stream.
*/
typedef size_t (* drflac_read_proc)(void* pUserData, void* pBufferOut, size_t bytesToRead);

/*
Callback for when data needs to be seeked.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

offset (in)
    The number of bytes to move, relative to the origin. Will never be negative.

origin (in)
    The origin of the seek - the current position or the start of the stream.


Return Value
------------
Whether or not the seek was successful.


Remarks
-------
The offset will never be negative. Whether or not it is relative to the beginning or current position is determined by the "origin" parameter which will be
either drflac_seek_origin_start or drflac_seek_origin_current.

When seeking to a PCM frame using drflac_seek_to_pcm_frame(), dr_flac may call this with an offset beyond the end of the FLAC stream. This needs to be detected
and handled by returning DRFLAC_FALSE.
*/
typedef drflac_bool32 (* drflac_seek_proc)(void* pUserData, int offset, drflac_seek_origin origin);

/*
Callback for when a metadata block is read.


Parameters
----------
pUserData (in)
    The user data that was passed to drflac_open() and family.

pMetadata (in)
    A pointer to a structure containing the data of the metadata block.


Remarks
-------
Use pMetadata->type to determine which metadata block is being handled and how to read the data. This
will be set to one of the DRFLAC_METADATA_BLOCK_TYPE_* tokens.
*/
typedef void (* drflac_meta_proc)(void* pUserData, drflac_metadata* pMetadata);


/* Structure for internal use. Only used for decoders opened with drflac_open_memory. */
typedef struct
{
    const drflac_uint8* data;
    size_t dataSize;
    size_t currentReadPos;
} drflac__memory_stream;

/* Structure for internal use. Used for bit streaming. */
typedef struct
{
    /* The function to call when more data needs to be read. */
    drflac_read_proc onRead;

    /* The function to call when the current read position needs to be moved. */
    drflac_seek_proc onSeek;

    /* The user data to pass around to onRead and onSeek. */
    void* pUserData;


    /*
    The number of unaligned bytes in the L2 cache. This will always be 0 until the end of the stream is hit. At the end of the
    stream there will be a number of bytes that don't cleanly fit in an L1 cache line, so we use this variable to know whether
    or not the bitstreamer needs to run on a slower path to read those last bytes. This will never be more than sizeof(drflac_cache_t).
    */
    size_t unalignedByteCount;

    /* The content of the unaligned bytes. */
    drflac_cache_t unalignedCache;

    /* The index of the next valid cache line in the "L2" cache. */
    drflac_uint32 nextL2Line;

    /* The number of bits that have been consumed by the cache. This is used to determine how many valid bits are remaining. */
    drflac_uint32 consumedBits;

    /*
    The cached data which was most recently read from the client. There are two levels of cache. Data flows as such:
    Client -> L2 -> L1. The L2 -> L1 movement is aligned and runs on a fast path in just a few instructions.
    */
    drflac_cache_t cacheL2[DR_FLAC_BUFFER_SIZE/sizeof(drflac_cache_t)];
    drflac_cache_t cache;

    /*
    CRC-16. This is updated whenever bits are read from the bit stream. Manually set this to 0 to reset the CRC. For FLAC, this
    is reset to 0 at the beginning of each frame.
    */
    drflac_uint16 crc16;
    drflac_cache_t crc16Cache;              /* A cache for optimizing CRC calculations. This is filled when the L1 cache is reloaded. */
    drflac_uint32 crc16CacheIgnoredBytes;   /* The number of bytes to ignore when updating the CRC-16 from the CRC-16 cache. */
} drflac_bs;

typedef struct
{
    /* The type of the subframe: SUBFRAME_CONSTANT, SUBFRAME_VERBATIM, SUBFRAME_FIXED or SUBFRAME_LPC. */
    drflac_uint8 subframeType;

    /* The number of wasted bits per sample as specified by the sub-frame header. */
    drflac_uint8 wastedBitsPerSample;

    /* The order to use for the prediction stage for SUBFRAME_FIXED and SUBFRAME_LPC. */
    drflac_uint8 lpcOrder;

    /* A pointer to the buffer containing the decoded samples in the subframe. This pointer is an offset from drflac::pExtraData. */
    drflac_int32* pSamplesS32;
} drflac_subframe;

typedef struct
{
    /*
    If the stream uses variable block sizes, this will be set to the index of the first PCM frame. If fixed block sizes are used, this will
    always be set to 0. This is 64-bit because the decoded PCM frame number will be 36 bits.
    */
    drflac_uint64 pcmFrameNumber;

    /*
    If the stream uses fixed block sizes, this will be set to the frame number. If variable block sizes are used, this will always be 0. This
    is 32-bit because in fixed block sizes, the maximum frame number will be 31 bits.
    */
    drflac_uint32 flacFrameNumber;

    /* The sample rate of this frame. */
    drflac_uint32 sampleRate;

    /* The number of PCM frames in each sub-frame within this frame. */
    drflac_uint16 blockSizeInPCMFrames;

    /*
    The channel assignment of this frame. This is not always set to the channel count. If interchannel decorrelation is being used this
    will be set to DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE, DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE or DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE.
    */
    drflac_uint8 channelAssignment;

    /* The number of bits per sample within this frame. */
    drflac_uint8 bitsPerSample;

    /* The frame's CRC. */
    drflac_uint8 crc8;
} drflac_frame_header;

typedef struct
{
    /* The header. */
    drflac_frame_header header;

    /*
    The number of PCM frames left to be read in this FLAC frame. This is initially set to the block size. As PCM frames are read,
    this will be decremented. When it reaches 0, the decoder will see this frame as fully consumed and load the next frame.
    */
    drflac_uint32 pcmFramesRemaining;

    /* The list of sub-frames within the frame. There is one sub-frame for each channel, and there's a maximum of 8 channels. */
    drflac_subframe subframes[8];
} drflac_frame;

typedef struct
{
    /* The function to call when a metadata block is read. */
    drflac_meta_proc onMeta;

    /* The user data posted to the metadata callback function. */
    void* pUserDataMD;

    /* Memory allocation callbacks. */
    drflac_allocation_callbacks allocationCallbacks;


    /* The sample rate. Will be set to something like 44100. */
    drflac_uint32 sampleRate;

    /*
    The number of channels. This will be set to 1 for monaural streams, 2 for stereo, etc. Maximum 8. This is set based on the
    value specified in the STREAMINFO block.
    */
    drflac_uint8 channels;

    /* The bits per sample. Will be set to something like 16, 24, etc. */
    drflac_uint8 bitsPerSample;

    /* The maximum block size, in samples. This number represents the number of samples in each channel (not combined). */
    drflac_uint16 maxBlockSizeInPCMFrames;

    /*
    The total number of PCM Frames making up the stream. Can be 0 in which case it's still a valid stream, but just means
    the total PCM frame count is unknown. Likely the case with streams like internet radio.
    */
    drflac_uint64 totalPCMFrameCount;


    /* The container type. This is set based on whether or not the decoder was opened from a native or Ogg stream. */
    drflac_container container;

    /* The number of seekpoints in the seektable. */
    drflac_uint32 seekpointCount;


    /* Information about the frame the decoder is currently sitting on. */
    drflac_frame currentFLACFrame;


    /* The index of the PCM frame the decoder is currently sitting on. This is only used for seeking. */
    drflac_uint64 currentPCMFrame;

    /* The position of the first FLAC frame in the stream. This is only ever used for seeking. */
    drflac_uint64 firstFLACFramePosInBytes;


    /* A hack to avoid a malloc() when opening a decoder with drflac_open_memory(). */
    drflac__memory_stream memoryStream;


    /* A pointer to the decoded sample data. This is an offset of pExtraData. */
    drflac_int32* pDecodedSamples;

    /* A pointer to the seek table. This is an offset of pExtraData, or NULL if there is no seek table. */
    drflac_seekpoint* pSeekpoints;

    /* Internal use only. Only used with Ogg containers. Points to a drflac_oggbs object. This is an offset of pExtraData. */
    void* _oggbs;

    /* Internal use only. Used for profiling and testing different seeking modes. */
    drflac_bool32 _noSeekTableSeek    : 1;
    drflac_bool32 _noBinarySearchSeek : 1;
    drflac_bool32 _noBruteForceSeek   : 1;

    /* The bit streamer. The raw FLAC data is fed through this object. */
    drflac_bs bs;

    /* Variable length extra data. We attach this to the end of the object so we can avoid unnecessary mallocs. */
    drflac_uint8 pExtraData[1];
} drflac;


/*
Opens a FLAC decoder.


Parameters
----------
onRead (in)
    The function to call when data needs to be read from the client.

onSeek (in)
    The function to call when the read position of the client data needs to move.

pUserData (in, optional)
    A pointer to application defined data that will be passed to onRead and onSeek.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
Returns a pointer to an object representing the decoder.


Remarks
-------
Close the decoder with `drflac_close()`.

`pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`.

This function will automatically detect whether or not you are attempting to open a native or Ogg encapsulated FLAC, both of which should work seamlessly
without any manual intervention. Ogg encapsulation also works with multiplexed streams which basically means it can play FLAC encoded audio tracks in videos.

This is the lowest level function for opening a FLAC stream. You can also use `drflac_open_file()` and `drflac_open_memory()` to open the stream from a file or
from a block of memory respectively.

The STREAMINFO block must be present for this to succeed.
Use `drflac_open_relaxed()` to open a FLAC stream where the header may not be present.

Use `drflac_open_with_metadata()` if you need access to metadata.


See Also
--------
drflac_open_file()
drflac_open_memory()
drflac_open_with_metadata()
drflac_close()
*/
DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Opens a FLAC stream with relaxed validation of the header block.


Parameters
----------
onRead (in)
    The function to call when data needs to be read from the client.

onSeek (in)
    The function to call when the read position of the client data needs to move.

container (in)
    Whether or not the FLAC stream is encapsulated using standard FLAC encapsulation or Ogg encapsulation.

pUserData (in, optional)
    A pointer to application defined data that will be passed to onRead and onSeek.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
A pointer to an object representing the decoder.


Remarks
-------
The same as drflac_open(), except attempts to open the stream even when a header block is not present.

Because the header is not necessarily available, the caller must explicitly define the container (Native or Ogg). Do not set this to `drflac_container_unknown`
as that is for internal use only.

Opening in relaxed mode will continue reading data from onRead until it finds a valid frame. If a frame is never found it will continue forever.
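
Since the search only ends when the read callback says so, one way to keep it bounded is to wrap the real callback in one that gives up after a fixed number of bytes. The sketch below is an assumption about how an application might do this, not part of dr_flac: `example_read_proc`, `example_budget_reader`, `example_budget_read` and `example_endless_read` are hypothetical names, and the wrapper reports 0 bytes once its budget is used up.

```c
#include <stddef.h>
#include <string.h>

// Same shape as drflac_read_proc, redeclared so the sketch stands alone.
typedef size_t (* example_read_proc)(void* pUserData, void* pBufferOut, size_t bytesToRead);

// Hypothetical wrapper state: the real callback plus a byte budget.
typedef struct
{
    example_read_proc onRead;   // The application's real read callback.
    void* pUserData;            // User data for the real callback.
    size_t bytesRemaining;      // How many more bytes may be handed out.
} example_budget_reader;

// Forwards reads to the wrapped callback until the budget is spent, then
// returns 0 so the caller sees the end of the stream.
static size_t example_budget_read(void* pUserData, void* pBufferOut, size_t bytesToRead)
{
    example_budget_reader* pReader = (example_budget_reader*)pUserData;
    size_t bytesRead;

    if (pReader->bytesRemaining == 0) {
        return 0;
    }

    if (bytesToRead > pReader->bytesRemaining) {
        bytesToRead = pReader->bytesRemaining;
    }

    bytesRead = pReader->onRead(pReader->pUserData, pBufferOut, bytesToRead);
    pReader->bytesRemaining -= bytesRead;

    return bytesRead;
}

// Stand-in source for the sketch: an endless stream of zero bytes.
static size_t example_endless_read(void* pUserData, void* pBufferOut, size_t bytesToRead)
{
    (void)pUserData;
    memset(pBufferOut, 0, bytesToRead);
    return bytesToRead;
}
```

The wrapper would be passed as `onRead` with the `example_budget_reader` object as `pUserData`. A byte budget is only one possible policy; a timeout or a cancel flag checked in the same place would work the same way.
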
To abort,
force your `onRead` callback to return 0, which dr_flac will use as an indicator that the end of the stream was found.

Use `drflac_open_with_metadata_relaxed()` if you need access to metadata.
*/
DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Opens a FLAC decoder and notifies the caller of the metadata chunks (album art, etc.).


Parameters
----------
onRead (in)
    The function to call when data needs to be read from the client.

onSeek (in)
    The function to call when the read position of the client data needs to move.

onMeta (in)
    The function to call for every metadata block.

pUserData (in, optional)
    A pointer to application defined data that will be passed to onRead, onSeek and onMeta.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
A pointer to an object representing the decoder.


Remarks
-------
Close the decoder with `drflac_close()`.

`pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`.

This is slower than `drflac_open()`, so avoid this one if you don't need metadata. Internally, this will allocate and free memory on the heap for every
metadata block except for STREAMINFO and PADDING blocks.

The caller is notified of the metadata via the `onMeta` callback. All metadata blocks will be handled before the function returns. This callback takes a
pointer to a `drflac_metadata` object which is a union containing the data of all relevant metadata blocks. Use the `type` member to discriminate between
the different metadata types.

The STREAMINFO block must be present for this to succeed.
Use `drflac_open_with_metadata_relaxed()` to open a FLAC stream where the header may not be present.

Note that this will behave inconsistently with `drflac_open()` if the stream is an Ogg encapsulated stream and a metadata block is corrupted. This is due to
the way the Ogg stream recovers from corrupted pages. When `drflac_open_with_metadata()` is being used, the open routine will try to read the contents of the
metadata block, whereas `drflac_open()` will simply seek past it (for the sake of efficiency). This inconsistency can result in different samples being
returned depending on whether or not the stream is being opened with metadata.


See Also
--------
    drflac_open_file_with_metadata()
    drflac_open_memory_with_metadata()
    drflac_open()
    drflac_close()
*/
DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
The same as drflac_open_with_metadata(), except attempts to open the stream even when a header block is not present.

See Also
--------
    drflac_open_with_metadata()
    drflac_open_relaxed()
*/
DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Closes the given FLAC decoder.


Parameters
----------
pFlac (in)
    The decoder to close.


Remarks
-------
This will destroy the decoder object.


See Also
--------
    drflac_open()
    drflac_open_with_metadata()
    drflac_open_file()
    drflac_open_file_w()
    drflac_open_file_with_metadata()
    drflac_open_file_with_metadata_w()
    drflac_open_memory()
    drflac_open_memory_with_metadata()
*/
DRFLAC_API void drflac_close(drflac* pFlac);


/*
Reads sample data from the given FLAC decoder, output as interleaved
signed 32-bit PCM.


Parameters
----------
pFlac (in)
    The decoder.

framesToRead (in)
    The number of PCM frames to read.

pBufferOut (out, optional)
    A pointer to the buffer that will receive the decoded samples.


Return Value
------------
Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end.


Remarks
-------
pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames skipped.
*/
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut);


/*
Reads sample data from the given FLAC decoder, output as interleaved signed 16-bit PCM.


Parameters
----------
pFlac (in)
    The decoder.

framesToRead (in)
    The number of PCM frames to read.

pBufferOut (out, optional)
    A pointer to the buffer that will receive the decoded samples.


Return Value
------------
Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end.


Remarks
-------
pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames skipped.

Note that this is lossy for streams where the bits per sample is larger than 16.
*/
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut);

/*
Reads sample data from the given FLAC decoder, output as interleaved 32-bit floating point PCM.


Parameters
----------
pFlac (in)
    The decoder.

framesToRead (in)
    The number of PCM frames to read.

pBufferOut (out, optional)
    A pointer to the buffer that will receive the decoded samples.


Return Value
------------
Returns the number of PCM frames actually read.
If the return value is less than `framesToRead` it has reached the end.


Remarks
-------
pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames skipped.

Note that this should be considered lossy due to the nature of floating point numbers not being able to exactly represent every possible number.
*/
DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut);

/*
Seeks to the PCM frame at the given index.


Parameters
----------
pFlac (in)
    The decoder.

pcmFrameIndex (in)
    The index of the PCM frame to seek to.


Return Value
------------
`DRFLAC_TRUE` if successful; `DRFLAC_FALSE` otherwise.
*/
DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex);



#ifndef DR_FLAC_NO_STDIO
/*
Opens a FLAC decoder from the file at the given path.


Parameters
----------
pFileName (in)
    The path of the file to open, either absolute or relative to the current directory.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
A pointer to an object representing the decoder.


Remarks
-------
Close the decoder with drflac_close().

This will hold a handle to the file until the decoder is closed with drflac_close().
Some platforms will restrict the number of files a process can have open
at any given time, so keep this in mind if you have many decoders open at the same time.


See Also
--------
    drflac_open_file_with_metadata()
    drflac_open()
    drflac_close()
*/
DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks);
DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Opens a FLAC decoder from the file at the given path and notifies the caller of the metadata chunks (album art, etc.)


Parameters
----------
pFileName (in)
    The path of the file to open, either absolute or relative to the current directory.

onMeta (in)
    The callback to fire for each metadata block.

pUserData (in)
    A pointer to the user data to pass to the metadata callback.

pAllocationCallbacks (in, optional)
    A pointer to application defined callbacks for managing memory allocations.


Remarks
-------
Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled.


See Also
--------
    drflac_open_with_metadata()
    drflac_open()
    drflac_close()
*/
DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);
#endif

/*
Opens a FLAC decoder from a pre-allocated block of memory


Parameters
----------
pData (in)
    A pointer to the raw encoded FLAC data.

dataSize (in)
    The size in bytes of
`pData`.

pAllocationCallbacks (in)
    A pointer to application defined callbacks for managing memory allocations.


Return Value
------------
A pointer to an object representing the decoder.


Remarks
-------
This does not create a copy of the data. It is up to the application to ensure the buffer remains valid for the lifetime of the decoder.


See Also
--------
    drflac_open()
    drflac_close()
*/
DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Opens a FLAC decoder from a pre-allocated block of memory and notifies the caller of the metadata chunks (album art, etc.)


Parameters
----------
pData (in)
    A pointer to the raw encoded FLAC data.

dataSize (in)
    The size in bytes of `pData`.

onMeta (in)
    The callback to fire for each metadata block.

pUserData (in)
    A pointer to the user data to pass to the metadata callback.

pAllocationCallbacks (in)
    A pointer to application defined callbacks for managing memory allocations.


Remarks
-------
Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled.


See Also
--------
    drflac_open_with_metadata()
    drflac_open()
    drflac_close()
*/
DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks);



/* High Level APIs */

/*
Opens a FLAC stream from the given callbacks and fully decodes it in a single operation. The return value is a
pointer to the sample data as interleaved signed 32-bit PCM. The returned data must be freed with drflac_free().

You can pass in custom memory allocation callbacks via the pAllocationCallbacks parameter.
This can be NULL in which
case it will use DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE.

Sometimes a FLAC file won't keep track of the total sample count. In this situation the function will continuously
read samples into a dynamically sized buffer on the heap until no samples are left.

Do not call this function on a broadcast type of stream (like internet radio streams and whatnot).
*/
DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */
DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

#ifndef DR_FLAC_NO_STDIO
/* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a file. */
DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples.
*/
DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);
#endif

/* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a block of memory. */
DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */
DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */
DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks);

/*
Frees memory that was allocated internally by dr_flac.

Set pAllocationCallbacks to the same object that was passed to drflac_open_*_and_read_pcm_frames_*().
If you originally passed in NULL, pass in NULL for this.
*/
DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks);


/* Structure representing an iterator for vorbis comments in a VORBIS_COMMENT metadata block. */
typedef struct
{
    drflac_uint32 countRemaining;
    const char* pRunningData;
} drflac_vorbis_comment_iterator;

/*
Initializes a vorbis comment iterator. This can be used for iterating over the vorbis comments in a VORBIS_COMMENT
metadata block.
*/
DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments);

/*
Goes to the next vorbis comment in the given iterator. If null is returned it means there are no more comments. The
returned string is NOT null terminated.
*/
DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut);


/* Structure representing an iterator for cuesheet tracks in a CUESHEET metadata block. */
typedef struct
{
    drflac_uint32 countRemaining;
    const char* pRunningData;
} drflac_cuesheet_track_iterator;

/* The order of members here is important because we map this directly to the raw data within the CUESHEET metadata block. */
typedef struct
{
    drflac_uint64 offset;
    drflac_uint8 index;
    drflac_uint8 reserved[3];
} drflac_cuesheet_track_index;

typedef struct
{
    drflac_uint64 offset;
    drflac_uint8 trackNumber;
    char ISRC[12];
    drflac_bool8 isAudio;
    drflac_bool8 preEmphasis;
    drflac_uint8 indexCount;
    const drflac_cuesheet_track_index* pIndexPoints;
} drflac_cuesheet_track;

/*
Initializes a cuesheet track iterator.
This can be used for iterating over the cuesheet tracks in a CUESHEET metadata
block.
*/
DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData);

/* Goes to the next cuesheet track in the given iterator. If DRFLAC_FALSE is returned it means there are no more tracks. */
DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack);


#ifdef __cplusplus
}
#endif
#endif  /* dr_flac_h */


/************************************************************************************************************************************************************
 ************************************************************************************************************************************************************

 IMPLEMENTATION

 ************************************************************************************************************************************************************
 ************************************************************************************************************************************************************/
#if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION)
#ifndef dr_flac_c
#define dr_flac_c

/* Disable some annoying warnings.
*/
#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
    #pragma GCC diagnostic push
    #if __GNUC__ >= 7
    #pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
    #endif
#endif

#ifdef __linux__
    #ifndef _BSD_SOURCE
        #define _BSD_SOURCE
    #endif
    #ifndef _DEFAULT_SOURCE
        #define _DEFAULT_SOURCE
    #endif
    #ifndef __USE_BSD
        #define __USE_BSD
    #endif
    #include <endian.h>
#endif

#include <stdlib.h>
#include <string.h>

/* Inline */
#ifdef _MSC_VER
    #define DRFLAC_INLINE __forceinline
#elif defined(__GNUC__)
    /*
    I've had a bug report where GCC is emitting warnings about functions possibly not being inlineable. This warning happens when
    the __attribute__((always_inline)) attribute is defined without an "inline" statement. I think therefore there must be some
    case where "__inline__" is not always defined, thus the compiler emitting these warnings. When using -std=c89 or -ansi on the
    command line, we cannot use the "inline" keyword and instead need to use "__inline__".
    In an attempt to work around this issue
    I am using "__inline__" only when we're compiling in strict ANSI mode.
    */
    #if defined(__STRICT_ANSI__)
        #define DRFLAC_GNUC_INLINE_HINT __inline__
    #else
        #define DRFLAC_GNUC_INLINE_HINT inline
    #endif

    #if (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 2)) || defined(__clang__)
        #define DRFLAC_INLINE DRFLAC_GNUC_INLINE_HINT __attribute__((always_inline))
    #else
        #define DRFLAC_INLINE DRFLAC_GNUC_INLINE_HINT
    #endif
#elif defined(__WATCOMC__)
    #define DRFLAC_INLINE __inline
#else
    #define DRFLAC_INLINE
#endif
/* End Inline */

/*
Intrinsics Support

There's a bug in GCC 4.2.x which results in an incorrect compilation error when using _mm_slli_epi32() where it complains with

    "error: shift must be an immediate"

Unfortunately dr_flac depends on this for a few things so we're just going to disable SSE on GCC 4.2 and below.
*/
#if !defined(DR_FLAC_NO_SIMD)
    #if defined(DRFLAC_X64) || defined(DRFLAC_X86)
        #if defined(_MSC_VER) && !defined(__clang__)
            /* MSVC. */
            #if _MSC_VER >= 1400 && !defined(DRFLAC_NO_SSE2)    /* 2005 */
                #define DRFLAC_SUPPORT_SSE2
            #endif
            #if _MSC_VER >= 1600 && !defined(DRFLAC_NO_SSE41)   /* 2010 */
                #define DRFLAC_SUPPORT_SSE41
            #endif
        #elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
            /* Assume GNUC-style. */
            #if defined(__SSE2__) && !defined(DRFLAC_NO_SSE2)
                #define DRFLAC_SUPPORT_SSE2
            #endif
            #if defined(__SSE4_1__) && !defined(DRFLAC_NO_SSE41)
                #define DRFLAC_SUPPORT_SSE41
            #endif
        #endif

        /* If at this point we still haven't determined compiler support for the intrinsics just fall back to __has_include.
        */
        #if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include)
            #if !defined(DRFLAC_SUPPORT_SSE2) && !defined(DRFLAC_NO_SSE2) && __has_include(<emmintrin.h>)
                #define DRFLAC_SUPPORT_SSE2
            #endif
            #if !defined(DRFLAC_SUPPORT_SSE41) && !defined(DRFLAC_NO_SSE41) && __has_include(<smmintrin.h>)
                #define DRFLAC_SUPPORT_SSE41
            #endif
        #endif

        #if defined(DRFLAC_SUPPORT_SSE41)
            #include <smmintrin.h>
        #elif defined(DRFLAC_SUPPORT_SSE2)
            #include <emmintrin.h>
        #endif
    #endif

    #if defined(DRFLAC_ARM)
        #if !defined(DRFLAC_NO_NEON) && (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64))
            #define DRFLAC_SUPPORT_NEON
            #include <arm_neon.h>
        #endif
    #endif
#endif

/* Compile-time CPU feature support. */
#if !defined(DR_FLAC_NO_SIMD) && (defined(DRFLAC_X86) || defined(DRFLAC_X64))
    #if defined(_MSC_VER) && !defined(__clang__)
        #if _MSC_VER >= 1400
            #include <intrin.h>
            static void drflac__cpuid(int info[4], int fid)
            {
                __cpuid(info, fid);
            }
        #else
            #define DRFLAC_NO_CPUID
        #endif
    #else
        #if defined(__GNUC__) || defined(__clang__)
            static void drflac__cpuid(int info[4], int fid)
            {
                /*
                It looks like the -fPIC option uses the ebx register which GCC complains about. We can work around this by just using a different register, the
                specific register of which I'm letting the compiler decide on. The "k" prefix is used to specify a 32-bit register.
                The {...} syntax is for
                supporting different assembly dialects.

                What's basically happening is that we're saving and restoring the ebx register manually.
                */
                #if defined(DRFLAC_X86) && defined(__PIC__)
                    __asm__ __volatile__ (
                        "xchg{l} {%%}ebx, %k1;"
                        "cpuid;"
                        "xchg{l} {%%}ebx, %k1;"
                        : "=a"(info[0]), "=&r"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0)
                    );
                #else
                    __asm__ __volatile__ (
                        "cpuid" : "=a"(info[0]), "=b"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0)
                    );
                #endif
            }
        #else
            #define DRFLAC_NO_CPUID
        #endif
    #endif
#else
    #define DRFLAC_NO_CPUID
#endif

static DRFLAC_INLINE drflac_bool32 drflac_has_sse2(void)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE2)
        #if defined(DRFLAC_X64)
            return DRFLAC_TRUE;    /* 64-bit targets always support SSE2. */
        #elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE2__)
            return DRFLAC_TRUE;    /* If the compiler is allowed to freely generate SSE2 code we can assume support. */
        #else
            #if defined(DRFLAC_NO_CPUID)
                return DRFLAC_FALSE;
            #else
                int info[4];
                drflac__cpuid(info, 1);
                return (info[3] & (1 << 26)) != 0;
            #endif
        #endif
    #else
        return DRFLAC_FALSE;       /* SSE2 is only supported on x86 and x64 architectures. */
    #endif
#else
    return DRFLAC_FALSE;           /* No compiler support. */
#endif
}

static DRFLAC_INLINE drflac_bool32 drflac_has_sse41(void)
{
#if defined(DRFLAC_SUPPORT_SSE41)
    #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE41)
        #if defined(__SSE4_1__) || defined(__AVX__)
            return DRFLAC_TRUE;    /* If the compiler is allowed to freely generate SSE41 code we can assume support.
            */
        #else
            #if defined(DRFLAC_NO_CPUID)
                return DRFLAC_FALSE;
            #else
                int info[4];
                drflac__cpuid(info, 1);
                return (info[2] & (1 << 19)) != 0;
            #endif
        #endif
    #else
        return DRFLAC_FALSE;       /* SSE41 is only supported on x86 and x64 architectures. */
    #endif
#else
    return DRFLAC_FALSE;           /* No compiler support. */
#endif
}


#if defined(_MSC_VER) && _MSC_VER >= 1500 && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) && !defined(__clang__)
    #define DRFLAC_HAS_LZCNT_INTRINSIC
#elif (defined(__GNUC__) && ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)))
    #define DRFLAC_HAS_LZCNT_INTRINSIC
#elif defined(__clang__)
    #if defined(__has_builtin)
        #if __has_builtin(__builtin_clzll) || __has_builtin(__builtin_clzl)
            #define DRFLAC_HAS_LZCNT_INTRINSIC
        #endif
    #endif
#endif

#if defined(_MSC_VER) && _MSC_VER >= 1400 && !defined(__clang__)
    #define DRFLAC_HAS_BYTESWAP16_INTRINSIC
    #define DRFLAC_HAS_BYTESWAP32_INTRINSIC
    #define DRFLAC_HAS_BYTESWAP64_INTRINSIC
#elif defined(__clang__)
    #if defined(__has_builtin)
        #if __has_builtin(__builtin_bswap16)
            #define DRFLAC_HAS_BYTESWAP16_INTRINSIC
        #endif
        #if __has_builtin(__builtin_bswap32)
            #define DRFLAC_HAS_BYTESWAP32_INTRINSIC
        #endif
        #if __has_builtin(__builtin_bswap64)
            #define DRFLAC_HAS_BYTESWAP64_INTRINSIC
        #endif
    #endif
#elif defined(__GNUC__)
    #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
        #define DRFLAC_HAS_BYTESWAP32_INTRINSIC
        #define DRFLAC_HAS_BYTESWAP64_INTRINSIC
    #endif
    #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
        #define DRFLAC_HAS_BYTESWAP16_INTRINSIC
    #endif
#elif defined(__WATCOMC__) && defined(__386__)
    #define DRFLAC_HAS_BYTESWAP16_INTRINSIC
    #define DRFLAC_HAS_BYTESWAP32_INTRINSIC
    #define DRFLAC_HAS_BYTESWAP64_INTRINSIC
    extern __inline drflac_uint16 _watcom_bswap16(drflac_uint16);
    extern __inline drflac_uint32 _watcom_bswap32(drflac_uint32);
    extern __inline drflac_uint64 _watcom_bswap64(drflac_uint64);
#pragma aux _watcom_bswap16 = \
    "xchg al, ah" \
    parm  [ax]    \
    value [ax]    \
    modify nomemory;
#pragma aux _watcom_bswap32 = \
    "bswap eax" \
    parm  [eax] \
    value [eax] \
    modify nomemory;
#pragma aux _watcom_bswap64 = \
    "bswap eax"     \
    "bswap edx"     \
    "xchg eax,edx"  \
    parm [eax edx]  \
    value [eax edx] \
    modify nomemory;
#endif


/* Standard library stuff. */
#ifndef DRFLAC_ASSERT
#include <assert.h>
#define DRFLAC_ASSERT(expression) assert(expression)
#endif
#ifndef DRFLAC_MALLOC
#define DRFLAC_MALLOC(sz) malloc((sz))
#endif
#ifndef DRFLAC_REALLOC
#define DRFLAC_REALLOC(p, sz) realloc((p), (sz))
#endif
#ifndef DRFLAC_FREE
#define DRFLAC_FREE(p) free((p))
#endif
#ifndef DRFLAC_COPY_MEMORY
#define DRFLAC_COPY_MEMORY(dst, src, sz) memcpy((dst), (src), (sz))
#endif
#ifndef DRFLAC_ZERO_MEMORY
#define DRFLAC_ZERO_MEMORY(p, sz) memset((p), 0, (sz))
#endif
#ifndef DRFLAC_ZERO_OBJECT
#define DRFLAC_ZERO_OBJECT(p) DRFLAC_ZERO_MEMORY((p), sizeof(*(p)))
#endif

#define DRFLAC_MAX_SIMD_VECTOR_SIZE 64  /* 64 for AVX-512 in the future. */

/* Result Codes */
typedef drflac_int32 drflac_result;
#define DRFLAC_SUCCESS 0
#define DRFLAC_ERROR -1   /* A generic error. */
#define DRFLAC_INVALID_ARGS -2
#define DRFLAC_INVALID_OPERATION -3
#define DRFLAC_OUT_OF_MEMORY -4
#define DRFLAC_OUT_OF_RANGE -5
#define DRFLAC_ACCESS_DENIED -6
#define DRFLAC_DOES_NOT_EXIST -7
#define DRFLAC_ALREADY_EXISTS -8
#define DRFLAC_TOO_MANY_OPEN_FILES -9
#define DRFLAC_INVALID_FILE -10
#define DRFLAC_TOO_BIG -11
#define DRFLAC_PATH_TOO_LONG -12
#define DRFLAC_NAME_TOO_LONG -13
#define DRFLAC_NOT_DIRECTORY -14
#define DRFLAC_IS_DIRECTORY -15
#define DRFLAC_DIRECTORY_NOT_EMPTY -16
#define DRFLAC_END_OF_FILE -17
#define DRFLAC_NO_SPACE -18
#define DRFLAC_BUSY -19
#define DRFLAC_IO_ERROR -20
#define DRFLAC_INTERRUPT -21
#define DRFLAC_UNAVAILABLE -22
#define DRFLAC_ALREADY_IN_USE -23
#define DRFLAC_BAD_ADDRESS -24
#define DRFLAC_BAD_SEEK -25
#define DRFLAC_BAD_PIPE -26
#define DRFLAC_DEADLOCK -27
#define DRFLAC_TOO_MANY_LINKS -28
#define DRFLAC_NOT_IMPLEMENTED -29
#define DRFLAC_NO_MESSAGE -30
#define DRFLAC_BAD_MESSAGE -31
#define DRFLAC_NO_DATA_AVAILABLE -32
#define DRFLAC_INVALID_DATA -33
#define DRFLAC_TIMEOUT -34
#define DRFLAC_NO_NETWORK -35
#define DRFLAC_NOT_UNIQUE -36
#define DRFLAC_NOT_SOCKET -37
#define DRFLAC_NO_ADDRESS -38
#define DRFLAC_BAD_PROTOCOL -39
#define DRFLAC_PROTOCOL_UNAVAILABLE -40
#define DRFLAC_PROTOCOL_NOT_SUPPORTED -41
#define DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED -42
#define DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED -43
#define DRFLAC_SOCKET_NOT_SUPPORTED -44
#define DRFLAC_CONNECTION_RESET -45
#define DRFLAC_ALREADY_CONNECTED -46
#define DRFLAC_NOT_CONNECTED -47
#define DRFLAC_CONNECTION_REFUSED -48
#define DRFLAC_NO_HOST -49
#define DRFLAC_IN_PROGRESS -50
#define DRFLAC_CANCELLED -51
#define DRFLAC_MEMORY_ALREADY_MAPPED -52
#define DRFLAC_AT_END -53

#define DRFLAC_CRC_MISMATCH -100
/* End Result Codes */


#define DRFLAC_SUBFRAME_CONSTANT 0
#define DRFLAC_SUBFRAME_VERBATIM 1
#define DRFLAC_SUBFRAME_FIXED 8
#define DRFLAC_SUBFRAME_LPC 32
#define DRFLAC_SUBFRAME_RESERVED 255

#define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE 0
#define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2 1

#define DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT 0
#define DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE 8
#define DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE 9
#define DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE 10

#define DRFLAC_SEEKPOINT_SIZE_IN_BYTES 18
#define DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES 36
#define DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES 12

#define drflac_align(x, a) ((((x) + (a) - 1) / (a)) * (a))


DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision)
{
    if (pMajor) {
        *pMajor = DRFLAC_VERSION_MAJOR;
    }

    if (pMinor) {
        *pMinor = DRFLAC_VERSION_MINOR;
    }

    if (pRevision) {
        *pRevision = DRFLAC_VERSION_REVISION;
    }
}

DRFLAC_API const char* drflac_version_string(void)
{
    return DRFLAC_VERSION_STRING;
}


/* CPU caps. */
#if defined(__has_feature)
    #if __has_feature(thread_sanitizer)
        #define DRFLAC_NO_THREAD_SANITIZE __attribute__((no_sanitize("thread")))
    #else
        #define DRFLAC_NO_THREAD_SANITIZE
    #endif
#else
    #define DRFLAC_NO_THREAD_SANITIZE
#endif

#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)
static drflac_bool32 drflac__gIsLZCNTSupported = DRFLAC_FALSE;
#endif

#ifndef DRFLAC_NO_CPUID
static drflac_bool32 drflac__gIsSSE2Supported = DRFLAC_FALSE;
static drflac_bool32 drflac__gIsSSE41Supported = DRFLAC_FALSE;

/*
I've had a bug report that Clang's ThreadSanitizer presents a warning in this function. Having reviewed this, this does
actually make sense.
However, since CPU caps should never differ for a running process, I don't think the trade off of
complicating internal APIs by passing around CPU caps versus just disabling the warnings is worthwhile. I'm therefore
just going to disable these warnings. This is disabled via the DRFLAC_NO_THREAD_SANITIZE attribute.
*/
DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void)
{
    static drflac_bool32 isCPUCapsInitialized = DRFLAC_FALSE;

    if (!isCPUCapsInitialized) {
        /* LZCNT */
#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)
        int info[4] = {0};
        drflac__cpuid(info, 0x80000001);
        drflac__gIsLZCNTSupported = (info[2] & (1 << 5)) != 0;
#endif

        /* SSE2 */
        drflac__gIsSSE2Supported = drflac_has_sse2();

        /* SSE4.1 */
        drflac__gIsSSE41Supported = drflac_has_sse41();

        /* Initialized. */
        isCPUCapsInitialized = DRFLAC_TRUE;
    }
}
#else
static drflac_bool32 drflac__gIsNEONSupported = DRFLAC_FALSE;

static DRFLAC_INLINE drflac_bool32 drflac__has_neon(void)
{
#if defined(DRFLAC_SUPPORT_NEON)
    #if defined(DRFLAC_ARM) && !defined(DRFLAC_NO_NEON)
        #if (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64))
            return DRFLAC_TRUE;    /* If the compiler is allowed to freely generate NEON code we can assume support. */
        #else
            /* TODO: Runtime check. */
            return DRFLAC_FALSE;
        #endif
    #else
        return DRFLAC_FALSE;       /* NEON is only supported on ARM architectures. */
    #endif
#else
    return DRFLAC_FALSE;           /* No compiler support.
    */
#endif
}

DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void)
{
    drflac__gIsNEONSupported = drflac__has_neon();

#if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5)
    drflac__gIsLZCNTSupported = DRFLAC_TRUE;
#endif
}
#endif


/* Endian Management */
static DRFLAC_INLINE drflac_bool32 drflac__is_little_endian(void)
{
#if defined(DRFLAC_X86) || defined(DRFLAC_X64)
    return DRFLAC_TRUE;
#elif defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && __BYTE_ORDER == __LITTLE_ENDIAN
    return DRFLAC_TRUE;
#else
    int n = 1;
    return (*(char*)&n) == 1;
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac__swap_endian_uint16(drflac_uint16 n)
{
#ifdef DRFLAC_HAS_BYTESWAP16_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_ushort(n);
    #elif defined(__GNUC__) || defined(__clang__)
        return __builtin_bswap16(n);
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap16(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    return ((n & 0xFF00) >> 8) |
           ((n & 0x00FF) << 8);
#endif
}

static DRFLAC_INLINE drflac_uint32 drflac__swap_endian_uint32(drflac_uint32 n)
{
#ifdef DRFLAC_HAS_BYTESWAP32_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_ulong(n);
    #elif defined(__GNUC__) || defined(__clang__)
        #if defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 6) && !defined(__ARM_ARCH_6M__) && !defined(DRFLAC_64BIT)   /* <-- 64-bit inline assembly has not been tested, so disabling for now. */
            /* Inline assembly optimized implementation for ARM. In my testing, GCC does not generate optimized code with __builtin_bswap32(). */
            drflac_uint32 r;
            __asm__ __volatile__ (
            #if defined(DRFLAC_64BIT)
                "rev %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(n)   /* <-- This is untested.
                If someone in the community could test this, that would be appreciated! */
            #else
                "rev %[out], %[in]" : [out]"=r"(r) : [in]"r"(n)
            #endif
            );
            return r;
        #else
            return __builtin_bswap32(n);
        #endif
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap32(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    return ((n & 0xFF000000) >> 24) |
           ((n & 0x00FF0000) >>  8) |
           ((n & 0x0000FF00) <<  8) |
           ((n & 0x000000FF) << 24);
#endif
}

static DRFLAC_INLINE drflac_uint64 drflac__swap_endian_uint64(drflac_uint64 n)
{
#ifdef DRFLAC_HAS_BYTESWAP64_INTRINSIC
    #if defined(_MSC_VER) && !defined(__clang__)
        return _byteswap_uint64(n);
    #elif defined(__GNUC__) || defined(__clang__)
        return __builtin_bswap64(n);
    #elif defined(__WATCOMC__) && defined(__386__)
        return _watcom_bswap64(n);
    #else
        #error "This compiler does not support the byte swap intrinsic."
    #endif
#else
    /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler.
    */
    return ((n & ((drflac_uint64)0xFF000000 << 32)) >> 56) |
           ((n & ((drflac_uint64)0x00FF0000 << 32)) >> 40) |
           ((n & ((drflac_uint64)0x0000FF00 << 32)) >> 24) |
           ((n & ((drflac_uint64)0x000000FF << 32)) >>  8) |
           ((n & ((drflac_uint64)0xFF000000      )) <<  8) |
           ((n & ((drflac_uint64)0x00FF0000      )) << 24) |
           ((n & ((drflac_uint64)0x0000FF00      )) << 40) |
           ((n & ((drflac_uint64)0x000000FF      )) << 56);
#endif
}


static DRFLAC_INLINE drflac_uint16 drflac__be2host_16(drflac_uint16 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint16(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__be2host_32(drflac_uint32 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint32(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__be2host_32_ptr_unaligned(const void* pData)
{
    /* Each byte is widened to 32 bits before shifting so that a top byte >= 0x80 is never shifted into the sign bit of an int. */
    const drflac_uint8* pNum = (const drflac_uint8*)pData;
    return ((drflac_uint32)pNum[0] << 24) | ((drflac_uint32)pNum[1] << 16) | ((drflac_uint32)pNum[2] << 8) | (drflac_uint32)pNum[3];
}

static DRFLAC_INLINE drflac_uint64 drflac__be2host_64(drflac_uint64 n)
{
    if (drflac__is_little_endian()) {
        return drflac__swap_endian_uint64(n);
    }

    return n;
}


static DRFLAC_INLINE drflac_uint32 drflac__le2host_32(drflac_uint32 n)
{
    if (!drflac__is_little_endian()) {
        return drflac__swap_endian_uint32(n);
    }

    return n;
}

static DRFLAC_INLINE drflac_uint32 drflac__le2host_32_ptr_unaligned(const void* pData)
{
    /* Each byte is widened to 32 bits before shifting so that a top byte >= 0x80 is never shifted into the sign bit of an int. */
    const drflac_uint8* pNum = (const drflac_uint8*)pData;
    return (drflac_uint32)pNum[0] | ((drflac_uint32)pNum[1] << 8) | ((drflac_uint32)pNum[2] << 16) | ((drflac_uint32)pNum[3] << 24);
}


static DRFLAC_INLINE drflac_uint32 drflac__unsynchsafe_32(drflac_uint32 n)
{
    drflac_uint32 result = 0;
    result |= (n & 0x7F000000) >> 3;
    result |= (n & 0x007F0000) >> 2;
    result |= (n & 0x00007F00) >> 1;
    result |= (n & 0x0000007F) >> 0;

    return result;
}



/* The CRC code below is based on this document:
http://zlib.net/crc_v3.txt */
static drflac_uint8 drflac__crc8_table[] = {
    0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15, 0x38, 0x3F, 0x36, 0x31, 0x24, 0x23, 0x2A, 0x2D,
    0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65, 0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D,
    0xE0, 0xE7, 0xEE, 0xE9, 0xFC, 0xFB, 0xF2, 0xF5, 0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD,
    0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85, 0xA8, 0xAF, 0xA6, 0xA1, 0xB4, 0xB3, 0xBA, 0xBD,
    0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2, 0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA,
    0xB7, 0xB0, 0xB9, 0xBE, 0xAB, 0xAC, 0xA5, 0xA2, 0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A,
    0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32, 0x1F, 0x18, 0x11, 0x16, 0x03, 0x04, 0x0D, 0x0A,
    0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42, 0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A,
    0x89, 0x8E, 0x87, 0x80, 0x95, 0x92, 0x9B, 0x9C, 0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4,
    0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC, 0xC1, 0xC6, 0xCF, 0xC8, 0xDD, 0xDA, 0xD3, 0xD4,
    0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C, 0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44,
    0x19, 0x1E, 0x17, 0x10, 0x05, 0x02, 0x0B, 0x0C, 0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34,
    0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B, 0x76, 0x71, 0x78, 0x7F, 0x6A, 0x6D, 0x64, 0x63,
    0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B, 0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13,
    0xAE, 0xA9, 0xA0, 0xA7, 0xB2, 0xB5, 0xBC, 0xBB, 0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83,
    0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB, 0xE6, 0xE1, 0xE8, 0xEF, 0xFA, 0xFD, 0xF4, 0xF3
};

static drflac_uint16 drflac__crc16_table[] = {
    0x0000, 0x8005, 0x800F, 0x000A, 0x801B, 0x001E, 0x0014, 0x8011,
    0x8033, 0x0036, 0x003C, 0x8039, 0x0028, 0x802D, 0x8027, 0x0022,
    0x8063, 0x0066, 0x006C, 0x8069, 0x0078, 0x807D, 0x8077, 0x0072,
    0x0050, 0x8055, 0x805F, 0x005A, 0x804B, 0x004E, 0x0044, 0x8041,
    0x80C3,
    0x00C6, 0x00CC, 0x80C9, 0x00D8, 0x80DD, 0x80D7, 0x00D2,
    0x00F0, 0x80F5, 0x80FF, 0x00FA, 0x80EB, 0x00EE, 0x00E4, 0x80E1,
    0x00A0, 0x80A5, 0x80AF, 0x00AA, 0x80BB, 0x00BE, 0x00B4, 0x80B1,
    0x8093, 0x0096, 0x009C, 0x8099, 0x0088, 0x808D, 0x8087, 0x0082,
    0x8183, 0x0186, 0x018C, 0x8189, 0x0198, 0x819D, 0x8197, 0x0192,
    0x01B0, 0x81B5, 0x81BF, 0x01BA, 0x81AB, 0x01AE, 0x01A4, 0x81A1,
    0x01E0, 0x81E5, 0x81EF, 0x01EA, 0x81FB, 0x01FE, 0x01F4, 0x81F1,
    0x81D3, 0x01D6, 0x01DC, 0x81D9, 0x01C8, 0x81CD, 0x81C7, 0x01C2,
    0x0140, 0x8145, 0x814F, 0x014A, 0x815B, 0x015E, 0x0154, 0x8151,
    0x8173, 0x0176, 0x017C, 0x8179, 0x0168, 0x816D, 0x8167, 0x0162,
    0x8123, 0x0126, 0x012C, 0x8129, 0x0138, 0x813D, 0x8137, 0x0132,
    0x0110, 0x8115, 0x811F, 0x011A, 0x810B, 0x010E, 0x0104, 0x8101,
    0x8303, 0x0306, 0x030C, 0x8309, 0x0318, 0x831D, 0x8317, 0x0312,
    0x0330, 0x8335, 0x833F, 0x033A, 0x832B, 0x032E, 0x0324, 0x8321,
    0x0360, 0x8365, 0x836F, 0x036A, 0x837B, 0x037E, 0x0374, 0x8371,
    0x8353, 0x0356, 0x035C, 0x8359, 0x0348, 0x834D, 0x8347, 0x0342,
    0x03C0, 0x83C5, 0x83CF, 0x03CA, 0x83DB, 0x03DE, 0x03D4, 0x83D1,
    0x83F3, 0x03F6, 0x03FC, 0x83F9, 0x03E8, 0x83ED, 0x83E7, 0x03E2,
    0x83A3, 0x03A6, 0x03AC, 0x83A9, 0x03B8, 0x83BD, 0x83B7, 0x03B2,
    0x0390, 0x8395, 0x839F, 0x039A, 0x838B, 0x038E, 0x0384, 0x8381,
    0x0280, 0x8285, 0x828F, 0x028A, 0x829B, 0x029E, 0x0294, 0x8291,
    0x82B3, 0x02B6, 0x02BC, 0x82B9, 0x02A8, 0x82AD, 0x82A7, 0x02A2,
    0x82E3, 0x02E6, 0x02EC, 0x82E9, 0x02F8, 0x82FD, 0x82F7, 0x02F2,
    0x02D0, 0x82D5, 0x82DF, 0x02DA, 0x82CB, 0x02CE, 0x02C4, 0x82C1,
    0x8243, 0x0246, 0x024C, 0x8249, 0x0258, 0x825D, 0x8257, 0x0252,
    0x0270, 0x8275, 0x827F, 0x027A, 0x826B, 0x026E, 0x0264, 0x8261,
    0x0220, 0x8225, 0x822F, 0x022A, 0x823B, 0x023E, 0x0234, 0x8231,
    0x8213, 0x0216, 0x021C, 0x8219, 0x0208, 0x820D, 0x8207, 0x0202
};

static DRFLAC_INLINE drflac_uint8 drflac_crc8_byte(drflac_uint8 crc, drflac_uint8 data)
{
    return
        drflac__crc8_table[crc ^ data];
}

static DRFLAC_INLINE drflac_uint8 drflac_crc8(drflac_uint8 crc, drflac_uint32 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
#if 0
    /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc8(crc, 0, 8);") */
    drflac_uint8 p = 0x07;
    for (int i = count-1; i >= 0; --i) {
        drflac_uint8 bit = (data & (1 << i)) >> i;
        if (crc & 0x80) {
            crc = ((crc << 1) | bit) ^ p;
        } else {
            crc = ((crc << 1) | bit);
        }
    }
    return crc;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 32);

    wholeBytes = count >> 3;
    leftoverBits = count - (wholeBytes*8);
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    /* The cases below intentionally fall through so that each whole byte is processed in turn. */
    switch (wholeBytes) {
        case 4: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (drflac_uint8)((crc << leftoverBits) ^ drflac__crc8_table[(crc >> (8 - leftoverBits)) ^ (data & leftoverDataMask)]);
    }
    return crc;
#endif
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_byte(drflac_uint16 crc, drflac_uint8 data)
{
    return (crc << 8) ^ drflac__crc16_table[(drflac_uint8)(crc >> 8) ^ data];
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_cache(drflac_uint16 crc, drflac_cache_t data)
{
#ifdef DRFLAC_64BIT
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF));
#endif
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  8) & 0xFF));
    crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  0) & 0xFF));

    return crc;
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16_bytes(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 byteCount)
{
    /* The cases below intentionally fall through so that the lowest byteCount bytes are processed from most to least significant. */
    switch (byteCount)
    {
#ifdef DRFLAC_64BIT
    case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF));
    case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF));
    case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF));
    case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF));
#endif
    case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF));
    case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF));
    case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  8) & 0xFF));
    case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >>  0) & 0xFF));
    }

    return crc;
}

#if 0
static DRFLAC_INLINE drflac_uint16 drflac_crc16__32bit(drflac_uint16 crc, drflac_uint32 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
#if 0
    /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc16(crc, 0, 16);") */
    drflac_uint16 p = 0x8005;
    for (int i = count-1; i >= 0; --i) {
        drflac_uint16 bit = (data & (1ULL << i)) >> i;
        if (crc & 0x8000) {
            crc = ((crc << 1) | bit) ^ p;
        } else {
            crc = ((crc << 1) |
                   bit);
        }
    }

    return crc;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 64);

    wholeBytes = count >> 3;
    leftoverBits = count & 7;
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    switch (wholeBytes) {
        default:
        case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)];
    }
    return crc;
#endif
#endif
}

static DRFLAC_INLINE drflac_uint16 drflac_crc16__64bit(drflac_uint16 crc, drflac_uint64 data, drflac_uint32 count)
{
#ifdef DR_FLAC_NO_CRC
    (void)crc;
    (void)data;
    (void)count;
    return 0;
#else
    drflac_uint32 wholeBytes;
    drflac_uint32 leftoverBits;
    drflac_uint64 leftoverDataMask;

    static drflac_uint64 leftoverDataMaskTable[8] = {
        0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F
    };

    DRFLAC_ASSERT(count <= 64);

    wholeBytes = count >> 3;
    leftoverBits = count & 7;
    leftoverDataMask = leftoverDataMaskTable[leftoverBits];

    switch (wholeBytes) {
        default:
        case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 << 32) << leftoverBits)) >> (56 + leftoverBits)));    /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants.
        Should be optimized out by a good compiler. */
        case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 << 32) << leftoverBits)) >> (48 + leftoverBits)));
        case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 << 32) << leftoverBits)) >> (40 + leftoverBits)));
        case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF << 32) << leftoverBits)) >> (32 + leftoverBits)));
        case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000      ) << leftoverBits)) >> (24 + leftoverBits)));
        case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000      ) << leftoverBits)) >> (16 + leftoverBits)));
        case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00      ) << leftoverBits)) >> ( 8 + leftoverBits)));
        case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF      ) << leftoverBits)) >> ( 0 + leftoverBits)));
        case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)];
    }
    return crc;
#endif
}


static DRFLAC_INLINE drflac_uint16 drflac_crc16(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 count)
{
#ifdef DRFLAC_64BIT
    return drflac_crc16__64bit(crc, data, count);
#else
    return drflac_crc16__32bit(crc, data, count);
#endif
}
#endif


#ifdef DRFLAC_64BIT
#define drflac__be2host__cache_line drflac__be2host_64
#else
#define drflac__be2host__cache_line drflac__be2host_32
#endif

/*
BIT READING ATTEMPT #2

This uses a 32- or 64-bit bit-shifted cache - as bits are read, the cache is shifted such that the first valid bit is sitting
on the most significant bit.
It uses the notion of an L1 and L2 cache (borrowed from CPU architecture), where the L1 cache
is a 32- or 64-bit unsigned integer (depending on whether or not a 32- or 64-bit build is being compiled) and the L2 is an
array of "cache lines", with each cache line being the same size as the L1. The L2 is a buffer of about 4KB and is where data
from onRead() is read into.
*/
#define DRFLAC_CACHE_L1_SIZE_BYTES(bs)                        (sizeof((bs)->cache))
#define DRFLAC_CACHE_L1_SIZE_BITS(bs)                         (sizeof((bs)->cache)*8)
#define DRFLAC_CACHE_L1_BITS_REMAINING(bs)                    (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (bs)->consumedBits)
#define DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount)             (~((~(drflac_cache_t)0) >> (_bitCount)))
#define DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, _bitCount)        (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (_bitCount))
#define DRFLAC_CACHE_L1_SELECT(bs, _bitCount)                 (((bs)->cache) & DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount))
#define DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, _bitCount)       (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >>  DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)))
#define DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, _bitCount)  (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> (DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)) & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1)))
#define DRFLAC_CACHE_L2_SIZE_BYTES(bs)                        (sizeof((bs)->cacheL2))
#define DRFLAC_CACHE_L2_LINE_COUNT(bs)                        (DRFLAC_CACHE_L2_SIZE_BYTES(bs) / sizeof((bs)->cacheL2[0]))
#define DRFLAC_CACHE_L2_LINES_REMAINING(bs)                   (DRFLAC_CACHE_L2_LINE_COUNT(bs) - (bs)->nextL2Line)


#ifndef DR_FLAC_NO_CRC
static DRFLAC_INLINE void drflac__reset_crc16(drflac_bs* bs)
{
    bs->crc16 = 0;
    bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
}

static DRFLAC_INLINE void drflac__update_crc16(drflac_bs* bs)
{
    if (bs->crc16CacheIgnoredBytes == 0) {
        bs->crc16 = drflac_crc16_cache(bs->crc16, bs->crc16Cache);
    } else {
        bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache,
                                       DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bs->crc16CacheIgnoredBytes);
        bs->crc16CacheIgnoredBytes = 0;
    }
}

static DRFLAC_INLINE drflac_uint16 drflac__flush_crc16(drflac_bs* bs)
{
    /* We should never be flushing in a situation where we are not aligned on a byte boundary. */
    DRFLAC_ASSERT((DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7) == 0);

    /*
    The bits that were read from the L1 cache need to be accumulated. The number of bytes needing to be accumulated is determined
    by the number of bits that have been consumed.
    */
    if (DRFLAC_CACHE_L1_BITS_REMAINING(bs) == 0) {
        drflac__update_crc16(bs);
    } else {
        /* We only accumulate the consumed bits. */
        bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache >> DRFLAC_CACHE_L1_BITS_REMAINING(bs), (bs->consumedBits >> 3) - bs->crc16CacheIgnoredBytes);

        /*
        The bits that we just accumulated should never be accumulated again. We need to keep track of how many bytes were accumulated
        so we can handle that later.
        */
        bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
    }

    return bs->crc16;
}
#endif

static DRFLAC_INLINE drflac_bool32 drflac__reload_l1_cache_from_l2(drflac_bs* bs)
{
    size_t bytesRead;
    size_t alignedL1LineCount;

    /* Fast path. Try loading straight from L2. */
    if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    }

    /*
    If we get here it means we've run out of data in the L2 cache. We'll need to fetch more from the client, if there's
    any left.
    */
    if (bs->unalignedByteCount > 0) {
        return DRFLAC_FALSE;   /* If we have any unaligned bytes it means there are no more aligned bytes left in the client.
        */
    }

    bytesRead = bs->onRead(bs->pUserData, bs->cacheL2, DRFLAC_CACHE_L2_SIZE_BYTES(bs));

    bs->nextL2Line = 0;
    if (bytesRead == DRFLAC_CACHE_L2_SIZE_BYTES(bs)) {
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    }


    /*
    If we get here it means we were unable to retrieve enough data to fill the entire L2 cache. It probably
    means we've just reached the end of the file. We need to move the valid data down to the end of the buffer
    and adjust the index of the next line accordingly. Also keep in mind that the L2 cache must be aligned to
    the size of the L1 so we'll need to seek backwards by any misaligned bytes.
    */
    alignedL1LineCount = bytesRead / DRFLAC_CACHE_L1_SIZE_BYTES(bs);

    /* We need to keep track of any unaligned bytes for later use. */
    bs->unalignedByteCount = bytesRead - (alignedL1LineCount * DRFLAC_CACHE_L1_SIZE_BYTES(bs));
    if (bs->unalignedByteCount > 0) {
        bs->unalignedCache = bs->cacheL2[alignedL1LineCount];
    }

    if (alignedL1LineCount > 0) {
        size_t offset = DRFLAC_CACHE_L2_LINE_COUNT(bs) - alignedL1LineCount;
        size_t i;
        for (i = alignedL1LineCount; i > 0; --i) {
            bs->cacheL2[i-1 + offset] = bs->cacheL2[i-1];
        }

        bs->nextL2Line = (drflac_uint32)offset;
        bs->cache = bs->cacheL2[bs->nextL2Line++];
        return DRFLAC_TRUE;
    } else {
        /* If we get into this branch it means we weren't able to load any L1-aligned data. */
        bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs);
        return DRFLAC_FALSE;
    }
}

static drflac_bool32 drflac__reload_cache(drflac_bs* bs)
{
    size_t bytesRead;

#ifndef DR_FLAC_NO_CRC
    drflac__update_crc16(bs);
#endif

    /* Fast path. Try just moving the next value in the L2 cache to the L1 cache.
    */
    if (drflac__reload_l1_cache_from_l2(bs)) {
        bs->cache = drflac__be2host__cache_line(bs->cache);
        bs->consumedBits = 0;
#ifndef DR_FLAC_NO_CRC
        bs->crc16Cache = bs->cache;
#endif
        return DRFLAC_TRUE;
    }

    /* Slow path. */

    /*
    If we get here it means we have failed to load the L1 cache from the L2. Likely we've just reached the end of the stream and the last
    few bytes did not meet the alignment requirements for the L2 cache. In this case we need to fall back to a slower path and read the
    data from the unaligned cache.
    */
    bytesRead = bs->unalignedByteCount;
    if (bytesRead == 0) {
        bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);   /* <-- The stream has been exhausted, so mark the bits as consumed. */
        return DRFLAC_FALSE;
    }

    DRFLAC_ASSERT(bytesRead < DRFLAC_CACHE_L1_SIZE_BYTES(bs));
    bs->consumedBits = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bytesRead) * 8;

    bs->cache = drflac__be2host__cache_line(bs->unalignedCache);
    bs->cache &= DRFLAC_CACHE_L1_SELECTION_MASK(DRFLAC_CACHE_L1_BITS_REMAINING(bs));    /* <-- Make sure the consumed bits are always set to zero. Other parts of the library depend on this property. */
    bs->unalignedByteCount = 0;     /* <-- At this point the unaligned bytes have been moved into the cache and we thus have no more unaligned bytes. */

#ifndef DR_FLAC_NO_CRC
    bs->crc16Cache = bs->cache >> bs->consumedBits;
    bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3;
#endif
    return DRFLAC_TRUE;
}

static void drflac__reset_cache(drflac_bs* bs)
{
    bs->nextL2Line   = DRFLAC_CACHE_L2_LINE_COUNT(bs);  /* <-- This clears the L2 cache. */
    bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);   /* <-- This clears the L1 cache. */
    bs->cache = 0;
    bs->unalignedByteCount = 0;                         /* <-- This clears the trailing unaligned bytes.
    */
    bs->unalignedCache = 0;

#ifndef DR_FLAC_NO_CRC
    bs->crc16Cache = 0;
    bs->crc16CacheIgnoredBytes = 0;
#endif
}


static DRFLAC_INLINE drflac_bool32 drflac__read_uint32(drflac_bs* bs, unsigned int bitCount, drflac_uint32* pResultOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResultOut != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 32);

    if (bs->consumedBits == DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    if (bitCount <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        /*
        If we want to load all 32-bits from a 32-bit cache we need to do it slightly differently because we can't do
        a 32-bit shift on a 32-bit integer. This will never be the case on 64-bit caches, so we can have a slightly
        more optimal solution for this.
        */
#ifdef DRFLAC_64BIT
        *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount);
        bs->consumedBits += bitCount;
        bs->cache <<= bitCount;
#else
        if (bitCount < DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount);
            bs->consumedBits += bitCount;
            bs->cache <<= bitCount;
        } else {
            /* Cannot shift by 32-bits, so need to do it differently. */
            *pResultOut = (drflac_uint32)bs->cache;
            bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs);
            bs->cache = 0;
        }
#endif

        return DRFLAC_TRUE;
    } else {
        /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them.
        */
        drflac_uint32 bitCountHi = DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        drflac_uint32 bitCountLo = bitCount - bitCountHi;
        drflac_uint32 resultHi;

        DRFLAC_ASSERT(bitCountHi > 0);
        DRFLAC_ASSERT(bitCountHi < 32);
        resultHi = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountHi);

        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
        if (bitCountLo > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
            /* This happens when we get to end of stream */
            return DRFLAC_FALSE;
        }

        *pResultOut = (resultHi << bitCountLo) | (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountLo);
        bs->consumedBits += bitCountLo;
        bs->cache <<= bitCountLo;
        return DRFLAC_TRUE;
    }
}

static drflac_bool32 drflac__read_int32(drflac_bs* bs, unsigned int bitCount, drflac_int32* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 32);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    /* Do not attempt to shift by 32 as it's undefined.
    */
    if (bitCount < 32) {
        drflac_uint32 signbit;
        signbit = ((result >> (bitCount-1)) & 0x01);
        result |= (~signbit + 1) << bitCount;
    }

    *pResult = (drflac_int32)result;
    return DRFLAC_TRUE;
}

#ifdef DRFLAC_64BIT
static drflac_bool32 drflac__read_uint64(drflac_bs* bs, unsigned int bitCount, drflac_uint64* pResultOut)
{
    drflac_uint32 resultHi;
    drflac_uint32 resultLo;

    DRFLAC_ASSERT(bitCount <= 64);
    DRFLAC_ASSERT(bitCount >  32);

    if (!drflac__read_uint32(bs, bitCount - 32, &resultHi)) {
        return DRFLAC_FALSE;
    }

    if (!drflac__read_uint32(bs, 32, &resultLo)) {
        return DRFLAC_FALSE;
    }

    *pResultOut = (((drflac_uint64)resultHi) << 32) | ((drflac_uint64)resultLo);
    return DRFLAC_TRUE;
}
#endif

/* Function below is unused, but leaving it here in case I need to quickly add it again. */
#if 0
static drflac_bool32 drflac__read_int64(drflac_bs* bs, unsigned int bitCount, drflac_int64* pResultOut)
{
    drflac_uint64 result;
    drflac_uint64 signbit;

    DRFLAC_ASSERT(bitCount <= 64);

    if (!drflac__read_uint64(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    signbit = ((result >> (bitCount-1)) & 0x01);
    result |= (~signbit + 1) << bitCount;

    *pResultOut = (drflac_int64)result;
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__read_uint16(drflac_bs* bs, unsigned int bitCount, drflac_uint16* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 16);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_uint16)result;
    return DRFLAC_TRUE;
}

#if 0
static drflac_bool32 drflac__read_int16(drflac_bs* bs, unsigned int bitCount, drflac_int16* pResult)
{
    drflac_int32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult !=
                  NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 16);

    if (!drflac__read_int32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_int16)result;
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__read_uint8(drflac_bs* bs, unsigned int bitCount, drflac_uint8* pResult)
{
    drflac_uint32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 8);

    if (!drflac__read_uint32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_uint8)result;
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__read_int8(drflac_bs* bs, unsigned int bitCount, drflac_int8* pResult)
{
    drflac_int32 result;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pResult != NULL);
    DRFLAC_ASSERT(bitCount > 0);
    DRFLAC_ASSERT(bitCount <= 8);

    if (!drflac__read_int32(bs, bitCount, &result)) {
        return DRFLAC_FALSE;
    }

    *pResult = (drflac_int8)result;
    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__seek_bits(drflac_bs* bs, size_t bitsToSeek)
{
    if (bitsToSeek <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        bs->consumedBits += (drflac_uint32)bitsToSeek;
        bs->cache <<= bitsToSeek;
        return DRFLAC_TRUE;
    } else {
        /* It straddles the cached data. This function isn't called too frequently so I'm favouring simplicity here. */
        bitsToSeek       -= DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        bs->consumedBits += DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        bs->cache         = 0;

        /* Simple case. Seek in groups of the same number as bits that fit within a cache line.
        */
#ifdef DRFLAC_64BIT
        while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            drflac_uint64 bin;
            if (!drflac__read_uint64(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs);
        }
#else
        while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) {
            drflac_uint32 bin;
            if (!drflac__read_uint32(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs);
        }
#endif

        /* Whole leftover bytes. */
        while (bitsToSeek >= 8) {
            drflac_uint8 bin;
            if (!drflac__read_uint8(bs, 8, &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek -= 8;
        }

        /* Leftover bits. */
        if (bitsToSeek > 0) {
            drflac_uint8 bin;
            if (!drflac__read_uint8(bs, (drflac_uint32)bitsToSeek, &bin)) {
                return DRFLAC_FALSE;
            }
            bitsToSeek = 0; /* <-- Necessary for the assert below. */
        }

        DRFLAC_ASSERT(bitsToSeek == 0);
        return DRFLAC_TRUE;
    }
}


/* This function moves the bit streamer to the first bit after the sync code (bit 15 of the frame header). It will also update the CRC-16. */
static drflac_bool32 drflac__find_and_seek_to_next_sync_code(drflac_bs* bs)
{
    DRFLAC_ASSERT(bs != NULL);

    /*
    The sync code is always aligned to 8 bits. This is convenient for us because it means we can do byte-aligned movements. The first
The first2668thing to do is align to the next byte.2669*/2670if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) {2671return DRFLAC_FALSE;2672}26732674for (;;) {2675drflac_uint8 hi;26762677#ifndef DR_FLAC_NO_CRC2678drflac__reset_crc16(bs);2679#endif26802681if (!drflac__read_uint8(bs, 8, &hi)) {2682return DRFLAC_FALSE;2683}26842685if (hi == 0xFF) {2686drflac_uint8 lo;2687if (!drflac__read_uint8(bs, 6, &lo)) {2688return DRFLAC_FALSE;2689}26902691if (lo == 0x3E) {2692return DRFLAC_TRUE;2693} else {2694if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) {2695return DRFLAC_FALSE;2696}2697}2698}2699}27002701/* Should never get here. */2702/*return DRFLAC_FALSE;*/2703}270427052706#if defined(DRFLAC_HAS_LZCNT_INTRINSIC)2707#define DRFLAC_IMPLEMENT_CLZ_LZCNT2708#endif2709#if defined(_MSC_VER) && _MSC_VER >= 1400 && (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(__clang__)2710#define DRFLAC_IMPLEMENT_CLZ_MSVC2711#endif2712#if defined(__WATCOMC__) && defined(__386__)2713#define DRFLAC_IMPLEMENT_CLZ_WATCOM2714#endif2715#ifdef __MRC__2716#include <intrinsics.h>2717#define DRFLAC_IMPLEMENT_CLZ_MRC2718#endif27192720static DRFLAC_INLINE drflac_uint32 drflac__clz_software(drflac_cache_t x)2721{2722drflac_uint32 n;2723static drflac_uint32 clz_table_4[] = {27240,27254,27263, 3,27272, 2, 2, 2,27281, 1, 1, 1, 1, 1, 1, 12729};27302731if (x == 0) {2732return sizeof(x)*8;2733}27342735n = clz_table_4[x >> (sizeof(x)*8 - 4)];2736if (n == 0) {2737#ifdef DRFLAC_64BIT2738if ((x & ((drflac_uint64)0xFFFFFFFF << 32)) == 0) { n = 32; x <<= 32; }2739if ((x & ((drflac_uint64)0xFFFF0000 << 32)) == 0) { n += 16; x <<= 16; }2740if ((x & ((drflac_uint64)0xFF000000 << 32)) == 0) { n += 8; x <<= 8; }2741if ((x & ((drflac_uint64)0xF0000000 << 32)) == 0) { n += 4; x <<= 4; }2742#else2743if ((x & 0xFFFF0000) == 0) { n = 16; x <<= 16; }2744if ((x & 0xFF000000) == 0) { n += 8; x <<= 8; }2745if ((x & 0xF0000000) == 0) { n += 4; x <<= 4; }2746#endif2747n += clz_table_4[x 
>> (sizeof(x)*8 - 4)];2748}27492750return n - 1;2751}27522753#ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT2754static DRFLAC_INLINE drflac_bool32 drflac__is_lzcnt_supported(void)2755{2756/* Fast compile time check for ARM. */2757#if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5)2758return DRFLAC_TRUE;2759#elif defined(__MRC__)2760return DRFLAC_TRUE;2761#else2762/* If the compiler itself does not support the intrinsic then we'll need to return false. */2763#ifdef DRFLAC_HAS_LZCNT_INTRINSIC2764return drflac__gIsLZCNTSupported;2765#else2766return DRFLAC_FALSE;2767#endif2768#endif2769}27702771static DRFLAC_INLINE drflac_uint32 drflac__clz_lzcnt(drflac_cache_t x)2772{2773/*2774It's critical for competitive decoding performance that this function be highly optimal. With MSVC we can use the __lzcnt64() and __lzcnt() intrinsics2775to achieve good performance, however on GCC and Clang it's a little bit more annoying. The __builtin_clzl() and __builtin_clzll() intrinsics leave2776it undefined as to the return value when `x` is 0. We need this to be well defined as returning 32 or 64, depending on whether or not it's a 32- or277764-bit build. To work around this we would need to add a conditional to check for the x = 0 case, but this creates unnecessary inefficiency. To work2778around this problem I have written some inline assembly to emit the LZCNT (x86) or CLZ (ARM) instruction directly which removes the need to include2779the conditional. This has worked well in the past, but for some reason Clang's MSVC compatible driver, clang-cl, does not seem to be handling this2780in the same way as the normal Clang driver. It seems that `clang-cl` is just outputting the wrong results sometimes, maybe due to some register2781getting clobbered?27822783I'm not sure if this is a bug with dr_flac's inlined assembly (most likely), a bug in `clang-cl` or just a misunderstanding on my part with inline2784assembly rules for `clang-cl`. 
    If somebody can identify an error in dr_flac's inlined assembly I'm happy to get that fixed.

    Fortunately there is an easy workaround for this. Clang implements MSVC-specific intrinsics for compatibility. It also defines _MSC_VER for extra
    compatibility. We can therefore just check for _MSC_VER and use the MSVC intrinsic which, fortunately for us, Clang supports. It would still be nice
    to know how to fix the inlined assembly for correctness sake, however.
    */

#if defined(_MSC_VER) /*&& !defined(__clang__)*/    /* <-- Intentionally wanting Clang to use the MSVC __lzcnt64/__lzcnt intrinsics due to above ^. */
    #ifdef DRFLAC_64BIT
        return (drflac_uint32)__lzcnt64(x);
    #else
        return (drflac_uint32)__lzcnt(x);
    #endif
#else
    #if defined(__GNUC__) || defined(__clang__)
        #if defined(DRFLAC_X64)
            {
                drflac_uint64 r;
                __asm__ __volatile__ (
                    "lzcnt{ %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc"
                );

                return (drflac_uint32)r;
            }
        #elif defined(DRFLAC_X86)
            {
                drflac_uint32 r;
                __asm__ __volatile__ (
                    "lzcnt{l %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc"
                );

                return r;
            }
        #elif defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) && !defined(__ARM_ARCH_6M__) && !defined(DRFLAC_64BIT)   /* <-- I haven't tested 64-bit inline assembly, so only enabling this for the 32-bit build for now. */
            {
                unsigned int r;
                __asm__ __volatile__ (
                #if defined(DRFLAC_64BIT)
                    "clz %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(x)   /* <-- This is untested. If someone in the community could test this, that would be appreciated! */
                #else
                    "clz %[out], %[in]" : [out]"=r"(r) : [in]"r"(x)
                #endif
                );

                return r;
            }
        #else
            if (x == 0) {
                return sizeof(x)*8;
            }
            #ifdef DRFLAC_64BIT
                return (drflac_uint32)__builtin_clzll((drflac_uint64)x);
            #else
                return (drflac_uint32)__builtin_clzl((drflac_uint32)x);
            #endif
        #endif
    #else
        /* Unsupported compiler.
        */
        #error "This compiler does not support the lzcnt intrinsic."
    #endif
#endif
}
#endif

#ifdef DRFLAC_IMPLEMENT_CLZ_MSVC
#include <intrin.h> /* For BitScanReverse(). */

static DRFLAC_INLINE drflac_uint32 drflac__clz_msvc(drflac_cache_t x)
{
    drflac_uint32 n;

    if (x == 0) {
        return sizeof(x)*8;
    }

#ifdef DRFLAC_64BIT
    _BitScanReverse64((unsigned long*)&n, x);
#else
    _BitScanReverse((unsigned long*)&n, x);
#endif
    return sizeof(x)*8 - n - 1;
}
#endif

#ifdef DRFLAC_IMPLEMENT_CLZ_WATCOM
static __inline drflac_uint32 drflac__clz_watcom (drflac_uint32);
#ifdef DRFLAC_IMPLEMENT_CLZ_WATCOM_LZCNT
/* Use the LZCNT instruction (only available on some processors since the 2010s). */
#pragma aux drflac__clz_watcom_lzcnt = \
    "db 0F3h, 0Fh, 0BDh, 0C0h" /* lzcnt eax, eax */ \
    parm [eax] \
    value [eax] \
    modify nomemory;
#else
/* Use the 386+-compatible implementation. */
#pragma aux drflac__clz_watcom = \
    "bsr eax, eax" \
    "xor eax, 31" \
    parm [eax] nomemory \
    value [eax] \
    modify exact [eax] nomemory;
#endif
#endif

static DRFLAC_INLINE drflac_uint32 drflac__clz(drflac_cache_t x)
{
#ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT
    if (drflac__is_lzcnt_supported()) {
        return drflac__clz_lzcnt(x);
    } else
#endif
    {
#ifdef DRFLAC_IMPLEMENT_CLZ_MSVC
        return drflac__clz_msvc(x);
#elif defined(DRFLAC_IMPLEMENT_CLZ_WATCOM_LZCNT)
        return drflac__clz_watcom_lzcnt(x);
#elif defined(DRFLAC_IMPLEMENT_CLZ_WATCOM)
        return (x == 0) ? sizeof(x)*8 : drflac__clz_watcom(x);
#elif defined(__MRC__)
        return __cntlzw(x);
#else
        return drflac__clz_software(x);
#endif
    }
}


static DRFLAC_INLINE drflac_bool32 drflac__seek_past_next_set_bit(drflac_bs* bs, unsigned int* pOffsetOut)
{
    drflac_uint32 zeroCounter = 0;
    drflac_uint32 setBitOffsetPlus1;

    while (bs->cache == 0) {
        zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    if (bs->cache == 1) {
        /* Not catching this would lead to undefined behaviour: a shift of a 32-bit number by 32 or more is undefined */
        *pOffsetOut = zeroCounter + (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs) - 1;
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }

        return DRFLAC_TRUE;
    }

    setBitOffsetPlus1 = drflac__clz(bs->cache);
    setBitOffsetPlus1 += 1;

    if (setBitOffsetPlus1 > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        /* This happens when we get to end of stream */
        return DRFLAC_FALSE;
    }

    bs->consumedBits += setBitOffsetPlus1;
    bs->cache <<= setBitOffsetPlus1;

    *pOffsetOut = zeroCounter + setBitOffsetPlus1 - 1;
    return DRFLAC_TRUE;
}



static drflac_bool32 drflac__seek_to_byte(drflac_bs* bs, drflac_uint64 offsetFromStart)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(offsetFromStart > 0);

    /*
    Seeking from the start is not quite as trivial as it sounds because the onSeek callback takes a signed 32-bit integer (which
    is intentional because it simplifies the implementation of the onSeek callbacks), however offsetFromStart is unsigned 64-bit.
    To resolve we just need to do an initial seek from the start, and then a series of offset seeks to make up the remainder.
    */
    if (offsetFromStart > 0x7FFFFFFF) {
        drflac_uint64 bytesRemaining = offsetFromStart;
        if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }
        bytesRemaining -= 0x7FFFFFFF;

        while (bytesRemaining > 0x7FFFFFFF) {
            if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            bytesRemaining -= 0x7FFFFFFF;
        }

        if (bytesRemaining > 0) {
            if (!bs->onSeek(bs->pUserData, (int)bytesRemaining, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
        }
    } else {
        if (!bs->onSeek(bs->pUserData, (int)offsetFromStart, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }
    }

    /* The cache should be reset to force a reload of fresh data from the client. */
    drflac__reset_cache(bs);
    return DRFLAC_TRUE;
}


static drflac_result drflac__read_utf8_coded_number(drflac_bs* bs, drflac_uint64* pNumberOut, drflac_uint8* pCRCOut)
{
    drflac_uint8 crc;
    drflac_uint64 result;
    drflac_uint8 utf8[7] = {0};
    int byteCount;
    int i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pNumberOut != NULL);
    DRFLAC_ASSERT(pCRCOut != NULL);

    crc = *pCRCOut;

    if (!drflac__read_uint8(bs, 8, utf8)) {
        *pNumberOut = 0;
        return DRFLAC_AT_END;
    }
    crc = drflac_crc8(crc, utf8[0], 8);

    if ((utf8[0] & 0x80) == 0) {
        *pNumberOut = utf8[0];
        *pCRCOut = crc;
        return DRFLAC_SUCCESS;
    }

    /*byteCount = 1;*/
    if ((utf8[0] & 0xE0) == 0xC0) {
        byteCount = 2;
    } else if ((utf8[0] & 0xF0) == 0xE0) {
        byteCount = 3;
    } else if ((utf8[0] & 0xF8) == 0xF0) {
        byteCount = 4;
    } else if ((utf8[0] & 0xFC) == 0xF8) {
        byteCount = 5;
    } else if ((utf8[0] & 0xFE) == 0xFC) {
        byteCount = 6;
    } else if ((utf8[0] & 0xFF) == 0xFE) {
        byteCount = 7;
    } else {
        *pNumberOut = 0;
        return DRFLAC_CRC_MISMATCH;     /* Bad UTF-8 encoding. */
    }

    /* Read extra bytes.
    */
    DRFLAC_ASSERT(byteCount > 1);

    result = (drflac_uint64)(utf8[0] & (0xFF >> (byteCount + 1)));
    for (i = 1; i < byteCount; ++i) {
        if (!drflac__read_uint8(bs, 8, utf8 + i)) {
            *pNumberOut = 0;
            return DRFLAC_AT_END;
        }
        crc = drflac_crc8(crc, utf8[i], 8);

        result = (result << 6) | (utf8[i] & 0x3F);
    }

    *pNumberOut = result;
    *pCRCOut = crc;
    return DRFLAC_SUCCESS;
}


static DRFLAC_INLINE drflac_uint32 drflac__ilog2_u32(drflac_uint32 x)
{
#if 1   /* Needs optimizing. */
    drflac_uint32 result = 0;
    while (x > 0) {
        result += 1;
        x >>= 1;
    }

    return result;
#endif
}

static DRFLAC_INLINE drflac_bool32 drflac__use_64_bit_prediction(drflac_uint32 bitsPerSample, drflac_uint32 order, drflac_uint32 precision)
{
    /* https://web.archive.org/web/20220205005724/https://github.com/ietf-wg-cellar/flac-specification/blob/37a49aa48ba4ba12e8757badfc59c0df35435fec/rfc_backmatter.md */
    return bitsPerSample + precision + drflac__ilog2_u32(order) > 32;
}


/*
The next two functions are responsible for calculating the prediction.

When the bits per sample is >16 we need to use 64-bit integer arithmetic because otherwise we'll run out of precision. It's
safe to assume this will be slower on 32-bit platforms so we use a more optimal solution when the bits per sample is <=16.
*/
#if defined(__clang__)
__attribute__((no_sanitize("signed-integer-overflow")))
#endif
static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_32(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
{
    drflac_int32 prediction = 0;

    DRFLAC_ASSERT(order <= 32);

    /* 32-bit version. */

    /* VC++ optimizes this to a single jmp. I've not yet verified this for other compilers.
    */
    switch (order)
    {
    case 32: prediction += coefficients[31] * pDecodedSamples[-32];
    case 31: prediction += coefficients[30] * pDecodedSamples[-31];
    case 30: prediction += coefficients[29] * pDecodedSamples[-30];
    case 29: prediction += coefficients[28] * pDecodedSamples[-29];
    case 28: prediction += coefficients[27] * pDecodedSamples[-28];
    case 27: prediction += coefficients[26] * pDecodedSamples[-27];
    case 26: prediction += coefficients[25] * pDecodedSamples[-26];
    case 25: prediction += coefficients[24] * pDecodedSamples[-25];
    case 24: prediction += coefficients[23] * pDecodedSamples[-24];
    case 23: prediction += coefficients[22] * pDecodedSamples[-23];
    case 22: prediction += coefficients[21] * pDecodedSamples[-22];
    case 21: prediction += coefficients[20] * pDecodedSamples[-21];
    case 20: prediction += coefficients[19] * pDecodedSamples[-20];
    case 19: prediction += coefficients[18] * pDecodedSamples[-19];
    case 18: prediction += coefficients[17] * pDecodedSamples[-18];
    case 17: prediction += coefficients[16] * pDecodedSamples[-17];
    case 16: prediction += coefficients[15] * pDecodedSamples[-16];
    case 15: prediction += coefficients[14] * pDecodedSamples[-15];
    case 14: prediction += coefficients[13] * pDecodedSamples[-14];
    case 13: prediction += coefficients[12] * pDecodedSamples[-13];
    case 12: prediction += coefficients[11] * pDecodedSamples[-12];
    case 11: prediction += coefficients[10] * pDecodedSamples[-11];
    case 10: prediction += coefficients[ 9] * pDecodedSamples[-10];
    case  9: prediction += coefficients[ 8] * pDecodedSamples[- 9];
    case  8: prediction += coefficients[ 7] * pDecodedSamples[- 8];
    case  7: prediction += coefficients[ 6] * pDecodedSamples[- 7];
    case  6: prediction += coefficients[ 5] * pDecodedSamples[- 6];
    case  5: prediction += coefficients[ 4] * pDecodedSamples[- 5];
    case  4: prediction += coefficients[ 3] * pDecodedSamples[- 4];
    case  3: prediction += coefficients[ 2] * pDecodedSamples[- 3];
    case  2: prediction += coefficients[ 1] * pDecodedSamples[- 2];
    case  1: prediction += coefficients[ 0] * pDecodedSamples[- 1];
    }

    return (drflac_int32)(prediction >> shift);
}

static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_64(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
{
    drflac_int64 prediction;

    DRFLAC_ASSERT(order <= 32);

    /* 64-bit version. */

    /* This method is faster on the 32-bit build when compiling with VC++. See note below. */
#ifndef DRFLAC_64BIT
    if (order == 8)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
        prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
    }
    else if (order == 7)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
    }
    else if (order == 3)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
    }
    else if (order == 6)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
    }
    else if (order == 5)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
    }
    else if (order == 4)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
    }
    else if (order == 12)
    {
        prediction  = coefficients[0]  * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1]  * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2]  * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3]  * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4]  * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5]  * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6]  * (drflac_int64)pDecodedSamples[-7];
        prediction += coefficients[7]  * (drflac_int64)pDecodedSamples[-8];
        prediction += coefficients[8]  * (drflac_int64)pDecodedSamples[-9];
        prediction += coefficients[9]  * (drflac_int64)pDecodedSamples[-10];
        prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
        prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12];
    }
    else if (order == 2)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
    }
    else if (order == 1)
    {
        prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
    }
    else if (order == 10)
    {
        prediction  = coefficients[0]  * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1]  * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2]  * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3]  * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4]  * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5]  * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6]  * (drflac_int64)pDecodedSamples[-7];
        prediction += coefficients[7]  * (drflac_int64)pDecodedSamples[-8];
        prediction += coefficients[8]  * (drflac_int64)pDecodedSamples[-9];
        prediction += coefficients[9]  * (drflac_int64)pDecodedSamples[-10];
    }
    else if (order == 9)
    {
        prediction  = coefficients[0] * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7];
        prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8];
        prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9];
    }
    else if (order == 11)
    {
        prediction  = coefficients[0]  * (drflac_int64)pDecodedSamples[-1];
        prediction += coefficients[1]  * (drflac_int64)pDecodedSamples[-2];
        prediction += coefficients[2]  * (drflac_int64)pDecodedSamples[-3];
        prediction += coefficients[3]  * (drflac_int64)pDecodedSamples[-4];
        prediction += coefficients[4]  * (drflac_int64)pDecodedSamples[-5];
        prediction += coefficients[5]  * (drflac_int64)pDecodedSamples[-6];
        prediction += coefficients[6]  * (drflac_int64)pDecodedSamples[-7];
        prediction += coefficients[7]  * (drflac_int64)pDecodedSamples[-8];
        prediction += coefficients[8]  * (drflac_int64)pDecodedSamples[-9];
        prediction += coefficients[9]  * (drflac_int64)pDecodedSamples[-10];
        prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
    }
    else
    {
        int j;

        prediction = 0;
        for (j = 0; j < (int)order; ++j) {
            prediction += coefficients[j] * (drflac_int64)pDecodedSamples[-j-1];
        }
    }
#endif

    /*
    VC++ optimizes this to a single jmp instruction, but only the 64-bit build. The 32-bit build generates less efficient code for some
    reason. The ugly version above is faster so we'll just switch between the two depending on the target platform.
    */
#ifdef DRFLAC_64BIT
    prediction = 0;
    switch (order)
    {
    case 32: prediction += coefficients[31] * (drflac_int64)pDecodedSamples[-32];
    case 31: prediction += coefficients[30] * (drflac_int64)pDecodedSamples[-31];
    case 30: prediction += coefficients[29] * (drflac_int64)pDecodedSamples[-30];
    case 29: prediction += coefficients[28] * (drflac_int64)pDecodedSamples[-29];
    case 28: prediction += coefficients[27] * (drflac_int64)pDecodedSamples[-28];
    case 27: prediction += coefficients[26] * (drflac_int64)pDecodedSamples[-27];
    case 26: prediction += coefficients[25] * (drflac_int64)pDecodedSamples[-26];
    case 25: prediction += coefficients[24] * (drflac_int64)pDecodedSamples[-25];
    case 24: prediction += coefficients[23] * (drflac_int64)pDecodedSamples[-24];
    case 23: prediction += coefficients[22] * (drflac_int64)pDecodedSamples[-23];
    case 22: prediction += coefficients[21] * (drflac_int64)pDecodedSamples[-22];
    case 21: prediction += coefficients[20] * (drflac_int64)pDecodedSamples[-21];
    case 20: prediction += coefficients[19] * (drflac_int64)pDecodedSamples[-20];
    case 19: prediction += coefficients[18] * (drflac_int64)pDecodedSamples[-19];
    case 18: prediction += coefficients[17] * (drflac_int64)pDecodedSamples[-18];
    case 17: prediction += coefficients[16] * (drflac_int64)pDecodedSamples[-17];
    case 16: prediction += coefficients[15] * (drflac_int64)pDecodedSamples[-16];
    case 15: prediction += coefficients[14] * (drflac_int64)pDecodedSamples[-15];
    case 14: prediction += coefficients[13] * (drflac_int64)pDecodedSamples[-14];
    case 13: prediction += coefficients[12] * (drflac_int64)pDecodedSamples[-13];
    case 12: prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12];
    case 11: prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11];
    case 10: prediction += coefficients[ 9] * (drflac_int64)pDecodedSamples[-10];
    case  9: prediction += coefficients[ 8] * (drflac_int64)pDecodedSamples[- 9];
    case  8: prediction += coefficients[ 7] * (drflac_int64)pDecodedSamples[- 8];
    case  7: prediction += coefficients[ 6] * (drflac_int64)pDecodedSamples[- 7];
    case  6: prediction += coefficients[ 5] * (drflac_int64)pDecodedSamples[- 6];
    case  5: prediction += coefficients[ 4] * (drflac_int64)pDecodedSamples[- 5];
    case  4: prediction += coefficients[ 3] * (drflac_int64)pDecodedSamples[- 4];
    case  3: prediction += coefficients[ 2] * (drflac_int64)pDecodedSamples[- 3];
    case  2: prediction += coefficients[ 1] * (drflac_int64)pDecodedSamples[- 2];
    case  1: prediction += coefficients[ 0] * (drflac_int64)pDecodedSamples[- 1];
    }
#endif

    return (drflac_int32)(prediction >> shift);
}


#if 0
/*
Reference implementation for reading and decoding samples with residual.
This is intentionally left unoptimized for the
sake of readability and should only be used as a reference.
*/
static drflac_bool32 drflac__decode_samples_with_residual__rice__reference(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    for (i = 0; i < count; ++i) {
        drflac_uint32 zeroCounter = 0;
        for (;;) {
            drflac_uint8 bit;
            if (!drflac__read_uint8(bs, 1, &bit)) {
                return DRFLAC_FALSE;
            }

            if (bit == 0) {
                zeroCounter += 1;
            } else {
                break;
            }
        }

        drflac_uint32 decodedRice;
        if (riceParam > 0) {
            if (!drflac__read_uint32(bs, riceParam, &decodedRice)) {
                return DRFLAC_FALSE;
            }
        } else {
            decodedRice = 0;
        }

        decodedRice |= (zeroCounter << riceParam);
        if ((decodedRice & 0x01)) {
            decodedRice = ~(decodedRice >> 1);
        } else {
            decodedRice =  (decodedRice >> 1);
        }


        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[i] = decodedRice + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        } else {
            pSamplesOut[i] = decodedRice + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        }
    }

    return DRFLAC_TRUE;
}
#endif

#if 0
static drflac_bool32 drflac__read_rice_parts__reference(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
{
    drflac_uint32 zeroCounter = 0;
    drflac_uint32 decodedRice;

    for (;;) {
        drflac_uint8 bit;
        if (!drflac__read_uint8(bs, 1, &bit)) {
            return DRFLAC_FALSE;
        }

        if (bit == 0) {
            zeroCounter += 1;
        } else {
            break;
        }
    }

    if (riceParam > 0) {
        if (!drflac__read_uint32(bs, riceParam, &decodedRice)) {
            return DRFLAC_FALSE;
        }
    } else {
        decodedRice = 0;
    }

    *pZeroCounterOut = zeroCounter;
    *pRiceParamPartOut = decodedRice;
    return DRFLAC_TRUE;
}
#endif

#if 0
static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
{
    drflac_cache_t riceParamMask;
    drflac_uint32 zeroCounter;
    drflac_uint32 setBitOffsetPlus1;
    drflac_uint32 riceParamPart;
    drflac_uint32 riceLength;

    DRFLAC_ASSERT(riceParam > 0);   /* <-- riceParam should never be 0. drflac__read_rice_parts__param_equals_zero() should be used instead for this case. */

    riceParamMask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParam);

    zeroCounter = 0;
    while (bs->cache == 0) {
        zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs);
        if (!drflac__reload_cache(bs)) {
            return DRFLAC_FALSE;
        }
    }

    setBitOffsetPlus1 = drflac__clz(bs->cache);
    zeroCounter += setBitOffsetPlus1;
    setBitOffsetPlus1 += 1;

    riceLength = setBitOffsetPlus1 + riceParam;
    if (riceLength < DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
        riceParamPart = (drflac_uint32)((bs->cache & (riceParamMask >> setBitOffsetPlus1)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceLength));

        bs->consumedBits += riceLength;
        bs->cache <<= riceLength;
    } else {
        drflac_uint32 bitCountLo;
        drflac_cache_t resultHi;

        bs->consumedBits += riceLength;
        bs->cache <<= setBitOffsetPlus1 & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1);    /* <-- Equivalent to "if (setBitOffsetPlus1 < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { bs->cache <<= setBitOffsetPlus1; }" */

        /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them.
        */
        bitCountLo = bs->consumedBits - DRFLAC_CACHE_L1_SIZE_BITS(bs);
        resultHi = DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, riceParam);  /* <-- Use DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE() if ever this function allows riceParam=0. */

        if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
#ifndef DR_FLAC_NO_CRC
            drflac__update_crc16(bs);
#endif
            bs->cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
            bs->consumedBits = 0;
#ifndef DR_FLAC_NO_CRC
            bs->crc16Cache = bs->cache;
#endif
        } else {
            /* Slow path. We need to fetch more data from the client. */
            if (!drflac__reload_cache(bs)) {
                return DRFLAC_FALSE;
            }
            if (bitCountLo > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
                /* This happens when we get to end of stream */
                return DRFLAC_FALSE;
            }
        }

        riceParamPart = (drflac_uint32)(resultHi | DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, bitCountLo));

        bs->consumedBits += bitCountLo;
        bs->cache <<= bitCountLo;
    }

    pZeroCounterOut[0] = zeroCounter;
    pRiceParamPartOut[0] = riceParamPart;

    return DRFLAC_TRUE;
}
#endif

static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts_x1(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut)
{
    drflac_uint32  riceParamPlus1 = riceParam + 1;
    /*drflac_cache_t riceParamPlus1Mask  = DRFLAC_CACHE_L1_SELECTION_MASK(riceParamPlus1);*/
    drflac_uint32  riceParamPlus1Shift = DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPlus1);
    drflac_uint32  riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1;

    /*
    The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have
    no idea how this will work in practice...
    */
    drflac_cache_t bs_cache = bs->cache;
    drflac_uint32  bs_consumedBits = bs->consumedBits;

    /* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */
    drflac_uint32  lzcount = drflac__clz(bs_cache);
    if (lzcount < sizeof(bs_cache)*8) {
        pZeroCounterOut[0] = lzcount;

        /*
        It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting
        this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled
        outside of this function at a higher level.
        */
    extract_rice_param_part:
        bs_cache       <<= lzcount;
        bs_consumedBits += lzcount;

        if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) {
            /* Getting here means the rice parameter part is wholly contained within the current cache line. */
            pRiceParamPartOut[0] = (drflac_uint32)(bs_cache >> riceParamPlus1Shift);
            bs_cache       <<= riceParamPlus1;
            bs_consumedBits += riceParamPlus1;
        } else {
            drflac_uint32 riceParamPartHi;
            drflac_uint32 riceParamPartLo;
            drflac_uint32 riceParamPartLoBitCount;

            /*
            Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache
            line, reload the cache, and then combine it with the head of the next cache line.
            */

            /* Grab the high part of the rice parameter part. */
            riceParamPartHi = (drflac_uint32)(bs_cache >> riceParamPlus1Shift);

            /* Before reloading the cache we need to grab the size in bits of the low part. */
            riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits;
            DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32);

            /* Now reload the cache. */
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = riceParamPartLoBitCount;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }
                if (riceParamPartLoBitCount > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
                    /* This happens when we get to end of stream */
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount;
            }

            /* We should now have enough information to construct the rice parameter part. */
            riceParamPartLo = (drflac_uint32)(bs_cache >> (DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPartLoBitCount)));
            pRiceParamPartOut[0] = riceParamPartHi | riceParamPartLo;

            bs_cache <<= riceParamPartLoBitCount;
        }
    } else {
        /*
        Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call
        to drflac__clz() and we need to reload the cache.
        */
        drflac_uint32 zeroCounter = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BITS(bs) - bs_consumedBits);
        for (;;) {
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = 0;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits;
            }

            lzcount = drflac__clz(bs_cache);
            zeroCounter += lzcount;

            if (lzcount < sizeof(bs_cache)*8) {
                break;
            }
        }

        pZeroCounterOut[0] = zeroCounter;
        goto extract_rice_param_part;
    }

    /* Make sure the cache is restored at the end of it all.
*/
    bs->cache = bs_cache;
    bs->consumedBits = bs_consumedBits;

    return DRFLAC_TRUE;
}

static DRFLAC_INLINE drflac_bool32 drflac__seek_rice_parts(drflac_bs* bs, drflac_uint8 riceParam)
{
    drflac_uint32 riceParamPlus1 = riceParam + 1;
    drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1;

    /*
    The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have
    no idea how this will work in practice...
    */
    drflac_cache_t bs_cache = bs->cache;
    drflac_uint32 bs_consumedBits = bs->consumedBits;

    /* The first thing to do is find the first set bit. Most likely a bit will be set in the current cache line. */
    drflac_uint32 lzcount = drflac__clz(bs_cache);
    if (lzcount < sizeof(bs_cache)*8) {
        /*
        It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting
        this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled
        outside of this function at a higher level.
        */
    extract_rice_param_part:
        bs_cache <<= lzcount;
        bs_consumedBits += lzcount;

        if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) {
            /* Getting here means the rice parameter part is wholly contained within the current cache line. */
            bs_cache <<= riceParamPlus1;
            bs_consumedBits += riceParamPlus1;
        } else {
            /*
            Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache
            line, reload the cache, and then combine it with the head of the next cache line.
            */

            /* Before reloading the cache we need to grab the size in bits of the low part.
*/
            drflac_uint32 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits;
            DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32);

            /* Now reload the cache. */
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = riceParamPartLoBitCount;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }

                if (riceParamPartLoBitCount > DRFLAC_CACHE_L1_BITS_REMAINING(bs)) {
                    /* This happens when we get to end of stream */
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount;
            }

            bs_cache <<= riceParamPartLoBitCount;
        }
    } else {
        /*
        Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call
        to drflac__clz() and we need to reload the cache.
        */
        for (;;) {
            if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) {
            #ifndef DR_FLAC_NO_CRC
                drflac__update_crc16(bs);
            #endif
                bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]);
                bs_consumedBits = 0;
            #ifndef DR_FLAC_NO_CRC
                bs->crc16Cache = bs_cache;
            #endif
            } else {
                /* Slow path. We need to fetch more data from the client. */
                if (!drflac__reload_cache(bs)) {
                    return DRFLAC_FALSE;
                }

                bs_cache = bs->cache;
                bs_consumedBits = bs->consumedBits;
            }

            lzcount = drflac__clz(bs_cache);
            if (lzcount < sizeof(bs_cache)*8) {
                break;
            }
        }

        goto extract_rice_param_part;
    }

    /* Make sure the cache is restored at the end of it all.
*/
    bs->cache = bs_cache;
    bs->consumedBits = bs_consumedBits;

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar_zeroorder(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};
    drflac_uint32 zeroCountPart0;
    drflac_uint32 riceParamPart0;
    drflac_uint32 riceParamMask;
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    (void)bitsPerSample;
    (void)order;
    (void)shift;
    (void)coefficients;

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);

    i = 0;
    while (i < count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamPart0 &= riceParamMask;
        riceParamPart0 |= (zeroCountPart0 << riceParam);
        riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];

        pSamplesOut[i] = riceParamPart0;

        i += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};
    drflac_uint32 zeroCountPart0 = 0;
    drflac_uint32 zeroCountPart1 = 0;
    drflac_uint32 zeroCountPart2 = 0;
    drflac_uint32 zeroCountPart3 = 0;
    drflac_uint32 riceParamPart0 = 0;
    drflac_uint32 riceParamPart1 = 0;
    drflac_uint32 riceParamPart2 = 0;
    drflac_uint32 riceParamPart3 = 0;
    drflac_uint32 riceParamMask;
    const drflac_int32* pSamplesOutEnd;
    drflac_uint32
i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    if (lpcOrder == 0) {
        return drflac__decode_samples_with_residual__rice__scalar_zeroorder(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
    }

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    pSamplesOutEnd = pSamplesOut + (count & ~3);

    if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
        while (pSamplesOut < pSamplesOutEnd) {
            /*
            Rice extraction. It's faster to do this one at a time against local variables than it is to use the x4 version
            against an array. Not sure why, but perhaps it's making more efficient use of registers?
            */
            if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) {
                return DRFLAC_FALSE;
            }

            riceParamPart0 &= riceParamMask;
            riceParamPart1 &= riceParamMask;
            riceParamPart2 &= riceParamMask;
            riceParamPart3 &= riceParamMask;

            riceParamPart0 |= (zeroCountPart0 << riceParam);
            riceParamPart1 |= (zeroCountPart1 << riceParam);
            riceParamPart2 |= (zeroCountPart2 << riceParam);
            riceParamPart3 |= (zeroCountPart3 << riceParam);

            riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
            riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01];
            riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01];
            riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01];

            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
            pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 1);
            pSamplesOut[2] =
riceParamPart2 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 2);
            pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 3);

            pSamplesOut += 4;
        }
    } else {
        while (pSamplesOut < pSamplesOutEnd) {
            if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) ||
                !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) {
                return DRFLAC_FALSE;
            }

            riceParamPart0 &= riceParamMask;
            riceParamPart1 &= riceParamMask;
            riceParamPart2 &= riceParamMask;
            riceParamPart3 &= riceParamMask;

            riceParamPart0 |= (zeroCountPart0 << riceParam);
            riceParamPart1 |= (zeroCountPart1 << riceParam);
            riceParamPart2 |= (zeroCountPart2 << riceParam);
            riceParamPart3 |= (zeroCountPart3 << riceParam);

            riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
            riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01];
            riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01];
            riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01];

            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
            pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 1);
            pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 2);
            pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 3);

            pSamplesOut += 4;
        }
    }

    i = (count & ~3);
    while (i < count) {
        /* Rice extraction.
*/
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamPart0 &= riceParamMask;
        riceParamPart0 |= (zeroCountPart0 << riceParam);
        riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01];
        /*riceParamPart0 = (riceParamPart0 >> 1) ^ (~(riceParamPart0 & 0x01) + 1);*/

        /* Sample reconstruction. */
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
        } else {
            pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + 0);
        }

        i += 1;
        pSamplesOut += 1;
    }

    return DRFLAC_TRUE;
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE __m128i drflac__mm_packs_interleaved_epi32(__m128i a, __m128i b)
{
    __m128i r;

    /* Pack.
*/
    r = _mm_packs_epi32(a, b);

    /* a3a2 a1a0 b3b2 b1b0 -> a3a2 b3b2 a1a0 b1b0 */
    r = _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 1, 2, 0));

    /* a3a2 b3b2 a1a0 b1b0 -> a3b3 a2b2 a1b1 a0b0 */
    r = _mm_shufflehi_epi16(r, _MM_SHUFFLE(3, 1, 2, 0));
    r = _mm_shufflelo_epi16(r, _MM_SHUFFLE(3, 1, 2, 0));

    return r;
}
#endif

#if defined(DRFLAC_SUPPORT_SSE41)
static DRFLAC_INLINE __m128i drflac__mm_not_si128(__m128i a)
{
    return _mm_xor_si128(a, _mm_cmpeq_epi32(_mm_setzero_si128(), _mm_setzero_si128()));
}

static DRFLAC_INLINE __m128i drflac__mm_hadd_epi32(__m128i x)
{
    __m128i x64 = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    __m128i x32 = _mm_shufflelo_epi16(x64, _MM_SHUFFLE(1, 0, 3, 2));
    return _mm_add_epi32(x64, x32);
}

static DRFLAC_INLINE __m128i drflac__mm_hadd_epi64(__m128i x)
{
    return _mm_add_epi64(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
}

static DRFLAC_INLINE __m128i drflac__mm_srai_epi64(__m128i x, int count)
{
    /*
    To simplify this we are assuming count < 32. This restriction allows us to work on a low side and a high side. The low side
    is shifted with zero bits, whereas the high side is shifted with sign bits.
    */
    __m128i lo = _mm_srli_epi64(x, count);
    __m128i hi = _mm_srai_epi32(x, count);

    hi = _mm_and_si128(hi, _mm_set_epi32(0xFFFFFFFF, 0, 0xFFFFFFFF, 0)); /* The high part needs to have the low part cleared.
*/

    return _mm_or_si128(lo, hi);
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts0 = 0;
    drflac_uint32 zeroCountParts1 = 0;
    drflac_uint32 zeroCountParts2 = 0;
    drflac_uint32 zeroCountParts3 = 0;
    drflac_uint32 riceParamParts0 = 0;
    drflac_uint32 riceParamParts1 = 0;
    drflac_uint32 riceParamParts2 = 0;
    drflac_uint32 riceParamParts3 = 0;
    __m128i coefficients128_0;
    __m128i coefficients128_4;
    __m128i coefficients128_8;
    __m128i samples128_0;
    __m128i samples128_4;
    __m128i samples128_8;
    __m128i riceParamMask128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = _mm_set1_epi32(riceParamMask);

    /* Pre-load. */
    coefficients128_0 = _mm_setzero_si128();
    coefficients128_4 = _mm_setzero_si128();
    coefficients128_8 = _mm_setzero_si128();

    samples128_0 = _mm_setzero_si128();
    samples128_4 = _mm_setzero_si128();
    samples128_8 = _mm_setzero_si128();

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
#if 1
    {
        int runningOrder = order;

        /* 0 - 3.
*/
        if (runningOrder >= 4) {
            coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0));
            samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break;
                case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break;
                case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4));
            samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break;
                case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break;
                case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8));
            samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break;
                case 2:
coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break;
                case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3));
    }
#else
    /* This causes strict-aliasing warnings with GCC. */
    switch (order)
    {
    case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12];
    case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11];
    case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10];
    case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9];
    case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8];
    case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7];
    case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6];
    case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5];
    case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4];
    case 3:
((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3];
    case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2];
    case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1];
    }
#endif

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        __m128i prediction128;
        __m128i zeroCountPart128;
        __m128i riceParamPart128;

        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0);
        riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0);

        riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128);
        riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam));
        riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01))), _mm_set1_epi32(0x01))); /* <-- SSE2 compatible */
        /*riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_mullo_epi32(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01)), _mm_set1_epi32(0xFFFFFFFF)));*/ /* <-- Only supported from SSE4.1 and is slower in my testing... */

        if (order <= 4) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_0, samples128_0);

                /* Horizontal add and shift.
*/
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        } else if (order <= 8) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_4, samples128_4);
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0));

                /* Horizontal add and shift. */
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        } else {
            for (i = 0; i < 4; i += 1) {
                prediction128 = _mm_mullo_epi32(coefficients128_8, samples128_8);
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_4, samples128_4));
                prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0));

                /* Horizontal add and shift. */
                prediction128 = drflac__mm_hadd_epi32(prediction128);
                prediction128 = _mm_srai_epi32(prediction128, shift);
                prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

                samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4);
                samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
                samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);
                riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
            }
        }

        /* We store samples in groups of 4.
*/
        _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts0 &= riceParamMask;
        riceParamParts0 |= (zeroCountParts0 << riceParam);
        riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts0 = 0;
    drflac_uint32 zeroCountParts1 = 0;
    drflac_uint32 zeroCountParts2 = 0;
    drflac_uint32 zeroCountParts3 = 0;
    drflac_uint32 riceParamParts0 = 0;
    drflac_uint32 riceParamParts1 = 0;
    drflac_uint32 riceParamParts2 = 0;
    drflac_uint32 riceParamParts3 = 0;
    __m128i coefficients128_0;
    __m128i coefficients128_4;
    __m128i coefficients128_8;
    __m128i samples128_0;
    __m128i samples128_4;
    __m128i samples128_8;
    __m128i prediction128;
    __m128i riceParamMask128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    DRFLAC_ASSERT(order <= 12);

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = _mm_set1_epi32(riceParamMask);

    prediction128 = _mm_setzero_si128();

    /* Pre-load.
*/
    coefficients128_0 = _mm_setzero_si128();
    coefficients128_4 = _mm_setzero_si128();
    coefficients128_8 = _mm_setzero_si128();

    samples128_0 = _mm_setzero_si128();
    samples128_4 = _mm_setzero_si128();
    samples128_8 = _mm_setzero_si128();

#if 1
    {
        int runningOrder = order;

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0));
            samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break;
                case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break;
                case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4));
            samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break;
                case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break;
                case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8));
            samples128_8 =
_mm_loadu_si128((const __m128i*)(pSamplesOut - 12));
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break;
                case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break;
                case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break;
            }
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3));
        coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3));
    }
#else
    switch (order)
    {
    case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12];
    case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11];
    case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10];
    case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9];
    case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8];
    case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7];
    case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[-
6];
    case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5];
    case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4];
    case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3];
    case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2];
    case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1];
    }
#endif

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        __m128i zeroCountPart128;
        __m128i riceParamPart128;

        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0);
        riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0);

        riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128);
        riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam));
        riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(1))), _mm_set1_epi32(1)));

        for (i = 0; i < 4; i += 1) {
            prediction128 = _mm_xor_si128(prediction128, prediction128); /* Reset to 0.
*/

            switch (order)
            {
            case 12:
            case 11: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(1, 1, 0, 0))));
            case 10:
            case 9: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(3, 3, 2, 2))));
            case 8:
            case 7: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(1, 1, 0, 0))));
            case 6:
            case 5: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(3, 3, 2, 2))));
            case 4:
            case 3: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(1, 1, 0, 0))));
            case 2:
            case 1: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(3, 3, 2, 2))));
            }

            /* Horizontal add and shift. */
            prediction128 = drflac__mm_hadd_epi64(prediction128);
            prediction128 = drflac__mm_srai_epi64(prediction128, shift);
            prediction128 = _mm_add_epi32(riceParamPart128, prediction128);

            /* Our value should be sitting in prediction128[0]. We need to combine this with our SSE samples. */
            samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4);
            samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4);
            samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4);

            /* Slide our rice parameter down so that the value in position 0 contains the next one to process.
*/
            riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4);
        }

        /* We store samples in groups of 4. */
        _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts0 &= riceParamMask;
        riceParamParts0 |= (zeroCountParts0 << riceParam);
        riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    /* In my testing the order is rarely > 12, so in this case I'm going to simplify the SSE implementation by only handling order <= 12.
*/
    if (lpcOrder > 0 && lpcOrder <= 12) {
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            return drflac__decode_samples_with_residual__rice__sse41_64(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        } else {
            return drflac__decode_samples_with_residual__rice__sse41_32(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        }
    } else {
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac__vst2q_s32(drflac_int32* p, int32x4x2_t x)
{
    vst1q_s32(p+0, x.val[0]);
    vst1q_s32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_u32(drflac_uint32* p, uint32x4x2_t x)
{
    vst1q_u32(p+0, x.val[0]);
    vst1q_u32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_f32(float* p, float32x4x2_t x)
{
    vst1q_f32(p+0, x.val[0]);
    vst1q_f32(p+4, x.val[1]);
}

static DRFLAC_INLINE void drflac__vst2q_s16(drflac_int16* p, int16x4x2_t x)
{
    vst1q_s16(p, vcombine_s16(x.val[0], x.val[1]));
}

static DRFLAC_INLINE void drflac__vst2q_u16(drflac_uint16* p, uint16x4x2_t x)
{
    vst1q_u16(p, vcombine_u16(x.val[0], x.val[1]));
}

static DRFLAC_INLINE int32x4_t drflac__vdupq_n_s32x4(drflac_int32 x3, drflac_int32 x2, drflac_int32 x1, drflac_int32 x0)
{
    drflac_int32 x[4];
    x[3] = x3;
    x[2] = x2;
    x[1] = x1;
    x[0] = x0;
    return vld1q_s32(x);
}

static DRFLAC_INLINE int32x4_t drflac__valignrq_s32_1(int32x4_t a, int32x4_t b)
{
    /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */

    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(a, 0),
        vgetq_lane_s32(b, 3),
        vgetq_lane_s32(b, 2),
        vgetq_lane_s32(b, 1)
    );*/

    return vextq_s32(b, a, 1);
}

static DRFLAC_INLINE uint32x4_t drflac__valignrq_u32_1(uint32x4_t a, uint32x4_t b)
{
    /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */

    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(a, 0),
        vgetq_lane_s32(b, 3),
        vgetq_lane_s32(b, 2),
        vgetq_lane_s32(b, 1)
    );*/

    return vextq_u32(b, a, 1);
}

static DRFLAC_INLINE int32x2_t drflac__vhaddq_s32(int32x4_t x)
{
    /* The sum must end up in position 0. */

    /* Reference */
    /*return vdupq_n_s32(
        vgetq_lane_s32(x, 3) +
        vgetq_lane_s32(x, 2) +
        vgetq_lane_s32(x, 1) +
        vgetq_lane_s32(x, 0)
    );*/

    int32x2_t r = vadd_s32(vget_high_s32(x), vget_low_s32(x));
    return vpadd_s32(r, r);
}

static DRFLAC_INLINE int64x1_t drflac__vhaddq_s64(int64x2_t x)
{
    return vadd_s64(vget_high_s64(x), vget_low_s64(x));
}

static DRFLAC_INLINE int32x4_t drflac__vrevq_s32(int32x4_t x)
{
    /* Reference */
    /*return drflac__vdupq_n_s32x4(
        vgetq_lane_s32(x, 0),
        vgetq_lane_s32(x, 1),
        vgetq_lane_s32(x, 2),
        vgetq_lane_s32(x, 3)
    );*/

    return vrev64q_s32(vcombine_s32(vget_high_s32(x), vget_low_s32(x)));
}

static DRFLAC_INLINE int32x4_t drflac__vnotq_s32(int32x4_t x)
{
    return veorq_s32(x, vdupq_n_s32(0xFFFFFFFF));
}

static DRFLAC_INLINE uint32x4_t drflac__vnotq_u32(uint32x4_t x)
{
    return veorq_u32(x, vdupq_n_u32(0xFFFFFFFF));
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts[4];
    drflac_uint32 riceParamParts[4];
    int32x4_t coefficients128_0;
    int32x4_t coefficients128_4;
    int32x4_t coefficients128_8;
    int32x4_t samples128_0;
    int32x4_t samples128_4;
    int32x4_t samples128_8;
    uint32x4_t riceParamMask128;
    int32x4_t riceParam128;
    int32x2_t shift64;
    uint32x4_t one128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = vdupq_n_u32(riceParamMask);

    riceParam128 = vdupq_n_s32(riceParam);
    shift64 = vdup_n_s32(-shift); /* Negate the shift because we'll be doing a variable shift using vshl_s32(). */
    one128 = vdupq_n_u32(1);

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
    {
        int runningOrder = order;
        drflac_int32 tempC[4] = {0, 0, 0, 0};
        drflac_int32 tempS[4] = {0, 0, 0, 0};

        /* 0 - 3.
*/
        if (runningOrder >= 4) {
            coefficients128_0 = vld1q_s32(coefficients + 0);
            samples128_0 = vld1q_s32(pSamplesOut - 4);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */
                case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */
                case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */
            }

            coefficients128_0 = vld1q_s32(tempC);
            samples128_0 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = vld1q_s32(coefficients + 4);
            samples128_4 = vld1q_s32(pSamplesOut - 8);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */
                case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */
                case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */
            }

            coefficients128_4 = vld1q_s32(tempC);
            samples128_4 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = vld1q_s32(coefficients + 8);
            samples128_8 = vld1q_s32(pSamplesOut - 12);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */
                case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */
                case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */
            }

            coefficients128_8 = vld1q_s32(tempC);
            samples128_8 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above.
*/
        coefficients128_0 = drflac__vrevq_s32(coefficients128_0);
        coefficients128_4 = drflac__vrevq_s32(coefficients128_4);
        coefficients128_8 = drflac__vrevq_s32(coefficients128_8);
    }

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        int32x4_t prediction128;
        int32x2_t prediction64;
        uint32x4_t zeroCountPart128;
        uint32x4_t riceParamPart128;

        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = vld1q_u32(zeroCountParts);
        riceParamPart128 = vld1q_u32(riceParamParts);

        riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128);
        riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128));
        riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128));

        if (order <= 4) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_0, samples128_0);

                /* Horizontal add and shift.
*/
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        } else if (order <= 8) {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_4, samples128_4);
                prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        } else {
            for (i = 0; i < 4; i += 1) {
                prediction128 = vmulq_s32(coefficients128_8, samples128_8);
                prediction128 = vmlaq_s32(prediction128, coefficients128_4, samples128_4);
                prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0);

                /* Horizontal add and shift. */
                prediction64 = drflac__vhaddq_s32(prediction128);
                prediction64 = vshl_s32(prediction64, shift64);
                prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128)));

                samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8);
                samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
                samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0);
                riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
            }
        }

        /* We store samples in groups of 4.
*/
        vst1q_s32(pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction. */
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts[0] &= riceParamMask;
        riceParamParts[0] |= (zeroCountParts[0] << riceParam);
        riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    int i;
    drflac_uint32 riceParamMask;
    drflac_int32* pDecodedSamples = pSamplesOut;
    drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3);
    drflac_uint32 zeroCountParts[4];
    drflac_uint32 riceParamParts[4];
    int32x4_t coefficients128_0;
    int32x4_t coefficients128_4;
    int32x4_t coefficients128_8;
    int32x4_t samples128_0;
    int32x4_t samples128_4;
    int32x4_t samples128_8;
    uint32x4_t riceParamMask128;
    int32x4_t riceParam128;
    int64x1_t shift64;
    uint32x4_t one128;
    int64x2_t prediction128 = { 0 };
    uint32x4_t zeroCountPart128;
    uint32x4_t riceParamPart128;

    const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF};

    riceParamMask = (drflac_uint32)~((~0UL) << riceParam);
    riceParamMask128 = vdupq_n_u32(riceParamMask);

    riceParam128 = vdupq_n_s32(riceParam);
    shift64 = vdup_n_s64(-shift); /* Negate the shift because we'll be doing a variable shift using vshl_s64().
*/
    one128 = vdupq_n_u32(1);

    /*
    Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than
    what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results
    in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted
    so I think there's opportunity for this to be simplified.
    */
    {
        int runningOrder = order;
        drflac_int32 tempC[4] = {0, 0, 0, 0};
        drflac_int32 tempS[4] = {0, 0, 0, 0};

        /* 0 - 3. */
        if (runningOrder >= 4) {
            coefficients128_0 = vld1q_s32(coefficients + 0);
            samples128_0 = vld1q_s32(pSamplesOut - 4);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */
                case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */
                case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */
            }

            coefficients128_0 = vld1q_s32(tempC);
            samples128_0 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 4 - 7 */
        if (runningOrder >= 4) {
            coefficients128_4 = vld1q_s32(coefficients + 4);
            samples128_4 = vld1q_s32(pSamplesOut - 8);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */
                case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */
                case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */
            }

            coefficients128_4 = vld1q_s32(tempC);
            samples128_4 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* 8 - 11 */
        if (runningOrder == 4) {
            coefficients128_8 = vld1q_s32(coefficients + 8);
            samples128_8 = vld1q_s32(pSamplesOut - 12);
            runningOrder -= 4;
        } else {
            switch (runningOrder) {
                case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */
                case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */
                case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */
            }

            coefficients128_8 = vld1q_s32(tempC);
            samples128_8 = vld1q_s32(tempS);
            runningOrder = 0;
        }

        /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */
        coefficients128_0 = drflac__vrevq_s32(coefficients128_0);
        coefficients128_4 = drflac__vrevq_s32(coefficients128_4);
        coefficients128_8 = drflac__vrevq_s32(coefficients128_8);
    }

    /* For this version we are doing one sample at a time. */
    while (pDecodedSamples < pDecodedSamplesEnd) {
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) ||
            !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) {
            return DRFLAC_FALSE;
        }

        zeroCountPart128 = vld1q_u32(zeroCountParts);
        riceParamPart128 = vld1q_u32(riceParamParts);

        riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128);
        riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128));
        riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128));

        for (i = 0; i < 4; i += 1) {
            int64x1_t prediction64;

            prediction128 = veorq_s64(prediction128, prediction128); /* Reset to 0.
*/
            switch (order)
            {
            case 12:
            case 11: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_8), vget_low_s32(samples128_8)));
            case 10:
            case 9: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_8), vget_high_s32(samples128_8)));
            case 8:
            case 7: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_4), vget_low_s32(samples128_4)));
            case 6:
            case 5: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_4), vget_high_s32(samples128_4)));
            case 4:
            case 3: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_0), vget_low_s32(samples128_0)));
            case 2:
            case 1: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_0), vget_high_s32(samples128_0)));
            }

            /* Horizontal add and shift. */
            prediction64 = drflac__vhaddq_s64(prediction128);
            prediction64 = vshl_s64(prediction64, shift64);
            prediction64 = vadd_s64(prediction64, vdup_n_s64(vgetq_lane_u32(riceParamPart128, 0)));

            /* Our value should be sitting in prediction64[0]. We need to combine this with our NEON samples. */
            samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8);
            samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4);
            samples128_0 = drflac__valignrq_s32_1(vcombine_s32(vreinterpret_s32_s64(prediction64), vdup_n_s32(0)), samples128_0);

            /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */
            riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128);
        }

        /* We store samples in groups of 4. */
        vst1q_s32(pDecodedSamples, samples128_0);
        pDecodedSamples += 4;
    }

    /* Make sure we process the last few samples. */
    i = (count & ~3);
    while (i < (int)count) {
        /* Rice extraction.
*/
        if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) {
            return DRFLAC_FALSE;
        }

        /* Rice reconstruction. */
        riceParamParts[0] &= riceParamMask;
        riceParamParts[0] |= (zeroCountParts[0] << riceParam);
        riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01];

        /* Sample reconstruction. */
        pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples);

        i += 1;
        pDecodedSamples += 1;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples_with_residual__rice__neon(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(pSamplesOut != NULL);

    /* In my testing the order is rarely > 12, so in this case I'm going to simplify the NEON implementation by only handling order <= 12.
*/
    if (lpcOrder > 0 && lpcOrder <= 12) {
        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            return drflac__decode_samples_with_residual__rice__neon_64(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        } else {
            return drflac__decode_samples_with_residual__rice__neon_32(bs, count, riceParam, lpcOrder, lpcShift, coefficients, pSamplesOut);
        }
    } else {
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    }
}
#endif

static drflac_bool32 drflac__decode_samples_with_residual__rice(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
#if defined(DRFLAC_SUPPORT_SSE41)
    if (drflac__gIsSSE41Supported) {
        return drflac__decode_samples_with_residual__rice__sse41(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported) {
        return drflac__decode_samples_with_residual__rice__neon(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    } else
#endif
    {
        /* Scalar fallback. */
    #if 0
        return drflac__decode_samples_with_residual__rice__reference(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    #else
        return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pSamplesOut);
    #endif
    }
}

/* Reads and seeks past a string of residual values as Rice codes. The decoder should be sitting on the first bit of the Rice codes.
*/
static drflac_bool32 drflac__read_and_seek_residual__rice(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);

    for (i = 0; i < count; ++i) {
        if (!drflac__seek_rice_parts(bs, riceParam)) {
            return DRFLAC_FALSE;
        }
    }

    return DRFLAC_TRUE;
}

#if defined(__clang__)
__attribute__((no_sanitize("signed-integer-overflow")))
#endif
static drflac_bool32 drflac__decode_samples_with_residual__unencoded(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 unencodedBitsPerSample, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pSamplesOut)
{
    drflac_uint32 i;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(unencodedBitsPerSample <= 31); /* <-- unencodedBitsPerSample is a 5 bit number, so cannot exceed 31. */
    DRFLAC_ASSERT(pSamplesOut != NULL);

    for (i = 0; i < count; ++i) {
        if (unencodedBitsPerSample > 0) {
            if (!drflac__read_int32(bs, unencodedBitsPerSample, pSamplesOut + i)) {
                return DRFLAC_FALSE;
            }
        } else {
            pSamplesOut[i] = 0;
        }

        if (drflac__use_64_bit_prediction(bitsPerSample, lpcOrder, lpcPrecision)) {
            pSamplesOut[i] += drflac__calculate_prediction_64(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        } else {
            pSamplesOut[i] += drflac__calculate_prediction_32(lpcOrder, lpcShift, coefficients, pSamplesOut + i);
        }
    }

    return DRFLAC_TRUE;
}


/*
Reads and decodes the residual for the sub-frame the decoder is currently sitting on. This function should be called
when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be ignored. The
<blockSize> and <order> parameters are used to determine how many residual values need to be decoded.
*/
static drflac_bool32 drflac__decode_samples_with_residual(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 blockSize, drflac_uint32 lpcOrder, drflac_int32 lpcShift, drflac_uint32 lpcPrecision, const drflac_int32* coefficients, drflac_int32* pDecodedSamples)
{
    drflac_uint8 residualMethod;
    drflac_uint8 partitionOrder;
    drflac_uint32 samplesInPartition;
    drflac_uint32 partitionsRemaining;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(blockSize != 0);
    DRFLAC_ASSERT(pDecodedSamples != NULL); /* <-- Should we allow NULL, in which case we just seek past the residual rather than do a full decode? */

    if (!drflac__read_uint8(bs, 2, &residualMethod)) {
        return DRFLAC_FALSE;
    }

    if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
        return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */
    }

    /* Ignore the first <order> values. */
    pDecodedSamples += lpcOrder;

    if (!drflac__read_uint8(bs, 4, &partitionOrder)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC spec:
      The Rice partition order in a Rice-coded residual section must be less than or equal to 8.
    */
    if (partitionOrder > 8) {
        return DRFLAC_FALSE;
    }

    /* Validation check.
*/
    if ((blockSize / (1 << partitionOrder)) < lpcOrder) {
        return DRFLAC_FALSE;
    }

    samplesInPartition = (blockSize / (1 << partitionOrder)) - lpcOrder;
    partitionsRemaining = (1 << partitionOrder);
    for (;;) {
        drflac_uint8 riceParam = 0;
        if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) {
            if (!drflac__read_uint8(bs, 4, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 15) {
                riceParam = 0xFF;
            }
        } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
            if (!drflac__read_uint8(bs, 5, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 31) {
                riceParam = 0xFF;
            }
        }

        if (riceParam != 0xFF) {
            if (!drflac__decode_samples_with_residual__rice(bs, bitsPerSample, samplesInPartition, riceParam, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
                return DRFLAC_FALSE;
            }
        } else {
            drflac_uint8 unencodedBitsPerSample = 0;
            if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__decode_samples_with_residual__unencoded(bs, bitsPerSample, samplesInPartition, unencodedBitsPerSample, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
                return DRFLAC_FALSE;
            }
        }

        pDecodedSamples += samplesInPartition;

        if (partitionsRemaining == 1) {
            break;
        }

        partitionsRemaining -= 1;

        if (partitionOrder != 0) {
            samplesInPartition = blockSize / (1 << partitionOrder);
        }
    }

    return DRFLAC_TRUE;
}

/*
Reads and seeks past the residual for the sub-frame the decoder is currently sitting on. This function should be called
when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be set to 0. The
<blockSize> and <order> parameters are used to determine how many residual values need to be decoded.
*/
static drflac_bool32 drflac__read_and_seek_residual(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 order)
{
    drflac_uint8 residualMethod;
    drflac_uint8 partitionOrder;
    drflac_uint32 samplesInPartition;
    drflac_uint32 partitionsRemaining;

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(blockSize != 0);

    if (!drflac__read_uint8(bs, 2, &residualMethod)) {
        return DRFLAC_FALSE;
    }

    if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
        return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */
    }

    if (!drflac__read_uint8(bs, 4, &partitionOrder)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC spec:
      The Rice partition order in a Rice-coded residual section must be less than or equal to 8.
    */
    if (partitionOrder > 8) {
        return DRFLAC_FALSE;
    }

    /* Validation check.
*/
    if ((blockSize / (1 << partitionOrder)) <= order) {
        return DRFLAC_FALSE;
    }

    samplesInPartition = (blockSize / (1 << partitionOrder)) - order;
    partitionsRemaining = (1 << partitionOrder);
    for (;;)
    {
        drflac_uint8 riceParam = 0;
        if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) {
            if (!drflac__read_uint8(bs, 4, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 15) {
                riceParam = 0xFF;
            }
        } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) {
            if (!drflac__read_uint8(bs, 5, &riceParam)) {
                return DRFLAC_FALSE;
            }
            if (riceParam == 31) {
                riceParam = 0xFF;
            }
        }

        if (riceParam != 0xFF) {
            if (!drflac__read_and_seek_residual__rice(bs, samplesInPartition, riceParam)) {
                return DRFLAC_FALSE;
            }
        } else {
            drflac_uint8 unencodedBitsPerSample = 0;
            if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) {
                return DRFLAC_FALSE;
            }

            if (!drflac__seek_bits(bs, unencodedBitsPerSample * samplesInPartition)) {
                return DRFLAC_FALSE;
            }
        }


        if (partitionsRemaining == 1) {
            break;
        }

        partitionsRemaining -= 1;
        samplesInPartition = blockSize / (1 << partitionOrder);
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__decode_samples__constant(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    /* Only a single sample needs to be decoded here. */
    drflac_int32 sample;
    if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
        return DRFLAC_FALSE;
    }

    /*
    We don't really need to expand this, but it does simplify the process of reading samples. If this becomes a performance issue (unlikely)
    we'll want to look at a more efficient way.
    */
    for (i = 0; i < blockSize; ++i) {
        pDecodedSamples[i] = sample;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples__verbatim(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    for (i = 0; i < blockSize; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples__fixed(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples)
{
    drflac_uint32 i;

    static drflac_int32 lpcCoefficientsTable[5][4] = {
        {0,  0, 0,  0},
        {1,  0, 0,  0},
        {2, -1, 0,  0},
        {3, -3, 1,  0},
        {4, -6, 4, -1}
    };

    /* Warm up samples and coefficients. */
    for (i = 0; i < lpcOrder; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    if (!drflac__decode_samples_with_residual(bs, subframeBitsPerSample, blockSize, lpcOrder, 0, 4, lpcCoefficientsTable[lpcOrder], pDecodedSamples)) {
        return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_samples__lpc(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 bitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples)
{
    drflac_uint8 i;
    drflac_uint8 lpcPrecision;
    drflac_int8 lpcShift;
    drflac_int32 coefficients[32];

    /* Warm up samples.
*/
    for (i = 0; i < lpcOrder; ++i) {
        drflac_int32 sample;
        if (!drflac__read_int32(bs, bitsPerSample, &sample)) {
            return DRFLAC_FALSE;
        }

        pDecodedSamples[i] = sample;
    }

    if (!drflac__read_uint8(bs, 4, &lpcPrecision)) {
        return DRFLAC_FALSE;
    }
    if (lpcPrecision == 15) {
        return DRFLAC_FALSE; /* Invalid. */
    }
    lpcPrecision += 1;

    if (!drflac__read_int8(bs, 5, &lpcShift)) {
        return DRFLAC_FALSE;
    }

    /*
    From the FLAC specification:

        Quantized linear predictor coefficient shift needed in bits (NOTE: this number is signed two's-complement)

    Emphasis on the "signed two's-complement". In practice there do not seem to be any encoders or decoders that support negative shifts. For now dr_flac is
    not going to support negative shifts as I don't have any reference files. However, when a reference file comes through I will consider adding support.
    */
    if (lpcShift < 0) {
        return DRFLAC_FALSE;
    }

    DRFLAC_ZERO_MEMORY(coefficients, sizeof(coefficients));
    for (i = 0; i < lpcOrder; ++i) {
        if (!drflac__read_int32(bs, lpcPrecision, coefficients + i)) {
            return DRFLAC_FALSE;
        }
    }

    if (!drflac__decode_samples_with_residual(bs, bitsPerSample, blockSize, lpcOrder, lpcShift, lpcPrecision, coefficients, pDecodedSamples)) {
        return DRFLAC_FALSE;
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac__read_next_flac_frame_header(drflac_bs* bs, drflac_uint8 streaminfoBitsPerSample, drflac_frame_header* header)
{
    const drflac_uint32 sampleRateTable[12] = {0, 88200, 176400, 192000, 8000, 16000, 22050, 24000, 32000, 44100, 48000, 96000};
    const drflac_uint8 bitsPerSampleTable[8] = {0, 8, 12, (drflac_uint8)-1, 16, 20, 24, (drflac_uint8)-1}; /* -1 = reserved. */

    DRFLAC_ASSERT(bs != NULL);
    DRFLAC_ASSERT(header != NULL);

    /* Keep looping until we find a valid sync code.
*/5211for (;;) {5212drflac_uint8 crc8 = 0xCE; /* 0xCE = drflac_crc8(0, 0x3FFE, 14); */5213drflac_uint8 reserved = 0;5214drflac_uint8 blockingStrategy = 0;5215drflac_uint8 blockSize = 0;5216drflac_uint8 sampleRate = 0;5217drflac_uint8 channelAssignment = 0;5218drflac_uint8 bitsPerSample = 0;5219drflac_bool32 isVariableBlockSize;52205221if (!drflac__find_and_seek_to_next_sync_code(bs)) {5222return DRFLAC_FALSE;5223}52245225if (!drflac__read_uint8(bs, 1, &reserved)) {5226return DRFLAC_FALSE;5227}5228if (reserved == 1) {5229continue;5230}5231crc8 = drflac_crc8(crc8, reserved, 1);52325233if (!drflac__read_uint8(bs, 1, &blockingStrategy)) {5234return DRFLAC_FALSE;5235}5236crc8 = drflac_crc8(crc8, blockingStrategy, 1);52375238if (!drflac__read_uint8(bs, 4, &blockSize)) {5239return DRFLAC_FALSE;5240}5241if (blockSize == 0) {5242continue;5243}5244crc8 = drflac_crc8(crc8, blockSize, 4);52455246if (!drflac__read_uint8(bs, 4, &sampleRate)) {5247return DRFLAC_FALSE;5248}5249crc8 = drflac_crc8(crc8, sampleRate, 4);52505251if (!drflac__read_uint8(bs, 4, &channelAssignment)) {5252return DRFLAC_FALSE;5253}5254if (channelAssignment > 10) {5255continue;5256}5257crc8 = drflac_crc8(crc8, channelAssignment, 4);52585259if (!drflac__read_uint8(bs, 3, &bitsPerSample)) {5260return DRFLAC_FALSE;5261}5262if (bitsPerSample == 3 || bitsPerSample == 7) {5263continue;5264}5265crc8 = drflac_crc8(crc8, bitsPerSample, 3);526652675268if (!drflac__read_uint8(bs, 1, &reserved)) {5269return DRFLAC_FALSE;5270}5271if (reserved == 1) {5272continue;5273}5274crc8 = drflac_crc8(crc8, reserved, 1);527552765277isVariableBlockSize = blockingStrategy == 1;5278if (isVariableBlockSize) {5279drflac_uint64 pcmFrameNumber;5280drflac_result result = drflac__read_utf8_coded_number(bs, &pcmFrameNumber, &crc8);5281if (result != DRFLAC_SUCCESS) {5282if (result == DRFLAC_AT_END) {5283return DRFLAC_FALSE;5284} else {5285continue;5286}5287}5288header->flacFrameNumber = 0;5289header->pcmFrameNumber = pcmFrameNumber;5290} else 
{5291drflac_uint64 flacFrameNumber = 0;5292drflac_result result = drflac__read_utf8_coded_number(bs, &flacFrameNumber, &crc8);5293if (result != DRFLAC_SUCCESS) {5294if (result == DRFLAC_AT_END) {5295return DRFLAC_FALSE;5296} else {5297continue;5298}5299}5300header->flacFrameNumber = (drflac_uint32)flacFrameNumber; /* <-- Safe cast. */5301header->pcmFrameNumber = 0;5302}530353045305DRFLAC_ASSERT(blockSize > 0);5306if (blockSize == 1) {5307header->blockSizeInPCMFrames = 192;5308} else if (blockSize <= 5) {5309DRFLAC_ASSERT(blockSize >= 2);5310header->blockSizeInPCMFrames = 576 * (1 << (blockSize - 2));5311} else if (blockSize == 6) {5312if (!drflac__read_uint16(bs, 8, &header->blockSizeInPCMFrames)) {5313return DRFLAC_FALSE;5314}5315crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 8);5316header->blockSizeInPCMFrames += 1;5317} else if (blockSize == 7) {5318if (!drflac__read_uint16(bs, 16, &header->blockSizeInPCMFrames)) {5319return DRFLAC_FALSE;5320}5321crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 16);5322if (header->blockSizeInPCMFrames == 0xFFFF) {5323return DRFLAC_FALSE; /* Frame is too big. This is the size of the frame minus 1. The STREAMINFO block defines the max block size which is 16-bits. Adding one will make it 17 bits and therefore too big. 
*/5324}5325header->blockSizeInPCMFrames += 1;5326} else {5327DRFLAC_ASSERT(blockSize >= 8);5328header->blockSizeInPCMFrames = 256 * (1 << (blockSize - 8));5329}533053315332if (sampleRate <= 11) {5333header->sampleRate = sampleRateTable[sampleRate];5334} else if (sampleRate == 12) {5335if (!drflac__read_uint32(bs, 8, &header->sampleRate)) {5336return DRFLAC_FALSE;5337}5338crc8 = drflac_crc8(crc8, header->sampleRate, 8);5339header->sampleRate *= 1000;5340} else if (sampleRate == 13) {5341if (!drflac__read_uint32(bs, 16, &header->sampleRate)) {5342return DRFLAC_FALSE;5343}5344crc8 = drflac_crc8(crc8, header->sampleRate, 16);5345} else if (sampleRate == 14) {5346if (!drflac__read_uint32(bs, 16, &header->sampleRate)) {5347return DRFLAC_FALSE;5348}5349crc8 = drflac_crc8(crc8, header->sampleRate, 16);5350header->sampleRate *= 10;5351} else {5352continue; /* Invalid. Assume an invalid block. */5353}535453555356header->channelAssignment = channelAssignment;53575358header->bitsPerSample = bitsPerSampleTable[bitsPerSample];5359if (header->bitsPerSample == 0) {5360header->bitsPerSample = streaminfoBitsPerSample;5361}53625363if (header->bitsPerSample != streaminfoBitsPerSample) {5364/* If this subframe has a different bitsPerSample then streaminfo or the first frame, reject it */5365return DRFLAC_FALSE;5366}53675368if (!drflac__read_uint8(bs, 8, &header->crc8)) {5369return DRFLAC_FALSE;5370}53715372#ifndef DR_FLAC_NO_CRC5373if (header->crc8 != crc8) {5374continue; /* CRC mismatch. Loop back to the top and find the next sync code. */5375}5376#endif5377return DRFLAC_TRUE;5378}5379}53805381static drflac_bool32 drflac__read_subframe_header(drflac_bs* bs, drflac_subframe* pSubframe)5382{5383drflac_uint8 header;5384int type;53855386if (!drflac__read_uint8(bs, 8, &header)) {5387return DRFLAC_FALSE;5388}53895390/* First bit should always be 0. 
*/5391if ((header & 0x80) != 0) {5392return DRFLAC_FALSE;5393}53945395type = (header & 0x7E) >> 1;5396if (type == 0) {5397pSubframe->subframeType = DRFLAC_SUBFRAME_CONSTANT;5398} else if (type == 1) {5399pSubframe->subframeType = DRFLAC_SUBFRAME_VERBATIM;5400} else {5401if ((type & 0x20) != 0) {5402pSubframe->subframeType = DRFLAC_SUBFRAME_LPC;5403pSubframe->lpcOrder = (drflac_uint8)(type & 0x1F) + 1;5404} else if ((type & 0x08) != 0) {5405pSubframe->subframeType = DRFLAC_SUBFRAME_FIXED;5406pSubframe->lpcOrder = (drflac_uint8)(type & 0x07);5407if (pSubframe->lpcOrder > 4) {5408pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED;5409pSubframe->lpcOrder = 0;5410}5411} else {5412pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED;5413}5414}54155416if (pSubframe->subframeType == DRFLAC_SUBFRAME_RESERVED) {5417return DRFLAC_FALSE;5418}54195420/* Wasted bits per sample. */5421pSubframe->wastedBitsPerSample = 0;5422if ((header & 0x01) == 1) {5423unsigned int wastedBitsPerSample;5424if (!drflac__seek_past_next_set_bit(bs, &wastedBitsPerSample)) {5425return DRFLAC_FALSE;5426}5427pSubframe->wastedBitsPerSample = (drflac_uint8)wastedBitsPerSample + 1;5428}54295430return DRFLAC_TRUE;5431}54325433static drflac_bool32 drflac__decode_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex, drflac_int32* pDecodedSamplesOut)5434{5435drflac_subframe* pSubframe;5436drflac_uint32 subframeBitsPerSample;54375438DRFLAC_ASSERT(bs != NULL);5439DRFLAC_ASSERT(frame != NULL);54405441pSubframe = frame->subframes + subframeIndex;5442if (!drflac__read_subframe_header(bs, pSubframe)) {5443return DRFLAC_FALSE;5444}54455446/* Side channels require an extra bit per sample. Took a while to figure that one out... 
*/5447subframeBitsPerSample = frame->header.bitsPerSample;5448if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) {5449subframeBitsPerSample += 1;5450} else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) {5451subframeBitsPerSample += 1;5452}54535454if (subframeBitsPerSample > 32) {5455/* libFLAC and ffmpeg reject 33-bit subframes as well */5456return DRFLAC_FALSE;5457}54585459/* Need to handle wasted bits per sample. */5460if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) {5461return DRFLAC_FALSE;5462}5463subframeBitsPerSample -= pSubframe->wastedBitsPerSample;54645465pSubframe->pSamplesS32 = pDecodedSamplesOut;54665467switch (pSubframe->subframeType)5468{5469case DRFLAC_SUBFRAME_CONSTANT:5470{5471drflac__decode_samples__constant(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32);5472} break;54735474case DRFLAC_SUBFRAME_VERBATIM:5475{5476drflac__decode_samples__verbatim(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32);5477} break;54785479case DRFLAC_SUBFRAME_FIXED:5480{5481drflac__decode_samples__fixed(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32);5482} break;54835484case DRFLAC_SUBFRAME_LPC:5485{5486drflac__decode_samples__lpc(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32);5487} break;54885489default: return DRFLAC_FALSE;5490}54915492return DRFLAC_TRUE;5493}54945495static drflac_bool32 drflac__seek_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex)5496{5497drflac_subframe* pSubframe;5498drflac_uint32 subframeBitsPerSample;54995500DRFLAC_ASSERT(bs != NULL);5501DRFLAC_ASSERT(frame != NULL);55025503pSubframe = frame->subframes + subframeIndex;5504if 
(!drflac__read_subframe_header(bs, pSubframe)) {5505return DRFLAC_FALSE;5506}55075508/* Side channels require an extra bit per sample. Took a while to figure that one out... */5509subframeBitsPerSample = frame->header.bitsPerSample;5510if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) {5511subframeBitsPerSample += 1;5512} else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) {5513subframeBitsPerSample += 1;5514}55155516/* Need to handle wasted bits per sample. */5517if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) {5518return DRFLAC_FALSE;5519}5520subframeBitsPerSample -= pSubframe->wastedBitsPerSample;55215522pSubframe->pSamplesS32 = NULL;55235524switch (pSubframe->subframeType)5525{5526case DRFLAC_SUBFRAME_CONSTANT:5527{5528if (!drflac__seek_bits(bs, subframeBitsPerSample)) {5529return DRFLAC_FALSE;5530}5531} break;55325533case DRFLAC_SUBFRAME_VERBATIM:5534{5535unsigned int bitsToSeek = frame->header.blockSizeInPCMFrames * subframeBitsPerSample;5536if (!drflac__seek_bits(bs, bitsToSeek)) {5537return DRFLAC_FALSE;5538}5539} break;55405541case DRFLAC_SUBFRAME_FIXED:5542{5543unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample;5544if (!drflac__seek_bits(bs, bitsToSeek)) {5545return DRFLAC_FALSE;5546}55475548if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) {5549return DRFLAC_FALSE;5550}5551} break;55525553case DRFLAC_SUBFRAME_LPC:5554{5555drflac_uint8 lpcPrecision;55565557unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample;5558if (!drflac__seek_bits(bs, bitsToSeek)) {5559return DRFLAC_FALSE;5560}55615562if (!drflac__read_uint8(bs, 4, &lpcPrecision)) {5563return DRFLAC_FALSE;5564}5565if (lpcPrecision == 15) {5566return DRFLAC_FALSE; /* Invalid. 
*/5567}5568lpcPrecision += 1;556955705571bitsToSeek = (pSubframe->lpcOrder * lpcPrecision) + 5; /* +5 for shift. */5572if (!drflac__seek_bits(bs, bitsToSeek)) {5573return DRFLAC_FALSE;5574}55755576if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) {5577return DRFLAC_FALSE;5578}5579} break;55805581default: return DRFLAC_FALSE;5582}55835584return DRFLAC_TRUE;5585}558655875588static DRFLAC_INLINE drflac_uint8 drflac__get_channel_count_from_channel_assignment(drflac_int8 channelAssignment)5589{5590drflac_uint8 lookup[] = {1, 2, 3, 4, 5, 6, 7, 8, 2, 2, 2};55915592DRFLAC_ASSERT(channelAssignment <= 10);5593return lookup[channelAssignment];5594}55955596static drflac_result drflac__decode_flac_frame(drflac* pFlac)5597{5598int channelCount;5599int i;5600drflac_uint8 paddingSizeInBits;5601drflac_uint16 desiredCRC16;5602#ifndef DR_FLAC_NO_CRC5603drflac_uint16 actualCRC16;5604#endif56055606/* This function should be called while the stream is sitting on the first byte after the frame header. */5607DRFLAC_ZERO_MEMORY(pFlac->currentFLACFrame.subframes, sizeof(pFlac->currentFLACFrame.subframes));56085609/* The frame block size must never be larger than the maximum block size defined by the FLAC stream. */5610if (pFlac->currentFLACFrame.header.blockSizeInPCMFrames > pFlac->maxBlockSizeInPCMFrames) {5611return DRFLAC_ERROR;5612}56135614/* The number of channels in the frame must match the channel count from the STREAMINFO block. 
    */
    channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
    if (channelCount != (int)pFlac->channels) {
        return DRFLAC_ERROR;
    }

    for (i = 0; i < channelCount; ++i) {
        if (!drflac__decode_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i, pFlac->pDecodedSamples + (pFlac->currentFLACFrame.header.blockSizeInPCMFrames * i))) {
            return DRFLAC_ERROR;
        }
    }

    paddingSizeInBits = (drflac_uint8)(DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7);
    if (paddingSizeInBits > 0) {
        drflac_uint8 padding = 0;
        if (!drflac__read_uint8(&pFlac->bs, paddingSizeInBits, &padding)) {
            return DRFLAC_AT_END;
        }
    }

#ifndef DR_FLAC_NO_CRC
    actualCRC16 = drflac__flush_crc16(&pFlac->bs);
#endif
    if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) {
        return DRFLAC_AT_END;
    }

#ifndef DR_FLAC_NO_CRC
    if (actualCRC16 != desiredCRC16) {
        return DRFLAC_CRC_MISMATCH;    /* CRC mismatch. */
    }
#endif

    pFlac->currentFLACFrame.pcmFramesRemaining = pFlac->currentFLACFrame.header.blockSizeInPCMFrames;

    return DRFLAC_SUCCESS;
}

static drflac_result drflac__seek_flac_frame(drflac* pFlac)
{
    int channelCount;
    int i;
    drflac_uint16 desiredCRC16;
#ifndef DR_FLAC_NO_CRC
    drflac_uint16 actualCRC16;
#endif

    channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
    for (i = 0; i < channelCount; ++i) {
        if (!drflac__seek_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i)) {
            return DRFLAC_ERROR;
        }
    }

    /* Padding. */
    if (!drflac__seek_bits(&pFlac->bs, DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7)) {
        return DRFLAC_ERROR;
    }

    /* CRC. */
#ifndef DR_FLAC_NO_CRC
    actualCRC16 = drflac__flush_crc16(&pFlac->bs);
#endif
    if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) {
        return DRFLAC_AT_END;
    }

#ifndef DR_FLAC_NO_CRC
    if (actualCRC16 != desiredCRC16) {
        return DRFLAC_CRC_MISMATCH;    /* CRC mismatch. */
    }
#endif

    return DRFLAC_SUCCESS;
}

static drflac_bool32 drflac__read_and_decode_next_flac_frame(drflac* pFlac)
{
    DRFLAC_ASSERT(pFlac != NULL);

    for (;;) {
        drflac_result result;

        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }

        result = drflac__decode_flac_frame(pFlac);
        if (result != DRFLAC_SUCCESS) {
            if (result == DRFLAC_CRC_MISMATCH) {
                continue;   /* CRC mismatch. Skip to the next frame. */
            } else {
                return DRFLAC_FALSE;
            }
        }

        return DRFLAC_TRUE;
    }
}

static void drflac__get_pcm_frame_range_of_current_flac_frame(drflac* pFlac, drflac_uint64* pFirstPCMFrame, drflac_uint64* pLastPCMFrame)
{
    drflac_uint64 firstPCMFrame;
    drflac_uint64 lastPCMFrame;

    DRFLAC_ASSERT(pFlac != NULL);

    firstPCMFrame = pFlac->currentFLACFrame.header.pcmFrameNumber;
    if (firstPCMFrame == 0) {
        firstPCMFrame = ((drflac_uint64)pFlac->currentFLACFrame.header.flacFrameNumber) * pFlac->maxBlockSizeInPCMFrames;
    }

    lastPCMFrame = firstPCMFrame + pFlac->currentFLACFrame.header.blockSizeInPCMFrames;
    if (lastPCMFrame > 0) {
        lastPCMFrame -= 1; /* Needs to be zero based. */
    }

    if (pFirstPCMFrame) {
        *pFirstPCMFrame = firstPCMFrame;
    }
    if (pLastPCMFrame) {
        *pLastPCMFrame = lastPCMFrame;
    }
}

static drflac_bool32 drflac__seek_to_first_frame(drflac* pFlac)
{
    drflac_bool32 result;

    DRFLAC_ASSERT(pFlac != NULL);

    result = drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes);

    DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame));
    pFlac->currentPCMFrame = 0;

    return result;
}

static DRFLAC_INLINE drflac_result drflac__seek_to_next_flac_frame(drflac* pFlac)
{
    /* This function should only ever be called while the decoder is sitting on the first byte past the FRAME_HEADER section. */
    DRFLAC_ASSERT(pFlac != NULL);
    return drflac__seek_flac_frame(pFlac);
}


static drflac_uint64 drflac__seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 pcmFramesToSeek)
{
    drflac_uint64 pcmFramesRead = 0;
    while (pcmFramesToSeek > 0) {
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return. */
            }
        } else {
            if (pFlac->currentFLACFrame.pcmFramesRemaining > pcmFramesToSeek) {
                pcmFramesRead   += pcmFramesToSeek;
                pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)pcmFramesToSeek;   /* <-- Safe cast. Will always be < currentFrame.pcmFramesRemaining < 65536. */
                pcmFramesToSeek  = 0;
            } else {
                pcmFramesRead   += pFlac->currentFLACFrame.pcmFramesRemaining;
                pcmFramesToSeek -= pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
            }
        }
    }

    pFlac->currentPCMFrame += pcmFramesRead;
    return pcmFramesRead;
}


static drflac_bool32 drflac__seek_to_pcm_frame__brute_force(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_bool32 isMidFrame = DRFLAC_FALSE;
    drflac_uint64 runningPCMFrameCount;

    DRFLAC_ASSERT(pFlac != NULL);

    /* If we are seeking forward we start from the current position. Otherwise we need to start all the way from the start of the file. */
    if (pcmFrameIndex >= pFlac->currentPCMFrame) {
        /* Seeking forward. Need to seek from the current position. */
        runningPCMFrameCount = pFlac->currentPCMFrame;

        /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */
        if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                return DRFLAC_FALSE;
            }
        } else {
            isMidFrame = DRFLAC_TRUE;
        }
    } else {
        /* Seeking backwards. Need to seek from the start of the file. */
        runningPCMFrameCount = 0;

        /* Move back to the start. */
        if (!drflac__seek_to_first_frame(pFlac)) {
            return DRFLAC_FALSE;
        }

        /* Decode the first frame in preparation for sample-exact seeking below. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }

    /*
    We need to find the frame that contains the target sample as quickly as possible. To do this, we iterate over each frame and inspect its
    header.
    If based on the header we can determine that the frame contains the sample, we do a full decode of that frame.
    */
    for (;;) {
        drflac_uint64 pcmFrameCountInThisFLACFrame;
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;
        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) {
            /*
            The sample should be in this frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend
            it never existed and keep iterating.
            */
            drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount;

            if (!isMidFrame) {
                drflac_result result = drflac__decode_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */
                    return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;  /* <-- If this fails, something bad has happened (it should never fail). */
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration;   /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* We started seeking mid-frame which means we need to skip the frame decoding part. */
                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch.
            If so, we pretend this frame never existed and leave the running sample count untouched.
            */
            if (!isMidFrame) {
                drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    runningPCMFrameCount += pcmFrameCountInThisFLACFrame;
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration;   /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /*
                We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with
                drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header.
                */
                runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                isMidFrame = DRFLAC_FALSE;
            }

            /* If we are seeking to the end of the file and we've just hit it, we're done. */
            if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) {
                return DRFLAC_TRUE;
            }
        }

    next_iteration:
        /* Grab the next frame in preparation for the next iteration. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }
}


#if !defined(DR_FLAC_NO_CRC)
/*
We use an average compression ratio to determine our approximate start location. FLAC files are generally about 50%-70% the size of their
uncompressed counterparts so we'll use this as a basis.
I'm going to split the middle and use a factor of 0.6 to determine the starting
location.
*/
#define DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO 0.6f

static drflac_bool32 drflac__seek_to_approximate_flac_frame_to_byte(drflac* pFlac, drflac_uint64 targetByte, drflac_uint64 rangeLo, drflac_uint64 rangeHi, drflac_uint64* pLastSuccessfulSeekOffset)
{
    DRFLAC_ASSERT(pFlac != NULL);
    DRFLAC_ASSERT(pLastSuccessfulSeekOffset != NULL);
    DRFLAC_ASSERT(targetByte >= rangeLo);
    DRFLAC_ASSERT(targetByte <= rangeHi);

    *pLastSuccessfulSeekOffset = pFlac->firstFLACFramePosInBytes;

    for (;;) {
        /* After rangeLo == rangeHi == targetByte fails, we need to break out. */
        drflac_uint64 lastTargetByte = targetByte;

        /* When seeking to a byte, failure probably means we've attempted to seek beyond the end of the stream. To counter this we just halve it each attempt. */
        if (!drflac__seek_to_byte(&pFlac->bs, targetByte)) {
            /* If we couldn't even seek to the first byte in the stream we have a problem. Just abandon the whole thing. */
            if (targetByte == 0) {
                drflac__seek_to_first_frame(pFlac); /* Try to recover. */
                return DRFLAC_FALSE;
            }

            /* Halve the byte location and continue. */
            targetByte = rangeLo + ((rangeHi - rangeLo)/2);
            rangeHi = targetByte;
        } else {
            /* Getting here should mean that we have seeked to an appropriate byte. */

            /* Clear the details of the FLAC frame so we don't misreport data. */
            DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame));

            /*
            Now seek to the next FLAC frame. We need to decode the entire frame (not just the header) because it's possible for the header to incorrectly pass the
            CRC check and return bad data. We need to decode the entire frame to be more certain.
            Although this seems unlikely, this has happened to me in testing
            so it needs to stay this way for now.
            */
#if 1
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                /* Halve the byte location and continue. */
                targetByte = rangeLo + ((rangeHi - rangeLo)/2);
                rangeHi = targetByte;
            } else {
                break;
            }
#else
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                /* Halve the byte location and continue. */
                targetByte = rangeLo + ((rangeHi - rangeLo)/2);
                rangeHi = targetByte;
            } else {
                break;
            }
#endif
        }

        /* We already tried this byte and there are no more to try, break out. */
        if (targetByte == lastTargetByte) {
            return DRFLAC_FALSE;
        }
    }

    /* The current PCM frame needs to be updated based on the frame we just seeked to. */
    drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL);

    DRFLAC_ASSERT(targetByte <= rangeHi);

    *pLastSuccessfulSeekOffset = targetByte;
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 offset)
{
    /* This section of code would be used if we were only decoding the FLAC frame header when calling drflac__seek_to_approximate_flac_frame_to_byte(). */
#if 0
    if (drflac__decode_flac_frame(pFlac) != DRFLAC_SUCCESS) {
        /* We failed to decode this frame which may be due to it being corrupt. We'll just use the next valid FLAC frame. */
        if (drflac__read_and_decode_next_flac_frame(pFlac) == DRFLAC_FALSE) {
            return DRFLAC_FALSE;
        }
    }
#endif

    return drflac__seek_forward_by_pcm_frames(pFlac, offset) == offset;
}


static drflac_bool32 drflac__seek_to_pcm_frame__binary_search_internal(drflac* pFlac, drflac_uint64 pcmFrameIndex, drflac_uint64 byteRangeLo, drflac_uint64 byteRangeHi)
{
    /* This assumes pFlac->currentPCMFrame is sitting on byteRangeLo upon entry. */

    drflac_uint64 targetByte;
    drflac_uint64 pcmRangeLo = pFlac->totalPCMFrameCount;
    drflac_uint64 pcmRangeHi = 0;
    drflac_uint64 lastSuccessfulSeekOffset = (drflac_uint64)-1;
    drflac_uint64 closestSeekOffsetBeforeTargetPCMFrame = byteRangeLo;
    drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096;

    targetByte = byteRangeLo + (drflac_uint64)(((drflac_int64)((pcmFrameIndex - pFlac->currentPCMFrame) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO);
    if (targetByte > byteRangeHi) {
        targetByte = byteRangeHi;
    }

    for (;;) {
        if (drflac__seek_to_approximate_flac_frame_to_byte(pFlac, targetByte, byteRangeLo, byteRangeHi, &lastSuccessfulSeekOffset)) {
            /* We found a FLAC frame. We need to check if it contains the sample we're looking for. */
            drflac_uint64 newPCMRangeLo;
            drflac_uint64 newPCMRangeHi;
            drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &newPCMRangeLo, &newPCMRangeHi);

            /* If we selected the same frame, it means we should be pretty close. Just decode the rest. */
            if (pcmRangeLo == newPCMRangeLo) {
                if (!drflac__seek_to_approximate_flac_frame_to_byte(pFlac, closestSeekOffsetBeforeTargetPCMFrame, closestSeekOffsetBeforeTargetPCMFrame, byteRangeHi, &lastSuccessfulSeekOffset)) {
                    break;  /* Failed to seek to closest frame. */
                }

                if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                    return DRFLAC_TRUE;
                } else {
                    break;  /* Failed to seek forward. */
                }
            }

            pcmRangeLo = newPCMRangeLo;
            pcmRangeHi = newPCMRangeHi;

            if (pcmRangeLo <= pcmFrameIndex && pcmRangeHi >= pcmFrameIndex) {
                /* The target PCM frame is in this FLAC frame. */
                if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                    return DRFLAC_TRUE;
                } else {
                    break;  /* Failed to seek to FLAC frame. */
                }
            } else {
                const float approxCompressionRatio = (drflac_int64)(lastSuccessfulSeekOffset - pFlac->firstFLACFramePosInBytes) / ((drflac_int64)(pcmRangeLo * pFlac->channels * pFlac->bitsPerSample)/8.0f);

                if (pcmRangeLo > pcmFrameIndex) {
                    /* We seeked too far forward. We need to move our target byte backward and try again. */
                    byteRangeHi = lastSuccessfulSeekOffset;
                    if (byteRangeLo > byteRangeHi) {
                        byteRangeLo = byteRangeHi;
                    }

                    targetByte = byteRangeLo + ((byteRangeHi - byteRangeLo) / 2);
                    if (targetByte < byteRangeLo) {
                        targetByte = byteRangeLo;
                    }
                } else /*if (pcmRangeHi < pcmFrameIndex)*/ {
                    /* We didn't seek far enough. We need to move our target byte forward and try again. */

                    /* If we're close enough we can just seek forward. */
                    if ((pcmFrameIndex - pcmRangeLo) < seekForwardThreshold) {
                        if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) {
                            return DRFLAC_TRUE;
                        } else {
                            break;  /* Failed to seek to FLAC frame. */
                        }
                    } else {
                        byteRangeLo = lastSuccessfulSeekOffset;
                        if (byteRangeHi < byteRangeLo) {
                            byteRangeHi = byteRangeLo;
                        }

                        targetByte = lastSuccessfulSeekOffset + (drflac_uint64)(((drflac_int64)((pcmFrameIndex-pcmRangeLo) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * approxCompressionRatio);
                        if (targetByte > byteRangeHi) {
                            targetByte = byteRangeHi;
                        }

                        if (closestSeekOffsetBeforeTargetPCMFrame < lastSuccessfulSeekOffset) {
                            closestSeekOffsetBeforeTargetPCMFrame = lastSuccessfulSeekOffset;
                        }
                    }
                }
            }
        } else {
            /* Getting here is really bad. We just recover as best we can, by moving to the first frame in the stream, and then abort. */
            break;
        }
    }

    drflac__seek_to_first_frame(pFlac); /* <-- Try to recover. */
    return DRFLAC_FALSE;
}

static drflac_bool32 drflac__seek_to_pcm_frame__binary_search(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_uint64 byteRangeLo;
    drflac_uint64 byteRangeHi;
    drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096;

    /* Our algorithm currently assumes the FLAC stream is currently sitting at the start. */
    if (drflac__seek_to_first_frame(pFlac) == DRFLAC_FALSE) {
        return DRFLAC_FALSE;
    }

    /* If we're close enough to the start, just move to the start and seek forward. */
    if (pcmFrameIndex < seekForwardThreshold) {
        return drflac__seek_forward_by_pcm_frames(pFlac, pcmFrameIndex) == pcmFrameIndex;
    }

    /*
    Our starting byte range is the byte position of the first FLAC frame and the approximate end of the file as if it were completely uncompressed. This ensures
    the entire file is included, even though most of the time it'll exceed the end of the actual stream.
This is OK as the frame searching logic will handle it.6107*/6108byteRangeLo = pFlac->firstFLACFramePosInBytes;6109byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f);61106111return drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi);6112}6113#endif /* !DR_FLAC_NO_CRC */61146115static drflac_bool32 drflac__seek_to_pcm_frame__seek_table(drflac* pFlac, drflac_uint64 pcmFrameIndex)6116{6117drflac_uint32 iClosestSeekpoint = 0;6118drflac_bool32 isMidFrame = DRFLAC_FALSE;6119drflac_uint64 runningPCMFrameCount;6120drflac_uint32 iSeekpoint;612161226123DRFLAC_ASSERT(pFlac != NULL);61246125if (pFlac->pSeekpoints == NULL || pFlac->seekpointCount == 0) {6126return DRFLAC_FALSE;6127}61286129/* Do not use the seektable if pcmFramIndex is not coverd by it. */6130if (pFlac->pSeekpoints[0].firstPCMFrame > pcmFrameIndex) {6131return DRFLAC_FALSE;6132}61336134for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) {6135if (pFlac->pSeekpoints[iSeekpoint].firstPCMFrame >= pcmFrameIndex) {6136break;6137}61386139iClosestSeekpoint = iSeekpoint;6140}61416142/* There's been cases where the seek table contains only zeros. We need to do some basic validation on the closest seekpoint. */6143if (pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount == 0 || pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount > pFlac->maxBlockSizeInPCMFrames) {6144return DRFLAC_FALSE;6145}6146if (pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame > pFlac->totalPCMFrameCount && pFlac->totalPCMFrameCount > 0) {6147return DRFLAC_FALSE;6148}61496150#if !defined(DR_FLAC_NO_CRC)6151/* At this point we should know the closest seek point. We can use a binary search for this. We need to know the total sample count for this. 
    */
    if (pFlac->totalPCMFrameCount > 0) {
        drflac_uint64 byteRangeLo;
        drflac_uint64 byteRangeHi;

        byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f);
        byteRangeLo = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset;

        /*
        If our closest seek point is not the last one, we only need to search between it and the next one. The section below calculates an appropriate starting
        value for byteRangeHi which will clamp it appropriately.

        Note that the next seekpoint must have an offset greater than the closest seekpoint because otherwise our binary search algorithm will break down. There
        have been cases where a seektable consists of seek points where every byte offset is set to 0 which causes problems. If this happens we need to abort.
        */
        if (iClosestSeekpoint < pFlac->seekpointCount-1) {
            drflac_uint32 iNextSeekpoint = iClosestSeekpoint + 1;

            /* Basic validation on the seekpoints to ensure they're usable. */
            if (pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset >= pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset || pFlac->pSeekpoints[iNextSeekpoint].pcmFrameCount == 0) {
                return DRFLAC_FALSE; /* The next seekpoint doesn't look right. The seek table cannot be trusted from here. Abort. */
            }

            if (pFlac->pSeekpoints[iNextSeekpoint].firstPCMFrame != (((drflac_uint64)0xFFFFFFFF << 32) | 0xFFFFFFFF)) { /* Make sure it's not a placeholder seekpoint. */
                byteRangeHi = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset - 1; /* byteRangeHi must be zero based.
                */
            }
        }

        if (drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) {
            if (drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL);

                if (drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi)) {
                    return DRFLAC_TRUE;
                }
            }
        }
    }
#endif /* !DR_FLAC_NO_CRC */

    /* Getting here means we need to use a slower algorithm because the binary search method failed or cannot be used. */

    /*
    If we are seeking forward and the closest seekpoint is _before_ the current sample, we just seek forward from where we are. Otherwise we start seeking
    from the seekpoint's first sample.
    */
    if (pcmFrameIndex >= pFlac->currentPCMFrame && pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame <= pFlac->currentPCMFrame) {
        /* Optimized case. Just seek forward from where we are. */
        runningPCMFrameCount = pFlac->currentPCMFrame;

        /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */
        if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                return DRFLAC_FALSE;
            }
        } else {
            isMidFrame = DRFLAC_TRUE;
        }
    } else {
        /* Slower case. Seek to the start of the seekpoint and then seek forward from there. */
        runningPCMFrameCount = pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame;

        if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) {
            return DRFLAC_FALSE;
        }

        /* Grab the frame the seekpoint is sitting on in preparation for the sample-exact seeking below.
        */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }

    for (;;) {
        drflac_uint64 pcmFrameCountInThisFLACFrame;
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;
        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) {
            /*
            The sample should be in this frame. We need to fully decode it, but if it's an invalid frame (a CRC mismatch) we need to pretend
            it never existed and keep iterating.
            */
            drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount;

            if (!isMidFrame) {
                drflac_result result = drflac__decode_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */
                    return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* We started seeking mid-frame which means we need to skip the frame decoding part. */
                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode;
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch.
            If so, we pretend this
            frame never existed and leave the running sample count untouched.
            */
            if (!isMidFrame) {
                drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
                if (result == DRFLAC_SUCCESS) {
                    runningPCMFrameCount += pcmFrameCountInThisFLACFrame;
                } else {
                    if (result == DRFLAC_CRC_MISMATCH) {
                        goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */
                    } else {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /*
                We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with
                drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header.
                */
                runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                isMidFrame = DRFLAC_FALSE;
            }

            /* If we are seeking to the end of the file and we've just hit it, we're done. */
            if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) {
                return DRFLAC_TRUE;
            }
        }

    next_iteration:
        /* Grab the next frame in preparation for the next iteration. */
        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }
    }
}


#ifndef DR_FLAC_NO_OGG
typedef struct
{
    drflac_uint8 capturePattern[4]; /* Should be "OggS" */
    drflac_uint8 structureVersion; /* Always 0.
    */
    drflac_uint8 headerType;
    drflac_uint64 granulePosition;
    drflac_uint32 serialNumber;
    drflac_uint32 sequenceNumber;
    drflac_uint32 checksum;
    drflac_uint8 segmentCount;
    drflac_uint8 segmentTable[255];
} drflac_ogg_page_header;
#endif

typedef struct
{
    drflac_read_proc onRead;
    drflac_seek_proc onSeek;
    drflac_meta_proc onMeta;
    drflac_container container;
    void* pUserData;
    void* pUserDataMD;
    drflac_uint32 sampleRate;
    drflac_uint8 channels;
    drflac_uint8 bitsPerSample;
    drflac_uint64 totalPCMFrameCount;
    drflac_uint16 maxBlockSizeInPCMFrames;
    drflac_uint64 runningFilePos;
    drflac_bool32 hasStreamInfoBlock;
    drflac_bool32 hasMetadataBlocks;
    drflac_bs bs; /* <-- A bit streamer is required for loading data during initialization. */
    drflac_frame_header firstFrameHeader; /* <-- The header of the first frame that was read during relaxed initialization. Only set if there is no STREAMINFO block. */

#ifndef DR_FLAC_NO_OGG
    drflac_uint32 oggSerial;
    drflac_uint64 oggFirstBytePos;
    drflac_ogg_page_header oggBosHeader;
#endif
} drflac_init_info;

static DRFLAC_INLINE void drflac__decode_block_header(drflac_uint32 blockHeader, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize)
{
    blockHeader = drflac__be2host_32(blockHeader);
    *isLastBlock = (drflac_uint8)((blockHeader & 0x80000000UL) >> 31);
    *blockType = (drflac_uint8)((blockHeader & 0x7F000000UL) >> 24);
    *blockSize = (blockHeader & 0x00FFFFFFUL);
}

static DRFLAC_INLINE drflac_bool32 drflac__read_and_decode_block_header(drflac_read_proc onRead, void* pUserData, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize)
{
    drflac_uint32 blockHeader;

    *blockSize = 0;
    if (onRead(pUserData, &blockHeader, 4) != 4) {
        return DRFLAC_FALSE;
    }

    drflac__decode_block_header(blockHeader, isLastBlock, blockType, blockSize);
    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__read_streaminfo(drflac_read_proc onRead, void* pUserData, drflac_streaminfo* pStreamInfo)
{
    drflac_uint32 blockSizes;
    drflac_uint64 frameSizes = 0;
    drflac_uint64 importantProps;
    drflac_uint8 md5[16];

    /* min/max block size. */
    if (onRead(pUserData, &blockSizes, 4) != 4) {
        return DRFLAC_FALSE;
    }

    /* min/max frame size. */
    if (onRead(pUserData, &frameSizes, 6) != 6) {
        return DRFLAC_FALSE;
    }

    /* Sample rate, channels, bits per sample and total sample count. */
    if (onRead(pUserData, &importantProps, 8) != 8) {
        return DRFLAC_FALSE;
    }

    /* MD5 */
    if (onRead(pUserData, md5, sizeof(md5)) != sizeof(md5)) {
        return DRFLAC_FALSE;
    }

    blockSizes = drflac__be2host_32(blockSizes);
    frameSizes = drflac__be2host_64(frameSizes);
    importantProps = drflac__be2host_64(importantProps);

    pStreamInfo->minBlockSizeInPCMFrames = (drflac_uint16)((blockSizes & 0xFFFF0000) >> 16);
    pStreamInfo->maxBlockSizeInPCMFrames = (drflac_uint16) (blockSizes & 0x0000FFFF);
    pStreamInfo->minFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 24)) >> 40);
    pStreamInfo->maxFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 0)) >> 16);
    pStreamInfo->sampleRate = (drflac_uint32)((importantProps & (((drflac_uint64)0x000FFFFF << 16) << 28)) >> 44);
    pStreamInfo->channels = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000000E << 16) << 24)) >> 41) + 1;
    pStreamInfo->bitsPerSample = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000001F << 16) << 20)) >> 36) + 1;
    pStreamInfo->totalPCMFrameCount = ((importantProps & ((((drflac_uint64)0x0000000F << 16) << 16) | 0xFFFFFFFF)));
    DRFLAC_COPY_MEMORY(pStreamInfo->md5, md5, sizeof(md5));

    return DRFLAC_TRUE;
}


static void* drflac__malloc_default(size_t sz, void* pUserData)
{
    (void)pUserData;
    return DRFLAC_MALLOC(sz);
}

static void* drflac__realloc_default(void* p, size_t sz, void* pUserData)
{
    (void)pUserData;
    return DRFLAC_REALLOC(p, sz);
}

static void drflac__free_default(void* p, void* pUserData)
{
    (void)pUserData;
    DRFLAC_FREE(p);
}


static void* drflac__malloc_from_callbacks(size_t sz, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (pAllocationCallbacks == NULL) {
        return NULL;
    }

    if (pAllocationCallbacks->onMalloc != NULL) {
        return pAllocationCallbacks->onMalloc(sz, pAllocationCallbacks->pUserData);
    }

    /* Try using realloc(). */
    if (pAllocationCallbacks->onRealloc != NULL) {
        return pAllocationCallbacks->onRealloc(NULL, sz, pAllocationCallbacks->pUserData);
    }

    return NULL;
}

static void* drflac__realloc_from_callbacks(void* p, size_t szNew, size_t szOld, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (pAllocationCallbacks == NULL) {
        return NULL;
    }

    if (pAllocationCallbacks->onRealloc != NULL) {
        return pAllocationCallbacks->onRealloc(p, szNew, pAllocationCallbacks->pUserData);
    }

    /* Try emulating realloc() in terms of malloc()/free().
    */
    if (pAllocationCallbacks->onMalloc != NULL && pAllocationCallbacks->onFree != NULL) {
        void* p2;

        p2 = pAllocationCallbacks->onMalloc(szNew, pAllocationCallbacks->pUserData);
        if (p2 == NULL) {
            return NULL;
        }

        if (p != NULL) {
            DRFLAC_COPY_MEMORY(p2, p, szOld);
            pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData);
        }

        return p2;
    }

    return NULL;
}

static void drflac__free_from_callbacks(void* p, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (p == NULL || pAllocationCallbacks == NULL) {
        return;
    }

    if (pAllocationCallbacks->onFree != NULL) {
        pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData);
    }
}


static drflac_bool32 drflac__read_and_decode_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_uint64* pFirstFramePos, drflac_uint64* pSeektablePos, drflac_uint32* pSeekpointCount, drflac_allocation_callbacks* pAllocationCallbacks)
{
    /*
    We want to keep track of the byte position in the stream of the seektable.
    At the time of calling this function we know that
    we'll be sitting on byte 42.
    */
    drflac_uint64 runningFilePos = 42;
    drflac_uint64 seektablePos = 0;
    drflac_uint32 seektableSize = 0;

    for (;;) {
        drflac_metadata metadata;
        drflac_uint8 isLastBlock = 0;
        drflac_uint8 blockType = 0;
        drflac_uint32 blockSize;
        if (drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize) == DRFLAC_FALSE) {
            return DRFLAC_FALSE;
        }
        runningFilePos += 4;

        metadata.type = blockType;
        metadata.pRawData = NULL;
        metadata.rawDataSize = 0;

        switch (blockType)
        {
            case DRFLAC_METADATA_BLOCK_TYPE_APPLICATION:
            {
                if (blockSize < 4) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    metadata.data.application.id = drflac__be2host_32(*(drflac_uint32*)pRawData);
                    metadata.data.application.pData = (const void*)((drflac_uint8*)pRawData + sizeof(drflac_uint32));
                    metadata.data.application.dataSize = blockSize - sizeof(drflac_uint32);
                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE:
            {
                seektablePos = runningFilePos;
                seektableSize = blockSize;

                if (onMeta) {
                    drflac_uint32 seekpointCount;
                    drflac_uint32 iSeekpoint;
                    void* pRawData;

                    seekpointCount = blockSize/DRFLAC_SEEKPOINT_SIZE_IN_BYTES;

                    pRawData = drflac__malloc_from_callbacks(seekpointCount * sizeof(drflac_seekpoint), pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    /* We need to read
                    seekpoint by seekpoint and do some processing. */
                    for (iSeekpoint = 0; iSeekpoint < seekpointCount; ++iSeekpoint) {
                        drflac_seekpoint* pSeekpoint = (drflac_seekpoint*)pRawData + iSeekpoint;

                        if (onRead(pUserData, pSeekpoint, DRFLAC_SEEKPOINT_SIZE_IN_BYTES) != DRFLAC_SEEKPOINT_SIZE_IN_BYTES) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        /* Endian swap. */
                        pSeekpoint->firstPCMFrame = drflac__be2host_64(pSeekpoint->firstPCMFrame);
                        pSeekpoint->flacFrameOffset = drflac__be2host_64(pSeekpoint->flacFrameOffset);
                        pSeekpoint->pcmFrameCount = drflac__be2host_16(pSeekpoint->pcmFrameCount);
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    metadata.data.seektable.seekpointCount = seekpointCount;
                    metadata.data.seektable.pSeekpoints = (const drflac_seekpoint*)pRawData;

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT:
            {
                if (blockSize < 8) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;
                    drflac_uint32 i;

                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    metadata.data.vorbis_comment.vendorLength = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 4 < (drflac_int64)metadata.data.vorbis_comment.vendorLength) { /* <-- Note the
                    order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.vorbis_comment.vendor = pRunningData; pRunningData += metadata.data.vorbis_comment.vendorLength;
                    metadata.data.vorbis_comment.commentCount = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for 'commentCount' comments after the block, which at minimum is a drflac_uint32 per comment */
                    if ((pRunningDataEnd - pRunningData) / sizeof(drflac_uint32) < metadata.data.vorbis_comment.commentCount) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.vorbis_comment.pComments = pRunningData;

                    /* Check that the comments section is valid before passing it to the callback */
                    for (i = 0; i < metadata.data.vorbis_comment.commentCount; ++i) {
                        drflac_uint32 commentLength;

                        if (pRunningDataEnd - pRunningData < 4) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        commentLength = drflac__le2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                        if (pRunningDataEnd - pRunningData < (drflac_int64)commentLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }
                        pRunningData += commentLength;
                    }

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_CUESHEET:
            {
                if (blockSize < 396) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;
                    size_t bufferSize;
                    drflac_uint8 iTrack;
                    drflac_uint8 iIndex;
                    void* pTrackData;

                    /*
                    This needs to be loaded in two
                    passes. The first pass is used to calculate the size of the memory allocation
                    we need for storing the necessary data. The second pass will fill that buffer with usable data.
                    */
                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    DRFLAC_COPY_MEMORY(metadata.data.cuesheet.catalog, pRunningData, 128); pRunningData += 128;
                    metadata.data.cuesheet.leadInSampleCount = drflac__be2host_64(*(const drflac_uint64*)pRunningData); pRunningData += 8;
                    metadata.data.cuesheet.isCD = (pRunningData[0] & 0x80) != 0; pRunningData += 259;
                    metadata.data.cuesheet.trackCount = pRunningData[0]; pRunningData += 1;
                    metadata.data.cuesheet.pTrackData = NULL; /* Will be filled later. */

                    /* Pass 1: Calculate the size of the buffer for the track data. */
                    {
                        const char* pRunningDataSaved = pRunningData; /* Will be restored at the end in preparation for the second pass. */

                        bufferSize = metadata.data.cuesheet.trackCount * DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES;

                        for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) {
                            drflac_uint8 indexCount;
                            drflac_uint32 indexPointSize;

                            if (pRunningDataEnd - pRunningData < DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES) {
                                drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                                return DRFLAC_FALSE;
                            }

                            /* Skip to the index point count */
                            pRunningData += 35;

                            indexCount = pRunningData[0];
                            pRunningData += 1;

                            bufferSize += indexCount * sizeof(drflac_cuesheet_track_index);

                            /* Quick validation check.
                            */
                            indexPointSize = indexCount * DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES;
                            if (pRunningDataEnd - pRunningData < (drflac_int64)indexPointSize) {
                                drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                                return DRFLAC_FALSE;
                            }

                            pRunningData += indexPointSize;
                        }

                        pRunningData = pRunningDataSaved;
                    }

                    /* Pass 2: Allocate a buffer and fill the data. Validation was done in the step above so can be skipped. */
                    {
                        char* pRunningTrackData;

                        pTrackData = drflac__malloc_from_callbacks(bufferSize, pAllocationCallbacks);
                        if (pTrackData == NULL) {
                            drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                            return DRFLAC_FALSE;
                        }

                        pRunningTrackData = (char*)pTrackData;

                        for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) {
                            drflac_uint8 indexCount;

                            DRFLAC_COPY_MEMORY(pRunningTrackData, pRunningData, DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES);
                            pRunningData += DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES-1; /* Skip forward, but not beyond the last byte in the CUESHEET_TRACK block which is the index count. */
                            pRunningTrackData += DRFLAC_CUESHEET_TRACK_SIZE_IN_BYTES-1;

                            /* Grab the index count for the next part. */
                            indexCount = pRunningData[0];
                            pRunningData += 1;
                            pRunningTrackData += 1;

                            /* Extract each track index. */
                            for (iIndex = 0; iIndex < indexCount; ++iIndex) {
                                drflac_cuesheet_track_index* pTrackIndex = (drflac_cuesheet_track_index*)pRunningTrackData;

                                DRFLAC_COPY_MEMORY(pRunningTrackData, pRunningData, DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES);
                                pRunningData += DRFLAC_CUESHEET_TRACK_INDEX_SIZE_IN_BYTES;
                                pRunningTrackData += sizeof(drflac_cuesheet_track_index);

                                pTrackIndex->offset = drflac__be2host_64(pTrackIndex->offset);
                            }
                        }

                        metadata.data.cuesheet.pTrackData = pTrackData;
                    }

                    /* The original data is no longer needed.
                    */
                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                    pRawData = NULL;

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pTrackData, pAllocationCallbacks);
                    pTrackData = NULL;
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_PICTURE:
            {
                if (blockSize < 32) {
                    return DRFLAC_FALSE;
                }

                if (onMeta) {
                    void* pRawData;
                    const char* pRunningData;
                    const char* pRunningDataEnd;

                    pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;

                    pRunningData = (const char*)pRawData;
                    pRunningDataEnd = (const char*)pRawData + blockSize;

                    metadata.data.picture.type = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.mimeLength = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 24 < (drflac_int64)metadata.data.picture.mimeLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.picture.mime = pRunningData; pRunningData += metadata.data.picture.mimeLength;
                    metadata.data.picture.descriptionLength = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;

                    /* Need space for the rest of the block */
                    if ((pRunningDataEnd - pRunningData) - 20 < (drflac_int64)metadata.data.picture.descriptionLength) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }
                    metadata.data.picture.description = pRunningData; pRunningData += metadata.data.picture.descriptionLength;
                    metadata.data.picture.width = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.height = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.colorDepth = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.indexColorCount = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.pictureDataSize = drflac__be2host_32_ptr_unaligned(pRunningData); pRunningData += 4;
                    metadata.data.picture.pPictureData = (const drflac_uint8*)pRunningData;

                    /* Need space for the picture after the block */
                    if (pRunningDataEnd - pRunningData < (drflac_int64)metadata.data.picture.pictureDataSize) { /* <-- Note the order of operations to avoid overflow to a valid value */
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_PADDING:
            {
                if (onMeta) {
                    metadata.data.padding.unused = 0;

                    /* Padding doesn't have anything meaningful in it, so just skip over it, but make sure the caller is aware of it by firing the callback. */
                    if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                        isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */
                    } else {
                        onMeta(pUserDataMD, &metadata);
                    }
                }
            } break;

            case DRFLAC_METADATA_BLOCK_TYPE_INVALID:
            {
                /* Invalid chunk. Just skip over this one. */
                if (onMeta) {
                    if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                        isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking.
                        Attempt to recover by treating this as the last block which will in turn terminate the loop. */
                    }
                }
            } break;

            default:
            {
                /*
                It's an unknown chunk, but not necessarily invalid. There's a chance more metadata blocks might be defined later on, so we
                can at the very least report the chunk to the application and let it look at the raw data.
                */
                if (onMeta) {
                    void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks);
                    if (pRawData == NULL) {
                        return DRFLAC_FALSE;
                    }

                    if (onRead(pUserData, pRawData, blockSize) != blockSize) {
                        drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                        return DRFLAC_FALSE;
                    }

                    metadata.pRawData = pRawData;
                    metadata.rawDataSize = blockSize;
                    onMeta(pUserDataMD, &metadata);

                    drflac__free_from_callbacks(pRawData, pAllocationCallbacks);
                }
            } break;
        }

        /* If we're not handling metadata, just skip over the block. If we are, it will have been handled earlier in the switch statement above. */
        if (onMeta == NULL && blockSize > 0) {
            if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) {
                isLastBlock = DRFLAC_TRUE;
            }
        }

        runningFilePos += blockSize;
        if (isLastBlock) {
            break;
        }
    }

    *pSeektablePos = seektablePos;
    *pSeekpointCount = seektableSize / DRFLAC_SEEKPOINT_SIZE_IN_BYTES;
    *pFirstFramePos = runningFilePos;

    return DRFLAC_TRUE;
}

static drflac_bool32 drflac__init_private__native(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed)
{
    /* Pre Condition: The bit stream should be sitting just past the 4-byte id header. */

    drflac_uint8 isLastBlock;
    drflac_uint8 blockType;
    drflac_uint32 blockSize;

    (void)onSeek;

    pInit->container = drflac_container_native;

    /* The first metadata block should be the STREAMINFO block.
    */
    if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) {
        return DRFLAC_FALSE;
    }

    if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) {
        if (!relaxed) {
            /* We're opening in strict mode and the first block is not the STREAMINFO block. Error. */
            return DRFLAC_FALSE;
        } else {
            /*
            Relaxed mode. To open from here we need to just find the first frame and set the sample rate, etc. to whatever is defined
            for that frame.
            */
            pInit->hasStreamInfoBlock = DRFLAC_FALSE;
            pInit->hasMetadataBlocks = DRFLAC_FALSE;

            if (!drflac__read_next_flac_frame_header(&pInit->bs, 0, &pInit->firstFrameHeader)) {
                return DRFLAC_FALSE; /* Couldn't find a frame. */
            }

            if (pInit->firstFrameHeader.bitsPerSample == 0) {
                return DRFLAC_FALSE; /* Failed to initialize because the first frame depends on the STREAMINFO block, which does not exist. */
            }

            pInit->sampleRate = pInit->firstFrameHeader.sampleRate;
            pInit->channels = drflac__get_channel_count_from_channel_assignment(pInit->firstFrameHeader.channelAssignment);
            pInit->bitsPerSample = pInit->firstFrameHeader.bitsPerSample;
            pInit->maxBlockSizeInPCMFrames = 65535; /* <-- See notes here: https://xiph.org/flac/format.html#metadata_block_streaminfo */
            return DRFLAC_TRUE;
        }
    } else {
        drflac_streaminfo streaminfo;
        if (!drflac__read_streaminfo(onRead, pUserData, &streaminfo)) {
            return DRFLAC_FALSE;
        }

        pInit->hasStreamInfoBlock = DRFLAC_TRUE;
        pInit->sampleRate = streaminfo.sampleRate;
        pInit->channels = streaminfo.channels;
        pInit->bitsPerSample = streaminfo.bitsPerSample;
        pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount;
        pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; /* Don't care about the min block size - only the max (used for determining the size of the memory allocation).
*/6961pInit->hasMetadataBlocks = !isLastBlock;69626963if (onMeta) {6964drflac_metadata metadata;6965metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO;6966metadata.pRawData = NULL;6967metadata.rawDataSize = 0;6968metadata.data.streaminfo = streaminfo;6969onMeta(pUserDataMD, &metadata);6970}69716972return DRFLAC_TRUE;6973}6974}69756976#ifndef DR_FLAC_NO_OGG6977#define DRFLAC_OGG_MAX_PAGE_SIZE 653076978#define DRFLAC_OGG_CAPTURE_PATTERN_CRC32 1605413199 /* CRC-32 of "OggS". */69796980typedef enum6981{6982drflac_ogg_recover_on_crc_mismatch,6983drflac_ogg_fail_on_crc_mismatch6984} drflac_ogg_crc_mismatch_recovery;69856986#ifndef DR_FLAC_NO_CRC6987static drflac_uint32 drflac__crc32_table[] = {69880x00000000L, 0x04C11DB7L, 0x09823B6EL, 0x0D4326D9L,69890x130476DCL, 0x17C56B6BL, 0x1A864DB2L, 0x1E475005L,69900x2608EDB8L, 0x22C9F00FL, 0x2F8AD6D6L, 0x2B4BCB61L,69910x350C9B64L, 0x31CD86D3L, 0x3C8EA00AL, 0x384FBDBDL,69920x4C11DB70L, 0x48D0C6C7L, 0x4593E01EL, 0x4152FDA9L,69930x5F15ADACL, 0x5BD4B01BL, 0x569796C2L, 0x52568B75L,69940x6A1936C8L, 0x6ED82B7FL, 0x639B0DA6L, 0x675A1011L,69950x791D4014L, 0x7DDC5DA3L, 0x709F7B7AL, 0x745E66CDL,69960x9823B6E0L, 0x9CE2AB57L, 0x91A18D8EL, 0x95609039L,69970x8B27C03CL, 0x8FE6DD8BL, 0x82A5FB52L, 0x8664E6E5L,69980xBE2B5B58L, 0xBAEA46EFL, 0xB7A96036L, 0xB3687D81L,69990xAD2F2D84L, 0xA9EE3033L, 0xA4AD16EAL, 0xA06C0B5DL,70000xD4326D90L, 0xD0F37027L, 0xDDB056FEL, 0xD9714B49L,70010xC7361B4CL, 0xC3F706FBL, 0xCEB42022L, 0xCA753D95L,70020xF23A8028L, 0xF6FB9D9FL, 0xFBB8BB46L, 0xFF79A6F1L,70030xE13EF6F4L, 0xE5FFEB43L, 0xE8BCCD9AL, 0xEC7DD02DL,70040x34867077L, 0x30476DC0L, 0x3D044B19L, 0x39C556AEL,70050x278206ABL, 0x23431B1CL, 0x2E003DC5L, 0x2AC12072L,70060x128E9DCFL, 0x164F8078L, 0x1B0CA6A1L, 0x1FCDBB16L,70070x018AEB13L, 0x054BF6A4L, 0x0808D07DL, 0x0CC9CDCAL,70080x7897AB07L, 0x7C56B6B0L, 0x71159069L, 0x75D48DDEL,70090x6B93DDDBL, 0x6F52C06CL, 0x6211E6B5L, 0x66D0FB02L,70100x5E9F46BFL, 0x5A5E5B08L, 0x571D7DD1L, 0x53DC6066L,70110x4D9B3063L, 0x495A2DD4L, 
    0x44190B0DL, 0x40D816BAL,
    0xACA5C697L, 0xA864DB20L, 0xA527FDF9L, 0xA1E6E04EL,
    0xBFA1B04BL, 0xBB60ADFCL, 0xB6238B25L, 0xB2E29692L,
    0x8AAD2B2FL, 0x8E6C3698L, 0x832F1041L, 0x87EE0DF6L,
    0x99A95DF3L, 0x9D684044L, 0x902B669DL, 0x94EA7B2AL,
    0xE0B41DE7L, 0xE4750050L, 0xE9362689L, 0xEDF73B3EL,
    0xF3B06B3BL, 0xF771768CL, 0xFA325055L, 0xFEF34DE2L,
    0xC6BCF05FL, 0xC27DEDE8L, 0xCF3ECB31L, 0xCBFFD686L,
    0xD5B88683L, 0xD1799B34L, 0xDC3ABDEDL, 0xD8FBA05AL,
    0x690CE0EEL, 0x6DCDFD59L, 0x608EDB80L, 0x644FC637L,
    0x7A089632L, 0x7EC98B85L, 0x738AAD5CL, 0x774BB0EBL,
    0x4F040D56L, 0x4BC510E1L, 0x46863638L, 0x42472B8FL,
    0x5C007B8AL, 0x58C1663DL, 0x558240E4L, 0x51435D53L,
    0x251D3B9EL, 0x21DC2629L, 0x2C9F00F0L, 0x285E1D47L,
    0x36194D42L, 0x32D850F5L, 0x3F9B762CL, 0x3B5A6B9BL,
    0x0315D626L, 0x07D4CB91L, 0x0A97ED48L, 0x0E56F0FFL,
    0x1011A0FAL, 0x14D0BD4DL, 0x19939B94L, 0x1D528623L,
    0xF12F560EL, 0xF5EE4BB9L, 0xF8AD6D60L, 0xFC6C70D7L,
    0xE22B20D2L, 0xE6EA3D65L, 0xEBA91BBCL, 0xEF68060BL,
    0xD727BBB6L, 0xD3E6A601L, 0xDEA580D8L, 0xDA649D6FL,
    0xC423CD6AL, 0xC0E2D0DDL, 0xCDA1F604L, 0xC960EBB3L,
    0xBD3E8D7EL, 0xB9FF90C9L, 0xB4BCB610L, 0xB07DABA7L,
    0xAE3AFBA2L, 0xAAFBE615L, 0xA7B8C0CCL, 0xA379DD7BL,
    0x9B3660C6L, 0x9FF77D71L, 0x92B45BA8L, 0x9675461FL,
    0x8832161AL, 0x8CF30BADL, 0x81B02D74L, 0x857130C3L,
    0x5D8A9099L, 0x594B8D2EL, 0x5408ABF7L, 0x50C9B640L,
    0x4E8EE645L, 0x4A4FFBF2L, 0x470CDD2BL, 0x43CDC09CL,
    0x7B827D21L, 0x7F436096L, 0x7200464FL, 0x76C15BF8L,
    0x68860BFDL, 0x6C47164AL, 0x61043093L, 0x65C52D24L,
    0x119B4BE9L, 0x155A565EL, 0x18197087L, 0x1CD86D30L,
    0x029F3D35L, 0x065E2082L, 0x0B1D065BL, 0x0FDC1BECL,
    0x3793A651L, 0x3352BBE6L, 0x3E119D3FL, 0x3AD08088L,
    0x2497D08DL, 0x2056CD3AL, 0x2D15EBE3L, 0x29D4F654L,
    0xC5A92679L, 0xC1683BCEL, 0xCC2B1D17L, 0xC8EA00A0L,
    0xD6AD50A5L, 0xD26C4D12L, 0xDF2F6BCBL, 0xDBEE767CL,
    0xE3A1CBC1L, 0xE760D676L, 0xEA23F0AFL, 0xEEE2ED18L,
    0xF0A5BD1DL, 0xF464A0AAL, 0xF9278673L,
    0xFDE69BC4L,
    0x89B8FD09L, 0x8D79E0BEL, 0x803AC667L, 0x84FBDBD0L,
    0x9ABC8BD5L, 0x9E7D9662L, 0x933EB0BBL, 0x97FFAD0CL,
    0xAFB010B1L, 0xAB710D06L, 0xA6322BDFL, 0xA2F33668L,
    0xBCB4666DL, 0xB8757BDAL, 0xB5365D03L, 0xB1F740B4L
};
#endif

static DRFLAC_INLINE drflac_uint32 drflac_crc32_byte(drflac_uint32 crc32, drflac_uint8 data)
{
#ifndef DR_FLAC_NO_CRC
    return (crc32 << 8) ^ drflac__crc32_table[(drflac_uint8)((crc32 >> 24) & 0xFF) ^ data];
#else
    (void)data;
    return crc32;
#endif
}

#if 0
static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint32(drflac_uint32 crc32, drflac_uint32 data)
{
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 24) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 16) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 8) & 0xFF));
    crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 0) & 0xFF));
    return crc32;
}

static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint64(drflac_uint32 crc32, drflac_uint64 data)
{
    crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 32) & 0xFFFFFFFF));
    crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 0) & 0xFFFFFFFF));
    return crc32;
}
#endif

static DRFLAC_INLINE drflac_uint32 drflac_crc32_buffer(drflac_uint32 crc32, drflac_uint8* pData, drflac_uint32 dataSize)
{
    /* This can be optimized.
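    For reference, the lookup table and the capture pattern constant defined above can be cross-checked with a plain bit-by-bit
    CRC. The sketch below is not part of dr_flac and is not an optimization. The names sketch_crc32_table_entry() and
    sketch_crc32_buffer() are hypothetical. It assumes the parameters implied by the table: polynomial 0x04C11DB7, processed
    most significant bit first, with an initial value of 0 and no final XOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: computes one lookup table entry bit by bit
// (MSB first, polynomial 0x04C11DB7, no bit reflection).
static uint32_t sketch_crc32_table_entry(uint8_t index)
{
    uint32_t crc = (uint32_t)index << 24;
    int bit;
    for (bit = 0; bit < 8; ++bit) {
        if (crc & 0x80000000u) {
            crc = (crc << 1) ^ 0x04C11DB7u;
        } else {
            crc = crc << 1;
        }
    }
    return crc;
}

// Hypothetical helper: same update rule as drflac_crc32_byte(), but with the
// table entry computed on the fly instead of being read from the table.
static uint32_t sketch_crc32_buffer(uint32_t crc, const uint8_t* data, size_t size)
{
    size_t i;
    assert(data != NULL || size == 0);
    for (i = 0; i < size; ++i) {
        crc = (crc << 8) ^ sketch_crc32_table_entry((uint8_t)((crc >> 24) ^ data[i]));
    }
    return crc;
}
```

    Feeding the four bytes "OggS" through sketch_crc32_buffer() with a starting value of 0 gives 1605413199, which is why
    page header parsing can start from DRFLAC_OGG_CAPTURE_PATTERN_CRC32 rather than re-hashing the capture pattern.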
    */
    drflac_uint32 i;
    for (i = 0; i < dataSize; ++i) {
        crc32 = drflac_crc32_byte(crc32, pData[i]);
    }
    return crc32;
}


static DRFLAC_INLINE drflac_bool32 drflac_ogg__is_capture_pattern(drflac_uint8 pattern[4])
{
    return pattern[0] == 'O' && pattern[1] == 'g' && pattern[2] == 'g' && pattern[3] == 'S';
}

static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_header_size(drflac_ogg_page_header* pHeader)
{
    return 27 + pHeader->segmentCount;
}

static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_body_size(drflac_ogg_page_header* pHeader)
{
    drflac_uint32 pageBodySize = 0;
    int i;

    for (i = 0; i < pHeader->segmentCount; ++i) {
        pageBodySize += pHeader->segmentTable[i];
    }

    return pageBodySize;
}

static drflac_result drflac_ogg__read_page_header_after_capture_pattern(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32)
{
    drflac_uint8 data[23];
    drflac_uint32 i;

    DRFLAC_ASSERT(*pCRC32 == DRFLAC_OGG_CAPTURE_PATTERN_CRC32);

    if (onRead(pUserData, data, 23) != 23) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += 23;

    /*
    It's not actually used, but set the capture pattern to 'OggS' for completeness. Not doing this will cause static analysers to complain about
    us trying to access uninitialized data.
    We could alternatively just comment out this member of the drflac_ogg_page_header structure, but I
    like to have it map to the structure of the underlying data.
    */
    pHeader->capturePattern[0] = 'O';
    pHeader->capturePattern[1] = 'g';
    pHeader->capturePattern[2] = 'g';
    pHeader->capturePattern[3] = 'S';

    pHeader->structureVersion = data[0];
    pHeader->headerType = data[1];
    DRFLAC_COPY_MEMORY(&pHeader->granulePosition, &data[ 2], 8);
    DRFLAC_COPY_MEMORY(&pHeader->serialNumber, &data[10], 4);
    DRFLAC_COPY_MEMORY(&pHeader->sequenceNumber, &data[14], 4);
    DRFLAC_COPY_MEMORY(&pHeader->checksum, &data[18], 4);
    pHeader->segmentCount = data[22];

    /* Calculate the CRC. Note that for the calculation the checksum part of the page needs to be set to 0. */
    data[18] = 0;
    data[19] = 0;
    data[20] = 0;
    data[21] = 0;

    for (i = 0; i < 23; ++i) {
        *pCRC32 = drflac_crc32_byte(*pCRC32, data[i]);
    }


    if (onRead(pUserData, pHeader->segmentTable, pHeader->segmentCount) != pHeader->segmentCount) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += pHeader->segmentCount;

    for (i = 0; i < pHeader->segmentCount; ++i) {
        *pCRC32 = drflac_crc32_byte(*pCRC32, pHeader->segmentTable[i]);
    }

    return DRFLAC_SUCCESS;
}

static drflac_result drflac_ogg__read_page_header(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32)
{
    drflac_uint8 id[4];

    *pBytesRead = 0;

    if (onRead(pUserData, id, 4) != 4) {
        return DRFLAC_AT_END;
    }
    *pBytesRead += 4;

    /* We need to read byte-by-byte until we find the OggS capture pattern.
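    The same 4-byte sliding window can be shown in isolation against an in-memory buffer rather than the onRead callback.
    This is only an illustrative sketch: sketch_find_capture_pattern() is a hypothetical name and is not part of dr_flac,
    and returning -1 stands in for the DRFLAC_AT_END result used by the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns the byte offset of the first "OggS" capture
// pattern in the buffer, or -1 if the data runs out before one is found.
static long sketch_find_capture_pattern(const uint8_t* data, size_t size)
{
    uint8_t id[4];
    size_t pos;

    assert(data != NULL || size == 0);
    if (size < 4) {
        return -1;
    }

    // Prime the window with the first four bytes.
    id[0] = data[0];
    id[1] = data[1];
    id[2] = data[2];
    id[3] = data[3];
    pos = 4; // Index of the next unread byte.

    for (;;) {
        if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') {
            return (long)(pos - 4);
        }
        if (pos == size) {
            return -1; // Ran out of data.
        }

        // Slide the window forward by one byte.
        id[0] = id[1];
        id[1] = id[2];
        id[2] = id[3];
        id[3] = data[pos++];
    }
}
```

    Advancing one byte at a time means a capture pattern that starts at any offset is found, including one that overlaps
    a partial match such as "OgOggS".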
    */
    for (;;) {
        if (drflac_ogg__is_capture_pattern(id)) {
            drflac_result result;

            *pCRC32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32;

            result = drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, pHeader, pBytesRead, pCRC32);
            if (result == DRFLAC_SUCCESS) {
                return DRFLAC_SUCCESS;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue;
                } else {
                    return result;
                }
            }
        } else {
            /* The first 4 bytes did not equal the capture pattern. Read the next byte and try again. */
            id[0] = id[1];
            id[1] = id[2];
            id[2] = id[3];
            if (onRead(pUserData, &id[3], 1) != 1) {
                return DRFLAC_AT_END;
            }
            *pBytesRead += 1;
        }
    }
}


/*
The main part of the Ogg encapsulation is the conversion from the physical Ogg bitstream to the native FLAC bitstream. It works
in three general stages: Ogg Physical Bitstream -> Ogg/FLAC Logical Bitstream -> FLAC Native Bitstream. dr_flac is designed
in such a way that the core sections assume everything is delivered in native format. Therefore, for each encapsulation type
dr_flac supports there needs to be a layer sitting on top of the onRead and onSeek callbacks that ensures the bits read from
the physical Ogg bitstream are converted and delivered in native FLAC format.
*/
typedef struct
{
    drflac_read_proc onRead; /* The original onRead callback from drflac_open() and family. */
    drflac_seek_proc onSeek; /* The original onSeek callback from drflac_open() and family. */
    void* pUserData; /* The user data passed to onRead and onSeek. This is the user data that was passed to drflac_open() and family. */
    drflac_uint64 currentBytePos; /* The position of the byte we are sitting on in the physical byte stream. Used for efficient seeking. */
    drflac_uint64 firstBytePos; /* The position of the first byte in the physical bitstream. Points to the start of the "OggS" identifier of the FLAC bos page.
    */
    drflac_uint32 serialNumber; /* The serial number of the FLAC audio pages. This is determined by the initial header page that was read during initialization. */
    drflac_ogg_page_header bosPageHeader; /* Used for seeking. */
    drflac_ogg_page_header currentPageHeader;
    drflac_uint32 bytesRemainingInPage;
    drflac_uint32 pageDataSize;
    drflac_uint8 pageData[DRFLAC_OGG_MAX_PAGE_SIZE];
} drflac_oggbs; /* oggbs = Ogg Bitstream */

static size_t drflac_oggbs__read_physical(drflac_oggbs* oggbs, void* bufferOut, size_t bytesToRead)
{
    size_t bytesActuallyRead = oggbs->onRead(oggbs->pUserData, bufferOut, bytesToRead);
    oggbs->currentBytePos += bytesActuallyRead;

    return bytesActuallyRead;
}

static drflac_bool32 drflac_oggbs__seek_physical(drflac_oggbs* oggbs, drflac_uint64 offset, drflac_seek_origin origin)
{
    if (origin == drflac_seek_origin_start) {
        if (offset <= 0x7FFFFFFF) {
            if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_start)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos = offset;

            return DRFLAC_TRUE;
        } else {
            if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos = offset;

            return drflac_oggbs__seek_physical(oggbs, offset - 0x7FFFFFFF, drflac_seek_origin_current);
        }
    } else {
        while (offset > 0x7FFFFFFF) {
            if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            oggbs->currentBytePos += 0x7FFFFFFF;
            offset -= 0x7FFFFFFF;
        }

        if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_current)) { /* <-- Safe cast thanks to the loop above.
        */
            return DRFLAC_FALSE;
        }
        oggbs->currentBytePos += offset;

        return DRFLAC_TRUE;
    }
}

static drflac_bool32 drflac_oggbs__goto_next_page(drflac_oggbs* oggbs, drflac_ogg_crc_mismatch_recovery recoveryMethod)
{
    drflac_ogg_page_header header;
    for (;;) {
        drflac_uint32 crc32 = 0;
        drflac_uint32 bytesRead;
        drflac_uint32 pageBodySize;
    #ifndef DR_FLAC_NO_CRC
        drflac_uint32 actualCRC32;
    #endif

        if (drflac_ogg__read_page_header(oggbs->onRead, oggbs->pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
            return DRFLAC_FALSE;
        }
        oggbs->currentBytePos += bytesRead;

        pageBodySize = drflac_ogg__get_page_body_size(&header);
        if (pageBodySize > DRFLAC_OGG_MAX_PAGE_SIZE) {
            continue; /* Invalid page size. Assume it's corrupted and just move to the next page. */
        }

        if (header.serialNumber != oggbs->serialNumber) {
            /* It's not a FLAC page. Skip it. */
            if (pageBodySize > 0 && !drflac_oggbs__seek_physical(oggbs, pageBodySize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
            continue;
        }


        /* We need to read the entire page and then do a CRC check on it. If there's a CRC mismatch we need to skip this page. */
        if (drflac_oggbs__read_physical(oggbs, oggbs->pageData, pageBodySize) != pageBodySize) {
            return DRFLAC_FALSE;
        }
        oggbs->pageDataSize = pageBodySize;

    #ifndef DR_FLAC_NO_CRC
        actualCRC32 = drflac_crc32_buffer(crc32, oggbs->pageData, oggbs->pageDataSize);
        if (actualCRC32 != header.checksum) {
            if (recoveryMethod == drflac_ogg_recover_on_crc_mismatch) {
                continue; /* CRC mismatch. Skip this page. */
            } else {
                /*
                Even though we are failing on a CRC mismatch, we still want our stream to be in a good state.
                Therefore we go to the next valid page to ensure we're in a good state, but return false to let the caller know that the
                seek did not fully complete.
                */
                drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch);
                return DRFLAC_FALSE;
            }
        }
    #else
        (void)recoveryMethod; /* <-- Silence a warning. */
    #endif

        oggbs->currentPageHeader = header;
        oggbs->bytesRemainingInPage = pageBodySize;
        return DRFLAC_TRUE;
    }
}

/* Function below is unused at the moment, but I might be re-adding it later. */
#if 0
static drflac_uint8 drflac_oggbs__get_current_segment_index(drflac_oggbs* oggbs, drflac_uint8* pBytesRemainingInSeg)
{
    drflac_uint32 bytesConsumedInPage = drflac_ogg__get_page_body_size(&oggbs->currentPageHeader) - oggbs->bytesRemainingInPage;
    drflac_uint8 iSeg = 0;
    drflac_uint32 iByte = 0;
    while (iByte < bytesConsumedInPage) {
        drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg];
        if (iByte + segmentSize > bytesConsumedInPage) {
            break;
        } else {
            iSeg += 1;
            iByte += segmentSize;
        }
    }

    *pBytesRemainingInSeg = oggbs->currentPageHeader.segmentTable[iSeg] - (drflac_uint8)(bytesConsumedInPage - iByte);
    return iSeg;
}

static drflac_bool32 drflac_oggbs__seek_to_next_packet(drflac_oggbs* oggbs)
{
    /* The current packet ends when we get to the segment with a lacing value of < 255 which is not at the end of a page.
    */
    for (;;) {
        drflac_bool32 atEndOfPage = DRFLAC_FALSE;

        drflac_uint8 bytesRemainingInSeg;
        drflac_uint8 iFirstSeg = drflac_oggbs__get_current_segment_index(oggbs, &bytesRemainingInSeg);

        drflac_uint32 bytesToEndOfPacketOrPage = bytesRemainingInSeg;
        for (drflac_uint8 iSeg = iFirstSeg; iSeg < oggbs->currentPageHeader.segmentCount; ++iSeg) {
            drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg];
            if (segmentSize < 255) {
                if (iSeg == oggbs->currentPageHeader.segmentCount-1) {
                    atEndOfPage = DRFLAC_TRUE;
                }

                break;
            }

            bytesToEndOfPacketOrPage += segmentSize;
        }

        /*
        At this point we will have found either the packet or the end of the page. If we're at the end of the page we'll
        want to load the next page and keep searching for the end of the packet.
        */
        drflac_oggbs__seek_physical(oggbs, bytesToEndOfPacketOrPage, drflac_seek_origin_current);
        oggbs->bytesRemainingInPage -= bytesToEndOfPacketOrPage;

        if (atEndOfPage) {
            /*
            We're potentially at the next packet, but we need to check the next page first to be sure because the packet may
            straddle pages.
            */
            if (!drflac_oggbs__goto_next_page(oggbs)) {
                return DRFLAC_FALSE;
            }

            /* If it's a fresh packet it most likely means we're at the next packet. */
            if ((oggbs->currentPageHeader.headerType & 0x01) == 0) {
                return DRFLAC_TRUE;
            }
        } else {
            /* We're at the next packet. */
            return DRFLAC_TRUE;
        }
    }
}

static drflac_bool32 drflac_oggbs__seek_to_next_frame(drflac_oggbs* oggbs)
{
    /* The bitstream should be sitting on the first byte just after the header of the frame. */

    /* What we're actually doing here is seeking to the start of the next packet.
    */
    return drflac_oggbs__seek_to_next_packet(oggbs);
}
#endif

static size_t drflac__on_read_ogg(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pUserData;
    drflac_uint8* pRunningBufferOut = (drflac_uint8*)bufferOut;
    size_t bytesRead = 0;

    DRFLAC_ASSERT(oggbs != NULL);
    DRFLAC_ASSERT(pRunningBufferOut != NULL);

    /* Reading is done page-by-page. If we've run out of bytes in the page we need to move to the next one. */
    while (bytesRead < bytesToRead) {
        size_t bytesRemainingToRead = bytesToRead - bytesRead;

        if (oggbs->bytesRemainingInPage >= bytesRemainingToRead) {
            DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), bytesRemainingToRead);
            bytesRead += bytesRemainingToRead;
            oggbs->bytesRemainingInPage -= (drflac_uint32)bytesRemainingToRead;
            break;
        }

        /* If we get here it means some of the requested data is contained in the next pages. */
        if (oggbs->bytesRemainingInPage > 0) {
            DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), oggbs->bytesRemainingInPage);
            bytesRead += oggbs->bytesRemainingInPage;
            pRunningBufferOut += oggbs->bytesRemainingInPage;
            oggbs->bytesRemainingInPage = 0;
        }

        DRFLAC_ASSERT(bytesRemainingToRead > 0);
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
            break; /* Failed to go to the next page. Might have simply hit the end of the stream. */
        }
    }

    return bytesRead;
}

static drflac_bool32 drflac__on_seek_ogg(void* pUserData, int offset, drflac_seek_origin origin)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pUserData;
    int bytesSeeked = 0;

    DRFLAC_ASSERT(oggbs != NULL);
    DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */

    /* Seeking is always forward which makes things a lot simpler.
    */
    if (origin == drflac_seek_origin_start) {
        if (!drflac_oggbs__seek_physical(oggbs, (int)oggbs->firstBytePos, drflac_seek_origin_start)) {
            return DRFLAC_FALSE;
        }

        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) {
            return DRFLAC_FALSE;
        }

        return drflac__on_seek_ogg(pUserData, offset, drflac_seek_origin_current);
    }

    DRFLAC_ASSERT(origin == drflac_seek_origin_current);

    while (bytesSeeked < offset) {
        int bytesRemainingToSeek = offset - bytesSeeked;
        DRFLAC_ASSERT(bytesRemainingToSeek >= 0);

        if (oggbs->bytesRemainingInPage >= (size_t)bytesRemainingToSeek) {
            bytesSeeked += bytesRemainingToSeek;
            (void)bytesSeeked; /* <-- Silence a dead store warning emitted by Clang Static Analyzer. */
            oggbs->bytesRemainingInPage -= bytesRemainingToSeek;
            break;
        }

        /* If we get here it means some of the requested data is contained in the next pages. */
        if (oggbs->bytesRemainingInPage > 0) {
            bytesSeeked += (int)oggbs->bytesRemainingInPage;
            oggbs->bytesRemainingInPage = 0;
        }

        DRFLAC_ASSERT(bytesRemainingToSeek > 0);
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) {
            /* Failed to go to the next page. We either hit the end of the stream or had a CRC mismatch. */
            return DRFLAC_FALSE;
        }
    }

    return DRFLAC_TRUE;
}


static drflac_bool32 drflac_ogg__seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
    drflac_uint64 originalBytePos;
    drflac_uint64 runningGranulePosition;
    drflac_uint64 runningFrameBytePos;
    drflac_uint64 runningPCMFrameCount;

    DRFLAC_ASSERT(oggbs != NULL);

    originalBytePos = oggbs->currentBytePos; /* For recovery. Points to the OggS identifier. */

    /* First seek to the first frame.
    */
    if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes)) {
        return DRFLAC_FALSE;
    }
    oggbs->bytesRemainingInPage = 0;

    runningGranulePosition = 0;
    for (;;) {
        if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
            drflac_oggbs__seek_physical(oggbs, originalBytePos, drflac_seek_origin_start);
            return DRFLAC_FALSE; /* Never did find that sample... */
        }

        runningFrameBytePos = oggbs->currentBytePos - drflac_ogg__get_page_header_size(&oggbs->currentPageHeader) - oggbs->pageDataSize;
        if (oggbs->currentPageHeader.granulePosition >= pcmFrameIndex) {
            break; /* The sample is somewhere in the previous page. */
        }

        /*
        At this point we know the sample is not in the previous page. It could possibly be in this page. For simplicity we
        disregard any pages that do not begin a fresh packet.
        */
        if ((oggbs->currentPageHeader.headerType & 0x01) == 0) { /* <-- Is it a fresh page? */
            if (oggbs->currentPageHeader.segmentTable[0] >= 2) {
                drflac_uint8 firstBytesInPage[2];
                firstBytesInPage[0] = oggbs->pageData[0];
                firstBytesInPage[1] = oggbs->pageData[1];

                if ((firstBytesInPage[0] == 0xFF) && (firstBytesInPage[1] & 0xFC) == 0xF8) { /* <-- Does the page begin with a frame's sync code? */
                    runningGranulePosition = oggbs->currentPageHeader.granulePosition;
                }

                continue;
            }
        }
    }

    /*
    We found the page that is closest to the sample, so now we need to find it. The first thing to do is seek to the
    start of that page. In the loop above we checked that it was a fresh page which means this page is also the start of
    a new frame.
    This property means that after we've seeked to the page we can immediately start looping over frames until
    we find the one containing the target sample.
    */
    if (!drflac_oggbs__seek_physical(oggbs, runningFrameBytePos, drflac_seek_origin_start)) {
        return DRFLAC_FALSE;
    }
    if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) {
        return DRFLAC_FALSE;
    }

    /*
    At this point we'll be sitting on the first byte of the frame header of the first frame in the page. We just keep
    looping over these frames until we find the one containing the sample we're after.
    */
    runningPCMFrameCount = runningGranulePosition;
    for (;;) {
        /*
        There are two ways to find the sample and seek past irrelevant frames:
          1) Use the native FLAC decoder.
          2) Use Ogg's framing system.

        Both of these options have their own pros and cons. Using the native FLAC decoder is slower because it needs to
        do a full decode of the frame. Using Ogg's framing system is faster, but more complicated and involves some code
        duplication for the decoding of frame headers.

        Another thing to consider is that using the Ogg framing system will perform direct seeking of the physical Ogg
        bitstream. This is important to consider because it means we cannot read data from the drflac_bs object using the
        standard drflac__*() APIs because that will read in extra data for its own internal caching which in turn breaks
        the positioning of the read pointer of the physical Ogg bitstream.
        Therefore, anything that would normally be read
        using the native FLAC decoding APIs, such as drflac__read_next_flac_frame_header(), need to be re-implemented so as to
        avoid the use of the drflac_bs object.

        Considering these issues, I have decided to use the slower native FLAC decoding method for the following reasons:
          1) Seeking is already partially accelerated using Ogg's paging system in the code block above.
          2) Seeking in an Ogg encapsulated FLAC stream is probably quite uncommon.
          3) Simplicity.
        */
        drflac_uint64 firstPCMFrameInFLACFrame = 0;
        drflac_uint64 lastPCMFrameInFLACFrame = 0;
        drflac_uint64 pcmFrameCountInThisFrame;

        if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
            return DRFLAC_FALSE;
        }

        drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame);

        pcmFrameCountInThisFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1;

        /* If we are seeking to the end of the file and we've just hit it, we're done. */
        if (pcmFrameIndex == pFlac->totalPCMFrameCount && (runningPCMFrameCount + pcmFrameCountInThisFrame) == pFlac->totalPCMFrameCount) {
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                pFlac->currentPCMFrame = pcmFrameIndex;
                pFlac->currentFLACFrame.pcmFramesRemaining = 0;
                return DRFLAC_TRUE;
            } else {
                return DRFLAC_FALSE;
            }
        }

        if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFrame)) {
            /*
            The sample should be in this FLAC frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend
            it never existed and keep iterating.
            */
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact.
                */
                drflac_uint64 pcmFramesToDecode = (size_t)(pcmFrameIndex - runningPCMFrameCount); /* <-- Safe cast because the maximum number of samples in a frame is 65535. */
                if (pcmFramesToDecode == 0) {
                    return DRFLAC_TRUE;
                }

                pFlac->currentPCMFrame = runningPCMFrameCount;

                return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue; /* CRC mismatch. Pretend this frame never existed. */
                } else {
                    return DRFLAC_FALSE;
                }
            }
        } else {
            /*
            It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this
            frame never existed and leave the running sample count untouched.
            */
            drflac_result result = drflac__seek_to_next_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                runningPCMFrameCount += pcmFrameCountInThisFrame;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    continue; /* CRC mismatch. Pretend this frame never existed. */
                } else {
                    return DRFLAC_FALSE;
                }
            }
        }
    }
}



static drflac_bool32 drflac__init_private__ogg(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed)
{
    drflac_ogg_page_header header;
    drflac_uint32 crc32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32;
    drflac_uint32 bytesRead = 0;

    /* Pre Condition: The bit stream should be sitting just past the 4-byte OggS capture pattern. */
    (void)relaxed;

    pInit->container = drflac_container_ogg;
    pInit->oggFirstBytePos = 0;

    /*
    We'll get here if the first 4 bytes of the stream were the OggS capture pattern, however it doesn't necessarily mean the
    stream includes FLAC encoded audio.
    To check for this we need to scan the beginning-of-stream page markers and check if
    any match the FLAC specification. Important to keep in mind that the stream may be multiplexed.
    */
    if (drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
        return DRFLAC_FALSE;
    }
    pInit->runningFilePos += bytesRead;

    for (;;) {
        int pageBodySize;

        /* Break if we're past the beginning of stream page. */
        if ((header.headerType & 0x02) == 0) {
            return DRFLAC_FALSE;
        }

        /* Check if it's a FLAC header. */
        pageBodySize = drflac_ogg__get_page_body_size(&header);
        if (pageBodySize == 51) { /* 51 = the lacing value of the FLAC header packet. */
            /* It could be a FLAC page... */
            drflac_uint32 bytesRemainingInPage = pageBodySize;
            drflac_uint8 packetType;

            if (onRead(pUserData, &packetType, 1) != 1) {
                return DRFLAC_FALSE;
            }

            bytesRemainingInPage -= 1;
            if (packetType == 0x7F) {
                /* Increasingly more likely to be a FLAC page... */
                drflac_uint8 sig[4];
                if (onRead(pUserData, sig, 4) != 4) {
                    return DRFLAC_FALSE;
                }

                bytesRemainingInPage -= 4;
                if (sig[0] == 'F' && sig[1] == 'L' && sig[2] == 'A' && sig[3] == 'C') {
                    /* Almost certainly a FLAC page... */
                    drflac_uint8 mappingVersion[2];
                    if (onRead(pUserData, mappingVersion, 2) != 2) {
                        return DRFLAC_FALSE;
                    }

                    if (mappingVersion[0] != 1) {
                        return DRFLAC_FALSE; /* Only supporting version 1.x of the Ogg mapping. */
                    }

                    /*
                    The next 2 bytes are the non-audio packets, not including this one. We don't care about this because we're going to
                    be handling it in a generic way based on the serial number and packet types.
                    */
                    if (!onSeek(pUserData, 2, drflac_seek_origin_current)) {
                        return DRFLAC_FALSE;
                    }

                    /* Expecting the native FLAC signature "fLaC".
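                    To make the magic numbers in this function easier to follow, here is a sketch of the header packet layout that the
                    surrounding checks walk through. The field sizes add up to the 51-byte page body tested above, and 51 plus a
                    27-byte fixed page header plus a one-entry segment table gives the 79 subtracted when computing oggFirstBytePos.
                    The enum and sketch_is_ogg_flac_header_packet() are hypothetical names and are not part of dr_flac. The sketch also
                    assumes the whole packet is already in memory, whereas the real code reads it piecemeal through onRead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Sizes, in bytes, of the fields in the Ogg FLAC header packet, in stream order.
enum {
    SKETCH_PACKET_TYPE_SIZE   = 1,  // 0x7F
    SKETCH_FLAC_SIG_SIZE      = 4,  // "FLAC"
    SKETCH_MAPPING_VER_SIZE   = 2,  // Major version, minor version.
    SKETCH_PACKET_COUNT_SIZE  = 2,  // Count of non-audio packets (skipped by the real code).
    SKETCH_NATIVE_SIG_SIZE    = 4,  // "fLaC"
    SKETCH_BLOCK_HEADER_SIZE  = 4,  // Metadata block header.
    SKETCH_STREAMINFO_SIZE    = 34, // STREAMINFO block body.
    SKETCH_HEADER_PACKET_SIZE = 1 + 4 + 2 + 2 + 4 + 4 + 34
};

// Hypothetical helper: returns 1 if the packet looks like an Ogg FLAC header
// packet using version 1.x of the mapping, 0 otherwise.
static int sketch_is_ogg_flac_header_packet(const uint8_t* p, size_t size)
{
    uint32_t blockSize;

    assert(p != NULL);
    if (size != SKETCH_HEADER_PACKET_SIZE) {
        return 0;
    }
    if (p[0] != 0x7F) {
        return 0;
    }
    if (p[1] != 'F' || p[2] != 'L' || p[3] != 'A' || p[4] != 'C') {
        return 0;
    }
    if (p[5] != 1) {
        return 0; // Only version 1.x of the mapping.
    }
    // p[6] is the minor version and p[7..8] the packet count. Both are ignored here.
    if (p[9] != 'f' || p[10] != 'L' || p[11] != 'a' || p[12] != 'C') {
        return 0;
    }
    // Block header: low 7 bits of the first byte are the block type
    // (0 = STREAMINFO), followed by a 24-bit big-endian block size.
    if ((p[13] & 0x7F) != 0) {
        return 0;
    }
    blockSize = ((uint32_t)p[14] << 16) | ((uint32_t)p[15] << 8) | (uint32_t)p[16];
    return blockSize == SKETCH_STREAMINFO_SIZE;
}
```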
                    */
                    if (onRead(pUserData, sig, 4) != 4) {
                        return DRFLAC_FALSE;
                    }

                    if (sig[0] == 'f' && sig[1] == 'L' && sig[2] == 'a' && sig[3] == 'C') {
                        /* The remaining data in the page should be the STREAMINFO block. */
                        drflac_streaminfo streaminfo;
                        drflac_uint8 isLastBlock;
                        drflac_uint8 blockType;
                        drflac_uint32 blockSize;
                        if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) {
                            return DRFLAC_FALSE;
                        }

                        if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) {
                            return DRFLAC_FALSE; /* Invalid block type. First block must be the STREAMINFO block. */
                        }

                        if (drflac__read_streaminfo(onRead, pUserData, &streaminfo)) {
                            /* Success! */
                            pInit->hasStreamInfoBlock = DRFLAC_TRUE;
                            pInit->sampleRate = streaminfo.sampleRate;
                            pInit->channels = streaminfo.channels;
                            pInit->bitsPerSample = streaminfo.bitsPerSample;
                            pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount;
                            pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames;
                            pInit->hasMetadataBlocks = !isLastBlock;

                            if (onMeta) {
                                drflac_metadata metadata;
                                metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO;
                                metadata.pRawData = NULL;
                                metadata.rawDataSize = 0;
                                metadata.data.streaminfo = streaminfo;
                                onMeta(pUserDataMD, &metadata);
                            }

                            pInit->runningFilePos += pageBodySize;
                            pInit->oggFirstBytePos = pInit->runningFilePos - 79; /* Subtracting 79 will place us right on top of the "OggS" identifier of the FLAC bos page. */
                            pInit->oggSerial = header.serialNumber;
                            pInit->oggBosHeader = header;
                            break;
                        } else {
                            /* Failed to read STREAMINFO block. Aww, so close... */
                            return DRFLAC_FALSE;
                        }
                    } else {
                        /* Invalid file. */
                        return DRFLAC_FALSE;
                    }
                } else {
                    /* Not a FLAC header. Skip it. */
                    if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) {
                        return DRFLAC_FALSE;
                    }
                }
            } else {
                /* Not a FLAC header.
                Seek past the entire page and move on to the next. */
                if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) {
                    return DRFLAC_FALSE;
                }
            }
        } else {
            if (!onSeek(pUserData, pageBodySize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE;
            }
        }

        pInit->runningFilePos += pageBodySize;


        /* Read the header of the next page. */
        if (drflac_ogg__read_page_header(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) {
            return DRFLAC_FALSE;
        }
        pInit->runningFilePos += bytesRead;
    }

    /*
    If we get here it means we found a FLAC audio stream. We should be sitting on the first byte of the header of the next page. The next
    packets in the FLAC logical stream contain the metadata. The only thing left to do in the initialization phase for Ogg is to create the
    Ogg bitstream object.
    */
    pInit->hasMetadataBlocks = DRFLAC_TRUE; /* <-- Always have at least VORBIS_COMMENT metadata block. */
    return DRFLAC_TRUE;
}
#endif

static drflac_bool32 drflac__init_private(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD)
{
    drflac_bool32 relaxed;
    drflac_uint8 id[4];

    if (pInit == NULL || onRead == NULL || onSeek == NULL) {
        return DRFLAC_FALSE;
    }

    DRFLAC_ZERO_MEMORY(pInit, sizeof(*pInit));
    pInit->onRead = onRead;
    pInit->onSeek = onSeek;
    pInit->onMeta = onMeta;
    pInit->container = container;
    pInit->pUserData = pUserData;
    pInit->pUserDataMD = pUserDataMD;

    pInit->bs.onRead = onRead;
    pInit->bs.onSeek = onSeek;
    pInit->bs.pUserData = pUserData;
    drflac__reset_cache(&pInit->bs);


    /* If the container is explicitly defined then we can try opening in relaxed mode. */
    relaxed = container != drflac_container_unknown;

    /* Skip over any ID3 tags.
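    As an illustration of what the skip amounts to, the sketch below works out how many bytes follow the 10-byte ID3v2
    header. The tag size is stored as a 4-byte "syncsafe" integer (7 significant bits per byte, big-endian), and flag bit
    0x10 signals a 10-byte footer. Both function names are hypothetical and are not part of dr_flac. The real code does
    the equivalent with drflac__be2host_32() and drflac__unsynchsafe_32(), reading the header in a 4-byte and a 6-byte
    piece rather than as one 10-byte array.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: decodes a 4-byte big-endian syncsafe integer,
// taking the low 7 bits of each byte.
static uint32_t sketch_unsynchsafe_32(const uint8_t b[4])
{
    return ((uint32_t)(b[0] & 0x7F) << 21) |
           ((uint32_t)(b[1] & 0x7F) << 14) |
           ((uint32_t)(b[2] & 0x7F) <<  7) |
           ((uint32_t)(b[3] & 0x7F) <<  0);
}

// Hypothetical helper: given the full 10-byte ID3v2 header, returns the number
// of bytes to skip after it. Layout assumed: "ID3", major version, revision,
// flags, then the 4-byte syncsafe size.
static uint32_t sketch_id3v2_bytes_after_header(const uint8_t header[10])
{
    uint32_t size;

    assert(header[0] == 'I' && header[1] == 'D' && header[2] == '3');

    size = sketch_unsynchsafe_32(header + 6);
    if (header[5] & 0x10) {
        size += 10; // A footer follows the tag body.
    }
    return size;
}
```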
    */
    for (;;) {
        if (onRead(pUserData, id, 4) != 4) {
            return DRFLAC_FALSE; /* Ran out of data. */
        }
        pInit->runningFilePos += 4;

        if (id[0] == 'I' && id[1] == 'D' && id[2] == '3') {
            drflac_uint8 header[6];
            drflac_uint8 flags;
            drflac_uint32 headerSize;

            if (onRead(pUserData, header, 6) != 6) {
                return DRFLAC_FALSE; /* Ran out of data. */
            }
            pInit->runningFilePos += 6;

            flags = header[1];

            DRFLAC_COPY_MEMORY(&headerSize, header+2, 4);
            headerSize = drflac__unsynchsafe_32(drflac__be2host_32(headerSize));
            if (flags & 0x10) {
                headerSize += 10;
            }

            if (!onSeek(pUserData, headerSize, drflac_seek_origin_current)) {
                return DRFLAC_FALSE; /* Failed to seek past the tag. */
            }
            pInit->runningFilePos += headerSize;
        } else {
            break;
        }
    }

    if (id[0] == 'f' && id[1] == 'L' && id[2] == 'a' && id[3] == 'C') {
        return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
    }
#ifndef DR_FLAC_NO_OGG
    if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') {
        return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
    }
#endif

    /* If we get here it means we likely don't have a header. Try opening in relaxed mode, if applicable. */
    if (relaxed) {
        if (container == drflac_container_native) {
            return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
        }
#ifndef DR_FLAC_NO_OGG
        if (container == drflac_container_ogg) {
            return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed);
        }
#endif
    }

    /* Unsupported container.
*/
    return DRFLAC_FALSE;
}

static void drflac__init_from_info(drflac* pFlac, const drflac_init_info* pInit)
{
    DRFLAC_ASSERT(pFlac != NULL);
    DRFLAC_ASSERT(pInit != NULL);

    DRFLAC_ZERO_MEMORY(pFlac, sizeof(*pFlac));
    pFlac->bs = pInit->bs;
    pFlac->onMeta = pInit->onMeta;
    pFlac->pUserDataMD = pInit->pUserDataMD;
    pFlac->maxBlockSizeInPCMFrames = pInit->maxBlockSizeInPCMFrames;
    pFlac->sampleRate = pInit->sampleRate;
    pFlac->channels = (drflac_uint8)pInit->channels;
    pFlac->bitsPerSample = (drflac_uint8)pInit->bitsPerSample;
    pFlac->totalPCMFrameCount = pInit->totalPCMFrameCount;
    pFlac->container = pInit->container;
}


static drflac* drflac_open_with_metadata_private(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac_init_info init;
    drflac_uint32 allocationSize;
    drflac_uint32 wholeSIMDVectorCountPerChannel;
    drflac_uint32 decodedSamplesAllocationSize;
#ifndef DR_FLAC_NO_OGG
    drflac_oggbs* pOggbs = NULL;
#endif
    drflac_uint64 firstFramePos;
    drflac_uint64 seektablePos;
    drflac_uint32 seekpointCount;
    drflac_allocation_callbacks allocationCallbacks;
    drflac* pFlac;

    /* CPU support first. */
    drflac__init_cpu_caps();

    if (!drflac__init_private(&init, onRead, onSeek, onMeta, container, pUserData, pUserDataMD)) {
        return NULL;
    }

    if (pAllocationCallbacks != NULL) {
        allocationCallbacks = *pAllocationCallbacks;
        if (allocationCallbacks.onFree == NULL || (allocationCallbacks.onMalloc == NULL && allocationCallbacks.onRealloc == NULL)) {
            return NULL; /* Invalid allocation callbacks.
*/
        }
    } else {
        allocationCallbacks.pUserData = NULL;
        allocationCallbacks.onMalloc = drflac__malloc_default;
        allocationCallbacks.onRealloc = drflac__realloc_default;
        allocationCallbacks.onFree = drflac__free_default;
    }


    /*
    The size of the allocation for the drflac object needs to be large enough to fit the following:
      1) The main members of the drflac structure
      2) A block of memory large enough to store the decoded samples of the largest frame in the stream
      3) If the container is Ogg, a drflac_oggbs object

    The complicated part of the allocation is making sure there's enough room for the decoded samples, taking into consideration
    the different SIMD instruction sets.
    */
    allocationSize = sizeof(drflac);

    /*
    The allocation size for decoded frames depends on the number of 32-bit integers that fit inside the largest SIMD vector
    we are supporting.
    */
    if ((init.maxBlockSizeInPCMFrames % (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) == 0) {
        wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32)));
    } else {
        wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) + 1;
    }

    decodedSamplesAllocationSize = wholeSIMDVectorCountPerChannel * DRFLAC_MAX_SIMD_VECTOR_SIZE * init.channels;

    allocationSize += decodedSamplesAllocationSize;
    allocationSize += DRFLAC_MAX_SIMD_VECTOR_SIZE; /* Allocate extra bytes to ensure we have enough for alignment. */

#ifndef DR_FLAC_NO_OGG
    /* There's additional data required for Ogg streams.
*/
    if (init.container == drflac_container_ogg) {
        allocationSize += sizeof(drflac_oggbs);

        pOggbs = (drflac_oggbs*)drflac__malloc_from_callbacks(sizeof(*pOggbs), &allocationCallbacks);
        if (pOggbs == NULL) {
            return NULL; /*DRFLAC_OUT_OF_MEMORY;*/
        }

        DRFLAC_ZERO_MEMORY(pOggbs, sizeof(*pOggbs));
        pOggbs->onRead = onRead;
        pOggbs->onSeek = onSeek;
        pOggbs->pUserData = pUserData;
        pOggbs->currentBytePos = init.oggFirstBytePos;
        pOggbs->firstBytePos = init.oggFirstBytePos;
        pOggbs->serialNumber = init.oggSerial;
        pOggbs->bosPageHeader = init.oggBosHeader;
        pOggbs->bytesRemainingInPage = 0;
    }
#endif

    /*
    This part is a bit awkward. We need to load the seektable so that it can be referenced in-memory, but I want the drflac object to
    consist of only a single heap allocation. To do this, the size of the seek table needs to be known, which we determine when reading
    and decoding the metadata.
    */
    firstFramePos = 42; /* <-- We know we are at byte 42 at this point.
*/
    seektablePos = 0;
    seekpointCount = 0;
    if (init.hasMetadataBlocks) {
        drflac_read_proc onReadOverride = onRead;
        drflac_seek_proc onSeekOverride = onSeek;
        void* pUserDataOverride = pUserData;

#ifndef DR_FLAC_NO_OGG
        if (init.container == drflac_container_ogg) {
            onReadOverride = drflac__on_read_ogg;
            onSeekOverride = drflac__on_seek_ogg;
            pUserDataOverride = (void*)pOggbs;
        }
#endif

        if (!drflac__read_and_decode_metadata(onReadOverride, onSeekOverride, onMeta, pUserDataOverride, pUserDataMD, &firstFramePos, &seektablePos, &seekpointCount, &allocationCallbacks)) {
#ifndef DR_FLAC_NO_OGG
            drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
#endif
            return NULL;
        }

        allocationSize += seekpointCount * sizeof(drflac_seekpoint);
    }


    pFlac = (drflac*)drflac__malloc_from_callbacks(allocationSize, &allocationCallbacks);
    if (pFlac == NULL) {
#ifndef DR_FLAC_NO_OGG
        drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
#endif
        return NULL;
    }

    drflac__init_from_info(pFlac, &init);
    pFlac->allocationCallbacks = allocationCallbacks;
    pFlac->pDecodedSamples = (drflac_int32*)drflac_align((size_t)pFlac->pExtraData, DRFLAC_MAX_SIMD_VECTOR_SIZE);

#ifndef DR_FLAC_NO_OGG
    if (init.container == drflac_container_ogg) {
        drflac_oggbs* pInternalOggbs = (drflac_oggbs*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize + (seekpointCount * sizeof(drflac_seekpoint)));
        DRFLAC_COPY_MEMORY(pInternalOggbs, pOggbs, sizeof(*pOggbs));

        /* At this point the pOggbs object has been handed over to pInternalOggbs and can be freed. */
        drflac__free_from_callbacks(pOggbs, &allocationCallbacks);
        pOggbs = NULL;

        /* The Ogg bitstream needs to be layered on top of the original bitstream.
*/
        pFlac->bs.onRead = drflac__on_read_ogg;
        pFlac->bs.onSeek = drflac__on_seek_ogg;
        pFlac->bs.pUserData = (void*)pInternalOggbs;
        pFlac->_oggbs = (void*)pInternalOggbs;
    }
#endif

    pFlac->firstFLACFramePosInBytes = firstFramePos;

    /* NOTE: Seektables are not currently compatible with Ogg encapsulation (Ogg has its own accelerated seeking system). I may change this later, so I'm leaving this here for now. */
#ifndef DR_FLAC_NO_OGG
    if (init.container == drflac_container_ogg)
    {
        pFlac->pSeekpoints = NULL;
        pFlac->seekpointCount = 0;
    }
    else
#endif
    {
        /* If we have a seektable we need to load it now, making sure we move back to where we were previously. */
        if (seektablePos != 0) {
            pFlac->seekpointCount = seekpointCount;
            pFlac->pSeekpoints = (drflac_seekpoint*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize);

            DRFLAC_ASSERT(pFlac->bs.onSeek != NULL);
            DRFLAC_ASSERT(pFlac->bs.onRead != NULL);

            /* Seek to the seektable, then just read directly into our seektable buffer. */
            if (pFlac->bs.onSeek(pFlac->bs.pUserData, (int)seektablePos, drflac_seek_origin_start)) {
                drflac_uint32 iSeekpoint;

                for (iSeekpoint = 0; iSeekpoint < seekpointCount; iSeekpoint += 1) {
                    if (pFlac->bs.onRead(pFlac->bs.pUserData, pFlac->pSeekpoints + iSeekpoint, DRFLAC_SEEKPOINT_SIZE_IN_BYTES) == DRFLAC_SEEKPOINT_SIZE_IN_BYTES) {
                        /* Endian swap. */
                        pFlac->pSeekpoints[iSeekpoint].firstPCMFrame = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].firstPCMFrame);
                        pFlac->pSeekpoints[iSeekpoint].flacFrameOffset = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].flacFrameOffset);
                        pFlac->pSeekpoints[iSeekpoint].pcmFrameCount = drflac__be2host_16(pFlac->pSeekpoints[iSeekpoint].pcmFrameCount);
                    } else {
                        /* Failed to read the seektable. Pretend we don't have one.
*/
                        pFlac->pSeekpoints = NULL;
                        pFlac->seekpointCount = 0;
                        break;
                    }
                }

                /* We need to seek back to where we were. If this fails it's a critical error. */
                if (!pFlac->bs.onSeek(pFlac->bs.pUserData, (int)pFlac->firstFLACFramePosInBytes, drflac_seek_origin_start)) {
                    drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                    return NULL;
                }
            } else {
                /* Failed to seek to the seektable. Ominous sign, but for now we can just pretend we don't have one. */
                pFlac->pSeekpoints = NULL;
                pFlac->seekpointCount = 0;
            }
        }
    }


    /*
    If we get here, but don't have a STREAMINFO block, it means we've opened the stream in relaxed mode and need to decode
    the first frame.
    */
    if (!init.hasStreamInfoBlock) {
        pFlac->currentFLACFrame.header = init.firstFrameHeader;
        for (;;) {
            drflac_result result = drflac__decode_flac_frame(pFlac);
            if (result == DRFLAC_SUCCESS) {
                break;
            } else {
                if (result == DRFLAC_CRC_MISMATCH) {
                    if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) {
                        drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                        return NULL;
                    }
                    continue;
                } else {
                    drflac__free_from_callbacks(pFlac, &allocationCallbacks);
                    return NULL;
                }
            }
        }
    }

    return pFlac;
}



#ifndef DR_FLAC_NO_STDIO
#include <stdio.h>
#ifndef DR_FLAC_NO_WCHAR
#include <wchar.h> /* For wcslen(), wcsrtombs() */
#endif

/* Errno */
/* drflac_result_from_errno() is only used for fopen() and wfopen() so putting it inside DR_FLAC_NO_STDIO for now. If something else needs this later we can move it out.
*/
#include <errno.h>
static drflac_result drflac_result_from_errno(int e)
{
    switch (e)
    {
        case 0: return DRFLAC_SUCCESS;
    #ifdef EPERM
        case EPERM: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef ENOENT
        case ENOENT: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef ESRCH
        case ESRCH: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef EINTR
        case EINTR: return DRFLAC_INTERRUPT;
    #endif
    #ifdef EIO
        case EIO: return DRFLAC_IO_ERROR;
    #endif
    #ifdef ENXIO
        case ENXIO: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef E2BIG
        case E2BIG: return DRFLAC_INVALID_ARGS;
    #endif
    #ifdef ENOEXEC
        case ENOEXEC: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef EBADF
        case EBADF: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ECHILD
        case ECHILD: return DRFLAC_ERROR;
    #endif
    #ifdef EAGAIN
        case EAGAIN: return DRFLAC_UNAVAILABLE;
    #endif
    #ifdef ENOMEM
        case ENOMEM: return DRFLAC_OUT_OF_MEMORY;
    #endif
    #ifdef EACCES
        case EACCES: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef EFAULT
        case EFAULT: return DRFLAC_BAD_ADDRESS;
    #endif
    #ifdef ENOTBLK
        case ENOTBLK: return DRFLAC_ERROR;
    #endif
    #ifdef EBUSY
        case EBUSY: return DRFLAC_BUSY;
    #endif
    #ifdef EEXIST
        case EEXIST: return DRFLAC_ALREADY_EXISTS;
    #endif
    #ifdef EXDEV
        case EXDEV: return DRFLAC_ERROR;
    #endif
    #ifdef ENODEV
        case ENODEV: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef ENOTDIR
        case ENOTDIR: return DRFLAC_NOT_DIRECTORY;
    #endif
    #ifdef EISDIR
        case EISDIR: return DRFLAC_IS_DIRECTORY;
    #endif
    #ifdef EINVAL
        case EINVAL: return DRFLAC_INVALID_ARGS;
    #endif
    #ifdef ENFILE
        case ENFILE: return DRFLAC_TOO_MANY_OPEN_FILES;
    #endif
    #ifdef EMFILE
        case EMFILE: return DRFLAC_TOO_MANY_OPEN_FILES;
    #endif
    #ifdef ENOTTY
        case ENOTTY: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef ETXTBSY
        case ETXTBSY: return DRFLAC_BUSY;
    #endif
    #ifdef EFBIG
        case EFBIG: return
DRFLAC_TOO_BIG;
    #endif
    #ifdef ENOSPC
        case ENOSPC: return DRFLAC_NO_SPACE;
    #endif
    #ifdef ESPIPE
        case ESPIPE: return DRFLAC_BAD_SEEK;
    #endif
    #ifdef EROFS
        case EROFS: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef EMLINK
        case EMLINK: return DRFLAC_TOO_MANY_LINKS;
    #endif
    #ifdef EPIPE
        case EPIPE: return DRFLAC_BAD_PIPE;
    #endif
    #ifdef EDOM
        case EDOM: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef ERANGE
        case ERANGE: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef EDEADLK
        case EDEADLK: return DRFLAC_DEADLOCK;
    #endif
    #ifdef ENAMETOOLONG
        case ENAMETOOLONG: return DRFLAC_PATH_TOO_LONG;
    #endif
    #ifdef ENOLCK
        case ENOLCK: return DRFLAC_ERROR;
    #endif
    #ifdef ENOSYS
        case ENOSYS: return DRFLAC_NOT_IMPLEMENTED;
    #endif
    #ifdef ENOTEMPTY
        case ENOTEMPTY: return DRFLAC_DIRECTORY_NOT_EMPTY;
    #endif
    #ifdef ELOOP
        case ELOOP: return DRFLAC_TOO_MANY_LINKS;
    #endif
    #ifdef ENOMSG
        case ENOMSG: return DRFLAC_NO_MESSAGE;
    #endif
    #ifdef EIDRM
        case EIDRM: return DRFLAC_ERROR;
    #endif
    #ifdef ECHRNG
        case ECHRNG: return DRFLAC_ERROR;
    #endif
    #ifdef EL2NSYNC
        case EL2NSYNC: return DRFLAC_ERROR;
    #endif
    #ifdef EL3HLT
        case EL3HLT: return DRFLAC_ERROR;
    #endif
    #ifdef EL3RST
        case EL3RST: return DRFLAC_ERROR;
    #endif
    #ifdef ELNRNG
        case ELNRNG: return DRFLAC_OUT_OF_RANGE;
    #endif
    #ifdef EUNATCH
        case EUNATCH: return DRFLAC_ERROR;
    #endif
    #ifdef ENOCSI
        case ENOCSI: return DRFLAC_ERROR;
    #endif
    #ifdef EL2HLT
        case EL2HLT: return DRFLAC_ERROR;
    #endif
    #ifdef EBADE
        case EBADE: return DRFLAC_ERROR;
    #endif
    #ifdef EBADR
        case EBADR: return DRFLAC_ERROR;
    #endif
    #ifdef EXFULL
        case EXFULL: return DRFLAC_ERROR;
    #endif
    #ifdef ENOANO
        case ENOANO: return DRFLAC_ERROR;
    #endif
    #ifdef EBADRQC
        case EBADRQC: return DRFLAC_ERROR;
    #endif
    #ifdef EBADSLT
        case EBADSLT: return
DRFLAC_ERROR;
    #endif
    #ifdef EBFONT
        case EBFONT: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ENOSTR
        case ENOSTR: return DRFLAC_ERROR;
    #endif
    #ifdef ENODATA
        case ENODATA: return DRFLAC_NO_DATA_AVAILABLE;
    #endif
    #ifdef ETIME
        case ETIME: return DRFLAC_TIMEOUT;
    #endif
    #ifdef ENOSR
        case ENOSR: return DRFLAC_NO_DATA_AVAILABLE;
    #endif
    #ifdef ENONET
        case ENONET: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENOPKG
        case ENOPKG: return DRFLAC_ERROR;
    #endif
    #ifdef EREMOTE
        case EREMOTE: return DRFLAC_ERROR;
    #endif
    #ifdef ENOLINK
        case ENOLINK: return DRFLAC_ERROR;
    #endif
    #ifdef EADV
        case EADV: return DRFLAC_ERROR;
    #endif
    #ifdef ESRMNT
        case ESRMNT: return DRFLAC_ERROR;
    #endif
    #ifdef ECOMM
        case ECOMM: return DRFLAC_ERROR;
    #endif
    #ifdef EPROTO
        case EPROTO: return DRFLAC_ERROR;
    #endif
    #ifdef EMULTIHOP
        case EMULTIHOP: return DRFLAC_ERROR;
    #endif
    #ifdef EDOTDOT
        case EDOTDOT: return DRFLAC_ERROR;
    #endif
    #ifdef EBADMSG
        case EBADMSG: return DRFLAC_BAD_MESSAGE;
    #endif
    #ifdef EOVERFLOW
        case EOVERFLOW: return DRFLAC_TOO_BIG;
    #endif
    #ifdef ENOTUNIQ
        case ENOTUNIQ: return DRFLAC_NOT_UNIQUE;
    #endif
    #ifdef EBADFD
        case EBADFD: return DRFLAC_ERROR;
    #endif
    #ifdef EREMCHG
        case EREMCHG: return DRFLAC_ERROR;
    #endif
    #ifdef ELIBACC
        case ELIBACC: return DRFLAC_ACCESS_DENIED;
    #endif
    #ifdef ELIBBAD
        case ELIBBAD: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ELIBSCN
        case ELIBSCN: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef ELIBMAX
        case ELIBMAX: return DRFLAC_ERROR;
    #endif
    #ifdef ELIBEXEC
        case ELIBEXEC: return DRFLAC_ERROR;
    #endif
    #ifdef EILSEQ
        case EILSEQ: return DRFLAC_INVALID_DATA;
    #endif
    #ifdef ERESTART
        case ERESTART: return DRFLAC_ERROR;
    #endif
    #ifdef ESTRPIPE
        case ESTRPIPE: return DRFLAC_ERROR;
    #endif
    #ifdef EUSERS
        case EUSERS: return
DRFLAC_ERROR;
    #endif
    #ifdef ENOTSOCK
        case ENOTSOCK: return DRFLAC_NOT_SOCKET;
    #endif
    #ifdef EDESTADDRREQ
        case EDESTADDRREQ: return DRFLAC_NO_ADDRESS;
    #endif
    #ifdef EMSGSIZE
        case EMSGSIZE: return DRFLAC_TOO_BIG;
    #endif
    #ifdef EPROTOTYPE
        case EPROTOTYPE: return DRFLAC_BAD_PROTOCOL;
    #endif
    #ifdef ENOPROTOOPT
        case ENOPROTOOPT: return DRFLAC_PROTOCOL_UNAVAILABLE;
    #endif
    #ifdef EPROTONOSUPPORT
        case EPROTONOSUPPORT: return DRFLAC_PROTOCOL_NOT_SUPPORTED;
    #endif
    #ifdef ESOCKTNOSUPPORT
        case ESOCKTNOSUPPORT: return DRFLAC_SOCKET_NOT_SUPPORTED;
    #endif
    #ifdef EOPNOTSUPP
        case EOPNOTSUPP: return DRFLAC_INVALID_OPERATION;
    #endif
    #ifdef EPFNOSUPPORT
        case EPFNOSUPPORT: return DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED;
    #endif
    #ifdef EAFNOSUPPORT
        case EAFNOSUPPORT: return DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED;
    #endif
    #ifdef EADDRINUSE
        case EADDRINUSE: return DRFLAC_ALREADY_IN_USE;
    #endif
    #ifdef EADDRNOTAVAIL
        case EADDRNOTAVAIL: return DRFLAC_ERROR;
    #endif
    #ifdef ENETDOWN
        case ENETDOWN: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENETUNREACH
        case ENETUNREACH: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ENETRESET
        case ENETRESET: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ECONNABORTED
        case ECONNABORTED: return DRFLAC_NO_NETWORK;
    #endif
    #ifdef ECONNRESET
        case ECONNRESET: return DRFLAC_CONNECTION_RESET;
    #endif
    #ifdef ENOBUFS
        case ENOBUFS: return DRFLAC_NO_SPACE;
    #endif
    #ifdef EISCONN
        case EISCONN: return DRFLAC_ALREADY_CONNECTED;
    #endif
    #ifdef ENOTCONN
        case ENOTCONN: return DRFLAC_NOT_CONNECTED;
    #endif
    #ifdef ESHUTDOWN
        case ESHUTDOWN: return DRFLAC_ERROR;
    #endif
    #ifdef ETOOMANYREFS
        case ETOOMANYREFS: return DRFLAC_ERROR;
    #endif
    #ifdef ETIMEDOUT
        case ETIMEDOUT: return DRFLAC_TIMEOUT;
    #endif
    #ifdef ECONNREFUSED
        case ECONNREFUSED: return DRFLAC_CONNECTION_REFUSED;
    #endif
    #ifdef
EHOSTDOWN
        case EHOSTDOWN: return DRFLAC_NO_HOST;
    #endif
    #ifdef EHOSTUNREACH
        case EHOSTUNREACH: return DRFLAC_NO_HOST;
    #endif
    #ifdef EALREADY
        case EALREADY: return DRFLAC_IN_PROGRESS;
    #endif
    #ifdef EINPROGRESS
        case EINPROGRESS: return DRFLAC_IN_PROGRESS;
    #endif
    #ifdef ESTALE
        case ESTALE: return DRFLAC_INVALID_FILE;
    #endif
    #ifdef EUCLEAN
        case EUCLEAN: return DRFLAC_ERROR;
    #endif
    #ifdef ENOTNAM
        case ENOTNAM: return DRFLAC_ERROR;
    #endif
    #ifdef ENAVAIL
        case ENAVAIL: return DRFLAC_ERROR;
    #endif
    #ifdef EISNAM
        case EISNAM: return DRFLAC_ERROR;
    #endif
    #ifdef EREMOTEIO
        case EREMOTEIO: return DRFLAC_IO_ERROR;
    #endif
    #ifdef EDQUOT
        case EDQUOT: return DRFLAC_NO_SPACE;
    #endif
    #ifdef ENOMEDIUM
        case ENOMEDIUM: return DRFLAC_DOES_NOT_EXIST;
    #endif
    #ifdef EMEDIUMTYPE
        case EMEDIUMTYPE: return DRFLAC_ERROR;
    #endif
    #ifdef ECANCELED
        case ECANCELED: return DRFLAC_CANCELLED;
    #endif
    #ifdef ENOKEY
        case ENOKEY: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYEXPIRED
        case EKEYEXPIRED: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYREVOKED
        case EKEYREVOKED: return DRFLAC_ERROR;
    #endif
    #ifdef EKEYREJECTED
        case EKEYREJECTED: return DRFLAC_ERROR;
    #endif
    #ifdef EOWNERDEAD
        case EOWNERDEAD: return DRFLAC_ERROR;
    #endif
    #ifdef ENOTRECOVERABLE
        case ENOTRECOVERABLE: return DRFLAC_ERROR;
    #endif
    #ifdef ERFKILL
        case ERFKILL: return DRFLAC_ERROR;
    #endif
    #ifdef EHWPOISON
        case EHWPOISON: return DRFLAC_ERROR;
    #endif
        default: return DRFLAC_ERROR;
    }
}
/* End Errno */

/* fopen */
static drflac_result drflac_fopen(FILE** ppFile, const char* pFilePath, const char* pOpenMode)
{
#if defined(_MSC_VER) && _MSC_VER >= 1400
    errno_t err;
#endif

    if (ppFile != NULL) {
        *ppFile = NULL; /* Safety.
*/
    }

    if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) {
        return DRFLAC_INVALID_ARGS;
    }

#if defined(_MSC_VER) && _MSC_VER >= 1400
    err = fopen_s(ppFile, pFilePath, pOpenMode);
    if (err != 0) {
        return drflac_result_from_errno(err);
    }
#else
#if defined(_WIN32) || defined(__APPLE__)
    *ppFile = fopen(pFilePath, pOpenMode);
#else
    #if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64 && defined(_LARGEFILE64_SOURCE)
        *ppFile = fopen64(pFilePath, pOpenMode);
    #else
        *ppFile = fopen(pFilePath, pOpenMode);
    #endif
#endif
    if (*ppFile == NULL) {
        drflac_result result = drflac_result_from_errno(errno);
        if (result == DRFLAC_SUCCESS) {
            result = DRFLAC_ERROR; /* Just a safety check to make sure we never ever return success when pFile == NULL. */
        }

        return result;
    }
#endif

    return DRFLAC_SUCCESS;
}

/*
_wfopen() isn't always available in all compilation environments.

    * Windows only.
    * MSVC seems to support it universally as far back as VC6 from what I can tell (haven't checked further back).
    * MinGW-64 (both 32- and 64-bit) seems to support it.
    * MinGW wraps it in !defined(__STRICT_ANSI__).
    * OpenWatcom wraps it in !defined(_NO_EXT_KEYS).

This can be reviewed as compatibility issues arise. The preference is to use _wfopen_s() and _wfopen() as opposed to the wcsrtombs()
fallback, so if you notice your compiler not detecting this properly I'm happy to look at adding support.
*/
#if defined(_WIN32)
    #if defined(_MSC_VER) || defined(__MINGW64__) || (!defined(__STRICT_ANSI__) && !defined(_NO_EXT_KEYS))
        #define DRFLAC_HAS_WFOPEN
    #endif
#endif

#ifndef DR_FLAC_NO_WCHAR
static drflac_result drflac_wfopen(FILE** ppFile, const wchar_t* pFilePath, const wchar_t* pOpenMode, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    if (ppFile != NULL) {
        *ppFile = NULL; /* Safety.
*/
    }

    if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) {
        return DRFLAC_INVALID_ARGS;
    }

#if defined(DRFLAC_HAS_WFOPEN)
    {
        /* Use _wfopen() on Windows. */
    #if defined(_MSC_VER) && _MSC_VER >= 1400
        errno_t err = _wfopen_s(ppFile, pFilePath, pOpenMode);
        if (err != 0) {
            return drflac_result_from_errno(err);
        }
    #else
        *ppFile = _wfopen(pFilePath, pOpenMode);
        if (*ppFile == NULL) {
            return drflac_result_from_errno(errno);
        }
    #endif
        (void)pAllocationCallbacks;
    }
#else
    /*
    Use fopen() on anything other than Windows. Requires a conversion. This is annoying because
    fopen() is locale specific. The only real way I can think of to do this is with wcsrtombs(). Note
    that wcstombs() is apparently not thread-safe because it uses a static global mbstate_t object for
    maintaining state. I've checked this with -std=c89 and it works, but if somebody gets a compiler
    error I'll look into improving compatibility.
    */

    /*
    Some compilers don't support wchar_t or wcsrtombs() which we're using below. In this case we just
    need to abort with an error. If you encounter a compiler lacking such support, add it to this list
    and submit a bug report and it'll be added to the library upstream.
    */
    #if defined(__DJGPP__)
    {
        /* Nothing to do here. This will fall through to the error check below. */
    }
    #else
    {
        mbstate_t mbs;
        size_t lenMB;
        const wchar_t* pFilePathTemp = pFilePath;
        char* pFilePathMB = NULL;
        char pOpenModeMB[32] = {0};

        /* Get the length first.
*/
        DRFLAC_ZERO_OBJECT(&mbs);
        lenMB = wcsrtombs(NULL, &pFilePathTemp, 0, &mbs);
        if (lenMB == (size_t)-1) {
            return drflac_result_from_errno(errno);
        }

        pFilePathMB = (char*)drflac__malloc_from_callbacks(lenMB + 1, pAllocationCallbacks);
        if (pFilePathMB == NULL) {
            return DRFLAC_OUT_OF_MEMORY;
        }

        pFilePathTemp = pFilePath;
        DRFLAC_ZERO_OBJECT(&mbs);
        wcsrtombs(pFilePathMB, &pFilePathTemp, lenMB + 1, &mbs);

        /* The open mode should always consist of ASCII characters so we should be able to do a trivial conversion. */
        {
            size_t i = 0;
            for (;;) {
                if (pOpenMode[i] == 0) {
                    pOpenModeMB[i] = '\0';
                    break;
                }

                pOpenModeMB[i] = (char)pOpenMode[i];
                i += 1;
            }
        }

        *ppFile = fopen(pFilePathMB, pOpenModeMB);

        drflac__free_from_callbacks(pFilePathMB, pAllocationCallbacks);
    }
    #endif

    if (*ppFile == NULL) {
        return DRFLAC_ERROR;
    }
#endif

    return DRFLAC_SUCCESS;
}
#endif
/* End fopen */

static size_t drflac__on_read_stdio(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    return fread(bufferOut, 1, bytesToRead, (FILE*)pUserData);
}

static drflac_bool32 drflac__on_seek_stdio(void* pUserData, int offset, drflac_seek_origin origin)
{
    DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */

    return fseek((FILE*)pUserData, offset, (origin == drflac_seek_origin_current) ?
SEEK_CUR : SEEK_SET) == 0;
}


DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return NULL;
    }

    return pFlac;
}

#ifndef DR_FLAC_NO_WCHAR
DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return NULL;
    }

    return pFlac;
}
#endif

DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return pFlac;
    }

    return pFlac;
}

#ifndef DR_FLAC_NO_WCHAR
DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;
    FILE* pFile;

    if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) {
        return NULL;
    }

    pFlac =
drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        fclose(pFile);
        return pFlac;
    }

    return pFlac;
}
#endif
#endif /* DR_FLAC_NO_STDIO */

static size_t drflac__on_read_memory(void* pUserData, void* bufferOut, size_t bytesToRead)
{
    drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData;
    size_t bytesRemaining;

    DRFLAC_ASSERT(memoryStream != NULL);
    DRFLAC_ASSERT(memoryStream->dataSize >= memoryStream->currentReadPos);

    bytesRemaining = memoryStream->dataSize - memoryStream->currentReadPos;
    if (bytesToRead > bytesRemaining) {
        bytesToRead = bytesRemaining;
    }

    if (bytesToRead > 0) {
        DRFLAC_COPY_MEMORY(bufferOut, memoryStream->data + memoryStream->currentReadPos, bytesToRead);
        memoryStream->currentReadPos += bytesToRead;
    }

    return bytesToRead;
}

static drflac_bool32 drflac__on_seek_memory(void* pUserData, int offset, drflac_seek_origin origin)
{
    drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData;

    DRFLAC_ASSERT(memoryStream != NULL);
    DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */

    if (offset > (drflac_int64)memoryStream->dataSize) {
        return DRFLAC_FALSE;
    }

    if (origin == drflac_seek_origin_current) {
        if (memoryStream->currentReadPos + offset <= memoryStream->dataSize) {
            memoryStream->currentReadPos += offset;
        } else {
            return DRFLAC_FALSE; /* Trying to seek too far forward. */
        }
    } else {
        if ((drflac_uint32)offset <= memoryStream->dataSize) {
            memoryStream->currentReadPos = offset;
        } else {
            return DRFLAC_FALSE; /* Trying to seek too far forward.
*/
        }
    }

    return DRFLAC_TRUE;
}

DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac__memory_stream memoryStream;
    drflac* pFlac;

    memoryStream.data = (const drflac_uint8*)pData;
    memoryStream.dataSize = dataSize;
    memoryStream.currentReadPos = 0;
    pFlac = drflac_open(drflac__on_read_memory, drflac__on_seek_memory, &memoryStream, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    pFlac->memoryStream = memoryStream;

    /* This is an awful hack... */
#ifndef DR_FLAC_NO_OGG
    if (pFlac->container == drflac_container_ogg)
    {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        oggbs->pUserData = &pFlac->memoryStream;
    }
    else
#endif
    {
        pFlac->bs.pUserData = &pFlac->memoryStream;
    }

    return pFlac;
}

DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac__memory_stream memoryStream;
    drflac* pFlac;

    memoryStream.data = (const drflac_uint8*)pData;
    memoryStream.dataSize = dataSize;
    memoryStream.currentReadPos = 0;
    pFlac = drflac_open_with_metadata_private(drflac__on_read_memory, drflac__on_seek_memory, onMeta, drflac_container_unknown, &memoryStream, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    pFlac->memoryStream = memoryStream;

    /* This is an awful hack...
*/
#ifndef DR_FLAC_NO_OGG
    if (pFlac->container == drflac_container_ogg)
    {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        oggbs->pUserData = &pFlac->memoryStream;
    }
    else
#endif
    {
        pFlac->bs.pUserData = &pFlac->memoryStream;
    }

    return pFlac;
}



DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, NULL, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks);
}
DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, NULL, container, pUserData, pUserData, pAllocationCallbacks);
}

DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, onMeta, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks);
}
DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    return drflac_open_with_metadata_private(onRead, onSeek, onMeta, container, pUserData, pUserData, pAllocationCallbacks);
}

DRFLAC_API void drflac_close(drflac* pFlac)
{
    if (pFlac == NULL) {
        return;
    }

#ifndef DR_FLAC_NO_STDIO
    /*
    If we opened the file with drflac_open_file() we will want to close the file handle.
We can know whether or not drflac_open_file()
    was used by looking at the callbacks.
    */
    if (pFlac->bs.onRead == drflac__on_read_stdio) {
        fclose((FILE*)pFlac->bs.pUserData);
    }

#ifndef DR_FLAC_NO_OGG
    /* Need to clean up Ogg streams a bit differently due to the way the bit streaming is chained. */
    if (pFlac->container == drflac_container_ogg) {
        drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs;
        DRFLAC_ASSERT(pFlac->bs.onRead == drflac__on_read_ogg);

        if (oggbs->onRead == drflac__on_read_stdio) {
            fclose((FILE*)oggbs->pUserData);
        }
    }
#endif
#endif

    drflac__free_from_callbacks(pFlac, &pFlac->allocationCallbacks);
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 =
        unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0;
        pOutputSamples[i*8+1] = (drflac_int32)right0;
        pOutputSamples[i*8+2] = (drflac_int32)left1;
        pOutputSamples[i*8+3] = (drflac_int32)right1;
        pOutputSamples[i*8+4] = (drflac_int32)left2;
        pOutputSamples[i*8+5] = (drflac_int32)right2;
        pOutputSamples[i*8+6] = (drflac_int32)left3;
        pOutputSamples[i*8+7] = (drflac_int32)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32*
        pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t
        shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);

        drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
        */
#if 0
        drflac_read_pcm_frames_s32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] <<
            shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0;
        pOutputSamples[i*8+1] = (drflac_int32)right0;
        pOutputSamples[i*8+2] = (drflac_int32)left1;
        pOutputSamples[i*8+3] = (drflac_int32)right1;
        pOutputSamples[i*8+4] = (drflac_int32)left2;
        pOutputSamples[i*8+5] = (drflac_int32)right2;
        pOutputSamples[i*8+6] = (drflac_int32)left3;
        pOutputSamples[i*8+7] = (drflac_int32)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample +
        pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4),
            shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);

        drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left;
        pOutputSamples[i*2+1] = (drflac_int32)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
        */
#if 0
        drflac_read_pcm_frames_s32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        /* Cast the signed inputs directly; this function declares no pInputSamples0U32/pInputSamples1U32 aliases. */
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample);
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;

    if (shift > 0) {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32
                temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (mid0 + side0) << shift;
            temp1L = (mid1 + side1) << shift;
            temp2L = (mid2 + side2) << shift;
            temp3L = (mid3 + side3) << shift;

            temp0R = (mid0 - side0) << shift;
            temp1R = (mid1 - side1) << shift;
            temp2R = (mid2 - side2) << shift;
            temp3R = (mid3 - side3) << shift;

            pOutputSamples[i*8+0] = (drflac_int32)temp0L;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R;
        }
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32
                temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1);
            temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1);
            temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1);
            temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1);

            temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1);
            temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1);
            temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1);
            temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1);

            pOutputSamples[i*8+0] = (drflac_int32)temp0L;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L;
            pOutputSamples[i*8+7] =
                (drflac_int32)temp3R;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample);
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4),
                _mm_unpackhi_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1;
            pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1;
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right));
            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift);
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift);
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32*
    pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_int32 shift = unusedBitsPerSample;
    int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */
    int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */
    uint32x4_t one4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
    one4 = vdupq_n_u32(1);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4));

            left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1;
            pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1;
        }
    } else {
        int32x4_t shift4;

        shift -= 1;
        shift4 = vdupq_n_s32(shift);

        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t
                right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4));

            left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift);
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift);
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
        */
#if 0
        drflac_read_pcm_frames_s32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample));
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample));
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 =
            pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        pOutputSamples[i*8+0] = (drflac_int32)tempL0;
        pOutputSamples[i*8+1] = (drflac_int32)tempR0;
        pOutputSamples[i*8+2] = (drflac_int32)tempL1;
        pOutputSamples[i*8+3] = (drflac_int32)tempR1;
        pOutputSamples[i*8+4] = (drflac_int32)tempL2;
        pOutputSamples[i*8+5] = (drflac_int32)tempR2;
        pOutputSamples[i*8+6] = (drflac_int32)tempL3;
        pOutputSamples[i*8+7] = (drflac_int32)tempR3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left,
            right));
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    int32x4_t shift4_0 = vdupq_n_s32(shift0);
    int32x4_t shift4_1 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t left;
        int32x4_t right;

        left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift4_0));
        right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift4_1));

        drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0);
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1,
drflac_int32* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break; /* Couldn't read the next frame, so just break from the loop and return.
*/
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_s32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_s32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving.
*/
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        pBufferOut[(i*channelCount)+j] = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                    }
                }
            }

            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration;
        }
    }

    return framesRead;
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const
drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        left0 >>= 16;
        left1 >>= 16;
        left2 >>= 16;
        left3 >>= 16;

        right0 >>= 16;
        right1 >>= 16;
        right2 >>= 16;
        right3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)left0;
        pOutputSamples[i*8+1] = (drflac_int16)right0;
        pOutputSamples[i*8+2] = (drflac_int16)left1;
        pOutputSamples[i*8+3] = (drflac_int16)right1;
        pOutputSamples[i*8+4] = (drflac_int16)left2;
        pOutputSamples[i*8+5] = (drflac_int16)right2;
        pOutputSamples[i*8+6] = (drflac_int16)left3;
        pOutputSamples[i*8+7] = (drflac_int16)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__sse2(drflac* pFlac, drflac_uint64
frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const
drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);

        left = vshrq_n_u32(left, 16);
        right = vshrq_n_u32(right, 16);

        drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1,
pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32
side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        left0 >>= 16;
        left1 >>= 16;
        left2 >>= 16;
        left3 >>= 16;

        right0 >>= 16;
        right1 >>= 16;
        right2 >>= 16;
        right3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)left0;
        pOutputSamples[i*8+1] = (drflac_int16)right0;
        pOutputSamples[i*8+2] = (drflac_int16)left1;
        pOutputSamples[i*8+3] = (drflac_int16)right1;
        pOutputSamples[i*8+4] = (drflac_int16)left2;
        pOutputSamples[i*8+5] = (drflac_int16)right2;
        pOutputSamples[i*8+6] = (drflac_int16)left3;
        pOutputSamples[i*8+7] = (drflac_int16)right3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32*
pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample +
pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);

        left = vshrq_n_u32(left, 16);
        right = vshrq_n_u32(right, 16);

        drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        left >>= 16;
        right >>= 16;

        pOutputSamples[i*2+0] = (drflac_int16)left;
        pOutputSamples[i*2+1] = (drflac_int16)right;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
*/
#if 0
        drflac_read_pcm_frames_s16__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;

    if (shift > 0) {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32
temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (mid0 + side0) << shift;
            temp1L = (mid1 + side1) << shift;
            temp2L = (mid2 + side2) << shift;
            temp3L = (mid3 + side3) << shift;

            temp0R = (mid0 - side0) << shift;
            temp1R = (mid1 - side1) << shift;
            temp2R = (mid2 - side2) << shift;
            temp3R = (mid3 - side3) << shift;

            temp0L >>= 16;
            temp1L >>= 16;
            temp2L >>= 16;
            temp3L >>= 16;

            temp0R >>= 16;
            temp1R >>= 16;
            temp2R >>= 16;
            temp3R >>= 16;

            pOutputSamples[i*8+0] = (drflac_int16)temp0L;
            pOutputSamples[i*8+1] = (drflac_int16)temp0R;
            pOutputSamples[i*8+2] = (drflac_int16)temp1L;
            pOutputSamples[i*8+3] = (drflac_int16)temp1R;
            pOutputSamples[i*8+4] = (drflac_int16)temp2L;
            pOutputSamples[i*8+5] = (drflac_int16)temp2R;
            pOutputSamples[i*8+6] =
(drflac_int16)temp3L;
            pOutputSamples[i*8+7] = (drflac_int16)temp3R;
        }
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = ((drflac_int32)(mid0 + side0) >> 1);
            temp1L = ((drflac_int32)(mid1 + side1) >> 1);
            temp2L = ((drflac_int32)(mid2 + side2) >> 1);
            temp3L = ((drflac_int32)(mid3 + side3) >> 1);

            temp0R = ((drflac_int32)(mid0 - side0) >> 1);
            temp1R = ((drflac_int32)(mid1 - side1) >> 1);
            temp2R = ((drflac_int32)(mid2 - side2) >> 1);
            temp3R = ((drflac_int32)(mid3 - side3) >> 1);

            temp0L >>= 16;
            temp1L >>= 16;
            temp2L >>= 16;
            temp3L >>= 16;

            temp0R >>= 16;
            temp1R >>= 16;
            temp2R >>= 16;
            temp3R >>=
16;

            pOutputSamples[i*8+0] = (drflac_int16)temp0L;
            pOutputSamples[i*8+1] = (drflac_int16)temp0R;
            pOutputSamples[i*8+2] = (drflac_int16)temp1L;
            pOutputSamples[i*8+3] = (drflac_int16)temp1R;
            pOutputSamples[i*8+4] = (drflac_int16)temp2L;
            pOutputSamples[i*8+5] = (drflac_int16)temp2R;
            pOutputSamples[i*8+6] = (drflac_int16)temp3L;
            pOutputSamples[i*8+7] = (drflac_int16)temp3R;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1
+ i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            left = _mm_srai_epi32(left, 16);
            right = _mm_srai_epi32(right, 16);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16);
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i left;
            __m128i right;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            left = _mm_srai_epi32(left, 16);
            right = _mm_srai_epi32(right, 16);

            _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32
side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16);
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;
    int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */
    int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            left = vshrq_n_s32(left, 16);
            right = vshrq_n_s32(right, 16);

            drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
        }

        for (i =
(frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16);
        }
    } else {
        int32x4_t shift4;

        shift -= 1;
        shift4 = vdupq_n_s32(shift);

        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t left;
            int32x4_t right;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            left = vshrq_n_s32(left, 16);
            right = vshrq_n_s32(right, 16);

            drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16);
            pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16);
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16*
pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) >> 16);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = 
frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        tempL0 >>= 16;
        tempL1 >>= 16;
        tempL2 >>= 16;
        tempL3 >>= 16;

        tempR0 >>= 16;
        tempR1 >>= 16;
        tempR2 >>= 16;
        tempR3 >>= 16;

        pOutputSamples[i*8+0] = (drflac_int16)tempL0;
        pOutputSamples[i*8+1] = (drflac_int16)tempR0;
        pOutputSamples[i*8+2] = (drflac_int16)tempL1;
        pOutputSamples[i*8+3] = (drflac_int16)tempR1;
        pOutputSamples[i*8+4] = (drflac_int16)tempL2;
        pOutputSamples[i*8+5] = (drflac_int16)tempR2;
        pOutputSamples[i*8+6] = (drflac_int16)tempL3;
        pOutputSamples[i*8+7] = (drflac_int16)tempR3;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* 
pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        left = _mm_srai_epi32(left, 16);
        right = _mm_srai_epi32(right, 16);

        /* At this point we have results. We can now pack and interleave these into a single __m128i object and then store them in the output buffer. 
*/
        _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    int32x4_t shift0_4 = vdupq_n_s32(shift0);
    int32x4_t shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t left;
        int32x4_t right;

        left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4));
        right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4));

        left = vshrq_n_s32(left, 16);
        right = vshrq_n_s32(right, 16);

        drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right)));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16);
        pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16);
    }
}
#endif

static DRFLAC_INLINE void 
drflac_read_pcm_frames_s16__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_s16__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_s16__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break; /* Couldn't read the next frame, so just break from the loop and return. 
*/
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_s16__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_s16__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving. 
*/
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                        pBufferOut[(i*channelCount)+j] = (drflac_int16)(sampleS32 >> 16);
                    }
                }
            }

            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration;
        }
    }

    return framesRead;
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* 
pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 right0 = left0 - side0;
        drflac_uint32 right1 = left1 - side1;
        drflac_uint32 right2 = left2 - side2;
        drflac_uint32 right3 = left3 - side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)right0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)left1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)right1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)left2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)right2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)left3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)right3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left * factor;
        pOutputSamples[i*2+1] = (drflac_int32)right * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static 
DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    __m128 factor;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = _mm_set1_ps(1.0f / 8388608.0f);

    for (i = 0; i < frameCount4; ++i) {
        __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i right = _mm_sub_epi32(left, side);
        __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor);
        __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* 
pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    float32x4_t factor4;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor4 = vdupq_n_f32(1.0f / 8388608.0f);
    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t left;
        uint32x4_t side;
        uint32x4_t right;
        float32x4_t leftf;
        float32x4_t rightf;

        left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        right = vsubq_u32(left, side);
        leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 left = pInputSamples0U32[i] << shift0;
        drflac_uint32 side = pInputSamples1U32[i] << shift1;
        drflac_uint32 right = left - side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if 
defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    for (i = 0; i < frameCount; ++i) {
        drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
        drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* 
pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0;
        drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1;

        drflac_uint32 left0 = right0 + side0;
        drflac_uint32 left1 = right1 + side1;
        drflac_uint32 left2 = right2 + side2;
        drflac_uint32 left3 = right3 + side3;

        pOutputSamples[i*8+0] = (drflac_int32)left0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)right0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)left1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)right1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)left2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)right2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)left3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)right3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left * 
factor;
        pOutputSamples[i*2+1] = (drflac_int32)right * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    __m128 factor;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = _mm_set1_ps(1.0f / 8388608.0f);

    for (i = 0; i < frameCount4; ++i) {
        __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);
        __m128i left = _mm_add_epi32(right, side);
        __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor);
        __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void 
drflac_read_pcm_frames_f32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;
    float32x4_t factor4;
    int32x4_t shift0_4;
    int32x4_t shift1_4;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor4 = vdupq_n_f32(1.0f / 8388608.0f);
    shift0_4 = vdupq_n_s32(shift0);
    shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        uint32x4_t side;
        uint32x4_t right;
        uint32x4_t left;
        float32x4_t leftf;
        float32x4_t rightf;

        side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4);
        right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4);
        left = vaddq_u32(right, side);
        leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 side = pInputSamples0U32[i] << shift0;
        drflac_uint32 right = pInputSamples1U32[i] << shift1;
        drflac_uint32 left = right + side;

        pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f;
        pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, 
drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback. */
#if 0
        drflac_read_pcm_frames_f32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}


#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (float)((((drflac_int32)(mid + side) >> 1) << (unusedBitsPerSample)) / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((((drflac_int32)(mid - side) >> 1) << (unusedBitsPerSample)) / 2147483648.0);
    }
}
#endif

static DRFLAC_INLINE void 
drflac_read_pcm_frames_f32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample;
    float factor = 1 / 2147483648.0;

    if (shift > 0) {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (mid0 + side0) << shift;
            temp1L = (mid1 + side1) << 
shift;
            temp2L = (mid2 + side2) << shift;
            temp3L = (mid3 + side3) << shift;

            temp0R = (mid0 - side0) << shift;
            temp1R = (mid1 - side1) << shift;
            temp2R = (mid2 - side2) << shift;
            temp3R = (mid3 - side3) << shift;

            pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor;
        }
    } else {
        for (i = 0; i < frameCount4; ++i) {
            drflac_uint32 temp0L;
            drflac_uint32 temp1L;
            drflac_uint32 temp2L;
            drflac_uint32 temp3L;
            drflac_uint32 temp0R;
            drflac_uint32 temp1R;
            drflac_uint32 temp2R;
            drflac_uint32 temp3R;

            drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;

            drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
            drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid0 = (mid0 << 1) | (side0 & 0x01);
            mid1 = (mid1 << 1) | (side1 & 0x01);
            mid2 = (mid2 << 1) | 
                (side2 & 0x01);
            mid3 = (mid3 << 1) | (side3 & 0x01);

            temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1);
            temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1);
            temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1);
            temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1);

            temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1);
            temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1);
            temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1);
            temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1);

            pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor;
            pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor;
            pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor;
            pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor;
            pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor;
            pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor;
            pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor;
            pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor;
        }
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
        drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

        mid = (mid << 1) | (side & 0x01);

        pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64
        frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample - 8;
    float factor;
    __m128 factor128;

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = 1.0f / 8388608.0f;
    factor128 = _mm_set1_ps(factor);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i tempL;
            __m128i tempR;
            __m128 leftf;
            __m128 rightf;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            tempL = _mm_srai_epi32(_mm_add_epi32(mid, side), 1);
            tempR = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1);

            leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128);
            rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128);

            _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
            _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor;
            pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor;
        }
    } else {
        shift -= 1;
        for (i = 0; i < frameCount4; ++i) {
            __m128i mid;
            __m128i side;
            __m128i tempL;
            __m128i tempR;
            __m128
                leftf;
            __m128 rightf;

            mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
            side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

            mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01)));

            tempL = _mm_slli_epi32(_mm_add_epi32(mid, side), shift);
            tempR = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift);

            leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128);
            rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128);

            _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
            _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor;
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor;
        }
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift = unusedBitsPerSample - 8;
    float factor;
    float32x4_t factor4;
    int32x4_t shift4;
    int32x4_t wbps0_4; /*
                          Wasted Bits Per Sample */
    int32x4_t wbps1_4; /* Wasted Bits Per Sample */

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 24);

    factor = 1.0f / 8388608.0f;
    factor4 = vdupq_n_f32(factor);
    wbps0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample);
    wbps1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample);

    if (shift == 0) {
        for (i = 0; i < frameCount4; ++i) {
            int32x4_t lefti;
            int32x4_t righti;
            float32x4_t leftf;
            float32x4_t rightf;

            uint32x4_t mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4);
            uint32x4_t side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4);

            mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            lefti = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1);
            righti = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1);

            leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
            rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

            drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor;
            pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor;
        }
    } else {
        shift -= 1;
        shift4 = vdupq_n_s32(shift);
        for (i = 0; i < frameCount4; ++i) {
            uint32x4_t mid;
            uint32x4_t side;
            int32x4_t lefti;
            int32x4_t righti;
            float32x4_t leftf;
            float32x4_t rightf;

            mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4);
            side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4);

            mid =
                vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1)));

            lefti = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4));
            righti = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4));

            leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
            rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

            drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
        }

        for (i = (frameCount4 << 2); i < frameCount; ++i) {
            drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
            drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;

            mid = (mid << 1) | (side & 0x01);

            pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor;
            pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor;
        }
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
        */
#if 0
        drflac_read_pcm_frames_f32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

#if 0
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    for (drflac_uint64 i = 0; i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (float)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) / 2147483648.0);
        pOutputSamples[i*2+1] = (float)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) / 2147483648.0);
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample;
    drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample;
    float factor = 1 / 2147483648.0;

    for (i = 0; i < frameCount4; ++i) {
        drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0;
        drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] <<
            shift0;
        drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0;
        drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0;

        drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1;
        drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1;
        drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1;
        drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1;

        pOutputSamples[i*8+0] = (drflac_int32)tempL0 * factor;
        pOutputSamples[i*8+1] = (drflac_int32)tempR0 * factor;
        pOutputSamples[i*8+2] = (drflac_int32)tempL1 * factor;
        pOutputSamples[i*8+3] = (drflac_int32)tempR1 * factor;
        pOutputSamples[i*8+4] = (drflac_int32)tempL2 * factor;
        pOutputSamples[i*8+5] = (drflac_int32)tempR2 * factor;
        pOutputSamples[i*8+6] = (drflac_int32)tempL3 * factor;
        pOutputSamples[i*8+7] = (drflac_int32)tempR3 * factor;
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}

#if defined(DRFLAC_SUPPORT_SSE2)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;

    float factor = 1.0f / 8388608.0f;
    __m128 factor128 = _mm_set1_ps(factor);

    for (i = 0; i <
            frameCount4; ++i) {
        __m128i lefti;
        __m128i righti;
        __m128 leftf;
        __m128 rightf;

        lefti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0);
        righti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1);

        leftf = _mm_mul_ps(_mm_cvtepi32_ps(lefti), factor128);
        rightf = _mm_mul_ps(_mm_cvtepi32_ps(righti), factor128);

        _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf));
        _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}
#endif

#if defined(DRFLAC_SUPPORT_NEON)
static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
    drflac_uint64 i;
    drflac_uint64 frameCount4 = frameCount >> 2;
    const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0;
    const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1;
    drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8;
    drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8;

    float factor = 1.0f / 8388608.0f;
    float32x4_t factor4 = vdupq_n_f32(factor);
    int32x4_t shift0_4 = vdupq_n_s32(shift0);
    int32x4_t shift1_4 = vdupq_n_s32(shift1);

    for (i = 0; i < frameCount4; ++i) {
        int32x4_t lefti;
        int32x4_t righti;
        float32x4_t leftf;
        float32x4_t rightf;

        lefti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4),
            shift0_4));
        righti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4));

        leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4);
        rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4);

        drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf));
    }

    for (i = (frameCount4 << 2); i < frameCount; ++i) {
        pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor;
        pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor;
    }
}
#endif

static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples)
{
#if defined(DRFLAC_SUPPORT_SSE2)
    if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#elif defined(DRFLAC_SUPPORT_NEON)
    if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) {
        drflac_read_pcm_frames_f32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
    } else
#endif
    {
        /* Scalar fallback.
        */
#if 0
        drflac_read_pcm_frames_f32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#else
        drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples);
#endif
    }
}

DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut)
{
    drflac_uint64 framesRead;
    drflac_uint32 unusedBitsPerSample;

    if (pFlac == NULL || framesToRead == 0) {
        return 0;
    }

    if (pBufferOut == NULL) {
        return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead);
    }

    DRFLAC_ASSERT(pFlac->bitsPerSample <= 32);
    unusedBitsPerSample = 32 - pFlac->bitsPerSample;

    framesRead = 0;
    while (framesToRead > 0) {
        /* If we've run out of samples in this frame, go to the next. */
        if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) {
            if (!drflac__read_and_decode_next_flac_frame(pFlac)) {
                break;  /* Couldn't read the next frame, so just break from the loop and return.
                        */
            }
        } else {
            unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment);
            drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining;
            drflac_uint64 frameCountThisIteration = framesToRead;

            if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) {
                frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining;
            }

            if (channelCount == 2) {
                const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame;
                const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame;

                switch (pFlac->currentFLACFrame.header.channelAssignment)
                {
                    case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE:
                    {
                        drflac_read_pcm_frames_f32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;

                    case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT:
                    default:
                    {
                        drflac_read_pcm_frames_f32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut);
                    } break;
                }
            } else {
                /* Generic interleaving.
                */
                drflac_uint64 i;
                for (i = 0; i < frameCountThisIteration; ++i) {
                    unsigned int j;
                    for (j = 0; j < channelCount; ++j) {
                        drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample));
                        pBufferOut[(i*channelCount)+j] = (float)(sampleS32 / 2147483648.0);
                    }
                }
            }

            framesRead += frameCountThisIteration;
            pBufferOut += frameCountThisIteration * channelCount;
            framesToRead -= frameCountThisIteration;
            pFlac->currentPCMFrame += frameCountThisIteration;
            pFlac->currentFLACFrame.pcmFramesRemaining -= (unsigned int)frameCountThisIteration;
        }
    }

    return framesRead;
}


DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex)
{
    if (pFlac == NULL) {
        return DRFLAC_FALSE;
    }

    /* Don't do anything if we're already on the seek point. */
    if (pFlac->currentPCMFrame == pcmFrameIndex) {
        return DRFLAC_TRUE;
    }

    /*
    If we don't know where the first frame begins then we can't seek. This will happen when the STREAMINFO block was not present
    when the decoder was opened.
    */
    if (pFlac->firstFLACFramePosInBytes == 0) {
        return DRFLAC_FALSE;
    }

    if (pcmFrameIndex == 0) {
        pFlac->currentPCMFrame = 0;
        return drflac__seek_to_first_frame(pFlac);
    } else {
        drflac_bool32 wasSuccessful = DRFLAC_FALSE;
        drflac_uint64 originalPCMFrame = pFlac->currentPCMFrame;

        /* Clamp the sample to the end. */
        if (pcmFrameIndex > pFlac->totalPCMFrameCount) {
            pcmFrameIndex = pFlac->totalPCMFrameCount;
        }

        /* If the target sample and the current sample are in the same frame we just move the position forward. */
        if (pcmFrameIndex > pFlac->currentPCMFrame) {
            /* Forward.
            */
            drflac_uint32 offset = (drflac_uint32)(pcmFrameIndex - pFlac->currentPCMFrame);
            if (pFlac->currentFLACFrame.pcmFramesRemaining > offset) {
                pFlac->currentFLACFrame.pcmFramesRemaining -= offset;
                pFlac->currentPCMFrame = pcmFrameIndex;
                return DRFLAC_TRUE;
            }
        } else {
            /* Backward. */
            drflac_uint32 offsetAbs = (drflac_uint32)(pFlac->currentPCMFrame - pcmFrameIndex);
            drflac_uint32 currentFLACFramePCMFrameCount = pFlac->currentFLACFrame.header.blockSizeInPCMFrames;
            drflac_uint32 currentFLACFramePCMFramesConsumed = currentFLACFramePCMFrameCount - pFlac->currentFLACFrame.pcmFramesRemaining;
            if (currentFLACFramePCMFramesConsumed > offsetAbs) {
                pFlac->currentFLACFrame.pcmFramesRemaining += offsetAbs;
                pFlac->currentPCMFrame = pcmFrameIndex;
                return DRFLAC_TRUE;
            }
        }

        /*
        Different techniques depending on encapsulation. Using the native FLAC seektable with Ogg encapsulation is a bit awkward so
        we'll instead use Ogg's natural seeking facility.
        */
#ifndef DR_FLAC_NO_OGG
        if (pFlac->container == drflac_container_ogg)
        {
            wasSuccessful = drflac_ogg__seek_to_pcm_frame(pFlac, pcmFrameIndex);
        }
        else
#endif
        {
            /* First try seeking via the seek table. If this fails, fall back to a brute force seek which is much slower. */
            if (/*!wasSuccessful && */!pFlac->_noSeekTableSeek) {
                wasSuccessful = drflac__seek_to_pcm_frame__seek_table(pFlac, pcmFrameIndex);
            }

#if !defined(DR_FLAC_NO_CRC)
            /* Fall back to binary search if seek table seeking fails. This requires the length of the stream to be known. */
            if (!wasSuccessful && !pFlac->_noBinarySearchSeek && pFlac->totalPCMFrameCount > 0) {
                wasSuccessful = drflac__seek_to_pcm_frame__binary_search(pFlac, pcmFrameIndex);
            }
#endif

            /* Fall back to brute force if all else fails.
            */
            if (!wasSuccessful && !pFlac->_noBruteForceSeek) {
                wasSuccessful = drflac__seek_to_pcm_frame__brute_force(pFlac, pcmFrameIndex);
            }
        }

        if (wasSuccessful) {
            pFlac->currentPCMFrame = pcmFrameIndex;
        } else {
            /* Seek failed. Try putting the decoder back to its original state. */
            if (drflac_seek_to_pcm_frame(pFlac, originalPCMFrame) == DRFLAC_FALSE) {
                /* Failed to seek back to the original PCM frame. Fall back to 0. */
                drflac_seek_to_pcm_frame(pFlac, 0);
            }
        }

        return wasSuccessful;
    }
}



/* High Level APIs */

/* SIZE_MAX */
#if defined(SIZE_MAX)
    #define DRFLAC_SIZE_MAX  SIZE_MAX
#else
    #if defined(DRFLAC_64BIT)
        #define DRFLAC_SIZE_MAX  ((drflac_uint64)0xFFFFFFFFFFFFFFFF)
    #else
        #define DRFLAC_SIZE_MAX  0xFFFFFFFF
    #endif
#endif
/* End SIZE_MAX */


/* Using a macro as the definition of the drflac__full_read_and_close_*() API family. Sue me. */
#define DRFLAC_DEFINE_FULL_READ_AND_CLOSE(extension, type) \
static type* drflac__full_read_and_close_ ## extension (drflac* pFlac, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut)\
{ \
    type* pSampleData = NULL; \
    drflac_uint64 totalPCMFrameCount; \
    \
    DRFLAC_ASSERT(pFlac != NULL); \
    \
    totalPCMFrameCount = pFlac->totalPCMFrameCount; \
    \
    if (totalPCMFrameCount == 0) { \
        type buffer[4096]; \
        drflac_uint64 pcmFramesRead; \
        size_t sampleDataBufferSize = sizeof(buffer); \
        \
        pSampleData = (type*)drflac__malloc_from_callbacks(sampleDataBufferSize, &pFlac->allocationCallbacks); \
        if (pSampleData == NULL) { \
            goto on_error; \
        } \
        \
        while ((pcmFramesRead = (drflac_uint64)drflac_read_pcm_frames_##extension(pFlac, sizeof(buffer)/sizeof(buffer[0])/pFlac->channels, buffer)) > 0) { \
            if (((totalPCMFrameCount + pcmFramesRead) * pFlac->channels * sizeof(type)) > \
                    sampleDataBufferSize) { \
                type* pNewSampleData; \
                size_t newSampleDataBufferSize; \
                \
                newSampleDataBufferSize = sampleDataBufferSize * 2; \
                pNewSampleData = (type*)drflac__realloc_from_callbacks(pSampleData, newSampleDataBufferSize, sampleDataBufferSize, &pFlac->allocationCallbacks); \
                if (pNewSampleData == NULL) { \
                    drflac__free_from_callbacks(pSampleData, &pFlac->allocationCallbacks); \
                    goto on_error; \
                } \
                \
                sampleDataBufferSize = newSampleDataBufferSize; \
                pSampleData = pNewSampleData; \
            } \
            \
            DRFLAC_COPY_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), buffer, (size_t)(pcmFramesRead*pFlac->channels*sizeof(type))); \
            totalPCMFrameCount += pcmFramesRead; \
        } \
        \
        /* At this point everything should be decoded, but we just want to fill the unused part of the buffer with silence - need to \
           protect those ears from random noise! */ \
        DRFLAC_ZERO_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), (size_t)(sampleDataBufferSize - totalPCMFrameCount*pFlac->channels*sizeof(type))); \
    } else { \
        drflac_uint64 dataSize = totalPCMFrameCount*pFlac->channels*sizeof(type); \
        if (dataSize > (drflac_uint64)DRFLAC_SIZE_MAX) { \
            goto on_error;  /* The decoded data is too big. */ \
        } \
        \
        pSampleData = (type*)drflac__malloc_from_callbacks((size_t)dataSize, &pFlac->allocationCallbacks);    /* <-- Safe cast as per the check above. \
        */ \
        if (pSampleData == NULL) { \
            goto on_error; \
        } \
        \
        totalPCMFrameCount = drflac_read_pcm_frames_##extension(pFlac, pFlac->totalPCMFrameCount, pSampleData); \
    } \
    \
    if (sampleRateOut) *sampleRateOut = pFlac->sampleRate; \
    if (channelsOut) *channelsOut = pFlac->channels; \
    if (totalPCMFrameCountOut) *totalPCMFrameCountOut = totalPCMFrameCount; \
    \
    drflac_close(pFlac); \
    return pSampleData; \
    \
on_error: \
    drflac_close(pFlac); \
    return NULL; \
}

DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s32, drflac_int32)
DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s16, drflac_int16)
DRFLAC_DEFINE_FULL_READ_AND_CLOSE(f32, float)

DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData,
            pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s16(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (channelsOut) {
        *channelsOut = 0;
    }
    if (sampleRateOut) {
        *sampleRateOut = 0;
    }
    if (totalPCMFrameCountOut) {
        *totalPCMFrameCountOut = 0;
    }

    pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_f32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut);
}

#ifndef DR_FLAC_NO_STDIO
DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_file(filename, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount);
}

DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount)
    {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_file(filename, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount);
}

DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_file(filename, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount);
}
#endif

DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount) {
        *totalPCMFrameCount = 0;
    }

    pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);
    if (pFlac == NULL) {
        return NULL;
    }

    return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount);
}

DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)
{
    drflac* pFlac;

    if (sampleRate) {
        *sampleRate = 0;
    }
    if (channels) {
        *channels = 0;
    }
    if (totalPCMFrameCount)
{11939*totalPCMFrameCount = 0;11940}1194111942pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);11943if (pFlac == NULL) {11944return NULL;11945}1194611947return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount);11948}1194911950DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks)11951{11952drflac* pFlac;1195311954if (sampleRate) {11955*sampleRate = 0;11956}11957if (channels) {11958*channels = 0;11959}11960if (totalPCMFrameCount) {11961*totalPCMFrameCount = 0;11962}1196311964pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks);11965if (pFlac == NULL) {11966return NULL;11967}1196811969return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount);11970}119711197211973DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks)11974{11975if (pAllocationCallbacks != NULL) {11976drflac__free_from_callbacks(p, pAllocationCallbacks);11977} else {11978drflac__free_default(p, NULL);11979}11980}1198111982119831198411985DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments)11986{11987if (pIter == NULL) {11988return;11989}1199011991pIter->countRemaining = commentCount;11992pIter->pRunningData = (const char*)pComments;11993}1199411995DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut)11996{11997drflac_int32 length;11998const char* pComment;1199912000/* Safety. 
    */
    if (pCommentLengthOut) {
        *pCommentLengthOut = 0;
    }

    if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) {
        return NULL;
    }

    length = drflac__le2host_32_ptr_unaligned(pIter->pRunningData);
    pIter->pRunningData += 4;

    pComment = pIter->pRunningData;
    pIter->pRunningData += length;
    pIter->countRemaining -= 1;

    if (pCommentLengthOut) {
        *pCommentLengthOut = length;
    }

    return pComment;
}




DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData)
{
    if (pIter == NULL) {
        return;
    }

    pIter->countRemaining = trackCount;
    pIter->pRunningData = (const char*)pTrackData;
}

DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack)
{
    drflac_cuesheet_track cuesheetTrack;
    const char* pRunningData;
    drflac_uint64 offsetHi;
    drflac_uint64 offsetLo;

    if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) {
        return DRFLAC_FALSE;
    }

    pRunningData = pIter->pRunningData;

    offsetHi = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4;
    offsetLo = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4;
    cuesheetTrack.offset = offsetLo | (offsetHi << 32);
    cuesheetTrack.trackNumber = pRunningData[0]; pRunningData += 1;
    DRFLAC_COPY_MEMORY(cuesheetTrack.ISRC, pRunningData, sizeof(cuesheetTrack.ISRC)); pRunningData += 12;
    cuesheetTrack.isAudio = (pRunningData[0] & 0x80) != 0;
    cuesheetTrack.preEmphasis = (pRunningData[0] & 0x40) != 0; pRunningData += 14;
    cuesheetTrack.indexCount = pRunningData[0]; pRunningData += 1;
    cuesheetTrack.pIndexPoints = (const drflac_cuesheet_track_index*)pRunningData; pRunningData += cuesheetTrack.indexCount * sizeof(drflac_cuesheet_track_index);

    pIter->pRunningData = pRunningData;
    pIter->countRemaining -= 1;

    if (pCuesheetTrack) {
        *pCuesheetTrack = cuesheetTrack;
    }

    return DRFLAC_TRUE;
}

#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)))
    #pragma GCC diagnostic pop
#endif
#endif  /* dr_flac_c */
#endif  /* DR_FLAC_IMPLEMENTATION */


/*
REVISION HISTORY
================
v0.12.42 - 2023-11-02
  - Fix build for ARMv6-M.
  - Fix a compilation warning with GCC.

v0.12.41 - 2023-06-17
  - Fix an incorrect date in revision history. No functional change.

v0.12.40 - 2023-05-22
  - Minor code restructure. No functional change.

v0.12.39 - 2022-09-17
  - Fix compilation with DJGPP.
  - Fix compilation error with Visual Studio 2019 and the ARM build.
  - Fix an error with SSE 4.1 detection.
  - Add support for disabling wchar_t with DR_FLAC_NO_WCHAR.
  - Improve compatibility with compilers which lack support for explicit struct packing.
  - Improve compatibility with low-end and embedded hardware by reducing the amount of stack
    allocation when loading an Ogg encapsulated file.

v0.12.38 - 2022-04-10
  - Fix compilation error on older versions of GCC.

v0.12.37 - 2022-02-12
  - Improve ARM detection.

v0.12.36 - 2022-02-07
  - Fix a compilation error with the ARM build.

v0.12.35 - 2022-02-06
  - Fix a bug due to underestimating the amount of precision required for the prediction stage.
  - Fix some bugs found from fuzz testing.

v0.12.34 - 2022-01-07
  - Fix some misalignment bugs when reading metadata.

v0.12.33 - 2021-12-22
  - Fix a bug with seeking when the seek table does not start at PCM frame 0.

v0.12.32 - 2021-12-11
  - Fix a warning with Clang.

v0.12.31 - 2021-08-16
  - Silence some warnings.

v0.12.30 - 2021-07-31
  - Fix platform detection for ARM64.

v0.12.29 - 2021-04-02
  - Fix a bug where the running PCM frame index is set to an invalid value when over-seeking.
  - Fix a decoding error due to an incorrect validation check.

v0.12.28 - 2021-02-21
  - Fix a warning due to referencing _MSC_VER when it is undefined.

v0.12.27 - 2021-01-31
  - Fix a static analysis warning.

v0.12.26 - 2021-01-17
  - Fix a compilation warning due to _BSD_SOURCE being deprecated.

v0.12.25 - 2020-12-26
  - Update documentation.

v0.12.24 - 2020-11-29
  - Fix ARM64/NEON detection when compiling with MSVC.

v0.12.23 - 2020-11-21
  - Fix compilation with OpenWatcom.

v0.12.22 - 2020-11-01
  - Fix an error with the previous release.

v0.12.21 - 2020-11-01
  - Fix a possible deadlock when seeking.
  - Improve compiler support for older versions of GCC.

v0.12.20 - 2020-09-08
  - Fix a compilation error on older compilers.

v0.12.19 - 2020-08-30
  - Fix a bug due to an undefined 32-bit shift.

v0.12.18 - 2020-08-14
  - Fix a crash when compiling with clang-cl.

v0.12.17 - 2020-08-02
  - Simplify sized types.

v0.12.16 - 2020-07-25
  - Fix a compilation warning.

v0.12.15 - 2020-07-06
  - Check for negative LPC shifts and return an error.

v0.12.14 - 2020-06-23
  - Add include guard for the implementation section.

v0.12.13 - 2020-05-16
  - Add compile-time and run-time version querying.
    - DRFLAC_VERSION_MINOR
    - DRFLAC_VERSION_MAJOR
    - DRFLAC_VERSION_REVISION
    - DRFLAC_VERSION_STRING
    - drflac_version()
    - drflac_version_string()

v0.12.12 - 2020-04-30
  - Fix compilation errors with VC6.

v0.12.11 - 2020-04-19
  - Fix some pedantic warnings.
  - Fix some undefined behaviour warnings.

v0.12.10 - 2020-04-10
  - Fix some
    bugs when trying to seek with an invalid seek table.

v0.12.9 - 2020-04-05
  - Fix warnings.

v0.12.8 - 2020-04-04
  - Add drflac_open_file_w() and drflac_open_file_with_metadata_w().
  - Fix some static analysis warnings.
  - Minor documentation updates.

v0.12.7 - 2020-03-14
  - Fix compilation errors with VC6.

v0.12.6 - 2020-03-07
  - Fix compilation error with Visual Studio .NET 2003.

v0.12.5 - 2020-01-30
  - Silence some static analysis warnings.

v0.12.4 - 2020-01-29
  - Silence some static analysis warnings.

v0.12.3 - 2019-12-02
  - Fix some warnings when compiling with GCC and the -Og flag.
  - Fix a crash in out-of-memory situations.
  - Fix potential integer overflow bug.
  - Fix some static analysis warnings.
  - Fix a possible crash when using custom memory allocators without a custom realloc() implementation.
  - Fix a bug with binary search seeking where the bits per sample is not a multiple of 8.

v0.12.2 - 2019-10-07
  - Internal code clean up.

v0.12.1 - 2019-09-29
  - Fix some Clang Static Analyzer warnings.
  - Fix an unused variable warning.

v0.12.0 - 2019-09-23
  - API CHANGE: Add support for user defined memory allocation routines. This system allows the program to specify their own memory allocation
    routines with a user data pointer for client-specific contextual data. This adds an extra parameter to the end of the following APIs:
    - drflac_open()
    - drflac_open_relaxed()
    - drflac_open_with_metadata()
    - drflac_open_with_metadata_relaxed()
    - drflac_open_file()
    - drflac_open_file_with_metadata()
    - drflac_open_memory()
    - drflac_open_memory_with_metadata()
    - drflac_open_and_read_pcm_frames_s32()
    - drflac_open_and_read_pcm_frames_s16()
    - drflac_open_and_read_pcm_frames_f32()
    - drflac_open_file_and_read_pcm_frames_s32()
    - drflac_open_file_and_read_pcm_frames_s16()
    - drflac_open_file_and_read_pcm_frames_f32()
    - drflac_open_memory_and_read_pcm_frames_s32()
    - drflac_open_memory_and_read_pcm_frames_s16()
    - drflac_open_memory_and_read_pcm_frames_f32()
    Set this extra parameter to NULL to use defaults which is the same as the previous behaviour. Setting this to NULL will use
    DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE.
  - Remove deprecated APIs:
    - drflac_read_s32()
    - drflac_read_s16()
    - drflac_read_f32()
    - drflac_seek_to_sample()
    - drflac_open_and_decode_s32()
    - drflac_open_and_decode_s16()
    - drflac_open_and_decode_f32()
    - drflac_open_and_decode_file_s32()
    - drflac_open_and_decode_file_s16()
    - drflac_open_and_decode_file_f32()
    - drflac_open_and_decode_memory_s32()
    - drflac_open_and_decode_memory_s16()
    - drflac_open_and_decode_memory_f32()
  - Remove drflac.totalSampleCount which is now replaced with drflac.totalPCMFrameCount. You can emulate drflac.totalSampleCount
    by doing pFlac->totalPCMFrameCount*pFlac->channels.
  - Rename drflac.currentFrame to drflac.currentFLACFrame to remove ambiguity with PCM frames.
  - Fix errors when seeking to the end of a stream.
  - Optimizations to seeking.
  - SSE improvements and optimizations.
  - ARM NEON optimizations.
  - Optimizations to drflac_read_pcm_frames_s16().
  - Optimizations to drflac_read_pcm_frames_s32().

v0.11.10 - 2019-06-26
  - Fix a compiler error.

v0.11.9 - 2019-06-16
  - Silence some ThreadSanitizer warnings.

v0.11.8 - 2019-05-21
  - Fix warnings.

v0.11.7 - 2019-05-06
  - C89 fixes.

v0.11.6 - 2019-05-05
  - Add support for C89.
  - Fix a compiler warning when CRC is disabled.
  - Change license to choice of public domain or MIT-0.

v0.11.5 - 2019-04-19
  - Fix a compiler error with GCC.

v0.11.4 - 2019-04-17
  - Fix some warnings with GCC when compiling with -std=c99.

v0.11.3 - 2019-04-07
  - Silence warnings with GCC.

v0.11.2 - 2019-03-10
  - Fix a warning.

v0.11.1 - 2019-02-17
  - Fix a potential bug with seeking.

v0.11.0 - 2018-12-16
  - API CHANGE: Deprecated drflac_read_s32(), drflac_read_s16() and drflac_read_f32() and replaced them with
    drflac_read_pcm_frames_s32(), drflac_read_pcm_frames_s16() and drflac_read_pcm_frames_f32(). The new APIs take
    and return PCM frame counts instead of sample counts. To upgrade you will need to change the input count by
    dividing it by the channel count, and then do the same with the return value.
  - API CHANGE: Deprecated drflac_seek_to_sample() and replaced with drflac_seek_to_pcm_frame(). Same rules as
    the changes to drflac_read_*() apply.
  - API CHANGE: Deprecated drflac_open_and_decode_*() and replaced with drflac_open_*_and_read_*(). Same rules as
    the changes to drflac_read_*() apply.
  - Optimizations.

v0.10.0 - 2018-09-11
  - Remove the DR_FLAC_NO_WIN32_IO option and the Win32 file IO functionality. If you need to use Win32 file IO you
    need to do it yourself via the callback API.
  - Fix the clang build.
  - Fix undefined behavior.
  - Fix errors with CUESHEET metadata blocks.
  - Add an API for iterating over each cuesheet track in the CUESHEET metadata block. This works the same way as the
    Vorbis comment API.
  - Other miscellaneous bug fixes, mostly relating to invalid FLAC streams.
  - Minor optimizations.

v0.9.11 - 2018-08-29
  - Fix a bug with sample reconstruction.

v0.9.10 - 2018-08-07
  - Improve 64-bit detection.

v0.9.9 - 2018-08-05
  - Fix C++ build on older versions of GCC.

v0.9.8 - 2018-07-24
  - Fix compilation errors.

v0.9.7 - 2018-07-05
  - Fix a warning.

v0.9.6 - 2018-06-29
  - Fix some typos.

v0.9.5 - 2018-06-23
  - Fix some warnings.

v0.9.4 - 2018-06-14
  - Optimizations to seeking.
  - Clean up.

v0.9.3 - 2018-05-22
  - Bug fix.

v0.9.2 - 2018-05-12
  - Fix a compilation error due to a missing break statement.

v0.9.1 - 2018-04-29
  - Fix compilation error with Clang.

v0.9 - 2018-04-24
  - Fix Clang build.
  - Start using major.minor.revision versioning.

v0.8g - 2018-04-19
  - Fix build on non-x86/x64 architectures.

v0.8f - 2018-02-02
  - Stop pretending to support changing rate/channels mid stream.

v0.8e - 2018-02-01
  - Fix a crash when the block size of a frame is larger than the maximum block size defined by the FLAC stream.
  - Fix a crash when the Rice partition order is invalid.

v0.8d - 2017-09-22
  - Add support for decoding streams with ID3 tags.
    ID3 tags are just skipped.

v0.8c - 2017-09-07
  - Fix warning on non-x86/x64 architectures.

v0.8b - 2017-08-19
  - Fix build on non-x86/x64 architectures.

v0.8a - 2017-08-13
  - A small optimization for the Clang build.

v0.8 - 2017-08-12
  - API CHANGE: Rename dr_* types to drflac_*.
  - Optimizations. This brings dr_flac back to about the same class of efficiency as the reference implementation.
  - Add support for custom implementations of malloc(), realloc(), etc.
  - Add CRC checking to Ogg encapsulated streams.
  - Fix VC++ 6 build. This is only for the C++ compiler. The C compiler is not currently supported.
  - Bug fixes.

v0.7 - 2017-07-23
  - Add support for opening a stream without a header block. To do this, use drflac_open_relaxed() / drflac_open_with_metadata_relaxed().

v0.6 - 2017-07-22
  - Add support for recovering from invalid frames. With this change, dr_flac will simply skip over invalid frames as if they
    never existed. Frames are checked against their sync code, the CRC-8 of the frame header and the CRC-16 of the whole frame.

v0.5 - 2017-07-16
  - Fix typos.
  - Change drflac_bool* types to unsigned.
  - Add CRC checking. This makes dr_flac slower, but can be disabled with #define DR_FLAC_NO_CRC.

v0.4f - 2017-03-10
  - Fix a couple of bugs with the bitstreaming code.

v0.4e - 2017-02-17
  - Fix some warnings.

v0.4d - 2016-12-26
  - Add support for 32-bit floating-point PCM decoding.
  - Use drflac_int* and drflac_uint* sized types to improve compiler support.
  - Minor improvements to documentation.

v0.4c - 2016-12-26
  - Add support for signed 16-bit integer PCM decoding.

v0.4b - 2016-10-23
  - A minor change to drflac_bool8 and drflac_bool32 types.

v0.4a - 2016-10-11
  - Rename drBool32 to drflac_bool32 for styling consistency.

v0.4 - 2016-09-29
  - API/ABI CHANGE: Use fixed size 32-bit booleans instead of the built-in bool type.
  - API CHANGE: Rename drflac_open_and_decode*() to drflac_open_and_decode*_s32().
  - API CHANGE: Swap the order of "channels" and "sampleRate" parameters in drflac_open_and_decode*(). Rationale for this is to
    keep it consistent with drflac_audio.

v0.3f - 2016-09-21
  - Fix a warning with GCC.

v0.3e - 2016-09-18
  - Fixed a bug where GCC 4.3+ was not getting properly identified.
  - Fixed a few typos.
  - Changed date formats to ISO 8601 (YYYY-MM-DD).

v0.3d - 2016-06-11
  - Minor clean up.

v0.3c - 2016-05-28
  - Fixed compilation error.

v0.3b - 2016-05-16
  - Fixed Linux/GCC build.
  - Updated documentation.

v0.3a - 2016-05-15
  - Minor fixes to documentation.

v0.3 - 2016-05-11
  - Optimizations. Now at about parity with the reference implementation on 32-bit builds.
  - Lots of clean up.

v0.2b - 2016-05-10
  - Bug fixes.

v0.2a - 2016-05-10
  - Made drflac_open_and_decode() more robust.
  - Removed an unused debugging variable.

v0.2 - 2016-05-09
  - Added support for Ogg encapsulation.
  - API CHANGE. Have the onSeek callback take a third argument which specifies whether or not the seek
    should be relative to the start or the current position. Also changes the seeking rules such that
    seeking offsets will never be negative.
  - Have drflac_open_and_decode() fail gracefully if the stream has an unknown total sample count.

v0.1b - 2016-05-07
  - Properly close the file handle in drflac_open_file() and family when the decoder fails to initialize.
  - Removed a stale comment.

v0.1a - 2016-05-05
  - Minor formatting changes.
  - Fixed a warning on the GCC build.

v0.1 - 2016-05-03
  - Initial versioned release.
*/

/*
This software is available as a choice of the following licenses. Choose
whichever you prefer.

===============================================================================
ALTERNATIVE 1 - Public Domain (www.unlicense.org)
===============================================================================
This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
software, either in source code form or as a compiled binary, for any purpose,
commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this
software dedicate any and all copyright interest in the software to the public
domain. We make this dedication for the benefit of the public at large and to
the detriment of our heirs and successors. We intend this dedication to be an
overt act of relinquishment in perpetuity of all present and future rights to
this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to <http://unlicense.org/>

===============================================================================
ALTERNATIVE 2 - MIT No Attribution
===============================================================================
Copyright 2023 David Reid

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/