=========================
ALSA Compress-Offload API
=========================

Pierre-Louis.Bossart <pierre-louis.bossart@linux.intel.com>

Vinod Koul <vinod.koul@linux.intel.com>


Overview
========
Since its early days, the ALSA API was defined with PCM support or
constant-bitrate payloads such as IEC61937 in mind. Arguments and
returned values in frames are the norm, making it a challenge to
extend the existing API to compressed data streams.

In recent years, audio digital signal processors (DSPs) have been
integrated in system-on-chip designs as well as in audio codecs.
Processing compressed data on such DSPs results in a dramatic
reduction of power consumption compared to host-based
processing. Support for such hardware has not been very good in Linux,
mostly because of a lack of a generic API available in the mainline
kernel.

Rather than requiring a compatibility break with an API change of the
ALSA PCM interface, a new 'Compressed Data' API is introduced to
provide a control and data-streaming interface for audio DSPs.

The design of this API was inspired by the 2-year experience with the
Intel Moorestown SOC, with many corrections required to upstream the
API in the mainline kernel instead of the staging tree and make it
usable by others.


Requirements
============
The main requirements are:

- Separation between byte counts and time. Compressed formats may have
  a header per file, per frame, or no header at all. The payload size
  may vary from frame to frame. As a result, it is not possible to
  reliably estimate the duration of audio buffers when handling
  compressed data. Dedicated mechanisms are required to allow for
  reliable audio-video synchronization, which requires precise
  reporting of the number of samples rendered at any given time.

- Handling of multiple formats. PCM data only requires a specification
  of the sampling rate, number of channels and bits per sample. In
  contrast, compressed data comes in a variety of formats. Audio DSPs
  may also provide support for a limited number of audio encoders and
  decoders embedded in firmware, or may support more choices through
  dynamic download of libraries.

- Focus on main formats. This API provides support for the most
  popular formats used for audio and video capture and playback.
  It is likely that as audio compression technology advances, new
  formats will be added.

- Handling of multiple configurations. Even for a given format like
  AAC, some implementations may support AAC multichannel but HE-AAC
  stereo. Likewise, WMA10 level M3 may require too much memory and CPU
  cycles. The new API needs to provide a generic way of listing these
  formats.

- Rendering/Grabbing only. This API does not provide any means of
  hardware acceleration, where PCM samples are provided back to
  user-space for additional processing. This API focuses instead on
  streaming compressed data to a DSP, with the assumption that the
  decoded samples are routed to a physical output or logical back-end.

- Complexity hiding. Existing user-space multimedia frameworks all
  have existing enums/structures for each compressed format. This new
  API assumes the existence of a platform-specific compatibility layer
  to expose, translate and make use of the capabilities of the audio
  DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular
  applications are not supposed to make use of this API.


Design
======
The new API shares a number of concepts with the PCM API for flow
control. Start, pause, resume, drain and stop commands have the same
semantics no matter what the content is.
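
As a rough illustration of these shared flow-control semantics, the
sketch below models the stream states and the commands that move between
them as a C transition function. The enum names and the
``compr_next_state()`` helper are hypothetical, chosen only for this
example; the transitions are read off the state machine diagram later in
this document, and the code is not the kernel implementation.

```c
/* Hypothetical names for illustration only; not the kernel's definitions. */
enum compr_state {
    ST_OPEN, ST_SETUP, ST_PREPARE, ST_RUNNING, ST_PAUSE, ST_DRAIN, ST_FREE,
    ST_INVALID              /* no such arrow in the diagram */
};

enum compr_cmd {
    CMD_SET_PARAMS, CMD_WRITE, CMD_START, CMD_PAUSE, CMD_RESUME,
    CMD_DRAIN, CMD_DRAIN_NOTIFY, CMD_STOP, CMD_FREE
};

/* Return the state reached from `s` on command `c`, or ST_INVALID. */
static enum compr_state compr_next_state(enum compr_state s, enum compr_cmd c)
{
    switch (s) {
    case ST_OPEN:
        return c == CMD_SET_PARAMS ? ST_SETUP : ST_INVALID;
    case ST_SETUP:
        if (c == CMD_WRITE) return ST_PREPARE;
        if (c == CMD_FREE)  return ST_FREE;
        return ST_INVALID;
    case ST_PREPARE:
        if (c == CMD_START) return ST_RUNNING;
        if (c == CMD_FREE)  return ST_FREE;
        return ST_INVALID;
    case ST_RUNNING:
        if (c == CMD_PAUSE) return ST_PAUSE;
        if (c == CMD_DRAIN) return ST_DRAIN;
        if (c == CMD_STOP)  return ST_SETUP;
        return ST_INVALID;
    case ST_PAUSE:
        if (c == CMD_RESUME) return ST_RUNNING;
        if (c == CMD_STOP)   return ST_SETUP;
        return ST_INVALID;
    case ST_DRAIN:
        /* Drain completion and an explicit stop both return to SETUP. */
        if (c == CMD_DRAIN_NOTIFY || c == CMD_STOP) return ST_SETUP;
        return ST_INVALID;
    default:
        return ST_INVALID;  /* FREE is terminal in this sketch */
    }
}
```

For instance, ``compr_next_state(ST_RUNNING, CMD_STOP)`` yields
``ST_SETUP``, while ``compr_next_state(ST_OPEN, CMD_START)`` yields
``ST_INVALID`` because the diagram requires parameters to be set and data
to be written before a stream can start.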

The concept of memory ring buffer divided in a set of fragments is
borrowed from the ALSA PCM API. However, only sizes in bytes can be
specified.

Seeks/trick modes are assumed to be handled by the host.

The notion of rewinds/forwards is not supported. Data committed to the
ring buffer cannot be invalidated, except when dropping all buffers.

The Compressed Data API does not make any assumptions on how the data
is transmitted to the audio DSP. DMA transfers from main memory to an
embedded audio cluster or to a SPI interface for external DSPs are
possible. As in the ALSA PCM case, a core set of routines is exposed;
each driver implementer will have to write support for a set of
mandatory routines and possibly make use of optional ones.

The main additions are:

get_caps
  This routine returns the list of audio formats supported. Querying the
  codecs on a capture stream will return encoders; decoders will be
  listed for playback streams.

get_codec_caps
  For each codec, this routine returns a list of
  capabilities. The intent is to make sure all the capabilities
  correspond to valid settings, and to minimize the risks of
  configuration failures.
  For example, for a complex codec such as AAC,
  the number of channels supported may depend on a specific profile. If
  the capabilities were exposed with a single descriptor, it may happen
  that a specific combination of profiles/channels/formats may not be
  supported. Likewise, because embedded DSPs have limited memory and CPU
  cycles, it is likely that some implementations make the list of
  capabilities dynamic and dependent on existing workloads. In addition
  to codec settings, this routine returns the minimum buffer size handled
  by the implementation. This information can be a function of the DMA
  buffer sizes, the number of bytes required to synchronize, etc., and
  can be used by userspace to define how much needs to be written in the
  ring buffer before playback can start.

set_params
  This routine sets the configuration chosen for a specific codec. The
  most important field in the parameters is the codec type; in most
  cases decoders will ignore other fields, while encoders will strictly
  comply with the settings.

get_params
  This routine returns the actual settings used by the DSP. Changes to
  the settings should remain the exception.

get_timestamp
  The timestamp becomes a multi-field structure.
  It lists the number
  of bytes transferred, the number of samples processed and the number
  of samples rendered/grabbed. All these values can be used to determine
  the average bitrate, to figure out whether the ring buffer needs to be
  refilled, or to estimate the delay due to decoding/encoding/IO on the
  DSP.

Note that the list of codecs/profiles/modes was derived from the
OpenMAX AL specification instead of reinventing the wheel.
Modifications include:

- Addition of FLAC and IEC formats
- Merge of encoder/decoder capabilities
- Profiles/modes listed as bitmasks to make descriptors more compact
- Addition of set_params for decoders (missing in OpenMAX AL)
- Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
- Addition of format information for WMA
- Addition of encoding options when required (derived from OpenMAX IL)
- Addition of rateControlSupported (missing in OpenMAX AL)

State Machine
=============

The compressed audio stream state machine is described below ::

                                       +----------+
                                       |          |
                                       |   OPEN   |
                                       |          |
                                       +----------+
                                            |
                                            |
                                            | compr_set_params()
                                            |
                                            v
         compr_free()                  +----------+
  +------------------------------------|          |
  |                                    |   SETUP  |
  |          +-------------------------|          |<-------------------------+
  |          |       compr_write()     +----------+                          |
  |          |                              ^                                |
  |          |                              | compr_drain_notify()           |
  |          |                              |        or                      |
  |          |                              |     compr_stop()               |
  |          |                              |                                |
  |          |                         +----------+                          |
  |          |                         |          |                          |
  |          |                         |   DRAIN  |                          |
  |          |                         |          |                          |
  |          |                         +----------+                          |
  |          |                              ^                                |
  |          |                              |                                |
  |          |                              | compr_drain()                  |
  |          |                              |                                |
  |          v                              |                                |
  |    +----------+                    +----------+                          |
  |    |          |    compr_start()   |          |        compr_stop()      |
  |    | PREPARE  |------------------->|  RUNNING |--------------------------+
  |    |          |                    |          |                          |
  |    +----------+                    +----------+                          |
  |          |                            |    ^                             |
  |          |compr_free()                |    |                             |
  |          |              compr_pause() |    | compr_resume()              |
  |          |                            |    |                             |
  |          v                            v    |                             |
  |    +----------+                    +----------+                          |
  |    |          |                    |          |        compr_stop()      |
  +--->|   FREE   |                    |  PAUSE   |--------------------------+
       |          |                    |          |
       +----------+                    +----------+


Gapless Playback
================
When playing through an album, the decoders have the ability to skip the
encoder delay and padding and directly move from one track content to
another.
The end user can perceive this as gapless playback, as there is no
silence while switching from one track to another.

Also, there might be low-intensity noises due to encoding. Perfect gapless is
difficult to reach with all types of compressed data, but works fine with most
music content. The decoder needs to know the encoder delay and encoder padding,
so these need to be passed to the DSP. This metadata is extracted from ID3/MP4
headers and is not present by default in the bitstream, hence the need for a
new interface to pass this information to the DSP. Also, the DSP and userspace
need to switch from one track to another and start using data for the second
track.

The main additions are:

set_metadata
  This routine sets the encoder delay and encoder padding. This can be used by
  the decoder to strip the silence. This needs to be set before the data in the
  track is written.

set_next_track
  This routine tells the DSP that metadata and write operations sent after this
  would correspond to the subsequent track.

partial drain
  This is called when end of file is reached. The userspace can inform the DSP
  that EOF is reached and now the DSP can start skipping padding delay.
  Also, the next write data would belong to the next track.

Sequence flow for gapless would be:

- Open
- Get caps / codec caps
- Set params
- Set metadata of the first track
- Fill data of the first track
- Trigger start
- User-space finishes sending all data of the first track
- Indicate next track data by sending set_next_track
- Set metadata of the next track
- Then call partial_drain to flush most of the buffer in the DSP
- Fill data of the next track
- DSP switches to second track

(note: order for partial_drain and write for next track can be reversed as well)

Gapless Playback SM
===================

For Gapless, we move from running state to partial drain and back, along
with setting of meta_data and signalling for next track ::


                                        +----------+
                compr_drain_notify()    |          |
              +------------------------>|  RUNNING |
              |                         |          |
              |                         +----------+
              |                              |
              |                              |
              |                              | compr_next_track()
              |                              |
              |                              V
              |                         +----------+
              |                         |          |
              |                         |NEXT_TRACK|
              |                         |          |
              |                         +----------+
              |                              |
              |                              |
              |                              | compr_partial_drain()
              |                              |
              |                              V
              |                         +----------+
              |                         |          |
              +------------------------ | PARTIAL_ |
                                        |  DRAIN   |
                                        +----------+

Not supported
=============
- Support for VoIP/circuit-switched calls is not the target of this
  API. Support for dynamic bit-rate changes would require a tight
  coupling between the DSP and the host stack, limiting power savings.

- Packet-loss concealment is not supported. This would require an
  additional interface to let the decoder synthesize data when frames
  are lost during transmission. This may be added in the future.

- Volume control/routing is not handled by this API. Devices exposing a
  compressed data interface will be considered as regular ALSA devices;
  volume changes and routing information will be provided with regular
  ALSA kcontrols.

- Embedded audio effects. Such effects should be enabled in the same
  manner, no matter if the input was PCM or compressed.

- Multichannel IEC encoding. Unclear if this is required.

- Encoding/decoding acceleration is not supported as mentioned
  above.
  It is possible to route the output of a decoder to a capture
  stream, or even implement transcoding capabilities. This routing
  would be enabled with ALSA kcontrols.

- Audio policy/resource management. This API does not provide any
  hooks to query the utilization of the audio DSP, nor any preemption
  mechanisms.

- No notion of underrun/overrun. Since the bytes written are compressed
  in nature and data written/read doesn't translate directly to
  rendered output in time, this API does not deal with underrun/overrun;
  these may be dealt with in a user-space library.


Credits
=======
- Mark Brown and Liam Girdwood for discussions on the need for this API
- Harsha Priya for her work on intel_sst compressed API
- Rakesh Ughreja for valuable feedback
- Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
  demonstrating and quantifying the benefits of audio offload on a
  real platform.
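

Timestamp arithmetic sketch
===========================
The get_timestamp description in the Design section says the three
counters can be used to derive the average bitrate and the delay inside
the DSP. A minimal sketch of that arithmetic follows. The structure and
function names are hypothetical and are not the kernel's definitions; the
sampling rate is assumed to be known from the stream parameters, since it
is not listed among the timestamp fields, and the bitrate figure assumes
that all transferred bytes have already been processed.

```c
#include <stdint.h>

/* Hypothetical snapshot of the three counters described for get_timestamp. */
struct ts_snapshot {
    uint64_t bytes_transferred;  /* compressed bytes moved to/from the DSP */
    uint64_t samples_processed;  /* samples decoded or encoded so far */
    uint64_t samples_rendered;   /* samples actually rendered or grabbed */
};

/*
 * Average bitrate in bits per second:
 *   bits / seconds = (bytes * 8) / (samples_processed / rate_hz)
 * Returns 0 while nothing has been processed yet.
 */
static uint64_t avg_bitrate_bps(const struct ts_snapshot *ts, uint32_t rate_hz)
{
    if (ts->samples_processed == 0)
        return 0;
    return ts->bytes_transferred * 8 * rate_hz / ts->samples_processed;
}

/* Samples processed but not yet rendered, expressed in milliseconds. */
static uint64_t dsp_delay_ms(const struct ts_snapshot *ts, uint32_t rate_hz)
{
    if (rate_hz == 0 || ts->samples_processed < ts->samples_rendered)
        return 0;
    return (ts->samples_processed - ts->samples_rendered) * 1000 / rate_hz;
}
```

With 160000 bytes transferred, 441000 samples processed and 436590
samples rendered at 44100 Hz, these helpers give an average bitrate of
128000 bit/s and a DSP delay of 100 ms. The integer arithmetic is kept
deliberately simple; a real implementation would also need to consider
counter wrap-around and overflow of the intermediate product.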