Audio

Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Post

Replies

Boosts

Views

Activity

ShazamKit supported for iOS apps that can run on Mac silicon?

I am having issues deploying my iOS app, that uses ShazamKit, to get working on a Mac with Apple silicon. When uploading the archive to App Store Connect I do get ITMS-90863: Macs with Apple silicon support issue - The app links with libraries that aren’t present in macOS: /usr/lib/swift/libswiftShazamKit.dylib Is ShazamKit not supported for iOS apps that can run on Macs with Apple silicon? Or is there something I should fix in my setup / deployment?

Media Technologies Audio Apple Silicon ShazamKit

1.2k

Jun ’25

Apple Music iOS 26 features in Android

Since many users like me use Apple Music on Android, the app is almost as feature-rich as iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI change. I know it’s challenging to implement liquid glass on Android hardware or design, but features like auto-mix, pronunciation, and translation could be added. kindly consider this request !!!!

Media Technologies Audio Design Developer Tools Interface Builder App Review

231

Jul ’25

AVB Support for the AVnu MILAN Conventions

The AVB AVnu MILAN Convention has a groweing Population. Many big companies (Cisco, Meyer Sound, d&b Audio, l‘acoustics, Presonus, digico etc.) implements the AVB AVnu Milan Standards. Is there a plan on the Apple side to also implement AVnu Milan on top of the AVB Protocol? The advantage for Apple Sound would be a great Integration in the professionell Audio market and a more stable intergration on top of the AVB protocol. The atdecc work, but Not that stable.

Media Technologies Audio

179

Oct ’25

SpeechAnalyzer > AnalysisContext lack of documentation

I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. Seems like AnalysisContext is the solution for this, but couldn't find any usage example. So I want to make sure that I'm doing it right or not. let context = AnalysisContext() context.contextualStrings = [ AnalysisContext.ContextualStringsTag("commands"): [ "set speed level", "set jump level", "increase speed", "decrease speed", ... ], AnalysisContext.ContextualStringsTag("vocabulary"): [ "speed", "jump", ... ] ] try await analyzer.setContext(context) With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect number after those commands, in order to eliminate results like "set some speed level to" (instead of two).

Media Technologies Audio Speech

573

Is AVAudioPCMFormatFloat32 required for playing a buffer with AVAudioEngine / AVAudioPlayerNode

I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVPlayerNode / AVAudioEngine an exception is thrown: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868 (related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022) If I convert the buffer to AVAudioPCMFormatFloat32 playback works. My questions are: Does AVAudioEngine / AVPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application? If 1 is YES is this documented anywhere? If 1 is YES is this required format subject to change at any point? Thanks! I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).

Media Technologies Audio AVAudioNode AVAudioEngine AVFoundation

1.1k

Oct ’25

Airplay selection not working

I'm trying to implement airplay into my app. I can successfully playback sound and trigger the airplay selector sheet. If the target device is a Bluetooth only device I can connect with no problem and stream the audio to the Bluetooth device, but if the audio device is a airplay specific device like a HomePod or an Apple TV when I select it, I get a spinning icon, indicating that it is trying to connect, and eventually it times out and stops without connecting. I don't believe it is an AirPlay audio issue because if I go to a different app, for example a podcast app and select my HomePods for output, and then switch back to my app. My audio will correctly stream to the HomePod. Not only that, I have it so that my icon will change color to indicate that it is connected via airplay and it is correctly indicating that it is connected via AirPlay. But I cannot then disconnect it using the Airplay selector. The issue appears to be in the AirPlay selection side, which I have spent several days attempting to troubleshoot mostly using ChatGPT to suggest code different than what I have to maybe work around the issue. Mostly it is focused on the audio player section, but it doesn't seem like that is really the route that is the problem.

Media Technologies Audio Audio AVAudioSession AirPlay AirPlay 2

263

Jun ’25

iOS AUv3 extension: no Icon shown in host

Hi, I'm working on an AUv3 project. The app itself displays my icon. However the Auv3 extension does not display any icon in any host app (AUM, Drambo, etc.0). I thought that the extension would inherit the host app icon but that it does not appear to be the case. I tried to add the icon as a 1024x1024 file to the extension target and the update my extension plist file withe a CFBundleIconFile key but no luck either. It must surely be really easy. What am I missing? Thanks in advance for your help!

Media Technologies Audio

155

May ’25

[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber

So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.

Media Technologies Audio Speech

768

Nov ’25

iOS 26 HLS Audio Track Display Behavior: EXT-X-MEDIA NAME vs LANGUAGE Attributes

Hello Apple Developer Community, I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag. In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so: #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8" #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8" Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English"). We would like to understand the official or intended behavior regarding this. Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label? If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI? Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute? Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated. Thank you for your time and assistance.

Media Technologies Audio iOS Beta

433

Aug ’25

SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS

So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)

Media Technologies Audio Speech

749

ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS

Bug Report: ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS Summary When using ScreenCaptureKit to capture system audio for extended periods, the application crashes with EXC_BAD_ACCESS in Swift's error handling runtime. The crash occurs in swift_getErrorValue when trying to process an error from the SCStream delegate method didStopWithError. This appears to be a framework-level issue in ScreenCaptureKit or its underlying ReplayKit implementation. Environment macOS Sonoma 14.6.1 Swift 5.8 ScreenCaptureKit framework Detailed Description Our application captures system audio using ScreenCaptureKit's audio capture capabilities. After successfully capturing for several minutes (typically after 3-4 segments of 60-second recordings), the application crashes with an EXC_BAD_ACCESS error. The crash happens when the Swift runtime attempts to process an error in the SCStreamDelegate.stream(_:didStopWithError:) method. The crash consistently occurs in swift_getErrorValue when attempting to access the class of what appears to be a null object. This suggests that the error being passed from the system framework to our delegate method is malformed or contains invalid memory. Steps to Reproduce Create an SCStream with audio capture enabled Add audio output to the stream Start capture and write audio data to disk Allow the capture to run for several minutes (3-5 minutes typically triggers the issue) The app will crash with EXC_BAD_ACCESS in swift_getErrorValue Code Sample func stream(_ stream: SCStream, didStopWithError error: Error) { print("Stream stopped with error: \(error)") // Crash occurs before this line executes } func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) { guard type == .audio, sampleBuffer.isValid else { return } // Process audio data... } Expected Behavior The error should be properly propagated to the delegate method, allowing for graceful error handling and recovery. Actual Behavior The application crashes with EXC_BAD_ACCESS when the Swift runtime attempts to process the error in swift_getErrorValue. Crash Log Details Thread #35, queue = 'com.apple.NSXPCConnection.m-user.com.apple.replayd', stop reason = EXC_BAD_ACCESS (code=1, address=0x0) frame #0: 0x0000000194c3088c libswiftCore.dylib`swift::_swift_getClass(void const*) + 8 frame #1: 0x0000000194c30104 libswiftCore.dylib`swift_getErrorValue + 40 frame #2: 0x00000001057fba30 shadow`NewScreenCaptureService.stream(stream=0x0000600002de6700, error=Swift.Error @ 0x000000016b7b5e30) at NEW+ScreenCaptureService.swift:365:15 frame #3: 0x00000001057fc050 shadow`@objc NewScreenCaptureService.stream(_:didStopWithError:) at <compiler-generated>:0 frame #4: 0x0000000219ec5ca0 ScreenCaptureKit`-[SCStreamManager stream:didStopWithError:] + 456 frame #5: 0x00000001ca68a5cc ReplayKit`-[RPScreenRecorder stream:didStopWithError:] + 84 frame #6: 0x00000001ca696ff8 ReplayKit`-[RPDaemonProxy stream:didStopWithError:] + 224 Printing description of stream._streamQueue: error: ObjectiveC.id:4294967281:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ error: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:1:65: 'id' is unavailable in Swift: 'id' is not available in Swift; use 'Any' Swift._DebuggerSupport.stringForPrintObject(Swift.UnsafePointer<id>(bitPattern: 0x104ae08c0)!.pointee) ^~ ObjectiveC.id:2:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ warning: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:5:7: initialization of variable '$__lldb_error_result' was never used; consider replacing with assignment to '_' or removing it var $__lldb_error_result = __lldb_tmp_error ~~~~^~~~~~~~~~~~~~~~~~~~ _ Before the crash, we observed this error message in the console: [ERROR] *****SCStream*****RemoteAudioQueueOperationHandlerWithError:1015 Error received from the remote queue -16665 Additional Context The issue occurs consistently after approximately 3-4 successful audio segment recordings of 60 seconds each Commenting out custom segment rotation logic does not prevent the crash The crash involves XPC communication with Apple's ReplayKit daemon The error appears to be corrupted or malformed when crossing the XPC boundary Workarounds Attempted Added proper thread safety for all published properties using DispatchQueue.main.async Implemented more robust error handling in the delegate methods None of these approaches prevented the crash since it occurs at the Swift runtime level before our code executes. Impact This issue prevents reliable long-duration audio capture using ScreenCaptureKit. This bug significantly limits the usefulness of ScreenCaptureKit for any application requiring continuous system audio capture for more than a few minutes. Perhaps this issue might be related to a macOS bug where the system dialog indicates that the screen is being shared, even though nothing is actually being shared. Moreover, when attempting to stop sharing, nothing happens.

Media Technologies Audio ScreenCaptureKit

855

Logic Pro cannot load v3 audio unit with framework compiled with Swift 6

Sequoia 15.4.1 (24E263) XCode: 16.3 (16E140) Logic Pro: 11.2.1 I’ve been developing a complex audio unit for Mac OS that works perfectly well in its own bespoke host app and is now well into its beta testing stage. It did take some effort to get it to work well in Logic Pro however and all was fine and working well until: The AU part is an empty app extension with a framework containing its code. The framework contains Swift code for the UI and C code for the DSP parts. When the framework is compiled using the Swift 5 compiler the AU will run in Logic with no problems. (I should also mention that AU passes the most strict auval tests). But… when the framework is compiled with Swift 6 Logic Pro cannot load it. Logic displays a message saying the audio unit could not be loaded and to contact the developer. My own host app loads the AU perfectly well with the Swift 6 version, so I know there’s nothing wrong with the audio unit. I cannot find any differences in any of the built output files except, of course, the actual binary code in the framework. I’ve worked for hours on this and cannot find a solution other than to build the framework in Swift 5. (I worked hard to get all the async code updated and working with Swift 6! so I feel a little cheated!) What is happening? Is this a bug in Logic? Is this a bug in Swift 6 compiler/linker? I’m at the Duh! hands in the air, tearing out hair stage! ( once again!)

Media Technologies Audio Swift AudioUnit Core Audio

569

Jul ’25

SIGABORT with ExtAudioFileWrite and .m4a file

Hi, I am getting into a trap. Please check stack-trace, howto fix this? regards, Joël stack-trace with ExtAudioFileWrite

Media Technologies Audio AudioToolbox macOS

930

Jun ’25

Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)

Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }

Media Technologies Audio iOS AVAudioSession AVAudioEngine Push To Talk

511

Aug ’25

SpeechTranscriber not providing audioTimeRange for most results

I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.

Media Technologies Audio Speech

209

Aug ’25

WatchOS: Can a background metronome app coexist with both Runna workout and Spotify playback?

I’m building a standalone Apple Watch metronome app for running. My goal is for these 3 apps to work at the same time: Runna owns the workout session Spotify plays music my app plays a metronome click in the background So far this is what I've found: Using HKWorkoutSession in my metronome app works well with Spotify, but conflicts with Runna and other workout apps, so I removed that. Using watchOS background audio with longFormAudio allows my app run in the background, and it can coexist with Runna. However, it seems to conflict with Spotify playback, and one app tends to stop the other. Is there any supported watchOS audio/background configuration that allows all 3 at once? More specifically this is what I need: another app owns HKWorkoutSession Spotify keeps playing my app keeps generating metronome clicks in the background Or is this simply not supported by current watchOS session/background rules? My metronome uses AVAudioEngine / AVAudioPlayerNode with generated click audio. Thank you!

Media Technologies Audio watchOS HealthKit AVAudioSession AVAudioEngine

418

Application tones start when I get incoming call or message

I've got a problem with my app where I'm testing it on my own phone. I'm using audio kit to generate tones as part of the app. Everything seems to work fine. Sounds start, Stop, etc. They play when the app is closed and when the phone is locked, so background is working. However, I'm seeing an issue where, even when STOP is pressed and the application exited, if I get a notification such as a text message, the base tone for the app starts to play. If I then open the app, check the Start/Stop button - it says start so that. hasnt' been activated. If I click Start, then a 2nd tone starts. This one stops with the Stop button. However the original tone that was set off by an incoming message carries on playing. Until I go to the Open Apps View on the phone and slide the application upwards. For the life of me, I can't figure out whats happening here.

Media Technologies Audio AudioToolbox Debugging

118

May ’25

AudioOutputUnitStart takes ~500 ms when using Push-to-Talk framework after beginTransmission

I’m working with the Push-to-Talk (PTT) framework and observing a consistent delay when starting audio capture. Scenario: A PTT call is already active The AVAudioSession is fully configured I request beginTransmission on the PTT channel I start my Audio Unit for recording (AudioOutputUnitStart) Observed behavior: AudioOutputUnitStart takes ~500 ms This happens whether I start the Audio Unit: after didBeginTransmission, or after AVAudioSession didActivate Comparison: Using the same Audio Unit, same format, and same configuration Without the PTT framework, AudioOutputUnitStart takes ~200 ms Additional notes: I am not modifying or reconfiguring AVAudioSession when requesting beginTransmission The audio session is already set up when the PTT call starts There are no interruptions or route changes at the time of starting the Audio Unit Impact: This extra latency is significant for Push-to-Talk use cases where fast transmit start is critical.

Media Technologies Audio AudioToolbox Push To Talk

363

Feb ’26

Using StoreKit from an AUv3 plugin that can be loaded in-process

I have a bunch of Audio Unit v3 plugins that are approaching release, and I was considering using subscription-model pricing, as I have done in a soon to be released iOS app. However, whether this is possible or not is not at all obvious. Specifically: The plugin can, depending on the host app, be loaded in-process or out-of-process - yes, I know, Logic Pro and Garage Band will not load a plug-in in-process anymore, but I am not going to rule that out for other audio apps and force on them the overhead of IPC (I spent two solid weeks deciphering the process to actually make it possible for an AUv3 to run in-process - see this - https://github.com/timboudreau/audio_unit_rust_demo - example with notes) Depending on how it is loaded, the value of Bundle.main.bundleIdentifier will vary. If I use the StoreKit API, will that return product results for my bundle identifier when being called as a library from a foreign application? I would expect it would be a major security hole if random apps could query about purchases of other random apps, so I assume not. Even if I restricted the plugins to running out-of-process, I have to set up the in-app purchases on the app store for the App container's ID, not the extension's ID, and the extension is what run - the outer app that is what you purchase is just a toy demo that exists solely to register the audio unit. I have similar questions with regard to MetricKit, which I would similarly like to use, but which may be running inside some random app. If there were some sort of signed token, or similar mechanism, that could be bundled or acquired by the running plugin extension that could be used to ensure both StoreKit and MetricKit operate under the assumption that purchases and metrics should be accessed as if called from the container app, that would be very helpful. This is the difference between having a one-and-done sales model and something that provides ongoing revenue to maintain these products - I am a one-person shop - if I price these products where they would need to be to pay the bills assuming a single sale per customer ever, the price will be too high for anyone to want to try products from a small vendor they've never heard of. So, being able to do a free trial period and then subscription is the difference between this being a viable business or not.

Media Technologies Audio MetricKit StoreKit AudioUnit

781

Crackling/Popping sound when using AVAudioUnitTimePitch

I have a simple AVAudioEngine graph as follows: AVAudioPlayerNode -> AVAudioUnitEQ -> AVAudioUnitTimePitch -> AVAudioUnitReverb -> Main mixer node of AVAudioEngine. I noticed that whenever I have AVAudioUnitTimePitch or AVAudioUnitVarispeed in the graph, I noticed a very distinct crackling/popping sound in my Airpods Pro 2 when starting up the engine and playing the AVAudioPlayerNode and unable to find the reason why this is happening. When I remove the node, the crackling completely goes away. How do I fix this problem since i need the user to be able to control the pitch and rate of the audio during playback. import AVKit @Observable @MainActor class AudioEngineManager { nonisolated private let engine = AVAudioEngine() private let playerNode = AVAudioPlayerNode() private let reverb = AVAudioUnitReverb() private let pitch = AVAudioUnitTimePitch() private let eq = AVAudioUnitEQ(numberOfBands: 10) private var audioFile: AVAudioFile? private var fadePlayPauseTask: Task<Void, Error>? private var playPauseCurrentFadeTime: Double = 0 init() { setupAudioEngine() } private func setupAudioEngine() { guard let url = Bundle.main.url(forResource: "Song name goes here", withExtension: "mp3") else { print("Audio file not found") return } do { audioFile = try AVAudioFile(forReading: url) } catch { print("Failed to load audio file: \(error)") return } reverb.loadFactoryPreset(.mediumHall) reverb.wetDryMix = 50 pitch.pitch = 0 // Increase pitch by 500 cents (5 semitones) engine.attach(playerNode) engine.attach(pitch) engine.attach(reverb) engine.attach(eq) // Connect: player -> pitch -> reverb -> output engine.connect(playerNode, to: eq, format: audioFile?.processingFormat) engine.connect(eq, to: pitch, format: audioFile?.processingFormat) engine.connect(pitch, to: reverb, format: audioFile?.processingFormat) engine.connect(reverb, to: engine.mainMixerNode, format: audioFile?.processingFormat) } func prepare() { guard let audioFile else { return } playerNode.scheduleFile(audioFile, at: nil) } func play() { DispatchQueue.global().async { [weak self] in guard let self else { return } engine.prepare() try? engine.start() DispatchQueue.main.async { [weak self] in guard let self else { return } playerNode.play() fadePlayPauseTask?.cancel() playPauseCurrentFadeTime = 0 fadePlayPauseTask = Task { [weak self] in guard let self else { return } while true { let volume = updateVolume(for: playPauseCurrentFadeTime / 0.1, rising: true) // Ramp up volume until 1 is reached if volume >= 1 { break } engine.mainMixerNode.outputVolume = volume try await Task.sleep(for: .milliseconds(10)) playPauseCurrentFadeTime += 0.01 } engine.mainMixerNode.outputVolume = 1 } } } } func pause() { fadePlayPauseTask?.cancel() playPauseCurrentFadeTime = 0 fadePlayPauseTask = Task { [weak self] in guard let self else { return } while true { let volume = updateVolume(for: playPauseCurrentFadeTime / 0.1, rising: false) // Ramp down volume until 0 is reached if volume <= 0 { break } engine.mainMixerNode.outputVolume = volume try await Task.sleep(for: .milliseconds(10)) playPauseCurrentFadeTime += 0.01 } engine.mainMixerNode.outputVolume = 0 playerNode.pause() // Shut down engine once ramp down completes DispatchQueue.global().async { [weak self] in guard let self else { return } engine.pause() } } } private func updateVolume(for x: Double, rising: Bool) -> Float { if rising { // Fade in return Float(pow(x, 2) * (3.0 - 2.0 * (x))) } else { // Fade out return Float(1 - (pow(x, 2) * (3.0 - 2.0 * (x)))) } } func setPitch(_ value: Float) { pitch.pitch = value } func setReverbMix(_ value: Float) { reverb.wetDryMix = value } } struct ContentView: View { @State private var audioManager = AudioEngineManager() @State private var pitch: Float = 0 @State private var reverb: Float = 0 var body: some View { VStack(spacing: 20) { Text("🎵 Audio Player with Reverb & Pitch") .font(.title2) HStack { Button("Prepare") { audioManager.prepare() } Button("Play") { audioManager.play() } .padding() .background(Color.green) .foregroundColor(.white) .cornerRadius(10) Button("Pause") { audioManager.pause() } .padding() .background(Color.red) .foregroundColor(.white) .cornerRadius(10) } VStack { Text("Pitch: \(Int(pitch)) cents") Slider(value: $pitch, in: -2400...2400, step: 100) { _ in audioManager.setPitch(pitch) } } VStack { Text("Reverb Mix: \(Int(reverb))%") Slider(value: $reverb, in: 0...100, step: 1) { _ in audioManager.setReverbMix(reverb) } } } .padding() } }