Quick Guide to iOS Audio Recording and Playback

With the fast-paced evolution of the modern user experience, the focus of mobile development is shifting toward more creative and innovative ways of interacting with users. One of the more prominent examples is audio: from simple voice recording to AI assistants like Siri and Alexa, capturing and manipulating audio is steadily becoming a standard part of mobile apps.

As with any trend, it is best to begin with the basics. Today, we will learn how to record sound using AVAudioRecorder, play back the recording, and add sound effects using AVAudioEngine. Along the way, we will build a running app that can record audio, play it back, and apply effects to it.

Getting Started

For this app we will be using Apple’s AVFoundation framework. I built this with Swift 4 and Xcode 9, so make sure your tools are up to date. You can use this repository as a guide if you ever get lost.

Without further ado, here's a quick and "swift" (got it?) guide to audio manipulation in iOS.

Main View Controller

First, let’s create a view controller that contains the following elements: a record button, a stop button, and a label. Make sure the stop button is disabled when the view loads.
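A minimal sketch of that controller, assuming placeholder names of my own choosing (RecordAudioViewController, recordButton, stopButton, and recordingLabel are not prescribed by the storyboard, so adapt them to your project):

import UIKit
import AVFoundation

class RecordAudioViewController: UIViewController, AVAudioRecorderDelegate {

    @IBOutlet weak var recordButton: UIButton!
    @IBOutlet weak var stopButton: UIButton!
    @IBOutlet weak var recordingLabel: UILabel!

    var audioRecorder: AVAudioRecorder!

    override func viewDidLoad() {
        super.viewDidLoad()
        // the stop button is only useful while a recording is in progress
        stopButton.isEnabled = false
    }
}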

Record Button

As its name suggests, the record button, once tapped, should start the recording. Aside from that, it should also perform the following UI updates (a sketch of the action follows the list):

  • Disable the record button and enable the stop button
  • Set the label to "Recording in Progress"
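Putting those updates together, the record button’s action might look like this sketch (recordButtonPressed is a placeholder name; the actual recording code comes next):

@IBAction func recordButtonPressed(_ sender: UIButton) {
    recordButton.isEnabled = false
    stopButton.isEnabled = true
    recordingLabel.text = "Recording in Progress"
    // the recording code shown below goes here, or in a helper it calls
}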

The code below implements the actual audio recording. Note that since iOS 10 you must also add an NSMicrophoneUsageDescription entry to your Info.plist, or the app will crash the first time it accesses the microphone.

// build a unique, timestamped file name for the recording
let format = DateFormatter()
format.dateFormat = "yyyyMMddHHmmss"
let audioFileName = "recording-\(format.string(from: Date())).wav"

// save the recording in the app's Documents directory
let dirPath = NSSearchPathForDirectoriesInDomains(.documentDirectory, .userDomainMask, true)[0]
let filePath = URL(fileURLWithPath: dirPath).appendingPathComponent(audioFileName)

// configure the shared audio session for recording with speaker playback
let session = AVAudioSession.sharedInstance()
try! session.setCategory(AVAudioSessionCategoryPlayAndRecord, with: .defaultToSpeaker)

// create the recorder with default settings and start recording
try! audioRecorder = AVAudioRecorder(url: filePath, settings: [:])
audioRecorder.delegate = self
audioRecorder.isMeteringEnabled = true
audioRecorder.prepareToRecord()
audioRecorder.record()

Stop Button

When the record button has been tapped, the stop button becomes enabled. As you would expect, it should end the recording. Adding this code to the stop button’s action implements that mechanism.

audioRecorder.stop()

// deactivate our session so other apps' audio can resume
let audioSession = AVAudioSession.sharedInstance()
try! audioSession.setActive(false)

Once the recording has stopped, we should open it in a new view controller that plays it back. Add this delegate callback, which fires when the recording finishes, and make sure the class conforms to the AVAudioRecorderDelegate protocol.

func audioRecorderDidFinishRecording(_ recorder: AVAudioRecorder, successfully flag: Bool) {
    if flag {
        performSegue(withIdentifier: "PlayRecordedAudioSegue", sender: self)
    }
}

In the segue, we should pass the recorded URL to the next view controller.
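A minimal sketch of that handoff, assuming the destination is the PlayAudioViewController used later and that it exposes the recordedAudioURL property its setup code reads:

override func prepare(for segue: UIStoryboardSegue, sender: Any?) {
    if segue.identifier == "PlayRecordedAudioSegue" {
        let playVC = segue.destination as! PlayAudioViewController
        // hand over the file URL the recorder just wrote
        playVC.recordedAudioURL = audioRecorder.url
    }
}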

Second View Controller

Now, in the second view controller, add a play button, a stop button, a file name label, and an edit button that opens a new view controller.

In this controller, we will be using the following variables for playing the recorded audio:

var audioFile: AVAudioFile!
var audioEngine: AVAudioEngine!
var audioPlayerNode: AVAudioPlayerNode!
var stopTimer: Timer!
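The snippets later in this guide also assume the controller declares the values handed to it by the recording and settings screens; a minimal sketch:

// set by the segue from the recording screen
var recordedAudioURL: URL!

// set by the settings screen on the way back
var rate: Float?
var pitch: Float?
var echo = false
var reverb = false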

Class for Audio Manipulation

Now let’s create an extension of PlayAudioViewController that conforms to the AVAudioPlayerDelegate protocol. It will contain the audio setup, playback, and stopping logic. During playback, we can manipulate the audio to change its pitch and rate, as well as add reverb and echo effects.

extension PlayAudioViewController: AVAudioPlayerDelegate {

    // MARK: Alerts

    struct Alerts {
        static let RecordingDisabledTitle = "Recording Disabled"
        static let RecordingDisabledMessage = "You've disabled this app from recording your microphone. Check Settings."
        static let RecordingFailedTitle = "Recording Failed"
        static let RecordingFailedMessage = "Something went wrong with your recording."
        static let AudioRecorderError = "Audio Recorder Error"
        static let AudioSessionError = "Audio Session Error"
        static let AudioRecordingError = "Audio Recording Error"
        static let AudioFileError = "Audio File Error"
        static let AudioEngineError = "Audio Engine Error"
    }

    // MARK: Audio Functions

    func setupAudio() {
        // initialize (recording) audio file
        do {
            audioFile = try AVAudioFile(forReading: recordedAudioURL as URL)
        } catch {
            Alert.showDismissAlert(Alerts.AudioFileError, message: String(describing: error), in: self)
        }
    }

    func playSound(rate: Float? = nil, pitch: Float? = nil, echo: Bool = false, reverb: Bool = false) {

        // initialize audio engine components
        audioEngine = AVAudioEngine()

        // node for playing audio
        audioPlayerNode = AVAudioPlayerNode()
        audioEngine.attach(audioPlayerNode)

        // node for adjusting rate/pitch
        let changeRatePitchNode = AVAudioUnitTimePitch()
        if let pitch = pitch {
            changeRatePitchNode.pitch = pitch
        }
        if let rate = rate {
            changeRatePitchNode.rate = rate
        }
        audioEngine.attach(changeRatePitchNode)

        // node for echo
        let echoNode = AVAudioUnitDistortion()
        echoNode.loadFactoryPreset(.multiEcho1)
        audioEngine.attach(echoNode)

        // node for reverb
        let reverbNode = AVAudioUnitReverb()
        reverbNode.loadFactoryPreset(.cathedral)
        reverbNode.wetDryMix = 50
        audioEngine.attach(reverbNode)

        // connect nodes
        if echo && reverb {
            connectAudioNodes(audioPlayerNode, changeRatePitchNode, echoNode, reverbNode, audioEngine.outputNode)
        } else if echo {
            connectAudioNodes(audioPlayerNode, changeRatePitchNode, echoNode, audioEngine.outputNode)
        } else if reverb {
            connectAudioNodes(audioPlayerNode, changeRatePitchNode, reverbNode, audioEngine.outputNode)
        } else {
            connectAudioNodes(audioPlayerNode, changeRatePitchNode, audioEngine.outputNode)
        }

        // schedule to play and start the engine
        audioPlayerNode.stop()
        audioPlayerNode.scheduleFile(audioFile, at: nil) {

            var delayInSeconds: Double = 0

            if let lastRenderTime = self.audioPlayerNode.lastRenderTime, let playerTime = self.audioPlayerNode.playerTime(forNodeTime: lastRenderTime) {
                if let rate = rate {
                    delayInSeconds = Double(self.audioFile.length - playerTime.sampleTime) / Double(self.audioFile.processingFormat.sampleRate) / Double(rate)
                } else {
                    delayInSeconds = Double(self.audioFile.length - playerTime.sampleTime) / Double(self.audioFile.processingFormat.sampleRate)
                }
            }

            // schedule a stop timer for when audio finishes playing
            self.stopTimer = Timer(timeInterval: delayInSeconds, target: self, selector: #selector(PlayAudioViewController.stopSound), userInfo: nil, repeats: false)
            RunLoop.main.add(self.stopTimer!, forMode: RunLoopMode.defaultRunLoopMode)
        }

        do {
            try audioEngine.start()
        } catch {
            Alert.showDismissAlert(Alerts.AudioEngineError, message: String(describing: error), in: self)
            return
        }

        // play the recording!
        audioPlayerNode.play()
    }

    @objc func stopSound() {
        audioPlayerNode?.stop()
        stopTimer?.invalidate()

        if let audioEngine = audioEngine {
            audioEngine.stop()
            audioEngine.reset()
        }
    }

    // MARK: Connect List of Audio Nodes

    func connectAudioNodes(_ nodes: AVAudioNode...) {
        for x in 0..<nodes.count - 1 {
            audioEngine.connect(nodes[x], to: nodes[x + 1], format: audioFile.processingFormat)
        }
    }
}

In this view controller, we should call setupAudio() in viewDidLoad, playSound(rate:pitch:echo:reverb:) in the play button action, and stopSound() in the stop button action. The edit button action should perform a segue to a view controller that changes the effects applied to the audio.
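The wiring might look like this sketch (the action names are placeholders; rate, pitch, echo, and reverb are the stored properties declared earlier):

override func viewDidLoad() {
    super.viewDidLoad()
    setupAudio()
}

@IBAction func playButtonPressed(_ sender: UIButton) {
    playSound(rate: rate, pitch: pitch, echo: echo, reverb: reverb)
}

@IBAction func stopButtonPressed(_ sender: UIButton) {
    stopSound()
}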

In the settings view controller, we will have a rate slider, a pitch slider, a reverb switch, and an echo switch. The rate slider should range from a minimum of 0.5 to a maximum of 1.5, and the pitch slider from -1000 to 1000.
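If you prefer to configure those ranges in code rather than in the storyboard, a minimal sketch (rateSlider and pitchSlider match the outlets used in the extension below):

override func viewDidLoad() {
    super.viewDidLoad()
    rateSlider.minimumValue = 0.5
    rateSlider.maximumValue = 1.5
    rateSlider.value = 1.0           // normal speed
    pitchSlider.minimumValue = -1000
    pitchSlider.maximumValue = 1000
    pitchSlider.value = 0            // original pitch
}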

Once you tap the back button, the view controller should pass the values of the rate, pitch, echo, and reverb controls back to the previous view controller. Add this extension to trigger that:

extension EditAudioViewController: UINavigationControllerDelegate {

    func navigationController(_ navigationController: UINavigationController,
                              willShow viewController: UIViewController,
                              animated: Bool) {
        if let playVC = viewController as? PlayAudioViewController {
            playVC.rate = rateSlider.value
            playVC.pitch = pitchSlider.value
            playVC.echo = echoSwitch.isOn
            playVC.reverb = reverbSwitch.isOn
        }
    }
}
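Note that this delegate method only fires if the edit screen is actually the navigation controller’s delegate, so assign it when the view loads (for example, in the viewDidLoad sketched above):

navigationController?.delegate = self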

If you play the audio again, it should now play with the effects applied.

Additional Notes

As you may have noticed, we pass the pitch, rate, reverb, and echo values from one view controller to another. This is because applying effects does not change the source audio file; the effects are added only when the audio is played, and that is handled by AVAudioEngine.

Conclusion

Handling audio files can be hard to understand at first, but once you get the hang of it, it becomes quite easy. A natural next step from here is manipulating the file itself, not just the playback, but that’s for another day.

