Mixing with Headphones



The Renaissance – tips and tricks for mixing with headphones

A look around on the street and in buses and trains these days is enough to realize: the headphone is back! Those who have reached the autumn of life may feel transported back to the 80s, when the Sony Group first established location-independent music listening on a grand scale with its legendary Walkman. Back then, however, the music selection was limited to the number of compact cassettes you were willing to lug around. Nowadays, thanks to modern mobile data networks combined with numerous streaming providers, location-independent access to music, podcasts and radio plays is virtually unlimited. Together with the trend toward working from home, this has given headphone sales an additional boost.

Headphone Sales

Ten years ago, just under nine million headphones were sold annually in Germany. By 2015, the figure had already reached 11.4 million units – with a clear upward trend! What is surprising is that sound quality is listed as the most important purchase criterion, well ahead of design and price. Consumers currently prefer to buy high-quality headphones instead of monstrous hi-fi towers with imposing speaker cabinets.

The arguments for investing in headphones are many and varied; sound alone is not always the decisive factor. Headphones are now also a lifestyle product, where design and brand affiliation play a weighty role for some users. Sometimes it is a special feature, such as noise cancelling, that moves a certain model into the shopping cart. If you travel a lot by train and plane, you won't want to miss the automatic suppression of ambient noise. Effective noise cancelling filters the annoying ambient noise out of the useful signal, so the user is not tempted to compensate for the noise with a higher listening volume.

Another aspect of everyday life is communication via cell phones and tablets. In line with this trend, most mobile devices come with matching in-ear headphones. This is having a lasting impact on media use, especially among young people. According to the 2020 JIM Study, the smartphone is by far the most popular device among German youngsters for connecting to the Internet.

Spotify

At the same time, the content of Internet use has shifted noticeably within ten years. While 23% of young people used the Internet for entertainment purposes (e.g., music, videos, pictures) in 2010, this figure rose to 34% in 2020. According to the JIM study, the streaming service Spotify is more popular among young people than Facebook, for example. With these facts in view, it should be clear: headphones are back, and as musicians, sound engineers and mastering engineers, we have to ask ourselves whether we are responding adequately to this trend.

More devices = more mixing decisions?

With the sheer number of different mobile devices, the question inevitably arises as to whether special consideration needs to be given to these devices when mixing and mastering. It used to be simpler: there were the duos "tape & turntable" and "hi-fi speakers & headphones". Today, the number of different devices and media seems almost infinite.

Vinyl record players, CD players, MP3 players, smartphones, tablets, desktop computers, laptops, Wi-Fi speakers and battery-powered Bluetooth speakers. Some consumers own a hi-fi system worth a condo, while others consume music through their smartphone's built-in mini-speakers.

The range is extremely wide, which leaves you with an uneasy feeling when mixing: will the mix sound good on every device? There is probably no universally valid answer. However, those who know their "customers" can draw their own conclusions. If you produce rap music for a young crowd, you should definitely check your mix on smartphone speakers and on the standard in-ear headphones that come bundled with nearly every iPhone or Samsung phone.

The perfect listening environment – headphones?!

In the 80s, the hi-fi tower with two large loudspeakers was the ultimate in music entertainment. The music lover sat in the perfect stereo triangle and listened to the music there. Nearly all music studios are still built on this model today: two loudspeakers in an acoustically optimized room for recording and mixing.

But even then, the room acoustics in hi-fi living rooms were anything but perfect. Depending on the surfaces, windows and room geometry, optimal conditions were never possible there; many compromises were made. Especially in the bass range, the room plays a formative role and is rarely linear. On top of that, different loudspeaker models sound very different, so a standard reference does not exist. If, in the worst case, neither the studio acoustics nor the listener's room acoustics are optimal, the music experience is already heavily clouded in terms of sound.

Nowadays, perfect listening situations at home are becoming less and less common. The number of different speaker systems is unmanageable and many playback systems are designed more for background sound. 

For the real music lover with a demand for quality, headphones have established themselves as the listening system. Customers are also willing to spend more money if this enables an increase in quality. 

Disadvantage of studio monitor speakers

Studio monitor speakers come in all sizes and variations and are available in active and passive versions. The crucial difference between studio monitors and studio headphones is that monitors never act in isolation from the room: room acoustics have a significant impact on their performance.

The better the room acoustics, the better the performance of the monitors. The room acoustics play a large part in the frequency response of a loudspeaker, which may be linear in the test lab but anything but perfect in an acoustically unfavorable environment. In that case, even buying the next better model is no real help, because the acoustics do not change. Mixes made in unfavorable rooms often have problems in the bass: certain frequency ranges are either missing or exaggerated. Spatial imaging is also disturbed by early reflections.

Consideration

Then there are other practical considerations. How much noise can I make in my room without disturbing a roommate or neighbor? At what times can I work? Can the room be treated to near-perfect acoustics, and what costs can I expect? For many home studios in particular, these are important questions.

The question here is whether it makes sense to create the mix directly on headphones and then check whether it also works on other speaker systems. The likelihood that listeners will enjoy the music on decent headphones is higher than the likelihood that they will hear it on high-end speakers in optimal placement.

Therefore, it makes sense to optimize the sound for headphones right away. This way the listener gets almost the same listening experience, possibly even on the same headphone model. It can take many years to achieve a balanced, good stereo mix in an unfavorable acoustic environment; with headphones, this is possible much faster.

If you also listen to other music on the same headphones, the reference is right there and comparisons become easier.

Laptops have made music production more mobile and possible in many places; with headphones, the same listening situation is available everywhere.

The bass range in particular can be judged very well with headphones: the frequency response extends very low and is reproduced without distortion.

What should you pay attention to?

A headphone mix is always an unnatural listening environment. Why is that? Here's a little experiment: sit in front of your studio monitors or hi-fi system and move your head quickly from left to right and back. While you are moving, you will hear a slight flanging or comb-filter effect. If you repeat the same movement wearing headphones, you will notice that the sound remains identical, no matter what position your head is in.

The reason for this is quickly explained: when you listen to music through headphones, the signal lacks the natural crosstalk that is always present in normal speaker signals. 

Even if you sit in front of your monitor speakers in a perfect stereo triangle, the left ear will still hear sound from the right monitor and vice versa. The head never completely blocks sound events from the opposite side. That would be fatal, too, because only thanks to both ears and the ability to localize sound could our ancestors recognize from which direction the saber-toothed tiger was approaching. Binaural hearing is a basic prerequisite for an aural sense of direction.

Historical evolution

If danger approaches from the left, its sound first reaches the left ear. The right ear perceives the sound at a reduced level and with a slight time delay. The distance between the ears creates a time-of-flight difference that enables our brain to work out the direction of sound incidence. Phylogenetically, listening via studio monitors is therefore more natural than via headphones. But the average prehistoric man was a hunter-gatherer, not a music producer worrying about missing crosstalk. Producers should know about these issues, however, because in an acoustically problematic room, studio monitors excite many unwanted room reflections that complicate mix decisions.
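As a rough back-of-the-envelope illustration of that time-of-flight difference (the ear spacing and speed of sound below are textbook approximations, not values from this article), a short sketch:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, approximate value at room temperature
EAR_SPACING = 0.21       # m, rough distance between the ears (assumption)

def interaural_time_difference(angle_deg: float) -> float:
    """Rough ITD in milliseconds for a source at angle_deg off-center
    (0 = straight ahead, 90 = directly to one side), simple path-length model."""
    extra_path = EAR_SPACING * math.sin(math.radians(angle_deg))
    return extra_path / SPEED_OF_SOUND * 1000.0

for angle in (0, 30, 60, 90):
    print(f"{angle:2d} degrees off-center -> ITD ~ {interaural_time_difference(angle):.2f} ms")
# A source hard to one side arrives roughly 0.6 ms earlier at the nearer ear --
# one of the cues the brain uses for localization.
```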

Example: in a large room, our binaural hearing can distinguish very well between direct sound and reflections as soon as the temporal distance between them is large enough. However, this abstraction work consumes a lot of the brain's attention resources. Mixing and mastering in such an environment puts constant latent stress on our hearing and should be remedied by sustainably improving the room acoustics. But how should producers and musicians react if they regularly work in different places with different room acoustics? As an immediate measure, reach for studio headphones! Despite the missing crosstalk, headphones have several features that recommend them for music production.

Pro-Phones!

The great advantage of headphones is that they always offer the same acoustic landscape, so it does not matter where you use them. Headphones offer a tonal home that provides an identical working basis day after day. If you are a producer or sound engineer who travels a lot, you should definitely get a pair of good studio headphones. No matter how bad the monitoring conditions may be in a studio, on a streaming job or at an FoH position in a reflective hell called "town hall": headphones provide a reliable reference that you can always fall back on.

When mixing with headphones, what do I need to keep in mind?

Now that we have evaluated the technical requirements for the best possible headphone mix, here is a short summary of what you should keep in mind when mixing with headphones. A mix on headphones always sounds a bit more striking and bigger compared to conventional monitors. Hard panning (left/right) sounds more drastic and extreme than on monitors due to the lack of crosstalk. Therefore, creating a natural stereo image and making panning decisions is more difficult on headphones.

On the other hand, small mixing errors and noises are much easier to localize on headphones. Headphones are also ideal for editing vocal or drum tracks. For longer sessions, it makes sense to switch between headphones and monitors more often; especially with headphones, the risk of fatigue and excessive levels is always present. In home studios, vocals are often recorded in the same room where the studio equipment is located. Here, too, a good pair of headphones is important for directly assessing the artist's performance. With some experience, analog processing can even be applied during the recording, because its effect can be evaluated acoustically right away.

Which headphones are perfect for me?

For mixing tasks and studio use, studio headphones are generally better than consumer models. Quite a few consumer models prefer to sound “fat” instead of being an aid in mix decisions. Therefore, reaching for studio headphones is always preferable. Now there is another decision to make and that concerns the construction type. What should it be? Closed, semi-open or open? I’m not talking about the status of your favorite club, but the design of the earcup. 

Closed headphones like the beyerdynamic DT 770 Pro have the advantage that hardly any sound penetrates from the outside and the wearer hardly emits any sound to the outside. Closed headphones are therefore particularly suitable for tracking in the studio: the spill of other instruments into the headphones is reduced, and a loud click track does not bleed into the vocal microphone while singing. In addition, closed headphones work well in noisy environments, and people around you do not feel disturbed, since hardly any sound escapes. Due to their design, however, closed headphones have some disadvantages when it comes to achieving the most linear sound possible. The closed ear cup always creates a pressure buildup, especially at low frequencies. This is one reason why (semi-)open headphones often have a more natural bass response, generally sound more "open" and "airy", and also boast better impulse fidelity.

Which headphones for which task?

During long working days, the (semi-)open models offer better wearing comfort thanks to good airflow (air exchange through the open earcups), which is particularly advantageous at higher temperatures. Open models, on the other hand, have the disadvantage that they audibly emit sound to the outside and accordingly offer only little separation between the useful signal and ambient noise in a noisy environment. The right choice therefore also depends on the intended application: for pure mixing tasks in a studio room without ambient noise, open headphones are a good choice, while closed headphones are recommended for tracking at band volume.

In addition to studio use with powerful headphone amplifiers, you may also want to listen to your mix via smartphone. In that case it makes sense to use headphones with the lowest possible impedance (ohms). Our example DT 770 Pro headphones are available in three impedance versions (32, 80 or 250 ohms). As a rule of thumb, the higher the impedance, the more voltage the headphone amplifier has to deliver to generate a decent level. This means that if you want to use your headphones with a smartphone, laptop or tablet, you should preferably put a low-impedance version in your shopping cart.
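To make the rule of thumb concrete, here is a small sketch. The source voltage and the sensitivity figure are assumptions for illustration (real headphones specify slightly different sensitivities per impedance version, so check the datasheet of your actual model):

```python
import math

SOURCE_VRMS = 1.0           # assumed maximum output of a typical smartphone/laptop jack
SENSITIVITY_DB_PER_MW = 96  # assumed sensitivity in dB SPL at 1 mW (illustrative value)

def estimated_spl(impedance_ohms: float) -> float:
    """Estimated SPL when the source drives the headphone directly."""
    power_mw = (SOURCE_VRMS ** 2 / impedance_ohms) * 1000.0  # P = U^2 / R
    return SENSITIVITY_DB_PER_MW + 10.0 * math.log10(power_mw)

for ohms in (32, 80, 250):
    print(f"{ohms:3d} ohm version: ~{estimated_spl(ohms):.1f} dB SPL at {SOURCE_VRMS} Vrms")
# ~111 dB SPL for 32 ohms vs. ~102 dB SPL for 250 ohms:
# the high-impedance version wants a headphone amp with more voltage swing.
```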

Conclusion

It is worth considering buying a very good pair of headphones for studio work and learning to record and mix with them. The cost is only a fraction of what very good speakers and room acoustics cost. 

Therefore, it makes more sense to invest the money saved in audible improvements to the signal chain such as microphones, preamps, sound generators and analog outboard processing. With a good pair of headphones, the differences in these improvements can be quickly noticed. These improvements can also be heard directly in the music and that, after all, should be the goal of the measures.

What are your thoughts on this subject? If you like this blog post please leave a comment and share this post with your friends.

Thanks for reading!

Your Ruben Tilgner

How to Podcast

Estimated reading time: 14 minutes



Due to the pandemic, our media behavior is changing permanently. Many outward-facing presentation formats such as concerts, exhibitions, or face-to-face seminars are currently possible only to a limited extent, if at all. But that doesn't stop creative minds from spreading their artistic output. Given the situation, people are looking for alternative ways to present themselves to the world.

Besides live streams, the podcast is a popular way to make even more complex topics accessible to a larger audience. The barriers to entry for the audience are low; the podcaster, on the other hand, has to consider a few things to come up with a good-quality podcast. In this blog post, we want to offer you valuable tips around the topic of podcasting. Besides relevant, interesting content, good audio quality is key to a successful podcast – especially if you want to establish yourself as a high-fidelity island in an ocean of podcasts. So let's get started.

The renaissance of podcasts

You might assume that the podcast medium is only a few years old. In fact, it has merely been rediscovered in recent years and is currently celebrating a noticeable revival due to the pandemic. The basic idea is surprisingly old. In the 1980s, the company RCS (Radio Computing Services) provided digital talk content for radio stations for the first time, which at least comes close to the basic idea of today's podcasts. At that time, however, no one was talking about a podcast; "audioblogging" was an early attempt to give the new medium a name. The term "podcast" was first used in 2004 by Ben Hammersley in a Guardian newspaper article.

The Medium

The year 2004 is considered formative for the podcast medium because of another event. Software developer Dave Winer and former MTV VJ Adam Curry are considered the inventors of the podcast format still known today. From this point on, development accelerated rapidly, also thanks to the Internet. In the same year, the first podcast service provider, Libsyn, was launched; today Libsyn is a publicly traded company. In 2005, Apple released native support for podcasts for the first time with iTunes 4.9. Steve Jobs demonstrated how to create a podcast using GarageBand in 2006, while in 2013 Apple reported its one billionth podcast subscriber. How hot the topic has become can also be measured by the fact that in 2019 the streaming service Spotify acquired the podcast companies Gimlet Media and Anchor FM Inc., making it one of the largest providers of podcasts.

In 2020, 33 percent of Germans said in a representative survey that they listen to podcasts. In terms of content, most listeners (in addition to the omnipresent topic of Corona) prefer comedy, sports, and news. The world of podcasts is widely varied in terms of content, and the number of podcast providers is just as wide-ranging.

Which podcast hosting platform would you like?

Ok, you have decided to start your own podcast. While I'll offer tips & tricks on the technical implementation later, the first step you should take is to think about where your podcast should be hosted. This sounds trivial, but this groundwork has a direct impact on the performance of your podcast. On the one hand, the range of hosting platforms is very large, and on the other hand, an unwise choice can have a massive impact on the future of your podcast project. But first things first.

Before you decide on a provider, you should clearly outline what you want to achieve with your podcast in the medium and long term and then choose a suitable provider. Do you have no commercial intentions and prefer to deal with topics and content beyond the prevailing zeitgeist? Then you can easily host your podcast "The Multiband Compressor in Changing Times" on your own website. This will not cause any additional costs, and the traffic from your rather small community will not cause any hassle for the web server. However, if you plan to address a larger audience, you should rely on a professional podcast provider from the beginning. Otherwise, a later move from your website to a professional provider can be problematic; in the worst case, you may even lose some of your subscribers.

Can’t you just host your podcasts with one of the global players like iTunes, Google Podcasts, or Spotify?

Unfortunately, you can't. These providers simply pull your data from a dedicated podcast host. Thus, you have to match your ideas with the offers of the podcast hosting platforms to find the right partner. If you don't know exactly where you want your podcast to go, it's best to go with an entry-level package from a professional provider (e.g. Podcaster, Anchor, Captivate, or Buzzsprout). If the number of subscribers grows, you can upgrade your hosting plan, for example to earn some extra money via an affiliate marketing integration or to get a more detailed picture of your listeners via detailed statistics.

[Screenshot: the Anchor app]

What equipment do I need for a podcast?

Since podcasts focus primarily on voice recordings, the minimum equipment needed is a microphone and a recording device. As always with audio technology, the price range is wide. Since most podcasts are now consumed through a smartphone, the following thought is obvious: why not record your first attempts directly with your smartphone? Every smartphone has a built-in microphone, and together with the free “Anchor” app from Spotify (available for iOS and Android), you can start your first steps as a podcaster without a big budget. Especially since this combination can also be used “off the grid” beyond a studio environment.

However, this combination quickly reaches its limits when professional sound quality is required, when there are multiple participants in the conversation, or when a participant needs to be integrated via Skype. In this case, we switch from a smartphone to a laptop or desktop computer combined with a professional audio interface with multiple microphone inputs. Tip: for the best possible sound quality, each participant should listen via headphones instead of monitors. If I play back the voice of a guest (e.g. via Skype) through monitors, that voice is also picked up again by my own microphone. This inevitably creates unsightly comb-filter effects that are hard to remove later.

Which microphone is the perfect one for a podcast?

When it comes to choosing a microphone, the situation is similar to that of the podcast platform providers. What we already know: if you are recording a live podcast with one or more participants, you should not use monitors. If you also need to have your hands free while speaking, you can use a broadcast headset (a headphone/microphone combination) instead of a separate microphone. Among podcasters, for example, the beyerdynamic DT-297-PV-MKII is considered a popular standard. The DT-297-PV combines a cardioid condenser capsule with a dynamic headphone used for monitoring. An affordable alternative is the PreSonus Revelator USB-C microphone, which combines everything necessary for a podcast.

Next Level Sound Quality

There is still room for improvement in terms of sound quality. A look at the equipment of professional podcasts (e.g. The Joe Rogan Experience) provides insight. Here you can see professional broadcast microphones in use across the board, which are known for one thing in addition to their proven sound quality: a predictable proximity effect. Professional broadcast microphones such as the Electro-Voice RE20, Neumann BCM 01, or the Shure SM7 sound very balanced and give the speaker a deep, warm voice at close distance – just as you would expect from the radio. Ribbon microphones also have a very warm and pleasant sound with very little sibilance. In addition to well-known high-quality microphones, a closer look reveals another detail in professional podcast studios: acoustic elements for improving the room acoustics.

Ambitious podcasters basically operate like a professional recording studio: the quality of the recording is determined by the weakest link in the chain. The use of an expensive microphone therefore makes only limited sense if the room acoustics are anything but ideal. If you put a professional studio microphone in the shopping cart, the rest of the signal chain (microphone preamplifier, equalizer, compressor, digital converter, room acoustics, post-processing equipment) should operate at a comparable level. A good analog compressor in particular gives the voice the professional sound we know and love from the radio. Excessive dynamic fluctuations are unpleasant to listen to, especially on headphones, and they are just as much of a problem in a loud environment such as a car.

During recording

In the recording studio, many users keep their options open and postpone sound-shaping decisions to the mix phase. As a podcaster, you can do the same, but it often makes sense to commit to a sound while recording. Some podcasts are also streamed live and only offered for download after the broadcast; in this case, you should definitely use EQ and compression during the recording. The same applies if you receive one or more guests on the podcast. Skillful use of EQ and compression makes the sound more balanced, and each participant can understand his or her counterpart better.

Why is that?

Every voice and every microphone sounds different, so there are almost always one or two problem frequencies where a voice might sound too nasal (400-500 Hz), too shrill (3-4 kHz), or too treble-heavy (6-8 kHz). If you filter out these frequencies with a professional EQ like the elysia xfilter 500, the voice will sound much more pleasing and consistent. Even at the source, the sound can be significantly improved with the right tools in just a few steps. The elysia skulpter 500 preamp offers direct access to the most important parameters, such as microphone pre-amplification.

This is especially important when guests with very tight schedules join in. In these cases, the soundcheck must be as short as possible – no problem with the help of the skulpter 500. The built-in microphone preamp boosts low-output microphones with up to +65 dB of gain when needed – perfect for microphones like the Shure SM7B when paired with a quiet, delicate voice. For fast and efficient sound correction, the "Shape" filter is available. If the microphone is used very close up or the speaker has an unusually low voice, unnecessary low-bass components can be professionally disposed of with the tunable low cut. Very dynamic voices are tamed with the interactive single-knob compressor, which results in a much more homogeneous sound.

All these functions are controlled by just four potentiometers, which produce fast and comprehensible results. In combination with a professional AD converter or audio interface, this is already half the battle. Speech intelligibility remains very high, which is important for longer podcasts. The minimum goal is a distortion-free recording with a consistent level captured on the hard disk. For that, you should know the basics of gain staging; you will find lots of information in our previous Gain Staging blog post.

Caution. Room!

For the final touch, you should ensure good room acoustics. Ideally, you'll have a recording room with dry, even acoustics that is also isolated from environmental noise. Investing a not inconsiderable portion of your budget in room acoustics and soundproofing may not have been on your wish list, but it's definitely worth it. Often, though, there are rooms that are already very quiet by themselves, such as the bedroom. As we will see later under "Post-processing for maximum sound", the recordings are usually compressed again significantly.

Compression clearly brings any existing room component forward in the mix. Room reverberation may not be particularly annoying during the recording, but at the latest in post-processing you will be bothered by the indirect sound, which can only be repaired with a lot of effort afterward. The same applies to noise from outside that creeps uninvited into the recording. Background noise should be avoided at all costs.

Post-processing for maximum sound

To be able to mix the recorded tracks comfortably, we need audio software. The selection here is also wide: from free entry-level applications like Audacity to full-fledged DAWs (Cubase, Logic, or Pro Tools), both price and feature set vary considerably. The podcaster is therefore spoiled for choice. Regardless of price and features, however, there are some criteria that suitable software for podcasters should fulfill. These include recording, arranging, and adding additional audio material such as jingles, sound bites, or background music. Furthermore, the software must be able to combine all audio files into a stereo mixdown in different formats. Let's get started and put the finishing touches on our signals.

The main focus should always be on the best-sounding, most intelligible voice reproduction possible. The listener wants to hear the voice clearly; that is what keeps them engaged. This is what should be worked out in post-production. Post-production is divided into three steps. First, we adjust the volume of all signals relative to one another so that the podcast can be listened to without large jumps in level.

If we have different speakers and perhaps different music feeds, we also try to match them in terms of sound and dynamics. The second step is to work with the individual signals and use EQ and dynamics processors. More tips on how to use EQs and dynamic tools will be presented in a future blog post (spoiler).

Tools

Another useful tool for improving the sound can be restoration or noise-reduction plug-ins, which specifically filter out background noise. Anything that serves intelligibility and an unobtrusive listening impression is useful for keeping the listener engaged. The final step in podcast post-production is a "mini-master": all individual signals are mixed together to create a stereo track. This stereo track can be fine-tuned once more, similar to mastering for a music production. However, the topic of "How to Master" requires one or more separate blog posts and cannot be further elaborated here for reasons of space.

However, I would like to share the following tip:

A frequently asked question in the podcast community is: How loud should a podcast master be? The AES (Audio Engineering Society) recommends a maximum level between -20 and -16 LUFS for podcasts. We, on the other hand, recommend working with the level that generates the best sound from the master. This again is simply dependent on the recording and mix. Some voices may be treated more heavily with compression and limiting than others. Therefore, the ear and not LUFS metering should be the final consideration. More on this topic in our previous Mastering for Spotify blog post.

A question of format

Once we have found suitable mastering settings, the mixdown to the appropriate audio format is next. Usually, this is an MP3. MP3 is preferred for podcasts due to its small file size, even though it is not the best option in terms of sound. A direct competitor would be M4A, which produces similarly small files and sounds even better. Unfortunately, not all podcast platforms (e.g. Spotify) support the M4A format, so you should only choose it if you prefer "sound over reach". If sound is the most important criterion, there is no way around the WAV format. The big disadvantage of WAV, however, is the huge file size, which might scare away some subscribers – especially if they prefer to download your podcast or access it via mobile data on their smartphone. Therefore, a high-resolution MP3 (320 kbit/s) is currently the common compromise.
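The trade-off is easy to put into numbers. A rough sketch (the episode length is an assumption, and the WAV figure ignores the small file header):

```python
def wav_size_mb(minutes: float, rate: int = 44100, bits: int = 16, channels: int = 2) -> float:
    """Approximate uncompressed WAV size in megabytes."""
    return minutes * 60 * rate * (bits / 8) * channels / 1_000_000

def mp3_size_mb(minutes: float, kbps: int = 320) -> float:
    """Approximate MP3 size in megabytes at a constant bitrate."""
    return minutes * 60 * kbps / 8 / 1000

minutes = 45  # assumed episode length
print(f"WAV 44.1 kHz / 16 bit stereo: ~{wav_size_mb(minutes):.0f} MB")
print(f"MP3 320 kbit/s:               ~{mp3_size_mb(minutes, 320):.0f} MB")
print(f"MP3 128 kbit/s:               ~{mp3_size_mb(minutes, 128):.0f} MB")
# ~476 MB vs. ~108 MB vs. ~43 MB -- why MP3 remains the common compromise.
```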

Metadata

After we have created the master, the last step is to add the metadata. For podcasts, inserting so-called ID3 or enclosure tags into the master file is practically indispensable. This guarantees that your podcast is equipped with the most important tags (podcast title, cover art, year of publication, etc.). This metadata is the digital identity card of your podcast file. If your audio software cannot write metadata, the ID3 tags can also be added later via software like iTunes, Mp3tag, or ID3 Editor. The PreSonus DAW Studio One has an extra "Project Page", which works like a mastering program; metadata for individual tracks and albums, including artwork, can be entered there. It looks like this:

[Screenshot: Studio One Project Page with metadata]
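If you prefer to script the tagging step instead of clicking through a tag editor, the third-party Python library mutagen can write the same fields. A minimal sketch (file name and tag values are placeholders, not from the original post):

```python
from mutagen.easyid3 import EasyID3
from mutagen.id3 import ID3, ID3NoHeaderError, APIC

path = "episode_042.mp3"                     # placeholder file name

try:
    tags = EasyID3(path)
except ID3NoHeaderError:                     # exported file may carry no ID3 header yet
    tags = EasyID3()
tags["title"] = "Episode 42 - How to Podcast"
tags["artist"] = "Your Podcast Name"
tags["album"] = "Season 2"
tags["date"] = "2021"
tags.save(path)

# Cover art needs the full ID3 interface (APIC frame)
id3 = ID3(path)
with open("cover.jpg", "rb") as art:
    id3.add(APIC(encoding=3, mime="image/jpeg", type=3, desc="Cover", data=art.read()))
id3.save()
```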

The ID3 or enclosure tags are important because this metadata is referenced in an RSS feed, which makes it easier to find your episodes and to download them automatically if desired. But what is an RSS feed? An RSS feed is a file format that delivers content from the Internet: you can use it to subscribe to blogs or podcasts and be alerted automatically to new content. RSS feeds are effectively the standard for podcast subscriptions. Theoretically, you can create an RSS feed yourself, but this is not necessary if you use a podcast hosting platform, since they automatically generate appropriate RSS feeds for their customers. One more reason to entrust your podcast to a hosting platform.

Summary

We hope this blog post has provided some valuable tips and tricks for your podcast venture. At least as far as the production side is concerned, it should now be clear how to produce a podcast. You should always keep one thing in mind: A podcast with good sound and professional production does not guarantee a large number of subscribers. Basically, a successful podcast is hardly different from a successful music production at this point. Only those who understand how to combine a good sound with interesting, fresh content will build up a loyal and hopefully steadily growing audience over time. Especially in challenging times, people are increasingly looking for one thing: meaningful content with relevance. If your podcast succeeds in combining both and the production is also convincing in terms of sound, then your podcast is unlikely to complain about a lack of attention in the future.

I hope you enjoyed this post and I would be happy if you comment on, discuss and share it.

Yours, Ruben Tilgner

Mastering for Spotify, YouTube, Tidal, Amazon Music, Apple Music and other Streaming Services

Estimated reading time: 14 minutes



Do audio streaming platforms require a special master?

Introduction

Streaming platforms (Spotify, Apple, Tidal, Amazon, Youtube, Deezer etc.) are hot topics in the audio community. Especially since these online services suggest concrete guidelines for the ideal loudness of tracks. To what extent should you follow these guidelines when mastering and what do you have to consider when interacting with audio streaming services? To find the answer, we have to take a little trip back in time.

Do you remember the good old cassette recorder? In the 80s, people used it to make their own mixtapes: songs by different artists gathered on one tape, which we pushed into the car's tape deck, Cherry Coke in the other hand, in order to roll up with a suitable soundtrack at the next ice cream parlor in the city center. The mixtapes offered a consistently pleasant listening experience, at least as far as the volume of the individual tracks was concerned. When we created mixtapes, the recording level was simply adjusted by hand, so that records of different loudness were more or less consciously normalized.

Back to the Streaming Future. Time leap: Year 2021.

Music fans like us still enjoy mixtapes, except that today we call them playlists and they are part of various streaming services such as Apple Music, Amazon Music, Spotify, YouTube, Deezer or Tidal. In their early years, these streaming services quickly discovered that, without a regulating hand on the volume fader, their playlists required constant readjustment by users due to the varying loudness of individual tracks.

So they looked for a digital counterpart to the analog record-level knob and found it in an automated normalization algorithm that processes every uploaded song according to predefined guidelines. Spotify, for example, specifies -14 LUFS as the ideal loudness value. This means that if our song is louder than -14 LUFS, the streaming algorithm automatically reduces its volume so that playlists have a more consistent average loudness. Sounds like a good idea at first glance, right?

Why LUFS?

The problem of differing volume levels was not limited to music. It was also widespread in broadcasting: the difference in volume between a television movie and the commercial breaks within it sometimes took on such bizarre proportions that the European Broadcasting Union felt forced to issue a regulation on loudness. This was the birth of the EBU R128 specification, first implemented in Germany in 2012. With this regulation, a new unit of measurement was introduced: LUFS (Loudness Units relative to Full Scale).

One LU (Loudness Unit) corresponds to a relative change of 1 dB. At the same time, a new upper limit for digital audio was defined: according to the EBU specification, a digital peak level of -1 dBTP (True Peak) should not be exceeded. This is the reason why Spotify and co. specify a True Peak limit of -1 dBFS for music files.

Tip: I recommend keeping this limit, especially if you do not adhere to the loudness specification of -14 LUFS, because at higher levels the normalization algorithm will definitely intervene. Spotify notes in this context that if you do not keep -1 dBTP as the limiter ceiling, sound artifacts may occur due to the normalization process.

This value is not carved in stone, as you will see later. Loudness units offer a particular advantage to the mastering engineer: simply put, LUFS lets us quantify how "loud" a song is and thereby compare different songs in terms of loudness. More on this later.

[Screenshot: T-RackS Stealth Limiter]

How can we tell whether our mix will be normalized by a streaming service?

The bad news is that the streaming services have quite different guidelines, so you basically have to look up the specifications of each individual service if you want to follow them. This can be quite a hassle, as there are more than fifty streaming and broadcasting platforms worldwide. As an example, here are the guidelines of some services with regard to ideal LUFS values (a small calculation sketch follows the list):

-11 LUFS Spotify Loud

-14 LUFS Amazon Alexa, Spotify Normal, Tidal, YouTube

-15 LUFS Deezer

-16 LUFS Apple, AES Streaming Service Recommendations

-18 LUFS Sony Entertainment

-23 LUFS EU R128 Broadcast

-24 LUFS US TV ATSC A/85 Broadcast

-27 LUFS Netflix

The good news is that there are various ways to compare your mix with the specifications of the most important streaming services at a glance. How much will your specific track be turned down by a given streaming service? You can check this on the following website: www.loudnesspenalty.com

[Screenshot: Loudness Penalty website]

Some DAWs, such as the latest version of Cubase Pro, also feature comprehensive LUFS metering. Alternatively, the industry offers various plug-ins that provide information about the LUFS loudness of a track. One suitable candidate is YOULEAN Loudness Meter 2, which is also available in a free version: https://youlean.co/youlean-loudness-meter/.

Another LUFS metering alternative is the Waves WLM Plus Loudness Meter, which is already fed with a wide range of customized presets for the most important platforms. 

[Screenshot: Waves WLM Plus Loudness Meter]

Metering

Using the Waves meter as an example, we will briefly go through the most important LUFS readings, because LUFS metering involves a lot more than just a naked number in front of the unit. When we talk about LUFS, it should be clear what exactly is meant: LUFS values are determined over a period of time, and depending on the length of that time span, the results can differ. The most important value is the LUFS Long Term display.

This is determined over the entire duration of a track and therefore represents an average value; to get an exact Long Term reading, we have to play the song once from beginning to end. Other LUFS meters (e.g. in Cubase Pro) refer to the Long Term value as LUFS Integrated. LUFS Long Term or Integrated is the value referred to in the streaming platforms' specifications. For "Spotify Normal" this means that if a track has a loudness of -12 LUFS Integrated, the Spotify algorithm will lower it by two dB to -14 LUFS.
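If you want to check the integrated value outside a DAW, the free third-party Python package pyloudnorm (together with soundfile) implements the underlying ITU-R BS.1770 measurement. A minimal sketch, assuming a stereo WAV export of your master (file name is a placeholder):

```python
import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("my_master.wav")        # placeholder file name
meter = pyln.Meter(rate)                     # BS.1770 meter with K-weighting
integrated = meter.integrated_loudness(data) # comparable to "LUFS Long Term / Integrated"

target = -14.0                               # e.g. Spotify Normal / YouTube
print(f"Integrated loudness: {integrated:.1f} LUFS")
print(f"Spotify Normal would adjust by {target - integrated:+.1f} dB")
```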

LUFS Short Term

The Waves WLM Plus plugin offers other LUFS indicators for evaluation, such as LUFS Short Term. LUFS Short Term is determined over a period of three seconds when the plugin measures according to the EBU standard. This is an important point, because depending on the ballistics, the measurement windows differ in length and can therefore lead to different results. A special feature of the Waves WLM Plus plugin is the built-in True Peak limiter. Many streaming platforms insist on a true peak limit of -1 dB (some even -2 dB). If you use the WLM Plus meter as the last plugin in the chain of your mastering software, the True Peak limit is guaranteed not to be exceeded when the limiter is activated.

Is the “Loudness War” finally over thanks to LUFS?  

As we have already learned, all streaming platforms define maximum values. If our master exceeds these specifications, it will automatically be made quieter. The supposedly logical conclusion: we no longer need loud masters. At least this is true for those who adhere to the specifications of the streaming platforms. But parts of the music industry have always been considered a place beyond all reason, where things like to run differently than logic dictates. The "LUFS dictate" is a fitting example of this.

The fact is: in practice, most professional mastering engineers care neither about LUFS nor about the specifications of the streaming services!

Weird, I know. However, the facts are clear, and the thesis can be proven with simple means. Remember that YouTube, just like Spotify, specifies a loudness of -14 LUFS and automatically plays louder tracks at a lower volume. So all professional mixes should take this into account, right? Conveniently, this can be checked without much effort. Open a recent music video on YouTube, right-click on the video and click on "Stats for nerds". The entry "content loudness" indicates by how many dB the audio track is lowered by the YouTube algorithm. Now things become interesting: for the AC/DC single "Shot in the Dark" this is 5.9 dB; Billy Talent's "I Beg To Differ" is even lowered by 8.6 dB.
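You can turn that readout back into an estimate of how loud those masters actually are. A tiny sketch, assuming YouTube's reference is -14 LUFS (the "content loudness" figures are the ones quoted above):

```python
YOUTUBE_REFERENCE_LUFS = -14.0  # assumed reference of YouTube's normalization

def implied_loudness(content_loudness_db: float) -> float:
    """A 'content loudness' of +5.9 dB means the track sits 5.9 dB above the reference."""
    return YOUTUBE_REFERENCE_LUFS + content_loudness_db

for title, stats_value in [("Shot in the Dark", 5.9), ("I Beg To Differ", 8.6)]:
    print(f"{title}: roughly {implied_loudness(stats_value):.1f} LUFS integrated")
# ~-8.1 LUFS and ~-5.4 LUFS -- far louder than the -14 LUFS guideline.
```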

Amazing, isn’t it?  

Obviously, hardly anyone seems to adhere to the specifications of the streaming platforms. Why is that? 

There are several reasons. The loudness specifications differ from platform to platform; if you took them seriously, you would have to create a separate master for each one. This would result in a whole series of different-sounding tracks, for the following reason: mastering equipment (whether analog or digital) does not work linearly across the entire dynamic spectrum.

Example:

The sound of the mix/master changes if you have to squeeze 3 dB more gain reduction out of the limiter for one platform than for another. If you then normalize all master files to an identical average value, the sound differences become audible due to the different dynamics processing. The differences are sometimes bigger and sometimes smaller, depending on the processing you have applied.

Another reason for questioning the loudness specifications is the inconsistency of the streaming platforms themselves. Take Spotify, for example: did you know that Spotify's normalization algorithm is not active when music is played via the web player or a third-party app? From the Spotify FAQs:

[Screenshot: Spotify for Artists FAQ]
The Metal Mix

This means that if you deliver a metal mix at -14 LUFS and it is played back via Spotify in a third-party app, the mix will simply sound too weak compared to other productions. And there are further imponderables in the streaming universe. Spotify allows its premium users to choose from three different normalization settings, whose standards also differ. For example, the platform recommends a default of -11 LUFS and a True Peak value of -2 dBTP for the "Spotify Loud" setting, while "Spotify Normal" is specified at -14 LUFS and -1 dBTP. Also from the Spotify FAQs:

[Screenshot: Spotify for Artists FAQ on normalization settings]

For mastering engineers, this is a questionable state of affairs. Mastering for streaming platforms is like trying to hit a constantly moving target at varying distances with a precision rifle. Even more serious, however, is the following consideration: what happens if one or more streaming platforms raise, lower, or even eliminate their loudness thresholds in the future? There is no guarantee that the current specifications will still be valid tomorrow. Unlikely? Not at all! YouTube introduced its normalization algorithm in December 2015; uploads prior to that date may sound louder if they were mastered louder than -14 LUFS. Even after 2015, YouTube's default has not remained constant. From 2016 to 2019, the typical YouTube normalization was -13 dB and did not refer to LUFS; only since 2019 has YouTube been using -14 LUFS as its default.

Why loudness is not expressed in numbers alone

If you look at the loudness statistics of some YouTube videos and listen to them very carefully at the same time, you might have made an unusual observation. Some videos sound louder even though their loudness statistics indicate that they are nominally quieter than other videos. How can this be? There is a difference between measured loudness in LUFS and perceived loudness. Indeed, it is the latter that determines how loud we perceive a song to be, not the LUFS specification. But how do you create such a lasting loudness impression?

Many elements have to work together for us to perceive a song as loud (perceived loudness): stereo width, tonal balance, song arrangement, saturation, dynamics manipulation – just to name a few pieces of the puzzle. The song must also be well composed and performed, the recording must be top-notch and the mix professional. The icing on the cake is a first-class master. If all these things come together, the song is denser and more forward and, despite moderate use of the mastering limiter, simply sounds louder than a mediocre song with a weaker mix and master, even if the integrated LUFS readings suggest otherwise. An essential aspect of the mastering process is professional dynamics management; dynamics are an integral part of the arrangement and mix from the very beginning.

In mastering, we try to emphasize dynamics further without destroying them, because one thing is always inherent in the mastering process: a limitation of dynamics. How well this manipulation of dynamics is done is what separates good mastering from bad mastering, and a good mix with a professional master always sounds fatter and louder than a bad mix with a master that is merely trimmed for loudness.

Choose your tools wisely!

High-quality equalizers and compressors like the combination of the elysia xfilter and the elysia xpressor provide a perfect basis for a more assertive mix and a convincing master. Quality compression preserves the naturalness of the transients, which automatically makes the mix appear louder. Are you missing punch and pressure in your song? High-quality analog compressors deliver impressive results and are more beneficial to the sound of a track than relying solely on digital peak limiting.

Are you losing audible details in the mixing and mastering stage? Bring them back to light with the elysia museq! The number of playback devices has grown exponentially in recent years, which doesn't exactly make the art of mastering easier.

Besides the classic hi-fi system, laptops, smartphones, Bluetooth speakers and all kinds of headphones are fighting for the listener's attention in everyday life. Good analog EQs and compressors can help to adjust the tonal focus for these devices as well. Analog processing also preserves the natural dynamics of a track much better than endless rows of plug-ins, which often turn out to slow down the workflow. But "analog" can offer even more for your mixing & mastering project: analog saturation is an additional way to increase the perceived loudness of a mix and to noticeably improve audibility, especially on small monitoring systems like a laptop or a Bluetooth speaker.

Saturation and Coloration

The elysia karacter provides a wide range of tonal coloration and saturation that you can use to make a mix sound denser and more assertive. Competitive mastering benefits sustainably from the use of selected analog hardware: the workflow is accelerated, and you can make the necessary mix decisions quickly and accurately. For this reason, high-quality analog technology enjoys great popularity, especially in high-end mastering studios. karacter is available as a 1 RU 19" version, as the karacter 500 module, and in our new, super-handy qube series as the karacter qube.

Mastering Recommendations for 2021

As you can see, the considerations related to mastering for streaming platforms are anything but trivial. Some people’s heads may be spinning because of the numerous variables. In addition, there is still the question of how to master your tracks in 2021. 

The answer is obvious: create your master in a way that serves the song. Some styles of music (jazz, classical) require much more dynamics than others (heavy metal, hip-hop); the latter can certainly benefit from distortion, saturation, and clipping as stylistic elements. What sounds great is allowed. The supreme authority for a successful master is always the sound. If the song calls for a loud master, it is legitimate to bring in the appropriate tools. The limit of loudness maximization is reached when the sound quality suffers. Even in 2021, the master should sound better than the mix. The use of compression and limiting should always serve the result and not be driven by the LUFS specifications of various streaming services. Loudness is a conscious artistic decision and should not degenerate into an attempt to hit certain LUFS targets.

And the specifications of the streaming services? 

How many LUFS should I master to?

There is only one valid reason to master a song to -14 LUFS: the value of -14 LUFS is just right if the song sounds better that way than at -13 or -15 LUFS!

I hope you were able to take some valuable information from this blog post and it will help you take your mix and personal master for digital streaming services to the next level. 

I would be happy about a lively exchange. Feel free to share and leave a comment or if you have any further questions, I’ll be happy to answer them of course.

Yours, Ruben Tilgner 

-18dBFS is the new 0dBu

Estimated reading time: 18 minutes


Gain staging and the integration of analog hardware in modern DAW systems


Introduction

-18 dBFS is the new 0 dBu: in practice, however, even experienced engineers often have only an approximate idea of what "correct" levels are. Like the offside rule in soccer, a successful level balance is simple and complex at the same time, especially when the digital and analog worlds are supposed to work together as equals. This blog post offers concrete tips for confident headroom management and for integrating analog hardware into a digital production environment (DAW system) in a meaningful way.

Digital vs. Analog Hardware

The good news is that you don't have to choose one or the other. In modern music production, we need both worlds, and with a bit of know-how, the whole thing works surprisingly well. But the fact is: on the one hand, digital live consoles and recording systems are becoming more and more compact in terms of form factor; on the other hand, the number of inputs and outputs and the maximum track counts keep increasing. This massive number of input signals and tracks makes it all the more important to always find suitable level ratios.

Let’s start at the source and ask the simple question, “Why do you actually need a mic preamp?”

The answer is as simple as it is clear: we need a mic preamp to turn a microphone signal into a line signal. A mixer, audio interface, or DAW always operates at line level, not microphone level, and the same applies to all other audio connections, such as insert points or audio outputs. But how far do we actually need to turn up the microphone preamp, and is there one "correct" level? There is no universally valid constant, but there is a thoroughly sensible recommendation that has proven itself in practice: level all input signals to line level with the help of the microphone preamplifier. Line level is the sweet spot for audio systems.

But what exactly is line level, and where can it be read?

Now we're at a point where it gets a little more complicated. To define line level, a reference level is used, and this differs depending on which standard is taken as the basis. The reference level for professional audio equipment according to the German broadcast standard is +6 dBu (1.55 Vrms, -9 dBFS); dBu itself is referenced to 0 dBu = 0.775 V (RMS). In the USA, the analog reference level of +4 dBu, corresponding to 1.228 V (RMS), is used. Also relevant in audio technology are the reference level of 0 dBV, corresponding to exactly 1 V (RMS), and the consumer level (USA) of -10 dBV, corresponding to 0.3162 V (RMS). Got it? We'll focus on the +4 dBu reference level in this blog post, simply because most professional audio equipment relies on this reference for line level.
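All of these reference levels follow the same logarithmic relationship. A small sketch that reproduces the voltages quoted above:

```python
def dbu_to_volts(dbu: float) -> float:
    """dBu is referenced to 0.775 Vrms."""
    return 0.775 * 10 ** (dbu / 20)

def dbv_to_volts(dbv: float) -> float:
    """dBV is referenced to 1.0 Vrms."""
    return 1.0 * 10 ** (dbv / 20)

print(f"+6 dBu  = {dbu_to_volts(6):.3f} Vrms   (German broadcast reference)")
print(f"+4 dBu  = {dbu_to_volts(4):.3f} Vrms   (US professional line level)")
print(f" 0 dBu  = {dbu_to_volts(0):.3f} Vrms")
print(f"-10 dBV = {dbv_to_volts(-10):.4f} Vrms (consumer level)")
```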

dBu & dBV vs. dBFS

What is +4dBu and what does it mean?

Level ratios in audio systems are expressed in the logarithmic ratio decibel (dB). It is important to understand that there is a difference between digital and analog mixers in terms of dB metering. Anyone who has switched from an analog to a digital mixer for the first time (or vice versa) has experienced this: the usual level settings suddenly don't fit anymore. Why is that? The simple explanation: analog mixers almost always use 0 dBu (0.775 V) as the reference point, while their digital counterparts use the standard set by the European Broadcasting Union (EBU) for digital audio levels. According to the EBU, the old analog 0 dBu should now correspond to -18 dBFS (full scale). Digital console and DAW users, take note: -18 dBFS is our new 0 dBu!

This sounds simple, but unfortunately it's not quite that easy, because dBu values cannot be unambiguously converted to dBFS; it varies from device to device which analog voltage leads to a certain digital level. Many professional studio devices are specified with a nominal output level of +4 dBu, while consumer devices tend to fall back on the dBV scale (-10 dBV). As if that were not enough confusion, there are also massive differences in terms of headroom. With analog equipment, there is still plenty of headroom available when a VU meter hovers around the 0 dB mark; often another 20 dB is available before analog soft clipping signals the end of the line. The digital domain is much more uncompromising at this point: levels beyond the 0 dBFS mark produce hard clipping, which sounds unpleasant on the one hand and represents a fixed upper limit on the other. The level simply does not get any louder.

We keep in mind: The analog world works with dBu & dBV indications, while dBFS describes the level ratios in the digital domain. Accordingly, the meter displays on an analog mixing console are also different compared to a digital console or DAW.

Analog meter indicators are referenced to dBu. If the meter shows 0 dB, this equals +4 dBu at the mixer output and we enjoy ample headroom. A digital meter is usually scaled over the range of -80 to 0 dBFS, with 0 dBFS representing the clipping limit. To make the comparison, let's recall: 0 dBu (analog) = -18 dBFS (digital). This is true for many digital devices, such as Yamaha digital mixers, but not all; Pro Tools, for example, works with a reference level of 0 dBu = -20 dBFS. We often find this difference when comparing European and US equipment. The good news is that we can live very well with this difference in practice: two dB is not what matters in the search for the perfect level.
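Because the dBu-to-dBFS mapping is only a fixed calibration offset, the conversion is trivial once you know which standard your converter follows. A sketch using the two offsets mentioned above:

```python
CALIBRATION = {
    "EBU / many digital desks": -18,  # 0 dBu corresponds to -18 dBFS
    "Pro Tools (US style)":     -20,  # 0 dBu corresponds to -20 dBFS
}

def dbu_to_dbfs(dbu: float, zero_dbu_in_dbfs: float) -> float:
    """Convert an analog dBu level to dBFS for a given converter calibration."""
    return dbu + zero_dbu_in_dbfs

for name, offset in CALIBRATION.items():
    print(f"{name}: +4 dBu analog line level -> {dbu_to_dbfs(4, offset):+.0f} dBFS")
# -14 dBFS vs. -16 dBFS: the two-dB difference you can comfortably live with.
```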

Floating Point

But why do we need to worry about level ratios in a DAW at all? Almost all modern DAWs work with floating-point arithmetic, which gives the user practically unlimited headroom and dynamic range (theoretically around 1500dB). The internal dynamics are so great that clipping cannot occur inside the mix engine. The common wisdom on this subject is therefore: “You can do whatever you want with your levels in a floating-point DAW, you just mustn’t overdrive the sum output.” Theoretically true, but practically problematic for two reasons. First, there are plug-ins (often emulations of classic studio hardware) that don’t like it at all if you feed their input with extremely high levels.

This degrades the signal audibly. Very high levels have a second undesirable side effect: they make it virtually impossible to use analog audio hardware as an insert. Most common DAWs work with a 32-bit floating-point audio engine, so clipping can only occur on the way into the DAW (e.g. an overdriven mic preamp) or on the way out of the DAW (an overdriven DA converter on the sum output). This happens faster than you think. Example: anyone who works with commercial loops knows the problem. Finished loops are often normalized, so their loudest parts quickly hit the 0dBFS mark on your peak meter. If we play several loops simultaneously and two of them reach 0dBFS at the same moment, we already have clipping on the master bus. So you should avoid excessively high levels in a DAW.
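To see why two normalized loops are already enough to clip the sum, here is a small numpy sketch. The “loops” are just hypothetical sine bursts; the point is the arithmetic on the master bus:

```python
import numpy as np

sr = 44100
t = np.arange(int(0.01 * sr)) / sr

# Two "loops", each normalized so that their loudest sample sits exactly at 0 dBFS
loop_a = np.sin(2 * np.pi * 100 * t)
loop_a /= np.max(np.abs(loop_a))
loop_b = loop_a.copy()                      # worst case: the peaks line up

mix = loop_a + loop_b                       # the 32-bit float bus itself does not clip
print("peak on the float bus:", round(20 * np.log10(np.max(np.abs(mix))), 1), "dBFS")  # ~ +6.0

# ...but on the way out through a fixed-point converter path, everything
# above 0 dBFS is simply chopped off:
clipped = np.clip(mix, -1.0, 1.0)
print("hard-clipped samples on output:", int(np.sum(np.abs(mix) > 1.0)))
```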

Noise Generator

We’ve talked about clipping and headroom so far, but what about the other side of the coin? How do analog and digital audio systems handle very low levels? In the analog world, the facts are clear: the lower our signal level, the closer our useful signal gets to the noise floor, and our signal-to-noise ratio suffers. Low-level signals step into the ring with the noise floor, and that fight never ends without collateral damage to the sound quality. Therefore, in an analog environment, we must always pay attention to solid levels and high-quality equipment with the best possible internal signal-to-noise ratio. This is the only way to guarantee that in critical applications (e.g. classical recordings, or music with very high dynamics) the analog recording is as noise-free as possible.

And digital?

In the digital domain, the noise floor is far less of a threat. A 24-bit recording offers a theoretical dynamic range of around 144dB (more on that figure below), so a signal peaking well below 0dBFS still sits comfortably above the converter’s noise floor. In other words: in a DAW, generous headroom costs us practically nothing, which is why conservative levels are the safer choice.

Fader position as a part of Gain Staging

Another often overlooked detail on the way to a solid gain structure is the position of the faders. First of all, it doesn’t matter whether we’re working with an analog mixer, a digital mixer, or a DAW. Faders have a resolution, and this is not linear.

The resolution around the 0dB mark is much higher than in the lower part of the fader travel. To mix as sensitively as possible, the fader should sit near the 0dB mark. If we create a new project in a DAW, the faders are in the 0dB position by default; this is how most DAWs handle it. Now we can finally turn up the mic preamps and set the appropriate recording level. We recommend leveling all signals in the digital domain to -18dBFS RMS / -9dBFS peak, in other words to the line level we invoked at the beginning, because that is what digital mixers and DAWs are designed for. Since we keep the channel faders close to the 0dB mark, the question now is: how do I lower signals that are too loud in the mix?
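Before we answer that question, here is a quick way to check whether a track actually sits near the -18dBFS RMS / -9dBFS peak target. A rough Python sketch, assuming the audio is a float buffer scaled to ±1.0 (the sine is just a stand-in for a real track):

```python
import numpy as np

def peak_dbfs(signal: np.ndarray) -> float:
    return 20 * np.log10(np.max(np.abs(signal)) + 1e-12)

def rms_dbfs(signal: np.ndarray) -> float:
    return 20 * np.log10(np.sqrt(np.mean(signal ** 2)) + 1e-12)

def trim_to_target(signal: np.ndarray, target_rms: float = -18.0) -> float:
    """Gain in dB that would move the signal's RMS level onto the target."""
    return target_rms - rms_dbfs(signal)

# Example: a 1 kHz sine with a -6 dBFS peak sits at about -9 dBFS RMS
sr = 44100
t = np.arange(sr) / sr
signal = 0.5 * np.sin(2 * np.pi * 1000 * t)

print(f"peak {peak_dbfs(signal):.1f} dBFS | RMS {rms_dbfs(signal):.1f} dBFS")
print(f"suggested trim: {trim_to_target(signal):+.1f} dB")   # about -9 dB
```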

There are several ways to do this, and many of them are simply not recommended. For example, you could turn down the gain of the mic preamp. But then we’re no longer feeding line level to the DAW. With an analog mixer, this results in a poor signal-to-noise ratio. A digital mixer with the same approach has the problem that all sends (e.g. monitor mixes for the musicians, insert points) also leave the line-level sweet spot. OK, let’s just pull down the channel fader! But then we leave the area with the best resolution, where we can adjust levels most sensitively. In the studio this may “only” be uncomfortable, but at a large live event with a PA to match, it quickly becomes a real problem.

This is where working in the fader sweet spot is essential. Making the lead vocal exactly two dB louder via the fader is almost impossible if the fader sits at, let’s say, -50dB. Moving it up just a few millimeters quickly takes us to -40dB, which is an enormous jump in volume. The solution to this problem: we prefer to use audio subgroups for the rough volume balance. If these are not available, we fall back on DCA or VCA groups. The input signals are assigned to the subgroups (or DCAs or VCAs) accordingly, for example one group for drums, one for cymbals, one for vocals and one each for guitars, keyboards and bass. With the help of the groups you set a rough balance between the instruments and vocals and use the channel faders only for small volume corrections.

Special tip: it makes sense to route effect returns to the corresponding groups instead of to the master: the drum reverb to the drum group, the vocal reverb to the vocal group. If you have to correct the group volume, the effect share is automatically pulled along and the ratio between dry signal and effect always stays the same.

Gain Staging in the DAW – the hunt for line level


As a first step, we need to clear up a misunderstanding. “Gain” and “volume” are not members of the same family; adjusting gain is not the same as adjusting volume. Put simply, volume is the level after processing, while gain is the level before processing. Or even simpler: gain is input level, volume is output level!
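The difference only really shows once something level-dependent sits in between. A minimal sketch with a hypothetical soft-clipping stage standing in for any saturating processor (not a model of any specific plug-in):

```python
import numpy as np

def saturating_stage(x: np.ndarray) -> np.ndarray:
    """Stand-in for any level-dependent processor (saturation, compression, ...)."""
    return np.tanh(x)

signal = 0.9 * np.sin(np.linspace(0, 2 * np.pi, 1000))

# Gain: raise the level BEFORE the stage -> the processing itself changes
hot_gain   = saturating_stage(2.0 * signal)    # driven harder, noticeably more saturation
# Volume: raise the level AFTER the stage -> same processing, just played back louder
hot_volume = 2.0 * saturating_stage(signal)

print("peak after +6 dB of gain  :", round(np.max(np.abs(hot_gain)), 2))    # ~0.95, squashed
print("peak after +6 dB of volume:", round(np.max(np.abs(hot_volume)), 2))  # ~1.43, character unchanged
```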

The next important step for clean gain staging is to determine what kind of meter display my digital mixer or DAW is even working with. Where exactly is line level on my meter display?

Many digital consoles and DAWs have hybrid metering, like the metering in Studio One V5, which we’ll use as an example. The channel scaling runs from -72dB to +10dB, and the sum output from -80dB to +6dB.

In terms of its scaling, Studio One’s metering sits somewhere between an analog dBu meter and a digital dBFS meter, and it is similar in many DAWs. It is important to know whether the meter shows an RMS (average) or peak level. If we only have peak metering and level to line level (-18dBFS peak), the recorded level ends up too low, especially for very dynamic source material with fast transients such as a snare drum. The greater the dynamic range of a track, the higher the peak values and the lower the average value. Drum tracks can therefore quickly light up the clip indicator of a peak meter while producing comparatively little deflection on an RMS meter.
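You can see this divergence with two synthetic test signals: a steady pad-like tone and a short decaying noise burst standing in for a snare hit (both purely illustrative):

```python
import numpy as np

def peak_dbfs(x: np.ndarray) -> float:
    return 20 * np.log10(np.max(np.abs(x)) + 1e-12)

def rms_dbfs(x: np.ndarray) -> float:
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)

sr = 44100
t = np.arange(int(0.3 * sr)) / sr

pad = 0.5 * np.sin(2 * np.pi * 220 * t)                  # steady tone, peak at -6 dBFS

burst = np.random.randn(len(t)) * np.exp(-t / 0.03)      # short decaying noise burst
burst *= 0.5 / np.max(np.abs(burst))                     # same -6 dBFS peak as the pad

for name, sig in [("steady pad", pad), ("snare-like burst", burst)]:
    print(f"{name:17s} peak {peak_dbfs(sig):6.1f} dBFS | RMS {rms_dbfs(sig):6.1f} dBFS")
# Same peak reading on both, but the burst's RMS lands a good 20 dB lower.
```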

In Studio One, however, we get all the information we need. The blue Studio One meter shows the peak level, while the white line in the display always shows the RMS average level. Also important is where the metering is tapped (the tap point). For leveling, the metering should show the pre-fader level ratios, especially if you have already inserted plug-ins or analog devices into the channel, since these can significantly influence the post-fader metering.


Keyword: Plugins

You need to drive digital emulations with a suitable level. There are still some fixed-point plug-ins and emulations of old hardware classics on the market that don’t like high input levels. It is sometimes hard to tell which metering the plug-ins themselves use and where line level sits on their scale. A screenshot illustrates the dilemma.


The BSS DRP402 compressor clearly uses a dBFS meter, so its line-level reference sits at -20dBFS on that scale. The bx townhouse compressor in the screenshot is fed with the same input signal as the BSS DRP402 but shows completely different metering.

Since it is an analog emulation, you may assume that its meter display behaves more like a VU meter.

Fire Department Operation

It’s not uncommon to find yourself in the studio with recordings that just need to be mixed. Experienced sound engineers will agree with me: many recordings by less experienced musicians or junior technicians are simply leveled too hot. So what can you do to bring them back down to a reasonable level? Digitally, this is not a big problem, at least as long as the tracks are free of digital clipping. Turning the tracks down doesn’t change the sound, and we don’t have to worry about noise floor problems at the digital level either. In any DAW, you can simply reduce the waveform amplitude (clip gain) to the desired level.


Alternatively, every DAW offers a Trim plug-in that you can place in the first insert slot to lower the level there.

The same plug-in can also be used on busses or on the master if the summed tracks prove to be too loud. We do not use the virtual faders of the DAW mixer for this task, because they sit post-insert and, as we already know, only change the volume, not the gain of the track.

Analog gear in combination with a DAW

The combination of analog audio gear and a DAW has a special charm. The fast, hands-on access and the independent sound of analog processors make up the appeal of a hybrid setup. You can use analog gear as a front end (mic preamps) or as insert effects (e.g. dynamics). If you want to connect an external preamp to your audio interface, use a line input to bypass the interface’s built-in mic preamp.

In insert mode, we have to accept an additional AD/DA conversion to get purely analog gear into the DAW, so the quality of the AD/DA converters matters. The full 24-bit word length corresponds to a theoretical dynamic range of 144dB, which exceeds what even a high-end converter can actually deliver. Therefore, drive your analog insert gear at line level so the converters keep enough headroom, especially if you plan to boost the signal with the analog gear.

Boosting simply requires headroom. If, on the other hand, you only make subtractive EQ moves, you can also work with somewhat higher send and return levels. Now we only need to adjust the level ratios for insert operation, and several things deserve our attention.

It depends on the entire signal chain

The level ratios inside a DAW are consistent and always easy to follow. When integrating analog gear, however, we have to look at the entire signal flow and sometimes readjust it. We start with the send level from the DAW. Again, I recommend sending the signal at line level to an output of the audio interface.

The next step requires a small amount of detective work. In the technical specifications of the audio interface, we look up the reference level of the outputs and bring it in line with the input of the analog gear we want to loop into the DAW. If the interface has balanced XLR outputs, we connect them to balanced XLR inputs of the analog insert unit. But what do we do with unbalanced devices that have a reference level of -10dBV? Many audio interfaces offer a switch for their line inputs and outputs from +4dBu to -10dBV, which you should use in this case. The technical specifications also tell you which analog level corresponds to 0dBFS, and on some interfaces this calibration is switchable as well.

On an RME Fireface 802, for example, you can switch between +19dBu, +13dBu and +2dBV. It is important to know that many elysia products can handle a maximum level of about +20dBu. This consideration applies to the entire signal chain, from the interface output into the analog device and from its output back into the interface. Ideally, a line-level send signal makes its way back into the DAW at an identical return level. In addition, keep an eye on the analog unit itself: make sure that neither its input nor its output distorts, because these distortions would otherwise be passed on to the DAW unadulterated.
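With the calibration options and the roughly +20dBu ceiling mentioned above, the insert-path arithmetic is easy to sketch. Everything beyond those figures is an illustrative assumption (the +2dBV setting corresponds to about +4.2dBu, since 0dBV is roughly +2.2dBu):

```python
def dbfs_to_dbu(level_dbfs: float, dbu_at_0dbfs: float) -> float:
    """Analog level that a given digital level produces for a known converter calibration."""
    return level_dbfs + dbu_at_0dbfs

SEND_DBFS = -18.0          # line-level send from the DAW
ANALOG_CEILING_DBU = 20.0  # approximate maximum level of the analog insert unit

for name, dbu_at_0dbfs in [("+19 dBu", 19.0), ("+13 dBu", 13.0), ("+2 dBV (~+4.2 dBu)", 4.2)]:
    send_dbu = dbfs_to_dbu(SEND_DBFS, dbu_at_0dbfs)
    margin = ANALOG_CEILING_DBU - send_dbu
    print(f"reference {name:18s}: -18 dBFS arrives at {send_dbu:+5.1f} dBu, "
          f"{margin:4.1f} dB below the analog ceiling")
# Remember that the return converter clips at its own reference level, so any analog
# boost also has to fit within that margin on the way back into the DAW.
```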

elysia qube series

How its insert levels behave also depends a bit on the type of analog gear. An ordinary EQ that moderately boosts or cuts frequencies is less critical than a transient shaper (such as the elysia nvelope), which, depending on the setting, can generate peaks that RMS metering can hardly detect. In the worst case, this creates distortion that you can hear but, without peak metering, cannot see. Another classic operating mistake is setting a compressor’s make-up gain too high.

In the worst case, both the output of the compressor itself and the return input of the sound card are overdriven. The levels at all four points of an insert (input and output of the analog device plus output and input of the interface) should be watched closely. Fortunately, we are not on our own: generic DAW on-board tools help with insert operation, and we will look at them to wrap up.

Use Insert Plugins!

When integrating analog hardware, you should definitely use the insert plugins that almost every DAW provides. Reaper features the “ReaInsert” plugin, ProTools comes with “Insert” and Studio One provides the “Pipeline XT” plugin. The wiring for this application is quite simple.

We connect a line output of our audio interface to the input of our hardware, and the output of our hardware to a free line input of our interface. Then we select that output and input of our interface as send and return in the insert plugin (see the Pipeline XT screenshot), and the connection is established.

A classic send & return connection. Depending on the buffer size setting, the AD/DA conversion causes a larger or smaller propagation delay, which can become problematic, especially when we process signals in parallel. What does this mean? Let’s say we split our snare drum onto two channels in the DAW. The first channel stays in the DAW and is only treated with a latency-free gate plugin; the second channel goes out of the DAW via Pipeline XT, into an elysia mpressor and from there back into the DAW.

Due to the AD/DA conversion, the second snare track is time-delayed compared to the first. For both snare tracks to play back in time alignment, we need latency compensation. You could do this manually by nudging the first snare track, or you could simply click the “Auto” button in Pipeline XT for automatic latency compensation, which is much faster and more precise. The automatic delay compensation ensures that our insert signal stays phase-coherent with the other tracks of the project. With this tool, you can also easily adjust the level of the external hardware: if distortion occurs, lower the send level and make up for it with the return level.
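If your DAW offers no such auto function for hardware inserts, manual compensation boils down to measuring the round trip and shifting the returned audio back by that amount. A bare-bones numpy sketch, not how Pipeline XT works internally; the 3.2ms round trip is just an assumed measurement:

```python
import numpy as np

def compensate(returned: np.ndarray, roundtrip_samples: int) -> np.ndarray:
    """Shift the returned insert signal earlier by the measured round-trip latency."""
    return returned[roundtrip_samples:]

sr = 48000
roundtrip_ms = 3.2                                   # e.g. measured with a loopback click
roundtrip_samples = round(roundtrip_ms / 1000 * sr)  # 154 samples at 48 kHz

dry = np.zeros(sr); dry[1000] = 1.0                        # click on the in-the-box snare track
wet = np.zeros(sr); wet[1000 + roundtrip_samples] = 1.0    # the same click after the AD/DA trip

aligned = compensate(wet, roundtrip_samples)
print("dry click at sample", int(np.argmax(dry)),
      "| compensated insert click at sample", int(np.argmax(aligned)))   # both at 1000
```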

That was the last tip in this blog post. The question of the correct level should now be settled, along with the relevant side issues that have a significant impact on gain structure and a hybrid way of working. For all the theory and number mysticism: it does not come down to dB-exact adjustments. Sticking roughly to the recommendations is quite sufficient and guarantees reasonable levels that will make your mixing work much easier and faster. Happy mixing!

Here’s a great Video from RME Audio about Matching Analog and Digital Levels.


Feel free to discuss, leave a comment below or share this blog post in your social media channels.

Yours, Ruben

How to deal with audio latency

Estimated reading time: 10 minutes

How to deal with latency in audio productions


Increased signal propagation time and annoying latency are uninvited permanent guests in every recording studio and at live events. This blog post shows you how to avoid audio latency problems and optimize your workflow.

As you surely know, the name elysia is a synonym for the finest analog audio hardware. As musicians, however, we also know and appreciate the advantages of modern digital audio technology. Mix scenes and DAW projects can be saved, total recall is a given, and monstrous copper multicores are replaced by slim network cables. A maximally flexible signal flow via network protocols such as DANTE and AVB allows complex systems to be set up easily. So digital audio makes everything better? That would be nice, but reality paints a more ambivalent picture. If you look and listen closely, the digital domain sometimes causes problems that simply do not exist in the analog world. Want an example?

From the depths of the bits and bytes rises a merciless adversary that will sabotage your recordings or live gigs with plenty of phase and comb-filter problems. But with the right settings, you are not powerless against the annoying latency in digital audio systems.

What is audio latency, and why doesn’t it occur in analog setups?

Latency occurs with every digital conversion (AD or DA) and is noticeable in audio systems as signal propagation time. In the analog domain, the situation is clear: the propagation time from the input to the output of an analog mixer is practically zero.

Latency only existed in the MIDI domain, where external synths or samplers were integrated via MIDI. In practice, this was not a problem, since the entire monitoring path remained analog and thus no latency was audible. With digital mixing consoles or audio interfaces, on the other hand, there is always a delay between input and output.

Latency can have several causes, for example the different signal propagation times of different converter types. Depending on type and design, a converter needs more or less time to process the audio signal. For this reason, mixing consoles and recording interfaces always use identical converter types within the same modules (e.g. input channels), so that the modules share the same signal propagation time. And as we will see, latency is not a fixed quantity within a digital mixer or recording setup.

Signal propagation time and round trip latency

Latency in digital audio systems is specified either in samples or in milliseconds. A DAW with a buffer size of 512 samples generates a delay of at least 11.6 milliseconds (0.0116s) if we work with a sampling rate of 44.1kHz. The calculation is simple: we divide 512 samples by 44.1 samples per millisecond (44,100 samples per second) and get 11.6 milliseconds (1ms = 1/1000s).

If we work with a higher sample rate, the latency decreases: running our DAW at 96kHz instead of 44.1kHz roughly halves the latency at the same buffer size. The higher the sample rate, the lower the latency. Doesn’t it then make sense to always work at the highest possible sample rate to elegantly work around latency problems? Clear answer: no! Running audio systems at 96 or even 192kHz is a big challenge for the computer’s CPU. The higher sample rate quickly makes the CPU break out in a sweat, which is why a very potent CPU is imperative for a high channel count. This is one reason why many entry-level audio interfaces only work at sample rates of 44.1 or 48kHz.
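For a feel for the numbers, here is the buffer-size arithmetic from above as a tiny sketch. Keep in mind that the real round-trip latency of an interface is always higher than this single-buffer figure, because converters, driver and safety buffers add their share:

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Delay contributed by one buffer, in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000

for sample_rate in (44100, 48000, 96000):
    for buffer in (512, 128, 64):
        print(f"{buffer:4d} samples @ {sample_rate} Hz -> "
              f"{buffer_latency_ms(buffer, sample_rate):5.2f} ms")
```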

Typically, mixer latency refers to the time it takes for a signal to travel from an analog input channel to the analog summing output. This figure is also called “RTL”, short for “Round Trip Latency”. The actual RTL of an audio interface depends on many factors: the type of interface (USB, Thunderbolt, AVB or DANTE), the performance of the recording computer, the operating system used, the settings of the sound card/audio interface and those of the recording project (sample rate, number of audio and MIDI tracks, plugin load) and the signal delays of the converters used. It is therefore not easy to compare the real latency performance of different audio interfaces.

It depends on the individual case!

A high total latency in a DAW does not necessarily have to be a problem; a lot depends on your workflow. Even with the buffer size of 512 samples from our initial example, we can record without any problems as long as the DAW only plays back the backing tracks to which we record overdubs. Latency does not play a role here. In the studio, it only becomes critical if the DAW is also used to feed headphone mixes, or if you want to play VST instruments or VST guitar plug-ins and record them to the hard disk. In that case, too high a latency makes itself felt as a delayed headphone mix and an indirect playing feel.

If that is the case, you will have to adjust the latency of your DAW downwards. There is no rule of thumb as to when latency has a negative effect on the playing feel or the listening situation. Every musician reacts individually. Some can cope with an offset of ten milliseconds, while others already feel uncomfortable at 3 or 4 milliseconds.

The Trip

Sound travels 343 meters (1125ft) in one second, which corresponds to 34.3 centimeters (about 1.1ft) per millisecond. The ten milliseconds mentioned above therefore correspond to a distance of 3.43 meters (11.25ft). Do you still remember your last club gig? You’re standing at the edge of the stage rocking out with your guitar, while the guitar amp is enthroned three to four meters (10 – 13ft) behind you. That corresponds to a signal delay of 10-12ms. So for most users, a buffer size between 64 and 128 samples should be low enough to play VST instruments or create headphone mixes directly in the DAW.
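If you like this mental model, the same 343m/s figure turns any latency value into an intuitive “distance from the amp”. A tiny helper, purely for illustration:

```python
SPEED_OF_SOUND_M_PER_S = 343.0

def latency_as_distance_m(latency_ms: float) -> float:
    """Distance sound travels in air during the given delay."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000

for ms in (3, 10, 11.6):
    print(f"{ms:5.1f} ms  ~  {latency_as_distance_m(ms):.2f} m of 'amp distance'")
```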

That holds unless you’re using plug-ins that cause high latency themselves! Most modern DAWs have automatic latency compensation that delays all channels and busses to match the plug-in with the longest runtime. The advantage is that all channels and busses stay phase-coherent, so there are no audio artifacts (comb-filter effects); the disadvantage is the higher overall latency.

Some plug-ins, such as convolution reverbs or linear-phase EQs, have significantly higher latencies. If these sit in the monitoring path, the delay is immediately audible even with a small buffer size. Not all DAWs show plug-in latencies, and plug-in manufacturers tend to keep a low profile on this point.

First Aid

We have already learned about two methods of dealing directly with annoying latency. Another is direct hardware monitoring, which many audio interfaces provide.

RME audio interfaces, for example, come with the TotalMix software, which allows low-latency monitoring with on-board tools, depending on the interface even with EQ, dynamics and reverb. Instead of monitoring via the DAW or the interface’s monitoring hardware, you can alternatively send the DAW project sum or stems into an analog mixer and monitor the recording mic together with the DAW signals in the analog domain with practically zero latency. If you are working exclusively in the DAW, it helps to increase the sample rate and/or decrease the buffer size. Both of these put a significant load on the computer’s CPU.

Depending on the size of the DAW project and the installed CPU, this can lead to bottlenecks. If no computer with more processing power is available, it can help to replace CPU-hungry plug-ins in the DAW project or to bypass them. Alternatively, you can render plug-in processing to audio files or freeze tracks.

Buffer size options: the buffer size essentially determines the latency of a DAW.
Track rendering in the DAW: almost every DAW offers a function to render plug-in-heavy tracks and reduce the load on the CPU.

Good old days

Do modern problems require modern solutions? Sometimes a look back can help.

It is not always advantageous to record everything flat and without processing; it merely postpones the decision about how a recorded track should finally sound. Why not commit to a sound, as in the analog days, and record it straight to the hard disk? If you’re afraid of committing to a guitar sound that turns out to be a problem child later in the mixdown, you can record an additional clean DI track for later re-amping.

Keyboards and synthesizers can be played live and recorded as an audio track, which also circumvents the latency problem. Why not record signals with processing during tracking? This speeds up any production, and if analog products like ours are used, you don’t have to worry about latency.

If you are recording vocals, try compressing the signal moderately during the recording with a good compressor such as the mpressor, or try our elysia skulpter. The skulpter adds practical sound-shaping functions such as a filter, saturation and a compressor to the classic preamp, giving you a complete channel strip. If tracks are already recorded with analog processing, this approach also saves some CPU power during mixing. Especially with many vocal overdub tracks, an unnecessarily large number of plug-ins would otherwise be required, which in turn forces a larger buffer size and consequently has a negative effect on latency.

What are your experiences with audio latencies in different environments? Do you have them under control? I’m looking forward to your comments.
