Podcast microphone and equipment: recording a clean voice
The order that actually matters: room first, mic placement second, the microphone model last. How to record a clean voice without a studio.
By Hanna Eng · Audio engineer, Abbey Road Institute Paris
To record a clean podcast voice, the priority order is the room first, mic placement second, the microphone model last. In an untreated room, use a cardioid dynamic mic, speak 15 to 20 cm away with the mic angled 15 to 30 degrees off-axis and a pop filter 5 to 10 cm in front of it, and record at 44.1 kHz, 24-bit.
Most gear guides sell you a microphone. As the person who has to repair these recordings afterward, I will tell you the opposite: the room and your distance from the mic matter more than the model. A decent mic placed well in a dead room beats an expensive one in a room that rings. This guide covers what to buy and how to use it, in priority order.
Voice recording specs at a glance
| Setting | Value |
|---|---|
| Mic distance | 15 to 20 cm (6 to 8 inches) |
| Mic angle | 15 to 30 degrees off-axis |
| Pop filter distance | 5 to 10 cm from the mic |
| Mic type (untreated room) | Cardioid dynamic |
| Phantom power (XLR condenser) | Typically 48V (USB condensers self-power) |
| Recording format | 44.1 kHz / 24-bit |
| Record level (peaks) | -12 to -18 dBFS, never clipping |
| Delivery loudness (integrated) | About -16 LUFS (Apple Podcasts, stereo); Spotify normalizes to about -14 LUFS |
| Delivery true peak | Under -1 dBTP |
Sources: Lewitt Audio; The Podcast Host; Apple Podcasts for Creators
What actually matters, in priority order
For a clean voice, the order is: the room first, mic placement second, the microphone model last. A decent mic placed well in a soft room sounds better than an expensive mic in a room that echoes. A captured echo or background noise is hard to remove later, so the goal is to avoid it at the source rather than fix it in post.
USB or XLR: which to choose
A USB microphone plugs straight into a computer and is the simplest start. An XLR microphone needs an audio interface but gives you room to grow (better preamps, multiple mics, upgrades). Many modern mics are hybrid USB and XLR, so you can begin on USB and move to an interface later without rebuying.
- USB: plug and play, one voice, lowest cost and complexity.
- XLR: needs an interface or mixer, scales to better preamps and multiple mics.
- Hybrid USB/XLR mics let you start simple and upgrade the chain later.
Dynamic or condenser
For a home studio that is not acoustically treated, a cardioid dynamic microphone is the safe default: it rejects room noise and tolerates an untreated space. A condenser is more detailed and sensitive and really wants a treated room, otherwise it captures every reflection and the fridge in the next room. An XLR condenser needs phantom power, typically 48V, supplied by your interface; a USB condenser (like the Rode NT-USB example below) is powered over USB and needs no separate phantom supply.
Cardioid pickup, and why it matters
Choose a cardioid microphone for solo voice. A cardioid pattern picks up what is in front and rejects the sides and rear, so it captures your voice and ignores most of the room and the keyboard. Omnidirectional and figure-8 patterns capture far more of the space, which is the opposite of what an untreated room needs.
Voice technique: distance, angle, plosives
Speak about 15 to 20 cm (6 to 8 inches) from the mic, and angle it 15 to 30 degrees off the direct line of your mouth. That distance keeps a full tone without excessive proximity boom, and the off-axis angle lets the moving-air bursts of plosives (P and B) pass beside the capsule rather than into it. Harsh S sounds (sibilance) are high-frequency energy, not air blasts, so they are tamed mainly by de-essing in post and by sensible distance, not by the angle. A pop filter 5 to 10 cm in front catches the rest.
- Distance: 15 to 20 cm. Closer than 5 cm, the mic overloads and the low end booms.
- Angle: 15 to 30 degrees off-axis to tame plosives; sibilance is handled by de-essing in post and by distance, not the angle.
- Pop filter 5 to 10 cm from the mic for the remaining plosives.
- Stay still: once set, do not move the mic or drift around, or your level and tone wander.
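To put a rough number on why drifting matters, here is a minimal sketch of how level changes with distance. It assumes a point source in a free field (the inverse-square law), which a real voice in a real room only approximates, and the function name `level_change_db` is illustrative rather than taken from any tool.

```python
import math

def level_change_db(old_distance_cm: float, new_distance_cm: float) -> float:
    """Approximate level change in dB when the mouth-to-mic distance changes.

    Assumption: point source in a free field (inverse-square law).
    A real voice in a real room deviates from this, so treat it as a guide.
    """
    if old_distance_cm <= 0 or new_distance_cm <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(old_distance_cm / new_distance_cm)

# Drifting back from 15 cm to 30 cm costs roughly 6 dB of level.
print(round(level_change_db(15, 30), 1))  # -6.0

# Leaning in from 20 cm to 10 cm adds roughly 6 dB.
print(round(level_change_db(20, 10), 1))  # 6.0
```

Under that assumption, doubling or halving the distance shifts the level by about 6 dB, which is why small movements are audible in the recording.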
Treating your room without a studio
You do not need a booth. Target the first reflections, the nearby walls and ceiling that bounce your voice back into the mic, with soft material: a rug, curtains, a bookshelf, foam or fabric panels. A small furnished room beats a large empty one. The classic free fix is recording in a closet full of clothes, which is dense, soft absorption on every side.
The minimum complete setup
A workable solo podcast setup is short: a cardioid microphone (USB, or XLR with an interface), closed-back headphones to monitor without the sound leaking back into the mic, a pop filter, and recording software. Record at 44.1 kHz, 24-bit to stay consistent with the rest of the production chain.
- Cardioid mic (dynamic is the safe default for an untreated room).
- Closed-back headphones (open-back leaks into the mic).
- Pop filter, and a stand or boom arm to keep the mic steady.
- An audio interface if you went XLR.
- Recording software (a DAW or a free recorder).
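For a sense of what 44.1 kHz, 24-bit means for disk space, the arithmetic can be sketched as follows. This is a back-of-envelope estimate for uncompressed PCM only (the file header is ignored), and the helper name `pcm_size_bytes` is hypothetical.

```python
def pcm_size_bytes(seconds: float, sample_rate_hz: int = 44_100,
                   bit_depth: int = 24, channels: int = 1) -> int:
    """Size of uncompressed PCM audio data in bytes, excluding any header."""
    bytes_per_sample = bit_depth // 8  # 24-bit audio uses 3 bytes per sample
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)

one_hour_mono = pcm_size_bytes(3600)
print(one_hour_mono)                     # 476280000
print(round(one_hour_mono / 1_000_000))  # 476 (megabytes)

one_hour_stereo = pcm_size_bytes(3600, channels=2)
print(one_hour_stereo)                   # 952560000
```

An hour of mono voice at this format is a little under half a gigabyte before any compression, so plan storage accordingly.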
After you record
Once the voice is captured cleanly, the chain continues: clean it (noise reduction, de-essing, de-clicking), then set the level for the platforms. Capturing well here is what makes that post-production light instead of a rescue job. The cleaner the source, the less the processing has to fight.
Example microphones (neutral industry examples)
These are common, well-documented options people compare for spoken-word and podcast use, listed only to show the categories. They are not a personal endorsement and not the gear used here. Match the type to your room first: in an untreated space, a cardioid dynamic is the safe default; a USB condenser is more sensitive and rewards a treated room.
Several of these are hybrid USB and XLR mics, which lets you start plugged straight into a computer and move to an interface later without rebuying.
- Cardioid dynamic, XLR: an industry-standard broadcast dynamic such as the Shure SM7B or the Electro-Voice RE20.
- Cardioid dynamic, hybrid USB/XLR: options like the Shure MV7 or the Samson Q2U, which start on USB and also offer XLR.
- Cardioid condenser, USB: options like the Rode NT-USB, more detailed and more sensitive, best in a treated room.
| Example model | Transducer | Pattern | Connection |
|---|---|---|---|
| Shure SM7B | Dynamic | Cardioid | XLR |
| Electro-Voice RE20 | Dynamic | Cardioid | XLR |
| Shure MV7 | Dynamic | Cardioid | USB and XLR |
| Samson Q2U | Dynamic | Cardioid | USB and XLR |
| Rode NT-USB | Condenser | Cardioid | USB |
Sources: Shure; Electro-Voice; Samson; Rode
Recording level and loudness target
Keep two numbers separate: the level you record at, and the loudness you deliver at. While recording, leave headroom so nothing clips: aim for peaks roughly -12 to -18 dBFS. Peaking too hot leaves no room for the louder words and risks distortion you cannot undo.
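As an illustration of what "peaks at -12 to -18 dBFS" means numerically, here is a small sketch that converts sample values to a peak level and labels it. It assumes floating-point samples where full scale is 1.0; the sample values and the names `peak_dbfs` and `headroom_status` are made up for the example.

```python
import math

def peak_dbfs(samples) -> float:
    """Peak level in dBFS for float samples where full scale is 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(peak)

def headroom_status(peak_db: float, low: float = -18.0, high: float = -12.0) -> str:
    """Label a peak level against the recording window used in this guide."""
    if peak_db >= 0.0:
        return "clipping"
    if peak_db > high:
        return "hot"
    if peak_db < low:
        return "quiet"
    return "ok"

take = [0.0, 0.12, -0.2, 0.05]  # hypothetical sample values
level = peak_dbfs(take)
print(round(level, 1), headroom_status(level))  # -14.0 ok
```

A peak sample of 0.2 of full scale sits at about -14 dBFS, inside the window, with plenty of room left for a louder word.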
Loudness is set at the end, not during the take. For podcast delivery, Apple Podcasts for Creators recommends an integrated loudness of about -16 LUFS for stereo (and -19 LUFS for mono) at a true peak no higher than -1 dBTP. The +/- 1 dB figure you often see quoted is the EBU R128 production tolerance, not part of Apple's spec. Record clean with headroom, then normalize to that target in post.
- While recording: peaks roughly -12 to -18 dBFS, never clipping at 0.
- At delivery: about -16 LUFS integrated for stereo (-19 LUFS for mono), true peak under -1 dBTP (Apple Podcasts for Creators).
- Set loudness in post, not by pushing the input hot during the take.
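The normalization step is simple arithmetic once you have measurements. The sketch below assumes you already have an integrated loudness and a true-peak reading from a loudness meter; it does not measure anything itself, and `normalization_plan` is a hypothetical helper, not part of any tool named in this guide.

```python
def normalization_plan(measured_lufs: float, measured_peak_dbtp: float,
                       stereo: bool = True,
                       peak_ceiling_dbtp: float = -1.0) -> dict:
    """Gain needed to hit the delivery target, and whether peaks would overshoot.

    Targets follow the figures in this guide: -16 LUFS stereo, -19 LUFS mono.
    A plain gain change shifts loudness and peak by the same number of dB.
    """
    target = -16.0 if stereo else -19.0
    gain_db = target - measured_lufs
    peak_after = measured_peak_dbtp + gain_db
    return {
        "target_lufs": target,
        "gain_db": gain_db,
        "peak_after_dbtp": peak_after,
        "needs_limiting": peak_after > peak_ceiling_dbtp,
    }

# Hypothetical readings from a quiet, clean take.
plan = normalization_plan(measured_lufs=-23.5, measured_peak_dbtp=-6.0)
print(plan["gain_db"])          # 7.5
print(plan["peak_after_dbtp"])  # 1.5
print(plan["needs_limiting"])   # True
```

In this example, raising the take by 7.5 dB would push the peak past the -1 dBTP ceiling, so the loudest moments need a limiter or some gentle compression before the gain is applied.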
Shock mount and headphone monitoring
A shock mount suspends the microphone so that desk knocks, keyboard taps and footsteps do not travel up the stand and into the capsule as low thumps. It addresses structure-borne vibration, which a pop filter does nothing about; the two solve different problems and are used together.
Monitor on closed-back headphones while you record. They let you hear plosives, sibilance, mouth noise and room tone as they happen, so you can fix placement on the spot instead of discovering it in post, and being closed-back, they do not leak sound back into the mic the way speakers or open-back headphones would.
- Shock mount: isolates handling and structure-borne vibration (knocks, typing, footsteps).
- Pop filter: handles plosive air blasts. Different job from the shock mount.
- Closed-back headphones: catch problems live and avoid leakage into the mic.
Optional budget tiers (no prices)
If you are deciding where to put your effort, think in tiers rather than amounts. The cheapest meaningful upgrade is almost always the room and your placement, not the mic.
Entry: one hybrid USB/XLR cardioid dynamic, a pop filter, closed-back headphones, and soft furnishings for the room. Mid: an XLR cardioid dynamic into a small audio interface, on a boom arm with a shock mount. Higher: the same chain in a properly treated space with first reflections handled. Spend down the list in that order.
- Entry: hybrid USB/XLR cardioid dynamic, pop filter, closed-back headphones, soft room.
- Mid: XLR cardioid dynamic into an interface, boom arm and shock mount.
- Higher: the same chain in a treated room with first reflections handled.
Frequently asked questions
Should I use a USB or XLR microphone for a podcast?
USB is the simplest start: it plugs into a computer with no interface. XLR needs an audio interface but scales to better preamps and multiple mics. If you expect to grow, a hybrid USB/XLR mic lets you begin on USB and upgrade the chain later without buying a new microphone.
Dynamic or condenser mic for a podcast?
For an untreated home room, a cardioid dynamic microphone is the safer choice: it rejects room noise and reflections. A condenser is more detailed but more sensitive, and really needs an acoustically treated room to sound good rather than echoey. An XLR condenser needs phantom power, typically 48V, from your interface; a USB condenser (like the Rode NT-USB) is powered over USB and needs no separate phantom supply.
How far should the mic be from my mouth?
About 15 to 20 cm (6 to 8 inches), with the mic angled 15 to 30 degrees off the direct line of your mouth. That keeps a full, even tone while letting the moving-air bursts of plosives pass beside the capsule. Harsh S sounds (sibilance) are high-frequency energy, not air blasts, so they are tamed by de-essing in post and by distance, not by the angle. A pop filter 5 to 10 cm in front of the mic handles the remaining bursts of air.
Do I really need to treat my room?
Treating the room matters more than the microphone. Soften the first reflections (nearby walls and ceiling) with a rug, curtains, a bookshelf or panels. A small furnished room beats a large empty one. A closet full of clothes is a genuinely good free recording space.
What is the minimum equipment to start a podcast?
A cardioid microphone (USB, or XLR with an interface), closed-back headphones, a pop filter, and recording software. Record at 44.1 kHz and 24-bit. Spend your attention on placement and the room before spending money on a more expensive microphone.
What microphone should I buy for a podcast?
Pick by category before model. In an untreated room a cardioid dynamic is the safe default. Common industry examples, named only as examples and not as an endorsement, include the Shure SM7B and Electro-Voice RE20 (XLR dynamics), the Shure MV7 and Samson Q2U (hybrid USB/XLR dynamics), and the Rode NT-USB (a USB condenser, which is more sensitive and prefers a treated room). The room and your placement matter more than which of these you choose.
What loudness should my podcast be?
Apple Podcasts for Creators recommends an integrated loudness of about -16 LUFS for stereo (and -19 LUFS for mono), at a true peak that does not exceed -1 dBTP. The +/- 1 dB figure often quoted is the EBU R128 production tolerance, not part of Apple's spec. You set that loudness in post by normalizing, not during the take. While recording, just leave headroom: aim for peaks roughly -12 to -18 dBFS so nothing clips.
Do I need a shock mount?
It helps. A shock mount suspends the microphone so desk knocks, typing and footsteps do not travel up the stand into the capsule as low thumps. It handles structure-borne vibration, which is a different problem from plosives, so it works alongside a pop filter rather than replacing it. On a desk or boom arm it is well worth it.
Why monitor with closed-back headphones while recording?
Closed-back headphones let you hear plosives, sibilance, mouth noise and room tone as they happen, so you can fix placement on the spot instead of discovering the problem in post. Because they are closed, they do not leak sound back into the microphone the way speakers or open-back headphones would.
Sources and references
- Lewitt Audio, dynamic vs condenser microphones
- Lewitt Audio, do I need a shock mount and pop filter
- The Podcast Host, what is a pop filter
- Apple Podcasts for Creators, audio requirements
- Shure SM7B, cardioid dynamic vocal microphone
- Shure MV7, USB/XLR cardioid dynamic microphone
- Electro-Voice RE20, dynamic cardioid microphone
- Samson Q2U, USB/XLR dynamic cardioid microphone
- Rode NT-USB, USB cardioid condenser microphone