What stood out to me was the similarities between “Agile”/team-management rituals and the game; the effects of both have been similar in my experience.
Interestingly, a lot of the “pain points” highlighted here between in-person and zoom-based communication align with my experiences as a “neurodivergent” person, some examples:
(1) low audio quality making it difficult to understand the speaker/easily distracted by background noise, (2) lack of access to non-verbal communication channels like body language requiring dedicated brainpower to understand the speaker, (3) dysmorphia brought on by hyper awareness of one’s appearance, (4) lack of adherence to societal norms in conversation
I feel like remote work has leveled the playing field in a way that requires everyone to be explicit in their communication.
A couple jobs ago (pre-pandemic) I was at a remote company that put some effort into 1 & 2, using an approach I've become a big proponent of.
Allocate $300 per person to get everyone a podcaster mic and a pair of open backed headphones to plug into it as a monitor so you hear your own voice through your headphones. It removes the feeling that you need to speak loudly to be heard, which is a lot of the fatigue.
And if both parties are using this setup, it greatly reduces the "only one person can speak at a time" feeling, allowing you to use more natural vocal interactions without feeling like you're interrupting.
Together it makes a huge difference. I have audio processing and sensory problems and used to be so drained after a 40 minute zoom meeting. With this setup I can spend 4 hours a day pair programming and be appropriately tired but not exhausted.
I have no solutions, however, for your 3-4 or for large meetings with executives screaming into their laptops from coworking spaces.
I optimized for this style of setup about a year into the pandemic, but using a Beyerdynamic MMX 300 wired gaming headset. The headset is closed-back, so to get voice feedback ("sidetone") I use an inexpensive Behringer XENYX 302USB mixer. It allows for adjusting the mic/audio mix going to my headphones, and is also a USB audio interface.
This setup is as minimal as I was able to get it: no big booms on my desk to move around, microphones blocking my video, etc. Yeah, I look like a pilot on calls, but the audio quality is amazing. Also, the mic being close to me blocks background noise, and isolates it from my desk. Sometimes my colleagues with podcaster-style mics have issues with mechanical transmission from their desk setups, while I have none.
People often comment — even to this day — about my audio quality. I talk to people all day, being in sales, and it makes a huge difference in my presence and professionalism. Absolutely worth every penny I spent.
As a side note, I also use my iPhone as my webcam (continuity cam I think it's called) along with a couple Logitech lights on my monitor, and the overall quality of my digital presence often blows people away.
It surprises me how many people in sales, marketing and other external facing roles don't optimize their AV setup.
Honestly I'm sort of surprised that there isn't training about this given how much sales training people go through. Presentation is HUGE. A good mic, a proper camera, and even just minor consideration for lighting make anyone so much more credible, and pleasant to talk to.
> It surprises me how many people in sales, marketing and other external facing roles don't optimize their AV setup.
It's not that surprising. It's fairly technical and many don't want to deal with it (ask anyone if they know what XLR or phantom power is). Another problem is that there are so many different options that it often leads to choice paralysis, and some people feel uncomfortable asking for money to buy better equipment.
Myself and my team are all technical sales folks with engineering backgrounds, and we naturally optimize for this sort of thing — all of us have some form of “advanced” setup that’s been informed by each other’s investment.
On the other side, NONE of our account reps have anything remotely close. I can think of one or two times in the past few years when someone asked me about what tech I use. My company even has a home office budget benefit meant for exactly this sort of thing!
Unfortunately since it’s an active feature it requires power, meaning any wired headset will require a battery or other power source.
A wired headset plus my mixer gives me an opportunity to tinker and upgrade my setup as I wish, and is all USB powered to boot.
I have some rough plans to make a simple audio interface with built in sidetone for use as a portable setup, but haven’t had the time to turn it into a real product. Someday!
> This setup is as minimal as I was able to get it: no big booms on my desk to move around, microphones blocking my video, etc. Yeah, I look like a pilot on calls, but the audio quality is amazing. Also, the mic being close to me blocks background noise, and isolates it from my desk. Sometimes my colleagues with podcaster-style mics have issues with mechanical transmission from their desk setups, while I have none.
I kind of went the other direction here.
AT2020 condenser mic into a focusrite scarlett. Mounted on an adjustable boom that's clamped to the far right end of my desk with a shock mount and pop filter (so I can get the thing fully out of my way). Sound comes out some bookshelf speakers sitting below my monitor.
My monitor's raised up a bit on an arm so about 20% of it is above eye level for me with the webcam sitting on top of that. It's a solid few feet away from me, so the angle isn't really noticeable to anyone I'm on a call with. But the angle does mean that I can pull the microphone right in front of my face, just below my mouth so I'm speaking over it, and it's not blocking my mouth or any of my face. (Visually it's in front of my shirt and on the edge of the frame anyway.)
It already does a fairly good job of isolating my voice by virtue of being a cardioid mic inches from my face, but beyond that I have it set up (in software) with a noise gate opened by the microphone _in my webcam_. When I'm talking to people I'm generally looking at them so I'm speaking towards the monitor. This sets it up so sound directed towards the monitor opens the mic, but other sound generally does not.
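For anyone curious what "a noise gate opened by another mic" means mechanically, here is a minimal sketch of a sidechain gate. It is not the software described above (that isn't named); the function names, frame size, threshold, and hold time are all made up for illustration. The main mic only passes through while the key signal (the webcam mic here) is loud enough.

```python
import math


def rms(frame):
    """Root-mean-square level of a list of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def sidechain_gate(main, key, frame_size=4, threshold=0.1,
                   hold_frames=1, floor_gain=0.0):
    """Gate `main` (boom mic) using the level of `key` (webcam mic).

    All parameter values are illustrative assumptions, not tuned settings.
    """
    out = []
    hold = 0
    for start in range(0, len(main), frame_size):
        main_frame = main[start:start + frame_size]
        key_frame = key[start:start + frame_size]
        if rms(key_frame) >= threshold:
            # Key signal is loud: open the gate and re-arm the hold timer.
            hold = hold_frames
            gain = 1.0
        elif hold > 0:
            # Key went quiet recently: stay open a little to avoid chopping.
            hold -= 1
            gain = 1.0
        else:
            # Gate closed: attenuate the main mic down to the floor gain.
            gain = floor_gain
        out.extend(s * gain for s in main_frame)
    return out
```

A real gate would also smooth the gain with attack and release ramps so the opening and closing is not audible as clicks; this sketch switches instantly to keep the idea visible.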
I couldn't find a pair of over-ear headphones I could comfortably wear for long periods of time--they'd all push on the arms of my glasses and end up really sore after a couple hours of calls in a day (even non-contiguous). Instead of trying to solve that, I just steered hard into making the open setup work and sound as well as I could.
As far as visuals--I have a couple of LED light panels for my photography that are set at 45 degrees to either side of my camera, with one adjusted a bit cool and one a bit warm. Picked it up from some time spent doing stage lighting--if you evenly light a scene, it looks flat. (Think a photo taken with an on-camera flash.) A slightly warm and slightly cool wash from different angles can be lit fairly brightly while maintaining the depth and texture.
Different approach but same goal and outcome: I spend a lot of my day talking to people and trying to convey information. It's to my benefit that I'm clearly heard and easily understood. And, besides the practical concerns, professionalism. Frankly, I think it's silly _not_ to invest (to some extent, anyway) in this. In the remote work world, a huge part of everyone's interactions with and perception of you is actually their interaction with and perception of your audio and video setup. You don't need to overdo it, but don't be _bad_ at it.
> And if both parties are using this setup, it greatly reduces the "only one person can speak at a time" feeling, allowing you to use more natural vocal interactions without feeling like you're interrupting.
How does it work? Asking, because I always have this issue in 3+ person calls, where I'm frequently starting to talk just about when someone else does, ending up in those awkward "oh sorry you go please" moments. That makes me prefer to not say anything at all unless explicitly talked to.
I always assumed that my issue is that I'm just a slow-thinking dumbass who can't get the cues in time (which is true, because it happens in face-to-face in-person conversations as well, just less frequently), but maybe it's a technical issue contributing to it?
I might be wrong on some of this because I'm not that knowledgeable about the mechanics of it.
But I think video conferencing software is built around the assumption that people are using their speakers and so the output is also picked up by the mic. So some of it is increased latency caused by filtering that out. And then it keeps people basically muted until it determines they're trying to speak, and there's a delay with it toggling.
If you're using a mic and headphones at all though, there are settings that cut a lot of that out. They're usually kind of inscrutable like in zoom IIRC it's called "audio for musicians" or something like that. If both people are using headphones and have those settings on, it removes most of that "switching talker" latency.
As you add more people it's harder and harder to get everyone configured correctly, which is why having company buy-in and policy is important. Sheer network latency comes into play too, but most people working from home probably have acceptable connections.
That's a latency issue, and there's no real fix for that.
The parent commenter, I believe, is talking about the problem where people use speakers and active noise cancellation, which makes it impractical to speak and be heard at the same time; the noise cancellation will ruin the speaker-user's mic audio while it has to cancel out another speaker. Headphones, worn by all parties, resolve this issue.
All headset mics have the great advantage of minimizing the influence of room acoustics. Speech is easiest to understand without reverb. Putting the microphone close to your mouth is cheaper than room treatment.
The benefit of a headset is not solely the improved microphone, it's also that the headphones mean your microphone is only picking up your speech, and not your meeting's output audio.
If you are using both your laptop's microphone and speakers, then Teams/Zoom has to decide whether to play audio to you, or pick up audio from you. If you're talking at the same time as someone else, the audio quality for everyone in the meeting suffers.
Modern systems don't work in a half-duplex mode like this.
All laptop speaker/mic combos use acoustic echo cancellation (AEC)[1], where they can both play out and pick up at the same time. There are actually two layers of this in many systems: one provided by your device, and a second layer provided in software by Zoom/Teams/Meet etc.
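To give a rough idea of how echo cancellation can play and pick up at once: one textbook approach is an NLMS adaptive filter that learns how the speaker output shows up in the mic and subtracts that estimate. This is only a sketch of that general technique, not what any particular device or conferencing app ships; the function names, tap count, step size, and the simulated echo path are all assumptions.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the mic signal."""
    weights = [0.0] * taps   # current estimate of the speaker-to-mic path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # what is left after removing the estimated echo
        power = sum(h * h for h in history) + eps
        # Normalized LMS update: step scaled by the recent far-end power.
        weights = [w + mu * e * h / power for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def simulate_echo(far_end, delay=2, gain=0.6):
    """Toy echo path: the mic hears a delayed, quieter copy of the speaker."""
    return [gain * far_end[i - delay] if i >= delay else 0.0
            for i in range(len(far_end))]


def demo_residual_ratio(seed=0, n=2000, tail=200):
    """Ratio of leftover echo energy to raw echo energy after adaptation."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mic = simulate_echo(far_end)
    residual = nlms_echo_cancel(far_end, mic)
    raw_energy = sum(s * s for s in mic[-tail:])
    left_energy = sum(s * s for s in residual[-tail:])
    return left_energy / raw_energy
```

In this toy case nobody is talking locally, so the residual should shrink toward silence. When both sides talk at once, real cancellers have to detect double-talk and slow or freeze adaptation, which is where much of the practical difficulty lies.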
What can happen is that the meeting audio is a mixture of the top-N loudest participants. N+1 people talking will conflict badly.
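A sketch of what "mixture of the top-N loudest" could look like, assuming the server ranks participants by the RMS level of their current audio frame. How real services pick N and measure loudness is not something I know; the names and the default value here are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of a list of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def mix_top_n(frames, n=3):
    """Mix only the n loudest participants for one frame of audio.

    `frames` maps participant name -> list of samples, all the same length.
    Returns (names that made it into the mix, mixed samples).
    """
    loudest = sorted(frames, key=lambda name: rms(frames[name]),
                     reverse=True)[:n]
    length = len(next(iter(frames.values()))) if frames else 0
    mixed = [0.0] * length
    for name in loudest:
        for i, sample in enumerate(frames[name]):
            mixed[i] += sample
    # Anyone outside `loudest` is simply not heard in this frame.
    return loudest, mixed
```

With n=2 and four people talking, the two quietest are dropped from the mix entirely for that frame, which matches the "N+1 people talking will conflict badly" description.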
Your comment is phrased as though you think you're correcting me, but your point seems to be agreeing with what I said. My point is not at all about the technological ability of a computer to simultaneously process audio input and output.
The "limitation" arises because Teams/Zoom/etc all have mechanisms to prevent audio feedback. Participant A, using their laptop's microphone and speakers, will cause unpleasant audio cutouts for every other participant whenever they talk, even if Participants B-Z are all wearing headphones that prevent their microphone from picking up their audio output.
Frankly, I hate Apple audio products. Apple has a veneer of “industry leader” which makes people think they’re the best they can get, so they don’t look for alternatives. But oh my god, Apple mics are my bane on a day of calls. They’re “acceptable” in that, yes, I can understand you. But their grating low quality is just uncomfortable. Like, I fully expect Apple to roll out an AI model that reconstitutes people’s voices to sound normal just so they can keep trying to convince people they don’t need a headset or a boom mic.
Agreed. I've been using them for the better part of a decade (and full time since 2020) and I've listened to recordings of myself - sounds absolutely acceptable, and better than a bunch of USB mics I've tried.
I personally use a single over the ear [headphone with directional] mic because I want to hear everything around me normally (and myself). They are super cheap, and for jitsi/zoom/teams I don't normally need directional sound. It's also easier on my ears than earbuds. Esp if someone is too loud. I just tip it back a bit.
I’ve done a deep dive into this and have a setup I’m fairly happy with. It doesn’t sound as good as a “podcast” mic, but it’s also not obnoxious and in your face. While sounding worlds better than apple or gaming headsets. I detail in this thread:
It’s a wired headset with a condenser mic I feed into a DAW with real-time compressor and EQ. No software required or processing done on the computer. The only thing I’m missing is a wireless headset with audio in and out that can be routed to the DAW. Wireless headsets connect to the computer via USB, which is a shame. All of my friends use SteelSeries and it sucks because I constantly have to ride their volume levels: the boom arm moves out of the ideal position and they don’t have a leveler to deal with it.
I know it's not exactly the point, but would you mind sharing what specific mic and headphones you went with for this? It can help to get a sense of what to look for to create a similar setup.
Blue Yeti mic and Philips Fidelio headphones. This was just the "standard package" that company recommended for new hires, and we were given a one-time setup stipend for what they cost at the time. I didn't shop around and haven't changed anything since, so there may be better options now.
I have the mic on an arm to cut down on bumps and typing transmitting through the desk, but I find it doesn't have to be right in my face. I have it above the monitor, out of view of the camera. YMMV depending on room echo, my walls are bookshelves which is pretty good as treatment.
This is tricky to do correctly if you don't want to look like one of three things:
1. Podcast bro / Twitch streamer (giant mic inches from face)
2. Gamer (headphones with or without a headset mic)
3. Call center employee (open back mono headset with mic)
Turning off video is obviously the fastest fix to this. But assuming you don't want to do that...
You'll find a lot of people hate wearing headphones since it messes up their hairstyle or just looks distracting or some other sensory issue with having something clamped to your head. Earbuds work better for them; as long as the mic can't hear other speakers, the annoyance of half-duplex audio is eliminated.
The mic problem is harder if you don't want something obvious in frame. Lav mics work if you know how/where to attach one and to minimize clothing movement noise. Other options will require some level of room treatment if the mic isn't close to the speaker's mouth.
I suspect there are fashion/social signaling type reasons why streamers and podcasters put the mic visibly in frame. I found it wasn't difficult at all to get it out of the camera view with a minor but acceptable quality loss and no echo. My walls are mostly bookshelves which helps but is not exceptional.
>Allocate $300 per person to get everyone a podcaster mic and a pair of open backed headphones to plug into it as a monitor
How's your experience with this regarding soundproofing the room? Is normal noise filtering fine? I assume you have the mic on a boom with a pop filter.
Re #1, many video conferencing apps have in-app captions and I really appreciate turning these on. Sometimes I can't understand the speaker and captions save me. Sometimes I space out and captions allow me to "rewind" conversations so I can answer an unexpected question.
Even if your video conferencing app doesn't have captions, your OS might (Mac does, for example). Not as good as one in the conferencing software (which can label speakers), but it's better than nothing.
This feature also helps if you join a meeting and don't have your audio set up right.
In my experience, these captions only work for native speakers (of a language that happens to be supported) with "standard" accents. With other accents and/or non-native speakers, they break down pretty badly. Combine that with some amount of bilingual dialog and some technical terms and internal tech names, and it can get quite mystifying.
Interesting to me are the inverse issues my hard of hearing wife has with in-person work. She rarely gets anything out of meetings because the relative chaos and people speaking over each other means she can't hear any of it. No technology yet provides real-time transcription for reality. The politics of interpersonal relations makes everyone think you're stuck up, shy, unconfident, or all of these if you don't often speak up yourself. The common practice of looking over someone's shoulder for co-working doesn't work well when the other person relies upon reading your lips to understand you. All of the mythical serendipitous chance interactions that happen around a water cooler don't involve you if you can't hear anyone and don't take part in water cooler chat.
> No technology yet provides real-time transcription for reality.
A deaf dancer friend uses an app, I think from Google. We’re often in discussion/teaching circles in dance classes. Seems to work well.
“Real time” is relative though. Conversation latency matters in roughly the tens of milliseconds and it’s certainly not near that fast.
The group has to be attuned to having a turn for her to speak. She’ll pretty often come in at the same time as someone else, missing those subtle sounds of someone else about to talk. The emergent rule is “tie goes to her”.
This all works well in the mindful scene of Berkeley experimental dance workshops. Maybe she’d get steamrolled in a competitive finance office or something.
Can you please get that app's name? It sounds like something that would immensely help me. I'm hard of hearing and something like this would be amazing to have handy when I can't wear my hearing aids in a specific situation.
In my experience, it’s awful. When using my phone for real-world captions, I still rely on Otter.ai or even a Google Meet with myself as the only participant.
I miss remote work when we didn't have to go on video all the time. Seeing myself/other people in a video chat is exhausting, and the non-verbal cues are often delayed so they lose meaning entirely. Just give me low latency voice again, please.
Our team culture defaults to video off for most recurring calls, it's more relaxing and less exhausting. When there's something really important to discuss, or an exec/client meeting we'll turn on video.
been remote since 2015 and outside of intimate meetings with small groups of people, and/or introductions, pretty much everywhere has been 100% no camera.
part of it was working for F500 multinationals, where the PM is in Mexico City, technical staff are in Chile, and engineering is in Canada and California. Just audio works fine, but 10 people with cameras and things get laggy...
At my previous job during pandemic we usually communicated audio only. At my current job we most of the time use video as well. It is a big big improvement. Just a few days ago we had to have an audio-only call with a supplier, who said such was their policy. It was super annoying not being able to see them. It is much harder to get cues about for example how certain they are when saying something.
I have to agree with this, I find no video much more exhausting and difficult to follow.
I do work with quite a few people who are non-native English speakers. They're not really difficult to understand (they have great English and don't have very thick accents), but it does still up the difficulty slightly. Most of them also don't have the greatest audio setups (but not the worst, often plug-in headsets with the dangly mics).
The first thing to do in a conference is disable the video stream of yourself. It doesn't help in any way after the first 10 seconds. I can't understand why any manufacturer added those, except to fulfill narcissistic tendencies.