Beyond having low and flat rates, beyond allowing mobility without a mobile phone, voice-over-IP has another advantage : it allows calls to be recorded. Here is how to proceed (assuming you are geeky enough…).
For instance, this week, I could not physically join an interesting meeting between French NGO leaders and IT professionals considering how best to volunteer for these NGOs. So I gave the organizer of the meeting a VoIP phone (Siemens Gigaset) and asked him to call the Asterisk-powered SIP conference call system our nonprofit has (thank you Fred and JML for this !). It allowed me to join the meeting as a distant caller and… to record a big MP3 file of the 3-hour-long discussion.
For recording this voice-over-IP conference, here is my setup. I was calling from a Linux PC with 2 GB of RAM, using a great free software SIP-compatible softphone called Twinkle (better than Ekiga IMHO). My voice-over-IP provider was our Asterisk server. I could have used any other free SIP provider, such as ippi. Ippi is great and I am a happy customer of their service.
I was also running the Wireshark packet sniffer as root. After the call, I had to post-process the VoIP packets Wireshark captured. Wireshark decoded them and extracted the audio content of the conversation. Then I used Audacity to normalize, level and compress the audio and to save it as a big podcast-ready MP3 file.
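To give an idea of what happens when the captured packets get « decoded », here is a minimal Python sketch. It is not Wireshark's actual code, only an illustration under a few assumptions : you already have the UDP payloads of the call as byte strings, and the names `RtpPacket`, `parse_rtp` and `stream_audio` are made up for this example. The header layout follows the RTP specification (RFC 3550) and the payload type numbers are the static ones from RFC 3551.

```python
import struct
from typing import Iterable, NamedTuple

# Static RTP payload types (RFC 3551) for the codecs mentioned in this post.
STATIC_PAYLOAD_TYPES = {0: "PCMU (G.711 mu-law)", 3: "GSM", 8: "PCMA (G.711 A-law)"}


class RtpPacket(NamedTuple):  # hypothetical helper type for this sketch
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Parse one UDP payload as an RTP version 2 packet."""
    if len(datagram) < 12:
        raise ValueError("too short for an RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", datagram[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    # Skip the optional list of contributing sources (4 bytes each).
    offset = 12 + 4 * (first & 0x0F)
    if first & 0x10:
        # Header extension: 16-bit profile id, then a 16-bit length in 32-bit words.
        if len(datagram) < offset + 4:
            raise ValueError("truncated header extension")
        (extension_words,) = struct.unpack("!H", datagram[offset + 2:offset + 4])
        offset += 4 + 4 * extension_words
    end = len(datagram)
    if first & 0x20:
        # Padding flag: the last byte says how many padding bytes to drop.
        end -= datagram[-1]
    return RtpPacket(second & 0x7F, sequence, timestamp, ssrc, datagram[offset:end])


def stream_audio(datagrams: Iterable[bytes], ssrc: int) -> bytes:
    """Concatenate, in capture order, the audio payloads of one RTP stream."""
    chunks = []
    for datagram in datagrams:
        try:
            packet = parse_rtp(datagram)
        except ValueError:
            continue  # not RTP: ignore it
        if packet.ssrc == ssrc:
            chunks.append(packet.payload)
    return b"".join(chunks)
```

Each direction of the call has its own SSRC identifier, so calling `stream_audio` once per SSRC gives one encoded audio stream per direction. This sketch ignores lost or reordered packets, which a real tool has to handle.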
The tricky parts :
– The Siemens Gigaset can’t (easily ?) be configured to call an SIP address which does not have the same domain as the SIP account it is using. For instance, if the audioconference system is at sip:conference@sipprovider.org then you’d better configure the phone to use a sip:mygigaset@sipprovider.org account rather than a sip:mygigaset@anotherprovider.org one. Too bad… :(
– You should warn the participants they are being recorded. Not only can this be required by your local laws but it also gives them another incentive to think of speaking close to the phone which is recording them in the meeting room.
– During the conversation, people in the meeting room would sometimes forget the presence of the phone and speak too far from its microphone. Hence I had to say « Can’t hear you ! » from time to time and participants would take the phone in their hand as if it were a microphone. Local participants (in the meeting room) would even call the distant participants « the phone » and say « Hello, phone, how are you ? » and stuff like that. It was a bit as if the phone was yet another participant speakers had to take into their hand in order to be heard and recorded. Quite funny. Having the phone close to the speaker is also a matter of discipline and habit for the meeting organizer.
– Distant participants like me would use the « mute » feature of their local (soft-) phone so that they can’t be heard when not talking, so that there is less background noise in the conversation.
– I would have preferred to have at least one local participant available in a text-based chatroom (think IRC channel) or at least in some instant-messaging system. This would have allowed me to remind them that the phone has to be kept close to the current speaker, and so on, without having to loudly say « Can’t hear you ». Unfortunately, the only IM-available participant was the main organizer, who quickly forgot his screen and keyboard so that he could focus on the discussion going on.
– A 3-hour call required a lot of RAM for Wireshark, even though the captured packets were being saved on the hard drive ; when post-processing the packets, I had to split the session into 4 smaller parts so that Wireshark would not crash when doing its audio extraction.
– When post-processing one of these smaller packet captures, Wireshark would sometimes not detect the actual nature of the packets : instead of seeing them as Real-time Transport Protocol (RTP) packets, which they indeed were, it detected them as « OICQ » packets. So I had to force Wireshark into considering them as RTP packets (using its « Decode As… » command).
– In order to have Wireshark decode and save an audio file from the RTP streams, the command to use is « RTP / Show all streams » from its « Statistics » menu. Then you use the « Analyze » button and then the « Save Payload » button. I had to select the « .raw » (vs. « .au ») format for the audio file because of the codec used by the VoIP phones.
– When saving the audio file, I decided to save distinct files for the forward (my voice, sent from my softphone to the audioconference service) and reversed stream (the meeting voices, sent from the audioconference service to my softphone). This allows distinct and finer audio postprocessing (the audio levels were different).
– In Audacity, I chose to first normalize the audio tracks, then level them (it adjusts the audio level when the speaker changes or talks too far from the phone), then audio-compress them a bit. I would then merge the parts and tracks into a single mono audio file. Stereo does not make much sense in the case of a many-participant call but can be useful if you record a 2-participant conversation.
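About splitting a long capture into smaller parts : here is a minimal Python sketch of the idea, not the exact procedure I followed. It assumes the capture was saved in the classic « pcap » file format (the newer « pcapng » format is not handled), and the function name `split_pcap` is made up for this example. Each part gets a copy of the 24-byte file header followed by its share of the packet records, so that every part is a valid capture on its own.

```python
import struct

# Magic numbers of the classic pcap format, mapped to the byte order
# used for the header fields of the file.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def split_pcap(data: bytes, parts: int) -> list:
    """Split classic-pcap bytes into at most `parts` captures of similar size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    byte_order = PCAP_MAGICS.get(data[:4])
    if byte_order is None or len(data) < 24:
        raise ValueError("not a classic pcap file")
    file_header = data[:24]

    # Walk the packet records: a 16-byte record header, then the packet bytes.
    records = []
    offset = 24
    while offset + 16 <= len(data):
        # Third field of the record header: number of bytes actually stored.
        (stored_length,) = struct.unpack_from(byte_order + "I", data, offset + 8)
        end = offset + 16 + stored_length
        if end > len(data):
            break  # truncated last record: drop it
        records.append(data[offset:end])
        offset = end

    if not records:
        return []
    per_part = -(-len(records) // parts)  # ceiling division
    return [
        file_header + b"".join(records[start:start + per_part])
        for start in range(0, len(records), per_part)
    ]
```

This sketch loads the whole capture in memory, which is fine for an illustration but not ideal for a very long call. Wireshark also ships a command-line tool, editcap, whose -c option splits a capture by packet count.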
That’s it. Now I have to finish the audio post-processing of my MP3 and find some place on the Net to upload it for the participants. What do you think ? Do you have some tricks to share on this topic ?
Some further notes on tricky parts of the process :
– When saving the extracted audio in the « .raw » format, you should note the codec detected by Wireshark. The detected codec is displayed in the description of the RTP stream which was analyzed. For instance, it can be G.711 or GSM. This is important because when you import this .raw file into Audacity, Audacity requires you to select the codec used in this file. You also have to specify the sampling frequency (8,000 Hz in many cases) and the « endianness » of the file (little endian most frequently, note that this influences the quality of the audio). Note that you might have the forward stream using one codec and the reverse stream using another one. Hence the importance of saving them in distinct files.
– Audacity allows advanced audio post-processing of your file. In my case, I start with « noise filtering » (1/ select a silent part of the file containing only background noise and « profile the noise » 2/ select the whole audio sequence and filter out the noise). Then I strongly « level » the audio, twice, with -60 dB or even -50 dB as the noise threshold. Then I audio-« compress » the file. Finally, I save the file as a compressed .OGG or .MP3 file.
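To make the codec and sampling frequency settings more concrete, here is a minimal Python sketch which turns a « .raw » payload file into a regular WAV file. It rests on assumptions which may not match your own capture : the stream is G.711 mu-law (PCMU) sampled at 8,000 Hz, and the names `ulaw_to_pcm16` and `ulaw_raw_to_wav` are made up for this example. G.711 stores one byte per sample, so in this sketch the byte order only matters for the 16-bit samples being written out, which WAV files keep in little endian.

```python
import struct
import wave

BIAS = 0x84  # constant used by the G.711 mu-law expansion


def ulaw_to_pcm16(code: int) -> int:
    """Expand one G.711 mu-law byte into a signed 16-bit linear sample."""
    code = ~code & 0xFF  # mu-law bytes are stored with all bits inverted
    mantissa = code & 0x0F
    exponent = (code & 0x70) >> 4
    magnitude = ((mantissa << 3) + BIAS) << exponent
    return BIAS - magnitude if code & 0x80 else magnitude - BIAS


def ulaw_raw_to_wav(raw: bytes, out, sample_rate: int = 8000) -> None:
    """Write mu-law bytes as a mono 16-bit WAV file.

    `out` is a file name or a binary file object opened for writing.
    """
    samples = [ulaw_to_pcm16(byte) for byte in raw]
    frames = struct.pack("<%dh" % len(samples), *samples)  # "<" = little endian
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

For instance, `ulaw_raw_to_wav(open("forward.raw", "rb").read(), "forward.wav")` would convert a forward stream saved under the hypothetical name forward.raw. An A-law or GSM stream needs a different decoder.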
Same here, I prefer having my recorded conversations as mono audio files. I am using Nero Wave Editor to record my conversations and edit/splice them. It has more features than Audacity, although I must say Audacity is very good software, especially as it is free.
I had the same problem when trying to record a lecture for a friend who just had surgery and couldn’t come to uni. I was using a very cheap microphone and didn’t pay attention to all the settings and thought « well this will work out just fine ». I ended up with terrible sound quality and I wish I had known about VoIP at that time !
Or you could have just used a tape recorder ;)
-Jack
I ordered a VoIP service from a company (8x. it provided me with an
adapter, which has an Ethernet cable connected to a router (the router
connects to a cable modem) and a telephone line connected to a telephone.
I have PCs connected (wired and wireless) to the router too.
What I want to know is: is there any way for me to install some
software on my PC that could sniff the audio packets and
reconstruct (record) the conversation going through the VoIP phone?
Thanks for any advice!
Mm, great, I was searching for this some time ago. Great article! I want to know: will it work with Skype too? :)
Cain is another tool which can be used to record VoIP conversations. It can also perform man in the middle attacks using ARP spoofing to record the calls on a switched network without permission straight off the wire if they are not encrypted.
I appreciate that people want to use VoIP, and I tried in earnest for my business, but it simply wasn’t feasible. Skype in particular was terrible. Recently I was on a telephone call with a Skype caller and the delay was terribly annoying. I kept having to ask them to repeat what they said.
The purportedly higher-quality VoIP services aren’t that much cheaper than landlines. My point here: isn’t there an easier way to record a conference call? There is webinar software in which all participants can be included in the call and the webinar can easily be recorded. That sounds like a much easier solution than all the steps set out above.
I was very happy with Gizmo until Google bought them and shut them down. Unfortunately, if you’re overseas Google Voice is not currently available.
The Gizmo call quality was great. The free sip# used in conjunction with my local Ipkall free land line in the US worked perfectly.
Skype is a good fallback – I phone regularly from Asia to Canada and have excellent call quality.
Great article.. was wondering if I could do this.. any chance it works with Skype?
VoIP is not very reliable…and it’s not the cheapest out there. So I see no real benefits in using their services. I’d be better off with Skype but I’m still using VoIP just because the recording quality is better than Skype’s.
Mind you, the quality of the recording was incredible thanks to the tutorial above. I tried for over 2 hours to find a viable solution until I found your article.
Much appreciated,
John
@John – « VoIP is not very reliable…and it’s not the cheapest out there. »
I guess that all depends on where you are and who you’re talking to. In my country (South Africa) VoIP is much more reliable than some other networks, like the ones from the mobile providers and it’s by far cheaper. The initial capital expenses are obviously higher, but the cost per minute on calls is much lower.