Beyond having low and flat rates, beyond allowing mobility without a mobile phone, voice-over-IP has another advantage : it allows calls to be recorded. Here is how to proceed (assuming you are geeky enough…).
For instance, this week, I could not physically join an interesting meeting between French NGO leaders and IT professionnals considering how to best volunteer for these NGOs. So I gave the organizer of the meeting a VoIP phone (Siemens Gigaset) and asked him to call the Asterisk-powered SIP call conference system our nonprofit has (thank you Fred and JML for this !). It allowed me to join the meeting as a distant caller and… to record a big MP3 file of the 3-hours long discussion.
For recording this voice-over-IP conference, here is my setup. I was calling from a linux PC, 2GB of RAM and a great free software SIP-compatible softphone called Twinkle (greater than Ekiga IMHO). My voice-over-IP provider was our Asterisk server. I could have used any other free SIP provider, such as ippi. Ippi is great and I am a happy customer of their service.
I was also running the Wireshark packet sniffer as root. After the call, I had to post-process the VoIP packets Wireshark captured. Wireshark decoded them and extracted the audio content of the conversation. Then I used Audacity to normalize, level and compress the audio and to save it as a big podcast-ready MP3 file.
The tricky parts :
– The Siemens Gigaset can’t (easily ?) be configured to call an SIP address which does not have the same domain as the SIP account it is using. For instance, if the audioconference system is at sip:conference@sipprovider.org then you’d better configure the phone to use a sip:mygigaset@sipprovider.org account rather than a sip:mygigaset@anotherprovider.org Too bad… :(
– You should warn the participants they are being recorded. Not only can this be required by your local laws but it also gives them another incentive to think of speaking close to the phone which is recording them in the meeting room.
– During the conversation, people in the meeting room would sometimes forget the presence of the phone and speak too far from its microphone. Hence I had to say “Can’t hear you !” from time to time and participants would take the phone in their hand as if it were a microphone. Local participants (in the meeting room) would even call the distant participants “the phone” and say “Hello, phone, how are you ?” and stuff like that. It was a bit as if the phone was yet another participant speakers had to take into their hand in order to be heard and recorded. Quite funny. Having the phone close to the speaker is also a matter of discipline and habit for the meeting organizer.
– Distant participants like me would use the “mute” feature of their local (soft-) phone so that they can’t be heard when not talking, so that there is less background noise in the conversation.
– I would have preferred to have at least one local participant available in a text-based chatroom (think IRC channel) or at least in some instant-messaging system. This would have allowed me to remind the phone has to be kept close to the current speaker and stuff like that without having to loudly say “Can’t hear you”. Unfortunately, the only IM-available participant was the main organizer who quickly forgot his screen and keyboard so that he could focus on the discussion going on.
– A 3-hours call required a lot of RAM for wireshark, even though the captured packets were being saved on the hard-drive ; when post-processing the packets, I had to split the session into 4 smaller parts so that wireshark would not crash when doing its audio extraction.
– When post-processing one of these smaller packet captures, wireshark would sometimes not detect the accurate nature of the packets : instead of seeing them as Real-Time Protocol (RTP) packets which they indeed were, it detected them a “OICQ” packets. So I had to force wireshark into considering them as RTP files (using its “Decode…” command).
– In order to have wireshark decode and save an audio file from the RTP streams, the command to be used is “RTP / Show all streams” from its “Statistics” menu. Then you use the “Analyze” button and then the “Save Payload” button. I had to select the “.raw” (vs. “.au”) format for the audio file because of the codec used by the VoIP phones.
– When saving the audio file, I decided to save distinct files for the forward (my voice, sent from my softphone to the audioconference service) and reversed stream (the meeting voices, sent from the audioconference service to my softphone). This allows distinct and finer audio postprocessing (the audio levels were different).
– In audacity, I chose to first normalize the audio tracks, then level them (it adjusts the audio level when the speaker changes or talks to far from the phone) then audio-compress them a bit. I would then merge the parts and tracks into a single mono audio file. Stereo does not make much sense in the case of a many-participants call but can be useful if you record a 2-participants conversation.
That’s it. Now I have to finish the audio-postprocessing of my MP3 and find some place on the Net where to upload it for the participants. What do you think ? Do you have some tricks to share on this topic ?