Saturday, April 18, 2009

File transfer over sound card

Before I even start, a word of warning: Never try these programs with your headphones on. THEY MAKE LOUD NOISES! It is possible to configure these programs to make noises way louder than you ever imagined your headphones could make. You can damage your hearing when playing with audio programming. Tinnitus isn't fun.

This will be you if you don't take my warning seriously:



Background and problem

I have an old laptop lying about. A while back, I had a bit of trouble getting it to work. It's so old that several parts of it have shut down. The USB system is fried, and it can't read modern burned DVDs. I needed to move software to it to get the network up and running again. The only peripheral that was working (besides the keyboard and the screen) was the sound card. It was running an old version of Slackware.

Ah, I thought, and wrote a program that encoded data in sound. When I did it back then, I used a bad algorithm. It was very noise sensitive and tried to do too much at the same time. Since then, I've improved (and simplified) the concept to use a sort of pulse width modulation (an idea I got when I read about the ZX Spectrum tape loader).

The basic protocol is trivial:

For every character:
    For every bit:
        Send a short pulse if the bit is 1.
        Send a long pulse if the bit is 0.
        Send a silence.
    Send a very long pulse (4 times as long as the shortest pulse).
    Send a silence.
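
The steps above can be sketched in C as a function that expands one byte into a list of tone-on and tone-off segments. This is only an illustration: the names (encode_byte, struct segment) are made up for the sketch, and apart from the 4x end-of-byte pulse, the durations and the most-significant-bit-first order are assumptions (the bit order is chosen to match a decoder that shifts new bits in from the right).

```c
#include <assert.h>
#include <stddef.h>

/* Durations in units of the shortest pulse. Only the 4x end-of-byte
 * pulse comes from the description above; the other values are
 * assumptions made for this sketch. */
enum { SHORT_PULSE = 1, LONG_PULSE = 2, END_PULSE = 4, SILENCE = 1 };

/* One segment of output: tone on (1) or off (0) for `units` time units. */
struct segment { int tone; int units; };

/* Expand one byte into segments, most significant bit first (assumed).
 * `out` must have room for at least 18 segments. Returns the count. */
size_t encode_byte(unsigned char byte, struct segment *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (int bit = 7; bit >= 0; bit--) {
        int is_one = (byte >> bit) & 1;
        out[n++] = (struct segment){ 1, is_one ? SHORT_PULSE : LONG_PULSE };
        out[n++] = (struct segment){ 0, SILENCE };
    }
    out[n++] = (struct segment){ 1, END_PULSE };  /* end-of-byte marker */
    out[n++] = (struct segment){ 0, SILENCE };
    return n;
}
```

Every byte always expands to 18 segments (8 pulse/silence pairs plus the end marker and its silence), so a caller could turn each segment into samples by emitting either a sine wave or zeros for the given number of time units.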


This is nice and not very error prone. The end-of-byte signal means that errors don't taint their neighbors.

The crux isn't the signal generation (which is laughably trivial), it is the analysis on the receiving end; or rather dealing with the noise in the signal. The naive implementation would be to sum the square of the signal amplitude over time periods--the presence of a sine wave would converge towards some value and a silent signal would converge towards 0. In a noisy signal, it almost always converges towards something non-zero, so no such luck.
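
For reference, the naive approach amounts to something like the following sketch (mean_square is a hypothetical name, not a function from the actual sources). It makes the weakness easy to see: every sample contributes to the sum regardless of frequency, so background hum or hiss raises the result just as the carrier would.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of the squared samples in one block (samples in -1..1).
 * A silent block gives 0; any noise at any frequency gives more. */
double mean_square(const double x[], size_t length)
{
    double sum = 0.0;
    assert(length > 0);
    for (size_t i = 0; i < length; i++)
        sum += x[i] * x[i];
    return sum / (double) length;
}
```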

So, the second approach would be to use a Fourier transform, to select the part of the spectrum where our signal resides (400 Hz is what I chose).

A simple implementation of such a function looks like this:

#include <math.h>

/* Magnitude of a single Fourier component of x_in at modified frequency n */
double fourier1(double x_in[], double n, int length) {
    double x_complex[2] = { 0, 0 };  /* real and imaginary parts */
    int i;

    for(i = 0; i < length; i++) {
        x_complex[0] += x_in[i] * cos(M_PI * 2 * i * n / (double) length);
        x_complex[1] += x_in[i] * sin(M_PI * 2 * i * n / (double) length);
    }
    return sqrt(x_complex[0]*x_complex[0] + x_complex[1]*x_complex[1]) / (double) length;
}


Where x_in is a series of numbers between -1 and 1, and n is the modified frequency (which is to say: length * frequency / rate). This function gives you a number corresponding to how much of a given frequency is in a block of samples. But you can do one better: harmonics. Almost every loudspeaker will produce some level of harmonics even when the signal being broadcast is a plain sine wave with no harmonics.
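
As a worked example of the modified frequency, here is a tiny helper (modified_frequency is a hypothetical name used only for this sketch). The numbers are illustrative: a 400 Hz tone sampled at 8000 samples per second and analyzed in blocks of 800 samples gives n = 800 * 400 / 8000 = 40, meaning exactly 40 full cycles fit in one block.

```c
#include <assert.h>

/* n = length * frequency / rate: the number of full cycles of
 * `frequency` that fit in a block of `length` samples. */
double modified_frequency(int length, double frequency, double rate)
{
    assert(rate > 0);
    return length * frequency / rate;
}
```

For a clean sine wave of amplitude 1 and a whole number of cycles per block, the function above should return about 0.5 when called with that n, and close to 0 for other whole values of n.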

So, to check if our signal is in a given segment, the following code can be used:
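
    /* frq is the modified frequency of the fundamental;
       the loop stops before the harmonics reach the Nyquist limit */
    int harmonic;
    double sum = 0;
    for(harmonic = 1; 2*harmonic < length / frq; harmonic++) {
        sum += fourier1(data, frq * harmonic, length);
    }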


To decide whether the signal is present in a given block, you must compare this sum against some form of threshold. What makes a good threshold varies with the noise. A bad threshold value may cause the program either to interpret random noise as meaningful data or to reject good data as random noise.

if(sum > threshold) { /* Signal is present in data block */ }
else { /* Signal isn't present */ }


The protocol described above can be realized with the following code:

    if(sum < threshold) {
        if(signal_length) {
            if(signal_length > 10) {
                if(bit != 0) printf("(?)");
                bit = 0;
                signal_length = 0;
            } else {
                bit_data = 2 * bit_data + (signal_length < 6);
                if(++bit == 8) {
                    printf("%c", bit_data);
                    fflush(NULL);
                    bit = 0;
                }
            }
            signal_length = 0;
        }
    } else {
        signal_length++;
    }
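
To experiment with this logic without a sound card, the same state machine can be wrapped in a function that is fed one detection result per block. This is a sketch, not the code from analyze.c: struct decoder and decoder_feed are hypothetical names, the length limits 6 and 10 are taken from the snippet above, and the "(?)" warning for incomplete bytes is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct decoder {
    int signal_length;      /* consecutive blocks with the tone present */
    int bit;                /* bits collected for the current byte */
    unsigned int bit_data;  /* bits shifted in so far */
};

/* Feed one block's detection result (1 = tone present, 0 = silent).
 * Returns 1 and stores a byte in *out when a byte completes, else 0. */
int decoder_feed(struct decoder *d, int present, unsigned char *out)
{
    int done = 0;
    assert(d != NULL && out != NULL);
    if (present) {
        d->signal_length++;
        return 0;
    }
    if (d->signal_length > 10) {
        d->bit = 0;   /* end-of-byte pulse: resynchronise */
    } else if (d->signal_length > 0) {
        /* short pulse (< 6 blocks) is a 1, longer pulse is a 0 */
        d->bit_data = 2 * d->bit_data + (d->signal_length < 6);
        if (++d->bit == 8) {
            *out = (unsigned char) d->bit_data;
            d->bit = 0;
            done = 1;
        }
    }
    d->signal_length = 0;
    return done;
}
```

Start from a zeroed struct decoder and call decoder_feed once per analyzed block, passing whether the sum exceeded the threshold. Because the end-of-byte pulse resets the bit counter, a pulse lost to noise only corrupts the byte it belongs to.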


This does work. It's not just some crazy pipe dream. What follows is all the code you need for transferring files between two computers using their soundcards.



Garbled transfers like this may soon arrive through a soundcard near you.


Utility programs

Before I get deeper into the main program, I'm going to contribute two utility programs: record and playback. They are both wrappers for OSS: record reads and digests data from the soundcard, and playback writes digested data to the soundcard, at a given sample rate. They deal in signed char arrays only, and may convert them for the soundcard. Exactly what they do and how they work is a bit off topic, so I'll just post the code listings.

playback.c
record.c

The broadcasting end
As previously discussed, the broadcasting part of the program is pretty simple. The only real gotcha is the sample rate factor in the frequency. Since we're generating a signal with N samples (bytes) per second, we must scale our frequencies by a factor of 1/N. Beyond that, it's really quite trivial.

generate.c

The math parts
Almost there now. We just need some Fourier transforms and things of that nature. The analysis end of the program can also make an XPM file of the frequency spectrum of the input, which is why you see a bunch of XPM code.

fourier.c
fourier.h

Finally... the receiving end

Most of what this one does has already been discussed. The threshold is hard-coded. You may want to change it or whatever.

analyze.c

... one file to compile them all, and into objects link them

Makefile

To compile, you just run
> make

Using the programs

Before I get to actually using the programs, I repeat my warning: Never try these programs with your headphones on. THEY MAKE LOUD NOISES! It is possible to configure these programs to make noises way louder than you ever imagined your headphones could make. You can damage your hearing when playing with audio programming. Tinnitus isn't fun.

Not there yet. It's quite an elaborate matter to use all these programs. First you'll want to generate your raw sound data. Let's transfer /etc/fstab (don't do this as root! Something might go horribly wrong!)

First, attach the microphone on the receiving end to the speakers on the transmitting end. Use duct tape or whatever. I had an old hands free headset that I wrapped so that the microphone was held in place by the earpiece.





On the transmitting computer, run the following command:
> ./generate -b 25 -r 48000 -o out.data /etc/fstab


Type, but do not run, the following on the transmitting computer:
> ./playback -r 48000 < out.data # don't press enter yet!


Now, on the receiving computer, run the following command:
> ./record -r 48000 -o out.recdata

Note that out.recdata will grow very fast. In this case, 48000 bytes/second.

Run the command pre-typed in the transmitting computer's terminal. Be very quiet, and listen to the noise coming from the speaker. This may take quite some time. Practice your Zen. Find enlightenment. When the noise stops, press Ctrl+C on the receiving computer.

Run the following command on the receiving computer:
> ./analyze -b 25 out.recdata


Watch a semi-garbled /etc/fstab be printed across your screen. The -b switch is to be taken with a grain of salt. It is proportional to how fast the transfer is. It must (obviously) be the same on the receiving and the transmitting end. I've gotten it to work at -b 50 -r 48000. A higher sample rate (-r) increases the processing time, but it also allows faster transfers. There are a few fixed possible sample rates. 8000 always works; others that are supported by most soundcards are 11025, 16000, 22050, 24000, 32000, 44100 and 48000.

So, in summary: -b determines how fast the transfer is, and the maximum possible transfer speed is limited by the sampling rate.

If it doesn't work, try playing what you recorded with playback. If you can hear and distinguish the beeps, then so should the computer be able to. If record or playback fails, chances are you don't have permissions to access /dev/dsp. If all you get is character salad, fiddle with threshold in analyze.c.



An even more elaborate version of this program is described in File Transfer Over Sound Card II - Phase Shift Keying.

2011 update: Moved the sources to a github repo.

63 comments:

  1. What about using morse code?

  2. I have also done some investigation on this.

    My solution is as following.

    1. Using digital baseband transmission;
    2. Using Manchester Code;
    3. A synchronization sequence + data sequence;
    4. Double sampling at transmitter.

    While your solution is using modulation and is passband transmission, maybe mine is simpler and faster?

  3. Why not use a cable from the audio out to the audio in? Does your laptop not have an audio in? This would improve the signal-to-noise a lot.

    Also, if you must use a speaker/microphone setup, wrapping them in duct tape or similar will create resonances that will degrade the sound quality of the system. If you must wrap the microphone/speaker then thick layers of fabric, and then duct tape would be better. This would somewhat reduce noise contamination and sound level output into the room as well.

    And, why not use a square wave or saw wave as a signal which has high harmonic content, instead of relying on the distort to produce the harmonics.

  4. This is a very nice project, but yeah, something like this was done in the mid 90's with a direct link from audio out to audio in... gotta love resourceful geeks

  5. There are many different methods that may be more effective to modulate the digital signal. Look into things such as QAM - that is the method that is used by cable companies to pass digital data over analog lines.

  6. Clever... but were you aware that someone already did it and you might already have the drivers on the machines? Linux's AX.25 drivers include a soundblaster interface. AX.25 uses frequency-shift keying (two different tones for 1 and 0), and transmits around 1200bps.

    It is intended to be connected to an amateur radio, but can be hooked up from line-in to line-out if you want.

  7. Morse code won't work, it only supports a-z 0-9, that's not even close to a unicode character set.
    You could send a message, but it wouldn't work for data transfer.

  8. I've used the venerable serial port which is present in old laptops (but not new ones) to transfer files.

    But your solution is brilliant - way to go for implementing this!

    Replies
    1. Good old days! I used to transfer files by connecting DB9 serial ports together with 3 wires on i286 boxes. Still remember the pinout - 2,3,5 (RX, TX, ground). No modems required. If you can connect a mouse to it, you can send files.

  9. morse code 0/1 at high speed, but in effect thats exactly what he's doing.

  10. Anonymous: that's like saying you can't transfer plain text because all the computer can deal with is ones and zeros!

    There's nothing stopping someone developing an encoding scheme that can encode arbitrary binary data on top of morse code.

  11. freshtonic: You *could* convert from binary (unicode) to base 36 (alphanumeric) then back to binary (morse code) or.... just leave it in binary and transfer it raw. Binary is the native format for all computer data, it would be wasteful to muck around with morse code.

  12. Great! you've just invented the fax machine!

  13. OMFG, you have "reimplemented" tape protocols from old personal computers....

    http://en.wikipedia.org/wiki/Line_coding
    http://www.atariarchives.org/dere/chaptC.php

    those times almost every one had possibility to load/unload programs from magnetic cassettes.

  14. How about putting the disk in a USB enclosure? Not as geeky, but very effective.

    Replies
    1. How about transmit a file to a computer away 20km using this, a transmitter and no use internet. It's very secure to a military base, right?

  15. Geeks are so cool!

    ERR
    www.anonymity.es.tc

  16. Somebody port this to the iPhone! Low-speed tethering anyone?

  17. I seem to remember that the old Commodore Amiga computers could do something like this. They did not come as standard with modems but they had excellent sound-cards with a good audio-in. It was the very first soft-modem.

  18. Holy mackerel. Although this is cool, why in God's name wouldn't you just pull the HDD and mount it in another computer (either via a USB enclosure, or an adapter to bring it to a 40pin IDE header)??????

  19. Somewhere out there in the distance you can almost hear the author slap his forehead when he reads about the USB enclosure idea....

  20. I've done something similar in a Communications Systems class a half year ago or so, but in discrete components and using an infrared transmission/reception system. There's still an applicable algorithm I built in lab for encoding, transmitting, and receiving binary data.

    Multiply 1 for on and 0 for off by a carrier frequency, then transmit, to generate the signal.

    On the receiving end, run the signal through a bandpass filter (at the carrier frequency), then a half-wave rectifier. This gets you an analog waveform that looks roughly similar to the on/off transmitted, but with voltage levels varying with transmission conditions.

    To get digital voltages out of this, send the signal through two lowpass filters in parallel with different cutoff frequencies and compare the results. Its essentially the analog version of comparing the current value to a running average over multiple bits. If you encode your data in a way that mixes high and low values well, then the running average is going to be between the high and low ends.

    I just may do some tinkering around with this, just for kicks.

  21. Like PSK31?
    http://en.wikipedia.org/wiki/PSK31

  22. >mapandfold

    1. modulation and bandpass filter is better, IIRC. Well, at least it was for infrared. It means noise rejection, for one thing - there's a lot more baseband infrared thrown around than, say, 40khz modulated infrared.
    2. Manchester encoding gets half the bitrate out of the same bandwidth as unipolar NRZ. I think it'll still be faster than the OP's encoding scheme, and guarantees almost exactly the same amount of time spent transmitting high as transmitting low.
    3. A synchronization sequence isn't really needed, except for perhaps demarcating the beginning and end of a sequence of bits. And since the received waveform is processed, that can be done from within the received sequence of bits, assuming one can either find a sequence of bits that aren't used for anything else or get more than on/off out of your line code.

  23. >Leon

    Article says that PSK31 can be operated at less than 100 hz separation... perhaps one could run multiple PSK31 data streams over a band that doesn't include any of their harmonics? Assuming that its possible for a speaker to correctly whistle, say, 10 notes at the same time spread between 2khz and 3khz.

  24. Maybe you could tweak it to recognize different frequencies. To send 4 bits per pulse you would only need 16 distinct tones. Might be feasible.

  25. >To send 4 bits per pulse you would only need 16 distinct tones.

    Incorrect. To send 4 bits per pulse you need only 4 tones. All ones = all four tones on, etc.

  26. Viktor,

    You might be interested in Crista Lopes' work at PARC and then at UC Irvine on over-the-air audio transmission of data, in case you're unaware of it:

    http://www.ics.uci.edu/~lopes/dv/dv.html

    One of the goals of her work is to produce encodings that actually sound pleasant to humans (think R2-D2).

    Best,
    -keith

  27. You could send 8 bits per pulse using 16 tones, 8 on 8 off, and use an extra couple tones for misc. info

  28. You could use an ultra high frequency sound, or ultra low frequency sound, so that the signal cannot be heard by the human ear.

  29. Low frequency sound is a bad idea because it brutally caps the transfer speed. If the audible range stops around 20 Hz, that's only 20 oscillations per second, and most detection methods require at least a couple of cycles to identify the frequency -- so you get a bit rate of (at best) 5 bits per second.

    High frequency sound is a good idea if your hardware can handle it. But unfortunately, most consumer oriented hardware can not. The best you can hope for is a signal that only young people can hear.

  30. you should try expanding this by adding more frequencies, like one for each bit, speeding it up

  31. Here are dozens of sound-based modes for small bandwidths:
    http://w1hkj.com/FldigiHelp/Modes/index.htm

    Here's one of the narrower ones:

    http://en.wikipedia.org/wiki/PSK31

    Here is some software for Win/Lin/Mac/BSD that implements a bunch of modes
    http://w1hkj.com/Fldigi.html

    Here is a package for data transmission using the above:
    http://w1hkj.com/NBEMS/index.html

  32. this is great! you can use a FM transmitter, and a FM radio to transfer by wireless. Cool.

  33. Why not use a old Modem and connect the output to some modulator for it to be transmitted via AM Radio tranciever? AM Radio has a longer range than FM.

  34. I had the idea of using amplitude instead of frequency to transmit data. Using a sound card, you can transfer data with a bitrate coming close to the soundcard's sampling rate. I also thought of using various CHAP mechanisms to negotiate line speed (for example, if you had one machine using a SoundBlaster 8-bit and another using a state-of-the-art 9.1 card), and attempting to use stereo transfers (using stereo transfer cables): This would get you two data channels in each direction. Since you only really need one, the second channel can be used for flow control. With two cards at 192kHz, you could get a full-duplex link at broadband speeds with software flow control.

  35. I'm not sure you're ever going to see speeds like that except in theoretical models. Due to all sorts of interference and degradation in digital-analog-digital conversion, you're probably not going to be able to use more than 8 different amplitude levels. And you're most likely going to need several oscillations to determine what amplitude of data you're seeing with any degree of certainty. If we're really optimistic and say 16 oscillations, you get 96 kbit/s, which is roughly ISDN speed; which is pretty impressive for a completely McGyvered connection, but even then you need -really- high end equipment, with well shielded cables.

  36. Funny. Low budget POLAR-watches do actually use this technique to transmit data from the clock to the pc.

  37. I've been searching around for something like this. I first used tones, but I would need 256 of them, so then I thought of using 2 tones, high for 0, low for 1. It worked, but on the recieving end, it's pretty hard to decode tones, and with the short ammount of time per bit, it's hard to get it exact. Thanks for this though! I will make a program in QuickBasic using this method. I'll probably set up a terminal service! Who knows! I'm Jason Woodland at hot mail ... com

  39. hey i have a mini project and i want to work on ur finding...pls help me more about it sir...

  40. I'll help if I can. Do you have any particular questions you want me to answer?

  41. Is there anyway to do this using tones that are above or below what the human ear can hear? That way you could transfer data without being annoyed by all the noise.

  42. If you have the proper equipment, certainly. But most consumer equipment (speakers, microphones, and to a lesser extent sound cards) aren't built for operating outside the audible spectrum, and therefore probably will not work very well.

  43. So not a mic and speaker on a cell phone for instance? Also, what's the highest bandwidth rate you've been able to achieve? Thanks!

  44. I think DSL relies on using higher frequencies of phone lines to transfer data, but I'm not sure if that's useful here.

    With this code, the data transfer rate is not very high at all (like 500 baud or something). But then this code has no error correction scheme at all, so it's ridiculously sensitive to noise. If you add 7,4 hamming code or some similar scheme, I imagine you could boost that by quite a lot.

  45. i think there would be lot of noise because of sound and we could not get the exact file at reciever end.
    i m trying it to do in matlab software.would it help...

  46. That's why in a real-world application you need an error correction scheme (for example *Hamming(7,4)* as I suggested in the previous comment --> http://en.wikipedia.org/wiki/Hamming_code ). You could probably also use a more effective bit-coding scheme (maybe multiple rotating carrier frequencies with *differential coding* -> http://en.wikipedia.org/wiki/Differential_coding )

    That's just some ideas on how to make this work.

  47. This is so awesome!
    Have you tried to improve the idea by using error correction?
    Do you consider doing it? I really think it's great if it proves to be something which can be put in practice.

  48. This would be good if it came in the form of a useable program.

  49. This is very nice and cool post.I was waiting for this type article and I have gained some useful information from this site. Thanks for sharing this information. Keep blogging.

  50. Nice to be visiting your blog again, Generic Metformin has been months for me. Well this article that i've been waited for so long. Generic Ultracet need this article to complete my assignment in the college, and it has same topic with your article. Thanks, great share

    Replies
    1. i have a same project too...if u could help me out with the resources it would be an immense help for me...

      incognitoarunavh@yahoo.com

  51. Is it possible to combine this with sound over a laser beam? That would be f*cking awsome

  52. @cruchstick:
    Dollar STORE RUN!

    tape a cheap $2 cat toy's button down and gator clip the two power leads. clip the sound wires to the gator clips. Its been done as a hobbist project on make I think. In a pinch, buy the 3 pack on ebay(red,green,blue or purple) of 5mw lasers..

  53. actually this is a CHEEP version of TAPE backup, I've actually been looking for. Well, as a test anyways. Can we up it to 192Khz? My HD AZAILIA card can handle that..

  54. I have Designed and implemented such a system using base band modulation. my system works with common mp3 players that are capable of recording sound. 3kbit/s with bit error probability less than 10^-4. both Matlab and C implementation are ready. there is friendly GUIs for both. my algorithm is a complete solution for which I have worked more than one year. I have taken into account speech codec which is usually adpcm, error correction,error detection and some other means to provide a reliable link for file transfer. major difference with your work is that my data rate is much higher and my error rate is much lower. also my design does handles lossy voice codec (It is usualy IMA ADPCM).
    If anyone is interested to have codes, this is my email
    v.jahandideh@gmail.com

  55. Sorry, can you give me the diagram of program "send file over sound card". Because although I have read your code but it 's quite difficult to understand. Thank you! My email: tatrungkienttk@gmail.com

  57. "Shall we play a game?"

    This project reminds me of the data modem used in the movie that has the above quote.

    need a hint --> http://youtu.be/ecPeSmF_ikc

  58. am pretty beginner...i tried running ur codes...now tht ubuntu doesnt support oss directly and uses pulseaudio there is no /dev/dsp and as a result the ioctl request codes are giving error...if u can post a solution for this i'll be glad...

  59. Someone is doing this to print images onto T-Shirts.

    http://www.psfk.com/2013/08/screen-printed-sound-waves.html
