I'm not sure I follow how that is possible. Unless I'm mistaken we aren't talking about pure sine waves here. How can the D/A converter reconstruct a waveform from a source where most of the information has been lost? I understand that we aren't hearing discrete samples on output, given that the D/A has to construct a flowing, contiguous analogue waveform. However, it would have to estimate what occurs between those two samples and unless we are talking about pure sine waves here, it would lose information, or create an approximation at the very best.
Well, haha, basically the point is, no information has actually been lost! If you're interested in some of the nitty gritty, I've written it up below. You don't have to understand all the math, but I've tried to write it clearly enough that you can at least get the gist.
And we're not talking about pure sine waves, but I'll try to simplify some math for you and explain why it's still relevant. First of all, Joseph Fourier showed that you can decompose functions (or an analog signal) into a sum of sine waves with phase offsets. So given any analog signal, or discretized signal, we can view it as a composition of sine waves.
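If it helps to see that decomposition happen, here's a minimal sketch in plain Python. The eight sample values are made up for illustration, and I'm using a brute-force discrete Fourier transform rather than anything clever. It splits the signal into cosine components, each with an amplitude and a phase offset, then adds them back up.

```python
import cmath
import math

# Made-up sample values, purely for illustration.
signal = [0.0, 0.9, 0.3, -0.4, -1.0, -0.2, 0.6, 0.1]
N = len(signal)

# Brute-force discrete Fourier transform: one complex number per frequency bin.
spectrum = [
    sum(x * cmath.exp(-2j * math.pi * k * n / N) for n, x in enumerate(signal))
    for k in range(N)
]

# Each bin is one sinusoid: an amplitude and a phase offset.
components = [(abs(X) / N, cmath.phase(X)) for X in spectrum]

# Add the sinusoids back together, sample by sample.
rebuilt = [
    sum(a * math.cos(2 * math.pi * k * n / N + p)
        for k, (a, p) in enumerate(components))
    for n in range(N)
]

max_err = max(abs(a - b) for a, b in zip(signal, rebuilt))
print("largest difference between original and rebuilt:", max_err)
```

The difference should come out at floating-point rounding level, so the sum of sinusoids really is the same signal.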
This is actually part of how we talk about linear filters, and the reason that we're able to easily analyze filters like lowpass, bandpass, any of the nice outboard EQs, etc. A linear filter is a filter, call it H, so that H(a + b) = H(a) + H(b). In other words, say you mix some drums together with some vocals and put it through a reverb or an EQ or something. The output will be the same as if you had individually applied the reverb or EQ to the vocals and to the drums, then summed them together after. This is what it means to be a linear filter.
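Here's a small sketch of that drums-and-vocals claim, assuming a very simple three-tap FIR filter as a stand-in for the EQ. The tap values and the "drums" and "vocals" numbers are invented for the example.

```python
def fir_filter(signal, taps):
    """Apply a plain FIR filter (direct convolution), same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

taps = [0.25, 0.5, 0.25]  # hypothetical gentle lowpass, stand-in for an EQ
drums = [1.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.25, -0.5]
vocals = [0.2, 0.4, 0.1, -0.3, -0.2, 0.0, 0.3, 0.1]

# Path 1: mix first, then filter.
mix = [d + v for d, v in zip(drums, vocals)]
filtered_mix = fir_filter(mix, taps)

# Path 2: filter each track on its own, then sum.
sum_of_filtered = [
    a + b for a, b in zip(fir_filter(drums, taps), fir_filter(vocals, taps))
]

max_diff = max(abs(a - b) for a, b in zip(filtered_mix, sum_of_filtered))
print("largest difference between the two paths:", max_diff)
```

Both paths give the same output up to rounding, which is all that H(a + b) = H(a) + H(b) is saying.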
Now, how does this come back to sine waves? It turns out sine waves (and more generally complex exponentials) are "eigenfunctions" of linear filters, as long as the filter also behaves the same way at every point in time (time-invariant, like a fixed EQ or a plain delay). Don't be put off by the fancy name. It just means that a sine wave, put through such a filter, can only have two things done to it: the amplitude can be scaled by a constant factor, and the phase can be shifted by a constant amount. The frequency never changes. So for example, a 1 ms delay is a linear filter. A pitch shifter is not this kind of filter, because it changes the frequency.
So let's put it all together: a linear filter, given a sine wave, is only going to change its amplitude and phase by a constant amount. A signal can be decomposed into a sum of sine waves. And because linear filters are linear, we can decompose a signal into sine waves, and look at how the filter will operate just on those sine waves, to fully characterize how it works. This is why you see magnitude and phase responses. A magnitude and phase response totally characterize a linear filter.
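To make the "only amplitude and phase" claim concrete, here's a rough sketch that pushes a 1 kHz sine through a simple three-tap FIR filter and compares the output with a sine of the same frequency, scaled and shifted by the filter's response at that frequency. The taps, the 48 kHz sample rate and the test frequency are all arbitrary choices of mine.

```python
import cmath
import math

def fir_filter(signal, taps):
    """Apply a plain FIR filter (direct convolution), same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

taps = [0.25, 0.5, 0.25]      # hypothetical filter, chosen for illustration
fs = 48000.0                  # sample rate in Hz (assumed)
f = 1000.0                    # test sine frequency in Hz (assumed)
w = 2 * math.pi * f / fs      # frequency in radians per sample

x = [math.sin(w * n) for n in range(200)]
y = fir_filter(x, taps)

# Frequency response of the filter at this one frequency.
H = sum(h * cmath.exp(-1j * w * k) for k, h in enumerate(taps))
gain, phase = abs(H), cmath.phase(H)

# Prediction: same sine, amplitude scaled by gain, phase shifted by phase.
predicted = [gain * math.sin(w * n + phase) for n in range(len(x))]

# Skip the first few samples, where the filter is still filling up.
start = len(taps) - 1
max_err = max(abs(a - b) for a, b in zip(y[start:], predicted[start:]))
print("gain:", gain, "phase (radians):", phase, "largest mismatch:", max_err)
```

Repeat that for every frequency and you have the magnitude and phase response of the filter.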
Most importantly you can think of any linear filter in terms of what it would do to an individual sine wave.
The A/D and D/A conversion is pretty much linear, mathematically speaking, particularly in the sense that converting the sum of two digital signals to analog, and converting the two signals individually to analog then summing, will give you the same result. Practically it is not because the converters aren't perfect. But what this means is we can talk about how they work on sine waves, and that will tell us how they work on anything!
Now, your intuition is that to interpolate a discrete signal and get an analog one, you have to approximate. Again, the Nyquist-Shannon sampling theorem says that you don't, because no information is actually lost as long as the signal is band-limited, meaning it has been lowpassed so that it contains nothing at or above half the sample rate (which it has been). As an analogy, think of drawing a bunch of line segments. You could encode an entire line and have a whooooole lot of points. In fact there are infinitely many points in that line, so you can't store them all digitally. But you could just store the endpoints. And if you wanted to, you could recreate that line exactly from the endpoints. You're not losing any information even though you're seemingly throwing a bunch of points away. Even though it seems more complex, you can do the same thing with sine waves. And as we established, any signal is a sum of sine waves, and a linear filter operates on a signal just as if it were operating on those waves individually. So if we have a filter that can interpolate between the samples of each of those sine waves, it can interpolate the whole signal.
More mathematically, the way we do that is with a [windowed] sinc filter, which is just an ideal brickwall lowpass filter. If our sampling frequency is, say, 48000 Hz, we want a sinc filter with a cutoff at 24000 Hz. An ideal mathematical sinc filter is infinitely long and has an infinitely steep cutoff. In the real world that doesn't exist, but we can use a sinc filter that's long enough that it really doesn't matter. And the sinc filter is able to interpolate between all the points of this signal to recreate the analog signal. If it helps you to think about why, the sinc filter itself is actually the sum of equal-amplitude cosine waves at every frequency from 0 Hz to the Nyquist frequency, which is half the sample rate, in this case 24000 Hz.
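Here's a minimal sketch of that interpolation, under a few assumptions of mine: the "analog" signal is a made-up sum of a 1 kHz and a 5 kHz sine, the sinc is simply truncated to the 2001 samples we have rather than windowed, and we only check points near the middle of the block, where the truncation hurts least.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def analog(t):
    """Stand-in for the original analog signal (made up, well below Nyquist)."""
    return (math.sin(2 * math.pi * 1000.0 * t)
            + 0.5 * math.sin(2 * math.pi * 5000.0 * t + 0.3))

fs = 48000.0   # sample rate in Hz, so the Nyquist frequency is 24000 Hz
num = 2001     # how many samples we keep (a finite stand-in for "infinitely long")
samples = [analog(n / fs) for n in range(num)]

def reconstruct(t):
    """Estimate the signal at time t as a sum of sincs, one centred on each sample."""
    pos = t * fs
    return sum(s * sinc(pos - n) for n, s in enumerate(samples))

# Compare against the true signal at times that fall between two samples.
center = num // 2
offsets = [0.1, 0.25, 0.5, 0.75, 0.9]
max_err = max(
    abs(reconstruct((center + o) / fs) - analog((center + o) / fs))
    for o in offsets
)
print("largest error between samples:", max_err)
```

The leftover error comes from cutting the sinc off after a finite number of samples. Keep more samples, or use a properly windowed sinc, and it shrinks further.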
I can't give you a much more in depth answer on how or why this interpolation actually works without getting into some in-depth math but I will if you'd like. If not, try to think of it in terms of that line-drawing analogy I gave you, keeping in mind what I said about how linear filters operate on sine waves. In particular, can you imagine how you could, given a bunch of discrete equally-spaced points from a sine wave, mathematically reconstruct the analog sine wave?
Let me know if you need any clarification on anything, like I said.
One last thing: in the real world it doesn't work exactly like these mathematical idealizations. Most importantly, the idealized sinc filter doesn't exist, which is why we get a little bit of digital aliasing. When digital was new, 44.1k wasn't really enough because the A/D converters were pretty awful. These days they're very good and 44.1k is probably enough, because the aliasing is so minimal.