Actually, speech is not a series of discrete phonemes but a continuous, modulated flow of vocalized sound.
You get a notion of how this sounds when you play speech backwards.
In truth, it is the listener who superimposes his or her own template (expectations) of phonemes on perceived running speech, and so turns the continuous flow into apparent sequences of discrete phonemes.
That's one of the factors that make foreign languages so difficult for adults to learn.
As an English speaker, when I hear Japanese, I automatically try to fit the stream of sounds into English phonemes. As a result, I fail to hear many important phonemes that do not exist in the English language.
But what do babies, who have not had time to develop a template of phonemes, hear? They simply hear the continuous modulated flow of speech. How, then, do they develop the template?
NOTES: Hear what a baby hears.