Topic: Bell Speakers

• danielm
Bell Speakers
07 July, 2012, 01:13:28 PM
Okay, so I hope this is the right place to post this, but I had an idea (and very little actual know-how) and wanted to know what people in the know thought of it. Okay, so here goes... stop me if any of my presumptions are faulty, but...

(1) recorded music is just a complex waveform over time (different frequencies superimposed on one another)
(2) "regular" speakers recreate this with an electromagnet and a diaphragm...

so here is the idea... could a recording of, say, a person singing, be discernibly (not perfectly by any means) reproduced using a rig as described below?

(1) x number of precisely tuned bells, such that, given the available frequencies, most frequencies within the human hearing range can be more or less recreated by some combination (I know there is an embedded math problem, but my uneducated instinct says some application of the birthday paradox (http://en.wikipedia.org/wiki/Birthday_paradox)) of the given bells
(2) each bell fitted with an individual ringing mechanism (servo motor?), possibly something resembling a bicycle spoke with multiple strikers on a wheel that can be spun quickly to emulate some sustain.
and
(3) a computer and program to translate the given audio file into something the bell rig can play.

Do you think this is possible? Are there any huge holes in my logic or understanding? Something I am not thinking of? Bear in mind you are speaking to someone who studied russian lit and only has a passing familiarity with this stuff. Thanks in advance for your time or comments.

• pdq
Bell Speakers
Reply #1 – 07 July, 2012, 01:47:45 PM
First off, you would need to be able to stop each bell ringing as quickly as you start it.

Second, a bell (any kind that I know of) doesn't emit a single frequency.

More importantly, you are not just talking about a bunch of independent sine wave sources. Each frequency needs to be in exactly the right phase with respect to its associated frequencies.

All of this makes what you describe pretty close to impossible.

• danielm
Bell Speakers
Reply #2 – 07 July, 2012, 01:54:10 PM
how disappointing, thanks pdq

• markanini
Bell Speakers
Reply #3 – 07 July, 2012, 02:00:53 PM
Sounds fun but it couldn't possibly be hi-fi. For starters, the partials in voices, instruments, etc. don't always correspond to the partials of bells, so the result would be many residual tones. It would be an awesome musical instrument though.

• danielm
Bell Speakers
Reply #4 – 07 July, 2012, 02:04:04 PM
markanini, yes, hi-fi is not the goal at all. God, I want to hear what it would sound like.

Bell Speakers
Reply #5 – 07 July, 2012, 02:41:32 PM
It's an interesting idea for sure. A similar attempt has been made to make a piano "speak".
Unfortunately the video is in German, but is subtitled and the "spoken" words are in English, so you'll get the idea.

• danielm
Bell Speakers
Reply #6 – 07 July, 2012, 04:10:03 PM
It's an interesting idea for sure. A similar attempt has been made to make a piano "speak".
Unfortunately the video is in German, but is subtitled and the "spoken" words are in English, so you'll get the idea.

this is incredible... yes, just like this but with bells!

• dhromed
Bell Speakers
Reply #7 – 08 July, 2012, 03:57:51 PM
Next step is the reverse: are there people trained in making any sound they like, as though they were speakers?

I imagine it'll sound like a telephone.

• mixminus1
Bell Speakers
Reply #8 – 08 July, 2012, 05:31:29 PM
Michael Winslow - who played Larvell Jones in the Police Academy movies - comes to mind.
"Not sure what the question is, but the answer is probably no."

• mzil
Bell Speakers
Reply #9 – 08 July, 2012, 07:16:24 PM
I suspect the piano speaking video is fake. [Although I don't speak German, so perhaps they explain more than I am aware of] It is a trick. We are NOT hearing just an unmodified piano. We are hearing either a gimmicked piano, or a secondary "enhancement" soundtrack has been mixed in. The giveaway is the clarity of some of the voiceless fricatives (such as the "/s/" in "responsible")
[Examples of voiceless fricatives may be heard in the demonstration videos of a face to the far right, here. Click "Fricative" and then try the five shown: http://www.uiowa.edu/~acadtech/phonetics/e...mp;scrollbar=no
should you need a better understanding of what they are]
which a piano would have a hard time emulating.

Here's another video and notice how much harder the words are to make out [in fact nearly impossible if you close your eyes and stop reading the text accompanying the sound.] Try it and see how few words you make out!

I suspect on this second version I found, they have scaled back the "enhancement track".

• Dynamic
Bell Speakers
Reply #10 – 09 July, 2012, 04:09:04 AM
Bizarre as it may seem I think it's probably real. I hadn't seen it on youtube before but heard this many months ago on a podcast - probably Scopes Monkey Choir. I think pareidolia (sp? - ie. expectation bias) helps us hear what we expect to hear (hence the words help). I think fricatives are plausible given the impulsive nature of the transient at the start of a piano note. The sustain and decay parts sound pretty tonal, which would be the case. I seem to recall the podcast talking about an academic paper written about this piece, and how, once the timing was tight enough with the electronic actuators, the choice of notes to be played was an attempt at a best fit for the spectrum of a superposition of piano notes to the time-varying spectrum of the human speech it was trying to reproduce. They, I think, played the original human speech, which I think was a child making a declaration about human rights or something before the European Parliament or something like that. It does help that piano notes can be curtailed rapidly by releasing the key and letting the damper mute the sound. I think the singing tone of the speech helps greatly to make it amenable to this form of reproduction.
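
A rough sketch of what such a best fit could look like for a single instant, not necessarily the method from that paper: treat each note as a fixed magnitude spectrum and look for non-negative strike strengths whose sum is closest to the target spectrum in the least-squares sense. The toy templates (partial h at h times the fundamental, amplitude 1/h), the names `note_template` and `fit_weights`, and the coordinate-descent solver are all assumptions made for illustration.

```python
def note_template(fundamental, bins=8):
    """Toy magnitude spectrum: partial h sits at bin fundamental*h with amplitude 1/h."""
    spectrum = [0.0] * bins
    h = 1
    while fundamental * h <= bins:
        spectrum[fundamental * h - 1] = 1.0 / h
        h += 1
    return spectrum


def fit_weights(templates, target, sweeps=200):
    """Non-negative least squares by cyclic coordinate descent (an assumed solver)."""
    weights = [0.0] * len(templates)
    for _ in range(sweeps):
        for i, template in enumerate(templates):
            # Residual of the target once every other note's contribution is removed.
            residual = list(target)
            for j, other in enumerate(templates):
                if j != i:
                    residual = [r - weights[j] * o for r, o in zip(residual, other)]
            best = sum(r * x for r, x in zip(residual, template)) / sum(x * x for x in template)
            weights[i] = max(0.0, best)  # a note cannot be struck with negative strength
    return weights


# Three overlapping toy notes; the target is a known mix of the first and third.
templates = [note_template(f) for f in (1, 2, 3)]
true_weights = [0.8, 0.0, 0.3]
target = [sum(w * t[b] for w, t in zip(true_weights, templates)) for b in range(8)]
weights = fit_weights(templates, target)
print([round(w, 3) for w in weights])
```

For speech, the same fit would presumably be repeated for every short time frame, with real measured note spectra in place of the toy templates.
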
Dynamic – the artist formerly known as DickD

• DVDdoug
Bell Speakers
Reply #11 – 09 July, 2012, 01:45:19 PM
Quote
(1) x number of precisely tuned bells, such that, given the available frequencies, most frequencies within the human hearing range can be more or less recreated by some combination (i know there is an embedded math problem,
danielm,

It almost sounds like you are describing the Fourier Transform.  So we know bells would not work, but if you had some sort of other sound-generating device for "every frequency"*, you could accurately reproduce the human voice (or any other sound).  There's nothing wrong with your thinking/reasoning...  You just don't understand the physics/acoustics of bells.    It's nearly impossible (maybe entirely impossible) to make a mechanical device that vibrates at a single pure tone without ringing.    You can do it with electronics, but building such a device is, of course, not practical.

The Fourier Transform (or FFT = Fast Fourier Transform, or DFT = digital Fourier Transform)  is used "everyday" in DSP (digital signal processing), including audio processing.    But, the data is converted-back into the "normal" time-domain before being converted to analog and sent to a loudspeaker.

* There is no such thing as "every frequency", since frequency is not an integer value...  It's a continuous real value (like distance), and there are an infinite number of frequencies within the audio range.  But, our ears/brains don't have infinite resolution.
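
To make the idea concrete, here is a minimal sketch of that round trip: a short signal is split into its sine components and then rebuilt from them. It uses a direct O(n^2) transform rather than a real FFT, and the function names `dft` and `idft`, the 64-sample length and the two test tones are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform: one complex value per frequency bin."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform: sum the weighted sinusoids back into time-domain samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


n = 64
# Two superimposed tones: 3 cycles and 7 cycles per block.
signal = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 7 * t / n)
          for t in range(n)]

spectrum = dft(signal)
rebuilt = [v.real for v in idft(spectrum)]

# The two strongest bins below Nyquist are the two tones that were mixed in.
strongest = sorted(range(1, n // 2), key=lambda k: abs(spectrum[k]), reverse=True)[:2]
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(sorted(strongest), max_error)
```

Each bin here plays the role of one "single-frequency device"; the rebuild only works because every bin keeps both its strength and its phase.
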

• benski
• Developer
Bell Speakers
Reply #12 – 09 July, 2012, 02:36:54 PM
But an FFT has both a magnitude component and a phase component (more accurately, a sine and cosine response from which magnitude and phase can be computed).  Getting the frequency-magnitude aspect correct is the easy part.  In fact, some digital additive synthesizers such as the Kawai K5 had a "resynthesis" feature that would create the required harmonic envelopes to roughly match a sampled waveform.  But the precise control over phase was not present.
For a purely mechanical device, the largest limitation would be the length of sound that could be recreated.  In order to reproduce a longer sound, you would need more sound generating devices (just as the needed size of an FFT scales linearly with the number of time-domain samples).
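
As a small illustration of the sine and cosine responses mentioned above, the sketch below correlates a test tone with a cosine and a sine at one bin and derives magnitude and phase from the pair. The helper name `sine_cosine_response`, the block length and the test tone are assumptions; the scaling is only valid for bins strictly between 0 and n/2.

```python
import math


def sine_cosine_response(x, k):
    """Correlate x with a cosine and a sine that complete k cycles per block."""
    n = len(x)
    c = 2.0 / n * sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(x))
    s = 2.0 / n * sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(x))
    return c, s


n = 128
# Test tone: amplitude 2, 5 cycles per block, phase offset 0.7 radians.
x = [2.0 * math.cos(2 * math.pi * 5 * t / n + 0.7) for t in range(n)]

c, s = sine_cosine_response(x, 5)
magnitude = math.hypot(c, s)   # strength of the component
phase = math.atan2(-s, c)      # offset relative to a pure cosine
print(round(magnitude, 6), round(phase, 6))
```

A bank of bells could in principle be struck at the right strength (magnitude), but the second number is the part that has to be timed to a fraction of a cycle.
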

• Porcus
Bell Speakers
Reply #13 – 10 July, 2012, 12:13:37 AM
It's nearly impossible (maybe entirely impossible) to make a mechanical device that vibrates at a single-pure-tone without ringing.    You can do it with electronics, but, building such a device is, of course, not practical.

Following this train of thought:

A loudspeaker working this way, would have a very large number of speaker elements, each doing 'only one frequency'.
That is, essentially an N-way loudspeaker, for very large N (finite, by the limitations of human hearing) and very steep crossovers – DSP'ed, I'd guess.

But the single-pure-tones (i.e. sines) constitute but one basis for the vector space. Who says we should use that one? We could replace it with your favourite basis (wavelet, whatever ... it need not even be orthogonal!), and feed each of the N loudspeaker elements one basis vector.

Now who says an array of bells cannot form such a basis? 'Practicalities' would be this 'who' of course, but in principle? Indeed, the talking piano is a projection down to a subspace of dimension eighty-something, with piano strings for the loudspeaker elements.

And reducing dimensionality – that is, reducing N – is really a kind of lossy compression, decoded 'at the loudspeaker level' – or if you like, in the air in front of the elements. And by choosing a different basis, you might optimize to reduce artifacts (of which the piano had a few ... oops, TOS#8).
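
A toy sketch of the non-orthogonal case, under assumed "bell-like" basis vectors (two exponentially decaying sinusoids; the frequencies, decay rate and names `bell_atom` and `dot` are made up for illustration). Plain inner products give the wrong coefficients because the two vectors overlap, while solving the 2x2 Gram system recovers the mix exactly.

```python
import math


def bell_atom(freq, decay, n=200):
    """Assumed bell-like basis vector: a decaying sinusoid (freq in cycles per sample)."""
    return [math.exp(-decay * t) * math.sin(2 * math.pi * freq * t) for t in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


b1 = bell_atom(0.05, 0.02)
b2 = bell_atom(0.08, 0.02)
target = [2.0 * u + 3.0 * v for u, v in zip(b1, b2)]

# Naive projection, which would only be correct for an orthogonal basis.
naive = (dot(target, b1) / dot(b1, b1), dot(target, b2) / dot(b2, b2))

# Least-squares projection: solve G c = r, where G holds the pairwise inner products.
g11, g12, g22 = dot(b1, b1), dot(b1, b2), dot(b2, b2)
r1, r2 = dot(target, b1), dot(target, b2)
det = g11 * g22 - g12 * g12
coeffs = ((r1 * g22 - g12 * r2) / det, (g11 * r2 - g12 * r1) / det)
print(naive, coeffs)
```

With N elements the same idea becomes an N x N system, and a target outside the span of the elements yields its closest approximation rather than an exact match.
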

So ... here is a research project:
- build such a loudspeaker. Heck, some manufacturer of computer-grade speakers should sponsor this, I doubt we will need the high-end.
- Pick various basis choices. Play. Listen. Pick more bases. Tune. ABX with different music (what statisticians call out of sample).
- and whatever you do, don't forget to youtube it!

• dhromed
Bell Speakers
Reply #14 – 10 July, 2012, 04:02:12 AM
DFT = digital Fourier Transform

*discrete fourier transform.

• 2Bdecided
• Developer
Bell Speakers
Reply #15 – 10 July, 2012, 05:16:39 AM
Of course it's possible. The question is only how bad it would sound (i.e. how close an approximation to the original you can create).

With bells you'd get a horrible racket because of all the harmonics, though you could try to take account of that in the analysis and design bells with purer/nicer harmonics (that's been done).

If you used something closer to a sine wave with easily controlled start+stop (e.g. blowing air down a tuned tube), it would probably be easier.

Regarding phase: if you FFT something, reset all the phase information, and then reconstruct the waveform with this (zeroed) phase information, the result sounds horrible, but not unrecognisable. The block length of the FFT imposes its signature strongly on the output in this simple experiment, but that effect could be reduced. You don't have to use a fixed block length FFT. You can even use the data from different block lengths at different frequencies.
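
A minimal sketch of that experiment on a synthetic signal, assuming a fixed block length of 32 and non-overlapping blocks (the names `dft`, `idft` and `zero_phase` are illustrative, and a direct transform stands in for a real FFT). Each block keeps its magnitudes and has every phase set to zero before being rebuilt.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def zero_phase(x, block):
    """Rebuild x block by block from magnitudes only (all phases set to zero)."""
    out = []
    for start in range(0, len(x) - block + 1, block):
        magnitudes = [abs(c) for c in dft(x[start:start + block])]
        out.extend(v.real for v in idft(magnitudes))
    return out


block = 32
signal = [math.sin(2 * math.pi * 2 * t / block)
          + 0.5 * math.cos(2 * math.pi * 5 * t / block + 1.0)
          for t in range(2 * block)]
flattened = zero_phase(signal, block)

# Per-block magnitude spectra are unchanged, but the waveform itself is not.
original_mags = [abs(c) for c in dft(signal[:block])]
flattened_mags = [abs(c) for c in dft(flattened[:block])]
magnitude_error = max(abs(a - b) for a, b in zip(original_mags, flattened_mags))
waveform_error = max(abs(a - b) for a, b in zip(signal, flattened))
print(magnitude_error, waveform_error)
```

Every rebuilt block peaks at its first sample, which is one way to see how the block length stamps itself onto the output.
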

Cheers
David.

• danielm
Bell Speakers
Reply #16 – 16 July, 2012, 07:13:16 PM
Okay, I get that bells ring at a number of frequencies... but isn't that over time? If you were to cut off the sound almost immediately after striking, isn't the beginning of the sound (the attack, I believe you audiophiles call it) a fairly uniform frequency? Like, after ringing the bell it starts somewhere and gradually shifts frequencies as it loses energy? Or is that a complete misunderstanding, and they all always are ringing at multiple harmonics?

• 2Bdecided
• Developer
Bell Speakers
Reply #17 – 17 July, 2012, 05:32:12 AM
They are all always ringing at multiple harmonics.

The brief initial attack itself is percussive, and has even less well defined tonal qualities. It's more like a click.

Cheers,
David.

• mzil
Bell Speakers
Reply #18 – 14 August, 2012, 07:04:28 PM
I suspect the piano speaking video is fake. [Although I don't speak German, so perhaps they explain more than I am aware of] It is a trick. We are NOT hearing just an unmodified piano. We are hearing either a gimmicked piano, or a secondary "enhancement" soundtrack has been mixed in. The giveaway is the clarity of some of the voiceless fricatives (such as the "/s/" in "responsible")
[Examples of voiceless fricatives may be heard in the demonstration videos of a face to the far right, here. Click "Fricative" and then try the five shown: http://www.uiowa.edu/~acadtech/phonetics/e...mp;scrollbar=no
should you need a better understanding of what they are]
which a piano would have a hard time emulating.

Here's another video and notice how much harder the words are to make out [in fact nearly impossible if you close your eyes and stop reading the text accompanying the sound.] Try it and see how few words you make out!

I suspect on this second version I found, they have scaled back the "enhancement track".

Peter Kirn, who knows much more about music synthesis than I do, concurs with me: on a second listen, he noted in an edit to his text that an enhancement track is mixed in, which he believes to be simply the original speech. We are NOT hearing just a piano, all by itself, in the YouTube video:

"Edit: Listening again, the short answer to how you can hear so much of the voice through the piano seems to be, you can’t; the original is almost certainly mixed in. It’s nonetheless an interesting effect, and I’d like to hear the piano on its own."

http://createdigitalmusic.com/2009/10/the-...-audio-to-midi/

• knutinh
Bell Speakers
Reply #19 – 15 August, 2012, 03:12:18 AM
Actually, the Hammond organ predates the additive synthesizers by several decades (and some lesser known instruments before it).

It had a "free-running" "sine-generator" of 91 pitches, and each key could mix 9 harmonically related pitches using a global set of mixing parameters ("drawbars").

Phase was free-running (fixed phase relationship between the 91 sines; pressing a key basically just triggered an envelope). Interestingly, when playing polyphonically, any sine that was used in several places would (necessarily) have the same phase everywhere. Also, mechanical limitations meant that pitches had to be rounded, not necessarily well-tempered.
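
A toy model of that arrangement, with heavy simplifications: the generators are assumed to be exactly equal-tempered (the real gear ratios are not), only four of the drawbars are modelled, and the names `generator`, `organ_sample` and `generators_used`, the base pitch and the sample rate are all made up for illustration.

```python
import math

SAMPLE_RATE = 8000
BASE_HZ = 55.0                      # assumed pitch of generator 0
HARMONIC_OFFSETS = [0, 12, 19, 24]  # semitone steps nearest to harmonics 1, 2, 3, 4


def generator(index, t):
    """Free-running sine: its phase depends only on the global sample count t."""
    freq = BASE_HZ * 2 ** (index / 12)
    return math.sin(2 * math.pi * freq * t / SAMPLE_RATE)


def organ_sample(keys, drawbars, t):
    """Mix the shared generators for every held key using one global drawbar setting."""
    total = 0.0
    for key in keys:
        for offset, level in zip(HARMONIC_OFFSETS, drawbars):
            total += level * generator(key + offset, t)
    return total


def generators_used(key):
    return {key + offset for offset in HARMONIC_OFFSETS}


# Two keys an octave apart draw on some of the very same generators,
# so those components are necessarily in phase with each other.
shared = generators_used(0) & generators_used(12)

# The "third harmonic" is rounded to 19 semitones, slightly flat of a true 3:1 ratio.
cents_off = 1200 * math.log2(3) - 1900

chord = [organ_sample([0, 12], [1.0, 0.5, 0.25, 0.25], t) for t in range(64)]
print(sorted(shared), round(cents_off, 3))
```

Because nothing restarts when a key goes down, the relative phases inside a note are whatever the generators happen to be at that moment.
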

Whenever a classical composer makes a score for 100 musicians. Isn't that sort of the same thing? Synthesizing some complex waveform using a large(ish) set of other complex waveforms. Now, composers might not think of the instruments as vectors in a large space, and musicians might not be so strict about following the score as a computer.

-k

• Porcus
Bell Speakers
Reply #20 – 15 August, 2012, 03:37:40 AM
Whenever a classical composer makes a score for 100 musicians. Isn't that sort of the same thing? Synthesizing some complex waveform using a large(ish) set of other complex waveforms. Now, composers might not think of the instruments as vectors in a large space

Those 100 are certainly not  playing e.g. voice. Probably not only because the vector space is too small, but also because it may fail to be closed under vector operations

@ mzil: Thanks for the update, you killjoy

• knutinh
Bell Speakers
Reply #21 – 15 August, 2012, 04:54:55 AM
Whenever a classical composer makes a score for 100 musicians. Isn't that sort of the same thing? Synthesizing some complex waveform using a large(ish) set of other complex waveforms. Now, composers might not think of the instruments as vectors in a large space

Those 100 are certainly not  playing e.g. voice. Probably not only because the vector space is too small, but also because it may fail to be closed under vector operations

Sure, but some classical composers seem more interested in recreating certain "timbres" than in following established notions of tonality and rhythm. The art of composing for orchestra might consist of having a mental model of how an orchestra reacts to "stimulus" (score), an idea of how one wants the final waveform to sound, and then doing an inverse lookup to figure out how the score must be. Or I might be totally wrong and composers might just make a "pretty polyphonic song" on their piano, delegating each voice to an instrument.

As musicians are not machines, I doubt that it is possible to control an orchestra with the precision/predictability that is needed in order to make a convincing voice simulation. After all, the score is not a MIDI-file, but a coarse suggestion that is further reinterpreted by conductor and musicians. I would love to be proven wrong, though.

One might claim that church organs are crude "non-sinoid" additive synthesis instruments.
"A typical and distinctive sound of the organ is the cornet, composed of a flute and ranks making up its first four overtones, sounding 8', 4', 2 2/3', 2', and 1 3/5'."
http://en.wikipedia.org/wiki/Organ_stop

I think that this topic is interesting, and extends beyond purely additive synthesis. Say that you have got a synthesizer at your disposal. It offers a large set of parameters that generally inter-op in complex, non-linear ways. One parameter may choose between a large set of sampled waveforms. Another may set the cutoff frequency of a filter. A third may allow mixing different simple/complex oscillators. How do you resynthesize any waveform onto this synthesis engine (minimizing e.g. squared error) except for the obvious brute-force way?
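
For reference, the brute-force baseline could look something like the sketch below. The two-parameter "synth" (one of three waveforms through a one-pole low-pass) is an invented stand-in for a real engine, and the names `oscillator`, `lowpass`, `synth` and `brute_force_fit` as well as the parameter grid are assumptions.

```python
import math


def oscillator(shape, n=64, period=16):
    """One of three basic waveforms, selected by a discrete parameter."""
    out = []
    for t in range(n):
        pos = (t % period) / period
        if shape == "sine":
            out.append(math.sin(2 * math.pi * pos))
        elif shape == "square":
            out.append(1.0 if pos < 0.5 else -1.0)
        else:  # "saw"
            out.append(2.0 * pos - 1.0)
    return out


def lowpass(x, alpha):
    """One-pole low-pass filter; alpha closer to 1 means a higher cutoff."""
    y, state = [], 0.0
    for v in x:
        state += alpha * (v - state)
        y.append(state)
    return y


def synth(shape, alpha):
    return lowpass(oscillator(shape), alpha)


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_fit(target, shapes, alphas):
    """Render every parameter combination and keep the one closest to the target."""
    candidates = [(shape, alpha) for shape in shapes for alpha in alphas]
    return min(candidates, key=lambda p: squared_error(synth(*p), target))


target = synth("square", 0.3)
best = brute_force_fit(target, ["sine", "square", "saw"], [0.1, 0.3, 0.5, 0.9])
print(best)
```

The number of renders grows as the product of the grid sizes, which is exactly why this approach stops being practical once an engine has more than a handful of parameters.
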

-k

• zima
Bell Speakers
Reply #22 – 18 August, 2012, 01:15:26 AM

...any guesses as to how many "notes" might be needed in this case, to give an impression of human speech?

• Porcus
Bell Speakers
Reply #23 – 20 August, 2012, 08:29:57 AM
One might claim that church organs are crude "non-sinoid" additive synthesis instruments.

The number of stops could also be fairly impressive. Most likely you could get an organ to do much better talking than an orchestra could. (Why does System of a Down's “Cigaro” keep popping up in my brain?)

• DonP
• Members (Donating)
Bell Speakers
Reply #24 – 20 August, 2012, 11:19:37 AM
This is a hipshot speculation to spur discussion.

Doesn't the ear, like our eyes, have a number of discrete wavelength receptors (hair cells)?  Unlike the eye, there are many more, and I don't know any reason to think that the wavelengths are the same for different people since the response would be based on size rather than chemistry.

Anyway, it would seem that with the correct frequencies (or "primary pitches") you could give "full spectrum" sound in the same way 3 primary colors can represent the whole visible spectrum.