
Yo  TMVW People, Pals, and fellow lovers of sangin'!

If you watch this video (starting after 5:05) past the alien stuff, the author explains some impressive voice technology advances made by Google.  It's interesting, ...

and I thought it was funny that the first thing I thought of is how now (soon), any person who can mimic the singing mannerisms of a famous singer (a good impression of their articulation) and has decent rhythm (those are "some" of the most basic skills) could turn on the effect and, out of the speakers, comes the Artist of their choice (on a drop-down menu, no doubt :) ): Dio, Mercury, Jackson, Elvis, take your pick! Actually having vocal cords that sound like the artist would no longer be required to book a tribute band gig!

Feels like a step beyond pitch correction, to be sure. Just fun with toys to me; however, I know there are purists who might find this a disgusting perpetuation of a digital cancer on musicianship.

anyway, just a crazy funny thought I had after watching the video.

 


If they are telling you about this (or anything) in the context of just a fun vocal toy etc., you can bet there are about 5 more levels to this technology that aren't revealed to the public.

 

Pretty scary, but it's child's play for certain agencies to sample a few words of someone's voice and then use that to do things such as make fake phone calls that would fool a person's own mother.


I better keep my thoughts to myself on this subject......they may be on to me......


    Actually, I don't buy it. Sure, if they made a digital sample of a voice reading those words in that order, you would get close to an exact copy. But if you just took a voice print and had the "Robot/AI/computer" read the words, you would not get as close a copy as you get in this video. Even the same person reading the same sentence on different days would use different inflections. These two samples were too close a match in inflection, speed and enunciation to have come from a voice print with the "Robot" reading the words and reproducing the inflection and tempo of the live human.

    If the video did reflect reality, which I doubt, at best the sentences "spoken" by the computer were "read aloud" by a different human and altered to match the "live human" it was imitating.

    Have you listened to someone read a book out loud? Live humans can't even read a text aloud and match the intended inflections of the written word until AFTER they have understood what they just read. A computer would still have mismatched inflections and dynamics of speech.


Super interesting, but speech AI is a far cry from singing AI.

Again, I find myself repeating this too frequently, speech is not singing. The articulations, frequency and acoustics are all very different. And I suspect if singing phonetics proves to be more complex for humans as a general rule, then it will remain so for AI as well. You'll have to code more robust algorithms to get a computer to sing just like DIO. My guess is, that is not what they have achieved here. This is speech recognition, speech AI, not singing... at least not yet. But no doubt, one day.

 

14 hours ago, Felipe Carvalho said:

Interesting, and scary, I would say the first samples on each test are the synth voice.

 

12 hours ago, The Future Vocalist said:

That's what I picked.

Felipe and Future V, I didn't try to tell them apart after one listen. I thought it would be futile, or maybe my ADD gives me an aversion to such close attention on this :)

1 hour ago, MDEW said:

A computer would still have mismatched inflections and dynamics of speech.

You make excellent points I didn't even think of, J . . . I mean, MDEW, but like Rob says, "someday" it's gonna happen.

And remember, just because you're paranoid does NOT mean they're NOT out to get you.

45 minutes ago, Robert Lunte said:

Again, I find myself repeating this too frequently, speech is not singing. The articulations, frequency and acoustics are all very different. And I suspect if singing phonetics proves to be more complex for humans as a general rule, then it will remain so for AI as well. You'll have to code more robust algorithms to get a computer to sing just like DIO. My guess is, that is not what they have achieved here. This is speech recognition, speech AI, not singing... at least not yet. But no doubt, one day.

I would not disagree with your points here Rob, all good to hear from your perspective. 

I guess the technology i'm imagining is a confluence of these voice "replication" advances by our lord and savior Google, and something along the lines of the TC-Helicon type harmonizer effect which sources the voice "sample" . . . "replication" . .  of the famous singer, as opposed to your own voice.  Then, you would mute the signal of your actual voice. 

This seems like less of a leap into the future to me . . . .  what do you think?

 


Google isn't my Lord and savior... SJW, discriminating, censoring racists. It also owns YouTube which enables defamation and IP violations of its creators. I do like their technology however.

 

13 minutes ago, Robert Lunte said:

Google isn't my Lord and savior... SJW, discriminating, censoring racists. It also owns YouTube which enables defamation and IP violations of its creators. I do like their technology however.

 

yeah, just a little thing i like to call, "sarcasm" Rob, wasn't tryin' to push your buttons, on the contrary, not a google fan.

back to my question?

38 minutes ago, Kevin Ashe said:

I would not disagree with your points here Rob, all good to hear from your perspective. 

I guess the technology i'm imagining is a confluence of these voice "replication" advances by our lord and savior Google, and something along the lines of the TC-Helicon type harmonizer effect which sources the voice "sample" . . . "replication" . .  of the famous singer, as opposed to your own voice.  Then, you would mute the signal of your actual voice. 

This seems like less of a leap into the future to me . . . .  what do you think?

This is along the lines of what I mentioned. You speak or sing the words and the signal is reconfigured to someone else's voice print. You still have the same problem as in singing: vowels. The "sampled" voice will carry different vowel harmonics than the human the sample was taken from. Example: if I speak the words "From Washington to Baltimore" using the voice print of Robert Lunte, the vowels and accent would be different than if Robert spoke them himself. Anyone who knows both me and Robert would be able to tell which was the real Robert and which was the voice print, because I say those words differently.

   I do not know about Robert's diction, but when I say "Washington" it comes out as "Warshing-ton", as opposed to someone else who says "Wash-ing-tin". My "Baltimore" is closer to "Bole-Ti-more" as opposed to "Ball-ti-mer". Those vowels would carry through the transmuted sample.

2 minutes ago, MDEW said:

This is along the lines of what I mentioned. You speak or sing the words and the signal is reconfigured to someone else's voice print. You still have the same problem as in singing: vowels. The "sampled" voice will carry different vowel harmonics than the human the sample was taken from. Example: if I speak the words "From Washington to Baltimore" using the voice print of Robert Lunte, the vowels and accent would be different than if Robert spoke them himself. Anyone who knows both me and Robert would be able to tell which was the real Robert and which was the voice print, because I say those words differently.

   I do not know about Robert's diction, but when I say "Washington" it comes out as "Warshing-ton", as opposed to someone else who says "Wash-ing-tin". My "Baltimore" is closer to "Bole-Ti-more" as opposed to "Ball-ti-mer". Those vowels would carry through the transmuted sample.

Yes, good explanation. I get the vowel mod challenge. This is why I'm identifying it as a hybrid of technologies: it would be a real-time (nanoseconds) interpretation of the sung words (voice recognition), which then accesses a database of matching words from the famous singer and voices them through the sampled/replicated voice.

so, what am I oversimplifying here?

On 1/5/2018 at 5:26 PM, Kevin Ashe said:

and I thought it was funny that the first thing I thought of is how now (soon), any person who can mimic the singing mannerisms of a famous singer (a good impression of their articulation) and has decent rhythm (those are "some" of the most basic skills) could turn on the effect and, out of the speakers, comes the Artist of their choice (on a drop-down menu, no doubt :) ): Dio, Mercury, Jackson, Elvis, take your pick! Actually having vocal cords that sound like the artist would no longer be required to book a tribute band gig!

 

 

19 minutes ago, Kevin Ashe said:

Yes, good explanation. I get the vowel mod challenge. This is why I'm identifying it as a hybrid of technologies: it would be a real-time (nanoseconds) interpretation of the sung words (voice recognition), which then accesses a database of matching words from the famous singer and voices them through the sampled/replicated voice.

so, what am I oversimplifying here?

    At first I was just thinking about this as an effect pedal or plug-in... But if this plug-in were linked to the internet with an extensive database, along with the other criteria you mention (a starting point of fair impersonation and timing), you should get pretty good results... Of course, you have the flip side that it will also be used for destructive purposes by people with personal agendas...

7 minutes ago, MDEW said:

    At first I was just thinking about this as an effect pedal or plug-in... But if this plug-in were linked to the internet with an extensive database, along with the other criteria you mention (a starting point of fair impersonation and timing), you should get pretty good results... Of course, you have the flip side that it will also be used for destructive purposes by people with personal agendas...

Right! Not sure if it would have to have wifi, with chips continuing to get as "nano" as they are. My guess is that the speed of recognizing the word being sung would be tough. If the database has every word/lyric ever sung by said famous singer (assuming the speed is not an issue), the software may struggle to distinguish between words that begin with the same vowel sound yet end differently (right?).

Like the words "fire" and "fighting"? Again, in the end, I'm thinking time will resolve these potential challenges.
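To make the "fire" vs. "fighting" worry concrete, here is a toy sketch in Python. It assumes a made-up four-word lexicon with invented phoneme spellings (nothing here comes from a real pronunciation dictionary or from the software in the video) and simply counts how many sounds have to arrive before only one word is still possible.

```python
# Hypothetical lexicon: phoneme spellings are invented for illustration only.
LEXICON = {
    "fire": ("F", "AY", "ER"),
    "fighting": ("F", "AY", "T", "IH", "NG"),
    "fine": ("F", "AY", "N"),
    "far": ("F", "AA", "R"),
}


def candidates(heard):
    """Return the words whose phoneme sequence starts with what was heard so far."""
    heard = tuple(heard)
    return sorted(w for w, ph in LEXICON.items() if ph[:len(heard)] == heard)


def phonemes_needed(word):
    """Count how many phonemes must arrive before `word` is the only candidate."""
    target = LEXICON[word]
    for n in range(1, len(target) + 1):
        if candidates(target[:n]) == [word]:
            return n
    return None  # never unique, e.g. the word is a prefix of a longer word


# After "F" + "AY" three words are still possible, so the software would have
# to wait for the third sound before it could pick "fire" or "fighting".
still_possible = candidates(["F", "AY"])
```

In this toy setup the recognizer cannot commit to "fire" until the third sound, which is the delay the post is guessing at. A real system would work on audio features rather than tidy symbols, so treat this only as an illustration of the ambiguity.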

2 hours ago, Kevin Ashe said:

Right! Not sure if it would have to have wifi, with chips continuing to get as "nano" as they are. My guess is that the speed of recognizing the word being sung would be tough. If the database has every word/lyric ever sung by said famous singer (assuming the speed is not an issue), the software may struggle to distinguish between words that begin with the same vowel sound yet end differently (right?).

Like the words "fire" and "fighting"? Again, in the end, I'm thinking time will resolve these potential challenges.

 My mind goes to the misuse of technology before the benefits. Having said that, given that every computer, TV and phone is capturing your voice prints and your words and "saving" them for "advertising" purposes... I am sure that a proper combination of words and inflections could be found by this wonderful AI/quantum computer to make a pretty good copy of anyone's voice for singing or other endeavors...


No amount of AI or quantum computing is going to achieve such conversion on the fly, unless it can look inside your brain. Such a conversion has to look ahead. A very simple example would be where two vowels sound the same in one accent, but different in another. If you are converting from the first accent to second, you would have to look ahead for context. That is just a simple case. When it comes to all the nuances and inflections involved in singing, the information for conversion is not available on the fly, without being able to read or control your mind.

It is similar to a language translator, like Google Translate, not being able to do an accurate translation word by word. A sophisticated translator has to look ahead for context (Google Translate does some, I believe).

However, with the help of the singer it would be possible to do the conversion on the fly. Just like working a mic, you can "work software" and get out whatever the software is capable of. The difference is that you would have to manipulate your voice to cue the software, so it doesn't need to guess ahead.
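The look-ahead point can be sketched with a toy converter. It borrows the well-known cot/caught merger as the example of two vowels that sound the same in one accent but differ in another. The sound spellings ("KAHT", "KAAT", "KAWT") and the one-word context rule are invented for illustration; the only thing the sketch is meant to show is that each token has to be held back until the next one arrives.

```python
from typing import Iterable, Iterator, Optional

# Toy assumption: the source accent says "cot" and "caught" identically ("KAHT");
# the target accent keeps them apart as "KAAT" (cot) and "KAWT" (caught).
AMBIGUOUS = "KAHT"
DETERMINERS = {"the", "a", "my"}


def resolve(token: str, next_token: Optional[str]) -> str:
    """Pick the target-accent form of one token, given the token that follows it."""
    if token != AMBIGUOUS:
        return token
    # Crude, hypothetical context rule: followed by a determiner -> the verb "caught".
    return "KAWT" if next_token in DETERMINERS else "KAAT"


def convert(stream: Iterable[str]) -> Iterator[str]:
    """Convert a token stream, emitting each token only after seeing the next one."""
    pending = None
    for token in stream:
        if pending is not None:
            yield resolve(pending, token)
        pending = token
    if pending is not None:
        yield resolve(pending, None)  # end of phrase: no context left to use


converted = list(convert(["she", "KAHT", "the", "ball"]))
```

The output always lags the input by one token, so a truly instantaneous conversion is not possible even in this simplified case. A singer who cues the software, as suggested above, would in effect be supplying the missing context up front.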

3 hours ago, kickingtone said:

The difference is that you would have to manipulate your voice to cue the software, so it doesn't need to guess ahead.

  Then why not just train to sing if you have to train to work the software?

   I have a big problem with AI and quantum computers. These scientists believe that AI will be able to "think" for itself using complex algorithms and such. They have too much faith in it and are giving it too much control. At best the algorithms are themselves programs and rely on the initial programming. Even if the AI is programmed to "learn" or "teach itself", it is supposed to be contained within certain parameters. Once it jumps those parameters or "teaches" itself new truths, the "programmers" are out of the loop.

   To bring this back into the world of music and singing... The AI may decide that your choice of Dio's voice would best be replaced with the voice of Celine Dion. AI, being the superior intelligence, may keep overriding your input. We are already being blocked from our own choices on Facebook and YouTube because of AI-driven algorithms that are supposedly censoring a specific type of unwanted programming while allowing other programming that is worse than what they are supposed to be blocking. Algorithms cannot replace human thought. Human thought may be faulty... but that is the beauty and the superiority of it. Those who are basically forcing AI on us not only believe it is superior but are also programming AI with that "belief": it is a computer, therefore it cannot make mistakes... Believe me... computers make mistakes, and sometimes it is not their programming but physical malfunctions...

14 hours ago, MDEW said:

  Then why not just train to sing if you have to train to work the software?

Impatience, laziness, pretence, frogs legs and puppy dog's tails, and whatever else is at the bottom of the cauldron, I guess. :lol:

A bit harsh, maybe. You could ask, why use a mic? Why not just train to sing louder? etc. etc.

Basically, I think that the software would make it easier for lazy people. :) 

For some people it is about "production" by hook or by crook. At the other end of the spectrum there are people who like to be more in touch with what they are producing.

14 hours ago, MDEW said:

   I have a big problem with AI and quantum computers. These scientists believe that AI will be able to "think" for itself using complex algorithms and such. They have too much faith in it and are giving it too much control. At best the algorithms are themselves programs and rely on the initial programming. Even if the AI is programmed to "learn" or "teach itself", it is supposed to be contained within certain parameters. Once it jumps those parameters or "teaches" itself new truths, the "programmers" are out of the loop.

   To bring this back into the world of music and singing... The AI may decide that your choice of Dio's voice would best be replaced with the voice of Celine Dion. AI, being the superior intelligence, may keep overriding your input. We are already being blocked from our own choices on Facebook and YouTube because of AI-driven algorithms that are supposedly censoring a specific type of unwanted programming while allowing other programming that is worse than what they are supposed to be blocking. Algorithms cannot replace human thought. Human thought may be faulty... but that is the beauty and the superiority of it. Those who are basically forcing AI on us not only believe it is superior but are also programming AI with that "belief": it is a computer, therefore it cannot make mistakes... Believe me... computers make mistakes, and sometimes it is not their programming but physical malfunctions...

Well, when you think of robots that are able to observe their environment for themselves, their "truths" are already at the mercy of fallible analogue technology. On top of that, you have programming bugs and program complexity, both of which lead to unpredictability of outcome and irreparably corrupt databases. Rumour has it that national telecoms systems have for decades demonstrated behaviours that nobody can explain. That is put down to accident of complexity. But we also have human factors...

People would be writing viruses. Robots will fall "sick". Maybe they would need robot clinics, staffed by other robots. (In a sense, diagnostic technology has already reached this point.) In the world of AI, many human factors can be mimicked. The best way of tackling such viruses and malware may be a form of software vaccination, etc. because the scope and scale of virus infection would be of a different order. Infected software could learn how to write viruses even in inscrutable machine code, and they could write viruses that write viruses, and they could hide inside vast data sets, inaccessible to human scrutiny. So, you would need computer cops, allowed to make decisions that humans have to trust, and with powers of arrest and destruction of software and data. Then we could have corruption, infiltration and spying at that level.

I have a saying. If we are afraid of a person who can do what we do, only better, then we are doing something wrong.

Humanity is being forced to look its crooked self in the mirror.

The voice comparisons in the video I believed were both artificial. But, even if one was human, I bet a computer would be able to tell!

If you were to quickly write, off the top of your head, a random thousand-digit number, a computer should be able to tell that it was done by a human. For example, we tend to prefer certain patterns of numbers that are buried in our subconscious, maybe bits of old addresses, dates, telephone numbers, etc. These would be represented with abnormally high frequency. There would also be other patterns that should appear occasionally, but which we somehow psychologically do not envisage. A computer could scan our number and pick up on these human traits. Now, could a computer be trained to mimic ALL such human traits, however nuanced, so that no other computer could tell between a human and the trained computer?
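A minimal sketch of the kind of scan described above, in Python. It assumes just two simple checks, which are only examples of "human traits" and not a complete detector: a chi-square statistic on how often each digit appears, and the rate of immediate repeats (a pattern people are often said to avoid when trying to look random). The function names are made up for this illustration.

```python
from collections import Counter


def chi_square_digits(digits: str) -> float:
    """Chi-square statistic of the digit counts against a uniform 0-9 spread."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in "0123456789")


def repeat_rate(digits: str) -> float:
    """Fraction of adjacent positions holding the same digit.

    For truly random digits this should hover around 0.1.
    """
    pairs = len(digits) - 1
    return sum(a == b for a, b in zip(digits, digits[1:])) / pairs


# A perfectly even (and obviously non-random) sequence scores 0 on the first
# check, which is why more than one test is needed.
even_score = chi_square_digits("0123456789" * 100)
```

With nine degrees of freedom, a chi-square value above roughly 16.92 would be unusual at the 5% level for random digits, and a repeat rate well below 0.1 would hint at a human avoiding doubled digits. Neither check alone proves anything; a serious detector would combine many such tests.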

At one time I was going to write a program that could ID a person from their typing profile. All you would have to do is type a paragraph and the computer would examine the pauses between various keystroke combinations, spelling errors etc. etc. But, to work well, the algorithm would have to be built parallel into a microchip, as its execution affects the thing it is monitoring.
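The typing-profile idea could be sketched along these lines. This is a toy version that assumes the key-press timestamps have already been captured by something else (which sidesteps the measurement problem raised above), and the function names and the distance measure are choices made for this illustration, not an established method.

```python
from collections import defaultdict
from statistics import mean


def digraph_profile(events):
    """Build a typing profile from (key, press_time_ms) events.

    Returns the mean gap in milliseconds for each pair of consecutive keys.
    """
    gaps = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        gaps[k1 + k2].append(t2 - t1)
    return {pair: mean(times) for pair, times in gaps.items()}


def profile_distance(p, q):
    """Mean absolute difference in ms over the key pairs both profiles share."""
    shared = p.keys() & q.keys()
    if not shared:
        return float("inf")  # nothing in common to compare
    return mean(abs(p[pair] - q[pair]) for pair in shared)


# Two people typing "the" with slightly different rhythms.
alice = digraph_profile([("t", 0), ("h", 100), ("e", 250)])
bob = digraph_profile([("t", 0), ("h", 120), ("e", 260)])
gap = profile_distance(alice, bob)
```

A smaller distance means the two samples have a more similar rhythm. A real identifier would need far more text, would account for spelling errors as mentioned above, and would need a threshold tuned on actual typing data.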

8 hours ago, kickingtone said:

The voice comparisons in the video I believed were both artificial. But, even if one was human, I bet a computer would be able to tell!

They were both artificial: a digital copy of an organic voice and a digital copy of a digital voice. "Is it live or is it Memorex?" (old TV commercial). They sound the same or similar... coming through a 4-inch TV speaker or radio...

 

8 hours ago, kickingtone said:

For some people it is about "production" by hook or by crook. At the other end of the spectrum there are people who like to be more in touch with what they are producing.

In the case of a human deciding to use this technology for musical compositions, it is no different than using a noise gate, reverb, autotune or MIDI drums. The part that I would be against is the whole artificial intelligence feature coupled with an internet database: another false reason for others to collect our voices and words and use them for other unknown reasons.


I think I have the ability to hear acutely as a coach and to some extent, as a producer and honestly, I don't think I heard a damn bit of difference. 


Not that I want to spoil the whole Asimov vibe, but I would not take this video too seriously with the whole X-Files thing.

On 08/01/2018 at 10:07 PM, MDEW said:

   I have a big problem with AI and quantum computers. These scientists believe that AI will be able to "think" for itself using complex algorithms and such. They have too much faith in it and are giving it too much control. At best the algorithms are themselves programs and rely on the initial programming. Even if the AI is programmed to "learn" or "teach itself", it is supposed to be contained within certain parameters. Once it jumps those parameters or "teaches" itself new truths, the "programmers" are out of the loop.

 

I don't really think any scientists believe that AI will think for itself, at least not at the current stage. I think the largest deep learning networks available today have around 1 billion connections; the human brain has around 100 billion neurons, each with over 5000 connections, plus the complexity of real neurons and glial cells when compared to artificial neurons, etc.

The AI is just good at identifying patterns; what to look for and what to do with it is decided by humans, at least for now ;)
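As a rough check on the scale gap, the figures quoted in this post can be multiplied out in a few lines of Python. The numbers are the poster's own estimates, not measured values, and the variable names are only for illustration.

```python
# Figures as quoted in the post (estimates, not measurements).
brain_neurons = 100_000_000_000        # about 100 billion neurons
connections_per_neuron = 5_000         # "over 5000" connections each
network_connections = 1_000_000_000    # about 1 billion connections

brain_connections = brain_neurons * connections_per_neuron
ratio = brain_connections // network_connections

print(f"Brain: about {brain_connections:.0e} connections")
print(f"That is about {ratio:,} times the network in the post")
```

On those estimates the brain has on the order of 5 x 10^14 connections, roughly 500,000 times the network size mentioned, before counting the extra complexity of real neurons and glial cells.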

1 minute ago, Felipe Carvalho said:

I don't really think any scientists believe that AI will think for itself, at least not at the current stage. I think the largest deep learning networks available today have around 1 billion connections; the human brain has around 100 billion neurons, each with over 5000 connections, plus the complexity of real neurons and glial cells when compared to artificial neurons, etc.

The AI is just good at identifying patterns; what to look for and what to do with it is decided by humans, at least for now ;)

I didn't mean to take us off track of the musical topic, but...

Scientists are already talking about AI reaching a singularity point. Most internet-type companies (Facebook, Google, YouTube), along with news feeds etc., are using AI-based algorithms to "censor" comments and videos on those platforms. They are basically plugged into each and every computer, TV, Alexa, Cortana, smartphone, tablet, these new smart speakers and ANY device that has voice recognition software and access to the internet. Those computers and devices ARE the neurons and connections of AI. For them the goal IS for AI to "think" for itself. Quantum computers and 5G connected to each and every device with internet access ARE the neural network.

I am not saying this.....Scientists and those working on Ai and quantum computers are.

I know that seems like a joke to most people but really give it some thought.


Of course they want it, or at least for it to pass a Turing test.

But right now technology is not capable of Strong AI. The algorithms are present on pretty much any technology where pattern recognition and simple decision making is useful, for example identifying the *kind* of music you like to hear and playing that for you. Which is a problem very hard to solve through other computational paradigms.

Even Pac-Man, one of the oldest computer games, used a form of AI to guide the actions of the enemies.

What exists now and hints at a direction to take for strong AI is deep learning, and more efficient algorithms that use less processing power/parallel processing, allowing larger networks and multiple abstraction levels (finding patterns within the patterns).

The problem on social networks is, in my opinion, humans. On the user side, people do not stop to think for two seconds that writing things down in public spaces is not the same as throwing away words at a bar counter, so if the companies do not regulate or moderate somehow, they could be dragged into disputes and lose money. And on the other side, each of these companies directs its resources in the most profitable manner it can; if certain manifestations get in the way of profit, do you think they would think twice?

In both cases, AI can surely be of help, since the amount of data is too large and sparse for humans to supervise and task-oriented programming cannot handle it, but you cannot hold AI responsible for the problems. And no, the connection of your cellphone to a data link does not mean it's part of a larger neural network. You may be collecting data to feed some sort of AI, but frankly most of the data companies collect about people is provided directly by them; no real need to spy...

46 minutes ago, Felipe Carvalho said:

you cannot hold AI responsible for the problems

I am not holding Ai responsible. I am holding those implementing Ai responsible. You want a certain type of data? Fine. Search within the confines of those particular sources.

No need to cross into random subjects and commentaries.

47 minutes ago, Felipe Carvalho said:

no real need to spy

And no need to save Twitter comments for the last 7 years. It was just announced that Twitter will no longer archive ALL comments. Why archive them at all? You have to agree to let these parties use the information gathered from your use as they see fit. This includes any random words heard by the "device" while in use, and since it is voice activated, that means everything within recording distance of the device. And these recordings are archived for the algorithms to access.

 

59 minutes ago, Felipe Carvalho said:

The problem on social networks is, in my opinion, humans. On the user side, people do not stop to think for two seconds that writing things down in public spaces is not the same as throwing away words at a bar counter, so if the companies do not regulate or moderate somehow, they could be dragged into disputes and lose money. And on the other side, each of these companies directs its resources in the most profitable manner it can; if certain manifestations get in the way of profit, do you think they would think twice?

And this is one of the biggest issues with Ai. Especially the ones programmed to "Learn" for themselves. Their "Learning" is from random individuals who know as much about the subject they are speaking on as I know about how to program Ai. Worse yet is if they are "Learning" from specific information that is supporting one point of view that is in the interest of those developing it.

 

On 1/5/2018 at 6:43 PM, MDEW said:

I better keep my thoughts to myself on this subject......they may be on to me......

More than likely...they are......

42 minutes ago, MDEW said:

More than likely...they are......

Surely they are, I just don't think it's dystopian like that. They are simply trying to figure out what to sell you haha. I would not be surprised if we both got a link trying to sell us a book about "The Future of AI" in the coming days.

