Tumgik
#voicebank development
generalnuisance0 · 4 months
Text
i dont think people realize how much worse recording airy vbs is as opposed to vbs with very little air
e5 in arachne's recluse neo vb was child's play to record but recording anything other than middle c in her dark and whisper tones makes me want to actually kill myself and i cant do it without a gallon of tea on standby
7 notes · View notes
auspicious-voice · 20 days
Text
Fuwa Maria AI & Fuwa Mario AI for DiffSinger Progress Report (May 2024)
Hello!! With both Maria and Mario's DiffSinger voicebanks fully trained, I'd like to give a bit of detail on what I'm doing next for the eventual voicebank release, including future version releases. It's been a busy April on my end as usual, but I feel like I'm almost done with things. It's a bit of a short post, though.
As usual, everything is under the cut.
Voicebank Progress
Maria and Mario's DiffSinger 1.0.0 voicebanks are fully trained and as such, they're ready for release. Of course, they'll receive new updates such as new languages, tweaks to certain parameters, and other new developments the DiffSinger development team has on the table.
Speaking of which, maybe a couple of months after 1.0.0 is released, expect version 1.1.0 in the works, with the brand-new Rectified Flow algorithm (meaning faster rendering times) and more language support. I've been gathering information on the best training settings when it comes to tension and pitch, and maybe I can just train Maria and Mario's datasets together instead of separately.
Demo Reel Progress
Half of the demo reel audio is done~ I'm getting a head start on the artwork, though I think I might end up drawing it all on my phone. For the video itself, I still haven't decided whether I should use After Effects or Alight Motion, but I think I might end up going with the latter.
I am hoping that I can finish the reel by the end of June ^^;
2 notes · View notes
linabirb · 1 month
Text
seeing synthv lite and flt covers brings me so much joy.. like wow.. i can make cool stuff even with the free voicebanks.. even if they sound more robotic than the full ones..
5 notes · View notes
dead-byte · 10 months
Text
I wish there was like... a program that could read the oto.ini file of an UTAU vb, and then, chop up the associated wav files so that they only contain the oto'd bits, and re-allocate the oto values accordingly. Thereby hopefully significantly minimizing the size of the vb.
If y'all have ever seen the samples in any of VOICE-MiTH's Chinese voicebanks, kinda like that.
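The bookkeeping half of this is pretty small, for what it's worth. Below is a rough Python sketch, not a finished tool. It assumes the usual oto.ini line layout (`file.wav=alias,offset,consonant,cutoff,preutterance,overlap`, all in milliseconds), and that a positive cutoff is measured back from the end of the file while a negative one is a length measured from the offset. The names `OtoEntry`, `parse_oto_line`, `used_span`, and `rebase` are made up for illustration. Reading the file (many Japanese banks store oto.ini as Shift-JIS) and actually cutting the wav are left out.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    wav: str
    alias: str
    offset: float        # ms from the start of the wav
    consonant: float     # ms, fixed region measured from offset
    cutoff: float        # ms; >= 0 counts back from file end, < 0 is -(length from offset)
    preutterance: float  # ms, measured from offset
    overlap: float       # ms, measured from offset


def parse_oto_line(line: str) -> OtoEntry:
    """Parse one 'file.wav=alias,offset,consonant,cutoff,preutterance,overlap' line."""
    wav, rest = line.strip().split("=", 1)
    fields = rest.split(",")
    nums = [float(x) if x.strip() else 0.0 for x in fields[1:6]]
    nums += [0.0] * (5 - len(nums))  # tolerate missing trailing fields
    return OtoEntry(wav, fields[0], *nums)


def used_span(entry: OtoEntry, wav_ms: float) -> tuple[float, float]:
    """Return (start, end) in ms of the part of the wav this entry uses."""
    start = entry.offset
    if entry.cutoff >= 0:
        end = wav_ms - entry.cutoff
    else:
        end = entry.offset - entry.cutoff  # cutoff is negative, so this adds its length
    return start, end


def rebase(entry: OtoEntry, wav_ms: float, pad_ms: float = 0.0):
    """Return (cut_start, cut_end, new_entry) for a wav trimmed to the used span.

    pad_ms keeps a little extra audio on both sides (an assumption, for safety).
    """
    start, end = used_span(entry, wav_ms)
    cut_start = max(0.0, start - pad_ms)
    cut_end = min(wav_ms, end + pad_ms)
    new_entry = OtoEntry(
        wav=entry.wav,
        alias=entry.alias,
        offset=start - cut_start,   # offset is now relative to the trimmed file
        consonant=entry.consonant,  # these three are relative to offset, so unchanged
        cutoff=-(end - start),      # negative form no longer depends on file length
        preutterance=entry.preutterance,
        overlap=entry.overlap,
    )
    return cut_start, cut_end, new_entry
```

One catch this sketch ignores: a wav referenced by several oto lines (common in VCV banks) would need either the union of all the spans or one trimmed file per alias, since each line is treated on its own here.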
2 notes · View notes
websitesdotcom · 2 months
Text
Doing stuff with utau is so fun but it takes SO LONGGG
0 notes
waffulaa · 8 months
Text
youtube
Yuezheng Longya's Official Birthday Song
Official Bilibili Upload
1 note · View note
cantheykillmacbeth · 9 months
Note
Hatsune Miku could kill MacBeth
Yes, Hatsune Miku from Vocaloid could kill Macbeth!
She qualifies under all three clauses: the Gender Clause due to being a girl; the Unconventional Birth Clause due to being a software voicebank; and the Birth Parent Clause due to her creator being male software developer Sasaki Wataru! Thank you both for your submission!
206 notes · View notes
ukgk · 2 months
Text
SSP PLUGIN RECOMMENDATIONS
Do you want to customize and expand your desktop buddy experience further? Here are some handy links to miscellaneous plug-ins I’ve gathered from around the web. You can even program your own, and since they can be written in any programming language, the possibilities are limitless! Plug-ins are essentially extensions or add-ons built for SSP. I’m not a plugin developer myself, and have yet to test each one of them for extended periods of time, so please refer to the readme files/instructions provided by the developers (GitHub usually has info) on how to use them if you get stuck or encounter issues. These are just some of the more recently updated ones; I'll be adding more to the plugin page of my blog if you're interested.
Weather Station by Zicheq (of Ukagaka Dream Team) A plugin for both users and devs, for getting weather data! As a developer, you can set your ghost up to receive weather data from this plugin, to then do what you will with! Weather based comments? Outfit changes? Something else totally unrelated? It’s up to you! This plugin will handle the messy details of the user inputting their location and gathering the weather data for you. … (read more here)
Discord Rich Presence by Ponapalt (main dev of SSP baseware) This plugin displays the name of the primary ghost you have open in the ‘currently playing’ status of the Discord for Windows application in real time. It is also compatible with displaying your currently played song in FLUX (a really awesome music player ghost by Zi).
CeVIO-Talker V2 Plug-in by Ambergon This plug-in was initially revealed for Day 21 of the Ukagaka Advent Calendar collaborative project in 2022. Using this, you can have a fully voiced ghost with a realistic-sounding voicebank speak to you out loud! (in English too?) It requires CeVIO Creative Studio and SSP 2.6.45 (or newer) to work. CeVIO is vocal synthesizer software commonly compared to Vocaloid and UTAU that works via a text-to-speech method. The primary difference between Vocaloid and CeVIO is that CeVIO is built for both TTS/speech and creating vocals for songs in music production. You can download a demo of CeVIO here if you would like to try it out.
GhostSpeaker by apxxxxxxe Like CeVIO-Talker, this plug-in was revealed for the Ukagaka Advent Calendar collaborative project, in this case for Day 17 in 2023. It’s a successor to the Bouyomi-chan plug-in and uses the free (Japanese) text-to-speech software VOICEVOX and COEIROINK so that your ghost can verbalize their balloon dialogue and speak to you. You can listen to a demo in this GitHub link.
GhostWardrobe by apxxxxxxe allows you to dress up your ghost in different coordinates, mix and match pieces and save and load the outfit combinations from the plugin menu.
CharameL plugin by Umeici This software allows you to enjoy watching ghosts directly interact and chat amongst each other freely on the built-in instant messenger.
58 notes · View notes
vocaloidfactoftheday · 10 months
Text
Even though she has no known voicebanks in development, SeeU had a crowdfunding campaign for a collaboration album to celebrate her 10th anniversary in 2021. New merchandise was made to be distributed to crowdfunders, including an acrylic keychain and a plushie.
(source: Vocaloid Wiki)
195 notes · View notes
Text
Virtual Character Tourney - Battle for 9th! (and 10th!)
Propaganda below (May contain spoilers!)
Kasane propaganda:
HER DREAM WAS TO ONE DAY BECOME A REAL VOCALOID AND SHE FINALLY DID IT!!!!!! ITS NOT A VOCALOID VOICE BANK BUT ITS A FULL SYNTH V VOICEBANK!!!!! AND A NEW DESIGN!!!!! SHE DID IT SHE GOT HER DREAMS!!!!!! YOURE NEVER TOO OLD TO ACCOMPLISH YOUR DREAMS!!!!!!!
Kasane Teto is a vocal synth. She started out as an April Fools' joke to parody VOCALOID, with her voice bank in UTAU. Although she did start out as just a joke, a lot of Vocaloid fans grew to really love her and she became rather popular. Kasane Teto is to UTAU as Hatsune Miku is to VOCALOID. But recently, on Kasane Teto's 15th anniversary, April 1st 2023, she got moved from UTAU to SynthV. With her voice bank now in SynthV, she also got a new character design, and her voice and singing sound much clearer and more human-like than her UTAU voice bank, which sounded a lot more mechanical/robotic.
ART propaganda:
ART (Asshole Research Transport, nicknamed by Murderbot), formally known as the space ship The Perihelion (in italics but this is a Google Form), also known as Peri (nicknamed by its human family) is a super illegal highly advanced AI that was created by a university. It grew up with two human dads and a human sister. It and its crew go on research trips that are cover for allying with people and communities at the edges (and beyond) of the capitalist hellscape that is the Corporate Rim. It also goes on espionage missions by itself, without its human crew and family, posing as an automated cargo ship. It was during one of these missions that it picked up Murderbot, a super-duper illegal bot-human security unit construct that had hacked the torture device implanted in all bot-human constructs so that it could disobey orders and walk away from its "owners" without dying. Murderbot uses its illegal freedom to watch television, a habit it passes on to ART. Turns out ART doesn't like shows where human crew members get hurt.
ART is the AI that controls/is the research and teaching vessel Perihelion. (Perihelion is usually what people call it, but the protagonist of the series calls it ART so that's the name I put. ART stands for Asshole Research Transport.) It is extremely intelligent and advanced and also extremely sarcastic and condescending. 100% earned the name ART. ART will do absolutely anything for its crew!! It was developed and "raised" alongside the captain's daughter, Iris, and they're like siblings. Its crew calls it Peri. They do corporate espionage on the side to help bring down said corporations. It has a "debris deflection system" which is definitely not a weapon because ART isn't legally allowed to have a weapon. Definitely just for debris, don't worry about it. It's friends with the aforementioned protagonist, Murderbot, and ART is very good at bullying it into actually leaving its comfort zone when it needs to. They care about each other a lot, and they like to binge watch TV shows together. I don't want to write too much but I just love it a lot.
Ene propaganda:
She's blue. Headphone actor and yuukei yesterday are also bangers
Epic gamer cybergirl. Miku adjacent
She's a girl that was forced to become digital but is still a good friend. She may not have a body anymore but she's still important to the plot.
Murder-Bot 2.0 propaganda:
Sapient computer virus made from bits of two other AI characters (the original Murderbot and a spaceship AI). Unlike its not-parents, it is genuinely just code and doesn't have a physical body. Its only physical presence is through its effects on the machinery it infects, and it considers its "body" to be the code rather than any combination of physical objects. Also it was literally made to cause problems on purpose, does so enthusiastically, gives several people including its creators existential crises, and saves one of its creators (and other people from the (literal) fallout of the other creator learning the first one got killed)
Murderbot 2.0 is sentient killware created by Murderbot and ART with the purpose of being sent on a suicide mission. It has some of Murderbot's memories, but not all because it doesn't have any hardware of its own to store that much information in. It travels by hopping in between other computer systems (mostly bots and bot-human constructs). It named itself Murderbot 2.0. It freed a security construct named Three. It's nicer and more open than both its parents.
EDI propaganda:
EDI is the AI of the Normandy starting with Mass Effect 2. Through dialogue EDI can become more human-like in her way of thinking, developing different kinds of relationships with the crew. In Mass Effect 3 she uploads herself into a body so she can freely move around and can be taken to missions, but she is still part of the ship's system.
Holly propaganda:
Due to a pay dispute with Holly's original actor, Norman Lovett, Holly was instead played by Hattie Hayridge during seasons 3-5. This was explained briefly in the show as them having gone through a "computer sex change". This makes Holly canonically trans do not @ me.
holly is the silliest most specialest ai ever. she has an iq of 6,000 but sometimes it seems like his iq is more like 6. they're possibly transgender (do computers have gender??) (i am panicking over pronouns while writing this propaganda) - holly goes from appearing like a man to appearing like a woman with no real explanation(??) and nobody questions this (the show is from the 90s btw). he's hilarious and sometimes lies to the crew for no reason other than 'its a laugh, innit'. shes everything to me <3
Holly is the computer of Red Dwarf, a Tenth Generation AI hologrammatic computer who appears as a floating head on a screen. Can be downloaded onto various other devices. also literally transgender.. meets a female appearing parallel version of itself in a parallel universe and then goes through a sex change after falling in love with her. transgender computer ftw
Tama propaganda:
Tama is the eyeball of Kuruto Ryuki and investigates dream worlds with him. She's his bi emotional support eye who regularly ties him up to help him with stress relief and loves to affectionately tease him. She laughs at bad jokes and has AE10D1F ("Ryuki" in hexadecimal) in her likes on her profile.
OKAY anyway uhm she's like aiba in that she's a little Ai eyeball that helps you investigate except sadly no animal theme. she has a dominatrix vibe instead!!!! she is so cool… also ermm she's a lot more. Human than aiba. Not literally/physically like uhh emotionally. I haven't finished aini but like she does look out for your best interest! what a good Ai partner i don't kno
She's voiced by Anairis Quiñones and she's an absolute legend
Lyla propaganda:
she is a humanoid ai programmed to help spider-man gather info. she can simulate human emotions and has a high intellect
84 notes · View notes
vocalsynthbdays · 10 months
Text
happy birthday gakupo(vocaloid 2) !!!!!!! [jul 31]
[image: gakupo (v3)]
Camui Gackpo (aka Kamui Gakupo) is a japanese synth developed by Internet Co, and released in 2008. he is voiced by GACKT, and illustrated by Kentaro Miura. "gakupo" is the name of the character mascot, while "gackpoid" is the name of the voicebank/ the product being sold. on 31 jul 2008 his v2 was released, on 13 jul 2012 his v3 native, power, and whisper, on 26 feb 2014 his vocaloid neo native, power, and whisper, and on 30 apr 2015 his v4 native, power, and whisper. gakupo is one of the most popular vocaloids. the name "Gackpoid" means "Gackt-like VOCALOID"; "gack" from "gackt", "-ppoi" means "-ish" or "like [thing]", and the ending "-oid" comes from "vocaloid". gakupo's character item is an eggplant, which is featured with his nendoroid. the character on his fan and clothes is a stylised version of the kanji for "music", and it can be read as "gaku".
[images: dears. version of packaging; logo; v2; v3/ v4 power; v3/ v4 native; v3/ v4 whisper; concept art; art from official website]
105 notes · View notes
auspicious-voice · 4 months
Text
Surprise! A quick test of Maria's DiffSinger beta, trained at 42k acoustic and 80k variance!
She's underbaked just a bit since this is a beta voicebank after all, but I am loving how she sounds as an AI voicebank. Eventually her final build will sound less scuffed and have more features implemented. Her vocal modes sound pretty alright so far, but hopefully they'll be more pronounced once I make the final build.
That being said, I'll be going back to labeling once again... 💀💀💀
6 notes · View notes
babbybones · 2 months
Text
so i've been thinking... maybe undertale fans should be more open to checking out UT AUs! i know some of us might be inclined to think they're not as good as "the real thing," but there's really a rich, creative community behind them. not only that, but it's fun to make your own too! here are some of my own recommendations:
you can't go wrong with the old classic Tachimukae! Kimi wa Kakkoii! (Face It! You're Cool!) the animation is a love letter to the worldwide UTAU community of the time (2009) and features cameos of practically every UTAU that existed back then, including those made by young amateurs overseas. this was a Big Deal when it dropped.
youtube
i still really love the original song null by fractalsleuth:
youtube
and if you were wondering, UTAU isn't limited to japanese. english voicebanks are possible too!
youtube
you can get started with UTAU by installing either the open source, multi-platform OpenUtau, or by switching your locale to Japanese and installing the original UTAU for Windows. i'm personally used to UTAU, but OpenUtau is in active development so i think it's worth trying out :)
you can think of a very simple UTAU voicebank as a folder containing .wav files of Japanese syllable sounds, like か.wav (ka) and so on. so just get your nicest microphone, a reclist (recording list), and start recording! OREMO and SetParam are highly recommended software for recording and configuring UTAU, as they're made specifically for it.
you'll find many ways to record an UTAU, but a CV (consonant-vowel) reclist is definitely my recommendation for an absolute beginner. even Wayne can do it...
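as a rough illustration of the "folder of .wav files" idea, here's a small Python sketch that checks which reclist entries still need recording. it assumes one .wav per reclist entry, named after the entry (so か needs か.wav). `missing_recordings` and the folder layout are made up for the example and aren't part of UTAU, OREMO, or SetParam.

```python
from pathlib import Path


def missing_recordings(reclist, bank_dir):
    """Return the reclist entries that have no matching .wav in bank_dir."""
    # collect the names of every recorded sample, minus the .wav extension
    recorded = {p.stem for p in Path(bank_dir).glob("*.wav")}
    # keep reclist order so the result reads like a to-do list
    return [entry for entry in reclist if entry not in recorded]
```

calling something like `missing_recordings(["か", "き", "く"], "my_voicebank")` (with a hypothetical folder name) would give back whichever syllables don't have a file yet.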
19 notes · View notes
magicalgirlsirin · 5 months
Text
an oral history of vocaloid
ive seen a lot of (very misguided) discussion about vocaloid/vsynth in regards to AI voices discourse, so i thought it would be a good idea to sit down and explore vocaloid as a software, as well as mentioning other software of the same genre, to give people who dont really know much a better understanding
first and foremost: i dislike AI voices that are in unregulated spaces right now. it's devastating for actors to find their hard work on some website for anyone to use without compensation, and it shows a lack of respect for the effort it takes in the field.
however, vocaloid has a much longer history that pre-dates these aggregate sites. vocaloid software was first released in 2004, and was initially marketed towards professional musicians. vocaloid's second version of the engine, however, decided to broaden the market towards general consumers, pitching it as helpful software to those who wanted to produce music, but didn't have the personal skill or ability to have someone else sing for their music (range, note holding, etc). amateur musicians wouldn't know how to direct someone to tackle a lyric per se, but using software would be easy to learn and they would learn the terminology associated with certain performance decisions.
in vocaloid 2's era, miku was released. miku's voice provider is Saki Fujita, a well respected voice actress who actually does a lot of work in anime as well as video games! the popularity of miku is its own separate post of history, but the explosive nature of it, i would argue, is the reason that vocaloid and other commercial voice synthesizer software ultimately ended up geared towards all consumers instead of just professional musicians. (crypton and yamaha did absolutely still cater to professional musicians, having private or non released banks only for certain companies/contractors to use though).
flash forward, and technology has developed way further. in 2013, cevio released, and in 2017, synthV debuted. by this point, vocal synthing has expanded from just singing software to also include software intended for just speaking (voiceroid by AHS software) and the idea of an AI bank to improve the quality and clarity of voice banks is becoming more feasible.
however, i wouldnt say the developments in AI voices came strictly from this side of things. in fact, i distinctly remember back in the early 2010s, people were using websites with voice models of characters like glados (portal) and spongebob. these audio posts were seen as novelties, and admittedly theyre fun just to mess around with (and people often find the spongebob rap music that yourboysponge makes to be pretty well done!), but they do lead the way to better developed technology that doesnt compensate the artist...
so back to vocaloid. the thing about vocaloid (and all vocal synthesizers) is that contracts are in place to give appropriate time and compensation, along with permission to even use the person's voice. saki fujita continues to update miku's voicebank because she is being paid well to do so. this can be said for all vocal synth products. because these companies (crypton, ahs software, internet co, etc) specialize in making these tools and products for it, they have the appropriate knowledge on what proper compensation looks like. a random person grabbing a "raiden shogun genshin ai voice" model has none of those things. the voice actress doesnt get money off of that. its stolen work. AI can be used ethically, but it has to be done with regulation.
im leaving out specifics on certain vocaloids/vsynthesizers since its tangential to this post at best, but im making this so people have a better understanding of the history and intended usage of vocal synthesizer software. thank youuuuu
29 notes · View notes
radio-ghost-cooks · 7 months
Text
utau/vocaloid I want to see used more often
I've found that the male/masculine voicebanks are often under-appreciated, and I really love the sound of a lot of them!! my favorites:
Matsudappoiyo (utau)
VY2 (vocaloid)
all of the Zola Project (vocaloids; Yuu, Kyo, and Wil)
Dex (vocaloid)
I'm gonna include Fukase and Oliver here bc as popular as they're becoming, I still feel like they're underutilized (vocaloids)
Longya (vocaloid, CHN voicebank)
Yukone Ruko's masc voicebank!!! people use their fem one a lot but i'd love to see some really skilled masc tuning too!! (utau)
im also gonna include Gakupo bc I feel like he was big in the 00's-10's but I don't see him much anymore (vocaloid)
shuu mawarine (utau)
KYE (utau)
i understand that some of the vocaloids fell out of favor after their voicebanks were never updated, but in some cases it couldn't be helped (like in Gakupo's case; GACKT literally can't sing atm, he developed a serious case of dysphonia about a year ago and several other autoimmunity issues. I feel bad for him :[ )
if anyone has song recommendations for either og songs or covers let me know!!!!
34 notes · View notes
lesbian-forte · 2 months
Text
Criticisms of Vocaloid and why I like SynthV
I'm not trying to change anyone's mind here, but I would like to say my piece after certain takes seem to miss the point entirely. This might be a bit of a rant.
Vocaloid has gone stagnant in recent years. Yamaha doesn't care. Yamaha doesn't need Vocaloid and is a large corporation that gets much more money off of their DAW software and actual instruments as opposed to something as niche as vocal synths that are both only big in Japan and also only if they're in the top ten or so.
Yamaha stopped putting effort into Vocaloid during the V4-V5 transition. There is a reason V4 has so many cancelled voicebanks. Several developers were working on V4 and Yamaha rendered their devkit suddenly worthless. Devs would have to purchase a V5 devkit and start work over, or quit Vocaloid. And as vocal synth companies are generally very small, few of them would want to continue or even be able to afford it.
So they moved. Miku splitting off for Piapro gave them an opening, and others started looking for alternatives. Then IA went to CeVIO. And more and more. And by the time V5's sun was setting, all the third parties that worked on that were gone too.
But for a while, you didn't hear much from most of them. If a company released a V4 at the tailend of its lifespan or a V5, they had to wait for exclusivity to expire. And Yamaha's exclusivity deals are harsh (ending distribution of existing song voicebanks in the case of utaus with the same VP) and long, borderline predatory. So voices that companies wanted to update couldn't receive them until those expired, or else refresh that deal and stay constrained by a company that didn't even want to bother with them.
So, come V6, Yamaha was desperate. Internet Co had made an ultimatum that if a Vocaloid 6 didn't come out soon, then they'd be going too. That was their last, and after Crypton packed their bags, most important third party. So they accelerated their plans and looked at what the new guys were doing to be so successful.
They took the wrong lesson.
AI is not inherently better. Sample-based voicebanks will always have their place. Traditional samples can allow an unnaturally large range and harsher voice acting than would be possible to maintain. AI is more accurate to the voice provider, and you have a greater degree of freedom with its tone, plus updates and additional features are so much easier- but Yamaha took 'AI' at face value and made a low-quality copy that sounds significantly worse than prior Vocaloid versions and pushed it with Gumi. They could have stuck to improving their concatenative synthesis render quality further- that's what SynthV started as, R1 was just a very well-rendered sample-based program that is probably just a fancy utau under the hood.
But Vocaloid jumped on the bandwagon by doing the absolute bare minimum and claiming the ear-grating engine noise that can cause actual nausea is remaining faithful to the 'Vocaloid sound' even though styrofoam on the mic and a sometimes pleasant metallic twang sound nothing alike. They didn't improve accessibility, V6 has the same stability issues as V5, and the shiny new feature Vocalochanger is just RVC but worse.
Then, less than a year after product launch, they start up VxB and don't do anything to improve the software they're actively selling. Internet Co themselves called this out in the form of a Gumi tweet. Then Internet Co got in talks with Tokyo6 and saw a possible out, so they gave it a go. They're still under contract by Yamaha so what they can do is limited, but we saw them stray as well. And the result is a much better quality version (though arguably still worse than her V4) despite being an exact port.
We're still getting a Gumi Solid V6 because V6 can't do emotions and they still have to be separate banks, and VxB still got a major update even though it's dying in April with radio silence for V6 development, while CeVIO/VoiSona is releasing 2.0s that get major acclaim like Ci Flower's reputation totally getting turned around, and SynthV is sitting pretty with several voicebanks announced and several coming out in December alone, and the most recent in-progress update including both voice-to-midi (which is what vocalochanger should've been) and Spanish.
I do not like V6, V5, or Yamaha. It could've been amazing for Gumi or Una to get updates so they'd have crosslang (or better crosslang) capabilities, as I work in English. But the result was extremely poorly implemented and Yamaha has made no effort to fix that.
I use SynthV all the time, I'd do the same for CeVIO if it offered crosslang as well rather than just dictionaries and a couple English banks. I'm not against trying new things. But I either want the other programs to have the things to suit my needs in a quality manner that's intuitive to use or for the voicebanks I love to get versions on programs that already do. It's not that complicated.
The jokes and the yammering from rabid SynthV fans dissing Vocaloid can get to be too much sometimes. But you have to consider where that actually comes from. It's in response to suddenly being spoiled with a cheap, accessible, high-quality program when the expensive, poorly constructed, difficult one has been dominating the market with anti-consumer, anti-third party practices for years.
P.S.: Also you can do robotic tuning and mixing on realistic vocal synths, it's called doing the same thing as before and then adding it in post. You think utaites swallow vocoders or something? No, they just use different tools to get the same result as engine noise. Not fighting the voice when you're trying to go for realistic vs manual tuning and adding some very easy effects on when doing it the other way around is better, actually.
14 notes · View notes