
By Bijan Bowen

ChatGPT Advanced Voice Test: Gimmick or Game-Changing Tool?

Though this may be a bit embarrassing to admit, considering that my current focus revolves around conversing with AI, I have not actually used any of the voice functionality that ChatGPT has integrated. I never bothered to check whether I had early access to the advanced voice features and sort of overlooked them. A big part of this is that, from my understanding, the feature was best used on a mobile device, and I absolutely loathe using anything with a touch screen. My experience with ChatGPT from December 1, 2022 (when I first used it) until today has been solely through the web interface.

Advanced Voice Mode Rollout Completed

After hearing more and more about the newly rolled-out advanced voice feature, I figured it was time to jump in, download the app, and see how this voice capability fared. Since I wanted to make a video of my testing, I ran the app on my old Google Pixel 5 so that I could film my escapades with my main phone. After downloading ChatGPT from the Play Store and logging in, I was able to see the newly released voices and browse through them.

Swiping through the available voices

There are 9 available voices, representing a fairly decent variety of ages, styles, and tones. Swiping through them plays a brief demonstration of how each one sounds, so you can pick the one you like best. After swiping through and hearing all of my options, I decided to pick "Maple", as it sounded the most attractive for ERP. Sorry, autocorrect: as it sounded pretty good*

Since I am sure a lot of people have already experimented with the voice feature for strictly conversational purposes, I wanted to do something a bit more interesting (to me, at least) and see if the voice assistant could actually walk me through a task step by step. To date, I have only ever followed instructions visually, meaning I read them or look at pictures and then replicate what I have read or seen. I figured this would be a good test of both the model's knowledge and the feasibility of being instructed in a strictly conversational manner.

I have my reasons

For my test, I decided to grab a ~2007 black MacBook and ask for assistance in removing the RAM. That MacBook has a specific design not commonly found on other laptops, so the model had no way to guess its way through the task; it would either know how to guide me or not. It began with typical steps, such as shutting the machine down and ensuring it was not plugged in, and then things got good. It tasked me with fiddling with the battery latch, but I had to slow it down a bit and describe what I was seeing to ensure that I had identified the correct component.

A MacBook battery!

I will not waste time summarizing the rest of the steps, since I made an entire video to document the process, but I will say that the voice mode correctly walked me through them and even helped me correctly interpret the label on the RAM it had just helped me remove. I role-played as a tech noob for the purposes of the demonstration, as I wanted to see how well it would assist someone with a task that may be new to them.

Impressive work by the voice mode

While filming some B-roll shots to better show the parts of the laptop it had instructed me to remove, I asked it if it could speak Farsi, to which it replied that it could. Since I was already filming, I decided to engage in a short and simple conversation with it, entirely in Farsi. The conversation went well, and it honored my suggestion to stick to simpler topics, as my mastery of Farsi sits somewhere between bad and good. I was very impressed here and believe that the ability of this technology to teach people foreign languages will far exceed anything that has existed to date.

A conversation in Farsi

I did try a few other things, mainly asking it to tune my guitar by telling me whether a plucked string was high or low, but it would not analyze any sounds (its words, not mine). I also attempted to get it to role-play as someone telling a wealthy relative that they must change their will to include me, and it refused that as well. I find this model a bit more censored, but considering the potential nefarious uses, I can understand why that decision was made, though it would be fun to have one that would happily talk shit with you.

You can view the video for this article on my YouTube Channel
