![GPT-4o & Gemini 1.5 Pro Finish Each Other’s Sentences](http://ominousindustries.com/cdn/shop/articles/bobstalk.jpg?v=1717583818&width=2000)
![GPT-4o & Gemini 1.5 Pro Finish Each Other’s Sentences](http://ominousindustries.com/cdn/shop/articles/bobstalk.jpg?v=1717583818&width=2000)
· By Bijan Bowen
GPT-4o & Gemini 1.5 Pro Finish Each Other’s Sentences
I wanted to make another Bob the Sentient Washing Machine to engage in some tomfoolery by having two of them speak to each other. I have had a lot of fun making my R1 Robots speak to each other using uncensored models, but unfortunately, I cannot post those videos anywhere without risking every non-self-hosted platform I post them on getting immediately banished from existence...
![Two R1s Speaking](https://cdn.shopify.com/s/files/1/0805/4204/0384/files/r1speak_480x480.jpg?v=1717583620)
In keeping with the theme of testing the newest and most powerful AI models, I decided to hook one Bob up to OpenAI GPT-4o, and the other to Google Gemini 1.5 Pro. To facilitate easy communication without having to do too much work, I decided to run each on a Raspberry Pi 4 with similar Python scripts.
![The SSH session showing communication between the Bobs](https://cdn.shopify.com/s/files/1/0805/4204/0384/files/bobconsole_480x480.jpg?v=1717583620)
The scripts were simple and allowed the two Bobs to "speak" to one another. Using simple UDP state management messages, the Bob that was speaking would send a UDP message to the other, informing it to listen, and vice versa. Using this simple method, I was able to get them to speak and listen in turn, utilizing the Microsoft Azure speech services for the TTS (text to speech) and STT (speech to text) functionality.
After a bit of troubleshooting (caused by neglecting to add a UDP port bind statement in one of the scripts), I had them working well enough to have some fun. To make something that I could actually post, I opted for a rather vanilla prompt for both models. The prompt for the models was, "You are having a conversation; it can lead anywhere." I capped each model at a very low 20-token max output, as it made the conversation quicker and allowed for more leeway in the level of precision that the "listening" state needed to function properly.
![The GPT4o Prompt](https://cdn.shopify.com/s/files/1/0805/4204/0384/files/gptprompt_480x480.jpg?v=1717583620)
![The Gemini 1.5 Pro Prompt](https://cdn.shopify.com/s/files/1/0805/4204/0384/files/geminiprompt_480x480.jpg?v=1717583620)
The conversation began with a simple "hello" sent to the GPT-4o Bob, and its response triggered the Gemini Bob to begin listening, thus starting the conversation loop. Overall, their conversation was rather boring; however, one interesting thing happened: they began to sync up very well and finish each other's sentences for a couple of turns in the conversation. This caught me by surprise, and I found it interesting to see how they reacted to one another.
![The two Bobs](https://cdn.shopify.com/s/files/1/0805/4204/0384/files/2bobs_480x480.jpg?v=1717583620)
The Gemini 1.5 Pro model seemed to take a more analytical approach, picking up on the behavior of the GPT-4o model. It noticed the model was imitating it and questioned the GPT-4o Bob on some of its responses. Given that the 20-token max length is almost laughably short, I hesitate to make any definitive statements about either model's performance, but I will say that the Gemini 1.5 Pro model seemed a bit "smarter" given the constraint of the 20-token limit. A fairer comparison would likely involve giving both models conversation history.
You can view the video for this article on my YouTube channel.