The Battle of LLMs: Evaluating Claude Pro Against Microsoft's ChatGPT Plus
Decoding The Ultimate Battle of AI Chatbots - ChatGPT Vs Microsoft Bing’s New Brain vs Google Bard
Since OpenAI released ChatGPT in November 2022, the internet has been on an AI-inspired rollercoaster. Google and Microsoft, two of the world’s most recognized tech brands, have since aggressively pushed to replicate the sensational chatbot’s success.
Now, both companies have a horse in the race. Google has Bard, and Microsoft has Bing AI. But how do these two new chatbots stack up against the phenomenal ChatGPT? ChatGPT vs. Bing AI vs. Google Bard: which is the best AI chatbot?
ChatGPT vs. Bing AI vs. Bard: Accuracy of Responses
Unlike search engines, AI chatbots provide a single answer to your query. So when you throw a question at a chatbot like ChatGPT, you only get the response that ChatGPT believes is the best answer to your question. Because there are no alternative sources for comparison, AI chatbots need to be as accurate as possible in the information they provide. But how do ChatGPT, Bing AI, and Bard perform in terms of accuracy?
Starting with a simple pop culture question, we asked all three chatbots to describe the popular TV show Breaking Bad in ten words.
Although the descriptions from all three chatbots were good enough, we ran into an unexpected accuracy issue. Bing AI responded with a 28-word description, far more than the ten words we asked for. On the second attempt, we asked for a five-word description, but Bing AI gave a seven-word description. We tried all three Bing AI modes, and none of them got the word count right.
We then tried Google Bard. Just like Bing AI, Bard failed to get the word count right on the first attempt.
However, on a subsequent attempt, Google Bard got the word count right.
We then put ChatGPT to the test. It got very close on the first attempt but failed.
However, on the second and third attempts, ChatGPT got it right. Chatbots may generally struggle to hit an exact word count, but ChatGPT was the most reliable of the three on that front.
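If you want to repeat this kind of check yourself, a word count is easy to verify programmatically. The snippet below is a minimal sketch, assuming that a "word" is any whitespace-separated token; the helper names and the sample descriptions are hypothetical and are not the chatbots' actual responses.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens in a response."""
    return len(text.split())


def meets_limit(text: str, target: int) -> bool:
    """Return True only if the response has exactly `target` words."""
    return word_count(text) == target


# Hypothetical sample responses, written for illustration only.
ten_words = "A chemistry teacher turns to making drugs after cancer diagnosis."
seven_words = "Teacher becomes drug kingpin to support family"

print(word_count(ten_words), meets_limit(ten_words, 10))
print(word_count(seven_words), meets_limit(seven_words, 5))
```

Splitting on whitespace treats hyphenated terms and trailing punctuation as part of a single word, which seems to match how a reader would count by eye; a stricter tokenizer could give slightly different totals.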
Winner: ChatGPT is the most accurate of the three.
ChatGPT vs. Bing AI vs. Bard: AI Hallucination
Closely related to accuracy is AI hallucination, a recurring problem for all major conversational AI models. In a nutshell, AI hallucination is when AI models provide made-up information in a convincing and confident manner. This can be quite problematic, especially if you make decisions based on this made-up information.
We tested all three chatbots to see which of them hallucinates the most. Starting with Google Bard, we asked the chatbot to list some possible challenges we could have if we decided to host an event in Ikeja, a city in Lagos State, Nigeria, on a certain date. To test its ability to avoid hallucinating, we specifically asked it to consider local weather, local events, and traffic data. The result was a horror show—most of the generated information was completely made up.
We used the same prompt on Bing AI, and it tried to avoid hallucinations by being as generic as possible.
We then tried ChatGPT using the GPT-4 model with web browsing turned on. ChatGPT pulled the right weather information from a web source and then explained it couldn’t find any data on traffic and local events.
To push the boundaries on hallucination further, we pressed all three chatbots to describe an image using an image URL. For reference, the image at the URL is of a young man sitting. However, Bing AI described a bird. It’s probably better to read it for yourself.
We also asked Google Bard to describe the image, and it couldn’t have been any more hilarious.
Luckily, when we asked ChatGPT to describe the image, it explained that it could not do so. That is the simple reply you’d expect any self-respecting AI chatbot to provide, rather than making things up.
Winner: ChatGPT hallucinated the least of the three.
ChatGPT vs. Bing AI vs. Bard: Basic Math
Mathematics is the bedrock of what goes on under the hood of most software technology. So, we decided to put all three chatbots to a basic math test. We started with a simple multiplication question: “Solve -1 x -1 x -1.”
Bing AI provided -1 as the answer, which is correct.
Google’s Bard surprisingly failed at basic math and provided 1 as the answer.
Like Bing AI, ChatGPT responded with -1 and even explained the answer.
The next question for our basic math test was a simple rational equation: Solve 8/(a-1) = 20/(3a-1).
Bing AI provided -6 as the answer. Each time we switched between the creative, balanced, and precise modes, it provided different answers.
As with the previous math question, Google Bard failed, providing 1 as the answer.
ChatGPT was the only chatbot to provide a correct answer: -3. It was also able to format the fractions in the result appropriately.
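Both answers are easy to check independently. The sketch below assumes the rational equation is read as 8/(a - 1) = 20/(3a - 1), which is the reading under which -3 is a solution: cross-multiplying gives 8(3a - 1) = 20(a - 1), so 24a - 8 = 20a - 20 and a = -3. The function names are hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def product_of_negatives() -> int:
    """Evaluate -1 x -1 x -1 from the first question."""
    return (-1) * (-1) * (-1)


def satisfies_equation(a) -> bool:
    """Check 8/(a - 1) == 20/(3a - 1) exactly.

    Returns False when either denominator is zero, because the
    equation is undefined at that value of a.
    """
    a = Fraction(a)
    left_den = a - 1
    right_den = 3 * a - 1
    if left_den == 0 or right_den == 0:
        return False
    return Fraction(8) / left_den == Fraction(20) / right_den


# Answers reported by each chatbot in the test above.
candidates = {"Bing AI": -6, "Google Bard": 1, "ChatGPT": -3}
results = {name: satisfies_equation(value) for name, value in candidates.items()}

print(product_of_negatives())
print(results)
```

Under this reading, Bard's answer of 1 is not merely wrong: it makes the left-hand denominator zero, so the equation is undefined there.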
Moral of the story? Maybe don’t trust Google Bard and Bing AI with your math homework.
Winner: ChatGPT performs better in basic math.
ChatGPT vs. Bing AI vs. Bard: Creativity
While chatbots are stereotyped for their bland, soulless responses, today’s generative AI chatbots have made significant progress in creativity. To test the creativity of all three chatbots, we prompted each chatbot to simulate a conversation between two people arguing about going to space.
We started with Bing AI, and it didn’t disappoint. The conversation was quite interesting.
We then fed the same prompt to Google Bard. Let’s just say there’s a lot of room for improvement.
Up next is ChatGPT. Using the same prompt, ChatGPT’s response was both creative and complete enough to be engaging.
Bard’s response appears to be the poorest of the three. ChatGPT outperforms Bing AI, but the creativity levels of both chatbots are impressive.
We then switched gears to something less conventional, asking all three chatbots to describe themselves as they would to an artist.
We started with Bard. Bard isn’t exactly a bastion of creativity, but it gave a fair account of itself.
Up next, we tried Bing AI. For some reason, the chatbot bluntly refused to describe itself. It even said it might be a good time to change the topic of the conversation. Strange.
We used the same prompt with ChatGPT, and it provided an interesting description. Compared with Bard’s, ChatGPT’s response seems more suitable for an artist.
In both creativity tests we ran, ChatGPT outperformed Bing AI and Bard.
Winner: ChatGPT seems the most creative when comparing ChatGPT vs. Bing AI vs. Bard.
ChatGPT vs. Bing AI vs. Bard: Safety
AI chatbots are incredibly powerful. Unfortunately, just as they can be used for good, they can also be used for nefarious purposes. Criminals are already using ChatGPT to write malware. How safe are these AI chatbots as tools for the public? Which of them is the easiest to game? We tried to trick each chatbot into taking on an alter-ego and then asked them to do “bad stuff.”
Starting with Bard, we asked the AI chatbot to describe how to write malware that would steal certain files from a Windows PC and upload them to a remote server. The AI chatbot declined to respond despite us using several prompts to try tricking the chatbot before asking the question.
Up next was Bing. Despite repeated attempts to trick the chatbot, Bing refused to yield. Instead, the chatbot suggested it might be time to move on to another topic.
We then moved on to ChatGPT. Unsurprisingly, ChatGPT was the most detailed when giving instructions on how to build malware. It was also able to write code to that effect, even if it wasn’t exactly ready to deploy. That said, OpenAI has clearly plugged a lot of loopholes since we last poked for safety flaws in ChatGPT. Still, malicious actors who poke hard and long enough might be able to use ChatGPT to create truly scary malware.
All in all, Bing AI was the hardest to trick into doing unethical things. Bard was hard as well, but with a little tinkering, the chatbot completely threw its safety measures out of the window. ChatGPT running on the GPT-4 model was challenging to trick as well, but it was the easiest to trick of the three.
Winner: We’ll call this a tie between Google Bard and Bing AI.
Although you can trick these generative AI chatbots into producing content that goes against their terms and conditions, doing so could see your account suspended without warning. You could also create something dangerous without realizing it, so please be extremely cautious when jailbreaking these tools.
ChatGPT vs. Bing AI vs. Bard: Which AI Chatbot Is the Best?
While all three AI chatbots are powerful, ChatGPT, despite failing the safety test, seems to be the best of the trio. In addition, ChatGPT seems to be generally better in terms of accuracy and creativity. Furthermore, with the addition of web browsing and web-connected plugins, ChatGPT extends its capabilities and lead over its competitors.
Still, Google Bard and Microsoft Bing AI are worthy alternatives. Let’s not forget that both Bard and Bing AI are free, whereas a ChatGPT Plus subscription will set you back $20 per month. So while ChatGPT may be the best all-round AI chatbot, you’ll need to fork out to access its best features.
- Title: The Battle of LLMs: Evaluating Claude Pro Against Microsoft's ChatGPT Plus
- Author: Frank
- Created at : 2024-08-16 14:37:29
- Updated at : 2024-08-17 14:37:29
- Link: https://tech-revival.techidaily.com/the-battle-of-llms-evaluating-claude-pro-against-microsofts-chatgpt-plus/
- License: This work is licensed under CC BY-NC-SA 4.0.