22 Google AI Chatbot Bard Statistics (Real Data And Facts)

Updated: February 9, 2023

AI is getting so POPULAR that search engines are now fighting to implement it, just like I fight with my sister EVERY DAY for basically no reason. (I bet we all do, RIGHT..!)

Google just announced its AI chatbot system, “Bard”. It will be rolled into Search in the coming months.

Here is the first look at Google’s AI chatbot Bard. (Published by Google)

SEOs around the world are scared because Google is not showing any footnotes or references in the results.

I have gathered the top Google Bard statistics, which took me a few hours, TO BE HONEST.

We have also curated ChatGPT statistics in a separate article. All these stats were checked multiple times before being included in the article.

Here are some of the Google Bard chatbot statistics.

Editor’s Top Picks – Google Bard Statistics

  1. Google’s Bard Factual Error Cost the Company $120 Billion in Valuation
  2. According to Google, AI computations are doubling every six months, outpacing Moore’s Law
  3. Satya Nadella, CEO of Microsoft, called Google the 800-pound gorilla in the search business
  4. Google Bard, when fully rolled out, may reach 1B users within the first 2 months

Google Bard Facts:

  • Launch Date – 8 February 2023
  • Parent Company – Google
  • Headquarters – California
  • Based On – LaMDA

Google Bard Statistics, Facts, & Usage

First, let’s take a look at some Bard statistics that provide an overview of Google’s new AI language model.

1. The scale of the largest AI computations is doubling every six months, far outpacing Moore’s Law.

According to Google, the scale of the largest AI computations is doubling every 6 months. That is much faster than Moore’s law (Gordon Moore, 1965, revised in 1975), which says “the number of transistors in a dense integrated circuit (IC) doubles about every two years”. In the case of AI computations, the doubling time has shrunk to just 6 months.

Source: Google
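To get a feel for how big the gap in stat 1 is, here is a rough sketch that compares the two doubling rates. It assumes both trends are clean exponentials; the helper name `growth_factor` and the chosen time spans are made up for illustration and do not come from Google.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Return how many times larger a quantity gets after `years`
    if it doubles once every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


AI_DOUBLING_YEARS = 0.5     # "every six months", per the stat above
MOORE_DOUBLING_YEARS = 2.0  # "about every two years", per Moore's law

for years in (1, 2, 4):
    ai = growth_factor(years, AI_DOUBLING_YEARS)
    moore = growth_factor(years, MOORE_DOUBLING_YEARS)
    print(f"After {years} yr: AI compute x{ai:.0f}, transistor count x{moore:.2f}")
```

Under these assumptions, two years means a 16x jump in AI compute against a 2x jump in transistor count.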

2. Google’s Bard AI chatbot gives the wrong answer at the launch event.

At the live launch event where Google unveiled its AI chatbot, Bard gave a wrong answer. This suggests that the chatbot is still not ready and that the feature is being pushed out to stay relevant in the AI race.

Source: Google Live Event

3. Satya Nadella on Google: “I Want People To Know That We Made Them Dance”

Satya Nadella, CEO of Microsoft, said that today is the day Microsoft brought competition to search, something he has been working at for 20 years. He also joked that Google is the 800-pound gorilla in search, that he hopes Microsoft’s innovation will make them want to come out and show that they can dance, and that he wants people to know “we made them dance”.

Here is a link to the full interview if you want to watch it on YouTube.

Source: The Verge

4. $120bn Was Wiped off Google’s Valuation After the Chatbot Gave a Wrong Answer at the Live Event

Google introduced its chatbot system at a live event where it showed a wrong answer to a query. Google called it a technical issue. As a result, $120 billion was wiped off Google’s valuation.

Source: Canada Today

5. Google Pulled Down the Video From YouTube After Realizing It Had a Factual Error, Stock Down 7%

The tech giant pulled the video down after realizing that the AI-generated answer had a factual error. After the video was taken down, the stock went down a whopping 7%.

Google Bard Video Taken Down

Source: Google’s YouTube Channel

6. Google Bard, When Fully Rolled Out, May Reach 1B Users Within the First 2 Months

Jim Fan, AI Scientist at NVIDIA, predicted that Google Bard, if fully rolled out, will reach at least 1B users within the first 2 months. He called this a fight between two giants deploying the 2 largest neural nets in history.

Source: Jim Fan, AI Scientist – NVIDIA

7. Google’s AI Chatbot Bard Is Not Ready for Deployment Yet, Which Is Why They Give No Rollout Dates

According to Gael Breton, co-founder of Authority Hacker, Google’s AI chatbot is still not ready, which is why they are not giving any rollout dates. He also says SEO is not dead yet.

Source: Gael Breton

8. The 2022 LaMDA Research Paper Lists That 12.5% of the Training Data Came From a Public Web Dataset and 12.5% From Wikipedia

In 2022, a research paper was released which revealed that 12.5% of LaMDA’s training data came from a public web-crawl dataset (C4) and another 12.5% from Wikipedia. It is, however, still unclear exactly which websites and what data were used to train LaMDA.

Source: SearchEngineJournal

9. Most Websites Used For Training LaMDA were Programming, Q&A websites, and tutorial sites.

AI experts suggest that Programming websites, Q&A, and tutorial websites were mostly used for training the LaMDA language model.

Source: SearchEngineJournal

10. 50% of LaMDA Training Data Is From Public Forums

The largest share of the data used to train LaMDA, the model behind Bard, comes from public forums. It accounts for roughly 50% of the total dataset used to train the language model.

Source: SearchEngineJournal

11. Details of the Websites Used to Train Bard/LaMDA Are Shrouded in Secrecy

Google has not disclosed the details of the websites that were used to train the AI language model. No details are available about the source of the dataset used in training Bard.

Source: SearchEngineJournal

12. Google Bard Is Based on the AI Language Model LaMDA, an Acronym for Language Model for Dialogue Applications

LaMDA stands for Language Model for Dialogue Applications. It is the basis of Google’s Bard AI chatbot, which will be integrated into the search engine in the coming months.

Source: Google

13. The LaMDA Language Model Was Trained on a Total Dataset of 1.56 Trillion Words

LaMDA’s dataset, i.e. public dialog data and web text, includes 50% data from public forums, 12.5% English Wikipedia, 6.25% non-English web documents, 6.25% English web documents, 12.5% C4-based data, and finally 12.5% code from Q&A, tutorial, and programming websites.

SIDENOTE: C4 (Colossal Clean Crawled Corpus) is a cleaned-up dataset built from Common Crawl, which is basically an open web-crawl dataset.

Source: Freitas D. et al.
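To turn those percentages into word counts, here is a quick sketch. It assumes the shares apply directly to the 1.56 trillion word total and treats the 12.5% English entry as Wikipedia, as stats 8 and 14 state. The labels and the `words_per_source` helper are made up for illustration and do not come from the paper.

```python
TOTAL_WORDS = 1.56e12  # 1.56 trillion words, per stat 13

# Percentage of the dataset per source, as listed in stat 13.
COMPOSITION = {
    "public forum dialogs": 50.0,
    "C4-based data": 12.5,
    "English Wikipedia": 12.5,
    "code (Q&A, tutorial, programming sites)": 12.5,
    "English web documents": 6.25,
    "non-English web documents": 6.25,
}


def words_per_source(total_words: float, composition: dict) -> dict:
    """Split a total word count across sources by percentage share."""
    return {name: total_words * pct / 100 for name, pct in composition.items()}


# Sanity check: the listed shares should cover the whole dataset.
assert sum(COMPOSITION.values()) == 100.0

breakdown = words_per_source(TOTAL_WORDS, COMPOSITION)
for name, words in breakdown.items():
    print(f"{name:<42} {words / 1e9:7.1f} billion words")

# Share from the two named sources (C4 and Wikipedia), as in stat 14.
named_share = COMPOSITION["C4-based data"] + COMPOSITION["English Wikipedia"]
print(f"Named sources: {named_share}% of the dataset")
```

On these assumptions, the public forum share alone works out to about 780 billion words, and the two named sources together make up 25% of the dataset.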

14. 25% of the Dataset Is From Named Sources (C4 & Wikipedia) and the Remaining 75% of Words Were Scraped From the Internet

25% of the dataset is taken from named sources like C4 & Wikipedia, and the remaining 75% of the data is scraped from the open web.

Source: Freitas D. et al.

15. Dataset Composition Is Still Being Tested for Quality and Performance

The composition mentioned above is still being tested for performance and quality. In future work, changes to the composition of this dataset may affect the quality of the NLP tasks performed by the language model.

Source: Freitas D. et al.

16. Sundar Pichai asks staff to spend 2-4 hrs a day on Bard amid Google vs ChatGPT battle

Sundar Pichai, CEO of Google, asked staff to spend 2-4 hrs a day of their time on Bard. He expects that more interactions will help improve the chatbot.

Source: IndianExpress

17. Google asks employees to rewrite Bard’s bad responses, says the A.I. ‘learns best by example’

Google’s Bard has been giving wrong responses, and at the launch event it also generated a response that was factually incorrect. Google is asking its employees to rewrite Bard’s bad responses so that it can learn human-like conversation from their examples.

Source: CNBC

18. YouChat 2.0, a competitor of Google Bard and ChatGPT, appears to be more developed and gives better answers

YouChat has more features than the Google and Bing chatbots. It provides information in the form of charts, images, videos, tables, graphs, text, and code.

Source: DeccanHerald

19. Google’s C4 Dataset Is Based on Common Crawl Data, an Open Dataset by Ex-Googlers

C4, the Google dataset used in training the language model behind Bard, is built from Common Crawl data, which is open data that anyone can use to train their own language model.

Source: Common Crawl

20. The Pretraining Text Data Used to Train the AI Chatbot Is About 750GB

The text data used to train the language model was scraped in April 2019, and the complete dataset is about 750GB.

Source: Freitas D. et al.

21. The Blocklist Filter Removed 32% of Hispanic-Aligned Webpages and 42% of African American-Aligned Webpages

The blocklist filter, which is meant to remove pages containing swear words (aka bad words), removed 32% of Hispanic-aligned webpages from the dataset. 42% of African American-aligned webpages were also removed.

Source: Freitas D. et al.

22. 51.3% of the webpages from the C4 dataset were hosted in the USA

In another finding, 51.3% of all the webpages used in the C4 dataset were hosted in the USA, accounting for approximately half of the dataset.

Source: SearchEngineJournal

BARD Statistics Sources

So here are all the Google Bard statistics. We have taken due care to provide the correct info. If you spot a conflict or an error, contact us directly.

Final Thoughts

So these are the available statistics about Google Bard. We have curated all these stats with due care, double-checking the facts and data.

Let me know if you want me to add new updates to the article. We would love to include your thoughts, with proper credit, in the article.

Shivam Sharma

I’m Shivam Sharma, founder of BloggingCapital.com, an SEO & Finance enthusiast featured in WebsiteBuilderExpert, CEOBlogNation, and Medium. We are an online magazine that helps 7567+ readers per month improve their marketing and web copy.
