
Why structured data could become obsolete for Google in the future

Posted: Thu Jan 30, 2025 6:51 am
by Reddi1
Google also learns thanks to the help of webmasters and SEOs
The construction of a semantic database in the form of the Knowledge Graph, and the identification of entities in general, depends heavily on the help of external people such as webmasters, Wikipedia editors, etc. In the long term, however, Google wants to obtain interpretable data on its own so that the Knowledge Graph project does not come to a standstill.

This is also demonstrated by the Knowledge Vault project. The Knowledge Vault was introduced by Google in 2014 as a (so far inactive) development project intended to build the world's largest knowledge database from both structured and unstructured data using web crawling and machine learning. To date there is no information on whether, and to what extent, Google is already actively using this database. However, I assume that the Knowledge Graph already draws information from the Knowledge Vault. Read more in the article Google “Knowledge Vault” To Power Future Of Search.
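To make the underlying idea more tangible: according to the Knowledge Vault paper, candidate facts (triples) extracted from the web each receive a confidence score, and the signals of several extractors are fused into a single probability. The following Python sketch is purely illustrative and uses a simple noisy-OR fusion; the triple, the scores, and the function name are my own invented example, not Google's actual implementation.

# Illustrative sketch: fusing the confidence scores of several independent
# extractors into one probability for a candidate fact (triple).
# All names and numbers are invented; this is not Google's code.

def fuse_confidences(scores):
    """Noisy-OR fusion: the triple is treated as true if at least one
    extractor is right, assuming the extractors err independently."""
    p_all_wrong = 1.0
    for p in scores:
        p_all_wrong *= 1.0 - p
    return 1.0 - p_all_wrong

# One candidate triple with per-extractor confidences, e.g. one score from
# structured markup and two from free-text extraction.
triple = ("Barack Obama", "born_in", "Honolulu")
extractor_scores = [0.7, 0.6, 0.4]

print(triple, round(fuse_confidences(extractor_scores), 3))  # -> 0.928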

I assume that Google has a strong interest in identifying information for the Knowledge Graph independently of external help, ideally in an automated manner. There are already several indications that Google repeatedly obtains human-verified training data for its machine learning systems in order to identify and classify entities more quickly.

For example, Google has the information shown in the medical Knowledge Graph boxes reviewed by professors and doctors from Harvard and the Mayo Clinic before it is published.

This manual review could also be used in supervised machine learning to improve the algorithms. Likewise, Google could use the feedback from search evaluators (quality raters) as valuable training data for its machine learning algorithms.
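As a minimal sketch of what "rater feedback as training data" could look like, the snippet below trains a toy text classifier on human-verified labels. The snippets, labels, and the use of scikit-learn are my own assumptions for illustration; they stand in for whatever internal systems Google actually uses.

# Illustrative sketch: human-verified labels (e.g. from quality raters or
# medical reviewers) used as supervised training data for a classifier.
# The examples and labels are invented; this is not Google's pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Text snippets that human reviewers have marked as reliable or not.
texts = [
    "Peer-reviewed overview of symptoms, verified by clinicians",
    "Treatment guideline reviewed by medical faculty",
    "Miracle cure your doctor does not want you to know about",
    "Buy cheap pills online without a prescription",
]
labels = ["trustworthy", "trustworthy", "untrustworthy", "untrustworthy"]

# Train on the human judgments, then score new, unseen content.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["Clinically reviewed overview of diabetes care"]))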

Structured data as human training data for the Google algorithm
Another example of Google increasingly trying to act independently of webmasters in the future is the rel-authorship markup. In my opinion, this markup served only one purpose for Google: identifying patterns that represent certain types of entities, in this case authors. The information and markup were created and entered by people (primarily SEOs and webmasters) and thus provided Google with verified training data from which its machine learning algorithms could build model groups for authors based on these patterns.
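To make that concrete: rel=author was an ordinary HTML attribute that webmasters added by hand, and it effectively labeled a page with its author entity. The sketch below harvests such labels with Python's standard library; the HTML, URLs, and page ID are made up for illustration and only show the kind of (page, author) training pairs the markup produced.

# Illustrative sketch: turning human-entered rel="author" markup into
# labeled (page, author-profile) training pairs. HTML and URLs are invented.
from html.parser import HTMLParser

PAGE_HTML = """
<html><body>
  <article>
    <h1>Semantic Search Explained</h1>
    <a rel="author" href="https://plus.example.com/+JaneDoe">Jane Doe</a>
  </article>
</body></html>
"""

class AuthorLinkParser(HTMLParser):
    """Collects the href targets of <a rel="author"> links."""
    def __init__(self):
        super().__init__()
        self.authors = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "author":
            self.authors.append(attrs["href"])

parser = AuthorLinkParser()
parser.feed(PAGE_HTML)

# Each pair is one human-verified training example: this page -> this author.
print([("page-123", href) for href in parser.authors])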

It is therefore not surprising that Google eventually stopped pursuing the rel-authorship and Freebase projects. Freebase was initially fed by humans with data created according to a basic semantic structure. This gave Google a semantic playground and enough human-verified training data for its machine learning algorithms. However, Freebase was only a short-term means to an end.