On the Org of Schema: by Means of Artificial Selection
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science.
2025 (English). Independent thesis, Basic level (university diploma), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

This study explores the use of large language models (LLMs) in the selection and generation of Schema.org markup for web pages. The proposed artifact leverages Google Gemini 2.5 Pro to automate the generation of schema markup, which may increase search engine visibility and strengthen search engine optimization (SEO) efforts. The research compares the artifact-generated markup with pre-existing, human-generated schema markup from high-traffic websites in the U.S., evaluating syntactic validity and the schemas’ ability to trigger rich results in Google’s search engine result page. The study finds that while the artifact-generated schemas were more complex and longer than their human-generated counterparts, they exhibited a higher error rate, more warnings, and fewer schema and rich results items, suggesting that they could negatively impact search engine visibility. The analysis also reveals performance characteristics, with the artifact processing an average of 7041 input characters per second at an average processing time of 39 seconds, proving impractical for large-scale application. This work contributes to the emerging field of AI-driven schema generation, highlighting both the potential and the limitations of LLMs in producing high-quality structured data. While the results suggest that LLMs, when curated, could assist in schema generation for smaller-scale applications, further research is needed to address issues of error handling, runtime optimization, and scalability.
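As a rough illustration of what a syntactic validity check on Schema.org markup can look like, the sketch below parses a JSON-LD string and reports basic structural problems. It is not the thesis's evaluation procedure: the function name `check_jsonld` and the three rules it applies (parseable JSON, a `@context` that mentions schema.org, a `@type` or `@graph` key) are assumptions chosen for illustration, and it does not predict whether markup triggers rich results.

```python
import json


def check_jsonld(markup: str) -> list[str]:
    """Return a list of structural problems in a JSON-LD string.

    Hypothetical helper for illustration only; an empty list means the
    markup passed these minimal checks, not that it is fully valid.
    """
    try:
        data = json.loads(markup)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    # JSON-LD may be a single object or an array of objects.
    items = data if isinstance(data, list) else [data]
    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        context = item.get("@context")
        # Assumption: only a plain string context naming schema.org is accepted.
        if not (isinstance(context, str) and "schema.org" in context):
            problems.append(f"item {i}: missing schema.org @context")
        if "@type" not in item and "@graph" not in item:
            problems.append(f"item {i}: missing @type")
    return problems


if __name__ == "__main__":
    sample = '{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}'
    print(check_jsonld(sample))
```

A check like this only catches gross structural errors. Property-level errors and warnings of the kind the abstract reports would need a full Schema.org validator.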

Place, publisher, year, edition, pages
2025. p. 35
Keywords [en]
Schema.org, Large Language Model, Search Engine Optimization, Google Gemini
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:bth-28013
OAI: oai:DiVA.org:bth-28013
DiVA, id: diva2:1987132
Subject / course
PA1438 Självständigt arbete Webbprogrammering
Educational program
PAGWG Webbprogrammering
Supervisors
Examiners
Available from: 2025-08-05. Created: 2025-08-05. Last updated: 2025-09-30. Bibliographically approved.

Open Access in DiVA

fulltext (1162 kB), 211 downloads
File information
File name: FULLTEXT01.pdf
File size: 1162 kB
Checksum (SHA-512):
b5dbef7f5a0a1f1d9b1178d71f5876bd5f7249b57bdfc19f9e6d0a60c8ea8302af399c131d86fa4f3ade24cedd67a92bedf225f7898523b6dbf2cb3d8d81dfac
Type: fulltext
Mimetype: application/pdf
