WQ42: Grounding LLMs in Wikidata Facts via Tool Calling

Integrating Knowledge Graphs (KGs) with Large Language Models (LLMs) is a well-explored research field. KGs are vast, structured databases storing factual associations as graph edges. KGs can help LLMs with tasks like question answering by supplying structured, often up-to-date information, thereby mitigating the risk of hallucination. For instance, an LLM that can query Wikidata, a prominent KG project, instead of depending solely on its training data becomes significantly more reliable and useful. [Read More]

qrender: Render wikidata item in different formats

Wikidata is a rich knowledge graph, but its raw data format can be challenging for both humans and AI to process effectively. This blog post explores how I addressed these challenges by creating qrender, a tool for rendering Wikidata items in more human-readable and AI-friendly formats. In my previous article about qjson, I explained the importance of retrieving all information about a Wikidata item. I wrote qjson as an easy API to fetch all such information in one API call instead of multiple SPARQL queries or API calls. [Read More]

Upskill and Upgrade

In April 2024, I took a two-month sabbatical from work. During this period, and in the year that followed, I dedicated significant effort to improving my skills. Having spent nearly 20 years in software engineering, I’ve witnessed substantial changes in the field. Opportunities to learn and practice new skills don’t always arise naturally at work. While I made a conscious effort to dedicate time each day to reading and learning, finding time for hands-on practice remained a challenge. [Read More]

qjson: Fetching all properties of a wikidata item in a single API call

For those deeply involved with Wikidata, the richness of its interconnected data is both a blessing and a challenge when it comes to programmatic access. While the standard wbgetentities API endpoint is fundamental, retrieving the complete set of properties, including labels and values, for a given item often leads to a cascade of recursive API calls. For example, suppose we fetch all properties for Q42 using the wbgetentities API: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q42. If we look up the “country of citizenship” (P27) for Q42 (Douglas Adams) in the response, it only provides the target QID (Q145), necessitating further queries to resolve both P27 and Q145 into human-readable labels. [Read More]
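The follow-up lookup described above can be sketched in Python. This is a minimal illustration of the manual approach, not how qjson works internally. The helper names `ids_to_resolve` and `label_lookup_url` are hypothetical, and the sample entity is a hand-trimmed fragment in the shape of a wbgetentities response, not real fetched data. No network request is made; the code only builds the URL for the second call.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def ids_to_resolve(entity: dict) -> list[str]:
    """Collect property IDs and item-valued target QIDs from one entity's claims."""
    found = []
    for prop_id, statements in entity.get("claims", {}).items():
        found.append(prop_id)
        for statement in statements:
            value = statement.get("mainsnak", {}).get("datavalue", {})
            # Only item-valued snaks point at another entity that needs a label.
            if value.get("type") == "wikibase-entityid":
                found.append(value["value"]["id"])
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(found))


def label_lookup_url(ids: list[str], language: str = "en") -> str:
    """Build a second wbgetentities request that fetches only labels."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "labels",
        "languages": language,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


# Hand-trimmed sample (assumption): Q42 with a single P27 statement.
sample_entity = {
    "id": "Q42",
    "claims": {
        "P27": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P27",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": "Q145"},
                    },
                }
            }
        ]
    },
}

pending = ids_to_resolve(sample_entity)
print(pending)  # ['P27', 'Q145']
print(label_lookup_url(pending))
```

A real item carries many more statements than this sample, so the list of IDs to resolve grows quickly, which is the cascade the paragraph above describes.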

An Experiment in Detecting Wikipedia Edit Policy Violations with LLMs

Wikipedia, the world’s largest online encyclopedia, relies on a massive community of volunteers to maintain its accuracy and neutrality. But with so many editors, how do you ensure edits adhere to Wikipedia’s strict policies? I decided to explore whether Large Language Models (LLMs) could be used to automatically detect policy violations in Wikipedia edits. Here’s what I found. Wikipedia has well-defined policies to ensure content quality. These include: WP:NPOV (Neutral Point of View): Avoiding bias and presenting information objectively. [Read More]

Natural Language based question answering system for Wikipedia and Wikidata

This is a blog post version of a paper titled “Question-to-Question Retrieval for Hallucination-Free Knowledge Access: An Approach for Wikipedia and Wikidata Question Answering”, available at https://arxiv.org/abs/2501.11301. In the world of Large Language Models (LLMs) and question answering systems, hallucination, where models generate plausible but incorrect information, remains a significant challenge. This is particularly problematic when dealing with encyclopedic knowledge sources like Wikipedia, where accuracy is paramount. Today, I’ll discuss a novel approach that addresses this challenge through question-to-question retrieval. [Read More]

Year 2024 in Review

This year was also a busy one. There was less travel than last year. I went to Africa for the first time (Kenya) and twice to the US, 14 flights in total. This year, I wrote an academic paper. The research papers I have written so far received a total of 187 citations this year. I attended and spoke at three conferences and gave three public lectures. The proudest moment was presenting a paper at a conference attended by the giant in computer science, Dr. [Read More]

Grapholinguistics 2024

I presented a paper titled “Parametric type design in the era of variable and color fonts” at the Grapholinguistics conference 2024. The conference was held at Università Ca’ Foscari, Venice, from 23rd to 25th October 2024. It was a hybrid event with both physical and virtual participation. G21C (Grapholinguistics in the 21st Century, also called /gʁafematik/) is a biennial conference bringing together disciplines concerned with grapholinguistics and, more generally, the study of writing systems and their representation in written communication. [Read More]

Teaching AI in Schools

Artificial Intelligence (AI) is a hot topic these days, and it’s natural to wonder how it fits into education. In this article, we will explore the best practices, concerns, and recommendations for integrating AI into school curriculums. I will also provide references to useful tools and learning materials. Importance of AI education at schools Why is there a growing interest in teaching AI in schools? AI has become deeply integrated into society, creating new applications and possibilities while also introducing ethical concerns. [Read More]

Artificial Intelligence Kiosks

I read in the newspaper that Minister Muhammad Riyas told the Legislative Assembly that AI kiosks will be set up to help tourists so that language is not a barrier. According to the minister, the AI-powered kiosks will answer tourists in their own language. “AI kiosks will be set up to help tourists so that language is not a barrier” - Deshabhimani newspaper, July 12, 2024. Some questions: How do tourists currently learn about a tourist destination and get their doubts cleared? What are the shortcomings of that? What facility will these kiosks offer that is not available on mobile phones with an internet connection? Is there any information that is not available on the internet and can be obtained only from the kiosks? [Read More]