Aarne-Thompson-Uther (ATU) Search
Linked data search over ATU tagged resources.
Wikipedia page: https://en.wikipedia.org/wiki/Godfather_Death
DBpedia page: https://live.dbpedia.org/page/Godfather_Death
The DBpedia page gives us, for example:
https://live.dbpedia.org/page/Category:The_Devil_in_fairy_tales (rdf:type skos:Concept ; rdfs:label The Devil in fairy tales (en) )
dbp:aarneThompsonGrouping ATU 332 (en); dbp:country Germany (en) ; dbp:folkTaleName Godfather Death (en) ; dct:subject dbc:Grimms’_Fairy_Tales dbc:The_Devil_in_fairy_tales ; rdfs:label Godfather Death (en)
%%capture
#Install some essential packages
%pip install SPARQLWrapper pandas folium
# Import the necessary packages
from SPARQLWrapper import SPARQLWrapper, JSON
# Add some helper functions
# A function that will return the results of running a SPARQL query with 
# a defined set of prefixes over a specified endpoint.
# It follows the same five-step process apart from creating the query, which 
# is provided as an argument to the function.
def runQuery(endpoint, prefix, q):
    ''' Run a SPARQL query with a declared prefix over a specified endpoint '''
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(prefix+q) # concatenate the strings representing the prefixes and the query
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()
    
# Import pandas to provide facilities for creating a DataFrame to hold results
import pandas as pd
# Function to convert query results into a DataFrame
# The results are assumed to be in JSON format and therefore the Python dictionary will have  
# the results indexed by 'results' and then 'bindings'. 
def dict2df(results):
    ''' A function to flatten the SPARQL query results and return the column values '''
    data = []
    for result in results["results"]["bindings"]:
        tmp = {}
        for el in result:
            tmp[el] = result[el]['value']
        data.append(tmp)
    df = pd.DataFrame(data)
    return df
# Function to run a query and return results in a DataFrame
def dfResults(endpoint, prefix, q):
    ''' Generate a data frame containing the results of running
        a SPARQL query with a declared prefix over a specified endpoint '''
    return dict2df(runQuery(endpoint, prefix, q))
        
# Print a limited number of results of a query
def printQuery(results, limit=None):
    ''' Print the results from the SPARQL query, optionally only the first `limit` rows '''
    resdata = results["results"]["bindings"]
    # Treat both None and the older '' default as "no limit"
    if limit not in (None, ''):
        resdata = resdata[:limit]
    for result in resdata:
        for ans in result:
            print('{0}: {1}'.format(ans, result[ans]['value']))
        print()
# Run a query and print out a limited number of results
def printRunQuery(endpoint, prefix, q, limit=None):
    ''' Run a SPARQL query and print a limited number of results '''
    results = runQuery(endpoint, prefix, q)
    printQuery(results, limit)
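To make the expected result shape concrete, here is a minimal sketch of the flattening step applied to a hand-written dictionary in the SPARQL JSON results layout. The sample bindings are made up for illustration, and `flatten_bindings` is a hypothetical stand-in for the loop inside `dict2df`, so no network call or pandas import is needed.

```python
# Hand-written sample in the SPARQL JSON results layout (illustrative values only)
sample = {
    "head": {"vars": ["story_name", "src"]},
    "results": {"bindings": [
        {"story_name": {"type": "literal", "xml:lang": "en", "value": "Godfather Death"},
         "src": {"type": "uri", "value": "http://en.wikipedia.org/wiki/Godfather_Death"}},
        # A row where ?src is unbound: the key is simply absent
        {"story_name": {"type": "literal", "xml:lang": "en", "value": "The Snow Queen"}},
    ]},
}

def flatten_bindings(results):
    ''' Hypothetical helper: keep only the 'value' of each bound variable '''
    return [{var: cell["value"] for var, cell in row.items()}
            for row in results["results"]["bindings"]]

rows = flatten_bindings(sample)
print(rows)
```

Passing a list like `rows` to `pd.DataFrame` gives one column per variable; a variable that is unbound in a given row shows up as `NaN` in that row.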
# Define any prefixes
prefix = '''
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbc: <http://dbpedia.org/resource/Category:>
    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    PREFIX prov: <http://www.w3.org/ns/prov#>
    PREFIX dbp: <http://dbpedia.org/property/>
    
    PREFIX ouseful:<http://ouseful.info/>
'''
#Declare the DBPedia endpoint
endpoint="http://dbpedia.org/sparql"
sparql = SPARQLWrapper(endpoint)
q = '''
SELECT DISTINCT ?story_name ?src WHERE {
  ?story dct:subject dbc:The_Devil_in_fairy_tales .
  ?story rdfs:label ?story_name .
  ?story prov:wasDerivedFrom ?src .
FILTER (langMatches(lang(?story_name), "en"))
}
LIMIT 10
'''
df = dfResults(endpoint, prefix, q)
df
| | story_name | src |
|---|---|---|
| 0 | The Girl Without Hands | http://en.wikipedia.org/wiki/The_Girl_Without_... | 
| 1 | Godfather Death | http://en.wikipedia.org/wiki/Godfather_Death?o... | 
| 2 | Jack the Giant Killer | http://en.wikipedia.org/wiki/Jack_the_Giant_Ki... | 
| 3 | The Snow Queen | http://en.wikipedia.org/wiki/The_Snow_Queen?ol... | 
| 4 | Errementari | http://en.wikipedia.org/wiki/Errementari?oldid... | 
| 5 | How the Devil Married Three Sisters | http://en.wikipedia.org/wiki/How_the_Devil_Mar... | 
| 6 | Why the Sea is Salt | http://en.wikipedia.org/wiki/Why_the_Sea_is_Sa... | 
| 7 | Jean, the Soldier, and Eulalie, the Devil's Da... | http://en.wikipedia.org/wiki/Jean,_the_Soldier... | 
| 8 | Little Johnny Sheep-Dung | http://en.wikipedia.org/wiki/Little_Johnny_She... | 
| 9 | The Lost Children (fairy tale) | http://en.wikipedia.org/wiki/The_Lost_Children... | 
q = '''
SELECT (COUNT(*) AS ?count) WHERE {
  ?story dbp:aarneThompsonGrouping ?atu .
  ?story rdfs:label ?story_name .
FILTER (langMatches(lang(?story_name), "en"))
} 
'''
df = dfResults(endpoint, prefix, q)
df
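| | count |
|---|---|
| 0 | 0 |
A count of 0 means the dbp:aarneThompsonGrouping pattern matched nothing; the pattern can only match when dbp: resolves to http://dbpedia.org/property/.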
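When a query returns an unexpected count, one quick check is to expand each prefixed name to its full IRI and compare it with the IRI shown on the DBpedia page. The sketch below does that offline. `parse_prefixes` and `expand` are hypothetical helpers (not part of SPARQLWrapper), the regular expression only handles simple `PREFIX label: <iri>` declarations, and `sample_prefixes` is a cut-down prefix block used for illustration.

```python
import re

def parse_prefixes(prefix_block):
    ''' Hypothetical helper: map each declared prefix label to its namespace IRI '''
    pattern = re.compile(r'PREFIX\s+([A-Za-z][\w-]*)\s*:\s*<([^>]*)>', re.IGNORECASE)
    return {label: iri for label, iri in pattern.findall(prefix_block)}

def expand(qname, namespaces):
    ''' Expand a prefixed name such as dbp:aarneThompsonGrouping to a full IRI '''
    label, local = qname.split(':', 1)
    return namespaces[label] + local

# Cut-down prefix block for illustration only
sample_prefixes = '''
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbp: <http://dbpedia.org/property/>
'''

namespaces = parse_prefixes(sample_prefixes)
print(expand('dbp:aarneThompsonGrouping', namespaces))
# prints http://dbpedia.org/property/aarneThompsonGrouping
```

Running `expand` against the notebook's own `prefix` string shows exactly which predicate IRI the endpoint is being asked to match.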
Folk Songs
For example, https://dbpedia.org/page/The_Raggle_Taggle_Gypsy, derived from https://en.wikipedia.org/wiki/The_Raggle_Taggle_Gypsy
The DBpedia page gives gold:hypernym dbr:Song (PREFIX gold: http://linguistics-ontology.org/gold/hypernym)
This song has a Roud number, but there is no Roud number attribute; it’s also a Child Ballad, but there is no Child Ballad number attribute.
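To check which attributes a resource actually carries, one option is to list every predicate used on it and scan the list for anything resembling a Roud or Child Ballad number. The sketch below only builds the query string. `resource_properties_query` is a hypothetical helper, and the resource IRI is an assumption derived from the page URL by swapping `/page/` for `/resource/`.

```python
def resource_properties_query(resource_iri, limit=200):
    ''' Hypothetical helper: build a SPARQL query listing the distinct
        predicates used on a single resource '''
    return '''
SELECT DISTINCT ?p WHERE {{
  <{iri}> ?p ?o .
}}
LIMIT {limit}
'''.format(iri=resource_iri, limit=limit)

# Assumed resource IRI for the DBpedia page referenced above
song_q = resource_properties_query('http://dbpedia.org/resource/The_Raggle_Taggle_Gypsy')
print(song_q)
```

The resulting string uses a full IRI, so it needs no prefix declarations and could be run with `dfResults(endpoint, '', song_q)`.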
Also Wikidata: http://localhost:8888/notebooks/Documents/GitHub/lang-fairy-books/lang-fairy-books-db.ipynb which does have, for example, a Roud number.