Testwiki:Request a query/Archive/2024/09


All items that have a P1705 for Skolt Saami and a P4119 value

I've been trying to get a list of all of the items on Wikidata that have a P1705 value in Skolt Saami and also have a P4119 value. Someone imported all of these as part of a batch where everything else is correct but these particular values usually are not, so I don't want to just revert the batch. I haven't figured out how to isolate the P1705 statements for only the Skolt Saami values, so I'd appreciate the help. - Yupik (talk) 16:37, 1 September 2024 (UTC)

Template:Re Template:SPARQL
does this return what you want? Mahir256 (talk) 16:43, 1 September 2024 (UTC)
Yes, thank you! - Yupik (talk) 16:45, 1 September 2024 (UTC)

WDQS being weird (railway junctions)

Can anybody spot what's going on here?

Template:SPARQL2

The query (almost) all works as it should: it finds railway junctions which might have Template:P statements, but which say that there are actually better items for those statements, tells me the railway-line sections going through each junction, and counts the number which are described by an external source Template:Q that happens to describe only currently existing ones.

Except that the Template:P property works in one place, but has to be commented out in the second.

Any thoughts to explain this? Jheald (talk) 16:53, 5 September 2024 (UTC)

Labels for scholarly articles

I took my very simplest query to try to get my head round federated queries. I am looking simply for the count of different types of thesis at an institution. I'm not getting the labels for the type of thesis, even though I think those labels must be in the scholarly subgraph. What am I doing wrong?

Template:SPARQL2 DrThneed (talk) 23:26, 4 September 2024 (UTC)

No, the labels for the types are in the main graph. So this works:
Template:SPARQL2
Although the test link doesn't work. We need to update that one to specify the scholarly query service. Here's a working short link: https://w.wiki/B6j4 Ainali (talk) 09:32, 5 September 2024 (UTC)
Oh I should have thought of that. Thanks Jan. *Individual theses* would have a label in the scholarly subgraph, but not the subclasses, right? DrThneed (talk) 20:30, 5 September 2024 (UTC)
OK I thought that was OK on first glance but now I see the counts are completely different!
The query is returning 1654 master's theses for Lincoln University on the main graph and 94980 on the scholarly subgraph! The 1654 is the correct figure (and the numbers look to be correct for the initial query I posted without labels). What's going on? DrThneed (talk) 21:10, 5 September 2024 (UTC)
My fault, I should have counted the distinct theses when getting the labels. This gives your expected result with labels:
Template:SPARQL2
Real shortlink: [1] Ainali (talk) 22:05, 5 September 2024 (UTC)
Thanks Jan - needed a space after COUNT (https://w.wiki/B72w) but otherwise works! I'd like to understand why adding labels requires a 'distinct' here, when it doesn't for the same query on the main graph, is that something you can explain? DrThneed (talk) 22:21, 5 September 2024 (UTC)
@DrThneed The reason is a limitation of federation and Blazegraph. In Wikidata:SPARQL_query_service/WDQS_graph_split/Federation_Limits we explain that federation can happen in two different ways:
  • the host service sending data to the federated service (least efficient)
  • the host service receiving data from the federated service
In your query, federation works by sending the publications to the wikidata_main subgraph endpoint. Because there are many publications, it makes multiple requests (sending them in chunks), but the types it asks about are likely the same each time, so the same label is retrieved multiple times. Blazegraph is unable to determine that these are the same types, so they remain as duplicates.
I think that a better way to do what you want is using query-main and pulling the publications from the scholarly subgraph:
SELECT ?thesisType ?thesisTypeLabel (COUNT(?thesis) AS ?count) 
WHERE {
 hint:Query hint:optimizer "None" .
 SERVICE wdsubgraph:scholarly_articles {
  ?thesis wdt:P4101 wd:Q1048626;
          wdt:P31 ?thesisType
 }
 ?thesisType rdfs:label ?thesisTypeLabel .
 FILTER (LANG(?thesisTypeLabel) = 'en')
}  
GROUP BY ?thesisType ?thesisTypeLabel ORDER BY DESC (?count)
Try it DCausse (WMF) (talk) 09:18, 6 September 2024 (UTC)
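For comparison, the scholarly-hosted variant discussed earlier in this thread can be sketched as follows. This is not the query originally posted (that one is not preserved here), only a minimal reconstruction under assumptions: it is meant to run on https://query-scholarly.wikidata.org/, it relies on the prefixes WDQS predefines, and it uses COUNT(DISTINCT ...) so that labels duplicated by chunked federation do not inflate the counts.

```sparql
# Sketch only: assumes the query-scholarly endpoint and WDQS predefined prefixes.
SELECT ?thesisType ?thesisTypeLabel (COUNT(DISTINCT ?thesis) AS ?count)
WHERE {
  # The theses are in the scholarly subgraph, which is the host here.
  ?thesis wdt:P4101 wd:Q1048626 ;
          wdt:P31 ?thesisType .
  # The type labels are in the main graph, so they are fetched by federation.
  # The same label may come back once per chunk, hence the DISTINCT above.
  SERVICE wdsubgraph:wikidata_main {
    ?thesisType rdfs:label ?thesisTypeLabel .
    FILTER (LANG(?thesisTypeLabel) = 'en')
  }
}
GROUP BY ?thesisType ?thesisTypeLabel
ORDER BY DESC(?count)
```

Counting distinct theses only hides the duplicated rows; the query-main version above avoids producing them in the first place.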
Thank you for the explanation @DCausse (WMF), that's really helpful. So much to learn! DrThneed (talk) 04:52, 7 September 2024 (UTC)

inferring narrower occupations

Problem: we have large numbers of people with a sole occupation of "researcher" and a description either "researcher" or based on an ORCID. This makes disambiguation really hard.

Proposed solution: Most journals have a main subject, many of which are linked by a P3095 to an occupation, so we can link a human through articles to journals, then topics and occupations. If the person has 10 articles in Wikidata, picking the most common occupation linked to them should be a good approximation of their occupation.

Problem: So far the query I've got times out. How do I make it go faster so it doesn't time out? How do I ignore people who have an occupation of "researcher" AND another occupation?

Template:SPARQL2

Secondary problem: how do I find academic journals without P921's and P3095's?

Stuartyeates (talk) 10:18, 11 September 2024 (UTC)

Islands

A list of islands whose name (in English) begins with a letter from A to H.

Thank you! — Martin (MSGJ · talk) 13:04, 11 September 2024 (UTC)

Something like this...

Template:SPARQL Piecesofuk (talk) 14:53, 11 September 2024 (UTC)

Help with WDGS

Hi, I have a number of queries written as part of a project Wikidata:WikiProject LSEThesisProject and will need to re-write them due to the Graph Split. My SPARQL knowledge is basic and the queries produced were achieved by trial and error / modifying others' queries / kind help from the community. In preparation for trying to learn how I might re-write those queries I tried, using the Federation Guide, to write federated queries which would pick up all research outputs produced by an academic - this includes not only scholarly articles, but also book chapters, version edition translations, blog posts, chapters and articles. In the main graph as it was all these can be picked up in one query https://w.wiki/B6Ct but I'm failing to re-write this for the scholarly graph. I've tried

SELECT ?item ?itemLabel ?itemType ?itemTypeLabel
WHERE
{
  ?item wdt:P50 wd:Q17508688.
  SERVICE wdsubgraph:wikidata_main {
    ?item wdt:P50 wd:Q17508688.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". } # Helps get the label in your language, if not, then default for all languages, then en language
}

This gives me no results.


And I've tried

SELECT ?item ?itemLabel ?itemType ?itemTypeLabel
WHERE
{
  ?item wdt:P50 wd:Q17508688.
  UNION
  { SERVICE wdsubgraph:wikidata_main { ?item wdt:P50 wd:Q17508688}  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". } # Helps get the label in your language, if not, then default for all languages, then en language
}

Which gives an error message and says the query is malformed at UNION.

Would someone be able to point out what I'm doing wrong and show me how to produce these queries?

Thanks HelsKRW (talk) 08:40, 4 September 2024 (UTC)

@HelsKRW The UNION requires the parts to be wrapped with curly brackets:
  { ?item wdt:P50 wd:Q17508688. } 
  UNION 
  { SERVICE wdsubgraph:wikidata_main { ?item wdt:P50 wd:Q17508688}  }
Here below should be your query rewritten (to run on https://query-main.wikidata.org/):
SELECT ?item ?itemLabel ?itemType ?itemTypeLabel WHERE {
  VALUES (?author) {(wd:Q17508688)}
  {
    # get the publications from the scholarly subgraph 
    SERVICE wdsubgraph:scholarly_articles {
      ?item wdt:P50 ?author ;
            wdt:P31 ?itemType
      # Instruct the label service to gather the label of the publication
      # The label for ?itemType will be fetched in the host query, the type is probably part of the main graph
      BIND(?itemLabel AS ?itemLabel)
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }
  } UNION {
    # Union them with the publications in the main graph (blogs, articles...)
    ?item wdt:P50 ?author ;
          wdt:P31 ?itemType
  }  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it DCausse (WMF) (talk) 11:28, 4 September 2024 (UTC)
Thank you very much for your help. I've modified the query I'd written for the scholarly graph which is now working and I can see that the longer query you've written for the main graph is also working. Could you tell me more about how to know when the query should be written on the scholarly graph or the main graph? And would you be able to tell me more about the VALUES, BIND and UNION commands in the query you've written for the main graph. Using this query I've tried modifying some other queries, but I'm hitting up against a series of error messages and despite reading the federated guide am struggling to understand or get to grips with how to write a federated query. Thanks HelsKRW (talk) 10:25, 5 September 2024 (UTC)
Unfortunately, while writing Wikidata:SPARQL_query_service/WDQS_graph_split/Internal_Federation_Guide I could not find a reasonable and comprehensive set of characteristics to determine whether it's better to use query-main or query-scholarly for the host query. Generally both are doable, but for certain queries using one or the other greatly impacts the complexity of the query.
What I would suggest is perhaps using query-main first (this is the one I most often used when writing Wikidata:SPARQL_query_service/WDQS_graph_split/Federated_Queries_Examples) and consider using query-scholarly if the query happens to be difficult to write. I hope that with more examples we can improve the guide over time.
  • VALUES is a SPARQL feature that binds a variable to a fixed set of values. I used it to avoid repeating wd:Q17508688 in the two clauses around UNION, so that you can change it in a single place when you want to see the publications of another author.
  • BIND(?itemLabel AS ?itemLabel) is a trick we use to make wikibase:label understand that we want to keep the label of the item; this is explained at Wikidata:SPARQL_query_service/WDQS_graph_split/Internal_Federation_Guide#Misplacing_the_label_service. In general, though, BIND creates a variable: for instance, in place of VALUES (?author) {(wd:Q17508688)} I could have written BIND(wd:Q17508688 AS ?author).
  • UNION collects the results of multiple expressions: { EXPRESSION1 } UNION { EXPRESSION2 }. In the query above, EXPRESSION1 extracts the scientific publications (?item) and their labels (?itemLabel) from the scholarly subgraph, and EXPRESSION2 collects the other publications (blogs, articles) from the host service (here serving the wikidata_main graph). DCausse (WMF) (talk) 13:11, 5 September 2024 (UTC)
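The three constructs described above can be seen together in a stripped-down sketch. It assumes the query-main endpoint and the WDQS predefined prefixes; the ?source variable and its two string values are hypothetical additions for illustration, used only to show which UNION branch produced each row.

```sparql
# Sketch only: assumes query-main and WDQS predefined prefixes.
SELECT ?item ?source WHERE {
  # VALUES binds ?author once; both UNION branches reuse it.
  # BIND(wd:Q17508688 AS ?author) would be an alternative way to write this.
  VALUES (?author) { (wd:Q17508688) }
  {
    # Branch 1: publications stored in the host (main) graph.
    ?item wdt:P50 ?author .
    BIND("main graph" AS ?source)             # hypothetical marker
  } UNION {
    # Branch 2: publications stored in the scholarly subgraph.
    SERVICE wdsubgraph:scholarly_articles {
      ?item wdt:P50 ?author .
    }
    BIND("scholarly subgraph" AS ?source)     # hypothetical marker
  }
}
```

Each branch is a complete group wrapped in its own pair of curly brackets, which is the part UNION requires.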
Thank you, In practice I seem to be struggling with the UNION command - I've tried it in multiple queries and always get an error message, whatever combination of curly brackets I try!
If I take this query from my thesis project https://w.wiki/5aHL which gives me a list of LSE’s doctoral theses with author links to Wikipedia pages where available, and try to re-write it for the new main graph... I edit it to include the hint optimizer,  the SERVICE scholarly graph and BIND – the query runs, but gives me no results   https://w.wiki/B7Fj
So I try to add in the UNION command, but whatever I do with curly bracket combinations I get an error message so can’t run the query
SELECT ?thesis ?thesisDescription ?thesisLabel ?author ?authorLabel ?authorwp ?lse_url WHERE {
  hint:Query hint:optimizer "None" .
  SERVICE wdsubgraph:scholarly_articles {
  
  ?thesis wdt:P31/wdt:P279* wd:Q1266946 ;
   wdt:P953 ?lse_url.
  
    BIND(?thesisLabel AS ?thesisLabel)
     SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  }
  } UNION {
   # Union them with the publications in the main graph (blogs, articles...)
    ?thesis wdt:P31/wdt:P279* wd:Q1266946 ;
   wdt:P953 ?lse_url.
  } 
  OPTIONAL {
   ?thesis wdt:P50 ?author.
   OPTIONAL {
     ?authorwp schema:about ?author;
      schema:isPartOf https://en.wikipedia.org/.
   }
  }
FILTER(STRSTARTS(STR(?lse_url), http://etheses.lse.ac.uk))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY (?thesisDescription)
Are you able to advise what I’m doing wrong on this one?  HelsKRW (talk) 10:10, 6 September 2024 (UTC)
@HelsKRW Your query is syntactically incorrect because it does not balance the opening and closing curly brackets. With complicated queries like this I highly suggest using a proper wikipedia:Indentation_style to quickly identify where the problem is.
Every time a curly bracket is opened you indent the next line with 2 spaces to the right, when closing one you remove 2 spaces. Open or close only one curly bracket per line. With your query you could perhaps have identified that the problem happened right before the UNION where you have an extra closing curly bracket.
Similarly when not repeating the subject in the patterns (when using ;) try to align the predicates like this:
?thesis wdt:P31/wdt:P279* wd:Q1266946 ;
        wdt:P953 ?lse_url .
So that it's clearer that the wdt:P953 applies to the ?thesis.
Beyond that, several other things were incorrect.
Please see below your query rewritten with federation (to run on query-main) and some explanations in the comments:
SELECT
  ?thesis
  ?thesisDescription
  ?thesisLabel
  (COALESCE(IF(BOUND(?author), ?author, 'N/A')) AS ?author)
  ?authorLabel (COALESCE(IF(BOUND(?authorwp), ?authorwp, 'N/A')) AS ?authorwp)
  ?lse_url
WHERE {
  hint:Query hint:optimizer "None" .
  # Ideally we want to select thesis with: ?thesis wdt:P31/wdt:P279* wd:Q1266946
  # This property path might require navigating triples in the two subgraphs and thus we can't use it
  # We extract ?thesisType first so that we will match it with a simple pattern ?thesis wdt:P31 ?thesisType
  ?thesisType wdt:P279* wd:Q1266946 .
  {
    SERVICE wdsubgraph:scholarly_articles {
      SELECT ?thesis ?thesisLabel ?thesisDescription ?thesisType ?lse_url (COALESCE(IF(BOUND(?author), ?author, 'N/A')) AS ?author) { 
        ?thesis wdt:P31 ?thesisType ;
                wdt:P953 ?lse_url.
        FILTER(STRSTARTS(STR(?lse_url), "http://etheses.lse.ac.uk"))
        # We return a variable bound in an OPTIONAL clause, we have to be careful here 
        # see https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Internal_Federation_Guide#Returning_variables_bound_by_OPTIONAL
        OPTIONAL { ?thesis wdt:P50 ?author. }
        # No need to use the BIND(?thesisLabel AS ?thesisLabel)/BIND(?thesisDescription AS ?thesisDescription) trick here since we wrap our federated query
        # with a SELECT to workaround issues with the optionally bound ?author variable
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
      }    
    }    
  } UNION {
    # Union them with the publications in the main graph (blogs, articles...)
    ?thesis wdt:P31 ?thesisType ;
            wdt:P953 ?lse_url.
    FILTER(STRSTARTS(STR(?lse_url), "http://etheses.lse.ac.uk"))
    OPTIONAL { ?thesis wdt:P50 ?author. }
  }
  OPTIONAL {
    ?authorwp schema:about ?author;
              schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY (?thesisDescription)
DCausse (WMF) (talk) 13:37, 6 September 2024 (UTC)
Thank you for this, and all the extra detail to help my learning, which I'm just working through. I've tried on a couple of days to save the query on the main graph, but get a message to say URL shortening failed...and I'm getting that with one other query on the main graph today, though have been able to get shortened URLs for plenty of other queries - is this the place to report that, or somewhere else? Thanks! HelsKRW (talk) 11:22, 12 September 2024 (UTC)
Unfortunately it is a known limitation that I face myself. I'm not sure how others work around it, but for my part I simply copy/paste the whole URL into the wikitext. If I want to show the query in the page I sadly have to repeat it twice:
- once with the mw:Extension:SyntaxHighlight using lang="sparql"
- once by copy/paste the full URL in an external link like: [https://query-main.wikidata.org/#AWFULLY%20LONG%20AND%20UNREADABLE%20URL%20PARAMETERS Try it!]
<syntaxhighlight lang="sparql">
SELECT * {?s ?p ?o} LIMIT 1
</syntaxhighlight>
[https://query-main.wikidata.org/#SELECT%20%2a%20%7B%3Fs%20%3Fp%20%3Fo%7D%20LIMIT%201 Try It!]
Template:SPARQL does not yet support query-main nor query-scholarly but if it does at some point I suppose this might be quite handy. DCausse (WMF) (talk) 06:48, 13 September 2024 (UTC)
Thank you! HelsKRW (talk) 10:18, 13 September 2024 (UTC)

Slightly different results after federating a query

I noticed slightly different numbers in the results between my ordinary query and my rewritten for WDGS query. What's going on (probably I did something wrong!) The query is to count the types of things that main subjects of my theses are. The original query: Template:SPARQL2 The rewritten query: Template:SPARQL2 DrThneed (talk) 22:21, 11 September 2024 (UTC)

Oh - I realised it probably means there is some publication(s) in the thesis project that isn't in the scholarly subgraph for some reason and so its main subjects are the reason for the difference. We have a few things like reports, papers, etc, but I would have thought they all fell into the scholarly subgraph. How can I figure out which publication(s) that is? DrThneed (talk) 22:38, 11 September 2024 (UTC)
OK, never mind - reviewed the list of types of things in the project. I suspect there is a qualification or similar thing that falls within the project and has a main subject statement on, but isn't a publication. DrThneed (talk) 23:18, 11 September 2024 (UTC)

Query to find all Renaissance Artists born in Italy

Hi, I am totally new to Wikidata and SPARQL. I am studying, but an example to start with would be awesome! Can I get all the names of artists from the Renaissance movement who were born in Italy? Is that sufficient information to create a query? Thank you! 93.151.230.93 20:13, 15 September 2024 (UTC)

Template:SPARQL

So this is my solution. It shows only the people for whom there is information about the movement. Maybe another approach would be to go by birth date. --sk (talk) 13:37, 16 September 2024 (UTC)

List of cyclists and URLs to Wikipedia in different languages

Hi, using the Wikidata Query Service, I've managed to get a list of Wikidata entries with a ProCyclingStats page.

Template:SPARQL

What I'd now like, is to have the URLs to the English language article, and let's say the Spanish and French one (if they exist). Any idea if this is at all possible? Yannick1 (talk) 10:33, 16 September 2024 (UTC)

Template:SPARQL

Is this what you want? --sk (talk) 13:16, 16 September 2024 (UTC)

That is perfect, thank you very much! Yannick1 (talk) 15:02, 16 September 2024 (UTC)

humans without source ?

Hello! I'd like to see a list of humans, on the Dutch wikipedia, that don't have any source listed. Preferably with some info like birth date and place, gender, the wikidata description. Thanks! 81.164.2.207 21:30, 7 September 2024 (UTC)

Template:SPARQL

Hi, I tried hard, and I think this is close to the right answer. At the moment I can only count the number of references. Maybe other people can improve my query. Best regards. --sk (talk) 15:04, 20 September 2024 (UTC)

Slice, how does it work?

Hi there! Some time ago, I've received some help to run a large query, where the solution was to slice the results. Here's the relevant snippet:

Template:SPARQL

That worked OK! However, when I changed the element used (p:P569) I got different results (and a different number of items). So I'd like to better understand how it works and how I can use it. Does the element selected for the slice affect the results? I couldn't find any documentation or details about it. Pruna.ar (talk) 21:01, 13 September 2024 (UTC)

It's well hidden, but there you go: https://blazegraph.com/database/apidocs/com/bigdata/rdf/sparql/ast/eval/SliceServiceFactory.html
For all triples in the database, it returns the triples that match the basic graph pattern you provide, starting from the offset you provide and returning at most limit triples.
There are two use-cases for the slicing service, either you want predictable pagination or it is used to optimize a slow query. When it is used to optimize a slow query it is because an intermediate join is too large. Normally the SPARQL optimizer will try to order the joins such that it starts with the smallest possible set, but this doesn't always work.
And yes, you will get different results based on the BGP you choose for the slice service, as you artificially restrict that set. Suppose we have two sets, A: items matching ten male given names, and B: the set of all humans that have ever lived. If we restrict set A to 100 items and AND it with set B, you might expect the resulting set to also have 100 items. For the second try, let's restrict set B to 100 random humans and AND those sets together. We will probably get fewer than 10 items in the resulting set, depending on how common the names are. $A_{\text{subset}} \cap B \neq A \cap B_{\text{subset}}$. Does that make sense? Infrastruktur (talk) 22:27, 13 September 2024 (UTC)
Thanks @Infrastruktur for your fast & instructive response! So the slice occurs before the joins, right?
I plan to use the different time properties (in different queries), so I'd need to first get a total count of elements for each and slice each time property depending on the total number of elements it has. Am I thinking about this correctly? Perhaps an example is needed to explain myself better? Pruna.ar (talk) 00:11, 14 September 2024 (UTC)
Yes, the slice occurs before the join. You also shouldn't need to worry about the size of the input sets. If one of the sets is 100 items long and we request a slice of 10 and keep increasing the offset by 10 each time, this will produce all the different output combinations. It shouldn't matter which set you choose, as I think combined they will all produce the same output set, but each iteration might look different and be a different size. Another way of saying it is that you walk through all of the subsets of B, which combined are the whole of set B, and so you effectively take the intersection between sets A and B. $A \cap B = \bigcup (A \cap B_{\text{subset}})$. Hope the notation doesn't make any mathematicians cry. Infrastruktur (talk) 06:32, 14 September 2024 (UTC)
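The slicing described above can be sketched as follows. The original snippet is not preserved on this page, so this is only an illustration under assumptions: it uses the bd:slice parameter names as given in the Blazegraph SliceServiceFactory documentation linked above, the WDQS predefined prefixes, and an arbitrary offset and limit. The item class (wd:Q625298) and the date property (P585) are borrowed from elsewhere in this thread.

```sparql
# Sketch only: offset and limit are illustrative values, not recommendations.
SELECT ?item ?date WHERE {
  # The slice is taken first, over a single triple pattern:
  # at most 100000 triples matching it, starting at offset 0.
  SERVICE bd:slice {
    ?item wdt:P585 ?date .
    bd:serviceParam bd:slice.offset 0 .
    bd:serviceParam bd:slice.limit 100000 .
  }
  # The join with the rest of the query happens afterwards, on that slice only.
  ?item wdt:P31 wd:Q625298 .
}
```

Re-running the query with the offset increased by the limit each time (0, 100000, 200000, ...) walks through the remaining slices; the union of the per-slice results is what the unsliced join would have returned.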
Let me share some examples about the size. Let's assume I'm looking for peace treaties with location and time.
An initial query is:
Template:SPARQL
That brings 272 results, using point in time (P585). There are other time properties that could be used instead, like start time (P580), which shows 3 results, or publication date (P577), which shows 4.
Now let's assume for a moment that I need to slice it, so I change and add it for an intermediate part:
Template:SPARQL
As you explained, that will show only a portion of the results, and that's ok. If I run the other slices, I'll get the whole results.
But, if I change it to the next time property P580 I'll get an error: Unknown error: offset is out of range. That's what I mean about taking care of the size. As I change the property used for slicing, I must check the total size and avoid exceeding it. Pruna.ar (talk) 19:42, 14 September 2024 (UTC)
@Infrastruktur about your references to size, I'm not sure about how to "measure" it.
For example, if I try to count the elements, it shows only the results, not the possible triples. Here is a sample query I tried in order to identify the quantity for one of the examples I shared:
Template:SPARQL
Any ideas on how to calculate the limits for each element I use for slices?
Thanks! Pruna.ar (talk) 02:06, 18 September 2024 (UTC)
It's inconvenient to check the sizes of input sets. On stock WDQS I would CTRL+left click on the Wikidata icon to bring up another browser tab to avoid messing up my query, then copy and paste in the basic graph pattern whose size I would like to see: 'SELECT (COUNT(*) AS ?count) WHERE { ?item wdt:P31 wd:Q625298 . }'. To check the size of a join I would write that join between 2 BGPs, but we have no idea in what order the query engine will do the joins, so for that I would have to run an EXPLAIN query to see what is going on under the hood. For your initial query it would look something like this: [2]. This also conveniently gives you the sizes of all of the sets and the sizes of all the joins in the order that they happen, plus a metric ton of other information. If you spot any joins that result in more than 100 000 items, that would be a red flag, more so if it happens early. The value for the slice limit could be a lot bigger though; it is the size of the resulting intermediate set(s) that ends up impacting performance. Infrastruktur (talk) 09:21, 18 September 2024 (UTC)
Thanks @Infrastruktur. As you mention, that explain has tons of info. I couldn't understand most of it. However, a couple of specific lines in the "Query Evaluation Statistics" section, where predSummary column refers to the date property (something like SPOPredicate[3](?item, Vocab(18)[3]:XSDUnsignedShort(585), ?--pp-anon-80770bc8-6329-4fb2-adb9-b85b9f33ae6b)), I can see that fastRangeCount is a little over 1M (so my sliced worked ok), and if I change it to property P580 it's 815k, and then an offset of 1M raised an error. Pruna.ar (talk) 21:14, 18 September 2024 (UTC)
Hi @Infrastruktur & community, me again :-)
As I evolve in this review, I was trying an alternative to search for any document (Q49848). I used then a very short query where I'm extending the search all over the hierarchy by using (wdt:P31|wdt:P279)+, so the code looks like:
Template:SPARQL
In this scenario, even the EXPLAIN can't show the results. And as the idea is to look for every instance of or subclass of, it grows a lot. Slicing by time might not help much (as I was planning to do before), and I'm not sure if it's possible to slice when there's that "or". Any ideas to consider here? Pruna.ar (talk) 21:40, 19 September 2024 (UTC)
Template:SPARQL
Yes, you cannot use the slice service on property chains, or when there are operators like |?+*. The EXPLAIN query shows us what is happening here. If we look just at P279+ (ArbitraryLengthPathOp) we see it splits off a lot of subqueries that we didn't explicitly ask for. From that we can surmise it breaks down the operation into the equivalent query shown above. It's also a hash join operation, so a lot of the operations can run in parallel. It's still a fairly expensive operation, especially on long chains like P279 or P131. Infrastruktur (talk) 14:30, 20 September 2024 (UTC)

List of persons whose age is a multiple of 25

I would like a list of people who are celebrating a milestone birthday this year (25, 50, 75, 100, 125, 150, etc.). I've got this far and now I can't manage to filter them:

Template:SPARQL Rerumscriptor (talk) 16:36, 14 September 2024 (UTC)

Adding this filter should work: FILTER (?age - (25 * xsd:integer( ?age / 25 )) = 0) Piecesofuk (talk) 17:32, 14 September 2024 (UTC)
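The arithmetic behind that filter can be tried in isolation with a few hard-coded ages. This sketch swaps in FLOOR() for the xsd:integer() cast as the way to drop the fractional part of ?age / 25 (an assumption about what the cast is meant to do here, not the filter as posted), and the ages in VALUES are made-up test values.

```sparql
# Sketch only: the ages below are arbitrary test values.
SELECT ?age WHERE {
  VALUES ?age { 24 25 50 60 75 99 100 }
  # FLOOR(?age / 25) is the number of whole 25-year blocks in ?age.
  # Subtracting 25 times that from ?age leaves the remainder,
  # which is 0 only for exact multiples of 25.
  FILTER (?age - (25 * FLOOR(?age / 25)) = 0)
}
```

Of the test values, only 25, 50, 75 and 100 have a remainder of 0, so those are the ages the filter is meant to keep.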
Thank you very much, that was really helpful! Rerumscriptor (talk) 16:02, 15 September 2024 (UTC)
@Rerumscriptor: I put together a query that shows the people connected to Dresden who are celebrating a milestone anniversary: User:Stefan_Kühn/Dresden#Personen_mit_Bezug_zu_Dresden,_die_heute_ein_Jubiläum_haben. Maybe that will help you too. --sk (talk) 13:42, 16 September 2024 (UTC)
@Stefan Kühn Many thanks for the pointer! That is exactly what I was looking for. With so many possibilities, it is sometimes simply hard to find the right one. Rerumscriptor (talk) 19:13, 17 September 2024 (UTC)

Filter by instance of country doesn't work for Bosnia and Herzegovina

Hi there,

I'm relatively new to SPARQL, so there's a good chance I'm missing something obvious.

I am trying to get a list of all countries:

Template:SPARQL

In the result, Bosnia and Herzegovina is missing, although it has the statement instance of country. If I compare the Wikidata entry of Bosnia and Herzegovina with that of the United States of America, I notice that the country statement has a different color. Maybe this difference is why Bosnia and Herzegovina isn't found by my query?

Many thanks in advance! ChakeMH (talk) 09:16, 18 September 2024 (UTC)

Yes. "wdt:" returns best-values, so if one statement is marked with preferred rank it will not match the other claims. This looks like improper use of ranking however, I don't see why "sovereign state" is somehow more correct than "country". Infrastruktur (talk) 10:16, 18 September 2024 (UTC)
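One way to match every instance of statement regardless of rank, as described above, is to go through the statement nodes (p: and ps:) instead of the best-value shortcut (wdt:). The following is a sketch under assumptions rather than the query the poster ended up with: it relies on the WDQS predefined prefixes and takes wd:Q6256 to be the "country" item. Excluding deprecated statements is an extra choice made here for illustration.

```sparql
# Sketch only: assumes WDQS predefined prefixes and wd:Q6256 = country.
SELECT DISTINCT ?country ?countryLabel WHERE {
  # p:/ps: reaches every P31 statement, not only the best-ranked ones.
  ?country p:P31 ?statement .
  ?statement ps:P31 wd:Q6256 ;
             wikibase:rank ?rank .
  # Optional: skip statements explicitly marked as deprecated.
  FILTER (?rank != wikibase:DeprecatedRank)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Because normal-rank statements are now included, the result can be broader than the wdt: version, for example items where "country" is kept at normal rank next to a preferred, more specific class.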
Thank you for giving the right hints! I was able to change the query to match every instance of country statement. ChakeMH (talk) 11:10, 18 September 2024 (UTC)
I have this nice SPARQL query for you. It asks for the capitals of countries. Maybe this helps too. --sk (talk) 08:31, 20 September 2024 (UTC)

Query that locates categories with most interwikis that still don't have a corresponding category on Hebwiki

I attempted to create a query on Wikidata that would help identify the categories with the most interwiki links which still don't have a corresponding article at the Hebrew Wikipedia (hoping that by doing this I would be able to create some of the most popular categories at Wikipedia which haven't been created yet on the Hebrew Wikipedia). Unfortunately, the query keeps timing out.

This is what I have created so far:

Template:SPARQL2

I would appreciate any help in optimizing the query so that it would actually work. WikiJunkie (talk) 19:49, 21 September 2024 (UTC)

issues where not all items are returning with query-scholarly

I am trying to create a SPARQL query that retrieves all items that are part of the NeuroMat project (Wikidata item Q18477654), using the property "part of" (P361). Additionally, I want to retrieve the retrieved date (P813), which is stored within the references of the "part of" statements.

In the query, I am accessing the references associated with each "part of" statement to extract the retrieved date. However, not all items return their associated retrieved date; specifically, I'm unable to retrieve the P813 values after July 2013.

I'm working with https://query-scholarly.wikidata.org/.

Here is the current SPARQL query I am using: Template:SPARQL Adriano NeuroMat (talk) 19:46, 23 September 2024 (UTC)

Olympic medalists

Hi folks! How would you approach getting a list of Olympic medalists? I tried using Template:P/Template:P : Template:Q, but there are very, very few results. Strainu (talk) 11:29, 10 September 2024 (UTC)

With this SPARQL query I get a list of over 13000 Olympic medalists, like Template:Q.
Template:SPARQL
But you will not get winners for whom Wikidata has no data about the win, like Template:Q. --sk (talk) 09:17, 20 September 2024 (UTC)
Thank you @sk, this looks very promising! Strainu (talk) 18:44, 27 September 2024 (UTC)

Top users of a property

Hello, would it be possible to write a query to identify the top users of a specific property? The goal is to add the query link to the property's discussion page (via Template:Property documentation), so we can easily notify these users when discussions arise. Thanks in advance! Ayack (talk) 13:25, 24 September 2024 (UTC)

Top users? Do you mean the Wikipedia pages that use the that claim of that property, or users who have created such claims? I did a query for the latter for fun, and as I suspected after 60 minutes it does the database equivalent of crying Uncle :p Infrastruktur (talk) 10:22, 25 September 2024 (UTC)
MariaDB [wikidatawiki_p]> select actor_name, substring(comment_text, 24) as property, count(rev_id) as count
    from revision
    join comment_revision on rev_comment_id = comment_id
    join actor_revision on rev_actor = actor_id
    where actor_user is not null and comment_text like '/* wbcreateclaim:1 */ %'
    group by property, actor_name
    order by property, count desc;
ERROR 2026 (HY000): TLS/SSL error: unexpected eof while reading
I'd like to find the users who have added the most statements using this property. Ayack (talk) 11:26, 25 September 2024 (UTC)

Top Actors

Hi. I need the birth dates of the top 3000-6000 actors, the ones that are considered popular. I'm quite proficient in authoring advanced SQL statements, but this specific dialect will take me time to master. I hope someone will come up with the perfect query. I wouldn't mind aliases though, as I've seen them in some examples. Vaqanif (talk) 16:58, 24 September 2024 (UTC)

Hello Vaqanif, there are two examples:
But this query may be very slow, so it would be better to be more specific. Did you mean Template:Q (cinema and theatre) or Template:Q? To avoid timeouts I use LIMIT 100.
Template:SPARQL
If you can be more specific, then you can improve this query. Best regards --sk (talk) 10:34, 28 September 2024 (UTC)
Thank you so much @Stefan Kühn. I got some nice results with the help of ChatGPT. Do you mind telling me how to find out what this specific item (film actor, in our case) has attached to it? Say I want to know about composers and I find out that the code is Q36834. What about the rest? How can I possibly know what else Wikidata has stored there? Birth date? Birth place? When I open the webpage for film actor I can't see any table or info on what more I can get from the item. Thanks in advance. Vaqanif (talk) 11:11, 28 September 2024 (UTC)
The best way to see what information Wikidata holds about a film actor is to look at some well-known people like Template:Q or Template:Q. There you can learn which properties are available. Such humans are usually a good start. We have actors, films and all the other staff members. Another good start is to look at the infoboxes on Commons, as in c:Category:The Matrix, or directly in Wikidata at a film like Template:Q. There you find the properties for films. All these things are available in a query, and also via the API. See also Presentation: A First Date with Wikidata or the API under [4]. --sk (talk) 11:45, 28 September 2024 (UTC)
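
The question of what an item has attached to it can also be put to the query service directly. The following is a minimal Python sketch, not something posted in this thread: properties_of is a hypothetical helper that only builds the query text, and it relies on the standard WDQS wikibase:directClaim link between a property and its wdt: predicate.

```python
def properties_of(qid: str) -> str:
    """Return a SPARQL query listing the properties and values stated on one item."""
    return f"""
SELECT ?property ?propertyLabel ?value WHERE {{
  wd:{qid} ?claim ?value .                  # every truthy statement on the item
  ?property wikibase:directClaim ?claim .   # map the wdt: predicate back to its property
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?property
"""


# Example: the item mentioned above for composer.
COMPOSER_PROPERTIES = properties_of("Q36834")
```

Running the same query on the item of a well-known actor instead shows properties such as date or place of birth that can then be used in a query over all actors.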

Query with WP articles

I've been trying to create a query of all opera singers who are sopranos. That's easy. What I can't figure out is how to generate a list based on that query that will limit the list to those items having Wikipedia articles. I tried using the Query Builder but I've not figured out what a WP article is an instance of. Any help appreciated! - Kosboot (talk) 18:47, 27 September 2024 (UTC)

Hello Kosboot, here is my first query for you:
Template:SPARQL
This returns 2295 opera singers. But not all singers have a voice type recorded, so here is a second query.
Template:SPARQL
Now you get an additional 60 opera singers who are sopranos, and you can add the voice type in Wikidata. You can easily change this second query to use descriptions in other languages, which gives you many more soprano opera singers. Best regards --sk (talk) 08:03, 28 September 2024 (UTC)
Thank you so much Stefan! It works perfectly! The 2nd query with those who lack the voice type is very valuable! Vielen Dank! - Kosboot (talk) 00:36, 29 September 2024 (UTC)
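
On the original question of restricting results to items with a Wikipedia article: a Wikipedia article is not an "instance of" anything on the item; it is a sitelink, which the query service exposes through schema:about and schema:isPartOf. The sketch below is an illustration, not the query from the replies above. The IDs Q2865819 (opera singer) and Q30903 (soprano) and the helper name wikipedia_clause are assumptions to check before use.

```python
def wikipedia_clause(item_var: str, lang: str = "en") -> str:
    """Return a pattern that matches only items with an article on <lang>.wikipedia.org."""
    return (
        f"?article schema:about ?{item_var} ;\n"
        f"           schema:isPartOf <https://{lang}.wikipedia.org/> ."
    )


def sopranos_with_article(lang: str = "en") -> str:
    """Build a query for soprano opera singers that have an article on the given Wikipedia."""
    return f"""
SELECT ?singer ?singerLabel ?article WHERE {{
  ?singer wdt:P106 wd:Q2865819 ;   # occupation: opera singer (assumed ID)
          wdt:P412 wd:Q30903 .     # voice type: soprano (assumed ID)
  {wikipedia_clause("singer", lang)}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang},en". }}
}}
"""
```

Dropping the schema:isPartOf line would match a sitelink on any Wikimedia project rather than one specific Wikipedia.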

First-level administrative divisions that share borders

I came across the tool Wikidata Graph Builder, particularly the linked graphs for US states, and was interested in trying something similar for the states of every country in the world. Although I understand that the border property is somewhat incomplete, I still believe my attempts are producing faulty results:

Template:SPARQL Akaibu (talk) 01:37, 29 September 2024 (UTC)

All people without family names

Hello everyone, despite the vast number of humans in WD, is there a way to query all humans without a family name? --Arnd (talk) 10:37, 29 September 2024 (UTC)

I just saw that I can do a simple search. But how can I get/download all results? --Arnd (talk) 10:58, 29 September 2024 (UTC)

@Aschroet: For a download you need to use the API or split your 6 million results, like in this query. Maybe this query is also useful for you. (Both queries work, but today there are timeouts.) By the way, your simple search with Elasticsearch is very cool and fast. --sk (talk) 12:08, 29 September 2024 (UTC)

Object parts recursively

Seeking a Wikidata SPARQL query to recursively get an object's components as a hierarchy, e.g. the parts of a biological cell, each of their individual parts in turn, and so on. The result should represent the structure/logic/tree, e.g. (cell,has,nucleus),…(nucleus,has,chromosome),… I.e., returning just one long flat inventory list is appreciated but is not a complete answer. Also, please provide a generic solution applicable to any topic (parts of a house/car etc.), not hard-coded to a specific object. Many thanks! 2600:1700:5200:D20:80C9:8A34:4F88:3B24 19:23, 29 September 2024 (UTC)

You… You are looking for the GAS service (Gather, Apply, and Scatter). It lets you traverse trees :) Have fun, Piastu (talk) 07:33, 30 September 2024 (UTC)
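
Besides the GAS service, a plain property path can return the hierarchy as (parent, child) edges, which can then be assembled into a tree on the client. The following is a minimal sketch under stated assumptions: it uses P527 ("has part(s)") as the only part relation, the function names are illustrative, and the edge list in the usage note is made-up sample data, not query output.

```python
from collections import defaultdict


def parts_query(root_qid: str) -> str:
    """Build a query returning every (parent, child) 'has part(s)' edge below the root item."""
    return f"""
SELECT ?parent ?parentLabel ?child ?childLabel WHERE {{
  wd:{root_qid} wdt:P527* ?parent .   # the root and everything reachable via has part(s)
  ?parent wdt:P527 ?child .           # one edge of the hierarchy
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


def build_tree(edges):
    """Group (parent, child) pairs into a parent -> list of children mapping."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    return dict(children)


def render(children, node, depth=0, seen=frozenset()):
    """Return indented lines for the subtree under node; stops on cycles."""
    lines = ["  " * depth + node]
    if node in seen:  # cycle guard: part relations in the data may loop
        return lines
    for child in children.get(node, []):
        lines.extend(render(children, child, depth + 1, seen | {node}))
    return lines
```

With the hypothetical edges [("cell", "nucleus"), ("nucleus", "chromosome"), ("cell", "membrane")], "\n".join(render(build_tree(edges), "cell")) gives an indented outline with nucleus and membrane under cell and chromosome under nucleus. The query is generic: pass any item ID to parts_query, for example the item for a house or a car.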