Testwiki:Property proposal/property for implied values

From testwiki
Jump to navigation Jump to search

Template:Property proposal

Note: we have considered various labels for this property: "property for implied values", "transitive over", "refinement hierarchy" and currently "value hierarchy property". All these variations are used in the discussion below. − Pintoch (talk) 21:40, 15 January 2019 (UTC)

Motivation

I would be interested in expressing that a given property (such as Template:P) is used to determine how values of another property (Template:P) can be refined. This is a pattern that we use at various places in Wikidata:

Potentially this property could be multi-valued if there are multiple refinement hierarchies in place (but I can't think of an example right now). I am not very happy with the label so I hope others have better ideas.

The idea behind introducing such a property would be to make it easier for data import tools (such as OpenRefine) to respect these hierarchies natively (for instance, when adding Template:Statement to an item, any unsourced Template:Statement statement could be safely deleted, I think).

This could also potentially improve the constraint system: for instance, if we have an Template:Q which requires Template:Statement, then the constraint system could also accept Template:Statement (because it is more precise). As far as I am aware this behaviour is only available for Template:P/Template:P via Template:Q currently. Template:Ping I have no idea if it is feasible (and in which case you could prefer another syntax for this information)? − Pintoch (talk) 15:09, 8 January 2019 (UTC)

Template:Notify project

Template:Ping you have requested this feature at Help_talk:Property_constraints_portal/ItemPintoch (talk) 15:14, 8 January 2019 (UTC)

Discussion

  • Template:S would a label something like "property over which this is transitive" make sense? But yes, this seems like a very useful abstraction to provide a structured mechanism for. ArthurPSmith (talk) 19:36, 8 January 2019 (UTC)
  • This can get complicated. Template:P is only transitive over Template:P until outside the occupation class tree. --Yair rand (talk) 23:45, 8 January 2019 (UTC)
    • Good point. But values outside the occupation class tree are already forbidden by a Template:Q, so maybe we can say that the implication only holds if the broader value is acceptable with respect to other constraints? Or we could indicate the containing class with a qualifier? Anyway, for both applications mentioned (data import and constraint violations), this should not be an issue. − Pintoch (talk) 06:57, 9 January 2019 (UTC)
  • Template:S David (talk) 12:53, 9 January 2019 (UTC)
  • Template:Comment New label ("transitive over") works for me! ArthurPSmith (talk) 16:28, 10 January 2019 (UTC)
    • Template:Re given MisterSynergy's reaction below, I have the feeling that this label is too close to "transitive" and might cause confusion for others. Maybe something like "refinement hierarchy" would be better - it would also rule out values like Template:P which are symmetric properties. − Pintoch (talk)
  • Template:Wait How do other ontologies handle this? Where is prior art? Especially, when it comes to naming this relationship?
ChristianKl❫ 18:23, 10 January 2019 (UTC)
  • Template:Re yes indeed surely this has been formalized in other places. I will investigate. − Pintoch (talk) 20:12, 10 January 2019 (UTC)
  • Template:Re I have asked around and it does not seem to have a particular name, it is just a particular case of an OWL construct: property chains. People in graph theory or other fields of math might have names for that particular relation, but I am not sure how to look for it (and it is likely to be as obscure as any other name we come up with, TBH). − Pintoch (talk)

Template:Ping project

  • (Thanks for the ping.)
    Template:Tl (see below). For all given examples, the value properties Template:P, Template:P, and Template:P have already an explicit relation Template:P: Template:Q. This transitivity is valid for any kind of use, not just the ones which should use the property proposed here. As only Template:Q values should be allowed according to the proposal, we do not gain anything with this proposed property which is not already there. —MisterSynergy (talk) 19:22, 10 January 2019 (UTC)
    • Template:Re sorry, it looks like I did not pick the best examples. Please consider this case:
Template:Statement
Template:Statement
Template:Statement
However, it would be wrong to say that Template:Statement.
So, Template:P is transitive over Template:P but not over Template:P (and both Template:P and Template:P are transitive).
So this notion of "X transitive over Y" is distinct from "Y transitive". Do you agree? Maybe the new label is causing confusion? − Pintoch (talk) 20:12, 10 January 2019 (UTC)
If you prefer symbolic statements over prose:
  • Q is transitive when xyz,xQyyQzxQz
  • P is transitive over Q when xyz,xPyyQzxPz
Pintoch (talk) 20:58, 10 January 2019 (UTC)
Another counterexample found by sparqling around: Template:P is transitive. Template:P is transitive over Template:P (if X has child Y and Y has sibling Z, then X has child Z), but Template:P is not transitive over Template:P (if X has mother Y and Y has sibling Z, then surely Z is is not X's mother!). − Pintoch (talk) 23:47, 10 January 2019 (UTC)
Template:Re I don't think so. Halfsiblings frequently are siblings. ChristianKl❫ 07:16, 11 January 2019 (UTC)
Template:Re yes that is true − this example is not something we would want to have anyway for the two applications mentioned above (because Template:P is symmetric, so it cannot be seen as a refinement hierarchy like the other examples). I just hope it helps Template:U' grasp the difference with transitivity a bit better (in particular, the fact that Template:P is not transitive over Template:P is still a counterexample of his claim, I think). − Pintoch (talk) 07:52, 11 January 2019 (UTC)
Yeah, I understand. However, this is the same problem that User:Yair_rand mentioned above and which you acknowledged, just in a much more obvious sense. Transitivity of a property Y (e.g. Template:P) used in value items of property X (e.g. Template:P) holds only as long as you stay within the range of X (Template:Q we say at Wikidata). It works for a couple of steps for occupations and then breaks down, but barely for parthood relations which are typically very messi at Wikidata. It works practically always for the proposed Template:PTemplate:P, as the range of the latter is completely contained within the range of the former; the problem does not really matter for the proposed Template:PTemplate:P, as Template:P does not have a defined range.
There is another, independent problem with parthood relations in particular, in contrast to other transitive properties (Template:Q is the research field that studies parthood relations). For parthood relations transitivity is typically assumed within mereology, but there is research in the community about whether this is generally valid or not (see here). Consider for instance these (fictional) claims:
Although both claims would be acceptable for Wikidata and Template:P is transitive, you clearly would not infer that Template:Claim. The problem is that parthood relations are only transitive under certain circumstances, for example in a local meronomy (such as “Pintoch's body parts” or “Wikimedia project entites”). However, parthood relations are not generally transitive.
I still do not think that this proposed property should be created. The range problem raised by Yair_rand would have to be addressed with extra efforts based on existing Template:Q claims on properties anyway (so the proposed property does not add new value), and the parthood problem would rather be solved by removing the transitivity from Template:P/Template:P. (I still have to have a look at the Template:P problem, though.) —MisterSynergy (talk) 10:30, 11 January 2019 (UTC)
Template:Re you rightly point out that transitivity itself is often messy. I would argue that this is not just true for Template:P: very often, long chains of Template:P also give nonsensical results when we follow them high up into very abstract concepts. That's inevitable: we are building a knowledge graph, not a math textbook.
Let's step back and look at the motivation of the proposal. What I mean with this proposal is that many properties tend to have one designated property for their refinement hierarchy. It seems to me that this pattern is pretty pervasive, but we tend to only acknowledge this for the Template:P/Template:P pair. Why?
It would be massively useful if the constraint system could handle other such hierarchies. Typically when working with Template:Q, the fact that we discourage subclasses of Template:Q makes the type system useless there. Have you ever wanted to enforce things like "every item with an Template:P should have Template:P Template:Q, or any subclass of it"? Now, I could propose to add a parameter to the constraints definition so that we could provide Template:P there, but I think this would be misguided as this information is really determined by the property used for the required statement. So I think this should be stored in a generic way, outside the constraint system. Now I would very much welcome any other suggestion about how to represent this. − Pintoch (talk) 10:54, 11 January 2019 (UTC)
Also (sorry for the long reply!), it seems to me that there might still be a conceptual misunderstanding here: in the abstract, the fact that a property Q is transitive really does not mean that "x P y Q z" implies "x P z" for any property P. This is wrong both in mathematics and in ontology design. So there is nothing to "fix" about Template:P being transitive in this regard. It turns out that most of the transitive properties we have in Wikidata are containment relations, where this tends to hold often,, but that does not have to be the case. For instance, imagine we had a property called "hates". The fact that "MisterSynergy hates Template:Q" really does not imply that "MisterSynergy hates Template:Q", even if the chain of Template:P is totally correct. − Pintoch (talk) 11:41, 11 January 2019 (UTC)
Now Template:Neutral, as I do not want to stand in the way too much. I changed my mind after a second read of this proposal and its comments. The transitivity does no longer seem to be the critical aspect of this property, and I really hope that this term does not appear in the potential property label. It is also worth to mention that it may be useful for refinements, i.e. looking *down* along a hierarchy for potentially better/more specific values, which pretty much avoids the discussed problems that arise when going *up* a hierarchy towards more general values. —MisterSynergy (talk) 11:53, 13 January 2019 (UTC)
Yes, it could be interesting to have something for implications which are going down a hierarchy. I cannot think of an example though (beyond my "hates" example above, maybe). − Pintoch (talk) 15:21, 14 January 2019 (UTC)
I'm not sure if the above can of worms is what ChristianKl intended to open, but the question of how other ontologies deal with this again seems relevant. I remember some peculiarities with transitivity in SKOS - there are both the relations "broader" and "broaderTransitive", and "broaderTransitive" is declared as a superproperty of "broader" (and note this does NOT make "broader" transitive, despite the fact that "superproperty" like "superclass" is itself transitive). There's a discussion of some of this (in a more general case) here. It all kind of hurts the brain a bit... ArthurPSmith (talk) 14:06, 11 January 2019 (UTC)
Template:Re I don't think that properties like this should be trivally created. We should generally follow existing standards and only invent our own if we have good reasons to invent the wheel anew. ChristianKl❫ 10:46, 13 January 2019 (UTC)
Template:Re I totally agree. That being said, it seems to me that there is no established RDF predicate for this in the usual namespaces. The reason for that is that this information is best expressed in OWL, not in RDF directly. Wikidata has not engaged much with OWL so far - the project has followed a different route by storing ontological information in the same data model as the data itself. For instance we have Template:P, and we use statements such that Template:Statement, which intrinsically don't do anything (Wikibase does not give them any special meaning, the query service does not infer any triples from them, and so on). Storing property constraints as statements on the property is another example. So this is why I proposed this property: it seems to be in line with the current customs of storing ontological information as statements on properties. That being said, it would obviously be much better if we had generic support for constructs like property chains, which would be much more expressive. But that is a much larger project, and we might want to think twice before construing these concepts in the Wikibase data model (maybe this is something we want to store elsewhere?) − Pintoch (talk) 11:03, 13 January 2019 (UTC)
  • Template:Comment Ok, it sounds like we need to settle on a non-confusing label (and if we can find a label somebody else has used, all the better). I was looking at a sample of the item-valued properties we have; for many of them this whole discussion simply does not apply (there's no "refinement hierarchy" or transitivity that would be associated with person-valued properties like Template:P or Template:P, and I can't think of one that would apply for unitary entity values like Template:P either.) For the ones associated with a physical location, for example Template:P, then we're clearly interested in the location-based hierarchy that Wikidata manages via Template:P. For some others, Template:P seems to apply (for example for language-valued properties like Template:P). For Template:P I am not sure if we have a consistent refinement hierarchy - Template:P perhaps, but also Template:P, Template:P? Fundamentally I think what we're looking for is the way in which the domain of the property value (class, place, language, organization, etc.) is hierarchically modeled in Wikidata. So would a label like "domain hierarchy property" make sense? ArthurPSmith (talk) 14:20, 14 January 2019 (UTC)
Yes, I also expect that Template:P and Template:P would be the most common values for this property. Concerning the label, does "domain" generally refer to the target value? Intuitively, for me it would refer to the subject item. (Maybe by a vague mathematical analogy with the domain of a function, although statements aren't functions) So it would be more intuitive for me to have something else like "range", "value" or "target". − Pintoch (talk) 15:21, 14 January 2019 (UTC)
Oh! Yes, I was thinking "domain" in the generic sense of type of item. Maybe "value hierarchy property"? ArthurPSmith (talk) 17:10, 14 January 2019 (UTC)
I think that would be quite clear indeed! − Pintoch (talk) 17:21, 14 January 2019 (UTC)
I attempted to update the label and description in English in the proposal - ok now? ArthurPSmith (talk) 20:33, 15 January 2019 (UTC)
Yes, thanks! I am adding a note to explain the variations of labels in the discussion. − Pintoch (talk) 21:40, 15 January 2019 (UTC)
  • Template:Comment as far as constraint checks are concerned: any kind of transitivity hurts performance (“type” and “value type” are among the most expensive constraint checks by far) and caching (we can’t invalidate all cached results if an item buried somewhere in the chain is edited, that’s why you’ll see “this result may be outdated” on some type / value type check results). --Lucas Werkmeister (WMDE) (talk) 17:27, 14 January 2019 (UTC)
Thanks! Yes I can imagine this would be expensive to run. So it makes all the more sense to store this information outside the constraint system. I would still be interested in this to ease data imports. − Pintoch (talk) 20:07, 14 January 2019 (UTC)

Wrapping up

It seems to me that the remaining issues are:

  • How to indicate the scope of the "transitivity" by restricting it to the descendants of a particular item. In my opinion this information is already stored in the property constraints so it is not worth duplicating. Template:U', what do you think? Otherwise we could indicate the bounding class for transitivity with a qualifier, such as Template:P or Template:P (such as Template:Statement. Or with the target property as qualifier: Template:Statement.
  • Finding previous work on the issue. Template:U', are you satisfied with my take on this above?

Pintoch (talk) 14:10, 3 February 2019 (UTC)

  • Template:Support ChristianKl❫ 14:47, 4 March 2019 (UTC)
    • Thanks! I'm marking this as ready then, unless anyone has other concerns. − Pintoch (talk) 17:12, 4 March 2019 (UTC)
  • If we're going to have a qualifier for constraining to a particular class, I think Template:P would be a better choice than Template:P, as Template:P is a widely-used main-value-only property, while Template:P is already a qualifier used on properties pages. --Yair rand (talk) 04:55, 14 March 2019 (UTC)
    • Makes sense! − Pintoch (talk) 15:39, 22 March 2019 (UTC)