schemaorg: JSON-LD context problem for properties that can take either URL or Text

See also notes in Wiki: https://github.com/schemaorg/schemaorg/wiki/JsonLd

e.g. namedPosition in Role example http://schema.org/Role

{
  "@context": "http://schema.org",
  "@type": "SportsTeam",
  "name": "San Francisco 49ers",
  "member": {
    "@type": "OrganizationRole",
    "member": {
      "@type": "Person",
      "name": "Joe Montana"
    },
    "startDate": "1979",
    "endDate": "1992",
    "namedPosition": "Quarterback"
  }
}

Currently the context file (see http://schema.org/docs/jsonldcontext.json.txt) has this:

"namedPosition": { "@type": "@id" },

… because a URL is a possible value. However, text is also a possible value, and currently more likely. The problem is that the JSON-LD context forces the property value to be interpreted as a (possibly relative) URI reference, hence in http://json-ld.org/playground/ the value is resolved relative to the site the data’s on:

_:b1 http://schema.org/namedPosition http://json-ld.org/playground/Quarterback .
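
To make the mechanism concrete, here is a minimal Python sketch of what type coercion to "@id" does to a plain string. It is not a JSON-LD processor: the CONTEXT dictionary is a hypothetical, trimmed-down stand-in for the schema.org context, and expand_value is an illustrative helper that only emulates the relevant step.

```python
from urllib.parse import urljoin

# Hypothetical, trimmed-down stand-in for the schema.org context (assumption).
CONTEXT = {
    "namedPosition": {"@id": "http://schema.org/namedPosition", "@type": "@id"},
    "name": {"@id": "http://schema.org/name"},
}


def expand_value(term, value, base):
    """Roughly emulate how a plain string value is expanded for a term."""
    definition = CONTEXT.get(term, {})
    if definition.get("@type") == "@id":
        # Coercion to @id: the string is treated as an IRI reference and
        # resolved against the document base.
        return {"@id": urljoin(base, value)}
    # No coercion: the string stays a literal.
    return {"@value": value}


base = "http://json-ld.org/playground/"
print(expand_value("namedPosition", "Quarterback", base))
print(expand_value("name", "Joe Montana", base))
```

With the playground URL as the base, the first call produces the same http://json-ld.org/playground/Quarterback IRI as in the triple above, while the uncoerced name stays a literal.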

We could override this, e.g. using:

"namedPosition": { "@value": "Quarterback" }

Or we could change the context for this property (and others?), so that literal values are the default. But then we’d need to use (something like) this notation for controlled values:

"namedPosition": { "@id": "http://sport-vocabs.example.org/Quarterback" }

About this issue

  • State: closed
  • Created 10 years ago
  • Comments: 41 (30 by maintainers)

Most upvoted comments

This is a known family of headaches, but I do not see an actionable issue identified here any more. Let’s close this, but remember this as a useful discussion that will surely crop up again.

This remains an issue for multiple properties that permit both literals and URIs, including schema:genre (as just raised in https://github.com/openlink/structured-data-sniffer/issues/13).

On 2014-06-19, @danbri said about schema:Role

Just discussed this with the schema.org team, we agreed that in this case defaulting to text makes sense. I’ll get the context file updated accordingly.

And indeed, the context file now shows:

"Role": {"@id": "schema:Role"},

but schema:genre and others remain troublesome:

"genre": { "@id": "schema:genre", "@type": "@id"},

I submit that either the ranges of all such properties should be limited to URIs, or the context file should be adjusted to do the right thing with the beginner’s most likely value, which is initially plain text, leaving more experienced users to handle the more complex formulation:

"genre": { "@id": "http://example.com/genre#myfavorite" },

Regarding the last paragraph in your comment: Can the schema.org JSON-LD context be used and overwritten in such situations?

Yes, use a JSON-LD document such as the following:

{
  "@context": ["http://schema.org", {
    "url": {"@type": "@id"}
  }],
  "url": "http://some-location/"
}

This will ensure that the url property is treated as an IRI, not a literal. This may already be the default in the schema.org context, but it definitely isn’t for other properties you probably want to have interpreted as IRIs. You can also put these definitions in an external document, and either reference it as "@context": ["http://schema.org", "http://myschemaupdates"], or simply make myschemaupdates itself include http://schema.org.
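
The override works because contexts in an array are processed in order, and a later term definition replaces an earlier one for the same term. A rough Python sketch of that behaviour follows; SCHEMA_ORG_EXCERPT and LOCAL_OVERRIDES are hypothetical excerpts (nothing is fetched), and effective_context is an illustrative helper, not part of any JSON-LD library.

```python
# Hypothetical excerpt standing in for the remote schema.org context (assumption).
SCHEMA_ORG_EXCERPT = {
    "url": {"@id": "schema:url"},
    "genre": {"@id": "schema:genre", "@type": "@id"},
}

# Local overrides, as they would appear in the second entry of the "@context" array.
LOCAL_OVERRIDES = {
    "url": {"@id": "schema:url", "@type": "@id"},  # force IRI interpretation
    "genre": {"@id": "schema:genre"},              # default to plain text
}


def effective_context(*contexts):
    """Apply contexts in order; a later definition replaces an earlier one."""
    merged = {}
    for context in contexts:
        # The whole term definition is replaced, not merged key by key.
        merged.update(context)
    return merged


active = effective_context(SCHEMA_ORG_EXCERPT, LOCAL_OVERRIDES)
print(active["genre"])
print(active["url"])
```

Order matters: listing the local overrides before the schema.org context would let the schema.org definitions win instead.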

I’m afraid not. My initial approach was to remove the , "@type": "@id" part from the properties whose range includes both a URI and some other type, specifically these (as mentioned in the linked issue):

"roleName": { "@id": "schema:roleName", "@type": "@id"},
"acceptsReservations": { "@id": "schema:acceptsReservations", "@type": "@id"},

I could find no easy way of automating this process so as to discriminate between pure URI types and the hybrids. In the end, for the application I needed, I could just populate an empty(ish) context file, and I have not suffered any consequences in my particular application. Mileage may vary for other use cases.
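
If the range of each property were available as data, the pruning step itself could be fairly mechanical. The following Python sketch assumes such a mapping has already been extracted from the vocabulary definition; the RANGES and CONTEXT dictionaries are hypothetical excerpts, and relax_hybrid_terms is an illustrative name, not an existing tool.

```python
# Hypothetical range data, assumed to be extracted from the vocabulary definition.
RANGES = {
    "roleName": {"schema:Text", "schema:URL"},
    "acceptsReservations": {"schema:Boolean", "schema:Text", "schema:URL"},
    "url": {"schema:URL"},
}

# Hypothetical excerpt of the context file.
CONTEXT = {
    "roleName": {"@id": "schema:roleName", "@type": "@id"},
    "acceptsReservations": {"@id": "schema:acceptsReservations", "@type": "@id"},
    "url": {"@id": "schema:url", "@type": "@id"},
}


def relax_hybrid_terms(context, ranges):
    """Drop "@type": "@id" from terms whose range also includes schema:Text."""
    relaxed = {}
    for term, definition in context.items():
        definition = dict(definition)  # copy, so the input is left untouched
        is_coerced = definition.get("@type") == "@id"
        if is_coerced and "schema:Text" in ranges.get(term, set()):
            del definition["@type"]
        relaxed[term] = definition
    return relaxed


for term, definition in relax_hybrid_terms(CONTEXT, RANGES).items():
    print(term, definition)
```

Only the hybrid terms lose their coercion; a pure URL property such as url keeps "@type": "@id". The hard part remains obtaining reliable range data, which this sketch takes as given.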

Do you have your ‘modified’ context published somewhere?

https://schema.org/docs/jsonldcontext.json has 72 properties defined as "@type": "@id"

https://github.com/schemaorg/schemaorg/blob/1b4f918e2253c3568cced0a831a64c098f05aad8/data/schema.rdfa defines

  • 861 properties
  • 302 properties with rangeIncludes schema:Text
  • 64 properties with rangeIncludes schema:URL
  • 33 properties with rangeIncludes schema:URL & schema:Text

As I said earlier, the context should not use @type: @id when schema:Text is in the range.

30 properties have "@type": "@id" definition in the context and at the same time rangeIncludes schema:Text (list: https://gist.github.com/elf-pavlik/c8d02ee77410db2fd9f819703f031140#file-under-question-js )

so, as in the example from the previous comment, "artMedium": "felt-tip pen", values of those 30 properties are interpreted as URIs if they are given as plain strings rather than with "@value"

at the same time, out of all 861 properties only 302 have rangeIncludes schema:Text, so plain string values of the remaining 559 - (72 - 30) = 517 properties, which have neither rangeIncludes schema:Text nor "@type": "@id", will get interpreted as literals, not as URIs
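
The arithmetic behind that figure can be spelled out step by step. The counts below are the ones quoted in this comment; the variable names are only illustrative.

```python
# Counts quoted in the comment above.
total_properties = 861
with_text_range = 302        # rangeIncludes schema:Text
coerced_to_id = 72           # "@type": "@id" in the context
coerced_and_text = 30        # both of the above

# Properties whose range does not include schema:Text.
without_text_range = total_properties - with_text_range

# Coerced properties whose range does not include schema:Text.
coerced_without_text = coerced_to_id - coerced_and_text

# No schema:Text in the range and no coercion: plain strings stay literals.
literal_by_default = without_text_range - coerced_without_text

print(without_text_range, coerced_without_text, literal_by_default)
```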

I think in this situation, it seems safest to always stay explicit, always use the "@value" or "@id" object form, and not rely on the schema.org JSON-LD context having or not having "@type": "@id"
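
As a small illustration of the explicit forms that appear in this thread, the sketch below builds a document in which every ambiguous value is wrapped as either a value object or a node reference, so its interpretation does not depend on the term definition. The helpers as_literal and as_reference are hypothetical conveniences, the "@type": "VisualArtwork" line is an assumption added for completeness, and the genre IRI is the example one used earlier in the thread.

```python
import json


def as_literal(text):
    """Wrap a string as an explicit JSON-LD value object."""
    return {"@value": text}


def as_reference(iri):
    """Wrap an IRI as an explicit JSON-LD node reference."""
    return {"@id": iri}


document = {
    "@context": "http://schema.org",
    "@type": "VisualArtwork",  # assumed type, for illustration only
    "artMedium": as_literal("felt-tip pen"),
    "genre": as_reference("http://example.com/genre#myfavorite"),
}

print(json.dumps(document, indent=2))
```

The cost is more verbose markup, which is the trade-off the rest of the thread weighs against changing the context defaults.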

If we include a reference to this context, both the json-ld playground and the rdf translator assume the string “felt-tip pen” to be a fragment IRI, not a literal. However, the Google Structured Data Testing Tool appears to treat “felt-tip pen” as a literal. This is an apparent inconsistency.

Out of spec, but the SDTT is trying to be clever and figuring that the embedded space makes the intent clear. Via Postel’s rule, it’s reasonable for a consumer to be liberal in what it accepts, but the SDTT should probably note this as being inconsistent with the term definition. To be strict, it should be interpreted as an IRI, but the IRI would be invalid due to the unescaped space character.

Error message from Structured Data Testing Tool (last line in red):

  artMedium    
               is not a known valid target type for the artMedium property.

This would seem to be a bug in the SDTT, as the vocabulary clearly says that schema:Text is in the range (which includes string and language-tagged RDF literals).

The SDTT is clearly capable of dealing with literal values, so it’s not clear what’s going on here, but IMHO, the problem is with SDTT, not the vocabulary definition. As I said earlier, the context should not use @type: @id when schema:Text is in the range.

http://schema.org/Text is not very helpful here, as it suggests that one can use Text on all properties 👎

also, DataType > Text > URL makes sense in a way, but it doesn’t look helpful in this case either