ISO 15926 and the Semantic Web

In beautiful Sogndal in Norway, a group of 30 knowledgeable people is gathered today for a seminar on the way forward with ISO 15926, and the use of OWL in this regard.

Matthew West, one of the key people behind ISO 15926 (why not give it a name?), gave some background and motivation for the standard, and explained how it tries to model 4D (objects exist in 3D space and in time) rather than pure 3D. He also addressed the advantages and disadvantages of modeling ISO 15926 in entity-relationship languages (e.g. EXPRESS and UML) versus description logic (e.g. OWL). The key takeaway here is that OWL has superior tool support and can potentially represent the complexity of ISO 15926.

The effort to represent ISO 15926 in OWL was presented by Martin George Skjæveland from DNV. ISO 15926 is represented in EXPRESS with two main constructs, entities and attributes. They have made a simple translation between EXPRESS and OWL:

EXPRESS construct -> OWL construct
Entity -> owl:Class
Subtype -> rdfs:subClassOf
Disjoint (ONEOF) -> owl:disjointWith
Abstract -> owl:equivalentClass, owl:unionOf
Attributes with entity value -> owl:ObjectProperty, rdfs:domain, owl:cardinality
Attributes with datatype value -> owl:DatatypeProperty, rdfs:domain, owl:cardinality
Attribute values -> owl:allValuesFrom
EXPRESS datatype -> XSD datatypes
List -> linked list in OWL (?) – Drummond et al. …
Unique -> not translated: it goes beyond functional datatype properties and so exceeds OWL DL

Some issues still remain; however, it seems there is almost a 1-to-1 mapping between the ISO 15926-2 EXPRESS and an OWL DL representation, which allows the use of Semantic Web languages and tools. A SPARQL endpoint has also been created over Part 2 of ISO 15926.
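To make the mapping a bit more concrete, here is a minimal Python sketch of how the first rows of the table could be applied. It is not the translator presented at the seminar: the function name, the `ex:` prefix and the entity names (`Pump`, `Equipment`, `Pipe`) are my own assumptions, and only the entity, subtype and disjoint rows are covered.

```python
# Part of the EXPRESS-to-OWL mapping from the table, as a lookup.
EXPRESS_TO_OWL = {
    "entity": ["owl:Class"],
    "subtype": ["rdfs:subClassOf"],
    "disjoint (ONEOF)": ["owl:disjointWith"],
    "abstract": ["owl:equivalentClass", "owl:unionOf"],
}


def entity_to_turtle(name, supertypes=(), disjoint_with=()):
    """Render one EXPRESS entity as a Turtle class declaration.

    The ex: prefix and all entity names are hypothetical.
    """
    lines = [f"ex:{name} a owl:Class"]
    lines += [f"    rdfs:subClassOf ex:{s}" for s in supertypes]
    lines += [f"    owl:disjointWith ex:{d}" for d in disjoint_with]
    return " ;\n".join(lines) + " ."


print(entity_to_turtle("Pump", supertypes=["Equipment"], disjoint_with=["Pipe"]))
```

Attributes, cardinalities and lists would need more than string templating, which is probably where the remaining issues live.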

The next presentation, “Building rich ontologies on OWL version of ISO 15926” by Johan Klüver from DNV, starts from the realization that domain experts use tools like Excel rather than knowledge modeling tools like Protégé. His position is to create “expert friendly interfaces” for domain ontology building, where users make statements about their domain. The idea is to use templates that compile statements down to ISO 15926 data structures.

In conclusion, it seems some of the right steps have been taken in the direction of OWL. But what about the next steps in this bridge between ISO 15926 and OWL? Some issues are still not fully covered, among them namespaces, provenance, and representing Part 4 (the reference data libraries, or domain ontologies) in OWL. And last but not least, some more use cases for the ontology would be helpful.
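As I understood the template idea, a domain expert fills in a row (much like in Excel) and a template expands it into the underlying statements. The Python sketch below is only my illustration of that principle: the template, the property `ex:hasDesignPressure` and the row values are invented, and real ISO 15926 data structures are considerably more elaborate than plain triples.

```python
# A hypothetical template: triple patterns where "?name" marks a slot
# to be filled from one spreadsheet-like row.
PUMP_TEMPLATE = [
    ("?item", "rdf:type", "?item_class"),
    ("?item", "ex:hasDesignPressure", "?pressure"),
]


def expand_template(template, row):
    """Fill every slot in the template with the matching value from the row."""
    def fill(term):
        # A missing slot raises KeyError instead of producing a broken triple.
        return row[term[1:]] if term.startswith("?") else term

    return [tuple(fill(term) for term in pattern) for pattern in template]


row = {"item": "ex:P-101", "item_class": "ex:Pump", "pressure": "16 bar"}
for triple in expand_template(PUMP_TEMPLATE, row):
    print(triple)
```

The expert only ever sees the row; the template author decides how it is compiled down.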

Google using synonyms?

There is some talk these days in the blogosphere about Google adding synonyms to their search. Stemming – reducing the words you use to their base form or stem – they have had for a long time (e.g. run, running and runner giving the same result set).

Using synonyms, however, is a much more complex task and relates to understanding the user’s intentions, including understanding more about the context the user is in. E.g. Port may be substituted by Gate, but also by Wine, and it gets even more complex as we take various languages into account – Gate in Norwegian also means Street.

So is Google using synonyms, as indicated in a few articles referencing an official Google Blog article? Not today, from my understanding, but they are for sure looking into it as a central part of query understanding. Another Google Blog article explains this much more clearly.

Personally, I also believe that their move into the browser market with Chrome positions Google to gather more information about the user’s context, which is the real problem in current search solutions.
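To illustrate what stemming means here, a toy suffix stripper in Python is enough. This is a deliberately naive sketch of my own, not how Google (or any production stemmer) does it: the suffix list and the length threshold are arbitrary assumptions, and it will mangle plenty of real words.

```python
def toy_stem(word):
    """Reduce a word to a crude stem by stripping one common suffix."""
    word = word.lower()
    for suffix in ("ing", "er", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Undo consonant doubling, e.g. "runn" becomes "run".
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


print({w: toy_stem(w) for w in ("run", "running", "runner")})
```

All three words end up with the same stem, so a query for any of them can match documents containing the others.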
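The Port example can be sketched as a lookup that needs both the language and the user’s context before it can suggest anything. Everything in the Python snippet below is an illustrative assumption: the table, the context labels and the function are mine, and a real system would have to infer the context rather than receive it as an argument.

```python
# Hypothetical synonym table keyed by (language, term), then by context.
SYNONYMS = {
    ("en", "port"): {"entrance": ["gate"], "drinks": ["wine"]},
    ("no", "gate"): {"traffic": ["street"]},
}


def expand_query(term, language, context):
    """Return the term plus the synonyms that fit the given context, if any."""
    senses = SYNONYMS.get((language, term.lower()), {})
    return [term] + senses.get(context, [])


print(expand_query("Port", "en", "drinks"))    # expands with the drink sense
print(expand_query("Port", "en", None))        # unknown context: no expansion
print(expand_query("Gate", "no", "traffic"))   # the Norwegian sense
```

Without the context the only safe choice is to leave the query alone, which is exactly why synonyms are harder than stemming.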

One Ring to rule them all…

Though I’ve always been a tech junkie – from Commodore 64 to OS X, iPods and Apple TV – I never really cared about mobile computing… I had my calls and my SMS. I always bought the newest gadgets, but the newer the phone I got, the harder it was to use… Was it me getting older? After two years in the U.S. I even started to prefer leaving voice mails over SMS (maybe because no one used SMS over there)… However, like all junkies, I cannot hold back, I need the newest… so while transiting through New York some half a year ago I got my iPhone (price was of course no issue).

What has happened since?

– I discovered location awareness in Amsterdam – accuracy not the point… just gimme the streets… the route to the restaurant.

– I discovered roaming charges in Brussels. Finally I had all my mail – answering all the time – where did my vacation go, really? Does more mail access decrease the size of your inbox? It certainly did not decrease the roaming charges (some 2000 NOK for just checking email).

– I finally had all my RSS feeds at hand – but I always seem to have 500 unread!

– I started blogging on the train – hmm, maybe I now could count my train rides as work hours? But I did not get more work done.

– Exchange integration means that I always have my calendar with me… However, now it seems to always be full…

– I’m involved in the development of a touch screen application. The iPhone is setting standards – generation iPhone will not accept “bad” solutions…

– On the positive side, I now leave my laptop at work more often, and I book tickets using Safari on the iPod (WAP, what was that?). And like any true tech junkie I measure my gadgets’ coolness factor by how long they stay cool to me… and my iPhone is still very much so – apps apps apps! BUT it does affect my economy more than increased interest rates… I DOOO need a fixed data charge!

As a final note, for business I think the iPhone is a huge step – work gets more of my time because I allow it to. And for us consumers, I think it improves usability and usefulness by giving me a PC I can call with, not vice versa (and yes, I only care about connectivity from the network provider – not so-called value-added services beyond location awareness). Similar to what Nokia managed in the 90s, it’s just better than what was. However, it will keep all employees attached to work all the time, and as a side note, on my next vacation I will be switching to “Airplane mode”!

I now have one device that rules them all… unfortunately I am feeling more like Gollum than God…

(this post was made on the iPhone)

The uptake of the semantic web

When I moved back to Norway four years ago there were no Semantic Web activities in Norway. Topic Maps, on the other hand, were quite visible. Three years ago we were very much in technology-push mode. Over the last three years something has changed.

Today you see Semantic Web solutions being asked for in public tenders and by commercial customers. Has our message finally gotten through to them, or are they now facing other problems than three or four years ago?

I think there is some truth to both, but I also think that we as a community have come to learn a few things.

Some of us who did not come from the library science area have, I believe, underestimated the importance of bridging to the old schools. By this I mean especially their understanding of metadata and thesauri. Hence, while we’ve addressed areas they’re interested in, we haven’t shown them why and when formal modeling is better, and how to bridge to known technologies.

Secondly, we’ve forgotten that customers want complete solutions, not just an API or an OWL editor. Semantic Web is at best less than 30% of a solution. The rest is boring stuff (for most of us).

Thirdly, I firmly believe that we’ve been lacking tools for visualization. Graphs are hard, and we’ve often come up short in the dialog with the customer because we have a hard time visualizing our data and ontologies. I think in the last few years we’ve seen companies addressing this issue.

Finally, I think we’re now less focused on selling a vision, and no longer overselling what we’re able to do. We all have experience with Semantic Web systems in practice now, and we know much more about the current limitations on scalability, performance and the like. Maybe we’re also a few years older and wiser.

SKOS and OWL followup

A few days back I wrote about a project where we decided to use SKOS rather than OWL to model two domain ontologies. Recently, in a new project building on the base of the previous, we’ve come to the opposite conclusion.

Why these differences? And how can we design the system to support both modeling paradigms? I’ll try to answer the first part here, and the latter in a followup to this post.

Why was the situation so different that one project landed on SKOS, a thesaurus representation, while the next landed on OWL, a formal ontology representation? First of all, it was the functional requirements.

The SKOS project was focused on associations and a loose hierarchy, where it was not natural to aggregate resources. Refining meant drilling down based on the additional metadata of the resources, i.e. through faceted filtering. The number of resources was low enough (approx. 9000) to make human annotation possible. And the relatively high number of classes (approx. 5000) made it hard to maintain anything but a hierarchy of topics.

The OWL project, on the other hand, had the requirement that the user would gradually refine his query through navigation, e.g. from “Rock” to “Folk Rock”, or from “Bob Dylan” to “Bob Dylan the song writer”. The resources are numerous (more than 300k), whereas the number of classes is expected to be low (approx. 1000). Resources will also contain more specific metadata, allowing OWL expressions.

Overall the applications are quite similar, yet we have chosen to model them differently. It would be nice to hear if others have the same experience. Our next step is to allow the tools to handle both.
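Faceted filtering in this sense needs no inference at all, only the metadata attached to each resource. A minimal Python sketch, with made-up resources and facet names rather than anything from the actual project:

```python
# Hypothetical resources with descriptive metadata; field names are illustrative.
RESOURCES = [
    {"title": "Annual report", "topic": "Finance", "year": 2007, "format": "pdf"},
    {"title": "Budget memo", "topic": "Finance", "year": 2008, "format": "doc"},
    {"title": "Hiring guide", "topic": "HR", "year": 2008, "format": "pdf"},
]


def facet_filter(resources, **facets):
    """Keep the resources whose metadata matches every selected facet value."""
    return [r for r in resources
            if all(r.get(name) == value for name, value in facets.items())]


def facet_counts(resources, name):
    """Count how many resources carry each value of one facet."""
    counts = {}
    for r in resources:
        counts[r[name]] = counts.get(r[name], 0) + 1
    return counts


hits = facet_filter(RESOURCES, topic="Finance", year=2008)
print([r["title"] for r in hits])
print(facet_counts(RESOURCES, "format"))
```

Each drill-down step is just another equality test, which is part of why this style stays cheap.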
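The refinement from “Rock” to “Folk Rock” relies on the class hierarchy: a resource typed as Folk Rock should also show up under Rock. The Python sketch below hand-rolls that one inference for a single-parent hierarchy. It is not an OWL reasoner, and the class names beyond the two in the text, as well as the track identifiers, are assumptions of mine.

```python
# Hypothetical class hierarchy: each class maps to its single direct superclass.
SUBCLASS_OF = {"FolkRock": "Rock", "Rock": "Music", "Jazz": "Music"}

# Hypothetical resources, each with one asserted class.
ASSERTED_TYPE = {"track-1": "FolkRock", "track-2": "Rock", "track-3": "Jazz"}


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or reaches it by following superclass links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(cls):
    """All resources whose asserted class is cls or one of its subclasses."""
    return sorted(r for r, t in ASSERTED_TYPE.items() if is_subclass(t, cls))


print(instances_of("Rock"))      # the broad query
print(instances_of("FolkRock"))  # the refined query
```

With few classes and many resources, keeping this hierarchy formal pays off, since every navigation step reuses it.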

SKOS and OWL in practice

We’ve recently developed a system focusing on informal modeling using SKOS.

The reasons for choosing SKOS over OWL were:
• The systems to be replaced already had an informally modeled domain model
• We did not require the strength coming with OWL inferencing
• Performance was important

This is, from my understanding, quite a common situation. Modeling a concept or topic hierarchy is in itself a mind-twisting job, and adding logic to it makes it even harder. Secondly, as the architect of such RDF-based systems, it is important not to add more complexity than needed, and not to add something that may prove to carry a performance penalty.

I will talk more on this subject in the upcoming weeks, also in situations where we’ve reached the opposite conclusion. I will also share my experiences in greater detail at the upcoming European Semantic Technology conference in Vienna.

Semantic Web and Topic Maps – Still confused?

This article was written by Kjetil Kjernsmo as an opinion piece in Computerworld, 18.01.2008.

On 14 December, Steve Pepper wrote an opinion piece in Computerworld about Topic Maps that is misleading not only about how the Internet works today and about the history of the web, but also about which technologies are driving development forward.

The lead paragraph is already misleading: there Pepper claims that one has to go through a central hub. This describes a portal mindset that fortunately never gained a foothold. For a while many people imagined portals roughly like a book, with a start page that everyone would be forced through. This was clearly contrary to Tim Berners-Lee’s idea of a decentralized structure, and although portals and start pages are still a popular idea, one can nevertheless reach a given page by several routes: via a search engine, or by following links found on the web, received by mail or in an RSS feed.

Topic Maps have had success in Norway in helping with navigation inside portals, but when Pepper claims that Tim Berners-Lee was sent in the wrong direction, he has lost the whole history of the web. The web was never meant to be just links between documents, and Tim did not only make technical choices. He also made political choices: he would not try to control the direction in which the web developed, other than by contributing to its development himself through his standardization organization, the World Wide Web Consortium (W3C). That he gave up this control allowed the web to develop in many directions, and as he himself said, “multimedia appeared more important for a while”.

But the idea of links between topics, and in fact not just topics but also links between physical things, such as links between people, was there underneath all along. Today we can say, for example, “Kjetil knows Tim”. At about the same time as the Topic Maps community got started, it became clear that HTML alone could not meet the needs people had, and so the design of RDF (Resource Description Framework) began, which today is worked on by a large international community. That Tim should not know of the Topic Maps community, as Pepper hints at the end of his piece, is a strange claim, which must have been made against his better knowledge. Pepper knows well that the work he took part in to define a bridge between RDF and Topic Maps under the W3C umbrella was approved by Tim himself. For the RDF community this work was also important, because it is essential to build bridges to all communities.
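A statement that one person knows another can be written as a single RDF triple. The Python sketch below serializes such a triple as one N-Triples line using the `knows` property from the FOAF vocabulary; the two person URIs under example.org are placeholders I made up, not real identifiers for anyone.

```python
# The FOAF vocabulary's "knows" property.
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Placeholder URIs for the two people (assumed, for illustration only).
KJETIL = "http://example.org/people/kjetil"
TIM = "http://example.org/people/tim"


def ntriple(subject, predicate, obj):
    """Serialize one triple of URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


print(ntriple(KJETIL, FOAF_KNOWS, TIM))
```

The link is between the people themselves, not between documents about them, which is the point being made.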

There are similarities between Topic Maps and RDF, but despite similarities and bridges, it is not Topic Maps that will realize Tim’s vision. While Topic Maps have little foothold outside Norway, RDF is part of a range of Semantic Web technologies that have set an entire industry in motion. The whole spectrum is engaged, from RDF support in Oracle, via a host of larger and smaller development and consulting companies, to small and innovative free software projects. There are now billions of relations freely available, since for example data from Wikipedia has been expressed as RDF, and the number grows every day. The life sciences have played much the same role now as high-energy physics did in the web’s infancy, and have shown how Semantic Web technology can solve intricate problems in practice. In Norway, the oil industry, led by OLF (Oljeindustriens Landsforening, the Norwegian Oil Industry Association), is committing to the use of Semantic Web in what is called “integrated operations”: IT-based control and maintenance of complex offshore installations.

On the open web everything suggests that Semantic Web technology will dominate, and although Tim is again an important figure in this work, there are many of us pulling the load, sometimes even in different directions.

Kjetil Kjernsmo, Senior Knowledge Engineer, Computas AS, and member of W3C’s Semantic Web Education and Outreach (SWEO) group.