Grupo de Sistemas Inteligentes Marl Ontology

Marl Ontology Query Experiments

07 February 2011

This version:
Latest version:
Editors: Adam Westerski
Authors: Adam Westerski
Contributors: See acknowledgements

Creative Commons License


Marl is a standardised data schema (also referred to as an "ontology" or "vocabulary") designed to annotate and describe subjective opinions expressed on the Web or in particular information systems. This document contains the results of semantic query experiments in which we evaluate the capability of Marl metadata to answer various queries related to linking distributed opinions. For a description of the ontology and instructions on how to connect it with descriptions of other resources, see the Ontology Specification.

Table of Contents

  1. Introduction
    1. Opinions on the Web and the opinion mining process
    2. The Semantic Web
    3. What is Marl for?
  2. Competency questions
    1. Movie opinions
    2. Movie review opinions
    3. Product opinions
    4. Opinions in Idea Management Systems
  3. Datasets tests
    1. Aggregating movie opinions
    2. Visualising Idea Management data
  4. Summary


  A. Changelog
  B. Acknowledgements

1 Introduction

This document gathers the results of various experiments done with the Marl Ontology. Its goal is to test the coverage of Marl properties for different datasets constructed independently of the Marl project.

The analysis is split into two parts. Each of the sections presents a list of sources and Marl mappings for them, along with some coverage statistics.

Section two describes the use of the ontology to produce mappings for various datasets published by researchers as part of their studies of opinion mining algorithms. The second part covers the same effort conducted in the context of online services for end users that publish opinion-mined data.

The choice of dataset sources is based on the authors' knowledge of the state of the art (for research datasets, the list was partially based on resources listed by Pang et al. [ref]).

An important note is that the Marl ontology presented here is not a complete model for describing and linking opinions online and inside information systems. It merely defines concepts that are not yet described by other ontologies and provides the data attributes needed to connect opinions with contextual information already defined in metadata created with other ontologies. For detailed instructions and recommendations on how to fully model opinions and the results of the opinion mining process, refer to the analysis done by the Gi2MO project.

1.1 Opinions on the Web and the opinion mining process

With the birth of Web 2.0, users started to provide input and create content on a mass scale about their subjective opinions on various topics (e.g. opinions about movies). While this kind of content can be very beneficial for many different uses (e.g. market analysis or predictions), its accurate analysis and interpretation have not been fully achieved yet. Information left by users is often very disorganised, and many portals that enable user input leave the user-added information unmoderated.

Opinion mining (often referred to as sentiment analysis) is one of the attempts to bring order to those vast amounts of user-generated content. The domain focuses on analysing textual content using special language processing tools and, as output, provides a quantified judgement of the sentiments contained in the text (e.g. whether the text expresses a positive or negative opinion).

Due to the complexity of the problem and attempts to provide efficient and fast tools, the area can be divided into three main research directions:

In relation to the World Wide Web, there are a number of common uses of opinion formalisation and analysis. Firstly, it can be applied on top of search engines to find the desired content and then run it through opinion analysis software to obtain the desired statistics (e.g. Swotti). Secondly, such algorithms can be used within dedicated systems that use the Web to connect to particular communities and gather their opinions on very specific topics (e.g. Internet shops or review websites).

In relation to dedicated systems (e.g. Enterprise Systems), the community collaboration models that have proven successful on the open Web are often transferred to large enterprises to enhance knowledge exchange and bring employees together. The same opinion mining techniques can be applied in such cases to extract particular information and use it for internal statistics and to improve knowledge search across the enterprise (e.g. see the use of opinion mining in Idea Management [link]).

1.2 The Semantic Web

The Semantic Web is a W3C initiative that aims to introduce rich metadata to the current Web and provide machine-readable and machine-processable data as a supplement to the human-readable Web.

The Semantic Web is a mature domain that has been in the research phase for many years and, with increasing commercial interest and emerging products, is starting to gain appreciation and popularity as one of the rising trends for the future Internet.

One of the cornerstones of the Semantic Web is research on interlinkable and interoperable data schemas for information published online. Those schemas are often referred to as ontologies or vocabularies. In order to facilitate ontologies that lead to a truly interoperable Web of Data, W3C has proposed a series of technologies such as RDF and OWL. Marl uses those technologies and the research that comes with them to propose an ontology for the particular goal of describing opinions and linking them with contextual information (such as the opinion topic, the features described in the opinion, etc.).

1.3 What is Marl for?

The goals that the Marl ontology aims to achieve as a data schema are:

For more information, please refer to the Marl usage study done as part of the research in the Gi2MO project.

2 Competency questions

The goal of this experiment was to check whether annotations made with the Marl ontology can be used to answer questions about opinions expressed on different topics on the Web. The areas for which we have built the competency questions correspond to the use cases and mapping experiments for the Marl ontology.

2.1 Movie opinions

Show all opinions about {certain movie}
  e.g. Show all opinions about Avatar
Show all {polarity type} opinions about {certain movie}
  e.g. Show all positive opinions about Avatar
Show all {polarity type} opinions about {certain movie} made with regard to {movie feature}
  e.g. Show all positive opinions about acting in Avatar
Show all {polarity type} opinions about {certain movie} made by a {certain person}
  e.g. Show all positive opinions about Avatar made by IMDB reviewers
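
The question templates above map fairly directly onto SPARQL graph patterns. The following Python sketch fills the {polarity type}, {certain movie} and {movie feature} slots of a query string. It is only an illustration: the Marl terms marl:Opinion, marl:describesObject, marl:describesFeature and marl:hasPolarity, the namespace IRI, the treatment of a feature as a plain literal, and the movie URI are all assumptions not taken from this document and should be checked against the Ontology Specification.

```python
from typing import Optional

# Assumed namespace IRI for Marl; check it against the Ontology Specification.
MARL_NS = "http://purl.org/marl/ns#"


def opinion_query(object_uri: str,
                  polarity: Optional[str] = None,
                  feature: Optional[str] = None) -> str:
    """Build a SPARQL query for 'Show all {polarity} opinions about {object}'.

    polarity is a Marl polarity class name such as "Positive" (assumed term);
    feature is a plain-text feature label such as "acting" (assumed literal).
    """
    patterns = [
        "?opinion a marl:Opinion .",
        f"?opinion marl:describesObject <{object_uri}> .",
    ]
    if polarity is not None:
        patterns.append(f"?opinion marl:hasPolarity marl:{polarity} .")
    if feature is not None:
        # Escape backslashes and quotes so the label stays a valid literal.
        escaped = feature.replace("\\", "\\\\").replace('"', '\\"')
        patterns.append(f'?opinion marl:describesFeature "{escaped}" .')
    body = "\n".join("  " + p for p in patterns)
    return f"PREFIX marl: <{MARL_NS}>\nSELECT ?opinion WHERE {{\n{body}\n}}"


# Hypothetical URI standing in for the movie "Avatar".
AVATAR = "http://example.org/movies/avatar"
print(opinion_query(AVATAR, polarity="Positive", feature="acting"))
```

Leaving polarity and feature unset yields the most general question ("Show all opinions about {certain movie}"); each optional argument adds one triple pattern, which mirrors how the questions in the list grow more specific.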

2.2 Movie review opinions


2.3 Product opinions

Show all opinions about {certain product}
  e.g. Show all opinions about iPads
Show all {polarity type} opinions about {certain product}
  e.g. Show all positive opinions about iPads
Show all opinions about {certain product} made with regard to {product part}
  e.g. Show all opinions about the iPad screen
Show all {polarity type} opinions about {certain product} made with regard to {product feature}
  e.g. Show all positive opinions about iPad usability

2.4 Opinions in Idea Management Systems

Show all {polarity type} opinions about {certain idea}
  e.g. Show all opinions about "Bigger screen in iPads"
Show all {polarity type} opinions about ideas on {certain product}
  e.g. Show all positive opinions on ideas for iPads
Show the number of positive and negative opinions for all ideas submitted
Show all opinions related to ideas about {certain product} made with regard to {product part}
  e.g. Show all opinions on ideas for the iPad camera
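
The counting question in this list (positive and negative opinions for all ideas submitted) can be illustrated with a small Python sketch that works on (idea, polarity value) pairs, such as those a query for marl:polarityValue might return. The sign convention (above zero is positive, below zero is negative), the idea URIs and the values are assumptions made for illustration; real data may use a different polarity scale.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_polarity(opinions: Iterable[Tuple[str, float]]) -> Dict[str, Counter]:
    """Count positive and negative opinions per idea.

    Each item is (idea_uri, polarity_value). Values above zero count as
    positive and values below zero as negative (an assumed convention);
    zero is treated as neutral.
    """
    totals: Dict[str, Counter] = {}
    for idea_uri, value in opinions:
        counts = totals.setdefault(idea_uri, Counter())
        if value > 0:
            counts["positive"] += 1
        elif value < 0:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return totals


# Hypothetical idea URIs and polarity values, for illustration only.
sample = [
    ("http://example.org/idea/1", 0.8),
    ("http://example.org/idea/1", -0.4),
    ("http://example.org/idea/2", 0.5),
]
for idea, counts in tally_polarity(sample).items():
    print(idea, counts["positive"], counts["negative"])
```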

3 Datasets tests

The following use cases aim to show how the Marl Ontology could be used in different environments (that is, in different systems) and when applied to opinions of varying complexity and structure.

3.1 Aggregating movie opinions


3.2 Visualising Idea Management data

Context: Idea Management Systems are used to collect input from a large audience regarding innovation proposals for products or services. For our experiment we used two Idea Management System instances.

Technologies: The Semantic Web infrastructure was provided by tools from the Gi2MO project. The data was also taken from test instances of the Gi2MO project. The tools used for visualisation were Idea Browser and Idea Analyst. For processing SPARQL queries we used the ARC2 RDF store.

Data: The data came from ETSIT Ideas and ETSIT Ideas International, systems created to collect ideas about the university from Spanish students and international visitors respectively. The first instance was run entirely in Spanish, the second in English. The two systems ran different software; however, the data exported from each has been described with the same ontologies: the Gi2MO ontology for ideas and Marl for describing opinions about ideas.

Outcome: Data encoded in Marl provided new metrics and made it possible to compare the two instances, each run in a different language, on a new level.

Datasets: etsit_ideas_es.rdf, etsit_ideas_en.rdf

Queries and Results:

a) Number of negative opinions/comments in each instance (with colors)

b) Number of positive opinions/comments in each instance (with colors)

c) Number of positive opinions/comments per category (categories using the same URIs in both instances)
* select categories and the number of positive ideas they have, sorted by number of ideas (tags and categories are included)

PREFIX gi2mo:  .
PREFIX dcterms:  .
PREFIX owl:  .
PREFIX marl:  .
SELECT ?generic_category_uri  COUNT(?idea_name) AS ?positive_ideas WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasCategory ?category_uri .
   ?category_uri owl:sameAs ?generic_category_uri .
   ?idea_uri dcterms:title ?idea_name .
   ?idea_uri marl:polarityValue ?polarityValue .
   FILTER (?polarityValue > 0)
}
GROUP BY ?generic_category_uri
ORDER BY DESC(?positive_ideas)

* select categories and the number of ideas they have, sorted by number of ideas (tags and categories are included)
PREFIX gi2mo:  .
PREFIX dcterms:  .
PREFIX owl:  .
SELECT ?generic_category_uri  COUNT(?idea_name) AS ?ideas WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasCategory ?category_uri .
   ?category_uri owl:sameAs ?generic_category_uri .
   ?idea_uri dcterms:title ?idea_name .
}
GROUP BY ?generic_category_uri
ORDER BY DESC(?ideas)
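
To make the grouping logic of the positive-ideas query easier to follow, the Python sketch below reproduces it on plain (subject, predicate, object) tuples merged from two instances. It is a simplification, not the ARC2 evaluation used in the experiment: it assumes one polarity value per idea, skips the gi2mo:Idea type and dcterms:title patterns, and all identifiers in the sample data are hypothetical.

```python
from collections import Counter
from typing import Any, Iterable, List, Tuple

Triple = Tuple[str, str, Any]


def positive_ideas_per_category(triples: Iterable[Triple]) -> List[Tuple[str, int]]:
    """Mirror the positive-ideas query on plain (subject, predicate, object) tuples."""
    triples = list(triples)
    # Map each local category to its shared ("generic") category via owl:sameAs.
    same_as = {s: o for s, p, o in triples if p == "owl:sameAs"}
    # Ideas whose marl:polarityValue is greater than zero.
    positive = {s for s, p, o in triples if p == "marl:polarityValue" and o > 0}
    counts: Counter = Counter()
    for s, p, o in triples:
        if p == "gi2mo:hasCategory" and s in positive and o in same_as:
            counts[same_as[o]] += 1
    # Most frequent category first, as in ORDER BY DESC(?positive_ideas).
    return counts.most_common()


# Hypothetical triples standing in for the Spanish and English instances.
es = [
    ("es:idea1", "gi2mo:hasCategory", "es:cat_canteen"),
    ("es:cat_canteen", "owl:sameAs", "generic:canteen"),
    ("es:idea1", "marl:polarityValue", 0.6),
]
en = [
    ("en:idea7", "gi2mo:hasCategory", "en:cat_canteen"),
    ("en:cat_canteen", "owl:sameAs", "generic:canteen"),
    ("en:idea7", "marl:polarityValue", 0.3),
    ("en:idea8", "gi2mo:hasCategory", "en:cat_library"),
    ("en:cat_library", "owl:sameAs", "generic:library"),
    ("en:idea8", "marl:polarityValue", -0.2),
]
print(positive_ideas_per_category(es + en))
```

The owl:sameAs step is what lets categories from the two instances be counted together: each local category is resolved to the shared URI before counting.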

4 Summary and Comparison


A Changelog

B Acknowledgements

The style and formatting of this document were inspired by the FOAF specification.

Special thanks for support with the creation of and research on the Marl ontology go to Prof. Carlos A. Iglesias and the members of the GSI Group of the DIT department of Universidad Politécnica de Madrid.