This version: | http://purl.org/marl/0.1/ns (RDF/XML, HTML) |
Latest version: | http://purl.org/marl/ns |
Editors: | Adam Westerski |
Authors: | Adam Westerski |
Contributors: | See acknowledgements |
This work is licensed under a Creative Commons Attribution License. This copyright applies to the Marl Ontology Specification and accompanying documentation in RDF. This ontology uses W3C's RDF technology, an open Web standard that can be freely used by anyone.
Marl is a standardised data schema (also referred to as an "ontology" or "vocabulary") designed to annotate and describe subjective opinions expressed on the Web or in particular information systems. This document describes the ontology and explains how to connect it with descriptions of other resources.
This specification is a formal description of a proposed metadata schema that can be applied to data representing subjective opinions published on the Web. The goal of this section is to provide the basic knowledge needed to understand the technical part of the specification. To that end, it introduces both the Semantic Web and the general topic of opinion representation and sentiment analysis.
An important note is that the Marl ontology presented here is not a complete model for describing and linking opinions online and inside information systems. It merely defines concepts that are not yet described by other ontologies and provides the data attributes needed to connect opinions with contextual information already defined in metadata created with other ontologies. For detailed instructions and recommendations on how to fully model opinions and the results of the opinion mining process, refer to the analysis done by the Gi2MO project.
With the birth of Web 2.0, users started to contribute content on a mass scale, expressing their subjective opinions on various topics (e.g. opinions about movies). While this kind of content can be very beneficial for many different uses (e.g. market analysis or predictions), its accurate analysis and interpretation has not been fully harnessed yet. Information left by users is often very disorganised, and many portals that enable user input leave the user-added information unmoderated.
Opinion mining (often referred to as sentiment analysis) is one of the attempts to bring order to those vast amounts of user-generated content. The domain focuses on analysing textual content with specialised language processing tools and outputs a quantified judgement of the sentiments contained in the text (e.g. whether the text expresses a positive or negative opinion).
Due to the complexity of the problem and the attempts to provide efficient and fast tools, the area can be divided into three main research directions:
In relation to the World Wide Web, there are a number of common uses of opinion formalisation and analysis. Firstly, it can be applied on top of search engines: the desired content is found first and then run through opinion analysis software to obtain the desired statistics (e.g. Swotti). Secondly, such algorithms can be used within dedicated systems that use the Web to connect to particular communities and gather their opinions on very specific topics (e.g. Internet shops or review websites).
In relation to dedicated systems (e.g. enterprise systems), the community collaboration models that have proven successful on the open Web are often transferred to large enterprises to enhance knowledge exchange and bring employees together. The same opinion mining techniques can be applied in such cases to extract particular information and use it for internal statistics and to improve knowledge search across the enterprise (e.g. see use of opinion mining in Idea Management [link]).
The Semantic Web is a W3C initiative that aims to introduce rich metadata to the current Web and to provide machine-readable and machine-processable data as a supplement to the human-readable Web.
The Semantic Web is a mature domain that has been in the research phase for many years and, with increasing commercial interest and emerging products, is starting to gain appreciation and popularity as one of the rising trends for the future Internet.
One of the cornerstones of the Semantic Web is research on interlinkable and interoperable data schemas for information published online. Those schemas are often referred to as ontologies or vocabularies. To support ontologies that lead to a truly interoperable Web of Data, the W3C has proposed a series of technologies such as RDF and OWL. Marl uses those technologies, and the research behind them, to propose an ontology for the particular goal of describing opinions and linking them with contextual information (such as the opinion topic, the features described in the opinion etc.).
The goals that the Marl ontology aims to achieve as a data schema are:
An alphabetical index of Marl terms, by class (concepts) and by property (relationships, attributes), is given below. All the terms are hyperlinked to their detailed descriptions for quick reference.
Classes: AggregatedOpinion, Opinon, Polarity
Properties: aggregatesOpinion, algorithmConfidence, describesFeature, describesObject, describesObjectPart, extractedFrom, hasOpinion, hasPolarity, maxPolarityValue, minPolarityValue, polarityValue
The Marl class diagram presented below shows connections between classes and properties used for describing opinions.
The very basic example below shows a single opinion annotated with Marl metadata (the second description represents the comment that the opinion was extracted from and is shown for reference):
<rdf:Description rdf:about="http://gi2mo.org/marl/avatar/opinion/012345/rdf">
  <marl:extractedFrom rdf:resource="http://gi2mo.org/marl/blog/avatar-review/comment/054321/rdf"/>
  <marl:describesObject rdf:resource="http://dbpedia.org/resource/Avatar_(2009_film)"/>
  <marl:describesFeature rdf:resource="http://dbpedia.org/property/runtime"/>
  <marl:polarityValue>-0.2</marl:polarityValue>
  <marl:minPolarityValue>-1</marl:minPolarityValue>
  <marl:maxPolarityValue>1</marl:maxPolarityValue>
  <marl:hasPolarity rdf:resource="http://purl.org/marl/ns#Negative"/>
  <rdf:type rdf:resource="http://purl.org/marl/ns#Opinion"/>
</rdf:Description>
<rdf:Description rdf:about="http://gi2mo.org/marl/blog/avatar-review/comment/054321/rdf">
  <dcterms:title>Re: Avatar Review</dcterms:title>
  <sioc:has_creator rdf:resource="http://gi2mo.org/marl/blog/author/user345/"/>
  <dcterms:created>Fri, 3 Jun 2010 13:53:54 +0200</dcterms:created>
  <sioc:reply_of rdf:resource="http://gi2mo.org/marl/blog/avatar-review/"/>
  <sioc:content>Awful movie, way to long!</sioc:content>
  <foaf:primaryTopic rdf:resource="http://gi2mo.org/marl/blog/avatar-review/comment/054321"/>
  <rdf:type rdf:resource="http://rdfs.org/sioc/ns#Post"/>
</rdf:Description>
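To illustrate how such an annotation might be consumed, the following minimal Python sketch reads a shortened copy of the opinion description above using only the standard library. It rests on several assumptions that are not part of the specification: the fragment is wrapped in an rdf:RDF root element with namespace declarations, the usual RDF namespace URI is used, and the helper name read_opinions is hypothetical. It treats the data as plain XML, so it only handles this particular serialisation shape; a full RDF parser would be needed for arbitrary RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MARL = "http://purl.org/marl/ns#"

# Assumption: the fragment from the example is wrapped in an rdf:RDF root
# that declares the rdf and marl prefixes (the original snippet omits both).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:marl="{MARL}">
  <rdf:Description rdf:about="http://gi2mo.org/marl/avatar/opinion/012345/rdf">
    <marl:describesObject rdf:resource="http://dbpedia.org/resource/Avatar_(2009_film)"/>
    <marl:polarityValue>-0.2</marl:polarityValue>
    <marl:minPolarityValue>-1</marl:minPolarityValue>
    <marl:maxPolarityValue>1</marl:maxPolarityValue>
    <marl:hasPolarity rdf:resource="{MARL}Negative"/>
    <rdf:type rdf:resource="{MARL}Opinion"/>
  </rdf:Description>
</rdf:RDF>"""


def read_opinions(xml_text):
    """Return one dict per rdf:Description that is typed as marl:Opinion."""
    opinions = []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        # Collect the rdf:type URIs and skip anything that is not an opinion.
        types = [t.get(f"{{{RDF}}}resource") for t in desc.findall(f"{{{RDF}}}type")]
        if MARL + "Opinion" not in types:
            continue
        opinions.append({
            "uri": desc.get(f"{{{RDF}}}about"),
            "object": desc.find(f"{{{MARL}}}describesObject").get(f"{{{RDF}}}resource"),
            "polarity": desc.find(f"{{{MARL}}}hasPolarity").get(f"{{{RDF}}}resource"),
            "value": float(desc.findtext(f"{{{MARL}}}polarityValue")),
        })
    return opinions


if __name__ == "__main__":
    for opinion in read_opinions(DOC):
        print(opinion["uri"], opinion["polarity"], opinion["value"])
```

The sketch assumes every opinion carries describesObject, hasPolarity and polarityValue; descriptions that omit one of them would need extra checks.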
For more examples, please see the Marl RDF export of opinions taken from a simple idea management system instance installed for the ETSIT school of Universidad Politécnica de Madrid. Furthermore, we recommend reading the Marl Use Cases document for more examples and hints on how to properly describe opinions with the ontology.
Below is a comprehensive list of all Marl classes and properties, together with their descriptions.
URI: http://purl.org/marl/ns#AggregatedOpinion
- The same as the Opinion class, but indicates that the properties of this class aggregate all the opinions specified in the "extractedFrom" source. Optionally, if the aggregatesOpinion property is used, this class can be created to aggregate only certain opinions (e.g. in a text about the political scene there could be many AggregatedOpinion instances, each aggregating the opinions about a different politician).
URI: http://purl.org/marl/ns#Opinon
- Describes the concept of opinion expressed in a certain text.
URI: http://purl.org/marl/ns#Polarity
- Class that represents the polarity. Use instances to express whether the polarity is positive, neutral or negative.
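The relationship between Opinion and AggregatedOpinion can be sketched in Python as follows. This is only an illustration under stated assumptions: the dataclass and function names are hypothetical, the example.org URIs are placeholders, and the arithmetic mean is just one possible calculation (the specification does not prescribe how aggregated polarity is computed).

```python
from dataclasses import dataclass
from statistics import fmean
from typing import List


@dataclass
class Opinion:
    """Hypothetical in-memory stand-in for a marl:Opinion resource."""
    uri: str
    polarity_value: float


@dataclass
class AggregatedOpinion:
    """Hypothetical stand-in for a marl:AggregatedOpinion resource."""
    uri: str
    polarity_value: float
    aggregates_opinion: List[str]  # URIs of the opinions that were combined


def aggregate(uri, opinions):
    """Combine several opinions into one aggregated opinion.

    Assumption: the mean is used here; a sum or any other calculation
    could be substituted.
    """
    if not opinions:
        raise ValueError("at least one opinion is required")
    return AggregatedOpinion(
        uri=uri,
        polarity_value=fmean(o.polarity_value for o in opinions),
        aggregates_opinion=[o.uri for o in opinions],
    )


if __name__ == "__main__":
    parts = [
        Opinion("http://example.org/opinion/1", 0.5),
        Opinion("http://example.org/opinion/2", -0.25),
        Opinion("http://example.org/opinion/3", 0.5),
    ]
    print(aggregate("http://example.org/opinion/aggregated", parts))
```

Keeping the URIs of the source opinions alongside the computed value mirrors the role of the aggregatesOpinion property, so a consumer can trace the aggregate back to its parts.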
URI: http://purl.org/marl/ns#aggregatesOpinion
- Indicates that the polarity described with the class is a calculation (e.g. a sum) of the polarity of other opinions (e.g. an aggregated opinion about a movie derived from many sentiments expressed in one text).
URI: http://purl.org/marl/ns#algorithmConfidence
- A numerical value that describes how confident the algorithm was in its assessment of the opinion (e.g. how closely the opinion matches a given object/product).
URI: http://purl.org/marl/ns#describesFeature
- Indicates a feature of an object or object part that the opinion refers to (e.g. laptop battery life or laptop battery size etc.).
URI: http://purl.org/marl/ns#describesObject
- Indicates the object that the opinion refers to.
URI: http://purl.org/marl/ns#describesObjectPart
- Indicates a particular element or part of the object that the opinion refers to (e.g. laptop screen or camera battery).
URI: http://purl.org/marl/ns#extractedFrom
- Indicates the text from which the opinion has been extracted.
URI: http://purl.org/marl/ns#hasOpinion
- Indicates that a certain text has a subjective opinion expressed in it.
URI: http://purl.org/marl/ns#hasPolarity
- A property that indicates whether the opinion is positive, negative or neutral.
URI: http://purl.org/marl/ns#maxPolarityValue
- Maximum possible numerical value of the opinion.
URI: http://purl.org/marl/ns#minPolarityValue
- Lowest possible numerical value of the opinion.
URI: http://purl.org/marl/ns#polarityValue
- A numerical representation of the polarity value. The recommended use is to specify a percentage as a real number in the range 0..1. If this is not feasible in a given solution, use minPolarityValue and maxPolarityValue to provide additional information.
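When a polarityValue is published together with minPolarityValue and maxPolarityValue, a consumer may want to bring it back to the recommended 0..1 range. The following Python sketch shows one way to do so. The function name is hypothetical and the linear rescaling is an assumption; the specification does not define how values on different scales should be compared.

```python
def normalise_polarity(value, min_value, max_value):
    """Linearly rescale a polarity value from [min_value, max_value] to 0..1.

    Assumption: a linear mapping is appropriate for the scale in use.
    """
    if max_value <= min_value:
        raise ValueError("maxPolarityValue must be greater than minPolarityValue")
    if not min_value <= value <= max_value:
        raise ValueError("polarityValue lies outside the declared range")
    return (value - min_value) / (max_value - min_value)


if __name__ == "__main__":
    # Values taken from the Avatar example: -0.2 on a scale from -1 to 1.
    print(normalise_polarity(-0.2, -1, 1))
```

With the values from the Avatar example (polarityValue -0.2, range -1 to 1), the rescaled value is 0.4, i.e. slightly below the midpoint of the scale.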
URI: http://purl.org/marl/ns#Negative
- Negative polarity.
URI: http://purl.org/marl/ns#Neutral
- Neutral polarity
URI: http://purl.org/marl/ns#Positive
- Positive polarity
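The three Polarity instances can be paired with a numeric polarityValue. The Python sketch below picks one of the instance URIs for a given value. Everything about the decision rule is an assumption made for illustration: the function name, the default range of -1 to 1, and the width of the neutral band around the midpoint are all hypothetical, since the specification leaves the choice of instance to the publisher.

```python
MARL = "http://purl.org/marl/ns#"


def polarity_instance(value, min_value=-1.0, max_value=1.0, neutral_band=0.1):
    """Return the URI of marl:Positive, marl:Neutral or marl:Negative.

    Assumption: values within `neutral_band` of the scale midpoint
    (measured as a fraction of half the range) count as neutral.
    """
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    midpoint = (min_value + max_value) / 2
    half_range = (max_value - min_value) / 2
    # Signed distance from the midpoint, scaled to the interval -1..1.
    offset = (value - midpoint) / half_range
    if offset > neutral_band:
        return MARL + "Positive"
    if offset < -neutral_band:
        return MARL + "Negative"
    return MARL + "Neutral"


if __name__ == "__main__":
    print(polarity_instance(-0.2))        # value from the Avatar example
    print(polarity_instance(0.9, 0, 1))   # a value on a 0..1 scale
```

Under these assumptions, the polarityValue of -0.2 from the Avatar example maps to the Negative instance, which agrees with the hasPolarity statement in that example.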
This documentation has been generated automatically from the most recent ontology specification in OWL using a Python script called SpecGen. The style formatting was inspired by the FOAF specification.
Special thanks for support with ontology creation and research to: Prof. Carlos A. Iglesias and members of the GSI Group of DIT department of Universidad Politécnica de Madrid.