XMLNews.org
XMLNews-Meta Tutorial

Copyright (c) 1999 by XMLNews.org. Free redistribution permitted.

XMLNews-Meta is an extensible news industry metadata vocabulary conforming to the World Wide Web Consortium's Resource Description Framework (RDF) Recommendation. An XMLNews-Meta record provides information about a news story or other resource, while an XMLNews-Story document contains an actual news story; you can use XMLNews-Meta to exchange information about any kind of news resource in any format.

This tutorial introduces the properties in the core XMLNews-Meta vocabulary and demonstrates how you can extend an XMLNews-Meta record to include additional properties from different Namespaces. For more detailed (and authoritative) reference documentation, please see the XMLNews-Meta specification.

The core XMLNews-Meta vocabulary consists of over 40 properties that enable you to provide information such as the distributor, format and release time of a news resource (such as a news story, video clip, audio clip or photograph). All of the core properties belong to an XMLNews Namespace; you are free to extend XMLNews-Meta by adding additional properties from other Namespaces to match your technical and business requirements.

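Most of the core properties in the XMLNews-Meta vocabulary provide information about the following areas, each covered in its own section of this tutorial: header information, milestones, provenance, rights, subject matter, and linking.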

NOTE: In the examples in this tutorial, the prefix xn: is shorthand for the Namespace URI “http://www.xmlnews.org/namespaces/meta#”. XMLNews-Meta records are free to use different prefixes, as long as they map to the same Namespace URI. For more information, see the Namespaces in XML specification.

1. Top level

Every XMLNews-Meta record consists of the xn:Resource element, with declarations for all of the Namespaces used in the record (it is also good practice to include an XML declaration):

<?xml version="1.0"?>

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
</xn:Resource>

All of the properties appear as elements between the start and the end tags of the xn:Resource element; the name of the element is the property name, and the contents are the value.

One property, xn:resourceId, must be present somewhere in every XMLNews-Meta record: it provides the unique identifier of the resource being described:

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
 <xn:resourceId>082098709870987</xn:resourceId>
</xn:Resource>

You may include any number of other properties within the xn:Resource element, in any order. If you use properties from outside the core XMLNews-Meta vocabulary, you must declare any Namespaces that they use as well as the XMLNews-Meta Namespace:

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:spt="http://www.sportsonline.com/ns#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <spt:score>Jays 8, Yankees 4</spt:score>
</xn:Resource>

For more information on Namespaces, see Namespaces in XML.
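The rules in this section (an xn:Resource element at the top level, and an xn:resourceId somewhere inside it) can be checked mechanically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name read_record is hypothetical and is not part of XMLNews-Meta, and the record is the sports example from above.

```python
import xml.etree.ElementTree as ET

# Namespace URI given in this tutorial for the xn: prefix.
XN = "http://www.xmlnews.org/namespaces/meta#"

RECORD = """<?xml version="1.0"?>
<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:spt="http://www.sportsonline.com/ns#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <spt:score>Jays 8, Yankees 4</spt:score>
</xn:Resource>"""


def read_record(xml_text):
    """Hypothetical helper: return (resource id, list of (tag, value))."""
    root = ET.fromstring(xml_text)
    # ElementTree expands prefixes, so xn:Resource becomes "{uri}Resource".
    if root.tag != f"{{{XN}}}Resource":
        raise ValueError("top-level element is not xn:Resource")
    resource_id = root.findtext(f"{{{XN}}}resourceId")
    if resource_id is None:
        raise ValueError("record has no xn:resourceId")
    properties = [(child.tag, (child.text or "").strip()) for child in root]
    return resource_id, properties


resource_id, properties = read_record(RECORD)
```

Because the comparison uses the Namespace URI rather than the prefix, a record that maps a different prefix to the same URI would pass the same check.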

2. Header Information

Within an XMLNews-Meta record, you can use the following core properties to describe header (or envelope) information about a resource, such as its creator, dateline, and priority:

All of these properties are optional in an XMLNews-Meta record: you use them only when you have the information available.

For example, consider an imaginary publication Biz News that includes a daily fixture on the European markets. On February 28, 1999, the story is about a surge in trading on the London Stock Exchange, filed by John Smith. You can use the following XMLNews-Meta properties to capture the header information for this story (in an actual XMLNews-Meta record, all of these would appear within the xn:Resource element):

<xn:title>LSE Soars</xn:title>
<xn:creator>John Smith</xn:creator>
<xn:dateline>London, England, February 28, 1999</xn:dateline>
<xn:language>en</xn:language>
<xn:description>Heavy trading late in the day leaves London Stock
Exchange up 500 points.</xn:description>
<xn:classification>financial</xn:classification>
<xn:fixtureName>European Markets</xn:fixtureName>

In this example, the xn:title element contains a copy of the story's headline, the xn:creator element contains a copy of the story's byline, and the xn:dateline element contains a copy of the story's dateline; the xn:description element holds a short summary of the story. The abbreviation “en” in xn:language is the ISO 639 code for English.

Although the properties align well with the main header information (headline, byline, dateline) for a traditional printed news story, you can also use them to describe a non-textual resource like a photograph. The following example contains properties describing a photograph taken by Rachel Asa in Lisbon on June 28, 1999, and distributed by the (imaginary) ACME News Corporation:

<xn:title>Fishing boats</xn:title> 
<xn:creator>Rachel Asa</xn:creator>
<xn:dateline>Lisbon, Portugal</xn:dateline>
<xn:classification>http://www.acme.com/classifications/science</xn:classification>
<xn:description>Fishermen fold their nets near Lisbon while the EU
discusses fishing policy.</xn:description>

This time, the creator element identifies the photographer (not the writer as with a story); if you were including a video clip, you might use the creator element to identify the producer of the clip, or the reporter who appears in the clip. If the photo is a file photo, you can use the creator element as follows:

<xn:creator>Acme File Photo</xn:creator>

With a photograph, the description element might actually contain a copy of the photo's cutline (although it does not have to).

If this photograph is Acme's Photo of the Month, you might also want to add the following elements to identify it:

<xn:fixtureCode>http://www.acme.com/fixtures/photomonth</xn:fixtureCode>
<xn:fixtureName>Photo of the Month</xn:fixtureName>

NOTE: as in this example, URLs make excellent unique, machine-readable codes, since they provide a natural scoping mechanism (codes from different providers are unlikely to overlap).

3. Milestones

There are several times that mark the major milestones in the life of a news resource: the time the story is published, the time it may be released (if not immediately), the time it is received by a customer, and the time that the story expires (if any). XMLNews-Meta provides optional properties for recording any or all of these times:

Let's assume that the London Stock Exchange story described in the Header Information section was filed on 28 February 1999 at 4:00 pm London time and was received by a New York news distributor at 11:15 am local time:

<xn:publicationTime>19990228T1600</xn:publicationTime>
<xn:receivedTime>19990228T1115-0500</xn:receivedTime>

Notice that, since we used the local time the story was received, it's necessary to specify a five-hour offset from GMT (for New York). It would also have been possible to use GMT throughout:

<xn:publicationTime>19990228T1600</xn:publicationTime>
<xn:receivedTime>19990228T1615</xn:receivedTime>

Many news resources, like press releases or election results, cannot be released until a specific time; for these, you can specify a delayed release time. In the following example, a resource is published at 4:00 pm EST on February 28, 1999, but is not allowed to be released until 9:00 am on March 1:

<xn:publicationTime>19990228T1600-0500</xn:publicationTime>
<xn:releaseTime>19990301T0900-0500</xn:releaseTime>

Some resources also have an explicit expiry time, either because the information in them will be out of date (as in the case of stock quotes) or because redistribution rights are granted only for a limited period. If the photograph used for the second example in the Header Information section were the photo of the month, it would need to expire before the next photo of the month was released:

<xn:publicationTime>19990501T000000</xn:publicationTime>
<xn:expireTime>19990531T235959</xn:expireTime>

The photo was published at midnight on May 1, and will expire just before midnight on May 31, so that another photo of the month can be issued.
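The timestamps in this section can only be compared once they are reduced to a common time zone. The following sketch parses the forms that appear in the examples above (with or without seconds, with or without an offset) and checks a release and expiry window. It assumes that a value without an offset is GMT, as in the "GMT throughout" example; the function names parse_time and is_available are hypothetical, and the XMLNews-Meta specification remains the authority on the permitted time formats.

```python
import re
from datetime import datetime, timedelta, timezone

# Date, "T", hours and minutes, optional seconds, optional +hhmm/-hhmm offset.
TIMESTAMP = re.compile(r"(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?([+-]\d{4})?")


def parse_time(value):
    """Parse a value such as 19990228T1115-0500 into an aware datetime.

    Assumption: a value without an offset is GMT.
    """
    match = TIMESTAMP.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {value!r}")
    year, month, day, hour, minute = (int(part) for part in match.groups()[:5])
    second = int(match.group(6) or 0)
    offset = match.group(7)
    if offset is None:
        tz = timezone.utc
    else:
        sign = -1 if offset.startswith("-") else 1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]),
                                       minutes=int(offset[3:5])))
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


def is_available(now, release_time=None, expire_time=None):
    """Return True if 'now' is not before the release or after the expiry."""
    if release_time is not None and now < parse_time(release_time):
        return False
    if expire_time is not None and now > parse_time(expire_time):
        return False
    return True


# The New York local time and the GMT time in the examples are the same instant.
same_instant = parse_time("19990228T1115-0500") == parse_time("19990228T1615")

# The embargoed resource is still held back on the evening of February 28 (EST).
embargoed = not is_available(parse_time("19990228T2000-0500"),
                             release_time="19990301T0900-0500")
```

The pattern is matched as a whole rather than handed to a generic date parser, so that a value without seconds is never misread as one with single-digit minutes and seconds.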

4. Provenance

News resources often travel along a complex route, starting with a local provider or bureau and passing through wire services, amalgamators, value-added redistributors, and others before arriving at their final destination. XMLNews-Meta provides several optional properties for keeping track of where a story has come from:

The distinction between the provider, distributor and source properties is subtle, and may be determined by contractual agreements rather than clear definitions. XMLNews-Meta uses the source properties to identify the original creator of the resource (for example, a local paper or television station), the provider properties to identify the primary provider of the information (such as a major wire service), and the distributor properties to identify other members of the distribution chain, if any. The service properties identify a particular service of the provider, such as “technology news”.

Let's return to our story about the London stock market used in the Header Information section. The story is distributed by Biz News Incorporated, who received it from Acme News Corporation, who picked it up from the London Financial Times. Biz News distributes the story through its Today in Finance service. You can include all of this information in the XMLNews-Meta properties as follows:

<xn:sourceName>London Financial Times</xn:sourceName>
<xn:distributorName>Acme News Corporation</xn:distributorName>
<xn:providerName>Biz News Inc.</xn:providerName>
<xn:serviceName>Today in Finance</xn:serviceName>

These properties work similarly with a photograph or other non-textual resource.

5. Rights

What news vendors sell is usually not a news resource itself but the right to use that resource in certain ways and places for a certain period of time. You can use the properties described in the Milestones section to provide information about the period of time for which a resource is available; there are also two other, more general properties that relate to rights:

News resources will often contain more than one copyright statement, especially if the resource contains contributions from more than one source. Since there are so many different types of distribution agreements available, the xn:distributionRights property simply contains plain prose.

Here is some sample rights information for a news story:

<xn:copyright>Portions copyright (c) 1999 by London Financial
 Times</xn:copyright>
<xn:copyright>Copyright (c) 1999 by Acme News Corporation. All rights
 reserved.</xn:copyright>
<xn:distributionRights>Distribution permitted within Canada, the United
 States, and Mexico.</xn:distributionRights>

Notice that this example contains two copyright statements, and that each one appears as a separate property.
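Since a property such as xn:copyright can occur more than once, software reading a record should collect a list of values for each property rather than a single value. The following sketch, again using Python's xml.etree.ElementTree, shows one way to do that; the helper name group_properties is hypothetical, the resource identifier is borrowed from the Top Level section, and collapsing line breaks inside a value is a choice made here for display, not a rule of XMLNews-Meta.

```python
import xml.etree.ElementTree as ET

XN = "http://www.xmlnews.org/namespaces/meta#"

RECORD = """<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:copyright>Portions copyright (c) 1999 by London Financial
  Times</xn:copyright>
 <xn:copyright>Copyright (c) 1999 by Acme News Corporation. All rights
  reserved.</xn:copyright>
 <xn:distributionRights>Distribution permitted within Canada, the United
  States, and Mexico.</xn:distributionRights>
</xn:Resource>"""


def group_properties(root):
    """Hypothetical helper: map each core property name to a list of values."""
    prefix = f"{{{XN}}}"
    grouped = {}
    for child in root:
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            # Collapse the line breaks used to wrap long values.
            value = " ".join((child.text or "").split())
            grouped.setdefault(name, []).append(value)
    return grouped


grouped = group_properties(ET.fromstring(RECORD))
copyrights = grouped["copyright"]  # a list with one entry per statement
```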

6. Subject Matter

XMLNews-Meta records can contain detailed information about the subject matter of a news resource, using the following properties:

The subject properties are the most general: all of the others are specializations.

Imagine, for example, that a news story contains comments by British Prime Minister Tony Blair on Microsoft made in Seattle during a tour of the U.S. Northwest. The XMLNews-Meta record might contain the following properties:

<xn:personName>Tony Blair</xn:personName>
<xn:locationName>Britain</xn:locationName>
<xn:locationName>Seattle</xn:locationName>
<xn:eventName>visit to U.S. Northwest</xn:eventName>
<xn:companyCode>NASDAQ:MSFT</xn:companyCode>
<xn:companyName>Microsoft</xn:companyName>

Some providers will have standard codes for well-known people and places to enable more accurate searching and filtering. Ideally, these codes should be fully-qualified URLs to avoid confusion between codes from different distributors or providers:

<xn:personCode>http://www.acmenews.com/codes/people/blair0516</xn:personCode>
<xn:personName>Tony Blair</xn:personName>
<xn:locationCode>http://www.acmenews.com/codes/regions/europe/uk</xn:locationCode>
<xn:locationName>Britain</xn:locationName>
<xn:locationCode>http://www.acmenews.com/codes/regions/na/us/wa/seattle</xn:locationCode>
<xn:locationName>Seattle</xn:locationName>
<xn:eventName>visit to U.S. Northwest</xn:eventName>
<xn:companyCode>NASDAQ:MSFT</xn:companyCode>
<xn:companyName>Microsoft</xn:companyName>

The same properties can apply to a photograph:

<xn:eventCode>http://www.acme.com/codes/events/1999/06/eu-fishing</xn:eventCode>
<xn:eventName>EU Fishing Talks</xn:eventName>
<xn:locationCode>http://www.acme.com/codes/regions/europe/pt/lisbon</xn:locationCode>
<xn:locationName>Lisbon</xn:locationName>
<xn:locationCode>http://www.acme.com/codes/regions/europe</xn:locationCode>
<xn:locationName>Europe</xn:locationName>

Notice that with the photograph we have repeated the location element to include two different ways to categorize the location.

7. Linking

News resources have various connections with each other: a simple story, for example, can contain a photograph, can be contained in a digest, and can be one in a series of different versions of the same story. It can also be based on a resource in a different format (such as video), and have resources in other formats (such as a radio report) based on it. XMLNews-Meta provides several properties for tracing these kinds of links (not to be confused with the general-purpose hypertext links used in HTML):

For example, the first version of a story about the London Stock market might have no links and only its resource identifier (see the Top Level section):

<xn:resourceId>098709870</xn:resourceId>

An hour later, changes in the market prompt a new version of the story; this time, there is a link to the previous version:

<xn:resourceId>86576586</xn:resourceId>
<xn:previousVersion>098709870</xn:previousVersion>

Depending on the system architecture, the record for the first version of the story might also have been updated:

<xn:resourceId>098709870</xn:resourceId>
<xn:nextVersion>86576586</xn:nextVersion>

The newswire might also send down a photograph related to the story, and its record can have a link to the resource ID of the story in which it appears:

<xn:resourceId>532543245</xn:resourceId>
<xn:parent>86576586</xn:parent>

If the photograph were used in more than one story, its XMLNews-Meta record could contain a pointer to each one:

<xn:resourceId>532543245</xn:resourceId>
<xn:parent>86576586</xn:parent>
<xn:parent>39547547</xn:parent>

Likewise, the records for the stories can contain pointers to the photograph, if desired:

<xn:resourceId>86576586</xn:resourceId>
<xn:previousVersion>098709870</xn:previousVersion>
<xn:child>532543245</xn:child>

Finally, the record for the audio file of a radio broadcast based on the story can also point back to it:

<xn:resourceId>29576488</xn:resourceId>
<xn:prototype>86576586</xn:prototype>

And the story's record, if desired, can point to the radio broadcast:

<xn:resourceId>86576586</xn:resourceId>
<xn:rendition>29576488</xn:rendition>

The architectural implementations of linking can be very complex; XMLNews provides the properties necessary to represent the links if desired, but does not dictate a single method for maintaining and updating them.
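As one illustration of how such links might be followed, the sketch below indexes a set of records by resource identifier and walks the xn:previousVersion links back from the newest version of the stock market story. It is only one possible approach under simple assumptions (all records are held in memory, and each record has at most one previous version); the function names index_records and version_history are hypothetical.

```python
import xml.etree.ElementTree as ET

XN = "http://www.xmlnews.org/namespaces/meta#"

RECORDS = [
    """<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
 <xn:resourceId>098709870</xn:resourceId>
</xn:Resource>""",
    """<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
 <xn:resourceId>86576586</xn:resourceId>
 <xn:previousVersion>098709870</xn:previousVersion>
 <xn:child>532543245</xn:child>
</xn:Resource>""",
]


def index_records(xml_texts):
    """Map each resource identifier to its parsed record."""
    index = {}
    for text in xml_texts:
        root = ET.fromstring(text)
        index[root.findtext(f"{{{XN}}}resourceId")] = root
    return index


def version_history(index, resource_id):
    """List identifiers from the given version back to the earliest known one."""
    history = []
    seen = set()  # guards against a loop in badly formed link data
    current = resource_id
    while current is not None and current not in seen:
        history.append(current)
        seen.add(current)
        record = index.get(current)
        if record is None:
            break  # the link points to a record we do not hold
        current = record.findtext(f"{{{XN}}}previousVersion")
    return history


index = index_records(RECORDS)
history = version_history(index, "86576586")
```

The identifiers are compared as strings, so the leading zero in 098709870 is preserved.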

8. Extending XMLNews

The best way to extend the information available in an XMLNews-Meta record is to use (or invent) properties from another Namespace. For example, if the (fictional) Sports Online provider wanted to add an additional property for game scores, they could create their own Namespace and use the property score within it:

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:spt="http://www.sportsonline.com/ns#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <spt:score>Jays 8, Yankees 4</spt:score>
</xn:Resource>

If the (fictional) rating agency News Ratings wanted to score articles based on user-supplied criteria, they could create their own Namespace and use the property score within it:

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:rating="http://www.newsratings.com/xml/namespace#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <rating:score>7.6</rating:score>
</xn:Resource>

Even though the two properties have the same base name, “score”, there is no risk of confusion because they belong to separate Namespaces:

<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:spt="http://www.sportsonline.com/ns#"
             xmlns:rating="http://www.newsratings.com/xml/namespace#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <spt:score>Jays 8, Yankees 4</spt:score>
 <rating:score>7.6</rating:score>
</xn:Resource>

Processing software will simply ignore properties that it does not recognize, so providers can invent new properties as required without affecting existing software. Note that the Namespaces are arbitrary URIs: they do not actually need to point to anything that can be retrieved by a browser.
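The following sketch shows what "ignoring unrecognized properties" might look like for a consumer that understands the core vocabulary and the News Ratings Namespace but has never heard of Sports Online. The function names split_tag and recognized_properties are hypothetical; the Namespace URIs are the ones used in the examples above.

```python
import xml.etree.ElementTree as ET

XN = "http://www.xmlnews.org/namespaces/meta#"
RATING = "http://www.newsratings.com/xml/namespace#"

RECORD = """<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#"
             xmlns:spt="http://www.sportsonline.com/ns#"
             xmlns:rating="http://www.newsratings.com/xml/namespace#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:title>Jays beat Yankees</xn:title>
 <xn:category>sports</xn:category>
 <spt:score>Jays 8, Yankees 4</spt:score>
 <rating:score>7.6</rating:score>
</xn:Resource>"""


def split_tag(tag):
    """Split ElementTree's "{uri}local" tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def recognized_properties(root, known_namespaces):
    """Keep only properties whose Namespace the consumer understands."""
    result = []
    for child in root:
        uri, local = split_tag(child.tag)
        if uri in known_namespaces:
            result.append((uri, local, (child.text or "").strip()))
    return result


root = ET.fromstring(RECORD)
kept = recognized_properties(root, {XN, RATING})
```

Here the two score properties never collide: the one in the News Ratings Namespace is kept, and the one in the Sports Online Namespace is skipped without any error.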

In some cases, however, there will be a need (contractual or technical) to pass on arbitrary NAME=VALUE pairs exactly as supplied by the provider. For this purpose, there is a special property available, xn:vendorData:

<xn:vendorData>XXX=YYY</xn:vendorData>
<xn:vendorData>AAA=BBB</xn:vendorData>

It is usually dangerous to send vendor data outside of a closed system, since there is a risk of confusion (there is no partitioning into Namespaces), but this mechanism does provide an internal work-around for problems with legacy systems.
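Inside a closed system, the pairs can be recovered with a few lines of code. The sketch below assumes that the name ends at the first "=" sign and that everything after it is the value, passed through unchanged; that splitting rule is an assumption made for this example, and the function name vendor_pairs and the wrapping record are hypothetical.

```python
import xml.etree.ElementTree as ET

XN = "http://www.xmlnews.org/namespaces/meta#"

RECORD = """<xn:Resource xmlns:xn="http://www.xmlnews.org/namespaces/meta#">
 <xn:resourceId>082098709870987</xn:resourceId>
 <xn:vendorData>XXX=YYY</xn:vendorData>
 <xn:vendorData>AAA=BBB</xn:vendorData>
</xn:Resource>"""


def vendor_pairs(root):
    """Return (name, value) tuples from every xn:vendorData property."""
    pairs = []
    for element in root.findall(f"{{{XN}}}vendorData"):
        # Assumption: split at the first "=" only, so values may contain "=".
        name, separator, value = (element.text or "").partition("=")
        if separator:
            pairs.append((name, value))
    return pairs


pairs = vendor_pairs(ET.fromstring(RECORD))
```

A list of tuples is used instead of a dictionary so that repeated names, and the order in which the provider supplied them, are preserved.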
