Datamining Licence - Quick Start

Overview

The Datamining Licence grants you:

  • The ability to get a list of the latest articles published on FT.com
  • The ability to retrieve the headline, byline, content body, annotations, published date, brand, and URL of the articles

Getting started

All your requests to the FT APIs should use the following base URL:

https://api.ft.com

You must supply a valid API Key with each request. There are two ways to do this.

1. Supply an “apiKey” Request Parameter

GET /content/{itemId}?apiKey=yourApiKey

2. Supply an “X-Api-Key” Request Header

X-Api-Key: yourApiKey
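
As an illustration, the two options can be sketched in Python using only the standard library. The function names and the key value are hypothetical; only the base URL, the apiKey parameter and the X-Api-Key header come from this guide.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.ft.com"


def request_with_query_key(path: str, api_key: str) -> urllib.request.Request:
    """Option 1: pass the key as the apiKey request parameter."""
    query = urllib.parse.urlencode({"apiKey": api_key})
    return urllib.request.Request(f"{BASE_URL}{path}?{query}")


def request_with_header_key(path: str, api_key: str) -> urllib.request.Request:
    """Option 2: pass the key in the X-Api-Key request header."""
    return urllib.request.Request(
        f"{BASE_URL}{path}", headers={"X-Api-Key": api_key}
    )
```

Either Request object can be passed to urllib.request.urlopen. The header option may be preferable where request URLs end up in logs, since it keeps the key out of the URL.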

This licence uses the following endpoints:

  1. Notifications endpoint: returns the ids of articles that have been published or modified after a specified date and time
  2. Content endpoint: returns the basic details of an article, given its id
  3. Enriched Content endpoint: returns additional metadata and tags for an article, given its id

Note: You need to request each article resource individually using the Content and Enriched Content endpoints.

Notifications endpoint

The simplest request looks like this:

GET https://api.ft.com/content/notifications?apiKey={yourApiKey}&since={timestamp}
Content-Type: application/json

The Content-Type header, if supplied, must be application/json. For your first request only, you must also supply an appropriate since date in ISO 8601 format, for example 2014-05-20T09:44:20.976Z.
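
A minimal sketch of building the first request URL in Python, assuming the API key is sent in the X-Api-Key header rather than in the URL. The function names are hypothetical; the timestamp format follows the example in this guide.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

NOTIFICATIONS_URL = "https://api.ft.com/content/notifications"


def format_since(moment: datetime) -> str:
    """Render a timezone-aware datetime as UTC ISO 8601 with a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def first_notifications_url(moment: datetime) -> str:
    """Build the URL for the first request only; the colons are percent-encoded."""
    return f"{NOTIFICATIONS_URL}?{urlencode({'since': format_since(moment)})}"
```

For the timestamp used as the example in this guide, format_since returns 2014-05-20T09:44:20.976Z.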

The response includes a list of notification resources and a link to use for your next request.

You should always use the link provided for subsequent requests to the Notifications endpoint. If you do this, you should not miss any updates.

Each apiUrl in the notifications is a request you need to make to the Content endpoint.

For more details please see the main Notifications API documentation.
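
The polling pattern described in this section could be sketched as follows. This is an illustration under assumptions, not the documented response schema: the key names "notifications", "links", "rel" and "href" are guesses at the response shape, and the function names are hypothetical. Check the main Notifications API documentation for the real field names.

```python
import json
import urllib.request
from typing import Callable, List, Optional, Tuple


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL with the X-Api-Key header and decode the JSON body."""
    request = urllib.request.Request(url, headers={"X-Api-Key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def next_link(page: dict) -> Optional[str]:
    """Return the link for the next request, or None if the page has none.

    Assumed shape: {"links": [{"href": "...", "rel": "next"}]}.
    """
    for link in page.get("links", []):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def poll_once(
    url: str, api_key: str, fetch: Callable[[str, str], dict] = fetch_json
) -> Tuple[List[dict], str]:
    """Fetch one page; return its notifications and the URL to call next."""
    page = fetch(url, api_key)
    notifications = page.get("notifications", [])  # assumed key name
    # Fall back to the same URL if no link is present, so nothing is skipped.
    return notifications, next_link(page) or url
```

A caller would store the returned URL between runs and pass it to the next poll_once call, instead of building a new since value each time.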

Content endpoint

A request to the Content endpoint looks like this:

GET https://api.ft.com/content/{itemId}?apiKey={yourApiKey}
Content-Type: application/json

The Content-Type header, if supplied, must be application/json.

A content item, for example an FT article, is returned as a JSON data structure in the body of the response to a successful request.

See the main Content API documentation for further details.

Enriched Content endpoint

A request to the Enriched Content endpoint looks like this:

GET https://api.ft.com/enrichedcontent/{itemId}?apiKey={yourApiKey}
Content-Type: application/json

The Content-Type header, if supplied, must be application/json.

An enriched content item, for example an FT article, is returned as a JSON data structure in the body of the response to a successful request.

See the main Enriched Content API documentation for further details.
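
A short Python sketch of requesting a single item from either endpoint. The helper names are hypothetical, and the key is sent in the X-Api-Key header; the paths are the ones shown in this guide.

```python
import json
import urllib.request

BASE_URL = "https://api.ft.com"


def item_url(item_id: str, enriched: bool = False) -> str:
    """Build the Content or Enriched Content URL for one item id."""
    endpoint = "enrichedcontent" if enriched else "content"
    return f"{BASE_URL}/{endpoint}/{item_id}"


def fetch_item(item_id: str, api_key: str, enriched: bool = False) -> dict:
    """Request one item and decode the JSON body of a successful response."""
    request = urllib.request.Request(
        item_url(item_id, enriched),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )
    # urlopen raises urllib.error.HTTPError for unsuccessful responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because each article has to be requested individually, a caller would typically loop over the ids from the Notifications endpoint and call fetch_item once per id.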

Tracking code parameter

All users of the FT API are required to append a campaign parameter to the URL when linking back to FT.com, in the following format:

http://www.ft.com/cms/{article uuid}.html?FTCamp=engage/CAPI/{SOURCE}/Channel_{ORGNAME}//B2B

Where:

  • {article uuid} is the unique id of the article you wish to link to; it is included in the URL in the API response
  • {SOURCE} is where you will publish the link, i.e. whether the user clicks through to FT.com from an email, an app or a website
  • {ORGNAME} is the name of your organisation
  • The rest of the values in the string are static

Example

A developer at the FT who uses the API to serve links in a web app would append the following:

FTCamp=engage/CAPI/webapp/Channel_FT//B2B

Note: Failure to add the campaign parameter may result in your API key being revoked.
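
The format above can be expressed as a small Python helper. The function names are hypothetical, and the sketch assumes the source and organisation name contain only URL-safe characters, so no extra encoding is applied.

```python
def campaign_parameter(source: str, org_name: str) -> str:
    """Fill in the two variable parts of the static FTCamp string."""
    return f"FTCamp=engage/CAPI/{source}/Channel_{org_name}//B2B"


def tracked_url(article_url: str, source: str, org_name: str) -> str:
    """Append the campaign parameter to an FT.com article URL."""
    # Use "&" if the URL already carries a query string, otherwise "?".
    separator = "&" if "?" in article_url else "?"
    return f"{article_url}{separator}{campaign_parameter(source, org_name)}"
```

Calling campaign_parameter("webapp", "FT") reproduces the example above: FTCamp=engage/CAPI/webapp/Channel_FT//B2B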

Common requests

1) How do I get the full article text for all the items in the notifications feed?

Answer: you need to request each article resource individually using the Content endpoint.

2) How do I find the extract/teaser text?

Answer: use the first 140 characters of the text.

3) What do the notification types returned by the Notifications endpoint mean?

DELETE – The article was published by the FT but has since been removed by our editors for any reason. We ask all integrated partners to remove the article as well in this situation.

UPDATE – The article has been published, or was already published and has since been changed and republished by editorial. You should update any copy you hold of this article with the new version.

CREATE – Depending on your key, you may be able to distinguish between newly created and republished articles. In this case, new articles will be of type CREATE.

Note: If an article has changed multiple times, we only tell you about the latest change. For example, if an article is published, updated, updated again and then deleted within an hour, and you request the last hour of notifications, you will get a single result: the deletion notification.
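
The handling of the three types could be sketched in Python as below. The notification field names "type" and "id" are assumptions ("apiUrl" is named in this guide), as is the idea that a value may be either a bare word or a URI ending in that word; fetch_article stands for whatever function you use to call the Content endpoint.

```python
from typing import Callable, Dict, Iterable


def last_segment(value: str) -> str:
    """Return the part after the final slash, or the whole value if none."""
    return value.rsplit("/", 1)[-1]


def apply_notifications(
    cache: Dict[str, dict],
    notifications: Iterable[dict],
    fetch_article: Callable[[str], dict],
) -> None:
    """Mirror a batch of notifications into a local cache keyed by article id."""
    for notification in notifications:
        article_id = last_segment(notification["id"])       # assumed field
        kind = last_segment(notification["type"]).upper()   # assumed field
        if kind == "DELETE":
            # Removed by FT editors: drop any copy you hold.
            cache.pop(article_id, None)
        elif kind in ("UPDATE", "CREATE"):
            # New or republished: fetch and store the latest version.
            cache[article_id] = fetch_article(notification["apiUrl"])
```

Since only the latest change per article is reported, a DELETE for an article that was never cached is expected and is handled here as a no-op.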

4) How do I make a distinction between a new article and an updated article?

The standard Notification types are UPDATE and DELETE and should be processed if your goal is to mirror in your cache the exact same content available on FT.com at any given time. However, if you need to easily identify brand-new content, you will need additional permissions to find articles of type CREATE. (You may have negotiated a special extension of this in your licence terms and conditions.)

If you have a Datamining Licence you are entitled to keep a rolling cache of each piece of content for 90 days, so you can configure your system to check the cache for an existing file with the same UUID.

If identifying brand new content is not time critical, we suggest processing each notification as it arrives: replace the file for the same article UUID if it already exists, or simply add the file to your cache if it does not. If you require further explanation for your implementation, please contact your account representative.
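
One way to apply the cache-lookup approach is sketched below in Python. It assumes an in-memory dictionary keyed by UUID and measures the 90 days from the moment each copy was stored, which is one reading of the rolling cache; all names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple

RETENTION = timedelta(days=90)
Cache = Dict[str, Tuple[datetime, dict]]


def store(cache: Cache, uuid: str, article: dict, now: datetime) -> str:
    """Add or replace an article; report whether the UUID was already cached."""
    status = "updated" if uuid in cache else "new"
    cache[uuid] = (now, article)
    return status


def purge_expired(cache: Cache, now: datetime) -> int:
    """Drop entries stored more than 90 days ago; return how many were removed."""
    expired = [
        uuid for uuid, (stored_at, _) in cache.items()
        if now - stored_at > RETENTION
    ]
    for uuid in expired:
        del cache[uuid]
    return len(expired)
```

A limitation of this approach is that an article republished after its cached copy has expired will be reported as "new".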

5) How do I link my users to the resource on FT.com?

Simply use the webUrl field value in the response. Remember to add your tracking code parameter: http://www.ft.com/cms/s/{article uuid}.html?{tracking code parameter}

Note: The tracking code is specific to your licence, and you will need to append it to URLs before serving them to readers.

6) How do I identify FT articles associated with a particular brand, e.g. fastFT, Alphaville or Lex?

A content item's brand can be identified through its annotations (otherwise known as tags). To view annotations you should use the Enriched Content endpoint; they are not available in the Content endpoint.

Articles generally have many annotations, but the annotation for a brand looks like this:

{
  "predicate": "http://www.ft.com/ontology/classification/isClassifiedBy",
  "id": "http://api.ft.com/things/5c7592a8-1f0c-11e4-b0cb-b2227cce2b54",
  "apiUrl": "http://api.ft.com/brands/5c7592a8-1f0c-11e4-b0cb-b2227cce2b54",
  "types": [
    "http://www.ft.com/ontology/core/Thing",
    "http://www.ft.com/ontology/concept/Concept",
    "http://www.ft.com/ontology/classification/Classification",
    "http://www.ft.com/ontology/product/Brand"
  ],
  "prefLabel": "fastFT",
  "type": "BRAND",
  "directType": "http://www.ft.com/ontology/product/Brand"
}

The “prefLabel” property shows the human-readable name of the brand, but you should use the “id” as the main identifier because “prefLabel” could change.

Below are the ids for some of the most used brands.

Brand            ID
Financial Times  http://api.ft.com/things/dbb0bdae-1f0c-11e4-b0cb-b2227cce2b54
fastFT           http://api.ft.com/things/5c7592a8-1f0c-11e4-b0cb-b2227cce2b54
FT Alphaville    http://api.ft.com/things/89d15f70-640d-11e4-9803-0800200c9a66
Lex              http://api.ft.com/things/2d3e16e0-61cb-4322-8aff-3b01c59f4daa
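
A hedged Python sketch of filtering by brand id. It assumes the Enriched Content response holds its annotations in a top-level "annotations" list, which this guide does not confirm; the "id" and "type" properties are taken from the sample annotation, and the function names are hypothetical.

```python
from typing import List

FASTFT_ID = "http://api.ft.com/things/5c7592a8-1f0c-11e4-b0cb-b2227cce2b54"
LEX_ID = "http://api.ft.com/things/2d3e16e0-61cb-4322-8aff-3b01c59f4daa"


def brand_ids(enriched_item: dict) -> List[str]:
    """Collect the ids of all brand annotations on an enriched content item."""
    annotations = enriched_item.get("annotations", [])  # assumed key name
    return [a["id"] for a in annotations if a.get("type") == "BRAND"]


def has_brand(enriched_item: dict, brand_id: str) -> bool:
    """Match on the stable id rather than on prefLabel, which could change."""
    return brand_id in brand_ids(enriched_item)
```

For example, has_brand(item, FASTFT_ID) could be used to keep only fastFT articles from a batch of enriched content items.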