Search API Tutorial

The Search API allows queries to be made across FT content via a RESTful service.

Search API access

The base URI for calls to the Search API is https://api.ft.com/content/search/v1.

To access the API using this URI you must provide an API Key and Content-Type header as documented in the API Reference.

Basic Request

The Search API accepts HTTP POST requests with the Content-Type header set to application/json. The simplest request has the form:

   {
      "queryString": "banks"
   }

The response will echo back the provided query and show the queryContext and resultContext of the executed query along with the results:

{
   "query":{
      "queryString":"banks",
      "queryContext":{
         "curations":[
            "ARTICLES",
            "BLOGS"
         ]
      },
      "resultContext":{
         "maxResults":100,
         "offset":0
      }
   },
   "results":[
      {
         "indexCount":100,
         "curations":[
            "ARTICLES",
            "BLOGS"
         ],
         "results":[
            {
               "aspectSet":"article",
               "modelVersion":"1",
               "id":"cbc4a190-638d-11e1-8e79-00144feabb8e"
            }
         ]
      }
   ],
   …
}
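
A basic request like the one above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library, and it builds the request without sending it. The X-Api-Key header name and the YOUR_API_KEY placeholder are assumptions made for illustration; how the key is actually supplied is documented in the API Reference.

```python
import json
import urllib.request

SEARCH_URL = "https://api.ft.com/content/search/v1"


def build_search_request(query_string, api_key):
    """Build (but do not send) a POST request for the Search API."""
    body = json.dumps({"queryString": query_string}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: this header name is hypothetical; check the
            # API Reference for the real way to supply the API key.
            "X-Api-Key": api_key,
        },
        method="POST",
    )


request = build_search_request("banks", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would send it; the sketch stops short of that so it can be run without credentials.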

Advanced Request

Curations

FT content is organised into curations. For example, the following query can be used to request only articles:

   {
      "queryString": "banks",
      "queryContext" : {
         "curations" : [ "ARTICLES" ]
      }
   }

For more information on curations, see the curations discovery method. If no curations are specified, the default behaviour will be to search across all available curations.
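
The curation behaviour described above can be sketched as a small helper that only adds queryContext when curations are given, leaving the default (all available curations) otherwise. The function name build_query is hypothetical.

```python
import json


def build_query(query_string, curations=None):
    """Return a request body, restricting to curations only when given."""
    body = {"queryString": query_string}
    if curations:
        # Omitting queryContext leaves the search across all curations.
        body["queryContext"] = {"curations": list(curations)}
    return body


print(json.dumps(build_query("banks", ["ARTICLES"]), indent=3))
print(json.dumps(build_query("banks")))
```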

Aspects

Aspects allow consumers to specify which elements of content they wish to receive within the results. Aspects can be provided in the resultContext:

   {
      "queryString": "banks",
      "resultContext" : {
         "aspects" : [ "title", "lifecycle" ]
      }
   }

This results in more information being provided for each result in the response, for example:

{
   "query":{
      …
   },
   "results":[
      {
         "aspectSet":"article",
         "modelVersion":"1",
         "id":"72920729-c340-1fb0-7023-ba7436373c78",
         "title":{
            "title":"The temptation of higher leverage"
         },
         "lifecycle":{
            "initialPublishDateTime":"2012-10-24T04:33:00Z",
            "lastPublishDateTime":"2012-10-24T04:33:00Z"
         }
      }
   ]
}

For more information on aspects, see the aspects discovery method.
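
As an illustration of reading the requested aspects back out, the sketch below parses a trimmed copy of the example response and pulls the title and last publish time from each result. It assumes the flat results shape shown in the aspects example; the basic response earlier nests the items one level deeper, so a real client may need to adjust. The summarise helper is hypothetical.

```python
import json

# Trimmed copy of the example response (the "query" echo is left out).
sample_response = json.loads("""
{
   "results": [
      {
         "aspectSet": "article",
         "modelVersion": "1",
         "id": "72920729-c340-1fb0-7023-ba7436373c78",
         "title": {"title": "The temptation of higher leverage"},
         "lifecycle": {
            "initialPublishDateTime": "2012-10-24T04:33:00Z",
            "lastPublishDateTime": "2012-10-24T04:33:00Z"
         }
      }
   ]
}
""")


def summarise(result):
    """Return (id, title, last published), tolerating missing aspects."""
    title = result.get("title", {}).get("title")
    published = result.get("lifecycle", {}).get("lastPublishDateTime")
    return result["id"], title, published


summaries = [summarise(r) for r in sample_response["results"]]
for item in summaries:
    print(item)
```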

Paging

Pagination is supported through two fields of resultContext:

  • maxResults - maximum number of results you would like to get. The default and maximum value of maxResults is 100.
  • offset - zero based offset to specify where results should begin. The default is 0.

Example of a request with pagination:

   {
      "queryString": "banks",
      "resultContext" : {
         "maxResults" : 20,
         "offset" : 21
      }
   }

Note that the maximum number of addressable results is 4000.
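
The paging fields can be generated programmatically. The sketch below yields one resultContext per page, respecting the documented maxResults ceiling of 100. It assumes that the 4000 limit means offset plus maxResults should never exceed 4000; that reading is an interpretation, and page_contexts is a hypothetical helper.

```python
MAX_PAGE_SIZE = 100      # documented maximum for maxResults
MAX_ADDRESSABLE = 4000   # documented maximum number of addressable results


def page_contexts(page_size=100, limit=MAX_ADDRESSABLE):
    """Yield resultContext dicts covering results 0..limit-1, page by page."""
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError("page_size must be between 1 and 100")
    limit = min(limit, MAX_ADDRESSABLE)
    for offset in range(0, limit, page_size):
        # Shrink the final page so offset + maxResults stays within the limit
        # (assumed interpretation of the addressable-results ceiling).
        yield {"maxResults": min(page_size, limit - offset), "offset": offset}


pages = list(page_contexts(page_size=20, limit=50))
print(pages)
```

Each yielded dictionary can be placed under resultContext in the request body alongside queryString.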

Sorting

Sorting is supported through two fields of resultContext:

  • sortField - The name of a sortable field.
  • sortOrder - Either ASC for ascending or DESC for descending order.

Both fields must be provided. Example of a request with sorting:

{
  "queryString": "banks",
  "resultContext" : {
     "sortOrder" : "ASC",
     "sortField" : "title"
  }
}

For more information on sortable fields, see the sortable fields discovery method.
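
Because both sorting fields must be provided together, a client might set them through a single helper. This is a sketch only; with_sorting is a hypothetical name, and the check on sort_field does not verify that the field is actually sortable.

```python
def with_sorting(body, sort_field, sort_order):
    """Return a copy of body with sortField and sortOrder set together."""
    if sort_order not in ("ASC", "DESC"):
        raise ValueError("sort_order must be 'ASC' or 'DESC'")
    if not sort_field:
        raise ValueError("sort_field is required when sorting")
    # Copy the existing resultContext so the caller's dict is not mutated.
    result_context = dict(body.get("resultContext", {}))
    result_context.update({"sortField": sort_field, "sortOrder": sort_order})
    return {**body, "resultContext": result_context}


sorted_query = with_sorting({"queryString": "banks"}, "title", "ASC")
print(sorted_query)
```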

Facets

Facets allow consumers to navigate through their results by refining their query. Facets can be provided in the resultContext:

{
  "queryString": "banks",
  "resultContext" : {
     "facets" : {"names":[ "people" ]}
  }
}

This results in the facets for people being included in the response:

{
   "query":{
      …
   },
   "results":[
      …
   ],
   "facets":[
      {
         "name":"people",
         "facetElements":[
            {"name":"David Cameron", "count":10},
            …
         ]
      }
   ]
}

Based on the above facetElement, the query can be refined by making a fielded query using the facet name, in this case people, with the value “David Cameron”. This is explained in more detail below, but in this example the refined query string would take the form:

{"queryString":"banks AND people:=\"David Cameron\""}

The number of facet elements returned can be controlled with maxElements and minThreshold. maxElements is the maximum number of facet elements to return (-1 returns all of them) and minThreshold is the minimum count an element needs in order to be included.

   {
      "queryString": "banks",
      "resultContext" : {
         "facets" : {"names":[ "people" ],"maxElements":20,"minThreshold":1}
      }
   }

For more information on available facets, see the facets discovery method.
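
The refinement step described in this section, turning a facet element into a fielded clause, can be sketched as below. The helper name refine_by_facet is hypothetical, and the sketch does not handle element names that themselves contain double quotes, since the escaping rules for that case are not covered here. json.dumps takes care of escaping the inner quotes for the JSON body.

```python
import json


def refine_by_facet(query_string, facet_name, element_name):
    """Append an exact-match fielded clause for the chosen facet element."""
    return f'{query_string} AND {facet_name}:="{element_name}"'


# Facet data shaped like the example response above.
facet = {"name": "people", "facetElements": [{"name": "David Cameron", "count": 10}]}
chosen = facet["facetElements"][0]

refined = refine_by_facet("banks", facet["name"], chosen["name"])
body = json.dumps({"queryString": refined})
print(body)
```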

Query Examples

To search for “banks”, the query will take the form:

banks

and will return results containing the word “banks”.

Operators

A query of the form:

Financial Times

will return all results containing “Financial” and “Times”, with the keywords potentially separated and in any order. This is because the above example is equivalent to:

Financial AND Times

and

Financial + Times

AND is implicit and can also be written as the plus sign (+). The operator must be uppercase; otherwise the search treats “and” as a keyword and returns articles containing it. To match the phrase “Financial Times”, use quotes in the queryString:

"Financial Times"

To search for content about both the Financial Times and the New York Times, use the queryString:

"Financial Times" AND "New York Times"

To search for content about “Times” but not “Financial”, negation can be used as follows:

Times -Financial

or

Times NOT Financial

or

NOT Financial Times

As these examples show, NOT is equivalent to the - symbol. Whitespace after - is allowed. To ask for all content about the Financial Times or the New York Times, use the OR operator, which must be uppercase:

"Financial Times" OR "New York Times"

or

"Financial Times" | "New York Times"

The OR operator and the | symbol are interchangeable.

Brackets

Queries can be combined further. To find content that mentions either “Financial” or “New York”, but only where it also contains the word “Times”, construct the queryString:

(Financial OR "New York") AND Times

or

(Financial OR "New York") Times

Without the brackets, the query above would be interpreted as:

Financial OR ("New York" AND Times)

This is because AND takes precedence over OR. Brackets can be used to impose a different precedence, and they can be nested to any depth.
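
Since operator case and bracketing are easy to get wrong by hand, a client might compose query strings with a few small helpers. The sketch below is illustrative only; the function names are hypothetical and it covers just the operators shown above.

```python
def phrase(text):
    """Quote text so it is matched as a phrase."""
    return f'"{text}"'


def all_of(*terms):
    """Combine terms with an explicit uppercase AND."""
    return " AND ".join(terms)


def any_of(*terms):
    """Combine terms with uppercase OR, bracketed so AND cannot split the group."""
    return "(" + " OR ".join(terms) + ")"


def without(term, excluded):
    """Require term while excluding another, using uppercase NOT."""
    return f"{term} NOT {excluded}"


query = all_of(any_of("Financial", phrase("New York")), "Times")
print(query)  # (Financial OR "New York") AND Times
print(without("Times", "Financial"))
```

Bracketing every OR group by default is a deliberate choice: it costs nothing when the brackets are redundant and avoids the precedence surprise described above.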

Fielded queries

Fielded queries are supported across all searchable fields defined by the discovery method. For example, if we are looking for all results about the person David Cameron we would use the queryString:

people:"David Cameron"

For all content about David Cameron last published before 2010 we can construct the queryString:

people:"David Cameron" lastPublishDateTime:<2010-01-01T00:00:00Z

To find content with a title containing the phrase “Cameron turns to Olympics”, use:

title:"Cameron turns to Olympics"

To find content whose title exactly matches a given string, use := as follows:

title:="Cameron turns to Olympics in difficult week."

To find all content except the item with that title, use:

title:NOT "Cameron turns to Olympics in difficult week."

Fielded queries can be combined with the AND and OR operators and with brackets. For example, to query for all content about David Cameron or Gordon Brown last published during 2011 with a title containing “Olympics”, use the query:

(people:"David Cameron" OR people:"Gordon Brown") AND (lastPublishDateTime:>2011-01-01T00:00:00Z AND lastPublishDateTime:<2012-01-01T00:00:00Z) AND title:Olympics

or, more briefly, relying on the implicit AND:

(people:"David Cameron" OR people:"Gordon Brown") lastPublishDateTime:>2011-01-01T00:00:00Z lastPublishDateTime:<2012-01-01T00:00:00Z title:Olympics
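
The date-range part of a fielded query can be generated from datetime values rather than typed by hand. The sketch below rebuilds the longer form of the combined query above; api_timestamp and published_between are hypothetical helpers, and the timestamp format is simply copied from the examples in this tutorial.

```python
from datetime import datetime, timezone


def api_timestamp(moment):
    """Format an aware datetime as the UTC timestamp style used in the examples."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def published_between(start, end, field="lastPublishDateTime"):
    """Build a bracketed range clause using the :> and :< comparisons."""
    return f"({field}:>{api_timestamp(start)} AND {field}:<{api_timestamp(end)})"


start = datetime(2011, 1, 1, tzinfo=timezone.utc)
end = datetime(2012, 1, 1, tzinfo=timezone.utc)
people = '(people:"David Cameron" OR people:"Gordon Brown")'

query = f"{people} AND {published_between(start, end)} AND title:Olympics"
print(query)
```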

Error codes

Response                     Description
200 - OK                     Success - the response will be returned, containing the results (if any).
400 - Bad Request            The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
415 - Method not Supported   The request was in an HTTP method not supported.
422 - Unprocessable Entity   The request was well-formed but could not be followed due to semantic errors. The client SHOULD NOT repeat the request without modifications.
500 - Internal Server Error  The server encountered an unexpected condition which prevented it from fulfilling the request.
501 - Not Implemented        The server does not support the functionality required to fulfill the request.
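
A client could translate the codes above into a coarse handling decision, as in the sketch below. The category labels ("ok", "fix-request" and so on) are hypothetical and chosen only for illustration; the grouping of 400 and 422 follows the "SHOULD NOT repeat the request without modifications" wording above.

```python
# Statuses described above as "SHOULD NOT repeat the request without modifications".
DO_NOT_REPEAT = {400, 422}


def classify(status):
    """Map an HTTP status code to a coarse, illustrative handling decision."""
    if status == 200:
        return "ok"
    if status in DO_NOT_REPEAT:
        return "fix-request"
    if status in (415, 501):
        return "unsupported"
    if status == 500:
        return "server-error"
    # Any code not listed in this tutorial.
    return "unexpected"


for code in (200, 400, 415, 422, 500, 501, 404):
    print(code, classify(code))
```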