Data collection with peach-collector.js

This document shows how to add peach-collector.js to a page and use it to collect data based on client information.

How it works

First we have to load the peach-collector.js main script, located here.

This step provides the _pc JavaScript object, which exposes all the functions needed for initialisation and event collection.

To track users and collect consistent data, the collect server sets four cookies for each user. The first (cookie_id) never expires, so it stays in the browser and uniquely identifies that browser over time. The second (session_id) expires when the browser is closed, so it can be used to uniquely identify the user's session. The third (tracking) stores whether tracking is allowed; it overrides the browser's "do not track" setting. The fourth (last_recorded) records the timestamps of the last activity so that sessions can be re-created if necessary.

Most European countries require website providers to ask their visitors about cookie usage and about any kind of tracking or data collection on their website. Make sure you comply with the legal regulations in place in your country before using the peach-collector library on your web pages.


Prerequisites

To complete this tutorial, you need a site key provided by the EBU that matches your account information. For the purposes of this tutorial, you can use zzebu00000000017 as the site key, or get one from the PEACH core team.

Production site key

To be able to run in production you must have your own site key. Please contact us!


Integrate PeachCollector

Add the following initialisation/setup calls to your page; they are also available as a file here.

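// Loader: inject a <script> tag for the library and run the callback once it has loaded.
(function (j, c) {
  // Create a <script> element and copy the given properties onto it.
  var s = function (o) {
    var t = document.createElement('script');
    for (var p in o) { t[p] = o[p]; }
    return t;
  };
  var p = s({src: j, defer: true, async: true, onload: c});
  // Insert the new tag before the first <script> already on the page.
  var i = document.getElementsByTagName('script')[0];
  i.parentNode.insertBefore(p, i);
})('//peach-static.ebu.io/peach-collector.min.js', PeachCollectorLoaded);

// Called once the library is available. The function declaration is hoisted,
// so it can be referenced by the loader above.
function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  _pc.init('my_site_key', 'my_app_id')
    .enableTracking()
    .sendPageViewEvent();
}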

The loader creates a <script> tag that loads the library and then initialises it. You have to call _pc.enableTracking() explicitly after init(), otherwise no data is sent. You get the site_key from the EBU PEACH team; the app_id can be chosen freely.

Enabling the tracking

Call enableTracking() when you've asked your user for consent and peach-collector is allowed to collect data. This overrides any "do not track" settings, so be cautious with it. Any event triggered before you enable the tracking will be ignored and not sent.
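A minimal sketch of gating tracking on consent. The startCollector helper, the hasConsent flag and the fake collector are hypothetical names used for illustration; only init(), enableTracking() and sendPageViewEvent() come from the setup code above. How consent is obtained (banner, consent platform) is left open.

```javascript
// Hypothetical helper: initialise the collector, but only enable tracking
// and send the first page view once the visitor has given consent.
function startCollector(pc, siteKey, appId, hasConsent) {
  pc.init(siteKey, appId);
  if (hasConsent) {
    // Overrides "do not track", so call this only after explicit consent.
    pc.enableTracking();
    pc.sendPageViewEvent();
  }
  return pc;
}

// Stand-in object that records calls, so the sketch runs outside a browser.
// In a page you would pass the real instance: new PeachCollector(window).
function createFakeCollector() {
  var fake = { calls: [] };
  ['init', 'enableTracking', 'sendPageViewEvent'].forEach(function (name) {
    fake[name] = function () {
      fake.calls.push(name);
      return fake; // chainable, like the calls in the setup example
    };
  });
  return fake;
}

var withConsent = startCollector(createFakeCollector(), 'my_site_key', 'my_app_id', true);
var withoutConsent = startCollector(createFakeCollector(), 'my_site_key', 'my_app_id', false);
console.log(withConsent.calls);
console.log(withoutConsent.calls);
```

Because events triggered before enableTracking() are ignored, the sketch does not send the page view at all when consent is missing.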

Listen to sent data

For testing purposes, you can listen to the events that are sent by using the setSendEventCallback() function. Pass in a function whose parameters are error, response and payload. For example: _pc.setSendEventCallback(function (err, response, payload) { /* some magic happens */ });
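As a sketch of what such a callback could do during testing, the following keeps a small in-memory log. createSendLogger is a hypothetical helper, and the simulated response object is made up; the payload is assumed to carry an events array like the sample shown under "Data Collected".

```javascript
// Hypothetical factory: builds a callback with the (err, response, payload)
// signature and keeps a log of what was reported.
function createSendLogger() {
  var log = [];
  function onSend(err, response, payload) {
    log.push({
      ok: !err,
      error: err ? String(err) : null,
      // Assumption: payload.events is an array of objects with a `type`.
      eventTypes: payload && Array.isArray(payload.events)
        ? payload.events.map(function (e) { return e.type; })
        : []
    });
  }
  onSend.log = log;
  return onSend;
}

var onSend = createSendLogger();
// In a page: _pc.setSendEventCallback(onSend);
// Simulated invocations so the sketch can run on its own:
onSend(null, { status: 200 }, { events: [{ type: 'page_view' }] });
onSend(new Error('network down'), null, null);
console.log(onSend.log);
```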

Defining endpoints

You can use our hosted library but define your own collection servers or your local server, as additional receivers or for debugging. See the section about advanced setup.


SameSite configuration

By default, peach-collector.js sets the cookies' SameSite attribute to Strict to keep them secure. There may be use cases where this level breaks functionality and must be lowered. To do so, initialise peach-collector with an options object containing a different value:

var _pc = window._pc = new PeachCollector(window, { cookie_strictness: 'None' });

This will set the SameSite attribute to "None".

Strict is the default behaviour and also the fallback in case of misconfiguration. Other possible values are Lax and None. Note that these values are case-sensitive: for example, none will fall back to Strict!
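Since a wrongly cased value silently falls back to Strict, it can help to normalise the value before passing it in. The helper below is a hypothetical sketch, not part of the library; it only produces the three values named above.

```javascript
// The exact, case-sensitive values accepted for cookie_strictness.
var ALLOWED_STRICTNESS = ['Strict', 'Lax', 'None'];

// Hypothetical helper: map loosely typed input to one of the allowed values,
// falling back to 'Strict' for anything unknown.
function normalizeCookieStrictness(value) {
  var wanted = String(value).trim().toLowerCase();
  for (var i = 0; i < ALLOWED_STRICTNESS.length; i++) {
    if (ALLOWED_STRICTNESS[i].toLowerCase() === wanted) {
      return ALLOWED_STRICTNESS[i];
    }
  }
  return 'Strict';
}

var options = { cookie_strictness: normalizeCookieStrictness('none') };
// In a page: var _pc = window._pc = new PeachCollector(window, options);
console.log(options.cookie_strictness);
```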


Sending events

Now that our peach-collector.js is registered and ready to use, you can send an event:

_pc.sendPageViewEvent();

If you open your browser's developer tools and check the network tab, you should see a successful POST request to http://pipe-collect.ebu.io/v3/collect?s=<your site key>, with a request payload containing all the collected data.

If you get a 40x HTTP error on the OPTIONS request, the site key is most likely not registered correctly. Make sure it is enabled and that it is set correctly during initialisation.

Create a context

A context is sometimes needed to understand an event (from a data science point of view). For example, we use it to track the source of a media item. When you want to track the performance of algorithms, it is very important to know when media is watched after being recommended, and when it is not.

PeachCollector provides different functions to create contexts. Those contexts can then be used when sending an event.

// describe the component in which the event happened
var component = _pc.EventContextComponent(type, name, version);

// create any kind of context
var context = _pc.pageViewContext(referrer, recommendation_id);
var context = _pc.readMoreContext(source, component);
var context = _pc.mediaContext(id, type, referrer, page_uri, source, component);
var context = _pc.recommendationContext(items, item_id, hit_index, referrer, page_uri, source, component);

Defining properties

The props object provides details about an event (essentially used for media events). Setters are provided for all known fields (see Event Properties).

var props = _pc.EventProps().setPlaybackPosition(0);

Sending an event

For documentation of the possible events that are available, please see the events documentation.

var context = _pc.mediaContext('reco0001', 'recommendation', window.document.referrer, window.location.href);
var metadata = {'title' : 'My article'};  
_pc.sendArticleStartEvent('article10234', context, metadata);

Collect data from a player

To properly track media related events, you can set up event listeners on media objects:

var video_sample = document.querySelector('#video-sample');

video_sample.onplay = function (evt) {
  var media_element = evt.target;
  _pc.sendMediaPlayEvent(media_element);
}

Shorthand for tracking on media elements

You can add tracking to a specific media element without setting up the listeners yourself; the tracking then sets them up for you. By default, events are sent on the play, pause and end events, but it can be configured to listen to any event that a media element produces.

var video_sample = document.querySelector('#video-sample');
var extended_event_config = {
  // key: event name; value: an event configuration object, or true
  media_heartbeat: {heartbeat_interval: 5},
  media_seek: true
};
var context = {};  // or null
var props = {};    // or null
var metadata = {}; // or null
_pc.trackMediaEvents(video_sample, 'my_id', context, props, metadata, extended_event_config);

Info

You can see this in action in the demo page.

Peach Collector provides functions to easily send recommendation_hit, recommendation_displayed and recommendation_loaded events. These events can be used to track recommendations provided by PEACH. Collection events are provided to track any list of items that is not a recommendation provided by PEACH, for example a swimlane of articles selected by editors on your main page.

You can send collection_hit, collection_displayed and collection_loaded events. To track more precisely what the user sees, you can also send collection_item_displayed events when a specific item appears on the screen (usually after scrolling).

Collection events, like recommendation events, should use the collection identifier as the event ID.

Collection events are also designed to embed specific fields:

- experiment_id: useful for A-B testing (default value: default)
- experiment_component: defines which component (A, B or any other) of the A-B test was used to display the collection (default value: main)
- collection_session: a unique identifier that helps group the events of the same collection within a single session (if the user reloads the page, this session identifier should be updated)

For collection_hit events, we also have:

- item_id: identifier of the item that was selected by the user
- hit_index: position of the item in the collection

For collection_item_displayed:

- item_index: position of the item displayed
- items_count: total number of items in the collection
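To show which values belong together, here is a sketch that assembles the fields of a collection_hit event. Both helpers are hypothetical, and the flat object is only an illustration: this page does not specify where the library places these fields in the payload or which function sends them, so check the events documentation before relying on this shape.

```javascript
// Hypothetical helper: a fresh identifier per page load, used to group the
// events of one collection within one session.
function newCollectionSession() {
  return 'cs-' + Date.now().toString(36) + '-' + Math.random().toString(36).slice(2, 10);
}

// Hypothetical helper: gather the fields of a collection_hit event,
// applying the documented defaults for the experiment fields.
function buildCollectionHit(collectionId, itemId, hitIndex, session, experiment) {
  experiment = experiment || {};
  return {
    type: 'collection_hit',
    id: collectionId, // the collection identifier is the event ID
    experiment_id: experiment.id || 'default',
    experiment_component: experiment.component || 'main',
    collection_session: session,
    item_id: itemId,
    hit_index: hitIndex
  };
}

var session = newCollectionSession();
var hit = buildCollectionHit('editors_picks', 'article10234', 2, session);
var hitInExperiment = buildCollectionHit('editors_picks', 'article10234', 0, session,
  { id: 'exp1', component: 'B' });
console.log(hit, hitInExperiment);
```

Generating the session identifier once per page load matches the note above that it should change when the user reloads the page.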

Advanced setup

Multiple endpoints

By default, peach-collector sends events to a single event receiver ("collect" endpoint), the EBU PEACH collect. You can also send them to your own custom receivers. Instead of passing the site_key as above, first define the endpoints as an array of objects:

function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // this is the default endpoint at EBU PEACH when no URL is given
      // but you'll always need your sitekey for it!
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver'
      // this is a custom endpoint, a sitekey is probably not needed
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}

Events will now be dispatched to two collectors. There is no custom event format, so the custom endpoint receives exactly the same events as the EBU PEACH receiver. Notice how the site_key parameter, a string value, is replaced by an array! If you'd like to change the endpoints after calling init(), you can pass them to _pc.setCollectEndpoints() instead.
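One possible use is adding a local receiver only while debugging. The sketch below builds the endpoints array accordingly; buildEndpoints is a hypothetical helper and the localhost URL is a made-up example.

```javascript
// Hypothetical helper: always send to the default EBU PEACH endpoint and,
// when a debug URL is given, to that receiver as well.
function buildEndpoints(siteKey, debugUrl) {
  var endpoints = [{ sitekey: siteKey }];
  if (debugUrl) {
    endpoints.push({ url: debugUrl });
  }
  return endpoints;
}

var production = buildEndpoints('zzebu00000000017');
var debugging = buildEndpoints('zzebu00000000017', 'http://localhost:8080/receiver');
// In a page, either pass the array to init():
//   _pc.init(debugging, 'my_app_id');
// or swap the endpoints later:
//   _pc.setCollectEndpoints(debugging);
console.log(production, debugging);
```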

Filtered events

You can set a list of events you want to receive on a given endpoint. This comes in handy if you define a custom endpoint that should only receive page_view events, for example. A valid setup could look like:

function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // again, the default, without event filter
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver',
      filter: ['page_view']
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}

The filter is an array of whitelisted event types. This setup will send only page_view events to the custom endpoint, but everything to the default one.
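The whitelist semantics can be illustrated with a few lines. This is not the library's code, only a sketch of the rule described above; passesFilter is a hypothetical name, and treating an empty array as "send nothing" is an assumption of this sketch.

```javascript
// Illustration only: decide whether an event type passes an endpoint's
// filter. An endpoint without a filter receives every event type.
function passesFilter(endpoint, eventType) {
  if (!Array.isArray(endpoint.filter)) {
    return true;
  }
  // Assumption: an empty array blocks everything in this sketch.
  return endpoint.filter.indexOf(eventType) !== -1;
}

var defaultEndpoint = { sitekey: 'zzebu00000000017' };
var customEndpoint = {
  url: 'http://collector.my-broadcaster.org/receiver',
  filter: ['page_view']
};

['page_view', 'media_play'].forEach(function (type) {
  console.log(type,
    'default:', passesFilter(defaultEndpoint, type),
    'custom:', passesFilter(customEndpoint, type));
});
```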

Payload functions

To change, enrich or reduce the event payloads, you can use a payload_func. The event's payload is passed to this function and replaced by whatever the function returns. For example:

function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // again, the default, without extra functions
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver',
      payload_func: function (payload) { payload.event.foo = 'bar'; return payload; }
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}

Here you would add the foo key to the event payload with value bar. Again, this is only valid for the endpoint it's defined in.
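A payload_func can also reduce the payload. The sketch below drops the device details before the payload goes to a custom endpoint. dropDeviceDetails is a hypothetical name, and the payload shape (a top-level client object with a nested device) is assumed from the sample under "Data Collected"; verify it against real payloads.

```javascript
// Sketch of a reducing payload_func. Returns a copy rather than mutating
// the input, in case the same object is reused for other endpoints.
function dropDeviceDetails(payload) {
  if (!payload || !payload.client) {
    return payload;
  }
  var client = Object.assign({}, payload.client);
  delete client.device;
  return Object.assign({}, payload, { client: client });
}

// Trimmed-down stand-in for a real payload, for demonstration only.
var samplePayload = {
  site_key: 'my_site_key',
  events: [{ type: 'page_view', id: 'http://localhost:3000/' }],
  client: { app_id: 'my_app_id', type: 'web', device: { language: 'en' } }
};

var reduced = dropDeviceDetails(samplePayload);
console.log(reduced);

// Endpoint definition using the function:
var customEndpoint = {
  url: 'http://collector.my-broadcaster.org/receiver',
  payload_func: dropDeviceDetails
};
```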

Overview:

| Parameter Name | Value | Explanation |
| --- | --- | --- |
| url | string | URL to send to. Use ${sitekey} to insert the sitekey value. |
| sitekey | string | Mandatory for EBU PEACH collect endpoints. |
| filter | array of strings | Sets the event types to send to this endpoint. |
| options | optional settings | Override the default settings for max_cache_hours, heartbeat_frequency_sec, lifetime_sec, max_events_stored, max_batch_size, max_events_per_request and flush_interval_sec, and add a remoteConfigUrl. |
| payload_func | function | Function to apply to the payload. Make sure to return it. |
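When using the ${sitekey} placeholder in a url, write the string with ordinary quotes. The sketch below shows why and how such a placeholder could be expanded; expandUrl is purely illustrative (the library's own substitution is not documented here) and the collector.example.org URL is made up.

```javascript
// Single quotes on purpose: inside a template literal (backticks),
// JavaScript itself would try to interpolate ${sitekey} before the
// library ever sees the placeholder.
var endpoint = {
  url: 'https://collector.example.org/receiver?s=${sitekey}',
  sitekey: 'zzebu00000000017'
};

// Illustration only: one way such a placeholder could be expanded.
function expandUrl(ep) {
  return ep.url.split('${sitekey}').join(encodeURIComponent(ep.sitekey || ''));
}

var expanded = expandUrl(endpoint);
console.log(expanded);
```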

Remote Configuration

It's possible to add remote filtering and settings in the options property when setting up the endpoint. Simply add an id and a remoteConfigUrl.

options: {
    id: 'something_unique',
    remoteConfigUrl: 'https://peach-static.ebu.io/zzebu/config_example.json',
    max_cache_hours: 60 * 60,
    heartbeat_frequency_sec: 5,
    lifetime_sec: 12*60*60,
    max_events_stored: 1440,
    max_events_per_request: 12*12,
    flush_interval_sec: 60, // mobile only
    max_batch_size: 120 // mobile only
}

The remote configuration can be stored anywhere, but it should be cached to limit traffic. Peach can provide you with an AWS S3 and CloudFront setup that could be of help. Please follow this format:

{
  "filter": [
    "media_play",
    "media_end",
    "media_pause",
    "page_view"
  ],
  "heartbeat_frequency_sec": 500,
  "lifetime_sec": 500,
  "max_events_stored": 5,
  "max_events_per_request": 500,
  "max_batch_size": 120,
  "flush_interval_sec": 60
} 
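Before publishing a remote configuration file, a quick sanity check can catch malformed values. The validator below is a hypothetical sketch: the key names are taken from the examples above, while the rule that numeric settings must be positive numbers is an assumption, not a documented constraint.

```javascript
// Numeric settings named in the options and remote configuration examples.
var NUMERIC_SETTINGS = [
  'max_cache_hours', 'heartbeat_frequency_sec', 'lifetime_sec',
  'max_events_stored', 'max_events_per_request', 'max_batch_size',
  'flush_interval_sec'
];

// Hypothetical validator: returns a list of problems, which is empty when
// the configuration matches the format shown above.
function validateRemoteConfig(config) {
  if (config === null || typeof config !== 'object' || Array.isArray(config)) {
    return ['config must be a JSON object'];
  }
  var problems = [];
  if ('filter' in config) {
    var filterOk = Array.isArray(config.filter) &&
      config.filter.every(function (t) { return typeof t === 'string'; });
    if (!filterOk) {
      problems.push('filter must be an array of event type strings');
    }
  }
  NUMERIC_SETTINGS.forEach(function (key) {
    if (key in config) {
      var value = config[key];
      // Assumption: every numeric setting has to be a positive number.
      if (typeof value !== 'number' || !isFinite(value) || value <= 0) {
        problems.push(key + ' must be a positive number');
      }
    }
  });
  return problems;
}

var goodConfig = JSON.parse('{"filter": ["media_play", "page_view"], "heartbeat_frequency_sec": 500, "max_events_stored": 5}');
var badConfig = { filter: 'page_view', lifetime_sec: '500' };
console.log(validateRemoteConfig(goodConfig));
console.log(validateRemoteConfig(badConfig));
```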

Data Collected

By using peach-collector.js, you automatically get more data about the client. This is a sample of what we receive with the code above:

{

    peach_schema_version: "1.0.3",
    peach_implementation_version: "peach-collector-js",
    site_key: "my_site_key",
    session_start_timestamp: 1582567123913,
    sent_timestamp: 1582569351215,
    collect_timestamp: 1582569351215,
    user_id: "anonymous",
    events: [
        {
            type: "page_view",
            id: "http://localhost:3000/",
            event_timestamp: 1582569351215,
            context: {
                page_uri: "http://localhost:3000/",
                referrer: ""
            }
        }
    ],
    client: {
        id: "2c0e4a10-cf43-396e-f1d5-596dbe99340e",
        app_id: "my_app_id",
        type: "web",
        name: "Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0",
        version: "5.0 (X11)",
        device: {
            type: "Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0",
            vendor: "",
            model: "Linux x86_64",
            language: "en",
            timezone: -1,
            screen_size: "1920pxx365.833px"
        },
        os: {
            name: "Linux x86_64",
            version: "5.0 (X11)"
        }
    }

}
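When inspecting such a payload, for example in a setSendEventCallback() handler, a small summary can be easier to read than the full object. summarizePayload is a hypothetical helper; it assumes the timestamps are milliseconds since the Unix epoch, which matches the magnitude of the sample values.

```javascript
// Hypothetical helper: condense a payload into the parts that are usually
// checked first while debugging.
function summarizePayload(payload) {
  return {
    eventTypes: (payload.events || []).map(function (e) { return e.type; }),
    // Assumption: both timestamps are in milliseconds since the Unix epoch.
    sessionAgeMs: payload.sent_timestamp - payload.session_start_timestamp,
    sentAt: new Date(payload.sent_timestamp).toISOString()
  };
}

// Values copied from the sample above, trimmed to the fields used here.
var sample = {
  session_start_timestamp: 1582567123913,
  sent_timestamp: 1582569351215,
  events: [{ type: 'page_view', id: 'http://localhost:3000/' }]
};

var summary = summarizePayload(sample);
console.log(summary);
```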

Final notes

If you followed this tutorial all the way through, you should now be able to collect data with peach-collector.js following our best practices. Noticed anything missing, wrong or not working? Do not hesitate to contact us and we will be pleased to answer your request.