Data collection with peach-collector.js
This document shows you how to add peach-collector.js and use it to collect data based on client information.
How it works
First we have to load the peach-collector.js main script, located here.
This step provides the _pc JavaScript object, which has all the functions needed for initialisation and event collection.
To track users and collect consistent data, the collect server sets four cookies for each user. The first (cookie_id) never expires, so it stays in the browser and makes it possible to uniquely identify the browser over time. The second (session_id) expires when the browser is closed, so it can be used to uniquely identify the user's session. The third (tracking) stores the information about whether tracking is allowed; this overrides the client browser's "do not track" setting. The fourth (last_recorded) records the timestamp of the last activity so that sessions can be re-created if necessary.
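While debugging, it can help to check which of these four cookies the browser currently holds. The following is a minimal sketch, assuming the names in parentheses above are the literal cookie names and that the cookies are visible to page scripts; `parseCookies` and `missingPeachCookies` are hypothetical helpers, not part of peach-collector.js.

```javascript
// Hypothetical debugging helpers; not part of peach-collector.js.
var PEACH_COOKIE_NAMES = ['cookie_id', 'session_id', 'tracking', 'last_recorded'];

// Turn a "a=1; b=2" cookie string into a plain object of raw values.
function parseCookies(cookieString) {
  var result = {};
  (cookieString || '').split(';').forEach(function (pair) {
    var index = pair.indexOf('=');
    if (index === -1) { return; } // skip fragments without a value
    var name = pair.slice(0, index).trim();
    var value = pair.slice(index + 1).trim();
    if (name) { result[name] = value; }
  });
  return result;
}

// List the expected cookie names that are absent from the cookie string.
function missingPeachCookies(cookieString) {
  var cookies = parseCookies(cookieString);
  return PEACH_COOKIE_NAMES.filter(function (name) {
    return !Object.prototype.hasOwnProperty.call(cookies, name);
  });
}
```

In a browser you would call `missingPeachCookies(document.cookie)`; an empty array means all four cookies are present.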
Most European countries require website providers to ask their visitors about cookie usage and any kind of tracking or data collection on their website. Make sure you comply with the legal regulations in place in your country before using the peach-collector library on your webpages.
Prerequisites
To complete this tutorial, you will need a site key provided by the EBU that matches your account information. For the purposes of this tutorial, you can use zzebu00000000017 as the site key, or get one from the PEACH core team.
Production site key
To be able to run in production you must have your own site key. Please contact us!
Integrate PeachCollector
You have to do the initialisation/setup calls, available as a file here.
(function (j, c) {
  var s = function (o) {
    var t = document.createElement('script');
    for (var p in o) { t[p] = o[p]; }
    return t;
  };
  var p = s({src: j, defer: true, async: true, onload: c});
  var i = document.getElementsByTagName('script')[0];
  i.parentNode.insertBefore(p, i);
})('//peach-static.ebu.io/peach-collector.min.js', PeachCollectorLoaded);

function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  _pc.init('my_site_key', 'my_app_id')
    .enableTracking()
    .sendPageViewEvent();
}
The loader creates a <script> tag, which loads the library, and then initialises it. You have to call _pc.enableTracking() explicitly after init() to see data being sent!
You get a site_key from us, the EBU PEACH team; the app_id can be chosen freely.
Enabling the tracking
Call enableTracking() once you have asked your user for consent and peach-collector is allowed to collect data. This overrides any "do not track" settings, so be cautious with it. Any event triggered before you enable tracking is ignored and not sent.
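One way to keep tracking behind consent is to separate the enableTracking() call from init() and only make it once your consent dialog has returned a positive answer. This is a sketch under assumptions: `startTrackingIfAllowed` is a hypothetical helper, and `consentGiven` stands for whatever your own consent mechanism provides.

```javascript
// Hypothetical helper; `collector` is an initialised PeachCollector
// instance (the `_pc` object), `consentGiven` comes from your consent dialog.
function startTrackingIfAllowed(collector, consentGiven) {
  if (!consentGiven) {
    return false; // tracking stays disabled, so no events are sent
  }
  collector.enableTracking();
  collector.sendPageViewEvent(); // sent only now that tracking is enabled
  return true;
}
```

With this approach, init() is called without chaining enableTracking(), and `startTrackingIfAllowed(_pc, userSaidYes)` runs after the user has made a choice.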
Listen to sent data
For testing purposes, you can listen to the events that are sent by using the setSendEventCallback() function. Pass in a function that takes three parameters: error, response and payload. For example, _pc.setSendEventCallback(function (err, response, payload) { /* some magic happens */ });
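For instance, a test page could collect every send result in an array for later inspection. This is a minimal sketch; `makeSendEventLogger` is a hypothetical helper, and it assumes the error argument is falsy when sending succeeded.

```javascript
// Hypothetical helper that builds a callback for setSendEventCallback().
// Assumption: `err` is falsy when the event was sent successfully.
function makeSendEventLogger(log) {
  return function (err, response, payload) {
    log.push({
      ok: !err,
      error: err || null,
      response: response,
      payload: payload
    });
  };
}

// Usage in the browser (requires the loaded library):
//   var sentEvents = [];
//   _pc.setSendEventCallback(makeSendEventLogger(sentEvents));
```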
Defining endpoints
You can use our hosted library but define your own collection servers or your local server, as additional receivers or for debugging. See the section about advanced setup.
Using npm to load peach-collector
You may use the peach-collector npm package if you build your frontend code with any transpiler. If you use Server-Side Rendering libraries for your content make sure that you get access to 'globalThis.window' for the init of the collector in your non-hydrated root.
Add to package.json
npm install https://peach-static.ebu.io/peach-collector-1.2.12.tgz --save
In your root
import PeachCollector from 'peach-collector';
// or
const PeachCollector = require('peach-collector');
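With Server-Side Rendering, the same code may run where no window exists. A small guard can keep the collector from being constructed in that case. This is a sketch under assumptions: `createCollectorIfBrowser` is a hypothetical helper, and the constructor is passed in so the guard itself does not depend on the package being loaded.

```javascript
// Hypothetical helper; `PeachCollectorCtor` is the imported PeachCollector
// class and `globalObject` is typically `globalThis`.
function createCollectorIfBrowser(PeachCollectorCtor, globalObject) {
  var win = globalObject && globalObject.window;
  if (!win) {
    return null; // server-side render: no window, so no collector
  }
  return new PeachCollectorCtor(win);
}

// Usage: var _pc = createCollectorIfBrowser(PeachCollector, globalThis);
// Call init() only when the result is not null.
```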
SameSite configuration
By default, peach-collector.js sets the cookies' SameSite attribute to Strict to keep them secure. There may be use cases where this level disables functionality and must be lowered. To do so, initialise the peach-collector with an options object containing a different value:
var _pc = window._pc = new PeachCollector(window, { cookie_strictness: 'None' });
This sets the SameSite attribute to "None".
You can use Strict, which is the default behaviour and the fallback in case of misconfiguration. Other possible values are Lax and None. Please note that these options are case-sensitive: for example, using none will fall back to Strict!
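The fallback rule described here can be illustrated with a few lines of code. This is not the library's implementation, only a sketch of the documented behaviour; `resolveCookieStrictness` is a hypothetical name.

```javascript
// Illustration of the documented rule, not the library's own code.
var ALLOWED_STRICTNESS = ['Strict', 'Lax', 'None'];

// Return the value if it is an exact (case-sensitive) match, else 'Strict'.
function resolveCookieStrictness(value) {
  return ALLOWED_STRICTNESS.indexOf(value) !== -1 ? value : 'Strict';
}
```

Running your configuration value through such a check before passing it as cookie_strictness makes a misspelling like 'none' visible early instead of silently ending up as Strict.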
Sending events
Now that our peach-collector.js is registered and ready to use, you can send an event:
_pc.sendPageViewEvent();
If you open your browser's developer tools and check the network tab, you should see a successful POST request to http://pipe-collect.ebu.io/v3/collect?s=<your site key> whose request payload contains all the collected data.
If you get a 40x HTTP error on the OPTIONS request, it is most likely because the site key is not registered correctly. Make sure it is enabled and correctly initialized.
Create a context
A context is sometimes needed to understand an event (from a data science point of view). For example, we use it to track the source of a media item. When you want to track the performance of algorithms, it is very important to understand when media is watched after being recommended, and when it is not.
PeachCollector provides different functions to create contexts. Those contexts can then be used when sending an event.
// describe the component in which the event happened
var component = _pc.EventContextComponent(type, name, version);
// create any kind of context
var context = _pc.pageViewContext(referrer, recommendation_id);
var context = _pc.readMoreContext(source, component);
var context = _pc.mediaContext(id, type, referrer, page_uri, source, component);
var context = _pc.recommendationContext(items, item_id, hit_index, referrer, page_uri, source, component);
Defining properties
The props object provides details about an event (mainly used for media events).
Setters are provided for all known fields (see Event Properties).
var props = _pc.EventProps().setPlaybackPosition(0);
Sending an event
For documentation of the possible events that are available, please see the events documentation.
var context = _pc.mediaContext('reco0001', 'recommendation', window.document.referrer, window.location.href);
var metadata = {'title' : 'My article'};
_pc.sendArticleStartEvent('article10234', context, metadata);
Collect data from a player
To properly track media related events, you can set up event listeners on media objects:
var video_sample = document.querySelector('#video-sample');
video_sample.onplay = function (evt) {
  var media_element = evt.target;
  _pc.sendMediaPlayEvent(media_element);
};
Shorthand for tracking on media elements
You can add tracking to a specific media element without having to set up listeners yourself. By default, the tracking sets up the listeners for you and sends events on play, pause and end. It can also be configured to listen to any event that a media element produces.
var video_sample = document.querySelector('#video-sample');
var extended_event_config = {
  media_heartbeat: {heartbeat_interval: 5}, // event_name, event_configuration or true
  media_seek: true
};
var context = {}; // or null
var props = {}; // or null
var metadata = {}; // or null
_pc.trackMediaEvents(video_sample, 'my_id', context, props, metadata, extended_event_config);
Info
You can see this in action in the demo page.
Events related to a collection of items
Peach Collector provides functions to easily send recommendation_hit, recommendation_displayed and recommendation_loaded events. These events can be used to track recommendations provided by PEACH. Collection events are provided to track any list of items that is not a recommendation provided by PEACH, for example a swimlane of articles selected by editors on your main page.
You can send collection_hit, collection_displayed and collection_loaded events. To better track what the user sees, you can also send collection_item_displayed events when a specific item appears on the screen (usually after scrolling).
Collection events, like recommendation events, should have the collection identifier as the event ID.
Collection events are also designed to embed specific fields:
- experiment_id: useful for A/B testing (defaults to default)
- experiment_component: defines which component (A, B or any other) of the A/B test was used to display the collection (defaults to main)
- collection_session: a unique identifier that helps group events of the same collection within a single session (if the user reloads the page, this session identifier should be updated)
For collection_hit events, we also have:
- item_id: identifier of the item that was selected by the user
- hit_index: position of the item in the collection
For collection_item_displayed:
- item_index: position of the item displayed
- items_count: total number of items in the collection
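To make the defaults concrete, the fields for a collection_hit event could be assembled as below. This only builds a plain object from the field names listed above; how the fields are attached to the event is not shown in this document, and `buildCollectionHitFields` is a hypothetical helper.

```javascript
// Hypothetical helper; it only assembles the documented fields and
// applies the documented defaults. It does not send anything.
function buildCollectionHitFields(options) {
  return {
    experiment_id: options.experiment_id || 'default',
    experiment_component: options.experiment_component || 'main',
    collection_session: options.collection_session,
    item_id: options.item_id,
    hit_index: options.hit_index
  };
}

// Example values are made up for illustration.
var exampleFields = buildCollectionHitFields({
  collection_session: 'session-0001',
  item_id: 'article10234',
  hit_index: 2
});
```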
Advanced setup
Multiple endpoints
You can make peach-collector send events to more than one event receiver ("collect" endpoint): in addition to the default EBU PEACH collect endpoint, you can use your own custom receivers. Instead of passing the site_key as above, you first define the endpoints as an array of objects, like this:
function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // this is the default endpoint at EBU PEACH when no URL is given
      // but you'll always need your sitekey for it!
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver'
      // this is a custom endpoint, a sitekey is probably not needed
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}
Now events will be dispatched to two collectors. There is no custom event format, so the custom endpoint receives exactly the same events as the EBU PEACH receiver. Notice how the site_key parameter, a string value, is replaced by an array!
If you would like to change the endpoints after calling init(), you can also pass them to _pc.setCollectEndpoints().
Filtered events
You can set a list of events you want to receive on a given endpoint. This comes in handy if, for example, you define a custom endpoint that should only receive page_view events. A valid setup could look like this:
function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // again, the default, without event filter
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver',
      filter: ['page_view']
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}
The filter is an array of whitelisted event types. This setup sends only page_view events to the custom endpoint, but everything to the default one.
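The whitelist semantics can be sketched as a small predicate. This illustrates the behaviour described here and is not the library's internal code; `endpointAcceptsEvent` is a hypothetical name, and treating an empty array as "accept nothing" is an assumption.

```javascript
// Illustration of the whitelist rule; not the library's own code.
function endpointAcceptsEvent(endpoint, eventType) {
  if (!Array.isArray(endpoint.filter)) {
    return true; // no filter defined: the endpoint receives every event
  }
  // Assumption: an empty filter array accepts nothing.
  return endpoint.filter.indexOf(eventType) !== -1;
}

var defaultEndpoint = { sitekey: 'zzebu00000000017' };
var customEndpoint = {
  url: 'http://collector.my-broadcaster.org/receiver',
  filter: ['page_view']
};
```

With the two endpoints from the setup above, a media_play event would go to the default endpoint only, while a page_view event would go to both.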
Payload functions
To change, enrich or reduce the event payloads, you can use a payload_func. The event's payload is passed into this function and replaced by whatever you return from it. For example:
function PeachCollectorLoaded() {
  var _pc = window._pc = new PeachCollector(window);
  var endpoints = [
    {
      sitekey: 'zzebu00000000017'
      // again, the default, without extra functions
    },
    {
      url: 'http://collector.my-broadcaster.org/receiver',
      payload_func: function (payload) { payload.event.foo = 'bar'; return payload; }
    }
  ];
  _pc.init(endpoints, 'my_app_id')
    .sendPageViewEvent();
}
Here you would add the foo key with the value bar to the event payload. Again, this only applies to the endpoint it is defined in.
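A payload_func can also reduce a payload, for example to avoid forwarding device details to a custom endpoint. This is a sketch under assumptions: it presumes the payload has the shape shown under "Data Collected" (a client object with a device key) and is JSON-serialisable; `stripDeviceDetails` is a hypothetical name.

```javascript
// Hypothetical payload_func that removes client.device from the payload.
// Assumption: the payload looks like the sample under "Data Collected".
function stripDeviceDetails(payload) {
  var copy = JSON.parse(JSON.stringify(payload)); // leave the original untouched
  if (copy.client && copy.client.device) {
    delete copy.client.device;
  }
  return copy; // the returned value replaces the payload
}

// Usage in an endpoint definition:
//   { url: 'http://collector.my-broadcaster.org/receiver', payload_func: stripDeviceDetails }
```

Working on a copy keeps the reduction from leaking into what other endpoints receive, in case the same payload object is shared between them.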
Overview:
Parameter Name | Value | Explanation |
---|---|---|
url | string | URL to send to. Use ${sitekey} to insert sitekey value |
sitekey | string | Mandatory for EBU PEACH collect endpoints. |
filter | array of strings | Sets the event types to send to this endpoint |
options | optional settings | Override the default settings for max_cache_hours, heartbeat_frequency_sec, lifetime_sec, max_events_stored, max_batch_size, max_events_per_request, flush_interval_sec and add a remoteConfigUrl |
payload_func | function | Function to apply to the payload. Make sure to return it. |
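The ${sitekey} placeholder in url can be pictured as a plain text substitution. This is an illustration only, not the library's code; `expandEndpointUrl` and the example URL are hypothetical. Note that the URL must be written in ordinary quotes, since inside a JavaScript template literal (backticks) ${sitekey} would be evaluated immediately.

```javascript
// Illustration of the placeholder substitution; not the library's own code.
function expandEndpointUrl(url, sitekey) {
  // Replace every literal "${sitekey}" occurrence with the site key.
  return url.split('${sitekey}').join(sitekey);
}

// Hypothetical custom endpoint URL, written with single quotes on purpose.
var expanded = expandEndpointUrl(
  'https://collector.example.org/collect?s=${sitekey}',
  'zzebu00000000017'
);
```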
Remote Configuration
It is possible to add remote filtering and settings in the options property when setting up the endpoint. Simply add an id and a remoteConfigUrl.
options: {
  id: 'something_unique',
  remoteConfigUrl: 'https://peach-static.ebu.io/zzebu/config_example.json',
  max_cache_hours: 60 * 60,
  heartbeat_frequency_sec: 5,
  lifetime_sec: 12*60*60,
  max_events_stored: 1440,
  max_events_per_request: 12*12,
  flush_interval_sec: 60, // mobile only
  max_batch_size: 120 // mobile only
}
The remote configuration can be stored anywhere, but it should be cached to limit traffic. Peach can provide you with an AWS S3 and CloudFront setup that could be of help. Please follow this format:
{
  "filter": [
    "media_play",
    "media_end",
    "media_pause",
    "page_view"
  ],
  "heartbeat_frequency_sec": 500,
  "lifetime_sec": 500,
  "max_events_stored": 5,
  "max_events_per_request": 500,
  "max_batch_size": 120,
  "flush_interval_sec": 60
}
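Before publishing a remote configuration file, a quick sanity check can catch obvious mistakes. The sketch below uses the keys from the format above; the rules themselves (filter is an array of strings, numeric settings are positive numbers) are assumptions and not a documented schema, and `checkRemoteConfig` is a hypothetical helper.

```javascript
// Hypothetical sanity check; the validation rules are assumptions.
var NUMERIC_KEYS = [
  'heartbeat_frequency_sec', 'lifetime_sec', 'max_events_stored',
  'max_events_per_request', 'max_batch_size', 'flush_interval_sec'
];

// Return a list of human-readable problems; an empty list means no issue found.
function checkRemoteConfig(config) {
  var problems = [];
  if ('filter' in config) {
    var validFilter = Array.isArray(config.filter) &&
      config.filter.every(function (type) { return typeof type === 'string'; });
    if (!validFilter) { problems.push('filter must be an array of strings'); }
  }
  NUMERIC_KEYS.forEach(function (key) {
    if (key in config && !(typeof config[key] === 'number' && config[key] > 0)) {
      problems.push(key + ' must be a positive number');
    }
  });
  return problems;
}
```

You could run it on the parsed file, for example `checkRemoteConfig(JSON.parse(fileContents))`, as part of whatever process uploads the configuration.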
Data Collected
By using peach-collector.js, you automatically get more data about the client. This is a sample of what we receive with the code above:
{
  peach_schema_version: "1.0.3",
  peach_implementation_version: "peach-collector-js",
  site_key: "my_site_key",
  session_start_timestamp: 1582567123913,
  sent_timestamp: 1582569351215,
  collect_timestamp: 1582569351215,
  user_id: "anonymous",
  events: [
    {
      type: "page_view",
      id: "http://localhost:3000/",
      event_timestamp: 1582569351215,
      context: {
        page_uri: "http://localhost:3000/",
        referrer: ""
      }
    }
  ],
  client: {
    id: "2c0e4a10-cf43-396e-f1d5-596dbe99340e",
    app_id: "my_app_id",
    type: "web",
    name: "Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0",
    version: "5.0 (X11)",
    device: {
      type: "Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0",
      vendor: "",
      model: "Linux x86_64",
      language: "en",
      timezone: -1,
      screen_size: "1920pxx365.833px"
    },
    os: {
      name: "Linux x86_64",
      version: "5.0 (X11)"
    }
  }
}
Additional Documentation
Some documents to learn more about Peach Collector:
Final notes
If you followed this tutorial entirely, you should now be able to collect data with peach-collector.js, following our best practices. Have you noticed anything missing, wrong or not working? Do not hesitate to contact us and we will be pleased to answer your request.