Data Scientist Platform Release Notes
10 August 2018
- Realtime user play history
- Main use case: filter out recently watched content from recommendations
- The last 24 hours of media consumption per user, including anonymous users, are stored in Redis in real time
- Currently enabled only for broadcasters running PEACH in production
- The history can be retrieved in endpoints as follows:
    from pipe_algorithms_lib.history_utils import realtime_history

    # list of media ids ordered by time from old to new ones using pipe_c cookie
    history = realtime_history(codops, cookie_id)
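As a minimal sketch of the main use case, the retrieved history can be used to drop recently watched items from a list of candidate recommendations. The helper `filter_recently_watched` and the sample media ids are hypothetical and not part of `pipe_algorithms_lib`; the sketch only assumes that the history is a list of media ids, as the comment in the snippet states.

```python
def filter_recently_watched(recommendations, history):
    """Return the recommendations without recently watched media ids.

    The original ranking order of the recommendations is preserved.
    """
    # A set gives constant-time membership checks.
    watched = set(history)
    return [media_id for media_id in recommendations if media_id not in watched]


# Hypothetical sample data; in an endpoint the history would come from
# realtime_history(codops, cookie_id).
sample_history = ["media_3", "media_1"]
sample_recommendations = ["media_1", "media_2", "media_3", "media_4"]
filtered = filter_recently_watched(sample_recommendations, sample_history)
```

An empty history leaves the recommendations unchanged, so the filter is also safe for users with no recent activity.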
8 March 2018
List HDFS files from the Notebooks
    from pipe_algorithms_lib.hadoop import ls

    # Basic
    ls(path="/")

    # With a defined path
    ls(path="/recsys/chrts/realtime")

    # List with all details
    ls(path="/recsys/chrts/realtime", all=True)
Custom Python environment for Notebooks & Tasks
- This allows data scientists to use third-party libraries in their code seamlessly on the platform.
- A custom Python environment for endpoints was already provided some weeks ago.
- Data scientists can also use this to create their own libraries that factor out common code, improving maintainability, reusability, and sharing across broadcasters and teams.
- Common libraries can also be versioned in Git (using EBU GitLab or public GitHub repositories).
- How does it work?