Data Scientist Platform Release Notes
8 March 2018
List HDFS files from the Notebooks
    from pipe_algorithms_lib.hadoop import ls

    # Basic
    ls(path="/")

    # With a defined path
    ls(path="/recsys/chrts/realtime")

    # List with all details
    ls(path="/recsys/chrts/realtime", all=True)
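For readers curious how a helper like this could work, here is a minimal sketch that wraps the standard `hdfs dfs -ls` command. It is not the actual implementation of `pipe_algorithms_lib.hadoop.ls`; the names `parse_ls_line`, `hdfs_ls` and the `details` parameter are hypothetical and used for illustration only. It assumes the `hdfs` command-line client is available on the PATH.

```python
import subprocess


def parse_ls_line(line):
    """Parse one entry line of `hdfs dfs -ls` output into a dict.

    Returns None for lines that are not entries, such as the
    "Found N items" header.
    """
    # Entry fields: permissions, replication, owner, group, size, date, time, path
    parts = line.split(None, 7)
    if len(parts) != 8:
        return None
    perms, replication, owner, group, size, date, time, path = parts
    return {
        "permissions": perms,
        "is_dir": perms.startswith("d"),
        "owner": owner,
        "group": group,
        "size": int(size),
        "modified": date + " " + time,
        "path": path,
    }


def hdfs_ls(path="/", details=False):
    """List an HDFS directory (hypothetical helper, not the platform's ls).

    Returns a list of paths, or a list of dicts when details=True.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    entries = [parse_ls_line(line) for line in result.stdout.splitlines()]
    entries = [entry for entry in entries if entry is not None]
    return entries if details else [entry["path"] for entry in entries]
```

Keeping the parsing separate from the subprocess call means the parser can be checked on sample output without access to a Hadoop cluster.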
Custom Python environment for Notebooks & Tasks
- This allows data scientists to use third-party libraries in their code seamlessly on the platform.
- Custom Python environment for endpoints was already provided some weeks ago.
- Data scientists can also use this to create their own libraries that factor out common code, improving maintenance, reusability and sharing across broadcasters and teams.
- Common libraries can also be versioned in Git (using EBU GitLab or public GitHub repositories).
- How does it work?