Data Scientist Platform Release Notes

8 March 2018

  • List HDFS files from the Notebooks

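    from pipe_algorithms_lib.hadoop import ls
    # Basic listing (assumes `path` is optional and falls back to a default location)
    ls()
    # With a defined path
    ls(path="/recsys/chrts/realtime")
    # List with all details
    ls(path="/recsys/chrts/realtime", all=True)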
  • Custom Python environment for Notebooks & Tasks

    • This allows data scientists to use third-party libraries in their code seamlessly on the platform.
    • Custom Python environment for endpoints was already provided some weeks ago.
    • Data scientists can also use this to create their own libraries that factor out common code, which improves maintenance, reusability and sharing across broadcasters and teams.
    • Common libraries can also be versioned in Git (using EBU GitLab or public GitHub repositories).
    • How does it work?