Hops Ecosystem

Deep Learning, Streaming, Multi-Tenancy. All in a single secure platform.

Multi-Tenancy with Hopsworks

Hopsworks is a platform with both a UI and a REST API for privacy-by-design data science on Hops Hadoop. Uniquely among Hadoop platforms, even sensitive data can be processed and stored in the data lake.

Batch, Streaming, SQL

Apache Spark support for batch analytics, SparkSQL/Parquet, Spark Streaming, GraphX.

TensorFlow/Keras

Train and deploy your models on clusters of GPUs with TensorFlow/Keras/PyTorch, and debug them with TensorBoard. One-click deployment of models to TensorFlow Serving.

Python-First

The only Hadoop stack with full Conda and Pip support. Hopsworks projects have their own Conda environments in the data lake, so data scientists can choose their own libraries.

Notebooks

Jupyter and Zeppelin Notebooks. Jupyter supports Python, Hive, and Sparkmagic kernels, for TensorFlow/Python/PySpark/Scala/Hive.

Apache Hive LLAP

Petabyte-scale data warehousing with Apache Hive LLAP. Zeppelin interpreter support for interactive analytics and visualizations. UI-driven starting and stopping of LLAP clusters.

Elastic/Logstash/Kibana

The ELK stack is integrated with Spark/TensorFlow applications for real-time logging, visualizations, and search.

InfluxDB/Telegraf/Grafana

Spark applications and Hops services are monitored, and the monitoring data is stored in the time-series database InfluxDB. Time-series data is graphed with Grafana.
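
To give a feel for what a stored metric looks like, here is a minimal sketch that renders one data point in the InfluxDB line protocol, the text format that Telegraf and InfluxDB exchange. The measurement, tag, and field names (`cpu`, `host`, `usage`, `cores`) are made up for illustration, and the helper `to_line_protocol` is hypothetical; how Hops services actually emit their metrics is not described here.

```python
from typing import Dict, Optional, Union

FieldValue = Union[bool, int, float, str]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, FieldValue],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Render one point as: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("at least one field is required")

    # Measurement names escape commas and spaces; tags are sorted by key.
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")

    rendered_fields = []
    for key, value in fields.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = f"{value}i"  # integers carry an "i" suffix
        elif isinstance(value, float):
            rendered = repr(value)
        else:
            # String values are double-quoted, with backslash and quote escaped.
            rendered = '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
        rendered_fields.append(_escape(key, ",= ") + "=" + rendered)

    line = head + " " + ",".join(rendered_fields)
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"  # nanoseconds since the Unix epoch
    return line


# Hypothetical metric for one worker node.
point = to_line_protocol(
    "cpu", {"host": "worker-1"}, {"usage": 0.42, "cores": 8}, 1500000000000000000
)
print(point)
```

Tags are indexed and suit low-cardinality labels such as a host name, while fields hold the measured values that a Grafana panel would typically plot.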

TLS Security

Hops is the only Hadoop distribution with a TLS certificate-based security model. Certificate management is more scalable than Kerberos' KDC; certificates also make it easier to integrate external systems and devices, and they enable the multi-tenancy features in Hopsworks.
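
As a rough illustration of certificate-based authentication in general, the sketch below builds a client-side TLS context with Python's standard `ssl` module that verifies the server and, if a certificate is supplied, presents it so the server can authenticate the caller. The function name `make_client_context` and its file-path parameters are assumptions made for this example; this is not the Hops client API, and how Hops issues and distributes certificates is not covered here.

```python
import ssl
from typing import Optional


def make_client_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS client context that can also present a client certificate."""
    # Verify the server against the given CA bundle (or the system defaults).
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse legacy protocol versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file is not None:
        # Present this certificate so the server can authenticate the client.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Without arguments the context only verifies the server; pass hypothetical
# paths such as cert_file="client.pem", key_file="client.key" for mutual TLS.
context = make_client_context()
```

With a client certificate loaded, the same context can be passed to `ssl.SSLContext.wrap_socket` or to an HTTP client that accepts an `SSLContext`, and the server decides who the caller is from the certificate rather than from a Kerberos ticket.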

About Hops

Hops is the result of research at the Distributed Systems Group – jointly run by KTH – Royal Institute of Technology and SICS Swedish ICT and managed as the EIT Digital Innovation Activity HopsWorks. The research has been financed by the EU Framework 7 programme (BiobankCloud – 317871), SSF (End-to-End Clouds), EU H2020 (Aegis – 732189), SeRC, EIT Digital Innovation Activity (Hopsworks), and ICT TNG.

Hops is jointly developed by KTH Stockholm, RISE SICS AB, and Logical Clocks AB.