Hydrosphere Mist

Hydrosphere Mist is a service for exposing analytical jobs and machine learning models as web services.

Mist provides an API for Scala & Python Apache Spark jobs and for machine learning models trained in Apache Spark.

It implements Spark as a Service and creates a unified API layer for building enterprise solutions and services on top of a big data stack.

Discover more Hydrosphere Mist use cases.

Features

  • Real-time, low-latency model serving/scoring (Mist Local Serving)
  • Spark context orchestration (Cluster of Spark Clusters): manages multiple Spark contexts in separate JVMs or Docker containers
  • Exposing Apache Spark jobs through a REST API
  • Spark 2.1.0 support!
  • HTTP & Messaging (MQTT) API
  • Scala and Python Spark jobs support
  • Support for Spark SQL and Hive
  • High Availability and Fault Tolerance
  • Self Healing after driver program failure
  • Powerful logging
  • Clear end-user API

Getting Started with Mist

Dependencies

  • JDK 8
  • Spark >= 1.5.2 (earlier versions were not tested)
  • MQTT server (optional)

Run Mist

Run Docker:

docker run -p 2003:2003 -v /var/run/docker.sock:/var/run/docker.sock -d hydrosphere/mist:master-2.1.0 mist

More about the Docker image

Run Jar:

sbt -DsparkVersion=${SPARK_VERSION} mistRun

Run example

sbt "project examples" package

curl --header "Content-Type: application/json" -X POST http://localhost:2003/api/simple-context --data '{"numbers": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}'
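
The same request can be issued from Python. The following is a minimal sketch using only the standard library. It assumes Mist is reachable at localhost:2003, as in the Docker command above, and that the simple-context route from the examples project is deployed. The helper names `build_simple_context_request` and `post_numbers` are hypothetical and not part of Mist, and nothing is assumed here about the shape of the response.

```python
import json
import urllib.request

# Endpoint copied from the curl example; adjust host and port for your setup.
MIST_URL = "http://localhost:2003/api/simple-context"


def build_simple_context_request(numbers, url=MIST_URL):
    """Build the same JSON POST request that the curl example sends.

    Hypothetical helper for illustration; not part of the Mist API.
    """
    body = json.dumps({"numbers": numbers}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_numbers(numbers, url=MIST_URL):
    """Send the request and return the raw response text.

    Requires a running Mist instance.
    """
    request = build_simple_context_request(numbers, url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With Mist running, `post_numbers([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])` submits the same payload as the curl command. Building the request separately from sending it makes it possible to inspect the URL, headers, and body without a live server.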

Check out the complete Getting Started Guide.

Building from source

  • Build the project:

git clone https://github.com/hydrospheredata/mist.git
cd mist
sbt -DsparkVersion=2.1.0 assembly

  • Run:

./bin/mist start master

Development mode

# clone mist repo 
git clone https://github.com/Hydrospheredata/mist

# available spark versions: 1.5.2, 1.6.2, 2.0.2, 2.1.0
export SPARK_VERSION=2.1.0
docker create --name mist-${SPARK_VERSION} -v /usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION}
docker run --name mosquitto-${SPARK_VERSION} -d ansi/mosquitto
docker run --name hdfs-${SPARK_VERSION} --volumes-from mist-${SPARK_VERSION} -d hydrosphere/hdfs start

# run tests
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v $PWD:/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} tests
# or run mist
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v $PWD:/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} mist
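
The two `docker run` commands above differ only in their final argument, and every container name and image tag is derived from SPARK_VERSION. The sketch below makes that pattern explicit by assembling the argument list in Python. The helper `mist_dev_command` is hypothetical (not shipped with Mist), and the list of supported versions is copied from the comment above.

```python
import os

# Versions listed in the development-mode comment above.
SUPPORTED_SPARK_VERSIONS = ("1.5.2", "1.6.2", "2.0.2", "2.1.0")


def mist_dev_command(spark_version, action, workdir=None):
    """Return the docker argument list for running tests or Mist.

    Hypothetical helper: it mirrors the shell commands above and
    does not start any container by itself.
    """
    if spark_version not in SUPPORTED_SPARK_VERSIONS:
        raise ValueError("unsupported Spark version: %s" % spark_version)
    if action not in ("tests", "mist"):
        raise ValueError("action must be 'tests' or 'mist'")
    # Equivalent of $PWD in the shell commands.
    workdir = workdir or os.getcwd()
    return [
        "docker", "run",
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        "--link", "mosquitto-%s:mosquitto" % spark_version,
        "--link", "hdfs-%s:hdfs" % spark_version,
        "-v", "%s:/usr/share/mist" % workdir,
        "hydrosphere/mist:tests-%s" % spark_version,
        action,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the mosquitto and hdfs containers from the setup step are running.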

Version Information

Mist Version   Scala Version     Python Version   Spark Version
0.1.4          2.10.6            2.7.6            >=1.5.2
0.2.0          2.10.6            2.7.6            >=1.5.2
0.3.0          2.10.6            2.7.6            >=1.5.2
0.4.0          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.5.0          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.6.5          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.7.0          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.8.0          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.9.1          2.10.6, 2.11.8    2.7.6            >=1.5.2
0.10.0         2.10.6, 2.11.8    2.7.6            >=1.5.2
master         2.10.6, 2.11.8    2.7.6            >=1.5.2

Roadmap


  • Persist job state for self healing
  • Super parallel mode: run Spark contexts in separate JVMs
  • Powerful logging
  • RESTification
  • Support streaming contexts/jobs
  • Reactive API
  • Realtime ML models serving/scoring
  • CLI
  • Web Interface
  • Apache Kafka support
  • Bi-directional streaming API
  • AMQP support

Contact

Please report bugs/problems to: https://github.com/Hydrospheredata/mist/issues.

http://hydrosphere.io/

LinkedIn

Facebook

Twitter
