Getting Started with Apache Superset, an Enterprise-Ready Business Intelligence Platform

Apache Superset is a modern, enterprise-ready business intelligence web application that makes it easy to visualize large datasets and build complex dashboards.

At Reflective Data, we use Apache Superset to monitor all data going through our platform with minimum latency. It lets us easily combine data from different databases, and every analyst can build their own dashboards.

In this article, we give a quick overview of what Superset is, what it is good for and how to install it.

Key Features

  • A rich set of data visualizations
  • An easy-to-use interface for exploring and visualizing data
  • Create and share dashboards
  • Enterprise-ready authentication with integration with major authentication providers (database, OpenID, LDAP, OAuth & REMOTE_USER through Flask AppBuilder)
  • An extensible, high-granularity security/permission model allowing intricate rules on who can access individual features and the dataset
  • A simple semantic layer, allowing users to control how data sources are displayed in the UI by defining which fields should show up in which drop-down and which aggregation and function metrics are made available to the user
  • Integration with most SQL-speaking RDBMS through SQLAlchemy
  • Deep integration with Druid.io

(Source: https://superset.incubator.apache.org)

Screenshots

Rich Visualizations in Apache Superset

Flexible Data Exploration in Apache Superset

Sample Dashboard in Apache Superset

(Source: https://superset.incubator.apache.org)

Installing Apache Superset

This tutorial describes the simplest and fastest way to get Apache Superset up and running in a development environment. The steps below use Ubuntu 16.04. For other platforms, custom integrations and production installations, please refer to the official documentation.

Step 1 – Install Dependencies

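Apache Superset has some OS-level dependencies that have to be installed through the system package manager before Superset itself; the official documentation lists the exact packages for each platform.

On Ubuntu 16.04 LTS, python3.5 is installed alongside python2.7 by default. If that is your setup, the official documentation gives one additional install command to run as well.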

Step 2 – Python’s setuptools and pip

Get the latest versions of the pip and setuptools libraries, for example with pip install --upgrade setuptools pip.

Step 3 – Install and initialize Apache Superset

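Installing Superset takes a few simple steps: install the package with pip, create an admin account, initialize the database, load the sample data and start the development web server.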

If everything went well, you should be able to go to http://localhost:8088 in your browser and log in using the credentials you entered while creating the admin account. Some sample data will be waiting for you, which you can use to play around with different visualizations and dashboards.
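If the page does not load, it can help to check from a script whether anything is answering on that address at all. The following is a minimal sketch using only the Python standard library; the function name superset_is_up is made up for this example, and the only thing taken from the article is the default address http://localhost:8088.

```python
import urllib.request


def superset_is_up(url="http://localhost:8088", timeout=3.0):
    """Return True if an HTTP response with a 2xx or 3xx status comes back from url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts and HTTP error statuses
        # (urllib.error.URLError and HTTPError both derive from OSError).
        return False


if __name__ == "__main__":
    print("Superset reachable:", superset_is_up())
```

A result of False only tells you that nothing responded successfully at that address; it does not say why, so the terminal where the web server was started is still the first place to look for errors.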

Step 4 – Connect your database

While playing around with sample data is fun, connecting your own data source gives Apache Superset a whole new meaning.

As Apache Superset doesn’t ship with database connectors, you will need to install one first. Which one depends on the type of database you are going to connect to. For MySQL, you would run pip install mysqlclient.

Superset uses SQLAlchemy to connect to databases.

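After logging in to Apache Superset, click on “Sources” and choose “Databases”. There you can add a new connection. All you have to provide is a name and an SQLAlchemy URI. The URI generally follows the pattern dialect+driver://username:password@host:port/database, so a MySQL connection would look something like mysql://username:password@localhost:3306/database_name.

After clicking on “Test connection”, you should see the list of tables in your database.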
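An SQLAlchemy URI packs the username, password, host and database name into a single string, so special characters in the credentials (an "@" or "/" in a password, for instance) have to be percent-encoded or the URI will be parsed incorrectly. Below is a minimal sketch of building such a URI with the Python standard library. The helper build_sqlalchemy_uri and all the credentials are hypothetical and only there for illustration.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_uri(dialect, username, password, host, port, database):
    """Assemble a URI of the form dialect://username:password@host:port/database."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # are not mistaken for URI delimiters.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"{dialect}://{user}:{secret}@{host}:{port}/{database}"


# Hypothetical credentials, for illustration only.
uri = build_sqlalchemy_uri("mysql", "analyst", "p@ss/word", "localhost", 3306, "sales")
print(uri)  # mysql://analyst:p%40ss%2Fword@localhost:3306/sales
```

The resulting string is what you would paste into the SQLAlchemy URI field when adding the connection.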

Step 5 – Creating your first report

After successfully connecting your first data source, navigate to “Tables”. There you can register new tables in Superset, based on the tables that exist in your database. After adding a table, click on its name and a data explorer showing the data from that table will open.

Try playing with different metrics, dimensions, time-frames and visualizations. I bet you will be surprised by how easy yet flexible the tool is.

After creating a visualization you like, you can simply add it to one of the existing dashboards, or create a new dashboard for it.

Conclusion

Apache Superset is a powerful business intelligence tool that has flexible data visualization options and is ready for enterprise usage.

As you saw in this article, getting it up and running in your development environment is fast and easy. If you have any interest in business intelligence and data visualization I strongly recommend giving Superset a try.

In case you have already used Apache Superset or if you have any questions, feel free to share in the comments below.
