Introduction to the platform
Welcome to the Butterfly AI documentation, a platform to easily build predictive AI models from your CSV data and achieve outstanding prediction accuracy.
Why Butterfly AI?
With Butterfly AI, going from labelled data to accurate predictions is easier than ever:
- Create datasets from CSV files
- Get models trained in minutes
- Obtain high-accuracy predictions using these models on unseen data
Who is Butterfly AI for?
Butterfly AI may be suitable for:
- In general, organisations looking to gain insights from their existing data and get started on obtaining meaningful predictions relevant to their business
- Data Science teams, to benchmark or complement their existing model development with a powerful platform to add to their existing toolchain
- Researchers and professionals looking to obtain reliable predictions on datasets they’re collecting or building for a specific field (Healthcare, Fraud, …)
To get started, request early access via the Request access form.
How does it work?
At a glance, this is how the Butterfly AI platform works:
- Users prepare a tabular, labelled data CSV file with a set of features and an outcome (guide)
- This input CSV is uploaded to the platform and a Dataset is created
- This Dataset is used to Train a Model
  - The training process consists of a set of proprietary algorithms that compete with each other to achieve a given target performance. See the glossary for the metrics used during training. A Model is created from the output of the winning algorithm
  - 90% of the labelled data is used for training; the remaining 10% is stripped of its labels and used for testing as part of the overall training process
  - Training runs fast (minutes or a few hours) in most cases when the data is well prepared. Butterfly AI uses novel training algorithms that converge to the target performance, or stall/stop, much faster than other known classification methods
  - If training stalls, times out or fails for any reason, the model won’t be created
  - A model won’t be created if the newly achieved performance is not greater than the existing performance for the given dataset
- Once an initial model (the champion model) has been created, predictions (inference) can be run by uploading an unseen data CSV. The trained model is used to accurately predict the unseen data and produce a downloadable prediction result CSV
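As a rough illustration of the 90/10 split and the champion rule described above, the following Python sketch splits a small labelled CSV and compares a candidate score against the current champion. It is not Butterfly AI's implementation: the training algorithms are proprietary, and the function names, the seeded random shuffle, the `outcome` column name and the sample data are all assumptions made for this example.

```python
import csv
import io
import random


def split_labelled_rows(rows, outcome_column, test_fraction=0.1, seed=0):
    """Split labelled rows into a training part and a held-out test part.

    Returns (train_rows, test_rows_without_labels, test_labels).
    The shuffle and the seed are assumptions of this sketch.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Keep the true labels aside, then strip them from the test rows.
    test_labels = [row[outcome_column] for row in test]
    test_unlabelled = [
        {key: value for key, value in row.items() if key != outcome_column}
        for row in test
    ]
    return train, test_unlabelled, test_labels


def is_new_champion(candidate_score, champion_score):
    """A candidate replaces the champion only if it scores strictly higher."""
    return champion_score is None or candidate_score > champion_score


# Hypothetical labelled data: two features and one outcome column.
LABELLED_CSV = """age,income,outcome
25,30000,0
32,45000,0
41,52000,1
29,38000,0
55,91000,1
37,60000,1
23,21000,0
48,75000,1
60,83000,1
34,40000,0
"""

rows = list(csv.DictReader(io.StringIO(LABELLED_CSV)))
train_rows, test_rows, test_labels = split_labelled_rows(rows, "outcome")

print(len(train_rows), "training rows,", len(test_rows), "test rows")
print("first model is kept:", is_new_champion(0.90, None))
print("equal score is kept:", is_new_champion(0.90, 0.90))
```

Under these assumptions, a model that only matches the existing champion's score is discarded, which mirrors the "not greater than existing" rule in the list above.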
Use Butterfly AI
- Log into the platform and follow the Quickstart to test your access and start using the dashboard
- You can also start directly with your own data to get predictions:
  - Format labelled and unseen CSV data for training and prediction following the Input CSV Creation guide
  - Create datasets, train models and get initial predictions following the From datasets to predictions guide
  - Gradually improve the performance of your models following the Hyperparameter tuning guide
- Eventually, integrate Butterfly AI into your existing workflows or create brand new apps using it as a base prediction engine
  - Use the REST API, API Recipes and full API reference docs
  - After that, start using our initial version of the Python SDK