Orange Blog

By: AJDA, Apr 14, 2016

Univariate GSoC Success

Google Summer of Code application period has come to an end. We’ve received 34 applications, some of which were of truly high quality. Now it’s upon us to select the top performing candidates, but before that we wanted to have an overlook of the candidate pool. We’ve gathered data from our Google Form application and gave it a quick view in Orange. First, we needed to preprocess the data a bit, since it came in a messy form of strings.


By: AJDA, Mar 23, 2016

All I See is Silhouette

Silhouette plot is such a nice method for visually assessing cluster quality and the degree of cluster membership that we simply couldn’t wait to get it into Orange3. And now we did. What this visualization displays is the average distance between instances within the cluster and instances in the nearest cluster. For a given data instance, the silhouette close to 1 indicates that the data instance is close to the center of the cluster.


By: BLAZ, Mar 12, 2016

Overfitting and Regularization

A week ago I used Orange to explain the effects of regularization. This was the second lecture in the Data Mining class, the first one was on linear regression. My introduction to the benefits of regularization used a simple data set with a single input attribute and a continuous class. I drew a data set in Orange, and then used Polynomial Regression widget (from Prototypes add-on) to plot the linear fit.


By: AJDA, Jan 22, 2016

Making Predictions

One of the cool things about being a data scientist is being able to predict. That is, predict before we know the actual outcome. I am not talking about verifying your favorite classification algorithm here, and I am not talking about cross-validation or classification accuracies or AUC or anything like that. I am talking about the good old prediction. This is where our very own Predictions widget comes to help.


By: AJDA, Jan 4, 2016

Orange YouTube Tutorials

It’s been a long time coming, but finally we’ve created out our first set of YouTube tutorials. In a series ‘Getting Started with Orange’ we will walk through our software step-by-step. You will learn how to create a workflow, load your data in different formats, visualize and explore the data. These tutorials are meant for complete beginners in both Orange and data mining and come with some handy tricks that will make using Orange very easy.


By: BLAZ, Dec 19, 2015

Model-Based Feature Scoring

Feature scoring and ranking can help in understanding the data in supervised settings. Orange includes a number of standard feature scoring procedures one can access in the Rank widget. Moreover, a number of modeling techniques, like linear or logistic regression, can rank features explicitly through assignment of weights. Trained models like random forests have their own methods for feature scoring. Models inferred by these modeling techniques depend on their parameters, like type and level of regularization for logistic regression.


By: AJDA, Dec 11, 2015

Report is back! (and better than ever)

I’m sure you’d agree that reporting your findings when analyzing the data is crucial. Say you have a couple of interesting predictions that you’ve tested with several methods many times and you’d like to share that with the world. Here’s how. Save Graph just got company - a Report button! Report works in most widgets, apart from the very obvious ones that simply transmit or display the data (Python Scripting, Edit Domain, Image Viewer, Predictions…).


By: AJDA, Nov 27, 2015

Mining our own data

Recently we’ve made a short survey that was, upon Orange download, asking people how they found out about Orange, what was their data mining level and where do they work. The main purpose of this is to get a better insight into our user base and to figure out what is the profile of people interested in trying Orange. Here we have some preliminary results that we’ve managed to gather in the past three weeks or so.


By: AJDA, Oct 30, 2015

Ghostbusters

Ok, we’ve just recently stumbled across an interesting article on how to deal with non normal (non-Gaussian distributed) data. We have an absolutely paranormal data set of 20 persons with weight, height, paleness, vengefulness, habitation and age attributes (download). Let’s check the distribution in Distributions widget. Our first attribute is “Weight” and we see a little hump on the left. Otherwise the data would be normally distributed. Ok, so perhaps we have a few children in the data set.


By: AJDA, Sep 25, 2015

Save your graphs!

If you are often working with Orange, you probably have noticed a small button at the bottom of most visualization widgets. “Save Graph” now enables you to export graphs, charts, and hierarchical trees to your computer and use them in your reports. Because people need to see it to believe it! “Save Graph” will save visualizations to your computer. Save Graph function is available in Paint Data, Image Viewer, all visualization widgets, and a few others (list is below).