Orange Data Mining

Orange Blog

By: AJDA, Jan 4, 2016

Orange YouTube Tutorials

It’s been a long time coming, but we’ve finally created our first set of YouTube tutorials. In the series ‘Getting Started with Orange’ we will walk through our software step by step. You will learn how to create a workflow, load your data in different formats, and visualize and explore the data. These tutorials are meant for complete beginners in both Orange and data mining and come with some handy tricks that will make using Orange very easy.

By: AJDA, Dec 28, 2015

Color it!

The holiday season is upon us and even the Orange team is in a festive mood. This is why we made a Color widget! This fascinating artsy widget will allow you to play with your data set in a new and exciting way. No more dull visualizations and default color schemes! Set your own colors the way YOU want them! Care for some magical cyan-to-magenta? Or do you prefer a more festive red-to-green?

Categories: orange3 plot visualization widget

By: BLAZ, Dec 19, 2015

Model-Based Feature Scoring

Feature scoring and ranking can help in understanding the data in supervised settings. Orange includes a number of standard feature scoring procedures one can access in the Rank widget. Moreover, a number of modeling techniques, like linear or logistic regression, can rank features explicitly through assignment of weights. Trained models like random forests have their own methods for feature scoring. Models inferred by these modeling techniques depend on their parameters, like type and level of regularization for logistic regression.

Categories: analysis classification features regression scoring
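
The idea of ranking features by model weights, described in the excerpt above, can be sketched in plain Python. This is a minimal illustration and not how Orange's Rank widget is implemented: the functions `train_logistic` and `rank_features`, the toy data and the parameter values are all assumptions made for the example. It also assumes the features are on comparable scales, since weights are otherwise not directly comparable.

```python
import math

def train_logistic(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Fit L2-regularized logistic regression with batch gradient descent.

    Hypothetical helper for illustration; returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            err = p - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 penalty shrinks the weights; the bias is left unpenalized.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rank_features(names, weights):
    """Order features by the absolute value of their model weight."""
    return sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)

# Toy data: the first feature separates the classes, the second is mostly noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.7], [0.9, 0.2], [1.0, 0.6]]
y = [0, 0, 0, 1, 1, 1]
names = ["informative", "noise"]

weights, bias = train_logistic(X, y)
ranking = rank_features(names, weights)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

Re-running `train_logistic` with a larger `l2` value shrinks the weights, which shows why the resulting scores depend on the type and level of regularization.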

By: AJDA, Dec 11, 2015

Report is back! (and better than ever)

I’m sure you’d agree that reporting your findings when analyzing the data is crucial. Say you have a couple of interesting predictions that you’ve tested with several methods many times and you’d like to share that with the world. Here’s how. Save Graph just got company - a Report button! Report works in most widgets, apart from the very obvious ones that simply transmit or display the data (Python Scripting, Edit Domain, Image Viewer, Predictions…).

Categories: analysis data orange3 report

By: LAN, Dec 4, 2015

2UDA

In one of the previous blog posts we mentioned that installing the optional dependency psycopg2 allows Orange to connect to PostgreSQL databases and work directly on the data stored there. It is also possible to transfer a whole table to the client machine, keep it in the local memory, and continue working with it as with any other Orange data set loaded from a file. But the true power of this feature lies in the ability of Orange to leave the bulk of the data on the server, delegate some of the computations to the database, and transfer only the needed results.

Categories: sql
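
The difference between transferring a whole table and delegating computation to the database, as described in the excerpt above, can be sketched with Python's built-in `sqlite3` module. This is only a stand-in: the post is about PostgreSQL accessed through psycopg2, and the table name, columns and values below are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a remote PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (species TEXT, petal_length REAL)")
rows = [("setosa", 1.4), ("setosa", 1.3), ("virginica", 6.0), ("virginica", 5.0)]
conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# Option 1: transfer the whole table to the client and work on it locally.
all_rows = conn.execute(
    "SELECT species, petal_length FROM measurements"
).fetchall()

# Option 2: let the database aggregate and transfer only the summary.
summary = conn.execute(
    "SELECT species, COUNT(*), AVG(petal_length) "
    "FROM measurements GROUP BY species ORDER BY species"
).fetchall()

conn.close()

print(len(all_rows), "rows transferred in full")
print(len(summary), "summary rows transferred")
```

With a large table, the second query moves one row per group across the connection instead of every stored row.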

By: AJDA, Dec 2, 2015

Hierarchical Clustering: A Simple Explanation

One of the key techniques of exploratory data mining is clustering – separating instances into distinct groups based on some measure of similarity. We can estimate the similarity between two data instances through Euclidean (Pythagorean), Manhattan (sum of absolute differences between coordinates) and Mahalanobis distance (distance scaled by the variances and covariances of the attributes), or, say, through Pearson correlation or Spearman correlation. Our main goal when clustering data is to get groups of data instances where:

Categories: clustering education plot
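
Some of the measures named in the clustering excerpt above can be sketched in plain Python. The function names are hypothetical, and turning Pearson correlation into a distance as `1 - r` is one common convention assumed here, not necessarily the one Orange uses.

```python
import math

def euclidean(a, b):
    """Straight-line (Pythagorean) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences between coordinates."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson_distance(a, b):
    """Distance derived from Pearson correlation r, assumed here as 1 - r.

    Both inputs must be non-constant, otherwise r is undefined.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)

origin = [0.0, 0.0]
point = [3.0, 4.0]
print(euclidean(origin, point))   # length of the diagonal
print(manhattan(origin, point))   # length of the path along the axes

# Two profiles that rise and fall together are close under Pearson distance,
# even though their values differ in magnitude.
profile_a = [1.0, 2.0, 3.0]
profile_b = [2.0, 4.0, 6.0]
print(pearson_distance(profile_a, profile_b))
```

Euclidean and Manhattan distance compare values directly, while a correlation-based distance compares the shape of two profiles regardless of their scale.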

By: AJDA, Nov 27, 2015

Mining our own data

We recently ran a short survey that, upon Orange download, asked people how they found out about Orange, what their data mining level was and where they work. The main purpose was to get better insight into our user base and to figure out the profile of people interested in trying Orange. Here we have some preliminary results that we’ve managed to gather in the past three weeks or so.

Categories: analysis data distribution orange3 visualization

By: AJDA, Oct 30, 2015

Ghostbusters

Ok, we’ve just recently stumbled across an interesting article on how to deal with non-normal (non-Gaussian distributed) data. We have an absolutely paranormal data set of 20 persons with weight, height, paleness, vengefulness, habitation and age attributes (download). Let’s check the distribution in the Distributions widget. Our first attribute is “Weight” and we see a little hump on the left. Otherwise the data would be normally distributed. Ok, so perhaps we have a few children in the data set.

Categories: analysis data distribution orange3

By: AJDA, Oct 19, 2015

SQL for Orange

We bet you’ve always wanted to use your SQL data in Orange, but you might not be quite sure how to do it. Don’t worry, we’re coming to the rescue. The key is installing the ‘psycopg2’ library in Python.

WINDOWS

Go to this website and download the psycopg2 package. Once your .whl file has downloaded, open a command prompt in the file’s directory, enter “pip install [file name]” and run it.

Categories: data orange3 sql

By: AJDA, Oct 16, 2015

Learners in Python

We’ve already written about classifying instances in Python. However, it’s always nice to have a comprehensive list of classifiers and a step-by-step procedure at hand.

TRAINING THE CLASSIFIER

We start by simply importing the Orange module into Python and loading our data set.

>>> import Orange
>>> data = Orange.data.Table("titanic")

We are using the ’titanic.tab’ data. You can load any data set you want, but it does have to have a categorical class variable (for numeric targets use regression).

Categories: classification examples orange3 python scripting
