Orange Blog
By: AJDA, Oct 17, 2016
10 Tips and Tricks for Using Orange
TIP #1: Follow tutorials and example workflows to get started. It’s difficult to start using new software. Where does one start, especially as a total novice in data mining? For this exact reason we’ve prepared our Getting Started With Orange YouTube tutorials for complete beginners. Example workflows, on the other hand, can be accessed via Help - Examples. TIP #2: Make use of Orange documentation. You can access it in three ways:
By: BLAZ, Oct 2, 2016
Intro to Data Mining for Life Scientists
RNA Club Munich organized the Molecular Life of Stem Cells Conference in Ljubljana this past Thursday, Friday and Saturday. They asked us to organize a four-hour workshop on data mining. And there we were: four of us, Ajda, Anze, Marko and myself (Blaz), ran a workshop for 25 students with molecular biology and biochemistry backgrounds. We covered some basic data visualization, modeling (classification) and model scoring, hierarchical clustering and data projection, and finished with a touch of deep learning by diving into image analysis with deep learning-based embedding.
By: AJDA, Sep 23, 2016
Text Mining: version 0.2.0
Orange3-Text has just been polished, updated and enhanced! Our GSoC student Alexey has helped us greatly to achieve another milestone in Orange development and release the latest 0.2.0 version of our text mining add-on. The new release, which is already available on PyPI, includes Wikipedia and SimHash widgets and an overhaul of Bag of Words, Topic Modeling and Corpus Viewer. The Wikipedia widget retrieves sources from the Wikipedia API and can handle multiple queries.
By: BLAZ, Sep 15, 2016
Data Mining Course in Houston #2
This was already the second installment of the Introduction to Data Mining course at Baylor College of Medicine in Houston, Texas. Just like last year, the course was packed. About 50 graduate students, post-docs and a few faculty attended, making the course one of the largest elective PhD courses of the more than one hundred offered at this prestigious medical school. The course was designed for students with little or no experience in data science.
By: PRIMOZGODEC, Aug 25, 2016
Visualizing Gradient Descent
This is a guest blog from the Google Summer of Code project. Gradient Descent was implemented as a part of my Google Summer of Code project and it is available in the Orange3-Educational add-on. It simulates gradient descent for either logistic or linear regression, depending on the type of the input data. Gradient descent is an iterative approach that optimizes model parameters to minimize the cost function. In machine learning, the cost function corresponds to the prediction error when the model is used on the training data set.
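To make the idea concrete, here is a minimal sketch of batch gradient descent for linear regression with a single feature. It is not the widget's code: the function name `gradient_descent`, the learning rate, the step count and the toy data are all assumptions chosen for illustration, and the cost function is assumed to be the mean squared error on the training data.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ theta0 + theta1 * x by batch gradient descent (illustrative sketch)."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors of the current model on the training data.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost J = (1 / 2n) * sum(error ** 2).
        grad0 = sum(errors) / n
        grad1 = sum(e * x for e, x in zip(errors, xs)) / n
        # Step against the gradient to reduce the cost.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

With a learning rate that is too large the updates would overshoot and diverge, which is the kind of behaviour a step-by-step simulation helps to expose.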
By: SALVACARRION, Aug 19, 2016
Making recommendations
This is a guest blog from the Google Summer of Code project. Recommender systems are everywhere: we can find them on YouTube, Amazon, Netflix, iTunes, … This is because they are a crucial component of competitive retail services. How can I know what you may like if I have almost no information about you? The answer: by taking collaborative filtering (CF) approaches. Basically, this means combining all the little knowledge we have about users and/or items in order to build a grid of knowledge with which we make recommendations.
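As a rough illustration of collaborative filtering, the sketch below predicts a missing rating from the ratings of similar users. This is a simple neighbourhood-based variant chosen for brevity, not necessarily the method used in the project; the users, items, ratings and function names are all hypothetical.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data; "ana" has not rated item "D".
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "ben": {"A": 4, "B": 5, "C": 2, "D": 5},
    "cara": {"A": 1, "B": 2, "C": 5, "D": 1},
}


def cosine_similarity(u, v):
    """Cosine similarity of two users over the items both have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        similarity = cosine_similarity(ratings[user], their_ratings)
        numerator += similarity * their_ratings[item]
        denominator += similarity
    return numerator / denominator if denominator else None


prediction = predict("ana", "D")
print(prediction)
```

Because ana's ratings resemble ben's far more than cara's, the prediction for item D lands closer to ben's rating than to cara's.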
By: PRIMOZGODEC, Aug 16, 2016
Visualization of Classification Probabilities
This is a guest blog from the Google Summer of Code project. The Polynomial Classification widget was implemented as a part of my Google Summer of Code project along with other widgets in the educational add-on (see my previous blog). It visualizes probabilities for two-class classification (target vs. rest) using a color gradient and contour lines, and it can do so for any Orange learner. Here is an example workflow. The data comes from the File widget.
By: PRIMOZGODEC, Aug 12, 2016
Interactive k-Means
This is a guest blog from the Google Summer of Code project. As a part of my Google Summer of Code project I started developing educational widgets and assembling them in an Educational Add-On for Orange. Educational widgets can be used by students to understand how some key data mining algorithms work and by teachers to demonstrate the working of these algorithms. Here I describe an educational widget for interactive k-means clustering, an algorithm that splits the data into clusters by finding cluster centroids such that the distance between data points and their corresponding centroid is minimized.
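The two alternating steps of k-means can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the widget's implementation; the function name `kmeans`, the fixed iteration count, the toy points and the initial centroids are assumptions.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid update steps on 2D points (sketch)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its points;
        # a centroid with no points stays where it is.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Two hypothetical, well-separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```

The result depends on the initial centroid positions, which is why placing and moving centroids by hand makes a useful teaching exercise.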
By: MATEVZKREN, Aug 5, 2016
Rule Induction (Part I - Scripting)
This is a guest blog from the Google Summer of Code project. We’ve all heard the saying, “Rules are meant to be broken.” Regardless of how you might feel about the idea, one thing is certain: rules must first be learnt. My 2016 Google Summer of Code project revolves around doing just that. I am developing classification rule induction techniques for Orange, and here I describe the code that is currently available in the pull request and will become part of the official distribution in an upcoming release 3.
By: AJDA, Jul 29, 2016
Pythagorean Trees and Forests
Classification Trees are great, but how about when they overgrow even your 27’’ screen? Can we make the tree fit snugly onto the screen and still tell the whole story? Well, yes we can. The Pythagorean Tree widget will show you the same information as Classification Tree, but way more concisely. Pythagorean Trees represent nodes with squares whose size is proportional to the number of covered training instances. Once the data is split into two subsets, the corresponding new squares form a right triangle on top of the parent square.
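The right triangle follows from the Pythagorean theorem if "size" is read as the area of a square, which is an assumption made here. The two child areas then add up to the parent area, so the child sides act as the legs and the parent side as the hypotenuse. A small sketch with a hypothetical helper `child_sides` and made-up instance counts:

```python
from math import sqrt


def child_sides(parent_side, n_left, n_right):
    """Side lengths of the two child squares, assuming area is proportional
    to the number of training instances each node covers."""
    n = n_left + n_right
    left = parent_side * sqrt(n_left / n)
    right = parent_side * sqrt(n_right / n)
    return left, right


# A parent node covering 100 instances, split into subsets of 30 and 70.
parent_side = 1.0
left, right = child_sides(parent_side, 30, 70)

# The legs and the hypotenuse satisfy a^2 + b^2 = c^2.
print(left ** 2 + right ** 2, parent_side ** 2)
```

A very uneven split produces one large and one tiny square, so the balance of each split can be read directly from the picture.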