Five software suggestions for Machine Learning

I talked about why we need machine learning in the previous post, but it sounds a bit too tough to solve in an afternoon with pencil and paper, don’t you think?

In this post I’d like to talk about the software available to help solve machine learning problems. There are solutions that cater to different needs, so I will go through them briefly so you can familiarize yourself with the one you need most as soon as possible!

There are several programs and languages that can handle the different algorithms that we’ll review in the next posts:

– Matlab: This proprietary software is a standard in universities and businesses due to its versatility and power. It is easy to use and learn, and the code in this blog will be written almost entirely for it. There are free alternatives to Matlab, the most compatible and powerful being Octave.

– R: The real open source alternative to Matlab in statistics. It is not compatible with Matlab, but R is widely used in academia, with very successful results. It lacks a good GUI, but it is a masterpiece.

– Python: Powerful, reliable (and free) libraries have been released specifically for scientific computing and machine learning (NumPy and scikit-learn are good examples). Efficient memory usage, competitive results and competitive computation time make Python very appealing for serious work that will later be deployed, even commercially.

– SAS-based: I have to admit that I’m not very familiar with this software family myself. It is mostly used in corporate environments due to its simplicity and visualization capabilities. Most of the visual results nowadays are generated with this software (and some variants).

– Julia: It shouldn’t be here yet, as at this moment it is not as powerful, as well known or even as well suited to machine learning as the others. Nevertheless, it’s a very promising language, and its versatility and growth in the past months lead me to suggest it. A really worthwhile idea.

In short, there is no perfect software for everyone, so you will have to choose. I work with Matlab because of the simplicity of its language, but I am actually considering moving partially to Python to avoid memory usage restrictions. Now it’s your turn to try them and choose one to start programming!
————-

We lose many things simply out of our fear of losing them.

Paulo Coelho, Brida

Why do we need machine learning?

That’s the first question I asked myself when I first met the topic. We humans are already capable of reading, understanding and drawing conclusions from almost everything in the world. And we also hope to give helpful advice if we are expert enough.

Then, why? Because we are just that, humans. Sometimes there is something we haven’t seen before, and there are challenges we are not prepared to succeed at. Because there are a million things that could go wrong. And because, face it, we are expensive enough to justify thinking of something that at least helps us do the same work faster, cheaper and more accurately.

On the other hand, I never said that machines would replace us and win a war against us. We are still as far from that as we are from reaching Pluto and living there the way we live in New York. But they can be of great help today.

There are two main kinds of problems to be solved in machine learning: classification and estimation.

  • Classification is when the system decides which class, out of a set of known classes, a piece of data belongs to.
  • Estimation is when the system predicts the most probable outcome for new, unseen input data.

The problem is predefined in both cases.
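
To make the difference between the two concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: the data is made up, the function names (fit_centroids, classify, fit_line) are hypothetical, and I picked a nearest-centroid rule for classification and a least-squares straight line for estimation simply because they are short. Real problems will usually call for one of the libraries mentioned in the software post.

```python
from statistics import mean


def fit_centroids(samples, labels):
    """Compute the mean point (centroid) of each class in the training data."""
    groups = {}
    for point, label in zip(samples, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in groups.items()
    }


def classify(centroids, point):
    """Classification: return the label whose centroid is closest to the point."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], point))


def fit_line(xs, ys):
    """Estimation: least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Made-up training data: two classes of 2-D points.
samples = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
labels = ["small", "small", "large", "large"]
centroids = fit_centroids(samples, labels)
predicted_class = classify(centroids, (5.5, 5.0))  # a label, not a number

# Made-up training data: a numeric outcome that grows with the input.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(xs, ys)
predicted_value = slope * 5.0 + intercept  # a number, not a label

print(predicted_class, predicted_value)
```

Notice that the classifier answers with one of the labels it was given, while the estimator answers with a number for an input it has never seen.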

Many other problems can be proposed, but in the end, it all comes down to these two old guys. In the following posts, I’ll cover some solutions to classic and novel problems, but feel free to propose your own, and we’ll try to solve them together!

Fernando Rabanal

————-

If all places in the universe are in the Aleph, then all stars, all lamps, all sources of light are in it, too.

Jorge Luis Borges, The Aleph