Wednesday, August 17, 2016

Deep Learning for Everyone

Summary: The most important developments in Deep Learning and AI in the last year may not be technical at all, but rather a major change in business model. In the space of about six months, all the major players have made their Deep Learning IP open source, hoping to gain on the competition through the power of a broader developer base and wide adoption.
To say that the last year has been big for Deep Learning is an understatement. There have been some spectacular technical innovations, like Microsoft winning the ImageNet competition with a neural net of 152 layers (where 6 or 7 layers is more the norm). But the big action, especially in the last six months, has been in the business model for Deep Learning.

Sunday, March 20, 2016

Getting started with Spark in python

Hadoop is the standard tool for distributed computing across really large data sets and is the reason why you see "Big Data" on advertisements as you walk through the airport. It has become an operating system for Big Data, providing a rich ecosystem of tools and techniques that allow you to use a large cluster of relatively cheap commodity hardware to do computing at supercomputer scale. Two ideas from Google in 2003 and 2004 made Hadoop possible: a framework for distributed storage (The Google File System), which is implemented as HDFS in Hadoop, and a framework for distributed computing (MapReduce).
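The MapReduce idea mentioned above can be sketched in plain Python, with no Hadoop cluster required: a map phase emits key/value pairs, a shuffle groups them by key, and a reduce phase combines each group. This is an illustrative sketch of the programming model (here, a word count), not Hadoop's actual API; the function names are made up for the example.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine each key's values into a single result (here, a sum).
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data", "big cluster big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts == {"big": 3, "data": 2, "cluster": 1}
```

In real Hadoop, the map and reduce functions run on different machines and the shuffle moves data over the network, but the logical structure is exactly this pipeline.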

Tuesday, March 8, 2016

Apache Spark: Cluster manager and jobs


Cluster manager and jobs vs Databricks

This video gives a short overview of how Spark runs on clusters, making it easier to understand the components involved, and of how to manage, schedule, and scale Spark nodes compared with Databricks.

Monday, March 7, 2016

Create a Scala project on Spark using Scala IDE

Scala is one of the most exciting languages for programming Big Data. It is a multi-paradigm language that fully supports functional, object-oriented, imperative, and concurrent programming. It also has a strong type system, which makes the code a convenient form of self-documentation.
Apache Spark is written in Scala, and any library that purports to work on distributed runtimes should at the very least be able to interface with Spark.

Sunday, March 6, 2016

Scala setup on Windows - Spark Scala

Scala is the primary language of Apache Spark, so learning Scala is a good way to get started with Spark.
The Scala language can be installed on any UNIX-like or Windows system. To install Scala on Windows for Spark, follow the steps below.


Before installing Scala on your computer, you must install Java. 

Step 1: set up Java on your Windows machine.