Releasing gdelt-spark v2.0

Version 2.0 updates: A couple of months ago I released the very first version of Gdelt Spark, my pet project to integrate Spark with the GDELT universe (see the previous blog post, releasing-gdelt-spark-v1-0). Today, I am proud to release v2.0, which allows Spark developers and scientists to download GDELT text content as well as article …

Connect Tableau Desktop to SparkSQL

Last (but not least) post of 2014, and a new hacking challenge. Based on the work I've done on SQLDeveloper (https://hadoopi.wordpress.com/2014/10/25/use-spark-sql-on-sql-developer/), I was wondering how to connect Tableau Desktop to my SparkSQL cluster. Install Tableau Desktop: I'm quite new to Tableau, but it's worth a try. However, spending $999 for a challenge isn't worth it, …

Processing GDELT data using Hadoop InputFormat and SparkSQL

GDELT: A quick overview of the GDELT public data set: "GDELT Project monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages and identifies the people, locations, organisations, counts, themes, sources, and events driving our global society every second of every day, creating a free open platform …