We use Databricks to run our Spark jobs, and we had been struggling to get the Spark Cassandra connector working outside the local development environment. The connector kept trying to connect to 127.0.0.1 even though we were passing the new host information into the getOrCreate(..) call. After working with Ganesh at Databricks support, we figured it out. In Databricks, a call to getOrCreate() from a fat jar doesn't create a new SparkContext object; it returns the one that already exists, so the configuration passed in is ignored. To update the Cassandra host information for the connector, set it after the call to getOrCreate() instead. Add the configuration directly to the context and you'll be good to go!
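Here is a minimal sketch of that fix, assuming Spark 2.x with a SparkSession and the connector's DataFrame source. The host address, keyspace, and table names are hypothetical placeholders, and setting the value through spark.conf is one way to "add the configuration to the context", not necessarily the exact call we ended up with.

```scala
import org.apache.spark.sql.SparkSession

object CassandraHostExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical Cassandra contact point; replace with your own host.
    val cassandraHost = "10.0.0.12"

    // On Databricks this hands back the session that already exists, so
    // connector settings passed through the builder may never be applied.
    val spark = SparkSession.builder().getOrCreate()

    // Set the connector host on the live session, after getOrCreate().
    spark.conf.set("spark.cassandra.connection.host", cassandraHost)

    // Read a table through the connector's DataFrame source.
    // "my_keyspace" and "my_table" are placeholder names.
    val df = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
      .load()

    df.show(5)
  }
}
```

If you work with a SparkContext directly, keep in mind that sc.getConf returns a copy of the configuration, so setting a value on that copy does not change the running context.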
It's been almost two years since I posted anything here. In late 2015, things got really busy for me at Recruitics and, to be totally candid, I wasn't sure what this blog was for.

What happened since late 2015?

My department moved offices. Twice. We weathered a round of layoffs and rebuilt the teams that were affected. And we've made a ton of progress improving our applications and data processing. I joined Twitter. I became the solo developer of a new integrations platform that now handles over 30% of our data collection traffic and cost tracking. And in two months we launch a free analytics product that will change the competitive landscape in online job advertising. So ... there's been a lot going on for me.

What's next for this blog?

My original purpose for this blog was to contribute something that other software engineers might find useful. I posted a few technical commentaries on topics not well-covered elsewhere, and I wrote somewhat haphazar