By Thomas Dinsmore, Director of Product Management at Revolution Analytics. The emergence of Apache Spark is a key development for Big Analytics in 2013.
Spark, an Apache incubator project, is an open source distributed computing framework for advanced analytics in Hadoop. It is claimed to run up to 100X faster than MapReduce. Spark includes a machine learning library (MLlib), a graph engine (GraphX), a streaming analytics engine (Spark Streaming) and much more...
Currently, Spark supports programming interfaces for Scala, Java and Python. An R interface is under development and is expected to be released in the first half of 2014.
Teradata announced a new set of features and products on Monday that should improve its position as a go-to analytics vendor even in an age of Hadoop. But as open source technologies evolve, Teradata might find it a challenge to attract new users.
A walkthrough of how to address a use case that requires forecasting future performance from historical trends. This short presen(...)
This demonstration will show how the combination of vSphere Big Data Extensions (BDE) and vCloud Automation Center (vCAC) can provide a service catalog that ...
Apache Hadoop didn't disrupt the data center, the data did. In this post we explore Hadoop as part of an integrated, modern data architecture.
NoSQL has been a hot buzzword for quite a long time (well, it's not only a buzz anymore). However, when should we really use it? Best Practices for...
Once again, the best practices and tips..
Many businesses are actively researching and planning on implementing a Hadoop solution. Hadoop vendors are also beginning to offer their services in the cloud.
In short, the author considers when it is reasonable to use Hadoop...
Everybody has heard about Big Data and it certainly sounds enticing. However, as always, it is a good idea to separate fact from fiction and to know what eff...
BI Industry expert Claudia examines the best practices for Big Data integration as the foundation for a successful BI environment.
Josep Lluis Larriba Pey's answer: As Chris Shrader says in his answer, you first have to know how to use a NoSQL DB. In this case, from my experience, I can tell you that Graph NoSQL databases are a special case.
Cloudera Search brings full-text, interactive search, and scalable indexing to Apache Hadoop by marrying SolrCloud with HDFS and Apache HBase, and other projects in CDH. Because it's integrated with CDH, Cloudera ...
REDIS EVERYWHERE (RT @alsargent: Great overview of when to use #Redis -- and when not to http://t.co/SPTVSRSSdx #NoSQL)
ReadWrite. Big Data Platform Comparisons: 3 Key Points (InformationWeek). Our recent 16 Top Big Data Analytics Platforms collection has generated lots of interest and plenty of comments and questions.
Note: This blog post is part 3 of the series, and you can download a 30-day trial of ScaleBase to practice the concepts. In this article comparing MongoDB and MySQL scalability, I want to focus on query models.
The HBase shell is great, especially while you are getting familiar with HBase. It provides many useful commands for simple tasks such as creating tables, putting some test data into them, scanning a whole table, fetching data from a specific row, etc.
Discusses the lesser-known operations of the HBase shell.
Oracle added NoSQL capabilities to the InnoDB engine in MySQL 5.6, providing a 9x improvement in transaction performance. Here's how to use the NoSQL features.
SQL-on-Hadoop engines are available from a variety of vendors. But expert Rick van der Lans cautions that while they appear similar on the surface, there are important differences.
MariaDB 10 is out, featuring a “Connect engine” that makes it easier to handle data from both traditional SQL databases and more web-scale NoSQL systems. The new functionality merits new editions of the MariaDB Enterprise and Enterprise Cluster products.
This slide deck, by Big Data guru Bernard Marr, outlines the 5 Vs of big data. It describes in simple language what big data is, in terms of Volume, Velocity...
Announcing the release of Hadoop 2.3.0 with 560 JIRAs fixed.
With this release, there are two significant enhancements to HDFS:
Besides, there are a lot of bug fixes and small changes ...
"There many NoSql databases out there and it can be confusing to determine which one is suitable for a particular use case. In this blog, we discuss the two more popular ones, Cassandra and HBase..."
The author's summary is the following:
HBase: The Definitive Guide, a book by Lars George (RT @kmap2: HBase: The Definitive Guide/Lars George #books http://t.co/SrPbZBcP8N)
Original: http://wiki.apache.org/cassandra/GettingStarted DataStax's latest Cassandra documentation covers topics from installation to troubleshooting, including a Quick Start Guide. Documentation for older releases is also available.
5 things everyone should know about Hadoop (GigaOM). It didn't take long for the Hadoop market to become a juggernaut, and it won't take long for it to undergo some significant technological changes.
http://zerotoprotraining.com This video explains the concepts of MapReduce (MR2) and YARN related to Apache Hadoop.
The release of HBase 0.98.0 saw the resolution of 230 JIRA tickets for a lot of new features.
When you are looking to run analytics on large and complex data sets, you might instinctively reach for Hadoop. However, if your data’s in MongoDB, using the Hadoop connector seems like overkill if...