Bob Wiederhold, CEO of Couchbase, has kindly agreed to answer some questions on the future of the NoSQL market in an interview for Market Research Media. Couchbase is the company behind the Couchbase open source project and one of the leading firms in the non-relational database segment. These are the questions and answers from the interview.
Q1. What is your business model? Combination of free and enterprise licenses? Providing free open source database and charging for support/service? Is this business model sustainable in the long run?
Couchbase Inc. is the company behind the Couchbase open source project. We provide source code, a free community edition, and a paid enterprise edition.
Open Source. We provide the open source under the Apache 2.0 license.
Community Edition. We offer a Couchbase Server Community Edition that is free to download and use as you wish for production and non-production uses. Support for the Community Edition is provided by the community through the usual channels like forums, IRC, etc. There is, of course, no Service Level Agreement (SLA) associated with this support.
Enterprise Edition. We offer a Couchbase Server Enterprise Edition subscription that costs $2,499/node/year for 8×5 support and $4,499/node/year for 24×7 support. A “node” is a physical server or virtual server (regardless of the number of CPUs). The Enterprise Edition comes with an SLA where we guarantee response times for best practice expertise and technical support. It also comes with a commercial license. The only product difference between the Community and Enterprise Editions is that bug fixes and hot bug fixes are immediately available in the Enterprise Edition and experience some time lag before they get integrated into the Community Edition.
Sustainability of Business Model. Yes, we believe this model is sustainable in the long term. Customers typically use the Community Edition to learn about and experiment with the software and then develop their application if it is a good fit for their use case. When they are ready to go into production, they typically purchase the Enterprise Edition because they want best practice expertise and technical support when they go “live” with their application. While not all users of the Community Edition move to the Enterprise Edition when they go live, a high percentage do because the cost to their business of having their application go down for any length of time is far greater than the cost of the Enterprise Edition subscription.
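As a rough illustration of the Enterprise Edition pricing quoted above, the yearly subscription cost is simply the node count multiplied by the per-node price for the chosen support tier. The sketch below uses only the two prices from the interview; the function name and the tier labels are hypothetical and chosen for this example.

```python
# Per-node annual prices quoted in the interview (USD).
# The tier labels "8x5" and "24x7" are just dictionary keys for this sketch.
PRICE_PER_NODE = {"8x5": 2499, "24x7": 4499}


def annual_subscription_cost(nodes, support):
    """Yearly Enterprise Edition cost for a cluster of `nodes` servers.

    A node is one physical or virtual server, regardless of CPU count.
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * PRICE_PER_NODE[support]


if __name__ == "__main__":
    for tier in PRICE_PER_NODE:
        print(tier, annual_subscription_cost(10, tier))
```

For a 10-node cluster this works out to $24,990 per year with 8×5 support and $44,990 per year with 24×7 support.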
Q2. What is your position on ACID compliance for NoSQL? Any plans to make Couchbase ACID compliant?
NoSQL (and Couchbase) provides a different set of tradeoffs than RDBMS. Those tradeoffs make a lot of sense for some applications and don’t for others – it really depends on the nature of the application, the profile of the data used by the application, and the tradeoffs you are willing to make. For some applications and data types, 100% durability is paramount regardless of the performance impact. For other applications and data types, relaxing durability is perfectly acceptable if performance can be significantly increased. For some applications, atomicity can easily be handled because all the data you want to manipulate can be put into a single document. For other applications, support for complex transactions is necessary to provide atomicity.
Couchbase’s goal is not to be 100% ACID compliant in the way that people using RDBMS are familiar with the term. Our goal is to provide a solution that provides the right amount of atomicity, consistency, isolation, and durability within the context of other tradeoffs like performance and scalability. We believe a growing number of developers will view these tradeoffs as better suited to the needs of their applications. Likewise, some developers will prefer the ACID tradeoffs that relational databases make and will stick with an RDBMS approach.
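The point about handling atomicity by keeping related data in a single document can be sketched with a toy in-memory store. This is an illustration under stated assumptions, not Couchbase's API: the class `TinyDocStore`, its version counter, and the helper `move_points` are all hypothetical. The idea is that a whole-document replace guarded by a version check either applies every field change or none of them.

```python
import copy
import threading


class TinyDocStore:
    """Hypothetical in-memory document store (not a real client library)."""

    def __init__(self):
        self._docs = {}                 # key -> (version, document)
        self._lock = threading.Lock()

    def insert(self, key, doc):
        with self._lock:
            self._docs[key] = (1, copy.deepcopy(doc))

    def get(self, key):
        """Return (version, private copy of the document)."""
        with self._lock:
            version, doc = self._docs[key]
            return version, copy.deepcopy(doc)

    def replace(self, key, doc, expected_version):
        """Swap in the whole document only if it is unchanged since the read."""
        with self._lock:
            version, _ = self._docs[key]
            if version != expected_version:
                return False            # someone else wrote first; caller retries
            self._docs[key] = (version + 1, copy.deepcopy(doc))
            return True


def move_points(store, key, src, dst, amount):
    """Move `amount` between two fields of ONE document, retrying on conflict."""
    while True:
        version, doc = store.get(key)
        if doc[src] < amount:
            return False                # insufficient balance, nothing written
        doc[src] -= amount
        doc[dst] += amount
        if store.replace(key, doc, version):
            return True
```

Because both fields live in the same document, a reader never observes the debit without the matching credit. If the two balances lived in separate documents, this check-and-replace would not be enough, which is the case where the answer above says complex transactions are needed.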
Q3. Your current product/marketing strategy targets mostly online applications. Can you give us examples of Couchbase deployment in other verticals: government, data warehouses, business intelligence, genetic databases, image/video databases, financial applications?
Couchbase is an operational database for interactive, online applications. We’re not a data warehouse or an analytics platform. We have a number of connectors to these types of platforms so our customers can take advantage of the benefits of those types of products. While it’s not likely that you would store a large video/image library in Couchbase, storing the metadata associated with a video/image library in Couchbase is definitely a valid use case. Likewise, huge amounts of genetic information are often analyzed on an analytics platform like Hadoop, and the results of the analysis are then kept in an operational database like Couchbase, where interactive applications can leverage the results of these analyses with sub-millisecond latencies.
Q4. What is your opinion of Hadoop-MapReduce future? How do you see interaction between Couchbase and Hadoop-MapReduce framework?
Hadoop MapReduce is a great big-data analysis framework. It is well suited for deep analysis of the kind of data often initially captured or served by NoSQL databases. However, it is inherently a batch-oriented technology, limiting it to uses where it is okay that results are updated relatively infrequently and are not based on “current” data.
Couchbase offers a Hadoop connector in partnership with Cloudera. The connector allows data to be easily moved from a Couchbase Server cluster to a Hadoop cluster and analysis results to be written back into Couchbase Server. In these configurations Couchbase Server is utilized to keep the active data, which needs to be accessed at low latency and to receive writes or updates at high performance. The Hadoop cluster is then used to run deep analysis on the captured data.
In the advertisement-serving industry this technique is used to have Couchbase Server serve user profiles and capture real-time user activity at low latency, while the Hadoop cluster does the analysis of the captured data to update the profile targeting information.
Couchbase Server also supports incremental MapReduce natively. If chaining of multiple MapReduce steps is not required, incremental MapReduce allows MapReduce analysis to be applied to data stored in Couchbase Server in real time, as the data changes. Results are incrementally updated, so unlike the batch-oriented MapReduce approach used in Hadoop, current results are always available.
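To make the contrast with batch processing concrete, here is a minimal sketch of an incrementally maintained map/reduce result. It is a toy model and not how Couchbase Server implements its views: the class `IncrementalCountView` and its methods are hypothetical, and the reduce step is fixed to a simple count per emitted key. When a document changes, only that document's old contribution is retracted and its new one applied, so the reduced result is current after every write.

```python
from collections import Counter


class IncrementalCountView:
    """Toy incrementally maintained map/reduce result (count per emitted key)."""

    def __init__(self, map_fn):
        self._map_fn = map_fn       # document -> iterable of keys to emit
        self._emitted = {}          # doc_id -> keys emitted by its current version
        self.result = Counter()     # reduced output, kept current on every write

    def upsert(self, doc_id, doc):
        """Insert or update one document and adjust only its contribution."""
        self.delete(doc_id)         # retract the previous version, if any
        keys = list(self._map_fn(doc))
        self._emitted[doc_id] = keys
        self.result.update(keys)

    def delete(self, doc_id):
        """Remove one document's contribution from the reduced result."""
        for key in self._emitted.pop(doc_id, []):
            self.result[key] -= 1
            if self.result[key] == 0:
                del self.result[key]


if __name__ == "__main__":
    view = IncrementalCountView(lambda doc: [doc["country"]])
    view.upsert("u1", {"country": "US"})
    view.upsert("u2", {"country": "DE"})
    view.upsert("u1", {"country": "DE"})   # an update touches only u1's entry
    print(dict(view.result))
```

A batch job would recompute the counts over all documents on each run; here the work per write is proportional to the keys emitted by the one document that changed.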
Q5. Structured or unstructured data? Which type of data makes the stronger case for Couchbase?
Most of the recent explosion of data is unstructured and semi-structured data. This type of data doesn’t fit very well with the highly structured schemas of relational databases. Document databases like Couchbase are schema-less (or maybe more accurately schema-inferred) databases so they work much better with unstructured and semi-structured data. As a result, if you have unstructured data it likely makes sense to check out Couchbase (or other NoSQL databases) because there is a good chance it will be a better choice than a relational database alternative.
Document databases are also good at storing structured data, however, so depending on your use case they may also make sense for this type of data. This is particularly true when your structured data is changing rapidly. When the data you want to store changes, a schema-based relational database forces you to go to the DBA to figure out how to accommodate the change in the schema, move your data from the old schema to the new one, etc., and this can be a lengthy process. The schema-less approach found in document databases makes this much easier and more flexible – you just add the new fields to your documents.
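The "just add the new fields" point can be shown with plain JSON documents. This is a small sketch using only Python's standard library; the document contents, the `twitter` field, and the helper names are made up for illustration. Older documents without the new field and newer documents with it coexist, and the reading code supplies a default instead of requiring a schema migration.

```python
import json

# Two generations of a "user" document living side by side in the same store.
old_doc = json.loads('{"type": "user", "name": "Ada"}')
new_doc = json.loads('{"type": "user", "name": "Grace", "twitter": "@grace"}')


def display_handle(doc):
    """Read the newer optional field, falling back for older documents."""
    return doc.get("twitter", "(none)")


def add_field(doc, field, value):
    """Return a copy of the document with one extra field; no migration step."""
    updated = dict(doc)
    updated[field] = value
    return updated


if __name__ == "__main__":
    print(display_handle(old_doc))
    print(display_handle(add_field(old_doc, "twitter", "@ada")))
```

The tradeoff is that the application, rather than the database schema, becomes responsible for tolerating documents of different shapes.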
Q6. What scenario do you envision for interaction between relational and non-relational databases?
For 40 years everyone used relational databases and only relational databases. We think that’s changing. We’re moving to a situation where multiple database technologies will be available and developers will pick the database that fits best with the data profile and use case of their application. For example, many companies have moved to a Service Oriented Architecture where each service can pick the database that fits best with that particular service. The classic strength of relational databases is where complex transactions are required, for example debit/credit type applications. We certainly have customers that use relational databases alongside Couchbase for these types of things.
Relational databases currently are the underpinning for 95%+ of the database industry. We think in 10-15 years this percentage will be dramatically smaller. But relational databases do many things well and we don’t think they are going away – they are just going to be one of many alternatives.
Q7. What future do you envision for Couchbase? IPO, acquisition by major database player?
The NoSQL industry is a big market opportunity, is growing fast, and I think will spawn companies that grow to be multi-hundred-million-dollar companies in the not too distant future. As a result, I think an IPO is certainly possible for companies like Couchbase. NoSQL is also a very disruptive technology that I believe big, existing database companies like Oracle, IBM, and Microsoft will find increasingly strategic to their business. As a result, I fully expect acquisitive companies like these (and others like VMware, EMC, Red Hat, etc.) to consider acquiring NoSQL companies at high valuations. I also think there are opportunities for mergers between leading new database companies to form new, larger next-generation database companies.
Couchbase’s goal is to build a large, independent company that can ultimately go public.
Q8. Do you see any synergy in custom hardware configuration enhancing Couchbase capabilities? Any future product offerings including hardware? Couchbase appliance?
I don’t expect Couchbase to offer hardware or appliances. We haven’t seen strong demand for this to date. We do, however, expect to integrate our products more tightly with various hardware products. For example, we are working with Fusion-io to integrate our product more tightly with their SSD solution. We are also working with networking companies to integrate more closely with their products. I think you’ll see even more of this from us in the future.
Q9. What is Couchbase’s competitive advantage in the long run? How does Couchbase stand against competing NoSQL technologies?
It’s hard to tell what our competitive advantages will be in the long run since that depends as much on what our competitors do as what we do. What we have focused on to date and where we think we have significant competitive differentiation is in performance, scalability, and always-on 24x7x365 capability.
High Performance. Performance is usually measured in terms of read/write latencies and the throughput per server. Couchbase consistently delivers sub-millisecond latencies for reads and writes. This allows app developers to meet the demanding responsiveness requirements of application users. Couchbase also provides much higher throughput than other NoSQL products, and that throughput scales linearly with the number of servers in your cluster. This keeps the ops people happy because fewer servers are required to meet throughput requirements. Some benchmarks have already been made public to support these claims and more benchmarks will be made public shortly. (See “Understanding the performance benchmark published by Cisco and Solarflare using Couchbase Server”.)
Easy Scalability. A lot of people think all NoSQL products provide easy scalability simply because they are “NoSQL” but that really isn’t the case. Adding or removing capacity should be a single, simple operation whether you’re growing to 5, 25, 50, or more servers. Every node in the cluster should be the same and simple to set up. The approach to sharding should balance load evenly across a cluster to avoid “hot spots”. And you shouldn’t have to change your application as your database cluster grows, i.e. application-level sharding doesn’t count. Couchbase provides all of these capabilities together with the monitoring and alerting tools you need to watch the health of your cluster. We think we’re recognized in the industry as a leader in scalability.
Always On 24x7x365. When applications are down users get frustrated and revenue opportunities are lost. Increasingly customers are demanding that the data tier be up and running 24x7x365. That means your app needs to stay available during software upgrades, hardware maintenance, and hardware failures. Online backup and restore also needs to be provided as well as datacenter disaster recovery support. Couchbase allows you to do all this and we think we are significantly differentiated in this area.
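The sharding requirements listed under Easy Scalability above (even load across nodes, and no application changes as the cluster grows) can be sketched as follows. This is a generic illustration, not a description of Couchbase internals: the partition count of 1024, the CRC32 hash, the round-robin assignment, and all function names are assumptions made for this example. Keys hash to a fixed set of partitions, and only the partition-to-node map changes when capacity is added.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 1024  # illustrative fixed partition count (an assumption)


def partition_for(key):
    """Hash a document key to a partition; independent of cluster size."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_partition_map(nodes):
    """Assign partitions to nodes round-robin so each owns a near-equal share."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]


def node_for(key, partition_map):
    """Route a key; application code calls this unchanged at any cluster size."""
    return partition_map[partition_for(key)]


def partitions_per_node(partition_map):
    """Count how many partitions each node owns."""
    return Counter(partition_map)


if __name__ == "__main__":
    four = build_partition_map(["n1", "n2", "n3", "n4"])
    five = build_partition_map(["n1", "n2", "n3", "n4", "n5"])
    print(partitions_per_node(four))
    print(partitions_per_node(five))
    print(node_for("user:42", four), node_for("user:42", five))
```

The application never computes a node from the node count itself, which is what distinguishes this from application-level sharding. One simplification to note: the round-robin reassignment here moves many partitions when a node is added, whereas a production rebalancer would aim to move as few as possible.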
Q10. What is new in the Couchbase 2.0 release and when will it be available?
The 2.0 release is a very important one for Couchbase. The current (1.8) release of Couchbase is a key-value database. It provides all the high performance, easy scalability, and always-on capabilities that I talked about above. The Couchbase Server 2.0 release turns the product into a document database that provides developers with a rich set of indexing and querying capabilities. These are the kinds of features that many developers are accustomed to using when they are using relational databases and are a requirement for many projects. While we have been very successful as a key-value database, this new capability will significantly expand our market opportunity and provide MongoDB with direct competition for the first time.
With the 2.0 release we are also expanding our differentiation in “easy scalability” by offering Cross Data Center Replication (XDCR). This is an increasingly important requirement for many customers. This allows you to scale by putting the same database in multiple data centers around the world, keep them all synchronized, and provide better application performance because you are serving users from local datacenters. It also provides a necessary disaster recovery capability.
The 2.0 release is available today as a developer preview (called nightly builds) for developers to use. We will go to beta in early September and will release the product in early November.