Structured data stored in relational databases has ruled the world for
the last 40 years. Over that time, Structured Query Language (SQL)
emerged as the standard for accessing and manipulating data stored in
relational database management systems. The main reason for the
popularity of SQL was the ease of programming it provided by
encapsulating and abstracting how data is stored, thereby removing a
step in the process and allowing developers to focus on what they wanted
done. Thus, SQL drove the adoption of relational databases to near
ubiquity.
However, we’ve started to hit the limitations of what relational systems
can do. Data no longer follows a uniform structure. With the
digitization of communication and commerce, we now have social data,
scientific data, IoT data, blogs, tweets, and other data that do not fit
the relational structure. Additionally, today’s businesses demand
agility and rapid changes to applications, which means frequent changes
to the schema of the data.
The rigid schema requirements of relational databases are a roadblock to
releasing fast, scalable, and responsive applications. Developers and
enterprises are increasingly expected to bring products to market faster
and cheaper. The need for dynamic schema evolution demands not only a
rethink of data models and databases, but also a new method to access
this data -- a query language.
A new data model
Before we get to the query language, let’s first examine the data model.
Since the early '90s, most business applications were developed in
object-oriented programming models. The popularity of graphical user
interfaces and subsequently the Web made this type of programming the
norm for developing business and customer-facing applications. In Web
applications specifically, JSON is the open standard format that uses
human-readable text to represent data in objects. JSON is what gets
transmitted between server and Web applications. NoSQL databases
designed to store and manage JSON documents started gaining popularity
in the early 2000s thanks to the increase of unstructured data.
More and more businesses are adopting NoSQL databases to support the
broad set of use cases for the next-generation of personalized,
context-sensitive, and location-aware applications. Modern developers
love the flexibility of a JSON database because it represents data in
the same object-based way as their preferred languages (Java, C++, .Net,
Python, Ruby, and so on) without the rigid schema requirements of a
relational database. However, developers building on a NoSQL database
with a JSON data model have been limited by the lack of a query
language.
As a result, the emergence of JSON-based NoSQL databases without a
standard rich query language forced programmers into a dilemma: Either
leverage the power of standard SQL, but be constrained by a rigid
relational model, or develop on a flexible JSON data model, but accept
data model and query limitations.
While the benefits of using JSON are clear, a standard query language
making it easier for developers to build applications is also important.
To give JSON the query language it deserves, a clear starting point was
to simply extend the most popular query language, SQL.
A new query language
N1QL (pronounced "nickel") is a new query language that extends SQL to
work on JSON documents. Put in more technical terms, JSON represents a
non first normal form data model (N1NF), and N1QL operates on that data
model. N1QL extends SQL from traditionally operating on tables and rows
(tuples) to operate on JSON (nested documents). It’s built on a nested
recursive algebra for a nested recursive data model.
Extending SQL to JSON is like reinventing the gas-powered car by giving
it an electric engine, but not changing the steering wheel or any of the
mechanisms that affect how you operate the car. Developers can now
dynamically extend an application’s schema (handled at runtime by the
new query engine), while still using the same familiar SQL language to
operate it.
The image below shows how you would write a JOIN in SQL and how you
would write a JOIN in N1QL. This is a very simple example. To learn more
about what can be done with N1QL visit the Couchbase N1QL tutorial.
Building a new language as an extension to SQL provides the advantage of
using the same vocabulary and syntax as SQL. Thus, for the first time,
developers can perform complex references using JOINs in NoSQL document
databases. In addition, all the standard SQL language elements such as
statements, clauses, expressions, predicates, operators, aggregation,
and ordering remain the same.
However, because the underlying data model is different and schema
evolution is dynamic at runtime, some additions needed to be made to
extend SQL’s power to JSON. N1QL adds the verbs
NEST
and UNNEST
for composing and flattening nested objects, a family of operators (such as IS NULL
, IS MISSING
) for handling dynamic schema, and ARRAY
functions and operators for traversal, filtering, and recursive processing of array elements.
For the next phase of application development, developers need new query
languages that are architected for schema flexibility and schema
evolution. By pairing the flexibility of an innovative data model with
the power of a query language known by millions of developers and
business analysts, we lay the foundation for building much more flexible
and powerful applications.
JSON and distributed document-oriented databases are here to stay. Let’s make the most of them.
Ravi Mayuram is senior vice president of products and engineering at Couchbase.
New Tech Forum provides a venue to explore and
discuss emerging enterprise technology in unprecedented depth and
breadth. The selection is subjective, based on our pick of the
technologies we believe to be important and of greatest interest to
InfoWorld readers. InfoWorld does not accept marketing collateral for
publication and reserves the right to edit all contributed content. Send
all inquiries to newtechforum@infoworld.com.
Post a Comment