Is NoSQL the Future of Data Storage?
By Gary Short
Developer Express
Introduction
•   Gary Short
•   Technical Evangelist for Developer Express
•   C# MVP
•   garys@devexpress.com
•   www.garyshort.org
•   @garyshort.



What About You Guys?




Breadth First Look @ NoSQL




We'll Be Doing 3 Things
1. Define NoSQL databases
2. Look at scenarios where you can use NoSQL
3. Drill into a specific use case.




Where Does NoSQL Originate?
• 1998
  – Open source relational database
     •   Created by Carlo Strozzi
     •   Didn’t expose an SQL interface
     •   Called NoSQL
     •   The author said of the later, non-relational movement:
     •   “departs from the relational model altogether...”
     •   “...should have been called ‘NoREL’”.



More Recently...
• Eric Evans reintroduced the term in 2009
  – For an event organised by Johan Oskarsson (Last.fm)
     • Held to discuss open source distributed databases
• The label now covers a growing number of datastores
  – Open source
  – Non-relational
  – Distributed
  – (often) don’t guarantee ACID.

Atlanta 2009
• No:sql(east) conference
• Billed as “conference of no-rel datastores”
• Worst tag line ever
  – SELECT fun, profit FROM real_world WHERE rel=false.




Not Anti-RDBMS




Let’s Talk a Bit About What NoSQL DBs Look Like...




Key Attributes of NoSQL Databases
•   Don’t require fixed table schemas
•   Non-relational
•   (Usually) avoid join operations
•   Scale horizontally
    – Adding more nodes to a storage system.
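
A minimal sketch of what "adding more nodes" can mean in practice: each key is hashed and the hash picks the node that stores it. The node names and the `node_for` helper are made up for illustration, and the plain modulo placement is an assumption chosen for brevity; it moves most keys whenever the node count changes, which is why production systems tend to use something like consistent hashing instead.

```python
import hashlib


def node_for(key: str, nodes: list[str]) -> str:
    """Pick the node that owns a key by hashing the key (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical cluster of three storage nodes.
nodes = ["node-a", "node-b", "node-c"]

# Every key lands on exactly one node, and the same key always
# lands on the same node while the node list is unchanged.
placement = {key: node_for(key, nodes)
             for key in ("user:1", "user:2", "user:3", "user:4")}
```

Because the placement is a pure function of the key, any client can work out where a record lives without asking a central coordinator.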




What Does the Taxonomy Look Like?




Document Store
•   RavenDB
•   Apache Jackrabbit
•   CouchDB
•   MongoDB
•   SimpleDB
•   XML Databases
    – MarkLogic Server
    – eXist.

Document What?




Graph Storage
•   Trinity
•   AllegroGraph
•   Core Data
•   Neo4j
•   DEX
•   FlockDB.



Which Means?
• A graph consists of
  – Nodes (‘stations’ of the graph)
  – Edges (lines between them)
• FlockDB
  – Created by the Twitter folks
  – Nodes = Users
  – Edges = Nature of relationship between nodes.
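
The node and edge idea above can be sketched in a few lines. This is only an illustration, not FlockDB's data model: the user names, the edge kinds ("follows", "blocks") and both helper functions are assumptions.

```python
def add_edge(edges: set, source: str, kind: str, destination: str) -> None:
    """Record a directed edge; 'kind' is the nature of the relationship."""
    edges.add((source, kind, destination))


def neighbours(edges: set, source: str, kind: str) -> set:
    """Return every node reachable from 'source' over edges of one kind."""
    return {dst for (src, k, dst) in edges if src == source and k == kind}


# Nodes are users, edges say how two users are related.
edges = set()
add_edge(edges, "alice", "follows", "bob")
add_edge(edges, "alice", "follows", "carol")
add_edge(edges, "alice", "blocks", "dave")

alice_follows = neighbours(edges, "alice", "follows")
```

Scanning every edge is fine for a toy example; a real store would index edges by node so the lookup does not touch the whole graph.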


Social Graph




Key/Value Stores
• On disk
• Cache in RAM
• Eventually Consistent
   – Weak Definition
      • “If no updates occur for a period, eventually all updates will
        propagate through the system and all replicas will be consistent”
   – Strong Definition
      • “for a given update and a given replica eventually either the
        update reaches the replica or the replica retires”
• Ordered
   – Distributed Hash Table allows lexicographical processing.
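
To make the weak definition of eventual consistency concrete, here is a small sketch with two replicas that accept writes independently and later exchange state. The `Replica` class and the last-write-wins rule are assumptions for illustration (real stores use a variety of conflict-resolution schemes), and timestamps are assumed to be unique per key.

```python
class Replica:
    """Toy replica: keeps the newest (timestamp, value) seen per key."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Last write wins: keep the entry with the newer timestamp.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        """Pull every entry from another replica (one anti-entropy pass)."""
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica(), Replica()
a.write("colour", "red", timestamp=1)
b.write("colour", "blue", timestamp=2)

# Before the replicas talk to each other, readers can see different answers.
before = (a.read("colour"), b.read("colour"))

# Once updates stop and state has propagated, the replicas agree.
a.merge_from(b)
b.merge_from(a)
after = (a.read("colour"), b.read("colour"))
```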

Object Databases
•   Db4o
•   GemStone/S
•   InterSystems Caché
•   Objectivity/DB
•   ZODB.




How the &*$% do You Index That?!




Okay got it, Now Let’s Compare Some Real World Scenarios




You Need Constant Consistency
•   You’re dealing with financial transactions
•   You’re dealing with medical records
•   You’re dealing with bonded goods
•   Best you use an RDBMS ☺.




You Need Horizontal Scalability
• You’re working across defined geographic regions
• You’re working with large quantities of data
• Game server sharding
• Use NoSQL
   – Something like Cassandra.




Up in the Clouds Baby




Frequently Written Rarely Read
•   Think web counters and the like
•   Every time a user comes to a page = ctr++
•   But it’s only read when the report is run
•   Use NoSQL (key-value storage/memcache).




I Got Big Data!




Binary Baby!
•   If you are YouTube
•   Flickr
•   Twitpic
•   Spotify
•   NoSQL (Amazon S3).




Here Today Gone Tomorrow
• Transient data like...
  – Web Sessions
  – Locks
  – Short Term Stats
     • Shopping cart contents
• Use NoSQL (Memcache).
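
Transient data usually comes with a time-to-live: the store simply forgets the entry after a while. The sketch below shows the idea with an in-process dictionary. It is not the memcached API; the `ExpiringCache` class, its method names and the injectable `clock` parameter are all assumptions made so the behaviour can be shown without waiting for real time to pass.

```python
import time


class ExpiringCache:
    """Toy key-value cache whose entries disappear after a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the example can fake time
        self._items = {}      # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry lazily on read.
            del self._items[key]
            return default
        return value


# Fake clock so the example is deterministic.
now = [0.0]
cache = ExpiringCache(clock=lambda: now[0])

cache.set("session:42", {"cart": ["book"]}, ttl_seconds=30)
fresh = cache.get("session:42")

now[0] = 31.0  # half a minute later the session is gone
stale = cache.get("session:42")
```

Losing such an entry is acceptable by design, which is what makes a cache-style store a reasonable fit for this kind of data.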



Data Replication
• Same data in two or more locations
  – Music Library
     • Web browser
     • iPhone App
• NoSQL (CouchDB).




Hit me Baby One More Time!
• High Availability
  – High number of important transactions
     • Online gambling
     • Pay-per-view
        – Ahem!
     • Online Auction
• NoSQL (Cassandra – automatic clustering).



Give me a Real World Example
• Twitter
  – The challenges
     • Needs to store many graphs
        – Who you are following
        – Who’s following you
        – Who you receive phone notifications from, etc.
     • To deliver a tweet requires rapid paging of followers
     • Heavy write load as followers are added and removed
     • Set arithmetic for @mentions (intersection of users).


What Did They Try?
• Relational Databases
• Key-Value storage of denormalized lists




Did it Work?




What Did They Need?
• Simplest possible thing that would work
• Allow for horizontal partitioning
• Allow write operations to
  – Arrive out of order
  – Or be processed more than once
• Failures should result in redundant work
  – Not lost work!


The Result was FlockDB
• Stores graph data
• Not optimised for graph traversal operations
• Optimised for large adjacency lists
  – List of all edges in a graph
     • Each entry is a set of end points (or tuple if directed)
• Optimised for fast read and write
• Optimised for page-able set arithmetic.
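
"Page-able set arithmetic" can be pictured as intersecting two sorted adjacency lists one page at a time. The sketch below is an assumption about how such paging might look, not FlockDB's implementation: the `intersect_page` function, the cursor convention (the last id already returned) and the sample ids are all hypothetical, and both lists are assumed to be sorted with no duplicates.

```python
from bisect import bisect_right


def intersect_page(a, b, cursor=None, page_size=2):
    """Return (page, next_cursor) for the intersection of two sorted id lists.

    'cursor' is the last id returned by the previous page, or None to start.
    """
    # Resume just past the cursor in both lists.
    i = 0 if cursor is None else bisect_right(a, cursor)
    j = 0 if cursor is None else bisect_right(b, cursor)
    page = []
    while i < len(a) and j < len(b) and len(page) < page_size:
        if a[i] == b[j]:
            page.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    # A full page may have more results behind it; a short page is the last.
    next_cursor = page[-1] if len(page) == page_size else None
    return page, next_cursor


followers_of_x = [1, 3, 5, 7, 9]
followers_of_y = [3, 4, 5, 9, 10]

first_page, cursor = intersect_page(followers_of_x, followers_of_y)
second_page, end = intersect_page(followers_of_x, followers_of_y, cursor)
```

The caller never needs the whole intersection in memory, which matters when an adjacency list holds millions of followers. If the results happen to end exactly on a page boundary, the next call returns an empty page and a `None` cursor.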


How Does it Work?
• Stores graphs as sets of edges between nodes
• Data is partitioned by node
  – All queries can be answered by a single partition
• Write operations are idempotent
  – Can be applied multiple times without changing
    the result
• And commutative
  – Changing the order of operands doesn’t change
    the result.
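
One way to let a single partition answer each query is to store every edge twice: once in the partition of its source node and once in the partition of its destination node. The sketch below assumes that layout for illustration; the class, the modulo placement and the integer user ids are hypothetical and are not taken from the slides.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed, for illustration only


def partition_of(node_id: int) -> int:
    """Map a node id to the partition that owns it."""
    return node_id % NUM_PARTITIONS


class PartitionedEdges:
    def __init__(self):
        # partitions[p][(node, direction)] -> set of neighbour ids
        self.partitions = [defaultdict(set) for _ in range(NUM_PARTITIONS)]

    def add_edge(self, source: int, destination: int) -> None:
        # Forward copy lives with the source, backward copy with the destination.
        self.partitions[partition_of(source)][(source, "forward")].add(destination)
        self.partitions[partition_of(destination)][(destination, "backward")].add(source)

    def following(self, user: int) -> set:
        # Answered entirely from the user's own partition.
        return set(self.partitions[partition_of(user)].get((user, "forward"), ()))

    def followers(self, user: int) -> set:
        return set(self.partitions[partition_of(user)].get((user, "backward"), ()))


graph = PartitionedEdges()
graph.add_edge(1, 2)
graph.add_edge(3, 2)
graph.add_edge(1, 2)  # replaying the same write changes nothing (idempotent)
```

The cost is one extra write per edge; the benefit is that "who follows 2?" and "who does 1 follow?" each touch exactly one partition.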

A Little More About Idempotency
• Applied several times with no change to the
  result
• An operation ‘O’ on a set S is called idempotent
  if, for all x in S, x O x = x.
• Set union
  – A ∪ B = {x : x ∈ A or x ∈ B}, so A ∪ A = A
• Set intersection
  – A ∩ B = {x : x ∈ A and x ∈ B}, so A ∩ A = A
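
A quick check of the definitions above using Python's built-in sets. The follower names are made up; the list at the end is a deliberately non-idempotent contrast.

```python
followers = {"alice", "bob"}
new_batch = {"bob", "carol"}

# x O x = x for both union and intersection.
union_with_self = followers | followers
intersection_with_self = followers & followers

# Applying the same update once or twice gives the same result.
once = followers | new_batch
twice = (followers | new_batch) | new_batch

# Contrast: appending to a list is NOT idempotent, a replayed
# write leaves a duplicate behind.
log = ["alice", "bob"]
log_once = log + ["carol"]
log_twice = log + ["carol"] + ["carol"]
```

This is why a write that may be delivered more than once is safe to replay against a set but not against an append-only list.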

A Little More About Commutativity
• Changing the order of operands doesn’t
  change the result.
  3 + 2 = 2 + 3
• Can be combined with idempotency
• Let’s look at the follow command in Twitter
   • Let X = follow person X
   • Let Y = follow person Y
   • Then 3X + 2Y = 2Y + 3X
   • And 2X + 3Y = 3X + 2Y
• Note: it’s only true for the same operation.
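
The follow example can be checked by brute force. In this sketch, which uses hypothetical helper names, a follow is modelled as adding a target to a set; every ordering of "3X + 2Y" ends in the same state, and so does "2X + 3Y". The second helper shows why the note matters: once follows and unfollows are mixed, order changes the outcome.

```python
from itertools import permutations


def apply_follows(targets):
    """Apply a sequence of follow operations to an empty following set."""
    following = set()
    for target in targets:
        following.add(target)
    return following


# 3X + 2Y in every possible arrival order.
three_x_two_y = ["X", "X", "X", "Y", "Y"]
outcomes = {frozenset(apply_follows(order))
            for order in permutations(three_x_two_y)}

# 2X + 3Y lands in the same state as well.
two_x_three_y = apply_follows(["X", "X", "Y", "Y", "Y"])


def apply_mixed(operations):
    """Apply (action, target) pairs where action is 'follow' or 'unfollow'."""
    following = set()
    for action, target in operations:
        if action == "follow":
            following.add(target)
        else:
            following.discard(target)
    return following


# Different operations do not commute without extra information.
follow_then_unfollow = apply_mixed([("follow", "X"), ("unfollow", "X")])
unfollow_then_follow = apply_mixed([("unfollow", "X"), ("follow", "X")])
```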

Commutative Writes Help Bring up Partitions
• Partition can receive write traffic immediately
• Receive dump of data in the background
• Live for read as soon as the dump is complete.
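
Here is a sketch of why commutative, idempotent writes make this bring-up safe. It assumes each edge carries a timestamp and that the newest timestamp wins; that scheme, the `merge` helper and the sample edges are assumptions for illustration rather than details from the slides. The new partition applies live writes first and the background dump second, yet ends up identical to a partition that saw them in the natural order.

```python
def merge(state, updates):
    """Merge edge updates into state, keeping the newest timestamp per edge.

    Each value is (timestamp, present); timestamps are assumed unique per edge.
    """
    for edge, (timestamp, present) in updates.items():
        current = state.get(edge)
        if current is None or timestamp > current[0]:
            state[edge] = (timestamp, present)
    return state


# Snapshot taken from an existing partition.
dump = {
    ("a", "b"): (1, True),
    ("a", "c"): (2, True),
}

# Writes that arrive while the dump is still being copied.
live_writes = {
    ("a", "c"): (5, False),  # a unfollowed c after the snapshot
    ("a", "d"): (6, True),
}

# New partition: live traffic first, background dump afterwards.
new_partition = merge(merge({}, live_writes), dump)

# Reference: the same data applied in chronological order.
reference = merge(merge({}, dump), live_writes)
```

Replaying the dump a second time would also leave the state unchanged, so a failed copy costs redundant work, not lost work.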




Performance?
• Currently store 13 billion edges
• 20K writes / second
• 100K reads / second.




Punchline?
• Under all the bells and whistles...
  – It’s MySQL ☺.




So is this the Future?
• Yes!
• And No!




What?! How Can That be?!




