Baron recently wrote about sending the query to the data, looking at distributed systems like Cassandra. I want to take a look at simpler systems like MySQL and see how we are doing in this space.
It is obvious that getting computation as close to the data as possible is the most efficient approach, as we will likely have less data to work with at the higher level. Internally, MySQL has started to add optimizations which help in this regard, such as Index Condition Pushdown, which allows the storage engine to do rudimentary data filtering itself, improving efficiency.
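As a quick illustration, here is a minimal sketch of where Index Condition Pushdown helps in MySQL 5.6 and later, using a hypothetical people table: only last_name can drive the index range scan, but with ICP the first_name condition is also checked by the storage engine against index entries, so fewer full rows have to be read and passed up to the server layer.

```sql
-- Hypothetical table, used only to show Index Condition Pushdown (MySQL 5.6+)
CREATE TABLE people (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  last_name  VARCHAR(50)  NOT NULL,
  first_name VARCHAR(50)  NOT NULL,
  address    VARCHAR(200),
  KEY idx_name (last_name, first_name)
) ENGINE=InnoDB;

-- The range scan uses last_name; the LIKE filter cannot be used for the scan,
-- but with ICP the engine evaluates it on index entries before fetching rows.
EXPLAIN
SELECT * FROM people
WHERE last_name = 'Smith' AND first_name LIKE '%ohn%';
-- The Extra column shows "Using index condition" when ICP is used.
```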
The more important case, though, is the application-to-database interaction. Modern applications often have quite complicated logic which might not map to SQL very well, and the frameworks and practices developers follow can only add to this problem. As a result, the application may issue a lot of queries to the database, do the computation inside the application, and pay a lot in inefficiency and latency for transferring data back and forth. Latency is really the key here: accessing data over the network is thousands of times slower than accessing data in memory, so many simple data processing algorithms you could imagine, which access the data they need row by row, simply do not work.
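To make the latency point concrete, here is a sketch, with hypothetical orders and order_items tables, of the row-by-row pattern that does not survive network latency, next to the single-round-trip version where the server does the work:

```sql
-- Row by row: the application fetches the orders, then the items for each
-- order with a separate query, paying network latency once per order.
SELECT id FROM orders WHERE customer_id = 42;
SELECT * FROM order_items WHERE order_id = 1001;
SELECT * FROM order_items WHERE order_id = 1002;
-- ... one more query per order

-- Single round trip: the same data via one join, processed next to the data.
SELECT o.id, i.*
FROM orders o
JOIN order_items i ON i.order_id = o.id
WHERE o.customer_id = 42;
```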
For some tasks, simply learning SQL (including some voodoo practices with user variables, etc.) and using it correctly is good enough to do efficient computation in a single round trip; for others, the logic might not map to SQL very well.
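The kind of user-variable voodoo I mean looks something like this, against a hypothetical payments table: a running total computed in one round trip instead of summing rows in the application, at the price of relying on an evaluation order the manual does not really guarantee.

```sql
-- Running total with user variables: one round trip, but "voodoo" because it
-- depends on the order in which MySQL evaluates the expressions.
SET @running := 0;
SELECT id, amount, (@running := @running + amount) AS running_total
FROM payments
ORDER BY id;
```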
There is a known solution to this problem which has existed in many database systems for decades: stored procedures. These allow you to store programs inside the database server so they can work with data locally, often accessing it with much lower latency, so you can implement a much broader set of algorithms. Stored procedures have other advantages too, such as giving the DBA more security and more control over what the application does, but they have limitations as well.
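A minimal sketch of what this looks like in MySQL, with a hypothetical accounts table: the whole transfer runs next to the data, and the application makes a single CALL instead of several round trips.

```sql
DELIMITER //
CREATE PROCEDURE transfer_funds(IN from_id INT, IN to_id INT, IN amount DECIMAL(10,2))
BEGIN
  -- Both updates run inside the server, in one transaction, close to the data.
  START TRANSACTION;
  UPDATE accounts SET balance = balance - amount WHERE id = from_id;
  UPDATE accounts SET balance = balance + amount WHERE id = to_id;
  COMMIT;
END //
DELIMITER ;

CALL transfer_funds(1, 2, 100.00);
```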
There are MySQL-specific limitations: stored procedures reduce transparency, are hard to debug, do not perform very well, and need to be implemented in a programming language from the 70s. All of this could be fixed in time. The design limitation, though, is that stored procedures do not support some of the modern development and operational practices very well.
If you look at a lot of modern applications with a database backend, they have "code", which lives in your version control system, changes quickly, and can be deployed to production many times a day, an approach that has proved successful for many modern web applications. This all works as long as no database change is needed... because the database does not like you to be agile. Changing the database schema, indexes, etc. takes significant effort and can't be taken lightly. Such changes also require a different process: you can't just "deploy the changed database structure" as you do with code; you have to have a migration process, which can range from trivial (such as adding a column, as sketched below) to rather complicated, such as a major database redesign. In fact, reducing the pain of database maintenance by having no schema is one of the major draws of NoSQL systems, which offer a lot more flexibility.
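To be clear about the trivial end of that range, adding a column to a hypothetical orders table is a single statement; a major redesign, by contrast, means creating new structures, backfilling data, and cutting the application over step by step.

```sql
-- Trivial migration: a single statement (though on a big table even this may
-- call for an online schema change tool to avoid blocking writes).
ALTER TABLE orders ADD COLUMN shipping_note VARCHAR(255) NULL;
```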
Code can also often be deployed in a rolling fashion to make deployment downtime-free; in that case more than one version of the code is allowed to run in production for a short or long period of time. For example, when deploying a performance-optimized version of the code, I can deploy it on only one web server for a few days to ensure there are no surprises before a full-scale deployment.
The problem with stored procedures is that they occupy a middle ground. They are really code which is part of the application, but they live in the database and are updated through a change to the database, which is global for all application servers. The first issue is that developers can't use the same editing tools to simply edit the code, save it, and see how it runs on a test system without doing extra work. The second problem is solved by some people by implementing versioning in the database, so instead of CALL MAKE_PAYMENT(…) they will use CALL MAKE_PAYMENT_…(…) with a version suffix, which is, however, also quite a pain in the butt.
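A hypothetical illustration of that naming-based versioning (the procedure names and bodies are made up): both versions live in the database at once, so application servers running different releases can each call the one they were built against.

```sql
-- Simplified, made-up bodies; the point is only the versioned names.
CREATE PROCEDURE MAKE_PAYMENT_V1(IN order_id INT)
  UPDATE orders SET status = 'paid' WHERE id = order_id;

CREATE PROCEDURE MAKE_PAYMENT_V2(IN order_id INT)
  UPDATE orders SET status = 'paid', paid_at = NOW() WHERE id = order_id;

-- Old application code keeps calling MAKE_PAYMENT_V1(...); new code calls:
CALL MAKE_PAYMENT_V2(1001);
```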
So what would be the alternative?
I would love to see MySQL extend the work started with INSERT ... ON DUPLICATE KEY UPDATE and multiple result sets by allowing simple programs to be submitted to run on the database server side as an alternative to queries (or even as an alternative to the SQL API). Using a language more friendly to modern developers, such as JavaScript, and allowing more than flat tables to be returned (for example, JSON objects) would be quite appreciated. Such an API would, for example, allow the "dynamic join" problem to be solved efficiently, where the data which belongs to an object (and as such which tables need to be added to the join in SQL) depends on the object's properties, as well as handling complex update logic which now often requires many round trips to the application.
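For a concrete picture of the "dynamic join" problem as it stands today, consider a hypothetical product catalog where the table holding an object's details depends on the object's type; the application has to ask first, then ask again:

```sql
-- Round trip 1: find out what kind of object this is.
SELECT id, product_type FROM products WHERE id = 1001;

-- Round trip 2: only now can the application issue the right follow-up query.
SELECT * FROM book_details  WHERE product_id = 1001;  -- if product_type = 'book'
SELECT * FROM music_details WHERE product_id = 1001;  -- if product_type = 'music'
```

A small program submitted to the server could inspect product_type and run the appropriate query locally, returning one (possibly nested JSON) result to the client in a single round trip.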
Implementing something like this, one would need to be extra careful, as it would allow application developers to kill database servers even more effectively than they do now; proper limits on resource usage (CPU, memory, etc.) would need to be enforced. We would also need to be extra careful with security, as the ability to change the "program" gives hackers a lot more ways to express their creativity than SQL injection currently does.
The downside of this approach compared to truly "stored" procedures, from a performance standpoint, is their dynamic nature: each program would need to be compiled for execution from scratch. I think this could be substantially optimized with modern technologies, and it is a small price to pay for avoiding many round trips and getting a lot more data-processing power local to the database.
In the end, I think something along those lines could be quite helpful in expanding the usability of relational databases for modern applications. What are your thoughts?