Friday, July 29, 2011

Mythbusters: Stored Procedures Edition



These days, the use of database stored procedures is regarded by many as a bad practice.

Those that dislike stored procedures tend to regard them as incompatible with the three-tier architecture:

By breaking up an application into tiers, developers only have to modify or add a specific layer, rather than have to rewrite the entire application over. There should be a presentation tier, a business or data access tier, and a data tier.

This is illustrated as follows:



Note that the "tiers" in the figure should actually be labelled "layers", for as the accompanying Wikipedia article says:
The concepts of layer and tier are often used interchangeably. However, one fairly common point of view is that there is indeed a difference, and that a layer is a logical structuring mechanism for the elements that make up the software solution, while a tier is a physical structuring mechanism for the system infrastructure. 

In fact, those that argue that stored procedures are bad tend to equate the three logical layers with three physical tiers:


  1. Data layer = data tier (database)
  2. Logic layer = middle tier (application server)
  3. Presentation layer = presentation tier


But if we accept the above definition of "layers" and "tiers", it is obvious that the following is a valid mapping as well:


  1. Data layer = data tier (database)
  2. Logic layer = data tier (database)
  3. Presentation layer = presentation tier


In other words, the database becomes our "logic layer" through the use of database stored procedures, which, as the name implies, are physically stored (and executed) in the database. (And although I use the term "stored procedure", I'm primarily talking about Oracle and PL/SQL, where the PL/SQL code should be put in packages rather than stand-alone procedures.)

But why is this a bad idea? In fact, as it turns out, it might not be a bad idea at all. The usual reasons given against the use of stored procedures for "business logic" (or for anything at all, really) tend to be myths (or outright lies), repeated so many times that they are taken as the truth.

So let's bust these myths, once and for all. And whenever someone argues against stored procedures using one of these myths, just give them a link to this blog post. (And leave comments to prove me wrong, if you will.)


Myth #1: Stored procedures can't be version controlled



Stored procedure code lives in text files, which can be version controlled like any other piece of code or document. Storing/compiling the code in the database is just like (re-)deploying any other code.

Claiming that stored procedures cannot be version controlled (because they are in the database) is like saying your application source code (Java, C# or whatever) cannot be version controlled because it is compiled and deployed to an application server.




Myth #2: Managing the impact of changes in the database is hard



Databases such as Oracle have built-in fine-grained dependency tracking.


A wealth of information about your code is exposed via data dictionary views.




Myth #3: Database tools lack modern IDE features





There are a number of free and commercial PL/SQL code editors and IDEs, and all have various levels of syntax highlighting, code insight and refactoring support.




Myth #4: Stored procedures always result in spaghetti code



To this, I can only say that bad programmers can make pasta in any language (the above is a visual representation of a Java or .NET enterprise framework "several dozen megabytes chock full of helper classes like IEnterpriseAuthenticationProviderFactoryManagementFactory").

And a good programmer can create "beautiful" code in COBOL, Visual Basic, PHP... and any stored procedure language, for that matter.





Myth #5: Code in the database can’t be properly encapsulated and reused, you need an object-oriented language for that






PL/SQL packages, views, pipelined functions and Ref Cursors offer encapsulation and reuse. And PL/SQL has object-oriented features, too.




Myth #6: Stored procedure languages are primitive, they lack basic features such as exception handling and dynamic execution



PL/SQL has had proper exception handling from the start, over 20 years ago (although exception handling was only introduced to SQL Server in 2005).



DBMS_SQL, EXECUTE IMMEDIATE and "weak" Ref Cursors enable dynamic execution of code. Parameter overloading and the ANYDATA and ANYTYPE types allow for generic code to be written.




Myth #7: Debugging stored procedures is hard/impossible



Both Oracle and SQL Server have built-in debugging capabilities, exposed via graphical user interfaces in the common IDEs, with full support for stepping though code, inspecting values, etc.




Myth #8: Stored procedures can't be unit tested



There are a number of free and commercial unit testing frameworks available for PL/SQL. Steven Feuerstein, one of the world's leading experts on the Oracle PL/SQL language, has been preaching the importance of unit testing in the database for years, and has developed several of the available unit testing frameworks.




Myth #9: Stored procedures are not portable, and tie you to one platform

This is the "vendor lock-in" argument. But the fact is that PL/SQL runs on multiple databases.

Such as DB2:

"IBM DB2 9.7 for Linux, UNIX, and Windows has out-of-the-box support for Oracle's SQL and PL/SQL dialects. This allows many applications written against Oracle to execute against DB2 virtually unchanged."
And Postgres (EnterpriseDB):

"Postgres Plus Advanced Server implements a comprehensive suite of Oracle-compatible functionality within and around the core PostgreSQL engine, including: (...) Oracle SQL syntax and semantics, Functions and Packages, PL/SQL (extensive support)"
Add to this the fact that the Oracle database runs on more operating systems than any other database, which means that your PL/SQL code will seamlessly transfer from Windows to Unix to Linux-based systems.



So PL/SQL-based code can actually be said to be more portable than, for example, .NET code (despite the existence of Mono). There are very few truly portable technologies; even Java is "write once, debug everywhere".




Myth #10: It's stupid/dangerous to put business logic in the database

This claim is usually made without any specific reason as to why it is stupid or dangerous. It usually "just is", because it is "against best practice" and "everybody else is putting the business logic in the middle tier". Sometimes it is claimed that putting logic in the database "mixes concerns", which must be a bad thing.

The problem with "business logic" is that nobody has a clear definition of what it is (but "you'll know it when you see it"). For example, where do you draw the line between "data logic" and "business logic"? Primary keys, foreign key constraints, unique key constraints, not null constraints, check constraints -- are these "data logic" or "business logic"? "Discount must be between 0% and 5%", is that a business rule or a data constraint, and/or is it a validation rule in the presentation layer?

The fact is, if you move ALL your logic into stored procedures, you entirely avoid the "mixing of concerns" between the data tier and the logic tier. (And if you think such an approach dooms your project to failure, consider the next myth, which features an example of a massive [and wildly successful] application written entirely in the database.)



Oh, and by the way, if your business logic is somewhere else than in the database, you always run the risk of someone or something bypassing your middle tier (for example by logging in with SQL*Plus), directly updating the database and possibly corrupting the data.

So let's turn this around and conclude instead that:

"If your business logic is not in the database, it is only a recommendation."



Myth #11: Stored procedures can't scale

A frequent argument against stored procedures is that by placing all the work in the database server, your solution won't be able to scale up, because you need "application servers" in the middle tier to do that. The scalability of the database is limited by the fact that you can only have a single database server (or you need to rewrite your code to work with partitioned/sharded databases like Facebook have done).

Of course, a lot of the people who throw around this kind of argument have never worked on an application or website which needed to scale up to millions of users (and to be clear, neither have I). That's because the vast majority of us work on much smaller enterprise business systems or "normal" websites (perhaps even the kind of website that can be well served with free database software on a server with less juice than your laptop).

But stored procedures CAN scale. It's only a matter of money. And if you have millions of users, you should be able to afford decent hardware.

Let's use Oracle Application Express (Apex) as an example of a big and complex PL/SQL application:

"[Application Express] lives completely within your Oracle database. It is comprised of nothing more than data in tables and large amounts of PL/SQL code. The essence of Oracle Application Express is approximately 425 tables and 230 PL/SQL packages containing 425,000+ lines of code."

This PL/SQL application can be deployed anywhere from your laptop to any kind of server:



The biggest and fastest server you can buy is currently Oracle Exadata.

"An 8 rack configuration has a raw disk capacity of 3,360 TB and 1,680 CPU cores for SQL processing. Larger configurations can be built with additional InfiniBand switches."

Oracle makes bold claims about this machine:

"Oracle claims that (..) two Exadata database systems would be able to handle Facebook’s entire computing load."

It's hard for me to verify that claim, not being associated with neither Oracle nor Facebook, but let's assume it has at least some truth to it.

So what about running our "stored procedure" application on Exadata?

"Does APEX Work on Exadata?
"Yep, Exadata is just Oracle. Let me say that again: It’s just Oracle. Exadata runs an 11.2 Database on Linux x64. It’s the exact same binary install if you download those binaries for “generic” Linux x64 11.2 from OTN. So, if your code / app runs on 11.2, it runs on Exadata. (..) The APEX Dev Team (my old friends and colleagues) did absolutely nothing to port their code to Exadata. I've run a number of customer benchmarks with customer's data and queries and have yet to make a single change to their queries or structures to make them work on Exadata."

So... without changing a single of those 425,000 lines of code, this "stored procedure" application can run on my old laptop (I've even tried it on an Asus EEE netbook), or it can run with 1,680 CPU cores. Without offloading any logic to an application server.

I'd say that's pretty scalable.

54 comments:

Roel said...

Hear! Hear!
A very fine piece of myth busting, Morten!
Cheers
Roel

Filipe Silva said...

Great Idea! Great Post!
Thanks!

Peter Raganitsch said...

Beautiful!

Chandra Movva said...

Great buster. Comprehensive enough.

David Clement said...

Well said. The stored procedures myth is a pernicious one. It can't be busted too severely. Thank you!
David Clement

Gert Poel said...

Nice read!

Carsten said...
This comment has been removed by the author.
Chris said...

Nice blog !
But does the term 'Vendor lock in' ring a bell ?

grtz,
Chris.

Morten Braten said...

@Chris: See Myth #9.

- Morten

David said...

Myth #1: Bernie is dead

BUSTED: No he's not! He's holding a drink and waving at people!

Anonymous said...

Well, I guess you missed Myth #12: PL/SQL code is ugly.

I'm pretty confident, that you cannot bust this. If you ever happened to solve your problems in beautiful and elegant languages like Python within a few lines and everyone is able to read and understand what this code is doing, then you would never again like to express the same logic with the ugly, uncomfortable syntax of PL/SQL.

And regarding your Myth #11: "It's only a matter of money." Well, Oracle can virtually use Exadata for free as they don't have to pay any fees to themselfs. But anyone else this would cost at least seven figures of USD. And that's the point where most people realize, that there's more than one way to do it and probably a cheaper and still mature one.

Pablo said...

The fact that everyone is against what you suggest here should ring some kind of bell.

Yes SP can be versioned and, in the tools you show, you can right-click -> extract procedure, and even some guy wrote about the importance of unit tests in SP's.

Those are all valid points, but they lack context.

* Unit tests for SPs will never match tools like RSpec for ruby.

* The fact that you can make a kind of "struct" in SQL is no match for Scala traits or ruby mixins. I don't think there's even something like inheritance in there...

* Though everyone can write good and bad code in every language, it's often easier to screw things up when the language is not suited for the task in hand. Try for example implementing a logger with a rolling file strategy (something really common) in SQL and in ruby. Then tell me what was easier and which version looks more like spaghetti.

Bottom line is, you have to use the right tool for the right job. Even with this (IMHO) minor improvements, SQL is no match for a full programming language for doing anything but database read/writes.

Unknown said...

good article...nice post.
thanks
-amit

Anonymous said...

Another point about vendor lock in. You are going to have this problem even if you do queries from the middle tier.If you have all your queries as stored procedures you can at least easily change the query inside the storproc if you change database. Instead of cluttering your middle tier with conditional queries depending on the database used.

ivanhoe said...

#12: For stored procedures you need PL/SQL programmer, who is usually more expensive and hard to find (at least good one). If he/she leaves, it can be even a bigger problem to find another one willing and capable of catching up fast with someone else's stored proc code.

Now, stored procedures also have many great features, but it's not the subject here...

Morten Braten said...

@Anonymous: "PL/SQL code is ugly."

Well, "ugly" is in the eye of the beholder. No point in trying to argue about something which is entirely subjective. But let me know what you find ugly about, for example, this piece of plain PL/SQL code:

http://4.bp.blogspot.com/_dBAKxejXABM/TVMT-nEd95I/AAAAAAAAAVk/Eiex1i8IxzY/s1600/SNAG-0825.jpg

"solve your problems in beautiful and elegant languages"

Again, what is beautiful and elegant is subjective, but I for one happen to find, for example, SQL analytics very elegant.

Some examples here:

http://www.adp-gmbh.ch/ora/sql/analytical/rollingavg.html

and here:

http://technology.amis.nl/blog/6404/oracle-rdbms-11gr2-solving-a-sudoku-using-recursive-subquery-factoring

"this would cost at least seven figures of USD"

Well, that was kind of my point when I said "it's only a matter of money", right? And I also said that if you have millions of users, you should be able to fork over some serious cash for serious hardware (otherwise there might be a problem with your business model). And 30,000 servers running Linux and sharded MySQL are not exactly free either (both in terms of hardware and management).

"still [sic] mature one."

So you're saying that all the new alternatives that have appeared in recent years are "more mature" than stored procedures (PL/SQL) which has been available for 20+ years?

- Morten

Felipe said...

Too risky to store business logic in the database yet. Oracle may be in the correct path, but the rest i don't think so. Therefore as an architect you shouldn't take that risk then, put the business logic into a more portable and easy to maintain technology like a robust language.

Morten Braten said...

@Pablo:

"The fact that everyone is against what you suggest here should ring some kind of bell."

See Myth #10.

" I don't think there's even something like inheritance in there..."

From the 10.1 docs (http://download.oracle.com/docs/cd/B14117_01/appdev.101/b10807/10_objs.htm):

"PL/SQL supports a single-inheritance model. You can define subtypes of object types. These subtypes contain all the attributes and methods of the parent type (or supertype). The subtypes can also contain additional attributes and additional methods, and can override methods from the supertype. You can define whether or not subtypes can be derived from a particular type. You can also define types and methods that cannot be instantiated directly, only by declaring subtypes that instantiate them."

I didn't say that PL/SQL has OO features to rival "pure" OO languages, but the option is there if you want/need it.

"Try for example implementing a logger with a rolling file strategy (something really common) in SQL and in ruby. Then tell me what was easier and which version looks more like spaghetti."

OK, here's my rolling logger in SQL:

create table log (log_date date, log_message varchar2(255));

create view log_recent_entries
as
select *
from log
where log_date > sysdate - 7;

Done. Sorry, no spaghetti.

Probably not what you expected, but like you said yourself: "Bottom line is, you have to use the right tool for the right job."

- Morten

R. Kyle Murphy said...

My biggest problem with stored procedures is two fold. One, it splits your business logic up across multiple layers. It's hard enough keeping things up to date with all the logic concentrated in the app server, but when you take chunks of it and start spreading them across the DB as well it gets confusing very quickly. Secondly, I'm not a DBA, and I have no particular interest in becoming one. I use ORM tools so that I don't have to bother with the DB, and then other layers of abstraction bellow that to ensure that I don't care where my data is coming from. So far as I'm concerned the DB is just a high performance flat file, I want it to store and retrieve my data, and then get out of the way so I can get things done.

To this end, things like Redis are looking really nice, they provide a high performance system for storing and retrieving structured data, without all the messy legacy overhead of a traditional DB.

Morten Braten said...

@R. Kyle Murphy:

"One, it splits your business logic up across multiple layers."

See Myth #10. If you move ALL business logic into the database, you don't have to worry about any split. No "object-relational mapping", no "impedance mismatch". Everything is in one place and written in one language (which happens to integrate extremely well with the data storage, ie tables and other database features).

"So far as I'm concerned the DB is just a high performance flat file, I want it to store and retrieve my data, and then get out of the way so I can get things done."

OK... that's certainly a choice you are free to make. If you don't see any value in the database, why bother with it at all? Why not just use a regular flat file?

I guess you'd have a lot more "things" to get done, as you would have to reinvent lots of the useful things that the RDBMS gives you for free...

- Morten

R. Kyle Murphy said...

Regarding your first point, yes, you could move all logic into the DB, at which point your UI is essentially a very thin wrapper over the DB. Personally I find PL/SQL to be rather difficult to work in, but that's mostly personal preference and you obvious feel differently, so we'll just agree to disagree on that one and call it a wash.

"OK... that's certainly a choice you are free to make. If you don't see any value in the database, why bother with it at all? Why not just use a regular flat file?

I guess you'd have a lot more "things" to get done, as you would have to reinvent lots of the useful things that the RDBMS gives you for free..."

Yeah, pretty much that. I could use just a flat file, and handle writing/reading from it, as well as indexing, and performance optimizations, but why re-invent the wheel? This is why I mention Redis. It's highly optimized, has bindings for nearly any language you can think of, stores structured data (something not even traditional SQL handles very well as the entity-relationship model is sometimes difficult to map into the OO world and vice versa), and best of all, it's simple. It does everything I need from a data storage perspective, and doesn't require me to worry about any of it myself. I don't even need a ORM framework for it because I can store object graphs as hashes.

Eb said...

I really appreciate this post because it shows what can be done with sprocs. Interestingly, the db developers I've met have not cared as much for the things you can do such as good reusable design, unit testing, proper source control/versioning etc etc. The practices tend to be found and followed more so by programmers using C#, Ruby, Java etc etc which is why these myths are prevalent. That said, I would suggest there is a more support e.g. mocking frameworks (which I'm yet to see for the database).

Still good stuff. We need to be aware and make informed choices.

Pablo said...

@Morten

"See Myth [NUMBER]"

Sorry, quoting your own post (that I've already read BTW) won't do. Please stop pointing at this Myths as THE LAW.

"PL/SQL supports a single-inheritance model."

As I said, that model is little more than a toy compared even to java. Moreover PL/SQL is not standard, which means lock-in with a vendor like Oracle (hint: no bueno).

And about your log example, I meant a text file like any logger out there. Something you can tail, share via email, grep, etc. I don't know anyone that logs stuff into a DB (right tool for the job, remeber?)

Actually it's not a bad thing that SQL sucks at these topics. It's a good reminder that some tools are simply not suited for some tasks by design. SQL is good for the data layer and it should stay there. Please leave it there.

Thank you.

rane said...

may be ORACLE paid you to write this report, if not you should ask them to :)

unfortunately, those who are developing softwares for/using opensource (MIT/BSD license) will agree with the myths ..
Most of the myths have been busted for Oracle and MSSQL, for the poor users who can only afford MySQL, Postgres the myths still stand. Tell me which of the top 100 websites use Oracle/MSSQL (even if someone does, the percentage would be smallish).
Most of the myths got busted becuase of the IDE provided. So may be one of the best practice to start using Stored-Procedure is to have a good IDE.
The most painful part, Managing stored procedures is a bitch compared to logic layer in any other language, which _programmers_ are always most comfortable with. I doubt Stored Procedures will provide the agility needed for software development.

Having said all this, I might still use stored procedures because sometimes they are much more efficient, but if you are talking about maintaining and active development, I will shy away from using them because I don't want my programmers to debug convoluted logic in SQL when it can much easily be tackled in a better syntactical language.

Morten Braten said...

@Eb:

"The practices tend to be found and followed more so by programmers using C#, Ruby, Java etc etc which is why these myths are prevalent."

There is some truth to that. Failure by database developers (or more likely, "regular" developers dipping their toes into stored procedure programming) to realize that stored procedure code is "real" code, and needs to be version controlled, etc. leads to the misconception that "stored procedures can't be version controlled", etc.

From my post it should be clear that this is more of a "people" thing than a language/tool thing, however.


- Morten

Morten Braten said...

@Pablo:

"As I said, that [inheritance] model is little more than a toy"

No, what you said was: "I don't think there's even something like inheritance in there...". And I showed you, for a fact, that PL/SQL has OO inheritance. You're free to call it a "toy" if you want, but that's subjective.

"I don't know anyone that logs stuff into a DB"

No, why would anyone want to do that? I mean, it's not like databases were MADE to store data (and to make sure the data is transactionally consistent, durable, being backed up along with the rest of the data, etc. -- quite unlike your text files).

"Something you can tail, share via email, grep, etc."

Now you're telling me that I need to log to text files in the operating system so I can use grep and tail on them? Did you forget about the extremely powerful query language (I believe it's called SQL) that's available right there in the database to query the log tables/views? (And yes, you can send email too from the database.)


- Morten

Morten Braten said...

@Rane:

"may be ORACLE paid you to write this report"

Nope, I don't work for Oracle. I just think that they have an excellent database product. But there is always room for improvement.

"for the poor users who can only afford MySQL [and] Postgres the myths still stand."

I beg to differ:

a) Oracle offers a FREE version of its database, Oracle Express Edition (XE).

b) There are several very cheap hosting options that give you access to a (shared) Oracle database.

c) The Standard Edition One (SE1) version of the database is priced around USD 5,000 -- hardly a huge sum.

"Tell me which of the top 100 websites use Oracle/MSSQL"

I think I made a point in my blog post about the vast majority of us NOT being one of the "top 100 websites". If your company has 1,000 or even 10,000 employees (or even 1 million customers), it's hardly relevant what Facebook does to serve 750 million users. Why would you let the extreme exceptions guide you in your technology choice?

"Managing stored procedures is a bitch"

Care to be more specific? That is exactly the kind of statement that gets thrown around without any explanation or justification.

- Morten

Morten Braten said...

BTW, this post was picked up by Hacker News yesterday and generated a lively discussion there:

http://news.ycombinator.com/item?id=2832246

- Morten

Eb said...

@Morten:

Fair enough although I would suggest from that my experience that languages such as C#, Ruby do a "good enough" job that there is at times no need to dip further (but that shouldn't lead to the creation of myths). On the flip side, the traditional "database developer" (who invented that title?) that I run into uses few (if any) of the good programming practices out there.

So you are right, it does end up being a people problem as I described my blog posting reaction to this: http://wp.me/p57tU-3Q

Anonymous said...

Lets have an Oracle DBA party to celebrate!

Anonymous said...

Oracle much???
The idea of mythbusters-style post is awesome and I have to admit the content is well written (although vendor specific) but:...

#1: vendor specific, what about a solution for Oracle's main rival SQL Server ??

#2: The information is indeed there but: try tracking it down in a moderately complex system and tell me how long will take you. By the way you better pray that documentation is up to date because making changes in your db WILL result in things breaking

#3: Granted that one is busted BUT frankly it was about bloody time. Surely we don't want to go down the road of comparing them with developer IDE's (see VS or Eclipse)?

#4: That one is an axiom in software development

#5: see #1 plus although in SQL Server for example it is possible to import and use C# libraries from within the DB server I personally think it a horrible idea

#6: see #5

#7: it is certainly doable but not as productive or as pain free as I am sure we all wished it would be (although we are getting there)

#8: see #1

#9: see #1

#10: The architecture argument is valid and millions of keystrokes have been wasted on it. It comes down to the architect, the project requirements and constraints and so on and so forth...

#11: With money and patience everything scales

I thoroughly enjoy your blog, keep up the good work
Cheers

Ola Roos said...

This article is extremely true !

Thanks !

Mathieu said...

This is a very interesting post.
As you can read in the comments people have different opinions.

Strictly spoken if the logic layer is not implemented in the database, it would mean that tables would not contain any primary -, foreign -, unique and check keys. No not null restrictions. Just data.
This is something I strongly would not advice. Use the power of the Oracle database. So there always will be some kind of logic in the database anyway.

What about the logic that can't be implemented in tables keys? Before deciding where you are going to put it, think about it:
Are you going to perform only single DML statements or will you have to perform also bulk DML statements? What type of developers do you have available? Is it a closed environment or is anybody allowed to write an application that manipulates the data? Which layer is probably the first to be changed (Most of the times is not the database. In cases the datamodel has changed your logic has to change as well)?
It all depends...

I would personally use PL/SQL for the logic layer in case of the Oracle database. Pl/SQL has special been developed for it. If you want to pass the logic layer and manipulate the table directly, with PL/SQL this would be the hardest way (especially when there triggers on the table). PL/SQL is maybe not as beautiful as Ruby. Still with a good developer tool it is very easy to write and read.
But this is personally.

One thing I also believe is: keep it simple. Use not to many layers. With many layers it is getting more difficult when there is a performance issue or if you want to debug it. This is what I like about APEX. It has a simple but strong layer structure.

Anonymous said...

ALL of the problems that developers and architects have with relational databases are due to a lack of knowledge and skill. I am so tired of people telling me that something is wrong with a technology that they do not even remotely understand. Most projects I've seen have done such a poor job of discovering the business process that it is no wonder they can't figure out how to use a database. Most databases I've seen have are so poorly designed and so filled with redundancy and trash that they are almost completely unusable for business purposes.

This whole debate reminds me of a spectator telling Steinitz (chess champion of the world) that he couldn't understand his game. Steinitz replied, "Of course not! Have you ever seen a monkey playing with a watch?" Unfortunately, most developers don't even rise to the monkey's level.

Kyle Hailey said...

awesome post and nice format , ie myth busting and the graphics. When I run into a bunch of Java developers hard core laying down the law on no stored procedures or other similar tactics I'm always amazed out how smart people argue for bad ideas. One of the better coders I ever worked with wanted to ditch Oracle and write a transactional data storage layer himself. He was a smart guy but smart people often blindly argue against anything they've never used because if it was good, of course they would be using it.

Anonymous said...

Wow, now that people are quoting me (#11, last quote) I really should be more careful about what I say ;) I believe the Facebook quote was in reference to Exalogic and specifically http requests, but that is definitely not from me.

Anonymous said...

I am working in a large software engineering firm with 17k employess and here stored procedures are a "best practice". Many people just dont know the power of SP. Awesome article :-)

Anonymous said...

pl/sql is the best. it iseasy to code,easy to maintain and no other thin can match the performance.

Tony Andrews said...

This is brilliant. I am saving this link to use in response to myth-peddlars in future!

Anonymous said...

I've got to agree about the ignorance of many developers when coding against (and I do mean "against") Oracle. The other factor that comes into play is that it takes at least 10 people to get any significant work done in Java. Management packs so many Java drones into the organization that they wind up having numbers on their side and the next thing you know ... no more PL/SQL (nobody can do anything without those elegant curly braces).

Will Sisson said...

Interesting article. I wholehearted agree with placing logic in the database.

I think it is worth pointing out that these techniques are just as applicable to (almost) all SQL DBMS rather than just Oracle. Database centric practices are just as viable in SQL Server or Postgres for example.

Given that the RDBMS is directly based on logic, as soon as you have created a table you have started putting logic in the database, so why not be consistent and put all the logic there?

Anonymous said...

I am a performance/tuning expert. And really nothing matches the performance of PL/SQL.
You just need INTELLIGENT programmers instead of..

Chaitany Kulkarni said...

Same thing I have asked in Postgresql mailing list whether postgresql can be used as Application Server. Many replies have said they are using the same for years. Now I can confidently say yes it is, not just postgresql but also oracle.

Thanks.

Pete B said...

Hmmm, there is one very BIG disadvantage to having all your business logic in the database!

If I want to validate something as simple as a phone number on the UI and all the logic is in the database then I would have to write some nifty logic in the database and make the front-end call through to the middle-tier that then calls the database (which doesn't even touch data in the database only the value I've passed in) run it's validation and pass back (via the various tiers) back to the front-end to say INVALID PHONE NUMBER?!

That's daft you say! Simply add the same logic in the UI so you don't have to call the database... Right, now the phone number format changes and we need to store the country code. I change the UI but not the database... Whoops, now they don't perform the same validation?!

Alternatively you could write it in shared code that both the UI and back-end can access? That way the UI can validate a phone number instantly without having to go all the way to the database?!

Databases are for storing data! Sure, have your data integrity and things that databases are good for but don't just dump all your logic in there because you can't be bothered to architect the system properly!

aht said...

I'm really on your side. I happen to think that SQL is a high-level language specially suited for data management (specially validation) and so it's no-sense not to use it in favor of a more general purpose language where you need to re-invent several wheels. The case with procedural languages is exactly the same.

I also agree with you in your myth busts. Although in the PostgreSQL world, mostly everything you say applies similarly (BTW, @rane, don't put PostgreSQL and MySQL in the same bag and call them "poor"; PostgreSQL is extremely featured and able to do anything that Oracle does shown in this post).

However, there's one valid point IMHO in not putting the business logic in the database: the database is a precious, scarce resource. Write scaling a database is hard, and so most uses cases really want to have a single central server (however with read replicas for read scaling). What is definitely cheaper is to have several application servers running on commodity hardware rather than buying an Exalogic.

So this is what I do: I clearly separate what I also call "data logic" from "business logic" and have the former reside in the database and the latter in the application servers. Also data logic may be replicated in the application servers, so validation does only happen in the database "as a last resort", but always keeping your data model consistent. Yeah, its a trade-off, as the "business logic" is spread in two tiers and also some data logic may be needed to keep in sync. But IMHO it's worth: you still have a db which ensure all your invariants, while having a cheap way of removing processing from the scarciest resource (the database) to cheaper application servers.

Anonymous said...

How do you incorporate user extensibility into the logical processing capabilities of the Oracle Database (i.e. PL/SQL extensions) ?

At the middle tier we can integrate disparate sources of data (i.e. foreign hardware) to extend the capabilities of the business logic as expressed in Java, PHP, C etc. etc. How would you control an industrial or manufacturing device for instance? Could you write functions/procedures at the database level that directly control these foreign devices without going through the middle tier?

Morten Braten said...

@Unknown: "How would you control an industrial or manufacturing device for instance? Could you write functions/procedures at the database level that directly control these foreign devices without going through the middle tier?"

Depends on the device, but yes, assuming the device has some kind of control API and can be reached over a network, then you could use UTL_HTTP and/or UTL_TCP to communicate with it directly from the database.

Depending on the database version and edition, there's also the ability to execute OS commands via DBMS_SCHEDULER, external C callouts, and Java stored procedures, all of which run on the database server and not the middle tier (ie not on another server).

- Morten

Anonymous said...

Thanks Morten.

Those examples may work but they are cumbersome, reliant upon the middle tier (i.e. UTL_HTTP) or are unsuitable for real-time interaction with complex hardware devices (i.e. telecommunications equipment).

I must comfess it was a leading question given that my company, dbPlugins, has developed a 'User Extensible Database Plugin Architecture'.

This architecture allows one to easily integrate to any 3rd party technology API and express that API as a PL/SQL package. The package then becomes a universal database layer API that can be called by any modern programming language.

A database plugin gives the data-tier the same level of programable extensibility that is available at the middle tier.

My company is looking for beta testers to validate and comment upon this emerging technology.

We have working examples of plugins that distribute in source-code format which integrate to a GeoIP API and to tarlib.

The plugin server effectively masks the complexity of database integration allowing the author to focus on the task of integrating to the 3rd party API. There are two components that an author writes, the PL/SQL package that faces the user application and the 'C' based plugin module that interfaces to the 3rd party API.

SQL*VOX - a more complex plugin - interfaces to the telecommunications API from Dialogic allowing for IVR, VOIP, Text-To-Speech and ACD capabilities from a database centric point of view.

This is very innovative technology that you will not find anywhere else. I welcome any interest you may have in evaluating this technology.

Steve Guilford
Founder, CEO
dbPlugins
dbPlugins

Will Sisson said...

@Pete B

The telephone number type is defined in the database.

It is easy to derive the UI validation out of the database metadata.

By leaving out the middleware completely you avoid the danger of your middleware definition being inconsistent with your database.

One level less is always good.

Will Sisson said...

@Pete B

"Databases are for storing data!"

This is another myth that needs to be busted.

Disk drives are for storing data. An RDBMS represents (not stores) the data in a manner such that it is possible to perform logical inference on the data to derive new facts from existing ones, using the relational operators.

SQL supports standard logical operators like quantification natively. Popular middleware languages like Java, C# and Python force you down to using machine level constructs like loops.

SQL is the nearest you can get with a widely used commercial product to programming directly in logic.

Thus doing all the logic in the database delivers not only higher productivity, greater reliability and faster performance; it is also fundamentally more logical.

Seth Griffin said...

Morten,

Brilliantly written.

To begin with I am a programmer who works primarily in the middle layer and am a strong advocate of business logic in the middle layer.

You make a lot of good points. The advocates of logic in the middle I think often work on very simple applications where there is one application per database.

When you start hiring a consulting team or purchasing software that wants to integrate with your database it is much better to have business logic in stored procedures to prevent catastrophic data integrity issues.

The only way that logic in the middle layer works every time is in a SOA system where ALL applications are FORCED to use the web service bus. In other words, put all logic in the database or use a SOA with a well defined and documented service BUS.

It seems to be developers who work on stand-alone applications and don't do enterprise level development who say things like "XML is horrible, use JSON" and "data belongs only in the middle layer." Really? Where does your portability come from? What if your third party app provider doesn't know what JSON is and equates web services with "SOAP"? How does one dynamically generate consuming web services based on a JSON endpoint that merely dumps data in javascript notation?

In my experience, both work very well if done properly. To say one or the other is necessarily bad is hubris.

Well done.

Anonymous said...

I know, it is an old post, however...

When I was reading it I was very much tempered by the 'what for' hammering my mind (even if I found your arguments consistent in itself). As key result I got from your post that you can put any logic into the database and you can't. Let's focus on applications that use browsers of presenting the UI, which I usually deal with. My experience is, that on the server you need any kind of code that:
1. transforms table data into html,css,javascript,jpeg,etc.
2. can communicate via http
3. can cope with browser communication ($_POST,$_GET) and eventually authentication mechanisms

I trust your blogspot pl/sql to be able to cope with 1. I am too little in the topic to judge 2 and 3. My best guess would be that at least you require some adapters, which in worst case are self coded. As long as there are out of the box adapters, full stop - all fine. If you have to code something, which I would assume, you need to put any kind of logic onto your middle tier. Basically I am not interested in these fights about, what is business logic and what not. I am just worried to put lines of code on different places, some in the middle tier and some into the database. Jumping between buckets has a potential to become a monster debug challenge. So how would argue on that one?

Also I am doing lots of things in php, because of its extensive support on associative arrays, implicit type conversions, rich inbuilt libraries, large user base with tons of ressources for each and every problem you can imagine and even on things you could not imagine, that they existed. Because I do not want to maintain code at two places, a technology change (which might be beneficial) would be an all or nothing. So would I find all this for pl/sql as well? At this point I have to admit that I so much frightenend to displace code from each other that I usually avoid any places with poeple who promote mvc object models and frameworks. I find encapsulation eg. in libraries useful, but this in my eyes has more of sort&order than just shooting everything into an mvc model. I even try to avoid to overuse objects as I do not find them appropriate to serve web related coding challenges. However different topic - I just wanted to point out, that my personal fear to push down code into different alleys just for the sake of it is great enough to avoid experiments on that edge.

Mattster said...

Well said. Myth #10 covers the most serious deficiencies in the latest-hyped approaches to application development. This is why crap technologies like Object-Relational Mapping (e.g. Hibernate, Entity Framework) proliferate; and why systems that are built upon such technologies become increasingly cumbersome and badly-performing whenever they are pushed to do anything more trivial than a simple blogging engine.

I cannot fathom how some (most, actually) developers come to the conclusion that "business logic" or "rules" should be stored outside the database.

These rules *are* data! Thus they belong in the database such that the DBMS can enforce their correctness!

As an aside, do these developers even learn the notion that "code *is* data"? (See http://c2.com/cgi/wiki?DataAndCodeAreTheSameThing for more)

There are reasons why idiotic OR/M technologies proliferate:

- Too many developers have sub-standard skill in SQL and view the OR/M as a magic bullet that both designs and queries the database for them. The results are always laughably inefficient and frequently mandate bad (or at best questionable) design principles (e.g. mandatory artificial identity keys).

- Many businesses aren't aware of the rules they have in place and thus aren't explicitly specified. For example, a price may not be negative or we'll end up paying for customer purchases! This sort of rule is trivially easy to express as a constraint on an attribute. Bearing in mind that all unit testing frameworks feature "ExpectedException" checks, such a rule is also trivially easy to unit test.

- To be fair, SQL (even the ANSI standard) is quite a poor language. A language that was fully compliant with relational algebra (I'm thinking of Date & Darwen's Tutorial D) would offer better expressivity for awkward business logic. It would also mitigate the vendor lock-in myth. Unfortunately, entrenched standards are hard to shift; this won't happen until enough database practitioners become knowledgeable enough to demand such a language. Perhaps the open-source community could give the established database vendors a fright.

Keep fighting the good fight.

Scott Wesley said...

I'm glad this thread was revived recently on twitter, I think myth #10 regarding logic in the database is quite relevant regarding APEX. I've recently heard a DBA still think it's strange/bad to have application source data stored in the DB, though I'm yet to hear reasoning.