Knarvik

Rationale

Object-oriented programming is not bad

Object-oriented programming is absolutely cool. It is easy to show that OOP brings benefits to both the philosopher and the peasant.

A typical peasant is supposed to work on some kind of practical and useful program. The size of most useful programs is somewhere between large and really, really huge. Peasants often need to look for certain code items like functions and structures in an enormous codebase as they fix bugs or cram in new features. Classes make sense for peasants as a commonly accepted way of grouping and arranging code and therefore bringing in at least a semblance of order. Putting code into class methods also makes life easier for peasants because they can type less. Instead of suffering from lengthy

MegaCollectionAdd(collection, item)

they enjoy more elegant and terse

collection.Add(item).

With OOP, copy-paste oriented design is much easier, too, since the peasant can paste the above line wherever there is an object whose class has an Add method. Last but not least - virtual functions! Peasants are totally crazy about them, and for good reason. With virtual functions, the compiler does two nice things for a peasant: it makes code that calls functions through pointers shorter, and it guarantees that the type casts such calls require are safe.
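
The two favors the compiler does with virtual functions can be sketched as follows. This is only an illustration: CShape, CSquare, Shape, Square and the helper functions are hypothetical names made up for the example, not part of Knarvik.

```cpp
#include <cassert>

// By hand, C style: a function pointer plus an untyped object pointer.
// Every implementation must cast void* back to the right type, and
// nothing stops a caller from pairing the function with the wrong object.
struct CShape {
    double (*area)(const void* self);  // hand-maintained dispatch slot
    const void* self;                  // object passed explicitly on each call
};

struct CSquare {
    double side;
};

double CSquareArea(const void* self) {
    assert(self != nullptr);
    const CSquare* square = static_cast<const CSquare*>(self);  // unchecked cast
    return square->side * square->side;
}

double CallAreaByHand(const CShape& shape) {
    return shape.area(shape.self);
}

// With virtual functions: the compiler builds the pointer table and
// supplies `this` with the correct type, so the cast disappears.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double Area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double Area() const override { return side_ * side_; }

private:
    double side_;
};

double CallAreaVirtual(const Shape& shape) {
    return shape.Area();  // shorter call, no cast, no explicit object argument
}
```

Both calls compute the same area; the difference is that the first version compiles happily even if the void pointer refers to something that is not a CSquare.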

On the other hand, the major problem with a philosopher is that his mind is weak, small and unable to look at the whole program at once. The philosopher needs to break the problem into pieces which he will then swallow one by one. OOP offers great help when one wants to divide and conquer. First, classes slice the fabric of a program into crisply defined pieces. Each piece contains the data structures and algorithms related to a single kind of entity, which is particularly convenient because the philosopher's brain is trained to deal with entities (or objects): apples, oranges and the like. When a philosopher stares at a picture displayed on a monitor, what he sees is objects, not a grid of pixels. Second, philosophers sometimes feel that certain things have a lot in common. When a philosopher discovers that common part he becomes happy because he no longer needs to deal with the sheer complexity of apples and oranges - knowing that something is sweet and juicy is enough information. That's how powerful inheritance is.

As one can see, true advantages of OOP do exist and they do real good for people. Maybe it is no accident that OOP has become the world's official programming religion.

Object-oriented programming is not good enough

Seeing the world as a heap of objects is natural but childish. We need a more adequate analytical tool to describe problems. An interesting thing about our world is that things are connected to each other and interrelated. That's why it might be beneficial to add the notion of relation to the software modeling language. Objects together with relations as first-class language elements have proven their worth in the database world. Although database experts are a very special kind of people dealing with quite specific problems, dismissing their experience entirely would not be wise.

A good class is responsible for the data it contains. In other words, class methods maintain the integrity of the data stored in class fields. Some of the fields are references to objects that exist outside of the class. Who is responsible for the integrity of these references? Most often, no one is. Ideally, it should be the class that owns the instances of this class: the supersystem. But there is a catch here. Managing links between objects involves coordinating their life cycles. This is usually done by adding implicit contracts that are undocumented, not a part of the language and extremely hard to test. Failure to comply with such contracts, or their absence, leads to extremely ugly bugs, such as dangling pointers in C and "zombie" objects in garbage-collected languages. It is possible to implement a good runtime check of reference integrity within a "master" class, but this is something that database engines already do! The sad fact that programmers are notorious for re-inventing basic database functions in their systems is a strong indication of the necessity to go beyond OOP towards the relational model. A very good example is "smart" collection classes that have self-made indexes based on hash maps or red-black trees. Even in the 21st century, programmers still have to write their own insert, delete and update methods, keep indexes up-to-date and hand-craft hashing and comparing functions.
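
For the record, here is roughly what such a hand-made "smart" collection tends to look like. It is a minimal sketch under stated assumptions: Employee and EmployeeCollection are hypothetical names invented for this example, with a primary store keyed by id and one secondary index by name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

struct Employee {
    int id;
    std::string name;
};

// Primary storage keyed by id plus a secondary index by name.
// Every mutating method has to keep the two structures in sync by hand.
class EmployeeCollection {
public:
    bool Insert(const Employee& e) {
        if (!byId_.emplace(e.id, e).second) return false;  // duplicate id
        byName_.emplace(e.name, e.id);
        return true;
    }

    bool Delete(int id) {
        auto it = byId_.find(id);
        if (it == byId_.end()) return false;
        EraseNameEntry(it->second.name, id);
        byId_.erase(it);
        return true;
    }

    // The "update" method: changing an indexed field means re-indexing.
    bool Rename(int id, const std::string& newName) {
        auto it = byId_.find(id);
        if (it == byId_.end()) return false;
        EraseNameEntry(it->second.name, id);
        it->second.name = newName;
        byName_.emplace(newName, id);
        return true;
    }

    std::size_t CountByName(const std::string& name) const {
        return byName_.count(name);
    }

    std::size_t Size() const { return byId_.size(); }

private:
    void EraseNameEntry(const std::string& name, int id) {
        auto range = byName_.equal_range(name);
        for (auto i = range.first; i != range.second; ++i) {
            if (i->second == id) {
                byName_.erase(i);
                return;
            }
        }
        assert(false && "name index out of sync with primary storage");
    }

    std::map<int, Employee> byId_;                      // typically a red-black tree
    std::unordered_multimap<std::string, int> byName_;  // hash index
};
```

Forget one of the index updates in Delete or Rename and the collection silently starts lying, which is exactly the kind of integrity problem a database engine handles on its own.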

To summarize, there is enough evidence that object-oriented programming alone is not helpful in many cases. Extending it with other paradigms is promising, and table-oriented programming is one reasonable way to do so.

The proposed solution

Ideally, in order to overcome the deficiencies of the existing frameworks, we need a brand new programming language with built-in table support, a compiler with strong artificial intelligence and an automatic bug fixer. The noble deed of designing and implementing such a language still waits for its heroes. In the meantime, we will pursue a much humbler and more reachable goal of building a tool for an existing language.

The tool will provide an API for manipulating data in the form of tables, rows and fields. The API will use the existing language idioms in order to make table-oriented programming natural.
We will choose C++. Why not Python, Java or C#? We select the C family because our goal is a RAD tool that will not trade performance and memory consumption for speed of development. Another thing is that one of the greatest advantages those high-level languages provide is garbage collection. The table-oriented approach takes care of memory management in most cases, so we don't really need garbage collection. But C++ sucks, why not go for plain C? See the MegaCollectionAdd example above. Despite all its horrors, C++ does save some typing. We are going to stick to a very narrow and toothless subset of C++ to make sure the bastard does not bite anyone. But who knows, we might switch back to plain C one day. Why not create a new language anyway? Sounds cool, but those who make up languages do evil - there are already quite a few programming dialects out there. Staying with an existing language is also convenient because developers can use the skills and tools they already have, such as code editors and debuggers.
The tool will generate source code. Of course it will: making just one more library does little good. A generic table library is inevitably weakly typed. Strongly typed code is easier to use in C++ and performs better. In addition, OOP languages are sold as high level, but they don't generate a lot of smart code for the developers. The ability of a tool to do at least a part of the work for the programmer is definitely a good sign. Granted, C++ generates a lot of stuff, but it does so behind your back, and that's why "guess how this code works" is the most popular question at C++ interviews. Autogenerated source code, by contrast, does not hide anything from you. Maybe that's how libraries will work in the future. Instead of being a dead binary (or a dead source code turd) the library will listen to the programmer and turn into what he needs it to be.
The input model will be in XML format. Doesn't XML look ugly? Yes, it does, but everyone understands this format now and our XML schema will be very simple. Isn't a GUI editor better? Sorry, no time for such a serious undertaking.

The above choices are aimed at producing a tool that, on one hand, is not too costly to develop and, on the other hand, gives an immediate practical value to the programmer.
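
To make the "weakly typed library versus strongly typed generated code" argument concrete, here is a small sketch. Everything in it is an assumption made for illustration: GenericRow, PersonRow, PersonTable and the "Person" table with "name" and "age" fields are hypothetical and do not show Knarvik's actual API or output.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Generic library style: one row type serves every table. Field names
// and field types are only checked at run time.
using GenericRow = std::map<std::string, std::string>;

int GenericGetInt(const GenericRow& row, const std::string& field) {
    // Throws std::out_of_range on a misspelled field name and
    // std::invalid_argument when the stored text is not a number.
    return std::stoi(row.at(field));
}

// Generated-code style: what a generator might emit for the hypothetical
// "Person" table. Field names and types are checked by the compiler.
struct PersonRow {
    std::string name;
    int age;
};

class PersonTable {
public:
    // The table owns its rows, so callers never allocate or free them.
    std::size_t Add(const std::string& name, int age) {
        rows_.push_back(PersonRow{name, age});
        return rows_.size() - 1;
    }

    const PersonRow& At(std::size_t index) const {
        assert(index < rows_.size());
        return rows_[index];
    }

    std::size_t Size() const { return rows_.size(); }

private:
    std::vector<PersonRow> rows_;
};

int TotalAge(const PersonTable& table) {
    int total = 0;
    for (std::size_t i = 0; i < table.Size(); ++i) {
        total += table.At(i).age;  // a typo such as `.agee` fails to compile
    }
    return total;
}
```

With the generic row, a misspelled field name surfaces only when the code runs; with the generated struct, it never gets past the compiler, and no string-to-number conversion is paid on each access.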

References

http://geocities.com/tablizer/oopbad.htm The canonical text and the source of inspiration for creating Knarvik. Knarvik, however, does not mind if you use OOP. First, Knarvik implements inheritance and virtual functions. Second, you may store instances of your classes in table fields if you wish.
http://geocities.com/tablizer/top.htm The explanation of table-oriented programming. Note that Knarvik does not fully implement all the features stated in that article. There is a form of control tables though - fields can store C-style pointers to functions.
http://www.embedded.com/1999/9908/9908feat1.htm
http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey
http://iraf.noao.edu/iraf/web/ADASS/adassproc/adass95/cogginsj/cogginsj.html
http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html