Tuesday, May 27. 2008
Last week, eZ Systems published my article "Creating Datatypes in eZ Publish 4". Working on this article gave me the opportunity to study how eZ Publish organises and stores content. In general, the concept is much more powerful than the simple Active Record pattern that I've seen misused in many PHP applications.
For more information on this concept, please refer to the article The eZ Content Model by Kore Nordmann.

One problem with the eZ Publish implementation is that all kinds of attributes are stored in the same table (ezcontentobject_attribute). If you look at the schema of this table, you will find the columns data_float, data_int and data_text. These three columns are used to store the data of an attribute. As you can see, every attribute can have at most one value of each type. And even if an attribute uses only one column to store its data, you still fetch all three columns on every page load.

My idea is to improve the eZ Content Model by making use of schema changes. Technically, it is no (great) problem to create, alter and delete database tables on production systems. I presume that you do not change your schemas too often and that you can make a full database backup before changing the schema.

The graphic shows a rudimentary database schema in which each content class is represented by a separate table. This means that a new table is created every time a new content class is defined in the CMS, and that tables are altered when content class definitions change.

Lukas Kahwe Smith was kind enough to listen to my idea last weekend and confirmed that he does not see a fundamental problem with dynamic schemas. He mentioned that most database management systems cannot include schema changes in a transaction, so schema changes must be made with even more care.

Some notes on the schema:

- Most primary keys are natural keys instead of surrogate keys. This helps a lot when transferring data from development systems to production servers.
- Attributes and content classes have an n:m relationship. This means you can reuse attributes in multiple content classes.
- The schema should be combined with a content tree to become useful.
- The schema could be improved by moving attributes that are not translated into separate tables. These attributes are then joined to each localized version of an object.

Do you see any problems with dynamic schemas? If you think the idea is good, then I'd like to implement a prototype with a YUI user interface.

Comments
I could add a revision comment field to each version to describe the changes made from one version to another.
Thanks to Midgard for the inspiration.
How is this coming along? I'm interested in the final perfs - and also the snags that such an implementation runs into.
Btw, the fact that content objects are linked to many version_class tables relates to different content classes, I suppose, not to many versions of the same content class. Which in turn means that when altering the class definition, all the existing data is immediately changed, right?

Btw:

- Would it make sense to also store the id of the class in the object tables, to ease the use of WHERE in the generated SQL, even though it would be a fixed value per table?
- Why not also split off one table per object per language?
- Adding a last-modified timestamp to your object tables is a good idea, as are created and modified dates on the class definition tables.
Links
I just found a blog post about the EAV pattern, which in turn refers to a recent issue of PHP Architect on the same topic. So I thought I'd sum up the progress on my idea for an improved content storage method for the next generation of eZ Publish.
Tracked: Jul 26, 10:53