Waldemar Kornewald on August 22, 2010

Final, official GSoC Django NoSQL status update

Alex Gaynor has posted a final status update on his Google Summer of Code (GSoC) project, which should bring official NoSQL support to Django. Basically, Django now has a working MongoDB backend (not to be confused with django-mongodb-engine, the MongoDB backend for Django-nonrel). And, after lots of skepticism, the ORM indeed needed only minor changes to support non-relational backends (surprise, surprise ;). There are still a few open design issues, but the ORM changes will probably be merged into trunk and the MongoDB backend will become a separate project.

The biggest design issue (in my opinion) is how to handle AutoField. In the GSoC branch, non-relational model code would always need a manually added NativeAutoField(primary_key=True) because many NoSQL DBs use string-based primary keys. As you can see in Django-nonrel, a NativeAutoField is unnecessary. The normal AutoField already works very well there. It also has the advantage that you can reuse existing Django apps unmodified, and you don't need a special NativeAutoField definition in your model. Hopefully this issue will get fixed before official NoSQL support is merged into trunk.

Another issue is efficiency. In the GSoC branch, save() first checks whether the entity already exists in the DB by doing ...filter(pk=self.pk).exists() and then decides whether to do an insert() or an update() on the DB. Since non-relational DBs normally don't need to distinguish between inserts and updates, we could just always call insert(). That would remove an unnecessary query from every save().
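To make the cost concrete, here is a minimal sketch in plain Python. It does not use Django or any real backend: FakeStore, save_with_exists_check and save_always_insert are hypothetical names invented for this illustration, and the store simply assumes (as many non-relational DBs do) that an insert on an existing key overwrites the entity.

```python
class FakeStore:
    """In-memory stand-in for a non-relational DB (hypothetical, not a Django API)."""

    def __init__(self):
        self.rows = {}
        self.queries = 0  # counts every round trip to the "DB"

    def exists(self, pk):
        self.queries += 1
        return pk in self.rows

    def insert(self, pk, data):
        # Assumption: writing to an existing key overwrites the entity.
        self.queries += 1
        self.rows[pk] = dict(data)

    def update(self, pk, data):
        self.queries += 1
        self.rows[pk].update(data)


def save_with_exists_check(store, pk, data):
    """Mimics the behaviour described above: check first, then insert or update."""
    if store.exists(pk):
        store.update(pk, data)
    else:
        store.insert(pk, data)


def save_always_insert(store, pk, data):
    """The proposed alternative: skip the check and always insert."""
    store.insert(pk, data)


checked = FakeStore()
save_with_exists_check(checked, "post-1", {"title": "Hello"})
save_with_exists_check(checked, "post-1", {"title": "Hello again"})

direct = FakeStore()
save_always_insert(direct, "post-1", {"title": "Hello"})
save_always_insert(direct, "post-1", {"title": "Hello again"})

print(checked.queries, direct.queries)
```

Both stores end up with the same data, but the version with the existence check needs twice as many round trips for the two saves.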

The final issue primarily affects App Engine's transaction support. When you delete() an entity, Django will also delete all entities that point to that entity (via ForeignKey). This won't work in an App Engine transaction because it would access multiple entity groups. Also, this operation can take a very long time when batch-deleting multiple entities (via QuerySet.delete()). In the worst case it will cause DeadlineExceededErrors. The solution would be to allow the backend to handle the deletion. This way the App Engine backend (djangoappengine) could delegate the deletion to a background task.
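As a rough sketch of what "let the backend handle the deletion" could look like, here is a plain-Python simulation. Nothing here is a real Django or App Engine API: FakeBackend, its points_to field (standing in for a ForeignKey) and the in-process task queue are all assumptions made up for this example.

```python
from collections import deque


class FakeBackend:
    """Hypothetical backend that defers cascading deletes to a background task."""

    def __init__(self):
        # pk -> {"points_to": pk of the referenced entity, or None}
        self.entities = {}
        self.task_queue = deque()  # stand-in for a real task queue

    def delete(self, pk):
        # Delete only the requested entity now; related entities are cleaned up later.
        self.entities.pop(pk, None)
        self.task_queue.append(pk)

    def run_pending_tasks(self):
        # Would run outside the request, so it is not bound by the request deadline.
        while self.task_queue:
            deleted_pk = self.task_queue.popleft()
            dependents = [
                pk for pk, entity in self.entities.items()
                if entity["points_to"] == deleted_pk
            ]
            for pk in dependents:
                self.delete(pk)  # queues a further cascade for this entity


backend = FakeBackend()
backend.entities = {
    "blog": {"points_to": None},
    "post": {"points_to": "blog"},
    "comment": {"points_to": "post"},
    "other": {"points_to": None},
}

backend.delete("blog")
remaining_before = sorted(backend.entities)
backend.run_pending_tasks()
remaining_after = sorted(backend.entities)

print(remaining_before, remaining_after)
```

The delete() call itself touches only one entity, which is what a transaction needs. The trade-off is that the dependent entities still exist for a short while until the background task has run.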

For Django 1.3 it's probably sufficient to only handle the AutoField issue. This doesn't affect App Engine, though, so independently of that we'll port our App Engine backend to Django trunk once the GSoC branch has been merged. This means you will only need Django-nonrel if you want to use App Engine transactions. In all other cases you can use djangoappengine with the official Django release! Isn't this exciting? Maybe some of you have waited for official NoSQL support before porting your model code, and now the time has come? What do you think? I'd love to hear your comments.