All Buttons Pressed
http://www.allbuttonspressed.com/blog
2011-10-18T15:29:00+00:00
Read about what we've learned and get to know our latest work

Using dev_appserver with Python 2.7 on App Engine
2011-10-18T15:29:00+00:00
Waldemar Kornewald
http://www.allbuttonspressed.com/using-dev_appserver-with-python-2-7-on-app-engine
<p>Starting with SDK 1.6.0, App Engine fully supports Python 2.7. This article originally described a workaround for getting Python 2.7 running on SDK 1.5.5. Now it only describes how to port your <code>app.yaml</code> and request handlers to Python 2.7.</p>
<p>The following instructions work with any WSGI-compliant web framework. I'll explain how to use Python 2.7 with djangoappengine/Django-nonrel at the end of this post. Yes, Django-nonrel works with Python 2.7 on App Engine. This blog (<a href="http://www.allbuttonspressed.com/">All Buttons Pressed</a>) is already running on Python 2.7 with <code>threadsafe: yes</code>.</p>
<p>Let's start with a simple <code>app.yaml</code> for Python 2.5:</p>
<pre> <code class="language-yaml">application: yourappid
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: handler.py</code></pre>
<p>The <code>handler.py</code> file looks like this:</p>
<pre> <code class="language-python">from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# This can be any WSGI handler. Here we just use webapp.
application = webapp.WSGIApplication(...)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()</code></pre>
<p>In order to use Python 2.7 you have to change your <code>app.yaml</code> like this:</p>
<pre> <code class="language-yaml">application: yourappid
version: 1
runtime: python27
api_version: 1
threadsafe: yes

handlers:
- url: /.*
  script: handler.application</code></pre>
<p>Here we've changed the runtime to <code>python27</code>, added <code>threadsafe: yes</code> and modified the last line (<code>script</code>) to point to the WSGI handler directly instead of just the <code>handler.py</code> module.</p>
<p>Also, our <code>handler.py</code> can be simplified like this:</p>
<pre> <code class="language-python">from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

application = webapp.WSGIApplication(...)</code></pre>
<p>In other words, we've just removed <code>main()</code>. Note that webapp users would normally port their code to <a href="http://webapp-improved.appspot.com/">webapp2</a>, but that's not really relevant for this post, so let's just ignore that detail.</p>
<p>That's all you need to use Python 2.7 on both the production server and dev_appserver.</p>
<h2><a id="using-djangoappengine-with-python-2-7" name="using-djangoappengine-with-python-2-7"></a>Using djangoappengine with Python 2.7</h2>
<p>If you're a Django-nonrel / <a href="http://www.allbuttonspressed.com/projects/djangoappengine">djangoappengine</a> user you can just change your <code>app.yaml</code> to look like this:</p>
<pre> <code class="language-yaml">application: yourappid
version: 1
runtime: python27
api_version: 1
threadsafe: yes

builtins:
- remote_api: on

inbound_services:
- warmup

libraries:
- name: django
  version: latest

handlers:
- url: /_ah/queue/deferred
  script: djangoappengine.deferred.handler.application
  login: admin

- url: /_ah/stats/.*
  script: djangoappengine.appstats.application

- url: /.*
  script: djangoappengine.main.application</code></pre>
<p>That's it.
djangoappengine will take care of the rest. Note that the <code>libraries</code> section is currently necessary due to a bug in SDK 1.6.0. Since you're already deploying a custom <code>django</code> package it shouldn't be necessary to enable Django in the <code>libraries</code> section. The App Engine team will fix this bug in the next SDK release.</p>
<p>Happy porting!</p>

F() objects and QuerySet.update() support in djangoappengine
2011-07-07T14:55:00+00:00
Thomas Wanschik
http://www.allbuttonspressed.com/blog/django/f-objects-and-queryset-update-support-in-djangoappengine
<p>As some of you already noticed, we added support for <a href="https://docs.djangoproject.com/en/1.3/ref/models/querysets/#update">QuerySet.update()</a> and <a href="https://docs.djangoproject.com/en/dev/topics/db/queries/#query-expressions">F() objects</a> as part of some features we currently need for our startup <a href="http://www.pentotype.com/">pentotype</a>. Finally, it's possible to use simple transactions on App Engine via Django-nonrel :) Let's see how it works.</p>
<h2><a id="django-s-beauty" name="django-s-beauty"></a>Django's beauty</h2>
<p>To show what we gain by using Django's <code>QuerySet.update()</code> method and <code>F()</code> objects, let me illustrate the code differences between the old djangoappengine version and the new one using a simple transaction which increments the value of a counter by a specific amount.
In both cases we have the following model definition:</p>
<pre> <code class="language-python"># models.py
from django.db import models

class Accumulator(models.Model):
    counter = models.IntegerField()
    name = models.CharField(max_length=500)</code></pre>
<p>The old version looks like this:</p>
<pre> <code class="language-python">from google.appengine.ext import db

def increment_counter(pk, amount):
    obj = Accumulator.objects.get(pk=pk)
    obj.counter += amount
    obj.save()

db.run_in_transaction(increment_counter, 1, 5)</code></pre>
<p>whereas the new one looks like this:</p>
<pre> <code class="language-python">from django.db.models import F

Accumulator.objects.filter(pk=1).update(counter=F('counter') + 5)</code></pre>
<p>Updating a counter with <code>QuerySet.update()</code> and <code>F()</code> objects is more elegant: the intent is clearer and there is less code to write and maintain.</p>
<p>Additionally, it's possible to do more complicated transactions, for example, updating multiple entities using Django's elegant syntax:</p>
<pre> <code class="language-python">Accumulator.objects.filter(name__startswith='simple').update(counter=F('counter') + 1)</code></pre>
<p>Note that this will update each entity in its own transaction. With the old version we would have to iterate over each entity explicitly and start the transaction:</p>
<pre> <code class="language-python">counters = Accumulator.objects.filter(name__startswith='simple')
for counter in counters:
    db.run_in_transaction(increment_counter, counter.pk, 5)</code></pre>
<p>Again, we benefit from writing less code.</p>
<p>However, one of the main benefits of <code>QuerySet.update()</code> and <code>F()</code> objects is that the code becomes even more portable between backends than before.
For example, the <a href="http://django-mongodb.org/">MongoDB backend</a> also has support for <code>F()</code> objects and <code>QuerySet.update()</code>, so the same code can be used for transactional updates on App Engine and MongoDB. Moreover, the same code is transaction-safe on SQL databases, too!</p>
<h2><a id="what-s-next" name="what-s-next"></a>What's next</h2>
<p>Support for <code>QuerySet.update()</code> and <code>F()</code> objects is the foundation of portable transactions over different databases. In combination with the <a href="/projects/django-dbindexer">django-dbindexer</a> it now becomes possible to add support for automatically sharded counters as described in <a href="/blog/django/2010/01/Sharding-with-Django-on-App-Engine#django-s-advantage">Sharding with Django on App Engine</a>! In other words, we could add an index definition in order to tell django-dbindexer to automatically shard specific updates. Stay tuned!</p>

Minor updates and API changes
2011-05-05T10:18:00+00:00
Thomas Wanschik
http://www.allbuttonspressed.com/blog/django/minor-updates-and-api-changes
<p>Maybe you noticed that we've started to work on our projects again. ;) We unsurprisingly (:P) passed our final physics exams and are ready to work on NoSQL development again. Waldemar has already started to do some work in order to get our efforts into Django trunk. In the meantime I've started to work on some bug reports concerning <a href="http://www.allbuttonspressed.com/projects/django-dbindexer">django-dbindexer</a> and <a href="http://www.allbuttonspressed.com/projects/nonrel-search">nonrel-search</a>. Both now have a dependency on <a href="http://www.allbuttonspressed.com/projects/django-autoload">django-autoload</a> which ensures the loading of indexes or signal handlers before any request is processed. As a consequence, you have to adapt your code if you make use of django-dbindexer or nonrel-search. The documentation has been updated to reflect these changes.
Maybe we'll take a few days off to go on vacation. Then we'll come back and continue improving NoSQL development.</p>

SimpleDB backend for Django-nonrel
2011-05-03T07:16:00+00:00
Waldemar Kornewald
http://www.allbuttonspressed.com/blog/django/simpledb-backend-for-django-nonrel
<p>Dan Fairs has started developing a SimpleDB backend. That's the sixth backend for Django-nonrel! Check out <a href="https://github.com/danfairs/django-simpledb">django-simpledb</a> from GitHub. The code is experimental and far from finished, but in a recent <a href="http://twitter.com/#!/danfairs/status/65141310293671936">tweet</a> Dan mentioned that he thinks he's very close to having the Django admin running on SimpleDB. That's huge, Dan! :) We're very close to supporting every popular NoSQL database. Guys, if you want to use Django-nonrel with SimpleDB please help Dan with the <a href="https://github.com/danfairs/django-simpledb">django-simpledb</a> backend (or help Mirko with the <a href="http://www.allbuttonspressed.com/blog/django/redis-backend-for-django-nonrel-in-development">Redis backend</a> if you prefer that).</p>

Redis backend for Django-nonrel in development
2011-04-16T05:48:00+00:00
Waldemar Kornewald
http://www.allbuttonspressed.com/blog/django/redis-backend-for-django-nonrel-in-development
<p>Mirko Rossini has finally made his Redis backend public. It's called <a href="https://github.com/MirkoRossini/django-redis-engine">django-redis-engine</a> and according to him it's still in pre-alpha stage. From a quick glance at the code it looks like you'll be able to define database indexes in order to support queries that go beyond Redis' (meager ;) native query capabilities (much like with <a href="http://www.allbuttonspressed.com/projects/django-dbindexer">django-dbindexer</a>). This is awesome because you won't have to maintain indexes by hand. If you want to see a stable and fully featured Redis backend please help Mirko.
Just clone the <a href="https://github.com/MirkoRossini/django-redis-engine">repository</a> from GitHub and start playing with the code. Enjoy!</p>
<p>P.S.: Sorry for the long period of silence. We're still not finished with our final diploma exams, but we're getting close. Soon we can restart blogging more frequently and take the results of the <a href="http://www.allbuttonspressed.com/blog/django/2010/12/Merry-Christmas">previous survey</a> into account.</p>

JOINs for NoSQL databases via django-dbindexer - First steps
2011-02-01T07:27:00+00:00
Thomas Wanschik
http://www.allbuttonspressed.com/blog/django/joins-for-nosql-databases-via-django-dbindexer-first-steps
<p>At the moment, we have a lot to do in terms of finishing our diploma thesis. However, we're so excited about the early results of the currently refactored django-dbindexer that we couldn't hold back and keep django-dbindexer's simple JOIN support for non-relational databases a secret anymore. Yes, you read that correctly: this post is about JOIN support for NoSQL databases! We'll show how to use in-memory JOINs and how to get JOINs working if the joined field's value doesn't change. So let's unpack our delayed Christmas present.
;)</p>
<h2><a id="let-s-rock" name="let-s-rock"></a>Let's rock!</h2>
<p>Let's take the example of a photo-user relationship, just like in our post <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro">JOINs via denormalization for NoSQL coders, Part 1</a>:</p>
<pre> <code class="language-python"># photo/models.py:
from django.contrib.auth.models import User
from django.db import models

class Photo(models.Model):
    owner = models.ForeignKey(User, null=True, blank=True)
    popularity = models.IntegerField(choices=[(0, 'low'), (1, 'medium'), (2, 'high')], default=0)
    published = models.DateField(auto_now_add=True)
    title = models.CharField(max_length=500)</code></pre>
<p>Here, each photo entity is linked to a user via a many-to-one relationship and has a popularity tag assigned to it. Additionally we store the date when the photo was published and give the photo a title. Now let's say you want to get all of a user's photos. In order to do so, NoSQL databases force us to use ugly workarounds. One possible way is to first get the user (or better, just the corresponding primary key, to avoid fetching the whole user entity from the database) and then the user's photos:</p>
<pre> <code class="language-python">def get_user_photos(first_name, last_name):
    # values('pk').get() returns a dict, so pull the key out of it.
    user_pk = User.objects.values('pk').get(first_name=first_name,
                                            last_name=last_name)['pk']
    return Photo.objects.filter(owner__pk=user_pk)</code></pre>
<p>However, with our reborn baby, django-dbindexer, you can just use Django's familiar relationship-spanning syntax (double underscores):</p>
<pre> <code class="language-python">def get_user_photos(first_name, last_name):
    return Photo.objects.filter(owner__first_name=first_name,
                                owner__last_name=last_name)</code></pre>
<p>Feels a lot more comfortable, right?
:) To get this working all you have to do is <a href="http://www.allbuttonspressed.com/blog/django/2010/09/Get-SQL-features-on-NoSQL-with-django-dbindexer#installation">install the django-dbindexer</a> and add the following index definition:</p>
<pre> <code class="language-python"># photo/dbindexes.py:
from models import Photo
from dbindexer.lookups import StandardLookup
from dbindexer.api import register_index

register_index(Photo, {
    'owner__first_name': StandardLookup(),
    'owner__last_name': StandardLookup(),
})</code></pre>
<p>and one of the new join resolvers (<code>InMemoryJOINResolver</code> or <code>ConstantFieldJOINResolver</code>) to <code>DBINDEXER_BACKENDS</code> in your settings:</p>
<pre> <code class="language-python"># settings.py:
DBINDEXER_BACKENDS = (
    'dbindexer.backends.BaseResolver',
    'dbindexer.backends.FKNullFix',
    'dbindexer.backends.InMemoryJOINResolver',
)</code></pre>
<p>That's it. django-dbindexer will handle everything else magically for you. :) Under the hood the refactored django-dbindexer now uses backends to resolve lookups. The <code>BaseResolver</code> is responsible for resolving lookups like <code>__iexact</code> or <code>__regex</code>, for example, and the <code>FKNullFix</code> backend will make <code>__isnull</code> queries work correctly on <code>ForeignKey</code>. In this example we choose the <code>InMemoryJOINResolver</code> which is used to resolve JOINs in-memory, just like we did manually in the example above. On the other hand, the <code>ConstantFieldJOINResolver</code> would denormalize the user's <code>first_name</code> and <code>last_name</code> into the <code>Photo</code> model and use these denormalized properties to execute the query. Everything described in <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro">JOINs via denormalization for NoSQL coders, Part 1</a> is then done automatically by the <code>ConstantFieldJOINResolver</code> for you.
:)</p>
<p>JOINs aren't limited to simple examples like the one above. Of course you can combine a query with filters on the <code>Photo</code> model:</p>
<pre> <code class="language-python">def get_popular_user_photos(first_name, last_name):
    return Photo.objects.filter(popularity=2,
                                owner__first_name=first_name,
                                owner__last_name=last_name)</code></pre>
<p>You can even span filters over multiple relationships if you have more complex models, just use multiple double underscores. If the photo model contains a <code>ForeignKey</code> to a group which itself contains a <code>ForeignKey</code> to a creator, a query getting all photos whose group creator is a specific user would look like this:</p>
<pre> <code class="language-python">def get_creators_groups_photos(first_name, last_name):
    return Photo.objects.filter(group__creator__first_name=first_name,
                                group__creator__last_name=last_name)</code></pre>
<p>Note that this query can return photos belonging to more than just one group because a user could have created multiple groups.</p>
<p>And as a nice side effect of the refactoring process, you can combine JOINs with other index definitions like an <code>__iexact</code> lookup on the user's <code>first_name</code> and <code>last_name</code> in combination with a <code>__month</code> lookup on <code>published</code>:</p>
<pre> <code class="language-python">def get_user_photos_created_in_month(first_name, last_name, month):
    return Photo.objects.filter(popularity=2,
                                published__month=month,
                                owner__first_name__iexact=first_name,
                                owner__last_name__iexact=last_name)</code></pre>
<p>Just add the needed index definitions to your index module:</p>
<pre> <code class="language-python"># photo/dbindexes.py:
from models import Photo
from dbindexer.lookups import StandardLookup
from dbindexer.api import register_index

register_index(Photo, {
    'owner__first_name': (StandardLookup(), 'iexact'),
    'owner__last_name': (StandardLookup(), 'iexact'),
    'published': 'month',
})</code></pre>
<p>It
couldn't be easier than that, right? :) At first we could hardly believe our own eyes when we tested JOINs on App Engine in production, but it really works. :P Just give it a try and <a href="https://bitbucket.org/twanschik/django-dbindexer-testapp/overview">download the django-dbindexer-testapp</a>. :) If you want to see even more complex examples then take a look at the <a href="https://bitbucket.org/wkornewald/django-dbindexer/src/59b2dd9d6dca/dbindexer/tests.py">unit tests</a> of django-dbindexer.</p>
<h2><a id="inmemoryjoinresolver-or-constantfieldjoinresolver" name="inmemoryjoinresolver-or-constantfieldjoinresolver"></a><code>InMemoryJOINResolver</code> or <code>ConstantFieldJOINResolver</code>?</h2>
<p>At the moment you have to choose globally between either the <code>InMemoryJOINResolver</code> or the <code>ConstantFieldJOINResolver</code> in your settings. All queries involving JOINs will then use the join resolver chosen. This will be changed in a future release of django-dbindexer. For now, you may wonder which join resolver to use. In general, in-memory JOINs are useful for simple cases for which you know that you get only a small set of entities on the joined side, i.e. the <code>User</code> side. This includes examples like the one above as well as one-to-one relationships. If you know that you have to deal with larger sets of entities on the joined side you have to use the <code>ConstantFieldJOINResolver</code> because in-memory JOINs don't scale in such cases.</p>
<p>Currently you can't combine in-memory JOINs with OR-queries and exclude-queries. Anything else should work. In-memory JOINs will try to query as efficiently as possible. On App Engine, for example, dbindexer will batch-get and use efficient <code><span class="pre">values('pk')</span></code> filters (which become a <code>keys_only</code> filter) where possible.
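</p>
<p>To make the two-step idea behind an in-memory JOIN concrete, here is a minimal plain-Python sketch. It is not dbindexer's actual code: the <code>USERS</code> and <code>PHOTOS</code> lists and the <code>photos_for_owner</code> function are hypothetical stand-ins for the two tables and the resolver.</p>

```python
# Hypothetical in-memory stand-ins for the User and Photo tables.
USERS = [
    {'pk': 1, 'first_name': 'Danzo', 'last_name': 'Shimura'},
    {'pk': 2, 'first_name': 'Ada', 'last_name': 'Lovelace'},
]
PHOTOS = [
    {'pk': 10, 'owner_id': 1, 'title': 'Sunset'},
    {'pk': 11, 'owner_id': 2, 'title': 'Harbour'},
    {'pk': 12, 'owner_id': 1, 'title': 'Forest'},
]


def photos_for_owner(first_name, last_name):
    # Step 1: query the joined (User) side and keep only the primary keys,
    # which is what a values('pk') / keys-only query would hand back.
    owner_pks = set(
        user['pk'] for user in USERS
        if user['first_name'] == first_name and user['last_name'] == last_name
    )
    # Step 2: filter the Photo side by those keys, like owner__pk__in=owner_pks.
    return [photo for photo in PHOTOS if photo['owner_id'] in owner_pks]
```

<p>The size of the key set built in step 1 drives the cost of step 2, which is why this approach only suits cases where the joined side is small.</p>
<p>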
Additionally, fields on the joined side won't be created for <code>StandardLookup</code>.</p>
<p>The <code>ConstantFieldJOINResolver</code> is more efficient than the <code>InMemoryJOINResolver</code> and has no limitations for either OR-queries or exclude-queries, i.e. you can execute the following query, for example:</p>
<pre> <code class="language-python"># Q comes from: from django.db.models import Q
return Photo.objects.exclude(Q(owner__first_name__iexact='Danzo') |
                             Q(owner__last_name__iexact='Shimura'))</code></pre>
<p>Additionally it is useful for more than one-to-one relations. However, at the moment it only works for constant fields, i.e. if the user's <code>first_name</code> and <code>last_name</code> don't change. In the future we will add support for non-constant values too.</p>
<p>For both resolvers, JOINs only work in the <code>ForeignKey</code> direction. Reverse lookups aren't supported at the moment. :(</p>
<h2><a id="expressive-nosql-beyond-scalability" name="expressive-nosql-beyond-scalability"></a>Expressive NoSQL - Beyond scalability</h2>
<p>So the first steps towards JOIN support for NoSQL databases are done. We plan to extend the backends to handle JOINs for non-constant fields in a scalable way, just like described in our <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro">JOINs via denormalization for NoSQL coders series</a>. Additionally, django-dbindexer's API will be extended so you can specify a join resolver on a per-field/filter basis. With the refactored django-dbindexer and its backend system even support for aggregates becomes possible. This will allow you to write complex queries in a few minutes instead of hours (including unit tests and debugging ;). No more hand-written denormalization, map/reduce, and aggregate code. Just tell the indexer what you want to do and it'll handle it for you in a way you specify (via a backend)! Developers still have to think about how to design their code.
However, django-dbindexer allows them to use Django's expressive ORM and thus be much more productive. Another possible extension is a backend for <a href="http://www.allbuttonspressed.com/projects/nonrel-search">nonrel-search</a> which could be used in order to add full-text search functionality for the admin interface. :) It's hard to think of something that would not be possible! :P If you want to help with the next exciting phase of development you can drop us a mail: <a href="http://groups.google.com/group/django-non-relational">http://groups.google.com/group/django-non-relational</a></p>

Merry Christmas
2010-12-25T15:41:00+00:00
Waldemar Kornewald
http://www.allbuttonspressed.com/blog/django/2010/12/Merry-Christmas
<p>Merry Christmas and a happy 2011, everyone! Actually we wanted to finish a little present, but December was just too full with other stuff. Looks like Thomas will have to play Santa in January. ;)</p>
<p>This blog started more or less one year ago. Back then we were still on a blogspot subdomain. A few months later <a href="http://www.allbuttonspressed.com/projects/django-nonrel">Django-nonrel</a> was ready for hosting a simple site, so we moved everything to a custom domain (allbuttonspressed.com) hosted on App Engine with Django-nonrel. Now this blog has more than 700 subscribers and we really need to get some feedback from you, so we can improve our articles in 2011. Please help us by filling out <a href="https://spreadsheets.google.com/viewform?formkey=dDNKZ254T1pmWVRFTnhnSDZUZVBLQ2c6MQ">this short survey</a>.</p>
<p>Thanks a lot!
Have a nice Christmas and a successful 2011.</p>

Porting from App Engine's webapp to django-nonrel
2010-11-30T17:19:00+00:00
Thomas Wanschik
http://www.allbuttonspressed.com/blog/django/2010/11/Porting-from-App-Engine-s-webapp-to-django-nonrel
<p>Hey, you may have noticed it already, we've finally finished our top-secret article called <a href="http://code.google.com/appengine/articles/django-nonrel.html">Running Pure Django Projects on Google App Engine</a> :). We thank Wesley Chun for his help, expertise and time so we could get the article published on <a href="http://code.google.com/intl/de-DE/appengine/articles/">App Engine's articles section</a>.</p>
<p>In summary, the article goes through several steps explaining in depth how to port an App Engine webapp app over to django-nonrel, with some additional notes on useful Django features. At the end of the article we summarize the advantages of using django-nonrel over other approaches and what awaits you in the future. But enough words, just read the article :)</p>

HTML5 offline manifests with django-mediagenerator
2010-11-23T15:17:00+00:00
Waldemar Kornewald
http://www.allbuttonspressed.com/blog/django/2010/11/HTML5-offline-manifests-with-django-mediagenerator
<p>This is actually part 3 of our <a href="/projects/django-mediagenerator">django-mediagenerator</a> Python canvas app series (see <a href="/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-1-pyjs">part 1</a> and <a href="/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-2-Drawing">part 2</a>), but since it has nothing to do with client-side Python we name it differently. In this part you&#39;ll see how to make your web app load without an Internet connection.
HTML5 supports offline web apps through <a href="https://www.w3.org/TR/html5/browsers.html#offline">manifest</a> files.</p>
<h2><a id="manifest-files" name="manifest-files"></a>Manifest files</h2>
<p>First here&#39;s some background, so you know what a manifest file is. A manifest file is really simple. In its most basic form it lists the URLs of the files that should be cached. Here&#39;s an <code>example.manifest</code>:</p>
<pre> <code class="language-text">CACHE MANIFEST
/media/main.css
/media/main.js</code></pre>
<p>The first line is always <code>CACHE MANIFEST</code>. The next lines can list the files that should be cached. In this case we&#39;ve added the <code>main.css</code> and <code>main.js</code> bundles. Additionally, the main HTML file which includes the manifest is cached automatically. You can include the manifest in the <code>&lt;html&gt;</code> tag:</p>
<pre> <code class="language-html">&lt;html manifest="example.manifest"&gt;</code></pre>
<p>When the browser sees this it loads the manifest and adds the current HTML and manifest file and all files listed in the manifest to the cache. The next time you visit the page the browser will try to load the manifest file from your server and compare it to the cached version. If the content of the manifest file hasn&#39;t changed the browser just loads all files from the cache. If the content of the manifest file has changed the browser refreshes its cache.</p>
<p>This is important, so I repeat it: The browser updates its cache only when the <strong>content</strong> of the <strong>manifest</strong> file is modified. Changes to your JavaScript, CSS, and image files will go unnoticed if the manifest file is not changed! That&#39;s exactly where things become annoying. Imagine you&#39;ve changed the <code>main.js</code> file. Now you have to change your manifest file, too.
One possibility is to add a comment to your manifest file which represents the current version number:</p>
<pre> <code class="language-text">CACHE MANIFEST
# version 2
/media/main.css
/media/main.js</code></pre>
<p>Whenever you change something in your JS, CSS, or image files you have to increment the version number manually. That&#39;s not really nice.</p>
<h2><a id="django-mediagenerator-to-the-rescue" name="django-mediagenerator-to-the-rescue"></a>django-mediagenerator to the rescue</h2>
<p>This is where the media generator comes in. It automatically modifies the manifest file whenever your media files are changed. Since media files are versioned automatically by django-mediagenerator the version hash in the file name serves as a natural and automatic solution to our problem. With the media generator a manifest file could look like this:</p>
<pre> <code class="language-text">CACHE MANIFEST
/media/main-bf1e7dfbd511baf660e57a1f36048750f1ee660f.css
/media/main-fb16702a27fc6c8073aa4df0b0b5b3dd8057cc12.js</code></pre>
<p>Whenever you change your media files the version hash of the affected files becomes different and thus the manifest file changes automatically, too.</p>
<p>Now how do we tell <a href="/projects/django-mediagenerator">django-mediagenerator</a> to create such a manifest file? Just add this to your <code>settings.py</code>:</p>
<pre> <code class="language-python">OFFLINE_MANIFEST = 'webapp.manifest'</code></pre>
<p>With this simple snippet the media generator will create a manifest file called <code>webapp.manifest</code>. However, the manifest file will contain <strong>all</strong> of the assets in your project. In other words, the whole <code>_generated_media</code> folder will be listed in the manifest file.</p>
<p>Often you only want specific files to be cached.
You can do that by specifying a list of regular expressions matching path names (relative to your media directories, exactly like in <code>MEDIA_BUNDLES</code>):</p>
<pre> <code class="language-python">OFFLINE_MANIFEST = {
    'webapp.manifest': {
        'cache': (
            r'main\.css',
            r'main\.js',
            r'webapp/img/.*',
        ),
        'exclude': (
            r'webapp/img/online-only/.*',
        )
    },
}</code></pre>
<p>Here we&#39;ve added the <code>main.css</code> and <code>main.js</code> bundles and all files under the <code>webapp/img/</code> folder, except for files under <code>webapp/img/online-only/</code>. Also, you might have guessed it already: You can create multiple manifest files this way. Just add more entries to the <code>OFFLINE_MANIFEST</code> dict.</p>
<p>Finally, we also have to include the manifest file in our template:</p>
<pre> <code class="language-django">{% load media %}
&lt;html manifest="{% media_url 'webapp.manifest' %}"&gt;</code></pre>
<p>Manifest files actually provide more features than this. For example, you can also specify <code>FALLBACK</code> handlers in case there is no Internet connection. In the following example the &quot;/offline.html&quot; page will be displayed for resources which can&#39;t be reached while offline:</p>
<pre> <code class="language-python">OFFLINE_MANIFEST = {
    'webapp.manifest': {
        'cache': (...),
        'fallback': {
            '/': '/offline.html',
        },
    },
}</code></pre>
<p>Here <code>/</code> is a pattern that matches all pages. You can also define <code>NETWORK</code> entries which specify allowed URLs that can be accessed even though they&#39;re not cached:</p>
<pre> <code class="language-python">OFFLINE_MANIFEST = {
    'webapp.manifest': {
        'cache': (...),
        'network': (
            '*',
        ),
    },
}</code></pre>
<p>Here <code>*</code> is a wildcard that allows access to any URL.
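</p>
<p>To see how hashed file names and the <code>FALLBACK</code> and <code>NETWORK</code> sections end up in the manifest text, here is a small standalone sketch. It is not django-mediagenerator's implementation: the helpers <code>versioned_name</code> and <code>render_manifest</code> are hypothetical and only illustrate the file format.</p>

```python
import hashlib
import posixpath


def versioned_name(path, content):
    """Insert a SHA-1 hash of the file content before the extension."""
    digest = hashlib.sha1(content).hexdigest()
    root, ext = posixpath.splitext(path)
    return '%s-%s%s' % (root, digest, ext)


def render_manifest(cache, fallback=None, network=None):
    """Build the manifest text from cache URLs plus optional sections."""
    lines = ['CACHE MANIFEST']
    # URLs listed before any section header are explicit cache entries.
    lines.extend(cache)
    if fallback:
        lines.append('FALLBACK:')
        for pattern, target in sorted(fallback.items()):
            lines.append('%s %s' % (pattern, target))
    if network:
        lines.append('NETWORK:')
        lines.extend(network)
    return '\n'.join(lines) + '\n'


# Changing the file content changes the hash, and with it the manifest text.
css_url = versioned_name('/media/main.css', b'body { margin: 0 }')
manifest = render_manifest([css_url],
                           fallback={'/': '/offline.html'},
                           network=['*'])
```

<p>Because the hash is derived from the file content, editing a media file alters the manifest text, which is exactly what makes the browser refresh its cache.</p>
<p>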
If you just had an empty <code>NETWORK</code> section you wouldn&#39;t be able to load uncached files, even when you&#39;re online (however, not all browsers are so strict).</p>
<h2><a id="serving-manifest-files" name="serving-manifest-files"></a>Serving manifest files</h2>
<p>Manifest files should be served with the MIME type <code>text/cache-manifest</code>. Also it&#39;s <strong>critical</strong> that you disable HTTP caching for manifest files! Otherwise the browser will <strong>never</strong> load a new version of your app because it always loads the cached manifest! Make sure that you&#39;ve configured your web server correctly.</p>
<p>As an example, on App Engine you&#39;d configure your <code>app.yaml</code> like this:</p>
<pre> <code class="language-yaml">handlers:
- url: /media/(.*\.manifest)
  static_files: _generated_media/\1
  mime_type: text/cache-manifest
  upload: _generated_media/(.*\.manifest)
  expiration: '0'

- url: /media
  static_dir: _generated_media/
  expiration: '365d'</code></pre>
<p>Here we first catch all manifest files and serve them with an expiration of &quot;0&quot; and the correct MIME type. The normal <code>/media</code> handler must be installed <strong>after</strong> the manifest handler.</p>
<h2><a id="like-a-native-ipad-iphone-app" name="like-a-native-ipad-iphone-app"></a>Like a native iPad/iPhone app</h2>
<p>Offline-capable web apps have a nice extra advantage: We can put them on the iPad&#39;s/iPhone&#39;s home screen, so they appear exactly like native apps! All browser bars will disappear and your whole web app will be full-screen (except for the top-most status bar which shows the current time and battery and network status).
Just add the following to your template:</p> <pre> <code class="language-django">&lt;head&gt; &lt;meta name="apple-mobile-web-app-capable" content="yes" /&gt; ...</code></pre> <p>Now when you&#39;re in the browser you can tap on the &quot;+&quot; icon in the middle of the bottom toolbar (<strong>update:</strong> I just updated to iOS 4.2.1 and the &quot;+&quot; icon got replaced with some other icon, but it&#39;s still in the middle of the bottom toolbar :) and select &quot;Add to Home Screen&quot;:</p> <p><img alt="http://lh3.ggpht.com/_03uxRzJMadw/TOfkL5YEULI/AAAAAAAAAIo/sCOT_u4ymdQ/add-to-home-screen.png" src="http://lh3.ggpht.com/_03uxRzJMadw/TOfkL5YEULI/AAAAAAAAAIo/sCOT_u4ymdQ/add-to-home-screen.png" /></p> <p>Then you can enter the name of the home screen icon:</p> <p><img alt="http://lh4.ggpht.com/_03uxRzJMadw/TOfkLpUSIeI/AAAAAAAAAIk/n3IZTgfZyIo/add-to-home.png" src="http://lh4.ggpht.com/_03uxRzJMadw/TOfkLpUSIeI/AAAAAAAAAIk/n3IZTgfZyIo/add-to-home.png" /></p> <p>Tapping &quot;Add&quot; will add an icon for your web app to the home screen:</p> <p><img alt="http://lh3.ggpht.com/_03uxRzJMadw/TOfkMLPDyQI/AAAAAAAAAIw/qducGXp4DzE/app-on-home-screen.png" src="http://lh3.ggpht.com/_03uxRzJMadw/TOfkMLPDyQI/AAAAAAAAAIw/qducGXp4DzE/app-on-home-screen.png" /></p> <p>When you tap that icon the canvas demo app starts in full-screen:</p> <p><img alt="http://lh5.ggpht.com/_03uxRzJMadw/TOfkLyiW0SI/AAAAAAAAAIs/lOIzhyI6BMQ/app.png" src="http://lh5.ggpht.com/_03uxRzJMadw/TOfkLyiW0SI/AAAAAAAAAIs/lOIzhyI6BMQ/app.png" /></p> <p>We can also specify an icon for your web app. For example, if your icon is in <code>img/app-icon.png</code> you can add it like this:</p> <pre> <code class="language-django">{% load media %} &lt;head&gt; &lt;link rel="apple-touch-icon" href="{% media_url 'img/app-icon.png' %}" /&gt; ...</code></pre> <p>The image should measure 57x57 pixels.</p> <p>Finally, you can also add a startup image which is displayed while your app loads. 
The following snippet assumes that the startup image is in <code>img/startup.png</code>:</p> <pre> <code class="language-django">{% load media %} &lt;head&gt; &lt;link rel="apple-touch-startup-image" href="{% media_url 'img/startup.png' %}" /&gt; ...</code></pre> <p>The image dimensions should be 320x460 pixels and it should be in portrait orientation.</p> <h2><a id="summary" name="summary"></a>Summary</h2> <ul> <li>The manifest file just lists the files that should be cached</li> <li>Files are only reloaded if the manifest file&#39;s content has changed</li> <li>The manifest file must not be cached (!) or the browser will never reload anything</li> <li><a href="/projects/django-mediagenerator">django-mediagenerator</a> automatically maintains the manifest file for you</li> <li>Offline web apps can appear like native apps on the iPad and iPhone</li> <li><a href="http://bitbucket.org/wkornewald/offline-canvas-python-web-app/get/tip.zip">Download</a> the latest canvas drawing app source which is now offline-capable</li> </ul> <p>As you&#39;ve seen in this post, it&#39;s very easy to make your web app offline-capable with <a href="/projects/django-mediagenerator">django-mediagenerator</a>. This is also the foundation for making your app look like a native app on the iPhone and iPad. Offline web apps open up exciting possibilities and allow you to become independent of Apple&#39;s slow approval processes for the app store and the iOS platform in general because web apps can run on Android, webOS, and many other mobile platforms. It&#39;s also possible to write a little wrapper for the App Store which just opens Safari with your website. That way users can still find your app in the App Store (in addition to the web).</p> <p>The next time you want to write a native app for the iOS platform, consider making a web app, instead (unless you&#39;re writing e.g. 
a real-time game, of course).</p>Offline HTML5 canvas app in Python with django-mediagenerator, Part 2: Drawing2010-11-16T19:12:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-2-Drawing<p>In <a href="/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-1-pyjs">part 1</a> of this series we've seen how to use Python/<a href="http://pyjs.org/">pyjs</a> with <a href="/projects/django-mediagenerator">django-mediagenerator</a>. In this part we'll build a ridiculously simple HTML5 canvas drawing app <strong>in Python</strong> which can run on the iPad, iPhone, and a desktop browser.</p> <p>Why ridiculously simple? There are a few details you have to keep in mind when writing such an app and I don't want these details to be buried under lots of unrelated code. So, in the end you will be able to draw on a full-screen canvas, but you won't be able to select a different pen color or an eraser tool, for example. These extras are easy to add even for a newbie, so feel free to <a href="http://bitbucket.org/wkornewald/offline-canvas-python-web-app/get/tip.zip">download</a> the code and make it prettier. It's a nice exercise.</p> <h2><a id="reset-the-viewport" name="reset-the-viewport"></a>Reset the viewport</h2> <p>With a desktop browser we could start hacking right away, but since we also support mobile browsers we need to fix something, first: One problem you face with mobile browsers is that the actual screen size is not necessarily the same as the reported screen size. In order to work with the real values we have to reset the viewport in the <code>&lt;head&gt;</code> section of our template:</p> <pre> <code class="language-html">&lt;head&gt; &lt;meta name="viewport" content="initial-scale=1.0, width=device-width, minimum-scale=1.0, maximum-scale=1.0, user-scalable=no" /&gt; ... 
&lt;/head&gt;</code></pre> <p>Now the reported and actual screen size will be the same.</p> <h2><a id="start-touching-the-canvas" name="start-touching-the-canvas"></a>Start touching the canvas</h2> <p>Before we look at mouse-based drawing we first implement touch-based drawing because that has a greater influence on our design. We'll implement a simple widget called <code>CanvasDraw</code> which takes care of adding the canvas element to the DOM and handling the drawing process:</p> <pre> <code class="language-python">from __javascript__ import jQuery, window, setInterval class CanvasDraw(object): # The constructor gets the id of the canvas element that should be created def __init__(self, canvas_id): # Add a canvas element to the content div jQuery('#content').html('&lt;canvas id="%s" /&gt;' % canvas_id) element = jQuery('#' + canvas_id) # Get position of the canvas element within the browser window offset = element.offset() self.x_offset = offset.left self.y_offset = offset.top # Get the real DOM element from the jQuery object self.canvas = element.get(0) # Set the width and height based on window size and canvas position self.canvas.width = window.innerWidth - self.x_offset self.canvas.height = window.innerHeight - self.y_offset # Load the drawing context and initialize a few drawing settings self.context = self.canvas.getContext('2d') self.context.lineWidth = 8 self.context.lineCap = 'round' self.context.lineJoin = 'miter' # ... to be continued ...</code></pre> <p>The last two lines configure the way lines are drawn. 
We draw lines instead of individual points because when tracking the mouse/finger the individual positions are too far apart and thus need to be connected with lines.</p> <p>In case you've wondered: The <code>lineCap</code> property can be <code>butt</code>, <code>round</code>, or <code>square</code>:</p> <img alt="http://lh6.ggpht.com/_03uxRzJMadw/TN6g4T9mL0I/AAAAAAAAAIU/D_2ZGmmgPqo/Canvas_linecap.png" src="http://lh6.ggpht.com/_03uxRzJMadw/TN6g4T9mL0I/AAAAAAAAAIU/D_2ZGmmgPqo/Canvas_linecap.png" /> <p>The <code>lineJoin</code> property can be <code>round</code>, <code>bevel</code>, or <code>miter</code>:</p> <img alt="http://lh4.ggpht.com/_03uxRzJMadw/TN6g4mkTrWI/AAAAAAAAAIY/BR89ZDHMtHc/Canvas_linejoin.png" src="http://lh4.ggpht.com/_03uxRzJMadw/TN6g4mkTrWI/AAAAAAAAAIY/BR89ZDHMtHc/Canvas_linejoin.png" /> <p>(Note: both images are modifications of images used in a <a href="https://developer.mozilla.org/En/Canvas_tutorial/Applying_styles_and_colors">Mozilla tutorial</a>.)</p> <p>Next, let's add the event handlers and the mouse/finger tracking code. The problem here is that we can't receive any touch movement events while we're drawing something on the canvas. The touch events get lost in this case and the user will experience noticeably slower drawing and bad drawing precision in general. The solution to this problem is to record the touch events in an array, draw the lines asynchronously via a timer which fires roughly every 25ms, and limit the drawing process to roughly 10ms per timer event. You can fine-tune these numbers, but they worked well enough for us. <code>CanvasDraw</code> stores the mouse/finger positions in <code>self.path</code>. Here's the rest of the initialization code:</p> <pre> <code class="language-python">class CanvasDraw(object): def __init__(self, canvas_id): # ... continued from above ... 
# This variable is used for tracking mouse/finger movements self.path = [] # Add asynchronous timer for drawing setInterval(self.pulse, 25) # Register mouse and touch events element.bind('mouseup', self.mouse_up).bind( 'mousedown', self.mouse_down).bind( 'mousemove', self.mouse_move).bind( 'touchstart touchmove', self.touch_start_or_move).bind( 'touchend', self.touch_end)</code></pre> <p>So far so good. The actual mouse/finger movement paths are represented as a list. The path is ordered such that old entries are at the end and new entries are added to the beginning of the list. When the touch event ends we terminate the path by adding <code>None</code> to the list, so multiple paths are separated by <code>None</code>. Here's an example of what <code>self.path</code> could look like (read from right to left):</p> <pre> <code class="language-python">self.path = [..., None, ..., [x1, y1], [x0, y0], None, ..., [x1, y1], [x0, y0]]</code></pre> <p>OK, let's have a look at the actual touch tracking code. We use only one method for both <code>touchstart</code> and <code>touchmove</code> events because they do exactly the same thing:</p> <pre> <code class="language-python">class CanvasDraw(object): # ... continued from above ... def touch_start_or_move(self, event): # Prevent the page from being scrolled on touchmove. In case of # touchstart this prevents the canvas element from getting highlighted # when keeping the finger on the screen without moving it. event.preventDefault() # jQuery's Event class doesn't provide access to the touches, so # we use originalEvent to get the original JS event object. touches = event.originalEvent.touches self.path.insert(0, [touches.item(0).pageX, touches.item(0).pageY])</code></pre> <p>When the finger leaves the screen we terminate the path with <code>None</code>. 
Note that the <code>touchend</code> event only contains the positions currently being touched, but it's fired <strong>after</strong> your finger leaves the screen, so the <code>event.originalEvent.touches</code> property is <strong>empty</strong> when the last finger leaves the screen. That's why we use <code>event.originalEvent.changedTouches</code> in order to get the removed touch point:</p> <pre> <code class="language-python">class CanvasDraw(object):
    # ... continued from above ...

    def touch_end(self, event):
        touches = event.originalEvent.changedTouches
        self.path.insert(0, [touches.item(0).pageX, touches.item(0).pageY])
        # Terminate path
        self.path.insert(0, None)</code></pre> <h2><a id="drawing-it" name="drawing-it"></a>Drawing it</h2> <p>Now we can implement the actual asynchronous drawing process. Remember, we use a timer to periodically (every 25ms) draw the mouse/finger path in <code>self.path</code>. We also limit the drawing process to 10ms per timer event. This is our timer:</p> <pre> <code class="language-python">import time

class CanvasDraw(object):
    # ... continued from above ...

    def pulse(self, *args):
        if len(self.path) &lt; 2:
            return
        start_time = time.time()
        self.context.beginPath()
        # Don't draw for more than 10ms in order to increase the number of
        # captured mouse/touch movement events.
        while len(self.path) &gt; 1 and time.time() - start_time &lt; 0.01:
            start = self.path.pop()
            end = self.path[-1]
            if end is None:
                # This path ends here. The next path will begin at a new
                # starting point.
                self.path.pop()
            else:
                self.draw_line(start, end)
        self.context.stroke()</code></pre> <p>First we have to call <code>beginPath()</code> to start a set of drawing instructions, then we tell the <code>context</code> which lines to draw, and in the end we draw everything with <code>stroke()</code>. The paths are processed in the <code>while</code> loop by getting the oldest (last) two points from the path and connecting them with a line. 
When we reach the path terminator (<code>None</code>) we <code>pop()</code> it and continue with the next path.</p> <p>The actual line drawing instructions are handled by <code>draw_line()</code>. When we receive mouse/touch events the coordinates are relative to the browser window, so the <code>draw_line()</code> method also converts them to coordinates relative to the canvas:</p> <pre> <code class="language-python">class CanvasDraw(object): # ... continued from above ... def draw_line(self, start, end): self.context.moveTo(start[0] - self.x_offset, start[1] - self.y_offset) self.context.lineTo(end[0] - self.x_offset, end[1] - self.y_offset)</code></pre> <p>Finally, we have to instantiate the widget and also prevent scrolling on the rest of our page:</p> <pre> <code class="language-python">canvas = CanvasDraw('sketch-canvas') # Prevent scrolling and highlighting def disable_scrolling(event): event.preventDefault() jQuery('body').bind('touchstart touchmove', disable_scrolling)</code></pre> <h2><a id="style" name="style"></a>Style</h2> <p>We need a minimum amount of CSS code. Of course, <a href="http://sass-lang.com/">Sass</a> would be nicer and the media generator supports it, but then you'd also have to install Ruby and Sass which makes things unnecessarily complicated for this simple CSS snippet in <code>reset.css</code>:</p> <pre> <code class="language-css">body, canvas, div { margin: 0; padding: 0; } * { -webkit-user-select: none; -webkit-touch-callout: none; }</code></pre> <p>The first part just resets margins, so we can fill the whole screen. The last part disables text selection on the iPad and iPhone and also turns off pop-ups like the one that appears when you hold your finger on a link. That's fine for normal websites, but we normally don't want that behavior in a web app.</p> <p>Of course we also need to add the file to a bundle in <code>settings.py</code></p> <pre> <code class="language-python">MEDIA_BUNDLES = ( ('main.css', 'reset.css', ), # ... 
)</code></pre> <p>and the bundle must be added to the template:</p> <pre> <code class="language-django">&lt;head&gt; ... {% load media %} {% include_media 'main.css' %} &lt;/head&gt;</code></pre> <h2><a id="what-about-mouse-input" name="what-about-mouse-input"></a>What about mouse input?</h2> <p>Now that the infrastructure is in place the mouse tracking code is pretty straightforward:</p> <pre> <code class="language-python">class CanvasDraw(object): # ... continued from above ... def mouse_down(self, event): self.path.insert(0, [event.pageX, event.pageY]) def mouse_up(self, event): self.path.insert(0, [event.pageX, event.pageY]) # Terminate path self.path.insert(0, None) def mouse_move(self, event): # Check if we're currently tracking the mouse if self.path and self.path[0] is not None: self.path.insert(0, [event.pageX, event.pageY])</code></pre> <h2><a id="summary" name="summary"></a>Summary</h2> <p>Whew! As you've seen, writing a canvas drawing app with touch support is not difficult, but you need to keep more things in mind than with a mouse-based app. 
Let's quickly summarize what we've learned:</p> <ul> <li>Touch events get lost if the drawing instructions block the browser for too long</li> <li>Draw the image asynchronously and limit the amount of time spent on drawing</li> <li>You have to prevent scrolling in touch events via <code>event.preventDefault()</code></li> <li>On <code>touchend</code> the <code>event.touches</code> property doesn't contain the removed touch point, so you should use <code>event.changedTouches</code></li> <li>jQuery doesn't provide direct access to <code>event.touches</code>, so you have to use <code>event.originalEvent.touches</code> (same goes for <code>event.changedTouches</code>)</li> <li>You have to convert mouse/touch coordinates to coordinates relative to the canvas element</li> <li><a href="http://bitbucket.org/wkornewald/offline-canvas-python-web-app/get/tip.zip">Download</a> the latest sample code to try it out yourself</li> </ul> <p>In the <a href="/blog/django/2010/11/HTML5-offline-manifests-with-django-mediagenerator">next part</a> we'll make our app offline-capable and allow for installing it like a native app on the iPad and iPhone. You'll also see that <a href="/projects/django-mediagenerator">django-mediagenerator</a> can help you a lot with making your app offline-capable. 
<strong>Update:</strong> Part 3 is published: <a href="/blog/django/2010/11/HTML5-offline-manifests-with-django-mediagenerator">HTML5 offline manifests with django-mediagenerator</a></p> Offline HTML5 canvas app in Python with django-mediagenerator, Part 1: pyjs2010-11-09T11:38:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-1-pyjs<p>This is the first part in a short series (see also <a href="http://www.allbuttonspressed.com/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-2-Drawing">part 2</a> and <a href="http://www.allbuttonspressed.com/blog/django/2010/11/HTML5-offline-manifests-with-django-mediagenerator">part 3</a>) on building a simple client-side offline-capable HTML5 canvas drawing app <strong>in Python</strong> via Pyjamas/<a href="http://pyjs.org/">pyjs</a>, using <a href="/projects/django-mediagenerator">django-mediagenerator</a>. Canvas drawing apps are the &quot;Hello world&quot; of the web apps world, so why not make it more interesting and throw in a few neat buzz word technologies? ;)</p> <p>In this part we'll take a look at running Python in the browser via pyjs, the Pyjamas framework's Python-to-JavaScript compiler. Note that we won't describe the Pyjamas framework, here. Instead, we only use the compiler itself and <a href="http://jquery.com/">jQuery</a>. You can build your own client-side Python framework on top of this. We don't use Pyjamas because we believe that a framework optimized for Python can be a lot simpler and more expressive than a simple 1:1 translation from Java/GWT to Python (which the Pyjamas framework is, mostly).</p> <p>Also, in case you wondered, it's important to not mix your backend (server-side) and frontend (client-side) code. <strong>Keep your backend and frontend code cleanly separated.</strong> Otherwise your app will become an unmaintainable, cryptic mess. 
We'll focus on the client-side in this series. For the backend you might want to use a simple REST API based on <a href="http://bitbucket.org/jespern/django-piston/wiki/Home">Piston</a>, for example.</p> <p>If you don't know <a href="/projects/django-mediagenerator">django-mediagenerator</a> yet, you should have a look at the <a href="/blog/django/2010/08/django-mediagenerator-total-asset-management">intro tutorial</a>. The media generator is an asset manager which can combine and compress JavaScript and CSS code. Beyond that, it also has many advanced features which can help you with client-side code. We're going to use several of the advanced features in this short series.</p> <h2><a id="prerequisites" name="prerequisites"></a>Prerequisites</h2> <p>First, let's install a recent pyjs version from the repo:</p> <pre> <code>git clone git://pyjs.org/git/pyjamas.git
cd pyjamas/pyjs
python setup.py develop</code></pre> <p>Then, download <a href="http://www.djangoproject.com/">Django</a> and <a href="http://pypi.python.org/pypi/django-mediagenerator">django-mediagenerator</a> and install both via <code>setup.py install</code>. Django is only needed for running the media generator and serving a simple HTML file.</p> <p>Finally, <a href="http://bitbucket.org/wkornewald/offline-canvas-python-web-app/get/tip.zip">download</a> the project source and extract it. The project already comes with jQuery, so the previous three dependencies are all that we need.</p> <h2><a id="compiling-python-to-javascript" name="compiling-python-to-javascript"></a>Compiling Python to JavaScript</h2> <p>Obviously, you can't run Python directly in your browser. So, how does it work? You write normal Python modules and packages. Then, pyjs is used to convert those modules into JavaScript files. This JavaScript code can then be integrated into your templates.</p> <p>Normally, you'd use the Pyjamas build script to compile your Python files. 
However, it's very difficult to make offline-capable apps that way. Also, the build script generates a single huge HTML file with all dependencies, which makes in-browser debugging more complicated. Also, the generated output can't get cached efficiently.</p> <p>Instead, we use django-mediagenerator to handle the build process. Basically, you tell django-mediagenerator to build your app's main Python module and the media generator automatically collects all other modules which are required by your main module and its dependencies. This means that only modules which actually get imported by your app get compiled and later combined, compressed, and uploaded to the server. All other modules will be ignored.</p> <p>So, all you need to do is add the main module, let's call it <code>canvas_main.py</code>, to <code>MEDIA_BUNDLES</code> in your <code>settings.py</code>:</p> <pre> <code class="language-python">MEDIA_BUNDLES = ( ('main.js', 'jquery.js', 'canvas_main.py', ), )</code></pre> <p>This will combine <code>jquery.js</code> and the whole canvas drawing app with all its dependencies into a <code>main.js</code> bundle. It's really that simple. :)</p> <p>Then, we can include it in the template via</p> <pre> <code class="language-django">&lt;head&gt; ... {% load media %} {% include_media 'main.js' %} &lt;/head&gt;</code></pre> <p>Just a reminder: When developing via <code>manage.py runserver</code> the files will of course stay uncombined and uncompressed. They only get combined and compressed for production. Also, during development all Python files will contain extra debugging code, so tracebacks can be generated. This makes the files a lot larger and a little bit slower, though. 
The production code (via <code>manage.py generatemedia</code>) doesn't contain this debugging code, so it's faster and much smaller.</p> <h2><a id="look-ma-python-in-my-browser" name="look-ma-python-in-my-browser"></a>Look, ma, Python in my browser!</h2> <p>What's so great about Python in the browser? Python is a much more powerful language than JavaScript and its design makes a lot more sense in large projects. Python gives you a real module system with explicit import statements for better readability. You can have nice getters and setters. The object system makes a lot more sense. Note, I do like prototype-based programming, but the way it works in JavaScript is a disaster when compared to Self, Io, and other languages. Enough talk. Let's create a simple widget in pure Python which reacts to a mouse click:</p> <pre> <code class="language-python"># Import jQuery. Note: JavaScript variables have to be
# imported from the fake __javascript__ module.
from __javascript__ import jQuery

# Create a simple widget class
class ClickWidget(object):
    def __init__(self, base_elem):
        # Add some initial HTML code
        base = jQuery(base_elem)
        base.html('&lt;div class="clickme"&gt;Click me!&lt;/div&gt;'
                  '&lt;div class="change"&gt;Then this will change.&lt;/div&gt;')
        self.output_div = jQuery('.change', base)

        # Bind the click event to the on_click method
        jQuery('.clickme', base).bind('click', self.on_click)

    # This is our click event handler
    def on_click(self, event):
        self.output_div.append(' It clicked!')

# Install our widget
widget = ClickWidget('body')</code></pre> <p>See, it's really that simple. :) Did you notice something neat about this code? You didn't have to bind the <code>on_click</code> method to <code>self</code>! Well, as a Python developer you might say &quot;that's totally normal&quot;, but the world can be harsh once you leave your comfy Python home. ;)</p> <p>So, what would you do in JavaScript, instead? 
When you want to pass a method of some JavaScript object to some other function the method's <code>this</code> gets lost. So, in JavaScript you would install the click handler like this:</p> <pre> <code class="language-js">// ...
// JavaScript equivalent, e.g. using Prototype
var ClickWidget = Class.create({
    initialize: function(base_elem) {
        // ...
        this.output_div = jQuery('.change', base);
        jQuery(input_div).bind('click', this.on_click.bind(this));
    },
    on_click: function(event) {
        this.output_div.append(' It clicked!');
    }
});
// ...</code></pre> <p>Did you see the strange <code>this.on_click.bind(this)</code>? It returns a new function which calls <code>on_click</code> bound to <code>this</code>, so the <code>on_click</code> method can access <code>this.output_div</code> with the correct <code>this</code>. The fact that you need to do this at all is unintuitive and there's no reason why developers should have to keep nonsense like this in mind. If you've only worked with jQuery you might never have written code like this because you probably didn't write object-oriented code. However, once you enter the realm of complex web apps you should certainly begin to think about solutions that allow for better code reuse and maintainability.</p> <p>By the way, you can also use many Python standard modules like <code>time</code>, <code>base64</code>, <code>math</code>, <code>urllib</code>, <code>cgi</code>, etc. It's not the complete standard library, but enough to make you smile. :)</p> <h2><a id="look-ma-python-talking-to-javascript" name="look-ma-python-talking-to-javascript"></a>Look, ma, Python talking to JavaScript!</h2> <p>When you interact with Python/pyjs objects everything should work as expected. Basically, you can write normal Python code as long as you don't use advanced features like <code>__metaclass__</code>. 
With every new release pyjs gets closer to being fully Python-compatible, so even the advanced features will be supported at some point.</p> <p>However, we won't always get around interacting with a little bit of JS code. You'll at least want to use jQuery and probably also the document variable. You've already seen above how to import JavaScript variables and display something on the page:</p> <pre> <code class="language-python">from __javascript__ import jQuery, document jQuery('body').html('Hello world!') doc = jQuery(document) # ...</code></pre> <p>That's pretty simple. When you want to access JavaScript variables you have to import them from the fake <code>__javascript__</code> module. Then, you can interact with JavaScript objects <strong>almost</strong> like with Python objects. With the default compilation settings you can pass in strings and numbers unmodified to JavaScript functions. However, dicts and lists must be converted because pyjs uses custom classes for those data types and they're incompatible with JavaScript.</p> <p>The biggest problem with JavaScript arrays is that you can't access them via <code>[]</code> (e.g., <code>x[0]</code>) because both Python and pyjs convert this to a call to <code>__getitem__()</code>, but that method isn't defined for JavaScript arrays. Also, Python lists can't be passed directly to JavaScript functions because they expect that you call <code>__getitem__()</code> instead of using <code>[]</code>. Similarly, <code>len()</code> is only available in Python and <code>.length</code> is only available in JavaScript. So, we have to convert lists when interacting with JavaScript. We can do this in two ways. 
Either we insert raw JavaScript or we convert the data types:</p> <pre> <code class="language-python"># Import some JavaScript function which takes an array
from __javascript__ import some_js_func
# The JS() function allows inserting raw JavaScript code
from __pyjamas__ import JS

# Create a JavaScript array via JS()
js_list = JS('[1, 2, 3]')

# Now let's access the first element in the array
# 1. possibility: access array via JS.
item = JS('@{{js_list}}[0]')
# 2. possibility: convert to Python list
py_list = list(js_list)
item = py_list[0]

# Now pass the list to a JavaScript function
# 1. possibility: use the raw JavaScript array
some_js_func(js_list)
# 2. possibility: get the Python list's internal JavaScript array.
# Changes to the internal array will also affect the Python list.
# E.g., when you add an element it also becomes part of the Python list.
some_js_func(py_list.getArray())
# 3. possibility: convert the Python list into a standalone JavaScript array.
# Changes to that array have no effect on the Python list.
from __javascript__ import toJSObjects
some_js_func(toJSObjects(py_list))</code></pre> <p>Most of the time you'll probably want to use <code>list()</code> and <code>getArray()</code>.</p> <p>There's one important detail, here. When using <code>JS()</code> you have to escape Python variables via <code><span class="pre">&#64;{{}}</span></code>. That's because pyjs might internally rename a variable, so you have to tell the <code>JS()</code> function which variables you need.</p> <p>When working with dicts you also have to convert the data types:</p> <pre> <code class="language-python"># Import some JavaScript function which takes an object
from __javascript__ import other_js_func, toJSObjects
from __pyjamas__ import JS

js_object = JS('{a: 3, b: 5}')
py_dict = dict(js_object)
other_js_func(toJSObjects(py_dict))</code></pre> <p>In the case of dicts there's really just one way to do it. 
The <code>dict()</code> constructor knows how to convert JavaScript objects. The universal <code>toJSObjects()</code> function takes care of converting the Python dict back to a JavaScript object.</p> <p>That's all you need to know when dealing with JavaScript.</p> <h2><a id="summary" name="summary"></a>Summary</h2> <ul> <li>Python code works as you would expect. E.g. you don't have to bind functions to <code>self</code>.</li> <li>You can also use many modules from the standard library like <code>time</code>, <code>base64</code>, etc.</li> <li>JavaScript variables can be imported from the <code>__javascript__</code> module.</li> <li>When interacting with JavaScript you have to convert <code>list</code> and <code>dict</code> objects.</li> <li><a href="http://bitbucket.org/wkornewald/offline-canvas-python-web-app/get/tip.zip">Download</a> the sample code to see the widget demo in action.</li> </ul> <p>In the <a href="http://www.allbuttonspressed.com/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-2-Drawing">next part</a> we'll take a look at integrating the HTML5 canvas element into our little Python app, so we can draw something with the mouse. 
<strong>Update:</strong> <a href="http://www.allbuttonspressed.com/blog/django/2010/11/Offline-HTML5-canvas-app-in-Python-with-django-mediagenerator-Part-2-Drawing">Part 2</a> is out.</p> Cassandra and ElasticSearch backends for Django-nonrel in development2010-10-22T09:43:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/10/Cassandra-and-ElasticSearch-backends-for-Django-nonrel-in-development<p>This is a quick update: Rob Vaterlaus has started working on a <a href="http://github.com/vaterlaus/django_cassandra_backend">Cassandra backend</a> and Alberto Paro is working on an <a href="http://github.com/aparo/django-elasticsearch">ElasticSearch backend</a> for <a href="/projects/django-nonrel">Django-nonrel</a>.</p> <p>The Cassandra backend is still experimental and lacks support for <code>ListField</code> (from <code>djangotoolbox.fields</code>), but overall it already looks very interesting. This backend comes with experimental secondary indexes support for Cassandra and requires a recent Cassandra 0.7 build. Secondary indexes allow you to efficiently query the DB by attributes other than the primary key, which makes Cassandra more similar to App Engine and MongoDB than to low-level key-value stores. This feature is disabled by default, though. Currently, without secondary indexes all queries are filtered in memory. The <a href="http://github.com/vaterlaus/django_cassandra_backend">repository</a> contains the installation instructions, so head over and play with the code, if you're fearless. Keep in mind: it's not production ready and it depends on the latest bleeding-edge Cassandra code. Any help with the backend is highly appreciated.</p> <p>The ElasticSearch backend is also in alpha state. Not all unit tests pass, yet, and for now it only supports a simple subset of ElasticSearch. 
Basically, you can use string operations like <code>__contains</code>, <code>__istartswith</code>, <code>__regex</code> and you can compare integers via <code>__gt</code>, <code>__lt</code>, etc. and you can order results. Currently, OR queries are not supported, but Alberto is working on that. He also plans to add support for aggregates! If you want to have a stable and feature-complete backend please grab the source from the <a href="http://github.com/aparo/django-elasticsearch">repo</a> and contribute. Alberto would be happy to accept patches.</p> <p>So, we now have backends for App Engine, MongoDB, Cassandra, and ElasticSearch. The first two are stable and get used in production. Who would've thought that Django-nonrel would get this far? That's pretty awesome, isn't it? :) If you want to build some other NoSQL backend please read our post about <a href="/blog/django/2010/04/Writing-a-non-relational-Django-backend">writing a non-relational Django backend</a> which explains our nonrel backend API. If you need more help please ask on the <a href="http://groups.google.com/group/django-non-relational">Django-nonrel discussion group</a>. Now that <a href="http://aws.amazon.com/simpledb/#pricing">SimpleDB has a free tier</a> it would be an interesting candidate for Django-nonrel, wouldn't it?</p>JOINs via denormalization for NoSQL coders, Part 3: Ensuring consistency2010-10-06T13:43:00+00:00Thomas Wanschikhttp://www.allbuttonspressed.com/blog/django/2010/10/JOINs-via-denormalization-for-NoSQL-coders-Part-3-Ensuring-consistency<p>In <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro">part 1</a> and <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-2-Materialized-views">part 2</a> we introduced the concept of denormalization, materialized views and background tasks in order to emulate JOINs in the to-one direction on NoSQL databases. 
Now we'll cover the remaining small but important pieces of the puzzle and discuss how to ensure that this method works in real-world situations like server crashes.</p> <h2><a id="when-to-start-background-tasks" name="when-to-start-background-tasks"></a>When to start background tasks?</h2> <p>Let's first remember our current situation:</p> <ul> <li>We have a materialized view (i.e. the model <code>PhotoUserView</code>) containing denormalized properties of users (i.e. <code>gender</code>) and denormalized properties of photos (i.e. <code>popularity</code>). This materialized view is used to emulate JOINs in the to-one direction. Instances of <code>PhotoUserView</code> have to be kept up to date.</li> <li>If a user edits properties of a photo we have to start a background task in order to update all denormalized fields of the corresponding <code>PhotoUserView</code> entity</li> <li>If a user changes his/her gender (or her hair color) we have to start background tasks in order to update the denormalized gender of all <code>PhotoUserView</code> entities which point to that specific user</li> </ul> <p>Given this, we have to decide when to start background tasks while keeping in mind that the connection / database / web server can fail for many reasons. A straightforward way is to start background tasks right after having saved changes to photos or users (via Django's <code>post_save</code> signal for example). However, this solution comes with some problems. Let's take the following evil failure scenario when editing an already existing photo:</p> <ul> <li>User edits a photo</li> <li>Corresponding <code>submit_view</code> saves these changes</li> <li>Web server crashes</li> </ul> <p>In this scenario we succeeded in saving the user's changes to one of his/her photos but we didn't start a background task yet in order to update the corresponding materialized view. 
A query (like the first one from section &quot;Materialized views&quot; in our <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-2-Materialized-views">last post</a>) using this materialized view would fail to find the changed photo because it doesn't contain up-to-date denormalized properties of its corresponding photo.</p> <p>Another problem with starting background tasks after a <code>save()</code> is that background tasks and a <code>save()</code> can run into update conflicts. To see this, let's take a closer look at the following example:</p> <img alt="http://lh3.ggpht.com/_8v0Ka-uUQOY/TKsQr3CAOhI/AAAAAAAAANk/W1Y6qBG9Kes/Background%20tasks%20conflicts.jpg" src="http://lh3.ggpht.com/_8v0Ka-uUQOY/TKsQr3CAOhI/AAAAAAAAANk/W1Y6qBG9Kes/Background%20tasks%20conflicts.jpg" /> <ul> <li>At time t1 an update to a photo is being saved (via a submit from a user for example)</li> <li>At t2 the background task 1 starts right after the save</li> <li>Background task 1 fetches the data to be denormalized out of the database (photo and user). For some reason we have some delay after this</li> <li>At t3 another update happens to the same photo</li> <li>At t4 background task 2 starts and finishes before background task 1 does</li> <li>Background task 1 finishes its updates to the materialized view using old data resulting in overwriting background task 2's updates</li> </ul> <p>As a result we have an incorrectly updated materialized view because the first background task fetched the data before the second background task saved its updates. This problem exists in general if the delay between a <code>save()</code> and the corresponding background task is shorter than the longest time a background task is allowed to run. 
In such cases overlaps between background tasks can happen.</p> <p>One way out of both problems is to start background tasks right <strong>before</strong> a <code>save()</code> using a delay larger than the longest time interval for an update to the materialized view (longest time for getting the data + longest time for saving the data). On App Engine a request can't take longer than 30 seconds for example. This will ensure that background tasks get executed after the saving process is finished and avoid the update conflicts discussed above because overlaps between background tasks and another <code>save()</code> and its corresponding background task like in the example above can't happen (see figure below). Additionally, crashes right after a <code>save()</code> won't stop updates of materialized views because we've already started the background tasks before. Background tasks will get the photo out of the database via the photo's primary key and update its corresponding materialized view.</p> <img alt="http://lh6.ggpht.com/_8v0Ka-uUQOY/TKsgErRYmpI/AAAAAAAAAOo/i5LT4Zuq9Rw/Background%20tasks%20%20with%20big%20delays.jpg" src="http://lh6.ggpht.com/_8v0Ka-uUQOY/TKsgErRYmpI/AAAAAAAAAOo/i5LT4Zuq9Rw/Background%20tasks%20%20with%20big%20delays.jpg" /> <p>Using big delays avoids update conflicts and ensures correct updates</p> <p>Multiple updates to photos almost at the same time aren't a problem for updates of materialized views because they'll start in delayed background tasks i.e. they will get the latest version of the photo out of the database in order to update the corresponding materialized view.</p> <p>Apart from that, if the database crashes right before saving the user's changes but after starting the background task, the background task will still be executed resulting in updating the materialized view with the same data already saved in the materialized view. 
This case doesn't represent any problem.</p> <h2><a id="what-about-inserts" name="what-about-inserts"></a>What about inserts?</h2> <p>Now we still have to consider the situation in which a user creates a new photo. Because we start background tasks before a <code>save()</code> we don't have the primary key of the photo i.e. we can't pass the primary key of the photo to the background task. One way to solve this is to mark newly created photos with a unique <a href="http://en.wikipedia.org/wiki/Universally_unique_identifier">UUID</a>. The corresponding background task will use this UUID to get the newly inserted photo and create the corresponding materialized view.</p> <h2><a id="how-to-update-materialized-views" name="how-to-update-materialized-views"></a>How to update materialized views</h2> <p>Until now we discussed when to start background tasks but not how to update materialized views. It's important that each update to a materialized view will <strong>rebuild</strong> the affected entities of the materialized view <strong>from scratch</strong> (but not unaffected entities of the materialized view) because otherwise it can result in incorrect updates. Let's assume that changing a user's gender only updates the denormalized gender for the corresponding materialized view and that changes to a photo's title only update the denormalized title for the corresponding materialized view. Given this we can get into the following situation:</p> <ul> <li>User's gender (hair color) changes</li> <li>Photo's title changes</li> <li>Background task 1 starts and only gets the user and the corresponding materialized view out of the database (in order to update <code>denormalized_gender</code>)</li> <li>Background task 2 starts and only gets the photo and the same materialized view out of the database (in order to update <code>denormalized_title</code>)</li> <li>Background task 1 saves updates to the materialized view i.e. 
changes <code>denormalized_gender</code></li> <li>Background task 2 saves updates to the materialized view i.e. changes <code>denormalized_title</code> overwriting changes by background task 1 i.e. the <code>denormalized_gender</code></li> </ul> <p>As a result we have an incorrectly updated materialized view because background task 2 fetched the data before background task 1 saved its updates. To avoid this we always have to <strong>completely</strong> rebuild the affected entities of the materialized view. Even if only the photo's title changes we have to update the <code>denormalized_gender</code> too! I.e., we have to get the corresponding photo <strong>and</strong> the user! This ensures correct updates even if overwrites happen because each update rebuilds the affected entities completely i.e. the materialized view is kept up-to-date with the latest data.</p> <h2><a id="best-of-both-worlds" name="best-of-both-worlds"></a>Best of both worlds</h2> <p>Using large delays for background tasks may be unsatisfying because users may notice them. For example, a user inserts a photo but can't find it right afterwards. So you may ask &quot;why not start another background task right after we save a photo too?&quot; And yes, that's what we suggest. Starting background tasks right after having saved an entity will ensure fast updates so we can use up-to-date materialized views in queries almost immediately. Starting delayed background tasks before we save an entity will ensure the execution of background tasks as well as correct updates to the materialized view.</p> <p>However, we have to keep in mind that background tasks which start immediately after a save can get into conflicts as discussed above. So you might think: &quot;Forget about starting background tasks after a <code>save()</code>&quot;, but remember we still have the background task started right before the save. 
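</p> <p>The combined rule can be sketched in plain Python. This is only a toy model under stated assumptions: the dicts and the list stand in for the database and the task queue, time is an integer clock, and <code>save_photo</code>, <code>rebuild_view</code>, <code>run_due_tasks</code> and <code>MAX_REQUEST_SECONDS</code> are hypothetical names, not part of Django or App Engine.</p> <pre> <code class="language-python">from operator import le

# Toy stand-ins (hypothetical): dicts replace the database tables and a
# list replaces the task queue.
photos = {}                        # photo pk to photo fields
users = {1: {'gender': 'female'}}  # user pk to user fields
views = {}                         # photo pk to materialized view entity
queue = []                         # pending tasks as (run_at, photo_pk)
MAX_REQUEST_SECONDS = 30           # request limit mentioned in the post


def rebuild_view(photo_pk):
    # Always rebuild the affected view entity from scratch with the
    # latest photo AND user data.
    photo = photos[photo_pk]
    user = users[photo['user']]
    views[photo_pk] = {
        'denormalized_photo_title': photo['title'],
        'denormalized_user_gender': user['gender'],
    }


def save_photo(now, photo_pk, fields, crash_after_save=False):
    # 1. Enqueue the delayed cleanup task BEFORE the save.
    queue.append((now + MAX_REQUEST_SECONDS + 1, photo_pk))
    photos[photo_pk] = dict(fields)
    if crash_after_save:
        return  # simulated crash: the immediate task is never enqueued
    # 2. Enqueue an immediate task AFTER the save for fast updates.
    queue.append((now, photo_pk))


def run_due_tasks(now):
    # le(run_at, now) is true for every task whose time has come.
    due = [task for task in queue if le(task[0], now)]
    for task in due:
        queue.remove(task)
        rebuild_view(task[1])


# Normal case: the immediate task refreshes the view right away.
save_photo(0, 7, {'title': 'Sunset', 'user': 1})
run_due_tasks(0)
fast_view = dict(views[7])

# Crash right after the save: the view stays stale until the delayed
# task runs and rebuilds it.
save_photo(5, 7, {'title': 'Sunrise', 'user': 1}, crash_after_save=True)
run_due_tasks(5)
stale_view = dict(views[7])
run_due_tasks(5 + MAX_REQUEST_SECONDS + 1)
repaired_view = dict(views[7])</code></pre> <p>In this sketch the task enqueued before the save is the one that repairs the view after the simulated crash. 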
This background task will clean up incorrectly updated materialized views.</p> <img alt="http://lh3.ggpht.com/_8v0Ka-uUQOY/TJMq9RHWA4I/AAAAAAAAAJI/kN1bhp1oGEI/Background%20task%20conflict%20solved.png" src="http://lh3.ggpht.com/_8v0Ka-uUQOY/TJMq9RHWA4I/AAAAAAAAAJI/kN1bhp1oGEI/Background%20task%20conflict%20solved.png" /> <p>The cleanup background tasks will fetch the latest photo and user info so that the materialized view will contain correctly denormalized properties afterwards even if the first background tasks got into update-conflicts.</p> <p>Even if we get into the same conflict between a third update to a photo and a cleanup background task, this update starts its own delayed background task so that we've ensured a cleanup to come. So in the worst case scenario we have incorrectly updated materialized views which will get cleaned up soon. In a best case scenario we always have immediately updated materialized views. The important aspect here is that we'll always have correct materialized views.</p> <p>Summarized, the following should be done:</p> <ul> <li>Always start background tasks before saving entities using a large enough delay. This has the nice side effect of cleaning up wrongly updated materialized views</li> <li>Start background tasks after saving entities to ensure fast updates to the materialized view</li> </ul> <h2><a id="summary" name="summary"></a>Summary</h2> <p>In the last three posts (including this one) we described one possible method to handle JOINs for the to-one side on non-relational databases. 
Let's summarize the most important points:</p> <ul> <li>JOINs are achieved via denormalization using an additional model (the materialized view)</li> <li>This solution comes with eventual consistency (in most cases no problem)</li> <li>Doubles storage requirements (because of the materialized view) but storage is cheap</li> <li>If the values of denormalized properties are allowed to change, we need background tasks</li> </ul> <p>This method can be implemented on App Engine and MongoDB (in combination with celery for example) as well as on other NoSQL databases.</p> <p>It becomes clear that using JOINs in the to-one direction on non-relational databases is a mess to deal with. From the view point of a developer, it's far from optimal for the following reasons:</p> <ul> <li>For each model which requires JOINs you have to maintain materialized views by hand</li> <li>You are forced to rethink your queries and to remember to formulate them using the denormalized fields on a different model (materialized view). Then you have to get the model actually queried for</li> <li>Materialized views create dependencies between models (photos and users for example) which make them less reusable</li> <li>Thus you are predestined to make bugs. The result is less productivity and much more pain while coding</li> </ul> <p>So is there a better solution than setting up the whole process by hand each time we need JOINs? We believe that there is a much more elegant way to do so. Our answer is <a href="http://www.allbuttonspressed.com/projects/django-dbindexer">django-dbindexer</a>, which will use the method described in this blog post series so you can use JOINs without having to rethink your queries or to add denormalized fields to your models manually! Just tell django-dbindexer which JOINS you want to use and the indexer will take care of everything else. 
Stay tuned!</p> JOINs via denormalization for NoSQL coders, Part 2: Materialized views2010-09-27T12:13:00+00:00Thomas Wanschikhttp://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-2-Materialized-views<p>In <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro">part 1</a> we discussed a workaround for JOINs on non-relational databases using denormalization in cases for which the denormalized properties of the to-one side don't change. In this post we'll show one way to handle JOINs for mutable properties of the to-one side i.e. properties of users.</p> <p>Let's summarize our current situation:</p> <ul> <li>We have users (the to-one side) and their photos (the to-many side)</li> <li>Photos contain their users' gender in order to use it in queries which would need JOINs otherwise</li> </ul> <p>It's obvious that a solution for the problem of mutable properties on the to-one side has to keep denormalized properties up to date i.e. each time the user changes his/her gender (or more likely her hair color ;) we have to go through all of the user's photos and update the photos' denormalized gender. It is clear that we get into problems here if a user has thousands of photos to update because such an update would take too long. We need a scalable way to deal with such updates.</p> <h2><a id="background-tasks-to-the-rescue" name="background-tasks-to-the-rescue"></a>Background tasks to the rescue</h2> <p>One way to solve the update-problem is to start a background task each time a user changes his/her gender. It's clear that this solution comes with eventual consistency i.e. changes won't be visible immediately. Nonetheless in most cases that's acceptable and normally background tasks are freaking fast.</p> <p>Using background tasks in order to update all photos of a given user isn't as simple as it seems. The devil is in the details. 
Let's assume that a background task tries to update a photo (i.e. some denormalized property of the user) while a user is editing some property of the same photo at the same time i.e. the photo's title for example. In such a scenario it can happen that changes of the user will be overwritten by the background task (or vice versa). To see this take the example of a user changing a photo's title:</p> <img alt="http://lh3.ggpht.com/_8v0Ka-uUQOY/TKBsCOeS-bI/AAAAAAAAAMI/wnyIEaobdEw/BGTask-User-conflict.jpg" src="http://lh3.ggpht.com/_8v0Ka-uUQOY/TKBsCOeS-bI/AAAAAAAAAMI/wnyIEaobdEw/BGTask-User-conflict.jpg" /> <ul> <li>Background task gets a photo out of the database in order to update the denormalized gender (task holds version A of the photo)</li> <li>submit_view gets the same photo out of the database in order to update its title (view holds version A of the photo too)</li> <li>Background task saves the photo (version B of the photo is saved i.e. denormalized gender updated)</li> <li>submit_view finished denormalization updates and saves the photo (version C of the photo is saved i.e. photo's title updated)</li> </ul> <p>The problem here is that version C doesn't contain the background task's changes i.e. updates to the denormalized gender (contained in version B) because the submit_view fetched the photo (version A) before the background task saved its changes to the user's denormalized gender. Thus the submit_view never knows about the background task's changes and overwrites them.</p> <p>One way out of this problem is to use transactions. However this would force us to use transactions in background tasks as well as for all saves to photos in order to avoid such situations. This can slow down our high-traffic web site and forces us to remember to use transactions whenever we want to update a photo. Additionally Django's transactions aren't optimistic so we have to remember to use <code>QuerySet.update()</code>. 
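</p> <p>The overwrite from the list above can be reproduced with a minimal sketch. It assumes an in-memory dict in place of the database; <code>save_whole</code> and <code>update_fields</code> are hypothetical helpers that only mimic the difference between writing a whole entity back and writing selected fields, they are not Django APIs.</p> <pre> <code class="language-python">import copy

# In-memory stand-in for the photo table (illustration only).
db = {1: {'title': 'Old title', 'denormalized_gender': 'male'}}


def fetch(pk):
    # Return a private copy, like an instance loaded from the database.
    return copy.deepcopy(db[pk])


def save_whole(pk, entity):
    # Hypothetical stand-in for saving a whole entity: every field is
    # written back, including fields this writer never changed.
    db[pk] = copy.deepcopy(entity)


def update_fields(pk, **changes):
    # Hypothetical stand-in for a field-level update: only the named
    # fields are written.
    db[pk].update(changes)


# Both sides load version A of the photo.
task_copy = fetch(1)
view_copy = fetch(1)

# The background task saves version B (new denormalized gender).
task_copy['denormalized_gender'] = 'female'
save_whole(1, task_copy)

# submit_view saves version C (new title) from its stale copy.
view_copy['title'] = 'New title'
save_whole(1, view_copy)
lost_update = dict(db[1])  # the gender change from version B is gone

# Same interleaving, but each side writes only the field it changed.
db[1] = {'title': 'Old title', 'denormalized_gender': 'male'}
update_fields(1, denormalized_gender='female')
update_fields(1, title='New title')
field_level = dict(db[1])  # both changes survive</code></pre> <p>The second run only stays consistent because every writer restricts itself to its own fields, which is exactly the discipline that is easy to forget. 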
Also, only a few non-relational databases support optimistic transactions or atomic UPDATE operations.</p> <h2><a id="materialized-views" name="materialized-views"></a>Materialized views</h2> <p>Another way to solve this problem is to introduce a third model containing a one-to-one relation to the <code>Photo</code> model. The basic idea behind this is to separate information used for querying (i.e. properties of photos used in queries) and the entities actually containing the information we care about (i.e. the photos themselves).</p> <pre> <code class="language-python">class PhotoUserView(models.Model):
    # denormalize all data of the photo
    denormalized_photo_title = models.CharField(max_length=100)
    denormalized_photo_popularity = models.CharField(max_length=100)
    denormalized_photo_user = models.ForeignKey(User)
    # ....
    # copy the user's gender into denormalized_user_gender
    denormalized_user_gender = models.CharField(max_length=10)
    # one-to-one relation used as primary key
    photo = models.OneToOneField(Photo, primary_key=True)</code></pre> <p>As you can see all fields of the <code>Photo</code> model are being denormalized into the <code>PhotoUserView</code> as well as the gender of the user. This doubles the amount of storage used because we have to store an additional entity for each photo, but storage is cheap so this isn't a real disadvantage.</p> <img alt="http://lh4.ggpht.com/_8v0Ka-uUQOY/TKBnm9dYB2I/AAAAAAAAALk/yDJ3R5QRH84/Materialized%20view.png" src="http://lh4.ggpht.com/_8v0Ka-uUQOY/TKBnm9dYB2I/AAAAAAAAALk/yDJ3R5QRH84/Materialized%20view.png" /> <p>Of course <code>PhotoUserView</code> points to <code>User</code> too because it contains the denormalized foreign key <code>denormalized_photo_user</code> from the photo model. 
The figure doesn't contain this link because it doesn't help in understanding the concept of materialized views.</p> <p>Given this model we can use the following efficient query to get photos for which we would've needed JOINs before:</p> <pre> <code class="language-python">photo_pks = PhotoUserView.objects.filter(
    denormalized_user_gender='female',
    denormalized_photo_popularity='high'
).values('pk')
female_user_photos = Photo.objects.filter(pk__in=photo_pks)</code></pre> <p>The trick here is that we use the one-to-one field as the primary key for entities of <code>PhotoUserView</code> so we only need to get their primary keys in order to fetch photos we are interested in. This technique is similar to the relation index (see <a href="http://code.google.com/intl/de-DE/events/io/2009/sessions/BuildingScalableComplexApps.html">Building Scalable, Complex Apps on App Engine</a>, same technique used in <a href="http://www.allbuttonspressed.com/projects/nonrel-search">nonrel-search</a> too). Additionally on App Engine and most other NoSQL databases the <code>pk__in</code> filter can efficiently batch-get all photos. Queries which wouldn't need JOINs can still be done on the <code>Photo</code> model directly:</p> <pre> <code class="language-python">popular_photos = Photo.objects.filter(popularity='high')</code></pre> <p>The important advantage of <a href="#materialized-views">materialized views</a> is that we don't have conflicts between users editing properties of their photos and background tasks updating denormalized properties anymore. Let's take a closer look at why that's the case: if a user changes his/her gender, the corresponding background task has to update the denormalized gender of all entities on the <code>PhotoUserView</code> model now and not on the <code>Photo</code> model. 
As a result changes by users editing photos at the same time won't get into conflicts with background tasks updating <code>PhotoUserView</code> entities anymore.</p> <p>However, changes to photos now have to start their own background tasks too in order to keep all denormalized properties of the <code>Photo</code> model in <code>PhotoUserView</code> up to date.</p> <p>So basically it all comes down to the following situations:</p> <ul> <li>User edits properties of a photo =&gt; We have to start a background task to update all denormalized properties of the corresponding <code>PhotoUserView</code> entity</li> <li>User changes his/her gender =&gt; We have to start background tasks to update the denormalized gender of all <code>PhotoUserView</code> entities which point to that specific user</li> </ul> <p>As a result materialized views solve the problem of having to use transactions whenever we want to update a photo.</p> <p>Now you might object that background tasks will eat too much bandwidth and that map/reduce would be more efficient. However, unless you use CouchDB views, map/reduce isn't incremental. What we've built here is a materialized view which updates only the affected entities. In contrast, map/reduce implementations like in MongoDB rebuild the whole index and that requires a lot more resources if you have a large and popular website. As an optimization, if your database supports transactions or atomic UPDATE operations you can get rid of materialized views, but then you have to be disciplined about using transactions/<code>QuerySet.update()</code> absolutely everywhere in your code. This becomes a problem if you want to reuse existing Django apps. Most of them use <code>Model.save()</code> which isn't safe. 
Materialized views don't have these disadvantages.</p> <p>In the <a href="http://www.allbuttonspressed.com/blog/django/2010/10/JOINs-via-denormalization-for-NoSQL-coders-Part-3-Ensuring-consistency">next post</a> we'll explain in detail when to start background tasks in order to avoid update conflicts and how to handle evil situations like database crashes.</p> JOINs via denormalization for NoSQL coders, Part 1: Intro2010-09-21T11:49:00+00:00Thomas Wanschikhttp://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-1-Intro<p>Non-relational databases like App Engine or MongoDB natively don't support operations like JOINs because they don't scale. However in some situations there just exists the need to use such operations. This is the first part of a series presenting one way to handle JOINs (at first in the to-one direction) on NoSQL databases while maintaining scalability. Additionally you'll learn some useful NoSQL coding techniques throughout this series.</p> <h2><a id="why-would-i-need-joins" name="why-would-i-need-joins"></a>Why would I need JOINs?</h2> <p>Let's take the example of users and their photos. Here users represent the to-one side and photos the to-many side:</p> <img alt="http://lh3.ggpht.com/_8v0Ka-uUQOY/TI-3_ZUmarI/AAAAAAAAAHk/l0fbu6NT4hw/Photos_Users.png" src="http://lh3.ggpht.com/_8v0Ka-uUQOY/TI-3_ZUmarI/AAAAAAAAAHk/l0fbu6NT4hw/Photos_Users.png" /> <p>It's common that users have the possibility to search for pictures of other users. While searching for photos of a specific user is easy to achieve on non-relational databases via</p> <pre> <code class="language-python">Photo.objects.filter(user=some_user)</code></pre> <p>for example, searching for photos of users given some specification like</p> <pre> <code class="language-python">Photo.objects.filter(user__gender='female', popularity='high')</code></pre> <p>isn't equally simple to achieve. 
Obviously we need JOINs here (which aren't supported on NoSQL databases) because we span a filter over two models. A straightforward solution to this problem is to get all female users first and then to get all popular photos whose user property points to these users:</p> <pre> <code class="language-python">user_pks = User.objects.filter(gender='female').values('pk')
female_user_photos = Photo.objects.filter(user__in=user_pks, popularity='high')</code></pre> <p>Getting only the primary keys of users here is just an optimization so we don't have to fetch whole user instances out of the database. However, this solution has a scalability problem. To demonstrate that, let's take a look at the following situation:</p> <ul> <li>We have a high-traffic website</li> <li>We want to display 20 photos per result page</li> <li>But only 1 in 5000 users has popular photos</li> </ul> <p>In such a situation we need to fetch more users in order to get more popular photos. It's easy to see that this doesn't scale because we have to fetch too much content out of the database at the same time. In addition to the scalability problem the site would become very slow and the database would get overloaded. On App Engine we would run into timeouts too.</p> <h2><a id="denormalization-is-the-answer" name="denormalization-is-the-answer"></a>Denormalization is the answer</h2> <p>Another way to solve the problem is to denormalize the users' gender into their photos i.e. to copy the user's gender into the <code>Photo</code> model.</p> <pre> <code class="language-python">class Photo(models.Model): .... 
# copy the user's gender into denormalized_gender denormalized_gender = models.CharField(max_length=10) ...</code></pre> <p>Each time we create a new photo we have to denormalize the corresponding user's gender:</p> <pre> <code class="language-python">new_photo = Photo(user=some_user, denormalized_gender=some_user.gender) new_photo.save()</code></pre> <p>Once done, a query for popular photos whose user is female becomes a simple and efficient exact filter on the denormalized gender:</p> <pre> <code class="language-python">Photo.objects.filter(denormalized_gender='female', popularity='high')</code></pre> <p>But what if you need more than just the user's gender for some of your queries? Maybe we need the user's age too. Following the example above we just denormalize it into the <code>Photo</code> model. That's it.</p> <p>If the user's gender doesn't change we've worked around the need for JOINs while maintaining scalability. However we have to keep in mind that changes to a user's gender can happen :P (if you refuse to agree with that replace the user's gender with his hair color :). In such cases we'll get wrong result sets because the query doesn't match the user's denormalized gender on the <code>Photo</code> model anymore. In the <a href="http://www.allbuttonspressed.com/blog/django/2010/09/JOINs-via-denormalization-for-NoSQL-coders-Part-2-Materialized-views">next post</a> we'll discuss how to handle these situations.</p> Get SQL features on NoSQL with django-dbindexer2010-09-02T12:53:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/09/Get-SQL-features-on-NoSQL-with-django-dbindexer<p>With the just released <a href="/projects/django-dbindexer">django-dbindexer</a> you can use <strong>all Django lookup types</strong> (&quot;iexact&quot;, &quot;month&quot;, &quot;day&quot;, etc.) on non-relational DBs like App Engine and MongoDB, even if they're not natively supported by that DB! 
Also, django-dbindexer can help you with optimizing your code. For instance, case-insensitive filters on MongoDB can't use indexes, so they're not very efficient. With django-dbindexer they can be handled as efficient case-sensitive filters which utilize an index. Sounds too good to be true? Keep reading.</p> <p>Non-relational databases have rather limited query APIs. For example, on App Engine you can't do a case-insensitive &quot;iexact&quot; match. Developers on these platforms have to use ugly workarounds to implement unsupported filters. Poor guys. ;) This is how you'd emulate &quot;iexact&quot; and &quot;month&quot; on App Engine (using <a href="/projects/djangoappengine">djangoappengine</a>):</p> <pre> <code class="language-python"># models.py: class MyModel(models.Model): name = models.CharField(max_length=64) lowercase_name = models.CharField(max_length=64, editable=False) last_modified = models.DateTimeField(auto_now=True) month_last_modified = models.IntegerField(editable=False) def save(self, *args, **kwargs): self.lowercase_name = self.name.lower() self.month_last_modified = self.last_modified.month super(MyModel, self).save(*args, **kwargs) def run_query(name, month): MyModel.objects.filter(lowercase_name=name.lower(), month_last_modified=month)</code></pre> <p>This has several problems:</p> <ul> <li>You have to remember to use <code>lowercase_name</code> and <code>lower()</code> and <code>month_last_modified</code></li> <li>You can't reuse existing Django code without analyzing and modifying all models and queries</li> <li>You can't easily utilize this workaround in the admin interface</li> <li>Even if you modify an existing app you still have to keep your patches up-to-date</li> <li>It makes your models and queries unnecessarily complicated</li> </ul> <p>The model above had merely two fields: &quot;name&quot; and &quot;last_modified&quot;. It's easy to imagine that in larger projects the models can become a real mess because of all the workarounds. 
This is just the wrong way to write DB code. Can't we automate this? Yes, we can! Enter <a href="/projects/django-dbindexer">django-dbindexer</a>. It allows you to specify index definitions separately from the model, similar to the &quot;index.yaml&quot; file on App Engine and the <a href="/blog/django/2010/07/Managing-per-field-indexes-on-App-Engine">djangoappengine index definitions</a> feature. Let's see what the example from above looks like with the indexer:</p> <pre> <code class="language-python"># models.py:
class MyModel(models.Model):
    name = models.CharField(max_length=64)
    last_modified = models.DateTimeField(auto_now=True)

def run_query(name, month):
    MyModel.objects.filter(name__iexact=name, last_modified__month=month)</code></pre> <p>Looks exactly like with SQL. Nice, isn't it? All you need to make this work is this index definition:</p> <pre> <code class="language-python"># dbindexes.py:
from models import MyModel
from dbindexer.api import register_index

register_index(MyModel, {'name': 'iexact', 'last_modified': 'month'})</code></pre> <p>With this little index definition we solve all of the problems mentioned above. Many Django apps like <a href="http://bitbucket.org/ubernostrum/django-registration">django-registration</a> can be brought to life on non-relational DBs <strong>without any modifications</strong> to their source. Also, the authors of reusable Django apps can add a simple &quot;dbindexes.py&quot; file to their project to make sure that NoSQL developers can use it out-of-the-box. At the same time the code will continue to work on SQL DBs. Even projects that target only non-relational DBs can benefit from the indexer because their code becomes <strong>simpler and portable</strong> across many NoSQL DBs. 
The indexer allows the different NoSQL communities to <strong>share the same code</strong>, sometimes even with the SQL community!</p> <p>Moreover, the index definition makes the &quot;iexact&quot; filter work efficiently on MongoDB because it converts the string to lowercase, so the &quot;iexact&quot; filter becomes an &quot;exact&quot; filter. This way it can be executed efficiently by using a MongoDB index.</p> <p>You can also define multiple filters on a single field by passing a tuple of filters:</p> <pre> <code class="language-python">register_index(MyModel, {'name': ('iexact', 'istartswith', ...)})</code></pre> <p>Finally, if you want to see a complete example you can download the <a href="http://bitbucket.org/twanschik/django-dbindexer-testapp/src">sample project</a> for App Engine.</p> <h2><a id="with-great-power-comes-responsibility" name="with-great-power-comes-responsibility"></a>With great power comes responsibility ;)</h2> <p>If you're new to NoSQL all this might sound like the Holy Grail that can magically solve all problems. Keep in mind that this is primarily a utility to make advanced developers more productive. You have to understand what's happening internally, so you can make wise design decisions. The internal implementation details of every filter are documented in the <a href="/projects/django-dbindexer">reference</a>.</p> <p>For example, the &quot;contains&quot; filter stores a <code>ListField</code> with all substrings of the indexed field. When querying, it uses &quot;startswith&quot; on that <code>ListField</code>. On App Engine <code>startswith</code> gets converted to &quot;&gt;=&quot; and &quot;&lt;&quot; filters.</p> <h2><a id="installation" name="installation"></a>Installation</h2> <p>This might sound strange, but the indexer is implemented as a database backend which proxies your actual database backend. 
For example, if you're on App Engine you have to specify your App Engine backend (here: &quot;gae&quot;) and the indexer (here: &quot;default&quot;) and then you tell the indexer via <code>'TARGET'</code> which backend should be indexed:</p> <pre> <code class="language-python"># settings.py: DATABASES = { 'default': { 'ENGINE': 'dbindexer', 'TARGET': 'gae', }, 'gae': { 'ENGINE': 'djangoappengine.db', }, } MIDDLEWARE_CLASSES = ( # This has to come first 'autoload.middleware.AutoloadMiddleware', ... ) INSTALLED_APPS = ( ... 'autoload', 'dbindexer', )</code></pre> <p>Note that the <code>settings.py</code> in <a href="http://bitbucket.org/wkornewald/django-testapp">django-testapp</a> already auto-detects and configures the dbindexer somewhere at the bottom of <code>settings.py</code>, so you might not need to change your settings. The middleware has to come first because it must load all index definitions before any other code tries to interact with the database. For more information on auto-loading of specified modules see <a href="/projects/django-autoload">django-autoload</a>.</p> <p>Next we have to define a site configuration module which loads the required index definitions. Let's put it in a <code>dbindexes.py</code> in our project root folder. First, we add it to <code>settings.py</code>:</p> <pre> <code class="language-python"># settings.py: AUTOLOAD_SITECONF = 'dbindexes'</code></pre> <p>The actual site configuration module looks like this:</p> <pre> <code class="language-python"># dbindexes.py: from dbindexer import autodiscover autodiscover()</code></pre> <p>The <code>autodiscover()</code> function will load all modules named <code>dbindexes.py</code> from your <code>settings.INSTALLED_APPS</code>, so they can register their indexes. Alternatively, you can just import the desired index definition modules directly without calling <code>autodiscover()</code>.</p> <h2><a id="the-future" name="the-future"></a>The future</h2> <p>This is just the beginning. 
At the moment the indexer &quot;only&quot; adds support for all Django filters, but the plan is to also add support for JOINs by automatically denormalizing your models. We could also support aggregates and <code>distinct()</code> and <code>values()</code> and nested queries and much much more. It won't be 100% all of SQL, but we want to get really close as long as it can be done in a scalable way. This will allow you to write complex queries in a few minutes instead of hours (including unit tests and debugging ;). No more hand-written denormalization and map/reduce and aggregate code. Just tell the indexer what you want to do and it'll handle it for you. Most importantly, it'll do this in a scalable way and internally it'll use the DB exactly like your hand-written code would (or even better ;). The first step is done. If you want to help with the next phase you can drop us a mail: <a href="http://groups.google.com/group/django-non-relational">http://groups.google.com/group/django-non-relational</a>.</p> Permissions with Django-nonrel2010-08-31T10:44:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/08/Permissions-with-Django-nonrel<p>Quick update: Florian Hahn has implemented a solution for permission handling with Django on non-relational databases. Django's own permission system unfortunately requires JOIN support and thus doesn't work. After his original announcement the code has been optimized, so a permission check can be done with just two database operations. Also, his backend now scales with the number of users. Florian has posted <a href="http://www.fhahn.com/blog/flo/2010/08/Django-s-Permission-System-with-Django-Nonrel">installation and usage instructions</a> on his blog. 
Check it out if you need permission support in your project.</p>Final, official GSoC Django NoSQL status update2010-08-22T10:15:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/08/Final-official-GSoC-Django-NoSQL-status-update<p>Alex Gaynor has posted a <a href="http://groups.google.com/group/django-developers/browse_thread/thread/89cdb56063ca0948">final status update</a> on his Google Summer of Code (GSoC) project which should bring official <a href="http://en.wikipedia.org/wiki/NoSQL">NoSQL</a> support to <a href="http://www.djangoproject.com/">Django</a>. Basically, Django now has a working <a href="http://github.com/alex/django/tree/query-refactor/django/contrib/mongodb/">MongoDB backend</a> (not to be confused with the MongoDB backend for Django-nonrel: <a href="http://github.com/FlaPer87/django-mongodb-engine">django-mongodb-engine</a>) and (after lots of skepticism) the ORM indeed needed only minor changes to support non-relational backends (surprise, surprise ;). There are still a few open design issues, but probably the ORM changes will be merged into trunk and the MongoDB backend will become a separate project.</p> <p>The biggest design issue (in my opinion) is how to handle <code>AutoField</code>. In the GSoC branch, non-relational model code would <strong>always</strong> need a manually added <code>NativeAutoField(primary_key=True)</code> because many NoSQL DBs use string-based primary keys. As you can see in Django-nonrel, a <code>NativeAutoField</code> is unnecessary. The normal <code>AutoField</code> already works very well and it has the advantage that you can reuse existing Django apps unmodified and you don't need a special <code>NativeAutoField</code> definition in your model. 
Hopefully this issue will get fixed before official NoSQL support is merged into trunk.</p> <p>Another issue is about efficiency: In the GSoC branch, <code>save()</code> first checks whether the entity already exists in the DB by doing <code><span class="pre">...filter(pk=self.pk).exists()</span></code> and then it decides whether to do an <code>insert()</code> or <code>update()</code> on the DB. Since non-relational DBs normally don't need to distinguish between inserts and updates we could just always call <code>insert()</code>. That would remove an unnecessary query from every <code>save()</code>.</p> <p>The final issue primarily affects <a href="http://code.google.com/appengine/">App Engine</a>'s transaction support: When you <code>delete()</code> an entity Django will also delete all entities that point to that entity (via <code>ForeignKey</code>). This won't work in an App Engine transaction because it would access multiple entity groups. Also, this operation can take very long when batch-deleting multiple entities (via <code>QuerySet.delete()</code>). In the worst case it will cause <code>DeadlineExceededError</code>s. The solution would be to allow the backend to handle the deletion. This way the App Engine backend (<a href="/projects/djangoappengine">djangoappengine</a>) could delegate the deletion to a background task.</p> <p>For Django 1.3 it's probably sufficient to only handle the <code>AutoField</code> issue. This doesn't affect App Engine, though, so independent of that we'll port our App Engine backend to Django trunk once the GSoC branch has been merged. This means you will only need <a href="/projects/django-nonrel">Django-nonrel</a> if you want to use App Engine transactions. In all other cases you can use <a href="/projects/djangoappengine">djangoappengine</a> with the <strong>official</strong> Django release! Isn't this exciting? Maybe some of you have waited for official NoSQL support before porting their model code and now the time has come? 
What do you think? I'd love to hear your comments.</p>Using Sass with django-mediagenerator2010-08-17T12:05:00+00:00Waldemar Kornewaldhttp://www.allbuttonspressed.com/blog/django/2010/08/Using-Sass-with-django-mediagenerator<p>This is the second post in our <a href="/projects/django-mediagenerator">django-mediagenerator</a> series. If you haven't read it already, please read the first post before continuing: <a href="/blog/django/2010/08/django-mediagenerator-total-asset-management">django-mediagenerator: total asset management</a></p> <h2><a id="what-is-sass" name="what-is-sass"></a>What is Sass?</h2> <p>Great that you ask. :) <a href="http://sass-lang.com/">Sass</a> is a high-level language for generating CSS. What? You still write CSS by hand? &quot;That's so bourgeois.&quot; (Quick: Who said that in which TV series?) Totally. ;)</p> <p>Sass to CSS is like Django templates to static HTML. Sass supports variables (e.g.: <code>$mymargin: 10px</code>), reusable code snippets, control statements (<code>&#64;if</code>, etc.), and a more compact indentation-based syntax. You can even use selector inheritance to extend code that is defined in some other Sass file! Also, you can make computations like <code>$mymargin / 2</code> which can come in very handy e.g. for building <a href="http://www.alistapart.com/articles/fluidgrids/">fluid grids</a>. Let's see a very simple example of the base syntax:</p> <pre> <code class="language-sass">.content padding: 0 p margin-bottom: 2em .alert color: red</code></pre> <p>This produces the following CSS code:</p> <pre> <code class="language-css">.content { padding: 0; } .content p { margin-bottom: 2em; } .content .alert { color: red; }</code></pre> <p>So, nesting can help reduce repetition and the cleaner syntax also makes Sass easier to type and read. Once you start using the advanced features you won't ever want to go back to CSS. But please do yourself a favor and don't use the ugly alternative SCSS syntax. 
:)</p> <h2><a id="do-i-have-to-convert-my-css-by-hand" name="do-i-have-to-convert-my-css-by-hand"></a>Do I have to convert my CSS by hand?</h2> <p>Nah, you don't have to do it by hand. One solution is to just go to the <a href="http://css2sass.heroku.com/">css2sass</a> website and paste your CSS code there. The website will convert everything to nice Sass source. Alternatively, you can convert the CSS file using the <code><span class="pre">sass-convert</span></code> command line tool which comes pre-installed with Sass. Let's pretend you want to convert &quot;style.css&quot; to &quot;style.sass&quot;:</p> <pre> <code>sass-convert style.css style.sass</code></pre> <p>It's that easy.</p> <h2><a id="yes-i-want-to-use-it-now" name="yes-i-want-to-use-it-now"></a>Yes, I want to use it now!</h2> <p>Enough talk. Let's say you have a Sass file called &quot;css/design.sass&quot; and just for the fun of it you also have a CSS file from a jQuery plugin called &quot;css/jquery.plugin.css&quot;. Let's combine everything into a &quot;main.css&quot; bundle:</p> <pre> <code class="language-python"># settings.py MEDIA_BUNDLES = ( ('main.css', 'css/design.sass', 'css/jquery.plugin.css', ), )</code></pre> <p>It couldn't be easier. The media generator detects Sass files based on their file extension, so you just have to name the file. But here comes the best part: In your Sass files you can use the handy <code>&#64;import</code> statement to import other files and the media generator will automatically keep track of all dependencies. So, whenever you change one of the dependencies the main Sass file gets recompiled on-the-fly. This is very much like the &quot;sass --watch&quot; development mode, but with the nice advantage that you don't even have to remember to start the Sass command. 
<a href="http://en.wikipedia.org/wiki/Barney_Stinson">Barney Stinson</a> would say: &quot;It's legendary!&quot; :)</p> <p>So, if you want to feel legendary don't wait for it and quickly install <a href="http://sass-lang.com/">Sass</a> and give it a try with <a href="/projects/django-mediagenerator">django-mediagenerator</a>. It'll boost your CSS productivity to new heights.</p> nonrel-search updates: auto-completion and separate indexing2010-08-11T14:50:00+00:00Thomas Wanschikhttp://www.allbuttonspressed.com/blog/django/2010/08/nonrel-search-updates-auto-completion-and-separate-indexing<p>We had long planned to add some remaining features from <a href="http://gae-full-text-search.appspot.com/">gae-search</a> to <a href="http://www.allbuttonspressed.com/projects/nonrel-search">nonrel-search</a>, and since we stopped developing gae-search we decided to make some of the premium features open-source. So let's see what changed.</p> <h2><a id="new-features" name="new-features"></a>New Features</h2> <p>We basically changed two things in nonrel-search: first, it's now possible to index a model via a separate definition, i.e. without having to modify the model's source itself, and second, you can use our auto-completion feature from the good old gae-search days. :)</p> <div id="separate-indexing"> <h3>Separate indexing</h3> <p>So let's say you want to index some of your models. With the old version of nonrel-search you had to add a <code>SearchManager</code> to each model you wanted to search. 
With the latest version of nonrel-search you have to define these indexes separately from your model like this:</p> <pre> <code class="language-python"># post.models from django.db import models class Post(models.Model): title = models.CharField(max_length=500) content = models.TextField() author = models.CharField(max_length=500) category = models.CharField(max_length=500)</code></pre> <pre> <code class="language-python"># post.search_indexes import search from search.core import porter_stemmer from post.models import Post # index used to retrieve posts using the title, content or the # category. search.register(Post, ('title', 'content','category', ), indexer=porter_stemmer)</code></pre> <p>As you can see we use the new <code>register</code> function to make posts searchable by title, content and category leaving the author aside. The first parameter defines the model you want to index. The remaining parameters are just the same as for the old <code>SearchManager</code>. The <code>register</code> function automatically adds an index called 'search_index'. Of course it's possible to add multiple such search indexes, just register more of them and pass in a name for the index:</p> <pre> <code class="language-python"># post.search_indexes ... search.register(Post, ('category', 'title' ), indexer=porter_stemmer, search_index='second_search_index')</code></pre> <p>Here we define a new index called 'second_search_index'.</p> <p>In addition to defining the indexes we have to make sure that the <code>register</code> function will be executed. 
Nonrel-search provides a function called <code>autodiscover</code> which automatically searches the INSTALLED_APPS for &quot;search_indexes.py&quot; modules and registers all search indexes.</p> <pre> <code class="language-python"># search for "search_indexes.py" in all installed apps import search search.autodiscover()</code></pre> <p>You should call <code>autodiscover</code> in your <code>settings.AUTOLOAD_SITECONF</code> module to make sure that your indexes get loaded. See <a href="/projects/django-autoload">django-autoload</a> for more information on auto-loading modules (so don't forget to install <a href="/projects/django-autoload">django-autoload</a>). Now it's possible to search for models using the newly added <code>search</code> function:</p> <pre> <code class="language-python">from search.core import search posts = search(Post, 'Hello world')</code></pre> <p><code>search</code> takes two arguments: the first specifies the model to search and the second specifies the query. <code>search</code> automatically uses the index 'search_index'. If you want to use a different index just pass in the name of the desired index:</p> <pre> <code class="language-python">from search.core import search # use the index named 'second_search_index' instead of the default one posts = search(Post, 'Hello world', search_index='second_search_index')</code></pre> <p>Defining indexes separately from the model definition is especially useful for already existing Django apps so you can make them searchable without having to modify their source code. For example, it's possible to make users searchable via their first name and last name just by adding a search index in a separate module.</p> <h3><a id="auto-completion-or-suggest-as-you-type" name="auto-completion-or-suggest-as-you-type"></a>Auto-completion or &quot;suggest-as-you-type&quot;</h3> <p>Auto-completion is the first premium feature we make open-source. 
Let's say you want to add auto-completion for category names while creating posts. Nonrel-search makes this an easy job. In order to do so you first have to register a search index which uses the <code>startswith</code> indexer:</p> <pre> <code class="language-python"># post.search_indexes ... # auto-completion index used to suggest categories search.register(Post, ('category', ), indexer=startswith, search_index='autocomplete_index')</code></pre> <p>Then you can use the <code>LiveSearchField</code> which can be integrated into forms to define an auto-completion form:</p> <pre> <code class="language-python"># post.forms from django import forms from post.models import Post from search.forms import LiveSearchField class CreatePostForm(forms.ModelForm): category = LiveSearchField('/post/live_search/') class Meta: model = Post</code></pre> <p>Just pass <code>LiveSearchField</code> the URL to the auto-complete view which retrieves your posts. You can also configure the auto-completion behavior with additional parameters. See the <a href="http://www.allbuttonspressed.com/projects/nonrel-search#documentation">documentation</a> for more information. You can then use this form in your templates to display an auto-completed input field. 
So, the only things left are a view that returns the data required for auto-completion and the inclusion of the necessary JavaScript / CSS files in your HTML:</p> <pre> <code class="language-python"># post.views from post.models import Post from search.views import live_search_results def live_search(request): return live_search_results(request, Post, search_index='autocomplete_index', result_item_formatting= lambda post: {'value': u'&lt;div&gt;%s&lt;/div&gt;' % (post.category), 'result': post.category, })</code></pre> <p>Here, we use the function <code>live_search_results</code> and pass in the model class to search on, the name of the index to use for searching (default is 'search_index'), and a formatting function which specifies how your auto-completed posts will be displayed. Note that this function returns a dictionary having two items: 'value' specifies how to display your posts and 'result' specifies what to put into the input field when selecting a post from the auto-completed results list.</p> <p>If you don't specify any <code>result_item_formatting</code> function, 'value' will be the escaped value of the first indexed property and 'result' will be the unescaped value of the same property.</p> <p>Note that <code>request</code> has to include a GET parameter 'query'.</p> <p>Here is the remaining code you have to add so that all necessary JavaScript / CSS files will be included using the <a href="http://www.allbuttonspressed.com/projects/django-mediagenerator">django-mediagenerator</a>:</p> <pre> <code class="language-python"># settings # list your css and js data here MEDIA_BUNDLES = ( ('main.css', ..., 'search/jquery.autocomplete.css', 'search/search.css', ), ('main.js', ..., 'jquery.js', 'jquery.livequery.js', 'search/jquery.autocomplete.js', 'search/autocomplete_activator.js', ), )</code></pre> <p>and in your templates add this:</p> <pre> <code class="language-django">... 
{% block css %} {% include_media 'main.css' %} {% endblock %} {% block js %} {% include_media 'main.js' %} {% endblock %}</code></pre> <p>You can download the <a href="http://bitbucket.org/twanschik/nonrel-search-testapp">nonrel-search-testapp</a> to get started and play around with nonrel-search. If you create some nice app using nonrel-search please let us know.</p>
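<p>To make the <code>startswith</code> index from the auto-completion section a bit more concrete, here is a minimal sketch of the general prefix-indexing idea behind suggest-as-you-type. It is <strong>not</strong> nonrel-search's actual implementation: the helper names <code>prefixes</code> and <code>matches</code> are hypothetical, and the sketch simply assumes that every prefix of every word gets stored alongside the entity, so that a typed fragment can be found with plain equality lookups.</p>

```python
# Illustrative sketch only; these helpers are NOT part of nonrel-search.
# Assumption: a "startswith"-style index stores every prefix of every word,
# so a partially typed word can be matched with simple equality filters.


def prefixes(value):
    """Return the sorted, lowercased prefixes of each word in ``value``."""
    result = set()
    for word in value.lower().split():
        for end in range(1, len(word) + 1):
            result.add(word[:end])
    return sorted(result)


def matches(indexed_prefixes, query):
    """Return True if every term typed so far is one of the stored prefixes."""
    stored = set(indexed_prefixes)
    return all(term in stored for term in query.lower().split())


category_index = prefixes('Django News')
print(matches(category_index, 'dj ne'))  # both fragments start a word
print(matches(category_index, 'jan'))    # 'jan' only occurs mid-word
```

<p>The trade-off in this kind of scheme is storage for query simplicity: a word of length <em>n</em> contributes <em>n</em> index entries, which is usually acceptable for short fields such as category names.</p>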