A space for tracking notes related to an effort to port @tiddlyweb to Python3.

There are four main differences that have been encountered thus far:

  • The print keyword is now a function: print()
  • Default 'strings' are unicode, and encoded strings are bytes. This has cascading impacts throughout the system on how input and output are handled, and on comparisons between things that may be either strings or bytes.
  • dict.keys(), dict.values() and dict.items() return view objects, not lists. dict.iteritems() and friends are gone, replaced by dict.items().
  • except clauses must be written as except (ExceptionA, ExceptionB) as exc:
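
As a quick reference, the four differences look roughly like this in Python3. This is only an illustrative sketch; the variable names are made up and none of it comes from the TiddlyWeb code.

```python
# Python 3 forms of the four differences listed above.

# 1. print is a function.
print("hello")

# 2. str is unicode text and bytes is encoded data. They never compare equal,
#    so every boundary needs an explicit encode or decode.
text = "caf\u00e9"
data = text.encode("utf-8")
same = (text == data)               # False, even though they "look" alike
round_trip = data.decode("utf-8")   # back to str

# 3. dict.keys() and dict.items() return views, not lists.
#    Wrap them in list() or sorted() when a real list is needed.
d = {"a": 1, "b": 2}
keys = sorted(d.keys())
pairs = sorted(d.items())

# 4. Multiple exceptions are caught with a tuple and bound with "as".
try:
    int("x")
except (ValueError, TypeError) as exc:
    message = str(exc)
```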

These things are, in themselves, easy enough to deal with, but it can be hard to be confident about the results because of testing problems:

  • TiddlyWeb depends on a small number of packages:
    • selector
    • html5lib
    • mimeparse
    • and for testing:
      • wsgi-intercept
      • httplib2

October 9

Everything is now working. Besides those listed under the earlier dates below, the remaining issues are:

  • It turns out the html5lib code from its repo works fine when the encoding passed in is known, so it works for TiddlyWeb (where the encoding, if any, is always UTF-8). But since one of its claims to fame is dealing with weird stuff "live" on the web, it is not ready for release.
  • The twanager server command uses cherrypy.wsgiserver. This has been ported to Python3, but there is a minor bug with Python3.3. Fixed in dev, but not yet released.
  • always_safe from urllib is now gone and needs to be worked around to get a properly working quote out of urllib.parse.
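
One possible workaround for the missing always_safe is to keep a local copy of the character set and hand it to quote explicitly. This is a minimal sketch, not necessarily what TiddlyWeb does: ALWAYS_SAFE and encode_name are names made up here, and the character set is the one the Python2 constant held.

```python
import string
from urllib.parse import quote, unquote

# Local stand-in for the removed urllib.always_safe (hypothetical name).
# Python 2 defined it as ASCII letters, digits and "_.-".
ALWAYS_SAFE = string.ascii_letters + string.digits + "_.-"


def encode_name(name, extra_safe=""):
    """Percent-encode a text name as UTF-8.

    Only ALWAYS_SAFE plus any extra_safe characters are left unescaped.
    Note that "/" is escaped unless it is passed in extra_safe, unlike
    the default behaviour of urllib.parse.quote.
    """
    return quote(name, safe=ALWAYS_SAFE + extra_safe, encoding="utf-8")


# Usage: the input is str, the output is an ASCII-only str.
encoded = encode_name("caf\u00e9 notes/1")
decoded = unquote(encoded)
```

Passing the safe set explicitly keeps the behaviour visible at the call site rather than depending on whatever defaults a given Python version applies.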

For now the process will be to just keep a working environment with the right stuff in it and see where it goes. That means a local copy of httplib2, html5lib, wsgi-intercept and cherrypy.

October 3

I have working wsgi-intercept and selector ports now. I did the wsgi-intercept port myself and it seems to be working fine. I also did selector myself, but in concert with a pull request on the existing repo. Because that has not been merged yet, for the time being I've moved my selector.py changes to be tiddlyweb.web.selector.

Many of the tests are passing now. The latest hangup is with httplib2 but this should be relatively straightforward to overcome.

The next step will be formalizing the work I've already done on html5lib. I had a mostly working version already, but not suitable for providing back to the original authors. Will start that soon. In the meantime, the validator code in TiddlyWeb that makes use of html5lib is simply turned off. This makes tests fail, but means that the code will compile (because html5lib is never imported).

October 1

Of the dependencies listed at the top, httplib2 and mimeparse seem to have mature Python3 versions. The rest, not so much. Okay, this is a fixable problem. But:

  • Both selector and wsgi-intercept have significant test (but not deploy) requirements.
  • There appears to be some remaining ambiguity about how a WSGI application should expect a web server to present HTTP headers and body. Bytes or strings? What encoding? And how should the app present headers and body to the server for the response? Presumably the server should be liberal in what it accepts, and strict in what it sends. If that's the case what are the strict rules?
  • I did the required changes in place to html5lib to get it working with TiddlyWeb, but that doesn't result in a working html5lib, so that will need attention as well.
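
On the bytes-or-strings question, PEP 3333 (the Python3 revision of the WSGI spec) gives the strict rules: environ values, the status line and the response headers are native str objects limited to characters representable in ISO-8859-1, while request and response bodies are bytes. A minimal sketch of an app following those rules is below. The app and the call_app helper are made up for illustration and are not taken from selector, wsgi-intercept or TiddlyWeb.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Tiny WSGI app that follows the PEP 3333 string rules."""
    # PATH_INFO arrives as a str whose code points are really bytes
    # decoded as latin-1. Re-encode and decode to get the UTF-8 text.
    raw_path = environ.get("PATH_INFO", "/")
    path = raw_path.encode("latin-1").decode("utf-8")

    # The body must be bytes, so encode explicitly.
    body = ("path: " + path).encode("utf-8")

    # Status and headers must be native str, never bytes.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=UTF-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path_info):
    """Call the app in-process with a synthetic environ (test helper)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path_info
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = list(app(environ, start_response))
    return captured["status"], captured["headers"], b"".join(chunks)


# "/café" as a server would present it: UTF-8 bytes decoded as latin-1.
status, headers, body = call_app("/caf\u00c3\u00a9")
```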

Experimenting with the latter issue means that being able to use the tests in selector and wsgi-intercept is very important, but it is hard to get off the ground.

I've tried working at the problem from the angle of the TiddlyWeb tests, and this has been quite effective for all of the system except for the web tests. Once the web tests are involved, the lack of confidence in the bytes and strings produced by selector and wsgi-intercept means frequent adjustments. Thus it seems that it would be better to get those tools working correctly, and then get TiddlyWeb correct.

selector will be relatively clean to port if the tests can be made to run. If a working wsgi-intercept can be created, it can provide some of the missing infrastructure in the selector tests. Getting a working wsgi-intercept will likely require removing support for some of the interceptors. The only ones that I particularly care about are httplib2 and urllib2 (which presumably will become urllib.request).
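
For the urllib2 side, the Python3 home of urlopen and Request is indeed urllib.request. A small sketch of an import that works on both versions follows. It only shows the import dance, not how wsgi-intercept actually hooks in, and the URL is a placeholder.

```python
try:
    # Python 3 location.
    from urllib.request import Request, urlopen
except ImportError:
    # Python 2 fallback.
    from urllib2 import Request, urlopen

# Building a Request does not touch the network.
request = Request("http://example.com/bags")
method = request.get_method()
```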

So I reckon the next step is to recreate a trimmed down wsgi-intercept which has accurate test coverage but a different testing harness. To do this the questions about how WSGI must work will need to be resolved.

So that's the current state.

I will add links to the relevant pieces of code as they are gathered together:

Mon, 01 Oct 2012 15:29:41 GMT
Tue, 09 Oct 2012 21:02:51 GMT