(hereafter betraying my own language in order to broadcast this publication farther away)
imposing context
Brand new shit on the global corner: impostor.herokuapp.com aims to be a massive desktop publishing tool that enables common citizens to use ordinary printers to produce books, which I want to see lying on every street like damned junkies. The Press to the People.
How does it work? Well, the whole idea is to be able to print, for example, 4 pages per side on one A4 sheet, with page numbers in the right order, so that the number on one side is the one following the number on the reverse. Get it? That is called nUp imposition (hence, impostor) because you just specify nX & nY pages, in this case 2x2, but it could be 3x5, etc. OK, and you can specify sizes too, for both the inner page and the outer sheet. If what you want is incongruent, don’t worry: impostor is yogi flexible.
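The page ordering described above can be sketched in a few lines. This is an illustration in Python, not impostor's Ruby code: the function name `nup_layout` and the assumption that the sheet is turned over around its vertical axis (so the back of a cell sits in the mirrored column) are mine.

```python
def nup_layout(n_x, n_y, first_page=1):
    """Return (front, back) grids of page numbers for one duplex sheet.

    Assumption (not taken from impostor's source): the sheet is turned over
    around its vertical axis, so the back of a cell is in the mirrored column.
    """
    # Front side: every other page starting at first_page,
    # left to right, top to bottom.
    front = [
        [first_page + 2 * (row * n_x + col) for col in range(n_x)]
        for row in range(n_y)
    ]
    # Back side: each cell holds the page following the one printed behind it.
    back = [
        [front[row][n_x - 1 - col] + 1 for col in range(n_x)]
        for row in range(n_y)
    ]
    return front, back


if __name__ == "__main__":
    front, back = nup_layout(2, 2)
    print(front)  # [[1, 3], [5, 7]]
    print(back)   # [[4, 2], [8, 6]]
```

After cutting the 2x2 sheet into four leaves, each leaf carries consecutive numbers on its two faces (1/2, 3/4, 5/6, 7/8), which is the property the paragraph asks for.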
TeXné
Right, but how does it work? Ehh, it’s Rails on Heroku, built around a console command, impostor, which is a ruby gem under the GNU General Public License. It is inspired by a few existing Open Source solutions. I could quote Rhimposition, which I hacked years ago to combine nUp and Booklets… but fuck Adobe, man. And also proPosition, from which comes the ruby + LaTeX combo, and whose cut marking I intend to inspect soon. Credit be given to Free software then. LaTeX on Heroku? Yeah man, here goes the buildpack, born also of the furious love of a TeXLive installation buildpack written in bash and Heroku’s regular RubyOnRails buildpack. This one also installs texlive-recommended and the xpdf executables. More than for the packages themselves, I think it could serve as a base to build anything. 🙂
booklets
What was that Booklet stuff? Ah, that’s the star feature. Instead of just printing and cutting, you can print, cut half of the pages and fold the other half. Cheaper… but mostly, it gives the book a structure. You can fold them and put them side by side, and you’ve got a spine to glue. Or you can put these booklets (hence the technique’s name) one inside the other. All of them together for a small book. Grouped by tens for a big book.
coming soon
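For the fold-and-nest case, the classic ordering for a single booklet looks like the sketch below. It is the textbook saddle-stitch layout, written in Python for illustration; I am not claiming it is what impostor does internally, and `booklet_order` is a made-up name.

```python
def booklet_order(n_pages):
    """Return one (front, back) pair per folded sheet, outermost sheet first.

    Each side is a (left, right) pair of page numbers. The page count is
    padded up to a multiple of 4; padding pages are returned as None (blank).
    """
    total = -(-n_pages // 4) * 4  # round up to a multiple of 4

    def page(p):
        # Pages beyond the real page count are blanks.
        return p if p <= n_pages else None

    sheets = []
    for i in range(total // 4):
        front = (page(total - 2 * i), page(2 * i + 1))
        back = (page(2 * i + 2), page(total - 2 * i - 1))
        sheets.append((front, back))
    return sheets


if __name__ == "__main__":
    for sheet in booklet_order(8):
        print(sheet)
    # ((8, 1), (2, 7))
    # ((6, 3), (4, 5))
```

Fold each sheet in half, slip the second inside the first, and the pages read 1 to 8 in order.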
So this is about everyone learning to craft nice books, pals. What’s next? Right now it’s lacking cut marks (so you know where to cut), margins so one can apply corrections for fucked-up printers and, perhaps, page rotations.
After that, featured sizes by country and a database of printer-specific tricks to ease all this… So we’ll need user feedback, that’s the goal.
blob files persistence
Irie, and how do binary files and scalability fit in this picture? That’s the major architectural challenge right now. The original idea was to receive the file, impose it, give it back and delete it. All that time the files lived in the filesystem, from which they were periodically removed by a rake task that implemented a 10 min rule: after that time, the user session expires. This is blazingly fast, since the only network points are upload and download; the program boasts of processing in less than 1 s files that took hours back in the InDesign days.
You can have a look at these basic performance tests, where we measure entering the website (root), uploading the file (upload), processing it (post params) and downloading the result, minus user delay (the computer knows where to click):
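The 10 min rule boils down to something like the following sketch. The real thing is a rake task; this is a Python stand-in, and the name `purge_expired`, the flat directory layout and the use of modification time as the session clock are all assumptions of mine.

```python
import os
import tempfile
import time


def purge_expired(root, max_age_seconds=600, now=None):
    """Delete regular files directly under root older than max_age_seconds.

    Returns the sorted names of the files that were removed.
    Assumption: a file's modification time marks the start of its session.
    """
    now = time.time() if now is None else now
    with os.scandir(root) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    # Demo on a throwaway directory: one stale file, one fresh file.
    demo_dir = tempfile.mkdtemp()
    stale = os.path.join(demo_dir, "stale.pdf")
    fresh = os.path.join(demo_dir, "fresh.pdf")
    for path in (stale, fresh):
        open(path, "wb").close()
    an_hour_ago = time.time() - 3600
    os.utime(stale, (an_hour_ago, an_hour_ago))
    print(purge_expired(demo_dir))  # ['stale.pdf']
```

Run periodically (cron, a scheduler, whatever is at hand), this keeps the disk clean as long as every request for a given session lands on the same machine.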
TEST | GC TIME | WALL TIME
root | 0.00848925113677979 | 0.00823044776916504
upload | 0.028685986995697 | 0.0249580144882202
post params | 0.846695005893707 | 0.819966793060303
but…
Heroku’s ephemeral filesystem was the first pain in the ass. Heroku is some kind of amalgam of virtual machines that live and die on demand nowhere in their cloud, so you just don’t have a filesystem underneath (you’re in the Cloud!). I concluded it wasn’t fit for my app and moved to Engine Yard, whose extra helpful panda Will almost deployed it for me (incredibly available to chat and answer mails, never seen anything like it).
But running two load-balanced instances, impostor stumbled and fell to the ground: files could be uploaded to one machine and the next request be served by the other machine. So it is not scalable in itself.
scalability
When I mailed Will, he suggested the database. I wasn’t at all convinced, so I worked out a database solution that saved files to the database when uploaded or processed, and resurrected them along with their paths whenever necessary.
For a small file, performance wasn’t particularly affected, even when deleting folders in between requests (check full rms), but when talking about a big file the picture changes a bit, as the heavy (a full run with a 50 MB file) and heavy rms (a full rms run with the same file) tests show:
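Roughly, the save-and-resurrect idea looks like this. It is a minimal Python sketch using SQLite as a stand-in for the shared database; the table name `blobs` and the helpers `open_store`, `save_file` and `restore_file` are hypothetical, not the app's actual models.

```python
import os
import shutil
import sqlite3
import tempfile


def open_store(db_path=":memory:"):
    """Open the blob store. A real deployment would point every instance
    at the same database server; SQLite here is only a stand-in."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blobs "
        "(path TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def save_file(conn, path):
    """Copy the file at path into the database, keyed by its path."""
    with open(path, "rb") as fh:
        data = fh.read()
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR REPLACE INTO blobs (path, data) VALUES (?, ?)",
            (path, data),
        )


def restore_file(conn, path):
    """Recreate path on the local disk if it is missing.

    Returns True if the file exists afterwards, False if it is unknown.
    """
    if os.path.exists(path):
        return True
    row = conn.execute(
        "SELECT data FROM blobs WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(row[0])
    return True


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    pdf = os.path.join(work, "session", "book.pdf")
    os.makedirs(os.path.dirname(pdf))
    with open(pdf, "wb") as fh:
        fh.write(b"%PDF-demo")
    store = open_store()
    save_file(store, pdf)
    shutil.rmtree(os.path.dirname(pdf))  # simulate landing on another machine
    print(restore_file(store, pdf))      # True
```

Every byte now makes an extra round trip through the database on each request, so the cost grows with file size.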
TEST | GC TIME | WALL TIME
root | 0.00755596160888672 | 0.00747352838516235
upload | 0.0329135060310364 | 0.0374072194099426
post params | 1.07429385185242 | 1.3668053150177
full | 0.929822742938995 | 0.915787577629089
full rms | 0.957760810852051 | 0.934577465057373
heavy | 17.8963747620583 | 16.2370020151138
heavy rms | 22.4671932458878 | 21.0466043353081
We can guess a close relation between file size and performance.
So I ran a bunch of equivalent tests using different file sizes. This graph displays total time in seconds according to file size in kilobytes.
Here we can see that the problem is O(n): linear after a certain size (approx. 20 MB), and therefore kind of a touchstone for the app.
This will probably force a file size limit, unless or until we find some scalable file storage solution (MongoDB?).
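One way to check the linear claim against your own measurements is an ordinary least-squares fit of time against size. The sketch below is plain Python with no data baked in; `fit_line` is a name I made up, and you feed it your own (size, seconds) pairs.

```python
def fit_line(sizes_kb, seconds):
    """Least-squares fit of seconds ~ slope * size + intercept.

    Returns (slope, intercept, r_squared). An r_squared close to 1 over the
    large-file range supports the O(n) reading; slope is seconds per kilobyte.
    """
    n = len(sizes_kb)
    if n < 2 or n != len(seconds):
        raise ValueError("need at least two (size, time) pairs")
    mean_x = sum(sizes_kb) / n
    mean_y = sum(seconds) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_kb)
    syy = sum((y - mean_y) ** 2 for y in seconds)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_kb, seconds))
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With constant timings the line fits trivially.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

With a slope in hand, a size limit follows from whatever time budget you can tolerate: the largest acceptable size is roughly (budget - intercept) / slope.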
http://cargocollective.com/Rokotyan/Rhimposition-Scripts