I have started to remember more details about problems than I do about code.
Code used to occupy a large part of my thinking at work - algorithms, data structures and architectural concepts. Over the last two years, the organization of my thoughts has shifted towards understanding parallels with the real-life structure of the problems. I'm looking through the attic of my brain and finding new uses for all that I've already done, which makes for a better part II.
So for a while I've been putting thought into revisiting ideas, to feed them through this new lens. Here are some of them which I'd like to think more clearly about over 2018. If any of you have one of these problems or would like to enlighten me about how you are solving them, I'd like to buy you a beverage of your choice.
First up, large corpus compression. At a previous job, we had a few million users' game worlds in blob form, compressed with SNAPPY. It was of course incredibly inefficient to store every user's world's stereotypical artifacts again and again. The fundamental problem was to split apart the common structures between multiple blobs into a common blob and implement an LZ77 variant which can entirely remove large chunks of the input data from the compressed form and merely reference the common blob range (i.e. SNAPPY COPY instructions with -ve offsets). This is not very revolutionary, because you can find something similar for English text in a new compression algorithm like Brotli (see RFC7932 Appendix A). It is already possible to do something similar, when it comes to Druid segment files, to compress URL columns by a large fraction within the segment, while still being able to decode any random row out of the column.
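To make the shared-blob idea concrete, here is a minimal sketch using zlib's preset dictionary support rather than SNAPPY (a swapped-in technique, since zlib ships with Python). The `common_blob` and `world` contents are made-up placeholders and the helper names are hypothetical; the point is only that the repeated structure is referenced out of the dictionary instead of being stored again.

```python
import zlib


def compress_with_common(blob: bytes, common: bytes) -> bytes:
    """Deflate `blob`, letting matches point back into the shared `common` blob."""
    comp = zlib.compressobj(level=9, zdict=common)
    return comp.compress(blob) + comp.flush()


def decompress_with_common(data: bytes, common: bytes) -> bytes:
    """Inverse of compress_with_common; needs the exact same `common` blob."""
    decomp = zlib.decompressobj(zdict=common)
    return decomp.decompress(data) + decomp.flush()


# Hypothetical data: artifacts every world contains, plus a user-specific tail.
common_blob = b"".join(
    b"tree:%d;rock:%d;house:%d;" % (i, i * 7, i * 13) for i in range(50)
)
world = common_blob + b"player:alice;pet:dragon;"

packed = compress_with_common(world, common_blob)  # references the shared blob
plain = zlib.compress(world, 9)                    # has to encode it all again

print(len(world), len(plain), len(packed))
```

The catch is the same one as with the real thing: the common blob becomes part of the format, so it has to be versioned and kept around for as long as any blob compressed against it exists.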
But what I want to do with that core idea is to apply it to another data-set which has a large amount of repetition between records - DNA and RNA. The idea is to apply this sort of lossless compression model to FASTQ data, but to seed it with known sequences as a database for long-range LZ77. The core idea is to compress the million genomes (like the one the VA is building with the Million Veteran Program) while making it much easier to look up by a known (or relevant) sequence. The compression format offers a fast way to do 1000 base comparisons, if you could build a pre-set dictionary and map them while they are being ingested. This is not a trivial operation at scale, but if you can cross-cut through the compression algorithm for your search implementation, this offers a massive improvement on the IO requirement on a cloud system. In a very vague way of saying, the more data you have the less data you'll have per-person.
The basic search optimization problems are also interesting for this, since nobody is going to download a large warehouse of data to match a single chunk of ctDNA that they pulled out of somebody's blood, but it is a batch problem with interesting parallels to other plain alphabet problems. My scrabble word finding cheat scripts do prime-hashing, which would work for fragment shuffles (& also off-by-1 identifications), and which compresses 8 letter combos into 1 long without order (i.e. all permutations of the same 8 letters map to the same long). As a performance engineer (& not a scientist), finding patterns with a prime multiply Rabin-Karp in an alphabet soup is exactly as complex as an alignment problem. But the difference between the toy problem and the real one is literally life and death.
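For the curious, the prime-hashing trick can be sketched roughly like this. It is an illustration and not the actual cheat script: the letter-to-prime table (alphabetical order, first 26 primes) and the function names are assumptions. Each letter maps to a prime and a word hashes to the product, so letter order drops out, and "can this rack spell that word" becomes a divisibility check.

```python
# First 26 primes, assigned to a..z in alphabetical order (an arbitrary choice).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
LETTER_PRIME = {chr(ord("a") + i): p for i, p in enumerate(PRIMES)}


def prime_hash(letters: str) -> int:
    """Order-independent hash: the product of one prime per letter."""
    h = 1
    for ch in letters.lower():
        h *= LETTER_PRIME[ch]
    return h


def can_spell(rack: str, word: str) -> bool:
    """True if `word` can be built from the letters in `rack` (extras allowed)."""
    return prime_hash(rack) % prime_hash(word) == 0


print(prime_hash("listen") == prime_hash("silent"))  # anagrams collide on purpose
print(can_spell("listens", "silent"))                # one spare letter is fine
```

Even the worst case of eight z's is 101^8, roughly 1.08e16, which still fits comfortably in a signed 64-bit long. For a 4-letter DNA alphabet the primes would be far smaller, so much longer fragments would fit in the same width.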
Second up, fleeting value capture problems. There are some problems where the value of a data entry decays as more time passes, if nothing reacts to it. These show up for a large number of streaming real-time systems, though the real life problems tend to tolerate a fractional loss of information while the streaming platforms are built to be feeding a system of record without loss. The demand-supply curve in a number of systems tends to work in similar fashion, where the historical data is not particularly relevant except in aggregate, but the current values are exceedingly pointy problems. Reactions to discounts and surge pricing are extremely individual. If you could map the actual demand to the demand against a price point, in real-time, that could drive decisions at a finer granularity while keeping the maximum number of people happy.
This might look like a new idea, but I've seen this used to great effect in Facebook games - the random rewards within that Skinner box are not exactly random. However, the difference is that in a value-cost model, the supply-demand curve does not meet at the same place for all people who are making financial decisions. The micro-economics in the single party model (with the "house" monopoly handing out infinite inventory and the players putting money down, there is no per-unit-profitability, only non-recurring engineering costs from the artwork & code) is much easier to model than the delayed elasticity of the supply curve in a two-party model like the new gig economy.
This brings up another point: when you have weakly interacting processor models, they do not support strong failure tolerance characteristics. Failure tolerance is one of those "can't do without" concepts in distributed systems. This is sort of the "UDP for audio streams" discussion, because failure tolerance often demands waiting for the failure to be corrected before moving on - in a number of these processing models, delaying a future operation to make sure you have processed the current one is a complete waste of time. If you had an ETA estimation system for cars, then there is absolutely no value in processing a five minute old record when a user has already left the car. These problems routinely crop up in geo-fencing problems, when sending location updates over a cell phone network - when position changes, cancel the previous queued request and send a new one.
The weakly interacting processors tend to be implemented as reliable queues and retried/stateless micro-services, both of which are imperfect abstractions to the actual problem at hand, but very clearly look like what MPI with buffering would look like. The data flow model is easier to delegate to different teams to build, but it does not match the characteristics of a single system which can handle geographical data as a sort of cellular network handing off cars and rides to the next cell as a vehicle moves through the city. If San Francisco has 45,000 drivers operating in it and 150,000 passengers with their apps open, you might realize that the state distribution for those + all the people in a vehicle right now is not exactly a giant scale problem, even if it looks like 50k HTTP calls/sec of traffic going over the wire when it rains.
Geo-spatial data is another crucial problem for me to reapply some optimization work I did with NPCs in a video game - basically, if you ask them to get coffee, they need to find the nearest coffee shop and walk there. This meant dividing up the city into QuadTrees and building an A* path routing over that space. However, after I started to poke about that class of problems, I ran into Hilbert curve numbers and Hex bins. A Hilbert curve converts a 2-dimensional space into a single number and there are a large number of ways to handle single dimensional data in indexes, which suddenly become useful when you dump your data into Druid or another SQL DB which handles integers much faster than a complex square root function. There are space filling curves for hexagons which form a much prettier picture, but the real advantage with Hex bins is that for a QuadTree for a given position, there are 8 adjacent grids to check, while the Hexagon only ends up with six adjacent blocks to check.
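As a rough illustration of "2-dimensional space into a single number", this is the textbook iterative conversion from an (x, y) grid cell to its distance along a Hilbert curve. The function name is made up, and the grid side `n` is assumed to be a power of two; mapping real latitude/longitude onto such a grid is left out.

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Distance along the Hilbert curve of cell (x, y) on an n x n grid.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)  # which quadrant, in curve order
        # Rotate/flip so the sub-square is traversed in the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Walk an 8x8 grid in curve order: consecutive cells are always neighbours.
cells = sorted(((x, y) for x in range(8) for y in range(8)),
               key=lambda c: hilbert_index(8, c[0], c[1]))
print(cells[:4])
```

Because nearby cells mostly end up with nearby indexes, a range scan on the integer column is a reasonable first-pass filter for "things around here", with an exact distance check only on the survivors.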
To put it together, Geo-spatial data mapped onto a weakly handed-off cellular processor model (i.e. the fuzzy boundaries) tends to work very well in a distributed system, without a significant cost of the failure tolerance required to perform a strong hand-off. This is something left over from my first job writing a call-handling application for Ericsson, where I learned how the cellular paging channels handle a user who merely drives through cells differently from someone who's on an active call while driving through, while still being able to get GSM audio packets across correctly (also it is encrypted). The interesting part of that is using the first derivative of position computed for each bogey in motion & the accidental utility of knowing collision/occlusion math from old video games.
Over the last 15 years, I've gathered up a bunch of eclectic experience, which seems to be useful in several other places across the industry in different scenarios with their own additional trade-offs and complexities. Complexities I haven't considered and trade-offs I haven't explored.
So, if you're working on something like this, I'd like to pick your brain and understand what I don't know - expand my ignorance, if you will.--
If you're not part of the solution ...
This is a written ovation to Tahatto's, "Romeo & Juliet: No Strings Attached".
I'm sure a teenager from the 90s set the score to this. The 90s teenager in me watched this and knew every sound bite was aimed right at me. To hear Pehla Nasha for Juliet's intro (and other Karan Johar references) and laugh needed someone who was 14 in the 90s - and still is.
If there is one line I will truly remember this play by, it is the monologue of Tybalt. He laments that it is up to the villain to keep the plot moving.
And it is true, to let go of the plot takes perhaps a few good natured gestures from the villain. Perhaps to walk away from Mercutio when the swords were drawn or perhaps even devious cunning to treat the Montague-Capulet romance as the foundation to eventually raise his own relevance in Verona. It would fall far from the material but if you need an example of what happens if the Capulets decide to lead Romeo on, watch Malayalam's Godfather.
And in fact, Juliet too. Her rant about her wishes stands out in the play. Even removing the tinges of modern feminism from that monologue, it has to be admitted the men run the show. Neither Tybalt nor Romeo, well nobody at all, actually listens to her wishes for peace & love. Perhaps her suicide plot was a last ditch measure to have her thoughts heard, just like several other calls of attention which have ended in tragedy.
Even Mercutio haranguing the audience, breaking the fourth wall after his death. Only to ask if the audience is so jaded and detached, that they laugh at a man's untimely demise.
But the funniest of all are the multiple characters played by the two puppets, who despite switching characters often break to reveal that they play others as well. In particular, the death of Mercutio, only to be followed by Romeo's confusion over Benvolio played by the same. It is self-aware in a rather direct way that this is a play within the stage, performed by different hats rather than different actors.
Of course, for me the most meta reference was the play being run by the players themselves, with a very large & obvious presence of the lack of a director.
After the play is over, they decide to do it again, because to follow the script is the only way to get the girl, even if only for a little while. And therein lies the rub.--
O happy dagger! [Snatches Romeo's dagger.]
This is thy sheath; there rest, and let me die.
-- "Romeo & Juliet - Act V, Scene iii"
Spend every day watching your life pass you by. The days, they go on interminably without any identity of their own. Your thoughts become memories, turn into dreams and fade away like breath on a window pane.
Freeze a moment in time, store them away and chase them back down memory lane. Lay down the breadcrumbs marking your way, mementos held close to your heart, of a day that you'll never forget.
Click, *click* ... and that's all it took.--
The moving finger writes; And having writ, moves on.
The writing, it stays written.
The hustle and bustle of real life is a killer. But there's no reason for it to be a silent killer, so here's one of those mental kicking & screaming thoughts which was dragged through my head as I was wasting time in Hong Kong Airport. My flight called for me before I finished this, but the time has come to achieve some closure.
To: y-blr <...>
Subject: Cubicle for Rent (rates negotiable)

One premium corner cubicle in M G Road available for rent. The cube is surrounded by conference rooms, fully furnished with a laptop dock, a comfortable chair. Extensive table space, entire sixteen foot carpet area with easy access to sofas, coffee and the pool table. 24x7 internet access enabled, fully packed bookshelf and with neighbours used to loud music after 7 PM. Available for occupancy for a month and at negotiable rates.
Since it is too late to actually send that mail, I guess this is its home now. But sing with me - My cubicle ... it doesn't have a view.--
If money can't buy happiness, I guess I'll just have to rent it.
Some of you might have seen me wearing this t-shirt. It was one of those things which me and mojo came up with. After a couple of near non-starts, we finally got a half-decent t-shirt design for the Y! Bangalore frontend engineering conference. Nearly completely borrowed the style and attitude of xkcd, threw in a bit of self-deprecating humour (It's so uncool, that even I don't do it).
Somehow the more catty punchline, "When you're *this* pretty, you don't have to do anything" (as said by the machine to the ex-(*heh*)-asperated girlfriend), wouldn't fit into the speech bubble. But this one still is pretty kick-ass.--
Oh what a tangled web we weave, when first we practice to deceive.
In the last episode of Unicode Fun, we met the non-trademark infringing xoferiF. But this time I bring you English from beyond the equator - do not attempt to adjust your screen.
¿ ʇı ǝsnqɐ ʇ,uɐɔ noʎ ɟı ǝpoɔıun sı unɟ ʇɐɥʍ 'llɐ ɹǝʇɟɐ ˙ƃuoɹʍ uǝǝɹɔs ǝɥʇ ʇɐ ƃuıʞool ǝɹ,noʎ sıɥʇ pɐǝɹ uɐɔ noʎ ɟı
In conclusion, unicode ftw.--
A smile is but a frown;
Turned upside down.
Cache that, give me some more,
Cache that, don't dump some core,
Cache that, don't hit the _store,
Cache that, ~ oohoooh ~
On a sort of related note, I might be singing this somewhere.--
Everything in nature is lyrical in its ideal essence, tragic in its fate, and comic in its existence.
-- George Santayana
Happy Vishu to everyone ! For those of us that follow the Malayalam calendar, this marks the birth of the new year. Vishu has like, totally been my favourite festival for so many reasons - it is bang in the middle of the summer vacation, being a kid means you cash in with Vishukkaineetam from all your elders and lastly *FIRECRACKERS* !!.
Because I'm in mourning, I'm not actually celebrating Vishu this year, but that doesn't really stop me from being carried away by the old sounds and smells remembered from the days of my childhood. Ah, nostalgia ... the consolation prize of a loss. For the last 15 years, I've never missed a year in spending this particular day with most of my extended family.
Well, I guess this makes it sixteen in a row ... :)--
We used to call him April Showers, because he brought May flowers.
-- P.G Wodehouse, The Small Bachelor
Consider the past year. Now consider UserFriendly.org's blatant MS bashing of the past. And then take a look at the following cartoon.
In the past the Orbital Mind Control Ray could only have been an obvious reference to MSFT (as seen on slashdot). But today, we're making fun of Google - is the attitude towards GOOG changing as the company grows bigger and starts eating small companies for breakfast, lunch and dinner ? Which reminds me of this conversation between Bart and Fat Tony from Simpsons 8F03.
Bart: Are you guys crooks?
Fat Tony: Bart.. uhm. Is it wrong to steal a loaf of bread to feed your starving family?
Bart: No..
Fat Tony: Well suppose you got a large starving family. Is it wrong to steal a truckload of bread to feed them?
Bart: Nuh-uh
Fat Tony: And what if your family don't like bread. They like.. cigarettes.
Bart: I guess that's okay.
Fat Tony: Now, what if instead of giving them away.. you sold them at a price that was practically giving them away. Would that be a crime, Bart?
Bart: Hell no!
As someone else pointed out, when Microsoft came out, they were the hungry rebels who were freeing the public from the iron fisted data processing overlords, without the cutesy "Ribbons and Ponies" approach that Apple was taking. The big blue of old, which has now become a savior and hero for Linux, was the evil monopoly ?
How often do sides get switched, old rebels become new masters and acquire new allies from old enemies ?--
No permanent friends or allies, only permanent interests.
-- Lord Palmerston
Found this in one of the mailing lists - but this is total fnuk. Please feel free to click on any of the following links - yahoo shit and google shit. Please take a look at your title bar of your browser to understand the true beauty of bi-directional font-rendering :)
I'm sure xoferiF wouldn't violate any trademarks by the Mozilla foundation.--
Drawing on my fine command of language, I said nothing.
I remember the last one very vividly.--
No amount of careful planning will ever replace dumb luck.
I watch a lot of cartoons, always have, always will. It might not have improved my grasp of physics, but it's always made me laugh. But as I sat in front of the idiot box today, I didn't want to watch any toon they were showing. Ever since Pokemon pointed out the huge merchandising opportunities, the recent trend in toons is mainly aimed at making kids buy useless stuff. Sure, there were G. I. Joes and Skeletor toys in the cupboards when I was a kid as well, but watching Beyblade made me sick. Where Pokemon at least redeemed itself by emphasising evolution (for the US bible thumpers), this one seems to be pure merchandising claptrap.
I mean, I'm not asking for a strong story plot here. But it should have something, at least something that stimulates your brain. Most Road Runner cartoons are stupid, but I still ROFL at the Wile E. Coyote, Genius business cards or anything that's named Acme (with apologies to Leon Brocard). Or to take another example, Tom and Jerry - perfectly predictable, yet funny in some excellent episodes. Even Scooby Doo has its moments of mirth, especially the Let's split up cliches. None of these needed a story to make it funny, it was merely funny because they were.
But there are still some which are for kids of all ages. They go beyond mere physical humour, into word play and referential humour. The moment Bugs Bunny said "It's baseball season" or the classic "What did you expect, a happy ending ?" were landmark events for any cartoon to follow. Even the background music was borrowed from operatic greats, and there was a Casablanca spoof with carrots. Not that Bugs and Daffy toons were lacking in the low brow humour either. Those cartoons were loaded with jokes at all levels possible in six minutes.
But that era has passed and passed on the baton to the new overlords of cartoons. I probably won't consider Simpsons or Southpark as cartoons, but as merely animated series. But there were a few glimmers in the pile of shit that got served to me in the late nineties, by Cartoon Network.
First on the list would be Dexter's Laboratory - the classical mad scientist story, only the mad scientist is just 8 years old. Having a secret laboratory, while living the life of a normal kid in front of the parents makes for some moments which involve the audience in some conspiratorial laughs at the expense of Dexter. Not to mention Dee Dee's meddling of the "Ooooooo. What does this button do?" kind. Nobody with a sister can stop smiling at that. Now, you might laugh at the standard jokes it sets up, but there are a few lines from today morning's episode ( * Figure Not Included ).
Major Glory : So what have you learned today ?
Dexter : I learnt the important lesson that you cannot buy friendship with gifts.
Major Glory : No, not that.
Dexter : Then what ?
Major Glory : You're going to learn that you can't get away with Copyright Infringement
Dexter : Oh ?
Major Glory : Now you'll have to face someone much more powerful than me.
Dexter : Who's that ?
Major Glory : My attorney.
Or even the referential Mock 5 where Dexter (btw, the name means "Right") races against Racer-D who is actually Dee Dee (remember Speed Racer ?). Yeah, Genndy Tartakovsky is a genius. His other works, such as Samurai Jack or Powerpuff Girls, were excellent as well. In fact, PPG was far more involved than the name would suggest, though it is a slightly acquired taste (you need to watch Mighty Morphin Power Rangers to get some of the jokes).
And then there was Johnny Bravo. For a blonde Elvis clone, who picked up enough Fonzie cliches, comedy comes easy. I for one, love the Kirk Tingblad episodes where Johnny's narcissism is brought to the forefront. Sure I know enough folks who think that it is a stupid show written for stupid people, but the flourishes are there in the details. For example, in Aunt Katie's Farm, Johnny is a pig in the sketch. And after destroying the set, Johnny rolls around in the mud and starts to yell "Four feet good! Two feet bad!". Or even the Prince and Pauper version, except in this one Mark Twain ends up thrown into prison for calling the plot an old chestnut. But my best Johnny line was from the "Panic in Jerky Town" where he comes out of the factory yelling "It's people! Jerky Jake's Beef Jerky is made of people!", which goes whooshing above most viewers' heads.
What I saw on TV today didn't even come close to any of these. I have a glimpse of the future Bill Watterson saw when he said no to selling cuddly stuffed Hobbes to his fans. I'm sure enough younglings will complain that Beyblade is the coolest, but from what I see, it is all about buying tops. I used to love DBZ, but it was never about buying Dragon balls from the nearest shops. These toons seem somehow different and alien to me.
I suppose, every generation survives on nostalgia. Maybe I'm wrong - all these kids will outgrow all the stupid toons and bitch about the next set of twelve year olds when they're twenty four. I mean, I'll really be scared if they don't.
Oh, and to relieve you from the suspense about the deer uncles. That's from a Dee Dee quote - "Deers don't have uncles, they have antlers". Laugh if you can ...--
This is what entertainment is all about ... Idiots, explosives and falling anvils.
Now, this isn't new, but it just had to be given its due respect. Sung to James Blunt's - You're Beautiful.
My cubicle, My cubicle
It's one of sixty two
It's my small space
In a crowded place
Just a six-by-six foot booth
And I hate it, that's the truth
Well, I give a sigh
As the boss walks by
No one ever talks to me
Or looks me in the eye
And I really should work
But instead I just
Sit here and surf the Internet
In My cubicle, My cubicle
It doesn't have a view
It's my small space
In a crowded place
I sit in solitude
I haven't seen anything that has more truth in so few phrases. Kudos to Keith Hughes and Jym Britton for pulling this one off, with style. Get the song from morningsidekick and play it loud in office.--
How can I "Think Outside the Box" when I'm in the @#$%? box all day!
Being the bookworm, I couldn't resist this particular meme. The last one I'd indulged was the superhero one. This one on the other hand, makes a lot more sense.
by Lewis Carroll
After stumbling down the wrong turn in life, you've had your mind opened to a number of strange and curious things. As life grows curiouser and curiouser, you have to ask yourself what's real and what's the picture of illusion. Little is coming to your aid in discerning fantasy from fact, but the line between them is so blurry that it's starting not to matter. Be careful around rabbit holes and those who smile too much, and just avoid hat shops altogether.
Take the Book Quiz at the Blue Pyramid.
I'd hoped for Through the Looking-Glass or at least H2G2. Anyway, there it is though, full of smiling cats and queens playing croquet with flamingos.
The real purpose of books is to trap the mind into doing its own thinking.
-- Christopher Morley
When we last left off, we were talking about matchpot and the soon to be world championships. But what matchpot lacks in cerebral and social subtlety, Mafia brings out in potfuls. Basically the game is about killing innocent villagers, whether you are the mafia or one of the lynch mob yourself. I was introduced to this game when we were all sitting around in our hotel rooms in Thrissur. The real interesting part is not the game in itself, but how it lets (or in fact forces) you to study other people under a microscope.
After the first few games where Mafia won hands down, slowly the villagers started to pick up on the non-visual cues as well. It was quite interesting to see people trying to be overclever and bluff with poker faces. Also several interesting observations, some particularly personal, were made by a lot of people. I did get quite an inside picture of a couple of people's minds and it is terrifying what some people are actually capable of, compared to your mental estimate of their trick quotient. On top of that, it is also a measure of how successfully you can con other people into changing their opinions. On the receiving side, mafia has a way of exposing your gullibility in a painfully obvious way.
An important lesson the game has taught me is about myself. I found out to my surprise that I think a lot more clearly when I am not formulating a point to present. Being dead in the game gives you a totally different perspective which you are unlikely to get while you are talking. Sometimes just having to sit and watch the entire crowd ignore the clinical quality of the strategy is just way too frustrating. Masterpieces of strategy are completely lost to the villagers who're more concerned about staying alive to the next round rather than bringing the mafia down. Exposed are the simplified versions of our daily grind, where the evil go un-punished and good are targeted. Religions have been based on much less than fixing this (later, much later).
Nobody has seen me as the Mafia yet, but I'm better at finding things out than hiding them.--
Fanatics are often blinded in their thoughts.
Leaders are often blinded in their hearts.
As gleaned from the php irc channels on Wednesday :-
<helly_> need to show a table with >2k columns, it can but not show all column headers
<SaraMG> funny
<Rasmus_> You need Dell's new 8192" wide monitor for that table
<Davey|Job> *with* the built in HDD
<Rasmus_> and a bicycle
<g0pz> and it blinks "pedal faster !!"
<Derick> 8192 inch
<Derick> wow
<Derick> that's 208.0768 meters..
<scoates> Derick: NVGA
<scoates> (Neighborhood)
<scoates> it requires 16GB of video ram, though..
<andrei_> it's not the size that matters
<andrei_> it's the font
The humor doesn't stop with IRC channels, though. Consider this very awesome groklaw comment, inspired by the Cheese Shop sketch. Also making the rounds are more interesting things like this firefox gurl.
Life's never boring if you watch this world carefully enough.--
The probability of someone watching you is proportional to the stupidity of your action.