{ Make this readable }
Showing posts with label tech. Show all posts
Showing posts with label tech. Show all posts

Sunday, June 01, 2014

The month of May's misc tech reading (note to self: title too officious)


Java stuff I found interesting last month: 

Misc distributed systems and other clever stuff:
A quick nod to better documentation:
Until next time!

Tuesday, April 08, 2014

April tech reading

Here's a bunch of stuff I found to be of some interest and relevance. Happy reading!
An Apache HTTP client "bug"/weirdness I ran into recently, which would end up consuming a large number of ephemeral ports (client side) instead or reusing connections - fix description. The ports would end up waiting in TCP_WAIT state for a long time and the client would eventually stop, unable to make any new requests.

Big data stuff. Naturally, any list is incomplete without big data: 
IntelliJ 13.1 and Git weirdness:
Random, clever tech stuff:
Until next time!

Wednesday, January 29, 2014

This month's good tech reading

(Many of these links I discovered in my Google+, Twitter, HN or RSS feeds. I don't take credit to be the first to find them)

Java:
Network:
Until next time!

Sunday, January 19, 2014

Rsync in Java - a quick (and partial) hack

Over the years I've been (mildly) fascinated by how various version control tools and file backup utilities work. Especially the core algorithm that drives many of these file send/diff/backup/de-duplication programs.

Rsync being the most widely used tools and the basis for many extensions, I naturally tried to wrap my head around it's working. But I thought the details were somewhat hazy. Maybe it was just me but I was looking for a simpler, clearer implementation of the algorithm and not a fully functioning program.

Recently, I gave it another shot. I waded through some of the material available on the interwebs and bravely set out to implement it to see how much of I had understood.

So, here is the basic implementation in Java. It may not be a faithful implementation of the paper but the gist is:

  • Create a summary out of fixed blocks of input text (original)
  • Use these blocks as reference against another text (modified)
    • This modified text is slightly different from the original text
    • Hence the assumption that the original text can be transformed to the modified text without having to send the entire modified text back
  • The modified text can now be transformed into a combination of:
    • References to those original blocks where there were no changes
    • And any differences as simple text
The code is available here and the same is embedded at the bottom of this post.

Some notes on the implementation:
  • It only handles Java Strings
  • It uses a combination of Rabin-Karp rolling hash for quick, incremental hashing of blocks and CRC32 for hash conflict resolution. In reality a much more robust hash should be used instead of CRC32
  • It assumes that the list of generated "blocks" is available on the other side to generate the patch. In reality there has to be a more clearly defined mechanism/protocol to exchange these blocks
  • The overall algorithm to identify common/repeating hashes should be smarted than this
References:
Code:
Until next time!

Sunday, December 15, 2013

Java/tech stuff I found on the internet (Dec 2013 edition)

Networking and big data:

Java/JVM perf:
Java memory model + arrays + visibility/ordering:
Happy holidays!