I have a new project called forqlift.
This one should be of use to people who crunch data with Hadoop or Mahout.
Here's a bit of a blurb from forqlift's page to get you started:
SequenceFiles are nice, but they can be unwieldy at times. I wrote forqlift to make it easier to manage SequenceFiles.
forqlift is a command-line tool that lets you:
- create SequenceFiles from files on your local filesystem (just like creating an archive with tar or zip)
- set compression (none, bzip2, gzip) and value types (text or binary)
- extract the contents of a SequenceFile back to the filesystem
- convert popular archive formats — tar (including tar.bz2 and tar.gz) and zip — to and from SequenceFile format
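To make the archive-to-SequenceFile idea concrete, here is a rough sketch in Python of the kind of mapping such a conversion implies. It is not forqlift's code and does not write a real SequenceFile. It assumes each archive entry becomes one record whose key is the file's path and whose value is the file's raw bytes. The helper names `tar_to_records` and `build_sample_tar` are made up for illustration.

```python
import io
import tarfile


def tar_to_records(tar_bytes):
    """Yield (key, value) pairs from a tar archive held in memory.

    Assumption for illustration: key = member path, value = raw file bytes.
    """
    # "r:*" lets tarfile detect plain, gzip, or bzip2 compression on its own.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, symlinks, and other non-files
            yield member.name, archive.extractfile(member).read()


def build_sample_tar():
    """Build a small gzip-compressed tar archive in memory for the demo."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in [("docs/a.txt", b"hello"), ("docs/b.bin", b"\x00\x01")]:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


records = list(tar_to_records(build_sample_tar()))
for key, value in records:
    print(key, len(value))
```

Each pair here stands in for one SequenceFile record. A real conversion would hand those pairs to a Hadoop SequenceFile writer with the chosen compression and value type.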
Head over to the forqlift page for more info!