Thanks for the great session! I purposely reserved a small room (7 people), thinking that I would be talking to myself, but the turnout was great: 35+ people, some of whom had to participate standing outside the room.
I had plenty of fun talking to like-minded people. If you want to stay in touch, send me an email:
datagrids PLEASEREMOVEME AT nachbar DOT biz.
Data Grid Definition from Wikipedia
A data grid is a grid computing system that deals with data: the controlled sharing and management of large amounts of distributed data ("large" here meaning you can't fit your data on one machine). Data grids are often, but not always, combined with computational grid computing systems.
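To make the "can't fit your data on one machine" part concrete, here is a minimal sketch of one common way to spread records across machines: hash each key and use the result to pick a node. This is only an illustration and not how any of the tools discussed in the session necessarily work. The node names and the helpers `node_for_key` and `partition` are hypothetical.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node for a key by hashing the key (simple modulo placement)."""
    # Use a stable hash so every client maps the same key to the same node.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def partition(records: dict[str, str], nodes: list[str]) -> dict[str, dict[str, str]]:
    """Group records by the node that would store them."""
    placement: dict[str, dict[str, str]] = {node: {} for node in nodes}
    for key, value in records.items():
        placement[node_for_key(key, nodes)][key] = value
    return placement


if __name__ == "__main__":
    # Hypothetical three-machine grid and some sample records.
    nodes = ["node-a", "node-b", "node-c"]
    records = {f"user:{i}": f"profile {i}" for i in range(10)}
    for node, shard in partition(records, nodes).items():
        print(node, sorted(shard))
```

Simple modulo placement moves most keys whenever a node is added or removed, which is why real systems tend to use something more elaborate. The sketch only shows the basic idea of splitting one data set over several machines.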
Discussed Tools
- Hadoop (Yahoo has a test cluster with 900+ machines)
- HBase - Google BigTable clone. Rapidly developing, but not stable yet.
- MapReduce
- Streaming - also more experimental at this point
- Distributed Filesystem
- Erlang Message Passing
- Mnesia
- Zvents
- Terracotta - Network Attached Memory. Very favorable OSS license.
- GlusterFS
- Storage Resource Broker (SRB)
- TaskQueue - by TailRank
- LustreFS/CodaFS
- GFS
- Hibernate Shards
- Greenplum - shared-nothing DB based on Postgres.
General Tools & Tricks of the Trade
- Memcached works well (it isn't really a data grid tool, but it is frequently used alongside them)
- MySQL InnoDB on XFS
Maybe we can start meeting more frequently if there is interest.
Erich Nachbar
To Do
- Clean up & add more references
- Turn all those project names into links
- Add more background for each project
- Group projects (filesystems, etc.)