Lustre, Pain, And Goo.

Well, it’s 7:13 PM CST, and I’m holed up in DFW for some server upgrades for Goo. Unfortunately, we’re also recovering from our worst loss yet: a full wipe and deletion of all files stored on Goo’s main file server.

So, what happened? Well, from what I can tell, when I loaded the OST into the main MDS for Lustre, it caused a race that ended up blanking out our MDS, requiring a full wipe and reload of both OSTs and the MDS itself. For those not familiar with the terms, an OST (Object Storage Target) holds the actual file data, and the MDS is the metadata server. A file has two parts: the metadata and the actual physical file. Unfortunately, when you lose the metadata, you lose all reference to the location on the drive where the physical file is stored. Because of this, the loss was pretty much catastrophic, wiping out every developer’s /home/ drive.
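For anyone who wants to see why losing the metadata hurts so much, here is a toy sketch in Python. It is not how Lustre actually works internally (no striping, no real on-disk layout); the class names, the functions, and the example path are all made up for illustration. It only shows the idea that the data can still be sitting on the storage while nothing knows which file it belongs to.

```python
# Toy model only: every name here is hypothetical and this is NOT Lustre's
# real design. It just illustrates "metadata points at objects".

class ToyObjectStore:
    """Stands in for an OST: raw file data keyed by opaque object IDs."""

    def __init__(self):
        self.objects = {}
        self._next_id = 0

    def put(self, data):
        # Store the bytes and hand back the ID needed to find them again.
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = data
        return object_id


class ToyMetadataServer:
    """Stands in for the MDS: maps file paths to object IDs."""

    def __init__(self):
        self.index = {}

    def record(self, path, object_id):
        self.index[path] = object_id

    def lookup(self, path):
        return self.index.get(path)


def write_file(mds, ost, path, data):
    # The data goes to the object store, the pointer goes to the metadata.
    mds.record(path, ost.put(data))


def read_file(mds, ost, path):
    # Without a metadata entry there is no way to know which object to fetch.
    object_id = mds.lookup(path)
    if object_id is None:
        return None
    return ost.objects[object_id]


mds = ToyMetadataServer()
ost = ToyObjectStore()
write_file(mds, ost, "/home/dev/rom.zip", b"rom bytes")  # made-up path

before = read_file(mds, ost, "/home/dev/rom.zip")

mds.index.clear()  # simulate the metadata being blanked out

after = read_file(mds, ost, "/home/dev/rom.zip")
orphans = len(ost.objects)  # the bytes are still stored, just unreachable by name
```

In this sketch the read works before the metadata is cleared and comes back empty afterwards, even though the object store still holds the bytes.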

Good news though!
Web01
Distro03
Distro04
Storage02
lgswitch

Have all gone into production. (Well, not the switch yet; more on that later.) These boxes will do as their names suggest: handling the web frontend, distribution of files, and additional storage space (10+ TB now, for anyone asking...). This will take load off the current servers and help ensure a good experience overall.

Now, for the switch.
Lgswitch is sexy, and fun, and a blast. But unfortunately, we didn’t receive the switch module needed to plug our SFPs into for the 10 Gbps fiber drop seen in the pictures below. (It’s the yellow loop, if you’re curious.) So we’re stuck on 1 Gbit for the next two weeks: I’ll be on call this coming week and won’t be able to drive to Dallas. The following week, however, I plan on driving up and installing the module, Tuesday if I can wangle the day off, if not sooner, because I WANT MOAR SPEED!

Pictures!

Start:

In-Progress:

Done with current servers:

Done with all installs:

15 responses to “Lustre, Pain, And Goo.”

  1. Ben

    So basically the whole Goo file repository was sacrificed in the name of speed? That doesn’t sound very smart.

    1. Agrabren

      As a dev, I can always reupload. I’m thankful for the upgrade. :-)

  2. kev

    sucks about the storage wipe, but it’s a great way to get rid of dead files anyways, and love irssi running on that last pic! Great news on the server/bandwidth upgrade tho! keep up the awesome work guys!

  3. Bill

    Bummer. I thought the issue was initially on my Touchpad running CM9 until a Google search turned up this page. Just goes to show you don’t miss your GooManager until the server runs dry. Best to all running recovery.

  4. dsb9938

    Any idea how long till our home dirs are put back so we can reupload our files?

  5. xkonni

    Thanks for doing all this, i wanted to clean up anyway ;D

  6. Goo.im FYI - Android Forums

    [...] and users re-linked to /dev/ folders on Goo.im at this time. Full details can be found at: Lustre, Pain, And Goo. We're requesting that files be uploaded at your convenience, we're also beginning to work on a [...]

  7. Paras

    Who are (Snipa), By the way?

  8. anony mouse

    Would it be possible to rebuild some of the metadata from cached copies of your site?

    For example, pull the MD5s and filenames off of http://webcache.googleusercontent.com/search?q=cache:U_WHb95l_NAJ:goo.im/cm/crespo4g/nightly&cd=2&hl=en&ct=clnk&gl=us (or just googling MD5 hashes of the files you still have) and rebuild (at least partially) the metadata?

  9. dg

    You didn’t make a nandroid first???!!!

  10. Rich

    Can you spell?
    It seems that you are using some form of Americanised English. Just for your records its MORE and not MOAR. Even the retarded IT technicians I know have some grasp of spell check.
