Well, I hardly think we need this blog anymore, either, since LJ itself is working fine. Everybody should now be watching the normal places for status updates!
So, unless something happens -- this is Junior, and good-bye!
November 20, 2005 at 03:51 PM | Permalink
So, LiveJournal is clicking along -- at least, most of the site functionality is! We've run into an issue with our MogileFS storage where it's being really slow. This is causing userpics and such to be slow/not load. It's bumming us out, but there's not much we can do until we get some more disks -- which we're doing first thing in the morning!
We're about to head out, try to get some sleep tonight, in preparation for another day tomorrow. Thanks to everyone for your patience and being cool while we sort things out. I'm actually really excited about how smooth things went, all things considered! The team did a great job and is continuing to do so.
That's it for now... Junior out!
November 19, 2005 at 10:04 PM | Permalink
We've been in the office all morning (or conference calling in) working to clean up the last remaining site problems. Most of them have been addressed at this point while the rest have solutions in progress. The site seems to be mostly stable, so we're all about to take a lunch break and continue working in a bit. *yawn*
-- Whitaker
November 19, 2005 at 05:38 PM | Permalink
Whitaker here. I've been up for the past hour or two frantically hackin'. We're trying to resolve the last of the error messages that people are seeing. The site's also pretty slow for free users so we're diagnosing our network and database situation. We're all doing our best to get everything working smoothly again!
More coffee sounds pretty good right now. : )
November 19, 2005 at 01:22 PM | Permalink
Well the sun came up, but it was far too early! I'm sitting in the airport waiting for my flight, reading through support requests, and updating status and the BBB. Seems right now voice posts, syndicated accounts, and userpics still have some issues. Lisa isn't awake again yet, but I wouldn't expect her to be either. Hopefully it will be a pretty calm day as Brad, Lisa, Matthew, Whitaker, and Jr work to resolve the few last remaining issues.
--David
November 19, 2005 at 11:32 AM | Permalink
We're down to a skeleton crew as the sun is coming up here in the east. Most of the data center move work is complete, with the exception of a few miscellaneous issues to work out tomorrow. Thanks for tuning in!
-- Whitaker, Lisa, Artur
November 19, 2005 at 05:08 AM | Permalink
Aside from a few outstanding issues (voice posts, email posts, and I think that's it?) the move was very successful and we're all but done. Just wrapping up some final things, and then I think we can all get some much needed sleep.
This will probably be the last post to this blog for today. Maybe ever? Since LJ is back up and running, we won't need to post here anymore.
So -- in that case -- this is Junior, saying good night, world!
November 19, 2005 at 04:49 AM | Permalink
Seems things are back up and working except for Voice Posts, Mobile Posts, and serving some ScrapBook images.
November 19, 2005 at 03:37 AM | Permalink
Shop is now back up! Seems it is champagne time now since we should be done!
November 19, 2005 at 03:13 AM | Permalink
We're now working on getting SSL back up and figuring out some unknown traffic on our Seattle network. Too bad not all tools actually give proper units for bandwidth. Turns out it was our database replication traffic. Guess 3am even takes a toll on awesome sysadmins.
November 19, 2005 at 03:07 AM | Permalink
Read-only is now turned off for all users.
November 19, 2005 at 02:15 AM | Permalink
You are now able to add any additional userpics, make voice posts, or upload images to your ScrapBook again.
November 19, 2005 at 02:02 AM | Permalink
So it seems there are inbound routing issues with certain ISPs, which are causing the packet loss previously mentioned. Seems UnitedLayer and Layer42 are good, though MCI is running into problems. Working with one of our consultants right now to try and get this resolved. This is, however, the best type of problem we could have run into tonight. Still working on getting the remaining services back online, but everyone seems pretty relaxed right now. Good move so far and we stayed within our two hour window for downtime.
--David
November 19, 2005 at 01:50 AM | Permalink
We knew that the site would be a little slow when it came back, as memcaches filled back up, but we didn't mean this slow. Half of us are suddenly experiencing really really awful connections, to LiveJournal and other places. We're looking into the networking problems to see where the problem is.
(If we somehow managed to break the internet, I will laugh my *CENSORED* off.)
November 19, 2005 at 01:44 AM | Permalink
Access to LiveJournal (http://www.livejournal.com ) and ScrapBook (http://pics.livejournal.com ) has now been restored and is being served from our San Francisco data center. Some users will still be in read-only mode as we finish bringing up other services on our network and our data caches become populated.
November 19, 2005 at 01:35 AM | Permalink
Mena and I are here to lend moral support, since we can't do much more than that for the move.
-- Ben
November 19, 2005 at 01:28 AM | Permalink
DNS has now been updated and traffic should be pointing to San Francisco. Services should be back up pretty soon.
November 19, 2005 at 01:27 AM | Permalink
We found a glitch in a configuration file, which is fortunately easily fixed. We're doing that now.
We should also warn you all that when the site does come back up, it's going to be slow for a while. Our memcache machines have to fill back up in San Francisco, which is going to make things feel really pokey for a while.
(If you need a refresher, there's always the Geek to English post!)
November 19, 2005 at 01:09 AM | Permalink
We're now bringing up our web pool in San Francisco and testing LJ locally from the office. Knock on wood that we don't run into problems and are able to restore access to the site shortly. When we do this though, some users will still be in read-only mode as we bring up other network services and our caches warm up.
--David
November 19, 2005 at 12:50 AM | Permalink
We're almost down to the bottom of our list, and we're almost ready to bring LiveJournal back up, piece by piece. Everyone welcome us to San Francisco!
--rah
November 19, 2005 at 12:47 AM | Permalink
Just in case you doubted that we're all as heavy LiveJournal addicts as the rest of you:
Whitaker just tried to update his journal. Just as I automatically flipped to the other tab and refreshed my friends page.
Almost there ....
--rah
November 19, 2005 at 12:12 AM | Permalink
Seems DNS has propagated already although our BigIP rule for the move page hasn't been set up in San Francisco yet. So that is why you're getting the "cannot connect" error versus our spiffy moving page.
November 19, 2005 at 12:11 AM | Permalink
Frank -- wait -- no -- don't chew on that -- NOT THAT CABLE, FRANK!
November 19, 2005 at 12:03 AM | Permalink
Counting down to read-only mode in five ... four ... three ... two ...
November 18, 2005 at 11:54 PM | Permalink
"Sync all" - Lisa
"Sync all" - Matthew
"Running Sync all" - Matthew
"Is the sync all finished?" - Jr
"No it is still running" - Matthew
"Sync all is complete" - Matthew
November 18, 2005 at 11:50 PM | Permalink
I know it seems like we're running late, but really, it's just us being super paranoid -- checking, double-checking, and triple-checking things before we do anything major. It's funny being on this conference call right now; the engineers are throwing around Unix commands right and left, and some of them sound really funny when they're spoken out loud.
I am glad that we decided to move the window two hours earlier, though! We were originally going to start at midnight, which would have meant finishing up around 3 AM (best case) for the guys in San Francisco and Seattle, and a whopping 6 AM for me. I'm going to be sitting up for a few hours after we're all done, too, to make sure that we didn't miss anything. Considering that I'm already getting tired, I totally would have died if we'd started two hours later.
Oh well -- it's good knitting time while I'm waiting to see if anything blows up! Which it totally won't. Power of positive thinking, ahoy!
Right now we're running final checks. We've got a huge list on the whiteboard, and we're going down and ticking them off one by one. Lisa is totally rocking the house in charge of all of us. When this is over, I think everyone in the office should chip in and get her a gift certificate to a massage therapist -- anyone know any good ones in San Francisco? (Is anyone reading this a good massage therapist in San Francisco? *g*)
....Annnnnd we just took ScrapBook down! I think that's like step 328 or something. Here we go with some of the final synchronization ...
--rah
November 18, 2005 at 11:46 PM | Permalink
Krissy here. I just got back to the office to lend some moral support. Everyone seems in good spirits. Brad has hot coffee. There's some epic music playing. No one has nodded off to sleep yet or had reason to pull their hair out.
Nice work, everybody!
November 18, 2005 at 11:24 PM | Permalink
...but only for a few minutes. We're disabling the LiveJournal store in a few minutes.
-- rah
November 18, 2005 at 11:16 PM | Permalink
Flight director Lisa is currently going around getting a "go flight" from each of us! Now we need some totally epic music! Suggestions have been Eye of the Tiger, Chariots of Fire, and It's the End of the World.
November 18, 2005 at 11:12 PM | Permalink
Total number of "your mom" jokes made so far: 2
--rah
November 18, 2005 at 10:41 PM | Permalink
Not much has changed around here the past hour. All sitting around the big conference table, projector on, whiteboard, conference call, and IRC. Lisa and Brad are working on writing a script to monitor replication, Jr has been working on mogile stuff along with other random stuff, and others are doing all sorts of other things as well. Brad got frustrated with his laptop so he now has his desktop on the desk in here. Intern has learned how to plug in an RJ-45 connector so that Brad doesn't lose his connection while working. I'm here drinking RedBull, updating various blogs and such, and talking to people. While we may not have started right at 10, I think it is a good thing that we're running the timeline, rather than the clock dictating what we do when!
--David
November 18, 2005 at 10:31 PM | Permalink
(rahaeli) For those of you who don't speak geek, I'll do a translation and explanation of some of the geek terms that are probably going to be thrown around over the next few hours!
(For those of you who do speak geek: yeah, I know, I'm oversimplifying in a lot of cases.)
BigIP: The BigIP is a piece of hardware that sits at the very front of our network, and acts as the doorman. When a request knocks on our door, the BigIP finds the webserver that's least busy and says "hey, go over there." Of course, it works a heck of a lot faster than any person could.
CAPTCHA: Stands for "Completely Automated Public Turing Test to Tell Computers and Humans Apart". (Yeah, I know, the acronym and the expansion don't quite match.) These are those little "prove you're a human" graphics you see when you create a new account or sometimes when you comment anonymously. There's a very slight chance that we might run out of pre-stored captchas during the move, which means that people might not be able to create accounts or comment anonymously, since the job that makes more of them won't be running.
cron / cron job: A cron job is like an automated set of instructions that'll run at a set time. We use cron jobs to schedule maintenance tasks. Most of them are invisible to you guys, but some examples are things like the emails that go out to warn people whose accounts are about to expire. We schedule those to run on a regular basis, so nobody has to manually sit there and send out all those emails. Part of the move is making sure that all the cron jobs are running on the right machines at the right time.
DNS: DNS stands for "Domain Name System". It's what your computer uses to know that requests for "http://www.livejournal.com" should come to our computers, but "http://www.google.com" should go to Google, for instance. Part of what we're doing tonight is the Internet equivalent of filling out a "change of address" card with the post office -- telling all the "address books" on the Internet that we've moved.
IP address: Your IP address is what distinguishes your computer from every other computer on the Internet. Every machine on the Internet has an IP address. DNS (see above) is what maps a domain name to an IP address. Because we're switching to a new provider, our IP address is changing, which means that we're going to have to broadcast a new IP address through DNS.
mailgate: This is the process that handles all the mail coming into LiveJournal and makes sure that it goes where it needs to go. It handles things like emailed Support requests, picture posts, mobile posts, etc.
memcache/memcached: Memcache is something that we came up with to store the most-frequently-used data in our servers' memory, instead of having to load it all from the database every time someone needs it. We do this because reading from memory is fast, and reading from the database's disks is slow. Of course, we're talking about fractions of a second, but when we're dealing with eight million of you, fractions of a second add up quickly!
mogile: In traditional LiveJournal fashion, we couldn't find a file arrangement system that we liked, so we wrote our own! MogileFS -- an anagram of "omg files" -- is what stores and retrieves all of our data files, like userpics and phone posts. Of course, it's a little more complicated than just your hard drive ...
nfs: NFS is a way of making sure that a bunch of different computers on a network can share a single bunch of files. It basically pretends that a hard drive on a different computer is really on your computer. That way, we can have all of the LJ files in one place, and all the other computers on the network can use that server, instead of having to copy files all over to our hundreds of different web servers.
perlbal: If the BigIP is the hardware doorman of the network, perlbal (short for "perl balancer" -- perl is the computer language it's written in) is the software doorman. It's a piece of software that takes all the connections in and makes sure that they go to the right place.
replication: When you've got two databases talking to each other, replication is the process of them sharing information back and forth. There are a bunch of different possible database configurations, but in all of them, data is passed back and forth pretty regularly. Part of tonight's move involves starting and stopping replication back and forth from various databases.
smtp: Stands for Simple Mail Transfer Protocol. This is what handles all the mail going out -- comment notification emails, Support request responses, and mail to your @livejournal.com email.
ttl: Time To Live. To really explain this I'd have to draw pictures! It's a setting that you can use in your DNS server configuration, to tell your DNS server how long to cache the answer. Basically, when your computer asks "Hey, if I want to go to www.livejournal.com, what IP address does that correspond to?", the server says "Here's the address!" It doesn't do that every time, though, to make things faster -- it saves the answer for later. By setting a low TTL, we make it go back and look up the real answer more frequently, in case it changes. Part of tonight involves setting a low TTL. Kind of like your pesky little brother -- every five minutes, it's going to poke us and whine, "Hey, didja move yet? Didja move yet? Didja mooooooove yet?"
And there you have it! If our engineers wind up using any terms that I haven't covered, I'll hop in and try to define them.
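(For the geek-speakers who'd rather see it than read it: here's a toy sketch of the "save the answer for a while, then go look it up again" idea behind both the ttl and memcache entries above. This is an illustration only, not our actual code; the TTLCache class, the slow_lookup function, and the 192.0.2.1 address are all made up for the example.)

```python
import time


class TTLCache:
    """Toy in-memory cache: each answer is kept for ttl_seconds, then looked up again."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so the demo can fake the passage of time
        self._entries = {}          # key -> (value, expires_at)

    def get(self, key, load):
        """Return the cached value for key, calling load(key) only on a miss or after expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]         # still fresh: answer from memory (fast)
        value = load(key)           # missing or stale: do the slow lookup
        self._entries[key] = (value, now + self.ttl_seconds)
        return value


if __name__ == "__main__":
    fake_now = [0.0]
    lookups = []

    def slow_lookup(name):
        # Stand-in for the slow path (a real DNS query or a database read).
        lookups.append(name)
        return "192.0.2.1"          # documentation-only address, not a real LJ IP

    cache = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])
    cache.get("www.livejournal.com", slow_lookup)   # miss: slow lookup
    cache.get("www.livejournal.com", slow_lookup)   # hit: served from memory
    fake_now[0] = 301.0                             # a bit over five minutes later
    cache.get("www.livejournal.com", slow_lookup)   # expired: slow lookup again
    print(len(lookups))                             # prints 2
```

A lower ttl_seconds means the slow lookup happens more often but a changed answer gets noticed sooner, which is the trade-off behind setting a low TTL before a move. And a cache that starts out empty has to take the slow path for everything at first, which is why a site feels pokey until its caches fill back up.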
November 18, 2005 at 09:57 PM | Permalink
I feel like an air traffic controller sitting here in a dark room with LCDs all around me, talking to people in San Francisco and Seattle on a headset with boom mic. We're coordinating all of the little tasks that need to be done prior to the switch over... the pace is very fast and the language very technical. Definitely interesting to be a part of.
Try to imagine a movie where they have a bunch of geeks trying to pull off some crazy heist... it's like that, only real. :-P
-- Whitaker
November 18, 2005 at 09:47 PM | Permalink
While Mischa did say "coke", and you may have thought something other than the trademarked term by Coca-Cola Inc., he really did mean the drink. LJ friends don't let friends do drugs!
November 18, 2005 at 09:12 PM | Permalink
Mischa here, I'm here in the action! The move is Serious Business! We have whiteboards, phone conferences, a forest of laptops, cables and coke everywhere, and a team of highly trained professionals. I can't see anything going wrong, but if it does I'll be providing valuable moral support.
November 18, 2005 at 09:02 PM | Permalink
So it seems if you put engineers and sysadmins in a room with a whiteboard, they'll use it. Lisa is right now coordinating listing out the order of stuff we have to do before 10pm. Getting a list together of everything, disabling crons, changing configs, checking data, etc. Matthew also just told us we now have a memcached node running in San Francisco! We're all on irc, aim, phone, etc., so it should be a good night! * knocks on wood *
--David
November 18, 2005 at 09:00 PM | Permalink
Yarr, this be Junior (aka marksmith).
Things are finally coming together on this move situation. We've got a bunch of people all hanging out here in San Francisco in the main conference room at the Six Apart headquarters and a bunch of people online and on the phones all pulling together to make this successful.
This will be my first, last, and only post to this blog since I'm more involved in the move and won't be having time to update this blog during the rest of the night.
So here's to us, to a good night, and a successful move!
November 18, 2005 at 08:43 PM | Permalink
So, David, y'all are going to overnight us some of that Italian food, right?
...On second thought, I don't think I'd want it the day after.
--rah
November 18, 2005 at 08:21 PM | Permalink
Whitaker here, reporting from Ohio-land where Winter has begun to take hold. I'm living here for a bit while I finish school and as such will not play a primary role in the move. However, Lisa, Matthew and Junior have all been working very hard in recent weeks (and months) to plan and execute the switchover.
Hopefully my role will be to sit back and watch everything go off without a hitch. But we're realistic. Flipping this many switches in something so enormous will likely cause us to navigate a few bumps here and there. I'll be here to provide any auxiliary support I can by writing quick scripts, monitoring site functionality, and analyzing existing code when necessary.
While we wait for the big moment, I'm going to bump some Snoop Dogg and occupy myself with reading my friends page, testing the new LJ installation via a super-secret URL, and keeping my comrades company from afar. Also, coffee. Go team! :-P
November 18, 2005 at 07:07 PM | Permalink