
2010

Update

Hi all,
Some of you may have noticed that your CGI scripts have broken. This is probably because our new webserver has a different directory structure and your scripts rely on a hard-coded directory path. Don't worry, this is only temporary. I used different NFS mount points to make mounting user data easier while we were swapping IPs around. It should be resolved when the server is on its proper hardware.
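For anyone who wants their scripts to survive future moves like this, here is a minimal sketch (not an official OCF recommendation) of building paths relative to the script's own location instead of hard-coding them. The helper names and the example file name are made up for illustration.

```python
import os


def script_dir(script_file):
    """Return the absolute directory that contains the given script file."""
    return os.path.dirname(os.path.abspath(script_file))


def data_path(script_file, *parts):
    """Build a path relative to the script's directory.

    Because the directory is looked up at run time, the script keeps
    working if the web root or home directory is mounted somewhere else.
    """
    return os.path.join(script_dir(script_file), *parts)
```

Inside a CGI script you would call something like data_path(__file__, "data", "guestbook.txt"), where "data/guestbook.txt" is a hypothetical file that sits next to the script.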

Anticipate downtime this weekend

Hi all,
We will try to rebuild one of our main servers this weekend. Anticipate some downtime and/or loss of functionality.

Servicing our disk array

The OCF will likely be down the entire weekend as we update, error check, and test our disk array. Please check back later for more updates.

MySQL Upgrade

Hi all,
We just upgraded our MySQL server. Let me know if anything doesn't work. Some of our client programs may be old, so if you see a message like "Authentication protocol not supported", don't worry; it's just that our clients are using old programs.

Further Updates

Thank you for all your patience; it's really appreciated. Our new disk array should be back up, and we will slowly restore services after the July 4th break. I don't want to rush anything, since that would just make it harder to fix if something goes wrong.

I want to make sure everyone knows we have multiple goals when addressing the current problems. While we try to maximize our uptime, we are also concurrently rebuilding our current system. At the beginning of summer, we did send a downtime announcement that spanned the whole summer. At some point we may decide the current system is not worth salvaging, and focus our efforts on the rebuild.

Furthermore, we will shortly attempt to rebuild our core servers, which will take down our entire system. We will try our best to build ad hoc login servers, but anticipate more downtime in July.

More problems....

There was an issue with our networking today, and a few of our servers responded very badly to the loss of network connectivity. We are working to get this resolved ASAP. Like before, we don't have an ETA on how long it will take. We will keep you posted.

Read-only home directories, mailboxes, and html directories

As you may know by now, OCF services have been marginally restored and you can now access your files. However, you may have also noticed that you get strange errors such as "Permission Denied" and are generally unable to edit your files. This is because we have mounted all your data read-only. This is a consequence of the impending failure of our disk array, and we are in the process of migrating to a new one. Stay tuned for changes.
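If you have scripts that write to your home or html directory, a quick check like the following can tell you whether the location is currently writable before the script tries to save anything. This is only a rough sketch using Python's standard os.access call; the function name is made up, and the check reflects the state at the moment it runs.

```python
import os


def is_writable(path):
    """Return True if the current user can write to path right now.

    os.access reports False for a path that does not exist, for one
    the user lacks write permission on, and for one that sits on a
    filesystem mounted read-only.
    """
    return os.access(path, os.W_OK)
```

For example, is_writable(os.path.expanduser("~")) should return False while home directories are mounted read-only.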

Unexpected Downtime

As many of you have noticed, the OCF is currently down. One of our core servers experienced a hardware failure earlier today and we are working to restore it. We do not yet have an ETA for restored service, but will post updates as we have them. We sincerely apologize for the inconvenience.


UPDATE (16 June, 1 PM): The failed server has been partly restored. Web and SSH logins are intermittently working for usernames starting with the letters a-p, but mail is still down systemwide; we hope to have full functionality soon. As always, you can find us in the OCF's IRC channel if you have any questions. Thank you for your patience.

UPDATE (16 June, 11 PM): All OCF services, including mail, webmail, and SSH logins for accounts starting with the letters q-z, should be working now. Please let us know if anything appears to be broken, and thank you for your patience!

UPDATE (17 June, 1 PM): Webmail has been fixed.

UPDATE (17 June, 1:30 PM): The OCF's main disk array is having problems — we've taken all OCF services down while we investigate.

First Update on Rebuild

Update 1 on OCF Rebuild

-Set up a new NAT server and a new printhost
-Each one is a separate ESXi guest
-Testing to see if I allocated enough memory; I think I may want to bump the printhost up to 512 MB of virtual RAM (any thoughts?)
-New lab IP ranges (formalized)
-Lab networking is being cleaned up, mostly wiring. I'm hoping to axe the Windows VLAN altogether, since if you include the random shit we have acquired over the past year, that's way more ports than what we traditionally allocated for the lab NAT (motivation for cleaning those IPs up)
-Haven't set up any printers yet; one of them is possibly broken (or jammed)
-Hopefully next week we will start rolling out machine images...will start with Linux
-8 pages of documentation so far; ask me if you want to read it. I'm writing it in Word documents for now (not leaving any details out). Traditionally we tend to leave things like iptables out of the staff wiki documentation, but staffers have messed stuff like that up before, so I'm not leaving even "trivial" stuff out...
-^Perhaps we can make an OCF staff handbook or something, basically a command-by-command guide to rebuilding the OCF?