News from the staff team

More problems....

There was an issue with our networking today, and a few of our servers responded very badly to the loss of network connectivity. We are working to get this resolved ASAP. As before, we don't have an ETA on how long it will take. We will keep you posted.

Read-only home directories, mailboxes, and html directories

As you may know by now, OCF services have been partially restored and you can now access your files. However, you may have also noticed that you get strange errors such as "Permission Denied" and are generally unable to edit your files. This is because we have mounted all your data read-only. This is a consequence of the impending failure of our disk array, and we are in the process of migrating to a new one. Stay tuned for changes.

Unexpected Downtime

As many of you have noticed, the OCF is currently down. One of our core servers experienced a hardware failure earlier today and we are working to restore it. We do not yet have an ETA for restored service, but will post updates as we have them. We sincerely apologize for the inconvenience.


UPDATE (16 June, 1 PM): The failed server has been partly restored. Web and SSH logins are intermittently working for usernames starting with the letters a-p, but mail is still down systemwide; we hope to have full functionality soon. As always, you can find us in the OCF's IRC channel if you have any questions. Thank you for your patience.

UPDATE (16 June, 11 PM): All OCF services, including mail, webmail, and SSH logins for accounts starting with the letters q-z, should be working now. Please let us know if anything appears to be broken, and thank you for your patience!

UPDATE (17 June, 1 PM): Webmail has been fixed.

UPDATE (17 June, 1:30 PM): The OCF's main disk array is having problems — we've taken all OCF services down while we investigate.

First Update on Rebuild

Update 1 on OCF Rebuild

-Set up a new NAT server and a new printhost
-Each one is a separate ESXi guest
-Testing to see if I allocated enough memory; I think I may want to bump the printserver up to 512 MB of virtual RAM (any thoughts?)
-New lab IP ranges (formalized)
-Lab networking is being cleaned up, mostly wiring. I'm hoping to axe the Windows VLAN altogether, since if you include the random shit we have acquired over the past year, that's way more ports than what we traditionally allocated for the lab NAT (motivation for cleaning those IPs up)
-Haven't set up any printers yet; one of them is possibly broken (or jammed)
-Hopefully next week we will start rolling out machine images... will start with Linux
-8 pages of documentation so far; ask me if you want to read it. I'm writing them as Word documents for now (not leaving any details out). Traditionally we tend to leave stuff like iptables out of the staff wiki documentation, but staffers have messed stuff like that up before, so I'm not leaving even "trivial" stuff out...
-^Perhaps we can make an OCF staff handbook or something, basically a command-by-command guide to rebuilding the OCF?

Summer Goings-On

So, some of our users have been concerned about the planned downtime this summer. Here's the nitty-gritty:


We're rebuilding everything from the ground up. Partly this is because the current system has things scattered over multiple servers, a relic of the days when we didn't have powerful enough systems to consolidate mail, for example, onto one machine. Partly this is because Oracle (formerly Sun) can no longer be relied upon to provide free security updates for Solaris, which forms the majority of our back-end. Finally, this is partly because the current setup is simply untenable. It's the computer equivalent of a Land Rover held together with spit, baling wire, and chewing gum - it runs, more or less, but God help anyone who needs to poke around under the hood when things break.

So, we decided to move on. Rather than spending time and energy trying to keep fixing a 20-year-old heap, we're starting fresh. We'll be changing some things on the backend - most notably a migration from Solaris to Debian and FreeBSD - and we'll be changing some things on the frontend - like bringing in Windows 7 and getting some new hardware for the Linux clients. For most of our users, the change will be mostly transparent. For some of our users, things will change a little bit. For a tiny minority, things will break. To those people, we apologize in advance, and we are, as always, happy to help you get things working again.

In theory, while we're making this transition, we'll be building replacements side-by-side with the current production servers, and swapping them out once we're fairly certain that everything's working correctly. So, for the majority of our services, there won't be much more than a blip in service. Moreover, we won't be swapping out more than one server at a time, so no more than one service should go offline at any given time. However, there are some services (notably web and MySQL) which will take longer to swap out. The fact of the matter is that we really only have one server powerful enough to be a web server, so we can't build its replacement until we've shut it down. Even so, our daring team of sysadmins should have the server back up and running in no time flat (I believe the previous record for a ground-up rebuild of the webserver was less than a day).

Finally, I'd like to bring some attention to the "in theory" that started off that last paragraph. As anyone who's ever worked on any sort of project before knows, something will always go wrong. So we ask you to bear with us while we work out the kinks and the bugs. This is going to take at least a few weeks, and things will be a little hectic during that time. We may swap in a new system only to find some bug that escaped our testing, and we'll have to switch back until we get it sorted out. We'll do our best to post here when we're getting ready to swap something out, and we'll make sure that there are always avenues open to get in touch with us to let us know about problems.

Oh, and remember that we're all volunteers. We do our best, but sometimes other commitments (school, work, family, life) can get in the way for a little while. But like all true geeks, we can't stay away for long, so rest assured that things will get fixed, emails will get answered, and the agents of truth and light will win the day.

Thanks for reading.

Sorry...

Our incoming mail server was having some difficulties with disk space. Somewhere in the stress of finals and the stress of having to resurrect the server, I panicked and accidentally deleted some queued-up mail. So mail might behave a bit weird for the next day or two. Sorry for any inconvenience I may have caused you.

Sanjay Krishnan
OCF Site Manager

Unexpected Downtime

The OCF's central file server went down today around 4:30, and we're having a bit of trouble nursing it back to a working state. We'll post updates here as we have them.

Patch Day

The OCF's patch day is set to begin in about half an hour, and is scheduled to run for 24 hours (6 PM Friday to 6 PM Saturday). Various services will drop in and out during that time, including mail, web, and SSH access. We'll post status updates here and will let you know about any unforeseen events which would prolong the outage.


8 PM: The mail servers have been updated successfully, with no outage required thus far. We will need to shut down mail delivery, POP, and IMAP when we take the fileserver down later tonight, however. The login server and documentation server are taking longer than expected to patch; we'll hold off on the fileserver until those two are resolved.

10:45 PM: The final updates for the login and documentation servers have been applied, apparently with no adverse effects.  We've just taken mail down in preparation for patching our main fileserver.  We will be bringing down web next before commencing the patching.

1:45 AM: Looks like the bulk of the patching work is done.  There might be a few more short outages throughout the day as we tweak settings and ferret out bugs, but there should be no more prolonged outages (cross your fingers).

Unexpected Downtime

Sometime this morning, one of the OCF's mail servers stopped responding to user logins. We are working as fast as we can to resolve the issue, but do not have a time estimate. The server that hosts webmail also recently stopped responding; we are working on restoring it, but are focusing on bringing mail services back online.

The mail server in question — mail.OCF.Berkeley.EDU — provides IMAP, POP, and SMTP access to mail, but stores users' email on the central file server. Old mail is still accessible via command-line mail clients like mutt and pine on our login servers (apocalypse and tsunami), and incoming mail will be queued on a different server and delivered when mail.OCF is restored.

We will post updates when we have them. If you need assistance, someone in the OCF IRC channel may be able to assist you, but please be patient if nobody responds right away. :) We apologize for the downtime and inconvenience, and wish you a wonderful winter break.

UPDATE (23 December): mail.OCF is partially online. You can access your mail with an IMAP or POP client such as Mozilla Thunderbird, but we are still ironing out some issues with sending mail — please let us know if you have any problems. Webmail is still down, but we are hoping to have that service restored within the next few days.

UPDATE (27 December): We've restored the server that hosts webmail and documentation. Thank you for your patience — have a happy New Year!

UPDATE (30 December): Our SSL certificate issuing authority, ipsCA, kinda dropped the ball and forgot to renew their root certificates before they expired on December 29th. As a result, unless you're using IE8, you will get a warning about an untrusted SSL certificate whenever you try to access OCF services like POP, IMAP, SMTP, Webmail, or account tools. We are working on obtaining updated certificates from a slightly more on-the-ball certificate authority, but in the meantime, you may continue accessing your OCF services by overriding the certificate warnings. However, overriding certificate warnings is a bad habit to get into in general, and we suggest that you wait until we have installed the updated certificates before attempting to connect.
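
For the curious, here is a rough sketch of how an expiry date like the one above could be watched for ahead of time. This is not something we run at the OCF; the function names and the mail.example.org hostname are made up for illustration. It only looks at the certificate the server itself presents, so an expired root certificate (our actual problem) would more likely show up as a verification error during the handshake than as a low day count.

```python
import socket
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format returned by getpeercert(), for example
    "Dec 29 00:00:00 2009 GMT". A negative result means it has expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / SECONDS_PER_DAY


def fetch_not_after(host, port=993, timeout=10):
    """Connect over TLS and return the server certificate's notAfter string.

    Uses the default trust store, so an untrusted or expired chain raises
    ssl.SSLCertVerificationError instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Something like days_until_expiry(fetch_not_after("mail.example.org")) could then be run from a daily job that warns when the number drops below, say, 30.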

UPDATE (5 January): We have installed the new SSL certs for mail.ocf and webmail.ocf. Let us know if you have any issues connecting to these services, and happy new year!

Emergency Webserver Maintenance

At around 8:30 PM this Monday, OCF staff noticed that the OCF website was slow to respond, and that our webserver was either the victim of a DDoS attack or had exhausted its swap space. In any event, we've gone ahead and rebooted it. We will also use this opportunity to apply a few long-overdue patches to the machine.
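
As an aside, the swap theory is the easier of the two to check after the fact. The sketch below assumes a Linux host that exposes /proc/meminfo, which may not match the webserver in question, and the helper names and the 90% threshold are hypothetical choices for illustration only.

```python
def swap_usage(meminfo_text):
    """Return (used_kb, total_kb) parsed from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total - free, total


def swap_exhausted(meminfo_text, threshold=0.9):
    """True when at least `threshold` of the configured swap is in use."""
    used, total = swap_usage(meminfo_text)
    return total > 0 and used / total >= threshold
```

On a Linux box, passing the contents of /proc/meminfo to swap_exhausted() gives a quick yes or no; it says nothing about whether a DDoS was also in progress.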


We expect this downtime to last no longer than an hour. During this outage, most OCF services will be unavailable (the machine hosting our webserver also hosts a number of other critical services).

We apologize for the inconvenience and thank our users for their understanding.

UPDATE: We have since discovered a problem with the server hosting mail services. We are working on this issue now; we anticipate that this will add no more than an additional hour to our original schedule.

UPDATE 2: It looks like everything is back online as of 9:55 PM. Let us know if you discover anything out of the ordinary.

UPDATE 3: It looks like webmail is having some issues. We are investigating this issue, but have no timeframe for fixing it. UPDATE TO THE UPDATE: Looks like this was an easier fix than I had thought. Webmail should be working again.