Thanks for joining us! Let me know if there are any topics you'd like us to cover by sending an email to me at craigpeterson.com!

Oct 5, 2021

What Happened With Facebook's Outage?
When Will It Happen Again?

Facebook had a huge outage across all of its properties. So why did it happen? How did it happen? And what's going to happen in the future? Frankly, some of this technology just isn't that stable. And I'm going to explain why right now!

[Automated transcript follows]

[00:00:20] I've already talked about it a little bit this morning on the show, but Facebook was down. It was down a lot, and it was down a long time. And Mr. Zuckerberg has now lost about $7 billion because of how long it was down. And Craig Peterson joins us now to talk a little bit about exactly what happened, why it matters, what it means and so much more.

[00:00:39] Craig, how are you this morning?

[00:00:41] Hey, good morning. Doing well.

[00:00:42] Thanks. Good to have you as always. So tell me first: what actually happened yesterday? I read the explanation from Facebook, and it makes it seem like not a big deal, just a configuration problem, a little unexpected issue. They're not sure exactly what happened, and they're looking into it.

[00:00:57] It's not a big deal, though. Continue on with your day. What's the reality? What actually happened?

[00:01:01] Yeah, nothing to see here. You look at the number of companies Facebook has bought over the years: basically since 2005, they've spent $410 billion on all these companies. To name some names

[00:01:17] you might actually recognize: you remember Friendster?

[00:01:20] I do remember Friendster, yes. That's a little bit back there, but yeah.

[00:01:25] That was about 10 years ago. They paid $40 million for that. But of course, Facebook has moved on from that and owns all kinds of companies right now.

[00:01:35] It's got Instagram, WhatsApp (by the way, they paid $19 billion for WhatsApp), Oculus, LiveRail, and many

[00:01:45] others, basically. That's been one of the main complaints about it supposedly being a monopoly: that they've been gobbling up their competition, and other things that maybe weren't even competition but that they could just add to the big beast, and consolidating it all under Facebook's banner.

[00:02:02] Yeah. So the problem that tech guys have is this scale, massive scale. On top of all of that, they claim to have almost half the people on Earth logging on to Facebook. So how do you deal with numbers like these? It gets very difficult. And what appears to have happened is they're using a tool.

[00:02:26] There are a few that we use. And in fact, we had a similar problem yesterday with my company's networks. So here's what happened. Here are the basics, right? You heard it was a DNS problem. Some people have said that. That's not the real problem. The real problem lies underneath that. And it's something that we have to deal with because we're working with multiple companies that have multiple network connections, and that's where it comes from: the multiple network connections.

[00:02:56] So on the internet, what happens if you're going to go to Facebook? What you're typing in has to be turned into an internet address. And to do that, you use DNS. But how about beneath that? That's basically the street directory: who has Main Street in downtown Portsmouth, for instance? If you want to get there, there's another protocol that's used beneath DNS, and this protocol is used to actually map these addresses, these internet numbers.

[00:03:32] So that was the problem yesterday. And I checked it online myself with a site that we use to monitor all of this type of addressing. And what turned out had happened is Facebook stopped advertising where its addresses were. If you tried to look up Facebook, you couldn't find it. And you got a DNS error because the DNS servers' addresses were unknown.
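
The failure described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Facebook's actual setup: the prefixes are reserved documentation ranges, and the names `routes`, `NAMESERVER_IP`, `ZONE`, `reachable`, and `resolve` are hypothetical. It shows why withdrawing a route surfaces as a DNS error even though the DNS records themselves never changed.

```python
import ipaddress

# Toy "global routing table": advertised prefix -> network name (hypothetical values).
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "example-social",      # DNS servers live here
    ipaddress.ip_network("198.51.100.0/24"): "some-other-network",
}

# Hypothetical authoritative DNS server and the single record it serves.
NAMESERVER_IP = ipaddress.ip_address("192.0.2.53")
ZONE = {"social.example": "192.0.2.10"}


def reachable(ip):
    """An address is reachable only if some advertised prefix covers it."""
    return any(ip in prefix for prefix in routes)


def resolve(name):
    """Look up a name; fail if no route leads to the nameserver."""
    if not reachable(NAMESERVER_IP):
        raise LookupError(f"DNS error: no route to nameserver {NAMESERVER_IP}")
    return ZONE[name]


before = resolve("social.example")  # works while the prefix is advertised

# Simulate the outage: the network stops advertising its own prefix.
del routes[ipaddress.ip_network("192.0.2.0/24")]

try:
    resolve("social.example")
    after = "resolved"
except LookupError as err:
    after = str(err)

print(before)
print(after)
```

The zone data is untouched in the second lookup; the only thing that changed is that nobody knows how to reach the server holding it.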

[00:03:57] You knew the address, but you didn't know how to get to that address on the internet. And Facebook has become so big, they're using automated tools in order to push the configurations to all of these, what are called BGP servers. So what probably happened yesterday, from reading some things on Reddit and other places where there are some people who claim to be working for Facebook, is this:

[00:04:26] Somebody forgot to put the peer configurations into their BGP routing tables and pushed it out to all of their BGP routers worldwide. Now I've got to say, an outage that lasted only six or eight hours with a problem like this is amazing, because now you have to worry about the cold start of the whole thing. It's kind of like Texas: another four minutes, and they would have been without power in some areas for months,

[00:04:57] we were referring to it.

[00:04:58] I'm thinking of a cold start, you said. It sounds like you're starting a car: it's too cold outside and the car just doesn't have enough juice in the battery. So is that basically what happened?

[00:05:06] Yeah. Yeah. What happened is you couldn't get to anything. Facebook probably could not get to its own routers to update the configuration.

[00:05:14] So the reason it took so long, then, is that they really were having a difficult time even gaining access to the thing that would be necessary to fix it.

[00:05:20] Exactly. And there were a lot of people, myself included, who were thinking, man, it's going to be days, because the cold start also has problems with things like caches.

[00:05:31] For instance, you go to a page. There are pictures, there are videos, there's text. Well, all of that information gets stored in a cache, so it doesn't have to be generated every time somebody sees something. So there would be cold caches out there that would need to be updated. It's a nightmare. This was a nightmare scenario for them and was probably caused by letting some junior guy

[00:05:55] make some changes to their BGP table.
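
The cold-cache concern Craig raises can be illustrated with a minimal sketch. Everything here is hypothetical (the `PageCache` class, the `render` function, and the page names); it only shows that after a restart empties the cache, every page has to be regenerated once before the cache starts helping again.

```python
class PageCache:
    """Tiny in-memory cache: renders a page once, then serves the stored copy."""

    def __init__(self, render):
        self._render = render
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, page):
        if page in self._store:
            self.hits += 1
        else:
            # Cache miss: do the expensive work and remember the result.
            self.misses += 1
            self._store[page] = self._render(page)
        return self._store[page]

    def clear(self):
        """Simulate a cold start: all cached pages are lost."""
        self._store.clear()


render_calls = []


def render(page):
    """Stand-in for expensive page generation; records each call."""
    render_calls.append(page)
    return f"<html>{page}</html>"


cache = PageCache(render)
requests = ["feed", "profile", "feed", "feed", "profile"]

for page in requests:
    cache.get(page)
warm_renders = len(render_calls)   # renders needed while the cache fills up

cache.clear()                      # the outage: caches come back empty

for page in requests:
    cache.get(page)
total_renders = len(render_calls)  # the same pages had to be rendered again

print(warm_renders, total_renders, cache.hits, cache.misses)
```

At real scale the second round of misses arrives all at once from every returning user, which is why a cold start after a long outage is more than just switching things back on.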

[00:05:59] That is remarkable. We're talking with Craig Peterson, our tech guru. He joins us on Wednesdays typically to go over the world of technology. And of course we'll do that tomorrow as well, but we wanted to have him join us to talk a little bit about Facebook before I let you go.

[00:06:11] Craig, the implications of this, I think, are massive. I think you have to consider, even if you don't care about Facebook, if you don't use it, if it's not part of your life, obviously it is such a big part of not just American life; this is a worldwide issue, right? It is used by billions and billions of people, and this kind of an outage lasting this long is not only unprecedented, but really important in terms of, good Lord,

[00:06:34] if you're a Facebook stockholder. I was talking about that a little bit earlier this morning. If you had Facebook stock, how do you feel today? I know Mark Zuckerberg doesn't feel great. That's why he lost $7 billion of value yesterday. How does this affect Facebook, the company, going forward? And when you combine this with the whole whistleblower thing, it's not exactly been a good week.

[00:06:51] Yeah, not at all. This problem, frankly, comes from the early days, or earlier days, of the internet. I was on the internet back in the early 1980s, helping to develop the protocols. And back then, we were not worried about that type of massive scale. We were not worried about hackers really getting in,

[00:07:13] because it was a great community. Most of us knew each other and we used to joke around and have a lot of fun. These protocols were not designed for the types of problems we're seeing today. So until these problems are solved, not by Facebook, but by the internet community as a whole, these types of things can happen again.

[00:07:37] So Facebook could go down again, because frankly we have seen times where, for instance, traffic from the Washington, DC area was all routed through Moscow. So you would send data from the White House, you know, to someone in the building across the street, and it was routed through Moscow. Who knows what the Russians are doing with all of that data? But we just don't have the safeguards in place that would support, frankly, the way we are using the internet today.

[00:08:12] Facebook could face this problem again. We're talking about as much as, I've seen numbers, $500 million an hour in lost revenue for Facebook, but it could happen to anyone. And I'm sure there will be a lot of work here, with others, people sharpening pencils and finally getting in line on how we actually make

[00:08:33] this stuff work at huge scale. Huge. We're talking now hundreds of billions, probably trillions, of devices connected to the internet by 2025.

[00:08:46] They're actually sharpening pencils, Craig? You think anybody uses pencils anymore? I beg to differ. Not at technology companies. Craig Peterson, we appreciate it as always.

[00:08:55] Of course you hear him on Saturdays as well on WGAN, and we'll hear his voice tomorrow, joining us for the more traditional tech topics, other things besides Facebook to chat about, obviously, but we appreciate him joining us this morning. Thanks a lot, Craig. And we'll talk to you tomorrow.

[00:09:07] Take care.