Global IT outages have hit airlines and businesses worldwide


I'm not shitting you when I say I'll be in calls with 10 PMs and 3 ICs. I spend at least 40% of my day giving updates. Somehow they never know what's going on even so!

 

For any update, I have to send an email, post in two Slack channels, add it to Jira, and then mention it in two stand-ups. And then also mention it in other meetings. Every day.


3 hours ago, Ghost_MH said:

 

Yup, just pulled automatically in the background. Seems CrowdStrike told some clients to just reboot their systems dozens of times until a fix was downloaded, but I don't know anyone for whom that actually worked.

 

Many security apps are like this. AV definitions aren't normally vetted. This is especially true for logic engines in security suites. Think of these as machine learning tools for keeping systems safe.

 

I'm more intimately familiar with McAfee's similar outage nearly twenty years ago. That one had their AV definitions accidentally flag a Windows system file as bad, which bricked Windows as soon as the AV quarantined the essential DLL. I'm also pretty familiar with Qualys. I previously used Qualys for managing security and updates, and their tools were automatically updated by DEFAULT. This is part of the problem. The reason I say it's expensive is because you'd need parallel hardware, and companies already view IT as a net negative on corporate profits. You can't just test things on one virtual machine and call it a day. Have some physical database cluster? Well, now you need a second similar cluster. Have an entire virtual environment for your engineers? Well, if you really want to test things, you need an equally complicated test environment. If you don't, you need to accept that you're not fully testing things, and I've never met a CFO that was OK with funding partial tests that come with no guarantees.

 

My cheap solution to this was always to push all updates off by a week and then pay attention to news reports about faulty updates. That's obviously not an option for everyone, though. If everyone delays their updates by a week, then we're back where we started. Also, all of these companies tell you best practice is to stay updated and on time. If you don't and you get bit by a zero-day during that update gap, it's your policy that caused the outage and you wind up with the full blame.

 

It sucks, but that's how it is. I've personally gotten drilled by a CEO who was upset with me over updates that weren't installed per my policies, even though we weren't negatively affected. There was big news about some zero-day, he randomly saw me walking by his office, called me in, and asked if we were patched to prevent this exploit. When he heard we weren't because those updates weren't scheduled to go out for another week, he really wasn't happy. I wasn't happy about it either, but I ended up pushing an out-of-band update for just that one zero-day and left everything else as is.

 

I like my job, but working in IT often sucks.

 

 

Ooph. Yeah, automatic updates strike me as just such a terrible idea. If that's going to exist, companies really ought to just pay the cost, but people are bad at cost analysis of tail events. Your strategy of delaying when you can seems like a good idea in lieu of that. That's basically what I do for all my personal systems (at least the ones that matter).
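
For anyone who wants to see the delay-by-a-week idea spelled out, here is a rough sketch in Python. It is only an illustration of the policy described above, not how any real patch management tool works: the `Update` record, the `emergency` flag, and the seven-day `DEFERRAL` are all made-up names and values. Routine updates wait out the hold period, while anything an admin marks as an emergency (the out-of-band zero-day fix) goes out right away.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed hold period, matching the "push everything off by a week" policy.
DEFERRAL = timedelta(days=7)


@dataclass(frozen=True)
class Update:
    """Hypothetical update record; not tied to any real vendor's format."""
    name: str
    released: date
    emergency: bool = False  # set by an admin for an out-of-band fix


def ready_to_install(update: Update, today: date,
                     deferral: timedelta = DEFERRAL) -> bool:
    """True once the hold period has passed, or immediately for emergencies."""
    if update.emergency:
        return True
    return today - update.released >= deferral


def install_queue(updates, today: date):
    """Names of the updates that the policy would install today."""
    return [u.name for u in updates if ready_to_install(u, today)]


if __name__ == "__main__":
    today = date(2024, 7, 19)
    pending = [
        Update("routine-a", date(2024, 7, 15)),
        Update("routine-b", date(2024, 7, 10)),
        Update("zero-day-fix", date(2024, 7, 18), emergency=True),
    ]
    print(install_queue(pending, today))
```

The trade-off is exactly the one mentioned above: the hold period buys time to hear about faulty updates, but someone still has to decide what counts as an emergency.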


3 hours ago, Ghost_MH said:

Exactly this. It's the bigger companies that have the budget that refuse to adequately fund this kind of stuff. They'll often bring in some MBA to manage IT/cut costs, and that MBA will decide that nobody working for them knows better, so if we use a third-party tool to manage certain risks, you should abide by that party's best practices. Those usually include allowing the vendor to manage its own updates, because it frees up internal resources to run leaner and more efficiently.

 

Yeah, I have zero love for McKinsey, Deloitte and MBAs from shit consultancy firms. Yet, with that said, the companies who go out and hire them are brain-dead themselves. I used to work for a few software consultancies, and some of the idiotic questions we got from some clients were mind-numbing.


On 7/19/2024 at 9:18 PM, ApatheticSarcasm said:

So once all these companies perform a post mortem, what are the lessons learned?

Probably nothing honestly. This isn't the first time we have seen this type of thing and won't be the last.

 

The travel industry for example uses a central system for bookings. Huge target for hackers. It is how companies like Expedia can exist. They tap into that system and can book flights/hotels/cars etc. Take that out and several companies are crippled yet again.

 

Our corporate overlords have become complacent.


50 minutes ago, Link200 said:

Probably nothing honestly. This isn't the first time we have seen this type of thing and won't be the last.

 

The travel industry for example uses a central system for bookings. Huge target for hackers. It is how companies like Expedia can exist. They tap into that system and can book flights/hotels/cars etc. Take that out and several companies are crippled yet again.

 

Our corporate overlords have become complacent.

 

There are a lot of tools out there that use machine learning algorithms to suss out suspicious activity on computers and networks. CrowdStrike is one of those, but none of them are guaranteed to always be bug-free. This isn't about complacency, but about short-sighted profit grabs above all else.

 

The end result is companies will move away from CrowdStrike to any one of its competitors, who carry along with them the exact same risk.

 

I do think this kind of machine learning security is the best way to do it. The old method of just looking for hashes of known viruses and such doesn't really do enough. However, I believe this can better be handled at the network level rather than on individual machines. Keyloggers and data breaches need to send their data somewhere. It's often just as easy to spot that activity on the network without the risk of breaking a server OS.
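
To make the network-level idea a bit more concrete, here is a toy sketch in Python. It is deliberately not machine learning, just a fixed threshold rule, and everything in it is an assumption for illustration: the `Flow` record, the `known_destinations` allowlist, and the one-megabyte `byte_limit`. It adds up how much each host sends to destinations that aren't on the allowlist and flags the hosts that exceed the limit, all without installing anything on the machines being watched.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class Flow(NamedTuple):
    """Hypothetical flow record, e.g. summarized from firewall logs."""
    src_host: str
    dst_ip: str
    bytes_out: int


def suspicious_hosts(flows: Iterable[Flow],
                     known_destinations: Set[str],
                     byte_limit: int = 1_000_000):
    """Hosts whose total outbound bytes to unknown destinations exceed byte_limit."""
    totals = defaultdict(int)
    for flow in flows:
        # Traffic to allowlisted destinations is ignored entirely.
        if flow.dst_ip not in known_destinations:
            totals[flow.src_host] += flow.bytes_out
    return sorted(host for host, total in totals.items() if total > byte_limit)


if __name__ == "__main__":
    flows = [
        Flow("db01", "10.0.0.5", 5_000_000),     # internal backup target
        Flow("db01", "203.0.113.9", 2_000_000),  # large upload to unknown IP
        Flow("web01", "203.0.113.9", 10_000),    # small, under the limit
        Flow("hr-pc", "198.51.100.7", 600_000),
        Flow("hr-pc", "198.51.100.7", 600_000),  # adds up past the limit
    ]
    print(suspicious_hosts(flows, known_destinations={"10.0.0.5"}))
```

A real product would be far more sophisticated than this, but the point stands: the check runs on the network side, so a bad rule can produce false alarms without blue-screening a server.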


42 minutes ago, Ghost_MH said:

 

There are a lot of tools out there that use machine learning algorithms to suss out suspicious activity on computers and networks. CrowdStrike is one of those, but none of them are guaranteed to always be bug-free. This isn't about complacency, but about short-sighted profit grabs above all else.

 

The end result is companies will move away from CrowdStrike to any one of its competitors, who carry along with them the exact same risk.

 

I do think this kind of machine learning security is the best way to do it. The old method of just looking for hashes of known viruses and such doesn't really do enough. However, I believe this can better be handled at the network level rather than on individual machines. Keyloggers and data breaches need to send their data somewhere. It's often just as easy to spot that activity on the network without the risk of breaking a server OS.

My point is that there are tons of single points of failure and corporations don't realize it until it is too late.

 

In aviation we do everything we can to have redundancies. Two pilots. Two engines. Four electrical systems. You get the picture.

 

I don't believe these companies are ignorant of the fact that they are piling in on these failure points. Volume does create lower costs, after all. The problem is that the costs are lower only in the short term, while the popularity of the shared system exposes them to a much greater threat.

 

Let's say United Airlines develops their own booking systems and allows third parties to access it via an API. Meanwhile other airlines all pile in on the industry standard. Who has the highest amount of risk?

 

Likely the airlines that went all in on a single standard and got hacked as a result. Meanwhile United kept going like nothing happened.

 

Sorry, a drive to lower costs in the short term while increasing risk is a form of complacency and negligence. It is complacency because it seems to work until it doesn't.


Yup, so we pretty much have confirmation now that this was a logic update that did not update the CrowdStrike application version, so even if you had CrowdStrike set to not update automatically this would still have been updated. 

 

WWW.THEVERGE.COM

Here are the details about what went wrong on Friday.

 

Quote

On July 19, 2024 at 04:09 UTC, as part of ongoing operations, CrowdStrike released a sensor configuration update to Windows systems. Sensor configuration updates are an ongoing part of the protection mechanisms of the Falcon platform. This configuration update triggered a logic error resulting in a system crash and blue screen (BSOD) on impacted systems.

 

This is an update that isn't meant to change how the application works, but what the application does to detect threats. Running one version behind would not have saved you.

 

More than anything, this is just further proof that servers should just not have Internet access.


Guys that are actually employed by the company we provide security to are complaining about how they haven't got paid yet because of this... but also stated that getting paid in general even if there aren't tech issues is a hassle. So reading between the lines, I think their employer is BSing them about why they haven't been paid yet.


Unsurprisingly, I came into the office today to find my Windows tower blue screened, and based on the phone hold times it seems like IT is still completely slammed helping people with this. Thankfully I also have a Mac laptop, so I was able to get onto Teams and have them just chat me the admin password for the computer so I could fix it myself.

 

I knew they would be willing to do it but I still specifically pitched it as "if you send me the password that's one less phone call you guys have to take today". :p


And we have confirmation that CrowdStrike just wasn't vetting these updates as thoroughly as their product releases because they assumed the risk factor was fairly low. 

 

WWW.THEVERGE.COM

CrowdStrike is detailing exactly what went wrong.

 

You've got to manually test everything. Relying on automated tools to validate your logic updates when your tool has such low level OS access doesn't sound risky at all.

 

Grossly simplifying things, they basically thought "We're just updating an algorithm. What's the worst that could happen?"


26 minutes ago, SuperSpreader said:

ai will fix this

 

No, a pizza party will though! They learned from the best capitalists.

 

TECHCRUNCH.COM

Several people who received the CrowdStrike offer found that the gift card didn't work, while others got an error saying the voucher had been canceled.

 

Can you even get anything on UberEats for $10?


50 minutes ago, Ghost_MH said:

And we have confirmation that CrowdStrike just wasn't vetting these updates as thoroughly as their product releases because they assumed the risk factor was fairly low. 

 

WWW.THEVERGE.COM

CrowdStrike is detailing exactly what went wrong.

 

You've got to manually test everything. Relying on automated tools to validate your logic updates when your tool has such low level OS access doesn't sound risky at all.

 

Grossly simplifying things, they basically thought "We're just updating an algorithm. What's the worst that could happen?"

 

Frustrated Ryan Gosling GIF

 

Just unbelievably stupid. You should always manually/human test EVERYTHING. Automated testing is only as good as the tests written and can lead to a false sense of security.


22 minutes ago, chakoo said:

 

Frustrated Ryan Gosling GIF

 

Just unbelievably stupid. You should always manually/human test EVERYTHING. Automated testing is only as good as the tests written and can lead to a false sense of security.

we saved so much money tho


1 hour ago, GeneticBlueprint said:

 

No, a pizza party will though! They learned from the best capitalists.

 

TECHCRUNCH.COM

Several people who received the CrowdStrike offer found that the gift card didn't work, while others got an error saying the voucher had been canceled.

 

Can you even get anything on UberEats for $10?

 

Quote

On Wednesday, some of the people who posted about the gift card said that when they went to redeem the offer, they got an error message saying the voucher had been canceled. When TechCrunch checked the voucher, the Uber Eats page provided an error message that said the gift card “has been canceled by the issuing party and is no longer valid.”

 

CrowdStrike did not immediately respond to a request for comment. 

 


10 minutes ago, SuperSpreader said:

Screenshot-20240725-142119-2.png

 

On 7/19/2024 at 1:27 PM, Ghost_MH said:

 

Wouldn't be the first time. I haven't dug into this one too much since it didn't affect our office, but I remember a decade ago when McAfee bricked Windows desktops. In that case, it wasn't an untested update to the software, but to security definitions. There are more and more applications that are self-updating without IT intervention because the update is supposedly limited to itself. Edge and Chrome fall into this category. Security or virus definitions, with their white and black lists and threat hashes, didn't normally get manual approval either.

 

Motherfucker

