Commissar SFLUFAN Posted July 19 CrowdStrike IT outage hits global supply chain, with air freight facing days or weeks to recover WWW.CNBC.COM CrowdStrike global IT outage hits the heart of the global supply chain, with air freight, rails, ports and trucking in the U.S. and beyond down. Quote The CrowdStrike software bug that crashed Microsoft operating systems and caused the largest IT outage in history caused disruptions at U.S. and global ports, with highly complex air freight systems suffering the heaviest hit, according to logistics experts, as global airlines grounded flights. “Planes and cargo are not where they are supposed to be and it will take days or even weeks to fully resolve,” Niall van de Wouw, chief air freight officer at supply chain consulting firm Xeneta, said in a statement shared with CNBC. “This is a reminder of how vulnerable our ocean and air supply chains are to IT failure.”
Ricofoley Posted July 19 Share Posted July 19 Sounds like the Southwest Airlines stuff where once things got out of sync enough they basically had to start over from scratch to create a new plan for who should go where Quote Link to comment Share on other sites More sharing options...
SuperSpreader Posted July 19 Share Posted July 19 so this is why IT rolls out updates late after they test them for weeks Quote Link to comment Share on other sites More sharing options...
sblfilms Posted July 19 2 minutes ago, Commissar SFLUFAN said: CrowdStrike IT outage hits global supply chain, with air freight facing days or weeks to recover WWW.CNBC.COM CrowdStrike global IT outage hits the heart of the global supply chain, with air freight, rails, ports and trucking in the U.S. and beyond down. One of the problems with our hyper-efficient supply chains, thanks to JIT manufacturing, is, holy moly, do things get all out of whack in relatively short order, and the dominoes don’t stop falling for quite some time.
Commissar SFLUFAN Posted July 19 1 minute ago, sblfilms said: One of the problems with our hyper-efficient supply chains, thanks to JIT manufacturing, is, holy moly, do things get all out of whack in relatively short order, and the dominoes don’t stop falling for quite some time. JIT ensures that the logistics chain operates on the knife's edge of disaster, with little-to-no redundancy.
legend Posted July 19 What I want to know is why this patch is distributed everywhere without question. Why doesn't each client using this software do a local test of a new patch before deploying into prod? I mean, CrowdStrike is deserving of blame and should probably go under for this. But why did no users of it think to test things when they update them?
SuperSpreader Posted July 19 7 minutes ago, legend said: I mean, CrowdStrike is deserving of blame and should probably go under for this. But why did no users of it think to test things when they update them? that's what big tech firms usually do: engineering tests it locally for a few weeks before deploying
Ghost_MH Posted July 19 16 minutes ago, legend said: What I want to know is why this patch is distributed everywhere without question. Why doesn't each client using this software do a local test of a new patch before deploying into prod? I mean, CrowdStrike is deserving of blame and should probably go under for this. But why did no users of it think to test things when they update them? Wouldn't be the first time. I haven't dug into this one too much since it didn't affect our office, but I remember a decade ago when McAfee bricked Windows desktops. In that case, it wasn't an untested update to the software, but to the security definitions. There are more and more applications that self-update without IT intervention because the update is supposedly limited to the application itself. Edge and Chrome fall in this category, and security or virus definitions, with their white and black lists along with threat hashes, don't normally get manual approval.
Jason Posted July 19 3 minutes ago, Ghost_MH said: Wouldn't be the first time. I haven't dug into this one too much since it didn't affect our office, but I remember a decade ago when McAfee bricked Windows desktops. In that case, it wasn't an untested update to the software, but to the security definitions. There are more and more applications that self-update without IT intervention because the update is supposedly limited to the application itself. Edge and Chrome fall in this category, and security or virus definitions, with their white and black lists along with threat hashes, don't normally get manual approval. CrowdStrike is ring 0.
ApatheticSarcasm Posted July 19 Share Posted July 19 So this is what I understand to have happened Quote Link to comment Share on other sites More sharing options...
Keyser_Soze Posted July 19 Share Posted July 19 Square should sue them because their name is too close to Cloud Strife. Quote Link to comment Share on other sites More sharing options...
Reputator Posted July 19 1 hour ago, legend said: What I want to know is why this patch is distributed everywhere without question. Why doesn't each client using this software do a local test of a new patch before deploying into prod? I mean, CrowdStrike is deserving of blame and should probably go under for this. But why did no users of it think to test things when they update them? In the modern world of software, customers ARE the QA testers.
Commissar SFLUFAN Posted July 19 Share Posted July 19 1 minute ago, Reputator said: In the modern world of software, customers ARE the QA testers. Especially for the fans of Bethesda games! 2 Quote Link to comment Share on other sites More sharing options...
Best Posted July 19 Share Posted July 19 41 minutes ago, ApatheticSarcasm said: So this is what I understand to have happened I always enjoy these. Quote Link to comment Share on other sites More sharing options...
Link200 Posted July 19 Share Posted July 19 The people at fault are the people that constantly invest in a single point of failure. Car dealerships learned that mistake last month. This really shouldn't have been as big as it was but the drive to lower costs just cost many of these companies potentially billions in lost revenue. Quote Link to comment Share on other sites More sharing options...
ApatheticSarcasm Posted July 19 Share Posted July 19 13 minutes ago, Link200 said: The people at fault are the people that constantly invest in a single point of failure. Car dealerships learned that mistake last month. This really shouldn't have been as big as it was but the drive to lower costs just cost many of these companies potentially billions in lost revenue. So once all these companies perform a post mortem, what are the lessons learned? Quote Link to comment Share on other sites More sharing options...
dualhunter Posted July 19 Share Posted July 19 Looks like I am affected after all. Teams training call for our new phone system couldn't be set up and it looks like the internet fax service isn't working either. Quote Link to comment Share on other sites More sharing options...
Commissar SFLUFAN Posted July 19 Share Posted July 19 Visualization of how the outage impacted US air traffic: 3 Quote Link to comment Share on other sites More sharing options...
legend Posted July 19 2 hours ago, Ghost_MH said: Wouldn't be the first time. I haven't dug into this one too much since it didn't affect our office, but I remember a decade ago when McAfee bricked Windows desktops. In that case, it wasn't an untested update to the software, but to the security definitions. There are more and more applications that self-update without IT intervention because the update is supposedly limited to the application itself. Edge and Chrome fall in this category, and security or virus definitions, with their white and black lists along with threat hashes, don't normally get manual approval. I’m suddenly in favor of government regulation of IT.
legend Posted July 19 2 hours ago, SuperSpreader said: that's what big tech firms usually do: engineering tests it locally for a few weeks before deploying Yeah, this is the absolutely sane thing to do. I think my group's AI research experiment platform has more safeguards than apparently a lot of big companies with high-risk production systems.
Keyser_Soze Posted July 19 Share Posted July 19 22 minutes ago, Commissar SFLUFAN said: Visualization of how the outage impacted US air traffic: In case of emergency flock to America's penis. Quote Link to comment Share on other sites More sharing options...
b_m_b_m_b_m Posted July 19 Share Posted July 19 1 hour ago, Commissar SFLUFAN said: Visualization of how the outage impacted US air traffic: Take that bin laden Quote Link to comment Share on other sites More sharing options...
ApatheticSarcasm Posted July 20 22 hours ago, Commissar SFLUFAN said: Visualization of how the outage impacted US air traffic: I know it's not exactly the same, but I remember seeing something like that when 9/11 was happening; they had to route inbound planes to Canada or wherever else they could.
Nokra Posted July 20 Share Posted July 20 I work in tech, and this made for a really fun day of work yesterday. I spent the better part of my day dealing with about 100 systems that were (temporarily) bricked by this. GG, CrowdStrike. Quote Link to comment Share on other sites More sharing options...
Ghost_MH Posted July 20 Share Posted July 20 22 hours ago, legend said: I’m suddenly in favor of government regulation of IT Looking into it more, this was an update to CrowdStrike's logic engine. This is not an update that would normally be vetted because the tools for locally vetting them are expensive and few corporations are willing to pay for a full QA infrastructure. On top of that, I do believe CrowdStrike is directly updated. That is, the updates are pulled directly from them and not controlled by some IT-controlled update server. I'd take some government regulation here if it forced companies to provide free tools for staging updates or providing updates at a regular schedule. Like on the Microsoft front, I know they'll release updates every second Tuesday of the month. That means if I just update on the third Tuesday of the month I can be pretty confident I won't pull anything bad without any extra cost to me. Quote Link to comment Share on other sites More sharing options...
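The "second Tuesday, so update on the third Tuesday" schedule from the post above is simple date arithmetic. The following is only a rough sketch of that arithmetic in Python; the function names and the seven-day delay parameter are made up for illustration and are not part of any Microsoft tooling.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() numbering: Monday is 0, Tuesday is 1


def nth_weekday(year, month, weekday, n):
    """Return the date of the n-th given weekday in a month (n starts at 1)."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first occurrence of the weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def patch_tuesday(year, month):
    """Second Tuesday of the month, the release day described in the post."""
    return nth_weekday(year, month, TUESDAY, 2)


def deferred_install_day(year, month, delay_days=7):
    """Hypothetical policy: install delay_days after the release day."""
    return patch_tuesday(year, month) + timedelta(days=delay_days)


if __name__ == "__main__":
    print(patch_tuesday(2024, 7))         # 2024-07-09
    print(deferred_install_day(2024, 7))  # 2024-07-16
```

With the default one-week delay the install day always lands on the third Tuesday, which gives a week to watch for reports of a bad update.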
legend Posted July 20 Share Posted July 20 19 minutes ago, Ghost_MH said: Looking into it more, this was an update to CrowdStrike's logic engine. This is not an update that would normally be vetted because the tools for locally vetting them are expensive and few corporations are willing to pay for a full QA infrastructure. On top of that, I do believe CrowdStrike is directly updated. That is, the updates are pulled directly from them and not controlled by some IT-controlled update server. I'd take some government regulation here if it forced companies to provide free tools for staging updates or providing updates at a regular schedule. Like on the Microsoft front, I know they'll release updates every second Tuesday of the month. That means if I just update on the third Tuesday of the month I can be pretty confident I won't pull anything bad without any extra cost to me. Not sure I'm following why it can't be vetted and requires expensive tools. Just install it on a single system and try to start? Are you saying the updates are pulled automatically in the background? Because if that's it, maybe we should stop doing that for system critical software. Quote Link to comment Share on other sites More sharing options...
Commissar SFLUFAN Posted July 20 Share Posted July 20 1 hour ago, ApatheticSarcasm said: I know its not exactly the same, but I remember seeing something like that when 9/11 was happening, they had to route inbound planes to Canada or wherever else they could. This is the 9/11 airspace closure visualization with timeline: 1 Quote Link to comment Share on other sites More sharing options...
Ghost_MH Posted July 20 14 minutes ago, legend said: Not sure I'm following why it can't be vetted and requires expensive tools. Just install it on a single system and try to start? Are you saying the updates are pulled automatically in the background? Because if that's it, maybe we should stop doing that for system critical software. Yup, just pulled automatically in the background. Seems CrowdStrike told some clients to just reboot their systems dozens of times until a fix was downloaded, but I don't know anyone where that actually worked. Many security apps are like this. AV definitions aren't normally vetted. This is especially true for logic engines in security suites. Think of these as machine learning tools for keeping systems safe. I'm more intimately familiar with McAfee's similar outage nearly twenty years ago. That one had their AV definitions accidentally flag a Windows system file as bad, which bricked Windows as soon as the AV quarantined the essential DLL. I'm also pretty familiar with Qualys. I previously used Qualys for managing security and updates and their tools were automatically updated by DEFAULT. This is part of the problem. The reason I say it's expensive is because you'd need parallel hardware and companies already view IT as a net negative on corporate profits. You can't just test things on one virtual machine and call it a day. Have some physical database cluster? Well, now you need a second similar cluster. Have an entire virtual environment for your engineers? Well, if you really want to test things you need a complicated engineering environment. If you don't, you need to accept that you're not fully testing things, and I've never met a CFO that was OK with funding partial tests that come with no guarantees. My cheap solution to this was always to push all updates off by a week and then pay attention to news reports about faulty updates. That's obviously not an option for everyone, though. 
If everyone skips their updates by a week then we're back where we started. Also, all of these companies tell you best practice is to stay updated and on time. If you don't and you get bit by a zero day during that update gap, it's your policy that caused the outage and you wind up with the full blame. It sucks, but that's how it is. I've personally gotten drilled by a CEO that was upset with me for updates that weren't installed per my policies even though we weren't negatively affected. Just big news about some zero day, randomly sees me walking by his office, calls me in and asks if we're patched to prevent this exploit. When he heard we weren't because those updates weren't scheduled to go out for another week, he really wasn't happy. Wasn't happy about it, but I ended up pushing an out of band update for just that one zero day and left everything else as is. I like my job, but working in IT often sucks. 1 Quote Link to comment Share on other sites More sharing options...
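The policy described in the post above (hold everything for a week, but push an out-of-band update when a zero day is in the news) can be written down as a small rule. This is a minimal sketch under stated assumptions, not how any real patch management product works: the `Update` class, the `fixes_active_zero_day` flag, and the example update names are all hypothetical, and the flag is assumed to be set by hand by whoever is watching the news.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Update:
    name: str
    released: date
    # Hypothetical flag an admin sets by hand after reading the advisories.
    fixes_active_zero_day: bool = False


def install_date(update, hold=timedelta(days=7)):
    """Earliest day this sketch allows the update to be installed."""
    if update.fixes_active_zero_day:
        return update.released  # out-of-band: no waiting period
    return update.released + hold


def due(updates, today):
    """Names of the updates that may be installed on the given day."""
    return [u.name for u in updates if install_date(u) <= today]


if __name__ == "__main__":
    pending = [
        Update("routine-rollup", date(2024, 7, 9)),
        Update("zero-day-fix", date(2024, 7, 12), fixes_active_zero_day=True),
    ]
    print(due(pending, date(2024, 7, 13)))  # ['zero-day-fix']
    print(due(pending, date(2024, 7, 16)))  # ['routine-rollup', 'zero-day-fix']
```

The trade-off from the post is visible in the one branch: the hold protects against a bad update, and the exception exists because a week-long gap during an active exploit is the kind of thing a CEO asks about.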
CitizenVectron Posted July 20 16 minutes ago, Ghost_MH said: Yup, just pulled automatically in the background. Seems CrowdStrike told some clients to just reboot their systems dozens of times until a fix was downloaded, but I don't know anyone where that actually worked. Many security apps are like this. AV definitions aren't normally vetted. This is especially true for logic engines in security suites. Think of these as machine learning tools for keeping systems safe. I'm more intimately familiar with McAfee's similar outage nearly twenty years ago. That one had their AV definitions accidentally flag a Windows system file as bad, which bricked Windows as soon as the AV quarantined the essential DLL. I'm also pretty familiar with Qualys. I previously used Qualys for managing security and updates and their tools were automatically updated by DEFAULT. This is part of the problem. The reason I say it's expensive is because you'd need parallel hardware and companies already view IT as a net negative on corporate profits. You can't just test things on one virtual machine and call it a day. Have some physical database cluster? Well, now you need a second similar cluster. Have an entire virtual environment for your engineers? Well, if you really want to test things you need a complicated engineering environment. If you don't, you need to accept that you're not fully testing things, and I've never met a CFO that was OK with funding partial tests that come with no guarantees. My cheap solution to this was always to push all updates off by a week and then pay attention to news reports about faulty updates. That's obviously not an option for everyone, though. If everyone skips their updates by a week then we're back where we started. Also, all of these companies tell you best practice is to stay updated and on time. If you don't and you get bit by a zero day during that update gap, it's your policy that caused the outage and you wind up with the full blame. 
It sucks, but that's how it is. I've personally gotten drilled by a CEO that was upset with me for updates that weren't installed per my policies even though we weren't negatively affected. Just big news about some zero day, randomly sees me walking by his office, calls me in and asks if we're patched to prevent this exploit. When he heard we weren't because those updates weren't scheduled to go out for another week, he really wasn't happy. Wasn't happy about it, but I ended up pushing an out of band update for just that one zero day and left everything else as is. I like my job, but working in IT often sucks. Leadership generally views IT as lesser-than, and also not required. Until they can't print a weird PDF. Quote Link to comment Share on other sites More sharing options...
CitizenVectron Posted July 20 Share Posted July 20 We currently have no Infrastructure Manager in IT (also responsible for security) as leadership won't pay the position enough to attract good talent. We just fired the last person we hired during her probation as she basically lied about her skills. If we'd had crowdstrike...we'd be fucked. We are a team of 18 people and support around 8,000 windows laptops that we just reimaged and deployed into schools. 1 Quote Link to comment Share on other sites More sharing options...
chakoo Posted July 20 Share Posted July 20 I don’t know how someone can run any software company without a full QA department that certifies builds/updates before they go live. I also don’t understand how any large scale company can use/trust any software that auto updates without allowing you to control the roll out schedule. When I ran a SaaS startup even we had a dedicated QA team. 1 Quote Link to comment Share on other sites More sharing options...
SuperSpreader Posted July 20 Share Posted July 20 2 minutes ago, chakoo said: I don’t know how someone can run any software company without a full QA department that certifies builds/updates before they go live. I also don’t understand how any large scale company can use/trust any software that auto updates without allowing you to control the roll out schedule. When I ran a SaaS startup even we had a dedicated QA team. Trying to cut costs. Lots of layoffs this past year. Quote Link to comment Share on other sites More sharing options...
Ghost_MH Posted July 20 7 minutes ago, chakoo said: I don’t know how someone can run any software company without a full QA department that certifies builds/updates before they go live. I also don’t understand how any large scale company can use/trust any software that auto updates without allowing you to control the roll out schedule. When I ran a SaaS startup even we had a dedicated QA team. 4 minutes ago, SuperSpreader said: Trying to cut costs. Lots of layoffs this past year. Exactly this. It's the bigger companies that have the budget that refuse to adequately fund this kind of stuff. They'll often bring in some MBA to manage IT/cut costs and that MBA will decide that nobody working for them knows better, so if we use third-party tools to manage certain risks, we should abide by that party's best practices, which usually include allowing them to manage their own updates because it frees up internal resources to run leaner and more efficiently. Last company I worked for, tens of thousands of employees... Innocent old me: What's our DR plan here? IT management: We have good backups. Me: That's not a real plan. What do we do if our main datacenter becomes a crater? Them: We can recover from tape. Me: Well, I didn't realize the business could run if we were offline for a couple of months. Just a full functional mess. Years ago, I got into a very real argument with our CFO over moving expenses. I wanted to hire movers and he thought that was silly since we could just get a truck and move all our servers ourselves. I told him if that's the route he wants to go, I want him to drive the truck and for his own personal insurance to cover the damages if some asshole with no insurance t-bones us and we're out hundreds of thousands of dollars in equipment.
SuperSpreader Posted July 20 Share Posted July 20 1 hour ago, Ghost_MH said: MBA to manage IT/cut costs and that MBA will decide that nobody working for them knows better This is the problem with all tech right now including management. Big applications get managed like they're a small 2 week college project 1 Quote Link to comment Share on other sites More sharing options...
SuperSpreader Posted July 20 Share Posted July 20 I think what happened was tech started hiring engineers only from places like Stanford and then Stanford combined some eng with generic management/MBA with no functional training or application. They invented a bunch of BS management styles that only make sense in a college dorm and then tech hired these dummies Quote Link to comment Share on other sites More sharing options...